Algorithm Modification to Improve the Kinect’s Performance in Point Cloud Processing

Zhang, M., Zhang, Z., Esche, S. K. & Chassapis, C.
Proceedings of the ASME International Mechanical Engineering Congress and Exposition IMECE'14, Montreal, Canada, November 14 - 20, 2014.

Abstract

Since its introduction in 2010, Microsoft’s Kinect input device for game consoles and computers has shown great potential in a large number of applications, both in game development and in research. For instance, the Kinect has been employed to capture users’ skeletons from the point clouds it produces and to subsequently use the estimated postures of the users to control game processes. The Kinect was originally designed as an input device for sensing human motions, and it produces the desired results in most skeleton-related applications. However, since point clouds carry much more information than just human postures, the Kinect has also captured the attention of researchers and developers in many other areas. The many prototype applications developed so far include, for example, hand gesture tracking, face scanning, object recognition, classification and motion tracking, 3D surface reconstruction, and indoor mapping. Many of these implementations are still in the prototype stage and exhibit somewhat limited performance. These limitations are mainly caused by the quality of the point clouds generated by the Kinect, whose shortcomings include limited range, high dependency on surface properties, shadowing, and low depth accuracy. Among the Kinect’s most significant limitations are the low accuracy and high error associated with its point clouds. The severity of these defects varies with a point’s location in the Kinect’s camera coordinate system. In contrast, the available traditional algorithms for processing point clouds are all based on the assumption that the input point clouds are perfect and exhibit the same characteristics throughout the entire point cloud.
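The point clouds discussed above are obtained by back-projecting the Kinect’s depth image through a camera model. A minimal sketch of this standard pinhole back-projection is shown below; the intrinsic parameters (`fx`, `fy`, `cx`, `cy`) are illustrative values in the range commonly reported for the Kinect’s depth camera, not calibrated parameters from this paper.

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth image (in meters) into an N x 3 point cloud
    using the pinhole camera model:
        X = (u - cx) * Z / fx,  Y = (v - cy) * Z / fy,  Z = depth(u, v).
    Pixels with zero depth (no measurement) are discarded."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack((x, y, z), axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]

# Illustrative (assumed, uncalibrated) intrinsics for a 640 x 480 depth image
depth = np.full((480, 640), 2.0)  # synthetic flat wall at 2 m
cloud = depth_to_point_cloud(depth, fx=585.0, fy=585.0, cx=319.5, cy=239.5)
```

Because every pixel in this synthetic example has a valid depth, the resulting cloud contains one 3D point per pixel, all at Z = 2 m.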

In the first part of this paper, the Kinect’s point cloud properties (including resolution, depth accuracy, noise level and error) and their dependency on a point’s pixel location will be studied systematically. Second, the Kinect’s calibration, by both hardware and software approaches, will be explored, and methods for improving the quality of its output point clouds will be identified. Then, modified algorithms adapted to the Kinect’s unique properties will be introduced. This method makes it possible to judge the output point cloud properties in a quantifiable manner and then to modify traditional computer vision algorithms by adjusting their assumptions regarding the input cloud properties to the actual parameters of the Kinect. Finally, the modified algorithms will be tested in a prototype application. This example will demonstrate that the Kinect’s potential in various application areas increases substantially if its properties are considered in the design of the algorithms for processing the acquired point cloud data.
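The location dependence of depth accuracy mentioned above is often described in the literature by a triangulation error model in which random depth error grows roughly quadratically with distance. The sketch below illustrates that relationship; the focal length, baseline, and disparity noise are assumed placeholder values, not measurements from this work.

```python
def depth_error(z, f=585.0, b=0.075, sigma_disp=0.5):
    """Approximate random depth error sigma_z for a triangulation-based
    depth sensor, illustrating the quadratic growth with distance z:
        sigma_z ≈ z**2 * sigma_disp / (f * b)
    z:          depth in meters
    f:          focal length in pixels (assumed value)
    b:          IR projector-camera baseline in meters (assumed value)
    sigma_disp: disparity measurement noise in pixels (assumed value)
    Returns sigma_z in meters."""
    return (z ** 2) * sigma_disp / (f * b)

# Doubling the distance roughly quadruples the random depth error
errors = [depth_error(z) for z in (1.0, 2.0, 4.0)]
```

Under this model a point measured at 4 m carries sixteen times the random depth error of a point at 1 m, which is why algorithms assuming uniform noise across the cloud degrade on Kinect data.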