Online generation of large scale reconstructions from RGB-D data by Tristan Igelbrink
Generating polygonal maps from RGB-D data is an active field of research in robotic mapping. Kinect Fusion and related algorithms provide the means to generate reconstructions of large
environments. In this master's project, a system for the online generation of large-scale reconstructions from RGB-D data is developed.
The Kinect Fusion implementation by Anatoly Baksheev provides the ability to stream a TSDF (Truncated Signed Distance Function) representation from the GPU to the host's CPU. To generate topologically correct triangle meshes, we use the LVR implementation of Marching Cubes.
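Marching Cubes places a mesh vertex wherever the TSDF changes sign along a voxel edge. As a minimal illustration of this single step (not the LVR implementation; all names are placeholders), the crossing point on one edge can be found by linear interpolation:

```python
# Illustrative sketch: vertex placement on one voxel edge during Marching
# Cubes. p0/p1 are the edge's corner positions, d0/d1 their TSDF values
# with opposite signs. Names are placeholders, not the LVR API.
def edge_intersection(p0, p1, d0, d1):
    """Linearly interpolate the zero crossing of the signed distance."""
    t = d0 / (d0 - d1)  # fraction of the way from p0 towards p1
    return tuple(a + t * (b - a) for a, b in zip(p0, p1))

# equal magnitudes on both corners put the surface exactly halfway:
print(edge_intersection((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), 0.5, -0.5))  # → (0.5, 0.0, 0.0)
```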
The original Kinect Fusion algorithm is limited to a relatively small volume by default. A common approach to overcome this limitation is a cyclical buffer that translates the reconstruction volume while the camera is moving. If the tracked camera position exceeds a certain distance threshold, a shift of the volume's center towards the camera is triggered. This shift takes place in discrete voxel units and independently for each dimension.
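The shift trigger can be sketched in a few lines; the threshold, voxel size, and function name below are illustrative assumptions, not the values or API used by Kinfu:

```python
# Sketch of the shift trigger described above. The shift is computed per
# dimension, in whole voxels, once the camera strays beyond the threshold.
# All names and parameter values are illustrative assumptions.
def compute_voxel_shift(camera_pos, volume_center, voxel_size, threshold):
    """Per-axis shift in discrete voxel units; (0, 0, 0) if no shift is due."""
    shift = []
    for cam, center in zip(camera_pos, volume_center):
        dist = cam - center
        # each dimension is checked and shifted independently
        shift.append(int(dist / voxel_size) if abs(dist) > threshold else 0)
    return tuple(shift)

# camera moved 1.5 m along x with a 1 m threshold and 0.5 m voxels:
print(compute_voxel_shift((1.5, 0.2, -0.3), (0.0, 0.0, 0.0), 0.5, 1.0))  # → (3, 0, 0)
```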
The part of the TSDF volume that is shifted out of scope is called a slice and is downloaded to main system memory. We stream the raw TSDF values to system memory and integrate them
directly into a hash grid structure. Once the slice is copied to main memory, the corresponding GPU values are cleared and filled with new data. In the figure, the blue volume is shifted left, and new cells allocated in the cyclical buffer are used within the volume (left). The red part of the initial volume is downloaded and integrated into the global grid (right). After a shift, the Marching Cubes reconstruction on the hash grid is performed asynchronously to unlock the TSDF volume as quickly as possible, ensuring that the GPU-based tracking and integration can continue with minimal interruption.
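The cyclical-buffer idea can be pictured with a short sketch: the GPU memory block stays fixed, and after a shift the growing global indices simply wrap around it, so cells shifted out of scope are reused for newly entered space (a hypothetical illustration; the real Kinfu buffer management differs in detail):

```python
# Sketch of cyclical-buffer addressing: no voxel data is moved on the GPU
# after a shift; global voxel indices wrap around the fixed-size buffer.
# Buffer dimensions and names are illustrative assumptions.
def buffer_address(global_idx, buffer_dims):
    """Map a global voxel index to its slot in the fixed-size buffer."""
    return tuple(g % n for g, n in zip(global_idx, buffer_dims))

# after shifting past the buffer end, global index 515 reuses slot 3:
print(buffer_address((515, -1, 3), (512, 512, 512)))  # → (3, 511, 3)
```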
The figure shows the workflow of our processing pipeline. The Kinect Fusion algorithm with shift detection runs asynchronously on the system's GPU. When a shift is necessary,
the TSDF slice is downloaded to the host and processed: first, the slice is aligned with the global grid, and new cells are computed and merged. After integration, Marching Cubes is performed on the new cells. The generated triangles are integrated into the global mesh. If the alignment is computed correctly, already generated vertices can be detected and reused, yielding a topologically consistent mesh. The correct alignment strongly depends on a correct camera pose estimation. If this estimation is incorrect, the new slice may become misaligned, but this issue is inherent to Kinect Fusion and also exists in the original implementation.
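The asynchronous hand-off can be sketched as follows; the function names are placeholders, and the actual pipeline runs CUDA tracking and the LVR reconstruction, not Python threads:

```python
# Sketch of the asynchronous hand-off: once a slice reaches host memory,
# mesh extraction runs in a worker thread, so the GPU volume can be cleared
# and tracking continues immediately. All names are illustrative placeholders.
from concurrent.futures import ThreadPoolExecutor

def reconstruct_slice(slice_cells):
    # stand-in for: align with the global grid, merge cells, Marching Cubes
    return len(slice_cells)  # dummy "triangle count"

executor = ThreadPoolExecutor(max_workers=1)

def on_shift(slice_cells):
    """Submit reconstruction and return at once; tracking is not blocked."""
    return executor.submit(reconstruct_slice, slice_cells)

future = on_shift([(0, 0, 0), (1, 0, 0)])
print(future.result())  # collected later, when the mesh is needed
```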
For the integration of a local slice grid into the global grid, we need to find already existing adjacent cells. These cells can be addressed by computing the global indices from the slice offset indices $(u_x, u_y, u_z)$ and the local indices $(i_x, i_y, i_z)$ within the slice: $(g_x, g_y, g_z) = (u_x + i_x, u_y + i_y, u_z + i_z)$.
Using these indices, each cell can be uniquely identified within the global grid. To make the resulting grid consistent, every cell that lies on a local grid edge is marked as a fusion cell. These fusion cells can then be used by the next slice for a consistent alignment, since they are globally accessible.
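Under this indexing scheme, integration and fusion-cell marking might look like the following sketch; the data structures are assumptions for illustration, not the actual LVR classes:

```python
# Sketch of slice integration into a global hash grid. The global index of
# a cell is the slice offset plus its local index; cells on the local grid
# boundary are marked as fusion cells. Names are illustrative assumptions.
def integrate_slice(grid, offset, slice_cells, slice_dims):
    """grid: dict (gx, gy, gz) -> (tsdf, is_fusion); slice_cells: local dict."""
    ux, uy, uz = offset
    nx, ny, nz = slice_dims
    for (lx, ly, lz), tsdf in slice_cells.items():
        # a cell on any local grid edge becomes a fusion cell, so the next
        # slice can align against it through its global index
        fusion = lx in (0, nx - 1) or ly in (0, ny - 1) or lz in (0, nz - 1)
        grid[(ux + lx, uy + ly, uz + lz)] = (tsdf, fusion)

grid = {}
integrate_slice(grid, (32, 0, 0), {(0, 1, 1): 0.4, (1, 1, 1): -0.1}, (4, 4, 4))
print(grid[(32, 1, 1)], grid[(33, 1, 1)])  # boundary cell vs. interior cell
```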
To further reduce the number of triangles in the mesh, a region growing approach is applied to every reconstructed slice to detect planes in the mesh. The contours of these planes are polygons that can be retessellated to remove redundant triangles (see the last picture above).
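A minimal sketch of the region-growing idea: faces whose normals agree with a seed face's normal (dot product near 1) are grouped into one planar region. The tolerance, input format, and traversal below are illustrative assumptions; the actual LVR criterion may differ:

```python
# Minimal region-growing sketch over a triangle mesh: breadth-first growth
# from each unassigned seed face, merging neighbors that are coplanar with
# the seed. Inputs and the dot-product tolerance are illustrative.
from collections import deque

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def grow_plane_regions(normals, adjacency, min_dot=0.99):
    """normals: unit face normals; adjacency: neighbor face indices per face."""
    region = [-1] * len(normals)
    next_id = 0
    for seed in range(len(normals)):
        if region[seed] != -1:
            continue
        region[seed] = next_id
        queue = deque([seed])
        while queue:
            face = queue.popleft()
            for nb in adjacency[face]:
                # grow only across faces coplanar with the seed face
                if region[nb] == -1 and dot(normals[seed], normals[nb]) >= min_dot:
                    region[nb] = next_id
                    queue.append(nb)
        next_id += 1
    return region

# two upward-facing faces form region 0; the side face becomes region 1:
print(grow_plane_regions([(0, 0, 1), (0, 0, 1), (1, 0, 0)], [[1], [0, 2], [1]]))  # → [0, 0, 1]
```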
The key features of the LVR Kinfu implementation are:
- Fast Kinect Fusion algorithm (kinfu_remake from Anatoly Baksheev)
- Extended with Kinect Fusion Large Scale
- Real-time surface reconstruction from the TSDF representation
- Memory efficient meshes through topologically correct reconstruction with LVR MC
- More memory efficient due to the LVR optimization pipeline