Dense real-time mapping of object-class semantics from RGB-D video
We propose a real-time approach to learn semantic maps from moving RGB-D cameras. Our method models geometry, appearance, and semantic labeling of surfaces. We recover camera pose using simultaneous localization and mapping while concurrently recognizing and segmenting object classes in the images. Our object-class segmentation approach is based on random decision forests and yields a dense probabilistic labeling of each image. We implemented it on GPU to achieve a high frame rate. The probabilistic segmentation is fused in octree-based 3D maps within a Bayesian framework. In this way, image segmentations from various view points are integrated within a 3D map which improves segmentation quality. We evaluate our system on a large benchmark dataset and demonstrate state-of-the-art recognition performance of our object-class segmentation and semantic mapping approaches.