LiDAR-based Recurrent 3D Semantic Segmentation with Temporal Memory Alignment

Duerr, Fabian; Pfaller, Mario; Weigel, Hendrik; Beyerer, Jürgen

Institute of Electrical and Electronics Engineers -IEEE-; IEEE Computer Society:
International Conference on 3D Vision, 3DV 2020. Proceedings : 25 - 28 November 2020, Virtual Event; International Conference on 3D Imaging, Modeling, Processing, Visualization and Transmission (3DIMPVT)
Los Alamitos, Calif.: IEEE Computer Society Conference Publishing Services (CPS), 2020
ISBN: 978-1-7281-8129-5
ISBN: 978-1-7281-8128-8
International Conference on 3D Vision (3DV) <2020, Online>
International Conference on 3D Imaging, Modeling, Processing, Visualization and Transmission (3DIMPVT) <2020, Online>
Conference paper, electronic publication
Fraunhofer IOSB

Understanding and interpreting a 3D environment is a key challenge for autonomous vehicles. Semantic segmentation of 3D point clouds combines 3D information with semantics and thereby provides a valuable contribution to this task. In many real-world applications, point clouds are generated by LiDAR sensors in a consecutive fashion. Working with a time series instead of single, independent frames enables the exploitation of temporal information. We therefore propose a recurrent segmentation architecture (RNN), which takes a single range image frame as input and exploits recursively aggregated temporal information. An alignment strategy, which we call Temporal Memory Alignment, uses ego motion to temporally align the memory between consecutive frames in feature space. A Residual Network and a ConvGRU are investigated for the memory update. We demonstrate the benefits of the presented approach on two large-scale datasets and compare it to several state-of-the-art methods. Our approach ranks first on the Semantic KITTI [4] multiple scan benchmark and achieves state-of-the-art performance on the single scan benchmark. In addition, the evaluation shows that the exploitation of temporal information significantly improves segmentation results compared to a single frame approach.
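
The idea of aligning a feature memory with ego motion can be illustrated with a minimal NumPy sketch. It is not the authors' implementation; the abstract does not specify the details, so everything below is an assumption made for illustration: the memory has the same resolution as the range image, the 3D coordinates behind each pixel of the previous frame are available, and a plain spherical projection maps points to pixels. The names `spherical_project` and `align_memory`, the argument layout, and the default field-of-view values are hypothetical. The memory update itself (Residual Network or ConvGRU) is not shown.

```python
import numpy as np


def spherical_project(points, height, width, fov_up_deg, fov_down_deg):
    """Map 3D points (N, 3) to range-image pixel indices (assumed projection)."""
    rng = np.linalg.norm(points, axis=1)
    safe_rng = np.maximum(rng, 1e-8)
    yaw = np.arctan2(points[:, 1], points[:, 0])              # azimuth in [-pi, pi]
    pitch = np.arcsin(np.clip(points[:, 2] / safe_rng, -1.0, 1.0))  # elevation
    fov_up = np.deg2rad(fov_up_deg)
    fov = fov_up - np.deg2rad(fov_down_deg)
    col = np.floor(0.5 * (1.0 - yaw / np.pi) * width).astype(np.int64) % width
    row = np.floor((fov_up - pitch) / fov * height).astype(np.int64)
    valid = (row >= 0) & (row < height) & (rng > 0)
    return row, col, rng, valid


def align_memory(memory, prev_xyz, prev_valid, T_cur_from_prev,
                 fov_up_deg=3.0, fov_down_deg=-25.0):
    """Warp a feature memory from the previous frame into the current frame.

    memory:          (C, H, W) features aligned with the previous range image
    prev_xyz:        (H, W, 3) 3D point behind each pixel of the previous frame
    prev_valid:      (H, W) bool mask of pixels that contain a measurement
    T_cur_from_prev: (4, 4) ego-motion transform, previous -> current sensor frame
    """
    C, H, W = memory.shape
    pts = prev_xyz[prev_valid]                   # (N, 3)
    feats = memory[:, prev_valid].T              # (N, C)

    # Apply the ego motion in homogeneous coordinates.
    pts_h = np.concatenate([pts, np.ones((len(pts), 1))], axis=1)
    pts_cur = (pts_h @ T_cur_from_prev.T)[:, :3]

    # Re-project into the current range image and drop points outside the view.
    row, col, rng, valid = spherical_project(pts_cur, H, W, fov_up_deg, fov_down_deg)
    row, col, rng, feats = row[valid], col[valid], rng[valid], feats[valid]

    # If several points land on one pixel, keep the nearest one.
    flat = row * W + col
    order = np.lexsort((rng, flat))              # primary key: pixel, then range
    flat_sorted = flat[order]
    first = np.ones(len(flat_sorted), dtype=bool)
    first[1:] = flat_sorted[1:] != flat_sorted[:-1]
    keep = order[first]

    aligned = np.zeros((C, H * W), dtype=memory.dtype)
    aligned[:, flat[keep]] = feats[keep].T       # pixels without a source stay zero
    return aligned.reshape(C, H, W)
```

In this sketch, pixels that receive no feature after the warp are left at zero, and occlusions are resolved by keeping the closest point. Both are design choices of the illustration, not statements about the published method.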