Cross-Dataset Generalization: Bridging the Gap Between Real and Synthetic LiDAR Data

Strunz, Martin; Protzmann, Robert; Radusch, Ilja

doi:10.1007/978-3-031-87345-4_14

April 29, 2025

Conference Paper

Abstract

In this paper, we present and evaluate a new synthetic street-view LiDAR dataset for computer vision and object detection in the automated driving domain. The dataset consists of labeled data for 100 scenarios with about 1.1 h of drive time, 40,000 frames, and 500,000 bounding boxes, and is now publicly available. We employed the Eclipse MOSAIC simulation framework in conjunction with the SUMO simulator for routes and traffic densities, as well as CARLA for the simulation of sophisticated LiDAR sensors and environments. Before generating our own data, we analyzed existing LiDAR datasets with regard to numerous statistics and the features they contain. Based on this analysis, we reproduced key aspects of the analyzed datasets to generate data comprising realistic and therefore diverse scenarios. A common challenge is that deep learning models for object detection perform best on test data that are closely related to the data they were trained on. The comparison in this paper shows that our model achieved the best cross-dataset performance.
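
The headline figures in the abstract imply some rough per-unit averages, which the following minimal Python sketch derives. The input numbers are the rounded totals quoted above; the function name `dataset_summary` and the derived quantities (including the implied frame rate) are illustrative assumptions, not values reported by the authors.

```python
# Rounded totals as quoted in the abstract ("about 1.1 h", etc.).
SCENARIOS = 100
DRIVE_TIME_HOURS = 1.1
FRAMES = 40_000
BOUNDING_BOXES = 500_000


def dataset_summary(scenarios, drive_time_hours, frames, boxes):
    """Derive rough averages from the headline totals (hypothetical helper)."""
    drive_time_s = drive_time_hours * 3600  # hours -> seconds
    return {
        "frames_per_scenario": frames / scenarios,
        "boxes_per_frame": boxes / frames,
        "frames_per_second": frames / drive_time_s,
        "seconds_per_scenario": drive_time_s / scenarios,
    }


summary = dataset_summary(SCENARIOS, DRIVE_TIME_HOURS, FRAMES, BOUNDING_BOXES)
for name, value in summary.items():
    print(f"{name}: {value:.1f}")
```

Under these rounded inputs, the averages come out to roughly 400 frames and 40 s per scenario, about 12.5 bounding boxes per frame, and a frame rate of about 10 frames per second. Because the totals are approximate, these values are estimates only.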

Author(s)

Strunz, Martin

Protzmann, Robert

Fraunhofer-Institut für Offene Kommunikationssysteme FOKUS

Radusch, Ilja

Technische Universität Berlin

Mainwork

Simulation Tools and Techniques

Conference

International Conference on Simulation Tools and Techniques 2024
