Date
May 29, 2025
Type
Journal Article
Title

Infra-3DRC-FusionNet: Deep Fusion of Roadside Mounted RGB Mono Camera and Three-Dimensional Automotive Radar for Traffic User Detection

Abstract
Mono RGB cameras and automotive radar sensors provide complementary information, which makes them excellent candidates for sensor data fusion towards robust traffic user detection. Such fusion is widely used in the vehicle domain and has recently been introduced for roadside-mounted, smart-infrastructure-based road user detection. However, the performance of the commonly used late-fusion methods often degrades when the camera fails to detect road users under adverse environmental conditions. A remedy is to fuse the data with deep neural networks at an early stage of the pipeline, so that the complete data provided by both sensors is exploited. Research in this direction exists but is limited to vehicle-based sensor setups. Hence, this work proposes a novel deep neural network that jointly fuses RGB mono-camera images and 3D automotive radar point clouds for enhanced traffic user detection in a roadside-mounted smart-infrastructure setup. Projected radar points are first used to generate anchors in image regions with a high likelihood of road users, including areas not visible to the camera. These anchors guide the prediction of 2D bounding boxes, object categories, and confidence scores. Valid detections are then used to segment the radar points by instance, and the results are post-processed into final road user detections in the ground plane. The trained model is evaluated under different light and weather conditions against ground truth from a lidar sensor, achieving a precision of 92%, a recall of 78%, and an F1-score of 85%. Compared to object-level spatial fusion, the proposed deep fusion method yields absolute improvements of 33%, 6%, and 21% in precision, recall, and F1-score, respectively.
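The radar-guided anchor generation described in the abstract can be illustrated with a short sketch. The code below shows, under stated assumptions, how 3D radar points might be projected through a calibrated pinhole camera model into the image plane and used to seed anchor boxes; the function names, the 64×128 anchor size, and the calibration inputs (K, T_cam_radar, and the demo values) are illustrative assumptions, not parameters taken from the paper.

```python
# Hedged sketch of radar-to-image anchor seeding; all names and values
# below are illustrative assumptions, not the paper's actual pipeline.
import numpy as np

def project_radar_to_image(points_xyz, K, T_cam_radar):
    """Project Nx3 radar points (radar frame, metres) onto the image plane.

    K           : (3, 3) camera intrinsic matrix.
    T_cam_radar : (4, 4) homogeneous radar-to-camera extrinsic transform.
    Returns (M, 2) pixel coordinates of the points in front of the camera.
    """
    n = points_xyz.shape[0]
    homog = np.hstack([points_xyz, np.ones((n, 1))])   # (N, 4) homogeneous points
    cam = (T_cam_radar @ homog.T).T[:, :3]             # radar frame -> camera frame
    cam = cam[cam[:, 2] > 0.0]                         # keep points with positive depth
    pix = (K @ cam.T).T                                # pinhole projection
    return pix[:, :2] / pix[:, 2:3]                    # normalise by depth

def anchors_from_projections(pixels, box_wh=(64, 128)):
    """Centre one illustrative (w, h) anchor box on each projected point."""
    w, h = box_wh
    return np.stack([pixels[:, 0] - w / 2, pixels[:, 1] - h / 2,
                     pixels[:, 0] + w / 2, pixels[:, 1] + h / 2], axis=1)

if __name__ == "__main__":
    K = np.array([[900.0, 0.0, 640.0],                 # assumed intrinsics
                  [0.0, 900.0, 360.0],
                  [0.0, 0.0, 1.0]])
    T = np.eye(4)                                      # assume frames coincide for the demo
    pts = np.array([[2.0, 0.5, 20.0], [-3.0, 0.2, 35.0]])
    print(anchors_from_projections(project_radar_to_image(pts, K, T)))
```

Seeding anchors from projected radar points, rather than tiling them densely over the image, is what lets the network propose boxes in regions where radar detects a road user but the camera does not, which is the motivation the abstract gives for this step.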
Author(s)
Agrawal, Shiva
Technische Hochschule Ingolstadt
Bhanderi, Savankumar
Technische Hochschule Ingolstadt
Elger, Gordon  
Fraunhofer-Institut für Verkehrs- und Infrastruktursysteme IVI  
Journal
Sensors (online journal)
Open Access
File(s)
Download (20.25 MB)
Rights
CC BY 4.0: Creative Commons Attribution
DOI
10.3390/s25113422
10.24406/publica-7780
Language
English
Institute(s)
Fraunhofer-Institut für Verkehrs- und Infrastruktursysteme IVI
Keyword(s)
  • Artificial intelligence
  • Camera
  • Deep learning
  • Data processing
  • Object detection
  • Perception
  • Radar
  • Roadside-mounted sensors
  • Sensor data fusion
  • Smart infrastructure