2026
Conference Paper
Title

Collaborative Perceiver: Elevating Vision-Based 3D Object Detection via Local Density-Aware Dense Spatial Occupancy

Abstract
Vision-based bird’s-eye-view (BEV) 3D object detection has advanced significantly in autonomous driving by offering cost-effectiveness and rich contextual information. However, existing methods often construct BEV representations by collapsing extracted object features, neglecting intrinsic environmental contexts, such as roads and pavements. This hinders detectors from comprehensively perceiving the characteristics of the physical world. To alleviate this, we introduce a multi-task learning framework, Collaborative Perceiver (CoP), that leverages spatial occupancy as auxiliary information to mine consistent structural and conceptual similarities shared between 3D object detection and occupancy prediction tasks, bridging gaps in spatial representations and feature refinement. To this end, we first propose a pipeline to generate dense occupancy ground truths incorporating local density information (LDO) for reconstructing detailed environmental information. Next, we employ a voxel-height-guided sampling (VHS) strategy to distill fine-grained local features according to distinct object properties. Furthermore, we develop a global-local collaborative feature fusion (CFF) module that seamlessly integrates complementary knowledge between both tasks, thus composing more robust BEV representations. Extensive experiments on the nuScenes benchmark demonstrate that CoP outperforms existing vision-based frameworks, achieving 49.5% mAP and 59.2% NDS on the test set.
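The two headline numbers can be related through the nuScenes detection score (NDS), which the benchmark defines as NDS = (1/10) · [5 · mAP + Σ (1 − min(1, mTP))], summed over the five mean true-positive error metrics (mATE, mASE, mAOE, mAVE, mAAE). The abstract does not list those per-metric errors, so the following is only a minimal sketch that back-computes the average true-positive score implied by the reported mAP and NDS. The function names are hypothetical, and the derived average is arithmetic on the reported figures, not a result stated by the authors.

```python
def nds(mean_ap, tp_errors):
    """nuScenes detection score from mAP and the five mean TP errors.

    tp_errors holds mATE, mASE, mAOE, mAVE, mAAE (order does not matter here).
    """
    if len(tp_errors) != 5:
        raise ValueError("expected five TP error values")
    # Each error is clipped at 1 and converted to a score in [0, 1].
    tp_scores = [1.0 - min(1.0, err) for err in tp_errors]
    return (5.0 * mean_ap + sum(tp_scores)) / 10.0


def implied_mean_tp_score(mean_ap, nds_value):
    """Average TP score (1 - clipped error) implied by an mAP/NDS pair."""
    return (10.0 * nds_value - 5.0 * mean_ap) / 5.0


# Figures reported in the abstract (test set), written as fractions.
REPORTED_MAP = 0.495
REPORTED_NDS = 0.592

mean_tp_score = implied_mean_tp_score(REPORTED_MAP, REPORTED_NDS)
print(f"implied mean TP score: {mean_tp_score:.3f}")  # approx. 0.689
```

Under this definition, half of the NDS weight comes from mAP and half from localization, scale, orientation, velocity, and attribute quality, so a 59.2% NDS alongside a 49.5% mAP implies an average clipped true-positive error of roughly 0.31 across the five metrics. How that error is distributed over the individual metrics cannot be recovered from the abstract alone.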
Author(s)
Yuan, Jicheng
Technische Universität Berlin
Nguyen-Duc, Manh
Technische Universität Berlin
Liu, Qian
Fraunhofer-Institut für Offene Kommunikationssysteme FOKUS  
Hauswirth, Manfred  
Fraunhofer-Institut für Offene Kommunikationssysteme FOKUS  
Phuoc, Danh Le
Fraunhofer-Institut für Offene Kommunikationssysteme FOKUS  
Mainwork
Neural Information Processing. Proceedings, Part II  
Project(s)
Berechnungsgrundlagen zur semantischen Verarbeitung von Strömen (Computational foundations for the semantic processing of streams)
Funder
Deutsche Forschungsgemeinschaft  
Conference
International Conference on Neural Information Processing 2025  
DOI
10.1007/978-981-95-4378-6_4
Language
English
Fraunhofer-Institut für Offene Kommunikationssysteme FOKUS  
Keyword(s)
  • 3D Object Detection
  • 3D Occupancy Prediction
  • Bird’s-eye-view
  • Collaborative perception
  • Multi-task Learning
