2025
Conference Paper
Title

Advanced Post-processing for Object Detection Dataset Generation

Abstract
Fast acquisition and generation of training data is an important problem in training Deep Neural Networks (DNNs). Previous work uses luminance keying for efficient training data acquisition and achieves good results with very little effort: the black background absorbs so much light that objects can be keyed by luminance instead of chroma. However, it does not reach the performance level of real-world recordings. This paper significantly improves the achievable performance of luminance keying, evaluated for the use case of object detection. We introduce a novel post-processing pipeline that incorporates a denoising diffusion probabilistic model (DDPM) to capitalize on luma-key recordings. First, we employ low-rank adaptation (LoRA) to teach the recorded objects to a diffusion model. Second, we use Depth Anything to estimate the depth of the luma-key data. Third, using ControlNet guided by the depth estimates and Canny image filtering, we generate photo-realistic training images from a wide range of relevant prompts, which increases model robustness in diverse environments. Because luminance keying yields high-quality masks, we obtain perfect ground truth for object detection training. Extensive testing on the YCB-V object set demonstrates that our approach performs favorably compared to traditional techniques that require 3D meshes and material data, such as physically-based rendering or in-distribution dataset splits. Our proposed pipeline improves luminance keying to provide an efficient methodology for creating high-quality training datasets, facilitating the swift development and training of state-of-the-art DNNs for object detection, and it is applicable to similar tasks such as classification and segmentation.
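To illustrate how a luminance key can yield object detection ground truth, the following is a minimal sketch, not the authors' implementation. It assumes an 8-bit RGB recording against a black background, Rec. 601 luma weights, and a fixed threshold of 0.1; the function names luma_key_mask and mask_to_bbox, the weights, and the threshold are illustrative assumptions that do not come from the paper.

```python
import numpy as np

# Assumption: Rec. 601 luma weights. The keying formula used in the paper
# is not stated in the abstract.
LUMA_WEIGHTS = np.array([0.299, 0.587, 0.114])


def luma_key_mask(rgb_uint8, threshold=0.1):
    """Return a boolean foreground mask for an H x W x 3 uint8 RGB image.

    A pixel counts as foreground when its luma (scaled to [0, 1]) exceeds
    the threshold. The threshold value is a hypothetical default.
    """
    rgb = np.asarray(rgb_uint8, dtype=np.float64) / 255.0
    luma = rgb @ LUMA_WEIGHTS  # weighted sum over the channel axis
    return luma > threshold


def mask_to_bbox(mask):
    """Return (x_min, y_min, x_max, y_max) with inclusive pixel indices,
    or None if the mask contains no foreground pixels."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())


# Synthetic example: a bright 3 x 4 pixel object on a black background.
image = np.zeros((8, 10, 3), dtype=np.uint8)
image[2:5, 3:7] = (200, 180, 160)

mask = luma_key_mask(image)
print(mask_to_bbox(mask))  # (3, 2, 6, 4)
```

The same mask could be composited onto a generated background, so the bounding box stays valid for the new image. Real recordings would likely need noise handling, such as morphological cleanup, before the box is extracted.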
Author(s)
Pöllabauer, Thomas
Fraunhofer-Institut für Graphische Datenverarbeitung IGD  
Berkei, Sarah  
Fraunhofer-Institut für Graphische Datenverarbeitung IGD  
Knauthe, Volker
TU Darmstadt, Fachgebiet Graphisch-Interaktive Systeme  
Kuijper, Arjan
Fraunhofer-Institut für Graphische Datenverarbeitung IGD  
Mainwork
Advances in Visual Computing. 19th International Symposium, ISVC 2024. Proceedings. Pt.I  
Project(s)
Non-Destructive Inspection Services for Digitally Enhanced Zero Waste Manufacturing  
Funder
European Commission  
Conference
International Symposium on Visual Computing 2024  
DOI
10.1007/978-3-031-77392-1_1
Language
English
Fraunhofer-Institut für Graphische Datenverarbeitung IGD  
Keyword(s)
  • Sector: Automotive Industry
  • Sector: Healthcare
  • Sector: Cultural and Creative Economy
  • Research Line: Computer graphics (CG)
  • Research Line: Computer vision (CV)
  • Research Line: Machine learning (ML)
  • LTA: Scalable architectures for massive data sets
  • LTA: Machine intelligence, algorithms, and data structures (incl. semantics)
  • LTA: Generation, capture, processing, and output of images and 3D models
  • 3D Computer vision
  • Machine learning
  • Pattern recognition
