Fraunhofer-Gesellschaft
2024
Master Thesis
Title

Self-Optimizing Augmentation Pipeline

Abstract
The training effectiveness of deep neural network models is crucial for their success [53]. Inference time, the time a model needs to generate results, matters as well, because it determines where the model can be applied. Deep neural networks that train inefficiently or have long inference times can be unusable for many applications, so both training time and inference time are important performance factors in many fields of machine learning. Real-world applications in particular, such as those in industrial environments, require short inference times to be competitive in real-time settings. Similarly, the rapid growth in model sizes in recent years has led to growing interest in techniques that accelerate training. The emergence of very large deep networks such as LLMs, whose training can require up to hundreds of GPU-years, makes it even more important to complete training in a reasonable amount of time. To contribute to the latter problem, I propose a new training technique that aims to improve the training effectiveness of neural networks in the field of computer vision. I present the self-optimizing augmentation pipeline SOAP, applied and investigated in the field of 6D object pose estimation, more precisely on the 6D object pose estimator GDRNPP.
SOAP requires a differentiable image generation process. For this purpose, the training process is analyzed separately with two different image generation models, PyTorch3D and Stable Diffusion. The proposed pipeline is evaluated on the LM-O, T-LESS and ITODD datasets, a subset of the seven core datasets of the BOP challenge, focusing on the task of model-based 6D localization of seen objects. To make the results benchmarkable, I present an evaluation that can be compared with the corresponding method on the BOP leaderboard [4].
Thesis Note
Darmstadt, TU, Master Thesis, 2024
Author(s)
Boller, Andre
Fraunhofer-Institut für Graphische Datenverarbeitung IGD  
Advisor(s)
Kuijper, Arjan
Fraunhofer-Institut für Graphische Datenverarbeitung IGD  
Pöllabauer, Thomas
Fraunhofer-Institut für Graphische Datenverarbeitung IGD  
Language
English
Fraunhofer-Institut für Graphische Datenverarbeitung IGD  
Keyword(s)
  • Industry: Automotive
  • Industry: Healthcare
  • Industry: Information Technology
  • Industry: Maritime Economy
  • Industry: Cultural and Creative Economy
  • Research Line: Computer graphics (CG)
  • Research Line: Computer vision (CV)
  • Research Line: Machine learning (ML)
  • LTA: Scalable architectures for massive data sets
  • LTA: Machine intelligence, algorithms, and data structures (incl. semantics)
  • 3D Computer vision
  • Deep learning
  • 3D Pattern/Structure recognition
  • 3D Object localisation