2024
Master Thesis
Title
Self-Optimizing Augmentation Pipeline
Abstract
The training effectiveness of deep neural network models is crucial for their success [53]. In addition to training effectiveness, the inference time, i.e. the time required to generate results, plays an important role, as it determines a model's applicability. Deep neural networks that suffer from inefficient training or long inference times can be unusable for many applications, so performance in terms of training and inference time is an important factor across many fields of machine learning. In particular, applications in real-world scenarios, such as industrial environments, require short inference times to operate competitively in real time. Similarly, the rapid increase in model sizes in recent years has led to growing interest in techniques for accelerating training. The emergence of very large deep networks such as LLMs, which can require up to hundreds of GPU-years of training, makes it especially important to complete training in a reasonable amount of time. To address the latter problem, I propose a new training technique that aims to improve the training effectiveness of neural networks in the field of computer vision. I present the self-optimizing augmentation pipeline SOAP, applied and investigated in the field of 6D object pose estimation, more precisely on the 6D object pose estimator GDRNPP.
SOAP requires a differentiable image generation process. For this purpose, the training process is analyzed separately with two different image generation models, PyTorch3D and Stable Diffusion. The proposed pipeline is evaluated on the LM-O, T-LESS, and ITODD datasets, a subset of the seven core datasets of the BOP challenge, focusing on the task of model-based 6D localization of seen objects. To allow benchmarking, I present an evaluation that is comparable to the corresponding method on the BOP leaderboard [4].
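To illustrate the differentiable image generation requirement, the following minimal sketch shows how PyTorch3D can render an object view such that gradients from a downstream training loss flow back into augmentation parameters (here, the virtual camera distance and elevation). The sketch is illustrative only and not taken from the thesis; the mesh file and parameter choices are hypothetical.

# Minimal, hypothetical sketch of differentiable image generation with PyTorch3D.
# The mesh file and the choice of augmentation parameters are illustrative only.
import torch
from pytorch3d.io import load_objs_as_meshes
from pytorch3d.renderer import (
    FoVPerspectiveCameras, RasterizationSettings, MeshRenderer,
    MeshRasterizer, SoftPhongShader, PointLights, look_at_view_transform,
)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
mesh = load_objs_as_meshes(["object.obj"], device=device)  # placeholder asset

# Learnable augmentation parameters: camera distance and elevation (degrees).
aug_params = torch.nn.Parameter(torch.tensor([2.5, 30.0], device=device))

R, T = look_at_view_transform(dist=aug_params[0], elev=aug_params[1],
                              azim=0.0, device=device)
cameras = FoVPerspectiveCameras(device=device, R=R, T=T)
lights = PointLights(device=device, location=[[0.0, 0.0, 3.0]])

renderer = MeshRenderer(
    rasterizer=MeshRasterizer(
        cameras=cameras,
        raster_settings=RasterizationSettings(image_size=256),
    ),
    shader=SoftPhongShader(device=device, cameras=cameras, lights=lights),
)

image = renderer(mesh)        # (1, 256, 256, 4), differentiable w.r.t. aug_params
loss = image[..., :3].mean()  # stand-in for the pose estimator's training loss
loss.backward()               # gradients now reach the augmentation parameters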
Thesis Note
Darmstadt, TU, Master Thesis, 2024
Language
English
Keyword(s)
Industry: Automotive
Industry: Healthcare
Industry: Information Technology
Industry: Maritime Economy
Industry: Cultural and Creative Economy
Research Line: Computer graphics (CG)
Research Line: Computer vision (CV)
Research Line: Machine learning (ML)
LTA: Scalable architectures for massive data sets
LTA: Machine intelligence, algorithms, and data structures (incl. semantics)
3D Computer vision
Deep learning
3D Pattern/Structure recognition
3D Object localisation