2024
Master Thesis
Title
Poses of Steel: Forging Accurate 6D Metal Pose Estimation in Images with Iron-ic Precision
Abstract
This thesis focuses on improving the accuracy of 6D pose estimation for metallic objects, addressing challenges such as reflections, specular highlights, and occlusion. Overcoming these challenges matters because metallic surfaces are common and many real-world applications involve metallic objects, especially in industrial settings. A novel dataset was created to simulate these conditions. It contains objects from three categories: Can, Household, and Industry. This wide range of objects makes it suitable for many applications. All objects are presented in various scenes with different backgrounds and lighting conditions to represent real-world scenarios. In addition to the dataset, this work extends GDR-Net [1], which already performs well in 6D pose estimation. The first extension uses different methods to generate keypoints on the ground-truth data and uses the resulting heatmap for training. By learning to predict this heatmap, the model improves its understanding of the spatial scene structure, which increases average performance. The second extension focuses on reconstructing the input scene under ideal conditions: the model learns to predict the scene’s appearance without the metallic components of the objects. This optimized scene representation yields simpler scenarios with fewer reflections and specular highlights caused by the environment. It helps the model focus on pose estimation without being distracted by reflections, which makes it more reliable on metallic surfaces. The improved model was evaluated on the new dataset and on the ITODD dataset from the BOP challenge. The evaluation shows that the extensions can improve model accuracy despite the unique challenges of metallic objects. However, when the same modifications are applied to the ITODD dataset, performance decreases. 
This highlights that the extensions address the challenges of reflection under controlled conditions but may require further adjustments to generalize better under different conditions or datasets.
Despite these limitations, the results demonstrate that adapting an existing pose estimation model to unique challenges can enhance its performance and is a promising direction for future research.
Thesis Note
Darmstadt, TU, Master Thesis, 2024
Language
English
Keyword(s)
Sector: Automotive Industry
Sector: Healthcare
Sector: Information Technology
Sector: Cultural and Creative Economy
Research Line: Computer graphics (CG)
Research Line: Computer vision (CV)
Research Line: Machine learning (ML)
LTA: Scalable architectures for massive data sets
LTA: Machine intelligence, algorithms, and data structures (incl. semantics)
3D Computer vision
Deep learning
3D Pattern/Structure recognition
3D Object localisation