2025
Conference Paper
Title
Improving Zero-Shot Template-Based 6D Pose Estimation with Geometric Features
Abstract
6D object pose estimation is a fundamental problem in robotics and augmented reality. Most of today’s state-of-the-art approaches rely on deep learning and require large sets of training images depicting the target objects. A growing number of algorithms try to generalize from a set of known objects, available for training, to unseen objects at test time. Among those, GigaPose is a template-based approach that renders the target object in an onboarding phase shortly before inference and uses learned latent codes of these renderings and of the observed objects for feature matching. While learned representations prove powerful in a wide range of tasks, we propose integrating additional, purely geometric features, which can be extracted at almost no cost from the available 3D meshes during the onboarding phase. This representation is then used as an additional input for template matching and 2D-2D correspondence matching in our approach. We consider multiple candidate features and, implementing one of them, demonstrate improved performance on the core datasets of the BOP Challenge. Our results suggest that utilizing additional geometric features can indeed improve the relevant metrics without much additional cost.
Author(s)
Keyword(s)
Sector: Automotive Industry
Sector: Healthcare
Sector: Information Technology
Sector: Cultural and Creative Economy
Research Line: Computer vision (CV)
Research Line: Machine learning (ML)
LTA: Scalable architectures for massive data sets
LTA: Machine intelligence, algorithms, and data structures (incl. semantics)
3D Computer vision
Machine learning
Pattern recognition
3D Object localisation