Transition in Focus of Prediction Tasks for Skeleton Graph Component Detection with Transformer

Wang, Zhiyuan; Yang, Cong; Zhang, Yulu; Boukhers, Zeyd; Sui, Wei; Ji, Yi; Liu, Chunping

doi:10.1145/3696409.3700170

December 3, 2024

Conference Paper

Abstract

Recent advances in skeleton extraction have improved the process by simplifying the skeleton regression task into graph component detection. Despite these gains in skeleton topology, rendering skeletal parts accurately remains challenging, with specific issues such as jagged edges in high-resolution images. This paper identifies limitations in how current detection models adapt during the decomposition and reconstruction phases, limitations that reduce the overall precision of the extraction. In response, we revise the primary focus of the detection tasks. Inspired by the success of pixel-wise binary classification methods, we propose a gradual transition in focus, during training, from a coordinate localization regression task to a classification task of predicting points. This transition can be achieved by adjusting the number of object queries in the Transformer model. Theoretical and experimental evaluations validate the effectiveness of our approach. Our method yields improvements in performance over the baseline across various shape and image datasets (e.g., 0.836 vs. 0.826 for BlumNet on the SK1491 dataset).
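
The abstract says the transition is driven by adjusting the number of object queries during training, but this record does not give the schedule, the direction of the change, or the query counts. The following is a minimal sketch, assuming a linear per-epoch schedule; the function name `num_queries_at` and all numeric values are hypothetical and are not taken from the paper.

```python
def num_queries_at(epoch: int, total_epochs: int, start: int, end: int) -> int:
    """Linearly interpolate the object-query count between start and end.

    Hypothetical helper: the paper's actual schedule is not given here.
    """
    if total_epochs <= 1:
        return end
    # Clamp progress to [0, 1] so out-of-range epochs stay within bounds.
    progress = min(max(epoch / (total_epochs - 1), 0.0), 1.0)
    return round(start + progress * (end - start))


# Illustrative values only (assumed, not from the paper).
schedule = [num_queries_at(e, 5, start=100, end=500) for e in range(5)]
print(schedule)
```

Because `start` and `end` are free parameters, the same helper covers either an increasing or a decreasing query count; which direction matches the authors' method would need to be checked against the full paper.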

Author(s)

Wang, Zhiyuan

Yang, Cong

Zhang, Yulu

Boukhers, Zeyd

Fraunhofer-Institut für Angewandte Informationstechnik FIT

Sui, Wei

Ji, Yi

Liu, Chunping

Mainwork

6th ACM International Conference on Multimedia in Asia, MMAsia 2024. Proceedings

Conference

International Conference on Multimedia in Asia 2024
