Fraunhofer-Gesellschaft
2025
Conference Paper
Title

HybridFormer: Bridging Local and Global Spatio-Temporal Dynamics for Efficient Skeleton-Based Action Recognition

Abstract
Skeleton-based action recognition aims to identify human actions from joint coordinates and skeletal connections. Previous works have effectively employed graph convolutional networks and, more recently, attention-based architectures to capture joint topology. Yet most approaches treat spatial and temporal dynamics separately. The proven efficacy of global modeling in image and video recognition raises the question of whether it is applicable and beneficial for skeleton-based recognition. However, applying global modeling to skeletons introduces challenges, including extensive data requirements and substantial computational demands. In this paper, we attempt to address these challenges and present a detailed mathematical analysis of the computational complexity. We propose a novel, efficient model, HybridFormer, which first uses local blocks for separate spatial and temporal modeling, laying a solid foundation for learning. Subsequently, global blocks with attention mechanisms merge these dimensions, capturing complex action interdependencies. This dual-phase approach overcomes previous limitations, achieves performance comparable to the state of the art on the NTU-60, NTU-120, and NW-UCLA datasets, and demonstrates significantly enhanced inference efficiency.
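The abstract's point about the computational demands of global modeling can be illustrated with a rough count of attention pairs. The sketch below is not the paper's analysis, which this record does not reproduce. It assumes plain full self-attention, where the number of query-key pairs grows with the square of the token count, and compares attending over space and time separately with attending over all joint-frame tokens at once. The function names and the clip length of 64 frames are hypothetical; 25 joints matches the NTU RGB+D skeleton.

```python
def attention_pairs(num_tokens: int) -> int:
    """Query-key pairs scored by full self-attention over num_tokens tokens."""
    return num_tokens * num_tokens


def separate_pairs(frames: int, joints: int) -> int:
    """Pairs scored when space and time are attended separately (assumed scheme).

    Spatial attention runs once per frame over its joints,
    temporal attention runs once per joint over its frames.
    """
    spatial = frames * attention_pairs(joints)
    temporal = joints * attention_pairs(frames)
    return spatial + temporal


def global_pairs(frames: int, joints: int) -> int:
    """Pairs scored when every joint-frame token attends to every other one."""
    return attention_pairs(frames * joints)


if __name__ == "__main__":
    frames, joints = 64, 25  # hypothetical clip length; 25 joints per NTU skeleton
    sep = separate_pairs(frames, joints)
    glob = global_pairs(frames, joints)
    print(f"separate: {sep} pairs, global: {glob} pairs, ratio: {glob / sep:.1f}x")
```

Under these assumptions the global variant scores well over ten times as many pairs, which is one way to see why a design might reserve global attention for a later stage, for example after the token count has been reduced.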
Author(s)
Zhong, Zeyun
Fraunhofer-Institut für Optronik, Systemtechnik und Bildauswertung IOSB  
Li, Tianrui
Martin, Manuel  
Fraunhofer-Institut für Optronik, Systemtechnik und Bildauswertung IOSB  
Cormier, Mickael  
Fraunhofer-Institut für Optronik, Systemtechnik und Bildauswertung IOSB  
Wu, Chengzhi
Fraunhofer-Institut für Optronik, Systemtechnik und Bildauswertung IOSB  
Diederichs, Frederik  
Fraunhofer-Institut für Optronik, Systemtechnik und Bildauswertung IOSB  
Beyerer, Jürgen  
Fraunhofer-Institut für Optronik, Systemtechnik und Bildauswertung IOSB  
Mainwork
Computer Vision - ECCV 2024 Workshops. Proceedings. Part XIII  
Conference
European Conference on Computer Vision 2024  
DOI
10.1007/978-3-031-91575-8_2
Language
English
Fraunhofer-Institut für Optronik, Systemtechnik und Bildauswertung IOSB  