2023
Conference Paper
Title

Anticipative Feature Fusion Transformer for Multi-Modal Action Anticipation

Abstract
Although human action anticipation is an inherently multi-modal task, state-of-the-art methods on well-known action anticipation datasets leverage this data by applying ensemble methods and averaging the scores of uni-modal anticipation networks. In this work, we introduce transformer-based modality fusion techniques, which unify multi-modal data at an early stage. Our Anticipative Feature Fusion Transformer (AFFT) proves superior to popular score fusion approaches and achieves state-of-the-art results, outperforming previous methods on EpicKitchens-100 and EGTEA Gaze+. Our model is easily extensible and allows new modalities to be added without architectural changes. Consequently, we extracted audio features on EpicKitchens-100, which we add to the set of features commonly used in the community.
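The contrast the abstract draws between score fusion and early feature fusion can be illustrated with a minimal NumPy sketch. This is not the AFFT architecture: the modality names, dimensions, random weights, and the single attention layer with mean pooling are all assumptions chosen for illustration, and the helper names `score_fusion` and `attention_feature_fusion` are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def score_fusion(logits_per_modality):
    """Late fusion: average the class probabilities of uni-modal models."""
    probs = [softmax(logits) for logits in logits_per_modality]
    return np.mean(probs, axis=0)


def attention_feature_fusion(features, w_q, w_k, w_v):
    """Early fusion (illustrative only): one self-attention layer over
    modality tokens, followed by mean pooling.

    features: array of shape (num_modalities, dim), one token per modality.
    """
    q, k, v = features @ w_q, features @ w_k, features @ w_v
    attn = softmax(q @ k.T / np.sqrt(k.shape[-1]))  # (M, M) attention weights
    mixed = attn @ v                                # (M, dim) mixed tokens
    return mixed.mean(axis=0)                       # (dim,) fused feature


rng = np.random.default_rng(0)
num_classes, dim = 5, 8
modalities = ["rgb", "flow", "audio"]  # hypothetical modality set

# Late fusion: each uni-modal network produces its own class scores.
logits = [rng.normal(size=num_classes) for _ in modalities]
late = score_fusion(logits)

# Early fusion: modality features are mixed before a single classifier.
feats = rng.normal(size=(len(modalities), dim))
w_q, w_k, w_v = (rng.normal(size=(dim, dim)) for _ in range(3))
fused = attention_feature_fusion(feats, w_q, w_k, w_v)
w_cls = rng.normal(size=(dim, num_classes))  # stand-in for a trained classifier
early = softmax(fused @ w_cls)

# A fourth modality token passes through the same weights unchanged.
feats_plus = rng.normal(size=(len(modalities) + 1, dim))
fused_plus = attention_feature_fusion(feats_plus, w_q, w_k, w_v)
```

In this sketch the attention weights do not depend on the number of modality tokens, which is one plausible reading of why a token-based fusion module can accept an extra modality without architectural changes.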
Author(s)
Zhong, Zeyun
Fraunhofer-Institut für Optronik, Systemtechnik und Bildauswertung IOSB  
Schneider, David
Voit, Michael  
Fraunhofer-Institut für Optronik, Systemtechnik und Bildauswertung IOSB  
Stiefelhagen, Rainer
Beyerer, Jürgen  
Fraunhofer-Institut für Optronik, Systemtechnik und Bildauswertung IOSB  
Mainwork
IEEE Winter Conference on Applications of Computer Vision, WACV 2023. Proceedings  
Conference
Winter Conference on Applications of Computer Vision 2023  
DOI
10.1109/wacv56688.2023.00601
Language
English