2026
Conference Paper
Title

Scalable Video Action Anticipation with Cross Linear Attentive Memory

Abstract
Recent advances in action anticipation rely heavily on Transformer architectures to learn discriminative representations of the past observation, incurring high computational and memory overhead that limits their applicability to long videos. While temporal processors with linear complexity like RNNs and state-space models offer efficient alternatives, their sequential nature risks overlooking subtle cues in observed frames that could enhance future anticipation. We address this limitation with Cross Linear Attentive Memory (CLAM), a memory module that selectively retrieves complementary context cues from frame features. By reformulating linear attention to replace traditional cross-attention, CLAM achieves linear computational complexity and constant memory usage relative to input length. Finally, by fusing the outputs of the temporal processor and CLAM, a non-autoregressive Transformer decoder generates future actions in one shot with high accuracy. Experiments on egocentric (EpicKitchens100 and Ego4D) and third-person (Thumos14) benchmarks demonstrate our model’s superior anticipation accuracy and scalability, processing longer sequences with significantly less latency growth than alternatives. Our approach also achieves promising results in online action detection.
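The complexity claim in the abstract can be illustrated with a generic linear-attention sketch. This is not the authors' CLAM implementation, which this record does not describe in detail. The class name `LinearCrossAttentionMemory`, the `elu(x) + 1` feature map, the `eps` stabilizer, and the helper `explicit_attention` are all assumptions made for illustration. The sketch shows only the general idea that a kernelized attention read can be served from a running state whose size does not grow with the number of observed frames.

```python
import math


def feature_map(x):
    # elu(x) + 1: keeps every feature strictly positive (assumed choice).
    return [v + 1.0 if v > 0 else math.exp(v) for v in x]


class LinearCrossAttentionMemory:
    """Hypothetical running-state memory; not the paper's CLAM module."""

    def __init__(self, d_k, d_v):
        # State size depends only on feature dimensions, not on frame count.
        self.S = [[0.0] * d_v for _ in range(d_k)]  # sum of phi(k) v^T
        self.z = [0.0] * d_k                        # sum of phi(k)

    def write(self, key, value):
        # O(d_k * d_v) work per observed frame.
        fk = feature_map(key)
        for i, ki in enumerate(fk):
            self.z[i] += ki
            for j, vj in enumerate(value):
                self.S[i][j] += ki * vj

    def read(self, query, eps=1e-9):
        # Retrieve phi(q)^T S / (phi(q)^T z) without revisiting past frames.
        fq = feature_map(query)
        denom = sum(qi * zi for qi, zi in zip(fq, self.z)) + eps
        d_v = len(self.S[0])
        return [sum(fq[i] * self.S[i][j] for i in range(len(fq))) / denom
                for j in range(d_v)]


def explicit_attention(query, keys, values):
    # Reference: the same kernel attention computed over all stored frames.
    fq = feature_map(query)
    weights = [sum(a * b for a, b in zip(fq, feature_map(k))) for k in keys]
    total = sum(weights)
    return [sum(w * v[j] for w, v in zip(weights, values)) / total
            for j in range(len(values[0]))]
```

Writing frames one at a time with `write` and then calling `read` should agree, up to floating-point tolerance, with `explicit_attention` over the full list of keys and values, while the memory object itself stores only a `d_k × d_v` matrix and a `d_k` vector.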
Author(s)
Zhong, Zeyun
Fraunhofer-Institut für Optronik, Systemtechnik und Bildauswertung IOSB  
Martin, Manuel  
Fraunhofer-Institut für Optronik, Systemtechnik und Bildauswertung IOSB  
Schneider, David
Lerch, David
Fraunhofer-Institut für Optronik, Systemtechnik und Bildauswertung IOSB  
Wu, Chengzhi
Fraunhofer-Institut für Optronik, Systemtechnik und Bildauswertung IOSB  
Diederichs, Frederik  
Fraunhofer-Institut für Optronik, Systemtechnik und Bildauswertung IOSB  
Gall, Jürgen
Beyerer, Jürgen  
Fraunhofer-Institut für Optronik, Systemtechnik und Bildauswertung IOSB  
Mainwork
IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2026. Proceedings  
Conference
Winter Conference on Applications of Computer Vision 2026  
DOI
10.1109/WACV61042.2026.00783
Language
English
Fraunhofer-Institut für Optronik, Systemtechnik und Bildauswertung IOSB  
Keyword(s)
  • Feeds
  • Antennas
  • Memory modules
  • Printed circuits
  • LoRa
  • Videos
  • Streaming media
  • Video equipment
  • Protocols
  • Data communication
