Year
2024
Document Type
Conference Paper
Title

AttnLRP: Attention-Aware Layer-Wise Relevance Propagation for Transformers

Abstract
Large Language Models are prone to biased predictions and hallucinations, underlining the paramount importance of understanding their model-internal reasoning process. However, achieving faithful attributions for the entirety of a black-box transformer model while maintaining computational efficiency remains an unsolved challenge. By extending the Layer-wise Relevance Propagation attribution method to handle attention layers, we address these challenges effectively. While partial solutions exist, our method is the first to faithfully and holistically attribute not only the input but also the latent representations of transformer models, with computational efficiency comparable to a single backward pass. Through extensive evaluations against existing methods on LLaMa 2, Mixtral 8x7b, Flan-T5 and vision transformer architectures, we demonstrate that our proposed approach surpasses alternative methods in terms of faithfulness and enables the understanding of latent representations, opening the door to concept-based explanations. We provide an LRP library at https://github.com/rachtibat/LRP-eXplains-Transformers.
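To make the abstract's "single backward pass" cost model concrete, the sketch below shows the general shape of such an attribution: relevance for each input token obtained from one forward and one backward pass through a Hugging Face causal LM. It uses plain gradient-times-input as a stand-in baseline, not AttnLRP's specialized relevance rules for attention layers (those are implemented in the authors' library linked above); the "gpt2" checkpoint and the example prompt are illustrative assumptions, not choices from the paper.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: "gpt2" stands in for the larger models evaluated in the paper
# (LLaMa 2, Mixtral 8x7b, Flan-T5); any causal LM in transformers works here.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

inputs = tokenizer("Paris is the capital of", return_tensors="pt")

# Attribute through the embedding layer so relevance lands on input tokens.
embeds = model.get_input_embeddings()(inputs["input_ids"])
embeds = embeds.detach().requires_grad_(True)

logits = model(inputs_embeds=embeds, attention_mask=inputs["attention_mask"]).logits
target = logits[0, -1].max()  # logit of the most likely next token
target.backward()             # a single backward pass, matching the cost model

# Gradient x input, summed over the hidden dimension, as a per-token score.
# AttnLRP instead propagates relevance with attention-aware LRP rules,
# which is what yields the faithfulness gains reported in the paper.
relevance = (embeds.grad * embeds).sum(-1).squeeze(0)
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for token, score in zip(tokens, relevance):
    print(f"{token:>12s} {score.item():+.4f}")
```

Because the attribution is computed on the embedding tensor rather than the token IDs, the same pattern extends to latent representations: hook any intermediate activation, mark it as requiring gradients, and read off its relevance after the backward pass.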
Author(s)
Achtibat, Reduan
Fraunhofer-Institut für Nachrichtentechnik, Heinrich-Hertz-Institut HHI  
Vakilzadeh Hatefi, Sayed Mohammad
Fraunhofer-Institut für Nachrichtentechnik, Heinrich-Hertz-Institut HHI  
Dreyer, Maximilian
Fraunhofer-Institut für Nachrichtentechnik, Heinrich-Hertz-Institut HHI  
Jain, Aakriti
Fraunhofer-Institut für Nachrichtentechnik, Heinrich-Hertz-Institut HHI  
Wiegand, Thomas  
Fraunhofer-Institut für Nachrichtentechnik, Heinrich-Hertz-Institut HHI  
Lapuschkin, Sebastian Roland
Fraunhofer-Institut für Nachrichtentechnik, Heinrich-Hertz-Institut HHI  
Samek, Wojciech  
Fraunhofer-Institut für Nachrichtentechnik, Heinrich-Hertz-Institut HHI  
Mainwork
41st International Conference on Machine Learning, ICML 2024. Proceedings  
Conference
International Conference on Machine Learning 2024  
Language
English
Institute
Fraunhofer-Institut für Nachrichtentechnik, Heinrich-Hertz-Institut HHI