2025
Journal Article
Title

Ensuring medical AI safety: interpretability-driven detection and mitigation of spurious model behavior and associated data

Abstract
Deep neural networks are increasingly employed in high-stakes medical applications, despite their tendency for shortcut learning in the presence of spurious correlations, which can have potentially fatal consequences in practice. Whereas a multitude of works address either the detection or mitigation of such shortcut behavior in isolation, the Reveal2Revise approach provides a comprehensive bias mitigation framework combining these steps. However, effectively addressing these biases often requires substantial labeling efforts from domain experts. In this work, we review the steps of the Reveal2Revise framework and enhance it with semi-automated interpretability-based bias annotation capabilities. This includes methods for the sample- and feature-level bias annotation, providing valuable information for bias mitigation methods to unlearn the undesired shortcut behavior. We show the applicability of the framework using four medical datasets across two modalities, featuring controlled and real-world spurious correlations caused by data artifacts. We successfully identify and mitigate these biases in VGG16, ResNet50, and contemporary Vision Transformer models, ultimately increasing their robustness and applicability for real-world medical tasks. Our code is available at https://github.com/frederikpahde/medical-ai-safety.
Author(s)
Pahde, Frederik
Fraunhofer-Institut für Nachrichtentechnik, Heinrich-Hertz-Institut HHI  
Wiegand, Thomas  
Fraunhofer-Institut für Nachrichtentechnik, Heinrich-Hertz-Institut HHI  
Lapuschkin, Sebastian Roland
Fraunhofer-Institut für Nachrichtentechnik, Heinrich-Hertz-Institut HHI  
Samek, Wojciech  
Fraunhofer-Institut für Nachrichtentechnik, Heinrich-Hertz-Institut HHI  
Journal
Machine learning  
Open Access
Rights
CC BY 4.0: Creative Commons Attribution
DOI
10.1007/s10994-025-06834-w
10.24406/publica-5290
Language
English
Keyword(s)
  • Bias mitigation
  • Data annotation
  • Explainable artificial intelligence
  • Interpretability
  • Spurious correlations
