Fraunhofer-Gesellschaft
2025
Journal Article
Title

AudioProtoPNet: An interpretable deep learning model for bird sound classification

Abstract
Deep learning models have significantly advanced acoustic bird monitoring by recognizing numerous bird species based on their vocalizations. However, traditional deep learning models are black boxes that provide no insight into their underlying computations, limiting their usefulness to ornithologists and machine learning engineers. Explainable models could facilitate debugging, knowledge discovery, trust, and interdisciplinary collaboration. We introduce AudioProtoPNet, an adaptation of the Prototypical Part Network (ProtoPNet) for multi-label bird sound classification. It is inherently interpretable, combining a ConvNeXt backbone that extracts embeddings with a prototype learning classifier trained on these embeddings. The classifier learns prototypical patterns of each bird species’ vocalizations from spectrograms of instances in the training data. During inference, recordings are classified by comparing them to the learned prototypes in embedding space, which provides explanations for the model’s decisions and insights into the most informative embeddings of each bird species. The model was trained on the BirdSet training dataset, which comprises 9734 bird species and over 6800 hours of recordings, and evaluated on the seven BirdSet test datasets, which cover different geographical regions. AudioProtoPNet outperformed Perch, the state-of-the-art bird sound classification model (itself superior to the more widely used BirdNet), achieving an average AUROC of 0.90 and a cmAP of 0.42, relative improvements of 7.1% and 16.7% over Perch, respectively. These results demonstrate that even for the challenging task of multi-label bird sound classification, it is possible to develop powerful yet interpretable deep learning models that provide valuable insights for professionals in ornithology and machine learning.
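The inference scheme described in the abstract — comparing patch embeddings of a spectrogram against learned per-class prototypes and pooling the best matches into class scores — can be sketched as follows. This is a minimal illustration, not the paper's implementation: all shapes, the cosine similarity measure, and the sum over prototype activations (the paper trains a classifier on top instead) are assumptions made for the sketch.

```python
import numpy as np

# Hypothetical shapes for illustration; the actual AudioProtoPNet
# configuration (embedding size, prototypes per class) is not given here.
rng = np.random.default_rng(0)

num_classes = 3          # e.g. three bird species
protos_per_class = 2
embed_dim = 8
H, W = 4, 6              # spatial grid of the spectrogram embedding map

# Backbone output for one recording: one embedding per spectrogram patch.
embedding_map = rng.normal(size=(H, W, embed_dim))

# Learned prototypes: prototypical vocalization parts in embedding space.
prototypes = rng.normal(size=(num_classes * protos_per_class, embed_dim))

def prototype_logits(emb_map, protos):
    """Score each class by how strongly its prototypes match any patch."""
    patches = emb_map.reshape(-1, emb_map.shape[-1])          # (H*W, D)
    # Cosine similarity between every patch and every prototype.
    p = patches / np.linalg.norm(patches, axis=1, keepdims=True)
    q = protos / np.linalg.norm(protos, axis=1, keepdims=True)
    sim = p @ q.T                                             # (H*W, P)
    # Max-pool over patches: each prototype's best match in the recording.
    proto_act = sim.max(axis=0)                               # (P,)
    # Collapse each class's prototype activations into one class logit
    # (a plain sum here; the real model learns this aggregation).
    return proto_act.reshape(num_classes, protos_per_class).sum(axis=1)

logits = prototype_logits(embedding_map, prototypes)
# Multi-label task: an independent sigmoid per species, not a softmax.
probs = 1 / (1 + np.exp(-logits))
```

The per-prototype maximum over patches is what makes the model interpretable: the arg-max patch locates the spectrogram region that most resembles each prototype, which can be shown to the user as the explanation.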
Author(s)
Heinrich, René Patrick Gerald
Fraunhofer-Institut für Energiewirtschaft und Energiesystemtechnik IEE  
Rauch, Lukas
Universität Kassel
Sick, Bernhard
Universität Kassel
Scholz, Christoph
Fraunhofer-Institut für Energiewirtschaft und Energiesystemtechnik IEE  
Journal
Ecological Informatics  
Open Access
DOI
10.1016/j.ecoinf.2025.103081
Language
English
Institute(s)
Fraunhofer-Institut für Energiewirtschaft und Energiesystemtechnik IEE
Keyword(s)
  • Avian diversity
  • Bioacoustics
  • Bird sound recognition
  • BirdNet
  • Deep learning
  • Explainable AI
  • Interpretability
