Fraunhofer-Gesellschaft
March 2025
Conference Paper
Title

Variance-maximizing acoustic dimensions for machine learning-based soundscape assessment

Abstract
Machine learning (ML) models used in audio signal processing commonly rely on time-frequency representations, such as linear and log Mel spectrograms, as input features. Although these two-dimensional representations facilitate the application of advanced machine vision techniques to audio tasks, they limit the auditory and acoustic information to an average of sound energy (or amplitude) over a relatively coarse time and frequency grid. Prior research has indicated that a substantial amount of information in audio signals can be represented using a few fundamental acoustic dimensions. These dimensions are derived from time series of low-level signal features motivated by human auditory perception within the soundscape framework and are designed to maximize variance across the selected sample. This study examines the application of these acoustic dimensions as inputs for ML tasks, including automatic acoustic scene classification (ASC) and acoustic event detection (AED), comparing them to traditional spectrogram representations. The comparison focuses on accuracy, robustness to unlearned acoustic environments and events, computational complexity, and explainability. Finally, the potential advantages of employing these dimensions for robust, efficient, and automated soundscape assessment are discussed in relation to existing environmental monitoring tools.
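As a rough illustration of what "dimensions designed to maximize variance across the selected sample" could look like in practice, the sketch below applies principal component analysis (PCA) to a matrix of low-level feature frames. PCA is an assumption made here for illustration; the abstract does not name the method the authors use. The function name `variance_maximizing_dimensions`, the number of features, and the synthetic data are likewise hypothetical stand-ins, not the paper's features or results.

```python
import numpy as np


def variance_maximizing_dimensions(features, n_dims=3):
    """Project feature frames onto the n_dims directions of largest variance.

    features: array of shape (n_frames, n_features), one row per time frame.
    Returns (projected frames, principal directions, explained variance ratios).
    """
    X = np.asarray(features, dtype=float)

    # Standardize each feature so that differing units or scales
    # do not dominate the variance criterion.
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # guard against constant features
    Z = (X - mean) / std

    # SVD of the standardized data: the rows of Vt are the principal
    # directions, ordered by decreasing singular value.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    components = Vt[:n_dims]
    explained = (s ** 2) / np.sum(s ** 2)

    return Z @ components.T, components, explained[:n_dims]


# Synthetic stand-in for time series of low-level signal features
# (hypothetical: 500 frames, 6 features driven by 2 latent factors).
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 2))
mixing = rng.normal(size=(2, 6))
frames = latent @ mixing + 0.05 * rng.normal(size=(500, 6))

dims, components, explained = variance_maximizing_dimensions(frames, n_dims=2)
```

Under these assumptions, each row of `dims` is a compact per-frame vector that could be fed to a classifier in place of a spectrogram column, and the rows of `components` show how strongly each input feature contributes to a dimension, which is one way such a representation might support explainability.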
Author(s)
Bergner, Jakob
Fraunhofer-Institut für Digitale Medientechnologie IDMT  
Masovic, Drasko
Fraunhofer-Institut für Digitale Medientechnologie IDMT  
Sladeczek, Christoph  
Fraunhofer-Institut für Digitale Medientechnologie IDMT  
Bös, Joachim  
Fraunhofer-Institut für Digitale Medientechnologie IDMT  
Mainwork
DAS/DAGA 2025, 51st Annual Meeting on Acoustics. Proceedings  
Conference
Annual Meeting on Acoustics 2025  
Language
English
Fraunhofer-Institut für Digitale Medientechnologie IDMT  
Keyword(s)
  • Intelligent Acoustic Sensors
