2025
Poster
Title
Automatic Retrieval of Indicator Sounds for Acoustic Geo-Tagging
Title Supplement
Poster presented at DAS | DAGA 2025
Abstract
Humans describe ambient audio using concepts such as sound sources and acoustic scenes. Location-specific sounds, such as sirens and church bells, help reveal a recording's geographic origin. Termed “soundmarks” in soundscape research, these distinctive acoustic landmarks define a location’s unique sonic character. Over the past decade, the Detection and Classification of Acoustic Scenes and Events (DCASE) community has focused mainly on tasks such as acoustic scene classification (ASC) and sound event detection (SED). This study explores acoustic geo-tagging (AGT), which aims to identify an audio clip's geographic origin (city, country, or acoustic scene). AGT is vital for context-sensitive audio processing in hearing aids, for verifying multimedia content, and for audio forensics. We propose a novel taxonomy of four sound categories, ranging from general sound events with no geographic specificity to highly specific soundmarks that are unique to a location. Additionally, we evaluate two data-driven approaches for automatically retrieving indicator sounds, which can serve as cues for AGT because they occur more frequently in specific areas: (a) a statistical method based on sound-location co-occurrence and (b) a deep learning-based approach utilizing explainable AI techniques.
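The abstract does not specify the statistic behind approach (a). As a minimal illustration of what a sound-location co-occurrence score could look like, the sketch below ranks (location, sound) pairs by pointwise mutual information (PMI). PMI, the function name `indicator_scores`, the `min_count` threshold, and the toy annotations are all assumptions made for illustration, not the poster's method or data.

```python
import math
from collections import Counter


def indicator_scores(observations, min_count=2):
    """Score (location, sound) pairs by pointwise mutual information.

    observations: list of (location, sound_label) tuples, one per annotated
    sound occurrence. PMI is an assumed statistic, chosen for illustration.
    """
    pair_counts = Counter(observations)
    loc_counts = Counter(loc for loc, _ in observations)
    sound_counts = Counter(sound for _, sound in observations)
    total = len(observations)

    scores = {}
    for (loc, sound), n in pair_counts.items():
        if n < min_count:
            continue  # skip rare pairs, whose PMI estimates are unreliable
        # PMI = log2( p(loc, sound) / (p(loc) * p(sound)) )
        scores[(loc, sound)] = math.log2(
            n * total / (loc_counts[loc] * sound_counts[sound])
        )
    return scores


# Hypothetical toy annotations (made up, not from the study).
observations = [
    ("city_a", "church_bell"), ("city_a", "church_bell"),
    ("city_a", "traffic"), ("city_a", "traffic"),
    ("city_b", "siren"), ("city_b", "siren"),
    ("city_b", "traffic"), ("city_b", "traffic"),
]
scores = indicator_scores(observations)
for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 2))
```

In this toy data, sounds that occur in only one location (church_bell, siren) score above zero, while traffic, which is equally common in both, scores zero. That matches the notion of an indicator sound as one that occurs more frequently in specific areas.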
Conference
Rights
Under Copyright
Language
English