Here you will find scientific publications from the Fraunhofer Institutes.

Deep Neural Network Approaches for Selective Hearing based on Spatial Data Simulation

Hestermann, Simon; Lukashevich, Hanna; Sladeczek, Christoph


Werner, S. ; Verband Deutscher Tonmeister -VDT-; TU Ilmenau:
Audio for virtual, augmented and mixed realities. Proceedings of ICSA 2019 : 5th International Conference on Spatial Audio, September 26th to 28th, 2019, Ilmenau, Germany
Ilmenau: ilmedia, 2019
URN: urn:nbn:de:gbv:ilm1-2019200492
DOI: 10.22032/dbt.39936
International Conference on Spatial Audio (ICSA) <5, 2019, Ilmenau>
Conference Paper
Fraunhofer IDMT

Selective Hearing (SH) refers to the listener’s attention to specific sound sources of interest in their auditory scene. Achieving SH through computational means involves detection, classification, separation, localization and enhancement of sound sources. Deep neural networks (DNNs) have been shown to perform these tasks in a robust and time-efficient manner. A promising application of SH is intelligent noise-cancelling headphones, where sound sources of interest, such as warning signals, sirens or speech, are extracted from a given auditory scene and conveyed to the user, whilst the rest of the auditory scene remains inaudible. For this purpose, existing noise cancellation approaches need to be combined with machine learning techniques. In this context, we evaluate a convolutional neural network (CNN) architecture and a long short-term memory (LSTM) architecture for the detection and separation of sirens. In addition, we propose a data simulation approach for generating different sound environments for a virtual pair of headphone microphones. The Fraunhofer SpatialSound Wave technology is used for a realistic evaluation of the trained models. For the evaluation, a three-dimensional acoustic scene is simulated via the object-based audio approach.
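The abstract does not describe the data simulation in detail, so the following is only a minimal sketch of the general idea of generating training mixtures for a virtual pair of headphone microphones. It is not the authors' method and does not use SpatialSound Wave. Everything in it is an assumption made for illustration: the toy siren signal, the white background noise, the integer inter-ear delay (`itd_samples`), the level difference (`ild_gain`), and all function names.

```python
import math
import random


def siren(num_samples, sample_rate=16000, f_low=600.0, f_high=1200.0, sweep_hz=0.5):
    """Toy siren (assumption): a sine whose frequency sweeps between f_low and f_high."""
    out = []
    phase = 0.0
    for n in range(num_samples):
        t = n / sample_rate
        # Instantaneous frequency oscillates sinusoidally between f_low and f_high.
        freq = f_low + (f_high - f_low) * 0.5 * (1.0 + math.sin(2.0 * math.pi * sweep_hz * t))
        phase += 2.0 * math.pi * freq / sample_rate
        out.append(math.sin(phase))
    return out


def power(x):
    """Mean signal power."""
    return sum(v * v for v in x) / len(x)


def mix_at_snr(target, noise, snr_db):
    """Scale the noise so the target-to-noise power ratio equals snr_db, then add."""
    gain = math.sqrt(power(target) / (power(noise) * 10.0 ** (snr_db / 10.0)))
    return [t + gain * v for t, v in zip(target, noise)]


def delay(x, samples):
    """Delay by an integer number of samples, zero-padded, keeping the length."""
    return [0.0] * samples + x[:len(x) - samples]


def simulate_pair(num_samples=16000, snr_db=0.0, itd_samples=8, ild_gain=0.7, seed=0):
    """Return ((left_mix, right_mix), (left_target, right_target)).

    Hypothetical setup: the source sits to the listener's left, so the right
    microphone receives a delayed and attenuated copy of the siren. Each
    microphone gets independent white noise as a stand-in for the background.
    """
    rng = random.Random(seed)
    src = siren(num_samples)
    left_target = src
    right_target = [ild_gain * v for v in delay(src, itd_samples)]
    left_noise = [rng.gauss(0.0, 1.0) for _ in range(num_samples)]
    right_noise = [rng.gauss(0.0, 1.0) for _ in range(num_samples)]
    left_mix = mix_at_snr(left_target, left_noise, snr_db)
    right_mix = mix_at_snr(right_target, right_noise, snr_db)
    return (left_mix, right_mix), (left_target, right_target)
```

In a setup like this, the mixtures would serve as network input and the clean targets as the separation reference; varying `snr_db`, `itd_samples` and `ild_gain` per example is one simple way to obtain different sound environments.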