Deep Neural Network Approaches for Selective Hearing based on Spatial Data Simulation

Hestermann, Simon; Lukashevich, Hanna; Sladeczek, Christoph

doi:10.22032/dbt.39957

2019

Conference Paper

Abstract

Selective Hearing (SH) refers to the listener's attention to specific sound sources of interest in their auditory scene. Achieving SH through computational means involves detection, classification, separation, localization and enhancement of sound sources. Deep neural networks (DNNs) have been shown to perform these tasks in a robust and time-efficient manner. A promising application of SH is intelligent noise-cancelling headphones, where sound sources of interest, such as warning signals, sirens or speech, are extracted from a given auditory scene and conveyed to the user, whilst the rest of the auditory scene remains inaudible. For this purpose, existing noise cancellation approaches need to be combined with machine learning techniques. In this context, we evaluate a convolutional neural network (CNN) architecture and a long short-term memory (LSTM) architecture for the detection and separation of sirens. In addition, we propose a data simulation approach for generating different sound environments for a virtual pair of headphone microphones. The Fraunhofer SpatialSound Wave technology is used for a realistic evaluation of the trained models. For the evaluation, a three-dimensional acoustic scene is simulated via the object-based audio approach.

Author(s)

Hestermann, Simon

Lukashevich, Hanna

Sladeczek, Christoph

Mainwork

Audio for virtual, augmented and mixed realities. Proceedings of ICSA 2019

Conference

International Conference on Spatial Audio (ICSA) 2019
