Acoustic scene classification using time-delay neural networks and amplitude modulation filter bank features

Authors: Moritz, Niko; Schröder, Jens; Goetze, Stefan; Anemüller, Jörn; Kollmeier, Birger

Virtanen, T.; Tampere University of Technology:
Detection and Classification of Acoustic Scenes and Events Workshop, DCASE 2016. Proceedings. Online resource : 3 September 2016, Budapest, Hungary
Tampere: Tampere University of Technology, 2016
ISBN: 978-952-15-3807-0
Detection and Classification of Acoustic Scenes and Events Workshop (DCASE) <2016, Budapest>
Conference paper, electronic publication
Fraunhofer IDMT
Time-delay neural networks; acoustic scene classification; DCASE; amplitude modulation filter bank features

This paper presents a system for acoustic scene classification (ASC) that is applied to data of the ASC task of the DCASE’16 challenge (Task 1). The proposed method is based on extracting acoustic features that employ a relatively long temporal context, i.e., amplitude modulation filter bank (AMFB) features, prior to detection of acoustic scenes using a neural network (NN) based classification approach. Recurrent neural networks (RNNs) are well suited to modeling long-term acoustic dependencies, which are known to encode important information for ASC tasks. However, RNNs require a relatively large amount of training data in comparison to feed-forward deep neural networks (DNNs). Hence, the present work uses the time-delay neural network (TDNN) approach, which enables analysis of long contextual information similar to RNNs but with a training effort comparable to that of conventional DNNs. The proposed ASC system attains a recognition accuracy of 76.5 % on the development set, which is 4.0 % higher than the DCASE’16 baseline system.
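
The time-delay idea named in the abstract can be illustrated with a minimal sketch. It is not the authors' network: the layer offsets, layer sizes, random weights, and all function names below are hypothetical and chosen only for illustration. Each layer splices a few frames at fixed time offsets and applies an ordinary feed-forward transform, so stacking layers widens the temporal context without any recurrent connections.

```python
import random


def tdnn_layer(frames, weights, bias, offsets):
    """Apply one time-delay layer (frame splicing + affine transform + ReLU).

    frames:  list of T input vectors, each a list of D_in floats
    weights: list of D_out rows, each of length len(offsets) * D_in
    bias:    list of D_out floats
    offsets: relative frame offsets that are spliced, e.g. (-2, 0, 2)
    Only positions whose full context lies inside the input are kept.
    """
    start = max(0, -min(offsets))
    stop = len(frames) - max(0, max(offsets))
    out = []
    for t in range(start, stop):
        # Concatenate the frames at the requested offsets into one vector.
        spliced = [x for o in offsets for x in frames[t + o]]
        # Feed-forward transform followed by a ReLU non-linearity.
        out.append([
            max(0.0, sum(w * x for w, x in zip(row, spliced)) + b)
            for row, b in zip(weights, bias)
        ])
    return out


def receptive_field(layer_offsets):
    """Number of input frames that influence one output of the stack."""
    return 1 + sum(max(o) - min(o) for o in layer_offsets)


def random_layer(d_in, d_out, offsets, rng):
    """Create small random weights and zero biases (illustrative only)."""
    n_in = d_in * len(offsets)
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(d_out)]
    return weights, [0.0] * d_out


# Hypothetical splicing pattern: wider offsets in higher layers.
LAYER_OFFSETS = [(-2, -1, 0, 1, 2), (-2, 0, 2), (-4, 0, 4)]


def run_demo(num_frames=40, d_in=3, d_hidden=4, seed=0):
    """Push random feature frames through the stacked time-delay layers."""
    rng = random.Random(seed)
    x = [[rng.gauss(0.0, 1.0) for _ in range(d_in)]
         for _ in range(num_frames)]
    dim = d_in
    for offsets in LAYER_OFFSETS:
        weights, bias = random_layer(dim, d_hidden, offsets, rng)
        x = tdnn_layer(x, weights, bias, offsets)
        dim = d_hidden
    return x


if __name__ == "__main__":
    outputs = run_demo()
    print("context in frames:", receptive_field(LAYER_OFFSETS))
    print("output frames:", len(outputs))
```

With these assumed offsets, each output frame depends on 17 consecutive input frames, yet every layer is trained like a standard feed-forward layer, which is the trade-off the abstract describes.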