
Performance comparison of GMM, HMM and DNN based approaches for acoustic event detection within Task 3 of the DCASE 2016 challenge

Schröder, Jens; Anemüller, Jörn; Goetze, Stefan


Virtanen, T.; Tampere University of Technology:
Detection and Classification of Acoustic Scenes and Events Workshop, DCASE 2016. Proceedings. Online resource : 3 September 2016, Budapest, Hungary
Tampere: Tampere University of Technology, 2016
ISBN: 978-952-15-3807-0
Detection and Classification of Acoustic Scenes and Events Workshop (DCASE) <2016, Budapest>
Conference paper, electronic publication
Fraunhofer IDMT
acoustic event detection; DCASE; Gabor filterbank; Deep neural network

This contribution reports on the performance of systems for polyphonic acoustic event detection (AED) compared within the framework of the “Detection and Classification of Acoustic Scenes and Events 2016” (DCASE’16) challenge. State-of-the-art Gaussian mixture model (GMM) and GMM-hidden Markov model (GMM-HMM) approaches are applied using Mel-frequency cepstral coefficient (MFCC) and Gabor filterbank (GFB) features, alongside a non-negative matrix factorization (NMF) based system. Furthermore, tandem and hybrid deep neural network (DNN)-HMM systems are adopted. The HMM systems are usually of the single-label type, i.e., they output only one label per time segment from a set of possible classes. Here they are extended to multi-label classification systems: each is a compound of individual binary classifiers that discriminate between a target and a non-target class, which makes the overall system capable of multi-labeling. The systems are evaluated on the residential area data of Task 3 of the DCASE’16 challenge. It is shown that the DNN-based system performs worse than the traditional systems for this task. The best results are achieved using GFB features in combination with a single-label GMM-HMM approach.
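
The multi-label extension described in the abstract, one binary target/non-target classifier per event class, can be illustrated with a minimal sketch. This is not the authors' implementation. As a simplifying assumption, each class model is a single diagonal-covariance Gaussian standing in for a full GMM or GMM-HMM, and the decision is a frame-wise log-likelihood ratio test. All names (`DiagGaussian`, `train_binary_detectors`, `detect`), the threshold, and the toy two-dimensional features are hypothetical and chosen only for illustration.

```python
import math
from typing import Dict, List, Sequence, Set, Tuple

Frame = Sequence[float]


class DiagGaussian:
    """Single diagonal-covariance Gaussian (assumed stand-in for a GMM)."""

    def __init__(self, frames: List[Frame], var_floor: float = 1e-3) -> None:
        n, dim = len(frames), len(frames[0])
        self.mean = [sum(f[d] for f in frames) / n for d in range(dim)]
        # Floor the variance so that no dimension collapses to zero.
        self.var = [
            max(sum((f[d] - self.mean[d]) ** 2 for f in frames) / n, var_floor)
            for d in range(dim)
        ]

    def log_likelihood(self, frame: Frame) -> float:
        return sum(
            -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
            for x, m, v in zip(frame, self.mean, self.var)
        )


def train_binary_detectors(
    frames: List[Frame], labels: List[Set[str]], classes: List[str]
) -> Dict[str, Tuple[DiagGaussian, DiagGaussian]]:
    """Fit one (target, non-target) model pair per event class."""
    detectors = {}
    for c in classes:
        target = [f for f, active in zip(frames, labels) if c in active]
        non_target = [f for f, active in zip(frames, labels) if c not in active]
        detectors[c] = (DiagGaussian(target), DiagGaussian(non_target))
    return detectors


def detect(
    detectors: Dict[str, Tuple[DiagGaussian, DiagGaussian]],
    frame: Frame,
    threshold: float = 0.0,
) -> Set[str]:
    """Return every class whose log-likelihood ratio exceeds the threshold."""
    return {
        c
        for c, (tgt, non) in detectors.items()
        if tgt.log_likelihood(frame) - non.log_likelihood(frame) > threshold
    }


# Invented toy data: dimension 0 is high when "car" is active,
# dimension 1 is high when "bird" is active; both can overlap.
train_frames = [
    [0.0, 0.0], [0.1, -0.1],   # no event
    [1.0, 0.0], [0.9, 0.1],    # car only
    [0.0, 1.0], [0.1, 0.9],    # bird only
    [1.0, 1.0], [0.9, 1.1],    # car and bird overlapping
]
train_labels = [
    set(), set(),
    {"car"}, {"car"},
    {"bird"}, {"bird"},
    {"car", "bird"}, {"car", "bird"},
]

detectors = train_binary_detectors(train_frames, train_labels, ["car", "bird"])
overlap = detect(detectors, [1.0, 1.0])
silence = detect(detectors, [0.0, 0.0])
car_only = detect(detectors, [1.0, 0.0])
```

Because every class is decided independently, a frame can receive zero, one, or several labels, which is what distinguishes this setup from a single-label system that picks exactly one class per time segment.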