Acoustic scene classification by combining autoencoder-based dimensionality reduction and convolutional neural networks

Authors: Abeßer, Jakob; Mimilakis, Stylianos-Ioannis; Gräfe, Robert; Lukashevich, Hanna

Virtanen, T. ; Tampere University of Technology:
Detection and Classification of Acoustic Scenes and Events Workshop, DCASE 2017. Proceedings : 16 - 17 November 2017, Munich, Germany
Tampere: Tampere University of Technology, 2017
ISBN: 978-952-15-4042-4
Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE) <2, 2017, Munich>
Conference paper, electronic publication
Fraunhofer IDMT

Motivated by the recent success of deep learning techniques in various audio analysis tasks, this work presents a distributed sensor-server system for acoustic scene classification in urban environments based on deep convolutional neural networks (CNNs). Stacked autoencoders compress the extracted spectrogram patches on the sensor side before they are transmitted to the server and classified there. In our experiments, we compare two state-of-the-art CNN architectures with respect to their classification accuracy in the presence of environmental noise, under dimensionality reduction in the encoding stage, and with a reduced number of filters in the convolution layers. Our results show that the best model configuration achieves a classification accuracy of 75% for 5 acoustic scenes. We furthermore discuss which confusions among particular classes can be ascribed to particular sound event types that are present in multiple acoustic scene classes.
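
The sensor-server split described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the patch size, the code dimension, the function names and the random, untrained weights are all assumptions made for illustration, and a single linear layer stands in for the server-side CNN. It only shows the data flow, in which a spectrogram patch is reduced to a short code on the sensor, serialized, and classified on the server.

```python
import random
import struct

# All sizes below are hypothetical; the abstract does not state them.
PATCH_BINS, PATCH_FRAMES = 8, 8   # assumed spectrogram patch shape
CODE_DIM = 16                     # assumed bottleneck size of the encoder
NUM_SCENES = 5                    # five acoustic scenes, as in the abstract

rng = random.Random(0)


def random_matrix(rows, cols):
    """Untrained placeholder weights; a real system would load trained ones."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


ENCODER = random_matrix(CODE_DIM, PATCH_BINS * PATCH_FRAMES)
CLASSIFIER = random_matrix(NUM_SCENES, CODE_DIM)


def sensor_encode(patch):
    """Sensor side: flatten the patch, apply one dense ReLU layer, pack as float32."""
    flat = [value for row in patch for value in row]
    code = [max(0.0, sum(w * x for w, x in zip(row, flat))) for row in ENCODER]
    return struct.pack(f"<{CODE_DIM}f", *code)


def server_classify(payload):
    """Server side: unpack the code and return the index of the best-scoring scene.

    A linear layer is used here as a stand-in for the CNN classifier.
    """
    code = struct.unpack(f"<{CODE_DIM}f", payload)
    scores = [sum(w * x for w, x in zip(row, code)) for row in CLASSIFIER]
    return max(range(NUM_SCENES), key=scores.__getitem__)


# Example round trip with a random patch in place of real audio features.
patch = [[rng.random() for _ in range(PATCH_FRAMES)] for _ in range(PATCH_BINS)]
payload = sensor_encode(patch)
scene = server_classify(payload)
```

With these assumed sizes, the transmitted payload is 64 bytes, compared with 256 bytes for the raw 8 x 8 patch stored as float32 values.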