Active Learning for Sound Event Classification using Monte-Carlo Dropout and PANN Embeddings

Shishkin, Stepan; Hollosi, Danilo; Doclo, Simon; Goetze, Stefan

2021

Conference Paper

Abstract

Labeling audio material to train classifiers requires a large amount of human labor. In this paper, we propose an active learning method for sound event classification in which a human annotator manually labels sound segments up to a given labeling budget. The sound event classifier is incrementally re-trained on both pseudo-labeled and manually labeled sound segments. The segments to be labeled during the active learning process are selected based on the model uncertainty of the classifier, which we propose to estimate using Monte Carlo dropout, a technique for Bayesian inference in neural networks. Evaluation results on the UrbanSound8K dataset show that the proposed active learning method, which uses pre-trained audio neural network (PANN) embeddings as input features, outperforms two baseline methods based on medoid clustering, especially for low labeling budgets.
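The uncertainty-based selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not specify the classifier architecture, the uncertainty measure, or the number of dropout passes, so everything below is an assumption made for illustration: a one-hidden-layer classifier with random weights, predictive entropy as the uncertainty score, 20 stochastic passes, a dropout rate of 0.5, a 2048-dimensional embedding, and the hypothetical helper names `mc_dropout_probs`, `predictive_entropy`, and `select_for_labeling`. Random vectors stand in for PANN embeddings; the 10 classes match UrbanSound8K.

```python
import numpy as np


def mc_dropout_probs(x, w1, b1, w2, b2, n_passes=20, drop_rate=0.5, rng=None):
    """Run several stochastic forward passes with dropout left switched on.

    Returns class probabilities of shape (n_passes, n_segments, n_classes).
    The tiny network here is an illustrative assumption, not the paper's model.
    """
    rng = np.random.default_rng() if rng is None else rng
    hidden = np.maximum(x @ w1 + b1, 0.0)  # ReLU hidden layer
    keep = 1.0 - drop_rate
    passes = []
    for _ in range(n_passes):
        # Fresh dropout mask per pass (inverted dropout scaling).
        mask = rng.random(hidden.shape) < keep
        logits = (hidden * mask / keep) @ w2 + b2
        logits = logits - logits.max(axis=1, keepdims=True)  # stable softmax
        p = np.exp(logits)
        passes.append(p / p.sum(axis=1, keepdims=True))
    return np.stack(passes)


def predictive_entropy(probs):
    """Entropy of the pass-averaged class distribution, one value per segment."""
    mean_p = probs.mean(axis=0)
    return -(mean_p * np.log(mean_p + 1e-12)).sum(axis=1)


def select_for_labeling(probs, budget):
    """Indices of the `budget` segments with the highest predictive entropy."""
    return np.argsort(-predictive_entropy(probs))[:budget]


# Toy data: random vectors standing in for PANN embeddings (assumed 2048-dim).
rng = np.random.default_rng(0)
n_segments, emb_dim, n_hidden, n_classes = 100, 2048, 64, 10
x = rng.normal(size=(n_segments, emb_dim))
w1 = rng.normal(scale=0.02, size=(emb_dim, n_hidden))
b1 = np.zeros(n_hidden)
w2 = rng.normal(scale=0.1, size=(n_hidden, n_classes))
b2 = np.zeros(n_classes)

probs = mc_dropout_probs(x, w1, b1, w2, b2, rng=rng)
entropy = predictive_entropy(probs)
query = select_for_labeling(probs, budget=5)  # segments to send to the annotator
```

In an active learning loop, the segments in `query` would be passed to the human annotator, the classifier re-trained, and the selection repeated until the labeling budget is exhausted. Other uncertainty scores computed from the same stochastic passes, such as the variance across passes, could be substituted for the entropy used here.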

Mainwork

6th Workshop on Detection and Classification of Acoustic Scenes and Events, DCASE 2021. Proceedings

Conference

Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE) 2021
