Uncertainty in Semi-Supervised Audio Classification - A Novel Extension for FixMatch

Grollmisch, Sascha; Cano, Estefanía; Lukashevich, Hanna; Abeßer, Jakob

doi:10.23919/EUSIPCO58844.2023.10289789

2023

Conference Paper

Abstract

Semi-supervised learning (SSL) is a commonly used technique when annotated data is scarce but unlabeled data is easily available. In recent years, SSL has seen a large boost in the computer vision domain, and methods such as FixMatch have been successfully adapted to audio classification tasks. However, a gap remains between SSL methods and the fully supervised baselines, which were trained with all labels available. In this work, we first investigate the quality of the pseudo-labels, i.e., generated labels for unlabeled data, for musical instrument family classification and acoustic scene classification. Based on these insights, we propose and evaluate a novel extension of FixMatch that quantifies and considers the uncertainty of the pseudo-labels. Additionally, we highlight the problematic tradeoff between pseudo-label quality and quantity. Our results show that Monte-Carlo Dropout combined with temperature scaling improved the pseudo-label accuracy from 78.4% to 86.7% for instrument family classification and from 87.9% to 89.9% for acoustic scene classification. Even though the accuracy on the test sets improved from 71.0% to 72.1% and from 69.2% to 70.8%, respectively, there is still a gap to the fully supervised baseline, leaving room for future work.
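To illustrate the idea named in the abstract, the following is a minimal sketch of how Monte-Carlo Dropout and temperature scaling could be combined to filter pseudo-labels. It is not the authors' implementation: the abstract does not say how the two techniques are combined or how the uncertainty estimate is used. The toy linear classifier, the function names, the choice to average probabilities over passes, and all parameter values (`n_passes`, `drop_prob`, `temperature`, `threshold`) are assumptions made for illustration only.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a temperature above 1 flattens the distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def forward_with_dropout(weights, features, drop_prob, rng):
    """One stochastic pass of a toy linear classifier (hypothetical stand-in
    for a real network) with inverted dropout applied to the inputs."""
    keep = 1.0 - drop_prob
    masked = [x / keep if rng.random() < keep else 0.0 for x in features]
    return [sum(w * x for w, x in zip(row, masked)) for row in weights]


def mc_dropout_pseudo_label(weights, features, n_passes=20, drop_prob=0.3,
                            temperature=2.0, threshold=0.9, seed=0):
    """Average temperature-scaled probabilities over several dropout passes.

    Returns (label, confidence). The label is None when the averaged
    confidence falls below the threshold, i.e. the sample is not used
    as a pseudo-label (assumed selection rule).
    """
    rng = random.Random(seed)
    mean_probs = [0.0] * len(weights)
    for _ in range(n_passes):
        logits = forward_with_dropout(weights, features, drop_prob, rng)
        probs = softmax(logits, temperature)
        mean_probs = [m + p / n_passes for m, p in zip(mean_probs, probs)]
    confidence = max(mean_probs)
    label = mean_probs.index(confidence)
    return (label if confidence >= threshold else None), confidence


if __name__ == "__main__":
    # Two-class toy weights and one unlabeled feature vector (made-up values).
    toy_weights = [[1.0, 0.0], [0.0, 1.0]]
    print(mc_dropout_pseudo_label(toy_weights, [3.0, 0.0], threshold=0.8))
```

In this sketch, raising `threshold` keeps fewer but more reliable pseudo-labels, while lowering it keeps more of them at lower reliability, which mirrors the quality versus quantity tradeoff the abstract highlights.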

Author(s)

Grollmisch, Sascha

Fraunhofer-Institut für Digitale Medientechnologie IDMT

Cano, Estefanía

Lukashevich, Hanna

Fraunhofer-Institut für Digitale Medientechnologie IDMT

Abeßer, Jakob

Fraunhofer-Institut für Digitale Medientechnologie IDMT

Mainwork

31st European Signal Processing Conference, EUSIPCO 2023. Proceedings

Conference

European Signal Processing Conference 2023
