2020
Conference Paper
Title

Analyzing the potential of pre-trained embeddings for audio classification tasks

Abstract
In the context of deep learning, the availability of large amounts of training data can play a critical role in a model's performance. Recently, several models for audio classification have been pre-trained in a supervised or self-supervised fashion on large datasets to learn complex feature representations, so-called embeddings. These embeddings can then be extracted from smaller datasets and used to train subsequent classifiers. In the field of audio event detection (AED), for example, classifiers using these features have achieved high accuracy without the need for additional domain knowledge. This paper evaluates three state-of-the-art embeddings on six audio classification tasks from the fields of music information retrieval and industrial sound analysis. The embeddings are evaluated systematically by analyzing how classification accuracy is influenced by classifier architecture, fusion methods for file-wise predictions, amount of training data, and the initial training domain of the embeddings. To better understand the impact of the pre-training step, results are also compared with those obtained with models trained from scratch. On average, the OpenL3 embeddings performed best with a linear SVM classifier. With a reduced number of training examples, OpenL3 outperforms the initial baseline.
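The abstract mentions fusion methods for file-wise predictions without naming them. As a rough illustration only, the sketch below shows two common ways to turn frame-wise class probabilities into one label per file: majority voting and probability averaging. Both function names and the example probabilities are made up for this sketch and are not taken from the paper.

```python
from collections import Counter


def fuse_majority(frame_probs):
    """Pick the class that wins the most frames (ties go to the lowest class index)."""
    # Hypothetical helper: each frame votes for its most probable class.
    votes = Counter(max(range(len(p)), key=p.__getitem__) for p in frame_probs)
    best = max(votes.values())
    return min(c for c, n in votes.items() if n == best)


def fuse_mean(frame_probs):
    """Average the class probabilities over all frames, then pick the top class."""
    # Hypothetical helper: confident frames weigh more than uncertain ones.
    n_classes = len(frame_probs[0])
    means = [sum(p[c] for p in frame_probs) / len(frame_probs) for c in range(n_classes)]
    return max(range(n_classes), key=means.__getitem__)


# Made-up frame-wise probabilities for one file and two classes.
frames = [[0.55, 0.45], [0.55, 0.45], [0.05, 0.95]]
majority_label = fuse_majority(frames)
mean_label = fuse_mean(frames)
```

In this toy input the two methods disagree: two weakly confident frames win the vote for class 0, while one strongly confident frame pulls the averaged probabilities towards class 1. This is one reason the choice of fusion method can affect file-wise accuracy.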
Author(s)
Grollmisch, Sascha  
Fraunhofer-Institut für Digitale Medientechnologie IDMT  
Kehling, Christian  
Fraunhofer-Institut für Digitale Medientechnologie IDMT  
Taenzer, Michael  
Fraunhofer-Institut für Digitale Medientechnologie IDMT  
Cano, E.
Mainwork
28th European Signal Processing Conference, EUSIPCO 2020. Proceedings  
Conference
European Signal Processing Conference (EUSIPCO) 2020  
European Signal Processing Conference (EUSIPCO) 2021  
DOI
10.23919/Eusipco47968.2020.9287743
Language
English
Fraunhofer-Institut für Digitale Medientechnologie IDMT  
Keyword(s)
  • Environmental Sound Analysis
