2020
Journal Article
Title

Examining the Mapping Functions of Denoising Autoencoders in Singing Voice Separation

Abstract
This article investigates what neural-network-based singing voice separation approaches learn from data. We examine the mapping functions of neural networks that are based on the denoising autoencoder (DAE) model and conditioned on the mixture magnitude spectra. To approximate the mapping functions, we propose an algorithm inspired by knowledge distillation, denoted the neural couplings algorithm (NCA). The NCA yields a matrix that expresses the mapping from the mixture to the target source magnitude information. Using the NCA, we examine the mapping functions of three fundamental DAE-based models in music source separation: one with a single-layer encoder and decoder, one with a multi-layer encoder and a single-layer decoder, and one using skip-filtering (SF) connections with a single-layer encoder and decoder. We first train these models on realistic data to estimate the singing voice magnitude spectra from the corresponding mixture. We then use the optimized models and test spectral data as input to the NCA. Our experimental findings show that approaches based on the DAE model learn scalar filtering operators: their mapping functions exhibit a predominantly diagonal structure, which limits how much of the inter-frequency structure of music data they exploit. In contrast, skip-filtering connections are shown to help the DAE model learn filtering operators that exploit richer inter-frequency structures.
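The distinction between a scalar (diagonal) filtering operator and one that exploits inter-frequency structure can be illustrated with a small toy sketch. This is not the NCA described in the abstract: it assumes a hypothetical elementwise "model", fits a matrix to its input/output pairs by ordinary least squares, and uses a made-up helper, `diagonal_energy_ratio`, to measure how diagonal the fitted matrix is. All names, sizes, and data here are illustrative assumptions.

```python
import numpy as np

def diagonal_energy_ratio(W):
    """Fraction of the squared energy of W that lies on its main diagonal."""
    return float(np.sum(np.diag(W) ** 2) / np.sum(W ** 2))

rng = np.random.default_rng(0)
n_frames, n_bins = 64, 8  # hypothetical toy sizes, not from the article

# Toy "mixture" magnitude frames, one frame per row.
X = rng.uniform(0.0, 1.0, size=(n_frames, n_bins))

# Hypothetical scalar-filtering model: each output bin is the same
# input bin scaled by a gain in [0.1, 1].
mask = rng.uniform(0.1, 1.0, size=n_bins)
Y_scalar = X * mask

# Fit a matrix W such that X @ W approximates the model output.
# Plain least squares is used here purely for illustration.
W_scalar, *_ = np.linalg.lstsq(X, Y_scalar, rcond=None)

# A mapping that mixes frequency bins, for contrast.
W_mixing = rng.uniform(0.0, 1.0, size=(n_bins, n_bins))
Y_mixing = X @ W_mixing
W_fit_mixing, *_ = np.linalg.lstsq(X, Y_mixing, rcond=None)

ratio_scalar = diagonal_energy_ratio(W_scalar)      # close to 1
ratio_mixing = diagonal_energy_ratio(W_fit_mixing)  # well below 1
```

In this toy setting the matrix fitted to the scalar-filtering model is diagonal up to numerical error, so nearly all of its energy sits on the main diagonal, whereas the matrix fitted to the bin-mixing map spreads its energy across off-diagonal entries.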
Author(s)
Mimilakis, Stylianos Ioannis  
Drossos, Konstantinos
Cano, Estefanía
Schuller, Gerald  
Journal
IEEE/ACM Transactions on Audio, Speech, and Language Processing
Open Access
DOI
10.1109/TASLP.2019.2952013
Language
English
Fraunhofer-Institut für Digitale Medientechnologie IDMT  
Keyword(s)
  • approximation algorithm
  • computational modeling
  • DAE
  • decoding
  • denoising autoencoder
  • multiple signal classification
  • music source separation
  • NCA
  • neural couplings algorithm
  • neural networks
  • noise reduction
  • singing voice
  • skip connections
  • source separation
