2018
Conference Paper
Title

Reducing interference with phase recovery in DNN-based monaural singing voice separation

Abstract
State-of-the-art methods for monaural singing voice separation estimate the magnitude spectrum of the voice in the short-time Fourier transform (STFT) domain by means of deep neural networks (DNNs). The resulting magnitude estimate is then combined with the mixture's phase to retrieve the complex-valued STFT of the voice, which is then synthesized into a time-domain signal. However, when the sources overlap in time and frequency, the STFT phase of the voice differs from the mixture's phase, which results in interference and artifacts in the estimated signals. In this paper, we investigate recent phase recovery algorithms that tackle this issue and can further enhance the separation quality. These algorithms exploit phase constraints that originate from a sinusoidal model or from consistency, a property that is a direct consequence of the STFT redundancy. Experiments conducted on real songs show that these algorithms are effective at reducing interference in the estimated voice compared to the baseline approach.
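The baseline reconstruction described in the abstract can be illustrated on a single time-frequency bin. The following is a minimal sketch, not the authors' implementation: the function name `combine_with_mixture_phase` and all numeric values are hypothetical, and the voice magnitude is assumed to be known exactly (an oracle) so that any remaining error comes from the phase alone.

```python
import cmath


def combine_with_mixture_phase(voice_magnitude, mixture_bin):
    """Rebuild one complex STFT bin from a magnitude estimate and the mixture's phase."""
    return cmath.rect(voice_magnitude, cmath.phase(mixture_bin))


# Hypothetical complex STFT values for one time-frequency bin.
voice = cmath.rect(1.0, 0.3)          # magnitude 1.0, phase 0.3 rad
accompaniment = cmath.rect(0.8, 2.0)  # magnitude 0.8, phase 2.0 rad
mixture = voice + accompaniment       # sources overlap in this bin

# Baseline: oracle voice magnitude combined with the mixture's phase.
estimate = combine_with_mixture_phase(abs(voice), mixture)
error_overlap = abs(voice - estimate)

# Without overlap the mixture equals the voice, so its phase is already correct.
estimate_alone = combine_with_mixture_phase(abs(voice), voice)
error_no_overlap = abs(voice - estimate_alone)

print(f"error with overlap:    {error_overlap:.3f}")
print(f"error without overlap: {error_no_overlap:.3f}")
```

Even with a perfect magnitude, the overlapping bin is reconstructed with a nonzero error, which is the gap that phase recovery aims to reduce.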
Author(s)
Mimilakis, S.I.  
Magron, P.
Drossos, K.
Virtanen, T.
Mainwork
Interspeech 2018. Online resource  
Conference
International Speech Communication Association (Interspeech Annual Conference) 2018  
Open Access
DOI
10.21437/Interspeech.2018-1845
Language
English
Fraunhofer-Institut für Digitale Medientechnologie IDMT  
Keyword(s)
  • Automatic Music Analysis
