Fraunhofer-Gesellschaft
2018
Conference Paper
Title

MaD TwinNet: Masker-Denoiser Architecture with Twin Networks for Monaural Sound Source Separation

Abstract
The monaural singing voice separation task focuses on predicting the singing voice from a single-channel music mixture signal. Current state-of-the-art (SOTA) results in monaural singing voice separation are obtained with deep-learning-based methods. In this work, we present a novel recurrent neural approach that learns long-term temporal patterns and structures of a musical piece. We build upon the recently proposed Masker-Denoiser (MaD) architecture and enhance it with Twin Networks, a technique that regularizes a recurrent generative network using a backward-running copy of the network. We evaluate our method on the Demixing Secret Dataset and obtain an improvement of 0.37 dB in signal-to-distortion ratio (SDR) and 0.23 dB in signal-to-interference ratio (SIR) over previous SOTA results.
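To give a sense of scale for the dB figures quoted in the abstract, the sketch below computes a simplified signal-to-distortion ratio as an energy ratio between a reference signal and the estimation error. This record does not state which SDR implementation the authors used, so the function name `simple_sdr_db` and the plain energy-ratio definition are assumptions made for illustration and may differ from the metric reported in the paper.

```python
import math


def simple_sdr_db(reference, estimate):
    """Simplified SDR in dB: 10 * log10(||s||^2 / ||s - s_hat||^2).

    Illustrative assumption only; not necessarily the SDR variant
    used in the paper's evaluation.
    """
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")

    # Energy of the clean reference signal.
    signal_energy = sum(s * s for s in reference)
    # Energy of the difference between reference and estimate.
    error_energy = sum((s - e) ** 2 for s, e in zip(reference, estimate))

    if error_energy == 0.0:
        return math.inf  # perfect reconstruction
    if signal_energy == 0.0:
        return -math.inf  # silent reference, non-zero error
    return 10.0 * math.log10(signal_energy / error_energy)


if __name__ == "__main__":
    reference = [1.0, -1.0, 1.0, -1.0]
    estimate = [0.9, -0.9, 0.9, -0.9]  # 10% amplitude error
    print(f"{simple_sdr_db(reference, estimate):.2f} dB")
```

Under this simplified definition, a gain of 0.37 dB corresponds to reducing the error energy by a factor of about 10^(0.037), that is, roughly 8 to 9 percent lower error energy at a fixed reference energy.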
Author(s)
Drossos, K.
Mimilakis, S.I.
Serdyuk, D.
Schuller, G.
Virtanen, T.
Bengio, Y.
Published in
International Joint Conference on Neural Networks, IJCNN 2018. Proceedings
Conference
International Joint Conference on Neural Networks (IJCNN) 2018
DOI
10.1109/IJCNN.2018.8489565
Language
English
Fraunhofer-Institut für Digitale Medientechnologie IDMT