
Publica
Here you will find scientific publications from the Fraunhofer Institutes.
MaD TwinNet: Masker-Denoiser Architecture with Twin Networks for Monaural Sound Source Separation
| Institute of Electrical and Electronics Engineers -IEEE-; IEEE Computational Intelligence Society; International Neural Network Society: International Joint Conference on Neural Networks, IJCNN 2018. Proceedings: 8-13 July 2018, Rio de Janeiro, Brazil. Piscataway, NJ: IEEE, 2018. ISBN: 978-1-5090-6014-6; ISBN: 978-1-5090-6015-3. pp. 2439-2446 |
| International Joint Conference on Neural Networks (IJCNN) <2018, Rio de Janeiro> |
| English |
| Conference Paper |
| Fraunhofer IDMT |
Abstract
The monaural singing voice separation task focuses on predicting the singing voice from a single-channel music mixture signal. Current state-of-the-art (SOTA) results in monaural singing voice separation are obtained with deep-learning-based methods. In this work, we present a novel recurrent neural approach that learns long-term temporal patterns and structures of a musical piece. We build upon the recently proposed Masker-Denoiser (MaD) architecture and enhance it with Twin Networks, a technique that regularizes a recurrent generative network using a backward-running copy of the network. We evaluate our method on the Demixing Secret Dataset and obtain an improvement of 0.37 dB in signal-to-distortion ratio (SDR) and 0.23 dB in signal-to-interference ratio (SIR) over previous SOTA results.
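To give a sense of scale for the reported decibel gains, the following is a minimal Python sketch, not the evaluation code used in the paper. It assumes a simplified SDR definition (reference energy over error energy, in dB); separation results are usually reported with the BSS Eval metrics, which decompose the error further. The function names `simple_sdr_db` and `db_to_energy_ratio` are hypothetical and introduced here for illustration only.

```python
import math


def simple_sdr_db(reference, estimate):
    """Simplified SDR in dB: 10 * log10(||s||^2 / ||s - s_hat||^2).

    Assumption: this is a plain energy ratio, not the BSS Eval SDR.
    """
    signal_energy = sum(r * r for r in reference)
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    if error_energy == 0.0:
        return math.inf  # perfect reconstruction
    return 10.0 * math.log10(signal_energy / error_energy)


def db_to_energy_ratio(gain_db):
    """Convert a gain in dB to the corresponding energy-ratio factor."""
    return 10.0 ** (gain_db / 10.0)


# Toy signals (made up for illustration): the estimate is the reference scaled by 0.9.
reference = [1.0, 0.0, -1.0, 0.0]
estimate = [0.9, 0.0, -0.9, 0.0]

sdr = simple_sdr_db(reference, estimate)
sdr_factor = db_to_energy_ratio(0.37)  # SDR gain reported in the abstract
sir_factor = db_to_energy_ratio(0.23)  # SIR gain reported in the abstract
```

Under this simplified view, a gain of 0.37 dB corresponds to an energy ratio that is higher by a factor of about 1.09, and a gain of 0.23 dB to a factor of about 1.05.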