Fraunhofer-Gesellschaft

Publica

Here you can find scientific publications from the Fraunhofer Institutes.

Monaural Singing Voice Separation with Skip-Filtering Connections and Recurrent Inference of Time-Frequency Mask

 
Authors: Mimilakis, S.I.; Drossos, K.; Santos, J.F.; Schuller, G.; Virtanen, T.; Bengio, Y.

In:

Institute of Electrical and Electronics Engineers -IEEE-; IEEE Signal Processing Society:
IEEE International Conference on Acoustics, Speech, and Signal Processing 2018. Proceedings : April 15-20, 2018, Calgary Telus Convention Center, Calgary, Alberta, Canada
Piscataway, NJ: IEEE, 2018
ISBN: 978-1-5386-4658-8
ISBN: 978-1-5386-4657-1
ISBN: 978-1-5386-4659-5
pp.721-725
International Conference on Acoustics, Speech, and Signal Processing (ICASSP) <2018, Calgary>
English
Conference Paper
Fraunhofer IDMT

Abstract
Singing voice separation based on deep learning relies on time-frequency masking. In many cases the masking process is not a learnable function or is not encapsulated in the deep learning optimization. Consequently, most existing methods rely on a post-processing step using generalized Wiener filtering. This work proposes a method that learns and optimizes (during training) a source-dependent mask and does not need the aforementioned post-processing step. We introduce a recurrent inference algorithm, a sparse transformation step to improve the mask generation process, and a learned denoising filter. The results show an increase of 0.49 dB in signal-to-distortion ratio and 0.30 dB in signal-to-interference ratio compared to previous state-of-the-art approaches for monaural singing voice separation.
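
The abstract does not give implementation details, so the following is only a rough sketch of what time-frequency masking means in general: each bin of the mixture's magnitude spectrogram is scaled by a mask value to estimate the singing voice. The function name `apply_mask`, the toy spectrogram, and the fixed mask values are assumptions made for illustration; in the paper the mask is predicted by a trained network rather than given by hand.

```python
# Hypothetical illustration of element-wise time-frequency masking.
# The mask is a fixed toy array here; it is NOT the paper's learned mask.

def apply_mask(mixture_mag, mask):
    """Multiply a magnitude spectrogram by a mask, bin by bin."""
    if len(mixture_mag) != len(mask):
        raise ValueError("frame counts differ")
    masked = []
    for frame, mask_frame in zip(mixture_mag, mask):
        if len(frame) != len(mask_frame):
            raise ValueError("frequency-bin counts differ")
        masked.append([m * g for m, g in zip(frame, mask_frame)])
    return masked

# Toy data: 2 time frames x 3 frequency bins (values are made up).
mixture_mag = [[1.0, 2.0, 4.0], [0.5, 3.0, 2.0]]
mask = [[1.0, 0.5, 0.0], [0.0, 1.0, 0.25]]

# Estimated voice magnitude: the mask filters the mixture directly.
voice_mag = apply_mask(mixture_mag, mask)
```

Because the mask multiplies the mixture itself, a model trained against the masked output is optimizing the mask as part of the network, which is the general idea the abstract contrasts with separate Wiener post-processing.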

URL: http://publica.fraunhofer.de/documents/N-520118.html