Data-driven solo voice enhancement for jazz music retrieval

Balke, Stefan; Dittmar, Christian; Abeßer, Jakob; Müller, Meinard

doi:10.1109/ICASSP.2017.7952145

2017

Conference Paper

Abstract

Retrieving short monophonic queries in music recordings is a challenging research problem in Music Information Retrieval (MIR). In jazz music, given a solo transcription, one retrieval task is to find the corresponding (potentially polyphonic) recording in a music collection. Many conventional systems approach such retrieval tasks by first extracting the predominant F0-trajectory from the recording, then quantizing the extracted trajectory to musical pitches, and finally comparing the resulting pitch sequence to the monophonic query. In this paper, we introduce a data-driven approach that avoids the hard decisions involved in conventional approaches: Given pairs of time-frequency (TF) representations of full music recordings and TF representations of solo transcriptions, we use a DNN-based approach to learn a mapping for transforming a "polyphonic" TF representation into a "monophonic" TF representation. This transform can be considered as a kind of solo voice enhancement. We evaluate our approach within a jazz solo retrieval scenario and compare it to a state-of-the-art method for predominant melody extraction.
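
To illustrate the kind of hard decision the abstract attributes to conventional systems, the following is a minimal sketch of the quantization step, in which an extracted F0 trajectory is rounded to musical pitches. It is not the authors' code. The function names, the 440 Hz reference tuning, the convention that 0 Hz marks an unvoiced frame, and the example values are all assumptions made for illustration.

```python
import math

def hz_to_midi(f0_hz, ref_hz=440.0):
    """Map a frequency in Hz to a fractional MIDI pitch (A4 = 69).
    The 440 Hz reference is an assumption, not taken from the paper."""
    return 69.0 + 12.0 * math.log2(f0_hz / ref_hz)

def quantize_trajectory(f0_track):
    """Round each voiced frame to the nearest semitone.
    Frames with f0 <= 0 are treated as unvoiced (assumed convention) and become None."""
    return [round(hz_to_midi(f)) if f > 0 else None for f in f0_track]

def collapse_repeats(pitches):
    """Merge runs of identical frame pitches into one note and drop unvoiced frames."""
    notes = []
    prev = None
    for p in pitches:
        if p is not None and p != prev:
            notes.append(p)
        prev = p
    return notes

# Hypothetical frame-wise F0 values in Hz (0.0 = unvoiced).
f0_track = [0.0, 439.0, 441.5, 440.2, 0.0, 523.3, 524.0, 392.0]
notes = collapse_repeats(quantize_trajectory(f0_track))
print(notes)
```

Each rounding step discards the deviation from the nearest semitone, and each voiced/unvoiced decision is binary, so an error made at this stage cannot be recovered when the pitch sequence is later compared to the query. The learned TF-to-TF mapping described in the abstract is intended to avoid committing to such decisions.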

Author(s)

Balke, Stefan

Dittmar, Christian

Abeßer, Jakob

Müller, Meinard

Mainwork

IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2017. Proceedings

Conference

International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2017
