A Novel Dataset for Time-Dependent Harmonic Similarity between Chord Sequences

Authors: Bittner, Franca; Abeßer, Jakob; Nadar, Christon-Ragavan; Lukashevich, Hanna; Kramer, Patrick
Type: Presentation
Year: 2021
Language: English
License: CC BY 4.0
Topic: Automatic Music Analysis
DOI: https://doi.org/10.24406/publica-1623
URL: https://publica.fraunhofer.de/handle/publica/445583

Abstract: State-of-the-art algorithms for many music information retrieval (MIR) tasks, such as chord recognition, multi-pitch estimation, or instrument recognition, rely on deep learning and therefore require large amounts of data for training and evaluation. In this paper, we present IDMT-SMT-CHORD-SEQUENCES, a novel synthetic dataset of 15,000 chord progressions played on 45 different musical instruments. The dataset is organized in triplets: each triplet contains one "anchor" chord sequence together with one corresponding similar and one dissimilar chord progression. The audio files are synthesized from MIDI data using FluidSynth with a selected sound font. Furthermore, we conducted a benchmark experiment on time-dependent harmonic similarity based on learnt embedding representations. The results show that a convolutional neural network (CNN), which takes the temporal context of a chord progression into account, outperforms a simpler approach based on temporal averaging of the input features.
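The synthesis step described in the abstract (rendering MIDI chord progressions to audio with FluidSynth) can be reproduced in spirit with the standard fluidsynth command-line tool. The sketch below is an illustration only: the file names, sound font path, and sample rate are placeholders, since the record does not specify the exact rendering configuration.

```python
import subprocess
from pathlib import Path

def render_midi(midi_path: Path, soundfont: Path, out_wav: Path,
                sample_rate: int = 44100) -> None:
    """Render one MIDI file to WAV via the fluidsynth CLI.

    All paths and the sample rate are illustrative placeholders.
    """
    subprocess.run(
        [
            "fluidsynth",
            "-ni",                   # non-interactive, no MIDI input device
            str(soundfont),          # .sf2 sound font used for synthesis
            str(midi_path),          # input MIDI file
            "-F", str(out_wav),      # write the rendered audio to this file
            "-r", str(sample_rate),  # output sample rate in Hz
        ],
        check=True,
    )

# Hypothetical usage: render every progression in a folder of MIDI files.
if __name__ == "__main__":
    sf2 = Path("soundfont.sf2")  # placeholder sound font
    for midi_file in sorted(Path("midi").glob("*.mid")):
        render_midi(midi_file, sf2, Path("audio") / (midi_file.stem + ".wav"))
```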
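The benchmark learns embeddings from (anchor, similar, dissimilar) triplets so that similar progressions end up close together in the embedding space. Below is a minimal PyTorch sketch of such a setup; the feature representation (a 12-band chroma-like input), the network sizes, the margin, and the training step are assumptions for illustration and do not reproduce the paper's actual architecture.

```python
import torch
import torch.nn as nn

class ChordSequenceCNN(nn.Module):
    """Toy 1-D CNN over a time-frequency input (e.g. chroma frames) that
    models temporal context before pooling. All sizes are assumptions."""

    def __init__(self, n_features: int = 12, embed_dim: int = 64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(n_features, 32, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),  # pool over time only after convolutions
        )
        self.fc = nn.Linear(64, embed_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n_features, n_frames)
        z = self.conv(x).squeeze(-1)  # (batch, 64)
        return nn.functional.normalize(self.fc(z), dim=-1)

model = ChordSequenceCNN()
loss_fn = nn.TripletMarginLoss(margin=0.2)  # margin is an assumption
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Dummy batch of triplets: (anchor, similar, dissimilar) feature matrices.
anchor, similar, dissimilar = (torch.randn(8, 12, 100) for _ in range(3))

# One training step: pull the similar embedding toward the anchor,
# push the dissimilar embedding away by at least the margin.
loss = loss_fn(model(anchor), model(similar), model(dissimilar))
optimizer.zero_grad()
loss.backward()
optimizer.step()
```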
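For contrast, here is a minimal sketch of the kind of simpler baseline the abstract compares against: the input features are averaged over time before a single embedding layer, so temporal ordering is discarded. The class name and layer sizes are hypothetical.

```python
import torch
import torch.nn as nn

class AveragingBaseline(nn.Module):
    """Hypothetical baseline: average the features over time, then embed.

    Because the mean over frames ignores their order, this model cannot
    encode how a chord progression unfolds in time.
    """

    def __init__(self, n_features: int = 12, embed_dim: int = 64):
        super().__init__()
        self.fc = nn.Linear(n_features, embed_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n_features, n_frames) -> mean over the time axis
        return nn.functional.normalize(self.fc(x.mean(dim=-1)), dim=-1)
```

Trained with the same triplet loss, such a time-averaged model cannot separate two progressions that contain the same chords in a different order, which is consistent with the abstract's finding that the temporally aware CNN performs better.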