An Open Dataset of Synthetic Speech

Yaroshchuk, Artem; Papastergiopoulos, Christoforos; Cuccovillo, Luca; Aichroth, Patrick; Votis, Konstantinos; Tzovaras, Dimitrios

doi:10.1109/WIFS58808.2023.10374863

2023

Conference Paper

Abstract

This paper introduces a multilingual, multispeaker dataset composed of synthetic and natural speech, designed to foster research and benchmarking in synthetic speech detection. The dataset encompasses 18,993 audio utterances synthesized from text, alongside their corresponding natural equivalents, representing approximately 17 hours of synthetic audio data. The dataset features synthetic speech generated by 156 voices spanning three languages (English, German, and Spanish) with a balanced gender representation. It targets state-of-the-art synthesis methods and has been released under a license allowing seamless extension and redistribution by the research community.