2023
Conference Paper
Title
The AudioLabs System for the Blizzard Challenge 2023
Abstract
In this paper, we describe our contribution to the Blizzard Challenge 2023. The challenge aims to understand and compare research techniques for building corpus-based speech synthesizers on the same data. The 2023 edition focuses on the French language and low-resource settings. Our text-to-speech (TTS) synthesis system consists of three main building blocks. First, a non-autoregressive acoustic model converts symbolic input sequences (phonemes) into mel-scaled speech spectrograms. Second, a post-processing model based on a generative adversarial network (GAN) enhances the predicted mel spectrograms. Third, the GAN-based neural vocoder StyleMelGAN converts the enhanced mel spectrograms into a time-domain speech waveform.
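The data flow through the three building blocks named in the abstract can be sketched as follows. This is only an illustrative skeleton, not the authors' implementation: every function is a hypothetical placeholder that returns zeros, and the constants (80 mel bins, a hop size of 256 samples, a fixed 5 frames per phoneme) are assumptions chosen for the example rather than values taken from the paper.

```python
from typing import List

# A mel spectrogram as a list of frames, each holding one value per mel bin.
Mel = List[List[float]]

# All constants below are assumptions for illustration, not values from the paper.
N_MELS = 80             # assumed number of mel bins
HOP_SIZE = 256          # assumed waveform samples per spectrogram frame
FRAMES_PER_PHONEME = 5  # assumed fixed duration; a real model predicts durations


def acoustic_model(phonemes: List[str]) -> Mel:
    """Hypothetical stand-in for the non-autoregressive acoustic model.

    All frames are produced in one pass, without feeding earlier
    output frames back in as input.
    """
    n_frames = len(phonemes) * FRAMES_PER_PHONEME
    return [[0.0] * N_MELS for _ in range(n_frames)]


def postfilter(mel: Mel) -> Mel:
    """Hypothetical stand-in for the GAN-based post-processing model.

    It maps a mel spectrogram to an enhanced one of the same shape;
    here it simply copies its input.
    """
    return [list(frame) for frame in mel]


def vocoder(mel: Mel) -> List[float]:
    """Hypothetical stand-in for the neural vocoder (StyleMelGAN in the paper).

    It turns each spectrogram frame into HOP_SIZE time-domain samples.
    """
    return [0.0] * (len(mel) * HOP_SIZE)


def synthesize(phonemes: List[str]) -> List[float]:
    """Chain the three stages: phonemes -> mel -> enhanced mel -> waveform."""
    mel = acoustic_model(phonemes)
    enhanced = postfilter(mel)
    return vocoder(enhanced)
```

Calling `synthesize` with a list of five phoneme symbols yields 5 × 5 × 256 placeholder samples under these assumed constants. The point of the sketch is only the ordering and the interfaces: the post-processing stage keeps the spectrogram shape unchanged, so it can sit between the acoustic model and the vocoder without either needing to know about it.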
Author(s)
Conference