2023
Conference Paper
Title

Low-Resource Text-to-Speech Using Specific Data and Noise Augmentation

Abstract
Many neural text-to-speech architectures can synthesize nearly natural speech from text inputs. These architectures must be trained with tens of hours of annotated and high-quality speech data. Compiling such large databases for every new voice requires a lot of time and effort. In this paper, we describe a method to extend the popular Tacotron-2 architecture and its training with data augmentation to enable single-speaker synthesis using a limited amount of specific training data. In contrast to elaborate augmentation methods proposed in the literature, we use simple stationary noises for data augmentation. Our extension is easy to implement and adds almost no computational overhead during training and inference. Using only two hours of training data, our approach was rated by human listeners to be on par with the baseline Tacotron-2 trained with 23.5 hours of LJSpeech data. In addition, we tested our model with a semantically unpredictable sentences test, which showed that both models exhibit similar intelligibility levels.
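As an illustration of the kind of stationary-noise augmentation the abstract mentions, the following minimal sketch mixes white Gaussian noise into a waveform at a chosen signal-to-noise ratio. The abstract does not state which noise types, mixing levels, or conditioning scheme the authors use, so the choice of white noise, the 20 dB SNR, the synthetic test tone, and the function names add_stationary_noise and measured_snr_db are all assumptions made for this example and not the paper's implementation.

```python
import math
import random


def add_stationary_noise(signal, snr_db, rng=None):
    """Return a copy of `signal` with white Gaussian noise mixed in at `snr_db`.

    Hypothetical helper for illustration; white noise is an assumed choice.
    """
    if not signal:
        raise ValueError("signal must contain at least one sample")
    rng = rng or random.Random()
    # Average power of the clean signal.
    signal_power = sum(x * x for x in signal) / len(signal)
    # Noise power required to reach the requested signal-to-noise ratio.
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise_std = math.sqrt(noise_power)
    # The noise statistics do not change over time, i.e. the noise is stationary.
    return [x + rng.gauss(0.0, noise_std) for x in signal]


def measured_snr_db(clean, noisy):
    """Estimate the SNR in dB from a clean signal and its noisy version."""
    signal_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10.0 * math.log10(signal_power / noise_power)


# Assumed stand-in for a speech recording: one second of a 220 Hz tone at 16 kHz.
sample_rate = 16000
clean = [0.5 * math.sin(2 * math.pi * 220 * t / sample_rate) for t in range(sample_rate)]

# Fixed seed so the augmented copy is reproducible.
noisy = add_stationary_noise(clean, snr_db=20.0, rng=random.Random(0))
```

Calling measured_snr_db(clean, noisy) on the result gives a value close to the requested 20 dB. In a training pipeline, a function like this would typically be applied to each utterance before feature extraction, which adds very little computation compared with the model's forward and backward passes.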
Author(s)
Kayyar Lakshminarayana, Kishor
Fraunhofer-Institut für Integrierte Schaltungen IIS  
Dittmar, Christian  
Fraunhofer-Institut für Integrierte Schaltungen IIS  
Pia, Nicola
Fraunhofer-Institut für Integrierte Schaltungen IIS  
Habets, Emanuël
Mainwork
31st European Signal Processing Conference, EUSIPCO 2023. Proceedings  
Conference
European Signal Processing Conference 2023  
DOI
10.23919/EUSIPCO58844.2023.10289912
Language
English
Fraunhofer-Institut für Integrierte Schaltungen IIS  
Keyword(s)
  • low-resource
  • speech-synthesis
  • tacotron
  • text-to-speech
