2020
Conference Paper
Title

Multi-Staged Cross-Lingual Acoustic Model Adaption for Robust Speech Recognition in Real-World Applications - A Case Study on German Oral History Interviews

Abstract
While recent automatic speech recognition systems achieve remarkable performance when large amounts of adequate, high-quality annotated speech data are used for training, the same systems often achieve only unsatisfactory results for tasks in domains that deviate greatly from the conditions represented by the training data. For many real-world applications, there is a lack of sufficient data that can be directly used for training robust speech recognition systems. To address this issue, we propose and investigate an approach that performs a robust acoustic model adaption to a target domain in a cross-lingual, multi-staged manner. Our approach enables the exploitation of large-scale training data from other domains in both the same and other languages. We evaluate our approach on the challenging task of German oral history interviews, where we achieve a relative word error rate reduction of more than 30% compared to a model trained from scratch only on the target domain, and of 6-7% compared to a model trained robustly on 1000 hours of same-language out-of-domain training data.
Author(s)
Gref, Michael  
Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS  
Walter, Oliver  
Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS  
Schmidt, Christoph Andreas  
Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS  
Behnke, Sven  
Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS  
Köhler, Joachim  
Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS  
Mainwork
12th Language Resources and Evaluation Conference, LREC 2020. Proceedings. Online resource  
Project(s)
KA3
Funder
Bundesministerium für Bildung und Forschung BMBF (Deutschland)  
Conference
Language Resources and Evaluation Conference (LREC) 2020  
Open Access
DOI
10.24406/publica-fhg-408149
File(s)
N-590452.pdf (256.11 KB)
Rights
CC BY-NC 4.0: Creative Commons Attribution-NonCommercial
Language
English
Keyword(s)
  • acoustic modeling
  • acoustic model adaption
  • cross-lingual
  • digital humanities
  • oral history
  • speech recognition
  • transfer learning
  • under-resourced speech recognition