Fraunhofer-Gesellschaft
2025
Master Thesis
Title

Multilingual Automatic Phonetic Transcription - a Linguistic Investigation of its Performance on German and Approaches to Improving the State of the Art

Abstract
Phonetic transcription represents the pronunciation of speech in a language-independent script. Accurate manual transcriptions require expert knowledge and are time-consuming. Automatic phonetic transcription (APT) can help reduce the high cost of phonetic transcription, but it is still limited by training data scarcity and quality.
This work investigates the highest-performing multilingual APT models to find ways to improve them. To this end, we seek to improve the modeling of a single target language, German, while aiming to maintain language-independent accuracy. After the initial investigation, we develop the transcription bootstrapping approach "selective augmentation" and apply it to a model based on the state-of-the-art "MultIPA". Using this approach, we improve plosive phonation recognition as an example case, adding aspiration recognition, by selectively transferring plosive phonation information from a helper model trained on Hindi. We propose criteria for judging the improvement and conduct an acoustic-phonetic analysis of voice onset time (VOT).
The results show the efficacy of selective augmentation: voicing recognition accuracy increases by 17.56%, and aspiration recognition rises from 0% to 61.17%. In addition, the tenuis class is reduced by 32.21%, which reduces the conflations between the German phonemes. Finally, we discuss how selective augmentation may be further improved.
Thesis Note
Bonn, Univ., Master Thesis, 2025
Author(s)
Bystrich, Tobias
Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS  
Advisor(s)
Pritzen, Julia
Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS  
Schmidt, Christoph Andreas  
Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS  
Wich-Reif, Claudia
DOI
10.24406/publica-4418
File(s)
Multilingual_Automatic_Phonetic_Transcription_TB.pdf (4.62 MB)
Rights
Under Copyright
Language
English
Keyword(s)
  • multilingual
  • Automatic Phonetic Transcription
  • transcription of spoken speech
  • phonetic alphabet
  • automatic speech recognition
  • Artificial Intelligence
  • machine learning
