Multilingual Automatic Phonetic Transcription - a Linguistic Investigation of its Performance on German and Approaches to Improving the State of the Art

Rights: Under Copyright
Contributors: Pritzen, Julia; Schmidt, Christoph Andreas; Wich-Reif, Claudia; Bystrich, Tobias
Date: 2025-03-18
Year: 2025
DOI: https://doi.org/10.24406/publica-4418
Handle: https://publica.fraunhofer.de/handle/publica/485598
Language: en
Type: master thesis
Keywords: multilingual; Automatic Phonetic Transcription; transcription of spoken speech; phonetic alphabet; automatic speech recognition; Artificial Intelligence; machine learning

Abstract:
Phonetic transcription represents the pronunciation of speech in a language-independent script. Accurate manual transcriptions require expert knowledge and are time-consuming. Automatic phonetic transcription (APT) can help reduce the high cost of phonetic transcription, but it is still limited by the scarcity and quality of training data. This work investigates the highest-performing multilingual APT models to find ways to improve them. To this end, we seek to improve the modeling for a single target language, German, while aiming to maintain language-independent accuracy. After the initial investigation, we develop the transcription bootstrapping approach "selective augmentation" and apply it to a model based on the state-of-the-art model "MultIPA". Using this approach, we improve plosive phonation recognition as an example case, including the addition of aspiration recognition, by selectively transferring plosive phonation information from a helper model trained on Hindi. We propose criteria for judging the improvement and conduct an acoustic-phonetic analysis of voice onset time (VOT). The results show the efficacy of selective augmentation: voicing recognition accuracy is increased by 17.56% and aspiration recognition accuracy rises from 0% to 61.17%. In addition, the tenuis class is successfully reduced by 32.21%, thereby reducing conflations between the German phonemes. Finally, we discuss how selective augmentation may be further improved.