2023
Bachelor Thesis
Title
Data-Driven and Rule-Based Approaches to Improving Bavarian Speech Recognition
Abstract
In this work, we explored different approaches to improving automatic speech recognition for Bavarian, a German dialect for which little transcribed training data exists. The aim was to transcribe Bavarian speech into written Standard High German. From the annotated data we could gather, we created datasets for training and testing: approximately 24 hours of annotated training data and three test sets totaling approximately 6.7 hours. Our Bavarian models were based on the Fraunhofer IAIS hybrid model and on two end-to-end models based on wav2vec 2.0. The experiments fall into two categories: rule-based approaches and data-driven approaches. While the wav2vec 2.0 models were trained in a purely data-driven manner, both approaches were applied to the hybrid model. The data-driven approach for the hybrid model also included training on approximately 24 hours of Bavarian speech. Additionally, a language model was trained on a larger text corpus consisting of the transcriptions from the annotated training set and subtitles of a Bavarian television series. The rule-based approach consisted of lexicon modifications using phonetic mapping rules. We found promising results, including a relative reduction in word error rate (WER) of 30% compared to the hybrid baseline model. Although all approaches show promise, and despite Bavarian being a low-resource language, we found that the greatest potential for improving Bavarian speech recognition lies in data-driven training of the acoustic model.
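For readers unfamiliar with the metric, the following is a minimal sketch of how a word error rate and a relative change against a baseline are commonly computed. It is not the evaluation code used in the thesis: the function names `word_error_rate` and `relative_change`, the example sentences, and the example WER values are all assumptions chosen for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_change(baseline_wer: float, new_wer: float) -> float:
    """Relative WER change; negative values mean the new model is better."""
    return (new_wer - baseline_wer) / baseline_wer


if __name__ == "__main__":
    # Hypothetical sentences, not taken from the thesis data.
    wer = word_error_rate("das ist ein kleiner test", "das ist kleiner text")
    print(f"WER: {wer:.2f}")

    # Hypothetical WER values, chosen only to illustrate a -30% relative change.
    print(f"relative change: {relative_change(0.40, 0.28):+.0%}")
```

Under this definition, a relative change of -30% means the new WER is 0.7 times the baseline WER, whatever the absolute values are.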
ThesisNote
Bonn, Univ., Bachelor Thesis, 2023
Author(s)