Joint Standard German and Bavarian Subdialect Identification of Broadcast Speech

Stadtschnitzer, Michael; Schmidt, Christoph Andreas

2018

Conference Paper

Abstract

Automatic speech recognition is an important technique for numerous applications such as automatic subtitling, dialogue systems and information retrieval systems. Speech recognition systems usually perform very well in clean and controlled environments. However, they still often fail when there are mismatches between the trained models and the test data, e.g. due to noise, reverberation, or dialects. One method to cope with dialects is to identify the dialect in advance and then use specialized dialectal speech recognition models for decoding. Dialect identification systems have also recently been used for targeted advertising, service customization, forensic tasks and text-to-speech synthesis of regional speech. In this work, we annotate a large quantity of dialectal and standard German speech from a German broadcaster and exploit the data to train and evaluate a joint standard German and Bavarian subdialect identification system that is able to distinguish between standard German and three Bavarian subdialects, namely Bavarian, Swabian and Franconian, with promising performance.

Author(s)

Stadtschnitzer, Michael

Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS

Schmidt, Christoph Andreas

Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS

Published in

Fortschritte der Akustik. DAGA 2018

Conference

Deutsche Jahrestagung für Akustik (DAGA) 2018
