Bilingual I-Vector Extractor for DNN Hybrid Acoustic Model Training in German Speech Recognition Systems

Wang, Yao; Gref, Michael; Walter, Oliver; Schmidt, Christoph Andreas

2021

Conference Paper

Abstract

In recent research, i-vectors have been shown to be significantly beneficial for speaker recognition and have been successfully applied in deep neural network (DNN) acoustic model (AM) training to improve the performance of automatic speech recognition (ASR). This paper describes our work on developing a bilingual i-vector extractor for training a German speech recognition system. A bilingual data set consisting of German and English speech data is used to train an i-vector extractor for a DNN hybrid acoustic model. The system is evaluated on different data sets. The results show that i-vector extractors trained with bilingual data can be used to improve the training of ASR models when monolingual data is insufficient. Additionally, using telephone speech as a case study, we show that training the i-vector extractor with data from this domain leads to improvements in recognition.

Author(s)

Wang, Yao

Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS

Gref, Michael

Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS

Walter, Oliver

Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS

Schmidt, Christoph Andreas

Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS

Mainwork

Speech communication

Conference

Conference on Speech Communication 2021
