Improving robust speech recognition for German oral history interviews using multi-condition training
In the historical sciences, the term oral history refers to conducting and analyzing interviews with contemporary witnesses. To substantially reduce the resources needed to transcribe these interviews, we are adapting our speech recognition system to this domain. In this work, we extend our previous experiments by training on 1000 hours of data from the broadcast domain. Using the Kaldi ASR toolkit, we show that chain acoustic models benefit greatly from large data sets and perform well on several test sets. To further improve recognition performance on oral history interviews, we add artificially created multi-condition data to the chain model training, which reduces the word error rate (WER) on the oral history test set by 4.8% absolute and 13.9% relative compared to a chain model trained on clean data only.
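The distinction between absolute and relative WER reduction can be made concrete with a small sketch. The code below is illustrative only and is not the scoring pipeline used in this work: `wer`, `wer_reduction`, and `implied_baseline` are hypothetical helper names, and the WER is computed as a plain word-level edit distance without any text normalization.

```python
from typing import Tuple


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion of a reference word
                curr[j - 1] + 1,         # insertion of a hypothesis word
                prev[j - 1] + (r != h),  # substitution, or match at no cost
            ))
        prev = curr
    return prev[-1] / len(ref)


def wer_reduction(baseline: float, improved: float) -> Tuple[float, float]:
    """Return (absolute, relative) WER reduction of `improved` over `baseline`."""
    absolute = baseline - improved
    return absolute, absolute / baseline


def implied_baseline(absolute: float, relative: float) -> float:
    """Baseline WER implied by an absolute and a relative reduction."""
    return absolute / relative


if __name__ == "__main__":
    # Made-up example sentence, not taken from any test set.
    print(wer("das ist ein test", "das ist test"))
    # Hypothetical systems at 40% and 30% WER.
    print(wer_reduction(0.40, 0.30))
    # Figures from the abstract: 4.8% absolute, 13.9% relative.
    print(implied_baseline(4.8, 0.139))
```

Assuming both reported figures refer to the same clean-trained baseline, dividing the absolute reduction by the relative one gives a baseline WER of roughly 34.5% on the oral history test set. This is derived arithmetic subject to rounding in the reported figures, not a number stated in the abstract.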