Potentials for ASR based on multiple acoustic models and model selection using standard speech features

Authors: Winkler, Thomas; Stein, Daniel; Bardeli, Rolf; Schneider, Daniel; Köhler, Joachim

Preprint urn:nbn:de:0011-n-2250788 (140 KByte PDF)
MD5 Fingerprint: 635cdc28f2fd3490991231f62cab550e
Created on: 22 March 2013

Fingscheidt, Tim ; Informationstechnische Gesellschaft -ITG-, Fachausschuss Sprachakustik:
Sprachkommunikation 2012 : Beiträge zur 10. ITG-Fachtagung vom 26. bis 28. September 2012 in Braunschweig
Berlin: VDE-Verlag, 2012 (ITG-Fachbericht 236)
ISBN: 978-3-8007-3455-9
Fachtagung Sprachkommunikation <10, 2012, Braunschweig>
Conference paper, electronic publication
Fraunhofer IAIS
speech recognition; multi-model approach; multiconditional model; audio

Acoustic modelling is a key issue for successful automatic speech recognition (ASR). ASR systems are usually adapted to a particular use case by training robust acoustic models on in-domain speech data recorded under conditions typical of that use case. Varying conditions therefore require either multi-conditional acoustic models or multiple acoustic models. In this work we present a multi-model approach for coping with various acoustic conditions. For each utterance, the best-matching set of acoustic models is selected using the same acoustic features and acoustic models that are used for ASR. Our initial experiments show that this selection achieves results comparable to a manual selection of the acoustic models, but that it is still slightly outperformed by multi-conditional models with a comparable number of mixtures. We further show that an ideal selection would indeed improve on the results of the multi-conditional models.
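
The per-utterance selection described in the abstract can be illustrated with a minimal sketch. The abstract does not state the actual selection criterion, so everything here is an assumption made for illustration: each acoustic condition is represented by a small diagonal-covariance Gaussian mixture, the score is the mean frame log-likelihood, and the names `select_model`, `log_gmm`, `clean`, and `noisy` as well as all numeric values are hypothetical.

```python
import math


def log_gaussian_diag(frame, mean, var):
    """Log density of one feature frame under a diagonal-covariance Gaussian."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        for x, m, v in zip(frame, mean, var)
    )


def log_gmm(frame, gmm):
    """Log density of one frame under a GMM given as (weight, mean, var) tuples."""
    terms = [math.log(w) + log_gaussian_diag(frame, m, v) for w, m, v in gmm]
    peak = max(terms)
    # Log-sum-exp over the mixture components, for numerical stability.
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def select_model(utterance, models):
    """Pick the model set with the highest mean frame log-likelihood.

    Assumed criterion for illustration only; the paper's rule may differ.
    Returns the winning name and the per-model scores.
    """
    scores = {
        name: sum(log_gmm(frame, gmm) for frame in utterance) / len(utterance)
        for name, gmm in models.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical two-dimensional toy "features"; real systems use e.g. MFCC vectors.
models = {
    "clean": [(1.0, [0.0, 0.0], [1.0, 1.0])],
    "noisy": [(0.5, [3.0, 3.0], [2.0, 2.0]), (0.5, [4.0, 2.0], [2.0, 2.0])],
}
utterance = [[2.9, 3.1], [3.8, 2.2], [3.2, 2.7]]
best, scores = select_model(utterance, models)
print(best, scores)
```

In this sketch the selected name would then decide which set of acoustic models is handed to the recognizer for that utterance. Reusing the recognizer's own features for the score mirrors the abstract's point that no additional feature extraction is needed.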