
Speech recognition as a retrieval problem

Rieber, Joscha Simon; Bardeli, Rolf


Horbach, M.; Gesellschaft für Informatik -GI-, Bonn:
Informatik 2013 - Informatik angepasst an Mensch, Organisation und Umwelt. CD-ROM : 43. Jahrestagung der Gesellschaft für Informatik e.V. (GI) vom 16. - 20. September 2013 in Koblenz, Germany
Bonn: Köllen Druck + Verlag, 2013 (GI-Edition - Lecture Notes in Informatics (LNI). Proceedings 220)
ISBN: 978-3-88579-614-5
Gesellschaft für Informatik (Jahrestagung) <43, 2013, Koblenz>
Conference paper, electronic publication
Fraunhofer IAIS

Common approaches to automatic speech recognition (ASR) train statistical models of the acoustics of speech. In our work, we develop a retrieval-based ASR system that does not rely on training and is therefore more flexible to apply. It is based on a set of known reference utterances for each word that may occur in a test string. A test word string is identified by finding the most similar reference for each word, using an approach based on dynamic time warping (DTW). The DTW variant suited to recognizing strings of connected words is level-building DTW, proposed by Myers and Rabiner in 1981. It uses a level-wise iteration to match each word in the test utterance with the most similar reference. We develop an ASR system for connected digit recognition based on level-building DTW, evaluate it, and compare it with a state-of-the-art HMM recognizer.
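
The level-wise matching described in the abstract can be illustrated with a small sketch. This is not the authors' implementation and not the full Myers and Rabiner algorithm: it assumes one scalar feature per frame (a real system would use feature vectors such as MFCCs), an absolute-difference local distance, unconstrained DTW steps, and a known number of words. The function name `level_building_dtw` and all data below are hypothetical.

```python
import math


def level_building_dtw(test, refs, num_words):
    """Decode `test` as exactly `num_words` connected words (simplified sketch).

    test: list of floats, one scalar feature per frame (assumption).
    refs: dict mapping a word label to its reference frame sequence.
    Returns (labels, total_cost).
    """
    n = len(test)
    inf = math.inf
    # prev[e]: cost of the best segmentation of test[:e] into the words so far.
    prev = [0.0] + [inf] * n
    back = []  # back[level][e] = (label, start frame) of the best word ending at e

    for _ in range(num_words):  # one "level" per word position
        cur = [inf] * (n + 1)
        ptr = [None] * (n + 1)
        for label, ref in refs.items():
            m = len(ref)
            # cost[i][j]: best accumulated cost with ref[i] aligned to test[j].
            cost = [[inf] * n for _ in range(m)]
            start = [[-1] * n for _ in range(m)]
            for j in range(n):
                for i in range(m):
                    cands = []
                    if i == 0:
                        # The word may begin here, continuing the previous level.
                        cands.append((prev[j], j))
                    if j > 0:
                        cands.append((cost[i][j - 1], start[i][j - 1]))
                        if i > 0:
                            cands.append((cost[i - 1][j - 1], start[i - 1][j - 1]))
                    if i > 0:
                        cands.append((cost[i - 1][j], start[i - 1][j]))
                    best, s = min(cands)
                    if best < inf:
                        cost[i][j] = best + abs(ref[i] - test[j])
                        start[i][j] = s
                # Keep the best reference for a word ending at test frame j.
                if cost[m - 1][j] < cur[j + 1]:
                    cur[j + 1] = cost[m - 1][j]
                    ptr[j + 1] = (label, start[m - 1][j])
        back.append(ptr)
        prev = cur

    if prev[n] == inf:
        raise ValueError("no alignment with the requested number of words")

    # Trace word boundaries back from the end of the test utterance.
    labels, e = [], n
    for ptr in reversed(back):
        label, e = ptr[e]
        labels.append(label)
    return labels[::-1], prev[n]


# Toy data: three "words" with clearly separated scalar features.
refs = {"a": [1.0, 1.0, 1.0], "b": [5.0, 5.0, 5.0], "c": [9.0, 9.0]}
test = [5.0, 5.0, 5.1, 1.0, 1.2, 1.0, 9.0, 9.0, 9.0]
labels, total = level_building_dtw(test, refs, num_words=3)
print(labels)  # ['b', 'a', 'c']
```

Each level reuses the best end costs of the previous level as possible starting points, so word boundaries are found jointly with the word identities and no training step is needed. Adding a new vocabulary word only means adding a reference sequence to `refs`.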