Here you will find scientific publications from the Fraunhofer Institutes.

Automatic sentence boundary detection for German broadcast news

: Dzhambazov, Georgi

Preprint urn:nbn:de:0011-n-2249790 (171 KByte PDF)
MD5 Fingerprint: 101e566335c6b837965d65aa4b4f6218
Created on: 22.3.2013

Fingscheidt, Tim ; Informationstechnische Gesellschaft -ITG-, Fachausschuss Sprachakustik:
Sprachkommunikation 2012 : Beiträge zur 10. ITG-Fachtagung vom 26. bis 28. September 2012 in Braunschweig
Berlin: VDE-Verlag, 2012 (ITG-Fachbericht 236)
ISBN: 978-3-8007-3455-9
Fachtagung Sprachkommunikation <10, 2012, Braunschweig>
Conference paper, electronic publication
Fraunhofer IAIS
speech recognition; sentence boundary detection; broadcast

In this work we aim to enrich the transcript of an automatic speech recognition system with punctuation by automatically detecting sentence ends. We use a simple word-based language model and combine it with a decision tree over the acoustic features of speech. The focus lies on selecting robust acoustic features that reflect the prosodic characteristics of the German language as well as possible. We arrive at a Sentence Unit Error Rate of 54 %, compared to the state-of-the-art rate for English of 61 %, by applying a comparable detection system. This is a sound indication that prosody is a stronger cue for the perception of sentence boundaries in German than in English. Our work is, to our knowledge, the first system developed for sentence boundary detection in the broadcast news domain for the German language. Our results can therefore serve as a baseline for further studies in this scenario.
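
The abstract does not define the Sentence Unit Error Rate it reports. As a minimal sketch, assuming the commonly used definition (missed boundaries plus falsely inserted boundaries, divided by the number of reference boundaries), the metric could be computed as follows. The function name `su_error_rate` and the representation of boundaries as word indices are illustrative assumptions, not taken from the paper.

```python
def su_error_rate(reference, hypothesis):
    """Return the sentence unit error rate in percent.

    Assumed definition (not confirmed by the abstract):
        (misses + false alarms) / number of reference boundaries.
    Both arguments are collections of word indices after which a
    sentence boundary occurs.
    """
    ref = set(reference)
    hyp = set(hypothesis)
    if not ref:
        raise ValueError("reference must contain at least one boundary")
    misses = len(ref - hyp)        # reference boundaries the system did not find
    false_alarms = len(hyp - ref)  # boundaries the system inserted wrongly
    return 100.0 * (misses + false_alarms) / len(ref)


# Hypothetical toy data: four true boundaries, three detected ones.
ref_boundaries = {4, 9, 15, 22}
hyp_boundaries = {4, 10, 15}
print(su_error_rate(ref_boundaries, hyp_boundaries))
```

Under this definition the rate can exceed 100 % when a system inserts many spurious boundaries, so a lower value is better and the figures in the abstract are not accuracies.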