Fraunhofer-Gesellschaft

Publica

Here you will find scientific publications from the Fraunhofer Institutes.

Improved transcription and indexing of oral history interviews for digital humanities research

 
Authors: Gref, Michael; Köhler, Joachim; Leh, Almut

Full text: urn:nbn:de:0011-n-4942025 (429 KByte PDF)
MD5 Fingerprint: 11eaeefaa65331ce69843ebd3a6c2b11
(CC) by-nc
Created on: 23 May 2018


Calzolari, N. ; European Language Resources Association -ELRA-, Paris:
LREC 2018, Eleventh International Conference on Language Resources and Evaluation. Proceedings. Online resource : May 7-12, 2018, Phoenix Seagaia Conference Center Miyazaki, Japan
Paris: ELRA, 2018
ISBN: 979-10-95546-00-9
pp. 3124-3131
International Conference on Language Resources and Evaluation (LREC) <11, 2018, Miyazaki>
Bundesministerium für Bildung und Forschung BMBF
Forschungsinfrastrukturen für die Geistes- und qualitativen Sozialwissenschaften; 01UG1511B; KA3
Kölner Zentrum für Analyse und Archivierung audiovisueller Daten
English
Conference paper, electronic publication
Fraunhofer IAIS
acoustic modeling; robust speech recognition; multi-condition training; speech retrieval; oral history

Abstract
This paper describes different approaches to improving the transcription and indexing quality of the Fraunhofer IAIS Audio Mining system on Oral History interviews for digital humanities research. As an essential component of the Audio Mining system, automatic speech recognition faces many challenges when processing Oral History interviews. We aim to overcome these challenges using state-of-the-art automatic speech recognition technology. Different acoustic modeling techniques, such as multi-condition training and sophisticated neural networks, are applied to train robust acoustic models. To evaluate the performance of these models on Oral History interviews, a German Oral History test set is presented. This test set represents the large audio-visual archives "Deutsches Gedächtnis" of the Institute for History and Biography. Combining the applied techniques reduces the word error rate on this test set by 28.3% relative to the current baseline system, while using only one eighth of the previous amount of training data. In the context of these experiments, we set out the new opportunities that Audio Mining offers for Oral History research. We also describe the workflow that Audio Mining uses to process long audio files and automatically create time-aligned transcriptions.
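For readers unfamiliar with the metric, the following is a minimal sketch of how a word error rate and a relative reduction are commonly computed. It is not taken from the paper: the function names `word_error_rate` and `relative_reduction` are hypothetical, whitespace tokenization is an assumption, and the numbers in the usage note are invented for illustration only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()  # assumption: simple whitespace tokenization
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction by which new_wer is lower than baseline_wer."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer
```

As a made-up example, going from a word error rate of 0.50 to 0.40 is a relative reduction of 20%, although the absolute difference is only 10 percentage points. The 28.3% figure in the abstract is relative in this sense.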

URL: http://publica.fraunhofer.de/dokumente/N-494202.html