2007
Conference Paper
Title
Supporting radio archive workflows with vocabulary independent spoken keyword search
Abstract
Archive departments of large radio broadcasters stand to benefit greatly from speech recognition technology and other audio processing techniques. To move towards a practical understanding of how these technologies can support archive staff, two large German radio broadcasters, Deutsche Welle and Westdeutscher Rundfunk, commissioned Fraunhofer IAIS to build a German-language radio archive prototype. This paper discusses the development and assessment of the spoken keyword search module of this prototype. The search module was designed and tested in a project group consisting of both multimedia researchers and archive professionals. As a result, the prototype is unique in that its design and evaluation are tuned explicitly to the requirements of archivists. The paper discusses the special needs of radio archive staff and how they were accommodated in the design of the keyword search functionality. In particular, the archive staff required a vocabulary-independent search facility capable of searching for keywords in an archive containing a high proportion of spontaneous speech. Keyword search is implemented using a fuzzy-matching algorithm, which performs a similarity search on syllable transcripts generated by the speech recognizer. An evaluation assesses whether the radio archive prototype fulfills the needs of archivists.
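The abstract does not specify the fuzzy-matching algorithm, so the following is only an illustrative sketch of how a similarity search over syllable transcripts might work. It assumes that both the query and the recognizer output are available as lists of syllable strings, and it swaps in a standard approximate substring match based on edit distance. The function name `fuzzy_find` and the parameter `max_cost` are hypothetical and do not come from the paper.

```python
from typing import List, Tuple


def fuzzy_find(query: List[str], transcript: List[str],
               max_cost: int) -> List[Tuple[int, int]]:
    """Return (end_index, cost) for each transcript position where the query
    matches some span ending there with edit distance <= max_cost.

    Illustrative only: a plain edit-distance search over syllables,
    not the algorithm used in the prototype.
    """
    # prev[i]: cheapest way to align the first i query syllables with a
    # transcript span ending at the previous transcript position.
    prev = list(range(len(query) + 1))
    hits = []
    for j, syllable in enumerate(transcript):
        # A match may start anywhere, so the empty query prefix costs 0.
        cur = [0]
        for i, q in enumerate(query, start=1):
            substitute = prev[i - 1] + (0 if q == syllable else 1)
            skip_transcript = prev[i] + 1   # extra syllable in the transcript
            skip_query = cur[i - 1] + 1     # query syllable missing in transcript
            cur.append(min(substitute, skip_transcript, skip_query))
        if cur[-1] <= max_cost:
            hits.append((j, cur[-1]))
        prev = cur
    return hits


# Exact hit: the keyword "Berlin" as two syllables.
exact = fuzzy_find(["ber", "lin"], ["in", "ber", "lin", "ist", "es"], 0)

# Tolerant hit: the first syllable was misrecognized as "bar".
tolerant = fuzzy_find(["ber", "lin"], ["in", "bar", "lin"], 1)
```

Because the search operates on syllables rather than on a fixed word list, a keyword that is missing from the recognizer's vocabulary can still be found, and a nonzero `max_cost` tolerates some recognition errors at the price of more false alarms.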