Towards large scale vocabulary independent spoken term detection: Advances in the Fraunhofer IAIS audiomining system

Schneider, Daniel; Schon, Jochen; Eickeler, Stefan

2008

Conference Paper

Abstract

This contribution presents advances in the Fraunhofer IAIS Audiomining system for vocabulary-independent spoken term detection since the 2007 SIGIR workshop on searching spontaneous conversational speech. Based on feedback from archivists involved in developing the prototype, a set of requirements for spoken term detection systems was established to guide the development of the overall system. After the automatic speech recognition (ASR) baseline was improved with data from the broadcast domain, the syllable error rate on a set of broadcast news and broadcast conversation shows was reduced by 45.6% relative, and the time required to analyze the data was reduced by 90%. With the new ASR results, the F1 value of the fuzzy syllable search used for open-vocabulary spoken term detection increased by 49% relative. The best results were achieved with a hybrid word and syllable system, a relative F1 improvement of 58% over the 2007 prototype. With the better ASR baseline, exact string search on the syllable transcripts becomes a promising alternative, yielding precise results on large audiovisual archives with only small reductions in recall.
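The abstract reports its gains as relative changes and as F1 values. As a reading aid, the following minimal sketch shows how such figures are conventionally computed. The function names and the input numbers are hypothetical and are not taken from the paper; the abstract does not state the underlying absolute error rates, precision, or recall.

```python
def relative_reduction(baseline: float, new: float) -> float:
    """Fraction by which `new` is lower than `baseline` (e.g. an error rate)."""
    return (baseline - new) / baseline


def relative_gain(baseline: float, new: float) -> float:
    """Fraction by which `new` is higher than `baseline` (e.g. an F1 value)."""
    return (new - baseline) / baseline


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical values for illustration only, not results from the paper.
    old_error_rate, new_error_rate = 0.50, 0.25
    print(f"relative error reduction: {relative_reduction(old_error_rate, new_error_rate):.1%}")

    old_f1 = f1_score(precision=0.40, recall=0.40)
    new_f1 = f1_score(precision=0.60, recall=0.60)
    print(f"relative F1 gain: {relative_gain(old_f1, new_f1):.1%}")
```

A relative figure depends on the baseline: the same absolute change yields a larger relative change when the baseline value is small, which is why relative and absolute improvements should not be compared directly.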

Mainwork

ACM SIGIR Workshop "Searching Spontaneous Conversational Speech" 2008. Proceedings

Conference

Workshop "Searching Spontaneous Conversational Speech" (SSCS) 2008

International Conference on Research and Development in Information Retrieval (SIGIR) 2008
