Fraunhofer-Gesellschaft

Publica

Here you will find scientific publications from the Fraunhofer Institutes.

Social recommendation using speech recognition: Sharing TV scenes in social networks

 
Authors: Schneider, Daniel; Tschöpel, Sebastian; Schwenninger, Jochen

Postprint urn:nbn:de:0011-n-2051371 (287 KByte PDF)
MD5 Fingerprint: 8b70baa4f116744a11c6818173c0c78e
© 2012 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
Created on: 13.6.2012


Institute of Electrical and Electronics Engineers -IEEE-:
WIAMIS 2012, 13th International Workshop on Image Analysis for Multimedia Interactive Services : 23rd - 25th May 2012, Dublin City University, Ireland
New York, NY: IEEE, 2012
ISBN: 978-1-4673-0791-8 (Print)
ISBN: 978-1-4673-0789-5 (Online)
ISBN: 978-1-4673-0790-1
4 pp.
International Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS) <13, 2012, Dublin>
English
Conference Paper, Electronic Publication
Fraunhofer IAIS
audio processing; social network; recommendation; automatic speech recognition; LVCSR

Abstract
We describe a novel system which simplifies the recommendation of video scenes in social networks, thereby attracting a new audience for existing video portals. Users can select interesting quotes from a speech recognition transcript and share the corresponding video scene with their social circle with minimal effort. The system has been designed in close cooperation with the largest German public broadcaster (ARD), and was deployed at the broadcaster's public video portal. A twofold adaptation strategy adapts our speech recognition system to the given use case. First, a database of speaker-adapted acoustic models for the most important speakers in the corpus is created. We use spectral speaker identification to detect whether one of these speakers is speaking, and select the corresponding model accordingly. Second, we apply language model adaptation by exploiting prior knowledge about the video category.
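
The twofold adaptation strategy in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and this record does not describe how theirs works. The sketch assumes that a speaker identification step has already produced one score per known speaker, and that models are referred to by name. All identifiers, the score threshold, and the fallback to a generic model are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical model names; the record does not name any models.
GENERIC_ACOUSTIC_MODEL = "am_generic"
GENERIC_LANGUAGE_MODEL = "lm_generic"


@dataclass(frozen=True)
class ModelChoice:
    acoustic_model: str
    language_model: str


def select_models(speaker_scores, category, speaker_models, category_lms,
                  threshold=0.5):
    """Choose an acoustic model from speaker-ID scores and a language
    model from the video category.

    speaker_scores: dict mapping speaker name to an identification score
                    (assumed: higher means more likely).
    speaker_models: dict mapping speaker name to a speaker-adapted model.
    category_lms:   dict mapping video category to an adapted language model.
    threshold:      assumed minimum score for accepting a known speaker.
    """
    acoustic = GENERIC_ACOUSTIC_MODEL
    if speaker_scores:
        # Take the best-scoring known speaker, if any is confident enough.
        best = max(speaker_scores, key=speaker_scores.get)
        if speaker_scores[best] >= threshold and best in speaker_models:
            acoustic = speaker_models[best]

    # Prior knowledge about the category selects the language model;
    # unknown categories fall back to the generic one (an assumption).
    language = category_lms.get(category, GENERIC_LANGUAGE_MODEL)
    return ModelChoice(acoustic, language)


choice = select_models(
    speaker_scores={"anchor_a": 0.91, "anchor_b": 0.12},
    category="news",
    speaker_models={"anchor_a": "am_anchor_a"},
    category_lms={"news": "lm_news"},
)
print(choice)
```

In this sketch the two adaptation steps are independent: a segment from an unknown speaker in a known category still gets the category-specific language model, and the reverse holds as well.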

http://publica.fraunhofer.de/documents/N-205137.html