Fraunhofer-Gesellschaft
2003
Conference Paper
Title

Large text and audio data alignment for multimedia applications

Abstract
This paper describes a technique for aligning a large text with a large audio recording. The technique segments the audio into homogeneous speech segments, recognizes each speech fragment with a speech recognizer, and describes each fragment by keywords selected from the recognizer output on the basis of acoustic confidence score and of salience with respect to the other speech fragments. The sentences of the text are described by the same keywords. A global alignment between the large text and the large audio using only these keywords gives a rough correspondence between the sentences of the text and the audio fragments. A second recognition pass, based on a finite state automaton generated from the roughly aligned sentences that correspond to each speech fragment, gives a more precise alignment. The suggested technique gives an accurate alignment between the text and the audio.
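The keyword selection and rough global alignment steps of the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the thresholds `min_conf` and `max_doc_frac`, the use of fragment frequency as the salience measure, whitespace tokenization, and the overlap-count score are all assumptions made for illustration. The second, automaton-based recognition pass is not covered.

```python
from collections import Counter

def select_keywords(fragments, min_conf=0.8, max_doc_frac=0.5):
    """fragments: list of [(word, confidence), ...], one list per speech fragment.
    Keeps words that are confidently recognized AND occur in few fragments
    (a stand-in salience measure; the paper's actual measure may differ)."""
    n = len(fragments)
    confident = [{w for w, c in frag if c >= min_conf} for frag in fragments]
    df = Counter(w for words in confident for w in words)  # fragment frequency
    return [{w for w in words if df[w] / n <= max_doc_frac} for words in confident]

def align(sentences, frag_keywords):
    """Monotonic dynamic-programming alignment: returns one fragment index per
    sentence, non-decreasing, maximizing total keyword overlap.
    Assumes at least one sentence and one fragment."""
    vocab = set().union(*frag_keywords)
    sent_kw = [set(s.lower().split()) & vocab for s in sentences]
    n_sent, n_frag = len(sentences), len(frag_keywords)
    score = [[0] * n_frag for _ in range(n_sent)]
    back = [[0] * n_frag for _ in range(n_sent)]
    for i in range(n_sent):
        best_prev, best_k = -1, 0  # running max of the previous row up to column j
        for j in range(n_frag):
            if i > 0 and score[i - 1][j] > best_prev:
                best_prev, best_k = score[i - 1][j], j
            overlap = len(sent_kw[i] & frag_keywords[j])
            score[i][j] = overlap + (best_prev if i > 0 else 0)
            back[i][j] = best_k
    # Trace back from the best-scoring fragment for the last sentence.
    j = max(range(n_frag), key=lambda k: score[-1][k])
    path = []
    for i in range(n_sent - 1, -1, -1):
        path.append(j)
        j = back[i][j]
    return path[::-1]

# Hypothetical recognizer output: (word, acoustic confidence) per fragment.
fragments = [
    [("the", 0.95), ("weather", 0.9), ("forecast", 0.85), ("uh", 0.3)],
    [("the", 0.9), ("stock", 0.92), ("market", 0.88)],
    [("the", 0.93), ("football", 0.9), ("match", 0.4)],
]
sentences = [
    "The weather forecast is good",
    "Rain is expected",
    "The stock market fell",
    "The football match ended",
]
keywords = select_keywords(fragments)
alignment = align(sentences, keywords)
print(alignment)  # [0, 0, 1, 2]
```

In this toy input, "the" is dropped because it appears in every fragment, and "uh" and "match" are dropped for low confidence. The sentence with no keywords inherits a position from its neighbours through the monotonicity constraint, which is the kind of rough correspondence a later pass would refine.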
Author(s)
Biatov, K.
Mainwork
Text, speech and dialogue  
Conference
International Conference on Text, Speech and Dialogue (TSD) 2003  
Language
English
IMK  