Audio clips content comparison using latent semantic indexing
This paper describes experiments in comparing audio clips based on their spoken content, which is obtained using automatic speech recognition. The social tags that are available for most of the audio clips are used as keywords, and these keywords are mapped onto the spoken transcriptions that represent the clips. Each clip is described using term frequency-inverse document frequency (TF-IDF) weighting, which statistically evaluates how important the keywords are for each document. Latent Semantic Indexing (LSI) is then applied to the clip-feature-vector matrix, mapping the clip content into a low-dimensional latent semantic space. The clips are compared using a document-document similarity measure based on LSI, and this similarity is compared with the results obtained using the standard vector space model.
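The pipeline sketched in the abstract (TF-IDF description, LSI projection, document-document comparison against a plain vector space model) can be illustrated with a minimal example. This is not the authors' implementation; the toy transcripts, the tag vocabulary, and the choice of scikit-learn's TfidfVectorizer, TruncatedSVD, and cosine similarity are all assumptions made for illustration.

```python
# Illustrative sketch (not the paper's implementation): TF-IDF weighting over
# a tag-derived keyword vocabulary, LSI via truncated SVD, and
# document-document cosine similarity in both the latent space and the raw VSM.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

# Toy "transcripts" standing in for ASR output of four audio clips.
transcripts = [
    "dog barking in the park loud dog",
    "dog barks at cat in the yard",
    "piano music concert classical piano",
    "classical concert with violin and piano",
]

# Hypothetical keyword list playing the role of the clips' social tags.
tags = ["dog", "cat", "piano", "violin", "concert", "park", "yard", "music"]

# TF-IDF description of each clip, restricted to the keyword vocabulary.
tfidf = TfidfVectorizer(vocabulary=tags)
X = tfidf.fit_transform(transcripts)      # clips x keywords matrix

# LSI: project the clip vectors into a low-dimensional latent semantic space.
lsi = TruncatedSVD(n_components=2, random_state=0)
Z = lsi.fit_transform(X)                  # clips x latent-dimensions matrix

# Document-document comparison: latent space (LSI) vs. raw vector space model.
sim_lsi = cosine_similarity(Z)
sim_vsm = cosine_similarity(X)
print(np.round(sim_lsi, 2))
print(np.round(sim_vsm, 2))
```

With a clearly two-topic corpus like this one, both measures rank the two "dog" clips closer to each other than to the "piano" clips; the difference studied in the paper is how LSI's latent space handles clips that share meaning but few exact keywords.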