Here you will find scientific publications from the Fraunhofer Institutes.

Two level discriminative training for audio events recognition in sport broadcasts

Author: Biatov, K.

Moscow State Linguistic University:
Twelfth International Conference SPEECH and COMPUTER, SPECOM 2007. Proceedings : October 15-18, 2007, Moscow, Russia
Moscow: Moscow State Linguistic Univ., 2007
International Conference SPEECH and COMPUTER (SPECOM) <12, 2007, Moscow>
Fraunhofer IAIS
audio events detection; machine learning; discriminative learning; signal processing

This paper describes two-level discriminative learning for recognizing audio events in an archive of sports broadcasts. The recognition rests on the idea that audio events are composed of basic units, which are elementary events. Audio events used for semantic interpretation (mid-level concepts) are represented as combinations of these basic units. The basic units are modeled with Gaussian mixture models (GMMs), and each block of 5 audio frames is recognized using the basic-unit models. Each mid-level concept is described by the distribution of the basic units: the distribution of basic units within each class of segments corresponding to a mid-level concept serves as the macro model of that class. Events are recognized in a tree-based framework in which two macro models are compared at each level of the tree. Discriminative training is applied to the macro models at two levels: first at the level of the basic units, then at the level of the macro models. The proposed approach is compared with a maximum-likelihood decision and with an SVM with a polynomial kernel. The experimental results indicate a significant improvement over these conventional approaches in recognizing acoustically close audio events.
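
The pipeline outlined in the abstract can be illustrated with a minimal sketch. It covers only the maximum-likelihood baseline: 5-frame blocks are labeled with basic-unit GMMs, a macro model is built as the distribution of unit labels, and two macro models are compared at one tree node. It does not reproduce the two-level discriminative training, whose details the abstract does not give. All names, the one-dimensional features, the GMM parameters, the additive smoothing, and the classes "cheering" and "speech" are hypothetical choices made for illustration.

```python
import math
from collections import Counter

def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def log_gmm(x, gmm):
    """Log density of a GMM given as a list of (weight, mean, var) components."""
    terms = [math.log(w) + log_gauss(x, m, v) for w, m, v in gmm]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))  # log-sum-exp

def label_blocks(frames, unit_gmms, block=5):
    """Assign each block of `block` consecutive frames to the best-scoring basic unit."""
    labels = []
    for start in range(0, len(frames) - block + 1, block):
        chunk = frames[start:start + block]
        scores = {u: sum(log_gmm(f, g) for f in chunk) for u, g in unit_gmms.items()}
        labels.append(max(scores, key=scores.get))
    return labels

def macro_model(label_sequences, units, smooth=1.0):
    """Smoothed relative frequency of each basic unit over a class's segments.
    The additive smoothing is an assumption, not taken from the paper."""
    counts = Counter(l for seq in label_sequences for l in seq)
    total = sum(counts.values()) + smooth * len(units)
    return {u: (counts[u] + smooth) / total for u in units}

def compare(labels, model_a, model_b):
    """One tree node: return 'a' or 'b', whichever macro model scores the labels higher."""
    score_a = sum(math.log(model_a[l]) for l in labels)
    score_b = sum(math.log(model_b[l]) for l in labels)
    return "a" if score_a >= score_b else "b"

# Hypothetical basic-unit GMMs over a one-dimensional feature.
UNIT_GMMS = {
    "low": [(1.0, [0.0], [1.0])],
    "high": [(0.5, [4.0], [1.0]), (0.5, [6.0], [1.0])],
}
UNITS = sorted(UNIT_GMMS)

# Hypothetical macro models built from already-labeled training segments.
cheering = macro_model([["high", "high", "low"]], UNITS)
speech = macro_model([["low", "low", "high"]], UNITS)

# A toy test segment: 5 quiet frames followed by 10 loud frames.
segment = [[0.1]] * 5 + [[5.0]] * 10
labels = label_blocks(segment, UNIT_GMMS)
decision = compare(labels, cheering, speech)
```

In this toy setup the segment contains mostly "high" blocks, so the comparison favors the macro model in which that unit is more frequent. A full tree would chain several such pairwise comparisons, one per level.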