
Perceptual audio features for unsupervised key-phrase detection

Authors: Zeddelmann, D. von; Kurth, F.; Müller, M.


Douglas, S.C. ; IEEE Signal Processing Society:
IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2010. Proceedings. Vol.1 : 14-19 March 2010, Dallas, Texas, USA
Piscataway/NJ: IEEE, 2010
ISBN: 978-1-4244-4296-6
ISBN: 978-1-4244-4295-9
International Conference on Acoustics, Speech and Signal Processing (ICASSP) <35, 2010, Dallas/Tex.>
Conference Paper
Fraunhofer FKIE

We propose a new type of audio feature (HFCC-ENS) as well as an unsupervised method for detecting short sequences of spoken words (key-phrases) within long speech recordings. Our technical contributions are threefold: Firstly, we propose to use bandwidth-adapted filterbanks instead of classical MFCC-style filters in the feature extraction step. Secondly, the time resolution of the resulting features is adapted to account for the temporal characteristics of the spoken phrases. Thirdly, the key-phrase detection step is performed by matching sequences of the resulting HFCC-ENS features with features extracted from a target speech recording. We evaluate the proposed method using the German Kiel Corpus and furthermore investigate speech-related properties of the proposed feature.
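
The second and third steps of the abstract (adapting the time resolution of the features, then matching feature sequences against a target recording) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the averaging window, and the use of mean cosine distance over equally long segments are assumptions chosen for illustration, and the input is assumed to be a list of already extracted, non-negative feature frames rather than real HFCC-ENS features.

```python
import math


def normalize(vec):
    """Scale a feature vector to unit Euclidean length (zero vectors stay zero)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def smooth_and_downsample(frames, window, hop):
    """Average `window` consecutive frames, advance by `hop` frames, then normalize.

    Assumption: larger `window` and `hop` values give a coarser time resolution.
    """
    out = []
    for start in range(0, len(frames) - window + 1, hop):
        block = frames[start:start + window]
        mean = [sum(column) / window for column in zip(*block)]
        out.append(normalize(mean))
    return out


def matching_curve(query, target):
    """Mean cosine distance between the query and each equally long target segment.

    Both inputs are sequences of unit-length frames, so the dot product of two
    frames is their cosine similarity.
    """
    m = len(query)
    curve = []
    for i in range(len(target) - m + 1):
        similarity = sum(
            sum(a * b for a, b in zip(q, t))
            for q, t in zip(query, target[i:i + m])
        )
        curve.append(1.0 - similarity / m)
    return curve


def best_match(query, target):
    """Return (start index, distance) of the target segment closest to the query."""
    curve = matching_curve(query, target)
    index = min(range(len(curve)), key=curve.__getitem__)
    return index, curve[index]


# Toy data (hypothetical): six unit-length frames; the query is frames 3..4.
target = [normalize(v) for v in
          [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0], [0, 1, 1], [1, 0, 1]]]
query = target[3:5]
index, distance = best_match(query, target)
```

In this toy case the minimum of the matching curve lies at the position the query was cut from. In a detection setting one would typically report every local minimum below a chosen threshold rather than only the global minimum; the threshold is a further assumption that the abstract does not specify.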