Fraunhofer-Gesellschaft

Publica

Here you will find scientific publications from the Fraunhofer Institutes.

Perceptual audio features for unsupervised key-phrase detection

 
Authors: Zeddelmann, D. von; Kurth, F.; Müller, M.

In:

Douglas, S.C. ; IEEE Signal Processing Society:
IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2010. Proceedings. Vol.1 : 14-19 March 2010, Dallas, Texas, USA
Piscataway/NJ: IEEE, 2010
ISBN: 978-1-4244-4296-6
ISBN: 978-1-4244-4295-9
pp.257-260
International Conference on Acoustics, Speech and Signal Processing (ICASSP) <35, 2010, Dallas/Tex.>
English
Conference Paper
Fraunhofer FKIE

Abstract
We propose a new type of audio feature (HFCC-ENS) as well as an unsupervised method for detecting short sequences of spoken words (key-phrases) within long speech recordings. Our technical contributions are threefold: Firstly, we propose to use bandwidth-adapted filterbanks instead of classical MFCC-style filters in the feature extraction step. Secondly, the time resolution of the resulting features is adapted to account for the temporal characteristics of the spoken phrases. Thirdly, the key-phrase detection step is performed by matching sequences of the resulting HFCC-ENS features with features extracted from a target speech recording. We evaluate the proposed method using the German Kiel Corpus and furthermore investigate speech-related properties of the proposed feature.
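The matching step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how HFCC-ENS features are computed or compared, so the sketch assumes the features are already available as lists of per-frame vectors. It further assumes that time resolution is adapted by averaging over a window and downsampling, and that sequences are compared with an averaged cosine distance. All function names and parameters (`smooth_and_downsample`, `matching_curve`, `detect`, `window`, `step`, `threshold`) are hypothetical.

```python
import math


def normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def smooth_and_downsample(frames, window, step):
    """Assumed time-resolution adaptation: average `window` consecutive
    frames, advance by `step` frames, and renormalize each result."""
    out = []
    for start in range(0, len(frames) - window + 1, step):
        block = frames[start:start + window]
        mean = [sum(col) / window for col in zip(*block)]
        out.append(normalize(mean))
    return out


def matching_curve(query, target):
    """For every shift of the query along the target, return
    1 - (mean cosine similarity of aligned frames).
    Both sequences are expected to hold unit-length vectors."""
    m = len(query)
    curve = []
    for shift in range(len(target) - m + 1):
        sim = sum(
            sum(a * b for a, b in zip(q, t))
            for q, t in zip(query, target[shift:shift + m])
        )
        curve.append(1.0 - sim / m)
    return curve


def detect(curve, threshold):
    """Return shifts that are local minima of the curve and lie at or
    below `threshold`. Plateaus may yield several adjacent hits."""
    hits = []
    for i, d in enumerate(curve):
        left = curve[i - 1] if i > 0 else math.inf
        right = curve[i + 1] if i + 1 < len(curve) else math.inf
        if d <= threshold and d <= left and d <= right:
            hits.append(i)
    return hits


if __name__ == "__main__":
    # Toy 2-dimensional "features"; real features would come from a filterbank.
    target = [normalize(v) for v in [[1, 0], [0, 1], [1, 1], [1, 0], [0, 1], [0, 1]]]
    query = target[2:4]
    print(detect(matching_curve(query, target), threshold=0.1))
```

In this toy setup the query is cut directly from the target, so the distance drops to approximately zero at the shift where it was taken. With real recordings the threshold would have to be tuned, since a spoken key-phrase never repeats exactly.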

URL: http://publica.fraunhofer.de/documents/N-188656.html