On the use of spectro-temporal features for the IEEE AASP challenge 'detection and classification of acoustic scenes and events'
In this contribution, an acoustic event detection system based on spectro-temporal features and a two-layer hidden Markov model as back-end is proposed within the framework of the IEEE AASP challenge 'Detection and Classification of Acoustic Scenes and Events' (D-CASE). Noise reduction based on the log-spectral amplitude estimator by  and noise power density estimation by  is used for signal enhancement. Performance based on three different kinds of features is compared, i.e. for amplitude modulation spectrogram, Gabor filterbank-features and conventional Mel-frequency cepstral coefficients (MFCCs), all of them known from automatic speech recognition (ASR). The evaluation is based on the office live recordings provided within the D-CASE challenge. The influence of the signal enhancement is investigated and the increase in recognition rate by the proposed features in comparison to MFCC-features is shown. It is demonstrated that the proposed spectro-temporal feature s achieve a better recognition accuracy than MFCCs.