2014
Conference Paper
Title

Robust ASR in reverberant environments using temporal cepstrum smoothing for speech enhancement and an amplitude modulation filterbank for feature extraction

Abstract
This paper presents techniques for improving automatic speech recognition (ASR) in single-channel scenarios in the context of the REVERB (REverberant Voice Enhancement and Recognition Benchmark) challenge. System improvements range from speech enhancement and robust feature extraction to model adaptation and word-based integration of multiple classifiers. The selective temporal cepstrum smoothing (TCS) technique is applied to enhance the reverberant speech signal at moderate noise levels. It is based on a statistical model of room impulse responses (RIRs) and on minimum statistics (MS), and uses estimates of the late reverberation and noise power spectral densities (PSDs). Robust feature extraction is performed by amplitude modulation filtering of the cepstrogram to extract its temporal modulation information. As an alternative classifier, the acoustic models have been adapted using different RIRs and an RIR selection scheme based on a multi-layer perceptron (MLP) system that uses spectro-temporal features as input. In the final stage, system combination by recognizer output voting error reduction (ROVER) is employed to obtain a jointly optimal recognized transcription. The proposed system has been evaluated in two processing modes, utterance-based batch processing and full batch processing, and achieves an overall average absolute improvement of 11% under varying reverberant conditions compared to the baseline system.
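The abstract does not spell out how temporal cepstrum smoothing operates, so the following is only a minimal sketch of the general idea: each cepstral coefficient is smoothed recursively along the time axis with its own smoothing constant. The function name `temporal_cepstrum_smoothing`, the `alphas` parameter, and the first-order recursion are assumptions made for illustration. They are not taken from the paper, and the sketch omits the selective choice of bins as well as the RIR model and minimum-statistics estimates mentioned above.

```python
from typing import List, Sequence


def temporal_cepstrum_smoothing(
    cepstrogram: Sequence[Sequence[float]],
    alphas: Sequence[float],
) -> List[List[float]]:
    """Recursively smooth each cepstral coefficient along time.

    cepstrogram[t][q] is the coefficient at frame t and quefrency bin q.
    alphas[q] is a hypothetical smoothing constant in [0, 1) for bin q:
    0 leaves the bin unsmoothed, values near 1 smooth it heavily.
    """
    if not cepstrogram:
        return []
    n_bins = len(alphas)
    for frame in cepstrogram:
        if len(frame) != n_bins:
            raise ValueError("every frame needs one value per smoothing constant")
    if any(not 0.0 <= a < 1.0 for a in alphas):
        raise ValueError("smoothing constants must lie in [0, 1)")

    # The first frame has no predecessor, so it is copied unchanged.
    smoothed = [list(cepstrogram[0])]
    for frame in cepstrogram[1:]:
        prev = smoothed[-1]
        # First-order recursion per quefrency bin:
        # s[t][q] = alpha[q] * s[t-1][q] + (1 - alpha[q]) * c[t][q]
        smoothed.append(
            [a * p + (1.0 - a) * c for a, p, c in zip(alphas, prev, frame)]
        )
    return smoothed


# Toy input: two frames, two quefrency bins (values are made up).
example = temporal_cepstrum_smoothing([[1.0, 1.0], [3.0, 3.0]], [0.0, 0.5])
```

In this toy call the first bin passes through unchanged because its constant is 0, while the second bin moves only halfway towards the new value. Giving each quefrency bin its own constant is what would allow some bins to be smoothed lightly and others heavily.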
Author(s)
Xiong, F.
Moritz, N.
Rehr, R.
Anemüller, J.
Meyer, B.T.
Gerkmann, T.
Goetze, S.
Doclo, Simon  
Mainwork
REVERB Challenge Workshop 2014, REverberant Voice Enhancement and Recognition Benchmark. Online resource  
Conference
REverberant Voice Enhancement and Recognition Benchmark Challenge Workshop (REVERB) 2014  
Language
English
Fraunhofer-Institut für Digitale Medientechnologie IDMT  