Fraunhofer-Gesellschaft
2016
Journal Article
Title

Integration of Optimized Modulation Filter Sets into Deep Neural Networks for Automatic Speech Recognition

Abstract
Inspired by physiological studies on the human auditory system and by results from psychoacoustics, an amplitude modulation filter bank (AMFB) has been developed and successfully applied to feature extraction for automatic speech recognition (ASR) in earlier work. Here, we address the question as to which amplitude modulation (AM) frequency decomposition leads to optimal ASR performance by proposing a parameterized functional relationship between modulation center frequency and modulation bandwidth. Word error rates (WERs) of ASR experiments with 1551 different AMFBs are systematically evaluated and compared, resulting in the identification of a comparatively narrow range of optimal modulation frequency to modulation bandwidth characteristics. To integrate modulation processing with deep neural network (DNN) acoustic modeling, we propose (1) merging of modulation filter coefficients with DNN weights prior to a final training step and (2) an improved mean-variance normalization scheme for AMFBs.
Author(s)
Moritz, Niko
Kollmeier, Birger  
Anemüller, Jörn
Journal
IEEE ACM transactions on audio, speech, and language processing  
DOI
10.1109/TASLP.2016.2615239
Language
English
Fraunhofer-Institut für Digitale Medientechnologie IDMT  