2017
Conference Paper
Title

On DNN posterior probability combination in multi-stream speech recognition for reverberant environments

Abstract
This paper applies a multi-stream framework with deep neural network (DNN) classifiers to improve automatic speech recognition (ASR) performance in environments with different reverberation characteristics. We propose a room parameter estimation model that determines the stream weights for DNN posterior probability combination, with the aim of obtaining reliable log-likelihoods for decoding. The model is implemented by training a multi-layer perceptron to distinguish between various reverberant environments. The method is tested in known and unknown environments against approaches based on inverse entropy and autoencoders, with average relative word error rate improvements of 46% and 29%, respectively, when performing multi-stream ASR in different reverberant situations.
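To make the idea of stream-weighted posterior combination concrete, the following is a minimal sketch of the inverse-entropy baseline mentioned in the abstract, not the paper's proposed room parameter estimation model. It assumes a weighted-sum combination rule and one weight per stream per frame; the function names and the toy posteriors are hypothetical and chosen only for illustration.

```python
import math

def entropy(posterior):
    """Shannon entropy (in nats) of one posterior distribution."""
    return -sum(p * math.log(p) for p in posterior if p > 0.0)

def inverse_entropy_weights(posteriors, eps=1e-8):
    """One weight per stream, proportional to 1 / entropy, normalized to sum to 1.

    Assumption: a low-entropy (confident) stream is treated as more reliable.
    """
    inv = [1.0 / (entropy(p) + eps) for p in posteriors]
    total = sum(inv)
    return [v / total for v in inv]

def combine_posteriors(posteriors, weights):
    """Weighted sum of the stream posteriors for a single frame."""
    n_classes = len(posteriors[0])
    combined = [
        sum(w * p[c] for w, p in zip(weights, posteriors))
        for c in range(n_classes)
    ]
    norm = sum(combined)  # renormalize to guard against rounding drift
    return [v / norm for v in combined]

# Toy frame with two streams and three classes (made-up numbers).
stream_a = [0.90, 0.05, 0.05]  # confident stream
stream_b = [0.40, 0.30, 0.30]  # uncertain stream

weights = inverse_entropy_weights([stream_a, stream_b])
combined = combine_posteriors([stream_a, stream_b], weights)
```

In this sketch the confident stream receives the larger weight, so the combined posterior is pulled towards it. According to the abstract, the proposed method instead derives the stream weights from a multi-layer perceptron trained to distinguish reverberant environments; that step would replace `inverse_entropy_weights` here.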
Author(s)
Xiong, F.
Goetze, S.
Meyer, B.T.
Published in
IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2017. Proceedings
Conference
International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2017
DOI
10.1109/ICASSP.2017.7953158
Language
English
Fraunhofer-Institut für Digitale Medientechnologie IDMT