2022
Conference Paper
Title

A DNN Based Post-Filter to Enhance the Quality of Coded Speech in MDCT Domain

Abstract
Frequency domain processing, and in particular the use of the Modified Discrete Cosine Transform (MDCT), is the most widespread approach to audio coding. However, at low bitrates, audio quality, especially for speech, degrades drastically due to the lack of available bits to directly code the transform coefficients. Traditionally, post-filtering has been used to mitigate artefacts in the coded speech by exploiting a-priori information of the source and extra transmitted parameters. Recently, data-driven post-filters have shown better results, but at the cost of significant additional complexity and delay. In this work, we propose a mask-based post-filter operating directly in the MDCT domain of the codec, inducing no extra delay. The real-valued mask is applied to the quantized MDCT coefficients and is estimated from a relatively lightweight convolutional encoder-decoder network. Our solution is tested on the recently standardized low-delay, low-complexity codec (LC3) at its lowest possible bitrate of 16 kbps. Objective and subjective assessments clearly show the advantage of this approach over the conventional post-filter, with an average improvement of 10 MUSHRA points over the LC3 coded speech.
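The masking step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the network architecture, the training target, or the range of the mask, so everything below is an assumption made for illustration: `apply_mask` and `oracle_mask` are hypothetical names, the clipping range `[0, 4]` is arbitrary, and an oracle mask computed from the clean signal stands in for the mask that the paper estimates with a convolutional encoder-decoder network.

```python
import numpy as np


def apply_mask(quantized_mdct, mask):
    """Scale each quantized MDCT coefficient by a real-valued mask entry.

    Both inputs are (frames, bins) arrays. The product is element-wise,
    so the post-filter needs no look-ahead beyond the codec's own frames.
    """
    quantized_mdct = np.asarray(quantized_mdct, dtype=float)
    mask = np.asarray(mask, dtype=float)
    if quantized_mdct.shape != mask.shape:
        raise ValueError("mask and coefficients must have the same shape")
    return quantized_mdct * mask


def oracle_mask(clean_mdct, quantized_mdct, lo=0.0, hi=4.0):
    """Hypothetical stand-in for the DNN: the ratio clean / quantized.

    Assumption: this is NOT the paper's estimator. It only shows what a
    real-valued mask looks like. Bins quantized to zero get a mask of 1,
    because no multiplicative mask can restore them.
    """
    clean = np.asarray(clean_mdct, dtype=float)
    quant = np.asarray(quantized_mdct, dtype=float)
    mask = np.ones_like(quant)
    nonzero = quant != 0.0
    mask[nonzero] = clean[nonzero] / quant[nonzero]
    return np.clip(mask, lo, hi)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = rng.normal(size=(4, 16))          # toy "MDCT" frames, not real audio
    step = 0.5                                # coarse uniform quantizer (assumed)
    quantized = np.round(clean / step) * step
    enhanced = apply_mask(quantized, oracle_mask(clean, quantized))
    print("coded error:   ", np.abs(quantized - clean).sum())
    print("enhanced error:", np.abs(enhanced - clean).sum())
```

Because the mask is multiplicative, bins that the codec quantized to zero remain zero after masking; the sketch only rescales the coefficients that survived quantization.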
Author(s)
Gupta, Kishan
Fraunhofer-Institut für Integrierte Schaltungen IIS  
Korse, Srikanth  
Fraunhofer-Institut für Integrierte Schaltungen IIS  
Edler, Bernd  
Fraunhofer-Institut für Integrierte Schaltungen IIS  
Fuchs, Guillaume  
Fraunhofer-Institut für Integrierte Schaltungen IIS  
Mainwork
IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022. Proceedings  
Conference
International Conference on Acoustics, Speech, and Signal Processing 2022  
Open Access
DOI
10.1109/ICASSP43922.2022.9747410
Language
English
Fraunhofer-Institut für Integrierte Schaltungen IIS  
Keyword(s)
  • Complex-Valued Transform
  • Deep Neural Network (DNN)
  • Mask-Based Post-Filter
  • Modified Discrete Cosine Transform (MDCT)
  • Real-Valued Transform
  • Speech Coding
