2022
Conference Paper
Title

NESC: Robust Neural End-2-End Speech Coding with GANs

Abstract
Neural networks have proven to be a formidable tool for speech coding at very low bit rates. However, designing a neural coder that operates robustly under real-world conditions remains a major challenge. We therefore present the Neural End-2-End Speech Codec (NESC), a robust, scalable end-to-end neural speech codec for high-quality wideband speech coding at 3 kbps. The encoder uses a new architecture configuration that relies on our proposed DualPathConvRNN (DPCRNN) layer, while the decoder architecture is based on our previous work, Streamwise-StyleMelGAN. Our subjective listening tests on clean and noisy speech show that NESC is particularly robust to unseen conditions and signal perturbations.
Author(s)
Pia, Nicola
Fraunhofer-Institut für Integrierte Schaltungen IIS  
Gupta, Kishan
Fraunhofer-Institut für Integrierte Schaltungen IIS  
Korse, Srikanth  
Fraunhofer-Institut für Integrierte Schaltungen IIS  
Multrus, Markus  
Fraunhofer-Institut für Integrierte Schaltungen IIS  
Fuchs, Guillaume  
Fraunhofer-Institut für Integrierte Schaltungen IIS  
Mainwork
Interspeech 2022  
Conference
International Speech Communication Association (INTERSPEECH Annual Conference) 2022  
DOI
10.21437/Interspeech.2022-430
Language
English
Keyword(s)
  • Generative Adversarial Network
  • neural speech coding
  • residual quantization
