2025
Conference Paper
Title

GAN-Based Speech Enhancement for Low SNR Using Latent Feature Conditioning

Abstract
Enhancing speech quality under adverse SNR conditions remains a significant challenge for discriminative deep neural network (DNN)-based approaches. In this work, we propose DisCoGAN, a time-frequency-domain generative adversarial network (GAN) conditioned on the latent features of a discriminative model pre-trained for speech enhancement in low SNR scenarios. The proposed method outperforms state-of-the-art discriminative methods and also surpasses end-to-end (E2E) trained GAN models. We also investigate several configurations for conditioning the proposed GAN on the discriminative model and assess their effect on speech quality.
Author(s)
Shetu, Shrishti Saha
Friedrich-Alexander-Universität Erlangen-Nürnberg
Habets, Emanuël Anco Peter
Friedrich-Alexander-Universität Erlangen-Nürnberg
Brendel, Andreas
Fraunhofer-Institut für Integrierte Schaltungen IIS  
Mainwork
IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2025. Proceedings  
Conference
International Conference on Acoustics, Speech and Signal Processing 2025  
DOI
10.1109/ICASSP49660.2025.10890549
Language
English
Fraunhofer-Institut für Integrierte Schaltungen IIS  
Keyword(s)
  • FiLM
  • GAN
  • latent feature conditioning
  • low SNR
  • speech enhancement
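
Illustration (not part of the record): the keywords name FiLM and latent feature conditioning, but this record does not say how DisCoGAN applies them. The sketch below shows only the generic idea of feature-wise linear modulation, in which a conditioning vector is mapped to a per-channel scale and shift that modulate a feature map. All names (`linear`, `film`, the weight matrices, the toy values) are hypothetical and chosen for illustration; they are not taken from the paper.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def linear(weights: Matrix, bias: Vector, z: Vector) -> Vector:
    """Affine map: one output per row of `weights`."""
    return [sum(w * zi for w, zi in zip(row, z)) + b
            for row, b in zip(weights, bias)]


def film(features: Matrix, z: Vector,
         w_gamma: Matrix, b_gamma: Vector,
         w_beta: Matrix, b_beta: Vector) -> Matrix:
    """Generic FiLM: scale and shift each channel of `features`.

    features: channels x frames feature map (assumed layout).
    z: conditioning vector, e.g. latent features from another model.
    """
    gamma = linear(w_gamma, b_gamma, z)  # per-channel scale
    beta = linear(w_beta, b_beta, z)     # per-channel shift
    return [[g * x + b for x in channel]
            for channel, g, b in zip(features, gamma, beta)]


# Toy example: 2 channels, 2 frames, 2-dimensional conditioning vector.
features = [[1.0, 2.0], [3.0, 4.0]]
z = [1.0, 0.5]
w_gamma = [[1.0, 0.0], [0.0, 2.0]]  # gamma = [1.0, 1.0]
b_gamma = [0.0, 0.0]
w_beta = [[0.0, 0.0], [1.0, 0.0]]   # beta = [0.5, 1.0]
b_beta = [0.5, 0.0]

modulated = film(features, z, w_gamma, b_gamma, w_beta, b_beta)
print(modulated)
```

In a trained network the scale and shift projections would be learned layers, and the conditioning vector would come from the pre-trained discriminative model mentioned in the abstract; where and how often the modulation is applied is a design choice this sketch does not address.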
