2024
Conference Paper
Title

Audio Transformer for Synthetic Speech Detection via Formant Magnitude and Phase Analysis

Abstract
This paper introduces a novel multi-task transformer for synthetic speech detection. The network encodes the magnitude and phase of the input speech with a feature bottleneck, which is used for three tasks: autoencoding the input magnitude, predicting the trajectory of the fundamental frequency (f0), and deciding whether the input speech is synthetic or natural. The approach achieves state-of-the-art performance on the ASVspoof 2019 LA dataset, with an AUC score of 0.910, while retaining interpretability.
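For readers unfamiliar with the metric, the AUC quoted in the abstract is the area under the ROC curve: the probability that a randomly chosen synthetic sample receives a higher detector score than a randomly chosen natural one. The following is a minimal sketch of that computation, not the authors' evaluation code. The function name `roc_auc`, the label convention (1 = synthetic, 0 = natural), and the toy scores are all assumptions made for illustration.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney U) identity.

    labels: 1 for synthetic, 0 for natural (assumed convention).
    scores: detector outputs, higher meaning "more likely synthetic".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")

    # Count (synthetic, natural) pairs ranked correctly; ties count half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.4, 0.2, 0.1]
print(round(roc_auc(labels, scores), 3))  # 8 of 9 pairs ranked correctly -> 0.889
```

The pairwise form is quadratic in the number of samples, which is fine for a small illustration; a sort-based rank computation would be the usual choice for a dataset the size of ASVspoof 2019 LA.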
Author(s)
Cuccovillo, Luca  
Fraunhofer-Institut für Digitale Medientechnologie IDMT  
Gerhardt, Milica  
Fraunhofer-Institut für Digitale Medientechnologie IDMT  
Aichroth, Patrick  
Fraunhofer-Institut für Digitale Medientechnologie IDMT  
Mainwork
IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2024. Proceedings  
Conference
International Conference on Acoustics, Speech, and Signal Processing 2024  
Open Access
DOI
10.1109/ICASSP48485.2024.10445932
Language
English
Fraunhofer-Institut für Digitale Medientechnologie IDMT  
Keyword(s)
  • Media Forensics
  • Signal processing
  • Transformers
  • Multitasking
  • Acoustics
  • Trajectory
  • Speech synthesis
  • synthetic speech detection
  • audio deepfakes
  • audio transformer
  • voice formants
