Fraunhofer-Gesellschaft
June 30, 2025
Conference Paper
Title

Calibrating POI-based Synthetic Speech Detection

Abstract
Recent advances in deep learning have yielded increasingly sophisticated speech generation systems (Text-To-Speech or Voice Conversion algorithms) capable of producing realistic synthetic speech that is often indistinguishable from human voices. Although these technologies support a wide range of legitimate applications, they also facilitate malicious uses, including impersonation and misinformation, thereby posing significant societal threats. As a result, synthetic speech detection has emerged as an urgent research focus. Despite numerous proposed methods, a persistent generalization problem remains: detectors struggle to classify out-of-domain samples unseen during training, and therefore fail to adapt and stay consistent in real-world scenarios.
We tackle this limitation with a Person-of-Interest framework that exploits speaker-specific characteristics for Synthetic Speech Detection, thereby enhancing generalizability across diverse generators. Specifically, we introduce an ensemble approach that addresses a previously unstudied calibration problem: the system uses only recording-level statistics to self-calibrate, leveraging the abstraction capabilities of a large-scale, pre-trained audio model. Experiments demonstrate that our method achieves strong performance, high generalizability, and robustness across various datasets.
Author(s)
Le Roux, Thomas
Fraunhofer-Institut für Digitale Medientechnologie IDMT  
Cuccovillo, Luca  
Fraunhofer-Institut für Digitale Medientechnologie IDMT  
Aichroth, Patrick  
Fraunhofer-Institut für Digitale Medientechnologie IDMT  
Mainwork
MAD 2025, 4th ACM International Workshop on Multimedia AI against Disinformation. Proceedings  
Conference
International Workshop on Multimedia AI against Disinformation 2025  
Open Access
Rights
CC BY 4.0: Creative Commons Attribution
DOI
10.1145/3733567.3735565
10.24406/publica-5327
Language
English
Fraunhofer-Institut für Digitale Medientechnologie IDMT  
Keyword(s)
  • Media Forensics

  • Trustworthy AI
