2026
Book Article
Title
Synthetic Audio Detection
Abstract
The rapid advancement of neural text-to-speech and voice conversion technologies has led to synthetic speech of unprecedented quality, raising urgent concerns over its misuse in disinformation, fraud, and impersonation. Despite growing awareness and notable contributions from the research community, synthetic speech detection remains a fundamentally open problem. This chapter adopts a holistic perspective to analyse existing detection strategies and propose future directions. We begin with a detailed analysis of data-driven state-of-the-art methods, providing a historical perspective and highlighting their reliance on vocoding artefacts, traces that are currently effective but are likely to disappear as neural codecs replace traditional audio compression. We then explore content-aware and voice-based approaches, which aim to identify inconsistencies intrinsic to the speech signal rather than to the vocoding process. To address generalisation and long-term robustness, we investigate viable generalisation paradigms for detecting anomalies introduced by the synthesis algorithms without needing access to representative synthetic examples. Finally, we discuss broader non-technical challenges, including data availability, financial sustainability, and the impact of regulatory frameworks, and then outline a sustainable path forward for future research on synthetic speech detection.
Author(s)
Keyword(s)