Robust Audio Deepfake Detection: Exploring Front-/Back-End Combinations and Data Augmentation Strategies for the ASVspoof5 Challenge

Schäfer, Karla; Neu, Matthias; Choi, Jeong-Eun

doi:10.21437/ASVspoof.2024-9

2024

Conference Paper

Abstract

The robustness and generalizability of audio deepfake detectors are becoming more important due to technical advances in generation methods and the widespread use of audio deepfakes. The ASVspoof5 challenge addresses this by providing a new dataset. This paper presents Fraunhofer SIT's anti-spoofing detectors submitted to the ASVspoof5 challenge. AASIST(-L), RawGAT-ST and data augmentation were used in the closed condition. In the open condition, we evaluated different SSL-based front-ends using diverse training data. The results indicate that extensive data augmentation improves the results when using a non-SSL-based front-end, whereas combining it with an SSL-based front-end led to a decline in performance. Using a large SSL front-end improved the result. Our best detector in the closed condition attained a min DCF of 0.589, and the best in the open condition (using SSL) a min DCF of 0.174, on the ASVspoof5 evaluation set.

Author(s)

Schäfer, Karla

Fraunhofer-Institut für Sichere Informationstechnologie SIT

Neu, Matthias

Choi, Jeong-Eun

Fraunhofer-Institut für Sichere Informationstechnologie SIT

Mainwork

The Automatic Speaker Verification Spoofing Countermeasures Workshop 2024

Conference

The Automatic Speaker Verification Spoofing Countermeasures Workshop 2024
