2026
Conference Paper
Title

Text Vs. Speech? Detecting Audio Deepfakes on Instagram

Abstract
With the growing use of AI, deepfakes are becoming an increasingly prevalent threat. At the same time, the performance of most detectors drops significantly on unseen data, while generation models keep improving and produce fewer artefacts. We examined deepfakes published on Instagram, using the SocialDF dataset. In addition to analysing the deepfakes in the frequency domain using audio deepfake detectors, we transcribed the speech and analysed the text (e.g. emotion and topics) and the audio content (e.g. emotion and music genre). We found that audio deepfake detectors struggle to identify real-world deepfakes on Instagram. Furthermore, current audio deepfake detection relies on audio artefacts only; content is not used for detection. We suggest using both the speech recording and the content. This approach improves results on real-world data and provides an explanation for the classification. Using content information, we outperformed frequency-based detection with an F1-score of 74.3%.
Author(s)
Schäfer, Karla
Fraunhofer-Institut für Sichere Informationstechnologie SIT  
Mainwork
Advances in Information Retrieval. 48th European Conference on Information Retrieval, ECIR 2026. Proceedings. Part II  
Conference
European Conference on Information Retrieval 2026  
DOI
10.1007/978-3-032-21300-6_32
Language
English