2025
Conference Paper
Title

Using LLMs to Identify Personal Data Processing in Source Code

Abstract
Assessing the privacy impact of software products is essential for adhering to regulatory requirements, but it is also highly challenging. The task requires expertise in both software engineering and data protection, and it is time-intensive and error-prone, particularly for large and frequently changing applications. In this study, we present a Large Language Model (LLM)-based approach to automatically classify source code for its privacy impact. Our contributions are (1) a dataset of code snippets, labeled with personal data from a W3C personal data taxonomy, (2) an extensible approach and framework to auto-classify source code using the taxonomy and different prompting strategies, and (3) a demonstration of multiple experiments that give insight into the effective use of such a framework. Our results demonstrate that LLM-based detection of personal data processing in source code is feasible with levels of accuracy that can effectively support human reviewers in assessing software at scale.
Author(s)
Kunz, Immanuel
Fraunhofer-Institut für Angewandte und Integrierte Sicherheit AISEC  
Kao, Ching-Yu Franziska
Fraunhofer-Institut für Angewandte und Integrierte Sicherheit AISEC  
Kowatsch, Daniel
Fraunhofer-Institut für Angewandte und Integrierte Sicherheit AISEC  
Hiller, Jens
Google LLC, Europe
Schütte, Julian
Google LLC, Europe
Prokhorenkov, Dmitry
Technische Universität München
Bettinger, Konstantin
Fraunhofer-Institut für Angewandte und Integrierte Sicherheit AISEC  
Mainwork
46th IEEE Symposium on Security and Privacy Workshops, SPW 2025. Proceedings  
Conference
Symposium on Security and Privacy 2025  
Deep Learning Security and Privacy Workshop 2025  
DOI
10.1109/SPW67851.2025.00018
Language
English
Fraunhofer-Institut für Angewandte und Integrierte Sicherheit AISEC  