2025
Conference Paper
Title

GerMedIQ: A Resource for Simulated and Synthesized Anamnesis Interview Responses in German

Abstract
Due to strict privacy regulations, text corpora in non-English clinical contexts are scarce. Consequently, synthetic data generation using Large Language Models (LLMs) emerges as a promising strategy to address this data gap. To evaluate the ability of LLMs in generating synthetic data, we applied them to our novel German Medical Interview Questions Corpus (GerMedIQ), which consists of 4,524 unique, simulated question-response pairs in German. We augmented our corpus by prompting 18 different LLMs to generate responses to the same questions. Structural and semantic evaluations of the generated responses revealed that large-sized language models produced responses comparable to those provided by humans. Additionally, an LLM-as-a-judge study, combined with a human baseline experiment assessing response acceptability, demonstrated that human raters preferred the responses generated by Mistral (124B) over those produced by humans. Nonetheless, our findings indicate that using LLMs for data augmentation in non-English clinical contexts requires caution.
Author(s)
Hofenbitzer, Justin
Technische Universität München
Schöning, Sebastian  
Fraunhofer-Institut für Produktionstechnik und Automatisierung IPA  
Belle, Sebastian
Universität Heidelberg
Lammert, Jacqueline
Technische Universität München
Modersohn, Luise
Technische Universität München
Boeker, Martin
Technische Universität München
Frassinelli, Diego
Ludwig-Maximilians-Universität München
Mainwork
63rd Annual Meeting of the Association for Computational Linguistics, ACL 2025. Proceedings. Vol.4: Student Research Workshop  
Conference
Association for Computational Linguistics (ACL Annual Meeting) 2025  
Student Research Workshop 2025  
Open Access
DOI
10.18653/v1/2025.acl-srw.84
Language
English