January 2025
Conference Paper
Title

Robustness Evaluation of the German Extractive Question Answering Task

Abstract
To ensure reliable performance of Question Answering (QA) systems, evaluation of robustness is crucial. Common evaluation benchmarks typically include only performance metrics, such as Exact Match (EM) and the F1 score. However, these benchmarks overlook critical factors for the deployment of QA systems. This oversight can result in systems that are vulnerable to minor perturbations in the input, such as typographical errors. While several methods have been proposed to test the robustness of QA models, there has been minimal exploration of these approaches for languages other than English. This study focuses on the robustness evaluation of German-language QA models, extending methodologies previously applied primarily to English. The objective is to foster the development of robust models by defining an evaluation method specifically tailored to the German language. We assess whether perturbations used for English QA models are applicable to German and perform a comprehensive experimental evaluation with eight models. The results show that all models are vulnerable to character-level perturbations. Additionally, the comparison of monolingual and multilingual models suggests that the former are less affected by character-level and word-level perturbations.
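The abstract names two metrics (EM and F1) and one perturbation family (character-level noise such as typos). The following is a minimal Python sketch of what these could look like; it is not the authors' implementation. The function names, the simplified normalization (lowercasing and dropping ASCII punctuation only), and the adjacent-letter swap are all assumptions chosen for illustration.

```python
import random
import string
from collections import Counter


def normalize(text: str) -> list[str]:
    """Lowercase, drop ASCII punctuation, split on whitespace (simplified, assumed)."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    return text.split()


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized token sequences are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Token-level F1 between a predicted and a reference answer span."""
    pred, ref = normalize(prediction), normalize(reference)
    if not pred or not ref:
        return float(pred == ref)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def swap_adjacent_chars(text: str, rate: float, rng: random.Random) -> str:
    """Hypothetical typo perturbation: swap two adjacent inner letters.

    Applies to purely alphabetic words of length >= 4, each with
    probability `rate`; first and last letters are left in place.
    """
    words = text.split(" ")
    for i, word in enumerate(words):
        if len(word) >= 4 and word.isalpha() and rng.random() < rate:
            j = rng.randrange(1, len(word) - 2)
            words[i] = word[:j] + word[j + 1] + word[j] + word[j + 2:]
    return " ".join(words)


question = "Wann wurde die Universität Bonn gegründet?"
perturbed = swap_adjacent_chars(question, rate=1.0, rng=random.Random(0))
print(perturbed)
print(exact_match("1818.", "1818"), f1_score("im Jahr 1818", "1818"))
```

In a robustness evaluation of this kind, a model would answer both the original and the perturbed question, and the difference in EM and F1 between the two runs would indicate how sensitive it is to the noise.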
Author(s)
Satheesh, Shalaka  
Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS  
Beckh, Katharina  
Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS  
Klug, Katrin  
Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS  
Allende-Cid, Héctor
Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS  
Houben, Sebastian
Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS  
Hassan, Teena  
Hochschule Bonn-Rhein-Sieg
Mainwork
COLING 2025, the 31st International Conference on Computational Linguistics. Proceedings of the Main Conference  
Conference
International Conference on Computational Linguistics 2025  
Language
English
Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS  
Keyword(s)
  • Benchmarking

  • Character level

  • F1 scores

  • German language

  • Performance metrics

  • Computational linguistics

  • Natural Language Processing Systems

  • Language Modeling

  • Question Answering

  • Reliable performance

  • Robustness evaluation

  • System evaluation
