Fraunhofer-Gesellschaft
2023
Paper (Preprint, Research Paper, Review Paper, White Paper, etc.)
Title

Fine-tuning and aligning question answering models for complex information extraction tasks

Title Supplement
Published on arXiv
Abstract
The emergence of Large Language Models (LLMs) has boosted performance and possibilities in various NLP tasks. While generative AI models like ChatGPT open up new opportunities for several business use cases, their current tendency to hallucinate content strongly limits their applicability to document analysis tasks such as information retrieval from documents. In contrast, extractive language models such as question answering (QA) or passage retrieval models guarantee that query results are found within the boundaries of the corresponding context document, which makes them candidates for more reliable information extraction in companies' production environments. In this work, we propose an approach that integrates extractive QA models into a document analysis solution for improved feature extraction from German business documents such as insurance reports or medical leaflets. We further show that fine-tuning existing German QA models boosts performance on tailored extraction tasks for complex linguistic features like damage cause explanations or descriptions of medication appearance, even when only a small set of annotated data is used. Finally, we discuss the relevance of scoring metrics for evaluating information extraction tasks and derive a combined metric from Levenshtein distance, F1 score, Exact Match, and ROUGE-L to mimic the assessment criteria of human experts.
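To make the four metrics named in the abstract concrete, the following is a minimal sketch of how each could be computed for a predicted answer span against a reference span, and how they might be blended into one score. The abstract does not state how the paper combines them, so the equal weights, the whitespace tokenization, and all function names here are illustrative assumptions, not the authors' method.

```python
from collections import Counter


def tokens(text: str) -> list[str]:
    """Lowercase and split on whitespace (an assumed, simplistic tokenizer)."""
    return text.lower().split()


def exact_match(pred: str, ref: str) -> float:
    """1.0 if the token sequences are identical, else 0.0."""
    return float(tokens(pred) == tokens(ref))


def token_f1(pred: str, ref: str) -> float:
    """Bag-of-tokens F1, as commonly used for extractive QA evaluation."""
    p, r = tokens(pred), tokens(ref)
    overlap = sum((Counter(p) & Counter(r)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(p), overlap / len(r)
    return 2 * precision * recall / (precision + recall)


def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance via row-by-row dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def levenshtein_similarity(pred: str, ref: str) -> float:
    """Edit distance rescaled to [0, 1], where 1.0 means identical strings."""
    longest = max(len(pred), len(ref))
    return 1.0 if longest == 0 else 1.0 - levenshtein(pred, ref) / longest


def lcs_length(a: list[str], b: list[str]) -> int:
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def rouge_l(pred: str, ref: str) -> float:
    """ROUGE-L F-measure based on the token-level LCS."""
    p, r = tokens(pred), tokens(ref)
    lcs = lcs_length(p, r)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(p), lcs / len(r)
    return 2 * precision * recall / (precision + recall)


def combined_score(pred: str, ref: str, weights=(0.25, 0.25, 0.25, 0.25)) -> float:
    """Weighted mean of the four metrics; equal weights are an assumption."""
    scores = (
        levenshtein_similarity(pred, ref),
        token_f1(pred, ref),
        exact_match(pred, ref),
        rouge_l(pred, ref),
    )
    return sum(w * s for w, s in zip(weights, scores))
```

Calling `combined_score(predicted_span, reference_span)` returns a value in [0, 1]. Exact Match alone penalizes a near-miss span as heavily as a wrong one, while the overlap-based and edit-distance-based terms give partial credit, which is one plausible reason to blend them when approximating a human expert's judgment.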
Author(s)
Engelbach, Matthias
Fraunhofer-Institut für Arbeitswirtschaft und Organisation IAO  
Klau, Dennis
Univ. Stuttgart, Institut für Arbeitswissenschaft und Technologiemanagement -IAT-  
Scheerer, Felix
Univ. Stuttgart, Institut für Arbeitswissenschaft und Technologiemanagement -IAT-  
Drawehn, Jens  
Fraunhofer-Institut für Arbeitswirtschaft und Organisation IAO  
Kintz, Maximilien  
Fraunhofer-Institut für Arbeitswirtschaft und Organisation IAO  
Conference
International Conference on Knowledge Discovery and Information Retrieval 2023  
International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management 2023  
Open Access
Rights
CC BY-NC-ND 4.0: Creative Commons Attribution-NonCommercial-NoDerivatives
DOI
10.48550/arXiv.2309.14805
10.24406/publica-2445
Language
English
Fraunhofer-Institut für Arbeitswirtschaft und Organisation IAO  
Keyword(s)
  • Question-answering
  • Language models
  • Information extraction
