2025
Conference Paper
Title

Automated Information Extraction from Unstructured Documents Using Large Language Models and different Questioning Strategies

Abstract
In the era of big data, organizations face significant challenges in extracting valuable information from unstructured documents. This paper explores the application of locally hosted large language models (LLMs) to automate information extraction processes, focusing on their effectiveness and reliability. We investigate various querying strategies, including simple, repetitive, and two-stage approaches, to assess their impact on extraction accuracy. A demonstrator was developed to test these strategies using documents in cooperation with the Volkswagen Group, with results evaluated through the BERTScore metric. Our findings reveal that while certain strategies yield promising results, their performance inconsistency suggests they should be applied with caution in accuracy-critical contexts. We discuss the implications of our results for knowledge management and outline avenues for future research, including the exploration of more advanced models and tailored querying techniques to enhance information retrieval efficacy.
Author(s)
Brünnhäußer, Jörg
Fraunhofer-Institut für Produktionsanlagen und Konstruktionstechnik IPK  
Zhang, Xiaoxu
Fraunhofer-Institut für Produktionsanlagen und Konstruktionstechnik IPK  
Konietzko, Erik Paul  
Fraunhofer-Institut für Produktionsanlagen und Konstruktionstechnik IPK  
Kehl, Stefan
Volkswagen AG
Mainwork
International Conference on Electrical, Computer, Communications and Mechatronics Engineering, ICECCME 2025  
Conference
International Conference on Electrical, Computer, Communications and Mechatronics Engineering 2025  
DOI
10.1109/ICECCME64568.2025.11277979
Language
English
Fraunhofer-Institut für Produktionsanlagen und Konstruktionstechnik IPK  
Keyword(s)
  • Information extraction
  • Large Language Models
  • Prompt engineering
  • unstructured data
