2022
Conference Paper
Title

A Textual Information Extraction Application based on XML Data Models and a Multidimensional Natural Language Processing Pipeline Approach

Abstract
Modern data and information systems usually hold considerable amounts of data and documents and thus provide a large amount of information. Automatic extraction of domain-specific information is therefore all the more important for working effectively with such systems. When information is available only as free text, machine processing can prove to be a difficult technical hurdle. State-of-the-art approaches use modern Natural Language Processing (NLP) methods to solve such tasks. In this paper, we introduce a data-driven approach that applies an XML data model to an application-specific scenario and combines different NLP methods into a multidimensional pipeline. It is important to understand how certain NLP methods can be used and what their limitations are. Individual modern NLP methods are often not sufficient or resilient enough to solve complex information extraction tasks. It therefore has to be examined how such problems can be alleviated or circumvented by combining different NLP methods. In contrast to categorical grammar models, all cases considered here are assumed to be available as free text. The approach presented in this paper is still a work in progress, but first evaluation results are given.
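To make the general idea concrete, the following is a minimal sketch of how several independent extraction methods might be run over the same free text and their results written into an XML structure. It is not the authors' implementation: the extractor functions, the gazetteer, the element names (`report`, `date`, `location`), and the sample sentence are all hypothetical, and simple regular-expression and lookup rules stand in for the NLP methods the paper combines.

```python
import re
import xml.etree.ElementTree as ET


def extract_dates(text):
    # Rule-based stand-in for an NLP method: ISO-style dates such as 2022-08-08.
    return [("date", m) for m in re.findall(r"\b\d{4}-\d{2}-\d{2}\b", text)]


def extract_locations(text, gazetteer=("Karlsruhe", "Berlin")):
    # Lookup-based stand-in: match against a small, made-up gazetteer.
    return [
        ("location", name)
        for name in gazetteer
        if re.search(rf"\b{re.escape(name)}\b", text)
    ]


def run_pipeline(text, extractors):
    # Run every extractor on the same text and pool the candidates,
    # dropping exact duplicates while keeping first-seen order.
    seen, results = set(), []
    for extractor in extractors:
        for item in extractor(text):
            if item not in seen:
                seen.add(item)
                results.append(item)
    return results


def to_xml(results, root_tag="report"):
    # Fill a flat, hypothetical XML data model:
    # one child element per extracted value.
    root = ET.Element(root_tag)
    for field, value in results:
        ET.SubElement(root, field).text = value
    return root


text = "The team met in Karlsruhe on 2022-08-08 and again on 2022-08-10."
root = to_xml(run_pipeline(text, [extract_dates, extract_locations]))
xml_string = ET.tostring(root, encoding="unicode")
print(xml_string)
```

Keeping each method behind the same `(field, value)` interface means a weak extractor can be swapped out or supplemented by another one without touching the XML-filling step, which is one plausible reading of why a combination of methods can be more resilient than any single one.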
Author(s)
Dorrn, Tobias
Fraunhofer-Institut für Optronik, Systemtechnik und Bildauswertung IOSB  
Dambier, Natalie  
Fraunhofer-Institut für Optronik, Systemtechnik und Bildauswertung IOSB  
Müller, Almuth
Fraunhofer-Institut für Optronik, Systemtechnik und Bildauswertung IOSB  
Kuwertz, Achim  
Fraunhofer-Institut für Optronik, Systemtechnik und Bildauswertung IOSB  
Mainwork
16th International Conference on INnovations in Intelligent SysTems and Applications, INISTA 2022  
Conference
International Conference on INnovations in Intelligent SysTems and Applications 2022  
Open Access
DOI
10.1109/inista55318.2022.9894171
10.24406/h-427097
File(s)
A Textual Information Extraction Application based on XML Data Models and a Multidimensional Natural Language Processing Pipeline Approach.pdf (430.27 KB)
Rights
Under Copyright
Language
English
Fraunhofer-Institut für Optronik, Systemtechnik und Bildauswertung IOSB  
Keyword(s)
  • NLP
  • XML
  • pipeline
  • application
  • information extraction
  • Technological innovation
  • Protocols
  • Information retrieval
  • Natural language processing
  • Data models
  • Data mining
