2022
Conference Paper
Title

Dynamic-automatic pipelines for finding topic-specific information clusters using NLP methods in connection with a model-driven approach

Abstract
Finding and extracting topic-specific information from free-text sources is an important task for classifying and distinguishing the content of information systems. Such compression of information, in which non-relevant parts of the text can be ignored, also benefits the further machine processing and evaluation of topic-specific documents. State-of-the-art approaches normally use well-trained modern Natural Language Processing (NLP) methods to solve such tasks. However, use cases can arise in which no suitable training data sets are available to adequately prepare or fine-tune the NLP methods used. In this paper, we detail a model-driven approach that applies an XML data model to an application-specific scenario and combines different NLP methods into a dynamic, automated NLP pipeline. The goal of this pipeline is the automatic extraction of specific information (related to certain domains or topics) from text documents, allowing structured further processing of this information. Specifically, we consider a scenario in which such information has to be aligned to a given information model that defines, for example, the terms relevant for further processing. The solution approaches described here address a scenario in which information clusters on a specific topic can be obtained from a given data set, even without domain-specific model training. The basis is a dynamic (i.e., using different NLP methods and models) and fully automatic (i.e., handling different topics at the same time) pipeline architecture combined with an XML data model. The presented approach details and extends our earlier work and provides new qualitative and first quantitative results.
Author(s)
Dorrn, Tobias
Fraunhofer-Institut für Optronik, Systemtechnik und Bildauswertung IOSB  
Kuwertz, Achim  
Fraunhofer-Institut für Optronik, Systemtechnik und Bildauswertung IOSB  
Mainwork
Artificial Intelligence and Machine Learning in Defense Applications IV  
Conference
Conference "Artificial Intelligence and Machine Learning in Defense Applications" 2022  
DOI
10.1117/12.2648385
Language
English
Fraunhofer-Institut für Optronik, Systemtechnik und Bildauswertung IOSB  
Keyword(s)
  • NLP

  • information extraction

  • model-driven approach

  • XML

  • domain-specific application
