Fraunhofer-Gesellschaft
2026
Journal Article
Title

Process mining between the lines: Extracting object-centric event logs from textual data

Abstract
Organizations generate vast amounts of unstructured textual data, a valuable source of information that frequently remains underutilized for process mining. Yet textual descriptions often record exceptions and manual activities that are absent from structured data and therefore enable a better understanding of deviations from expected business process behavior. Importantly, unstructured sources typically retain the object-centric characteristics of real-world processes, information that is flattened or lost in case-centric event logs. However, existing approaches primarily target structured data sources or produce case-centric event logs. To address this gap, we present an automated approach to derive object-centric event logs directly from unstructured textual descriptions. The approach comprises two subcomponents: a collector that identifies events and objects (including their attributes and relationships), and a refiner that consolidates and cleans the extracted information. We instantiate each subcomponent in a heuristic and a generative implementation and create four pairwise combinations of collector and refiner instances to assess the effectiveness of heuristic natural language processing and generative artificial intelligence techniques. We compare these variants quantitatively and qualitatively in a controlled, artificial setting based on synthesized texts and demonstrate their practical utility on two naturally occurring corpora (fire status updates and a legal judgment). Our results show that the configurations with a generative collector achieve the highest extraction quality. In particular, the fully generative variant produces coherent and standardized event and object labels. Overall, this study fills a notable research gap by enabling the incorporation of textual information into process mining applications.
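The collector and refiner split described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Event` class, the `collect` and `refine` functions, the regular expressions, and the object types (`order`, `invoice`, `item`) are all hypothetical and chosen only to show how a heuristic collector might extract events with multiple related objects and how a refiner might standardize labels and remove duplicates.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field


@dataclass
class Event:
    """One event that may reference several objects (object-centric)."""
    activity: str
    objects: list[str] = field(default_factory=list)


# Hypothetical object types; a real collector would not hard-code these.
OBJECT_PATTERN = re.compile(r"\b(order|invoice|item)\s+(\d+)", flags=re.I)
# Naive activity heuristic: the first word ending in "ed".
VERB_PATTERN = re.compile(r"\b(\w+ed)\b")


def collect(text: str) -> list[Event]:
    """Toy heuristic collector: at most one event per sentence."""
    events = []
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        objects = [f"{kind.lower()}:{ident}"
                   for kind, ident in OBJECT_PATTERN.findall(sentence)]
        verb = VERB_PATTERN.search(sentence)
        if verb and objects:
            events.append(Event(verb.group(1), objects))
    return events


def refine(events: list[Event]) -> list[Event]:
    """Toy refiner: lowercase activity labels and drop exact duplicates."""
    seen = set()
    refined = []
    for event in events:
        label = event.activity.strip().lower()
        key = (label, tuple(sorted(set(event.objects))))
        if key not in seen:
            seen.add(key)
            refined.append(Event(label, list(key[1])))
    return refined
```

Calling `refine(collect(text))` on a short description yields a list of events in which a single event can point to several objects at once, which is the property that a case-centric log would flatten. A generative variant would replace the regular expressions with a language model call, which is outside the scope of this sketch.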
Author(s)
Buss, Alina
TUM School of Management, Munich
Kecht, Christoph
Fraunhofer-Institut für Angewandte Informationstechnik FIT  
Kratsch, Wolfgang
Fraunhofer-Institut für Angewandte Informationstechnik FIT  
Röglinger, Maximilian  
Fraunhofer-Institut für Angewandte Informationstechnik FIT  
Sadeghianasl, Sareh
Queensland University of Technology
Wynn, Moe Thandar
Queensland University of Technology
Journal
Information systems  
Open Access
Rights
CC BY 4.0: Creative Commons Attribution
DOI
10.1016/j.is.2026.102713
10.24406/publica-8235
Language
English
Fraunhofer-Institut für Angewandte Informationstechnik FIT  
Keyword(s)
  • Generative artificial intelligence
  • Large language models
  • Natural language processing
  • Object-centric event logs
  • Process mining
