2023
Conference Paper
Title
Semantic Extraction of Key Figures and their Properties from Tax Legal Texts Using Neural Models
Abstract
Applying information extraction to legislative texts is a challenging task that requires a specification to distinguish the relevant parts of the text from the less relevant ones. Moreover, appropriate language- and domain-specific data for information extraction is still lacking. This work investigates the extraction and modeling of key figures from legal texts. We introduce a universally applicable annotation scheme together with a semantic model for key figures and their logically connected properties in legal texts. In addition, we release KeyFiTax, a dataset of key figures based on paragraphs of German tax acts that were manually annotated by tax experts, together with a knowledge graph populated from these paragraphs according to our semantic model. Using our dataset, we evaluate and compare state-of-the-art entity extraction models with respect to long entity spans and low-resource data. Furthermore, we present a transformer-based approach for relation extraction that uses entity markers to obtain a logical formulation of the key figures. Finally, we introduce task triggers for training a combined, resource-efficient entity and relation extraction model. We make our dataset, the semantic model, and the knowledge graph publicly available, along with the implementation of the entity and relation extraction approaches investigated in this work.
Author(s)
Open Access
Rights
CC BY 4.0: Creative Commons Attribution
Language
English
Keyword(s)