
SDM-RDFizer: An RML Interpreter for the Efficient Creation of RDF Knowledge Graphs

Iglesias, E.; Jozashoori, S.; Chaves-Fraga, D.; Collarana, D.; Vidal, M.-E.


d'Aquin, M. ; Association for Computing Machinery -ACM-; Association for Computing Machinery -ACM-, Special Interest Group on Information Retrieval -SIGIR-; Association for Computing Machinery -ACM-, Special Interest Group on Hypertext, Hypermedia, and Web:
29th ACM International Conference on Information & Knowledge Management, CIKM 2020. Proceedings : October 19-23, 2020, Virtual Event, Ireland
New York: ACM, 2020
ISBN: 978-1-4503-6859-9
International Conference on Information & Knowledge Management (CIKM) <29, 2020, Online>
Fraunhofer IAIS

In recent years, the amount of data has increased exponentially, and knowledge graphs have gained attention as data structures for integrating data and knowledge harvested from myriad data sources. However, these data sources are usually characterized by data complexity issues such as large volume, high duplicate rates, and heterogeneity, so data management tools are required that can address the negative impact of these issues on the knowledge graph creation process. In this paper, we propose SDM-RDFizer, an interpreter of the RDF Mapping Language (RML) that transforms raw data in various formats into an RDF knowledge graph. SDM-RDFizer implements novel algorithms to execute the logical operators between mappings in RML, which allows it to scale up to complex scenarios where data is not only broad but also has a high duplication rate. We empirically evaluate the performance of SDM-RDFizer on diverse testbeds with varying configurations of data volume, duplicates, and heterogeneity. The observed results indicate that SDM-RDFizer is two orders of magnitude faster than the state of the art, which suggests that SDM-RDFizer is an interoperable and scalable solution for knowledge graph creation. SDM-RDFizer is publicly available as a resource through a GitHub repository and a DOI.
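
To make the task in the abstract concrete, the following minimal Python sketch shows the general idea of turning tabular rows into RDF triples while suppressing duplicates. It is an illustration only and is not the SDM-RDFizer algorithm or its API: the function `rows_to_triples`, the subject template, the `example.org` IRIs, and the sample data are all hypothetical, and a simple in-memory set stands in for whatever duplicate-handling structures the actual tool uses.

```python
import csv
import io


def rows_to_triples(rows, subject_template, predicate_map):
    """Map dict-like rows to N-Triples-style tuples, skipping duplicates.

    Hypothetical helper for illustration; not part of SDM-RDFizer.
    subject_template: IRI pattern with {column} placeholders.
    predicate_map: column name -> predicate IRI.
    """
    seen = set()      # triples already emitted (duplicate check)
    triples = []      # output, in first-seen order
    for row in rows:
        subject = "<" + subject_template.format(**row) + ">"
        for column, predicate in predicate_map.items():
            value = row.get(column)
            if value is None or value == "":
                continue  # no triple for missing values
            # Escape backslashes and quotes so the literal stays well-formed.
            literal = value.replace("\\", "\\\\").replace('"', '\\"')
            triple = (subject, "<" + predicate + ">", '"' + literal + '"')
            if triple not in seen:
                seen.add(triple)
                triples.append(triple)
    return triples


# Assumed sample input: the first and third data rows are identical.
SAMPLE_CSV = "id,name\n1,Alice\n2,Bob\n1,Alice\n"

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
triples = rows_to_triples(
    rows,
    "http://example.org/person/{id}",
    {"name": "http://xmlns.com/foaf/0.1/name"},
)

for s, p, o in triples:
    print(s, p, o, ".")
```

With the sample input, the repeated row yields no additional triple, so the output contains one triple per distinct person. Keeping every emitted triple in memory is the simplest possible choice and would not by itself scale to the data volumes the abstract describes.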