Semantic Data Integration for Knowledge Graph Construction at Query Time
The evolution of the Web of documents into a Web of services and data has increased the availability of data from almost any domain. For example, general-domain knowledge bases such as DBpedia and Wikidata, and domain-specific Web sources such as the Oxford Art archive, provide access to knowledge about a wide variety of entities, including people, organizations, and paintings. However, these sources publish data in different ways and may offer different search capabilities, e.g., SPARQL endpoints or REST services, so data integration techniques are required to provide a unified view of the published data. We devise a semantic data integration approach named FuhSen that exploits the keyword and structured search capabilities of Web data sources and generates knowledge graphs on demand by merging data collected from the available sources. The resulting knowledge graphs model the semantics of the merged data in terms of the entities that satisfy a keyword query and the relationships among those entities. FuhSen relies on RDF to semantically describe the collected entities and on semantic similarity measures to decide which related entities should be merged. We empirically evaluate FuhSen's data integration techniques on data from the DBpedia knowledge base. The experimental results suggest that these techniques accurately merge semantically similar entities into knowledge graphs.
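To make the merging step concrete, the following is a minimal sketch of similarity-based entity merging. It is not FuhSen's implementation: the abstract does not specify the similarity measure, so Jaccard overlap of property-value pairs stands in for it. The function names, the source identifiers (`sourceA:`, `sourceB:`), the sample facts, and the threshold of 0.4 are all illustrative assumptions.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def merge_entities(entities: dict, threshold: float = 0.5) -> list:
    """Group entities whose pairwise similarity meets the threshold,
    then merge the facts of each group (assumed stand-in for the
    semantic similarity measures mentioned in the abstract)."""
    parent = {eid: eid for eid in entities}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Link every pair of entities that is similar enough.
    for a, b in combinations(entities, 2):
        if jaccard(entities[a], entities[b]) >= threshold:
            parent[find(a)] = find(b)

    # Collect the members of each connected group.
    groups = {}
    for eid in entities:
        groups.setdefault(find(eid), []).append(eid)

    # One merged node per group: its source identifiers and the union of facts.
    return [
        {"sources": sorted(ids),
         "facts": set().union(*(entities[i] for i in ids))}
        for ids in groups.values()
    ]


# Hypothetical descriptions of entities returned by two sources for one keyword query.
collected = {
    "sourceA:van_gogh": {("name", "Vincent van Gogh"), ("birthYear", "1853"),
                         ("occupation", "painter")},
    "sourceB:van_gogh": {("name", "Vincent van Gogh"), ("birthYear", "1853"),
                         ("birthPlace", "Zundert")},
    "sourceA:monet": {("name", "Claude Monet"), ("birthYear", "1840"),
                      ("occupation", "painter")},
}

graph = merge_entities(collected, threshold=0.4)
```

In this sketch the two van Gogh descriptions share two of four distinct facts and are merged into one node, while the Monet description stays separate. A real system would presumably compare RDF descriptions with measures that also account for the meaning of properties and values, not only exact matches.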