Semantic Based Approach for Entity Matching on Noisy Semistructured Data
Entity matching on noisy semi-structured data, such as RDF graphs, is an active field of research in which many approaches have been proposed for interlinking individuals across Knowledge Graph datasets. These approaches include schema learning techniques, string matching on labels using SPARQL, attribute-based approaches, and methods that use Knowledge Graph embeddings. This thesis proposes a novel entity matching pipeline that parses Knowledge Graphs to fetch literals and label values, handles semantic interoperability conflicts, and performs attribute-based matching on the literal data using a Deep Learning model. We test our approach on the iTunes dataset and the Wikidata-DBpedia dataset. We believe that our technique can help interlink individuals of various RDF datasets and extend knowledge via entity matching. Our technique also promises to be robust to dirty, semi-structured data when literals contain long texts.
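To illustrate the first steps of such a pipeline, the following is a minimal sketch, not the implementation used in the thesis. It assumes the input is a simplified N-Triples serialization, gathers the literal and label values of each entity, and scores candidate pairs. The token-overlap score is only a stand-in for the Deep Learning matcher, and all names (`collect_literals`, `similarity`, the `example.org` IRIs and the sample records) are hypothetical.

```python
import re
from collections import defaultdict

# Simplified N-Triples line with a literal object (an assumption of this sketch):
# <subject> <predicate> "literal"(@lang | ^^<datatype>)? .
LITERAL_TRIPLE = re.compile(
    r'^<([^>]+)>\s+<([^>]+)>\s+"((?:[^"\\]|\\.)*)"(?:@[\w-]+|\^\^<[^>]+>)?\s*\.\s*$'
)


def local_name(iri):
    """Return the part of an IRI after the last '/' or '#'."""
    return re.split(r"[/#]", iri)[-1]


def collect_literals(ntriples):
    """Group literal values by subject IRI and predicate local name.

    Triples whose object is an IRI are skipped. If a predicate occurs
    several times for one subject, the last value wins.
    """
    entities = defaultdict(dict)
    for line in ntriples.splitlines():
        match = LITERAL_TRIPLE.match(line.strip())
        if match:
            subject, predicate, value = match.groups()
            entities[subject][local_name(predicate)] = value
    return dict(entities)


def tokens(attributes):
    """Lowercased word tokens taken from all literal values of one entity."""
    return set(re.findall(r"\w+", " ".join(attributes.values()).lower()))


def similarity(attrs_a, attrs_b):
    """Token Jaccard overlap; a placeholder for a learned matching model."""
    tok_a, tok_b = tokens(attrs_a), tokens(attrs_b)
    if not tok_a or not tok_b:
        return 0.0
    return len(tok_a & tok_b) / len(tok_a | tok_b)


# Hypothetical sample data from two sources with different attribute names.
SOURCE_A = """
<http://example.org/a/1> <http://www.w3.org/2000/01/rdf-schema#label> "Abbey Road"@en .
<http://example.org/a/1> <http://example.org/prop/artist> "The Beatles" .
<http://example.org/a/1> <http://example.org/prop/genre> <http://example.org/genre/Rock> .
"""
SOURCE_B = """
<http://example.org/b/7> <http://example.org/schema/name> "Abbey Road (Remastered)" .
<http://example.org/b/7> <http://example.org/schema/creator> "Beatles" .
<http://example.org/b/8> <http://example.org/schema/name> "Kind of Blue" .
"""

if __name__ == "__main__":
    entities_a = collect_literals(SOURCE_A)
    entities_b = collect_literals(SOURCE_B)
    for iri_a, attrs_a in entities_a.items():
        for iri_b, attrs_b in entities_b.items():
            print(iri_a, iri_b, round(similarity(attrs_a, attrs_b), 2))
```

Comparing literal values rather than attribute names lets the sketch tolerate the two sources naming their properties differently, which is one simple form of schema heterogeneity. A learned model would replace `similarity` and could weigh long or noisy literals more carefully.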
Bonn, Univ., Master Thesis, 2021