Databugger: A test-driven framework for debugging the web of data
Linked Open Data (LOD) comprises an unprecedented volume of structured data on the Web. However, these datasets vary in quality, ranging from extensively curated datasets to crowd-sourced or extracted data of often relatively low quality. We present Databugger, a framework for test-driven quality assessment of Linked Data, inspired by test-driven software development. Databugger ensures a basic level of quality by accompanying vocabularies, ontologies and knowledge bases with a number of test cases. The formalization behind the tool employs SPARQL query templates, which are instantiated into concrete quality test queries. The test queries can be instantiated automatically based on a vocabulary or manually based on the data semantics. One of the main advantages of our approach is that domain-specific semantics can be encoded in the data quality test cases, which makes it possible to discover data quality problems beyond conventional quality heuristics.
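
To illustrate how a query template might be instantiated into a concrete test query, the following minimal sketch uses Python's standard `string.Template`. The template text, the placeholder names (`property`, `min_value`, `max_value`), the `instantiate` helper and the `http://example.org/height` property are all hypothetical and chosen for illustration only; they are not Databugger's actual pattern syntax or API.

```python
from string import Template

# Hypothetical template (not Databugger's own syntax): selects every
# resource whose value for a given property lies outside an allowed range.
RANGE_TEMPLATE = Template(
    "SELECT ?s WHERE { ?s <$property> ?value . "
    "FILTER (?value < $min_value || ?value > $max_value) }"
)


def instantiate(template, bindings):
    """Fill a query template with concrete bindings.

    Raises KeyError if a placeholder has no binding, so an incomplete
    test case is rejected instead of producing a broken query.
    """
    return template.substitute(bindings)


# Manual instantiation based on assumed domain semantics:
# a height value is expected to lie between 0 and 3.
query = instantiate(
    RANGE_TEMPLATE,
    {"property": "http://example.org/height", "min_value": 0, "max_value": 3},
)
print(query)
```

Under these assumptions, every resource returned by the instantiated query would count as a violation of the test case. Automatic instantiation could follow the same route, with the bindings derived from vocabulary declarations rather than written by hand.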