Options
August 20, 2023
Conference Paper
Title
An Extensive Methodology and Framework for Quality Assessment of DCAT-AP Datasets
Abstract
The DCAT Application Profile for Data Portals is a crucial cornerstone for publishing and reusing Open Data in Europe. It supports the harmonization and interoperability of Open Data by providing an expressive set of properties, guidelines, and reusable vocabularies. However, a qualitative and accurate implementation by Open Data providers remains challenging. To improve the informative value and the compliance with RDF-based specifications, we propose a methodology to measure and assess the quality of DCAT-AP datasets. Our approach is based on the FAIR and the 5-star principles for Linked Open Data. We define a set of metrics, where each one covers a specific quality aspect. For example, if a certain property has a compliant value, if mandatory vocabularies are applied or if the actual data is available. The values for the metrics are stored as a custom data model based on the Data Quality Vocabulary and is used to calculate an overall quality score for each dataset. We implemented our approach as a scalable and reusable Open Source solution to demonstrate its feasibility. It is applied in a large-scale production environment (data.europa.eu) and constantly checks more than 1.6 million DCAT-AP datasets and delivers quality reports.
Author(s)
Keyword(s)