October 6, 2025
Conference Paper
Title
PLVS: A Documentation and Exploration System for Data Provenance, Lineage, and Versioning
Abstract
Data is among an organization's most valuable assets, and global dataspaces aim to unlock its value through secure sharing and exchange. A key challenge in building these data marketplaces is providing transparent access to data assets that are continuously changed by multiple actors throughout their lifecycle. The resulting provenance, lineage, and versioning information offers insight into a data asset's history and quality, but it often goes untracked. To address this challenge, we developed the Provenance, Lineage, and Versioning System (PLVS), a W3C PROV-compliant system that captures standardized provenance and lineage metadata (who performed which operations, and when) and detects precise differences between dataset versions. PLVS provides an interactive graphical interface for provenance and lineage exploration and version comparison; RESTful APIs for provenance and lineage metadata and for dataset version comparison data; and semantic graph-based storage that persists this metadata using the W3C PROV Data Model (W3C PROV-DM). We validated PLVS with Smart Cities use cases in the PISTIS Horizon Europe project, demonstrating its effectiveness in improving data transparency and trustworthiness and in supporting data quality assessment, error detection and correction, and auditability in federated dataspaces.
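To make the two core ideas in the abstract concrete, the following is a minimal illustrative sketch, not the PLVS implementation. It assumes a dataset version can be modelled as a mapping from row key to row values, and it expresses the "who performed which operations and when" metadata as plain triples using W3C PROV-O property names. The function names `diff_versions` and `record_update`, and all `ex:` identifiers, are hypothetical.

```python
from datetime import datetime, timezone


def diff_versions(old, new):
    """Compare two dataset versions given as {row_key: row_dict} mappings."""
    added = sorted(k for k in new if k not in old)
    removed = sorted(k for k in old if k not in new)
    changed = sorted(k for k in new if k in old and new[k] != old[k])
    return {"added": added, "removed": removed, "changed": changed}


def record_update(agent, activity, old_id, new_id, when):
    """Describe one update as (subject, property, object) triples.

    The property names are taken from W3C PROV-O; the identifiers passed
    in are hypothetical examples, not identifiers used by PLVS.
    """
    return [
        (new_id, "prov:wasDerivedFrom", old_id),        # lineage between versions
        (new_id, "prov:wasGeneratedBy", activity),      # which operation produced it
        (activity, "prov:used", old_id),                # what the operation consumed
        (activity, "prov:wasAssociatedWith", agent),    # who performed the operation
        (activity, "prov:endedAtTime", when.isoformat()),  # when it finished
        (new_id, "prov:wasAttributedTo", agent),
    ]


# Hypothetical sensor readings in two versions of a Smart Cities dataset.
v1 = {"s1": {"temp": 21.5}, "s2": {"temp": 19.0}}
v2 = {"s1": {"temp": 22.0}, "s3": {"temp": 18.2}}

changes = diff_versions(v1, v2)
triples = record_update(
    agent="ex:alice",
    activity="ex:update-1",
    old_id="ex:dataset/v1",
    new_id="ex:dataset/v2",
    when=datetime(2025, 1, 1, tzinfo=timezone.utc),
)
```

In a system of the kind described, triples like these would be written to a graph store and served through an API, while the diff result would back the version comparison view; both steps are outside the scope of this sketch.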
Author(s)
Conference