Towards semantification of big data technology

Mami, Mohamed Nadjib; Scerri, Simon; Auer, Sören; Vidal, Maria-Esther

doi:10.1007/978-3-319-43946-4_25

2016

Conference Paper

Abstract

Much attention has been devoted to supporting the volume and velocity dimensions of Big Data. As a result, a plethora of technology components supporting various data structures (e.g., key-value, graph, relational), modalities (e.g., stream, log, real-time) and computing paradigms (e.g., in-memory, cluster/cloud) are now available. However, systematic support for managing the variety of data, the third dimension in the classical Big Data definition, is still missing. In this article, we present SeBiDA, an approach for managing hybrid Big Data. SeBiDA supports the Semantification of Big Data using the RDF data model, i.e., non-semantic Big Data is semantically enriched by using RDF vocabularies. We empirically evaluate the performance of SeBiDA along two dimensions of Big Data, i.e., volume and variety; the Berlin Benchmark is used in the study. The results suggest that even in large datasets, query processing time is not affected by data variety.

Mainwork

Big data analytics and knowledge discovery

Conference

International Conference on Data Warehousing and Knowledge Discovery (DaWaK) 18, 2016
