DESERT: A Continuous SPARQL Query Engine for On-Demand Query Answering
The Internet of Things (IoT) has been rapidly adopted in many domains ranging from household appliances e.g. ventilation, lighting, and heating, to industrial manufacturing and transport networks. Despite the, enormous benefits of optimization, monitoring, and maintenance rendered by IoT devices, an ample amount of data is generated continuously. Semantically describing IoT generated data using ontologies enables a precise interpretation of this data. However, ontology-based descriptions tremendously increase the size of IoT data and in presence of repeated sensor measurements, a large amount of the data are duplicates that do not contribute to new insights during query processing or IoT data analytics. In order to ensure that only required ontology-based descriptions are generated, we devise a knowledge-driven approach named DESERT that is able to on-Demand factorizE and Semantically Enrich stReam daTa. DESERT resorts to a knowledge graph to describe IoT stream data; it utilizes only the data that is required to answer an input continuous SPARQL query and applies a novel method of data factorization to reduce duplicated measurements in the knowledge graph. The performance of DESERT is empirically studied on a collection of continuous SPARQL queries from SRBench, a benchmark of IoT stream data and continuous SPARQL queries. Furthermore, data streams with various combinations of uniform and varying data stream speeds and streaming window size dimensions are considered in the study. Experimental results suggest that DESERT is capable of speeding up continuous query processing while creates knowledge graphs that include no replications.