  • Publication
    DESERT: A Continuous SPARQL Query Engine for On-Demand Query Answering
    The Internet of Things (IoT) has been rapidly adopted in many domains, ranging from household appliances, e.g., ventilation, lighting, and heating, to industrial manufacturing and transport networks. Despite the enormous benefits of optimization, monitoring, and maintenance rendered by IoT devices, an ample amount of data is generated continuously. Semantically describing IoT-generated data using ontologies enables a precise interpretation of this data. However, ontology-based descriptions tremendously increase the size of IoT data, and in the presence of repeated sensor measurements, a large share of the data consists of duplicates that do not contribute new insights during query processing or IoT data analytics. In order to ensure that only the required ontology-based descriptions are generated, we devise a knowledge-driven approach named DESERT that is able to on-Demand factorizE and Semantically Enrich stReam daTa. DESERT resorts to a knowledge graph to describe IoT stream data; it utilizes only the data that is required to answer an input continuous SPARQL query and applies a novel method of data factorization to reduce duplicated measurements in the knowledge graph. The performance of DESERT is empirically studied on a collection of continuous SPARQL queries from SRBench, a benchmark of IoT stream data and continuous SPARQL queries. Furthermore, data streams with various combinations of uniform and varying data stream speeds and streaming window sizes are considered in the study. Experimental results suggest that DESERT is capable of speeding up continuous query processing while creating knowledge graphs that contain no replicated measurements.
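The idea of keeping only query-relevant data and storing each repeated measurement once can be illustrated with a minimal sketch. This is not DESERT's actual factorization method, which the abstract does not detail; the record fields (`sensor`, `property`, `value`, `time`) and the function name `factorize` are assumptions made purely for illustration.

```python
from collections import defaultdict

def factorize(observations, required_properties):
    """Keep only observations of properties the query needs, and store each
    distinct (sensor, property, value) measurement once.

    The timestamps of all observations that reported the same measurement
    are attached to that single entry instead of duplicating it.
    """
    factorized = defaultdict(list)
    for obs in observations:
        if obs["property"] not in required_properties:
            continue  # not needed to answer the query, so never materialized
        key = (obs["sensor"], obs["property"], obs["value"])
        factorized[key].append(obs["time"])
    return dict(factorized)

# Hypothetical window of stream data (all values invented for illustration).
stream_window = [
    {"sensor": "s1", "property": "temperature", "value": 21.5, "time": 1},
    {"sensor": "s1", "property": "temperature", "value": 21.5, "time": 2},
    {"sensor": "s1", "property": "temperature", "value": 22.0, "time": 3},
    {"sensor": "s2", "property": "humidity", "value": 40, "time": 1},
]

compact = factorize(stream_window, required_properties={"temperature"})
```

Here the two identical temperature readings collapse into one entry carrying two timestamps, and the humidity reading is skipped because the (hypothetical) query does not ask for it.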
  • Publication
    TurtleEditor: A web-based RDF editor to support distributed ontology development on repository hosting platforms
    (2017)
    Similea, Alexandra; Bansal, Srividya; Chaves-Gattaz, Cristiane; Persia, Fabio; Pilato, Giovanni; Zhang, Guigang
    Ontologies are increasingly being developed on web-based repository hosting platforms such as GitHub. Accordingly, there is a demand for ontology editors which can be easily connected to the hosted repositories. TurtleEditor is a web-based RDF editor that provides this capability and supports the distributed development of ontologies on repository hosting platforms. It offers features such as syntax checking, syntax highlighting, and auto completion, along with a SPARQL endpoint to query the ontology. Furthermore, TurtleEditor integrates a visual editing view that allows for the graphical manipulation of the RDF graph and includes some basic clustering functionality. The text and graph views are constantly synchronized so that all changes to the ontology are immediately propagated and the views are updated accordingly. The results of a user study and performance tests show that TurtleEditor can indeed be effectively used to support the distributed development of ontologies on repository hosting platforms.
  • Publication
    Git4Voc: Collaborative vocabulary development based on git
    Collaborative vocabulary development in the context of data integration is the process of finding consensus between experts with different backgrounds, system understanding, and domain knowledge. The complexity of this process increases with the number of people involved, the variety of the systems to be integrated, and the dynamics of their domain. In this paper, we argue that the use of a powerful version control system is one of the keys to addressing this problem. Driven by this idea and the success of the version control system Git in the context of software development, we investigate the applicability of Git to collaborative vocabulary development. Even though vocabulary development and software development have many more similarities than differences, there are still important challenges. These need to be considered in the development of a successful versioning and collaboration system for vocabulary development. Therefore, this paper starts by presenting the challenges we face during the collaborative creation of vocabularies and discusses how it differs from software development. Drawing on these findings, we present Git4Voc, which comprises guidelines on how Git can be adapted to vocabulary development. Finally, we demonstrate how Git hooks can be implemented to go beyond the plain functionality of Git by realizing vocabulary-specific features such as syntactic validation and semantic diffs.
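A Git hook for syntactic validation could look roughly like the sketch below. It is not the hook from the paper: the function names, the `.ttl` suffix filter, and the pluggable `parse` callable are assumptions. The only Git behaviour relied upon is that `git diff --cached --name-only --diff-filter=ACM` lists staged added, copied, or modified files and that a non-zero exit status from a pre-commit hook aborts the commit.

```python
import subprocess

VOCAB_SUFFIXES = (".ttl",)  # assumption: vocabularies are stored as Turtle files

def staged_vocab_files():
    """Return the staged (added, copied, or modified) vocabulary files."""
    out = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [p for p in out.splitlines() if p.endswith(VOCAB_SUFFIXES)]

def failed_files(paths, parse):
    """Call parse(path) for each file and collect (path, error) for failures.

    `parse` is any callable that raises an exception on a syntax error.
    Simplification: it reads the working-tree file, not the staged blob.
    """
    failures = []
    for path in paths:
        try:
            parse(path)
        except Exception as err:
            failures.append((path, str(err)))
    return failures

def run_hook(parse):
    """Return the hook's exit status: 0 lets the commit proceed, 1 aborts it."""
    failures = failed_files(staged_vocab_files(), parse)
    for path, err in failures:
        print(f"syntax error in {path}: {err}")
    return 1 if failures else 0
```

To use it, the script would be saved as an executable `.git/hooks/pre-commit` and end with `raise SystemExit(run_hook(parse))`, where `parse` wraps whichever Turtle parser the project already uses.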
  • Publication
    A semi-supervised method for topic extraction from micro postings
    (2015)
    Samiei, A.; Andrienko, Gennady; Andrienko, Natalia
    Social networking services have become a major channel for the digital society to share content, opinions, and experiences on activities or events, as well as on products, services, and brands. Evaluating digital feedback on the latter can be a valuable asset for companies seeking product and consumer insights. However, the analysis of short, noisy, fragmented, and often subjective textual data remains a challenge. Typically, the human analyst needs to be actively involved during extraction and modeling to resolve ambiguities that will inevitably arise in such data and to put the model into context. This paper proposes a visual analytics approach that enables a first intuition and exploration of topics appearing in the text corpus, and facilitates the interactive, iterative refinement of the overall topic model describing the stream of tweets. A second contribution is the discussion of efficient graph community detection algorithms to extract initial topics as the starting point of interactive analysis, complementing approaches such as LDA. The applicability and utility of the proposed approach are shown for a real-world use case: the analysis of product insights and topic-driven social network analysis for a specific product line of an international hair styling and cosmetics company.
  • Publication
    SINA: Semantic interpretation of user queries for question answering on interlinked data
    (2015)
    Shekarpour, Saeedeh; Marx, E.
    The architectural choices underlying Linked Data have led to a compendium of data sources which contain both duplicated and fragmented information on a large number of domains. One way to enable non-expert users to access this data compendium is to provide keyword search frameworks that can capitalize on the inherent characteristics of Linked Data. Developing such systems is challenging for three main reasons. First, resources across different datasets or even within the same dataset can be homonyms. Second, different datasets employ heterogeneous schemas, and each one may contain only a part of the answer to a given user query. Finally, constructing a federated formal query from keywords across different datasets requires exploiting links between the datasets on both the schema and instance levels. We present Sina, a scalable keyword search system that can answer user queries by transforming user-supplied keywords or natural-language queries into conjunctive SPARQL queries over a set of interlinked data sources. Sina uses a hidden Markov model to determine the most suitable resources for a user-supplied query from different datasets. Moreover, our framework is able to construct federated queries by using the disambiguated resources and leveraging the link structure underlying the datasets to query. We evaluate Sina over three different datasets. We can answer 25 queries from QALD-1 correctly. Moreover, we perform as well as the best question answering system from the QALD-3 competition by answering 32 questions correctly while also being able to answer queries on distributed sources. We study the runtime of Sina in its mono-core and parallel implementations and draw preliminary conclusions on the scalability of keyword search on Linked Data.
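How a hidden Markov model can pick the most suitable resources for a keyword sequence is sketched below using standard Viterbi decoding. The abstract does not give Sina's states, parameters, or decoding procedure, so everything here is assumed for illustration: the `ex:` resource names, the probabilities, and the function name `viterbi` are all invented.

```python
def viterbi(keywords, states, start_p, trans_p, emit_p):
    """Return the most probable sequence of states (resources) for the keywords."""
    # best[i][s] = (probability of the best path ending in s at keyword i,
    #               predecessor state on that path)
    best = [{s: (start_p[s] * emit_p[s].get(keywords[0], 0.0), None)
             for s in states}]
    for word in keywords[1:]:
        column = {}
        for s in states:
            column[s] = max(
                (best[-1][r][0] * trans_p[r][s] * emit_p[s].get(word, 0.0), r)
                for r in states
            )
        best.append(column)
    # Backtrack from the most probable final state.
    path = [max(states, key=lambda s: best[-1][s][0])]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    return list(reversed(path))

# Hypothetical candidate resources and invented probabilities.
states = ["ex:Berlin_city", "ex:Berlin_band", "ex:population"]
start_p = {"ex:Berlin_city": 0.4, "ex:Berlin_band": 0.4, "ex:population": 0.2}
emit_p = {
    "ex:Berlin_city": {"berlin": 0.9},
    "ex:Berlin_band": {"berlin": 0.9},
    "ex:population": {"population": 0.9},
}
# Transitions stand in for how strongly two resources are linked in the data.
trans_p = {
    "ex:Berlin_city": {"ex:Berlin_city": 0.1, "ex:Berlin_band": 0.1, "ex:population": 0.8},
    "ex:Berlin_band": {"ex:Berlin_city": 0.45, "ex:Berlin_band": 0.45, "ex:population": 0.1},
    "ex:population": {"ex:Berlin_city": 0.4, "ex:Berlin_band": 0.3, "ex:population": 0.3},
}

resources = viterbi(["berlin", "population"], states, start_p, trans_p, emit_p)
```

With these invented numbers the homonym "berlin" resolves to the city, because only the city has a strong transition to the population property that the second keyword maps to.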
  • Publication
    Semantic-based retrieval of cultural heritage multimedia objects
    (2012)
    Doerr, Martin; Hill, Hermann Josef
    Today's search interfaces typically offer keyword searches and facets for the retrieval of cultural heritage multimedia objects. Facets, however, are usually based on a static set of metadata fields. This set is often called an indexing profile. Graph-based repositories based on predicates about resources allow for more precise semantics. They offer stronger support for retrieval, and they can be adapted to almost any metadata format. Technically, those predicates may be serialized as RDF triples, but handling a huge number of objects with numerous predicates puts an unpredictable load on the query engine. In this paper, we present an approach for analysing transition paths in the RDF triples at ingest time and using the results to create facets in the search index.
  • Publication
    Agriculture's technological makeover
    With agriculture becoming a knowledge-intensive industry, pervasive computing is poised to help solve one of the world's most pressing challenges of information dissemination, marketing products, and conducting business. These intelligent systems measure light reflectance from leaves and correlate it to nitrogen levels in the soil, and control application systems then apply the optimal amounts of fertilizer. John Deere developed a prototype of an automatic tractor that uses satellite signals to follow pre-programmed routes without a human driver. It is expected that with the availability of more computation combined with sensing and networking, new forms of farming will emerge. Farmers can obtain corresponding data from hyperspectral measurements of plants, which record spectrums of several hundred wavelengths. This data contains information about changes in a plant's pigment composition, which indicates metabolic processes involved in responses to biotic or abiotic stress.
  • Publication
    An agent-based model to evaluate carpooling at large manufacturing plants
    (2012)
    Bellemans, T.; Cho, S.; Giannotti, Fosca; Janssens, D.; Knapen, L.; Nanni, M.; Pedreschi, Dino; Trasarti, Roberto; Yasar, A.-U.-H.; Wets, G.
    Carpooling is thought to be part of the solution to traffic congestion in regions where large companies dominate the traffic situation, because coordination and matching between commuters are more likely to be feasible where most people work for a single employer. However, carpooling is not very popular for commuting. For carpooling to be successful, an online service for matching commuter profiles is indispensable due to the large community involved. Such a service is necessary but not sufficient, because carpooling requires rerouting and activity rescheduling along with candidate matching. We advise introducing services of this kind using a two-step process: (1) an agent-based simulation is used to investigate opportunities and inhibitors and (2) online matching is made available. This paper describes the challenges of building the model and, in particular, investigates possibilities for deriving the data required for commuter behavior modeling from big data (such as GSM, GPS, and/or Bluetooth).
  • Publication
    The ACGT project in retrospect: Lessons learned and future outlook
    (2011)
    Bucur, A.; Sengstag, T.; Sfakianakis, Stelios; Tsiknakis, Manolis
  • Publication
    Perception beyond the here and now
    (2011)
    Langheinrich, M.
    Emerging sensor-equipped computing devices are overcoming longstanding temporal and spatial boundaries to human perception.