2023
Journal Article
Title

Efficient computation of comprehensive statistical information of large OWL datasets: A scalable approach

Abstract
Computing dataset statistics is crucial for exploring the structure of datasets; however, it becomes challenging for large-scale datasets. Such statistics have several key benefits, supporting tasks such as link target identification, vocabulary reuse, quality analysis, big data analytics, and coverage analysis. In this paper, we present the first attempt at developing a distributed approach (OWLStats) for collecting comprehensive statistics over large-scale OWL datasets. OWLStats is a distributed in-memory approach that uses Apache Spark to compute 50 statistical criteria for OWL datasets. We have successfully integrated OWLStats into the SANSA framework. Experimental results show that OWLStats is linearly scalable in terms of both node and data scalability.
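As a rough illustration of the kind of statistic described above, the following single-machine Python sketch counts axioms by type using a map step per partition and a reduce step that merges the partial counts. This is the general shape of computation that a distributed engine such as Apache Spark parallelizes. It is not the OWLStats implementation and does not use the SANSA or Spark APIs. The sample axioms, the one-axiom-per-line functional-syntax input, and all function names are assumptions made for illustration only.

```python
from collections import Counter
from functools import reduce

# Hypothetical sample: OWL axioms in functional syntax, one per line.
SAMPLE_AXIOMS = [
    "Declaration(Class(:Person))",
    "Declaration(Class(:Student))",
    "SubClassOf(:Student :Person)",
    "Declaration(ObjectProperty(:knows))",
    "ObjectPropertyDomain(:knows :Person)",
]


def axiom_type(line):
    """Return the leading keyword of an axiom, e.g. 'SubClassOf'."""
    return line.split("(", 1)[0].strip()


def count_partition(lines):
    """Map step: count axiom types within one partition of the data."""
    return Counter(axiom_type(line) for line in lines if line.strip())


def merge_counts(left, right):
    """Reduce step: combine two partial counts into one."""
    return left + right


def axiom_type_counts(partitions):
    """Count axiom types across all partitions."""
    partial = (count_partition(p) for p in partitions)
    return reduce(merge_counts, partial, Counter())


# Split the sample into two partitions, standing in for data spread over nodes.
partitions = [SAMPLE_AXIOMS[:3], SAMPLE_AXIOMS[3:]]
counts = axiom_type_counts(partitions)
print(dict(counts))
```

Because the merge step is associative and commutative, the partitions can be processed in any order and on any number of workers, which is the property that makes this kind of statistic suitable for distributed computation.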
Author(s)
Mohamed, H.
Universität Bonn
Fathalla, S.
Universität Bonn
Lehmann, Jens  
Universität Bonn
Jabeen, H.
Leibniz Institute for the Social Sciences GESIS
Journal
Enterprise information systems  
DOI
10.1080/17517575.2022.2062683
Language
English
Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS  
Keyword(s)
  • Distributed processing
  • in-memory approach
  • SANSA framework
  • scalable architecture
  • Semantic Web
  • statistics computations
