2019
Conference Paper
Title

Querying Data Lakes using Spark and Presto

Abstract
Squerall is a tool for querying heterogeneous, large-scale data sources by leveraging state-of-the-art Big Data processing engines: Spark and Presto. Queries are posed on-demand against a Data Lake, i.e., directly on the original data sources without requiring prior data transformation. We showcase Squerall's ability to query five different data sources, including the popular Cassandra and MongoDB. In particular, we demonstrate how it can jointly query heterogeneous data sources, and how interested developers can easily extend it to support additional data sources. Graphical user interfaces (GUIs) are offered to support users in (1) building intra-source queries, and (2) creating required input files.
Author(s)
Mami, Mohamed Nadjib  
Graux, Damien  
Scerri, Simon  
Jabeen, Hajira
Auer, Sören  
Mainwork
The Web Conference 2019. Proceedings of The World Wide Web Conference WWW 2019  
Project(s)
BETTER  
Funder
European Commission EC  
Conference
World Wide Web Conference (WWW) 2019  
Open Access
DOI
10.1145/3308558.3314132
Language
English
Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS  