2016
Conference Paper
Title

Constance: An intelligent data lake system

Abstract
As the challenge of our time, Big Data still poses many open research problems, especially regarding the variety of data. The high diversity of data sources often results in information silos: collections of non-integrated data management systems with heterogeneous schemas, query languages, and APIs. Data Lake systems have been proposed as a solution to this problem, providing a schema-less repository for raw data with a common access interface. However, just dumping all data into a data lake without any metadata management would only lead to a 'data swamp'. To avoid this, we propose Constance, a Data Lake system with sophisticated metadata management over raw data extracted from heterogeneous data sources. Constance discovers, extracts, and summarizes the structural metadata from the data sources, and annotates data and metadata with semantic information to avoid ambiguities.
Author(s)
Hai, Rihan
Geisler, Sandra  
Quix, Christoph  
Mainwork
International Conference on Management of Data, SIGMOD 2016. Proceedings  
Conference
International Conference on Management of Data 2016  
DOI
10.1145/2882903.2899389
Language
English
Fraunhofer-Institut für Angewandte Informationstechnik FIT  