Fraunhofer-Gesellschaft

Publica

Here you will find scientific publications from the Fraunhofer Institutes.

Data Lake

 
Authors: Quix, Christoph; Geisler, Sandra; Hai, Rihan

In:

Schintler, L.A.:
Encyclopedia of Big Data. Online resource
Cham: Springer Nature (Springer Nature Living Reference. Business and Management)
https://link.springer.com/referencework/10.1007/978-3-319-32001-4
ISBN: 978-3-319-32001-4
6 pp.
English
Book chapter
Fraunhofer FIT

Abstract
Data lakes (DLs) have been proposed as a new concept for centralized data repositories. In contrast to data warehouses (DWs), which usually require a complex and fine-tuned Extract-Transform-Load (ETL) process, DLs use a simpler model that simply loads the complete source data in its raw format into the DL. While a more complex ETL process with data transformation and aggregation increases data quality, it can also cause information loss, because irregular or unstructured data that does not fit the integrated DW schema is not loaded into the DW. Moreover, some data silos might never be connected to integrated data repositories at all because of the complexity of the data integration process. DLs address these problems: they should provide access to the source data in its original format without requiring an elaborate ETL process to ingest the data into the lake.
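The contrast drawn in the abstract can be illustrated with a minimal sketch. It is not taken from the publication: the schema `DW_SCHEMA`, the functions `etl_load` and `lake_ingest`, the source name `sales_export`, and the sample records are all hypothetical and serve only to show how a schema-bound load can drop irregular input while a raw ingest keeps it.

```python
import json

# Hypothetical integrated warehouse schema: field name -> expected type.
DW_SCHEMA = {"id": int, "amount": float}


def etl_load(raw_records):
    """Warehouse-style load: parse each record and keep only rows that fit the schema."""
    warehouse = []
    for raw in raw_records:
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            continue  # unstructured input cannot be parsed and is dropped
        if not isinstance(record, dict):
            continue
        fits_schema = set(record) == set(DW_SCHEMA) and all(
            isinstance(record[name], expected) for name, expected in DW_SCHEMA.items()
        )
        if fits_schema:
            warehouse.append(record)
    return warehouse


def lake_ingest(raw_records, source):
    """Lake-style ingest: store every record verbatim, tagged with its source."""
    return [{"source": source, "payload": raw} for raw in raw_records]


# Illustrative input: one regular row, one irregular row, one unstructured line.
raw_records = [
    '{"id": 1, "amount": 9.5}',
    '{"id": 2, "amount": "n/a", "note": "manual entry"}',
    "free-text log line without structure",
]

warehouse = etl_load(raw_records)
lake = lake_ingest(raw_records, source="sales_export")

print(len(warehouse), "of", len(raw_records), "records reached the warehouse")
print(len(lake), "of", len(raw_records), "records reached the lake")
```

In this sketch only the first record reaches the warehouse, while all three are retained in the lake in their original form. Interpreting the raw payloads is deferred to whoever later reads from the lake.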

URL: http://publica.fraunhofer.de/dokumente/N-586625.html