  • Publication
    Data Lake
    Data lakes (DLs) have been proposed as a new concept for centralized data repositories. In contrast to data warehouses (DWs), which usually require a complex and fine-tuned Extract-Transform-Load (ETL) process, DLs use a simpler model that aims only at loading the complete source data in its raw format into the DL. While a more complex ETL process with data transformation and aggregation increases data quality, it may also cause some information loss, because irregular or unstructured data that does not fit the integrated DW schema will not be loaded into the DW. Moreover, some data silos might not get connected to integrated data repositories at all due to the complexity of the data integration process. DLs address these problems: they should provide access to the source data in its original format without requiring an elaborate ETL process to ingest the data into the lake.
  • Publication
    Evaluation of real-time traffic applications based on data stream mining
    (2014)
    Geisler, Sandra; Quix, Christoph
    Traffic management today requires the analysis of a huge amount of data in real time in order to provide current information about the traffic state or hazards to road users and traffic control authorities. Modern cars are equipped with several sensors which can produce useful data for the analysis of traffic situations. Using mobile communication technologies, such data can be integrated and aggregated from several cars, which enables intelligent transportation systems (ITS) to monitor the traffic state in a large area at relatively low cost. However, processing and analyzing data poses numerous challenges for data management solutions in such systems. Real-time analysis with high accuracy and confidence is one important requirement in this context. We present a summary of our work on a comprehensive evaluation framework for data stream-based ITS. The goal of the framework is to identify appropriate configurations for ITS and to evaluate different mining methods for data analysis. The framework consists of traffic simulation software and a data stream management system, utilizes data stream mining algorithms, and provides a flexible ontology-based component for data quality monitoring during data stream processing. The work has been done in the context of a project on Car-To-X communication using mobile communication networks. The results give some interesting insights for the setup and configuration of traffic information systems that use Car-To-X messages as the primary source for deriving traffic information, and also point out challenges for data stream management and data stream mining.