  • Publication
    On Residual-based Diagnosis of Physical Systems
    (2022)
    Niggemann, Oliver
    In this article we describe a novel diagnosis methodology for physical systems such as industrial production systems. The article consists of two parts. Part one analyzes the differences between using sensor values and using residual values for fault diagnosis. Residual values denote the health of a component by comparing sensor values to a predefined model of normal behaviour. We further analyze how faults propagate through the components of a physical system and argue for the use of residual values for diagnosing physical systems. In part two we extend the theory of established consistency-based diagnosis algorithms to use residual values. We also illustrate how users of the presented diagnosis methodology are free to substitute the residual-generating equations and the diagnosis algorithm to suit their specific needs. For diagnosis, we present the algorithm HySD, based on Satisfiability Modulo Linear Arithmetic. We present an implementation of HySD using threshold values and a symbolic diagnosis approach. However, the approach is also suitable for integrating modern machine learning methods for anomaly detection and combining them with a multitude of diagnosis approaches. Through experiments on the process-industry benchmark Tennessee Eastman Process and another benchmark consisting of multiple tank systems, we show the feasibility of our approach. Overall, we show how our novel diagnosis approach offers a practical methodology that allows industry to advance from current state-of-the-art anomaly detection to automated fault diagnosis.
  • Publication
    Application of LSTM Networks for Water Demand Prediction in Optimal Pump Control
    (2021)
    Gonuguntla, Naga; Thomas, Jorge A.
    Every morning, water suppliers need to define their pump schedules for the next 24 h of drinking water production. Plans must be designed in such a way that drinking water is always available and the amount of unused drinking water pumped into the network is reduced. Therefore, operators must accurately estimate the next day's water consumption profile. In real-life applications with standard consumption profiles, expert systems or vector autoregressive models are used. In recent years, however, significant improvements in time series prediction have been achieved through deep learning algorithms called long short-term memory (LSTM) networks. This paper investigates the applicability of LSTM models for water demand prediction and optimal pump control and compares LSTMs against other methods currently used by water suppliers. It is shown that LSTMs outperform other methods since they can easily integrate additional information like the day of the week or national holidays. Furthermore, the online- and transfer-learning capabilities of the LSTMs are investigated. It is shown that LSTMs only need a couple of days of training data to achieve reasonable results. As the focus of the paper is on the real-world application of LSTMs, data from two different water distribution plants are used for benchmarking. Finally, it is shown that the LSTMs significantly outperform the system currently in operation.
  • Publication
    Towards Distributed Healthcare Systems - Virtual Data Pooling Between Cancer Registries as Backbone of Care and Research
    (2021)
    Bartholomäus, Sebastian; Breitschwerdt, Rüdiger; Hartz, Tobias; Kachel, Philipp; Zeissig, Sylke Ruth
    German cancer registries offer a systematic approach for the collection, storage, and management of data on patients with cancer and related diseases. Much hope in research and healthcare in general depends on such registry-based analyses, which make it possible to comprehensively consider the features of a highly diverse population. In addition to data collection, the cancer registries are responsible for data protection. To fulfill legal regulations, access to data has to be strictly controlled, which sometimes leads to bureaucratic and slow processes. The situation is especially complicated in Germany, since cancer data is distributed over numerous federal cancer registries. If a nationwide data evaluation is conducted, a research team has to negotiate a separate contract with each cancer registry. In a joint work-in-progress effort of cancer registries and technical, medical, and economic experts, we propose a different solution for cooperative data processing. Our approach aims to combine data in a virtual pool based on the selection criteria of individual requests from researchers. To achieve our goal, we adapt the Fraunhofer Medical Data Space as the enabling technology. The architecture we propose will allow us to pool data from multiple partners, regulated by data access policies. In doing so, each of the data sources can introduce its own rules and specifications on how data is used. Additionally, we add digital consent management that will allow individual patients to decide how their data is used. Finally, we show the high potential of the cooperative analysis of distributed cancer data supported by the proposed solution.
  • Publication
    Grundlagen des Maschinellen Lernens
    Defining what constitutes human intelligence and intelligent action, and therefore also artificial intelligence, is extraordinarily difficult and has occupied philosophers and psychologists for millennia. It is generally accepted, however, that the ability to learn is a central characteristic of intelligence. Accordingly, the research field of machine learning (ML) is a central part of artificial intelligence and lies behind many current successes of AI systems.
  • Publication
    Are you sure? Prediction revision in automated decision-making
    With the rapid improvements in machine learning and deep learning, the number of decisions made by automated decision support systems (DSS) will increase. Besides the accuracy of predictions, their explainability becomes more important. The algorithms can construct complex mathematical prediction models, which creates uncertainty about the predictions. This uncertainty raises the need to equip the algorithms with explanations. To examine how users trust automated DSS, an experiment was conducted. Our research aim is to examine how participants supported by a DSS revise their initial prediction under four varying approaches (treatments) in a between-subject design study. The four treatments differ in the degree of explainability offered for understanding the predictions of the system. First we used an interpretable regression model, second a Random Forest (considered to be a black box [BB]), third the BB with a local explanation, and last the BB with a global explanation. We noticed that all participants improved their predictions after receiving advice, whether it came from a complete BB or a BB with an explanation. The major finding was that interpretable models were not incorporated more in the decision process than BB models or BB models with explanations.
  • Publication
    New active learning algorithms for near-infrared spectroscopy in agricultural applications
    The selection of training data determines the quality of a chemometric calibration model. In order to cover the entire parameter space of known influencing parameters, an experimental design is usually created. Nevertheless, even with a carefully prepared Design of Experiment (DoE), redundant reference analyses are often performed during the analysis of agricultural products. Because the number of possible reference analyses is usually very limited, the presented active learning approaches are intended to provide a tool for better selection of training samples.
  • Publication
    Multimedia analysis platform for crime prevention and investigation. Results of MAGNETO project
    (2021)
    Perez, Francisco J.; Garrido, Victor J.; Garcia, Alberto; Zambrano, Marcelo; Kozik, Rafal; Choras, Michal
    Nowadays, the use of digital technologies is increasing three main characteristics of information: its volume, modality, and frequency. Due to the amount of information generated by tools and individuals, a critical need has been identified for Law Enforcement Agencies to exploit this information and carry out criminal investigations effectively. To respond to the increasing challenges of managing huge amounts of heterogeneous data generated at high frequency, the paper outlines a modular approach adopted for processing information gathered from different sources and extracting knowledge to assist criminal investigation. The proposed platform provides novel technologies and efficient components for processing multimedia information in a scalable and distributed way, allowing Law Enforcement Agencies to analyze criminal information and visualize it multidimensionally from a single, secure point.
  • Publication
    Generative Machine Learning for Resource-Aware 5G and IoT Systems
    Extrapolations predict that the number of Internet-of-Things (IoT) devices will exceed 40 billion in the next five years. Hand-crafting specialized energy models and monitoring sub-systems for each type of device is error-prone, costly, and sometimes infeasible. In order to detect abnormal or faulty behavior as well as inefficient resource usage autonomously, it is of tremendous importance to endow upcoming IoT and 5G devices with sufficient intelligence to deduce an energy model from their own resource usage data. Such models can in turn be applied to predict upcoming resource consumption and to detect system behavior that deviates from normal states. To this end, we investigate a special class of undirected probabilistic graphical models, the so-called integer Markov random fields (IntMRF). On the one hand, this model learns a full generative probability distribution over all possible states of the system, allowing us to predict system states and to measure the probability of observed states. On the other hand, IntMRFs are themselves designed to consume as few resources as possible: for example, they faithfully model systems with an exponentially large number of states using only 8-bit unsigned integer arithmetic and less than 16 KB of memory. We explain how IntMRFs can be applied to model the resource consumption and the system behavior of an IoT device and a 5G core network component, both under various workloads. Our results suggest that the machine learning model can represent important characteristics of our two test systems and deliver reasonable predictions of the power consumption.