  • Publication
    Computer Scientist's and Programmer's View on Quantum Algorithms: Mapping Functions' APIs and Inputs to Oracles
    (2022)
    Tcholtchev, Nikolay Vassilev
    Quantum Computing (QC) is a promising approach that is expected to boost the development of new services and applications. Specific addressable problems can be tackled through acceleration in computational time and advances with respect to problem complexity, with QC algorithms supporting the solution search. However, QC currently remains a domain that is strongly dominated by a physics perspective. Indeed, in order to bring QC to industrial-grade applications, we need to consider multiple perspectives, especially that of software engineering and software application/service programming. Following this line of thought, the current paper presents our computer scientist's view on black-box oracles, which are a key construct for the majority of currently available QC algorithms. Thereby, we observe the need for the inputs of API functions from the traditional world of software engineering and (web) services to be mapped to the above-mentioned black-box oracles. Hence, there is a clear requirement for automatically generating oracles for specific types of problems/algorithms based on the concrete input to the corresponding APIs. In this paper, we discuss these aspects and illustrate them on two QC algorithms, namely the Deutsch-Jozsa algorithm and Grover's algorithm.
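    The mapping from a classical API function to a quantum oracle can be illustrated with a small simulation. The following sketch (an editor's illustration, not code from the paper) generates a phase oracle from an arbitrary Python predicate and runs the Deutsch-Jozsa test on it with NumPy; all names are hypothetical.

```python
# Minimal NumPy sketch: mapping a classical boolean function (an "API input")
# to a Deutsch-Jozsa phase oracle. Illustrative only; names are hypothetical.
import numpy as np

def hadamard_n(n):
    """Hadamard transform on n qubits as a dense matrix."""
    H1 = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)
    H = np.array([[1.0]])
    for _ in range(n):
        H = np.kron(H, H1)
    return H

def phase_oracle(f, n):
    """Diagonal oracle U_f|x> = (-1)^f(x) |x>, generated from a classical function."""
    return np.diag([(-1.0) ** f(x) for x in range(2 ** n)])

def deutsch_jozsa(f, n):
    """Classify a promise function f: {0..2^n-1} -> {0,1} as constant or balanced."""
    state = np.zeros(2 ** n)
    state[0] = 1.0                                   # start in |0...0>
    H = hadamard_n(n)
    state = H @ phase_oracle(f, n) @ H @ state       # H, oracle, H
    # probability of measuring |0...0> is 1 for constant f and 0 for balanced f
    return "constant" if abs(state[0]) ** 2 > 0.5 else "balanced"

print(deutsch_jozsa(lambda x: 1, 3))                      # constant
print(deutsch_jozsa(lambda x: bin(x).count("1") % 2, 3))  # balanced
```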
  • Publication
    V2X Attack Vectors and Risk Analysis for Automated Cooperative Driving
    (2021)
    Sawade, Oliver
    Cooperative systems have finally entered the automotive mass market with the introduction of the 2020 VW Golf. These "Day-1" functions are aimed at driver information and warning only, but the integration of cooperative systems and automated driver assistance is already planned on several levels, such as cooperative perception and cooperative driving maneuvers. The introduction of wireless open networks into highly critical systems places the highest demands on safety and security. In this paper, we examine several cybersecurity attack vectors on Day-1 and future cooperative systems by applying the methodology used in functional safety. We evaluate attack difficulty (exposure), severity, and controllability for a selection of current and next-generation functions. From this analysis, we derive the associated risks and give recommendations to researchers and engineers. Finally, we simulate a selection of attacks on a platoon and evaluate function behavior and the possibility of critical system malfunction.
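    A hazard-and-risk assessment of this kind can be sketched as a simple lookup in the style of ISO 26262 risk graphs. The snippet below is an editor's illustration with assumed class boundaries and an assumed mapping to risk levels, not the paper's actual rating scheme.

```python
# Illustrative risk rating for a V2X attack vector, loosely following the
# severity/exposure/controllability scheme of functional safety (ISO 26262).
# Class ranges and the mapping to levels are assumptions for illustration.

def risk_level(severity: int, exposure: int, controllability: int) -> str:
    """severity in 1..3 (S1-S3), exposure in 1..4 (E1-E4), controllability in 1..3 (C1-C3)."""
    score = severity + exposure + controllability
    if score <= 6:
        return "QM"                       # no dedicated measures beyond quality management
    return "ABCD"[min(score - 7, 3)]      # increasing integrity requirements

# Example: a spoofed cooperative-perception message on a Day-1 warning function
print(risk_level(severity=2, exposure=4, controllability=2))  # -> "B"
```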
  • Publication
    VisionKG: Towards a unified vision knowledge graph
    (2021)
    Le-Tuan, Anh
    ;
    Tran, Trung-Kien
    ;
    Nguyen-Duc, Manh
    ;
    Yuan, Jicheng
    ;
    Phuoc, Danh Le
    Computer Vision (CV) has recently achieved significant improvements, thanks to the evolution of deep learning. Along with advanced architectures and optimisations of deep neural networks, CV data for (cross-dataset) training, validating, and testing contributes greatly to the performance of CV models. Many CV datasets have been created for different tasks, but they are available in heterogeneous data formats and semantic representations. Therefore, it is challenging to combine different datasets either for training or testing purposes. This paper proposes a unified framework using Semantic Web technology that provides a novel way to interlink and integrate labelled data across different data sources. We demonstrate its advantages via various scenarios, with the system framework accessible both online and via APIs.
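    As a rough illustration of interlinking labels across heterogeneous CV datasets with Semantic Web technology, the sketch below merges two toy annotation sources into one RDF graph and queries them jointly with rdflib; the namespace and property names are invented for the example and are not VisionKG's actual schema.

```python
# Toy sketch: unifying image annotations from two datasets in one RDF graph.
# The vocabulary URIs are placeholders, not the VisionKG schema.
from rdflib import Graph, Namespace, Literal, URIRef, RDF

EX = Namespace("http://example.org/vision#")
g = Graph()
g.bind("ex", EX)

# Dataset A labels its images with "category", dataset B with "class_name";
# both are mapped onto a common ex:label property when loaded.
for img, label in [("coco/0001.jpg", "person"), ("kitti/0042.png", "pedestrian")]:
    node = URIRef(f"http://example.org/image/{img}")
    g.add((node, RDF.type, EX.Image))
    g.add((node, EX.label, Literal(label)))

# One SPARQL query now spans both sources.
q = """SELECT ?img ?label WHERE { ?img a ex:Image ; ex:label ?label . }"""
for row in g.query(q, initNs={"ex": EX}):
    print(row.img, row.label)
```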
  • Publication
    Beyond the Hype: Why Do Data-Driven Projects Fail?
    (2021)
    Blume, Julia
    ;
    Fabian, Benjamin
    ;
    Fomenko, Elena
    ;
    Berlin, Marcus
    Despite substantial investments, data science has failed to deliver significant business value in many companies. So far, the reasons for this problem have not been explored systematically. This study tries to find possible explanations for this shortcoming and analyses the specific challenges in data-driven projects. To identify the reasons that make data-driven projects fall short of expectations, multiple rounds of qualitative semi-structured interviews with domain experts in different roles in data-driven projects were carried out. This was followed by a questionnaire surveying 112 experts with experience in data projects from eleven industries. Our results show that the main reasons for failure in data-driven projects are (1) a lack of understanding of the business context and user needs, (2) low data quality, and (3) data access problems. Interestingly, 54% of respondents see a conceptual gap between business strategies and the implementation of analytics solutions. Based on our results, we give recommendations for how to overcome this conceptual distance and carry out data-driven projects more successfully in the future.
  • Publication
    Automatic Generation of Grover Quantum Oracles for Arbitrary Data Structures
    (2021)
    Seidel, Raphael
    ;
    Tcholtchev, Nikolay
    The steadily growing research interest in quantum computing, together with the accompanying technological advances in the realization of quantum hardware, fuels the development of meaningful real-world applications as well as implementations of well-known quantum algorithms. One of the most prominent examples to date is Grover's algorithm, which can be used for efficient search in unstructured databases. Quantum oracles, which are frequently treated as black boxes, play an important role in Grover's algorithm; hence, the automatic generation of oracles is of paramount importance. Moreover, the automatic generation of the corresponding circuits for a Grover quantum oracle is deeply linked to the synthesis of reversible quantum logic, which, despite numerous advances in the field, remains a challenge in terms of synthesizing efficient and scalable circuits for complex Boolean functions. In this paper, we present a flexible method for automatically encoding unstructured databases into oracles, which can then be efficiently searched with Grover's algorithm. Furthermore, we develop a tailor-made method for quantum logic synthesis, which vastly improves circuit complexity over other current approaches. Finally, we present another logic synthesis method that considers the requirements of scaling onto real-world backends. We compare our methods with other approaches by evaluating oracle generation for random databases and analyzing the resulting circuit complexities using various metrics.
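    To make the role of a generated oracle concrete, here is a small NumPy simulation (an editor's sketch, not the paper's synthesis method): the indices of the database entries that satisfy the query are encoded as a phase oracle, and standard Grover iterations amplify them.

```python
# Minimal sketch of Grover search over an "unstructured database": the entries
# matching the query are encoded into a phase oracle. Illustrative only; it
# simulates state vectors instead of synthesizing reversible circuits.
import numpy as np

def grover(marked, n):
    N = 2 ** n
    oracle = np.ones(N)
    oracle[list(marked)] = -1.0                  # phase flip on matching entries
    state = np.full(N, 1.0 / np.sqrt(N))         # uniform superposition
    iters = int(np.floor(np.pi / 4 * np.sqrt(N / len(marked))))
    for _ in range(iters):
        state = oracle * state                   # apply the generated oracle
        state = 2.0 * state.mean() - state       # diffusion: inversion about the mean
    return np.argsort(state ** 2)[::-1][: len(marked)]

# "Database" with two records matching the query, at indices 5 and 11.
print(grover({5, 11}, n=4))   # with high probability prints indices 5 and 11
```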
  • Publication
    Piveau: A large-scale open data management platform based on semantic web technologies
    (2020)
    Stefanidis, Kyriakos
    ;
    Urbanek, Sebastian
    The publication and (re)utilization of Open Data still face multiple barriers on technical, organizational and legal levels. These include limitations in interfaces, search capabilities, and the provision of quality information, as well as the lack of definite standards and implementation guidelines. Many Semantic Web specifications and technologies are specifically designed to address the publication of data on the web. In addition, many official publication bodies encourage and foster the development of Open Data standards based on Semantic Web principles. However, no existing solution for managing Open Data takes full advantage of these possibilities and benefits. In this paper, we present Piveau, a fully-fledged Open Data management solution based on Semantic Web technologies. It harnesses a variety of standards, like RDF, DCAT, DQV, and SKOS, to overcome the barriers in Open Data publication. The solution puts a strong focus on assuring data quality and scalability. We give a detailed description of the underlying, highly scalable, service-oriented architecture, explain how we integrated the aforementioned standards, and describe how we use a triplestore as our primary database. We have evaluated our work in a comprehensive feature comparison with established solutions and through a practical application in a production environment, the European Data Portal. Our solution is available as Open Source.
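    As a flavour of what DCAT-based dataset descriptions look like when stored in a triplestore, the snippet below assembles a minimal catalogue record with rdflib; the dataset URIs and literal values are invented for the example and are not taken from Piveau.

```python
# Minimal DCAT-style dataset record, as Piveau-like platforms store it in a
# triplestore. The URIs and values are placeholders for illustration.
from rdflib import Graph, Namespace, Literal, URIRef
from rdflib.namespace import DCTERMS, RDF

DCAT = Namespace("http://www.w3.org/ns/dcat#")

g = Graph()
g.bind("dcat", DCAT)
g.bind("dct", DCTERMS)

ds = URIRef("http://example.org/dataset/air-quality-2020")
g.add((ds, RDF.type, DCAT.Dataset))
g.add((ds, DCTERMS.title, Literal("Air quality measurements 2020", lang="en")))
g.add((ds, DCTERMS.publisher, URIRef("http://example.org/org/environment-agency")))

dist = URIRef("http://example.org/distribution/air-quality-2020-csv")
g.add((ds, DCAT.distribution, dist))
g.add((dist, RDF.type, DCAT.Distribution))
g.add((dist, DCAT.downloadURL, URIRef("http://example.org/files/air-quality-2020.csv")))

print(g.serialize(format="turtle"))
```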
  • Publication
    Pushing the scalability of RDF engines on IoT edge devices
    (2020)
    Le Tuan, Anh
    ;
    Hayes, Conor
    ;
    Le-Phuoc, Danh
    Semantic interoperability for the Internet of Things (IoT) is enabled by standards and technologies from the Semantic Web. As recent research suggests a move towards decentralised IoT architectures, we have investigated the scalability and robustness of RDF (Resource Description Framework) engines that can be embedded throughout the architecture, in particular at edge nodes. RDF processing at the edge facilitates the deployment of semantic integration gateways closer to low-level devices. Our focus is on how to enable scalable and robust RDF engines that can operate on lightweight devices. In this paper, we first carry out an empirical study of the scalability and behaviour of solutions for RDF data management on standard computing hardware that have been ported to run on lightweight devices at the network edge. The findings of our study show that these RDF store solutions have several shortcomings on commodity ARM (Advanced RISC Machine) boards that are representative of IoT edge node hardware. Consequently, this has inspired us to introduce RDF4Led, a lightweight RDF engine comprising an RDF store and a SPARQL processor for lightweight edge devices. RDF4Led follows the RISC-style (Reduced Instruction Set Computer) design philosophy. The design constitutes a flash-aware storage structure, an indexing scheme, an alternative buffer management technique and a low-memory-footprint join algorithm that demonstrate improved scalability and robustness over competing solutions. With a significantly smaller memory footprint, we show that RDF4Led can handle 2 to 5 times more data than popular RDF engines such as Jena TDB (Tuple Database) and RDF4J, while consuming the same amount of memory. In particular, RDF4Led requires only 10%-30% of the memory of its competitors to operate on datasets of up to 50 million triples. On memory-constrained ARM boards, it can perform faster updates and scale better than Jena TDB and Virtuoso. Furthermore, we demonstrate considerably faster query operations than Jena TDB and RDF4J.
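    The low-memory join over triple patterns can be gestured at with an in-memory toy: bindings from two patterns are streamed in sorted order and merge-joined on the shared variable, touching each input only once. This is an editor's sketch of the general idea, not RDF4Led's actual storage structures or join algorithm.

```python
# Toy sketch of joining two triple patterns over a sorted triple list, in the
# spirit of low-memory-footprint RDF joins on edge devices. Not RDF4Led code.

triples = sorted([
    ("sensor1", "type", "TempSensor"),
    ("sensor1", "hasReading", "r1"),
    ("sensor2", "type", "TempSensor"),
    ("sensor2", "hasReading", "r2"),
    ("r1", "value", "21.5"),
    ("r2", "value", "22.1"),
])

def scan(pred, obj=None):
    """Stream (subject, object) bindings for one triple pattern."""
    for s, p, o in triples:
        if p == pred and (obj is None or o == obj):
            yield s, o

# Pattern 1: ?s type TempSensor        Pattern 2: ?s hasReading ?r
sensors = sorted(s for s, _ in scan("type", "TempSensor"))
readings = sorted(scan("hasReading"))           # sorted by subject

# Merge join on the shared variable ?s.
results, i, j = [], 0, 0
while i < len(sensors) and j < len(readings):
    if sensors[i] == readings[j][0]:
        results.append((sensors[i], readings[j][1]))
        j += 1
    elif sensors[i] < readings[j][0]:
        i += 1
    else:
        j += 1
print(results)   # [('sensor1', 'r1'), ('sensor2', 'r2')]
```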
  • Publication
    A fault-tolerant protocol to enable distributed state machines using IEEE802.11p
    (2020)
    Sawade, Oliver
    Autonomous vehicles promise a revolution in mobility, enabling safe, resource-efficient urban and inter-urban transport with a high degree of user convenience. To achieve optimal efficiency, autonomous vehicles must be viewed as a network of communicating cyber-physical systems which exchange information to optimize a utility function under strict security and safety requirements. Vehicles can exchange information to extend their perception horizon, exchange driving modes to enhance scene understanding and, most importantly, cooperate directly with other automated and autonomous vehicles in cooperative driving maneuvers such as platooning. In this paper, we present a novel communication protocol built on the current vehicular communication standard IEEE 802.11p, which enables the negotiation and execution of cooperative driving maneuvers based on distributed state machines. The main goal of this protocol is to achieve a common synchronized state and common state transitions while supporting fault tolerance and self-supervision under security and safety constraints. This paper presents the Collaborative Maneuver Protocol (CMP) and provides a formal proof of correctness. We furthermore present an application in a platooning function and provide an evaluation of robustness with regard to packet loss.
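    A distributed state machine of the kind the protocol synchronizes can be sketched very compactly: each vehicle applies the same transition table and only advances once all partners have acknowledged the same target state. The snippet is an editor's illustration of this general mechanism; the states, events, and quorum rule are assumptions, not the CMP specification.

```python
# Toy sketch of a synchronized distributed state machine for a cooperative
# maneuver. States, transition table, and quorum rule are illustrative only.

TRANSITIONS = {
    ("IDLE", "request"): "NEGOTIATING",
    ("NEGOTIATING", "all_ack"): "EXECUTING",
    ("NEGOTIATING", "timeout"): "IDLE",       # fault tolerance: fall back safely
    ("EXECUTING", "complete"): "IDLE",
    ("EXECUTING", "abort"): "IDLE",
}

class ManeuverFSM:
    def __init__(self, vehicle_id, partners):
        self.vehicle_id = vehicle_id
        self.partners = set(partners)
        self.state = "IDLE"
        self.acks = set()

    def on_message(self, sender, event):
        """Apply an event; only move to EXECUTING once every partner has acknowledged."""
        if event == "ack":
            self.acks.add(sender)
            event = "all_ack" if self.acks == self.partners else None
        next_state = TRANSITIONS.get((self.state, event))
        if next_state:
            self.state = next_state
        return self.state

fsm = ManeuverFSM("ego", partners={"v1", "v2"})
fsm.on_message("ego", "request")          # -> NEGOTIATING
fsm.on_message("v1", "ack")               # still NEGOTIATING
print(fsm.on_message("v2", "ack"))        # -> EXECUTING
```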
  • Publication
    A provenance meta learning framework for missing data handling methods selection
    (2020)
    Liu, Qian
    Missing data is a big problem in many real-world data sets and applications; it can lead to wrong or misleading analysis results and lower the quality of and confidence in those results. A large number of missing data handling methods have been proposed in the research community, but there exists no single universally best method that can handle all missing data problems. Selecting the right method for a specific missing data handling problem usually depends on multiple intertwined factors. To alleviate this method selection problem, in this paper we propose a Provenance Meta Learning Framework to simplify the process. We conducted an extensive literature review of 118 survey papers on missing data handling methods from 2000 to 2019. Based on this review, we analyse 9 influential factors and 12 selection criteria for missing data handling methods and further perform a detailed analysis of 6 popular missing data handling methods (4 machine learning methods, i.e., KNN Imputation (KNNI), Weighted KNN Imputation (WKNNI), K-Means Imputation (KMI), and Fuzzy KMI (FKMI), and 2 ad-hoc methods, i.e., Median/Mode Imputation (MMI) and Group/Class MMI (CMMI)). We focus on missing data handling method selection for 3 different classification techniques, i.e., C4.5, KNN, and RIPPER. In our evaluations, we adopt 25 real-world data sets from the KEEL and UCI data set repositories. Our Provenance Meta Learning Framework suggests using KNNI to handle missing values when the missing data mechanism is Missing Completely At Random (MCAR), the missing data pattern is uni-attribute or monotone, the missing data rate is within [1%, 5%], the number of class labels is 2, and the sample size is no more than 10,000, since under these conditions KNNI maintains better classification performance and achieves higher imputation accuracy and imputation exhaustiveness than the other 5 missing data handling methods when the subsequent classification method is KNN or RIPPER.
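    For readers who want to try the recommended KNN imputation on their own data, a minimal sketch with scikit-learn's KNNImputer follows; the toy data and parameter choices are the editor's, not the study's settings.

```python
# Minimal KNN imputation (KNNI) example with scikit-learn. Toy data only;
# parameters such as n_neighbors are illustrative, not from the study.
import numpy as np
from sklearn.impute import KNNImputer

X = np.array([
    [1.0, 2.0, np.nan],
    [3.0, 4.0, 3.0],
    [np.nan, 6.0, 5.0],
    [8.0, 8.0, 7.0],
])

imputer = KNNImputer(n_neighbors=2)   # each missing value filled from the 2 nearest rows
X_imputed = imputer.fit_transform(X)
print(X_imputed)
```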
  • Publication
    Quantum DevOps: Towards reliable and applicable NISQ Quantum Computing
    Quantum Computing is emerging as one of the great hopes for boosting current computational resources and enabling the application of ICT for optimizing processes and solving complex and challenging domain-specific problems. However, Quantum Computing technology has not yet matured to a level where it can provide a clear advantage over high-performance computing. Towards achieving this "quantum advantage", a larger number of Qubits is required, leading inevitably to a more complex topology of the computing Qubits. This raises additional difficulties with decoherence times and implies higher Qubit error rates. Nevertheless, the current Noisy Intermediate-Scale Quantum (NISQ) computers can prove useful despite the intrinsic uncertainties on the quantum hardware layer. In order to utilize such error-prone computing resources, various concepts are required to address Qubit errors and to deliver successful computations. In this paper, we describe and motivate the need for the novel concept of Quantum DevOps, which entails regularly checking the reliability of NISQ Quantum Computing (QC) instances. By testing the computational reliability of basic quantum gates and computations (C-NOT, Hadamard, etc.), it estimates the likelihood that a large-scale critical computation (e.g. calculating hourly traffic flow models for a city) will provide results of sufficient quality. Following this approach to select the best-matching (cloud) QC instance, and integrating it directly with the processes of developing, testing and finally operating quantum-based algorithms and systems, realizes the Quantum DevOps concept.
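    The gate-level reliability check that drives the backend selection can be sketched abstractly: run a small calibration job many times on each candidate instance, estimate the success rate, and pick the most reliable one. The snippet below simulates the backends with assumed error probabilities, since the real cloud QC APIs are out of scope here; all names are hypothetical.

```python
# Sketch of the Quantum DevOps idea: probe candidate QC instances with a tiny
# calibration job and pick the most reliable one. Backends are simulated with
# assumed error rates; in practice the job would run on real cloud QPUs.
import random

def run_calibration(error_rate, shots=1000):
    """Simulate a CNOT identity test: fraction of shots returning the expected result."""
    ok = sum(1 for _ in range(shots) if random.random() > error_rate)
    return ok / shots

CANDIDATE_BACKENDS = {        # hypothetical instances with assumed error rates
    "qpu_a": 0.02,
    "qpu_b": 0.08,
    "simulator": 0.001,
}

def select_backend(min_success=0.95):
    """Return the instance whose measured success rate is highest and above threshold."""
    scores = {name: run_calibration(err) for name, err in CANDIDATE_BACKENDS.items()}
    best = max(scores, key=scores.get)
    return (best, scores[best]) if scores[best] >= min_success else (None, scores)

print(select_backend())   # e.g. ('simulator', 0.999)
```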