  • Publication
    Benchmarking table recognition performance on biomedical literature on neurological disorders
    Table recognition systems are widely used to extract and structure quantitative information from the vast amount of documents that are increasingly available from different open sources. While many systems already perform well on tables with a simple layout, tables in the biomedical domain are often much more complex. Benchmark and training data for such tables are however very limited. To address this issue, we present a novel, highly curated benchmark dataset based on a hand-curated literature corpus on neurological disorders, which can be used to tune and evaluate table extraction applications for this challenging domain. We evaluate several state-of-the-art table extraction systems based on our proposed benchmark and discuss challenges that emerged during the benchmark creation as well as factors that can impact the performance of recognition methods. For the evaluation procedure, we propose a new metric as well as several improvements that result in a better performance evaluation. The resulting benchmark dataset (https://zenodo.org/record/5549977) as well as the source code to our novel evaluation approach can be openly accessed. Supplementary data are available at Bioinformatics online.
  • Publication
    Quantum Machine Learning. Eine Analyse zu Kompetenz, Forschung und Anwendung (Quantum Machine Learning: An Analysis of Competence, Research, and Application)
    In our study "Quantum Machine Learning", we provide insight into quantum computing, explain which physical effects play a role, and show how they are used to accelerate machine learning methods. In addition to the logical components, we present techniques for implementing quantum computer hardware. The study also gives an overview of the current research and competence landscape and assesses Germany's position in international competition. It further presents concrete application areas and market potential for various industries, because in the coming years companies from different sectors will face the challenge of developing new market and business potential with the help of quantum computing in order to increase their value creation. With this study, we aim to offer orientation to stakeholders from business, science, and society and to highlight the potential that is already visible today and will find use in companies in the future.
  • Publication
    Informed Machine Learning - A Taxonomy and Survey of Integrating Knowledge into Learning Systems
    Despite its great success, machine learning can have its limits when dealing with insufficient training data. A potential solution is the additional integration of prior knowledge into the training process, which leads to the notion of informed machine learning. In this paper, we present a structured overview of various approaches in this field. First, we provide a definition and propose a concept for informed machine learning, which illustrates its building blocks and distinguishes it from conventional machine learning. Second, we introduce a taxonomy that serves as a classification framework for informed machine learning approaches. It considers the source of knowledge, its representation, and its integration into the machine learning pipeline. Third, we survey related research and describe how different knowledge representations such as algebraic equations, logic rules, or simulation results can be used in learning systems. This evaluation of numerous papers on the basis of our taxonomy uncovers key methods in the field of informed machine learning.
  • Publication
    BioKEEN: a library for learning and evaluating biological knowledge graph embeddings
    (2019)
    Ali, M.; Hoyt, C.T.; Domingo-Fernandez, D.; Lehmann, J.; Jabeen, H.
    Knowledge graph embeddings (KGEs) have received significant attention in other domains due to their ability to predict links and create dense representations for graphs' nodes and edges. However, the software ecosystem for their application to bioinformatics remains limited and inaccessible for users without expertise in programming and machine learning. Therefore, we developed BioKEEN (Biological KnowlEdge EmbeddiNgs) and PyKEEN (Python KnowlEdge EmbeddiNgs) to facilitate their easy use through an interactive command line interface. Finally, we present a case study in which we used a novel biological pathway mapping resource to predict links that represent pathway crosstalks and hierarchies.
  • Publication
    The KEEN Universe. An Ecosystem for Knowledge Graph Embeddings with a Focus on Reproducibility and Transferability
    (2019)
    Jabeen, Hajira; Hoyt, Charles Tapley
    There is an emerging trend of embedding knowledge graphs (KGs) in continuous vector spaces in order to use those for machine learning tasks. Recently, many knowledge graph embedding (KGE) models have been proposed that learn low-dimensional representations while trying to maintain the structural properties of the KGs such as the similarity of nodes depending on their edges to other nodes. KGEs can be used to address tasks within KGs such as the prediction of novel links and the disambiguation of entities. They can also be used for downstream tasks like question answering and fact-checking. Overall, these tasks are relevant for the semantic web community. Despite their popularity, the reproducibility of KGE experiments and the transferability of proposed KGE models to research fields outside the machine learning community can be a major challenge. Therefore, we present the KEEN Universe, an ecosystem for knowledge graph embeddings that we have developed with a strong focus on reproducibility and transferability. The KEEN Universe currently consists of the Python packages PyKEEN (Python KnowlEdge EmbeddiNgs), BioKEEN (Biological KnowlEdge EmbeddiNgs), and the KEEN Model Zoo for sharing trained KGE models with the community.
  • Publication
    RatVec: A General Approach for Low-dimensional Distributed Vector Representations via Rational Kernels
    (2019)
    Brito, Eduardo; Domingo-Fernández, Daniel; Hoyt, Charles Tapley
    We present a general framework, RatVec, for learning vector representations of non-numeric entities based on domain-specific similarity functions interpreted as rational kernels. We show competitive performance using k-nearest neighbors in the protein family classification task and in Dutch spelling correction. To promote re-usability and extensibility, we have made our code and pre-trained models available at https://github.com/ratvec.
  • Publication
    Kognitive Systeme und Robotik (Cognitive Systems and Robotics)
    Cognitive systems can monitor and analyze complex processes and thereby also gain the ability to make the right decisions in unplanned or unfamiliar situations. Fraunhofer experts use machine learning methods to harness new cognitive functions for robots and automation solutions. To this end, they equip systems with technologies that are inspired by human abilities or that imitate and optimize them. The report describes these technologies, explains current application examples, and outlines scenarios for future fields of application.