  • Publication
    A hybrid approach unveils drug repurposing candidates targeting an Alzheimer pathophysiology mechanism
    (2022)
    Lage-Rupprecht, Vanessa; Dick, Justus; Gebel, Stephan; Pless, Ole; Reinshagen, Jeanette
    The high number of failed pre-clinical and clinical studies for compounds targeting Alzheimer disease (AD) has demonstrated that there is a need to reassess existing strategies. Here, we pursue a holistic, mechanism-centric drug repurposing approach combining computational analytics and experimental screening data. Based on this integrative workflow, we identified 77 druggable modifiers of tau phosphorylation (pTau). One of the upstream modulators of pTau, HDAC6, was screened with 5,632 drugs in a tau-specific assay, resulting in the identification of 20 repurposing candidates. Four compounds and their known targets were found to have a link to AD-specific genes. Our approach can be applied to a variety of AD-associated pathophysiological mechanisms to identify more repurposing candidates.
  • Publication
    Benchmarking table recognition performance on biomedical literature on neurological disorders
    Table recognition systems are widely used to extract and structure quantitative information from the vast amount of documents that are increasingly available from different open sources. While many systems already perform well on tables with a simple layout, tables in the biomedical domain are often much more complex. Benchmark and training data for such tables are however very limited. To address this issue, we present a novel, highly curated benchmark dataset based on a hand-curated literature corpus on neurological disorders, which can be used to tune and evaluate table extraction applications for this challenging domain. We evaluate several state-of-the-art table extraction systems based on our proposed benchmark and discuss challenges that emerged during the benchmark creation as well as factors that can impact the performance of recognition methods. For the evaluation procedure, we propose a new metric as well as several improvements that result in a better performance evaluation. The resulting benchmark dataset (https://zenodo.org/record/5549977) as well as the source code to our novel evaluation approach can be openly accessed. Supplementary data are available at Bioinformatics online.
  • Publication
    CLEP: A hybrid data- and knowledge-driven framework for generating patient representations
    (2021-05-08)
    Ali, Mehdi; Hoyt, Charles Tapley; Domingo-Fernández, Daniel
    As machine learning and artificial intelligence increasingly attain a larger number of applications in the biomedical domain, at their core, their utility depends on the data used to train them. Due to the complexity and high dimensionality of biomedical data, there is a need for approaches that combine prior knowledge around known biological interactions with patient data. Here, we present CLinical Embedding of Patients (CLEP), a novel approach that generates new patient representations by leveraging both prior knowledge and patient-level data. First, given a patient-level dataset and a knowledge graph containing relations across features that can be mapped to the dataset, CLEP incorporates patients into the knowledge graph as new nodes connected to their most characteristic features. Next, CLEP employs knowledge graph embedding models to generate new patient representations that can ultimately be used for a variety of downstream tasks, ranging from clustering to classification. We demonstrate how using new patient representations generated by CLEP significantly improves performance in classifying between patients and healthy controls for a variety of machine learning models, as compared to the use of the original transcriptomics data. Furthermore, we also show how incorporating patients into a knowledge graph can foster the interpretation and identification of biological features characteristic of a specific disease or patient subgroup. Finally, we released CLEP as an open source Python package together with examples and documentation.
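The first CLEP step described above (adding patients to the knowledge graph as nodes linked to their most characteristic features) can be illustrated with a minimal sketch. It assumes that "most characteristic" means a value more than a chosen number of standard deviations from the cohort mean; that criterion, the function name `patient_edges`, the relation names, and the toy data are all hypothetical and are not taken from the CLEP package.

```python
from statistics import mean, pstdev


def patient_edges(data, threshold=1.0):
    """Return (patient, relation, feature) triples for extreme feature values.

    `data` maps patient id -> {feature: value}. A patient is linked to a
    feature when its value lies more than `threshold` standard deviations
    from the cohort mean (an assumed criterion, not the published one).
    """
    features = sorted({f for values in data.values() for f in values})
    triples = []
    for feature in features:
        column = [v[feature] for v in data.values() if feature in v]
        mu, sigma = mean(column), pstdev(column)
        if sigma == 0:
            continue  # a constant feature cannot characterize anyone
        for patient, values in data.items():
            if feature not in values:
                continue
            z = (values[feature] - mu) / sigma
            if z > threshold:
                triples.append((patient, "positively_characterized_by", feature))
            elif z < -threshold:
                triples.append((patient, "negatively_characterized_by", feature))
    return triples


# Toy cohort and toy prior-knowledge graph (hypothetical values).
cohort = {
    "patient_1": {"GENE_A": 9.0, "GENE_B": 1.0},
    "patient_2": {"GENE_A": 1.0, "GENE_B": 1.2},
    "patient_3": {"GENE_A": 1.2, "GENE_B": 0.8},
    "patient_4": {"GENE_A": 0.8, "GENE_B": 9.0},
}
knowledge_graph = [("GENE_A", "interacts_with", "GENE_B")]
knowledge_graph += patient_edges(cohort)
for triple in knowledge_graph:
    print(triple)
```

The extended triple list is the kind of input a knowledge graph embedding model could then be trained on to obtain patient vectors; that second step is not shown here.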
  • Publication
    Combining Machine Learning and Simulation to a Hybrid Modelling Approach: Current and Future Directions
    In this paper, we describe the combination of machine learning and simulation towards a hybrid modelling approach. Such a combination of data-based and knowledge-based modelling is motivated by applications that are partly based on causal relationships, while other effects result from hidden dependencies that are represented in huge amounts of data. Our aim is to bridge the knowledge gap between the two individual communities from machine learning and simulation to promote the development of hybrid systems. We present a conceptual framework that helps to identify potential combined approaches and employ it to give a structured overview of different types of combinations using exemplary approaches of simulation-assisted machine learning and machine-learning assisted simulation. We also discuss an advanced pairing in the context of Industry 4.0 where we see particular further potential for hybrid systems.
  • Publication
    Quantum Machine Learning. Eine Analyse zu Kompetenz, Forschung und Anwendung
    In our study "Quantum Machine Learning", we provide an insight into quantum computing, explain which physical effects play a role, and show how these are used to accelerate machine learning methods. In addition to the logical components, techniques for implementing quantum computer hardware are presented. The study also gives an overview of the current research and competence landscape and assesses Germany's position in international competition. Furthermore, the study presents concrete application areas and market potential for various industries, because in the coming years companies from different sectors will face the challenge of developing new market and business potential with the help of quantum computing in order to increase their value creation. With this study, we aim to offer orientation to stakeholders from business, science, and society and to highlight the potential that is already visible today and will be put to use in companies in the future.
  • Publication
    Informed Machine Learning - A Taxonomy and Survey of Integrating Knowledge into Learning Systems
    Despite its great success, machine learning can have its limits when dealing with insufficient training data. A potential solution is the additional integration of prior knowledge into the training process, which leads to the notion of informed machine learning. In this paper, we present a structured overview of various approaches in this field. First, we provide a definition and propose a concept for informed machine learning, which illustrates its building blocks and distinguishes it from conventional machine learning. Second, we introduce a taxonomy that serves as a classification framework for informed machine learning approaches. It considers the source of knowledge, its representation, and its integration into the machine learning pipeline. Third, we survey related research and describe how different knowledge representations such as algebraic equations, logic rules, or simulation results can be used in learning systems. This evaluation of numerous papers on the basis of our taxonomy uncovers key methods in the field of informed machine learning.
  • Publication
    RatVec: A General Approach for Low-dimensional Distributed Vector Representations via Rational Kernels
    (2019)
    Brito, Eduardo; Domingo-Fernández, Daniel; Hoyt, Charles Tapley
    We present a general framework, RatVec, for learning vector representations of non-numeric entities based on domain-specific similarity functions interpreted as rational kernels. We show competitive performance using k-nearest neighbors in the protein family classification task and in Dutch spelling correction. To promote re-usability and extensibility, we have made our code and pre-trained models available at https://github.com/ratvec.
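The general idea of turning a similarity function over non-numeric entities into vectors and classifying with k-nearest neighbors can be sketched as follows. This is only an illustration under stated assumptions: it swaps in a character n-gram Jaccard similarity and represents each item by its similarities to a set of reference items, rather than reproducing the rational kernels or the projection used by RatVec. All function names and the toy sequences are hypothetical.

```python
from collections import Counter


def ngrams(text, n=2):
    """Set of character n-grams in a string."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def similarity(a, b, n=2):
    """Jaccard similarity of character n-gram sets (assumed stand-in kernel)."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    if not ga and not gb:
        return 1.0
    return len(ga & gb) / len(ga | gb)


def embed(item, references):
    """Represent an item by its similarity to each reference item."""
    return [similarity(item, ref) for ref in references]


def knn_predict(query, training, references, k=3):
    """Majority label among the k training items closest to the query."""
    query_vec = embed(query, references)

    def squared_distance(item):
        vec = embed(item, references)
        return sum((x - y) ** 2 for x, y in zip(query_vec, vec))

    nearest = sorted(training, key=lambda pair: squared_distance(pair[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy labelled sequences (hypothetical data).
training = [
    ("ACGTACGA", "family_A"),
    ("ACGTTCGT", "family_A"),
    ("TTTTGGGG", "family_B"),
    ("TTTGGGGG", "family_B"),
]
references = [sequence for sequence, _ in training]
print(knn_predict("ACGTACGT", training, references))
```

Any domain-specific similarity could replace `similarity` without changing the rest of the sketch, which is the property the abstract emphasizes.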
  • Publication
    The KEEN Universe. An Ecosystem for Knowledge Graph Embeddings with a Focus on Reproducibility and Transferability
    (2019)
    Jabeen, Hajira; Hoyt, Charles Tapley
    There is an emerging trend of embedding knowledge graphs (KGs) in continuous vector spaces in order to use those for machine learning tasks. Recently, many knowledge graph embedding (KGE) models have been proposed that learn low dimensional representations while trying to maintain the structural properties of the KGs such as the similarity of nodes depending on their edges to other nodes. KGEs can be used to address tasks within KGs such as the prediction of novel links and the disambiguation of entities. They can also be used for downstream tasks like question answering and fact-checking. Overall, these tasks are relevant for the semantic web community. Despite their popularity, the reproducibility of KGE experiments and the transferability of proposed KGE models to research fields outside the machine learning community can be a major challenge. Therefore, we present the KEEN Universe, an ecosystem for knowledge graph embeddings that we have developed with a strong focus on reproducibility and transferability. The KEEN Universe currently consists of the Python packages PyKEEN (Python KnowlEdge EmbeddiNgs), BioKEEN (Biological KnowlEdge EmbeddiNgs), and the KEEN Model Zoo for sharing trained KGE models with the community.
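As a rough illustration of how a knowledge graph embedding supports link prediction, the sketch below scores candidate tail entities with a TransE-style translational score (the negative distance between head + relation and tail). The embedding values are hand-picked rather than learned, the entity and relation names are hypothetical, and nothing here uses the PyKEEN or BioKEEN API.

```python
import math


def transe_score(head, relation, tail):
    """TransE-style score: negative Euclidean distance of head + relation to tail."""
    return -math.sqrt(
        sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail))
    )


def rank_tails(head, relation, entities, relations):
    """Rank all other entities as candidate tails, best score first."""
    scores = {
        name: transe_score(entities[head], relations[relation], vector)
        for name, vector in entities.items()
        if name != head
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hand-picked two-dimensional embeddings (hypothetical, not trained).
entities = {
    "gene_1": [0.0, 0.0],
    "pathway_1": [1.0, 0.0],
    "pathway_2": [0.0, 1.0],
}
relations = {"part_of": [1.0, 0.1]}

print(rank_tails("gene_1", "part_of", entities, relations))
```

In a trained model the vectors would be learned so that observed triples score higher than corrupted ones; the ranking step for predicting novel links stays the same.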
  • Publication
    Predicting Missing Links Using PyKEEN
    (2019)
    Hoyt, Charles Tapley; Domingo-Fernandez, Daniel
    PyKEEN is a framework that integrates several approaches to compute knowledge graph embeddings (KGEs). We demonstrate the usage of PyKEEN in a biomedical use case: we trained and evaluated several KGE models on a biological knowledge graph containing gene annotations to pathways and pathway hierarchies from well-known databases. We used the best-performing model to predict new links and present an evaluation in collaboration with a domain expert.
  • Publication
    BioKEEN: a library for learning and evaluating biological knowledge graph embeddings
    (2019)
    Ali, M.; Hoyt, C.T.; Domingo-Fernandez, D.; Lehmann, J.; Jabeen, H.
    Knowledge graph embeddings (KGEs) have received significant attention in other domains due to their ability to predict links and create dense representations for graphs' nodes and edges. However, the software ecosystem for their application to bioinformatics remains limited and inaccessible for users without expertise in programming and machine learning. Therefore, we developed BioKEEN (Biological KnowlEdge EmbeddiNgs) and PyKEEN (Python KnowlEdge EmbeddiNgs) to facilitate their easy use through an interactive command line interface. Finally, we present a case study in which we used a novel biological pathway mapping resource to predict links that represent pathway crosstalks and hierarchies.