  • Publication
    How Does Knowledge Injection Help in Informed Machine Learning?
    Informed machine learning describes the injection of prior knowledge into learning systems. It can help to improve generalization, especially when training data is scarce. However, the field is so application-driven that general analyses of the effect of knowledge injection are rare. This makes it difficult to transfer existing approaches to new applications, or to estimate potential improvements. Therefore, in this paper, we present a framework for quantifying the value of prior knowledge in informed machine learning. Our main contributions are threefold. Firstly, we propose a set of relevant metrics for quantifying the benefits of knowledge injection, comprising in-distribution accuracy, out-of-distribution robustness, and knowledge conformity. We also introduce a metric that combines performance improvement and data reduction. Secondly, we present a theoretical framework that represents prior knowledge in a function space and relates it to data representations and a trained model. This suggests that the distances between knowledge and data influence potential model improvements. Thirdly, we perform a systematic experimental study with controllable toy problems. All in all, this helps to find general answers to the question of how knowledge injection helps in informed machine learning.
  • Publication
    A machine learning method for the identification and characterization of novel COVID-19 drug targets
    (2023-05-03)
    Delong, Lauren Nicole; Masny, Aliaksandr; Lentzen, Manuel; Dijk, David van; Hansen, Anne Funck; Kannt, Aimo; Foldenauer, Ann Christina; Resch, Eduard; Frank, Kevin; Laue, Hendrik; Hirsch, Jochen; Wischnewski, Marco; Tom Kodamullil, Alpha; Gemünd, Andre; Fluck, Juliane; Steinborn, Carina; Hermanowski, Helena; Klein, Jürgen; Knieps, Meike; Wendland, Philipp Johannes; Wegner, Philipp; Lentzen, Manuel
    In addition to vaccines, the World Health Organization sees novel medications as an urgent matter to fight the ongoing COVID-19 pandemic. One possible strategy is to identify target proteins for which a perturbation by an existing compound is likely to benefit COVID-19 patients. In order to contribute to this effort, we present GuiltyTargets-COVID-19 (https://guiltytargets-covid.eu/), a machine-learning-supported web tool to identify novel candidate drug targets. Using six bulk and three single-cell RNA-Seq datasets, together with a lung-tissue-specific protein-protein interaction network, we demonstrate that GuiltyTargets-COVID-19 is capable of (i) prioritizing meaningful target candidates and assessing their druggability, (ii) unraveling their linkage to known disease mechanisms, (iii) mapping ligands from the ChEMBL database to the identified targets, and (iv) pointing out potential side effects in the case that the mapped ligands correspond to approved drugs. Our example analyses identified four potential drug targets from the datasets: AKT3 from both the bulk and single-cell RNA-Seq data as well as AKT2, MLKL, and MAPK11 in the single-cell experiments. Altogether, we believe that our web tool will facilitate future target identification and drug development for COVID-19, notably in a cell-type- and tissue-specific manner.
  • Publication
    Quantum Feature Selection
    (2022-03-24)
    Mücke, Sascha; Müller, Sabine
    In machine learning, fewer features reduce model complexity. Carefully assessing the influence of each input feature on the model quality is therefore a crucial preprocessing step. We propose a novel feature selection algorithm based on a quadratic unconstrained binary optimization (QUBO) problem, which allows selecting a specified number of features based on their importance and redundancy. In contrast to iterative or greedy methods, our direct approach yields higher-quality solutions. QUBO problems are particularly interesting because they can be solved on quantum hardware. To evaluate our proposed algorithm, we conduct a series of numerical experiments using a classical computer, a quantum gate computer, and a quantum annealer. Our evaluation compares our method to a range of standard methods on various benchmark datasets. We observe competitive performance.
  • Publication
    Benchmarking table recognition performance on biomedical literature on neurological disorders
    Table recognition systems are widely used to extract and structure quantitative information from the vast number of documents that are increasingly available from different open sources. While many systems already perform well on tables with a simple layout, tables in the biomedical domain are often much more complex. Benchmark and training data for such tables are, however, very limited. To address this issue, we present a novel, highly curated benchmark dataset based on a hand-curated literature corpus on neurological disorders, which can be used to tune and evaluate table extraction applications for this challenging domain. We evaluate several state-of-the-art table extraction systems based on our proposed benchmark and discuss challenges that emerged during the benchmark creation as well as factors that can impact the performance of recognition methods. For the evaluation procedure, we propose a new metric as well as several improvements that result in a better performance evaluation. The resulting benchmark dataset (https://zenodo.org/record/5549977) as well as the source code of our novel evaluation approach can be openly accessed. Supplementary data are available at Bioinformatics online.
  • Publication
    A hybrid approach for identifying drug repurposing candidates and their mechanisms
    (2022)
    Lage-Rupprecht, Vanessa
    Senior researcher Vanessa Lage-Rupprecht and two collaborators talk about what data science means to them and illustrate how they managed to create a data and lab coexistence in their drug-repurposing project, which was recently published in Patterns. In this article, they have developed a drug-target-mechanism-oriented data model, Human Brain PHARMACOME, and have presented it as a resource to the community.
  • Publication
    Quantum Machine Learning. Eine Analyse zu Kompetenz, Forschung und Anwendung
    In our study "Quantum Machine Learning", we give an insight into quantum computing, explain which physical effects play a role, and show how they are used to accelerate machine learning methods. In addition to the logical components, techniques for implementing the hardware of quantum computers are presented. The study also gives an overview of the current research and competence landscape and assesses Germany's position in international competition. Furthermore, the study presents concrete application areas and market potential for various industries. In the coming years, companies from different industries will face the challenge of developing new market and business potential with the help of quantum computing in order to increase their value creation. With this study, we want to offer orientation to actors from business, science, and society and to highlight the potential that is already visible today and will be put to use in companies in the future.
  • Publication
    Integration of Structured Biological Data Sources using Biological Expression Language
    (2019-05-08)
    Hoyt, Charles Tapley; Llaó, Josep Marin; Konotopez, Andrej; Muslu, Özlem; English, Bradley; Müller, Simon; Lacerda, Mauricio Pio De; Colby, Scott; Türei, Dénes; Palacio-Escat, Nicolàs
    Background: The integration of heterogeneous, multiscale, and multimodal knowledge and data has become a common prerequisite for joint analysis to unravel the mechanisms and aetiologies of complex diseases. Because of its unique ability to capture this variety, Biological Expression Language (BEL) is well suited to be further used as a platform for semantic integration and harmonization in networks and systems biology. Results: We have developed numerous independent packages capable of downloading, structuring, and serializing various biological data sources to BEL. Each Bio2BEL package is implemented in the Python programming language and distributed through GitHub (https://github.com/bio2bel) and PyPI. Conclusions: The philosophy of Bio2BEL encourages reproducibility, accessibility, and democratization of biological databases. We present several applications of Bio2BEL packages including their ability to support the curation of pathway mappings, integration of pathway databases, and machine learning applications.
  • Publication
    Informed Machine Learning - A Taxonomy and Survey of Integrating Knowledge into Learning Systems
    Despite its great success, machine learning can have its limits when dealing with insufficient training data. A potential solution is the additional integration of prior knowledge into the training process, which leads to the notion of informed machine learning. In this paper, we present a structured overview of various approaches in this field. First, we provide a definition and propose a concept for informed machine learning, which illustrates its building blocks and distinguishes it from conventional machine learning. Second, we introduce a taxonomy that serves as a classification framework for informed machine learning approaches. It considers the source of knowledge, its representation, and its integration into the machine learning pipeline. Third, we survey related research and describe how different knowledge representations such as algebraic equations, logic rules, or simulation results can be used in learning systems. This evaluation of numerous papers on the basis of our taxonomy uncovers key methods in the field of informed machine learning.
  • Publication
    Cognitive Systems and Robotics
    Cognitive systems are able to monitor and analyze complex processes, which also provides them with the ability to make the right decisions in unplanned or unfamiliar situations. Fraunhofer experts are employing machine learning techniques to harness new cognitive functions for robots and automation solutions. To do this, they are equipping systems with technologies that are inspired by human abilities, or imitate and optimize them. This report describes these technologies, illustrates current example applications, and lays out scenarios for future areas of application.
  • Publication
    RatVec: A General Approach for Low-dimensional Distributed Vector Representations via Rational Kernels
    (2019)
    Brito, Eduardo; Domingo-Fernández, Daniel; Hoyt, Charles Tapley
    We present a general framework, RatVec, for learning vector representations of non-numeric entities based on domain-specific similarity functions interpreted as rational kernels. We show competitive performance using k-nearest neighbors in the protein family classification task and in Dutch spelling correction. To promote re-usability and extensibility, we have made our code and pre-trained models available at https://github.com/ratvec.