Here you will find scientific publications from the Fraunhofer Institutes.

CLEP: A hybrid data- and knowledge-driven framework for generating patient representations

Bharadhwaj, Vinay Srinivas; Ali, Mehdi; Birkenbihl, Colin; Mubeen, Sarah; Lehmann, Jens; Hofmann-Apitius, Martin; Hoyt, Charles Tapley; Domingo-Fernández, Daniel

Fulltext urn:nbn:de:0011-n-6360674 (3.5 MByte PDF)
MD5 Fingerprint: cc16041aed853261054d02b99262d8a9
CC BY
Created on: 18.6.2021

Bioinformatics (2021), Online First, Art. btab340, 8 pp.
ISSN: 1367-4803
ISSN: 1460-2059
ISSN: 1367-4811
Bundesministerium für Bildung und Forschung BMBF (Germany)
01IS18050D; MLWin
Journal Article, Electronic Publication
Fraunhofer SCAI
Fraunhofer IAIS
Knowledge Graphs; machine learning; knowledge graph embeddings; network biology; precision medicine

As machine learning and artificial intelligence find a growing number of applications in the biomedical domain, their utility ultimately depends on the data used to train them. Because biomedical data are complex and high-dimensional, there is a need for approaches that combine prior knowledge of known biological interactions with patient data. Here, we present CLinical Embedding of Patients (CLEP), a novel approach that generates new patient representations by leveraging both prior knowledge and patient-level data. First, given a patient-level dataset and a knowledge graph containing relations across features that can be mapped to the dataset, CLEP incorporates patients into the knowledge graph as new nodes connected to their most characteristic features. Next, CLEP employs knowledge graph embedding models to generate new patient representations that can ultimately be used for a variety of downstream tasks, ranging from clustering to classification. We demonstrate that, for a variety of machine learning models, the patient representations generated by CLEP significantly improve performance in classifying patients versus healthy controls compared with the original transcriptomics data. Furthermore, we show how incorporating patients into a knowledge graph can foster the interpretation and identification of biological features characteristic of a specific disease or patient subgroup. Finally, we have released CLEP as an open source Python package together with examples and documentation.
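
The first step described in the abstract, adding patients to the knowledge graph as nodes linked to their most characteristic features, can be illustrated with a minimal sketch. This is not the CLEP package's API: the function name `patient_edges`, the relation labels `up` and `down`, the z-score rule against healthy controls, and the threshold of 2.0 are all assumptions made for illustration, since the abstract does not specify how "most characteristic" is determined. The gene names and values are toy data.

```python
from statistics import mean, stdev


def patient_edges(patients, controls, threshold=2.0):
    """Return (head, relation, tail) triples linking patients to features.

    Hypothetical rule (an assumption, not taken from the source): a feature
    is "characteristic" of a patient when its value lies at least
    `threshold` standard deviations away from the mean of the controls.

    patients: dict mapping patient id -> dict of feature -> value
    controls: list of dicts of feature -> value (healthy reference samples)
    """
    features = sorted(controls[0])

    # Per-feature mean and sample standard deviation over the controls.
    stats = {}
    for feature in features:
        values = [sample[feature] for sample in controls]
        stats[feature] = (mean(values), stdev(values))

    triples = []
    for patient_id, values in patients.items():
        for feature in features:
            mu, sigma = stats[feature]
            if sigma == 0:
                continue  # no variation in controls, z-score undefined
            z = (values[feature] - mu) / sigma
            if z >= threshold:
                triples.append((patient_id, "up", feature))
            elif z <= -threshold:
                triples.append((patient_id, "down", feature))
    return triples


# Toy prior-knowledge graph over features (illustrative names only).
kg = [
    ("GENE_A", "interacts_with", "GENE_B"),
    ("GENE_B", "regulates", "GENE_C"),
]
controls = [
    {"GENE_A": 1.0, "GENE_B": 5.0, "GENE_C": 3.0},
    {"GENE_A": 1.2, "GENE_B": 5.2, "GENE_C": 3.1},
    {"GENE_A": 0.8, "GENE_B": 4.8, "GENE_C": 2.9},
]
patients = {"patient_1": {"GENE_A": 3.0, "GENE_B": 5.1, "GENE_C": 1.0}}

# The extended graph now contains the patient as a node.
extended_kg = kg + patient_edges(patients, controls)
```

In this toy example, `patient_1` is linked to `GENE_A` and `GENE_C` but not to `GENE_B`, whose value is close to the control mean. The resulting triple list is the kind of input a knowledge graph embedding model could be trained on to obtain one vector per patient node, which corresponds to the second step of the abstract and is not shown here.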