Here you will find scientific publications from the Fraunhofer Institutes.

PathME: Pathway based multi-modal sparse autoencoders for clustering of patient-level multi-omics data

Authors: Lemsara, Amina; Ouadfel, Salima; Fröhlich, Holger

Full text urn:nbn:de:0011-n-5903016 (1.2 MB PDF)
MD5 Fingerprint: 06b90c9539b6552ab5f57c72a00a77c1
License: CC BY
Created on: 29 May 2020

BMC Bioinformatics. Online journal 21 (2020), Art. 146, 20 pp.
ISSN: 1471-2105
Journal article, electronic publication
Fraunhofer SCAI
precision medicine; artificial intelligence; biomarkers

Recent years have witnessed an increasing interest in multi-omics data, because these data allow a better understanding of complex diseases such as cancer at the molecular systems level. In addition, multi-omics data increase the chance of robustly identifying molecular patient subgroups and hence open the door to better personalized treatment of diseases. Several methods have been proposed for unsupervised clustering of multi-omics data. However, a number of challenges remain, such as the large number of features and the large difference in dimensionality across omics data sources.
We propose a multi-modal sparse denoising autoencoder framework coupled with sparse non-negative matrix factorization to robustly cluster patients based on multi-omics data. The proposed model specifically leverages pathway information to reduce the dimensionality of omics data to a pathway- and patient-specific score profile. Consequently, our method allows us to understand which pathway is characteristic of which patient cluster. Moreover, recently proposed machine learning techniques allow us to disentangle the specific impact of each individual omics feature on a pathway score. We applied our method to cluster patients in several cancer datasets using gene expression, miRNA expression, DNA methylation and copy number variations (CNVs), demonstrating that biologically plausible disease subtypes characterized by specific molecular features can be obtained. Comparison against several competing methods showed competitive clustering performance. In addition, post-hoc analysis of somatic mutations and clinical data provided supporting evidence for, and interpretation of, the identified clusters.
Our suggested multi-modal sparse denoising autoencoder approach allows for an effective and interpretable integration of multi-omics data at the pathway level while addressing the high dimensionality of omics data. Patient-specific pathway score profiles derived from our model allow for a robust identification of disease subgroups.
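
The clustering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a non-negative patient-by-pathway score matrix has already been produced (in the paper, by the pathway-level autoencoders), and it swaps in a plain multiplicative-update non-negative matrix factorization with an L1 penalty on the patient factor as a stand-in for the sparse NMF variant used in the paper. The function name `sparse_nmf`, the penalty weight `l1`, and the toy data are hypothetical and chosen for illustration only.

```python
import numpy as np


def sparse_nmf(V, k, l1=0.1, n_iter=500, seed=0):
    """Approximate V (patients x pathways, non-negative) as W @ H.

    Illustrative only: multiplicative updates for the squared error
    0.5 * ||V - W H||^2 plus an L1 penalty l1 * sum(W).
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, k)) + 0.1  # patient loadings on k latent factors
    H = rng.random((k, m)) + 0.1  # pathway profile of each factor
    eps = 1e-9
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        # The constant l1 in the denominator shrinks W towards zero.
        W *= (V @ H.T) / (W @ H @ H.T + l1 + eps)
        # Fix the scale: unit-norm rows of H, compensated in W.
        norms = np.linalg.norm(H, axis=1, keepdims=True) + eps
        H /= norms
        W *= norms.T
    return W, H


# Hypothetical toy scores: 10 patients, 6 pathways, two planted subgroups.
rng = np.random.default_rng(42)
high = lambda: rng.uniform(0.8, 1.0, (5, 3))
low = lambda: rng.uniform(0.0, 0.1, (5, 3))
scores = np.vstack([np.hstack([high(), low()]),
                    np.hstack([low(), high()])])

W, H = sparse_nmf(scores, k=2)
labels = W.argmax(axis=1)           # cluster = factor with the largest loading
top_pathways = H.argsort(axis=1)[:, ::-1][:, :3]  # strongest pathways per factor
print(labels)
print(top_pathways)
```

In this sketch, each patient is assigned to the factor with the largest loading in `W`, and the rows of `H` indicate which pathways drive each factor, which mirrors the idea of linking pathways to patient clusters.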