Here you will find scientific publications from the Fraunhofer Institutes.

Deep learning for clustering of multivariate clinical patient trajectories with missing values

Authors: Jong, Johann de; Emon, Mohammad Asif; Wu, Ping; Karki, Reagon; Sood, Meemansa; Godard, Patrice; Ahmad, Ashar; Vrooman, Henri; Hofmann-Apitius, Martin; Fröhlich, Holger

Fulltext urn:nbn:de:0011-n-5617517 (600 KByte PDF)
MD5 Fingerprint: a2ac1bcf864b6789862e0bffd81cbb33
License: CC BY
Created on: 23.10.2019

GigaScience 8 (2019), No.11, Art. giz134, 14 pp.
ISSN: 2047-217X
European Commission EC
Aetionomy - Organising Mechanistic Knowledge about Neurodegenerative Diseases for the Improvement of Drug Development and Therapy
Journal Article, Electronic Publication
Fraunhofer SCAI
patient stratification; deep learning; multivariate short time series; multivariate longitudinal data; clustering

Background: Precision medicine requires a stratification of patients by disease presentation that is sufficiently informative to allow for selecting treatments on a per-patient basis. For many diseases, such as neurological disorders, this stratification problem translates into a complex problem of clustering multivariate and relatively short time series, because (1) these diseases are multifactorial and not well described by single clinical outcome variables, and (2) disease progression needs to be monitored over time. Additionally, clinical data often suffer from the presence of many missing values, further complicating any clustering attempts.
Findings: The problem of clustering multivariate short time series with many missing values has so far not been well addressed in the literature. In this work, we propose a deep learning-based method to address this issue, variational deep embedding with recurrence (VaDER). VaDER relies on a Gaussian mixture variational autoencoder framework, which is further extended to (1) model multivariate time series and (2) directly deal with missing values. We validated VaDER by accurately recovering clusters from simulated and benchmark data with known ground truth clustering, while varying the degree of missingness. We then used VaDER to successfully stratify Alzheimer's disease (AD) patients and Parkinson's disease (PD) patients into subgroups characterized by clinically divergent disease progression profiles. Additional analyses demonstrated that these clinical differences reflected known underlying aspects of AD and PD.
Conclusions: We believe our results show that VaDER can be of great value for future efforts in patient stratification, and multivariate short time series clustering in general.
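
Illustration: The Findings state that the method deals with missing values directly. One common way to do this in an autoencoder is to compute the reconstruction error only over observed entries, so that missing measurements do not influence training. The sketch below shows that idea in NumPy. It is an assumption made for illustration, not the authors' implementation: the function name masked_mse, the array shapes, the roughly 30% missingness, and the stand-in reconstruction x_hat are all hypothetical.

```python
import numpy as np


def masked_mse(x, x_hat, observed):
    """Mean squared reconstruction error over observed entries only.

    x, x_hat : arrays of shape (patients, time_points, variables)
    observed : boolean array of the same shape, True where x was measured
    """
    observed = observed.astype(bool)
    n_observed = observed.sum()
    if n_observed == 0:
        return 0.0
    # Set the error to zero at missing positions so they contribute nothing.
    sq_err = np.where(observed, (x - x_hat) ** 2, 0.0)
    return float(sq_err.sum() / n_observed)


# Toy data (hypothetical): 4 patients, 5 visits, 3 clinical variables.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 5, 3))
observed = rng.random(x.shape) > 0.3        # roughly 30% of entries missing
x_missing = np.where(observed, x, np.nan)   # missing values stored as NaN

# Stand-in for a model's reconstruction; a real model would produce this.
x_hat = x + 0.1

# Replace NaN by 0 before the arithmetic; the mask makes the fill value irrelevant.
loss = masked_mse(np.nan_to_num(x_missing), x_hat, observed)
```

Because the mask removes missing positions from both the sum and the normalising count, the value used to fill them has no effect on the loss.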