November 7, 2025
Master Thesis
Title
Development and application of modern generative AI methods for emulating clinical studies
Abstract
Heterogeneous clinical datasets compiled from multiple trials present significant challenges for modeling and analyzing treatment trajectories, as patients often have irregular and individualized measurement time points. This variability complicates model training and raises questions about how such complex datasets can be effectively leveraged for reconstruction and simulation. Understanding patient trajectories and estimating treatment effects in this context is important for disease modeling, trial emulation, and the design of future interventions. This work involves training and extending the MultiNODEs model to reconstruct and simulate patient trajectories from the PRO-ACT database, which contains patients diagnosed with Amyotrophic Lateral Sclerosis (ALS). The model was extended to include a modular static encoder and a binary cross-entropy head with a monotonic transformation to model patient mortality. It is also used for counterfactual simulations based on real-world data in order to explore hypothetical scenarios and estimate potential treatment effects. MultiNODEs is a hybrid AI framework based on Neural Ordinary Differential Equations (NODEs) that integrates static and longitudinal patient information to generate realistic synthetic patient trajectories.
The results include reconstruction, simulation, synthetic patient generation, and counterfactual analyses. Categorical variables are captured very accurately in reconstruction and simulation, while continuous variables exhibit slightly more variability but still closely reflect observed trends. Synthetic patient trajectories capture continuous variables accurately and reflect observed trends, while categorical variables are less accurate. Counterfactual simulations show a similar pattern, with continuous variables reproduced robustly and categorical variables more difficult to model due to differences between real-world and clinical trial populations, which limit the number of patients with matching profiles.
Overall, this study demonstrates that reconstructing and simulating complex patient trajectories from heterogeneous datasets is feasible. It highlights the potential of MultiNODEs for predictive modeling and provides guidance for improving counterfactual predictions and extending the model’s applicability to diverse clinical datasets.
Thesis Note
Koblenz, Hochschule, Master Thesis, 2025
Author(s)
Advisor(s)
Valderrama Nino, Diego Felipe