2025
Journal Article
Title
Combining Machine Learning With Real-World Data to Identify Gaps in Clinical Practice Guidelines: Feasibility Study Using the Prospective German Stroke Registry and the National Acute Ischemic Stroke Guidelines
Abstract
Background: Clinical practice guidelines (CPGs) serve as essential tools for guiding clinicians in providing appropriate patient care. However, clinical practice does not always reflect CPGs. This is particularly critical in acute diseases requiring immediate treatment, such as acute ischemic stroke, one of the leading causes of morbidity and mortality worldwide. Adherence to CPGs improves patient outcomes, yet guidelines may not address all patient scenarios, resulting in variability in treatment decisions. Identifying such gaps would augment CPGs but is challenging when using traditional methods.
Objective: This study aims to leverage real-world data coupled with machine learning (ML) techniques to systematically identify and quantify gaps in German thrombolysis-in-stroke guidelines.
Methods: We analyzed observational data from the German Stroke Registry – Endovascular Treatment (GSR-ET), a prospective national registry involving 18,069 patients from 25 stroke centers in whom endovascular treatment of a large vessel occlusion was attempted between 2015 and 2023. Key variables included demographic, clinical, and imaging information, treatment details, and outcomes. A random forest model was used to predict intravenous thrombolysis treatment decisions based on three different sets of features: (1) guideline-recommended features, (2) clinician-selected features, and (3) features as documented in the GSR-ET before thrombolytic treatment. Feature importance scores, permutation importance, and Shapley Additive Explanations values were used, with clinician guidance, to interpret the model and identify key factors associated with guideline deviations and independent clinician judgments.
Results: Of the 18,069 GSR-ET patients, 13,440 (74.4%) were analyzed after excluding those with incomplete or implausible data. The random forest model’s performance, measured by area under the receiver operating characteristic curve, was 0.71 (95% CI 0.68‐0.73), 0.74 (95% CI 0.73‐0.75), and 0.77 (95% CI 0.76‐0.78) for the guideline-recommended, clinician-selected, and GSR-ET feature sets, respectively. Across all sets, time from symptom onset to admission was the most important predictor of thrombolysis treatment decisions. Age, which according to the German guidelines is not to be considered for thrombolysis administration, emerged as a significant predictor in the GSR-ET feature set, suggesting a potential gap between guidelines and clinical practice.
Conclusions: This study introduces an approach that combines real-world data with ML techniques to identify discrepancies between CPGs and actual clinical decision-making. Using intravenous thrombolysis in large vessel occlusion stroke as a model, our findings suggest that treatment decisions may be influenced by factors not explicitly included in the current German guideline, such as patient age and pre-stroke functional status. This approach may help uncover clinically relevant variables for potential inclusion in future guideline refinements.
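The workflow described in the Methods (fit a random forest on different feature sets, compare area under the ROC curve, then rank predictors by permutation importance) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic, the feature names (`onset_to_admission_h`, `age`, `noise`), the two feature sets, and the hyperparameters are hypothetical, and scikit-learn is used as a stand-in because the abstract does not name the software. It does not reproduce the GSR-ET analysis or its results.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000

# Synthetic, hypothetical features; these are NOT registry data.
columns = {
    "onset_to_admission_h": rng.uniform(0.5, 12.0, n),
    "age": rng.uniform(40.0, 95.0, n),
    "noise": rng.normal(size=n),  # irrelevant by construction
}

# Simulated treatment decision: driven mainly by time, and (by construction)
# also by age, mimicking a factor that a guideline-like feature set omits.
logit = 2.0 - 0.6 * columns["onset_to_admission_h"] - 0.04 * (columns["age"] - 70.0)
treated = rng.random(n) < 1.0 / (1.0 + np.exp(-logit))

# Two illustrative feature sets (the study used three).
feature_sets = {
    "guideline_like": ["onset_to_admission_h"],
    "registry_like": ["onset_to_admission_h", "age", "noise"],
}


def evaluate(names):
    """Fit a random forest on the named features; return test AUC and permutation importances."""
    X = np.column_stack([columns[c] for c in names])
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, treated, test_size=0.3, random_state=0, stratify=treated
    )
    model = RandomForestClassifier(
        n_estimators=200, min_samples_leaf=20, random_state=0
    ).fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    # Permutation importance: drop in held-out AUC when one feature is shuffled.
    perm = permutation_importance(
        model, X_te, y_te, scoring="roc_auc", n_repeats=10, random_state=0
    )
    return auc, dict(zip(names, perm.importances_mean))


results = {name: evaluate(names) for name, names in feature_sets.items()}

for name, (auc, importances) in results.items():
    print(name, round(auc, 3), {k: round(v, 3) for k, v in importances.items()})
```

In this toy setup, a predictor that carries importance in the broader feature set but is absent from the guideline-like set plays the role that age plays in the study: a candidate signal of a gap between guideline and practice, to be interpreted with clinician input.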
Author(s)