Self-Service Data Preprocessing and Cohort Analysis for Medical Researchers
Medical researchers are increasingly interested in data-driven approaches to support informed decisions in many medical areas. They collect data about the patients they treat, often creating their own specialized data tables with more characteristics than are defined in their clinical information system (CIS). These data tables, or sEHR (small electronic health records), are usually rather small, typically containing the data of only hundreds of patients. Medical researchers struggle to find an easy way to first clean and transform these sEHR, and then to create cohorts and perform confirmatory or exploratory analysis. This paper introduces a methodology and identifies requirements for building systems for self-service data preprocessing and cohort analysis for medical researchers. We also describe a system based on this methodology and these requirements that demonstrates the benefits of our approach. We further illustrate these benefits with an example scenario from our projects with clinicians specializing in head and neck cancer treatment.