Cooperative feature selection in personalized medicine
The chapter discusses a research support system for identifying patterns in diagnostic results that characterise pertinent patient groups for personalized medicine. Breast cancer serves as the example disease. The approach integrates established clinical findings with systems biology analyses, and in this respect it relates to both personalized medicine and translational research. Technically, the system is a computer-based support environment that links machine learning algorithms for classification with an interface for the medical domain expert. The clinician is involved for two reasons. On the one hand, the intention is to impart an in-depth understanding of potentially relevant 'omics' findings from systems biology (e.g. genomics, transcriptomics, proteomics, and metabolomics) for actual patients in the context of clinical diagnoses. On the other hand, the medical expert is indispensable for rationally narrowing the pertinent features down to a manageable selection of diagnostic findings. Without suitable incorporation of domain expert knowledge, machine-based selections are often polluted by noise or by variations that are large but irrelevant. Selecting a subset of features is necessary because, for statistical reasons, the number of features has to stand in an appropriate relationship to the number of cases available in a study (curse of dimensionality). The cooperative selection process is iterative. Interim results of analyses based on temporary automatic feature selections have to be comprehensible to, and open to criticism by, the medical expert. To support the understanding of machine learning results, a prototype-based approach is followed. The case-type-related documentation is in accordance with the way the human expert cognitively structures experienced cases.
As the features for patient description are heterogeneous in type and nature, the machine-learning-based feature selection has to handle a different kind of pertinent dissimilarity for each kind of feature and integrate these dissimilarities into a holistic representation.
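The integration of heterogeneous dissimilarities can be illustrated with a minimal sketch. It is not the system described in the chapter: it assumes a Gower-style weighted average of per-feature dissimilarities, and the feature names, value ranges, weights, and patient values are hypothetical and invented purely for illustration.

```python
from typing import Callable, Mapping, Optional

# A per-feature dissimilarity maps two feature values to a number in [0, 1].
Dissimilarity = Callable[[object, object], float]


def numeric_dissimilarity(value_range: float) -> Dissimilarity:
    """Range-normalised absolute difference for a numeric feature."""
    def measure(a, b) -> float:
        return abs(a - b) / value_range if value_range > 0 else 0.0
    return measure


def categorical_dissimilarity(a, b) -> float:
    """Simple mismatch indicator for a categorical feature."""
    return 0.0 if a == b else 1.0


def combined_dissimilarity(
    x: Mapping[str, object],
    y: Mapping[str, object],
    measures: Mapping[str, Dissimilarity],
    weights: Optional[Mapping[str, float]] = None,
) -> float:
    """Weighted average of per-feature dissimilarities (Gower-style assumption).

    Features missing in either patient are skipped. A weight of 0 removes a
    feature from the comparison, which is one way an expert's deselection
    of a feature could be represented.
    """
    total = 0.0
    weight_sum = 0.0
    for name, measure in measures.items():
        if x.get(name) is None or y.get(name) is None:
            continue
        w = 1.0 if weights is None else weights.get(name, 1.0)
        total += w * measure(x[name], y[name])
        weight_sum += w
    return total / weight_sum if weight_sum > 0 else 0.0


# Hypothetical feature set: one clinical, one categorical, one 'omics'-like score.
measures = {
    "age": numeric_dissimilarity(value_range=60.0),
    "receptor_status": categorical_dissimilarity,
    "expression_score": numeric_dissimilarity(value_range=10.0),
}

# Invented example patients, not taken from any study.
patient_a = {"age": 50, "receptor_status": "positive", "expression_score": 2.0}
patient_b = {"age": 65, "receptor_status": "negative", "expression_score": 7.0}

all_features = combined_dissimilarity(patient_a, patient_b, measures)
without_receptor = combined_dissimilarity(
    patient_a, patient_b, measures, weights={"receptor_status": 0.0}
)
print(all_features, without_receptor)
```

In such a scheme, each feature type keeps its own notion of dissimilarity, while the weights give one place where a temporary selection, automatic or expert-driven, could be expressed; a prototype-based classifier could then compare patients to prototypes through the combined value.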