2011
Conference Paper
Title
Enabling the reuse of data mining processes in healthcare by integrating data semantics
Abstract
Biomedical researchers today routinely analyze clinical and genomic data. However, such analysis scenarios are typically non-standardized and not easily reusable. Data mining patterns guide the application of data mining solutions to new practical problems. To reuse a data mining solution, it is important that the new data set shares the semantics of the original one. This is particularly important in the medical domain, where data is often semantically heterogeneous. A formal representation of requirements and prerequisites would make the reuse process more efficient. This applies in particular to the most time-consuming phases of a data mining project, namely data understanding and data preparation. We show how integrating semantic information into data mining patterns enables the formal checking of data requirements in analysis scenarios. Our approach encodes data requirements in a query targeted at a semantically annotated data source, and thus allows reusing concepts of semantic mediation.