Supporting parallel R code in clinical trials: A grid-based approach
In this paper, we describe an extension to the ACGT GridR environment that allows loops in R scripts to be parallelized for distributed execution on a computational grid. The ACGT GridR service is extended by a component that uses a set of preprocessor-like directives to organize and distribute calculations. Because the parallelization directives are written as special R comments, users can potentially accelerate lengthy calculations with only minor changes to preexisting code. The GridR service and its extension are developed as components of the ACGT platform, one aim of which is to facilitate data mining of clinical trials involving large datasets. In ACGT, GridR scripts are executed within a specifically developed workflow environment, which is also briefly outlined in this article.