2004
Conference Paper
Title
Dynamic Workflows for Grid Applications
Abstract
The Grid computing community has developed several approaches that go beyond executing single tasks on single Grid resources and support workflow schemes for composing and executing complex Grid applications. The most commonly used workflow model for this purpose is the Directed Acyclic Graph (DAG). DAGs have a very simple structure and are easy to use, but they have two relevant disadvantages: they do not support bidirectional coupling, and loops cannot be defined explicitly. While establishing the Fraunhofer Resource Grid, we developed a Grid Job Definition Language (GJobDL) that is based on Petri nets instead of DAGs. Petri nets are graphical representations of the workflow of discrete systems. In contrast to DAGs, which describe only the dynamic behaviour of the system, Petri nets also describe the system's state. The type of Petri net introduced here corresponds to Petri nets with individual tokens (coloured Petri nets) and constant arc expressions. The GJobDL describes the workflow of a Grid application on an abstract level. This description is independent of the Grid infrastructure and defines the relationships between the software components (transitions) and the data (places). Transitions can be annotated with conditions that depend on the tokens moving along the arcs of the Petri net. During workflow execution, the abstract workflow must be concretized so that it can be mapped onto the real Grid environment. This requires dynamic completion of the workflow based on current information. It may be necessary to introduce new tasks, such as data transfers, software deployment, authorization requests, and data retrievals. These tasks can be represented by sub Petri nets that replace parts of the existing Petri net during the runtime of the Grid application. Only a few Grid initiatives include advanced fault management.
In most cases, fault management is predefined implicitly by the Grid architecture and results in the rescheduling, recovery, or migration of single tasks when a fault occurs. We propose a concept for fault management of entire job workflows that models the fault management explicitly within the workflow model. This can be done either by the user or automatically, by introducing new fault-management tasks based on fault management templates.
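The contrast with DAGs drawn in the abstract can be illustrated with a small sketch. The Python code below is not GJobDL and does not reproduce the authors' implementation; the class names (`Transition`, `PetriNet`), the place names, and the halving task are hypothetical and chosen only for illustration. It models places holding individual (coloured) tokens, transitions annotated with conditions on those tokens, and an explicit loop in which a transition writes back to its own input place, which a DAG cannot express.

```python
from typing import Any, Callable, Dict, List, Optional


class Transition:
    """A software component: consumes one token per input place, emits one result."""

    def __init__(self, name: str, inputs: List[str], outputs: List[str],
                 action: Callable[..., Any],
                 guard: Optional[Callable[..., bool]] = None):
        self.name = name
        self.inputs = inputs
        self.outputs = outputs
        self.action = action      # computes the output token from the input tokens
        self.guard = guard        # optional condition on the input tokens


class PetriNet:
    """Places hold individual tokens (data); the marking is the system's state."""

    def __init__(self) -> None:
        self.places: Dict[str, List[Any]] = {}
        self.transitions: List[Transition] = []

    def add_place(self, name: str, tokens=()) -> None:
        self.places[name] = list(tokens)

    def add_transition(self, transition: Transition) -> None:
        self.transitions.append(transition)

    def enabled(self, t: Transition) -> bool:
        # Enabled when every input place has a token and the guard accepts them.
        if any(not self.places[p] for p in t.inputs):
            return False
        tokens = [self.places[p][0] for p in t.inputs]
        return t.guard is None or t.guard(*tokens)

    def fire(self, t: Transition) -> None:
        # Remove one token from each input place, put the result on each output place.
        tokens = [self.places[p].pop(0) for p in t.inputs]
        result = t.action(*tokens)
        for p in t.outputs:
            self.places[p].append(result)

    def run(self, max_steps: int = 100) -> List[str]:
        # Fire the first enabled transition until none is enabled (or the step limit).
        fired: List[str] = []
        for _ in range(max_steps):
            ready = [t for t in self.transitions if self.enabled(t)]
            if not ready:
                break
            self.fire(ready[0])
            fired.append(ready[0].name)
        return fired


# Hypothetical workflow: refine a value until it is small enough, then finish.
net = PetriNet()
net.add_place("state", tokens=[16.0])
net.add_place("result")

# Loop: "refine" reads from and writes to the same place while its condition holds.
net.add_transition(Transition("refine", ["state"], ["state"],
                              action=lambda x: x / 2, guard=lambda x: x > 1))
# Exit: "finish" becomes enabled once the token no longer satisfies the loop condition.
net.add_transition(Transition("finish", ["state"], ["result"],
                              action=lambda x: x, guard=lambda x: x <= 1))

fired = net.run()
print(fired)          # sequence of fired transition names
print(net.places)     # final marking, i.e. the state of the workflow
```

In this sketch the marking of the places is the explicit system state mentioned in the abstract, and the guards play the role of the conditions attached to transitions. Replacing a transition by a sub-net at runtime, as described for workflow concretization and fault management, would amount to editing `net.transitions` and `net.places` between firing steps; that part is not shown here.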
Conference