• English
  • Deutsch
  • Log In
    Password Login
    Research Outputs
    Fundings & Projects
    Researchers
    Institutes
    Statistics
Repository logo
Fraunhofer-Gesellschaft
  1. Home
  2. Fraunhofer-Gesellschaft
  3. Konferenzschrift
  4. Efficient Scheduling of Scientific Workflow Actions in the Cloud Based on Required Capabilities
 
  • Details
  • Full
Options
2021
Conference Paper
Title

Efficient Scheduling of Scientific Workflow Actions in the Cloud Based on Required Capabilities

Abstract
Distributed scientific workflow management systems processing large data sets in the Cloud face the following challenges: (a) workflow tasks require different capabilities from the machines on which they run, but at the same time, the infrastructure is highly heterogeneous, (b) the environment is dynamic and new resources can be added and removed at any time, (c) scientific workflows can become very large with hundreds of thousands of tasks, (d) faults can happen at any time in a distributed system. In this paper, we present a software architecture and a capability-based scheduling algorithm that cover all these challenges in one design. Our architecture consists of loosely coupled components that can run on separate virtual machines and communicate with each other over an event bus and through a database. The scheduling algorithm matches capabilities required by the tasks (e.g. software, CPU power, main memory, graphics processing unit) with those offered by the available virtual machines and assigns them accordingly for processing. Our approach utilises heuristics to distribute the tasks evenly in the Cloud. This reduces the overall run time of workflows and makes efficient use of available resources. Our scheduling algorithm also implements optimisations to achieve a high scalability. We perform a thorough evaluation based on four experiments and test if our approach meets the challenges mentioned above. The paper finishes with a discussion, conclusions, and future research opportunities. An implementation of our algorithm and software architecture is publicly available with the open-source workflow management system ""Steep"".
Author(s)
Krämer, Michel  
Fraunhofer-Institut für Graphische Datenverarbeitung IGD  
Mainwork
Data Management Technologies and Applications. 9th International Conference, DATA 2020. Revised Selected Papers  
Conference
International Conference on Data Science, Technology and Applications (DATA) 2020  
DOI
10.1007/978-3-030-83014-4_2
Language
English
Fraunhofer-Institut für Graphische Datenverarbeitung IGD  
Keyword(s)
  • Lead Topic: Visual Computing as a Service

  • Research Line: Computer graphics (CG)

  • workflow management

  • Cloud Computing

  • distributed system

  • Cookie settings
  • Imprint
  • Privacy policy
  • Api
  • Contact
© 2024