Effective dynamic scheduling on heterogeneous multi/manycore desktop platforms

Binotto, Alecio; Pedras, Bernardo; Götz, Marcelo; Kuijper, Arjan; Pereira, Carlos Eduardo; Stork, André; Fellner, Dieter W.

doi:10.1109/SBAC-PADW.2010.6

2010

Conference Paper

Abstract

GPUs (Graphics Processing Units) have become one of the main co-processors moving desktops towards high performance computing. Together with multicore CPUs and other co-processors, they form a powerful heterogeneous execution platform on the desktop for data-intensive calculations. We view the modern desktop as a heterogeneous cluster that can handle tasks from several applications at the same time. To improve application performance and exploit this heterogeneity, the distribution of workload over the asymmetric PUs (Processing Units) plays an important role. This problem is challenging, however, because the cost of a task on a PU is non-deterministic and can be influenced by several parameters not known a priori, such as the problem size. We present a context-aware architecture that maximizes application performance on such platforms. The approach combines a model for a first scheduling, based on an offline performance benchmark, with a runtime model that keeps track of the tasks' real performance. We carried out a demonstration on a CPU-GPU platform computing iterative solvers for SLEs (Systems of Linear Equations), using the number of unknowns as the main parameter for the assignment decision. We achieved a gain of 38.3% compared with the static assignment of all tasks to the GPU (which is what current programming models, such as OpenCL and CUDA for Nvidia, do).
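
The two-phase idea described in the abstract (a first assignment from an offline benchmark, then correction from measured runtimes) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the models, so the class name `TwoPhaseScheduler`, the nearest-size lookup, the exponential moving average with weight `alpha`, and all timing values below are assumptions made purely for illustration.

```python
from bisect import bisect_left


class TwoPhaseScheduler:
    """Hypothetical sketch: choose a PU per task from cost estimates."""

    def __init__(self, offline_benchmark, alpha=0.5):
        # offline_benchmark: {pu_name: {num_unknowns: seconds}}.
        # Copied so that runtime updates do not alter the caller's data.
        self.estimates = {pu: dict(costs) for pu, costs in offline_benchmark.items()}
        self.alpha = alpha  # weight given to a new measurement (assumed value)

    def _nearest(self, pu, n):
        # Find the benchmarked problem size closest to n for this PU.
        sizes = sorted(self.estimates[pu])
        i = bisect_left(sizes, n)
        candidates = sizes[max(i - 1, 0): i + 1]
        size = min(candidates, key=lambda s: abs(s - n))
        return size, self.estimates[pu][size]

    def assign(self, n):
        # Phase 1 and later: pick the PU with the lowest current estimate.
        return min(self.estimates, key=lambda pu: self._nearest(pu, n)[1])

    def record(self, pu, n, measured_seconds):
        # Phase 2: blend the measured runtime into the stored estimate.
        size, old = self._nearest(pu, n)
        self.estimates[pu][size] = (1 - self.alpha) * old + self.alpha * measured_seconds


# Made-up benchmark: the CPU wins on small systems, the GPU on large ones.
benchmark = {
    "cpu": {1_000: 0.01, 100_000: 2.0},
    "gpu": {1_000: 0.05, 100_000: 0.4},
}
scheduler = TwoPhaseScheduler(benchmark)
small_choice = scheduler.assign(1_000)
large_choice = scheduler.assign(100_000)
```

With these invented numbers, a small system goes to the CPU and a large one to the GPU. If the GPU later turns out to be slow at runtime (for example because another application is using it), calling `record` raises its estimate and subsequent large tasks may be redirected to the CPU.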