Parallel paradigms and run-time management techniques for many-core architectures: The 2PARMA approach
The current trend in computing architectures is to replace complex superscalar architectures with meshes of small homogeneous processing units connected by an on-chip network. This trend is mostly driven by the inherent limits of silicon technology, which draw closer as process density increases. The number of cores integrated in a single chip is expected to increase rapidly in the coming years, moving from multi-core to many-core architectures. This trend requires a global rethinking of software and hardware design approaches. Multi-core architectures are nowadays prevalent in general purpose computing and in high performance computing, and more scalable multi-core architectures are being, and will continue to be, widely adopted for high-end graphics and media processing, e.g. IBM Cell BE, NVIDIA Fermi, SUN Niagara and Tilera TILE64. The 2PARMA project focuses on a flexible family of parallel and scalable computing processors, which we call the Many-core Computing Fabric (MCCF) template, composed of many homogeneous processing cores interconnected by an on-chip mesh, as shown in Figure 1. The project aims at providing parallel programming models and run-time resource management techniques to exploit the features of many-core processor architectures. It focuses on the definition of parallel programming models that combine component-based and single-instruction multiple-thread approaches, instruction set virtualisation based on portable bytecode, run-time resource management policies and mechanisms, and design space exploration methodologies for Many-core Computing Fabrics.
The above scientific and technical objectives are intended to meet some of the main challenges in computing system research, i.e., to improve performance by providing software programmability techniques to exploit the hardware parallelism; to provide efficient management of power/performance trade-offs through run-time resource management and optimisation; to improve system reliability, mainly in terms of lifetime and yield of hardware components, by providing transparent resource reconfiguration and instruction set virtualisation; and to increase the productivity of parallel software development by using semi-automatic parallelism extraction techniques and extending the OpenCL programming paradigm for parallel computing systems. The main topics investigated within the 2PARMA project relate to the analysis and development of the complete software layer able to exploit the features of future many-core processor architectures. In this context, the programmability of Many-core Computing Fabrics at both the programming language and the Operating System level plays an important role. At the programming language level, 2PARMA leverages the increasingly popular Component-Based Software Engineering (CBSE) and develops parallelism extraction techniques to identify opportunities for parallelisation in the design phase; it then introduces extensions of existing standards for parallel programming, such as OpenCL, to express data parallelism for Many-core Computing Fabrics. At the Operating System level, 2PARMA provides the means to define and deploy peripherals to the Many-core Computing Fabric, preserving isolation among them and efficient communication between the host and the Computing Fabric. The 2PARMA project intends to provide developers with comfortable tools and programming environments aimed at increasing the productivity of the software development cycle with respect to current, mainly manual, methodologies.
Given the opportunities for adapting the application to the available resources, 2PARMA develops intelligent policies to manage the system resources, taking into account the Quality-of-Service (QoS) requirements imposed by the user on each application, while optimising the resource usage for system-wide performance and energy goals. The 2PARMA project aims at supporting efficient and optimal management of tasks, data and devices that adapts dynamically to the changing context, while reducing the system power consumption as much as possible with respect to conventional power management strategies. Finally, continuous adaptation and run-time management require a large amount of information on the system and the applications in order to take effective and timely decisions. 2PARMA goes beyond traditional design space exploration (DSE) by defining a methodology to provide synthetic information about the points of operation of each application with respect to the subsets of resources available. The design space exploration methodologies developed in 2PARMA also provide architectural customisation to support parallel programming models, especially communication and memory mapping.
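As an illustration of how a run-time manager could use the points of operation produced by design space exploration, the following minimal Python sketch picks one operating point per application so that every QoS requirement is met, the core budget is respected, and total power is minimised. It is not the 2PARMA policy: the `OperatingPoint` fields, the function `select_operating_points`, the exhaustive search and all numeric values are assumptions made only for this example.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class OperatingPoint:
    """Hypothetical summary of one DSE result for an application."""
    cores: int         # cores the application occupies at this point
    throughput: float  # QoS metric delivered (e.g. frames per second)
    power: float       # estimated power draw (arbitrary units)


def select_operating_points(apps, min_qos, total_cores):
    """Return {app name: OperatingPoint} with minimum total power, or None.

    Exhaustive search, adequate only for a handful of applications;
    a real run-time manager would need a cheaper heuristic.
    """
    names = list(apps)
    # Keep only the points that satisfy each application's QoS requirement.
    candidates = [
        [p for p in apps[name] if p.throughput >= min_qos[name]]
        for name in names
    ]
    best, best_power = None, float("inf")
    for combo in product(*candidates):
        if sum(p.cores for p in combo) > total_cores:
            continue  # exceeds the available cores
        power = sum(p.power for p in combo)
        if power < best_power:
            best, best_power = dict(zip(names, combo)), power
    return best


# Illustrative (made-up) operating points for two applications.
apps = {
    "video": [OperatingPoint(2, 15.0, 1.0),
              OperatingPoint(4, 30.0, 1.8),
              OperatingPoint(8, 55.0, 3.5)],
    "audio": [OperatingPoint(1, 10.0, 0.3),
              OperatingPoint(2, 18.0, 0.5)],
}
min_qos = {"video": 25.0, "audio": 10.0}

plan = select_operating_points(apps, min_qos, total_cores=6)
for name, point in plan.items():
    print(name, point)
```

With these values the sketch assigns four cores to the video application and one core to the audio application; if the budget drops to four cores, no feasible assignment exists and the function returns `None`, which a manager could treat as a signal to renegotiate the QoS requirements.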