Here you will find scientific publications from the Fraunhofer Institutes.

Evaluation of parallel communication models in Nekbone, a Nek5000 mini-application

Akhmetova, Dana; Peng, Ivy Bo; Markidis, Stefano; Laure, Erwin; Machado, Rui; Rahn, Mirko; Bartsch, Valeria; Gong, Jing; Ivanov, Ilya


Institute of Electrical and Electronics Engineers -IEEE-; IEEE Computer Society:
IEEE International Conference on Cluster Computing, CLUSTER 2015. Proceedings : 8-11 September 2015, Chicago, Illinois, USA
Los Alamitos, Calif.: IEEE Computer Society Conference Publishing Services (CPS), 2015
ISBN: 978-1-4673-6598-7
International Conference on Cluster Computing (CLUSTER) <2015, Chicago/Ill.>
Conference Paper
Fraunhofer ITWM

Nekbone is a proxy application of Nek5000, a scalable Computational Fluid Dynamics (CFD) code used for modelling incompressible flows. The Nekbone mini-application is used by several international co-design centers to explore new concepts in computer science and to evaluate their performance. We present the design and implementation of a new communication kernel in the Nekbone mini-application with the goal of studying the performance of different parallel communication models. First, a new MPI blocking communication kernel has been developed to solve Nekbone problems in a three-dimensional Cartesian mesh and process topology. The new MPI implementation delivers a 13% performance improvement compared to the original implementation. The new MPI communication kernel consists of approximately 500 lines of code, against 7,000 lines in the original, allowing experimentation with new approaches to parallel communication in Nekbone. Second, the MPI blocking communication in the new kernel was changed to MPI non-blocking communication. Third, we developed a new Partitioned Global Address Space (PGAS) communication kernel based on the GPI-2 library. This approach reduces the synchronization among neighbor processes and is on average 3% faster than the new MPI-based non-blocking approach. In our tests on 8,192 processes, the GPI-2 communication kernel is 3% faster than the new MPI non-blocking communication kernel. In addition, we have used OpenMP in all versions of the new communication kernel. Finally, we highlight the future steps for using the new communication kernel in the parent application Nek5000.
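
As an illustration of what a three-dimensional Cartesian process topology implies for neighbor communication, the following minimal sketch computes which ranks a given process would exchange data with across its six faces. It is not taken from the paper or from Nekbone. The function names, the rank ordering (x varying fastest), the non-periodic boundaries, and the restriction to face neighbors are all assumptions made for this example; the actual kernel may order ranks differently and may also communicate with edge and corner neighbors.

```python
from typing import Dict, Optional, Tuple

Coords = Tuple[int, int, int]


def rank_to_coords(rank: int, dims: Coords) -> Coords:
    """Map a linear rank to (x, y, z) grid coordinates (assumed: x varies fastest)."""
    nx, ny, nz = dims
    if not 0 <= rank < nx * ny * nz:
        raise ValueError("rank lies outside the process grid")
    return rank % nx, (rank // nx) % ny, rank // (nx * ny)


def coords_to_rank(coords: Coords, dims: Coords) -> int:
    """Inverse of rank_to_coords for coordinates inside the grid."""
    x, y, z = coords
    nx, ny, _ = dims
    return x + nx * (y + ny * z)


def face_neighbors(rank: int, dims: Coords) -> Dict[str, Optional[int]]:
    """Return the six face neighbors of `rank`.

    None marks a missing neighbor at the domain boundary
    (assumed: the grid is not periodic).
    """
    x, y, z = rank_to_coords(rank, dims)
    offsets = {
        "x-": (-1, 0, 0), "x+": (1, 0, 0),
        "y-": (0, -1, 0), "y+": (0, 1, 0),
        "z-": (0, 0, -1), "z+": (0, 0, 1),
    }
    neighbors: Dict[str, Optional[int]] = {}
    for name, (dx, dy, dz) in offsets.items():
        candidate = (x + dx, y + dy, z + dz)
        inside = all(0 <= c < n for c, n in zip(candidate, dims))
        neighbors[name] = coords_to_rank(candidate, dims) if inside else None
    return neighbors


if __name__ == "__main__":
    # Hypothetical 4 x 4 x 4 process grid (64 processes).
    print(face_neighbors(21, (4, 4, 4)))
```

In a blocking exchange, each process would send to and receive from these neighbors one after another; in a non-blocking or one-sided (PGAS) variant, the same neighbor list would be used to post all transfers first and wait for completion afterwards.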