Fraunhofer-Gesellschaft
2014
Conference Paper
Title

Analyzing Put/Get APIs for thread-collaborative processors

Abstract
In High-Performance Computing (HPC), GPU-based accelerators are pervasive for two reasons. First, GPUs provide much higher raw computational power than traditional CPUs. Second, power consumption increases sub-linearly with performance, making GPUs much more energy-efficient than CPUs in terms of GFLOPS/Watt. Although these advantages are limited to a selected set of workloads, most HPC applications can benefit substantially from GPUs. The top 11 entries of the Green500 list of November 2013 are all GPU-accelerated systems, which supports these statements. For system architects, however, the use of GPUs is challenging, as their architecture is based on thread-collaborative execution and differs significantly from CPUs, which are mainly optimized for single-thread performance. The interfaces to other devices in a system, in particular the network device, are still optimized solely for CPUs. This makes GPU-controlled I/O a challenge, although it is desirable for savings in energy and time. This is especially true for network devices, which are a key component in HPC systems. In previous work we have shown that GPUs can directly source and sink network traffic for InfiniBand devices without any involvement of the host CPUs, but that this approach does not provide any performance benefits. Here we explore another API for Put/Get operations that can overcome some of these limitations. In particular, we provide detailed reasoning about the issues that prevent performance advantages when I/O is controlled directly from the GPU domain.
Author(s)
Klenk, B.
Oden, L.
Fröning, H.
Mainwork
43rd International Conference on Parallel Processing Workshops, ICPPW 2014. Proceedings  
Conference
International Conference on Parallel Processing (ICPP) 2014  
International Workshop on Heterogeneous and Unconventional Cluster Architectures and Applications (HUCAA) 2014  
DOI
10.1109/ICPPW.2014.61
Language
English
Fraunhofer-Institut für Techno- und Wirtschaftsmathematik ITWM  