2012
Journal Article
Title

CPU vs. GPU - Performance comparison for the Gram-Schmidt algorithm

Abstract
The Gram-Schmidt method is a classical method for determining QR decompositions and is commonly used in many applications in computational physics, such as the orthogonalization of quantum mechanical operators or Lyapunov stability analysis. In this paper, we discuss how well the Gram-Schmidt method performs on different hardware architectures, including both state-of-the-art GPUs and CPUs. We explain in detail how a smart interplay between hardware and software can be used to speed up these rather compute-intensive applications, and we discuss the benefits and disadvantages of several approaches. In addition, we compare some highly optimized standard routines of the BLAS libraries against our own optimized routines on both processor types. Particular attention was paid to the strongly hierarchical memory of modern GPUs and CPUs, which requires cache-aware blocking techniques for optimal performance. Our investigations show that the performance depends strongly on the employed algorithm and compiler, and somewhat less on the employed hardware. Remarkably, the performance of the NVIDIA CUDA BLAS routines improved significantly from CUDA 3.2 to CUDA 4.0. Still, BLAS routines tend to be slightly slower than manually optimized code on GPUs, while we were not able to outperform the BLAS routines on CPUs. Comparing optimized implementations on different hardware architectures, we find that an NVIDIA GeForce GTX580 GPU is about 50% faster than a corresponding Intel X5650 Westmere hexacore CPU. The self-written codes are included as supplementary material.
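For readers unfamiliar with the method named in the abstract, the following is a minimal sketch of a QR decomposition via Gram-Schmidt orthogonalization. It uses the modified Gram-Schmidt variant in plain Python and is not the authors' optimized code, which is available only as supplementary material to the paper. The function name `gram_schmidt_qr` and the row-list matrix layout are assumptions made for illustration; no blocking or BLAS calls are used.

```python
import math


def gram_schmidt_qr(a):
    """Return (q, r) with a = q * r, computed by modified Gram-Schmidt.

    Hypothetical helper for illustration. `a` is a list of m rows with
    n entries each; m >= n and full column rank are assumed.
    """
    m, n = len(a), len(a[0])
    # v[j] holds a working copy of column j, so `a` itself is not modified.
    v = [[float(a[i][j]) for i in range(m)] for j in range(n)]
    q_cols = []
    r = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Normalize the current column to obtain the j-th orthonormal vector.
        r[j][j] = math.sqrt(sum(x * x for x in v[j]))
        if r[j][j] == 0.0:
            # Exact-zero check only; real code would compare against a tolerance.
            raise ValueError("columns are linearly dependent")
        qj = [x / r[j][j] for x in v[j]]
        q_cols.append(qj)
        # Remove the q_j component from every remaining column.
        for k in range(j + 1, n):
            r[j][k] = sum(qx * vx for qx, vx in zip(qj, v[k]))
            v[k] = [vx - r[j][k] * qx for qx, vx in zip(qj, v[k])]
    # Convert the list of columns back into row-major form.
    q = [[q_cols[j][i] for j in range(n)] for i in range(m)]
    return q, r


if __name__ == "__main__":
    q, r = gram_schmidt_qr([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
    print(q)
    print(r)
```

The inner loop over `k` consists of dot products and vector updates. A performance-oriented implementation would typically group several columns into a block so that these operations become matrix-matrix products that reuse data held in cache, which is the kind of cache-aware blocking the abstract refers to.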
Author(s)
Brandes, T.
Fraunhofer-Institut für Algorithmen und Wissenschaftliches Rechnen SCAI  
Arnold, A.
Institute for Computational Physics, University of Stuttgart, Pfaffenwaldring 27, 70569, Stuttgart, Germany
Soddemann, T.
Fraunhofer-Institut für Algorithmen und Wissenschaftliches Rechnen SCAI  
Reith, Dirk
Fraunhofer-Institut für Algorithmen und Wissenschaftliches Rechnen SCAI  
Journal
European physical journal special topics  
DOI
10.1140/epjst/e2012-01638-7
Language
English
Fraunhofer-Institut für Algorithmen und Wissenschaftliches Rechnen SCAI  
Keyword(s)
  • CPU
  • GPU
  • Gram-Schmidt algorithm
  • computational physics
  • QR decompositions
