Fraunhofer-Gesellschaft

Publica

Here you will find scientific publications from the Fraunhofer Institutes.

Design of a High-Performance Tensor-Vector Multiplication with BLAS

Author: Bassoy, Cem

In: Rodrigues, J.:
Computational science - ICCS 2019. Part 1 : 19th international conference, Faro, Portugal, June 12-14, 2019 : proceedings
Cham: Springer International Publishing, 2019 (Lecture Notes in Computer Science 11536)
ISBN: 978-3-030-22733-3
ISBN: 978-3-030-22734-0
pp. 32-45
International Conference on Computational Science (ICCS) <19, 2019, Faro>
English
Conference Paper
Fraunhofer IOSB

Abstract
Tensor contraction is an important mathematical operation for many scientific computing applications that use tensors to store massive multidimensional data. Based on the Loops-over-GEMMs (LOG) approach, this paper discusses the design of high-performance algorithms for the mode-q tensor-vector multiplication using efficient implementations of the matrix-vector multiplication (GEMV). Given dense tensors with any non-hierarchical storage format, tensor order and dimensions, the proposed algorithms either directly call GEMV with tensors or recursively apply GEMV on higher-order tensor slices multiple times. We analyze strategies for loop-fusion and parallel execution of slice-vector multiplications with higher-order tensor slices. Using OpenBLAS, our parallel implementation attains 34.8 Gflop/s in single precision on a Core i9-7900X Intel Xeon processor. Our parallel version of the tensor-vector multiplication is on average 6.1x and up to 12.6x faster than state-of-the-art approaches.
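
The idea of expressing a mode-q tensor-vector multiplication through matrix-vector products can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a dense tensor stored row-major in a flat list, a 0-based mode index q, and a hypothetical pure-Python helper `gemv` that stands in for a BLAS GEMV call. The function name `ttv` and all parameter names are likewise illustrative. The sketch issues a single GEMV call when the contracted mode is the first or the last one, and otherwise loops over contiguous matrix slices.

```python
from math import prod


def gemv(a, offset, rows, cols, x, trans):
    """Hypothetical stand-in for BLAS GEMV on a row-major rows x cols
    matrix that starts at a[offset] inside the flat list a.

    trans=False: y[i] = sum_j A[i][j] * x[j]  (len(y) == rows)
    trans=True:  y[j] = sum_i A[i][j] * x[i]  (len(y) == cols)
    """
    if not trans:
        return [
            sum(a[offset + i * cols + j] * x[j] for j in range(cols))
            for i in range(rows)
        ]
    y = [0.0] * cols
    for i in range(rows):
        base = offset + i * cols
        for j in range(cols):
            y[j] += a[base + j] * x[i]
    return y


def ttv(a, shape, x, q):
    """Mode-q tensor-vector product (q is 0-based, an assumption of this
    sketch) of a dense row-major tensor stored in the flat list a.

    Returns the flat result and its shape (shape with mode q removed).
    """
    nq = shape[q]
    assert len(x) == nq, "vector length must match the contracted mode"
    left = prod(shape[:q])        # number of slices in front of mode q
    right = prod(shape[q + 1:])   # contiguous elements behind mode q
    out_shape = shape[:q] + shape[q + 1:]

    if right == 1:
        # Last mode: the whole tensor is a left x nq matrix, one GEMV call.
        return gemv(a, 0, left, nq, x, trans=False), out_shape

    # Otherwise each of the `left` slices is a contiguous nq x right matrix.
    # For q == 0 this loop runs once, i.e. a single transposed GEMV call.
    out = []
    for i in range(left):
        out.extend(gemv(a, i * nq * right, nq, right, x, trans=True))
    return out, out_shape


if __name__ == "__main__":
    tensor = [float(v) for v in range(24)]       # 2 x 3 x 4 tensor
    result, result_shape = ttv(tensor, (2, 3, 4), [1.0, 0.0, 0.0], 1)
    print(result_shape, result)
```

In a real setting the `gemv` helper would be replaced by a call into an optimized BLAS library, and the slice loop is the natural place for the loop-fusion and parallelization strategies the abstract mentions.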

URL: http://publica.fraunhofer.de/documents/N-552136.html