GPU-based polynomial finite element matrix assembly for simplex meshes
In this paper, we present a matrix assembly technique for arbitrary polynomial order finite element simulations on simplex meshes for graphics processing units (GPU). Compared to the current state of the art in GPU-based matrix assembly, we avoid the need for an intermediate sparse matrix and perform assembly directly into the final, GPU-optimized data structure. Thereby, we avoid the resulting 180% to 600% memory overhead, depending on polynomial order, and associated allocation time, while simplifying the assembly code and using a more compact mesh representation. We compare our method with existing algorithms and demonstrate significant speedups.