Communication efficient distributed learning of neural networks in Big Data environments using Spark

Alkhoury, Fouad; Wegener, Dennis; Sylla, Karl-Heinz; Mock, Michael

doi:10.24406/publica-r-413405

2021

Conference Paper

Abstract

Distributed (or federated) training of neural networks is an important approach to reduce the training time significantly. Previous experiments on communication efficient distributed learning have shown that model averaging, even if provably correct only in case of convex loss functions, is also working for the training of neural networks in some cases, however restricted to simple examples with relatively small standard data sets. In this paper, we investigate to what extent distributed communication efficient learning scales to huge data sets and complex, deep neural networks. We show how to integrate communication efficient distributed learning into the big data environment Spark and apply it to a complex realworld scenario, namely image segmentation on a large automotive data set (A2D2). We present evidence based results that the distributed approach scales successfully with increasing number of computing nodes in the case of fully convolutional networks.