2016
Conference Paper
Title
Distributed training of deep neural networks: Theoretical and practical limits of parallel scalability
Abstract
This paper presents a theoretical analysis and practical evaluation of the main bottlenecks preventing a scalable distributed solution for training Deep Neural Networks (DNNs). The presented results show that the current state-of-the-art approach, data-parallelized Stochastic Gradient Descent (SGD), is quickly becoming a severely communication-bound problem. In addition, we present simple but fixed theoretical constraints that prevent effective scaling of DNN training beyond only a few dozen nodes. This leads to poor scalability of DNN training in most practical scenarios.
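The following is a minimal single-process sketch of the data-parallel SGD scheme the abstract refers to, using a hypothetical toy linear-regression model rather than the paper's actual experimental setup. It illustrates the structure behind the communication argument: each worker computes a gradient on its own data shard, and the gradients are then averaged, the step that in a real cluster is an all-reduce over the full parameter vector.

```python
# Sketch of data-parallel SGD simulated in one process (assumed toy sizes,
# not the configuration used in the paper).
import numpy as np

rng = np.random.default_rng(0)
num_workers, dim, lr = 4, 1000, 0.1        # hypothetical worker count and model size
w = np.zeros(dim)                          # shared model parameters
X = rng.normal(size=(num_workers * 32, dim))
y = X @ rng.normal(size=dim)               # synthetic regression targets

for step in range(100):
    shards = np.array_split(np.arange(len(X)), num_workers)
    # Compute phase: each worker's local gradient on its shard of the batch.
    grads = []
    for idx in shards:
        err = X[idx] @ w - y[idx]
        grads.append(X[idx].T @ err / len(idx))
    # Communication phase: averaging the per-worker gradients. In a cluster
    # this is an all-reduce whose traffic scales with the parameter count,
    # regardless of how small each worker's local batch becomes.
    g = np.mean(grads, axis=0)
    w -= lr * g                            # identical update applied on every worker
```

Because the averaged gradient has the same dimensionality as the model, the per-step communication volume stays constant as more workers are added while the per-worker compute shrinks, which is the scaling limit the paper analyzes.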
Author(s)