Training Mixed Precision Neural Networks with Energy Constraints for a FeFET-Crossbar-Based Accelerator
Dedicated hardware accelerators, quantization of network parameters, and optimized dataflows can increase the efficiency of deep convolutional neural network (CNN) inference on mobile or embedded devices. Mixed-precision networks in particular achieve high network compression with only small accuracy degradation. To learn a precision distribution policy among all layers' weights and activations, feedback on the training process is necessary. Therefore, a differentiable energy consumption model of a scalable, two-level, FeFET-crossbar-based accelerator is developed, and a differentiable, pre-defined, accelerator-specific dataflow algorithm is implemented. Both are directly integrated into a standard, gradient-descent, mixed-precision training method via an energy consumption loss. The additional loss enables the training process to learn optimal precision parameters with respect to a trade-off between model accuracy and energy consumption. The proposed training method is evaluated on the MNIST dataset, and the limits induced by quantization are discussed. It is also shown how the training method must be parametrized to ensure a stable training process. When compared to an energy-unaware, mixed-precision training method, which uses memory constraints to learn the precision distribution, the energy-aware method proposed in this work achieves a 1.5 times higher energy reduction.
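The core idea of the energy consumption loss described above can be sketched as follows. This is a minimal, hypothetical illustration, not the thesis implementation: the function names, the toy energy formula, and the weighting factor `lam` are assumptions, and a real model of the FeFET crossbar accelerator would account for the dataflow, tiling, and peripheral circuitry.

```python
def crossbar_energy(bits_w, bits_a, macs, e_mac=1.0):
    """Toy, differentiable energy estimate for one layer.

    Assumption for illustration only: energy grows with the product of the
    weight bit-width (bits_w), the activation bit-width (bits_a), and the
    number of multiply-accumulate operations (macs). The thesis uses a
    detailed accelerator-specific model instead.
    """
    return e_mac * macs * bits_w * bits_a


def combined_loss(task_loss, bits_w, bits_a, macs, lam=1e-9):
    """Total training loss = accuracy term + weighted energy term.

    Because the energy term is differentiable in the (relaxed) bit-widths,
    gradient descent can lower the precision of layers whose energy cost
    outweighs their contribution to accuracy.
    """
    return task_loss + lam * crossbar_energy(bits_w, bits_a, macs)


# Example: for the same layer, 8-bit weights/activations incur a larger
# energy penalty than 4-bit ones, steering training toward lower precision.
loss_8bit = combined_loss(0.5, bits_w=8, bits_a=8, macs=1e6)
loss_4bit = combined_loss(0.5, bits_w=4, bits_a=4, macs=1e6)
```

In a gradient-based mixed-precision method, the bit-widths would be continuous relaxations (later rounded to integers), so the energy term contributes a gradient that trades accuracy against energy, as the abstract describes.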
Master's Thesis, TU München, 2021