2020
Conference Paper
Title
A High Throughput MobileNetV2 FPGA Implementation Based on a Flexible Architecture for Depthwise Separable Convolution
Abstract
Convolutional Neural Networks are widely applied to various computer vision tasks. For most of these applications, high throughput and energy efficiency are top priorities. MobileNetV2 features very low memory requirements as well as a relatively small model size. On the ILSVRC 2012 classification challenge, it provides a decent prediction accuracy of 71.7 percent at low computational cost. We present an FPGA-based MobileNetV2 accelerator with a high throughput of 1050 frames per second at a power consumption of 34 watts under full load. This equates to a power efficiency of 32 millijoules per frame. We describe our approach of using stream interfaces and auto-generated control signals to enable fast design of flexible architectures. By using quantization techniques, limiting the number format to a 16-bit fixed-point representation, we were able to reduce the memory usage for weights as well as activations by a factor of two. Since the basic building block of MobileNetV2 can be used to build higher-performance networks as well, the findings of this paper remain applicable when higher prediction accuracies are required.
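The 16-bit fixed-point quantization mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the integer/fraction split, so the Q7.8-style layout below (8 fractional bits) is an assumption for illustration; the factor-of-two memory saving follows simply from storing int16 instead of float32.

```python
import numpy as np

FRAC_BITS = 8  # assumed fractional-bit count; the paper only states "16-bit fixed point"

def to_fixed16(x: np.ndarray) -> np.ndarray:
    """Quantize float32 values to signed 16-bit fixed point with FRAC_BITS fractional bits."""
    scaled = np.round(x * (1 << FRAC_BITS))
    return np.clip(scaled, -32768, 32767).astype(np.int16)

def from_fixed16(q: np.ndarray) -> np.ndarray:
    """Dequantize back to float32 for error analysis."""
    return q.astype(np.float32) / (1 << FRAC_BITS)

weights = np.random.randn(1000).astype(np.float32)
q = to_fixed16(weights)

# Memory halves: 4 bytes per float32 value vs. 2 bytes per int16 value.
ratio = weights.nbytes // q.nbytes

# For values inside the representable range, rounding error is at most half an LSB.
max_err = np.max(np.abs(weights - from_fixed16(q)))
```

With 8 fractional bits the worst-case rounding error is 0.5 / 256 ≈ 0.002, which is typically small relative to trained CNN weight magnitudes; picking the split per layer is a common refinement.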