Quantization Considerations of Dense Layers in Convolutional Neural Networks for Resistive Crossbar Implementation

Lei, Zhang; Borggreve, D.; Vanselow, F.; Brederlow, R.

Postprint urn:nbn:de:0011-n-6056873 (840 KByte PDF)
MD5 Fingerprint: 8e5bafcb955ef198c7f95f324f954b7c
© IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
Created on: 15.10.2020

Institute of Electrical and Electronics Engineers -IEEE-:
9th International Conference on Modern Circuits and Systems Technologies, MOCAST 2020 : 7-9 September 2020, Bremen, Germany, virtual conference
Piscataway, NJ: IEEE, 2020
ISBN: 978-1-7281-6687-2
ISBN: 978-1-7281-6688-9
6 pp.
International Conference on Modern Circuits and Systems Technologies (MOCAST) <9, 2020, Online>
European Commission EC
H2020; 826655; TEMPO
Technology & Hardware for Neuromorphic Computing
Bundesministerium für Bildung und Forschung BMBF (Germany)
16ESE0407; TEMPO
Technology & Hardware for Neuromorphic Computing
Conference paper, electronic publication
Fraunhofer EMFT
convolutional neural network; Neuromorphic Computing Hardware; approximate computing; Neural Network Quantization; Resistive Crossbar; Memristive Devices

The accuracy and power consumption of resistive crossbar circuits used for neuromorphic computing are restricted by the process variation of the resistance-switching (memristive) devices and by the power overhead of the mixed-signal circuits, such as analog-digital converters (ADCs) and digital-analog converters (DACs). Reducing the signal and weight resolution can improve the robustness against process variation, relax the requirements for mixed-signal devices, and simplify the implementation of crossbar circuits. This work aims to establish a methodology for achieving low-resolution dense layers in convolutional neural networks (CNNs) in terms of network architecture selection and quantization method. To this end, it studies the impact of the dense layer configuration on the resolution required for its inputs and weights in a small CNN. The analysis shows that carefully selecting the network architecture for the dense layer can significantly reduce the required resolution of its input signals and weights. This work reviews criteria for appropriate architecture selection and the quantization methods of binary and ternary neural networks (BNNs and TNNs) to reduce the weight resolution of CNN dense layers. Furthermore, it presents a method to reduce the input resolution of the dense layer down to one bit by analyzing the distribution of the input values. A small CNN for inference with one-bit quantization of input signals and weights can be realized with only 0.68% accuracy degradation on the MNIST dataset.
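
For orientation, the kind of quantization the abstract refers to can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' method: the function names, the `delta_scale` value of 0.7 for the ternary threshold, and the choice of the median of the nonzero activations as the one-bit input threshold are all assumptions made here for the example. The abstract only states that the input threshold is derived from the distribution of the input values.

```python
import numpy as np


def binarize_weights(w):
    """Map weights to {-1, +1} by sign (zeros map to +1)."""
    return np.where(w >= 0, 1.0, -1.0)


def ternarize_weights(w, delta_scale=0.7):
    """Map weights to {-1, 0, +1}.

    Weights with magnitude at or below delta are set to zero.
    delta_scale is an assumed heuristic, not taken from the paper.
    """
    delta = delta_scale * np.mean(np.abs(w))
    return np.sign(w) * (np.abs(w) > delta)


def binarize_inputs(x, threshold):
    """One-bit inputs: 1 where the activation exceeds threshold, else 0."""
    return (x > threshold).astype(np.float64)


# Synthetic stand-ins for dense-layer inputs and weights (not real data).
rng = np.random.default_rng(0)
x = np.maximum(rng.normal(size=(4, 16)), 0.0)  # ReLU-like activations
w = rng.normal(size=(16, 10))

# Assumed rule: take the threshold from the observed activation distribution.
threshold = np.median(x[x > 0])

x_q = binarize_inputs(x, threshold)
w_b = binarize_weights(w)
w_t = ternarize_weights(w)

# Dense-layer outputs with one-bit inputs and binary or ternary weights.
y_b = x_q @ w_b
y_t = x_q @ w_t
```

With inputs restricted to {0, 1} and weights to {-1, +1} or {-1, 0, +1}, each output is a signed count of active inputs, which is the property that makes such a layer attractive for a low-resolution crossbar mapping.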