Publica
Here you will find scientific publications from the Fraunhofer Institutes. Quantization Considerations of Dense Layers in Convolutional Neural Networks for Resistive Crossbar Implementation
Postprint urn:nbn:de:0011n6056873 (840 KByte PDF) MD5 Fingerprint: 8e5bafcb955ef198c7f95f324f954b7c © IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE. Created on: 15.10.2020 
 Institute of Electrical and Electronics Engineers IEEE: 9th International Conference on Modern Circuits and Systems Technologies, MOCAST 2020 : 7-9 September 2020, Bremen, Germany, virtual conference Piscataway, NJ: IEEE, 2020 ISBN: 9781728166872 ISBN: 9781728166889 6 pp. 
 International Conference on Modern Circuits and Systems Technologies (MOCAST) <9, 2020, Online> 
 European Commission EC H2020; 826655; TEMPO Technology & Hardware for Neuromorphic Computing 
 Bundesministerium für Bildung und Forschung BMBF (Germany) 16ESE0407; TEMPO Technology & Hardware for Neuromorphic Computing 

 English 
 Conference paper, electronic publication 
 Fraunhofer EMFT 
 convolutional neural network; Neuromorphic Computing Hardware; approximate computing; Neural Network Quantization; Resistive Crossbar; Memristive Devices 
Abstract
The accuracy and power consumption of resistive crossbar circuits used for neuromorphic computing are restricted by the process variation of the resistance-switching (memristive) devices and the power overhead of the mixed-signal circuits, such as analog-digital converters (ADCs) and digital-analog converters (DACs). Reducing the signal and weight resolution can improve the robustness against process variation, relax requirements for mixed-signal devices, and simplify the implementation of crossbar circuits. This work aims to establish a methodology to achieve low-resolution dense layers for CNNs in terms of network architecture selection and quantization method. To this end, this work studies the impact of the dense layer configuration on the required resolution for its inputs and weights in a small convolutional neural network (CNN). This analysis shows that carefully selecting the network architecture for the dense layer can significantly reduce the required resolution for its input signals and weights. This work reviews criteria for appropriate architecture selection and the quantization method for binary and ternary neural networks (BNNs and TNNs) to reduce the weight resolution of CNN dense layers. Furthermore, this work presents a method to reduce the input resolution for the dense layer down to one bit by analyzing the distribution of the input values. A small CNN for inference with one-bit quantization for input signals and weights can be realized with only 0.68% accuracy degradation on the MNIST dataset.
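To make the quantization ideas in the abstract concrete, the following is a minimal NumPy sketch of a dense layer with binary or ternary weights and one-bit inputs. It is not the authors' method: the function names, the sign-based binarization, the ternary threshold factor `delta_scale=0.7`, and the use of the median of the activations as the input threshold are all assumptions chosen for illustration, and the random arrays merely stand in for real post-convolution activations and trained weights.

```python
import numpy as np


def binarize_weights(w):
    """Map each weight to -1 or +1 by its sign (zero maps to +1)."""
    return np.where(w >= 0, 1.0, -1.0)


def ternarize_weights(w, delta_scale=0.7):
    """Map weights to {-1, 0, +1}.

    Weights whose magnitude is at most delta are zeroed. The factor
    delta_scale is an assumed heuristic, not a value from the paper.
    """
    delta = delta_scale * np.mean(np.abs(w))
    return np.where(np.abs(w) > delta, np.sign(w), 0.0)


def binarize_inputs(x, threshold):
    """Map activations to {0, 1} by comparing against a fixed threshold."""
    return (x > threshold).astype(np.float64)


def dense_one_bit(x, w, threshold):
    """Dense layer (no bias) with one-bit inputs and binary weights."""
    return binarize_inputs(x, threshold) @ binarize_weights(w)


# Stand-in data: non-negative values imitate post-ReLU activations.
rng = np.random.default_rng(0)
x = np.abs(rng.normal(size=(4, 8)))   # 4 samples, 8 features
w = rng.normal(size=(8, 3))           # 8 inputs, 3 output neurons

# Assumption: pick the input threshold from the observed distribution,
# here simply its median.
threshold = np.median(x)
y = dense_one_bit(x, w, threshold)
print(y)
```

With one-bit inputs and binary weights, every output is a sum of +1 and -1 terms, which is the kind of low-resolution multiply-accumulate that a crossbar column with coarse DACs and ADCs could in principle evaluate. Swapping `binarize_weights` for `ternarize_weights` inside `dense_one_bit` gives the ternary variant.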