
An In-DRAM Neural Network Processing Engine

Authors: Sudarshan, Chirag; Lappas, Jan; Ghaffar, Muhammad Mohsin; Rybalkin, Vladimir; Weis, Christian; Jung, Matthias; Wehn, Norbert

Published in:

Institute of Electrical and Electronics Engineers -IEEE-:
IEEE International Symposium on Circuits and Systems, ISCAS 2019. Proceedings: Sapporo, Japan, May 26-29, 2019
Piscataway, NJ: IEEE, 2019
ISBN: 978-1-72810-397-6
Art. 8702458, 5 pp.
International Symposium on Circuits and Systems (ISCAS) <2019, Sapporo>
Funding: European Commission EC, H2020; 732631; OPRECOMP
Language: English
Type: Conference paper
Institute: Fraunhofer IESE
Keywords: Processing-in-Memory (PIM); DRAM; Binary Weighted Network (BWN); Convolutional Neural Network (CNN)

Abstract
Many advanced neural network inference engines are bounded by the available memory bandwidth. The conventional approach to address this issue is to employ high-bandwidth memory devices or to adopt data compression techniques (reduced precision, sparse weight matrices). Alternatively, an emerging approach to bridge the memory-computation gap and to exploit extreme data parallelism is Processing in Memory (PIM). The close proximity of the computation units to the memory cells reduces the amount of external data transactions and increases the overall energy efficiency of the memory system. In this work, we present a novel PIM-based Binary Weighted Network (BWN) inference accelerator design that is in line with the commodity Dynamic Random Access Memory (DRAM) design and process. In order to exploit data parallelism and minimize energy, the proposed architecture integrates the basic BWN computation units at the output of the Primary Sense Amplifiers (PSAs) and the rest of the substantial logic near the Secondary Sense Amplifiers (SSAs). The power and area values are obtained at sub-array (SA) level using exhaustive circuit-level simulations and full-custom layout. The proposed architecture results in an area overhead of 25% compared to a commodity 8 Gb DRAM and delivers a throughput of 63.59 FPS (Frames per Second) for AlexNet. We also demonstrate that our architecture is extremely energy efficient, achieving 7.25× higher FPS/W than previous works.
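The key property the abstract relies on — that binary weights turn every multiply-accumulate into a sign-controlled add/subtract, which is cheap enough to place next to DRAM sense amplifiers — can be sketched in a few lines. This is a hypothetical NumPy illustration of the general BWN formulation (binarized weights with a per-filter scaling factor `alpha`), not the paper's circuit-level design:

```python
import numpy as np

def binarize_weights(w):
    """Binarize real-valued weights to {-1, +1} with a per-filter
    scaling factor alpha = mean(|w|), as in common BWN formulations."""
    alpha = float(np.mean(np.abs(w)))
    return np.where(w >= 0, 1, -1).astype(np.int8), alpha

def bwn_dot(x, w_bin, alpha):
    """Binary-weight dot product: each 'multiplication' is just an
    addition or subtraction selected by the weight sign — the kind of
    operation a PIM design can implement near the sense amplifiers."""
    acc = 0.0
    for xi, wi in zip(x, w_bin):
        acc += xi if wi > 0 else -xi  # add or subtract, no multiply
    return alpha * acc

# Example: activations and real-valued weights for one filter tap
x = np.array([0.5, -1.0, 2.0, 0.25])
w = np.array([0.8, -0.3, 0.5, -0.9])
w_bin, alpha = binarize_weights(w)
print(bwn_dot(x, w_bin, alpha))  # approximates np.dot(x, w)
```

The same accumulation applies per output pixel of a convolution; the paper's contribution is where in the DRAM hierarchy (PSAs vs. SSAs) these adds and the surrounding logic are physically placed.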

: http://publica.fraunhofer.de/dokumente/N-552646.html