2023
Journal Article
Title

A High-Throughput, Resource-Efficient Implementation of the RoCEv2 Remote DMA Protocol and its Application

Abstract
The use of application-specific accelerators in data centers has been the state of the art for at least a decade, starting with the availability of general-purpose GPUs that achieve higher performance, either overall or per watt. In most cases, these accelerators are coupled to their hosts via PCIe interfaces, which leads to disadvantages in interoperability, scalability, and power consumption. As a viable alternative to PCIe-attached FPGA accelerators, this paper proposes standalone FPGAs as Network-attached Accelerators (NAAs). To enable reliable communication for decoupled FPGAs, we present an RDMA over Converged Ethernet v2 (RoCEv2) communication stack for high-speed, low-latency data transfer, integrated into a hardware framework.
For NAAs to be used instead of PCIe-coupled FPGAs, the framework must provide similar throughput and latency with low resource usage. We show that our RoCEv2 stack achieves 100 Gb/s throughput with latencies below 4 μs while using about 10% of the available resources on a mid-range FPGA. To evaluate the energy efficiency of our NAA architecture, we built a demonstrator with 8 NAAs for machine-learning-based image classification. Based on our measurements, network-attached FPGAs are a less energy-demanding alternative to PCIe-attached FPGA accelerators.
Author(s)
Schelten, Niklas
Fraunhofer-Institut für Nachrichtentechnik, Heinrich-Hertz-Institut HHI  
Steinert, Fritjof
Fraunhofer-Institut für Nachrichtentechnik, Heinrich-Hertz-Institut HHI  
Knapheide, Justin
Fraunhofer-Institut für Nachrichtentechnik, Heinrich-Hertz-Institut HHI  
Schulte, Anton
Fraunhofer-Institut für Nachrichtentechnik, Heinrich-Hertz-Institut HHI  
Stabernack, Benno  
Fraunhofer-Institut für Nachrichtentechnik, Heinrich-Hertz-Institut HHI  
Journal
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
DOI
10.1145/3543176
Language
English
Fraunhofer-Institut für Nachrichtentechnik, Heinrich-Hertz-Institut HHI  
Keyword(s)
  • data center
  • FPGA
  • high-performance computing
  • machine learning
  • Network-attached Accelerator
  • RDMA
  • RoCEv2
