Revisiting network support for RDMA

Radhika Mittal, Alexander Shpiner, Aurojit Panda, Eitan Zahavi, Arvind Krishnamurthy, Sylvia Ratnasamy, Scott Shenker

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

The advent of RoCE (RDMA over Converged Ethernet) has led to a significant increase in the use of RDMA in datacenter networks. To achieve good performance, RoCE requires a lossless network which is in turn achieved by enabling Priority Flow Control (PFC) within the network. However, PFC brings with it a host of problems such as head-of-the-line blocking, congestion spreading, and occasional deadlocks. Rather than seek to fix these issues, we instead ask: is PFC fundamentally required to support RDMA over Ethernet? We show that the need for PFC is an artifact of current RoCE NIC designs rather than a fundamental requirement. We propose an improved RoCE NIC (IRN) design that makes a few simple changes to the RoCE NIC for better handling of packet losses. We show that IRN (without PFC) outperforms RoCE (with PFC) by 6-83% for typical network scenarios. Thus not only does IRN eliminate the need for PFC, it improves performance in the process! We further show that the changes that IRN introduces can be implemented with modest overheads of about 3-10% to NIC resources. Based on our results, we argue that research and industry should rethink the current trajectory of network support for RDMA.

Original language: English (US)
Title of host publication: SIGCOMM 2018 - Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication
Publisher: Association for Computing Machinery, Inc
Pages: 313-326
Number of pages: 14
ISBN (Electronic): 9781450355674
DOIs: https://doi.org/10.1145/3230543.3230557
State: Published - Aug 7 2018
Event: 2018 Conference of the ACM Special Interest Group on Data Communication, ACM SIGCOMM 2018 - Budapest, Hungary
Duration: Aug 20 2018 to Aug 25 2018

Other

Other: 2018 Conference of the ACM Special Interest Group on Data Communication, ACM SIGCOMM 2018
Country: Hungary
City: Budapest
Period: 8/20/18 to 8/25/18

Fingerprint

Ethernet
Flow control
Performance
Packet loss
Artifact
Scenario
Trajectories
Industry
Resources

Keywords

  • Datacenter transport
  • iWARP
  • PFC
  • RDMA
  • RoCE

ASJC Scopus subject areas

  • Communication
  • Electrical and Electronic Engineering
  • Computer Networks and Communications
  • Signal Processing

Cite this

Mittal, R., Shpiner, A., Panda, A., Zahavi, E., Krishnamurthy, A., Ratnasamy, S., & Shenker, S. (2018). Revisiting network support for RDMA. In SIGCOMM 2018 - Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication (pp. 313-326). Association for Computing Machinery, Inc. https://doi.org/10.1145/3230543.3230557

@inproceedings{827a2335bc574898960032ef3572be2b,
title = "Revisiting network support for RDMA",
abstract = "The advent of RoCE (RDMA over Converged Ethernet) has led to a significant increase in the use of RDMA in datacenter networks. To achieve good performance, RoCE requires a lossless network which is in turn achieved by enabling Priority Flow Control (PFC) within the network. However, PFC brings with it a host of problems such as head-of-the-line blocking, congestion spreading, and occasional deadlocks. Rather than seek to fix these issues, we instead ask: is PFC fundamentally required to support RDMA over Ethernet? We show that the need for PFC is an artifact of current RoCE NIC designs rather than a fundamental requirement. We propose an improved RoCE NIC (IRN) design that makes a few simple changes to the RoCE NIC for better handling of packet losses. We show that IRN (without PFC) outperforms RoCE (with PFC) by 6-83{\%} for typical network scenarios. Thus not only does IRN eliminate the need for PFC, it improves performance in the process! We further show that the changes that IRN introduces can be implemented with modest overheads of about 3-10{\%} to NIC resources. Based on our results, we argue that research and industry should rethink the current trajectory of network support for RDMA.",
keywords = "Datacenter transport, iWARP, PFC, RDMA, RoCE",
author = "Radhika Mittal and Alexander Shpiner and Aurojit Panda and Eitan Zahavi and Arvind Krishnamurthy and Sylvia Ratnasamy and Scott Shenker",
year = "2018",
month = "8",
day = "7",
doi = "10.1145/3230543.3230557",
language = "English (US)",
pages = "313--326",
booktitle = "SIGCOMM 2018 - Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication",
publisher = "Association for Computing Machinery, Inc",

}

TY - GEN

T1 - Revisiting network support for RDMA

AU - Mittal, Radhika

AU - Shpiner, Alexander

AU - Panda, Aurojit

AU - Zahavi, Eitan

AU - Krishnamurthy, Arvind

AU - Ratnasamy, Sylvia

AU - Shenker, Scott

PY - 2018/8/7

Y1 - 2018/8/7

N2 - The advent of RoCE (RDMA over Converged Ethernet) has led to a significant increase in the use of RDMA in datacenter networks. To achieve good performance, RoCE requires a lossless network which is in turn achieved by enabling Priority Flow Control (PFC) within the network. However, PFC brings with it a host of problems such as head-of-the-line blocking, congestion spreading, and occasional deadlocks. Rather than seek to fix these issues, we instead ask: is PFC fundamentally required to support RDMA over Ethernet? We show that the need for PFC is an artifact of current RoCE NIC designs rather than a fundamental requirement. We propose an improved RoCE NIC (IRN) design that makes a few simple changes to the RoCE NIC for better handling of packet losses. We show that IRN (without PFC) outperforms RoCE (with PFC) by 6-83% for typical network scenarios. Thus not only does IRN eliminate the need for PFC, it improves performance in the process! We further show that the changes that IRN introduces can be implemented with modest overheads of about 3-10% to NIC resources. Based on our results, we argue that research and industry should rethink the current trajectory of network support for RDMA.

KW - Datacenter transport

KW - iWARP

KW - PFC

KW - RDMA

KW - RoCE

UR - http://www.scopus.com/inward/record.url?scp=85056407588&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85056407588&partnerID=8YFLogxK

U2 - 10.1145/3230543.3230557

DO - 10.1145/3230543.3230557

M3 - Conference contribution

SP - 313

EP - 326

BT - SIGCOMM 2018 - Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication

PB - Association for Computing Machinery, Inc

ER -