Preventing TCP incast throughput collapse at the initiation, continuation, and termination

Adrian S W Tam, Kang Xi, Yang Xu, H. Jonathan Chao

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Incast applications have grown in popularity with the advancement of data center technology. TCP incast flows can suffer throughput collapse as a consequence of TCP retransmission timeouts, which occur when the bottleneck buffer is overwhelmed and packets are lost. This is critical to the Quality of Service of cloud computing applications. Although previous literature has proposed solutions, the problem is not completely solved. In this paper, we investigate three root causes of the poor performance of TCP incast flows and propose three solutions, one each for the beginning, the middle, and the end of a TCP connection. The three solutions are: admission control of TCP flows, so that the flow population does not exceed the network's capacity; timestamp-based retransmission, to detect the loss of retransmitted packets; and reiterated FIN packets, to keep the TCP connection active until the termination of a session is acknowledged. Together, these solutions prevent the throughput collapse. The main idea is to ensure that all ongoing TCP incast flows maintain self-clocking, which eliminates the need to resort to retransmission timeouts for recovery. We evaluate these solutions and find that they work well in preventing retransmission timeouts of TCP incast flows, and hence the throughput collapse.
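
The admission-control idea in the abstract (keeping the flow population within the network's capacity) can be illustrated with a back-of-the-envelope sketch. This is not the paper's algorithm: the `Bottleneck` fields, the function names, the 1460-byte segment size, and the two-segment minimum window are all assumptions chosen for illustration. The sketch simply admits flows while their combined minimum windows fit in the bandwidth-delay product plus the bottleneck buffer.

```python
from dataclasses import dataclass


@dataclass
class Bottleneck:
    """Hypothetical description of the bottleneck port (all fields are assumptions)."""
    buffer_bytes: int    # buffer available at the congested switch port
    link_bps: float      # link rate in bits per second
    rtt_seconds: float   # base round-trip time without queueing


def max_admitted_flows(link: Bottleneck,
                       mss_bytes: int = 1460,
                       min_window_segments: int = 2) -> int:
    """Largest flow count whose minimum windows fit in the pipe plus the buffer."""
    bdp_bytes = link.link_bps * link.rtt_seconds / 8   # bytes in flight on the wire
    capacity_bytes = link.buffer_bytes + bdp_bytes     # wire plus queue
    per_flow_bytes = mss_bytes * min_window_segments   # assumed floor per flow
    return int(capacity_bytes // per_flow_bytes)


def admit(active_flows: int, link: Bottleneck) -> bool:
    """Admit one more flow only if the population stays within capacity."""
    return active_flows < max_admitted_flows(link)


if __name__ == "__main__":
    # Assumed example: 64 KiB of buffer, 1 Gb/s link, 100 microsecond RTT.
    port = Bottleneck(buffer_bytes=64 * 1024, link_bps=1e9, rtt_seconds=100e-6)
    print(max_admitted_flows(port))
```

Under these assumed numbers the limit is a few dozen concurrent flows; a larger buffer or a smaller per-flow floor raises it proportionally.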

Original language: English (US)
Title of host publication: 2012 IEEE 20th International Workshop on Quality of Service, IWQoS 2012
DOI: https://doi.org/10.1109/IWQoS.2012.6245995
State: Published - 2012
Event: 2012 IEEE 20th International Workshop on Quality of Service, IWQoS 2012 - Coimbra, Portugal
Duration: Jun 4 2012 to Jun 5 2012

Other

Other: 2012 IEEE 20th International Workshop on Quality of Service, IWQoS 2012
Country: Portugal
City: Coimbra
Period: 6/4/12 to 6/5/12

Fingerprint

Throughput
Cloud computing
Packet loss
Access control
Quality of service
Recovery

ASJC Scopus subject areas

  • Electrical and Electronic Engineering

Cite this

Tam, A. S. W., Xi, K., Xu, Y., & Chao, H. J. (2012). Preventing TCP incast throughput collapse at the initiation, continuation, and termination. In 2012 IEEE 20th International Workshop on Quality of Service, IWQoS 2012 [6245995] https://doi.org/10.1109/IWQoS.2012.6245995

@inproceedings{85331c6a151241dcbe64d6072c3a2d26,
title = "Preventing TCP incast throughput collapse at the initiation, continuation, and termination",
abstract = "Incast applications have grown in popularity with the advancement of data center technology. TCP incast flows can suffer throughput collapse as a consequence of TCP retransmission timeouts, which occur when the bottleneck buffer is overwhelmed and packets are lost. This is critical to the Quality of Service of cloud computing applications. Although previous literature has proposed solutions, the problem is not completely solved. In this paper, we investigate three root causes of the poor performance of TCP incast flows and propose three solutions, one each for the beginning, the middle, and the end of a TCP connection. The three solutions are: admission control of TCP flows, so that the flow population does not exceed the network's capacity; timestamp-based retransmission, to detect the loss of retransmitted packets; and reiterated FIN packets, to keep the TCP connection active until the termination of a session is acknowledged. Together, these solutions prevent the throughput collapse. The main idea is to ensure that all ongoing TCP incast flows maintain self-clocking, which eliminates the need to resort to retransmission timeouts for recovery. We evaluate these solutions and find that they work well in preventing retransmission timeouts of TCP incast flows, and hence the throughput collapse.",
author = "Tam, {Adrian S W} and Kang Xi and Yang Xu and Chao, {H. Jonathan}",
year = "2012",
doi = "10.1109/IWQoS.2012.6245995",
language = "English (US)",
isbn = "9781467312981",
booktitle = "2012 IEEE 20th International Workshop on Quality of Service, IWQoS 2012",

}

TY - GEN

T1 - Preventing TCP incast throughput collapse at the initiation, continuation, and termination

AU - Tam, Adrian S W

AU - Xi, Kang

AU - Xu, Yang

AU - Chao, H. Jonathan

PY - 2012

Y1 - 2012

N2 - Incast applications have grown in popularity with the advancement of data center technology. TCP incast flows can suffer throughput collapse as a consequence of TCP retransmission timeouts, which occur when the bottleneck buffer is overwhelmed and packets are lost. This is critical to the Quality of Service of cloud computing applications. Although previous literature has proposed solutions, the problem is not completely solved. In this paper, we investigate three root causes of the poor performance of TCP incast flows and propose three solutions, one each for the beginning, the middle, and the end of a TCP connection. The three solutions are: admission control of TCP flows, so that the flow population does not exceed the network's capacity; timestamp-based retransmission, to detect the loss of retransmitted packets; and reiterated FIN packets, to keep the TCP connection active until the termination of a session is acknowledged. Together, these solutions prevent the throughput collapse. The main idea is to ensure that all ongoing TCP incast flows maintain self-clocking, which eliminates the need to resort to retransmission timeouts for recovery. We evaluate these solutions and find that they work well in preventing retransmission timeouts of TCP incast flows, and hence the throughput collapse.

UR - http://www.scopus.com/inward/record.url?scp=84866594857&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84866594857&partnerID=8YFLogxK

U2 - 10.1109/IWQoS.2012.6245995

DO - 10.1109/IWQoS.2012.6245995

M3 - Conference contribution

AN - SCOPUS:84866594857

SN - 9781467312981

BT - 2012 IEEE 20th International Workshop on Quality of Service, IWQoS 2012

ER -