packet reordering in high-speed networks and its impact on high-speed tcp variants

7
Packet reordering in high-speed networks and its impact on high-speed TCP variants Jie Feng, Zhipeng Ouyang, Lisong Xu * , Byrav Ramamurthy Department of Computer Science and Engineering, University of Nebraska–Lincoln, 255 Avery Hall, Lincoln, NE 68588-0115, USA article info Article history: Received 17 December 2007 Received in revised form 13 September 2008 Accepted 15 September 2008 Available online 25 September 2008 Keywords: Packet reordering High-speed networks Reordering model Reordering-tolerant TCP abstract Several recent Internet measurement studies show that the higher the packet sending rate, the higher the packet-reordering probability. This implies that recently proposed high-speed TCP variants are more likely to experience packet reordering than regular TCP in high-speed networks, since they are designed to achieve much higher throughput than regular TCP in these networks. In this paper, we first study the characteristics of packet reordering in high speed networks. Second, we verify the impact of packet reor- dering on high speed TCP variants and evaluate the effectiveness of the existing reordering-tolerant TCP enhancements using simulations. Our simulation results demonstrate that high-speed TCP variants per- form poorly in the presence of packet reordering, and existing reordering-tolerant algorithms can signif- icantly improve the performance of high-speed TCP variants. Ó 2008 Elsevier B.V. All rights reserved. 1. Introduction In the past few years a number of high-speed TCP variants, such as HSTCP [1], STCP [2], BIC [3], and FAST [4], have been proposed to address the under-utilization problem of TCP in high-speed and long-distance networks. These high-speed TCP variants modify the congestion avoidance algorithms [5] of TCP to be more aggres- sive in high-speed networks, but they still use the same fast retransmit and fast recovery algorithms [5] as TCP, which enables TCP to detect and recover from packet loss earlier than the timeout period. However, it is well known [6–8] that the fast retransmit and fast recovery algorithms of TCP may misinterpret packet reor- dering as packet loss, and therefore cause TCP to perform poorly in networks with severe packet reordering. Consequently, high-speed TCP variants using the same fast retransmit and fast recovery algo- rithms may not achieve the expected high throughput when pack- et-reordering occurs. Internet measurement results [9,10] show that a small percent- age of TCP traffic experiences packet reordering, and that only very few reordered packets actually trigger TCP fast retransmit and fast recovery. This explains why TCP can achieve a satisfactory perfor- mance in most cases. However, several recent studies [11–13] demonstrate a strong correlation between inter-packet spacing and packet reordering. Basically, smaller inter-packet spacing may increase the probability of packet reordering. This result im- plies that for the same packet size, the higher the packet sending rate, the greater the packet-reordering probability. For example, a re- cent DARPA SuperNet experiment [11] shows that a flow with a 1500-byte packet size experiences significantly more reordered packets when its sending rate is higher than 600 Mbps. Since high-speed TCP variants achieve much higher throughput than reg- ular TCP in high-speed networks, it is highly likely that they will experience more packet reordering than regular TCP in these networks. While there is some debate [14] about whether packet reorder- ing is a pathological behavior of the Internet, and whether this is- sue should be addressed by designing a reordering-free network or by designing a reordering-tolerant TCP, this paper focuses on char- acterizing packet-reordering phenomenon in high-speed networks, and its impact on recently proposed high-speed TCP variants. Dif- ferent from previous studies [6,8,15–18], in this paper we charac- terize the packet-reordering phenomenon with three parameters: reordering interval, reordering delay, and reordering block size. The new reordering model enables us to study the impact of each of these three parameters on the performance of high-speed TCP variants. The proposed model is not related to any particular phys- ical explanation of the reordering phenomena, and with the pro- posed model, we can study the impact of packet reordering in a simple simulation scenario, since the model is general enough to simulate the effect of most reordering situations. In addition, with the new reordering model, we evaluate the effectiveness of the existing reordering-tolerant TCP enhancements in this work. The rest of this paper is organized as follows: Section 2 briefly reviews the cause of packet reordering and its negative impact on TCP. Section 3 describes a new packet-reordering generator used in our simulation. Section 4 presents the characteristics of packet reordering in high-speed networks. Section 5 discusses the simulation results of high-speed TCP variants in networks with 0140-3664/$ - see front matter Ó 2008 Elsevier B.V. All rights reserved. doi:10.1016/j.comcom.2008.09.022 * Corresponding author. Tel.: +1 402 472 1053. E-mail addresses: [email protected] (J. Feng), [email protected] (Z. Ouyang), [email protected] (L. Xu), [email protected] (B. Ramamurthy). Computer Communications 32 (2009) 62–68 Contents lists available at ScienceDirect Computer Communications journal homepage: www.elsevier.com/locate/comcom

Upload: jie-feng

Post on 26-Jun-2016

219 views

Category:

Documents


7 download

TRANSCRIPT

Page 1: Packet reordering in high-speed networks and its impact on high-speed TCP variants

Computer Communications 32 (2009) 62–68

Contents lists available at ScienceDirect

Computer Communications

journal homepage: www.elsevier .com/locate /comcom

Packet reordering in high-speed networks and its impact on high-speed TCP variants

Jie Feng, Zhipeng Ouyang, Lisong Xu *, Byrav RamamurthyDepartment of Computer Science and Engineering, University of Nebraska–Lincoln, 255 Avery Hall, Lincoln, NE 68588-0115, USA

a r t i c l e i n f o

Article history:Received 17 December 2007Received in revised form 13 September2008Accepted 15 September 2008Available online 25 September 2008

Keywords:Packet reorderingHigh-speed networksReordering modelReordering-tolerant TCP

0140-3664/$ - see front matter � 2008 Elsevier B.V. Adoi:10.1016/j.comcom.2008.09.022

* Corresponding author. Tel.: +1 402 472 1053.E-mail addresses: [email protected] (J. Feng), zouy

[email protected] (L. Xu), [email protected] (B. Ramam

a b s t r a c t

Several recent Internet measurement studies show that the higher the packet sending rate, the higher thepacket-reordering probability. This implies that recently proposed high-speed TCP variants are morelikely to experience packet reordering than regular TCP in high-speed networks, since they are designedto achieve much higher throughput than regular TCP in these networks. In this paper, we first study thecharacteristics of packet reordering in high speed networks. Second, we verify the impact of packet reor-dering on high speed TCP variants and evaluate the effectiveness of the existing reordering-tolerant TCPenhancements using simulations. Our simulation results demonstrate that high-speed TCP variants per-form poorly in the presence of packet reordering, and existing reordering-tolerant algorithms can signif-icantly improve the performance of high-speed TCP variants.

� 2008 Elsevier B.V. All rights reserved.

1. Introduction

In the past few years a number of high-speed TCP variants, suchas HSTCP [1], STCP [2], BIC [3], and FAST [4], have been proposed toaddress the under-utilization problem of TCP in high-speed andlong-distance networks. These high-speed TCP variants modifythe congestion avoidance algorithms [5] of TCP to be more aggres-sive in high-speed networks, but they still use the same fastretransmit and fast recovery algorithms [5] as TCP, which enablesTCP to detect and recover from packet loss earlier than the timeoutperiod. However, it is well known [6–8] that the fast retransmitand fast recovery algorithms of TCP may misinterpret packet reor-dering as packet loss, and therefore cause TCP to perform poorly innetworks with severe packet reordering. Consequently, high-speedTCP variants using the same fast retransmit and fast recovery algo-rithms may not achieve the expected high throughput when pack-et-reordering occurs.

Internet measurement results [9,10] show that a small percent-age of TCP traffic experiences packet reordering, and that only veryfew reordered packets actually trigger TCP fast retransmit and fastrecovery. This explains why TCP can achieve a satisfactory perfor-mance in most cases. However, several recent studies [11–13]demonstrate a strong correlation between inter-packet spacingand packet reordering. Basically, smaller inter-packet spacingmay increase the probability of packet reordering. This result im-plies that for the same packet size, the higher the packet sendingrate, the greater the packet-reordering probability. For example, a re-

ll rights reserved.

[email protected] (Z. Ouyang),urthy).

cent DARPA SuperNet experiment [11] shows that a flow with a1500-byte packet size experiences significantly more reorderedpackets when its sending rate is higher than 600 Mbps. Sincehigh-speed TCP variants achieve much higher throughput than reg-ular TCP in high-speed networks, it is highly likely that they willexperience more packet reordering than regular TCP in thesenetworks.

While there is some debate [14] about whether packet reorder-ing is a pathological behavior of the Internet, and whether this is-sue should be addressed by designing a reordering-free network orby designing a reordering-tolerant TCP, this paper focuses on char-acterizing packet-reordering phenomenon in high-speed networks,and its impact on recently proposed high-speed TCP variants. Dif-ferent from previous studies [6,8,15–18], in this paper we charac-terize the packet-reordering phenomenon with three parameters:reordering interval, reordering delay, and reordering block size.The new reordering model enables us to study the impact of eachof these three parameters on the performance of high-speed TCPvariants. The proposed model is not related to any particular phys-ical explanation of the reordering phenomena, and with the pro-posed model, we can study the impact of packet reordering in asimple simulation scenario, since the model is general enough tosimulate the effect of most reordering situations. In addition, withthe new reordering model, we evaluate the effectiveness of theexisting reordering-tolerant TCP enhancements in this work.

The rest of this paper is organized as follows: Section 2 brieflyreviews the cause of packet reordering and its negative impacton TCP. Section 3 describes a new packet-reordering generatorused in our simulation. Section 4 presents the characteristics ofpacket reordering in high-speed networks. Section 5 discussesthe simulation results of high-speed TCP variants in networks with

Page 2: Packet reordering in high-speed networks and its impact on high-speed TCP variants

J. Feng et al. / Computer Communications 32 (2009) 62–68 63

packet reordering. Section 6 discusses the effectiveness of theexisting reordering-tolerant TCP enhancement. Finally, Section 7provides the conclusion.

2. Background of packet reordering

Packet reordering [19,20] is a phenomenon in which packetswith higher sequence numbers are received earlier than those withsmaller sequence numbers. For example, a TCP sender sequentiallysends four packets: P1, P2, P3 and P4. They may arrive at the TCP re-ceiver in the following order: P1, P3, P2 and P4, where P2 isreordered.

Recent research work [21,14,8] shows that packet reorderingcan be caused by two major reasons. The first reason is the localparallelism within a packet router, which is a promising approachto build a high-speed and inexpensive packet router. Thus, packetreordering is more likely to occur in high-speed networks. The sec-ond reason is the load balancing among multiple links. The analysisin a recent study [21] shows that multiple links with slightly differ-ent link delays may introduce significant packet reordering.

TCP [6] attempts to distinguish packet reordering from packetloss by using the number of dupACKs (duplicate acknowledge-ments). An ACK is generated by a TCP receiver to inform the TCPsender the next sequence number that it is expecting to receive.RFC 2581 [5] suggests that a TCP sender should consider three ormore dupACKs as an indication that a packet has been lost, basedon the assumption that a reordered packet can trigger only oneor two dupACKs. However, if a reordered packet causes three ormore dupACKs, TCP misinterprets it as a lost packet. Consequently,TCP unnecessarily invokes fast retransmit (referred to as false fastretransmit [8]) to retransmit the packet that seems to be lost, andunnecessarily invokes fast recovery to reduce the TCP sending rate.Therefore, TCP performs poorly in networks with severe packetreordering.

Let us consider the above example: four packets arrive at a TCPreceiver in the following order: P1, P3, P2 and P4. When P1 arrives,the receiver sends an ACK to ask the next packet which is P2, butthen P3 arrives, which indicates that P2 is missing. Therefore, thereceiver sends an ACK for P2 again (this ACK is a dupACK). In thisexample, the reordered packet P2 causes only one dupACK. If P2 ar-rives after P5 does, then P2 triggers three dupACKs. Consequently,TCP sender misinterprets it as a packet loss, then P2 isretransmitted.

P8 P7 P6 P5 P4 P3 P2 P1

TCP Sender TCP Receiver

Fig. 1. Original packet sequence without reordering.

3. Packet-reordering models

In this section, we present a new packet-reordering model,which can be used to generate relatively realistic packet-reorder-ing patterns, and more importantly, with which we can study theimpact of specific properties, such as the reordering interval, ofpacket reordering on TCP performance.

3.1. Limitations of current packet-reordering models

Several methods have been proposed to simulate packet reor-dering in previous studies. The models can be classified into twocategories and both categories have their own limitations. Oneclass of models reorders only one packet each time by swappingtwo packets in a router queue [6], or by using hiccup module[15]. However in practice, a block of packets instead of one packetmay be reordered at the same time. The other class of models gen-erates multiple reordered packets each time by extending the NS-2error model to delay a configurable percentage of packets [8], or bychanging the link delay periodically [16]. Although these modelscan generate more realistic reordered traffic, it is hard to isolate

the impact of a specific property of packet reordering on TCP per-formance from that of other properties. For example, it is difficultto isolate the performance of TCP as the number of reordered pack-ets in an reordering event increasing, although we can only verifythe performance degradation when a certain percentage of packetsare reordered. IETF is currently developing metrics [17,18], such asreorder density, to capture the occurrences and characteristics ofpacket reordering in the Internet. However, these metrics are moreappropriate for describing packet-reordering experienced by a flowinstead of generating packet reordering.

3.2. Proposed packet-reordering models

Considering the limitations of current packet-reordering mod-els, we propose a new packet-reordering generator that is imple-mented by extending the error model in NS-2 [22]. The proposedpacket-reordering generator models packet-reordering phenome-non with three parameters, and it enables us to study the impactof each of the three parameters on the performance of high-speedTCP variants.

Our packet-reordering generator can be described by the fol-lowing three parameters:

� Reordering interval: The time interval between two consecutivepacket-reordering events.

� Reordering delay: The time interval from the first reorderedpacket in a reordering event to the earliest forwarded packetwith a higher sequence number.

� Reordering block size: The number of packets being reordered asan entity.

A packet-reordering generator works as a special router withone input and one output port, and it forwards all incoming pack-ets to the outgoing port. To introduce packet reordering by usingthe generator, a reordering block size number of packets are delayedfor a reordering delay every reordering interval, and this is called areordering event. All other packets are forwarded immediatelywithout any delay. All three parameters could be random variablesfollowing certain distributions, which makes the generator providea realistic reordered traffic.

In this paper, we measure the reordering delay in two differentunits. Let tðPiÞ denote the time that packet Pi departs from a gen-erator. Suppose that Px is the first reordered packet in a reorderingevent, and Py is the earliest packet departing from the generatoramong all packets with a higher sequence number (i.e.,tðPyÞ < tðPxÞ and y > x). First, the reordering delay is measured asthe time interval tðPyÞ � tðPxÞ. Second, it is measured as the num-ber of packets forwarded between tðPyÞ and tðPxÞ. Note that wecan determine one of these two values from the another one; ifthe packet sending rate is known and assuming the same inter-ar-rival time for all packets. Both units (i.e., seconds and packets) areused in this paper depending on which one is more convenient touse in any given context.

Figs. 1–3 illustrate how our packet-reordering generator works.Consider that eight packets are sent from a TCP sender to a TCP re-ceiver in the order shown in Fig. 1, and they arrive at a reordering

Page 3: Packet reordering in high-speed networks and its impact on high-speed TCP variants

TCP Sender TCP Receiver

P8 P7 P2 P6 P5 P4 P3 P1

reordering block size = 1

reordering delay = 4 packets

Fig. 2. New packet sequence after packet 2 is reordered with reordering blocksize = 1 and reordering delay = 4 packets.

TCP Sender TCP Receiver

reordering delay = 4 packets

P8 P3 P2 P7 P6 P5 P4 P1

reordering block size = 2

Fig. 3. New packet sequence after packet 2 and 3 are reordered with reorderingblock size = 2 and reordering delay = 4 packets.

0.01

0.1

1

10

100 200 300 400 500 600 700 800 900Reo

rder

ing

inte

rval

(Sec

ond)

Sending rate (Mbps)

AverageMaximal

Fig. 4. The reordering interval in experiments with different data rate. The analysisshows that the higher the packet sending rate, the shorter the average reorderinginterval; that is, the more the reordering events.

0

0.2

0.4

0.6

0.8

1

0 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09

CD

F

Reordering interval (Seconds)

Fig. 5. The cumulative distribution function (CDF) of the reordering intervals in anexperiment with a 900 Mbps rate.

64 J. Feng et al. / Computer Communications 32 (2009) 62–68

generator back-to-back in that order. Assume that packet P2 is thefirst one to be reordered at the beginning of a reordering interval. Ifthe reordering block size is one packet, and the reordering delay isfour packets, then packet P2 is delayed until packets P3 to P6 departfrom the generator. In this case, eight packets arrive at the TCP re-ceiver in the order shown in Fig. 2. Now, we consider a case wherethe reordering block size is two packets. In this case, both packetsP2 and P3 are delayed, and eight packets arrive at the TCP receiverin the order shown in Fig. 3.

4. Characteristics of packet-reordering in high-speed networks

Now, we study the characteristics of packet-reordering in high-speed networks by using the new reordering model, i.e. analyzingthe three parameters: reordering interval, reordering delay and reor-dering block size. The reasonable values for three parameters arefound by analyzing the experimental results conducted by Gharaiet al. [11].

Gharai et al. [11] measured the occurrence of packet reorderingover the DARPA SuperNet. The authors set up three traffic genera-tion and monitoring systems at different places: the USC/ISI-Eastlaboratory in the Washington DC metro area, CMU in Pittsburghand the network operations center at One Wilshire Boulevard inLos Angeles. At each site, the generation system introduce UDP/IPflows into the high-speed network with iperf v1.1.1, and the recei-ver record the traffic with tcpdump for analysis. The test flowswere transmitting at a rate ranging from 1 to 900 Mbps amongthree places. Each test was repeated with three different packetsizes of 500, 1500 and 4500 bytes. A total of 198 group of flowswere tested, comprising about 60 million packets, over the threelinks. In 2006, we borrow their original data (70 Gigabyte data aftercompressing) and analyze the three parameters of our models byusing their primitive experiment data. We observed similar resultsfor experiments with different packet sizes. In this paper, we pri-marily discuss and show our findings for experiments with a pack-et size of 1500 bytes, and give a comparison on packet-reorderinginterval with different packet sizes.

Fig. 4 shows the reordering interval that is averaged over allreordering intervals in all experiments with a 1500-byte packetsize. It clearly shows that the higher the packet sending rate, theshorter the reordering interval; that is the more the reorderingevents. For example, the average reordering intervals at 100 Mbpsis about 10 s, whereas that at 900Mbps is only about 0.02 s. This isconsistent with the observations in [11]. We also observe that evenin the same experiment, the reordering interval is not a constant.

For example, the cumulative distribution function (CDF) of thereordering intervals in an experiment with a 900 Mbps rate shownin Fig. 5 indicates that the maximum and minimum reorderingintervals experienced by the same flow may differ by more than100 times. This suggests that previous simulation results obtainedwith a fixed reordering interval may not reflect the actual perfor-mance of TCP in the presence of packet reordering.

Fig. 6 shows the average and maximum reordering delay mea-sured by first calculating the average and maximum reordering de-lay of each experiment, and then calculating the average among allexperiments with the same sending rate. We note that eventhrough the results are obtained by analyzing all experiment re-sults conducted in [11], the plots are uneven due to the limitednumber of experiments. However, we can still see the pattern thatboth average and maximum reordering delays slightly increase asthe packet sending rate increases. For example, the average reor-dering delay at 100 Mbps is only 1 packet, whereas that at900 Mbps is more than 2 packets. In addition, in most cases, theaverage reordering delay is less than 3 packets. That is, a significantnumber of reordering events lead to less than three dupACKs, andthen do not trigger false fast retransmit. However, a high-speedflow is more likely to experience reordering events with a reorder-ing delay long enough to trigger false fast retransmit.

Fig. 7 shows the average and maximum reordering block sizes.We note that the average reordering block size is relatively insen-sitive to the packet sending rate, and it is always very close to 1.That is, in most reordering events, only one packet is reordered,even for a flow with a high sending rate. We also clearly see thatthe maximum reordering block size increases as the rate increases.This implies that it is more likely for a larger number of packets tobe reordered as an entity for a flow with a higher sending rate.

In terms of packet-reordering interval, we observe that a major-ity of flows with 500 byte packets experience packet reordering atdata rates of 200 Mbps and higher, and the packet-reorderinginterval sharply decreases to less than 0.01 s when the sending rate

Page 4: Packet reordering in high-speed networks and its impact on high-speed TCP variants

0 1 2 3 4 5 6 7 8

100 200 300 400 500 600 700 800 900Reo

rder

ing

dela

y (P

acke

ts)

Sending rate (Mbps)

AverageMaximum

Fig. 6. The reordering delay in experiments with different data rate. The analysisshows that both average and maximum reordering delay slightly increase as thepacket sending rate increases.

0

0.5

1

1.5

2

2.5

3

100 200 300 400 500 600 700 800 900

Reo

rder

ing

Bloc

k Si

ze (P

acke

ts)

Packet Sending Rate (Mbps)

Average Maximum

Fig. 7. The reordering size in experiments with different data rate. From theanalysis, we can see that even though the average reordering block size remainsalmost same for all sending rates, the maximum reordering block size increases asthe rate increases.

J. Feng et al. / Computer Communications 32 (2009) 62–68 65

is higher than 200 Mbps. While most of the 1500-bytes-packetsflows experience packet reordering at sending rates of 600 Mbpsand above, none of the flows with 4500-byte packets at low datarates tested experience reordering. The analysis results are shownin Fig. 8. This coincides with a clear inter-packet arrival timethreshold, that is when the inter-packet arrival time increases be-yond 0.02 ms, the packet-reordering occurs dramatically often.Thus, packet-reordering occurs more often in high-speed networksthan in low-speed networks.

5. Impact of packet reordering on the performance of high-speed TCP variants

The impact of each of the three reordering parameters on theperformance of high-speed TCP variants is studied using simula-tion. We only show the simulation results for BIC [3] and HSTCP[1] in this section. Since all high-speed TCP variants use the sameTCP fast retransmit and fast recovery algorithms, other high-speed

0 2 4 6 8

10 12 14

100 200 300 400 500 600 700 800 900Reo

rder

ing

inte

rval

(Sec

ond)

Sending rate (Mbps)

Average interval with packet size 500Average interval with packet size 1500Average interval with packet size 4500

Fig. 8. The comparison of reordering interval in experiments with different datarate and different packet size.

TCP variants should achieve similar performance degradationtrends as BIC and HSTCP.

5.1. Simulation setup

In this paper, we study the packet reordering in high-speed net-works via NS-2 simulations. The simulation setup is depicted inFig. 9. The reordering generator is introduced between the high-speed TCP sender S1 and the router R1, so that only the TCP connec-tion between S1 and D1 experiences packet reordering. The band-width and one-way delay of the bottleneck link are set to1000 Mbps and 50 ms, respectively. Each source and sink are con-nected to the bottleneck link through different access links withdelays randomly varied from 0.1 to 0.9 ms to mitigate the phase ef-fect. To increase traffic dynamics and further reduce phase effect,various kinds of background traffic are simulated in bothdirections.

In our simulations, the three parameters of the reordering gen-erator are set according to our analysis in Section 4. Both reorder-ing interval and reordering delay are random variables with adistribution approximately following their measured CDFs, respec-tively. Unless otherwise noted, the averages of the reorderinginterval and reordering delay are set to 0.02 s and 20 ls, respec-tively, which correspond to their measured average values at900 Mbps. The reordering block size is set to a fixed value of 1packet.

We use TCP/SACK as the agent for TCP connections (except forreordering tolerant algorithms that need special TCP agents). Thereasons that why TCP/SACK is chosen are as follow. One is due toits higher performance compared with TCP/Reno and TCP/New-Reno. The other one is due to the fact that most of the reorderingtolerant algorithms are based on TCP/SACK.

5.2. Simulation results

In this subsection, we show the simulation results concerningthe impact of packet reordering on high-speed TCP variants, andin this report we use BIC and HSTCP as examples. Fig. 10 illustratesthe impact of reordering intervals on the performance of BIC andHSTCP. We vary the average reordering interval from 0.01 to0.5 s, and set the other two parameters to the values given in Sec-tion 5.1.

From the simulation results, we can make the following threeobservations. First, as we expected, a smaller reordering intervaldoes lead to lower throughput. Second, the throughput of BIC dropssharply when the reordering interval is less than 0.02 s. This is be-cause when the reordering interval is small enough so that a flowmay experience more than one reordering event within one RTT,the high speed TCP variant cannot efficiently recover from false fastretransmit and fast recovery. Third, HSTCP suffers more than BICfrom packet reordering. For example, with a 0.5-s reordering inter-val, BIC can achieve as high as 400 Mbps throughput, however,

reorder generatorS1

S2

Sn Dn

D2

D1

R1 R2

Fig. 9. Simulation network topology.

Page 5: Packet reordering in high-speed networks and its impact on high-speed TCP variants

1

10

100

1000

0.015625 0.03125 0.0625 0.125 0.25 0.5

Thro

ughp

ut (M

bps)

Reordering Interval (Seconds)

BICHSTCP

Fig. 10. The performance of TCP as the reordering interval decreases. The resultshows that the shorter the reordering interval, the lower the throughput of high-speed TCP variants. BIC achieves better performance than HSTCP, but still suffersfrom very small reordering intervals.

1

10

100

1000

0 1 2 3 4

Thro

ughp

ut (M

bps)

Reordering Block Size(Packets)

BICHSTCP

Fig. 12. The throughput degradation as the reordering block size increases. The plotillustrates that even a small reordering block size (such as 1 packet) cansignificantly degrade the throughput of BIC and HSTCP. However, further increasingthe reordering block size only slightly reduces the throughput.

66 J. Feng et al. / Computer Communications 32 (2009) 62–68

HSTCP can achieve only 10 Mbps throughput. The reason is thatBIC is designed to be more aggressive than HSTCP in high-speednetworks, and it can achieve higher throughput even in networkswith severe packet-reordering.

Fig. 11 evaluates the impact of reordering delay. We vary theaverage reordering delay from 0 to 100 ls, and set the other twoparameters to the values given in Section 5.1. First, we note thatboth BIC and HSTCP achieve very high throughput, when there isno packet reordering. However, even a very small reordering delay(such as 20 ls) can significantly degrade their throughput. This isbecause the average reordering interval is set to 0.02 s (that isthe average reordering interval of all experiments with a 900 Mbpsrate conducted in [11]). As discussed in the last paragraph, thisinterval is so small that a flow may experience more than one reor-dering events within one RTT. With a small reordering delay, eventhough one reordering event alone may not lead to false TCP fastretransmit, two or more intermixed reordering events are morelikely trigger false TCP fast retransmit and fast recovery. Second,we see that increasing the reordering delay further only slightly re-duces the throughput. Intuitively, once false TCP fast retransmitand fast recovery are triggered, a longer reordering delay onlymakes the flow stay in fast recovery slightly longer, which doesnot change the throughput too much.

Fig. 12 plots the throughput of BIC and HSTCP for different reor-dering block sizes. As we can see from the figure, the impact of thereordering block size is very similar to that of the reordering delay.Even a small reordering block size (such as one packet) can signif-icantly degrade the throughput of BIC and HSTCP. However, furtherincreasing the reordering block size only slightly reduces thethroughput. The reason is that the TCP/SACK mechanism can re-cover quickly from multiple lost packets or reordered packets inone congestion window unless the succeeding partial ACKs aretriggered. We also observe that, in Figs. 11 and 12, when there is

Fig. 11. The throughput degradation as the reordering delay increases. The plotillustrates that even a small reordering delay (such as 20 ls) can significantlydegrade the throughput of BIC and HSTCP. However, further increasing thereordering delay only slightly reduces the throughput.

no packet reordering, which is shown by the points at 0 reorderingdelay and 0 block size, BIC has a higher throughput than HSTCP.The reason why BIC has a higher throughput than HSTCP is thatBIC is designed to be more aggressive than HSTCP in high-speednetworks, thus in a network environment without severe packetreordering, BIC should achieve higher throughput than HSTCPdoes, and this is consistent with the results shown in the originalBIC paper [3].

The simulation results demonstrate that all three reorderingparameters negatively affect the performance of high-speed TCPvariants. Among the three parameters, reordering interval is themajor cause of the degradation of TCP performance, i.e., the reor-dering interval is very small as measured in DARPA SuperNet[11], high-speed TCP variants suffer significantly from packet reor-dering even with very small reordering delay and block sizes.

6. Effect of existing reordering-tolerant enhancements of TCP

6.1. Related reordering-tolerant TCP enhancements

Most reordering-tolerant algorithms increase the dupthreshfrom the default value of 3 to a larger number to prevent falseTCP fast retransmit and false fast recovery. Blanton and Allman[6] adjust dupthresh using reordering history based on D-SACKinformation [23]. TCP-RR [8] proposed by Zhang et al. uses D-SACKand a cost function of timeout and false fast retransmit to adjustdupthresh. AVG-DEV [16] developed by Ma and Leung adjusts dup-thresh by considering both the average and the deviation of reor-dering history. TCP-NCR [24] [25] designed by Bhandarkar et al.sets dupthresh to approximately the size of a congestion window.TCP-Aix [26] recently proposed by Ekström et al. adjusts dupthreshto avoid false fast retransmits, and in order to improve the perfor-mance under packet reordering, TCP-Aix is also designed to haveother mechanisms, such as separating loss recovery from conges-tion control.

Besides the enhancements introduced above, some variants useother mechanisms to improve the performance of TCP under pack-et reordering. Some algorithms, such as TCP-PR [27] proposed byBlhacek et al., use a time threshold based on RTT estimation to trig-ger TCP fast retransmit and recovery, instead of the number of du-pACKs. Therefore, reordered packets will not be retransmittedwithin the time threshold, which also leads to higher performanceof the TCP connections. Some other algorithms, such as RN-TCP[28] developed by Sathiaseelan and Radzik, require additionalinformation from routers to distinguish packet reordering frompacket loss. Although these algorithms are designed to reducethe probability of false fast retransmit, they are fairly hard toimplement. A more comprehensive survey of reordering-tolerantalgorithms can be found in [20].

Page 6: Packet reordering in high-speed networks and its impact on high-speed TCP variants

Fig. 14. The comparison of throughput of BIC against the reordering delay. Theresults demonstrate that existing reordering-tolerant algorithms can significantlyimprove the throughput of BIC. BIC can achieve very high throughput even withlong reordering delay.

1

10

100

1000

0 1 2 3 4Th

roug

hput

(Mbp

s)Reordering Block Size (Packets)

BIC with TCP-NCRBIC with AVG-DEV

BIC with TCP-RRBIC with SACK

Fig. 15. The comparison of throughput of BIC against the reordering block size. Theresults demonstrate that existing reordering-tolerant algorithms can significantlyimprove the throughput of BIC. However, BIC still suffers from packet reorderingwith large reordering block sizes.

1

10

100

1000

0.015625 0.03125 0.0625 0.125 0.25 0.5 1

Thro

ughp

ut (M

bps)

Reordering Interval (Seconds)

HSTCP with TCP-NCRHSTCP with AVG-DEV

HSTCP with TCP-RRHSTCP with TCP SACK

Fig. 16. The comparison of throughput of HSTCP against the reordering interval.The results demonstrate that existing reordering-tolerant algorithms can signifi-cantly improve the throughput of HSTCP. However, HSTCP still suffers from packetreordering with small reordering intervals.

J. Feng et al. / Computer Communications 32 (2009) 62–68 67

6.2. Evaluation of existing reordering-tolerant TCP enhancements

In this section, we evaluate the effectiveness of some currentreordering-tolerant algorithms in high-speed networks, since allthese reordering-tolerant algorithms are only tested in low speednetworks. We simulate TCP-NCR [24], AVG-DEV [16], and TCP-RR[8], which are three recently proposed algorithms to make TCPmore robust to packet reordering. They use slightly different meth-ods to prevent false fast retransmit and false fast recovery, andtheir simulation results show that these three algorithms workbetter than other similar reordering-tolerant algorithms in theirstudied simulation environments.

In order to study the effectiveness of reordering-tolerant TCPenhancements, we designed some simulations with the sametopology and settings as those in Section 5. The default values ofsimulation parameters are the same as in Figs. 10–12. We choseBIC and HSTCP as the high-speed variants to evaluate the effective-ness of packet-reordering enhancements, and we have obtainedsimilar simulation results for both BIC and HSTCP. Below, we useBIC as an example to analyze the effectiveness of reordering-toler-ant TCP enhancements in high-speed networks. The simulation re-sults of HSTCP are very similar as those of BIC.

Figs. 13–15 illustrate the effectiveness of three reordering-tol-erant algorithms, TCP-NCR, AVG-DEV and TCP-RR, as we vary thereordering interval, reordering delay, and reordering block size.We also show the throughput of BIC without any reordering-toler-ant algorithm (referred to as BIC-SACK in figures) as a referencecase. Figs. 16–18 are the results for HSTCP. Below, we use BIC asan example to analyze the existing reordering-tolerant TCPenhancements. From the simulation results, we observe the fol-lowing: first, all reordering-tolerant algorithms can significantlyimprove the throughput of BIC in all simulation scenarios. Forexample, BIC-SACK achieves less than 10 Mbps throughput in mostcases, whereas BIC with any of TCP-NCR, AVG-DEV, and TCP-RRachieves a rate greater than 100 Mbps throughput in all cases. Sec-ond, TCP-NCR consistently performs better than the other twoalgorithms: AVG-DEV and TCP-RR. This is expected, since TCP-NCR has the largest dupthresh (i.e., the number of dupACKs to trig-ger TCP fast retransmit and recovery) and the virtual layer, andboth of them decrease the probability of false fast retransmit andfalse fast recovery, which leads to a higher performance whenpacket-reordering occurs in the networks. Third, even with TCP-NCR, BIC still suffers from packet reordering, especially with smallreordering intervals and large reordering block sizes. For example,Fig. 15 shows that BIC with TCP-NCR achieves only 463 Mbpsthroughput with a reordering block size of 2 packets, whereas itachieves 632 Mbps throughput with a reordering block size of 1packet. Intuitively, even though TCP-NCR can effectively preventmost false TCP fast retransmit and false fast recovery due to packet

1

10

100

1000

0.015625 0.03125 0.0625 0.125 0.25 0.5

Thro

ughp

ut (M

bps)

Reordering Interval (Seconds)

BIC with TCP-NCRBIC with AVG-DEV

BIC with TCP-RRBIC with SACK

Fig. 13. The comparison of throughput of BIC against the reordering interval. Theresults demonstrate that existing reordering-tolerant algorithms can significantlyimprove the throughput of BIC. However, BIC still suffers from packet reorderingwith small reordering intervals.

Fig. 17. The comparison of throughput of HSTCP against the reordering delay. Theresults demonstrate that existing reordering-tolerant algorithms can significantlyimprove the throughput of HSTCP. HSTCP can achieve very high throughput evenwith long reordering delay.

Page 7: Packet reordering in high-speed networks and its impact on high-speed TCP variants

1

10

100

1000

0 1 2 3 4

Thro

ughp

ut (M

bps)

Reordering Block Size (Packets)

HSTCP with TCP-NCRHSTCP with AVG-DEV

HSTCP with TCP-RRHSTCP with TCP SACK

Fig. 18. The comparison of throughput of HSTCP against the reordering block size.The results demonstrate that existing reordering-tolerant algorithms can signifi-cantly improve the throughput of HSTCP. However, HSTCP still suffers from packetreordering with large reordering block sizes.

68 J. Feng et al. / Computer Communications 32 (2009) 62–68

reordering only, it cannot recover quickly enough from false fastretransmit and recovery caused by persistent packet loss and pack-et reordering in the network.

7. Conclusion

In this paper, we presented a new packet-reordering generatorwhich can be described by three parameters: reordering interval,reordering delay, and reordering block size. The new reorderinggenerator enables us to study the impact of each of these threeparameters on the performance of high-speed TCP variants. Ourmodel is not related to any particular physical reasons of a reorder-ing phenomena, and it is general enough to simulate the effect ofmost reordering situations. In order to enhance our model andmake it generate realistic reordering events, we determine thethree parameters in our model using the actual measured resultsin real high-speed networks. By analyzing the experimental resultsmeasured in the DARPA SuperNet [11], we find that the averagereordering interval decreases as the packet sending rate increases,which is consistent with the findings in [11]. We also find that thereordering delay and reordering block size slightly increases as thepacket sending rate increases. All of these findings imply that a TCPflow with a high sending rate is more likely to experience morefalse TCP fast retransmit and recovery due to packet reordering.

We also performed NS-2 simulation experiments to study theperformance of high-speed TCP variants in the presence of packetreordering. Our simulation results demonstrate that all three reor-dering parameters affect the performance of high-speed TCP vari-ants. When the reordering interval is very small as measured inthe DARPA SuperNet, high-speed TCP variants suffer significantlyfrom packet reordering even with very small reordering delayand block sizes.

Finally, we evaluated the effectiveness of the existing reorder-ing-tolerant TCP enhancements. Our simulation results show thatcurrent reordering-tolerant algorithms can significantly improvethe performance of high-speed TCP variants.

Acknowledgements

We thank Dr. Ladan Gharai, Dr. Colin Perkins and Dr. Tom Leh-man for sharing their experimental results with us. The work re-

ported in this paper is supported in part by a Grant-in-Aid fromUNL Research Council and Nebraska EPSCoR First Award.

References

[1] S. Floyd, HighSpeed TCP for large congestion windows, RFC 3649.[2] T. Kelly, Scalable TCP: improving performance in highspeed wide area

networks, ACM SIGCOMM Computer Communication Review 33 (2) (2003)83–91.

[3] L. Xu, K. Harfoush, I. Rhee, Binary increase congestion control for fast long-distance networks, in: Proceedings of IEEE INFOCOM, Hong Kong, 2004.

[4] C. Jin, D.X. Wei, S.H. Low, FAST TCP: motivation, architecture, algorithms,performance, in: Proceedings of IEEE INFOCOM, Hong Kong, 2004.

[5] M. Allman, V. Paxson, W. Stevens, TCP congestion control, RFC 2581.[6] E. Blanton, M. Allman, Using TCP DSACKs and SCTP duplicate TSNs to detect

spurious retransmissions, RFC 3708.[7] V. Paxson, End-to-end internet packet dynamics, IEEE/ACM Transactions on

Networking 7 (3) (1999) 277–292.[8] M. Zhang, B. Karp, S. Floyd, L. Peterson, RR-TCP: a reordering robust TCP with

DSACK, in: Proceedings of IEEE ICNP, Atlanta, Georgia, 2003.[9] S. Dharmapurikar, V. Paxson, Robust TCP stream reassembly in the presence of

adversaries, in: Proceedings of 14th USENIX Security Symposium, Baltimore,MD, 2005, pp. 222–231.

[10] S. Jaiswal, G. Iannaccone, C. Diot, J. Kurose, D. Towsley, Measurement andclassification of out-of-sequence packets in a Tier-1 IP backbone, in:Proceedings of IEEE INFOCOM, San Francisco, CA, 2003, pp. 1199–1209.

[11] L. Gharai, C. Perkins, T. Lehman, Packet reordering, high speed networks andtransport protocol performance, in: Proceedings of the 13th ICCCN, Chicago, IL,2004, pp. 73–78.

[12] J. Bellardo, S. Savage, Measuring packet reordering, in: Proceedings of ACM/USENIX Internet Measurement Workshop, Marseille, France, 2002, pp. 97 –105.

[13] M. Przybylski, B. Belter, A. Binczewski, Shall we worry about packetreordering? in: Proceedings of TERENA Networking Conference, Poznan,Poland, 2005, pp. 28–36.

[14] J. Bennett, C. Partridge, N. Schectman, Packet reordering is not pathologicalnetwork behavior, IEEE/ACM Transaction on Networking 7 (6) (1999) 789–798.

[15] G. Neglia, V. Falletta, G. Bianchi, Is TCP packet reordering always harmful? in:Proceedings of MASCOTS, The Netherlands, 2004.

[16] C. Ma, K. Leung, Improving tcp robustness under reordering networkenvironment, in: Proceedings of Global Telecommunications Conference,2004.

[17] A. Morton, L. Ciavattone, G. Ramachandran, S. Shalunov, J. Perser, Packetreordering metrics, RFC 4737.

[18] A. Jayasumana, N. Piratla, A. Bare, T. Banka, R. Whitner, J. McCollom, Improvedpacket reordering metrics, RFC 5236.

[19] J. Mogul, Observing TCP dynamics in real networks, in: Proceedings of ACMSIGCOMM, Baltimore, MD, 1992.

[20] K.-C. Leung, V.O.K. Li, D. Yang, An overview of packet reordering intransmission control protocol (tcp): problems, solutions, and challenges, in:IEEE Transactions on Parallel and Distributed Systems (IEEE TPDS), 2006.

[21] N. Piratla, A. Jayasumana, Reordering of packets due to multipath forwarding –an analysis, in: Proceedings of IEEE ICC, Istanbul, Turkey, 2006.

[22] NS-2, Network simulator 2. Available from: <http://www.isi.edu/nsnam/ns/>,2006.

[23] S. Floyd, J. Mahdavi, M. Mathis, M. Podolsky, An extension to the selectiveacknowledgement (sack) option for tcp, in: RFC 2883, 2000.

[24] S. Bhandarkar, A.L.N. Reddy, M. Allman, E. Blanton, Improving the robustnessof tcp to non-congestion events, in: RFC 4653, 2006.

[25] S. Bhandarkar, A.L.N. Reddy, Robustness to packet reordering in high-speednetworks, in: Proceedings of the fifth International Workshop on Protocols forFast Long-Distance Networks, 2007.

[26] S. Landström, H. Ekström, L. Larzon, R. Ludwig, Tcp-aix: making tcp robust toreordering and delay variations, in: Research Report in Luleá University ofTechnology, 2006.

[27] S. Bohacek, J. Hespanha, J. Lee, C. Lim, K. Obraczka, TCP-PR: TCP for persistentpacket reordering, in: Proceedings of IEEE International Conference onDistributed Computing Systems, 2003.

[28] A. Sathiaseelan, T. Radzik, Reorder notifying TCP (RN-TCP) with explicit packetdrop notification, International Journal of Communication Systems 19 (6)(2006) 659–678.