tcp(no ip) review part2

TCP (No-IP) – Part 2 Congestion Management

Upload: diptanshu-singh

Post on 19-Jan-2017


Page 1: Tcp(no ip) review part2

TCP (No-IP) – Part 2: Congestion Management

Page 2: Tcp(no ip) review part2

Disclaimer
• This presentation contains a mixture of self-made slides and material already available on the Internet.
• The goal of this presentation/talk is to provide a TCP refresher.
• At the end you won't know everything about TCP.
• The goal here is to introduce you to the mountains by climbing to the top of a hill. How to climb the mountains is an exercise left for the audience/reader.
• My hope is that you will get at least something new out of this, but if you already knew everything, then at least you got the free lunch.

Page 3: Tcp(no ip) review part2

TCP Congestion Control
• As we saw, TCP's strategy for combating packet loss is retransmission, triggered either by expiry of the retransmission timer or by Fast Retransmit.
• Now assume for a moment that multiple TCP connections sharing a path simply retransmit more packets when that network path is experiencing congestion.
• This only makes things worse. It is like pouring gasoline on a fire, and we all know how that goes.

Page 4: Tcp(no ip) review part2

Motivation
• It's a mind-boggling, unintuitive fact that the quickest way to transmit data is not always to send it as fast as you can.
• Observe the traffic on a highway:
  • Traffic will move faster if the traffic load is high but not too high.
  • When the load is higher than what the highway can handle, a bottleneck will cause the whole highway to slow down.
• It is in fact beneficial to a user to restrict one's own transmission rate if the current rate is more than the network can handle. This "sacrificial act" of restricting one's own transmission rate is known as "congestion control".
• On the other hand, if the network capacity increases, it is "natural" to increase one's transmission rate.

Ref: http://www.mathcs.emory.edu/~cheung/Courses/558a/Syllabus/6-transport/TCP.html

Page 5: Tcp(no ip) review part2

Motivation
• Congestion control is one of the few applications of "Communism" that actually works, because it benefits oneself to do it (as long as everyone else does it too).
• The Communist principle fails in general because it doesn't take into account the "self-beneficial" motives of humans. Ordinary humans only do something if it benefits them. Marx had this crazy fantasy that everyone is a saint (he must have been smoking something).
• Ref: http://www.philosophybasics.com/branch_communism.html “Communism is a socio-economic structure that promotes the establishment of a classless, stateless society based on common ownership of the means of production. It encourages the formation of a proletarian state in order to overcome the class structures and alienation of labour that characterize capitalistic societies, and their legacy of imperialism and nationalism. Communism holds that the only way to solve these problems is for the working class (or proletariat) to replace the wealthy ruling class (or bourgeoisie), through revolutionary action, in order to establish a peaceful, free society, without classes or government. Communism, then, is the idea of a free society with no division or alienation, where humanity is free from oppression and scarcity, and where there is no need for governments or countries and no class divisions. It envisages a world in which each person gives according to their abilities, and receives according to their needs. Its proponents claim it to be the only means to the full realization of human freedom.”

Ref: http://www.mathcs.emory.edu/~cheung/Courses/558a/Syllabus/6-transport/TCP.html

Page 6: Tcp(no ip) review part2

What is TCP Congestion Control?
• It's a set of behaviors, determined by algorithms that each TCP implements, that attempt to prevent the network from being overwhelmed by too large an aggregate offered traffic load.
[Figure: network congestion]

Page 7: Tcp(no ip) review part2

TCP Congestion Control
• To deal with congestion, we want TCP to slow down when congestion occurs or is about to occur.
• When the congestion has subsided, TCP should detect and use an appropriate amount of new bandwidth as it becomes available.
• This seemingly simple task can be quite complicated, because TCP doesn't know the state of the intermediate routers.
• The TCP sender has to somehow detect congestion and then react to it.

Page 8: Tcp(no ip) review part2

Basic Principle
• The basic congestion control principle is simple:
  • Decrease your transmission rate when you detect congestion.
  • Increase your transmission rate if you detect no congestion.
• Two problems need to be solved when designing a congestion control method:
  • How do I detect the onset or presence of congestion?
  • How fast do I increase the transmission rate when no congestion is detected, and how fast do I decrease it when congestion is detected?
• Congestion detection:
  • Explicit signals from the routers. (Present routers don't send signals. QoS policy with ECN??)
  • Packet loss:
    • This can be detected by the source if the receiver uses NACKs.
    • If the transport uses only positive ACKs, a packet loss can be detected through RTO.
    • This is the worst form of feedback signal, because it means that congestion has already occurred and a packet was dropped.
    • For that very reason, TCP takes drastic steps when it detects packet loss.
  • End-to-end delay:
    • This is a much better option than packet loss, but it's harder to use.
    • By measuring the end-to-end delay, the source can detect an impending collapse rather than a collapse that has already happened.
    • End-to-end delay increases/decreases when the queuing delay increases/decreases.

Ref: http://www.mathcs.emory.edu/~cheung/Courses/558a/Syllabus/6-transport/TCP.html

Page 11: Tcp(no ip) review part2

AIMD is at the heart of Congestion Control
• Assume that you obtain a binary congestion indication signal from the network:
  0: No congestion
  1: Congestion
• The next question is:
  • How much to increase the transmission rate if there is no congestion?
  • How much to decrease the transmission rate if there is congestion?
• Design considerations:
  • Is this method fair? Will everyone get a fair share of the available bandwidth?
  • Is this method stable? Does it converge to a solution, or does the system state jump back and forth wildly?
  • How efficient and how distributed is it?
• Raj Jain's paper studies linear increase and decrease for congestion avoidance:
  • http://www.mathcs.emory.edu/~cheung/Courses/558a/Papers/Jain-AIMD.pdf
• AIMD (Additive Increase, Multiplicative Decrease) is a special case of linear increase and decrease.
• Congestion control in TCP uses AIMD.
  • Legacy of Jain's research.

Ref: http://www.mathcs.emory.edu/~cheung/Courses/558a/Syllabus/6-transport/TCP.html

Page 12: Tcp(no ip) review part2

AIMD
• Distributed, fair and efficient
• Packet loss is seen as a sign of congestion and results in a multiplicative rate decrease
  • Factor of 2
• TCP periodically probes for available bandwidth by increasing its rate
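The fairness claim can be illustrated with a small simulation. This is only a sketch under simplifying assumptions that are not in the slides: two flows share one bottleneck, both see the same binary congestion signal at the same time, and the function name `aimd` and its parameters are made up for the illustration.

```python
def aimd(rates, capacity, rounds, increase=1.0, decrease=0.5):
    """Simulate synchronized AIMD for flows sharing one bottleneck.

    Assumption for this sketch: every flow sees the same binary signal
    in every round (1 if the total offered rate exceeds capacity).
    """
    rates = list(rates)
    history = [tuple(rates)]
    for _ in range(rounds):
        if sum(rates) > capacity:
            # Signal 1 (congestion): multiplicative decrease, factor of 2.
            rates = [r * decrease for r in rates]
        else:
            # Signal 0 (no congestion): additive increase.
            rates = [r + increase for r in rates]
        history.append(tuple(rates))
    return history


history = aimd([1.0, 40.0], capacity=50.0, rounds=200)
first, last = history[0], history[-1]
print("initial gap between flows:", abs(first[0] - first[1]))
print("final gap between flows:  ", abs(last[0] - last[1]))
```

The additive step leaves the gap between the two flows unchanged, while every multiplicative decrease halves it, so the flows drift toward an equal share.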

Page 13: Tcp(no ip) review part2

TCP Window: Recap
• Transmit window size:
  • The amount of data that TCP can have "outstanding" (data for which TCP does not yet know whether it has been received).
• Advertised window size:
  • The amount of data that the receiver is willing to buffer (for example, when data arrives out of order).

Ref: http://www.mathcs.emory.edu/~cheung/Courses/558a/Syllabus/6-transport/TCP.html

Page 14: Tcp(no ip) review part2

TCP Window: Congestion Window
• The congestion window size is the amount of data that the sender can inject into the network without causing congestion.
• This amount varies over time (it depends on the current network status; it changes like the weather and is just as unpredictable).
• Relationship between the transmit window and the congestion window:
  • The sender must NEVER transmit more data than the amount that would cause network congestion.
  • Transmit Window Size <= Congestion Window Size (CWND)

Ref: http://www.mathcs.emory.edu/~cheung/Courses/558a/Syllabus/6-transport/TCP.html

Page 15: Tcp(no ip) review part2

TCP Window: Congestion Window
• Advertised Window Size (AWS) = the amount of data that the receiver will buffer. The receiver advertises AWS in every segment it sends, so it can change during the connection as the receive buffer fills and drains.
• Congestion Window Size (CWS) = the window size imposed by the TCP congestion mechanism to avoid causing congestion in the network. CWS changes over time!
• Transmit Window Size (TWS) = the maximum amount of unacknowledged data, i.e., data that TCP transmits in a burst without receiving any indication of what happened to it.
• TWS = min (AWS, CWS)

Ref: http://www.mathcs.emory.edu/~cheung/Courses/558a/Syllabus/6-transport/TCP.html

Page 16: Tcp(no ip) review part2

How does TCP control the transmission rate?
• TCP transmits using a transmit window size equal to:
  • TWS = min (AWS, CWS)
• The TCP congestion control algorithm adjusts the value of CWS according to signals/events from the network.
• A change in CWS can change the transmit window size and thereby indirectly change the transmission data rate.
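The relation TWS = min (AWS, CWS) can be written as a few helper functions. This is a minimal sketch; the function names and the "one window per RTT" rate estimate are assumptions made for the illustration, not part of the slides.

```python
def transmit_window(aws, cws):
    """TWS = min(AWS, CWS), all values in bytes."""
    return min(aws, cws)


def sendable_bytes(aws, cws, bytes_in_flight):
    """How much more data may be sent right now without exceeding TWS."""
    return max(0, transmit_window(aws, cws) - bytes_in_flight)


def approx_rate_bps(aws, cws, rtt_seconds):
    """Rough rate estimate: about one transmit window per round trip."""
    return transmit_window(aws, cws) * 8 / rtt_seconds


# The congestion window is the limit here, so shrinking CWS lowers the rate.
print(transmit_window(aws=65535, cws=14600))
print(sendable_bytes(aws=65535, cws=14600, bytes_in_flight=10000))
```

When the receiver is the bottleneck (AWS < CWS), changing CWS has no effect on the rate until CWS drops below AWS.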

Ref: http://www.mathcs.emory.edu/~cheung/Courses/558a/Syllabus/6-transport/TCP.html

Page 17: Tcp(no ip) review part2

TCP Phases
• TCP always operates in one of two phases:
  • Slow Start phase: the period when TCP has no information about the current network status.
    • In particular, TCP does not know how much traffic the network can handle safely (i.e., without causing congestion).
  • Congestion Avoidance phase: the period when TCP knows that it is transmitting at a data rate that is very close to a rate that can cause congestion.
• TCP updates CWS differently in the two phases.

Ref: http://www.mathcs.emory.edu/~cheung/Courses/558a/Syllabus/6-transport/TCP.html

Page 18: Tcp(no ip) review part2

How does TCP know that it's near congestion?
• It's like playing a video game without a cheat sheet.
  • How do you know when you must be very careful?
  • You walked into a trap in the previous game and died.
  • When you restart the game and arrive at the same situation again, you proceed very carefully.
• The life of TCP is like a never-ending video game.
• When TCP detects congestion (through a packet loss), it records the current transmit window size. (The current window size indirectly determines how fast TCP transmits data.)
• After detecting congestion, TCP uses a smaller transmit window size (this reduces the transmission rate).

Ref: http://www.mathcs.emory.edu/~cheung/Courses/558a/Syllabus/6-transport/TCP.html

Page 19: Tcp(no ip) review part2

How does TCP know that it's near congestion?
• Immediately after the window size reduction, TCP first tries to increase the transmit window size aggressively (because its goal is to transmit the data as fast as possible, but without causing congestion). This phase is the slow start phase.
• When TCP reaches a window size that is close to the one that caused congestion (and loss), it increases the window size much less aggressively. This phase is the congestion avoidance phase.

Ref: http://www.mathcs.emory.edu/~cheung/Courses/558a/Syllabus/6-transport/TCP.html

Page 21: Tcp(no ip) review part2

TCP Phases: Goals
• The goal of TCP is to get the highest possible throughput.
• This is not achieved by sending as fast as possible. It is achieved by sending at a data rate that is close to the available network bandwidth. (Remember the highway analogy.)
• Available network bandwidth changes constantly.
• TCP finds the available bandwidth by remembering the rate at which it last dropped a packet and proceeding carefully, starting from half that rate.
NOTE: The first time TCP starts, it has no idea what the network capacity is, and the only thing it can do is estimate it as well as it can: it increases the data rate very rapidly to find the bottleneck in a very short amount of time.

Ref: http://www.mathcs.emory.edu/~cheung/Courses/558a/Syllabus/6-transport/TCP.html

Page 22: Tcp(no ip) review part2

TCP Congestion Control
• TCP doesn't implement a single congestion avoidance mechanism, but many.
• TCP tries to guess what the situation is, and many times it guesses wrong!
• The congestion avoidance mechanisms employed have very sexy-sounding names.
• Tahoe (Jacobson 1988)
  • Slow Start
  • Congestion Avoidance
  • Fast Retransmit
• Reno (Jacobson 1990)
  • Fast Recovery
• Vegas (Brakmo & Peterson 1994)
  • New Congestion Avoidance
• RED (Floyd & Jacobson 1993)
  • Probabilistic marking
• REM (Athuraliya & Low 2000)
  • Clear buffer, match rate
• Others…

Ref: http://www.mathcs.emory.edu/~cheung/Courses/558a/Syllabus/6-transport/TCP.html

Page 23: Tcp(no ip) review part2

Self-Clocking: ACK Clock
• If we have a large actual window, should we send the data in one shot?
  • No, use ACKs to clock the sending of new data.
• The arrival of an ACK, called the ACK clock, triggers the sender to send another packet.

Page 24: Tcp(no ip) review part2

Slow Start
• The period when the slow start mechanism is used is called the slow start phase.
• Operation of the slow start mechanism:
  • Initialization: set CWS = MSS (i.e., 1 full packet).
  • When TCP receives an ACK, it increases CWS by MSS (i.e., one full packet).
• W = 2^k after k round-trip times, so it takes k = log2 W RTTs to reach a window of W.

Page 25: Tcp(no ip) review part2

Slow Start
• Start with cwnd = 1 (slow start)
• On each successful ACK, increment cwnd:
  cwnd ← cwnd + 1
• Exponential growth of cwnd:
  each RTT: cwnd ← 2 x cwnd
• Enter CA when cwnd >= ssthresh
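A short sketch shows why adding one segment per ACK doubles cwnd every RTT. It assumes an idealized sender (no loss, no delayed ACKs, one ACK per segment) and counts cwnd in segments; the function name is hypothetical.

```python
def slow_start_rtts(target_window, initial_cwnd=1):
    """Count the RTTs slow start needs to reach target_window segments.

    Idealized model: every segment of the current window is ACKed
    within one RTT, and each ACK adds one segment to cwnd.
    """
    cwnd = initial_cwnd
    rtts = 0
    while cwnd < target_window:
        acks_this_rtt = cwnd          # one ACK per segment in flight
        for _ in range(acks_this_rtt):
            cwnd += 1                 # cwnd <- cwnd + 1 on each ACK
        rtts += 1                     # net effect per RTT: cwnd doubles
    return rtts


print(slow_start_rtts(8))    # 1 -> 2 -> 4 -> 8, i.e. log2(8) = 3 RTTs
print(slow_start_rtts(64))   # log2(64) = 6 RTTs
```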

Page 26: Tcp(no ip) review part2

Slow Start
[Figure: sender/receiver timeline of data packets and ACKs over successive RTTs, with cwnd labels running from 1 to 8; cwnd ← cwnd + 1 for each ACK]

Page 27: Tcp(no ip) review part2

Slow Start
• What is Slow Start trying to achieve?
  • Slow start can determine an estimate of the current network capacity very rapidly.
  • By increasing the congestion window size exponentially, TCP quickly finds a congestion window size that causes packet loss.
  • Since the routers in the Internet do not return any indication of their state, the only way to find out whether a certain data rate will cause congestion is to "actually try it" :-)

Ref: http://www.mathcs.emory.edu/~cheung/Courses/558a/Syllabus/6-transport/TCP.html

Page 28: Tcp(no ip) review part2

Slow Start
• When to BEGIN using Slow Start?
  • The slow start mechanism is used when the sender does not know the current network state, i.e., it does not know how much traffic will cause the network to become congested.
• Slow start is used in 2 situations:
  • When a TCP connection is first established. Obviously, when the sender has never transmitted any data, it cannot know what the current network situation is.
  • When TCP has detected a packet loss, it will also use slow start. The reason is a bit more subtle. There can be many reasons for a packet loss; one of them is that someone started a new TCP connection, which added more traffic to the network. Since the sender does not know how fast the new connection is injecting data into the network, it cannot know what the current network capacity is.

Ref: http://www.mathcs.emory.edu/~cheung/Courses/558a/Syllabus/6-transport/TCP.html

Page 29: Tcp(no ip) review part2

Slow Start
• When to STOP using Slow Start?
  • TCP maintains a special variable called SSThresh, the "Slow Start Threshold".
  • SSThresh defines the upper limit of the range of TWS in which TCP uses slow start, i.e.: if TWS <= SSThresh, TCP is in the slow start phase.
  • Consequently, if TWS > SSThresh, TCP is in the congestion avoidance phase, i.e., TCP is transmitting at a rate that may cause congestion. TCP will still try to increase its data rate, but not exponentially.
Note: SSThresh can change over time (because network status changes).

Ref: http://www.mathcs.emory.edu/~cheung/Courses/558a/Syllabus/6-transport/TCP.html

Page 30: Tcp(no ip) review part2

Slow Start: the Million $ question
• We saw that Slow Start increases the window exponentially. Then why on earth are we calling an exponential increase a "Slow Start"?
• Answer: It's in the history.
  • Prior to Jacobson's work, a TCP sender would immediately send as much data as the receiver's advertised window allowed. This caused network collapse.
  • Compared to sending a full AWS worth of packets at once, starting with a small window and growing it gradually is indeed slower, hence "Slow Start".
[Photo: Van Jacobson]

Page 31: Tcp(no ip) review part2

Congestion Avoidance
• The period outside the slow start phase is the congestion avoidance phase.
• Operation of TCP in the congestion avoidance phase:
  • Starts when cwnd >= ssthresh
  • On each successful ACK:
    • cwnd ← cwnd + 1/cwnd
  • Linear growth of cwnd:
    each RTT: cwnd ← cwnd + 1
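The per-ACK rule cwnd ← cwnd + 1/cwnd adds up to roughly one segment per RTT, which the following sketch checks numerically. It assumes cwnd is counted in segments, a full window is ACKed in one RTT, and there is no loss; the function name is made up for the illustration.

```python
def congestion_avoidance_rtt(cwnd):
    """Apply the per-ACK congestion avoidance update for one full RTT.

    Idealized model: int(cwnd) segments are in flight and each one is
    ACKed exactly once during the round trip.
    """
    acks_this_rtt = int(cwnd)
    for _ in range(acks_this_rtt):
        cwnd += 1.0 / cwnd            # cwnd <- cwnd + 1/cwnd on each ACK
    return cwnd


cwnd = 8.0
for rtt in range(1, 4):
    cwnd = congestion_avoidance_rtt(cwnd)
    print(f"after RTT {rtt}: cwnd is about {cwnd:.2f}")
```

The growth per RTT is slightly less than one full segment because cwnd in the denominator increases a little with every ACK.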

Page 32: Tcp(no ip) review part2

Congestion Avoidance

Page 33: Tcp(no ip) review part2

Congestion Avoidance
[Figure: sender/receiver timeline of data packets and ACKs over successive RTTs, with cwnd labels 1, 2, 3, 4; cwnd ← cwnd + 1 for each cwnd ACKs]

Page 34: Tcp(no ip) review part2

Congestion Avoidance
What is Congestion Avoidance trying to achieve?
• As you have already seen, TCP begins a congestion avoidance phase when it reaches a "near congestion point".
• Why would TCP not keep CWS constant? Won't you cause congestion if you keep increasing CWS?
• The answer is that TCP does not really know what the current capacity of the network is, and network conditions keep changing.
• The goal of TCP is to transfer data as fast as possible. If TCP stopped increasing CWS, it would not be true to its goal.
• So during the congestion avoidance period, TCP is testing the tolerance of the network: after it has successfully transferred CWS worth of data, it adds one more packet to the congestion window (CWS + MSS) and retests the network.
• This technique should sound very familiar to all of us: kids have perfected this skill when they ask their parents for favors :-)

Ref: http://www.mathcs.emory.edu/~cheung/Courses/558a/Syllabus/6-transport/TCP.html

Page 35: Tcp(no ip) review part2

Congestion Avoidance
When does the Congestion Avoidance phase end?
• Eventually, TCP will push the congestion window too far and cause some packet drops.
• When a packet loss occurs, it can cause the sending TCP to time out.
• When a timeout occurs, TCP begins a slow start phase:
  • TCP first sets SSThresh = CWS/2.
  • The reason is quite obvious: TCP caused a problem (packet loss, timeout) while using the congestion window size CWS.
  • It remembers where it went wrong by setting SSThresh to half that value.
• Then TCP invokes the slow start mechanism, which sets CWS = 1 x MSS (i.e., 1 packet worth of data) and increases CWS at an exponential rate towards SSThresh (which was set to half of the value that caused problems).

Ref: http://www.mathcs.emory.edu/~cheung/Courses/558a/Syllabus/6-transport/TCP.html

Page 36: Tcp(no ip) review part2

Example: Slow Start/Congestion Avoidance
Assume ssthresh = 8*MSS.
[Figure: sender (S) / receiver (R) exchange and a plot of congestion window size (in MSS) against transmission number, with ssthresh marked. cwnd = 1, 2, 4, 8 during slow start; after eight TCP-PDUs and eight ACKs cwnd = 9; after nine TCP-PDUs and nine ACKs cwnd = 10; after ten TCP-PDUs and ten ACKs cwnd = 11.]

Page 37: Tcp(no ip) review part2

Slow Start & Congestion Avoidance
• Initially:
  - cwnd = 1*MSS
  - ssthresh = very high
• If a new ACK comes:
  - if cwnd < ssthresh, update cwnd according to slow start
  - if cwnd > ssthresh, update cwnd according to congestion avoidance
  - if cwnd = ssthresh, either may be used
• If timeout (i.e. loss):
  - ssthresh = flight size/2
  - cwnd = 1*MSS
[Figure: cwnd over time with the (initial) ssthresh marked; slow start in green, congestion avoidance in blue, and a loss event (e.g. timeout).]
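The rules for slow start, congestion avoidance and timeout can be combined into a per-RTT trace. This is a simplified sketch, not a TCP implementation: it works in whole segments, updates cwnd once per RTT instead of once per ACK, treats cwnd = ssthresh as congestion avoidance, uses cwnd as the flight size, and floors ssthresh at 2 segments. The function name and the `loss_rounds` parameter are hypothetical.

```python
def tahoe_trace(ssthresh, rounds, loss_rounds=()):
    """Return the cwnd (in segments) used in each RTT round.

    loss_rounds lists the round indices in which a timeout is assumed.
    """
    cwnd = 1
    trace = []
    for rnd in range(rounds):
        trace.append(cwnd)
        if rnd in loss_rounds:
            ssthresh = max(cwnd // 2, 2)      # remember half of what failed
            cwnd = 1                          # restart with slow start
        elif cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)    # slow start: double per RTT
        else:
            cwnd += 1                         # congestion avoidance: +1 per RTT
    return trace


# With ssthresh = 8: exponential growth up to 8, then linear growth.
print(tahoe_trace(ssthresh=8, rounds=7))
# A timeout in round 6 (cwnd = 11) sets ssthresh to 5 and cwnd back to 1.
print(tahoe_trace(ssthresh=8, rounds=10, loss_rounds={6}))
```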

Page 38: Tcp(no ip) review part2

Slow Start & Congestion Avoidance

Page 41: Tcp(no ip) review part2

Fast Retransmit
[Figure: TCP sender/receiver timeline at a random point in the transfer, between t = xr and t = (x+1)r. The sender transmits pkt 15 to pkt 20. The receiver returns ack 15 and ack 16, followed by three more ack 16 (3 dup ACKs); the sender then retransmits pkt 17.]
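The duplicate-ACK trigger can be sketched as a counter over the incoming ACK stream. The sketch follows the figure's numbering, where "ack 16" means packets up to 16 were received, so the missing packet is 17; real TCP ACKs carry the next expected byte instead. The function name is an assumption for the illustration.

```python
def fast_retransmit_points(acks, threshold=3):
    """Return the packet numbers retransmitted for a stream of ACKs.

    A retransmission is triggered when the same ACK number has been
    repeated `threshold` times after its first arrival.
    """
    last_ack = None
    dup_count = 0
    retransmits = []
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count == threshold:
                # Figure's convention: ack N covers packet N, so N + 1 is missing.
                retransmits.append(ack + 1)
        else:
            last_ack = ack
            dup_count = 0
    return retransmits


# ack 15, ack 16, then three duplicate ack 16: packet 17 is retransmitted.
print(fast_retransmit_points([15, 16, 16, 16, 16]))
```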

Page 42: Tcp(no ip) review part2

Fast Recovery
• Fast recovery is a relatively small improvement made to TCP after research showed that most fast retransmit actions occur during mild congestion, i.e., only a few routers are congested and the congestion can be cleared very quickly.
• Idea: each dupACK represents a packet that has left the pipe (successfully received).
• Instead of forcing TCP to perform a slow start (which reduces CWS to 1 x MSS), research found that TCP can use a larger congestion window, which lets TCP send at a faster rate than when CWS = 1 x MSS.
• Fast recovery rule:
  • When TCP performs a fast retransmit (and did not time out), set SSThresh = CWS/2 and set CWS = SSThresh + 3 * MSS.
  • Use congestion avoidance with the new settings of SSThresh and CWS (i.e., do not use the very drastic slow start).
  • Each time another duplicate ACK arrives, set cwnd = cwnd + 1. Then send a new data segment if allowed by the value of cwnd.
  • Once a new ACK arrives (an ACK that acknowledges all intermediate segments sent between the lost packet and the receipt of the first duplicate ACK), exit fast recovery. Set CWS = SSThresh and then continue with the linear increase of the congestion avoidance algorithm.

Page 43: Tcp(no ip) review part2

Fast Recovery
• Fast recovery algorithm (avoiding the initial slow start phase)
1. When the third duplicate ACK is received:
   - set ssthresh = cwnd / 2
   - retransmit the missing segment
   - set cwnd = ssthresh + 3
2. Each time another duplicate ACK arrives:
   - increment cwnd by the segment size
   - transmit a new segment (if allowed by the new cwnd value)
3. When the next ACK arrives that acknowledges new data:
   - set cwnd = ssthresh
   - then cwnd = cwnd + 1 every round-trip time
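The three steps can be sketched as a small state machine. This is an illustrative model only: cwnd and ssthresh are counted in whole segments, the class and method names are made up, and the floor of 2 segments on ssthresh is an added assumption.

```python
class RenoSender:
    """Toy model of Reno fast retransmit / fast recovery (units: segments)."""

    def __init__(self, cwnd, ssthresh):
        self.cwnd = cwnd
        self.ssthresh = ssthresh
        self.dupacks = 0
        self.in_recovery = False

    def on_dup_ack(self):
        """Handle one duplicate ACK; return 'retransmit' on the third."""
        self.dupacks += 1
        if self.in_recovery:
            self.cwnd += 1                    # step 2: inflate per extra dupACK
        elif self.dupacks == 3:
            self.ssthresh = max(self.cwnd // 2, 2)
            self.cwnd = self.ssthresh + 3     # step 1: halve, plus 3 dupACKs
            self.in_recovery = True
            return "retransmit"
        return None

    def on_new_ack(self):
        """Handle an ACK for new data; deflate cwnd when leaving recovery."""
        self.dupacks = 0
        if self.in_recovery:
            self.cwnd = self.ssthresh         # step 3: deflate
            self.in_recovery = False


sender = RenoSender(cwnd=8, ssthresh=64)
for _ in range(3):
    action = sender.on_dup_ack()
print(action, sender.ssthresh, sender.cwnd)   # retransmit 4 7
sender.on_dup_ack()
print(sender.cwnd)                            # 8
sender.on_new_ack()
print(sender.cwnd)                            # 4
```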

Page 44: Tcp(no ip) review part2

Fast Recovery

Page 45: Tcp(no ip) review part2

Fast Recovery
[Figure: cwnd over time through Slow Start and Congestion Avoidance, with the (initial) ssthresh marked. Two fast-retransmit events each show "inflating" cwnd with dupACKs and "deflating" cwnd with a new ACK; a timeout is also marked.]
Concept:
• After fast retransmit, reduce cwnd by half and continue sending segments at this reduced level.
Problems:
• The sender has too many outstanding segments.
• How does the sender transmit packets on a dupACK? It needs a "trick": inflate cwnd.

Page 46: Tcp(no ip) review part2

Fast Retransmit & Fast Recovery
• After receiving 3 dupACKs:
  1. Retransmit the lost segment.
  2. Set ssthresh = flight size/2.
  3. Set cwnd = ssthresh, and ndupacks = 3.
  N.B. In Reno: send_win = min ( rwnd, cwnd + ndupacks ).
• If a dupACK arrives:
  • ++ndupacks
  • Transmit a new segment, if allowed.
• If a new ACK arrives:
  • ndupacks = 0
  • Exit fast recovery.
• If the RTO expires:
  • ndupacks = 0
  • Perform slow start (ssthresh = flight size/2, cwnd = 1)

Page 47: Tcp(no ip) review part2

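Fast Retransmit and Recovery
[Figure: sender/receiver timeline. Initial state: cwnd = 7, slow start; segments 0 to 8 are sent and the receiver keeps returning Ack(1); cwnd = 8. Fast retransmit of segment 1: cwnd = 8/2 + 3 = 7, ssthresh = 8/2 = 4, enter fast recovery. Further duplicate Ack(1)s inflate cwnd (cwnd = 8, 9, 10, 11) and segments 9, 10 and 11 are sent. Ack(9): exit fast recovery, cwnd = ssthresh = 4, enter congestion avoidance. Ack(10) follows and segment 12 is sent.]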

Page 48: Tcp(no ip) review part2

Limitation of Previous Algorithm
• If the cwnd size is too small (smaller than 4 packets), it's not possible to get 3 duplicate ACKs and run the algorithm.
• The algorithm cannot handle the loss of multiple packets from a single window of data.
  • It will fall back to a retransmission timeout.
• The algorithm doesn't handle a loss of packets during the Fast Recovery stage.
  • Not a loss of the retransmitted packet
  • There is no recursive run of Fast Retransmit

Page 49: Tcp(no ip) review part2

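[Figure: sender/receiver timeline showing the limitation. Initial state: cwnd = 7, slow start; segments 0 to 8 are sent and the receiver returns duplicate Ack(1)s; cwnd = 8. Fast retransmit of segment 1: cwnd = 8/2 + 3 = 7, ssthresh = 8/2 = 4, enter fast recovery. Further duplicate Ack(1)s inflate cwnd (cwnd = 8, cwnd = 9) and segment 9 is sent. Ack(3): exit fast recovery, cwnd = ssthresh = 4, enter congestion avoidance. Flight size = number of unacknowledged segments; flight size > cwnd, so no new segments are sent. Annotations: "The algorithm doesn't know which segments were acknowledged" and "What would happen if this packet was lost?"]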

Page 50: Tcp(no ip) review part2

Improvements to Fast Recovery
• The idea: the sender remembers the number of the last segment that was sent before entering the Fast Retransmit phase.
• Then it can deal with a situation in which a "new" ACK (one that is not a duplicate ACK) does not cover the last remembered segment (a "partial ACK").
  • This is a situation in which more packets were lost before entering Fast Retransmit.
• After discovering such a situation, the sender retransmits the newly discovered lost packet too and stays in the Fast Recovery stage.
• The sender finishes the Fast Recovery stage when it gets an ACK that covers the last segment sent before the Fast Retransmit.

Page 51: Tcp(no ip) review part2

Improvements to Fast Recovery (I)
• Set ssthresh to max (FlightSize / 2, 2*MSS).
• Record in the "Recover" variable the highest sequence number transmitted.
• Retransmit the lost segment and set cwnd to ssthresh + 3*MSS.
  • The congestion window is increased by the number of segments (three) that were sent and buffered by the receiver.
• For each additional duplicate ACK received, increment cwnd by MSS.
  • Thus, the congestion window reflects the additional segment that has left the network.
• Transmit a segment, if allowed by the new value of cwnd and the receiver's advertised window.

Page 52: Tcp(no ip) review part2

Improvements to Fast Recovery (II)
• When a partial ACK is received:
  • retransmit the first unacknowledged segment
  • deflate the congestion window by the amount of newly acknowledged data, then add back one MSS
  • send a new segment if permitted by the new value of cwnd
• When an acknowledgment of all of the data up to and including "Recover" arrives:
  • In our example: set cwnd to ssthresh
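The partial-ACK handling can be sketched as a single decision function. It is a simplified model under stated assumptions: everything is counted in segments, Ack(n) means "next expected segment is n", an ACK greater than Recover counts as a full ACK, and the function name and returned dictionary are made up for the illustration.

```python
def newreno_on_ack(ack, snd_una, recover, cwnd, ssthresh):
    """Decide what a sender in (modified) fast recovery does with a new ACK.

    ack      -- next segment the receiver expects
    snd_una  -- oldest unacknowledged segment before this ACK
    recover  -- highest segment sent before entering fast retransmit
    """
    if ack > recover:
        # Full ACK: everything up to and including Recover was received.
        return {"cwnd": ssthresh, "in_recovery": False, "retransmit": None}
    # Partial ACK: another segment from the same window was lost.
    newly_acked = ack - snd_una
    return {
        "cwnd": cwnd - newly_acked + 1,   # deflate, then add back one segment
        "in_recovery": True,
        "retransmit": ack,                # first unacknowledged segment
    }


# Partial ACK: Ack(3) with Recover = 6 and cwnd = 7 gives cwnd = 7 - (3 - 1) + 1 = 6.
print(newreno_on_ack(ack=3, snd_una=1, recover=6, cwnd=7, ssthresh=3))
# Full ACK: Ack(8) is beyond Recover, so cwnd = ssthresh = 3 and recovery ends.
print(newreno_on_ack(ack=8, snd_una=3, recover=6, cwnd=6, ssthresh=3))
```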

Page 53: Tcp(no ip) review part2

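[Figure: sender/receiver timeline for the improved fast recovery. Initial state: cwnd = 5, slow start; cwnd = 6. Duplicate Ack(1)s trigger a fast retransmit of segment 1: cwnd = 6/2 + 3 = 6, ssthresh = 6/2 = 3, Recover = 6, enter fast recovery; a further duplicate Ack(1) gives cwnd = 7. Ack(3) is a partial ACK (Recover >= Ack): segment 3 is retransmitted and cwnd = 7 - (3 - 1) + 1 = 6. A later ACK with Recover < Ack ends fast recovery: cwnd = ssthresh = 3, enter congestion avoidance. Ack(8) and Ack(9) follow.]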

Page 54: Tcp(no ip) review part2

TCP SACK
• The basic problem is that cumulative ACKs provide little information.
• If there are multiple packet losses, the sender is forced to either:
  • wait a round-trip time to find out about each lost packet, or
  • unnecessarily retransmit segments that have been correctly received.
• SACK can help in these situations.

Page 55: Tcp(no ip) review part2

TCP SACK
Saves RTT time.

Page 56: Tcp(no ip) review part2

Without SACK and With SACK
Saves unnecessary data retransmission.
[Figure: two sender/receiver timelines. TCP without SACK: segments 100-199, 200-299, 300-399, 400-499 and 500-599 are sent; the receiver returns ACK 200 and then three more ACK 200; the sender fast-retransmits 200-599 and receives ACK 600. TCP with SACK: the same segments are sent; the receiver returns ACK 200, then ACK 200 with SACK 300-400, SACK 300-500 and SACK 300-600; the sender fast-retransmits only 200-299 and receives ACK 600.]

Page 57: Tcp(no ip) review part2

TCP Options (SACK)
• How many SACK blocks can be present in a SACK option?
• A SACK block is represented by a pair of 32-bit sequence numbers.
• If there are n SACK blocks, the SACK option is (8n + 2) bytes long. 2 bytes are used to hold the kind and length of the SACK option.
• Due to the limited option space (40 bytes) in a TCP header, the maximum number of SACK blocks that can be sent in a single segment is THREE (assuming the Timestamp option is used).
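The limit of three blocks follows from simple arithmetic, sketched below. The helper names are hypothetical; the 12 bytes for the Timestamp option assume its 10 bytes are padded with two NOPs, which is common but not required.

```python
TCP_MAX_OPTION_BYTES = 40   # option space available in a TCP header


def sack_option_length(n_blocks):
    """Length in bytes of a SACK option carrying n_blocks blocks.

    Each block is a pair of 32-bit sequence numbers (8 bytes);
    kind and length take 2 more bytes.
    """
    return 8 * n_blocks + 2


def max_sack_blocks(other_option_bytes=0):
    """Largest n such that 8n + 2 fits in the remaining option space."""
    return (TCP_MAX_OPTION_BYTES - other_option_bytes - 2) // 8


print(max_sack_blocks(other_option_bytes=12))  # with timestamps: 3
print(max_sack_blocks())                       # SACK alone: 4
```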

Page 58: Tcp(no ip) review part2

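Another SACK Example
[Figure: sender/receiver timeline with the receiver buffer drawn after each arrival. 100-299 arrives: ACK 300, buffer holds 100 to 299. 300-499 is sent. 500-699 arrives: ACK 300, SACK 500-700, buffer holds 500 to 699 with a hole starting at 300. 700-899 is sent. 900-1099 arrives: ACK 300, SACK 900-1100, 500-700, buffer holds 500 to 699 and 900 to 1099 with a hole starting at 300. 1100-1299 is sent.]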

Page 59: Tcp(no ip) review part2

Another SACK Example
[Figure: continuation of the timeline, starting with 1100-1299 sent and the buffer holding 500 to 699 and 900 to 1099. 300-499 arrives: ACK 700, SACK 900-1100. 700-899 arrives: ACK 1100.]
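The receiver side of this example can be sketched as a function that tracks which byte ranges have arrived. It is a simplified model, not the full SACK specification: ranges are half-open (so 500-699 is written as (500, 700)), the most recently changed block is listed first, the number of reported blocks is not capped, and the 1100-1299 segment from the figure is left out. The function name is made up.

```python
def receiver_acks(arrivals, initial_seq):
    """Return (cumulative ACK, SACK blocks) after each arriving segment."""
    next_expected = initial_seq
    blocks = []                       # out-of-order data, newest block first
    replies = []
    for start, end in arrivals:
        # Merge the new segment with any block it overlaps or touches.
        merged = (start, end)
        remaining = []
        for block in blocks:
            if block[0] <= merged[1] and merged[0] <= block[1]:
                merged = (min(block[0], merged[0]), max(block[1], merged[1]))
            else:
                remaining.append(block)
        blocks = [merged] + remaining
        # Advance the cumulative ACK over blocks that are now in order.
        progressed = True
        while progressed:
            progressed = False
            for block in blocks:
                if block[0] <= next_expected:
                    next_expected = max(next_expected, block[1])
                    blocks.remove(block)
                    progressed = True
                    break
        replies.append((next_expected, list(blocks)))
    return replies


arrivals = [(100, 300), (500, 700), (900, 1100), (300, 500), (700, 900)]
for ack, sack in receiver_acks(arrivals, initial_seq=100):
    print("ACK", ack, "SACK", sack)
```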

Page 60: Tcp(no ip) review part2

Evolution of TCP (1974 to 1990)
• 1974: TCP described by Vint Cerf and Bob Kahn in IEEE Trans Comm
• 1975: Three-way handshake, Ray Tomlinson, in SIGCOMM 75
• 1981: TCP & IP, RFC 793 & 791
• 1983: BSD Unix 4.2 supports TCP/IP
• 1984: Nagle's algorithm to reduce the overhead of small packets; predicts congestion collapse
• 1986: Congestion collapse first observed
• 1987: Karn's algorithm to better estimate round-trip time
• 1988: Van Jacobson's algorithms: slow start, congestion avoidance, fast retransmit (all implemented in 4.3BSD Tahoe), SIGCOMM 88
• 1990: 4.3BSD Reno: fast recovery, delayed ACKs

Page 61: Tcp(no ip) review part2

Evolution of TCP (1993 to 1996)
• 1993: TCP Vegas (not implemented), real congestion avoidance (Brakmo et al)
• 1994: ECN, Explicit Congestion Notification (Floyd)
• 1994: T/TCP, Transaction TCP (Braden)
• 1996: NewReno, modified fast recovery
• 1996: SACK TCP, Selective Ack (Floyd et al)
• 1996: Improving TCP startup (Hoe)
• 1996: FACK TCP, Forward Ack extension to SACK (Mathis et al)

Page 62: Tcp(no ip) review part2

Summary: Packet Loss Management
• TCP Reno (RFC 2581) can handle the loss of at most one packet from a single window of data.
• TCP NewReno (RFC 2582) can handle the loss of more than one packet without changing the TCP message structure.
• TCP SACK (RFC 2018) copes with the loss of more than one packet by changing the message structure (using TCP options).

Page 63: Tcp(no ip) review part2

Summary of TCP Behavior
• When entering slow start:
  if the connection is new,
    ssthresh = arbitrarily large value
    cwnd = 1
  else,
    ssthresh = max(flight size/2, 2*MSS)
    cwnd = 1
• In slow start: ++cwnd on new ACK

TCP variation | Response to 3 dupACKs | Response to partial ACK of fast retransmission | Response to "full" ACK of fast retransmission
Tahoe | Do fast retransmit, enter slow start | ++cwnd | ++cwnd
Reno | Do fast retransmit, enter fast recovery | Exit fast recovery, deflate window, enter congestion avoidance | Exit fast recovery, deflate window, enter congestion avoidance
NewReno | Do fast retransmit, enter modified fast recovery | Fast retransmit and deflate window, remain in modified fast recovery | Exit modified fast recovery, deflate window, enter congestion avoidance

• When entering either fast recovery or modified fast recovery:
  ssthresh = max(flight size/2, 2*MSS)
  cwnd = ssthresh
• In congestion avoidance: cwnd += 1*MSS per RTT

Page 64: Tcp(no ip) review part2

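TCP Throughput: Derivation
Matthew Mathis equation
• TCP throughput $= \dfrac{\text{data delivered per cycle}}{\text{time per cycle}}$
• In each cycle the window grows from $\frac{W}{2}$ to $W$ by one packet per RTT, so a cycle lasts $\frac{W}{2}$ round trips and the total time taken per cycle is $\mathrm{RTT} \cdot \frac{W}{2}$.
• The data sent per cycle equals the area of the shaded region, which is the area of the orange region plus the area of the red region: $\left(\frac{W}{2}\right)^2 + \frac{1}{2}\left(\frac{W}{2}\right)^2 = \frac{3}{8}W^2$ packets.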

Page 65: Tcp(no ip) review part2

TCP Throughput: Derivation
Probability Basics
• Scenario: If I play 5 times and I lose once, what is the probability that I lose? $P = \frac{1}{5}$
• If one packet is dropped after sending a given number of packets $N$, what is the probability of a drop? $p = \frac{1}{N}$

Page 66: Tcp(no ip) review part2

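TCP Throughput: Derivation
Matthew Mathis equation
• So we concluded that the total number of packets delivered in the shaded area is $\frac{3}{8}W^2$ packets per cycle.
• By probability, we know we can deliver $\frac{1}{p}$ packets before we drop one: $A = \frac{1}{p}$, where area $A = \frac{3}{8}W^2$. So $p = \frac{8}{3W^2}$, which means $W = \sqrt{\frac{8}{3p}}$.
• Going back to the original formula of throughput: $\text{Throughput} = \dfrac{\text{data per cycle}}{\text{time per cycle}} = \dfrac{\frac{3}{8}W^2 \cdot \mathrm{MSS}}{\mathrm{RTT} \cdot \frac{W}{2}} = \dfrac{3}{4} \cdot \dfrac{W \cdot \mathrm{MSS}}{\mathrm{RTT}}$
• We know the value of $W = \sqrt{\frac{8}{3p}}$, so substituting this in the above formula: $\text{Throughput} = \dfrac{3}{4}\sqrt{\dfrac{8}{3p}} \cdot \dfrac{\mathrm{MSS}}{\mathrm{RTT}} = \dfrac{\mathrm{MSS}}{\mathrm{RTT}}\sqrt{\dfrac{3}{2p}} = \dfrac{\mathrm{MSS}}{\mathrm{RTT}} \cdot \dfrac{C}{\sqrt{p}}$, where $C = \sqrt{\frac{3}{2}} \approx 1.22$.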
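To get a feel for the numbers, the result can be evaluated directly. The sketch assumes the commonly quoted form of the Mathis equation, throughput ≈ (MSS/RTT) · C/√p with C = √(3/2), and a peak window of W = √(8/(3p)) packets; the function names and the sample values (1460-byte MSS, 100 ms RTT, 0.01% loss) are illustrative only.

```python
import math


def mathis_window(loss_prob):
    """Peak window W in packets for a given loss probability p."""
    return math.sqrt(8.0 / (3.0 * loss_prob))


def mathis_throughput_bps(mss_bytes, rtt_seconds, loss_prob):
    """Approximate steady-state throughput in bits per second."""
    c = math.sqrt(3.0 / 2.0)          # constant C from the derivation
    return (mss_bytes * 8 / rtt_seconds) * c / math.sqrt(loss_prob)


p = 0.0001
print(f"peak window: about {mathis_window(p):.0f} packets")
rate = mathis_throughput_bps(mss_bytes=1460, rtt_seconds=0.1, loss_prob=p)
print(f"throughput: about {rate / 1e6:.1f} Mbit/s")
```

Because throughput scales with 1/√p, cutting the loss rate by a factor of 4 only doubles the achievable rate, while halving the RTT doubles it directly.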

Page 67: Tcp(no ip) review part2

Questions ?