tcp 10. tcp – purpose tcp provides reliable data transmission over an unreliable network. tcp...

TCP 10

Post on 22-Dec-2015


Page 1: TCP 10. TCP – purpose TCP provides reliable data transmission over an unreliable network. TCP provides congestion control TCP provides flow control TCP

TCP

10


TCP – purpose

• TCP provides reliable data transmission over an unreliable network.

• TCP provides congestion control.

• TCP provides flow control.

• TCP passes messages
– Inputs
  • Destination address
  • Destination port
  • Source port (socket)
  • Message
– Outputs
  • Message
  • Error reporting

• If TCP reports that the message has been delivered, then we can rest assured that the receiving host's TCP has the data. Whether the application has read it, and what the application does with it, is another story.

• At least 85% of all traffic uses TCP... but I heard that 50% of traffic in S. Korea uses UDP (gaming).

• UDP
– No flow control
– No error reporting (little error reporting)

[Figure: protocol stack. BGP, FTP, HTTP, SMTP, and telnet run on top of TCP; TCP, UDP, ICMP, and OSPF run on top of IP.]


TCP header

• IP header is 20 bytes (source IP, destination IP, protocol, TTL, …)

• TCP header is 20 bytes (without options)

Field layout (widths in bits):

Source port (16) | Destination port (16)
Sequence # (32)
ACK # (32)
Header length (4) | Reserved (6) | URG | ACK | PSH | RST | SYN | FIN | REC WIN (16)
CHECK SUM (16) | Urgent ptr (16)
Options and padding


• Ports – used so a single host can have many connections at the same time. When a packet arrives, it is distinguished by the source IP, source port, and destination port. More or less, the IPs and ports define an application.

• Sequence number – indicates the 1st byte of the data.

• ACK# is the next expected sequence number.

• Header length is in 32-bit words. With 4 bits, the max header size is 15 words = 60 bytes. 20 bytes are used by the fixed header, so up to 40 more bytes can be in options.

• Flags
– URG – urgent ptr (urgent data and valid urgent ptr, e.g., Ctrl-C)
– ACK – ACK number is valid
– PSH – push: the receiver should pass this data to the application as soon as possible. As opposed to what? PSH should be set when this packet empties the sender's outgoing buffer, so the receiver should not wait for a full buffer before passing data to the app. Just pass it on now.
– RST – reset connection (something went wrong; good for detecting attacks)
– SYN – synchronize sequence numbers
– FIN – sender is finished sending data
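As a rough illustration of the header fields and flags described above, the following Python sketch decodes the fixed 20-byte header. The function name `parse_tcp_header`, the dictionary keys, and the example port numbers are made up for this example; options are not parsed, and the 6 reserved bits follow the layout on the slide.

```python
import struct

# Flag bit positions in the low 6 bits of the 16-bit offset/flags word.
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}

def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-byte TCP header; options are ignored."""
    if len(segment) < 20:
        raise ValueError("a TCP header is at least 20 bytes")
    (src_port, dst_port, seq, ack, off_flags,
     rwin, checksum, urg_ptr) = struct.unpack("!HHIIHHHH", segment[:20])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        # The top 4 bits give the header length in 32-bit words.
        "header_len": (off_flags >> 12) * 4,
        "flags": {name for name, bit in FLAG_BITS.items() if off_flags & bit},
        "rwin": rwin,
        "checksum": checksum,
        "urg_ptr": urg_ptr,
    }

# Hypothetical SYN/ACK segment: header length 5 words, SYN and ACK set.
example = struct.pack("!HHIIHHHH", 80, 12345, 197, 2198,
                      (5 << 12) | 0x12, 65535, 0, 0)
```

Calling `parse_tcp_header(example)` yields a header length of 20 bytes and the flag set containing SYN and ACK.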


Connection establishment

Node A initiates a connection with node B => node A performs an active open, node B a passive open (listen).

1. Source sends SYN: SYN=1, seq#=2197, ACK=0
2. Dest sends SYN/ACK: SYN=1, seq#=197, ACK=1, ack#=2198
3. Source sends ACK (for the SYN): ACK flag=1, ack#=198, seq#=2198

The initial sequence number in the SYN depends on the implementation…


Connection establishment

• If the first SYN is dropped, it is resent 3 seconds later. If that one is dropped, it is resent 6 seconds later, and so on, doubling each time. The maximum waiting time is 64 seconds; depending on the implementation, the maximum can be as high as 180 seconds.

• If the listener doesn’t get an ACK, it retransmits its SYN/ACK after 3 seconds and backs off in the same way.

• But if the listener gets a data packet, the ACK flag will be set, and this completes the connection establishment.

• Often during connection establishment, connection setup data is included in the options.
– E.g., the segment size is included in the options.
– More options are discussed later.
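The SYN retry schedule described above can be sketched in a few lines of Python. The 3-second start and 64-second cap come from the slide; the doubling rule, the retry count, and the name `syn_retry_delays` are assumptions for illustration, since real stacks differ.

```python
def syn_retry_delays(initial: float = 3.0, cap: float = 64.0, retries: int = 6) -> list:
    """Waiting time before each SYN retransmission, doubling up to a cap."""
    delays = []
    delay = initial
    for _ in range(retries):
        delays.append(min(delay, cap))
        delay *= 2  # assumed exponential back-off
    return delays
```

With the defaults, the waits are 3, 6, 12, 24, 48, and then 64 seconds.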


Connection termination

• FIN flag implies no more data will be sent from that host.
• A FIN from each side closes the connection.
• A FIN from only one side puts the connection in the half-close state.
• Example
– Node A closes first
  • A sends pkt with FIN=1 and seq#=U (A enters FIN_WAIT_1)
  • B responds with ACK and ack#=U+1 (B enters CLOSE_WAIT)
  • A receives ACK (A enters FIN_WAIT_2)
– Now B closes
  • B sends pkt with FIN set and seq#=V (B enters LAST_ACK)
  • A responds with ACK and ack#=V+1 (A enters TIME_WAIT, stays there for 120 seconds, and then enters CLOSED)
  • B receives ACK and enters CLOSED.

• Use netstat to determine the state of the TCP connections.


Sending data

• Either side can send data. The sequence number indicates where the first byte is placed in the receiver buffer.

• The receiver responds with an ACK; the ack# indicates the next empty byte location in the buffer.

Example (the SYN had seq#=14, and the receiver buffer already holds ‘Steve’ in bytes 15–19):

1. Sender: Seq#=20, Ack#=1001, Data = ‘Hi’, size = 2 (bytes). ‘Hi’ is placed in bytes 20–21.
2. Receiver: Seq#=1001, Ack#=22, Data size = 0.
3. Sender: Seq#=22, Ack#=1001, Data = ‘Bye’, size = 3 (bytes). ‘Bye’ is placed in bytes 22–24.
4. Receiver: Seq#=1001, Ack#=25, Data size = 0.

Note: here the receiver is not sending data, so its seq# never changes, and the ack# that the sender puts in its packets never changes either. But the definitions of ack# and seq# remain valid. SYN and FIN packets are special cases: they carry no data, but each still advances the ack# by one.
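The seq#/ack# bookkeeping above reduces to a small calculation. This is a minimal sketch; the helper name `next_ack` is hypothetical, and the wrap at 2^32 reflects the 32-bit sequence number field.

```python
def next_ack(seq: int, payload_len: int, syn: bool = False, fin: bool = False) -> int:
    """ack# the receiver returns for an in-order segment.

    SYN and FIN each consume one sequence number even though they carry no data.
    """
    consumed = payload_len + (1 if syn else 0) + (1 if fin else 0)
    return (seq + consumed) % 2**32  # sequence numbers are 32 bits wide
```

For the slide's example, `next_ack(20, 2)` gives 22 for ‘Hi’ and `next_ack(22, 3)` gives 25 for ‘Bye’; a bare SYN with seq#=2197 is acknowledged with 2198.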


Retransmission time-out

• How to decide when a packet should be retransmitted?

• Two methods. Here we talk about the first: when the ACK has not been received in a long time, TCP assumes that the packet was dropped.

• How long is a long time…..? No good solution.

Van Jacobson’s algorithm

$$SRTT_{k+1} = \alpha\, SRTT_k + (1-\alpha)\, RTT_k$$

$$RTTMD_{k+1} = (1-\beta)\, RTTMD_k + \beta\, |SRTT_k - RTT_k|$$

with $\alpha = 0.9$ or $7/8$ and $\beta = 0.25$.

RTT is the round-trip time. SRTT is a smoothed (filtered) version of RTT. RTTMD accounts for the variance of RTT.

$$RTO_k = \max(SRTT_k + 4\, RTTMD_k,\ MinRTO)$$

MinRTO = 200 ms in Linux, 500 ms in BSD; the RFCs say it should be 1 second.

This does not work all that well: really, it is MinRTO that controls when time-outs occur. But more analysis is required.
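A minimal Python sketch of the smoothed-RTT and RTO update, assuming α = 7/8, β = 0.25, and the Linux-style MinRTO of 200 ms quoted on the slide. The class name `RtoEstimator` is hypothetical, and seeding the mean deviation with half of the first sample is an assumption borrowed from common practice, not something the slide states.

```python
class RtoEstimator:
    """Tracks SRTT and RTTMD and derives the retransmission time-out (seconds)."""

    def __init__(self, first_rtt: float, alpha: float = 7 / 8,
                 beta: float = 0.25, min_rto: float = 0.2):
        self.alpha = alpha
        self.beta = beta
        self.min_rto = min_rto
        self.srtt = first_rtt
        self.rttmd = first_rtt / 2  # assumed initial value

    def rto(self) -> float:
        return max(self.srtt + 4 * self.rttmd, self.min_rto)

    def update(self, rtt: float) -> float:
        """Fold in one RTT sample and return the new RTO."""
        # The deviation is measured against SRTT before SRTT itself is updated.
        self.rttmd = (1 - self.beta) * self.rttmd + self.beta * abs(self.srtt - rtt)
        self.srtt = self.alpha * self.srtt + (1 - self.alpha) * rtt
        return self.rto()
```

With small, steady RTTs the result is pinned at `min_rto`, which is the slide's point that MinRTO ends up controlling when time-outs occur.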


RTO analysis

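Suppose that the pdf of RTT is $\lambda e^{-\lambda r}$ (exponentially distributed, e.g., M/M/1 queue), so the mean RTT is $1/\lambda$. The mean deviation is

$$\int_0^\infty \left| r - \frac{1}{\lambda} \right| \lambda e^{-\lambda r}\,dr = \int_0^{1/\lambda} \left( \frac{1}{\lambda} - r \right) \lambda e^{-\lambda r}\,dr + \int_{1/\lambda}^{\infty} \left( r - \frac{1}{\lambda} \right) \lambda e^{-\lambda r}\,dr = \frac{e^{-1}}{\lambda} + \frac{e^{-1}}{\lambda} = \frac{2e^{-1}}{\lambda}$$

$$P_{timeout} = P\left(R > \frac{1}{\lambda} + 4 \cdot \frac{2e^{-1}}{\lambda}\right) = \int_{(1 + 8e^{-1})/\lambda}^{\infty} \lambda e^{-\lambda r}\,dr = e^{-(8e^{-1} + 1)} \approx 0.019 \approx 2\%$$

[Figure: two plots of P(RTT > RTO) versus K; K runs from 0 to 20 and the vertical axis from 0 to 0.07.]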

Using the July 25, 2001 snapshot of round-trip times from the NLANR data set, we computed the empirical probability of spurious timeouts. The total data set consists of nearly 13000 connections between 122 sites and 17.5 million round-trip time measurements. The data consist of a time series of round-trip times for each connection, with each time series containing 1440 round-trip times (one sample per minute over the entire day).
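The 2% figure from the exponential RTT analysis can be checked numerically. This sketch assumes the estimator has converged (SRTT equals the mean and RTTMD equals the mean deviation) and ignores MinRTO; the function names and the parameter `k` (the multiplier 4 in the RTO formula) are labels chosen here, and whether `k` matches the K axis of the plots is an assumption.

```python
import math

def mean_deviation(lam: float) -> float:
    """Mean deviation of an exponential RTT with rate lam: 2 / (lam * e)."""
    return 2.0 / (lam * math.e)

def timeout_probability(lam: float = 1.0, k: float = 4.0) -> float:
    """P(RTT > mean + k * mean deviation) for an exponentially distributed RTT."""
    rto = 1.0 / lam + k * mean_deviation(lam)
    return math.exp(-lam * rto)
```

The result does not depend on `lam`, because both the mean and the mean deviation scale with 1/λ.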


Detecting drops with triple Dup ACKs

Example (the receiver buffer holds ‘Steve’ in bytes 15–19; the segment carrying ‘Bye’ is dropped):

1. Sender: Seq#=20, Ack#=1001, Data = ‘Hi’, size = 2 (bytes)
2. Sender: Seq#=22, Ack#=1001, Data = ‘Bye’, size = 3 (bytes), dropped
3. Receiver: Seq#=1001, Ack#=22, Data size = 0 (ACK for ‘Hi’)
4. Sender: Seq#=25, Ack#=1001, Data = ‘Wazup’, size = 5 (bytes)
5. Receiver: Seq#=1001, Ack#=22, Data size = 0 (1st dup ACK)
6. Sender: Seq#=30, Ack#=1001, Data = ‘Give’, size = 4 (bytes)
7. Receiver: Seq#=1001, Ack#=22, Data size = 0 (2nd dup ACK)
8. Sender: Seq#=34, Ack#=1001, Data = ‘Me’, size = 2 (bytes)
9. Receiver: Seq#=1001, Ack#=22, Data size = 0 (3rd dup ACK)
10. Sender retransmits: Seq#=22, Ack#=1001, Data = ‘Bye’, size = 3 (bytes)
11. Receiver: Seq#=1001, Ack#=36, Data size = 0

The receiver buffers ‘Wazup’, ‘Give’, and ‘Me’ (bytes 25–35) while bytes 22–24 are missing. Once ‘Bye’ arrives, the gap is filled and the ACK jumps to 36.
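The duplicate-ACK exchange above can be reproduced with a small receiver model. This is a sketch under simplifying assumptions (no overlapping segments, unlimited buffer); `cumulative_acks` and `fast_retransmit_index` are names invented for this example.

```python
def cumulative_acks(first_expected: int, arrivals: list) -> list:
    """Return the ack# the receiver sends after each arriving (seq, length) segment."""
    expected = first_expected
    buffered = {}  # out-of-order segments: seq -> length
    acks = []
    for seq, length in arrivals:
        if seq == expected:
            expected += length
            # Filling a gap may release segments that were buffered earlier.
            while expected in buffered:
                expected += buffered.pop(expected)
        elif seq > expected:
            buffered[seq] = length
        acks.append(expected)
    return acks

def fast_retransmit_index(acks: list, dup_thresh: int = 3):
    """Index of the ACK that is the dup_thresh-th duplicate, or None."""
    last, dups = None, 0
    for i, ack in enumerate(acks):
        if ack == last:
            dups += 1
            if dups == dup_thresh:
                return i
        else:
            last, dups = ack, 0
    return None

# 'Hi', then 'Wazup', 'Give', 'Me' arrive while 'Bye' (seq 22) is missing,
# and finally the retransmitted 'Bye' arrives.
arrivals = [(20, 2), (25, 5), (30, 4), (34, 2), (22, 3)]
```

Here `cumulative_acks(20, arrivals)` returns `[22, 22, 22, 22, 36]`, and the third duplicate is the ACK at index 3.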


Why triple dup ACK?

• Why not one DUP ACK?

1. Bennett and Partridge, “Packet reordering is not pathological network behavior,” 1999. This paper showed that packet reordering can/does occur. Further research into this could be a project.
   1. The reason for the packet reordering is that routers have parallel paths through them. So, depending on the order of arrival and the packet sizes, the outgoing order can differ from the incoming order.
   2. Supposedly this was only a problem with older model Juniper routers. There are many of these routers out there. Cisco field day!
   3. Reordering only happens when the packets arrive at nearly the same time. This might not happen that much in TCP (see ACK clocking later).
   4. However, this is an active research area.
   5. Load balancing can cause packets to take different paths. This can cause reordering. Load balancing is a good project topic.
   6. Route flap can also cause reordering.

2. Why not a larger DUPThres (larger than 3)?
   1. This causes other problems.
   2. Limited transmit can help. See my papers on TCP-PR for details.

3. Using triple DUP ACKs instead of RTO is called fast retransmit because the drop is detected faster.


Flow control – so the receiver doesn’t get overwhelmed.

• The amount of unacknowledged data must not exceed the receiver window.

• As the receiver’s buffer fills, the receiver decreases the advertised receiver window.

Example (the SYN had seq#=14; the receiver buffer covers bytes 15–23 and already holds ‘Steve’):

1. Sender: Seq#=20, Ack#=1001, Data = ‘Hi’, size = 2 (bytes)
2. Receiver: Seq#=1001, Ack#=22, Data size = 0, Rwin=2
3. Sender: Seq#=22, Ack#=1001, Data = ‘By’, size = 2 (bytes)
4. Receiver: Seq#=1001, Ack#=24, Data size = 0, Rwin=0 (the buffer is full: ‘Steve’, ‘Hi’, ‘By’)
5. The application reads the buffer.
6. Receiver: Seq#=1001, Ack#=24, Data size = 0, Rwin=9
7. Sender: Seq#=24, Ack#=1001, Data = ‘e’, size = 1 (bytes)

Flow control – window probe

Same exchange as before up to Rwin=0 (Ack#=24), after which the application reads the buffer. The sender does not depend on hearing the window update:

1. After 3 s, the sender sends a window probe: Seq#=24, Ack#=1001, Data size = 0 (bytes)
2. Receiver: Seq#=1001, Ack#=24, Data size = 0, Rwin=9
3. Sender: Seq#=24, Ack#=1001, Data = ‘e’, size = 1 (bytes)

Flow control – repeated window probes

If the receiver’s buffer stays full, each probe is answered with Rwin=0 and the sender backs off:

1. After 3 s, the sender sends a window probe: Seq#=24, Ack#=1001, Data size = 0 (bytes)
2. Receiver: Seq#=1001, Ack#=24, Data size = 0, Rwin=0
3. After 6 s more, the sender probes again: Seq#=24, Ack#=1001, Data size = 0 (bytes)

Max time between probes is 60 or 64 seconds.


Receiver window

• The receiver window field is 16 bits.

• Default receiver window
– By default, the receiver window is in units of bytes.
– Hence 64 KB is the max receiver window for any (default) implementation.
– Ethernet frames carry 1500 bytes (TCP data = 1460).
– So that would give 44 packets.
– If the bit-rate were 10 Mbps, what RTT makes this window size equal to the bandwidth delay product?

• Receiver window scale
– During SYN, one option is receiver window scale.
– This option provides the amount to shift the receiver window.
– E.g., if rec win scale = 4 and rec win = 10, then the real receiver window is 10<<4 = 160 bytes.
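The receiver-window arithmetic above, including the RTT question, can be worked through in a few lines. The helper names are hypothetical, and the maximum unscaled window is taken as 65535 bytes (the largest 16-bit value).

```python
def scaled_window(rwin_field: int, scale_shift: int) -> int:
    """Real receiver window in bytes after applying the window scale option."""
    return rwin_field << scale_shift

def full_segments(window_bytes: int, mss: int = 1460) -> int:
    """How many full-sized segments fit in the window."""
    return window_bytes // mss

def rtt_for_bdp(window_bytes: int, bit_rate_bps: float) -> float:
    """RTT (seconds) at which window_bytes equals the bandwidth delay product."""
    return window_bytes * 8 / bit_rate_bps
```

With an unscaled 65535-byte window, 44 full segments fit, and at 10 Mbps the window equals the bandwidth delay product when the RTT is about 52 ms.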


Congestion Control

• Make sure not to overwhelm the network

• How much data to put into the network?

• The sender maintains the congestion window (cwnd), which is the maximum number of unacknowledged packets.

• InFlight is the number of unacked packets.

• If InFlight < cwnd, then a packet can be sent.

• When an ACK arrives, InFlight decreases so another packet can be sent.
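The InFlight < cwnd rule above amounts to a simple gate on sending. A minimal sketch, counting in packets; the class and method names are invented for illustration.

```python
class WindowedSender:
    """Sends only while the number of unacked packets is below cwnd."""

    def __init__(self, cwnd: int):
        self.cwnd = cwnd
        self.in_flight = 0

    def send_all_allowed(self) -> int:
        """Send until the window is full; return how many packets went out."""
        sent = 0
        while self.in_flight < self.cwnd:
            self.in_flight += 1
            sent += 1
        return sent

    def on_new_ack(self, acked: int = 1) -> None:
        """A new ACK removes packets from flight, opening room in the window."""
        self.in_flight = max(0, self.in_flight - acked)
```

With cwnd = 4, the sender emits four packets and then stalls; each arriving ACK releases exactly one more.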


Suppose that cwnd = 4*MSS and MSS = 1000.

MSS is the maximum segment size = min of the segment sizes of sender and receiver. It is negotiated during SYN.

1. Sender: Seq#=20, Ack#=1001, Data = …, size = 1 MSS (bytes). InFlight = 1 MSS
2. Sender: Seq#=1020, Ack#=1001, Data = …, size = 1 MSS (bytes). InFlight = 2 MSS
3. Sender: Seq#=2020, Ack#=1001, Data = …, size = 1 MSS (bytes). InFlight = 3 MSS
4. Sender: Seq#=3020, Ack#=1001, Data = …, size = 1 MSS (bytes). InFlight = 4 MSS (the window is full, so the sender waits)
5. Receiver: Seq#=1001, Ack#=1020, Data size = 0. InFlight = 3 MSS
6. Sender: Seq#=4020, Ack#=1001, Data = …, size = 1 MSS (bytes). InFlight = 4 MSS
7. Receiver: Seq#=1001, Ack#=2020, Data size = 0. InFlight = 3 MSS
8. Sender: Seq#=5020, Ack#=1001, Data = …, size = 1 MSS (bytes). InFlight = 4 MSS


ACK clocking

What is the maximum rate that ACKs can arrive at the sender?


ACK clocking

[Figure: sender and receiver connected through three links of 100 Mbps, 10 Mbps, and 100 Mbps; the 10 Mbps link is the bottleneck. The same path is drawn a second time for the returning ACKs.]

• Packets can leave the sender at 100 Mbps.
• Packets leave the bottleneck link at a rate of 10 Mbps.
• What rate do packets leave the last link? Ans: 10 Mbps; they arrive at 10 Mbps.
• What about the ACKs? What rate do ACKs leave the receiver? Ans: 40/1040 * 10 Mbps. Or, at a rate such that if a packet is sent for each ACK, then the packets are sent at 10 Mbps.
• What about the packets the sender releases in response? 10 Mbps. Perfect!!!
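The 40/1040 answer above follows from counting packets per second at the bottleneck. A small sketch, assuming 1040-byte data packets, 40-byte ACKs, and one ACK per data packet (no delayed ACKs); the function names are made up.

```python
def packets_per_second(bottleneck_bps: float, data_bytes: int = 1040) -> float:
    """Data packets per second that the bottleneck link lets through."""
    return bottleneck_bps / (8 * data_bytes)

def ack_bit_rate(bottleneck_bps: float, data_bytes: int = 1040,
                 ack_bytes: int = 40) -> float:
    """Bit rate of the ACK stream when every data packet is ACKed once."""
    return packets_per_second(bottleneck_bps, data_bytes) * 8 * ack_bytes

def clocked_send_rate(bottleneck_bps: float, data_bytes: int = 1040) -> float:
    """Sender's bit rate if it releases one data packet per arriving ACK."""
    return packets_per_second(bottleneck_bps, data_bytes) * 8 * data_bytes
```

For a 10 Mbps bottleneck the ACK stream runs at about 385 kbps, and a sender that releases one packet per ACK ends up sending at exactly the bottleneck rate.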


Congestion control

• ACK clocking makes the sender not send any faster than the bottleneck link speed.

• But how to “fill the pipe?”

[Figure: sending timeline. Bursts sent at the “burst” rate of 10 Mbps alternate with idle gaps in which no packets are sent (wasted bandwidth).]

We only send cwnd packets in a burst. How big should cwnd be?


The number of packets sent in one RTT is the cwnd. In order to not waste bandwidth, how many packets should be sent?


Cwnd (bytes) = link byte-rate (bytes/s) * RTT (s), where the link is the bottleneck link.

Bandwidth delay product = link byte-rate (bytes/s) * RTT (s)


Congestion control

• Ideally cwnd = bandwidth delay product.

• This ignores fairness. If N flows use the same link, then ideally cwnd = bandwidth delay product/N.

• But how to find this value???


TCP congestion control

• Theme: probe the system.
– Slowly increase cwnd until there is a packet drop. That must imply that the cwnd size (or sum of window sizes) is larger than the BWDP.
– Once a packet is dropped, decrease the cwnd. Then continue to slowly increase.

• Two phases:
– Slow start (to get to the ballpark of the correct cwnd)
– Congestion avoidance (to oscillate around the correct cwnd size)

[Figure: state diagram with the states connection establishment, slow start, congestion avoidance, and connection termination; the transitions are labeled cwnd > ssthresh, triple dup ACK, and timeout.]


Slow start

• Used when the connection first starts (and after a timeout for today’s TCPs).

• Cwnd starts at 1 or 2 MSS.

• For each non-dup ACK received, the window size increases by one MSS.

• This increase continues until the window reaches the value of ssthresh.

• The initial value of ssthresh is often large (taken as infinite). So the Rwin limits the growth of the window.


Slow start

[Figure: slow-start timeline.
Handshake: SYN (Seq#=20, Ack#=X); SYN/ACK (Seq#=1000, Ack#=21); ACK (Seq#=21, Ack#=1001).
The sender then sends one 1000-byte segment (Seq#=21, Ack#=1001), which is ACKed with Seq#=1001, Ack#=1021, size = 0. Each ACK that arrives raises cwnd by one and releases two new segments, so cwnd steps through 1, 2, 3, 4, 5, 6, 7, 8 until the pipe is full!]


Slow start (continued)

• Cwnd doubles every RTT!! In the timeline, cwnd reaches 1, 2, 4, and 8 at the end of successive RTTs.

• What is happening once the pipe is full? Now the queue is filling. Either it will fill and drop a packet, or the RecWin will stop cwnd from increasing.
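The doubling can be seen by applying the "+1 per ACK" rule round by round. This is a simplified sketch (no drops, no delayed ACKs, one ACK per segment); the function name and the default values for `ssthresh` and `rwin` are arbitrary choices for illustration.

```python
def slow_start_rounds(initial_cwnd: int = 1, ssthresh: int = 64,
                      rwin: int = 16, rounds: int = 6) -> list:
    """Segments sent in each RTT: min(cwnd, rwin), with cwnd += 1 per ACK."""
    cwnd = initial_cwnd
    sent_per_round = []
    for _ in range(rounds):
        window = min(cwnd, rwin)  # the receiver window caps what can be sent
        sent_per_round.append(window)
        for _ in range(window):   # one ACK comes back per segment sent
            if cwnd < ssthresh:
                cwnd += 1
    return sent_per_round
```

With the defaults the sender emits 1, 2, 4, 8, 16, 16 segments in successive RTTs: the window doubles until the receiver window stops the growth.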


• If RecWin != inf and RecWin < bandwidth delay product + queue size, and there are no other packets, then there will never be a drop. Lots of conditions, but a large number of flows do not experience drops.

• If RecWin and ssthresh are both infinite and the outgoing link of the sender is not the bottleneck, then eventually there will be a drop. If the drop is detected with triple dup ACK, then cwnd = cwnd/2 and congestion avoidance is entered.

• If the drop(s) is (are) detected with a timeout, then ssthresh = cwnd/2, cwnd = 1, and slow start begins again.

• If ssthresh < bandwidth delay product + queue size and RecWin > ssthresh, then congestion avoidance is entered.


Congestion Avoidance

Basics: additive increase multiplicative decrease (AIMD)!!

Rough view:
• For every cwnd’s worth of packets, cwnd is incremented by one.
• When there is a drop, cwnd = cwnd/2.

[Figure: timeline of segments and ACKs, with Seq# in units of MSS. cwnd grows 4, 5, 6 over successive round trips; after duplicate ACKs for segment 15 signal a drop, cwnd goes from 6 to 3.]
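The rough AIMD view can be traced with a per-RTT model. This is a sketch only: it works in whole packets, treats each round trip as one step, and takes the rounds in which a drop is detected as an input; the name `aimd` is hypothetical.

```python
def aimd(cwnd: int, rounds: int, drop_rounds: set) -> list:
    """cwnd at the start of each RTT under additive increase, multiplicative decrease."""
    history = []
    for r in range(rounds):
        history.append(cwnd)
        if r in drop_rounds:
            cwnd = max(1, cwnd // 2)  # multiplicative decrease on a drop
        else:
            cwnd += 1                 # additive increase: +1 per cwnd's worth of ACKs
    return history
```

Starting from cwnd = 4 with a drop in the third round, the trace is 4, 5, 6, 3, 4, 5, which matches the 4, 5, 6 then 3 pattern in the figure.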


Rough view of TCP congestion control

[Figure: cwnd versus time. Slow start runs until cwnd = ssthresh, then congestion avoidance continues until a drop; the slow start / congestion avoidance cycle repeats after each drop.]


TCP - more detailed view

• Delayed ACKs
– The worry was that the network was going to be all jammed up with ACKs.
– So instead of sending an ACK for every packet, delay the ACK and maybe ACK two packets.
  • Generate an ACK for at least every other packet.
  • Don’t delay an ACK by more than 500 ms (the exact number depends on the implementation).
  • If packets are out of order, generate an ACK for every packet.
  • Also, immediately send an ACK when a “gap” in the buffer is filled.
– Delayed ACKs can greatly slow down a connection.
  • E.g., the ACK for the first packet is delayed by up to 500 ms.
  • Depending on the implementation, cwnd will grow more slowly.


Details - Fast recovery

• cwnd after a drop

• Recall, TCP only sends packets when InFlight < Cwnd.

• InFlight only decreases when a new ACK is received, i.e., a DUP ACK does not cause InFlight to change.
– If a DUP ACK arrives, it means that a packet arrived at the receiver and an ACK was sent. So the number of packets in the network has decreased, and InFlight should decrease.
– But maybe the network has duplicated the ACK. To be conservative, leave InFlight as is (I guess).


Fast recovery

• Upon the first two DUP ACKs, do nothing. Don’t send any packets (InFlight is the same).

• Upon the third DUP ACK,
– set ssthresh = cwnd/2,
– set cwnd = cwnd/2 + 3,
– retransmit the requested packet.

• Upon every additional DUP ACK, cwnd = cwnd + 1.

• If InFlight < cwnd, send a packet and increment InFlight.

• When a new ACK arrives, set cwnd = ssthresh (RENO).

• When an ACK arrives that ACKs all packets that were outstanding when the first drop was detected, cwnd = ssthresh (NEWRENO).
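The Reno rules above can be sketched as a small state holder. Units are packets, InFlight tracking and the actual retransmission are left out, and the class name `RenoFastRecovery` is invented; this is an illustration of the cwnd arithmetic, not a full sender.

```python
class RenoFastRecovery:
    """cwnd/ssthresh bookkeeping for Reno fast retransmit and fast recovery."""

    def __init__(self, cwnd: float):
        self.cwnd = cwnd
        self.ssthresh = float("inf")
        self.dup_acks = 0
        self.in_recovery = False

    def on_dup_ack(self) -> bool:
        """Return True when the missing packet should be retransmitted."""
        self.dup_acks += 1
        if self.dup_acks < 3:
            return False              # first two DUP ACKs: do nothing
        if self.dup_acks == 3:
            self.ssthresh = self.cwnd / 2
            self.cwnd = self.cwnd / 2 + 3
            self.in_recovery = True
            return True
        self.cwnd += 1                # each further DUP ACK inflates cwnd
        return False

    def on_new_ack(self) -> None:
        """A new ACK ends recovery (Reno) and deflates cwnd to ssthresh."""
        if self.in_recovery:
            self.cwnd = self.ssthresh
            self.in_recovery = False
        self.dup_acks = 0
```

Starting from cwnd = 6, the third DUP ACK gives cwnd = 6/2 + 3 = 6, a fourth gives 7, and the next new ACK sets cwnd = 3.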


Fast recovery

[Figure: fast recovery timeline, with Seq# in units of MSS. cwnd grows 4, 5, 6; on the third DUP ACK for segment 15, cwnd = 6 = 6/2 + 3, and the next DUP ACK raises it to 7; when the new ACK arrives, cwnd = 3. InFlight is tracked alongside cwnd.]


Fast recovery – multiple drops - RENO

[Figure: the same timeline with two drops in one window (DUP ACKs for 12 and later for 15). Fast recovery is triggered for the first drop (cwnd = 6 = 6/2 + 3) and then triggered again for the second drop (cwnd = 5 = 2 + 3).]

Why is this bad? The first drop told us that we were sending too fast. The second drop tells us the same thing (already). So why react to the same news twice? This motivates NewReno.


Fast Recovery – multiple drops - NewReno

• The problem was that one of the packets that was outstanding when the drop was detected was also dropped.

• Solution (NewReno)
– When a drop is detected,
  • ssthresh = cwnd/2
  • cwnd = cwnd/2 + 3
  • Recover = seq# of the largest byte sent
  • Retransmit the dropped packet
– Upon a DUP ACK, increment cwnd and send if InFlight < cwnd
– If the ACK is larger than the previous ACK, but smaller than recover (partial ACK)
  • Suppose that the previous ack# = X and now ack# = Y < recover
  • Retransmit the dropped packet
  • cwnd = cwnd - (Y-X) + 1
  • Of course, InFlight = InFlight - (Y-X)
  • So transmit another packet (that makes two transmissions)
– If ACK > recover,
  • cwnd = ssthresh
  • Exit fast recovery
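The partial-ACK rules above can be traced with a small model. Units are packets rather than bytes, only the window arithmetic is modeled (nothing is actually sent), and the class name and constructor arguments are assumptions made for this sketch.

```python
class NewRenoRecovery:
    """Window bookkeeping from the moment the third DUP ACK is detected."""

    def __init__(self, cwnd: float, in_flight: int, last_ack: int, highest_sent: int):
        self.ssthresh = cwnd / 2
        self.cwnd = cwnd / 2 + 3
        self.recover = highest_sent   # largest seq# sent when the drop was detected
        self.in_flight = in_flight
        self.prev_ack = last_ack
        self.active = True

    def on_dup_ack(self) -> None:
        self.cwnd += 1

    def on_ack(self, ack: int) -> str:
        """Handle an ACK that advances past prev_ack; report the action taken."""
        newly_acked = ack - self.prev_ack          # Y - X
        self.in_flight -= newly_acked
        self.prev_ack = ack
        if ack > self.recover:
            self.cwnd = self.ssthresh
            self.active = False
            return "exit"
        # Partial ACK: another packet from the same window was lost.
        self.cwnd = self.cwnd - newly_acked + 1
        return "retransmit"
```

With cwnd = 14, InFlight = 14, previous ack# 17, and Recover = 29, nine DUP ACKs raise cwnd from 10 to 19; a partial ACK for 21 then gives cwnd = 19 - (21 - 17) + 1 = 16, and an ACK beyond 29 exits with cwnd = 7.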


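Fast Recovery – single drop - NewReno

[Figure: NewReno timeline with a single dropped segment. DUP ACKs for 17 arrive, InFlight = 14 when the drop is detected, and Recover = 29; cwnd and InFlight are tracked through the recovery.]

Note how the actual number outstanding is always = 7.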


Fast Recovery – multiple drops - NewReno

[Figure: NewReno timeline with multiple dropped segments; Recover = 29. After the DUP ACKs for 17, a partial ACK for 21 arrives; the annotations read 16 = 19 - (21 - 17) + 1 and 15 = 19 - 4. Fast recovery is exited once the ACK passes Recover.]

• 2 drops take 2 RTTs to recover. N drops take N RTTs to recover.
• If N*RTT > RTO, then
– slow-steady => no TO
– impatient => TO
• NewReno sends two packets for every ACK indicating a multiple drop.


Other things

• Idle restart
– If no packet has been sent in RTO seconds
  • ssthresh = cwnd
  • cwnd = 1
  • Slow-start
– Avoids big bursts after idle times
  • E.g., get data from disk
  • HTTP 1.1

• Timeout – exponential back off
– If no ACK arrives before the RTO timer expires, then time-out
  • ssthresh = cwnd/2; cwnd = 2; slow-start
  • RTO = min(2*RTO, 64 s)
– If the next packet is dropped, then the wait is longer
– Gives up after 9-12 tries. But implementation dependent (ns never stops)

• If a retransmitted packet is dropped, TCP times out.


Dup ACKs after timeout

[Figure: NewReno timeline (Recover = 29) in which recovery eventually ends in a timeout. The sender restarts from segment 17, and the retransmissions produce a run of DUP ACKs for 42.]

Set send_high to the maximum seq# sent. If DUP ACKs are received for segments less than send_high, assume they do not indicate a drop. In case there was a drop, there will be a timeout.


Selective Acknowledgment – SACK

The latest widespread congestion control

• Problem: when multiple packets are dropped, the cumulative ACK does not give information as to which packets were dropped. As a result, fast recovery is not so fast; it takes one RTT per lost packet.

• Solution: embed into the ACK some information about which packets have successfully arrived.
• TCP-SACK allows ACKs to contain information about received packets.
• If the packets are received in order, then the ACK looks the same as in TCP-RENO or TCP-NEWRENO. But if packets arrive out of order, then the ACK contains SACK blocks.
• A SACK block indicates a sequence of segments that have been received.

[Figure: sequence-number line from 15 to 35. Segments are marked A (ACKed), S (SACKed, in two blocks), or N (not sent).]


TCP-SACK

[Figure: sequence-number line from 15 to 35 with segments marked A (ACKed), S (SACKed), or N (not sent), annotated with the highest ACK and the left and right edges of each SACK block.]

SACK blocks are 8 bytes long (4 bytes for each edge). The SACK option also includes 1 byte to specify that it is a SACK option (kind) and 1 byte for the option length, which determines the number of SACK blocks.
• 1 SACK block = 10 bytes + 2 bytes padding -> 52 bytes header
• 2 SACK blocks = 18 bytes + 2 bytes padding -> 60 bytes header
• 3 SACK blocks = 26 bytes + 2 bytes padding -> 68 bytes header
• 4 SACK blocks = 34 bytes + 2 bytes padding -> 76 bytes header
• Max ACK is 80 bytes
• If the time stamp option is used, then the max number of SACK blocks is 3.

SACK option:
kind=5, length=18 (2 bytes + 2 blocks × 8 bytes)
left edge of 2nd block = 26
right edge of 2nd block = 30
left edge of 1st block = 20
right edge of 1st block = 23
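The byte counts above can be checked with a small encoder. This is a sketch, assuming the standard option layout (kind byte 5, a length byte counting the whole option in bytes, then two 4-byte edges per block) and NOP padding appended to reach a 4-byte boundary; the function names are hypothetical.

```python
import struct

SACK_KIND = 5
NOP = b"\x01"          # TCP no-operation option, used here as padding
IP_HEADER = 20
TCP_HEADER = 20


def build_sack_option(blocks):
    """Encode up to 4 (left_edge, right_edge) pairs as a padded SACK option."""
    if not 1 <= len(blocks) <= 4:
        raise ValueError("a SACK option carries 1 to 4 blocks")
    length = 2 + 8 * len(blocks)                 # kind + length + 8 bytes per block
    option = struct.pack("!BB", SACK_KIND, length)
    for left, right in blocks:
        option += struct.pack("!II", left, right)
    padding = (-len(option)) % 4                 # pad to a multiple of 4 bytes
    return option + NOP * padding


def ack_packet_size(num_blocks):
    """Total IP + TCP header bytes of a pure ACK carrying num_blocks SACK blocks."""
    return IP_HEADER + TCP_HEADER + len(build_sack_option([(0, 1)] * num_blocks))
```

For one to four blocks, `ack_packet_size` returns 52, 60, 68 and 76 bytes, matching the list above.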


Generation of SACKs

1. No SACK blocks if no out of order packets

2. No delayed ACK if out of order packets (send an ACK for every received packet).

3. When an out of order packet arrives, the first SACK block contains the segment that just arrived.

4. The ACK should contain as many SACK blocks as fit and are required (no skimping to save bit-rate).

5. The SACK blocks included should be those that have most recently been reported (see 3). So if there are at most 3 SACK blocks, then each continuous block of segments will be reported at least 3 times.

6. If the packet that arrived has just been received (a duplicate reception), then the first SACK block should identify this packet. (This is the DSACK extension to SACK). In this case, the next SACK block should indicate the continuous sequence of segments that contain the segments received in duplicate.

[Figure: sequence-number line from 15 to 35 with segments marked A (ACKed), S (SACKed), or N (not sent), annotated with the left and right edges of each SACK block.]

Now suppose that segment 21 arrives for a second time.

SACK option:
kind=5, length=26 (2 bytes + 3 blocks × 8 bytes)
left edge of DUP packet = 21
right edge of DUP packet = 22
left edge of 1st block = 20
right edge of 1st block = 23
left edge of 2nd block = 26
right edge of 2nd block = 30
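The generation rules can be sketched as a toy receiver. It works in whole segment numbers and treats the right edge as one past the last segment in the block (so a single segment 21 is reported as 21 to 22, as in the example). The class name, the arrival counter, and the default of 3 blocks are assumptions made for illustration.

```python
class SackReceiver:
    def __init__(self, next_expected, max_blocks=3):
        self.cum_ack = next_expected   # next in-order segment expected
        self.max_blocks = max_blocks
        self.arrival = {}              # out-of-order segment -> arrival counter
        self.clock = 0

    def on_segment(self, seg):
        """Process one arriving segment; return (cumulative ACK, SACK blocks)."""
        self.clock += 1
        duplicate = seg < self.cum_ack or seg in self.arrival
        if seg >= self.cum_ack:
            self.arrival[seg] = self.clock
        # Advance the cumulative ACK over any segments now in order.
        while self.cum_ack in self.arrival:
            del self.arrival[self.cum_ack]
            self.cum_ack += 1
        blocks = self._blocks()
        if duplicate:
            # Rule 6 (DSACK): report the duplicate segment first.
            blocks.insert(0, (seg, seg + 1))
        return self.cum_ack, blocks[: self.max_blocks]

    def _blocks(self):
        """Runs of held segments, most recently updated run first (rules 3 and 5)."""
        runs = []
        for seg in sorted(self.arrival):
            if runs and runs[-1][1] == seg:
                runs[-1][1] = seg + 1
            else:
                runs.append([seg, seg + 1])
        runs.sort(key=lambda r: max(self.arrival[s] for s in range(r[0], r[1])),
                  reverse=True)
        return [tuple(r) for r in runs]
```

Feeding segments 20 to 22 and then 26 to 29 to `SackReceiver(18)` yields the blocks (26, 30) and (20, 23); a second arrival of segment 21 then yields (21, 22), (20, 23), (26, 30).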


DSACK
• DSACK is to identify packets that have been needlessly retransmitted.
• The primary source of such retransmissions is packet reordering.
• If such a retransmission occurs, it likely means that cwnd was divided by 2 needlessly.
• DSACK helps identify these needless divides by two.
• It is not clear what can be done once they are identified.
• Many ideas have been suggested, but it remains to be seen if they actually improve things
– Ethan Blanton, Mark Allman, On Making TCP More Robust to Packet Reordering (2002): show that some improvement is possible
– Bohacek et al. show that if there is persistent reordering, more drastic measures are required.
– Neither paper includes analysis of the current situation in the Internet.
• The current situation is not completely known.
• The homework provides backbone traces with rampant reordering.
• In my opinion (on 2/20/04) some sort of timer-based approach is necessary. The DUPACK threshold approach is not appropriate because a burst of packets (as can be seen in the homework) can be very reordered. But reordering by more than a few milliseconds is very rare.
• A project could examine this.


Eifel Detection

• DSACK is only useful after the arrival of the second copy of the packet.

• Eifel uses time-stamps to inform the sender that a packet that was thought to have been lost has actually arrived.


TCP-SACK (Sender side)
• Slow start and the linear increase part of SACK are the same as TCP-RENO/NEWRENO. The fast recovery part is different.
• SACK provides more information about which packets have been lost. The sender can use this to determine
– which packets to send
– when to send packets
• When to assume that a packet is lost
1. If DupThresh continuous SACK blocks have been SACKed that have larger sequence number. The idea is that DupThresh packets have been SACKed with larger sequence number, but continuous SACK blocks are used instead.
2. If DupThresh*MSS bytes have been SACKed that have larger sequence number.

[Figure: loss-detection example with MSS=5 bytes and DupThresh=3. Segments are marked ACKed, SACKed, or Not Sent along a seq num axis: packet 3 = 15:19, packet 8 = 40:44, packet 13 = 65:69, packet 18 = 78:82, packet 19 = 83:87. Packets 14 to 17 are "little packets" of 2 bytes each (70:71, 72:73, 74:75, 76:77). Three cases are marked:]
• Assumed dropped because of reasons 1 and 2: number of continuous SACK blocks with higher seq num = 4 ≥ DupThresh; number of SACKed bytes with larger seq num = 25 ≥ MSS*DupThresh.
• Assumed dropped because of reason 1 only: number of continuous SACK blocks with higher seq num = 3 ≥ DupThresh; number of SACKed bytes with larger seq num = 9 < MSS*DupThresh.
• Not assumed dropped.
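The two loss rules can be written as a short check. This is a sketch, assuming SACK blocks are given as inclusive byte ranges (so 78:82 is 5 bytes, as in the figure); the function names and the block list in the usage note are hypothetical, chosen only to resemble the figure.

```python
def loss_reasons(seq, sack_blocks, mss, dup_thresh=3):
    """Return (reason1, reason2) for the segment starting at byte `seq`.

    sack_blocks: list of (first_byte, last_byte) ranges, both inclusive.
    """
    # Only SACK blocks that lie above the segment count as evidence.
    higher = [(left, right) for left, right in sack_blocks if left > seq]
    sacked_bytes = sum(right - left + 1 for left, right in higher)
    reason1 = len(higher) >= dup_thresh          # enough separate SACK blocks
    reason2 = sacked_bytes >= dup_thresh * mss   # enough SACKed bytes
    return reason1, reason2


def is_lost(seq, sack_blocks, mss, dup_thresh=3):
    """A segment is assumed lost if either reason holds."""
    return any(loss_reasons(seq, sack_blocks, mss, dup_thresh))
```

With MSS=5 and the assumed blocks 70:71, 74:75 and 78:82, the segment at 65 is lost by reason 1 only (3 blocks but 9 bytes), while the segment at 72 has only 2 blocks and 7 bytes above it and is not assumed lost.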


Number in “pipe” or InFlight

• If a packet has been sent, not lost, and not SACKed, then this packet is assumed to be in the pipe.

• Any packet that has been retransmitted and not SACKed is also in the pipe.
– Retransmissions happen in order (smallest seq num first, why?)

– Let HighRX denote the highest segment that has been retransmitted.

– Any packet that has not been SACKed and has seq num less than HighRX has been retransmitted, so it is in the pipe.
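A sketch of the pipe count in whole segments, following the two rules above. The parameter names are hypothetical, and counting a segment at HighRX itself as retransmitted is an assumption.

```python
def pipe(high_ack, high_sent, sacked, lost, high_rx):
    """Estimate the number of segments currently in the network.

    high_ack:  highest cumulatively ACKed segment
    high_sent: highest segment sent so far
    sacked:    set of SACKed segments
    lost:      set of segments assumed lost
    high_rx:   highest retransmitted segment (HighRX)
    """
    count = 0
    for seg in range(high_ack + 1, high_sent + 1):
        if seg in sacked:
            continue                  # already received, not in the pipe
        if seg not in lost:
            count += 1                # original transmission still in flight
        if seg <= high_rx:
            count += 1                # retransmission still in flight
    return count
```

For example, with segments up to 17 ACKed, 30 the highest sent, 20 to 22 and 26 to 29 SACKed, 18 and 19 assumed lost, and HighRX=18, the pipe holds 5 segments: the retransmission of 18 plus the originals of 23, 24, 25 and 30.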


Which packet to send next? (during fast recovery)
• The next to transmit is the segment with the smallest seq num that satisfies all of

1. The segment has seq num greater than HighRX (it has not already been retransmitted)

2. The segment has seq num less than the largest segment in a SACK block

3. The segment is assumed to be lost.

• If the above is an empty set, then the next to be sent is the smallest segment that has not yet been sent.
• If the above is also empty (because there are no more packets to be sent),

[Figures: three sequence-number lines with segments marked A (ACKed), S (SACKed), and N (not sent), each annotated with HighRX, the segments already retransmitted, and the next segment to be sent. In the third, the end of the file has been reached, so there are no unsent segments.]
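The selection rule can be sketched as follows, in whole segments. The first condition is read here as "greater than HighRX" (segments at or below HighRX were already retransmitted), which is an assumption based on the figures. The case where nothing is left to send is not spelled out on the slide, so the sketch simply returns None; all names are hypothetical.

```python
def next_segment(high_rx, sacked, lost, high_sent, last_segment):
    """Pick the next segment to transmit during fast recovery.

    high_rx:      highest segment already retransmitted (HighRX)
    sacked:       set of SACKed segments
    lost:         set of segments assumed lost
    high_sent:    highest segment sent so far
    last_segment: final segment of the data (end of file)
    """
    if sacked:
        highest_sacked = max(sacked)
        candidates = [s for s in lost
                      if s > high_rx and s < highest_sacked and s not in sacked]
        if candidates:
            return min(candidates)    # smallest lost, not-yet-retransmitted segment
    if high_sent < last_segment:
        return high_sent + 1          # otherwise send new data
    return None                       # nothing left; behaviour not specified here
```

With 20 to 22 and 26 to 29 SACKed and 18 and 19 assumed lost, HighRX=18 selects 19; once HighRX=19 the sender moves on to new data (segment 31 if 30 was the highest sent).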


TCP-SACK congestion control

• When a loss is detected:
– Set RecoveryPoint = seq num of highest segment sent. Fast recovery ends when this seq num is ACKed (SACKed is not good enough).
– ssthresh = cwnd = Inflight/2
– Retransmit the lost packet with smallest seq num.
– Set HighRX equal to the retransmitted packet.

• During recovery (until RecoveryPoint is ACKed)
– If pipe<cwnd, then send next to be sent.
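The bookkeeping for entering and leaving recovery can be sketched as below. Halving Inflight to obtain ssthresh and cwnd is an assumption (it matches the usual reaction to a loss); the dataclass and function names are hypothetical, and sequence values are whole segments.

```python
from dataclasses import dataclass


@dataclass
class Recovery:
    recovery_point: int   # highest segment sent when the loss was detected
    ssthresh: int
    cwnd: int
    high_rx: int          # highest retransmitted segment (HighRX)


def on_loss_detected(inflight, highest_sent, lost):
    """Enter fast recovery; returns (state, first segment to retransmit)."""
    half = inflight // 2              # assumed: ssthresh = cwnd = Inflight/2
    first = min(lost)                 # retransmit the smallest lost segment
    return Recovery(highest_sent, half, half, first), first


def can_send(state, pipe):
    """During recovery, send the next segment only while pipe < cwnd."""
    return pipe < state.cwnd


def recovery_over(state, cumulative_ack):
    """Recovery ends only when RecoveryPoint is cumulatively ACKed."""
    return cumulative_ack >= state.recovery_point
```

With Inflight=14 and segment 30 the highest sent, losing segments 17 and 19 gives cwnd=7, retransmits 17 first, and keeps the sender in recovery until 30 is cumulatively ACKed.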


TCP-SACK notes

• After an RTO, the TCP-SACK sender starts fresh and erases SACK info from prior to the RTO (some of it might be regained in retransmissions of SACK blocks).

• Like NEWRENO, the highest seq sent before an RTO is recorded, and a dupack from a packet with seq num less than this highest seq does not cause fast recovery/retransmit.

• Like NEWRENO, the retransmit timer can be reset during recovery (slow and steady) or not (impatient).


TCP-SACK timeout
• SACK, NewReno, etc. will time-out if a retransmission is lost.
• If SACK uses the same technique to increase cwnd as NewReno (i.e., cwnd=inflight/2+3…) and more than cwnd/2 packets are lost, SACK will time-out.

• The ns implementation has this problem.

[Figure: NewReno timeline showing cwnd, Inflight, and packets sent. After the losses no more packets are sent, and the sender times out.]


TCP-SACK burst
• SACK, NewReno, etc. will time-out if a retransmission is lost.
• Multiple drops lead to a burst of packets being sent.

[Figure: SACK timeline showing cwnd, pipe, and packets sent. The sender loses ACK clocking and sends a burst of packets before recovery ends.]


Limited Transmit
• When a packet is dropped and the window size is less than 4, TCP will always time out (not enough ACKs arrive to get a triple DUP ACK).
• If, upon receiving a DUP ACK, a packet is transmitted, then there might be enough DUP ACKs to cause fast retransmit and avoid a time-out.
• Limited transmit allows a packet to be sent when the second DUP ACK is received (in general, for every other DUP ACK).
• Even if a packet is lost, sending a packet for every other ACK is sending at half the bit-rate.
• While this helps TCP avoid time-outs, it also makes this version of TCP far more aggressive for loss probability greater than about 1% (where time-outs become quite prevalent for non-limited transmit TCP)
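The effect on small windows can be illustrated by counting duplicate ACKs. This is a toy model, assuming a single drop of the oldest outstanding packet, no ACK losses, and one new packet sent on every second dup ACK (the rule stated above); the function names are hypothetical.

```python
def dup_acks_after_single_drop(window, limited_transmit, dup_thresh=3):
    """Count dup ACKs (up to dup_thresh) generated after one drop."""
    dup_acks = 0
    behind_hole = window - 1          # packets in flight behind the lost one
    while behind_hole > 0 and dup_acks < dup_thresh:
        behind_hole -= 1              # one packet arrives out of order ...
        dup_acks += 1                 # ... and triggers one dup ACK
        if limited_transmit and dup_acks % 2 == 0:
            behind_hole += 1          # new packet sent on every other dup ACK
    return dup_acks


def avoids_timeout(window, limited_transmit, dup_thresh=3):
    """True if enough dup ACKs arrive to trigger fast retransmit."""
    return dup_acks_after_single_drop(window, limited_transmit, dup_thresh) >= dup_thresh
```

In this model a window of 3 times out without limited transmit but reaches a triple dup ACK with it, a window of 4 never needs it, and a window of 2 times out either way.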

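[Figure: two timelines with cwnd=3 and Seq# in units of MSS. In the first, the repeated ACKs for 2 reach a triple dup ACK and there is no time-out. In the second, too few dup ACKs arrive and the sender times out.]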


[Figure: further Limited Transmit timelines (Seq# in units of MSS), including cwnd=4 and cwnd=5, showing the repeated ACKs for 2 and where the triple dup ACK occurs.]


ECN

• Sometimes the router will have a large enough queue to accept the packet, but the queue occupancy is beyond a threshold, so in order to try to get the TCP flows to send at a slower rate, the router would drop packets (even though there is room in the queue).

• It’s funny to drop packets when there is room in the queue, so another option is to mark the packets. The receiver should indicate in the ACK that the packet being ACKed has been marked, and the sender should react to this marking as it would to a drop, except that there is no reason to retransmit the marked packet.

• This approach has little impact in general, except that, like limited transmit, it can reduce timeouts when the loss probability is very high.
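The mark-instead-of-drop idea can be sketched as follows. The deterministic threshold, the halving of cwnd, and all names are assumptions for illustration; real routers typically decide probabilistically.

```python
def router_action(queue_len, capacity, threshold, ecn_capable):
    """Decide what a router does with an arriving packet."""
    if queue_len >= capacity:
        return "drop"                 # no room left in the queue
    if queue_len > threshold:
        # Room remains, but the queue is too full: signal congestion.
        return "mark" if ecn_capable else "drop"
    return "forward"


def sender_reaction(cwnd, signal):
    """React to a congestion signal; returns (new_cwnd, must_retransmit).

    signal is "none", "mark" (ACK reports a marked packet) or "drop".
    """
    if signal == "none":
        return cwnd, False
    # A mark is treated like a drop, except nothing needs retransmitting.
    return max(cwnd / 2, 1), signal == "drop"
```

For a queue of capacity 10 with threshold 7, a packet arriving at occupancy 8 is marked if the flow supports ECN and dropped otherwise; in both cases the sender halves cwnd, but only the drop costs a retransmission.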