congestion - university of california,...
Unit 20
Congestion Control
Acknowledgments: These slides were originally developed by Prof. Jean Walrand for EE122. The past and current EE122 instructors, including Profs. Kevin Fall, Abhay Parekh, Shyam Parekh, and Adam Wolisz, have contributed to their evolution.
Congestion Control
• The Problem
• Questions
• Approaches
• TCP: Algorithm
• TCP Refinements
• Summary
The Problem
• Flows share links: how should the links' bandwidth be shared?
Questions
• What should be the ideal sharing?
• Does it matter?
• Discovering available bandwidth
• What is fair?
Does it matter?
• Congestion occurs
  - Access link: slow link (56k, DSL, T1, wireless, …)
  - Access network: e.g., behind the DSLAM
• Can improve treatment of flows
  - E.g., one flow should not get a much smaller fraction of bandwidth
  - Some flows might need some guaranteed bandwidth
Questions: Available bandwidth?
• Example:
[Figure: example network of routers and hosts A, B, C, D, E, F. Each link is labeled with its bandwidth in Mbps (3, 6, or 10); x, y, z are the throughputs of the flows.]
Questions: Available bandwidth?
• Assume C→D with rate y and E→F with rate z
• How does A "discover" the available bandwidth to B?
• Some approaches:
  1. Reservation
  2. Adapt to congestion
  3. Test for sufficient bandwidth
  4. Pricing congestion
Available bandwidth: Reservation
1. Routers (or manager) keep track of reserved rates
2. A requests a rate R to B from the network
3. The network figures out if R is available
4. If R is available, routers (or manager) update reservations and confirm to A
5. Note: Complex, Slow, Requires enforcement, Renegotiations, Pricing
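The reservation steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a single manager holds the state for all links, and the class name, method name, and link names ("L1", "L2") are hypothetical, not taken from the slides.

```python
class ReservationManager:
    """Hypothetical central manager that tracks reserved rates per link."""

    def __init__(self, capacity):
        self.capacity = dict(capacity)                  # link name -> Mbps
        self.reserved = {link: 0.0 for link in capacity}

    def request(self, path, rate):
        """Admit a flow only if every link on its path has `rate` Mbps spare."""
        if any(self.reserved[l] + rate > self.capacity[l] for l in path):
            return False                                # R is not available
        for l in path:
            self.reserved[l] += rate                    # update reservations
        return True                                     # confirm to the requester


mgr = ReservationManager({"L1": 3.0, "L2": 3.0})
first = mgr.request(["L1"], 2.0)           # admitted: L1 has 3 Mbps free
second = mgr.request(["L1", "L2"], 2.0)    # refused: only 1 Mbps left on L1
third = mgr.request(["L1", "L2"], 1.0)     # admitted: fits on both links
```

Even this toy version shows why the slide calls the approach complex: the state must be consulted and updated along the whole path, and nothing here enforces that a flow stays within its reserved rate.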
Available bandwidth: Adapt
1. Transmit and slow down if congestion occurs
2. Example:
   • Initially: x = 0, y = 3, z = 3
   • Then A increases its rate; C and E notice congestion and slow down
   • Later, C stops: A and E increase rates
3. Notes:
   • No guarantees: throughput may drop
   • Key question: how to adapt rates
Available bandwidth: Test
1. Assume flows require at most 1 Mbps (e.g., video)
2. Routers monitor their rates to see if they have at least 1 Mbps of available bandwidth; they mark packets otherwise
3. If A wants a new flow to B, it sends test packets to B
4. If routers do not mark test packets, then A can start its new flow; otherwise, A does not start it
5. Advantages:
   1. Relatively simple
   2. Guarantee
Available bandwidth: Pricing
• When they get saturated, routers mark packets
• If a flow with rate R uses a saturated link, it gets marks with rate R (multiple saturated links result in multiple marks)
• Each mark costs one unit
• Source slows down if price becomes excessive
• x = 1+, y = 2+, z = 2+ => pA = 1 + 1; pC = pE = 2
• x = 2+, y = 1+, z = 1+ => pA = 2 + 2; pC = pE = 1
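The two price examples can be reproduced with a short Python sketch. The routing is an assumption inferred from the slide's numbers: A's flow crosses two 3 Mbps links, one shared with C's flow and one with E's. The link names "L1" and "L2" and the function name are hypothetical, and a link counts as saturated once its load reaches its capacity (the slide writes "1+" for a rate just above 1).

```python
def mark_rates(rates, paths, capacity):
    """Rate at which each flow collects marks: its rate per saturated link used."""
    load = {link: 0.0 for link in capacity}
    for flow, links in paths.items():
        for link in links:
            load[link] += rates[flow]
    saturated = {l for l in capacity if load[l] >= capacity[l]}
    return {f: rates[f] * sum(l in saturated for l in paths[f]) for f in paths}


PATHS = {"A": ["L1", "L2"], "C": ["L1"], "E": ["L2"]}   # assumed routing
CAPACITY = {"L1": 3, "L2": 3}                            # Mbps

case1 = mark_rates({"A": 1, "C": 2, "E": 2}, PATHS, CAPACITY)
case2 = mark_rates({"A": 2, "C": 1, "E": 1}, PATHS, CAPACITY)
print(case1)   # A pays 1 + 1, C and E pay 2 each
print(case2)   # A pays 2 + 2, C and E pay 1 each
```

Because A's flow is marked on both saturated links, its price grows twice as fast as its rate, which gives it the stronger incentive to slow down.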
Questions: What is Fair?
• Example:
  - x = y = z = 1.5: fair in max-min sense
  - x = 0, y = z = 3: maximizes x + y + z
  - 5x = 4y = 4z: equalizes resources flows use with x = 1.33, y = z = 1.67
• What if A→B needs 2 Mbps (and is willing to pay for it)?
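The max-min allocation can be computed by "progressive filling": raise all rates together and freeze a flow as soon as one of its links fills up. The Python sketch below assumes, as inferred from the slide's numbers, that A's flow shares one 3 Mbps link with C's flow and another with E's; the link names and the function name are hypothetical.

```python
def max_min_fair(paths, capacity):
    """Progressive filling: grow all active flows equally until a link saturates."""
    spare = dict(capacity)
    rate = {flow: 0.0 for flow in paths}
    active = set(paths)
    while active:
        users = {l: [f for f in active if l in paths[f]] for l in spare}
        # Largest equal increment that no link used by an active flow exceeds.
        inc = min(spare[l] / len(u) for l, u in users.items() if u)
        for l, u in users.items():
            spare[l] -= inc * len(u)
        for f in active:
            rate[f] += inc
        # Flows that cross a saturated link are frozen at their current rate.
        active = {f for f in active if all(spare[l] > 1e-9 for l in paths[f])}
    return rate


PATHS = {"A": ["L1", "L2"], "C": ["L1"], "E": ["L2"]}   # assumed routing
fair = max_min_fair(PATHS, {"L1": 3.0, "L2": 3.0})
print(fair)   # x = y = z = 1.5
```

If the second link had 6 Mbps instead, the same procedure would leave A and C at 1.5 and let E continue up to 4.5, since E is then the only flow that is not yet bottlenecked.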
Congestion Control: Approaches
• Telephone Network: Reservation
• Transmission Control Protocol (TCP)
  - Adapt rate to congestion
  - Algorithm for adaptation attempts to be fair …
• User Datagram Protocol (UDP)
  - Transmit and hope for the best
• Various proposals for Internet:
  - Reservation
  - Pricing
  - Test
  - Note: Either by hosts or between domains
Congestion Control: TCP Algorithm
• Principles
• Example
• Multiple Sources
• A Bad Algorithm: AIAD
• AIMD: Additive Increase - Multiplicative Decrease
• Why AIAD Fails
TCP Algorithm: Principles
• We focus on the "standard" TCP (Reno)
• Idea:
  - Not congested => increase rate
  - Congested => slow down
• Questions:
  - How to detect congestion? Missing ACKs
  - How to increase/slow down? AIMD
TCP Algorithm: Example
• No congestion => x increases by one packet/RTT every RTT
• Congestion => decrease x by factor 2
[Figure: source A sends to B at rate x over a link of capacity C = 50 pkts/RTT. The plot shows the rate (pkts/RTT) and the backlog in the router (pkts); the link is congested if the backlog is > 20.]
TCP Algorithm: Multiple Sources
• No congestion => rate increases by one packet/RTT every RTT
• Congestion => decrease rate by factor 2
[Figure: flows A→B (rate x) and D→E (rate y) share a link of capacity C = 50 pkts/RTT; plot of the two rates.]
Rates equalize => fair share
TCP Algorithm: Bad Algorithm
• No congestion => x increases by one packet/RTT every RTT
• Congestion => decrease x by 1
[Figure: flows A→B (rate x) and D→E (rate y) share a link of capacity C = 50 pkts/RTT; plot of the two rates.]
TCP Algorithm: AIMD
[Figure: plot of rate y against rate x for flows A→B (x) and D→E (y) sharing a link of capacity C.]
Limit rates: x = y
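A small simulation illustrates why AIMD drives the two rates together. The Python sketch below is a simplification, not the slides' exact model: it assumes both sources detect congestion in the same RTT, whenever x + y exceeds C, and it ignores the router backlog. The function name and the starting rates are hypothetical.

```python
def aimd(x, y, capacity=50.0, rtts=500):
    """Synchronous AIMD for two sources sharing one link."""
    for _ in range(rtts):
        if x + y > capacity:
            x, y = x / 2, y / 2        # multiplicative decrease
        else:
            x, y = x + 1, y + 1        # additive increase
    return x, y


x, y = aimd(40.0, 2.0)                 # start far from the fair share
gap = abs(x - y)
print(gap)                             # tiny compared with the initial gap of 38
```

Additive increase leaves the gap x - y unchanged, while every multiplicative decrease halves it, so the gap shrinks geometrically with the number of congestion events.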
TCP Algorithm: Why AIAD Fails
[Figure: plot of rate y against rate x for flows A→B (x) and D→E (y) sharing a link of capacity C.]
Limit rates: x and y depend on initial values
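The dependence on initial values is easy to see in a simulation. This Python sketch uses a simplified model: both sources react in the same RTT whenever x + y exceeds C, and rates are kept non-negative. The function name and the starting rates are hypothetical.

```python
def aiad(x, y, capacity=50.0, rtts=500):
    """Synchronous additive increase, additive decrease for two sources."""
    for _ in range(rtts):
        if x + y > capacity:
            x, y = max(x - 1, 0.0), max(y - 1, 0.0)   # additive decrease
        else:
            x, y = x + 1, y + 1                       # additive increase
    return x, y


x, y = aiad(40.0, 2.0)
print(x - y)      # still 38.0: the initial gap never shrinks
```

Both the increase and the decrease move x and y by the same amount, so the difference x - y is preserved forever; whichever source started ahead stays ahead.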
Congestion Control: TCP Refinements
• TCP Phases
• Slow Start and Congestion Avoidance
• Fast Retransmit
• Fast Recovery: 1st Look
• Fast Recovery: 2nd Look
• Window Updates
• Flow Control
• Summary
TCP Phases
• Slow Start
  - While (W ≤ Slow Start Threshold): W = W + 1 for each new ACK
  - Recovery Mechanism: Timeout
• Congestion Avoidance
  - While (W > Slow Start Threshold): W = W + 1/W for each new ACK
  - Recovery Mechanism: Fast Retransmit/Recovery and Timeout
We next examine the details of the above phases.
Refinements: Slow Start & Congestion Avoidance
• Objective: Discover available BW quickly
• Solution: Exponential increase of window (Slow Start)
• Probing for more BW: Additive increase (Congestion Avoidance)
[Figure: window W over time. Exponential growth (Slow Start) from 1 up to the threshold (64KB initially), then additive growth with slope 1/RTT (Congestion Avoidance). A timeout at W = n is followed by a new exponential phase from 1 up to n/2.]
Refinements: Fast Retransmit
• Cumulative ACKs: ACK # = next expected #
• 3rd duplicated ACK => likely packet loss => retransmit n+1
[Figure: timeline in which packet n+1 is lost. Packets n+2, n+3, and n+4 each trigger a duplicate ACK n+1, and the sender retransmits n+1 at the 3rd duplicate ACK, before the timeout.]
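The duplicate-ACK rule can be sketched as a small sender-side counter. The Python below is an illustration only: the function name is hypothetical, packets are numbered rather than byte-counted, and n = 10 is chosen arbitrarily for the example.

```python
def fast_retransmits(acks, dup_threshold=3):
    """Return the packet numbers resent on the 3rd duplicate of a cumulative ACK."""
    resent = []
    last_ack, dups = None, 0
    for ack in acks:
        if ack == last_ack:
            dups += 1
            if dups == dup_threshold:
                resent.append(ack)     # ACK # is the next expected packet
        else:
            last_ack, dups = ack, 0    # new ACK: restart the duplicate count
    return resent


# n = 10: packet 11 is lost, packets 12, 13, 14 each trigger another ACK 11.
resent = fast_retransmits([11, 11, 11, 11, 15])
print(resent)   # [11]
```

The first ACK 11 acknowledges packet 10 and is not a duplicate; the three that follow are, so the retransmission fires on the fourth ACK 11 overall.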
Refinements: Fast Recovery (1)
• Timeout => Reset Window = 1 unit (MSS)
• 3rd Dup ACK => Window/2
[Figure: window over time, growing with slope 1 MSS/RTT. A 3rd Dup ACK (moderate congestion: subsequent pkts arrived) takes the window from n to n/2; a timeout (severe congestion) resets it to 1.]
Refinements: Fast Recovery (2)
• Window adjustment is tricky: want W → W/2
• W = W for the first 2 DA (Dup ACK)
• At 3rd DA: ssthresh = W/2, W = ssthresh + 3
• W = W + 1 at each DA after the 3rd DA
• After the remaining W - 4 DAs, the window is (W/2 + 3) + (W - 4) = W + W/2 - 1
• This allows W/2 - 1 new outstanding packets: n+W+1, …, n+3W/2-1
• When the new ACK arrives: W = ssthresh
[Figure: timeline starting from the lost packet n + 1, with the window covering n + 1 through n + W. The 3rd DA sets the window to W/2 + 3; the following W - 4 ACKs open it far enough to send from n + W + 1.]
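The window arithmetic is easier to follow with numbers. The Python sketch below applies the slide's rules to a window counted in packets; the function name is hypothetical and W = 16 is chosen only for the example.

```python
def window_during_recovery(w, dup_acks):
    """Window after `dup_acks` duplicate ACKs; returns (window, ssthresh)."""
    if dup_acks < 3:
        return w, None                  # first 2 DAs: W unchanged
    ssthresh = w / 2                    # set at the 3rd DA
    inflated = ssthresh + 3             # W = ssthresh + 3
    inflated += dup_acks - 3            # W = W + 1 per DA after the 3rd
    return inflated, ssthresh


W = 16
# Packets n+2 .. n+W arrive and generate W - 1 duplicate ACKs in total.
window, ssthresh = window_during_recovery(W, W - 1)
new_packets = window - W                # packets beyond n+W the sender may send
print(window, ssthresh, new_packets)    # 23.0 8.0 7.0
```

With W = 16 the inflated window is W + W/2 - 1 = 23, which admits W/2 - 1 = 7 new packets while the loss is repaired. Once the new ACK arrives the window deflates to ssthresh = 8, which is the intended W/2.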
Refinements: Window Updates
• Exponential: W = W + 1 at each ACK:
  W = 1, W = 2, W = 4, W = 8 (one value per RTT)
• Additive: W = W + 1/W at each ACK:
  W = 8
  W = 8 + 1/8 = 8.125
  W = 8.125 + 1/8.125 ≈ 8 + 2/8
  W ≈ 8 + 8/8 = 9 (after 8 ACKs)
  W ≈ 9 + 9/9 = 10 (after 9 more ACKs)
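Both update rules can be checked numerically. The Python sketch below assumes one ACK per packet in the window during each RTT and uses the slow start threshold test from the TCP Phases slide; the function names are hypothetical.

```python
def on_ack(w, ssthresh):
    """Per-ACK window update: slow start up to ssthresh, additive beyond it."""
    return w + 1 if w <= ssthresh else w + 1 / w


def after_one_rtt(w, ssthresh):
    """Apply one ACK for each packet that was in the window at the RTT's start."""
    for _ in range(int(w)):
        w = on_ack(w, ssthresh)
    return w


# Exponential phase: the window doubles every RTT.
w = 1.0
doubling = []
for _ in range(3):
    w = after_one_rtt(w, ssthresh=64)
    doubling.append(w)
print(doubling)                              # [2.0, 4.0, 8.0]

# Additive phase: 8 ACKs of +1/W add up to a little less than 1.
additive = after_one_rtt(8.0, ssthresh=4)
print(additive)                              # slightly below 9
```

The additive result falls slightly short of 9 because each increment 1/W is computed with a W that has already grown; the slide's "≈ 9" rounds this away.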
Congestion Example (Source: TCP/IP Illustrated - I, W. Stevens)
Congestion Example (Cont'd) (Source: TCP/IP Illustrated - I, W. Stevens)
Refinements: Flow Control
• Objective: Avoid saturating destination
• Algorithm: Receiver advertises window RAW, carried in its ACKs: [ACK | RAW | …]
  - actual window = min {RAW, W}
  - actual window open = actual window - OUT
  where
  - OUT = Outstanding = Last sent - last ACKed
  - W = Cong. Window from AIMD + refinements
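The window computation can be written out directly. In the Python sketch below the function and parameter names are hypothetical, quantities are counted in packets, and clamping the result at zero is an added assumption for the case where the receiver shrinks its window below what is already in flight.

```python
def open_window(raw, cwnd, last_sent, last_acked):
    """How many more packets the sender may transmit right now."""
    outstanding = last_sent - last_acked          # OUT
    actual_window = min(raw, cwnd)                # min {RAW, W}
    return max(0, actual_window - outstanding)    # clamp is an assumption


# Receiver allows 10, congestion window allows 16, 7 packets are in flight.
can_send = open_window(raw=10, cwnd=16, last_sent=107, last_acked=100)
print(can_send)   # 3
```

Here the receiver is the tighter constraint; with raw=20 the congestion window would be the limit instead and 9 more packets could go out.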
Refinements: Summary
• Actual window = min {RAW, W}
[Figure: window W over time, from 1 with 64KB marked, through alternating Slow Start (SS) and Congestion Avoidance (CA) phases. Each 3rd duplicate ACK (3DA) and each timeout (TO) is annotated "x 0.5".]
Congestion Control: Summary
• Slow Start: Discover available bandwidth
• Congestion Avoidance: AIMD
  - Tries to be fair
• Refinements:
  - Fast Retransmit: 3rd DA
  - Fast Recovery: Reset W to W/2 (instead of W = 1) [More precisely: ssthresh = W/2, W = ssthresh + 3, W = W + 1 per DA after 3rd DA, W = ssthresh when get new ACK]
  - TO: set ssthresh = W/2, W = 1, SS until W = ssthresh, then CA
• Timers:
  - Timeout = Average + 4 Deviations
  - If time out => Timeout x 2 (up to a maximum); reset after new ACK
• Flow Control:
  - Actual window = min {RAW, W}
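The timer rules in the summary can be sketched as a small estimator. The slides give only "Average + 4 Deviations" and the doubling rule; the smoothing gains (1/8 for the average, 1/4 for the deviation), the initial deviation of half the first sample, and the 64-second cap used in this Python sketch are assumptions, chosen because they are commonly used values. The class and method names are hypothetical.

```python
class RtoEstimator:
    """Retransmission timeout = smoothed RTT + 4 * smoothed deviation."""

    def __init__(self, first_rtt, max_rto=64.0):
        self.avg = first_rtt
        self.dev = first_rtt / 2       # assumed initial deviation
        self.max_rto = max_rto         # assumed cap, in seconds
        self.backoff = 1

    def on_new_ack(self, rtt_sample):
        err = rtt_sample - self.avg
        self.avg += err / 8                        # assumed gain 1/8
        self.dev += (abs(err) - self.dev) / 4      # assumed gain 1/4
        self.backoff = 1                           # reset after new ACK

    def on_timeout(self):
        self.backoff *= 2                          # Timeout x 2

    @property
    def rto(self):
        return min((self.avg + 4 * self.dev) * self.backoff, self.max_rto)


est = RtoEstimator(first_rtt=1.0)
initial = est.rto            # 1.0 + 4 * 0.5 = 3.0
est.on_timeout()
doubled = est.rto            # 6.0
est.on_new_ack(1.0)
settled = est.rto            # deviation decays to 0.375, so 1.0 + 1.5 = 2.5
```

Weighting the deviation by 4 keeps the timeout well above the average when RTTs fluctuate, while a steady RTT lets the deviation term decay and the timeout tighten.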
TCP Timers
• Retransmission Timer: We have discussed this earlier
• Persist Timer: Uses "window probes" to avoid deadlock after the receiver closes its receive window
• Keepalive Timer: Server sends a probe packet after the connection has been idle for a while (it is controversial and is not part of the TCP specification)
• 2MSL Timer: We have discussed this earlier
Special TCP Considerations
• Silly Window Syndrome (SWS): Occurs when small amounts of data are exchanged instead of full-sized segments
• To avoid SWS:
  - Receiver should not advertise small windows
  - Sender should avoid sending small segments (see the textbook and Stevens for details)
  - Nagle's algorithm allows a sender to have only one outstanding small segment
• Karn's algorithm: Do not update the RTT estimators when the acknowledgement for the retransmitted data arrives