Multi-Gbps TCP 9:00-10:00 Harvey Newman (Physics, Caltech) High speed networks & grids 10:00-10:45 Sylvain Ravot (Physics, Caltech/CERN) LHC networks and TCP Grid 10:45-11:00 Discussion 11:00-11:15 Break 11:15-12:00 Steven Low (CS/EE, Caltech) TCP/AQM protocols and duality model 12:00-12:15 Dohy Hong (INRIA/Caltech) Synchronization effect of TCP 12:15- 1:00 Lunch

Post on 21-Dec-2015


Page 1:

Multi-Gbps TCP

9:00-10:00 Harvey Newman (Physics, Caltech)

High speed networks & grids

10:00-10:45 Sylvain Ravot (Physics, Caltech/CERN)

LHC networks and TCP Grid

10:45-11:00 Discussion

11:00-11:15 Break

11:15-12:00 Steven Low (CS/EE, Caltech)

TCP/AQM protocols and duality model

12:00-12:15 Dohy Hong (INRIA/Caltech)

Synchronization effect of TCP

12:15-1:00 Lunch

Page 2:

Multi-Gbps TCP

1:00-2:00 Fernando Paganini (EE, UCLA)

Control theory and stability of TCP/AQM

2:00-2:30 Steven Low (CS/EE, Caltech)

Stabilized Vegas (& instability of Reno/RED)

2:30-3:00 Zhikui Wang (EE, UCLA)

FAST simulations

3:00-3:15 Break

3:15-4:00 David Wei, Cheng Hu (CS, Caltech)

Some related projects and TCP kernel

4:00-4:15 Break

4:15-5:00 Discussion

Page 3:

Duality Model of TCP/AQM

Steven Low

CS & EE, Caltech
netlab.caltech.edu

2002

Page 4:

Acknowledgment

S. Athuraliya, D. Lapsley, V. Li, Q. Yin (UMelb)

S. Adlakha (UCLA), D. Choe (Postech/Caltech), J. Doyle (Caltech), K. Kim (SNU/Caltech), L. Li (Caltech), F. Paganini (UCLA), J. Wang (Caltech)

L. Peterson, L. Wang (Princeton)

Page 5:

Part I: Protocols

Page 6:

Window Flow Control

~ W packets per RTT
Lost packet detected by missing ACK

[Figure: Source and Destination timelines; packets 1, 2, ..., W are sent as data and returned as ACKs, one window per RTT]

Page 7:

Source Rate

Limit the number of packets in the network to window W

Source rate = $\frac{W \times MSS}{RTT}$ bps

If W too small then rate « capacity
If W too big then rate > capacity => congestion
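The window-to-rate relation above can be sketched numerically. This is a minimal illustration, not part of the original slides; the function names and the example numbers (1500-byte segments, 100 ms RTT) are assumptions chosen only for illustration, and MSS is taken in bits so that the result is in bps.

```python
def source_rate_bps(window_pkts: float, mss_bits: float, rtt_s: float) -> float:
    """Window-limited sending rate W * MSS / RTT, in bits per second."""
    if rtt_s <= 0:
        raise ValueError("RTT must be positive")
    return window_pkts * mss_bits / rtt_s


def window_for_rate(rate_bps: float, mss_bits: float, rtt_s: float) -> float:
    """Window (in packets) needed to sustain a target rate: W = rate * RTT / MSS."""
    return rate_bps * rtt_s / mss_bits


# Hypothetical numbers: 1500-byte segments and a 100 ms round trip.
MSS_BITS = 1500 * 8
rate = source_rate_bps(window_pkts=64, mss_bits=MSS_BITS, rtt_s=0.1)
needed = window_for_rate(rate_bps=1e9, mss_bits=MSS_BITS, rtt_s=0.1)
print(f"W=64 gives {rate / 1e6:.2f} Mbps; 1 Gbps needs about {needed:.0f} packets in flight")
```

With these assumed numbers a 64-packet window is far below what a 1 Gbps path needs, which is the "W too small" case on the slide.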

Page 8:

TCP Window Flow Controls

Receiver flow control
  Avoid overloading receiver
  Set by receiver
  awnd: receiver (advertised) window

Network flow control
  Avoid overloading network
  Set by sender
  Infer available network capacity
  cwnd: congestion window

Set W = min (cwnd, awnd)

Page 9:

Receiver Flow Control

Receiver advertises awnd with each ACK

Window awnd
  closed when data is received and ack’d
  opened when data is read

Size of awnd can be the performance limit (e.g. on a LAN)
  sensible default ~16kB

Page 10:

Network Flow Control

Source calculates cwnd from indication of network congestion

Congestion indications
  Losses
  Delay
  Marks

Algorithms to calculate cwnd
  Tahoe, Reno, Vegas, RED, REM …

Page 11:

TCP Congestion Controls

Tahoe (Jacobson 1988)
  Slow Start
  Congestion Avoidance
  Fast Retransmit

Reno (Jacobson 1990)
  Fast Recovery

Vegas (Brakmo & Peterson 1994)
  New Congestion Avoidance

RED (Floyd & Jacobson 1993)
  Probabilistic marking

BLUE (Feng, Kandlur, Saha, Shin 1999)

REM/PI (Athuraliya et al 2000/Hollot et al 2001)
  Clear buffer, match rate

AVQ (Kunniyur & Srikant 2001)

Page 12:

TCP/AQM

Tahoe (Jacobson 1988)
  Slow Start
  Congestion Avoidance
  Fast Retransmit

Reno (Jacobson 1990)
  Fast Recovery

Vegas (Brakmo & Peterson 1994)
  New Congestion Avoidance

BLUE
RED (Floyd & Jacobson 1993)
REM/PI (Athuraliya et al 2000, Hollot et al 2001)
AVQ (Kunniyur & Srikant 2001)

Page 13:

TCP Tahoe (Jacobson 1988)

[Figure: window versus time, with SS and CA phases]

SS: Slow Start
CA: Congestion Avoidance

Page 14:

Slow Start

Start with cwnd = 1 (slow start)
On each successful ACK increment cwnd
  cwnd ← cwnd + 1
Exponential growth of cwnd
  each RTT: cwnd ← 2 x cwnd
Enter CA when cwnd >= ssthresh

Page 15:

Slow Start

[Figure: sender and receiver timelines with data packets and ACKs; cwnd grows 1, 2, 4, 8 over successive RTTs]

cwnd ← cwnd + 1 (for each ACK)

Page 16:

Congestion Avoidance

Starts when cwnd >= ssthresh
On each successful ACK:
  cwnd ← cwnd + 1/cwnd
Linear growth of cwnd
  each RTT: cwnd ← cwnd + 1

Page 17:

Congestion Avoidance

[Figure: sender and receiver timelines with data packets and ACKs; cwnd grows 1, 2, 3, 4 over successive RTTs]

cwnd ← cwnd + 1 (for each cwnd ACKs)

Page 18:

Packet Loss

Assumption: loss indicates congestion
Packet loss detected by
  Retransmission TimeOuts (RTO timer)
  Duplicate ACKs (at least 3)

[Figure: packets 1 2 3 4 5 6 7 are sent; acknowledgements 1 2 3 3 3 3 are returned]

Page 19:

Fast Retransmit

Waiting for a timeout is quite long
Immediately retransmits after 3 dupACKs without waiting for timeout
Adjusts ssthresh
  flightsize = min(awnd, cwnd)
  ssthresh ← max(flightsize/2, 2)
Enter Slow Start (cwnd = 1)
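The dupACK rule and the ssthresh adjustment above can be sketched as follows. This is an illustrative sketch, not the code of any real TCP stack; the function names are hypothetical, and the ACK stream is the one shown on the Packet Loss slide (1 2 3 3 3 3).

```python
def fast_retransmit_trigger(acks, dup_threshold=3):
    """Return the ACK number whose repetition triggers fast retransmit, or None.

    `acks` is the sequence of cumulative ACK numbers seen by the sender.
    The trigger fires once the same ACK number has been repeated
    `dup_threshold` times after its first arrival.
    """
    last, dups = None, 0
    for ack in acks:
        if ack == last:
            dups += 1
            if dups == dup_threshold:
                return ack
        else:
            last, dups = ack, 0
    return None


def ssthresh_after_loss(awnd, cwnd):
    """ssthresh <- max(flightsize / 2, 2), with flightsize = min(awnd, cwnd)."""
    flightsize = min(awnd, cwnd)
    return max(flightsize / 2, 2)


# ACK stream from the Packet Loss slide: the third duplicate of ACK 3 fires.
print(fast_retransmit_trigger([1, 2, 3, 3, 3, 3]))
print(ssthresh_after_loss(awnd=64, cwnd=8))
```

In the slide's numbering the repeated ACK is 3, so the segment to retransmit is the one after it.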

Page 20:

Successive Timeouts

When there is a timeout, double the RTO
Keep doing so for each lost retransmission
  Exponential back-off
  Max 64 seconds (1)
  Max 12 retransmits (1)

(1) Net/3 BSD
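A small sketch of the back-off schedule described above, using the two limits quoted on the slide (64 seconds, 12 retransmits). The 1-second initial RTO and the function name are assumptions for illustration only.

```python
def rto_schedule(initial_rto_s, max_rto_s=64.0, max_retransmits=12):
    """Timeout waited before each successive retransmission.

    The RTO doubles after every lost retransmission and is capped at
    max_rto_s; the sender gives up after max_retransmits attempts.
    """
    schedule = []
    rto = initial_rto_s
    for _ in range(max_retransmits):
        schedule.append(min(rto, max_rto_s))
        rto *= 2
    return schedule


# Assumed initial RTO of 1 second.
waits = rto_schedule(1.0)
print(waits)
print("total time before giving up:", sum(waits), "seconds")
```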

Page 21:

Summary: Tahoe

Basic ideas
  Gently probe network for spare capacity
  Drastically reduce rate on congestion
  Windowing: self-clocking
  Other functions: round trip time estimation, error recovery

for every ACK {
  if (W < ssthresh) then W++   (SS)
  else W += 1/W                (CA)
}
for every loss {
  ssthresh = W/2
  W = 1
}
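The Tahoe pseudocode above can be turned into a small runnable sketch. It is a simplification under stated assumptions: one ACK arrives per packet in the window each RTT, a loss is injected at chosen RTT indices, and the floor of 2 on ssthresh is borrowed from the Fast Retransmit slide. The function names are hypothetical.

```python
def tahoe_on_ack(cwnd, ssthresh):
    """Window after one successful ACK."""
    if cwnd < ssthresh:
        return cwnd + 1.0        # slow start: cwnd doubles every RTT
    return cwnd + 1.0 / cwnd     # congestion avoidance: about +1 per RTT


def tahoe_on_loss(cwnd):
    """Return (cwnd, ssthresh) after a loss: halve the threshold, restart at 1."""
    return 1.0, max(cwnd / 2.0, 2.0)


def simulate(n_rtts, ssthresh=16.0, loss_rtts=()):
    """Return the window at the start of each RTT."""
    cwnd, trace = 1.0, []
    for rtt in range(n_rtts):
        trace.append(cwnd)
        if rtt in loss_rtts:
            cwnd, ssthresh = tahoe_on_loss(cwnd)
            continue
        for _ in range(int(cwnd)):   # one ACK per packet sent this RTT
            cwnd = tahoe_on_ack(cwnd, ssthresh)
    return trace


print([round(w, 2) for w in simulate(8, loss_rtts={5})])
```

The trace shows the exponential phase up to ssthresh, roughly linear growth after it, and the restart from a window of 1 after the injected loss.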

Page 22:

TCP Reno (Jacobson 1990)

[Figure: window trace with SS and CA phases]

Fast retransmission/fast recovery

Page 23:

Fast recovery

Motivation: prevent `pipe’ from emptying after fast retransmit

Idea: each dupACK represents a packet having left the pipe (successfully received)

Enter FR/FR after 3 dupACKs
  Set ssthresh ← max(flightsize/2, 2)
  Retransmit lost packet
  Set cwnd ← ssthresh + ndup (window inflation)
  Wait till W = min(awnd, cwnd) is large enough; transmit new packet(s)
  On non-dup ACK (1 RTT later), set cwnd ← ssthresh (window deflation)

Enter CA
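The inflation and deflation steps above can be sketched with three small functions. This is an illustrative sketch only; the function names are hypothetical and the starting window of 8 is an assumed example.

```python
def enter_fast_recovery(cwnd, awnd, ndup=3):
    """Return (cwnd, ssthresh) on entering FR/FR after ndup duplicate ACKs."""
    flightsize = min(awnd, cwnd)
    ssthresh = max(flightsize / 2, 2)
    return ssthresh + ndup, ssthresh      # window inflation


def inflated_cwnd(ssthresh, ndup):
    """cwnd while further dupACKs arrive: ssthresh + ndup."""
    return ssthresh + ndup


def exit_fast_recovery(ssthresh):
    """On the first non-duplicate ACK, deflate the window to ssthresh."""
    return ssthresh


# Assumed example: cwnd = 8 when the third dupACK arrives, large awnd.
cwnd, ssthresh = enter_fast_recovery(cwnd=8, awnd=64)
print("after 3 dupACKs:", cwnd, ssthresh)
print("after 7 dupACKs:", inflated_cwnd(ssthresh, 7))
print("after recovery :", exit_fast_recovery(ssthresh))
```

Each extra dupACK signals one packet that has left the pipe, so the inflated window lets the sender put one new packet in; deflation then resumes congestion avoidance at half the old window rather than at 1.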

Page 24:

Example: FR/FR

Fast retransmit
  Retransmit on 3 dupACKs

Fast recovery
  Inflate window while repairing loss to fill pipe

[Figure: sender (S) and receiver (R) timelines for packets 1 to 11, annotated with cwnd and ssthresh values through the loss and the point "Exit FR/FR"]

Page 25:

Summary: Reno

Basic ideas
  Fast recovery avoids slow start
  dupACKs: fast retransmit + fast recovery
  Timeout: fast retransmit + slow start

[Figure: state diagram with states slow start, congestion avoidance, FR/FR, and retransmit; transitions labeled dupACKs and timeout]

Page 26:

Active queue management

Idea: provide congestion information by probabilistically marking packets

Issues
  How to measure congestion (p and G)?
  How to embed congestion measure?
  How to feed back congestion info?

x(t+1) = F( p(t), x(t) )    Reno, Vegas
p(t+1) = G( p(t), x(t) )    DropTail, RED, REM

Page 27:

RED (Floyd & Jacobson 1993)

Congestion measure: average queue length
  $p_l(t+1) = [p_l(t) + x_l(t) - c_l]^+$

Embedding: p-linear (piecewise-linear) probability function

Feedback: dropping or ECN marking

[Figure: marking probability (up to 1) versus average queue]
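A minimal sketch of RED's piecewise-linear marking function and its averaged queue. The thresholds `min_th` and `max_th`, the ceiling `max_p`, and the averaging weight are the usual RED parameters, but the specific values here are assumptions for illustration and do not come from the slides.

```python
def ewma_queue(avg, sample, weight=0.002):
    """Exponentially weighted moving average of the instantaneous queue."""
    return (1.0 - weight) * avg + weight * sample


def red_mark_prob(avg_queue, min_th=5.0, max_th=15.0, max_p=0.1):
    """Piecewise-linear marking probability as a function of average queue.

    0 below min_th, rising linearly to max_p at max_th, and 1 at or above max_th.
    """
    if avg_queue < min_th:
        return 0.0
    if avg_queue >= max_th:
        return 1.0
    return max_p * (avg_queue - min_th) / (max_th - min_th)


for q in (0, 5, 10, 14.9, 20):
    print(q, red_mark_prob(q))
```

Because the probability is a function of queue length, RED can only signal more congestion by letting the average queue grow, which is the coupling the later "Congestion & performance" slide contrasts with REM.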

Page 28:

REM (Athuraliya & Low 2000)

Congestion measure: price
  $p_l(t+1) = [p_l(t) + \gamma(\alpha_l b_l(t) + x_l(t) - c_l)]^+$

Embedding:

Feedback: dropping or ECN marking

Page 29:

REM (Athuraliya & Low 2000)

Congestion measure: price
  $p_l(t+1) = [p_l(t) + \gamma(\alpha_l b_l(t) + x_l(t) - c_l)]^+$

Embedding: exponential probability function

Feedback: dropping or ECN marking

[Figure: link marking probability (0 to 1) versus link congestion measure (0 to 20)]
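A sketch of the REM price update and its exponential marking function. The slide only says "exponential probability function"; the form 1 - phi^(-p) used below is the one given in the REM paper, and the values of `gamma`, `alpha`, and `phi` are assumptions for illustration.

```python
def rem_price_update(price, backlog, rate, capacity, gamma=0.001, alpha=0.1):
    """p(t+1) = [p(t) + gamma * (alpha * b(t) + x(t) - c)]^+ for one link."""
    return max(0.0, price + gamma * (alpha * backlog + rate - capacity))


def rem_mark_prob(price, phi=1.15):
    """Exponential marking probability 1 - phi^(-price), with phi > 1."""
    return 1.0 - phi ** (-price)


def end_to_end_mark_prob(prices, phi=1.15):
    """Probability that a packet is marked at least once along its path."""
    unmarked = 1.0
    for p in prices:
        unmarked *= 1.0 - rem_mark_prob(p, phi)
    return 1.0 - unmarked


path_prices = [0.5, 1.2, 2.0]           # hypothetical link prices on one path
print(end_to_end_mark_prob(path_prices))
print(rem_mark_prob(sum(path_prices)))  # same value: marks encode the price sum
```

The two printed values agree up to rounding, which is why a source can estimate the sum of link prices along its path from the fraction of marked packets.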

Page 30:

Key features

Clear buffer and match rate
  $p_l(t+1) = [p_l(t) + \gamma(\alpha_l b_l(t) + \hat{x}_l(t) - c_l)]^+$
  (clear buffer: the $\alpha_l b_l(t)$ term; match rate: the $\hat{x}_l(t) - c_l$ term)

Sum prices
  $1 - \phi^{-p_l(t)} \;\Rightarrow\; 1 - \phi^{-p^s(t)}$

Theorem (Paganini 2000)
  Global asymptotic stability for general utility function (in the absence of delay)

Page 31:

Active Queue Management

            $p_l(t)$   $G(p(t), x(t))$
DropTail    loss       $[1 - c_l/x_l(t)]^+$ (?)
RED         queue      $[p_l(t) + x_l(t) - c_l]^+$
Vegas       delay      $[p_l(t) + x_l(t)/c_l - 1]^+$
REM         price      $[p_l(t) + \gamma(\alpha_l b_l(t) + x_l(t) - c_l)]^+$

x(t+1) = F( p(t), x(t) )    Reno, Vegas
p(t+1) = G( p(t), x(t) )    DropTail, RED, REM

Page 32:

Congestion & performance

            $p_l(t)$   $G(p(t), x(t))$
Reno        loss       $[1 - c_l/x_l(t)]^+$ (?)
Reno/RED    queue      $[p_l(t) + x_l(t) - c_l]^+$
Reno/REM    price      $[p_l(t) + \gamma(\alpha_l b_l(t) + x_l(t) - c_l)]^+$
Vegas       delay      $[p_l(t) + x_l(t)/c_l - 1]^+$

Decouple congestion & performance measure
  RED: `congestion’ = `bad performance’
  REM: `congestion’ = `demand exceeds supply’
       But performance remains good!

Page 33:

Part II: Duality Model

Page 34:

Congestion Control

Heavy tail Mice-elephants

Elephant

Internet

Mice

Congestion control

efficient & fair sharing

small delay

queueing + propagation

CDN

Page 35:

TCP

xi(t)

Page 36:

TCP & AQM

$x_i(t)$

$p_l(t)$

Example congestion measure $p_l(t)$
  Loss (Reno)
  Queueing delay (Vegas)

Page 37:

Outline

Protocol (Reno, Vegas, RED, REM/PI…)
  $x(t+1) = F(p(t), x(t))$
  $p(t+1) = G(p(t), x(t))$

Equilibrium
  Performance
    Throughput, loss, delay
  Fairness
  Utility

Dynamics
  Local stability
  Cost of stabilization

Page 38:

TCP & AQM

xi(t)

pl(t)

TCP: Reno Vegas

AQM: DropTail RED REM/PI AVQ

Page 39:

TCP & AQM

xi(t)

pl(t)

TCP: Reno Vegas

AQM: DropTail RED REM/PI AVQ

Page 40:

Model structure

Multi-link multi-source network

[Figure, from F. Paganini: feedback loop. TCP sources F1 ... FN output rates x; forward routing Rf(s) maps x to link rates y; AQM links G1 ... GL output prices p; backward routing Rb'(s) maps p to source prices q]

Page 41:

TCP Reno (Jacobson 1990)

[Figure: window versus time, with SS and CA phases]

SS: Slow Start
CA: Congestion Avoidance
Fast retransmission/fast recovery

Page 42:

TCP Vegas (Brakmo & Peterson 1994)

[Figure: window versus time, with SS and CA phases]

Converges, no retransmission
… provided buffer is large enough

Page 43:

Vegas model

for every RTT
{ if W/RTTmin - W/RTT < α then W++
  if W/RTTmin - W/RTT > α then W-- }
for every loss
  W := W/2

Annotation on the pseudocode: queue size

$G_l:\quad p_l(t+1) = [p_l(t) + y_l(t)/c_l - 1]^+$

$F_i:\quad x_i(t+1) = \begin{cases} x_i(t) + \frac{1}{D_i^2} & \text{if } x_i(t)\,q_i(t) < \alpha_i d_i \\ x_i(t) - \frac{1}{D_i^2} & \text{if } x_i(t)\,q_i(t) > \alpha_i d_i \\ x_i(t) & \text{else} \end{cases}$
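The per-RTT Vegas rule above can be sketched and exercised on a toy single-link model. The link model (RTT = propagation delay + backlog / capacity, one source), the parameter values, and the function names are all assumptions for illustration; the update rule itself follows the slide, with a single threshold alpha.

```python
def vegas_update(window, rtt, rtt_min, alpha):
    """One per-RTT Vegas step: compare expected and actual rate against alpha."""
    diff = window / rtt_min - window / rtt
    if diff < alpha:
        return window + 1
    if diff > alpha:
        return window - 1
    return window


def vegas_on_loss(window):
    """On loss the window is halved."""
    return window / 2


def single_link_rtt(window, capacity, prop_delay):
    """Toy model: packets beyond the bandwidth-delay product queue at the link."""
    backlog = max(0.0, window - capacity * prop_delay)
    return prop_delay + backlog / capacity


# Assumed setup: capacity 1 pkt/ms, propagation delay 10 ms, alpha 0.2 pkts/ms.
w = 1.0
for _ in range(50):
    rtt = single_link_rtt(w, capacity=1.0, prop_delay=10.0)
    w = vegas_update(w, rtt, rtt_min=10.0, alpha=0.2)
print("window settles near", w)
```

In this toy setup the window climbs past the bandwidth-delay product of 10 packets and then hovers where the rate difference is close to alpha, without needing a loss to stop growing.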

Page 44:

Vegas model

[Figure: feedback loop. TCP sources F1 ... FN output rates x; forward routing Rf(s) maps x to link rates y; AQM links G1 ... GL output prices p; backward routing Rb'(s) maps p to source prices q]

$G_l:\quad p_l(t+1) = \left[p_l(t) + \frac{y_l(t)}{c_l} - 1\right]^+$

$F_i:\quad x_i(t+1) = x_i(t) + \frac{1}{(d_i + q_i(t))^2}\,\mathrm{sgn}\big(\alpha_i d_i - x_i(t)\,q_i(t)\big)$

Page 45:

Overview

Protocol (Reno, Vegas, RED, REM/PI…)
  $x(t+1) = F(p(t), x(t))$
  $p(t+1) = G(p(t), x(t))$

Equilibrium
  Performance
    Throughput, loss, delay
  Fairness
  Utility

Dynamics
  Local stability
  Cost of stabilization

Page 46:

Model

Network
  Links l of capacities $c_l$

Sources i
  L(i) - links used by source i
  $U_i(x_i)$ - utility if source rate = $x_i$

[Figure: two links with capacities c1 and c2, shared by sources x1, x2, x3]

$x_1 + x_2 \le c_1$, $\quad x_1 + x_3 \le c_2$

Page 47:

Primal problem

$\max_{x_i \ge 0} \sum_i U_i(x_i)$ subject to $y_l \le c_l,\ \forall l \in L$

Assumptions
  Strictly concave increasing $U_i$

Unique optimal rates $x_i$ exist
Direct solution impractical

Page 48:

Related Work

Formulation
  Kelly 1997
Penalty function approach
  Kelly, Maulloo & Tan 1998
  Kunniyur & Srikant 2000
Duality approach
  Low & Lapsley 1999
  Athuraliya & Low 2000
Extensions
  Mo & Walrand 2000
  La & Anantharam 2000
Dynamics
  Johari & Tan 2000, Massoulie 2000, Vinnicombe 2000, …
  Hollot, Misra, Towsley & Gong 2001
  Paganini 2000, Paganini, Doyle, Low 2001, …

Page 49:

Duality Approach

Primal: $\max_{x_s \ge 0} \sum_s U_s(x_s)$ subject to $x^l \le c_l,\ \forall l \in L$

Dual: $\min_{p \ge 0} D(p) = \max_{x_s \ge 0} \Big( \sum_s U_s(x_s) + \sum_l p_l\,(c_l - x^l) \Big)$

($x^l$: aggregate source rate at link l)

Primal-dual algorithm:
  $x(t+1) = F(p(t), x(t))$
  $p(t+1) = G(p(t), x(t))$

Page 50:

Duality Model of TCP

Primal-dual algorithm:
  $x(t+1) = F(p(t), x(t))$    Reno, Vegas
  $p(t+1) = G(p(t), x(t))$    DropTail, RED, REM

Source algorithm iterates on rates
Link algorithm iterates on prices
With different utility functions

Page 51:

Duality Model of TCP

Primal-dual algorithm:
  $x(t+1) = F(p(t), x(t))$    Reno, Vegas
  $p(t+1) = G(p(t), x(t))$    DropTail, RED, REM

(x*,p*) primal-dual optimal if and only if
  $y_l^* \le c_l$ with equality if $p_l^* > 0$

Page 52:

Duality Model of TCP

Primal-dual algorithm:
  $x(t+1) = F(p(t), x(t))$    Reno, Vegas
  $p(t+1) = G(p(t), x(t))$    DropTail, RED, REM

Any link algorithm that stabilizes queue
  generates Lagrange multipliers
  solves dual problem

Page 53:

Gradient algorithm

source: $x_i(t+1) = U_i'^{-1}(q_i(t))$
link: $p_l(t+1) = [p_l(t) + \gamma_l\,(y_l(t) - c_l)]^+$

Theorem (Low & Lapsley ’99)
Converge to optimal rates in distributed asynchronous environment
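The source and link updates above can be run on the two-link, three-source network from the Model slide. This sketch makes several assumptions that are not in the slides: log utilities U_i(x) = w_i log x (so the source update is x_i = w_i / q_i), unit capacities and weights, a fixed step size, and synchronous updates rather than the asynchronous setting of the theorem.

```python
ROUTES = {0: (0, 1), 1: (0,), 2: (1,)}   # source -> links it uses
CAPACITY = [1.0, 1.0]                     # assumed link capacities
WEIGHT = [1.0, 1.0, 1.0]                  # assumed weights, U_i(x) = w_i * log(x)


def dual_gradient(gamma=0.1, iters=2000):
    """Iterate source and link updates; return (rates, prices)."""
    prices = [1.0 for _ in CAPACITY]
    rates = [0.0 for _ in ROUTES]
    for _ in range(iters):
        # Source: x_i = U_i'^{-1}(q_i) = w_i / q_i, where q_i sums the path prices.
        for i, links in ROUTES.items():
            q = sum(prices[l] for l in links)
            rates[i] = WEIGHT[i] / max(q, 1e-9)
        # Link: p_l <- [p_l + gamma * (y_l - c_l)]^+, where y_l sums the rates on l.
        for l, cap in enumerate(CAPACITY):
            y = sum(rates[i] for i, links in ROUTES.items() if l in links)
            prices[l] = max(0.0, prices[l] + gamma * (y - cap))
    return rates, prices


rates, prices = dual_gradient()
print("rates :", [round(x, 4) for x in rates])
print("prices:", [round(p, 4) for p in prices])
```

With these assumed values the iteration settles at the proportionally fair allocation: the two-link source gets half the rate of each one-link source, and both links are fully used.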

Page 54:

Gradient algorithm

source: $x_i(t+1) = U_i'^{-1}(q_i(t))$
link: $p_l(t+1) = [p_l(t) + \gamma_l\,(y_l(t) - c_l)]^+$

Vegas: approximate gradient algorithm

$F_i:\quad x_i(t+1) = x_i(t) + \frac{1}{(d_i + q_i(t))^2}\,\mathrm{sgn}\big(\bar{x}_i(t) - x_i(t)\big)$, where $\bar{x}_i(t) = U_i'^{-1}(q_i(t))$

Page 55:

Summary: equilibrium

Flow control problem
  $\max_{x_s \ge 0} \sum_s U_s(x_s)$ subject to $x^l \le c_l,\ \forall l \in L$

TCP/AQM
  Maximize aggregate source utility
  With different utility functions

Primal-dual algorithm
  $x(t+1) = F(p(t), x(t))$    Reno, Vegas
  $p(t+1) = G(p(t), x(t))$    DropTail, RED, REM

Page 56:

Implications

Performance
  Rate, delay, queue, loss

Fairness
  Utility function

Persistent congestion

Page 57:

Performance

Delay
  Congestion measures: end to end queueing delay
  Sets rate: $x_s(t) = \alpha_s \frac{d_s}{q_s(t)}$
  Equilibrium condition: Little’s Law

Loss
  No loss if converge (with sufficient buffer)
  Otherwise: revert to Reno (loss unavoidable)

Page 58:

Vegas Utility

$U_i(x_i) = \alpha_i d_i \log x_i$

Equilibrium (x, p) = (F, G)

Proportional fairness
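As a short worked step, the Vegas utility can be tied back to the rate formula on the Performance slide. The derivation below assumes the usual source optimality condition of the duality model, namely that each source sets its marginal utility equal to its end-to-end price $q_i$ (here the queueing delay).

```latex
\begin{align*}
U_i(x_i) &= \alpha_i d_i \log x_i \\
U_i'(x_i) &= \frac{\alpha_i d_i}{x_i} = q_i
  && \text{(marginal utility equals path price)} \\
x_i &= \frac{\alpha_i d_i}{q_i}
  && \text{(the rate set by queueing delay)} \\
x_i\, q_i &= \alpha_i d_i
  && \text{(Little's Law: rate times queueing delay is the source's backlog)}
\end{align*}
```

Weighted logarithmic utilities of this kind are what give the proportionally fair allocation named on the slide.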

Page 59:

Validation (L. Wang, Princeton)

Values are shown as measured (theory).

                     Source 1      Source 3      Source 5
RTT (ms)             17.1 (17)     21.9 (22)     41.9 (42)
Rate (pkts/s)        1205 (1200)   1228 (1200)   1161 (1200)
Window (pkts)        20.5 (20.4)   27 (26.4)     49.8 (50.4)
Avg backlog (pkts)   9.8 (10)

NS-2 simulation, single link, capacity = 6 pkts/ms
5 sources with different propagation delays, $\alpha_s$ = 2 pkts/RTT
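The theory column of the table can be reproduced with a few lines of arithmetic. This sketch reads $\alpha_s$ = 2 pkts/RTT as a backlog of 2 packets per source, so that all five sources see the same queueing delay and the same rate; that reading is an assumption, though it matches the theory values in the table.

```python
CAPACITY_PKTS_PER_MS = 6.0
N_SOURCES = 5
ALPHA_PKTS = 2.0                                   # assumed backlog per source
THEORY_RTT_MS = {"Source 1": 17.0, "Source 3": 22.0, "Source 5": 42.0}

total_backlog = N_SOURCES * ALPHA_PKTS             # packets queued at the link
queueing_delay_ms = total_backlog / CAPACITY_PKTS_PER_MS
rate_pkts_per_ms = ALPHA_PKTS / queueing_delay_ms  # x = alpha / q, same for every source
rate_pkts_per_s = rate_pkts_per_ms * 1000.0

# Window = rate * RTT for each source listed in the table.
windows = {name: rate_pkts_per_ms * rtt for name, rtt in THEORY_RTT_MS.items()}

print("backlog (pkts):", total_backlog)
print("rate (pkts/s):", round(rate_pkts_per_s))
print("windows (pkts):", {k: round(v, 1) for k, v in windows.items()})
```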

Page 60:

Persistent congestion

Vegas exploits buffer process to compute prices (queueing delays)

Persistent congestion due to
  Coupling of buffer & price
  Error in propagation delay estimation

Consequences
  Excessive backlog
  Unfairness to older sources

Theorem (Low, Peterson, Wang ’02)

A relative error of $\epsilon_s$ in propagation delay estimation distorts the utility function to

$\hat{U}_s(x_s) = (1 + \epsilon_s)\,\alpha_s d_s \log x_s + \epsilon_s d_s x_s$

Page 61:

Validation (L. Wang, Princeton)

Single link, capacity = 6 pkt/ms, $\alpha_s$ = 2 pkts/ms, $d_s$ = 10 ms

With finite buffer: Vegas reverts to Reno

[Figure: two panels, "Without estimation error" and "With estimation error"]

Page 62:

Validation (L. Wang, Princeton)

Source rates (pkts/ms)

#   src1          src2          src3          src4          src5
1   5.98 (6)
2   2.05 (2)      3.92 (4)
3   0.96 (0.94)   1.46 (1.49)   3.54 (3.57)
4   0.51 (0.50)   0.72 (0.73)   1.34 (1.35)   3.38 (3.39)
5   0.29 (0.29)   0.40 (0.40)   0.68 (0.67)   1.30 (1.30)   3.28 (3.34)

#   queue (pkts)   baseRTT (ms)
1   19.8 (20)      10.18 (10.18)
2   59.0 (60)      13.36 (13.51)
3   127.3 (127)    20.17 (20.28)
4   237.5 (238)    31.50 (31.50)
5   416.3 (416)    49.86 (49.80)

Page 63:

Vegas/REM

[Figure: two panels, "Vegas/REM" and "Vegas"]

peak = 43 pkts
utilization: 90% - 96%

Page 64:

Outline

Protocol (Reno, Vegas, RED, REM/PI…)
  $x(t+1) = F(p(t), x(t))$
  $p(t+1) = G(p(t), x(t))$

Equilibrium
  Performance
    Throughput, loss, delay
  Fairness
  Utility

Dynamics
  Local stability
  Cost of stabilization

Page 65:

Vegas model

[Figure: feedback loop. TCP sources F1 ... FN output rates x; forward routing Rf(s) maps x to link rates y; AQM links G1 ... GL output prices p; backward routing Rb'(s) maps p to source prices q]

$G_l:\quad p_l(t+1) = \left[p_l(t) + \frac{y_l(t)}{c_l} - 1\right]^+$

$F_i:\quad x_i(t+1) = x_i(t) + \frac{1}{(d_i + q_i(t))^2}\,\mathrm{sgn}\big(\alpha_i d_i - x_i(t)\,q_i(t)\big)$

Page 66:

Vegas model

[Figure: feedback loop. TCP sources F1 ... FN output rates x; forward routing Rf(s) maps x to link rates y; AQM links G1 ... GL output prices p; backward routing Rb'(s) maps p to source prices q]

$[R_f(s)]_{li} = e^{-\tau^f_{li} s}$ if source i uses link l

$[R_b(s)]_{li} = e^{-\tau^b_{li} s}$ if source i uses link l

Page 67:

Approximate model

$\dot{x}_i(t) = \frac{1}{T_i^2(t)}\,\mathrm{sgn}\left(1 - \frac{x_i(t)\,q_i(t)}{\alpha_i d_i}\right)$

Page 68:

Approximate model

$\dot{x}_i(t) = \frac{1}{T_i^2(t)}\,\mathrm{sgn}\left(1 - \frac{x_i(t)\,q_i(t)}{\alpha_i d_i}\right)$

$\dot{x}_i(t) = \frac{1}{T_i^2(t)}\,\frac{2}{\pi}\tan^{-1}\left(\eta\left(1 - \frac{x_i(t)\,q_i(t)}{\alpha_i d_i}\right)\right)$

Page 69:

Linearized model

$\delta x_i = -\frac{x_i}{q_i}\,\frac{a_i}{s T_i + a_i}\,\delta q_i$

$\delta \dot{p}_l = \frac{1}{c_l}\,\delta y_l$

$a_i = \frac{2\eta}{\pi}\,\frac{1}{x_i T_i}$

controls equilibrium delay

[Figure: feedback loop. TCP sources F1 ... FN output rates x; forward routing Rf(s) maps x to link rates y; AQM links G1 ... GL output prices p; backward routing Rb'(s) maps p to source prices q]

Page 70:

Stability

Theorem (Choe & Low, ‘02) Locally asymptotically stable if

[Equation: a lower bound on the ratio (link queueing delay) / (round trip time), involving the constant 0.63 and the gains $a_i$; exact form not recoverable]

Cannot be satisfied with >1 bottleneck link!

Page 71:

Stabilized Vegas

Vegas:

$\dot{x}_i(t) = \frac{1}{T_i^2(t)}\,\frac{2}{\pi}\tan^{-1}\left(\eta\left(1 - \frac{x_i(t)\,q_i(t)}{\alpha_i d_i}\right)\right)$

Stabilized Vegas:

[Equation: the same law with an additional term in $q_i(t)$ inside the $\tan^{-1}$ argument; exact form not recoverable]

Page 72:

Linearized model

$\delta x_i = -\frac{x_i}{q_i}\,\frac{a_i}{s T_i + a_i}\,\delta q_i$

$\delta \dot{p}_l = \frac{1}{c_l}\,\delta y_l$

controls equilibrium delay

[Figure: feedback loop. TCP sources F1 ... FN output rates x; forward routing Rf(s) maps x to link rates y; AQM links G1 ... GL output prices p; backward routing Rb'(s) maps p to source prices q]

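Page 73:

Linearized model

[Equation: linearized source law for stabilized Vegas, relating $x_i$ to $q_i$ through a transfer function in s with parameters a and b; exact form not recoverable]

$\delta \dot{p}_l = \frac{1}{c_l}\,\delta y_l$

controls equilibrium delay
choose $a_i = a$ [second parameter choice not recoverable]

[Figure: feedback loop. TCP sources F1 ... FN output rates x; forward routing Rf(s) maps x to link rates y; AQM links G1 ... GL output prices p; backward routing Rb'(s) maps p to source prices q]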

Page 74:

Stability

Theorem (Choe & Low, ‘02) Locally asymptotically stable if

[Equation: a condition of the form LHS < $M(a, \sigma)$, where the LHS involves round trip time and round trip queueing delay; exact form not recoverable]

Example
  LHS < 10*10 = 100
  a = 0.1, $\sigma$ = 0.015, $M(a, \sigma)$ = 120

Page 75:

Stability

Theorem (Choe & Low, ‘02) Locally asymptotically stable if

[Equation: a condition of the form LHS < $M(a, \sigma)$, where the LHS involves round trip time and round trip queueing delay; exact form not recoverable]

Application
  Stabilized TCP with current routers
  Queueing delay as congestion measure has the right scaling
  Incremental deployment with ECN

Page 76:

Papers

A duality model of TCP flow controls (ITC, Sept 2000)

Optimization flow control, I: basic algorithm & convergence (ToN, 7(6), Dec 1999)

Understanding Vegas: a duality model (J. ACM, 2002)

Scalable laws for stable network congestion control (CDC, 2001)

REM: active queue management (Network, May/June 2001)

netlab.caltech.edu

Page 77:

Backup slides

Page 78:

Dynamics

Small effect on queue
  AIMD
  Mice traffic
  Heterogeneity

Big effect on queue
  Stability!

Page 79:

Stable: 20ms delay

[Figure: Window. Individual window (pkts, 0 to 70) versus time (ms, 0 to 10000)]

Ns-2 simulations, 50 identical FTP sources, single link 9 pkts/ms, RED marking

Page 80:

Stable: 20ms delay

[Figure: two panels versus time (ms, 0 to 10000). Queue: instantaneous queue (pkts, 0 to 800). Window: individual window and average window (pkts, 0 to 70)]

Ns-2 simulations, 50 identical FTP sources, single link 9 pkts/ms, RED marking

Page 81:

Unstable: 200ms delay

[Figure: Window. Individual window (pkts, 0 to 70) versus time (10ms units, 0 to 10000)]

Ns-2 simulations, 50 identical FTP sources, single link 9 pkts/ms, RED marking

Page 82:

Unstable: 200ms delay

[Figure: two panels versus time (10ms units, 0 to 10000). Queue: instantaneous queue (pkts, 0 to 800). Window: individual window and average window (pkts, 0 to 70)]

Ns-2 simulations, 50 identical FTP sources, single link 9 pkts/ms, RED marking

Page 83:

Other effects on queue

[Figure: six panels of instantaneous queue (pkts, 0 to 800) versus time. Panels labeled 20ms and 200ms; two panels labeled 50% noise; two panels labeled avg delay 16ms and avg delay 208ms]

Page 84: Multi-Gbps TCP 9:00-10:00 Harvey Newman (Physics, Caltech) High speed networks & grids 10:00-10:45 Sylvain Ravot (Physics, Caltech/CERN) LHC networks and

Effect of instability

Larger jitters in throughput

Lower utilization

[Figure: "Mean and variance of individual window": window versus delay (ms), with mean and variance curves. No noise.]

Page 85

Linearized system

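[Block diagram, feedback loop labeled TCP, Network, AQM: blocks F_1 ... F_N (TCP) map q to x; R_f(s) maps x to y; blocks G_1 ... G_L (AQM) map y to p; R_b'(s) maps p to q.]

[Equations: AQM dynamics G_l, written in terms of z_l(t), p_l(t), and y_l(t); TCP dynamics F_i(q_i(t), x_i(t)), written in terms of q_i(t) and x_i(t).]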

Page 86

Linearized system

[Block diagram: blocks F_1 ... F_N map q to x; R_f(s) maps x to y; blocks G_1 ... G_L map y to p; R_b'(s) maps p to q.]

[Equation: expression for L(s), with factors labeled TCP, RED, queue, and delay.]

Page 87

Stability condition

Theorem: TCP/RED stable if

\[
\frac{\rho\, c^3 \tau^3}{2N^3}\,(c\tau + N) \;\le\; \frac{\pi (1-\beta)^2}{\sqrt{4\beta^2 + \pi^2 (1-\beta)^2}}
\]

TCP: Small \tau, Small c, Large N

RED: Small \rho, Large delay
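A small numerical sketch can make the trends in the theorem concrete. It assumes the bound has the form rho*c^3*tau^3/(2*N^3) * (c*tau + N) <= pi*(1-beta)^2 / sqrt(4*beta^2 + pi^2*(1-beta)^2), reading c as link capacity, N as the number of sources, and tau as round-trip delay in consistent units. The values of rho and beta below are hypothetical and chosen only for illustration. The condition is sufficient, so failing it does not by itself prove instability.

```python
import math

def red_stability_sides(c, n, tau, rho, beta):
    """Return (lhs, rhs) of the assumed sufficient condition lhs <= rhs."""
    lhs = rho * c**3 * tau**3 / (2 * n**3) * (c * tau + n)
    rhs = math.pi * (1 - beta) ** 2 / math.sqrt(
        4 * beta**2 + math.pi**2 * (1 - beta) ** 2
    )
    return lhs, rhs

def bound_holds(c, n, tau, rho, beta):
    """True when the assumed sufficient stability condition is satisfied."""
    lhs, rhs = red_stability_sides(c, n, tau, rho, beta)
    return lhs <= rhs

# Hypothetical RED/system parameters (not taken from the slides).
RHO, BETA = 5e-5, 0.5

# Capacity in pkts/ms and delay in ms, so c * tau is in packets.
for tau in (20, 200):
    lhs, rhs = red_stability_sides(c=9, n=50, tau=tau, rho=RHO, beta=BETA)
    print(f"tau={tau} ms: lhs={lhs:.3g}, rhs={rhs:.3g}, holds={lhs <= rhs}")
```

The left-hand side grows with tau and c and shrinks as N grows, which is the "small delay, small capacity, large N" reading of the slide.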

Page 88

Validation

[Figure: two scatter plots comparing NS simulation with the dynamic-link model, 30 data points. "Round trip propagation delay at critical frequency (ms)": delay (model) versus delay (NS). "Critical frequency (Hz)": frequency (model) versus frequency (NS).]

Page 89

Validation

[Figure: the same two scatter plots, 30 data points, now showing both the dynamic-link model and the static-link model. "Round trip propagation delay at critical frequency (ms)": delay (model) versus delay (NS). "Critical frequency (Hz)": frequency (model) versus frequency (NS).]

Page 90

Stability region

[Figure: "Round trip propagation delay at critical frequency": delay (ms) versus capacity (pkts/ms), with curves labeled N=40, N=30, N=20, N=20, N=60.]

Unstable for

Large delay

Large capacity

Small load

Page 91

Role of AQM

(Sufficient) stability condition

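[Equation: condition on -1 and the convex hull co{ K H( ; ) }; the relation symbol and the arguments of H are not recoverable.]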

Page 92

Role of AQM

(Sufficient) stability condition

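[Equation: condition on -1 and the convex hull co{ K H( ; ) }; the relation symbol and the arguments of H are not recoverable.]

TCP: [Equation: TCP term, involving c, N, p*, w*, and a complex exponential.]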

AQM: scale down K & shape H

Page 93

Role of AQM

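[Equation: set co{ K H( ; ) }; the arguments of H are not recoverable.]

TCP: [Equation: TCP term, involving c, N, p*, w*, and a complex exponential.]

RED: [Equation: RED term, involving c and a complex exponential.]

REM/PI: [Equation: REM/PI term, involving a1, a2, and a complex exponential.]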

Queue dynamics (windowing)

Problem!!

Page 94

Role of AQM

To stabilize a given TCP Fi with least cost

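\[
\min_{p(\cdot)} \int_0^\infty \left( x^T(t)\,Q\,x(t) + p^T(t)\,R\,p(t) \right) dt
\quad \text{subject to} \quad \dot{x} = F(x, p)
\]

Linearized F, single link, identical sources

Any AQM solves optimal control with appropriate Q and R

[Equations: p*(t) and p(t), each a combination of b(t) and y(t) terms with gains k1, k2, k3; signs and derivative marks are not recoverable.]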
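To illustrate what "stabilize with least cost" means in the simplest setting, here is a minimal sketch of the same quadratic-cost problem for a scalar linear system dx/dt = a*x + b*p with cost integral of q*x^2 + r*p^2. The scalar system and its coefficients are hypothetical stand-ins, not the linearized TCP model from the talk. The optimal control is the linear feedback p = -k*x, with k obtained from the scalar Riccati equation.

```python
import math

def scalar_lqr_gain(a, b, q, r):
    """Optimal gain k for dx/dt = a*x + b*p with cost integral of q*x^2 + r*p^2.

    The optimal control is p = -k*x.
    """
    if b == 0 or q <= 0 or r <= 0:
        raise ValueError("need b != 0, q > 0, r > 0")
    # Positive root of the scalar Riccati equation 2*a*P - (b*P)**2 / r + q = 0.
    riccati_p = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)
    return b * riccati_p / r

# Hypothetical open-loop-unstable scalar system (a > 0).
a, b = 0.5, 2.0
k = scalar_lqr_gain(a, b, q=1.0, r=1.0)

# Closed-loop dynamics are dx/dt = (a - b*k) * x; a negative value means stable.
closed_loop_pole = a - b * k
print(f"gain k = {k:.4f}, closed-loop pole = {closed_loop_pole:.4f}")
```

Raising r penalizes the control signal more heavily and yields a smaller gain, while raising q penalizes state deviation and yields a larger one; this is the sense in which Q and R select among feedback laws.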

Page 95

Role of AQM

To stabilize a given TCP Fi with least cost

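\[
\min_{p(\cdot)} \int_0^\infty \left( x^T(t)\,Q\,x(t) + p^T(t)\,R\,p(t) \right) dt
\quad \text{subject to} \quad \dot{x} = F(x, p)
\]

Linearized F, single link, identical sources

[Equation: p*(t), a combination of b(t) and y(t) terms with gains k1, k2, k3; signs and derivative marks are not recoverable.]

k1=0: Nonzero buffer

k3=0: Slower decay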