An experimental study of the learnability of congestion control

Anirudh Sivaraman, Keith Winstein, Pratiksha Thaker, Hari Balakrishnan

MIT CSAIL

http://web.mit.edu/remy/learnability

August 31, 2014

This talk

- How easy is it to learn a network protocol to achieve a desired goal, despite a mismatched set of assumptions?

- cf. Learning: “Knowledge acquisition without explicit programming” (Valiant 1984)

Preview of key results

- Can tolerate mismatched link-rate assumptions

- Need precision about the number of senders

- TCP compatibility is a double-edged sword

- Can tolerate mismatch in the # of bottlenecks

Experimental method

[Figure, built up over several slides: training networks, each labeled < Mbps, ms>, are fed to a learner together with an objective function; the learner outputs a congestion-control algorithm. Here the learner is Remy (SIGCOMM 13) and its output is a RemyCC, which is tested within ns-2 on testing networks, also labeled < Mbps, ms>.]

Objective function:

- log (tpt/delay)
- Avg. flow completion time
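
As a rough illustration of the first objective named on this slide, the sketch below scores a set of flows by summing log(tpt/delay) over them. The function name, the units, and the choice to sum across flows are assumptions made for this example; the slide gives only the per-flow expression.

```python
import math

def log_tpt_over_delay(flows):
    """Sum log(throughput / delay) over (throughput, delay) pairs.

    Assumption: throughput in Mbps and delay in ms, both strictly positive.
    """
    return sum(math.log(tpt / delay) for tpt, delay in flows)

# Two hypothetical allocations of a 10 Mbps link between two flows.
fair = [(5.0, 50.0), (5.0, 50.0)]
unfair = [(9.0, 50.0), (1.0, 50.0)]

print(log_tpt_over_delay(fair) > log_tpt_over_delay(unfair))
```

Because the logarithm is concave, the even split scores higher than the lopsided one at the same total throughput and delay.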

Remy compared with an ideal protocol

[Figure: plot against queueing delay (ms), x-axis ticks from 0 to 500; points labeled RemyCC, Cubic, and Cubic/sfqCoDel.]

Learning network protocols despite mismatched assumptions

- Is there a tradeoff between operating range and generality in link rates?

- Is there a tradeoff between performance and operating range in link rates?

Performance and link-rate operating range

[Figure: x-axis: Link rate (Mbps), ticks at 1, 10, 100, 1000; y-axis: Objective function (normalized). Curves labeled 2x range, 10x range, 100x range, 1000x range, and Cubic-over-sfqCoDel.]

- Very clear generality vs. operating range tradeoff

- Only weak evidence of a performance vs. operating range tradeoff

- Possible to design a forwards-compatible protocol handling a wide range in link rates

Can we learn a protocol that performs well both when there are few senders and when there are many senders?

Imperfections in the number of senders

[Figure: x-axis: Number of senders, 0 to 100; y-axis ticks from −1.4 to −0.2 (axis label not recovered). Curves labeled 1 - 50, 1 - 100, and Cubic-over-sfqCoDel.]

Tradeoff between performance with few senders and performance with many senders

What are the costs and benefits of learning a new protocol that shares fairly with a legacy sender?

Imperfect assumptions about the nature of other senders

- TCP-Aware RemyCC: Contends with:
  - TCP-Aware RemyCC half the time
  - TCP NewReno half the time

- TCP-Naive RemyCC: Contends with:
  - TCP-Naive RemyCC all the time
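
The two training setups above could be expressed as a rule for choosing which kind of sender a RemyCC contends with in each training run. This is only a sketch: the function name, the string labels, and the reading of "half the time" as an independent coin flip per run are assumptions, not details taken from the talk.

```python
import random

def pick_contender(training_mode, rng):
    """Return the kind of sender the RemyCC under training contends with."""
    if training_mode == "tcp-aware":
        # Assumption: a fair coin flip per training run.
        return "remycc-tcp-aware" if rng.random() < 0.5 else "newreno"
    if training_mode == "tcp-naive":
        # A TCP-naive RemyCC only ever sees copies of itself.
        return "remycc-tcp-naive"
    raise ValueError(f"unknown training mode: {training_mode}")

rng = random.Random(0)
aware_runs = [pick_contender("tcp-aware", rng) for _ in range(1000)]
naive_runs = [pick_contender("tcp-naive", rng) for _ in range(1000)]
print(aware_runs.count("newreno"), naive_runs.count("newreno"))
```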

RemyCC competing against itself

[Figure: plot against queueing delay (ms), x-axis ticks at 16, 32, 64, 128, with an arrow marked "Better". Points labeled NewReno, RemyCC [TCP-naive], and RemyCC [TCP-aware]; the gap between the two RemyCCs is annotated "Cost of TCP-awareness".]

RemyCC competing against TCP NewReno

[Figure: plot against queueing delay (ms), x-axis ticks at 64, 96, 128, with an arrow marked "Better". Points labeled NewReno and RemyCC [TCP-naive], and NewReno and RemyCC [TCP-aware]; annotations read "Benefit of TCP-awareness" and "Effect of TCP-aware adversary".]

TCP-awareness benefits you when you need it and costs you when you don’t

Caveats

- Remy as a proxy for an optimal learner

- Results may change with better learners

- Negative results may no longer hold

The learnability of congestion control

- Can tolerate mismatch in the # of bottlenecks

- Ongoing work in using findings:
  - improve Google’s datacenter transport
  - user-space implementation of RemyCC

- http://web.mit.edu/remy/learnability

Backup slides

The Remy protocol synthesis procedure

- Protocol: range-based rule table from state to action

- State: Congestion signals tracked by the sender
  - s_ewma: EWMA over packet inter-transmit times
  - r_ewma: EWMA over ACK inter-arrival times
  - rtt_ratio: Ratio of RTT to minimum RTT
  - slow_r_ewma: Slower version of r_ewma

- Action: modify window, transmission rate
  - Multiplier m to current window
  - Increment c to current window
  - Minimum inter-transmit time
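
To make the state-to-action mapping concrete, here is a minimal sketch of a range-based rule table. It is not Remy's actual code: the class and function names, the EWMA weight, the bounds convention, and all numeric values are assumptions chosen for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Action:
    m: float           # multiplier applied to the current window
    c: float           # increment added to the current window
    min_gap_ms: float  # minimum inter-transmit time (unit is an assumption)

@dataclass(frozen=True)
class Rule:
    lo: tuple  # inclusive lower bound for each state signal
    hi: tuple  # exclusive upper bound for each state signal
    action: Action

    def matches(self, state):
        return all(l <= x < h for l, x, h in zip(self.lo, state, self.hi))

def ewma(old, sample, alpha):
    """Exponentially weighted moving average; alpha weights the new sample."""
    return (1 - alpha) * old + alpha * sample

def lookup(rules, state):
    """Return the action of the first rule whose ranges contain the state."""
    for rule in rules:
        if rule.matches(state):
            return rule.action
    raise LookupError("state falls outside the rule table")

def next_window(window, action):
    """Apply the multiplier and the increment to the current window."""
    return action.m * window + action.c

INF = float("inf")
# State order: (s_ewma, r_ewma, rtt_ratio, slow_r_ewma). Values are made up.
RULES = [
    Rule((0, 0, 0, 0), (10, INF, INF, INF), Action(1.0, 2.0, 0.5)),
    Rule((10, 0, 0, 0), (INF, INF, INF, INF), Action(0.5, 1.0, 4.0)),
]

state = (ewma(12.0, 20.0, 0.125), 9.0, 1.3, 9.5)
action = lookup(RULES, state)
print(next_window(20.0, action))
```

The two example rules split the state space on s_ewma only; a learned table would have many more regions across all four signals.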

1. Start with one rule: one action for all states

2. Optimize each action to maximize objective

3. Find most used rule

4. Median split that rule based on state usage

5. Repeat 2, 3, and 4 till you converge
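
The five steps above can be sketched on a toy problem. Everything specific here is an assumption for illustration: the state is one-dimensional, a made-up objective stands in for network simulation, actions come from a small candidate grid, and a fixed number of rounds replaces the convergence test.

```python
import statistics

# Toy stand-in for the network simulator (an assumption for this sketch):
# each observed state x in [0, 1) has an ideal action 2 * x, and the
# objective penalizes the squared distance from it.
STATES = [i / 50 for i in range(50)]
CANDIDATES = [i / 10 for i in range(21)]  # candidate actions 0.0 .. 2.0

def rule_for(rules, x):
    """Index of the rule whose [lo, hi) range contains state x."""
    return next(i for i, (lo, hi, _) in enumerate(rules) if lo <= x < hi)

def objective(rules):
    return -sum((rules[rule_for(rules, x)][2] - 2 * x) ** 2 for x in STATES)

def optimize(rules):
    """Step 2: improve each rule's action in turn, holding the others fixed."""
    for i, (lo, hi, _) in enumerate(rules):
        rules[i] = max(((lo, hi, a) for a in CANDIDATES),
                       key=lambda r: objective(rules[:i] + [r] + rules[i + 1:]))
    return rules

def split_most_used(rules):
    """Steps 3 and 4: median-split the rule that matched the most states."""
    hits = [[x for x in STATES if rule_for(rules, x) == i]
            for i in range(len(rules))]
    i = max(range(len(rules)), key=lambda k: len(hits[k]))
    lo, hi, a = rules[i]
    mid = statistics.median(hits[i])
    return rules[:i] + [(lo, mid, a), (mid, hi, a)] + rules[i + 1:]

def synthesize(rounds):
    rules = optimize([(0.0, 1.0, 1.0)])  # step 1: one rule for all states
    for _ in range(rounds):              # step 5, with a fixed round count
        rules = optimize(split_most_used(rules))
    return rules

for lo, hi, a in synthesize(3):
    print(f"[{lo:.2f}, {hi:.2f}) -> action {a}")
```

Each split hands both halves the parent's action, so the objective never gets worse; the following optimization pass then lets the halves diverge.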

[Figure sequence: the rule table drawn over the s_ewma and r_ewma axes, with each region labeled by its action triple.]

- One action for all states. Find the best value: <?,?,?>
- The best (single) action: <0.90,4,3.3>. Now split it on median.
- Simulate, then optimize each of the new actions.
- Now split the most-used rule. Rules at this point: <0.90,5,2.8>, <0.60,19,76.2>, <0.70,6,53.5>, <0.80,5,4.1>
- Simulate, then optimize.
- The following build shows seven rules: <0.90,5,2.8>, <0.30,29,49.7>, <0.60,17,13.3>, <0.80,8,3.3>, <0.80,8,62.7>, <0.80,17,4.6>, <0.80,7,16.9>
- Simulate.

Can applications with different objectives coexist?

- Tpt. Sender: a throughput-intensive sender

  log(throughput) − 0.1 ∗ log(delay)    (1)

- Lat. Sender: a latency-sensitive sender

  log(throughput) − 10.0 ∗ log(delay)    (2)

- Running over a FIFO queue
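
A short sketch shows how the weights in equations (1) and (2) make the two senders prefer different operating points. The two operating points and their units are hypothetical values picked for the example.

```python
import math

def objective(throughput, delay, delta):
    """log(throughput) - delta * log(delay), as in equations (1) and (2)."""
    return math.log(throughput) - delta * math.log(delay)

TPT_DELTA = 0.1   # throughput-intensive sender, equation (1)
LAT_DELTA = 10.0  # latency-sensitive sender, equation (2)

# Hypothetical operating points: (throughput in Mbps, delay in ms).
fast_but_queued = (10.0, 200.0)
slow_but_prompt = (2.0, 100.0)

for name, delta in [("Tpt. Sender", TPT_DELTA), ("Lat. Sender", LAT_DELTA)]:
    prefers_fast = objective(*fast_but_queued, delta) > objective(*slow_but_prompt, delta)
    print(name, "prefers", "fast_but_queued" if prefers_fast else "slow_but_prompt")
```

With the small delay weight, the fivefold throughput gain outweighs the doubled delay; with the large weight, the ranking flips.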

Training for diversity has a cost ...

[Figure: plot against queueing delay (ms), x-axis ticks in powers of two from 1 to 2048. Points labeled Tpt. Sender [naive], Tpt. Sender [coevolved], Lat. Sender [naive], and Lat. Sender [coevolved]; annotation reads "Cost of Coexistence".]

... but benefits the docile sender

[Figure: plot against queueing delay (ms), x-axis ticks in powers of two from 1 to 2048. Points labeled Tpt. Sender [naive], Tpt. Sender [coevolved], Lat. Sender [naive], and Lat. Sender [coevolved]; annotations read "Benefit of coevolution" and "Effect of playing nice".]
