Cloud Control with Distributed Rate Limiting

Cloud Control with Distributed Rate Limiting. Barath Raghavan, Kashi Vishwanath, Sriram Ramabhadran, Kenneth Yocum, and Alex C. Snoeren. University of California, San Diego.

Upload: chen

Post on 23-Feb-2016


DESCRIPTION

Cloud Control with Distributed Rate Limiting. Barath Raghavan, Kashi Vishwanath, Sriram Ramabhadran, Kenneth Yocum, and Alex C. Snoeren. University of California, San Diego. Centralized network services: hosting with a single physical presence. However, clients are across the Internet.

TRANSCRIPT

Page 1: Cloud Control with Distributed Rate Limiting

Cloud Control with Distributed Rate Limiting

Barath Raghavan, Kashi Vishwanath, Sriram Ramabhadran, Kenneth Yocum, and Alex C. Snoeren

University of California, San Diego

Page 2: Cloud Control with Distributed Rate Limiting

Centralized network services

• Hosting with a single physical presence
– However, clients are across the Internet

Page 3: Cloud Control with Distributed Rate Limiting

Running on a cloud

• Resources and clients are across the world
• Services combine these distributed resources

1 Gbps

Page 4: Cloud Control with Distributed Rate Limiting

Key challenge

We want to control distributed resources as if they were centralized

Page 5: Cloud Control with Distributed Rate Limiting

Ideal: Emulate a single limiter

• Make distributed feel centralized
– Packets should experience same limiter behavior

[Figure: three sources (S), three destinations (D), and the limiters between them, with links labeled 0 ms]

Page 6: Cloud Control with Distributed Rate Limiting

Distributed Rate Limiting (DRL)

Achieve functionally equivalent behavior to a central limiter

1. Global Token Bucket: packet-level (general)
2. Global Random Drop: packet-level (general)
3. Flow Proportional Share: flow-level (TCP specific)

Page 7: Cloud Control with Distributed Rate Limiting

Distributed Rate Limiting tradeoffs

Accuracy (how close to K Mbps is delivered, flow rate fairness)

+

Responsiveness (how quickly demand shifts are accommodated)

vs.

Communication efficiency (how much and how often rate limiters must communicate)

Page 8: Cloud Control with Distributed Rate Limiting

DRL Architecture

[Figure: Limiters 1 to 4 connected by gossip. Within a limiter: packet arrival → estimate local demand; estimate interval timer → gossip → global demand → set allocation → enforce limit]

Page 9: Cloud Control with Distributed Rate Limiting

Token Buckets

Token bucket, fill rate K Mbps

Packet
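The token bucket named on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: the class name `TokenBucket`, the `capacity_bits` burst size, and the explicit `now` argument (used so the example is deterministic) are all assumptions introduced here.

```python
import time
from typing import Optional


class TokenBucket:
    """Minimal token bucket: tokens accrue at a fixed rate up to a cap,
    and a packet is forwarded only if enough tokens are available.
    (Hypothetical sketch; names and units are assumptions.)"""

    def __init__(self, rate_bps: float, capacity_bits: float, now: Optional[float] = None):
        self.rate_bps = rate_bps            # fill rate (the slide's "K Mbps", in bits/sec)
        self.capacity_bits = capacity_bits  # maximum burst size (assumed parameter)
        self.tokens = capacity_bits         # start with a full bucket
        self.last = time.monotonic() if now is None else now

    def _refill(self, now: float) -> None:
        # Add tokens for the elapsed time, never exceeding the capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity_bits, self.tokens + elapsed * self.rate_bps)
        self.last = now

    def allow(self, packet_bits: float, now: Optional[float] = None) -> bool:
        """Return True and consume tokens if the packet may be forwarded."""
        now = time.monotonic() if now is None else now
        self._refill(now)
        if packet_bits <= self.tokens:
            self.tokens -= packet_bits
            return True
        return False
```

With a 1 Mbps fill rate and a 12,000-bit bucket, one full-size packet drains the bucket and the next must wait for a refill.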

Page 10: Cloud Control with Distributed Rate Limiting

Building a Global Token Bucket

Limiter 1 ⇄ Limiter 2: demand info (bytes/sec)
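One plausible reading of this slide, sketched below: each limiter keeps its own bucket filled at the global rate and drains it both for packets it forwards and for traffic that other limiters report. The class `GlobalTokenBucket`, its method names, and the choice to clamp tokens at zero are assumptions made for illustration; the slide only states that demand info is exchanged.

```python
class GlobalTokenBucket:
    """Per-limiter bucket filled at the GLOBAL rate and drained by both
    local and remotely reported traffic. (Hypothetical sketch.)"""

    def __init__(self, global_rate_bps: float, capacity_bits: float, now: float = 0.0):
        self.rate_bps = global_rate_bps
        self.capacity_bits = capacity_bits
        self.tokens = capacity_bits
        self.last = now

    def _refill(self, now: float) -> None:
        # Accrue tokens at the global rate, capped at the bucket size.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity_bits, self.tokens + elapsed * self.rate_bps)
        self.last = now

    def on_remote_update(self, remote_bits: float, now: float) -> None:
        """Drain tokens for traffic other limiters reported seeing.
        Clamping at zero is an assumption of this sketch."""
        self._refill(now)
        self.tokens = max(0.0, self.tokens - remote_bits)

    def on_local_packet(self, packet_bits: float, now: float) -> bool:
        """Forward a local packet only if the shared budget allows it."""
        self._refill(now)
        if packet_bits <= self.tokens:
            self.tokens -= packet_bits
            return True
        return False
```

Because remote traffic is only known from periodic updates, the local view of the bucket lags behind the true global arrivals between updates.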

Page 11: Cloud Control with Distributed Rate Limiting

Baseline experiment

Limiter 1: 3 TCP flows (S → D)

Limiter 2: 7 TCP flows (S → D)

Single token bucket: 10 TCP flows (S → D)

Page 12: Cloud Control with Distributed Rate Limiting

Global Token Bucket (GTB)

(50 ms estimate interval)

[Plots: single token bucket (10 TCP flows) vs. global token bucket (3 TCP flows and 7 TCP flows)]

Problem: GTB requires near-instantaneous arrival info

Page 13: Cloud Control with Distributed Rate Limiting

Global Random Drop (GRD)

5 Mbps (limit)
4 Mbps (global arrival rate)

Case 1: Below global limit, forward packet

Limiters send, collect global rate info from others

Page 14: Cloud Control with Distributed Rate Limiting

Global Random Drop (GRD)

5 Mbps (limit)
6 Mbps (global arrival rate)

Case 2: Above global limit, drop with probability:

$$\frac{\text{Excess}}{\text{Global arrival rate}} = \frac{1}{6}$$

Same at all limiters
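The two GRD cases can be written as a short calculation. This is a sketch under the slide's definitions (excess = global arrival rate minus global limit); the function names `drop_probability` and `should_drop` are hypothetical.

```python
import random


def drop_probability(global_limit: float, global_arrival_rate: float) -> float:
    """Case 1: at or below the limit, never drop.
    Case 2: above the limit, drop with probability excess / arrival rate."""
    if global_arrival_rate <= global_limit:
        return 0.0
    excess = global_arrival_rate - global_limit
    return excess / global_arrival_rate


def should_drop(global_limit: float, global_arrival_rate: float, rng=random) -> bool:
    """Per-packet decision; every limiter uses the same probability."""
    return rng.random() < drop_probability(global_limit, global_arrival_rate)
```

For the slide's numbers, `drop_probability(5, 4)` gives 0 and `drop_probability(5, 6)` gives 1/6.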

Page 15: Cloud Control with Distributed Rate Limiting

GRD in baseline experiment

(50 ms estimate interval)

[Plots: single token bucket (10 TCP flows) vs. global random drop (3 TCP flows and 7 TCP flows)]

Delivers flow behavior similar to a central limiter

Page 16: Cloud Control with Distributed Rate Limiting

GRD with flow join
(50 ms estimate interval)

Flow 1 joins at limiter 1
Flow 2 joins at limiter 2
Flow 3 joins at limiter 3

Page 17: Cloud Control with Distributed Rate Limiting

Flow Proportional Share (FPS)

Limiter 1: 3 TCP flows (S → D)

Limiter 2: 7 TCP flows (S → D)

Page 18: Cloud Control with Distributed Rate Limiting

Flow Proportional Share (FPS)

Limiter 1: “3 flows”
Limiter 2: “7 flows”

Goal: Provide inter-flow fairness for TCP flows

Local token-bucket enforcement

Page 19: Cloud Control with Distributed Rate Limiting

Estimating TCP demand

Limiter 1: two sources (S), 1 TCP flow each, to D

Limiter 2: 3 TCP flows (S → D)

Page 20: Cloud Control with Distributed Rate Limiting

Estimating TCP demand

Local token rate (limit) = 10 Mbps

Flow A = 5 Mbps

Flow B = 5 Mbps

Flow count = 2 flows

Page 21: Cloud Control with Distributed Rate Limiting

Estimating TCP demand

Limiter 1: two sources (S), 1 TCP flow each, to D

Limiter 2: 3 TCP flows (S → D)

Page 22: Cloud Control with Distributed Rate Limiting

Estimating skewed TCP demand

Local token rate (limit) = 10 Mbps
Flow A = 2 Mbps (bottlenecked elsewhere)
Flow B = 8 Mbps

Flow count ≠ demand

Key insight: Use a TCP flow’s rate to infer demand

Page 23: Cloud Control with Distributed Rate Limiting

Estimating skewed TCP demand

Local token rate (limit) = 10 Mbps
Flow A = 2 Mbps (bottlenecked elsewhere)
Flow B = 8 Mbps

$$\frac{\text{Local limit}}{\text{Largest flow's rate}} = \frac{10}{8} = 1.25\ \text{flows}$$
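The estimate on this slide divides the local limit by the rate of the largest flow. A minimal sketch follows; the function name `estimated_flow_count` and the handling of an empty or idle limiter (returning 0) are assumptions, and the sketch covers only the case shown on the slides, where the flows together use the whole local limit.

```python
from typing import Sequence


def estimated_flow_count(local_limit_mbps: float, flow_rates_mbps: Sequence[float]) -> float:
    """Effective number of flows = local limit / largest flow's rate.
    Returns 0.0 when there are no active flows (an assumption of this sketch)."""
    if not flow_rates_mbps:
        return 0.0
    largest = max(flow_rates_mbps)
    if largest <= 0:
        return 0.0
    return local_limit_mbps / largest
```

Two equal 5 Mbps flows under a 10 Mbps limit count as 2 flows, while the skewed 2 Mbps and 8 Mbps pair counts as 1.25 flows, since the bottlenecked flow cannot use an equal share.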

Page 24: Cloud Control with Distributed Rate Limiting

Flow Proportional Share (FPS)

Global limit = 10 Mbps

Limiter 1: 1.25 flows
Limiter 2: 2.50 flows

Set local token rate (at Limiter 1):

$$\frac{\text{Global limit} \times \text{local flow count}}{\text{Total flow count}} = \frac{10\ \text{Mbps} \times 1.25}{1.25 + 2.50} = 3.33\ \text{Mbps}$$
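The allocation rule on this slide can be checked with a few lines of code. This is an illustrative sketch: the function name `fps_allocations`, the dictionary keyed by limiter name, and the behavior when no flows exist anywhere (all rates set to 0) are assumptions.

```python
from typing import Dict


def fps_allocations(global_limit_mbps: float, flow_counts: Dict[str, float]) -> Dict[str, float]:
    """Split the global limit in proportion to each limiter's flow count:
    local rate = global limit * local flow count / total flow count."""
    total = sum(flow_counts.values())
    if total <= 0:
        # No flows anywhere; what to do here is an assumption of this sketch.
        return {name: 0.0 for name in flow_counts}
    return {
        name: global_limit_mbps * count / total
        for name, count in flow_counts.items()
    }
```

With flow counts of 1.25 and 2.50 under a 10 Mbps global limit, Limiter 1 receives about 3.33 Mbps and Limiter 2 the remaining 6.67 Mbps, so the local rates always sum to the global limit.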

Page 25: Cloud Control with Distributed Rate Limiting

Under-utilized limiters

Limiter 1: two sources (S), 1 TCP flow each, to D

Wasted rate

Set local limit equal to actual usage (limiter returns to full utilization)

Page 26: Cloud Control with Distributed Rate Limiting

Flow Proportional Share (FPS)
(500 ms estimate interval)

Page 27: Cloud Control with Distributed Rate Limiting

Additional issues

• What if a limiter has no flows and one arrives?
• What about bottlenecked traffic?
• What about varied RTT flows?
• What about short-lived vs. long-lived flows?

• Experimental evaluation in the paper
– Evaluated on a testbed and over PlanetLab

Page 28: Cloud Control with Distributed Rate Limiting

Cloud control on PlanetLab

• Apache Web servers on 10 PlanetLab nodes
• 5 Mbps aggregate limit
• Shift load over time from 10 nodes to 4 nodes

5 Mbps

Page 29: Cloud Control with Distributed Rate Limiting

Static rate limiting

Demands at 10 Apache servers on PlanetLab

Demand shifts to just 4 nodes

Wasted capacity

Page 30: Cloud Control with Distributed Rate Limiting

FPS (top) vs. Static limiting (bottom)

Page 31: Cloud Control with Distributed Rate Limiting

Conclusions

• Protocol agnostic limiting (extra cost)
– Requires shorter estimate intervals
• Fine-grained packet arrival info not required
– For TCP, flow-level granularity is sufficient
• Many avenues left to explore
– Inter-service limits, other resources (e.g. CPU)

Page 32: Cloud Control with Distributed Rate Limiting

Questions!