TRANSCRIPT
Sharing the Data Center Network
Alan Shieh, Srikanth Kandula, Albert Greenberg, Changhoon Kim, Bikas Saha
Microsoft Research, Cornell University, Windows Azure, Microsoft Bing
NSDI’11
NSLab, RIIT, Tsinghua Univ
Outline
Introduction
Seawall Design
Evaluation
Discussion
Summary
Introduction
Data centers provide compute and storage resources for web search, content distribution, and social networking. They achieve cost efficiencies and on-demand scaling, and are highly multiplexed shared environments: VMs and tasks from multiple tenants coexist in the same cluster. The risk of network performance interference and denial-of-service attacks is high.
Problems with network sharing in datacenters
Performance interference in infrastructure cloud services: network usage is a distributed resource, and a tenant can grab an outsized share by opening a large number of flows or by using higher-rate UDP flows
Poorly performing schedules in Cosmos (Bing): poor sharing of the network leads to poor performance and wasted resources
Poor sharing of the network leads to poor performance and wasted resources
* Optimal bandwidth sharing is a non-goal: it would require perfect knowledge of client demands
[Figure: Map-Reduce workloads (5 maps and 1 reduce)]
Magnitude of scale and churn: the number of classes to share bandwidth among is large and varies frequently, and cloud datacenter traffic is even harder to predict
Requirements
Traffic agnostic, simple service interface
Require no changes to network topology or hardware
Scale to large numbers of tenants and high churn
Enforce sharing without sacrificing efficiency
Example: three VMs share a link; VM 3 has weight 2, VMs 1 and 2 have weight 1. Even when VM 2 opens three flows, the shares converge to VM 3: ~50%, VM 2: ~25%, VM 1: ~25%.
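The example above is simple arithmetic: each VM's share of the link is its weight divided by the sum of all weights. The function and VM names below are illustrative, not Seawall's API.

```python
def weighted_shares(weights):
    """Return each entity's fraction of a link, proportional to its weight."""
    total = sum(weights.values())
    return {vm: w / total for vm, w in weights.items()}

# VM 3 carries weight 2, VMs 1 and 2 weight 1 each.
shares = weighted_shares({"VM1": 1, "VM2": 1, "VM3": 2})
# VM3 gets ~50%; VM1 and VM2 get ~25% each, regardless of flow count.
```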
Existing mechanisms are insufficient
In-network queuing and rate limiting: not scalable, and can underutilize links
Network-to-source congestion control (Ethernet QCN): the network detects congestion and throttles the send rate, but this requires new hardware and the policy is inflexible
End-to-end congestion control (TCP): the hypervisor or guest detects congestion and throttles the send rate, but this gives poor control over allocation, and guests can change their TCP stack
Seawall Design
Congestion controlled hypervisor-to-hypervisor tunnels
Bandwidth allocator: weighted additive-increase, multiplicative-decrease (AIMD), derived from TCP-Reno
Decrease: cut the rate multiplicatively on congestion feedback
Increase: grow the rate additively, in proportion to the entity's weight
Three improvements:
Combine feedback from multiple destinations
Modify the adaptation logic to converge quickly and stay at equilibrium longer
Nest traffic
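The weighted AIMD rule can be sketched as follows; the constants `ALPHA` and `DELTA` are illustrative placeholders, not the paper's actual parameters.

```python
ALPHA = 0.5   # multiplicative decrease factor (illustrative)
DELTA = 1.0   # base additive increase step, e.g. in Mbps (illustrative)

def aimd_step(rate, weight, congested):
    """One weighted-AIMD update, TCP-Reno style: cut the rate
    multiplicatively on congestion; otherwise increase it additively,
    scaled by the entity's weight so heavier entities converge to
    proportionally larger shares."""
    if congested:
        return rate * (1 - ALPHA)
    return rate + weight * DELTA
```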
Step 1: use distributed control loops to determine per-link, per-entity shares
Unlike XCP, QCN, and SideCar, Seawall does not rely on explicit congestion feedback from the network
Step 2: convert per-link, per-entity shares to per-link, per-tunnel shares
Using β = 0.9, allocate a β fraction of the entity's share in proportion to current usage and the rest evenly across destinations
Example: an entity with demands (2x, x, x) to three destinations; the allowed share of the first destination converges to within 20% of its demand in four iterations
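The Step 2 split can be sketched as a single update: β of the entity's share follows current per-destination usage, and the remaining 1−β is spread evenly so idle destinations can ramp up. The function name and flat-list representation are illustrative.

```python
BETA = 0.9  # fraction allocated in proportion to current usage (from the slide)

def tunnel_shares(entity_share, usage):
    """Split one entity's per-link share across its tunnels:
    BETA proportionally to current per-destination usage,
    the remaining (1 - BETA) evenly across all destinations."""
    total = sum(usage)
    n = len(usage)
    return [entity_share * (BETA * u / total + (1 - BETA) / n) for u in usage]

# Entity with demands (2x, x, x): the busier first tunnel gets the
# larger slice, while the even component keeps the others from starving.
shares = tunnel_shares(100.0, [2, 1, 1])
```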
Improving the rate adaptation logic: use control laws from CUBIC to achieve faster convergence, longer dwell time at the equilibrium point, and higher utilization than AIMD
If switches support ECN, Seawall also incorporates the control laws from DCTCP
Smoothed multiplicative decrease
Concave or convex increase
Below the goal rate, the increase is concave; above the goal, it is convex
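A minimal sketch of the concave/convex behavior, assuming the standard CUBIC window curve applied to rates; `c` and the variable names are illustrative, not Seawall's actual constants.

```python
def cubic_rate(t, goal, reduced, c=0.4):
    """CUBIC-style rate as a function of time t since the last decrease.
    `goal` is the rate at the last congestion event, `reduced` the rate
    just after the multiplicative decrease. The curve rises concavely
    toward `goal` (fast recovery, then gentle approach), then probes
    convexly beyond it."""
    # k: time at which the cubic curve crosses the goal rate.
    k = ((goal - reduced) / c) ** (1 / 3)
    return c * (t - k) ** 3 + goal
```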
Nesting traffic – deferring congestion control
If a sender always sends less than the rate allowed by Seawall, it could otherwise save up its allowance and launch a short, overwhelming burst of traffic
UDP and TCP flows behave differently: a full-burst UDP flow immediately uses all of its rate, while a set of TCP flows can take several RTTs to ramp up
Each TCP flow queries the rate limiter before sending
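One way to sketch this deferral is a token bucket whose depth caps how much unused allowance a sender can bank, so a long-idle sender cannot burst arbitrarily; full-burst UDP drains the bucket immediately, while TCP's ramp-up paces its queries. The class and parameters are illustrative, not Seawall's actual limiter.

```python
import time

class RateLimiter:
    """Token bucket enforcing the rate currently allowed to a tunnel.
    Unused allowance accumulates only up to `burst_bytes`, bounding
    the largest burst any sender can launch after idling."""

    def __init__(self, rate_bps, burst_bytes):
        self.rate = rate_bps / 8          # refill rate in bytes/second
        self.burst = burst_bytes          # cap on banked allowance
        self.tokens = burst_bytes
        self.last = time.monotonic()

    def try_send(self, nbytes):
        """A flow queries the limiter before sending nbytes."""
        now = time.monotonic()
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= nbytes:
            self.tokens -= nbytes
            return True
        return False
```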
Evaluation
Traffic-agnostic network allocation
Selfish traffic = full-burst UDP
Selfish traffic = Many TCP flows
Selfish traffic = Arbitrarily many destinations
Discussion
Seawall and cloud data centers
Sharing policies: work-conserving, max-min fair to achieve higher utilization; dynamic weight changes
System architecture: supports both rate- and window-based limiters, implemented in hardware or software
Partitioning sender/receiver functionality: a receiver-driven approach customized for map-reduce
Summary
Seawall is a first step toward giving data center administrators tools to divide their network across sharing entities without requiring any cooperation from those entities
It is well suited to emerging trends in data center and virtualization hardware