TRANSCRIPT
Sharing the Data Center Network
Alan Shieh, Srikanth Kandula, Albert Greenberg, Changhoon Kim, Bikas Saha
Microsoft Research, Cornell University, Windows Azure, Microsoft Bing
NSDI’11
NSLab, RIIT, Tsinghua Univ
Outline
Introduction
Seawall Design
Evaluation
Discussion
Summary
Introduction
Data centers provide compute and storage resources for web search, content distribution, and social networking. They achieve cost efficiencies and on-demand scaling, and are highly multiplexed shared environments: VMs and tasks from multiple tenants coexist in the same cluster. The risk of network performance interference and denial-of-service attacks is high.
Problems with network sharing in datacenters
Performance interference in infrastructure cloud services: network usage is a distributed resource, and a tenant can grab an outsized share by opening a large number of flows or by using higher-rate UDP flows
Poorly performing schedules in Cosmos (Bing): poor sharing of the network leads to poor performance and wasted resources
Poor sharing of the network leads to poor performance and wasted resources
* Optimal bandwidth sharing is a non-goal: it would require perfect knowledge of client demands
[Figure: Map-Reduce workloads (5 maps and 1 reduce)]
Magnitude of scale and churn: the number of classes to share bandwidth among is large and varies frequently, and cloud datacenter traffic is even harder to predict
Requirements
Traffic agnostic, simple service interface
Require no changes to network topology or hardware
Scale to large numbers of tenants and high churn
Enforce sharing without sacrificing efficiency
Example: three VMs share a link; VM 3 has weight 2, VMs 1 and 2 have weight 1. Even when VM 2 opens three flows, the shares converge to VM 3: ~50%, VM 2: ~25%, VM 1: ~25%.
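The example above is simple arithmetic: each VM's share of the link is its weight divided by the sum of all weights. The function and VM names below are illustrative, not Seawall's API.

```python
def weighted_shares(weights):
    """Return each entity's fraction of a link, proportional to its weight."""
    total = sum(weights.values())
    return {vm: w / total for vm, w in weights.items()}

# VM 3 carries weight 2, VMs 1 and 2 weight 1 each.
shares = weighted_shares({"VM1": 1, "VM2": 1, "VM3": 2})
# VM3 gets ~50%; VM1 and VM2 get ~25% each, regardless of flow count.
```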
Existing mechanisms are insufficient
In-network queuing and rate limiting: not scalable, and can underutilize links
Network-to-source congestion control (Ethernet QCN): the network detects congestion and throttles the send rate, but this requires new hardware and the policy is inflexible
End-to-end congestion control (TCP): the hypervisor or guest detects congestion and throttles the send rate, but this gives poor control over allocation, and guests can change their TCP stack
Seawall Design
Congestion controlled hypervisor-to-hypervisor tunnels
Bandwidth allocator: weighted additive-increase, multiplicative-decrease (AIMD), derived from TCP-Reno
Decrease: cut the rate multiplicatively on congestion feedback
Increase: grow the rate additively, in proportion to the entity's weight
Three improvements:
Combine feedback from multiple destinations
Modify the adaptation logic to converge quickly and stay at equilibrium longer
Nest traffic
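The weighted AIMD rule can be sketched as follows; the constants `ALPHA` and `DELTA` are illustrative placeholders, not the paper's actual parameters.

```python
ALPHA = 0.5   # multiplicative decrease factor (illustrative)
DELTA = 1.0   # base additive increase step, e.g. in Mbps (illustrative)

def aimd_step(rate, weight, congested):
    """One weighted-AIMD update, TCP-Reno style: cut the rate
    multiplicatively on congestion; otherwise increase it additively,
    scaled by the entity's weight so heavier entities converge to
    proportionally larger shares."""
    if congested:
        return rate * (1 - ALPHA)
    return rate + weight * DELTA
```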
Step 1: use distributed control loops to determine per-link, per-entity shares
Unlike XCP, QCN, and SideCar, Seawall does not rely on explicit congestion feedback from the network
Step 2: convert per-link, per-entity shares to per-link, per-tunnel shares
Using β = 0.9, allocate a β fraction of the entity's share in proportion to current usage and the rest evenly across destinations
Example: an entity with demands (2x, x, x) to three destinations; the allowed share of the first destination converges to within 20% of its demand in four iterations
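The Step 2 split can be sketched as a single update: β of the entity's share follows current per-destination usage, and the remaining 1−β is spread evenly so idle destinations can ramp up. The function name and flat-list representation are illustrative.

```python
BETA = 0.9  # fraction allocated in proportion to current usage (from the slide)

def tunnel_shares(entity_share, usage):
    """Split one entity's per-link share across its tunnels:
    BETA proportionally to current per-destination usage,
    the remaining (1 - BETA) evenly across all destinations."""
    total = sum(usage)
    n = len(usage)
    return [entity_share * (BETA * u / total + (1 - BETA) / n) for u in usage]

# Entity with demands (2x, x, x): the busier first tunnel gets the
# larger slice, while the even component keeps the others from starving.
shares = tunnel_shares(100.0, [2, 1, 1])
```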
Improving the rate adaptation logic: use control laws from CUBIC to achieve faster convergence, longer dwell time at the equilibrium point, and higher utilization than AIMD
If switches support ECN, Seawall also incorporates the control laws from DCTCP
Smoothed multiplicative decrease
Concave or convex increase
Below the goal rate, the increase is concave; above the goal, it is convex
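A minimal sketch of the concave/convex behavior, assuming the standard CUBIC window curve applied to rates; `c` and the variable names are illustrative, not Seawall's actual constants.

```python
def cubic_rate(t, goal, reduced, c=0.4):
    """CUBIC-style rate as a function of time t since the last decrease.
    `goal` is the rate at the last congestion event, `reduced` the rate
    just after the multiplicative decrease. The curve rises concavely
    toward `goal` (fast recovery, then gentle approach), then probes
    convexly beyond it."""
    # k: time at which the cubic curve crosses the goal rate.
    k = ((goal - reduced) / c) ** (1 / 3)
    return c * (t - k) ** 3 + goal
```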
Nesting traffic – deferring congestion control
If a sender always sends less than the rate allowed by Seawall, it could otherwise save up its allowance and launch a short, overwhelming burst of traffic
UDP and TCP flows behave differently: a full-burst UDP flow immediately uses all of its rate, while a set of TCP flows can take several RTTs to ramp up
Each TCP flow queries the rate limiter before sending
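One way to sketch this deferral is a token bucket whose depth caps how much unused allowance a sender can bank, so a long-idle sender cannot burst arbitrarily; full-burst UDP drains the bucket immediately, while TCP's ramp-up paces its queries. The class and parameters are illustrative, not Seawall's actual limiter.

```python
import time

class RateLimiter:
    """Token bucket enforcing the rate currently allowed to a tunnel.
    Unused allowance accumulates only up to `burst_bytes`, bounding
    the largest burst any sender can launch after idling."""

    def __init__(self, rate_bps, burst_bytes):
        self.rate = rate_bps / 8          # refill rate in bytes/second
        self.burst = burst_bytes          # cap on banked allowance
        self.tokens = burst_bytes
        self.last = time.monotonic()

    def try_send(self, nbytes):
        """A flow queries the limiter before sending nbytes."""
        now = time.monotonic()
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= nbytes:
            self.tokens -= nbytes
            return True
        return False
```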
Evaluation
Traffic-agnostic network allocation
Selfish traffic = full-burst UDP
Selfish traffic = Many TCP flows
Selfish traffic = Arbitrarily many destinations
Discussion
Seawall and cloud data centers
Sharing policies: work-conserving, max-min fair to achieve higher utilization; dynamic weight changes
System architecture: supports both rate- and window-based limiters, implemented in hardware or software
Partitioning sender/receiver functionality: a receiver-driven approach customized for map-reduce
Summary
Seawall is a first step toward giving data center administrators tools to divide their network across sharing entities without requiring any cooperation from those entities
It is well suited to emerging trends in data center and virtualization hardware