1
The Only Constant is Change: Incorporating Time-Varying Bandwidth
Reservations in Data Centers
Di Xie, Ning Ding, Y. Charlie Hu, Ramana Kompella
2
Cloud Computing is Hot
Private Cluster
3
Key Factors for Cloud Viability
• Cost
• Performance
4
Performance Variability in Cloud
• BW variation in cloud due to contention [Schad’10 VLDB]
• Causing unpredictable performance
[Figure: measured bandwidth (Mbps, 0-1000) on a local cluster vs. Amazon EC2; EC2 bandwidth varies widely while the local cluster is stable]
Network performance variability
[Figure: a tenant submits a MapReduce job and receives results, on an isolated enterprise cluster vs. in a multi-tenant datacenter]
• Data analytics on an isolated cluster: completion time 4 hours
• Data analytics in a multi-tenant datacenter: completion time 10-16 hours
• Variable tenant costs: expected cost (based on 4-hour completion time) = $100; actual cost = $250-400
Unpredictability of application performance and tenant costs is a key hindrance to cloud adoption
Key Contributor: Network performance variation
9
Reserving BW in Data Centers
• SecondNet [Guo’10]
  – Per VM-pair, per VM access bandwidth reservation
• Oktopus [Ballani’11]
  – Virtual Cluster (VC)
  – Virtual Oversubscribed Cluster (VOC)
10
How BW Reservation Works
[Figure: Virtual Cluster model, N VMs connected to a virtual switch, each reserving bandwidth B for the whole duration 0 to T]
1. Determine the model
2. Allocate and enforce the model
Only fixed-BW reservation: Request <N, B>
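The fixed-BW admission logic the Virtual Cluster model implies can be sketched in a few lines (a minimal illustration under our own naming, `VCRequest` and `fits_on_link`; not the paper's code):

```python
from dataclasses import dataclass

# Hypothetical sketch: a Virtual Cluster request reserves a single
# bandwidth B for all N VMs for the whole job duration 0 to T.
@dataclass
class VCRequest:
    n_vms: int
    bw_mbps: float  # fixed reservation B, held even when the job is idle

def fits_on_link(requests, link_capacity_mbps):
    """A link can host the requests only if their fixed reservations
    sum to no more than its capacity."""
    return sum(r.bw_mbps for r in requests) <= link_capacity_mbps

jobs = [VCRequest(4, 500.0), VCRequest(4, 500.0)]
print(fits_on_link(jobs, 1000.0))                          # two 500Mbps jobs fill a 1Gbps link
print(fits_on_link(jobs + [VCRequest(4, 500.0)], 1000.0))  # a third must be rejected
```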
11
Network Usage for MapReduce Jobs
• Hadoop Sort, 4GB per VM
• Hadoop Word Count, 2GB per VM
• Hive Join, 6GB per VM
• Hive Aggregation, 2GB per VM
Time-varying network usage
16
Motivating Example
• 4 machines, 2 VMs/machine, non-oversubscribed network
• Hadoop Sort
  – N: 4 VMs
  – B: 500Mbps/VM
[Figure: 1Gbps links with 500Mbps reservations; not enough BW to place the job]
17
Motivating Example
• 4 machines, 2 VMs/machine, non-oversubscribed network
• Hadoop Sort
  – N: 4 VMs
  – B: 500Mbps/VM
[Figure: 1Gbps links with 500Mbps reservations]
18
Under Fixed-BW Reservation Model
[Figure: 1Gbps links, 500Mbps per-VM reservations; under the Virtual Cluster model Job1, Job2, and Job3 each hold 500Mbps for their entire duration on the timeline 0-30]
19
Under Time-Varying Reservation Model
[Figure: 1Gbps links, 500Mbps per-VM reservations; under the TIVC model Jobs J1-J5 reserve bandwidth only during their network-intensive phases and interleave on the timeline 0-30]
Doubles VM utilization, network utilization, and job throughput (Hadoop Sort)
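The interleaving argument can be checked with simple arithmetic (a hypothetical sketch; the phase times below are illustrative, not measured):

```python
# With fixed reservations a 1Gbps link holds at most two 500Mbps jobs at
# once, but if each job only needs 500Mbps during its network phase, more
# jobs can interleave on the same link.

def peak_demand(phases, horizon):
    """phases: list of (start, end, mbps) network-intensive intervals.
    Returns the maximum total demand in any unit time slot."""
    return max(
        sum(mbps for (s, e, mbps) in phases if s <= t < e)
        for t in range(horizon)
    )

# Five jobs whose 500Mbps network phases are staggered (illustrative times):
tivc = [(0, 5, 500), (5, 10, 500), (10, 15, 500), (12, 20, 500), (20, 25, 500)]
print(peak_demand(tivc, 30) <= 1000)   # all five fit on one 1Gbps link

# The same five jobs under fixed reservations each hold 500Mbps for 0-30:
fixed = [(0, 30, 500)] * 5
print(peak_demand(fixed, 30) <= 1000)  # 2500Mbps peak: they do not fit
```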
20
Temporally-Interleaved Virtual Cluster (TIVC)
• Key idea: Time-Varying BW Reservations
• Compared to fixed-BW reservation
  – Improves utilization of the data center
    • Better network utilization
    • Better VM utilization
  – Increases cloud provider’s revenue
  – Reduces cloud user’s cost
  – Without sacrificing job performance
21
Challenges in Realizing TIVC
[Figure: Virtual Cluster model, Request <N, B>, reserves a fixed bandwidth B from 0 to T; the TIVC request <N, B(t)> lets the reserved bandwidth vary over time]
Q1: What are the right model functions?
Q2: How to automatically derive the models?
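One way to represent a time-varying request <N, B(t)> is a base bandwidth plus rectangular pulses. The paper's models are pulse-based, but `make_bt` and its parameters here are our own illustrative assumptions:

```python
# Hedged sketch: B(t) as a base bandwidth plus rectangular pulses of
# higher bandwidth during network-intensive phases.

def make_bt(base_mbps, pulses):
    """pulses: list of (start, end, mbps) with mbps >= base_mbps.
    Returns B(t) as a Python function."""
    def bt(t):
        for (s, e, mbps) in pulses:
            if s <= t < e:
                return mbps
        return base_mbps
    return bt

# <N, B(t)>: 100Mbps base with a 500Mbps pulse during a shuffle-like phase
bt = make_bt(100, [(10, 20, 500)])
print(bt(5), bt(15), bt(25))  # 100 500 100
```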
22
Challenges in Realizing TIVC
Q3: How to efficiently allocate TIVC?
Q4: How to enforce TIVC?
23
Challenges in Realizing TIVC
• What are the right model functions?
• How to automatically derive the models?
• How to efficiently allocate TIVC?
• How to enforce TIVC?
24
How to Model Time-Varying BW?
Hadoop Hive Join
25
TIVC Models
Virtual Cluster
T11 T32
26
Hadoop Sort
27
Hadoop Word Count
28
Hadoop Hive Join
29
Hadoop Hive Aggregation
30
Challenges in Realizing TIVC
What are the right model functions?
• How to automatically derive the models?
• How to efficiently allocate TIVC?
• How to enforce TIVC?
31
Possible Approach
• “White-box” approach
  – Given the source code and data of a cloud application, analyze its quantitative networking requirements
  – Very difficult in practice
• Observation: many jobs are repeated many times
  – E.g., 40% of jobs are recurring in Bing’s production data center [Agarwal’12]
  – Of course, the data itself may change across runs, but its size remains about the same
32
Our Approach
• Solution: “black-box” profiling-based approach
  1. Collect a traffic trace from a profiling run
  2. Derive the TIVC model from the traffic trace
• Profiling: same configuration as production runs
  – Same number of VMs
  – Same input data size per VM
  – Same job/VM configuration
How much BW should we reserve for the application?
33
Impact of BW Capping
No-elongation BW threshold
34
Choosing BW Cap
• Tradeoff between performance and cost
  – Cap > threshold: same performance, costs more
  – Cap < threshold: lower performance, may cost less
• Our approach: expose the tradeoff to the user
  1. Profile under different BW caps (only caps below the threshold)
  2. Expose run times and costs to the user
  3. User picks the appropriate BW cap
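The exposed tradeoff might look like this (an illustrative sketch; the caps, run times, and per-VM pricing are made up):

```python
# Hypothetical sketch: given profiling runs under different bandwidth
# caps, show the user the resulting run-time/cost tradeoff.

def tradeoff_table(profiles, price_per_vm_hour, n_vms):
    """profiles: list of (cap_mbps, runtime_hours) from profiling runs.
    Returns (cap, runtime, total_cost) rows sorted by cap."""
    return [
        (cap, hours, round(n_vms * hours * price_per_vm_hour, 2))
        for (cap, hours) in sorted(profiles)
    ]

runs = [(200, 8.0), (300, 5.0), (500, 4.0)]  # hypothetical profiled caps
for cap, hours, cost in tradeoff_table(runs, price_per_vm_hour=0.25, n_vms=25):
    print(f"cap {cap}Mbps: {hours}h, ${cost}")
```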
35
From Profiling to Model Generation
• Collect a traffic trace from each VM
  – Instantaneous throughput in 10ms bins
• Generate models for individual VMs
• Combine them to obtain the overall job’s TIVC model
  – Simplifies allocation by working with one model
  – Does not lose efficiency, since per-VM models are roughly similar for MapReduce-like applications
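The combining step could be sketched as follows (our simplification, not necessarily the paper's exact rule; taking the per-bin maximum across VMs is a conservative merge that works because per-VM traces are roughly similar):

```python
# Hypothetical sketch: merge per-VM binned throughput into one job-wide
# series by taking the maximum across VMs in each bin.

def combine_vm_traces(traces):
    """traces: one list of per-bin Mbps values per VM (equal lengths)."""
    return [max(bins) for bins in zip(*traces)]

vm1 = [50, 400, 420, 60]   # Mbps per 10ms bin (illustrative)
vm2 = [55, 390, 430, 50]
print(combine_vm_traces([vm1, vm2]))  # [55, 400, 430, 60]
```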
36
Generate Model for Individual VM
1. Choose Bb
2. In periods where B > Bb, set the bandwidth to Bcap
[Figure: bandwidth over time with base level Bb; intervals where throughput exceeds Bb become pulses at Bcap]
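The two steps above can be sketched as follows (parameter names `Bb`/`Bcap` follow the slide; the trace values and function name are illustrative):

```python
# Hedged sketch: pick a base bandwidth Bb, then turn every maximal run of
# bins whose throughput exceeds Bb into a rectangular pulse at Bcap.

def trace_to_pulses(trace, bb, bcap, bin_ms=10):
    """trace: per-bin Mbps values. Returns (bb, [(start_ms, end_ms, bcap)])."""
    pulses, start = [], None
    for i, mbps in enumerate(trace):
        if mbps > bb and start is None:
            start = i                                     # pulse begins
        elif mbps <= bb and start is not None:
            pulses.append((start * bin_ms, i * bin_ms, bcap))
            start = None                                  # pulse ends
    if start is not None:                                 # trace ends mid-pulse
        pulses.append((start * bin_ms, len(trace) * bin_ms, bcap))
    return bb, pulses

trace = [50, 60, 480, 470, 455, 70, 50]          # Mbps per 10ms bin
print(trace_to_pulses(trace, bb=100, bcap=500))  # (100, [(20, 50, 500)])
```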
37
Challenges in Realizing TIVC
What are the right model functions?
How to automatically derive the models?
• How to efficiently allocate TIVC?
• How to enforce TIVC?
38
TIVC Allocation Algorithm
• Spatio-temporal allocation algorithm
  – Extends the VC allocation algorithm to the time dimension
  – Employs dynamic programming
  – Chooses the lowest-level subtree
• Properties
  – Locality aware
  – Efficient and scalable
    • 99th percentile of 28ms on a 64,000-VM data center when scheduling 5,000 jobs
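A much-simplified sketch of the spatio-temporal idea (not the paper's dynamic program): the fixed-BW link feasibility check is simply repeated per time slot:

```python
# Hypothetical sketch: a TIVC request fits on a link only if, in every
# time slot, its demand plus already-reserved demand stays within the
# link capacity.

def fits_over_time(existing, request, capacity):
    """existing/request: lists of per-slot Mbps demands (same length)."""
    return all(e + r <= capacity for e, r in zip(existing, request))

reserved = [500, 500, 0, 0]    # Mbps already reserved per slot
new_job  = [0, 500, 500, 500]  # time-varying demand of the new request
print(fits_over_time(reserved, new_job, 1000))  # True: peaks never collide
```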
39
Challenges in Realizing TIVC
What are the right model functions?
How to automatically derive the models?
How to efficiently allocate TIVC?
• How to enforce TIVC?
40
Enforcing TIVC Reservation
• Possible to enforce completely in the hypervisor, but:
  – The hypervisor has no control over upper-level links
  – Requires online rate monitoring and feedback
  – Increases hypervisor overhead and complexity
• Instead, enforce BW reservations in switches
  – Most small jobs fit within a rack
  – Only a few large jobs cross the core
  – Avoids complexity in hypervisors
41
Challenges in Realizing TIVC
What are the right model functions?
How to automatically derive the models?
How to efficiently allocate TIVC?
How to enforce TIVC?
42
Proteus: Implementing TIVC Models
1. Determine the model
2. Allocate and enforce the model
43
Evaluation
• Large-scale simulation
  – Performance
  – Cost
  – Allocation algorithm
• Prototype implementation
  – Small-scale testbed
44
Simulation Setup
• 3-level tree topology
  – 16,000 hosts x 4 VMs
  – 4:1 oversubscription
  – 1Gbps host links, 10Gbps ToR uplinks, 50Gbps aggregation uplinks
  – 20 aggregation switches, 20 ToR switches each, 40 hosts per ToR
• Workload
  – N: exponential distribution around mean 49
  – B(t): derived from real Hadoop apps
45
Batched Jobs
• Scenario: 5,000 time-insensitive jobs, 1/3 of each type
[Figure: completion-time reductions of 42%, 21%, 23%, and 35% across the workloads]
All remaining results are for the mixed workload
46
Varying Oversubscription and Job Size
25.8% reduction for a non-oversubscribed network
47
Dynamically Arriving Jobs
• Scenario: accommodate users’ requests in a shared data center
  – 5,000 jobs arrive dynamically with varying loads
• Rejected jobs: VC 9.5%, TIVC 3.4%
48
Analysis: Higher Concurrency
• Under 80% load
  – 7% higher job concurrency
  – 28% higher VM utilization (rejected jobs tend to be large)
  – 28% higher revenue (charging per VM)
49
Testbed Experiment
• Setup: 18 machines
• 30 real MapReduce jobs
  – 10 Sort
  – 10 Hive Join
  – 10 Hive Aggregation
50
Testbed Result
• TIVC finishes jobs faster than VC; Baseline finishes the fastest
• Baseline suffers from high variability in completion time, while TIVC achieves performance similar to VC
51
Conclusion
• Network reservations in the cloud are important
  – Previous work proposed fixed-BW reservations
  – However, cloud apps exhibit time-varying BW usage
• We propose the TIVC abstraction
  – Provides time-varying network reservations
  – Uses simple pulse functions
  – Automatically generates models
  – Efficiently allocates and enforces reservations
• Proteus shows that TIVC significantly benefits both cloud providers and users