Live Migration of an Entire Network (and its Hosts)
Eric Keller, Soudeh Ghorbani, Matthew Caesar, Jennifer Rexford
HotNets 2012
2
Widely supported to help:
• Consolidate to save energy
• Re-locate to improve performance
Virtual Machine Migration
[Figure: two hypervisors, each hosting VMs (Apps on an OS)]
3
Many VMs working together
But Applications Look Like This
4
Networks have increasing amounts of state
And Rely on the Network
Kinds of state: configuration, learned, software-defined
5
Joint (virtual) host and (virtual) network migration
Ensemble Migration
No re-learning, no re-configuring, no re-calculating
Capitalize on redundancy
6
Some Use Cases
7
• Customer driven – for cost, performance, etc.
• Provider driven – offload when too full
1. Moving between cloud providers
8
• Reduce energy consumption (turn off servers, reduce cooling)
2. Moving to smaller set of servers
9
• Migrate ensemble to infrastructure dedicated to testing (special equipment)
3. Troubleshooting
10
Automated migration according to some objective, and easy manual migration
Goal: General Management Tool
[Figure labels: Monitoring, Objective, Automation, manual, Migration, Ensemble Migration]
11
LIve Migration of Ensembles
Migration Primitives
Migration Orchestration
Tenant Control
LIME
Network Virtualization
API to operator/automation
Software-defined network
Virtualized servers
Tenant Control
virtual topology
Migration is transparent
12
Why Transparent?
13
Separate Out Functionality
Tenant Control
Network Virtualization
Tenant Control
virtual topology
14
Separate Out Functionality
Migration Primitives
Migration Orchestration
Tenant Control
Network Virtualization
Tenant Control
virtual topology
15
Multi-tenancy
Migration Primitives
Migration Orchestration
Tenant Control
Network Virtualization
Tenant Control
virtual topology
Infrastructure Operator
Tenants
16
Can we base it on VM migration?
• Iteratively copy state
• Freeze VM
• Copy last delta of state
• Un-freeze VM on new server
How to Live Migrate an Ensemble
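The four VM-migration steps listed on this slide can be sketched as a pre-copy loop. This is a minimal illustration, not the mechanism of any particular hypervisor: the page dictionary, `dirty_rate`, `threshold`, and the random workload are all assumptions made for the example.

```python
import random

def live_migrate(source_pages, dirty_rate=0.1, max_rounds=10, threshold=8, seed=0):
    """Pre-copy migration of a dict of memory pages.

    Returns (target_pages, copy_rounds, pages_copied_while_frozen).
    All parameters are hypothetical knobs for this sketch.
    """
    rng = random.Random(seed)
    target = {}
    dirty = set(source_pages)  # at the start, every page still has to be copied
    rounds = 0
    # Step 1: iteratively copy state while the VM keeps running.
    while len(dirty) > threshold and rounds < max_rounds:
        for page in dirty:
            target[page] = source_pages[page]
        # Assumed workload: the running VM rewrites a random subset of pages.
        dirty = {p for p in source_pages if rng.random() < dirty_rate}
        for p in dirty:
            source_pages[p] += 1
        rounds += 1
    # Step 2: freeze the VM (no writes happen past this point).
    frozen_pages = len(dirty)
    # Step 3: copy the last delta of state.
    for page in dirty:
        target[page] = source_pages[page]
    # Step 4: un-freeze on the new server, represented here by returning its state.
    return target, rounds, frozen_pages

pages = {i: 0 for i in range(100)}
target, rounds, frozen = live_migrate(pages)
print(target == pages, rounds, frozen)
```

The number of pages copied while frozen stands in for downtime: the smaller the last delta, the shorter the freeze.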
17
Applying to Ensemble
Iterative copy
18
Applying to Ensemble
Freeze and copy
19
Applying to Ensemble
Resume
20
Applying to Ensemble
Resume
Complex to implement
Downtime potentially large
21
Applying to Whole Network
Iterative copy
22
Applying to Whole Network
Freeze and copy
23
Applying to Whole Network
Resume
24
Applying to Whole Network
Resume
Lots of packet loss
Lots of “backhaul” traffic
25
Applying to Each Switch
Iterative copy
26
Applying to Each Switch
Freeze and copy
27
Applying to Each Switch
Resume
28
Applying to Each Switch
Resume
Bursts of packet loss
Even more “backhaul” traffic
Long total time
29
• Clone the network
• Migrate the VMs individually (or in groups)
A Better Approach
30
Clone the Network
Copy state
31
Clone the Network
Cloned Operation
32
Clone the Network
Migrate VMs
33
Clone the Network
Migrate VMs
34
• Minimizes backhaul traffic
• No packet loss associated with the network (network is always operational)
Clone the Network
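The clone-then-migrate sequence can be sketched as follows. This is a toy model under stated assumptions, not LIME's orchestration code: rule tables are plain lists, a VM "migration" is a dictionary update, and the names `clone_network`, `migrate_ensemble`, and the `"old"`/`"new"` site labels are hypothetical.

```python
import copy

def clone_network(switch_rules, suffix="_1"):
    """Copy every switch's rule table to a clone; the originals keep running."""
    return {name + suffix: copy.deepcopy(rules) for name, rules in switch_rules.items()}

def migrate_ensemble(switch_rules, vm_site, groups):
    """Clone the network first, then move VMs group by group.

    Returns the clone tables and, after each group, the sites that still host a VM.
    """
    clones = clone_network(switch_rules)
    active_sites = []
    for group in groups:
        for vm in group:
            vm_site[vm] = "new"  # stands in for an ordinary live VM migration
        active_sites.append(sorted(set(vm_site.values())))
    return clones, active_sites

rules = {"Switch_A": ["R1", "R2"]}
sites = {"vm1": "old", "vm2": "old", "vm3": "old"}
clones, steps = migrate_ensemble(rules, sites, [["vm1"], ["vm2", "vm3"]])
print(clones)  # {'Switch_A_1': ['R1', 'R2']}
print(steps)   # [['new', 'old'], ['new']]
```

While VMs sit at both sites, both copies of the network carry traffic, which is why the two copies of a switch must present a consistent view.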
35
• Same guarantees as migration-free
• Preserve application semantics
Consistent View of a Switch
Migration Primitives
Migration Orchestration
Network Virtualization
Switch_A_0 Switch_A_1
Switch_A
Application view
Physical reality
36
Sources of Inconsistency
Switch_A_0 Switch_A_1
Apps
OS
Packet 0 Packet 1
R1 R2
R1 R2
Migration-free: packet 0 and packet 1 traverse the same physical switch
VM (end host)
37
1. Local Changes on Switch
Switch_A_0 Switch_A_1
(e.g. delete rule after idle timeout)
Apps
OS
Packet 0 Packet 1
R1 R2
R1 R2
VM (end host)
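The idle-timeout case can be illustrated with a small simulation. It is a sketch under assumptions: the `CloneSwitch` class, the timeout value, and the timestamps are invented for the example, and idle-timeout handling is reduced to a last-hit timestamp per rule.

```python
class CloneSwitch:
    """Toy switch that drops a rule once it has been idle longer than idle_timeout."""

    def __init__(self, idle_timeout):
        self.idle_timeout = idle_timeout
        self.last_hit = {}  # rule name -> time of last match (or install)

    def install(self, rule, now):
        self.last_hit[rule] = now

    def match(self, rule, now):
        last = self.last_hit.get(rule)
        if last is None or now - last > self.idle_timeout:
            self.last_hit.pop(rule, None)  # local change: the rule expires
            return False
        self.last_hit[rule] = now  # a hit refreshes the idle timer
        return True

# Migration-free: both packets traverse the same physical switch.
single = CloneSwitch(idle_timeout=10)
single.install("R1", now=0)
print(single.match("R1", now=8), single.match("R1", now=12))  # True True

# During migration: packet 0 hits Switch_A_0, packet 1 hits Switch_A_1.
a0, a1 = CloneSwitch(idle_timeout=10), CloneSwitch(idle_timeout=10)
a0.install("R1", now=0)
a1.install("R1", now=0)
print(a0.match("R1", now=8), a1.match("R1", now=12))  # True False
```

Packet 0 refreshes the timer only on the clone it traverses, so the other clone expires the rule even though a single switch would have kept it.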
38
2. Update from Controller
Switch_A_0 Switch_A_1
Apps
OS
Packet 0 Packet 1
R_new R1 R2
R1 R2
Install(R_new)
(e.g. rule installed at different times)
VM (end host)
39
3. Events to Controller
Switch_A_0 Switch_A_1
Apps
OS
Packet 0 Packet 1
R1 R2
R1 R2
Packet-in(pkt 0)
Packet-in(pkt 1) (received at controller first)
(e.g. forward and send to controller)
VM (end host)
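The reordering of events at the controller can be reproduced with a few lines. This is only an illustration of the problem: the function name, the send times, and the per-switch delays to the controller are assumed values.

```python
def controller_arrival_order(packets, delay_of_switch):
    """packets: list of (name, send_time, switch). Returns names in arrival order."""
    arrivals = [
        (send_time + delay_of_switch[switch], name)
        for name, send_time, switch in packets
    ]
    return [name for _, name in sorted(arrivals)]

# Migration-free: both packet-in events come from the same switch.
print(controller_arrival_order(
    [("pkt 0", 0, "Switch_A"), ("pkt 1", 1, "Switch_A")],
    {"Switch_A": 5},
))  # ['pkt 0', 'pkt 1']

# During migration: the clones have different delays to the controller.
print(controller_arrival_order(
    [("pkt 0", 0, "Switch_A_0"), ("pkt 1", 1, "Switch_A_1")],
    {"Switch_A_0": 5, "Switch_A_1": 1},
))  # ['pkt 1', 'pkt 0']
```

With one switch, the order seen by the controller matches the order sent by the end host; with two clones, the later packet can be reported first.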
40
Consistency in LIME
Migration Primitives
Migration Orchestration
Network Virtualization
Switch_A_0 Switch_A_1
Switch_A
* Restrict use of some features
* Use a commit protocol
* Emulate HW functions
* Combine information
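One way to picture "use a commit protocol" is a prepare/commit exchange, so that a rule becomes active on every clone or on none. The slides do not spell out the protocol LIME uses; the `Clone` class, its `reachable` flag, and `install_on_all` below are assumptions for this sketch.

```python
class Clone:
    """Toy clone switch with staged (prepared) and active rule lists."""

    def __init__(self, name, rules):
        self.name = name
        self.active = list(rules)
        self.staged = []
        self.reachable = True  # hypothetical failure switch for the example

    def prepare(self, rule):
        if not self.reachable:
            return False
        self.staged.append(rule)  # staged rules do not match packets yet
        return True

    def commit(self, rule):
        self.staged.remove(rule)
        self.active.insert(0, rule)

    def abort(self, rule):
        if rule in self.staged:
            self.staged.remove(rule)

def install_on_all(clones, rule):
    """Phase 1: stage the rule everywhere. Phase 2: activate it everywhere."""
    prepared = []
    for clone in clones:
        if clone.prepare(rule):
            prepared.append(clone)
        else:
            for p in prepared:  # any refusal rolls back the staged copies
                p.abort(rule)
            return False
    for clone in clones:
        clone.commit(rule)
    return True

a0 = Clone("Switch_A_0", ["R1", "R2"])
a1 = Clone("Switch_A_1", ["R1", "R2"])
print(install_on_all([a0, a1], "R_new"), a0.active, a1.active)
```

In this sketch the commit calls still run one after another, so a real protocol would also need to handle packets that arrive between the two commits.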
41
• LIME is a general and efficient migration layer
• Hope: future SDN is made migration-friendly
• Develop models and prove correctness
  – end-hosts and network
  – “Observational equivalence”
• Develop general migration framework
  – Control over grouping, order, and approach
Conclusions and Future work