SnowFlock: Rapid Virtual Machine Cloning for Cloud Computing *Horacio Andrés Lagar-Cavilla, *Joseph Andrew Whitney, *Adin Matthew Scannell, *Philip Patchin, *Stephen M. Rumble, *Eyal de Lara, *Michael Brudno, *Mahadev Satyanarayanan April 2009 EuroSys '09: Proceedings of the fourth ACM european conference on Computer systems Publisher: ACM May 21, 2009 * University of Toronto, Toronto, ON, Canada * Carnegie Mellon University, Pittsburgh, PA, USA



OUTLINE

• INTRODUCTION

• VM FORK

• DESIGN RATIONALE

• IMPLEMENTATION

• APPLICATION EVALUATION

• CONCLUSIONS


INTRODUCTION

• A major advantage of cloud computing is the ability to use a variable number of machines and VMs depending on demand

• However, the lack of agility in current cloud APIs keeps users from realizing the full potential of the cloud model

• This forces cloud users into employing ad hoc solutions


• VM fork is a cloud computing abstraction that instantaneously clones a VM into multiple replicas running on different hosts

• SnowFlock provides swift parallel stateful VM cloning, scales to hundreds of workers, consumes few cloud I/O resources, and has negligible runtime overhead


VM FORK

• The semantics of VM fork are similar to those of the familiar process fork:

• a parent VM issues a fork call which creates a number of clones, or child VMs

• each of the forked VMs proceeds with an identical view of the system, save for a unique identifier (vmid) that allows them to be distinguished


• However, each forked VM has its own independent copy of the OS and virtual disk, and state updates are not propagated between VMs

• A key feature of SnowFlock’s usage model is the ephemeral nature of children

• VM fork has to be used with care, as it replicates the entire parent VM: conflicts may arise
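
The fork semantics above can be illustrated with a toy model. This is only a sketch by analogy: `ToyVM` and `toy_vm_fork` are hypothetical names, a dictionary stands in for the VM's memory and disk, and the convention that children receive vmids 1..n is an assumption, not something the slides state.

```python
import copy
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyVM:
    """Stand-in for a VM: a vmid plus some mutable state (hypothetical)."""
    vmid: int
    memory: dict = field(default_factory=dict)


def toy_vm_fork(parent: ToyVM, n: int) -> List[ToyVM]:
    """Return n children, each holding a private copy of the parent's state."""
    # Assumed convention for this sketch: the parent is vmid 0, children are 1..n.
    return [ToyVM(vmid=i, memory=copy.deepcopy(parent.memory))
            for i in range(1, n + 1)]


parent = ToyVM(vmid=0, memory={"counter": 10, "log": []})
children = toy_vm_fork(parent, 3)

# Every child starts with an identical view of the system, apart from its vmid.
assert all(c.memory == parent.memory for c in children)

# State updates are not propagated: this change stays inside the first child.
children[0].memory["counter"] += 1
children[0].memory["log"].append("written by the first child")
```

The deep copy is what models "its own independent copy": without it the children would share the parent's `log` list and an update in one would leak into the others.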


DESIGN RATIONALE

• VM fork is a heavyweight operation as VM instances can easily occupy GBs of RAM

• While one could implement VM fork using existing VM suspend/resume functionality, the wholesale copying of the VM image is far too taxing

• A second approximation to solving the problem of VM fork latency uses SnowFlock’s multicast library


[Figure: Latency for forking a 1GB VM by suspending and distributing the image over NFS and multicast. x-axis: Number of Clones (2 to 32); y-axis: Seconds (0 to 400); series: NFS, Multicast.]
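
A back-of-the-envelope model helps explain why copying a whole image to every clone scales poorly. The sketch below is an idealized lower bound, not a reproduction of the measured latencies: it assumes the sender's single link is the only bottleneck, a gigabit link, and no suspend, resume, or protocol overhead. The function names are hypothetical.

```python
GIB = 1024 ** 3          # bytes in a 1 GB (GiB) image
GIGABIT = 10 ** 9        # assumed link speed in bits per second


def unicast_seconds(image_bytes: int, link_bits_per_s: int, clones: int) -> float:
    """One full copy per clone, all pushed through the sender's single link."""
    return clones * image_bytes * 8 / link_bits_per_s


def multicast_seconds(image_bytes: int, link_bits_per_s: int, clones: int) -> float:
    """Ideal multicast: the image crosses the sender's link once, for any clone count."""
    return image_bytes * 8 / link_bits_per_s


for n in (1, 8, 32):
    print(n,
          round(unicast_seconds(GIB, GIGABIT, n), 1),
          round(multicast_seconds(GIB, GIGABIT, n), 1))
```

Under these assumptions the point-to-point cost grows linearly with the number of clones while the ideal multicast cost stays flat. Real measurements include overheads that this model ignores.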


• SnowFlock’s fast fork implementation is based on the following four insights:

• it is possible to start executing a child VM on a remote site by initially replicating only minimal state

• children will typically access only a fraction of the original memory image of the parent


• it is common for children to allocate memory after forking

• children often execute similar code and access common data structures

• The first two led to the design of VM Descriptors and Memory-On-Demand

• The third and fourth led to the avoidance heuristics and multicast distribution, respectively


IMPLEMENTATION

• SnowFlock is implemented as a combination of modifications to the Xen VMM and daemons that run in domain0

• The SnowFlock daemons form a distributed system that controls the life-cycle of VMs by orchestrating their cloning and deallocation


• SnowFlock uses four mechanisms to fork a VM:

• First, the parent VM is temporarily suspended to produce a VM descriptor, which is then distributed to other physical hosts to spawn new VMs

• Second, the memory-on-demand mechanism, memtap, lazily fetches additional VM memory state as execution proceeds


• Third, the avoidance heuristics leverage the cooperation of the guest kernel to reduce the amount of memory that needs to be fetched on demand

• Finally, the multicast distribution system, mcdist, is used to deliver VM state simultaneously and efficiently, and also provides implicit prefetching


VM Descriptors

• A condensed VM image that allows swift VM replication to a separate host

• Construction of a VM descriptor starts by spawning a thread in the VM kernel that:

• quiesces its I/O devices

• deactivates all but one of the VCPUs

• issues a hypercall suspending the VM’s execution

• When the hypercall succeeds, a privileged process in domain0 maps the suspended VM memory to populate the descriptor

• The descriptor contains:

• metadata describing the VM and its virtual devices (VDs)

• a few memory pages shared between the VM and the Xen hypervisor

• the registers of the main VCPU

• the Global Descriptor Tables (GDT) used by the x86 segmentation hardware

• the page tables of the VM

• In addition to those used by the guest kernel, each process in the VM needs a small number of additional page tables
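
The descriptor contents listed above can be pictured as a small record. This is a sketch only: `ToyVMDescriptor`, its field names and types, and the page counts in the example are illustrative assumptions, not the layout SnowFlock actually uses. The 4096-byte page size is the standard x86 small page.

```python
from dataclasses import dataclass
from typing import Dict, List

PAGE_SIZE = 4096  # standard x86 small page, in bytes


@dataclass
class ToyVMDescriptor:
    """Hypothetical record mirroring the five items a descriptor contains."""
    metadata: Dict[str, str]        # describes the VM and its VDs
    shared_pages: List[bytes]       # pages shared between the VM and Xen
    vcpu_registers: Dict[str, int]  # registers of the main VCPU
    gdt: List[int]                  # Global Descriptor Table entries
    page_tables: List[bytes]        # page tables of the VM

    def size_bytes(self) -> int:
        """Approximate size: count only the page-sized items."""
        return (len(self.shared_pages) + len(self.page_tables)) * PAGE_SIZE


# Made-up counts, chosen only to show the descriptor is tiny next to 1 GB of RAM.
desc = ToyVMDescriptor(
    metadata={"name": "parent-vm"},
    shared_pages=[bytes(PAGE_SIZE)] * 2,
    vcpu_registers={"rip": 0, "rsp": 0},
    gdt=[0] * 16,
    page_tables=[bytes(PAGE_SIZE)] * 256,
)
print(desc.size_bytes(), "bytes vs", 1024 ** 3, "bytes of guest RAM")
```

The point of the sketch is the ratio: everything not in the descriptor is left behind on the parent's host and fetched later only if a clone touches it.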

[Figure: Time spent replicating a single-processor VM with 1 GB of RAM to n clones on n physical hosts]

Memory-On-Demand

• Immediately after being instantiated from a descriptor, the VM will find it is missing state needed to proceed

• Memtap is a combination of hypervisor logic and a userspace domain0 process associated with the clone VM

• Memtap implements a copy-on-access policy for the clone VM’s memory
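
A copy-on-access policy can be sketched in a few lines. The classes below are a hypothetical user-level model: the real memtap works with hypervisor support on guest page faults, whereas here a `present` list and explicit `read`/`write` calls stand in for that machinery.

```python
from typing import List, Optional


class ToyParentMemory:
    """Immutable snapshot of the parent's pages at fork time (hypothetical)."""

    def __init__(self, pages: List[bytes]) -> None:
        self._pages = list(pages)

    def fetch(self, pfn: int) -> bytes:
        return self._pages[pfn]


class ToyCloneMemory:
    """Clone memory that copies a page from the parent on first access."""

    def __init__(self, parent: ToyParentMemory, num_pages: int) -> None:
        self.parent = parent
        self.present = [False] * num_pages
        self.pages: List[Optional[bytes]] = [None] * num_pages
        self.fetches = 0

    def _ensure_present(self, pfn: int) -> None:
        # First touch of a missing page triggers exactly one fetch.
        if not self.present[pfn]:
            self.pages[pfn] = self.parent.fetch(pfn)
            self.present[pfn] = True
            self.fetches += 1

    def read(self, pfn: int) -> bytes:
        self._ensure_present(pfn)
        return self.pages[pfn]

    def write(self, pfn: int, data: bytes) -> None:
        # Copy-on-access: any access, including a write, fetches the page first.
        self._ensure_present(pfn)
        self.pages[pfn] = data


parent_mem = ToyParentMemory([bytes([i]) * 4 for i in range(8)])
clone_mem = ToyCloneMemory(parent_mem, num_pages=8)
clone_mem.read(3)
clone_mem.read(3)                    # already present, no second fetch
clone_mem.write(5, b"\xff" * 4)      # fetched, then overwritten locally
print(clone_mem.fetches, "of 8 pages fetched")
```

Only the pages a clone touches are ever transferred, which is the property the second insight (children access a fraction of the parent's memory) relies on.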

[Figure: Page Fetching Overhead]


Avoidance Heuristics

• The heuristics allow SnowFlock to bypass large numbers of unnecessary memory fetches while retaining correctness

• The first optimizes the general case in which a clone VM allocates new state

• The second addresses the case where a virtual I/O device writes to the guest memory
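
The effect of the two heuristics can be sketched by counting fetches with and without them. This is a toy model under stated assumptions: `ToyClone`, `allocate`, and `io_write` are hypothetical names, an allocation is assumed to mean the old page contents no longer matter, and an I/O write is assumed to overwrite the whole page.

```python
from typing import Dict, List

PAGE_SIZE = 4096  # standard x86 small page, in bytes


class ToyClone:
    """Clone memory with optional fetch-avoidance heuristics (hypothetical)."""

    def __init__(self, parent_pages: List[bytes], use_heuristics: bool) -> None:
        self.parent_pages = parent_pages
        self.use_heuristics = use_heuristics
        self.pages: Dict[int, bytes] = {}
        self.fetches = 0

    def _fetch_if_missing(self, pfn: int) -> None:
        if pfn not in self.pages:
            self.pages[pfn] = self.parent_pages[pfn]
            self.fetches += 1

    def read(self, pfn: int) -> bytes:
        self._fetch_if_missing(pfn)
        return self.pages[pfn]

    def allocate(self, pfn: int) -> None:
        """The guest kernel hands out a page whose old contents are irrelevant."""
        if self.use_heuristics:
            self.pages[pfn] = bytes(PAGE_SIZE)  # mark present, skip the fetch
        else:
            self._fetch_if_missing(pfn)

    def io_write(self, pfn: int, data: bytes) -> None:
        """A virtual I/O device overwrites an entire page."""
        if not self.use_heuristics:
            self._fetch_if_missing(pfn)         # fetched only to be overwritten
        self.pages[pfn] = data


def run_workload(clone: ToyClone) -> int:
    clone.read(0)                               # genuinely needs parent state
    for pfn in range(1, 6):
        clone.allocate(pfn)                     # new state allocated after fork
    clone.io_write(6, b"\x01" * PAGE_SIZE)      # page filled by device I/O
    return clone.fetches


parent_pages = [bytes([i]) * PAGE_SIZE for i in range(8)]
without = run_workload(ToyClone(parent_pages, use_heuristics=False))
with_h = run_workload(ToyClone(parent_pages, use_heuristics=True))
print(without, "fetches without heuristics,", with_h, "with")
```

Correctness is retained in this model because every skipped page is fully overwritten before anyone reads it. A page that is only partially written would still have to be fetched.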

[Figure: Effectiveness of Heuristics and Multicast. SHRiMP, 1GB footprint. Bars show aggregate page requests from 32 clones vs. pages sent. Labels show average benchmark completion times.]

Multicast Distribution

• mcdist is the multicast distribution system that effectively provides data to all cloned VMs simultaneously

• It accomplishes two goals that are not served by point-to-point communication:

• data needed by clones is often prefetched

• the load on the network is greatly reduced
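
Both goals can be seen in a toy simulation that counts page transmissions. The sketch assumes an idealized network in which one multicast transmission reaches every client. `ToyClient` and `serve` are hypothetical names and do not reflect the mcdist protocol itself.

```python
from typing import Dict, List, Tuple


class ToyClient:
    """Stand-in for one clone's memtap process (hypothetical)."""

    def __init__(self) -> None:
        self.pages: Dict[int, bytes] = {}

    def deliver(self, pfn: int, data: bytes) -> None:
        self.pages.setdefault(pfn, data)


def serve(requests: List[Tuple[int, int]], parent_pages: List[bytes],
          clients: List[ToyClient], multicast: bool) -> int:
    """Process (client_index, pfn) requests; return pages sent by the server."""
    sent = 0
    for idx, pfn in requests:
        if pfn in clients[idx].pages:
            continue  # page already arrived (prefetched), so no request is issued
        sent += 1
        if multicast:
            for client in clients:          # one transmission reaches everyone
                client.deliver(pfn, parent_pages[pfn])
        else:
            clients[idx].deliver(pfn, parent_pages[pfn])
    return sent


parent_pages = [bytes([i]) * 4 for i in range(3)]
# Four clones that each need the same three pages, one after another.
requests = [(c, p) for c in range(4) for p in range(3)]

unicast_sent = serve(requests, parent_pages, [ToyClient() for _ in range(4)], False)
multicast_sent = serve(requests, parent_pages, [ToyClient() for _ in range(4)], True)
print(unicast_sent, "pages sent point-to-point,", multicast_sent, "with multicast")
```

In the multicast run the first clone's requests populate every other clone, so later clones find their pages already present. That is the implicit prefetching, and it is also why the server sends far fewer pages.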

• The mcdist clients are memtap processes, which receive pages asynchronously and unpredictably in response to requests

• To maximize total goodput, the server uses flow control logic to limit its sending rate and avoid overwhelming busy clients

• Another server flow control mechanism is lockstep detection
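
The slides only name lockstep detection, so the following is one plausible reading rather than the mcdist implementation: clones that execute in lockstep tend to request the same page at almost the same time, and the server can ignore duplicate requests that closely follow one it has already answered. `ToyLockstepFilter` and its `window_s` parameter are assumptions made for this sketch.

```python
from typing import Dict


class ToyLockstepFilter:
    """Suppress duplicate page requests arriving shortly after the first."""

    def __init__(self, window_s: float) -> None:
        self.window_s = window_s                 # hypothetical tuning knob
        self.last_sent: Dict[int, float] = {}    # pfn -> time of last multicast

    def should_send(self, pfn: int, now_s: float) -> bool:
        last = self.last_sent.get(pfn)
        if last is not None and now_s - last < self.window_s:
            return False                         # a recent multicast covers it
        self.last_sent[pfn] = now_s
        return True


lockstep = ToyLockstepFilter(window_s=0.05)
# Three clones ask for page 7 within 2 ms; one asks again a second later.
decisions = [lockstep.should_send(7, t) for t in (0.000, 0.001, 0.002, 1.000)]
print(decisions)
```

Because responses are multicast, the single transmission at time 0 already reaches the other two requesters, so dropping their requests loses nothing in this model.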


[Figure: Multicast Scales. BLAST, 256MB DB. Speedup vs. one thread.]


APPLICATION EVALUATION

• The evaluation of SnowFlock focuses on a particularly demanding scenario: the ability to deliver interactive parallel computation

• All experiments were carried out on a cluster of 32 Dell PowerEdge 1950 blade servers

• Each has 4 GB of RAM, 4 Intel Xeon 3.2 GHz cores, and a Broadcom gigabit NIC

Applications

• Three typical applications from bioinformatics:

• NCBI BLAST: searches a database of biological sequences to find sequences similar to a query

• SHRiMP: a tool for aligning large collections of very short DNA sequences

• ClustalW: generates a multiple alignment of a collection of protein or DNA sequences

• Three applications representative of the fields of graphics rendering, parallel compilation, and financial services:

• QuantLib: an open source toolkit widely used in quantitative finance

• Aqsis Renderman: an open source implementation of Pixar’s Renderman interface

• Distcc: software that distributes builds of C/C++ programs over the network for parallel compilation

Results

• How does SnowFlock compare to other methods for instantiating VMs?

• How close does SnowFlock come to achieving optimal application speedup?

• How scalable is SnowFlock?

Comparison

SnowFlock vs. VM Suspend/Resume. SHRiMP, 128 threads.

                        Time (s)         State (MB)
SnowFlock               70.63±0.68       41.79±0.7
S/R over multicast      157.29±0.97      1124
S/R over NFS            412.29±11.51     1124
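
The relative gaps in this comparison are easier to read as ratios. The short calculation below uses only the mean values reported above and ignores the error margins. The variable names are illustrative.

```python
# (time in seconds, state in MB), means taken from the comparison above
results = {
    "SnowFlock": (70.63, 41.79),
    "S/R over multicast": (157.29, 1124.0),
    "S/R over NFS": (412.29, 1124.0),
}

base_time, base_state = results["SnowFlock"]
ratios = {
    name: (time_s / base_time, state_mb / base_state)
    for name, (time_s, state_mb) in results.items()
}

for name, (time_ratio, state_ratio) in ratios.items():
    print(f"{name:20s} {time_ratio:5.2f}x time  {state_ratio:5.1f}x state")
```

On these figures, suspend/resume takes roughly 2.2 times as long as SnowFlock over multicast and roughly 5.8 times as long over NFS, while moving about 27 times as much state.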

Performance

[Figure: Application Benchmark]

Scale and Agility

[Figure: Concurrent Execution of Multiple Forking VMs. For each task, allocate 32 threads (32 VMs × 1 core), and cycle cloning, processing and joining repeatedly.]


CONCLUSIONS

• SnowFlock gives cloud users and programmers the capacity to instantiate dozens of VMs on different hosts in sub-second time, with little runtime overhead

• VM fork provides a well-understood programming interface with substantial performance improvements