xs japan 2008 services english
Andrew Warfield: Services in the Virtualization Plane
![Page 1: XS Japan 2008 Services English](https://reader034.vdocuments.us/reader034/viewer/2022051411/54710245b4af9fb40a8b4a45/html5/thumbnails/1.jpg)
Services in the Virtualization Plane
Andrew Warfield
Adjunct Professor, UBC
Technical Director, Citrix Systems
![Page 2: XS Japan 2008 Services English](https://reader034.vdocuments.us/reader034/viewer/2022051411/54710245b4af9fb40a8b4a45/html5/thumbnails/2.jpg)
The Virtualization Plane
[Figure: a stack with Applications running on an OS, which runs on the Physical Machine.]
![Page 3: XS Japan 2008 Services English](https://reader034.vdocuments.us/reader034/viewer/2022051411/54710245b4af9fb40a8b4a45/html5/thumbnails/3.jpg)
• Geoff’s call graph goes on this slide.
20ms in the Linux Kernel
• 355,000 Branches
• 350 Syscalls, 312 Interrupts, 255 PFs
• And this is less than 10% of what happened in that 20ms!
![Page 4: XS Japan 2008 Services English](https://reader034.vdocuments.us/reader034/viewer/2022051411/54710245b4af9fb40a8b4a45/html5/thumbnails/4.jpg)
The Virtualization Plane
[Figure: the same stack of Applications, OS, and Physical Machine, now with a Virtual Machine Monitor added.]
![Page 5: XS Japan 2008 Services English](https://reader034.vdocuments.us/reader034/viewer/2022051411/54710245b4af9fb40a8b4a45/html5/thumbnails/5.jpg)
The Virtualization Plane
• Huge opportunity for innovation.
• OS agnostic and hardware agnostic.
• Build useful services that are co-located with, but isolated from, VMs.
• Live migration was the first example.
[Figure: three Physical Machines, each running a Virtual Machine Monitor that hosts VMs and Appliances; the layer is labelled "Virtualization Plane".]
![Page 6: XS Japan 2008 Services English](https://reader034.vdocuments.us/reader034/viewer/2022051411/54710245b4af9fb40a8b4a45/html5/thumbnails/6.jpg)
Overview
![Page 7: XS Japan 2008 Services English](https://reader034.vdocuments.us/reader034/viewer/2022051411/54710245b4af9fb40a8b4a45/html5/thumbnails/7.jpg)
REMUS: TRANSPARENT HIGH AVAILABILITY
Graduate Student: Brendan Cully
![Page 8: XS Japan 2008 Services English](https://reader034.vdocuments.us/reader034/viewer/2022051411/54710245b4af9fb40a8b4a45/html5/thumbnails/8.jpg)
Example 1: Remus: Transparent High Availability (Brendan Cully)
• As with process migration, HA is complicated and difficult to maintain.
• Database HA is generally based around a replicated log, and a recovery protocol based around detailed application semantics.
• HA is for “the very rich and the very scared.”
• Idea: Use simple mechanisms in the virtualization layer to provide universal HA.
![Page 9: XS Japan 2008 Services English](https://reader034.vdocuments.us/reader034/viewer/2022051411/54710245b4af9fb40a8b4a45/html5/thumbnails/9.jpg)
Remus
[Figure: two Xen hosts, each showing a Mail Server VM and a PVM; timing labels: 3ms and <17ms.]
![Page 10: XS Japan 2008 Services English](https://reader034.vdocuments.us/reader034/viewer/2022051411/54710245b4af9fb40a8b4a45/html5/thumbnails/10.jpg)
Remus
[Figure: two Xen hosts, each showing a Mail Server VM and a PVM, connected to the Internet; timing labels: 3ms and <17ms; message: "checkpoint ok!"]
![Page 11: XS Japan 2008 Services English](https://reader034.vdocuments.us/reader034/viewer/2022051411/54710245b4af9fb40a8b4a45/html5/thumbnails/11.jpg)
Remus
[Figure: two Xen hosts, each showing a Mail Server VM and a PVM, connected to the Internet.]
Remus demonstrates that efficient and complete state capture can provide hardware fault tolerant-style whole-system failover to unmodified applications.
The simplicity of the approach is critical, because high availability and fault recovery code is notoriously difficult to get right.
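The checkpoint cycle shown in the diagrams can be sketched as a toy model. This is not Remus code: `Primary`, `Backup`, and their methods are hypothetical names, and the model assumes that outbound network traffic is held back until the backup has acknowledged the checkpoint that produced it (the "checkpoint ok!" step).

```python
import copy


class Backup:
    """Holds the most recently acknowledged checkpoint of the protected VM."""

    def __init__(self):
        self.state = None

    def receive_checkpoint(self, state):
        # Store a private copy, then acknowledge ("checkpoint ok!").
        self.state = copy.deepcopy(state)
        return True


class Primary:
    """Runs the VM speculatively and buffers output between checkpoints."""

    def __init__(self, backup):
        self.backup = backup
        self.state = {"counter": 0}
        self.pending_output = []   # packets produced since the last checkpoint
        self.released_output = []  # packets actually visible to the outside

    def run_epoch(self, requests):
        # Execute for one epoch; any output stays buffered for now.
        for req in requests:
            self.state["counter"] += 1
            self.pending_output.append(f"reply to {req} (#{self.state['counter']})")

    def checkpoint(self):
        # Ship state to the backup; release output only after the ack.
        if self.backup.receive_checkpoint(self.state):
            self.released_output.extend(self.pending_output)
            self.pending_output.clear()


backup = Backup()
primary = Primary(backup)

primary.run_epoch(["a", "b"])
primary.checkpoint()          # replies to a and b become externally visible
primary.run_epoch(["c"])      # primary "fails" here, before the next checkpoint

# Failover: the backup resumes from the last acknowledged checkpoint.
recovered_state = backup.state
```

In this model nothing the outside world has seen depends on state the backup lacks: the reply to `c` was never released, so resuming from the older checkpoint stays consistent with what clients observed.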
![Page 12: XS Japan 2008 Services English](https://reader034.vdocuments.us/reader034/viewer/2022051411/54710245b4af9fb40a8b4a45/html5/thumbnails/12.jpg)
Remus: Current Work
• Disaster Tolerant Computing. Extend HA/FT to work in the wide area. Deployment between UBC and TRU (~350km fiber connection).
• Exposing Remus to Applications. Apply paravirtualization to transparent HA. E.g. let a database know that some memory is unprotected.
![Page 13: XS Japan 2008 Services English](https://reader034.vdocuments.us/reader034/viewer/2022051411/54710245b4af9fb40a8b4a45/html5/thumbnails/13.jpg)
Remus: Summary
• Published at NSDI 2008 – won “Best Paper” award.
• Some patches in xen‐unstable, remaining patches to appear over the next month or two.
• This summer, added support for HVM guests.
![Page 14: XS Japan 2008 Services English](https://reader034.vdocuments.us/reader034/viewer/2022051411/54710245b4af9fb40a8b4a45/html5/thumbnails/14.jpg)
PARALLAX: VIRTUAL STORAGE FOR VIRTUAL MACHINES
Graduate Student: Dutch Meyer
![Page 15: XS Japan 2008 Services English](https://reader034.vdocuments.us/reader034/viewer/2022051411/54710245b4af9fb40a8b4a45/html5/thumbnails/15.jpg)
Parallax: Storage Virtualization for VMs
• VMs are fantastic, but turn out to be a bit clunky to work with.
• VM images are really big, and most storage systems don’t provide the operations you need to innovate with VMs.
• Horizontal scale: create lots of images based on a gold master.
• Vertical scale: lots of snapshots of a single image.
• Thin provisioning is critical, as is low-cost storage.
• Parallax is basically just page tables for disks!
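The "page tables for disks" analogy can be illustrated with a minimal sketch. `BlockPool` and `VirtualDisk` are hypothetical names, not Parallax's interface: each disk maps virtual block numbers to blocks in a shared pool, storage is allocated only on first write (thin provisioning), and a clone of a gold master starts out by sharing the parent's mappings.

```python
class BlockPool:
    """Shared physical storage; blocks are allocated only when written."""

    def __init__(self):
        self.blocks = []

    def allocate(self, data):
        self.blocks.append(data)
        return len(self.blocks) - 1  # physical block number


class VirtualDisk:
    """Maps virtual block numbers to physical blocks, like a page table."""

    def __init__(self, pool, mapping=None):
        self.pool = pool
        self.mapping = dict(mapping or {})  # virtual block -> physical block

    def read(self, vblock):
        # Unmapped blocks read as zeroes and consume no storage.
        pblock = self.mapping.get(vblock)
        return b"\0" if pblock is None else self.pool.blocks[pblock]

    def write(self, vblock, data):
        # Always write to a fresh physical block, so blocks shared with
        # other disks are never modified in place.
        self.mapping[vblock] = self.pool.allocate(data)

    def clone(self):
        # A clone copies only the mapping, not the data.
        return VirtualDisk(self.pool, self.mapping)


pool = BlockPool()
gold = VirtualDisk(pool)
gold.write(0, b"kernel")
gold.write(1, b"rootfs")

vm1 = gold.clone()
vm2 = gold.clone()
vm1.write(1, b"rootfs+patch")
```

Three images exist here, but the pool holds only three blocks: the two gold-master blocks plus the one block `vm1` changed.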
![Page 16: XS Japan 2008 Services English](https://reader034.vdocuments.us/reader034/viewer/2022051411/54710245b4af9fb40a8b4a45/html5/thumbnails/16.jpg)
Parallax: Storage Virtualization for VMs
![Page 17: XS Japan 2008 Services English](https://reader034.vdocuments.us/reader034/viewer/2022051411/54710245b4af9fb40a8b4a45/html5/thumbnails/17.jpg)
Parallax
[Figure labels: metadata, data.]
• Virtualizes block devices using address mapping trees.
• Very low overhead (2ms) snapshots.
• Equally low overhead image cloning.
• Space efficient.
• Many very interesting challenges.
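One general way to get cheap snapshots from an address mapping tree is to copy only the path from the root to a modified leaf. The sketch below shows that copy-on-write technique on a two-level tree. The fan-out, function names, and in-memory layout are assumptions for illustration, not Parallax's on-disk format, and leaf slots hold strings as stand-ins for physical block addresses.

```python
FANOUT = 4  # entries per tree node; an arbitrary choice for illustration


def empty_root():
    # Two-level tree: a root of FANOUT leaves, each with FANOUT slots.
    leaf = (None,) * FANOUT
    return (leaf,) * FANOUT


def lookup(root, vblock):
    hi, lo = divmod(vblock, FANOUT)
    return root[hi][lo]


def write(root, vblock, data):
    """Return a new root; only the touched leaf and the root are copied."""
    hi, lo = divmod(vblock, FANOUT)
    new_leaf = root[hi][:lo] + (data,) + root[hi][lo + 1:]
    return root[:hi] + (new_leaf,) + root[hi + 1:]


def snapshot(root):
    # Nodes are never modified in place, so a snapshot is just a reference.
    return root


disk = empty_root()
disk = write(disk, 5, "v1")
snap = snapshot(disk)
disk = write(disk, 5, "v2")
disk = write(disk, 9, "new")
```

Taking the snapshot costs a single reference, and later writes leave it untouched; leaves that neither version has modified remain shared between them.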
![Page 18: XS Japan 2008 Services English](https://reader034.vdocuments.us/reader034/viewer/2022051411/54710245b4af9fb40a8b4a45/html5/thumbnails/18.jpg)
Parallax Summary
• Parallax is used on a daily basis in our lab.
• Vertical integration of storage from array through to guest.
• Horizontal integration across hosts in a cluster.
• Storage is no longer a barrier for deploying new VMs.
• In the process of adding powerful new features: deduping, linearization, and CAS.
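The slide only names these features, so the following is a generic illustration of how content-addressable storage (CAS) yields deduplication, not a description of the Parallax design. `ContentAddressedStore` is a hypothetical class, and the choice of SHA-256 as the block key is an assumption.

```python
import hashlib


class ContentAddressedStore:
    """Stores each distinct block once, keyed by a hash of its contents."""

    def __init__(self):
        self.blocks = {}  # hex digest -> block contents

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        # Identical contents hash to the same key, so duplicates cost nothing.
        self.blocks.setdefault(key, data)
        return key

    def get(self, key: str) -> bytes:
        return self.blocks[key]


store = ContentAddressedStore()
image_a = [store.put(b) for b in (b"boot", b"libc", b"app-a")]
image_b = [store.put(b) for b in (b"boot", b"libc", b"app-b")]
```

The two images reference six blocks in total, but the store keeps only four, because the shared `boot` and `libc` blocks are stored once.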
![Page 19: XS Japan 2008 Services English](https://reader034.vdocuments.us/reader034/viewer/2022051411/54710245b4af9fb40a8b4a45/html5/thumbnails/19.jpg)
TRALFAMADORE: ENHANCING AND UNDERSTANDING SYSTEMS
Graduate Students: Geoffrey Lefebvre and Brendan Cully
![Page 20: XS Japan 2008 Services English](https://reader034.vdocuments.us/reader034/viewer/2022051411/54710245b4af9fb40a8b4a45/html5/thumbnails/20.jpg)
Tralfamadore: Motivation
• How much do we really know about what software is doing, especially when things go wrong?
• What if we had a detailed recording?
• What if the recording was interactive and could be queried and changed?
![Page 21: XS Japan 2008 Services English](https://reader034.vdocuments.us/reader034/viewer/2022051411/54710245b4af9fb40a8b4a45/html5/thumbnails/21.jpg)
Tralfamadore
1. Continuously log execution for long periods of time.
[Figure: Production Network and Test/Dev Network; Remus Checkpoints; Parallax; a series of RAM and Disk checkpoint pairs.]
![Page 22: XS Japan 2008 Services English](https://reader034.vdocuments.us/reader034/viewer/2022051411/54710245b4af9fb40a8b4a45/html5/thumbnails/22.jpg)
Tralfamadore
2. Re-execute slices of history to generate indexes.
[Figure: Production Network and Test/Dev Network; Remus Checkpoints; Parallax; a series of RAM and Disk checkpoint pairs; Indexing Servers producing an Execution Index.]
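Step 2 can be pictured as replaying a recorded slice and emitting index entries as events occur. The sketch below is a toy: the trace format, the event names, and `build_index` are all hypothetical, and a real re-execution would derive the events from checkpointed machine state rather than from a list.

```python
from collections import defaultdict

# A recorded slice of history: (timestamp in microseconds, event kind, detail).
trace = [
    (100, "syscall", "open:/etc/passwd"),
    (140, "call", "function_a"),
    (150, "syscall", "write:/etc/passwd"),
    (400, "call", "function_b"),
    (900, "interrupt", "net_rx"),
]


def build_index(events):
    """Replay events in order and index timestamps by (kind, detail)."""
    index = defaultdict(list)
    for timestamp, kind, detail in events:
        index[(kind, detail)].append(timestamp)
    return index


index = build_index(trace)
passwd_writes = index[("syscall", "write:/etc/passwd")]
```

With the index built, a question such as "when was /etc/passwd modified?" becomes a dictionary lookup instead of a scan over the whole recording.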
![Page 23: XS Japan 2008 Services English](https://reader034.vdocuments.us/reader034/viewer/2022051411/54710245b4af9fb40a8b4a45/html5/thumbnails/23.jpg)
Tralfamadore
3. Queries to search, select, and interrogate history.
[Figure: Production Network and Test/Dev Network; Remus Checkpoints; Parallax; a series of RAM and Disk checkpoint pairs; Indexing Servers producing an Execution Index; Query Servers.]
“Find points in execution when…”
• “… /etc/passwd was modified.”
• “… network receive buffers were overloaded.”
• “… the stack looked like this.”
• “… control flow ran function b, shortly after running function a.”
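The last query, "function b shortly after function a", could be answered from sorted call timestamps along these lines. This is a sketch under assumptions: `ran_shortly_after`, the timestamp lists, and the `window` parameter are hypothetical, and the source does not describe the actual query mechanism.

```python
from bisect import bisect_right


def ran_shortly_after(times_a, times_b, window):
    """Return (t_a, t_b) pairs where b ran within `window` after a.

    Both inputs are sorted lists of timestamps.
    """
    matches = []
    for t_a in times_a:
        # Find the first call to b strictly after this call to a.
        i = bisect_right(times_b, t_a)
        if i < len(times_b) and times_b[i] - t_a <= window:
            matches.append((t_a, times_b[i]))
    return matches


calls_a = [140, 2000, 5000]
calls_b = [400, 2050, 9000]
hits = ran_shortly_after(calls_a, calls_b, window=300)
```

Each hit identifies a point in recorded history that could then be selected for closer inspection.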
![Page 24: XS Japan 2008 Services English](https://reader034.vdocuments.us/reader034/viewer/2022051411/54710245b4af9fb40a8b4a45/html5/thumbnails/24.jpg)
Tralfamadore
4. Re-execute and analyze modified checkpoints.
[Figure: Production Network and Test/Dev Network; Remus Checkpoints; Parallax; a series of RAM and Disk checkpoint pairs; Indexing Servers producing an Execution Index; Query Servers; Re-execution Servers.]
Use piles of existing tools:
• Emulators
• Profilers
• Binary Patching
• Debuggers
Use repeated re-execution to tackle non-determinism.
![Page 25: XS Japan 2008 Services English](https://reader034.vdocuments.us/reader034/viewer/2022051411/54710245b4af9fb40a8b4a45/html5/thumbnails/25.jpg)
Applications of Tralfamadore
• Fault injection / fuzzing of software close to real execution experience.
• Retrospectively attach a debugger to any node in this call graph.
• Test patches in the past!
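"Testing patches in the past" can be sketched as replaying recorded inputs from a checkpoint against both the original and the patched code. Everything here is hypothetical (the handlers, the checkpoint layout, and `replay`); it only illustrates the idea, not how Tralfamadore re-executes whole machines.

```python
def original_handler(state, packet):
    # Bug: raises IndexError on an empty packet.
    state["bytes"] += len(packet)
    state["first_byte"] = packet[0]
    return state


def patched_handler(state, packet):
    # Candidate fix: ignore the first byte of empty packets.
    state["bytes"] += len(packet)
    if packet:
        state["first_byte"] = packet[0]
    return state


def replay(checkpoint, recorded_inputs, handler):
    """Re-execute recorded inputs from a checkpoint; report the outcome."""
    state = dict(checkpoint)  # never mutate the stored checkpoint
    for step, packet in enumerate(recorded_inputs):
        try:
            state = handler(state, packet)
        except IndexError:
            return ("failed", step)
    return ("ok", state)


checkpoint = {"bytes": 0, "first_byte": None}
recorded_inputs = [b"GET", b"", b"PUT"]

before = replay(checkpoint, recorded_inputs, original_handler)
after = replay(checkpoint, recorded_inputs, patched_handler)
```

Because the checkpoint is copied before each run, the same slice of history can be replayed as often as needed, once per candidate patch.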
![Page 26: XS Japan 2008 Services English](https://reader034.vdocuments.us/reader034/viewer/2022051411/54710245b4af9fb40a8b4a45/html5/thumbnails/26.jpg)
Understanding Execution
• Prototype application of Tralfamadore performs dynamic analysis of trace data and maps it back on top of source code.
• Allows developers to understand how source is behaving in deployments.
• Very early work, but a few examples follow…
![Page 27: XS Japan 2008 Services English](https://reader034.vdocuments.us/reader034/viewer/2022051411/54710245b4af9fb40a8b4a45/html5/thumbnails/27.jpg)
![Page 28: XS Japan 2008 Services English](https://reader034.vdocuments.us/reader034/viewer/2022051411/54710245b4af9fb40a8b4a45/html5/thumbnails/28.jpg)
![Page 29: XS Japan 2008 Services English](https://reader034.vdocuments.us/reader034/viewer/2022051411/54710245b4af9fb40a8b4a45/html5/thumbnails/29.jpg)
![Page 30: XS Japan 2008 Services English](https://reader034.vdocuments.us/reader034/viewer/2022051411/54710245b4af9fb40a8b4a45/html5/thumbnails/30.jpg)
Tralfamadore Summary
• We hope to be able to do detailed, retrospective analysis of system behaviour.
• Current focus has been on understanding execution.
• Future work will involve performance and security analysis and assisting reproduction of system failures.
![Page 31: XS Japan 2008 Services English](https://reader034.vdocuments.us/reader034/viewer/2022051411/54710245b4af9fb40a8b4a45/html5/thumbnails/31.jpg)
Overall Conclusions
• The virtualization plane presents a great opportunity to build low‐level extensions to software.
• I have shown three example services, and expect many more to follow.
• Interesting challenges exist in evolving virtualization to provide these services while maintaining isolation and cross‐platform benefits.
![Page 32: XS Japan 2008 Services English](https://reader034.vdocuments.us/reader034/viewer/2022051411/54710245b4af9fb40a8b4a45/html5/thumbnails/32.jpg)
Thank You!