MIT IAP Course Lecture #1: Virtualization 101 · 2011-11-03


Copyright © 2007 VMware, Inc. All rights reserved.

MIT IAP Course

Lecture #1: Virtualization 101

Carl Waldspurger (SB SM ’89 PhD ’95)

VMware R&D

January 16, 2007


What is Virtualization?

Virtual systems

• Abstract physical components using logical objects

• Dynamically bind logical objects to physical configurations

Examples

• Network – Virtual LAN (VLAN), Virtual Private Network (VPN)

• Storage – Storage Area Network (SAN), LUN

• Computer – Virtual Machine (VM), simulator

vir•tu•al (adj): existing in essence or effect, though not in actual fact


Overview

Virtual Machines

Virtualization Approaches

Processor Virtualization

Additional Topics


Starting Point: A Physical Machine

Physical Hardware

• Processors, memory, chipset, I/O bus and devices, etc.

• Physical resources often underutilized

Software

• Tightly coupled to hardware

• Single active OS image

• OS controls hardware


What is a Virtual Machine?

Hardware-Level Abstraction

• Virtual hardware: processors, memory, chipset, I/O devices, etc.

• Encapsulates all OS and application state

Virtualization Software

• Extra level of indirection decouples hardware and OS

• Multiplexes physical hardware across multiple “guest” VMs

• Strong isolation between VMs

• Manages physical resources, improves utilization


VM Isolation

Secure Multiplexing

• Run multiple VMs on single physical host

• Processor hardware isolates VMs, e.g. MMU

Strong Guarantees

• Software bugs, crashes, viruses within one VM cannot affect other VMs

Performance Isolation

• Partition system resources

• Example: VMware controls for reservation, limit, shares


VM Encapsulation

Entire VM is a File

• OS, applications, data

• Memory and device state

Snapshots and Clones

• Capture VM state on the fly and restore to point-in-time

• Rapid system provisioning, backup, remote mirroring

Easy Content Distribution

• Pre-configured apps, demos

• Virtual appliances


VM Compatibility

Hardware-Independent

• Physical hardware hidden by virtualization layer

• Standard virtual hardware exposed to VM

Create Once, Run Anywhere

• No configuration issues

• Migrate VMs between hosts

Legacy VMs

• Run ancient OS on new platform

• E.g. DOS VM drives virtual IDE and vLance devices, mapped to modern SAN and GigE hardware


Common Virtualization Uses Today

Server Consolidation and Containment – Eliminate server sprawl by deploying systems into virtual machines that can run safely and move transparently across shared hardware

Test and Development – Rapidly provision test and development servers; store libraries of pre-configured test machines

Enterprise Desktop – Secure unmanaged PCs without compromising end-user autonomy by layering a security policy in software around desktop virtual machines

Business Continuity – Reduce cost and complexity by encapsulating entire systems into single files that can be replicated and restored onto any target server


Overview

Virtual Machines

Virtualization Approaches

• Virtual machine monitors (VMMs)

• Virtualization platform types

• Alternative system virtualizations

Processor Virtualization

Additional Topics


What is a Virtual Machine Monitor?

VMM Characteristics

• Fidelity

• Performance

• Isolation / Safety

An Old Concept

• Classic definition from Popek & Goldberg ’74

• IBM mainframes since ’60s


VMM Technology

So this is just like Java, right?

• No, a Java VM is very different from the physical machine that runs it

• A hardware-level VM reflects underlying processor architecture

Like a simulator or emulator that can run old Nintendo games?

• No, they emulate the behavior of different hardware architectures

• Simulators generally have very high overhead

• A hardware-level VM utilizes the underlying physical processor directly


VMMs Past

An Old Idea

• Hardware-level VMs since ’60s

• IBM S/360, IBM VM/370 mainframe systems

• Timeshare multiple single-user OS instances on expensive hardware

Classical VMM

• Run VM directly on hardware

• “Trap and emulate” model for privileged instructions

• Vendors had vertical control over proprietary hardware, operating systems, VMM

From IBM VM/370 product announcement, ca. 1972


VMMs Present

Renewed Interest

• Academic research since ’90s

• VMs for commodity systems

• Server consolidation

VMM for x86

• Industry-standard hardware, from laptops to datacenter

• Run unmodified commodity guest operating systems

• Significant challenges, e.g. “non-virtualizable” instructions

• Pioneered by VMware in ’98

VMware Fusion for Mac OS X running WinXP, 2006


VMM Platform Types

Hosted Architecture

• Install as application on existing x86 “host” OS, e.g. Windows, Linux, OS X

• Small context-switching driver

• Leverage host I/O stack and resource management

• Examples: VMware Player/Workstation/Server, Microsoft Virtual PC/Server, Parallels Desktop

Bare-Metal Architecture

• “Hypervisor” installs directly on hardware

• Acknowledged as preferred architecture for high-end servers

• Examples: VMware ESX Server, Xen, Microsoft Viridian (2008)


System Virtualization Alternatives

[Figure: virtual machines abstracted using a layer at different places: language level, OS level, hardware level]


System Virtualization Taxonomy

High-Level Language

• Java
• Microsoft .NET / Mono
• Smalltalk

OS Level

• FreeBSD Jail
• HP Secure Resource Partitions
• Sun Solaris Zones
• SWsoft Virtuozzo
• User-Mode Linux

Emulators

• Bochs
• Microsoft VPC for Mac
• QEMU
• Virtutech Simics

Para-virtualization

• Virtual Iron
• VMware VMI
• Xen

Hardware Level

Bare-Metal/Hypervisor

• HP Integrity VM
• IBM zSeries z/VM
• VMware ESX Server
• Xen

Hosted

• Microsoft Virtual Server
• Microsoft Virtual PC
• Parallels Desktop
• VMware Player
• VMware Workstation
• VMware Server


Overview

Virtual Machines

Virtualization Approaches

Processor Virtualization

• Classical techniques

• Software x86 VMM

• Hardware-assisted x86 VMM

• Para-virtualization

Additional Topics


Classical Instruction Virtualization

Trap and Emulate

• Run guest operating system deprivileged

• All privileged instructions trap into VMM

• VMM emulates instructions against virtual state, e.g. disable virtual interrupts, not physical interrupts

• Resume direct execution from next guest instruction

Implementation Technique

• This is just one technique

• Popek and Goldberg criteria permit others
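The trap-and-emulate steps above can be sketched as a toy loop. This is an illustration only: the opcode names (`cli`, `sti`, `inc`), the `VirtualCPU` class, and the trap counter are assumptions made for the example, not any real VMM interface.

```python
from dataclasses import dataclass

PRIVILEGED = {"cli", "sti"}  # toy privileged opcodes (hypothetical subset)


@dataclass
class VirtualCPU:
    interrupts_enabled: bool = True  # virtual interrupt flag, not the physical one
    acc: int = 0                     # toy guest register
    traps: int = 0                   # number of traps taken into the VMM


def emulate(vcpu: VirtualCPU, op: str) -> None:
    """VMM handler: apply a privileged instruction to virtual state only."""
    vcpu.traps += 1
    if op == "cli":
        vcpu.interrupts_enabled = False
    elif op == "sti":
        vcpu.interrupts_enabled = True


def run_guest(vcpu: VirtualCPU, program: list) -> None:
    """Run a deprivileged guest: privileged ops trap, the rest run directly."""
    pc = 0
    while pc < len(program):
        op = program[pc]
        if op in PRIVILEGED:
            emulate(vcpu, op)   # trap into the VMM
        elif op == "inc":
            vcpu.acc += 1       # stands in for direct execution
        pc += 1                 # resume at the next guest instruction


vcpu = VirtualCPU()
run_guest(vcpu, ["inc", "cli", "inc", "inc", "sti", "cli"])
```

Only the virtual interrupt flag changes; nothing in the sketch touches physical interrupt state, which is the point of emulating against virtual state.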


Classical Memory Virtualization

Traditional VMM Approach

Extra Level of Indirection

• Virtual → “Physical”: guest maps virtual page number (VPN) to “physical” page number (PPN) using primary page tables

• “Physical” → Machine: VMM maps PPN to machine page number (MPN)

Shadow Page Table

• Composite of two mappings

• For ordinary memory references, hardware maps VPN to MPN

• Cached by physical TLB

[Figure: the guest maps VPN to PPN and the VMM maps PPN to MPN; the shadow page table, cached by the hardware TLB, maps VPN directly to MPN]
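A shadow page table as the composite of the two mappings can be sketched with plain dictionaries. The page numbers and the names `guest_pt`, `pmap`, and `build_shadow` are made up for illustration; a real VMM builds shadow entries lazily and in hardware page-table format.

```python
def build_shadow(guest_pt: dict, pmap: dict) -> dict:
    """Compose VPN->PPN (guest) with PPN->MPN (VMM) into VPN->MPN.

    Entries whose PPN has no backing MPN are left out, so a reference to
    them would fault into the VMM instead of being mapped by hardware.
    """
    return {vpn: pmap[ppn] for vpn, ppn in guest_pt.items() if ppn in pmap}


guest_pt = {0x10: 0x2, 0x11: 0x3, 0x12: 0x7}  # guest's primary page table
pmap = {0x2: 0x9A, 0x3: 0x41}                  # VMM's map; PPN 0x7 is not backed
shadow = build_shadow(guest_pt, pmap)
```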


Memory Traces

Shadow Page Table

• Derived from primary page table in guest

• VMM must keep primary and shadow coherent

Trace = Coherency Mechanism

• Write-protect primary page table

• Trap guest writes to primary

• Update or invalidate corresponding shadow

• Transparent to guest
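A minimal sketch of a trace as a coherency mechanism, assuming a toy model in which every guest write to the primary page table arrives at the VMM as a fault. The class and method names are hypothetical.

```python
class TracedPageTable:
    """Guest primary page table whose writes trap to the VMM (toy model)."""

    def __init__(self, pmap: dict):
        self.primary = {}      # VPN -> PPN, owned by the guest
        self.shadow = {}       # VPN -> MPN, owned by the VMM
        self.pmap = pmap       # PPN -> MPN, owned by the VMM
        self.trace_faults = 0

    def guest_write(self, vpn: int, ppn: int) -> None:
        # The primary table is write-protected, so the write traps here.
        self.trace_faults += 1
        self.primary[vpn] = ppn      # emulate the guest's write
        self.shadow.pop(vpn, None)   # invalidate the stale shadow entry

    def translate(self, vpn: int) -> int:
        # Hardware uses the shadow; on a miss the VMM refills it.
        if vpn not in self.shadow:
            self.shadow[vpn] = self.pmap[self.primary[vpn]]
        return self.shadow[vpn]


pt = TracedPageTable(pmap={1: 100, 2: 200})
pt.guest_write(0x10, 1)
first = pt.translate(0x10)
pt.guest_write(0x10, 2)       # guest remaps the page; shadow must follow
second = pt.translate(0x10)
```

The guest never sees the fault or the shadow table, which is what makes the mechanism transparent.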


Classical VMM Performance

Native Speed Except for Traps

• No overhead in direct execution

• Overhead = trap frequency × average trap cost

Trap Sources

• Most frequent: Guest page table traces

• Privileged instructions

• Memory-mapped device traces
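The overhead formula can be worked through with numbers. The trap rate, trap cost, and clock speed below are hypothetical values chosen for round arithmetic, not measurements.

```python
def trap_overhead_fraction(traps_per_sec: float, cycles_per_trap: float,
                           cpu_hz: float) -> float:
    """Fraction of CPU time spent handling traps: frequency x average cost."""
    return traps_per_sec * cycles_per_trap / cpu_hz


# Hypothetical workload: 50,000 traps/s at 2,000 cycles each on a 2 GHz CPU.
overhead = trap_overhead_fraction(50_000, 2_000, 2e9)
```

With these assumed inputs, 1e8 of the 2e9 cycles available each second go to traps, so the result is a 5% overhead.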


x86 Virtualization Challenges

Not Classically Virtualizable

• x86 ISA includes instructions that read or modify privileged state

• But which don’t trap in unprivileged mode

Example: POPF instruction

• Pop top-of-stack into EFLAGS register

• EFLAGS.IF bit privileged (interrupt enable flag)

• POPF silently ignores attempts to alter EFLAGS.IF in unprivileged mode!

• So no trap to return control to VMM

Deprivileging not possible with x86!
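The POPF problem can be illustrated with a toy model that tracks only the IF bit. It assumes protected mode and ignores every other flag and the IOPL field itself; `popf`, `cli`, and the exception class are names invented for this sketch. CLI is included for contrast because it does fault when executed with insufficient privilege.

```python
IF_BIT = 1 << 9  # EFLAGS.IF, the interrupt enable flag


class GeneralProtectionFault(Exception):
    """Stands in for a #GP fault that would return control to the VMM."""


def popf(eflags: int, popped: int, cpl: int, iopl: int) -> int:
    """Toy POPF: with CPL > IOPL the IF change is silently dropped."""
    if cpl <= iopl:
        return popped
    return (popped & ~IF_BIT) | (eflags & IF_BIT)  # keep old IF, no trap


def cli(eflags: int, cpl: int, iopl: int) -> int:
    """Toy CLI: faults when executed with insufficient privilege."""
    if cpl > iopl:
        raise GeneralProtectionFault("CLI with CPL > IOPL")
    return eflags & ~IF_BIT


eflags = IF_BIT | 0x2
# A deprivileged guest kernel (CPL 3, IOPL 0) tries to clear IF via POPF.
deprivileged = popf(eflags, 0x2, cpl=3, iopl=0)
privileged = popf(eflags, 0x2, cpl=0, iopl=0)
```

In the deprivileged case IF stays set and no exception is raised, so a VMM relying on traps never learns that the guest meant to disable interrupts.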


How to Virtualize x86?

Interpretation

• Problem – too inefficient

• x86 decoding slow

Code Patching

• Problem – not transparent

• Guest can inspect its own code

Binary Translation (BT)

• Approach pioneered by VMware

• Run any unmodified x86 OS in VM

Extend x86 Architecture


Software VMM: Binary Translation

Direct execute unprivileged guest application code

• Will run at full speed until it traps, we get an interrupt, etc.

“Binary translate” all guest kernel code, run it unprivileged

• Since x86 has non-virtualizable instructions, proactively transfer control to the VMM (no need for traps)

• Safe instructions are emitted without change

• For “unsafe” instructions, emit a controlled emulation sequence

• VMM translation cache for good performance


VMware Translator Properties

Binary – input is x86 “hex”, not source

Dynamic – interleave translation and execution

On Demand – translate only what is about to execute (lazy)

System Level – makes no assumptions about guest code

Subsetting – full x86 to safe subset

Adaptive – adjust translations based on guest behavior


BT Mechanics

Each Translator Invocation

• Consume a basic block (BB)

• Produce a compiled code fragment (CCF)

Store CCF in Translation Cache

• Future reuse

• Capture working set of guest kernel

• Amortize translation costs

• Not “patching in place”

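[Figure: the translator takes a basic block (BB) of guest x86 bytes as input, e.g. 55 ff 33 c7 03 ..., and produces a compiled code fragment (CCF) as output, e.g. 55 ff 33 c7 03 ...]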
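A minimal sketch of on-demand translation with a translation cache, using strings in place of x86 bytes. The `UNSAFE` set, the `vmm_emulate_*` names, and the `Translator` class are hypothetical; a real translator decodes machine code and ends each block at a control-flow instruction.

```python
UNSAFE = {"popf", "cli", "sti"}  # hypothetical set of instructions needing emulation


class Translator:
    def __init__(self, guest_code: dict):
        self.guest_code = guest_code  # guest address -> basic block (BB)
        self.cache = {}               # guest address -> compiled code fragment (CCF)
        self.invocations = 0

    def translate(self, bb: list) -> list:
        ccf = []
        for instr in bb:
            if instr in UNSAFE:
                # Unsafe instruction: emit a controlled emulation sequence.
                ccf.append(f"call vmm_emulate_{instr}")
            else:
                ccf.append(instr)     # safe instruction: emitted unchanged
        return ccf

    def lookup(self, addr: int) -> list:
        """Translate lazily, then reuse the cached CCF on later executions."""
        if addr not in self.cache:
            self.invocations += 1
            self.cache[addr] = self.translate(self.guest_code[addr])
        return self.cache[addr]


guest_code = {0x1000: ["push ebp", "cli", "mov eax, ebx"]}
translator = Translator(guest_code)
first = translator.lookup(0x1000)
second = translator.lookup(0x1000)  # served from the translation cache
```

The guest's own code is never modified; the CCF lives in a separate cache, which is the sense of "not patching in place".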


Example: IDENT Translation

BB (guest basic block):

80403a69 push %ebp
80403a6a push (%ebx)
80403a6c mov (%ebx), ffffffff
80403a72 mov %edx, %esp
80403a74 mov %esp, 81c(%ebx)
80403a7a push %edx
80403a7b mov %ebp, %eax
80403a7d call 80460ba4

CCF (compiled code fragment):

25555b0 push %ebp
25555b1 push (%ebx)
25555b3 mov (%ebx), ffffffff
25555b9 mov %edx, %esp
25555bb mov %esp, 81c(%ebx)
25555c1 push %edx
25555c2 mov %ebp, %eax
25555c4 push 80403a82
25555c9 int 3a
25555cb data: 80460ba4

Notes on the CCF:

• 25555c4: push return address

• 25555c9: invoke translator on callee


Adaptive BT

Translated Code Is Fast

• Mostly IDENT translations

• Runs “at speed”

Except Writes to Traced Memory

• Page fault (shown as !*!)

• Decode and interpret instruction

• Fire trace callbacks

• Resume execution

• Can take 1000’s of cycles

[Figure: translation cache; a write to traced memory takes a page fault (!*!); untranslated targets invoke the translator]


Adaptive BT: Fast Trace Handling

Detect and Track Trace Faults

Splice in TRACE Translation

• Execute memory access in software

• Avoid page fault

• No re-decoding

• Faster resumption

Faster Traces

• 10x performance improvement

• Adapts to runtime behavior

[Figure: translation cache with a JMP to a spliced-in TRACE translation; labels: JMP, TRACE, Invoke Translator]
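The adaptation can be sketched as a per-instruction fault counter that switches translations after repeated trace faults. The cycle costs and the threshold are hypothetical; they are chosen so the TRACE path is 10x cheaper than the fault path, matching the ratio quoted on the slide.

```python
FAULT_COST = 2000  # hypothetical cycles: page fault, decode, callbacks, resume
TRACE_COST = 200   # hypothetical cycles: memory access executed in software
THRESHOLD = 3      # hypothetical number of faults before retranslating


def run_writes(n_writes: int):
    """Simulate one instruction that writes traced memory n_writes times."""
    faults = 0
    cycles = 0
    mode = "IDENT"
    for _ in range(n_writes):
        if mode == "IDENT":
            faults += 1
            cycles += FAULT_COST
            if faults >= THRESHOLD:
                mode = "TRACE"   # splice in the TRACE translation
        else:
            cycles += TRACE_COST
    return mode, cycles


adaptive = run_writes(10)
non_adaptive_cycles = 10 * FAULT_COST
```

Under these assumptions ten writes cost 7,400 cycles with adaptation and 20,000 without it.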


Software VMM Evaluation

Benefits

• Adaptation

• Fast traces

• Fast I/O emulation

• Flexibility

Costs

• Running translator

• Path lengthening

• System call slowdown

• Complexity


Hardware-Assisted VMM

Recent x86 Extension

• 1998 – 2005: Software-only VMMs using binary translation

• 2005: Intel and AMD start extending x86 to support virtualization

First-Generation Hardware

• Enables classical trap-and-emulate VMMs

• Intel VT, aka “Vanderpool Technology”

• AMD SVM, aka “Pacifica”

Performance

• VT/SVM help avoid BT, but not MMU ops (actually slower!)

• Main problem is efficient virtualization of MMU and I/O, not executing the virtual instruction stream


VT/SVM Architecture

Diagram

• Y-axis: old school x86 privilege (CPL)

• X-axis: virtualization privilege

Guest Mode

• Runs unmodified OS

• Sensitive operations “exit” (trap out) to host mode

VMCB

• Virtual Machine Control Block

• VMM-controlled, hardware-walked

• Buffers simple exits

[Figure: host mode and guest mode side by side, each with its own privilege levels CPL 0 through CPL 3]


Hardware-Assisted VMM

[Figure: the VMM runs in host mode (CPL 0-3) and the guest runs under hardware-assisted direct execution in guest mode (CPL 0-3); a fault, trace, interrupt, I/O ... exits to the VMM, which then resumes the guest]


Hardware-Assisted VMM Evaluation

Benefits

• Simplicity (no BT)

• Fast system calls

• No translator overheads

Costs

• Exits: 1000’s of cycles for traces and I/O

• No adaptation or software flexibility

• Stateless model

Future

• Hardware support for fast MMU virtualization

• Intel EPT, AMD NPT


What is Paravirtualization?

Full Virtualization

• No modifications to guest OS

• Excellent compatibility, good performance, but complex

Paravirtualization Exports Simpler Architecture

• Term coined by Denali project in ’01, popularized by Xen

• Modify guest OS to be aware of virtualization layer

• Remove non-virtualizable parts of architecture

• Avoid rediscovery of knowledge in hypervisor

• Excellent performance and simple, but poor compatibility

Ongoing Linux Standards Work

• “Paravirt Ops” interface between guest and hypervisor

• Small team from VMware, Xen, IBM LTC, etc.


Paravirtualization: Conceptual Diagram

[Figure: side-by-side stacks for Full Virtualization and Paravirtualization, each showing Guest OS over Hypervisor over Hardware; labels: Hypercalls (GOOD), System call interface, NOT GOOD!]


VMware Vision: Transparent Paravirtualization

[Figure: the same OS binary (VMI Linux) running natively, on Xen 3.0.x, and on VMware ESX; other labels: Dom0, DomU, XenoLinux, Linux, Windows, Solaris]


Further Reading

VMware Publications

• www.vmware.com/academic/resources.html

• A Comparison of Software and Hardware Techniques for x86 Virtualization (ASPLOS ’06)

• Fast Transparent Migration for Virtual Machines (USENIX ’05)

• Memory Resource Management in VMware ESX Server (OSDI ’02)

• Virtualizing I/O Devices on VMware Workstation’s Hosted VMM (USENIX ’01)

Additional Academic Publications

• Xen and the Art of Virtualization (SOSP ’03)

• Disco: Running Commodity Operating Systems on Scalable Multiprocessors (SOSP ’97)

• Many more …


Additional Topics

I/O Virtualization

Memory Management


I/O Virtualization Stack

Guest Device Driver

Virtual Device

• Model existing device, e.g. e1000

• Model an idealized device, e.g. vmxnet

Virtualization Layer

• Emulates the virtual device

• Remaps guest and real I/O addresses

• Multiplexes and drives physical device

• Provides additional features, e.g. transparent NIC teaming

Real Device

• Physical hardware, e.g. bcm5700

• Likely to be different than virtual device

[Figure: I/O virtualization stack; labels: Guest OS, Device Driver (guest), Device Emulation, I/O Stack, Device Driver (physical)]


I/O Virtualization Implementations

[Figure: three I/O stacks side by side, built from Guest OS, Device Driver, Device Emulation, I/O Stack, Device Manager, and Host OS/Dom0/Parent Domain boxes]

Emulated I/O

• Hosted or Split: VMware Workstation, VMware Server, VMware ESX Server (for slow devices), Xen, Microsoft Viridian, Virtual Server

• Hypervisor Direct: VMware ESX Server (storage and network)

Passthrough I/O

• A future option, with many challenges


Passthrough I/O Virtualization

High Performance

• Guest drives device directly

• Minimizes CPU utilization

Enabled by HW Assists

• I/O-MMU for DMA isolation, e.g. Intel VT-d, AMD IOMMU

• Partitionable I/O device, e.g. PCI-SIG IOV spec

Challenges

• Hardware independence

• Migration, suspend/resume

• Memory overcommitment

[Figure: three guest OS device drivers each drive a virtual function (VF) of one I/O device through the I/O MMU; a device manager in the virtualization layer handles the physical function (PF)]

PF = Physical Function, VF = Virtual Function


Additional Topics

I/O Virtualization

Memory Management


Memory Management

Desirable capabilities

• Efficient memory overcommitment

• Accurate resource controls

• Exploit sharing opportunities

Challenges

• Allocations should reflect both importance and working set

• Best data to guide decisions known only to guest OS

• Guest and meta-level policies may clash


VMware Memory Management

Reclamation mechanisms

• Ballooning – guest driver allocates pinned PPNs, hypervisor deallocates backing MPNs

• Swapping – hypervisor transparently pages out PPNs, paged in on demand

• Page sharing – hypervisor identifies identical PPNs based on content, maps to same MPN copy-on-write

Allocation policies

• Proportional sharing – revoke memory from VM with minimum shares-per-page ratio

• Idle memory tax – charge VM more for idle pages than for active pages to prevent unproductive hoarding
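The two allocation policies can be combined in one adjusted shares-per-page ratio. The sketch below assumes the form ρ = S / (P · (f + k · (1 − f))) with idle page cost k = 1 / (1 − τ), where S is shares, P is allocated pages, f is the active fraction, and τ is the tax rate; this follows the ESX Server memory management paper listed under Further Reading, and should be treated as an assumption here rather than something stated on the slide. The VM records and values are hypothetical.

```python
def adjusted_shares_per_page(shares: float, pages: float,
                             active_fraction: float, tax_rate: float) -> float:
    """Shares-per-page ratio with idle pages charged k times an active page."""
    k = 1.0 / (1.0 - tax_rate)  # idle page cost; tax_rate must be < 1
    return shares / (pages * (active_fraction + k * (1.0 - active_fraction)))


def pick_victim(vms: list, tax_rate: float) -> str:
    """Revoke memory from the VM with the minimum adjusted ratio."""
    victim = min(vms, key=lambda vm: adjusted_shares_per_page(
        vm["shares"], vm["pages"], vm["active"], tax_rate))
    return victim["name"]


# Two hypothetical VMs with equal shares and allocations; B is mostly idle.
vms = [
    {"name": "A", "shares": 1000, "pages": 1000, "active": 1.0},
    {"name": "B", "shares": 1000, "pages": 1000, "active": 0.2},
]
victim = pick_victim(vms, tax_rate=0.75)
ratio_b = adjusted_shares_per_page(1000, 1000, 0.2, 0.75)
```

With a 75% tax, k is 4, so B's ratio drops to 1/3.4 and B is the VM that loses memory. With a tax rate of 0 the ratio reduces to plain shares per page.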


Ballooning

[Figure: a balloon inside the guest OS, shown inflating and deflating]

• Inflate balloon (+ pressure): guest may page out to virtual disk

• Deflate balloon (– pressure): guest may page in from virtual disk

• Guest OS manages memory

• Implicit cooperation
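A toy model of ballooning, assuming a guest described only by page counts. The `Guest` class and its fields are hypothetical, and the guest's own paging policy is reduced to "page out exactly the shortfall".

```python
class Guest:
    def __init__(self, total_pages: int, used_pages: int):
        self.total = total_pages
        self.used = used_pages  # guest data currently resident in memory
        self.balloon = 0        # pages pinned by the balloon driver
        self.swapped = 0        # pages the guest paged out to its virtual disk

    def inflate(self, n: int) -> int:
        """Balloon driver pins n more pages; the guest pages out if it must.

        Returns the number of PPNs whose backing MPNs the hypervisor
        may now deallocate. Assumes n does not exceed what the guest holds.
        """
        self.balloon += n
        shortfall = self.used + self.balloon - self.total
        if shortfall > 0:
            self.used -= shortfall     # guest OS picks what to evict
            self.swapped += shortfall
        return n

    def deflate(self, n: int) -> int:
        """Release up to n balloon pages back to the guest."""
        n = min(n, self.balloon)
        self.balloon -= n
        return n


guest = Guest(total_pages=100, used_pages=70)
guest.inflate(20)  # fits in free memory, no paging
guest.inflate(20)  # 10 pages short, guest pages them out
released = guest.deflate(30)
```

The hypervisor never chooses which guest pages to evict; the guest's own memory manager does, which is the implicit cooperation the slide refers to.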


Page Sharing

Motivation

• Multiple VMs running same OS, apps

• Collapse redundant copies of code, data, zeros

Transparent page sharing

• Map multiple PPNs to single MPN copy-on-write

• Pioneered by Disco [Bugnion ’97], but required guest OS hooks

Content-based sharing

• General-purpose, no guest OS changes

• Background activity saves memory over time
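Content-based sharing can be sketched as a hash table keyed on page contents, with a full comparison before sharing and copy-on-write when a shared page is written. SHA-1 is used here only as a convenient stand-in hash, and the `PageSharer` class is hypothetical; the sketch ignores the background scan order and treats every scanned page as a sharing candidate.

```python
import hashlib


class PageSharer:
    def __init__(self):
        self.machine = {}   # MPN -> page contents
        self.refs = {}      # MPN -> reference count
        self.by_hash = {}   # content hash -> MPN
        self.pmap = {}      # (vm, PPN) -> MPN
        self.next_mpn = 0

    def _alloc(self, data: bytes) -> int:
        mpn = self.next_mpn
        self.next_mpn += 1
        self.machine[mpn] = data
        self.refs[mpn] = 1
        return mpn

    def map_page(self, vm: int, ppn: int, data: bytes) -> int:
        """Scan a page: share an identical copy if one exists, else allocate."""
        h = hashlib.sha1(data).digest()
        mpn = self.by_hash.get(h)
        if mpn is not None and self.machine[mpn] == data:
            self.refs[mpn] += 1   # full compare passed, so share the frame
        else:
            mpn = self._alloc(data)
            self.by_hash[h] = mpn
        self.pmap[(vm, ppn)] = mpn
        return mpn

    def write(self, vm: int, ppn: int, data: bytes) -> int:
        """Copy-on-write: give the writer a private copy if the page is shared."""
        mpn = self.pmap[(vm, ppn)]
        if self.refs[mpn] > 1:
            self.refs[mpn] -= 1
            mpn = self._alloc(data)
            self.pmap[(vm, ppn)] = mpn
        else:
            old = hashlib.sha1(self.machine[mpn]).digest()
            if self.by_hash.get(old) == mpn:
                del self.by_hash[old]   # hash entry would otherwise go stale
            self.machine[mpn] = data
        return mpn


sharer = PageSharer()
zero_page = bytes(4096)
a = sharer.map_page(1, 0x10, zero_page)
b = sharer.map_page(2, 0x20, zero_page)       # same contents, same MPN
c = sharer.write(2, 0x20, b"x" * 4096)        # breaks sharing for VM 2 only
```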


Page Sharing: Scan Candidate PPN

[Figure: VM 1, VM 2, and VM 3 mapped onto machine memory; the contents of a candidate page (011010110101010111101100) are hashed to …2bd806af and looked up in the hash table]

Hint frame: Hash …06af, VM 3, PPN 43f8, MPN 123b


Page Sharing: Successful Match

[Figure: VM 1, VM 2, and VM 3 mapped onto machine memory after a successful match in the hash table]

Shared frame: Hash …06af, Refs 2, MPN 123b