Fall 2014 :: CSE 506 :: Section 2 (PhD)

Virtual Machines

Heyi Li and Zhen Cao

(Some of the figures are from the Internet)

Outline

• Basic concepts
• When virtual is better
• Implementation
• When virtual is harder

Basic Concepts

• What is a virtual machine?
– An emulation of a particular computer system
• System VM vs. process VM
– System VM: supports the execution of a complete OS (Xen)
– Process VM: supports the execution of a single process (JVM)
• Hypervisor (VMM)
– Computer software that creates and runs VMs
• Type 1 and Type 2 hypervisors

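[Figure: Type 1 (bare-metal): the hypervisor runs directly on the hardware (host) and VM1 and VM2 run on top of it (guests). Examples: VMware ESX, Microsoft Hyper-V, Xen.]

[Figure: Type 2 (hosted): the hypervisor runs next to ordinary processes on a hosting OS (host), with VM1 and VM2 on top of it (guests). Examples: VMware Workstation, Microsoft Virtual PC, Sun VirtualBox, QEMU, KVM.]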

Applications and Benefits

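• Energy efficiency
• Reducing maintenance costs
• Rapid deployment
• Security

[Figure: Server consolidation: separate machines HW0 ... HWn, each running one OS and its apps, are replaced by a single machine whose VMM hosts VM1 ... VMn, each with its own OS and apps.]

[Figure: Test and development: several VMs, each with its own OS and apps, run on one VMM on a single machine.]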

Virtualization Requirements

• Fidelity
– Software on the VM executes identically to its execution on hardware, barring timing effects
• Performance
– Performance overhead must be small
• Safety
– The VMM manages all hardware resources

Obstacles for x86

• Trap-and-emulate
– Requires that all virtualization-sensitive instructions also be privileged instructions
• The x86 architecture was once thought not to be fully virtualizable
– Certain instructions behave differently in unprivileged mode instead of trapping (POPF silently ignores changes to the interrupt flag)
– Certain unprivileged instructions can read privileged state (SGDT)
• Techniques to address the inability to virtualize x86
– Full virtualization without hardware support: binary translation (VMware ESX)
– Paravirtualization (Xen)
– Hardware-assisted virtualization
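The POPF obstacle can be illustrated with a toy model. This is only a sketch under simplifying assumptions: `ToyCPU`, `Trap`, and `run_deprivileged` are hypothetical names invented for illustration, the interrupt flag is the only modeled state, and IOPL is assumed to be 0 so that CLI traps in ring 3.

```python
class Trap(Exception):
    """Raised when a privileged instruction runs in user mode (toy model)."""


class ToyCPU:
    def __init__(self, ring):
        self.ring = ring                 # 0 = kernel mode, 3 = user mode
        self.interrupts_enabled = True   # models the IF bit of EFLAGS

    def cli(self):
        # CLI traps in user mode (IOPL assumed 0), so a VMM can emulate it.
        if self.ring != 0:
            raise Trap("cli")
        self.interrupts_enabled = False

    def popf(self, new_if):
        # POPF never traps; in user mode the IF update is silently dropped.
        if self.ring == 0:
            self.interrupts_enabled = new_if


def run_deprivileged(instruction, *args):
    """Run one guest-kernel instruction in ring 3, as trap-and-emulate would."""
    cpu = ToyCPU(ring=3)
    virtual_if = True                    # the VMM's copy of the guest's IF bit
    try:
        getattr(cpu, instruction)(*args)
    except Trap:
        virtual_if = False               # the VMM emulates the trapped CLI
    return virtual_if


cli_result = run_deprivileged("cli")            # trap taken, virtual IF updated
popf_result = run_deprivileged("popf", False)   # no trap, the VMM never notices
print(cli_result, popf_result)
```

In this model the guest asked to disable interrupts both times, but only the CLI request reaches the VMM; the POPF request is lost, which is the fidelity problem the techniques listed above work around.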

Binary Translation

Binary Translation

• Binary: input is binary x86 code, not source code
• On-the-fly: dynamic and on demand
• Only kernel-mode code needs to be translated
– User mode: direct execution
• Even for kernel mode, most instruction sequences don’t change
• Instructions that do change:
– Indirect control flow: call/ret, jmp
– PC-relative addressing
– Privileged instructions

1. A translation unit (TU) stops at 12 instructions or at a control-flow instruction
2. The TU is translated into a compiled code fragment (CCF) and cached
3. The translation cache is tracked with a hash table
4. The CCF is executed
5. Continuation (either fall-through or taken branch)

[Figure: the binary translator reads a TU at guest PC [x] and emits a CCF at [y] in the translation cache; the hash table records the pair ([x], [y]); arrows numbered 1 to 5 correspond to the steps listed here.]
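A minimal sketch of steps 1 to 3, assuming a toy instruction format in which each instruction is a tuple whose first element is the opcode. The names `read_translation_unit`, `translate`, `TranslationCache`, and the `vmm_emulate` marker are hypothetical and are not taken from any real translator.

```python
TU_LIMIT = 12                                   # a TU ends after 12 instructions...
CONTROL_FLOW = {"jmp", "call", "ret", "jcc"}    # ...or at a control-flow instruction
PRIVILEGED = {"cli", "sti"}                     # toy set of privileged opcodes


def read_translation_unit(code, pc):
    """Collect instructions from pc until the TU limit or a control-flow op."""
    unit = []
    while pc < len(code) and len(unit) < TU_LIMIT:
        unit.append(code[pc])
        pc += 1
        if unit[-1][0] in CONTROL_FLOW:
            break
    return unit


def translate(unit):
    """Copy most instructions unchanged; wrap privileged ones for the VMM."""
    return [("vmm_emulate", op) if op[0] in PRIVILEGED else op for op in unit]


class TranslationCache:
    def __init__(self, code):
        self.code = code
        self.table = {}          # hash table: guest PC -> compiled code fragment
        self.translations = 0    # how many TUs were actually translated

    def lookup(self, pc):
        if pc not in self.table:
            self.table[pc] = translate(read_translation_unit(self.code, pc))
            self.translations += 1
        return self.table[pc]


guest_code = [("mov",), ("cli",), ("add",), ("jmp", 0), ("sti",)]
cache = TranslationCache(guest_code)
first = cache.lookup(0)      # translated on demand
again = cache.lookup(0)      # served from the cache, no second translation
print(first, cache.translations)
```

Executing the fragment and following its continuation (steps 4 and 5) are left out; in this model they would amount to calling `lookup` again with the fall-through or branch-target PC.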

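Memory

[Figure: three address spaces, each spanning 0 to 4GB. The guest page table (visible to the guest OS) maps the guest virtual address (gVA) space to the guest physical address (gPA) space. The VMM PhysMap (Pmap), maintained by the VMM, maps the gPA space to the host physical address (hPA) space. The shadow page table, which resides in hardware and is maintained by the VMM, maps the gVA space directly to the hPA space.]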

Shadow Page Tables

• Translation from gVA to hPA is done directly by hardware
• If the mapping is not present, the hardware generates a page fault
• Hidden page fault: the mapping is present in the guest page table
– VMM walks the guest page table to determine the gPA backing that gVA
– VMM allocates a physical page and adds the mapping to Pmap
– VMM updates the shadow page table
• True page fault: the mapping is not present in the guest page table
– VMM generates an exception on the virtual CPU
– Execution resumes at the first instruction of the guest exception handler
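The two fault cases can be sketched with dictionaries standing in for page tables. This is an illustrative model only: `ShadowMMU`, `GuestPageFault`, and the toy free-page counter are hypothetical, and every table is a flat page-number map rather than a multi-level structure.

```python
class GuestPageFault(Exception):
    """A true page fault that must be forwarded to the guest OS."""


class ShadowMMU:
    def __init__(self, guest_page_table):
        self.guest_pt = guest_page_table  # gVA page -> gPA page (guest-visible)
        self.pmap = {}                    # gPA page -> hPA page (VMM-maintained)
        self.shadow = {}                  # gVA page -> hPA page (used by hardware)
        self.next_free_host_page = 100    # arbitrary start of a toy free list
        self.hidden_faults = 0

    def translate(self, gva_page):
        if gva_page in self.shadow:       # hit: hardware translates directly
            return self.shadow[gva_page]
        return self.handle_fault(gva_page)

    def handle_fault(self, gva_page):
        if gva_page not in self.guest_pt:
            # True fault: the guest has no mapping, so inject the exception.
            raise GuestPageFault(gva_page)
        # Hidden fault: walk the guest table, back the gPA, fill the shadow entry.
        self.hidden_faults += 1
        gpa_page = self.guest_pt[gva_page]
        if gpa_page not in self.pmap:
            self.pmap[gpa_page] = self.next_free_host_page
            self.next_free_host_page += 1
        self.shadow[gva_page] = self.pmap[gpa_page]
        return self.shadow[gva_page]


mmu = ShadowMMU({0: 7, 1: 7})    # two guest virtual pages alias gPA page 7
first = mmu.translate(0)         # hidden fault: allocates a host page
second = mmu.translate(0)        # shadow hit: no fault
alias = mmu.translate(1)         # hidden fault, but Pmap already backs gPA 7
try:
    mmu.translate(2)
    true_fault = False
except GuestPageFault:
    true_fault = True            # the guest OS must handle this one
```

The guest never observes the hidden faults; only the unmapped access to page 2 is delivered to it.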

I/O Virtualization – Direct I/O Model

• Places drivers for high-performance I/O devices directly in the hypervisor
• Does not attempt to make the virtual hardware match the specific underlying hardware
• Virtualizes selected, canonical I/O devices
• Problems
– Larger hypervisor
– Need to protect the hypervisor from driver faults

[Figure: Full virtualization: guest VMs VM0 ... VMn (guest OS and apps) run on a hypervisor that contains the I/O services and device drivers for the shared devices.]

Paravirtualization

CPU Virtualization

• Privilege levels in x86
– Ring 0: Xen
– Ring 1: guest OS
– Ring 3: user apps
• Isolation
– Between guest user mode and guest kernel mode: page table “supervisor” bit (PTE_U)
– Between guest OS and VMM: segmentation, which is a problem on x86-64

CPU Virtualization (cont.)

• Privileged instructions
– Replaced by hypercalls
– Requires modifying the guest OS source code
– Validated and executed by Xen (e.g., installing a new page table)
• Exceptions
– Handlers are registered with Xen once and accepted (validated) if they do not need to execute in ring 0
– Called directly without Xen intervention
– All syscalls from apps to the guest OS are handled this way (and executed in ring 1)
• Page fault handlers are special
– The faulting address can be read only in ring 0
– Xen reads the faulting address and passes it via the stack to the OS handler in ring 1

Memory Virtualization

• Physical memory
– At domain creation, hardware pages are “reserved”
– A domain can increase or decrease its quota
– Xen does not guarantee that the hardware pages are contiguous
• Virtual memory
– Guest OS page tables are registered directly with the MMU
– Guest OS allocates and initializes a page from its own memory reservation and registers it with Xen
  • Every guest OS has its own address space
  • Xen occupies the top 64MB of every address space, to save the cost of switching address spaces on hypervisor calls
– Xen is involved only in page table updates
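The reservation and validation idea can be sketched as follows. This is a toy model, not Xen's interface: the `Hypervisor` class and its `reserve` and `update_page_table` methods are hypothetical names, and the only check modeled is frame ownership.

```python
class Hypervisor:
    def __init__(self):
        self.owner = {}          # hardware frame number -> owning domain id
        self.page_tables = {}    # domain id -> {virtual page: hardware frame}

    def reserve(self, domain, frames):
        """At domain creation, reserve (possibly non-contiguous) hardware frames."""
        for frame in frames:
            self.owner[frame] = domain
        self.page_tables[domain] = {}

    def update_page_table(self, domain, virtual_page, frame):
        """Toy hypercall: validate the requested mapping, then install it."""
        if self.owner.get(frame) != domain:
            return False         # reject: the frame is not in this reservation
        self.page_tables[domain][virtual_page] = frame
        return True


xen = Hypervisor()
xen.reserve(1, [3, 9, 12])       # domain 1 gets scattered frames
xen.reserve(2, [4, 5])
accepted = xen.update_page_table(1, 0, 9)   # frame 9 belongs to domain 1
rejected = xen.update_page_table(1, 1, 4)   # frame 4 belongs to domain 2
```

Reads of the page table need no hypervisor involvement in this model; only the update path goes through validation.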

I/O Virtualization – Indirect I/O Model

• Uses a privileged virtual machine (Domain0) for all device drivers
• Simple interfaces for guest OSes
• Pros
– Higher security
• Cons
– Lower performance

[Figure: Paravirtualization: service VMs hold the I/O services and device drivers for the shared devices; guest VMs VM0 ... VMn (guest OS and apps) run beside them on the hypervisor.]

Hardware-assisted Virtualization (HVM)

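Intel’s VT-x

• More-privileged mode for the VMM
• Less-privileged mode for the guest OS
• Eliminates ring de-privileging for the guest OS

[Figure: each virtual machine (VM) runs its apps in ring 3 and its OS in ring 0; the VM monitor (VMM) runs in VMX root mode; a VM exit transfers control from a VM to the VMM and a VM entry transfers it back.]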

VM Control Structure (VMCS)

• Execution controls determine when exits occur
– Access to privileged state, occurrence of exceptions, etc.
– Flexibility provided to avoid unwanted exits
• Guest-state area
– Processor state is saved into the guest-state area on VM exits and loaded from it on VM entries
• Host-state area
– Processor state is loaded from the host-state area on VM exits
• Other
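The roles of the three VMCS parts can be sketched with a toy model. `ToyVMCS` and `ToyVTxCPU` are hypothetical names, register state is reduced to a dictionary, and the execution controls are reduced to a set of event names; the real VMCS layout is far richer.

```python
from dataclasses import dataclass, field


@dataclass
class ToyVMCS:
    exit_on: set = field(default_factory=set)        # execution controls
    guest_state: dict = field(default_factory=dict)  # saved on exit, loaded on entry
    host_state: dict = field(default_factory=dict)   # loaded on exit


class ToyVTxCPU:
    def __init__(self, vmcs, host_registers):
        self.vmcs = vmcs
        self.vmcs.host_state = dict(host_registers)
        self.registers = dict(host_registers)
        self.in_guest = False

    def vm_entry(self):
        self.registers = dict(self.vmcs.guest_state)   # load guest-state area
        self.in_guest = True

    def guest_event(self, event):
        """Return True if the event caused a VM exit to the VMM."""
        if event not in self.vmcs.exit_on:
            return False                               # guest handles it, no exit
        self.vmcs.guest_state = dict(self.registers)   # save guest-state area
        self.registers = dict(self.vmcs.host_state)    # load host-state area
        self.in_guest = False
        return True


vmcs = ToyVMCS(exit_on={"cpuid"}, guest_state={"rip": 0x1000})
cpu = ToyVTxCPU(vmcs, {"rip": 0xF000})
cpu.vm_entry()
cpu.registers["rip"] = 0x1004                 # the guest makes progress
no_exit = cpu.guest_event("page_fault")       # not selected in the controls
exited = cpu.guest_event("cpuid")             # selected, so control returns to the VMM
```

Leaving `page_fault` out of `exit_on` is the kind of flexibility the execution controls provide for avoiding unwanted exits.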

Extended Page Table (EPT)

• A new page-table structure, under the control of the VMM
– Defines the mapping between GPA and HPA
– EPT base pointer (new VMCS field) points to the EPT page tables
– EPT (optionally) activated on VM entry, deactivated on VM exit
• Guest has full control over its own IA-32 page tables
– No VM exits due to guest page faults, INVLPG, or CR3 changes

[Figure: CR3 points to the guest page tables, which translate a guest linear address into a guest physical address; the EPT base pointer (EPTP) points to the extended page tables, which translate the guest physical address into a host physical address.]
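The two-stage translation can be sketched with single-level tables. This is a simplification: real guest and EPT tables are multi-level, and the guest's own page-table accesses are themselves translated through the EPT. The function names `walk` and `translate_with_ept` are hypothetical.

```python
PAGE_SIZE = 4096


def walk(table, address):
    """Translate an address through a single-level page table (page -> frame)."""
    page, offset = divmod(address, PAGE_SIZE)
    return table[page] * PAGE_SIZE + offset


def translate_with_ept(guest_page_table, ept, guest_linear_address):
    gpa = walk(guest_page_table, guest_linear_address)  # guest-controlled (CR3)
    return walk(ept, gpa)                               # VMM-controlled (EPTP)


guest_page_table = {0: 5}        # guest linear page 0 -> guest physical page 5
ept = {5: 42, 6: 43}             # guest physical pages -> host physical pages
hpa = translate_with_ept(guest_page_table, ept, 0x123)

guest_page_table[0] = 6          # the guest remaps its page without the VMM
hpa_after_remap = translate_with_ept(guest_page_table, ept, 0x123)
print(hex(hpa), hex(hpa_after_remap))
```

Because the guest edits only its own table and the VMM edits only the EPT, the remap in the last lines needs no VMM involvement, unlike the shadow page table scheme.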

I/O Virtualization

[Figure: three I/O virtualization models side by side]

• Full virtualization: the hypervisor contains the I/O services and device drivers for the shared devices; VM0 ... VMn run guest OSes and apps
• Paravirtualization: service VMs contain the I/O services and device drivers for the shared devices; guest VMs VM0 ... VMn run beside them on the hypervisor
• Pass-through model: each guest VM (VM0 ... VMn) runs its own device drivers for the devices assigned to it

IOMMU

• Device pass-through
– Directly assign a physical device to a particular guest OS
– Address space translation handled transparently
• Device isolation
– Safely map a device to a particular guest without risking the integrity of other guests

IOMMU

• Translation Control Entry
– Translation from a DMA address to a host memory address
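A minimal sketch of per-device DMA translation, assuming flat per-device tables of translation entries. `ToyIOMMU`, `DmaFault`, and the device names are hypothetical; real IOMMUs use hardware-defined table formats.

```python
PAGE_SIZE = 4096


class DmaFault(Exception):
    """Raised when a device touches a DMA page it has no translation entry for."""


class ToyIOMMU:
    def __init__(self):
        self.tables = {}   # device id -> {DMA page: host page}

    def map(self, device, dma_page, host_page):
        """Install one translation entry for a device."""
        self.tables.setdefault(device, {})[dma_page] = host_page

    def translate(self, device, dma_address):
        """Translate a device's DMA address to a host memory address."""
        page, offset = divmod(dma_address, PAGE_SIZE)
        entries = self.tables.get(device, {})
        if page not in entries:
            raise DmaFault((device, dma_address))
        return entries[page] * PAGE_SIZE + offset


iommu = ToyIOMMU()
iommu.map("nic0", 0, 8)                      # nic0 is assigned to one guest
host_address = iommu.translate("nic0", 0x10)
try:
    iommu.translate("disk1", 0x10)           # disk1 has no entry for this page
    isolated = False
except DmaFault:
    isolated = True
```

Keeping a separate table per device is what gives the isolation property: a device assigned to one guest cannot reach host pages mapped only for another device.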

Security Problems

• Transience
– Large numbers of machines appear on and disappear from the network sporadically
• Diversity
– Long and painful upgrade cycles
• Identity
– Difficult to establish who owns a VM running on a particular physical host
• Mobility
– VMs can be easily copied over a network or carried on portable storage media

Discussion

Thanks!