Virtual Machine Performance
Qian Lin
Dec. 9th, 2010
Trusted Computing Review (TCR) 2010, section 2
Related topics
• Optimization for VM performance improvement
• Measurement: tools & methods
• High performance computing in virtual machines
Background
• Performance is a permanent issue!
  – no best, but better
  – global optimization -> infrastructure, architecture, ...
  – local optimization -> CPU, memory, I/O, storage, ...
• How to arbitrate the performance?
  – principles & standards vs. feasibility
  – tools & methods vs. implementation
• Various applications focus on different aspects
  – application deployment
  – case study
Related conferences
• First-tier
  – SOSP, OSDI, ASPLOS, ISCA, USENIX ATC, EuroSys
  – PPoPP, HPDC, ICDCS, NSDI
• Second-tier
  – VEE, HPCA, PACT, SC, ICS, IPDPS, IISWC, Euro-Par, CLUSTER
• Others
  – GCC, HiPC, SAC, ICPADS
  – HPCVirt
Virtualization infrastructure
• Operating system support for virtual machines. USENIX ATC’03
  – examine and reduce the large overhead for Type II VMMs (e.g., SimOS, UML, UMLinux)
Virtualization infrastructure
• Xen and the art of virtualization. SOSP’03
• Xen and the art of repeated research. USENIX ATC’04
Virtualization infrastructure
• A comparison of software and hardware techniques for x86 virtualization. ASPLOS’06
  – conclusion: the hardware VMM delivers lower performance than the pure software VMM
  – shortcomings of the hardware VMM
    • no support for MMU virtualization
    • fails to co-exist with existing software techniques for MMU virtualization
Looking ahead: nested paging hardware
Virtualization infrastructure
• Accelerating two dimensional page walks for virtualized systems. ASPLOS’08
  – present an in-depth examination of the 2D page table walk overhead and options for decreasing it
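To make the 2D page walk overhead concrete, the sketch below counts the memory references of one nested walk. It assumes an n-level guest page table, an m-level host (nested) page table, and no hits in the TLB or any page-walk cache; the function names are illustrative and not taken from the paper.

```python
def nested_walk_refs(guest_levels: int, host_levels: int) -> int:
    """Memory references for one 2D (nested) page walk without any caching.

    Every guest page-table entry sits at a guest-physical address, so reading
    it costs one full host walk (host_levels references) plus the read of the
    entry itself (1 reference). The guest-physical address of the data then
    needs one final host walk.
    """
    if guest_levels < 1 or host_levels < 1:
        raise ValueError("page tables need at least one level")
    per_guest_level = host_levels + 1
    final_translation = host_levels
    return guest_levels * per_guest_level + final_translation


def native_walk_refs(levels: int) -> int:
    """Memory references for a native (1D) page walk without any caching."""
    return levels


if __name__ == "__main__":
    # Four-level guest and host tables, as on x86-64 with 4 KB pages.
    print(nested_walk_refs(4, 4))  # 24
    print(native_walk_refs(4))     # 4
```

With four levels on each side the nested walk needs 24 references against 4 for a native walk, which is why caching parts of the walk pays off.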
Virtualization infrastructure
• Virtualizing I/O devices on VMware workstation’s hosted virtual machine monitor. USENIX ATC’01
  – architecture design
  – performance evaluation
Optimization
• Satori: Enlightened page sharing. USENIX ATC’09
  – system for sharing memory in virtualized systems
  – detect sharing opportunities and manage the surplus memory
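The general idea of page sharing can be sketched as follows. This is a generic content-hash deduplication with copy-on-write, not Satori's actual detection mechanism, which the slide does not describe; the class and method names are hypothetical.

```python
import hashlib

PAGE_SIZE = 4096  # assumed page size in bytes


class SharedPagePool:
    """Toy model: identical pages from different VMs share one frame."""

    def __init__(self):
        self._frames = {}    # digest -> [content, reference count]
        self._mappings = {}  # (vm, page_no) -> digest

    def map_page(self, vm, page_no, content: bytes):
        if len(content) != PAGE_SIZE:
            raise ValueError("content must be exactly one page")
        self.unmap_page(vm, page_no)
        # A real system would also compare bytes to rule out hash collisions.
        digest = hashlib.sha256(content).hexdigest()
        frame = self._frames.get(digest)
        if frame is None:
            self._frames[digest] = [bytes(content), 1]
        else:
            frame[1] += 1  # sharing opportunity: reuse the existing frame
        self._mappings[(vm, page_no)] = digest

    def unmap_page(self, vm, page_no):
        digest = self._mappings.pop((vm, page_no), None)
        if digest is None:
            return
        frame = self._frames[digest]
        frame[1] -= 1
        if frame[1] == 0:
            del self._frames[digest]

    def write_page(self, vm, page_no, content: bytes):
        # Copy-on-write: the writer gets its own frame, other sharers keep theirs.
        self.map_page(vm, page_no, content)

    def read_page(self, vm, page_no) -> bytes:
        return self._frames[self._mappings[(vm, page_no)]][0]

    def surplus_pages(self) -> int:
        """Frames saved by sharing, i.e. memory that can be handed out again."""
        return len(self._mappings) - len(self._frames)


if __name__ == "__main__":
    pool = SharedPagePool()
    zero_page = bytes(PAGE_SIZE)
    pool.map_page("vm1", 0, zero_page)
    pool.map_page("vm2", 0, zero_page)
    print(pool.surplus_pages())  # 1
```

The surplus counter is the part a manager would act on: every shared frame is memory that can be redistributed, and it has to be reclaimed when a write breaks the sharing.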
Optimization
• High performance VMM-Bypass I/O in virtual machines. USENIX ATC’06
  – allows time-critical I/O operations to be carried out directly in guest VMs without involvement of the VMM and/or a privileged VM
Optimization
• Optimizing network virtualization in Xen. USENIX ATC’06
  – redefine the virtual network interfaces of guest domains to incorporate high-level network offload features
  – optimize the implementation of the data transfer path between guest and driver domains
  – provide support for guest operating systems to effectively utilize advanced virtual memory features such as superpages and global page mappings
Optimization
• High performance and scalable I/O virtualization via self-virtualized devices. HPDC’07
  – self-virtualized devices, which offload selected virtualization functionality from the hypervisor
  – self-virtualized network interface (SV-NIC)
Optimization
• Bridging the gap between software and hardware techniques for I/O virtualization. USENIX ATC’08
  – Problem 1: paravirtualized I/O causes high CPU overhead.
  – Problem 2: direct I/O removes the benefits of the driver domain model.
  – Solution: bridge the performance gap between the driver domain model and direct I/O
Optimization
• XenLoop: a transparent high performance inter-VM network loopback. HPDC’08
  – a fully transparent, high-performance inter-VM loopback channel
  – intercepts outgoing network packets and shepherds the packets destined to co-resident VMs through a high-speed inter-VM shared memory channel
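The interception step can be pictured as a routing decision on each outgoing packet. The sketch below is a simplified model under stated assumptions: the class name is hypothetical, an in-process queue stands in for the shared memory channel, and a list stands in for the normal network path.

```python
from collections import deque


class LoopbackSwitch:
    """Toy model of routing packets between co-resident VMs."""

    def __init__(self, co_resident_vms):
        # One queue per co-resident VM stands in for a shared-memory channel.
        self._channels = {vm: deque() for vm in co_resident_vms}
        # Packets for remote destinations take the standard network path.
        self.network_path = []

    def send(self, dst, payload) -> str:
        channel = self._channels.get(dst)
        if channel is not None:
            channel.append(payload)  # fast path: destination is on this host
            return "shared-memory"
        self.network_path.append((dst, payload))  # slow path: leave the host
        return "network"

    def receive(self, vm):
        """Return the next packet queued for vm, or None if there is none."""
        channel = self._channels[vm]
        return channel.popleft() if channel else None


if __name__ == "__main__":
    switch = LoopbackSwitch(["vm1", "vm2"])
    print(switch.send("vm2", b"hello"))    # shared-memory
    print(switch.send("remote", b"ping"))  # network
    print(switch.receive("vm2"))           # b'hello'
```

Because the decision happens below the socket layer in this model, the sender issues the same `send` call either way, which is the sense in which the loopback is transparent.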
Optimization
• Virtualization Polling Engine (VPE): Using dedicated CPU cores to accelerate I/O virtualization. ICS’09
  – takes advantage of dedicated CPU cores to help with the virtualization of I/O devices by using an event-driven execution model with dedicated polling threads.
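An event-driven model with a dedicated polling thread might look like the sketch below. It is only an illustration: the class and method names are hypothetical, a Python thread stands in for a pinned CPU core, and a short blocking timeout replaces the busy polling of device rings that a real engine would do.

```python
import queue
import threading


class PollingEngine:
    """Toy event-driven engine with one dedicated polling thread."""

    def __init__(self):
        self._events = queue.Queue()
        self._handlers = {}
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._poll, daemon=True)

    def register(self, kind, handler):
        self._handlers[kind] = handler

    def start(self):
        self._thread.start()

    def post(self, kind, data):
        """Called by producers (e.g. a virtual device) to raise an event."""
        self._events.put((kind, data))

    def _poll(self):
        # The dedicated thread does nothing but look for events and dispatch.
        while not self._stop.is_set():
            try:
                kind, data = self._events.get(timeout=0.01)
            except queue.Empty:
                continue
            try:
                handler = self._handlers.get(kind)
                if handler is not None:
                    handler(data)
            finally:
                self._events.task_done()

    def drain_and_stop(self):
        self._events.join()  # wait until every posted event was handled
        self._stop.set()
        self._thread.join()


if __name__ == "__main__":
    received = []
    engine = PollingEngine()
    engine.register("rx", received.append)
    engine.start()
    for packet in (b"a", b"b", b"c"):
        engine.post("rx", packet)
    engine.drain_and_stop()
    print(received)  # [b'a', b'b', b'c']
```

Handlers run on the polling thread only, so producers never pay the handling cost on their own cores.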
Optimization
• High performance network virtualization with SR-IOV. HPCA’09
Optimization
• I/O scheduling model of virtual machine based on multi-core dynamic partitioning. HPDC’10
  – Problem: scheduling of I/O tasks is treated as a secondary concern compared with scheduling of processor resources.
    • This can seriously degrade I/O performance and make virtualization less desirable for I/O-intensive applications.
  – Solution: monitor I/O operations and divide the processor cores into 3 subsets, each handling a different task.
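Dynamic partitioning can be sketched as a function that re-splits the cores whenever the monitored I/O load changes. The slide does not name the three subsets or the partitioning policy, so the subset names ("driver", "io_guest", "general"), the load metric, and the proportional rule below are all assumptions made for illustration.

```python
def partition_cores(cores, io_load):
    """Split cores into three subsets according to the observed I/O load.

    cores:   list of core ids (at least three)
    io_load: assumed metric in [0, 1], the share of recent activity that is I/O

    Returns a dict with three hypothetical roles:
      "driver"   - one core reserved for the driver domain
      "io_guest" - cores for I/O-intensive guest work
      "general"  - cores for everything else (never empty)
    """
    if len(cores) < 3:
        raise ValueError("need at least three cores to form three subsets")
    if not 0.0 <= io_load <= 1.0:
        raise ValueError("io_load must be within [0, 1]")

    driver, rest = cores[:1], cores[1:]
    # Give I/O work a proportional share, but always keep one general core.
    n_io = min(len(rest) - 1, int(io_load * len(rest)))
    return {"driver": driver, "io_guest": rest[:n_io], "general": rest[n_io:]}


if __name__ == "__main__":
    cores = list(range(8))
    print(partition_cores(cores, 0.5))
    # {'driver': [0], 'io_guest': [1, 2, 3], 'general': [4, 5, 6, 7]}
```

Calling the function again with a fresh `io_load` sample moves cores between the subsets, which is the dynamic part of the scheme.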
Measurement
• Measuring CPU overhead for I/O processing in the Xen virtual machine monitor. USENIX ATC’05
  – a lightweight monitoring system
  – measure the CPU usage of different virtual machines caused by I/O processing
  – “page-flipping” technique of Xen
    • the memory page containing the I/O data in the driver domain is exchanged with an unused page provided by the guest OS.
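Page flipping, as described above, swaps page ownership instead of copying data. The sketch below models each domain's memory as a dictionary from slot number to page buffer; the function name and this representation are assumptions for illustration only.

```python
def page_flip(driver_pages, guest_pages, driver_slot, guest_slot):
    """Exchange one driver-domain page with one unused guest page.

    No payload bytes are copied: only the references (standing in for
    page-frame ownership) change hands.
    """
    io_page = driver_pages[driver_slot]
    unused_page = guest_pages[guest_slot]
    guest_pages[guest_slot] = io_page        # guest now owns the I/O data
    driver_pages[driver_slot] = unused_page  # driver gets a fresh page back


if __name__ == "__main__":
    packet = bytearray(b"payload".ljust(4096, b"\0"))
    unused = bytearray(4096)
    driver = {0: packet}
    guest = {7: unused}
    page_flip(driver, guest, 0, 7)
    print(guest[7] is packet)   # True: same buffer, no copy was made
    print(driver[0] is unused)  # True
```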
Measurement
• Diagnosing performance overheads in the Xen virtual machine environment. VEE’05
  – Xenoprof: a system-wide statistical profiling toolkit implemented for Xen
    • enables coordinated profiling of multiple VMs in a system to obtain the distribution of hardware events (e.g., clock cycles, cache and TLB misses)
  – use the toolkit to analyze performance overheads incurred by networking applications running in Xen VMs
Measurement
• Xenprobes, a lightweight user-space probing framework for Xen virtual machine. USENIX ATC’07
  – a lightweight framework to probe the guest kernels
  – useful for various purposes
    • monitor real-time status of production systems
    • analyze performance bottlenecks
    • log specific events to trace problems
  – introduces some unique advantages
    • puts the breakpoint handlers in user-space => easy to use
    • allows probing multiple guests at the same time
    • supports all kinds of OSes supported by Xen
Measurement
• An analysis of HPC benchmarks in virtual machine environments. Euro-Par’08
  – Problem: predicting application performance is very difficult in virtual environments.
  – Research: investigate the behavior and identify patterns of various overheads for HPC benchmark applications.
Measurement
• Application performance modeling in a virtualized environment. HPCA’09
  – build performance models for applications in virtualized environments
  – propose an iterative model training technique based on artificial neural networks which is found to be accurate across a range of applications
Measurement
• Performance comparison of two virtual machine scenarios using an HPC application. HPCVirt’09
  – compare the performance implications using an HPC application
  – two VM node configurations
    • 2 VMs with 1 process/VM
    • 1 VM with 2 processes/VM
  – the difference in overall performance impact is around 3%
HPC
• A case for high performance computing with virtual machines. ICS’06
  – Two key ideas: VMM bypass I/O and scalable VM image management.
HPC
• Virtualization for high-performance computing. OSR 2006 (vol. 40)
  – discuss the trends, motivations, and issues in hardware virtualization with emphasis on their value in HPC environments
HPC
• Improving performance by embedding HPC applications in lightweight Xen domains. HPCVirt’08
  – HPC application and its execution environment can be embedded within a lightweight guest domain
Summary: research areas
• Reduce virtualization overhead
  – infrastructure
    • Xen vs. KVM vs. VMware
    • cloud computing related
  – CPU and memory
    • At the low level, software strategies are becoming less important than hardware support.
    • At the high level, optimization increasingly comes from algorithms rather than architecture.
  – I/O
    • continues to be a hot topic!
    • network, disk, filesystem, ...
Summary: research areas
• Measurement and tools
  – benchmark
  – diagnosis and performance bottleneck
  – implementation of practical tools
• Application driven performance improvement
  – behavior analysis of specific applications, especially behavior that triggers virtualization overhead
  – locally optimize and customize VMs for specific application scenarios
Our past work
• Optimizing virtual machines using hybrid virtualization. SAC’11
TCR: to be expected ...
• VM security
• Virtualization technology and platform
• Novel memory architecture
• Cloud computing
• App. case study under virtualization environment
• VM miscellaneous (e.g., migration, time keeping)