VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

Post on 12-Jan-2016

Page 1: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

VMware Virtualization of Oracle and Java

Scott B. Drummonds

Tim Harris

Page 2: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

Agenda

VMware Virtualization Overview

Architecture, performance, and overheads

Best Practices

Oracle, Java

Performance Data

Scaling, large memory pages, AMD RVI

Conclusions

Page 3: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

VMware Virtualization of Oracle and Java

Page 4: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

VMware Infrastructure 3 Architecture

Page 5: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

VMware ESX Virtualization Architecture

CPU resources are controlled by the scheduler and virtualized by the monitor

Memory is allocated by the VMkernel and virtualized by the monitor

Network and I/O devices are emulated and proxied through native device drivers

[Diagram labels: Guest, TCP/IP, File System, Monitor, Virtual NIC, Virtual SCSI, VMkernel, Scheduler, Memory Allocator, Virtual Switch, File System, NIC Drivers, I/O Drivers, Physical Hardware]

Page 6: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

Speeding Up Virtualization

Privileged instruction virtualization

Traps from de-privileging or ring compression to handle privileged instructions

Memory virtualization

Memory partitioning and allocation of physical memory

Device and I/O virtualization

Routing I/O requests between virtual devices and physical HW

Where are the various virtualization performance hits?

Page 7: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

Multi-Mode Monitors

There are different types of monitors for different workloads and CPU types

VMware ESX provides a dynamic framework to allow the best monitor for the workload

[Diagram labels: Guest, Binary Translation, Guest, Para-Virtualization, Guest, Hardware Assist, Virtual NIC, Virtual SCSI, VMkernel, Scheduler, Memory Allocator, Virtual Switch, File System, NIC Drivers, I/O Drivers, Physical Hardware]

Page 8: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

MMU Virtualization without hardware support

Guest maintains sets of page tables

In native execution, they would have been used for address translation

Virtual Machine Monitor (VMM) maintains a set of shadow page tables

There is one shadow page table for each guest page table

VMM sets CR3 to point to shadow page tables

Translation happens using shadow page tables

Guest page tables contain VPN->PPN translations, while shadows contain VPN->MPN translations

CR3

Guest page tables

Shadow page tables

VPN->PPN translations

VPN->MPN translations
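The relationship between guest and shadow page tables can be sketched as a composition of two mappings. This is a toy model only: the function name `build_shadow`, the dictionary representation, and all page numbers are assumptions made for illustration, not how the VMM is actually implemented.

```python
# Toy model of shadow page table construction (illustrative only).
# VPN = virtual page number, PPN = guest "physical" page number,
# MPN = machine page number. All page numbers are made-up values.

def build_shadow(guest_page_table, ppn_to_mpn):
    """Compose VPN->PPN (guest) with PPN->MPN (VMM) into VPN->MPN."""
    shadow = {}
    for vpn, ppn in guest_page_table.items():
        if ppn in ppn_to_mpn:  # only pages the VMM has backed with machine memory
            shadow[vpn] = ppn_to_mpn[ppn]
    return shadow

guest_page_table = {0: 7, 1: 3, 2: 9}   # maintained by the guest OS
ppn_to_mpn = {3: 120, 7: 45, 9: 300}    # maintained by the VMM
shadow_table = build_shadow(guest_page_table, ppn_to_mpn)
print(shadow_table)
```

In this model, any change the guest makes to its own table requires the shadow to be rebuilt or patched, which is the bookkeeping cost that hardware support aims to remove.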

Page 9: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

Rapid Virtualization Indexing

RVI provides mechanism to avoid shadow page tables

Provides a second layer of page tables

Contain physical to machine address translations

Hypervisor maintains them

So, with RVI

Guest page tables contain virtual to physical translation

Second level tables contain physical to machine translation

Using two tables, hardware converts virtual address to machine address

guestCR3

nestedCR3

VPN -> PPN mapping

PPN -> MPN mapping

guest page tables

hypervisor page tables
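A rough sketch of the two-table lookup, under the assumption that each table can be modeled as a flat dictionary (real hardware walks multi-level tables). The names `translate`, `guest_pt`, and `nested_pt` and the page numbers are hypothetical.

```python
# Toy model of a nested (two-level) address translation.

def translate(vpn, guest_pt, nested_pt):
    """VPN -> PPN via the guest tables, then PPN -> MPN via the nested tables."""
    ppn = guest_pt[vpn]      # guest-maintained, rooted at guestCR3
    return nested_pt[ppn]    # hypervisor-maintained, rooted at nestedCR3

guest_pt = {0: 7, 1: 3}
nested_pt = {3: 120, 7: 45}

mpn_before = translate(1, guest_pt, nested_pt)

# The guest remaps VPN 1; the nested table is untouched and
# no shadow structure has to be updated.
guest_pt[1] = 7
mpn_after = translate(1, guest_pt, nested_pt)
print(mpn_before, mpn_after)
```

The point of the sketch is that the guest and the hypervisor each update only their own table, and the combination happens at lookup time.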

Page 10: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

What is Page Sharing?

Content-based

Hint (hash of page content) generated for 4K pages

Hint is used for a match

If matched, perform bit by bit comparison

COW (Copy-on-Write)

Shared pages are marked read-only

Write to the page breaks sharing

VM 1 VM 2 VM 3

Hypervisor

VM 1 VM 2 VM 3

Hypervisor
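The hint, match, compare, and copy-on-write steps above can be sketched as follows. This is a minimal model, not the ESX implementation: the class `SharedPagePool`, the use of SHA-1 as the hint function, and the reference counting scheme are all assumptions chosen for illustration.

```python
import hashlib

PAGE_SIZE = 4096  # 4K pages, as on the slide

def _hint(data):
    # Assumed hint function; the real hypervisor hash is not specified here.
    return hashlib.sha1(data).hexdigest()

class SharedPagePool:
    def __init__(self):
        self.by_hint = {}    # hint -> list of machine page ids
        self.contents = {}   # machine page id -> page bytes
        self.refcount = {}   # machine page id -> number of sharers
        self.next_id = 0

    def _new_page(self, data):
        mpn = self.next_id
        self.next_id += 1
        self.contents[mpn] = data
        self.refcount[mpn] = 1
        self.by_hint.setdefault(_hint(data), []).append(mpn)
        return mpn

    def insert(self, data):
        """Return an existing identical page if one exists, else a new page."""
        for mpn in self.by_hint.get(_hint(data), []):
            if self.contents[mpn] == data:  # full comparison guards against collisions
                self.refcount[mpn] += 1
                return mpn
        return self._new_page(data)

    def write(self, mpn, data):
        """Copy-on-write: writing to a shared page yields a private copy."""
        if self.refcount[mpn] > 1:
            self.refcount[mpn] -= 1
            return self._new_page(data)
        self.by_hint[_hint(self.contents[mpn])].remove(mpn)
        self.contents[mpn] = data
        self.by_hint.setdefault(_hint(data), []).append(mpn)
        return mpn

pool = SharedPagePool()
zero_page = bytes(PAGE_SIZE)
vm1_page = pool.insert(zero_page)
vm2_page = pool.insert(zero_page)               # shares with VM 1
vm3_page = pool.insert(b"\x01" * PAGE_SIZE)     # different content, not shared
vm2_private = pool.write(vm2_page, b"\x02" * PAGE_SIZE)  # write breaks sharing
```

After the write, VM 1 still owns the original zero page alone and VM 2 holds a private copy.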

Page 11: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

Expand

Shrink

May page content out to virtual disk

May bring content from virtual disk

Borrow Pages

Lend Pages

ESX Server Memory Ballooning

Guest OS has better information than VMkernel

Which pages are stale

Which pages are unused

Guest Driver installed with VMware Tools

Artificially induces memory pressure

VMkernel decides how much memory to reclaim, but guest OS gets to choose particular pages

VM with VMware Tools Installed

VM with VMware Tools Installed
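The division of labor described above (the VMkernel sets the amount, the guest picks the pages) can be illustrated with a toy model. The page states, the priority order, and the function name `inflate_balloon` are assumptions for illustration; a real guest OS uses its own reclamation policy.

```python
# Toy model: the hypervisor asks for `target` pages, the guest chooses which.

def inflate_balloon(guest_pages, target):
    """guest_pages maps page id -> 'unused', 'stale', or 'active'.

    Returns (reclaimed_ids, paged_out_ids). Unused pages are given up first,
    then stale pages, then active pages. Pages that still hold content are
    assumed to be paged out to the guest's virtual disk before release.
    """
    priority = {"unused": 0, "stale": 1, "active": 2}
    candidates = sorted(guest_pages, key=lambda p: (priority[guest_pages[p]], p))
    reclaimed = candidates[:target]
    paged_out = [p for p in reclaimed if guest_pages[p] != "unused"]
    return reclaimed, paged_out

pages = {1: "active", 2: "unused", 3: "stale", 4: "unused", 5: "active"}
reclaimed, paged_out = inflate_balloon(pages, target=3)
print(reclaimed, paged_out)
```

With a target of three pages, the two unused pages are handed over for free and only the stale page costs a write to the virtual disk.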

Page 12: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

Oracle Databases on VI3

Page 13: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

Oracle Database Characteristics

What it's not:

It's *not* a huge I/O consumer

Most common Oracle databases have modest I/O profiles

It does *not* have a small memory footprint

Large-ish memory footprint and modest I/O most common

Tuning a DB for virtualization is *not* unique rocket science

Many standard tuning activities benefit virtualized DBs substantially

Page 14: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

Capacity Planner Data for Oracle Databases

Out of 13K Physical Oracle DBs considered

65% of systems on 2 core systems, averaging 5% CPU utilization

Roughly 4% of systems fully consume more than 2 cores

Most consume between 2 and 4 GB of RAM

Static RAM consumption points to fixed SGA with fixed # of PGAs

Page 15: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

Oracle DB Workload Characteristics

Memory

Large In-Memory Footprint (SGA)

Ensure good cache hit ratio

Target 98% or higher

Small number of processes to protect against TLB misses

I/O

Lower than generally assumed (50 I/O operations per second on average)

Depends on quality of SQL execution plans

Overall Privileged Instructions

I/O, Context Switch (TLB misses)
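One common way to check the 98% target is the classic buffer cache hit ratio, computed from the physical reads, db block gets, and consistent gets statistics. The sketch below assumes those three counters have already been collected; the numbers are invented sample values.

```python
def buffer_cache_hit_ratio(physical_reads, db_block_gets, consistent_gets):
    """Hit ratio = 1 - physical reads / logical reads."""
    logical_reads = db_block_gets + consistent_gets
    if logical_reads == 0:
        raise ValueError("no logical reads recorded")
    return 1.0 - physical_reads / logical_reads

# Invented sample counters, for illustration only.
ratio = buffer_cache_hit_ratio(
    physical_reads=15_000, db_block_gets=200_000, consistent_gets=800_000
)
meets_target = ratio >= 0.98
print(f"{ratio:.3f}", meets_target)
```

A ratio below the target suggests either an undersized buffer cache or SQL that reads far more blocks than it needs.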

Page 16: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

Making your DB Ready to Virtualize

Well Tuned DB for Physical is Good for Virtual

Poor Execution Plans even worse in Virtual

Cause poor cache re-use

Cause additional I/O and hence CPU Overhead

Cause additional impact on storage

Minimize Full Table Scans

Up to date statistics for CBO

Small number of “DB file scattered read” events

Tune SQL with high number of physical reads per execute

Page 17: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

Virtualization Overhead for Oracle DBs

Well Tuned DB

Typically 10 to 20% additional CPU required over physical

Poorly Tuned DB

Maybe 20 to 30% or even more

Depends on SQL execution plans

User Impact of Additional CPU Requirements

Allocate additional CPU per VM to cover overhead

Results in minimal impact to user response time

If VM CPU pegged then expect substantial impact on user
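As a back-of-the-envelope sizing aid, the overhead ranges above can be applied to a measured physical CPU figure. The function names and the 1.6-core example are assumptions for illustration; real sizing should be based on measurement.

```python
import math

def virtual_cpu_estimate(physical_cores_used, overhead):
    """Cores needed in a VM, given physical usage and a fractional overhead."""
    return physical_cores_used * (1.0 + overhead)

def vcpus_needed(physical_cores_used, overhead):
    """Round the estimate up to whole vCPUs."""
    return math.ceil(virtual_cpu_estimate(physical_cores_used, overhead))

# Hypothetical database that uses 1.6 cores on physical hardware.
well_tuned_low = vcpus_needed(1.6, 0.10)    # 10% overhead
well_tuned_high = vcpus_needed(1.6, 0.20)   # 20% overhead
poorly_tuned = vcpus_needed(1.6, 0.30)      # 30% overhead
print(well_tuned_low, well_tuned_high, poorly_tuned)
```

In this example the well tuned case still fits in two vCPUs, while the poorly tuned case spills over into a larger VM.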

Page 18: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

Oracle Physical to Virtual Conversion Process

All but the Hungriest DBs will fit on a VM

Use smallest VM that will suffice

I.e., a 1 vCPU VM is more efficient than a 2 vCPU VM if the workload fits

Current limit of 4 vCPU per VM

Will not exist for long

Limits virtualization of DBs that consume more than 3 physical cores

Appears to be a small number of all DBs

Less than 5% in our surveys

Page 19: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

General Best Practices for Virtualizing DBs

Characterize DBs into three rough groups

Green DBs – typically 70%

Ideal candidate for virtualization: Well tuned and modest CPU consumption

Yellow DBs – typically 25%

Likely candidate for virtualization

May need some SQL tuning and monitoring to understand CPU and I/O requirements

Red DBs – typically 5%

Unlikely candidates until larger VMs available

Consumes 4 or more physical cores

Not a lot of SQL tuning to be done
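The three groups could be expressed as a simple triage function. Only the "4 or more physical cores" rule for red comes from the slide; the `modest_cpu_cores` cutoff of 2 cores and the boolean `well_tuned` flag are hypothetical simplifications.

```python
def classify_db(cores_consumed, well_tuned, modest_cpu_cores=2.0):
    """Rough green/yellow/red triage for virtualization candidates."""
    if cores_consumed >= 4:
        return "red"      # unlikely candidate until larger VMs are available
    if well_tuned and cores_consumed <= modest_cpu_cores:
        return "green"    # ideal candidate
    return "yellow"       # likely candidate, may need SQL tuning and monitoring

examples = [
    classify_db(0.3, well_tuned=True),
    classify_db(1.5, well_tuned=False),
    classify_db(4.5, well_tuned=True),
]
print(examples)
```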

Page 20: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

OLTP vs DSS Oracle Workloads and Virtualization

OLTP Workloads

Assume frequent small queries

Should hit efficient index almost all the time

Basic Diagnostics with AWR report

Need small physical reads per exec, no full table scans

DSS Workloads

Should hit summary tables vs base tables as much as possible

Use materialized views to roll up as batch jobs at night

Daytime load should be index look ups

May summarize delta from summary in real time when necessary

Page 21: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

Direct I/O

Guest-OS Level Option for Bypassing the guest cache

Uncached access avoids multiple copies of data in memory

Avoids read/modify/write modulo file system block size

Bypasses many file-system level locks

Enabling Direct I/O on Linux

# vi init.ora
filesystemio_options="setall"

Check:

# iostat 3
(Check for I/O size matching the DB block size…)

Page 22: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

Asynchronous I/O

An API for single-threaded process to launch multiple outstanding I/Os

Multi-threaded programs could just use multiple threads

Oracle databases use this extensively

See aio_read(), aio_write(), etc.

Enabling AIO on Linux

# rpm -Uvh aio.rpm
# vi init.ora
filesystemio_options="setall"

Check:

# ps -aef | grep dbw
# strace -p <pid>
io_submit()… <- Check for io_submit in syscall trace

Page 23: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

Use Large Pages

Guest-OS Level Option to use Large MMU Pages

Maps the large SGA region with fewer TLB entries

Reduces MMU overheads

Enabling Large Pages on Linux

# vi /etc/sysctl.conf (add the following lines:)

vm.nr_hugepages=2048
vm.hugetlb_shm_group=55

# cat /proc/meminfo | grep Huge
HugePages_Total: 1024
HugePages_Free: 940
Hugepagesize: 2048 kB
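To sanity-check a huge page reservation against the SGA size, the arithmetic can be sketched as below. The helper names `hugepages_for` and `parse_hugepage_info` are hypothetical, the sample text reuses the three output lines shown on the slide, and the 3 GB SGA is just an example figure.

```python
import math

def hugepages_for(sga_bytes, hugepage_kb=2048):
    """Number of huge pages needed to cover an SGA of the given size."""
    return math.ceil(sga_bytes / (hugepage_kb * 1024))

def parse_hugepage_info(meminfo_text):
    """Extract the Huge* counters from meminfo-style text."""
    info = {}
    for line in meminfo_text.splitlines():
        if line.startswith("Huge"):
            key, _, rest = line.partition(":")
            info[key.strip()] = int(rest.split()[0])
    return info

sample = """HugePages_Total: 1024
HugePages_Free: 940
Hugepagesize: 2048 kB"""

info = parse_hugepage_info(sample)
reserved_gb = info["HugePages_Total"] * info["Hugepagesize"] / (1024 * 1024)
pages_for_3gb_sga = hugepages_for(3 * 1024**3)
print(reserved_gb, pages_for_3gb_sga)
```

With 2048 kB pages, 1024 huge pages reserve 2 GB, so a 3 GB SGA would not fit entirely in the reservation shown in the sample output.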

Page 24: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

Linux Versions

Some older Linux versions have a 1 kHz timer to optimize desktop-style applications

There is no reason to use such a high timer rate on server-class applications

The timer rate on 4 vCPU Linux guests is over 70,000 per second!

Use RHEL5.1

Install 2.6.18-53.1.4 kernel or later

Put divider=10 on the end of the kernel line in grub.conf and reboot.

All the RHEL clones (CentOS, Oracle EL, etc.) work the same way.

Page 25: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

Page Sharing and Large Memory Pages

Large Pages In Oracle

Can increase efficiency of memory management

Large Pages are Not Shared

Expect less reduction in memory consumption with large pages

Hardware Assisted Memory Management

Benefits from use of Large Pages

No other hypervisor uses large pages today

Expect AMD RVI and Intel EPT to work well with VMware Infrastructure

And likely not with hypervisors that don’t support large pages

Page 26: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

Page Sharing With Oracle DBs

Page Sharing in VMware Infrastructure

Reduce memory consumption by sharing common pages

Common pages include

OS related pages

Executable related pages

I.e., Oracle executables for each VM running Oracle

Serves to allow larger SGA with overall memory consumption reduction

Page 27: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

Oracle Performance Study: SwingBench TPC-like Transaction Processing Benchmark

Order-entry benchmark: order & product processing

Java client generator with Oracle back-end

Page 28: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

SwingBench Configuration

Database:
• # of Users = 5,011,872
• # of Products = 5,011,872
• Db size = 5.11GB

Test:
• Test duration = 10 mins
• # of Users per run = 30
• Think time = 0
• New customer - 11%
• Browse products - 28%
• Order products - 28%
• Process orders - 5%
• Browse orders - 28%

• RHEL5 U1
• 64 bit
• 2.6.18-53.1.13.el5

• ESX version: 3.5 build# 60217
• Number of VMs: 1
• vCPU: 4
• Mem: 6GB
• vDisk: 16GB
• vNIC: 1

• Oracle 11g (11.1.0)
• RHEL4 x86_64
• SGA: 3GB
• PGA: 1GB

• Dell PowerEdge 2950
• Mem: 8GB
• Two dual-core Intel(R) Xeon(R) CPU 5160 @ 3.00GHz processors, 4MB cache
• Storage: CX 3-40 (30 disks)
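The transaction mix in the test settings can be read as a set of weights for a load generator. The sketch below is not SwingBench code; it only illustrates, under that assumption, how a weighted random choice would reproduce the listed percentages. The names `MIX` and `pick_transactions` are hypothetical.

```python
import random
from collections import Counter

# Percentages taken from the test configuration above.
MIX = {
    "New customer": 11,
    "Browse products": 28,
    "Order products": 28,
    "Process orders": 5,
    "Browse orders": 28,
}

def pick_transactions(n, seed=0):
    """Draw n transaction types according to the configured mix."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    names = list(MIX)
    return rng.choices(names, weights=[MIX[k] for k in names], k=n)

counts = Counter(pick_transactions(10_000))
for name in MIX:
    print(name, counts[name])
```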

Page 29: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

Measuring the Performance of DB Virtualization

Page 30: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

SwingBench Oracle Single DB Scaling

Number of virtual CPUs in Database/Guest

Page 31: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

Study: The Oracle DVD-Store Benchmark

Simulates a large multi-tier application with Oracle as the back-end database

Simulates DVD store transactions

Java client tier

Oracle Database

Sun 16-core x4600 M2
VMware ESX 3.5
Oracle 10G R2
RHEL4, Update 4, 64-bit

EMC CLARiiON CX3-40
30 x 15k Spindles

Page 32: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

Many Large Databases: Scaling Out

What happens when we consolidate more than one large database per host?

Increase number of large databases and measure performance

Key criteria: Throughput and Response Time

Scale DVD-Store Benchmark

From 1 to 7 Databases, each with their own VM

From 2 to 16 Physical CPU cores

From 32 to 256 GB of RAM

Page 33: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

“Large” Database Consolidation Study

Page 34: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

Oracle Performance (Response time)

Page 35: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

Java on VI3

Page 36: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

Java Workload Characteristics

CPU

Intensive; threads; not processes

Memory

Heavy

Network

Tends to be light

Storage

Tends to be light

Page 37: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

Page Sharing Java

Common pages to OS

Common pages to JVM

Common application pages—only where apps are identical?

Garbage collection

Fewer zero pages

Tends to fill up assigned memory

Configurable through JVM?

Page 38: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

Page Sharing and Large Memory Pages

Beware of the combination of large pages and memory over-commitment

large pages are not shared

when sharing is needed, large pages are backed by normal pages (4K)

This is a repeat of earlier slide in Oracle section…reconcile for final

Page 39: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

VM Memory Over-commitment with Java

JVM is a VM within the OS

If balloon driver takes memory from JVM, access to JVM heap will force guest swapping

this is particularly bad with JVM heap access which tends to be random—no locality

Page 40: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

Balloon and Swap Interaction

oom_killer

Page 41: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

Java Config

Multiple JVMs known to outperform single large JVM

Requires app with a scale-out model

Scaling out VMs a better idea

DRS

Page 42: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

Java Tuning

Understand

Objects created and put in Eden

After certain life, pushed to long-lived area

GC sweeps Eden aggressively and less so with long-lived area

So…

Eden sizing impacts memory access

GC thread count increases raise memory access profile and virtual overhead

This is another reason for using our model of multiple VMs each with their own JVM
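The effect of Eden sizing on collection frequency can be shown with a toy generational model. It is written in Python purely as an illustration and does not reflect any particular JVM: the class `GenerationalHeap`, the tenure threshold of two collections, and the workload of 1000 objects with 1% long-lived are all assumptions.

```python
class GenerationalHeap:
    """Toy heap: objects start in Eden and are promoted after surviving GCs."""

    def __init__(self, eden_capacity, tenure_threshold=2):
        self.eden_capacity = eden_capacity
        self.tenure_threshold = tenure_threshold
        self.eden = {}        # object id -> minor collections survived
        self.old = set()      # long-lived area
        self.minor_gcs = 0

    def allocate(self, obj_id, live):
        """Allocate in Eden, collecting first if Eden is full."""
        if len(self.eden) >= self.eden_capacity:
            self.minor_gc(live)
        self.eden[obj_id] = 0

    def minor_gc(self, live):
        self.minor_gcs += 1
        survivors = {}
        for obj_id, age in self.eden.items():
            if obj_id not in live:
                continue                      # dead object: reclaimed
            if age + 1 >= self.tenure_threshold:
                self.old.add(obj_id)          # pushed to the long-lived area
            else:
                survivors[obj_id] = age + 1
        self.eden = survivors

# Hypothetical workload: 1000 allocations, 1 in 100 objects stays reachable.
LIVE = {i for i in range(1000) if i % 100 == 0}

def run(eden_capacity):
    heap = GenerationalHeap(eden_capacity)
    for obj_id in range(1000):
        heap.allocate(obj_id, LIVE)
    return heap

small_eden = run(10)
large_eden = run(100)
print(small_eden.minor_gcs, large_eden.minor_gcs)
```

In this model the smaller Eden triggers many more sweeps for the same allocation stream, which is the kind of extra memory traffic the slide warns about.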

Page 43: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

JRockit

BEA

OS-less

Optimal out-of-box

Page 44: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

Common App Tuning

Linux kernel 2.6.22.16 (check)

RHEL

SUSE – 250 Hz

Others

Page 45: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

Performance Data

Page 46: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

Java Scalability

Will Java scale to 16 cores?

If no, show graph.

Page 47: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

SwingBench Oracle Single DB Scaling

Number of virtual CPUs in Database/Guest

Page 48: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

“Large” Database Consolidation Study

Scaling to 16 Cores, 256GB RAM!

Page 49: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

Oracle Performance (Response time)

Page 50: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

Storage Protocols: Sequential Read Throughput

Page 51: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

Storage Protocols: Sequential Write Throughput

Page 52: VMware Virtualization of Oracle and Java Scott B. Drummonds Tim Harris

VMFS Performance: VMFS versus RDM