vmworld europe 2014: virtualizing databases doing it right – the sequel


Upload: vmworld

Post on 15-Jul-2015


TRANSCRIPT

Disclaimer

• This presentation may contain product features that are currently under development.

• This overview of new technology represents no commitment from VMware to deliver these features in any generally available product.

• Features are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.

• Technical feasibility and market demand will affect final delivery.

• Pricing and packaging for any new technologies or features discussed or presented have not been determined.

Don Sullivan, Product Manager, Business Critical Apps

• VMware Chief Technology Office Ambassador

• Started Working with Oracle Version 7,8.X,9.X,10.x .….

• Co-Author Oracle Certified Master Practicum

• Frequent speaker at VMworld, EMCworld, VMug & IOUG

• Principal Oracle University Instructor (Oracle 7 through 10g)

– Backup and Recovery, Performance Tuning, OracleNet

– RMAN, Partitioning, SQL Tuning, OPS/RAC,

– Oracle Internals classes: Advanced Recovery, Space and Transaction Management, Optimizer Internals, Datatypes and Storage Internals

Michael J Corey

Books Include:

Virtualizing SQL Server with VMware: Doing IT Right

Oracle Database 12c: Install, Configure & Maintain Like a Professional

Oracle 11g A Beginner’s Guide

Oracle 10g A Beginner’s Guide

Oracle 9i - A Beginner's Guide

SQL Server 7 Data Warehousing

Oracle8i - Data Warehousing

Oracle8i - A Beginner's Guide

Oracle8 - Data Warehousing

Oracle8 – Tuning

Oracle8 - A Beginner's Guide

Oracle - Data Warehousing

Oracle - A Beginner's Guide

Tuning Oracle

Key Past/Current Affiliations:

Past President of the IOUG

Founding Board IOUG Virtualization SIG

Past Member IOUG Board of Directors

Past Director of Education IOUG

Founder Professional Association of SQL Server

Talkin’Cloud Top 200 Channel Partner Experts Cloud

Past Member Microsoft Data Warehouse Council

Past Member Oracle Educational Advisory Council

Past Director of Conferences IOUG Alive

Executive Board, Massachusetts Robert H. Goddard Council on Science, Technology, Engineering & Mathematics

Started working with Oracle Version 3.0

Beta tested Oracle 5, 6, 6.2, 7, 8.x, 9.x, …

Presented on technology & business topics from Brazil to Australia

Worked with Oracle on UNIX, Linux, Windows, MVS, VM, VMS, …

http://www.pearsonitcertification.com/store/virtualizing-oracle-databases-on-vsphere-9780133570182

http://www.pearsonitcertification.com/store/virtualizing-sql-server-with-vmware-doing-it-right-9780321927750

New RDBMS books from VMware Press

vmwarepress.com

Doing Something Different

• Presentation covers both Oracle & Microsoft SQL Server

• More and more DBAs are faced with maintaining both

• Many of the issues faced are shared

“This is a Database on Virtualized Infrastructure Session; Principles Apply to All Databases”

Dial Tone – The New World Order

Why Customers Are Virtualizing Databases (Business Critical Applications)

[Diagram: VMware presents a concise, very efficient, focused driver set and a well-vetted O/S. A physical hardware resource faces the O/S du jour, many drivers, and many versions. New drivers can cause issues.]

Why Your Company Cares: Virtualization is Strategic

1:1 relationship between applications and hardware

Relevant cost metric = cost per server

• 8% - 12% Utilization is typical

Many:1 relationship between applications and hardware

Relevant cost metric = cost per application

• 60% - 80% utilization is typical

• 60% reduction in CapEx

• 30% reduction in OpEx

• 80% reduction in Energy

Physical World (1:1)

Virtual World (Many:1): The New Norm

“Can You Say Right-Sizing”

Memory Hot Add / CPU Hot Plug

Reduction in CPU Utilization

Increased processing rate

Adding Memory

Oracle – Hot Plug vCPU

Oracle - Hot Add Memory

Oracle database memory parameters are defined at instance startup.

You will have to restart the database to take advantage of added memory, unless you have set SGA_MAX_SIZE to a large value up front.

Caution: Shared Resource Environment!

Typically…

SGA_TARGET <= SGA_MAX_SIZE

or you could be wasting memory

http://www.vmware.com/files/pdf/solutions/oracle/Oracle_Databases_VMware_Workload_Characterization_Study.pdf

For the 1st Time, the Goal of Consistency and Standardization Can Be Achieved

“Any Resource, Any Server, At Any Time” in the (Pool)

The 10 millionth Model T was produced on June 4, 1927

Trend Keeps Growing

Trigger Points When to Virtualize

Architecting for Performance:The Right Hypervisor

Is Your Database Too “Big” to Virtualize?

Very Large ERP System

• 75+ application tiers – VMware/RHEL

• 8 TB database; 8.8 billion rows of data

• 52 million transactions per day

• 79K IOPS

• 40K blocks per second interconnect traffic

• 40,000+ named users

• 4,000+ peak concurrent users

Source EMC

“Yes This is Virtualized”

Performance Results

• Virtualization has ~5% overhead compared to native

• The database TPS on a virtual machine is about 5% less than on the physical machine.

• 2P represents 12 cores and 4P represents 24 cores

• For 100 users the delta is ~6%, and it increases to ~10% for 1700 users.

• When the system gets busier, native starts to have a slightly larger advantage over virtualization.

Performance Results - Continued

• For both virtual and native, by moving from 2P (12 cores) to 4P (24 cores)

– The database TPS increases by 40% to 50%

– The CPU utilization drops from 80% to 60%

• For RAC, by moving from 2P (12 cores) to 4P (24 cores)

– The database TPS increases by 40% to 60%

– The CPU utilization drops from 75% to 60%

“Who Architects a Database With Less Than 5% Overhead? One Busy Day and You're Done”

Mega vMotion RAC on vSphere Functional Stress Test

VMW, EMC, Cisco. Executed by “Principled Technologies”, 2013

WWW.principledtechnologies.com/Vmware/vMotion_oracle_rac_1013.pdf

3 RAC Node, vMotion on all 3 Nodes Simultaneously – Without any network disruption

Service Level Agreement / The DBA

Situation: Customer monitors critical medical equipment within a hospital. A SQL Server database is at the core of the system and is having huge performance problems. “Failure is not an option.”

Solution: Need to take the server down and adjust a BIOS setting that is causing SQL Server to have access to only 50% of the available CPU.

Customer: There is never a time they can take the server down for 5 minutes.

Stand-alone instance: had it been virtualized, the DBA would have had options.

No Win - SLA

Yet this situation points to a bigger issue: management's expectations for the availability of the database, and the physical infrastructure's ability to support those expectations.

Have The Conversation

• Get the resources you need to meet the expectation

• OR reset expectations concerning database uptime

Avoid Good-Intention BIOS Settings

Check Power Management Settings

• The default on a lot of servers is a “green”-friendly setting

• Saves energy when the server is inactive

• Many times does not ramp the CPU up quickly, and in some cases not completely

• Avoid the dozing setting

• Slows CPU to half its speed

The proper setting for a server hosting a database is “High Performance”

BIOS Settings to Consider

If your processors support it:

• Enable “Turbo Mode”

• Enable “Hyper-threading”

Enable all hardware-assisted virtualization features in the BIOS.

Virtualizing Databases: Doing IT Right

Lessons Learned – Tier 1

“What works in Tier-2 (non-production) will not always work with Tier-1 (production)”

Doing It Right 1st Time: Very Conservative

Designed to ensure you avoid common traps & pitfalls associated with virtualizing production databases

Starting Out Right

Doing It Right: Read Best Practices Guides

Read the documentation from all your vendors: VMware, Microsoft, storage vendor, network vendor…

Appendix of this deck

Professional Association of SQL Server

http://virtualization.sqlpass.org/

“Take Advantage of All Resources Available to You”

Blogs: Longwhiteclouds.com

http://vsphere-land.com/news/2014-top-vmware-virtualization-blog-voting-results.html?utm_content=bufferc62e1&utm_medium=social&utm_source=twitter.com&utm_campaign=buffer

#13

Most Up To Date Information

Installation

• Plan your SQL Server installation

SLAs, RPOs, RTOs

Baseline current workload, at least 1 business cycle

Baseline existing (workload) vSphere implementation

Estimated growth rates

I/O requirements (I/O per sec, throughput, latency)

Storage (Disk type/speed, RAID, flash cache solution, etc)

Software versions (vSphere, Windows, SQL)

Product Keys

Licensing (may determine architecture)

Workload type (OLTP, Batch, Warehouse)

Accounts needed for installation / service accounts

High Availability strategy

Backup & Recovery strategy

“If you aim at nothing, you will hit it every time” – Zig Ziglar

Planning a High Availability Strategy

Requirements

• Recovery Time Objective (RTO)

• What does 99.99% availability really mean?

• Recovery Point Objective (RPO)

• Zero data lost?

• HA vs. DR requirements

Evaluating a technology

• What’s the cost for implementing the technology?

• What’s the complexity of implementing, and managing the technology?

• What’s the downtime potential?

• What’s the data loss exposure?

Availability               Downtime / Year   Downtime / Month*   Downtime / Week
"Two Nines" - 99%          3.65 Days         7.2 Hours           1.68 Hours
"Three Nines" - 99.9%      8.76 Hours        43.2 Minutes        10.1 Minutes
"Four Nines" - 99.99%      52.56 Minutes     4.32 Minutes        1.01 Minutes
"Five Nines" - 99.999%     5.26 Minutes      25.9 Seconds        6.05 Seconds

* Using a 30 day month
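
The downtime figures in the availability table follow directly from the availability percentage. A minimal Python sketch of that arithmetic is below; the function and constant names are illustrative, and the 365-day year is an assumption chosen to match the table (the 30-day month comes from its footnote).

```python
from datetime import timedelta

# Period lengths. The 30-day month matches the table footnote;
# the 365-day year is an assumption that reproduces the table values.
PERIODS = {
    "year": timedelta(days=365),
    "month": timedelta(days=30),
    "week": timedelta(days=7),
}


def allowed_downtime(availability_pct: float, period: str) -> timedelta:
    """Return the downtime budget for an availability target over a period."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability must be a percentage between 0 and 100")
    unavailable_fraction = 1 - availability_pct / 100
    return PERIODS[period] * unavailable_fraction


if __name__ == "__main__":
    for pct in (99.0, 99.9, 99.99, 99.999):
        row = ", ".join(
            f"{name}: {allowed_downtime(pct, name).total_seconds() / 60:.2f} min"
            for name in PERIODS
        )
        print(f"{pct}% -> {row}")
```

Note that the budget says nothing about how the downtime is distributed, which is the point of the "3 days in a row" question.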

Is Being Down 3 Days In A Row Ok?

You Had 99% Availability !

Baseline, Baseline, Baseline………

Why will making it virtual make it perform better?

If so, how?

– New Hardware?

– Faster CPU?

– Faster Drives?

“There are no silver bullets”

“IT” Food Groups: What to Baseline

• Existing Physical Database Infrastructure

• Existing/Proposed vSphere Infrastructure

When You Baseline a Database, Make Sure the Sample Interval Is Frequent

CPU, Memory, Disk (15 Seconds or less)

SQL Server TSQL (1 Minute)

“A lot can happen in a short amount of time”

“Same applies to Oracle! A lot can happen”

Migrations - The Bigger Picture

Database As A Service – Road Map

Multiple Tier Approach

• Different levels for different DB placement

• Basic and Premium

– Basic = Low utilization, test / dev DBs

– Premium = Moderate to High utilization, production, high visibility

• Different underlying hardware

• Different SLAs, RTO, RPOs and HA between tiers

Center of Excellence

• Assist with migrations, net new DBs and Capacity Management

– Communication, no “throwing it over the wall”

• VMware/SAN/Network/DB teams to discuss DB migrations

– Optional Teams: Security, Procurement

“Few dedicated personnel at each level of the stack; end users are taking advantage of automation”

vSphere Environment

SQL Server Baseline – Suggested Values

SQL Server – Perfmon Counters

SQL Profiler Counters

These are suggested values - work with your DBAs to

determine their KPIs

Migration – Baseline: Physical (disk) Pre

Perfmon counter                      Measures
LogicalDisk\Avg. Disk sec/Read       Read latency
LogicalDisk\Avg. Disk sec/Write      Write latency
LogicalDisk\Disk Read Bytes/sec      Read throughput
LogicalDisk\Disk Write Bytes/sec     Write throughput
LogicalDisk\Disk Reads/sec           Read IOPS
LogicalDisk\Disk Writes/sec          Write IOPS
LogicalDisk\Disk Transfers/sec       Combined IOPS
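
Once the disk counters have been sampled, the baseline usually comes down to a few summary figures. The Python sketch below is one hedged way to reduce read and write IOPS samples to average, 95th-percentile, and peak combined IOPS. The function name, the input shape, and the choice of a nearest-rank percentile are all assumptions for illustration, not part of Perfmon.

```python
import math
from statistics import mean


def summarize_disk_baseline(samples):
    """Summarize (reads_per_sec, writes_per_sec) samples.

    Combined IOPS per sample is reads + writes, which mirrors what the
    Disk Transfers/sec counter reports.
    """
    if not samples:
        raise ValueError("need at least one sample")
    combined = sorted(reads + writes for reads, writes in samples)
    # Nearest-rank 95th percentile: the smallest sample with at least
    # 95% of all samples at or below it.
    rank = math.ceil(0.95 * len(combined))
    return {
        "avg_iops": mean(combined),
        "p95_iops": combined[rank - 1],
        "peak_iops": combined[-1],
    }


if __name__ == "__main__":
    # Hypothetical samples, not measurements from the presentation.
    print(summarize_disk_baseline([(100, 50), (200, 100), (1000, 500)]))
```

Sizing storage from the average alone hides the peaks, so the percentile and peak values are the ones to hand to the storage team.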

Migration – Baseline: Virtual (disk) Post

Export the output to Excel and graph it using a variety of tools, such as Jonathan Kehayias’ PowerShell script.

Compare the results against the required IOPS as measured in the pre-deployment assessment.

Determine IOPS & Throughput

ORION (part of 11.2 now)

    sudo -u root ./orion_linux_x86-64 -run advanced -testname traxpoc -num_disks 20 -cache_size 8000 -duration 240 -matrix basic

SLOB (Silly Little Oracle Benchmark)

Calibrate I/O – native to Oracle starting in 11.1

    SQL> set serveroutput on
    SQL> declare
           l_latency integer;
           l_iops    integer;
           l_mbps    integer;
         begin
           -- arguments: number of physical disks, maximum tolerated latency (ms),
           -- then the three OUT values
           dbms_resource_manager.calibrate_io
             (5, 10, l_iops, l_mbps, l_latency);
           dbms_output.put_line ('max_iops = '||l_iops);
           dbms_output.put_line ('latency = '||l_latency);
           dbms_output.put_line ('max_mbps = '||l_mbps);
         end;
         /
    max_iops = 5348
    latency = 10
    max_mbps = 641

Other Free Tools:

• Swingbench

• TPC Benchmark

• Custom scripts

How do you know for sure?

Oracle’s - $$$: Database Replay

Oracle Calibrate I/O Tip

Don’t Keep It a Secret

• DBAs: tell vSphere, storage, and network admins your needs

– Storage: (IOPS / throughput)

– CPU: (MHz)

– Memory: (Total GB)

– Network: Bandwidth

– Features (i.e.: Windows clustering)

– Anticipated Growth Rates

– Anticipated Activity

– Other

“They Flunked Mind Reading”

Before You Install a Database on New VM

• Do basic throughput testing of the IO subsystem prior to deploying a Database

• Tools you can use

– SQLIO/IOMETER

– SLOB…

“Check It Before You Wreck It” – Jeff Szastak

Should You P2V (Via Converter)?

Production Environments: Build “New” From Scratch – GI/GO

SQL Server - Unattended Installation Options

VMware vRealize Automation (vCAC)

Command Line

• http://msdn.microsoft.com/en-us/library/ms144259

Configuration File

• http://msdn.microsoft.com/en-us/library/dd239405

Sysprep

• http://msdn.microsoft.com/en-us/library/ee210664

• FYI – Available as of SQL Server 2008 R2

Oracle - Unattended Installation Options

You at the VMworld party while your database is provisioned

VMware vRealize Automation

DBCA Silent Install

http://docs.oracle.com/cd/E11882_01/install.112/e24321/app_nonint.htm#CIHHFDGG

RAC Silent Install

http://docs.oracle.com/cd/E11882_01/install.112/e24660/cripts.htm#RILIN1119

Phone-A-Friend

VMware has stated that it will take the ______ support call if a customer calls ______ Support and ______ Support is being difficult because the customer is running on VMware.

• Hint… “TSANET.ORG: Hardware or Software”

Use the SQL Server/Oracle recommended installation guidelines for the respective operating system – same as physical!

Physical World 1 :1 Virtual World

Many :1

Same As Physical

If your OS and database don’t know they are virtualized do you need to tell them?

Did You Hear That?

Architecting For Performance: Design

Database Workload Types

OLTP

• Large amount of small queries

• Sustained CPU utilization during working hours

• Sensitive to peak contention (slowdowns affect SLA)

• Generally write intensive

• May generate many chatty network round trips

Batch / ETL

• Typically runs during off-peak hours, low CPU utilization during the normal working hours

• Can withstand peak contention, but sustained activity is key

DSS

• Small amount of large queries

• CPU, memory, disk IO intensive

• Peaks during month end, quarter end, year end

• Can benefit from inter-query parallelism with a large number of threads

OLTP vs. Batch Workloads

OLTP Workload (avg. 15%)

What this says:

• Average 15% utilization

• Moderate sustained activity (around 28% during working hours, 8am-6pm)

• Minimal activity during non-working hours

• Peak utilization of 58%

Batch Workload (avg. 15%)

What this says:

• Average 15% utilization

• Very quiet during the working day (less than 8% utilization)

• Heavy activity during 1am-4am, with avg. 73% and peak 95%
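
Two workloads with the same 15% average can look very different hour by hour, which is why they consolidate well. The Python sketch below illustrates that with hypothetical hourly CPU profiles, loosely shaped after the figures quoted for the OLTP and batch workloads; the numbers are invented for illustration and are not the presentation's data.

```python
def consolidation_summary(profile_a, profile_b):
    """Compare sizing for the sum of peaks vs. the peak of the combined load."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same hours")
    combined = [a + b for a, b in zip(profile_a, profile_b)]
    return {
        "sum_of_peaks": max(profile_a) + max(profile_b),
        "combined_peak": max(combined),
        "combined_avg": sum(combined) / len(combined),
    }


# Hypothetical hourly CPU utilization (%), hour 0 through hour 23.
# OLTP: quiet overnight, ~28% from 8am to 6pm, one 58% spike.
OLTP = [4] * 8 + [28] * 3 + [58] + [28] * 6 + [4] * 6
# Batch: busy from 1am to 4am (73%, 95%, 73%), quiet otherwise.
BATCH = [5] + [73, 95, 73] + [5] * 20

if __name__ == "__main__":
    print(consolidation_summary(OLTP, BATCH))
```

Because the peaks fall at different hours, the combined peak stays well below the sum of the two individual peaks, which is what makes the improved consolidation ratio possible.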

OLTP vs. Batch Workloads

What This Means

• Better Server Utilization

• Improved Consolidation Ratios

• Less Equipment To Patch, Service, Etc

• Saves Money/Less Licensing

OLTP/Batch Combined Workload

“Many Tier-2 were built for capacity not performance”

Separate development and test from production environments into different host clusters in the beginning

Where/What Year Was the First Documented Use of the Word “Nerd”?

The Year Was 1950

More VMs vs. More DB Instances

More VMs

• Better resource isolation

• Better security, patch management

• Better Performance

• Less Risk

Fewer VMs (More instances)

• Less expensive in some licensing models

• No OS isolation (configuration, security, fault)

• No resource isolation

• Less Segmentation (HIPAA, PCI, …)

Note: Both Work, Both Valid Strategies

Architecting For Performance: Storage

Golden Rules

“Your Database is just an extension of your Storage” – Michael Webster

“Your Storage is Just a Set of containers for your database” – Don Sullivan

Storage

• The fundamental relationship between consumption and supply has not changed

• Spindle count and RAID configuration still rule

• Host demand is an aggregate of VMs

• Factors that affect storage performance

• storage protocols

• storage configuration

• VMFS configuration (Separate LUN’s, All on one LUN, Does it even matter?)

VMFS

More I/O In Flight to the Array

Use VMFS vs. RDM

• VMFS Advantages

– Negligible performance cost and superior functionality

– Ability to take full advantage of future functionality enhancements (Future Awesomeness)

• Align VMFS on 64K boundaries

– Automatic with vCenter

– www.vmware.com/pdf/esx3_partition_align.pdf

• With vSphere 4.1

– Use VAAI (Storage API)*

• With vSphere 5.x

– Use VASA (Storage API)*

[Chart: VMFS Scalability. IOPS (0 to 8000) for VMFS, RDM (virtual), and RDM (physical) at 4K, 16K, and 64K I/O sizes]

* Work With Storage Vendor For Details

Thin Provisioning Perf / Block Zeroing

[Chart: I/O throughput in MB/s]

Use Thick Eager Zeroed disks for best performance

Maximum performance happens eventually, but when using lazy zeroing, zeroing needs to occur before you can get maximum performance

At minimum: databases, logs, TempDB

Check with your storage vendor to see how they handle thin provisioning. Your mileage may vary

VAAI capable array can alter config

http://www.vmware.com/pdf/vsp_4_thinprov_perf.pdf

Database Thick Provision Eager Zeroed Options

Inflation

Storage vMotion

Windows

vmkfstools - VMware KB 1011170

- vmkfstools -D "My VM.vmdk"

- Shows whether the disk is eagerzeroedthick or zeroedthick

- vmkfstools -k "My VM.vmdk"

- Converts the disk to eagerzeroedthick

Optimizations – SQL Server: Disk

• Instant file initialization – add the SQL Server service account to “Perform volume maintenance tasks” under User Rights Assignment in the server's Local Policies.

• By default, every time the database file needs to grow, the OS will zero fill this file & block writes until complete

• Adding the right requires a restart of the SQL Server service; removal requires a reboot

http://msdn.microsoft.com/en-us/library/ms175935(v=SQL.105).aspx

For those who want to be less conservative (for TempDB): on SQL 2005, use files equal to 50% of the number of cores, up to 8; on 2008+, use a 25%-50% ratio of files to cores, usually up to 8.

The number of data files and tempdb files is important enough that Microsoft has two spots in the Top 10 SQL Server Storage best practices highlighting the number of data files per CPU

TEMPDB 1 datafile per CPU

(DUAL Core Counts as 2 CPU’s)

(Raid 1+0 – Write Intensive)

Data Files 1 datafile per CPU

200GB DB/4 vCPU = 4@50GB

Make Equal Size/Grow Equally
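
The "one data file per CPU, equally sized" guidance is simple division. A minimal Python sketch is below; the function name and the optional `max_files` cap are illustrative assumptions rather than SQL Server settings.

```python
def plan_data_files(db_size_gb, vcpus, max_files=None):
    """Return (file_count, size_per_file_gb) for equally sized data files.

    Uses one file per vCPU, optionally capped at max_files
    (a hypothetical knob, e.g. 8 for the less conservative TempDB guidance).
    """
    if db_size_gb <= 0 or vcpus <= 0:
        raise ValueError("database size and vCPU count must be positive")
    file_count = vcpus if max_files is None else min(vcpus, max_files)
    return file_count, db_size_gb / file_count


if __name__ == "__main__":
    # The example from the slide: 200 GB database on a 4 vCPU VM.
    print(plan_data_files(200, 4))
```

Keeping the files the same size and growing them by the same increment keeps allocation evenly spread across them.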

http://technet.microsoft.com/en-us/library/cc966534.aspx

Storage Paravirtual SCSI (PVSCSI) adapters

PVSCSI adapters are high-performance storage adapters that can result in greater throughput and lower CPU utilization.

• Up to 30% CPU Savings

• Up to 12% I/O Improvement

Paravirtual Adapter Knows Its Virtual

* Very Important to Use Most Current Version

Always Check Storage Vendors Best Practices

“>80% of the issues in a virtualized environment have to do with storage misconfigurations”

Storage – Putting It All Together

• Work with storage engineer, deliver realistic requirements early in the cycle

• Size for performance, not capacity

• Large number of small drives, not small number of large drives

• More / faster spindles are better for performance

• Understand the I/O requirements of different workloads

• Transactional data vs. log vs. backup

• OLTP vs. DSS

“Golden Rule: Capacity Versus Performance”

Storage – Putting It All Together

• Understand the path to the drives, i.e. throughput, multi-pathing

• Use eagerzeroedthick disk provisioning to avoid lazy zeroing

• Place swap file on separate dedicated drive on SAN, mitigate the impact of swapping with EFD (for high performance workload)

• Can potentially slow down vMotions

• Follow SQL Server storage best practices

http://technet.microsoft.com/en-us/library/cc966534.aspx

Work with your SAN Vendor as well, they have Best Practices for running these workloads on your array

Architecting For Performance: Processor

vCPUs – Hyper-Threading

Hyper-threading allows a single physical processor core to appear as two "logical" processors to the host operating system

Still only one processor

vCPU’s

• With databases, avoid overcommitting processor resources until you have “actionable” performance data you can scale from (vCOPs)

• 1:1 ratio of physical cores to vCPUs

• Out of the gate!

Hyper-Threaded CPU != Full vCPU
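
The 1:1 starting ratio is counted against physical cores, not hyper-threads. The Python sketch below shows one way to check a host against that guideline; the function names are illustrative and the host figures in the example are hypothetical.

```python
def vcpu_to_core_ratio(total_vcpus, sockets, cores_per_socket):
    """Ratio of allocated vCPUs to physical cores (hyper-threads excluded)."""
    physical_cores = sockets * cores_per_socket
    if physical_cores <= 0:
        raise ValueError("host must have at least one physical core")
    return total_vcpus / physical_cores


def is_overcommitted(total_vcpus, sockets, cores_per_socket):
    """True when more vCPUs are allocated than there are physical cores."""
    return vcpu_to_core_ratio(total_vcpus, sockets, cores_per_socket) > 1.0


if __name__ == "__main__":
    # Hypothetical host: 4 sockets x 6 cores = 24 physical cores.
    print(vcpu_to_core_ratio(24, 4, 6), is_overcommitted(32, 4, 6))
```

Counting the 48 logical processors such a host would expose with hyper-threading enabled would understate the real ratio by half.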

Within The VM

In a virtual environment each vCPU is a single thread. There is no virtual equivalent of a hyper-thread.

The guest O/S sees the number of allocated vCPUs

A non-virtualized O/S would see the hyper-threads.

Oracle: latches, parallelism, and similar settings are based upon visible CPUs. Be careful how you set these things.

Processor – Putting It All Together

• Leverage hardware-assisted virtualization (enabled by default)

• Consider avg. and peak utilization

• Be aware of hyper-threading, a hyper-thread does not provide the full power of a physical core

• Consider future growth of the system, sufficient head room should be reserved

• In high performance environment, consider adding additional hosts when avg. host CPU utilization exceeds 65%

• Consider increasing CPU resources if guest VM CPU utilization is above 65% on average

• Ensure Power Saving Features are “OFF”

• Use vCOPs for consumption & capacity

Architecting For Performance: Memory

Optimizations SQL Server: Memory

Memory – Max / Min

Min is set to 0

• only change when the OS is requesting memory for other apps

Max, is 2 TB by default

• Should not equal or exceed total VM RAM, may lead to OS starvation

• Do not set to 0, may prevent SQL from starting

• If using “Hot Add” remember to modify this setting

SQL Max Memory = VMMem – ThreadStack – OS Mem – VM Overhead

• ThreadStack = NumOfSQLThreads × ThreadStackSize

• ThreadStackSize = 1 MB on x86 | 2 MB on x64

http://msdn.microsoft.com/en-us/library/ms178067.aspx
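
The max server memory formula on this slide can be worked through as plain arithmetic. The Python sketch below does that in MB; the function and parameter names are illustrative, and the thread count, OS memory, and overhead values in the example are hypothetical inputs you would take from your own environment.

```python
def sql_max_memory_mb(vm_mem_mb, num_sql_threads, os_mem_mb, vm_overhead_mb,
                      thread_stack_mb=2):
    """SQL Max Memory = VMMem - ThreadStack - OS Mem - VM Overhead (all in MB).

    thread_stack_mb defaults to 2, the x64 value given on the slide.
    """
    thread_stack_total = num_sql_threads * thread_stack_mb
    result = vm_mem_mb - thread_stack_total - os_mem_mb - vm_overhead_mb
    if result <= 0:
        raise ValueError("VM memory is too small for the other reservations")
    return result


if __name__ == "__main__":
    # Hypothetical 32 GB VM: 512 worker threads, 4 GB for the OS, 512 MB overhead.
    print(sql_max_memory_mb(32768, 512, 4096, 512))
```

If memory is hot-added to the VM later, the result has to be recomputed and max server memory raised to match.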

Max SQL Mem Example

Ntirety Rule**

• 2 Gig + additional 1 Gig per 16 Gig of physical memory

**In the context of the VM size or physical machine size
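
A short Python sketch of the rule of thumb follows. It assumes the "2 Gig + 1 Gig per 16 Gig" figure is memory held back from SQL Server, and that the per-16-Gig part is counted in whole 16 GB increments; both are interpretations, and the function names are illustrative.

```python
def held_back_gb(total_memory_gb):
    """Memory held back from SQL Server: 2 GB plus 1 GB per full 16 GB.

    Counting only whole 16 GB increments is an assumption; the slide
    does not say how partial increments are treated.
    """
    if total_memory_gb <= 0:
        raise ValueError("total memory must be positive")
    return 2 + total_memory_gb // 16


def max_sql_memory_gb(total_memory_gb):
    """Candidate max server memory for a VM (or physical machine) of this size."""
    return total_memory_gb - held_back_gb(total_memory_gb)


if __name__ == "__main__":
    for size in (16, 64, 128):
        print(size, held_back_gb(size), max_sql_memory_gb(size))
```

Treat the result as a starting value to validate against observed memory pressure, not a fixed setting.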

Running Multiple Instances on Same VM

Two options, and do nothing is not one of them

Option 1: Use max server memory

• Create max setting for each instance

• Give each instance memory proportional to expected workload / db size

• Do not exceed total RAM allocated to VM

Option 2: Use min server memory

• Create min settings for each instance

• Give each instance memory proportional to expected workload / db size

• The sum should be 1-2 GB less than RAM allocated to VM

Settings can be modified without having to restart the instances

Max server memory

• Pro: When a new process or instance starts, memory is available immediately to fulfill the request

• Con: If instances are not running, the running instances cannot access the available RAM

Min server memory

• Pro: Running instances can leverage memory previously used by instances that are no longer running

• Con: When a new process or instance starts, running instances need to release memory

SQL Server: Memory


Lock Pages in Memory

■ This keeps SQL more responsive when paging occurs

■ SQL Server Lock Pages in Memory is ON in >= 32/64 bit Standard Edition (2012)

■ Account needs the “Lock pages in memory” right

▪ Give it the RIGHTS

http://msdn.microsoft.com/en-us/library/ms178067.aspx

Non-Uniform Memory Access (NUMA)

• NUMA avoids the performance hit when several processors attempt to address the same memory by providing separate memory for each NUMA node.

• Speeds up processing

• NUMA nodes are specific to each processor model

Non-Uniform Memory Access (NUMA)

“All Processors Can Use All Memory”

• 4 Sockets, 6 cores.

• 4 NUMA Nodes

• 128 Gig RAM

• Each NUMA Node = 32 Gig RAM

“In this example, optimal performance: each VM < 32GB*”

*CPU overhead needs to be accounted for (minimal)

*vNUMA minimizes the impact when this happens
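
The per-node figure in this example is the host memory divided evenly across NUMA nodes. The Python sketch below repeats that arithmetic and checks whether a VM fits inside one node. The function names are illustrative, and reading "4 sockets, 6 cores" as 6 cores per node is an assumption.

```python
def numa_node_memory_gb(host_memory_gb, numa_nodes):
    """Memory per NUMA node, assuming it is split evenly across nodes."""
    if numa_nodes <= 0:
        raise ValueError("host must have at least one NUMA node")
    return host_memory_gb / numa_nodes


def fits_in_one_numa_node(vm_memory_gb, vm_vcpus, host_memory_gb,
                          numa_nodes, cores_per_node):
    """True when the VM's memory and vCPUs both fit inside a single node.

    The memory test is strict (<) to leave room for overhead,
    following the "each VM < 32GB" guidance.
    """
    node_memory = numa_node_memory_gb(host_memory_gb, numa_nodes)
    return vm_memory_gb < node_memory and vm_vcpus <= cores_per_node


if __name__ == "__main__":
    # The slide's host: 128 GB across 4 NUMA nodes (assumed 6 cores per node).
    print(numa_node_memory_gb(128, 4))
    print(fits_in_one_numa_node(24, 6, 128, 4, 6))
    print(fits_in_one_numa_node(48, 6, 128, 4, 6))
```

A VM that fails this check spans nodes and can incur remote memory access.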

Home Node - NUMA

The home node for a virtual machine is first selected considering current CPU and memory load across all NUMA nodes.

Wide NUMA Allows for the use of Multiple NUMA Nodes Efficiently

Hot Add CPU disables vNUMA

**** Properly Size Database/Don’t Need Hot Add CPU *****

Swapping Occurs Two Places

1. Guest VM Swapping

2. ESXi Host Swapping

Swapping can slow down I/O performance of disks for other VMs

Ballooning, Memory Compression, Swapping Slow You Down

Stating the Obvious

Ballooning

• Kicks in when the physical host is experiencing memory contention

• Balloon driver runs on each individual VM

• Communicates with the guest O/S to determine what is happening with memory

• Works with the server to reclaim pages that are considered least valuable by the guest OS

Exceeding Host Memory can lead to ballooning, Memory Compression or Swapping


Don’t Shut Off Memory Ballooning

Ballooning is Your First Line of Defense

Total Memory Demand

Active memory (%ACTV) of VMs + memory overhead – page sharing of VMs (de-duping)

De-duping = Transparent Page Sharing

Transparent Page Sharing is more effective the more similar the VMs are

“Put Like Operating Systems On Same Physical Host”

TPS – When It Kicks In

• Before Ballooning

• Always Running on preset cycle looking for opportunity to reclaim memory

• Very Low Overhead

• Runs At HOST Level

Memory Reservations

• VM is only allowed to power on if the CPU & memory reservation is available (strict admission)

• The amount of memory can be guaranteed even under heavy loads.

• SET CPU/Not Guaranteed

• VMware HA Strict Admission Control – settings can override this behavior

Oracle Approximate Memory Architecture

Set the memory reservation to SGA size plus OS.

(Reservation & configured memory might be the same.)

[Diagram: the VM's configured memory contains client sessions and context, the SGA (DB buffer cache and others), the instance processes (PMON, SMON, DBWR, LGWR, CKPT, others), and the operating system]

Reservations and vswp

Setting a reservation creates a 0.00 K

Large Pages/Huge Pages -- Broken Down at Hypervisor Level. Not Guest O/S

“Large/Huge Pages Do Not Normally Swap”

In cases where host memory is overcommitted, ESX may have to swap out pages. Since ESX will not swap out large pages, during host swapping a large page will be broken into small pages. ESX tries to share those small pages using the pre-generated hashes before they are swapped out. The motivation for doing this is that the overhead of breaking a shared page is much smaller than the overhead of swapping in a page if the page is accessed again in the future.

http://kb.vmware.com/kb/1021095

Oracle – Hugepages

Use /etc/security/limits.conf to set soft and hard limits.

oracle soft nofile 131072

oracle hard nofile 131072

oracle soft nproc 131072

oracle hard nproc 131072

oracle soft core unlimited

oracle hard core unlimited

# -- The following entries need to be adjusted with HugePages settings

# oracle soft memlock 50000000

# oracle hard memlock 50000000

“HUGE PAGES Do Not Normally SWAP”

Use large pages in the guest (start SQL Server w/ Trace flag –T834)

SQL Server In-Guest Memory Best Practices

Memory – Putting It All Together

• Do not overcommit memory for production, mission critical SQL Server VMs

• Set provisioned memory = reservation = SQL Server max server memory + OS memory + virtualization overhead

• Set provisioned memory = reservation = Oracle SGA + OS memory + virtualization overhead

• To avoid swapping, memory limit should never be set below the provisioned size. Setting memory limit is not recommended in general

• To avoid NUMA remote memory access, size VM memory equal to or less than the memory per NUMA node if possible
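
The memory rules above can be turned into a small sanity check. The Python sketch below is illustrative only: the function names, parameters, and warning strings are hypothetical, and the sizes would come from your own VM settings and database configuration.

```python
def required_memory_gb(db_memory_gb, os_memory_gb, overhead_gb):
    """Database memory (max server memory or SGA) + OS + virtualization overhead."""
    return db_memory_gb + os_memory_gb + overhead_gb


def check_vm_memory(provisioned_gb, reservation_gb, required_gb,
                    limit_gb=None, numa_node_gb=None):
    """Return a list of warnings for settings that break the guidelines."""
    warnings = []
    if provisioned_gb < required_gb:
        warnings.append("provisioned memory is below database + OS + overhead")
    if reservation_gb < provisioned_gb:
        warnings.append("reservation is below provisioned memory")
    if limit_gb is not None and limit_gb < provisioned_gb:
        warnings.append("memory limit is below provisioned memory; expect swapping")
    if numa_node_gb is not None and provisioned_gb > numa_node_gb:
        warnings.append("VM memory exceeds one NUMA node; expect remote access")
    return warnings


if __name__ == "__main__":
    need = required_memory_gb(24, 4, 1)  # hypothetical sizes in GB
    print(need, check_vm_memory(need, need, need, numa_node_gb=32))
```

An empty list means the VM follows all of the checked rules; leaving `limit_gb` unset reflects the advice not to set a memory limit at all.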

Architecting For Performance: Network

Jumbo Frames

• Jumbo frames are Ethernet frames with more than 1500 bytes of payload. Conventionally, jumbo frames can carry up to 9000 bytes of payload

Data Movers, Pick One

Enable Jumbo Frames

Check to see

Will Succeed

    ping -M do -s 8972 -c 2 rac01a-priv
    ping -M do -s 8972 -c 2 rac01b-priv
    ping -M do -s 8972 -c 2 rac02a-priv
    ping -M do -s 8972 -c 2 rac02b-priv
    PING rac01a (10.17.33.31) 8972(9000) bytes of data.
    8980 bytes from rac01a-priv (10.17.33.31): icmp_seq=1 ttl=64 time=0.017 ms
    8980 bytes from rac01a-priv (10.17.33.31): icmp_seq=2 ttl=64 time=0.018 ms

Will Fail

    ping -M do -s 8973 -c 2 rac01a-priv
    ping -M do -s 8973 -c 2 rac01b-priv
    ping -M do -s 8973 -c 2 rac02a-priv
    ping -M do -s 8973 -c 2 rac02b-priv

Make sure: switch support is enabled

9000 Bytes

- 20 Bytes IP Header

- 8 Bytes of ICMP Header
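
The 8972-byte ping size is the 9000-byte MTU minus the IP and ICMP headers. The Python sketch below repeats that subtraction and builds the matching Linux `ping` command line. The helper names are illustrative, and the 20-byte IP header assumes IPv4 without options.

```python
IP_HEADER_BYTES = 20    # IPv4 header without options (assumption)
ICMP_HEADER_BYTES = 8   # ICMP echo header


def max_ping_payload(mtu):
    """Largest ICMP payload that fits in one unfragmented frame."""
    payload = mtu - IP_HEADER_BYTES - ICMP_HEADER_BYTES
    if payload <= 0:
        raise ValueError("MTU is too small to carry an ICMP echo")
    return payload


def ping_command(host, mtu, count=2):
    """Build a ping command that forbids fragmentation (-M do)."""
    return f"ping -M do -s {max_ping_payload(mtu)} -c {count} {host}"


if __name__ == "__main__":
    print(ping_command("rac01a-priv", 9000))
```

One byte more than this payload should fail on a correctly configured 9000-byte path, which is what the 8973-byte test checks.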

“8192/64 = 128”

SQL Server: Network

Network

Default packet size is 4,096

• If jumbo frames are available for the entire stack, set packet size to 8,192

Maximize Data Throughput for Network Applications

• Limit file system cache by OS

• NIC > File & Printer Sharing Microsoft Networks

• Use Minimize Memory or Balance

http://blogs.msdn.com/b/johnhicks/archive/2008/03/03/sql-server-checklist.aspx

Jumbo Frames

“Cost of Reducing To 1500 Bytes Then Back Again is Very Expensive”

Splitting Is Bad

Network – Putting All Together

• Separate SQL workloads with chatty network traffic (Microsoft AlwaysOn “Are you there?” checks) from those with chunky access onto different physical NICs

• With 10GbE, do this at the VLAN level (four 1GbE NICs give 4Gb total vs. 20Gb total for two 10GbE NICs)

• Separate traffic for vMotion, service console, and SQL Server at the physical NIC level

• 10GbE gives sufficient bandwidth at the host, but separate by VLAN

• Have 4 NICs per host to ensure performance and redundancy of the network (Virtualized Environment = Network Heavy)

• Using four 10GbE NICs is overkill from a redundancy perspective; two 10GbE NICs are usually enough

• vSphere 5.0 introduced the ability to use more than 1 NIC for vMotion. (More vMotions going at one time. Added specifically for memory intensive applications, i.e. databases)

• Use VMXNET3 (VMware driver – reduces physical CPU utilization)

WSFC – Cluster Validation Wizard

Use this to validate support for your configuration

• Required by Microsoft Support as a condition of support for YOUR configuration

Run this before installing AAG (AlwaysOn Availability Group), and every time you make changes

• Save resulting html reports for reference

If running non-symmetrical storage, hotfixes may be required

• http://msdn.microsoft.com/en-us/library/ff878487(SQL.110).aspx#SystemReqsForAOAG


Thank You

Michael Corey

[email protected]

Blog: http://michaelcorey.ntirety.com

http://www.dbtablog.com/

@Michael_Corey

Don Sullivan

[email protected]

@dfsulliv