vSphere Performance
DESCRIPTION
vSphere performance best practices, by VMware
TRANSCRIPT
-
VSP1800
@Insertspeaker
vSphere
Performance
Best Practices
Robert Moran
Premier Services Engineer, Global Support Services
VMware, Inc., Cork, Ireland
-
2
Disclaimer
This session may contain product features that are currently under development.
This session/overview of the new technology represents no commitment from VMware to deliver these features in any generally available product.
Features are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.
Technical feasibility and market demand will affect final delivery.
Pricing and packaging for any new technologies or features discussed or presented have not been determined.
-
3
Global Support Services and Customer Advocacy
Support offices: Bangalore, India; Tokyo, Japan; Cork, Ireland; Burlington, Canada; Palo Alto, CA; Broomfield, CO
Local language support: Spanish, Portuguese, French, German, Japanese, Chinese
Global coverage: 24x7, 365 days/year
6 support centers, 1000+ support engineers
Follow-the-sun support for Severity 1 issues
Support relationships with 100% of the Fortune 100 and 99% of the Fortune 500
-
4
Customer Support Day Events
Coming to a location near you: sharing of VMware best practices!
Support Days are a collaboration between VMware Support, Sales, and customers; you learn directly from the experts
Topics are driven by customer input, and typically include:
Best practices
Tips/tricks
Top issues
Product roadmaps/demos
Certification offerings
http://www.vmware.com/go/supportdays
-
5
Overview
What a performance problem sounds like:
"My VM is running slow and I don't know what to do!"
"I tried adding more memory and CPUs but the problem got worse!"
"My VM is slow on one host but fast on another!"
What to look for? Where to start?
We will explore some of the most common performance-related issues that our support centers receive cases for
-
6
A word about performance.
A troubleshooting methodology must define:
How to find the root cause
How to fix the problem
It must answer these questions:
1. How do we know when we are done?
2. Where do we start looking for problems?
3. How do we know what to look for to identify a problem?
4. How do we find the root cause of a problem we have identified?
5. What do we change to fix the root cause?
6. Where do we look next if no problem is found?
-
7
Agenda
Benchmarking & Tools
Best Practices and Troubleshooting
The 4 food groups:
Memory
CPU
Storage
Network
-
BENCHMARKING & TOOLS
-
9
Benchmarking
Aim for consistent and reproducible results
It is important to have a base level of acceptable performance: expectation vs. acceptable
Determine a performance baseline prior to deployment; benchmark on a physical system if applicable
Avoid subjective metrics, stay quantitative:
"The system seems slower"
"This worked better last year"
-
10
Benchmarking
Benchmarking should be done at the application layer:
Use application-specific benchmarking tools and load generators
Check with the application vendor
Isolate variables; benchmark the optimum situation before introducing load
Understand dependencies:
Human interaction
The other food groups
Compare apples to apples
-
11
Tools: vCenter Operations
Aggregates thousands of metrics into Workload, Capacity, and Health scores
Self-learns normal conditions using patented analytics
Smart alerts of impending performance and capacity degradation
Identifies potential performance problems before they start
-
12
Tools: vCenter Operations
-
13
Tools: esxtop
A valuable tool built into vSphere hosts
View or capture real-time data; play back captured data later
Import data into 3rd-party tools
vSphere Client performance graphs get their data from the kernel and VSI
Presentation/units may differ (e.g. %RDY)
http://communities.vmware.com/docs/DOC-9279
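As a quick illustration (a minimal sketch; the paths, sample interval, and counts are arbitrary), esxtop's batch and replay modes look like this from the ESXi shell:

  # Batch mode: capture 60 samples at 5-second intervals to a CSV file
  # (the CSV can then be imported into perfmon, Excel, or other 3rd-party tools)
  esxtop -b -d 5 -n 60 > /tmp/esxtop-capture.csv

  # Replay mode: play back performance data gathered with vm-support -p
  esxtop -R /path/to/vm-support-snapshot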
-
MEMORY
-
15
Memory Overhead
A VM's RAM is not necessarily machine RAM:
vRAM + overhead = maximum machine RAM
Source: vSphere 5.1 Resource Management Guide
Note: These are estimated values
-
16
Memory: Host Memory Management
Occurs under normal circumstances and when there is contention:
Transparent Page Sharing
Occurs when memory is under contention:
Ballooning
Compression
Swapping
-
17
Memory Transparent Page Sharing
-
18
Memory Ballooning
-
19
Memory Compression
-
20
Memory Swapping
-
22
Memory Ballooning vs. Swapping
Ballooning is better than swapping:
The guest can surrender unused/free pages
The guest chooses what to swap and can avoid swapping hot pages
-
23
Memory VM Resource Allocation
-
24
Memory Resource Pool Allocation
-
25
Memory Rightsizing
Generally it is better to OVER-allocate than UNDER-allocate
If the running VMs are consuming too much host/pool memory:
Some VMs may not get physical memory
Ballooning or host swapping
Higher disk I/O
All VMs slow down
-
26
Memory Rightsizing
If a VM has too little vRAM:
Applications suffer from lack of RAM
The guest OS swaps
Increased disk traffic, thrashing
The SAN slows down as a result of the increased disk traffic
If a VM has too much vRAM:
Higher overhead memory
Possibly decreased failover capacity
Longer vMotion times
A larger VSWP file
Wasted resources
-
27
Memory Troubleshooting
Wrong resource allocation:
You may not notice a limit, e.g. when a VM or template with a limit gets cloned
Custom share values
Ballooning or swapping at the host level:
Ballooning is a warning sign, not a problem in itself
Swapping is a performance issue if seen over an extended period
Swapping/paging at the guest level:
Under-provisioned guest memory
Missing balloon driver (VMware Tools)
The esxtop memory view shows the relevant counters; see the sketch after this list
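A quick way to spot ballooning and host swapping (a minimal sketch; the counters appear per VM in the esxtop memory view, reached by pressing m):

  # From the ESXi shell, start esxtop and press m for the memory view, then watch:
  #   MCTLSZ       - guest memory currently reclaimed by the balloon driver (MB)
  #   SWCUR        - current host swap usage for the VM (MB)
  #   SWR/s, SWW/s - host swap read/write rates; sustained non-zero values hurt
  esxtop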
-
28
Memory Best Practices
Avoid high active host memory over-commitment:
No host swapping occurs when total memory demand is less than the physical memory (assuming no limits)
Right-size guest memory:
Avoid guest OS swapping
Ensure there is enough vRAM to cover demand peaks
Use a fully automated DRS cluster:
Use Resource Pools with High/Normal/Low shares
Avoid using custom shares
-
CPU
-
30
CPU Overview
The raw processing power of a given host or VM:
Hosts provide CPU resources
VMs and Resource Pools consume CPU resources
CPU cores/threads need to be shared between VMs
The scheduler must fairly provide:
vCPU time
Hardware interrupts for a VM
Parallel processing for SMP VMs
I/O
-
31
CPU esxtop
-
32
CPU esxtop
Interpret the esxtop columns correctly:
%RDY - the percentage of time a VM was ready to run, but no physical processor was available to run it
%USED - physical CPU usage
%SYS - percentage of time spent in the VMkernel
%WAIT - %IDLE can be used to estimate I/O wait time
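As a rough worked example (illustrative numbers): if a VM's world shows %WAIT = 95 and %IDLE = 85, then %WAIT - %IDLE = 10, i.e. the vCPU spends roughly 10% of its time blocked waiting on I/O rather than simply idling.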
-
33
CPU Performance Overhead & Utilization
Different workloads have different overhead costs (%SYS) even for the same utilization (%USED)
CPU virtualization adds varying amounts of system overhead:
Direct execution vs. privileged execution
Non-paravirtual adapters vs. paravirtual adapters
Virtual hardware (interrupts!)
Network and storage I/O
-
34
CPU vSMP
Relaxed co-scheduling: vCPUs can run out-of-sync
Idle vCPUs incur a scheduling penalty, so configure only as many vCPUs as needed
Extra vCPUs impose unnecessary scheduling constraints
Use uniprocessor VMs for single-threaded applications
-
35
CPU Scheduling
Overcommitting physical CPUs
VMkernel CPU Scheduler
[Diagram, built up over three slides: as more vCPUs contend for the same physical CPUs, some vCPUs must wait to be scheduled]
-
38
CPU Ready Time
The percentage of time that a vCPU is ready to execute but is waiting for physical CPU time
Does not necessarily indicate a problem:
Indicates possible CPU contention or limits
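Note that the vSphere Client reports CPU Ready as a summation in milliseconds, while esxtop reports a percentage. As a worked example (illustrative numbers, per vCPU): on a real-time chart with its 20-second (20,000 ms) update interval, a CPU Ready value of 1,000 ms converts to (1,000 / 20,000) x 100 = 5% ready time.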
-
39
CPU NUMA nodes
Non-Uniform Memory Access system architecture
Each node consists of CPU cores and memory
A CPU core in one NUMA node can access memory in another node, but at a small performance cost
vNUMA is now available in vSphere 5
[Diagram: two NUMA nodes, each with CPU cores and local memory]
-
40
CPU - vNUMA
Virtual NUMA (vNUMA) exposes the host NUMA topology to the guest operating system
Requires virtual hardware version 8
Enabled by default on VMs with more than 8 vCPUs
VMs with 8 or fewer vCPUs need their advanced configuration edited to enable vNUMA (see the sketch below)
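For illustration (a hedged sketch; the threshold value 4 is just an example), the relevant advanced setting in the VM's .vmx file is:

  # Expose vNUMA to VMs with 4 or more vCPUs (the default minimum is 9)
  numa.vcpu.min = "4"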
-
41
CPU Power Management
Can be set from the vSphere Client
-
42
CPU Troubleshooting
vCPU to pCPU over-allocation:
Hyper-Threading does not double CPU capacity!
Limits, or too many reservations, can create artificial constraints
Expecting the same consolidation ratios with different workloads:
Virtualizing easy systems first, then expanding to heavier systems
Compare apples to apples:
Frequency, turbo, cache sizes, cache sharing, core count, instruction set
-
43
CPU Best Practices
Right-size vSMP VMs
Keep heavy-hitters separated:
Fully automated DRS should do this for you
Use anti-affinity rules if necessary
Use a fully automated DRS cluster:
Use Resource Pools with High/Normal/Low shares
Avoid using custom shares
-
STORAGE
-
45
Storage: esxtop Counters
Different esxtop storage views:
Adapter (d)
VM (v)
Disk Device (u)
Key fields:
DAVG + KAVG = GAVG (a worked example follows this list)
QUED / %USD - command queue depth and percentage of the queue in use
CMDS/s - commands per second
MBREAD/s - megabytes read per second
MBWRTN/s - megabytes written per second
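As a worked example (illustrative numbers): if GAVG (latency as seen by the guest) is 25 ms and DAVG (device latency) is 20 ms, then KAVG = GAVG - DAVG = 5 ms is being spent in the kernel storage stack, which often points to a full queue.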
-
46
Storage Troubleshooting with esxtop
High DAVG: the issue is beyond the adapter
Over-utilized storage processors, too few platters in the RAID set, etc.
High KAVG: the issue is in the kernel storage stack
Driver issue
Full queue
Aborts: GAVG exceeding 5000 ms
The command will be retried, causing a storage delay for the VM
-
47
Storage Benchmarking with iometer
-
48
Storage: Storage I/O Control
Allows the use of shares per VMDK
Throttling occurs when the datastore reaches its latency threshold:
Higher-share VMDKs perform I/O first
vCenter monitors latency across all hosts:
Not effective if the datastore is shared with other vCenters
-
49
Storage: Storage DRS
Datastore clusters provide:
Maintenance mode
Anti-affinity rules
vCenter monitors latency and disk space:
Migrates VMDKs for better performance or utilization
Not effective with automated tiering SANs:
Check the HCL to confirm these features are compatible
-
50
Storage Troubleshooting
Snapshots
Excessive traffic down one HBA / switch / storage processor can cause latency:
Consider using Round Robin in conjunction with ALUA
Always be paranoid when it comes to monitoring storage I/O
Consider your I/O patterns:
When is the peak time for storage I/O?
Virus scans, database maintenance, user logins
Always consult with your array vendor:
They know the best practices for their array!
-
51
Storage Best Practices
Use different tiers of storage for different VM workloads:
Slower storage for OS VMDKs
Faster storage for databases and other high-I/O applications
Use the Paravirtual SCSI adapter:
Reduced overhead, higher throughput
Use path balancing where possible, either through 3rd-party plugins or Round Robin with ALUA, if supported (see the sketch after this list)
Use Storage DRS with SIOC:
Balances both free space and latency
Simplified datastore management
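For illustration (a minimal sketch; the naa device identifier is a placeholder), switching a device to Round Robin from the ESXi 5.x shell:

  # Show the current path selection policy for a device
  esxcli storage nmp device list -d naa.60000000000000000000000000000001

  # Set the path selection policy to Round Robin
  esxcli storage nmp device set -d naa.60000000000000000000000000000001 --psp VMW_PSP_RR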
-
NETWORK
-
53
Network Load Balancing
Load balancing defines which uplink is used:
Route based on originating virtual port ID
Route based on IP hash
Route based on source MAC hash
Route based on physical NIC load (Load-Based Teaming)
There is a probability of high-bandwidth VMs landing on the same physical NIC
Traffic stays on the elected uplink until an event occurs:
NIC link state change, adding/removing a NIC from the team, beacon probe timeout
-
54
How to Check Network Performance
VM to VM on the same ESXi host: excludes physical network problems
VM to VM on different ESXi hosts: also involves the physical NICs and switch
Physical to VM: also tests the physical devices, but lets us focus on one VM
Physical to physical: gives us a baseline number for what to expect
Use iperf/jperf/netperf: free tools for network testing (see the sketch after the next slide)
-
55
Iperf
-
56
Iperf
Windows and Linux versions are available
Does not use storage
Different options for testing (UDP/TCP)
Automatically calculates bandwidth
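A minimal iperf run looks like this (a sketch; the address and durations are placeholders):

  # On the receiving VM: start the server
  iperf -s

  # On the sending VM: run a 30-second TCP test
  iperf -c 192.0.2.10 -t 30

  # UDP test at a 100 Mbit/s target rate
  iperf -c 192.0.2.10 -u -b 100M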
-
57
Network Troubleshooting
Check counters for NICs and VMs:
Network load imbalance
10 Gbps NICs can incur a significant CPU load when running at 100%
Ensure the hardware supports TSO
Use the latest drivers and firmware for the NICs on the host
For multi-tier VM applications, use DRS affinity rules to keep the VMs on the same host:
Same vSwitch / VLAN rules out the physical network
If using jumbo frames, ensure they are enabled end-to-end (see the sketch below)
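A quick end-to-end jumbo frame check from the ESXi shell (a sketch; the address is a placeholder; 8972 bytes is a 9000-byte MTU minus the IP and ICMP headers):

  # -d sets the do-not-fragment bit; -s sets the payload size
  vmkping -d -s 8972 192.0.2.20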
-
58
Network Best Practices
Use the vmxnet3 virtual adapter:
Less CPU overhead
10 Gbps connection to the vSwitch
Use the latest driver/firmware for the NICs on the host
Use network shares:
Requires a Virtual Distributed Switch (4.1 or later)
Isolate vMotion and iSCSI traffic from regular VM traffic:
Use separate vSwitches with dedicated NIC(s)
Most applicable with Gigabit NICs
-
59
In conclusion
-
60
Key Takeaways: Performance Best Practices
Understand your environment:
Hardware, storage, networking
VMs & applications
In almost all situations, advanced configuration values do not need to be tweaked or modified
Use fully automated DRS
Use Paravirtual hardware
-
61
Important Links
-
Fill out a survey at www.vmworld.com/mobile. Complete the survey within one hour after each session and you will be entered into a drawing for a gift from the VMware company store.
-