Download - WinConnections Spring, 2011 - 30 Bite-Sized Tips for Best vSphere and Hyper-V VM Performance
30 Bite-Sized Tips for Best VM Performance
Greg Shields, MVP, Senior Partner and Principal Technologist
www.ConcentratedTech.com
#1: Purchase Compatible Hardware
• …and not just “compatible with ESX”.
• Purchase hardware compatible with each other.
● Particularly considering vMotion needs.
#2: Buy Nehalem/Opteron
• Intel Nehalem & AMD Opteron include support for Intel EPT / AMD RVI processor extensions.
● Together, referred to as Second Level Address Translations, or SLAT
● Includes hardware-assisted memory management unit (MMU) virtualization.
● Significantly faster for certain workloads, such as those with heavy context switching.
● Finally, full support for Remote Desktop Services / XenApp
• Note that these support Large Memory Pages, which will disable ESX’s page table sharing.
#3: Mind NIC Oversubscription
• One of the greatest benefits of iSCSI is its linear scalability.
● Need more throughput, just add another NIC!
• However, VLANs and link aggregation introduce the notion of NIC oversubscription.
● Ceteris Paribus, Storage traffic >>> Regular traffic.
• Even with VLANs, always segregate storage NICs from production networking NICs.
● If possible/affordable, use segregated network paths.
● Monitor! This will kill your performance faster than anything!
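The oversubscription concern above is simple arithmetic: sum what the VMs can demand against what the physical links provide. A minimal sketch, where the per-VM demands and 1 GbE link speeds are hypothetical examples:

```python
# Rough oversubscription check: aggregate VM demand vs. physical NIC capacity.
# All numbers below are hypothetical, for illustration only.

def oversubscription_ratio(demands_mbps, nic_capacities_mbps):
    """Total demanded throughput over total link capacity; > 1.0 is oversubscribed."""
    return sum(demands_mbps) / sum(nic_capacities_mbps)

# Four VMs sharing two 1 GbE NICs on one vSwitch:
ratio = oversubscription_ratio([400, 600, 500, 700], [1000, 1000])
print(f"{ratio:.2f}")  # 2200 / 2000 = 1.10 -> oversubscribed
```

Anything approaching 1.0 is the point where storage traffic starts fighting production traffic for the same links.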
#4: Consider Further Segregating Heavy Workloads
• Some VMs run workloads that make heavy use of their attached disks.
● Consider segregating these workloads onto their own independent NICs and paths.
● Keep an eye on your IOPS.
#5: vSphere 4.0 VMs Don’t Back Up Applications Correctly!
• vSphere 4.1 added full support for Microsoft VSS on Server 2008 guests.
● This support is only automatic if the guest was initially created on a vSphere 4.1 host.
● Hosts upgraded from vSphere 4.0 aren’t properly backing up their applications.
• Fix this by setting disk.EnableUUID to True.
● Power off the machine.
● Edit Settings | Options | General | Configuration Parameters | Add Row
● Power on the machine.
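The row added via Configuration Parameters corresponds to a single line in the VM’s .vmx file (key name per VMware’s VSS guidance; the quoted uppercase value is the usual .vmx boolean style):

```
disk.EnableUUID = "TRUE"
```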
#6: HBA Max Queue Depth
• One solution for poor fibre storage performance can be adjusting your HBA maximum queue depth.
● More queues can mean more performance, but less cross-device and cross-VM optimizations.
● 32 by default.
● This is not a task to be taken lightly.
● Kind of like adjusting the air/fuel mix on a carburetor.
• Multi-step process. See http://kb.vmware.com/kb/1267 for details.
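Little’s law is the intuition behind this knob: sustainable IOPS is bounded by outstanding I/Os (the queue depth) divided by per-I/O service time. A sketch with an assumed 2 ms latency, purely for illustration:

```python
# Little's law: achievable IOPS <= outstanding I/Os / service latency.
# The 2 ms latency figure is an assumption, not a measurement.

def max_iops(queue_depth, latency_s):
    """Upper bound on IOPS a path can sustain at a given queue depth."""
    return queue_depth / latency_s

print(max_iops(32, 0.002))  # default depth of 32 -> ~16,000 IOPS ceiling
print(max_iops(64, 0.002))  # deeper queue raises the ceiling, at the cost
                            # of fewer cross-device/cross-VM optimizations
```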
#7: Consider Hardware iSCSI
• …but, perhaps, don’t buy them…
• ESX’s software iSCSI initiator works well.
● However, using it incurs a small processing overhead.
● Hardware iSCSI NICs offload this overhead to the card.
● NFS/NAS storage also experiences this overhead.
• Newer NICs reduce this effect, particularly those with…
● Checksum offload
● TCP segmentation offload (TSO)
● 64-bit DMA addressing
● Multiple scatter/gather elements per Tx frame
● Jumbo frames
#8: Set NICs to Autonegotiate
• VMware’s recommendation is to set all NICs to autonegotiate, full duplex.
● This is sort of “duh” these days.
● But it’s worth mentioning, because…
● Some old-school network admins still prefer to manually set speed/duplex due to a crazy old race-condition bug that happened a long, long time ago.
● Just smack around those old coots.
● “You can take your Token Ring and your IPX and go home now!”
#9: Do Not Team Storage NICs
• What? Don’t team them?
● Well, I guess I mean “team” as in the classic sense of network teaming.
• Remember that storage NICs leverage MPIO for link aggregation.
● MPIO is superior to link aggregation for storage anyway.
● ‘tis also easier to use, and better for routing!
• vCenter’s GUI wizards make this hard not to do, but be aware that extra steps are required…
#10: Enable Hyperthreading
• Early in ESX’s days we debated whether hyperthreading improved or decreased overall performance.
● That debate is over. The winner is “improves”.
• Today, hyperthreading adds a non-linear additional quantity of processing capacity.
● Like 20-30% (???), not a full extra proc. But you know this.
● Enable it in your servers’ BIOS.
● Just turn it on, OK?
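For capacity math, treat hyperthreading as a fractional bonus, never a doubling. A sketch: the 25% gain is a mid-range assumption drawn from the 20-30% figure above, and the host specs are hypothetical:

```python
# Hyperthreading adds roughly 20-30% capacity, not a full extra processor.
# The 25% gain is an assumed mid-range value, not a measured one.

def effective_capacity_ghz(cores, ghz_per_core, ht_enabled, ht_gain=0.25):
    """Aggregate CPU capacity, counting HT as a fractional bonus."""
    base = cores * ghz_per_core
    return base * (1 + ht_gain) if ht_enabled else base

print(effective_capacity_ghz(8, 2.5, ht_enabled=False))  # 20.0 GHz
print(effective_capacity_ghz(8, 2.5, ht_enabled=True))   # 25.0 GHz -- not 40!
```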
#11: Allocate Only the CPUs You Need
• Allocate only as many vCPUs as a VM requires.
● Start with only one as your baseline. Rarely deviate. Circle this bullet point. No, really.
● Don’t use dual vCPUs for a single-threaded application.
● Don’t assign more vRAM than necessary.
• More vCPUs equals more problems.
● More vCPUs equals more interrupts.
● Extra overhead in maintaining a consistent memory view between vCPUs. This is tough, especially with today’s descheduled processing.
● Some OSs migrate single-threaded workloads between multiple CPUs, adding a performance tax.
• More CPUs are good for CPU spike handling, though.
#12: Disconnect Unused Physical Hardware Devices
• COM, LPT, USB, Floppy, CD/DVD, NICs, etc. all consume interrupt resources.
● High-priority resources.
● It is a big deal to insert a CD/DVD/USB.
• Connected Windows guests will poll CD/DVD drives very frequently, significantly affecting performance.
● Disconnect these in VM properties when not in use.
● There’s a reason why the “Connected” checkbox exists!
● Note! Connected devices can prevent a vMotion!
#13: Upgrade to VM Version 7
• Virtual hardware version 7 offers some very significant performance improvements.
● VMXNET3 paravirtualized NIC driver
● PVSCSI paravirtualized SCSI driver
● Upgrade VMware Tools. Reboot.
● (More on these in a minute)
• Note that VMv7 hardware cannot be vMotioned to ESX servers prior to 4.0.
● Be careful of this.
#14: Don’t Fear Scaling Out
• Creating VMs is easy, so we create them.
● You’ll eventually run out of CPU resources.
● You’ll probably run out of RAM first.
● Don’t run more VMs than your processing/memory capacity allows.
• When running very close to capacity, use CPU reservations to guarantee 100% CPU availability for the console.
● Host | Configuration | System Resource Allocation | Edit
● Particularly important if you have software installed there.
● This is unnecessary in ESXi.
#15: 80% is Nice
• VMware recommends maintaining an administrative ceiling on utilization at 80%.
● This reserves enough capacity for failures and the service console.
● VMware suggests that 90% should be a warning for overconsumption.
• Less dynamic workloads can shift this ceiling up.
● …but, seriously, who can really state that?
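The two thresholds make an easy automated check. The 80% ceiling and 90% warning come straight from the tip; the host numbers are hypothetical:

```python
# VMware's suggested administrative ceiling (80%) and warning level (90%).
CEILING = 0.80
WARNING = 0.90

def utilization_status(used, capacity):
    """Classify a host's utilization against the 80%/90% thresholds."""
    u = used / capacity
    if u >= WARNING:
        return "overconsumed"
    if u >= CEILING:
        return "above ceiling"
    return "ok"

print(utilization_status(150, 200))  # 75.0% -> ok
print(utilization_status(170, 200))  # 85.0% -> above ceiling
print(utilization_status(185, 200))  # 92.5% -> overconsumed
```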
#16: With Older OSs, Use UP HAL When Possible
• Newer OSs (Vista, W7, 2008) use the same HAL for all UP/SMP conditions.
• Older OSs leverage two HALs:
● A Uniprocessor HAL
● A Multiprocessor HAL
• An SMP HAL that is only given a single vCPU will run slightly slower.
● Slightly more synchronization code.
• Note that this will impact hot add.
#17: Mind Scheduling Affinity
• It is possible to tag a VM to a particular pProc.
● Good for ensuring that VM has processing resources during contention.
● Setting Code Sharing to None prevents any other vProc from using a pProc on the same core. Like disabling HT.
● Setting Code Sharing to Internal prevents vProcs on other VMs from using a pProc on the same core. Only the same VM can.
● Just set this to Any.
#18: Don’t Touch this Setting.
• Exceptionally rare are the cases when this setting should be adjusted.
● So, no touchy.
● I will tell you when.
● I have very reasonable consulting rates.
#19: Don’t Just Keep Up the Old (and Dumb) Habit of Assigning 4 GB of RAM to Every Stinking Virtual Machine, No Matter What Workload It Runs. Really.
• Consciously consider the amount of RAM that a VM needs, and assign it that RAM.
● Yes, VMware has memory ballooning.
● But overallocating unnecessarily increases VM overhead.
● Ballooning isn’t automatic. Ballooning is slow. Ballooning is reactive.
• Note to Self: Talk about Hyper-V’s Dynamic Memory here. Very cool.
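One way to replace the blanket 4 GB habit with a rule: size from observed peak usage plus headroom. The 25% headroom and 256 MB rounding granularity here are illustrative assumptions, not VMware guidance:

```python
# Right-size vRAM from observed peak usage instead of a blanket 4 GB.
# Headroom and rounding granularity are assumed values for illustration.

def right_size_ram_mb(peak_used_mb, headroom=0.25, granularity_mb=256):
    """Observed peak plus headroom, rounded up to an allocation boundary."""
    target = int(peak_used_mb * (1 + headroom))
    return ((target + granularity_mb - 1) // granularity_mb) * granularity_mb

# A VM that peaks at 1200 MB needs about 1536 MB -- not 4096:
print(right_size_ram_mb(1200))  # 1536
```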
#20: Stop with the Snapshots
• Snapshots are (were) a significant selling point in the early days of virtualization.
● About to do something risky? Snapshot! It’s like a career-protection device!
• However, snapshots aren’t (and never were) meant for long-term storage.
● And I mean “no more than just a few minutes” long.
● They’re not meant for backups.
● Reverting to an aged snapshot can break the computer’s trust relationship with the Windows domain.
● Managing snapshots, particularly linked ones, significantly reduces overall VM performance.
#21: Perform vSphere Tasks in the Off Hours
• Some vSphere tasks are actually quite impactful on VM operations.
● Provisioning virtual disks
● Cloning virtual machines
● svMotion
● Manipulating file permissions
● Backups
● Anti-virus (bleh)
• Do these tasks during off hours, or you may impact performance for other running VMs.
#22: Mind Affinities
• Some VMs need to regularly communicate with each other with high throughput.
● “Keep Virtual Machines Together”
● Make sure these machines share the same vSwitch.
● Collocation forces inter-VM traffic through the system bus rather than pNICs, significantly increasing speed.
• Losing both VMs at once could be bad, though, if they’re collocated on the same host.
● “Separate Virtual Machines”
#23: Disable Screen Savers
• And Window animations.
• Screen savers represent a machine interrupt, particularly those with heavy graphics.
● “Pipes”, I’m looking right at you!
● This interrupt is particularly impactful on collocated VMs.
● …and, plus, screen savers on servers are sooooo 2002.
#24: Use NTP, Not VMware Tools, for Time Sync
• …and here’s one out of the odd files…
• VMware suggests configuring VMs to sync time from an external NTP server.
● They prefer this even over their own internal timekeeper.
● Their timekeeper uses a much lower resolution than NTP.
● NTP = milliseconds
● NT5DS = 1 second
● VMware Tools = ?
#25: Never Use PerfMon Inside the VM, Except…
• Not that you’d ever actually use PerfMon, but…
● Measuring performance from within a virtual machine fails to account for unscheduled time.
● Essentially, when the ESX server isn’t servicing the VM, no time passes within that VM.
● Also, in-VM PerfMon doesn’t recognize virtualization overhead.
● Most important, in-VM PerfMon can’t see down into the layers below the VM: storage, processing, etc.
• VMware Tools adds PerfMon counters to VMs.
● These are OK to use, as they’re synched from ESX.
#26: Paravirtualization is Your Friend
• VM Hardware Version 7 adds two new paravirtualized drivers.
● VMXNET3 replaces E1000
● PVSCSI replaces BusLogic/LSILogic
• Paravirtualized drivers are superior to emulation.
● They are “aware” they’ve been virtualized and can work directly with the host without needing emulation.
● Mexican menus versus French menus.
• VMXNET3 supports TSO & Jumbo Frames, in the VM!
● Even if the physical hardware doesn’t support TSO!
#27: Turn on Jumbo Frames, but Do It Everywhere
• If you plan to use Jumbo Frames…
● MTU size is usually set to 9000.
● Make sure you enable it everywhere.
● This brings a particular assist with large file transfers (think WDS, virtual disk provisioning, etc.) and storage connections.
• Not all network equipment supports Jumbo Frames.● Test, test, test.
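The benefit is easy to see as frame-count arithmetic: the same payload takes roughly six times fewer frames at MTU 9000 than at 1500, which means fewer headers and fewer per-frame interrupts. A quick sketch:

```python
# Frames needed to move a payload at a given MTU (ceiling division).

def frames_needed(payload_bytes, mtu):
    return -(-payload_bytes // mtu)

gib = 1 * 1024**3  # e.g., a 1 GiB virtual-disk transfer
standard = frames_needed(gib, 1500)
jumbo = frames_needed(gib, 9000)
print(standard, jumbo)  # 715828 119305 -- roughly a 6x reduction
```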
#28: DRS Will Prioritize Faster Hosts over Slower Ones
• A neat fact (that I didn’t know):
● When potential hosts for a DRS relocation have compatible CPUs but different CPU frequencies and/or memory capacity…
● …DRS will prioritize relocating VMs to the system with the highest CPU frequency and more memory.
● This won’t be the case if that CPU is already at capacity.
#29: Disable FT, Unless You’re Using It
• …and most of you aren’t.
• You can “turn on” but not “enable” FT.
● Problem: Turning on Fault Tolerance automatically disables some features that enhance VM performance.
● Hardware virtual MMU is one.
• Or, just don’t use that horrible feature. Har!
● (Is there anyone from VMware in the audience…?)
#30: Match Configured OS with Actual OS
• Big oops here, usually during OS migrations.
• This setting also sets a few important low-level kernel optimizations.
• Make sure yours are correct!
BONUS TIP #31: Follow the Numbers
• Private Clouds are all about quantifying performance in terms of supply and demand.
• vSphere gives you those numbers. Just sum ‘em up.
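“Summing them up” really is the whole method: supply minus aggregated demand equals headroom. A sketch with hypothetical host and VM figures:

```python
# Capacity planning as supply vs. demand: headroom = supply - sum(demands).
# Host supply and per-VM demand numbers are hypothetical examples.

def headroom(host_supply, vm_demands):
    """Remaining capacity after all VM demand is subtracted from host supply."""
    return host_supply - sum(vm_demands)

cpu_left_mhz = headroom(64_000, [4_000, 6_500, 12_000, 8_000])
ram_left_mb = headroom(131_072, [4_096, 8_192, 16_384, 2_048])
print(cpu_left_mhz, ram_left_mb)  # 33500 100352
```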
Final Thoughts
• See! Creating good VMs isn’t all that easy.
● Our jobs aren’t going away any time soon!
● These little optimizations add up.
• Be smart with your virtual environment and always remember…
• …you cannot change the laws of Physics!