VMworld 2013: Successfully Virtualize Microsoft Exchange Server
DESCRIPTION
VMworld 2013 session by Alex Fontana, VMware. Learn more about VMworld at http://www.vmworld.com/index.jspa?src=socmed-vmworld-slideshare
TRANSCRIPT
Successfully Virtualize
Microsoft Exchange Server
Alex Fontana, VMware
VAPP5613
#VAPP5613
Agenda
Exchange on VMware vSphere Overview and Updates
VMware vSphere Best Practices for Exchange
Availability and Recovery Options
Q & A
Exchange on VMware vSphere Overview and Updates
Continued Trend toward Virtualization

Exchange 2003
• 32-bit application
• 900MB database cache
• 4KB block size
• High read/write ratio

Exchange 2007
• 64-bit application
• 32+ GB database cache
• 8KB block size
• Closer to 1:1 read/write ratio
• 70% reduction in disk I/O

Exchange 2010
• 64-bit application
• 72+ GB database cache
• 32KB block size
• More sequential I/O optimization
• 50% reduction in disk I/O from Exchange 2007

Exchange 2013
• 64-bit application
• 50% reduction in disk I/O from Exchange 2010
• Rewritten store process
• Full virtualization support at RTM
Support Considerations (What Is and What Isn’t?)
Support for virtualized Exchange has evolved drastically over the last two years, leading to confusion and misconceptions
What is Supported?
• Virtualization of all server roles, including Unified Messaging with Exchange
2010 SP1 and 2013
• Combining Exchange 2010 SP1 and 2013 DAG with vSphere HA and vMotion
• Thick virtual disks and raw-device mappings (pass-thru disk)
• Fibre Channel, FCoE, and iSCSI (native and in-guest)
What is Not Supported?
• NFS Storage for Exchange files (binaries, mailbox database, HT queue, logs)
• Thin virtual disks
• Virtual machine snapshots
• What about backups? *
MS TechNet – Understanding Exchange 2010 Virtualization: (http://technet.microsoft.com/en-us/library/jj126252)
MS TechNet – Exchange 2013 Virtualization: (http://technet.microsoft.com/en-us/library/jj619301(v=exchg.150).aspx)
Common Support Misconceptions

“Exchange supports a virtual processor-to-logical processor ratio no greater than 2:1, although we recommend a ratio of 1:1.”¹
• Microsoft uses “logical” to describe physical processor cores. Think “physical cores”, nothing more, nothing less.

“All failover activity occurring at the hypervisor level must result in a cold boot when the virtual machine is activated on the target node.”¹
• vSphere HA always restarts virtual machines as a cold boot. vMotion is not considered “failover activity” and is a supported method of “online migration”.

“…virtual machine snapshots aren't application aware, and using them can have unintended and unexpected consequences…”¹
• True. If your backup strategy is based on VMware snapshots (e.g. vDP-A) there must be an Exchange-aware in-guest agent for quiescing and log truncation.

“…using dynamic memory features for Exchange isn't supported.”¹
• “Dynamic Memory” is a Hyper-V technology; there is no equivalent technology in vSphere. Over-commitment of memory is not supported by Microsoft and is not a recommended practice by either Microsoft or VMware for Exchange.
¹ http://technet.microsoft.com/en-us/library/jj619301(v=exchg.150).aspx
VMware vSphere Best Practices for Exchange
Best Practices for vCPUs
CPU over-commitment is possible and supported, but
approach conservatively
• Size according to physical core capabilities
Enable hyper-threading at the host level and on the VM (HT Sharing: Any)
• If a vCPU requires a full core, the CPU scheduler will halt the other hyper-thread
• Better resource utilization for non-vCPU worlds (ESXi system processes)
• .NET garbage collection memory over-allocation is not an issue with VMs
Enable Non-Uniform Memory Access (NUMA)
• Exchange is not NUMA-aware, but ESXi is and will schedule SMP VM vCPUs
onto a single NUMA node (if it fits)
Size the VM to fit within a NUMA node
• If the NUMA node is 8 cores, keep the VM <= 8 vCPUs
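To make the sizing rule concrete, here is a minimal sketch (a hypothetical helper, not part of any VMware tooling) of the fit-within-a-NUMA-node check:

```python
def fits_numa_node(vcpus: int, cores_per_numa_node: int) -> bool:
    """Return True if a VM with `vcpus` virtual CPUs fits within a
    single NUMA node of `cores_per_numa_node` physical cores."""
    return vcpus <= cores_per_numa_node

# A 2-socket host with 8 cores per socket typically has 8-core NUMA nodes:
assert fits_numa_node(8, 8)       # 8 vCPUs fit on one node
assert not fits_numa_node(12, 8)  # 12 vCPUs would span two nodes
```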
CPU Over-Commitment
Allocating 2 vCPUs to every physical core (2x over-commitment) is supported, but don’t do it. Keep the ratio at 1:1 until a steady workload is established.
• Sizing is always based on a dedicated core’s capability (SPECint2006). Start over-committing and you might as well toss those numbers out.
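The 1:1 guidance amounts to a simple ratio check; the sketch below is illustrative only (the function name and inputs are hypothetical):

```python
def vcpu_to_pcore_ratio(total_vcpus: int, physical_cores: int) -> float:
    """Ratio of vCPUs allocated across Exchange VMs to physical host cores.
    Sizing assumes 1:1; Microsoft supports at most 2:1."""
    return total_vcpus / physical_cores

# 16 vCPUs on a 16-core host is the recommended 1:1:
assert vcpu_to_pcore_ratio(16, 16) == 1.0
# 24 vCPUs on the same host is 1.5:1 -- supported, but discouraged:
assert vcpu_to_pcore_ratio(24, 16) == 1.5
```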
What About vNUMA?
vSphere 5.0 introduced vNUMA … does it apply to Exchange?
• Not really – Exchange is not NUMA-aware
Instead, use virtual sockets to assign vCPUs, leave “Cores per Socket” at 1, and keep the number of vCPUs <= NUMA node size (use sockets, not cores)
Best Practices for Virtual Memory
No memory over-commitment. None. Don’t allow it.
• Exchange allocates the majority of memory presented to the guest OS to jet cache; ESXi memory reclamation techniques can degrade performance
Unsure if you can guarantee access to physical memory? Use reservations.
• High VM churn can result in inadvertent over-commitment
• Keep in mind, vSphere HA may be unable to fail over VMs if the reserved memory is unavailable
Do not disable the balloon driver
• If memory does come under contention the balloon driver is the first level
of defense before memory compression or…eek…swapping!
Storage Best Practices
Use multiple vSCSI adapters
• More on why in a second…
Use eager-zeroed thick virtual disks (or uncheck Quick Format in the guest)
• Eliminates the zeroing penalty on first write
• Takes longer to initially provision virtual machines
• Do not use if using array-based thin provisioning
Use a 64KB allocation unit size when formatting NTFS volumes
Follow storage vendor recommendations for path selection policy
• Unlike with Windows Failover Clustering, there are no restrictions
Set power policy to high performance
• Or disable power management in BIOS
Don’t confuse DAG and MSCS when it comes to
storage requirements
Storage Best Practices – vSCSI Adapters (1)
Avoid inducing queue-depth saturation within the guest OS
The default configuration will attempt to place the first 15 storage targets onto a single vSCSI adapter
Storage Best Practices – vSCSI Adapters (2)
Spread high-I/O workloads across multiple VMDKs, VMFS volumes, or RDMs (a.k.a. storage targets)
Spread storage targets across multiple vSCSI adapters
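The spreading guidance above can be sketched as a simple round-robin layout (a hypothetical helper; ESXi supports up to four vSCSI adapters per VM):

```python
def assign_disks_to_adapters(num_disks: int, num_adapters: int = 4):
    """Round-robin virtual disks across vSCSI adapters so high-I/O
    storage targets are spread evenly rather than stacked on one adapter."""
    layout = {adapter: [] for adapter in range(num_adapters)}
    for disk in range(num_disks):
        layout[disk % num_adapters].append(disk)
    return layout

# 5 database disks over 3 adapters: no adapter carries more than 2 targets
layout = assign_disks_to_adapters(5, 3)
assert [len(disks) for disks in layout.values()] == [2, 2, 1]
```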
Storage Best Practices – vSCSI Adapters (3)
Exchange 2013 JetStress: 1 LSI SAS vSCSI adapter, 5 VMDKs, 5 databases
• High I/O latency; average aggregate I/O: 2,900 IOPS
• Database page fault stalls = BAD!
Storage Best Practices – vSCSI Adapters (4)
Exchange 2013 JetStress: 3 LSI SAS vSCSI adapters, 5 VMDKs, 5 databases
• Much better I/O latency, <20ms; average aggregate I/O: 5,200 IOPS
• Zero database page fault stalls = GOOD!
When to Use Raw Device Mappings?
Performance?
• Performance is no longer a deciding factor for using raw device
mappings (RDMs)
• VMDK disks perform comparably to RDMs
Capacity?
• Not a concern in vSphere 5.5
• Pre-5.5 VMDKs are limited to 2TB; physical-mode RDMs (pRDMs) support 64TB in either case
Storage interaction?
• Backup solutions might require RDMs because of storage interaction
needed for hardware-based Volume Shadow Copy Service (VSS)
Considerations
• Easier to exhaust the 255-LUN limit of an ESXi host
• VMFS volumes can support multiple virtual disks
• vSphere storage features leverage virtual disks
What About NFS and In-Guest iSCSI?
NFS
• Explicitly not supported for Exchange data (binaries, databases or logs)
by Microsoft
• Consider using for guest operating system (C: drive)
In-guest iSCSI
• Supported for DAG database storage
• Facilitates easy storage zoning and access masking
• Useful for minimizing number of LUNs zoned to an ESXi host
• Offloads storage processing resources away from ESXi hosts
Networking Best Practices
VMware vSphere Distributed Switch™ or standard switch?
• Choice is yours, but distributed switches require less management overhead
Separate traffic types
• Management – vmkernel, vSphere vMotion, VMware vSphere Fault Tolerance (FT)
• Storage – iSCSI, FCoE
• Virtual machine – MAPI, replication, DMZ
Configure vSphere vMotion to use multiple NICs to
increase throughput
Use the VMXNET3 paravirtualized network interface within
the guest
• Refer to VMware KB 2039495
Follow Microsoft best practices – allocate multiple NICs to Exchange virtual machines participating in a DAG
Exchange DAG Networking
DAG virtual machines should have two virtual network adapters, one for replication and one for client access (MAPI)
• If separate networks are not possible, use a single virtual NIC
Avoid Database Failover during vSphere vMotion
When using vSphere vMotion with DAG nodes:
• If supported at the physical networking layer, enable jumbo frames on all vmkernel ports to reduce the number of frames that must be generated and processed
• If jumbo frames cannot be supported across all networking paths, raise the cluster heartbeat samesubnetdelay parameter to a maximum of 2000ms (default = 1000ms)
• Always dedicate vSphere vMotion interfaces for the best performance
• Where possible, use multiple vSphere vMotion interfaces for increased throughput

C:\> cluster.exe /cluster:dag-name /prop samesubnetdelay=2000

PS C:\> $cluster = Get-Cluster dag-name; $cluster.SameSubnetDelay = 2000
Availability and Recovery Options: Backup and Recovery
Database Protection
Database backups
• Software-based VSS using Windows Backup or third-party software
  • Allows use of VMFS or RDM
• Hardware-based VSS using storage vendor software
  • Can use either full clones or snapshots
  • Requires physical-mode RDMs, unless using NFS or iSCSI from within the guest OS
Availability and Recovery Options: Local Site Options
VMware vCloud Networking and Security Edge
Client Access servers require load balancing for high availability
Exchange 2010 and 2013 support hardware/software load balancers or DNS round-robin (2013 only)
DNS round-robin is passive load balancing with no insight into the number of connections or load
Hardware load balancers have higher cost and require more management
VMware vCloud® Networking and Security Edge™ uses existing vSphere capacity to provide security and load balancing
vCloud Networking and Security Edge can be deployed in high-availability pairs for redundancy
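The difference between DNS round-robin and a connection-aware load balancer can be illustrated in a few lines (the server names below are hypothetical):

```python
from itertools import cycle

# DNS round-robin hands out Client Access servers in order, blind to load:
servers = ["cas-1", "cas-2", "cas-3"]
rr = cycle(servers)
picks = [next(rr) for _ in range(4)]
assert picks == ["cas-1", "cas-2", "cas-3", "cas-1"]

# A real load balancer can track connections and pick the least-loaded server:
connections = {"cas-1": 120, "cas-2": 45, "cas-3": 80}
assert min(connections, key=connections.get) == "cas-2"
```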
High Availability with vSphere HA
No need for multiple database copies to manage
Easy to configure and manage
Virtual machines recover in minutes after hardware failure
Protects from hardware and guest OS failures only
(Diagram: MBX and CAS virtual machines restarted on a surviving host during an HA failover)
vSphere HA + Exchange DAG
Protects from hardware and application failure
vSphere HA allows DAG to maintain protection level
Supports vSphere vMotion and DRS
Equivalent or better protection than physical DAG
(Diagram: DAG 1 members and CAS virtual machines; a failed DAG member is restarted on another host during an HA failover)
vSphere HA + Exchange 2010/2013 DAG Recommendations
Achieving better than physical DAG protection requires N+1
vSphere configuration (N = number of DAG members)
One DAG member per host, co-locate members of different DAGs
on the same host
• Recommended database distribution is symmetrical; hosting two members of the same DAG on a single host creates a single point of failure
Create an anti-affinity rule for each DAG
• Ensures DAG members are kept separate during power-on placement
• vSphere HA may violate this rule
Enable DRS Fully Automated mode
• Allows DRS to remediate a vSphere HA violation
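The N+1 and anti-affinity rules above can be expressed as a quick placement check (a hypothetical sketch for a single DAG, not a VMware API):

```python
def dag_placement_ok(member_hosts: list, num_hosts: int) -> bool:
    """Check two rules for a single DAG: N members need N+1 hosts,
    and no host runs more than one member of the DAG.
    `member_hosts` lists the host each DAG member runs on."""
    n_members = len(member_hosts)
    if num_hosts < n_members + 1:
        return False  # no spare host for vSphere HA to restart into
    return len(set(member_hosts)) == n_members  # one member per host

# 3 DAG members on 3 distinct hosts in a 4-host cluster: OK
assert dag_placement_ok(["esx1", "esx2", "esx3"], num_hosts=4)
# Two members sharing esx1 creates a single point of failure:
assert not dag_placement_ok(["esx1", "esx1", "esx2"], num_hosts=4)
```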
Availability and Recovery Options: Remote Site Options
vCenter Site Recovery Manager + DAG
DAG provides local site high availability
During a site failure, multiple applications can be recovered using
the same process
After workflow is initiated, vCenter Site Recovery Manager
automates the recovery process
Entire process can be tested without actually failing over services!
SRM Recovery Workflow for DAG
SRM Recovery
• Press the big red button
Power On DAG-Node-1
• IP customization
• Reboot
Reconfigure DAG
• DAG-Node-1 rebooted
• Configure new Witness Server* for DAG
• Configure new IP address for DAG
• Reboot DAG-Node-1
Power On Remaining
• Power on remaining DAG members
• IP customization
• Reboot
Recover DAG
• All DAG members rebooted
• Databases mounted
Exchange 2013 Stretched DAG with Automated Failover
Automated site resiliency solution for Exchange 2013
Requires three well-connected sites to provide automated
site recovery
Exchange sites must provide Client Access and Mailbox resources
Takeaways…
Successfully virtualizing Exchange 2010 and 2013 is achievable
and supported!
Don’t get hung up on support terminology, when in doubt
contact your VMware rep, they’ll contact me, and we’ll have
this conversation again
Approach CPU over-commit cautiously, but DO NOT
over-commit memory
The majority of performance-related calls we receive at VMware for Exchange are storage related. Make sure you are following the best practices outlined here (more vSCSI adapters!).
DAG + vSphere HA and vMotion is the way! Optimize your network for vMotion to avoid database failovers, and use DRS to remediate any rule violations.
DAG + SRM is ok. Understand the sequence of events to get the
DAG up and running if disparate networks are part of recovery.
Shameless Plug
New book available for VMworld 2013
Topics include:
• Virtualizing business critical apps
• Active Directory
• Windows Failover Clustering
• Exchange 2013
• SQL 2012
• SharePoint 2013
Available on-site at the VMworld Book Store
Available online at Amazon and Pearson (pearsonitcertification.com)
Book signing Wednesday 12:30-1:30pm
Questions
Other VMware Activities Related to This Session
Group Discussions:
• VAPP1006-GD – SQL/MS Apps with Jeff Szastak
THANK YOU