Exchange Server 2013: High Availability and Site Resilience
Scott Schnoll, Principal Technical Writer
OUC-B314
Agenda
• Storage
• High Availability
• Site Resilience
• Announcements
Storage
Storage Challenges
• Disks
  • Capacity is increasing, but IOPS are not
• Databases
  • Database sizes must be manageable
• Database Copies
  • Reseeds must be fast and reliable
  • Passive database copy IOPS are inefficient
  • Lagged copies have asymmetric storage requirements and require manual care
Storage Innovations
• Multiple Databases Per Volume
• Autoreseed
• Self-Recovery from Storage Failures
• Lagged Copy Innovations
Multiple databases per volume
[Diagram: four servers, each with four volumes; every server hosts a copy of DB1–DB4, with active, passive, and lagged copies distributed across the servers]
4-member DAG
4 databases
4 copies of each database
4 databases per volume
Symmetrical design
Multiple databases per volume
[Diagram: one database copy per disk (DB1 active, passive, and lagged copies on separate servers), reseeding at ~20 MB/s]
Single database copy per disk:
• Reseed 2 TB database = ~23 hrs
• Reseed 8 TB database = ~93 hrs
Multiple databases per volume
[Diagram: four database copies per disk; a failed disk is reseeded in parallel from multiple sources (12–20 MB/s per link)]
Single database copy per disk:
• Reseed 2 TB database = ~23 hrs
• Reseed 8 TB database = ~93 hrs
4 database copies per disk:
• Reseed 2 TB disk = ~9.7 hrs
• Reseed 8 TB disk = ~39 hrs
• Requirements
  • Single logical disk/partition per physical disk
• Recommendations
  • Databases per volume should equal the number of copies per database
  • Same neighbors on all servers
Multiple databases per volume
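The symmetry recommendation above can be checked from the shell. A minimal sketch, assuming an Exchange 2013 CU1+ environment, a hypothetical server named MBX1, and that Get-MailboxDatabaseCopyStatus surfaces the volume mount point fields:

```powershell
# Show which volume each database copy on MBX1 lives on, to verify that the
# databases-per-volume layout matches the copies-per-database recommendation
Get-MailboxDatabaseCopyStatus -Server MBX1 |
    Format-Table Name, Status, DatabaseVolumeMountPoint -AutoSize
```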
Autoreseed
• Disk failure on active copy = database failover
• Failed disk and database corruption issues need to be addressed quickly
• Fast recovery to restore redundancy is needed
Seeding Challenges
• Automatically restore redundancy after disk failure using provisioned spares
Seeding Innovations
[Diagram: in-use storage and provisioned spares; when a disk fails, a disk reseed operation moves its databases to a spare]
Autoreseed Workflow
1. Periodically scan for failed and suspended copies
2. Check prerequisites: single copy, spare availability
3. Allocate and remap a spare
4. Start the seed
5. Verify that the new copy is healthy
6. Admin replaces the failed disk
1. Detect a copy in a Failed and Suspended (F&S) state for 15 minutes in a row
2. Try to resume the copy 3 times (with 5-minute sleeps in between)
3. Try assigning a spare volume 5 times (with 1-hour sleeps in between)
4. Try InPlaceSeed with SafeDeleteExistingFiles 5 times (with 1-hour sleeps in between)
5. Once all retries are exhausted, the workflow stops
6. If 3 days have elapsed and the copy is still F&S, the workflow state is reset and starts again from Step 1
Autoreseed Workflow
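The detection step of the workflow can be approximated manually from the shell. A hedged sketch, run on a DAG member (the DB1\MBX2 copy name is hypothetical):

```powershell
# Find copies in the state Autoreseed acts on: Failed and Suspended (F&S)
Get-MailboxDatabaseCopyStatus -Server $env:COMPUTERNAME |
    Where-Object { $_.Status -eq 'FailedAndSuspended' }

# Manual equivalent of the workflow's resume retries
Resume-MailboxDatabaseCopy -Identity 'DB1\MBX2'
```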
• Prerequisites
  • Copy is not ReseedBlocked or ResumeBlocked
  • Logs and database file(s) are on the same volume
  • Database and log folder structure matches the required naming convention
  • No active copies on the failed volume
  • All copies are F&S on the failed volume
  • No more than 8 F&S copies on the server (if more, it might be a controller failure)
• For InPlaceSeed
  • Up to 10 concurrent seeds are allowed
  • If a database file exists, wait 2 days before in-place reseeding
  • The waiting period is based on the LastWriteTime of the database file
Autoreseed Workflow
Autoreseed
1. Configure the storage subsystem with spare disks
2. Create the DAG; add servers with configured storage
3. Create directories and mount points
4. Configure the DAG, including 3 new properties
5. Create mailbox databases and database copies
[Diagram: example folder structure under the root. \ExchDbs contains database mount points MDB1 and MDB2; \ExchVols contains volume mount points Vol1 (MDB1), Vol2 (MDB2), and Vol3 (spare); each database folder holds its database file (MDB1.DB) and logs (MDB1.log)]
AutoDagDatabasesRootFolderPath
AutoDagVolumesRootFolderPath
AutoDagDatabaseCopiesPerVolume = 1
• Requirements
  • Single logical disk/partition per physical disk
  • A specific database and log folder structure must be used
• Recommendations
  • Same neighbors on all servers
  • Databases per volume should equal the number of copies per database
• Configuration instructions (updated April 2013): http://aka.ms/autoreseed
Autoreseed
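The configuration steps above can be sketched as shell commands. The DAG name, the C:\ paths, and the copies-per-volume value are illustrative (the diagram uses one database copy per volume), not requirements:

```powershell
# Create the two root folders referenced by the AutoDag* properties
New-Item -ItemType Directory -Path 'C:\ExchDbs', 'C:\ExchVols'

# Configure the DAG with the three new Autoreseed properties
Set-DatabaseAvailabilityGroup -Identity DAG1 `
    -AutoDagDatabasesRootFolderPath 'C:\ExchDbs' `
    -AutoDagVolumesRootFolderPath   'C:\ExchVols' `
    -AutoDagDatabaseCopiesPerVolume 1
```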
Autoreseed
• Numerous fixes in CU1
  • Autoreseed not detecting spare disks correctly
  • Autoreseed not using spare disks
  • Increased Autoreseed copy limits (previously 4, now 8)
  • Better tracking around the mount path and ExchangeVolume path
  • Get-MailboxDatabaseCopyStatus displays ExchangeVolumeMountPoint
    • Shows the mount point of the database volume under C:\ExchangeVolumes
Other Seeding Innovations in CU1
• Update-MailboxDatabaseCopy includes new parameters designed to aid with automation:
  • BeginSeed: useful for scripting reseeds. The task asynchronously starts the seeding operation and then exits the cmdlet.
  • MaximumSeedsInParallel: used with the Server parameter to specify the maximum number of parallel seeding operations across the specified server during a full server reseed operation. Default is 10.
  • SafeDeleteExistingFiles: used to perform a seeding operation with a single copy redundancy pre-check prior to the seed. Because this parameter includes the redundancy safety check, it requires a lower level of permissions than DeleteExistingFiles, enabling a limited-permission administrator to perform the seeding operation.
  • Server: used as part of a full server reseed operation to reseed all database copies in an F&S state. Can be used with MaximumSeedsInParallel to start reseeds of database copies in parallel across the specified server, in batches of up to the value of MaximumSeedsInParallel at a time.
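The parameters above combine as follows. A hedged sketch (the server and database names are hypothetical):

```powershell
# Full-server reseed: reseed every F&S copy on MBX1, up to 10 seeds at a time
Update-MailboxDatabaseCopy -Server MBX1 -MaximumSeedsInParallel 10

# Single copy: start the seed asynchronously, with the redundancy safety check
Update-MailboxDatabaseCopy -Identity 'DB1\MBX2' -BeginSeed -SafeDeleteExistingFiles
```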
Self-recovery from storage failures
• Storage controllers are basically mini-PCs
  • As such, they can crash, hang, etc., requiring administrative intervention
• Other operator-recoverable conditions can occur
  • Loss of vital system elements
  • Hung or highly latent IO
Recovery Challenges
• Innovations added in Exchange 2010 carried forward
• New recovery behaviors added to Exchange 2013• Even more added to Exchange 2013 CU1
Recovery Innovations
Exchange Server 2010:
• ESE Database Hung IO (240s)
• Failure Item Channel Heartbeat (30s)
• SystemDisk Heartbeat (120s)
Exchange Server 2013:
• System Bad State (302s)
• Long I/O times (41s)
• MSExchangeRepl.exe memory threshold (4GB)
Exchange Server 2013 CU1:
• Bus reset (event 129)
• Replication service endpoints not responding
• Cluster database hang (GUM updates blocked)
Lagged copy innovations
• Activation is difficult
• Lagged copies require manual care
• Lagged copies cannot be page patched
Lagged Copy Challenges
• Automatic log file replay
  • Low disk space (enable in registry)
  • Page patching (enabled by default)
  • Less than 3 other healthy copies (enable in Active Directory; configure in registry)
• Integration with Safety Net
  • No need for log surgery or hunting for the point of corruption
Lagged Copy Innovations
High Availability
• High availability focuses on database health
• Best copy selection insufficient for new architecture
• DAG network configuration still manual
High Availability Challenges
High Availability Innovations
• Managed Availability
• Best Copy and Server Selection
• DAG Network Autoconfig
Managed Availability
Managed Availability
• Key tenets for Exchange 2013
  • Access to a mailbox is provided by the protocol stack on the Mailbox server that hosts the active copy of the mailbox
  • If a protocol is down on a Mailbox server, all access to active databases on that server via that protocol is lost
• Managed Availability was introduced to detect and automatically recover from these kinds of failures
  • For most protocols, quick recovery is achieved via a restart action
  • If the restart action fails, a failover can be triggered
• An internal framework used by component teams
• Sequencing mechanism to control when recovery actions are taken versus alerting and escalation
• Enhances the Best Copy Selection algorithm by taking into account overall server health of source and target
Managed Availability
• MA failovers are a recovery action from failure
  • Detected via a synthetic operation or live data
  • Throttled in time and across the DAG
• MA failovers can happen at the database or server level
  • Database: a Store-detected database failure can trigger a database failover
  • Server: a protocol failure can trigger a server failover
• Single Copy Alert integrated into MA
  • ServerOneCopyInternalMonitorProbe (part of the DataProtection Health Set)
  • Alert is per-server to reduce flow
  • Still triggered across all machines with copies
  • Logs 4138 (red) and 4139 (green) events
Managed Availability
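The 4138/4139 events can be queried from the shell. A sketch that assumes these events land in the Application log under the MSExchangeRepl source, per the bullets above:

```powershell
# Look for Single Copy Alert raised (4138, red) and cleared (4139, green) events
Get-WinEvent -FilterHashtable @{
    LogName      = 'Application'
    ProviderName = 'MSExchangeRepl'
    Id           = 4138, 4139
} -MaxEvents 10
```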
Best Copy and Server Selection
• Exchange 2010 used several criteria
  • Copy queue length
  • Replay queue length
  • Database copy status (including activation blocked)
  • Content index status
• Using just these criteria is not good enough for Exchange 2013, because protocol health is not considered
Best Copy Selection Challenges
• Still an Active Manager algorithm, performed at *over time, based on the extracted health of the system
• Replication health still determined by same criteria and phases
• Criteria now include the health of the entire protocol stack
  • Considers a prioritized protocol health set in the selection, using four priorities: critical, high, medium, low
  • Failover responders trigger added checks to select a "protocol not worse" target
Best Copy and Server Selection
Managed Availability imposes 4 new constraints on the Best Copy Selection algorithm
Best Copy and Server Selection
1. All Healthy: server that has all health sets in a healthy state
2. Up to Normal Healthy: server that has all health sets Medium and above in a healthy state
3. All Better than Source: server that has health sets in a state that is better than the server hosting the affected copy
4. Same as Source: server that has health sets in a state that is the same as the server hosting the affected copy
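The per-server health that these constraints consult can be inspected directly. A sketch using Get-ServerHealth (MBX2 is a hypothetical server name):

```powershell
# List any health sets on the candidate server that are not currently healthy
Get-ServerHealth -Identity MBX2 |
    Where-Object { $_.AlertValue -ne 'Healthy' } |
    Format-Table Name, HealthSetName, AlertValue -AutoSize
```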
BCSS Changes in CU1
• PAM tracks the number of active databases per server
  • Honors MaximumActiveDatabases, if configured
  • Allows Active Manager to exclude servers that are already hosting the maximum number of active databases when determining potential candidates for activation
  • Keeps an in-memory state that tracks the number of active databases per server
  • When the PAM role moves, or when the Exchange Replication service is restarted on the PAM, this information is rebuilt from the cluster database
DAG Network Innovations
• DAG networks must be manually collapsed in a multi-subnet deployment
• Small remaining administrative burden for deployment and initial configuration
DAG Network Challenges
• Automatically collapsed in multi-subnet environment
• Automatic or manual configuration
  • Default is Automatic
  • Requires specific settings on the MAPI and Replication network interfaces
• Manual edits and EAC controls are blocked by default
  • Set the DAG to manual network setup to edit or change DAG networks
DAG Network Innovations
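Switching a DAG to manual network setup, as described above, looks like this. A sketch with hypothetical DAG and network names:

```powershell
# Take DAG1 out of automatic network configuration so networks can be edited
Set-DatabaseAvailabilityGroup -Identity DAG1 -ManualDagNetworkConfiguration $true

# Example edit: dedicate a network to replication traffic
Set-DatabaseAvailabilityGroupNetwork -Identity 'DAG1\ReplicationDagNetwork01' `
    -ReplicationEnabled:$true -IgnoreNetwork:$false
```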
Site Resilience
• Operationally complex
• Mailbox and Client Access recovery are connected
• Namespace is a single point of failure (SPOF)
Site Resilience Challenges
Site Resilience Innovations
• Key Characteristics
  • DNS resolves to multiple IP addresses
  • Almost all protocol access in Exchange 2013 is HTTP
  • HTTP clients have built-in IP failover capabilities
  • Clients skip past IPs that produce hard TCP failures
  • Admins can switchover by removing a VIP from DNS
  • Namespace no longer a SPOF
  • No dealing with DNS latency
• Operationally simplified
• Mailbox and Client Access recovery are independent
• Namespace provides redundancy
Site Resilience Innovations
• Operationally Simplified
  • Previously, loss of CAS, CAS array, VIP, LB, etc., required the admin to perform a datacenter switchover
  • In Exchange Server 2013, recovery happens automatically
  • The admin focuses on fixing the issue, instead of restoring service
Site Resilience
• Mailbox and CAS recovery are independent
  • Previously, CAS and Mailbox server recovery were tied together in site recoveries
  • In Exchange Server 2013, recovery is independent, and may come automatically in the form of failover
  • This is dependent on business requirements and configuration
Site Resilience
• Namespace provides redundancy
  • Previously, the namespace was a single point of failure
  • In Exchange 2013, the namespace provides redundancy by leveraging multiple A records and the client OS/HTTP stack's ability to fail over
Site Resilience
• Support for new deployment scenarios
  • With namespace simplification, consolidation of server roles, separation of CAS array and DAG recovery, de-coupling of CAS and Mailbox by AD site, and load balancing changes, three locations (if available) can simplify mailbox recovery in response to datacenter-level events
• You must have at least three locations
  • Two locations with Exchange; one with the witness server
  • Exchange sites must be well-connected
  • The witness server site must be isolated from network failures affecting the Exchange sites
Site Resilience
Site Resilience Failover Examples
Site Resilience Failover Examples
[Diagram: primary datacenter Redmond (cas1, cas2; VIP 192.168.1.50, failed) and alternate datacenter Portland (cas3, cas4; VIP 10.0.1.50). mail.contoso.com initially resolves to 192.168.1.50 and 10.0.1.50; after the failing IP is removed from DNS, it resolves to 10.0.1.50 only]
With multiple VIP endpoints sharing the same namespace, if one VIP fails, clients automatically fail over to the alternate VIP(s)
Removing the failing IP from DNS puts you in control of the VIP's in-service time
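The admin-controlled side of this (removing the failing VIP from DNS) can be sketched as follows, assuming the DnsServer module on a Windows Server 2012 DNS server; the zone and IP come from the example:

```powershell
# Remove the failed datacenter's A record so mail.contoso.com resolves only
# to the surviving VIP (10.0.1.50)
Remove-DnsServerResourceRecord -ZoneName 'contoso.com' -RRType 'A' `
    -Name 'mail' -RecordData '192.168.1.50' -Force
```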
Site Resilience Failover Examples
[Diagram: DAG1 spanning primary datacenter Redmond (mbx1, mbx2) and alternate datacenter Portland (mbx3, mbx4), with the witness in a third datacenter, Paris; the primary datacenter has failed]
Assuming MBX3 and MBX4 are operating and one of them can lock the witness.log file, automatic failover of active databases should occur
Site Resilience Failover Examples
[Diagram: DAG1 with primary datacenter Redmond (mbx1, mbx2, witness) and alternate datacenter Portland (mbx3, mbx4); the primary datacenter, including the witness, has failed]
Site Resilience Failover Examples
[Diagram: DAG1 with failed primary datacenter Redmond (mbx1, mbx2, witness) and alternate datacenter Portland (mbx3, mbx4, alternate witness)]
1. Mark the failed servers/site as down: Stop-DatabaseAvailabilityGroup DAG1 -ActiveDirectorySite:Redmond
2. Stop the Cluster service on the remaining DAG members: Stop-Clussvc
3. Activate the DAG members in the second datacenter: Restore-DatabaseAvailabilityGroup DAG1 -ActiveDirectorySite:Portland
Announcements
Coming in CU2
• Microsoft Exchange DAG Management service
  • MSExchangeDAGMgmt
  • Has the MonitoringComponent moved into it
  • Continues to write events to the same place the Replication service writes to (Application event log with a source of MSExchangeRepl, and the crimson channel)
• Additional functionality will be moved from MSExchangeRepl to MSExchangeDAGMgmt in the future
Possibly Coming in CU2
• Use Windows Azure for the witness server
  • Testing and validation currently underway
  • Requires extending internal Active Directory permissions to the public cloud
  • Involves creating a file server on top of an Azure IaaS VM role
  • HA file server in Azure: two persistent VMs can use XStore for shared storage
Coming in CU2
• Enterprise Edition support for 100 databases per server
• To enable this:
  • We made code changes in CU2
  • We fixed some blocking bugs
  • We did extensive testing and validation, including in internal Dogfood environments
Questions?
Scott Schnoll
[email protected]
Twitter: @Schnoll
Blog: http://aka.ms/Schnoll
Related content
• Microsoft Exchange Server 2013 Managed Availability
• Microsoft Exchange Server 2013 Sizing
• Virtualization in Microsoft Exchange Server 2013
• Exchange 2013 On-Premises Upgrade and Coexistence
• Exchange Server 2013 Tips & Tricks
Track resources
• Exchange Team Blog: http://blogs.technet.com/b/exchange/
• Twitter: follow @MSFTExchange; join the conversation using #IamMEC
• Check out:
  • Microsoft Exchange Conference 2014: www.iammec.com
  • Office 365 FastTrack: http://fasttrack.office.com/
  • Technical Training with Ignite: http://ignite.office.com/
Resources for Developers: http://microsoft.com/msdn
Microsoft Certification & Training Resources: www.microsoft.com/learning
Resources for IT Professionals: http://microsoft.com/technet
Sessions on Demand: http://channel9.msdn.com/Events/TechEd
Complete an evaluation on CommNet and enter to win!
Scan the Tag to evaluate this session now on myTechEd Mobile
© 2013 Microsoft Corporation. All rights reserved. Microsoft, Windows and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.