Site Recovery Manager In Concert with Storage Replication Protecting Your Critical Applications
Breakout Session # 3096
Michael Tan, Engineering Manager, EMC Corporation
Bala Ganeshan, Chief Technologist: Server Virtualization, Symmetrix Engineering, EMC Corporation
Date: 9/15-9/18/2008
Disclaimer
This session may contain product features that are currently under development.
This session/overview of the new technology represents no commitment from VMware to deliver these features in any generally available product.
Features are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.
Technical feasibility and market demand will affect final delivery.
Pricing and packaging for any new technologies or features discussed or presented have not been determined.
“These features are representative of feature areas under development. Feature commitments are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind. Technical feasibility and market demand will affect final delivery.”
Agenda
• Overview of Application Virtualization & DR Challenges
• Overview of VMware SRM
• SRM Integration with RecoverPoint
• SRM Integration with SRDF
• SRM Failback Addressed
• Demonstration of SRM with Critical Application
• Q&A Session
Critical Business Challenges For Messaging Environments
The ability to provide a reliable infrastructure
• Address exponential growth of messaging environments' importance
Minimize and prevent email downtime
• SLAs are more stringent
• More choices driving additional complexity
Choosing the right BC solution for your business
• Cost of downtime and data loss is now unacceptable
• Application, services and data restart is complex and inconsistent
Cost and complexity of managing BC processes
• Deployment and rehearsal of BC/DR procedures
• Lack of time and expertise
• Server consolidation
Up to 60% of business critical information passes through or remains within corporate messaging systems.
Critical Application Virtualization End to End
[Diagram labels: Local site, Remote site; OCS, Exchange, SQL; SAN; SUN, IBM, HP, HDS, EMC]
• Application-consistent recovery?
• Corruption protection?
• Application response time?
• RPO = ? RTO = ?
• Virtual or physical?
• Existing infrastructure?
• Disaster-recovery testing?
• Communications cost?
Site Recovery Manager Integration
Make DR a property of VM like HA and DRS
Provides central management of recovery plans from VirtualCenter
Turns manual recovery processes into automated recovery plans
Simplifies and automates disaster recovery workflows:
• Setup, testing, failover
4 EMC products are integrated with SRM
• RecoverPoint
• SRDF
• MirrorView
• Celerra Replicator
Makes disaster recovery rapid, reliable, manageable, affordable
Key Concepts: VMware Site Recovery Manager
Datastore Groups
Replicated datastores containing the complete set of virtual machines you wish to protect via SRM
Protection Group
Collection of protected virtual machines that will be failed over to the recovery site together. Protection groups are mapped to datastore groups.
Recovery Plan
Complete set of steps needed to recover (or test the recovery of) the protected virtual machines in one or more protection groups
[Diagram labels: VMFS datastores; Recovery Plan A (Array Failure); Recovery Plan B (Site Failure)]
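The relationship between these three concepts can be sketched as a small data model. This is an illustrative sketch only, not SRM's actual object model: the class and field names (`DatastoreGroup`, `ProtectionGroup`, `RecoveryPlan`, `vms_to_recover`) and the sample VM and datastore names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DatastoreGroup:
    """Replicated datastores holding the complete set of VMs to protect."""
    name: str
    datastores: list[str]
    vms: list[str]


@dataclass
class ProtectionGroup:
    """VMs that fail over together; mapped to one datastore group in this sketch."""
    name: str
    datastore_group: DatastoreGroup

    @property
    def vms(self) -> list[str]:
        return list(self.datastore_group.vms)


@dataclass
class RecoveryPlan:
    """Steps to recover (or test recovery of) one or more protection groups."""
    name: str
    protection_groups: list[ProtectionGroup] = field(default_factory=list)

    def vms_to_recover(self) -> list[str]:
        # Flatten the VMs of every protection group, preserving order.
        return [vm for pg in self.protection_groups for vm in pg.vms]


# Hypothetical inventory: all names are made up for illustration.
exchange_ds = DatastoreGroup("dsg-exchange", ["vmfs-01", "vmfs-02"], ["exch-mbx1", "exch-hub1"])
sql_ds = DatastoreGroup("dsg-sql", ["vmfs-03"], ["sql-01"])
pg_exchange = ProtectionGroup("pg-exchange", exchange_ds)
pg_sql = ProtectionGroup("pg-sql", sql_ds)

plan_a = RecoveryPlan("Recovery Plan A (Array Failure)", [pg_exchange])
plan_b = RecoveryPlan("Recovery Plan B (Site Failure)", [pg_exchange, pg_sql])
```

In this sketch a recovery plan simply references protection groups, so the same group can appear in more than one plan, as with plans A and B on the slide.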
Site Recovery Manager Disaster Recovery Setup
Create recovery plans
• For virtual machines, applications, business units
Integrate with replication
• Identify which virtual machines are protected by replication configuration
Map recovery resources
• Server resources, network resources, management objects
Specify recovery process
• Convert manual runbook to pre-programmed response
• Customizable with scripting and callouts
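Mapping recovery resources amounts to a lookup from each protected-site object to its recovery-site counterpart. A minimal sketch, assuming one plain dictionary per resource type; the resource names and the `map_vm` helper are hypothetical and do not reflect how SRM stores its mappings.

```python
# Hypothetical protected-site -> recovery-site mappings.
INVENTORY_MAP = {
    "network": {"prod-vlan-10": "dr-vlan-110"},
    "resource_pool": {"prod-pool": "dr-pool"},
    "folder": {"Prod/Exchange": "DR/Exchange"},
}


def map_vm(vm: dict) -> dict:
    """Return the recovery-site placement for a protected VM.

    Raises KeyError if a resource used by the VM has no mapping, so that
    gaps are caught during setup rather than during a failover.
    """
    placement = {"name": vm["name"]}
    for kind, mapping in INVENTORY_MAP.items():
        source = vm[kind]
        if source not in mapping:
            raise KeyError(f"no {kind} mapping for {source!r}")
        placement[kind] = mapping[source]
    return placement


vm = {
    "name": "exch-mbx1",
    "network": "prod-vlan-10",
    "resource_pool": "prod-pool",
    "folder": "Prod/Exchange",
}
placement = map_vm(vm)
```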
VM Failover with SRM and EMC Replication
[Diagram: at each site, a VC Console, Virtual Center with SRM and the SRDF, MirrorView, and Replicator adapters, an ESX server farm (ESX w/VMs, VDI cluster), and a Symmetrix, CLARiiON, or Celerra array on an FC or iSCSI SAN. An FC or IP network provides inter-array connectivity between like pairs, e.g. Symm<>Symm.]
1. Production data
2. Replicated data
3. In event of disaster, production data goes down...
4. ...and pressing the Failover button brings up data on secondary side, performing these tasks for user:
• Replicated data online
• Replicated data presented to remote ESX
• Scan for new disk
• Virtual machine registered
• VM powered up
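The tasks performed when the Failover button is pressed form a fixed sequence. The sketch below models that ordering only; every task is a hypothetical stand-in that records its own name, and none of them corresponds to a real SRM or adapter API.

```python
from typing import Callable

log: list[str] = []


def step(name: str) -> Callable[[], None]:
    """Build a stand-in task that only records that it ran."""
    def run() -> None:
        log.append(name)
    return run


# Order taken from the slide; the implementations are placeholders.
FAILOVER_STEPS = [
    step("bring replicated data online"),
    step("present replicated data to remote ESX"),
    step("scan for new disk"),
    step("register virtual machine"),
    step("power up VM"),
]


def run_failover() -> list[str]:
    """Run every step in order; an exception aborts the remaining steps."""
    for task in FAILOVER_STEPS:
        task()
    return log


completed = run_failover()
```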
VM Failover With Site Recovery Manager
[Diagram: Site A and Site B, each with a VC Console, an RP Console, Virtual Center with SRM and the RecoverPoint adapter, ESX w/VMs (VDI ESX cluster), RecoverPoint, a SAN, and SAN storage including 3rd-party arrays. The sites are connected over a WAN. Labels also show local journals and a remote journal.]
1. Map VMs to LUNs & consistency groups
2. Replicate LUNs
3. VM goes down
4. Choose latest image
• Allow image access to remote ESX
• Scan for new disk
• Register virtual machine
• Power up VM
• Failover and resync data
SRM/RecoverPoint Application Failover Demo Environment
[Diagram: two sites, “Santa Clara” and “London”, connected by RecoverPoint over an IP network/WAN. Labeled components: ESX hosts running an Oracle application, SAP (on ESX 3.0), and Exchange 2007; DMX4, CX3_80, and CX3_40 arrays; MDS 9509 and MDS 9513 switches with SSM modules; Invista CPC; storage processor ports SPA0, SPA1, SPB0, SPB1.]
DEMONSTRATION
SRDF and SRM Integration
• Automated VM failover
• Common interface to VMware SRM
• Non-disruptive DR testing
[Diagram: two Symmetrix arrays, each on a SAN, with inter-array connectivity over an FC or IP network]
Requirements for SRDF Adapter
DMX storage arrays
• DMX-1/2 running Enginuity code 5671
• DMX-3 and DMX-4 running Enginuity code 5771 or later
Management of Symmetrix array is done in-band
• Solutions Enabler version 6.5.0 or later
Host running Solutions Enabler is required at each site
• Can be the Virtual Center Server if it has connection to the storage array
Solutions Enabler in a client/server configuration
• Host providing management service needs to be configured to provide service
• Solutions Enabler on Virtual Center Server needs to be configured to access server
• SSL connections between client and server recommended
Solutions Enabler can be run on the ESX Server Console
• Port 2707 has to be opened
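Two of these requirements are easy to check from a script: the Solutions Enabler version and whether port 2707 is reachable. A minimal sketch using only the Python standard library. Treating 2707 as a TCP port on the host that provides the Solutions Enabler service is an assumption, since the slide does not say so, and both function names are hypothetical.

```python
import socket


def version_at_least(version: str, minimum: str = "6.5.0") -> bool:
    """Compare dotted version strings numerically, so '6.10.0' >= '6.5.0'."""
    have = [int(part) for part in version.split(".")]
    need = [int(part) for part in minimum.split(".")]
    # Pad the shorter list with zeros so '6.5' compares equal to '6.5.0'.
    width = max(len(have), len(need))
    have += [0] * (width - len(have))
    need += [0] * (width - len(need))
    return have >= need


def port_is_open(host: str, port: int = 2707, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A call such as `port_is_open("symapi-server.example.com")`, where the host name is a placeholder, would be run from the Virtual Center Server toward the host providing the management service.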
SRDF Family of Remote-Replication Products
SRDF/Synchronous
• No data exposure
• Some performance impact
• Limited distance
• Supports higher-tier applications
[Diagram: ESX Servers writing to a source array replicated to a target over a limited distance; steps numbered 1 to 4]
SRDF/Asynchronous
• RPO in seconds
• No performance impact
• Unlimited distance
• Supports multiple application tiers
[Diagram: ESX Servers writing to a source array replicated to a target over an unlimited distance; steps numbered 1 to 3]
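The difference between the two modes comes down to when the host write is acknowledged. The sketch below is a simplified model, not EMC's implementation. The numbered steps in the diagrams are not spelled out in this transcript, so the orderings shown (acknowledge the host after the target confirms in synchronous mode, acknowledge the host before transfer in asynchronous mode) are an assumption based on how these modes are commonly described. All names are hypothetical.

```python
def synchronous_write(block: str, source: list[str], target: list[str]) -> list[str]:
    """Model a synchronous write: the host is acknowledged last."""
    source.append(block)
    target.append(block)
    return [
        "1. host write arrives at source",
        "2. source sends data to target",
        "3. target acknowledges source",
        "4. source acknowledges host",
    ]


def asynchronous_write(block: str, source: list[str], pending: list[str]) -> list[str]:
    """Model an asynchronous write: the host is acknowledged before transfer."""
    source.append(block)
    pending.append(block)
    return [
        "1. host write arrives at source",
        "2. source acknowledges host",
        "3. data is queued for later transfer to target",
    ]


def drain(pending: list[str], target: list[str]) -> None:
    """Ship queued writes to the target, as a background transfer would."""
    target.extend(pending)
    pending.clear()


# Asynchronous mode: the target lags until the queue is drained.
src_a, tgt_a, queue = [], [], []
asynchronous_write("block-1", src_a, queue)
lag_before_drain = len(src_a) - len(tgt_a)  # writes not yet on the target
drain(queue, tgt_a)
```

In this model the lag before the drain is what the slide's "RPO in seconds" refers to, while the synchronous path never returns to the host with the target behind.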
Overview of TimeFinder
TimeFinder used to create local replicas
TimeFinder/Mirror
• High performance
• Requires use of devices with attribute BCV
• When established and synchronized, BCV acts as another mirror
TimeFinder/Clone
• Can use any device that has the same configuration as source volume
• Lower performance than TimeFinder/Mirror but has optimizations
• Rapidly becoming the de facto standard
Supported Functionality and Restrictions
SRDF/S and SRDF/A modes are supported
• No support for SRDF/S consistency groups
• No support for MSC, which provides consistency across multiple SRDF/A groups and different arrays
• No support for advanced configurations (SRDF/AR, SRDF/STAR, etc.)
TimeFinder/Snap is not supported for testing recovery plans
• No support for “splitting” SRDF link for testing recovery plans
• TimeFinder/Mirror is required for testing recovery plans using non-grouped devices
Common options
• Specified in a flat file, symvmwsrm_options, in the install directory
• copy_type, copy_state_wait_interval, copy_state_max_wait
• email_addr_target, email_server, email_subject, email_log_level
• log_level
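The transcript names the options but not the file syntax. The sketch below assumes a simple `name = value` layout with `#` comments, which is a guess; the sample values are invented and the real symvmwsrm_options format may differ. Only the option names come from the slide.

```python
# Option names listed on the slide.
KNOWN_OPTIONS = {
    "copy_type", "copy_state_wait_interval", "copy_state_max_wait",
    "email_addr_target", "email_server", "email_subject", "email_log_level",
    "log_level",
}


def parse_options(text: str) -> dict[str, str]:
    """Parse 'name = value' lines (assumed syntax); reject unlisted names."""
    options = {}
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        name, sep, value = line.partition("=")
        name = name.strip()
        if not sep:
            raise ValueError(f"line {lineno}: expected 'name = value'")
        if name not in KNOWN_OPTIONS:
            raise ValueError(f"line {lineno}: unknown option {name!r}")
        options[name] = value.strip()
    return options


# Invented sample content, for illustration only.
SAMPLE = """
# hypothetical values
copy_type = clone
copy_state_max_wait = 600
log_level = info
"""
options = parse_options(SAMPLE)
```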
Logging and Basic Troubleshooting
VMware SRM maintains logs on the Virtual Center Server
• Location determined by VMware
Adapter logs are located with the Solutions Enabler logs
• C:\Program Files\EMC\SYMAPI\log
Adapter log file name is symvmwsrm-<date>.log
API log file name is symapi-<date>.log
• Available on the Symmetrix API server handling the client requests
API server logs may be required for troubleshooting client/server issues
• Log file is on the server handling the request
• Located in either c:\program files\emc\SYMAPI\log or /var/symapi/log
• File name is storsrvd.log*
• Logging level may have to be increased to capture potential communication issues
Troubleshooting requires the logs listed above
• Both source and target side files are required
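Since troubleshooting needs the adapter, API, and API server logs from both sides, collecting them can be scripted. A minimal sketch, assuming the default log directories named above and that it is run once on each relevant host; the function name and the destination directory are hypothetical.

```python
import shutil
from pathlib import Path

# File name patterns taken from the slide.
LOG_PATTERNS = ["symvmwsrm-*.log", "symapi-*.log", "storsrvd.log*"]

# Default locations named on the slide (Windows and UNIX).
DEFAULT_LOG_DIRS = [
    Path(r"C:\Program Files\EMC\SYMAPI\log"),
    Path("/var/symapi/log"),
]


def collect_logs(log_dir: Path, dest: Path) -> list[Path]:
    """Copy every matching log file from log_dir into dest; return the copies."""
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for pattern in LOG_PATTERNS:
        for src in sorted(log_dir.glob(pattern)):
            if src.is_file():
                copied.append(Path(shutil.copy2(src, dest / src.name)))
    return copied
```

For example, `collect_logs(Path("/var/symapi/log"), Path("srm-logs/protection-site"))` could be run on the protection side, and the same call with a different destination on the recovery side.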
Configuring the SRDF Adapter for VMware SRM
• Visual changes to “Array Manager” screen
• Uses terminology familiar to EMC Symmetrix customers
• Configure protection side and recovery side array managers
• Replicated datastores and candidates for protection automatically determined by VMware SRM
[Screenshot callout: Customers are prompted for SYMAPI Server]
Failback
Failback is a manual process ... BUT NOT HARD
Problem is exacerbated since VMware resignatures the remote side LUN
• The production site views the resignatured LUNs as snapshots
• Makes the product inappropriate for scheduled movement of workload
Process for failback depends on the type of event that caused the failover
• The storage array on the protection site is available
• Failover initiated due to server unavailability
• The array on the protection site is temporarily unavailable
• The array on the protection site is permanently destroyed
Detailed procedure available in a co-logoed white paper
• Also available on PowerLink and VMware website
Failback: Server Outage
The adapter could perform automatic swap if possible
• SRDF adapter will perform the swap
• Failback requires breaking of existing SRM connection
• Unregister impacted virtual machines at protection site
• Recreate the connections, protection groups and recovery plans from remote to production site
• Initiate failover from remote site to production site using SRM
Failback: Production array temporarily unavailable
If array temporarily unavailable during failover
• Reversing of replication direction fails
• Adapter automatically resorts to alternatives (if needed)
• When production array becomes available
• Update production LUN incrementally
• Once the production LUNs have synchronized, the process to fail back is the same as discussed before
Failback: Production DMX permanently destroyed
If array is permanently destroyed
• Dynamic swap fails
• Adapter automatically resorts to alternatives
• Contact EMC support or PS for setting up failback
• After failover the “remote mirror” personality of the LUN should be deleted
• Install new array, create appropriate devices
• Create new relationship between the old and new devices
• Once devices are synchronized, the process is the same as initial setup of SRM
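The three failback cases can be summarized as a lookup from the cause of the failover to the steps listed on the slides. A sketch only: the enum and function names are hypothetical, and the step text is condensed from the slides rather than taken from the white paper.

```python
from enum import Enum


class FailoverCause(Enum):
    SERVER_OUTAGE = "server outage, protection-site array available"
    ARRAY_TEMPORARILY_UNAVAILABLE = "production array temporarily unavailable"
    ARRAY_DESTROYED = "production array permanently destroyed"


# Steps used once the production LUNs are usable again.
SRM_FAILBACK_STEPS = [
    "break existing SRM connection",
    "unregister impacted virtual machines at protection site",
    "recreate connections, protection groups and recovery plans "
    "from remote to production site",
    "initiate failover from remote site to production site using SRM",
]


def failback_steps(cause: FailoverCause) -> list[str]:
    """Return the condensed manual procedure for the given failover cause."""
    if cause is FailoverCause.SERVER_OUTAGE:
        return list(SRM_FAILBACK_STEPS)
    if cause is FailoverCause.ARRAY_TEMPORARILY_UNAVAILABLE:
        return [
            "wait for production array to become available",
            "update production LUNs incrementally",
            "wait for production LUNs to synchronize",
            *SRM_FAILBACK_STEPS,
        ]
    return [
        "contact EMC support or PS to set up failback",
        "delete the 'remote mirror' personality of the LUNs",
        "install new array and create appropriate devices",
        "create new relationship between old and new devices",
        "wait for devices to synchronize",
        "set up SRM as for an initial installation",
    ]
```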
EMC Adapters With VMware SRM: Benefits
Simplifies DR Process
• Manual decision
• Automatic procedure
• Single UI from VMware’s Virtual Center
Adds VM-centric custom management procedure
• VM power-up priorities
• VM dependencies
• VM network overrides
Aligned with VMware’s Vocabulary
• Datastore Groups
• Protection Groups
• Recovery Plan
• VMs, LUNs inventory
Non-Disruptive Testing
• Utilizes features from VMware and EMC data replication technologies
• Consistent, faster to operate, manageable solution
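Power-up priorities and dependencies together define a start order for the recovered VMs. One way to compute such an order is a topological sort, sketched here with Python's standard `graphlib` module (Python 3.9 or later). This illustrates the idea only and is not how SRM implements it; the VM names and the dependency map are hypothetical.

```python
from graphlib import TopologicalSorter

# Each VM maps to the set of VMs that must be powered on before it.
DEPENDS_ON = {
    "domain-controller": set(),
    "sql-01": {"domain-controller"},
    "exch-mbx1": {"domain-controller"},
    "app-server": {"sql-01", "exch-mbx1"},
}


def power_on_order(depends_on: dict[str, set[str]]) -> list[str]:
    """Return a start order in which every VM follows its dependencies.

    Raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(depends_on).static_order())


order = power_on_order(DEPENDS_ON)
```

VMs with no ordering constraint between them, such as `sql-01` and `exch-mbx1` here, may appear in either order.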
Related Sessions and Activities
Best Practices for VMware ESX Server Replication and
RecoverPoint
Pete Conway’s Presentation
Chad Sakac’s Presentation
EMC Booth Demos
Q&A
Breakout Session 3096
Michael Tan, Bala Ganeshan
EMC Corp