site recovery manager in concert with storage replication

Site Recovery Manager In Concert with Storage Replication: Protecting Your Critical Applications. Breakout Session # 3096. Michael Tan, Engineering Manager, EMC Corporation; Bala Ganeshan, Chief Technologist: Server Virtualization, Symmetrix Engineering, EMC Corporation. Date: 9/15-9/18/2008

Page 1: Site Recovery Manager In Concert with Storage Replication

Site Recovery Manager In Concert with Storage Replication Protecting Your Critical Applications

Breakout Session # 3096

Michael Tan, Engineering Manager, EMC Corporation

Bala Ganeshan, Chief Technologist: Server Virtualization, Symmetrix Engineering, EMC Corporation

Date: 9/15-9/18/2008

Page 2: Site Recovery Manager In Concert with Storage Replication

Disclaimer

This session may contain product features that are currently under development.

This session/overview of the new technology represents no commitment from VMware to deliver these features in any generally available product.

Features are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.

Technical feasibility and market demand will affect final delivery.

Pricing and packaging for any new technologies or features discussed or presented have not been determined.

“These features are representative of feature areas under development. Feature commitments are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind. Technical feasibility and market demand will affect final delivery.”

Page 3: Site Recovery Manager In Concert with Storage Replication

Agenda

• Overview of Application Virtualization & DR Challenges

• Overview of VMware SRM

• SRM Integration with RecoverPoint

• SRM Integration with SRDF

• SRM Failback Addressed

• Demonstration of SRM with Critical Application

• Q&A Session

Page 4: Site Recovery Manager In Concert with Storage Replication

Critical Business Challenges For Messaging Environments

The ability to provide a reliable infrastructure
• Address exponential growth of messaging environments' importance

Minimize and prevent email downtime
• SLAs are more stringent
• More choices driving additional complexity

Choosing the right BC solution for your business
• Cost of downtime and data loss is now unacceptable
• Application, services and data restart is complex and inconsistent

Cost and complexity of managing BC processes
• Deployment and rehearsal of BC/DR procedures
• Lack of time and expertise
• Server consolidation

Up to 60% of business critical information passes through or remains within corporate messaging systems.

Page 5: Site Recovery Manager In Concert with Storage Replication

Critical Application Virtualization End to End

[Diagram: a local site and a remote site, each running OCS, Exchange, and SQL on a SAN, with storage from SUN, IBM, HP, HDS, and EMC.]

Questions posed around the diagram:
• Application-consistent recovery?
• Corruption protection?
• Application response time?
• RPO = ? RTO = ?
• Virtual or physical?
• Existing infrastructure?
• Disaster-recovery testing?
• Communications cost?

Page 6: Site Recovery Manager In Concert with Storage Replication

Site Recovery Manager Integration

Make DR a property of VM like HA and DRS

Provides central management of recovery plans from VirtualCenter

Turns manual recovery processes into automated recovery plans

Simplifies and automates disaster recovery workflows:
• Setup, testing, failover

4 EMC products are integrated with SRM
• RecoverPoint
• SRDF
• MirrorView
• Celerra Replicator

Makes disaster recovery rapid, reliable, manageable, affordable

Page 7: Site Recovery Manager In Concert with Storage Replication

Key Concepts: VMware Site Recovery Manager

Datastore Groups
Replicated datastores containing the complete set of virtual machines you wish to protect via SRM

Protection Group
Collection of protected virtual machines that will be failed over to the recovery site together. Protection groups are mapped to datastore groups.

Recovery Plan
Complete set of steps needed to recover (or test the recovery of) the protected virtual machines in one or more protection groups

[Diagram: Recovery Plan A (Array Failure) and Recovery Plan B (Site Failure), drawn over VMFS datastores.]
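The relationship between the three concepts on this slide can be sketched as a small data model. This is only an illustration of how the terms fit together, not SRM's actual object model: the class names, field names, and the sample inventory below are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class DatastoreGroup:
    """Replicated datastores holding the complete set of VMs to protect."""
    name: str
    datastores: Tuple[str, ...]


@dataclass
class ProtectionGroup:
    """VMs that fail over together; mapped to one datastore group."""
    name: str
    datastore_group: DatastoreGroup
    vms: List[str] = field(default_factory=list)


@dataclass
class RecoveryPlan:
    """Steps to recover (or test recovery of) one or more protection groups."""
    name: str
    protection_groups: List[ProtectionGroup]

    def vms_to_recover(self) -> List[str]:
        # Flatten the VMs of every protection group, preserving order.
        return [vm for pg in self.protection_groups for vm in pg.vms]


# Hypothetical inventory, for illustration only.
mail_ds = DatastoreGroup("mail-dsg", ("vmfs-mail-01", "vmfs-mail-02"))
db_ds = DatastoreGroup("db-dsg", ("vmfs-db-01",))
mail_pg = ProtectionGroup("mail", mail_ds, ["exchange-01", "exchange-02"])
db_pg = ProtectionGroup("db", db_ds, ["sql-01"])

plan_a = RecoveryPlan("A (array failure)", [mail_pg])
plan_b = RecoveryPlan("B (site failure)", [mail_pg, db_pg])
```

A recovery plan can cover one protection group or several, which is why the two plans in the sketch recover different sets of virtual machines.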

Page 8: Site Recovery Manager In Concert with Storage Replication

Site Recovery Manager Disaster Recovery Setup

Create recovery plans
• For virtual machines, applications, business units

Integrate with replication
• Identify which virtual machines are protected by replication configuration

Map recovery resources
• Server resources, network resources, management objects

Specify recovery process
• Convert manual runbook to pre-programmed response
• Customizable with scripting and callouts

Page 9: Site Recovery Manager In Concert with Storage Replication

VM Failover with SRM and EMC Replication

[Diagram: two sites, each with an ESX server farm (ESX w/VMs, VDI cluster), a VC Console, and Virtual Center with SRM and the SRDF, MirrorView, and Replicator adapters. Each site has Symmetrix, CLARiiON, or Celerra storage on an FC or iSCSI SAN. Inter-array connectivity runs over an FC or IP network between like pairs, e.g. Symm<>Symm.]

1. Production data

2. Replicated data

3. In event of disaster, production data goes down...

4. ...and pressing the Failover button brings up data on secondary side, performing these tasks for user:
• Replicated data online
• Replicated data presented to remote ESX
• Scan for new disk
• Virtual machine registered
• VM powered up
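The tasks performed when the Failover button is pressed run in a fixed order, and each depends on the one before it. A minimal sketch of that ordering follows. The step names come from the slide; the `run_failover` function and its `execute` callback are hypothetical placeholders and do not correspond to any SRM or adapter API.

```python
from typing import Callable, List, Tuple

# Step names follow the slide; the actions are placeholders, not SRM calls.
FAILOVER_STEPS: List[str] = [
    "Bring replicated data online",
    "Present replicated data to remote ESX",
    "Scan for new disk",
    "Register virtual machine",
    "Power up VM",
]


def run_failover(execute: Callable[[str], bool]) -> Tuple[List[str], bool]:
    """Run the steps in order; stop at the first step that fails."""
    completed: List[str] = []
    for step in FAILOVER_STEPS:
        if not execute(step):
            return completed, False
        completed.append(step)
    return completed, True
```

Passing a callback keeps the ordering logic separate from whatever actually performs each step, so the same sequence could be driven by a dry run or a real operation.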

Page 10: Site Recovery Manager In Concert with Storage Replication

VM Failover With Site Recovery Manager

[Diagram: Site A and Site B, each with ESX hosts (VDI ESX cluster, ESX w/VMs), Virtual Center with SRM and the RecoverPoint adapter, a VC Console and an RP Console, and RecoverPoint attached to a SAN with SAN storage and 3rd-party storage. The sites are connected over a WAN. Journals are labeled Local Journals and Remote Journal.]

1. Map VMs to LUNs & Consistency Groups

2. Replicate LUNs

3. VM goes down

4. Choose latest image
• Allow image access to remote ESX
• Scan for new disk
• Register virtual machine
• Power up VM
• Failover and resync data

Page 11: Site Recovery Manager In Concert with Storage Replication

SRM/RecoverPoint Application Failover Demo Environment

[Diagram: two sites, "Santa Clara" and "London", connected by an IP network and a RecoverPoint WAN link. Labeled components: ESX hosts running Oracle App, SaP (one on ESX 3.0), and Exchange 2k7; RecoverPoint at each site; MDS 9509 and MDS 9513 switches with SSM modules; Invista CPC; DMX4, CX3_80, and CX3_40 arrays; storage processor ports SPA0, SPA1, SPB0, SPB1.]

Page 12: Site Recovery Manager In Concert with Storage Replication

DEMONSTRATION

Page 13: Site Recovery Manager In Concert with Storage Replication

SRDF and SRM Integration

• Automated VM failover
• Common interface to VMware SRM
• Non-disruptive DR testing

[Diagram: two Symmetrix arrays, each on a SAN, linked by inter-array connectivity over an FC or IP network.]

Page 14: Site Recovery Manager In Concert with Storage Replication

Requirements for SRDF Adapter

DMX storage arrays
• DMX-1/2 running Enginuity code 5671
• DMX-3 and DMX-4 running Enginuity code 5771 or later

Management of Symmetrix array is done in-band
• Solutions Enabler version 6.5.0 or later

Host running Solutions Enabler is required at each site
• Can be the Virtual Center Server if it has connection to the storage array

Solutions Enabler in a client/server configuration
• Host providing management service needs to be configured to provide service
• Solutions Enabler on Virtual Center Server needs to be configured to access server
• SSL connections between client and server recommended

Solutions Enabler can be run on the ESX Server Console
• Port 2707 has to be opened
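The version requirements above lend themselves to a simple pre-installation check. The sketch below applies the rules as worded on the slide. The function names are hypothetical, and treating Enginuity codes as plain integers and Solutions Enabler versions as dotted numbers is an assumption made for the example, not something the slide specifies.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '6.5.0' into (6, 5, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def array_supported(model: str, enginuity: int) -> bool:
    """Apply the Enginuity rules exactly as worded on the slide."""
    if model in ("DMX-1", "DMX-2"):
        return enginuity == 5671   # slide lists 5671 with no "or later"
    if model in ("DMX-3", "DMX-4"):
        return enginuity >= 5771   # "5771 or later"
    return False                   # other models are not listed


def solutions_enabler_supported(version: str) -> bool:
    """Solutions Enabler must be 6.5.0 or later."""
    return parse_version(version) >= (6, 5, 0)
```

Comparing version tuples rather than strings avoids the classic mistake of ranking "6.10.0" below "6.5.0".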

Page 15: Site Recovery Manager In Concert with Storage Replication

SRDF Family of Remote-Replication Products

SRDF/Synchronous
• No data exposure
• Some performance impact
• Limited distance
• Supports higher-tier applications

[Diagram: ESX Servers write to the Source array, which replicates to the Target array over a limited distance; the I/O path is numbered 1 to 4.]

SRDF/Asynchronous
• RPO in seconds
• No performance impact
• Unlimited distance
• Supports multiple application tiers

[Diagram: ESX Servers write to the Source array, which replicates to the Target array over an unlimited distance; the I/O path is numbered 1 to 3.]

Page 16: Site Recovery Manager In Concert with Storage Replication

Overview of TimeFinder

TimeFinder used to create local replicas

TimeFinder/Mirror
• High performance
• Requires use of devices with attribute BCV
• When established and synchronized BCV acts as another mirror

TimeFinder/Clone
• Can use any device that has the same configuration as source volume
• Lower performance than TimeFinder/Mirror but has optimizations
• Rapidly becoming the de-facto standard

Page 17: Site Recovery Manager In Concert with Storage Replication

Supported Functionality and Restrictions

SRDF/S and SRDF/A modes are supported
• No support for SRDF/S consistency groups
• No support for MSC (which provides consistency across multiple SRDF/A groups and different arrays)
• No support for advanced configurations (SRDF/AR, SRDF/STAR etc.)

TimeFinder/Snap is not supported for testing recovery plans
• No support for "splitting" SRDF link for testing recovery plans
• TimeFinder/Mirror is required for testing recovery plans using non-grouped devices

Common options
• Specified in a flat file, symvmwsrm_options in the install directory
• copy_type, copy_state_wait_interval, copy_state_max_wait
• email_addr_target, email_server, email_subject, email_log_level
• log_level
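Because the options live in a flat file, a quick sanity check can catch misspelled option names before the adapter reads them. The slide does not show the file's syntax, so the sketch below ASSUMES one `name = value` entry per line with `#` comments; only the option names are taken from the slide, and `parse_options` is a hypothetical helper, not part of the adapter.

```python
from typing import Dict

# Option names listed on the slide.
KNOWN_OPTIONS = {
    "copy_type", "copy_state_wait_interval", "copy_state_max_wait",
    "email_addr_target", "email_server", "email_subject", "email_log_level",
    "log_level",
}


def parse_options(text: str) -> Dict[str, str]:
    """Parse 'name = value' lines (ASSUMED syntax); reject unknown names."""
    options: Dict[str, str] = {}
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # assumed '#' comment syntax
        if not line:
            continue
        name, sep, value = line.partition("=")
        name = name.strip()
        if not sep or name not in KNOWN_OPTIONS:
            raise ValueError(f"line {lineno}: unrecognized entry {raw!r}")
        options[name] = value.strip()
    return options
```

Values are returned as strings without validation, since the slide does not list which values each option accepts.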

Page 18: Site Recovery Manager In Concert with Storage Replication

Logging and Basic Troubleshooting

VMware SRM maintains logs on the Virtual Center Server
• Location determined by VMware

Adapter logs are located with the Solutions Enabler logs
• C:\Program Files\EMC\SYMAPI\log

Adapter log file name is symvmwsrm-<date>.log

API log file name is symapi-<date>.log
• Available on the Symmetrix API server handling the client requests

API server logs may be required for troubleshooting client/server issues
• Log file is on the server handling the request
• Located in either c:\program files\emc\SYMAPI\log or /var/symapi/log
• File name is storsrvd.log*
• Logging level may have to be increased to capture potential communication issues

Troubleshooting requires the logs listed above
• Both source and target side files are required
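Gathering the files named above from both sides is easy to script. The following sketch collects every file in a log directory that matches the three name patterns from the slide. The `collect_logs` helper is hypothetical; the caller supplies the directory, for example one of the two locations listed above.

```python
from pathlib import Path
from typing import List

# File-name patterns taken from the slide.
LOG_PATTERNS = ("symvmwsrm-*.log", "symapi-*.log", "storsrvd.log*")


def collect_logs(log_dir: Path) -> List[Path]:
    """Return the matching log files in log_dir, sorted by name."""
    found = set()
    for pattern in LOG_PATTERNS:
        found.update(p for p in log_dir.glob(pattern) if p.is_file())
    return sorted(found)
```

Running it once against the source-side directory and once against the target-side directory yields the two file sets that troubleshooting requires.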

Page 19: Site Recovery Manager In Concert with Storage Replication

Configuring the SRDF Adapter for VMware SRM

• Visual changes to "Array Manager" screen
• Uses terminology familiar for EMC Symmetrix customers
• Configure protection side and recovery side array managers
• Replicated datastores and candidates for protection automatically determined by VMware SRM

[Screenshot callout: Customers are prompted for SYMAPI Server]

Page 20: Site Recovery Manager In Concert with Storage Replication

Failback

Failback is a manual process... BUT NOT HARD

Problem is exacerbated since VMware resignatures the remote side LUN
• The production site views the resignatured LUNs as snapshots
• Makes the product inappropriate for scheduled movement of workload

Process for failback depends on the type of event that caused the failover
• The storage array on the protection site is available (failover initiated due to server unavailability)
• The array on the protection site is temporarily unavailable
• The array on the protection site is permanently destroyed

Detailed procedure available in a co-logoed white paper
• Also available on PowerLink and VMware website

Page 21: Site Recovery Manager In Concert with Storage Replication

Failback: Server Outage

The adapter could perform automatic swap if possible
• SRDF adapter will perform the swap
• Failback requires breaking of existing SRM connection
• Unregister impacted virtual machines at protection site
• Recreate the connections, protection groups and recovery plans from remote to production site
• Initiate failover from remote site to production site using SRM

Page 22: Site Recovery Manager In Concert with Storage Replication

Failback: Production array temporarily unavailable

If array is temporarily unavailable during failover
• Reversing of replication direction fails
• Adapter automatically resorts to alternatives (if needed)
• When production array becomes available, update production LUNs incrementally
• Once the production LUNs have synchronized, the process to failback is the same as discussed before

Page 23: Site Recovery Manager In Concert with Storage Replication

Failback: Production DMX permanently destroyed

If array is permanently destroyed
• Dynamic swap fails
• Adapter automatically resorts to alternatives
• Contact EMC support or PS for setting up failback
• After failover the "remote mirror" personality of LUN should be deleted
• Install new array, create appropriate devices
• Create new relationship between the old and new devices
• Once devices are synchronized process is the same as initial setup of SRM
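Since failback is manual and the procedure depends on what caused the failover, the three scenarios on the failback slides (server outage, production array temporarily unavailable, production array permanently destroyed) can be captured as a checklist lookup. The sketch below paraphrases the slide bullets. The enum and function names are hypothetical, and it is a memory aid rather than a substitute for the white paper's detailed procedure.

```python
from enum import Enum
from typing import Dict, List


class FailoverCause(Enum):
    SERVER_OUTAGE = "server outage, production array available"
    ARRAY_TEMPORARILY_UNAVAILABLE = "production array temporarily unavailable"
    ARRAY_DESTROYED = "production array permanently destroyed"


# Steps shared by scenarios where the production array still exists.
_SRM_FAILBACK = [
    "Break the existing SRM connection",
    "Unregister impacted virtual machines at the protection site",
    "Recreate connections, protection groups and recovery plans "
    "from remote to production site",
    "Initiate failover from remote site to production site using SRM",
]

FAILBACK_STEPS: Dict[FailoverCause, List[str]] = {
    FailoverCause.SERVER_OUTAGE: _SRM_FAILBACK,
    FailoverCause.ARRAY_TEMPORARILY_UNAVAILABLE: [
        "Wait for the production array to become available",
        "Update production LUNs incrementally",
        "Wait for production LUNs to synchronize",
    ] + _SRM_FAILBACK,
    FailoverCause.ARRAY_DESTROYED: [
        "Contact EMC support or PS",
        "Delete the 'remote mirror' personality of the LUNs",
        "Install new array and create appropriate devices",
        "Create new relationship between the old and new devices",
        "Wait for devices to synchronize",
        "Follow the initial SRM setup process",
    ],
}


def failback_steps(cause: FailoverCause) -> List[str]:
    """Return a copy of the manual checklist for the given cause."""
    return list(FAILBACK_STEPS[cause])
```

The temporarily-unavailable case reuses the server-outage steps after resynchronization, mirroring the slide's note that the process is then the same as discussed before.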

Page 24: Site Recovery Manager In Concert with Storage Replication

EMC Adapters With VMware SRM – Benefits

Simplifies DR Process
• Manual decision
• Automatic procedure
• Single UI from VMware's Virtual Center

Adds VM-centric custom management procedure
• VM power-up priorities
• VM dependencies
• VM network overrides

Aligned with VMware's Vocabulary
• Datastore Groups
• Protection Groups
• Recovery Plan
• VMs, LUNs inventory

Non-Disruptive Testing
• Utilizes features from VMware and EMC data replication technologies
• Consistent, more rapid to operate, manageable solution
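Power-up priorities and VM dependencies together determine the order in which virtual machines start during recovery. One way to combine them is a dependency-respecting sort that uses priority to break ties, sketched below. This illustrates the idea only and is not how SRM itself computes the order; the function name and the convention that a lower number means higher priority are assumptions.

```python
import heapq
from typing import Dict, List, Set


def power_up_order(priority: Dict[str, int],
                   depends_on: Dict[str, Set[str]]) -> List[str]:
    """Order VMs so dependencies start first; lower priority number wins ties."""
    # Copy the dependency sets so the caller's data is not modified.
    remaining = {vm: set(depends_on.get(vm, ())) for vm in priority}
    ready = [(priority[vm], vm) for vm, deps in remaining.items() if not deps]
    heapq.heapify(ready)
    order: List[str] = []
    while ready:
        _, vm = heapq.heappop(ready)
        order.append(vm)
        # Release any VM whose last outstanding dependency just started.
        for other, deps in remaining.items():
            if vm in deps:
                deps.remove(vm)
                if not deps:
                    heapq.heappush(ready, (priority[other], other))
    if len(order) != len(priority):
        raise ValueError("circular or unknown dependency")
    return order
```

For example, a mail server that depends on a domain controller is never started before it, even if both carry the same priority.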

Page 25: Site Recovery Manager In Concert with Storage Replication

Related Sessions and Activities

Best Practices for VMware ESX Server Replication and RecoverPoint

Pete Conway’s Presentation

Chad Sakac’s Presentation

EMC Booth Demos

Page 26: Site Recovery Manager In Concert with Storage Replication

Q&A

Breakout Session 3096

Michael Tan, Bala Ganeshan

EMC Corp