
Workload Resiliency with EMC VPLEX

Best Practices Planning

Abstract

This white paper provides a brief introduction to EMC® VPLEX™ and describes how VPLEX provides increased workload resiliency to the data center. Best practice recommendations for high-availability deployments are given along with descriptions of how VPLEX handles various failure scenarios.

May 2010


Copyright © 2010 EMC Corporation. All rights reserved.

EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice.

THE INFORMATION IN THIS PUBLICATION IS PROVIDED “AS IS.” EMC CORPORATION MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Use, copying, and distribution of any EMC software described in this publication requires an applicable software license.

For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com.

All other trademarks used herein are the property of their respective owners.

Part Number h7138


Table of Contents

Executive summary
Introduction
    Audience
VPLEX technology overview
    EMC VPLEX clustering architecture
    VPLEX device virtualization
VPLEX hardware overview
Deployment overview
    VPLEX Local deployment
        When to use a VPLEX Local deployment
    VPLEX Metro deployment within a data center
        When to use a VPLEX Metro deployment within a data center
    VPLEX Metro deployment between data centers
        When to use a VPLEX Metro deployment between data centers
Workload resiliency
    Storage array outages
        Best practices
    SAN outages
        Best practice
    VPLEX component failures
        Fibre Channel port failure
        I/O module failure
        Director failure
        Engine power supply failure
        Engine fan failure
        Intra-cluster IP subnet failure
        Intra-cluster Fibre Channel switch failure
        VPLEX Engine failure
        Standby Power Supply failure
        Inter-cluster link failure
        Metadata volume failure
        Dirty region log failure
        Management server failure
        Uninterruptible Power Supply failure
    VPLEX Cluster failures
    Host failures
    Data center outages
Conclusion
References


Executive summary

The EMC® VPLEX™ family is the next-generation solution for information mobility and access within, across, and between data centers. It is the first platform in the world that delivers both local and distributed federation.

Local federation provides the transparent cooperation of physical elements within a site. Distributed federation extends access between two locations across distance. VPLEX is a solution for federating both EMC and non-EMC storage.

The VPLEX solution complements EMC’s virtual storage infrastructure, providing a virtual storage layer between the host computers that run the data center’s applications and the storage arrays that provide the physical storage these applications use.

In this white paper we explore how VPLEX can be used to add increased levels of resiliency to applications running within and between data centers. We term this capability workload resiliency.

Introduction

This white paper shows how VPLEX technology can be used to increase workload resiliency in the data center. By combining techniques and practices that are in common use today with best practices for VPLEX deployment, workload resiliency in the data center can be improved to withstand array outages and even high-impact events such as data center maintenance or power outages. These capabilities are introduced with a brief overview of the VPLEX technology and applicable use cases. The VPLEX hardware is then described and common deployments of this technology are provided. The remainder of the paper describes how VPLEX behaves under various failure conditions that can occur within the environment. Best practice recommendations are given for overcoming each type of failure and for achieving the highest levels of workload resiliency in the data center with VPLEX.

Audience

This white paper is intended for storage architects and storage administrators who want to know how VPLEX can help add resiliency to the storage infrastructure of the data center. Familiarity with the basic concepts of storage arrays, storage area networks (SANs), and server infrastructure is assumed.

VPLEX technology overview

EMC VPLEX introduces a new architecture that incorporates lessons learned from EMC’s 20-plus years of expertise in designing, implementing, and perfecting enterprise-class intelligent cache and distributed data protection solutions.

Built on a foundation of scalable and highly available processor engines, the EMC VPLEX family is designed to scale seamlessly from small to medium to large configurations. VPLEX resides between the servers and heterogeneous storage assets and uses a unique clustering architecture that allows servers at multiple data centers to have read/write access to shared block storage devices.


Unique characteristics of this new architecture include:

• Scale-out clustering hardware that lets you start small and grow bigger with predictable service levels

• Advanced data caching, utilizing large-scale SDRAM cache to improve performance and reduce I/O latency and array contention

• Distributed cache coherence for automatic sharing, balancing, and failover of I/O across the cluster

• A consistent view of one or more LUNs across VPLEX Clusters separated either by a few feet within a data center or across synchronous distances, enabling new models of high availability and workload relocation

Figure 1. Capability of the EMC VPLEX system to federate heterogeneous storage

EMC AccessAnywhere™, available with VPLEX, is a breakthrough technology from EMC that enables a single copy of data to be shared, accessed, and relocated over distance. EMC GeoSynchrony™ is the VPLEX operating system.

The VPLEX family consists of two products: VPLEX Local and VPLEX Metro.

• VPLEX Local provides simplified management and nondisruptive data mobility across heterogeneous arrays.

• VPLEX Metro provides data access and mobility between two VPLEX Clusters within synchronous distances.

Figure 2. EMC VPLEX family offering with architectural limits

With a unique scale-up and scale-out architecture, the VPLEX family’s advanced data caching and distributed cache coherency provide workload resiliency; automatic sharing, balancing, and failover of storage domains; and both local and remote data access with predictable service levels.

VPLEX Local supports local federation. VPLEX Metro delivers distributed federation capabilities and extends access between two locations at synchronous distances. VPLEX Metro leverages AccessAnywhere to enable a single copy of data to be shared, accessed, and relocated over distance.

The combination of a virtualized data center and EMC VPLEX provides customers entirely new ways to solve IT problems and introduce new models of computing. Specifically, customers can:


• Move virtualized applications across data centers

• Enable workload balancing and relocation across sites

• Aggregate data centers and deliver IT services “24 x forever”

EMC VPLEX clustering architecture

VPLEX uses a unique clustering architecture to help customers remove the physical boundaries of the data center and allow servers at multiple data centers to have read/write access to shared block storage devices.

A VPLEX Local configuration is defined by one, two, or four VPLEX Engines, which are integrated into a single cluster through their fully redundant inter-engine fabric interconnections. This cluster interconnect allows for the online addition of VPLEX Engines, providing exceptional scalability for both VPLEX Local and VPLEX Metro configurations. All connectivity between VPLEX Cluster nodes and across VPLEX Metro configurations is fully redundant, ensuring protection against single points of failure.

A VPLEX Cluster can scale up through the addition of more engines, and scale out by connecting clusters into a VPLEX Metro (two VPLEX Clusters connected within metro distances). VPLEX Metro helps transparently move and share workloads (including virtualized hosts), consolidate data centers, and optimize resource utilization across data centers. In addition, it provides nondisruptive data mobility, heterogeneous storage management, and improved application availability. VPLEX Metro supports up to two clusters, which can be in the same data center or at two different sites within synchronous distances (approximately up to 60 miles or 100 kilometers apart).

Figure 3. Local and distributed federation with EMC VPLEX Local and VPLEX Metro

EMC VPLEX maintains customer expectations for high-end storage in terms of availability. High-end availability is more than just redundancy; it means nondisruptive operations and upgrades, and being “always online.” EMC VPLEX provides:

• AccessAnywhere, with full connectivity of resources across clusters and Metro-Plex configurations

• Data mobility and migration options across heterogeneous storage arrays

• The power to maintain service levels and functionality as consolidation grows

• Simplified control for provisioning in complex environments

• Dynamic load balancing of data between storage arrays

VPLEX device virtualization

Device virtualization capabilities within VPLEX allow storage devices to be claimed as storage volumes, which can then be sliced or encapsulated as extents and used to form composite devices that are exposed to hosts as virtual volumes. VPLEX supports several different virtual-to-physical device transformations, including:

• Extents — An extent is a device forming a contiguous range of pages of a storage volume. An extent may wholly encapsulate a storage volume or it may define a sliced portion of a storage volume. A VPLEX 4.0 page consists of 4 KB of storage.


• RAID 0 — A RAID 0 device is an aggregation of two or more devices on which the logical pages of storage are striped across these devices with the goal of increasing I/O performance by distributing data across multiple spindles.

• RAID-C — A RAID-C device is an aggregation of two or more devices that are logically concatenated to form a larger storage device.

• RAID 1 — A RAID 1 device consumes two or more like-size local devices to form a mirror. Each of these devices provides a full copy of the data and writes to the RAID 1 device are applied to each leg of this mirror.

• DR-1 — A DR-1 device is a distributed RAID 1 device; it is like a RAID 1, but the legs of the mirrored device are provided by different clusters of a VPLEX system.
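
To make these transformations concrete, the following minimal Python sketch models how a device tree built from these geometries can resolve a virtual-volume page to physical locations. It is purely illustrative: the classes, names, and example layout are invented for this paper and are not VPLEX code.

    # Hypothetical sketch of VPLEX-style device composition (illustrative only).
    PAGE_SIZE = 4 * 1024  # VPLEX 4.0 uses 4 KB pages

    class Extent:
        """A contiguous range of pages on a physical storage volume."""
        def __init__(self, storage_volume, start_page, num_pages):
            self.storage_volume = storage_volume
            self.start_page = start_page
            self.num_pages = num_pages
        def size(self):
            return self.num_pages
        def resolve(self, page):
            # Map a device-relative page to (storage volume, physical page).
            return [(self.storage_volume, self.start_page + page)]

    class RaidC:
        """Concatenation: legs are laid out back to back."""
        def __init__(self, legs):
            self.legs = legs
        def size(self):
            return sum(leg.size() for leg in self.legs)
        def resolve(self, page):
            for leg in self.legs:
                if page < leg.size():
                    return leg.resolve(page)
                page -= leg.size()
            raise IndexError("page beyond device")

    class Raid0:
        """Striping: logical pages rotate across legs to spread I/O."""
        def __init__(self, legs):
            self.legs = legs
        def size(self):
            return min(leg.size() for leg in self.legs) * len(self.legs)
        def resolve(self, page):
            leg = self.legs[page % len(self.legs)]
            return leg.resolve(page // len(self.legs))

    class Raid1:
        """Mirroring: every page exists on every leg; writes go to all legs."""
        def __init__(self, legs):
            self.legs = legs
        def size(self):
            return min(leg.size() for leg in self.legs)
        def resolve(self, page):
            # A write to this page must be applied to each mirror leg.
            return [loc for leg in self.legs for loc in leg.resolve(page)]

    # A mirrored virtual volume striped over two extents per array:
    vol = Raid1([
        Raid0([Extent("ArrayA-LUN0", 0, 1024), Extent("ArrayA-LUN1", 0, 1024)]),
        Raid0([Extent("ArrayB-LUN0", 0, 1024), Extent("ArrayB-LUN1", 0, 1024)]),
    ])
    print(vol.resolve(5))  # physical locations backing logical page 5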

VPLEX 4.0 provides distributed cache coherency on top of its device virtualization capabilities. The VPLEX cache is maintained globally and consistently between the various system directors. Together the caches of the directors maintain the illusion and behavior that each virtual volume is a single disk despite the fact that the data of this volume may be spread across many different devices and distributed between and accessed from different data centers.
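
As an illustration of the general idea only (this is a generic directory-based invalidation scheme, not the actual GeoSynchrony protocol, and every name in it is hypothetical), a per-page directory can keep the directors’ caches coherent:

    # Hypothetical directory-based coherence sketch (not the GeoSynchrony protocol).
    class PageDirectory:
        """Tracks, per page, which directors hold a cached copy."""
        def __init__(self):
            self.sharers = {}  # page -> set of director ids holding the page

        def on_read(self, page, director):
            # Any number of directors may cache a page for reading.
            self.sharers.setdefault(page, set()).add(director)

        def on_write(self, page, director):
            # Before a write becomes visible, every other cached copy is
            # invalidated, so all directors keep seeing a single consistent volume.
            stale = self.sharers.get(page, set()) - {director}
            for other in stale:
                self.invalidate(page, other)
            self.sharers[page] = {director}

        def invalidate(self, page, director):
            print(f"invalidate page {page} in director {director}'s cache")

    d = PageDirectory()
    d.on_read(42, "director-1-1-A")
    d.on_read(42, "director-2-1-B")
    d.on_write(42, "director-1-1-A")  # invalidates the copy on director-2-1-B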

VPLEX hardware overview

As described in the previous section, each VPLEX 4.0 system is composed of one or two VPLEX Clusters, each consisting of one, two, or four engines. A VPLEX Engine is a chassis containing two directors, redundant power supplies, fans, I/O modules, and management modules. The directors are the workhorse components of the system and are responsible for processing I/O requests from the hosts, serving and maintaining data in the distributed cache, providing the virtual-to-physical I/O translations, and interacting with the storage arrays to service I/O.

Figure 4 illustrates a VPLEX Metro consisting of two four-engine clusters.

Figure 4. A VPLEX Metro with two large clusters (in the original diagram, Cluster-1 and Cluster-2 each contain four engines, Engine-x-1 through Engine-x-4, each backed by SPS units, plus a pair of 8-port Fibre Channel switches with dedicated UPSs and a management server; the clusters are connected by Fibre Channel and IP links)


Figure 5 is a photo of a VPLEX Engine showing the 12 I/O modules supported by the engine, with six allocated to each director. Each director has two four-port 8 Gb/s Fibre Channel I/O modules used for front-end SAN (host) connectivity and two four-port 8 Gb/s Fibre Channel I/O modules used for back-end SAN (storage array) connectivity. Each of these modules has 10 Gb/s of effective PCI bandwidth to the CPUs of its corresponding director. A fifth I/O module provides two ports of 8 Gb/s Fibre Channel connectivity for intra-cluster communication and two ports of 8 Gb/s Fibre Channel connectivity for inter-cluster communication. The sixth I/O module contains four ports of 1 Gb/s Ethernet and is currently unused.

Figure 5. A VPLEX Engine and its components (directors A and B, I/O modules, power supplies A and B, and management modules A and B)

The engine houses two redundant power supplies each capable of providing full power to the chassis. Redundant management modules provide IP connectivity to the directors from the management server that is provided with each cluster. Two private IP subnets provide redundant IP connectivity between the directors of a cluster and the cluster’s management server. Four redundant fans provide cooling to the engine’s chassis and support a 3+1 configuration model that supplies sufficient cooling in the presence of a single fan failure.

Each engine is supported by a redundant Standby Power Supply unit that provides power to ride through transient power-loss conditions lasting up to five minutes.

Clusters containing two or more engines are fitted with a pair of Fibre Channel switches that provide redundant Fibre Channel connectivity supporting intra-cluster communication between the directors. Each Fibre Channel switch is backed by a dedicated Uninterruptible Power Supply (UPS) that allows it to ride through transient power loss.

Deployment overview

VPLEX supports several different deployment models to suit different needs. The next few sections describe these different models and when they should be used.

VPLEX Local deployment

Figure 6 illustrates a typical deployment of a VPLEX Local system. VPLEX Local systems are supported in small, medium, or large configurations consisting of one, two, or four engines, respectively, yielding systems that provide two, four, or eight directors.


Figure 6. Example of a VPLEX Local small deployment (host servers reach the VPLEX directors through redundant Fibre Channel SANs; the storage arrays sit behind VPLEX, and private IP management subnets connect the directors to the management server)

When to use a VPLEX Local deployment

VPLEX Local is appropriate when the virtual storage capabilities of workload relocation, workload resiliency, and simplified storage management are desired within a single data center and the scaling capacity of VPLEX Local is sufficient to meet the needs of this data center. If larger scale is needed, consider deploying either a VPLEX Metro (discussed next) or multiple instances of VPLEX Local.

VPLEX Metro deployment within a data center

Figure 7 illustrates a typical deployment of a VPLEX Metro system in a single data center. VPLEX Metro systems contain two clusters, each cluster having one, two, or four engines. The clusters in a VPLEX Metro deployment need not have the same number of engines. For example, a 2 x 4 VPLEX Metro system is supported with one cluster having two engines and the other cluster having four engines.


Figure 7. Example of a VPLEX Metro deployment within a data center (two VPLEX Clusters in Data Center A, each attached to redundant Fibre Channel SANs and linked to each other by Fibre Channel and IP)

When to use a VPLEX Metro deployment within a data center

Deploying VPLEX Metro within a data center is appropriate when the virtual storage capabilities of workload relocation, workload resiliency, and simplified storage management are desired within a single data center, and either more scale is needed than a VPLEX Local solution provides or additional resiliency is desired.

VPLEX Metro provides the following additional resiliency benefit over VPLEX Local. The two clusters of a VPLEX Metro can be separated by up to 100 km. This provides excellent flexibility for deployment within a data center and allows the two clusters to be deployed at separate ends of a machine room or on different floors to provide better fault isolation between the clusters. For example, this allows the clusters to be placed in different fire suppression zones, which can mean the difference between riding through a localized fault such as a contained fire and a total system outage.

VPLEX Metro deployment between data centers

Figure 8 illustrates a deployment of a VPLEX Metro system between two data centers. This deployment is similar to that shown in the section “VPLEX Metro deployment within a data center,” only here the clusters are placed in separate data centers. This typically means that separate hosts connect to each cluster. Clustered applications can have, for example, one set of application servers deployed in data center A, and another set deployed in data center B for added resiliency and workload relocation benefits. It is important to understand that a site failure or total failure of the primary (winner) cluster for a distributed volume will require manual resumption of I/O at the secondary site (see “VPLEX Cluster failures”).


Figure 8. Example of a VPLEX Metro deployment between data centers (one VPLEX Cluster in Data Center A and one in Data Center B, each attached to redundant Fibre Channel SANs and linked by Fibre Channel and IP)

When to use a VPLEX Metro deployment between data centers

A deployment of VPLEX Metro between two data centers is appropriate when the additional workload resiliency benefit of having an application’s data present in both data centers is desired. This deployment is also recommended when:

• Applications in one data center need to access data in the other data center

• Workloads need to be redistributed between the two data centers

• One data center has run out of space, power, or cooling

Workload resiliency

In the next few sections we study different faults that can occur in a data center and look at how VPLEX can be used to add resiliency to applications, allowing their workloads to ride through these fault conditions. The following classes of faults and service events are considered:

• Storage array outages (planned and unplanned)

• SAN outages

• VPLEX component failures

• VPLEX Cluster failures

• Host failures

• Data center outages


Storage array outages

To overcome both planned and unplanned storage array outages, VPLEX supports the ability to mirror the data of a virtual volume between two or more storage volumes (up to eight legs per mirrored volume are supported) using a RAID 1 device. Figure 9 provides an illustration of a virtual volume that is mirrored between two arrays. Should one array incur an outage, either planned or unplanned, the VPLEX system will continue processing I/O on the surviving mirror leg. Upon restoration of the failed storage volume, the data from the surviving volume is resynchronized to the recovered leg.

Best practices

• For critical data, it is recommended to mirror data onto two or more storage volumes that are provided by separate arrays.

• For the best performance, these storage volumes should be configured identically and be provided by the same type of array.
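
The following sketch models this ride-through behavior: writes fan out to every healthy mirror leg, I/O continues on a surviving leg during an array outage, and a recovered leg is resynchronized from a surviving copy. It is illustrative pseudologic with invented names, not VPLEX internals.

    # Hypothetical RAID 1 ride-through sketch (illustrative, not VPLEX internals).
    class MirrorLeg:
        def __init__(self, name):
            self.name = name
            self.healthy = True
            self.pages = {}
        def write(self, page, data):
            self.pages[page] = data

    class Raid1Volume:
        def __init__(self, legs):
            self.legs = legs  # up to eight legs per mirrored volume

        def write(self, page, data):
            healthy = [leg for leg in self.legs if leg.healthy]
            if not healthy:
                raise IOError("no surviving mirror leg; I/O cannot proceed")
            for leg in healthy:   # apply the write to every healthy leg
                leg.write(page, data)

        def restore(self, failed_leg):
            # On recovery, resynchronize the recovered leg from a surviving leg.
            source = next(leg for leg in self.legs if leg.healthy)
            failed_leg.pages = dict(source.pages)
            failed_leg.healthy = True

    vol = Raid1Volume([MirrorLeg("ArrayA"), MirrorLeg("ArrayB")])
    vol.write(0, b"app data")
    vol.legs[0].healthy = False   # ArrayA suffers an outage
    vol.write(1, b"more data")    # I/O continues on ArrayB
    vol.restore(vol.legs[0])      # ArrayA returns and is resynchronized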

Figure 9. Example use of RAID 1 mirroring to protect against array outages (a mirrored virtual volume whose two mirror legs reside on separate arrays reached through redundant Fibre Channel SANs)

SAN outages

When a pair of redundant Fibre Channel fabrics is used with VPLEX, VPLEX directors should be connected to both fabrics, both for the front-end (host-side) connectivity and for the back-end (storage array side) connectivity. This deployment, along with the isolation of the fabrics, allows the VPLEX system to ride through failures that take out an entire fabric and allows the system to provide continuous access to data through this type of fault. Hosts must also be connected to both fabrics and use multipathing software to ensure continuous data access in the presence of such failures. Figure 10 illustrates a best practice dual-fabric deployment.

Figure 10. Recommended use of a dual-fabric deployment (VPLEX front-end ports split across fabrics SAN-FE-A and SAN-FE-B, and back-end ports split across SAN-BE-A and SAN-BE-B)

Best practice

• It is recommended that I/O modules be connected to redundant fabrics. For example, in a deployment with fabrics A and B, it is recommended that the ports of a director be connected as shown in Figure 11.


Figure 11. Recommended fabric assignments for FE and BE ports

VPLEX component failures

All critical processing components of a VPLEX system use at minimum pair-wise redundancy to maximize access to data. This section describes how VPLEX component failures are handled and the best practices that should be used to allow applications to tolerate these failures.

All component failures that occur within a VPLEX system are reported through events that call back to the EMC Service Center to ensure timely response and repair of these fault conditions.

Fibre Channel port failure

All VPLEX communications happen over redundant paths that allow communication to continue in the presence of port failures. This redundancy allows multipathing software in the host servers to retransmit and redirect I/O around path failures that occur as a result of port failures or other events in the SAN that lead to path loss.

VPLEX uses its own multipathing logic to maintain redundant paths to back-end storage from each director. This allows VPLEX to ride through port failures on the back-end VPLEX ports as well as on the back-end fabrics and the array ports that connect the physical storage to VPLEX.

The Small Form-factor Pluggable (SFP) transceivers that are used for connectivity to VPLEX are serviceable Field Replaceable Units (FRUs).

Best practices

• Ensure that there is a path from each host to at least one front-end port on director A and at least one front-end port on director B. When the VPLEX system has two or more engines, ensure that the host has at least one A-side path in one engine and at least one B-side path in a separate engine. For maximum availability, each host can have a path to at least one front-end port on every director.

• Use multipathing software on the host servers to ensure timely response and continuous I/O in the presence of path failures.

• Ensure that each host has a path to each virtual volume through each fabric.

• Ensure that the LUN mapping and masking for each storage volume presented from a storage array to VPLEX presents the volume out of at least two ports from the array on at least two different fabrics, and connects to at least two different ports serviced by two different back-end I/O modules of each director within a VPLEX Cluster.

• Ensure that the fabric zoning provides hosts redundant access to the VPLEX front-end ports and provides VPLEX redundant access to the array ports.
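
These recommendations lend themselves to a simple automated audit. The sketch below assumes a hypothetical inventory of host paths as (host, fabric, engine, director side) tuples; the data model and function are invented for illustration and are not a VPLEX tool.

    # Hypothetical path-redundancy check for the best practices above.
    # A path is (host, fabric, engine, director_side), for example:
    paths = [
        ("host1", "fabric-A", "engine-1", "A"),
        ("host1", "fabric-B", "engine-2", "B"),
    ]

    def check_host_redundancy(host, paths):
        mine = [p for p in paths if p[0] == host]
        problems = []
        sides = {(engine, side) for _, _, engine, side in mine}
        if not any(side == "A" for _, side in sides):
            problems.append("no path to an A-side director")
        if not any(side == "B" for _, side in sides):
            problems.append("no path to a B-side director")
        # In multi-engine clusters, the A-side and B-side paths should
        # land on different engines.
        engines = {engine for engine, _ in sides}
        if len(engines) < 2:
            problems.append("A-side and B-side paths share one engine")
        fabrics = {fabric for _, fabric, _, _ in mine}
        if len(fabrics) < 2:
            problems.append("host does not have paths through two fabrics")
        return problems

    print(check_host_redundancy("host1", paths) or "host1: redundancy OK")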

I/O module failure

I/O modules within VPLEX serve dedicated roles. Each VPLEX director has two front-end I/O modules, two back-end I/O modules, and one COM I/O module used for intra- and inter-cluster connectivity. Each I/O module is a serviceable FRU. The following sections describe the behavior of the system and best practices for maximizing availability in the presence of these failures.

FE I/O module

Should an FE I/O module fail, all paths connected to this I/O module will be disrupted and fail. The best practices listed under “Fibre Channel port failure” should be followed to ensure that hosts have a redundant path to their data.

During the removal and replacement of an I/O module, the affected director will be reset.

BE I/O module

Should a BE I/O module fail, all paths connected to this I/O module will be disrupted and fail. The best practices listed under “Fibre Channel port failure” should be followed to ensure that each director has a redundant path to each storage volume through a separate I/O module.

During the removal and replacement of an I/O module, the affected director will be reset.

COM I/O module

Should the COM I/O module of a director fail, the director will reset and all service provided by the director will stop. The best practices listed under “Fibre Channel port failure” ensure that each host has redundant access to its virtual storage through multiple directors, so the reset of a single director will not cause the host to lose access to its storage.

During the removal and replacement of an I/O module, the affected director will be reset.

Director failure

A director failure causes the loss of all service from that director. Each VPLEX Engine has a pair of directors for redundancy. VPLEX Clusters containing two or more engines benefit from the additional redundancy provided by the additional directors. Each director within a cluster is capable of presenting the same storage. The best practices described under “Fibre Channel port failure” allow a host to ride through director failures by placing redundant paths to its virtual storage through ports provided by different directors. The combination of multipathing software on the hosts and redundant paths through different directors of the VPLEX system allows the host to ride through the loss of a director.

In a multi-engine system, a host can maintain access to its data in the unlikely event that multiple directors should fail by having paths to its virtual storage provided by each director in the system.

Each director is a serviceable FRU.

Engine power supply failure

The VPLEX Engine power supplies are fully redundant and no loss of service or function is incurred in the presence of a single power supply failure.

Each power supply is a serviceable FRU and can be removed and replaced with no disruption to the system.

Engine fan failure

The VPLEX Engine fan units are fully redundant and no loss of service is incurred during the loss of a single fan unit. Each engine contains four fan units. In the event of a single fan failure, the remaining three fan units will continue to provide sufficient cooling for the system. Should two fans fail, the engine will automatically shut down to prevent damage from overheating.

Each fan unit is a serviceable FRU and can be removed and replaced with no disruption to the system.

Intra-cluster IP subnet failure

Each VPLEX Cluster has a pair of private local IP subnets that connect the directors to the management server. These subnets are used for management traffic as well as for protection against intra-cluster partitioning. Link loss on one of these subnets can result in the inability of some members to communicate with other members on that subnet; this results in no loss of service or manageability due to the presence of the redundant subnet.

Intra-cluster Fibre Channel switch failure

Each VPLEX Cluster with two or more engines uses a pair of dedicated Fibre Channel switches for intra-cluster communication between the directors within the cluster. Two redundant Fibre Channel fabrics are created with each switch serving a different fabric. The loss of a single Fibre Channel switch results in no loss of processing or service.

VPLEX Engine failure

In VPLEX Clusters containing two or more engines, the unlikely event of an engine failure will result in the loss of service from the directors within this engine, but virtual volumes serviced by the directors in other surviving engines will remain available. The best practices under “Fibre Channel port failure” describe placing redundant paths to a virtual volume on directors from different engines in multi-engine VPLEX Clusters.

Standby Power Supply failure

Each VPLEX Engine is supported by a pair of Standby Power Supplies (SPS) that provide a hold-up time of five minutes, allowing the system to ride through transient power loss. A single SPS provides enough power for the attached engine. VPLEX provides a pair of SPSs for high availability.

Each SPS is a FRU and can be replaced with no disruption to the services provided by the system. The recharge time for an SPS is up to 5.5 hours and the batteries in the SPS are capable of supporting two sequential five-minute outages.

Inter-cluster link failure

Each director in a VPLEX system has two links dedicated to inter-cluster communication. Each of these links should be configured (for example, zoned) to provide paths to each director in the remote cluster. In this manner, full connectivity between directors remains available even in the presence of a failure of a single link. Should one director lose both links, all inter-cluster I/O will be suspended between the two clusters to preserve write-order fidelity semantics and ensure that the remote site maintains a recoverable image. When this occurs, remote mirrors will be fractured and user-configured rules, referred to as detach rules, will be executed to determine which VPLEX Cluster should continue to allow I/O for a given remote mirror. These rules can be configured on a per-device basis, allowing some volumes to remain available on one cluster, and other volumes to remain available on the other cluster.

Once the link failures have been repaired, I/O can be restored and resynchronization tasks started to restore remote mirrors. These actions will take place automatically, or volumes can be configured to require manual resumption of I/O should coordination with server actions be required. I/O to the devices can take place immediately without needing to wait for the resynchronization tasks to complete.
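
As a hedged sketch of how detach rules might be evaluated when inter-cluster communication is lost (the data model and names here are invented; this is not GeoSynchrony code):

    # Hypothetical detach-rule evaluation on loss of inter-cluster communication.
    # Each distributed device names the cluster that keeps servicing I/O.
    detach_rules = {
        "dd_finance": "cluster-1",   # cluster-1 wins for this device
        "dd_web":     "cluster-2",   # cluster-2 wins for this device
    }

    def on_cluster_partition(local_cluster, devices):
        """Decide, per device, whether this cluster continues or suspends I/O."""
        for device in devices:
            winner = detach_rules[device]
            if winner == local_cluster:
                # Fracture the mirror: the remote leg is detached and changed
                # regions are tracked so the leg can be resynchronized later.
                print(f"{local_cluster}: {device} detaches remote leg, I/O continues")
            else:
                # Suspend to preserve write-order fidelity; the winner keeps
                # the authoritative image.
                print(f"{local_cluster}: {device} suspends I/O")

    on_cluster_partition("cluster-1", ["dd_finance", "dd_web"])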


Metadata volume failure

VPLEX maintains its configuration state, referred to as metadata, on storage volumes provided by storage arrays on the SAN. Each VPLEX Cluster maintains its own metadata, which describes the local configuration information for this cluster as well as any distributed configuration information shared between clusters. It is strongly recommended that the metadata volume for each cluster be configured with multiple back-end storage volumes provided by different storage arrays of the same type. The data protection capabilities provided by these storage arrays, such as RAID 1 and RAID 5, should be used to ensure the integrity of the system’s metadata. Additionally, it is highly recommended that backup copies of the metadata be made whenever configuration changes are made to the system.

VPLEX uses this persistent metadata upon a full system boot and loads the configuration information onto each director. When changes to the system configuration are made, these changes are written out to the metadata volume. Should access to the metadata volume be interrupted, the VPLEX directors will continue to provide their virtualization services using the in-memory copy of the configuration information. Should the storage supporting the metadata device remain unavailable, a new metadata device should be configured. Once a new device has been assigned and configured, the in-memory copy of the metadata maintained by the cluster is written out to the new metadata device.

The ability to perform configuration changes is suspended when access to the persistent metadata device is not available.

Dirty region log failure

VPLEX Metro uses a dirty region log to record information about which regions of a fractured distributed mirror have been updated while a mirror leg is detached. This information is kept for each such detached mirror leg. Should this volume become inaccessible, the directors will record the entire leg as out-of-date and will require a full resynchronization of this leg of the volume once it is reattached to the mirror.
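
In outline, a dirty region log behaves as sketched below: writes made while a leg is detached mark the regions they touch, reattachment resynchronizes only the marked regions, and loss of the log forces a full-leg resynchronization. The region size and all names are assumptions made for this illustration, not the VPLEX on-disk format.

    # Generic dirty-region-log sketch (not the VPLEX on-disk format).
    REGION_SIZE = 64 * 1024  # region granularity is an assumption for this example

    class DirtyRegionLog:
        def __init__(self):
            self.dirty = set()   # regions written while a mirror leg is detached
            self.lost = False

        def record_write(self, offset, length):
            first = offset // REGION_SIZE
            last = (offset + length - 1) // REGION_SIZE
            self.dirty.update(range(first, last + 1))

        def regions_to_resync(self, total_regions):
            if self.lost:
                # Without the log, the whole leg must be treated as out of date.
                return set(range(total_regions))
            return self.dirty

    drl = DirtyRegionLog()
    drl.record_write(offset=0, length=4096)
    drl.record_write(offset=10 * REGION_SIZE, length=128)
    print(sorted(drl.regions_to_resync(total_regions=1024)))  # -> [0, 10]
    drl.lost = True
    print(len(drl.regions_to_resync(total_regions=1024)))     # -> 1024 (full resync)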

Management server failure

Each VPLEX Cluster has a dedicated management server that provides management access to the directors and supports management connectivity for remote access to the peer cluster in a VPLEX Metro environment. As the I/O processing of the VPLEX directors does not depend upon the management servers, the loss of a management server will not interrupt the I/O processing and virtualization services provided by VPLEX.

Uninterruptible Power Supply failure

In VPLEX Clusters containing two or more engines, a pair of Fibre Channel switches supports intra-cluster communication between the directors in these engines. Each switch has a dedicated UPS that provides backup power in the case of transient power loss. The UPS units will allow the Fibre Channel switches to continue operating for up to five minutes following the loss of power. The lower UPS in the rack also provides backup power to the management server.

VPLEX Cluster failures

VPLEX Metro supports two forms of distributed devices: metro-distributed virtual volumes and remote virtual volumes. Metro-distributed virtual volumes provide synchronized copies (mirrors) of the volume’s data in each cluster. The mirrored volume appears and behaves as a single volume and acts in a similar manner to a virtual volume that uses a RAID 1 device, but with the added value that each cluster maintains a copy of the data. Remote virtual volumes provide access to a virtual volume whose data resides in one cluster. Remote virtual volumes, like metro-distributed virtual volumes, are able to take advantage of the VPLEX distributed coherent cache and its prefetch algorithms to provide better performance than a SAN-extension solution.

For each metro-distributed virtual volume, a detach rule identifies which cluster in a VPLEX Metro should detach its mirror leg (removing it from service) in the presence of communication loss between the two clusters. These rules effectively define a bias (or winner) site should the clusters lose communication with each other. There are two conditions that can cause the clusters to lose communication: one is an inter-cluster link failure, discussed in “Inter-cluster link failure,” and the other is a cluster failure. This section describes the latter class of failure. In the presence of a cluster failure, any metro-distributed virtual volume whose detach rule identified the surviving site as the winning site will continue to have I/O serviced on the surviving leg of the device. Those volumes whose detach rules declared this site to detach in the case of communication loss will have their I/O suspended. Because a cluster failure cannot be distinguished from a link failure, this behavior is designed to preserve the integrity of the data on these distributed devices.

There are two failure cases for remote virtual volumes. First, should the cluster that supplies the physical media for the virtual volume fail, the remote virtual volume becomes completely inaccessible. Second, should the remote cluster fail (the cluster with no physical media for this volume), access to the virtual volume remains available from the hosting cluster (the cluster with the physical data), but not from the remote cluster.

Host failures

While host-based clustering is not a capability provided by VPLEX, it is an important technique for maximizing workload resiliency. With host-based clustering, an application can continue providing service in the presence of host failures using either an active/active or an active/passive processing model. When combined with the VPLEX capabilities described above, the result is an infrastructure capable of providing very high degrees of availability.

Data center outages

A VPLEX Metro distributed between two data centers can be used to protect against data loss in the presence of a data center outage by mirroring data between the two data centers. This deployment can further improve access to data in the presence of a data center outage in the manner described in “VPLEX Cluster failures” (that is, if a data center outage results in the loss of one of the VPLEX Clusters in a VPLEX Metro). Data access will remain available for metro-distributed virtual volumes whose winning cluster is in the surviving data center. For those volumes whose winning cluster is in the data center with the outage, access to the data can be restored on the other cluster by invoking a manual command to resume the suspended I/O. When combined with failover logic for host clusters, this provides infrastructure that is able to restore service operations quickly even in the presence of an unplanned data center outage.

Some data center outages are caused by the loss of power in the data center. VPLEX uses standby power supplies and UPSs to overcome transient losses of power lasting five minutes or less. This must be combined with similar supporting infrastructure for the hosts, network equipment, and storage arrays for a comprehensive solution for tolerating transient power loss. For power-loss conditions lasting longer than five minutes, VPLEX will stop providing virtualization services. The write-through caching properties of VPLEX ensure that application data has been written to the back-end storage arrays before the write is acknowledged to the host.
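
The write-through property can be stated compactly, as in the minimal sketch below (all names are invented for illustration): the host’s write is acknowledged only after the back-end arrays have committed it, so cached state is never the only copy of acknowledged data.

    # Conceptual write-through sketch (names invented for illustration).
    class ArrayLeg:
        def __init__(self):
            self.pages = {}
        def write(self, page, data):
            self.pages[page] = data  # data is persisted on the array

    def write_through(cache, legs, page, data):
        # Commit to every back-end leg before acknowledging the host, so a
        # power loss cannot strand acknowledged data in volatile cache.
        for leg in legs:
            leg.write(page, data)
        cache[page] = data           # update the read cache afterward
        return "ACK"

    cache, legs = {}, [ArrayLeg(), ArrayLeg()]
    print(write_through(cache, legs, page=7, data=b"payload"))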

Conclusion

VPLEX provides extensive internal hardware and software redundancy that not only ensures high availability of the VPLEX services but also further improves the workload resiliency of the surrounding infrastructure. When combined with the best practices of host-based clustering, multipathing, fabric redundancy, storage media protection, and standby power infrastructure, the resulting solution provides a solid foundation for robust, highly available virtual storage.


References

More information on virtual storage infrastructure and the capabilities of VPLEX 4.0 is provided in these white papers:

• Implementation and Planning Best Practices for EMC VPLEX Technical Notes

• Using VMware Virtualization Platforms with EMC VPLEX — Best Practices Planning

• Nondisruptive Storage Relocation: Planned Events with EMC VPLEX — Best Practices Planning

• VMotion over Distance for Microsoft, Oracle, and SAP Enabled by VCE Vblock1, EMC Symmetrix VMAX, EMC CLARiiON, and EMC VPLEX Metro — An Architectural Overview

• Implementing EMC VPLEX and Microsoft Hyper-V and SQL Server with Enhanced Failover Clustering Support — Applied Technology
