
    Implementation and Planning

    Best Practices for EMC VPLEX VS2 Hardware

    with GeoSynchrony v5.x

    Technical Note

    Nov 2013

    These technical notes describe various EMC VPLEX configurations and the recommended best practices for each configuration.

Contents

    EMC VPLEX Overview
    VPLEX Components
    Requirements vs. Recommendations
    Back-end/Storage Array Connectivity
    Host Connectivity
    Host Cluster cross-connect
    Path loss handling semantics (PDL and APD)
    VBLOCK and VPLEX Front End Connectivity Rules
    Storage View Considerations
    Fan In / Fan Out Ratios
    Network Best Practices
    VPLEX Witness
    Consistency Groups
    Rule Sets
    System Volumes
    Migration of Host/Storage to a VPLEX Environment
    Storage Volume Considerations
    Export Considerations
    LUN Expansion
    Data Migration
    Scale Up Scale Out and Hardware Upgrades
    VPLEX and RecoverPoint Integration
    Monitoring for Performance Best Practices
    Storage Resource Management Suite VPLEX Solution Pack
    VPLEX Administration Recommendations
    Summary
    Appendix A: VS1
    Glossary


    Audience

    These technical notes are for EMC field personnel, partners, and customers who will be configuring, installing, and supporting VPLEX. Readers are assumed to understand the following:

    SAN technology and network design

    Fibre Channel block storage concepts

    VPLEX concepts and components

    The next section presents a brief review of VPLEX.


    EMC VPLEX Overview

    EMC VPLEX represents the next-generation architecture for data mobility and continuous availability. This architecture is based on EMC's 20+ years of expertise in designing, implementing, and perfecting enterprise-class intelligent cache and distributed data protection solutions.

    VPLEX addresses two distinct use cases:

    Data Mobility: The ability to move applications and data across different storage installations, whether within the same data center, across a campus, or within a geographical region

    Continuous Availability: The ability to create a continuously available storage infrastructure across these same varied geographies with unmatched resiliency

    Figure 1 Access anywhere, protect everywhere


    VPLEX Platform Availability and Scaling Summary

    VPLEX addresses continuous availability and data mobility requirements and scales to the I/O throughput required for the front-end applications and back-end storage.

    Continuous availability and data mobility features are characteristics of VPLEX Local, VPLEX Metro, and VPLEX Geo.

    A VPLEX cluster consists of one, two, or four engines (each containing two directors), and a management server. A dual-engine or quad-engine cluster also contains a pair of Fibre Channel switches for communication between directors within the cluster.

    Each engine is protected by a standby power supply (SPS), and each Fibre Channel switch gets its power through an uninterruptible power supply (UPS). (In a dual-engine or quad-engine cluster, the management server also gets power from a UPS.)

    The management server has a public Ethernet port, which provides cluster management services when connected to the customer network. This Ethernet port also provides the point of access for communications with the VPLEX Witness.

    VPLEX scales both up and out. Upgrades from a single engine to a dual engine cluster as well as from a dual engine to a quad engine are fully supported and are accomplished non-disruptively. This is referred to as scale up. Scale out upgrades from a VPLEX Local to a VPLEX Metro or VPLEX Geo are also supported non-disruptively.

    Note: Scaling out also supports any size cluster to any size cluster meaning that it is not required that both clusters contain the same number of engines.

    For access to all VPLEX-related collateral, interaction, and information from EMC, please visit the customer-accessible resources below:

    EMC Community Network - ECN: Space: VPLEX
    https://community.emc.com/community/connect/vplex?view=discussions

    EMC Online Support (formerly Powerlink.emc.com):
    https://support.emc.com/

    Data Mobility

    EMC VPLEX enables the connectivity to heterogeneous storage arrays providing seamless data mobility and the ability to manage storage provisioned from multiple heterogeneous arrays from a single interface within a data center. Data Mobility and Mirroring are supported across different array types and vendors.

    VPLEX Metro or Geo configurations enable migrations between locations over synchronous/asynchronous distances. In combination with, for example, VMware and Distance vMotion or Microsoft Hyper-V, it allows you to transparently relocate Virtual Machines and their corresponding applications and data over synchronous distance. This provides you with the ability to relocate, share and balance infrastructure resources between data centers. Geo is currently supported with Microsoft Hyper-V only.

    The EMC Simple Support Matrix (ESSM) is a simplified version of the E-Lab Navigator.

    Note: Please refer to the ESSM for additional support updates.

    Figure 2 Application and data mobility

    All directors in a VPLEX cluster have access to all storage volumes, making this what is referred to as an N-1 architecture. This type of architecture tolerates multiple director failures, down to a single surviving director, without loss of access to data.


    During a VPLEX Mobility operation, any jobs in progress can be paused or stopped without affecting data integrity. Data Mobility creates a mirror of the source and target devices, allowing the user to commit or cancel the job without affecting the actual data. A record of all mobility jobs is maintained until the user purges the list for organizational purposes.

    Figure 3 Migration process comparison

    One of the first and most common use cases for storage virtualization in general is that it provides a simple, transparent approach to array replacement. Standard migrations off an array are time consuming because planned outages must be coordinated with every affected application that cannot have new devices provisioned and copied to it without taking an outage. Additional host remediation may be required to support the new array, which may also require an outage.

    VPLEX eliminates all these problems and makes the array replacement completely seamless and transparent to the servers. The applications continue to operate uninterrupted during the entire process. Host remediation is not necessary as the host continues to operate off the Virtual Volumes provisioned from VPLEX and is not aware of the change in the backend array. All host level support requirements apply only to VPLEX and there are no necessary considerations for the backend arrays as that is handled through VPLEX.

    If the solution incorporates RecoverPoint, and the RecoverPoint Repository, Journal, and Replica volumes reside on VPLEX virtual volumes, then array replacement is also completely transparent even to RecoverPoint. This solution results in no interruption in the replication, so there is no requirement to reconfigure or resynchronize the replication volumes.

    Continuous Availability

    Virtualization Architecture

    Built on a foundation of scalable and continuously available multiprocessor engines, EMC VPLEX is designed to seamlessly scale from small to large configurations. VPLEX resides between the servers and heterogeneous storage assets, and uses a unique clustering architecture that allows servers at multiple data centers to have read/write access to shared block storage devices.

    Unique characteristics of this new architecture include:

    Scale-out clustering hardware lets you start small and grow big with predictable service levels

    Advanced data caching utilizes large-scale SDRAM cache to improve performance and reduce I/O latency and array contention

    Distributed cache coherence for automatic sharing, balancing, and failover of I/O across the cluster

    Consistent view of one or more LUNs across VPLEX clusters (within a data center or across synchronous distances) enabling new models of continuous availability and workload relocation

    With a unique scale-up and scale-out architecture, VPLEX advanced data caching and distributed cache coherency provide workload resiliency, automatic sharing, balancing, and failover of storage domains, and enable both local and remote data access with predictable service levels.

    EMC VPLEX has been architected for multi-site virtualization, enabling federation across VPLEX clusters. VPLEX Metro supports a maximum of 5 ms RTT over FC or 10 GigE connectivity. VPLEX Geo supports a maximum of 50 ms RTT over 10 GigE. The nature of the architecture will enable more than two sites to be connected in the future.

    EMC VPLEX uses a VMware virtual machine located within a separate failure domain to provide a VPLEX Witness between VPLEX clusters that are part of a distributed/federated solution. This third site needs only IP connectivity to the VPLEX sites, and a 3-way VPN is established between the VPLEX management servers and the VPLEX Witness.

    Many competing solutions require a third site with an FC LUN acting as a quorum disk. That LUN must be accessible from the solution's node in each site, resulting in additional storage and link costs.

    Please refer to the section in this document on VPLEX Witness for additional details.

    Storage/Service Availability

    Each VPLEX site has a local VPLEX cluster and physical storage, and hosts are connected to that VPLEX cluster. The VPLEX clusters themselves are interconnected across the sites to enable federation. A device is taken from each of the VPLEX clusters to create a distributed RAID 1 virtual volume. Hosts connected in Site A actively use the storage I/O capability of the storage in Site A; hosts in Site B actively use the storage I/O capability of the storage in Site B.


    Figure 4 Continuous availability architecture

    VPLEX distributed volumes are available from either VPLEX cluster and have the same LUN and storage identifiers when exposed from each cluster, enabling true concurrent read/write access across sites.

    When using a distributed virtual volume across two VPLEX Clusters, if the storage in one of the sites is lost, all hosts continue to have access to the distributed virtual volume, with no disruption. VPLEX services all read/write traffic through the remote mirror leg at the other site.


    VPLEX Components

    VPLEX Engine VS2

    VS2 refers to the second version of hardware for the VPLEX cluster. The VS2 hardware is based on 2U engines and is detailed below. For information on VS1 hardware, please see the appendix.

    The following figures show the front and rear views of the VPLEX VS2 Engine.

    Figure 5 Front and rear view of the VPLEX VS2 Engine

    Connectivity and I/O paths

    This section covers the hardware connectivity best practices for connecting to the SAN. The practices mentioned below are based on a dual-fabric SAN, which is an industry best practice. We'll discuss host and array connectivity. The VPLEX hardware is designed with a standard preconfigured port arrangement that is not reconfigurable. The VS2 hardware must be ordered as a Local, Metro, or Geo. VS2 hardware is pre-configured with FC or 10 Gigabit Ethernet WAN connectivity from the factory and does not offer both solutions in the same configuration.


    Figure 6 VPLEX preconfigured port arrangement VS2 hardware

    Director A and Director B each have four I/O modules. I/O modules A0 and B0 are configured for host connectivity and are identified as frontend, while A1 and B1 are configured for array connectivity and identified as backend. The frontend ports log in to the fabrics and present themselves as targets for zoning to the host initiators, and the backend ports log in to the fabrics as initiators to be used for zoning to the array targets. Each director connects to both SAN fabrics with both frontend and backend ports. Array direct connect is also supported but is limiting; special consideration must be given if this option is required.

    The I/O modules in A2 and B2 are for WAN connectivity. This slot may be populated by a four-port FC module or a two-port 10 GigE module for VPLEX Metro or VPLEX Geo configurations. VPLEX Local configurations ship with filler blanks in slots A2 and B2; WAN I/O modules may be added in the field when connecting to a net new cluster for Metro or Geo upgrades. The I/O modules in slots A3 and B3 are populated with FC modules for Local COM and will only use the bottom two ports.

    The FC WAN COM ports will be connected to dual separate backbone fabrics or networks that span the two sites. This allows data flow between the two VPLEX clusters in a Metro configuration without requiring a merged fabric between the two sites. Dual fabrics that currently span the two sites are also supported but not required. The 10 GigE I/O modules will be connected to dual networks of the same QoS. All A2-FC00 and B2-FC00 ports (or A2-XG00 and B2-XG00 ports) from both clusters will connect to one fabric or network, and all A2-FC01 and B2-FC01 ports (or A2-XG01 and B2-XG01 ports) will connect to the other fabric or network. This provides a redundant network capability where each director on one cluster communicates with all the directors on the other site even in the event of a fabric or network failure. For both VPLEX Metro and VPLEX Geo, each director's WAN COM ports on one cluster must see all of the directors' WAN COM ports within the same port group on the other cluster across two different pipes. This applies in both directions.

    When configuring the VPLEX Cluster cabling and zoning, the general rule is to use a configuration that provides the best combination of simplicity and redundancy. In many instances connectivity can be configured to varying degrees of redundancy. However, there are some minimal requirements that must be adhered to for support of features like NDU. Various requirements and recommendations are outlined below for connectivity with a VPLEX Cluster.

    Frontend (FE) ports provide connectivity to the host adapters, also known as host initiator ports. Backend (BE) ports provide connectivity to storage array ports, known as target ports or FAs.

    Do not confuse the usage of ports and initiator ports within documentation. Any general reference to a port should be a port on a VPLEX director. All references to HBA ports on a host should use the term initiator port. VPLEX Metro and VPLEX Geo sections have a more specific discussion of cluster-to-cluster connectivity.

    General information (applies to both FE and BE)

    Official documents as may be found in the VPLEX Procedure Generator refer to a minimal config and describe how to connect it to bring up the least possible host connectivity. While this is adequate to demonstrate the features within VPLEX for the purpose of a Proof of Concept or use within a Test/Dev environment it should not be implemented in a full production environment. As clearly stated within the documentation for minimal config this is not a Continuously Available solution. Solutions should not be introduced into production environments that are not HA. Also, this minimal config documentation is specific to host connectivity. Please do not try to apply this concept to backend array connectivity. The requirements for backend must allow for connectivity to both fabrics for dual path connectivity to all backend storage volumes from each director.

    The following are recommended:

    Dual fabric designs for fabric redundancy and HA should be implemented to avoid a single point of failure. This provides data access even in the event of a full fabric outage.

    Each VPLEX director will physically connect to both fabrics for both host (front-end) and storage (back-end) connectivity. Hosts will connect to both an A director and B director from both fabrics and across engines for the supported HA level of connectivity as required with the NDU pre-checks.


    Figure 7 Continuous availability front-end configurations (dual-engine)

    Back-end connectivity checks verify that there are two paths to each LUN from each director. This assures that the number of combined active and passive paths (reported by the ndu pre-check command) for the LUN is greater than or equal to two. This check assures that there are at least two unique initiators and two unique targets in the set of paths to a LUN from each director. These backend paths must be configured across both fabrics as well. No volume is to be presented over a single fabric to any director, as this is a single point of failure.
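    These checks can be confirmed from the VPLEX CLI before an NDU. The following is a minimal sketch: the ndu pre-check command is the one referenced above, while connectivity validate-be is an assumed command name for the back-end connectivity summary in GeoSynchrony 5.x and should be verified against the CLI guide for your release.

        VPlexcli:/> connectivity validate-be
        VPlexcli:/> ndu pre-check

    The first command summarizes back-end path counts per director and flags storage volumes that fall below the two-path-per-director minimum; the second runs the full set of NDU pre-checks, including the back-end connectivity test described above.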

    Fabric zoning should consist of a set of zones, each with a single initiator and up to 16 targets.

    Avoid incorrect FC port speed between the fabric and VPLEX. Use the highest possible bandwidth to match the VPLEX maximum port speed, and use dedicated port speeds; that is, do not use oversubscribed ports on SAN switches.

    Each VPLEX director has the capability of connecting both FE and BE I/O modules to both fabrics with multiple ports. The ports connected to on the SAN should be on different blades or switches so a single blade or switch failure won't cause loss of access on that fabric overall. A good design will group VPLEX BE ports with array ports that will be provisioning groups of devices to those VPLEX BE ports in such a way as to minimize traffic across blades.


    Requirements vs. Recommendations

    The following summarizes each item's status in a production environment versus a test or proof-of-concept environment, with notes.

    Dual Fabric
    Production environment: Requirement for high availability.
    Test or proof of concept environment: Requirement if the tests involve high availability.
    Notes: Dual fabrics are a general best practice.

    Dual HBA
    Production environment: Required.
    Test or proof of concept environment: Required.
    Notes: Single-initiator hosts are not supported, and a dual-port HBA is also a single point of failure.

    Initiator connected to both an A and B Director
    Production environment: Required.
    Test or proof of concept environment: Recommended.
    Notes: For a production environment, it is also required that the connectivity for each initiator span engines in a dual- or quad-engine VPLEX cluster.

    Four "active" backend paths per Director per Storage Volume
    Production environment: Recommended (it is a requirement not to have more than four active paths per director per storage volume).
    Test or proof of concept environment: Recommended.
    Notes: This is the maximum number of "active" paths. An active/passive or ALUA array will have a maximum of four active and four passive or non-preferred paths, making eight in all.

    Two "active" backend paths per Director per Storage Volume
    Production environment: Required.
    Test or proof of concept environment: Required.
    Notes: This is a minimum requirement in NDU, which dictates that two VPLEX director backend ports be connected to two array ports per storage volume. Depending on workload, size of environment, and array type, four "active" path configurations have proven to alleviate performance issues and are therefore recommended over the minimum of two active paths per director per storage volume. Try to avoid two-path-only connectivity in production environments.

    Host Direct Connected to VPLEX Directors
    Production environment: Not supported.
    Test or proof of concept environment: Not supported.
    Notes: Host direct connect to a VPLEX director is never supported.

    Arrays direct connected to VPLEX Directors
    Production environment: Not recommended but supported.
    Test or proof of concept environment: Supported.
    Notes: Array direct connect is supported but extremely limited in scale, which is why it is not recommended for a production environment.

    WAN COM single port connectivity
    Production environment: Not supported.
    Test or proof of concept environment: Not supported.
    Notes: Two ports on the WAN COM for each director must be configured, each in its own separate port group. Fibre Channel WAN COM (Metro only) is also supported with all four ports, each in its own port group.

    Metadata and Logging Volume
    Production environment: It is required that metadata and metadata backups are configured across arrays at the local site if more than one array is present. If the site were built originally with a single array and another array were added at a later time, then it is required to move one leg of the metadata volume and one backup to the new array.
    Test or proof of concept environment: It is required that metadata and metadata backups are configured across arrays at the local site if more than one array is present. A standard test during a POC is to perform an NDU; it would be undesirable to have to use the --skip option if not needed.
    Notes: While it is a requirement for metadata and metadata backups to be configured across arrays, it is highly recommended to mirror logging volumes across arrays as well. Loss of the array that contains the logging volumes would result in the additional overhead of full rebuilds after the array came back up.

    Host Cross-Cluster Connect
    Production environment: Supported, with VPLEX Witness required.
    Test or proof of concept environment: Supported, with VPLEX Witness required.
    Notes: VPLEX Witness is a hard requirement for Host Cross-Cluster Connect regardless of the type of environment. The auto-resume attribute on all consistency groups must be set to true as an additional requirement.

    Array Cross-Cluster Connect
    Production environment: Only supported if both sites are within 1 ms latency of each other, and strictly for the purpose of adding protection to metadata and logging volumes. Storage volumes are not supported connected to both clusters.
    Test or proof of concept environment: Only supported if both sites are within 1 ms latency of each other, and strictly for the purpose of adding protection to metadata and logging volumes. Storage volumes are not supported connected to both clusters.
    Notes: Connecting an array to both sides of a VPLEX Metro or Geo is not supported if the sites exceed 1 ms latency from each other. If done, then extreme caution must be taken not to share the same devices to both clusters. Also, be cautious if evaluating performance or fault injection tests with such a configuration.

    VPLEX Witness
    Production environment: Requirement for high availability.
    Test or proof of concept environment: Optional, but should mirror what will be in production.
    Notes: VPLEX Witness is designed to work with a VPLEX Metro or Geo; it is not implemented with a VPLEX Local. VPLEX Witness has proven to be such a valuable enhancement that it should be considered a requirement. VPLEX Witness must never be co-located with either of the VPLEX clusters that it is monitoring.


    Back-end/Storage Array Connectivity

    The best practice for array connectivity is to use A/B fabrics for redundancy; however, VPLEX is also capable of back-end direct connect, which is extremely limited in scale. The fabric-connect best practices below should also be followed for direct connect where applicable.

    Direct connect is intended for Proof of Concept, test/dev and specific sites that have only 1 array. This allows for backend connectivity while reducing the overall cost of switch ports. Sites with multiple arrays or large implementations should utilize SAN connectivity as that provides the optimal solution overall.

    Note: Direct connect applies only to backend connectivity. Frontend direct connect is not supported.

    Active/Active Arrays

    Each director in a VPLEX cluster must have a minimum of two I/O paths to every local back-end storage array and to every storage volume presented to that cluster (required). This is referred to as an ITL or Initiator/Target/LUN.

    Each director will have redundant physical connections to the back-end storage across dual fabrics (required). Each director is required to have redundant paths to every back-end storage array across both fabrics.

    Each storage array should have redundant controllers connected to dual fabrics, with each VPLEX Director having a minimum of two ports connected to the back-end storage arrays through the dual fabrics (required).

    VPLEX recommends a maximum of 4 active paths per director to a given LUN (Optimal). This is considered optimal because each director will load balance across the four active paths to the storage volume.

    High quantities of storage volumes or entire arrays provisioned to VPLEX should be divided up into appropriately sized groups (i.e., masking views or storage groups) and presented from the array to VPLEX via groups of four array ports per VPLEX director, so as not to exceed the four-active-paths-per-VPLEX-director limitation. As an example, following the rule of four active paths per storage volume per director (also referred to as ITLs), a four-engine VPLEX cluster could have each director connected to four array ports dedicated to that director. In other words, a quad-engine VPLEX cluster would have the ability to connect to 32 ports on a single array for access to a single device presented through all 32 ports and still meet the connectivity rules of 4 ITLs per director. This can be accomplished using only two ports per backend I/O module, leaving the other two ports for access to another set of volumes over the same or different array ports.

    Appropriateness would be judged based on things like the planned total IO workload for the group of LUNs and limitations of the physical storage array. For example, storage arrays often have limits around the number of LUNs per storage port, storage group, or masking view they can have.

    Maximum performance, environment-wide, is achieved by load balancing across the maximum number of ports on an array while staying within the ITL limits. Performance is not based on a single host but on the overall impact of all resources being utilized. Proper balancing of all available resources provides the best overall performance.

    Host multipathing load balances between VPLEX directors, and each director then load balances across its four paths, spreading the load equally across the array ports. The steps below outline the approach (a hedged zoning sketch follows the list):

    1. Zone VPLEX director A ports to one group of four array ports.

    2. Zone VPLEX director B ports to a different group of four array ports.

    3. Repeat for additional VPLEX engines.

    4. Create a separate port group within the array for each of these logical path groups.

    5. Spread each group of four ports across array engines for redundancy.

    6. Mask devices to allow access to the appropriate VPLEX initiators for both port groups.
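    As a hedged illustration of steps 1 and 2, the following Brocade FOS-style sketch builds a single-initiator zone from one VPLEX director-A back-end port to a group of four VMAX FA ports on one fabric. All alias names, the zone configuration name, and the WWNs (shown with xx placeholders) are hypothetical; one such zone would be created for each back-end port in use, per fabric, and Cisco MDS fabrics would use the equivalent device-alias and zone commands.

        alicreate "VPLEX_D1A_BE_FC00", "50:00:14:42:xx:xx:xx:00"
        alicreate "VMAX_FA_GRP_A", "50:00:09:7x:xx:xx:xx:01; 50:00:09:7x:xx:xx:xx:02; 50:00:09:7x:xx:xx:xx:03; 50:00:09:7x:xx:xx:xx:04"
        zonecreate "Z_VPLEX_D1A_VMAX_GRPA", "VPLEX_D1A_BE_FC00; VMAX_FA_GRP_A"
        cfgadd "FABRIC_A_CFG", "Z_VPLEX_D1A_VMAX_GRPA"
        cfgenable "FABRIC_A_CFG"

    The cfgadd line assumes an existing zone configuration named FABRIC_A_CFG; use cfgcreate instead when building a new configuration. This keeps each zone at one initiator and no more than 16 targets, as recommended earlier.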


    Figure 8 Active/Active Array Connectivity

    This illustration shows the physical connectivity to a VMAX array. Similar considerations should apply to other active/active arrays. Follow the array best practices for all arrays including third party arrays.

    The devices should be provisioned in such a way as to create digestible chunks and provisioned for access through specific FA ports and VPLEX ports. The devices within this device grouping should restrict access to four specific FA ports for each VPLEX Director ITL group.

    The VPLEX initiators (backend ports) on a single director should spread across engines to increase HA and redundancy. The array should be configured into initiator groups such that each VPLEX director acts as a single host per four paths.

    This could mean four physical paths or four logical paths per VPLEX director depending on port availability and whether or not VPLEX is attached to dual fabrics or multiple fabrics in excess of two.

    For the example above, the following basic limits apply on the VMAX:

    Initiator Groups (HBAs); max of 32 WWN's per IG; max of 8192 IG's on a VMax; set port flags on the IG; an individual WWN can only belong to 1 IG. Cascaded Initiator Groups have other IG's (rather than WWN's) as members.

    Port Groups (FA ports): max of 512 PG's; ACLX flag must be enabled on the port; ports may belong to more than 1 PG

    Storage Groups (LUNs/SymDevs); max of 4096 SymDevs per SG; a SymDev may belong to more than 1 SG; max of 8192 SG's on a VMax

    Masking View = Initiator Group + Port Group + Storage Group

    We have divided the backend ports of the VPLEX into two groups allowing us to create four masking views on the VMAX. Ports FC00 and FC01 for both directors are zoned to two FAs each on the array. The WWNs of these ports are the members of the first Initiator Group and will be part of Masking View 1. The Initiator Group created with this group of WWNs will become the member of a second Initiator Group which will in turn become a member of a second Masking View. This is called Cascading Initiator Groups. This was repeated for ports FC02 and FC03 placing them in Masking Views 3 and 4. This is only one example of attaching to the VMAX and other possibilities are allowed as long as the rules are followed.

    VPLEX virtual volumes should be added to storage views containing initiators from an A director and initiators from a B director. This translates to a single host with two initiators connected to dual fabrics and having four paths into two VPLEX directors. VPLEX would access the backend array's storage volumes via eight FAs on the array through two VPLEX directors (an A director and a B director). The VPLEX A director and B director each see four different FAs across at least two VMAX engines, if available.

    This is an optimal configuration that spreads a single hosts I/O over the maximum number of array ports. Additional hosts will attach to different pairs of VPLEX directors in a dual-engine or quad-engine VPLEX cluster. This will help spread the overall environment I/O workload over more switches, VPLEX and array resources.


    This would allow for the greatest possible balancing of all resources resulting in the best possible environment performance.
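    A minimal sketch of building such a storage view from the VPLEX CLI follows. The view, initiator, and volume names and the FE port identifiers are hypothetical, and the exact option spellings of the export storage-view commands should be verified against the CLI guide for your GeoSynchrony release; the intent is simply to show one A-director and one B-director front-end port being grouped with a host's registered initiators.

        VPlexcli:/> export storage-view create --cluster cluster-1 --name host01_view --ports P000000003CA00147-A0-FC00,P000000003CB00147-B0-FC00
        VPlexcli:/> export storage-view addinitiatorport --view host01_view --initiator-ports host01_hba0,host01_hba1
        VPlexcli:/> export storage-view addvirtualvolume --view host01_view --virtual-volumes dd_host01_vol_001

    In practice the view would include FE ports from both fabrics across an A and a B director (and across engines in dual- or quad-engine clusters) to satisfy the four-path NDU requirement.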

    Figure 9 Show ITLs per Storage Volume

    This illustration shows the ITLs per Storage Volume. In this example the VPLEX Cluster is a single engine and is connected to an active/active array with four paths per Storage Volume per Director giving us a total of eight logical paths. The Show ITLs panel displays the ports on the VPLEX Director from which the paths originate and which FA they are connected to.

    The proper output in the Show ITLs panel for an active/passive or ALUA supported array would have double the count as it would also contain the logical paths for the passive or non-preferred SP on the array.


    Active/Passive and ALUA Arrays

    Some arrays have architecture and implementation requirements that necessitate special consideration. When using an active-passive or ALUA supported array, each director needs to have logical (zoning and masking) and physical connectivity to both the active and passive or non-preferred controllers. That way you will not lose access to storage volumes if an active controller should fail. Additionally, arrays like the CLARiiON have limitations on the size of initiator or storage groups. It may be necessary to have multiple groups to accommodate provisioning storage to the VPLEX. Adhere to logical and physical connectivity guidelines discussed earlier.

    Figure 10 VS2 connectivity to Active/Passive and ALUA Arrays

    Points to note: for each CLARiiON, each SP has a connection to each fabric, through which each SP has connections to all VPLEX directors. The above example shows Fabric A with SPa0 & SPb0 (even ports) and Fabric B with SPa3 & SPb3 (odd ports) for dual-fabric redundancy.

    ALUA support allows for connectivity similar to Active/Passive arrays. VPLEX will recognize the non-preferred path and refrain from using it under normal conditions. A director with proper maximum path connectivity will show eight ITLs per director but will only report four active paths.


    When provisioning storage to VPLEX, ensure that mode 4 (ALUA) or mode 1 is set during VPLEX initiator registration, prior to device presentation. Don't try to change it after devices are already presented.

    Proper connectivity for active/passive and ALUA arrays can be handled in a couple of ways. You have the option of configuring to a minimum configuration or maximum configuration which amount to two or four active paths per Director per LUN as well as two or four passive or non-preferred paths per Director per LUN. This is known as an ITL or Initiator/Target/LUN Nexus. A minimum configuration is 4 (two active and two passive/non-preferred) paths (Logical or Physical) per Director for any given LUN and a maximum supported configuration is 8 (eight) paths per Director per LUN for Active/Passive and ALUA.

    Note: VNX2 will support Active/Active connectivity on classic LUNs, i.e., those not bound on a pool. Pool LUNs are not supported in symmetrical Active/Active. VPLEX does not support mixed-mode configurations.

    The next set of diagrams depicts both a four active path per Director per LUN Logical configuration and a four active path per Director per LUN Physical configuration. Both are supported configurations as well as two active paths per Director per LUN configurations.

    The commands used in the VPlexcli to determine the port WWNs and the ITLs used in the following diagram are:

    VPlexcli:/> ll **/hardware/ports

    VPlexcli:/clusters/cluster-/storage-elements/storage-volumes/> ll --full

    As an example:


    Figure 11 Backend port WWN identification

    Running the long listing on the hardware/ports allows you to determine which WWN is associated with which VPLEX backend port.


    Figure 12 ITL association

    From the storage-volumes context you can select a sample volume and cd to that context. Running the ll --full command will show the ITLs.

    In this example we have sixteen entries for this volume. This is a single engine VPLEX cluster connected to a VNX array. Even though this gives us eight paths per director for this volume only four paths go to the array SP that owns the volume. In either mode 1 or mode 4 (ALUA), the paths going to the other SP will not be used for I/O. Only in the case of a trespass will they become active.

    Note: All paths, whether active or not, will perform device discovery during an array rediscover. Over allocating the number of paths beyond the supported limits will have detrimental effects on performance and/or backend LUN provisioning.



    Figure 13 1:1 Physical path configuration


    This drawing was developed from the output from the two commands shown above.

    Figure 14 Logical path configuration


    A slight modification of the previous drawing illustrates the same concept using only two VPLEX backend ports per director. This gives the exact same number of ITLs and meets the same maximum supported limit as spreading across all four ports.

    Significant Bit

    The above two illustrations show the significant bit but there are other bit considerations for identifying all possible ports. The following will help explain the bit positions as they apply to the various modules on a CLARiiON / VNX.

    The CLARiiON CX4 Series supports many more SP ports. As such the original method of specifying the Ports would cause an overlap between SP A high end ports and SP B low end ports.

    That is, SPA9 would have the significant byte pair as 69, which is SPB1.

    The new method is as follows:

    SPA0-7 and SPB0-7 are the same as the old method.

    Port SP A SP B

    00 50:06:01:60:BB:20:02:07 50:06:01:68:BB:20:02:07

    01 50:06:01:61:BB:20:02:07 50:06:01:69:BB:20:02:07

    02 50:06:01:62:BB:20:02:07 50:06:01:6A:BB:20:02:07

    03 50:06:01:63:BB:20:02:07 50:06:01:6B:BB:20:02:07

    04 50:06:01:64:BB:20:02:07 50:06:01:6C:BB:20:02:07

    05 50:06:01:65:BB:20:02:07 50:06:01:6D:BB:20:02:07

    06 50:06:01:66:BB:20:02:07 50:06:01:6E:BB:20:02:07

    07 50:06:01:67:BB:20:02:07 50:06:01:6F:BB:20:02:07


    For the higher port numbers, byte 12 is changed to represent the higher ports as follows:

    Byte 12 value   Ports
    0               0-7
    4               8-15
    8               16-23
    C               24-31

    And the 8th byte cycles back to 0-7 for SP A and 8-F for SP B. So for ports 8-11 on SP A and SP B we have:

    Port SP A SP B

    08 50:06:01:60:BB:24:02:07 50:06:01:68:BB:24:02:07

    09 50:06:01:61:BB:24:02:07 50:06:01:69:BB:24:02:07

    10 50:06:01:62:BB:24:02:07 50:06:01:6A:BB:24:02:07

    11 50:06:01:63:BB:24:02:07 50:06:01:6B:BB:24:02:07

    Additional Array Considerations

    Arrays, such as the Symmetrix, that do in-band management may require a direct path from some hosts to the array. Such a direct path should be solely for the purposes of in-band management. Storage volumes provisioned to the VPLEX should never simultaneously be masked directly from the array to the host; otherwise there is a high probability of data corruption. It may be best to dedicate hosts for in-band management and keep them outside of the VPLEX environment.

    Storage volumes provided by arrays must have a capacity that is a multiple of 4 KB. Any volume that is not a multiple of 4 KB will not show up in the list of available volumes to be claimed. If storage volumes that contain data and are not a multiple of 4 KB need to be presented to VPLEX, those devices must first be migrated to a volume that is a multiple of 4 KB, and that device then presented to VPLEX. The alternative is to use a host-based copy utility to move the data to a new and unused VPLEX device.

    Remember to reference the EMC Simple Support Matrix, Release Notes, and online documentation for specific array configuration requirements. Remember to follow array best practices for configuring devices to VPLEX.


    Host Connectivity

    Front-end/host initiator port connectivity

    Dual fabric designs are considered a best practice

    The front-end I/O modules on each director should have a minimum of two physical connections one to each fabric (required).

    Each host should have at least one path to an A director and one path to a B director on each fabric for a total of four logical paths (required for NDU).

    Multipathing or path failover software is required at the host for access across the dual fabrics

    Each host should have fabric zoning that provides redundant access to each LUN from a minimum of an A and B director from each fabric.

    Four paths are required for NDU

    Observe director CPU utilization and schedule NDU for times when average director CPU utilization is below 50%

    GUI Performance Dashboard in GeoSynchrony 5.1 or newer

    Skipping the NDU pre-checks would be required for host connectivity with less than four paths and is not considered a best practice

    NOTE: An RPQ will be required for single attached hosts and/or environments that do not have redundant dual fabric configurations.

    More information is available in the Export Considerations section.

    Note: Each Initiator / Target connection is called an IT Nexus. Each VPLEX front-end port supports up to 400 IT nexuses and, on VS2, each engine has a total of 8 front-end target ports. Dual and quad-engine clusters provide additional redundancy but do not increase the total number of initiator ports supported on a per-cluster basis. For that matter, all listed limits in the Release Notes apply to all VPLEX Cluster configurations equally regardless whether it is a single, dual or quad engine configuration.
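    As a back-of-envelope sketch of how these limits are consumed, the shell arithmetic below assumes a hypothetical host count and the standard four-path host layout; the actual per-port and per-cluster IT nexus limits must be taken from the Release Notes for your GeoSynchrony version.

        # each host initiator logged in to a VPLEX FE port is one IT nexus
        hosts=200                # hypothetical host count
        paths_per_host=4         # 2 HBAs x (one A-director port + one B-director port)
        fe_ports=8               # FE target ports on a single VS2 engine
        echo $(( hosts * paths_per_host ))              # IT nexuses consumed cluster-wide
        echo $(( hosts * paths_per_host / fe_ports ))   # average IT nexuses per FE port if evenly spread

    With 200 hosts this works out to 800 IT nexuses cluster-wide, or an average of 100 per front-end port when spread evenly across a single engine's eight ports, which is comfortably below the 400-per-port figure quoted above.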


    Additional recommendations

    If more than one engine is available, spread I/O paths across engines as well as directors.

    Note: For cluster upgrades, when going from a single-engine to a dual-engine cluster or from a dual- to a quad-engine cluster, you must rebalance the host connectivity across the newly added engines. Adding additional engines and then not connecting host paths to them is of no benefit. Additionally, until further notice, the NDU pre-check will flag host connectivity as a configuration issue if it has not been rebalanced. A dual-to-quad upgrade will not flag an issue provided there were no issues prior to the upgrade. You may choose to rebalance the workload across the new engines or add additional hosts to the pair of new engines.

    Complete physical connections to the VPLEX before commissioning/setup.

    Use the same FE/BE ports on each director to avoid confusion, that is, B0-FC00 and A0-FC00. Please refer to hardware diagrams for port layout.


    Figure 15 Host connectivity for Single Engine Cluster meeting NDU pre-check requirements

    This illustration shows dual HBAs connected to two fabrics, with each connecting to two VPLEX directors on the same engine in the single-engine cluster. This is the minimum configuration that would meet NDU requirements.

    This configuration increases the chance of a read cache hit, increasing performance.

    Please refer to the Release Notes for the total FE port IT Nexus limit.

    Pros: Meets the NDU requirement for single-engine configurations only

    Cons: A single engine failure could cause a DU (data unavailability) event


    Figure 16 Host connectivity for HA requirements for NDU pre-checks dual or quad engine


    The previous illustration shows host connectivity with dual HBAs connected to four VPLEX directors. This configuration offers increased levels of HA as required by the NDU pre-checks. This configuration could be expanded to additional ports on the VPLEX directors, or to additional directors in the quad-engine configuration. This configuration still only counts as four IT nexuses against the limit identified in the Release Notes for that version of GeoSynchrony.

    Pros:

    Offers HA for hosts running load balancing software

    Good design for hosts using load balancing multipath instead of path failover software

    Director failure only reduces availability by 25%

    Cons:

    Reduces the probability of a read cache hit potentially impacting performance initially until cache chunk is duplicated in all servicing directors (applies to hosts using load balancing software)

    Duplicating cache chunks into too many different directors consumes proportional amounts of overall system cache reducing total cache capacity (applies to hosts using multipath load balancing software)


    Figure 17 Host connectivity for HA quad engine

    The previous illustration shows host connectivity with dual HBAs connected to four VPLEX engines (eight directors). This configuration counts as eight IT nexuses against the total limit as defined in the Release Notes for that version of GeoSynchrony. Hosts using active/passive path failover software should connect a path to all available directors and manually load balance by selecting a different director for the active path on different hosts.

    Most host connectivity for hosts running load balancing software should follow the recommendations for a dual engine cluster. The hosts should be configured across two engines and the hosts should alternate between pairs of engines effectively load balancing the I/O across all engines.

    Pros:

    Offers highest level of Continuous Availability for hosts running load balancing software

    Best Practice design for hosts using path failover software

    Director failure only reduces availability by 12.5%

    Allows for N-1 director failures while always providing access to data as long as 1 director stays online

    Cons:

    Reduces the probability of a read cache hit potentially impacting performance initially until cache chunk is duplicated in all servicing directors (applies to hosts using load balancing software)

    Duplicating cache chunks into too many different directors consumes proportional amounts of overall system cache reducing total cache capacity (applies to hosts using multipath load balancing software)

    Consumes double the IT Nexus count against the system limit identified in the Release Notes as compared to the dual engine configuration

    Most hosts should be attached to four directors at most unless absolutely necessary


    Host Cluster cross-connect

    Figure 18 Host Cluster connected across sites to both VPLEX Clusters

    Cluster cross-connect applies to specific host OS and multipathing configurations as listed in the VPLEX ESSM only.

    Host initiators are zoned to both VPLEX clusters in a Metro.

    Host multipathing software can be configured for active path/passive path with active path going to the local VPLEX cluster. When feasible, configure the multipathing driver to prefer all local cluster paths over remote cluster paths.

    Separate HBA ports should be used for the remote cluster connection, to avoid merging the local and remote fabrics

    Connectivity at both sites follows the same rules as single-host connectivity

    Supported stretch clusters can be Cluster cross-connected (Please refer to VPLEX ESSM)

    Cluster cross-connect is limited to a VPLEX cluster separation of no more than 1ms latency

    Cluster cross-connect requires the use of VPLEX Witness

    VPLEX Witness works with Consistency Groups only

    Cluster cross-connect must be configured when using VPLEX Distributed Devices only

    Cluster cross-connect is supported in a VPLEX Metro synchronous environment only

    At least one backend storage array is required at each site, with redundant connections to the VPLEX cluster at that site. Arrays are not cross-connected to each VPLEX cluster

    All Consistency Groups used in a Cluster cross-connect are required to have the auto-resume attribute set to true

    The unique solution provided by Cluster cross-connect requires that hosts have access to both datacenters. The latency requirements for Cluster cross-connect can be achieved using an extended fabric or fabrics that span both datacenters. The use of backbone fabrics and LSAN zones may introduce additional latency, preventing viable use of Cluster cross-connect. The RTT for Cluster cross-connect must be within 1 ms.

    PowerPath supports auto-standby on PowerPath/VE 5.8; it is documented in the release notes and CLI guide.

    Other PowerPath operating systems that support the auto-standby feature are:

    Windows

    Linux (SUSE and RHEL)

    Solaris

    HP-UX

    The only thing that the customer has to do is enable the autostandby feature:

    #powermt set autostandby=on trigger=prox host=xxx

    PowerPath will take care of setting to autostandby those paths associated with the remote/non-preferred VPLEX cluster.

    PP groups the paths by VPLEX cluster and the one with the lowest minimum path latency is designated as the local/preferred cluster.
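    A hedged way to confirm the result on a host is to list the path states after enabling the feature; the command below is standard PowerPath CLI, and the paths associated with the remote, non-preferred VPLEX cluster should be reported in (auto)standby mode:

        #powermt display dev=all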

    HP-UX MPIO

    The HP-UX MPIO policy least-command-load performs better than simple round-robin.
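    A minimal sketch for applying that policy with HP-UX 11i v3 native multipathing is shown below; the disk device name is a placeholder, and the attribute and value spellings should be verified against the scsimgr man page for your HP-UX release.

        scsimgr set_attr -D /dev/rdisk/disk20 -a load_bal_policy=least_cmd_load
        scsimgr save_attr -D /dev/rdisk/disk20 -a load_bal_policy=least_cmd_load

    The first command changes the running policy for the device; the second persists the setting across reboots.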

    https://support.emc.com/docu44831_PowerPath/VE_for_VMware_vSphere_5.8_Release_Notes_.pdf
    https://support.emc.com/docu44838_PowerPath/VE_5.8_for_VMware_vSphere_Remote_CLI_Guide.pdf

    Cluster cross-connect for VMware ESXi

    Note: We encourage any customer moving to a VPLEX Metro to move to ESX 5.0 Update 1 to benefit from all the HA enhancements in ESX 5.0 as well as the APD/PDL handling enhancements provided in Update 1.

    Applies to vSphere 4.1 and newer and VPLEX Metro Spanned SAN configuration

    HA/DRS cluster is stretched across the sites. This is a single HA/DRS cluster with ESXi hosts at each site

    A single standalone vCenter will manage the HA/DRS cluster

    The vCenter host will be located at the primary datacenter

    The HA/VM/Service Console/vMotion networks should use multiple NIC cards on each ESX host for redundancy

    The latency limitation of 1ms is applicable to both Ethernet Networks as well as the VPLEX FC WAN networks

    The ESXi servers should use internal disks or local SAN disks for booting. The Distributed Device should not be used as a boot disk

    All ESXi hosts initiators must be registered as default type in VPLEX

    VPLEX Witness must be installed at a third location isolating it from failures that could affect VPLEX clusters at either site

    It is recommended to place the VM in the preferred site of the VPLEX distributed volume (that contains the datastore)

    In case of a Storage Volume failure or a BE array failure at one site, VPLEX will continue to operate with the site that is healthy. Furthermore, if a full VPLEX failure or WAN COM failure occurs and the cluster cross-connect is operational, these failures will be transparent to the host.

    Create a common storage view for ESX nodes on site 1 on VPLEX cluster-1

    Create a common storage view for ESX nodes on site 2 on VPLEX cluster-2

    All Distributed Devices common to the same set of VMs should be in one consistency group

    All VMs associated with one consistency group should be collocated at the same site with the bias set on the consistency group to that site

    If using ESX Native Multi-Pathing (NMP), use the fixed policy and make sure the path(s) to the local VPLEX cluster are the primary path(s) and the path(s) to the remote VPLEX cluster are standby only (see the sketch following this list).

    vMSC is supported for both non-uniform and uniform (cross-connect) host access configurations.
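    Where NMP with the fixed policy is used, the policy and preferred path can also be set from the ESXi shell. The sketch below assumes a placeholder device ID (naa.xxx) and a placeholder runtime path name pointing at the local VPLEX cluster; verify the exact esxcli syntax against the vSphere documentation for the ESXi release in use:

    esxcli storage nmp device set --device naa.xxx --psp VMW_PSP_FIXED    (assign the Fixed path selection policy to the device)

    esxcli storage nmp psp fixed deviceconfig set --device naa.xxx --path vmhba1:C0:T0:L0    (mark the local VPLEX path as the preferred path)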

    For additional information please refer to the white paper Using VMware vSphere with EMC VPLEX Best Practices Planning found on PowerLink:

    http://powerlink.emc.com/km/live1/en_US/Offering_Technical/White_Paper/h7118-vmware-virtualization-vplex-wp.pdf?mtcs=ZXZlbnRUeXBlPUttQ2xpY2tDb250ZW50RXZlbnQsZG9jdW1lbnRJZD0wOTAxNDA2NjgwNWQzYzJiLGRvY3VtZW50VHlwZT1wZGYsbmF2ZU5vZGU9MGIwMTQwNjY4MDRkZjBhOV9Hcmlk


    Path loss handling semantics (PDL and APD)

    vSphere can recognize two different types of total path failures to an ESXi 5.0 u1 server. These are known as "All Paths Down" (APD) and "Persistent Device Loss" (PDL). Either of these conditions can be declared by the ESXi server depending on the failure condition.

    Persistent device loss (PDL)

    This is a state that is declared by an ESXi server when a SCSI sense code (2/4/3+5) is sent from the underlying storage array (in this case a VPLEX) to the ESXi host informing the ESXi server that the paths can no longer be used. This condition can be caused if the VPLEX suffers a WAN partition causing the storage volumes at the non-preferred location to suspend. If this does happen then the VPLEX will send the PDL SCSI sense code (2/4/3+5) to the ESXi server from the site that is suspending (i.e. the non-preferred site).

    Figure 19 Persistent device loss process flow

    All paths down (APD)


    This is a state where all the paths to a given volume have gone away (for whatever reason) but no SCSI sense code can be sent by the array (e.g. VPLEX), or alternatively nothing is received by the ESXi server. An example of this would be a dual fabric failure at a given location causing all of the paths to be down. In this case no SCSI sense code will be generated or sent by the underlying storage array, and even if it were, the signal would not be received by the host since there is no connectivity. Another example of an APD condition is if a full VPLEX cluster fails (unlikely as there are no SPOFs). In this case a SCSI sense code cannot be generated since the storage hardware is unavailable, and thus the ESXi server will detect the problem on its own, resulting in an APD condition.

    ESXi versions prior to vSphere 5.0 Update 1 could not distinguish between an APD and a PDL condition, causing VMs to become non-responsive rather than automatically invoking an HA failover (i.e. if the VPLEX suffered a WAN partition and the VMs were running on the non-preferred site). Clearly this behavior is not desirable when using vSphere HA with VPLEX in a stretched cluster configuration. This behavior changed in vSphere 5.0 Update 1: the ESXi server is now able to receive and act on a 2/4/3+5 sense code and declare PDL; however, additional settings are required to ensure the ESXi host acts on this condition.

    The settings that need to be applied to vSphere 5.0 Update 1 deployments (and beyond, including vSphere 5.1) are:

    1. Use the vSphere Client and select the cluster, right-click and select Edit Settings. From the pop-up menu, click to select vSphere HA, then click Advanced Options. Define and save the following option:

    das.maskCleanShutdownEnabled=true

    2. On every ESXi server, create and edit (with vi) /etc/vmware/settings with the content below, then reboot the ESXi server. The following output shows the correct setting applied in the file:

    ~ # cat /etc/vmware/settings
    disk.terminateVMOnPDLDefault=TRUE
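    A minimal sketch of applying step 2 from the ESXi shell (assumes shell or SSH access to the host is enabled; if /etc/vmware/settings already contains other entries, edit the file rather than overwriting it):

    ~ # echo "disk.terminateVMOnPDLDefault=TRUE" > /etc/vmware/settings

    ~ # cat /etc/vmware/settings    (verify the content before rebooting)
    disk.terminateVMOnPDLDefault=TRUE

    ~ # reboot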

    Refer to the ESXi documentation for further details and the whitepaper found here:

    http://www.vmware.com/files/pdf/techpaper/vSPHR-CS-MTRO-STOR-CLSTR-USLET-102-HI-RES.pdf

    Note: vSphere and ESXi 5.1 introduce a new feature called APD timeout.

    This feature is automatically enabled in ESXi 5.1 deployments. While it is not to be confused with PDL states, it does carry an advantage: if both fabrics to the ESXi host fail, or an entire VPLEX cluster fails, the host (which would normally hang, a condition also known as a VM zombie state) is now able to respond to non-storage requests because "hostd" will effectively disconnect the unreachable storage. However, this feature does not cause the affected VM to die. Please see this article for further details:

    http://www.vmware.com/files/pdf/techpaper/Whats-New-VMware-vSphere-51-Storage-Technical-Whitepaper.pdf

    It is expected that, since VPLEX uses a non-uniform architecture, this situation should never be encountered on a VPLEX Metro cluster.

    As discussed above, vSphere HA does not automatically recognize that a SCSI PDL (Persistent device loss) state is a state that should cause a VM to invoke a HA failover. Clearly, this may not be desirable when using vSphere HA with VPLEX in a stretched cluster configuration. Therefore, it is important to configure vSphere so that if the VPLEX WAN is partitioned and a VM happens to be running at the non-preferred site (i.e., the storage device is put into a PDL state), the VM recognizes this condition and invokes the steps required to perform a HA failover.

    ESX and vSphere versions prior to version 5.0 update 1 have no ability to act on a SCSI PDL status and will therefore typically hang (i.e., continue to be alive but in an unresponsive state). However, vSphere 5.0 update 1 and later do have the ability to act on the SCSI PDL state by powering off the VM, which in turn will invoke a HA failover. To ensure that the VM behaves in this way, additional settings within the vSphere cluster are required.

    At the time of this writing the settings are:

    1. Use the vSphere Client and select the cluster, right-click and select Edit Settings. From the pop-up menu, click to select vSphere HA, then click Advanced Options. Define and save the option:

    das.maskCleanShutdownEnabled=true

    2. On every ESXi server, edit /etc/vmware/settings (with vi) to contain the content below, then reboot the ESXi server.

    The following output shows the correct setting applied in the file:

    ~ # cat /etc/vmware/settings

    disk.terminateVMOnPDLDefault=TRUE

    Refer to the ESX documentation for further details.


    VBLOCK and VPLEX Front End Connectivity Rules

    Note: All rules shown in BOLD cannot be broken; however, rules shown in italics can be adjusted to meet specific customer requirements. If there are no special requirements, simply use the suggested rule.

    1. Physical FE connectivity

    a. Each VPLEX director has 4 front end ports: 0, 1, 2 and 3. In all cases even ports connect to fabric A and odd ports to fabric B.

    i. For a single VBLOCK connecting to a single VPLEX:

    Only ports 0 and 1 will be used on each director; ports 2 and 3 are reserved.

    Connect even VPLEX front end ports to fabric A and odd to fabric B.

    ii. For two VBLOCKs connecting to a single VPLEX:

    Ports 0 and 1 will be used for VBLOCK A.

    Ports 2 and 3 will be used for VBLOCK B.

    Connect even VPLEX front end ports to fabric A and odd to fabric B.

    2. ESX Cluster Balancing across VPLEX Frontend

    All ESX clusters are evenly distributed across the VPLEX front end in the following patterns:

    Single Engine
    Engine 1: Director A - Clusters 1,2,3,4,5,6,7,8 / Director B - Clusters 1,2,3,4,5,6,7,8

    Dual Engine
    Engine 1: Director A - Clusters 1,3,5,7 / Director B - Clusters 2,4,6,8
    Engine 2: Director A - Clusters 2,4,6,8 / Director B - Clusters 1,3,5,7

    Quad Engine
    Engine 1: Director A - Clusters 1,5 / Director B - Clusters 2,6
    Engine 2: Director A - Clusters 3,7 / Director B - Clusters 4,8
    Engine 3: Director A - Clusters 4,8 / Director B - Clusters 3,7
    Engine 4: Director A - Clusters 2,6 / Director B - Clusters 1,5

    3. Host / ESX Cluster rules

    a. Each ESX cluster must connect to a VPLEX A director and a B director.


    b. For dual and quad configs, A and B directors must be picked from different engines (see table above for recommendations).

    c. Minimum directors that an ESX cluster connects to is 2 VPLEX directors.

    d. Maximum directors that an ESX cluster connects to is 2 VPLEX directors.

    e. Any given ESX cluster connecting to a given VPLEX cluster must use the same VPLEX frontend ports for all UCS blades regardless of host / UCS blade count.

    f. Each ESX host should see four paths to the same datastore:

    i. 2 across fabric A

    A VPLEX A director port 0 (or 2 if second VBLOCK)

    A VPLEX B director port 0 (or 2 if second VBLOCK)

    ii. 2 across fabric B

    The same VPLEX A director port 1 (or 3 if second VBLOCK)

    The same VPLEX B director port 1 (or 3 if second VBLOCK)

    4. Pathing policy

    a. For non-cross-connected configurations, the adaptive pathing policy is recommended in all cases. Round robin should be avoided, especially for dual and quad engine systems.

    b. For cross-connected configurations, fixed pathing should be used, with the preferred path set per datastore to a local VPLEX path only, taking care to alternate and balance the preferred paths over the whole VPLEX front end (i.e. so that all datastores are not sending I/O to a single VPLEX director).


    Storage View Considerations

    Storage Views provide the framework for masking Virtual Volumes to hosts. They contain three sets of objects:

    1. Virtual Volumes

    2. VPLEX Frontend Ports

    3. Host Initiators

    The combination of a VPLEX Frontend Port and a Host Initiator is called an IT Nexus (Initiator/Target Nexus). An IT Nexus can only access a single Storage View. If a host requires access to another Storage View with the same set of initiators, then the initiators must be zoned to other VPLEX frontend ports and those IT Nexus combinations would be used for access to the new Storage View. The NDU requirements must be met for each Storage View independently, even if the Host Initiator and frontend connectivity meet the requirements for a different Storage View.

    Virtual Volumes can be placed in multiple Storage Views, which offers additional options for architecting different solutions based on unique customer needs.

    Best practices

    Single Host

    Create a separate storage view for each host and then add the volumes for that host to only that view

    For redundancy, at least two initiators from a host must be added to the storage view
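    As an illustration only, a single-host view could be built with a VPlexcli sequence along the following lines. The view name, port names, initiator names and volume name below are placeholders, and option syntax varies by GeoSynchrony release, so confirm the exact commands against the VPLEX CLI Guide before use:

    export storage-view create --cluster cluster-1 --name host1_view --ports <A-director-FE-port>,<B-director-FE-port>

    export storage-view addinitiatorport --view host1_view --initiator-ports host1_hba0,host1_hba1

    export storage-view addvirtualvolume --view host1_view --virtual-volumes host1_vol_1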

    Clustered Hosts

    Create a separate storage view for each node for private volumes such as boot volume

    Create new pairs of IT Nexus for node HBA Initiators on different VPLEX Frontend Ports

    Create a Storage View for the Cluster that contains the shared Virtual Volumes and add IT Nexus pairs that are different than those for the private volume Storage Views

    As mentioned above, there are additional options for Clustered Hosts. This specific option may consume more IT Nexuses against the system limits, but it allows for a single Storage View for shared volumes. This minimizes the possibility of user error when adding or removing shared volumes.


    Fan In / Fan Out Ratios

    The published system limits are based on what has been tested and qualified to work for that specific code level and GeoSynchrony feature (Local, Metro or Geo). Always refer to the Release Notes for the code level that the solution is being architected to. All cluster limits identified in the Release Notes apply equally to single, dual or quad engine clusters. Upgrading a cluster to increase engine count applies to performance considerations only and does not increase the defined limits.

    Note: ESX Initiators apply to the physical server, not the individual VMs.

    Fan out (Backend connectivity):

    Rule number one, all directors must see all storage equally. This means that all devices provisioned from all arrays must be presented to all directors on the VPLEX cluster equally. Provisioning some storage devices to some directors and other storage devices to other directors on VPLEX is not supported. All VPLEX directors must have the exact same view of all the storage.

    Rule number two, all directors must have access to the same storage devices over both fabrics. Presenting devices over both fabrics provides the redundancy necessary to survive a fabric event that takes an entire fabric down, without causing the host to lose access to data. Additionally, the NDU process tests for redundancy across two backend ports for each director per Storage Volume.

    Rule number three, a device must not be accessed by more than four active paths on a given director. The limit is based on logical count of 4 paths per Storage Volume per Director not physical port count. An ALUA supported array would have eight paths per Storage Volume per Director as only four of those paths would be active at any given time. Four paths are optimal as VPLEX will Round Robin across those four paths on a per director basis.

    Note: It is very important not to exceed the 4 paths per Storage Volume per Director, as exceeding this limit can cause a DU event under certain fabric failures while VPLEX tries to time out the paths. Host platforms will weather such an event if connectivity is within supported limits, but an excessive number of paths per Storage Volume per Director can cause extreme timeouts on the host, leading the host to fail the device and resulting in a DU event.
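    As a worked example of rule number three: in a dual-engine cluster (four directors), a storage volume provisioned at the 4-path limit presents 4 x 4 = 16 logical paths to the VPLEX back end, typically two array ports per director on fabric A and two on fabric B; a quad-engine cluster would present 32. It is the logical path count per director, not the physical port count, that must stay at or below 4.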


    Theoretically, based on rule number three of 4 paths per Storage Volume per Director and the Storage Volume limit of 8,000, each port would have to support 8,000 devices per port on the VS2 hardware, but only a fraction of the I/O going to each device will be on any given port for that director. This is only an example to illustrate the concept.

    The access of these devices for I/O is controlled by the host attach rules, which means that even though you could have up to 8,000 devices per port on the VS2 hardware, as per this example, you will only be accessing a fraction of those devices at any given time. That ratio of total device attach vs. access is directly proportional to the number of engines in the cluster and which directors the hosts are actually attached to.

    Whether you have a single, dual or quad engine configuration, the rules mentioned previously still apply. The different number of engines provides increasing points of access for hosts thereby spreading the workload over more directors, and possibly, more engines.

    The VPLEX limits for IT nexus (Initiator - Target nexus) per port are as follows. You cannot exceed 256 IT nexus per backend port or 400 per frontend port. (Please refer to the Release Notes for the specific limits based on GeoSynchrony level). This means a total of 256 array ports can be zoned to the initiator of the backend port on VPLEX and a total of 400 host initiators can be zoned to any individual frontend port on VPLEX (VPLEX Local or Metro only as an example as limits for Geo are much less).

    Note: The IT nexus limit was increased to 400 with GeoSynchrony 5.1 patch 2 for frontend port connectivity in conjunction with increasing the IT nexus limit to 3,200 for the VPLEX Local and VPLEX Metro solutions. Please refer to Release Notes for specific support limits.
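    To put those numbers in perspective: a quad-engine cluster has 8 directors x 4 frontend ports = 32 frontend ports, so at 400 IT nexuses per port the per-port limits alone would allow far more than the cluster-wide limit of 3,200 IT nexuses referenced above. In practice the cluster-wide IT nexus limit, not the per-port limit, is usually the number to plan host fan-in against; always confirm both against the Release Notes for the GeoSynchrony level in use.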

    Fan in (Host connectivity)

    Rule number one - Host HBAs should be connected to fabric A and fabric B for HA. The dual fabric connectivity rule for storage applies here for the exact same reason.

    Rule number two - Host HBAs should be connected to an A director and a B director on both fabrics as required by NDU pre-checks. This totals a minimum of four paths altogether. For dual or quad engine configurations, the initiators must span engines for Director A and Director B connectivity.

    VPLEX caching algorithms allow "chunks" of cache to transfer between directors via the internal Local COM. In addition to cache transfers within the cluster, cache chunks are also transferred between clusters in a VPLEX Metro or VPLEX Geo configuration over the WAN COM. With NDU requirements in mind, you can potentially optimize performance by selecting directors on the same engine for the director A and B connections, or optimize HA by selecting an A director and a B director on separate engines, if available, so that you will survive a complete engine failure.

    When considering attaching hosts running load balancing software to more than two engines in a large VPLEX configuration, performance and scalability of the VPLEX complex should be considered. This caution is provided for the following reasons:

    Having a host utilize more than the required number of directors can increase cache update traffic among the directors

    Decreases probability of read cache hits

    Considerations for multipath software limits must be observed

    Impact to availability is proportional to path provisioning excess

    o More paths increase device discovery and handshaking

    o High fabric latencies increase the chance that the host will fail

    Based on the reliability and availability characteristics of the VPLEX hardware, attaching a host to two engines provides a continuously available configuration without unnecessarily impacting performance and scalability of the solution.

    Hosts running multipath failover software such as ESX with NMP should connect to every Director in the VPLEX cluster and select a different Director for each node in the ESX cluster for the active path. The exception might be for Host Cross-Connect in a Metro utilizing Quad engine clusters at both sites. This would result in a total of 16 paths from the server to the LUN. ESX allows for 8 paths connectivity to a maximum of 256 LUNs. If 16 paths were configured you would reduce the total number of supported LUNs to 128.

    DMP Tuning Parameters

    dmp_lun_retry_timeout

    Specifies a retry period for handling transient errors. When all paths to a disk fail, there may be certain paths that have a temporary failure and are likely to be restored soon. If I/Os are not retried for a period of time, the I/Os may be failed to the application layer even though some paths are experiencing only a transient failure. The DMP tunable dmp_lun_retry_timeout can be used for more robust handling of such transient errors. If the tunable is set to a non-zero value, I/Os to a disk with all failed paths will be retried until the specified dmp_lun_retry_timeout interval or until the I/O succeeds on one of the paths, whichever happens first.


    The default value of the tunable is 0, which means that the paths are probed only once.

    You may also want to set the following if they are not already the defaults on your hosts:

    Set the vxdmp setting:

    vxdmpadm setattr enclosure emc-vplex0 recoveryoption=throttle iotimeout=30

    vxdmpadm setattr enclosure emc-vplex0 dmp_lun_retry_timeout=60
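    To confirm the settings took effect, the corresponding DMP query commands can be used (a sketch; the enclosure name emc-vplex0 is carried over from the example above, and command availability varies by Storage Foundation release):

    vxdmpadm getattr enclosure emc-vplex0 recoveryoption    (shows the throttle/iotimeout recovery option)

    vxdmpadm gettune dmp_lun_retry_timeout    (shows the current and default values of the retry tunable; on releases where this is an enclosure attribute rather than a global tunable, query it with getattr instead)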

    Port utilization must be the deciding factor in determining which port to attach to. VPLEX provides performance monitoring capabilities which will provide information for this decision process. Please refer to the GeoSynchrony Release Notes for the specific limits for that level of code.

    GeoSynchrony 5.0 and later also supports cross-connect in a Metro environment. This is a unique HA solution utilizing distributed devices in consistency groups controlled by a VPLEX Witness. Please refer to the Cluster Cross-connect section for additional details.

    Host environments that predominantly use path failover software instead of load balancing software should observe system loads based on the active path and balance the active paths across directors so the overall VPLEX director CPU utilization is balanced.

    In conclusion, it is more important to observe port utilization and defined limits than to simply say X number of connections per port.


    Network Best Practices

    This section provides guidance around the network best practices for all components within the VPLEX family of products. For the purpose of clarification, the following parts will be covered:

    Management network

    Management server to management server Virtual Private Network (VPN)

    VPLEX Witness to management servers VPN

    Director to Director communications

    Local cluster communications between directors

    Remote cluster communications between directors

    Note: The network best practices section does not cover host cluster network considerations as VPLEX does not provide solutions directly for host stretch cluster networks. Host stretch clusters may require layer 2 network connectivity, so please refer to host requirements and network vendor products that provide the ability to separate cluster nodes across datacenters. Even though this document does not provide for host cluster network requirements, they are still an integral part of the overall solution.

    For the purpose of defining the terminology used, director to director communication refers to the intra-cluster connectivity (within the cluster) as well as the inter-cluster connectivity (between the clusters). For the purpose of clarification, VPLEX-Local only has intra-cluster communications to be considered. VPLEX-Metro and VPLEX-Geo additionally have inter-cluster communications, which use different carrier media. VPLEX-Metro uses Fibre Channel connectivity and supports switched fabric, DWDM and FCIP protocols. The inter-cluster communications for VPLEX-Metro will be referred to as FC WAN COM, FC WAN or WAN COM. The product CLI displays both as WAN COM but identifies the ports with FC or XG for Fibre Channel or 10 Gig Ethernet.

    Note: VPLEX Metro supports 10 GigE WAN COM connectivity but will continue to follow all current limits of latency and redundancy for synchronous write through mode.

    VPLEX-Geo utilizes Ethernet communications for inter-cluster communications, referred to as WAN COM. VS2 hardware has to be ordered with the appropriate WAN COM hardware: four port Fibre Channel I/O modules for VPLEX-Metro utilizing FC WAN COM, or dual port 10 GigE modules for VPLEX-Geo.

    Note: Reconfiguration of the hardware I/O module of the WAN COM after installation is currently not supported. Please submit an RPQ if necessary.

    VPLEX Local

    The only network requirements for a VPLEX Local are for management access. Access requires configuring the IP Address for eth3 on the management server and connecting the management server to the customer network.

    The eth3 port on the management server is configured for Auto Negotiate and is 1 GigE capable. Even though the primary use is for management, file transfer is very important, and proper speed and duplex connectivity is imperative for transferring collect-diagnostics or performing NDU package transfers. If file transfers appear extremely slow, it would be prudent to check the network switch to make sure that the connected port is not hard-set to 100/full duplex, as no auto-negotiation would happen and the network connectivity would default to 100/half duplex. This situation could cause file transfers to take hours. Please do not assume that this port will not be used for file transfer: it will be used in a Metro or Geo NDU, and possibly for performing the NDU remotely via ESRS.
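    A quick way to confirm the negotiated link settings from the management server shell (ethtool is a standard Linux utility and eth3 is the public management port discussed above; running it may require elevated privileges):

    ethtool eth3 | grep -E "Speed|Duplex"    (expect 1000Mb/s and Full when auto-negotiation is working; 100Mb/s Half suggests a hard-coded switch port)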

    For some very specific installations such as government dark sites, it may be required that the VPLEX cluster not be connected to the network for security reasons. In this situation, it is still required to configure an IP address for eth3, but a non-routable address such as 192.168.x.x can be used in order to proceed with the installation. The IP address can be changed at a later date if necessary, but a specific procedure is required to reconfigure ipsec when changing the IP address. Management of the cluster can be performed via the service port. This only applies to the VPLEX Local solution, as both VPLEX Metro and VPLEX Geo require VPN connectivity between clusters over the management server's eth3 port.

    Note: VPLEX only supports IPv4 for GeoSynchrony releases 5.2 and older. IPv6 is available with GeoSynchrony 5.3 and newer.


    Intra-Cluster Local Communications (Local COM)

    Intra-Cluster Local Communications apply to all variations of VPLEX. This is the director to director communication within the VPLEX cluster. Intra-Cluster Local Communications are completely private to the cluster and will never be connected to the customer SAN. The I/O module ports are referred to as Local COM. The Local COM communications are based on dual redundant Fibre Channel networks that include two SAN switches internal to the cluster for dual-engine and quad-engine configurations. A single-engine cluster simply connects corresponding Local COM ports together directly with Fibre Channel cables.

    Virtual Private Network Connectivity (VPN)

    VPN connectivity applies to VPLEX Metro and VPLEX Geo. The VPN is configured between management servers as well as the optional VPLEX Witness server. There are two distinct VPN networks to consider: management-server to management-server VPN and VPLEX Witness to management-server VPN. These VPNs serve different purposes and therefore have different requirements.

    The VPLEX Witness VPN has a sole purpose of establishing and maintaining communications with both management servers. This is for the purpose of proper failure identification. For additional details on VPLEX Witness please see the VPLEX Witness section.

    Management server to management server VPN is used to establish a secure routable connection between management servers so both clusters can be managed from either management server as a single entity.

    VPN connectivity for management communications

    Requires a routable/pingable connection between the management servers for each cluster.

    The best practice for configuring the VPN is to follow the installation guide and run the automated VPN configuration script.

    Network QoS requires that the link latency does not exceed 1 second (not millisecond) for management server to VPLEX Witness server

    Network QoS must be able to handle file transfers during the NDU procedure

    Static IP addresses must be assigned to the public port (eth3) on each Management Server and to the public port on the Cluster Witness Server. If these IP addresses are in different subnets, the IP Management network must be able to route packets between all such subnets.

    NOTE: The IP Management network must not be able to route to the following reserved VPLEX subnets: 128.221.252.0/24, 128.221.253.0/24, and 128.221.254.0/24.

    The following protocols need to be allowed on the firewall (both in the outbound and inbound filters):

    Internet Key Exchange (IKE): UDP port 500

    NAT Traversal in the IKE (IPsec NAT-T): UDP port 4500

    The following IP protocols as well:

    Encapsulating Security Payload (ESP): IP protocol number 50

    Authentication Header (AH): IP protocol number 51
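    As an illustration only, on a Linux-based firewall the equivalent pass rules might look like the following (iptables syntax, FORWARD chain, and a permit-by-exception policy assumed; the actual firewall platform and rule placement are site specific):

    iptables -A FORWARD -p udp --dport 500 -j ACCEPT     # IKE
    iptables -A FORWARD -p udp --dport 4500 -j ACCEPT    # IPsec NAT-T
    iptables -A FORWARD -p esp -j ACCEPT                 # ESP, IP protocol 50
    iptables -A FORWARD -p ah -j ACCEPT                  # AH, IP protocol 51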


    Public port TCP/22 and Service port TCP/22 (SSH): Log in to the management server OS, copy files to and from the management server using the SCP sub-service, and establish SSH tunnels.

    Public port IP protocol 50 (ESP): IPsec VPN.

    Public port UDP/500 (ISAKMP): IPsec VPN key exchange.

    Public port UDP/4500: IPsec NAT traversal.

    Public port UDP/123 (NTP): Time synchronization service.

    Public port TCP/161 and UDP/161 (SNMP): Get performance statistics.

    Public port TCP/443 and Service port TCP/443 (HTTPS): Web access to the VPLEX and RecoverPoint Management Console graphical user interfaces.

    Localhost TCP/5901 (VNC): Access to the management server's desktop. Not available on the public network; must be accessed through the SSH tunnel.

    Localhost TCP/49500 (Telnet): VPlexcli. Not available on the public network; must be accessed through SSH.

    Public port 389 (LDAP): Lightweight Directory Access Protocol.

    Public port 636 (ldaps): LDAP using TLS/SSL (was sldap).

    Public port 7225 (RecoverPoint): Protocol for communicating with the RecoverPoint functional API.

    Public port TCP/80 (HTTP): RecoverPoint web server for management.

    Table 1 VPLEX and RecoverPoint port usage


    IP Management network must be capable of transferring SSH traffic between Management Servers and Cluster Witness Server.

    The following port must be open on the firewall:

    Secure Shell (SSH) and Secure Copy (SCP): TCP port 22


    VPLEX Metro

    Cluster connectivity

    VPLEX Metro connectivity is defined as the communication between clusters in a VPLEX Metro. The two key components of VPLEX Metro communication are the inter-cluster director links, either FC (switched fabric, DWDM, FCIP) or 10 GigE, and the VPN between management servers. Please refer to the VPN section for details. FC WAN is the Fibre Channel connectivity option and 10 GigE is the Ethernet connectivity option between the directors of each cluster; choose one or the other, but not both.

    FC WAN connectivity for inter-cluster director communication

    Latency must be less than or equal to 5ms RTT

    Each director's FC WAN ports must be able to see at least one FC WAN port on every other remote director (required).

    The directors l