White Paper

Abstract

This white paper details the benefits of deploying Sybase IQ on EMC® Symmetrix VMAX™ using Enterprise Flash Drive (EFD) technology. February 2011

EMC SYMMETRIX VMAX AND SYBASE IQ: DEPLOYING EFDS FOR COST-EFFECTIVENESS AND PERFORMANCE

Copyright © 2011 EMC Corporation. All Rights Reserved. EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice. The information in this publication is provided “as is.” EMC Corporation makes no representations or warranties of any kind with respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose. Use, copying, and distribution of any EMC software described in this publication requires an applicable software license. For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com. All other trademarks used herein are the property of their respective owners. Part Number h8175

Table of Contents

Executive summary

Introduction

Audience

Scope of project

Product introductions

EMC Symmetrix VMAX

EMC CLARiiON

EMC RecoverPoint

Sybase IQ

Proof-of-concept testing

Symmetrix DMX/VMAX disk configurations

CLARiiON CX4 disk configurations

RecoverPoint

Test suite

Results

Conclusion

Executive summary

The EMC® Symmetrix VMAX™ series with Enginuity™ is a high-end storage system built for the virtual data center. Based on the Virtual Matrix Architecture™, Symmetrix® VMAX scales performance and capacity, delivers nondisruptive operations, and greatly simplifies and automates the management and protection of information. Advanced tiering across Enterprise Flash Drives (EFDs), Fibre Channel drives, and SATA drives allows users to ensure that the right data is on the right storage tier at the right cost.

Sybase IQ is a high-performance decision support server designed specifically for data warehousing. Sybase IQ delivers faster results for mission-critical business intelligence, data warehouse, and reporting solutions, and runs on standard hardware and operating systems. Sybase IQ works with diverse data – including unstructured data – and diverse data sources.

This white paper addresses an extensive proof-of-concept (PoC) project that was conducted between EMC Corp. and a partner company. The considerations for deploying Sybase IQ data warehouses on EMC Enterprise Flash Drive (EFD) technology are specifically addressed. The cost benefits and performance gains are described in an effort to foster a better understanding of this joint implementation.

Introduction

One of the biggest challenges facing storage architects and administrators today is not only the cost of deploying a disk configuration specific to an application, but also servicing the business by meeting application service-level agreements (SLAs). This means deploying an application that is cost-effective from a disk perspective, and one that can also meet the required performance metrics to satisfy the user community after the implementation.

EMC Symmetrix DMX™ and Symmetrix VMAX provide a variety of disk drive types and technologies that can be deployed into several different RAID configurations. Knowledge of these drive types and RAID configurations is critical to application deployment where cost and performance are concerned.

Available disk drive types are:

Serial ATA (SATA), which is a lower-cost, higher-capacity disk device

Fibre Channel (FC) disk technology, which provides higher rotational speeds and better performance than SATA disk

Enterprise Flash Drive (EFD), which is the latest disk drive technology, provides the highest service levels, and uses nonvolatile NAND memory similar to solid state disk (SSD) technologies

A Sybase IQ data warehouse is fundamentally and architecturally different from a transactional (OLTP) database system because its primary function is to process read I/O rather than write I/O. With an online transaction processing database, it is most important to allow many users to update the database instantly and accurately, without interfering with one another. By contrast, with a business intelligence data warehouse such as Sybase IQ, fast query response times and ad hoc query processing capabilities for many simultaneous users are most essential. Placement of data on specific disks, based on the I/O profile of the data warehouse, is another performance consideration.

Gaining a general understanding of I/O characteristics for data objects associated with the application will minimize disk contention and improve overall performance. These topics will be addressed throughout this paper.

Audience

This white paper is intended for those interested in the benefits of deploying Sybase IQ on EMC Symmetrix VMAX using EFD technology.

Scope of project

EMC and a partner company engaged in a joint project in which the latest hardware and software technologies and features would be implemented. The project, titled “Sybase IQ Data Warehouse - Technology Refresh Proof of Concept,” was designed to configure and test a suite of EMC products with Sybase IQ, using a structured methodology for testing and validating compatibility, scalability, and performance.

The end goal of this project was for the partner company to qualify and measure the cost benefits and performance gains of migrating and consolidating their current IQ data warehouse and applications onto a new Symmetrix VMAX storage array.

Note: The partner company is fully aware of the text documented herein but wishes to remain anonymous in this publication.

Product introductions

The following sections describe the products and technologies that were an integral part of the Sybase IQ data warehouse technology refresh PoC project.

EMC Symmetrix VMAX

At the heart of Symmetrix VMAX and its Virtual Matrix Architecture is a system that can scale to dozens of petabytes, support thousands of virtual servers, deliver millions of IOPS, and provide 24x7xForever availability.

EMC Enginuity is the brains behind the hardware. It provides the intelligence that controls all components in a Symmetrix VMAX array and coordinates real-time events related to the processing of production data.

The EMC Symmetrix VMAX was used as the source storage system for the PoC project and provided the platform for all baseline, replication, and performance testing.

Figure 1. EMC Symmetrix VMAX and features

EMC CLARiiON

The EMC CLARiiON® CX4 series with UltraFlex™ technology is based on breakthrough architecture and extensive technological innovation, providing a midrange solution that is highly scalable, meeting the price points of most midrange customers. The CX4 is the fourth-generation CX™ series, and continues EMC’s commitment to maximizing customers’ investments in CLARiiON technology by ensuring that existing resources and capital assets are optimally utilized as customers adopt new technology. CLARiiON CX4 systems support the latest generation of disk drive technologies like Flash drives, 4 Gb/s FC drives for high performance, and SATA II for high capacity. The CLARiiON CX4 is the first midrange storage system to support all of these types of disk drive technologies.

The CLARiiON CX4 series introduces thin LUN technology that builds on CLARiiON virtual LUN capabilities and seamlessly integrates with CLARiiON management and replication software. With CLARiiON Virtual Provisioning™, users choose between traditional LUNs, metaLUNs, and thin LUNs. The ability to nondisruptively migrate data to different LUN and disk types allows users to deploy the best solution without incurring downtime. Virtual Provisioning enables organizations to reduce costs by increasing utilization without overprovisioning of storage capacity, simplifying storage management, and reducing application downtime.

CLARiiON CX4 with the latest FLARE® release and the EMC Fully Automated Storage Tiering (FAST) suite of software provide maximum performance and tiered storage functional flexibility. The FAST Suite includes FAST Cache, which complements FAST by automatically absorbing unpredicted spikes in application workloads.

Symmetrix VMAX features (text from Figure 1):

• Supports 96 to 2,400 drives
– Up to 240 drives per storage bay
– Up to 10 storage bays
• Connectivity:
– Fibre
– FICON
– iSCSI
– GigE
• Up to 2x more ports than DMX-4
– 128 Fibre Channel host/SAN
– 32 Fibre Channel remote replication
– 64 iSCSI
– 64 FICON host
– 32 GigE remote replication

Figure 2. EMC CLARiiON and features

EMC RecoverPoint

EMC RecoverPoint is an enterprise-scale replication appliance designed to replicate and protect application data on heterogeneous SAN-attached servers and storage systems. RecoverPoint provides superior data protection for mission-critical data when compared to traditional host and storage system snapshots or disk-to-tape backup products.

RecoverPoint provides full support for data replication and disaster recovery, and enables the replication of data over any distance. Its local replication utilizes RecoverPoint continuous data protection (CDP). Remote replication to sites around the world utilizes RecoverPoint continuous remote replication (CRR). For both local and remote replication, RecoverPoint offers concurrent local and remote (CLR) data protection.

EMC RecoverPoint was used to replicate the data between the source VMAX and the target CX4 storage arrays.

Sybase IQ

Sybase IQ provides benefits that support the interactive approach to decision support including:

Intelligent query processing. Sybase IQ uses index-only access plans to process only the data needed to satisfy any type of query.

Ad hoc query performance on a uniprocessor as well as on parallel systems. An ad hoc query is one about which the system has no prior knowledge and for which no explicit tuning is required. Ad hoc queries are distinguished from standard or production reports, where only predefined variables, such as dates, are used to generate predefined reports on a regular basis.

Multiplex capability for managing large query loads in a multiserver configuration.

Flexible schema support.

Efficient query execution without query-specific tuning by the system administrator (under most circumstances).

Fast initial and incremental loading.

Fast aggregations, counts, and comparisons of data.

Parallel processing optimized for multiuser environments.

Figure 3. Sybase IQ Multiplex architecture

Proof-of-concept testing

Using the PoC lab at EMC corporate headquarters and a team of support staff, the following environment was installed, set up, and configured.

Symmetrix VMAX running 5874 microcode

Symmetrix DMX-4 4500 running target 5773 microcode

CLARiiON CX4-480 running FLARE 29

Cisco directors with SSM blades

Brocade DS-5100B switches with 32 ports each

6 RecoverPoint appliances

RecoverPoint CDP volumes (local replica) and CRR volumes (remote replica) on DMX/VMAX and CX4 arrays, respectively

RecoverPoint uses SANTap Cisco fabric splitters for the DMX/VMAX, and a CLARiiON splitter for the CX4

4 Dell 2950 servers

Emulex HBAs

Windows Server 2003 SP2 64-bit

Sybase IQ 15.1

Figure 4 provides a visual representation of the PoC hardware and software environment.

[Figure 4 is a diagram of a production site and a disaster recovery site, 1,500 miles apart. Text recovered from the diagram:]

Arrays: Symmetrix VMAX; CLARiiON CX4-480

Servers: production Sybase IQ servers; Sybase IQ server (used for verification tests); Sybase IQ server (used for disaster recovery tests)

“A” = Production data warehouse: 77 x 2 TB* SATA R6, 10 x 240 GB EFD R5

“B” = Data warehouse CDP replica: 77 x 2 TB* SATA R6

“C” = Data warehouse CRR replica: 77 x 2 TB* SATA R6

“J” = RecoverPoint CDP journal: 8 x 1.76 TB FC R5

“J” = RecoverPoint CRR journal: 4 x 1.76 TB FC R5

Recovery paths: operational recovery from CDP replica; disaster recovery from CDP replica

Figure 4. EMC and customer test setup

The initial suite of testing was completed first on a Symmetrix DMX-4 4500, which was used to gather the initial baseline results. The DMX environment was reconfigured and final testing was performed entirely on VMAX. Both DMX and VMAX bin files were configured as follows.

Symmetrix DMX/VMAX disk configurations

400 x 1 TB SATA drives:

Hyper size 14 RAID 6

298 x 18-way concatenated metas were created from physically contiguous hypers at approximately 1 TB each

All devices were presented across all FA ports and all LUNs masked to the RPAs as well

148 metavolumes form the basis of the production data warehouse (“A Copy” data LUNs)

150 metavolumes were used for the CDP replica (“B Copy” LUNs)

8 x 400 GB EFDs:

Hyper size 7 RAID 5

2 x 18-way metas were created at approximately 1 TB each. These devices were used for the “A Copy” catalog LUNs

All devices were presented across all FA ports and masked to RPAs

44 x 450 GB Fibre Channel drives:

Devices were configured RAID 1 and used for the RecoverPoint journal

All devices were presented across all FA ports and masked to the RP appliances

12 x 1 TB SATA drives were configured as spares

11 x 450 GB FC drives were configured as spares

1 x 400 GB EFD was configured as a spare
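
The drive counts above can be sanity-checked with simple arithmetic. The following Python sketch is illustrative only: it assumes RAID 6 in a 14+2 layout for the SATA pool and RAID 5 in a 7+1 layout for the EFDs (the layouts named elsewhere in this paper), decimal units, and no formatting overhead, so the results are upper bounds and not values reported by the arrays. The function name `usable_gb` is hypothetical.

```python
# Sketch only: upper-bound usable capacity for the drive pools listed above.
# Assumptions (not stated in this form by the paper): decimal units, whole
# RAID groups only, no formatting or metadata overhead.

def usable_gb(drive_count, drive_gb, data_members, parity_members):
    """Return usable GB when drives are carved into whole parity RAID groups."""
    group_size = data_members + parity_members
    full_groups = drive_count // group_size
    return full_groups * data_members * drive_gb

# 400 x 1 TB SATA drives, assumed RAID 6 (14+2)
sata_gb = usable_gb(400, 1000, 14, 2)
# 8 x 400 GB EFDs, assumed RAID 5 (7+1)
efd_gb = usable_gb(8, 400, 7, 1)
# 148 "A Copy" + 150 "B Copy" metavolumes at roughly 1 TB each
meta_gb = (148 + 150) * 1000

print(f"SATA usable upper bound: {sata_gb} GB")
print(f"EFD usable upper bound:  {efd_gb} GB")
print(f"SATA metavolumes:        {meta_gb} GB")
print(f"Metavolumes fit in SATA pool: {meta_gb <= sata_gb}")
```

Under these assumptions, the 298 SATA metavolumes (about 298 TB) fit within the roughly 350 TB bound of the SATA pool, and the two catalog metas fit within the EFD group.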

CLARiiON CX4 disk configurations

Created RAID 6 (14+2) disk groups from 416 x 1 TB SATA drives for the RPAs

Created RAID 5 (4+1) disk groups from 5 x 600 GB FC drives for RP journal use

RecoverPoint

3 RPAs using the Cisco SANTap splitter for the “local” site

3 RPAs using the CLARiiON splitter for the “remote” site

IP network simulating a 700-mile distance between sites and 1 Gb/s of bandwidth
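
To give a feel for what a 700-mile, 1 Gb/s link implies for remote replication, the sketch below estimates one-way propagation delay and the idealized time to move a given amount of data. It is not taken from the PoC. It assumes signals travel through fiber at roughly 200,000 km/s (about two-thirds the speed of light in a vacuum), and it ignores protocol overhead, routing, and any compression RecoverPoint may apply. The function names are hypothetical.

```python
# Sketch only: idealized link estimates for the simulated WAN described above.

MILES_TO_KM = 1.609344
FIBER_KM_PER_S = 200_000  # assumption: roughly 2/3 of c in optical fiber


def one_way_latency_ms(miles):
    """Propagation delay in milliseconds over a fiber path of the given length."""
    return miles * MILES_TO_KM / FIBER_KM_PER_S * 1000


def transfer_hours(terabytes, gbit_per_s):
    """Hours to move the data at full line rate, using decimal TB and Gb."""
    bits = terabytes * 1e12 * 8
    return bits / (gbit_per_s * 1e9) / 3600


print(f"One-way latency over 700 miles: {one_way_latency_ms(700):.2f} ms")
print(f"Time to send 1 TB at 1 Gb/s:    {transfer_hours(1, 1):.2f} hours")
```

Because the transfer time scales linearly with data size, estimates like this are mainly useful for judging how long an initial synchronization of a multiterabyte replica could take at best.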

Test suite

Several criteria and objectives for the project were outlined in order to measure “success” and ensure that project goals were achieved. An extensive PoC test suite was developed to perform several relevant and critical functions including:

Test 1 — Prerequisite setup and configuration of the test/PoC environment

Prove that the environment is properly configured and all components of the system function as expected.

Test 2 — IQ performance testing

Execute three distinct test suites exhibiting IQ performance. Procedures were executed and timings were charted as follows: Single Process Load, Single Process Update, Single Process Query, Multiple Process Load, Multiple Process Update, Multiple Process Query, and Multiple Mixed Workload.

Note: A RecoverPoint bookmark was inserted at the end of each test listed above.

Test 3 — RecoverPoint functionality (CDP image) testing

Mask the RecoverPoint CDP replica LUNs to the server. Enable Image Access mode. Make the LUNs accessible read/write on the server, choosing an appropriate bookmark; validate data; repeat with another bookmark.

Test 4 — RecoverPoint functionality (CRR image) testing

Mask the RecoverPoint CRR replica LUNs to the server. Enable Image Access mode. Make the LUNs accessible read/write on the server, choosing an appropriate bookmark; validate data; repeat with another bookmark.

Test 5 — SnapView™ functionality and performance testing

Mask SnapView target LUNs (“D Copy”) to Server4. Bring up IQ on Server4 and validate data.

Test SnapView cloning of CRR replica source to target LUNs. Note the copy throughput.

Use SnapView to incrementally clone CRR replica LUNs (source to target). Note the copy throughput.

Test 6 — Perform a failback to the “production” site

Use SnapView, and reverse-synchronize “D Copy” to “C Copy” employing Protected Restore. Use RecoverPoint, and synchronize the CRR replica back to the production LUNs. Bring up IQ on Server1 and Server2 and verify data.
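
Test 2 charts timings for seven named procedures. The paper does not show the scripts that were used, so the following Python sketch is only one hedged way such a timing harness could be structured. The procedure names come from the Test 2 description; the no-op callables, `time_procedures`, and `to_csv` are hypothetical placeholders for the real load, update, and query scripts.

```python
# Sketch only: time a list of named procedures and emit CSV for charting.
import csv
import io
import time

PROCEDURE_NAMES = [
    "Single Process Load",
    "Single Process Update",
    "Single Process Query",
    "Multiple Process Load",
    "Multiple Process Update",
    "Multiple Process Query",
    "Multiple Mixed Workload",
]


def time_procedures(procedures, clock=time.perf_counter):
    """Run each (name, callable) pair and return (name, elapsed_seconds) rows."""
    rows = []
    for name, run in procedures:
        start = clock()
        run()
        rows.append((name, clock() - start))
    return rows


def to_csv(rows):
    """Render timing rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["procedure", "elapsed_s"])
    writer.writerows(rows)
    return buf.getvalue()


# Placeholder callables; real scripts would drive Sybase IQ here.
procedures = [(name, lambda: None) for name in PROCEDURE_NAMES]
print(to_csv(time_procedures(procedures)))
```

Keeping the clock injectable makes the harness easy to verify without running a real workload.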

Results

Real customer data was loaded into the Sybase IQ data warehouse. Custom scripts were written to specifically test various procedures. Many scripts were executed to run queries that would generate high I/O activity on different parts of the IQ database, specifically the Main database (herein Main db), the Temporary database (herein Temp db), and the Catalog data stores.

This paper summarizes the end results in a few short paragraphs, although they came after weeks of configuring, reconfiguring, and testing. We configured the environment in three distinct ways, ran the test suite against each, and created spreadsheets to log all of the recorded timings for each set of scripts. Each test suite took about 40 hours to run to completion.

The initial testing was performed at the customer site using their environment and their DMX frames. The scripted test suite, which would later be used for the PoC at EMC, was executed to establish a baseline. The Sybase IQ data warehouse at the customer site was configured entirely on Fibre Channel disk technology.

The first system configuration maintained an environment where the entire data warehouse was configured on SATA disk drives.

The second environment was configured so that the data warehouse resided on SATA disk, with the exception of the Main db data store, which was configured on EFD technology.

Last, we again configured the data warehouse on SATA disk, this time with the exception of the Temp db data store, which was configured on EFD technology (instead of the Main db).

The results tell a great story!

Initial testing

In the initial test results with the Sybase IQ data warehouse sitting entirely on SATA disk drives, the results were almost identical to the customer’s in-house configuration that was configured on FC disk.

Second test suite

In the second set of tests, with the data warehouse residing on SATA except for the Main db on EFD, the gains were negligible: we documented a performance improvement of less than 1 percent.

Final test suite

In the final set of tests with the data warehouse configured on the SATA disk, with the exception of the Temp db configured on EFD, the results were impressive. We documented overall performance gains of around 15 percent. However, when we executed various queries and various data loads, we saw performance gains of 18 percent to 50 percent. The following figure describes the details of the final (and winning) hardware configurations for the PoC project.
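
The paper does not state how the percentage gains were calculated. One common reading is the reduction in elapsed time relative to the baseline, and the Python sketch below illustrates that calculation under that assumption. The timings in it are hypothetical values chosen for illustration and are NOT taken from the PoC spreadsheets.

```python
# Sketch only: percentage gain as reduction in elapsed time versus a baseline.

def percent_gain(baseline_s, candidate_s):
    """Percentage reduction in elapsed time relative to the baseline run."""
    if baseline_s <= 0:
        raise ValueError("baseline must be positive")
    return (baseline_s - candidate_s) / baseline_s * 100


# Hypothetical timings in seconds: (all-SATA run, Temp-db-on-EFD run).
timings = {
    "single_process_load": (1000.0, 820.0),
    "multiple_process_query": (600.0, 300.0),
}

for name, (sata_only_s, temp_on_efd_s) in timings.items():
    gain = percent_gain(sata_only_s, temp_on_efd_s)
    print(f"{name}: {gain:.0f}% less elapsed time")
```

If gains were instead measured as throughput increases, the same timings would yield different percentages, so the definition matters when comparing results across reports.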

[Figure 5 is a slide titled “Best Performance Configuration Using EFD and HDD (Hard Disk Drives) for IQ.” Text recovered from the slide:]

18% to 50% performance benefit using EFD (for temp and catalog) during IQ loading and certain queries

Drive type legend: SATA, Fibre Channel, Flash

Writer node db spaces: Catalog, Temp

Reader nodes db spaces: Catalog, Temp

Catalog and Temp: 8 x 400 GB EFDs, RAID 5 (7+1), 11-way 2 TB metas (shown twice)

Main DBspace: 77 x 1 TB SATA drives, RAID 6 (14+2), 11-way 2 TB concatenated metas

Figure 5. Best practices for the winning configuration

Conclusion

EMC Symmetrix VMAX provides a variety of hardware and software options to meet or exceed the SLAs of even the most demanding data warehouse and/or reporting requirements. EMC’s disk offerings, configuration options, and variety of disk types and sizes all provide cost and power saving benefits. Sybase IQ provides a business analytics engine to service even the most I/O-intensive mission-critical business intelligence applications.

The winning configuration is depicted in Figure 5. The project “Sybase IQ Data Warehouse - Technology Refresh Proof of Concept” demonstrated why EMC recommends that users put the Sybase IQ Temp db on Enterprise Flash Drive technology: noticeable and, for some workloads, significant performance gains can be achieved. A Sybase IQ Temp db is typically small, so placing it on the more expensive EFD does not require a large investment in disk resources.

In fact, when configuring a large multiterabyte data warehouse, cost savings are realized when the Main db is configured on either SATA or Fibre Channel disk. Main db consumes the greatest amount of disk space, and performance differences were negligible when the IQ Main db was configured on either disk type. However, when the Main db is configured on SATA disk, using the highest RAID rank available (for VMAX, that is RAID 6 (14+2)), the cost savings can be tremendous.
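
The cost argument can be made concrete with a small model. The Python sketch below compares the relative storage cost of two placements: Main db on SATA with Temp db and Catalog on EFD, versus everything on Fibre Channel. Every number in it is a hypothetical placeholder: the dbspace sizes and the relative price per usable TB for each tier are assumptions for illustration, not EMC pricing and not figures from the PoC.

```python
# Sketch only: relative storage cost of two dbspace placements.
from dataclasses import dataclass


@dataclass
class Dbspace:
    name: str
    size_tb: float
    tier: str


def total_cost(dbspaces, price_per_tb):
    """Sum size times tier price across all dbspaces (relative cost units)."""
    return sum(d.size_tb * price_per_tb[d.tier] for d in dbspaces)


# Hypothetical relative prices per usable TB (SATA = 1.0). Not real pricing.
PRICE_PER_TB = {"SATA": 1.0, "FC": 3.0, "EFD": 20.0}

# Hypothetical sizes: a large Main db with a small Temp db and Catalog.
tiered = [
    Dbspace("main", 100.0, "SATA"),
    Dbspace("temp", 2.0, "EFD"),
    Dbspace("catalog", 0.5, "EFD"),
]
all_fc = [Dbspace(d.name, d.size_tb, "FC") for d in tiered]

print(f"Tiered (SATA + EFD): {total_cost(tiered, PRICE_PER_TB):.1f}")
print(f"All Fibre Channel:   {total_cost(all_fc, PRICE_PER_TB):.1f}")
```

The point of the model is structural: because the Main db dominates capacity, its tier dominates cost, while the small Temp db can sit on the most expensive tier without changing the total much. Real comparisons would need actual quotes and dbspace sizes.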