
Produced by: D. Jonscher Slide 1

Efficient Storage Management and Disaster Recovery in Multi-TB Data Warehouses

Dr. Dirk Jonscher, Mainz, 29 March 2011

[email protected]

Produced by: D. Jonscher Slide 2

Agenda

1. Introduction Credit Suisse

2. DWH @ Credit Suisse

3. DWH DB and Storage Requirements

4. Platform Architecture

5. Oracle Configuration

6. File System Configuration for Oracle Servers

7. Disaster Recovery

8. Summary

Produced by: D. Jonscher Slide 3

1. Credit Suisse History in Switzerland

1856

155 years ago, Alfred Escher founded Schweizerische Kreditanstalt (SKA) – which later became Credit Suisse – to push forward the expansion of the railway network and the industrialization of Switzerland.

No Swiss statesman had such a profound impact on the country as Alfred Escher; through his innovations, he laid the foundations for a modern Switzerland.

[Images: Alfred Escher (1819-1882) · Paradeplatz, Zurich · The monument of Alfred Escher · ETH Zurich, founded in 1854 · Zurich main station]

Produced by: D. Jonscher Slide 4

1. Credit Suisse History – First Boston Corporation

1786

Over 200 years ago, the founder of the Massachusetts Bank – forerunner of First Boston – financed the first US ship to China. The First Boston Corporation was created as the investment banking arm of the First National Bank of Boston in 1932 and went public in 1934.

[Image: State Street, Boston, 1801. On the right stood the American Coffee House, home of the Massachusetts Bank, 1792-1809.]

Produced by: D. Jonscher Slide 5

1. Credit Suisse AG Today – Key Facts

Global bank headquartered in Zurich, serving clients in private banking, investment banking and asset management.

Registered shares of Credit Suisse Group AG (CSGN) are listed in Switzerland (SWX) and as American Depositary Shares (CS) in New York (NYSE).

Total number of employees: about 50'000

Produced by: D. Jonscher Slide 6

2. Introduction: CS Application Platforms

[Diagram: CS application platforms – Mainframe TP/Batch (XBS, WS, ...) on zOS, JAP (Direct Net, Front Net, ...) on RTP Solaris, Data Warehouse (Basel II, AML, ...) on RTP Solaris, DAP (Healthwise, CAPS, ...) on RTP Windows, plus further candidates (e.g. ERP, ECM, Grid, C/C++). All platforms are connected via the Integration Infrastructure (Managed Interfaces for Services, Events, Bulk-Data, Workflow), used intra-AP and cross-AP.]

Produced by: D. Jonscher Slide 7

2. Introduction: Scope of Application Platforms

Services provided by an Application Platform
– Platform Product Mgmt & Governance: drives product development and release & life-cycle management, adhering to a well-defined governance model
– Platform Operations: operates applications cost-efficiently with standardized processes according to OLAs
– Application Development Support: guides projects through the entire development process and shields them from low-level infrastructure issues

Infrastructure needed to provide these services
– Technical Components: providers supply high-quality and well-managed technical components that are tested and integrated into readily deployable packages
– Hosting Infrastructure: applications are hosted on shared hardware resources (servers, storage systems, backup, etc.)
– Architecture, Guidelines & Documentation: a defined, standardized architecture based on open standards for various needs, plus the information needed to implement applications on the platform

[Diagram: four platform building blocks – Managed, high-quality Technical Components · Automated, integrated Tool-chain · Hosting on Shared HW Resources · Architecture, Guidelines & Documentation]

Produced by: D. Jonscher Slide 8

2. Introduction: Characterization DWH AP

Data Warehouse AP
– cost-effective platform for integrating data from multiple internal and external sources and for developing, deploying and operating applications that implement reporting, analysis and data mining functions
– result of the re-architecture program (RAP) DWH (1998 - 2001)

Scope
– reporting and analysis applications
– data from the last end-of-day processing (no operational/transactional reporting), historized data
– no direct initiation of business transactions

Functions
– standard and ad-hoc reporting
– On-Line Analytical Processing (OLAP)
– data mining only in special areas (Customer Relationship Management, CRM, and anti-money laundering)

Produced by: D. Jonscher Slide 9

2. Introduction: DWH Reference Architecture

[Diagram: DWH reference architecture. Data flow: Data Sources → Landing Zone → Staging Area → Subject Matter Areas → Reusable Measures & Dimensions Areas (integration, historization; together forming the DWH Data Universe) → Data Marts (reusable selection, aggregation, calculation) → Analysis Services (reporting, OLAP, data mining; selection, aggregation, calculation) → Presentation Front End (GUI, web/app servers). Layers: Data Integration, Data Enrichment, Analysis. Metadata Management spans the layers. Legend distinguishes logic with extract/transform/load from logic without ETL, and relational database, multidimensional database and file.]

Produced by: D. Jonscher Slide 10

2. Introduction: Overview DWH-Tools (AR5)

[Diagram: overview of DWH tools (AR5). Data sources (usually host): DB2, IMS and applications, delivered via Connect:Direct. Extraction, transformation, loading: PowerCenter 8.6 into Oracle 11g (DWH DU). Reusable selection, aggregation, calculation: PowerCenter 8.6 (+ PL/SQL) into Oracle 11g (RMDA). Selection, aggregation, calculation: PowerCenter 8.6 into Oracle 11g (DM) or MSAS (OLAP). Reporting, OLAP, data mining: DRP (BO XI R3), Clementine 15, CS applications (JAP). GUI: DRP (BO XI R3), Exceed, Internet Explorer 6.0. Metadata, security management and administration: MDMS, Control-SA 3.3, RAT, Control-M. Operating platforms: UNIX (Sun/Solaris, RTP Solaris) and Windows. Layers: Data Integration, Data Enrichment, Analysis.]

Produced by: D. Jonscher Slide 11

3. DWH DB and Storage Requirements (2010 - 2013)

Performance and Scalability
– good performance for very large DB instances (up to 100 TB), up to 10 M (complex) SQL statements per day
– throughput up to 5 GB/s (DWH Data Universe)
– full backup of a 100 TB instance in less than 12 hours

Storage Management
– automated and standardized storage management covering initial setup, growth, data migration and clean-up; simple and efficient processes to manage growth
– consolidation of spare capacity in one layer
– independent migration/switch-over of DB instances possible
– storage-internal copy of data for UAT/IT refresh possible

Disaster Recovery Solution
– the solution needs to be based on a synchronous data replication approach (DWH ETL jobs do not run in a single transaction, and inconsistencies between the batch control view and the Oracle DBMS must be prevented); it must also be usable for "normal" server outages
– no automatic fail-over needed, but the site switch needs to be highly automated (scripting)

Produced by: D. Jonscher Slide 12

3. KPIs DWH AP

DWH KPIs DB Backends
– 2 M9000 and 24 M5000 servers (+ dozens of smaller servers for the BI platform)
– about 600 TB storage capacity used (1 PB attached, 6 PB (virtually) mapped); about 100 TB in production database instances (full copies in UAT and IT); growth rate: 60% per year (~35 TB per month)
– about 100 applications with 16'000 users, ~2000 feeder files & ~5000 ETL jobs per day

Summary Hardware Technology AR5
– server type (DB servers): M5000 and M9000
– storage subsystems: Hitachi USP-V, 450 GB FC disks, large disk pools (>384), RAID 5 (7/1) & thin provisioning
– storage SAN: Brocade 4/8 Gb/s, 4 ports on M5000 and 16 ports on M9000
– backup SAN: ditto, 8 ports on all servers (shared pool of 16 drives per data center for the DWH AP)
– backup drives: IBM TS1130 with 1 TB cartridges (native write performance: ~160 MB/s)

Typical Server KPIs
– LUNs: up to 2000 per server and 1600 per DB instance (overall about 20'000)
– file systems: between 50 and 150 per server (overall about 2000)

Throughput of Current Platform Release
– data migration: 5-6 TB/h (M9000)
– backup performance: ~1 TB/h per tape drive; test for inc0 (incremental level 0) on the DWH Data Universe: 7.5 TB/h on 8 drives (M9000)
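The backup figures can be sanity-checked with a quick back-of-the-envelope calculation. The short Python sketch below (an illustration, not part of the original slides) derives the number of tape drives needed to meet the 12-hour full-backup requirement from the previous slide, using the per-drive throughput quoted here.

```python
# Back-of-the-envelope check of the backup requirement (illustrative only).
# Inputs are taken from the slides: 100 TB full backup in < 12 h,
# ~1 TB/h effective throughput per TS1130 drive.

import math

db_size_tb = 100          # largest production instance (DWH Data Universe)
backup_window_h = 12      # required backup window
per_drive_tb_per_h = 1.0  # observed effective throughput per tape drive

required_throughput = db_size_tb / backup_window_h          # ~8.3 TB/h
drives_needed = math.ceil(required_throughput / per_drive_tb_per_h)

print(f"required aggregate throughput: {required_throughput:.1f} TB/h")
print(f"tape drives needed at ~{per_drive_tb_per_h} TB/h each: {drives_needed}")
# -> ~8.3 TB/h, i.e. 9 of the 16 shared drives; the measured inc0 test
#    (7.5 TB/h on 8 drives) corresponds to roughly 0.94 TB/h per drive
#    under parallel load, consistent with the ~1 TB/h figure.
```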

Produced by: D. Jonscher Slide 13

4. Physical Architecture (High-Level)

[Diagram: Source Systems deliver feeder files via file transfer into the Landing Zone on the Staging Server (StS), which runs PowerCenter and also hosts SAS & Mantas. PowerCenter load jobs feed the DWH DU Server (DUS) with its Data Universe (StA, SMA, RMDA, MDM); further PowerCenter load jobs feed the Data Marts (xDM##) on the DWH Data Mart Servers (DMS).]

Staging Server (StS): file-based landing zone (feeder files), PowerCenter installation (incl. IDQ), hosting of SAS & Mantas

DWH DU Server (DUS): the DWH Data Universe is stored in one big Oracle DB instance – Staging Area (StA), Subject Matter Area(s) (SMA), Reusable Measures & Dimensions Areas (RMDA)

DWH Data Mart Servers (DMS): multiple data mart servers; multiple data marts share an Oracle DB instance

Produced by: D. Jonscher Slide 14

4.1 Staging Server

Architecture

– usage of Solaris zones to split PowerCenter processing across data centers: DWH DU loads in one data center, data mart loads in the other
– data (feeder files and logs) are mirrored (via Volume Manager); in case of a failure/disaster, the Solaris zone of the failed server is switched to the remote site (manual process)

[Diagram: Data Center 1 hosts the PROD DDU zone and Data Center 2 the PROD DM zone; data mirrored via Volume Manager between the Enterprise Storage Systems of both data centers.]

M5000 large:
– 6 dual-port Emulex cards for SAN: 4 x storage, 8 x backup
– Ethernet interfaces: 1 quad-port 1 Gb/s card and 1 dual-port 10 Gb/s card

Produced by: D. Jonscher Slide 15

4.2 DWH DU Server

Architecture

– use a single-domain M9000-32 on the production site
– split the UAT server (also an M9000-32) into 2 dynamic domains: domain 1 is the stand-by environment (full I/O connectivity already configured), domain 2 is the UAT environment
– in case of a production server failure, the stand-by domain on the PT/A server is used as production; boards must be reconfigured from the PT/A domain into the stand-by domain

[Diagram: PROD DUS in Data Center 1; PT/A DUS in Data Center 2, split into a stand-by domain (do1) and a PT/A domain (do2); data mirrored via Volume Manager between the Enterprise Storage Systems of both data centers.]

M9000-32:
– 12 dual-port Emulex cards for SAN: 16 x storage, 8 x backup
– Ethernet interfaces: 1 quad-port 1 Gb/s card and 1 dual-port 10 Gb/s card

Produced by: D. Jonscher Slide 16

4.3 Data Mart Servers

Architecture

– use single-domain M5000 servers, distributed over both data centers
– 2 production servers in different data centers form a so-called "DM group": these "partner" servers host one (or more) production DB instances and the stand-by DB instances of their "partner" in the remote data center (e.g. DMS A: DB instance X & stand-by of Y; DMS B: DB instance Y & stand-by of X), which minimizes idle resources; see the sketch at the end of this slide
– in case of a production server failure, the "partner" server takes over the additional load

[Diagram: DM group consisting of PROD DMS A in Data Center 1 and PROD DMS B in Data Center 2; data mirrored via Volume Manager between the Enterprise Storage Systems of both data centers.]

M5000 large:
– 6 dual-port Emulex cards for SAN: 4 x storage, 8 x backup
– Ethernet interfaces: 2 quad-port 1 Gb/s cards
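To make the pairing rule concrete, here is a minimal Python sketch of the DM-group concept; the server and instance names are placeholders taken from the example above, and the model is only illustrative of how production and stand-by roles are distributed.

```python
# Illustrative sketch (not from the slides): a "DM group" pairs two data mart
# servers so each hosts its own production instances plus the stand-by
# instances of its partner in the other data center.

from dataclasses import dataclass, field

@dataclass
class DataMartServer:
    name: str
    data_center: int
    production: list[str] = field(default_factory=list)  # active DB instances
    standby: list[str] = field(default_factory=list)      # partner's instances

def form_dm_group(dms_a: DataMartServer, dms_b: DataMartServer) -> None:
    """Each partner keeps a stand-by copy of the other's production instances."""
    dms_a.standby = list(dms_b.production)
    dms_b.standby = list(dms_a.production)

def take_over(survivor: DataMartServer) -> None:
    """On partner failure, the surviving server activates the stand-by instances."""
    survivor.production += survivor.standby
    survivor.standby = []

# Example matching the slide: DMS A runs X, DMS B runs Y.
dms_a = DataMartServer("DMS A", data_center=1, production=["X"])
dms_b = DataMartServer("DMS B", data_center=2, production=["Y"])
form_dm_group(dms_a, dms_b)
take_over(dms_a)   # DC2 outage: DMS A now runs X and Y
print(dms_a)
```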

Produced by: D. Jonscher Slide 17

5. Oracle Configuration

General Setup
– single instance, big SGAs (20-64 GB)
– based on Veritas Storage Foundation: volume manager (VxVM), data files (VxFS incl. Oracle Disk Manager), multi-pathing (DMP)
– TEMP tablespaces on raw devices
– REDO tablespaces on file system
– each DB instance has a dedicated set of LUNs; required for the storage-internal copy of data for UAT and IT refresh
– each instance has its own file systems, so DB instances can easily be relocated (independently of each other)

Data Replication
– host-based mirroring of data via VxVM across both CH data centers; dirty region log & fast mirror resynchronization
– site tagging ensures that mirroring is indeed across both data centers and allows a consistent split (in case of a rolling disaster)
– active reading on both sides permanently tests that the data copy is consistent and usable
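To illustrate the invariant that site tagging enforces, here is a small, hypothetical Python check; the inventory format and all names are invented for illustration (the real setup relies on VxVM's built-in site consistency): every mirrored volume must have at least one plex in each data center before it counts as DR-safe.

```python
# Hypothetical illustration of the site-tagging invariant: every volume of a
# DB instance must be mirrored across both data centers.
# The inventory dict below is invented example data, not real VxVM output.

REQUIRED_SITES = {"dc1", "dc2"}

# volume -> list of (plex, site_tag); in reality this comes from VxVM metadata
inventory = {
    "dwh_du_data01": [("plex-01", "dc1"), ("plex-02", "dc2")],
    "dwh_du_redo01": [("plex-01", "dc1"), ("plex-02", "dc2")],
    "dwh_du_temp01": [("plex-01", "dc1"), ("plex-02", "dc1")],  # violation
}

def check_site_consistency(volumes: dict) -> dict:
    """Return the volumes whose plexes do not span both data centers."""
    violations = {}
    for volume, plexes in volumes.items():
        sites = {site for _, site in plexes}
        if not REQUIRED_SITES <= sites:
            violations[volume] = REQUIRED_SITES - sites
    return violations

for volume, missing in check_site_consistency(inventory).items():
    print(f"{volume}: no mirror in {', '.join(sorted(missing))}")
```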

Produced by: D. Jonscher Slide 18

6. File System Configuration for Oracle Servers

Veritas File System
– up to 50 file systems per DB instance; DWH DU: 40 for r/w tablespaces and 10 for r/o tablespaces
– each file system is striped over 16 LUNs with big striping units (42 MB = thin-provisioning unit of HDS)
– prefetched I/O: multiblock read/write for sequential I/O (feeder files, PowerCenter cache, etc.), 5 x 1 MB units
– Oracle file size up to 64 GB (no "big files"), allowing flexible management of file systems
– autoextend unit on Oracle files is 126 MB (3 x 42 MB); see the sketch at the end of this slide

Server-Side Configuration
– between 250 and 500 TB mapped to each DB server (will cover the maximum growth until the next AP release)
– simple storage capacity management (autoextend); only the thin-provisioning tool needs to be monitored (+ additional storage subsystems if needed)

Application Setup
– initial configuration based on the expected maximum growth for one year, which determines the initial number of data files per tablespace
– in the past: initial configuration of tablespaces (fixed size) and subsequent extension; reclaiming unused capacity in tablespaces was rather labor-intensive
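The relationship between the HDS thin-provisioning unit, the stripe layout and the Oracle autoextend size can be made concrete with a small calculation. The Python sketch below is illustrative only; it shows how a 126 MB autoextend step maps onto whole 42 MB thin-provisioning pages and roughly how a maximum-size 64 GB data file spreads over the 16-column stripe.

```python
# Illustrative arithmetic for the storage layout described above
# (42 MB HDS thin-provisioning pages, 16-LUN stripes, 126 MB autoextend,
#  64 GB maximum Oracle file size).

TP_UNIT_MB = 42                   # HDS thin-provisioning unit = stripe unit
AUTOEXTEND_MB = 3 * TP_UNIT_MB    # 126 MB: always whole thin-provisioning pages
STRIPE_COLUMNS = 16               # each file system is striped over 16 LUNs
MAX_FILE_GB = 64                  # Oracle data files capped at 64 GB (no bigfiles)

# Each autoextend step allocates exactly 3 thin-provisioning pages, so no
# partially used pages are left behind in the pool.
pages_per_extend = AUTOEXTEND_MB // TP_UNIT_MB
assert AUTOEXTEND_MB % TP_UNIT_MB == 0

# Growing a data file towards its 64 GB maximum (approximate, in 126 MB steps):
extends_per_file = (MAX_FILE_GB * 1024) // AUTOEXTEND_MB   # ~520 steps
pages_per_file = extends_per_file * pages_per_extend       # ~1560 pages
full_stripes = pages_per_file // STRIPE_COLUMNS            # ~97 full stripes

print(f"autoextend step = {pages_per_extend} thin-provisioning pages")
print(f"~64 GB file = {extends_per_file} autoextend steps "
      f"= {pages_per_file} pages (~{full_stripes} full 16-column stripes)")
```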

Produced by: D. Jonscher Slide 19

7. Disaster Recovery: Switching of Zones

1. Shut-Down (not "needed" in a disaster case or if the server is broken)
– stop zone
– unmount file system(s) (of this zone)
– deport disk groups
→ one call of a BCP administration script (takes about 10 min)

2. Restart
– import disk groups
– mount file system(s)
– restart zone
→ one call of a BCP administration script (takes about 10 min)

Comment
– automated redirect of all connection requests to this zone via Global Site Selector (GSS); ICMP health check (ping interval: 120 s), DNS caching also only 120 s on all DWH AP servers
(a sketch of what such a BCP administration script automates follows the diagram below)

[Diagram: Solaris zones (A-E) distributed across Data Center 1 and Data Center 2, with one zone switched from one site to the other; data mirrored via Volume Manager between the Enterprise Storage Systems of both data centers.]
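As a rough illustration of what such a BCP administration script automates, here is a hedged Python sketch. The zone, mount-point and disk-group names are placeholders and the real script is not shown in the slides; only the underlying Solaris/VxVM commands (zoneadm, umount/mount, vxdg deport/import) are the standard ones for these steps, and a real disaster switch would need more safeguards (forced imports, vfstab handling, etc.).

```python
# Sketch of a BCP-style zone switch: stop the zone, unmount its file systems
# and deport its disk groups on one site, then import/mount/boot on the other.
# Names below (zone, disk groups, mount points) are placeholders.

import subprocess

def run(cmd: list[str]) -> None:
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

def shutdown_site(zone: str, mounts: list[str], disk_groups: list[str]) -> None:
    """Release everything on the primary site (skipped if the server is dead)."""
    run(["zoneadm", "-z", zone, "halt"])
    for mnt in mounts:
        run(["umount", mnt])
    for dg in disk_groups:
        run(["vxdg", "deport", dg])

def restart_site(zone: str, mounts: list[str], disk_groups: list[str]) -> None:
    """Bring the zone up on the surviving site using the mirrored copies."""
    for dg in disk_groups:
        run(["vxdg", "import", dg])
    for mnt in mounts:
        run(["mount", mnt])   # assumes mount options/vfstab are prepared
    run(["zoneadm", "-z", zone, "boot"])

if __name__ == "__main__":
    # Placeholder names; GSS redirects clients once the zone answers again.
    restart_site("ddu-zone", ["/dwh/landing", "/dwh/logs"], ["dg_staging"])
```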

Produced by: D. Jonscher Slide 20

7. Switching a DB Instance to another DB Server

[Diagram: DB Server Group C – works exactly the same way for any other DB server in both data centers! Two DB servers, one in Data Center 1 and one in Data Center 2, each hosting a DB instance; data mirrored via Volume Manager between the Enterprise Storage Systems of both data centers.]

1. Shut-Down (not "needed" in a disaster case or if the server is broken)
– stop DB instance & listener
– unmount file systems (of this instance)
– deport disk groups
→ one call of a BCP administration script (takes about 30 min)

2. Restart
– import disk groups
– mount file systems
– restart DB instance & listener
→ one call of a BCP administration script (takes about 10 min)

Comments
– long-running sessions can delay the shut-down
– typically recovery is needed (long update transactions have an impact on the restart time)
– automated redirect of DB connects via Global Site Selector (GSS; monitoring of the listener port); each DB instance has its own DNS entry (!)
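The GSS redirect boils down to a reachability test of the Oracle listener port behind each instance-specific DNS name. The Python sketch below imitates that health check; the hostnames are placeholders, port 1521 is the assumed default listener port, and the real GSS configuration is not part of the slides.

```python
# Illustrative imitation of the GSS health check: each DB instance has its own
# DNS name, and the listener port is probed to decide which site gets traffic.
# Hostnames and port below are placeholder/assumed values.

import socket

LISTENER_PORT = 1521          # default Oracle listener port (assumed)
CHECK_TIMEOUT_S = 5
DNS_TTL_S = 120               # clients re-resolve after at most 120 s

def listener_alive(host: str, port: int = LISTENER_PORT) -> bool:
    """True if a TCP connection to the listener can be established."""
    try:
        with socket.create_connection((host, port), timeout=CHECK_TIMEOUT_S):
            return True
    except OSError:
        return False

def pick_site(primary: str, standby: str) -> str:
    """Return the host that should currently back the instance's DNS entry."""
    return primary if listener_alive(primary) else standby

# Placeholder names: one DNS entry per DB instance, two candidate servers.
site = pick_site(primary="dms-a.dc1.example.com",
                 standby="dms-b.dc2.example.com")
print(f"route dm-x.dwh.example.com -> {site} "
      f"(clients follow within {DNS_TTL_S} s due to the DNS TTL)")
```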

Produced by: D. Jonscher Slide 21

8. Summary and Outlook

Summary
– excellent performance and throughput; the platform can scale as needed until 2014
– thin provisioning works very well, overall storage utilization is considerably improved, and spare capacity is indeed concentrated in a single layer; careful monitoring of the thin-provisioning pool is required
– sharing of large disk groups across different test levels does not cause any production issues
– tiering-in-the-box is much easier to manage than tiered storage
– snap technologies (e.g. Hitachi's ShadowImage) work very well (after a few hiccups); considerably reduced time to refresh test environments (the background copy process takes more than 24 hours, though)
– host-based mirroring does not cause performance issues (due to the distribution of read accesses across both data centers, the overall throughput is impacted less than expected, by about 5-10%)
– DR procedures work very well (components are now switched between both data centers every 3 months); DR procedures will only work if used on a regular basis

Next Steps
– further improvements of storage management: automatic reclaim of thin-provisioning units (if no longer used), automated restriping when disk groups are extended
– dynamic tiering (next generation of USP-V)
– tests with flash technology (on the server and/or storage side)