
DB2 pureScale + SAS Grid – in a Risk environment

Philip Källander, Patric Hamilton, Rickard Linck

Background – RICI

• Project initiated in 2009
• The system went live in 2011
• The first year's focus was on sourcing data
• Since 2012 the focus has been on calculating and analyzing data
• The system is based on a SAS solution and uses DB2 LUW for data processing and data storage
• Since 2014 the focus has been on performance and on adding more functionality

SAS Risk Management for Banking – Solution flow 2014, before the new architecture

[Diagram: source system (36+ / 90+ files) → ETL (scheduling, sourcing) → ETL bridge (per business date, on SAN storage) → RMB mart → Enrichment / Risk Engine (creates INSTVALS; macros, price methods, etc.) → INSTVALS → ETL → DDS (history) → CERM → marts. Daily volumes into DB2: ~150 million "end of day" transactions (no delta load, appended), ~12 million transactional movements (delta loaded), ~80 million rows daily.]

A non-typical warehouse workload

Each night roughly 3 billion rows are selected/read, while only ~49 million rows are deleted, updated or inserted – about 60 reads for every write. A representative statement sequence from this delta-detection workload:

-- Step 1: for each row in the session (work) table, fetch the most recent
-- matching DDS row and compute MD5 fingerprints of the compared columns
-- on both sides (sassrv.md5 is a site-defined UDF).
Update SESSION.FINANCIAL_ACCOUNT_DAILY_CHNG work
set (md5_dds, md5_work, valid_from_dds, valid_to_dds) =
  ((Select sassrv.md5(trim(hex(coalesce(char(tmp.CURRENCY_CD),'.'))) ||
           trim(hex(coalesce(char(tmp.BALANCE_AMT),'.'))) ||
           trim(hex(coalesce(char(tmp.BALANCE_CNT),'.'))) ||
           trim(hex(coalesce(char(tmp.RETAINED_FLG),'.')))),
    sassrv.md5(trim(hex(coalesce(char(work.CURRENCY_CD),'.'))) ||
           trim(hex(coalesce(char(work.BALANCE_AMT),'.'))) ||
           trim(hex(coalesce(char(work.BALANCE_CNT),'.'))) ||
           trim(hex(coalesce(char(work.RETAINED_FLG),'.')))),
    valid_from_dttm,
    valid_to_dttm
  from sassrv.FINANCIAL_ACCOUNT_DAILY_CHNG tmp
  Where tmp.ACCOUNT_RK = work.ACCOUNT_RK And
        tmp.BALANCE_TYPE_CD = work.BALANCE_TYPE_CD
  order by valid_from_dttm desc fetch first row only))
where 1=1 with ur

-- Step 2: where the fingerprints differ, close the current DDS row
-- (type-2 history: valid_to is set to one second before the new row's
-- valid_from).
Merge into sassrv.FINANCIAL_ACCOUNT_DAILY_CHNG ddsouter
Using (Select ACCOUNT_RK, BALANCE_TYPE_CD, VALID_FROM_DTTM,
              VALID_TO_DTTM, MULTI_FLG
       From session.FINANCIAL_ACCOUNT_DAILY_CHNG
       where md5_work <> md5_dds and
             (MULTI_FLG is null or MULTI_FLG = 0)) tmp
on (ddsouter.ACCOUNT_RK = tmp.ACCOUNT_RK) and
   (ddsouter.BALANCE_TYPE_CD = tmp.BALANCE_TYPE_CD) and
   (tmp.MULTI_FLG is null or tmp.MULTI_FLG = 0) and
   ddsouter.valid_to_dttm = '5999-01-01-00.00.00.000000'
When matched then Update
  set Valid_To_Dttm = tmp.VALID_FROM_DTTM - 1 second,
      Processed_Dttm = '2015-04-28-16.17.00.000000'

-- Step 3: insert the new row versions from the session table.
Insert into sassrv.FINANCIAL_ACCOUNT_DAILY_CHNG
Select work.ACCOUNT_RK, work.VALID_FROM_DTTM, work.VALID_TO_DTTM,
       work.BALANCE_TYPE_CD, work.CURRENCY_CD, work.BALANCE_AMT,
       work.BALANCE_CNT, work.PROCESSED_DTTM, work.RETAINED_FLG,
       work.PARTITION_NR
from session.FINANCIAL_ACCOUNT_DAILY_CHNG work

What happens in the "risk engine"?

Typical calculation (mortgage): X lends 10 000 kr.

[Chart: amount outstanding (10 000 → 5 000 → 0) plotted against time buckets D1, D2, ..., M1, M2, ..., Y1, Y2, ...; the bands show liquidity needed, collateral and unknown payments, grouped into Risk bucket 1, Risk bucket 2, Risk bucket ...]
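To make the bucketing concrete: each projected cash flow falls into a time bucket, and the buckets are summed to get the liquidity needed per horizon. A minimal sketch in DB2 SQL – the table CASHFLOW_PROJECTION and its columns are hypothetical illustrations, not the actual RMB data model:

-- Hypothetical example: sum each instrument's projected cash flows
-- into coarse liquidity risk buckets by time-to-flow.
SELECT instrument_id,
       risk_bucket,
       SUM(flow_amt) AS liquidity_needed
FROM (SELECT instrument_id,
             flow_amt,
             CASE WHEN days_to_flow <= 30  THEN 'Bucket 1 (<= 1 month)'
                  WHEN days_to_flow <= 365 THEN 'Bucket 2 (<= 1 year)'
                  ELSE 'Bucket 3 (> 1 year)'
             END AS risk_bucket
      FROM cashflow_projection) t
GROUP BY instrument_id, risk_bucket
ORDER BY instrument_id, risk_bucket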

SAS Risk Management for Banking – Solution flow 2014

[Diagram: the same solution flow as above – source system → ETL (scheduling, sourcing) → ETL bridge (per business date, on SAN storage) → RMB mart → Enrichment / Risk Engine (creates INSTVALS) → INSTVALS → ETL → DDS (history) → CERM → marts – annotated with timings:]

9-12 h to stage data and 10-11 h to calculate risk, leaving only a ~3 h "buffer".

...in addition we were challenged by:

• Increasing data volumes
• Higher reporting frequency
• More complex calculations
• More stakeholders using the solution

– time was running out...

Outer effect – regulatory requirements:
• Extended/new measures to report
• More reports
• Higher frequency of reporting
• Bigger penalties

Inner effect – increasing business requirements:
• More users
• More complex analysis
• More reports, external and internal
• Higher frequency in the data
• Increased dependency on the system to make decisions
• The analysis takes longer to do

System effect – IT's ability to help is shrinking:
• Expanded information space – horizontal/vertical
• Shorter deadlines
• Smaller window for service
• Bigger impact during service/downtime
• Increased burden for the governance team
• Longer runtimes

First focus: performance
– Make sure we can run the solution within the SLA requirements
– Make sure we have a solution we can grow with
– Has to work with SAS and DB processes

Second focus: availability
– Make sure the batch still runs if a server breaks
– Make sure we can do maintenance as seamlessly as possible
– Has to work with SAS and DB processes

Third focus: simplicity
– We want to keep existing competence
– We don't want to partition data
– We don't want the application to be cluster-aware

Situation analysis – static platform

A server breaks:
– Catastrophe! Standstill for 3 days for the risk department

Failed site:
– Catastrophe! Standstill for 3 days for the risk department

Service:
– Offline
– Overtime, overtime and downtime, plus some overtime

High availability/SLA:
– A project to build the functionality
– Load two sites

Platform:
– Installation on every server

Increase performance:
– Buy a bigger box, add cores within the box, add memory, etc.
– Will cause downtime!!


Learning from the undisputed Gold Standard... Sysplex on IBM System z (1990)

Introducing IBM DB2 pureScale in 2009 – for higher levels of scalability and superior availability:

• Extreme capacity – add capacity as your needs grow
• Application transparency – no need for application re-design or rework when growing/scaling
• Continuous availability – deliver uninterrupted access to your data with consistent performance during normal operations and hardware failure


DB2 vs. DPF (Shared Nothing) vs. pureScale (Shared Data)

[Diagram comparing three deployment models:]
• Standalone DB2 – a single database view; transactions TX 1-TX 3 run against one DB2 with one log and one database.
• DB2 with the Database Partitioning Feature – ideal for warehousing and OLAP scale-out and massively parallel processing (MPP) of queries; a query (SQL 1) is split into SQL 1', SQL 1'', SQL 1''' across partitions 1-3, each DB2 owning its own data partition and log.
• DB2 pureScale data sharing – ideal for active/active OLTP/ERP scale-out; transactions TX 1-TX 3 can run on any member with shared data access to a single database, still presented as a single database view.


DB2 pureScale Architecture

• Multiple DB2 members for a scalable and available database environment
• A client application connects to any DB2 member to execute transactions
  • Automatic workload balancing (per connection/transaction)
  • Client affinity
  • Automatic client reroute
• Shared storage for database data and transaction logs
• Cluster caching facilities (CFs) provide centralized global locking and page-cache management for the highest levels of availability and scalability
  • Duplexed, for no single point of failure
• High-speed, low-latency interconnect for efficient and scalable communication between members and CFs
• DB2 Cluster Services provides integrated failure detection, recovery automation and the clustered file system

[Diagram: clients connect into a DB2 pureScale cluster (instance) of members, each running cluster services (CS) and writing its own logs to shared storage holding the database; a primary and a secondary CF sit on the cluster interconnect. Leveraging IBM's System z Sysplex experience and know-how.]
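Because workload balancing is transparent, the application never picks a member itself; a minimal hedged check of where a connection landed is the CURRENT MEMBER special register (assuming a DB2 release that has it, 10.1 or later):

-- Returns the number of the pureScale member serving this connection.
-- With automatic workload balancing enabled, repeated connections from
-- a client spread across members without any application change.
VALUES CURRENT MEMBER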

pureScale: Scale with Ease

• Scale up or out... without changing your applications
  • Efficient coherency protocols designed to scale without application changes
  • Applications automatically and transparently workload-balanced across members
  • Up to 128 members
• ...without impacting availability
  • Members can be added while the cluster remains online (a sketch of the add-member command follows below)
• ...without administrative complexity
  • No data redistribution required
• If a node crashes, only in-flight data (data being updated on the failed node at the time it failed) remains locked until member recovery completes, in seconds

[Diagrams: a member is added online to a running cluster of members with duplexed CFs, each member keeping its own log; and a chart of % of data available vs. time (~seconds) around a database member failure – only the in-flight updates are locked during recovery, so availability dips briefly from 100% and returns within seconds.]
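"Add member online" boils down to a single instance-update command; a hedged sketch with placeholder host, interconnect netname and instance name (exact options vary by DB2 release):

# Add host "newhost" (cluster-interconnect netname "newhost-ib0")
# as an additional member of the pureScale instance db2sdin1,
# while the rest of the cluster keeps serving work.
db2iupdt -add -m newhost:newhost-ib0 db2sdin1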


DB2 pureScale – scalability example, with no connection affinity, no data partitioning and no transaction routing

[Chart: throughput relative to 1 member vs. number of members:]
• 1.98x @ 2 members
• 3.9x @ 4 members
• 7.6x @ 8 members
• 10.4x @ 12 members

Configuration: OLTP 80/20 read/write workload, no affinity; 12 members on 8-core p550s (64 GB, 5 GHz each); duplexed CFs on 2 additional 8-core p550s (64 GB, 5 GHz each); 20 Gb/s IB HCAs and a 7874-024 IB switch; DS8300 storage with 576 15K disks and two 4 Gb FC switches; 10 System x servers as load drivers.

Requirements

• Platforms:
  • AIX 6.1, 7.1 – on any POWER6, POWER7 or POWER8 compatible rack-mounted server
  • Linux – Red Hat RHEL 5.9, 6.1-6.5 (KVM min. RHEL 6.2) and SUSE SLES 10.4, 11.2 – on any x64 Intel-compatible rack-mounted server (and also blade servers, for TCP/IP)
• Disk:
  • All storage area network (SAN) and directly attached shared block storage, referenced as a logical unit number (LUN)
  • Desirable features:
    • SCSI-3 Persistent Reserve (PR) for fast I/O fencing
    • SCSI-3 PR WRITE EXCLUSIVE REGISTRANTS ONLY reservation type code 0x50 enabled for tie-breaker use
• Network:
  • RDMA (Remote Direct Memory Access) over InfiniBand (IB) – 40 Gb/s
  • RDMA over Converged Ethernet (RoCE) – 10GE
  • TCP/IP protocol over an Ethernet (TCP/IP) network


Relative Performance Between RDMA (10GE RoCE) and TCP/IP Sockets

[Charts: relative number of transactions per second at 1, 2, 3 and 4 members, comparing Sockets (TCP/IP over Ethernet) against RDMA (InfiniBand/RoCE), for a transactional workload with 70% reads / 30% writes and one with 90% reads / 10% writes.]

Configuration: 1-4 members, 2 CFs; Intel x86 servers with 32 logical CPUs per server; a single adapter per server; IBM DS3000 storage.

SAS Grid Manager capabilities – notice any similarities?

• Performance
  • Jobs are divided into subtasks for parallel execution
  • Increased throughput of SAS jobs
• Workload management
  • Provides job, host & user management
  • Prioritizes & schedules jobs using rules-based queues
  • Identifies, allocates and manages resources
• High availability
  • Detects grid failures and recovers automatically
  • Automatic restart of jobs from the last successful checkpoint

SAS Grid Manager Architecture

[Diagram: SAS clients (SAS Data Integration Studio, SAS Enterprise Miner, SAS Enterprise Guide, other SAS clients and the SAS Grid Submit Utility) work against the SAS Metadata Server, with the SAS Management Console providing the Grid Manager and Schedule Manager plug-ins. The Grid Control Server and Grid Nodes 1..n each run Base SAS, SAS CONNECT®, SAS Grid Server, a SAS Data Step Batch Server, a SAS Workspace Server and further SAS servers* on IBM Platform LSF; the Grid Control Server also runs IBM Platform Process Manager and the IBM Platform Grid Management Service. A shared file system holds the SAS installation and configuration, the job deployment directories, and source and target data. Nodes fail over to one another, and IBM Platform RTM for SAS (Report Track Monitor) monitors the grid.]

* SAS Stored Process, Workspace, Pooled Workspace and OLAP Servers with SAS 9.4.

Traditional Risk Calculation vs High Performance Risk

SAS High Performance Risk:
• An integrated framework for modeling, simulating, and evaluating risk
• Massively parallel (in-memory) to provide dynamic aggregation and visualization

[Diagram: traditional partitioned results contrasted with HPR's in-memory aggregation.]

Situation analysis – static platform vs. scalable platform

A server breaks:
– Static: Catastrophe! Standstill for 3 days for the risk department
– Scalable: Lower performance

Failed site:
– Static: Catastrophe! Standstill for 3 days for the risk department
– Scalable: Change site; standstill 1 day

Service:
– Static: Offline; overtime, overtime and downtime plus some overtime
– Scalable: Online; ordinary working hours, no downtime

High availability/SLA:
– Static: A project to build the functionality; load two sites
– Scalable: A part of the platform; add a server

Platform:
– Static: Installation on every server
– Scalable: Installation all at once (except for some pureScale activities)

Increase performance:
– Static: Buy a bigger box; add cores within the box; add memory; etc. Will cause downtime!!
– Scalable: Add a server (buy a license, of course...)

SAS Grid

[Diagram: on the SAS Grid, two additional copies of the pipeline – ETL bridge (per business date) → RMB mart → Enrichment / Risk Engine (creates INSTVALS; macros, price methods, etc.) → INSTVALS → mart – run in parallel as Risk Calculation 2 and Risk Calculation 3.]

SAS Risk Management for Banking

[Diagram: the full solution flow as before – source system → ETL (scheduling, sourcing) → ETL bridge (per business date, on SAN storage) → RMB mart → Enrichment / Risk Engine → INSTVALS → ETL → DDS (history) → CERM → marts – labelled Risk Calculation 1.]

SAS Risk Management for Banking

• ~4 billion rows
• 150 GB compressed / 580 GB uncompressed – 74% compression
• 94 active tables
• Stored in DB2: BLU, pureScale and row-store DB2 9.7
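The 74% figure follows directly from the two sizes; a quick check in plain DB2 SQL:

-- Space saved by compression: 1 - 150/580, expressed as a percentage.
VALUES DECIMAL((1.0 - 150.0 / 580.0) * 100, 4, 1)
-- => 74.1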

Old non-distributed architecture with SAS and IBM segregated

[Diagram (hosts Teg and sthux330):]
• SAS Compute – HP DL380G7, 2 x 2.53 GHz quad-core CPUs, 256 GB RAM
• IBM DB2 9.7 with compression – HP DL580G7, 4 x 2.67 GHz hexa-core CPUs, 512 GB RAM
• SAS Metadata – HP DL360G7, 2 x 2.66 GHz quad-core CPUs, 48 GB RAM
• Servers connected over 1 Gbit Ethernet

RICI target infrastructure, version 0.98

[Diagram: five environments – Development, Systemtest, Production-Test, Production (Active) and a warm Disaster Recovery site:]

• Each pureScale cluster runs on GPFS with a 10 Gbit interconnect: two members (e.g. XSERICIP032/XSERICIP033 in production, XSERICIQ031/XSERICIQ032 in production-test) plus a primary and a secondary CF (e.g. XSERICIP031/XSERICIP030), with SAS Grid and SAS HPR co-located on the member hosts. Members have 2 x octo-core CPUs and 256 GB RAM each, licensed as DB2 AEE in production/DR and DB2 DE in development and test.
• Each environment has a SAS Grid Manager / SAS Metadata Server and a SAS Web Server (2 cores, 24 GB RAM each, e.g. XSERICIP010/XSERICIP011), plus SAS BI servers with DB2 BLU (in production XSERICIP020/XSERICIP021: 2 x octo-core CPUs, 128 GB RAM, DB2 AWGE 3 TB and 2 TB).
• Users come in through load-balanced Citrix servers (2 x AMD Opteron 6380, 196 GB RAM each).
• Disaster recovery: weekly/monthly full DB backups and daily/intra-day incremental deltas for the DDS and mart databases are shipped via FTP/TSM backup mirroring and restored at the warm DR site (DB2 AEE warm members, octo-core CPUs, 256 GB RAM), keeping it "DR ready".

New architecture, distributed and co-located

• Network traffic between the application and DB2 is reduced (runtime down 40% in baseline tests)
• All cores are spinning, on the CFs too
• Workload is controlled by SAS Grid
• pureScale is set to automatic buffer pools (see the sketch below)

pureScale and SAS Grid share resources nicely.
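"Automatic BP" presumably means buffer pools declared with SIZE AUTOMATIC, which hands their sizing to DB2's self-tuning memory manager; a minimal sketch (the buffer pool name BP_RMB is a placeholder for the site's actual pools):

-- Let DB2 grow and shrink this buffer pool as memory pressure from the
-- co-located SAS Grid workload varies.
ALTER BUFFERPOOL BP_RMB SIZE AUTOMATIC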

Number of jobs per night

Date         EXTRACT  CLEAN  STAGE  LOAD  Other
2015-03-21       984    515    910  1008    929
2015-03-22       565    274    383   388    724
2015-03-23       182    120    204   240    788
2015-03-24       629    347    655   718   1071
2015-03-25       622    338    648   699    872
2015-03-26       626    343    652   709   1172
2015-03-27       624    341    648   705   1060
2015-03-28       994    497    818   914   1129
2015-03-29       543    268    370   411    693
2015-03-30       174    113    202   204    640
2015-03-31       629    346    658   730   1204
2015-04-01       635    352    676   736   1574
2015-04-02       632    350    667   740   1264
2015-04-03       628    316    545   596    891
2015-04-04      1007    525    905   984    921
2015-04-05       543    268    370   411    702
2015-04-06       151     78     85    88    480
(blank)            -      -      -     -      -

During the PoC: 350 Extract, 210 Clean, 440 Stage, 490 Load.

40% more jobs are executed every night than 1 year ago.

Old environment -> New environment:

• Sourcing DDS: 12 hours -> <= 2 hours (6X+)
• Calculating risk, 1: 10 hours on 32 cores (8 + 24) -> 5 hours on 64 cores (16+16+16+16) (2X+, on 2X the cores)
• RAM: 756 GB (256 GB DB + 512 GB app) -> 1024 GB (4 x 256 GB) (1.3X)
• Calculating risk, 2: 4 hours -> 1.5 hours (2.5X+)
• Throughput: 4 jobs/min -> 25 jobs/min over 2 members (3X+)
• Calculating risk, 3: 5 hours -> 2.5 hours (2X+)
• First report: 7:30 am -> 5:10 am (2h+ earlier)

Core counts include the CFs.

What's next

• Optimize the risk calculation (only in SAS)
• Test and verify EHL (Explicit Hierarchical Locking) – see the sketch below
• Add FP5 and turn on STMM – see the sketch below
• Collect even more statistics about pureScale resource consumption
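A hedged sketch of the STMM and EHL steps as DB2 CLP commands; the database name RICI is a placeholder, and OPT_DIRECT_WRKLD (the configuration parameter behind explicit hierarchical locking) is assumed to be available at the installed DB2 10.5 fix pack level:

-- Run in the DB2 command line processor, e.g. db2 -tvf tuning.clp
-- Turn on the self-tuning memory manager (STMM):
UPDATE DB CFG FOR RICI USING SELF_TUNING_MEM ON;
-- Enable explicit hierarchical locking, which keeps locking local to a
-- member for tables touched by only one member at a time:
UPDATE DB CFG FOR RICI USING OPT_DIRECT_WRKLD ON;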

Success factors

• System – co-location
• Software – vendor partnership
• People – cooperation

Philip Källander - [email protected]

Patric Hamilton – [email protected]

Rickard Linck - [email protected]

Questions?