db2 purescalesas grid-in a risk environment · · 2016-03-11sas grid-in a risk environment philip...
TRANSCRIPT
Background – RICI
• Project initiated 2009
• The system went live 2011
• First year focus was on sourcing data
• Since 2012 focus has been on calculating and analyzing data
• The system is based on a SAS solution and is using DB2 LUW for data processing
and data-storage
• Since 2014 focus has been on performance and adding more functionality
SAS Risk Management for Banking
No delta load, appended
DB2
~150 million “End of
day” transactions
daily
~12 million
transactional
movements daily
(delta loaded)
~80 million rows daily
DB2
Source system
Storage (SAN)
ETL -
bridge
(per
business
date)RMB mart
EnrichmentRisk Engine
(creates INSTVALS)
(macros, price methods,
etc)
INSTVALSETL
ETL
ETL
scheduling
sourcing
DDS (history)
CERM
MartMart
36+ 90+ files
Solution flow 2014 – before architecture
A non typical warehouse workload
3 Billion rows
Selected
Read
49 million rows
Deleted
Updated
Inserted
~ 60 Reads / 1 Write
Update SESSION.FINANCIAL_ACCOUNT_DAILY_CHNG work
set (md5_dds, md5_work, valid_from_dds, valid_to_dds) =
((Select sassrv.md5(trim(hex(coalesce(char(tmp.CURRENCY_CD),'.'))) ||
trim(hex(coalesce(char(tmp.BALANCE_AMT),'.'))) ||
trim(hex(coalesce(char(tmp.BALANCE_CNT),'.'))) ||
trim(hex(coalesce(char(tmp.RETAINED_FLG),'.')))),
sassrv.md5(trim(hex(coalesce(char(work.CURRENCY_CD),'.'))) ||
trim(hex(coalesce(char(work.BALANCE_AMT),'.'))) ||
trim(hex(coalesce(char(work.BALANCE_CNT),'.'))) ||
trim(hex(coalesce(char(work.RETAINED_FLG),'.')))),
valid_from_dttm,
valid_to_dttm
from sassrv.FINANCIAL_ACCOUNT_DAILY_CHNG tmp
Where tmp.ACCOUNT_RK = work.ACCOUNT_RK And
tmp.BALANCE_TYPE_CD = work.BALANCE_TYPE_CD
order by valid_from_dttm desc fetch first row only ))
where 1=1 with ur
Merge into sassrv.FINANCIAL_ACCOUNT_DAILY_CHNG ddsouter
Using (Select ACCOUNT_RK ,BALANCE_TYPE_CD ,VALID_FROM_DTTM
,VALID_TO_DTTM, MULTI_FLG
From session.FINANCIAL_ACCOUNT_DAILY_CHNG
where md5_work <> md5_dds and
(MULTI_FLG is null or MULTI_FLG = 0)) tmp
on
(ddsouter.ACCOUNT_RK = tmp.ACCOUNT_RK) and
(ddsouter.BALANCE_TYPE_CD = tmp.BALANCE_TYPE_CD) and
(tmp.MULTI_FLG is null or tmp.MULTI_FLG = 0) and
ddsouter.valid_to_dttm = '5999-01-01-00.00.00.000000'
When matched then Update
set Valid_To_Dttm = tmp.VALID_FROM_DTTM - 1 second,
Processed_Dttm = '2015-04-28-16.17.00.000000'
Insert into sassrv.FINANCIAL_ACCOUNT_DAILY_CHNG
Select work.ACCOUNT_RK , work.VALID_FROM_DTTM , work.VALID_TO_DTTM,
work.BALANCE_TYPE_CD , work.CURRENCY_CD , work.BALANCE_AMT ,
work.BALANCE_CNT , work.PROCESSED_DTTM , work.RETAINED_FLG ,
work.PARTITION_NR from session.FINANCIAL_ACCOUNT_DAILY_CHNG work
X lends 10 000 kr
What happens in the “risk engine”?
Typical calculation (mortgage)
10 000
5 000
0 Time
D1 D2 D.. M1 M2 M.. Y1 Y2 Y.. Y.. Y.. Y..
Liquidity
needed
Collateral
Unknown payment
Risk bucket 1 Risk bucket 2 Risk bucket …
SAS Risk Management for Banking
Source system
Storage (SAN)
ETL -
bridge
(per
business
date)RMB mart
EnrichmentRisk Engine
(creates INSTVALS)
(macros, price methods,
etc)
INSTVALSETL
ETL
ETL
scheduling
sourcing
DDS (history)
CERM
MartMart
Solution flow 2014
9-12h to stage data 10-11h to calculate risk
~3h
”buffer”
...in addition we were challenged by: • Increasing data volumes
• Higher reporting frequency
• More complex calculations
• More Stakeholders using the solution
– time was running out...
Regulatory requirements:
Extended/New measures to report
More reports
Higher frequency of reporting
Bigger penalties
Outer effect
System effect
IT’s ability to help is shrinking:
Expanded information space – horizontal/vertical
Shorter deadlines
Smaller window for service
Bigger impact during service/downtime
Increased burden for the governance team
Longer runtimes
Inner effect
Increasing business requirements:
More users
More complex analysis
More reports External and Internal
Frequency in data
Increased dependency to make decisions
Takes longer time to do the analysis
First focus on
performance
Second focus on
availability
Third focus on
simplicity
-Make sure we can run
the solution within the
SLA requirements
-Make sure we have a
solution we can grow with
-Has to work with SAS
and DB processes
-Make sure the batch still
runs if a server breaks
-Make sure we can do
maintenance as
seamless as possible
-Has to work with SAS
and DB processes
-We want to keep existing
competence
-We don’t want to
partition data
-We don’t want the
application to be cluster
aware
Situation analysis
A server breaks
Static platform
- Catastrophe!
- Standstill for 3 days for the risk
department
Failed site
- Catastrophe!
- Standstill for 3 days for the risk
department
Service
- Offline
- Overtime, Overtime and downtime plus
some overtime
High Availability/SLA
- Project to build the functionality
- Load two sites
Platform- Installation on every server
Increase performance
- Buy bigger box
- Add cores within box
- Add memory
- Etc
- Will cause downtime!!
© 2015 IBM Corporation 13
Learning from the undisputed Gold Standard... Sysplex on IBM System z (1990)
Introducing IBM DB2 pureScale in 2009
For higher levels of scalability and superior availability
• Extreme capacity
• Add capacity as your needs grow
• Application transparency
• No need for application re-design or
rework when growing/scaling
• Continuous availability
• Deliver uninterrupted access to your
data with consistent performance during
normal operations and hardware failure
© 2015 IBM Corporation 14
DB2 vs. DPF (Shared Nothing) vs. pureScale (Shared Data)
DB2
TX 1TX 2TX 3
Log
Database
Standalone DB2
Single Database View
Log
DB2
Log
DB2
Log
DB2
Part 1 Part 2 Part 3
SQL 1’ SQL 1’’ SQL 1’’’
SQL 1
DB2 with Database Partitioning FeatureIdeal for warehousing and OLAP scale out
and massively parallel processing (MPP) of queries
Log
DB2 DB2 DB2
Single Database View
DB2 pureScale Data SharingIdeal for active/active OLTP/ERP scale out
TX 1 TX 2 TX 3
Shared Data Access
Log
Database
© 2015 IBM Corporation 15
DB2 pureScale Architecture
• Multiple DB2 members for scalable and available database environment
• Client application connects into any DB2 member to execute transactions• Automatic workload balancing (con/tx)• Client Affinity• Automatic Client Reroute
• Shared storage for database data and transaction logs
• Cluster caching facilities (CF) provide centralized global locking and pagecache management for highest levels of availability and scalability• Duplexed, for no single point of failure
• High speed, low latency interconnect for efficient and scalable communication between members and CFs
• DB2 Cluster Services provides integrated failure detection, recovery automation and the clustered file system
Shared Storage
Database
Logs Logs LogsLogs
Cluster Interconnect
Member
CSMember
CSMember
CSMember
CS
Primary CF
CFCS
Secondary CF
CFCS
Clients
DB2 pureScale Cluster (Instance)
Leveraging IBM’s System z Sysplex Experience and Know-How
pureScale: Scale with Ease
Log LogLogLog
Add
member
online• Scale up or out… without changing
your applications
• Efficient coherency protocols designed
to scale without application changes
• Applications automatically and
transparently workload balanced
across members
• Up to 128 members
• Without impacting availability
• Members can be added while
cluster remains online
• Without administrative complexity
• No data redistribution required
• If a node crashes, only in-flight data (data being updated on the failed node at the
time it failed) remains locked until member recovery completes in seconds
Log
MemberMemberMemberMemberMember
CF CF
% o
f D
ata
Avail
ab
le
Time (~seconds)
Only data in-flight updates
locked during recovery
Database member
failure
100
50
X
© 2015 IBM Corporation 17
DB2 pureScale – Scalability Example with no connection affinity, no data partitioning or transaction routing
0123456789
101112
0 5 10 15
1.98x @ 2 members
3.9x @ 4 members
# Members
Th
rou
gh
pu
t vs 1
mem
ber
7.6x @ 8 members
10.4x @ 12 members
20Gb/s IB HCAs7874-024 IB Switch
OLTP 80/20 R/W workloadNo affinity10 System X servers for load drivers
12 8-core p550 members64 GB, 5 GHz each
Duplexed CFson 2 additional 8-core p550s64 GB, 5 GHz each
DS8300 storage
576 15K disks
Two 4Gb FC Switches
Requirements• Platforms:
• AIX 6.1, 7.1
• On any Power 6, 7, 8 compatible rack mounted server
• Linux - Red Hat - RHEL 5.9, 6.1-6.5 (KVM min RHEL 6.2) and SUSE - SLES 10.4, 11.2
• On any x64 Intel compatible rack mounted server (and also blade server for TCP/IP)
• Disk:
• All storage area network (SAN) and directly attached shared block storage, referenced as a
logical unit number (LUN)
• Desirable features:
• SCSI-3 Persistent Reserve (PR) for fast IO fencing
• SCSI-3 PR WRITE EXCLUSIVE REGISTRANTS ONLY reservation type code 0x50
enabled for tie-breaker use
• Network:
• RDMA (Remote Direct Memory Access) over InfiniBand (IB) – 40GE
• RDMA over Converged Ethernet (RoCE) - 10GE
• TCP/IP protocol over Ethernet (TCP/IP) network
© 2015 IBM Corporation 19
Relative Performance Between RDMA (10GE RoCE) and TCP/IP Sockets
Re
lative
# o
f tr
an
sa
ction
s p
er
se
co
nd
Re
lative
# o
f tr
an
sa
ction
s p
er
se
co
nd
1 member 2 members 3 members 4 members
1 member 2 members 3 members 4 members
Sockets (TCP/IP over Ethernet)
InfiniBand (RDMA)
Transactional workload with
70% reads, 30% writes
Transactional workload with
90% reads, 10% writes
Re
lative
# o
f tr
an
sa
ction
s p
er
se
co
nd
• 1-4 members, 2 CFs
• Intel x86 servers
• 32 logical CPUs per server
• Single adapter per server
• IBM DS3000 storage
SAS Grid Manager capabilities
• Performance
• Jobs are divided into subtasks for
parallel execution
• Increase throughput of SAS Jobs
• Workload Management
• Provides job, host & user
management
• Prioritize & schedule jobs using
rules-based queues
• Identify, allocate and manage resources
• High availability
• Detect grid failure and recovers automatically
• Automatic restart of jobs from last successful checkpoint.
, notice any similarities?
Grid Client
SAS Management ConsoleGrid Manager and Schedule Manager
SAS Program
Grid Control Server
Base SAS
SAS CONNECT®
SAS Grid Server
SAS Data Step
Batch Server
SAS Workspace
Server
… SAS Servers*
IBM Platform Process Mgr
IBM PlatformTM LSF
IBM Platform Grid Management Service
Grid Node 1
Base SAS
SAS CONNECT®
SAS Grid Server
SAS Data Step
Batch Server
SAS Workspace
Server
… SAS Servers*
IBM PlatformTM LSF
Grid Node n
Base SAS
SAS CONNECT®
SAS Grid Server
SAS Data Step
Batch Server
SAS Workspace
Server
… SAS Servers*
IBM PlatformTM LSFSAS Data
Integration StudioSAS Enterprise
Miner
SAS Enterprise Guide
Other SASClients
Shared File System SAS Installation and Configuration
Job Deployment Directories
Source and Target Data
…
IBM PlatformTM LSF
SAS Metadata Server
SAS Grid Submit Utility
* SAS Stored Process-, Workspace,
Pooled Workspace-
and OLAP Server with SAS 9.4.
Failover
Failover
Failover
IBM PlatformTM RTM for SASReport Track Monitor Solution
SAS Grid Manager Architecture
Integrated framework for modeling, simulating, and evaluating risk
Massively parallel (in-memory) to provide dynamic aggregation and visualization
SAS High Performance Risk
Situation analysis
A server breaks
Static platform Scalable platform
- Catastrophe!
- Standstill for 3 days for the risk
department
- Lower performance
Failed site
- Change site
- Standstill 1 day
- Catastrophe!
- Standstill for 3 days for the risk
department
Service
- Offline
- Overtime, Overtime and downtime plus
some overtime
- Online
- Ordinary working hours, no downtime
High Availability/SLA
- Project to build the functionality
- Load two sites
- A part of the platform
- Add a server
Platform- Installation on every server - Installation all at once
(Except for some PureScale activities)
Increase performance
- Buy bigger box
- Add cores within box
- Add memory
- Etc
- Will cause downtime!!
- Add a server
(buy license of course…)
SAS Grid
ETL -
bridge
(per
business
date)RMB mart
EnrichmentRisk Engine
(creates INSTVALS)
(macros, price methods,
etc)
INSTVALS
Mart
ETL -
bridge
(per
business
date)RMB mart
EnrichmentRisk Engine
(creates INSTVALS)
(macros, price methods,
etc)
INSTVALS
Mart
Risk Calculation 2
Risk Calculation 3
SAS Risk Management for Banking
Source system
Storage (SAN)
ETL -
bridge
(per
business
date)RMB mart
EnrichmentRisk Engine
(creates INSTVALS)
(macros, price methods,
etc)
INSTVALSETL
ETL
ETL
scheduling
sourcing
DDS (history)
CERM
MartMart
Risk Calculation 1SAS Risk Management for Banking
SAS Risk Management for Banking
4 Billion rows
150 GB Compressed
580 GB Uncompressed
74% Compression
94 active tables
DB2
BLU
DB2 PureScale
Row-store
DB2 9.7
Teg
sthux330
HP DL380G7
2 x 2,53Ghz
Quad-Core CPU
256GB RAM
HP DL580G7
4 x 2,67Ghz
Hexa-Core CPU
512GB RAM
HP DL360G7
2 x 2,66Ghz
Quad-Core CPU
48GB RAM
1Gbit Ethernet
1Gbit Ethernet
DB2 with compression
SAS Metadata
SAS Compute
IBM DB2 9.7
Old non distributed architecture with SAS and IBM segregated
1 Gbit Interconnect
PureScale
XSERICIQ032
GPFS
10 Gbit Interconnect
GPFS
PureScale
XSERICIQ031
GPFS
SAS Grid
SAS HPR
SAS Grid
SAS HPR
RICI Target infrastructureVersion 0.98
PureScale
CF
Primary
XSERICIQ030
GPFS
PureScale
XSERICIP032
PureScale
CF
Secondary
XSERICIP030
GPFS GPFS
PureScale
XSERICIP033
GPFS
SAS Grid
SAS HPR
SAS Grid
SAS HPR PureScale
CF
Primary
XSERICIP031
GPFS
GPFS
DB RestoreDB Inc Delta
GPFS
DB Inc Delta DB Restore
DR
Ready
FTP/TSM DB Backup Mirroring
DD
S
DD
S
MA
RT
S
FTP/TSM DB Backup Mirroring
DR
Ready
MA
RT
S
DB
Full backup
DB
Full backup
DB
Full backup
Weekly/Monthly
DR processing
DB
Full backup
Weekly/Monthly
DR processing
Daily/Intra day
DR processing
Production-Test Production
Active
Disaster Recovery
WarmSystemtestDevelopment
2*Octo CPU
256 GB Ram
DB2 DE
2*Octo CPU
256 GB Ram
DB2 DE
2*Octo CPU
256 GB Ram
DB2 DE
2*Octo CPU
256 GB Ram
DB2 AEE
2*Octo CPU
256 GB Ram
DB2 AEE
2*Octo CPU
256 GB Ram
DB2 AEE
2*Octo CPU
256 GB Ram
DB2 AEE
PureScale
GPFS
PureScale
GPFS
SAS Grid
SAS HPR
SAS Grid
SAS HPRPureScale
CF
Primary
GPFS
1*Octo CPU
256 GB Ram
DB2 AEE Warm
2*Octo CPU
256 GB Ram
DB2 AEE Warm
2*Octo CPU
256 GB Ram
DB2 AEE Warm
2 Core
24 GB Ram
SAS Web Server
XSERICIF011
2 Core
24 GB Ram
SAS Grid Manager
SAS Metadata-
Server
XSERICIF010
2 Core
24 GB Ram
SAS Web Server
XSERICIQ011
2 Core
24 GB Ram
SAS Grid Manager
SAS Metadata-
Server
XSERICIQ010
2 Core
24 GB Ram
SAS Web Server
XSERICIP011
2 Core
24 GB Ram
SAS Grid Manager
SAS Metadata-
Server
XSERICIP010
1*Quad CPU
64 GB Ram
DB2 DE
DB2 BLU
SAS BI Server
XSERICIF020
1*Quad CPU
64 GB Ram
DB2 DE
DB2 BLU
SAS BI Server
XSERICIS020
2*Octo CPU
128 GB Ram
DB2 DE
DB2 BLU
SAS BI Server
XSERICIQ020
2*Octo CPU
128 GB Ram
DB2 AWGE 3TB
DB2 BLU
SAS BI Server
XSERICIP020
2*Octo CPU
128 GB Ram
DB2 AWGE 2TB
DB2 BLU
SAS BI Server
XSERICIP021
2 x AMD Opteron
6380
196 GB Ram
Citrix
2 x AMD Opteron
6380
196 GB Ram
Citrix Loadbalancing
2 Core
24 GB Ram
SAS Grid Manager
SAS Metadata-
Server
XSERICIS010
2 Core
24 GB Ram
SAS Web Server
XSERICIF011
PureScale
CF
Primary
XSERICIQNNN
GPFS
2*Octo CPU
256 GB Ram
DB2 DE
PureScale
CF
Primary
GPFS
1*Octo CPU
256 GB Ram
DB2 AEE Warm
New architecture, disibuted and co-located
Network traffic is reduced
between application and DB2
(runtime reduction by 40%
on baseline tests)
All cores are spinning on
CF’s
Workload is controlled by
SAS Grid
PureScale is set to
Automatic BP
PureScale and SAS Grid shares resources nicely
2015-03-21
2015-03-22
2015-03-23
2015-03-24
2015-03-25
2015-03-26
2015-03-27
2015-03-28
2015-03-29
2015-03-30
2015-03-31
2015-04-01
2015-04-02
2015-04-03
2015-04-04
2015-04-05
2015-04-06
(blank)
Other 929 724 788 1071 872 1172 1060 1129 693 640 1204 1574 1264 891 921 702 480
LOAD 1008 388 240 718 699 709 705 914 411 204 730 736 740 596 984 411 88
STAGE 910 383 204 655 648 652 648 818 370 202 658 676 667 545 905 370 85
CLEAN 515 274 120 347 338 343 341 497 268 113 346 352 350 316 525 268 78
EXTRACT 984 565 182 629 622 626 624 994 543 174 629 635 632 628 1007 543 151
0
500
1000
1500
2000
2500
3000
3500
4000
4500
5000
Nu
mb
er
of
job
s
Number of jobs/ night
During the PoC; 350 Extract; 210 Clean; 440 Stage; 490 Load
40% more jobs executed every night than 1 year ago
Old environment New environment
Sourcing DDS = 12 hours Sourcing DDS <= 2 hours6X +
Calculating risk, 1 = 10 hours
32 Core (8 + 24)
Calculating risk, 1 = 5 hours
64 Core (16+16+16+16)
2X +
2X
756GB Ram (256GB DB+512GB App) 1024GB Ram (4* 256)1.3X
Calculating risk, 2 = 4 hours Calculating risk, 2 = 1.5 hours2.5X +
4 Jobs / Min 25 Jobs / Min / 2 Members3X +
Calculating risk, 3 = 5 hours Calculating risk, 3 = 2.5 hours2X +
First report at 7:30 am First report at 5:10 am2h +
CFs included
What´s next
• Optimize the risk calculation (only in SAS)
• Test and verify EHL (Explicit Hierarchal Locking)
• Add FP5 and turn on STMM
• Collect even more statistics about PureScale resource consumption
Philip Källander - [email protected]
Patric Hamilton – [email protected]
Rickard Linck - [email protected]
Frågor?