DB2 pureScale implementation experience


Page 1: DB2 pureScale implementation experience

DB2 pureScale implementation experience from a Card Payment service company in AP.

YunCheol Ha, IBM Australia

Session Code: 1078 | Sep 12, 2012 05:00 PM - 06:00 PM | Platform: DB2 LUW

Page 2: DB2 pureScale implementation experience

DB2 pureScale implementation experience from a Card Payment service company in AP

• DB2 pureScale overview
• DB2 pureScale introduction milestones
• DB2 pureScale configuration
  • InfiniBand configuration
  • Storage considerations
• DB2 pureScale installation
  • Database creation
  • Client configuration
• High availability test
• Overall DB2 pureScale migration benefits
• Lessons learned

Page 3: DB2 pureScale implementation experience

DB2 pureScale

• Extreme capacity
  • Buy only what you need, add capacity as your needs grow
• Application transparency
  • Avoid the risk and cost of application changes
• Continuous availability
  • Deliver uninterrupted access to your data with consistent performance

Learning from the undisputed Gold Standard... System z

Page 4: DB2 pureScale implementation experience

DB2 pureScale Architecture

• Cluster of DB2 nodes running on Power or System x servers
• InfiniBand (IB) and RoCE network & Cluster Caching Facility (CF)
• Leverages the global lock and memory manager technology from the z/OS platform
• Automatic workload balancing
• Shared data
• Integrated cluster services

Page 5: DB2 pureScale implementation experience

An AP Card Payment services company

• One of the top Card Payment switching service providers
• Card PoS system
  • Debit Card - secure transmission of transactions to the customer's bank
  • Credit Card - credit card authorization and settlement services
  • Value-added services - revenue reporting/management services
• Challenges
  • Workload increases along with business growth - a flexible database system architecture is needed for business growth
  • No tolerance for service interruption due to IT system failure - a 24x7 high availability production environment
  • Minimum change to the existing production system for migration to a new system

Page 6: DB2 pureScale implementation experience

Why DB2 pureScale

• Linearly scalable database system, without application changes, as workload and data grow
• Continuous availability for database services and uninterrupted system maintenance
• Easy cluster solution management
• Minimum change to the existing applications and database for migration
• Successful PoC results

Page 7: DB2 pureScale implementation experience

DB2 pureScale milestones

• 2010.7 - DB2 pureScale introduction: pureScale architecture review, pureScale demonstration
• 2011.9 - 1st PoC: high availability check, scalability check, pureScale cluster management check
• 2011.11 - 2nd PoC: application migration check, application performance check, solution integration review
• 2011.12 - DB2 pureScale project start
• 2012.3.19 - Production cutover

Page 8: DB2 pureScale implementation experience

PoC check list and result

Function                               Detail                                Checked result
System architecture                    DB2 pureScale support                 Power 7 purchase
                                       24 x 7 support                        System, DBMS redundancy
DBMS architecture                      DB2 pureScale installation            Completed
                                       DB creation and configuration         Completed
Integration with existing production   TP monitor                            Integration test completed
                                       Data migration                        Completed
                                       Application migration                 Completed
Bulk AP processing                     Active-Active support                 A/A supported
HA testing                             DBMS s/w failure                      < 10 sec for service restart
                                       CF s/w failure                        < 10 sec for service restart
                                       LPAR failure (DB2 member)             < 10 sec for service restart
                                       System failure (DB2 member & CF)      < 10 sec for service restart
DBMS/System production maintenance     Non-interrupted system maintenance    Supported
Roadmap                                Software (DBMS) roadmap               Provided

Rating legend: Critical / Marginal / Acceptable

Page 9: DB2 pureScale implementation experience

Implementation schedule

• 15 weeks of implementation - no change on applications

Tasks, in order across weeks w1-w15 (months M1-M4):
• AS-IS and DB environment and configuration check
• pureScale HW configuration
• pureScale SW configuration
• 1st data migration
• AP test and migrated data evaluation
• 2nd data migration
• AP test and migrated data evaluation
• 3rd data migration
• AP test and migrated data evaluation
• HA test
• Backup and database restore test
• IRS configuration and test
• Final data migration and production cutover
• Health check and monitoring
• Production cutover

(The original Gantt chart groups these tasks into two labelled phases: "DB2 pureScale system installation and configuration" and "DB2 pureScale high availability test".)

Page 10: DB2 pureScale implementation experience

Production DB2 pureScale configuration

• 2 p770 & 2 p740 servers - 2 members and 2 CFs
  • DB2 pureScale Member 1: p770, AIX 6.1, 4 cores, 64 GB
  • DB2 pureScale Member 2: p770, AIX 6.1, 4 cores, 64 GB
  • DB2 pureScale CF 1: p740, AIX 6.1, 4 cores, 64 GB
  • DB2 pureScale CF 2: p740, AIX 6.1, 4 cores, 64 GB
• Application servers: p6, AIX 5.3, 4 cores, 16 GB, running TMAX and JEUS WAS, connected over the public network
• DS8700 storage server
  • 10 TB: RAID10 5 TB
• 2 IB switches/HCAs for the high-speed communication network between members, CFs, logs, and shared data
• Database
  • DB2 9.8 pureScale
  • 6.8K tables
  • Uncompressed 3.3 TB of data & index volumes
  • 1.2 TB of data & indexes with compression
• Workload
  • Mixed workload:
    • online WAS payment application
    • C/C++ batch applications on a TP monitor
  • Workload balancing configured for the WAS payment online application
  • Daily transaction volume averages 8M transactions

Page 11: DB2 pureScale implementation experience

Redundant InfiniBand configuration

• Redundant IB HCAs for the CFs
• IB port 1 active and IB port 2 inactive configured for each IB HCA
• Each CF port on a CF server has its own subnet IP address
  • ib0: 10.10.0.3, 10.10.0.4; ib2: 10.10.1.3, 10.10.1.4
• Members should be on the same subnet as one of the CF ports
  • ib0: 10.10.0.1, 10.10.0.2
• Inter-switch links between the two IB switches
  • Increased IB bandwidth

(Diagram: DB #1 (member 0, ib0 10.10.0.1) and DB #2 (member 1, ib0 10.10.0.2) attach through GX++ host channel adaptors (HCAs) to InfiniBand switch 1 (master) and switch 2 (standby), which are joined by inter-switch links; CF #1 (CF 0) uses ib0 10.10.0.3 and ib2 10.10.1.3, and CF #2 (CF 1) uses ib0 10.10.0.4 and ib2 10.10.1.4.)

Page 12: DB2 pureScale implementation experience

Direct Access Transport (DAT) configuration file

• Edit the dat.conf file under the /etc/ directory when different IB HCAs, ports, or interfaces are used
  • One 2-port IB HCA on each member and two 2-port IB HCAs on each CF are configured
  • IB port 1 active and IB port 2 inactive per IB HCA
  • One entry per active IB HCA port is defined in the dat.conf file
• Check that the IB HCA ports are active and the links are up
  • ibstat command with the -v or -p options (see the sketch below)

DB #1 : /etc/dat.conf
hca0 u2.0 nonthreadsafe default /usr/lib/libdapl/libdapl2.a(shr_64.o) IBM.1.1 "/dev/iba0 1 ib0" " "

CF #1 : /etc/dat.conf (two ports are defined on a CF)
hca0 u2.0 nonthreadsafe default /usr/lib/libdapl/libdapl2.a(shr_64.o) IBM.1.1 "/dev/iba0 1 ib0" " "
hca1 u2.0 nonthreadsafe default /usr/lib/libdapl/libdapl2.a(shr_64.o) IBM.1.1 "/dev/iba1 1 ib2" " "

Each entry consists of: interface adaptor name, API version of the library, library path, IB device name, IB port number, and IB interface.
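To make the port check above concrete, here is a minimal command sketch (the iba0/iba1 device names follow the dat.conf entries on this slide; exact arguments and output vary by AIX level and HCA firmware, so treat this as illustrative rather than the documented procedure):

  ibstat -v iba0     # verbose HCA and port status for the member's single HCA
  ibstat -p iba0     # port-level summary only
  ibstat -p iba1     # on a CF host, check the second HCA as well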

Page 13: DB2 pureScale implementation experience

Storage configuration

• General Parallel File System (GPFS) provides concurrent access to data from all hosts in the pureScale cluster
• For faster recovery time, pureScale category 1 storage (a DS8700 storage server) was deployed
• Category 1 storage supports
  • fast I/O fencing (SCSI-3 PR) of a failing member
  • the cluster services tie-breaker
• An appropriate multipath driver should be installed (see the sketch below)
  • SDDPCM driver for the DS8000 series
• The DB2 Information Center has the detailed prerequisites
  • http://publib.boulder.ibm.com/infocenter/db2luw/v10r1/topic/com.ibm.db2.luw.qb.server.doc/doc/c0059360.html
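As an aside, a hedged sketch of how the multipath and disk-access prerequisites can be checked on AIX with SDDPCM (hdisk2 is one of this site's data LUNs; confirm the required attribute values against the prerequisites page linked above):

  lsdev -Cc disk | grep hdisk2           # description should show an MPIO / SDDPCM-managed DS8000 device
  pcmpath query device                   # SDDPCM view of the paths behind each LUN
  lsattr -El hdisk2 -a reserve_policy    # shared pureScale LUNs are expected to use no_reserve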

Page 14: DB2 pureScale implementation experience

Storage configuration - pureScale storage provisioning

• Shared LUNs must be accessible by all hosts
• 2 LUNs are allocated for the pureScale installation
  • Shared instance file system
  • DB2 Cluster Services tiebreaker
• Table spaces and the log paths are on separate LUNs
• 4 GPFS file systems, each built from 10 LUNs, are allocated for table spaces

Storage                       Path                                        Size
Shared instance file system   /dev/hdisk50                                100 GB
Active log path               /dev/hdisk53                                10 GB
Archive log path              /dev/hdisk47, /dev/hdisk51                  200 GB
Table space path
  DATA1                       /dev/hdisk2 ~ hdisk11                       100 GB x 10 = 1 TB
  DATA2                       /dev/hdisk12 ~ hdisk21                      (1 TB x 4 = 4 TB in total)
  DATA3                       /dev/hdisk22 ~ hdisk31
  DATA4                       /dev/hdisk32 ~ hdisk36, hdisk38 ~ hdisk42
DB2 CS tiebreaker             /dev/hdisk37                                2 GB

Page 15: DB2 pureScale implementation experience

DB2 pureScale installation

• Single installation of the pureScale instance and cluster services across multiple hosts using the DB2 Setup wizard (a launch sketch follows below)
• Components installed and configured across the 4 hosts:
  • pureScale engine (2 members & 2 CFs)
  • pureScale instance
  • DB2 cluster services (RSCT & TSA, GPFS)
• Two disk devices to be provided:
  • Shared instance file system
  • DB2 cluster services tie-breaker
• The 4 hosts for the members and CFs are specified
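For orientation, a hedged sketch of how such a single installation is typically launched (run as root from the installation image on one of the four hosts; the response-file name is illustrative):

  ./db2setup                 # DB2 Setup wizard; drives the install and instance creation across all hosts
  ./db2setup -r db2ps.rsp    # alternative: silent install from a response file when no graphical display is available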

Page 16: DB2 pureScale implementation experience

GPFS creation

• The DB2 instance owner can create GPFS file systems
  • The system administrator needs to change ownership of the LUNs to the DB2 instance owner first (see the sketch after the file system listing)
• The db2cluster command is used for DB2 cluster services management
  • -cfs option for the cluster file system (GPFS)
  • -cm option for the cluster manager (TSA & RSCT)

db2cluster -cfs -create -filesystem db2data1 -disk /dev/hdisk2,/dev/hdisk3,/dev/hdisk4,/dev/hdisk5,/dev/hdisk6,/dev/hdisk7,/dev/hdisk8,/dev/hdisk9,/dev/hdisk10,/dev/hdisk11 -mount /XXXDB/DATA1

db2cluster -cfs -create -filesystem db2log -disk /dev/hdisk53 -mount /XXXDB/LOG_ACTIVE

• Check the created GPFS file systems by issuing db2cluster -cfs -list -filesystem

FILE SYSTEM NAME                  MOUNT_POINT
--------------------------------- -------------------------
db2arch                           /XXXDB/LOG_ARCHIVE
db2data1                          /XXXDB/DATA1
db2data2                          /XXXDB/DATA2
db2data3                          /XXXDB/DATA3
db2data4                          /XXXDB/DATA4
db2fs1                            /db2sd_20120111183347
db2log                            /XXXDB/LOG_ACTIVE
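A hedged illustration of the ownership change mentioned above (the instance owner and group names are examples only; run as root on every host before the instance owner issues db2cluster -cfs -create):

  chown db2sdin1:db2iadm1 /dev/hdisk2 /dev/rhdisk2    # repeat for the remaining data, log, and tiebreaker LUNs
  ls -l /dev/*hdisk2                                  # verify the new owner on the block and character devices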

Page 17: DB2 pureScale implementation experience

Database creation

• Automatic storage is a prerequisite for pureScale
• Database creation on the created GPFS file systems:

  create database XXXDB on /XXXDB/DATA1, /XXXDB/DATA2, /XXXDB/DATA3, /XXXDB/DATA4 dbpath on /db2sd_20120111183347;

• The GPFS file systems are registered as a TSA resource group automatically
• Table space configuration (see the DDL sketch below)
  • Multiple containers: 4 containers per table space
  • 32-page extent size and 4 KB page table spaces to reduce the possibility of hot pages

(Diagram: both members access the shared storage paths /db2sd_20120111183347 (db2fs1), /XXXDB/DATA1, /XXXDB/DATA2, ... over the SAN; DATA1 is backed by hdisk2 ~ 11 and the active log by hdisk53.)
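A minimal DDL sketch of the table space layout described above (buffer pool and table space names are illustrative; with automatic storage, DB2 spreads the containers across the four storage paths given at database creation):

  create bufferpool bp4k immediate size automatic pagesize 4 k;

  create tablespace ts_data01
    pagesize 4 k
    managed by automatic storage
    extentsize 32
    bufferpool bp4k;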

Page 18: DB2 pureScale implementation experience

Client configuration

• Workload balancing (WLB)
  • WLB balances application requests among all members of the DB2 pureScale cluster based on the current workload of the DB2 pureScale member servers
  • The client receives the workload information from pureScale
  • The client sends transactions to the less busy members
  • The online Java application is set up with WLB
• Automatic client reroute (ACR) with WLB
  • The ACR feature reroutes applications that have connections to a failed member to the active DB2 pureScale members
  • Seamless ACR is configured - no error messages are returned to applications
  • Alternate servers for the first connection are set up in case an application's first connection attempt to the primary DB2 pureScale member fails

# db2pd -d xxxdb -serverlist

Server List:
Time: Wed Apr 18 12:46:35
Database Name: XXXDB   Count: 2

Hostname   Non-SSL Port   SSL Port   Priority
XXXDB1     59999          0          55
XXXDB2     59999          0          95

Page 19: DB2 pureScale implementation experience

Client configuration (C batch application)

• The db2dsdriver.cfg file is created and edited on the C program client host
• Additional ACR parameters are set up
  • enableAcr=true is the default

<configuration>
  <dsncollection>
    <dsn alias="XXXDB" name="XXXDB" host="XXXDB1" port="59999"/>
  </dsncollection>
  <databases>
    <database name="XXXDB" host="XXXDB1" port="59999">
      <acr>
        <parameter name="enableAcr" value="true"/>
        <parameter name="enableSeamlessAcr" value="true"/>
        <parameter name="maxAcrRetries" value="3"/>
        <parameter name="acrRetryInterval" value="1"/>
        <parameter name="enableAlternateServerListFirstConnect" value="true"/>
        <alternateserverlist>
          <server name="MEMBER0" hostname="XXXDB1" port="59999"/>
          <server name="MEMBER1" hostname="XXXDB2" port="59999"/>
        </alternateserverlist>
      </acr>
    </database>
  </databases>
</configuration>

Page 20: DB2 pureScale implementation experience

Client configuration (JEUS WAS)

• Transaction-level WLB is configured for the WAS
• JEUSMain.xml configuration excerpt:

<database>
  <vendor>db2</vendor>
  <export-name>xxxdb</export-name>
  <data-source-class-name>com.ibm.db2.jcc.DB2ConnectionPoolDataSource</data-source-class-name>
  <data-source-type>ConnectionPoolDataSource</data-source-type>
  <database-name>XXXDB</database-name>
  <port-number>59999</port-number>
  <server-name>YYY.YY.Y.41</server-name>
  <user>USERID</user>
  <password>USERPW</password>
  <property>
    <name>DriverType</name>
    <type>java.lang.Integer</type>
    <value>4</value>
  </property>
  <property>
    <name>EnableSeamlessFailover</name>
    <type>java.lang.Integer</type>
    <value>0</value>
  </property>
  <property>
    <name>EnableSysplexWLB</name>
    <type>java.lang.Boolean</type>
    <value>TRUE</value>
  </property>
  <property>
    <name>MaxTransportObjects</name>
    <type>java.lang.Integer</type>
    <value>10</value>
  </property>
</database>

Page 21: DB2 pureScale implementation experience

High availability test

• High availability is one of the most critical SLAs
• Various failure scenarios (SW, HW, dependent services / HW) were tested:
  1. DBMS / CF SW failure
  2. Network failure
  3. System failure

(Diagrams: in each scenario, clients keep a single database view over Member 1 on host DB1 and Member 2 on host DB2, with two CFs, two IB switches, and shared data.)

Page 22: DB2 pureScale implementation experience

Member SW failure: "Member restart on home host"

• kill -9 erroneously issued against a member
• DB2 Cluster Services automatically detects the member's death
  • Informs the other members & CF servers
  • Initiates automated restart of the DB2 member on the same ("home") host
• In the meantime, client connections are transparently rerouted to healthy members
  • Based on least load (by default), or
  • a pre-designated failover member
• Other members remain fully available throughout - "online failover"
  • The primary CF retains the update locks held by the member at the time of failure
  • Other members can continue to read and update data not locked for write access by the failed member
• Member restart completes
  • Retained locks are released and all data is fully available
  • The full DB2 member is started and available for transaction processing
• The restart is automatic, ultra fast, and online

(Diagram: clients keep a single database view; the primary and secondary CFs hold the updated pages and global locks while the restarting member replays its log records and pages.)
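A hedged sketch of how this scenario can be driven and observed (the process and command names are the ones used elsewhere in this presentation; the PID placeholder is deliberately left unresolved):

  # on the member host, simulate the software failure
  ps -ef | grep db2sysc        # locate the member's engine process
  kill -9 <db2sysc-pid>

  # from any host, watch cluster services restart the member on its home host
  db2instance -list            # the member state moves from RESTARTING back to STARTED on the same HOME_HOST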

Page 23: DB2 pureScale implementation experience

HA test - DBMS and CF SW failure

Case 1 - Abnormal DB2 member down (kill -9 of the db2sysc process) on XXXDB1 or XXXDB2
  • Termination of the application sessions connected to that member
  • Successful ACR to the surviving member
Case 2 - Abnormal CF down (kill -9 of the ca-server process) on XXXCF1 or XXXCF2
  • No impact on applications
  • In case of a primary CF failure, CF role change
Case 3 - Abnormal down of both DB2 members (kill -9 of the db2sysc process)
  • No service
Case 4 - Abnormal down of both CFs (kill -9 of the ca-server process)
  • No service
  • Group crash recovery is needed

Page 24: DB2 pureScale implementation experience

Machine / host failure

• Power cord tripped over accidentally
• DB2 Cluster Services loses the heartbeat and declares the member down
  • Informs the other members & CF servers
  • Fences the member from the logs and data
  • Initiates automated member restart on another ("guest") host
    • Using a reduced, pre-allocated memory model
• Member restart is like database crash recovery in a single-system database, but much faster
  • Redo is limited to in-flight transactions (due to FAC, force-at-commit)
  • Benefits from the page cache in the CF
• In the meantime, client connections are automatically rerouted to healthy members
  • Based on least load (by default), or
  • a pre-designated failover member
• Other members remain fully available throughout - "online failover"
  • The primary CF retains the update locks held by the member at the time of failure
  • Other members can continue to read and update data not locked for write access by the failed member
• Member restart completes
  • Retained locks are released and all data is fully available
• The restart is automatic, ultra fast, and online

(Diagram: clients keep a single database view; the failed host is fenced, its member restarts on a guest host, and the primary and secondary CFs continue to serve pages and global locks.)

Page 25: DB2 pureScale implementation experience

HA test - network failure

Case 1 - IB ib0 down
  • Termination of the application sessions connected to the failed member
  • Successful ACR to the surviving member
Case 2 - IB ib0 down
  • No impact on applications
Case 3 - IB ib1 down
  • No impact on applications
Case 5 - IB ib0 and ib1 down
  • No impact on applications
  • In case of a primary CF failure, CF role change
Case 8 - Public network card down (etherchannel, 2 adapters)
  • Termination of the application sessions connected to the member
  • Successful ACR to the surviving member
Case 9 - Public network card down (etherchannel, 2 adapters)
  • No impact on applications
  • In case of a primary CF failure, CF role change
Case 10 - IB switch 1 / IB switch 2 powered off
  • Termination of the application sessions connected through the failed IB switch
  • Successful ACR to the surviving member

Note: the original matrix distinguishes otherwise identical cases by whether the failure was injected on a member host (XXXDB1/XXXDB2) or a CF host (XXXCF1/XXXCF2); member-side failures terminate sessions, CF-side failures are transparent.

Page 26: DB2 pureScale implementation experience

HA test - system failure

Case 1 - System down (halt) of a member host
  • Termination of the application sessions connected to the failed host
  • Successful ACR to the surviving member
Case 2 - System down (halt) of a CF host
  • No impact on applications
  • In case of a primary CF failure, CF role change

• Additional failure scenarios for dependent services and hardware were also tested successfully
  • GPFS daemon failure
  • HBA cable failure

Page 27: DB2 pureScale implementation experience

DB2 Cluster Services - TSA resource model

• Tivoli SA MP (TSA), RSCT, GPFS, and DB2 code are the components of DB2 cluster services, and they are tightly integrated
• The cluster services are created and configured automatically during pureScale installation, member add, and member remove
• Policy-based automation in TSA is preconfigured for SW and HW failure detection, decision making, reaction, and recovery
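For reference, a hedged way to peek at that automatically created resource model with the underlying SA MP / RSCT commands (read-only inspection; day-to-day management should still go through db2cluster, as the next slides show):

  lssam         # Tivoli SA MP view of the resource groups, their members, and states
  lsrpdomain    # RSCT peer domain created during the pureScale installation
  lsrpnode      # hosts that are members of that peer domain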

Page 28: DB2 pureScale implementation experience

pureScale cluster monitoring

• The db2instance -list command provides a single view of the DB2 pureScale cluster status and is a good starting point for monitoring the cluster and investigating problems

• DB2 pureScale is normal:

ID  TYPE    STATE    HOME_HOST  CURRENT_HOST  ALERT  NETNAME
--  ----    -----    ---------  ------------  -----  -------
0   MEMBER  STARTED  XXXDB1     XXXDB1        NO     XXXDB1-ib0
1   MEMBER  STARTED  XXXDB2     XXXDB2        NO     XXXDB2-ib0
128 CF      PRIMARY  XXXCF1     XXXCF1        NO     XXXDB1-ib0,XXXDB1-ib1
129 CF      PEER     XXXCF2     XXXCF2        NO     XXXDB2-ib0,XXXDB2-ib1

HOSTNAME  STATE   INSTANCE_STOPPED  ALERT
--------  ------  ----------------  -----
XXXDB1    ACTIVE  NO                NO
XXXDB2    ACTIVE  NO                NO
XXXCF1    ACTIVE  NO                NO
XXXCF2    ACTIVE  NO                NO

• Problems exist on part of host XXXDB1, and member 0 on host XXXDB1 restarts on host XXXDB2 for member crash recovery:

ID  TYPE    STATE                 HOME_HOST  CURRENT_HOST  ALERT  NETNAME
--  ----    --------------------  ---------  ------------  -----  -------
0   MEMBER  WAITING_FOR_FAILBACK  XXXDB1     XXXDB1        YES    XXXDB1-ib0
1   MEMBER  STARTED               XXXDB2     XXXDB2        NO     XXXDB2-ib0
128 CF      PRIMARY               XXXCF1     XXXCF1        NO     XXXDB1-ib0,XXXDB1-ib1
129 CF      PEER                  XXXCF2     XXXCF2        NO     XXXDB2-ib0,XXXDB2-ib1

HOSTNAME  STATE   INSTANCE_STOPPED  ALERT
--------  ------  ----------------  -----
XXXDB1    ACTIVE  NO                YES
XXXDB2    ACTIVE  NO                NO
XXXCF1    ACTIVE  NO                NO
XXXCF2    ACTIVE  NO                NO

Page 29: DB2 pureScale implementation experience

pureScale cluster monitoring

• The db2cluster command is used to monitor and manage DB2 cluster services instead of the TSA & RSCT and GPFS native commands
  • db2cluster -cm -list -host -state : host status in the cluster
  • db2cluster -cm -list -tiebreaker : cluster manager tiebreaker
  • db2cluster -cm -list -hostfailuredetectiontime : host failure detection time
  • db2cluster -cfs -list -configuration : GPFS configuration
  • db2cluster -cfs -list -filesystem : GPFS file systems

• Understanding TSA & RSCT and GPFS gives you more insight into the pureScale cluster

• TSA & RSCT and GPFS trace files help you understand the history and current activities of each component
  • TSA & RSCT GblResRM trace: rpttr -o dtic /var/ct/<domain_name>/log/mc/IBM.GblResRM/trace_summary*
  • TSA & RSCT RecoveryRM trace: rpttr -o dtic /var/ct/<domain_name>/log/mc/IBM.RecoveryRM/trace_summary*
  • GPFS trace: /var/adm/ras/mmfs.log.latest

Page 30: DB2 pureScale implementation experience

pureScale system failure in production

• pureScale provided uninterrupted database services during an unexpected H/W failure
  • At 08:29 on 3/29, an OS hang occurred on Host 1
    • The root cause was an OS defect associated with a network status check
  • When the OS hang was resolved at 08:43 on 3/29, the failed member on Host 1 was restarted automatically by the pureScale cluster services
  • Database services continued regardless of the H/W failure

Timeline of the event:
1. DB2 pureScale cluster services detected the system failure on host 1 automatically
2. ACR of the applications on host 1 to the member on host 2
3. No interruption for the applications on host 2
4. After the OS hang problem was resolved, the failed member restarted automatically

• Rolling upgrade to apply the OS fix - "stealth maintenance" (see the command sketch below)
  • OS fix and system reboot in the order CF1 -> CF2 -> DB1 -> DB2 hosts, without service interruption
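A hedged outline of the per-host sequence behind that rolling OS fix (member 0 and the host name are illustrative; the exact supported procedure, including cluster services maintenance mode, is documented in the DB2 Information Center):

  db2stop member 0 quiesce 30     # drain the member, waiting up to 30 minutes for in-flight work
  db2stop instance on XXXDB1      # stop the instance processes on that host
  # apply the OS fix and reboot the host
  db2start instance on XXXDB1
  db2start member 0               # the member rejoins; repeat host by host (CF hosts use db2stop/db2start cf <id>)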

Page 31: DB2 pureScale implementation experience

Rules of thumb for problem determination

• db2instance -list gives you a good overview of outstanding problems that may require attention
  • It will not tell you about problems that existed in the past but have since been resolved (e.g. a host failed and restarted)
• If both host and member/CF alerts exist, start by investigating the host alerts first
• System logs (e.g. errpt -a, /var/log/messages) are a good first place to look for system-level events at the time of the problem
• Some alerts (e.g. host not responsive) are cleared automatically when the problem is resolved; others require the user to issue db2cluster -clear -alert

Page 32: DB2 pureScale implementation experience

Before and after DB2 pureScale

Before: single DB2 8.2 ESE
• No service during SW or HW failure
• No service during system maintenance
• Application and database were co-located

Current: DB2 pureScale
• Uninterrupted services during SW or HW failure
• Uninterrupted services during system maintenance
• Data and index size reduction through DB2 row compression
• Storage management improvement
• Performance improvement
• Database backup time reduction
• Easy cluster management
• Effective system resource usage through the active-active configuration
• A few batch applications show performance degradation due to the separation of the application and database servers (they were co-located in DB2 8.2)
• Current progress:
  • Changing singleton DMLs to stored procedures
  • 10G Ethernet network configuration between the application and DB2 pureScale servers

Page 33: DB2 pureScale implementation experience

What is next

• Planning for the DB2 V10 upgrade
  • Reduction of page reclaiming (hot pages) associated with frequent inserts of increasing numeric values and timestamps, which cause a hot spot at the high end of indexes
    • The DB2 V10.1 CURRENT MEMBER special register can reduce the hot spot
  • Non-partitioned table configuration
    • Add a hidden current member column to the base table
      alter table tbl_order add column curmem smallint default current member implicitly hidden;
    • Add the current member column as a prefix column of the indexes
      create index tbl_order_x1 on tbl_order ( curmem, order_seq_num );
  • Range-partitioned table with local indexes (see the sketch at the end of this slide)
    • Add a hidden current member column to the base table as the range partitioning key column
      .. partition by range ( curmem ) ..
    • Add the current member column to unique indexes
  • Page reclaiming associated with update/delete operations
    • Small page size - 4 KB page table spaces are used
    • Increased PCTFREE for tiny or semi-tiny tables
• Migration of the Online Payment system to DB2 pureScale
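A hedged sketch combining the range-partitioned steps above (the table and column names follow the tbl_order example; one data partition per member ID, matching this cluster's two members):

  create table tbl_order (
      order_seq_num  bigint not null,
      order_ts       timestamp,
      curmem         smallint default current member implicitly hidden
  )
  partition by range (curmem)
      (partition p_mem0 starting (0) ending (0),
       partition p_mem1 starting (1) ending (1));

  -- local (partitioned) unique index; curmem must be part of the unique key
  create unique index tbl_order_x1 on tbl_order (curmem, order_seq_num) partitioned;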

Page 34: DB2 pureScale implementation experience

Lessons learned

• The DB2 pureScale installation guide in the DB2 Information Center is the most important reference for a successful pureScale installation and configuration
• IBM Toronto Lab support helped with workload analysis and optimized configuration recommendations
• Two rounds of preliminary data and application migration reduced the risk of the final production migration, which had a very short downtime window
• The db2instance -list command is a good starting point for investigating problems; it provides a single view of the DB2 pureScale cluster status
• The db2cluster command is used for cluster services monitoring and management instead of the TSA & RSCT & GPFS commands, and understanding TSA & RSCT & GPFS gives you more insight into the cluster

Page 35: DB2 pureScale implementation experience

YunCheol Ha, IBM Australia, [email protected]

Session: DB2 pureScale implementation experience from a Card Payment service company in AP