HA/DR solutions for IBM i
Edward Grygierczyk
2019 Common Polska | 13.05.2019 | Lidzbark Warmiński
2019 IBM Systems Technical University
Topics
— What's new with V7.3 TR6 and V7.4
— Power Systems HA/DR solution family — characteristics/positioning
— PowerHA section, examples
— VM Restart section (VM Recovery Manager, FSR)
— Positioning considerations
— Licensing/pricing
— iCBU and ECBU
— Cloud Storage for IBM i
— Db2 Mirror section
What’s new for HA/DR products 1H 2019 for IBM i
PowerHA for i V7.3 Enterprise Edition: DS8000 HyperSwap + a Global Mirror link
• Integration of PowerHA IASP-based replication with IBM Copy Services Manager for DS8000
• Enables the DS8000 HyperSwap configuration with a Global Mirror link
• The CLI will continue to be supported for the foreseeable future
• Automated adding of monitored resource entries for object creation, deletion, and restore

New with IBM i V7.4: Db2 Mirror for IBM i
• Enables continuous availability via active/active synchronous Db2 replication

VM Recovery Manager 1.3 for DR (VMR DR)
• The VMR GUI now monitors and manages DR
• For IBM i customers, VMR DR can be managed via the GUI; no CLI interaction on AIX is required

BRMS 7.3 TR6 highlights
• Turn-key cloud control group deployment that enables clients to easily set up custom control groups for cloud
• Backup of changes to journaled objects is now the default setting; that is, the default for the SAVLIBBRM command has been changed to OBJJRN(*YES)
• Enhanced log information that uses the system timestamp to preserve message order when messages are logged within the same second; messages are displayed using DSPLOGBRM
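A minimal sketch of what the new default means in practice (library and device names are hypothetical):

    /* Save a library with BRMS; OBJJRN(*YES) is now the default, */
    /* so changes to journaled objects are included in the save   */
    SAVLIBBRM LIB(APPLIB) DEV(*MEDCLS) OBJJRN(*YES)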
Power Systems HA/DR solution family

Cognitive Systems HA/DR
• Active/passive HA/DR: PowerHA for AIX, PowerHA for Linux, PowerHA for IBM i
• Active/inactive VM restart: VM Recovery Manager HA, VM Recovery Manager DR
• Active/active HA: Db2 pureScale, Db2 Mirror for IBM i

PowerHA System Mirror
– Covers planned and unplanned outages for both software and hardware with automation
– Solutions for both HA & DR
– Advanced capabilities such as HyperSwap
– Operating system based technology
– RTO via application restart
– RPO sync or async mode
– N+1 licensing

VM Recovery Manager
– Primarily for planned and unplanned hardware outages
– Manage and monitor large numbers of VMs (LPARs)
– Relatively easy to implement and manage
– Operating system independent (supports AIX, IBM i, and Linux)
– RTO via reboot (IPL)
– RPO sync or async mode
– N+0 licensing

Db2 Mirror & Db2 pureScale
– Db2 pureScale: active/active via DB cluster
– Db2 Mirror: active/active DB via synchronous replication
– Covers planned and unplanned outages for both software and hardware with automation
– Solutions for HA, continuous availability
– RTO & RPO zero
– N+N licensing
High Availability Topology Classification

Active/Active Clustering (Application HA/DR)
• Definition: application clustering; applications in the cluster have simultaneous access to the production data, therefore no application restart upon an app node outage. Certain types enable read-only access from secondary nodes.
• Outage types: SW, HW, HA, planned, unplanned; RTO 0; limited distance
• OS integration: inside the OS
• RPO: sync mode only
• RTO: 0
• Licensing*: N+N licensing
• Industry examples: Db2 Mirror, Oracle RAC, pureScale

Active/Passive Clustering (Application HA/DR)
• Definition: OS clustering; one OS in the cluster has access to the production data, with multiple active OS instances on all nodes in the cluster. The application is restarted on a secondary node upon outage of a production node.
• Outage types: SW, HW, HA, DR, planned, unplanned; RTO > 0; multi-site
• OS integration: inside the OS
• RPO: sync/async
• RTO: fast (minutes)
• Licensing*: N+1 licensing
• Industry examples: PowerHA, Red Hat HA, Linux HA

Active/Inactive (VM Restart HA/DR)
• Definition: VM clustering; one VM in a cluster pair has access to the data, one logical OS, two physical copies. OS and applications must be restarted on a secondary node upon a primary node outage event. LPM enables the VM to be moved non-disruptively for a planned outage event.
• Outage types: HW, HA, DR, planned, unplanned; RTO > 0; multi-site
• OS integration: OS agnostic
• RPO: sync/async
• RTO: fast enough (VM reboot)
• Licensing*: N+0 licensing
• Industry examples: VMware, VMR HA, LPM

— Illustrations represent two-node shared-storage configurations for conceptual simplicity
— There are many other topologies and data resiliency combinations

* N = number of licensed processor cores on each system in the cluster. For example, in a two-node cluster with 8 licensed cores per system, N+N licensing requires 16 core licenses, N+1 requires 9, and N+0 requires 8.
PowerHA System Mirror

• The IBM PowerHA solutions are based on shared storage clustering
− PowerHA for AIX
− PowerHA for i
− PowerHA for Linux
• For IBM i the shared storage container is called an IASP; this is where the production data and application libraries reside
• The data in the IASP can be switched between systems and/or replicated for geographic dispersion
• Enables HA for all outage types: software, hardware, HA, DR, backup operations, and non-disruptive upgrades and software maintenance
• Best price/performance option
• Complete HA/DR operational automation
• This configuration is the PowerHA foundational building block
• HA via switching production ownership of shared storage
• Administrator switches users and production between nodes in the cluster with a single command
• No remote journaling required; journals stay local
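A minimal sketch of that single-command switchover, assuming a cluster and device CRG are already configured (names are hypothetical):

    /* Switch the cluster resource group's primary node; users and */
    /* production move to the backup node in the cluster           */
    CHGCRGPRI CLUSTER(PRODCLU) CRG(PWRHACRG)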
Power Systems solutions for HA/DR: PowerHA

PowerHA clustering for HA and DR
• Best-case recovery point objective (RPO)
• Best-case recovery time objective (RTO)
• All outage types covered: planned and unplanned, software and hardware
• Capacity Backup (CBU) for Enterprise Systems provides huge savings
• The HA and/or DR server requires only one PowerHA LPP

PowerHA Standard Edition
• Primarily an AIX/Linux configuration; IBM i shops more typically go multi-site

PowerHA Enterprise Edition
• AIX and IBM i accounts, multi-site clustering, typical of IBM i configs

PowerHA Enterprise Edition (HyperSwap)
• Large IBM i accounts, DS8000 only
• HyperSwap + Global Mirror link
• Primary use case: banking clients

PowerHA for Linux
• Primary use case: SAP HANA & NetWeaver
• SAP System Replication for the SAP DB
• PowerHA for app servers and SAP HANA coordination
PowerHA i Enterprise Edition multi-target HyperSwap cluster

DS8000 three-site PowerHA for i 7.3 TR6 Enterprise Edition HyperSwap cluster
• Three systems, three sites, three real-time copies of the production IASP data in a single PowerHA EE cluster
• The Metro Mirror/HyperSwap section of the cluster provides continuously available storage (active/active)
• The Global Mirror link provides the disaster recovery section

[Diagram: the application runs on the HyperSwap section of the cluster, with Metro Mirror between site one and site two; a Global Mirror/Global Copy link replicates to site three.]
Classic three system two-site PowerHA i Enterprise Edition cluster
• Four nodes in this illustration: nodes 1, 2 & 3 are production nodes, node 4 is a FlashCopy node
• All nodes have an active IBM i, enabling concurrent software updates without disrupting production
• Switched LUN configuration in the data center; applications and data go into the IASP, with local journals
• The system ASP (SYSBAS) contains the monitored resource entries (MREs)
• The Administrative Domain keeps MREs in sync between nodes 1, 2, and 3 (note: no logical replication being used)
• Node 4 is used for FlashCopy & BRMS operations to eliminate your backup window (see the sketch after the diagram)
⎻ We can flash the IASP, or the IASP + Admin Domain objects, or full system flash (IASP + ASP 4)
[Diagram: IBM i 1, 2, and 3 each run the application against a production IASP on V9000 storage — LPAR1 in the production data center, LPAR2 at the disaster recovery site, linked by Metro Mirror or Global Mirror; system ASPs 1–3 are kept in sync by the Admin Domain; IBM i 4 runs BRMS against a FlashCopy IASP, saving to cloud or tape.]
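A sketch of starting the node 4 flash for backup operations, assuming the ASP copy descriptions already exist (session and copy description names are hypothetical):

    /* Start a FlashCopy session pairing the production and flash */
    /* copy descriptions; node 4 then varies on the flashed IASP  */
    /* for the BRMS saves                                         */
    STRASPSSN SSN(FLASH1) TYPE(*FLASHCOPY) ASPCPY((PRODCPYD FLSHCPYD))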
PowerHA for i HA/DR configuration examples

— PowerHA, simple two-node shared storage data center cluster (Standard Edition)
• Each node has an active IBM i; system A hosts the application and users, while system B can be used to conduct software maintenance
• The cluster switches users and application production to system B for planned or unplanned events
• The Admin Domain synchronizes monitored resource entries (MREs)

— PowerHA, two-site four-node cluster (Enterprise Edition)
• Metro Mirror mirrors the IASP data synchronously with the application state, therefore distance is limited
• Global Mirror replicates the IASP data asynchronously to the application state, therefore distance is unlimited
• The cluster switches users and application production to system B, C, or D for planned or unplanned outage events

[Diagrams: a two-node shared storage data center cluster, and a two-site four-node cluster with Metro Mirror within site 1 and Global Mirror to site 2.]
PowerHA for i HA/DR two-site cluster configuration examples

— PowerHA, two-system two-site cluster (sync, Enterprise Edition)
• Metro Mirror replication of the IASP data is synchronous with the application state, creating two identical copies
• Distance is typically under 40 km; FlashCopy & BRMS can be done at site 1, site 2, or both
• The cluster moves users and application production to system B for planned or unplanned events, and reverses the replication direction

— PowerHA, two-system two-site cluster (async, Enterprise Edition)
• Global Mirror replicates the IASP data asynchronously to the application state, therefore distance is unlimited
• BRMS and FlashCopy can be used at site 1, site 2, or both
• The cluster moves users and application production to system B for planned or unplanned outage events, and the replication direction is reversed

[Diagrams: two two-system two-site clusters, one linked by Metro Mirror and one by Global Mirror.]
PowerHA with internal disk and geomirroring

— PowerHA geomirror cluster (typically done with internal disk)
— Memory pages are replicated via IBM i mirroring over IP to local and remote IASPs in real time
— Offline backup followed by source-side/target-side tracking change resynchronization (consider a V5000 with FlashCopy at the target site for zero resync time after a save operation)
— Both bandwidth and network quality are important
— Synchronous mode up to 40 km; production and target should be identical for maximum throughput
— Asynchronous mode, unlimited distance; production and target data ordered and consistent

[Diagram: production and target partitions each hold an IASP (DB2, IFS, journals) linked by geomirror; monitored SYSBAS objects on both sides are kept in sync by Admin Domain synchronization.]
PowerHA with geographic mirroring backup operations

[Diagram: PROD (source) and HA (target) LPARs, each with SYSBAS and an IASP, connected over the network; during the detach-with-tracking window no data replication occurs, followed by a partial resync.]

1. Detach with tracking
• Replication from the source is suspended; changes to production data are tracked
2. Once backups are completed, partial resync
• Tracked changes are replicated from source to target
• You should conduct backup operations during quiet times to minimize the partial resync time
• No HA or DR failovers are possible until that resync has completed

Consider implementing SAN storage at the target side and using FlashCopy to eliminate resync time.
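A minimal sketch of the detach/reattach cycle around a save (the session name is hypothetical):

    /* Detach the mirror copy; changes are tracked on the source        */
    CHGASPSSN SSN(GEOSSN) OPTION(*DETACH)
    /* ... vary on the detached copy at the target and run the save ... */
    /* Reattach; only the tracked changes are resynchronized            */
    CHGASPSSN SSN(GEOSSN) OPTION(*REATTACH)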
The administrative domain

• Table 3-1 in Preparing for PowerHA (SG24-8400-00) lists the MREs that are kept in sync across the nodes in the cluster
o Note that logical replication is not required

[Diagram: application data and local journals reside in the IASP; the monitored objects in each node's SYSBAS are the monitored resources kept in sync by the administrative domain.]
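For illustration, adding a user profile as a monitored resource entry might look like this (domain and profile names are hypothetical):

    /* Monitor the WEBUSER profile so the admin domain keeps it */
    /* synchronized across the cluster nodes                    */
    ADDCADMRE ADMDMN(PRODDMN) RESOURCE(WEBUSER) RSCTYPE(*USRPRF)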
VM Recovery Manager

• VM Recovery Manager for HA (VMR HA): a low-cost, simple HA solution for AIX, IBM i, and Linux
• VM Recovery Manager for DR (VMR DR): a low-cost, simple DR solution for AIX, IBM i, and Linux
• PowerHA AIX data center cluster with VMR DR to automate DR operations
• VMR DR provides a simple automated DR solution for SAP HANA deployments
• VMR DR for MSPs and CSPs

VM Recovery Manager: low cost, easy to use
• Supports AIX, IBM i, and Linux
• Supports DS8K, SVC, EMC, and Hitachi storage
• Non-disruptive disaster recovery compliance testing
• Ideal for DRaaS providers
VM Recovery Manager for DR – automated two-site disaster recovery

VM Recovery Manager for DR (VMR DR) – monitor and manage your DR operations from a GUI
• VMs are replicated at the storage level in real time (IASP not required)
• Recovery is via VM restart at the DR site (VM restart is essentially an IPL)
• DR operations are operator initiated and fully automated via an intuitive GUI
• The orchestrator is KSYS; it runs on an AIX partition at the DR site
⎻ KSYS interacts with the HMC and storage
• VMR DR provides a DR rehearsal mode, non-disruptive to production

[Diagram: IBM i application VMs at each site with storage replication between the sites, orchestrated by KSYS.]
Failover Rehearsal: non-disruptive disaster recovery testing

• A point-in-time copy (i.e. FlashCopy) is created to start VMs on the backup system for DR testing or backup operations
• Enables IT operations to validate disaster recovery compliance without disrupting production
• Network isolation needs to be established by the administrator (the admin can design and use test VLANs for test VMs)

[Diagram: at Site 1 (active), Host 1 runs LPARs behind a VIOS pair against Disk Group 1; the disk group is mirrored from S1 to S2 at Site 2 (backup), where Host 2 starts the same LPARs behind its VIOS pair from a point-in-time copy S2C.]
Full System Replication (FSR) for i – Automated Failover Operations

[Diagram: a production LPAR and controlling LPAR at the production site, and a production LPAR copy and controlling LPAR at the DR site, connected by remote copy — DS8K Metro Mirror or Global Mirror, or SVC-based Metro Mirror or Global Mirror with Change Volumes.]

Automated with the FSRC SWCSE command:
• End applications/jobs on the production site; monitor/wait for all jobs to end
• Issue the shutdown command; monitor/wait for shutdown to complete
• Log in to storage as admin; verify/wait for disk synchronization
• Log in to the HMC at the DR site as superadmin; IPL the DR site partition in manual mode
• Modify "autostart" objects (lines, interfaces, devices, applications) to not start
• Correct comm resources, storage resources, IP interfaces, TCP routes, etc.; apply license keys
• Issue the switch operation and complete startup processes

CONTROLLING PARTITIONS: either controlling partition may control the switchover, failover, or detach, depending on which site is active.

MANUAL STEPS AND LOG-INS NO LONGER REQUIRED!
Power Systems CBU for enterprise systems (ECBU)

Offering for
• Power Systems E880, E880C, E870, E870C, and E980

ECBU offering features:
• Deeply discounted processor nodes matching the installed production server processor nodes
• No-charge, annually renewable active standby memory = 365 × N × 32 GB, where N is the number of active mobile cores on the production system
• Mobile processor activations are transferred from production to ECBU via Enterprise Pool transfers
• Registration of the primary system and ECBU is required; primary and ECBU must be within the same enterprise

Offering requirements overview
– An E980 ECBU may support any E870/E870C, E880/E880C, or E980 primary (production) system (9119-MME/MHE or 9080-MME/MHE/M9S)
– Only one ECBU to one production server for registration and entitlement purposes, but multiple production servers to one ECBU is allowed
– Only one ECBU to the primary system allowed; IBM i customers can use the iCBU for an additional CBU in a three-site config
– The primary can be a new or installed box; the CBU must be a new box
– A minimum of one entitlement of AIX or IBM i & PowerHA on the CBU, or, if an alternative HA/DR solution is used, as many IBM i or AIX entitlements as needed to support the workload (such as an IP-based replication workload)
– 8 processor static activations on the CBU (no more, no less)
– A minimum of 25% of DIMM memory active on the CBU
– The no-charge memory ECOD days must be activated upon install of the CBU system and remain active for 365 days

Processor & entitlement transfer
• All transferable entitlements must originate on the primary system and may not run concurrently on the primary system and the ECBU system
• Subsequent to the initial workload deployment, some subset of production partitions may be moved to the ECBU system for workload balancing, etc.
• The total number of processor entitlements running production across both servers cannot exceed the original total licensed entitlements
Traditional CBU Licensing example – two system, one customer topology

Planning
— CBU allows PowerHA license entitlement fail-over from the registered production server
• Minimum 1 entitlement required on the CBU box*
• The CBU server allows the temporary transfer of entitlements from the primary server for non-concurrent usage on the CBU server
• Round up when using partial processors: 3.5 processors = 4 entitlements
• One customer

Example
— No HA/DR required for partition 1
• No PowerHA licenses
— HA required for partitions 2 and 3
• All processors in the production server partitions 2 and 3 are licensed for PowerHA
• One key, 8 entitlements
• The license key will be a permanent key installed on partitions 2 and 3
— A single processor is licensed on the CBU server
• One key, one entitlement
• The non-OS LPPs will be temporary keys for 8 cores, good for two years, installed on partitions 2 and 3

* Logical replication DR solutions generate workload on the CBU that can range from 30% to 50% of the total workload on the primary. Those additional cores must be permanently licensed, with no out-of-compliance messages, prior to a failover operation.
IBM Cloud Storage Solutions for i (ICC)

Place your IBM i data in cloud or FTP storage.

Two independent modes:
⎻ BRMS to cloud for backup operations
⎻ GUI dashboard for storing files in the cloud (think of Box-like usage cases)

[Diagram: IBM i connects over TCP/IP to cloud storage (virtual tape) or an FTP server.]
Cloud Storage Solutions for i (5733-ICC)

• Enhancements – announced December 4, 2018
– Supported cloud storage options
• IBM Cloud Object Storage
• IBM Cloud (formerly IBM Bluemix and IBM SoftLayer) (S3 protocol)
• FTP (on IBM i)
• SoftLayer "legacy" (Swift protocol)
• Amazon AWS S3
– GUI
• View your storage locations and contents; easily identify files/directories
• Easily perform upload/download operations
• PTFs for the latest function
– PTF SI67483 contains the Cloud Storage Solutions web GUI
– PTF SI68368 adds support for IBM Cloud Object Storage
IBM Db2 Mirror for i: Enables Continuous Availability
- High-speed synchronous replication of Db2 for i (data center solution)
- Access Db2 objects from either LPAR
- Application availability enablement
- Two nodes read and write to the same DB files
- Enables quickly moving all work to one node, for planned maintenance or node failure
- Enables business continuity for disruptive system upgrades
- Nodes can be at different OS levels
- Nodes can be on different Power hardware generations
- Rolling upgrades for no downtime
- Roll a node back a release with minimal impact if active/active applications are deployed

Requires POWER8 or later and IBM i 7.4
New IBM i LPP 5770DBM

[Diagram: the application spans two nodes linked by Db2 Mirror.]
Db2 Mirror – Active Active

Operating system synchronous replication: a synchronous database update occurs on both nodes, in SYSBAS or an IASP.

[Diagram: Node 1 and Node 2, each running the application against its database, connected via RoCE; a record added on one node ("Fred", age 24) appears in the database on both.]
Db2 Mirror – Database Supported Objects

[Diagram: the application runs separately on each node; the Node 1 and Node 2 databases are connected via RoCE.]

Database replication eligible objects

Native (DDS / record level access):
• Database physical & logical files

SQL (set based access):
• Alias
• Function
• Global variable
• Index
• Procedure
• Schema
• Sequence
• SQL package
• Table
• Trigger
• User-defined type
• View
• XML schema repository

Included with file support:
• Row permission
• Column mask
• Temporal table
• Constraint
• Etc.
Db2 Mirror – Other Supported Objects

Objects can be in either SYSBAS or IASPs.

[Diagram: Node 1 and Node 2, each with app, database, and an IASP, connected via RoCE.]

— Other objects
• User profiles
• Authority
• Ownership
• Security
• PGM/SRVPGM
• Data areas
• Data queues (DDL only)
• SYSVALs
• ENVARs
• LIB
• JOBD
• Journals
• Files (also has DDL-only option)

— Special handling
• OUTQ / spool
• Job queue
IFS Support

• Requires an IASP
• IFS accessible on both nodes (R/W)
• Requires PowerHA
• The file system automatically 'mutates' when the storage is switched
Db2 Mirror – Active Active, Web Clients

The application layer connects with either JDBC or a load balancer.

[Diagram: web clients connect through the application layer to Node 1 and Node 2, which replicate via RoCE.]
Db2 Mirror – Active Passive

[Diagram: run production workloads on Node 1 and run queries and reports on Node 2; the two databases replicate via RoCE.]
Db2 Mirror – What makes it different

— New integrated IBM i synchronization technology
— Does not leverage any existing availability technology to provide continuous availability
• But does work with existing technology

[Diagram: contrast with logical replication (journal-based, over a normal network connection) and physical replication; in each approach the same records ("Fred", "Sally") end up on both systems.]
DR Solutions Built on Top of Db2 Mirror for IBM i

[Diagram: a Db2 Mirror pair linked via RoCE (< 200 m apart), with Metro or Global Mirror replication to a DR site.]
Db2 Mirror GUI

• The GUI runs on IBM i
• The GUI can run on the Db2 Mirror nodes
• The GUI can run outside of the Db2 Mirror nodes and manage multiple pairs
• http://systemname:2006/Db2Mirror
SQL Services
ACS Insert from Examples
Performance Expectations

• With synchronous replication the complete path length will increase, since an action may drive I/O on both nodes in order to finish; path length could increase by roughly 2–3×
• The ability to run transactions on both nodes mitigates the per-transaction overhead, with a target of achieving equal or greater transactional throughput
• Read workloads will not be impacted, since reads do not have to be replicated
• Single-threaded or serial I/O workloads will be the most impacted

[Diagram: Node 1 and Node 2 databases replicating via RoCE.]
Time Server Topology

— Internal time server
— External time server

[Diagrams: a Db2 Mirror pair (RoCE) using one of its own nodes as the time server, and a pair synchronized to an external time server.]
Communication Hardware

Four adapter options:
- PCIe3 2-port 10 Gb NIC & RoCE SR/Cu adapter (FC EC2R and EC2S; CCIN 58FA)
- PCIe3 2-port 25/10 Gb NIC & RoCE SFP28 adapter (FC EC2T and FC EC2U; CCIN 58FB)
- PCIe3 2-port 100 GbE NIC & RoCE QSFP28 adapter (FC EC3L and EC3M; CCIN 2CEC)
- PCIe4 2-port 100 GbE RoCE x16 adapter (FC EC66 and EC67; CCIN 2CF3)

Max cable length = 100 m
Optional RoCE switch
POWER9 enables SR-IOV
Network Redundancy Groups (NRG)

• Network Redundancy Groups are a logical group of physical ports
• Up to 16 links can form an NRG
• Ability to prioritize different types of traffic onto separate physical links
• The failover domain is the entire group of ports
Db2 Mirror Setup
5 separate NRG categories to isolate traffic
Db2 Mirror Setup
The Load Balance Link Count tells the NRG how many active links to use. The default is 1, up to a maximum of 16 links.
Db2 Mirror Setup
This configuration has two physical links; with a Load Balance Link Count of 1, only one is active.
Db2 Mirror Setup
The priority influences which link is active for the NRG
Db2 Mirror Setup
This configuration has two physical links; a Load Balance Link Count of 2 makes both links active.
Db2 Mirror Network Statistics
Default Inclusion State for Replication Rules
NOTE: This can only be chosen at setup time or reconfiguration time.
Replication List Rules
Add Rules for existing objects and objects that don’t exist yet
Add Rules for an object type or a specific object name
Replication List Rules
Set the rule to include or exclude the object/library from replication
Inspect what the Rules look like applied to the System
System Defined Rules
System Defined Rules are predefined and cannot be changed
Pending Rules
Create a group of rules before applying them to the system
Visualize Pending Groups
Detecting Errors
— Nodes are designated as 'Primary' or 'Secondary' to indicate which node is preferred to 'track'.
— HMCs are used for failure detection of the partner node, so that the Secondary can automatically take over as the Primary and begin tracking, allowing Db2 transactions to continue.
— The Secondary side will block changes to replicated Db2 objects.
Detecting Errors - Quorum
— Additional nodes are added to the cluster to help determine the Primary and Secondary roles in the event that the partner node is down when a node IPLs.
— The quorum data is shared among all nodes in the cluster and stores state information.
— Typically, if there is a DR configuration, those nodes would serve as the additional nodes that store quorum data.
Detecting Errors – State Change

— If the Secondary fails (IPLs, MSD, or goes to restricted state):
— The Primary will begin tracking replicated object changes and the application will continue to run.
— The Secondary will be in a 'blocked' state and will not allow changes to replicated objects until the two nodes have resumed mirroring.
Detecting Errors – State Change
— If the Primary fails (crash/MSD):
— If the Secondary can connect to the HMC and determine the Primary has failed, the Secondary will take over as the Primary and begin tracking.
— If the Secondary cannot detect the failure, it will remain blocked. The user may choose to force the Secondary to become the Primary.
Detecting Errors – State Change
— If the Primary does a normal IPL or goes to restricted state:
— The Secondary will remain blocked, and the Primary will track while in restricted state or until the IPL completes.
Detecting Errors – State Change
— If the network fails:
— If there is no communication between the two nodes over the RoCE network, the Primary will continue to track replicated objects and the Secondary will block changes to replicated objects until mirroring is resumed.
Active Replication
The status of the system when one node goes offline
Suspend Mirroring from the GUI
Tracking / Blocked State
Object Tracking List
Mirror Resume Progress
History of Previous Resynchronizations
Resume Automatically
— The resume automatically property defaults to yes. This means that if mirroring was suspended by a system-detected event, such as a communication failure or crash, the mirror will resume once the failure is resolved.
— If the user suspends mirroring, then the user has to explicitly call resume.
Resync Parallelism
— If 5770SS1 Option 26 (DB2® Symmetric Multiprocessing) is installed, you can take advantage of resynchronizing multiple objects at the same time.
Suggested Priority Example
Managing and Monitoring
— Exit points for several of the state transitions:

Exit Point | Exit Point Format | Description
QIBM_QMRDB_PRECLONE | PREC0100 | Db2 Mirror ASP pre-clone
QIBM_QMRDB_POSTCLONE | PSTC0100 | Db2 Mirror ASP post-clone
QIBM_QMRDB_ROLE_CHG | RCHG0100 | Db2 Mirror replication role change
QIBM_QMRDB_STATE_CHG | SCHG0100 | Db2 Mirror replication state change
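Registering an exit program against one of these points follows the usual IBM i pattern; a sketch (library and program names are hypothetical):

    /* Call MYLIB/STATECHG whenever the Db2 Mirror replication state changes */
    ADDEXITPGM EXITPNT(QIBM_QMRDB_STATE_CHG) FORMAT(SCHG0100) +
               PGMNBR(1) PGM(MYLIB/STATECHG)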
Serviceability
Compare
Compare Results
Alerts
QSYSOPR Messages
— Db2 Mirror product state change messages sent to QSYSOPR:
— CPDC905 - Db2 Mirror Network Redundancy Group (NRG) link <ip address> is active.
— CPDC906 - Network Redundancy Group (NRG) link <ip address> is inactive.
— CPIC901 - Db2 Mirror replication is suspended for ASP group IASP33P. Reason code <reason code>.
— CPIC902 - Db2 Mirror replication is suspended for ASP group <iASP name or *SYSBAS> due to an error. Reason code <reason code>.
— CPIC903 - Db2 Mirror replication is suspended for maintenance operations.
— CPIC904 - Db2 Mirror replication is active for ASP group <iASP name or *SYSBAS>.

— Db2 Mirror product failure messages sent to QSYSOPR:
— CPD3E43 - DRDA/DDM Db2 Mirror server error occurred with reason code <reason code>.
— CPF32CD - Db2 Mirror resynchronization failed for job <job name or *ALL>.
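One way to act on these messages automatically is a message watch; a sketch, assuming a user exit program MYLIB/MIRRORWCH (hypothetical) coded to the watch exit interface:

    /* Start a watch session that calls the exit program when Db2 Mirror */
    /* suspend messages arrive on the system operator message queue      */
    STRWCH SSNID(DB2MWATCH) WCHPGM(MYLIB/MIRRORWCH) +
           WCHMSG((CPIC901) (CPIC902)) WCHMSGQ((*SYSOPR))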
Specific Object Replication Details
Db2 Mirror – Database "must have" knowledge

1. DDS and SQL DDL files are supported
2. Native DB I/O (e.g. RPG) and SQL are supported
3. Mirrored database files contain the same data, at the same RRNs
4. Journaling is optional, but encouraged
5. Record-level operations against mirrored files will yield identical results, regardless of whether the source or target is being used
6. Database DDL and I/O operations are synchronous
Database trigger considerations

— Configured via ADDPFTRG/CHGPFTRG and ALTER/CREATE TRIGGER
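For reference, the native-side configuration looks like this (file and program names are hypothetical):

    /* Attach an after-insert trigger program to a physical file */
    ADDPFTRG FILE(APPLIB/ORDERS) TRGTIME(*AFTER) TRGEVENT(*INSERT) +
             PGM(APPLIB/ORDTRG)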
Output Queue (*OUTQ) Objects

Objects of type *OUTQ are replicated synchronously
— OUTQs are kept identical across both systems
— Creates, updates, and deletes are blocked if:
• Initiated on the secondary system while Db2 Mirror is interrupted
• A required object is not available on both systems:
o DTAQ
o MSGQ
o WSCST
— Customers configure which to replicate
Characteristics of Spooled Files

Spooled files have unique properties for Db2 Mirror:
— All the spooled data originates from a single system
— Often generated by a long-running process
— Can be quite large
— Usually not useful if incomplete
— A limited number of spooled files is allowed on a system
— Duplicate spooled files are not allowed
— Not true objects, in an IBM i sense
Spooled File Replication

— Spooled files are replicated near-synchronously
• At close, the spooled file is added to the OTL as deferred
• A system job resyncs spooled files to the target system at configurable intervals
• The order of spooled files cannot be guaranteed to be the same on both systems
• Generation of spooled files is never blocked
o Spooled files are added to the OTL on both systems when replication is suspended
o Resynchronized both ways when replication is resumed
— Replicated to the same library/output queue on the target system
Spooled File Status

— Replicated spooled files are restored in *HLD status
• Prevents processing of replicated files until they are released
o Ignored by active writers
o No entries added to an associated DTAQ
• On failover, spooled files in *HLD must be released to be processed
— Once processed, replicated copies are set to *SAV or *FIN status
— *RLS, *HLD, *PND, *WTR, and *PRT status are not replicated
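After a failover, releasing a held replicated spooled file uses the standard command; a sketch with hypothetical file and job names:

    /* Release a replicated spooled file so writers can process it */
    RLSSPLF FILE(INVOICE) JOB(123456/APPUSER/PRTJOB) SPLNBR(*LAST)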
Considerations for replicating spooled files
— Due to the large amount of potential data transfer, care should be taken to limit replication of spooled files to those needed
• We help by permanently excluding the following output queues:
QUSRSYS/QEZJOBLOG
QUSRSYS/QEZDEBUG
QGPL/QPRINT
• We help by excluding the following output queues by default:
All *OUTQs in QUSRSYS
All *OUTQs in QGPL
These *OUTQs may be explicitly included in replication by name.
• When including a library, users should exclude unneeded OUTQs at the same time; RCL configuration allows multiple changes to be submitted as a group
Synchronously Replicated Authority Changes
The following will be replicated synchronously (synchronous changes occur at the same time on both systems; they either succeed on both, or fail on both):
• Authority & ownership changes to database file (table) objects, including securing the file with an authorization list
• Using e.g. GRTOBJAUT, RVKOBJAUT, CHGOBJOWN, CHGOBJPGP, ...
• Authority & ownership changes to any other supported object via database (SQL) operations
• Authority changes to IFS objects on the hardware-mirrored IASP
• Creation of an *AUTL object; adding users to/removing users from an *AUTL; changing ownership of an *AUTL
• Change of the object audit attribute via CHGOBJAUD
• Change of user profile parameters PASSWORD and UID/GID
• Creation of a user profile
• The user profile is created on both systems with the same attributes, including UID and GID
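For example, a grant like the following (object and user names are hypothetical) takes effect on both nodes, or on neither:

    /* Grant *CHANGE on a replicated table; with Db2 Mirror active */
    /* the authority change is applied synchronously on both nodes */
    GRTOBJAUT OBJ(APPLIB/ORDERS) OBJTYPE(*FILE) USER(WEBUSER) AUT(*CHANGE)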
Authority Changes Not Supporting Replication
The following will not be replicated:
• Authority changes to objects not supported by Db2 Mirror, or to objects which Db2 Mirror is configured to exclude
• Cryptographic and digital certificate management capabilities
• e.g. master keys, key store updates, and certificate store info
• Configuration for functions like Kerberos / EIM
• Plus other considerations like keytab files, EIM relationships, etc.
Replicated Objects that can change while in the blocked state
— User profiles
— Authorization lists
— Function usage information
— Environment variables
— System values
— Spooled files
For more information on specific behavior:https://www.ibm.com/support/knowledgecenter/ssw_ibm_i_74/db2mi/db2mobjblocked.htm
IASPs
Db2 Mirror IASP Support

— IASPs are optional for Db2 data
— IASPs are required for IFS concurrent sharing
• PowerHA is required to switch IFS IASPs
— DB IASPs have their own replication rules and Object Tracking List
IASP Support
Switch over IFS IASPs
Disaster Recovery
Topology Options – DR
— As long as one local Db2 Mirror node is up, production will remain at the local site.
— If both local nodes are unavailable, then a switch to the DR site can be initiated.
— The default is that a switch to DR requires system administrator intervention, although a policy could be defined to initiate the switch automatically.
— Only one node will be activated at the DR site, and then a Db2 Mirror resync will be started to the 2nd DR node.

[Diagram: a local Db2 Mirror pair (RoCE) replicating to a DR-site Db2 Mirror pair via System Mirror replication or hardware replication.]
Topology Options – Common DR Options

[Diagrams: a Db2 Mirror pair (RoCE, < 200 m) with Metro or Global Mirror to a DR site; the DR target may be a single node or a second Db2 Mirror pair (RoCE, < 200 m).]

IFS switching is only supported on DS8K with HyperSwap with 3 storage controllers.
Logical Replication

— Logical replication solutions have the option to move the source node between the Db2 Mirror nodes and go to a single DR node.

[Diagram: a Db2 Mirror pair (RoCE) logically replicating to a single DR node.]
Logical Replication

— Logical replication solutions also have the option to move the source node between the Db2 Mirror nodes and go to a Db2 Mirror pair.

[Diagram: a Db2 Mirror pair (RoCE) logically replicating to a second Db2 Mirror pair (RoCE).]
Software Requirements and Licensing
Software Required for Db2 Mirror Pair
— 5770SS1 Option 3 (Extended Base Directory Support)
— 5770SS1 Option 12 (Host Servers)
— 5770SS1 Option 22 (ObjectConnect)
— 5770SS1 Option 26 (DB2® Symmetric Multiprocessing) - optional
— 5770SS1 Option 30 (Qshell)
— 5770SS1 Option 34 (Digital Certificate Manager)
— 5770SS1 Option 41 (High Availability Switchable Resources)
— 5770SS1 Option 48 (IBM Db2 Mirror)
— 5770JV1 *BASE (IBM Developer Kit for Java)
— Option 16 (Java SE 8 32 bit)
— Option 17 (Java SE 8 64 bit)
— 5733SC1 *BASE (IBM Portable Utilities for i)
— Option 1 (OpenSSH, OpenSSL, zlib)
— 5770DG1 *BASE (IBM HTTP Server for i)
— 5770DBM *BASE (IBM Db2 Mirror for i)
— Option 1 (Db2 Mirror Enablement)
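A quick way to verify that a prerequisite option is installed; a sketch using Option 48 as the example:

    /* Check that a licensed program option is correctly installed */
    CHKPRDOPT PRDID(5770SS1) OPTION(48)
    /* Or list all installed software resources */
    DSPSFWRSC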
Open Source Packages Required for Setup
— python2-six-1.10.0-1.ibmi7.1.noarch.rpm
— python2-itoolkit-1.5.1-1.ibmi7.1.ppc64.rpm
— python2-ibm_db-2.0.5.8-1.ibmi7.1.ppc64.rpm
— cloudinit-1.0-0.ibmi7.1.ppc64.rpm
Software Required for Db2 GUI Node
— 5770SS1 Option 3 (Extended Base Directory Support)
— 5770SS1 Option 12 (Host Servers)
— 5770SS1 Option 22 (ObjectConnect)
— 5770SS1 Option 26 (DB2® Symmetric Multiprocessing) - optional
— 5770SS1 Option 30 (Qshell)
— 5770SS1 Option 34 (Digital Certificate Manager)
— 5770SS1 Option 41 (High Availability Switchable Resources)
— 5770SS1 Option 48 (IBM Db2 Mirror)
— 5770JV1 *BASE (IBM Developer Kit for Java)
— Option 16 (Java SE 8 32 bit)
— Option 17 (Java SE 8 64 bit)
— 5733SC1 *BASE (IBM Portable Utilities for i)
— Option 1 (OpenSSH, OpenSSL, zlib)
— 5770DG1 *BASE (IBM HTTP Server for i)
— 5770DBM *BASE (IBM Db2 Mirror for i)
— Option 1 (Db2 Mirror Enablement)
Amen, and thank you for your attention