HA/DR solutions for IBM i
Edward Grygierczyk
2019 Common Polska | 13.05.2019 | Lidzbark Warmiński
2019 IBM Systems Technical University
Topics
— What's new with V7.3 TR6 and V7.4
— Power Systems HA/DR solution family — characteristics/positioning
— PowerHA section, examples
— VM Restart section (VM Recovery Manager, FSR)
— Positioning considerations
— Licensing/pricing
— iCBU and ECBU
— Cloud Storage for IBM i
— Db2 Mirror section
What’s new for HA/DR products 1H 2019 for IBM i
PowerHA for i V7.3 Enterprise Edition: DS8000 HyperSwap + a Global Mirror link
• Integration of PowerHA IASP-based replication with IBM Copy Services Manager for DS8000
• Enables the DS8000 HyperSwap configuration with a Global Mirror link
• The CLI will continue to be supported for the foreseeable future
• Automated adding of monitored resource entries for object creation, deletion, and restore

New with IBM i V7.4: Db2 Mirror for IBM i
• Enables continuous availability via active/active synchronous Db2 replication

VM Recovery Manager 1.3 for DR (VMR DR)
• The VMR GUI now monitors and manages DR
• For IBM i customers, VMR DR can be managed via the GUI; no CLI interaction on AIX is required

BRMS 7.3 TR6 highlights
• Turn-key cloud control group deployment that enables clients to easily set up custom control groups for cloud
• Backup of changes to journaled objects is now the default setting; that is, the default for the SAVLIBBRM command has been changed to OBJJRN(*YES)
• Enhanced log information that uses the system timestamp to preserve message order when messages are logged within the same second; messages are displayed using DSPLOGBRM
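A minimal sketch of what the new default means in practice (library and device names are hypothetical):

    /* Save a library with BRMS; OBJJRN(*YES) is now the default, */
    /* so changes to journaled objects are included in the save   */
    SAVLIBBRM LIB(APPLIB) DEV(*MEDCLS) OBJJRN(*YES)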
Power Systems HA/DR solution family

Cognitive Systems HA/DR
• Active/passive HA/DR: PowerHA for AIX, PowerHA for Linux, PowerHA for IBM i
• Active/inactive VM restart: VM Recovery Manager HA, VM Recovery Manager DR
• Active/active HA: Db2 pureScale, Db2 Mirror for IBM i

PowerHA System Mirror
– Covers planned and unplanned outages for both software and hardware with automation
– Solutions for both HA & DR
– Advanced capabilities such as HyperSwap
– Operating system based technology
– RTO via application restart
– RPO sync or async mode
– N+1 licensing

VM Recovery Manager
– Primarily for planned and unplanned hardware outages
– Manage and monitor large numbers of VMs (LPARs)
– Relatively easy to implement and manage
– Operating system independent (supports AIX, IBM i, and Linux)
– RTO via reboot (IPL)
– RPO sync or async mode
– N+0 licensing

Db2 Mirror & Db2 pureScale
– Db2 pureScale: active/active via DB cluster
– Db2 Mirror: active/active DB via synchronous replication
– Covers planned and unplanned outages for both software and hardware with automation
– Solutions for HA, continuous availability
– RTO & RPO zero
– N+N licensing
High Availability Topology Classification

Active/Active Clustering (Application HA/DR)
• Definition: application clustering; applications in the cluster have simultaneous access to the production data, therefore no application restart upon an app node outage. Certain types enable read-only access from secondary nodes.
• Outage types: SW, HW, HA, planned, unplanned; RTO 0; limited distance
• OS integration: inside the OS
• RPO: sync mode only
• RTO: 0
• Licensing*: N+N licensing
• Industry examples: Db2 Mirror, Oracle RAC, pureScale

Active/Passive Clustering (Application HA/DR)
• Definition: OS clustering; one OS in the cluster has access to the production data, with multiple active OS instances on all nodes in the cluster. The application is restarted on a secondary node upon outage of a production node.
• Outage types: SW, HW, HA, DR, planned, unplanned; RTO > 0; multi-site
• OS integration: inside the OS
• RPO: sync/async
• RTO: fast (minutes)
• Licensing*: N+1 licensing
• Industry examples: PowerHA, Red Hat HA, Linux HA

Active/Inactive (VM Restart HA/DR)
• Definition: VM clustering; one VM in a cluster pair has access to the data, one logical OS, two physical copies. OS and applications must be restarted on a secondary node upon a primary node outage event. LPM enables the VM to be moved non-disruptively for a planned outage event.
• Outage types: HW, HA, DR, planned, unplanned; RTO > 0; multi-site
• OS integration: OS agnostic
• RPO: sync/async
• RTO: fast enough (VM reboot)
• Licensing*: N+0 licensing
• Industry examples: VMware, VMR HA, LPM

— Illustrations represent two-node shared-storage configurations for conceptual simplicity
— There are many other topologies and data resiliency combinations

* N = number of licensed processor cores on each system in the cluster. For example, in a two-node cluster with 8 licensed cores per system, N+N licensing requires 16 core licenses, N+1 requires 9, and N+0 requires 8.
PowerHA System Mirror

• The IBM PowerHA solutions are based on shared storage clustering
− PowerHA for AIX
− PowerHA for i
− PowerHA for Linux
• For IBM i the shared storage container is called an IASP; this is where the production data and application libraries reside
• The data in the IASP can be switched between systems and/or replicated for geographic dispersion
• Enables HA for all outage types: software, hardware, HA, DR, backup operations, and non-disruptive upgrades and software maintenance
• Best price/performance option
• Complete HA/DR operational automation
• This configuration is the PowerHA foundational building block
• HA via switching production ownership of shared storage
• Administrator switches users and production between nodes in the cluster with a single command
• No remote journaling required; journals stay local
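A minimal sketch of that single-command switchover, assuming a cluster and device CRG are already configured (names are hypothetical):

    /* Switch the cluster resource group's primary node; users and */
    /* production move to the backup node in the cluster           */
    CHGCRGPRI CLUSTER(PRODCLU) CRG(PWRHACRG)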
Power Systems solutions for HA/DR: PowerHA

PowerHA clustering for HA and DR
• Best-case recovery point objective (RPO)
• Best-case recovery time objective (RTO)
• All outage types covered: planned and unplanned, software and hardware
• Capacity Backup (CBU) for Enterprise Systems provides huge savings
• The HA and/or DR server requires only one PowerHA LPP

PowerHA Standard Edition
• Primarily an AIX/Linux configuration; IBM i shops more typically go multi-site

PowerHA Enterprise Edition
• AIX and IBM i accounts, multi-site clustering, typical of IBM i configs

PowerHA Enterprise Edition (HyperSwap)
• Large IBM i accounts, DS8000 only
• HyperSwap + Global Mirror link
• Primary use case: banking clients

PowerHA for Linux
• Primary use case: SAP HANA & NetWeaver
• SAP System Replication for the SAP DB
• PowerHA for app servers and SAP HANA coordination
PowerHA i Enterprise Edition multi-target HyperSwap cluster

DS8000 three-site PowerHA for i 7.3 TR6 Enterprise Edition HyperSwap cluster
• Three systems, three sites, three real-time copies of the production IASP data in a single PowerHA EE cluster
• The Metro Mirror/HyperSwap section of the cluster provides continuously available storage (active/active)
• The Global Mirror link provides the disaster recovery section

[Diagram: the application runs on the HyperSwap section of the cluster, with Metro Mirror between site one and site two; a Global Mirror/Global Copy link replicates to site three.]
Classic three system two-site PowerHA i Enterprise Edition cluster
• Four nodes in this illustration: nodes 1, 2 & 3 are production nodes, node 4 is a FlashCopy node
• All nodes have an active IBM i, enabling concurrent software updates without disrupting production
• Switched LUN configuration in the data center; applications and data go into the IASP, with local journals
• The system ASP (SYSBAS) contains the monitored resource entries (MREs)
• The Administrative Domain keeps MREs in sync between nodes 1, 2, and 3 (note: no logical replication being used)
• Node 4 is used for FlashCopy & BRMS operations to eliminate your backup window (see the sketch after the diagram)
⎻ We can flash the IASP, or the IASP + Admin Domain objects, or full system flash (IASP + ASP 4)
[Diagram: IBM i 1, 2, and 3 each run the application against a production IASP on V9000 storage — LPAR1 in the production data center, LPAR2 at the disaster recovery site, linked by Metro Mirror or Global Mirror; system ASPs 1–3 are kept in sync by the Admin Domain; IBM i 4 runs BRMS against a FlashCopy IASP, saving to cloud or tape.]
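A sketch of starting the node 4 flash for backup operations, assuming the ASP copy descriptions already exist (session and copy description names are hypothetical):

    /* Start a FlashCopy session pairing the production and flash */
    /* copy descriptions; node 4 then varies on the flashed IASP  */
    /* for the BRMS saves                                         */
    STRASPSSN SSN(FLASH1) TYPE(*FLASHCOPY) ASPCPY((PRODCPYD FLSHCPYD))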
PowerHA for i HA/DR configuration examples

— PowerHA, simple two-node shared storage data center cluster (Standard Edition)
• Each node has an active IBM i; system A hosts the application and users, while system B can be used to conduct software maintenance
• The cluster switches users and application production to system B for planned or unplanned events
• The Admin Domain synchronizes monitored resource entries (MREs)

— PowerHA, two-site four-node cluster (Enterprise Edition)
• Metro Mirror mirrors the IASP data synchronously with the application state, therefore distance is limited
• Global Mirror replicates the IASP data asynchronously to the application state, therefore distance is unlimited
• The cluster switches users and application production to system B, C, or D for planned or unplanned outage events

[Diagrams: a two-node shared storage data center cluster, and a two-site four-node cluster with Metro Mirror within site 1 and Global Mirror to site 2.]
PowerHA for i HA/DR two-site cluster configuration examples

— PowerHA, two-system two-site cluster (sync, Enterprise Edition)
• Metro Mirror replication of the IASP data is synchronous with the application state, creating two identical copies
• Distance is typically under 40 km; FlashCopy & BRMS can be done at site 1, site 2, or both
• The cluster moves users and application production to system B for planned or unplanned events, and reverses the replication direction

— PowerHA, two-system two-site cluster (async, Enterprise Edition)
• Global Mirror replicates the IASP data asynchronously to the application state, therefore distance is unlimited
• BRMS and FlashCopy can be used at site 1, site 2, or both
• The cluster moves users and application production to system B for planned or unplanned outage events, and the replication direction is reversed

[Diagrams: two two-system two-site clusters, one linked by Metro Mirror and one by Global Mirror.]
PowerHA with internal disk and geomirroring

— PowerHA geomirror cluster (typically done with internal disk)
— Memory pages are replicated via IBM i mirroring over IP to local and remote IASPs in real time
— Offline backup followed by source-side/target-side tracking change resynchronization (consider a V5000 with FlashCopy at the target site for zero resync time after a save operation)
— Both bandwidth and network quality are important
— Synchronous mode up to 40 km; production and target should be identical for maximum throughput
— Asynchronous mode, unlimited distance; production and target data ordered and consistent

[Diagram: production and target partitions each hold an IASP (DB2, IFS, journals) linked by geomirror; monitored SYSBAS objects on both sides are kept in sync by Admin Domain synchronization.]
PowerHA with geographic mirroring backup operations

[Diagram: PROD (source) and HA (target) LPARs, each with SYSBAS and an IASP, connected over the network; during the detach-with-tracking window no data replication occurs, followed by a partial resync.]

1. Detach with tracking
• Replication from the source is suspended; changes to production data are tracked
2. Once backups are completed, partial resync
• Tracked changes are replicated from source to target
• You should conduct backup operations during quiet times to minimize the partial resync time
• No HA or DR failovers are possible until that resync has completed

Consider implementing SAN storage at the target side and using FlashCopy to eliminate resync time.
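A minimal sketch of the detach/reattach cycle around a save (the session name is hypothetical):

    /* Detach the mirror copy; changes are tracked on the source        */
    CHGASPSSN SSN(GEOSSN) OPTION(*DETACH)
    /* ... vary on the detached copy at the target and run the save ... */
    /* Reattach; only the tracked changes are resynchronized            */
    CHGASPSSN SSN(GEOSSN) OPTION(*REATTACH)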
The administrative domain

• Table 3-1 in Preparing for PowerHA (SG24-8400-00) lists the MREs that are kept in sync across the nodes in the cluster
o Note that logical replication is not required

[Diagram: application data and local journals reside in the IASP; the monitored objects in each node's SYSBAS are the monitored resources kept in sync by the administrative domain.]
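For illustration, adding a user profile as a monitored resource entry might look like this (domain and profile names are hypothetical):

    /* Monitor the WEBUSER profile so the admin domain keeps it */
    /* synchronized across the cluster nodes                    */
    ADDCADMRE ADMDMN(PRODDMN) RESOURCE(WEBUSER) RSCTYPE(*USRPRF)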
VM Recovery Manager

• VM Recovery Manager for HA (VMR HA): a low-cost, simple HA solution for AIX, IBM i, and Linux
• VM Recovery Manager for DR (VMR DR): a low-cost, simple DR solution for AIX, IBM i, and Linux
• PowerHA AIX data center cluster with VMR DR to automate DR operations
• VMR DR provides a simple automated DR solution for SAP HANA deployments
• VMR DR for MSPs and CSPs

VM Recovery Manager: low cost, easy to use
• Supports AIX, IBM i, and Linux
• Supports DS8K, SVC, EMC, and Hitachi storage
• Non-disruptive disaster recovery compliance testing
• Ideal for DRaaS providers
VM Recovery Manager for DR – automated two-site disaster recovery

VM Recovery Manager for DR (VMR DR) – monitor and manage your DR operations from a GUI
• VMs are replicated at the storage level in real time (IASP not required)
• Recovery is via VM restart at the DR site (VM restart is essentially an IPL)
• DR operations are operator initiated and fully automated via an intuitive GUI
• The orchestrator is KSYS; it runs on an AIX partition at the DR site
⎻ KSYS interacts with the HMC and storage
• VMR DR provides a DR rehearsal mode, non-disruptive to production

[Diagram: IBM i application VMs at each site with storage replication between the sites, orchestrated by KSYS.]
Failover Rehearsal: non-disruptive disaster recovery testing

• A point-in-time copy (i.e. FlashCopy) is created to start VMs on the backup system for DR testing or backup operations
• Enables IT operations to validate disaster recovery compliance without disrupting production
• Network isolation needs to be established by the administrator (the admin can design and use test VLANs for test VMs)

[Diagram: at Site 1 (active), Host 1 runs LPARs behind a VIOS pair against Disk Group 1; the disk group is mirrored from S1 to S2 at Site 2 (backup), where Host 2 starts the same LPARs behind its VIOS pair from a point-in-time copy S2C.]
Full System Replication (FSR) for i – Automated Failover Operations

[Diagram: a production LPAR and controlling LPAR at the production site, and a production LPAR copy and controlling LPAR at the DR site, connected by remote copy — DS8K Metro Mirror or Global Mirror, or SVC-based Metro Mirror or Global Mirror with Change Volumes.]

Automated with the FSRC SWCSE command:
• End applications/jobs on the production site; monitor/wait for all jobs to end
• Issue the shutdown command; monitor/wait for shutdown to complete
• Log in to storage as admin; verify/wait for disk synchronization
• Log in to the HMC at the DR site as superadmin; IPL the DR site partition in manual mode
• Modify "autostart" objects (lines, interfaces, devices, applications) to not start
• Correct comm resources, storage resources, IP interfaces, TCP routes, etc.; apply license keys
• Issue the switch operation and complete startup processes

CONTROLLING PARTITIONS: either controlling partition may control the switchover, failover, or detach, depending on which site is active.

MANUAL STEPS AND LOG-INS NO LONGER REQUIRED!
Power Systems CBU for enterprise systems (ECBU)

Offering for
• Power Systems E880, E880C, E870, E870C, and E980

ECBU offering features:
• Deeply discounted processor nodes matching the installed production server processor nodes
• No-charge, annually renewable active standby memory = 365 × N × 32 GB, where N is the number of active mobile cores on the production system
• Mobile processor activations are transferred from production to ECBU via Enterprise Pool transfers
• Registration of the primary system and ECBU is required; primary and ECBU must be within the same enterprise

Offering requirements overview
– An E980 ECBU may support any E870/E870C, E880/E880C, or E980 primary (production) system (9119-MME/MHE or 9080-MME/MHE/M9S)
– Only one ECBU to one production server for registration and entitlement purposes, but multiple production servers to one ECBU is allowed
– Only one ECBU to the primary system allowed; IBM i customers can use the iCBU for an additional CBU in a three-site config
– The primary can be a new or installed box; the CBU must be a new box
– A minimum of one entitlement of AIX or IBM i & PowerHA on the CBU, or, if an alternative HA/DR solution is used, as many IBM i or AIX entitlements as needed to support the workload (such as an IP-based replication workload)
– 8 processor static activations on the CBU (no more, no less)
– A minimum of 25% of DIMM memory active on the CBU
– The no-charge memory ECOD days must be activated upon install of the CBU system and remain active for 365 days

Processor & entitlement transfer
• All transferable entitlements must originate on the primary system and may not run concurrently on the primary system and the ECBU system
• Subsequent to the initial workload deployment, some subset of production partitions may be moved to the ECBU system for workload balancing, etc.
• The total number of processor entitlements running production across both servers cannot exceed the original total licensed entitlements
Traditional CBU Licensing example – two system, one customer topology

Planning
— CBU allows PowerHA license entitlement fail-over from the registered production server
• Minimum 1 entitlement required on the CBU box*
• The CBU server allows the temporary transfer of entitlements from the primary server for non-concurrent usage on the CBU server
• Round up when using partial processors: 3.5 processors = 4 entitlements
• One customer

Example
— No HA/DR required for partition 1
• No PowerHA licenses
— HA required for partitions 2 and 3
• All processors in the production server partitions 2 and 3 are licensed for PowerHA
• One key, 8 entitlements
• The license key will be a permanent key installed on partitions 2 and 3
— A single processor is licensed on the CBU server
• One key, one entitlement
• The non-OS LPPs will be temporary keys for 8 cores, good for two years, installed on partitions 2 and 3

* Logical replication DR solutions generate workload on the CBU that can range from 30% to 50% of the total workload on the primary. Those additional cores must be permanently licensed, with no out-of-compliance messages, prior to a failover operation.
IBM Cloud Storage Solutions for i (ICC)

Place your IBM i data in cloud or FTP storage.

Two independent modes:
⎻ BRMS to cloud for backup operations
⎻ GUI dashboard for storing files in the cloud (think of Box-like usage cases)

[Diagram: IBM i connects over TCP/IP to cloud storage (virtual tape) or an FTP server.]
Cloud Storage Solutions for i (5733-ICC)

• Enhancements – announced December 4, 2018
– Supported cloud storage options
• IBM Cloud Object Storage
• IBM Cloud (formerly IBM Bluemix and IBM SoftLayer) (S3 protocol)
• FTP (on IBM i)
• SoftLayer "legacy" (Swift protocol)
• Amazon AWS S3
– GUI
• View your storage locations and contents; easily identify files/directories
• Easily perform upload/download operations
• PTFs for the latest function
– PTF SI67483 contains the Cloud Storage Solutions web GUI
– PTF SI68368 adds support for IBM Cloud Object Storage
IBM Db2 Mirror for i: Enables Continuous Availability
- High-speed synchronous replication of Db2 for i (data center solution)
- Access Db2 objects from either LPAR
- Application availability enablement
- Two nodes read and write to the same DB files
- Enables quickly moving all work to one node, for planned maintenance or node failure
- Enables business continuity for disruptive system upgrades
- Nodes can be at different OS levels
- Nodes can be on different Power hardware generations
- Rolling upgrades for no downtime
- Roll a node back a release with minimal impact if active/active applications are deployed

Requires POWER8 or later and IBM i 7.4
New IBM i LPP 5770DBM

[Diagram: the application spans two nodes linked by Db2 Mirror.]
Db2 Mirror – Active Active

Operating system synchronous replication: a synchronous database update occurs on both nodes, in SYSBAS or an IASP.

[Diagram: Node 1 and Node 2, each running the application against its database, connected via RoCE; a record added on one node ("Fred", age 24) appears in the database on both.]
Db2 Mirror – Database Supported Objects

[Diagram: the application runs separately on each node; the Node 1 and Node 2 databases are connected via RoCE.]

Database replication eligible objects

Native (DDS / record level access):
• Database physical & logical files

SQL (set based access):
• Alias
• Function
• Global variable
• Index
• Procedure
• Schema
• Sequence
• SQL package
• Table
• Trigger
• User-defined type
• View
• XML schema repository

Included with file support:
• Row permission
• Column mask
• Temporal table
• Constraint
• Etc.
Db2 Mirror – Other Supported Objects

Objects can be in either SYSBAS or IASPs.

[Diagram: Node 1 and Node 2, each with app, database, and an IASP, connected via RoCE.]

— Other objects
• User profiles
• Authority
• Ownership
• Security
• PGM/SRVPGM
• Data areas
• Data queues (DDL only)
• SYSVALs
• ENVARs
• LIB
• JOBD
• Journals
• Files (also has DDL-only option)

— Special handling
• OUTQ / spool
• Job queue
IFS Support

• Requires an IASP
• IFS accessible on both nodes (R/W)
• Requires PowerHA
• The file system automatically 'mutates' when the storage is switched
Db2 Mirror – Active Active, Web Clients

The application layer connects with either JDBC or a load balancer.

[Diagram: web clients connect through the application layer to Node 1 and Node 2, which replicate via RoCE.]
Db2 Mirror – Active Passive

[Diagram: run production workloads on Node 1 and run queries and reports on Node 2; the two databases replicate via RoCE.]
Db2 Mirror – What makes it different

— New integrated IBM i synchronization technology
— Does not leverage any existing availability technology to provide continuous availability
• But does work with existing technology

[Diagram: contrast with logical replication (journal-based, over a normal network connection) and physical replication; in each approach the same records ("Fred", "Sally") end up on both systems.]
DR Solutions Built on Top of Db2 Mirror for IBM i

[Diagram: a Db2 Mirror pair linked via RoCE (< 200 m apart), with Metro or Global Mirror replication to a DR site.]
Db2 Mirror GUI

• The GUI runs on IBM i
• The GUI can run on the Db2 Mirror nodes
• The GUI can run outside of the Db2 Mirror nodes and manage multiple pairs
• http://systemname:2006/Db2Mirror
SQL Services
ACS Insert from Examples
Performance Expectations

• With synchronous replication the complete path length will increase, since an action may drive I/O on both nodes in order to finish; path length could increase by roughly 2–3×
• The ability to run transactions on both nodes mitigates the per-transaction overhead, with a target of achieving equal or greater transactional throughput
• Read workloads will not be impacted, since reads do not have to be replicated
• Single-threaded or serial I/O workloads will be the most impacted

[Diagram: Node 1 and Node 2 databases replicating via RoCE.]
Time Server Topology

— Internal time server
— External time server

[Diagrams: a Db2 Mirror pair (RoCE) using one of its own nodes as the time server, and a pair synchronized to an external time server.]
Communication Hardware

Four adapter options:
- PCIe3 2-port 10 Gb NIC & RoCE SR/Cu adapter (FC EC2R and EC2S; CCIN 58FA)
- PCIe3 2-port 25/10 Gb NIC & RoCE SFP28 adapter (FC EC2T and FC EC2U; CCIN 58FB)
- PCIe3 2-port 100 GbE NIC & RoCE QSFP28 adapter (FC EC3L and EC3M; CCIN 2CEC)
- PCIe4 2-port 100 GbE RoCE x16 adapter (FC EC66 and EC67; CCIN 2CF3)

Max cable length = 100 m
Optional RoCE switch
POWER9 enables SR-IOV
Network Redundancy Groups (NRG)

• Network Redundancy Groups are a logical group of physical ports
• Up to 16 links can form an NRG
• Ability to prioritize different types of traffic onto separate physical links
• The failover domain is the entire group of ports
Db2 Mirror Setup
5 separate NRG categories to isolate traffic
Db2 Mirror Setup
The Load Balance Link Count tells the NRG how many active links to use. The default is 1, up to a maximum of 16 links.
Db2 Mirror Setup
This configuration has two physical links; with a Load Balance Link Count of 1, only one is active.
Db2 Mirror Setup
The priority influences which link is active for the NRG
Db2 Mirror Setup
This configuration has two physical links; a Load Balance Link Count of 2 makes both links active.
Db2 Mirror Network Statistics
Default Inclusion State for Replication Rules
NOTE: This can only be chosen at setup time or reconfiguration time.
Replication List Rules
Add Rules for existing objects and objects that don’t exist yet
Add Rules for an object type or a specific object name
Replication List Rules
Set the rule to include or exclude the object/library from replication
Inspect what the Rules look like applied to the System
System Defined Rules
System Defined Rules are predefined and cannot be changed
Pending Rules
Create a group of rules before applying them to the system
Visualize Pending Groups
Detecting Errors
— Nodes are designated as 'Primary' or 'Secondary' to indicate which node is preferred to 'track'.
— HMCs are used for failure detection of the partner node, so that the Secondary can automatically take over as the Primary and begin tracking, allowing Db2 transactions to continue.
— The Secondary side will block changes to replicated Db2 objects.
Detecting Errors - Quorum
— Additional nodes are added to the cluster to help determine the Primary and Secondary roles in the event that the partner node is down when a node IPLs.
— The quorum data is shared among all nodes in the cluster and stores state information.
— Typically, if there is a DR configuration, those nodes would serve as the additional nodes that store quorum data.
Detecting Errors – State Change

— If the Secondary fails (IPLs, MSD, or goes to restricted state):
— The Primary will begin tracking replicated object changes and the application will continue to run.
— The Secondary will be in a 'blocked' state and will not allow changes to replicated objects until the two nodes have resumed mirroring.
Detecting Errors – State Change
— If the Primary fails (crash/MSD):
— If the Secondary can connect to the HMC and determine the Primary has failed, the Secondary will take over as the Primary and begin tracking.
— If the Secondary cannot detect the failure, it will remain blocked. The user may choose to force the Secondary to become the Primary.
Detecting Errors – State Change
— If the Primary does a normal IPL or goes to restricted state:
— The Secondary will remain blocked, and the Primary will track while in restricted state or until the IPL completes.
Detecting Errors – State Change
— If the network fails:
— If there is no communication between the two nodes over the RoCE network, the Primary will continue to track replicated objects and the Secondary will block changes to replicated objects until mirroring is resumed.
Active Replication
The status of the system when one node goes offline
Suspend Mirroring from the GUI
Tracking / Blocked State
Object Tracking List
Mirror Resume Progress
History of Previous Resynchronizations
Resume Automatically
— The resume automatically property defaults to yes. This means that if mirroring was suspended by a system-detected event, such as a communication failure or crash, the mirror will resume once the failure is resolved.
— If the user suspends mirroring, then the user has to explicitly call resume.
Resync Parallelism
— If 5770SS1 Option 26 (DB2® Symmetric Multiprocessing) is installed, you can take advantage of resynchronizing multiple objects at the same time.
Suggested Priority Example
Managing and Monitoring
— Exit points for several of the state transitions:

Exit Point | Exit Point Format | Description
QIBM_QMRDB_PRECLONE | PREC0100 | Db2 Mirror ASP pre-clone
QIBM_QMRDB_POSTCLONE | PSTC0100 | Db2 Mirror ASP post-clone
QIBM_QMRDB_ROLE_CHG | RCHG0100 | Db2 Mirror replication role change
QIBM_QMRDB_STATE_CHG | SCHG0100 | Db2 Mirror replication state change
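Registering an exit program against one of these points follows the usual IBM i pattern; a sketch (library and program names are hypothetical):

    /* Call MYLIB/STATECHG whenever the Db2 Mirror replication state changes */
    ADDEXITPGM EXITPNT(QIBM_QMRDB_STATE_CHG) FORMAT(SCHG0100) +
               PGMNBR(1) PGM(MYLIB/STATECHG)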
Serviceability
Compare
Compare Results
Alerts
QSYSOPR Messages
— Db2 Mirror product state change messages sent to QSYSOPR:
— CPDC905 - Db2 Mirror Network Redundancy Group (NRG) link <ip address> is active.
— CPDC906 - Network Redundancy Group (NRG) link <ip address> is inactive.
— CPIC901 - Db2 Mirror replication is suspended for ASP group IASP33P. Reason code <reason code>.
— CPIC902 - Db2 Mirror replication is suspended for ASP group <iASP name or *SYSBAS> due to an error. Reason code <reason code>.
— CPIC903 - Db2 Mirror replication is suspended for maintenance operations.
— CPIC904 - Db2 Mirror replication is active for ASP group <iASP name or *SYSBAS>.

— Db2 Mirror product failure messages sent to QSYSOPR:
— CPD3E43 - DRDA/DDM Db2 Mirror server error occurred with reason code <reason code>.
— CPF32CD - Db2 Mirror resynchronization failed for job <job name or *ALL>.
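One way to act on these messages automatically is a message watch; a sketch, assuming a user exit program MYLIB/MIRRORWCH (hypothetical) coded to the watch exit interface:

    /* Start a watch session that calls the exit program when Db2 Mirror */
    /* suspend messages arrive on the system operator message queue      */
    STRWCH SSNID(DB2MWATCH) WCHPGM(MYLIB/MIRRORWCH) +
           WCHMSG((CPIC901) (CPIC902)) WCHMSGQ((*SYSOPR))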
Specific Object Replication Details
Db2 Mirror – Database "must have" knowledge

1. DDS and SQL DDL files are supported
2. Native DB I/O (e.g. RPG) and SQL are supported
3. Mirrored database files contain the same data, at the same RRNs
4. Journaling is optional, but encouraged
5. Record-level operations against mirrored files will yield identical results, regardless of whether the source or target is being used
6. Database DDL and I/O operations are synchronous
Database trigger considerations

— Configured via ADDPFTRG/CHGPFTRG and ALTER/CREATE TRIGGER
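For reference, the native-side configuration looks like this (file and program names are hypothetical):

    /* Attach an after-insert trigger program to a physical file */
    ADDPFTRG FILE(APPLIB/ORDERS) TRGTIME(*AFTER) TRGEVENT(*INSERT) +
             PGM(APPLIB/ORDTRG)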
Output Queue (*OUTQ) Objects

Objects of type *OUTQ are replicated synchronously
— OUTQs are kept identical across both systems
— Creates, updates, and deletes are blocked if:
• Initiated on the secondary system while Db2 Mirror is interrupted
• A required object is not available on both systems:
o DTAQ
o MSGQ
o WSCST
— Customers configure which to replicate
Characteristics of Spooled Files

Spooled files have unique properties for Db2 Mirror:
— All the spooled data originates from a single system
— Often generated by a long-running process
— Can be quite large
— Usually not useful if incomplete
— A limited number of spooled files is allowed on a system
— Duplicate spooled files are not allowed
— Not true objects, in an IBM i sense
Spooled File Replication

— Spooled files are replicated near-synchronously
• At close, the spooled file is added to the OTL as deferred
• A system job resyncs spooled files to the target system at configurable intervals
• The order of spooled files cannot be guaranteed to be the same on both systems
• Generation of spooled files is never blocked
o Spooled files are added to the OTL on both systems when replication is suspended
o Resynchronized both ways when replication is resumed
— Replicated to the same library/output queue on the target system
Spooled File Status

— Replicated spooled files are restored in *HLD status
• Prevents processing of replicated files until they are released
o Ignored by active writers
o No entries added to an associated DTAQ
• On failover, spooled files in *HLD must be released to be processed
— Once processed, replicated copies are set to *SAV or *FIN status
— *RLS, *HLD, *PND, *WTR, and *PRT status are not replicated
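After a failover, releasing a held replicated spooled file uses the standard command; a sketch with hypothetical file and job names:

    /* Release a replicated spooled file so writers can process it */
    RLSSPLF FILE(INVOICE) JOB(123456/APPUSER/PRTJOB) SPLNBR(*LAST)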
Considerations for replicating spooled files
— Due to the large amount of potential data transfer, care should be taken to limit replication of spooled files to those needed
• We help by permanently excluding the following output queues:
QUSRSYS/QEZJOBLOG
QUSRSYS/QEZDEBUG
QGPL/QPRINT
• We help by excluding the following output queues by default:
All *OUTQs in QUSRSYS
All *OUTQs in QGPL
These *OUTQs may be explicitly included in replication by name.
• When including a library, users should exclude unneeded OUTQs at the same time; RCL configuration allows multiple changes to be submitted as a group
Synchronously Replicated Authority Changes
The following will be replicated synchronously (synchronous changes occur at the same time on both systems; they either succeed on both, or fail on both):
• Authority & ownership changes to database file (table) objects, including securing the file with an authorization list
• Using e.g. GRTOBJAUT, RVKOBJAUT, CHGOBJOWN, CHGOBJPGP, ...
• Authority & ownership changes to any other supported object via database (SQL) operations
• Authority changes to IFS objects on the hardware-mirrored IASP
• Creation of an *AUTL object; adding users to/removing users from an *AUTL; changing ownership of an *AUTL
• Change of the object audit attribute via CHGOBJAUD
• Change of user profile parameters PASSWORD and UID/GID
• Creation of a user profile
• The user profile is created on both systems with the same attributes, including UID and GID
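For example, a grant like the following (object and user names are hypothetical) takes effect on both nodes, or on neither:

    /* Grant *CHANGE on a replicated table; with Db2 Mirror active */
    /* the authority change is applied synchronously on both nodes */
    GRTOBJAUT OBJ(APPLIB/ORDERS) OBJTYPE(*FILE) USER(WEBUSER) AUT(*CHANGE)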
Authority Changes Not Supporting Replication
The following will not be replicated:
• Authority changes to objects not supported by Db2 Mirror, or to objects which Db2 Mirror is configured to exclude
• Cryptographic and digital certificate management capabilities
• e.g. master keys, key store updates, and certificate store info
• Configuration for functions like Kerberos / EIM
• Plus other considerations like keytab files, EIM relationships, etc.
Replicated Objects that can change while in the blocked state
— User profiles
— Authorization lists
— Function usage information
— Environment variables
— System values
— Spooled files
For more information on specific behavior:https://www.ibm.com/support/knowledgecenter/ssw_ibm_i_74/db2mi/db2mobjblocked.htm
IASPs
Db2 Mirror IASP Support

— IASPs are optional for Db2 data
— IASPs are required for IFS concurrent sharing
• PowerHA is required to switch IFS IASPs
— DB IASPs have their own replication rules and Object Tracking List
IASP Support
Switch over IFS IASPs
Disaster Recovery
Topology Options – DR
— As long as one local Db2 Mirror node is up, production will remain at the local site.
— If both local nodes are unavailable, then a switch to the DR site can be initiated.
— The default is that a switch to DR requires system administrator intervention, although a policy could be defined to initiate the switch automatically.
— Only one node will be activated at the DR site, and then a Db2 Mirror resync will be started to the 2nd DR node.

[Diagram: a local Db2 Mirror pair (RoCE) replicating to a DR-site Db2 Mirror pair via System Mirror replication or hardware replication.]
Topology Options – Common DR Options

[Diagrams: a Db2 Mirror pair (RoCE, < 200 m) with Metro or Global Mirror to a DR site; the DR target may be a single node or a second Db2 Mirror pair (RoCE, < 200 m).]

IFS switching is only supported on DS8K with HyperSwap with 3 storage controllers.
Logical Replication

— Logical replication solutions have the option to move the source node between the Db2 Mirror nodes and go to a single DR node.

[Diagram: a Db2 Mirror pair (RoCE) logically replicating to a single DR node.]
Logical Replication

— Logical replication solutions also have the option to move the source node between the Db2 Mirror nodes and go to a Db2 Mirror pair.

[Diagram: a Db2 Mirror pair (RoCE) logically replicating to a second Db2 Mirror pair (RoCE).]
Software Requirements and Licensing
Software Required for Db2 Mirror Pair
— 5770SS1 Option 3 (Extended Base Directory Support)
— 5770SS1 Option 12 (Host Servers)
— 5770SS1 Option 22 (ObjectConnect)
— 5770SS1 Option 26 (DB2® Symmetric Multiprocessing) - optional
— 5770SS1 Option 30 (Qshell)
— 5770SS1 Option 34 (Digital Certificate Manager)
— 5770SS1 Option 41 (High Availability Switchable Resources)
— 5770SS1 Option 48 (IBM Db2 Mirror)
— 5770JV1 *BASE (IBM Developer Kit for Java)
— Option 16 (Java SE 8 32 bit)
— Option 17 (Java SE 8 64 bit)
— 5733SC1 *BASE (IBM Portable Utilities for i)
— Option 1 (OpenSSH, OpenSSL, zlib)
— 5770DG1 *BASE (IBM HTTP Server for i)
— 5770DBM *BASE (IBM Db2 Mirror for i)
— Option 1 (Db2 Mirror Enablement)
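A quick way to verify that a prerequisite option is installed; a sketch using Option 48 as the example:

    /* Check that a licensed program option is correctly installed */
    CHKPRDOPT PRDID(5770SS1) OPTION(48)
    /* Or list all installed software resources */
    DSPSFWRSC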
Open Source Packages Required for Setup
— python2-six-1.10.0-1.ibmi7.1.noarch.rpm
— python2-itoolkit-1.5.1-1.ibmi7.1.ppc64.rpm
— python2-ibm_db-2.0.5.8-1.ibmi7.1.ppc64.rpm
— cloudinit-1.0-0.ibmi7.1.ppc64.rpm
Software Required for Db2 GUI Node
— 5770SS1 Option 3 (Extended Base Directory Support)
— 5770SS1 Option 12 (Host Servers)
— 5770SS1 Option 22 (ObjectConnect)
— 5770SS1 Option 26 (DB2® Symmetric Multiprocessing) - optional
— 5770SS1 Option 30 (Qshell)
— 5770SS1 Option 34 (Digital Certificate Manager)
— 5770SS1 Option 41 (High Availability Switchable Resources)
— 5770SS1 Option 48 (IBM Db2 Mirror)
— 5770JV1 *BASE (IBM Developer Kit for Java)
— Option 16 (Java SE 8 32 bit)
— Option 17 (Java SE 8 64 bit)
— 5733SC1 *BASE (IBM Portable Utilities for i)
— Option 1 (OpenSSH, OpenSSL, zlib)
— 5770DG1 *BASE (IBM HTTP Server for i)
— 5770DBM *BASE (IBM Db2 Mirror for i)
— Option 1 (Db2 Mirror Enablement)
Amen, and thank you for your attention