ims and db2 recovery - ims ug nyc sept 2013.pdf
IBM Software
NYC IMS User Group Meeting
© 2013 IBM Corporation
IMS and DB2 Backup and Recovery
Ron Bisceglia
IMS Tools Development
Agenda
Concepts around Business Resilience
IBM DB2 and IMS Backup and Recovery Solutions
Coordinated IMS and DB2 Local Recovery
Coordinated IMS and DB2 Disaster Recovery
What Scenarios Do You Want to Protect Against?
Regional Disaster (Global Distance Recovery)
► Electric grid failure
► Floods
► Hurricanes
► Earthquakes
► Tornados
► Tsunamis
Local Disaster (Metro Distance Recovery)
► Human error
► HVAC or power failures
► Burst water pipe
► Building fire
► Architectural failures
► Gas explosion
Single System Failure (High Availability)
► Human error
► Component failures
► Single system failures
Data Loss or Corruption (Point in Time Backup)
Risk Assessment
Identify risks & likelihood
Evaluate & prioritize risks
Create a report of risks & vulnerabilities
Business Impact Analysis
Identify critical business processes
Identify critical IT resources
Outage impacts & allowable outage times
[Chart: Impact versus Likelihood, with a "Start Here" marker]
Divide the scope into phases
► It is difficult to do all at once
► Scope can expand as phases progress
Some Definitions - HA / CO / CA / DR / BR
Business Resilience (BR)
► The ability of the business to rapidly adapt and respond to opportunities, regulations and
risks, in order to maintain secure and continuous business operations, be a more trusted
partner, and enable growth.
► Business Resilience spans business strategy, organizational structure, business and IT
processes, IT infrastructure, applications and data, and facilities.
► Disaster Recovery (DR) is one component of an overall Business Resilience Plan.
Disaster Recovery
► Process of recovering an environment after a major disaster
► Bring to the point at which business can be conducted
Continuous Availability (CA) - Mask outage
► A system that delivers uninterrupted service 7 days a week, 24 hours a day
● There are no planned or unplanned outages from an end-user perspective.
► Principles for CA: redundancy, isolation, concurrency, automation
High Availability (HA) - Mask unplanned outage
► A system that delivers uninterrupted service during scheduled periods
● There are no unplanned outages from an end-user perspective.
Continuous Operation (CO) - Mask planned outage
► A system that delivers service 7 days a week, 24 hours a day with no scheduled outages.
● There are no planned outages from an end-user perspective.
► Covered by Operations Continuity Plan (OCP)
Harder to Measure… Harder to Guarantee
Some Definitions – RTO and RPO
Recovery Time Objective (RTO)
► Time allowed to recover the applications
► All critical operations are up and running again
► Considerations include recovery of databases and network
Recovery Point Objective (RPO)
► Amount of data lost in the disaster
► Last point-in-time when all data was consistent
► Considerations include:
● Frequency of creating recovery points
● Frequency of transfer of data to remote site
RTO and RPO business requirements are usually different for disaster recovery versus local recovery
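The two RPO considerations above (how often recovery points are created and how often they are shipped to the remote site) can be combined into a rough worst-case estimate. The sketch below is illustrative only: the function names are hypothetical, and the model assumes the newest recovery point is shipped at each transfer and ignores transfer duration.

```python
from datetime import timedelta


def worst_case_rpo(recovery_point_interval: timedelta,
                   transfer_interval: timedelta) -> timedelta:
    """Upper bound on data loss under a simplified model (an assumption).

    The newest recovery point can be up to one creation interval old when
    it is shipped, and a disaster can strike just before the next shipment
    arrives, one transfer interval later. Transfer duration is ignored.
    """
    return recovery_point_interval + transfer_interval


def meets_objective(rpo_target: timedelta,
                    recovery_point_interval: timedelta,
                    transfer_interval: timedelta) -> bool:
    """True when the worst-case loss stays within the business RPO."""
    return worst_case_rpo(recovery_point_interval, transfer_interval) <= rpo_target


# Hypothetical numbers: recovery points every 15 minutes, shipped every 15 minutes.
print(worst_case_rpo(timedelta(minutes=15), timedelta(minutes=15)))
```

A check like `meets_objective` can be run separately for the local and the disaster recovery requirement, since the two usually have different targets.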
[Figure: Recovery Point and Recovery Time scales (seconds, minutes, hours, days, weeks) positioning tape backup, tape restore, online restore, clustering, remote replication, asynchronous replication, and synchronous replication]
High Availability (HA) - Mask unplanned outage
Causes
► Outage Duration
● Unchecked Problem Determination
● Informal Recovery Procedures
● Other secondary contributors
● Initialization Design
● Indecision
● Lack of Back-up capability
► Number of Users Impacted (Scope)
● System Design
● Application Design
● Data Design
● System Configuration
● Common dependencies
● Unnecessary IPL or reboot
► Number of Outages (Frequency)
● Reinitialize
● Subsystem & Application Abnormal Termination
● Human Errors
● HW Component Failures
● Recurring Problems
● Untested Changes
● Complexity
Actions
► Decrease Frequency
● Proactive Problem Prevention
● Root cause analysis, System Outage Analysis
● Technology exploitation
● Effective Change Management
● Robust System Design
● Proactive Monitoring
● Standardization
► Reduce Duration
● Effective Recovery Procedures
● Component Failure Impact Analysis
● Automation
● Situation Management
● Post Incident Reviews
● System and application design
● Warm Back-up
► Limit Scope
● System Integration and Design
● Application Design
● Data Design
● Isolation of Vital Applications
● Component Failure Impact Analysis
Redbook: System z Mean Time to Recovery Best Practices, SG24-7816 (March 2010)
HA: How to Reduce Unplanned Outages?
Capacity planning should correspond to expected
workload.
► Using 100% of available resources is possible with System z, but…
► It is better to have efficient resource allocation when needed
Practice is critical.
► “Every football team practices!”
► Practice needed to learn, to team, to test, to validate processes and
knowledge.
Early detection is essential to avoid an outage or to
minimize the impact.
► Regular health check (tuning)
► Heartbeat implementation for potential problem discovery and
avoidance
● Understand what is a “normal” situation
► Understand failure avoidance mechanisms
● System able to protect itself against others
Best Practice
“Working at 90% CPU is a best practice. We need to de-stress the infrastructure. It’s not normal for me that we run at more than 99% of CPU; it’s bad for the stacks, for dispatching.”
HA: How to Reduce Unplanned Outages? (cont)
Problem determination often relies on automated actions and
qualified people.
► Identify failure isolation mechanisms
► Collect documentation for root cause
analysis and keep it several days.
► Improve teaming between technical teams
► Practice restart scenarios
Fast restart is essential to reduce business impact.
[Figure: restart timeline. A problem occurs, possibly followed by an orderly shutdown; z/OS, subsystems, middleware, and applications are shut down and then restarted, and the business impact lasts until all business processes are available. For unplanned outages, gather diagnostic information; for planned outages, implement planned changes.]
Agenda
Concepts around Business Resilience
IBM DB2 and IMS Backup and Recovery Solutions
Coordinated IMS and DB2 Local Recovery
Coordinated IMS and DB2 Disaster Recovery
Disaster Recovery VS. Local Recovery
Disaster or ‘Remote Site’ Recovery
► All production data is mirrored to remote site
► After disaster failover, mirroring systems are disabled
► DBMS systems are restarted using mirrored data
► Major operation to switch to remote site
► Used for major site failures or other “real” disasters
Disaster Recovery VS. Local Recovery
Application or ‘Local’ Recovery
► User application failures or operational errors
● Recovery is for one or more groups of databases
● Application errors, operational errors, local hardware
problems
► DBMS systems need local recovery resources
● Image copies, change accums, RECONs, etc.
► Recovery resources are mirrored to remote site
● Needed for local recovery following a failover
● Remote site is new production system for a period of time
Recovery Resources
IMS Databases
► Image copy utilities
► Change accumulations
IMS Systems and Application libraries
► Volume dumps
► DFSMSdss data set copies
IMS Systems and data
► Volume dumps
► Remote Mirroring/Replication
● XRC, PPRC, SRDF
IMS Recovery Solution Pack
► Batch, concurrent, incremental image copies
► Fast-replication image copies
IMS Recovery Expert for z/OS
► System level backup
● Full system, data only, application, database, and disaster recovery
DB2 Table Spaces
► Image copies
DB2 Systems and Application libraries
► Volume dumps
► DFSMSdss data set copies
► DFSMShsm BACKUP and RECOVER SYSTEM
● FlashCopy
DB2 Systems and data
► Volume dumps
► DFSMShsm BACKUP and RECOVER SYSTEM
● FlashCopy
► Remote Mirroring/Replication
● XRC, PPRC, SRDF
DB2 Recovery Expert for z/OS
► Fast-replication object level backup / recovery
● DB2 V10 IBM FlashCopy Image Copy
● With DB2 RE - DB2 V8, V9, V10 - FlashCopy and EMC TimeFinder/Snap Image Copy
► System level backup
● Full system, data only, application, table space, and disaster recovery
Agenda
Concepts around Business Resilience
IBM DB2 and IMS Backup and Recovery Solutions
Coordinated IMS and DB2 Local Recovery
Coordinated IMS and DB2 Disaster Recovery
Coordinated IMS and DB2 Local Recovery
Recoveries Today
► Not performed very often
► Hardware errors do not happen very frequently
► Problem typically discovered ‘after the fact’
IBM Tools Can Automate Complex Coordinated Recovery
► IMS and DB2 Recovery Expert, IMS Recovery Solution Pack
► IMS databases and DB2 tablespaces linked by Coordinated
Application Profile
● IMS Recovery Expert and DB2 Recovery Expert
► IMS and DB2 log analysis to find coordinated ‘quiet’ times
► Recovery jobs/plans generated to recover IMS databases and
DB2 tablespaces to the same, consistent point-in-time
● Eliminates ‘think’ time
● Does not require a technical specialist
Unplanned Outages
When are these types of problems detected?
80 percent of unplanned downtime is caused by people and process issues
[Chart segments: Hardware Outage, Operator Errors, Application Errors]
Up to 70% of recovery time is “think time”!
► Not processing time
► Actual recovery takes up 30%
Source: McGladrey and Pullen
[Chart: Total Recovery Time on a 0% to 100% scale, broken down by element: Diagnose, Investigate, Analyze, Recover]
Once you have an event…
IBM Local Application Recovery Solutions
IMS Recovery Solutions
► Base IMS product utilities
● IMS databases are recovered using image copies and logs
● IMS Full Database recovery or IMS Timestamp recovery
► IMS Recovery Solution Pack
● IMS databases are recovered using image copies and logs
● IMS Full Database recovery or IMS Timestamp recovery
● IMS Point In Time Recovery (PITR)
► IMS Recovery Expert and IMS Recovery Solution Pack
● IMS databases are recovered using SLB, image copies and logs
● IMS Full Database recovery or IMS Timestamp recovery
● IMS Point In Time Recovery (PITR)
Coordinated IMS and DB2 Local Recovery Solutions
► IMS Recovery Expert, IMS Recovery Solution Pack, DB2 Recovery Expert
● IMS and DB2 databases are recovered using SLB, image copies and logs
● Recovery to a consistent timestamp between IMS and DB2
Coordinated IMS & DB2 Local Application Recovery
Local Application Recovery Profiles
– IMS databases and DB2 objects are grouped into Profiles
● Databases/Objects can be in more than one profile
– Local Application Profiles give information on recovering the objects
Coordinated Local Application Recovery Profiles
► Combination of one or more IMS and DB2 profiles
● Each IMS or DB2 Local profile can be in only one Coordinated Profile
► Allows IMS and DB2 applications to be recovered to a ‘consistent’ point-
in-time
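The membership rule above (a database or object may sit in several local profiles, but each local profile may belong to only one coordinated profile) can be checked mechanically. This is a minimal sketch using plain dictionaries; the profile names and the `find_profile_conflicts` helper are hypothetical and are not part of the Recovery Expert products.

```python
from collections import defaultdict
from typing import Dict, List


def find_profile_conflicts(
        coordinated_profiles: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return local profiles that appear in more than one coordinated profile.

    Keys of the input are coordinated profile names; values are the IMS and
    DB2 local profile names they combine.
    """
    membership = defaultdict(list)
    for coordinated_name, local_profiles in coordinated_profiles.items():
        for local_name in local_profiles:
            membership[local_name].append(coordinated_name)
    return {local: owners for local, owners in membership.items()
            if len(owners) > 1}


# Hypothetical profile names, for illustration only.
profiles = {
    "PAYROLL.COORD": ["IMS.PAYROLL", "DB2.PAYROLL"],
    "BILLING.COORD": ["IMS.BILLING", "DB2.PAYROLL"],  # DB2.PAYROLL reused
}
print(find_profile_conflicts(profiles))
```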
Coordinated Application Profiles
Local and Coordinated Application Profiles
[Diagram labels: Local Application Profile (IMS Databases, IMS Application); Local Application Profile (DB2 Tablespaces, DB2 Application); Coordinated Local Application Profile (IMS Databases, DB2 Tablespaces, IMS and DB2 Application)]
Coordinated Local Application Recovery Profiles
► Combination of one or more IMS and DB2 profiles
● Each profile can be in only one Coordinated Profile
► Allows IMS and DB2 applications to be recovered to a consistent point
[Diagram labels: Single IMS UOR; Single DB2 UOR; Single IMS and DB2 UOR]
Recovery Points (RPs) allow consistent recoveries
Traditional methods produce DEALLOC records:
► UPDATE DB STOP(ACCESS) or /DBR DB
► UPDATE DB STOP(UPDATES) or /DBD DB
► UPDATE DB START(QUIESCE)
Traditional RPs are discovered in the RECON data set
► ALLOC shows when a database is allocated for update
► DEALLOC record indicates the database is no longer allocated
Traditional RPs are used with Timestamp Recovery
► No inflight transactions exist so no backouts required
• Performance improvement over Point-In-Time Recovery (PITR)
Difficult to match IMS and DB2 Recovery Points
Require some database unavailability time
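The ALLOC/DEALLOC rule above can be expressed as a simple interval check: a timestamp is usable for Timestamp Recovery only if it does not fall inside any period during which the database was allocated for update. The sketch below assumes a simplified, hypothetical record layout (pairs of ALLOC and DEALLOC times); it does not read a real RECON data set.

```python
from datetime import datetime
from typing import List, Optional, Tuple

# (ALLOC time, DEALLOC time); DEALLOC is None while the database is still allocated.
Allocation = Tuple[datetime, Optional[datetime]]


def is_timestamp_recovery_point(allocations: List[Allocation],
                                candidate: datetime) -> bool:
    """True if the database was not allocated for update at `candidate`."""
    for alloc_time, dealloc_time in allocations:
        still_allocated = dealloc_time is None or candidate < dealloc_time
        if alloc_time <= candidate and still_allocated:
            return False
    return True


# Hypothetical allocation history for one database.
history = [
    (datetime(2012, 10, 3, 8, 0), datetime(2012, 10, 3, 9, 0)),
    (datetime(2012, 10, 3, 10, 0), None),
]
print(is_timestamp_recovery_point(history, datetime(2012, 10, 3, 9, 30)))
```

For a coordinated recovery the same check would have to pass for every IMS database in the profile, and an equivalent quiesce point would have to exist on the DB2 side, which is why matching traditional recovery points across the two systems is difficult.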
Finding Recovery Points in a 24x7 Environment
Quiet Time Log Analysis
► Finds Recovery Points by analyzing log records
● Analyses actual transaction Unit of Recovery (UOR) activity
● Discovers commit points for IMS and DB2 databases/objects
► IMS and DB2 ‘common’ quiet times can be found
● DB2 logs analyzed for a time range
● IMS logs analyzed for same time range
● Intersection of common quiet times produced
Recovery Points can be identified when databases were online
Requires Point-In-Time Recovery (PITR)
► IMS Recovery Solution Pack (DRF)
► DB2 Utility Suite
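The quiet-time search described above amounts to finding the gaps between units of recovery on each side and then intersecting the two sets of gaps. The following sketch shows that idea on plain numeric times (for example, seconds from the start of the analyzed range). The function names and the sample UOR intervals are hypothetical, and real log analysis works on log records rather than pre-built intervals.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start, end)


def quiet_times(uors: List[Interval], start: float, end: float) -> List[Interval]:
    """Gaps in [start, end] during which no unit of recovery is in flight."""
    quiet = []
    cursor = start
    for begin, commit in sorted(uors):
        if begin > cursor:
            quiet.append((cursor, min(begin, end)))
        cursor = max(cursor, commit)
        if cursor >= end:
            break
    if cursor < end:
        quiet.append((cursor, end))
    return quiet


def intersect(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Intersection of two sorted lists of non-overlapping intervals."""
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        low = max(a[i][0], b[j][0])
        high = min(a[i][1], b[j][1])
        if low < high:
            result.append((low, high))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return result


# Hypothetical UOR activity (begin, commit) over the same 60-unit time range.
ims_uors = [(0, 10), (20, 30)]
db2_uors = [(5, 15), (18, 22), (40, 50)]

ims_quiet = quiet_times(ims_uors, 0, 60)
db2_quiet = quiet_times(db2_uors, 0, 60)
print(intersect(ims_quiet, db2_quiet))
```

Any timestamp inside one of the resulting intervals is a candidate coordinated recovery point, because no IMS or DB2 unit of recovery was in flight at that moment.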
Common IMS and DB2 Recovery Points
[Figure: timeline of IMS UORs 1 to 3 and DB2 UORs 1 to 3, marking the IMS Quiet Time, the DB2 Quiet Time, and the Coordinated Quiet Time]
Finding Quiet Times for a Coordinated Application
Quiet Time Analysis Parameters
Point In Time Recovery
Recover to timestamp = ‘12.277 08:00:00.000000’
Only ‘committed’ updates are recovered
[Figure: timeline of IMS UORs 1 to 6 around the recovery timestamp 12.277 08:00]
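The rule that only committed updates are recovered can be illustrated with a small filter: a unit of recovery is applied only if it committed at or before the recovery timestamp, and anything still in flight at that moment is left out. This is a sketch under assumed data; the `Uor` class and the sample times are hypothetical and do not reflect actual log record formats.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Uor:
    name: str
    start: float
    commit: Optional[float]  # None if the UOR never committed


def uors_to_apply(uors: List[Uor], recover_to: float) -> List[str]:
    """Names of the UORs whose updates survive a point-in-time recovery."""
    return [u.name for u in uors
            if u.commit is not None and u.commit <= recover_to]


# Hypothetical activity; times are minutes relative to the recovery timestamp (0).
activity = [
    Uor("UOR 1", start=-30, commit=-20),
    Uor("UOR 2", start=-10, commit=5),    # in flight at the recovery timestamp
    Uor("UOR 3", start=-5, commit=-1),
    Uor("UOR 4", start=2, commit=8),      # starts after the recovery timestamp
    Uor("UOR 5", start=-3, commit=None),  # never committed
]
print(uors_to_apply(activity, recover_to=0))
```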
Coordinated IMS & DB2 Local Application Recovery
• Coordinated Recovery requires a Consistent Timestamp
1) Select the “Current Timestamp” (Okay)
● All DBs/objects in Coordinated Profile are stopped for IMS and DB2
● IMS and DB2 logs are switched and archived
● Recovery is to the end of the current set of IMS and DB2 logs
● DBs/objects are restarted
2) Select any “User Timestamp” (Better)
● IMS Recovery Solution Pack (IMS DRF) and DB2 Recovery Utility
– Point In Time Recovery (PITR) applies only the committed updates
3) Select a “Quiet Time Recovery Point Timestamp” (Best)
● IMS and DB2 Recovery Expert identifies RPs from the logs
● IMS Recovery Solution Pack (IMS DRF) and DB2 Recovery Utility
– Point In Time Recovery (PITR) applies only the committed updates
● May provide better linkage to Business Cycle
Building Recovery JCL
Selecting Common Recovery Time
IMS Recovery Expert for z/OS Intelligent Recovery Manager
Managed IMS Application Recovery
[Diagram: the IMS Intelligent Recovery Manager draws on the IMS databases, RECON, IMS log, IMS RE repository, and IMS system backup, and invokes recovery processes: Change Accumulation Utility, Database Recovery Utility, Index Rebuilder, HALDB ILDS/Index Rebuild, Post Recovery Image Copy, Data Set Restore and Fast-replication Data Set Restore (IBM FlashCopy, EMC SNAP, IBM DFSMSdss), and DBRC Notifications. Option labels shown on the processes: HPCA, DRF, IIB, IMS, HPIC/HPPC, USER.]
DB2 Recovery Expert for z/OS V3.1 Intelligent Recovery Manager
Managed DB2 Application Recovery
[Diagram: the DB2 RE V3.1 Intelligent Recovery Manager draws on the DB2 spaces, BSDS, image copies (traditional and fast-replication), DB2 catalog, DB2 log, DB2 RE repository, and DB2 system backup, and invokes recovery processes: Recover Utility, Log Apply, SQL Recovery (Redo SQL, Undo SQL), Index Rebuild (DB2 utility), Check Utility (index, data), Post Recovery Image Copy (traditional, fast-replication), Restore From SLB and Fast-replication Data Set Restore (IBM FlashCopy, EMC SNAP, IBM DFSMSdss), and Dropped Object Recovery (DDL + DCL, data recovery)]
Recovering a Coordinated Application
Select Coordinated Application Profile
Select or Specify Recover to Timestamp
Analysis Performed by Recovery Expert to Determine the Recovery Steps
Submit the JCL
Recover IMS Databases
► Stop IMS Databases
► Restore from SLB
► Restore Image Copies
► Apply Logs
● To specified timestamp
► Rebuild Indexes
► Start IMS Databases
Recover DB2 Tablespaces
► Stop DB2 Tablespaces
► Restore from SLB
► Restore Image Copies
► Apply Logs
● To specified timestamp
► Rebuild Indexes
► Start DB2 Tablespaces
Agenda
Concepts around Business Resilience
IBM DB2 and IMS Backup and Recovery Solutions
Coordinated IMS and DB2 Local Recovery
Coordinated IMS and DB2 Disaster Recovery
Disaster Recovery Planning and Processes
When considering business continuity and disaster recovery
(BC/DR), a failed recovery means discontinued business.
Having systems inoperable and people unavailable for a matter
of days or even hours can be disastrous in terms of lost
revenue, customer dissatisfaction, and negative press. It is
therefore critical to understand the root causes behind why
recoveries fail in the first place:
‘The Root Causes Behind Failed Recoveries’
SunGard, 2012
Why Recoveries Fail
Failure to Plan
► To avoid the specter of a failed recovery, BC/DR plans must be
comprehensive, detailed, and consolidated.
Failure to Manage Change
► Change management is required to ensure that daily business operations,
the BC/DR plan, and the recovery solution are all kept in sync.
Failure to Validate
► The combination of tests (does it work) and exercises (can we do it)
provides the final stamp of approval that a business has done its due
diligence to protect itself against a failed recovery.
‘The Root Causes Behind Failed Recoveries’
SunGard, 2012
IBM Disaster Recovery Solutions
IMS Recovery Solutions
► IMS databases are recovered using image copies and/or logs
● IMS Full Database recovery or IMS Timestamp recovery
IMS Restart Solutions
► IMS system and databases are mirrored or replicated to remote site
● IMS Recovery Expert product: System Level Backup
● GDPS and Storage Mirroring
IMS Restart & Recovery Solution
► IMS system and databases are mirrored or replicated to remote site
► Additional transmitted data allows for forward recovery
Coordinated IMS and DB2 Restart & Recovery Solution
► Approach 1: SLB contains both IMS and DB2 volumes
► Approach 2: Separate SLBs for IMS and DB2 and PITR log recovery
System Level Backup Overview
A System Level Backup is a backup of the entire DBMS environment at a point in time
► Recorded in Recovery Expert metadata repository
Leverages storage-based fast replication to drive a volume level backup
► With FlashCopy, backup completed in seconds
► Offloading data copy process to the storage processor saves CPU and I/O resources
► Significantly faster than data set copies
Backup DBMS without affecting applications
► Backup windows reduced by replacing image copies
► Extends processing windows
Data consistency ensures data is dependent-write consistent
► DBMS ‘Log Suspend’
► Storage-based consistency functions
● FCCGFREEZE to perform a FlashCopy consistency group (transparent to the user)
► Equivalent to a power failure
[Diagram: IMS/DB2 Recovery Expert uses storage processor APIs to copy the IMS/DB2 source DBMS volumes to target volumes, producing the System Level Backup]
System Level Backup Overview
Backup validation each time ensures successful recoveries
► Insurance that a backup is available
Automated backup offload (archive/recall)
► Copies system backup from fast replication disk to tape for use at either local or disaster site (or both)
Can be used in combination with image copies
[Diagram: storage-aware backup and recovery. Storage processor APIs copy the IMS/DB2 source database volumes to the SLB (system backup), which is offloaded through tape processing]
IMS and DB2 Recovery Expert: SLB
Environment discovery and configuration management
► IMS System Level Backup includes:
● Active and archive logs
● RECONs
● All IMS database data sets
● IMS system data sets (ex. ACBLIBs, DBDLIBs, PGMLIBs, etc.)
● All associated ICF User catalogs
► DB2 System Level Backup includes:
● Active and archive logs
● Bootstrap Data Set
● All DB2 database data sets
● DB2 system data sets (ex. Loadlib)
● All associated ICF User catalogs
Coordinated IMS and DB2 Restart Solution
Combined SLB created from IMS and DB2 volumes
► Separate analysis is performed on IMS and DB2
● List of volumes where IMS and DB2 data sets reside determined
● Volumes combined under one Recovery Expert product
► At Primary site
● Single, combined SLB is created
● One Flashcopy for all volumes (IMS and DB2)
● Recovery Expert repository is replicated/sent to remote site
► At Remote site…disaster failover
● Recovery Expert repository is restored
● SLB is restored
● IMS and DB2 are ‘restarted’
● Restart with Dynamic Backout and Undo/Redo processing occurs
● IMS and DB2 restarted to same, transactionally consistent point in time
Coordinated DR - IMS Recovery Expert
[Diagram: Production Site. IMS System Analysis, backed by the IMS RE Repository, maps the IMS system (Control Region, DBRC, DLI/SAS, Logger) and its data sets (RECON, WADS, OLDS, RDS, databases, image copies, RLDS, SLDS, change accumulations) to IMS Volumes 1 through nn]
Coordinated DR - DB2 Recovery Expert
[Diagram: Production Site. DB2 System Analysis, backed by the DB2 RE Repository, maps the DB2 system (DB2 Master, DDF, Logger) and its data sets (logs, image copies) to DB2 Volumes 1 through nn]
Coordinated DR - DB2 RE or IMS RE
Create IMS and DB2 SLB
[Diagram: IMS Volumes 1 through nn and DB2 Volumes 1 through nn are copied together into the IMS and DB2 Combined SLB]
Coordinated DR - IMS and DB2 Restart
[Diagram: Remote Site. The transmitted IMS and DB2 Combined SLB is restored (Restore SLB for IMS/DB2), bringing back the IMS data sets (RDS, RECON, WADS, OLDS, SLDS, RLDS, image copies, change accumulations, databases, IMS RE Repository) and the DB2 data sets (logs, image copies, tablespaces, DB2 RE Repository). IMS (Control Region, DBRC, DLI/SAS, Logger) and DB2 (DB2 Master, DDF, Logger) are then restarted (Restart IMS/DB2)]
Coordinated IMS and DB2 DR: Combined SLB
Coordinated Recovery Point (RP)
► RPO = Changes past the last SLB
► RTO = Time to restore the Combined SLB and restart IMS and DB2
[Figure: timeline showing IMS SLB 1 and DB2 SLB 1 at the Coordinated IMS and DB2 SLB Time, followed by IMS LOG 1 to 2 and DB2 LOG 1 to 3; the Lost IMS Data and Lost DB2 Data regions mark the RPO]
Coordinated IMS and DB2 Recovery and Restart Solution
Separate SLBs created for IMS and DB2 volumes
► Separate analysis is performed on IMS and DB2
► At Primary site:
● Separate SLB is created for IMS and for DB2
– Two FlashCopies, one for each set of volumes (IMS & DB2)
● Archived logs are transmitted to remote site
– Log Timestamps are recorded in DR PDS
● Recovery Expert DR PDS is transmitted to remote site
► At Remote site…disaster failover
● Recovery Expert DR PDS is restored
● IMS and DB2 SLBs are restored
● Point In Time Recovery using timestamp in IMS and DB2 DR PDS
– Earlier of two timestamps in IMS and DB2 DR PDS
● Start IMS and DB2 (No Backouts/Undos needed during restart)
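The "earlier of two timestamps" rule above exists because both systems can only be rolled forward as far as their transmitted logs reach. A minimal sketch of that selection follows, assuming each DR PDS yields a list of log end timestamps in the `yy.ddd hh:mm:ss.ffffff` form used on the Point In Time Recovery slide. The function names and sample values are hypothetical, and the real DR PDS layout is not shown in this deck.

```python
from datetime import datetime
from typing import List

# yy.ddd hh:mm:ss.ffffff, the form shown on the Point In Time Recovery slide.
TIMESTAMP_FORMAT = "%y.%j %H:%M:%S.%f"


def parse_timestamp(text: str) -> datetime:
    """Parse a year.day-of-year timestamp string."""
    return datetime.strptime(text, TIMESTAMP_FORMAT)


def coordinated_recovery_point(ims_log_timestamps: List[str],
                               db2_log_timestamps: List[str]) -> datetime:
    """Latest point in time that both the IMS and DB2 logs can reach."""
    latest_ims = max(parse_timestamp(t) for t in ims_log_timestamps)
    latest_db2 = max(parse_timestamp(t) for t in db2_log_timestamps)
    return min(latest_ims, latest_db2)


# Hypothetical log end times recorded for each DBMS.
ims_logs = ["12.277 07:45:00.000000", "12.277 08:00:00.000000"]
db2_logs = ["12.277 07:50:00.000000", "12.277 08:05:00.000000"]
print(coordinated_recovery_point(ims_logs, db2_logs))
```

Recovering both systems to this common timestamp with Point In Time Recovery applies only committed updates, which is why no backouts or undos are needed at restart.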
IMS Recovery Expert: Production Site and Remote Site
[Diagram: the Production Site IMS system (Control Region, DBRC, DLI/SAS, Logger; WADS, OLDS, RDS, databases) and the recovery resources transmitted to the Remote Site: image copies, RECON, change accumulations, SLDS/RLDS, the IMS RE Repository, and the System Level Backup]
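DB2 Recovery Expert: Production Site and Remote Site
[Diagram: the Production Site DB2 system (DB2 Master, DDF, Logger; tablespaces) and the recovery resources transmitted to the Remote Site: BSDS, image copies, logs, the DB2 RE Repository, and the System Level Backup]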
IMS Recovery Expert: Remote Site
[Diagram: Remote Site recovery flow using the transmitted resources (System Level Backup, SLDS/RLDS, image copies, RECON, change accumulations, IMS RE Repository). Steps shown: Restore SLB, Find Coord RP, Recover DB (with a conditioned RECON), and Start IMS (Control Region, DBRC, DLI/SAS, Logger)]
DB2 Recovery Expert: Remote Site
[Diagram: Remote Site recovery flow using the transmitted resources (System Level Backup, BSDS, image copies, logs, DB2 RE Repository). Steps shown: Restore SLB, Find Coord RP, Recover DB, and Start DB2 (DB2 Master, DDF, Logger)]
Coordinated IMS and DB2 DR: Separate SLB
Coordinated Recovery Point (RP)
► RPO = Changes Past the Coordinated RP
● Requires application and business-cycle analysis
– Determine how all data is interconnected
► RTO = Time to restore SLBs, recover DBs with logs, restart IMS & DB2
[Figure: timeline showing IMS SLB 1 and DB2 SLB 1, followed by IMS LOG 1 to 2 and DB2 LOG 1 to 3; the Coordinated RP is marked on the logs, and the Lost Data region past it marks the RPO]
Disaster Recovery Process
[Diagram: at the Primary Production Site, IMS/DB2 Recovery Expert uses storage processor APIs to create a System Level Backup (SLB) of the IMS/DB2 source database volumes. The SLB is offloaded through tape processing, and the SLB and archive log tapes reach the Secondary Production Site / Primary Disaster Restart Site (remote tape-based disaster restart) by Vtape replication, where IMS/DB2 Recovery Expert uses the SLB and recovery assets]
SLB
► Frequency - Nightly
Offload
► Frequency - Nightly
DR-Prep
► Executes on local Mainframe/IMS/DB2, copies archive logs and necessary recovery assets.
► Frequency - 15 mins.
► RPO - 15 to 30 mins.
Disaster restart
► Execute 4 jobs created by DR-Prep process for complete DBMS recovery, resulting in reduced RTO.
Coordinated IMS and DB2 DR Solutions
RTO is low based on:
► Volumes are restored from the SLB at the remote site
► Databases are recovered in parallel in one pass of logs
RPO is medium based on:
► Frequency of SLB creation and Log transmission
► Method of data transmission (ex. Virtual Tape)
Operational complexity is low
► Automation provided by IBM Tools
► Recovery jobs are created when recovery resources are produced
Summary of Coordinated IMS and DB2 Recovery
Coordinated IMS and DB2 Disaster Recovery
► Coordinated IMS and DB2 Restart & Recovery Solution
● Approach 1: SLB contains both IMS and DB2 volumes
● Approach 2: Separate SLBs for IMS and DB2 volumes
Coordinated IMS and DB2 Local Application Recovery
► Coordinated IMS and DB2 Recovery Solutions
● IMS Recovery Expert, IMS Recovery Solution Pack, DB2 Recovery Expert
– IMS and DB2 databases are recovered using SLB, image copies and logs
– Recovery to a consistent timestamp between IMS and DB2
IMS Backup and Recovery Workshop
Backup/Recovery Checklist/Review
On-site customer visit
► Agenda is customized to a customer’s specific interests and needs
► Example topics:
● Needs for coordinated IMS and DB2 application backup and
positioning data recovery to a common ‘synched’ point-in-time
● Reduce total spend (including CPU cost) on backup-type
processing
– Fact: The majority of customer spend on data management
processes on z is in ‘backup processing’, typically 60-70%.
What is the ROI on this spend?
► Typically involve IMS, DB2, storage, business continuity
Follow-up analysis
► Identify pain points and where IBM solutions may help