TRANSCRIPT
The Evolution of Async and Sync Replication for High and
Continuous Availability Architectures
CTUG – October 20, 2009
Paul J. Holenstein
Executive Vice President
Gravic, Inc.
Copyright 2009 Gravic    www.gravic.com

Agenda
• Introduction to Gravic
• Business Continuity Overview
• Replication Technologies
  • Async and Sync
• Replication Architectures
  • High Availability: Active/Passive
  • Continuous Availability: Active/Active
• Issues to Consider When Picking a BC Architecture
• Business Continuity Case Studies
• References for More Information

Questions? Please ask as we go along…
Introduction to Gravic

A History of Excellence
• 1979 – Founded, development organization & service bureau
• 1984 – Log-based replication product TMF Auditor
• 1995 – Low-latency process-to-process replication Shadowbase

Focused Technology Direction
• Product solutions for a wide array of enterprise data replication requirements
• Patents on critical and innovative technology

Total Replication Solutions®
• Leverage product, services, and partners to offer Complete Business Problem Solutions
Business Continuity Overview

Recovery Time and Recovery Point Objectives
• Recovery Time Objective (RTO) and Recovery Point Objective (RPO) are two commonly used terms to describe business continuity requirements.
  • RPO – describes the point in time to which the data must be recovered.
  • RTO – describes the time from when a failure occurs until the business process must become active again.

[Figure: the RPO/RTO relationship on a timeline – the gap between the last saved data point and the failure point is the potential for data loss (RPO); the time from the failure point until recovery is complete is the time to recover (RTO).]
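Treating the timeline above as simple timestamps makes the two definitions concrete. A minimal sketch (the function names are ours, not from the presentation):

```python
# Illustrative only: derive RPO and RTO from event times on the
# failure timeline described above.

def rpo_seconds(failure_time: float, last_saved_data_point: float) -> float:
    """Potential data loss: time between the last durably replicated
    data point and the failure."""
    return failure_time - last_saved_data_point

def rto_seconds(failure_time: float, recovery_complete: float) -> float:
    """Outage duration: time from the failure until the business
    process is active again."""
    return recovery_complete - failure_time

# Example: failure at t=100s, last replicated data at t=97s,
# service restored at t=160s.
print(rpo_seconds(100.0, 97.0))   # 3.0 seconds of potential data loss
print(rto_seconds(100.0, 160.0))  # 60.0 seconds of downtime
```

Note that the two objectives are independent: replication latency drives RPO, while failover speed drives RTO.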
Business Continuity Overview

Replication Characteristics Affect the RPO/RTO Levels that can be Attained
1. Replication Latency – affects RPO (amount of data loss at failure)
2. Async vs Sync Replication – improves RPO, but adds “Application Latency”
3. Target Applications Active – improves RTO and usability of the target DB (e.g. queries)
4. Active/Passive vs Active/Active Architecture – dramatically improves RTO & RPO

[Figure: chart of RPO (less, or no, data loss) against RTO (faster recovery). High-latency replication with inactive target applications fares worst; low-latency replication improves RPO, and active target applications improve RTO. The BC “sweet spot” is an active/active architecture: with async replication, little data loss and the fastest “recovery”; with sync replication (including split mirrors), no data loss and the fastest “recovery”.]
Business Continuity Overview

Business Continuity Continuum

[Figure: data replication architectures plotted by better RTO (faster recovery) and better RPO (less data loss). Tape-based recovery methods – magnetic tape, then virtual tape – sit at the days/hours end (assuming you have a backup node). Availability then increases through async active/passive, sync active/passive, async active/active*, and sync active/active*, with RTO shrinking from days through hours, minutes, and seconds down to msec, and RPO shrinking from days down to zero data loss. (* Includes Sizzling Hot Takeover.)]
Replication Architectures

Asynchronous Replication (Current Technology)
• Replication decoupled from the application processing
• Runs independently of the application

Issues
• Replication Latency determines data loss at failure (RPO)
• Need a low-latency data replication engine to minimize data loss at failure
• If Active/Active:
  • Data Collisions can occur (during the Replication Latency window)
  • Need special architectures/algorithms to avoid vs. identify & resolve (an application issue)

[Figure, two builds: first uni-directional – the application writes 1A locally and replication applies 1B at the target; then bi-directional – each node’s application writes locally (1A, 2A) and replication applies the changes (1B, 2B) at the other node.]
Replication Architectures

Uni-Dir Asynchronous Replication (Current Technology)

[Figure: System \LEFT holds the source database, System \RIGHT the target. (1) The application performs its I/O; (2) TMF writes the changes to the audit trails; (3) the replication engine on \LEFT reads the audit-trail data; (4) it sends the changes to the replication engine on \RIGHT; (5) which applies them to the target database under TMF.]
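The decoupling in the flow above is the key property: the application never waits for the target. A hedged sketch of that loop, with a plain queue and dict standing in for the TMF audit trail and target database:

```python
# Hedged sketch of uni-directional async replication; the deque and
# dict are stand-ins for the audit trail and target database, not a
# real TMF interface.
from collections import deque

audit_trail = deque()   # stands in for the TMF audit trails
target_db = {}          # stands in for the target database

def application_io(key, value):
    """Steps 1-2: the application writes; the change is logged.
    The application does not wait for the target."""
    audit_trail.append((key, value))

def replicate_once():
    """Steps 3-5: the replication engine reads logged changes and
    applies them to the target, independently of the application."""
    applied = 0
    while audit_trail:
        key, value = audit_trail.popleft()
        target_db[key] = value
        applied += 1
    return applied

application_io("acct1", 100)
application_io("acct2", 200)
replicate_once()
print(target_db)   # both changes applied, after the fact
```

Whatever is still sitting in the queue when the source node fails is the data loss – which is why replication latency determines RPO.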
Replication Architectures

Synchronous Replication (Future Technology)
• No (committed) data loss at failure of a node (RPO = 0)
• If Active/Active:
  • Data collisions are avoided (but become lock waits, possibly back off & resubmit)

Issues
• Replication is part of the source application’s transaction
• Adds Application Latency, effectively slowing the application’s transactions
• What to do if the network or target system is down?
  • So-called Split Brain syndrome (next slide set…)

[Figure, two builds: first uni-directional – the source write (1A), its replication to the target (1B, 1C), and the acknowledgement (1D) all complete within the application’s transaction; then bi-directional – each node’s transaction (1A–1D, 2A–2D) spans both databases.]
Replication Architectures

Synchronous Replication – Split Brain Defined
• Split Brain occurs when one (or more) nodes do not know the status/state of the other node(s) in the network

Split Brain Resolution Options
• 1) Fail all nodes: fails the application – reduces application availability
• 2) Fail all nodes but one (failover users to the remaining node)
  • Active/Passive: recover when the failed node is available again
  • Active/Active: which node do you pick?
• 3) Allow the nodes to run independently; an “async recovery mode” is needed
  • Active/Passive: the active node queues its changes
  • Active/Active: both nodes queue their changes

[Figure: a node’s local write (1A) cannot be confirmed at its peer (1B/2A, “?”) – the node cannot tell whether the peer is down or only the network link is.]
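Resolution option 3 is worth a closer look, since it preserves availability. A hypothetical sketch of the queue-while-split behavior (the class and field names are ours):

```python
# Hypothetical sketch of split-brain resolution option 3: while the
# peer is unreachable, the node keeps serving its users and queues
# its changes for later asynchronous recovery.
class Node:
    def __init__(self):
        self.db = {}
        self.recovery_queue = []    # changes made while split
        self.peer_reachable = True

    def write(self, key, value):
        self.db[key] = value
        if not self.peer_reachable:
            # Split brain: cannot replicate synchronously, so queue
            # the change for "async recovery mode".
            self.recovery_queue.append((key, value))

node = Node()
node.write("a", 1)            # normal: replicated synchronously
node.peer_reachable = False   # network outage → split brain
node.write("b", 2)            # local write succeeds, change is queued
print(node.recovery_queue)    # [('b', 2)]
```

The cost of this option is that, until the queues drain, the system is effectively running asynchronously – and, in active/active mode, the queued changes from the two nodes can collide when they are finally exchanged.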
Replication Architectures

Synchronous Replication – Recovery Methods
• Recovery modes 2 & 3 require an Async Escalation Recovery Mode
• This consists of an escalation of replication capabilities to reach the end state:
  • Start in “Asynchronous Queue Mode” until the problem is repaired;
  • Then enter “Asynchronous Replication Mode” until the queue drains;
  • Then enter “‘Semi-Synchronous’ Mode” until all async tx’s complete;
  • Then enter “Full Synchronous Mode”…
  • …Any problems, start over…
• Refer to the September/October 2009 Availability Corner Connection article:
  • “Part 18: Recovering from Synchronous Replication Failures”

[Figure: the escalation through the replication engine components (AudColl, AudCoor, AudRtr, and AudCons under TMF) – async steps 1a/1b/1c, semi-sync steps 2a (by table) and 2b (by tx), then step 3 fully sync, exchanging Prepare requests/responses and RTC (Ready to Commit) messages.]
Replication Architectures

Synchronous Replication – Contrast 2 Transactional Methods
• There are more; we’ll just discuss two today:
  • “Dual Writes” (Network Transactions)
  • “Coordinated Commits”

Dual Writes (Network Transactions)
• No replication engine used
• Application starts a local transaction
  • Escalates to a “distributed”, or network, transaction when remote I/O is sent/applied
• Application does local and remote I/O directly
  • Application or library or interface process infrastructure needed; modify the application?
• Each I/O is 1 or 2 network messages to and from the remote system
  • Read/Lock & Update vs. standalone Update
  • Application delayed for each round trip
  • Lots of small network messages sent/received across the network
• Application commit is 2 round trips across the network (2PC)
  • Application’s commit call is delayed until this completes
Replication Architectures

Uni-Dir Synchronous Replication (Dual Writes/Network Transactions)

[Figure: (1) the application begins a transaction and (2) receives the TX ID from TMF; (3A/3B) it performs each I/O directly against both the source database on System \LEFT and the target database on System \RIGHT; at commit, the two systems exchange (5A) the Phase 1 request, (5B) the Phase 1 reply, (5C) the Phase 2 request, and (5D) the Phase 2 reply.]
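A minimal simulation of the dual-write pattern above: every I/O is issued to both resources, and commit is a prepare/commit exchange across both. The classes here are illustrative stand-ins, not a real TMF or XA API:

```python
# Toy simulation of dual writes under a network transaction.
# Each I/O goes to both databases; commit is two-phase (2PC).
class Resource:
    def __init__(self):
        self.committed = {}
        self.pending = {}

    def write(self, key, value):
        # When the resource is remote, each write costs the
        # application a network round trip before it can continue.
        self.pending[key] = value

    def prepare(self) -> bool:   # 2PC phase 1: vote
        return True

    def commit(self):            # 2PC phase 2: make durable
        self.committed.update(self.pending)
        self.pending.clear()

local, remote = Resource(), Resource()

# Step 3: the application performs each I/O against both databases.
for key, value in [("x", 1), ("y", 2)]:
    local.write(key, value)
    remote.write(key, value)

# Step 5: two-phase commit across both resources (2 round trips).
if local.prepare() and remote.prepare():
    local.commit()
    remote.commit()
print(remote.committed)   # {'x': 1, 'y': 2}
```

The per-I/O and per-commit round trips are exactly the application-latency cost the slide calls out: the delay grows with the number of I/Os in the transaction and with the distance between nodes.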
Replication Architectures

Synchronous Replication – Contrast 2 Transactional Methods

Coordinated Commits
• (Async) replication engine used
• Application starts a local transaction; it stays local
  • Replication engine “joins” it and can vote on the outcome (to commit or abort)
  • The remote replication engine process starts a local transaction there as well
• Application does local I/O
  • Runs at full speed (until commit time)
• Replication engine extracts, batches, sends, and applies I/O events to the target
  • All database events for all applications batched and sent together
  • Fewer, larger messages sent across the network
• Application commit “decision” is 1 round trip across the network
  • Replication engine checks if the remote replay is ready to commit
  • Votes Yes (commit) or No (abort)
  • Application continues
  • Result is then replicated by the replication engine and applied at the target
Replication Architectures

Uni-Dir Synchronous Replication (Coordinated Commits)

[Figure, built up over several slides: System \LEFT holds the source database, System \RIGHT the target, each with a replication engine; the TMF audit trails sit on the source side. (1) The application begins a transaction and (2) receives the TX ID; (3) the replication engine joins the TX. (4) The application does its I/O; (5) TMF writes the changes to the audit trails; (6) the replication engine reads the audit-trail data and (7) sends it to the target replication engine, which (8) applies it to the target database. At commit time, (9) the application issues its commit request; (10) TMF asks the replication engine for its Phase 1 vote; (11) the engine verifies the target is Ready to Commit and (12) answers the “Allow Commit?” question with a Yes/No reply. (15–18) The commit then completes locally, is replicated, and is applied at the target. Application continues…]

NOTE: Coordinated Commits is a future technology and requires HP’s NonStop Synchronous Replication Gateway.
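The essential contrast with dual writes can be sketched in a few lines: the application does only local I/O, and the single network round trip happens at commit time, when the joined replication engine is asked for its vote. The class and method names below are illustrative, not the Shadowbase API:

```python
# Sketch of the coordinated-commit flow: the application runs at
# full speed locally while the replication engine batches changes
# to the target; one round trip at commit lets the engine vote.
class ReplicationEngine:
    def __init__(self):
        self.batch = []     # events extracted from the audit trail
        self.target = {}    # stands in for the target database

    def extract(self, key, value):
        """Pick up a logged change (batched, sent asynchronously)."""
        self.batch.append((key, value))

    def ready_to_commit(self) -> bool:
        """Phase 1 vote: replay the batched events at the target
        and vote Yes if they all applied."""
        for key, value in self.batch:
            self.target[key] = value
        return True

engine = ReplicationEngine()
source_db = {}

# The application does local I/O only – no per-I/O network delay.
for key, value in [("x", 1), ("y", 2)]:
    source_db[key] = value
    engine.extract(key, value)

# Commit: one round trip to ask whether the target replay is ready.
committed = engine.ready_to_commit()
print(committed, engine.target)
```

Because events for all applications are batched into fewer, larger messages, the network cost scales far better than one-message-per-I/O dual writes, especially over long distances.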
Async and Sync Replication Summary

Synchronous Replication Has Substantially the Same Characteristics as Asynchronous Replication Except for Application Latency:
• Its throughput is substantially the same
  • May need to run more copies of the application server classes, though
• Its failure characteristics are the same
  • In all modes except “any fault fails all nodes”
• Its I/O loading (for replay) is substantially the same
• Its network loading (for Coordinated Commits) is substantially the same
• Its use for eliminating planned downtime (ZDM) is the same

Its Main Downside is Application Latency:
• This delays the source application’s transaction while replication occurs.
• It will increase the number of simultaneous transactions on each node.
• It will extend the time that source data locks are held.
• It may increase the number of source transactions that are aborted and need to be resubmitted (e.g. if target I/O’s cannot be applied or “wait” too long/time out).

And, Failover/Recovery Will Be More Complex:
• Sync Mode → Async Mode → Sync Mode.
• But this should primarily be handled automatically by the replication engine.
Async and Sync Replication Summary

Synchronous Replication Benefits:
• Avoiding data loss following a node failure (RPO = 0)
• And, when coupled with an active/active architecture:
  • Continuous availability (RTO → 0); this means no application outage across node failures!
  • Additionally, the elimination of data collisions when the application is active on all nodes

Coordinated Commits Performs Well for Synchronous Replication If/When:
• The nodes are dispersed (e.g. for disaster tolerance); and/or
• Network loading/performance is an issue; and/or
• The application I/O rates are high; and/or
• The transactions are large; and/or
• The number of nodes grows.

The Key, Then, is to Implement Synchronous Replication for Reasonably “Well-Behaved” Applications
• For example, an application that can scale as load increases (because to attain the same throughput, the application must scale)
Business Continuity Case Studies (All Async)
In Order of Increasing Availability (High to Continuous)
• Classic Disaster Recovery (Active/Passive)
• Sizzling Hot Takeover (Active/“Almost Active”)
• Continuous Availability (Active/Active)
  • Architectures That Avoid Data Collisions
    • Reciprocal Replication
    • Partitioned Application
    • Modify Database Keys
  • Architectures That Identify and Resolve Data Collisions
    • Data Content Resolution
    • Relative Replication
    • Designated Winner (Nodal Precedence)
• Miscellaneous Availability “Enhancers”
  • Asymmetric Capacity Expansion
  • Zero Downtime Migrations (ZDM)
Business Continuity Case Study

Classic Disaster Recovery
• Active → Passive
• Uni-Directional Replication
• Application Active on Primary Node Only
• Passive Node May be Used for Reporting & Read-only OLQP

[Figure: clients connect to the primary node; replication flows uni-directionally from primary to backup.]
Business Continuity Case Study

Sizzling Hot Takeover
• Active → “Almost Active”
• Target Application Hot (Improves RTO)
• Bi-Dir Configured
• No Data Collisions
• Easy to Validate the Backup (Submit Verification Tx’s)
• Facilitates Failover Testing – Periodically “Just Do It”

[Figure: brokers connect to the active node; bi-directional replication is configured between the two nodes.]
Business Continuity Case Study

All Nodes Active: Reciprocal Replication
• Not “Active ↔ Active”
• 2 or More Uni-Directional DR Environments, Each Backing the Other(s) Up
• Separate Databases, so Data Collisions Cannot Occur

[Figure: each node serves its own users against its own database; uni-directional replication runs in each direction, each node backing up the other.]
Business Continuity Case Study

Active/Active: Partitioned Application Users – No Data Collisions
• Active ↔ Active
• All Applications Active
• Bi-Directional
• Users Load Balanced Across Nodes (pre-assigned a “primary” node)
• No Data Collisions

[Figure: Users A–M connect to one node, Users N–Z to the other, with bi-directional replication between them.]
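The partitioning rule itself can be as simple as a name-range lookup. A hypothetical sketch (node names and the split point are ours):

```python
# Hypothetical routing rule for the partitioned-user design: each
# user is pre-assigned a primary node, so the two nodes never update
# the same rows and data collisions cannot occur.
def primary_node(user: str) -> str:
    """Users A-M go to NodeA, N-Z to NodeB."""
    return "NodeA" if user[0].upper() <= "M" else "NodeB"

print(primary_node("Alice"))   # NodeA
print(primary_node("Nancy"))   # NodeB
```

On a node failure, the surviving node simply takes over the failed node's user range; since it already holds a replicated copy of those users' data, no data reload is required.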
Business Continuity Case Study

Active/Active: Modify Application Database – No Data Collisions
• Active ↔ Active
• All Applications Active
• Users Load Balanced Across Nodes
• Application Primarily INSERTs
• No Data Collisions, as the Node ID is Added to the Database Keys

[Figure: users load balanced across both nodes, with bi-directional replication between them.]
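The key modification is the whole trick: including the originating node's ID in each inserted key makes keys globally unique, so concurrent INSERTs on different nodes can never collide. A minimal sketch (the key format is illustrative):

```python
# Sketch of the modified-key scheme: prefix each inserted row's key
# with the originating node ID so keys are unique across nodes.
def make_key(node_id: str, local_seq: int) -> str:
    return f"{node_id}-{local_seq:08d}"

# The same local sequence number on two nodes yields distinct keys,
# so replicating each node's INSERTs to the other cannot collide.
print(make_key("N1", 42))   # N1-00000042
print(make_key("N2", 42))   # N2-00000042
```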
Replication Engine Technology

Async Replication – How are Data Collisions Identified?
• Send the source’s Before as well as After images to the target
• Compare the source’s Before image to the target’s Disk image to check:

  Source Event Type | Target Disk Image
  ------------------+-------------------------------
  INSERT            | Does Not Exist – Apply Insert
                    | Exists – Collision
  UPDATE            | Exists & Same – Apply Update
                    | Does Not Exist – Collision
                    | Exists & Different – Collision
  DELETE            | Exists & Same – Apply Delete
                    | Does Not Exist – Collision
                    | Exists & Different – Collision
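The before-image comparison table can be transcribed almost directly into code. A sketch, representing a missing row as None:

```python
# Transcription of the collision table: the target compares the
# source's before image with its own disk image for each event.
def detect_collision(event: str, before_image, disk_image) -> bool:
    """Return True if the replicated event collides at the target.
    disk_image is None when the row does not exist there."""
    if event == "INSERT":
        return disk_image is not None            # Exists -> Collision
    if event in ("UPDATE", "DELETE"):
        # Does Not Exist, or Exists & Different -> Collision;
        # Exists & Same -> apply the update/delete.
        return disk_image is None or disk_image != before_image
    raise ValueError(f"unknown event type: {event}")

print(detect_collision("INSERT", None, None))          # False: apply insert
print(detect_collision("UPDATE", {"b": 1}, {"b": 2}))  # True: collision
```

A collision means the target row was changed (by the other node) after the source read it – the signal that one of the resolution strategies on the following slides is needed.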
Business Continuity Case Study

Active/Active: Data Content Resolution
• Active ↔ Active
• All Applications Active
• Requests Load Balanced Across Nodes
• Data Collisions Occur, Resolved via Row Contents (e.g. Most Recent Update Timestamp)

[Figure: ATM switch operators’ requests load balanced across both nodes, with bi-directional replication between them.]
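A sketch of the timestamp variant of data-content resolution (row layout and field names are ours):

```python
# Sketch of data-content resolution: when a collision is detected,
# the row with the most recent update timestamp wins, on every node,
# so all nodes converge to the same value.
def resolve_by_timestamp(local_row: dict, incoming_row: dict) -> dict:
    if incoming_row["updated_at"] > local_row["updated_at"]:
        return incoming_row
    return local_row

local = {"balance": 90, "updated_at": 1000}
incoming = {"balance": 80, "updated_at": 1005}
print(resolve_by_timestamp(local, incoming))   # the newer row wins
```

Because both nodes apply the same deterministic rule, it does not matter in which order the colliding updates arrive.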
Business Continuity Case Study

Active/Active: Relative Replication
• Active ↔ Active
• All Applications Active
• Requests Load Balanced Across Nodes
• Data Collisions Occur, Resolved via Relative Replication (e.g. data deltas for numeric fields; text uses Absolute Replication)

[Figure: cellular account activity load balanced across both nodes, with bi-directional replication between them.]
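Relative replication merges colliding numeric updates instead of letting one overwrite the other. A sketch, assuming the replicated event carries both old and new field values:

```python
# Sketch of relative replication: numeric fields replicate as deltas,
# so colliding updates accumulate; non-numeric fields fall back to
# absolute replication (last value replaces).
def apply_relative(row: dict, field: str, old, new) -> None:
    if isinstance(new, (int, float)):
        row[field] += new - old    # apply the delta
    else:
        row[field] = new           # absolute replication for text

account = {"minutes_used": 100, "plan": "basic"}
# Two nodes each add 20 minutes to the same account concurrently;
# each replicates its change (old=100, new=120) to the other:
apply_relative(account, "minutes_used", 100, 120)   # delta +20, node A
apply_relative(account, "minutes_used", 100, 120)   # delta +20, node B
print(account["minutes_used"])   # 140 – both updates preserved
```

With absolute replication, the second update would have overwritten the first and 20 minutes of usage would have been lost.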
Business Continuity Case Study

Active/Active: Designated Winner
• Active ↔ Active
• All Applications Active
• Requests Load Balanced Across Nodes
• Data Collisions Occur, Resolved via a “First to Update the Master Wins” Rule

[Figure: one master node with several peer nodes replicating to and from it.]
Active/Active: Asymmetric Capacity Expansion
• Application Writes go to the Master DB
• Replication fans the data out to read-only nodes
• OLQP – Query the Read-only Nodes
• Add more NonStop “read” nodes as needed…

[Figure: travel agents and Internet travel companies reach the system via the Internet; application writes go to the master DB, and replication feeds the read-only OLQP query nodes.]
Zero Down-time Migration (ZDM) Case Study

Sybase → NonStop Migration of AIM Authentication at AOL
• Old environment: a Linux/Sybase “Partitioned” Database (16 systems, plus a 16-system Sybase DR copy), fronted by a login request complex
• New environment: a NonStop Active/Active login request complex (with a 14-system Linux tier)

Migration steps:
1. Build the new NonStop environment.
2. Capture changes to the source database (a replication engine captures the login changes).
3. Perform the initial load of the new environment (extract from the source, load into the target).
4. Drain the source change queues into the new environment.
5. Validate the new environment; re-load if incorrect.
6. Migrate reads + new users (new login requests) to the new environment, while existing login change requests continue against the old one.
7. Migrate ALL requests to the new environment; retire the old environment.

[Figures: one diagram per step, showing the login request complex, the 16-system Linux/Sybase partitioned database with its Sybase DR copy, the 14-system Linux tier, the NonStop Active/Active complex, and the replication engines linking them as each step progresses.]
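The seven steps above form an ordered procedure with one gate: validation. A sketch of that sequencing (step strings follow the slides; the gating logic is an assumption):

```python
# The ZDM migration steps as an ordered checklist; a real run would
# gate each step on operational checks, not just the validate step.
ZDM_STEPS = [
    "build new NonStop environment",
    "capture changes to source database",
    "perform initial load of new environment",
    "drain source change queues into new environment",
    "validate new environment",
    "migrate reads + new users to new environment",
    "migrate all requests; retire old environment",
]

def run_zdm(validate) -> list:
    """Execute the steps in order; a failed validation triggers a
    re-load before proceeding."""
    done = []
    for step in ZDM_STEPS:
        done.append(step)
        if step.startswith("validate") and not validate():
            done.append("re-load new environment")
    return done

print(len(run_zdm(lambda: True)))   # 7 – clean run, no re-load
```

The point of the ordering is that user-facing traffic never stops: change capture (step 2) runs continuously, so every change made during the load and validation phases is eventually drained into the new environment before cutover.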
For More Information

Breaking the Availability Barrier Series
• Volume 1: Survivable Systems for Enterprise Computing
• Volume 2: Achieving Century Uptimes with Active/Active
• Volume 3: Active/Active Systems in Practice
• Volume 4: Continuous Availability for the 21st Century (Available 2010)
Questions?
Gravic, Inc.
301 Lindenwood Drive
Suite 100
Malvern, PA 19355 USA
www.gravic.com
Phone: +1.610.647.6250
Fax: +1.610.647.7058