Shreyas Bhagavath Devasya, Systems Engineer Analyst, EMC
Shivasharan Narayana Gowda, Software Engineer Developer, EMC
CONTINUOUS APPLICATION PROTECTION
2016 EMC Proven Professional Knowledge Sharing 2
Table of Contents
Preface
History of data protection
  Backups
  Snapshots
  Replication
  Continuous Data Protection
  EMC RecoverPoint
  RecoverPoint Components
  RecoverPoint Protection
Crash Consistency and Application consistency
  Crash Consistency
  Application Consistency
  EMC AppSync
Application Consistency using EMC AppSync
Continuous Application Protection
  What do we mean by CAP?
  Why CAP?
  Near-CAP using AppSync and RecoverPoint
  Enhanced CAP – Our proposal
Conclusion
Glossary
Appendix
Disclaimer: The views, processes or methodologies published in this article are those of the authors.
They do not necessarily reflect EMC Corporation’s views, processes or methodologies.
Preface
Backup and Recovery has evolved from tape backups, which had higher RPO, to continuous bookmarks,
which provide zero RPO. With Continuous Data Protection (CDP) products like EMC RecoverPoint,
production can be restored to any point-in-time state in case of a failure. CDP has changed the face of
backup. Though CDP consumes more space to keep track of each and every I/O change, it serves
business critical applications very well.
CDP products have revolutionized backup and replication technology. Nonetheless, such products take
only crash-consistent copies. The continuity of data protection is achieved in terms of crash consistency.
Crash-consistent copies do not serve database applications well: recovering a database application
from a crash-consistent copy requires a database administrator's intervention. To remove that
barrier, copy data management (CDM) products have evolved. CDM products like EMC AppSync® automate the
recovery, copy, reuse, and repurposing processes by taking application-aware copies of production data.
In this article, we will explore continuous crash-consistent copies and application consistent copies.
Along with that, we will see the possibilities of achieving continuous application consistent copies
(Continuous Application Protection).
History of data protection
Over the past years, data protection techniques have changed and improved based on business needs.
We will discuss the main data protection techniques below.
Backups
One of the oldest methods of data protection is backup, especially tape backup. A full backup copies
all of the protected data, whereas an incremental backup copies only the data that has been
written since the last backup. Backups serve two main purposes. First, the backed up
data can be used to recover the data in case of data loss. Data loss could be due to hardware failures,
data corruption, or user errors such as data deletion. Second, backups can be used to restore data to a
previous point in time at which the backup was taken. Backups are different from disaster recovery.
Backups only make sure that the organization's data is stored in a place where it can be accessed. After
a disaster, an organization would need a substantial amount of time to get the applications up and
running.
The smaller the RPO, the better suited a backup is for disaster recovery, but backups do not provide a
small RPO. Backups are usually performed once or twice a day, mostly at night, so an organization
could lose up to a day of data. Recovering data from backups, especially from tape backups, is
also very time-consuming.
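The difference between a full and an incremental backup can be illustrated with a minimal sketch. The `FileEntry` type, the logical timestamps, and the file names are hypothetical and exist only for this example; real backup software tracks changes in other ways as well.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileEntry:
    path: str
    modified_at: int  # hypothetical logical timestamp of the last write


def full_backup(files):
    """A full backup copies every protected file, changed or not."""
    return [f.path for f in files]


def incremental_backup(files, last_backup_at):
    """An incremental backup copies only files written since the last backup."""
    return [f.path for f in files if f.modified_at > last_backup_at]


files = [FileEntry("a.db", 5), FileEntry("b.log", 12), FileEntry("c.cfg", 20)]
print(full_backup(files))
print(incremental_backup(files, last_backup_at=10))
```

With the last backup taken at time 10, only the two files written afterwards are selected by the incremental run.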
Figure 1: Full Backup
Snapshots
Snapshots are another technology used to protect data. A snapshot is a point-in-time copy of data.
Snapshots can be either a full copy (Clone on VMAX®) or pointer-based (SnapVX on VMAX3).
Full-copy snapshots take up as much storage as the original volume, whereas pointer-based snapshots
take much less, depending on how much of the source data has changed. Full-copy snapshots are
also referred to as split-mirror snapshots. When a mirror operation runs, the entire source
data is copied to the target. During this operation, the source is read/write and the target is read-only.
When the split operation is performed, the target becomes read/write accessible. Pointer-based
snapshots are also referred to as copy-on-write snapshots. Before new data overwrites a block on the
source, the original block is copied to the snapshot's save area, so only changed blocks consume
space. The limitation of this snapshot technique is that the source must be available to
recover the data.
Another snapshot technique is known as redirect-on-write. The main advantage of the redirect-on-write
technique is that it is space-efficient and provides better performance: new writes to the source are
redirected to a separately allocated location. Once again, the limitation of this technique is
the same as that of pointer-based snapshots; the source must be available to recover the data.
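The pointer-based (copy-on-write) behaviour described above can be sketched as follows. This is a toy model under stated assumptions: `CowVolume` and its methods are hypothetical names, a volume is a plain list of blocks, and only one snapshot is supported.

```python
class CowVolume:
    """Toy volume with a single pointer-based (copy-on-write) snapshot."""

    def __init__(self, blocks):
        self.blocks = list(blocks)  # live source data
        self.saved = None           # original blocks preserved for the snapshot

    def take_snapshot(self):
        # Nothing is copied up front; the snapshot starts out as pure pointers.
        self.saved = {}

    def write(self, index, data):
        # Preserve the original block only the first time it is overwritten.
        if self.saved is not None and index not in self.saved:
            self.saved[index] = self.blocks[index]
        self.blocks[index] = data

    def read_snapshot(self):
        if self.saved is None:
            raise ValueError("no snapshot has been taken")
        # Unchanged blocks are read from the source, so the source must be available.
        return [self.saved.get(i, current) for i, current in enumerate(self.blocks)]


volume = CowVolume(["a", "b", "c"])
volume.take_snapshot()
volume.write(1, "B")
volume.write(1, "BB")
print(volume.blocks, volume.read_snapshot(), volume.saved)
```

Only the one overwritten block occupies snapshot space, and `read_snapshot` still depends on the source for the other two, which is the limitation noted in the text.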
Figure 2: Incremental Backup
Replication
Next is replication technology, where data is replicated to a remote site. Replication is a key way to
implement a disaster recovery solution. There are two main kinds of replication: synchronous replication
and asynchronous replication. Most synchronous replication solutions write data to the source and to the
replica simultaneously. The acknowledgement is sent back to the host only after the data is written on the
replica. This imposes a distance limitation between the source and replica since extended distances
increase response time, which might adversely affect the application. Synchronous replication is
preferred where the recovery point objective must be close to zero. Asynchronous replication solutions write data to the
source and then copy the data to the replica, often on a scheduled basis.
The main benefit of asynchronous replication is that it works over long distances and
costs less than synchronous replication. It works over longer distances because the
array does not have to wait until the data is written on the replica to acknowledge the write.
Asynchronous replication is mainly used for offsite backups.
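The distance limitation of synchronous replication can be made concrete with a rough latency model. The figure of about 5 microseconds per kilometre for one-way propagation in fiber is an assumption used for illustration, and the function below is hypothetical; it ignores the replica's own write time and any network equipment delay.

```python
FIBER_US_PER_KM = 5.0  # assumed one-way propagation delay in fiber, microseconds per km


def host_write_latency_us(local_write_us, distance_km, synchronous):
    """Latency the host observes before its write is acknowledged."""
    if not synchronous:
        # Asynchronous: acknowledged as soon as the source has the data.
        return local_write_us
    # Synchronous: the acknowledgement also waits for a round trip to the replica.
    round_trip_us = 2 * distance_km * FIBER_US_PER_KM
    return local_write_us + round_trip_us


print(host_write_latency_us(500, 100, synchronous=True))
print(host_write_latency_us(500, 100, synchronous=False))
```

Under these assumptions, a replica 100 km away adds about a millisecond to every synchronous write, while the asynchronous write latency does not depend on distance at all.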
Figure 3: Snapshots
Another way to classify replication is based on where replication takes place. The three main types of
replication based on that classification are storage array-based replication, host-based replication and
network-based replication. Most enterprise storage arrays have storage array-based replication
technology. EMC VMAX, for example, uses Symmetrix Remote Data Facility (SRDF®) technology to
replicate data to a remote site. SRDF can perform both synchronous and asynchronous replication.
Host-based replication runs on the server hardware. Since the host-based software runs on the server, it
takes up CPU processing power which might affect server performance. Host-based replication software
usually supports only asynchronous replication. Network-based replication requires additional hardware
to replicate the data, an example of which is EMC RecoverPoint. Network-based replication supports
both synchronous and asynchronous replication.
Figure 4: Synchronous Replication
Figure 5: Asynchronous Replication
Continuous Data Protection
CDP is a way of automatically saving every change that occurs on the source volume so that the source
can be rolled back to any point in time. True CDP saves every I/O or change made to the source whereas
near-CDP does this at scheduled time intervals. Using near-CDP you can only roll back to specific points
in time.
The changed data can be stored either internally on the array or externally. CDP is used for business
critical applications which require zero data loss and zero backup window. CDP can be either block-
based or file-based depending on the underlying storage. Applications themselves provide logging which
is not quite CDP. CDP can be implemented using EMC RecoverPoint.
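The idea of true CDP, where every write is journaled so that any point in time can be reconstructed, can be sketched as below. `CdpJournal` is a hypothetical toy model and not the RecoverPoint journal format; it identifies points in time by a write sequence number rather than a clock.

```python
class CdpJournal:
    """Toy CDP journal: records every write so any point in time can be rebuilt."""

    def __init__(self, initial_blocks):
        self.initial = dict(initial_blocks)  # volume image before journaling began
        self.entries = []                    # (sequence, block, data) in write order

    def record(self, block, data):
        sequence = len(self.entries) + 1
        self.entries.append((sequence, block, data))
        return sequence

    def image_at(self, sequence):
        """Rebuild the volume as it was right after the given write."""
        image = dict(self.initial)
        for entry_sequence, block, data in self.entries:
            if entry_sequence > sequence:
                break
            image[block] = data
        return image


journal = CdpJournal({0: "a", 1: "b"})
journal.record(0, "x")
journal.record(1, "y")
journal.record(0, "z")
print(journal.image_at(2))
```

A near-CDP product would, in this model, expose only some sequence numbers (for example every hundredth write) as valid rollback targets.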
EMC RecoverPoint [1, 2]
RecoverPoint is a Storage Area Network (SAN)-based replication technology which provides continuous
data protection for any point-in-time recovery. The RecoverPoint family consists of two main members,
RecoverPoint and RecoverPoint for Virtual Machines. RecoverPoint provides local and remote replication
over distance, both synchronously and asynchronously, and is ideal for disaster recovery and operational
recovery. RecoverPoint is application-independent and allows concurrent remote and local replication.
RecoverPoint minimizes RPO and RTO. The product also supports replication to heterogeneous arrays.
Point in time copies can be created with the granularity of individual writes. The main components of
RecoverPoint are RecoverPoint appliances (physical or virtual), write splitters (in case of splitter based
replication), and RecoverPoint journal.
RecoverPoint Components
The RecoverPoint Appliance (RPA) manages data protection and replication. An RPA
can be a physical appliance or a virtual machine. There must be at least two active RecoverPoint Appliances in
any RecoverPoint cluster. The array-based write splitter is a RecoverPoint software component on
EMC storage systems such as VNX®, VMAX, and VPLEX®. A copy of the write is created at a point in
between the host and the storage array. The splitter makes a copy of the application write and sends
one copy to the RecoverPoint Appliance and the other copy to the storage volume. The RecoverPoint
Appliance identifies the RecoverPoint Cluster the write belongs to and sends it across to that cluster.
A RecoverPoint cluster is a set of 2 to 8 RecoverPoint Appliances. A RecoverPoint system can have a set
of up to five RecoverPoint clusters. RecoverPoint Repository Volume stores all the configuration
information for all clusters. One repository volume is needed per cluster. This repository volume needs
to be accessible by all of the RecoverPoint Appliances.
RecoverPoint Protection
Write order fidelity or write consistency is maintained by RecoverPoint by using consistency groups. The
consistency group makes sure that the writes to source volumes are also written to the replicas in a
consistent and write ordered manner. Write consistency is needed to ensure that a replica can be used
to continue working from or to restore the source to a previous point in time. RecoverPoint copies are
the copies of all the volumes of a consistency group. Consistency groups are made up of one or more
replication sets. Each replication set is made up of a production volume and its replicas. Journal volumes
contain ongoing snapshots of data. A snapshot is a point in time marked by the system for recovery
purposes. A bookmark is a text label which is applied to a snapshot.
Continuous Data Protection using RecoverPoint [3]
Figure 6: CDP
The splitter which resides on the storage array sends a copy of the incoming write to the production
volume and another copy to the RPA. Both RPA and LUN acknowledge the writes back to the splitter.
The write-order is maintained across all the LUNs in the case of a consistency group (which contains
multiple LUNs). Hence, any point-in-time copy of a Consistency Group is write-order-consistent.
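The splitter behaviour and the write-order guarantee across a consistency group can be sketched with a toy model. `ConsistencyGroupSplitter` and the LUN names are hypothetical; the single shared counter stands in for whatever ordering mechanism the real product uses.

```python
import itertools


class ConsistencyGroupSplitter:
    """Toy splitter: one write-order counter shared by every LUN in the group."""

    def __init__(self, luns):
        self.production = {lun: {} for lun in luns}  # LUN -> {block: data}
        self.appliance_log = []                      # copies sent to the appliance
        self._order = itertools.count(1)

    def write(self, lun, block, data):
        sequence = next(self._order)
        self.production[lun][block] = data                       # copy 1: production volume
        self.appliance_log.append((sequence, lun, block, data))  # copy 2: appliance
        return sequence  # returned (the "ack") only after both copies are recorded


splitter = ConsistencyGroupSplitter(["data", "log"])
splitter.write("log", 0, "begin")
splitter.write("data", 7, "row")
splitter.write("log", 1, "commit")
print(splitter.appliance_log)
```

Because the sequence numbers interleave writes from both LUNs, cutting the appliance log at any sequence number yields an image in which the log and data volumes agree on what had been written.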
Continuous Remote Replication using RecoverPoint [3]
In the case of a remote configuration, an additional RPA is located at the remote site which writes all the
copy data to journal volumes at the remote site.
Figure 7: CRR
Crash Consistency and Application Consistency [4, 5]
Crash Consistency
In crash-consistent backups, all the data on disk is saved, but the data in memory is lost. When such a
backup is recovered and restored, the data is identical to the on-disk state at the time of the backup.
Crash-consistent copies suit non-database applications. For database applications, recovery from a
crash-consistent copy takes longer and can introduce exceptions. The advantages of crash-consistent
backups are that they are easy and fast to take.
Application Consistency
An application-consistent copy is one which the application is aware of. An application-consistent copy
saves the data on disk, the data in memory, and the transactions in process. Application-consistent backups
usually use client software which quiesces the database, flushes its memory, completes all the writes in
order, and then takes the backup.
In the case of any failure, the application should be able to recover from such an application consistent
copy with minimum effort by application administrators. Application consistent copies don’t just speed
up restore; they also make repurposing of copies instant and easy.
EMC AppSync [6]
If you know any software which does app-aware Copy Data Management (CDM), then you know EMC
AppSync. AppSync serves many copy data management use cases, such as taking application-aware
copies, scheduling, and repurposing. The focus here is on application consistency.
Application Consistency using EMC AppSync [6]
AppSync takes application-consistent snapshots, bookmarks, or clones (as the user prefers) of several
applications such as VMware, Microsoft SQL, Microsoft Exchange, and Oracle. This helps users recover
applications from a failed state within a few seconds, reducing RTO.
AppSync can make a bookmark created on RecoverPoint application consistent. It
works as follows. The user installs a lightweight AppSync host plug-in service on the host where
applications like Microsoft SQL, Microsoft Exchange, and Oracle DB are running. The AppSync server
communicates with the plug-in service while creating copies of the underlying storage (on which the application
data resides). While doing so, AppSync manages application consistency. Bookmarks which are
app-consistent are marked as ‘application consistent’ by AppSync on the RecoverPoint Appliance. Note that
RecoverPoint does not take application-consistent bookmarks by itself. It only lets the user label a
given bookmark as application consistent; the label alone does not make the data consistent.
Such app-consistent bookmarks, when used to restore production or recover DBs at a different site for
data mining or any other purpose, can be brought up within seconds (or very low RTO).
Continuous Application Protection
What do we mean by CAP?
As we learned from the earlier sections, RecoverPoint can save crash-consistent data by itself. AppSync
has the ability to make RecoverPoint bookmarks application-consistent. Our idea is to combine the two
so that they complement each other. Using AppSync along with RecoverPoint would enable users to take
application-consistent copies frequently enough to simulate ‘Continuous Application Protection’ (CAP).
Though AppSync and RecoverPoint currently do not deliver real CAP, they do deliver near-CAP.
Figure 8: AppSync architecture
Why CAP?
Automated Recovery
When the copies are just crash consistent, there is pain associated with the recovery process. Crash
consistent copies would need application (DB) admin intervention for recovery (either during restore or
repurpose). Copy Data Management (CDM) software like AppSync can automate this process. The user
can easily restore the production to any point-in-time using continuous application-consistent copies.
Near-Zero RTO
As the copies are continuous as well as app-consistent, application (DB) can be recovered instantly from
such copies. It lowers RTO compared to recovering from crash-consistent copies. This would play a
pivotal role in business-critical applications which require Zero RTO along with Zero RPO.
Near-CAP using AppSync and RecoverPoint
AppSync creates and manages copies of application data. A service plan defines the attributes of these
copies. We can subscribe application data objects to a service plan. Then AppSync runs the service plan
and creates copies of the data from the attributes which are specified in the plan. A service plan can be
scheduled to run at particular time intervals to meet particular RPO policy. Users can also customize the
number of copies they want to keep for a particular application object (AppSync rotation policy).
Using the AppSync application consistency feature and copy rotation policy, users can create copies (or
bookmarks) on a scheduled, rotating basis. These copies help restore production with minimum
downtime (near-zero RTO, provided that system performance is high).
When the application data is protected by RecoverPoint and their copies are managed by EMC AppSync,
users can create continuous application consistent copies by scheduling the AppSync Service Plans
frequently enough.
RecoverPoint and AppSync together can provide near-CAP, but they cannot achieve ideal CAP. The
reason is that RecoverPoint, using an array splitter, saves each I/O and makes the copy crash-consistent
at I/O level, whereas AppSync is not in the data path (it is not designed for that) and hence cannot mark
copies app-consistent at I/O granularity. Application-aware software like AppSync needs to be told by
another component to mark a copy as application consistent once the amount of I/O reaches a certain threshold.
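The near-CAP behaviour of a scheduled service plan with a rotation policy can be sketched as follows. The function name and its parameters are hypothetical and are not AppSync settings; time is modelled in whole minutes.

```python
from collections import deque


def run_service_plan(interval_min, duration_min, keep):
    """Return the timestamps (in minutes) of the copies retained after the run."""
    retained = deque(maxlen=keep)  # rotation: the oldest copy drops off when full
    for minute in range(0, duration_min + 1, interval_min):
        retained.append(minute)    # one app-consistent bookmark per scheduled run
    return list(retained)


print(run_service_plan(interval_min=15, duration_min=60, keep=3))
```

In this model, app-consistent recovery points exist only at the scheduled instants, so a failure just before the next run falls back to a bookmark that is almost one full interval old. Shortening the interval narrows that gap but increases the number of freeze requests sent to the application, which is the trade-off the proposal below addresses.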
Enhanced CAP – Our proposal
The idea is to enable RecoverPoint and AppSync-like products to complement each other towards
achieving ideal CAP. The solution should take continuous app-consistent copies and at the same time
should not burden the application with a huge number of freeze and thaw requests. The proposed
enhancements are explained below.
Figure 9: Near-CAP
As shown in Figure 10:
1. Application DB would modify the source data which is written to the source device.
2. Array-based splitter would write one copy of source data to source devices.
3. Array-based splitter would send another copy of data to CDP appliance.
4. CDP appliance records source data at I/O granularity into journals.
5. CDP appliance would track quantity of read/writes and based on that, it would decide whether
to initiate app-consistent copy or not (this quantity would be customizable in order to not over-
burden application server with requests to take app-consistent copies).
6. CDP appliance requests the backup server (and copy data manager) for app-consistent copy.
7. Backup Server asks the Backup Agent to prepare application DB for app-consistent backup.
8. Backup Agent quiesces app data and freezes the application.
9. Backup Agent sends acknowledgement to Backup Server indicating that application is ready to
be backed up.
10. Backup server requests CDP appliance to mark latest data written as app-consistent and CDP
appliance marks the latest bookmark as app-consistent. The CDP appliance sends an
acknowledgement to the Backup Server.
11. Backup Server informs Backup Agent to release applications to perform their regular operations.
Backup Agent would thaw application and release from backup mode.
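The eleven steps above can be sketched as a small simulation. All class and method names are hypothetical, and the trigger is simplified to a count of writes; the proposal leaves the exact quantity customizable.

```python
class BackupAgent:
    def __init__(self, events):
        self.events = events

    def freeze(self):
        self.events.append("freeze")  # steps 8-9: quiesce, freeze, acknowledge

    def thaw(self):
        self.events.append("thaw")    # step 11: release the application


class BackupServer:
    def __init__(self, agent):
        self.agent = agent

    def request_copy(self, appliance):
        self.agent.freeze()           # step 7
        try:
            appliance.mark_latest()   # step 10
        finally:
            self.agent.thaw()         # never leave the application frozen


class CdpAppliance:
    def __init__(self, threshold_writes, events):
        self.threshold_writes = threshold_writes
        self.events = events
        self.journal = []             # step 4: every write, at I/O granularity
        self.app_consistent = []      # journal positions marked app-consistent
        self.pending = 0
        self.server = None

    def on_write(self, data):
        self.journal.append(data)
        self.pending += 1             # step 5: track the quantity of writes
        if self.pending >= self.threshold_writes:
            self.pending = 0
            self.server.request_copy(self)  # step 6

    def mark_latest(self):
        self.app_consistent.append(len(self.journal))
        self.events.append("mark")


events = []
appliance = CdpAppliance(threshold_writes=3, events=events)
appliance.server = BackupServer(BackupAgent(events))
for i in range(7):
    appliance.on_write(f"write-{i}")
print(appliance.app_consistent, events)
```

With a threshold of three writes, seven writes produce two app-consistent marks while all seven writes remain individually recoverable as crash-consistent points in the journal.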
In Figure 10, the communication between the CDP appliance and the Backup Agent might cause a delay, as every
request for creating an app-consistent copy must pass through the Backup Server. Also, when the Backup Agent
has frozen the application DB while taking an app-consistent copy, it cannot be left in that state for very
long (e.g. for Microsoft SQL and Exchange, the VSS agent allows only a 10-second window to hold I/O). For
that reason, if we can equip the CDP appliance with limited Backup and Copy Data Management
capabilities, we can achieve app-consistent copies much quicker. One such model is discussed below.
Figure 10: Enhanced CAP using separate CDP Appliance and Backup Server
In Figure 11, you can see that the CDP appliance and the Backup Server are consolidated. Steps 1 to 5 are the same
as explained for Figure 10.
1. Application DB would modify the source data which is written to the source device.
2. Array-based splitter would write one copy of source data to source devices.
3. Array-based splitter would send another copy of data to CDP appliance.
4. CDP appliance records source data at I/O granularity into journals.
5. CDP appliance would track the quantity of read/writes and based on the amount would decide
whether to initiate app-consistent copy or not (again, this quantity would be customizable in
order to not over-burden application server with requests to take app-consistent copies).
6. CDP appliance, which is also Backup Server (with CDM capabilities), will request Backup Agent to
freeze the I/O of DB.
7. Backup Agent freezes DB, quiesces the DB and flushes in-memory data.
8. Backup Agent will notify the CDP appliance that it is ready for the app-consistent copy.
9. CDP appliance will mark the latest state of the copy as app-consistent.
10. ACK is sent to Backup Agent to release application DB to further proceed with regular I/O
operations. Backup Agent would thaw application DB to release I/O.
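Because the consolidated model removes a hop, the time the application stays frozen becomes the main budget to watch. The sketch below shows steps 6 to 10 with a freeze-window check. The function name and the callback parameters are hypothetical, and the 10-second default reflects the VSS hold window cited in this article; the sketch only reports an overrun and does not model what a real agent does when the window expires.

```python
import time

FREEZE_WINDOW_S = 10.0  # hold window cited in the article for VSS


def app_consistent_mark(freeze, mark, thaw, clock=time.monotonic,
                        window_s=FREEZE_WINDOW_S):
    """Freeze, mark, thaw; report whether the mark landed inside the window."""
    started = clock()
    freeze()                      # steps 6-8: agent freezes and flushes the DB
    try:
        mark()                    # step 9: appliance marks the latest state
        within_window = (clock() - started) <= window_s
    finally:
        thaw()                    # step 10: always release application I/O
    return within_window


calls = []
ticks = iter([0.0, 2.5])          # fake clock so the example is deterministic
ok = app_consistent_mark(
    freeze=lambda: calls.append("freeze"),
    mark=lambda: calls.append("mark"),
    thaw=lambda: calls.append("thaw"),
    clock=lambda: next(ticks),
)
print(ok, calls)
```

Passing the clock as a parameter is a design choice made for this example so that the timing logic can be exercised without waiting in real time.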
Figure 11: Enhanced CAP using consolidated CDP Appliance and Backup Server
The methodologies above are only our proposals to achieve real Continuous Application Protection.
We have not implemented them, and we cannot predict the technical challenges that may surface
during implementation.
Conclusion
Existing CDP products can provide zero-RPO. Though CDP can help achieve low RTO, there is still room
for improvement. In this article we have explained Continuous Application Protection which would
achieve zero-RTO along with our proposal to achieve CAP. EMC, a pioneer in data protection and copy
data management, can implement these models with minimum effort. This could also add value to
existing EMC products such as RecoverPoint, AppSync, and ProtectPoint.
Glossary
App-consistent - Application consistent
Backup Agent - Host-based software which helps take app-consistent backups
Backup Server (CDM) - AppSync like products which can take application consistent backups
CDM - Copy Data Management/Manager
CDP - Continuous Data Protection
CDP Appliance - Similar to RPA, which can save continuous data changes to source disk
DB - Database application
LUN - logical unit number
RPO - Recovery Point Objective
RTO - Recovery Time Objective
Appendix
1. EMC RecoverPoint Whitepaper - https://brazil.emc.com/collateral/software/white-papers/h4175-recoverpoint-clr-operational-dr-wp.pdf
2. EMC RecoverPoint 4.4 Admin Guide - https://support.emc.com/docu62057_RecoverPoint-4.4-Administrator's-Guide.pdf?language=en_US
3. Blog - http://davidring.ie/2013/03/25/emc-recoverpoint-architecture-and-basic-concepts/
4. TechTarget Blog - http://searchdatabackup.techtarget.com/answer/Crash-consistent-vs-application-consistent-backups-of-virtual-machines
5. N2WS Blog - http://www.n2ws.com/blog/ebs-snapshots-crash-consistent-vs-application-consistent.html
6. EMC AppSync 2.2.2 User and Admin Guide - https://support.emc.com/docu61180_AppSync-2.2.2-User-and-Administration-Guide.pdf?language=en_US
EMC believes the information in this publication is accurate as of its publication date. The information is
subject to change without notice.
THE INFORMATION IN THIS PUBLICATION IS PROVIDED “AS IS.” EMC CORPORATION MAKES NO
REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS
PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS
FOR A PARTICULAR PURPOSE.
Use, copying, and distribution of any EMC software described in this publication requires an applicable
software license.