Best Practices in Disaster Recovery Testing

Page 1: Best Practices in Disaster 2 Recovery Testing 3docs.media.bitpipe.com/io_10x/io_109982/item_694647/Best... · 2013-05-28 · Home Editor’s Letter Avoid These Common DR Test Mistakes

Handbook

1 EDITOR’S NOTE

2 AVOID THESE COMMON DR TEST MISTAKES

3 IMPROVING BC PLAN EXERCISES

4 DISASTER RECOVERY COSTS: MAKE TESTING, PLANNING COST-EFFECTIVE

VIRTUALIZATION, CLOUD, APPLICATION DEVELOPMENT, NETWORKING, STORAGE ARCHITECTURE, DATA CENTER MANAGEMENT, BI APPLICATIONS, DISASTER RECOVERY/COMPLIANCE, SECURITY

Best Practices in Disaster Recovery Testing

DR testing is frequently put off or overlooked entirely. However, many surveys show that IT pros are not confident in their ability to recover in a timely manner following a disaster.


1 EDITOR’S LETTER

Technology is No Substitute for Good Backup Strategy

Every year, we read a crop of surveys that indicate organizations are not confident in their ability to recover data after an outage. There are a variety of reasons why IT people lack confidence in their DR plans, but many simply lack confidence in the backup/DR technologies they rely on.

The thing is, it shouldn’t really be about that. The technology is just a tool to achieve a goal, and it’s too easy to just say ‘Oh well, it doesn’t work.’ That’s not going to fly where near 24/7 uptime is expected and organizations are subject to regulatory compliance mandates requiring that data is protected and easily accessible.

Technology isn’t a substitute for a good backup strategy. And testing is the only way to find holes in your strategy. Too often organizations think of DR as a ‘set it and forget it’ exercise when DR really needs to be ongoing. Also, many people believe that all DR tests have to be elaborate. But, while full-scale DR tests should be conducted, there is also a lot to learn from smaller table-top exercises conducted frequently to test specific aspects of your plan. Whatever the size or shape your test takes on, though, the goal is to find vulnerabilities. It is important not to conduct tests under perfect conditions, because conditions will be far from perfect following a disaster or other outage. It’s important for tests to simulate reality.

This Handbook offers a Q&A with independent DR expert Jon Toigo that covers a variety of DR testing topics from how frequently tests should be conducted to exactly what tests should entail. You will also find an article about conducting cost-effective DR tests and a piece about how and why testing should be built into your DR plan.

Andrew Burton

Senior Site Editor, SearchDisasterRecovery.com


2 DR TEST MISTAKES

Avoid These Common Mistakes in DR Tests

Disaster recovery planning is useless if you can’t restore business operations following an interruption. So, setting up a DR plan and forgetting about it until something happens isn’t an option. The only way you can be sure that your disaster recovery measures will be adequate is to test your plan regularly.

Why, then, do so many organizations neglect to test their plans or perform inadequate tests? Because good DR testing can be time-consuming and difficult.

In this Q&A, independent disaster recovery expert Jon Toigo discusses some of the most common mistakes organizations make when performing DR tests, the types of tests you should consider running, how frequently tests should be conducted, and the technologies involved in DR testing.

In every disaster recovery methodology, testing is a huge component. What makes it so important?

I could recite lots of platitudes, but as a practical matter, testing is key to change management in disaster recovery planning. From this standpoint, testing helps to identify “gaps” in the recovery capability that you have developed to respond to a disruptive event and to assure continuous or near-continuous business operations. You need to test periodically to identify changes in personnel, business processes, applications and technology infrastructure (things that change constantly in the real world) that may impact the strategies you have developed to accomplish the three core objectives of DR: data restoration, application re-hosting, and user re-connection. Following the test, you take the information acquired to improve the plan. Then you re-test.

The more important value of testing is rehearsal. Plans tend to be complex instruments with lots of moving parts that require multiple recovery teams to work in concert. The more you can rehearse, the better your team members will understand their individual roles and the interdependencies between what they are doing and what others are doing. That is very important since it enables the teams to work in a mostly independent way and to perform tasks in a reasonably dependable way even in the face of a great irrationality: a disaster.

What are the components of testing as it is conceived today?

I think we are doing testing wrong today. Basically, tests are conducted on scheduled days, once or twice a year, or sometimes quarterly. We take teams off site and hold an event where we test how we will recover data from a backup or mirror, how we will restore applications in a minimum equipment configuration, and how we will reconnect application hosts to a network so that users can get to their work with adequate if not optimal performance levels and security.


This is the way testing has been done for a long time, but it has the limitation of being non-linear. We break strategies down into tasks, and we test tasks out of order, in a non-linear fashion. That way, the failure of one test will not prevent us from undertaking the other scheduled tests. Such an approach obviates the teaching value of testing. The human brain doesn’t readily reorganize non-linear tests into a coherent end-to-end understanding of the strategy, the team member’s role in the strategy, or the interdependencies between the roles and activities of different team members. That pretty much demolishes the rehearsal value of testing.

Add to this one other criticism: The act of formal test preparation tends to skew the outcome of the test. We pull the right backup tapes in advance, ensure that all necessary tools and equipment are present, and so forth, all of which make the test less and less like a real-world application of procedures in the face of an actual disaster. That also limits the efficacy of traditional testing.

In an ideal world, what would a DR test entail?

In my perfect world, we would have defined strategies for data protection and restore that can be tested in real time, on an ad hoc basis, using either simulated or “live” procedures. Backups to tape should be subjected to read/write verification to ensure that the data replicated is the right data and that it can be restored when needed. Data replicated to disk also must be verified routinely and not as a part of some formal test event.
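
As a rough illustration of routine verification outside a formal test event, the Python sketch below records a checksum for every protected file and later checks a restored copy against that record. The function names (`build_manifest`, `verify_restore`) and the idea of restoring into a scratch directory are assumptions made for this example; they do not describe any particular backup product.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file path (relative to root) to its checksum at backup time."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            manifest[os.path.relpath(full_path, root)] = sha256_of(full_path)
    return manifest


def verify_restore(manifest, restored_root):
    """Return a list of (relative_path, problem) for files that fail the check."""
    problems = []
    for rel_path, expected in sorted(manifest.items()):
        restored_path = os.path.join(restored_root, rel_path)
        if not os.path.isfile(restored_path):
            problems.append((rel_path, "missing"))
        elif sha256_of(restored_path) != expected:
            problems.append((rel_path, "checksum mismatch"))
    return problems
```

An empty list from `verify_restore` means every file in the manifest came back intact; anything else is a finding to feed into plan maintenance.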


Ideally, we would select strategies for application re-hosting and for network reconnection that also avail themselves of testing at any time during the normal operating day, and without disrupting normal operations. Geo-clustering holds out the promise of such a strategy, as do, to a certain extent, server and storage virtualization techniques. Again, you should be able to confirm that system and network recovery capabilities are up to the task without waiting for a formal test event to find out.

If we could accomplish these goals, formal testing would come down to a much simpler set of tasks having to do with logistics: how the disaster would be identified, who would be contacted and in what order, how customers would be notified, how teams would travel to a recovery facility, how externalized services (e.g., “clouds,” hot sites, vendors tasked to drop-ship recovery supplies, the phone company, user facilities, etc.) would be activated, and how the order of recovery tasks would unfold. That could be tested in a very linear fashion and at much less expense than traditional testing entails today.

What are the biggest mistakes you see companies make when performing a DR test?

First and foremost, fewer than 50% of companies with plans test them at all. That is a huge mistake. If you are going to go to the trouble of defining a continuity capability and provisioning your strategies with people and resources, you ought to test the result to make sure that your theory of operation matches the reality of a disaster event. These days, with budget dollars in short supply, too many firms are skipping the test, or at least postponing it into oblivion. Not a good idea.

When companies do test, something I am occasionally brought in to observe, the same sort of problems tend to arise. The short list includes:

■ Failing to bring the right tapes or other recovery media to the test

■ Too many cooks: management wants to show their support and attends the test, getting in the way and wasting precious time. If they want to be part of the test, give them a clipboard and make them observe and take notes.

■ Lack of standards for test data collection: taking notes on random scraps of paper makes correlating notes with tested activities, and producing summary reports, a difficult process

■ Failure to perform post-mortem interviews with test teams while the experience is still fresh in their minds. That’s where you obtain some of the best data for procedure refinement

■ Tendency to shape results to confirm or validate planning, ignoring failures: there is no such thing as a failed test. Tests produce information that can and should be used to improve procedures so that recovery activity can be accomplished within the specified timeframe. Yet, many planners try to hide “failures” to prevent management from losing faith in the project.
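
One way to address the data collection point in the list above is to give every observer the same record format. The Python sketch below is only an illustration: the `Observation` fields and the `to_csv` helper are hypothetical names chosen for this example, not part of any standard. Missed targets are recorded in the same way as met ones, so results are not quietly shaped.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class Observation:
    """One observer's record of one tested procedure (hypothetical layout)."""
    test_id: str
    procedure: str
    observer: str
    target_minutes: int
    actual_minutes: int
    notes: str = ""

    @property
    def met_target(self):
        # A missed target is information for plan maintenance, not a failure to hide.
        return self.actual_minutes <= self.target_minutes


def to_csv(observations):
    """Serialize observations to CSV text so summary reports are easy to build."""
    buffer = io.StringIO()
    columns = [f.name for f in fields(Observation)] + ["met_target"]
    writer = csv.DictWriter(buffer, fieldnames=columns)
    writer.writeheader()
    for obs in observations:
        row = asdict(obs)
        row["met_target"] = obs.met_target
        writer.writerow(row)
    return buffer.getvalue()
```

The resulting CSV can be sorted by procedure or filtered on `met_target` when preparing the post-test report and the post-mortem interviews.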

Why is it important to break mirrors as part of your DR test?

Mirroring is the process of replicating data between two disk stores, usually across a local storage interconnect. Its cousin, replication, involves the same process but across distance using a WAN. Mirroring is usually viewed as synchronous, while replication is asynchronous (distance-induced latency creates data deltas or differences between the state of local and remote data stores).

While mirroring is regarded as a synchronous form of data copy, and pretty close to real time if done correctly, a lot of folks outsource the mirroring process to their array vendor. They copy after write rather than during write. In effect, data is written to disk on array #1, then it is copied by on-array software to an identical array #2 located nearby using a high speed/high bandwidth link. This approach drives up the cost of mirroring by locking you into the same vendor’s gear on both the primary and mirrored array.

But the real problem is one that a little software company, 21st Century Software, has been showing off in its presentations for a couple of years. They show actual screen shots of multiple customers who thought their hardware was mirroring the right data, only to discover that NO DATA was actually being mirrored. This usually happens when the volumes in the array that hold data are reorganized by the vendor service tech or by a storage administrator, but the DR coordinator is not informed of the change. Without periodically breaking the mirror, you might not notice the problem… until an actual disaster occurs.

So, why don’t folks break their mirrors and test? Simple: It is a hassle to quiesce applications, flush caches to volume 1, replicate to the mirrored volume 2, then shut everything down to do a file-by-file compare between the two. It takes time, both operator time and time from production application work, and there is no certainty that systems will restart and mirroring will resume properly.
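
The file-by-file compare step can be sketched in a few lines of Python. This is a minimal illustration that assumes the mirror has already been broken and both volumes are mounted as ordinary directories; the function name `compare_volumes` and the report layout are choices made for this example, and the sketch does not quiesce applications or flush caches.

```python
import filecmp
import os


def compare_volumes(primary_root, mirror_root):
    """Compare two directory trees file by file.

    Returns a dict of sorted relative paths: 'missing' (absent on the mirror),
    'different' (contents differ) and 'extra' (present only on the mirror).
    """
    report = {"missing": [], "different": [], "extra": []}

    for dirpath, _dirnames, filenames in os.walk(primary_root):
        rel_dir = os.path.relpath(dirpath, primary_root)
        for name in filenames:
            rel_path = os.path.normpath(os.path.join(rel_dir, name))
            mirror_path = os.path.join(mirror_root, rel_path)
            if not os.path.isfile(mirror_path):
                report["missing"].append(rel_path)
            # shallow=False forces a comparison of file contents, not just metadata
            elif not filecmp.cmp(os.path.join(dirpath, name), mirror_path, shallow=False):
                report["different"].append(rel_path)

    # Second pass: files that exist only on the mirror side
    for dirpath, _dirnames, filenames in os.walk(mirror_root):
        rel_dir = os.path.relpath(dirpath, mirror_root)
        for name in filenames:
            rel_path = os.path.normpath(os.path.join(rel_dir, name))
            if not os.path.isfile(os.path.join(primary_root, rel_path)):
                report["extra"].append(rel_path)

    for paths in report.values():
        paths.sort()
    return report
```

A mirror that has silently stopped copying would show up here as a long `missing` or `different` list, which is the kind of finding the unbroken mirror would never reveal.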

Remote replication has some of the same problems, but it may be possible to test for data deltas without quiescing the replication process. There, you have a ton of other issues related to latency and to jitter. You need to be vigilant that you have a data store to which you can recover.

How frequently should DR tests be conducted? Or should it be determined by changes to your infrastructure?

Depends what you mean by test. Formal test events, as we noted previously, entail a lot of activities that planners would be smart to design into the strategies themselves. We should be able to validate data protection and recovery on an ad hoc basis, every day! With geo-clustering, we could simulate or perform failovers to remote kit and networks at any time of the day or night. That leaves logistics testing, which only needs to be done two to four times a year, and most certainly following significant changes to business processes or infrastructure.
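
The scheduling rule in this answer (a fixed number of logistics tests per year, plus a test after any significant change) is simple enough to express as a small check. The Python sketch below is an illustration only; the function name and the `significant_change` flag are assumptions, and deciding what counts as a significant change remains a judgment call for the planner.

```python
from datetime import date, timedelta


def logistics_test_due(last_test, today, tests_per_year=4, significant_change=False):
    """Return True when a logistics test should be scheduled.

    tests_per_year: the text suggests two to four; four is used as the default here.
    significant_change: set by the planner after a major change to business
    processes or infrastructure, which triggers a test regardless of the calendar.
    """
    if significant_change:
        return True
    interval = timedelta(days=365 / tests_per_year)
    return today - last_test >= interval
```

For example, with the default of four tests a year, a test last run on 1 January is due again from early April onward, or immediately if a significant change is flagged.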

Table-top exercises and plan walkthroughs are great testing vehicles that can be used much more frequently than formal off-site test techniques. They have the merit of being much less expensive to conduct and have fewer logistical requirements, and they give teams a chance to interact with each other as they walk through documented procedures for accomplishing their work.

Just keep in mind that these tests must also be conducted in an objectives-driven way. You may not be testing to a stopwatch, but interactions need to be kept sufficiently formal and professional to exercise the concept of a strategy or procedure in an effective way.

How can storage virtualization help DR testing?

Storage virtualization entails the establishment of a software controller ahead of all vendor equipment. Generally speaking, you negate the “value-add” software installed on heterogeneous storage hardware and use the “uber-controller” to serve value-add functionality across all disks in all cabinets.

Among the services usurped from proprietary hardware controllers and provided instead, and on a more universal basis, by the storage virtualization uber-controller are data protection functions like mirroring, replication, continuous data protection, snapshot copies, etc. These services are centralized in the storage hypervisor software layer, away from hardware, so they are not limited or constrained by the brand names on the various storage arrays: data can be replicated across spindles in any of the heterogeneous storage rigs, which are represented as virtual volumes. If your remote storage environment is also virtualized, storage hypervisors typically feature the ability to target remote volumes with writes, simplifying WAN replication.

Perhaps the coolest part of this scenario is the ability to manage, allocate, and test data protection services from one place, a single pane of glass. That simplifies the testing and validation of data protection processes, in some cases obviating the need to quiesce applications to test mirrored volumes for consistency.

If you use a “hot site” for DR, how does the testing process typically work and does it differ widely between providers?

With a hot site, as with some of the current generation “cloud” or managed hosting solution providers, you arrange for a certain amount of test time over the course of the year as part of your subscription agreement. The quantity of test time provided, the roles that will be played by service provider personnel, the specifics on how tests need to be scheduled, as well as the minutiae of security, logistics, and communications, will vary from provider to provider.

I believe that too little is being done to validate the measures that service providers are deploying to safeguard your applications and data, especially when it comes to platform and infrastructure cloud service providers. They all claim to provide Tier 1 data centers and comprehensive professional operations capabilities, but very few clients actually ever travel to the cloud provider sites and confirm that they are getting anything more than some old servers racked in a bent-up cage at the end of a row in some musty old managed hosting facility.

What else should people consider before choosing the cloud as a backup target?

Cloud storage could be an effective backup target, and I don’t paint all cloud providers with the same brush. But the facts need to be closely considered.

First, there is usually one charge for writing data to the cloud storage target, and other charges for getting the data back. You need to know what all of the charges actually are.

Second, most cloud storage targets provide adequate bandwidth for transporting changed data to the cloud once the original or full data backup has been copied (often a teeth-pulling experience). Following a disaster, you need your data, perhaps close to all of your data, restored. Chances are good that the WAN connection to the cloud is inadequate to this task.

Remember: it takes more than a year to move 10TB of data across a T-1 (DS-1) WAN link. While MPLS networks may provide bigger pipes, using them typically means that your backup data is inside the boundary we might consider to be a minimum safe distance demarcation between your production data center and recovery environment (about 50 kilometers).
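
The transfer-time figure is easy to check with back-of-the-envelope arithmetic. The Python sketch below assumes a T-1 line rate of 1.544 Mbit/s and decimal terabytes, and by default ignores protocol overhead, so real transfers would take longer; the function name and the `efficiency` parameter are illustrative.

```python
T1_BITS_PER_SECOND = 1_544_000   # nominal T-1 (DS-1) line rate
TEN_TB_BYTES = 10 * 10**12       # 10 TB, using decimal terabytes
SECONDS_PER_DAY = 86_400


def transfer_days(data_bytes, link_bits_per_second, efficiency=1.0):
    """Days needed to move data_bytes over a link.

    efficiency is the usable fraction of the line rate (1.0 = no overhead).
    """
    seconds = data_bytes * 8 / (link_bits_per_second * efficiency)
    return seconds / SECONDS_PER_DAY


days = transfer_days(TEN_TB_BYTES, T1_BITS_PER_SECOND)
print(f"{days:.0f} days, or about {days / 365:.1f} years")
```

Under these assumptions the result is roughly 600 days, which is consistent with the claim of more than a year. Plugging in your own restore volume and link speed gives a quick sanity check on whether a WAN restore fits your recovery time objective.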

Those constraints need to be taken seriously. I personally would never use a cloud backup service provider whose facility was closer than 50 kilometers or one that could not provide me my data back on tape.
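
If a provider will disclose the location of its facility, the 50-kilometer rule of thumb can be checked with a great-circle distance calculation. The Python sketch below uses the haversine formula with a mean Earth radius of 6,371 km; the function names are illustrative, and straight-line distance is only a first approximation of separation, since it says nothing about shared power grids, flood plains, or network paths.

```python
import math

EARTH_RADIUS_KM = 6371.0
MIN_SAFE_DISTANCE_KM = 50.0  # rule-of-thumb figure quoted in the text


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in kilometers between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_far_enough(primary, recovery, minimum_km=MIN_SAFE_DISTANCE_KM):
    """primary and recovery are (latitude, longitude) tuples in degrees."""
    return great_circle_km(*primary, *recovery) >= minimum_km
```

For instance, two sites a quarter of a degree of latitude apart (about 28 km) would fail the check, while sites a full degree apart (about 111 km) would pass.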

Do cloud providers offer anything to facilitate testing?

It really varies from service to service. Most do not want to touch your data as a function of legal liability avoidance, so they do little to validate your backups or replicas.

Jon Toigo


3 BC PLAN EXERCISES

Improving BC Plan Exercises

Planning and conducting business continuity (BC) plan exercises is one of the most important activities in a business continuity program.

Conducting one or more BC plan exercises annually is a key component of a business continuity management system (BCMS). Exercises should be scheduled and integrated with other BCMS activities, such as plan updating, emergency team training, policy reviews and audits, business impact analyses (BIAs), risk assessments (RAs), and awareness programs.

A BC plan exercise is not the same as a disaster recovery test. For instance, you don’t actually fail over in a BC plan exercise. That’s what you do in a typical technology disaster recovery test, which addresses the recovery of IT systems, data, databases and so on. This is strictly business continuity.

When planning a BC exercise, the following are priorities:

1. Decide specifically what you plan to exercise, e.g., the entire plan or parts of the plan such as incident response procedures or the evacuation plan.

2. Secure a location to conduct the test that is away from any possible interruptions, and encourage exercise participants to turn off their mobile devices if possible so they can concentrate on the exercise. If possible, conduct the exercise outside the participants’ offices in a less conspicuous location. If this is not possible, it may make sense to schedule the exercise outside of normal work hours or perhaps over a weekend.

3. It may be useful to invite participants other than the exercise developer(s) and representatives of the department(s) or activity being exercised, such as staff from IT, operations, risk management, human resources, legal, quality assurance and internal audit, but this is not mandatory. A corollary to this is to have the “right” participants in the exercise. This means inviting people who have a true stake in protecting their department, as well as the company. Inviting senior management to an exercise is often avoided because the fear is that a senior manager may get too involved (e.g., try to take over the exercise) and other exercise participants may reduce their level of participation in deference to the executive.

4. It’s not necessary to complete a “successful” exercise. Completing a successful exercise doesn’t necessarily mean that the plan ran perfectly, the emergency team is fully prepared or that employees are ready to respond. It’s far better to identify flaws in the exercise logic and supporting activities now, rather than later (e.g., during an incident), when the flaws could result in serious consequences.

You should also assign someone as a timekeeper and scribe, so that a record of the exercise can be produced. This is important from an audit perspective and also for regulated organizations like banks or firms that are scrutinized by government agencies, such as pharmaceutical companies overseen by the U.S. Food and Drug Administration (FDA). And, it’s a good practice for all exercises.

While not usually a priority, consider launching a surprise exercise in addition to scheduled exercises. This is perhaps the best way to determine if your emergency teams are really prepared to respond to a business-threatening incident. Some advance planning (e.g., warning) is advised, especially if your exercise affects other departments, such as IT or facilities. Also, if other departments, such as IT, have scheduled exercises at the same time as your surprise event, it may be prudent to reschedule. Of course, in real life, there will be no advance warnings or courtesy calls alerting you and others of an impending disaster.

Well-planned and conducted BC exercises are important investments in a company’s long-term success and survival. Knowledge of regularly scheduled exercises can also enhance the firm’s reputation and competitive position, especially since more organizations today require data about a prospective vendor/partner’s business continuity and disaster recovery activities.

Paul Kirvan


4 MAKE DR AFFORDABLE

Disaster Recovery Costs: Make Testing, Planning Cost-Effective

For many, the biggest inhibitor to implementing an effective disaster recovery (DR) plan is cost. The shock that is often associated with the price of a proposed DR solution becomes something that organizations strapped for funds are simply unwilling to swallow, so they may end up stopping a disaster recovery project in its tracks.

The question then for many data storage administrators is often, “Okay, how much DR can we buy for this amount of money?” rather than, “Here are the capabilities we need, let’s find a way to make it affordable.” The goal for every company trying to compose a cost-effective disaster recovery plan should be to understand the range of cost-saving options and the associated tradeoffs, and revise their disaster recovery strategy based on the choices deemed most acceptable.

In this article, learn about where you can look for efficiencies that will not compromise disaster recovery. Learn about where you can find opportunities for savings in areas like disaster recovery testing and new technologies that will reduce disaster recovery costs.

Here are some areas to explore to realize more cost-effective disaster recovery:


1. Eliminate idle assets. One of the greatest contributors to disaster recovery cost is the expense of maintaining assets that largely sit idle waiting for a disaster to happen. For years, it was common to see servers and data storage sitting unused in a disaster recovery facility. Today, very few organizations can allocate funds for such unused capacity. Devising a plan that lets systems serve more than one purpose, to any degree, can dramatically improve the disaster recovery cost structure.

One of the most common approaches is to leverage test and development environments as backups for disaster recovery. The challenge here is determining how long these functions can be unavailable during a disaster situation.

2. Standardize. The more different types of widgets that are deployed, the greater the number of distinct types of resources needed for disaster recovery. Likewise, multiple configurations and system software variants of the same platform make DR design and testing more costly. Limiting variants of platforms and other infrastructure components and defining standard configurations reduce complexity and unnecessary costs.
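To make the variant count concrete, here is a minimal Python sketch. It assumes a hypothetical inventory in which each server is described by platform, system software version and configuration profile; the field layout and sample values are illustrative assumptions, not taken from any real tool. Each distinct combination is one more build that DR design and testing would have to cover.

```python
from collections import Counter

# Hypothetical inventory: one (platform, software_version, profile) tuple per server.
inventory = [
    ("platform-a", "os-v2", "web"),
    ("platform-a", "os-v2", "web"),
    ("platform-a", "os-v1", "web"),
    ("platform-a", "os-v3", "db"),
    ("platform-b", "os-v1", "db"),
]

def variant_counts(servers):
    """Count how many servers share each distinct configuration."""
    return Counter(servers)

def one_off_variants(servers):
    """Return configurations used by exactly one server, sorted for stable output."""
    return sorted(cfg for cfg, n in variant_counts(servers).items() if n == 1)

variants = variant_counts(inventory)
print(f"{len(inventory)} servers, {len(variants)} distinct configurations")
for cfg in one_off_variants(inventory):
    print("one-off configuration:", cfg)
```

Configurations that appear only once are natural candidates to fold into a standard build, although some may reflect genuine application requirements.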

3. Automate. Beyond standardization, the use of automation, where possible, can simplify the testing process, improve reliability and drive efficiencies by reducing the number of hours necessary to complete tasks. Automation opportunities exist in areas like system deployment, data replication, and host or application failover.
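As one illustration of what automating a test might look like, the Python sketch below runs a list of recovery steps in order, times each one and stops at the first failure. The step functions are hypothetical placeholders that do nothing; in practice each would call whatever deployment, replication or failover tooling an organization already uses.

```python
import time

def run_steps(steps):
    """Run (name, callable) steps in order and stop at the first failure.

    Returns a list of (name, succeeded, seconds) tuples for the steps attempted.
    """
    results = []
    for name, action in steps:
        start = time.perf_counter()
        try:
            action()
            ok = True
        except Exception:
            ok = False
        results.append((name, ok, time.perf_counter() - start))
        if not ok:
            break  # later steps are assumed to depend on earlier ones
    return results

# Hypothetical placeholder steps; replace the bodies with real tooling calls.
def deploy_system():
    pass

def verify_replication():
    pass

def fail_over_application():
    pass

steps = [
    ("deploy system", deploy_system),
    ("verify data replication", verify_replication),
    ("fail over application", fail_over_application),
]

for name, ok, seconds in run_steps(steps):
    print(f"{name}: {'ok' if ok else 'FAILED'} ({seconds:.3f}s)")
```

Recording the elapsed time per step gives a simple record of how many hours a test actually consumes from one run to the next.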

4. Virtualize. Perhaps the greatest opportunity to drive down DR planning and testing costs today is offered by virtualization. Server virtualization mitigates the idle assets problem and can greatly assist in efforts to automate the disaster recovery process, not only reducing costs but also offering the potential for improved levels of service.

Storage virtualization also has a role to play, but other than adding a common management layer, it’s simply another way to offer things like data migration, data replication and snapshots.

5. Operationalize. All too often the DR process is treated as an exception, something that exists somewhere on the fringes of IT rather than as part of day-to-day IT operations.

By better integrating disaster recovery into the core functions of IT and thinking about DR as part of application development, architectural design and operational planning activities, organizations can implement more efficient DR solutions.

After-the-fact or one-off solutions that must be force-fitted and then exist as exceptions are costly and become difficult to manage and maintain.

6. Document. A traditional weakness in the realm of disaster recovery is the lack of current and comprehensive documentation. What does this have to do with cost? Besides the real risk of extended downtime and delays in the event of a disaster, poor or missing documentation can contribute to cost overruns in disaster recovery testing and can require extended time from senior staff where, with proper documentation, more junior personnel could get the job done.

7. Simplify. Complexities that drive up costs have a way of creeping into organizations if steps aren’t actively taken to avoid them. Previously mentioned factors like standardization and virtualization can go a long way toward simplification, but there are other ways to simplify. For example, in many cases, customers implement discrete point solutions, such as one-off host-based replication or backup solutions, to support the recovery of a specific application, often based on application vendor recommendations. In extreme cases, this can result in multiple data backup or replication technologies that each must be supported and managed by IT. While unique application requirements must be considered, supporting multiple recovery solutions can become a DR management nightmare.

8. Compartmentalize testing. Disaster recovery testing can be a highly disruptive event that impacts day-to-day operations, raises overtime costs and generally increases anxiety. While large-scale DR testing is essential, consistent compartmentalized testing of networks, servers, storage and, to some degree, application components can help ensure recoverability while reducing disruption and avoiding unplanned expenses related to failed or delayed DR tests.

9. Optimize data. When it comes to reducing or avoiding DR costs, less can be more. Technologies such as data deduplication and thin provisioning reduce underlying data storage footprints and can also significantly decrease the bandwidth needed for data replication, thereby making DR significantly more affordable. Another often overlooked data optimization practice that can reduce DR data footprint and traffic is data archiving. From a data perspective, disaster recovery is primarily concerned with currently active data sets, but in reality the majority of data sitting on today’s storage arrays is often non-current, historic data that has accumulated over time. A program to purge unneeded data or move it offsite to a cloud or secondary repository could greatly reduce the data storage capacity required at a DR location and may even help speed up the recovery process, not to mention the savings associated with freeing up expensive storage at the primary location.
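The effect of archiving and deduplication on DR sizing can be estimated with simple arithmetic. The Python sketch below is a rough model only: the function names, the single deduplication ratio applied to both stored and replicated data, and the sample figures are all assumptions chosen for illustration, and thin provisioning is not modeled.

```python
def dr_capacity_tb(total_tb, active_fraction, dedup_ratio):
    """Estimate storage (TB) needed at the DR site.

    total_tb: data held on primary storage today
    active_fraction: share of that data that is current and must be recoverable
    dedup_ratio: assumed reduction factor, e.g. 4.0 means 4:1
    """
    if not 0 < active_fraction <= 1:
        raise ValueError("active_fraction must be in (0, 1]")
    if dedup_ratio < 1:
        raise ValueError("dedup_ratio must be >= 1")
    return total_tb * active_fraction / dedup_ratio

def replication_mbps(daily_change_gb, dedup_ratio, window_hours):
    """Average bandwidth (megabits per second) to ship one day's changes."""
    if dedup_ratio < 1 or window_hours <= 0:
        raise ValueError("dedup_ratio must be >= 1 and window_hours > 0")
    megabits = daily_change_gb / dedup_ratio * 8 * 1000  # decimal GB to megabits
    return megabits / (window_hours * 3600)

# Illustrative figures only: 100 TB on primary storage, 40% of it active,
# an assumed 4:1 deduplication ratio, 500 GB of daily change, 8-hour window.
print(f"DR capacity: {dr_capacity_tb(100, 0.4, 4.0):.1f} TB")
print(f"Replication bandwidth: {replication_mbps(500, 4.0, 8):.1f} Mbit/s")
```

Running the same functions with an active fraction of 1.0 and a ratio of 1.0 gives the unoptimized baseline for comparison.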

In the end, disaster recovery is a type of insurance that people invest in and then hope they never need. Even if you think you’ll never use your DR plan, it’s important to recognize the necessity of being properly protected. The key is to buy enough, but not too much, and the items listed above represent areas to consider when making long-term planning decisions. Determining the right approach requires taking the time to fully understand the problem before diving into specific technical solutions. Once the problem is understood, however, the technical nuances of different solutions become critical and can impact disaster recovery costs substantially. The good news is that the core technologies that support DR are more affordable than ever. Ultimately, the challenge is that cost-effective disaster recovery is more a matter of combining the right technologies with the right policies and processes, and this is where organizations frequently come up short.

Jim Damoulakis

ABOUT THE AUTHORS

JON TOIGO is CEO and managing partner of Toigo Partners International. He has penned thousands of computing articles and columns, and has written 15 books on computing. He has chaired the Data Management Institute since 1992.

PAUL KIRVAN, CISA, FBCI, works as an independent business continuity consultant/auditor and is secretary of the Business Continuity Institute USA chapter and member of the BCI Global Membership Council. He can be reached at [email protected].

JIM DAMOULAKIS is CTO at GlassHouse Technologies, a leading independent provider of storage and infrastructure services.

Best Practices in Disaster Recovery Testing is a SearchDisasterRecovery.com e-publication.

Rich Castagna | Editorial Director

Andrew Burton | Senior Site Editor

Ed Hannan | Managing Editor

John Hilliard | Associate Site Editor

Sonia Lelii | Senior News Writer

Linda Koury | Director of Online Design

Neva Maniscalco | Graphic Designer

Jillian Abbott | Publisher [email protected]

TechTarget 275 Grove Street, Newton, MA 02466 

www.techtarget.com

© 2013 TechTarget Inc. No part of this publication may be transmitted or reproduced in any form or by any means without written permission from the publisher. TechTarget reprints are available through The YGS Group.

About TechTarget: TechTarget publishes media for information technology professionals. More than 100 focused websites enable quick access to a deep store of news, advice and analysis about the technologies, products and processes crucial to your job. Our live and virtual events give you direct access to independent expert commentary and advice. At IT Knowledge Exchange, our social community, you can get advice and share solutions with peers and experts.