Disaster Recovery Sites on AWS: Minimal Cost, Maximum Efficiency

© 2014 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.

Disaster Recovery Site on AWS: Minimal Cost, Maximum Efficiency

Ryan Holland, AWS

July 10, 2014

Upload: amazon-web-services

Post on 08-Sep-2014

DESCRIPTION

Implementing a disaster recovery (DR) site is crucial for the business continuity of any enterprise. Because AWS offers elasticity, scalability, and geographic distribution, a DR site on AWS can be implemented at 10-50% of the conventional cost. In this session, we take a deep dive into proven DR architectures on AWS and the best practices, tools, and techniques for getting the most out of them.

TRANSCRIPT

Page 1: Disaster Recovery Sites on AWS: Minimal Cost, Maximum Efficiency

© 2014 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.

Disaster Recovery Site on AWS: Minimal Cost, Maximum Efficiency

Ryan Holland, AWS

July 10, 2014

Page 2: Disaster Recovery Sites on AWS: Minimal Cost, Maximum Efficiency

What You Will Learn

• Disaster Recovery and Business Continuity

• Why AWS for disaster recovery?

• Common DR architectures

• Backup and restore

• Pilot light

• Warm standby

• Hot standby (multi-site)

• Customer case study

• Where to go next

Page 3: Disaster Recovery Sites on AWS: Minimal Cost, Maximum Efficiency

Disruptions to Business Continuity

Caused by outage of IT infrastructure

Affects businesses of all kinds and sizes

Can be very expensive

Page 4: Disaster Recovery Sites on AWS: Minimal Cost, Maximum Efficiency

What Causes Downtime

• Natural disaster

• Security incident

• Equipment failure

• Human error

Page 5: Disaster Recovery Sites on AWS: Minimal Cost, Maximum Efficiency
Page 6: Disaster Recovery Sites on AWS: Minimal Cost, Maximum Efficiency

Conventional Disaster Recovery Sites

• High cost

• Low ROI

• Implemented only for most critical systems

• Usually scaled down to 50% of production

• Operating systems in a remote region is challenging

• Costly software licenses based on hardware usage

Page 7: Disaster Recovery Sites on AWS: Minimal Cost, Maximum Efficiency

Disaster Recovery on AWS

• Unprecedented capabilities to implement DR sites

• Easily set up DR sites in different geographic regions

• Cut down DR site cost by up to 70%

• Substantial savings on software licenses

Page 8: Disaster Recovery Sites on AWS: Minimal Cost, Maximum Efficiency

Global Reach from Your Desktop

Page 9: Disaster Recovery Sites on AWS: Minimal Cost, Maximum Efficiency

Tools for Implementing DR on AWS

• Leverage tools like CloudFormation to automate deployment.

• Choose an AMI strategy that fits the RTO requirements.

• Cross-region EBS snapshot and AMI copy

• Cross-region read replicas for Amazon RDS for MySQL

• Amazon Route 53 and Auto Scaling

• EC2 reserved instances
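The cross-region EBS snapshot copy in the list above could be scripted roughly as follows. This is a minimal sketch, not the speaker's tooling: it assumes a boto3-style EC2 client created for the DR region is passed in, and the helper name `copy_snapshot_to_dr` is hypothetical.

```python
def copy_snapshot_to_dr(dr_ec2_client, source_region, snapshot_id):
    """Copy one EBS snapshot into the region that dr_ec2_client points at.

    dr_ec2_client is assumed to be a boto3 EC2 client for the DR region,
    for example boto3.client("ec2", region_name="us-west-2"). The copy is
    requested from the destination side, so the source region is passed
    as a parameter.
    """
    response = dr_ec2_client.copy_snapshot(
        SourceRegion=source_region,
        SourceSnapshotId=snapshot_id,
        Description=f"DR copy of {snapshot_id} from {source_region}",
    )
    # The response carries the ID of the new snapshot in the DR region.
    return response["SnapshotId"]
```

Passing the client in, rather than creating it inside the function, keeps the helper easy to exercise with a stub before it touches a real account.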

Page 10: Disaster Recovery Sites on AWS: Minimal Cost, Maximum Efficiency

AWS Storage Options

Amazon Simple Storage Service (S3): highly scalable object storage

• Objects from 1 byte to 5 TB in size

• 99.999999999% durability

Amazon Elastic Block Store (EBS): high-performance block storage device

• Volumes from 1 GB to 1 TB in size

• Mount as drives to instances, with snapshot and cloning functionality

Amazon Glacier: long-term object archive

• Extremely low cost per gigabyte

• 99.999999999% durability
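To put the 99.999999999% durability figure in perspective, the arithmetic can be worked through as below. This is an illustrative sketch: it treats the figure as an annual per-object survival probability, and the function name and the object count are assumptions made for the example.

```python
from decimal import Decimal


def expected_years_per_lost_object(durability_percent, object_count):
    """Average number of years between single-object losses.

    durability_percent is passed as a string so that Decimal keeps all
    eleven nines exactly; binary floats would round the tiny difference.
    """
    annual_loss_probability = 1 - Decimal(durability_percent) / 100
    expected_losses_per_year = annual_loss_probability * object_count
    return 1 / expected_losses_per_year


years = expected_years_per_lost_object("99.999999999", 10_000)
print(f"about one lost object every {int(years):,} years")
```

Under that reading, storing 10,000 objects corresponds to an expected loss of a single object roughly once every ten million years.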

Page 11: Disaster Recovery Sites on AWS: Minimal Cost, Maximum Efficiency

Common DR architectures

The architectures differ from one another in recovery time objective (RTO), recovery point objective (RPO), and cost.
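One way to reason about this trade-off is to pick the cheapest architecture that still meets the required RTO and RPO. The sketch below is only an illustration: the hour figures and the 1-to-4 cost ranking are hypothetical placeholders, not numbers from the talk, and real values depend on the workload.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrOption:
    name: str
    typical_rto_hours: float  # hypothetical figure for illustration
    typical_rpo_hours: float  # hypothetical figure for illustration
    relative_cost: int        # 1 = cheapest, 4 = most expensive


OPTIONS = [
    DrOption("backup and restore", 24.0, 24.0, 1),
    DrOption("pilot light", 4.0, 1.0, 2),
    DrOption("warm standby", 1.0, 0.25, 3),
    DrOption("multi-site", 0.1, 0.0, 4),
]


def cheapest_option(required_rto_hours, required_rpo_hours):
    """Return the lowest-cost option meeting both objectives, or None."""
    candidates = [
        option
        for option in OPTIONS
        if option.typical_rto_hours <= required_rto_hours
        and option.typical_rpo_hours <= required_rpo_hours
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda option: option.relative_cost)
```

With these placeholder figures, a system that tolerates a day of downtime lands on backup and restore, while tighter objectives push the choice toward the more expensive architectures.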

Page 12: Disaster Recovery Sites on AWS: Minimal Cost, Maximum Efficiency

Backup & Restore Architecture

Simple to get started

• Easy starting point for exploring the AWS cloud

• Low technical barrier to entry

• Focus on incorporating cloud into your DR strategy, not on complex technical issues related to hot-hot systems

Lowest cost

• Very high levels of data durability at low price

• Cost of storing snapshots in Amazon S3

• Archiving possibilities beyond tape using Amazon Glacier
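Archiving older backups from Amazon S3 to Amazon Glacier is typically expressed as a bucket lifecycle rule. The sketch below only builds the rule as a plain dictionary; the `backups/` prefix, the 30-day threshold, and the helper name are assumptions chosen for illustration.

```python
def glacier_lifecycle_rule(prefix, days_before_archive):
    """Build one lifecycle rule that moves objects under prefix to Glacier."""
    return {
        "ID": f"archive-{prefix.strip('/') or 'all'}",
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [
            {"Days": days_before_archive, "StorageClass": "GLACIER"}
        ],
    }


# Hypothetical policy: archive everything under backups/ after 30 days.
lifecycle_configuration = {"Rules": [glacier_lifecycle_rule("backups/", 30)]}
```

A dictionary of this shape could then be supplied as the `LifecycleConfiguration` argument of the boto3 S3 client's `put_bucket_lifecycle_configuration` call.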

Page 13: Disaster Recovery Sites on AWS: Minimal Cost, Maximum Efficiency

Back up and restore

Create instances from AMIs

Restore data from backups

Page 14: Disaster Recovery Sites on AWS: Minimal Cost, Maximum Efficiency

Many Ways to Back Up

Page 15: Disaster Recovery Sites on AWS: Minimal Cost, Maximum Efficiency

Backup & Restore Considerations

• Make sure you keep your AMIs current

• Use CloudFormation or other automation tools

• Consider EC2 light utilization reserved instances

• Test your DR plan frequently. Then test some more.
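As a rough illustration of the automation point above, a CloudFormation template can be generated and kept in version control next to the AMI IDs it launches. This is a minimal sketch with hypothetical names: the `AmiId` and `InstanceType` parameters, the `WebServer` resource, and the default instance type are all placeholders.

```python
import json


def build_dr_template():
    """Return a small CloudFormation template as a Python dictionary."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Hypothetical DR web tier launched from a current AMI",
        "Parameters": {
            # The AMI is a parameter so the same template always launches
            # whichever image was most recently baked.
            "AmiId": {"Type": "String"},
            "InstanceType": {"Type": "String", "Default": "t2.micro"},
        },
        "Resources": {
            "WebServer": {
                "Type": "AWS::EC2::Instance",
                "Properties": {
                    "ImageId": {"Ref": "AmiId"},
                    "InstanceType": {"Ref": "InstanceType"},
                },
            }
        },
        "Outputs": {"InstanceId": {"Value": {"Ref": "WebServer"}}},
    }


template_body = json.dumps(build_dr_template(), indent=2)
```

Keeping the AMI as a parameter rather than hard-coding it is one way to make "keep your AMIs current" a matter of changing a single input.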

Page 16: Disaster Recovery Sites on AWS: Minimal Cost, Maximum Efficiency

Pilot Light Architecture

Build resources around a replicated dataset

• Keep the "pilot light" on by replicating core databases

• Build AWS resources around the dataset and leave them in a stopped state

Scale resources in AWS in response to a DR event

• Start up a pool of resources in AWS when events dictate

• Scale up the database instance to handle production capacity

Page 17: Disaster Recovery Sites on AWS: Minimal Cost, Maximum Efficiency

Pilot Light Architecture

Page 18: Disaster Recovery Sites on AWS: Minimal Cost, Maximum Efficiency

Pilot Light Architecture

Create instances from AMIs

Page 19: Disaster Recovery Sites on AWS: Minimal Cost, Maximum Efficiency

Activating a Pilot Light DR Site

• Use CloudFormation and Auto Scaling to stage infrastructure.

• Keep your AMIs or bootstrapping scripts current.

• Leverage EC2 heavy utilization reserved instances for the database

• Test your DR plan frequently. Then test some more.
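The two activation steps of a pilot light site, starting the stopped servers and scaling up the database, might be scripted along these lines. This is a sketch under stated assumptions: boto3-style EC2 and RDS clients for the DR region are passed in, and the function name and its arguments are hypothetical.

```python
def activate_pilot_light(ec2_client, rds_client, instance_ids,
                         db_identifier, production_db_class):
    """Bring a pilot light site up to production capacity.

    ec2_client and rds_client are assumed to be boto3 clients for the
    DR region; every identifier is supplied by the caller.
    """
    # Step 1: start the pre-built application servers left in a stopped state.
    ec2_client.start_instances(InstanceIds=list(instance_ids))

    # Step 2: resize the replicated database so it can carry production load.
    rds_client.modify_db_instance(
        DBInstanceIdentifier=db_identifier,
        DBInstanceClass=production_db_class,
        ApplyImmediately=True,
    )
```

Running a function like this against a test stack on a schedule is one way to act on the advice to test the DR plan frequently.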

Page 20: Disaster Recovery Sites on AWS: Minimal Cost, Maximum Efficiency

Warm Standby Architecture

Build an environment similar to production at a reduced scale

• Keep data and files synchronized between production and the DR site by replication

• Use smaller and/or fewer instances than production

Scale resources in AWS in response to a DR event

• Scale out the environment by adding more instances

• Scale up the instances to handle production capacity

Page 21: Disaster Recovery Sites on AWS: Minimal Cost, Maximum Efficiency

Warm Standby Architecture

Page 22: Disaster Recovery Sites on AWS: Minimal Cost, Maximum Efficiency

Warm Standby Architecture

Page 23: Disaster Recovery Sites on AWS: Minimal Cost, Maximum Efficiency

Moving Warm Standby to Production

• Use CloudFormation and Auto Scaling to resize infrastructure.

• Leverage EC2 heavy utilization reserved instances for the database and the warm standby instances.

• Test your DR plan frequently. Then test some more.
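Resizing a warm standby with Auto Scaling comes down to raising the group's size limits to production values. The sketch below only computes the new settings; the group name, the instance count, and the 1.5x headroom factor are assumptions made for illustration.

```python
def production_capacity_plan(group_name, production_instances, headroom=1.5):
    """Return Auto Scaling group settings for running at production scale.

    headroom is a hypothetical multiplier that leaves room to scale out
    further if the DR site sees more load than production normally does.
    """
    max_size = max(production_instances, int(production_instances * headroom))
    return {
        "AutoScalingGroupName": group_name,
        "MinSize": production_instances,
        "MaxSize": max_size,
        "DesiredCapacity": production_instances,
    }


# Hypothetical example: a standby group that must grow to 8 instances.
plan = production_capacity_plan("web-dr", 8)
```

The keys match the parameter names of the boto3 Auto Scaling client's `update_auto_scaling_group` call, so the dictionary could be passed to it as keyword arguments.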

Page 24: Disaster Recovery Sites on AWS: Minimal Cost, Maximum Efficiency

Multi-site Architecture

Build the DR site as a mirror image of production

• Keep all data and files synchronized between production and the DR site, using synchronous replication if possible

• Pick the size and number of instances so that performance remains acceptable without any change in case of a DR event

• Use Reserved Instances (RIs) for capacity reservation and cost savings

Page 25: Disaster Recovery Sites on AWS: Minimal Cost, Maximum Efficiency

Multi-site Architecture

Load balance between production and DR

• If the latency and error-propagation risk between the production and DR sites are acceptable

If the DR site is isolated, switch over to AWS

• Make the necessary DNS changes to redirect traffic to the DR site on AWS
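Both modes described on this slide, sharing load between the sites and switching over completely, can be expressed with weighted DNS records in Route 53. This is a minimal sketch that only builds the change batch; the domain name, the set identifiers, and the documentation-range IP addresses are placeholders.

```python
def weighted_record(name, set_id, ip_address, weight, ttl=60):
    """Build one UPSERT change for a weighted A record."""
    return {
        "Action": "UPSERT",
        "ResourceRecordSet": {
            "Name": name,
            "Type": "A",
            "SetIdentifier": set_id,
            "Weight": weight,
            "TTL": ttl,
            "ResourceRecords": [{"Value": ip_address}],
        },
    }


def split_traffic(name, production_ip, dr_ip, dr_percent):
    """Send dr_percent of traffic to the DR site and the rest to production."""
    if not 0 <= dr_percent <= 100:
        raise ValueError("dr_percent must be between 0 and 100")
    return {
        "Changes": [
            weighted_record(name, "production", production_ip, 100 - dr_percent),
            weighted_record(name, "dr-aws", dr_ip, dr_percent),
        ]
    }


# Hypothetical values: share load evenly, then switch over entirely.
shared = split_traffic("www.example.com.", "192.0.2.10", "198.51.100.10", 50)
switched = split_traffic("www.example.com.", "192.0.2.10", "198.51.100.10", 100)
```

A batch of this shape could be passed as the `ChangeBatch` argument of the boto3 Route 53 client's `change_resource_record_sets` call. A short TTL keeps the switch-over from being delayed by cached answers.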

Page 26: Disaster Recovery Sites on AWS: Minimal Cost, Maximum Efficiency

Multi-site Architecture

Page 27: Disaster Recovery Sites on AWS: Minimal Cost, Maximum Efficiency

A DR site on AWS can back up

• A primary site in a customer data center

• A primary site on AWS itself

Page 28: Disaster Recovery Sites on AWS: Minimal Cost, Maximum Efficiency

Primary and DR Sites on AWS

Page 29: Disaster Recovery Sites on AWS: Minimal Cost, Maximum Efficiency
Page 30: Disaster Recovery Sites on AWS: Minimal Cost, Maximum Efficiency
Page 31: Disaster Recovery Sites on AWS: Minimal Cost, Maximum Efficiency
Page 32: Disaster Recovery Sites on AWS: Minimal Cost, Maximum Efficiency
Page 33: Disaster Recovery Sites on AWS: Minimal Cost, Maximum Efficiency
Page 34: Disaster Recovery Sites on AWS: Minimal Cost, Maximum Efficiency
Page 35: Disaster Recovery Sites on AWS: Minimal Cost, Maximum Efficiency
Page 36: Disaster Recovery Sites on AWS: Minimal Cost, Maximum Efficiency

What enabled this?

• Eight isolated S3 regions

• AWS CloudFormation allows quick bootstrap of

another region.

• Route 53 latency-based routing and failover
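Route 53 failover routing pairs a primary and a secondary record, with a health check deciding which one is served. The sketch below only builds the record definitions and is not the customer's actual configuration; the domain name, IP addresses, and health check ID are hypothetical.

```python
def failover_record(name, role, ip_address, health_check_id=None, ttl=60):
    """Build one UPSERT change for a failover A record.

    role must be "PRIMARY" or "SECONDARY". The health check is optional
    here so that the secondary can be defined without one.
    """
    if role not in ("PRIMARY", "SECONDARY"):
        raise ValueError("role must be PRIMARY or SECONDARY")
    record = {
        "Name": name,
        "Type": "A",
        "SetIdentifier": f"{role.lower()}-site",
        "Failover": role,
        "TTL": ttl,
        "ResourceRecords": [{"Value": ip_address}],
    }
    if health_check_id is not None:
        record["HealthCheckId"] = health_check_id
    return {"Action": "UPSERT", "ResourceRecordSet": record}


# Hypothetical pair: the primary is health-checked, the secondary is the fallback.
change_batch = {
    "Changes": [
        failover_record("maps.example.com.", "PRIMARY", "192.0.2.10",
                        health_check_id="hypothetical-health-check-id"),
        failover_record("maps.example.com.", "SECONDARY", "198.51.100.10"),
    ]
}
```

When the health check attached to the primary record fails, Route 53 starts answering queries with the secondary record.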

Page 37: Disaster Recovery Sites on AWS: Minimal Cost, Maximum Efficiency

DNS Failover

User in San Francisco

eu-west-1 (Ireland)

us-east-1 (Northern Virginia)

us-west-1 (Northern California)

Page 38: Disaster Recovery Sites on AWS: Minimal Cost, Maximum Efficiency

What didn’t go wrong

• Official NYC evacuation map stayed up

• USA TODAY Weather map stayed up

• Thousands of other maps used for weather reporting, data visualization and coordination around the event all stayed up

Page 39: Disaster Recovery Sites on AWS: Minimal Cost, Maximum Efficiency
Page 40: Disaster Recovery Sites on AWS: Minimal Cost, Maximum Efficiency

Thank you!