Page 1: Disaster Recovery 101 - Kent State University...SITE LEVEL DISASTER Requires People & Process Facility to restore Infrastructure (Network, servers, storage, Recovery software, DNS)

Disaster Recovery 101

Sudarshan Ranganath & Matthew Phillips

Ellucian

SESSION OBJECTIVES

Business continuity is critical to every institution and its IT organization. How do you set up your ERP and other Tier 1 apps to reduce the risk of a disaster, and quickly recover from one should disaster strike? Learn about the infrastructure and practices that Ellucian’s Cloud Services uses to minimize the impact of a disaster for your Banner systems.

AGENDA

May 2, 2013

Tier 1 Apps

Disaster Prevention

Disaster Readiness

DR Execution

DR Options

TIER 1 APPS FOR DISASTER RECOVERY / BC

Communication

SIS/ERP

LMS

CRM

Other Financial/Operational

RISKS THAT COULD IMPACT YOUR OPERATIONS

[Figure: risk matrix. Probability axis: Frequent, Likely, Occasional, Seldom, Unlikely. Severity axis: Catastrophic, Critical, Moderate, Negligible. Priority bands: Extremely High, High, Moderate, Low. Risks plotted: Disk Failure / Trip over a wire, CPU Failure, Hurricane/Flooding, Security Breach, Hit by Tornado, Power Outage, Staff Attrition, Demand Surge.]

Impact

• Business Interruption

• Financial

• Legal

• Reputational

Causes

• Natural disasters

• Human errors

• Technological failures
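A probability/severity matrix like the one on this slide can be turned into a simple scoring function. The sketch below is only illustrative: the scoring rule (rank product) and the cut-offs are assumptions, and the example risk placements are hypothetical rather than taken from the slide.

```python
from enum import IntEnum


class Probability(IntEnum):
    UNLIKELY = 1
    SELDOM = 2
    OCCASIONAL = 3
    LIKELY = 4
    FREQUENT = 5


class Severity(IntEnum):
    NEGLIGIBLE = 1
    MODERATE = 2
    CRITICAL = 3
    CATASTROPHIC = 4


def priority(probability: Probability, severity: Severity) -> str:
    """Map a (probability, severity) pair to a priority band.

    Assumption: score = probability rank * severity rank, with
    hypothetical cut-offs. This is not the slide's own mapping.
    """
    score = probability * severity
    if score >= 15:
        return "Extremely High"
    if score >= 8:
        return "High"
    if score >= 4:
        return "Moderate"
    return "Low"


# Hypothetical placements, chosen only to exercise the function.
risks = {
    "Power outage": (Probability.OCCASIONAL, Severity.CRITICAL),
    "Hit by tornado": (Probability.UNLIKELY, Severity.CATASTROPHIC),
    "Disk failure": (Probability.LIKELY, Severity.MODERATE),
}

for name, (p, s) in risks.items():
    print(f"{name}: {priority(p, s)}")
```

An institution would replace the cut-offs and placements with its own risk assessment; the point is only that each risk gets a band from two ordinal ratings.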

KEY KPIS YOU CARE ABOUT AS AN IT ORGANIZATION

Downtime

• Application Availability

Performance

• User Experience for Public and Private apps

Security

• Number of Security Incidents

• Extent of compromise per Incident

Scalability

• Ability of current infrastructure to handle load

• Time to add capacity in response to demand spike

Disaster Recoverability

• Probability of a disaster affecting the datacenter

• Time to recover from a site-level disaster

Software Currency

• Time to update to newest version after being made available by vendor

Backup Currency

• Lost work product because of inefficient backup practices, and aging of backed-up data as a result

Stakeholder Support

• Effectiveness in furthering student/staff satisfaction

Costs/Investment Efficiency

• TCO to operate solution, ROI for every $$ invested
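The downtime KPI is usually quoted as an availability percentage, and it can help to see what a given target means in hours. The following is a minimal sketch of that arithmetic; the function names and the sample targets are illustrative and not from the presentation.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def allowed_downtime_hours(availability_pct: float) -> float:
    """Yearly downtime budget implied by an availability target."""
    return HOURS_PER_YEAR * (1 - availability_pct / 100)


def measured_availability_pct(downtime_hours: float) -> float:
    """Availability achieved over one year, given total downtime in hours."""
    return 100 * (1 - downtime_hours / HOURS_PER_YEAR)


# Sample targets (assumed for illustration).
for target in (99.0, 99.9, 99.99):
    print(f"{target}% -> {allowed_downtime_hours(target):.2f} hours/year")
```

Each additional "nine" cuts the yearly downtime budget by a factor of ten, which is why the target should be agreed with stakeholders before choosing infrastructure.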

DISASTER PREVENTION

Power

Facility

Network

Hardware

Application Architecture

Replication

Process

DISASTER PREVENTION - POWER

Multiple Utilities or Stations

A and B power Grids

All components connected A&B

UPS Generator

Generator Backup

Fueling agreements for outage >2 days

DISASTER PREVENTION - FACILITY

Multiple Physical Entries for Power, Network

Hardened Walls and Roof

Temperature – Humidity

Secure personnel and equipment Entries

Multi-stage Fire Detection

DISASTER PREVENTION - NETWORK

Multiple Internet connections

Multiple ISPs

Redundant firewalls

Redundant core network

Redundant connections for servers and storage

DISASTER PREVENTION - HARDWARE

Redundancy is key at every level

SAN vs. non-SAN

Virtualization vs. Dedicated Server Hardware

Redundant cold/warm/hot hardware in DR location

DISASTER PREVENTION - APPLICATION ARCHITECTURE

Again… redundancy is key at every level

DB tier and App tier

Monitoring & alerting considerations

Integrations

Customization and Modifications

Licensing

DISASTER PREVENTION - APPLICATION ARCHITECTURE

Backup architecture considerations

OS is static

Application tier is static

Database backup considerations

Database backup architecture

Full export (exp), RMAN, cold backup, custom hot backup

ARCHIVELOG vs. NOARCHIVELOG mode (prod vs. non-prod)

Data Domain-style (disk) vs. tape architecture

Architecture must consider the recovery time objective (RTO) and recovery point objective (RPO)
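One way to check a backup architecture against an RPO is to compute the worst-case data loss it allows. The sketch below assumes a simplified model in which a failure strikes just before the next backup or shipped archive log; the function names and the sample intervals are hypothetical.

```python
from datetime import timedelta
from typing import Optional


def worst_case_data_loss(
    backup_interval: timedelta,
    log_shipping_interval: Optional[timedelta] = None,
) -> timedelta:
    """Upper bound on lost work if the site fails just before the next copy.

    Assumed model: with archived logs shipped offsite, only changes since
    the last shipped log are at risk; without them, everything since the
    last backup is at risk.
    """
    if log_shipping_interval is not None:
        return min(backup_interval, log_shipping_interval)
    return backup_interval


def meets_rpo(loss: timedelta, rpo: timedelta) -> bool:
    """True when the worst-case loss fits within the RPO."""
    return loss <= rpo


rpo = timedelta(hours=4)  # sample objective, assumed for illustration
nightly_only = worst_case_data_loss(timedelta(hours=24))
with_logs = worst_case_data_loss(timedelta(hours=24), timedelta(minutes=15))

print("Nightly backup only meets RPO:", meets_rpo(nightly_only, rpo))
print("Nightly backup + shipped logs meets RPO:", meets_rpo(with_logs, rpo))
```

This is why production databases typically run with archiving enabled while non-production copies can often go without it: the archive stream, not the full backup, determines the recovery point.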

DISASTER RECOVERY REPLICATION

Backup Process

Replication Process

Recovery Point

Recovery Time
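Recovery point and recovery time can both be measured from an incident timeline, for example after a DR test. This is a minimal sketch under that assumption; the function names and the timestamps are made up for illustration.

```python
from datetime import datetime, timedelta


def achieved_recovery_point(disaster_at: datetime, last_replicated_at: datetime) -> timedelta:
    """Data lost: time between the last replicated copy and the disaster."""
    return disaster_at - last_replicated_at


def achieved_recovery_time(disaster_at: datetime, service_restored_at: datetime) -> timedelta:
    """Outage length: time between the disaster and restored service."""
    return service_restored_at - disaster_at


# Hypothetical timeline from a DR exercise.
last_replicated = datetime(2013, 5, 2, 1, 45)
disaster = datetime(2013, 5, 2, 2, 0)
restored = datetime(2013, 5, 2, 8, 30)

print("Recovery point:", achieved_recovery_point(disaster, last_replicated))
print("Recovery time:", achieved_recovery_time(disaster, restored))
```

Comparing these measured values with the stated RPO and RTO shows whether the backup and replication processes actually deliver what the plan promises.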

DISASTER PREVENTION - PROCESS

ITIL® Change Management

Incident Management

Shutdown / Startup Processes

Access Control / Role Based Security

Training

DISASTER READINESS

How do you test your readiness for disaster?

Failover Test

Power

Network test

VM test

Application / Database test

Monitoring test

EXECUTION WHEN YOU HAVE A SITE LEVEL DISASTER

Requires People & Process

Facility to restore

Infrastructure (Network, servers, storage, Recovery software, DNS)

Most Recent Backups

Prioritization

Move IP networking from primary to DR

Recover Virtual Machines

Recover Databases

Recover Apps

Integrations to other systems
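The recovery steps listed above have a natural ordering, which a runbook can encode as dependencies and sort automatically. The sketch below uses Python's standard `graphlib` module; the dependency edges are an assumption inferred from the order on the slide, not something the presentation states.

```python
from graphlib import TopologicalSorter

# step -> set of steps that must finish first (assumed dependencies)
dependencies = {
    "Move IP networking from primary to DR": set(),
    "Recover virtual machines": {"Move IP networking from primary to DR"},
    "Recover databases": {"Recover virtual machines"},
    "Recover apps": {"Recover databases"},
    "Restore integrations to other systems": {"Recover apps"},
}


def recovery_order(deps: dict) -> list:
    """Return the steps in an order that respects every dependency.

    TopologicalSorter raises graphlib.CycleError if the steps depend on
    each other in a loop, which would signal a broken runbook.
    """
    return list(TopologicalSorter(deps).static_order())


for number, step in enumerate(recovery_order(dependencies), start=1):
    print(f"{number}. {step}")
```

Keeping the dependencies as data makes prioritization explicit: adding a Tier 1 application means adding one entry, and the ordering is recomputed rather than remembered.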

DISASTER RECOVERY STRATEGIES

Strategy                      RPO              RTO              Cost
Server Replication            Secs – Min       < Hr             $$$$$
SAN Replication               Min – Hours      Hours – Day      $$$$
VM + DB logs                  Hours – Day      Hours – Days     $$$
Offsite Tape + DR Contract    Days             Days – Weeks     $$
Offsite Tape                  Days             Months           $
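The trade-off in this table can be read as a selection problem: pick the cheapest strategy whose RPO and RTO still meet the requirement. The sketch below illustrates that idea only. The hour values are assumed worst-case readings of the table's ranges (for example, "Months" is taken as 90 days), and the cost rank simply counts the dollar signs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Strategy:
    name: str
    rpo_hours: float  # assumed worst-case data loss
    rto_hours: float  # assumed worst-case time to restore
    cost_rank: int    # number of dollar signs in the table


# Numeric bounds are illustrative assumptions, not figures from the slide.
STRATEGIES = [
    Strategy("Server Replication", 0.1, 1, 5),
    Strategy("SAN Replication", 4, 24, 4),
    Strategy("VM + DB logs", 24, 72, 3),
    Strategy("Offsite Tape + DR Contract", 72, 336, 2),
    Strategy("Offsite Tape", 72, 2160, 1),
]


def cheapest_strategy(required_rpo_hours: float, required_rto_hours: float) -> Optional[Strategy]:
    """Return the lowest-cost strategy meeting both objectives, or None."""
    candidates = [
        s for s in STRATEGIES
        if s.rpo_hours <= required_rpo_hours and s.rto_hours <= required_rto_hours
    ]
    return min(candidates, key=lambda s: s.cost_rank, default=None)


choice = cheapest_strategy(required_rpo_hours=24, required_rto_hours=72)
print(choice.name if choice else "No listed strategy meets the requirement")
```

Tightening either objective pushes the choice up the table, which is the cost argument the slide makes.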

SUMMARY

• DR is about:

  • Planning & Testing Readiness

  • Prevention, Readiness, Execution