Disaster Recovery on a Limited Budget
Stephen Rosenfeld, University of New Brunswick
CANHEIT June 28, 2005
Background
Auditors’ “Management Letter Points” to the Executive noted that no formal DR plan existed … 3 years in a row
Last complete IT DR plan was mainframe-based (done in 1987 and revised in 1996)
Some staff had been sent for DR training over the years, but we could never afford to dedicate them to the task
We had engaged in talks with the Provincial Government and the NB Electric Power Commission, but it was not a good fit
If we were to take the situation seriously, we needed outside help
Xwave Proposal
We had a $25,000 budget
Solicited a proposal from Xwave, an Aliant affiliate
They provided Monique Thébeau, a Certified Business Continuity Consultant
2 Stage Approach
  Discovery Phase – June 2004
    Assess Risks and Exposures
    Determine current level of ITS DR preparedness
    Assess DR strategies currently in use
    Lay out the roadmap
  Plan Development Phase – Sept-Dec 2004
Xwave to provide the Project Management
Discovery Findings
Why would you build your data centre above your Chemical Engineering department’s storeroom?
49 Recommendations made in 4 categories:
  Prevention – 33
  Response – 1
  Recovery – 13
  Restoration – 2
ITS Response to Findings
All Discovery Phase recommendations were accepted
ITS took advantage of a scheduled building power outage to do a major reorganization of the equipment in our machine room
  Server connections to UPS were rationalized
  EPO (Emergency Power Off) and fire alarms were tested and found wanting; repairs made
  Major cleanup done
Proposed Recovery Options
Data Replication at another UNB location (SAN)
  Expensive, but network bandwidth available
Alternate Recovery Sites
  External DR Vendors – Estimate for 15 Servers
    Hot site – $10,000/month
    Cold site – $5,000/month
    Quick ship – $2,500/month
  UNB Self-Provided Site (VMware) – $150,000
  Contract with Xwave’s Marysville Data Centre
    $150,000 as above, plus $900-$2,300/month
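To compare the options above on equal footing, it can help to add up what each would cost over a fixed period. The sketch below is illustrative only: the dollar figures come from the slide, but the 36-month horizon, the zero upfront cost for external vendors, and the zero recurring cost for the self-provided site are assumptions made for the example, not figures from the talk.

```python
# Cumulative cost of each recovery option over a planning horizon.
# Dollar figures are from the slide. The horizon, the zero upfront cost for
# external vendors, and the zero monthly cost for the self-provided site
# are assumptions for illustration.

OPTIONS = {
    # name: (upfront cost in dollars, monthly cost in dollars)
    "Hot site": (0, 10_000),
    "Cold site": (0, 5_000),
    "Quick ship": (0, 2_500),
    "UNB self-provided site": (150_000, 0),
    "Xwave Marysville (low)": (150_000, 900),
    "Xwave Marysville (high)": (150_000, 2_300),
}


def cumulative_cost(upfront, monthly, months):
    """Total spend after the given number of months."""
    return upfront + monthly * months


def break_even_months(upfront, monthly_alternative):
    """Months of paying a monthly fee that add up to a one-time spend."""
    return upfront / monthly_alternative


if __name__ == "__main__":
    horizon = 36  # hypothetical planning horizon, in months
    for name, (upfront, monthly) in OPTIONS.items():
        total = cumulative_cost(upfront, monthly, horizon)
        print(f"{name:26s} ${total:>9,}")
```

Under these assumptions, the $150,000 self-provided site matches the cumulative cost of a hot site after 15 months, a cold site after 30 months, and quick ship after 60 months.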
ITS’s Chosen DR Strategy
Use alternate space available in D’Avray Hall (3,600’ away by tunnel; 2,600’ by crow)
House decommissioned servers in the wiring closet in this building (powered up & idling)
House live redundant servers here as well, e.g. another Novell NDS replica, secondary DNS & DHCP servers, Webmail2
Upgrade electrical panel, rack, and UPS
Negotiate for more space in D’Avray Hall
DR Plan Features
Integrated with UNB’s Critical Incident Plan for Fire, Police, and PR coordination
Music Room will hold quick-shipped replacement servers in the event of disaster
Student computer lab in building available for ITS use in case of disaster
Conference Room also available to use as a Command Center
UNB Departments Involved
Nearly everyone in ITS
Physical Plant
Security
SRIM (Public Relations)
Purchasing
Environmental Health & Safety
DR Plan Sections
Incident Management Team Plan
ESS Team Plan
Communications & Network Team Plan
Operations Team Plan
Applications Recovery Team Plan
Help Desk Team Plan
Disaster Recovery Test Program
Disaster Recovery Maintenance Program
DR Team Structure

INCIDENT MANAGEMENT TEAM
  Incident Commander: Stephen Rosenfeld
  Alternate IC: Janice El-Bayoumi
  Purchasing Leader: Doug Beairsto
  Purchasing Alternate: Mary-Lou Veerkamp
  Administration: Wilma Gilchrist
  Admin Alternate: Pat Smith

ESS Team
  Leader: Peter Ruddock
  Alternate: David Lancaster

DR Coordinator
  Leader: Brian Kaye
  Alternate: Lori Murray-Hawkins
  Admin Coordination: Doug Swift/Terry Arnold

Damage Assessment Team
  ITS Leaders: Peter Jacobs & Brian Kaye
  Physical Plant: Mike Carter/Terry Koch
  UNB Security: Reg Jerrett/Bob MacLean

COMM & Network Team
  Leader: Peter Jacobs
  Alternate: Sterling Gallan

Operations Team
  Leader: John Jackson
  Alternate: Fred Webber

Applications Recovery Team
  Leader: Lori Murray-Hawkins
  Alternate: Rik Hall

Help Desk
  Leader: Kim Washburn
  Alternate: Scott Chamberlain
  Members: SHDC & TSS

Routing, Switching, Cabling
  Leader: Sterling Gallan
  Alternate: Paul Prowse

Network Servers & Applications
  Leader: Mike Jewett
  Alternate: Matt Ashfield

Storage/Backups
  Leader: Tracy Allen
  Alternate: Doug Swift

UNIX
  Leader: Tony Fitzgerald
  Alternate: Rob Murray
  Members: Unix Group

Novell & Backups
  Leader: Brian Cassidy
  Alternate: Fred Webber
  Members: Novell Group

DATATEL
  Leader: Phil Parent
  Alternate: Sean McDougall

EMAIL
  Leader: David Lancaster
  Alternate: Rob Murray
  Member: Tracy Allen

WEB
  Leader: Shawn McGinn
  Alternate: Megan Stewart

WEBCT
  Leader: Rik Hall
  Alternate: Rock Leung

UNB Security Team
  Leader: Reg Jerrett
  Alternate: Bob MacLean
Restoration of Services Ranking
High: 1 - 7 days (14 machines)
  Backup server & basic network connectivity (DNS, DHCP)
  Directory services (PH, LDAP)
  E-mail & Webmail
  Datatel
  Emergency Web presence (Unix)
Medium: 7 - 21 days (25 machines)
  Library Catalog service
  Novell file systems, printing, and GroupWise
  E-Services portal & WebAdvisor
  Web & WebCT
  Footprints
Low: 21 days - 2 months (24 machines)
  Everything else - mostly lab support and monitoring software
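The ranking above amounts to a small lookup table: each tier has an outer restoration window and a machine count. A minimal sketch of how those windows could be turned into calendar deadlines is shown here. The tier windows and machine counts are from the slide; treating "2 months" as 60 days, the function names, and the example disaster date are assumptions made only for illustration.

```python
from datetime import date, timedelta

# Tier windows and machine counts are from the slide.
# Treating "2 months" as 60 days is an assumption for this sketch.
TIERS = {
    "High": {"max_days": 7, "machines": 14},
    "Medium": {"max_days": 21, "machines": 25},
    "Low": {"max_days": 60, "machines": 24},
}


def restore_deadline(tier, disaster_day):
    """Latest date by which services in the given tier should be restored."""
    return disaster_day + timedelta(days=TIERS[tier]["max_days"])


def total_machines():
    """Number of machines covered by the ranking across all tiers."""
    return sum(t["machines"] for t in TIERS.values())


if __name__ == "__main__":
    day_zero = date(2005, 6, 28)  # hypothetical disaster date
    for tier in TIERS:
        print(tier, restore_deadline(tier, day_zero).isoformat())
    print("Machines covered:", total_machines())
```

Summing the three tiers shows the ranking covers 63 machines in total, of which only 14 fall in the first-week window.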
Next Steps
We now have a DR plan, but we only have a minimal recovery solution
Need to find more space in recovery building
Explore options for a shared DR site with other universities in the region
Consolidate OS for ease of recovery
Explore VMware and Solaris 10
Conclusion
The plan gives us a foot in the door to do a University-wide Business Continuity Review by raising awareness of DR outside of ITS
  For example, the Library hosts 7 servers in their own building, which do not fall under our DR plan
  Risk Assessment / Business Impact Analysis is required
  Our Recovery Time Objectives are an eye-opener
The model of using outside consultants was very successful and will be used for the BCP
DR considerations must be a factor in new purchases
Hardware vendor consolidation required for regional collaboration