bcp workshop 150906 - dr methodology griffith u.ppt
Disaster Recovery
Sudath Wijeratne 15-Sep-06
2Information Services
Agenda
• Background
• Methodology
• Our DR Strategy
• Learning Management System (Blackboard) DR implementation
• Discussions
Background
• Griffith has 5 campus locations from Brisbane's Southbank to the Gold Coast
• Servers have been installed local to each campus
• Nathan-centric corporate systems and servers
Background
• The University AUQA audit in 2003 identified risk management as a priority, and the impact of failure of electronic infrastructure was identified as a risk requiring mitigation. A loss of the core Nathan Data Centre has the potential to cripple the University’s ability to deliver its core services.
• Achieving sufficient involvement and sponsorship from core business units to undertake a traditional top-down Business Impact Analysis (BIA) approach to disaster recovery has proven difficult over the last few years.
Background …
• In the absence of a BCP plan/strategy, the ICTS management team commissioned a disaster recovery project to jump-start the process via a bottom-up approach for three critical University systems:
– Learning@Griffith
– Staff Email
– Corporate Web Services
Methodology
• Dozens of systems
• Hundreds of IT components (building blocks)
• How do they relate to each other, and what is the real impact of failure of any particular building block?
• We needed a process / methodology to guide us in approaching DR system by system
Methodology
• We have developed a methodology called “Building Block”
• The methodology is about
– Systems / Services
– Architectural components
– And their dependencies
Methodology
[Figure: matrix of Systems / Services (rows: System-1, System-2, System-3, System-4) against Architectural components (columns: Web Services, Application Services, Middleware Services, DBMS Services, Storage Services, Authentication Services, Directory Services, Network, DNS). The System-1 row holds the building block codes S1-APS, S1-DBS, S1-AUS, S1-NW and S1-DNS; the cells are labelled "Component fact sheet".]
Building Block methodology
• The approach analyses IT services top down, iteratively decomposing them into key technological components and dependencies.
• As a result of this process, the key building blocks required to deliver the service are identified.
• Recovery solutions are then developed for each building block.
• A second round of analysis identifies common building blocks that are reusable for other systems, or as base components for holistic disaster recovery (e.g. DNS services, LDAP).
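The second round of analysis, finding building blocks shared by several systems, can be sketched in a few lines of Python. This is only an illustration: the three system names come from the slides, but the block lists assigned to each system and the function name `common_building_blocks` are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical decomposition: each system maps to the building blocks it needs.
# The system names come from the slides; the block lists are illustrative only.
SYSTEM_BLOCKS = {
    "Learning@Griffith": {"app-servers", "oracle-db", "san-storage", "ldap", "dns"},
    "Staff Email": {"mail-servers", "san-storage", "ldap", "dns"},
    "Corporate Web Services": {"web-servers", "san-storage", "dns"},
}


def common_building_blocks(system_blocks, min_systems=2):
    """Return {block: sorted systems using it} for blocks shared by at least min_systems systems."""
    users = defaultdict(set)
    for system, blocks in system_blocks.items():
        for block in blocks:
            users[block].add(system)
    return {
        block: sorted(systems)
        for block, systems in users.items()
        if len(systems) >= min_systems
    }


shared = common_building_blocks(SYSTEM_BLOCKS)
for block in sorted(shared):
    print(f"{block}: used by {', '.join(shared[block])}")
```

Blocks that appear in the result are candidates for a single, reusable recovery solution rather than one per system.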
Understanding Environment Interdependencies
[Figure, built up over successive slides: dependency tree for System-1 (S1)]
System-1 (S1)
– Web/Apps Services (S1-APS)
    Apps Servers (S1-APS-SRV)
    Storage Services (S1-APS-SS)
    LBS (S1-APS-NW)
– Database Services (S1-DBS)
    Database Server (S1-DBS-SVR)
    Storage Service (S1-DBS-SS)
– Authentication (S1-AUS)
    LDAP servers (S1-AUS-SVR)
    LBS (S1-AUS-NW)
– Network Services (S1-NW)
– DNS01 (first shown together with the Web/Apps Services blocks)
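One way to answer the question "what is the real impact of failure of a particular building block?" is to store the interdependencies as a map and walk it upwards. The sketch below uses the System-1 codes from the diagram; hanging DNS01 under S1-NW is an assumption, and the function name `impacted_by` is hypothetical.

```python
# Each entry lists the building blocks that the key depends on directly.
# Codes are taken from the System-1 diagram; placing DNS01 under S1-NW is an assumption.
DEPENDS_ON = {
    "S1": ["S1-APS", "S1-DBS", "S1-AUS", "S1-NW"],
    "S1-APS": ["S1-APS-SRV", "S1-APS-SS", "S1-APS-NW"],
    "S1-DBS": ["S1-DBS-SVR", "S1-DBS-SS"],
    "S1-AUS": ["S1-AUS-SVR", "S1-AUS-NW"],
    "S1-NW": ["DNS01"],
}


def impacted_by(failed, depends_on):
    """Return every block that depends, directly or indirectly, on the failed block."""
    impacted = set()
    frontier = [failed]
    while frontier:
        current = frontier.pop()
        for block, deps in depends_on.items():
            if current in deps and block not in impacted:
                impacted.add(block)
                frontier.append(block)
    return impacted


print(sorted(impacted_by("S1-DBS-SS", DEPENDS_ON)))  # ['S1', 'S1-DBS']
```

A failure of the database storage block therefore propagates to Database Services and from there to System-1 as a whole.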
Methodology application to Blackboard
Methodology Summary for Blackboard
[Figure: Learning@Griffith BOS breakdown structure]
Blackboard
– Application Services (BB-AS)
    Collaboration servers (BB-AS-COLSVR), with Storage (BB-AS-COLSVR-SS)
    Application servers (BB-AS-APPSVR), with Storage (BB-AS-APPSVR-SS)
– Database Services (BB-DBS)
    Server (BB-DBS-SVR)
    Storage (BB-DBS-SS)
– Authentication Services (BB-AUS)
    LDAP Servers (BB-AUS-SVR)
– Network Services (BB-NW)
    LBS (BB-NWS-LBS)
    DNS (BB-NWS-DNS)
    BB-DHCP
– Network diagram: Border Router (NW-BDR-RTR); LG Campus Router (NW-LG-RT); NA Campus Router (NW-NA-RT); GC Campus Router (NW-GC-RT); SB Campus Router (NW-SB-RT); links labelled Fibre and ATM; sites labelled Nathan, MG, GC, SB, LG
– Also shown: Digital Repository, In-House Applications
Fact sheets
INFORMATION SERVICES, INFORMATION & COMMUNICATION TECHNOLOGY SERVICES
Date: July 2005
Architectural Layer: Database Services
    (E.g. Application, DBMS, Storage, Network Services etc.)
CMDB Reference: NA
    Reference to relevant CMDB for purpose of linking to an existing configuration management process, if applicable (E.g.: DBMS-CI008)
Identification Code: BB-DBS
    Unique identification code. E.g.: DB01
Description: Blackboard Database Services
    Brief description of the building block. E.g.: Learning@Griffith database.
Detailed Description: Oracle based DBMS services
    Detailed description of the building block. E.g.: Server details, Vendor, SAN storage, anything relevant to identify the building block
Position Responsible: Manager, DBMS
    Position that is responsible for the disaster recovery aspects of the building block. E.g.: Manager, Database & Storage Services.
Version: 1.1
    Document version.
Last Updated: 04/07/2005
    Date when this document was last updated.
Dependent Components:
    Building blocks on which this building block is dependent. E.g.: SRV01 (Server), STOR01 (Storage).
DR Level Objective: TBD
    Minimum level of services required while in disaster mode. E.g.: Web access to email.
DR Time Objective: TBD
    The maximum amount of time before service must be made available. E.g.: System must be available within 16 hours of disaster.
DR Point Objective: TBD
    Maximum amount of data loss acceptable as a result of a disaster. E.g.: maximum of 24 hours data loss.
Current DR Contingency: Database files are being replicated to MG
    What is in place now.
Future DR Contingency: TBD
    What is planned for the future. E.g.: An alternative balance switching method needs to be provided (possibly an LBS switch needs to be supplied off Nathan).
Issues Register Reference: Nil
    Building Block DR contingency issues listed in the issues register. E.g.: Present (see issue register), Nil (resolved or nonexistent)
DR Contingency Status: In Planning
    What is the current status of the building block with respect to the ‘Future DR Contingency’. E.g.: Not Started, In Planning, In Progress, Complete.
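If fact sheets are kept in machine-readable form, a small record type makes it easy to list which objectives are still "TBD". The following is a minimal sketch, assuming a Python dataclass; the class name `FactSheet`, its field names and the choice of hours as the unit are illustrative assumptions, while the sample values and the four status names come from the fact sheet above.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Status values listed in the fact sheet guidance text.
VALID_STATUSES = ("Not Started", "In Planning", "In Progress", "Complete")


@dataclass
class FactSheet:
    # Field names are hypothetical; they mirror a subset of the fact sheet rows.
    identification_code: str
    architectural_layer: str
    description: str
    position_responsible: str
    dependent_components: List[str] = field(default_factory=list)
    dr_time_objective_hours: Optional[float] = None   # None stands for "TBD"
    dr_point_objective_hours: Optional[float] = None  # None stands for "TBD"
    current_dr_contingency: str = ""
    dr_contingency_status: str = "Not Started"

    def __post_init__(self):
        # Reject statuses that are not in the list given by the template.
        if self.dr_contingency_status not in VALID_STATUSES:
            raise ValueError(f"unknown status: {self.dr_contingency_status!r}")

    def open_items(self):
        """Names of the objectives that are still TBD."""
        pending = []
        if self.dr_time_objective_hours is None:
            pending.append("DR Time Objective")
        if self.dr_point_objective_hours is None:
            pending.append("DR Point Objective")
        return pending


bb_dbs = FactSheet(
    identification_code="BB-DBS",
    architectural_layer="Database Services",
    description="Blackboard Database Services",
    position_responsible="Manager, DBMS",
    current_dr_contingency="Database files are being replicated to MG",
    dr_contingency_status="In Planning",
)
print(bb_dbs.identification_code, "open items:", bb_dbs.open_items())
```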
Learning@Griffith Recovery Order
[Figure: Blackboard building blocks annotated with recovery sequence numbers 1 to 13 and the label "optional"; the transcript does not preserve which number belongs to which block.]
Blackboard
– Server Management Services (BB-SMS)
    Collaboration servers (GU-SMS-BB-COLSVR)
    Application servers (GU-SMS-BB-APPSVR)
    LDAP Servers (GU-SMS-AUS-SVR)
    DNS (GU-SMS-DNS)
– Network Services (GU-NCS)
    SLB (GU-NCS-SLB)
    GU-DHCP (GU-NCS-DHCP)
    Network Connectivity (GU-NCS-NW)
– Database Management Services (BB-DBMS)
    BB Oracle (BB-DBMS-SVR)
– Storage Services (GU-SAN)
    Storage (GU-SAN-BB-SS)
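A recovery order of this kind can be derived mechanically once each block's prerequisites are recorded. The sketch below uses a topological sort from Python's standard `graphlib` module. The block codes come from the slide, but the prerequisite edges are hypothetical and the resulting order is not the numbering shown in the original figure.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# block -> blocks that must be recovered before it.
# Codes come from the recovery-order slide; the dependency edges are
# hypothetical and only illustrate the technique.
PREREQUISITES = {
    "GU-NCS-NW": [],
    "GU-NCS-DHCP": ["GU-NCS-NW"],
    "GU-SMS-DNS": ["GU-NCS-NW"],
    "GU-NCS-SLB": ["GU-NCS-NW"],
    "GU-SAN-BB-SS": ["GU-NCS-NW"],
    "GU-SMS-AUS-SVR": ["GU-SMS-DNS"],
    "BB-DBMS-SVR": ["GU-SAN-BB-SS", "GU-SMS-DNS"],
    "GU-SMS-BB-APPSVR": ["BB-DBMS-SVR", "GU-SMS-AUS-SVR", "GU-NCS-SLB"],
    "GU-SMS-BB-COLSVR": ["GU-SMS-BB-APPSVR"],
}


def recovery_order(prerequisites):
    """Return one valid recovery order; raises graphlib.CycleError on circular dependencies."""
    return list(TopologicalSorter(prerequisites).static_order())


order = recovery_order(PREREQUISITES)
for step, block in enumerate(order, start=1):
    print(step, block)
```

Several orders can satisfy the same prerequisites, so the numbering a team publishes still involves a judgement about priorities among blocks that could be recovered in parallel.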
Our DR Strategy is based around …
• Two physically separated primary Data Centres
• Distributed operation of major systems between these campuses
• Near real-time data replication capabilities between these Data Centres
Key Dependencies for DR Project
• SAN Infrastructure provisioning at Gold Coast campus
• Network server campus virtualisation (Between Nathan and Gold Coast)
SAN Infrastructure design
• Tiered storage capabilities
• Provide inter-campus (inter-SAN) copy (TRUE Copy) and snapshot (Shadow Image) capabilities
• Tier-to-tier copy capabilities
• Central management of storage
• Allow for DR implementation using the above capabilities
Virtual server campus network design
Virtual server campus network design
• Provide equal access to servers from anywhere
• Well-defined access points into the server subnets
• Greater server-to-server connectivity
• Allow for DR planning by allowing a shared layer 2 between sites
• Easy to migrate servers and services between sites
Learning@Griffith Distributed DR Architecture
Disaster Recovery Architecture for BlackBoard: Before DR Implementation (S.Wijeratne, 19/05/2006)
[Figure: architecture diagram; labels listed below]
• Clients, Layer 3 Switch
• Site: Nathan
• App Server 1, App Server 12, App Server 13, App Server 24
• Collaboration Server (Active)
• SAN - NATHAN: BB Prod Database, File System (NFS mount), Shared File System
• SAN - MT Gravatt: BB Prod Database, File System (NFS mount), Shared File System
• Async. Copy (SAN TRUE COPY)
• ORACLE BACKUP (shown twice), NFS (shown twice), Tape Backup
• LDAP Server1, LDAP ServerN
• Primary DNS
• F15K
Disaster Recovery Architecture for BlackBoard: After DR Implementation (S.Wijeratne, 18/05/2006)
[Figure: architecture diagram; labels listed below]
• Clients, Virtual Campus Network, two Layer 7 Switches
• Sites: Nathan, Gold Coast
• App Server 1, App Server 12, App Server 13, App Server 24
• Collaboration Server (Active), Collaboration Server (Stand by)
• SAN - NATHAN: BB Prod Database, File System (NFS mount), Shared File System
• SAN - GOLD COAST: BB Prod Database, File System (NFS mount), Shared File System
• Async. Copy (SAN TRUE COPY), Shadow Image
• ORACLE BACKUP (shown twice), NFS (shown twice), Tape Backup
• LDAP Server1, LDAP ServerN, LDAP ServerN+1, LDAP ServerZ
• Primary DNS, Primary DNS (Backup) @South Bank
• F15K (shown twice)
Disaster Recovery Architecture for BlackBoard: Disaster at NA (S.Wijeratne, 18/05/2006)
[Figure: architecture diagram; labels listed below]
• Clients, Virtual Campus Network, two Layer 7 Switches
• Sites: Nathan, Gold Coast
• App Server 1, App Server 12, App Server 13, App Server 24
• Collaboration Server (Active), Collaboration Server (Stand by)
• SAN - NATHAN: BB Prod Database, File System (NFS mount), Shared File System
• SAN - GOLD COAST: BB Prod Database, File System (NFS mount), Shared File System
• Async. Copy (SAN TRUE COPY), Shadow Image
• ORACLE BACKUP (shown twice), NFS (shown twice), Tape Backup
• LDAP Server1, LDAP ServerN, LDAP ServerN+1, LDAP ServerZ
• Primary DNS, Primary DNS @South Bank
• F15K (shown twice)
DR Plans
• DR Management Framework
• DR Plans for each building block
• Resumption plans
Lessons learnt
• Resource issues and priorities with multiple projects
• Distributed environment
– Benefits and challenges
• Resources at 2nd primary site
• External audit, University Audit committees
• BCP awareness
Discussions