BCP from Theory to Practice
Presented by Mark Pryce & Karl D. Bryant
18 March 2013
Theory
Business Continuity Management Overview
Karl D. Bryant, CBCP, MBCI, PMP, CBCLA | Senior Vice President
Marsh Risk Consulting | Strategic Risk Consulting
540 West Madison, Suite 1200 | Chicago, Illinois 60661
P: 312 627 6391| F: 212 948 1318 | M: 312 550 9017
Marsh—Leadership, Knowledge, Solutions…Worldwide. 4
Business Continuity Management Defined
• Holistic management process
• Identifies potential impacts that threaten an organization
• Provides a framework for building resilience with the capability for an effective response that safeguards the interests of its key stakeholders, reputation, brand and value-creating activities
• Management of recovery or continuity in the event of a disaster
• Management of the overall program through training, rehearsals, and reviews to ensure the plan stays current
The 4 Main Components of BCM
• Crisis Management (CMP)
Overall coordination of the response to a crisis, in an effective, timely manner, with the goal of avoiding or minimizing damage to the organization’s profitability, reputation, and ability to operate.
• Emergency Response (ERP)
Immediate reaction and response to an emergency situation, commonly focusing on ensuring life safety and reducing the severity of the incident.
• Business Continuity Planning (BCP)
Identification and protection of business processes required to maintain an acceptable level of operations in the event of sudden, unexpected, or not so unexpected, interruption of these processes and their supporting resources (i.e., to do what is necessary to keep the critical business units running).
• Disaster Recovery (DR)
Technical or IT portion of the BCP. Includes Mainframe, Midrange (VAX, AS/400), Client Server (UNIX, NT, etc.). Disaster Recovery is a component of Business Continuity.
What is the first step in Business Continuity?
• You must be able to answer the WHAT questions:
– What is critical and how long can I do without it?
– What services must I provide?
– What must I do first, second, third, etc.?
– What drives my decisions?
– What do I need to make decisions?
– What am I willing or able to spend?
• If you start a program without these questions answered, it will usually be difficult to sustain through more than a few update cycles.
Business Continuity Planning Objectives
• Create a standard for enterprise-wide Business Continuity Planning
– Establish a standard planning toolkit and implement the same “look and feel” for all operations
  • Helps to engender the plan as a part of the organizational culture
  • Increases the likelihood that plans will have congruence and work in harmony at a time of crisis
• Gather data for recovery resource requirements (Business Impact Analysis)
– Avoid creating or documenting catalog information – the recovery period will not be identical to “business as usual” and recovery requirements will not be an inventory of what is currently in place
– Focus on capturing resource and infrastructure requirements to establish an acceptable level of operations – the optimal solution set for getting back in business and maintaining baseline operations until normal functional levels can be restored
Business Continuity Planning Objectives
• Identify and analyze gaps between recovery requirements and existing recovery capabilities
– Make every effort to re-use or re-tool existing infrastructure (alternate workspace, IT environments, common areas) in the development of recovery solutions
• Develop viable recovery strategies for all operating units
– It is critical that during the planning process, great care is taken to ensure that the solutions developed are reasonable and practical.
– It is not sufficient to develop recovery strategies based on “conventional wisdom”, especially if they are not fully vetted and validated through plan exercising.
Business Continuity Planning Objectives
• Conduct training and awareness sessions
– Make sure all program stakeholders (executives, employees, customers, supply chain partners) understand why the plan was implemented and how it protects their interests
– Increases likelihood that plan owners and constituents know what to do in a crisis situation
• Establish a process for updating and maintaining the Business Continuity Plan
– The plans will become out of date almost immediately
– It is critical that the plans be relevant and topical in the event they are needed
– Well-maintained plans enhance audit compliance
Event Neutral Planning
Process Approach
Resource Mapping
Address all dependencies and the skills
required to maintain operations, whether a
public entity, manufacturer, service
company, or other type of organization
Best Practice: “IMPACT” vs Threat-Based Approach
Assume the resource is either unavailable for >30 days and/or, worst case, destroyed.
Assumption: Unavailable and/or inaccessible for an extended period of time
• People (employees, contractors, support functions): Pandemic – 40% of internal and 40% of external work force
• Technology & Processing (data processing networks): Inability to gain access to service / install software
• Physical (facilities, raw materials, equipment): Building quarantined, civil unrest damage to facility and vital records
• Relationships & Interdependencies: Sole source, critical infrastructure, supplier severely affected
Assumption: Destroyed or perished
• People (employees, contractors, support functions): Three orders of succession
• Technology & Processing (data processing networks): Wide-scale civil unrest and looting destroy facilities; key electronic records destroyed
• Physical (facilities, raw materials, equipment): Wide-scale civil unrest and looting destroy facilities; key documentation destroyed
• Relationships & Interdependencies: Foreign investors lose confidence, exit market
Business Continuity Management
Planning Methodology
BCM Methodology & Approach: Program Scoping
• Program Scoping
– Organizational Analysis
– Collection and Review of Existing Materials
  • Business Continuity Plans
  • Recovery Infrastructure
  • IT Environment
– Initial Program Design
– Resource Options and Budgeting
  • In-house staff
  • New hires
  • Engage outside consultants
BCM Methodology & Approach: Development of BCP “Toolkit”
• Develop a BCP “toolkit” that may include the following:
– Threat & Vulnerability Analysis Matrix
– Business Impact Analysis (BIA) Questionnaire
– Strategy Evaluation Matrix
– Manual Workaround Worksheet
• Custom Business Continuity Plan Template(s) based on the recovery needs and strategies of the various business entities
• Common Plan Elements:
– Appropriate Command and Control Structure, including specific BCP teams required for each business area
– Team members’ duties and responsibilities
– Incident Response Model
– Escalation Procedures
– Declaration Procedures
BCM Methodology & Approach: Risk Assessment
• Risk Assessment
– Assessing your business, utilities and physical risks
– Analyzing the risks by department and location
– Developing risk mitigation strategies for those risks that can be mitigated
– The outcome of the risk assessment includes a risk map (a graphical method to position identified risks against anticipated impact and likelihood axes) and a gap analysis chart (a graphical method to show gaps or overlaps between the inherent risk and the management effectiveness of that risk)
BCM Methodology & Approach: Business Impact Analysis
• Business Impact Analysis
– Documentation of Critical Business Processes
– Seasonal/Regulatory/Community Impact
– Recovery Time Objectives (RTO) and Recovery Point Objectives (RPO) for all processes, applications and process dependencies
– Potential financial losses or additional expenses related to outages
– Human Capital Requirements for Recovery (including “Key Person” Dependencies)
– Key Suppliers
– Vital Records
– Identification of Process Interdependencies (Inputs and Outputs, both Internally and Externally)
– Prioritization of processes for recovery
BCM Methodology & Approach: Strategy Development / Business Continuity Planning
• Strategy Evaluation and Selection
– Select appropriate recovery strategies
– Develop recovery strategy implementation plan
• Business Continuity Plan Development
– Establish plan to cover broad base of outages
  • Loss of people
  • Loss of site
  • Loss of systems
  • Loss of relationships (Supply Chain Risk)
– Document recovery strategies based on process prioritization and risk assessment findings
BCM Methodology & Approach: Program Maintenance
• Training of Staff
– Development of Training Material
– BCM Training “Workshops”
• Exercise Program
– Development of Exercise Scenarios
  • Based on Property Engineering Reports (Probable Maximum Loss)
  • Connects BCM Program to Insurance Market Requirements
– Facilitation of an Exercise
– After-Action Review and Subsequent Plan Updates
• Plan Maintenance Program / Ongoing Program Support
– Plan Deployment
– Plan Update Protocol
Business Resilience Maturity Levels
Companies are evolving their risk management thinking to address this universe of threats.
[Figure: Evolution of risk management thinking, stages I to VII. Axis labels: “Degree of sophistication” and “Value added for Wells”. Bands: Insurance and compliance; Core risk management; Risk-return optimization.]
Stage statements shown in the figure:
• “Risk management equals buying insurance” → Risk transfer via insurance
• “Regulators are demanding risk management activities” → Over-reliance on ‘checklists’, false sense of security
• “We need a sustainable process for monitoring all our risks” → Qualitative risk management
• “We need to know the economic impact of our largest risks” → Specific risk quantification
• “Risk needs to be quantified comprehensively” → Over-control by centralized risk management, initial quantitative models too primitive
• “Shareholders demand a risk/return framework” → Risk and growth appetite defined, risk dynamically measured and aggregated properly
• “Decision making across firm is linked to building economic value” → Risk adjusted resource allocation at all levels
Maturity path:
• Disaster Recovery (Component level contingencies and data center solutions)
• Business Continuity (Integrating Business Recovery and Disaster Recovery)
• Business Resilience (Integrating Crisis Management with Business Continuity and aligning it with overall Risk Management)
Resource Prioritization
Resources (e.g., people, physical, technology and relationships) are prioritized to support the identification and the development of both tactical and strategic options.
• Tactical options
– Quick hits
– Minimal investment to provide greater resiliency
– Knowledge / strategy exists, but not formally documented
• Strategic options
– Focus on the priorities
– Identify risk mitigation and financing options
– Model and price
– Require longer term programs and solutions, probably some degree of risk acceptance
[Figure: impact/effort matrix with quadrants Low Impact / Low Effort, Low Impact / High Effort, High Impact / Low Effort, High Impact / High Effort]
Practice
Mark Pryce, CRM, CBCP | Managing Director
RecoveryLogic Inc. | Business Resiliency & Risk Consulting
TF: 1-888-751-5231 | M 416-543-7714
www.recoverylogic.ca
Business Continuity Management Overview
Corporate Policy Development
Our first challenge was to develop a corporate policy and obtain signoff from
the CEO and senior executives.
Scope:
• Protect the business and facilitate recovery from significant disruptions
• Consistent with industry best practice in Canada and the US
• Compliant with Canadian & US standards (CSA Z1600 & NFPA 1600)
Purpose:
• Establishing BC & DR programs at the Business Unit level
• Developing and implementing BC & DR plans
• Testing and updating plans on a regular basis
Responsibilities:
• Plans owned by Business Unit VP
• Program managed by Business Continuity Team
Program Development
Our corporate policy established a mandate for BCP and allowed us to begin
development of a program following a multi-pronged approach:
Incident Management
• Develop corporate incident management plan
• Build corporate emergency operations centre
Facilities
• Complete physical and environmental risk assessment for priority sites
Business Operations
• Complete Enterprise BIA to identify and prioritize business units by
criticality
Program Management
From the Enterprise BIA, we developed a three-year project plan to engage
each BU in priority sequence:
• Kick-off meetings were scheduled with the VP of each BU to gain their
buy-in and have them assign Departmental Primes with whom we could
work to assess impacts and develop the required plans
With over 130 VPs and 530 Directors, we needed tools to keep track of plans:
• Monthly reports to advise VPs of plan development progress
• Plan signoff and certification process
• Plan status tracking tool
Initially this was all done via spreadsheets and email. We are now
implementing a custom-designed SharePoint plan library with
metadata and automated workflows.
Corporate IMP
Incident Management - CEOC
[Figure labels: Corporate Emergency Operations Centre (EOC); Network Operations Centre]
Facilities - Risk Assessment
Risks were assessed for all owned and leased facilities:
• Building Type (Single story, high rise)
• Environment (Location, surrounding industry, proximity to highways and
railroads, other occupants, presence of unionized employees)
• Equipment Location (Equipment room environment, windows)
• Security (Fencing, access control, surveillance and anonymity)
• Safety (Fire alarms, fire suppression, fire water and smoke detection)
• HVAC (Heating and cooling capacity, redundancy, hot spots, air flow)
• Power (Generator backup, location, maintenance, battery capacity)
Recommendations were developed:
• Short term / Low Cost: (Re-routing of water pipes, addition of drip trays, installation of
fire suppression systems, enhancement of cooling system capacity)
• Long term / High Cost: (Relocation of sites, development of mobile DR trailers.)
Business Operations - Enterprise BIA
KEY BUSINESS DRIVERS
Level: High
• Brand & Reputation: Catastrophic impact to image
• Financials: Direct & significant impact to revenues
• Customer Service: Direct customer-facing impact
• Regulatory & Legal: Legal liability with penalties (especially existing customers)
Level: Medium
• Brand & Reputation: Visible impact to image
• Financials: Indirect impact to revenues
• Customer Service: Indirect customer impact
• Regulatory & Legal: Regulatory liability with no penalties (especially potential customers)
Level: Low
• Brand & Reputation: None/limited impact to image
• Financials: None/limited impact to revenues
• Customer Service: None/limited impact to Customer Service (i.e. no/limited customer contact)
• Regulatory & Legal: No/minor regulatory/legal liabilities
Analysis was conducted and signed off by the Business Continuity Advisory
Committee.
• 174 BUs were assessed based on four Key Business Drivers:
  • A rating of “High” was scored at 3 points
  • A rating of “Medium” was scored at 2 points
  • A rating of “Low” was scored at 1 point
  • Total maximum score was thus 12 points (3 points for each of the four key business drivers)
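The scoring rule above can be sketched in a few lines of Python. This is only an illustration: the names `RATING_POINTS`, `KEY_BUSINESS_DRIVERS` and `bia_score`, and the sample ratings, are hypothetical and do not come from the presenters' tooling.

```python
# Illustrative sketch of the Enterprise BIA scoring rule (names are hypothetical).
RATING_POINTS = {"High": 3, "Medium": 2, "Low": 1}

KEY_BUSINESS_DRIVERS = (
    "Brand & Reputation",
    "Financials",
    "Customer Service",
    "Regulatory & Legal",
)


def bia_score(ratings):
    """Sum the points for one business unit across the four key business drivers."""
    missing = [d for d in KEY_BUSINESS_DRIVERS if d not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {missing}")
    return sum(RATING_POINTS[ratings[d]] for d in KEY_BUSINESS_DRIVERS)


# Made-up ratings for a single business unit.
example_ratings = {
    "Brand & Reputation": "High",
    "Financials": "High",
    "Customer Service": "Medium",
    "Regulatory & Legal": "Low",
}
example_score = bia_score(example_ratings)  # 3 + 3 + 2 + 1
print(example_score)
```

Because every driver scores at least 1 point, totals range from 4 to 12.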
Enterprise BIA (continued)
PRIORITY TIERS
Tier 1 (score 12)
Sample tier responses:
• Top priority for support
• Recommend returning to normal support levels within 8 hours
Business Units, e.g.:
• Network Operations (Wireless & Cable)
• Media Operations (Television & Radio)
Tier 2 (score 10 – 11)
Sample tier responses:
• Immediate need for best-effort support
• Recommend returning to normal support levels within 48 hours
Business Units, e.g.:
• Transaction Processing
• Production Systems
• Customer Support
• Service Delivery
Tier 3 (score 7 – 9)
Sample tier responses:
• Best-effort support after 24 hours
• Recommend returning to normal support levels within 72 hours
• Escalation may occur on major fault
Business Units, e.g.:
• Internal Communication and Service
• External communication
• Sales
• Production
Tier 4 (score 4 – 6)
Sample tier responses:
• Recommend returning to normal support levels within 7 days
• Escalation may occur on major fault or time sensitive project (marketing launch, major implementation)
Business Units, e.g.:
• Strategy
• Project Management
• Planning
• Marketing
Each BU was tiered based on its total score and a project plan was
developed to engage them in priority sequence.
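As a rough sketch, the score-to-tier mapping from the priority tier table could be written as follows. The function name `priority_tier` and the lookup `TARGET_RETURN_TO_NORMAL` are hypothetical. The score ranges and recovery targets are taken from the table.

```python
# Illustrative mapping from total BIA score (4 to 12) to priority tier.
TARGET_RETURN_TO_NORMAL = {
    1: "8 hours",
    2: "48 hours",
    3: "72 hours",
    4: "7 days",
}


def priority_tier(score):
    """Return the tier (1 = highest priority) for a total BIA score."""
    if not 4 <= score <= 12:
        raise ValueError("score must be between 4 and 12")
    if score == 12:
        return 1
    if score >= 10:
        return 2
    if score >= 7:
        return 3
    return 4


for score in (12, 10, 8, 5):
    tier = priority_tier(score)
    print(score, "-> Tier", tier, "- normal support within", TARGET_RETURN_TO_NORMAL[tier])
```

Sorting business units by tier, then by score, would give one possible engagement order. The slides do not say how ties within a tier were broken.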
Departmental BIAs
Our team met with the identified Departmental Primes (Directors and/or
delegated managers) and performed BIAs to:
• Determine the criticality of their business functions
  • Functions with a Recovery Time Objective (RTO) of less than 7 days were deemed critical
• Identify need for BCP to recover critical business functions at alternate
site
• Identify need for DRP to recover supporting technologies (hardware and
software) required to perform critical business functions
• Identify need for specialized PP with additional details over and above the
standard “Corporate PP”
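The criticality rule (an RTO of less than 7 days) is simple enough to express directly. The sketch below assumes RTOs are recorded in hours. The `BusinessFunction` class and the sample functions are hypothetical.

```python
from dataclasses import dataclass

# "Less than 7 days" expressed in hours (assumes RTOs are recorded in hours).
CRITICAL_RTO_LIMIT_HOURS = 7 * 24


@dataclass
class BusinessFunction:
    name: str         # hypothetical function name
    rto_hours: float  # Recovery Time Objective in hours


def is_critical(function):
    """A function is deemed critical when its RTO is under 7 days."""
    return function.rto_hours < CRITICAL_RTO_LIMIT_HOURS


# Made-up examples for illustration only.
functions = [
    BusinessFunction("Outage dispatch", 8),
    BusinessFunction("Quarterly planning", 14 * 24),
]
critical_names = [f.name for f in functions if is_critical(f)]
print(critical_names)
```

A function with an RTO of exactly 7 days falls outside the "less than 7 days" rule and is therefore not flagged.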
BIA Template (continued)
BUSINESS FUNCTION PROCESS FLOWS
• Columns: BF #; Business Function Owner; Business Function Description; Internal Inputs; External Inputs; Internal Outputs; External Outputs
• Each input and output column is split into “Trigger / Input / Dependency” and “Alternate Process / Manual Work-around”
RECOVERY TIME OBJECTIVE
• Columns: Brand & Reputation; Financial Loss; Customer Service; Regulatory & Legal; Summary RTO
• Each column is rated as one of: 8h, 24h, 48h, 72h, <7d, 7d+
STAFFING REQUIREMENTS
• Business as Usual (address of first office)
• Work from Home
• Work from Alternate Site (A to D: location of each alternate site)
• Columns: BAU; WFH; VPN; 8h; 24h; 48h; 72h; 5d; 7d+; Alt Site
METRICS / SLAs
• Metric / SLA 1 and Metric / SLA 2, each recorded for BAU and At Time of Disaster
RECOVERY REQUIREMENTS AT ALTERNATE SITE
• Network: Green; Orange; Blue; Purple; Gold; Brown; Silver; Black
• Hardware: Laptops; Avaya Phones; Headsets; Equipment / Tools; Telecom / Switching; Computer / Servers; Other
• Records / Documentation: Shared drive; Hard copy; Personal Drives; Local Databases; Other
Business Continuity Plans
We chose to develop BCPs by BU rather than by physical location as it allowed
us to expedite plan creation by engaging each BU once for all their locations.
Plans are kept in a library, cross-referenced by location.
Our BCPs detail the strategies and processes to be followed to recover critical
business functions following an incident (e.g., fire or flood) which has rendered
the BU’s normal place of work “inaccessible” for a prolonged period of time.
BCPs are:
• Owned by the BU’s Crisis Management Team
• Signed off by the departmental VP
• Implemented by the BU's Recovery Teams
Recovery strategies may include relocation of critical staff to alternate
facilities or “Work From Home”.
BCP Template
BCP Template (continued)
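Exposures
• Columns: No; Exposure; Affected Team; Owner; Resolution Date
• Rows 1 to 3 are blank in the template
Recovery Time Objective (RTO)
• Columns: Dept #; Department Name; Department Description; RTO; Recovery Strategy; Exposure?
• Dept 1: Recovery Strategy “See section 1”; Exposure? Y
• Dept 2: Recovery Strategy “See section 2”; Exposure? N
• Dept 3: Recovery Strategy “See section 3”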
Roles & Responsibilities
(Columns: Team; Role; Responsibilities; Owner/s***)
CIMT (Corporate Incident Mgmt. Team)
• Role: CIMT Members (Owner: Name)
– Manage corporate response for incidents impacting more than one business unit
– Corporate Communications to all external (non-Rogers) audiences
CMT (Crisis Mgmt. Team)
• Role: Incident Commander (Owner: Name)
– Single point of contact between CIMT and CMT
• Role: Departmental Primes (Owners: Director 1, Director 2, Director 3)
– Single point of contact between CMT and IRTs
– Coordinate incident response among IRTs
• Role: Communication Prime (Owner: Other)
– Coordinate communications between Incident Commander, Departmental Primes and Incident Recovery Team Primes
IRT (Incident Recovery Team)
• Role: Incident Recovery Team Primes (Owners: Manager 1, Manager 2, Manager 3)
– Recover Critical Business Processes within RTO
– Provide updates to Crisis Management Team
BCP (Business Continuity Planning Team)
• Owners: Director, Business Continuity; Senior Mgr, Business Continuity
– Manage Corporate Emergency Operations Centre (CEOC)
– Facilitate meetings
– Assist CIMT, CMT & IRT with incident response as required
DRP
Corporate IT Services has in place a comprehensive DRP which includes fully
redundant backups at alternate sites, with an RPO of 24 hours.
• Business Units do not need their own DRP unless they use applications or
hardware which are not managed by our corporate ITS
• As such, our BIAs focus on identifying any “custom applications” or
“servers under someone’s desk” for which DRPs need to be developed
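To make the 24-hour RPO concrete: the data lost in an incident is bounded by the time since the last successful backup, so a system meets the RPO only if that gap never exceeds 24 hours. The check below is a minimal sketch. The function names and timestamps are hypothetical and do not describe the corporate ITS tooling.

```python
from datetime import datetime, timedelta

# RPO stated for the corporate DRP.
RPO = timedelta(hours=24)


def potential_data_loss(last_backup, incident_time):
    """Time window of data that would be lost if the incident hit now."""
    if last_backup > incident_time:
        raise ValueError("last_backup must not be later than incident_time")
    return incident_time - last_backup


def meets_rpo(last_backup, incident_time, rpo=RPO):
    """True when the data-loss window is within the recovery point objective."""
    return potential_data_loss(last_backup, incident_time) <= rpo


# Made-up timestamps for illustration.
last_backup = datetime(2013, 3, 17, 2, 0)
incident = datetime(2013, 3, 18, 1, 30)
print(potential_data_loss(last_backup, incident), meets_rpo(last_backup, incident))
```

The same check could be run against any “custom application” or desk-side server found during a BIA, to see whether its backup schedule would actually support the stated RPO.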
We are a facilities-based network service provider with extensive Wireless, Fibre and
Coaxial networks, for which we have comprehensive DRPs in place:
• Mobile DR trailers (Cable Head Ends)
• Cellular Switch Strategy
• Cells on Wheels
• Media Technology (Television & Radio stations)
Questions
Thank You!