An Analytics Based Quality Management System
Steve Mazzuca, Principal Data Scientist
2018-03-15
Original System – Reactive Quality Management
The original Quality Management System took action only when targets were missed.
The Continual Improvement Culture lacked:
– Predictive Analytics focused on failure prevention.
– Root Cause Identification focus.
– A Repeatable Process to track remediation action item closure.
– A focus on taking proactive actions.
Impact: A firefighting culture that focused on addressing incidents rather than preventing the
underlying problems.
Does anyone remember Lucille Ball in the Chocolate Factory episode? It is a good example of a reactive Quality Management System.
We knew that we could do better!
What was done:
Defined, documented and implemented a Quality Management System
that follows IBM best practices. It supports Operational Governance
and drives Continual Improvement throughout the entire delivery team.
What are the benefits:
– Continual Improvement Focus
– Results Based Analytics
– Robust Root Cause Analysis (RCA) process
– Consistent Analytic Tools and Techniques
– Process Improvement Actions aligned with Analytic Tools
Process
Quality Management System Overview
1. Measure Performance – Enterprise Dashboard: Account SLAs & SLOs; Service Line Metrics; MB&C* metrics
2. Identify Issues – Issues: SLA/SLO target failures; PBA exceptions; Customer Feedback; Idea Log
3. Address Root Cause – RCA (Root Cause Analysis): Root Cause Analysis Process; Defect Prevention Program; Problem Solving Sessions
4. Execute Actions – Actions follow-up: Weekly Metric Analysis; Daily huddles; Account governance reviews
* SIAM – Service Integration and Management
Client Centric Analytics
Account Continual Improvement Projects
SIAM** Continual Improvement Projects
Objective: Drive sustainable continual service improvements and enhance client experience through effective and efficient use of Analytics and the resulting Synthesis.
Small Projects (DA team)
Large Scale Projects (SIAM* team)
Consistent Quality Management System Meeting Structure
- Structure drives a focus on Quality Activities at all levels of the business.
- Quality related meetings reviewed for redundancy and completeness.
- Outcome based analytics utilized to ensure impactful results.
CI = Continual Improvement
DA = Delivery Analyst
FLM = First Line Manager
SIL = Service Integration Lead
Delivery Analysts collaborate, conduct research, collect data and perform analysis in order to deliver recommendations:
• Monitor incoming measurements for signals using Statistical Process Control to generate synthesis resulting in impactful recommendations.
• Provide in-depth data analysis and modelling to identify opportunities for improvement in an objective fashion.
• Create broadly applicable solutions resulting in sustainable process improvements that address recurring problems.
Impactful Analysis:
- Pareto Analysis of Late Work Orders
- 166 Errors Identified in 10 categories
- Top 3 categories represent 81% of defects.
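A Pareto analysis of this kind can be sketched in a few lines of Python. This is only an illustration, not the team's actual tooling: the function name `pareto_table`, the category names and the per-category counts are all hypothetical. The counts were chosen only so that the total (166), the number of categories (10) and the approximate top-3 share (81%) match the figures quoted on the slide.

```python
def pareto_table(counts):
    """Sort categories by descending count and attach the cumulative share."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no defects to analyze")
    rows, running = [], 0
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    for category, count in ranked:
        running += count
        rows.append((category, count, running / total))
    return rows


# Hypothetical data: names and the per-category split are invented.
# Only the total, the category count and the rough top-3 share follow the slide.
example_counts = {
    "Category A": 60, "Category B": 45, "Category C": 29, "Category D": 8,
    "Category E": 7, "Category F": 5, "Category G": 4, "Category H": 3,
    "Category I": 3, "Category J": 2,
}

# Print one line per category: name, count, cumulative share of all defects.
for category, count, cumulative in pareto_table(example_counts):
    print(f"{category:<12}{count:>5}{cumulative:>8.0%}")
```

Reading the cumulative column from the top shows how few categories account for most of the defects, which is what lets the analyst focus remediation on the "vital few".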
Defect Prevention Program: Driven by the Delivery Analysts
– IBM’s Global Defect Prevention Program:
• A sustainable methodology to proactively & systematically:
− Improve Quality of services delivered by reducing the number of defects
through structured Problem Management.
− Improve Productivity by utilizing a global process to synthesize incidents
across the organization in order to demonstrably reduce the number of
incoming incidents
• Global process ensures organizational support to effectively leverage
defect knowledge across accounts
Incident: “Any event which is not part of the standard operation of a system that causes, or may cause, an interruption to, or a reduction in, the quality of service.”
Problem: “An unknown underlying cause of one or more incidents.”
Productivity
Quality
Delivery Excellence
Global Knowledge Management
Sample Defect Prevention Program Project - Impact
Impact / Benefits to Account
• Reduction in number of incident tickets: an average reduction of 250 tickets per month.
• Time savings provide additional opportunities for analysts to work on other CI projects.
Impact: Critical and High severity incidents reduced, resulting in a more efficient support process.
Impact / Benefits to Client IT Division (CIO Office)
• Enhanced System Availability.
• Better implementation of Business Processes that in turn lead to enhanced performance of IT resources.
Impact: Higher quality and more efficient Business Processes, both of which contribute to the bottom line.
Impact / Benefit to Client’s Business
• Quality improvement: Defects reduced by identifying and addressing the true Root Cause of recurring defects.
• Productivity improvement: Eliminated non-value-add activities associated with the defect handling process.
Impact: Time savings that can be applied to activities that provide value to the Client and their Customers.
Impact / Benefit to Client’s Customer
• Improved Quality, Productivity and Service to their Customers through reduction of repeat incidents.
[Figure: incident reduction chart]
The RCA (Root Cause Analysis) Process:
Objective:
Drive sustainable continual improvement in Root Cause Analysis quality and effectiveness via an efficient and sustainable process.
Process:
1. RCA Generation
2. RCA Internal Quality Review Board
3. RCA Submission to Client
4. Execute and Track Actions
5. Global Clearing House (Optional)
Status:
Implemented Enhanced RCA Process.
Implemented RCA metrics to ensure a sustainable process.
Implemented enhanced RCA Practitioner Training program.
Data indicate an improved and sustainable process.
Impact: Client Executive called out program impact during all hands call.
Analytics
Process Behavior Analysis drives objective decision making.
A Typical Process Behavior Chart
What is it?
Process Behavior Analysis starts with a time series.
A central line, known as the mean or average, is added for detecting shifts.
Natural Process Limits are computed from the data and placed symmetrically on either side of the mean.
Objective criteria for exceptional behavior are defined:
- missing the Voice of the Customer
- 8 points above or below the mean
- a point outside the Natural Process Limits
- 6 points in a row all increasing or decreasing
- non-random behavior
When exceptional behavior is observed, action is required.
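The detection rules listed above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the deck's actual implementation. The slides do not say how the Natural Process Limits are computed, so the sketch assumes the common individuals (XmR) chart convention of mean ± 2.66 × average moving range. The function names and rule labels are hypothetical. The "Voice of the Customer" check and the general "non-random behavior" check are omitted because the slides do not define them precisely.

```python
from statistics import mean

# Standard XmR scaling constant (3 / 1.128). Assumed here; not stated in the slides.
XMR_FACTOR = 2.66


def natural_process_limits(values):
    """Return (lower, center, upper), placed symmetrically about the mean."""
    if len(values) < 2:
        raise ValueError("need at least two points")
    center = mean(values)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    half_width = XMR_FACTOR * mean(moving_ranges)
    return center - half_width, center, center + half_width


def sign(x):
    return (x > 0) - (x < 0)


def find_signals(values, run_length=8, trend_length=6):
    """Return (index, rule) pairs for three of the exception criteria."""
    lower, center, upper = natural_process_limits(values)
    signals = []
    side_run, last_side = 0, 0             # consecutive points on one side of the mean
    trend_run, last_direction = 1, 0       # consecutive points rising or falling
    for i, value in enumerate(values):
        # Rule: a point outside the Natural Process Limits.
        if value < lower or value > upper:
            signals.append((i, "outside_limits"))

        # Rule: 8 points in a row above (or below) the mean.
        side = sign(value - center)
        side_run = side_run + 1 if side != 0 and side == last_side else abs(side)
        last_side = side
        if side_run >= run_length:
            signals.append((i, "run_about_mean"))

        # Rule: 6 points in a row all increasing or all decreasing.
        if i > 0:
            direction = sign(value - values[i - 1])
            if direction != 0 and direction == last_direction:
                trend_run += 1
            else:
                trend_run = 2 if direction != 0 else 1
            last_direction = direction
            if trend_run >= trend_length:
                signals.append((i, "trend"))
    return signals
```

Calling `find_signals(weekly_values)` on a metric's time series returns the points that call for action; an empty list means the process is behaving predictably and no intervention is indicated.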
Industry Standard Analytic Tools are utilized
Definition: A visual method of analysis that utilizes SPC methodology to compare process capability with process specifications. It drives focus on areas requiring intervention.
Status:
• Regular review cadence implemented.
• SLAs capability analysis utilized for Continual Improvement action prioritization.
• Linkage established between Client facing metrics and internal facing metrics.
Exception Based Management process drives objective and efficient analysis of account performance.
Standard weekly quality review includes:
• Exception Based Reporting analysis
• Action Item follow up
• Ticket Quality Review
• Idea Log Review
• Defect Prevention Program
Benefits:
• Consistent analysis at all levels
• Efficient use of delivery resources
Delivery Analysts use multiple analytic techniques to drive improved Delivery Excellence for the account.
Reason Classification | Cabling Team | Vendor | IBM Network Team | IBM Storage Team | Client | Client and IBM | Migration | Work in Progress | Total
Work In Progress 3 2 2 8 5 2 2 24
Project Work Orders 2 1 1 7 2 2 15
No Storage Available 4 4
Pending 3 3
Planning 1 2 1 4
On Hold 1 1
Pending With Customer 1 1 2
Total 7 3 3 10 22 2 2 4 53
Open Work Orders > 20 Business Days
* 53 Work Orders out of 155 Open Work Orders are aged over 30 days
* Project Work Orders: Mostly pending due to Client Tool Project.
* Client driven projects account for a substantial portion of Open Work Orders
* Additional analysis on Work In Progress Work Orders is being performed.
Client Service Line – Incident Ticket Analysis
Top Categories – Jan’16 - June ’16
• Event Log – 14796
• High Space – 9069
• Logins disabled on Clientapp server alerts – 8351
Action
High Space Errors – Defect Prevention Program process identified actions that have been completed.
Event log and Logins disabled alerts – Defect Prevention Program Projects in progress.
Watson Dashboards are used to analyze performance, derive insights and define action items.
Insights
Monthly ticket volume
Insights (July ’16)
Weekly metric analysis indicates sustainable improvement.
Wintel: Increase in High Space alerts detected by SPC analysis.
• DPP Investigation launched
Wintel: No RDP Port xxxx or Server Down incidents for the month of July.
Citrix: Logins-disabled alerts on xGenApp server reduced from 16431 to 7340 by eliminating false alerts.
Unix: Increase in File System and High Space alerts detected by SPC analysis.
• DPP Investigation launched
Actions derived from Watson Analytics synthesis resulted in a demonstrable improvement in Incident Ticket Volume.
[Figure: Incident Volume trend]
SLA Financial Risk Assessment Model – Patent Pending
What is it?
– SLAs are used to measure the performance of critical parts of the Client’s business.
– Clients assign penalty weights to SLAs as a means to clearly communicate their business priorities.
– The tool performs statistical analysis on attainment data, creating a financial risk profile used to prioritize Continual Improvement Actions.
How does it work?
– The SLA portfolio is analyzed, and the probability of failure is calculated and multiplied by the penalty liability.
– Annualizing the risk in this fashion results in a more realistic representation of financial impact over time.
What is the value?
– Provides a method to prioritize Risk Mitigation Activities through industry standard predictive analytics.
– Enhances effectiveness of Continual Improvement activities, particularly when used along with the DPP and RCA Processes.
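The "probability of failure multiplied by penalty liability" idea can be illustrated with a short Python sketch. The slides do not describe the statistical model inside the tool, so everything specific here is an assumption made for illustration: monthly attainment is treated as roughly normally distributed, the penalty is assumed to apply once per missed month, and the names `Sla`, `failure_probability`, `annual_risk` and `rank_by_risk` are hypothetical.

```python
from dataclasses import dataclass
from statistics import NormalDist, mean, stdev
from typing import Sequence


@dataclass
class Sla:
    name: str
    target: float                 # minimum attainment required, e.g. 95.0 (%)
    monthly_penalty: float        # penalty liability if a month misses the target
    attainment: Sequence[float]   # historical monthly attainment values


def failure_probability(sla):
    """Estimate P(attainment < target), assuming attainment is roughly normal."""
    mu = mean(sla.attainment)
    sigma = stdev(sla.attainment)
    if sigma == 0:
        # No observed variation: the SLA either always misses or never misses.
        return 1.0 if mu < sla.target else 0.0
    return NormalDist(mu, sigma).cdf(sla.target)


def annual_risk(sla, periods_per_year=12):
    """Expected yearly penalty: failure probability x penalty x periods."""
    return failure_probability(sla) * sla.monthly_penalty * periods_per_year


def rank_by_risk(slas):
    """Order the portfolio so the highest expected penalty comes first."""
    return sorted(slas, key=annual_risk, reverse=True)
```

Ranking a portfolio with `rank_by_risk` gives an ordering for reviewing Continual Improvement Plans. An SLA with a modest penalty but a high chance of failure can outrank one with a large penalty that is rarely at risk.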
QMS Additional Support: The following shows faux output from the SLA Financial Risk Assessment Tool.
*** Input is based upon artificial performance data in order to show tool capability.
SLA Risk Assessment
Annual SLA Penalty Liability due to Support Capability is $XXX.
SLA# 4, 2, 5 and 1 represent most of the calculated financial risk.
SLA# 3 and 10 represent most of the remaining calculated financial risk.
The calculations were based upon simulated penalty liability. The analysis helps to identify risk and prioritize CSI activities.
Actions
Immediate review of Continual Improvement Plans (CIPs) for SLA# 4, 2, 5 and 1.
Review of CIPs for the remaining SLAs prioritized in order of calculated risk.
Redo analysis if input parameters change.
Client - Quality Management System Status
Implemented a Quality Management System (QMS) that:
– Utilizes Analytics to drive Continual Improvement
– Focuses on Operational Governance to resolve issues in an efficient manner
– Drives identification of the true Root Cause of problems/issues
– Interlocks with the CSI Register Process
– Integrates the SIAM Team as part of the Continual Improvement Process
– Utilizes IBM’s Defect Prevention Program
Impact:
– Probability of missing critical SLAs reduced
– Delivery Service Quality improved demonstrably
– Client communicated satisfaction with the progress made
– Implementation team members recognized by the Client
Next Steps:
– Investigate Additional Continual Improvement Opportunities
– Integrate Watson into Problem Solving Process
Questions?