TRANSCRIPT
Copyright © 2010 SAS Institute Inc. All rights reserved.
Root Cause Investigations: Graphical Approaches
Byron Wingerd, Systems Engineer, JMP/SAS
Agenda
Vaccine Process Overview
Visualizing Process Changes
Decision Trees for Process Investigations
Vaccine Production Process
Bulking and Formulation
Downstream Purification
Fill/Finish Operation
Labeling, Packaging, Inspection
Media Preparation
Seed Fermentation
Upstream Fermentation
Inoculum Preparation
Major Inputs to a Biological Process
Equipment: Process Equipment, Support Equipment, Facility, Utilities
Materials: Chemicals, Gasses, Filters, Biological
Personnel: Procedures and other Documents (SOPs, etc.); Shifts, Teams and Individuals
Measurements: At-Line and On-Line sensors and assays; Off-Line measurements and assays; Materials, Personnel, Equipment and Instruments…
Process Investigation
The Despair Scenario
Process Investigation
Investigation Burnout
Special Cause Variation
Common Cause Variation?
Noisy Response
Interacting Systems
Changing Inputs
Many Possible Process Inputs
Lots of Possible Inputs: good data, bad data, horrible data
Multiple Input Changes Before Each Run: documented in multiple locations; many parts, different owners
Many Systems Interact: unintended consequences
Goal: Spend the least amount of time and effort excluding branches
When did the Problem Start?
Identify the Start of the trend as narrowly as possible
Equipment, Materials, Personnel, Measurements
Are Inputs Changing When the Trend Starts?
Markers are colored by material type.
Each row is an individual lot.
Conclusion: Materials are excluded in the first round.
[Figure: lots of Material 1 (Lot 1, Lot 2, Lot 3, Lot 4) plotted against Date of Run, with an Event Marker; categories shown: Equipment, Materials, Personnel, Measurements]
When did the Problem Start?
Clean-cut change events are really convenient
Boundaries of the event are easy to investigate specifically
Changes in cause are correlated with the change in the process
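When the start of the trend is known to within a run or two, one way to screen the inputs is to list which of them changed lots at that boundary. The sketch below is only an illustration in Python, not part of the original JMP workflow; the table layout, the function name `lot_changes_near` and the example lots are all hypothetical.

```python
def lot_changes_near(runs, trend_start, window=1):
    """Return {input name: [run IDs where its lot changed]} for lot changes
    that fall within `window` runs of the run where the trend starts.

    `runs` is a list of dicts in run order, each holding a "run" ID plus
    one lot value per input (an assumed layout).
    """
    run_ids = [r["run"] for r in runs]
    start_idx = run_ids.index(trend_start)
    inputs = [k for k in runs[0] if k != "run"]
    hits = {}
    for name in inputs:
        changed = [
            runs[i]["run"]
            for i in range(1, len(runs))
            if runs[i][name] != runs[i - 1][name] and abs(i - start_idx) <= window
        ]
        if changed:
            hits[name] = changed
    return hits


# Hypothetical data: the trend is assumed to start at run 5.
runs = [
    {"run": 1, "Material 1": "Lot 1", "Material 2": "Lot A"},
    {"run": 2, "Material 1": "Lot 1", "Material 2": "Lot A"},
    {"run": 3, "Material 1": "Lot 2", "Material 2": "Lot A"},
    {"run": 4, "Material 1": "Lot 2", "Material 2": "Lot A"},
    {"run": 5, "Material 1": "Lot 2", "Material 2": "Lot B"},
    {"run": 6, "Material 1": "Lot 2", "Material 2": "Lot B"},
]
candidates = lot_changes_near(runs, trend_start=5, window=0)
print(candidates)  # only inputs whose lot changed exactly at the trend start
```

Inputs that do not appear in the result had no lot change at the boundary and can be set aside in the first round.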
Possible Trends in Measurement Results
What Happens When the Trends are Messy?
Change in the Question: Which X’s might be driving Y?
Case Study:
Problem:
All of my inputs are changing.
How can I visualize the changes?
Do changes in inputs affect variation?
What is most important to look at first?
Approach:
Bubble plots for visualization
Partition Platform to Analyze Data
Changes in Materials Over Time
Each color change in a row represents a change in the lot number of the material
What are the key Drivers of Variation?
[Figure: run chart of Final Yield (13 to 23) by Sample (0 to 560), with USL 20 and LSL 14]
Pain Point: Higher frequency of OOS runs
[Figure: run chart of Final Yield (13 to 23) by Sample (0 to 560), with USL 20 and LSL 14]
Data colored to highlight the extreme high and low values. Color format is the same on the next slide
Partition for Final Yield
[Figure: partition plot of Final Yield (14 to 22) across the leaves of the tree below]
RSquare 0.433, RMSE 1.0615633, N 548, Number of Splits 3, AICc 1630.75

All Rows: Count 548, Mean 17.390328, Std Dev 1.4109445, LogWorth 57.28299, Difference 1.73152
  Cassette 1 (10004324, 10004317, 10004280, 10004299): Count 263, Mean 16.48981, Std Dev 1.081297, LogWorth 15.4604526, Difference 0.71235
    Buffer Salt 1 (20007332, 20007311): Count 63, Mean 15.948095, Std Dev 1.0581052
    Buffer Salt 1 (20007325, 20007269, 20007304, 20007318): Count 200, Mean 16.66045, Std Dev 1.0336199
  Cassette 1 (10004281, 10004290, 10004308, 10004329): Count 285, Mean 18.221333, Std Dev 1.1453044, LogWorth 7.2638511, Difference 0.89493
    Buffer Salt 3 (20007320, 20007306, 20007291, 20007271, 20007341): Count 227, Mean 18.039207, Std Dev 1.1151587
    Buffer Salt 3 (20007313, 20007334): Count 58, Mean 18.934138, Std Dev 0.9776575

Column Contributions
Term             Number of Splits   SS
Cassette 1       1                  410.08776
Buffer Salt 3    1                  36.99883
Buffer Salt 1    1                  24.31126
All other terms have 0 splits and SS 0.00000: Cassette 2, Filter 4a, Filter 4b, Filter 1, Filter 2, Filter 3, Cassette 3, Disposable Assembly, Process Skid Stage 3, Cold Room, ----Excipients----, A buffer, Preservative, Stabilizer A, Stabilizer B, Detergent, Salt A, Salt B, Buffer Salt 2, Buffer Salt 4, Buffer Salt 5, Preservative A, Preservative B, Detergent Stage 2, Detergent Stage 4, Culture Media, Media Feed 1, Media Feed 2, Flask Media, WFI Source
Recursive Partition (Decision Tree)
Systematic method for looking at relatively large data sets
Current structure of the partition and its effect on the responses
Higher Y values shifted to the right
X values are arranged randomly within each category
Decision Trees
Also known as Recursive Partitioning, CHAID, CART
Models are a series of nested IF() statements, where each condition in the IF() statement can be viewed as a separate branch in a tree.
Commonly used for credit scoring, fraud detection, marketing promotion target generation, …
Also used to help discover the “hot” X’s in historical data
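As an illustration of the "nested IF()" view, the three-split partition for Final Yield reported earlier in this deck can be written out as code. This is a sketch in Python rather than JMP formula syntax; the lot numbers and leaf means are taken from that report, while the function and constant names are made up here. It also assumes that any lot not in the first group belongs to the second group of the same split.

```python
# Lot groupings from the Partition for Final Yield report.
CASSETTE_1_LOW = {10004324, 10004317, 10004280, 10004299}
BUFFER_SALT_1_LOW = {20007332, 20007311}
BUFFER_SALT_3_MID = {20007320, 20007306, 20007291, 20007271, 20007341}


def predicted_final_yield(cassette_1_lot, buffer_salt_1_lot, buffer_salt_3_lot):
    """Return the leaf mean of Final Yield for one run (hypothetical helper)."""
    if cassette_1_lot in CASSETTE_1_LOW:
        # Branch 1: split again on Buffer Salt 1
        if buffer_salt_1_lot in BUFFER_SALT_1_LOW:
            return 15.948095
        else:
            return 16.66045
    else:
        # Branch 2: split again on Buffer Salt 3
        if buffer_salt_3_lot in BUFFER_SALT_3_MID:
            return 18.039207
        else:
            return 18.934138


print(predicted_final_yield(10004324, 20007332, 20007320))
```

Each `if` corresponds to one branch point in the tree, and each `return` to one leaf.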
Under the hood
To find the next branch in the tree:
For every branch that could be split
For every X
» Search through each unique value of X
» Split the branch into two groups, e.g. X1 < X1_split vs. X1 >= X1_split
» Record the difference in the response average between the groups
» Calculate the logworth = -log10(p-value)
Select the split that maximizes the logworth (that is, minimizes the p-value) and add a branch based on that split
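The search described above can be sketched in a few lines of Python. This is not JMP's implementation: the p-value here comes from a large-sample normal approximation to a two-group comparison of means, which is an assumption made only to keep the example free of external libraries. The names `logworth`, `best_split`, and the example columns `age`, `temp` and `yield` are hypothetical.

```python
import math
from statistics import mean, variance


def logworth(left, right):
    """-log10(p-value) for the difference in means of two groups.

    Assumption: two-sided p-value from a normal (z) approximation,
    not the calculation JMP performs.
    """
    se = math.sqrt(variance(left) / len(left) + variance(right) / len(right))
    if se == 0:
        return math.inf if mean(left) != mean(right) else 0.0
    z = abs(mean(left) - mean(right)) / se
    p = math.erfc(z / math.sqrt(2))
    return math.inf if p == 0 else -math.log10(p)


def best_split(rows, y, xs, min_size=2):
    """Try every unique value of every X as a cut point; keep the best one."""
    min_size = max(2, min_size)  # variance needs at least two points per group
    best = None
    for x in xs:
        for cut in sorted({r[x] for r in rows}):
            left = [r[y] for r in rows if r[x] < cut]
            right = [r[y] for r in rows if r[x] >= cut]
            if len(left) < min_size or len(right) < min_size:
                continue
            lw = logworth(left, right)
            if best is None or lw > best["logworth"]:
                best = {"x": x, "cut": cut, "logworth": lw,
                        "difference": mean(right) - mean(left)}
    return best


# Hypothetical data for illustration only.
rows = [
    {"age": 100, "temp": 30, "yield": 19.0},
    {"age": 120, "temp": 31, "yield": 18.6},
    {"age": 150, "temp": 30, "yield": 18.9},
    {"age": 300, "temp": 31, "yield": 16.1},
    {"age": 320, "temp": 30, "yield": 15.8},
    {"age": 350, "temp": 31, "yield": 16.3},
]
best = best_split(rows, y="yield", xs=["age", "temp"])
print(best["x"], best["cut"])  # the X and cut point with the largest logworth
```

The cut points here are numeric thresholds; for categorical inputs such as lot numbers, the candidate splits would be groupings of levels instead.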
Under the hood
Keep building the tree until a stopping criterion is met:
The minimum size (number of data points) in a branch is reached
Other criteria can also be imposed
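A minimal recursive sketch of this stopping logic, in Python, might look like the following. It is an illustration under stated assumptions, not JMP's algorithm: the split score is a normal-approximation stand-in for the logworth, and `grow`, `score`, `min_size` and `min_logworth` are hypothetical names, with `min_logworth` standing in for the "other criteria".

```python
import math
from statistics import mean, variance


def score(left, right):
    # Assumed stand-in for the logworth: -log10 of a normal-approximation p-value.
    se = math.sqrt(variance(left) / len(left) + variance(right) / len(right))
    if se == 0:
        return 0.0 if mean(left) == mean(right) else math.inf
    p = math.erfc(abs(mean(left) - mean(right)) / se / math.sqrt(2))
    return math.inf if p == 0 else -math.log10(p)


def grow(rows, y, xs, min_size=5, min_logworth=2.0):
    """Recursively split `rows` until no allowed split remains."""
    node = {"n": len(rows), "mean": mean(r[y] for r in rows)}
    best = None
    for x in xs:
        for cut in sorted({r[x] for r in rows}):
            left = [r for r in rows if r[x] < cut]
            right = [r for r in rows if r[x] >= cut]
            if min(len(left), len(right)) < max(2, min_size):
                continue  # stopping rule: a branch would be too small
            s = score([r[y] for r in left], [r[y] for r in right])
            if best is None or s > best[0]:
                best = (s, x, cut, left, right)
    if best is None or best[0] < min_logworth:
        return node  # leaf: nothing left worth splitting
    s, x, cut, left, right = best
    node.update(split=f"{x} < {cut}", logworth=s,
                left=grow(left, y, xs, min_size, min_logworth),
                right=grow(right, y, xs, min_size, min_logworth))
    return node


# Hypothetical data: six runs with young cassettes, six with old ones.
ages = [100, 110, 120, 130, 140, 150, 200, 210, 220, 230, 240, 250]
yields = [19.1, 18.8, 19.0, 18.7, 19.2, 18.9, 16.2, 15.9, 16.1, 16.4, 15.8, 16.0]
rows = [{"age": a, "yield": v} for a, v in zip(ages, yields)]
tree = grow(rows, y="yield", xs=["age"], min_size=4)
print(tree["split"], tree["left"]["n"], tree["right"]["n"])
```

With a minimum branch size of 4, each six-run child cannot be split again, so the tree stops after one split.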
Partition Conclusions
Identified Key Materials
Investigation Direction
Can you investigate everything?
Recursive partitioning helped to narrow down the potential list of candidates to investigate in depth.
If your X’s don’t explain your Y’s, you’re measuring the wrong thing.
Bivariate Fit of Final Yield By Age of Cassette 1 at use (d)
[Figure: scatterplot of Final Yield (14 to 22) vs. Age of Cassette 1 at use (d) (100 to 350), with Linear Fit and Fit Mean lines]
Linear Fit: Final Yield = 20.217945 - 0.0143331*Age of Cassette 1 at use (d)
Summary of Fit
RSquare 0.467613
RSquare Adj 0.466637
Root Mean Square Error 1.030436
Mean of Response 17.39033
Observations (or Sum Wgts) 548
Follow on Investigation
Deeper investigation reveals explanation
Started with available data
Added “hard to get” data
Implemented changes
Variation dropped, and costly unusual events did too.
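To get a feel for the size of the cassette-age effect, the fitted line from the bivariate fit can be evaluated directly. The Python sketch below uses the reported intercept and slope and the LSL of 14 from the Final Yield charts; the function names are hypothetical. With an RSquare near 0.47, individual runs scatter widely around the line, and the fitted data only span roughly 100 to 350 days, so the age at which the line crosses the LSL is an extrapolation.

```python
INTERCEPT = 20.217945   # from the Linear Fit report
SLOPE = -0.0143331      # yield change per day of cassette age
LSL = 14.0              # lower spec limit shown on the Final Yield charts


def predicted_yield(age_days):
    """Fitted Final Yield for a cassette of the given age at use (days)."""
    return INTERCEPT + SLOPE * age_days


def age_at_yield(target):
    """Cassette age at which the fitted line reaches `target` yield."""
    return (target - INTERCEPT) / SLOPE


for age in (100, 350):
    print(age, round(predicted_yield(age), 2))
print(round(age_at_yield(LSL)))  # extrapolated beyond the observed ages
```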
Application to Future Investigations
[Figure: lot-change plot with one row per Material (10003 adenosine through 10118 DL valine) against Run Number (sequential run ID numbers 3273 to 3472), with the Start of Trend marked]
List of materials: each material is listed on a separate row (intentionally left of graph).
The Usual Suspect: the lot change in the Usual Suspect is not aligned with the start of the trend.
Conclusion: Move on to other potential causes.
Case Study: Fixed Production Schedule
Pound the wall until the problem goes away.
[Figure: control chart of Process Measurement (40 to 110) by Run (3404 to 3534); Avg=56.81, LCL=41.44, UCL=72.17; running schedule of Train 1, Train 2, Train 3, … Train N; annotations: Trend Begins, Problem Fixed]
Exclude Categories Quickly
Data was readily available
Define, Measure, Analyze and Hope
Partition for Process Measurement
[Figure: partition plot of Process Measurement (50 to 100) by lot of 10015 PM DL aspartic acid]
RSquare 0.448, N 143, Number of Splits 1

All Rows: Count 143, Mean 56.805594, Std Dev 8.7159703, LogWorth 26.803198, Difference 20.9618
  10015 PM DL aspartic acid (105577, 107597): Count 131, Mean 55.046565, Std Dev 6.5827782
  10015 PM DL aspartic acid (106636): Count 12, Mean 76.008333, Std Dev 5.4264434
Group Measurements by Suspect Material
Control chart phased by suspect material lot numbers
Corrective action was implemented quickly
Minimized impact of failure mode.
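Phasing a control chart means computing separate limits for each group of runs, here each lot of the suspect material. The slides do not say which chart type was used, so the Python sketch below assumes an individuals chart with limits at the mean plus or minus 2.66 times the average moving range; the measurements are invented, and only the lot numbers come from the chart.

```python
from statistics import mean


def individuals_limits(values):
    """Return (LCL, center line, UCL) for an individuals chart.

    Assumption: limits are mean +/- 2.66 * average moving range.
    """
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = mean(values)
    half_width = 2.66 * mean(moving_ranges)
    return center - half_width, center, center + half_width


def phased_limits(rows, phase_key, value_key):
    """Compute limits separately for each phase, in first-seen order."""
    phases = {}
    for r in rows:
        phases.setdefault(r[phase_key], []).append(r[value_key])
    # A phase needs at least two points to have a moving range.
    return {p: individuals_limits(v) for p, v in phases.items() if len(v) >= 2}


# Hypothetical measurements; lot numbers as on the phased chart.
rows = [
    {"lot": 105577, "y": 54}, {"lot": 105577, "y": 56}, {"lot": 105577, "y": 55},
    {"lot": 106636, "y": 75}, {"lot": 106636, "y": 77}, {"lot": 106636, "y": 76},
    {"lot": 107597, "y": 55}, {"lot": 107597, "y": 57},
]
limits = phased_limits(rows, "lot", "y")
for lot, (lcl, center, ucl) in limits.items():
    print(lot, round(lcl, 2), center, round(ucl, 2))
```

A lot whose center line sits well outside the limits of its neighbors stands out immediately, which is the point of phasing by the suspect material.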
[Figure: control chart of Process Measurement (30 to 110) by Run (3404 to 3534), phased by Sublot No (105577, 106636, 107597)]
Conclusions
While it’s a lot of work to keep manufacturing databases up to date, the payoff in an emergency is worth it.
Depth of data collection can be shallow as long as:
Resources are available to trawl through non-electronic records
Intermediate data is sufficient for an indirect diagnosis
Recursive partitioning (decision trees) can quickly yield actionable results in root cause investigations.
Best when the relationship between X’s and Y’s is unknown
Good where there are many X’s to wade through
Sparse data could cause problems; other tools like Random Forest (Bootstrap Forest in JMP) may be necessary.
Methods
Byron Wingerd, Systems Engineer, JMP/SAS
Plots for Showing Sequential Changes in Categorical Variables
Raw material lots change frequently in many of the processes we investigate
Need a standard graphic:
Quickly compare lot turnover between materials
Show change events relative to raw material lot changes
Drill down from all materials to specific materials
In JMP, the Bubble Plot graph type can be easily formatted to generate simple and informative plots
Data Structure
Single Material (Flat Table)
One row for each process run
Individual columns for each raw material
» The lot number for each material is recorded for each run
» Graphics are intolerant of missing data; empty cells will be blank.
Multiple Materials (Stacked Table)
One column for Run ID
One column for Material type
One column for Lot Numbers
» These graphics are intolerant of missing data; empty cells will be blank.
Plot 1: For One Material
Bubble Plot Dialog: Graph/Bubble Plot
Plot 2: For Multiple Materials
Need to Stack Materials. (Tables/Stack)
The lot numbers of the materials are in separate columns, so the first step is to stack the Lot Number columns.
Name for the column that will contain the lot numbers
Name for the column that will contain the material names
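For readers working outside JMP, the reshaping that Tables/Stack performs can be sketched in Python. This is only an illustration of going from the flat table to the stacked table, not a description of how JMP does it; the function name and the default column names are assumptions, and skipping empty cells is a choice made here because the graphs are intolerant of missing data.

```python
def stack(flat_rows, id_col, material_cols,
          label_col="Material", data_col="Lot Number"):
    """Turn one row per run (one column per material) into one row per
    run and material, with the lot number in a single column."""
    stacked = []
    for row in flat_rows:
        for material in material_cols:
            lot = row.get(material)
            if lot in (None, ""):
                continue  # choice made here: drop empty cells
            stacked.append({id_col: row[id_col],
                            label_col: material,
                            data_col: lot})
    return stacked


# Flat table: lot names as on the example plot, run assignments hypothetical.
flat = [
    {"Run ID": 1, "FBS": "ASH30335", "Trypsin": "L0665829"},
    {"Run ID": 2, "FBS": "ASH30335", "Trypsin": ""},
]
stacked = stack(flat, id_col="Run ID", material_cols=["FBS", "Trypsin"])
for r in stacked:
    print(r)
```

The two names passed as `label_col` and `data_col` play the same role as the two column names requested in the Stack dialog.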
Plot 2: For Multiple Materials
Optional Step: Clean up Material or Lot names
Clean up names using the Recode tool in the Cols menu.
Concatenate the material type and lot number. This column is used for the graph label.
Note: The concatenate operator is a pair of vertical bars (||), typed as Shift-Backslash (the key above the Enter key). Concatenate only works on character columns, or on numbers that are forced to be characters using Char( :colname ).
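The same cleanup-then-concatenate step can be sketched in Python. This is an illustration, not the JMP formula: `make_label` is a hypothetical helper, and trimming whitespace and upper-casing the lot is an assumed cleanup rule that would need checking against the real lot naming scheme. Lot names that differ only by letter case, such as "aug34505" and "AUG34505", would otherwise plot as two separate rows.

```python
def make_label(material, lot):
    """Build the graph label from material type and lot number.

    str() plays the role of Char(): numeric lots are forced to text
    before joining. Upper-casing the lot is an assumed cleanup rule.
    """
    return f"{str(material).strip()} {str(lot).strip().upper()}"


print(make_label("PBS", "aug34505"))
print(make_label("PBS", "AUG34505"))
print(make_label("Cassette 1", 10004324))
```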
Plot 2: For Multiple Materials
Bubble Plot Dialog: Graph/Bubble Plot
Plot 2: For Multiple Materials
Format Axis: double-click the Y axis to edit the scale or to add reference lines.
[Figure: bubble plot of Material and Lot Number (FBS, HBME, L-Glut, PBS, PGS and Trypsin lots) by Test Date (05/01/2010 to 10/02/2010), with a reference line marking a Potential Change Event]
Plot 3: More Color for Multiple Materials
In this plot each material is on one row and the color of the row changes with each lot change
Setting up the Plot
Data Structure
Run or Date column
Material ID column
Lot ID column
Making the Graph
Graph/Bubble Plot
X: Run Number (or date)
Y: Material
Coloring: Lot Number/ID
Details
Use the Red Triangle Menu to change the shape to a Square
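The logic behind this plot (one row per material, a new color each time the lot changes) can be shown without any graphics. The Python sketch below turns a stacked table into a grid of color indexes and prints it as text; it illustrates the idea rather than replacing the JMP Bubble Plot. The function names and run assignments are hypothetical, and the lot numbers are borrowed from the partition example.

```python
def lot_change_grid(stacked, run_key="Run", material_key="Material", lot_key="Lot"):
    """Return (runs, {material: [color index per run]}).

    The color index increases by one each time a material's lot changes.
    Runs with no record for a material get None (left blank).
    """
    runs = sorted({r[run_key] for r in stacked})
    grid = {}
    for material in sorted({r[material_key] for r in stacked}):
        lots = {r[run_key]: r[lot_key] for r in stacked
                if r[material_key] == material}
        row, index, previous = [], -1, None
        for run in runs:
            lot = lots.get(run)
            if lot is None:
                row.append(None)
                continue
            if lot != previous:
                index += 1
                previous = lot
            row.append(index)
        grid[material] = row
    return runs, grid


def render(grid, palette="ABCDEFGH"):
    """Draw each material as a row of letters; a letter change is a lot change."""
    width = max(len(m) for m in grid)
    return "\n".join(
        m.ljust(width) + " " + "".join(
            "." if c is None else palette[c % len(palette)] for c in cells)
        for m, cells in grid.items())


# Hypothetical run assignments for illustration.
stacked = [
    {"Run": 1, "Material": "Cassette 1", "Lot": 10004280},
    {"Run": 2, "Material": "Cassette 1", "Lot": 10004280},
    {"Run": 3, "Material": "Cassette 1", "Lot": 10004299},
    {"Run": 4, "Material": "Cassette 1", "Lot": 10004299},
    {"Run": 1, "Material": "Buffer Salt 1", "Lot": 20007269},
    {"Run": 2, "Material": "Buffer Salt 1", "Lot": 20007269},
    {"Run": 4, "Material": "Buffer Salt 1", "Lot": 20007304},
]
runs, grid = lot_change_grid(stacked)
print(render(grid))
```

Reading down a column of the output shows which materials changed lots at the same run.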
Scripting The Bubble Plot
The JMP Scripting Language (JSL) can be used to generate graphs automatically.
JMP writes the script for you.
From the Red Triangle Menu, select Script, then save the script to the script window.
Add a couple of edits, like an Open statement and a send (<<), and add your column names to your captured script.
    dt = Open( "c:\filepath\filename.jmp" );
    dt << Bubble Plot(
        X( :Run Number ),
        Y( :Material ),
        Coloring( :Lot ID ),
        Bubble Size( 10 ), // Controls initial marker size
        Legend( 0 ), // Turns off the legend
        Set Shape( Square ) // Sets the marker shape to squares
    );
Paste this script into a new script window. Add your file name and your column names.
Running a Recursive Partition
From the Analyze menu, select Modeling, Partition
Add response and factors to the dialog and click OK
Running a Recursive Partition
Use the Split and Prune buttons to find the best splits.
The Red Triangle menu contains an option to view the column contributions.
For automatic splitting, choose the k-fold cross validation option, or exclude rows to use as a validation subset.
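As a reminder of what k-fold cross validation does with the rows, the Python sketch below assigns row indexes to k folds and yields each training and held-out pair. It is a generic illustration; how JMP assigns folds internally is not described in these slides, and the function name and seed are hypothetical.

```python
import random


def k_fold_indices(n_rows, k=5, seed=0):
    """Yield (training indexes, held-out indexes) for each of k folds.

    Every row is held out exactly once; the seed makes the shuffle repeatable.
    """
    order = list(range(n_rows))
    random.Random(seed).shuffle(order)
    folds = [order[i::k] for i in range(k)]
    for i, held_out in enumerate(folds):
        training = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield sorted(training), sorted(held_out)


splits = list(k_fold_indices(10, k=5))
for training, held_out in splits:
    print(len(training), held_out)
```

The tree is grown on each training set and checked against the matching held-out rows, so splits that only fit noise are less likely to survive.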
Process of Statistical Discovery
Data Access
Data Management: Big Time Savings, But Not Flashy
Analysis/Graphics: The “Ahas” Occur Here
Reporting: Interactive Flash Output
Data → Information → Knowledge → Understanding
So decision makers can take Action!
Process of Statistical Discovery: Getting Data into JMP is Easy
JMP, Excel, Text, SAS & other data formats
SAS Data Server
Database
Internet/html
Process of Statistical Discovery: Shaping the Data for Analysis - Big Time Savings
Tables Menu (spend time here today)
Cols Menu: Column Info…, Column Properties, Formula…
Rows Menu
Process of Statistical Discovery: Many Analyses & Graphs – Range of Stat. Expertise
Exploratory Data Analysis, Statistics, Modeling
Design of Experiments
Interactive Data Mining
Visual Six Sigma, Quality, Reliability
Business Visualization
Profiler, Simulator, Data Filter
Process of Statistical Discovery: Wide Range of Outputs Available
Graphs & Tables in: Data Tables, Reports, Journals, Projects
‘Paste Special’ into MS Word, PPT, Excel
Flash Objects: Profiler, Distribution, Bubble Plots
Print to PDF
All Data is Contextual…
Only people understand ‘context’, ‘relevance’ and ‘utility’.
Making new discoveries is not ‘algorithmic’, and never can be.
JMP allows informed users to explore data in flexible ways to make new useful discoveries.
This happens “in the same head”, with no division of labor to confuse things.
JMP, in continual development for more than twenty years, is designed and architected to support this process of ‘Statistical Discovery’.