Fifty Years (and More) of
Designed Experimentation in DuPont
Steve Bailey
Principal Consultant and MBB
Minitab Insights Conference
Tuesday, September 13, 2016
• DuPont's 50+ year Strategy of Experimentation
(SOE) deployment experience will be reviewed.
• Two applications of "customized" (computer
generated) designs will be presented, including
effective ways to visualize results.
– A 5-factor response surface experiment with both
continuous and categorical factors at 3 levels.
– An 8-component constrained mixture response
surface experiment.
• The two examples will show how situations
"beyond the cataloged designs" were handled
by computer generated designs and analyses.
Outline of Talk
• Understand the evolution of DuPont's SOE
methodology, the training, and the
supporting software over the decades.
• Design and analyze custom (computer
generated) designs involving both multi-level
discrete and continuous factors.
• Design and analyze custom (computer
generated) constrained mixture response
surface experiments.
Learning Objectives
Year | DOE Theory | Area | Software | Course | Audience
1920s | Agriculture split plot experiments (Fisher & Yates) | | | |
1930s | | | | |
1940s | Plackett-Burman designs | | | |
1950s | Response surface methods (Box et al.) | R&D | Hand calculations; Univac | |
1960s | Mixture designs (Scheffé) | R&D, MFG | | SOE (1964) | Internal offering
1970s | Design optimality and computer-aided designs; Conjoint analysis | | Univac programs developed internally | SOE; SOFD added | Internal & External offering (Began selling course externally in 1974)
1980s | Robust parameter design | R&D, MFG, Agriculture | RS/Discover (VAX); ECHIP (PC); Minitab® (VAX) | Last major content update |
1990s | Industrial split plot designs | R&D, MFG, Agriculture, Tech. Sales/Marketing | Minitab® (PC), JMP® | Software updates | Internal & DuPont Customers
2000s | Computer experiments | R&D, MFG, Agriculture | Minitab®, JMP®, SAS® | SOE & SOEFD course |
Strategy of Experimentation (SOE) History
Milestones In Designed Experiments
From “Design Of Experiments Makes A Comeback”, Chemical & Engineering News, April 1, 2013 Issue - Vol. 91 Issue 13
• Ronald A. Fisher (1890 – 1962)
• George E. P. Box (1919 – 2013)
• “In the 1970s, DuPont’s Quality Management & Technology Center trains DuPont employees on DOE and offers the training to other companies. The company continues the service into the 1990s.”
• FDA Quality by Design (2011)
Strategy of Experimentation (SOE) History
• Late 1950’s – Computing (mainframe) arrives at DuPont
• Early 1960’s – First Response Surface DOEs done in plants and labs
• 1964 – First SOE course (concurrent with ASG formation)
• Late 1960’s – Internal software developed for DOE design and analysis
• Mid 1970’s – Revised SOE text and created external business
• Late 1970’s – Strategy of Formulations Development (SFD) course
• Late 1980’s – Last SOE text revision
• Late 1980’s – Experimentation for Robust Product Design (ERPD)
• Late 1980’s – Software integrated into SOE and SFD course
Strategy of Experimentation (SOE)
(from Foreword of 1975 SOE Text)
• Every experimental program embodies an experimental strategy that may either be good or bad.
• The strategy selection can catalyze technical progress or cause stagnation.
• “Strategy of Experimentation” teaches the information needed to apply modern experimental designs effectively.
• The course encompasses the philosophic and practical elements of experimental programs as well as the methodology of statistical experimental design.
• The material represents, both by inclusion and exclusion, a distillation of what is most important for the working scientist (and their supervision) both to understand and be able to do.
Full Factorials as Building Blocks for Screening and Response Surface Experiments
[Figure: three cube plots on axes X1, X2, X3, labeled Full Factorial Experiments, Response Surface Experiments, and Screening Experiments]
Over 40,000 students internally and
externally trained in DuPont’s
Strategy of Experimentation (SOE)!
Evolution of the Experimental Environment
Cataloged designs (fractions of full 2-level factorial designs)
• Fractional factorial designs (n is a power of 2)
• Plackett-Burman designs (n is a multiple of 4)
The success of these designs (Lucas):
• They have orthogonal (balanced) structure
• They get run!
What is the effect of “ignored” second-order effects?
• Two-factor interactions – cross-product terms
• Curvature – quadratic (squared) terms
Other “catalogued” options
• Taguchi Designs (“orthogonal arrays”)
• Definitive Screening Designs (New Kid on The Block)
A Brief Word on Screening Designs
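To make the "orthogonal (balanced) structure" of a fractional factorial concrete, here is a minimal sketch that builds a 4-run half-fraction of a 3-factor, 2-level design and checks its orthogonality. The generator X3 = X1*X2 and the function names are illustrative assumptions, not taken from the SOE course material.

```python
from itertools import product

def half_fraction_2_3():
    """Return a 4-run 2^(3-1) design: a 2^2 full factorial with X3 = X1*X2 (assumed generator)."""
    return [(x1, x2, x1 * x2) for x1, x2 in product((-1, 1), repeat=2)]

def is_orthogonal(design):
    """True if every column sums to 0 and every pair of columns has a zero dot product."""
    cols = list(zip(*design))
    balanced = all(sum(c) == 0 for c in cols)
    pairwise = all(
        sum(a * b for a, b in zip(cols[i], cols[j])) == 0
        for i in range(len(cols))
        for j in range(i + 1, len(cols))
    )
    return balanced and pairwise

design = half_fraction_2_3()
print(design)                 # n = 4 runs, a power of 2
print(is_orthogonal(design))  # orthogonality check for the three main-effect columns
```

In this fraction the X3 column is identical to the X1*X2 cross-product, so an "ignored" X1X2 interaction would be confounded with the X3 main effect.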
Pure Error | Lack of Fit: Not Available | Lack of Fit: Available
Not Available | df: 2, 0, 0 | df: 2, 0, 1
Available | df: 2, 2, 0 | df: 2, 3, 1
Key: df: p, r, l (Fit, Pure Error, Lack of Fit)
n = p+l+r total runs in the DOE
p = number of parameters in model (e.g., p=2 for straight line)
l = number of extra treatment combinations for lack of fit
r = number of replicate runs added to the design
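The df bookkeeping above can be sketched in a few lines. The example designs are hypothetical single-factor settings chosen to reproduce the four (p, r, l) combinations for a straight-line model; they are not shown on the chart.

```python
def df_breakdown(design_points, p):
    """Split n runs into (p, r, l): model, pure-error, and lack-of-fit degrees of freedom."""
    n = len(design_points)
    distinct = len(set(design_points))
    if distinct < p:
        raise ValueError("need at least p distinct treatment combinations to fit the model")
    r = n - distinct  # replicate runs give pure-error df
    l = distinct - p  # extra distinct combinations give lack-of-fit df
    return p, r, l

# Hypothetical straight-line (p = 2) designs, one per cell of the grid
print(df_breakdown([-1, 1], p=2))               # neither available
print(df_breakdown([-1, 0, 1], p=2))            # lack of fit only
print(df_breakdown([-1, -1, 1, 1], p=2))        # pure error only
print(df_breakdown([-1, -1, 0, 0, 1, 1], p=2))  # both available
```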
QUADRATIC POLYNOMIAL MODELS
Y = b_0 + b_1 X_1 + b_{11} X_1^2
Y = b_0 + b_1 X_1 + b_2 X_2 + b_{12} X_1 X_2 + b_{11} X_1^2 + b_{22} X_2^2
Y = b_0 + b_1 X_1 + b_2 X_2 + b_3 X_3
+ b_{12} X_1 X_2 + b_{13} X_1 X_3 + b_{23} X_2 X_3
+ b_{11} X_1^2 + b_{22} X_2^2 + b_{33} X_3^2
What if Xs are categorical?
What if Xs are mixture components?
[Figure: cube plot on axes X1, X2, X3 showing Block 1 (First Half-Fraction), Block 2 (Second Half-Fraction), Block 3 (Face Points), and Center Points]
Face-Centered Cube Designs for 3 Factors
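As a rough sketch of how such a design can be laid out, the code below generates the cube corners, splits them into two half-fractions, and adds the six face points. Two things are assumptions here, since the chart does not state them: the corners are split by the sign of X1*X2*X3, and each block gets one center point.

```python
from itertools import product

def face_centered_cube(center_points_per_block=1):
    """Return {block name: list of (x1, x2, x3)} for a 3-factor face-centered cube design."""
    corners = list(product((-1, 1), repeat=3))
    # Assumed blocking rule: split the 2^3 corners by the sign of x1*x2*x3
    block1 = [pt for pt in corners if pt[0] * pt[1] * pt[2] == 1]
    block2 = [pt for pt in corners if pt[0] * pt[1] * pt[2] == -1]
    # Face points: one factor at -1 or +1, the other two at 0
    block3 = []
    for axis in range(3):
        for level in (-1, 1):
            pt = [0, 0, 0]
            pt[axis] = level
            block3.append(tuple(pt))
    center = [(0, 0, 0)] * center_points_per_block  # assumed count per block
    return {
        "Block 1": block1 + center,
        "Block 2": block2 + center,
        "Block 3": block3 + center,
    }

design = face_centered_cube()
for name, runs in design.items():
    print(name, len(runs), runs)
```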
Going beyond two-level and response surface designs
– Categorical factors with three or more levels
– Both continuous and categorical factors
– Constrained experimental regions (eg, mixtures)
– Both mixture and (continuous and/or categorical) process factors
– Hard-to-change factors (restricted randomization)
– Blocking constraints
Customized “Optimal” Design Applications
Customized “Optimal” Designs
General Approach
• Identify factors and constraints, including
– Type of factors – continuous, categorical, mixture
– Factor levels – ranges, allowable levels, and constraints
– Randomization (easy/hard to change Xs) and blocking constraints
• Identify model to be fit
– Main effects, interactions, curvature
– Determines p = number of model parameters
• Select desired number of runs
– Bob Wheeler’s rule of thumb: l=5 and r=5
– So n=p+5+5 total runs
• Generate “optimal” design (see next chart)
• Review design diagnostics (including VIFs) before running
• D-Optimal Designs
– Minimize the variances of the model coefficient
estimates
– Tend to allocate runs to the extremes of the region
• e.g., models with no curvature get no center points
– Sensitive to departures from assumed model
• I-Optimal Designs
– Minimize the integrated prediction variance over the
experimental region
– Tend to allocate more runs to the interior of the region
– Less sensitive to departures from assumed model
• Will use D-Optimal approach in these examples
Design Optimality Criteria
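The D criterion is usually expressed as maximizing det(X'X), which corresponds to minimizing the generalized variance of the coefficient estimates. The sketch below compares two hypothetical 4-run designs for a straight-line model to illustrate why a model with no curvature gets no center points; the designs and the helper name are illustrative only.

```python
def d_criterion_line(xs):
    """det(X'X) for the straight-line model Y = b0 + b1*X fitted at settings xs."""
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    # X'X = [[n, sx], [sx, sxx]], so its determinant is n*sxx - sx^2
    return n * sxx - sx * sx

extremes = [-1, -1, 1, 1]    # all runs at the ends of the range
with_center = [-1, 0, 0, 1]  # two runs moved to the center

print(d_criterion_line(extremes))     # larger determinant: preferred under the D criterion
print(d_criterion_line(with_center))  # smaller determinant
```

The center-point design scores lower on the D criterion, but only it has an extra distinct level that can reveal curvature, which is the sensitivity to the assumed model noted above.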
Minitab Experiment Contest Entry
Response Surface DOE with Continuous and Categorical Factors:
Optimizing Use of a Thermal Stabilizer in a Plastic Composite
Steven P. Bailey
Principal Consultant and Master Black Belt
DuPont Engineering Research and Technology
Avelino F. Lima
Principal Investigator
DuPont Performance Polymers
February 14, 2012
*** Non-disclosure: All industrial experiments, results and scenarios are based on
the authors’ actual experience. Data, units, variable names, etc have been changed for
demonstration purposes only to protect company and process propriety.
Goals of the Experiment
– Show that a new stabilizer type is just as effective as the current
types.
– Find the lowest amount of stabilizer that results in functional
equivalence.
– Show that a new vendor is just as good as the current vendors.
Importance of the Experiment
– Functional equivalence of stabilizer type is critical to the customer.
– Minimizing the amount of stabilizer is desirable.
– Finding that the new vendor and stabilizer are just as good as the
current ones improves robustness of the supply chain.
Goal and Importance of the Experiment
Process Description:
DOE with Continuous and Categorical Factors
Focus
This experiment focused on
expanding the levels of the
categorical variables (V and S) that
are available for use. We also want
to minimize P.
D-optimal design with continuous and categorical factors.
Factor ID (and Description) Factor Type (and Coded Levels)
G (amount of reinforcement) Continuous (-1, 0, 1)
N (amount of colorant) Continuous (-1, 0, 1)
P (amount of stabilizer) Continuous (-1, 0, 1)
V (vendor) Categorical (1, 2, 3)
S (type of stabilizer) Categorical (1, 2, 3)
Need for D-optimal design to fit second-order model
• A full factorial for all 3 levels of all 5 factors would be:
• 3^5 = 243 possible treatment combinations (TCs)
• 243 is way too many! We need to select the best subset to use.
Experiment Description – The Factors (Xs)
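The 243-point candidate set that a D-optimal algorithm would select from can be enumerated directly. This is a minimal sketch using the coded levels from the factor table; the dictionary layout is an illustrative choice.

```python
from itertools import product

continuous_levels = (-1, 0, 1)   # coded levels for G, N, P
categorical_levels = (1, 2, 3)   # level labels for V, S

# Every combination of 3 levels for each of the 5 factors
candidates = [
    {"G": g, "N": n, "P": p, "V": v, "S": s}
    for g, n, p, v, s in product(
        continuous_levels, continuous_levels, continuous_levels,
        categorical_levels, categorical_levels,
    )
]

print(len(candidates))  # 3**5 treatment combinations
print(candidates[0])
```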
• There were 19 responses used to measure
quality of the energy balancing.
– 3 of these, YA, YB, and YC were tested at 3
different aging times, 1, 2, and 3. These
represent 9 responses.
– There are 10 additional responses, YD thru YM.
• We want to determine whether the new
stabilizer type and the new vendor result in
acceptable quality levels for the 19
responses.
Experiment Results – The Responses (Ys)
When a categorical factor with k levels is added to a model
it must be coded (“behind the scenes”) using k-1 indicator
variables. Two options below are available, illustrated using
the 3-level factor “Vendor” in our example.
(0,1) Coding
A reference level is chosen (Vendor 1
in this case). That is set to 0 for each
X. The two columns that are formed
will compare the other vendors to
Vendor 1.
Vendor 2 Vendor 3
Vendor 1 0 0
Vendor 2 1 0
Vendor 3 0 1
(-1,0,1) Coding
The last level is left out and set to -1.
This will compare each vendor to the
overall mean. Unless you have a
specific group you want to compare all
of the others to, this is usually the
better approach.
Vendor 1 Vendor 2
Vendor 1 1 0
Vendor 2 0 1
Vendor 3 -1 -1
Coding of Categorical X’s
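The two coding schemes can be sketched as small helper functions that reproduce the two Vendor tables. The function names are hypothetical; statistical software does this "behind the scenes" and its internals may differ.

```python
def indicator_coding(levels, reference):
    """(0,1) coding: one column per non-reference level; the reference level is all zeros."""
    others = [lv for lv in levels if lv != reference]
    return {lv: [1 if lv == o else 0 for o in others] for lv in levels}

def effect_coding(levels):
    """(-1,0,1) coding: one column per level except the last; the last level is all -1."""
    kept = levels[:-1]
    return {
        lv: [-1] * len(kept) if lv == levels[-1] else [1 if lv == k else 0 for k in kept]
        for lv in levels
    }

vendors = ["Vendor 1", "Vendor 2", "Vendor 3"]
print(indicator_coding(vendors, reference="Vendor 1"))
print(effect_coding(vendors))
```

Either way, a factor with k = 3 levels uses k-1 = 2 columns.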
Second-Order Model Terms
# Terms Description
1 Overall Mean
3 Linear effects for continuous factors (G,N,P)
4 Main effects for 3-level categorical factors (V,S)
3 Quadratic effects for continuous factors (G*G, N*N, P*P)
3 G*N, G*P, N*P interactions
4 V*S interaction
12 G*V, G*S, N*V, N*S, P*V, P*S interactions
30 Total number of model parameters (p)
Generic Second-Order Model Terms
# Terms Description
1 Overall Mean
c Linear effects for c continuous factors (C1, C2, … Cc)
\sum_i(k_i-1) Main effects for d categorical factors (D1, D2…Dd) with ki levels each (i=1,…d)
c Quadratic effects for continuous factors (C1*C1, C2*C2…Cc*Cc)
c(c-1)/2 2-factor interactions among the c continuous factors
\sum_{i<j}(k_i-1)(k_j-1) 2-factor interactions among the d categorical factors
c\sum_i(k_i-1) 2-factor interactions between the c continuous and d categorical factors
p Total number of model parameters (sum of the rows above)
5 Factors (Xs) – 3 continuous (G,N,P) and 2 three-level categorical (V,S)
p=30 is the minimum number of treatment combinations we need to
estimate the right-sized model
l=6 is number of added treatment combinations (TCs) used for lack-of-fit
r=5 is the number of replicates added
n= p+l+r = 30+6+5 = 41 is the resulting number of runs in the design
19 responses (Ys) analyzed – Next charts show
• analytical summary and stoplight dashboard
• prediction profile plot as graphical dashboard
“Right-Sizing” the DOE: How many experimental runs are needed?
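The parameter count and run size can be checked with a short calculation. The count formulas in this sketch are inferred from the tally for this example (1, 3, 4, 3, 3, 4, 12), and the function name is hypothetical.

```python
from itertools import combinations

def second_order_parameter_count(c, k):
    """Count second-order model terms for c continuous factors and categorical factors with k[i] levels."""
    cat_df = [ki - 1 for ki in k]  # each k-level categorical factor uses k-1 columns
    terms = {
        "overall mean": 1,
        "continuous linear": c,
        "categorical main effects": sum(cat_df),
        "continuous quadratic": c,
        "continuous x continuous": c * (c - 1) // 2,
        "categorical x categorical": sum(a * b for a, b in combinations(cat_df, 2)),
        "continuous x categorical": c * sum(cat_df),
    }
    return terms, sum(terms.values())

terms, p = second_order_parameter_count(c=3, k=[3, 3])  # G, N, P and V, S
l, r = 6, 5                                             # values used in this DOE
n = p + l + r
print(terms)
print(p, n)
```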
Experiment Results – Summary of Models Fit
Response R-Sq(Adj) p(LoF) S(Resid) LargeResiduals Significant Terms (alpha=0.05) G N P V S
YA1 23.7 0.627 0.0382 26 G,
YA2 58.6 0.323 0.0194 8,10 G,GG,NG,VG
YA3 91.2 0.209 0.0387 G, P, PP, V PG
YB1 95.5 0.930 0.0184 (5,39) G,GG,NG
YB2 93.3 0.000 0.0158 17,19,22,26 N,G,NN,GG,NG,VG
YB3 39.3 0.003 0.0328 10,36,40 G
YC1 73.9 0.414 0.0285 24 N,G,NG
YC2 79.1 0.163 0.0184 8,15 N,PP,NN,PG,NG,VP,VG,SG
YC3 80.1 0.018 0.0226 23,33,37,41 G, GG
YD 68.5 0.139 18.57 19,22 N,G,SP
YE 99.9 0.923 0.0598 (6,38) P,N,G,V,GG,PG,VN,SN,VS
YF 99.3 0.934 4.451 15 N,G,GG
YG 94.3 0.044 0.0652 15,16,28,32 N,G,GG,VP,VN,SG,SV
YH 99.4 0.122 0.3889 8,15,19,21 G,GG
YI 99.8 0.236 0.2305 24,29,40 P,G,PN,SN
YJ 86.3 0.984 0.053 (5,39) N,G,GG,NG
YK 23.4 0.845 0.1563 NG
YL 99.4 0.002 0.0104 7,11,13,20,32,36 P,G,PN,PG
YM 94.4 0.608 1.1894 2 N, G, P, GG, NG
Yellow – Small R-Sq(Adj) or significant lack-of-fit (LoF)
Orange – Significant interactions (but not main effects)
Red – Significant main effect
This is a screen
capture of the
interactive
Response
Optimizer for this
DOE from Minitab
Version 17.
Process Outcome from Experiment Results
Minimal statistical difference was noticed among
the three stabilizer types, so we can use the new
stabilizer.
Stabilizer amount can be reduced in all
formulations to the lowest level tested.
Minimal statistical differences were noticed
among the three vendors, so we can qualify the
new vendor.
Second Example: 8-Component Mixture Experiment
Purpose: Optimize the efficiency of TiO2 used in combination with
functional extenders in the European architectural coatings market.
8 components were studied with these constraints:
• These 8 mixture components must add to 100%
• Each component has its own lower and upper limits as well
Extreme vertices used as candidate formulations
D-Optimal DOE generated with n = 47 runs ( n = p + l + r )
• p = q(q+1)/2 = 36 minimum formulations needed for a q=8 component
quadratic mixture model
• l = 5 formulations added for lack of fit, plus
• r = 6 replicates added
• 10 measured responses analyzed – all models fit and predicted well!
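The run-size arithmetic for the mixture design follows the same n = p + l + r pattern, with p counted as q linear blending terms plus q(q-1)/2 nonlinear blending terms. A minimal sketch, with a hypothetical helper name:

```python
def scheffe_quadratic_terms(q):
    """Number of terms in a quadratic mixture model with q components."""
    linear = q                      # one term per component
    blending = q * (q - 1) // 2     # one cross-product term per pair of components
    return linear + blending        # equals q*(q+1)/2

p = scheffe_quadratic_terms(8)
l, r = 5, 6                         # lack-of-fit formulations and replicates from the slide
n = p + l + r
print(p, n)
```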
Quadratic Response Surface Model
for q=2 Mixture Components
Y = a_0 + a_1 X_1 + a_2 X_2 + a_{12} X_1 X_2 + a_{11} X_1^2 + a_{22} X_2^2
The Mixture Constraint
X_1 + X_2 = 1
X_1^2 = X_1 X_1 = X_1 (1 - X_2) = X_1 - X_1 X_2
X_2^2 = X_2 X_2 = X_2 (1 - X_1) = X_2 - X_1 X_2
Quadratic Mixture Model (Scheffé)
Y = b_1 X_1 + b_2 X_2 + b_12 X_1 X_2
For a q-component model, we have q(q-1)/2
of these nonlinear blending terms!
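Carrying the substitution through (and writing a0 as a0*(X1 + X2)) gives b1 = a0 + a1 + a11, b2 = a0 + a2 + a22, and b12 = a12 - a11 - a22. The sketch below checks numerically that the two forms agree along the blend line; the coefficient values are arbitrary illustrative numbers, not data from the talk.

```python
def full_quadratic(x1, x2, a):
    """Six-term quadratic response surface model."""
    return (a["a0"] + a["a1"] * x1 + a["a2"] * x2 + a["a12"] * x1 * x2
            + a["a11"] * x1 ** 2 + a["a22"] * x2 ** 2)

def to_scheffe(a):
    """Collapse the six coefficients into three using X1 + X2 = 1."""
    return {
        "b1": a["a0"] + a["a1"] + a["a11"],
        "b2": a["a0"] + a["a2"] + a["a22"],
        "b12": a["a12"] - a["a11"] - a["a22"],
    }

def scheffe(x1, x2, b):
    """Three-term Scheffé quadratic mixture model."""
    return b["b1"] * x1 + b["b2"] * x2 + b["b12"] * x1 * x2

# Arbitrary illustrative coefficients (assumption)
a = {"a0": 2.0, "a1": 1.5, "a2": -0.5, "a12": 3.0, "a11": 0.75, "a22": -1.25}
b = to_scheffe(a)

blends = [(i / 10, 1 - i / 10) for i in range(11)]
max_diff = max(abs(full_quadratic(x1, x2, a) - scheffe(x1, x2, b)) for x1, x2 in blends)
print(b, max_diff)

# Deviation from straight-line blending at the 50/50 blend
d = scheffe(0.5, 0.5, b) - (0.5 * b["b1"] + 0.5 * b["b2"])
print(d, 0.25 * b["b12"])
```

The last two printed values match because X1*X2 = 0.25 at the 50/50 blend.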
Non-linear blending terms
– Important to characterize responses
– But not all q(q-1)/2 terms may be needed
– So “stepwise regression” models are used
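[Figure: response Y versus blend composition, from X1 = 1.0, X2 = 0.0 through X1 = 0.5, X2 = 0.5 to X1 = 0.0, X2 = 1.0, with b1, b2, and d = 0.25*b12 marked]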
[Figure: ternary plot with vertices X1 = 1, X2 = 1, X3 = 1, showing the region bounded by the minimum and maximum component levels]
Minimum And Maximum Component Levels
Requirements:
0.18 ≤ X1 ≤ 0.51
0.14 ≤ X2 ≤ 0.52
0.11 ≤ X3 ≤ 0.58
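For a small case like this one, the extreme vertices of the constrained region can be found by brute force: set all but one component to a lower or upper limit, solve for the remaining component from the sum-to-one constraint, and keep the point if it falls within its own limits. This is a sketch of that idea, not necessarily the algorithm used by the software in the talk.

```python
from itertools import product

def extreme_vertices(lower, upper, total=1.0, tol=1e-9):
    """Enumerate vertices of {x : lower <= x <= upper, sum(x) = total}."""
    q = len(lower)
    vertices = set()
    for free in range(q):
        others = [i for i in range(q) if i != free]
        # Try every lower/upper combination for the other q-1 components
        for choice in product(*[(lower[i], upper[i]) for i in others]):
            remainder = total - sum(choice)
            if lower[free] - tol <= remainder <= upper[free] + tol:
                point = [0.0] * q
                for i, value in zip(others, choice):
                    point[i] = value
                point[free] = remainder
                vertices.add(tuple(round(v, 6) for v in point))
    return sorted(vertices)

lower = [0.18, 0.14, 0.11]
upper = [0.51, 0.52, 0.58]
vertices = extreme_vertices(lower, upper)
for v in vertices:
    print(v)
```

With eight components the same enumeration applies, although the number of candidate points grows quickly.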
This DOE example is a
published case study!
Coatings Tech
Volume 11, No 4,
pp 35-41 (April 2014)
Authors:
Steven DeBacker
Mike Diebold
Steve Bailey
This approach was
replicated successfully
in many other mixture
DOEs
• DuPont's deployment of Strategy of
Experimentation (SOE) as a system
(methodology, documentation, training,
consulting, software) has been a powerful
tool with tremendous business benefit.
• The two examples of custom (computer
generated) designs involving both multi-
level categorical and continuous factors
and mixture components illustrate how to
go beyond the “catalogued” designs but
follow the SOE principles.
Summary
• DOE and SOE history
Stephanie DeHart, Steve Larson,
Vaneeta Grover, and many other past
DuPont Applied Statistics Group
members
• First example
Avelino Lima, Bob Lawton, Paul Bouvy
• Second example
Steven DeBacker, Mike Diebold,
Sarah Richards
Acknowledgements
Steve Bailey
Professional Biography
• Entire 36.5 year DuPont career was
spent with the corporate Applied
Statistics Group where he was most
recently Principal Consultant and MBB
• Provided statistics and six sigma
consulting primarily for DuPont’s
Performance Polymers and Titanium
Technologies businesses
• Led DuPont's Master Black Belt
Network
• Coordinated, developed and delivered
BB, MBB, and Champion training
• Was President and Chairman of the
Board of the American Society for
Quality (ASQ) 1997-99
• BB and MBB certified by both DuPont
and ASQ
Personal Biography
Born: Indianapolis, Indiana
Hometown: Milwaukee, Wisconsin (actually two suburbs – Shorewood and Wauwatosa)
Education: B.S., M.S., Ph.D. degrees in Statistics from the Univ of Wisconsin in the 1970s
Family: Wife Marg, 3 daughters, 7 grandkids
Personal Interest Items: Golf, bowling, movies, all Wisconsin college and professional sports (Packers, Badgers, Brewers, etc)
Questions?
Remaining charts are a brief history of
DuPont Company (last 5 of 214 years!)
and its Applied Statistics Group (50+ years)
And examples of hands-on DOE
training exercises are at the end
1989 – Quality Management and Technology (QM&T) ISO 9000, PQM, Malcolm Baldrige (DCIC)
[Figure: overlapping circles labeled Quality Management, Applied Statistics, Quality Technology]
2007 – ASG (Back to the Future)
[Figure: overlapping circles labeled Quality Management, Six Sigma, Applied Statistics, Quality Technology]
Note: No attempt was made at depicting the correct sizes or overlap of circles!
DuPont’s 13 Businesses
• Protection Technologies
• Building Innovations
• Safety Resources
• Pioneer Hi-Bred
• Crop Protection
• Nutrition & Health
• Performance Polymers
• Packaging & Industrial Polymers
• Titanium Technologies
• Chemicals & Fluoroproducts
• Performance Coatings
• Industrial BioSciences
• Electronics & Communications
DuPont’s 13 Businesses as of 2012
DuPont’s 8 Businesses
• Protection Technologies
• Building Innovations
Combined: Protection Solutions
• Safety Resources
• Pioneer Hi-Bred
• Crop Protection
• Nutrition & Health
• Performance Polymers
• Packaging & Industrial Polymers
Combined: Performance Materials
• Titanium Technologies
• Chemicals & Fluoroproducts
Both spun off: Chemours
• Performance Coatings
Sold: Axalta (part of Carlyle)
• Industrial BioSciences
• Electronics & Communications
DuPont’s 8 Businesses as of 2015
X2: Hook Position (here = 4; next down = 5, etc)
X4: Stop Position (here = 3; next to
right = 2, etc)
X1: Cup Position (top = 1; next = 2,
etc)
X5: Pin Position; (here = 3; next down = 4, etc)
X3: Start Angle (deg)
Project Y = Distance traveled by the ball
NOTES:
• Because the Statapult can potentially be modified to set all these Xs on a continuous scale, all 5 Xs can be treated as continuous for the purpose of this experiment.
• You are, however, restricted to experimenting with only the settings available under the current Statapult design.
Funnel Experiment – X1 (ac) shown here
ac = Angle of the centerline of the funnel from horizontal
Funnel Experiment Variables
Variable | Type | Range
ac = Angle of the centerline of the funnel from horizontal | Calculated - continuous | ~10 to 90 degrees
ah = Angle of the channel above the horizontal | Calculated - continuous | 0 to 90 degrees
af = Angle of the channel to the face of the funnel | Continuous | 0 to 180 degrees
bt = Type of bearing used | Discrete | Chrome or plastic
bs = Size of bearing used | Discrete/Continuous | 1/8” to 9/16” in 1/16” increments
d = distance on the channel from ball release to funnel entrance | Continuous | 0 to 30”
h = base of funnel held or not | Discrete | Held or not held
ac = arcsine(CenterLineHeight / FunnelLength)
[Formula relating ah, am, and 90 degrees]