TRANSCRIPT
Preliminaries
• Thank you, John McCool – a good excuse to review, think about and give an overview of a 25-year research project
• Stat Day: a worthwhile conference – I have attended it frequently
• What can/should we do to advance statistics?
Last Year's Stat Day: Follow-Up
• Drove up with Ron Snee and Steve Bailey
• Good discussion when driving home, with two major topics:
– 1) Should we use and teach Definitive Screening Designs? Do they add much to Plackett-Burman Designs for screening?
– 2) Status of “Living A Great Life: Mathematical Models for Living” – a book that I may write
Definitive Screening Designs
• Jones, B. and Nachtsheim, C. J. (2011) “A Class of Three-Level Designs for Definitive Screening in the Presence of Second-Order Effects” JQT, 43, 1, pp 1-15
• Jones, B. and Nachtsheim, C. J. (2013) “Definitive Screening Designs with Added Two-Level Categorical Factors” JQT, 45, 2, pp 121-129
• Xiao, L., Lin, D. K. J. and Bai, F. (2012) “Constructing Definitive Screening Designs Using Conference Matrices” JQT, 44, 1, pp 2-8
• Properties:
– Estimates linear and quadratic terms for each factor
– Design can be saturated: e.g. 6 factors in 13 runs
• Can be blocked – add center points for each block
• Can study 2-level categorical factors – replace 0s with ±1 (near orthogonal)
• Should use a Conference Matrix – a computer search with 10,000 starts will not find the best design
• Even number of factors – use dummy columns
• Recommendation: use instead of a reflected Plackett-Burman Design, e.g. 24 (vs 25) points to study 11 (12) factors
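The fold-over construction behind these properties is easy to sketch. A minimal sketch, assuming one particular 6×6 conference matrix (zero diagonal, ±1 off the diagonal, CC′ = 5I); it is an illustrative choice, not the specific matrix of Xiao, Lin and Bai (2012), and any order-6 conference matrix gives the same properties:

```python
import numpy as np

# A 6x6 conference matrix: zero diagonal, +/-1 elsewhere, C C' = 5I.
# Illustrative choice; any conference matrix of order 6 works.
C = np.array([
    [ 0,  1,  1,  1,  1,  1],
    [ 1,  0,  1, -1, -1,  1],
    [ 1,  1,  0,  1, -1, -1],
    [ 1, -1,  1,  0,  1, -1],
    [ 1, -1, -1,  1,  0,  1],
    [ 1,  1, -1, -1,  1,  0],
])
assert np.array_equal(C @ C.T, 5 * np.eye(6))

# Definitive Screening Design: fold over the conference matrix and
# append an overall center run -> 6 factors in 13 runs.
D = np.vstack([C, -C, np.zeros((1, 6), dtype=int)])

print(D.shape)                                    # (13, 6)
print(np.array_equal(D.T @ D, 10 * np.eye(6)))    # main effects orthogonal: True
print((D == 0).sum(axis=0))                       # three center values per factor
```

Stacking C, −C and a center row gives the saturated 6-factors-in-13-runs design quoted above; the checks confirm orthogonal main effects and three center values per factor, which is what lets both linear and quadratic terms be estimated.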
Analysis Table for the 12-Run Plackett-Burman Design (L12)
Trial Avg X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 Y
1 + + + - + + + - - - + -
2 + + - + + + - - - + - +
3 + - + + + - - - + - + +
4 + + + + - - - + - + + -
5 + + + - - - + - + + - +
6 + + - - - + - + + - + +
7 + - - - + - + + - + + +
8 + - - + - + + - + + + -
9 + - + - + + - + + + - -
10 + + - + + - + + + - - -
11 + - + + - + + + - - - +
12 + - - - - - - - - - - -
Sum +
Sum -
Check
Differ.
Effect
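The worksheet rows below the table (Sum +, Sum −, Differ., Effect) can be mirrored in a short calculation. A sketch, assuming the cyclic-shift construction that generates the table (rows 2-11 are left shifts of row 1; row 12 is all minus); the response y is a hypothetical noise-free example, not data from the talk:

```python
import numpy as np

# Row 1 of the L12 table above; rows 2-11 are left cyclic shifts,
# row 12 is all minus.
gen = np.array([1, 1, -1, 1, 1, 1, -1, -1, -1, 1, -1])
X = np.vstack([np.roll(gen, -i) for i in range(11)] + [-np.ones(11, dtype=int)])

# Columns are balanced (six +, six -) and mutually orthogonal
assert np.array_equal(X.T @ X, 12 * np.eye(11))

# Worksheet calculation: Effect_j = (Sum +)/6 - (Sum -)/6
def effects(y):
    sum_plus = np.array([y[X[:, j] == 1].sum() for j in range(11)])
    sum_minus = np.array([y[X[:, j] == -1].sum() for j in range(11)])
    return sum_plus / 6 - sum_minus / 6

# Hypothetical noise-free response with X1 and X5 active
y = 10 + 3 * X[:, 0] - 2 * X[:, 4]
print(np.round(effects(y), 3))   # 6.0 for X1, -4.0 for X5, 0 elsewhere
```

Because the columns are orthogonal, each effect is estimated independently: the worksheet difference of averages recovers exactly twice the coefficient of each active factor.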
Living A Great Life: Mathematical Models for Living
• How to live the best life given your starting status
– Life is a process, not a destination
– Always have a Plan B
• Time is the universal limiting resource
• Living well is the best revenge
• There are good mathematical models for many aspects of life:
– Buying a house
– Finding a spouse/life partner
RSM & Optimum Design Theory
• Kiefer, J. C. (1959) “Optimum Experimental Designs” JRSS B 21, 272-319
• Lucas, J. M. (1976) “Which Response Surface Design is Best” Technometrics 18, 411-417
• Original paper was rejected by Technometrics – just a calculation exercise
• Closed RSM to Optimum Design Theory:
– Composite and Box-Behnken Designs are so efficient that there is little possibility of better designs.
– Bill Notz of OSU, a Kiefer student, confirmed this in his talk at the 2013 FTC
• Split-Plot Designs re-opened the door
• Possible Ph.D. topics here
Large Response Surface Designs
– Crosier, R. B. (1991) “Some New Three-Level Response Surface Designs” Technical Report CRDEC-TR-308, US Army Chemical Research, Development and Engineering Center (available at Stat. Lab Server, Carnegie Mellon University). Up to 15 factors.
• I recommended this paper; however, it was rejected by Technometrics
– Mee, R. W. (2007) “Optimal Three-Level Designs for Response Surfaces in Spherical Experimental Regions” (up to 16 factors; references a 14-factor application)
– This paper includes Crosier’s designs
Letter Results
• Crosier paper remained rejected; the decision was unchanged
– No longer an Associate Editor of Technometrics
– Would have become the longest-term AE
The History of a Research Project
By James M. Lucas
2014 National Quality Month Statistical Symposium
Penn State Great Valley, October 17, 2014
Abstract
• I describe how a simple observation led to a long-term research project studying experiments with Hard-to-Change and Easy-to-Change Factors.
• The observation was that most experiments I had been involved in were not randomized even though the experimental runs were conducted using a random run order. Randomization requires that each experimental unit be treated the same way. So for a randomized experiment a random run order is necessary, but it is not sufficient to achieve randomization. Resetting of all experimental factors is also required before randomization is achieved.
• I will give an overview of experiments with Hard-to-Change and Easy-to-Change Factors, and describe recent research and practical applications. I will discuss the contributions of my students and me, describe personal interactions that occurred with other researchers and journal editors during this project, and show how they advanced or slowed down scientific progress. I will discuss the problems and advantages of early disclosure of research results, including the disadvantage of preemption by others and the advantage of faster dissemination of scientific results. I will describe current research efforts and current applications in this very active area and tell what problems need to be solved to best help experimenters.
Audience Questions
• Why?
• To be sure I am telling you something useful
• To be on the same page that you are on
DOE Questions
• Are you involved with running experiments?
• How often do your experiments involve hard-to-change factors? (First asked at ’92 JSM)
– Seldom (<10% of the time)
– Sometimes (<50% of the time)
– Often (>50% of the time)
• Software developer surprised at the large fraction of “Often” answers
DOE Questions
• How many of you are involved with running experiments?
• How many of you “randomize” to guard against trends or other unexpected events?
• If the same level of a factor such as temperature is required on successive runs, how many of you set that factor to a neutral level and reset it?
Additional Questions
• How many of you have conducted experiments on the same process on which you have implemented a Quality Control Procedure?
– What did you find?
MY OBSERVATIONS
• Comparing residual standard deviation from an experiment with residual standard deviation from an in-control process.
– Experimental standard deviation is larger. 1.5X to 3X is common.
• Why? You are inducing variation by making changes in the process.
Background
• Why I made the observation that most experiments I had been involved in were not randomized even though the experimental runs were conducted in a random run order
• I had the mixed model tools from the 1980s
– DuPont ASG’s Mixed Model Program based on Fellner, William H. (1986) “Robust Estimates of Variance Components” Technometrics 28, pp 51-60
• Commercially available much later
– SAS PROC MIXED in 1996
– JMP6 in 2005
Introduction
• Experiments with H-T-C and/or E-T-C factors occur frequently
• Industrial experiments are seldom really randomized
– They often shouldn’t be randomized
• Split-Plot experiments (intentionally or inadvertently) are often conducted
• Proper Split-Plot blocking is often the answer
• Good reference:
– Jones, B. and Nachtsheim, C. J. (2009) “Split-Plot Designs: What, Why and How” JQT 41, 4, pp 340-361
Good Quote
• Cuthbert Daniel (1976) comes to split-plot experiments from an affordability perspective when he says that “In nearly all experimental situations some factors are hard to vary, whereas others, if not easy, are at least amenable to deliberate variation. Most industrial experiments are, then, split-plot in their design. The total number of runs is largely determined by the number of combinations of the hard-to-vary factors that can be afforded.”
• Compare this quote to J&N (2009): “All industrial experiments are split-plot experiments.”
Framework (For much of my research in Design of Experiments)
• DuPont Applied Statistics Group’s Strategy of Experimentation (SOE) Course
– 2½ day course
– Taken by a large fraction of DuPont’s professionals
• Managerial familiarity with DOE
– Sells statistics
• Designs used:
– Plackett-Burman Screening Designs
– Two-level Factorials for process improvement
– Box-Behnken and Composite Response Surface Designs for process optimization
My DOE Course
• Based on the DuPont Course
• Includes recommendations for Hard-to-Change and Easy-to-Change factors
• Screening will also include Definitive Screening Designs
• Also emphasize “Bold Experimentation”
Statistical Systems: Introduction
• Statistical Systems
– Take a systems approach to solve problems that have statistical aspects
– Tie together statistical techniques and engineering/scientific knowledge
– Many were developed following WW II
• SOE is a useful statistical system
• Good example of Statistical Engineering
Experiments with H-T-C Factors: Approach
• Get help: four Ph.D. dissertations
• Huey Ju (U of D)
– Expected variance over all randomizations
• Jeetu Ganju (U of D)
– Bias of RNR experiments
• Frank Anbari (Drexel)
– Optimum blocking with one H-T-C factor
• Derek Webb (Montana State)
– More than one H-T-C factor
Aside: My Decision for LAGL
• U of D adjunct from the 1970s
– Organized team-taught QC course
– Previously directed 3 Ph.D. dissertations
• Why I am not an academic
– Vince Laricca told me: “I fought very hard to keep you from joining the department because you would have made the department too applied.”
• How did that work out for him?
• Other academic opportunities:
– Drexel – did not like the commute
– Penn State – move was too disruptive
• I love to consult
Once Bitten: Conducting the wrong experiment
• 3-Factor Box-Behnken design for Mylar
– One factor easy-to-change
– Factor had “wrong” sign
– Factor was not significant
• Learning: better consulting
• Better experiment:
– 3 × 3 Factorial
– 3 settings of the easy-to-change factor
– 27 versus 15 runs
– Larger experiment can be cheaper
RSM: One Easy-to-Change Factor
• Design a good experiment in H-T-C factors
– Examine all levels of the E-T-C factor at each setting of the H-T-C factors (and often use more levels)
– Gives a classic Split-Plot experiment
• Examples:
– Pharmaceutical pill experiments: compression
– Plating process: current ratio
• E-T-C factor often has the largest effect
Example: Experiments Used to Develop the Ni/Pd Plating Process
One Hard-to-Change Factor
• Plan a good experiment ignoring the H-T-C aspect
• Very expensive to change the factor
– Completely restrict running the H-T-C factor
– Sometimes the only feasible experiment
• Visit each level of the H-T-C factor twice using standard blocking:
– 4-Factor Composite (FCC) (shown)
– Box-Behnken often has 2 (or 3) orthogonal blocks
– Completely restrict each block
– Good Ph.D. project
Good Standard Blocking
• Are more blocks needed? – Potential Ph.D. topic
• Fewer center points
Ju and Lucas (2002) History
• Originally submitted to Technometrics in 1994
– Associate Editor: Dick DeVeaux
– Reviewers: Ray Myers, Tom Lorenzen
• Revisions got no closer to publication
– AE personal letter
• Advanced science by motivating papers published before it was:
– Letsinger, J. D., Myers, R. H. and Lentner, M. (1996) “Response Surface Methods for Bi-Randomization Structures” JQT 28, pp 381-397
– Ray said that he expected our paper to be published before his was
• Ju and Lucas Response Surface paper was not published
Comments on Letsinger et al. (1996)
• High-impact paper
– Recommended REML for analysis of unbalanced Split-Plot experiments
• Visited each level of the H-T-C factor only once
– A definite weakness in the designs considered
– Minimum-Aberration Designs
– Split-Plots vs Generalized Split-Plots
Randomized Not Reset (RNR) Experiments; Also Called Random Run Order (RRO)
• A large fraction (perhaps a large majority) of industrial experiments are Randomized Not Reset (RNR) experiments
• Properties of RNR experiments and a discussion of how experiments should be conducted:
– “L^k Factorial Experiments with Hard-to-Change and Easy-to-Change Factors” Ju and Lucas, 2002, JQT 34, pp 411-421 [studies one H-T-C factor and uses Random Run Order (RRO) rather than RNR]
– “Factorial Experiments when Factor Levels Are Not Necessarily Reset” Webb, Lucas and Borkowski, 2004, JQT 36, 1, pp 1-11
Not Resetting Factors
• Common practice
• Has had many successes!
– Complete randomization may be impractical
• Not addressed by the classical definition
– Gives a split-plot blocking structure with the blocks determined at random
• May be cost effective
• Causes biased hypothesis tests (Ganju and Lucas 1997, 1999, 2005)
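A small simulation illustrates the point. The error model is an assumption for illustration only: each run gets a fresh "setup" error only when its factor is actually (re)set, so consecutive runs at the same level share a setup error; σs and σe are arbitrary illustrative values:

```python
import numpy as np

rng = np.random.default_rng(1)

def effect_var(reset_every_run, reps=20000, sigma_s=1.0, sigma_e=0.5):
    """Empirical variance of the effect estimate for one 2-level factor
    in 8 runs.  Assumed error model: a setup error is redrawn only when
    the factor is (re)set; otherwise it carries over to the next run."""
    effects = np.empty(reps)
    for r in range(reps):
        levels = rng.permutation([1]*4 + [-1]*4)    # random run order
        s = 0.0
        y = np.empty(8)
        for i, lev in enumerate(levels):
            if reset_every_run or i == 0 or lev != levels[i-1]:
                s = rng.normal(0, sigma_s)           # factor actually reset
            y[i] = s + rng.normal(0, sigma_e)        # setup + run error
        effects[r] = y[levels == 1].mean() - y[levels == -1].mean()
    return effects.var()

v_crd = effect_var(reset_every_run=True)     # completely randomized
v_rnr = effect_var(reset_every_run=False)    # randomized not reset
print(v_crd, v_rnr)                          # RNR variance is noticeably larger
```

Under this model the RNR effect estimate stays unbiased, but its variance is well above what the completely randomized analysis assumes – the random-block (split-plot) structure behind the biased tests cited above.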
Ganju and Lucas References
• Ganju, J., and Lucas, J. M. (1997). “Bias in Test Statistics when Restrictions on Randomization are Caused by Factors”. Communications in Statistics – Theory and Methods, 26, pp. 47-63.
• Ganju, J., and Lucas, J. M. (1999). “Detecting Randomization Restrictions Caused by Factors”. Journal of Statistical Planning and Inference, 81, pp. 129-140.
• Ganju, J., and Lucas, J. M. (2000). “Analysis of Unbalanced Data from an Experiment with Random Block Effects and Unequally Spaced Factor Levels”. The American Statistician, 54, 1, pp. 5-11.
• Ganju, J., and Lucas, J. M. (2005). “Randomized and Random Run Order Experiments” Journal of Statistical Planning and Inference 133, pp.199-210.
An Essential Element of Randomized Not Reset (RNR) Experiments (DuPont Applied Statistics Group)
Randomization Questions?
• Should industrial experiments be randomized?
– Historically most have not been randomized even if a random run order was used
• How large is the experimental error?
– RSM tools assume that the error is small
• Can confirmatory runs be made?
• Experimental error is increased by setting factor levels
– Cost-benefit analysis is needed
• Discussion paper – not a Ph.D. topic
Analysis Comments
• Consultant’s rule: 90% of the information is gained from the appropriate plot.
• Academics are heavily into the formal analysis
– Even though it often adds little
• Wayne Nelson (2007) said, “I have often designed and conducted experiments as split-plots but I have seldom bothered to analyze them as split-plots.”
Experiment With Random Block Effects
• 3 Temperatures × 3 Times × 12 Days
– Factorial with 10 extra center points
– Almost balanced
• At ’92 JSM in Toronto told Andre he had missed a significant Day×Time interaction in:
– Khuri, A. I. (1992) “RS Models with Random Block Effects” Technometrics 34, 26-37
• He wrote:
– Khuri, A. I. (1996) “RS Models with Mixed Effects” JQT 28, 177-186
• This used an RSM approach rather than the model structure, so we played off of his analysis and wrote:
– Ganju, J. and Lucas, J. M. (2000) “Analysis of Unbalanced Data From an Experiment with Random Block Effects and Unequally Spaced Factor Levels” The American Statistician 54, 1, 5-11
The Essential Plot
• Main effects: >90% of SS
• Shows significant high-order interactions
• The formal analysis: adds little
• Comments:
– Coding used: −1, 0, 1
– Not equally spaced levels
– Other analyses conducted
Current Research Project – Partial Confounding

Purpose: To describe partial confounding and to evaluate the advantages it provides when a blocked 2-level factorial is used to estimate a main-effects-plus-two-factor-interaction model.

Abstract: Traditionally blocked 2-level factorial experiments may confound some 2-factor interactions with blocks. Increased precision can be obtained by the use of partial confounding. Partial confounding uses fractional factorials and a different confounding relationship for each fraction to increase the precision of estimates. We describe partial confounding for 2-level factorials for traditional and split-plot blocking. We show the increases in precision obtained by partial confounding and show when it is useful. We give many examples of blocked and split-plot experiments where partial confounding provides increased precision. The precision of partially confounded designs is compared with the precision obtained using computer-generated designs.

A simple example illustrates partial confounding. Consider a 2⁴ experiment that is run in 4 blocks. The traditional blocking procedure is to use blocking generators ABC and ABD so that the 2-factor interaction CD is also confounded with blocks. Partial confounding uses the two half fractions with defining contrast I = ABC and uses a different blocking procedure in each half fraction. In one half fraction ABD and CD are confounded with blocks, and in the other half fraction ACD and BD are confounded with blocks. Increased overall precision is obtained because the confounded interaction terms are partially estimated within blocks.
New Observation: Improved Blocking for Factorial Designs
• Why block?
– Increase precision of the experiment
– Reduce bias
– Better answers to the questions of interest
• Consider an experiment where 4 runs can conveniently be done in a shift. With block size 4, the shift-to-shift variation is placed in blocks.
• How should we do the blocking?
2⁴ in Four Blocks – Notes
• Confounded blocking relationship: I = ABC = ABD = CD
– A two-factor interaction is also confounded
• Model of interest: 4 main effects plus 6 two-factor interactions
• Conducting the experiment:
– Randomize block order
– Randomize run order in each block
• Can we do better?
– Computer-generated design
– Partially confounded design
Confounded Blocking: 2⁴ Factorial in Four Blocks
Obs. A B C D Blk
1 - - - - 1
2 + - - - 2
3 - + - - 2
4 + + - - 1
5 - - + - 3
6 + - + - 4
7 - + + - 4
8 + + + - 3
9 - - - + 4
10 + - - + 3
11 - + - + 3
12 + + - + 4
13 - - + + 2
14 + - + + 1
15 - + + + 1
16 + + + + 2
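The confounding pattern in this table can be checked directly. A sketch that rebuilds the Blk column from the generators ABC and ABD (the sign-pair-to-label mapping is read off the table):

```python
import numpy as np

# 2^4 full factorial in the table's standard order; columns A, B, C, D
runs = np.array([(a, b, c, d) for d in (-1, 1) for c in (-1, 1)
                 for b in (-1, 1) for a in (-1, 1)])
A, B, C, D = runs.T

# Block on the signs of the generators ABC and ABD; the mapping of
# sign pairs to block labels reproduces the Blk column above
label = {(-1, -1): 1, (1, 1): 2, (1, -1): 3, (-1, 1): 4}
blk = np.array([label[(abc, abd)] for abc, abd in zip(A*B*C, A*B*D)])

# Because CD = ABC x ABD, CD is constant within every block (confounded),
# while every main effect is balanced within every block (clear of blocks)
for b in (1, 2, 3, 4):
    assert len(set((C*D)[blk == b])) == 1
    for col in (A, B, C, D):
        assert col[blk == b].sum() == 0
print("CD confounded with blocks; main effects clear")
```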
Parameter Variances
• V(β̂) = σb²/4 + σe²/16
– Constant and CD terms
• V(β̂) = σe²/16
– All other model terms
• Design is orthogonal
• Maximum variance of prediction is the sum of the parameter variances
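These variances can be verified with a direct GLS computation, taking σb = σe = 1 as on the slides and the ABC/ABD blocking above:

```python
import numpy as np

# Model matrix: constant, 4 main effects, 6 two-factor interactions
runs = np.array([(a, b, c, d) for d in (-1, 1) for c in (-1, 1)
                 for b in (-1, 1) for a in (-1, 1)])
A, B, C, D = runs.T
X = np.column_stack([np.ones(16), A, B, C, D,
                     A*B, A*C, A*D, B*C, B*D, C*D])

# Confounded blocking (generators ABC and ABD)
blk = np.array([{(-1, -1): 1, (1, 1): 2, (1, -1): 3, (-1, 1): 4}[(x, y)]
                for x, y in zip(A*B*C, A*B*D)])
Z = (blk[:, None] == np.array([1, 2, 3, 4])).astype(float)
V = np.eye(16) + Z @ Z.T          # sigma_e^2 I + sigma_b^2 Z Z'

# GLS covariance of the parameter estimates
Cov = np.linalg.inv(X.T @ np.linalg.solve(V, X))
d = np.diag(Cov)
print(np.round(d, 4))             # 0.3125 for constant and CD, 0.0625 otherwise

# Maximum prediction variance is attained at the corners of the cube
max_var = max(x @ Cov @ x for x in X)
# Average over the uniform cube: moment weights 1, 1/3 (ME), 1/9 (2FI)
avg_var = d[0] + d[1:5].sum() / 3 + d[5:].sum() / 9
print(round(max_var, 4), round(avg_var, 6))   # 1.1875 0.465278
```

The average uses the uniform-cube moments, which reproduces the 0.465278 average quoted for this design on the next slide.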
Properties of Confounded 2⁴ Design
[Prediction Variance Profile plot over X1–X4]
• Average Variance = 0.465278
• Maximum Variance = (2/4)σb² + (11/16)σe² = 0.5 + 0.6875 = 1.1875 when σ = 1
Globally Optimum Design (a design that may be unachievable)
• Power-Of-Orthogonality (POO) Theorem
– Orthogonal with maximum diagonals of X′V⁻¹X
– V-, G-, A- & D-Optimal
• Optimal Max. Var. = (1/4)σb² + (11/16)σe² = 0.25 + 0.6875 = 0.9375 when σ = 1
• Corollary: a 2^(k−p) design is globally optimum when no model terms are confounded with blocks
Run  Random Block  X1  X2  X3  X4   Y
 1        1        -1  -1   1   1   .
 2        1         1  -1  -1  -1   .
 3        1        -1   1  -1  -1   .
 4        1         1   1  -1   1   .
 5        2        -1   1   1   1   .
 6        2        -1  -1  -1   1   .
 7        2        -1  -1   1  -1   .
 8        2         1  -1   1   1   .
 9        3         1   1  -1  -1   .
10        3         1   1   1   1   .
11        3         1  -1   1  -1   .
12        3        -1   1   1  -1   .
13        4         1  -1  -1   1   .
14        4        -1  -1  -1  -1   .
15        4        -1   1  -1   1   .
16        4         1   1   1  -1   .
Design Evaluation
Computer-Generated Design: JMP Custom Design with 10,000 Starts
[Prediction Variance Profile plot over X1–X4]
• Average Variance = 0.454653
• Maximum Variance = 1.113839
Partially Confounded Blocking: 2⁴ Factorial in Four Blocks
Obs.  A  B  C  D   Blk (confounded)   Blk (partial)
 1    -  -  -  -          1                 1
 2    +  -  -  -          2                 2
 3    -  +  -  -          2                 3
 4    +  +  -  -          1                 1
 5    -  -  +  -          3                 2
 6    +  -  +  -          4                 4
 7    -  +  +  -          4                 4
 8    +  +  +  -          3                 3
 9    -  -  -  +          4                 4
10    +  -  -  +          3                 3
11    -  +  -  +          3                 2
12    +  +  -  +          4                 4
13    -  -  +  +          2                 3
14    +  -  +  +          1                 1
15    -  +  +  +          1                 1
16    +  +  +  +          2                 2
Prediction Variance Profile
[Profile plot over X1–X4]
• Maximum Variance = 1.020833
Partial Confounding Relationship
• I = −ABC half fraction: block on ABD, so Blocks 1 and 4 are exactly the same as for confounded blocking
• I = ABC half fraction: block on ACD, so half the items in Blocks 2 and 3 will change
• Average Variance = 0.446759
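The same GLS computation, repeated for the partially confounded assignment, reproduces the quoted variances and shows the gain over full confounding. A sketch; blk_part encodes the two half-fraction rules stated above (the ABC = −1 fraction blocks on ABD, the ABC = +1 fraction on ACD):

```python
import numpy as np

runs = np.array([(a, b, c, d) for d in (-1, 1) for c in (-1, 1)
                 for b in (-1, 1) for a in (-1, 1)])
A, B, C, D = runs.T
X = np.column_stack([np.ones(16), A, B, C, D,
                     A*B, A*C, A*D, B*C, B*D, C*D])

def variances(blk):
    """Average (uniform cube) and maximum (corner) prediction variance
    of the GLS fit, with sigma_b = sigma_e = 1."""
    Z = (blk[:, None] == np.unique(blk)).astype(float)
    V = np.eye(16) + Z @ Z.T
    Cov = np.linalg.inv(X.T @ np.linalg.solve(V, X))
    d = np.diag(Cov)
    avg = d[0] + d[1:5].sum() / 3 + d[5:].sum() / 9
    return avg, max(x @ Cov @ x for x in X)

# Fully confounded blocking (generators ABC and ABD)
blk_conf = np.array([{(-1, -1): 1, (1, 1): 2, (1, -1): 3, (-1, 1): 4}[(x, y)]
                     for x, y in zip(A*B*C, A*B*D)])
# Partial confounding: ABC = -1 fraction blocks on ABD,
# ABC = +1 fraction blocks on ACD
blk_part = np.where(A*B*C == -1,
                    np.where(A*B*D == -1, 1, 4),
                    np.where(A*C*D == 1, 2, 3))

print(variances(blk_conf))   # approx (0.465278, 1.1875)
print(variances(blk_part))   # approx (0.446759, 1.020833)
```

Both the average and the maximum prediction variance drop relative to the fully confounded design, because CD and BD are each confounded with blocks in only one half fraction and so are partially estimated within blocks.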
Illegitimi non carborundum
Questions? Comments?
Partial Confounding Discussion
• Should a computer program beat a Grand Master chess player?
– Yes – chess is a deterministic game
• Should a computer program be able to beat an experienced experimental designer?
– No – science is open-ended
• Opportunity for research where a computer-generated design is current best practice
– Extend to more factors
– Workable Ph.D. dissertation
Classical Definition: A Completely Randomized Design
• “Completely randomized designs are designs in which the assignment of factor-level combinations to a test run sequence or to experimental units (physical entities on which measurements are taken) is made by a random process where all assignments are equally likely” Gunst (QP, Feb. 2000)
The Classical Definition is Inadequate
• It does not address resetting, so it does not address how industrial and scientific experiments are conducted
• It does not address the inherent split-plot aspects of experiments using equipment
– This affects the desired inferences
– New edition of MGH changes the definition
• Letter to the editor on “Randomization is the Key to Experimental Design Structure” by R. F. Gunst, Quality Progress (2000), May, 14.
Operational Definition: A Completely Randomized Design
• Observation = Model + Error
• A completely randomized design is achieved by using a process that makes the errors independent
– A random order is necessary but not sufficient to achieve a CRD
– Do what is needed so that each experimental unit is treated in the same way
– Consistent with Fisher
Split-Plot Experiments
• Main effects are (partially or fully) confounded with blocks (Cochran and Cox 1957)
• Others use a less general definition
Purpose of a Ph.D. Advisor
• Provide a workable topic that can be completed within a year
Types of Factors
• Require resetting for randomization
– Hard-to-Change (HTC)
• Temperature
– Easy-to-Change (ETC)
• Current density
• Not requiring resetting
– Surfactant type
• Determined by the experimental situation