Rethinking Steepest Ascent for Multiple Response Applications
Robert W. Mee
Jihua Xiao
University of Tennessee
Outline
Overview of RSM Strategy
Steepest Ascent for an Example
Efficient Frontier Plots
Paths of Improvement (POI) Regions
Sequential RSM Strategy
Box and Wilson (JRSS-B, 1951):
1. Initial design to estimate linear main effects
2. Exploration along path of steepest ascent
3. Repeat step 1 in new optimal location
– If main effects are still dominant, repeat step 2; if not, go to step 4
4. Augment to complete a 2nd-order design
5. Optimization based on fitted second-order model
Multiple Responses RSM Literature
Del Castillo (JQT 1996), "Multiresponse Optimization…"
Construct confidence cones for the path of steepest ascent (i.e., maximum improvement) for each response
– Use very large 1−α for responses of secondary importance, e.g., 99%–99.9% confidence
– Use 95%–99% confidence for more critical responses
Identify directions x falling inside every confidence cone
If no such x exists, choose a convex combination of the paths of steepest ascent, giving greater weight to responses that are well estimated
– Constrain the solution to reside inside the confidence cones for the most critical responses.
Multiple Responses RSM Literature
Desirability Functions (Derringer and Suich, JQT 1980)
Score each response with a function between 0 and 1.
The geometric mean of the scores is the overall desirability
Recent enhancements use score functions that are “smooth” (i.e., differentiable).
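As a concrete sketch of this scoring scheme (the bounds and values below are illustrative only, not taken from the paper): a larger-the-better Derringer–Suich score ramps from 0 at a lower bound to 1 at a target value, and the overall desirability is the geometric mean of the individual scores.

```python
import numpy as np

def desirability_larger_is_better(y, y_min, y_max, weight=1.0):
    """Derringer-Suich larger-the-better score: 0 below y_min,
    1 above y_max, a power ramp in between."""
    d = np.clip((y - y_min) / (y_max - y_min), 0.0, 1.0)
    return d ** weight

def overall_desirability(scores):
    """Overall desirability is the geometric mean of the individual scores."""
    scores = np.asarray(scores, dtype=float)
    return float(np.prod(scores) ** (1.0 / len(scores)))

# Illustrative bounds only (not from the example)
d1 = desirability_larger_is_better(1.2, y_min=0.5, y_max=2.0)     # resolution-like
d2 = desirability_larger_is_better(0.09, y_min=0.05, y_max=0.12)  # rate-like
D = overall_desirability([d1, d2])
```

Because the overall score is a geometric mean, any one response scoring 0 drives the overall desirability to 0, which is why each response must clear its minimum.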
An Example with Multiple Responses
Vindevogel and Sandra (Analytical Chem. 1991)
2^(5−2) fractional factorial design using micellar electrokinetic chromatography
Higher surfactant levels are required to separate two of the four esters, but this increases the analysis time
Response variables include:
– Resolution for separation of 2nd and 3rd testosterone esters
– Time for process, t_IV
– Four other responses of lesser importance
Reaction Time vs. Reaction Rate
Rate = 1 / Time
[Figure: scatterplot of Rate (1/t_IV, 0.05–0.11) versus t_IV (7.5–20 min)]
Fitted First-Order Models for Resolution and Reaction Rate
Good news: both models have R² > 99%
Bad news: the improvement directions for resolution and rate point in opposite directions
– Authors recommend a compromise:
– Lower x1 (pH) and x5 (buffer) to increase rate
– Lower x2 (SHS%) and x3 (Acet.) and raise x4 (surfactant) to increase resolution
What about Modeling Desirability?
First-order model for Desirability
Analysis of Variance
Source    DF  Sum of Squares  Mean Square  F Ratio  Prob > F
Model      5      0.06041712     0.012083   1.0280    0.5603
Error      2      0.02350934     0.011755
C. Total   7      0.08392646

Parameter Estimates
Term        Estimate   Std Error  t Ratio  Prob>|t|
Intercept   0.4441642   0.038332    11.59   0.0074*
x1         -0.054544    0.038332    -1.42   0.2907
x2          0.0144716   0.038332     0.38   0.7421
x3         -0.05221     0.038332    -1.36   0.3063
x4         -0.029628    0.038332    -0.77   0.5204
x5         -0.027641    0.038332    -0.72   0.5458
What we just tried was a bad idea!
Even when a first-order model fits each response well, the desirability function for two or more responses will require a more complicated model.
Following an initial two-level design, one cannot model desirability directly.
It is better to maximize desirability based on predicted response values from simple models for each response.
Maximizing Predicted Desirability for the Vindevogel and Sandra Example
JMP’s default finds the maximum within a hypercube
This does not identify a useful path for exploration
[JMP Prediction Profiler: maximum desirability 0.602826 within the hypercube, at x = (−1, −1, −1, −0.22963, −1); predicted Resolution = 1.04261 ± 0.14557, Rate = 0.092638 ± 0.009083]
Software Should Maximize Desirability Within a Hypersphere
[JMP Prediction Profiler with Radius² constrained to 7.49: maximum desirability 0.803535, at x = (−0.86736, 0.120135, −2.50108, −0.55318, −0.40229); predicted Resolution = 1.452702, Rate (1/t_IV) = 0.094267]
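A sphere-constrained search of this kind is easy to sketch. The coefficient vectors, intercepts, and desirability bounds below are made up for illustration; the useful fact is that when the objective depends on x only through x'b1 and x'b2, the constrained optimum lies in span{b1, b2}, so a one-dimensional grid over the angle on that plane suffices.

```python
import numpy as np

# Hypothetical first-order coefficient vectors for two responses
# (signs chosen so that the improvement directions conflict).
b_res  = np.array([-0.5, -0.3,  0.9,  0.6, -0.2])   # "resolution" slopes
b_rate = np.array([ 0.4,  0.1, -0.7, -0.3,  0.3])   # "rate" slopes
b0_res, b0_rate = 1.0, 0.08                          # intercepts (made up)

def desirability(y, lo, hi):
    return np.clip((y - lo) / (hi - lo), 0.0, 1.0)

def overall(x):
    """Geometric-mean desirability of the two predicted responses at x."""
    d1 = desirability(b0_res + x @ b_res, 0.5, 2.0)
    d2 = desirability(b0_rate + x @ b_rate, 0.05, 0.12)
    return np.sqrt(d1 * d2)

# Optimum on the sphere x'x = r^2 lies in span{b_res, b_rate}:
# build an orthonormal basis and grid over the angle phi.
r = np.sqrt(7.49)
u1 = b_res / np.linalg.norm(b_res)
v = b_rate - (b_rate @ u1) * u1
u2 = v / np.linalg.norm(v)
phis = np.linspace(0, 2 * np.pi, 3601)
xs = r * (np.cos(phis)[:, None] * u1 + np.sin(phis)[:, None] * u2)
best = xs[np.argmax([overall(x) for x in xs])]
```

Any component of x orthogonal to both coefficient vectors changes neither predicted response, so restricting the search to the two-dimensional span loses nothing.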
Confidence Cone for Path of Steepest Ascent (Box and Draper)
Define θ_bβ, the angle between the least squares estimator b and the true coefficient vector β.

Pivotal quantity:

(S_xx b'b / σ̂²) sin²θ_bβ / (k − 1) ~ F_{k−1,df}

Upper confidence bound for sin²θ_bβ, assuming θ_bβ < 90°:

sin²θ_bβ ≤ (k − 1) F_{k−1,df,α} σ̂² / (S_xx b'b)

Equivalently, since kF = S_xx b'b / σ̂², where F is the overall regression F ratio,

sin θ_bβ ≤ [(k − 1) F_{k−1,df,α} / (kF)]^{1/2}
95% Confidence Cone for Paths of Steepest Ascent for Resolution & Rate
95% Confidence Cones for Paths of Steepest Ascent
– Resolution (Y1): θ_bβ < 14.4°
– Rate (Y2): θ_bβ < 32.7°
These confidence cones do not overlap, since the angle between b_Resolution and b_Rate is 141.5°!
What compromise is best?
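The two quantities on this slide can be computed directly: the angle between two coefficient vectors, and the cone half-angle from the bound sin θ_bβ ≤ [(k−1)F_{k−1,df,α}/(kF)]^{1/2}. The F values below are placeholders, not the example's actual statistics.

```python
import numpy as np

def angle_between(b1, b2):
    """Angle (degrees) between two coefficient vectors."""
    c = b1 @ b2 / (np.linalg.norm(b1) * np.linalg.norm(b2))
    return float(np.degrees(np.arccos(np.clip(c, -1.0, 1.0))))

def cone_half_angle(k, F_overall, F_crit):
    """Half-angle (degrees) of the confidence cone for the path of
    steepest ascent: sin(theta) <= sqrt((k-1)*F_crit / (k*F_overall))."""
    s = np.sqrt((k - 1) * F_crit / (k * F_overall))
    return float(np.degrees(np.arcsin(min(s, 1.0))))

# Hypothetical numbers: k = 5 factors, F_overall the model's F ratio,
# F_crit the alpha-level F quantile on (k-1, df) degrees of freedom.
half = cone_half_angle(k=5, F_overall=400.0, F_crit=19.25)
```

A larger overall F ratio (a better-estimated direction) shrinks the cone, which is the behavior the complementary-regions slide later exploits.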
Efficient Frontier Notation
J larger-the-better response variables
First-order model in k factors for each response
Notation:
– b_j: vector of least squares estimates for the jth response
– T_j: corresponding vector of t statistics for the jth response
Convex combinations for two responses:
– For 0 ≤ c ≤ 1: x_C = (1 − c)T1 + cT2
Efficient Frontier for Two Responses
Let x_N denote a vector that is not a convex combination of T1 and T2.
There exists a convex combination x_C, with |x_C| = |x_N|, such that x_C'b_j ≥ x_N'b_j (j = 1, 2). (For an orthogonal design, T_j ∝ b_j, so x'b_j and x'T_j order directions identically.)
Proof by contradiction: suppose x_N'T_j > x_C'T_j (j = 1, 2) for every such x_C. Projecting x_N onto the cone spanned by T1 and T2 and rescaling to length |x_N| cannot decrease either inner product x'T_j, producing a convex combination that contradicts the supposition.
So one need only consider convex combinations of the paths of steepest ascent.
Efficient Frontier for Resolution and Rate
Predicted Resolution and Rate for x'x = 7.49
– Grid lines match predicted Y at the design center
– One quadrant shows gain in both Y_j's
[Figure: predicted Resolution (0–2) versus predicted Rate (0.05–0.13) around the circle x'x = 7.49, with the steepest-ascent directions for Rate and for Resolution marked]
Efficient Frontier for Resolution and Rate
No change in Rate (Y2): x_C = (1 − c)T1 + cT2 with

c1 = T1'T2 / (T1'T2 − T2'T2) = 0.63

x_C = [−0.48, −0.10, −2.68, −0.10, −0.22] at x_C'x_C = 7.49, giving Resolution = 1.76, Rate = 0.084
[Figure: the point c = c1 marked on the predicted Resolution-versus-Rate curve]
Efficient Frontier for Resolution and Rate
No change in Resolution (Y1): x_C = (1 − c)T1 + cT2 with

c2 = T1'T1 / (T1'T1 − T1'T2) = 0.7355

x_C = [−1.39, 0.46, −1.85, −1.22, −0.65] at x_C'x_C = 7.49, giving Resolution = 0.86, Rate = 0.110
[Figure: the point c = c2 marked on the predicted Resolution-versus-Rate curve]
Improving Both Responses
If T1'T2 > 0, all convex combinations of T1 and T2 increase the predicted Y for both responses.
If T1'T2 < 0, all x_C with c1 < c < c2 increase the predicted Y for both responses.
For our example, 0.63 < c < 0.735 increases both predicted Resolution and Rate.
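The endpoints c1 and c2 come from setting x_C'T2 = 0 and x_C'T1 = 0 respectively in x_C = (1 − c)T1 + cT2. A minimal sketch (the T vectors below are invented, chosen only so that T1'T2 < 0 as in the example):

```python
import numpy as np

def poi_interval(T1, T2):
    """Range of c for which x_c = (1-c)T1 + cT2 improves both predicted
    responses when T1'T2 < 0: c1 solves x_c'T2 = 0, c2 solves x_c'T1 = 0."""
    t12, t11, t22 = T1 @ T2, T1 @ T1, T2 @ T2
    c1 = t12 / (t12 - t22)   # no change in response 2
    c2 = t11 / (t11 - t12)   # no change in response 1
    return c1, c2

# Hypothetical t-statistic vectors with an obtuse angle between them
T1 = np.array([ 3.0, -1.0,  4.0,  2.0, -0.5])
T2 = np.array([-2.5,  0.5, -3.0, -1.0,  1.5])
c1, c2 = poi_interval(T1, T2)
```

Any c strictly between c1 and c2 gives x_C'T1 > 0 and x_C'T2 > 0, i.e., predicted improvement in both responses.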
Efficient Frontier @ x’x=5 versus Factorial Points
Factorial points = 8 directions, none on the efficient frontier
What about sampling error?
[Figure: efficient frontier at x'x = 5 overlaid with the 8 factorial-point directions in the Resolution-versus-Rate plane]
Attaching Confidence to Improvement
Lower confidence limit for E[Y(x)], given x:

Ŷ(x) − t_{df,α} s_{Ŷ(x)}

Lower confidence limit for the change in E[Y(x)], given x:

ΔŶ(x) − t_{df,α} s_{ΔŶ(x)}

where ΔŶ(x) = Ŷ(x) − b0 = x'b
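For an orthogonal first-order design in which every slope estimate has variance σ²/S_xx, the standard error of the change x'b is σ̂|x|/√S_xx, so the lower limit is easy to evaluate. All numeric inputs below are hypothetical stand-ins, except t_{2,.10} = 1.886, which the example uses.

```python
import numpy as np

def lcl_change(x, b, sigma_hat, S_xx, t_crit):
    """Lower confidence limit for the change in E[Y] at x relative to the
    design center, for an orthogonal first-order design where each slope
    has variance sigma^2 / S_xx (so s_{x'b} = sigma_hat * |x| / sqrt(S_xx))."""
    delta_hat = x @ b
    se = sigma_hat * np.linalg.norm(x) / np.sqrt(S_xx)
    return delta_hat - t_crit * se

# Hypothetical slopes and error estimate; t_crit = t_{2,.10} = 1.886
x = np.array([-0.48, -0.10, -2.68, -0.10, -0.22])
b = np.array([-0.30, -0.05, -0.45, -0.20, -0.10])
lcl = lcl_change(x, b, sigma_hat=0.05, S_xx=8.0, t_crit=1.886)
```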
Efficient Frontier @ x'x = 7.49 with 90% Lower Confidence Bound for E(Y)
[Figure: change in Resolution (−1 to 1.5) versus change in Rate (−0.04 to 0.04), showing the efficient frontier and its 90% lower confidence bounds]
Paths of Improvement (POI) Region
POI Region = {x : ΔŶ(x) − t_{df,α} s_{ΔŶ(x)} > 0}
This is a cone about the path of steepest ascent, containing all x such that the angle θ_xb satisfies

θ_xb < θ_xb^U = cos⁻¹[ t_{df,α} / (kF)^{1/2} ]

Using t_{2,.10} = 1.886, the upper bound for θ_xb is 86.9° for Resolution and 83.3° for Rate.
For a confidence region simultaneous in x, replace t_{df,α} with (kF_{k,df,α})^{1/2} or [(k−1)F_{k−1,df,α}]^{1/2}.
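The POI half-angle is a one-liner once the overall regression F ratio is known. The F value below is hypothetical, chosen only to be large, as for a well-fitted first-order model; t_{2,.10} = 1.886 is the critical value the slide uses.

```python
import numpy as np

def poi_half_angle(t_crit, k, F_overall):
    """Upper bound on the angle between x and b for which the lower
    confidence bound on the change in E[Y] stays positive:
    theta_U = arccos(t_crit / sqrt(k * F_overall))."""
    c = t_crit / np.sqrt(k * F_overall)
    return float(np.degrees(np.arccos(min(c, 1.0))))

# With t_{2,.10} = 1.886 and a large (hypothetical) overall F ratio,
# the POI cone is nearly a half-space: angle close to 90 degrees.
theta = poi_half_angle(1.886, k=5, F_overall=240.0)
```

Note the contrast with the steepest-ascent cone: better precision (larger F) widens the POI cone toward 90° while narrowing the confidence cone for the direction itself.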
Paths of Improvement vs. Path of Steepest Ascent
"Path of Steepest Ascent": b is perpendicular to the contours of predicted Y
– The path of steepest ascent is not scale invariant
– Contours are invariant to the scaling of the factors
The paths-of-improvement region is complementary to the confidence cone for the steepest ascent path
– Assuming θ_bβ < 90°, the 100(1−α)% confidence cone for steepest ascent:

sin θ_bβ ≤ [(k − 1) F_{k−1,df,α} / (kF)]^{1/2}

– The corresponding 100(1−α)% paths-of-improvement region:

cos θ_xb ≥ [(k − 1) F_{k−1,df,α} / (kF)]^{1/2}
Scale Dependence for Path of Steepest Ascent
If the experiment uses a small range for one factor, steepest ascent will neglect that factor
Suppose E(Y) = β0 + X1 + X2
Experiment 1:
– X1: [−2, 2]
– X2: [−1, 1]
– Path of steepest ascent (natural units): [4, 1]
Experiment 2:
– X1: [−1, 1]
– X2: [−2, 2]
– Path of steepest ascent (natural units): [1, 4]
The contour direction [1, −1] is the same for both.
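The scale dependence can be checked in a few lines (the helper and its names are illustrative): coding each factor to [−1, 1] over its range multiplies the natural-unit slope by the half-range, and mapping a coded steepest-ascent step back to natural units multiplies by the half-range again.

```python
import numpy as np

def natural_path(half_ranges, natural_slopes=(1.0, 1.0)):
    """Steepest-ascent step in natural units for E(Y) = b0 + X1 + X2,
    after coding each factor to [-1, 1] over its experimental range."""
    half_ranges = np.asarray(half_ranges, dtype=float)
    coded_slopes = np.asarray(natural_slopes) * half_ranges
    # One coded step along steepest ascent, converted back to natural units
    return coded_slopes * half_ranges

path1 = natural_path(half_ranges=(2.0, 1.0))   # Experiment 1 -> [4, 1]
path2 = natural_path(half_ranges=(1.0, 2.0))   # Experiment 2 -> [1, 4]
```

The same true surface yields paths [4, 1] and [1, 4] depending only on the factor ranges, while the natural-unit gradient (and hence the contour direction [1, −1]) never changes.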
Complementary Regions
Confidence cone for β vs. the paths-of-improvement region:
As precision improves, the confidence cone for β shrinks, while the paths of improvement region expands toward half of R^k.
Common Paths of Improvement
Using predicted values, convex combinations x_C = (1 − c)T1 + cT2 yield improvement in both responses for c1 < c < c2
– For our example, 0.63 < c < 0.735
Using lower confidence bounds, a smaller set of directions yields "certain" improvement in both responses
– For our example, using t_{2,.10} = 1.886, we are assured of improvement for 0.651 < c < 0.727
Extensions to J > 2 Responses
The efficient frontier for more than two responses is the set of directions x that are convex combinations of all J vectors of steepest ascent
– If some directions of steepest ascent are interior to this set, they are not binding
Overlaying contour plots can show the predicted responses for each direction x on the efficient frontier.
Is Simultaneous Improvement Really Possible?
Can we reject H0: θ = 180°, where θ is the angle between the two true coefficient vectors?
– An approximate F test based on the difference in SSE between the regression of Y2 on X and the regression of Y2 on predicted Y1
– For our example, F = 25.45 vs. F_{4,2} (p = .04)
Can we construct an upper confidence bound for this angle?
– No solution at present
– The larger this angle, the further one must extrapolate in these k factors to achieve gain in both responses.