Model Selection Based on Regional Error Estimation of Surrogates (REES)
Ali Mehmani, Souma Chowdhury, Jie Zhang, Weiyang Tong, and Achille Messac
Syracuse University, Department of Mechanical and Aerospace Engineering
10th World Congress on Structural and Multidisciplinary Optimization May 19 - 24, 2013, Orlando, FL
Surrogate model
• Surrogate models are commonly used to provide a tractable and inexpensive approximation of the actual system behavior in many routine engineering analysis and design activities.
Surrogate model: Model selection
Statistical model selection approaches provide support information for users
• to select a best model,
• to select a best kernel function, and
• to determine the optimal model parameters.
Surrogate model: Model selection
Types of model: RBF, Kriging, E-RBF, SVR, QRS, …
Types of basis/kernel: Linear, Gaussian, Multiquadric, Inverse multiquadric, Kriging, …
Parameter estimation: shape parameter in RBF, smoothness and width parameters in Kriging, kernel parameter in SVM, …
Example: RBF with a multiquadric basis, where c is the shape parameter
$f(x) = \sum_{i=1}^{n} w_i \, \psi(\|x - x_i\|)$
$\psi(r) = (r^2 + c^2)^{1/2}$
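The RBF form above can be illustrated with a minimal sketch. It assumes an interpolating RBF whose weights come from solving the square system built from the training points. The function names, the shape parameter value, and the sine test data are illustrative assumptions, not taken from the presentation.

```python
import math

def multiquadric(r, c):
    """Multiquadric basis: psi(r) = (r^2 + c^2)^(1/2), with shape parameter c."""
    return math.sqrt(r * r + c * c)

def solve_linear(a, b):
    """Solve a w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(m[row][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    w = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][k] * w[k] for k in range(i + 1, n))
        w[i] = (m[i][n] - s) / m[i][i]
    return w

def fit_rbf(points, values, c=1.0):
    """Return f(x) = sum_i w_i * psi(||x - x_i||) that interpolates the data."""
    phi = [[multiquadric(math.dist(p, q), c) for q in points] for p in points]
    weights = solve_linear(phi, values)

    def surrogate(x):
        return sum(w * multiquadric(math.dist(x, p), c)
                   for w, p in zip(weights, points))
    return surrogate

# Illustrative 1-D data (assumed, not from the presentation)
train_x = [(0.0,), (0.5,), (1.0,), (1.5,), (2.0,)]
train_y = [math.sin(x[0]) for x in train_x]
f = fit_rbf(train_x, train_y, c=0.8)
print(f((0.75,)))
```

Changing `c` changes the shape of every basis function, which is why the shape parameter is itself a target of model selection.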
Research Objective
• Investigate the effectiveness of the Regional Error Estimation of Surrogates (REES) method in selecting the best surrogate model based on the level of accuracy.
REES provides two kinds of fidelity information:
• Overall fidelity information (e.g., for a hybrid surrogate)
• Minimum fidelity information (e.g., for a conservative surrogate)
Presentation Outline
• Surrogate model selection
• Regional Error Estimation of Surrogate
• Numerical examples: benchmark problems and an engineering design problem
Surrogate model selection
The suitable surrogate depends on:
• Size and location of sample points
• Dimension and level of noise
• Application domain
• …
Application-based model selection (manual selection): there is a lack of general guidelines regarding the suitability of different surrogate models for different applications.
Error-based model selection (automatic selection): error measures are used to select the best surrogate.
Surrogate model: Model selection
There exist many parameter/model/kernel selection approaches:
• Split samples (or holdout samples)
• Bootstrapping
• Cross-validation (Prediction Sum of Squares, PRESS, based on k-fold or leave-one-out CV)
• Akaike information criterion (AIC)
• Schwarz's Bayesian information criterion (BIC)
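As an illustration of the cross-validation approach listed above, the following sketch computes a leave-one-out PRESS value, taken here as the sum of squared prediction residuals where each point is predicted by a model fitted without it. The two stand-in models (a least-squares line and a constant mean) and the data are assumptions chosen only to keep the example short.

```python
def fit_line(xs, ys):
    """Least-squares straight line; a stand-in model for illustration."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return lambda x: intercept + slope * x

def fit_constant(xs, ys):
    """Constant (mean) model; a deliberately poor stand-in for comparison."""
    mean = sum(ys) / len(ys)
    return lambda x: mean

def press_loo(xs, ys, fit):
    """Leave-one-out PRESS: refit without point i, then predict point i."""
    total = 0.0
    for i in range(len(xs)):
        model = fit(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        total += (ys[i] - model(xs[i])) ** 2
    return total

# Illustrative data lying exactly on y = 1 + 2x (assumed)
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0 + 2.0 * x for x in xs]
press_line = press_loo(xs, ys, fit_line)
press_const = press_loo(xs, ys, fit_constant)
print(press_line, press_const)
```

The candidate with the smaller PRESS would be preferred under this criterion; here the line reproduces every held-out point while the constant model does not.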
Presentation Outline
• Review of surrogate model error measurement methods
• Regional Error Estimation of Surrogate (REES)
• Numerical examples: benchmark problems and an engineering design problem
REES: Concept
Model accuracy is related to the available resources.
In general, this concept can be applied to different types of approximation models:
• Surrogate modeling
• Finite Element Analysis
• ...
REES: Methodology
The REES method formulates the variation of error as a function of the number of training points using intermediate surrogates.
This formulation is used to predict the level of error in the final surrogate.
Step 1: Generation of sample data
The entire set of sample points is represented by .
Step 2 : Estimation of the variation of error with sample density
[Figure: the sample points are split into training points and test points at the first, second, third, and fourth iterations; the final surrogate uses all sample points as training points.]
REES: Methodology
The position of the sample points selected as training points at each iteration is critical to the surrogate accuracy.
Intermediate surrogates are constructed at each iteration over heuristic subsets of the sample points.
REES: Methodology
Procedure for Step 2:
Choose number of iterations, Nit
Choose number of combinations, Kt
FOR t = 1, ..., Nit
    FOR k = 1, ..., Kt
        Define intermediate training and test points
        Construct an intermediate surrogate
        Estimate median and maximum errors
    Fit a distribution over all combinations
    Determine Momedian and Momaximum errors (modes of the median and maximum error distributions)
[Figure: variation of the error with sample density. The mode of the median of RAEs (Momed) is plotted against the number of training points (t1, t2, t3, t4), with one point added per iteration from It. 1 to It. 4.]
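The Step 2 loop can be sketched as follows. This is a minimal illustration under several assumptions that are not in the presentation: training subsets are drawn at random rather than by the authors' heuristic, an inverse-distance-weighting model stands in for the intermediate surrogate, and a log-normal distribution is fitted to the errors (its mode is exp(mu - sigma^2)). All function names and the 1-D test function are hypothetical.

```python
import math
import random
import statistics

def fit_idw(xs, ys, power=2.0):
    """Stand-in surrogate (inverse-distance weighting); any model could be used."""
    def predict(x):
        num = den = 0.0
        for xi, yi in zip(xs, ys):
            d = abs(x - xi)
            if d == 0.0:
                return yi
            w = d ** -power
            num += w * yi
            den += w
        return num / den
    return predict

def lognormal_mode(errors):
    """Mode of a log-normal fitted by maximum likelihood: exp(mu - sigma^2)."""
    logs = [math.log(e) for e in errors]
    return math.exp(statistics.fmean(logs) - statistics.pvariance(logs))

def rees_step2(xs, ys, train_sizes, n_combinations, seed=0):
    """Return (n_train, Momed, Momax) for each iteration."""
    rng = random.Random(seed)
    indices = list(range(len(xs)))
    history = []
    for n_train in train_sizes:              # FOR t = 1, ..., Nit
        medians, maxima = [], []
        for _ in range(n_combinations):      # FOR k = 1, ..., Kt
            train = rng.sample(indices, n_train)
            chosen = set(train)
            test = [i for i in indices if i not in chosen]
            model = fit_idw([xs[i] for i in train], [ys[i] for i in train])
            raes = [abs((ys[i] - model(xs[i])) / ys[i]) for i in test]
            # Floor at a tiny value so the logarithm is always defined
            medians.append(max(statistics.median(raes), 1e-12))
            maxima.append(max(max(raes), 1e-12))
        history.append((n_train, lognormal_mode(medians), lognormal_mode(maxima)))
    return history

# Illustrative data: 30 samples of a nonzero 1-D function (assumed)
xs = [i / 29 for i in range(30)]
ys = [2.0 + math.sin(3.0 * x) for x in xs]
history = rees_step2(xs, ys, train_sizes=[10, 15, 20, 25], n_combinations=20)
for n_train, momed, momax in history:
    print(n_train, momed, momax)
```

Each tuple in `history` corresponds to one point on the error-versus-training-points plot.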
Step 3 : Prediction of error in the final surrogate
The final surrogate model is constructed using the full set of training data.
Regression models are applied to relate the error evaluated at each iteration to the number of training points.
These regression models are called the variation of error with sample density (VESD).
The regression models are used to predict the level of error in the final surrogate.
Candidate regression models:
• Exponential regression model
• Multiplicative regression model
• Linear regression model
In the choice of these functions we assume a smooth monotonic decrease of the error with the training point density.
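A hedged sketch of fitting a VESD regression and extrapolating it to the full training set follows. The transcript does not preserve the regression equations, so the functional forms used here are assumptions: exponential E(n) = a0 exp(-a1 n), multiplicative E(n) = a0 n^(-a1), and linear E(n) = a0 + a1 n. The first two are fitted by least squares on log-transformed data. The error values are synthetic, not results from the presentation.

```python
import math

def linear_fit(u, v):
    """Ordinary least squares for v = intercept + slope * u."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    slope = (sum((a - mu) * (b - mv) for a, b in zip(u, v))
             / sum((a - mu) ** 2 for a in u))
    return mv - slope * mu, slope

def fit_vesd(n_train, errors, kind):
    """Return a callable E(n) for the chosen (assumed) regression form."""
    if kind == "exponential":        # log E = log a0 - a1 * n
        b0, b1 = linear_fit(n_train, [math.log(e) for e in errors])
        return lambda n: math.exp(b0 + b1 * n)
    if kind == "multiplicative":     # log E = log a0 - a1 * log n
        b0, b1 = linear_fit([math.log(n) for n in n_train],
                            [math.log(e) for e in errors])
        return lambda n: math.exp(b0 + b1 * math.log(n))
    if kind == "linear":             # E = a0 + a1 * n
        b0, b1 = linear_fit(n_train, errors)
        return lambda n: b0 + b1 * n
    raise ValueError(f"unknown regression kind: {kind}")

# Synthetic Momed values at four iterations (assumed, decaying exponentially)
n_train = [10, 15, 20, 25]
momed = [0.5 * math.exp(-0.05 * n) for n in n_train]
vesd = fit_vesd(n_train, momed, "exponential")
predicted_overall_error = vesd(30)   # extrapolate to all 30 sample points
print(predicted_overall_error)
```

The same call with the Momax values would give a predicted maximum error; in practice one would compare the fit quality of the candidate forms before extrapolating.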
REES: Methodology
[Figure: prediction of error in the final surrogate. The mode of the median of RAEs (Momed) and the mode of the maximum error distribution (Momax) at each iteration (It. 1 to It. 4) are plotted against the number of training points (t1, t2, t3, t4). The regression through the Momed values gives the predicted overall error, and the regression through the Momax values gives the predicted maximum error.]
Numerical Examples
The effectiveness of REES-based model selection is explored by selecting the best surrogate among four candidates, (i) Kriging, (ii) RBF, (iii) E-RBF, and (iv) Quadratic Response Surface (QRS), on four benchmark problems and an engineering design problem.
The surrogate selected by REES is compared with the selections based on actual errors evaluated using additional test points and on a normalized PRESS.
Numerical Examples
[Figure: VESD regression models used to predict the overall error. Panels: Branin-Hoo (2 variables), Hartmann (6 variables), Dixon & Price (12 variables), Dixon & Price (18 variables).]
Numerical Examples
[Figure: VESD regression models used to predict the maximum error. Panels: Branin-Hoo (2 variables), Hartmann (6 variables), Dixon & Price (12 variables), Dixon & Price (18 variables).]
Numerical Examples
Wind Farm Power Generation
Surrogates are developed using Kriging, RBF, E-RBF, and QRS to represent the power generation of an array-like wind farm.
[Figure: two panels. VESD used to predict the overall error; VESD used to predict the maximum error.]
Numerical Examples
Model selection based on the actual error on additional test points
Set the number of additional test points
For all surrogate candidates,
    for each additional test point $x_i$:
        $y_i = \mathrm{System}(x_i)$
        $\hat{y}_i = \mathrm{Surrogate}(x_i)$
        $RAE_i = \left| \frac{y_i - \hat{y}_i}{y_i} \right|$
    Fit a distribution over all RAEs
    Determine the mode of the error distribution as the actual error
Select the surrogate with the smallest error
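The procedure above can be sketched in a few lines. The sketch assumes a log-normal distribution is fitted to the RAEs (the slide only says "a distribution"), and the system function, the two candidate models, and the test points are hypothetical placeholders.

```python
import math
import statistics

def relative_absolute_errors(system, surrogate, test_points):
    """RAE_i = |(y_i - yhat_i) / y_i| at each additional test point."""
    return [abs((system(x) - surrogate(x)) / system(x)) for x in test_points]

def error_mode(raes):
    """Mode of a log-normal fitted to the RAEs: exp(mu - sigma^2)."""
    logs = [math.log(max(e, 1e-12)) for e in raes]  # floor avoids log(0)
    return math.exp(statistics.fmean(logs) - statistics.pvariance(logs))

def select_best(system, candidates, test_points):
    """Return the candidate name with the smallest error mode, plus all scores."""
    scores = {
        name: error_mode(relative_absolute_errors(system, model, test_points))
        for name, model in candidates.items()
    }
    return min(scores, key=scores.get), scores

# Hypothetical system and surrogate candidates (assumed for illustration)
system = lambda x: 2.0 + x ** 2
candidates = {
    "linear_model": lambda x: 2.0 + x,
    "offset_model": lambda x: 2.0 + x ** 2 + 0.01,
}
test_points = [0.1, 0.3, 0.5, 0.7, 0.9]
best, scores = select_best(system, candidates, test_points)
print(best, scores)
```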
Numerical Examples
Model selection based on the overall error estimated using REES
Function               Rank 1   Rank 2   Rank 3   Rank 4
Branin-Hoo             ERBF     Kriging  RBF      QRS
Hartmann - 6           QRS      Kriging  RBF      ERBF
Dixon and Price - 12   ERBF     QRS      Kriging  RBF
Dixon and Price - 18   ERBF     QRS      Kriging  RBF
Wind Farm              Kriging  QRS      RBF      ERBF
Model selection based on the actual error
Function               Rank 1   Rank 2   Rank 3   Rank 4
Branin-Hoo             ERBF     RBF      Kriging  QRS
Hartmann - 6           QRS      Kriging  RBF      ERBF
Dixon and Price - 12   ERBF     Kriging  RBF      QRS
Dixon and Price - 18   ERBF     Kriging  RBF      QRS
Wind Farm              Kriging  RBF      ERBF     QRS
Numerical Examples
Model selection based on the normalized Prediction Sum of Squares (n-PRESS)
Set the number of additional training points
For all surrogate candidates,
    for each training point $x_i$:
        $y_i = \mathrm{Surrogate}(x_i)$
        $\hat{y}_i^{(-i)} = \mathrm{Intermediate\ Surrogate}(x_i)$
        $RAE_{CV,i} = \left| \frac{\hat{y}_i^{(-i)} - y_i}{y_i} \right|$
    Fit a distribution over all $RAE_{CV}$ values
    Determine the root mean square of the errors (RAEs) as the n-PRESS error
Select the surrogate with the smallest n-PRESS error
Numerical Examples
Model selection based on the n-PRESS
Function               Rank 1   Rank 2   Rank 3   Rank 4
Branin-Hoo             ERBF     Kriging  RBF      QRS
Hartmann - 6           Kriging  QRS      RBF      ERBF
Dixon and Price - 12   QRS      ERBF     Kriging  RBF
Dixon and Price - 18   QRS      ERBF     RBF      Kriging
Wind Farm              QRS      ERBF     Kriging  RBF
Model selection based on the RMSE on additional test points
Function               Rank 1   Rank 2   Rank 3   Rank 4
Branin-Hoo             RBF      ERBF     Kriging  QRS
Hartmann - 6           RBF      Kriging  ERBF     QRS
Dixon and Price - 12   ERBF     RBF      Kriging  QRS
Dixon and Price - 18   ERBF     RBF      Kriging  QRS
Wind Farm              Kriging  ERBF     RBF      QRS
Numerical Examples
Model selection based on the maximum error estimated using REES
Function               Rank 1   Rank 2   Rank 3   Rank 4
Branin-Hoo             ERBF     RBF      Kriging  QRS
Hartmann - 6           RBF      Kriging  QRS      ERBF
Dixon and Price - 12   ERBF     QRS      Kriging  RBF
Dixon and Price - 18   ERBF     QRS      Kriging  RBF
Wind Farm              Kriging  QRS      ERBF     RBF
Model selection based on the 75th percentile of the RMSE and
75th percentile of the RMSE
Function               Rank 1   Rank 2   Rank 3   Rank 4
Branin-Hoo             ERBF     RBF      Kriging  QRS
Hartmann - 6           Kriging  RBF      QRS      ERBF
Dixon and Price - 12   ERBF     RBF      Kriging  QRS
Dixon and Price - 18   ERBF     RBF      Kriging  QRS
Wind Farm              Kriging  QRS      ERBF     RBF
75th percentile of the
Function               Rank 1   Rank 2   Rank 3   Rank 4
Branin-Hoo             QRS      Kriging  RBF      ERBF
Hartmann - 6           QRS      RBF      Kriging  ERBF
Dixon and Price - 12   QRS      ERBF     Kriging  RBF
Dixon and Price - 18   QRS      ERBF     RBF      Kriging
Wind Farm              QRS      ERBF     Kriging  RBF
Numerical Examples
A Summary and Comparison
% Success of Model Selection (overall error)
                         REES    PRESS
With QRS considered      100%    0%
Without QRS considered   100%    40%
% Success of Model Selection (maximum error)
                         REES
With QRS considered      80%     0%
Without QRS considered   80%     40%
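The overall-error success percentages for REES can be reproduced from the rankings reported earlier in the REES overall-error table and the actual-error table. The sketch below assumes that "success" means the top-ranked surrogate matches the top-ranked surrogate under the actual error, and that excluding QRS means dropping it from both rankings before comparing; the presentation does not state either definition explicitly.

```python
# Rankings copied from the REES overall-error and actual-error tables
rees_overall = {
    "Branin-Hoo": ["ERBF", "Kriging", "RBF", "QRS"],
    "Hartmann - 6": ["QRS", "Kriging", "RBF", "ERBF"],
    "Dixon and Price - 12": ["ERBF", "QRS", "Kriging", "RBF"],
    "Dixon and Price - 18": ["ERBF", "QRS", "Kriging", "RBF"],
    "Wind Farm": ["Kriging", "QRS", "RBF", "ERBF"],
}
actual = {
    "Branin-Hoo": ["ERBF", "RBF", "Kriging", "QRS"],
    "Hartmann - 6": ["QRS", "Kriging", "RBF", "ERBF"],
    "Dixon and Price - 12": ["ERBF", "Kriging", "RBF", "QRS"],
    "Dixon and Price - 18": ["ERBF", "Kriging", "RBF", "QRS"],
    "Wind Farm": ["Kriging", "RBF", "ERBF", "QRS"],
}

def success_rate(predicted, reference, exclude=()):
    """Percentage of problems where the top-ranked surrogates agree.

    Assumed definition: compare the first surrogate in each ranking
    after removing any excluded candidates.
    """
    hits = 0
    for problem, ranking in predicted.items():
        top_predicted = next(m for m in ranking if m not in exclude)
        top_reference = next(m for m in reference[problem] if m not in exclude)
        hits += top_predicted == top_reference
    return 100.0 * hits / len(predicted)

with_qrs = success_rate(rees_overall, actual)
without_qrs = success_rate(rees_overall, actual, exclude=("QRS",))
print(with_qrs, without_qrs)
```

Under these assumptions both values agree with the REES column of the overall-error summary.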
Concluding Remarks
We developed a new model selection approach to select the best surrogate among available surrogate models based on the level of accuracy.
The REES method is defined based on the hypothesis that:
“The accuracy of the approximation model is related to the amount of available resources”
The REES method proposes two error measures to assess the confidence level of the surrogate model:
• Overall error measure
• Maximum error measure
The preliminary results on the benchmark and wind farm power generation problems indicate that, in all cases, the REES method selects the best surrogate with a higher level of confidence than n-PRESS.
Acknowledgement
I would like to acknowledge my research adviser Prof. Achille Messac, and my co-adviser Prof. Souma Chowdhury for their immense help and support in this research.
Support from the NSF Awards is also acknowledged.