multivariate modeling of batch processes using summary ...€¦ · multivariate modeling of batch...
TRANSCRIPT
10/7/11
1
1
Multivariate Modeling of Batch Processes Using Summary
Variables
Barry M. Wise and Neal B. Gallagher
EIGENVECTOR RESEARCH, INC. Wenatchee, WA USA
2
Outline
• The process monitoring data problem • Semiconductor Etch Data Set • Approaches to preprocessing
• Variable means • Warping for multi-way • Data summary
• Comparison of results • Polymerization Data Set • Conclusions
10/7/11
2
3
Batch Process Monitoring Data Problem
• Messy-typically includes start-up and shut-down phases that are not of interest
• Periods of “steady-state” where not much is changing
• Variable record lengths • Lots of data! • Reduce to a set of more compact descriptors?
4
Plasma Metal Etch
• Linewidth (Critical Dimension) Control • Constant linewidth reduction run to run and across wafer • Constant linewidth reduction for every material in stack
• Minimal damage to oxide
Silicon
Oxides500Å Ti1000Å TiN
6000Å AlCu (.5%)
500Å TiNResist Resist
Silicon
OxidesTi
TiN
AlCu
TiNResist Resist
TiN
CD
TiN
TiEtch in
Cl2/BCl3Plasma
10/7/11
3
5
Metal Etch Data Set
• Barna, G.G., White, D., Wise, B.M., Gallagher, N.B., Sofge, D., “Development of Robust Fault Detection and Classification Techniques/SEMATECH J-88E Project at TI”, SEMATECH AEC/APC Workshop VIII, Santa Fe, New Mexico, Oct. 27-30, 1996.
• White, D., Barna, G.G., Butler, S.W., Wise B., and Gallagher, N., "Methodology for Robust and Sensitive Fault Detection," Electrochemical Society Meeting, Montreal, May, 1997.
• Gallagher, N.B.,Wise, B.M., Butler, S.W., White, D., and Barna, G.G., "Development and Benchmarking of Multivariate Statistical Process Control Tools for a Semiconductor Etch Process: Improving Robustness Through Model Updating", IFAC ADCHEM'97, Banff, Canada, 78–83, June, 1997.
• Wise, B.M., Gallagher, N.B., Butler, S.W., White, D., and Barna, G.G., "A Comparison of Principal Components Analysis, Multi-way Principal Components Analysis, Tri-linear Decomposition and Parallel Factor Analysis for Fault Detection in a Semiconductor Etch Process," J. Chemometr., 13, 379–396 (1999).
• Wise, B.M., Gallagher, N.B., Butler, S.W., White, D., and Barna, G.G., "Development and Benchmarking of Multivariate Statistical Process Control Tools for a Semiconductor Etch Process: Impact of Measurement Selection and Data Treatment on Sensitivity", IFAC SAFEPROCESS'97, 35–42, Kingston Upon Hull, U.K., Aug., 1997.
• Wise, B.M., Gallagher, N.B., and Martin, E.B., “Application of PARAFAC2 to Fault Detection and Diagnosis in Semiconductor Etch,” J. Chemometr., 15(4), 285–298 (2001).
• Wise, B.M., Gallagher, N.B., "Multi-way Analysis in Process Monitoring and Modeling," AIChE Symposium Series, 93(316), 271–274 (1997).
• Warren, J. and Gallagher, N.B., “Heuristic and Statistical Methods for Fault Detection: Complementary or Competing Approaches?”, SEMATECH AEC/APC Symposium XVIII, Westminster, CO, Sept. 30-Oct. 5 (2006).
This data has been used many times and is available at www.eigenvector.com. A few examples:
6
Available Measurements • Regulatory controller setpoints & controlled variable
measured values • gas flows, pressure, plasma powers
• Regulatory controller manipulated variables • exhaust throttle valve, capacitors • mass flow controller do not provide valve position
• Additional process measurements • endpoint intensity (plasma emission at particular frequency) • impedance measurements
• Optical emission spectra • RF plasma variables
10/7/11
4
7
Sensitivity of MSPC Models
• Three experiments performed with 21 “induced” faults on: • TCP top power • RF bottom power • Cl2 flow • BCl3 flow • Chamber pressure • Helium chuck pressure
• Goal: Compare ability of models considered for detecting faults
8
Global and Local Models
• Global models based on data over long period of time with considerable variance due to drift • Tend to be less sensitive to faults
• Local models build over narrower time windows, less drift variance • More sensitive to faults
10/7/11
5
9
Generating Faults
• Setpoints were changed for controlled process variables • Data for the controlled variable was adjusted to have the
original desired mean • Result is data that looks like a sensor has developed a bias
10
Mean of Each Variable over Wafer Processing Time
• Take mean of each variable • Pros:
• Easy, conceptually simple • Large data reduction • No problem with variable record length
• Cons: • May miss some types of fault • Completely lose time information
10/7/11
6
11
Mean Procedure
Can add length of batch variable
12
Arrangement of Data for Multiway Analysis
• Records must be the same length (except PARAFAC2) • Requires some combination of alignment, truncation or
warping • Pros
• Retains time information
• Cons • Results in many variables (parsimony problem) • Complex preprocessing step, requires interpolation • Model interpretation issues
10/7/11
7
13
Alignment for Multi-way
Problem: How to do first step?
14
Alignment and Warping Methods
• A veritable smorgasbord of methods available • Dynamic Time Warping (DTW) • Correlation Optimized Warping (COW) • Indicator variable/step number • Linear interpolation • Align and truncate • Combinations and variations of the above • Etc.
10/7/11
8
15
Correlation Optimized Warping
• Piecewise preprocessing method • Allows limited changes in segment lengths,
controlled by slack parameter • Linear interpolation over segments • Dynamic programming used to optimize
correlation between warped sample and reference • Less flexible than DTW (unless constrained)
16
COW References
• N.P.V Nielsen, J.M. Carstensen and J. Smedsgaard, “Aligning of single and multiple wavelength chromatographic profiles for chemometric data analysis using correlation optimized warping,” J. Chromatogr. A, 805, 17-35, 1998.
• G. Tomasi, F. van den Berg and C. Andersson, “Correlation Optimized Warping and Dynamic Time Warping as Preprocessing Methods for Chromatographic Data,” J. Chemometrics, 18, 231-241, 2004.
• G. Tomasi, T. Skov and F. van den Berg, Warping Toolbox, see: http://www.models.life.ku.dk/source/DTW_COW/index.asp
10/7/11
9
17
Example: COW COW breaks signals into segments and linearly expands or contracts them to optimize correlation
18
Hints on COW May be better to calculate warp with 2nd derivative Apply calculated warp to other variables Calculate warp on PCA scores or other latent variable
10/7/11
10
19
Data Summary Approach • Convert data into alternate set of descriptors • If process has multiple steps, calculate parameters
that describe each step
20
Summary Variables
• Pros • Conceptually simple • Some time information retained • Noise reduction • Reduces number of variables (vs. MPCA)
• Cons • Further from original data • May not have step numbers to work with
10/7/11
11
21
Summary Variables Step 1
Step 2
Step 3 Step 1 Step 2 Step 3
22
Creation of Pseudo-steps
• Break reference process trace into “sensible” segments (manually)
• Assign step numbers • Warp new data onto reference • Reverse warp reference step numbers into new
data • Use summary variables
10/7/11
12
23
Example of Step Creation
24
Variation in Step Length
10/7/11
13
25
PCA on Summary Variables
26
Loadings for PCA Model
10/7/11
14
27
Compare to MPCA Loadings
28
Contribution Plots
Pressure +3 Fault
Pressure -2 Fault
10/7/11
15
29
Results
TLD PARAFAC MPCA PCA/Means PARAFAC2 Summary!Global 11 12 10 10 7 14!Local 14 17 11 16 13 18!
!
Faults Caught by PCA on Summary Variables Compared with TLD, PARAFAC, MPCA, PCA on the Means and PARAFAC2. 21 total faults possible
31
Polymerization Process
• 10 process variables (sanitized) • 100 time intervals each • TempR 1, TempR 2, TempR 3, Press 1, Flow 1, TempC
1, TempC 2, Press 2, Press 3, Flow 2
• Calibration: 1 to 36 (normal batches) • Test: 37 to 55 (one normal and seven faults) • Described in
*Nomikos, P. and J.F. MacGregor, “Multivariate SPC charts for monitoring batch processes,” Technometrics, 37(1), 1995.
10/7/11
16
32
Compare Summary Variables to MPCA
• Summary Variable • Divided records into 7 steps based on transitions in
variable Flow 1 (control variable) • Calculated mean value and step length each segment • 10 variables x 7 steps + 7 step lengths = 77 variables
• MPCA • Block Scaling • 10 variables x 100 time points = 1000 variables
33
Unusual Batches Identified
0 10 20 30 40 50 60101
102
103
104
105
106
107
Sample
Q R
esid
uals
(46.
72%
)
37 39 43 44 45
46 47
48
49
50 51 52 53 54 55
Samples/Scores Plot of Summary of trs & Summary of trs,
Same as in Nomikos & MacGregor
10/7/11
17
34
Compare Loadings
0.3 0.2 0.1 0 0.1 0.2 0.3 0.40.25
0.2
0.15
0.1
0.05
0
0.05
0.1
0.15
0.2
0.25
PC 1 (24.77%)
PC 2
(15.
75%
)
Variables/Loadings Plot for Summary of trs
PC 2 (15.75%)TempR 1TempR 2TempR 3Press 1TempC 1Flow 1TempC 2Press 2Press 3Flow 2Step Length
0.3 0.2 0.1 0 0.1 0.2 0.3 0.40.25
0.2
0.15
0.1
0.05
0
0.05
0.1
0.15
0.2
0.25
PC 1 (24.77%)
PC 2
(15.
75%
)
Variables/Loadings Plot for Summary of trs
PC 2 (15.75%)Step 1Step 2Step 3Step 4Step 5Step 6Step 7
0.1 0.08 0.06 0.04 0.02 0 0.02 0.04 0.06 0.08 0.10.06
0.04
0.02
0
0.02
0.04
0.06
0.08
0.1
PC 1 (38.55%)PC
2 (1
1.68
%)
PC 2 (11.68%) TempR 1PC 2 (11.68%) TempR 2PC 2 (11.68%) TempR 3PC 2 (11.68%) Press 1PC 2 (11.68%) Flow 1PC 2 (11.68%) TempC 1PC 2 (11.68%) TempC 2PC 2 (11.68%) Press 2PC 2 (11.68%) Press 3PC 2 (11.68%) Flow 2
Summary PCA MPCA
35
Conclusions
• Digestion into summary variables creates smaller, more manageable and interpretable data sets
• Segments can be based on process steps, or inflection points in control variables, etc.
• Warping techniques such as COW can be used to create pseudo-steps
• Models based on summary variables at least as sensitive as MPCA but easier to work with
• Less chance of overfitting