
Engineering-driven Data Analytics for In Situ Process Monitoring of Nanomanufacturing

Jianjun Shi

The Carolyn J. Stewart Chair and Professor

The H. Milton Stewart School of Industrial and Systems Engineering

Georgia Institute of Technology

(contact: [email protected]; http://pwp.gatech.edu/jianjun-shi/)


Outline

• Overview of Data Fusion for Quality Improvement Research

• Data analytics for in situ process monitoring of nanomanufacturing
  – Generalized Wavelet Shrinkage of In-line Raman Spectra
  – Penalized Mixed-effects Decomposition
  – Tensor Mixed Effects Model
  – Physics-based Feature Extraction & Predictive Model

• Summary


Overview of Data Fusion for Quality Improvement

[Figure: Manufacturing/Executing Operation, stages 1 to 7; Raw Data → Information → Knowledge → Decision; In-Process/Maint. Data]

Manufacturing System, Product Realization and Data Fusion

Data Fusion enables improvements across all stages of the product lifecycle, from product conceptualization, to design, production, service, and logistics.

Challenges and Opportunities

Opportunities
• Ubiquitous availability of data
• System operations become transparent
• Technological capabilities and flexibility of individual machines
• Advancements in data science, machine learning, and computing capabilities
• …

Challenges
• Define clear engineering objectives and quickly retrieve all relevant information from all stages of a manufacturing system
• Effectively address data uncertainty and noise
• The “imbalance” in data availability: massive normal operational data, few specific failure data
• Lack of a unified model/strategy for making (real-time) informed decisions
• Lack of deep integration of data science with design and manufacturing engineering
• …

Interdisciplinary Framework: Fusion of Engineering, Data Science, OR/Control

[Figure: three overlapping fields: Engineering/Domain Knowledge; Data Science (Statistics/CS/Signal Processing); OR/Control]

Stream of Variation Methodologies for Multistage Manufacturing Processes

Shi, J., 2006, “Stream of Variation Modeling and Analysis for Multistage Manufacturing Processes”, CRC Press, 469 pp.

Shi, J. and Zhou, S., 2009, “Quality Control and Improvement for Multistage Systems: A Survey”, IIE Transactions on Quality and Reliability Engineering, Vol. 41, pp. 744-753.

Multistage Manufacturing System: Processes with multiple workstations and/or multiple operations

[Figure: example systems: Automotive Body Assembly; Roll-to-Roll Nano Buckypaper Mfg System; 3D Printing / Multilayer Additive Manufacturing. Labels: Level 1 to Level 4]

Challenges:
- Variation propagation modeling
- Tolerance synthesis
- Root cause diagnosis
- Distributed sensing
- Critical station identification
- Automatic compensation

Data Enabled Design and Control of Multistage Manufacturing System

[Figure: Parts → Station 1 ... Station k ... Station N → end-of-line sensor. Conventional loop: Statistical Process Control (SPC), operator intervention, feedback to design (heuristics: DOs & DON'Ts). Data-enabled loop: distributed sensors; engineering knowledge (CAD and CAPP), system & control theory, and statistical analysis; variation model; root causes; feedback to design (analytical).]

Data Fusion Enabled Process Monitoring, Root Cause Diagnosis, Defect Prevention, and Feedback to Design

Integration of Design & Manufacturing Information:
1. Sensitivity-based Design Evaluation
2. Process-oriented Tolerance Synthesis
3. Optimal Sensor Distribution Strategy
4. Diagnosability Study
5. Fault Diagnosis
6. Quality Reliability Chain

Stream-of-Variation Model

$x_k = A_{k-1} x_{k-1} + B_k u_k + w_k$

$y_k = C_k x_k + v_k$

• The SoV theory provides a unified framework for variation reduction in multistage manufacturing systems.

• The SoV theory has been implemented at automotive and aerospace companies and their suppliers.
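The Stream-of-Variation state-space model, $x_k = A_{k-1} x_{k-1} + B_k u_k + w_k$ with measurement $y_k = C_k x_k + v_k$, can be illustrated with a minimal sketch. The matrices, the deviation inputs, and the function name `simulate_sov` below are hypothetical placeholders chosen for illustration; in practice the matrices come from engineering knowledge of the process.

```python
import random


def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def simulate_sov(A_prev, B, C, u, x0, w_std=0.0, v_std=0.0, seed=0):
    """Propagate part deviations through the stages.

    A_prev[k] plays the role of A_{k-1}, B[k] of B_k, C[k] of C_k (0-based stage index).
    Returns the state and measurement at every stage.
    """
    rng = random.Random(seed)
    x, states, measurements = list(x0), [], []
    for k in range(len(u)):
        carried = matvec(A_prev[k], x)   # deviation inherited from upstream stages
        injected = matvec(B[k], u[k])    # deviation introduced at this stage
        x = [c + i + rng.gauss(0.0, w_std) for c, i in zip(carried, injected)]
        y = [yi + rng.gauss(0.0, v_std) for yi in matvec(C[k], x)]
        states.append(x)
        measurements.append(y)
    return states, measurements


# Hypothetical 3-stage line with a 2-dimensional deviation state.
I2 = [[1.0, 0.0], [0.0, 1.0]]
stage_inputs = [[1.0, 0.0], [0.0, 2.0], [0.5, 0.0]]
states, measurements = simulate_sov(
    A_prev=[I2] * 3, B=[I2] * 3, C=[[[1.0, 0.0]]] * 3,
    u=stage_inputs, x0=[0.0, 0.0],
)
```

With identity transition matrices and no noise, the final state is simply the accumulated deviation from all stages, which is the variation-propagation behavior the model is meant to capture.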

SoV Theory and Application

[Figure: product design and manufacturing system chain (tolerancing, die fabrication, tooling error, stamping error, assembly line, inspection) with single-station and multi-station models; Stage 1 ... Stage k ... Stage N with state $x_{k-1} \to x_k$, inputs $u_k$ and $w_k$, and measurement $y_k$; application areas: sensitivity, diagnosis, variation analysis, manufacturing control, modeling, design optimization]

Physics-Driven Machine Learning and Modeling for Quality Improvement

• Li, J., and Shi, J., 2007, “Knowledge Discovery from Observational Data for Process Control through Causal Bayesian Networks”, IIE Transactions, Vol. 39, pp681-690.

• Liu, K., Zhang, X. and Shi, J., 2013, “Adaptive Sensor Allocation Strategy for Process Monitoring and Diagnosis in a Bayesian Network”, IEEE Transactions on Automation Science and Engineering.

VISION: Three interrelated layers of networks: - system, sensing, and decision making

[Figure: three interrelated network layers.
Manufacturing System Network: raw material/part → Stage 1 → Stage 2 → ... → Stage 5 → final product, with intermediate products between stages.
Distributed Sensing Network: process/product design data and system operational data.
Extracted Knowledge and Intelligence for Decision Making: a Bayesian network over X1 to X6 with factors P(X1), P(X2|X1), P(X3|X2), P(X4|X1), P(X5|X1), P(X6|X1, X5).]

Physics-Driven Machine Learning and Modeling for Multistage Rolling Process

• 50 to 80 roller stations along a miles-long line.
• Each station has more than 15 typical variables (speed, temperature, force, lubrication, etc.).
• Heterogeneous data with uncertainties.
• Complex interactions among the variables.
• No mathematical models linking in-line product quality to the massive process data.
• Cost and energy losses due to quality problems and defects are HIGH!

Physics-Driven Machine Learning and Modeling

Causation-Based Process Monitoring, Diagnosis and Control

[Figure: causal network linking stage-free process variables (Z1, Z2) and stage-specific process/product variables (X1, X2, X3, X10, X11, X12, Y1, Y2, Y5)] (Li and Shi, 2007)

Additional Ongoing R&D Projects on Multistage Manufacturing Process Control

Semiconductor (Samsung)
- Intensive process and product data
- Data fusion for anomaly detection, defect prevention, and yield improvement

Boeing 787 Assembly (Boeing)
- Dimensional errors exist in parts, fixtures, etc.
- Variation modeling and shape control to improve quality and reduce cycle time for multistage composite part assembly

Nano Buckypaper Mfg System (NSF, Nano Scale Up)
- Nano manufacturing scale-up needs new processes and in-line sensors
- Integrated data fusion, monitoring, diagnosis, and predictive control


Nano Buckypaper Manufacturing Process Scale Up

Nano Buckypaper
- Nano buckypaper has favorable mechanical, thermal, and electrical properties.
- It can be fabricated in batch mode, but roll-to-roll control is desired.
- In situ Raman metrology is essential for quality assessment.

Challenges

• Noise Challenge
  • Signal-dependent noise instead of i.i.d. noise (Yue et al. 2017).
• Data Challenge
  • About 600 Raman signals per minute.
  • Each Raman spectrum includes 1000+ Raman shifts and intensities.

[Figure: SWCNT, acquisition time 1 s, a = 1, b = 41. One panel shows variance (a.u.) versus Raman shift (cm^-1) together with the curve a*s + b; the other shows variance (a.u.) versus Raman intensity (a.u.).]

Challenges

• Feature Challenge
  – Multiple features at the profile level (Salzer and Siesler 2009)
    • Raman peak intensity corresponds to material concentration and distribution.
    • Peak frequency yields molecular structure and phase.
    • Bandwidth is associated with crystallinity and phase.
  – Multiple features at the quality level
    • Manufacturing consistency
    • Uniformity
    • Defects

Challenges

• Spatial-Temporal Correlation
  – Spatial: aligned Nano-Buckypaper
  – Temporal: along Raman shift/frequency

[Figure: alignment]

Challenges

• Noise Challenge
  – Not i.i.d. Gaussian noise
  – Data acquisition time vs. signal-to-noise ratio (SNR)
  – Signal-dependent property
• Data Challenge
  – Large data size: 600+ profiles/minute
  – High dimensional: spatial dimensions + each profile with 1020+ Raman shifts/frequencies
• Feature Challenge
  – Multiple features at the profile level
  – Multiple features at the quality level
  – Feature-feature mixture, feature-noise mixture
• Spatial-Temporal Correlation
  – Spatial: aligned Nano-Buckypaper
  – Temporal: along Raman shift/frequency

Topic 1: Generalized Wavelet Shrinkage (Data Denoising)
Topic 2: Penalized Mixed-Effects Decomposition (Data Decomposition)
Topic 3: Tensor Mixed-Effects Model (Multi-array Data Correlation)

Topic 1: Generalized Wavelet Shrinkage of in-line Raman Spectroscopy *

• Yue, X., Wang, K., Yan, H., Park, J.G., Liang, Z., Zhang, C., Wang, B. and Shi, J., 2017, “Generalized Wavelet Shrinkage of Inline Raman Spectroscopy for Quality Monitoring of Continuous Manufacturing of Carbon Nanotube Buckypaper”, IEEE Transactions on Automation Science and Engineering, 14(1), pp.196-207. (Best Paper Award of IEEE T-ASE in 2017)

Motivation

• Shorten Data Acquisition Time
  – Decrease control bandwidth for process monitoring and control
  – Decrease the material-heterogeneity-dependent noise and provide a better in-line monitoring capability
• High Signal-to-Noise Ratio (SNR)
  – To reduce the risk that important features are overlooked
  – To increase accuracy in peak analysis, e.g., the intensity ratio of the D/G bands is correlated with the degree of functionalization
• Goal: develop a technique to denoise Raman spectra measured at high speed

Signal Modeling & Validation

• Snyder's Model (Snyder et al., 1993)

$y_i = n_i^s + g_i$, with $n_i^s \sim \mathrm{Poisson}(\mu_i^s)$ and $g_i \sim N(0, \varepsilon_g)$

where $y_i$ is the data collected by the $i$th frequency bin, $n_i^s$ is the number of photoelectrons, $g_i$ is the readout noise from the amplifier, and $\mu_i^s > 107$.

$\mathbf{Y} = \mathbf{S} + \boldsymbol{\Sigma} \sim N(\mathbf{S},\ a \cdot \mathrm{diag}(\mathbf{S}) + b \cdot \mathbf{I})$

where $\mathbf{Y}$ is the observed Raman intensity, $\mathbf{S}$ is the real Raman intensity, and $\boldsymbol{\Sigma}$ is the signal-dependent noise.

Conclusion:
• The Pearson correlation coefficient (PCC) between the variance vector and the signal is 0.958 for SWCNT and 0.901 for MWCNT (strongly correlated).
• The linear relationship holds regardless of exposure time and material type; the validated linear parameters are relatively stable.

[Figure: SWCNT, acquisition time = 1 s, a = 1, b = 41. One panel shows variance (a.u.) versus Raman shift (cm^-1) together with the curve a*s + b; the other shows variance (a.u.) versus Raman intensity (a.u.).]
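The signal-dependent noise model, in which the noise variance in each frequency bin is approximately $a \cdot s + b$, can be checked on simulated data. The sketch below is only an illustration: the intensity grid, the number of repeated spectra, and the helper names are assumptions, and the values a = 1, b = 41 are taken from the figure caption. It draws repeated noisy spectra, estimates the per-bin variance, and fits a straight line of variance against intensity.

```python
import random
import statistics


def simulate_spectra(signal, a=1.0, b=41.0, n_rep=2000, seed=0):
    """Draw n_rep noisy copies of `signal`, with Var(noise_i) = a * signal_i + b."""
    rng = random.Random(seed)
    return [[s + rng.gauss(0.0, (a * s + b) ** 0.5) for s in signal]
            for _ in range(n_rep)]


def fit_line(x, y):
    """Ordinary least-squares fit y ~ slope * x + intercept."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical "true" intensities for 20 frequency bins (a.u.).
signal = [100.0 * k for k in range(1, 21)]
spectra = simulate_spectra(signal)

# Per-bin sample variance across the repeated spectra.
variances = [statistics.variance(column) for column in zip(*spectra)]
a_hat, b_hat = fit_line(signal, variances)
```

With real data, the same per-bin variance estimate would be computed from repeated measurements of the same sample, and the fitted slope and intercept would play the roles of a and b.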

Wavelet Shrinkage (WS)

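• State of the art
  – Wavelet shrinkage is asymptotically optimal for recovering objects from certain functional classes, such as Besov spaces (Candes, 2006)
  – Soft and hard shrinkage are two widely used thresholding rules
  – Thresholding methods: VisuShrink, RiskShrink, SureShrink (Donoho and Johnstone, 1995; Johnstone and Silverman, 1997)
• Soft Shrinkage

  $\min_{\mathbf{W}} (\mathbf{Y} - \boldsymbol{\Gamma}^T\mathbf{W})^T (\mathbf{Y} - \boldsymbol{\Gamma}^T\mathbf{W}) + \lambda \|\mathbf{W}\|_{\ell_1}$

  $\hat{w}_i^J = \eta_s(w_i^J, \lambda/2) = \mathrm{sign}(w_i^J)\,(|w_i^J| - \lambda/2)_+$

• Hard Shrinkage

  $\min_{\mathbf{W}} (\mathbf{Y} - \boldsymbol{\Gamma}^T\mathbf{W})^T (\mathbf{Y} - \boldsymbol{\Gamma}^T\mathbf{W}) + \lambda \|\mathbf{W}\|_{\ell_0}$

  $\hat{w}_i^J = \eta_h(w_i^J, \lambda) = \mathrm{sign}(w_i^J) \cdot |w_i^J| \cdot I(|w_i^J| > \lambda)$

Generalized Wavelet Shrinkage (GWS)

• Idea
  – Consider the impact of the covariance matrix
  – Smoothing spline vs. generalized smoothing spline; Lasso vs. adaptive Lasso
  – Wavelet shrinkage => generalized wavelet shrinkage
• Soft Shrinkage

  $\min_{\mathbf{W}} (\mathbf{Y} - \boldsymbol{\Gamma}^T\mathbf{W})^T \boldsymbol{\Omega}^{-1} (\mathbf{Y} - \boldsymbol{\Gamma}^T\mathbf{W}) + \lambda \|\mathbf{W}\|_{\ell_1}$

  $\hat{w}_i^J = \eta_s\!\left(w_i^J, \frac{\lambda}{2k_i^J}\right) = \mathrm{sign}(w_i^J)\left(|w_i^J| - \frac{\lambda}{2k_i^J}\right)_+$

• Hard Shrinkage

  $\min_{\mathbf{W}} (\mathbf{Y} - \boldsymbol{\Gamma}^T\mathbf{W})^T \boldsymbol{\Omega}^{-1} (\mathbf{Y} - \boldsymbol{\Gamma}^T\mathbf{W}) + \lambda \|\mathbf{W}\|_{\ell_0}$

  $\hat{w}_i^J = \eta_h\!\left(w_i^J, \frac{\lambda}{k_i^J}\right) = \mathrm{sign}(w_i^J) \cdot |w_i^J| \cdot I\!\left(|w_i^J| > \frac{\lambda}{k_i^J}\right)$

  – Thresholds are determined adaptively (from level-dependent to individual-dependent)
  – When the noise is not signal-dependent, $\boldsymbol{\Omega}$ is a scalar matrix and the GWS is equivalent to the WS.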
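The soft and hard shrinkage rules can be written as a few small functions. In this sketch, `k` stands for the per-coefficient weights $k_i^J$ that appear in the GWS thresholds; how those weights are derived from $\boldsymbol{\Omega}$ is not specified on the slides, so the values used here are hypothetical. The function names are also illustrative only.

```python
def soft_threshold(w, t):
    """Soft rule: sign(w) * max(|w| - t, 0)."""
    magnitude = max(abs(w) - t, 0.0)
    return magnitude if w > 0 else -magnitude


def hard_threshold(w, t):
    """Hard rule: keep w unchanged if |w| > t, otherwise set it to zero."""
    return w if abs(w) > t else 0.0


def ws_soft(coeffs, lam):
    """Conventional soft shrinkage: the same threshold lam / 2 for every coefficient."""
    return [soft_threshold(w, lam / 2.0) for w in coeffs]


def gws_soft(coeffs, lam, k):
    """Generalized soft shrinkage: coefficient i gets its own threshold lam / (2 * k[i])."""
    return [soft_threshold(w, lam / (2.0 * ki)) for w, ki in zip(coeffs, k)]


def gws_hard(coeffs, lam, k):
    """Generalized hard shrinkage: coefficient i gets its own threshold lam / k[i]."""
    return [hard_threshold(w, lam / ki) for w, ki in zip(coeffs, k)]


coeffs = [3.0, -0.5, -2.0]
uniform = ws_soft(coeffs, lam=2.0)
# With all weights equal to 1, the generalized rule reduces to the conventional one.
same_as_uniform = gws_soft(coeffs, lam=2.0, k=[1.0, 1.0, 1.0])
# A smaller weight means a larger threshold for that coefficient.
adaptive = gws_soft([3.0, 3.0], lam=2.0, k=[1.0, 0.5])
```

The only difference between the two families is where the threshold comes from: one value shared by all coefficients versus one value per coefficient.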

Wavelet Denoising Procedure

(a) Current WS Procedure
Input data → Wavelet transformation → Get level-dependent threshold → Soft/Hard shrinkage → Inverse wavelet transformation → Output data

(b) Proposed GWS Procedure
Phase I: Historical data → Signal-noise relationship → Penalized parameter by cross validation
Phase II: Input data → Segmentation → Wavelet transformation → Get individual-dependent threshold → Soft/Hard shrinkage → Inverse wavelet transformation → Output data
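The conventional procedure (transform, level-dependent threshold, shrink, inverse transform) can be sketched end to end. The slides do not say which wavelet is used; the Haar wavelet below is an assumption chosen because it fits in a few lines, and the thresholds are hypothetical. The input length must be divisible by 2 raised to the number of levels.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_forward(x, levels):
    """Multi-level Haar transform. Returns (approximation, details), finest level first."""
    approx, details = list(x), []
    for _ in range(levels):
        pairs = [(approx[2 * i], approx[2 * i + 1]) for i in range(len(approx) // 2)]
        details.append([(p - q) / SQRT2 for p, q in pairs])
        approx = [(p + q) / SQRT2 for p, q in pairs]
    return approx, details


def haar_inverse(approx, details):
    """Invert haar_forward by rebuilding from the coarsest level to the finest."""
    x = list(approx)
    for d in reversed(details):
        rebuilt = []
        for a_i, d_i in zip(x, d):
            rebuilt.append((a_i + d_i) / SQRT2)
            rebuilt.append((a_i - d_i) / SQRT2)
        x = rebuilt
    return x


def soft_threshold(w, t):
    magnitude = max(abs(w) - t, 0.0)
    return magnitude if w > 0 else -magnitude


def ws_denoise(y, levels, thresholds):
    """Level-dependent soft shrinkage: thresholds[j] is applied to detail level j."""
    approx, details = haar_forward(y, levels)
    shrunk = [[soft_threshold(w, t) for w in d] for d, t in zip(details, thresholds)]
    return haar_inverse(approx, shrunk)


# Hypothetical profile: a constant level of 5 with alternating +/-1 noise.
noisy = [6.0, 4.0] * 4
denoised = ws_denoise(noisy, levels=1, thresholds=[2.0])
```

A generalized variant would keep the same transform and inverse, and only replace the single threshold per level with one threshold per coefficient.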

Case Study

• Raman Spectroscopy
  – Near-infrared (NIR) laser with a wavelength of 785 nm
  – A power of 150 mW was used to eliminate the effect of ambient light
  – A low-magnification lens was used to achieve a larger focus tolerance

[Figure: Renishaw™ inVia micro-Raman system]

Case Study

• SWCNT Buckypaper ($P_1^*$ is the D band peak intensity; $P_2^*$ is the G band peak intensity)

Acq. time  Method    Mean P1−P1*  Std P1−P1*  Mean P2−P2*  Std P2−P2*  Peak1 SNR  Peak2 SNR
T=1        Raw data  15.58        10.31       29.76        19.42       5.41       17.86
T=1        WS        8.21         7.4         14.99        8.73        8.97       37.04
T=1        GWS       5.99         4.27        14.01        8.13        13.65      39.68
T=0.5      Raw data  11.83        5.58        26.8         19.1        4.22       10.07
T=0.5      WS        5.55         4.05        14.08        10.45       7.65       18.85
T=0.5      GWS       5.04         3.77        12.06        7.27        8.33       23.92
T=0.1      Raw data  12.63        5.63        13.45        10.2        1.11       5.46
T=0.1      WS        4.39         2.47        8.38         7.47        2.94       8.15
T=0.1      GWS       3.9          2.93        8.18         6.03        2.96       9.09
T=0.05     Raw data  9.91         5.51        13.72        8.78        0.88       3.67
T=0.05     WS        2.61         1.58        8.35         4.42        3.22       6.46
T=0.05     GWS       2.53         1.58        6.44         4.01        3.28       7.89
T=0.01     Raw data  10.67        3.66        13.2         8.95        0.59       2.30
T=0.01     WS        2.42         2.02        6.36         4.85        1.91       4.55
T=0.01     GWS       2.18         1.61        5.74         4.38        2.24       5.04

Summary

• Noise Source and Pattern
  – The noise in Raman spectra includes photon shot noise, sample-generated noise, instrument-generated noise, computation-generated noise, and externally generated noise
  – The signal-dependent property is validated
• Generalized Wavelet Shrinkage
  – GWS is proposed to make full use of the signal-dependent property for denoising and signal enhancement
  – GWS realizes individually adaptive wavelet thresholding, which outperforms the level-dependent conventional wavelet shrinkage
  – GWS can improve the SNR dramatically, or can be used to reduce data acquisition time without loss of SNR
• Conditions for applying the GWS
  – The noise has the signal-dependent property (most spectroscopy techniques are based on quantification of photoelectrons)
  – The signal-dependence relationship is relatively stable

Topic 2: Penalized Mixed-effects Decomposition (PMD) for Multichannel Profile Monitoring of In-line Raman Spectroscopy*

• Yue, X., Yan, H., Liang, R., Shi, J., 2018, “A Wavelet-based Penalized Mixed-effects Model for Multichannel Profile Detection of In-line Raman Spectroscopy”, IEEE Transactions on Automation Science and Engineering. Vol. 15, Issue 5. pp 1258 - 1271

Objective


• Monitoring multiple features simultaneously

• Decompose nominal random effects and defective random effects

• Quick computation for real-time monitoring

State-Of-The-Art in Profile Monitoring

• Linear Profile Monitoring
  – Kang and Albin, 2000; Kim et al., 2003; Mahmoud et al., 2007.
  – Limitation: only suitable for linear profiles.
• Nonlinear Profile Monitoring I
  – Polynomial regression, splines (Mosesova et al. 2006, Zou et al., 2008, Shiau et al. 2009, Chang and Yadama, 2010).
  – Limitation: focuses only on smooth, differentiable profiles, so it is not applicable when the profiles include spikes or peaks.
• Nonlinear Profile Monitoring II
  – PCA, wavelets, mixed-effects models (Mosesova et al. 2006, Jensen and Birch, 2009, Paynabar and Jin, 2011).

  $\mathbf{y}_i = \mathbf{X}_i\boldsymbol{\beta} + \mathbf{Z}_i\mathbf{b}_i + \boldsymbol{\varepsilon}_i$ (fixed effects + random effects + noise)

  – Limitations:
    • It cannot separate normal random effects from defective random effects.
    • Too many coefficients.
    • REML- or EM-based methods cannot provide sufficient calculation speed.

Penalized Mixed-effects Decomposition

$\arg\min_{\boldsymbol{\theta}_{ij},\, \boldsymbol{\delta}_{ij}} \ \mathbf{e}_{ij}^T \boldsymbol{\Omega}^{-1} \mathbf{e}_{ij} + \gamma \|\boldsymbol{\delta}_{ij}\|_1$

subject to $\mathbf{y}_{ij} = \boldsymbol{\mu}_i + \mathbf{W}\boldsymbol{\theta}_{ij} + \mathbf{W}_a\boldsymbol{\delta}_{ij} + \mathbf{e}_{ij}$, $\ \mathbf{B}_L \preceq \mathbf{W}\boldsymbol{\theta}_{ij} \preceq \mathbf{B}_U$,

$\mathbf{B}_U = \mathrm{diag}(\boldsymbol{\Omega}) \cdot \boldsymbol{\omega}_1$, $\ \mathbf{B}_L = \mathrm{diag}(\boldsymbol{\Omega}) \cdot \boldsymbol{\omega}_2$

Components of the model:
  – Raman spectra: $\mathbf{y}_{ij}$
  – Fixed effects: $\boldsymbol{\mu}_i$
  – Normal effects: $\mathbf{W}\boldsymbol{\theta}_{ij}$
  – Defective effects: $\mathbf{W}_a\boldsymbol{\delta}_{ij}$
  – Signal-dependent noise: $\mathbf{e}_{ij}$

Physical quality features: fabrication consistency, uniformity, defects. Modeling inputs: product tolerance, the signal-dependent property, and sparsity. The bounds are correlated with the signals.

  – Decompose in-line Raman spectra into fixed effects, normal effects, defective effects, and noise
  – Lay a foundation for monitoring consistency, uniformity, and defects simultaneously

How does this method separate the different components?

• Illustration
  – $\boldsymbol{\mu}_i$ is identified first. It captures the components that are common to a group of curves (fixed effects)
  – The method then puts as many components as possible into the normal effects $\mathbf{W}\boldsymbol{\theta}_{ij}$, because they incur no penalty in the loss function
  – The defective effects $\mathbf{W}_a\boldsymbol{\delta}_{ij}$ are identified by the $L_1$ norm regularization, which encourages sparsity of defects

Comparison Between PMD, LMM & SSD

• LMM: Linear Mixed-effects Model
• PMD: Penalized Mixed-effects Decomposition
• SSD: Smooth Sparse Decomposition

Algorithm for the PMD

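Algorithm 1: Accelerated proximal gradient (APG) based algorithm for the PMD

Model: $\mathbf{y}_{ij} = \boldsymbol{\mu}_i + \mathbf{W}\boldsymbol{\theta}_{ij} + \mathbf{W}_a\boldsymbol{\delta}_{ij} + \mathbf{e}_{ij}$

For $i = 1:N$
  Initialization: $\boldsymbol{\delta}_{ij}^{(0)} = \mathbf{0}$, $\ L = \frac{2}{\min(\mathrm{diag}(\boldsymbol{\Omega}))}\|\mathbf{W}_a\|_2^2$, $\ \boldsymbol{\mu}_i = \mathrm{median}(\mathbf{y}_{i\cdot})$, $\ \mathbf{x}^{(0)} = \mathbf{0}$
  While $\|\boldsymbol{\delta}_{ij}^{(k-1)} - \boldsymbol{\delta}_{ij}^{(k)}\| > \epsilon$
    Let $\boldsymbol{\theta}_{ij}^* = \mathbf{W}^T(\mathbf{y}_{ij} - \boldsymbol{\mu}_i - \mathbf{W}_a\boldsymbol{\delta}_{ij}) \cdot I(\mathbf{B}_L \preceq \mathbf{y}_{ij} - \boldsymbol{\mu}_i - \mathbf{W}_a\boldsymbol{\delta}_{ij} \preceq \mathbf{B}_U) + \mathbf{W}^T\mathbf{B}_L \cdot I(\mathbf{y}_{ij} - \boldsymbol{\mu}_i - \mathbf{W}_a\boldsymbol{\delta}_{ij} \preceq \mathbf{B}_L) + \mathbf{W}^T\mathbf{B}_U \cdot I(\mathbf{y}_{ij} - \boldsymbol{\mu}_i - \mathbf{W}_a\boldsymbol{\delta}_{ij} \succeq \mathbf{B}_U)$
    Update $\boldsymbol{\delta}_{ij}^{(k)} = S_{\gamma/L}\!\left(\mathbf{x}^{(k-1)} + \frac{2}{L}\mathbf{W}_a^T\boldsymbol{\Omega}^{-1}\left(\mathbf{y}_{ij} - \boldsymbol{\mu}_i - \mathbf{W}\boldsymbol{\theta}_{ij}^* - \mathbf{W}_a\mathbf{x}^{(k-1)}\right)\right)$
    Update $t_k = \left(1 + \sqrt{1 + 4t_{k-1}^2}\right)/2$
    Update $\mathbf{x}^{(k)} = \boldsymbol{\delta}_{ij}^{(k-1)} + \frac{t_{k-1} - 1}{t_k}\left(\boldsymbol{\delta}_{ij}^{(k-1)} - \boldsymbol{\delta}_{ij}^{(k-2)}\right)$
  End
End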
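The structure of the accelerated proximal gradient iteration can be sketched for a deliberately simplified case. The assumptions here are not from the slides: both bases are taken to be the identity ($\mathbf{W} = \mathbf{W}_a = \mathbf{I}$), $\boldsymbol{\Omega}$ is diagonal, the fixed effect is given, and the normal effect is evaluated at the extrapolated point so that the loop is a standard FISTA-style iteration. It should be read as an illustration of the clip / soft-threshold / momentum pattern, not as the authors' implementation.

```python
def soft_threshold(v, t):
    magnitude = max(abs(v) - t, 0.0)
    return magnitude if v > 0 else -magnitude


def clip(v, lo, hi):
    return min(max(v, lo), hi)


def pmd_identity(y, mu, omega, b_lo, b_hi, gamma, tol=1e-10, max_iter=10000):
    """Simplified decomposition y = mu + theta + delta + e with identity bases.

    Minimizes sum(e_p**2 / omega_p) + gamma * ||delta||_1
    subject to b_lo <= theta <= b_hi.
    """
    n = len(y)
    r = [yi - mi for yi, mi in zip(y, mu)]   # profile with the fixed effect removed
    L = 2.0 / min(omega)                     # Lipschitz constant of the smooth part
    delta_prev, x, t_prev = [0.0] * n, [0.0] * n, 1.0
    for _ in range(max_iter):
        # Normal effect: project what delta does not explain onto the bounds.
        theta = [clip(r[p] - x[p], b_lo[p], b_hi[p]) for p in range(n)]
        # Defective effect: gradient step followed by soft-thresholding.
        delta = [soft_threshold(x[p] + (2.0 / L) * (r[p] - theta[p] - x[p]) / omega[p],
                                gamma / L) for p in range(n)]
        # Momentum (extrapolation) step.
        t = (1.0 + (1.0 + 4.0 * t_prev ** 2) ** 0.5) / 2.0
        x = [delta[p] + ((t_prev - 1.0) / t) * (delta[p] - delta_prev[p]) for p in range(n)]
        change = max(abs(a - b) for a, b in zip(delta, delta_prev))
        delta_prev, t_prev = delta, t
        if change < tol:
            break
    theta = [clip(r[p] - delta_prev[p], b_lo[p], b_hi[p]) for p in range(n)]
    return theta, delta_prev


# Hypothetical profile: small in-tolerance variation plus one spike at index 2.
theta, delta = pmd_identity(
    y=[0.3, -0.4, 5.0, 0.2], mu=[0.0] * 4, omega=[1.0] * 4,
    b_lo=[-1.0] * 4, b_hi=[1.0] * 4, gamma=0.5,
)
```

In this toy case the in-tolerance points are absorbed entirely by the bounded normal effect, and only the spike is assigned to the sparse defective effect.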

Surrogate Data Analysis

• Surrogate Data Analysis Set-up
  – Three types of defects in different bands of the profiles.
  – Signal-dependent noise is generated by $\mathbf{e} \sim N(\mathbf{0}, \mathrm{diag}(\mathbf{S}) + 42)$.
• Criteria to evaluate the performance
  – Detection Rate (DR): fraction of the real defect points that are detected;
  – False Alarm Rate (FAR): fraction of the non-defective points that are falsely classified as defects;
  – Detected Peak Intensity Difference (DPID): $\frac{1}{n_d}\sum_{i=1}^{n_d} \frac{|\max I_i - \max I_i^*|}{|\max I_i^*|}$, where $\max I_i$ denotes the detected peak intensity and $\max I_i^*$ denotes the real peak intensity;
  – Mean Square Error (MSE);
  – Computation Time.
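The first three evaluation criteria are simple ratios, and a short sketch makes their definitions concrete. The function names and the example labels below are hypothetical; defect labels are represented as one boolean per profile point.

```python
def detection_rate(is_defect, flagged):
    """DR: share of the real defect points that were flagged."""
    hits = sum(1 for truth, flag in zip(is_defect, flagged) if truth and flag)
    return hits / sum(1 for truth in is_defect if truth)


def false_alarm_rate(is_defect, flagged):
    """FAR: share of the non-defective points that were wrongly flagged."""
    false_alarms = sum(1 for truth, flag in zip(is_defect, flagged) if not truth and flag)
    return false_alarms / sum(1 for truth in is_defect if not truth)


def dpid(detected_peaks, real_peaks):
    """DPID: mean relative difference between detected and real peak intensities."""
    ratios = [abs(d - r) / abs(r) for d, r in zip(detected_peaks, real_peaks)]
    return sum(ratios) / len(ratios)


# Hypothetical example: 8 profile points, 2 real defects, 2 flagged points.
is_defect = [False, False, True, True, False, False, False, False]
flagged = [False, True, True, False, False, False, False, False]

dr = detection_rate(is_defect, flagged)
far = false_alarm_rate(is_defect, flagged)
peak_error = dpid(detected_peaks=[90.0, 110.0], real_peaks=[100.0, 100.0])
```

A method can reach a high DR trivially by flagging everything, which is why DR is always read together with FAR.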

Simulation Results

Metric                                      LMM        SSD      PMD
Detection Rate (DR)                         100.00%    13.33%   100.00%
False Alarm Rate (FAR)                      85.69%     2.21%    2.63%
Detected Peak Intensity Difference (DPID)   77.73%     74.10%   30.95%
Mean Square Error (MSE)                     9.93       10.06    9.18
Computation Time                            1203.27s   0.55s    0.94s

• The proposed PMD gives the best overall trade-off: its DR matches the LMM (100%) and far exceeds the SSD (13.33%), its FAR (2.63%) is close to the SSD (2.21%) and far below the LMM (85.69%), and it has the lowest DPID and MSE.

Summary

• Contributions
  – A novel, accurate, and computationally efficient data decomposition method for detection and monitoring of high-dimensional, functional, nonlinear profiles.
  – An APG-based algorithm is developed to handle the parameter estimation efficiently, which meets the real-time monitoring requirement.
• Impacts
  – For nanomanufacturing: enables simultaneous monitoring of fabrication consistency, uniformity, and defect information.
  – It can be applied to other datasets that share several characteristics: multichannel, high dimension, signal-dependent noise, and a mixture of different physical information.


Topic 3: Tensor Mixed Effects (TME) Model and Applications in Nanomanufacturing Inspection*

• Yue, X., Park, J.G., Liang, Z., Shi, J., 2019, “Tensor Mixed Effects (TME) Model and Applications in Nanomanufacturing Inspection”, Technometrics. (in press). (Best Paper award finalist, the 2017 INFORMS Data Mining Section)

Raman Mapping Data

• Raman Mapping Measurement

[Figure: CNTs buckypaper continuous production; buckypaper microscope image]

Characteristics of Raman mapping: high-dimensional, mixed-effects, different correlations in multiple dimensions.

• Complex spatial-temporal correlation structures
• A tensor is an efficient mathematical tool for formulating Raman mapping data.
• The complex correlations and mixed effects should be considered during modeling.
• Tensor Decomposition
  – Tucker decomposition (Kolda and Bader 2009)
  – GLM model in tensor domain (Zhou et al. 2013)

Literature Review

Tucker Decomposition (Tucker 1966, Kolda and Bader 2009)

Generalized Linear Tensor Regression Model (GLTR) (Zhou et al. 2013)

• Limitation: none of these methods considers random effects.

• Linear Mixed Effects Model
  – The linear mixed effects model is widely used (Demidenko 2013; Galecki and Burzykowski 2013).
  – Merits: (i) it can handle multilevel hierarchical data; (ii) it takes complex association structures into consideration.
  – Limitation: it cannot handle the tensor responses of Raman mapping, due to ultra-high dimensionality and complex correlations.

Literature Review

Vectorized Linear Mixed Effect (vLME) model

$\mathbf{y}_i = \mathbf{X}_i\boldsymbol{\beta} + \mathbf{Z}_i\mathbf{b}_i + \boldsymbol{\varepsilon}_i$ (fixed effects + random effects + noise)

Vectorization: a $J \times J \times J$ tensor becomes a vector of length $J^3$.

Limitations of the vLME model
• The dimension after vectorization is high, and the computation cost is high. Complexity: $\mathcal{O}(NJ^9)$
• It destroys the inherent multi-way correlation structures.
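A quick count shows why vectorizing a $J \times J \times J$ response becomes expensive. The sketch below compares the number of entries in an unstructured covariance matrix for the vectorized response with the number of entries in three separate $J \times J$ mode-wise covariance matrices, the kind of structure a tensor model can exploit. The function names and the choice J = 10 are illustrative only.

```python
def vectorized_cov_entries(J):
    """Entries in the (J^3 x J^3) covariance of the vectorized J x J x J response."""
    return (J ** 3) ** 2


def modewise_cov_entries(J):
    """Entries in three J x J covariance matrices, one per tensor mode."""
    return 3 * J * J


J = 10
full = vectorized_cov_entries(J)
separable = modewise_cov_entries(J)
ratio = full // separable
```

The gap grows as $J^4$, which is one way to see why the vectorized model quickly becomes impractical as the tensor dimensions increase.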

Tensor Mixed Effects Model

• Motivation
  – Model fixed effects and random effects.
  – Handle multi-dimensional arrays.
  – Exploit the correlations in different dimensions.
• Model

$\mathcal{Y}_i = \mathcal{F} \times_1 \mathbf{A}_i^{(1)} \times_2 \mathbf{A}_i^{(2)} \times_3 \mathbf{A}_i^{(3)} + \mathcal{R}_i \times_1 \mathbf{B}_i^{(1)} \times_2 \mathbf{B}_i^{(2)} \times_3 \mathbf{B}_i^{(3)} + \mathcal{E}_i \quad (1)$

$\mathcal{R}_i \sim N_{P_2, Q_2, R_2}(\mathcal{O};\ \boldsymbol{\Sigma}_r, \boldsymbol{\Psi}_r, \boldsymbol{\Omega}_r)$, $\quad \mathcal{E}_i \sim N_{J, K, L}(\mathcal{O};\ \boldsymbol{\Sigma}_\varepsilon, \boldsymbol{\Psi}_\varepsilon, \boldsymbol{\Omega}_\varepsilon)$

  – $\mathcal{Y}_i$: response tensor, $J \times K \times L$
  – $\mathcal{F}$: fixed effects core tensor, $P_1 \times Q_1 \times R_1$, with fixed effects design matrices $\mathbf{A}_i^{(1)}, \mathbf{A}_i^{(2)}, \mathbf{A}_i^{(3)}$
  – $\mathcal{R}_i$: random effects core tensor, $P_2 \times Q_2 \times R_2$, with random effects design matrices $\mathbf{B}_i^{(1)}, \mathbf{B}_i^{(2)}, \mathbf{B}_i^{(3)}$
  – $\mathcal{E}_i$: noise tensor, $J \times K \times L$

Topics Investigated to Develop Tensor Mixed Effects (TME) Model

  – Inference of the TME model
  – Existence conditions of the MLE
  – Identifiability of the TME model
  – Double Flip-Flop algorithm for parameter estimation
  – Convergence investigations
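The mean structure of the TME model is built from mode products of a small core tensor with one design matrix per mode. The sketch below spells this out with nested lists for three-way tensors; the helper names `tucker3`, `tensor_add`, and `tme_response` are hypothetical, and the random effects core and noise tensor are passed in as given arrays rather than sampled from tensor normal distributions.

```python
def tucker3(core, A1, A2, A3):
    """Compute core x1 A1 x2 A2 x3 A3 for a 3-way core tensor.

    Y[j][k][l] = sum over p, q, r of core[p][q][r] * A1[j][p] * A2[k][q] * A3[l][r]
    """
    P, Q, R = len(core), len(core[0]), len(core[0][0])
    return [[[sum(core[p][q][r] * A1[j][p] * A2[k][q] * A3[l][r]
                  for p in range(P) for q in range(Q) for r in range(R))
              for l in range(len(A3))]
             for k in range(len(A2))]
            for j in range(len(A1))]


def tensor_add(X, Y):
    """Element-wise sum of two 3-way tensors of the same shape."""
    return [[[x + y for x, y in zip(row_x, row_y)]
             for row_x, row_y in zip(slab_x, slab_y)]
            for slab_x, slab_y in zip(X, Y)]


def tme_response(F, A, R, B, E):
    """Fixed-effects term + random-effects term + noise tensor."""
    fixed = tucker3(F, *A)
    random_part = tucker3(R, *B)
    return tensor_add(tensor_add(fixed, random_part), E)


# Hypothetical 1 x 1 x 1 core expanded to a 3 x 2 x 1 response.
core = [[[2]]]
expanded = tucker3(core, A1=[[1], [2], [3]], A2=[[1], [1]], A3=[[1]])
```

Because each design matrix acts on only one mode, the correlation structure along each dimension of the response is kept separate instead of being flattened into a single long vector.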

Simulation Setup and Convergence Results

• Simulation Set-up
  – We generate tensors with dimension 30×5×5. The dimensions of the core tensors of the fixed effects and random effects are 8×3×3 and 3×2×2, respectively.
  – Covariance matrices of the random effects are generated randomly. Covariance matrices of the residual errors are generated as random diagonal matrices.
  – 1000 response tensors are generated to test the performance.
• Convergence criteria versus iteration history
  – $L_1$ norm of the difference between the parameters in two successive iterations, divided by the number of entries:

    $\|\hat{\mathbf{X}}_i^{k} - \hat{\mathbf{X}}_i^{k-1}\|_1 / (J \cdot J)$, $\quad \mathbf{X} = \boldsymbol{\Sigma}, \boldsymbol{\Psi}, \boldsymbol{\Omega}$

[Figure: convergence curves for the 1st Flip-flop and the 2nd Flip-flop]

Simulation: Parameter Estimation Results


Case Study

• Experiment Set-up
• Alignment
  • Multi-walled CNT buckypaper is measured with the Raman mapping technique before and after alignment.

$\boldsymbol{\Psi}, \boldsymbol{\Omega}$ have potential to be used to represent the CNT alignment of the Buckypaper.

Initial Case Study

[Figure: coefficient (a.u.) versus stretch ratio (%). (a) $\boldsymbol{\Psi}_r$: entries $\boldsymbol{\Psi}_r(1,2)$ and $\boldsymbol{\Psi}_r(1,3)$. (b) $\boldsymbol{\Omega}_r$: entries $\boldsymbol{\Omega}_r(1,2)$ and $\boldsymbol{\Omega}_r(1,3)$.]

Summary

• Contributions
  – We proposed a novel TME model:
    • it can handle multilevel hierarchical data;
    • it takes correlation along different dimensions into consideration;
    • it can analyze mixed effects for high-dimensional datasets.
  – We derived the MLE and explored the existence and identifiability of the TME.
  – An iterative double Flip-Flop algorithm has been developed for parameter estimation of the TME model, and its complexity is analyzed.
• Impacts
  – For nanomanufacturing: the TME has the potential to quantify the degree of alignment of CNT buckypaper.
  – It may also be applied to other datasets with tensor structure, mixed effects, and a suitable sample size.

Nanopowder Manufacturing Process Control

*Reference: Oljaca, M. et al. (2002), Flame synthesis of nanopowders via combustion chemical vapor deposition, Journal of Materials Science Letters, 21, 621–626.

Nanopowder Manufacturing Scale-up

Goal: 1 kg/day to 1000 kg/day

Challenges:
• Nano-metrology analysis for process control
• Variation propagation in multi-stage manufacturing process
• Process control capability

[Figure: atomizer; workflow boxes: Control Objective, Engineering Knowledge, Data, Statistical Model Calibration, Predictive Model Development, Quality Indices, Control & Evaluation]

Chang, C.-J., Plumlee, M., Shi, J., 2011, “A Predictive Model of Nanomiser Energy and Its Application in System Monitoring”, Technical Report to Department of Energy and nGimat Company

Physics-based Feature Extraction & Predictive Model

• Objective: translate and redefine the nonlinear dynamic system as a linear model

[Figure: Nanomixer (nonlinear system): the inputs $X_1$, $X_2$ (solution flow rate and system setting input), together with process randomness, produce the system output $Y$.
Physics-based predictive model: the engineering feature transformation $u_1 = f_1(X_1, X_2)$, $u_2 = f_2(X_1, X_2)$ feeds a linear system (ARIMA model between $Y$ and $u_1$, $u_2$), which, together with process randomness, produces the system output $Y$.]

Physics-based Data-Driven Model: Model Validation

04/27/2011

[Figure: system inputs $X_1$, $X_2$ (input setting and solution flow rate) and system output $Y$ (solution energy) versus time. Black: energy measurements; green: model predictions.]


Summary

– A novel generalized wavelet shrinkage (GWS) method was proposed to efficiently remove the signal-dependent noise in in situ Raman sensing signals.

– A new machine-learning-enabled algorithm, penalized mixed-effects decomposition (PMD), was developed to decompose in-line profiles into four components: fixed effects, normal effects, defective effects, and noise.

– A novel tensor mixed-effects (TME) model was developed to analyze massive high-dimensional data with complex spatial-temporal correlation structure.

– Engineering-driven data analytics plays an important role in data-enabled design and manufacturing.

Thank You!
