Engineering-driven Data Analytics for In Situ Process Monitoring of Nano manufacturing Jianjun Shi The Carolyn J. Stewart Chair and Professor The H. Milton Stewart School of Industrial and Systems Engineering Georgia Institute of Technology (contact: [email protected] ; http://pwp.gatech.edu/jianjun-shi/)


Post on 18-Oct-2020


  • Engineering-driven Data Analytics for In Situ Process Monitoring of Nano manufacturing

    Jianjun Shi

    The Carolyn J. Stewart Chair and Professor, The H. Milton Stewart School of Industrial and Systems Engineering

    Georgia Institute of Technology

    (contact: [email protected]; http://pwp.gatech.edu/jianjun-shi/)


  • Outline

    • Overview of Data Fusion for Quality Improvement Research

    • Data analytics for in situ process monitoring of nanomanufacturing
        – Generalized Wavelet Shrinkage of In-line Raman Spectra
        – Penalized Mixed-effects Decomposition
        – Tensor Mixed Effects Model
        – Physics-based Feature Extraction & Predictive Model

    • Summary

  • Overview of Data Fusion for Quality Improvement

    [Figure: Manufacturing System, Product Realization and Data Fusion. Labels: manufacturing/executing operations numbered 1 to 7; in-process and maintenance data; raw data, information, knowledge, decision.]

    Data Fusion enables improvements across all stages of the product lifecycle, from product conceptualization, to design, production, service, and logistics.

  • Challenges and Opportunities

    Opportunities
      • Ubiquitous availability of data
      • System operations become transparent
      • Technological capabilities and flexibility of individual machines
      • Advancements in data science, machine learning, and computing capabilities
      • …

    Challenges
      • Define clear engineering objectives and quickly retrieve all relevant information from all stages of a manufacturing system
      • Effectively address data uncertainties and noise
      • The “imbalance” in data availability: massive normal operational data, few specific failure data
      • Lack of a unified model/strategy for making (real-time) informed decisions
      • Lack of deep integration of data science with design and manufacturing engineering
      • …

  • Interdisciplinary Framework: Fusion of Engineering, Data Science, OR/Control

    [Figure: three domains: Data Science (Statistics/CS/Signal Processing); OR/Control; Engineering/Domain Knowledge.]

  • Stream of Variation Methodologies for Multistage Manufacturing Processes

    Shi, J., 2006, “Stream of Variation Modeling and Analysis for Multistage Manufacturing Processes”, CRC Press, 469pp.

    Shi, J. and Zhou, S., 2009, “Quality Control and Improvement for Multistage Systems: A Survey”, IIE Transactions on Quality and Reliability Engineering, Vol. 41, pp744-753.

  • Multistage Manufacturing System: Processes with multiple workstations and/or multiple operations

    [Figure: three example systems: Automotive Body Assembly; Roll-to-Roll Nano Buckypaper Mfg System; 3D Printing / Multilayer Additive Manufacturing (Level 1 to Level 4).]

    Challenges:
      - Variation propagation modeling
      - Tolerance synthesis
      - Root cause diagnosis
      - Distributed sensing
      - Critical station identification
      - Automatic compensation

  • Data Enabled Design and Control of Multistage Manufacturing System

    [Figure labels: Parts; Station 1 ... Station k ... Station N; End-of-line sensor; Distributed sensors; Engineering Knowledge (CAD and CAPP); Variation Model; System & Control Theory; Statistical analysis; Root Causes; Feedback to Design (Analytical); Statistical Process Control (SPC); Operator intervention; Feedback to Design (Heuristics: DOs & DON’Ts).]

    Data Fusion Enabled Process Monitoring, Root Cause Diagnosis, Defect Prevention, and Feedback to Design

  • SoV Theory and Application

    Integration of Design & Manufacturing Information
      1. Sensitivity-based Design Evaluation
      2. Process-oriented Tolerance Synthesis
      3. Optimal Sensor Distribution Strategy
      4. Diagnosability Study
      5. Fault Diagnosis
      6. Quality-Reliability Chain

    Stream-of-Variation Model

      $x_k = A_{k-1} x_{k-1} + B_k u_k + w_k$

      $y_k = C_k x_k + v_k$

    • The SoV theory provides a unified framework for variation reduction in multistage manufacturing systems.

    • The SoV theory has been implemented by automotive and aerospace companies and their suppliers.

    [Figure labels: Product Design; Manufacturing System; Stage 1, ..., Stage k, ..., Stage N with $x_{k-1}$, $x_k$, $u_k$, $w_k$, $y_k$; single-station model; multi-station model; tolerancing; die fabrication; tooling error; stamping error; assembly line; assembly system; inspection; sensitivity; diagnosis; variation analysis; mfg control; modeling; design opt. Image source: http://www.gom.com/Images/big/blanks08.jpg]
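    The stream-of-variation recursion on this slide can be illustrated with a short simulation. The following is a minimal sketch in Python and not the SoV implementation itself: the function name `simulate_sov`, the matrices, and the inputs are hypothetical, and the noise is assumed Gaussian. List index `k` holds the matrices for stage `k + 1`, so `A[k]` plays the role of $A_{k-1}$ in the slide notation.

```python
import random

def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def vadd(*vectors):
    """Element-wise sum of equally sized vectors."""
    return [sum(vals) for vals in zip(*vectors)]

def simulate_sov(A, B, C, u, x0, noise_std=0.0, rng=None):
    """Propagate x_k = A x_{k-1} + B u_k + w_k and observe y_k = C x_k + v_k."""
    rng = rng or random.Random(0)
    x, ys = list(x0), []
    for k in range(len(u)):
        w = [rng.gauss(0.0, noise_std) for _ in x]           # process noise w_k
        x = vadd(matvec(A[k], x), matvec(B[k], u[k]), w)      # state after stage k + 1
        y_clean = matvec(C[k], x)
        ys.append([val + rng.gauss(0.0, noise_std) for val in y_clean])  # sensor noise v_k
    return ys

# Hypothetical 3-stage line with a 2-dimensional deviation state.
identity = [[1.0, 0.0], [0.0, 1.0]]
A = [identity] * 3                      # deviations carry over unchanged between stages
B = [identity] * 3                      # each stage adds its own error u_k
C = [[[1.0, 1.0]]] * 3                  # each sensor measures the sum of both components
u = [[1.0, 0.0], [0.0, 2.0], [0.5, 0.0]]
ys = simulate_sov(A, B, C, u, x0=[0.0, 0.0])  # noise-free run
```

    With `noise_std=0.0` the measurements simply accumulate the stage errors, which makes the propagation of variation from upstream stages to downstream measurements easy to see.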

  • Physics-Driven Machine Learning and Modeling for Quality Improvement

    • Li, J., and Shi, J., 2007, “Knowledge Discovery from Observational Data for Process Control through Causal Bayesian Networks”, IIE Transactions, Vol. 39, pp681-690.

    • Liu, K., Zhang, X. and Shi, J., 2013, “Adaptive Sensor Allocation Strategy for Process Monitoring and Diagnosis in a Bayesian Network”, IEEE Transactions on Automation Science and Engineering.

  • VISION: Three interrelated layers of networks: system, sensing, and decision making

    [Figure labels: Manufacturing System Network; Distributed Sensing Network; Extracted Knowledge and Intelligence for Decision Making; Process/Product design data; System operational data. The network over X1, ..., X6 is annotated with the factors P(X1), P(X2|X1), P(X3|X2), P(X4|X1), P(X5|X1), P(X6|X1, X5).]
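    The factors shown in the network figure define a joint distribution as their product. The sketch below is only an illustration: it assumes binary variables, and every probability value in `P_ONE` is made up for the example rather than taken from the talk.

```python
from itertools import product

# Parents of each node, read off the factors listed on the slide.
PARENTS = {
    "X1": (), "X2": ("X1",), "X3": ("X2",),
    "X4": ("X1",), "X5": ("X1",), "X6": ("X1", "X5"),
}

# Hypothetical P(node = 1 | parent values); illustrative numbers only.
P_ONE = {
    "X1": {(): 0.3},
    "X2": {(0,): 0.2, (1,): 0.7},
    "X3": {(0,): 0.1, (1,): 0.6},
    "X4": {(0,): 0.4, (1,): 0.5},
    "X5": {(0,): 0.25, (1,): 0.8},
    "X6": {(0, 0): 0.05, (0, 1): 0.5, (1, 0): 0.4, (1, 1): 0.9},
}

def joint(assignment):
    """P(X1..X6) = P(X1) P(X2|X1) P(X3|X2) P(X4|X1) P(X5|X1) P(X6|X1,X5)."""
    p = 1.0
    for node, parents in PARENTS.items():
        p_one = P_ONE[node][tuple(assignment[q] for q in parents)]
        p *= p_one if assignment[node] == 1 else 1.0 - p_one
    return p

all_ones = {name: 1 for name in PARENTS}
p_all_ones = joint(all_ones)
# Summing the joint over all 2^6 assignments should give 1.
total = sum(joint(dict(zip(PARENTS, values))) for values in product((0, 1), repeat=6))
```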

  • Physics-Driven Machine Learning and Modeling for Multistage Rolling Process

    [Figure: raw material/part, Stage 1, Stage 2, ..., Stage 5, final product, with intermediate products between stages.]

    • 50 to 80 roller stations on a miles-long line.
    • Each station has more than 15 typical variables (speed, temperature, force, lubrication, etc.).
    • Heterogeneous data with uncertainties.
    • Complex interactions among the variables.
    • No mathematical models link in-line product quality with the massive process data.
    • Cost and energy losses due to quality problems and defects are HIGH!

    Physics-Driven Machine Learning and Modeling: Causation-Based Process Monitoring, Diagnosis and Control

    [Figure: causal network over stage-free process variables (Z1, Z2) and stage-specific process/product variables (X1, X2, X3, ..., X10, X11, X12; Y1, Y2, ..., Y5).]

    Li, J., and Shi, J., 2007, “Knowledge Discovery from Observational Data for Process Control through Causal Bayesian Networks”, IIE Transactions, Vol. 39, pp681-690.

  • Additional Ongoing R&D Projects on Multistage Manufacturing Process Control

    Semiconductor (Samsung)
      - Intensive process and product data.
      - Data fusion for anomaly detection, defect prevention, and yield improvement.

    Boeing 787 Assembly (Boeing)
      - Dimensional errors exist in parts, fixtures, etc.
      - Variation modeling and shape control to improve quality and reduce cycle time for multistage composite part assembly.

    Nano Buckypaper Mfg System (NSF, Nano Scale Up)
      - Nano mfg scale-up needs new process and in-line sensors.
      - Integrated data fusion, monitoring, diagnosis, and predictive control.

  • Outline

    • Overview of Data Fusion Research

    • Data analytics for in situ process monitoring of nanomanufacturing
        – Generalized Wavelet Shrinkage of In-line Raman Spectra
        – Penalized Mixed-effects Decomposition
        – Tensor Mixed Effects Model
        – Physics-based Feature Extraction & Predictive Model

    • Summary

  • Nano Buckypaper Manufacturing Process Scale Up

    Nano Buckypaper

    - Nano Buckypaper has favorable mechanical, thermal, and electrical properties.
    - It can be fabricated in batch mode, but roll-to-roll control is desired.
    - In-situ Raman metrology for quality assessment is essential.

  • Challenges

    • Noise Challenge
        • Signal-dependent noise, instead of i.i.d. noise (Yue et al. 2017).

    • Data Challenge
        • About 600 Raman signals per minute.
        • Each Raman spectrum includes 1000+ Raman shifts and intensities.

    [Figure, two panels, SWCNT, acquisition time 1 s, a = 1, b = 41: (left) variance and the fitted line a·s + b (a.u.) versus Raman shift (cm⁻¹); (right) variance (a.u.) versus Raman intensity (a.u.).]

  • Challenges

    • Feature Challenge
        – Multiple features in profile level (Salzer and Siesler 2009)
            • Raman peak intensity corresponds to material concentration and distribution.
            • Peak frequency yields molecular structure and phase.
            • Bandwidth is associated with crystallinity and phase.
        – Multiple features in quality level
            • Manufacturing consistency
            • Uniformity
            • Defects

  • Challenges

    • Spatial-Temporal Correlation
        – Spatial: aligned Nano-Buckypaper
        – Temporal: along Raman shift/frequency

    [Figure: alignment.]

  • Challenges

    • Noise Challenge
        – Not i.i.d. Gaussian noise
        – Data acquisition time vs. signal-to-noise ratio (SNR)
        – Signal-dependent property

    • Data Challenge
        – Large data size: 600+ profiles/minute
        – High dimensional: spatial dimensions + each profile with 1020+ Raman shifts/frequencies

    • Features Challenge
        – Multiple features in profile level
        – Multiple features in quality level
        – Feature-feature mixture, feature-noise mixture

    • Spatial-Temporal Correlation
        – Spatial: aligned Nano-Buckypaper
        – Temporal: along Raman shift/frequency

    Research topics addressing these challenges:
        Topic 1: Generalized Wavelet Shrinkage (data denoising)
        Topic 2: Penalized Mixed-Effects Decomposition (data decomposition)
        Topic 3: Tensor Mixed-Effects Model (multi-array data correlation)

  • Topic 1: Generalized Wavelet Shrinkage of In-line Raman Spectroscopy

    • Yue, X., Wang, K., Yan, H., Park, J.G., Liang, Z., Zhang, C., Wang, B. and Shi, J., 2017, “Generalized Wavelet Shrinkage of Inline Raman Spectroscopy for Quality Monitoring of Continuous Manufacturing of Carbon Nanotube Buckypaper”, IEEE Transactions on Automation Science and Engineering, 14(1), pp.196-207. (Best Paper Award of IEEE T-ASE in 2017)

  • Motivation

    • Shorten Data Acquisition Time
        – Decrease control bandwidth for process monitoring and control
        – Decrease the material-heterogeneity-dependent noise and provide a better in-line monitoring capability

    • High Signal-to-Noise Ratio (SNR)
        – To reduce the risk that important features are overlooked
        – To increase accuracy in peak analysis, e.g., the intensity ratio of the D/G bands is correlated with the degree of functionalization

    • Goal: develop a technique to denoise Raman spectra measured at high speed

  • Signal Modeling & Validation

    • Snyder's Model (Snyder et al., 1993)

        $y_i = n_i^s + g_i$, with $n_i^s \sim \mathrm{Poisson}(\mu_i^s)$, $g_i \sim N(0, \varepsilon_g)$, $\mu_i^s > 107$

        $y_i$: data collected by the $i$th frequency bin
        $n_i^s$: number of photoelectrons
        $g_i$: readout noise from the amplifier

    • The Pearson correlation coefficient (PCC) between the variance vector and the signal is 0.958 for SWCNT and 0.901 for MWCNT (strongly correlated).

    • The linear relationship does not depend on exposure time or material type; the validated linear parameters are relatively stable.

    Conclusion:

        $\mathbf{Y} = \mathbf{S} + \boldsymbol{\Sigma} \sim N(\mathbf{S},\; a \cdot \mathrm{diag}(\mathbf{S}) + b \cdot \mathbf{I})$

        $\mathbf{Y}$: observed Raman intensity
        $\mathbf{S}$: real Raman intensity
        $\boldsymbol{\Sigma}$: signal-dependent noise

    [Figure, two panels, SWCNT, acquisition time = 1 s, a = 1, b = 41: (left) variance and the fitted line a·s + b (a.u.) versus Raman shift (cm⁻¹); (right) variance (a.u.) versus Raman intensity (a.u.).]
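    The linear variance model in the conclusion can be checked on synthetic data. The sketch below is an assumption-laden illustration, not the validation procedure used in the talk: it draws repeated Gaussian spectra whose variance is $a \cdot s_i + b$ (with a = 1 and b = 41 taken from the figure title), then recovers the line by ordinary least squares on the per-bin sample variance against the per-bin sample mean. The function names and the ramp-shaped "true" signal are hypothetical.

```python
import random
import statistics

def simulate_spectra(signal, a, b, n_rep, rng):
    """Draw n_rep noisy copies of `signal` with Var[y_i] = a * s_i + b."""
    return [[rng.gauss(s, (a * s + b) ** 0.5) for s in signal] for _ in range(n_rep)]

def fit_variance_line(spectra):
    """Least-squares fit of per-bin sample variance against per-bin sample mean."""
    columns = list(zip(*spectra))
    means = [statistics.fmean(col) for col in columns]
    variances = [statistics.variance(col) for col in columns]
    m_bar, v_bar = statistics.fmean(means), statistics.fmean(variances)
    sxy = sum((m - m_bar) * (v - v_bar) for m, v in zip(means, variances))
    sxx = sum((m - m_bar) ** 2 for m in means)
    slope = sxy / sxx
    return slope, v_bar - slope * m_bar

rng = random.Random(0)
signal = [100.0 + 10.0 * i for i in range(200)]   # hypothetical "true" intensities
spectra = simulate_spectra(signal, a=1.0, b=41.0, n_rep=400, rng=rng)
a_hat, b_hat = fit_variance_line(spectra)          # estimates of slope a and intercept b
```

    The slope estimate is much more stable than the intercept here, because the intercept is small relative to the variance at high intensities.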

  • Wavelet Shrinkage (WS)

    • State-of-the-art
        – Wavelet shrinkage is asymptotically optimal for recovering objects from certain functional classes, such as Besov spaces (Candes, 2006)
        – Soft and hard shrinkage are two widely used thresholding rules
        – Thresholding methods: VisuShrink, RiskShrink, SureShrink (Donoho and Johnstone, 1995; Johnstone and Silverman, 1997)

    • Soft Shrinkage

        $\min_{\mathbf{W}} \; (\mathbf{Y} - \boldsymbol{\Gamma}^T \mathbf{W})^T (\mathbf{Y} - \boldsymbol{\Gamma}^T \mathbf{W}) + \lambda \|\mathbf{W}\|_{\ell_1}$

        $\hat{w}_i^J = \eta_s(w_i^J, \lambda/2) = \mathrm{sign}(w_i^J)\,(|w_i^J| - \lambda/2)_+$

    • Hard Shrinkage

        $\min_{\mathbf{W}} \; (\mathbf{Y} - \boldsymbol{\Gamma}^T \mathbf{W})^T (\mathbf{Y} - \boldsymbol{\Gamma}^T \mathbf{W}) + \lambda \|\mathbf{W}\|_{\ell_0}$

        $\hat{w}_i^J = \eta_h(w_i^J, \lambda) = \mathrm{sign}(w_i^J) \cdot |w_i^J| \cdot I(|w_i^J| - \lambda > 0)$
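    The two thresholding rules on this slide act on one wavelet coefficient at a time. A minimal sketch, assuming the coefficients are already available as plain floats (the function names `soft_shrink` and `hard_shrink` are hypothetical):

```python
import math

def soft_shrink(w, lam):
    """Soft rule: sign(w) * (|w| - lam/2)_+ ; shrinks every surviving coefficient."""
    return math.copysign(max(abs(w) - lam / 2.0, 0.0), w)

def hard_shrink(w, lam):
    """Hard rule: keep w unchanged if |w| > lam, otherwise set it to zero."""
    return w if abs(w) > lam else 0.0

coeffs = [3.0, -3.0, 0.5, 1.5]
soft = [soft_shrink(w, 2.0) for w in coeffs]   # threshold lam/2 = 1.0
hard = [hard_shrink(w, 2.0) for w in coeffs]   # threshold lam = 2.0
```

    Soft shrinkage moves surviving coefficients toward zero by the threshold, while hard shrinkage leaves them untouched, which is why the two rules trade bias against variance differently.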

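  • Generalized Wavelet Shrinkage (GWS)

    • Idea
        – Consider the impact of the covariance matrix
        – Smoothing spline vs. generalized smoothing spline; Lasso vs. adaptive Lasso
        – Wavelet shrinkage => generalized wavelet shrinkage

    • Soft Shrinkage

        $\min_{\mathbf{W}} \; (\mathbf{Y} - \boldsymbol{\Gamma}^T \mathbf{W})^T \boldsymbol{\Omega}^{-1} (\mathbf{Y} - \boldsymbol{\Gamma}^T \mathbf{W}) + \lambda \|\mathbf{W}\|_{\ell_1}$

        $\hat{w}_i^J = \eta_s\!\left(w_i^J, \frac{\lambda}{2 k_i^J}\right) = \mathrm{sign}(w_i^J)\left(|w_i^J| - \frac{\lambda}{2 k_i^J}\right)_+$

    • Hard Shrinkage

        $\min_{\mathbf{W}} \; (\mathbf{Y} - \boldsymbol{\Gamma}^T \mathbf{W})^T \boldsymbol{\Omega}^{-1} (\mathbf{Y} - \boldsymbol{\Gamma}^T \mathbf{W}) + \lambda \|\mathbf{W}\|_{\ell_0}$

        $\hat{w}_i^J = \eta_h\!\left(w_i^J, \frac{\lambda}{k_i^J}\right) = \mathrm{sign}(w_i^J) \cdot |w_i^J| \cdot I\!\left(|w_i^J| - \frac{\lambda}{k_i^J} > 0\right)$

    – Thresholds are determined adaptively (from level-dependent to individual-dependent).
    – When the noise is not signal-dependent, $\boldsymbol{\Omega}$ is a scalar matrix and the GWS is equivalent to the WS.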
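    The generalized soft rule differs from ordinary soft shrinkage only in that each coefficient gets its own threshold $\lambda / (2 k_i^J)$. The sketch below assumes the weights $k_i^J$ are already given as a list; how they are derived from $\boldsymbol{\Omega}$ is not shown on the slide, so the values used here are purely illustrative.

```python
import math

def gws_soft(coeffs, weights, lam):
    """Coefficient-wise soft shrinkage with threshold lam / (2 * k_i)."""
    out = []
    for w, k in zip(coeffs, weights):
        threshold = lam / (2.0 * k)
        out.append(math.copysign(max(abs(w) - threshold, 0.0), w))
    return out

coeffs = [3.0, -3.0, 0.4]
# Hypothetical weights: a small k_i means a large threshold for that coefficient.
adaptive = gws_soft(coeffs, weights=[1.0, 0.25, 1.0], lam=2.0)
# With equal weights the rule reduces to ordinary soft shrinkage (threshold lam/2).
uniform = gws_soft(coeffs, weights=[1.0, 1.0, 1.0], lam=2.0)
```

    The `uniform` case mirrors the remark on the slide that GWS coincides with WS when the noise is not signal-dependent.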

  • Wavelet Denoising Procedure

    (a) Current WS procedure: input data, wavelet transformation, get level-dependent threshold, soft/hard shrinkage, inverse wavelet transformation, output data.

    (b) Proposed GWS procedure (Phase I and Phase II). Flowchart boxes: historical data; segmentation; signal-noise relationship; penalized parameter by cross validation; input data; wavelet transformation; get individual-dependent threshold; soft/hard shrinkage; inverse wavelet transformation; output data.
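    The transform, shrink, inverse-transform pipeline can be sketched end to end in a few lines. This is a simplified illustration under stated assumptions: a single-level orthonormal Haar transform stands in for the multi-level wavelet transform, only soft shrinkage is shown, and the thresholds are passed in directly instead of being learned from historical data. Passing the same threshold for every detail coefficient mimics a level-dependent rule; passing different values mimics an individual-dependent rule.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_forward(x):
    """One-level orthonormal Haar transform of an even-length sequence."""
    approx = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Invert haar_forward."""
    x = []
    for a, d in zip(approx, detail):
        x.extend([(a + d) / SQRT2, (a - d) / SQRT2])
    return x

def denoise(x, thresholds):
    """Transform, soft-shrink each detail coefficient, and transform back."""
    approx, detail = haar_forward(x)
    shrunk = [math.copysign(max(abs(d) - t, 0.0), d) for d, t in zip(detail, thresholds)]
    return haar_inverse(approx, shrunk)

signal = [1.0, 3.0, 5.0, 5.0]
unchanged = denoise(signal, thresholds=[0.0, 0.0])      # zero thresholds: perfect reconstruction
smoothed = denoise(signal, thresholds=[100.0, 100.0])   # large thresholds: pairwise averages
```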

  • Case Study

    • Raman Spectroscopy (Renishaw™ inVia micro-Raman system)
        – Near infra-red (NIR) laser with a wavelength of 785 nm
        – A power of 150 mW was used to eliminate the effect of ambient light
        – A low-magnification lens was used to achieve a larger focus tolerance

  • Case Study

    • SWCNT Buckypaper ($P_1^*$ is the D band peak intensity; $P_2^*$ is the G band peak intensity)

    Acq. time  Method    Mean|P1−P1*|  Std|P1−P1*|  Mean|P2−P2*|  Std|P2−P2*|  Peak1 SNR  Peak2 SNR
    T=1        Raw data  15.58         10.31        29.76         19.42        5.41       17.86
               WS        8.21          7.4          14.99         8.73         8.97       37.04
               GWS       5.99          4.27         14.01         8.13         13.65      39.68
    T=0.5      Raw data  11.83         5.58         26.8          19.1         4.22       10.07
               WS        5.55          4.05         14.08         10.45        7.65       18.85
               GWS       5.04          3.77         12.06         7.27         8.33       23.92
    T=0.1      Raw data  12.63         5.63         13.45         10.2         1.11       5.46
               WS        4.39          2.47         8.38          7.47         2.94       8.15
               GWS       3.9           2.93         8.18          6.03         2.96       9.09
    T=0.05     Raw data  9.91          5.51         13.72         8.78         0.88       3.67
               WS        2.61          1.58         8.35          4.42         3.22       6.46
               GWS       2.53          1.58         6.44          4.01         3.28       7.89
    T=0.01     Raw data  10.67         3.66         13.2          8.95         0.59       2.30
               WS        2.42          2.02         6.36          4.85         1.91       4.55
               GWS       2.18          1.61         5.74          4.38         2.24       5.04

  • Summary

    • Noise Source and Pattern
        – The noise in Raman spectra includes photon shot noise, sample-generated noise, instrument-generated noise, computation-generated noise, and externally generated noise
        – The signal-dependent property is validated

    • Generalized Wavelet Shrinkage
        – GWS is proposed to make full use of the signal-dependent property to realize denoising and signal enhancement
        – GWS realizes individually adaptive wavelet thresholding, which outperforms the level-dependent conventional wavelet shrinkage
        – GWS can improve the SNR dramatically, or can be used to reduce data acquisition time without loss of SNR

    • Conditions for applying the GWS
        – The noise has the signal-dependent property (most spectroscopy techniques are based on quantification of photoelectrons)
        – The signal-dependence relationship is relatively stable

  • Topic 2: Penalized Mixed-effects Decomposition (PMD) for Multichannel Profile Monitoring of In-line Raman Spectroscopy

    • Yue, X., Yan, H., Liang, R., Shi, J., 2018, “A Wavelet-based Penalized Mixed-effects Model for Multichannel Profile Detection of In-line Raman Spectroscopy”, IEEE Transactions on Automation Science and Engineering. Vol. 15, Issue 5. pp 1258 - 1271

  • Objective


    • Monitoring multiple features simultaneously

    • Decompose nominal random effects and defective random effects

    • Quick computation for real-time monitoring

  • State-Of-The-Art in Profile Monitoring

    • Linear Profile Monitoring
        – Kang and Albin, 2000; Kim et al., 2003; Mahmoud et al., 2007.
        – Limitation: only suitable for linear profiles.

    • Nonlinear Profile Monitoring I
        – Polynomial regression, spline (Mosesova et al. 2006, Zou et al., 2008, Shiau et al. 2009, Chang and Yadama, 2010).
        – Limitation: only models smooth, differentiable profiles, so it is not applicable when the profiles include spikes or peaks.

    • Nonlinear Profile Monitoring II
        – PCA, wavelet, mixed-effects models (Mosesova et al. 2006, Jensen and Birch, 2009, Paynabar and Jin, 2011).

            $\mathbf{y}_i = \mathbf{X}_i \boldsymbol{\beta} + \mathbf{Z}_i \mathbf{b}_i + \boldsymbol{\varepsilon}_i$   (fixed effects + random effects + noise)

        – Limitations:
            • It cannot separate normal random effects from defective random effects.
            • Too many coefficients.
            • REML or EM-based methods cannot provide sufficient calculation speed.

  • Penalized Mixed-effects Decomposition

    $\arg\min_{\boldsymbol{\theta}_{ij}, \boldsymbol{\delta}_{ij}} \; \mathbf{e}_{ij}^T \boldsymbol{\Omega}^{-1} \mathbf{e}_{ij} + \gamma \|\boldsymbol{\delta}_{ij}\|_1$

    subject to
      $\mathbf{y}_{ij} = \boldsymbol{\mu}_i + \mathbf{W}\boldsymbol{\theta}_{ij} + \mathbf{W}_a\boldsymbol{\delta}_{ij} + \mathbf{e}_{ij}$,
      $\mathbf{B}_L \preceq \mathbf{W}\boldsymbol{\theta}_{ij} \preceq \mathbf{B}_U$,
      $\mathbf{B}_U = \mathrm{diag}(\boldsymbol{\Omega}) \cdot \boldsymbol{\omega}_1$, $\mathbf{B}_L = \mathrm{diag}(\boldsymbol{\Omega}) \cdot \boldsymbol{\omega}_2$

    Components:
      Raman spectra: $\mathbf{y}_{ij}$
      Fixed effect: $\boldsymbol{\mu}_i$
      Normal effect: $\mathbf{W}\boldsymbol{\theta}_{ij}$
      Defective effect: $\mathbf{W}_a\boldsymbol{\delta}_{ij}$
      Signal-dependent noise: $\mathbf{e}_{ij}$

    [Figure labels: physical quality features (fabrication consistency, uniformity, defects); product tolerance; signal-dependent property & sparsity.]

    – Decompose in-line Raman spectra into fixed effect, normal effect, defective effect, and noise
    – Lay a foundation for monitoring consistency, uniformity, and defects simultaneously

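  • How does this method separate the different components?

    • Illustration
        – $\boldsymbol{\mu}_i$ is identified first. It captures the common components (fixed effects) among a group of curves.
        – The method then puts as many components as possible into the normal effects $\mathbf{W}\boldsymbol{\theta}_{ij}$, because they incur no cost in the loss function.
        – The defective effects $\mathbf{W}_a\boldsymbol{\delta}_{ij}$ are identified by the $L_1$ norm regularization, which encourages sparsity of defects.

    $\arg\min_{\boldsymbol{\theta}_{ij}, \boldsymbol{\delta}_{ij}} \; \mathbf{e}_{ij}^T \boldsymbol{\Omega}^{-1} \mathbf{e}_{ij} + \gamma \|\boldsymbol{\delta}_{ij}\|_1$

    subject to
      $\mathbf{y}_{ij} = \boldsymbol{\mu}_i + \mathbf{W}\boldsymbol{\theta}_{ij} + \mathbf{W}_a\boldsymbol{\delta}_{ij} + \mathbf{e}_{ij}$,
      $\mathbf{B}_L \preceq \mathbf{W}\boldsymbol{\theta}_{ij} \preceq \mathbf{B}_U$,
      $\mathbf{B}_U = \mathrm{diag}(\boldsymbol{\Omega}) \cdot \boldsymbol{\omega}_1$, $\mathbf{B}_L = \mathrm{diag}(\boldsymbol{\Omega}) \cdot \boldsymbol{\omega}_2$

    Components: Raman spectra $\mathbf{y}_{ij}$; fixed effects $\boldsymbol{\mu}_i$; normal effects $\mathbf{W}\boldsymbol{\theta}_{ij}$; defective effects $\mathbf{W}_a\boldsymbol{\delta}_{ij}$; signal-dependent noise $\mathbf{e}_{ij}$.

    The bounds are correlated with the signals.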

  • Comparison Between PMD, LMM & SSD

    • LMM: Linear Mixed-effects Model
    • PMD: Penalized Mixed-effects Decomposition
    • SSD: Smooth Sparse Decomposition

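  • Algorithm for the PMD

    Algorithm 1: Accelerated proximal gradient (APG) based algorithm for the PMD

    Model: $\mathbf{y}_{ij} = \boldsymbol{\mu}_i + \mathbf{W}\boldsymbol{\theta}_{ij} + \mathbf{W}_a\boldsymbol{\delta}_{ij} + \mathbf{e}_{ij}$

    For $i = 1:N$

      Initialization: $\boldsymbol{\delta}_{ij}^{(0)} = \mathbf{0}$, $L = \frac{2}{\min(\mathrm{diag}(\boldsymbol{\Omega}))} \|\mathbf{W}_a\|_2^2$, $\boldsymbol{\mu}_i = \mathrm{median}(\mathbf{y}_{i\cdot})$, $\mathbf{x}^{(0)} = \mathbf{0}$.

      While $\|\boldsymbol{\delta}_{ij}^{(k-1)} - \boldsymbol{\delta}_{ij}^{(k)}\| > \epsilon$

        Let $\boldsymbol{\theta}_{ij}^* = \mathbf{W}^T(\mathbf{y}_{ij} - \boldsymbol{\mu}_i - \mathbf{W}_a\boldsymbol{\delta}_{ij}) \cdot I(\mathbf{B}_L \preceq \mathbf{y}_{ij} - \boldsymbol{\mu}_i - \mathbf{W}_a\boldsymbol{\delta}_{ij} \preceq \mathbf{B}_U) + \mathbf{W}^T\mathbf{B}_L \cdot I(\mathbf{y}_{ij} - \boldsymbol{\mu}_i - \mathbf{W}_a\boldsymbol{\delta}_{ij} \preceq \mathbf{B}_L) + \mathbf{W}^T\mathbf{B}_U \cdot I(\mathbf{y}_{ij} - \boldsymbol{\mu}_i - \mathbf{W}_a\boldsymbol{\delta}_{ij} \succeq \mathbf{B}_U)$

        Update $\boldsymbol{\delta}_{ij}^{(k)} = S_{\gamma/L}\!\left(\mathbf{x}^{(k-1)} + \frac{2}{L}\mathbf{W}_a^T \boldsymbol{\Omega}^{-1}(\mathbf{y}_{ij} - \boldsymbol{\mu}_i - \mathbf{W}\boldsymbol{\theta}_{ij}^* - \mathbf{W}_a\mathbf{x}^{(k-1)})\right)$

        Update $t_k = \left(1 + \sqrt{1 + 4 t_{k-1}^2}\right)/2$

        Update $\mathbf{x}^{(k)} = \boldsymbol{\delta}_{ij}^{(k-1)} + \frac{t_{k-1} - 1}{t_k}\left(\boldsymbol{\delta}_{ij}^{(k-1)} - \boldsymbol{\delta}_{ij}^{(k-2)}\right)$

      End
    End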
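    The core of the algorithm is an accelerated proximal gradient loop with a soft-thresholding step. The sketch below is a deliberately simplified illustration and not the PMD algorithm itself: it assumes $\mathbf{W}_a = \mathbf{I}$, a diagonal $\boldsymbol{\Omega}$ (passed as the list `omega`), and no normal-effect term, so it minimizes $\sum_i (r_i - \delta_i)^2/\omega_i + \gamma \sum_i |\delta_i|$ for a residual vector `r`. It also uses the standard FISTA momentum update. The function names are hypothetical.

```python
import math

def soft(v, t):
    """Soft-thresholding operator S_t(v)."""
    return math.copysign(max(abs(v) - t, 0.0), v)

def apg_sparse(r, omega, gamma, tol=1e-10, max_iter=10000):
    """APG for sum_i (r_i - d_i)^2 / omega_i + gamma * sum_i |d_i|."""
    L = 2.0 / min(omega)                 # Lipschitz constant of the smooth part
    delta_prev = [0.0] * len(r)
    x = list(delta_prev)                 # extrapolated point
    t_prev = 1.0
    for _ in range(max_iter):
        # Gradient step on the weighted quadratic, then the proximal (soft-threshold) step.
        delta = [soft(xi + (2.0 / L) * (ri - xi) / wi, gamma / L)
                 for xi, ri, wi in zip(x, r, omega)]
        t = (1.0 + math.sqrt(1.0 + 4.0 * t_prev ** 2)) / 2.0
        x = [d + (t_prev - 1.0) / t * (d - dp) for d, dp in zip(delta, delta_prev)]
        step = max(abs(d - dp) for d, dp in zip(delta, delta_prev))
        delta_prev, t_prev = delta, t
        if step < tol:
            break
    return delta_prev

r = [5.0, -0.2, 3.0]          # hypothetical residuals y - mu
omega = [1.0, 2.0, 4.0]       # hypothetical noise variances (diagonal of Omega)
delta_hat = apg_sparse(r, omega, gamma=1.0)
# For this separable problem the minimizer is known in closed form.
closed_form = [soft(ri, 1.0 * wi / 2.0) for ri, wi in zip(r, omega)]
```

    Because the simplified problem separates by coordinate, the closed-form solution gives a direct check on the iterative result; the full PMD problem does not separate this way.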

  • Surrogate Data Analysis

    • Surrogate Data Analysis Set-up
        – Three types of defects in different bands of the profiles.
        – Signal-dependent noise is generated by $\mathbf{e} \sim N(\mathbf{0}, \mathrm{diag}(\mathbf{S}) + 42)$.

    • Criteria to evaluate the performance
        – Detection Rate (DR): detection ratio of real defect points;
        – False Alarm Rate (FAR): ratio of false classification of defects among all the non-defective points;
        – Detected Peak Intensity Difference (DPID): $\frac{1}{n_d} \sum_{i=1}^{n_d} |\max I_i - \max I_i^*| / |\max I_i^*|$, where $\max I_i$ denotes the detected peak intensity and $\max I_i^*$ denotes the real peak intensity;
        – Mean Square Error (MSE);
        – Computation Time.
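    The first three criteria are simple ratios and can be written down directly. In the sketch below the function names and the example labels are hypothetical; defect labels are given per point as 0/1 flags, and peak intensities as one value per defect.

```python
def detection_rate(true_defect, detected):
    """Fraction of real defect points that were detected."""
    hits = sum(1 for t, d in zip(true_defect, detected) if t and d)
    return hits / sum(1 for t in true_defect if t)

def false_alarm_rate(true_defect, detected):
    """Fraction of non-defective points that were flagged as defects."""
    false_alarms = sum(1 for t, d in zip(true_defect, detected) if d and not t)
    return false_alarms / sum(1 for t in true_defect if not t)

def dpid(detected_peaks, true_peaks):
    """Mean relative difference between detected and real peak intensities."""
    return sum(abs(d - t) / abs(t) for d, t in zip(detected_peaks, true_peaks)) / len(true_peaks)

true_defect = [0, 1, 1, 0, 0, 1, 0, 0]
detected    = [0, 1, 0, 1, 0, 1, 0, 0]
dr = detection_rate(true_defect, detected)       # 2 of 3 defect points found
far = false_alarm_rate(true_defect, detected)    # 1 of 5 normal points flagged
peak_diff = dpid([90.0, 110.0], [100.0, 100.0])  # both peaks off by 10%
```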

  • Simulation Results

                                                LMM        SSD      PMD
    Detection Rate (DR)                         100.00%    13.33%   100.00%
    False Alarm Rate (FAR)                      85.69%     2.21%    2.63%
    Detected Peak Intensity Difference (DPID)   77.73%     74.10%   30.95%
    Mean Square Error (MSE)                     9.93       10.06    9.18
    Computation Time                            1203.27s   0.55s    0.94s

    • The proposed PMD works better than the LMM and the SSD: it matches the 100% DR of the LMM while keeping the FAR close to that of the SSD (2.63% vs. 2.21%), and it has the lowest DPID and MSE of the three methods.

  • Summary

    • Contributions
        – A novel, accurate, and computationally efficient data decomposition method for detection and monitoring of high-dimensional, functional, nonlinear profiles.
        – An APG-based algorithm is developed to handle the parameter estimation efficiently, which meets the real-time monitoring requirement.

    • Impacts
        – For nanomanufacturing: it enables simultaneous monitoring of fabrication consistency, uniformity, and defect information.
        – It can be applied to other datasets that share these characteristics: multichannel, high dimension, signal-dependent noise, and a mixture of different physical information.

  • Topic 3: Tensor Mixed Effects (TME) Model and Applications in Nanomanufacturing Inspection

    • Yue, X., Park, J.G., Liang, Z., Shi, J., 2019, “Tensor Mixed Effects (TME) Model and Applications in Nanomanufacturing Inspection”, Technometrics. (in press). (Best Paper award finalist, the 2017 INFORMS Data Mining Section)

  • Raman Mapping Data

    • Raman Mapping Measurement

    [Figure: CNTs buckypaper continuous production; buckypaper microscope image.]

    Characteristics of Raman mapping: high-dimensional, mixed-effects, different correlations in multiple dimensions.

    • Complex spatial-temporal correlation structures.
    • A tensor is an efficient mathematical tool for formulating Raman mapping data.
    • The complex correlations and mixed effects should be considered during modeling.

  • Literature Review

    • Tensor Decomposition
        – Tucker decomposition (Tucker 1966, Kolda and Bader 2009)
        – GLM model in tensor domain: Generalized Linear Tensor Regression Model (GLTR) (Zhou et al. 2013)

    • Limitation: none of these methods consider random effects.

  • Literature Review

    • Linear Mixed Effects Model
        – The linear mixed effects model is widely used (Demidenko 2013; Galecki and Burzykowski 2013).

            $\mathbf{y}_i = \mathbf{X}_i \boldsymbol{\beta} + \mathbf{Z}_i \mathbf{b}_i + \boldsymbol{\varepsilon}_i$   (fixed effects + random effects + noise)

    • Merits: (i) it has the capability to handle multilevel hierarchy data; (ii) it takes complex association structures into consideration.

    • Limitation: these methods cannot handle the tensor responses of Raman mapping, due to ultra-high dimensionality and complex correlations.

    Vectorized Linear Mixed Effect (vLME) model: vectorization turns a $J \times J \times J$ tensor into a vector of length $J^3$.

    Limitations of the vLME model
      • The dimension after vectorization is high, and the computation cost is high. Complexity: $\mathcal{O}(N J^9)$
      • It destroys the inherent multi-way correlation structures.

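  • Tensor Mixed Effects Model

    • Motivation
        – Fixed effects & random effects.
        – Handle multi-dimensional arrays.
        – Exploit the correlations in different dimensions.

    • Model

        $\mathcal{Y}_i = \mathcal{F} \times_1 \mathbf{A}_i^1 \times_2 \mathbf{A}_i^2 \times_3 \mathbf{A}_i^3 + \mathcal{R}_i \times_1 \mathbf{B}_i^1 \times_2 \mathbf{B}_i^2 \times_3 \mathbf{B}_i^3 + \mathcal{E}_i$   (1)

        $\mathcal{R}_i \sim N_{P_2,Q_2,R_2}(\mathcal{O}; \boldsymbol{\Sigma}_r, \boldsymbol{\Psi}_r, \boldsymbol{\Omega}_r)$, $\mathcal{E}_i \sim N_{J,K,L}(\mathcal{O}; \boldsymbol{\Sigma}_\varepsilon, \boldsymbol{\Psi}_\varepsilon, \boldsymbol{\Omega}_\varepsilon)$

        The three terms in (1) are the fixed effects, the random effects, and the noise.

        $\mathcal{Y}_i$: response tensor, $J \times K \times L$
        $\mathcal{F}$: fixed effects core tensor, $P_1 \times Q_1 \times R_1$
        $\mathbf{A}_i^1, \mathbf{A}_i^2, \mathbf{A}_i^3$: fixed effects design matrices
        $\mathcal{R}_i$: random effects core tensor, $P_2 \times Q_2 \times R_2$
        $\mathbf{B}_i^1, \mathbf{B}_i^2, \mathbf{B}_i^3$: random effects design matrices
        $\mathcal{E}_i$: noise tensor, $J \times K \times L$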
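    Each of the first two terms in the model is a core tensor multiplied by one design matrix along each mode. A minimal sketch of that operation for 3-way tensors stored as nested lists follows; the function names and the tiny example values are hypothetical, and only the fixed-effects term $\mathcal{F} \times_1 \mathbf{A}^1 \times_2 \mathbf{A}^2 \times_3 \mathbf{A}^3$ is evaluated (the random-effects term has the same form).

```python
def mode_product(T, M, mode):
    """Multiply a 3-way tensor T by matrix M (new_dim x old_dim) along `mode` (0, 1, or 2)."""
    dims = [len(T), len(T[0]), len(T[0][0])]
    old_dim = dims[mode]
    dims[mode] = len(M)
    out = [[[0.0] * dims[2] for _ in range(dims[1])] for _ in range(dims[0])]
    for a in range(dims[0]):
        for b in range(dims[1]):
            for c in range(dims[2]):
                idx = [a, b, c]
                total = 0.0
                for s in range(old_dim):
                    src = list(idx)
                    src[mode] = s          # index into the original tensor along `mode`
                    total += M[idx[mode]][s] * T[src[0]][src[1]][src[2]]
                out[a][b][c] = total
    return out

def tucker_product(core, A1, A2, A3):
    """Evaluate core x_1 A1 x_2 A2 x_3 A3."""
    return mode_product(mode_product(mode_product(core, A1, 0), A2, 1), A3, 2)

core = [[[2.0]]]                 # 1 x 1 x 1 core tensor
A1 = [[1.0], [2.0]]              # 2 x 1 design matrix for mode 1
A2 = [[3.0]]                     # 1 x 1 design matrix for mode 2
A3 = [[1.0], [1.0], [1.0]]       # 3 x 1 design matrix for mode 3
Y_fixed = tucker_product(core, A1, A2, A3)   # 2 x 1 x 3 tensor
```

    Working mode by mode like this keeps the multi-way structure that vectorization would destroy.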

  • Topics Investigated to Develop Tensor Mixed Effects (TME) Model

    – Inference of the TME model
    – Existence conditions of the MLE
    – Identifiability of the TME model
    – Double Flip-Flop algorithm for parameter estimation
    – Convergence investigations

    • Yue, X., Park, J.G., Liang, Z., Shi, J., 2019, “Tensor Mixed Effects (TME) Model and Applications in Nanomanufacturing Inspection”, Technometrics. (in press).

  • Simulation Setup and Convergence Results

    • Simulation Set-up
        – We generate tensors with dimension 30×5×5. The dimensions of the core tensors of the fixed effects and random effects are 8×3×3 and 3×2×2, respectively.
        – Covariance matrices of the random effects are generated randomly. Covariance matrices of the residual errors are generated as random diagonal matrices.
        – 1000 response tensors are generated to test the performance.

    • Convergence criterion versus iteration history
        – Normalized $L_1$ norm of the difference between parameter estimates in two successive iterations:

            $\|\hat{\mathbf{X}}_i^{k} - \hat{\mathbf{X}}_i^{k-1}\|_1 / (J \cdot J)$, $\mathbf{X} = \boldsymbol{\Sigma}, \boldsymbol{\Psi}, \boldsymbol{\Omega}$

    [Figure: convergence curves for the 1st Flip-flop and the 2nd Flip-flop.]

  • Simulation: Parameter Estimation Results


  • Case Study

    • Experiment Set-up

    • Alignment
        • The multi-walled CNTs buckypaper is measured by the Raman mapping technique before and after alignment.

    $\boldsymbol{\Psi}, \boldsymbol{\Omega}$ have potential to be used to represent the CNT alignment in the Buckypaper.

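  • Initial Case Study

    [Figure, two panels, coefficient (a.u.) versus stretch ratio (%): (a) $\boldsymbol{\Psi}_r$, entries $\boldsymbol{\Psi}_r(1,2)$ and $\boldsymbol{\Psi}_r(1,3)$; (b) $\boldsymbol{\Omega}_r$, entries $\boldsymbol{\Omega}_r(1,2)$ and $\boldsymbol{\Omega}_r(1,3)$.]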

  • Summary

    • Contributions
        – We proposed a novel TME model:
            • it has the capability to handle multilevel hierarchy data.
            • it takes correlation along different dimensions into consideration.
            • it can analyze mixed effects for high-dimensional datasets.
        – We derived the MLE and explored the existence and identifiability of the TME.
        – An iterative double Flip-Flop algorithm has been developed for parameter estimation of the TME model, and its complexity is analyzed.

    • Impacts
        – For nanomanufacturing: the TME has potential to be used to quantify the alignment degree of CNTs Buckypaper.
        – It may also be applied to other datasets with tensor structures, mixed effects, and a suitable sample size.

  • Nanopowder Manufacturing Process Control


    *Reference: Oljaca, M. et al. (2002), Flame synthesis of nanopowders via combustion chemical vapor deposition, Journal of materials science letters, 21, 621– 626.

  • Nanopowder Manufacturing Scale-up

    Goal: 1 kg/day to 1000 kg/day

    Challenges:
      • Nano-metrology analysis for process control
      • Variation propagation in multi-stage manufacturing process
      • Process control capability

    [Figure labels: Atomizer; Control Objective; Engineering knowledge; Data; Statistical Model Calibration; Predictive Model Development; Quality Indices; Control & Evaluation.]

    Chang, C. -J., Plumlee, M., Shi, J., 2011, “A predictive Model of Nanomiser Energy And Its Application In System Monitoring”, Technical Report to Department of Energy and nGimat Company

  • Physics-based Feature Extraction & Predictive Model

    • Objective: translate and re-define the nonlinear dynamic system as a linear model

    [Figure, nonlinear system: the two inputs (solution flow rate and system setting input), denoted $X_1$ and $X_2$, enter the Nanomixer together with process randomness and produce the system output $Y$.]

    [Figure, physics-based predictive model: an engineering feature transformation, $u_1 = f_1(X_1, X_2)$ and $u_2 = f_2(X_1, X_2)$, is followed by a linear system (ARIMA model between $Y$ and $u_1$, $u_2$) with process randomness, producing the system output $Y$.]

  • Physics-based Data-Driven Model: Model Validation

    04/27/2011

    [Figure: three time-series panels plotted against time: the system inputs $X_1$ and $X_2$ (input setting and solution flow rate) and the system output $Y$ (solution energy). Black: energy measurements; green: model predictions.]

  • Outline

    • Introduction and Research Overview

    • Research Topics

    • Generalized Wavelet Shrinkage of In-line Raman Spectra

    • Penalized Mixed-effects Decomposition

    • Tensor Mixed Effects Model

    • Physics-based Feature Extraction & Predictive Model

    • Summary


  • Summary

    – A novel generalized wavelet shrinkage (GWS) method was proposed to efficiently remove signal-dependent noise from in situ Raman sensing signals.

    – A new machine-learning-enabled algorithm, penalized mixed-effects decomposition (PMD), was developed to decompose in-line profiles into four components: fixed effects, normal effects, defective effects, and noise.

    – A novel tensor mixed-effects (TME) model was developed to analyze massive high-dimensional data with complex spatial-temporal correlation structures.

    – Engineering-driven data analytics plays an important role in data-enabled design and manufacturing.

  • Thank You!
