Page 1:

Monitoring High-yield processes

MONITORING HIGH-YIELD PROCESSES

Cesar Acosta-Mejia

June 2011

Page 2:

EDUCATION

– B.S. Catholic University of Peru

– M.A. Monterrey Tech, Mexico

– Ph.D. Texas A&M University

RESEARCH

– Quality Engineering - SPC, Process monitoring

– Applied Probability and Statistics – Sequential analysis

– Probability modeling – Change point detection, process surveillance

Page 3:

MOTIVATION

– High-yield processes

– Monitor the fraction of nonconforming units p

– Very small p (ppm)

– To detect increases or decreases in p

– A very sensitive procedure

Page 4:

ASSUMPTIONS

• Process is observed continuously

• Process can be characterized by Bernoulli trials

• Fraction of nonconforming units p is constant, but may change at an unknown point in time

Page 5:

Hypothesis Testing

For two-sided tests of level α,

the rejection region R is made up of two subregions R1 and R2

with limits L and U such that

P[X ≤ L] = α/2

P[X ≥ U] = α/2

Page 6:

Hypothesis Testing

Consider testing the proportion p

Page 7:

Hypothesis Testing

The test may be based on different random variables

• Binomial (n, p)

• Geometric (p)

• Negative Binomial (r, p)

• Binomial – order k (n, p)

• Geometric – order k (p)

• Negative Binomial – order k (r, p)

Page 8:

Binomial tests

when p is very small

Page 9:

Test 1

• proportion p0 = 0.025 (25000 ppm)

• test H0 : p = 0.025

against

H1 : p ≠ 0.025

• X = number of nonconforming units in 500 items

• α = 0.0027

Page 10:

Test 1

Let X ~ Binomial(500, p)

To test the hypothesis

H0 : p = 0.025 against H1 : p ≠ 0.025

the rejection region is

R = {X ≤ 2} ∪ {X ≥ 25}

since

P[X ≤ 2] = 0.000305 < 0.00135 = α/2

P[X ≥ 25] = 0.001018 < 0.00135 = α/2
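The two tail probabilities quoted for Test 1 can be checked with exact binomial sums. The following is a minimal sketch using only the Python standard library; the helper names (`binom_pmf`, `lower_tail`, `upper_tail`) are illustrative and do not come from the slides.

```python
from math import comb

def binom_pmf(k, n, p):
    """P[X = k] for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def lower_tail(x, n, p):
    """P[X <= x]."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

def upper_tail(x, n, p):
    """P[X >= x], computed as 1 - P[X <= x - 1]."""
    return 1.0 - lower_tail(x - 1, n, p)

alpha = 0.0027          # test level used throughout the slides
n, p0 = 500, 0.025      # Test 1 settings

p_low = lower_tail(2, n, p0)
p_high = upper_tail(25, n, p0)
print(f"P[X <= 2]  = {p_low:.6f}")
print(f"P[X >= 25] = {p_high:.6f}")
print("both tails below alpha/2:", p_low < alpha / 2 and p_high < alpha / 2)
```

Moving either limit one unit toward the center (X ≤ 3 or X ≥ 24) pushes the corresponding tail above α/2, which is why 2 and 25 are the limits of the rejection region.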

Page 11:

Test 1

Plot of P[rejecting H0] vs. p (figure not reproduced; x-axis: parts per million, 5000 to 45000; y-axis: probability of rejecting H0, 0 to 0.012; the level 0.0027 is marked)

Page 12:

Hypothesis Testing

Now consider testing

p0 = 0.0001 (100 ppm)

Page 13:

Test 1

Let X ~ Binomial(n = 500, p)

To test the hypothesis

H0 : p = 0.0001 against H1 : p ≠ 0.0001

the rejection region is

R = {X ≥ 2}

since

P[X ≥ 2] = 0.0012

For n = 500 there is no two-sided test for p = 0.0001: P[X = 0] ≈ 0.951 exceeds α/2, so no lower limit exists.
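Why the lower limit disappears can be seen by searching for the limits directly. The sketch below assumes the convention used on the earlier slides (each tail probability must not exceed α/2); `two_sided_limits` is a hypothetical helper, not something defined in the talk.

```python
from math import comb

def binom_cdf(x, n, p):
    """P[X <= x] for X ~ Binomial(n, p); returns 0 for x < 0."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

def two_sided_limits(n, p0, alpha):
    """Return (L, U) with P[X <= L] <= alpha/2 and P[X >= U] <= alpha/2.

    L is the largest such lower limit, or None when even P[X = 0]
    exceeds alpha/2. U is the smallest such upper limit.
    """
    half = alpha / 2
    lower = None
    x = 0
    while x <= n and binom_cdf(x, n, p0) <= half:
        lower = x
        x += 1
    upper = None
    for u in range(n + 1):
        if 1.0 - binom_cdf(u - 1, n, p0) <= half:   # P[X >= u]
            upper = u
            break
    return lower, upper

print(two_sided_limits(500, 0.025, 0.0027))    # Test 1 with p0 = 25000 ppm
print(two_sided_limits(500, 0.0001, 0.0027))   # Test 1 with p0 = 100 ppm
```

For p0 = 0.0001 the search returns no lower limit, so with n = 500 the binomial test can only react to increases in p.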

Page 14:

Test 1

Figure panels (not reproduced): Binomial (n = 500, p = 0.025) and Binomial (n = 500, p = 0.0001)

Page 15:

Test 1

For this test, a plot of P[rejecting H0] vs. p (figure not reproduced; x-axis: parts per million, 20 to 260; y-axis: P[rejecting H0], 0 to 0.009; the level 0.0027 is marked)

Page 16:

Consider a geometric test for p

when p is very small

Page 17:

Test 2

Let X ~ Geo(p)

To test the hypothesis (α = 0.0027)

H0 : p = 0.0001 against H1 : p ≠ 0.0001

the rejection region is

R = {X ≤ 13} ∪ {X ≥ 66075}

since

P[X ≤ 13] = 0.0013

P[X ≥ 66075] = 0.00135

An observation in {X ≤ 13} leads us to conclude that p > 0.0001
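The limits of the geometric test follow in closed form from the geometric distribution function. The sketch below assumes X counts trials up to and including the first nonconforming unit (support 1, 2, …), so that P[X ≤ x] = 1 − q^x and P[X ≥ x] = q^(x−1) with q = 1 − p0; the function name is illustrative.

```python
from math import ceil, floor, log, log1p

def geometric_limits(p0, alpha):
    """Two-sided limits (L, U) for X ~ Geo(p0) on {1, 2, ...}.

    L is the largest integer with P[X <= L] = 1 - q**L <= alpha/2.
    U is the smallest integer with P[X >= U] = q**(U - 1) <= alpha/2.
    """
    log_q = log1p(-p0)                            # log(1 - p0)
    half = alpha / 2
    lower = floor(log1p(-half) / log_q)
    upper = ceil(log(half) / log_q) + 1
    return lower, upper

lower_limit, upper_limit = geometric_limits(0.0001, 0.0027)
print(lower_limit, upper_limit)
```

With p0 = 0.0001 and α = 0.0027 this reproduces the limits 13 and 66075 of the rejection region.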

Page 18:

Test 2

For this test, a plot of P[rejecting H0] vs. p (figure not reproduced; x-axis: p, 50 to 300; y-axis: P[rejecting H0], 0 to 0.012; the level 0.0027 is marked)

Page 19:

Another performance measure

of a sequential testing procedure

Page 20:

Hypothesis Testing

Let X1, X2, … ~ Geo(p), iid

Let T = number of observations until H0 is rejected

Consider the random variables, for j = 1, 2, …,

Aj = 1 if Xj ∈ R, with P[Aj = 1] = PR

Aj = 0 otherwise

then the probability function of T is

P[T = t] = P[A1 = 0] P[A2 = 0] … P[A_{t-1} = 0] P[At = 1]

= PR [1 − PR]^(t−1)

Page 21:

Hypothesis Testing

therefore

T ~ Geo(PR)

Let us consider E[T] = 1/PR as a performance measure

then

E[T] = 1/PR is the mean number of tests until H0 is rejected

when p = p0, PR = α and

E[T] = 1/α
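The claim that T is geometric with mean 1/PR can be checked by simulation. This is only an illustrative sketch: the rejection probability 0.05 is an arbitrary value chosen so the simulation runs quickly (it is not a value from the slides), and the seed and sample size are likewise assumptions.

```python
import random

def run_length(p_reject, rng):
    """Number of independent tests up to and including the first rejection."""
    t = 1
    while rng.random() >= p_reject:   # A_t = 0 with probability 1 - PR
        t += 1
    return t

rng = random.Random(2011)             # fixed seed for reproducibility
p_reject = 0.05                       # illustrative PR, not from the slides
runs = [run_length(p_reject, rng) for _ in range(20000)]
mean_t = sum(runs) / len(runs)
print(f"simulated E[T] = {mean_t:.2f}, 1/PR = {1 / p_reject:.2f}")
```

The simulated mean should land close to 1/PR = 20; the same reasoning with PR = α = 0.0027 gives the in-control value 1/α used on the slides.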

Page 22:

Test 2

Let X ~ Geo(p), q = 1 − p

P[X ≤ x] = 1 − q^x

Let the rejection region be R = {X < L} ∪ {X > U}

then

PA = P[not rejecting H0]

= P[L ≤ X ≤ U]

= 1 − q^U − (1 − q^(L−1))

= q^(L−1) − q^U

PR = 1 − (1 − p)^(L−1) + (1 − p)^U

Page 23:

Test 2

Let X ~ Geo(p)

To test the hypothesis (α = 0.0027)

H0 : p = 0.0001 against H1 : p ≠ 0.0001

the rejection region is

R = {X < 14} ∪ {X > 66074}

then P[rejecting H0] is

PR = 1 − (1 − p)^13 + (1 − p)^66074

E[T] = 1/PR

when p = p0, E[T] = 1/α = 370.4
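The expression for PR can be evaluated directly to see how E[T] behaves as p moves away from p0. The sketch below assumes the integer limits 13 and 66074 of the rejection region; the function names are illustrative. Because the limits are integers, the in-control value it returns is close to, but not exactly, 1/α.

```python
def p_reject(p, lower=13, upper=66074):
    """PR = 1 - (1-p)**lower + (1-p)**upper for R = {X <= lower} U {X > upper}."""
    q = 1.0 - p
    return 1.0 - q**lower + q**upper

def arl(p):
    """E[T] = 1 / PR, the mean number of tests until H0 is rejected."""
    return 1.0 / p_reject(p)

for ppm in (50, 80, 100, 110, 150, 200, 300):
    print(f"{ppm:4d} ppm   E[T] = {arl(ppm * 1e-6):8.1f}")
```

Under these assumptions E[T] grows, rather than shrinks, for small increases of p above 100 ppm, so the test is slow to detect exactly the deterioration one most wants to catch.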

Page 24:

Test 2

we want E[T] < 370.4 when p > 0.0001

Page 25:

Test 2

How can we improve upon this test?

we want E[T] < 370.4 when p > 0.0001

Page 26:

run sum procedure

Page 27:

Geometric chart

A sequence of tests of hypotheses

Page 28:

THE RUN SUM – for the mean

Page 29:

THE GEOMETRIC RUN SUM

Page 30:

THE GEOMETRIC RUN SUM - DEFINITION

• Let us denote the following cumulative sums

SUt = SU_{t-1} + qt if Xt falls above the center line

= 0 otherwise

SLt = SL_{t-1} − qt if Xt falls below the center line

= 0 otherwise

where qt is the score assigned to the region in which Xt falls

Page 31:

THE GEOMETRIC RUN SUM - DEFINITION

• The run sum statistic is defined, for t = 1,2,…, by

St = max {SUt, -SLt}

with SU0 = 0, SL0 = 0

and limit sum L
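The definition above translates into a short update rule. The sketch below is one reading of it: each observation is represented by a hypothetical pair (above_center, score), and a signal is assumed to occur as soon as St reaches the limit sum (St ≥ L). The example sequence is made up for illustration.

```python
def run_sum_step(su_prev, sl_prev, above_center, score):
    """One update of the cumulative sums SU_t and SL_t.

    above_center: True if X_t falls above the center line, False if below.
    score: q_t >= 0, the score of the region in which X_t falls.
    Returns (SU_t, SL_t, S_t) with S_t = max(SU_t, -SL_t).
    """
    su = su_prev + score if above_center else 0
    sl = sl_prev - score if not above_center else 0
    return su, sl, max(su, -sl)

def first_signal(observations, limit_sum):
    """1-based index of the first t with S_t >= limit_sum, or None."""
    su = sl = 0                                   # SU_0 = SL_0 = 0
    for t, (above, score) in enumerate(observations, start=1):
        su, sl, s = run_sum_step(su, sl, above, score)
        if s >= limit_sum:
            return t
    return None

# Hypothetical sequence of (above_center, score) pairs, limit sum L = 5.
obs = [(True, 1), (True, 2), (False, 0), (False, 2), (False, 1), (False, 3)]
print(first_signal(obs, 5))
```

In this sequence the upper sum climbs to 3, is reset when the third observation falls below the center line, and the lower sum then accumulates 0, 2, 1 and 3 until it crosses the limit at the sixth observation.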

Page 32:

THE GEOMETRIC RUN SUM - DESIGN

• Need to define

region limits (l1, l2, l3 and l5, l6, l7)

region scores (q1, q2, q3 and q4)

limit sum L

Page 33:

THE GEOMETRIC RUN SUM - DESIGN

• Region limits above and below the center line are not symmetric around the center line.

• To define the region limits we use the cumulative probabilities of the distribution of X ~ Geo (p0)

• Such probabilities were chosen to be the same as those of a run sum for the mean with the same scores

Page 34:

THE GEOMETRIC RUN SUM - DESIGN

Page 35:

THE GEOMETRIC RUN SUM - EXAMPLE

• If X ~ Geo (p0 = 0.0001)

the region limits are given by

0.00123 = P [X ≤ l1 ]

0.02175 = P [X ≤ l2 ]

0.15638 = P [X ≤ l3 ]

0.50000 = P [X ≤ l4 ]

0.84362 = P [X ≤ l5 ]

0.97825 = P [X ≤ l6 ]

0.99877 = P [X ≤ l7 ]

Page 36:

THE GEOMETRIC RUN SUM - EXAMPLE

• If X ~ Geo (p0 = 0.0001)

the region limits are given by

0.00123 = P [X ≤ 13 ]

0.02175 = P [X ≤ 220 ]

0.15638 = P [X ≤ 1701 ]

0.50000 = P [X ≤ 6932 ]

0.84362 = P [X ≤ 18554 ]

0.97825 = P [X ≤ 38280 ]

0.99877 = P [X ≤ 67007 ]
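Region limits of this kind can be obtained by inverting the geometric distribution function. The sketch below assumes each limit is the smallest integer whose cumulative probability reaches the target; that rounding rule and the function name are assumptions, and since the targets are rounded to five decimals the largest limits may differ from the listed values by a few units.

```python
from math import ceil, log

def geometric_quantile(c, p0):
    """Smallest integer x with P[X <= x] = 1 - (1 - p0)**x >= c."""
    return ceil(log(1.0 - c) / log(1.0 - p0))

p0 = 0.0001
cum_probs = [0.00123, 0.02175, 0.15638, 0.50000, 0.84362, 0.97825, 0.99877]
limits = [geometric_quantile(c, p0) for c in cum_probs]

for c, x in zip(cum_probs, limits):
    achieved = 1.0 - (1.0 - p0) ** x
    print(f"target {c:.5f}  ->  limit {x:6d}   P[X <= limit] = {achieved:.5f}")
```

The printed cumulative probability next to each limit gives a quick consistency check between a limit and the probability it is meant to match.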

Page 37:

THE GEOMETRIC RUN SUM - EXAMPLE

• Conclude H1: p ≠ p0 when St ≥ L

• Let T = number of samples until H0 is rejected

• What is the distribution of T?

• What are its mean and standard deviation?

Page 38:

RUN SUM (0,1,2,3) L = 5 - MODELING

• Markov chain

• States defined by the values that St can assume

• State space = {−4, −3, −2, −1, 0, 1, 2, 3, 4, C}

where

C = {n ∈ Z | n = …, −6, −5, 5, 6, …}

is an absorbing state

• Transition probabilities

Page 39:

RUN SUM (0,1,2,3) L = 5 - MODELING

• Let p1 = P [ X ≤ l1 ]

p2 = P [ l1 < X ≤ l2 ]

p3 = P [ l2 < X ≤ l3 ]

p4 = P [ l3 < X ≤ l4 ]

p5 = P [ l4 < X ≤ l5 ]

p6 = P [ l5 < X ≤ l6 ]

p7 = P [ l6 < X ≤ l7 ]

p8 = P [ X > l7 ]

where X ~ Geo (p0)

Page 40:

RUN SUM (0,1,2,3) L = 5 - MODELING

Transitions from St = 0

Page 41:

RUN SUM (0,1,2,3) L = 5 - MODELING

Transitions from St = 1

Page 42:

RUN SUM (0,1,2,3) L = 5 - MODELING

Transitions from St = 2

Page 43:

RUN SUM (0,1,2,3) L = 5 - MODELING

Page 44:

RUN SUM (0,1,2,3) L = 5 - MODELING

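• Let T be the first passage time to state C,

i.e. the number of observations until the run sum rejects H0

• Let Q be the sub-matrix of transition probabilities among the transient states, then

P [T ≤ t] = e ( I − Q^t ) J

G (s) = s e ( I − s Q )^(−1) ( I − Q ) J

E [T] = e ( I − Q )^(−1) J

where e is a row vector defining the initial state S0, J is a column vector of ones, and G(s) is the probability generating function of T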
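The formula E[T] = e (I − Q)^(−1) J can be evaluated numerically once Q is built. The sketch below is one possible reading of the run sum (0,1,2,3) with L = 5, not the author's own construction: it assumes the four regions below the center line l4 score 3, 2, 1, 0 (from the outermost inward) and the four above score 0, 1, 2, 3, that the state is the signed run sum, that a sum is reset when an observation falls on the other side of the center line, and that region limits come from the rounding rule "smallest integer reaching the target probability". All function names are hypothetical.

```python
from math import ceil, log

def region_probs(p, limits):
    """p1..p8: probabilities of the regions cut by l1 < ... < l7 for X ~ Geo(p)."""
    cdf = [1.0 - (1.0 - p) ** l for l in limits]
    edges = [0.0] + cdf + [1.0]
    return [edges[i + 1] - edges[i] for i in range(8)]

def transient_matrix(probs, limit_sum=5):
    """Q over the signed run-sum states -(L-1), ..., L-1 (assumed scoring)."""
    sides = [-1, -1, -1, -1, 1, 1, 1, 1]      # below / above the center line
    scores = [3, 2, 1, 0, 0, 1, 2, 3]         # assumed region scores
    states = list(range(-(limit_sum - 1), limit_sum))
    index = {s: i for i, s in enumerate(states)}
    Q = [[0.0] * len(states) for _ in states]
    for s in states:
        for prob, side, score in zip(probs, sides, scores):
            same_side = s * side > 0
            new = s + side * score if same_side else side * score
            if abs(new) < limit_sum:          # otherwise absorbed in C
                Q[index[s]][index[new]] += prob
    return Q, index

def solve(A, b):
    """Gauss-Jordan elimination with partial pivoting for a small system."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def arl(p, limits, limit_sum=5):
    """E[T] = e (I - Q)^(-1) J with e selecting the initial state S0 = 0."""
    Q, index = transient_matrix(region_probs(p, limits), limit_sum)
    n = len(Q)
    A = [[(1.0 if i == j else 0.0) - Q[i][j] for j in range(n)] for i in range(n)]
    m = solve(A, [1.0] * n)                   # m = (I - Q)^(-1) J
    return m[index[0]]

p0 = 0.0001
cum = [0.00123, 0.02175, 0.15638, 0.50000, 0.84362, 0.97825, 0.99877]
limits = [ceil(log(1.0 - c) / log(1.0 - p0)) for c in cum]
arl0 = arl(p0, limits)
print(f"in-control E[T] = {arl0:.1f}")
for ppm in (50, 150, 300):
    print(f"{ppm:4d} ppm   E[T] = {arl(ppm * 1e-6, limits):.1f}")
```

Solving the linear system (I − Q) m = J avoids forming the inverse explicitly; the entry of m for state 0 is E[T] when the chart starts with both sums at zero.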

Page 45:

Geometric Run sum

For this chart, a plot of E[T] vs. p (figure not reproduced; x-axis: ppm, 20 to 180; y-axis: average run length, 0 to 600)

Page 46:

Geometric Run sum

A comparison with Test 2 (figure not reproduced; x-axis: ppm, 20 to 180; y-axis: average run length, 0 to 600; the value 370.47 is marked)

Page 47:

RUN SUM – FURTHER IMPROVEMENT

• Consider a geometric run sum

– No regions

– Center line equal to l4

– Scores are equal to X

– Design – limit sum L

Page 48:

NEW GEOMETRIC RUN SUM - DEFINITION

• Let us denote the following cumulative sums

SUt = SU_{t-1} + Xt if Xt falls above the center line

= 0 otherwise

SLt = SL_{t-1} − Xt if Xt falls below the center line

= 0 otherwise

Page 49:

NEW GEOMETRIC RUN SUM - DEFINITION

• The run sum statistic is defined, for t = 1,2,…, by

St = max {SUt, -SLt}

with SU0 = 0, SL0 = 0

and limit sum L

Page 50:

NEW GEOMETRIC RUN SUM - MODELING

• Markov chain – not possible

– huge number of states

• Need to derive the distribution of T

• Can show that

Page 51:

NEW GEOMETRIC RUN SUM - MODELING

Page 52:

CONCLUSIONS

• The run sum is an effective procedure for two-sided monitoring

• For monitoring very small p, it is more effective than a sequence of geometric tests

• With a limited number of regions, it can be modeled by a Markov chain

Page 53:

TOPICS OF INTEREST

• Estimate the time at which p changes (the change point)

• Bayesian tests

• Lack of independence (chain-dependent Bernoulli trials)

• Run sum can be applied to other instances

- monitoring an arrival process