central limit theorem


Upload: bary

Post on 05-Jan-2016


Page 1: Central Limit Theorem

Central Limit Theorem

• Example: (NOTE THAT THE ANSWER IS CORRECTED COMPARED TO NOTES5.PPT)

– 5 chemists independently synthesize a compound 1 time each.
– Each reaction should produce 10 ml of a substance.
– Historically, the amount produced by each reaction has been normally distributed with std dev 0.5 ml.

1. What’s the probability that less than 49.8 ml of the substance is made in total?
2. What’s the probability that the average amount produced is more than 10.1 ml?
3. Suppose the average amount produced is more than 11.0 ml. Is that a rare event? Why or why not? If the average is more than 11.0 ml, what might that suggest?

Page 2: Central Limit Theorem

Answer:

• Central limit theorem: If E(Xi) = μ and Var(Xi) = σ² for all i (and the Xi are independent), then, approximately for large n (and exactly if the Xi are normal):

X1+…+Xn ~ N(nμ, nσ²)

(X1+…+Xn)/n ~ N(μ, σ²/n)
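The statement above can be checked by simulation. The sketch below is only an illustration under assumed settings that are not in the slides: the Xi are Uniform(0, 1) draws (so μ = 0.5 and σ² = 1/12), n = 30, and 20,000 repetitions with a fixed seed. The helper name `simulate_sums` is hypothetical.

```python
import random
from statistics import fmean, pvariance

def simulate_sums(n, trials, seed=0):
    """Return `trials` sums, each of n independent Uniform(0, 1) draws."""
    rng = random.Random(seed)
    return [sum(rng.random() for _ in range(n)) for _ in range(trials)]

n = 30
mu, sigma2 = 0.5, 1 / 12          # mean and variance of Uniform(0, 1)
sums = simulate_sums(n, trials=20_000)

sim_mean = fmean(sums)            # CLT predicts about n * mu
sim_var = pvariance(sums)         # CLT predicts about n * sigma2
sd = (n * sigma2) ** 0.5
# For a normal distribution, about 68% of values lie within 1 sd of the mean.
within_one_sd = sum(abs(s - n * mu) < sd for s in sums) / len(sums)

print(f"mean of sums:     {sim_mean:.3f} (predicted {n * mu:.3f})")
print(f"variance of sums: {sim_var:.3f} (predicted {n * sigma2:.3f})")
print(f"fraction within 1 sd: {within_one_sd:.3f}")
```

The same check could be repeated with other distributions for Xi; the uniform is used here only because it is clearly not normal.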

Page 3: Central Limit Theorem

Lab:

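1. Let Y = total amount made. Y ~ N(5*10, 5*0.5²) = N(50, 1.25) (by CLT), so the std dev of Y is sqrt(1.25) = 1.12.
Pr(Y < 49.8) = Pr[(Y-50)/1.12 < (49.8-50)/1.12] = Pr(Z < -0.18) = 0.43

2. Let W = average amount made. W ~ N(10, 0.5²/5) = N(10, 0.05) (by CLT), so the std dev of W is sqrt(0.05) = 0.22.
Pr(W > 10.1) = Pr[Z > (10.1 – 10)/0.22] = Pr(Z > 0.45) = 0.33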
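Parts 1 and 2 can be reproduced without a Z table. This is a minimal sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative. Because it does not round the standard deviations or z-values, the later decimals differ slightly from the table-based answers, but both agree to two decimals.

```python
from math import sqrt
from statistics import NormalDist

n, mu, sigma = 5, 10.0, 0.5       # 5 reactions, each N(10, 0.5^2)

# Sum of n measurements: mean n*mu, std dev sqrt(n)*sigma
total = NormalDist(mu=n * mu, sigma=sqrt(n) * sigma)
# Average of n measurements: mean mu, std dev sigma/sqrt(n)
average = NormalDist(mu=mu, sigma=sigma / sqrt(n))

p_total = total.cdf(49.8)             # Pr(Y < 49.8)
p_average = 1 - average.cdf(10.1)     # Pr(W > 10.1)

print(f"Pr(Y < 49.8) = {p_total:.2f}")
print(f"Pr(W > 10.1) = {p_average:.2f}")
```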

Page 4: Central Limit Theorem

Lab (continued)

3. One definition of rare: it’s a rare event if Pr(W > 11.0) is small (i.e. if the probability of seeing 11.0 or something more extreme is small).
Pr(W > 11) = Pr[Z > (11-10)/0.22] = Pr(Z > 4.55) = approximately zero.

This suggests that perhaps either the true mean is not 10 or true std dev is not 0.5 (or not normally distributed…)
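To see how small "approximately zero" is, the tail probability in part 3 can be computed directly. This is a sketch with the standard-library `NormalDist`; note that with the unrounded standard deviation sqrt(0.05) the z-value is about 4.47 rather than 4.55, which does not change the conclusion.

```python
from math import sqrt
from statistics import NormalDist

# Distribution of the average W of 5 reactions, each N(10, 0.5^2)
average = NormalDist(mu=10.0, sigma=0.5 / sqrt(5))

z = (11.0 - average.mean) / average.stdev   # standardized value of 11.0
p_rare = 1 - average.cdf(11.0)              # Pr(W > 11.0)

print(f"z = {z:.2f}")
print(f"Pr(W > 11.0) = {p_rare:.2e}")       # a few in a million
```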

Page 5: Central Limit Theorem

[Figure not available in transcript]

Sample size: 1006 (source: gallup.com)

Page 6: Central Limit Theorem

• Let Xi = 1 if person i thinks the President is hiding something and 0 otherwise.

• Suppose E(Xi) = p and Var(Xi) = p(1-p) and each person’s opinion is independent.

• Let Y = total number of “yesses”= X1+…+ X1006

• Y ~ Bin(1006, p)
• Suppose p = 0.36 (this is the estimate…)
• What is Pr(Y < 352)?

Note that this definition turns three outcomes into two outcomes.

Page 7: Central Limit Theorem

Normal Approximation to the binomial CDF

Pr(Y<352) = Pr(Y=0)+…+Pr(Y=351), where Pr(Y=k) = (1006 choose k) 0.36^k 0.64^(1006-k)

– Even with computers, as n gets large, computing sums like this can become difficult. (1006 is OK, but how about 1,000,000?)

– Idea: Use the central limit theorem to approximate this probability.

– Y is approximately N[1006*0.36, (0.36)*(0.64)*1006] = N(362.16, 231.8) (by central limit theorem), so its std dev is sqrt(231.8) = 15.2.

Pr(Y < 352) ≈ Pr[(Y-362.16)/15.2 < (352-362.16)/15.2] = Pr(Z < -0.67) = 0.25
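For n = 1006 the exact binomial sum is still cheap, so it can be set beside the normal approximation. The following is a sketch using only the standard library; `binom_pmf` is a hypothetical helper name, and the direct `comb(n, k) * p**k` product is assumed to be safe only for moderate n (it would overflow a float for very large n).

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 1006, 0.36

def binom_pmf(k, n, p):
    """Pr(Y = k) for Y ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Exact: Pr(Y < 352) = Pr(Y = 0) + ... + Pr(Y = 351)
exact = sum(binom_pmf(k, n, p) for k in range(352))

# Normal approximation with mean n*p and variance n*p*(1-p)
approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p))).cdf(352)

print(f"exact binomial sum:   {exact:.4f}")
print(f"normal approximation: {approx:.4f}")
```

The two values should agree to within about one percentage point.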

Page 8: Central Limit Theorem

Normal Approximation to the binomial CDF

Black “step function” is a plot of the Bin(1006, 0.36) probability mass function versus Y (integers).

Blue line is a plot of the Normal(362.16, 231.8) pdf.

Page 9: Central Limit Theorem

Normal Approximation to the binomial CDF

Area under the blue curve to the left of 352 is approximately equal to the sum of the areas of the rectangles (black step function) to the left of 352.

Page 10: Central Limit Theorem

Comments about normal approximation of the binomial :

Rule of thumb is that it’s OK if np>5 and n(1-p)>5.

“Continuity correction”

Y is binomial.

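If we use the normal approximation to the probability that Y≤k, we should calculate Pr(Y<k+.5)

If we use the normal approximation to the probability that Y≥k, we should calculate Pr(Y>k-.5)

(For strict inequalities, rewrite them first: Y<k is the same as Y≤k-1, and Y>k is the same as Y≥k+1.)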

(see picture on board)
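The effect of the continuity correction can be seen on the poll example, where Pr(Y < 352) = Pr(Y ≤ 351). The sketch below uses the standard library only; the helper names `approx_le` and `approx_ge` are hypothetical, and the exact sum is included only as a yardstick.

```python
from math import comb, sqrt
from statistics import NormalDist

def approx_le(k, n, p):
    """Normal approximation to Pr(Y <= k) with continuity correction."""
    return NormalDist(n * p, sqrt(n * p * (1 - p))).cdf(k + 0.5)

def approx_ge(k, n, p):
    """Normal approximation to Pr(Y >= k) with continuity correction."""
    return 1 - NormalDist(n * p, sqrt(n * p * (1 - p))).cdf(k - 0.5)

n, p = 1006, 0.36

# Exact Pr(Y <= 351), summed term by term
exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(352))

corrected = approx_le(351, n, p)      # area to the left of 351.5
uncorrected = NormalDist(n * p, sqrt(n * p * (1 - p))).cdf(352)

print(f"exact:       {exact:.4f}")
print(f"corrected:   {corrected:.4f}")
print(f"uncorrected: {uncorrected:.4f}")
```

With the correction, the events Y ≤ 351 and Y ≥ 352 are split at the same point (351.5), so their approximate probabilities add up to 1.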

Page 11: Central Limit Theorem

Probability meaning of 6 sigma

• Even if you shift the process mean from the center of the specifications by 1.5 standard deviations toward one of the specification limits, you will expect no more than 3.4 defects per million outside the specification limit toward which you shifted.

• (I know it’s convoluted, but that’s the definition…)

Page 12: Central Limit Theorem

What does 6 sigma mean? (example)

• Suppose a product has a quantitative specification, e.g.: “Make the gap between the car door and the car body between 3.4 and 4.6 mm.”

• When cars are actually made, the “std dev of car door gap is 0.1 mm”, i.e. if X1,…,Xn are gap widths, then sqrt(sample variance of X1,…,Xn) = 0.1 mm.

Page 13: Central Limit Theorem

[Figure: distribution of gap widths, with the lower specification (3.4 mm), the upper specification (4.6 mm), the center of spec (4 mm gap) and the shifted mean (3.85 mm gap) marked]

4.6 – 3.4 = 1.2 = 12*0.1 = 12*sigma

Statistically, six sigma means that Upper Spec – Lower Spec ≥ 12 sigma (i.e. specs are fixed; lower the manufacturing process variability).

The probability of being in the tail below the lower specification is
Pr(gap is less than 3.4) = Pr((gap – 3.85)/0.1 < (3.4-3.85)/0.1) = Pr(Z < -4.5) = 3.4/1,000,000

This is the arbitrary “magic” number for 6 sigma.

Page 14: Central Limit Theorem

Probability meaning of 6 sigma

In general: Assume the process mean is 1.5 standard deviations toward the lower spec, i.e. E(X) = 4 - 1.5σ, and assume X has a normal distribution. When the process is in control enough so that the distance between the center of the specs and the lower spec is at least 6σ, then

Pr(X below lower spec) ≤ Pr(X < 4 - 6σ) = Pr[(X - (4 - 1.5σ))/σ < ((4 - 6σ) - (4 - 1.5σ))/σ] = Pr(Z < -4.5) = 3.4/1,000,000

(Here 4 is the center of the specs, as in the car door example.)
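The 3.4-per-million figure can be reproduced numerically. This is a sketch under the same assumptions as the slides (normal distribution, mean shifted 1.5 standard deviations toward one spec limit); the function name `defects_per_million` and its parameters are hypothetical, and only the tail on the shifted side is counted.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

def defects_per_million(sigma_level, shift=1.5):
    """Expected defects per million beyond the spec limit on the shifted side.

    sigma_level: distance from the spec center to the spec limit, in std devs.
    shift: how far the process mean has moved toward that limit, in std devs.
    """
    return Z.cdf(-(sigma_level - shift)) * 1_000_000

for level in (3, 4, 5, 6):
    print(f"{level} sigma: {defects_per_million(level):,.1f} per million")

# Car door example: shifted mean 3.85 mm, std dev 0.1 mm, lower spec 3.4 mm
car_door_ppm = NormalDist(mu=3.85, sigma=0.1).cdf(3.4) * 1_000_000
print(f"car door gap below 3.4 mm: {car_door_ppm:.1f} per million")
```

Running the loop over lower sigma levels shows how quickly the defect rate grows when the spec limits are closer to the center.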

Page 15: Central Limit Theorem

Control Charts

• Let X = an average of n measurements.
• Each measurement has mean μ and variance σ².
• Fact:
– By the central limit theorem, almost all observations of X fall in the interval μ +/- 3σ/sqrt(n) (i.e. mean +/- 3 standard deviations of X)
– σ/sqrt(n) is also called σ_X (the standard deviation of X) or the standard error

Page 16: Central Limit Theorem

Use the “fact” to detect changes in production quality

• Idea: let xi = average door gap from the n cars made by shift i at the car plant

[Figure: control chart of the shift averages x1, …, x8 plotted against shift, with horizontal lines at μ + 3σ/sqrt(n) (Upper Control Limit) and μ - 3σ/sqrt(n) (Lower Control Limit)]

Points outside the +/- 3 std error bounds are called “out of control”. They are evidence that μ and/or σ are not the true mean and std dev any more, and the process needs to be readjusted. Calculate the “false alarm rate”… (= Pr(|Z| > 3) = 2*Pr(Z < -3) = 2*0.0013 = 26/10,000)
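The control chart rule can be sketched in a few lines. Everything numeric here is an assumption made for illustration: the in-control mean (4.0 mm), std dev (0.1 mm), cars per shift (n = 25) and the eight shift averages are hypothetical, as are the function names. The false alarm rate is computed without table rounding, so it comes out near 27 per 10,000 rather than the table-based 26 per 10,000.

```python
from math import sqrt
from statistics import NormalDist

def control_limits(mu, sigma, n, width=3.0):
    """Lower and upper control limits: mu -/+ width * sigma / sqrt(n)."""
    se = sigma / sqrt(n)              # standard error of a shift average
    return mu - width * se, mu + width * se

def out_of_control(averages, mu, sigma, n):
    """1-based shift numbers whose average falls outside the control limits."""
    lcl, ucl = control_limits(mu, sigma, n)
    return [i for i, x in enumerate(averages, start=1) if x < lcl or x > ucl]

# Hypothetical shift averages x1..x8 (mm)
shift_averages = [4.01, 3.99, 4.02, 3.98, 4.00, 4.07, 4.01, 3.97]
flagged = out_of_control(shift_averages, mu=4.0, sigma=0.1, n=25)

# Chance that an in-control average lands outside +/- 3 standard errors
false_alarm_rate = 2 * NormalDist().cdf(-3.0)

print("control limits:", control_limits(4.0, 0.1, 25))
print("out-of-control shifts:", flagged)
print(f"false alarm rate: {false_alarm_rate:.4f}")
```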

Page 17: Central Limit Theorem

Assume 100 new people are polled.

Assume true Pr(a new person says yes) = 0.36.

Let P-hat = (number who say yes)/100.

What’s an approximation to the distribution of P-hat?

Use the approximation to determine a number so that Pr(P-hat < that number) = 0.95.

Page 18: Central Limit Theorem

EXAMPLE OF SAMPLING DISTRIBUTION OF P-HAT

Xk = 1 if person k says yes and 0 if not.

Note that E(Xk)=0.36=p and Var(Xk)=0.36*0.64=p(1-p) Note that Xk is binomial(1,0.36).

P-hat = (X1+…+X100)/100. By CLT, P-hat is approximately N(0.36,0.36*0.64/100).

(Rule of thumb is that this approximation is good if np>5 and n(1-p)>5.)

Page 19: Central Limit Theorem

• Suppose true p is 0.36.
• If survey is conducted again on 100 people, then P-hat ~ N(.36, (.36)(.64)/100) = N(.36, 0.002304), so the std dev of P-hat is sqrt(0.002304) = 0.048.

Want p0 so that Pr(P-hat < p0) = 0.95.
Pr(P-hat < p0) = 0.95 means Pr(Z < (p0-.36)/0.048) = 0.95.
Since Pr(Z < 1.645) = 0.95,
(p0-.36)/0.048 = 1.645
(p0-.36) = 0.07896
p0 = 0.43896
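The value p0 is the 95th percentile of the approximating normal distribution, so it can be obtained from an inverse CDF instead of a Z table. A minimal sketch using the standard-library `NormalDist.inv_cdf`; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.36, 100

# Approximate sampling distribution of P-hat: N(p, p(1-p)/n)
p_hat = NormalDist(mu=p, sigma=sqrt(p * (1 - p) / n))

p0 = p_hat.inv_cdf(0.95)          # value with Pr(P-hat < p0) = 0.95

print(f"std dev of P-hat = {p_hat.stdev:.3f}")
print(f"p0 = {p0:.3f}")
```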

Page 20: Central Limit Theorem

• Suppose true p is 0.40.

• If survey is conducted again on 49 people, what’s the probability of seeing 38% to 44% favorable responses?

Pr(0.38 < P-hat < 0.44)

= Pr[(0.38-0.40)/sqrt(0.40*0.60/49) < Z < (0.44-0.40)/sqrt(0.40*0.60/49)]

= Pr(-0.29 < Z < 0.57)
= Pr(Z < 0.57) – Pr(Z < -0.29)
= 0.7157 - 0.3859
= 0.3298
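The same interval probability can be computed as a difference of two CDF values. This sketch uses the standard-library `NormalDist` and unrounded z-values, so the result differs from the table-based 0.3298 in the third decimal; both round to 0.33.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.40, 49

# Approximate sampling distribution of P-hat: N(p, p(1-p)/n)
p_hat = NormalDist(mu=p, sigma=sqrt(p * (1 - p) / n))

# Pr(0.38 < P-hat < 0.44) = Pr(P-hat < 0.44) - Pr(P-hat < 0.38)
prob = p_hat.cdf(0.44) - p_hat.cdf(0.38)

print(f"Pr(0.38 < P-hat < 0.44) = {prob:.2f}")
```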