
Upload: shawn-byrd

Post on 29-Dec-2015


STT 315

This lecture is based on Chapter 5.4

Acknowledgement: Author is thankful to Dr. Ashok Sinha, Dr. Jennifer Kaplan and Dr. Parthanil Roy for allowing him to use/edit some of their slides.


Statistical Inference

• Inference means drawing a conclusion about a population parameter based on a statistic calculated from a sample.

• Conclusions made using statistical inference are probabilistic in nature. We cannot state them with certainty, only with a stated level of confidence.

• There are two types of inference:
– Confidence Intervals,
– Hypothesis Tests.


Goal

Students will be able to:
• Construct a confidence interval for a proportion.
• Interpret a confidence interval for a proportion.
• Check conditions for the use of inference about a population proportion:
– Independence (or sample less than 10% of population),
– Sample size large enough (successes and failures each greater than 10).
• Explain the relationship between the margin of error, sample size, and level of certainty.


Estimating Smokers

• Suppose I want to estimate the percent of MSU undergraduate students who smoke.

• A random sample of 99 undergraduate students was selected, and 17 of them smoked tobacco last week.

• I want to make a 95% confidence interval for the proportion of MSU undergraduates who smoke, based on this information.


Are the conditions met?

• First, it is a random sample.

• Although the sample is drawn without replacement, it satisfies the 10% condition because there are more than 1000 undergraduate students at MSU.

• Also, the numbers of smokers (17) and non-smokers (82) are both larger than 10, so the sample can be considered large enough.
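
The two numeric checks above can be sketched in Python. This is only an illustration, not part of the original slides: the function name `check_conditions` is hypothetical, and the population size of 1000 is an assumed lower bound (the slide says only "more than 1000").

```python
def check_conditions(successes, n, population_size):
    """Return (ten_percent_ok, large_enough) for a one-proportion interval."""
    failures = n - successes
    # 10% condition: the sample is less than 10% of the population.
    ten_percent_ok = n < 0.10 * population_size
    # Large-sample condition: successes and failures each greater than 10.
    large_enough = successes > 10 and failures > 10
    return ten_percent_ok, large_enough


# 17 smokers in a sample of 99; population size of 1000 is an assumption.
print(check_conditions(17, 99, 1000))
```

Randomness of the sample cannot be checked from the counts alone; it has to come from how the data were collected.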


Use the results to make a CI

• The sampling distribution result tells us that the sample proportions will be roughly normally distributed around the population proportion.

• So about 95% of sample proportions should fall within two standard deviations of the population proportion.

• But we don’t know the population proportion (that’s what we are trying to estimate)! So we cannot get $\sigma_{\hat{p}}$.

• Therefore we need to use the sample proportion and work backward from there.


Construction of C.I.

In our sample, 17 out of 99 students smoked tobacco in the last week, or 17.2%.

17.2% is a statistic (or a point estimate).

We will use 17.2% to make an interval estimate for the value of the parameter.

To make a 95% confidence interval we must create an interval that extends 2 standard deviations above and below the statistic.

One standard deviation is estimated as

$$\sqrt{\frac{(0.172)(0.828)}{99}} = 0.0379.$$

So 2 standard deviations is 2(0.0379) = 0.0758 (and 7.58% is the margin of error).


So a 95% confidence interval has endpoints at 0.172 - 0.0758 = 0.0962 and 0.172 + 0.0758 = 0.2478.

And we write: We are 95% confident that between 9.62% and 24.78% of MSU undergraduates smoke tobacco.

If we want to make a 68% confidence interval, we only have to extend the interval one standard deviation from the statistic in each direction.

So a 68% confidence interval has endpoints at 0.172 - 0.0379 = 0.1341 and 0.172 + 0.0379 = 0.2099, and we write:

We are 68% confident that between 13.4% and 21.0% of MSU undergraduates smoke tobacco.
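
The hand computation of the 68% and 95% intervals can be reproduced with a short Python sketch. It is an illustration only: the helper name `empirical_interval` is hypothetical, and the sample proportion is rounded to 0.172 on purpose so the numbers line up with the slides.

```python
import math

x, n = 17, 99
p_hat = round(x / n, 3)  # 0.172, rounded as on the slides
# Estimated standard deviation of the sample proportion.
se = math.sqrt(p_hat * (1 - p_hat) / n)


def empirical_interval(p_hat, se, k):
    """Interval extending k estimated standard deviations on each side of p_hat."""
    return p_hat - k * se, p_hat + k * se


ci68 = empirical_interval(p_hat, se, 1)  # about 68% confidence
ci95 = empirical_interval(p_hat, se, 2)  # about 95% confidence

print(f"S.E. = {se:.4f}")
print(f"68% C.I.: ({ci68[0]:.4f}, {ci68[1]:.4f})")
print(f"95% C.I.: ({ci95[0]:.4f}, {ci95[1]:.4f})")
```

Small differences in the last decimal place against the slide values come from rounding the standard deviation to 0.0379 before doubling it.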


Our 95% CI for smokers was 9.62% to 24.78%.

This means that (find the correct one):
a) 95% of random samples of MSU undergraduates will have between 9.62% and 24.78% smokers.
b) Between 9.62% and 24.78% of MSU undergraduates smoke.
c) 95% of MSU undergraduates smoke between 9.62% and 24.78% of the time.
d) We are 95% sure that between 9.62% and 24.78% of MSU undergraduates smoke.


Standard Error (S.E.)

• If subjects are independent,
• and if the sample size is large enough,
then the sample proportions are approximately normally distributed with mean p and standard deviation

$$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}},$$

i.e., $\hat{p} \sim N(p, \sigma_{\hat{p}})$ approximately.

• But in an estimation problem, p is unknown. So we replace the population proportion (p) by the sample proportion ($\hat{p}$) in its formula and get the standard error of the sample proportion

$$S.E.(\hat{p}) = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = \sqrt{\frac{\hat{p}\hat{q}}{n}},$$

where $\hat{q} = 1 - \hat{p}$.


Confidence Interval (C.I.) and Margin of Error (M.E.)

• Since for large n the sample proportion ($\hat{p}$) is approximately normal, we can conclude (using the empirical rule) that:
– we are about 68% sure the population proportion (p) lies within a margin of error of 1×S.E. of $\hat{p}$;
– we are about 95% sure the population proportion (p) lies within a margin of error of 2×S.E. of $\hat{p}$.
• So the confidence interval for p is $\hat{p} \pm M.E.$
• The more confidence you require, the larger the margin of error.


Is it 1.96 or 2 for 95% C.I.?

Find the exact area between -2 and 2 standard deviations from the mean on a normal curve using your calculator.

Hint: normalcdf(-2,2,0,1) = 0.954 = 95.4%. So this is not exactly 95%, but slightly more.

On the other hand, the exact area between -1.96 and 1.96 standard deviations from the mean is normalcdf(-1.96,1.96,0,1) = 0.95 = 95%.

Using 1.96, we get the 95% C.I. for p to be: (0.097, 0.246).

Note: the calculator uses 1.96. But how do we use the calculator to construct a C.I.?
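
For readers without a TI calculator, the two areas can be checked with Python's standard library. This is a sketch, not part of the original slides: the function below is a hypothetical stand-in that mimics the calculator's `normalcdf(lower, upper, mean, sd)` argument order using `statistics.NormalDist`.

```python
from statistics import NormalDist


def normalcdf(lower, upper, mu=0.0, sigma=1.0):
    """Area under the normal curve between lower and upper."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(upper) - dist.cdf(lower)


area_2 = normalcdf(-2, 2)          # slightly more than 95%
area_196 = normalcdf(-1.96, 1.96)  # almost exactly 95%

print(f"Area within 2 s.d.:    {area_2:.4f}")
print(f"Area within 1.96 s.d.: {area_196:.4f}")
```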


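Formula: C.I. for p

• The formula for a 100(1 - α)% C.I. for p is given by

$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}},$$

where $z_{\alpha/2}$ is such a number that $P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1 - \alpha$, where Z is a standard normal variable.
• However, one can use TI 83/84 to compute C.I.’s for p.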


C.I. with TI 83/84 Plus

We want to make an 85% confidence interval for smokers among MSU undergraduates. In a random sample of 99 MSU undergraduates, 17 smoked tobacco last week.
• Press [STAT].
• Select [TESTS].
• Choose A: 1-PropZInt….
• Input the following:
o x: 17
o n: 99
o C-Level: 85
• Choose Calculate and press [ENTER].

Answer: 85% C.I. for p is (0.117, 0.226).
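
The same interval can be computed without the calculator. The sketch below is an illustration under stated assumptions: the function name `one_prop_z_interval` is hypothetical (it is not a TI or library routine), and the confidence level is passed as a fraction (0.85), not as a percentage.

```python
import math
from statistics import NormalDist


def one_prop_z_interval(x, n, c_level):
    """Large-sample z interval for a proportion: p_hat +/- z * S.E."""
    p_hat = x / n
    # Critical value leaving (1 - c_level) / 2 in each tail.
    z = NormalDist().inv_cdf(1 - (1 - c_level) / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


low, high = one_prop_z_interval(17, 99, 0.85)
print(f"85% C.I. for p: ({low:.3f}, {high:.3f})")
```

Calling the function with `c_level=0.95` uses the critical value 1.96 (to two decimals) instead of 2, so the result is slightly narrower than the empirical-rule interval.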


How would I find a confidence interval for a proportion using a calculator?

[Figures: sample input screen and sample output screen]

The sample input shows finding a 99% confidence interval with a sample size of 4040 people and 2048 smokers.

We would interpret the sample output as: “We are 99% confident that between 48.7% and 52.7% of the population smokes.”

Note: This example wasn’t actually about smoking.


Width of a C.I.

• Since the formula of the confidence interval for p is $\hat{p} \pm M.E.$, the width of the C.I. is 2×M.E.
• So if we know the width of the C.I., we can compute the M.E. by halving the width.
• Example: Given that a 90% C.I. for p is (0.23, 0.37), find the values of (a) the sample proportion and (b) the margin of error of the 90% C.I.

Solution: The width is (0.37 - 0.23) = 0.14, and so the margin of error of the 90% C.I. for p is 0.14/2 = 0.07. Moreover, $\hat{p} - M.E. = 0.23$, and so $\hat{p} = 0.07 + 0.23 = 0.30$.
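
The recovery of the sample proportion and margin of error from the endpoints can be written as a small Python sketch. The function name `from_interval` is hypothetical. It uses the midpoint of the interval for the sample proportion, which is equivalent to adding the margin of error to the lower endpoint as in the solution above.

```python
def from_interval(lower, upper):
    """Recover (p_hat, margin_of_error) from the endpoints of a C.I. for p."""
    margin = (upper - lower) / 2  # half the width
    p_hat = (lower + upper) / 2   # centre of the interval
    return p_hat, margin


p_hat, margin = from_interval(0.23, 0.37)
print(f"p_hat = {p_hat:.2f}, M.E. = {margin:.2f}")
```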


Smoker example

• We found that 17.2% of a sample of 99 MSU undergraduates had smoked in the past week.

• We used this to find a 95% confidence interval for the proportion of MSU undergraduates who smoke.

• Our 95% confidence interval is (0.096, 0.248).

• The width of the 95% C.I. is (0.248 - 0.096) = 0.152, and so the margin of error is (0.152/2) = 0.076.

• If we want to reduce the margin of error while keeping the confidence level the same, we could increase the sample size.


M.E. and sample size

• If we wanted to reduce the margin of error to 4%, what is the minimum number of undergrads we would have to survey?

• The formula is:

$$n = \frac{(z_{\alpha/2})^2 \, pq}{(M.E.)^2}.$$

• But what p to use (remember q = 1 - p)?

Two cases:

No information about p is given. In that case use p = 0.5. In our exercise, if nothing about p is known:

$$n = \frac{1.96^2 \times 0.5 \times 0.5}{(M.E.)^2} = \frac{1.96^2 \times 0.5 \times 0.5}{(0.04)^2} = 600.25.$$

So we would need 601 subjects.

If some information about p is known, use that information. If we use the information from the sample, p = 0.172 and q = 1 - 0.172 = 0.828:

$$n = \frac{1.96^2 \, pq}{(M.E.)^2} = \frac{3.8416 \,(0.172)(0.828)}{(0.04)^2} \approx 341.94.$$

So we would need 342 subjects.
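
Both sample-size cases can be checked with a short Python sketch. The function name `required_sample_size` and its defaults are assumptions made for illustration. It uses z = 1.96 for 95% confidence, as in the slides, and rounds up because the sample size must be a whole number that keeps the margin of error at or below the target.

```python
import math


def required_sample_size(margin, p=0.5, z=1.96):
    """Smallest n with z * sqrt(p * q / n) <= margin, where q = 1 - p."""
    q = 1 - p
    return math.ceil(z**2 * p * q / margin**2)


# Case 1: nothing known about p, so use the conservative value p = 0.5.
n_unknown = required_sample_size(0.04)
# Case 2: use the sample estimate p = 0.172.
n_sample = required_sample_size(0.04, p=0.172)

print(n_unknown, n_sample)
```

Using p = 0.5 gives the largest possible value of pq, which is why it is the safe choice when nothing is known about p.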


Summary

• A larger sample size gives a smaller margin of error.

• A higher confidence level gives a larger margin of error.

• The level of confidence is the long-run proportion of intervals, constructed this way from repeated random samples, that will contain the value of the population parameter.

• As long as the conditions are met, the confidence interval procedure works.
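
The "proportion of intervals" reading of the confidence level can be illustrated with a small simulation. This is a sketch under stated assumptions, not part of the original slides: the true proportion 0.17, the number of trials, and the seed are all hypothetical choices, and the function name `coverage` is made up for this example.

```python
import math
import random
from statistics import NormalDist


def coverage(p, n, c_level, trials=2000, seed=0):
    """Fraction of simulated z intervals that contain the true proportion p."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - c_level) / 2)
    hits = 0
    for _ in range(trials):
        # Draw one random sample of size n and count the successes.
        x = sum(rng.random() < p for _ in range(n))
        p_hat = x / n
        margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - margin <= p <= p_hat + margin:
            hits += 1
    return hits / trials


# Hypothetical true proportion of 0.17 with samples of 99, as in the example.
print(coverage(p=0.17, n=99, c_level=0.95))
print(coverage(p=0.17, n=99, c_level=0.68))
```

The observed fractions should land near the nominal levels, though not exactly on them, because the normal approximation is only approximate for a sample of this size.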