From the Data at Hand to the World at Large Chapter 19 Confidence Intervals for an Unknown Population p Estimation of a population parameter: Estimating an unknown population proportion p

Upload: christian-day

Post on 23-Dec-2015


TRANSCRIPT

Page 1: From the Data at Hand to the World at Large Chapter 19 Confidence Intervals for an Unknown Population p Estimation of a population parameter: Estimating

From the Data at Hand to the World at Large

Chapter 19

Confidence Intervals for an Unknown Population p

Estimation of a population parameter:

• Estimating an unknown population proportion p


Chapter 19 Objectives

1. Determine, manually and using technology, confidence intervals for an unknown population proportion p based on the information contained in a single sample.

2. Understand and properly interpret a confidence interval for a population proportion p.


What do we frequently need to estimate?

• An unknown population proportion p

• An unknown population mean

• We will deal first with estimating a population proportion p


Concepts of Estimation

• The objective of estimation is to estimate the unknown value of a population parameter, like a population proportion p, on the basis of a sample statistic calculated from sample data.

e.g., the NCSU student affairs office may want to estimate the proportion of students who want more campus weekend activities

• There are two types of estimates:
– Point estimate
– Interval estimate


Point Estimate of p

• $\hat{p} = \frac{x}{n}$, the sample proportion of x successes in a sample of size n, is the best point estimate of the unknown value of the population proportion p


Example: Estimating an unknown population proportion p

• Is Sidney Lowe’s departure good or bad for State's men's basketball team? (Technician poll; not scientifically valid!!)

• In a sample of 1000 students, 590 say that Lowe’s departure is good for the basketball team.

• $\hat{p} = 590/1000 = .59$ is the point estimate of the unknown population proportion p of students who think Lowe’s departure is good.


Shortcoming of Point Estimates

• $\hat{p} = 590/1000 = .59$ is the best estimate of the population proportion p

BUT

How good is this best estimate?

No measure of reliability

Another type of estimate


Interval Estimator

A confidence interval is a range (or an interval) of values used to estimate the unknown value of a population parameter.

http://abcnews.go.com/US/PollVault/


Tool for Constructing Confidence Intervals: The Central Limit Theorem

• If a random sample of n observations is selected from a population (any population), and x “successes” are observed, then when n is sufficiently large, the sampling distribution of the sample proportion $\hat{p}$ will be approximately a normal distribution.

• (n is large when np ≥ 10 and nq ≥ 10).


The sampling distribution model for a sample proportion $\hat{p}$

Provided that the sampled values are independent and the sample size n is large enough, the sampling distribution of $\hat{p}$ is modeled by a normal distribution with $E(\hat{p}) = p$ and standard deviation $SD(\hat{p}) = \sqrt{\frac{pq}{n}}$, that is

$$\hat{p} \sim N\left(p, \sqrt{\frac{pq}{n}}\right)$$

where q = 1 – p and where n large enough means np ≥ 10 and nq ≥ 10. The Central Limit Theorem is a formal statement of this fact.
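The model above can be checked with a small simulation. The sketch below is only an illustration: the values p = 0.3 and n = 200, the number of trials, and the helper name `simulate_sample_proportions` are assumptions chosen for the demo, not part of the slides.

```python
import math
import random
import statistics

def simulate_sample_proportions(p, n, trials, seed=0):
    """Draw `trials` samples of size n; return the sample proportion of each."""
    rng = random.Random(seed)  # fixed seed so the demo is repeatable
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(trials)]

# Hypothetical values: np = 60 and nq = 140, so both are at least 10.
p, n = 0.3, 200
p_hats = simulate_sample_proportions(p, n, trials=2000)

simulated_mean = statistics.mean(p_hats)
simulated_sd = statistics.pstdev(p_hats)
model_sd = math.sqrt(p * (1 - p) / n)  # sqrt(pq/n) from the model

print(f"mean of p-hat: {simulated_mean:.4f} (model: {p})")
print(f"SD of p-hat:   {simulated_sd:.4f} (model: {model_sd:.4f})")
```

The simulated mean and standard deviation should land close to p and $\sqrt{pq/n}$, though not exactly on them, since the simulation is itself random.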


Standard Normal

$P(-1.96 \le z \le 1.96) = .95$
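The .95 figure can be confirmed numerically. This is a minimal check using `NormalDist` from Python's standard `statistics` module; the variable names are illustrative only.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Area under the standard normal curve between -1.96 and 1.96.
central_area = z.cdf(1.96) - z.cdf(-1.96)
print(f"P(-1.96 <= Z <= 1.96) = {central_area:.4f}")
```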


95% Confidence Interval for p

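Use $\hat{p} = \frac{x}{n}$ to construct a 95% confidence interval for p:

$$\left(\hat{p} - 1.96\sqrt{\frac{\hat{p}(1-\hat{p})}{n}},\ \hat{p} + 1.96\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\right)$$

written

$$\hat{p} \pm 1.96\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$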


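[Figure: sampling distribution model for $\hat{p}$, centered at p, with area .95 (the confidence level) between $p - 1.96\sqrt{\frac{pq}{n}}$ and $p + 1.96\sqrt{\frac{pq}{n}}$]

95% of the time $\hat{p}$ will be in this interval.

Therefore, the interval

$$\left(\hat{p} - 1.96\sqrt{\frac{pq}{n}},\ \hat{p} + 1.96\sqrt{\frac{pq}{n}}\right)$$

will "capture" p 95% of the time.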


Example (Gallup Polls)

Voter preference polls typically sample approximately 1600 voters; suppose $\hat{p} = .52$. Then if we desire a 95% confidence interval for p we calculate

$$\hat{p} \pm 1.96\sqrt{\frac{\hat{p}\hat{q}}{n}} = .52 \pm 1.96\sqrt{\frac{(.52)(.48)}{1600}} = .52 \pm .024 = (.496, .544)$$

http://abcnews.go.com/US/PollVault/story?id=145373&page=1
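The poll calculation can be reproduced in a few lines. The function name `proportion_ci` and its parameter names are hypothetical, introduced only for this sketch; the arithmetic is the $\hat{p} \pm z^*\sqrt{\hat{p}\hat{q}/n}$ formula from the slides.

```python
import math

def proportion_ci(p_hat, n, z_star=1.96):
    """Return (low, high) for the interval p_hat +/- z_star * sqrt(p_hat*q_hat/n)."""
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = z_star * standard_error
    return p_hat - margin, p_hat + margin

# Poll example: n = 1600 voters, sample proportion .52, 95% confidence.
low, high = proportion_ci(0.52, 1600)
print(f"95% CI: ({low:.3f}, {high:.3f})")
```

Passing a different `z_star` gives intervals at other confidence levels with the same function.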


• Confidence intervals other than 95% confidence intervals are also used


Standard Normal


98% Confidence Intervals

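For p:

$$\left(\hat{p} - 2.33\sqrt{\frac{\hat{p}\hat{q}}{n}},\ \hat{p} + 2.33\sqrt{\frac{\hat{p}\hat{q}}{n}}\right)$$

written

$$\hat{p} \pm 2.33\sqrt{\frac{\hat{p}\hat{q}}{n}}$$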


Four Commonly Used Confidence Levels

Confidence Level    Multiplier
.90                 1.645
.95                 1.96
.98                 2.33
.99                 2.576
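These multipliers are standard normal quantiles: a central area of C leaves (1 − C)/2 in each tail, so the multiplier is the (1 + C)/2 quantile. A minimal sketch using the standard library follows; the helper name `z_multiplier` is an assumption made for illustration.

```python
from statistics import NormalDist

def z_multiplier(confidence_level):
    """Return z* such that the standard normal area between -z* and z* equals confidence_level."""
    return NormalDist().inv_cdf((1 + confidence_level) / 2)

for level in (0.90, 0.95, 0.98, 0.99):
    print(f"{level:.2f}  {z_multiplier(level):.3f}")
```

The computed values agree with the table up to the rounding used there.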


Medication side effects (confidence interval for p)

Arthritis is a painful, chronic inflammation of the joints. An experiment on the side effects of pain relievers examined arthritis patients to find the proportion of patients who suffer side effects.

What are some side effects of ibuprofen?

Serious side effects (seek medical attention immediately):
• Allergic reaction (difficulty breathing, swelling, or hives),
• Muscle cramps, numbness, or tingling,
• Ulcers (open sores) in the mouth,
• Rapid weight gain (fluid retention),
• Seizures,
• Black, bloody, or tarry stools,
• Blood in your urine or vomit,
• Decreased hearing or ringing in the ears,
• Jaundice (yellowing of the skin or eyes), or
• Abdominal cramping, indigestion, or heartburn

Less serious side effects (discuss with your doctor):
• Dizziness or headache,
• Nausea, gaseousness, diarrhea, or constipation,
• Depression,
• Fatigue or weakness,
• Dry mouth, or
• Irregular menstrual periods


Calculate a 90% confidence interval for the population proportion p of arthritis patients who suffer some “adverse symptoms.”

440 subjects with chronic arthritis were given ibuprofen for pain relief; 23 subjects suffered from adverse side effects.

What is the sample proportion $\hat{p}$?

$$\hat{p} = \frac{23}{440} \approx 0.052$$

For a 90% confidence level, z* = 1.645.

$$\hat{p} \pm z^*\sqrt{\frac{\hat{p}\hat{q}}{n}} = .052 \pm 1.645\sqrt{\frac{.052(1-.052)}{440}} = .052 \pm 1.645(0.011) = .052 \pm .018$$

90% CI for p: 0.052 ± 0.018 = (.034, .070)

We are 90% confident that the interval (.034, .070) contains the true proportion of arthritis patients that experience some adverse symptoms when taking ibuprofen.


Example: impact of sample size

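n = 440, 90% CI:

$$.052 \pm 1.645\sqrt{\frac{.052(1-.052)}{440}} = .052 \pm .018 = (.034, .070)$$

n = 1000, 90% CI:

$$.052 \pm 1.645\sqrt{\frac{.052(1-.052)}{1000}} = .052 \pm 1.645(.007) = .052 \pm .012 = (.040, .064)$$

n = 440: width of 90% CI: 2 × .018 = .036

n = 1000: width of 90% CI: 2 × .012 = .024

When the sample size is increased, the 90% CI is narrower.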
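The effect of n on the margin of error can also be checked numerically. In this sketch the function name `margin_of_error` is hypothetical, and n = 1760 is an extra illustrative value (four times 440) that does not appear in the slides. The code carries full precision, so its third decimal can differ slightly from a hand calculation that rounds the standard error first.

```python
import math

def margin_of_error(p_hat, n, z_star):
    """Margin of error z* * sqrt(p_hat * q_hat / n) for a sample proportion."""
    return z_star * math.sqrt(p_hat * (1 - p_hat) / n)

p_hat, z_star = 0.052, 1.645  # 90% confidence, sample proportion from the example
for n in (440, 1000, 1760):   # 1760 = 4 * 440 is a hypothetical extra case
    me = margin_of_error(p_hat, n, z_star)
    print(f"n = {n:5d}: ME = {me:.4f}, CI width = {2 * me:.4f}")
```

Because n sits under a square root, quadrupling the sample size halves the margin of error, which is why large gains in precision are expensive.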


IMPORTANT

• The higher the confidence level, the wider the interval

• Increasing the sample size n will make a confidence interval with the same confidence level narrower (i.e., more precise)


Example

• Find a 95% confidence interval for p, the proportion of NCSU students that strongly favor the current lottery system for awarding tickets to football and men’s basketball games, if a random sample of 1000 students found that 347 strongly favor the current system.


Example (solution)

$$\hat{p} = \frac{347}{1000} = .347, \text{ so } 1 - \hat{p} = .653$$

and the confidence interval is

$$.347 \pm 1.96\sqrt{\frac{(.347)(.653)}{1000}} = .347 \pm .0295 = (.3175, .3765)$$


Interpreting Confidence Intervals

• Previous example: .347 ± .0295 = (.3175, .3765)

• Correct: We are 95% confident that the interval from .3175 to .3765 actually does contain the true value of p. This means that if we were to select many different samples of size 1000 and construct a 95% CI from each sample, 95% of the resulting intervals would contain the value of the population proportion p. (.3175, .3765) is one such interval. (Note that 95% refers to the procedure we used to construct the interval; it does not refer to the population proportion p.)

• Wrong: There is a 95% chance that the population proportion p falls between .3175 and .3765. (Note that p is not random, it is a fixed but unknown number)


Confidence Interval Interpretation

She achieved 95% accuracy, so she would answer 95 out of 100 correctly, say.

7 x 5 = 42

Is there a 95% chance that 7 x 5 = 42?

95% confidence intervals behave the same way. An individual confidence interval either captures p or it doesn’t…

…but in a group of many 95% confidence intervals, about 95% of them will capture p.
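The "many intervals" interpretation can be illustrated by simulation. The sketch below assumes a known true proportion (p = 0.35), a sample size of 500, and 1000 repeated samples; all three values and the function name `coverage_rate` are assumptions chosen for the demo. In practice p is unknown, which is the whole point of the interval.

```python
import math
import random

def coverage_rate(p, n, trials, z_star=1.96, seed=1):
    """Fraction of simulated intervals p_hat +/- z* * sqrt(p_hat*q_hat/n) that contain the true p."""
    rng = random.Random(seed)  # fixed seed so the demo is repeatable
    hits = 0
    for _ in range(trials):
        x = sum(rng.random() < p for _ in range(n))  # number of successes in one sample
        p_hat = x / n
        margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - margin <= p <= p_hat + margin:
            hits += 1  # this particular interval captured p
    return hits / trials

rate = coverage_rate(p=0.35, n=500, trials=1000)
print(f"Share of 95% intervals that captured p: {rate:.3f}")
```

Each simulated interval either contains 0.35 or does not; only the long-run share of intervals that do is close to 95%.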


Determining the Sample Size n to Estimate a Population Proportion p


To Estimate a Population Proportion p

• If you desire a C% confidence interval for a population proportion p with an accuracy specified by you, how large does the sample size need to be?

• We will denote the accuracy by ME, which stands for Margin of Error.


Required Sample Size n to Estimate a Population Proportion p

CI for p: $\hat{p} \pm z^*\sqrt{\frac{pq}{n}}$, that is, $\hat{p} \pm ME$

Set $z^*\sqrt{\frac{pq}{n}} = ME$ and solve for n:

$$n = \frac{(z^*)^2\, pq}{(ME)^2}$$


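[Figure: sampling distribution of $\hat{p}$, centered at p, with area .95 (the confidence level) between $p - 1.96\sqrt{\frac{pq}{n}}$ and $p + 1.96\sqrt{\frac{pq}{n}}$; the distance from p to each endpoint is labeled ME]

Set $ME = 1.96\sqrt{\frac{pq}{n}}$ and solve for n:

$$n = \frac{(1.96)^2\, pq}{(ME)^2}$$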


What About p and q = 1 − p?

$$n = \frac{(z^*)^2\, pq}{(ME)^2}$$

We don't know p or q.

TWO METHODS:

1: if prior information is available concerning the value of p, use that value of p to calculate n;

2: if no prior information about p is available, to obtain a conservative estimate of the required sample size, use $p = q = \frac{1}{2}$
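A short derivation shows why the choice of one half in the second method is conservative. It is added here as a supporting step, not taken from the slides. Completing the square in the product pq gives

$$pq = p(1-p) = \frac{1}{4} - \left(p - \frac{1}{2}\right)^2 \le \frac{1}{4},$$

with equality only when $p = \frac{1}{2}$. Substituting into the sample size formula,

$$n = \frac{(z^*)^2\, pq}{(ME)^2} \le \frac{(z^*)^2}{4\,(ME)^2},$$

so the n computed with $p = q = \frac{1}{2}$ is at least as large as the n required for any other value of p.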


Example: Sample Size to Estimate a Population Proportion p

• The U.S. Crime Commission wants to estimate p = the proportion of crimes in which firearms are used to within .02 with 90% confidence. Data from previous years show that p is about .6.


Example: Sample Size to Estimate a Population Proportion p (cont.)

$$n = \frac{(z^*)^2\, pq}{(ME)^2}$$

ME = .02; p is estimated to be about .6 from previous years' data; 90% confidence gives z* = 1.645

$$n = \frac{(1.645)^2 (.6)(.4)}{(.02)^2} = 1{,}623.6; \quad n = 1624$$
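The same calculation can be sketched in code. The function name `required_sample_size` and its parameter names are hypothetical; rounding up with `math.ceil` reflects that a fractional result must be rounded to the next whole subject so the margin of error is not exceeded.

```python
import math

def required_sample_size(me, z_star, p):
    """Smallest whole n with z_star * sqrt(p*(1-p)/n) <= me, via n = z*^2 * p * q / ME^2."""
    n_exact = z_star ** 2 * p * (1 - p) / me ** 2
    return math.ceil(n_exact)  # always round up, never to the nearest integer

# Crime Commission example: ME = .02, 90% confidence (z* = 1.645), prior estimate p = .6
n_needed = required_sample_size(me=0.02, z_star=1.645, p=0.6)
print(n_needed)
```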


Example: Sample Size to Estimate a Population Proportion p

The Curdle Dairy Co. wants to estimate the proportion p of customers that will purchase its new broccoli-flavored ice cream.

Curdle wants to be 90% confident that they have estimated p to within .03. How many customers should they sample?


Example: Sample Size to Estimate a Population Proportion p (cont.)

• The desired Margin of Error is ME = .03

• Curdle wants to be 90% confident, so z* = 1.645; the required sample size is

$$n = \frac{(z^*)^2\, pq}{(ME)^2} = \frac{(1.645)^2\, pq}{(.03)^2}$$

• Since the sample has not yet been taken, the sample proportion $\hat{p}$ is still unknown.

• We proceed using either one of the following two methods:


Example: Sample Size to Estimate a Population Proportion p (cont.)

• Method 1:
– There is no knowledge about the value of p
– Let p = .5. This results in the largest possible n needed for a 90% confidence interval of the form $\hat{p} \pm .03$
– If the proportion does not equal .5, the actual ME will be smaller than .03 with the n obtained by the formula below.

$$n = \frac{(1.645)^2 (.5)(.5)}{(.03)^2} = 751.67, \text{ so } n = 752$$

• Method 2:
– There is some idea about the value of p (say p ~ .2)
– Use the value of p to calculate the sample size

$$n = \frac{(z^*)^2\, pq}{(ME)^2} = \frac{(1.645)^2 (.2)(.8)}{(.03)^2} = 481.07, \text{ so } n = 482$$
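To see what the conservative choice buys, the sketch below computes the margin of error actually achieved with n = 752 for a few possible true proportions. The function name `actual_margin` and the value p = 0.35 are illustrative assumptions; p = 0.2 and p = 0.5 come from the two methods above.

```python
import math

def actual_margin(p, n, z_star=1.645):
    """Margin of error z* * sqrt(p*q/n) achieved at sample size n if the proportion is p."""
    return z_star * math.sqrt(p * (1 - p) / n)

n = 752  # conservative sample size from Method 1
for p in (0.2, 0.35, 0.5):  # 0.35 is a hypothetical in-between value
    print(f"p = {p:.2f}: ME = {actual_margin(p, n):.4f}")
```

With n = 752 the margin stays within .03 for every p, at the cost of sampling more customers than Method 2 would require when p really is near .2.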