The Practice of Statistics Third Edition Chapter 10: Estimating with Confidence Copyright © 2008 by W. H. Freeman & Company




Chapter 10

• Estimating with confidence

– Confidence Intervals are used to estimate

unknown population parameters.

• Three Big Ideas in this Chapter.


Big Idea #1

• When we use a sample statistic to estimate

the unknown value of a population

parameter, we want not just an estimate but

also a statement of how accurate the

estimate is.

– Confidence intervals do that for us.


Big Idea #2

• Most confidence intervals have the form:

estimate ± margin of error.

• We can see both the basic (“point”) estimate

and the margin of error that tells us how

accurate the point estimate is.


Big Idea #3

• A confidence interval can’t guarantee that it will capture the population parameter.

• The Confidence level tells us how often we capture

the parameter in very many uses of the confidence

interval method.

• Most common confidence level is 95%. If we use

95% confidence intervals very many times, 95% of

them will capture the parameter and 5% will not.


Confidence Intervals

• How long does an AA battery last?

• Not practical to determine life of every battery.

• Select a sample to represent the population.

• Collect data from the sample

• We want to infer from the sample data some

conclusion about the population.


We cannot be certain our conclusion is correct.

Statistical Inference uses probability to express the strength

of our conclusions.

This is the goal of Statistical Inference

This is the largest part of the AP Exam (30 – 40 percent).


Where Are We Headed?

• Two common types of statistical inference

– Chapter 10 – Confidence Intervals for

estimating the value of a population parameter.

– Chapter 11 – Significance Tests – Assess the

evidence for a claim about a population.

• Both report probabilities that state what

would happen if we used the inference

method many times.


Cautions!

• Formal inference requires long-run regular behavior.

• Inference is most reliable when the data are produced by a randomized design.

• If that is not true, your conclusions may be open to challenge.

• Inference cannot fix basic flaws in producing data.

• GIGO (garbage in, garbage out)


Assumptions for Right Now

• We are going to pretend the world is simpler than it is.

• Pretend that the population standard deviation, σ, is known, even though we don’t know μ, the population mean.

• We will start with an overly simple technique for

estimating a population mean.

• Then we will modify our approach to make it more

useful.


The Basics –

IQ and College Admissions

• Administer IQ test to SRS of 50 of 5,000

incoming college freshmen.

• x̄ = 112

• What can you say about µ for the entire class?

• Is µ for the population really 112?

– Probably not, but how close to 112 is μ?


How do we Determine?

• Question to answer: How would x̄ vary if we took many samples of 50 freshmen from the same population?

• Recall from Chapter 9:

– The mean of the sampling distribution of x̄ is the same as µ.

– Standard deviation of x̄ is σ/√n.

– Suppose for this test σ = 15.

– So standard deviation of x̄ = 15/√50 ≈ 2.1.

– The CLT tells us that the mean x̄ of 50 scores is approximately normally distributed.
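The arithmetic on this slide can be checked with a short Python sketch. The variable names are ours, not the textbook's, and σ = 15 is the value the slide asks us to suppose.

```python
import math

sigma = 15  # population standard deviation supposed on the slide
n = 50      # size of the SRS of freshmen

# Standard deviation of the sampling distribution of x-bar: sigma / sqrt(n)
sd_xbar = sigma / math.sqrt(n)

print(round(sd_xbar, 2))  # 2.12, which the slides round to 2.1
```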


Freshman Class

Take every possible combination

of 50 students and get sampling

distribution with mean equal to

unknown μ and std. dev. = 2.1


• To estimate µ, use the x̄ of the random sample.

• x̄ is an unbiased estimator, but it will rarely equal µ exactly, so the estimate has some error.

• The values of x̄ follow an approximately normal distribution with mean µ and standard deviation 2.1.

• The 95 part of the 68-95-99.7 rule says that in 95% of all samples, x̄ will be within two standard deviations of µ, that is, within 2 × 2.1 = 4.2 points.

• Whenever x̄ is within 4.2 of µ, µ is within 4.2 of x̄, so in those samples µ lies between x̄ − 4.2 and x̄ + 4.2.

• That means we estimate that µ is somewhere between 112 − 4.2 = 107.8 and 112 + 4.2 = 116.2.

• The interval is 107.8 to 116.2.

• An interval built this way captures the true µ in about 95% of all samples.
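The interval calculation above can be sketched in Python as follows. The function name `two_sd_interval` is hypothetical, and the code keeps the unrounded standard deviation (about 2.12) rather than the slide's 2.1, which gives the same endpoints to one decimal place.

```python
import math

def two_sd_interval(x_bar, sigma, n):
    """Return (low, high) for x_bar ± 2 * sigma / sqrt(n)."""
    margin = 2 * sigma / math.sqrt(n)  # two standard deviations of x-bar
    return x_bar - margin, x_bar + margin

# Values from the IQ example: x-bar = 112, sigma = 15, n = 50
low, high = two_sd_interval(112, 15, 50)
print(round(low, 1), round(high, 1))  # 107.8 116.2
```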


This is the same curve as the previous slide. In 95% of all samples, x̄ lies within 2 standard deviations of the population mean, μ.

That is the same thing as saying the interval x̄ ± 4.2 (2 standard deviations in either direction) captures μ in 95% of all samples.
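The "95" in the 68-95-99.7 rule is a rounded figure. As a quick check, assuming an exactly normal sampling distribution, Python's standard-library `statistics.NormalDist` gives the exact area within 2 standard deviations and the multiplier that would give exactly 95%.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal distribution (mean 0, SD 1)

# Area within 2 standard deviations of the mean
within_2_sd = z.cdf(2) - z.cdf(-2)
print(round(within_2_sd, 4))  # 0.9545

# Multiplier that leaves exactly 95% in the middle (2.5% in each tail)
exact_multiplier = z.inv_cdf(0.975)
print(round(exact_multiplier, 2))  # 1.96
```

So the "2 standard deviations" recipe is slightly conservative: it covers a little more than 95%.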


Starting with the population, imagine taking all of the possible SRSs of 50 freshmen.

The recipe x̄ ± 4.2 gives us an interval based on each sample; 95% of these intervals capture the unknown population mean μ.

The language of statistical inference uses this fact about what would happen in many samples to express our confidence in the results of any one sample.
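The "many samples" idea can be illustrated with a small simulation. This is only a sketch under stated assumptions: the true mean µ = 110 is made up for illustration (in practice µ is unknown), scores are drawn from a normal population with σ = 15 rather than from the actual class of 5,000, and the function name `coverage` is ours.

```python
import math
import random

def coverage(mu=110.0, sigma=15.0, n=50, trials=10_000, seed=1):
    """Fraction of simulated samples whose interval x_bar ± 2*sigma/sqrt(n) contains mu."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    margin = 2 * sigma / math.sqrt(n)  # about 4.2 for sigma = 15, n = 50
    hits = 0
    for _ in range(trials):
        # Draw one sample of n scores and compute its mean
        x_bar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        # Does this sample's interval capture the (hypothetical) true mean?
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / trials

rate = coverage()
print(rate)  # should come out close to 0.95
```

Each simulated sample gives a different interval; the confidence level describes how often the recipe succeeds across all of them, not whether any single interval is right.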


Interpreting a Confidence Interval

• x̄ = 112

• Interval = 112 ± 4.2 = (107.8, 116.2)

• We are 95% confident that the unknown

mean IQ for the freshmen is between 107.8

and 116.2.


There are TWO Possibilities

• The interval between 107.8 and 116.2 contains the true µ.

OR

• Our SRS was one of the 5% of samples in which x̄ was not within 4.2 of the true µ.

• We cannot say which one our sample is.

• What we are saying is that: “we got these numbers using a method that gives correct results 95% of the time.”


Vocabulary

• The interval x̄ ± 4.2 is called a 95% CONFIDENCE INTERVAL (C.I.) for µ.

• C.I. = point estimate ± margin of error

• x̄ is our point estimate, and the margin of error shows how accurate we believe our guess to be.

• The CONFIDENCE LEVEL is 95%.

• This is our confidence level because the method catches the unknown µ in 95% of all the possible samples.


Here is the formal description

We can choose the confidence level, usually 90% or greater, because we want to be quite sure of our conclusion.

C will stand for Confidence Level in decimal form.

95% confidence level corresponds to C = .95



The red dot is x̄ in each of these 25 samples. In how many of the samples is x̄ 2 SD or more away from µ?

Just this one.


Assignment

• Read Pages 626 – 632

• Exercises 10.1, 10.2, 10.5, 10.6

• Play with the Confidence Interval Applet at: http://digitalfirst.bfwpub.com/stats_applet/stats_applet_4_ci.html

and do activity 10B with it. (Our applet is a bit different but you

should be able to figure out how to use it.)

• Watch: www.learner.org/courses/againstallodds/unitpages/unit24.html