Chapter 7: Sampling Distributions Section 7.1 How Likely Are the Possible Values of a Statistic? The Sampling Distribution

Upload: cecilia-wilcox

Post on 18-Jan-2016

Chapter 7: Sampling Distributions

Section 7.1

How Likely Are the Possible Values of a Statistic? The Sampling Distribution

Learning Objectives

1. Statistic vs. Parameter

2. Sampling Distributions

3. Mean and Standard Deviation of the Sampling Distribution of a Proportion

4. Standard Error

5. Sampling Distribution Example

6. Population, Data, and Sampling Distributions

Learning Objective 1: Statistic and Parameter

A statistic is a numerical summary of sample data, such as a sample proportion or sample mean.

A parameter is a numerical summary of a population such as a population proportion or population mean.

In practice, we seldom know the values of parameters.

Parameters are estimated using sample data. We use statistics to estimate parameters.

Learning Objective 2: Sampling Distributions

Example: Prior to counting the votes, the proportion in favor of recalling Governor Gray Davis was an unknown parameter.

An exit poll of 3160 voters reported that the sample proportion in favor of a recall was 0.54.

If a different random sample of about 3000 voters were selected, a different sample proportion would occur.

The sampling distribution of the sample proportion shows all possible values and the probabilities for those values.

Learning Objective 2: Sampling Distributions

The sampling distribution of a statistic is the probability distribution that specifies probabilities for the possible values the statistic can take.

Sampling distributions describe the variability that occurs from study to study when statistics are used to estimate population parameters.

Sampling distributions help to predict how close a statistic falls to the parameter it estimates.
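One way to see a sampling distribution is to simulate it. The sketch below is not from the slides: it assumes a hypothetical population proportion of 0.5, samples of size 100, and 2000 repeated samples, and the function name `simulate_sample_proportions` is made up for illustration. The relative frequency of each sample proportion approximates its probability.

```python
import random
from collections import Counter

def simulate_sample_proportions(p, n, num_samples, seed=0):
    """Draw num_samples random samples of size n from a population in which
    a proportion p falls in the category of interest, and return the
    sample proportion from each sample."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    proportions = []
    for _ in range(num_samples):
        successes = sum(1 for _ in range(n) if rng.random() < p)
        proportions.append(successes / n)
    return proportions

# Hypothetical settings (assumptions, not values from the slides)
props = simulate_sample_proportions(p=0.5, n=100, num_samples=2000)

# Relative frequency of each observed value approximates its probability
freq = Counter(props)
for value in sorted(freq):
    print(f"{value:.2f}  {freq[value] / len(props):.4f}")
```

The printed table is an empirical approximation of the sampling distribution; most of the simulated proportions should cluster near the assumed population proportion.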

Learning Objective 3: Mean and SD of the Sampling Distribution of a Proportion

For a random sample of size n from a population with proportion p of outcomes in a particular category, the sampling distribution of the proportion of the sample in that category has

$$\text{Mean} = p \qquad \text{and} \qquad \text{standard deviation} = \sqrt{\frac{p(1-p)}{n}}$$

Learning Objective 4: The Standard Error

To distinguish the standard deviation of a sampling distribution from the standard deviation of an ordinary probability distribution, we refer to it as a standard error.

Learning Objective 5: Example: 2006 California Election

If the population proportion supporting the reelection of Schwarzenegger was 0.50, would it have been unlikely to observe the exit-poll sample proportion of 0.565?

Based on your answer, would you be willing to predict that Schwarzenegger would win the election?

Learning Objective 5: Example: 2006 California Election

Given that the exit poll had 2705 people, and assuming that 50% support the reelection of Schwarzenegger, find the mean and the standard error of the sampling distribution of the sample proportion:

$$\text{Mean} = p = 0.5 \qquad \text{standard error} = \sqrt{\frac{0.5\,(1-0.5)}{2705}} = 0.0096$$

Learning Objective 5: Example: 2006 California Election

The sample proportion of 0.565 is more than six standard errors from the expected value of 0.50:

$$z = \frac{0.565 - 0.50}{0.0096} = 6.8$$

The sample proportion of 0.565 voting for reelection of Schwarzenegger would be very unlikely if the population proportion were p = 0.50 or p < 0.50.
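The arithmetic for this example can be checked with a few lines of Python. This is a sketch rather than part of the original slides; it uses the standard library's `statistics.NormalDist` and assumes the normal approximation to the sampling distribution is adequate for n = 2705.

```python
from math import sqrt
from statistics import NormalDist

p = 0.50        # population proportion assumed in the example
n = 2705        # exit-poll sample size
p_hat = 0.565   # observed sample proportion

# Standard error of the sample proportion: sqrt(p(1-p)/n)
standard_error = sqrt(p * (1 - p) / n)

# Number of standard errors between the observed and expected proportions
z = (p_hat - p) / standard_error

# Upper-tail probability under the normal approximation; by symmetry
# P(Z >= z) = P(Z <= -z), which avoids subtracting from 1
tail_prob = NormalDist().cdf(-z)

print(f"standard error = {standard_error:.4f}")
print(f"z = {z:.2f}")
print(f"P(sample proportion >= {p_hat}) is about {tail_prob:.1e}")
```

The z value computed from the unrounded standard error differs slightly from one computed with 0.0096, but both round to 6.8, and the tail probability is vanishingly small.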

Learning Objective 6: Population Distribution

Population distribution: This is the probability distribution from which we take the sample. Values of its parameters are usually unknown.

They’re what we’d like to learn about.

Learning Objective 6: Data Distribution

This is the distribution of the sample data. It’s the distribution we actually see in practice.

It’s described by statistics.

With random sampling, the larger the sample size n, the more closely the data distribution resembles the population distribution.

Learning Objective 6: Sampling Distribution

This is the probability distribution of a sample statistic.

With random sampling, the sampling distribution provides probabilities for all the possible values of the statistic.

The sampling distribution provides the key for telling us how close a sample statistic falls to the corresponding unknown parameter.

Its standard deviation is called the standard error.

Learning Objective 6: Example

In the 2006 U.S. Senate election in New York:

An exit poll of 1336 voters showed that 67% (895) voted for Clinton and 33% (441) voted for Spencer.

When all 4.1 million votes were tallied, 68% voted for Clinton and 32% voted for Spencer.

Let X = vote outcome, with x = 1 for Clinton and x = 0 for Spencer.

Learning Objective 6: Example

The population distribution is the 4.1 million values of the x vote variable, 32% of which are 0 and 68% of which are 1.

The data distribution is the 1336 values of the x vote for the exit poll, 33% of which are 0 and 67% of which are 1.

The sampling distribution of the sample proportion is approximately a normal distribution with mean p = 0.68 and standard error

$$\sqrt{\frac{0.68\,(1-0.68)}{1336}} = 0.013$$

Only the sampling distribution is bell-shaped; the others are discrete and concentrated at the two values 0 and 1.

Chapter 7: Sampling Distributions

Section 7.2

How Close Are Sample Means to Population Means?

Learning Objectives

1. The Sampling Distribution of the Sample Mean

2. Effect of n on the Standard Error

3. Central Limit Theorem (CLT)

4. Calculating Probabilities of Sample Means

Learning Objective 1: The Sampling Distribution of the Sample Mean

The sample mean, $\bar{x}$, is a random variable.

The sample mean varies from sample to sample.

By contrast, the population mean, µ, is a single fixed number.

Learning Objective 1: The Sampling Distribution of the Sample Mean

For a random sample of size n from a population having mean µ and standard deviation σ, the sampling distribution of the sample mean has:

Center described by the mean µ (the same as the mean of the population).

Spread described by the standard error, which equals the population standard deviation divided by the square root of the sample size:

$$\text{standard error of } \bar{x} = \frac{\sigma}{\sqrt{n}}$$

Learning Objective 1: Example 1: Pizza Sales

Daily sales at a pizza restaurant vary from day to day.

The sales figures fluctuate around a mean µ = $900 with a standard deviation σ = $300.

What are the center and spread of the sampling distribution of the mean of n = 7 daily sales figures?

Mean = µ = $900

$$\text{standard error} = \frac{300}{\sqrt{7}} \approx 113$$

Learning Objective 2: Effect of n on the Standard Error

Knowing how to find a standard error gives us a mechanism for understanding how much variability to expect in sample statistics “just by chance.”

The standard error of the sample mean = $\sigma/\sqrt{n}$

As the sample size n increases, the denominator increases, so the standard error decreases.

With larger samples, the sample mean is more likely to fall closer to the population mean.
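A short numerical illustration of this effect, offered as a sketch: it reuses σ = 300 from the pizza sales example, while the sample sizes (each four times the previous one) and the helper name `standard_error_of_mean` are assumptions chosen for illustration.

```python
from math import sqrt

def standard_error_of_mean(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)

sigma = 300  # population standard deviation from the pizza sales example

# Illustrative sample sizes; each is 4 times the previous one
sample_sizes = [7, 28, 112, 448]
errors = [standard_error_of_mean(sigma, n) for n in sample_sizes]

for n, se in zip(sample_sizes, errors):
    print(f"n = {n:4d}   standard error = {se:7.2f}")
```

Because n sits under a square root, quadrupling the sample size only halves the standard error.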

Learning Objective 3: Central Limit Theorem

Question: How does the sampling distribution of the sample mean relate, in shape, center, and spread, to the probability distribution from which the samples were taken?

Learning Objective 3: Central Limit Theorem (CLT)

For random sampling with a large sample size n, the sampling distribution of the sample mean is approximately a normal distribution.

This result applies no matter what the shape of the probability distribution from which the samples are taken.

Learning Objective 3: CLT: How Large a Sample?

The sampling distribution of the sample mean takes more of a bell shape as the random sample size n increases.

The more skewed the population distribution, the larger n must be before the shape of the sampling distribution is close to normal.

In practice, the sampling distribution is usually close to normal when the sample size n is at least about 30.

If the population distribution is approximately normal, then the sampling distribution is approximately normal for all sample sizes.
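The effect of n on shape can be explored by simulation. The sketch below is an illustration only: the population is assumed to be exponential with mean 1 (strongly right-skewed), and the sample sizes, the number of repetitions, and the helper names `sample_means` and `summarize` are arbitrary choices. Skewness near 0 indicates a roughly symmetric, bell-like shape.

```python
import random
from math import sqrt

def sample_means(n, num_samples, rng):
    """Means of num_samples random samples of size n drawn from an
    exponential population with mean 1 and standard deviation 1."""
    return [sum(rng.expovariate(1.0) for _ in range(n)) / n
            for _ in range(num_samples)]

def summarize(values):
    """Return (mean, standard deviation, skewness) of a list of numbers."""
    count = len(values)
    m = sum(values) / count
    s = sqrt(sum((v - m) ** 2 for v in values) / (count - 1))
    skew = sum(((v - m) / s) ** 3 for v in values) / count
    return m, s, skew

rng = random.Random(1)  # fixed seed so the run is repeatable
results = {}
for n in (2, 30, 200):
    results[n] = summarize(sample_means(n, 2000, rng))
    m, s, skew = results[n]
    print(f"n = {n:3d}  mean = {m:.3f}  std. error = {s:.3f}  skewness = {skew:.2f}")
```

In runs like this, the center stays near the population mean, the spread shrinks roughly like 1/√n, and the skewness moves toward 0 as n grows.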

Learning Objective 3: CLT: Impact of Increasing n

[Figure not included in the transcript.]

Learning Objective 3: CLT: Making Inferences

For large n, the sampling distribution is approximately normal even if the population distribution is not.

This enables us to make inferences about population means regardless of the shape of the population distribution.

Learning Objective 4: Calculating Probabilities of Sample Means

The weights of milk bottles are normally distributed with a mean of 1.1 lbs and a standard deviation σ = 0.20 lbs.

What is the probability that the mean of a random sample of 5 bottles will be greater than 0.99 lbs?

Calculate the mean and standard error for the sampling distribution of a random sample of 5 milk bottles.

Because the population distribution is normal, $\bar{x}$ is approximately normal even for n = 5, with mean = 1.1 and standard error = $0.2/\sqrt{5} = 0.0894$.

$$P(\bar{x} > 0.99) = \text{normcdf}(0.99,\ 1\text{E}99,\ 1.1,\ 0.20/\sqrt{5}) = 0.89$$
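For readers without a calculator that provides normcdf, the same probability can be computed in Python. This is a sketch using the standard library's `statistics.NormalDist`, not part of the original slides; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu = 1.1       # population mean weight (lbs)
sigma = 0.20   # population standard deviation (lbs)
n = 5          # bottles per sample

# Sampling distribution of the sample mean: normal with mean mu
# and standard deviation equal to the standard error sigma / sqrt(n)
standard_error = sigma / sqrt(n)
sampling_dist = NormalDist(mu=mu, sigma=standard_error)

# P(sample mean > 0.99) is the area to the right of 0.99
prob = 1 - sampling_dist.cdf(0.99)

print(f"standard error = {standard_error:.4f}")
print(f"P(sample mean > 0.99) = {prob:.2f}")
```

Taking one minus the cumulative probability plays the role of the calculator's very large upper bound (1E99).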

Learning Objective 4: Calculating Probabilities of Sample Means

Closing prices of stocks have a right-skewed distribution with a mean (µ) of $25 and σ = $20.

What is the probability that the mean of a random sample of 40 stocks will be less than $20?

Calculate the mean and standard error for the sampling distribution of a random sample of 40 stocks.

By the CLT, $\bar{x}$ is approximately normal with mean = 25 and standard error = $20/\sqrt{40} = 3.1623$.

$$P(\bar{x} < 20) = \text{normcdf}(-1\text{E}99,\ 20,\ 25,\ 20/\sqrt{40}) = 0.06$$
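The same answer can be reached by standardizing to a z-score and using the standard normal distribution. The following worked step is added for illustration and is not on the original slide:

```latex
z = \frac{20 - 25}{20/\sqrt{40}} = \frac{-5}{3.1623} \approx -1.58,
\qquad
P(\bar{x} < 20) \approx P(Z < -1.58) \approx 0.06
```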

Learning Objective 4: Calculating Probabilities of Sample Means

An automobile insurer has found that repair claims have a mean of $920 and a standard deviation of $870. Suppose that the next 100 claims can be regarded as a random sample from the long-run claims process.

What is the probability that the average of the 100 claims is larger than $900?

$$P(\bar{x} > 900) = \text{normcdf}(900,\ 1\text{E}99,\ 920,\ 870/\sqrt{100}) = 0.59$$

Learning Objective 4: Calculating Probabilities of Sample Means

Example: The distribution of actual weights of 8 oz. wedges of cheddar cheese produced by a certain company is normal with mean = 8.1 oz. and standard deviation = 0.1 oz.

Find the value x such that there is only a 10% chance that the average weight of a sample of five wedges will be above x.

$$x = \text{invnorm}(0.9,\ 8.1,\ 0.1/\sqrt{5}) = 8.16$$

Learning Objective 4: Calculating Probabilities of Sample Means

Example: The distribution of actual weights of 8 oz. wedges of cheddar cheese produced by a certain company is normal with mean = 8.1 oz. and standard deviation = 0.1 oz.

Find the value x such that there is only a 5% chance that the average weight of a sample of five wedges will be below x.

$$x = \text{invnorm}(0.05,\ 8.1,\ 0.1/\sqrt{5}) = 8.03$$
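Both cheddar cheese cutoffs can be reproduced in Python. This sketch, which is not from the slides, uses `NormalDist.inv_cdf` from the standard library as a counterpart to the calculator's invnorm; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu = 8.1     # mean wedge weight (oz.)
sigma = 0.1  # standard deviation of wedge weight (oz.)
n = 5        # wedges per sample

# Sampling distribution of the mean weight of five wedges
sampling_dist = NormalDist(mu=mu, sigma=sigma / sqrt(n))

# 10% chance of being above x means 90% of the area lies below x
upper_cutoff = sampling_dist.inv_cdf(0.90)

# 5% chance of being below x means 5% of the area lies below x
lower_cutoff = sampling_dist.inv_cdf(0.05)

print(f"10% of sample means fall above {upper_cutoff:.2f} oz.")
print(f" 5% of sample means fall below {lower_cutoff:.2f} oz.")
```

The first argument is always the area to the left of the cutoff, which is why a 10% upper tail is entered as 0.90.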

Chapter 7: Sampling Distributions

Section 7.3

How Can We Make Inferences About a Population?

Learning Objectives

1. Using the CLT to Make Inferences

2. Standard Errors in Practice

3. Sampling Distribution for a Proportion

Learning Objective 1: Using the CLT to Make Inferences

Implications of the CLT:

When the sampling distribution of the sample mean is approximately normal, $\bar{x}$ falls within 2 standard errors of µ with probability close to 0.95 and almost certainly falls within 3 standard errors of µ.

For large n, the sampling distribution of $\bar{x}$ is approximately normal no matter what the shape of the underlying population distribution.
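The "2 standard errors" and "3 standard errors" statements come from areas under the standard normal curve, which can be checked directly. A minimal sketch using the standard library; the helper name `prob_within` is made up for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

def prob_within(k):
    """P(-k < Z < k) for a standard normal Z, i.e. the probability that a
    normal statistic falls within k standard errors of its mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

within_2 = prob_within(2)
within_3 = prob_within(3)

print(f"within 2 standard errors: {within_2:.4f}")
print(f"within 3 standard errors: {within_3:.4f}")
```

These values apply exactly only when the sampling distribution is exactly normal; otherwise they are approximations whose quality improves with n.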

Learning Objective 2: Standard Errors in Practice

In practice, standard errors are estimated.

Standard errors have exact values depending on parameter values, e.g.,

$\sqrt{p(1-p)/n}$ for a sample proportion

$\sigma/\sqrt{n}$ for a sample mean

In practice, these parameter values are unknown. Inference methods use standard errors that substitute sample values for the parameters in the exact formulas above.

These estimated standard errors are the numbers we use in practice.
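A brief sketch of the substitution described above. The proportion example reuses the California exit poll figures (sample proportion 0.565, n = 2705); the sample standard deviation s = 280 with n = 7 is a hypothetical value chosen only for illustration, and both function names are made up.

```python
from math import sqrt

def estimated_se_proportion(p_hat, n):
    """Estimated standard error of a sample proportion: the sample
    proportion p_hat replaces the unknown population proportion p."""
    return sqrt(p_hat * (1 - p_hat) / n)

def estimated_se_mean(s, n):
    """Estimated standard error of a sample mean: the sample standard
    deviation s replaces the unknown population value sigma."""
    return s / sqrt(n)

# Exit-poll values from the 2006 California example
se_prop = estimated_se_proportion(0.565, 2705)

# Hypothetical sample standard deviation (an assumption, not from the slides)
se_mean = estimated_se_mean(280, 7)

print(f"estimated SE of the proportion = {se_prop:.4f}")
print(f"estimated SE of the mean       = {se_mean:.1f}")
```

The estimated standard error for the proportion is close to, but not the same as, the value 0.0096 obtained earlier under the assumption p = 0.50.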

Learning Objective 3: Sampling Distribution for a Proportion

The binomial probability distribution is the sampling distribution for the number of successes in n independent trials.

In practice, the sample proportion of successes is the statistic usually reported.

Since the sample proportion is simply the number of successes divided by the number of trials, the formulas for the mean and standard deviation of the sampling distribution of the proportion of successes are the formulas for the mean and standard deviation of the number of successes divided by n.

Learning Objective 3: Sampling Distribution for a Proportion

For a binomial random variable with n trials and probability p of success for each, the sampling distribution of the proportion of successes has

Mean = p and standard error = $\sqrt{p(1-p)/n}$

These values can be found by taking the mean np and the standard deviation $\sqrt{np(1-p)}$ for the binomial distribution of the number of successes and dividing by n.
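Written out, the division by n works as follows. This worked step is added for illustration, with X denoting the binomial number of successes and $\hat{p} = X/n$ the sample proportion (the symbol $\hat{p}$ is notation introduced here, not used on the slides):

```latex
\text{Mean of } \hat{p} = \frac{np}{n} = p,
\qquad
\text{standard error of } \hat{p}
  = \frac{\sqrt{np(1-p)}}{n}
  = \sqrt{\frac{np(1-p)}{n^{2}}}
  = \sqrt{\frac{p(1-p)}{n}}
```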