Statistics for Managers Using Microsoft Excel, 5e © 2008 Pearson Prentice-Hall, Inc. Chap 7-1 Statistics for Managers Using Microsoft® Excel 5th Edition Chapter 7 Sampling and Sampling Distributions

Upload: alexa-ollis

Post on 16-Dec-2015


TRANSCRIPT

Page 1: Statistics for Managers Using Microsoft Excel, 5e © 2008 Pearson Prentice-Hall, Inc.Chap 7-1 Statistics for Managers Using Microsoft® Excel 5th Edition

Statistics for Managers Using Microsoft Excel, 5e © 2008 Pearson Prentice-Hall, Inc. Chap 7-1

Statistics for Managers Using Microsoft® Excel

5th Edition

Chapter 7

Sampling and Sampling Distributions


Learning Objectives

In this chapter, you will learn:

- To distinguish between different survey sampling methods
- The concept of the sampling distribution
- To compute probabilities related to the sample mean and the sample proportion
- The importance of the Central Limit Theorem


Why Sample?

Selecting a sample is less time-consuming than selecting every item in the population (census).

Selecting a sample is less costly than selecting every item in the population.

An analysis of a sample is less cumbersome and more practical than an analysis of the entire population.


Types of Samples

Samples

- Non-Probability Samples: Judgment, Quota, Chunk, Convenience
- Probability Samples: Simple Random, Systematic, Stratified, Cluster


Types of Samples

- In a nonprobability sample, items included are chosen without regard to their probability of occurrence.
  - In convenience sampling, items are selected based only on the fact that they are easy, inexpensive, or convenient to sample.
  - In a judgment sample, you get the opinions of pre-selected experts in the subject matter.


Types of Samples

In a probability sample, items in the sample are chosen on the basis of known probabilities.

Probability Samples: Simple Random, Systematic, Stratified, Cluster


Simple Random Sampling

Every individual or item from the frame has an equal chance of being selected

Selection may be with replacement (selected individual is returned to frame for possible reselection) or without replacement (selected individual isn’t returned to the frame).

Samples are obtained from a table of random numbers or from a computer random number generator.
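The two selection schemes can be sketched in a few lines of Python. This is only an illustration: the frame of 20 numbered items, the function name `simple_random_sample`, and the seed are assumptions, and the standard `random` module stands in for a table of random numbers.

```python
import random

def simple_random_sample(frame, n, replace=False, seed=None):
    """Draw n items from frame, each item having an equal chance of selection."""
    rng = random.Random(seed)
    if replace:
        # With replacement: a selected item returns to the frame and may be drawn again.
        return [rng.choice(frame) for _ in range(n)]
    # Without replacement: a selected item cannot be drawn a second time.
    return rng.sample(frame, n)

frame = list(range(1, 21))  # hypothetical frame: items numbered 1..20
without = simple_random_sample(frame, 5, replace=False, seed=1)
with_repl = simple_random_sample(frame, 5, replace=True, seed=1)
print(without, with_repl)
```

Sampling without replacement can never return the same item twice, whereas sampling with replacement can.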


Systematic Sampling

- Decide on sample size: n
- Divide the frame of N individuals into groups of k individuals: k = N/n
- Randomly select one individual from the 1st group
- Select every kth individual thereafter

For example, suppose you were sampling n = 9 individuals from a population of N = 72. The population would be divided into 9 groups of k = 72/9 = 8 individuals each. Randomly select a member from group 1, say individual 3. Then select every 8th individual thereafter (i.e., 3, 11, 19, 27, 35, 43, 51, 59, 67).
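The worked example can be reproduced with a short sketch. The function name `systematic_sample` is hypothetical, individuals are assumed to be numbered 1 to N, and N is assumed to be an exact multiple of n.

```python
import random

def systematic_sample(N, n, start=None, seed=None):
    """Select every k-th individual from a frame numbered 1..N, where k = N // n."""
    k = N // n  # group size; assumes N is an exact multiple of n
    if start is None:
        # Random starting point within the first group of k individuals.
        start = random.Random(seed).randint(1, k)
    return [start + i * k for i in range(n)]

sample = systematic_sample(N=72, n=9, start=3)
print(sample)  # [3, 11, 19, 27, 35, 43, 51, 59, 67]
```

Leaving `start` unset picks the first individual at random from the first group, which is the step the slide describes.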


Stratified Sampling

Divide population into two or more subgroups (called strata) according to some common characteristic.

A simple random sample is selected from each subgroup, with sample sizes proportional to strata sizes.

Samples from subgroups are combined into one.

This is a common technique when sampling a population of voters, stratifying across racial or socio-economic lines.
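A rough sketch of proportional stratified sampling follows. The three strata "A", "B", "C", their sizes, and the function name `stratified_sample` are made up for illustration. Note that rounding each stratum's share can make the combined sample size differ slightly from n when the shares are not whole numbers.

```python
import random

def stratified_sample(strata, n, seed=None):
    """Proportional allocation: each stratum contributes about n * (stratum size / N) items."""
    rng = random.Random(seed)
    N = sum(len(members) for members in strata.values())
    combined = []
    for members in strata.values():
        n_h = round(n * len(members) / N)          # sample size for this stratum
        combined.extend(rng.sample(members, n_h))  # simple random sample within the stratum
    return combined

# Hypothetical population of 100 split into three strata of sizes 60, 30 and 10.
strata = {
    "A": [f"A{i}" for i in range(60)],
    "B": [f"B{i}" for i in range(30)],
    "C": [f"C{i}" for i in range(10)],
}
sample = stratified_sample(strata, n=20, seed=0)
counts = {name: sum(item.startswith(name) for item in sample) for name in strata}
print(counts)  # {'A': 12, 'B': 6, 'C': 2}
```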


Cluster Sampling

Population is divided into several “clusters,” each representative of the population.

A simple random sample of clusters is selected.

All items in the selected clusters can be used, or items can be chosen from a cluster using another probability sampling technique.

A common application of cluster sampling involves election exit polls, where certain election districts are selected and sampled.
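As an illustration of the first variant (using all items in the selected clusters), here is a minimal sketch. The ten districts of 50 voters each and the function name `cluster_sample` are hypothetical.

```python
import random

def cluster_sample(clusters, m, seed=None):
    """Select m whole clusters at random and keep every item in them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), m)  # simple random sample of cluster names
    items = [item for name in chosen for item in clusters[name]]
    return chosen, items

# Hypothetical frame: 10 election districts with 50 voters each.
clusters = {f"district_{d}": [f"d{d}_voter{v}" for v in range(50)] for d in range(10)}
chosen, items = cluster_sample(clusters, m=3, seed=0)
print(chosen, len(items))
```

To sample within the selected clusters instead, the list comprehension could be swapped for another probability sampling step on each chosen cluster.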


Comparing Sampling Methods

- Simple random sample and systematic sample
  - Simple to use
  - May not be a good representation of the population’s underlying characteristics
- Stratified sample
  - Ensures representation of individuals across the entire population
- Cluster sample
  - More cost effective
  - Less efficient (need larger sample to acquire the same level of precision)


Evaluating Survey Worthiness

- What is the purpose of the survey?
- Were data collected using a non-probability sample or a probability sample?
- Coverage error: appropriate frame?
- Nonresponse error: follow up
- Measurement error: good questions elicit good responses
- Sampling error: always exists


Types of Survey Errors

- Coverage error or selection bias
  - Exists if some groups are excluded from the frame and have no chance of being selected
- Nonresponse error or bias
  - People who do not respond may be different from those who do respond
- Sampling error
  - Chance (luck of the draw) variation from sample to sample
- Measurement error
  - Due to weaknesses in question design, respondent error, and interviewer’s impact on the respondent


Sampling Distributions

A sampling distribution is a distribution of all of the possible values of a statistic for a given size sample selected from a population.

For example, suppose you sample 50 students from your college and compute their mean GPA. If you obtained many different samples of 50, you would generally compute a different mean for each sample. We are interested in the distribution of all the mean GPAs we might calculate from samples of 50 students.


Sampling Distributions

Numerical Data

Suppose your population (simplified) was four people at your institution.

- Population size N = 4
- Random variable, X, is age of individuals
- Values of X: 18, 20, 22, 24 (years)


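Sampling Distributions

Sample Mean Example

Summary Measures for the Population Distribution:

$$\mu = \frac{\sum X_i}{N} = \frac{18 + 20 + 22 + 24}{4} = 21$$

$$\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}} = 2.236$$

[Figure: Uniform Distribution. Bar chart of P(x) against x for the four individuals A, B, C, D with x = 18, 20, 22, 24.]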


Sampling Distributions

Sample Mean Example

Now consider all possible samples of size n = 2.

16 possible samples (sampling with replacement):

1st Obs.    2nd Observation
            18       20       22       24
18          18,18    18,20    18,22    18,24
20          20,18    20,20    20,22    20,24
22          22,18    22,20    22,22    22,24
24          24,18    24,20    24,22    24,24

16 Sample Means:

1st Obs.    2nd Observation
            18    20    22    24
18          18    19    20    21
20          19    20    21    22
22          20    21    22    23
24          21    22    23    24
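The 16 samples and their means can also be enumerated programmatically. The sketch below uses only the four ages from the example; the variable names are illustrative. It also counts how often each sample mean occurs.

```python
from collections import Counter
from itertools import product

population = [18, 20, 22, 24]

# All ordered samples of size 2, sampling with replacement: 4 * 4 = 16 of them.
samples = list(product(population, repeat=2))
means = [sum(s) / len(s) for s in samples]

# Relative frequency of each distinct sample mean.
distribution = Counter(means)
for value in sorted(distribution):
    print(value, distribution[value] / len(means))
```

The middle value 21 occurs four times out of 16, while the extremes 18 and 24 occur once each, so the sample means are not uniformly distributed even though the population values are.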


Sampling Distributions

Sample Mean Example

Sampling Distribution of All Sample Means

16 Sample Means:

1st Obs.    2nd Observation
            18    20    22    24
18          18    19    20    21
20          19    20    21    22
22          20    21    22    23
24          21    22    23    24

[Figure: Sample Means Distribution. Bar chart of P(X̄) against X̄ for X̄ = 18, 19, 20, 21, 22, 23, 24 (no longer uniform).]


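Sampling Distributions

Sample Mean Example

Summary Measures of this Sampling Distribution (here N = 16 is the number of sample means):

$$\mu_{\bar{X}} = \frac{\sum \bar{X}_i}{N} = \frac{18 + 19 + 21 + \cdots + 24}{16} = 21$$

$$\sigma_{\bar{X}} = \sqrt{\frac{\sum (\bar{X}_i - \mu_{\bar{X}})^2}{N}} = \sqrt{\frac{(18-21)^2 + (19-21)^2 + \cdots + (24-21)^2}{16}} = 1.58$$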
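These summary measures can be checked numerically. The sketch below recomputes the mean and standard deviation of the 16 sample means and compares the latter with σ/√n; the variable names are illustrative.

```python
from itertools import product
from math import sqrt
from statistics import mean, pstdev

population = [18, 20, 22, 24]
n = 2

# Means of all 16 samples of size 2 drawn with replacement.
sample_means = [mean(s) for s in product(population, repeat=n)]

mu_xbar = mean(sample_means)       # mean of the sampling distribution
sigma_xbar = pstdev(sample_means)  # its standard deviation (divides by 16, not 15)
sigma = pstdev(population)         # population standard deviation

print(mu_xbar, round(sigma_xbar, 2), round(sigma / sqrt(n), 2))  # 21 1.58 1.58
```

`pstdev` is used rather than `stdev` because the slide divides by N, treating the 16 means as the complete set of possible outcomes.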


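Sampling Distributions

Sample Mean Example

Population (N = 4):

$$\mu = 21 \qquad \sigma = 2.236$$

[Figure: bar chart of P(X) against X for individuals A, B, C, D with X = 18, 20, 22, 24.]

Sample Means Distribution (n = 2):

$$\mu_{\bar{X}} = 21 \qquad \sigma_{\bar{X}} = 1.58$$

[Figure: bar chart of P(X̄) against X̄ for X̄ = 18, 19, 20, 21, 22, 23, 24.]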


Sampling Distributions

Standard Error

Different samples of the same size from the same population will yield different sample means.

A measure of the variability in the mean from sample to sample is given by the Standard Error of the Mean:

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$

Note that the standard error of the mean decreases as the sample size increases.
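A small sketch makes the effect of sample size concrete. It reuses σ = 2.236 from the age example; the sample sizes and the function name `standard_error` are arbitrary choices for illustration.

```python
from math import sqrt

def standard_error(sigma, n):
    """Standard error of the mean for samples of size n: sigma / sqrt(n)."""
    return sigma / sqrt(n)

sigma = 2.236  # population standard deviation from the age example
for n in (2, 4, 16, 64):
    print(n, round(standard_error(sigma, n), 3))
```

Because n sits under a square root, quadrupling the sample size only halves the standard error.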


Sampling Distributions

Standard Error: Normal Pop.

If a population is normal with mean μ and standard deviation σ, the sampling distribution of the mean is also normally distributed with

$$\mu_{\bar{X}} = \mu$$

and

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$

(This assumes that sampling is with replacement or sampling is without replacement from an infinite population)


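Sampling Distributions

Z Value: Normal Pop.

Z-value for the sampling distribution of the sample mean:

$$Z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$

where:

- $\bar{X}$ = sample mean
- $\mu$ = population mean
- $\sigma$ = population standard deviation
- n = sample size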


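Sampling Distributions

Properties: Normal Pop.

$$\mu_{\bar{x}} = \mu$$

(i.e. $\bar{x}$ is unbiased)

[Figure: Normal Population Distribution, centered at μ on the x axis, shown with the Normal Sampling Distribution (has the same mean), centered at μ_x̄ on the x̄ axis.]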


Sampling Distributions

Properties: Normal Pop.

For sampling with replacement: as n increases, $\sigma_{\bar{x}}$ decreases.

[Figure: two sampling distributions of the mean, labeled "Larger sample size" and "Smaller sample size".]


Sampling Distributions

Non-Normal Population

The Central Limit Theorem states that as the sample size (that is, the number of values in each sample) gets large enough, the sampling distribution of the mean is approximately normally distributed. This is true regardless of the shape of the distribution of the individual values in the population.

Measures of the sampling distribution:

$$\mu_{\bar{x}} = \mu \qquad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$


Sampling Distributions

Non-Normal Population

[Figure: Population Distribution, with mean μ on the x axis, shown with the Sampling Distribution (becomes normal as n increases) on the x̄ axis, with curves labeled "Larger sample size" and "Smaller sample size".]


Sampling Distributions

Non-Normal Population

For most distributions, n > 30 will give a sampling distribution that is nearly normal

For fairly symmetric distributions, n > 15 will give a sampling distribution that is nearly normal

For normal population distributions, the sampling distribution of the mean is always normally distributed
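These rules of thumb can be explored with a simulation. The sketch below is an illustration only: the exponential population (mean 1, standard deviation 1, strongly right-skewed), the sample size of 36, the 5000 repetitions, and the seed are all assumptions chosen for the demonstration, not part of the original slides.

```python
import random
from math import sqrt
from statistics import mean, pstdev

def simulate_sample_means(n, reps, seed=0):
    """Return the means of `reps` samples of size n from an exponential population."""
    rng = random.Random(seed)
    # expovariate(1.0) has population mean 1 and standard deviation 1.
    return [sum(rng.expovariate(1.0) for _ in range(n)) / n for _ in range(reps)]

n = 36
means = simulate_sample_means(n=n, reps=5000)
se = 1 / sqrt(n)  # theoretical standard error, sigma / sqrt(n)

# Share of simulated means within one standard error of the population mean;
# a normal distribution would put about 68% there.
within_one_se = sum(abs(m - 1.0) <= se for m in means) / len(means)

print(round(mean(means), 3), round(pstdev(means), 3), round(se, 3), round(within_one_se, 3))
```

The simulated means should cluster around 1 with a spread close to 1/6, and roughly two thirds of them should fall within one standard error of the population mean. Rerunning with a smaller n, such as 2 or 5, shows the skew of the population carrying over into the sample means.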


Sampling Distributions

Example

Suppose a population has mean μ = 8 and standard deviation σ = 3. Suppose a random sample of size n = 36 is selected.

What is the probability that the sample mean is between 7.75 and 8.25?

Even if the population is not normally distributed, the central limit theorem can be used (n > 30).

So, the distribution of the sample mean is approximately normal with

$$\mu_{\bar{x}} = 8 \qquad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{3}{\sqrt{36}} = 0.5$$


Sampling Distributions

Example (Numerical data)

So, the distribution of the sample mean is approximately normal (with n > 30).

Use PHStat's Normal distribution calculator with the revised parameters:

$$\mu_{\bar{x}} = 8 \qquad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{3}{\sqrt{36}} = 0.5$$
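As an alternative to a spreadsheet calculator, the same probability can be sketched with Python's standard library. `statistics.NormalDist` is used here as a stand-in for the normal distribution calculator; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 8, 3, 36

# Approximate sampling distribution of the sample mean: normal with
# mean mu and standard error sigma / sqrt(n) = 0.5.
xbar_dist = NormalDist(mu=mu, sigma=sigma / sqrt(n))

prob = xbar_dist.cdf(8.25) - xbar_dist.cdf(7.75)
print(round(prob, 4))  # 0.3829
```

The bounds 7.75 and 8.25 lie half a standard error on either side of the mean, so this is P(-0.5 ≤ Z ≤ 0.5).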


Sampling Distributions

The Proportion (Categorical Data)

The proportion of the population having some characteristic is denoted π.

Sample proportion (p) provides an estimate of π:

$$p = \frac{X}{n} = \frac{\text{number of items in the sample having the characteristic of interest}}{\text{sample size}}$$

0 ≤ p ≤ 1

p has a binomial distribution (assuming sampling with replacement from a finite population or without replacement from an infinite population)


Sampling Distribution of p

Mean for the proportion:

$$\mu_p = \pi$$

Standard error for the proportion:

$$\sigma_p = \sqrt{\frac{\pi(1 - \pi)}{n}}$$

Approximated by a normal distribution if:

$$n\pi \ge 5 \quad \text{and} \quad n(1 - \pi) \ge 5$$
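The three quantities above can be bundled into one small helper. This is a sketch; the function name `proportion_sampling_distribution` and the two test cases are assumptions for illustration.

```python
from math import sqrt

def proportion_sampling_distribution(pi, n):
    """Return (mean, standard error, normal-approximation flag) for a sample proportion."""
    mean_p = pi
    se_p = sqrt(pi * (1 - pi) / n)
    # Rule of thumb: both n*pi and n*(1 - pi) must be at least 5.
    normal_ok = n * pi >= 5 and n * (1 - pi) >= 5
    return mean_p, se_p, normal_ok

print(proportion_sampling_distribution(0.4, 200))   # conditions met: 80 and 120
print(proportion_sampling_distribution(0.01, 100))  # n*pi = 1, so the flag is False
```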


Sampling Distributions

The Proportion: Example

If the true proportion of voters who support Proposition A is π = .4, what is the probability that a sample of size 200 yields a sample proportion between .40 and .45?

In other words, if π = .4 and n = 200, what is P(.40 ≤ p ≤ .45)?


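Sampling Distributions

The Proportion: Example

With n = 200 and π = .4, check the conditions for the normal approximation:

$$n\pi = 200 \times .4 = 80 \quad \text{and} \quad n(1 - \pi) = 200 \times .6 = 120, \quad \text{both} \ge 5$$

Find $\sigma_p$:

$$\sigma_p = \sqrt{\frac{\pi(1 - \pi)}{n}} = \sqrt{\frac{.4(1 - .4)}{200}} = .03464$$

Use PHStat's Normal distribution calculator with the revised parameters $\mu_p = .4$; $\sigma_p = .03464$:

P(.40 ≤ p ≤ .45) = .4251

If sampling from a finite population, check the fpc box and input N.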
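The same calculation can be sketched with Python's standard library in place of the spreadsheet calculator. `statistics.NormalDist` is assumed as the stand-in, and no finite population correction is applied.

```python
from math import sqrt
from statistics import NormalDist

pi, n = 0.4, 200

se_p = sqrt(pi * (1 - pi) / n)          # standard error of the proportion
p_dist = NormalDist(mu=pi, sigma=se_p)  # normal approximation for p

prob = p_dist.cdf(0.45) - p_dist.cdf(0.40)
z = (0.45 - pi) / se_p                  # Z-value of the upper bound

print(round(se_p, 5), round(z, 2), round(prob, 4))
```

With the unrounded Z-value of about 1.443 this gives roughly 0.4255. The .4251 quoted above matches the standard normal table entry for Z rounded to 1.44.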


Chapter Summary

In this chapter, we have

- Described different types of samples
- Examined survey worthiness and types of survey errors
- Introduced sampling distributions
- Described the sampling distribution of the mean
  - For normal populations
  - Using the Central Limit Theorem
- Described the sampling distribution of a proportion
- Calculated probabilities using sampling distributions