
Chapter

Eight

McGraw-Hill/Irwin

© 2006 The McGraw-Hill Companies, Inc., All Rights Reserved.

Sampling Methods and the Central Limit Theorem

Why sample?

•The physical impossibility of checking all items in the population.

•The cost of studying all the items in a population.

•The time-consuming aspect of contacting the whole population.

•The destructive nature of certain tests.

•The adequacy of sample results in most cases.

The objective of inferential statistics is to determine characteristics of a population based on a sample.

Simple Random Sample: A sample selected so that each item or person in the population has the same chance of being included.

Sampling Methods

One can also use a table of random numbers (Appendix E).
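In place of a printed table of random numbers, a pseudo-random number generator can make the selection. The sketch below is only an illustration: the population of 845 ID numbers, the sample size of 10, and the seed are all made up. Python's `random.sample` draws without replacement, so every item has the same chance of being included.

```python
import random

# Hypothetical population: ID numbers 1..845 (invented for illustration).
population = list(range(1, 846))

rng = random.Random(42)                 # fixed seed so the draw can be repeated
sample = rng.sample(population, k=10)   # simple random sample, without replacement

print(sample)
```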

Systematic Random Sampling: Every kth member of the population is selected for the sample.
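A minimal sketch of systematic selection follows. It assumes a random starting point among the first k members, which is a common convention rather than something the slide states. The function name `systematic_sample` and the population of 100 IDs are hypothetical.

```python
import random

def systematic_sample(items, k, rng):
    """Pick a random start among the first k items, then take every kth item."""
    start = rng.randrange(k)      # random starting index in 0..k-1 (assumed convention)
    return items[start::k]        # every kth member from that start

# Hypothetical population of 100 ID numbers.
population = list(range(1, 101))
sample = systematic_sample(population, k=10, rng=random.Random(0))

print(sample)
```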

Stratified Random Sampling:

A population is first divided into subgroups, called strata, and a sample is selected from each stratum.

E.g., college students may be stratified into freshmen, sophomores, etc., or simply into male and female.
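The stratified procedure can be sketched as below, under several assumptions. The student records are invented, the class-year labels beyond freshman and sophomore are filled in for illustration, and the same number of students is drawn from every stratum (the slide does not say whether allocation is equal or proportional). `stratified_sample` is a hypothetical helper name.

```python
import random
from collections import defaultdict

def stratified_sample(records, stratum_of, per_stratum, rng):
    """Group records into strata, then draw a simple random sample from each."""
    strata = defaultdict(list)
    for record in records:
        strata[stratum_of(record)].append(record)
    return {name: rng.sample(members, per_stratum)
            for name, members in strata.items()}

# Hypothetical students as (id, class year) pairs.
years = ["freshman", "sophomore", "junior", "senior"]
students = [(i, years[i % 4]) for i in range(200)]

sample = stratified_sample(students, lambda s: s[1], per_stratum=5,
                           rng=random.Random(1))
for year, chosen in sample.items():
    print(year, [student_id for student_id, _ in chosen])
```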

Cluster Sampling: A population is first divided into primary units then samples are selected from the primary units.
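One way to read this definition is as a two-stage draw: first pick some primary units at random, then sample within each chosen unit. The sketch below assumes that reading. The county names, resident labels, and the helper `cluster_sample` are all hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, per_cluster, rng):
    """Stage 1: choose primary units at random. Stage 2: sample within each."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: rng.sample(clusters[name], per_cluster) for name in chosen}

# Hypothetical primary units: six counties with 50 residents each.
clusters = {
    f"county_{c}": [f"county_{c}_resident_{i}" for i in range(50)]
    for c in "ABCDEF"
}

sample = cluster_sample(clusters, n_clusters=2, per_cluster=5,
                        rng=random.Random(2))
for county, residents in sample.items():
    print(county, residents)
```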

Question?

If you repeatedly take samples from a population and calculate the sample mean for each sample, what would the distribution of the sample means look like?

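[Figure: a population distribution of x with mean μ and standard deviation σ, beside the distribution of the sample mean x̄, whose mean and standard deviation are left as questions: μ = ?, σ = ?]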

Demo the CLT using Visual Statistics software
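For readers without the Visual Statistics software, a rough substitute demo can be run in Python. This is only a sketch: the exponential population (mean 1, standard deviation 1), the sample size of 30, and the 2,000 repetitions are arbitrary choices, not taken from the slides. It repeatedly draws samples from a strongly skewed population, records each sample mean, and prints a crude text histogram of those means.

```python
import random
import statistics

rng = random.Random(8)
n = 30               # size of each sample (arbitrary choice)
num_samples = 2000   # number of repeated samples (arbitrary choice)

# Skewed population: exponential with mean 1 and standard deviation 1.
sample_means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
print(f"mean of sample means: {mean_of_means:.3f}")
print(f"s.d. of sample means: {sd_of_means:.3f}")

# Crude histogram: count the sample means rounded to one decimal place.
bins = {}
for m in sample_means:
    b = round(m, 1)
    bins[b] = bins.get(b, 0) + 1
for b in sorted(bins):
    print(f"{b:4.1f} {'#' * (bins[b] // 20)}")
```

The histogram should look roughly bell-shaped and centered near 1, even though the population itself is far from normal.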

Generalizing the result

Irrespective of the shape of the distribution of the data in the original population, as you increase the sample size (the minimum recommended is n = 30), the distribution of the sample mean approaches a normal distribution.

Note: If the population distribution is known to be normal, then the sample mean is guaranteed to be normally distributed (even if n < 30).

If all samples of a particular size are selected from any population, the distribution of the sample mean is approximately a normal distribution.

Central Limit Theorem

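As n increases, the sample mean x̄ tends to fall closer to μ (the mean of the distribution of sample means, μx̄, equals μ). So the sample mean is a good estimator of the population mean.

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$

This standard deviation is called the standard error (i.e., the standard deviation of the distribution of sample means).

Note that the standard error is smaller than the population standard deviation σ whenever n > 1.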

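Variance of the sample mean distribution

$$\operatorname{Var}(\bar{x}) = \operatorname{Var}\left(\frac{x_1 + x_2 + \dots + x_n}{n}\right) = \frac{1}{n^2}\left[\operatorname{Var}(x_1) + \operatorname{Var}(x_2) + \dots + \operatorname{Var}(x_n)\right]$$

$$= \frac{1}{n^2}\left[\sigma^2 + \sigma^2 + \dots + \sigma^2\right] = \frac{1}{n^2}\left[n\,\sigma^2\right] = \frac{n\,\sigma^2}{n^2}$$

$$\sigma_{\bar{x}}^2 = \frac{\sigma^2}{n}$$

Therefore, standard deviation = σ/√n (Remember this formula!)

(Here x1, x2, …, xn are the n independent observations in one sample, each with variance σ².)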
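The σ/√n result can be checked numerically. The sketch below is an illustration under assumed settings: a uniform population on [0, 1] (whose standard deviation is √(1/12)), sample sizes of 4, 16, and 64, and 4,000 repeated samples per size. None of these come from the slides. It compares the observed standard deviation of the sample means with σ/√n.

```python
import math
import random
import statistics

rng = random.Random(95)
sigma = math.sqrt(1 / 12)   # standard deviation of a uniform(0, 1) population

results = {}
for n in (4, 16, 64):
    # Draw 4000 samples of size n and record each sample mean.
    means = [
        statistics.fmean([rng.random() for _ in range(n)])
        for _ in range(4000)
    ]
    observed = statistics.stdev(means)    # empirical s.d. of the sample means
    predicted = sigma / math.sqrt(n)      # standard error from the formula
    results[n] = (observed, predicted)
    print(f"n={n:3d}  observed={observed:.4f}  sigma/sqrt(n)={predicted:.4f}")
```

Each time n is quadrupled, both columns should roughly halve, which is the square-root behavior the formula describes.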

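[Figure: the distribution of the population (mean μ, standard deviation σ) compared with the distribution of sample means (mean μ, standard error σ/√n).]

The Z score formula for the distribution of sample means is:

$$z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$

Compare with the Chapter 7 formula:

$$z = \frac{X - \mu}{\sigma}$$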

Practice!

Historically, the average sales per customer at a tire store is known to be $85, with a s.d. of $9. You take a random sample of 40 customers. What is the probability the mean expenditure for this sample will be $87 or more?

$$Z = \frac{87 - 85}{9/\sqrt{40}} = \frac{2}{1.42} = 1.41$$

From Appendix D, the probability between 0 and this Z-score is 0.4207. The probability that the sample mean is $87 or more, i.e. that Z exceeds 1.41, is 0.5 − 0.4207.

Hence, the answer is 0.0793.
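The same calculation can be done without the printed table. This sketch uses Python's standard-library `statistics.NormalDist` for the standard normal curve; the variable names are arbitrary.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 85, 9, 40            # values from the tire store problem
standard_error = sigma / sqrt(n)    # sigma / sqrt(n)
z = (87 - mu) / standard_error      # Z score of a sample mean of $87

# Upper-tail area: probability that the sample mean is $87 or more.
p = 1 - NormalDist().cdf(z)

print(f"standard error = {standard_error:.2f}")
print(f"z = {z:.2f}")
print(f"P(sample mean >= 87) = {p:.4f}")
```

The result is close to 0.08. It can differ slightly from the table-based 0.0793 because the table lookup rounds Z to two decimals.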

Use s in place of σ if the population standard deviation is unknown, so long as n ≥ 30.

Z score formula is:

$$z = \frac{\bar{X} - \mu}{s/\sqrt{n}}$$
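A small sketch of this rule, z = (X̄ − μ)/(s/√n), computed from raw data. The helper name `z_for_sample_mean`, the 35 data values, and the hypothesized mean of 82 are all invented for illustration. The function refuses samples smaller than 30, mirroring the n ≥ 30 condition.

```python
from math import sqrt
from statistics import fmean, stdev

def z_for_sample_mean(sample, mu):
    """Z score of the sample mean, using s in place of an unknown sigma."""
    n = len(sample)
    if n < 30:
        raise ValueError("rule of thumb: use s in place of sigma only when n >= 30")
    s = stdev(sample)                        # sample standard deviation
    return (fmean(sample) - mu) / (s / sqrt(n))

# Hypothetical data: 35 observations taking the values 80..86.
sample = [80 + (i % 7) for i in range(35)]
z = z_for_sample_mean(sample, mu=82)

print(f"z = {z:.2f}")
```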

Practice time!

Problem #17 on page 237

$$Z = \frac{1950 - 2200}{250/\sqrt{50}} = -7.07$$

So the probability is virtually 1.
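The arithmetic can be checked as follows. The text of Problem #17 is not reproduced here, so this sketch assumes, from the numbers in the Z computation, that 2200 is the mean, 250 the standard deviation, 50 the sample size, and that the question asks for the probability that the sample mean is at least 1950.

```python
from math import sqrt
from statistics import NormalDist

# Assumed reading of the numbers in the Z computation (problem text not shown).
mean, sd, n, threshold = 2200, 250, 50, 1950

z = (threshold - mean) / (sd / sqrt(n))
p_at_least = 1 - NormalDist().cdf(z)   # area to the right of z

print(f"z = {z:.2f}")
print(f"P(sample mean >= {threshold}) = {p_at_least:.6f}")
```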
