Chapter Eight
McGraw-Hill/Irwin
© 2006 The McGraw-Hill Companies, Inc., All Rights Reserved.
Sampling Methods and the Central Limit Theorem
Why sample?
• The physical impossibility of checking all items in the population.
• The cost of studying all the items in a population.
• The time-consuming aspect of contacting the whole population.
• The destructive nature of certain tests.
• The adequacy of sample results in most cases.
The objective of inferential statistics is to determine characteristics of a population based on a sample.
Sampling Methods
Simple Random Sample: A sample selected so that each item or person in the population has the same chance of being included.
One can also use a table of random numbers (Appendix E).
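A minimal sketch of simple random sampling, using Python's pseudo-random generator in place of a printed table of random numbers. The function name `simple_random_sample` and the 100-item population are illustrative assumptions, not part of the slides.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n items without replacement; every item has the same chance."""
    rng = random.Random(seed)  # seeded only so the example is repeatable
    return rng.sample(list(population), n)

# Hypothetical population: 100 items numbered 1..100.
population = list(range(1, 101))
sample = simple_random_sample(population, 10, seed=1)
print(sample)
```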
Systematic Random Sampling: Every kth member of the population is selected for the sample.
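A short sketch of systematic sampling. Starting from a randomly chosen position among the first k members is a common convention and is an assumption here; the slide only states that every kth member is selected. The name `systematic_sample` is hypothetical.

```python
import random

def systematic_sample(population, k, seed=None):
    """Select every kth member, starting at a random offset in [0, k)."""
    rng = random.Random(seed)
    start = rng.randrange(k)   # assumed: random starting point
    return population[start::k]

# Hypothetical population of 100 numbered members, taking every 10th one.
members = list(range(100))
every_tenth = systematic_sample(members, 10, seed=0)
print(every_tenth)
```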
Stratified Random Sampling:
A population is first divided into subgroups, called strata, and a sample is selected from each stratum.
E.g., college students may be stratified into freshmen, sophomores, etc., or simply into male and female.
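One possible way to sketch stratified sampling, assuming the same number of students is drawn from each stratum (the slides do not say whether the allocation is equal or proportional). The student records and the name `stratified_sample` are made up for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(items, stratum_of, n_per_stratum, seed=None):
    """Group items into strata, then draw a simple random sample from each."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for item in items:
        strata[stratum_of(item)].append(item)
    return {
        name: rng.sample(group, min(n_per_stratum, len(group)))
        for name, group in strata.items()
    }

# Hypothetical data: 100 students, 25 in each class year.
years = ["freshman", "sophomore", "junior", "senior"]
students = [(f"student{i}", years[i % 4]) for i in range(100)]
by_year = stratified_sample(students, lambda s: s[1], 5, seed=2)
for year, chosen in by_year.items():
    print(year, [name for name, _ in chosen])
```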
Cluster Sampling: A population is first divided into primary units then samples are selected from the primary units.
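A rough sketch of cluster sampling, assuming a two-step procedure: some primary units are chosen at random, and then items are sampled within each chosen unit. The counties, households, and the name `cluster_sample` are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Pick n_clusters primary units at random, then sample items within each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {
        name: rng.sample(clusters[name], min(n_per_cluster, len(clusters[name])))
        for name in chosen
    }

# Hypothetical primary units: 6 counties with 20 households each.
counties = {f"county{c}": [f"c{c}-h{h}" for h in range(20)] for c in range(6)}
picked = cluster_sample(counties, n_clusters=2, n_per_cluster=5, seed=3)
print(picked)
```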
Question?
If you repeatedly take samples from a population and calculate the sample mean for each sample, what would the distribution of the sample means look like?
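[Figure: the population distribution (mean μ, standard deviation σ) next to the distribution of sample means x̄, whose mean (μ = ?) and standard deviation (σ = ?) are to be determined]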
Demo the CLT using Visual Statistics software
Generalizing the result
Irrespective of the shape of the distribution of the data in the original population, as the sample size increases (a common rule of thumb is n ≥ 30), the distribution of the sample mean approaches a normal distribution.
Note: If the population distribution is known to be normal, then the sample mean is normally distributed for any sample size (even if n < 30).
Central Limit Theorem
If all samples of a particular size are selected from any population, the distribution of the sample mean is approximately a normal distribution.
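The theorem can be explored with a small simulation. The sketch below assumes a strongly right-skewed population (exponential with rate 1, chosen only for illustration) and compares the skewness of raw draws with the skewness of means of samples of size 30. A skewness near 0 is consistent with a symmetric, normal-like shape.

```python
import random
import statistics

def skewness(values):
    """Population skewness: the mean cubed z-score (0 for a symmetric shape)."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

rng = random.Random(42)
n = 30            # sample size
repeats = 5000    # number of samples drawn

# Raw draws from the skewed population (assumed exponential, rate 1).
raw_draws = [rng.expovariate(1.0) for _ in range(repeats)]
# One sample mean per sample of size n.
sample_means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(repeats)
]

skew_raw = skewness(raw_draws)
skew_means = skewness(sample_means)
print("skewness of raw draws:   ", round(skew_raw, 2))
print("skewness of sample means:", round(skew_means, 2))
```

The sample means should come out far less skewed than the raw draws, which is the pattern the theorem describes.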
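The mean of the distribution of sample means, μx̄, equals the population mean μ, and individual sample means cluster more tightly around μ as n increases. So the sample mean is a good estimator of the population mean.
$\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$
This standard deviation is called the standard error (i.e., the standard deviation of the distribution of sample means).
Note that the standard error is smaller than the population standard deviation σ.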
Variance of the sample mean distribution
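$\mathrm{Var}(\bar{x}) = \mathrm{Var}\left(\dfrac{x_1 + x_2 + \dots + x_n}{n}\right) = \dfrac{1}{n^2}\left[\mathrm{Var}(x_1) + \mathrm{Var}(x_2) + \dots + \mathrm{Var}(x_n)\right]$
$= \dfrac{1}{n^2}\left[\sigma^2 + \sigma^2 + \dots + \sigma^2\right] = \dfrac{1}{n^2}\left[n\sigma^2\right] = \dfrac{n\sigma^2}{n^2}$
$\sigma_{\bar{x}}^2 = \dfrac{\sigma^2}{n}$
Therefore, standard deviation = σ/√n (Remember this formula!)
(Where x1, x2, …, xn are the n independent observations in a sample, each with variance σ².)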
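As a numerical check of σ/√n, the sketch below repeatedly draws samples of size 25 from an assumed population with μ = 1 and σ = 1 (exponential, rate 1, chosen only for illustration) and compares the standard deviation of the sample means with the formula.

```python
import math
import random
import statistics

rng = random.Random(7)
mu, sigma = 1.0, 1.0   # mean and s.d. of the assumed exponential(1) population
n = 25                 # sample size
repeats = 4000         # number of samples drawn

means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(repeats)
]

predicted_se = sigma / math.sqrt(n)      # sigma / sqrt(n) = 0.2
observed_se = statistics.pstdev(means)   # s.d. of the simulated sample means
mean_of_means = statistics.fmean(means)

print("predicted standard error:", predicted_se)
print("observed standard error: ", round(observed_se, 3))
print("mean of sample means:    ", round(mean_of_means, 3))
```

The observed value should land close to the predicted 0.2, and the mean of the sample means close to μ = 1.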
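The Z score formula for the distribution of sample means is:
$z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$
where σ/√n is the standard error.
Compare with the Chapter 7 formula:
$z = \dfrac{X - \mu}{\sigma}$
[Figure: distribution of the population (mean μ, standard deviation σ) and distribution of sample means (mean μ, standard deviation σ/√n)]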
Practice!
Historically, the average sales per customer at a tire store is known to be $85, with a s.d. of $9. You take a random sample of 40 customers. What is the probability the mean expenditure for this sample will be $87 or more?
$z = \dfrac{87 - 85}{9/\sqrt{40}} = \dfrac{2}{1.42} = 1.41$
From Appendix D, the probability (area between 0 and Z) for this Z score is 0.4207. The probability that the sample mean is $87 or more, i.e. that Z exceeds 1.41, is 0.5 – 0.4207.
Hence, the answer is 0.0793.
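The same calculation can be sketched with Python's standard library, using `statistics.NormalDist` in place of the Appendix D table. Because the code keeps the unrounded Z score, the probability comes out near 0.08 and differs slightly in the later decimals from the table-based 0.0793.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 85, 9, 40      # values from the tire store problem
threshold = 87

standard_error = sigma / sqrt(n)          # about 1.42
z = (threshold - mu) / standard_error     # about 1.41
p_at_least = 1 - NormalDist().cdf(z)      # upper-tail area beyond z

print("standard error:", round(standard_error, 2))
print("z score:", round(z, 2))
print("P(sample mean >= 87):", round(p_at_least, 4))
```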
Use s in place of σ if the population standard deviation is unknown, so long as n ≥ 30.
The Z score formula is:
$z = \dfrac{\bar{X} - \mu}{s/\sqrt{n}}$
Practice time!
Problem #17 on page 237
$z = \dfrac{1950 - 2200}{250/\sqrt{50}} = -7.07$
So probability is virtually 1