Prepared by Lloyd R. Jaisingh

McGraw-Hill/Irwin Copyright © 2013 by The McGraw-Hill Companies, Inc. All rights reserved. A PowerPoint Presentation Package to Accompany Applied Statistics in Business & Economics, 4th edition David P. Doane and Lori E. Seward Prepared by Lloyd R. Jaisingh

Post on 01-Jan-2016

Page 1: Prepared by Lloyd R. Jaisingh

McGraw-Hill/Irwin Copyright © 2013 by The McGraw-Hill Companies, Inc. All rights reserved.

A PowerPoint Presentation Package to Accompany

Applied Statistics in Business & Economics, 4th edition

David P. Doane and Lori E. Seward

Prepared by Lloyd R. Jaisingh

Page 2: Prepared by Lloyd R. Jaisingh

Sampling Distributions and Estimation

Chapter Contents

8.1 Sampling Variation

8.2 Estimators and Sampling Errors

8.3 Sample Mean and the Central Limit Theorem

8.4 Confidence Interval for a Mean (μ) with Known σ

8.5 Confidence Interval for a Mean (μ) with Unknown σ

8.6 Confidence Interval for a Proportion (π)

8.7 Estimating from Finite Populations

8.8 Sample Size Determination for a Mean

8.9 Sample Size Determination for a Proportion

8.10 Confidence Interval for a Population Variance, σ² (Optional)

Page 3: Prepared by Lloyd R. Jaisingh

Sampling Distributions and Estimation

Chapter Learning Objectives (LOs)

LO8-1: Define sampling error, parameter, and estimator.

LO8-2: Explain the desirable properties of estimators.

LO8-3: State the Central Limit Theorem for a mean.

LO8-4: Explain how sample size affects the standard error.

LO8-5: Construct a 90, 95, or 99 percent confidence interval for μ.

Page 4: Prepared by Lloyd R. Jaisingh

Sampling Distributions and Estimation

Chapter Learning Objectives (LOs)

LO8-6: Know when to use Student’s t instead of z to estimate μ.

LO8-7: Construct a 90, 95, or 99 percent confidence interval for π.

LO8-8: Construct confidence intervals for finite populations.

LO8-9: Calculate sample size to estimate a mean or proportion.

LO8-10: Construct a confidence interval for a variance (optional).

Page 5: Prepared by Lloyd R. Jaisingh

8.1 Sampling Variation

• Sample statistic – a random variable whose value depends on which population items are included in the random sample.

• Depending on the sample size, the sample statistic could either represent the population well or differ greatly from the population.

• This sampling variation can easily be illustrated.

Page 6: Prepared by Lloyd R. Jaisingh

8.1 Sampling Variation

• Consider eight random samples of size n = 5 from a large population of GMAT scores for MBA applicants.

• The sample means tend to be close to the population mean (μ = 520.78).

Page 7: Prepared by Lloyd R. Jaisingh

8.1 Sampling Variation

• The dot plots show that the sample means have much less variation than the individual sample items.
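The dot-plot comparison can be reproduced with a small simulation. The sketch below is only illustrative: the population is a hypothetical set of normally distributed scores (mean 520, standard deviation 86 are assumed values, not the textbook data), and the helper name `sample_means` is made up for this example.

```python
import random
import statistics


def sample_means(population, n, num_samples, seed=0):
    """Draw num_samples random samples of size n and return their means."""
    rng = random.Random(seed)
    return [statistics.mean(rng.sample(population, n)) for _ in range(num_samples)]


# Hypothetical population of scores (assumed values, not the textbook data).
_rng = random.Random(42)
population = [_rng.gauss(520, 86) for _ in range(10_000)]

means = sample_means(population, n=5, num_samples=1000)
sd_items = statistics.pstdev(population)  # spread of individual items
sd_means = statistics.stdev(means)        # spread of the sample means

print(f"std. dev. of items: {sd_items:.1f}")
print(f"std. dev. of sample means (n = 5): {sd_means:.1f}")
```

The spread of the sample means should come out well below the spread of the individual items, which is the pattern the dot plots display.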

Page 8: Prepared by Lloyd R. Jaisingh

8.2 Estimators and Sampling Distributions

LO8-1: Define sampling error, parameter, and estimator.

Some Terminology

• Estimator – a statistic derived from a sample to infer the value of a population parameter.

• Estimate – the value of the estimator in a particular sample.

• Population parameters are usually represented by Greek letters and the corresponding statistics by Roman letters.

Page 9: Prepared by Lloyd R. Jaisingh

8.2 Estimators and Sampling Distributions

Examples of Estimators

Sampling Distributions

• The sampling distribution of an estimator is the probability distribution of all possible values the statistic may assume when a random sample of size n is taken.

• Note: An estimator is a random variable since samples vary.

Page 10: Prepared by Lloyd R. Jaisingh

8.2 Estimators and Sampling Distributions

• Sampling error is the difference between an estimate and the corresponding population parameter. For example, if we use the sample mean as an estimate for the population mean, then the sampling error is $\bar{x} - \mu$.

Bias

• Bias is the difference between the expected value of the estimator and the true parameter. For example, for the mean, $\text{Bias} = E(\bar{X}) - \mu$.

• An estimator is unbiased if its expected value is the parameter being estimated. The sample mean is an unbiased estimator of the population mean since $E(\bar{X}) = \mu$.

• On average, an unbiased estimator neither overstates nor understates the true parameter.

Page 11: Prepared by Lloyd R. Jaisingh

8.2 Estimators and Sampling Distributions

Page 12: Prepared by Lloyd R. Jaisingh

8.2 Estimators and Sampling Distributions

LO8-2: Explain the desirable properties of estimators.

Efficiency

• Efficiency refers to the variance of the estimator’s sampling distribution.

• A more efficient estimator has smaller variance.

Figure 8.6

Note: Being unbiased is also a desirable property for an estimator.

Page 13: Prepared by Lloyd R. Jaisingh

8.2 Estimators and Sampling Distributions

LO8-2: Explain the desirable properties of estimators.

Consistency

A consistent estimator converges toward the parameter being estimated as the sample size increases.

Figure 8.6

Page 14: Prepared by Lloyd R. Jaisingh

8.3 Sample Mean and the Central Limit Theorem

LO8-3: State the Central Limit Theorem for a mean.

The Central Limit Theorem is a powerful result that allows us to approximate the shape of the sampling distribution of the sample mean even when we don’t know what the population looks like.

Page 15: Prepared by Lloyd R. Jaisingh

8.3 Sample Mean and the Central Limit Theorem

• If the population is exactly normal, then the sample mean follows a normal distribution.

• As the sample size n increases, the distribution of sample means narrows in on the population mean μ.

Page 16: Prepared by Lloyd R. Jaisingh

8.3 Sample Mean and the Central Limit Theorem

• If the sample is large enough, the sample means will have approximately a normal distribution even if the population is not normal.

Page 17: Prepared by Lloyd R. Jaisingh

8.3 Sample Mean and the Central Limit Theorem

Illustrations of the Central Limit Theorem

Using the uniform and a right-skewed distribution.

Page 18: Prepared by Lloyd R. Jaisingh

8.3 Sample Mean and the Central Limit Theorem

Applying the Central Limit Theorem

The Central Limit Theorem permits us to define an interval within which the sample means are expected to fall. As long as the sample size n is large enough, we can use the normal distribution regardless of the population shape (or any n if the population is normal to begin with).

Page 19: Prepared by Lloyd R. Jaisingh

8.3 Sample Mean and the Central Limit Theorem

LO8-4: Explain how sample size affects the standard error.

Sample Size and Standard Error

Even if the population standard deviation σ is large, the sample means will fall within a narrow interval as long as n is large. The key is the standard error of the mean: $\sigma_{\bar{x}} = \sigma/\sqrt{n}$. The standard error decreases as n increases.

For example, when n = 4 the standard error is halved (relative to n = 1). To halve it again requires n = 16, and to halve it again requires n = 64. To halve the standard error, you must quadruple the sample size (the law of diminishing returns).
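The halving pattern for n = 1, 4, 16, 64 can be checked numerically. This is a minimal sketch; the value σ = 100 is an arbitrary assumption chosen only to make the halving easy to see, and `standard_error` is a name introduced for this example.

```python
import math


def standard_error(sigma, n):
    """Standard error of the mean: sigma divided by the square root of n."""
    return sigma / math.sqrt(n)


sigma = 100.0  # assumed population standard deviation, for illustration only
for n in (1, 4, 16, 64):
    print(f"n = {n:2d}  standard error = {standard_error(sigma, n):.2f}")
```

Each fourfold increase in n cuts the standard error in half, so precision gets progressively more expensive in terms of sample size.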

Page 20: Prepared by Lloyd R. Jaisingh

8.3 Sample Mean and the Central Limit Theorem

Illustration: All Possible Samples from a Uniform Population

• Consider a discrete uniform population consisting of the integers {0, 1, 2, 3}.

• The population parameters are: μ = 1.5, σ = 1.118.

Page 21: Prepared by Lloyd R. Jaisingh

8.3 Sample Mean and the Central Limit Theorem

Illustration: All Possible Samples from a Uniform Population

• The population is uniform, yet the distribution of all possible sample means of size 2 has a peaked triangular shape.
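The triangular shape can be verified by brute force. The sketch below enumerates every ordered sample of size 2 from {0, 1, 2, 3}, assuming sampling with replacement (an assumption of this example, which gives 16 equally likely samples), and tallies the sample means.

```python
from collections import Counter
from itertools import product
import statistics

population = [0, 1, 2, 3]

# All ordered samples of size 2, drawn with replacement (assumed here).
samples = list(product(population, repeat=2))
means = [sum(s) / 2 for s in samples]
counts = Counter(means)

for value in sorted(counts):
    print(f"mean = {value:.1f}  {'*' * counts[value]}")

print("mean of sample means:", statistics.mean(means))
print("std. dev. of sample means:", round(statistics.pstdev(means), 4))
```

The frequencies rise from 1 to 4 and fall back to 1, the mean of the sample means equals μ = 1.5, and their standard deviation equals σ/√2, consistent with the standard error formula for n = 2.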

Page 22: Prepared by Lloyd R. Jaisingh

8.4 Confidence Interval for a Mean (μ) with Known σ

LO8-5: Construct a 90, 95, or 99 percent confidence interval for μ.

What is a Confidence Interval?

Page 23: Prepared by Lloyd R. Jaisingh

8.4 Confidence Interval for a Mean (μ) with Known σ

What is a Confidence Interval?

• The confidence interval for μ with known σ is: $\bar{x} \pm z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}$
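As a rough sketch of how the known-σ interval can be computed, the function below uses `statistics.NormalDist` from the Python standard library to obtain the z value. The function name and the inputs (x̄ = 100, σ = 15, n = 36) are hypothetical and serve only as an illustration.

```python
import math
from statistics import NormalDist


def z_confidence_interval(xbar, sigma, n, confidence=0.95):
    """Interval xbar +/- z * sigma / sqrt(n) for a mean with known sigma."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin


# Hypothetical inputs, for illustration only.
for level in (0.90, 0.95, 0.99):
    lo, hi = z_confidence_interval(xbar=100, sigma=15, n=36, confidence=level)
    print(f"{level:.0%} interval: [{lo:.2f}, {hi:.2f}]  width = {hi - lo:.2f}")
```

Running the loop at several confidence levels makes the trade-off visible: the interval widens as the confidence level rises.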

Page 24: Prepared by Lloyd R. Jaisingh

8.4 Confidence Interval for a Mean (μ) with Known σ

Choosing a Confidence Level

• A higher confidence level leads to a wider confidence interval.

• Greater confidence implies loss of precision (i.e., greater margin of error).

• 95% confidence is most often used.

Confidence Intervals for Example 8.2

Page 25: Prepared by Lloyd R. Jaisingh

8.4 Confidence Interval for a Mean (μ) with Known σ

Interpretation

• A confidence interval either does or does not contain μ.

• The confidence level quantifies the risk.

• When constructing 95% confidence intervals, approximately 95 out of 100 intervals would contain μ, while approximately 5 would not.

When Can We Assume Normality?

• If σ is known and the population is normal, then we can safely use the formula to compute the confidence interval.

• If σ is known and we do not know whether the population is normal, a common rule of thumb is that n ≥ 30 is sufficient to use the formula as long as the distribution is approximately symmetric with no outliers.

• Larger n may be needed to assume normality if you are sampling from a strongly skewed population or one with outliers.

Page 26: Prepared by Lloyd R. Jaisingh

8.5 Confidence Interval for a Mean (μ) with Unknown σ

LO8-6: Know when to use Student’s t instead of z to estimate μ.

Student’s t Distribution

• Use the Student’s t distribution instead of the normal distribution when the population is normal but the standard deviation σ is unknown and the sample size is small.

Page 27: Prepared by Lloyd R. Jaisingh

8.5 Confidence Interval for a Mean (μ) with Unknown σ

LO8-6: Know when to use Student’s t instead of z to estimate μ.

Student’s t Distribution

Page 28: Prepared by Lloyd R. Jaisingh

8.5 Confidence Interval for a Mean (μ) with Unknown σ

Student’s t Distribution

• t distributions are symmetric and shaped like the standard normal distribution.

• The t distribution is dependent on the size of the sample.

Figure 8.11: Comparison of Normal and Student’s t

Page 29: Prepared by Lloyd R. Jaisingh

8.5 Confidence Interval for a Mean (μ) with Unknown σ

Degrees of Freedom

• Degrees of freedom (d.f.) is a parameter based on the sample size that is used to determine the value of the t statistic.

• Degrees of freedom tell how many observations are used to calculate s, less the number of intermediate estimates used in the calculation. The d.f. for the t distribution in this case is given by d.f. = n − 1.

• As n increases, the t distribution approaches the shape of the normal distribution.

• For a given confidence level, t is always larger than z, so a confidence interval based on t is always wider than if z were used.

Page 30: Prepared by Lloyd R. Jaisingh

8.5 Confidence Interval for a Mean (μ) with Unknown σ

Comparison of z and t

• For very small samples, t-values differ substantially from the normal.

• As degrees of freedom increase, the t-values approach the normal z-values.

• For example, for n = 31, the degrees of freedom are d.f. = 31 − 1 = 30. So for a 90 percent confidence interval, we would use t = 1.697, which is only slightly larger than z = 1.645.

Page 31: Prepared by Lloyd R. Jaisingh

8.5 Confidence Interval for a Mean (μ) with Unknown σ

Example: GMAT Scores Again

Figure 8.13

Page 32: Prepared by Lloyd R. Jaisingh

8.5 Confidence Interval for a Mean (μ) with Unknown σ

Example: GMAT Scores Again

• Construct a 90% confidence interval for the mean GMAT score of all MBA applicants.

$\bar{x} = 510$, s = 73.77

• Since σ is unknown, use the Student’s t for the confidence interval with d.f. = 20 − 1 = 19.

• First find $t_{\alpha/2} = t_{.05} = 1.729$ from Appendix D.

Page 33: Prepared by Lloyd R. Jaisingh

8.5 Confidence Interval for a Mean (μ) with Unknown σ

• For a 90% confidence interval, use Appendix D to find $t_{.05} = 1.729$ with d.f. = 19.

• The interval is $\bar{x} \pm t_{\alpha/2}\,\dfrac{s}{\sqrt{n}} = 510 \pm 1.729\,\dfrac{73.77}{\sqrt{20}} = 510 \pm 28.52$.

We are 90 percent confident that the true mean GMAT score is within the interval [481.48, 538.52].

Note: One can use Excel, Minitab, etc. to obtain these values as well as to construct confidence intervals.
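The GMAT interval can be reproduced in a few lines. In this sketch the critical value t = 1.729 is passed in by hand, as read from the t table for d.f. = 19, because the Python standard library has no t quantile function; the function name is made up for this example.

```python
import math


def t_confidence_interval(xbar, s, n, t_crit):
    """Interval xbar +/- t * s / sqrt(n); t_crit is looked up for d.f. = n - 1."""
    margin = t_crit * s / math.sqrt(n)
    return xbar - margin, xbar + margin


# Values from the GMAT example: x-bar = 510, s = 73.77, n = 20, t.05 = 1.729.
lo, hi = t_confidence_interval(xbar=510, s=73.77, n=20, t_crit=1.729)
print(f"90% confidence interval: [{lo:.2f}, {hi:.2f}]")
```

Statistical software would normally supply the critical value directly; supplying it as an argument keeps the sketch dependency-free.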

Page 34: Prepared by Lloyd R. Jaisingh

8.5 Confidence Interval for a Mean (μ) with Unknown σ

Confidence Interval Width

• Confidence interval width reflects

- the sample size,

- the confidence level, and

- the standard deviation.

• To obtain a narrower interval and more precision

- increase the sample size, or

- lower the confidence level (e.g., from 90% to 80% confidence).

Page 35: Prepared by Lloyd R. Jaisingh

8.5 Confidence Interval for a Mean (μ) with Unknown σ

Using Appendix D

• Beyond d.f. = 50, Appendix D shows d.f. in steps of 5 or 10.

• If the table does not give the exact degrees of freedom, use the t-value for the next lower degrees of freedom.

• This is a conservative procedure since it causes the interval to be slightly wider.

• A conservative statistician may use the t distribution for confidence intervals when σ is unknown because using z would underestimate the margin of error.

Page 36: Prepared by Lloyd R. Jaisingh

8.6 Confidence Interval for a Proportion (π)

LO8-7: Construct a 90, 95, or 99 percent confidence interval for π.

• A proportion is a mean of data whose only values are 0 or 1.

Page 37: Prepared by Lloyd R. Jaisingh

8.6 Confidence Interval for a Proportion (π)

Applying the CLT

• The distribution of a sample proportion p = x/n is symmetric if π = .50 and, regardless of π, approaches symmetry as n increases.

Page 38: Prepared by Lloyd R. Jaisingh

8.6 Confidence Interval for a Proportion (π)

When is it Safe to Assume Normality of p?

• Rule of Thumb: The sample proportion p = x/n may be assumed to be normal if both nπ ≥ 10 and n(1 − π) ≥ 10.

Table 8.9: Sample size to assume normality
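The rule of thumb translates directly into a one-line check. This is a minimal sketch with a hypothetical helper name; the threshold of 10 is the one stated in the rule, and the sample sizes tried are arbitrary.

```python
def normality_ok(n, pi, threshold=10):
    """True when both n*pi and n*(1 - pi) reach the rule-of-thumb threshold."""
    return n * pi >= threshold and n * (1 - pi) >= threshold


# Arbitrary illustrative cases.
for n, pi in [(20, 0.5), (19, 0.5), (50, 0.1), (200, 0.1)]:
    print(f"n = {n:3d}, pi = {pi:.2f}: normal approximation ok? {normality_ok(n, pi)}")
```

The farther π is from .50, the larger the sample needed before the approximation is considered safe.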

Page 39: Prepared by Lloyd R. Jaisingh

8.6 Confidence Interval for a Proportion (π)

Confidence Interval for π

• Since π is unknown, the confidence interval for π based on p = x/n (assuming a large sample) is $p \pm z_{\alpha/2}\sqrt{\dfrac{p(1-p)}{n}}$
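A short sketch of the large-sample interval for a proportion follows. The function name and the counts (x = 50 successes in n = 100 trials) are hypothetical; the z value comes from `statistics.NormalDist` in the standard library.

```python
import math
from statistics import NormalDist


def proportion_confidence_interval(x, n, confidence=0.95):
    """Large-sample interval p +/- z * sqrt(p(1 - p)/n), with p = x/n."""
    p = x / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin


# Hypothetical counts, for illustration only.
lo, hi = proportion_confidence_interval(x=50, n=100, confidence=0.95)
print(f"95% interval for the proportion: [{lo:.4f}, {hi:.4f}]")
```

The sketch does not check the normality rule of thumb, so it should only be applied when the sample is large enough for the approximation to hold.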

Page 40: Prepared by Lloyd R. Jaisingh

8.6 Confidence Interval for a Proportion (π)

Example: Auditing

Page 41: Prepared by Lloyd R. Jaisingh

8.7 Estimating from Finite Populations

LO8-8: Construct confidence intervals for finite populations.

N = population size; n = sample size

Page 42: Prepared by Lloyd R. Jaisingh

8.8 Sample Size Determination for a Mean

LO8-9: Calculate sample size to estimate a mean or proportion.

Sample Size to Estimate μ

• To estimate a population mean with a precision of ± E (allowable error), you would need a sample of size $n = \left(\dfrac{z\,\sigma}{E}\right)^2$.
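The sample size calculation for a mean can be sketched as follows, using the standard formula n = (zσ/E)² and rounding up to the next whole observation. The function name and the inputs (σ = 15, E = 3) are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma, error, confidence=0.95):
    """Smallest n with z * sigma / sqrt(n) <= error, i.e. n = (z*sigma/E)^2 rounded up."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / error) ** 2)


# Hypothetical planning values, for illustration only.
print(sample_size_for_mean(sigma=15, error=3, confidence=0.95))
print(sample_size_for_mean(sigma=15, error=3, confidence=0.90))
```

Rounding up rather than to the nearest integer guarantees that the margin of error does not exceed E.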

Page 43: Prepared by Lloyd R. Jaisingh

8.8 Sample Size Determination for a Mean

How to Estimate σ?

• Method 1: Take a Preliminary Sample. Take a small preliminary sample and use the sample s in place of σ in the sample size formula.

• Method 2: Assume Uniform Population. Estimate rough upper and lower limits a and b and set $\sigma = [(b-a)^2/12]^{1/2}$.

• Method 3: Assume Normal Population. Estimate rough upper and lower limits a and b and set σ = (b − a)/4. This assumes normality with most of the data within μ ± 2σ, so the range is 4σ.

• Method 4: Poisson Arrivals. In the special case when μ is a Poisson arrival rate, then $\sigma = \sqrt{\mu}$.
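Three of the planning estimates for σ reduce to one-line formulas. The sketch below uses the standard deviation of a uniform distribution on [a, b], which is √((b − a)²/12), the range/4 rule for a normal population, and the Poisson property that the variance equals the mean. Function names and example limits are hypothetical.

```python
import math


def sigma_uniform(a, b):
    """Standard deviation of a uniform distribution on [a, b]."""
    return math.sqrt((b - a) ** 2 / 12)


def sigma_normal_range(a, b):
    """Range/4 rule: treats [a, b] as roughly mu +/- 2 sigma."""
    return (b - a) / 4


def sigma_poisson(mu):
    """For a Poisson arrival rate mu, the standard deviation is sqrt(mu)."""
    return math.sqrt(mu)


# Hypothetical limits and rate, for illustration only.
print(sigma_uniform(200, 800))
print(sigma_normal_range(200, 800))
print(sigma_poisson(9))
```

For the same limits, the uniform assumption gives a larger σ than the range/4 rule, so it leads to a larger planned sample size.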

Page 44: Prepared by Lloyd R. Jaisingh

8.9 Sample Size Determination for a Proportion

• To estimate a population proportion with a precision of ± E (allowable error), you would need a sample of size $n = \left(\dfrac{z}{E}\right)^2 \pi(1-\pi)$.

• Since π is a number between 0 and 1, the allowable error E is also between 0 and 1.

Page 45: Prepared by Lloyd R. Jaisingh

8.9 Sample Size Determination for a Proportion

How to Estimate π?

• Method 1: Assume that π = .50. This conservative method ensures the desired precision. However, the sample may end up being larger than necessary.

• Method 2: Take a Preliminary Sample. Take a small preliminary sample and use the sample p in place of π in the sample size formula.

• Method 3: Use a Prior Sample or Historical Data. How often are such samples available? Unfortunately, π might be different enough to make it a questionable assumption.
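The effect of the planning value for the proportion can be seen with a short calculation based on the standard formula n = (z/E)² π(1 − π). The function name and the planning values are hypothetical; the default π = .50 corresponds to the conservative method.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(error, pi=0.5, confidence=0.95):
    """n = (z/E)^2 * pi * (1 - pi), rounded up; pi = .50 is the conservative choice."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z / error) ** 2 * pi * (1 - pi))


# Hypothetical planning values, for illustration only.
print(sample_size_for_proportion(error=0.03))          # conservative, pi = .50
print(sample_size_for_proportion(error=0.05))          # conservative, pi = .50
print(sample_size_for_proportion(error=0.05, pi=0.2))  # with a prior estimate
```

Because π(1 − π) is largest at π = .50, any other planning value yields a smaller required sample, which is why the conservative method can oversample.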

Page 46: Prepared by Lloyd R. Jaisingh

8.10 Confidence Interval for a Population Variance (σ²)

LO8-10: Construct a confidence interval for a variance (optional).

Chi-Square Distribution

• If the population is normal, then the scaled sample variance (n − 1)s²/σ² follows the chi-square distribution (χ²) with degrees of freedom d.f. = n − 1.

• Lower (χ²_L) and upper (χ²_U) tail percentiles for the chi-square distribution can be found using Appendix E.

Page 47: Prepared by Lloyd R. Jaisingh

8.10 Confidence Interval for a Population Variance (σ²)

LO8-10: Construct a confidence interval for a variance (optional).

Confidence Interval

• Using the sample variance s², the confidence interval is $\dfrac{(n-1)s^2}{\chi^2_U} \le \sigma^2 \le \dfrac{(n-1)s^2}{\chi^2_L}$

• To obtain a confidence interval for the standard deviation σ, just take the square root of the interval bounds.
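A sketch of the variance interval follows, using the standard chi-square form with bounds (n − 1)s²/χ²_U and (n − 1)s²/χ²_L. The chi-square percentiles are passed in by hand because the Python standard library has no chi-square quantile function. The sample values (n = 11, s² = 4.0) are hypothetical; the critical values 3.247 and 20.48 are the usual table entries for d.f. = 10 at 95 percent confidence.

```python
import math


def variance_confidence_interval(s2, n, chi2_lower, chi2_upper):
    """Interval for sigma^2: [(n-1)s^2 / chi2_upper, (n-1)s^2 / chi2_lower]."""
    df = n - 1
    return df * s2 / chi2_upper, df * s2 / chi2_lower


# Hypothetical sample: n = 11, s^2 = 4.0; table values for d.f. = 10, 95% confidence.
var_lo, var_hi = variance_confidence_interval(s2=4.0, n=11, chi2_lower=3.247, chi2_upper=20.48)
sd_lo, sd_hi = math.sqrt(var_lo), math.sqrt(var_hi)

print(f"variance: [{var_lo:.3f}, {var_hi:.3f}]")
print(f"standard deviation: [{sd_lo:.3f}, {sd_hi:.3f}]")
```

The interval is not symmetric around s² because the chi-square distribution is right-skewed.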

Page 48: Prepared by Lloyd R. Jaisingh

8.10 Confidence Interval for a Population Variance (σ²)

You can use Appendix E to find critical chi-square values.

Page 49: Prepared by Lloyd R. Jaisingh

8.10 Confidence Interval for a Population Variance (σ²)

Caution: Assumption of Normality

• The methods described for confidence interval estimation of the variance and standard deviation depend on the population having a normal distribution.

• If the population does not have a normal distribution, then the confidence interval should not be considered accurate.