
Estimation From Sample Data

Chapter 08

Chapter 8 - Learning Objectives

• Explain the difference between a point and an interval estimate.

• Construct and interpret confidence intervals: with a z for the population mean or proportion, and with a t for the population mean.

• Determine appropriate sample size to achieve specified levels of accuracy and confidence.


8.1 Introduction

Statistical inference is the process by which we acquire information about populations from samples.

There are two types of inference:

1. Estimation of a parameter

2. Testing a hypothesis about a parameter

8.2 Concept of Estimation

The main objective of estimation is to determine the value of a population parameter on the basis of a sample statistic.

There are two types of estimators:

1. Point Estimator

2. Interval Estimator

Point Estimator

A point estimator allows us to draw an inference about a population parameter (say, the mean or the proportion) by computing a statistic from a sample.

The statistic provides an estimate of the value of the parameter at a single point (value), hence the name point estimate.

Interval Estimator

An interval estimator draws inferences about a population parameter by providing a range (interval) of values within which the unknown population parameter is estimated to lie.

[Figure: population distribution, sample distribution, parameter, and interval estimator]

Example

You want to know the average weekly summer income of UMD students.

Take a sample of UMD students (say, 600) and compute the average weekly summer income of the students in your sample: x̄ = $400.

Point estimate: µ = $400

Interval estimate: µ is between $380 and $420

The interval estimator is used more frequently than the point estimator because (1) a point estimator is more prone to faulty inferences, and (2) with an interval estimator we can specify how confident we are in our estimate.

Characteristics of Estimators

In estimation, you want to select a sample and a sample statistic that allow you to estimate a parameter with as little error as possible.

The choice of statistic depends on several desirable characteristics.

Desirable Characteristics of Estimators

1. Unbiasedness: An unbiased estimator is one whose expected value is equal to the parameter it estimates.

2. Consistency: An unbiased estimator is said to be consistent if the difference between the estimator and the parameter grows smaller as the sample size increases.

3. Relative efficiency: Among two or more unbiased estimators, the one with the smaller variance is said to be relatively more efficient.
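The three properties can be checked empirically. The following is a minimal simulation sketch and is not part of the original slides; the normal population (mean 50, standard deviation 10), the seed, and the function name `simulate_estimators` are illustrative assumptions. It compares the sample mean and the sample median as estimators of the population mean.

```python
import random
import statistics


def simulate_estimators(n, reps=2000, mu=50.0, sigma=10.0, seed=1):
    """Draw `reps` samples of size n from a hypothetical N(mu, sigma) population.

    Returns (average of sample means, variance of sample means,
             average of sample medians, variance of sample medians).
    """
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
        medians.append(statistics.median(sample))
    return (statistics.fmean(means), statistics.variance(means),
            statistics.fmean(medians), statistics.variance(medians))


for n in (25, 100):
    avg_mean, var_mean, avg_median, var_median = simulate_estimators(n)
    # Unbiasedness: both averages should sit close to mu = 50.
    # Consistency: the variances shrink as n grows.
    # Relative efficiency: the mean has the smaller variance here.
    print(f"n={n}: mean avg={avg_mean:.2f} var={var_mean:.2f} | "
          f"median avg={avg_median:.2f} var={var_median:.2f}")
```

For a normal population both estimators center on the true mean, but the sample mean varies less from sample to sample, which is what relative efficiency describes.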


8.3 Interval Estimation of the Population Mean and Proportion

8.3.1 When the Population Variance is Known

8.3.2 When the Population Variance is Unknown

8.3.1 Estimating the Population Mean when the Population Variance is Known

We are able to provide an interval estimate of a population mean or proportion based on the following characteristics of a sampling distribution.

1. Given the sampling distribution, we can draw a sample of size n from the population and calculate the sample mean or proportion.

2. Given the central limit theorem, we can treat the sampling distribution of the sample mean or proportion as normal (or approximately normal), and thus make probability statements about the sample mean or proportion that we compute.

3. Given the formula for standardizing any random variable, we can relate the standardized value obtained from a normal distribution to the sample mean or proportion we are estimating:

\[ Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \]

Margin of Error and the Interval Estimate

The general form of an interval estimate of a population mean is

\[ \bar{x} \pm \text{Margin of Error} \]

We estimate the range (interval) that contains the value of the unknown population parameter (say, the mean) as follows:

\[ \bar{x} \pm z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right) \]

where:

\(\bar{x}\) is the sample mean

\(z_{\alpha/2}\) is the standardized value of the random variable that leaves an area of \(\alpha/2\) in one tail of the standard normal probability distribution

\(\sigma\) is the population standard deviation

\(n\) is the sample size

\(1-\alpha\) is the confidence coefficient

The Confidence Interval for µ (σ is known)

In its expanded form, the interval can be stated as follows:

\[ P\left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha \]

Interpreting the Confidence Interval for µ

Based on the estimate, we can say with \((1-\alpha) \times 100\) percent confidence that the interval

\[ \left[\, \bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \ \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \,\right] \]

contains the true value of the unknown population parameter.
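One way to read the confidence statement is as a long-run frequency: if the sampling were repeated many times, about \((1-\alpha) \times 100\) percent of the resulting intervals would contain µ. The sketch below illustrates this by simulation. It is not from the slides; the population values (µ = 50, σ = 10), the sample size, the seed, and the function name `coverage` are hypothetical choices made only for illustration.

```python
import random
from statistics import NormalDist, fmean


def coverage(conf=0.95, mu=50.0, sigma=10.0, n=25, reps=4000, seed=7):
    """Fraction of z-intervals (sigma known) that contain the true mean mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # z_{alpha/2}
    margin = z * sigma / n ** 0.5                 # margin of error
    hits = 0
    for _ in range(reps):
        xbar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / reps


rate = coverage()
print(f"Share of 95% intervals containing mu: {rate:.3f}")
```

The printed share should land close to 0.95, although any single interval either contains µ or does not.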

Interval Estimation of a Population Proportion

\[ \hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]

where:

\(1 - \alpha\) is the confidence coefficient

\(z_{\alpha/2}\) is the z value providing an area of \(\alpha/2\) in the upper tail of the standard normal probability distribution

\(\hat{p}\) is the sample proportion
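As a rough sketch of how the proportion formula is applied, the snippet below computes an interval for a made-up sample. The counts (120 of 400) are hypothetical and do not come from the slides, and `proportion_interval` is an illustrative name.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(successes, n, conf=0.95):
    """z-based confidence interval for a population proportion."""
    p_hat = successes / n                          # sample proportion
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # z_{alpha/2}
    margin = z * sqrt(p_hat * (1 - p_hat) / n)     # margin of error
    return p_hat - margin, p_hat + margin


# Hypothetical data: 120 of 400 sampled items have the attribute of interest.
low, high = proportion_interval(120, 400)
print(f"95% interval for p: [{low:.4f}, {high:.4f}]")
```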


Commonly used confidence levels and their corresponding z scores

Confidence level    α      z (for α/2)
90%                 10%    1.645
95%                 5%     1.960
99%                 1%     2.575
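The table values can be reproduced from the standard normal distribution. The following is a small sketch using Python's standard library; `z_critical` is an illustrative helper name, not something defined in the slides.

```python
from statistics import NormalDist


def z_critical(conf):
    """Return z_{alpha/2}: the z value with area alpha/2 in the upper tail."""
    alpha = 1 - conf
    return NormalDist().inv_cdf(1 - alpha / 2)


for conf in (0.90, 0.95, 0.99):
    print(f"{conf:.0%} confidence: z = {z_critical(conf):.4f}")
```

The 99% value comes out at roughly 2.5758; the table's 2.575 is a common table rounding of the same quantity.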

Interval Estimate of Population Mean: σ Known: Example

Step 1: Identify α and the confidence coefficient (1 − α) at which the margin of error is to be computed (for example, α = 5%).

Step 2: Compute the corresponding margin of error for the selected confidence coefficient.

Step 3: Establish the interval estimate of µ by adding the margin of error to, and subtracting it from, the sample mean.


Hands-On-Practice Problems

Interval Estimate of Population Mean: σ Known: Example

A random sample of 81 credit card sales in a department store showed an average sale of $68. From past data, it is known that the standard deviation of credit card sales is $27.

8.1) Determine the 90% confidence interval estimate of sales on credit.

8.2) Determine the 95% confidence interval estimate of sales on credit.

8.3) Determine the 99% confidence interval estimate of sales on credit.


Solution to 8.1: n = 81; x̄ = $68; σ = $27.

Step 1: Identify α and the confidence coefficient (1 − α) at which the margin of error is to be computed:

α = 10%; confidence coefficient (1 − α) = 90%

Step 2: Compute the corresponding margin of error for the selected confidence coefficient:

\(z_{\alpha/2} = z_{0.05} = 1.645\)

Standard error: \(\sigma_{\bar{x}} = \sigma/\sqrt{n} = 27/\sqrt{81} = 27/9 = 3\)

Margin of error = 1.645 × 3 = 4.935

Step 3: Establish the interval estimate:

68 − 4.935 = 63.065; 68 + 4.935 = 72.935; [63.065, 72.935]

We are 90 percent confident that the average credit card sale of the store lies between about $63 and $73.


8.2) Determine the 95% confidence interval estimate of sales on credit cards.


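Solution to 8.2: α = 5%; confidence coefficient (1 − α) = 95%

\(z_{\alpha/2} = z_{0.025} = 1.960\)

Standard error: \(\sigma_{\bar{x}} = \sigma/\sqrt{n} = 27/\sqrt{81} = 27/9 = 3\)

Margin of error = 1.960 × 3 = 5.88

68 − 5.88 = 62.12; 68 + 5.88 = 73.88; [62.12, 73.88]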


8.3) Determine the 99% confidence interval estimate of sales on credit cards.


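Solution to 8.3: α = 1%; confidence coefficient (1 − α) = 99%

\(z_{\alpha/2} = z_{0.005} = 2.575\)

Standard error: \(\sigma_{\bar{x}} = \sigma/\sqrt{n} = 27/\sqrt{81} = 27/9 = 3\)

Margin of error = 2.575 × 3 = 7.725

68 − 7.725 = 60.275; 68 + 7.725 = 75.725; [60.275, 75.725]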


Summary of results for the random sample of 81 credit card sales (average sale $68, known standard deviation $27):

8.1) The 90% confidence interval estimate of sales on credit cards: [63.065, 72.935]

8.2) The 95% confidence interval estimate of sales on credit cards: [62.12, 73.88]

8.3) The 99% confidence interval estimate of sales on credit cards: [60.275, 75.725]
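The three answers can be reproduced with a few lines of code. This is a minimal sketch that follows the three steps used in the solutions; the z values are the ones listed in the slide table, and the names `Z_TABLE` and `z_interval` are illustrative.

```python
from math import sqrt

# z_{alpha/2} values as listed in the table of common confidence levels.
Z_TABLE = {0.90: 1.645, 0.95: 1.960, 0.99: 2.575}


def z_interval(xbar, sigma, n, conf):
    """Confidence interval for the mean when sigma is known."""
    z = Z_TABLE[conf]                  # Step 1: critical value for 1 - alpha
    standard_error = sigma / sqrt(n)   # sigma / sqrt(n)
    margin = z * standard_error        # Step 2: margin of error
    return xbar - margin, xbar + margin  # Step 3: xbar -/+ margin


for conf in (0.90, 0.95, 0.99):
    low, high = z_interval(xbar=68, sigma=27, n=81, conf=conf)
    print(f"{conf:.0%}: [{low:.3f}, {high:.3f}]")
```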

Implications…

As we increase the confidence coefficient (say, from 90% to 95% or to 99%), the interval that contains the mean of the population widens.

There is a trade-off between the width of the interval and the confidence with which we can make the estimate.

The Confidence Interval for µ (When the Population Standard Deviation Is Unknown)

Recall that when the population variance is known, we use the following statistic to provide an interval estimate of a population mean:

\[ \bar{x} \pm z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right) \]

The t-Statistic

However, information about the population variance may not always be available. Provided that the sampled population is normally distributed, we can still make inferences about the population mean when the population variance is unknown: we replace σ with the standard deviation s estimated from the sample and use a t statistic (Student t distribution) in place of Z.

\[ Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \quad\longrightarrow\quad t = \frac{\bar{x} - \mu}{s/\sqrt{n}} \]

The t distribution is mound-shaped and symmetrical around zero.

The variance of a t distribution depends on the sample size. It is generally larger than the variance of the standard normal distribution.

t Distribution

[Figure: standard normal distribution compared with t distributions with 20 and with 10 degrees of freedom; horizontal axis: z, t]

When the degrees of freedom (sample size minus one) exceed 100, the standard normal z value provides a good approximation to the t value.

The variance (spread) of a t distribution, compared with that of the normal distribution, is largely determined by the degrees of freedom (n − 1, where n is the sample size).
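The dependence of the spread on the degrees of freedom can be made concrete. A Student t distribution with ν degrees of freedom (ν > 2) has variance ν / (ν − 2), which is larger than 1 (the variance of the standard normal) and approaches 1 as ν grows. The snippet below is a small illustration of that standard result; `t_variance` is an illustrative name, and the formula is a textbook fact rather than something stated on the slide.

```python
def t_variance(df):
    """Variance of Student's t with df degrees of freedom (finite only for df > 2)."""
    if df <= 2:
        raise ValueError("variance is finite only for df > 2")
    return df / (df - 2)


for df in (10, 20, 100):
    # The standard normal distribution has variance 1 for comparison.
    print(f"df={df}: variance of t = {t_variance(df):.4f}")
```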

The interval estimate of the population mean is thus computed as:

\[ \bar{x} \pm t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right) \]

where t is taken at n − 1 degrees of freedom.

The Confidence Interval for µ (σ is unknown)

Example 8.2.1. In a random sample of 100 oil changes, it was found that on average it takes about 22 minutes to change the oil in a car, with a standard deviation of 5 minutes.

Assuming that the amount of time it takes to change the oil in a car is normally distributed, provide the 99% confidence interval estimate of the average amount of time it takes to change the oil in a typical car.

Example 8.2.2.

Using the same information, but assuming a standard deviation of 25 minutes, provide the 99% confidence interval estimate of the population mean (the average amount of time it takes to change oil on a car).


Example 8.2.3. Using the same information (standard deviation = 5 minutes, sample mean = 22 minutes) but assuming a sample size of 400 oil changes, provide the 99% confidence interval estimate of the population mean (that is, the average amount of time it takes to change the oil in a typical car).
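The slides do not show worked solutions for Examples 8.2.1 to 8.2.3, so the following is only a sketch of how they could be computed. Python's standard library has no t distribution, so the critical value is found numerically here (Simpson's rule for the cumulative probability, then bisection); in practice a t table or a statistics package would normally be used. The helper names `t_pdf`, `t_cdf`, `t_critical`, and `t_interval` are illustrative assumptions.

```python
from math import exp, lgamma, log, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_c - (df + 1) / 2 * log(1 + x * x / df))


def t_cdf(x, df, steps=1000):
    """P(T <= x) for x >= 0, integrating the density on [0, x] with Simpson's rule."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_critical(conf, df):
    """t_{alpha/2} at df degrees of freedom, located by bisection."""
    target = 1 - (1 - conf) / 2
    lo, hi = 0.0, 20.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def t_interval(xbar, s, n, conf):
    """Confidence interval for the mean when sigma is unknown (df = n - 1)."""
    margin = t_critical(conf, n - 1) * s / sqrt(n)
    return xbar - margin, xbar + margin


examples = {
    "8.2.1": (22, 5, 100),   # sample mean, sample std. deviation, sample size
    "8.2.2": (22, 25, 100),
    "8.2.3": (22, 5, 400),
}
for name, (xbar, s, n) in examples.items():
    low, high = t_interval(xbar, s, n, 0.99)
    print(f"Example {name}: [{low:.3f}, {high:.3f}]")
```

Comparing the three outputs shows the interval widening when the standard deviation rises from 5 to 25 and narrowing when the sample size rises from 100 to 400.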


Implications for the Width of the Confidence Interval

The width of the confidence interval is affected by

1. The confidence level (1 − α): The higher the confidence level, the wider the interval estimate.

2. The population standard deviation (σ): The higher the variance, the wider the interval estimate.

3. The sample size (n): The larger the sample size, the narrower the interval estimate.
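The three effects listed above can be seen by varying one input at a time in the width \(2 z_{\alpha/2}\,\sigma/\sqrt{n}\). The sketch below uses the credit card figures (σ = 27, n = 81) as a baseline; the alternative values and the function name `interval_width` are illustrative choices.

```python
from math import sqrt
from statistics import NormalDist


def interval_width(conf, sigma, n):
    """Full width of the z-based interval: 2 * z_{alpha/2} * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return 2 * z * sigma / sqrt(n)


base = interval_width(0.95, 27, 81)
print(f"baseline (95%, sigma=27, n=81): {base:.2f}")
print(f"higher confidence (99%):        {interval_width(0.99, 27, 81):.2f}")
print(f"larger sigma (54):              {interval_width(0.95, 54, 81):.2f}")
print(f"larger sample (n=324):          {interval_width(0.95, 27, 324):.2f}")
```

Doubling σ doubles the width, while halving the width requires four times the sample size, because n enters through its square root.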

Information and the Width of the Interval

A wide interval estimate provides little information: where is µ?

But we want a narrower confidence interval and a higher confidence level.

The Width of the Confidence Interval

The width of the confidence interval is affected by the confidence level, the variance of the population, and the sample size.

1. We want a higher confidence level and a narrower interval estimate. However, there is a trade-off between the confidence level and the width of the interval estimate.

2. Although a lower variance gives a narrower interval estimate, the variance of the population or sample is beyond our control.

3. Therefore, the only way to obtain a narrow (more informative) interval while maintaining a high confidence level is to increase the sample size.

The Sample Size

Determining the proper sample size is thus a critical component in establishing a narrow interval estimate.

8.3 Selecting the Sample Size

From the formula that we used to establish the interval estimate of the population parameter, we can derive a formula that allows us to determine the appropriate sample size.

Two important requirements:

1. At what confidence level do we want to provide the interval estimate?

2. What interval width (W) do we need?

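\[ n = \left(\frac{z_{\alpha/2}\,\sigma}{w}\right)^2 = \frac{z_{\alpha/2}^2\,\sigma^2}{w^2} \]

where w is the interval width we want to maintain, expressed as the maximum distance between the estimate and the mean (the interval is \(\bar{x} \pm w\)). Thus, to compute the sample size, we first need to determine the interval width.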

Example 10.2 To estimate the amount of lumber that can be harvested from a tract of land, the mean diameter of the trees in the tract must be estimated to within one inch with 99% confidence.

Assuming that diameters are normally distributed with a standard deviation of 6 inches, how many trees should be sampled to provide the interval estimate of the mean diameter of the trees in the tract at the specified confidence level?

Solution

The required accuracy is ±1 inch; that is, w = 1.

The 99% confidence level gives α = 0.01, thus \(z_{\alpha/2} = z_{0.005} = 2.575\).

The standard deviation was given as σ = 6.

Thus, we can compute the required sample size as follows:

\[ n = \frac{z_{\alpha/2}^2\,\sigma^2}{w^2} = \frac{(2.575)^2 (6)^2}{1^2} \approx 239 \]
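The same calculation can be sketched in code. The function name `required_sample_size` is illustrative; the result is rounded up because a fractional observation cannot be sampled and rounding down would miss the required accuracy.

```python
from math import ceil


def required_sample_size(z, sigma, w):
    """Smallest n with z * sigma / sqrt(n) <= w, i.e. n >= (z * sigma / w) ** 2."""
    return ceil((z * sigma / w) ** 2)


# Lumber example: 99% confidence (z = 2.575), sigma = 6 inches, w = 1 inch.
n = required_sample_size(z=2.575, sigma=6, w=1)
print(n)  # 239
```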

Computing Interval Estimates: Summary

1. Determine the sample size and the values of the variables of interest (width, spread of the population or sample).

2. Select the confidence level for the interval estimate.

3. Compute the sample mean (the population variance may be known or unknown).

4. Determine the critical value (z from the standard normal table, or t from the t table).

5. Compute the confidence interval.

Summary of Interval Estimation Procedures for a Population Mean

Is the population standard deviation σ known?

Yes (σ known case): use

\[ \bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \]

No (σ unknown case): use the sample standard deviation s to estimate σ, and use

\[ \bar{x} \pm t_{\alpha/2}\frac{s}{\sqrt{n}} \]
