5_sampling-dsitributions-confidence interval- 15-09-14 [compatibility mode]

108
A. Ramesh Department of Management Studies Indian Institute of Technology Roorkee Sampling Distribution and Confidence Interval

Upload: rk49103

Post on 09-Dec-2015


TRANSCRIPT

Page 1: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

A. Ramesh
Department of Management Studies

Indian Institute of Technology Roorkee

Sampling Distribution and Confidence Interval

Page 2: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Sampling Distribution and Confidence Interval

• Context
• Examples
• Random vs Non-Random Samples
• Central Limit Theorem
• Confidence Interval

Page 3: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Reasons for Sampling

• Sampling can save money and time.
• Because the research process is sometimes destructive, the sample can save product.
• If accessing the population is impossible, sampling is the only option.

Page 4: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Reasons for Taking a Census

• Eliminate the possibility that a random sample is not representative of the population.

• The person authorizing the study is uncomfortable with sample information.

Page 5: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Random Versus Nonrandom Sampling

• Random sampling
  – Every unit of the population has the same probability of being included in the sample.
  – A chance mechanism is used in the selection process.
  – Eliminates bias in the selection process.
  – Also known as probability sampling.

• Nonrandom sampling
  – Not every unit of the population has the same probability of being included in the sample.
  – Open to selection bias.
  – Not an appropriate data collection method for most statistical methods.
  – Also known as non-probability sampling.

Page 6: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Random Sampling Techniques

• Simple Random Sample
• Stratified Random Sample
  – Proportionate
  – Disproportionate
• Systematic Random Sample
• Cluster (or Area) Sampling

Page 7: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Simple Random Sample

• Number each frame unit from 1 to N.
• Use a random number table or a random number generator to select n distinct numbers between 1 and N, inclusive.
• Easier to perform for small populations
• Cumbersome for large populations
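The two steps above (number the frame, then pick n distinct numbers) can be sketched with Python's standard `random` module. This is only an illustration: the function name `simple_random_sample`, the seed, and the values N = 20 and n = 4 are assumptions chosen for the sketch.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Select sample_size distinct frame numbers between 1 and population_size."""
    rng = random.Random(seed)  # a seed only makes the draw repeatable
    # random.sample draws without replacement, so the numbers are distinct.
    return sorted(rng.sample(range(1, population_size + 1), sample_size))

# Illustrative values: a frame of N = 20 units and a sample of n = 4.
sample = simple_random_sample(20, 4, seed=1)
print(sample)
```

A random number generator replaces the random number table here; the selected numbers are then looked up in the numbered frame.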

Page 8: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Simple Random Sample: Numbered Population Frame

01 Andhra Pradesh      11 Madhya Pradesh
02 Himachal Pradesh    12 Uttar Pradesh
03 Gujarat             13 Bihar
04 Maharashtra         14 Rajasthan
05 Nagaland            15 J & K
06 Goa                 16 Tamil Nadu
07 West Bengal         17 Karnataka
08 Haryana             18 Kerala
09 Punjab              19 Orissa
10 Delhi               20 Manipur

Page 9: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Simple Random Sampling: Random Number Table

9 9 4 3 7 8 7 9 6 1 4 5 7 3 7 3 7 5 5 2 9 7 9 6 9 3 9 0 9 4 3 4 4 7 5 3 1 6 1 8
5 0 6 5 6 0 0 1 2 7 6 8 3 6 7 6 6 8 8 2 0 8 1 5 6 8 0 0 1 6 7 8 2 2 4 5 8 3 2 6
8 0 8 8 0 6 3 1 7 1 4 2 8 7 7 6 6 8 3 5 6 0 5 1 5 7 0 2 9 6 5 0 0 2 6 4 5 5 8 7
8 6 4 2 0 4 0 8 5 3 5 3 7 9 8 8 9 4 5 4 6 8 1 3 0 9 1 2 5 3 8 8 1 0 4 7 4 3 1 9
6 0 0 9 7 8 6 4 3 6 0 1 8 6 9 4 7 7 5 8 8 9 5 3 5 9 9 4 0 0 4 8 2 6 8 3 0 6 0 6
5 2 5 8 7 7 1 9 6 5 8 5 4 5 3 4 6 8 3 4 0 0 9 9 1 9 9 7 2 9 7 6 9 4 8 1 5 9 4 1
8 9 1 5 5 9 0 5 5 3 9 0 6 8 9 4 8 6 3 7 0 7 9 5 5 4 7 0 6 2 7 1 1 8 2 6 4 4 9 3

Page 10: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Simple Random Sample: Sample Members

• N = 20
• n = 4

01 Andhra Pradesh      11 Madhya Pradesh
02 Himachal Pradesh    12 Uttar Pradesh
03 Gujarat             13 Bihar
04 Maharashtra         14 Rajasthan
05 Nagaland            15 J & K
06 Goa                 16 Tamil Nadu
07 West Bengal         17 Karnataka
08 Haryana             18 Kerala
09 Punjab              19 Orissa
10 Delhi               20 Manipur

Page 11: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Stratified Random Sample

• Population is divided into non-overlapping subpopulations called strata.
• A random sample is selected from each stratum.
• Potential for reducing sampling error
• Proportionate -- the percentage of the sample taken from each stratum is proportionate to the percentage that each stratum is within the population
• Disproportionate -- proportions of the strata within the sample are different from the proportions of the strata within the population
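Proportionate allocation can be illustrated with a short sketch. The stratum names and sizes below are hypothetical (they are not given in the slides), and the helper names `proportionate_allocation` and `stratified_sample` are assumptions made for the example.

```python
import random

def proportionate_allocation(stratum_sizes, n):
    """Split a total sample size n across strata in proportion to stratum size."""
    total = sum(stratum_sizes.values())
    # Rounding can make the parts sum to slightly more or less than n for
    # awkward proportions; the hypothetical sizes below divide evenly.
    return {name: round(n * size / total) for name, size in stratum_sizes.items()}

def stratified_sample(strata, n, seed=None):
    """Draw a simple random sample from each stratum using proportionate allocation."""
    rng = random.Random(seed)
    sizes = {name: len(units) for name, units in strata.items()}
    allocation = proportionate_allocation(sizes, n)
    return {name: rng.sample(units, allocation[name]) for name, units in strata.items()}

# Hypothetical population of 1,000 listeners split into three age strata.
strata = {
    "20-30": list(range(0, 500)),
    "30-40": list(range(500, 800)),
    "40-50": list(range(800, 1000)),
}
sample = stratified_sample(strata, n=100, seed=7)
allocation = {name: len(units) for name, units in sample.items()}
print(allocation)
```

A disproportionate design would simply replace the allocation step with sample sizes chosen on some other basis.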

Page 12: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Stratified Random Sample: Population of FM Radio Listeners

[Figure: population stratified by age into three strata: 20 - 30 years old, 30 - 40 years old, and 40 - 50 years old. Each stratum is homogeneous (alike) within; the strata are heterogeneous (different) between.]

Page 13: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Systematic Sampling

• Convenient and relatively easy to administer

• Population elements are an ordered sequence (at least, conceptually).

• The first sample element is selected randomly from the first k population elements.

• Thereafter, sample elements are selected at a constant interval, k, from the ordered sequence frame.

$$k = \frac{N}{n}$$

where:
n = sample size
N = population size
k = size of selection interval

Page 14: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Systematic Sampling: Example

• Purchase orders for the previous fiscal year are serialized 1 to 10,000 (N = 10,000).
• A sample of fifty (n = 50) purchase orders is needed for an audit.
• k = 10,000/50 = 200
• First sample element randomly selected from the first 200 purchase orders. Assume the 45th purchase order was selected.
• Subsequent sample elements: 245, 445, 645, . . .
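The purchase-order example can be reproduced with a small sketch. The function name `systematic_sample` and its optional `start` argument are assumptions made for illustration; when `start` is omitted, the first element is drawn at random from the first k elements.

```python
import random

def systematic_sample(N, n, start=None, seed=None):
    """Return n serial numbers taken at a constant interval k = N // n."""
    k = N // n  # size of the selection interval
    if start is None:
        # Random starting point among the first k population elements.
        start = random.Random(seed).randint(1, k)
    return [start + i * k for i in range(n)]

# Purchase-order example: N = 10,000, n = 50, first element assumed to be 45.
orders = systematic_sample(10_000, 50, start=45)
print(orders[:4])   # first four selected purchase orders
print(orders[-1])   # last selected purchase order
```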

Page 15: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Cluster Sampling

• Population is divided into non-overlapping clusters or areas.
• Each cluster is a miniature of the population.
• A subset of the clusters is selected randomly for the sample.
• If the number of elements in the subset of clusters is larger than the desired value of n, these clusters may be subdivided to form a new set of clusters and subjected to a random selection process.

Page 16: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Cluster Sampling

• Advantages
  – More convenient for geographically dispersed populations
  – Reduced travel costs to contact sample elements
  – Simplified administration of the survey
  – Unavailability of sampling frame prohibits using other random sampling methods
• Disadvantages
  – Statistically less efficient when the cluster elements are similar
  – Costs and problems of statistical analysis are greater than for simple random sampling

Page 17: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Nonrandom Sampling

• Convenience Sampling: Sample elements are selected for the convenience of the researcher
• Judgment Sampling: Sample elements are selected by the judgment of the researcher
• Quota Sampling: Sample elements are selected until the quota controls are satisfied
• Snowball Sampling: Survey subjects are selected based on referral from other survey respondents

Page 18: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Errors

• Data from nonrandom samples are not appropriate for analysis by inferential statistical methods.
• Sampling error occurs when the sample is not representative of the population.
• Non-sampling errors
  – Missing data, recording, data entry, and analysis errors
  – Poorly conceived concepts, unclear definitions, and defective questionnaires
  – Response errors occur when people do not know, will not say, or overstate in their answers

Page 19: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Sampling Distribution of $\bar{x}$

Proper analysis and interpretation of a sample statistic requires knowledge of its distribution.

[Figure: Process of Inferential Statistics. Select a random sample from the population (parameter $\mu$); calculate the sample statistic $\bar{x}$ to estimate $\mu$.]

Page 20: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Distribution of a Small Finite Population

[Population Histogram: x-axis class midpoints 52.5, 57.5, 62.5, 67.5, 72.5; y-axis Frequency from 0 to 3.]

N = 8

Population values: 54, 55, 59, 63, 64, 68, 69, 70

Page 21: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Sample Space for n = 2 with Replacement

 #  Sample   Mean    #  Sample   Mean    #  Sample   Mean    #  Sample   Mean
 1  (54,54)  54.0   17  (59,54)  56.5   33  (64,54)  59.0   49  (69,54)  61.5
 2  (54,55)  54.5   18  (59,55)  57.0   34  (64,55)  59.5   50  (69,55)  62.0
 3  (54,59)  56.5   19  (59,59)  59.0   35  (64,59)  61.5   51  (69,59)  64.0
 4  (54,63)  58.5   20  (59,63)  61.0   36  (64,63)  63.5   52  (69,63)  66.0
 5  (54,64)  59.0   21  (59,64)  61.5   37  (64,64)  64.0   53  (69,64)  66.5
 6  (54,68)  61.0   22  (59,68)  63.5   38  (64,68)  66.0   54  (69,68)  68.5
 7  (54,69)  61.5   23  (59,69)  64.0   39  (64,69)  66.5   55  (69,69)  69.0
 8  (54,70)  62.0   24  (59,70)  64.5   40  (64,70)  67.0   56  (69,70)  69.5
 9  (55,54)  54.5   25  (63,54)  58.5   41  (68,54)  61.0   57  (70,54)  62.0
10  (55,55)  55.0   26  (63,55)  59.0   42  (68,55)  61.5   58  (70,55)  62.5
11  (55,59)  57.0   27  (63,59)  61.0   43  (68,59)  63.5   59  (70,59)  64.5
12  (55,63)  59.0   28  (63,63)  63.0   44  (68,63)  65.5   60  (70,63)  66.5
13  (55,64)  59.5   29  (63,64)  63.5   45  (68,64)  66.0   61  (70,64)  67.0
14  (55,68)  61.5   30  (63,68)  65.5   46  (68,68)  68.0   62  (70,68)  69.0
15  (55,69)  62.0   31  (63,69)  66.0   47  (68,69)  68.5   63  (70,69)  69.5
16  (55,70)  62.5   32  (63,70)  66.5   48  (68,70)  69.0   64  (70,70)  70.0
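The sample space in the table can be generated mechanically. The sketch below takes the eight population values from the pairs listed in the table and enumerates every ordered sample of size 2 drawn with replacement; the variable names are illustrative.

```python
from itertools import product
from statistics import mean

# Population values as they appear in the sample-space table (N = 8).
population = [54, 55, 59, 63, 64, 68, 69, 70]

# Every ordered pair drawn with replacement: 8 * 8 = 64 samples.
samples = list(product(population, repeat=2))
sample_means = [(a + b) / 2 for a, b in samples]

print(len(samples))                    # number of possible samples
print(samples[:3], sample_means[:3])   # first rows of the table
print(mean(population), mean(sample_means))
```

The last line compares the population mean with the mean of all 64 sample means; the two agree, which is the first property of the sampling distribution of the mean.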

Page 22: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Distribution of the Sample Means

[Sampling Distribution Histogram: x-axis class boundaries from 53.75 to 71.25 in steps of 2.5; y-axis Frequency from 0 to 20.]

Page 23: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

1,800 Randomly Selected Values from an Exponential Distribution

[Histogram: x-axis X from 0 to 10 in steps of 0.5; y-axis Frequency from 0 to 450.]

Page 24: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Means of 60 Samples (n = 2) from an Exponential Distribution

[Histogram: x-axis $\bar{x}$ from 0.00 to 4.00 in steps of 0.25; y-axis Frequency from 0 to 9.]

Page 25: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Means of 60 Samples (n = 5) from an Exponential Distribution

[Histogram: x-axis $\bar{x}$ from 0.00 to 4.00 in steps of 0.25; y-axis Frequency from 0 to 10.]

Page 26: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Means of 60 Samples (n = 30) from an Exponential Distribution

[Histogram: x-axis $\bar{x}$ from 0.00 to 3.00 in steps of 0.25; y-axis Frequency from 0 to 16.]

Page 27: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

1,800 Randomly Selected Values from a Uniform Distribution

[Histogram: x-axis (labeled X-bar) from 0.0 to 5.0 in steps of 0.5; y-axis Frequency from 0 to 250.]

Page 28: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Means of 60 Samples (n = 2) from a Uniform Distribution

[Histogram: x-axis $\bar{x}$ from 1.00 to 4.25 in steps of 0.25; y-axis Frequency from 0 to 10.]

Page 29: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Means of 60 Samples (n = 5) from a Uniform Distribution

[Histogram: x-axis $\bar{x}$ from 1.00 to 4.25 in steps of 0.25; y-axis Frequency from 0 to 12.]

Page 30: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Means of 60 Samples (n = 30) from a Uniform Distribution

[Histogram: x-axis $\bar{x}$ from 1.00 to 4.25 in steps of 0.25; y-axis Frequency from 0 to 25.]

Page 31: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Central Limit Theorem

• For sufficiently large sample sizes ($n \ge 30$),
• the distribution of sample means, $\bar{x}$, is approximately normal;
• the mean of this distribution is equal to $\mu$, the population mean; and
• its standard deviation is $\sigma/\sqrt{n}$,
• regardless of the shape of the population distribution.

Marquis de Laplace

Page 32: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Central Limit Theorem

• The central limit theorem is one of the most remarkable results of the theory of probability.

• In its simplest form, the theorem states that the sum of a large number of independent observations from the same distribution has, under certain general conditions, an approximate normal distribution.

• Moreover, the approximation steadily improves as the number of observations increases. The theorem is considered the heart of probability theory, although a better name would be normal convergence theorem.
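The theorem can be checked empirically with a small simulation. The sketch below is an assumption-laden illustration rather than the procedure used for the slides' histograms: it draws from an exponential population with rate 1 (so $\mu = 1$ and $\sigma = 1$), uses 2,000 samples per sample size instead of 60, and fixes an arbitrary seed.

```python
import random
from math import sqrt
from statistics import mean, stdev

def sample_means(n, num_samples, rng):
    """Means of num_samples samples of size n from an Exponential(rate=1) population."""
    return [mean([rng.expovariate(1.0) for _ in range(n)]) for _ in range(num_samples)]

rng = random.Random(42)  # arbitrary fixed seed so the run is repeatable
for n in (2, 5, 30):
    means = sample_means(n, 2000, rng)
    # Columns: n, mean of sample means, their standard deviation, and sigma / sqrt(n).
    print(n, round(mean(means), 3), round(stdev(means), 3), round(1 / sqrt(n), 3))
```

The mean of the sample means should stay near 1 for every n, while their standard deviation should track $\sigma/\sqrt{n}$ and shrink as n grows.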

Page 33: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Distributions of Samples..

• Sampling distributions drawn from a uniformly distributed population start to look like normal distributions even with a sample size as small as 2.

• If the sample size is large enough they form nearly perfect normal distributions

Page 34: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Population: Uniform Distribution

Fig. 1) Histogram of Population - Uniform Distribution: population = 10,000; mean = 5.013; std dev 2.897

Page 35: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Distribution of Samples: n=2

Fig. 2) Sampling Distribution n = 2: number of samples = 2010; mean = 4.995; std dev 2.011

Page 36: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Distribution of Samples: n=10

Fig. 3) Sampling Distribution n = 10: number of samples = 2010; mean = 5.018;std dev 0.906

Page 37: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Distribution of Samples: n=50

Fig. 4) Sampling Distribution n = 50: number of samples = 2010; mean = 4.999; std dev 0.411

Page 38: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Useful web site for Demo

• http://www.statisticalengineering.com/central_limit_theorem.htm

Page 39: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Distribution of Sample Means for Various Sample Sizes

[Figure: two rows of panels. Exponential population: sample means for n = 2, n = 5, n = 30. Uniform population: sample means for n = 2, n = 5, n = 30.]

Page 40: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Distribution of Sample Means for Various Sample Sizes

[Figure: two rows of panels. Normal population: sample means for n = 2, n = 5, n = 30. U-shaped population: sample means for n = 2, n = 5, n = 30.]

Page 41: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Sampling from a Normal Population

• The distribution of sample means is normal for any sample size.

If $\bar{x}$ is the mean of a random sample of size $n$ from a normal population with mean $\mu$ and standard deviation $\sigma$, the distribution of $\bar{x}$ is a normal distribution with mean $\mu_{\bar{x}} = \mu$ and standard deviation $\sigma_{\bar{x}} = \sigma/\sqrt{n}$.

Page 42: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Z Formula for Sample Means

$$Z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$

Page 43: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Example

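Population Parameters: $\mu = 85$, $\sigma = 9$
Sample Size: $n = 40$

$$P(\bar{X} > 87) = P\left(Z > \frac{87 - \mu_{\bar{X}}}{\sigma_{\bar{X}}}\right) = P\left(Z > \frac{87 - \mu}{\sigma/\sqrt{n}}\right)$$

$$= P\left(Z > \frac{87 - 85}{9/\sqrt{40}}\right) = P(Z > 1.41)$$

$$= .5 - P(0 \le Z \le 1.41) = .5 - .4207 = .0793$$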
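The same probability can be computed without a Z table. This sketch uses `statistics.NormalDist` from the Python standard library (available from Python 3.8); because it does not round Z to 1.41 first, the result differs slightly from the table-based .0793.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 85, 9, 40            # population parameters and sample size from the example
standard_error = sigma / sqrt(n)    # standard deviation of the sample mean

z = (87 - mu) / standard_error      # Z value for a sample mean of 87
p = 1 - NormalDist().cdf(z)         # upper-tail probability P(Z > z)

print(round(z, 2), round(p, 4))
```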

Page 44: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Graphic Solution to Example

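$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{87 - 85}{9/\sqrt{40}} = \frac{2}{1.42} = 1.41$$

[Figure: two normal curves with equal areas of .0793 in the upper tail. Left: the distribution of $\bar{X}$ with mean 85, the value 87 marked, and $\sigma_{\bar{X}} = 9/\sqrt{40} = 1.42$. Right: the standard normal distribution with $\sigma_Z = 1$ and Z = 1.41 marked. In each curve the lower half has area .5000 and the area between the mean and the marked value is .4207.]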

Page 45: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Sampling from a Finite Population without Replacement

• In this case, the standard deviation of the distribution of sample means is smaller than when sampling from an infinite population (or from a finite population with replacement).

• The correct value of this standard deviation is computed by applying a finite correction factor to the standard deviation for sampling from an infinite population.

Page 46: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Sampling from a Finite Population

• Finite Correction Factor

$$\sqrt{\frac{N - n}{N - 1}}$$

• Modified Z Formula

$$Z = \frac{\bar{X} - \mu}{\dfrac{\sigma}{\sqrt{n}}\sqrt{\dfrac{N - n}{N - 1}}}$$

Page 47: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Central Limit Theorem for Proportion

• Mean of the sampling distribution of the proportion: $\mu_{\hat{p}} = p$

• Standard error of the proportion: $\sigma_{\hat{p}} = \sqrt{\dfrac{pq}{n}}$

• $Z = \dfrac{\hat{p} - p}{\sigma_{\hat{p}}}$

Page 48: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Sampling Distribution of $\hat{p}$

• Sample Proportion

$$\hat{p} = \frac{X}{n}$$

where:
X = number of items in a sample that possess the characteristic
n = number of items in the sample

• Sampling Distribution
  – Approximately normal if nP > 5 and nQ > 5 (P is the population proportion and Q = 1 - P.)
  – The mean of the distribution is P.
  – The standard deviation of the distribution is $\sigma_{\hat{p}} = \sqrt{\dfrac{PQ}{n}}$

Page 49: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Estimation

Page 50: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Statistical Inference…

Statistical inference is the process by which we acquire information and draw conclusions about populations from samples.

In order to do inference, we require the skills and knowledge of descriptive statistics, probability distributions, and sampling distributions.

[Diagram: Population (Parameter), Sample (Statistic), Inference; Data, Statistics, Information.]

Page 51: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Estimation

There are two types of inference: estimation and hypothesis testing; estimation is introduced first.

The objective of estimation is to determine the approximate value of a population parameter on the basis of a sample statistic. E.g., the sample mean ($\bar{x}$) is employed to estimate the population mean (μ).

Page 52: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Estimation…

The objective of estimation is to determine the approximate value of a population parameter on the basis of a sample statistic.

There are two types of estimators:

Point Estimator

Interval Estimator

Page 53: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Point Estimator…

A point estimator draws inferences about a population by estimating the value of an unknown parameter using a single value or point.

Page 54: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Point Estimator…

• Point probabilities in continuous distributions were virtually zero.

• Point estimator gets closer to the parameter value with an increased sample size, but point estimators don’t reflect the effects of larger sample sizes.

• Hence we will employ the interval estimator to estimate population parameters…

Page 55: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Interval Estimator…

An interval estimator draws inferences about a population by estimating the value of an unknown parameter using an interval.

That is, we say (with some ___% certainty) that the population parameter of interest is between some lower and upper bounds.

Page 56: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Point & Interval Estimation…

For example, suppose we want to estimate the mean summer income of a class of business students. For n = 25 students, $\bar{x}$ is calculated to be 400/week (point estimate).

An alternative statement is: The mean income is between 380 and 420/week (interval estimate).

Page 57: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Estimator

An estimator of a population parameter is a sample statistic used to estimate the parameter. The most commonly used estimator of the:

Population Parameter               Sample Statistic
Mean ($\mu$)                is the  Mean ($\bar{X}$)
Variance ($\sigma^2$)       is the  Variance ($s^2$)
Standard Deviation ($\sigma$) is the  Standard Deviation ($s$)
Proportion ($p$)            is the  Proportion ($\hat{p}$)

Page 58: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Estimating μ when σ is known

We can calculate an interval estimator from a sampling distribution by:
• Drawing a sample of size n from the population
• Calculating its mean, $\bar{x}$

And, by the central limit theorem, we know that $\bar{x}$ is normally (or approximately normally) distributed, so

$$Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$

will have a standard normal (or approximately normal) distribution.

Page 59: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Estimating μ when σ is known

Looking at this in more detail, in $Z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$:

• $Z$: known, i.e. standard normal distribution
• $\bar{x}$: known, i.e. sample mean
• $\mu$: unknown, i.e. we want to estimate the population mean
• $n$: known, i.e. the number of items sampled
• $\sigma$: known, i.e. it is assumed we know the population standard deviation

Page 60: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

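Estimating μ when σ is known

$$P\left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$$

Thus, the probability that the interval

$$\left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right)$$

contains the population mean μ is $1 - \alpha$. This is a confidence interval estimator of μ. The sample mean is in the center of the interval.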

Page 61: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

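Confidence Interval Estimator for μ:

$$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$

The probability $1 - \alpha$ is called the confidence level. The interval is usually represented with a “plus/minus” ( ± ) sign:

• lower confidence limit (LCL): $\bar{x} - z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}$
• upper confidence limit (UCL): $\bar{x} + z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}$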

Page 62: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

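Graphically, here is the confidence interval for μ:

[Figure: an interval centered at $\bar{x}$, running from $\bar{x} - z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}$ to $\bar{x} + z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}$.]

$$\text{width} = 2 z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$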

Page 63: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Graphically, the actual location of the population mean μ…

[Figure: the confidence interval with possible positions of μ marked “may be here”, “or here”, and “or possibly even here”.]

Page 64: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Four commonly used confidence levels…

Confidence level (1 − α)    α       α/2      z_{α/2}
0.90                        0.10    0.05     1.645
0.95                        0.05    0.025    1.96
0.98                        0.02    0.01     2.33
0.99                        0.01    0.005    2.575
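The tabulated critical values can be recovered from the inverse of the standard normal CDF. The sketch below uses `statistics.NormalDist().inv_cdf` (Python 3.8 or later); the helper name `z_critical` is an assumption for illustration. The computed values are slightly more precise than the rounded table entries (for example, about 2.326 rather than 2.33).

```python
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value z_{alpha/2} for a given confidence level 1 - alpha."""
    alpha = 1 - confidence
    # z_{alpha/2} leaves an area of alpha/2 in the upper tail.
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.98, 0.99):
    print(level, round(z_critical(level), 3))
```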

Page 65: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Interval Width…

The width of the confidence interval estimate is a function of the confidence level, the population standard deviation, and the sample size:

$$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$

Page 66: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Interval Width…

The width of the confidence interval estimate is a function of the confidence level, the population standard deviation, and the sample size:

$$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$

A larger confidence level produces a wider confidence interval.

Page 67: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Interval Width…

The width of the confidence interval estimate is a function of the confidence level, the population standard deviation, and the sample size:

$$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$

Larger values of σ produce wider confidence intervals.

Page 68: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Interval Width…

The width of the confidence interval estimate is a function of the confidence level, the population standard deviation, and the sample size:

$$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$

Increasing the sample size decreases the width of the confidence interval while the confidence level can remain unchanged.
Note: this also increases the cost of obtaining additional data.

Page 69: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Selecting the Sample Size…

We can control the width of the interval by determining the sample size necessary to produce narrow intervals.

Suppose we want to estimate the mean demand “to within 5 units”; i.e. we want the interval estimate to be $\bar{x} \pm 5$. Since the interval estimator is

$$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$

it follows that

$$z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 5$$

Solve for n to get the requisite sample size!

Page 70: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Estimation with small samples: using the t distribution

• If:
  – The sample size is small (<25 or so), and
  – The true variance $\sigma^2$ is unknown
• Then the t distribution should be used instead of the standard Normal.

Page 71: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

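Distribution of Sample Means for (1 − α)% Confidence

[Figure: normal curve of $\bar{X}$ centered at μ, with the Z axis marked at $-Z_{\alpha/2}$, 0, and $Z_{\alpha/2}$, and an area of α/2 in each tail.]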

Page 72: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

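Distribution of Sample Means for (1 − α)% Confidence

[Figure: the same normal curve of $\bar{X}$, with the Z axis marked at $-Z_{\alpha/2}$, 0, and $Z_{\alpha/2}$, an area of α/2 in each tail, and an area of .5 − α/2 between the center and each critical value.]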

Page 73: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

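Distribution of Sample Means for (1 − α)% Confidence

[Figure: the same normal curve of $\bar{X}$, with the Z axis marked at $-Z_{\alpha/2}$, 0, and $Z_{\alpha/2}$, an area of α/2 in each tail, and the central region between the critical values labeled in terms of 1 − α.]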

Page 74: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Probability Interpretation of the Level of Confidence

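$$\text{Prob}\left[\bar{X} - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right] = 1 - \alpha$$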

Page 75: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Distribution of Sample Means for 95% Confidence

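[Figure: normal curve of $\bar{X}$ with a central area of 95%, split into .4750 on each side of the mean, and an area of .025 in each tail; the Z axis is marked at -1.96, 0, and 1.96.]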

Page 76: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

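95% Confidence Interval for μ

With $\bar{X} = 153$, $\sigma = 46$, $n = 85$, and $Z = 1.96$:

$$\bar{X} - Z\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z\frac{\sigma}{\sqrt{n}}$$

$$153 - 1.96\frac{46}{\sqrt{85}} \le \mu \le 153 + 1.96\frac{46}{\sqrt{85}}$$

$$153 - 9.78 \le \mu \le 153 + 9.78$$

$$143.22 \le \mu \le 162.78$$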
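A minimal sketch of the interval computation, assuming σ is known and the critical value Z is supplied by the caller; the function name `z_interval` is illustrative. It is applied to the numbers of this slide (sample mean 153, σ = 46, n = 85, Z = 1.96).

```python
from math import sqrt

def z_interval(xbar, sigma, n, z):
    """Confidence interval xbar +/- z * sigma / sqrt(n) for a known sigma."""
    margin = z * sigma / sqrt(n)   # half-width of the interval
    return xbar - margin, xbar + margin

low, high = z_interval(xbar=153, sigma=46, n=85, z=1.96)
print(round(low, 2), round(high, 2))
```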

Page 77: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

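95% Confidence Intervals for μ

[Figure: the sampling distribution of $\bar{X}$ with its central 95% region marked, and confidence intervals built around several different sample means $\bar{X}$.]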

Page 78: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Example

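$\bar{X} = 10.455$, $\sigma = 7.7$, and $n = 44$.
90% confidence, so $Z = 1.645$.

$$\bar{X} - Z\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z\frac{\sigma}{\sqrt{n}}$$

$$10.455 - 1.645\frac{7.7}{\sqrt{44}} \le \mu \le 10.455 + 1.645\frac{7.7}{\sqrt{44}}$$

$$10.455 - 1.91 \le \mu \le 10.455 + 1.91$$

$$8.545 \le \mu \le 12.365$$

$$\text{Prob}[8.545 \le \mu \le 12.365] = 0.90$$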

Page 79: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Example

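$\bar{X} = 34.3$, $\sigma = 8$, $N = 800$, and $n = 50$.
98% confidence, so $Z = 2.33$.

$$\bar{X} - Z\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N - n}{N - 1}} \le \mu \le \bar{X} + Z\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N - n}{N - 1}}$$

$$34.3 - 2.33\frac{8}{\sqrt{50}}\sqrt{\frac{800 - 50}{800 - 1}} \le \mu \le 34.3 + 2.33\frac{8}{\sqrt{50}}\sqrt{\frac{800 - 50}{800 - 1}}$$

$$34.3 - 2.554 \le \mu \le 34.3 + 2.554$$

$$31.75 \le \mu \le 36.85$$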
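The finite-population version only multiplies the standard error by the finite correction factor. The sketch below assumes σ is known and that Z is read from a table; `z_interval_finite` is an illustrative name. It uses this example's values (sample mean 34.3, σ = 8, N = 800, n = 50, Z = 2.33).

```python
from math import sqrt

def z_interval_finite(xbar, sigma, n, N, z):
    """Confidence interval for the mean when sampling without replacement from N units."""
    fpc = sqrt((N - n) / (N - 1))          # finite correction factor
    margin = z * sigma / sqrt(n) * fpc     # corrected half-width
    return xbar - margin, xbar + margin, margin

low, high, margin = z_interval_finite(xbar=34.3, sigma=8, n=50, N=800, z=2.33)
print(round(margin, 3), round(low, 2), round(high, 2))
```

Because the correction factor is below 1 whenever n > 1, the corrected interval is always narrower than the uncorrected one.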

Page 80: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

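Confidence Interval to Estimate μ when n is Large and σ is Unknown

$$\bar{X} \pm Z_{\alpha/2}\frac{S}{\sqrt{n}}$$

or

$$\bar{X} - Z_{\alpha/2}\frac{S}{\sqrt{n}} \le \mu \le \bar{X} + Z_{\alpha/2}\frac{S}{\sqrt{n}}$$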

Page 81: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Example

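$\bar{X} = 85.5$, $S = 19.3$, and $n = 110$.
99% confidence, so $Z = 2.575$.

$$\bar{X} - Z\frac{S}{\sqrt{n}} \le \mu \le \bar{X} + Z\frac{S}{\sqrt{n}}$$

$$85.5 - 2.575\frac{19.3}{\sqrt{110}} \le \mu \le 85.5 + 2.575\frac{19.3}{\sqrt{110}}$$

$$85.5 - 4.7 \le \mu \le 85.5 + 4.7$$

$$80.8 \le \mu \le 90.2$$

$$\text{Prob}[80.8 \le \mu \le 90.2] = 0.99$$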

Page 82: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Sampling from normal distribution

• SBI of IITR Branch calculates that its individual savings accounts are normally distributed with a mean of 2,000 and a standard deviation of 600. If the bank takes a random sample of 100 accounts, what is the probability that the sample mean will lie between 1,900 and 2,050?

• Answer: 0.7492

Page 83: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Sampling from non-normal distribution

• The distribution of annual earnings of all bank tellers with five years' experience is negatively skewed, with a mean of 19,000 and a standard deviation of 2,000.

• If we draw a random sample of 30 tellers, what is the probability that their earnings will average more than 19,750 annually?

• Answer: 0.0202
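Both practice problems (the savings accounts and the bank tellers) can be checked numerically. The sketch below models the sample mean as normal with standard deviation $\sigma/\sqrt{n}$, relying on the central limit theorem for the skewed teller earnings. It takes the teller threshold to be 19,750, since that is the value that reproduces the stated answer of 0.0202. The helper name `sampling_distribution` is an assumption; small differences from the stated answers come from the Z table's two-decimal rounding.

```python
from math import sqrt
from statistics import NormalDist

def sampling_distribution(mu, sigma, n):
    """Normal model for the sample mean: mean mu, standard deviation sigma / sqrt(n)."""
    return NormalDist(mu, sigma / sqrt(n))

# Savings accounts: mean 2,000, standard deviation 600, n = 100.
accounts = sampling_distribution(2000, 600, 100)
p_accounts = accounts.cdf(2050) - accounts.cdf(1900)

# Bank tellers: mean 19,000, standard deviation 2,000, n = 30 (CLT approximation).
tellers = sampling_distribution(19000, 2000, 30)
p_tellers = 1 - tellers.cdf(19750)

print(round(p_accounts, 4), round(p_tellers, 4))
```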

Page 84: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

• t-distribution

Page 85: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Estimating the Mean of a Normal Population: Small n and Unknown σ

• The population has a normal distribution.
• The value of the population standard deviation is unknown.
• The sample size is small, n < 30.
• Z distribution is not appropriate for these conditions
• t distribution is appropriate

Page 86: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

The t Distribution

• Developed by British statistician William Gosset

• A family of distributions -- a unique distribution for each value of its parameter, degrees of freedom (d.f.)

• Symmetric, unimodal, mean = 0, flatter than the Z distribution

• t formula: $t = \dfrac{\bar{X} - \mu}{S/\sqrt{n}}$

Page 87: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Degrees of freedom

• Example
• No. of values we can choose freely.

Page 88: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Comparison of Selected t Distributions to the Standard Normal

[Figure: the standard normal curve plotted together with t curves for d.f. = 25, d.f. = 5, and d.f. = 1; x-axis from -3 to 3.]

Page 89: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

T-table

• The t table is more compact.
• It shows areas and t values for only a few percentages (1, 2, 5, 10, ..).
• The t table does not focus on the chance that the population parameter being estimated will fall within our confidence interval.
• Instead, it gives the chance that the population parameter we are estimating will not be within our confidence interval (that is, that it will lie outside it).

Page 90: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Table of Critical Values of t

 df    t0.100   t0.050   t0.025   t0.010   t0.005
  1    3.078    6.314   12.706   31.821   63.656
  2    1.886    2.920    4.303    6.965    9.925
  3    1.638    2.353    3.182    4.541    5.841
  4    1.533    2.132    2.776    3.747    4.604
  5    1.476    2.015    2.571    3.365    4.032
...
 23    1.319    1.714    2.069    2.500    2.807
 24    1.318    1.711    2.064    2.492    2.797
 25    1.316    1.708    2.060    2.485    2.787
...
 29    1.311    1.699    2.045    2.462    2.756
 30    1.310    1.697    2.042    2.457    2.750
...
 40    1.303    1.684    2.021    2.423    2.704
 60    1.296    1.671    2.000    2.390    2.660
120    1.289    1.658    1.980    2.358    2.617
  ∞    1.282    1.645    1.960    2.327    2.576

With df = 24 and α = 0.05, $t_{\alpha}$ = 1.711.

Page 91: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

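Confidence Intervals for μ of a Normal Population: Small n and Unknown σ

$$\bar{X} \pm t\frac{S}{\sqrt{n}}$$

or

$$\bar{X} - t\frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t\frac{S}{\sqrt{n}}$$

$$df = n - 1$$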

Page 92: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Example

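$\bar{X} = 2.14$, $S = 1.29$, $n = 14$, $df = n - 1 = 13$

$$\frac{\alpha}{2} = \frac{1 - .99}{2} = 0.005, \qquad t_{.005,\,13} = 3.012$$

$$\bar{X} - t\frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t\frac{S}{\sqrt{n}}$$

$$2.14 - 3.012\frac{1.29}{\sqrt{14}} \le \mu \le 2.14 + 3.012\frac{1.29}{\sqrt{14}}$$

$$2.14 - 1.04 \le \mu \le 2.14 + 1.04$$

$$1.10 \le \mu \le 3.18$$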

Page 93: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Solution for Demonstration Problem

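$$\bar{X} - t\frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t\frac{S}{\sqrt{n}}$$

$$2.14 - 3.012\frac{1.29}{\sqrt{14}} \le \mu \le 2.14 + 3.012\frac{1.29}{\sqrt{14}}$$

$$2.14 - 1.04 \le \mu \le 2.14 + 1.04$$

$$1.10 \le \mu \le 3.18$$

$$\text{Prob}[1.10 \le \mu \le 3.18] = 0.99$$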
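A minimal sketch of the small-sample interval. The Python standard library has no t quantile function, so the critical value is passed in as read from the t table (3.012 for df = 13 and α/2 = 0.005); the function name `t_interval` is illustrative.

```python
from math import sqrt

def t_interval(xbar, s, n, t_crit):
    """Confidence interval xbar +/- t * S / sqrt(n); t_crit is looked up with df = n - 1."""
    margin = t_crit * s / sqrt(n)
    return xbar - margin, xbar + margin

# Demonstration problem: sample mean 2.14, S = 1.29, n = 14, 99% confidence.
low, high = t_interval(xbar=2.14, s=1.29, n=14, t_crit=3.012)
print(round(low, 2), round(high, 2))
```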

Page 94: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Chi-Square distribution

Page 95: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Population Variance

• Variance is an inverse measure of the group’s homogeneity.

• Variance is an important indicator of total quality in standardized products and services.

• Managers improve processes to reduce variance.
• Variance is a measure of financial risk. Variance of rates of return helps managers assess financial and capital investment alternatives.

• Variability is a reality in global markets. Productivity, wages, and costs of living vary between regions and nations.

Page 96: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Estimating the Population Variance

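• Population Parameter: $\sigma^2$

• Estimator of $\sigma^2$:

$$S^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$$

• $\chi^2$ formula for Single Variance:

$$\chi^2 = \frac{(n - 1)S^2}{\sigma^2}, \qquad \text{degrees of freedom} = n - 1$$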

Page 97: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

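Confidence Interval for $\sigma^2$

$$\frac{(n - 1)S^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n - 1)S^2}{\chi^2_{1 - \alpha/2}}$$

$$df = n - 1$$

$$\alpha = 1 - \text{level of confidence}$$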

Page 98: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Selected $\chi^2$ Distributions

[Figure: $\chi^2$ density curves for df = 3, df = 5, and df = 10, starting at 0.]

Page 99: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

$\chi^2$ Table

[Figure: $\chi^2$ curve for df = 5 on an axis from 0 to 20, with an upper-tail area of 0.10 to the right of 9.23635.]

 df    0.975         0.950         0.100      0.050      0.025
  1    9.82068E-04   3.93219E-03    2.70554    3.84146    5.02390
  2    0.0506357     0.102586       4.60518    5.99148    7.37778
  3    0.2157949     0.351846       6.25139    7.81472    9.34840
  4    0.484419      0.710724       7.77943    9.48773   11.14326
  5    0.831209      1.145477       9.23635   11.07048   12.83249
  6    1.237342      1.63538       10.6446    12.5916    14.4494
  7    1.689864      2.16735       12.0170    14.0671    16.0128
  8    2.179725      2.73263       13.3616    15.5073    17.5345
  9    2.700389      3.32512       14.6837    16.9190    19.0228
 10    3.24696       3.94030       15.9872    18.3070    20.4832
...
 20    9.59077      10.8508        28.4120    31.4104    34.1696
 21   10.28291      11.5913        29.6151    32.6706    35.4789
 22   10.9823       12.3380        30.8133    33.9245    36.7807
 23   11.6885       13.0905        32.0069    35.1725    38.0756
 24   12.4011       13.8484        33.1962    36.4150    39.3641
 25   13.1197       14.6114        34.3816    37.6525    40.6465
...
 70   48.7575       51.7393        85.5270    90.5313    95.0231
 80   57.1532       60.3915        96.5782   101.8795   106.6285
 90   65.6466       69.1260       107.5650   113.1452   118.1359
100   74.2219       77.9294       118.4980   124.3421   129.5613

With df = 5 and α = 0.10, $\chi^2$ = 9.23635

Page 100: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Two Table Values of $\chi^2$

[Figure: $\chi^2$ curve for df = 7 on an axis from 0 to 20, with an area of .05 in each tail and .95 in the middle; the two table values are 2.16735 and 14.0671.]

 df    0.950         0.050
  1    3.93219E-03    3.84146
  2    0.102586       5.99148
  3    0.351846       7.81472
  4    0.710724       9.48773
  5    1.145477      11.07048
  6    1.63538       12.5916
  7    2.16735       14.0671
  8    2.73263       15.5073
  9    3.32512       16.9190
 10    3.94030       18.3070
...
 20   10.8508        31.4104
 21   11.5913        32.6706
 22   12.3380        33.9245
 23   13.0905        35.1725
 24   13.8484        36.4150
 25   14.6114        37.6525

Page 101: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

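90% Confidence Interval for $\sigma^2$

$S^2 = .0022125$, $n = 8$, $df = n - 1 = 7$, $\alpha = .10$

$$\chi^2_{\alpha/2} = \chi^2_{.10/2} = \chi^2_{.05} = 14.0671$$

$$\chi^2_{1 - \alpha/2} = \chi^2_{1 - .10/2} = \chi^2_{.95} = 2.16735$$

$$\frac{(n - 1)S^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n - 1)S^2}{\chi^2_{1 - \alpha/2}}$$

$$\frac{(8 - 1)(.0022125)}{14.0671} \le \sigma^2 \le \frac{(8 - 1)(.0022125)}{2.16735}$$

$$.001101 \le \sigma^2 \le .007146$$

$$\text{Prob}[0.001101 \le \sigma^2 \le 0.007146] = 0.90$$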

Page 102: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Solution for Demonstration Problem

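$S^2 = 1.2544$, $n = 25$, $df = n - 1 = 24$, $\alpha = .05$

$$\chi^2_{\alpha/2} = \chi^2_{.05/2} = \chi^2_{.025} = 39.3641$$

$$\chi^2_{1 - \alpha/2} = \chi^2_{1 - .05/2} = \chi^2_{.975} = 12.4011$$

$$\frac{(n - 1)S^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n - 1)S^2}{\chi^2_{1 - \alpha/2}}$$

$$\frac{(25 - 1)(1.2544)}{39.3641} \le \sigma^2 \le \frac{(25 - 1)(1.2544)}{12.4011}$$

$$0.7648 \le \sigma^2 \le 2.4277$$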
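The variance interval can be sketched in a few lines. The standard library has no chi-square quantile function, so the two table values are passed in by the caller; the function and parameter names are illustrative. Both worked examples (n = 8 at 90% confidence and n = 25 at 95% confidence) are reproduced.

```python
def variance_interval(s2, n, chi2_upper_tail, chi2_lower_tail):
    """Confidence interval for sigma^2 from the sample variance s2.

    chi2_upper_tail is the table value with alpha/2 to its right;
    chi2_lower_tail is the table value with 1 - alpha/2 to its right.
    """
    df = n - 1
    return df * s2 / chi2_upper_tail, df * s2 / chi2_lower_tail

# 90% interval: S^2 = .0022125, n = 8, table values for df = 7.
low90, high90 = variance_interval(0.0022125, 8, 14.0671, 2.16735)

# 95% interval: S^2 = 1.2544, n = 25, table values for df = 24.
low95, high95 = variance_interval(1.2544, 25, 39.3641, 12.4011)

print(round(low90, 6), round(high90, 6))
print(round(low95, 4), round(high95, 4))
```

Note that the larger table value produces the lower limit, because it sits in the denominator.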

Page 103: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

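Determining Sample Size when Estimating μ

• Z formula

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$

• Error of Estimation (tolerable error)

$$E = \bar{X} - \mu$$

• Estimated Sample Size

$$n = \frac{Z_{\alpha/2}^2\,\sigma^2}{E^2} = \left(\frac{Z_{\alpha/2}\,\sigma}{E}\right)^2$$

• Estimated σ

$$\sigma \approx \frac{1}{4}\,\text{range}$$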

Page 104: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Sample Size When Estimating μ: Example

$E = 1$, $\sigma = 4$
90% confidence, so $Z = 1.645$.

$$n = \frac{Z_{\alpha/2}^2\,\sigma^2}{E^2} = \frac{(1.645)^2 (4)^2}{1^2} = 43.30 \text{ or } 44$$

Page 105: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Solution for Demonstration Problem

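$E = 2$, range $= 25$
95% confidence, so $Z = 1.96$.

$$\text{estimated } \sigma: \ \frac{1}{4}\,\text{range} = \frac{1}{4}(25) = 6.25$$

$$n = \frac{Z^2 \sigma^2}{E^2} = \frac{(1.96)^2 (6.25)^2}{2^2} = 37.52 \text{ or } 38$$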
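The sample-size rule can be sketched as a small function that always rounds up, since a fractional observation is not possible. The name `sample_size_for_mean` is illustrative; the two calls reproduce the example with E = 1 and σ = 4, and the demonstration problem with E = 2 and σ estimated as one quarter of the range.

```python
from math import ceil

def sample_size_for_mean(z, sigma, error):
    """Smallest n with z * sigma / sqrt(n) <= error, i.e. n = (z * sigma / E)^2 rounded up."""
    return ceil((z * sigma / error) ** 2)

# Example: E = 1, sigma = 4, 90% confidence (Z = 1.645).
n_example = sample_size_for_mean(z=1.645, sigma=4, error=1)

# Demonstration problem: E = 2, range = 25, so sigma is estimated as 25 / 4 = 6.25.
n_demo = sample_size_for_mean(z=1.96, sigma=25 / 4, error=2)

print(n_example, n_demo)
```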

Page 106: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Determining Sample Size when Estimating P

• Z formula

$$Z = \frac{\hat{p} - P}{\sqrt{\dfrac{PQ}{n}}}$$

• Error of Estimation (tolerable error)

$$E = \hat{p} - P$$

• Estimated Sample Size

$$n = \frac{Z^2 PQ}{E^2}$$

Page 107: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Solution for Demonstration Problem

$E = 0.03$
98% confidence, so $Z = 2.33$.
Estimated $P = 0.40$, so $Q = 1 - P = 0.60$.

$$n = \frac{Z^2 PQ}{E^2} = \frac{(2.33)^2 (0.40)(0.60)}{(.03)^2} = 1{,}447.7 \text{ or } 1{,}448$$

Page 108: 5_sampling-Dsitributions-Confidence Interval- 15-09-14 [Compatibility Mode]

Example: Determining n when Estimating P with No Prior Information

$E = 0.05$
90% confidence, so $Z = 1.645$.
With no prior estimate of $P$, use $P = 0.50$, so $Q = 1 - P = 0.50$.

$$n = \frac{Z^2 PQ}{E^2} = \frac{(1.645)^2 (0.50)(0.50)}{(.05)^2} = 270.6 \text{ or } 271$$
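The proportion version follows the same pattern. In this sketch the function name `sample_size_for_proportion` is illustrative, and the default of P = 0.50 reflects the no-prior-information case, where PQ is at its maximum and the sample size is therefore the most conservative.

```python
from math import ceil

def sample_size_for_proportion(z, error, p=0.50):
    """n = z^2 * P * Q / E^2 rounded up; p defaults to 0.50 when no prior estimate exists."""
    q = 1 - p
    return ceil(z ** 2 * p * q / error ** 2)

# Demonstration problem: E = 0.03, 98% confidence (Z = 2.33), estimated P = 0.40.
n_prior = sample_size_for_proportion(z=2.33, error=0.03, p=0.40)

# No prior information: E = 0.05, 90% confidence (Z = 1.645), P = 0.50.
n_no_prior = sample_size_for_proportion(z=1.645, error=0.05)

print(n_prior, n_no_prior)
```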