5_sampling-dsitributions-confidence interval- 15-09-14 [compatibility mode]
A. Ramesh
Department of Management Studies
Indian Institute of Technology Roorkee
Sampling Distribution and Confidence Interval
• Context
• Examples
• Random vs Non-Random Samples
• Central Limit Theorem
• Confidence Interval
Reasons for Sampling
• Sampling can save money and time.
• Because the research process is sometimes destructive, the sample can save product.
• If accessing the population is impossible, sampling is the only option.
Reasons for Taking a Census
• Eliminate the possibility that a random sample is not representative of the population.
• The person authorizing the study is uncomfortable with sample information.
Random Versus Nonrandom Sampling
• Random sampling
– Every unit of the population has the same probability of being included in the sample.
– A chance mechanism is used in the selection process.
– Eliminates bias in the selection process
– Also known as probability sampling
• Nonrandom sampling
– Not every unit of the population has the same probability of being included in the sample.
– Open to selection bias
– Not an appropriate data collection method for most statistical methods
– Also known as non-probability sampling
Random Sampling Techniques
• Simple Random Sample
• Stratified Random Sample
– Proportionate
– Disproportionate
• Systematic Random Sample
• Cluster (or Area) Sampling
Simple Random Sample
• Number each frame unit from 1 to N.
• Use a random number table or a random number generator to select n distinct numbers between 1 and N, inclusive.
• Easier to perform for small populations
• Cumbersome for large populations
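The random-number-generator route described above can be sketched in a few lines of Python. This is only an illustration: the function name `simple_random_sample` and the seed value are assumptions, and the frame is just the integers 1 to N.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units from the frame; every unit is equally likely."""
    rng = random.Random(seed)      # seed is optional; fixing it makes the draw repeatable
    return rng.sample(frame, n)    # sampling without replacement

frame = list(range(1, 21))         # frame units numbered 1..N, with N = 20
sample = simple_random_sample(frame, n=4, seed=42)
print(sorted(sample))
```

`random.sample` never returns the same unit twice, which matches the requirement of n distinct numbers between 1 and N.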
Simple Random Sample: Numbered Population Frame
01 Andhra Pradesh      11 Madhya Pradesh
02 Himachal Pradesh    12 Uttar Pradesh
03 Gujarat             13 Bihar
04 Maharashtra         14 Rajasthan
05 Nagaland            15 J & K
06 Goa                 16 Tamil Nadu
07 West Bengal         17 Karnataka
08 Haryana             18 Kerala
09 Punjab              19 Orissa
10 Delhi               20 Manipur
Simple Random Sampling: Random Number Table
9 9 4 3 7 8 7 9 6 1 4 5 7 3 7 3 7 5 5 2 9 7 9 6 9 3 9 0 9 4 3 4 4 7 5 3 1 6 1 8
5 0 6 5 6 0 0 1 2 7 6 8 3 6 7 6 6 8 8 2 0 8 1 5 6 8 0 0 1 6 7 8 2 2 4 5 8 3 2 6
8 0 8 8 0 6 3 1 7 1 4 2 8 7 7 6 6 8 3 5 6 0 5 1 5 7 0 2 9 6 5 0 0 2 6 4 5 5 8 7
8 6 4 2 0 4 0 8 5 3 5 3 7 9 8 8 9 4 5 4 6 8 1 3 0 9 1 2 5 3 8 8 1 0 4 7 4 3 1 9
6 0 0 9 7 8 6 4 3 6 0 1 8 6 9 4 7 7 5 8 8 9 5 3 5 9 9 4 0 0 4 8 2 6 8 3 0 6 0 6
5 2 5 8 7 7 1 9 6 5 8 5 4 5 3 4 6 8 3 4 0 0 9 9 1 9 9 7 2 9 7 6 9 4 8 1 5 9 4 1
8 9 1 5 5 9 0 5 5 3 9 0 6 8 9 4 8 6 3 7 0 7 9 5 5 4 7 0 6 2 7 1 1 8 2 6 4 4 9 3
Simple Random Sample: Sample Members
• N = 20
• n = 4
01 Andhra Pradesh      11 Madhya Pradesh
02 Himachal Pradesh    12 Uttar Pradesh
03 Gujarat             13 Bihar
04 Maharashtra         14 Rajasthan
05 Nagaland            15 J & K
06 Goa                 16 Tamil Nadu
07 West Bengal         17 Karnataka
08 Haryana             18 Kerala
09 Punjab              19 Orissa
10 Delhi               20 Manipur
Stratified Random Sample
• Population is divided into non-overlapping subpopulations called strata
• A random sample is selected from each stratum
• Potential for reducing sampling error
• Proportionate -- the percentage of the sample taken from each stratum is proportionate to the percentage that each stratum is within the population
• Disproportionate -- proportions of the strata within the sample are different from the proportions of the strata within the population
Stratified Random Sample: Population of FM Radio Listeners
[Figure: population stratified by age into three strata (20 - 30, 30 - 40, and 40 - 50 years old). Each stratum is homogeneous (alike) within; the strata are heterogeneous (different) between.]
Systematic Sampling
• Convenient and relatively easy to administer
• Population elements are an ordered sequence (at least, conceptually).
• The first sample element is selected randomly from the first k population elements.
• Thereafter, sample elements are selected at a constant interval, k, from the ordered sequence frame.
$$k = \frac{N}{n}$$
where:
n = sample size
N = population size
k = size of selection interval
Systematic Sampling: Example
• Purchase orders for the previous fiscal year are serialized 1 to 10,000 (N = 10,000).
• A sample of fifty (n = 50) purchase orders is needed for an audit.
• k = 10,000/50 = 200
• First sample element randomly selected from the first 200 purchase orders. Assume the 45th purchase order was selected.
• Subsequent sample elements: 245, 445, 645, . . .
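The purchase-order example can be reproduced with a short Python sketch. The function name `systematic_sample` and its parameters are illustrative assumptions; the start value 45 is the one assumed on the slide, and it is drawn at random from 1..k only when none is supplied.

```python
import random

def systematic_sample(N, n, start=None, seed=None):
    """Select every k-th element of an ordered frame 1..N, with k = N / n."""
    k = N // n                                   # size of the selection interval
    if start is None:
        start = random.Random(seed).randint(1, k)  # random start within the first k elements
    return [start + i * k for i in range(n)]

orders = systematic_sample(N=10_000, n=50, start=45)
print(orders[:4], "...", orders[-1])
```

With a start of 45 the list begins 45, 245, 445, 645 and contains exactly 50 purchase-order numbers.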
Cluster Sampling
• Population is divided into non-overlapping clusters or areas
• Each cluster is a miniature of the population.
• A subset of the clusters is selected randomly for the sample.
• If the number of elements in the subset of clusters is larger than the desired value of n, these clusters may be subdivided to form a new set of clusters and subjected to a random selection process.
Cluster Sampling
• Advantages
– More convenient for geographically dispersed populations
– Reduced travel costs to contact sample elements
– Simplified administration of the survey
– Unavailability of sampling frame prohibits using other random sampling methods
• Disadvantages
– Statistically less efficient when the cluster elements are similar
– Costs and problems of statistical analysis are greater than for simple random sampling
Nonrandom Sampling
• Convenience Sampling: Sample elements are selected for the convenience of the researcher
• Judgment Sampling: Sample elements are selected by the judgment of the researcher
• Quota Sampling: Sample elements are selected until the quota controls are satisfied
• Snowball Sampling: Survey subjects are selected based on referral from other survey respondents
Errors
• Data from nonrandom samples are not appropriate for analysis by inferential statistical methods.
• Sampling error occurs when the sample is not representative of the population.
• Non-sampling errors
– Missing data, recording, data entry, and analysis errors
– Poorly conceived concepts, unclear definitions, and defective questionnaires
– Response errors occur when people do not know, will not say, or overstate in their answers
Sampling Distribution of $\bar{x}$
Proper analysis and interpretation of a sample statistic requires knowledge of its distribution.
[Figure: Process of Inferential Statistics. Population (parameter $\mu$) → select a random sample → Sample (statistic $\bar{x}$) → calculate $\bar{x}$ to estimate $\mu$.]
Distribution of a Small Finite Population
N = 8
54, 55, 59, 63, 64, 68, 69, 70
[Figure: Population Histogram. Frequency (0 to 3) by class midpoint: 52.5, 57.5, 62.5, 67.5, 72.5.]
Sample Space for n = 2 with Replacement
     Sample   Mean       Sample   Mean       Sample   Mean       Sample   Mean
 1  (54,54)   54.0   17  (59,54)  56.5   33  (64,54)  59.0   49  (69,54)  61.5
 2  (54,55)   54.5   18  (59,55)  57.0   34  (64,55)  59.5   50  (69,55)  62.0
 3  (54,59)   56.5   19  (59,59)  59.0   35  (64,59)  61.5   51  (69,59)  64.0
 4  (54,63)   58.5   20  (59,63)  61.0   36  (64,63)  63.5   52  (69,63)  66.0
 5  (54,64)   59.0   21  (59,64)  61.5   37  (64,64)  64.0   53  (69,64)  66.5
 6  (54,68)   61.0   22  (59,68)  63.5   38  (64,68)  66.0   54  (69,68)  68.5
 7  (54,69)   61.5   23  (59,69)  64.0   39  (64,69)  66.5   55  (69,69)  69.0
 8  (54,70)   62.0   24  (59,70)  64.5   40  (64,70)  67.0   56  (69,70)  69.5
 9  (55,54)   54.5   25  (63,54)  58.5   41  (68,54)  61.0   57  (70,54)  62.0
10  (55,55)   55.0   26  (63,55)  59.0   42  (68,55)  61.5   58  (70,55)  62.5
11  (55,59)   57.0   27  (63,59)  61.0   43  (68,59)  63.5   59  (70,59)  64.5
12  (55,63)   59.0   28  (63,63)  63.0   44  (68,63)  65.5   60  (70,63)  66.5
13  (55,64)   59.5   29  (63,64)  63.5   45  (68,64)  66.0   61  (70,64)  67.0
14  (55,68)   61.5   30  (63,68)  65.5   46  (68,68)  68.0   62  (70,68)  69.0
15  (55,69)   62.0   31  (63,69)  66.0   47  (68,69)  68.5   63  (70,69)  69.5
16  (55,70)   62.5   32  (63,70)  66.5   48  (68,70)  69.0   64  (70,70)  70.0
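The sample space above is small enough to enumerate in code. The sketch below assumes the eight population values that appear in the sample pairs (54, 55, 59, 63, 64, 68, 69, 70) and checks two facts about the sampling distribution of the mean: its mean equals the population mean, and its standard deviation equals the population standard deviation divided by the square root of n. Variable names are illustrative.

```python
from itertools import product
from math import isclose, sqrt
from statistics import mean, pstdev

population = [54, 55, 59, 63, 64, 68, 69, 70]   # values seen in the sample pairs
n = 2

# All ordered samples of size n drawn with replacement: 8 ** 2 = 64 of them.
samples = list(product(population, repeat=n))
sample_means = [sum(s) / n for s in samples]

mu = mean(population)          # population mean
sigma = pstdev(population)     # population standard deviation (divides by N)

print(len(samples))
print(mu, mean(sample_means))
print(sigma / sqrt(n), pstdev(sample_means))
```

Because every ordered pair is equally likely, the two printed pairs of numbers agree exactly (up to floating-point rounding), not just approximately.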
Distribution of the Sample Means
[Figure: Sampling Distribution Histogram. Frequency (0 to 20) against the sample mean, class midpoints 53.75 to 71.25.]
1,800 Randomly Selected Values from an Exponential Distribution
[Figure: histogram, Frequency (0 to 450) against X (0 to 10).]
Means of 60 Samples (n = 2) from an Exponential Distribution
[Figure: histogram, Frequency (0 to 9) against $\bar{x}$ (0.00 to 4.00).]
Means of 60 Samples (n = 5) from an Exponential Distribution
[Figure: histogram, Frequency (0 to 10) against $\bar{x}$ (0.00 to 4.00).]
Means of 60 Samples (n = 30) from an Exponential Distribution
[Figure: histogram, Frequency (0 to 16) against $\bar{x}$ (0.00 to 3.00).]
1,800 Randomly Selected Values from a Uniform Distribution
[Figure: histogram, Frequency (0 to 250) against values from 0.0 to 5.0.]
Means of 60 Samples (n = 2) from a Uniform Distribution
[Figure: histogram, Frequency (0 to 10) against $\bar{x}$ (1.00 to 4.25).]
Means of 60 Samples (n = 5) from a Uniform Distribution
[Figure: histogram, Frequency (0 to 12) against $\bar{x}$ (1.00 to 4.25).]
Means of 60 Samples (n = 30) from a Uniform Distribution
[Figure: histogram, Frequency (0 to 25) against $\bar{x}$ (1.00 to 4.25).]
Central Limit Theorem
• For sufficiently large sample sizes ($n \ge 30$),
• the distribution of sample means $\bar{x}$ is approximately normal;
• the mean of this distribution is equal to $\mu$, the population mean; and
• its standard deviation is $\sigma / \sqrt{n}$,
• regardless of the shape of the population distribution.
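A small simulation can make the theorem concrete. The sketch below is an illustration under stated assumptions, not part of the slides: it draws from an exponential population with mean 1 and standard deviation 1, uses 2,000 samples per sample size, and fixes the seed so the run is repeatable. The function name `simulate_sample_means` is hypothetical.

```python
import random
from math import sqrt
from statistics import mean, stdev

def simulate_sample_means(n, num_samples, rng):
    """Means of num_samples samples of size n from an exponential population (mean 1, sd 1)."""
    return [mean([rng.expovariate(1.0) for _ in range(n)])
            for _ in range(num_samples)]

rng = random.Random(0)          # fixed seed (an arbitrary choice) for repeatability
results = {}
for n in (2, 5, 30):
    means = simulate_sample_means(n, 2000, rng)
    results[n] = (mean(means), stdev(means))
    # Compare the simulated sd of the sample means with sigma / sqrt(n).
    print(n, round(results[n][0], 3), round(results[n][1], 3), round(1 / sqrt(n), 3))
```

The mean of the sample means should stay close to 1 for every n, while their standard deviation should shrink roughly like 1/sqrt(n). Plotting a histogram of `means` for each n would show the skew of the exponential population fading as n grows.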
Marquis de Laplace
Central Limit Theorem
• The central limit theorem is one of the most remarkable results of the theory of probability.
• In its simplest form, the theorem states that the sum of a large number of independent observations from the same distribution has, under certain general conditions, an approximate normal distribution.
• Moreover, the approximation steadily improves as the number of observations increases. The theorem is considered the heart of probability theory, although a better name would be normal convergence theorem.
Distributions of Samples..
• Sampling distributions drawn from a uniformly distributed population start to look like normal distributions even with a sample size as small as 2.
• If the sample size is large enough they form nearly perfect normal distributions
Population: Uniform Distribution
Fig. 1) Histogram of Population - Uniform Distribution: population = 10,000; mean = 5.013; std dev 2.897
Distribution of Samples: n=2
Fig. 2) Sampling Distribution n = 2: number of samples = 2010; mean = 4.995; std dev 2.011
Distribution of Samples: n=10
Fig. 3) Sampling Distribution n = 10: number of samples = 2010; mean = 5.018;std dev 0.906
Distribution of Samples: n=50
Fig. 4) Sampling Distribution n = 50: number of samples = 2010; mean = 4.999; std dev 0.411
Useful web site for Demo
• http://www.statisticalengineering.com/central_limit_theorem.htm
Distribution of Sample Means for Various Sample Sizes
[Figure: four rows of panels, one row per population shape (Exponential, Uniform, Normal, U-shaped). Each row shows the population and the distribution of sample means for n = 2, n = 5, and n = 30.]
Sampling from a Normal Population
• The distribution of sample means is normal for any sample size.
If $\bar{x}$ is the mean of a random sample of size n from a normal population with mean $\mu$ and standard deviation $\sigma$, the distribution of $\bar{x}$ is a normal distribution with mean $\mu_{\bar{x}} = \mu$ and standard deviation $\sigma_{\bar{x}} = \sigma / \sqrt{n}$.
Z Formula for Sample Means
$$Z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$
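Example
Population parameters: $\mu = 85$, $\sigma = 9$. Sample size: $n = 40$.
$$P(\bar{X} > 87) = P\left(Z > \frac{87 - \mu}{\sigma / \sqrt{n}}\right) = P\left(Z > \frac{87 - 85}{9 / \sqrt{40}}\right)$$
$$= P(Z > 1.41) = .5 - P(0 \le Z \le 1.41) = .5 - .4207 = .0793$$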
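The same probability can be checked numerically with Python's standard library. This is a sketch; the function name `prob_sample_mean_exceeds` is an illustrative assumption. Because the code does not round Z to two decimals before looking up the area, its answer differs slightly from the table-based .0793.

```python
from math import sqrt
from statistics import NormalDist

def prob_sample_mean_exceeds(threshold, mu, sigma, n):
    """P(sample mean > threshold), treating the sample mean as normal."""
    standard_error = sigma / sqrt(n)            # sigma / sqrt(n)
    z = (threshold - mu) / standard_error       # Z formula for sample means
    return z, 1.0 - NormalDist().cdf(z)         # upper-tail area beyond z

z, p = prob_sample_mean_exceeds(87, mu=85, sigma=9, n=40)
print(f"z = {z:.3f}, P(X-bar > 87) = {p:.4f}")
```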
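Graphic Solution to Example
$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{87 - 85}{9 / \sqrt{40}} = \frac{2}{1.42} = 1.41$$
[Figure: two normal curves with equal shaded upper-tail areas of .0793. One is the distribution of $\bar{X}$, centered at 85 with $\sigma_{\bar{X}} = 9 / \sqrt{40} = 1.42$ and shaded beyond $\bar{X} = 87$. The other is the standard normal, centered at 0 with $\sigma_Z = 1$ and shaded beyond Z = 1.41. In each curve the area between the center and the cutoff is .4207, out of a half-curve area of .5000.]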
Sampling from a Finite Population without Replacement
• In this case, the standard deviation of the distribution of sample means is smaller than when sampling from an infinite population (or from a finite population with replacement).
• The correct value of this standard deviation is computed by applying a finite correction factor to the standard deviation for sampling from an infinite population.
Sampling from a Finite Population
• Finite Correction Factor
$$\sqrt{\frac{N - n}{N - 1}}$$
• Modified Z Formula
$$Z = \frac{\bar{X} - \mu}{\dfrac{\sigma}{\sqrt{n}} \sqrt{\dfrac{N - n}{N - 1}}}$$
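Central Limit Theorem for Proportion
• Mean of the sampling distribution of the proportion
$$\mu_{\hat{p}} = P$$
• Standard error of the proportion
$$\sigma_{\hat{p}} = \sqrt{\frac{PQ}{n}}$$
• Z formula
$$Z = \frac{\hat{p} - P}{\sigma_{\hat{p}}}$$
where P is the population proportion, Q = 1 - P, and $\hat{p}$ is the sample proportion.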
Sampling Distribution of $\hat{p}$
• Sample Proportion
$$\hat{p} = \frac{X}{n}$$
where:
X = number of items in a sample that possess the characteristic
n = number of items in the sample
• Sampling Distribution
– Approximately normal if nP > 5 and nQ > 5 (P is the population proportion and Q = 1 - P.)
– The mean of the distribution is P.
– The standard deviation of the distribution is
$$\sigma_{\hat{p}} = \sqrt{\frac{PQ}{n}}$$
Estimation
Statistical Inference…
Statistical inference is the process by which we acquire information and draw conclusions about populations from samples.
In order to do inference, we require the skills and knowledge of descriptive statistics, probability distributions, and sampling distributions.
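[Figure: Data → Statistics → Information. A sample (statistic) is drawn from the population (parameter), and inference runs from the sample statistic back to the population parameter.]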
Estimation
There are two types of inference: estimation and hypothesis testing; estimation is introduced first.
The objective of estimation is to determine the approximate value of a population parameter on the basis of a sample statistic.
E.g., the sample mean ($\bar{x}$) is employed to estimate the population mean ($\mu$).
There are two types of estimators:
Point Estimator
Interval Estimator
Point Estimator…
A point estimator draws inferences about a population by estimating the value of an unknown parameter using a single value or point.
• Point probabilities in continuous distributions are virtually zero, so a point estimate will almost never equal the parameter exactly.
• A point estimator gets closer to the parameter value with an increased sample size, but a point estimate by itself doesn’t reflect the effect of a larger sample size.
• Hence we will employ the interval estimator to estimate population parameters…
Interval Estimator…
An interval estimator draws inferences about a population by estimating the value of an unknown parameter using an interval.
That is, we say (with some ___% certainty) that the population parameter of interest is between some lower and upper bounds.
Point & Interval Estimation…
For example, suppose we want to estimate the mean summer income of a class of business students. For n = 25 students, $\bar{X}$ is calculated to be 400/week. This is the point estimate.
An alternative statement is: the mean income is between 380 and 420/week. This is the interval estimate.
Estimator
An estimator of a population parameter is a sample statistic used to estimate the parameter. The most commonly used estimator of the:
Population Parameter                      Sample Statistic
Mean ($\mu$)                      is the  Mean ($\bar{X}$)
Variance ($\sigma^2$)             is the  Variance ($s^2$)
Standard Deviation ($\sigma$)     is the  Standard Deviation ($s$)
Proportion ($p$)                  is the  Proportion ($\hat{p}$)
Estimating μ when σ is known
We can calculate an interval estimator from a sampling distribution, by:
• Drawing a sample of size n from the population
• Calculating its mean, $\bar{X}$
And, by the central limit theorem, we know that $\bar{X}$ is normally (or approximately normally) distributed, so
$$Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
will have a standard normal (or approximately standard normal) distribution.
Looking at this in more detail…
• Z: known, i.e. standard normal distribution
• $\bar{x}$: known, i.e. sample mean
• $\mu$: unknown, i.e. we want to estimate the population mean
• n: known, i.e. the number of items sampled
• $\sigma$: known, i.e. it is assumed we know the population standard deviation…
Estimating μ when σ is known
$$P\left(\bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$$
Thus, the probability that the interval contains the population mean μ is 1 – α. This is a confidence interval estimator of μ.
The sample mean is in the center of the confidence interval…
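Estimating μ when σ is known
Confidence Interval Estimator for μ:
$$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$
Usually represented with a “plus/minus” ( ± ) sign. Written out as an interval:
$$\left(\bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}\right)$$
• lower confidence limit (LCL): $\bar{x} - z_{\alpha/2}\, \sigma / \sqrt{n}$
• upper confidence limit (UCL): $\bar{x} + z_{\alpha/2}\, \sigma / \sqrt{n}$
The probability 1 – α is called the confidence level.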
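Graphically…
[Figure: the confidence interval for μ, drawn as a segment centered at $\bar{x}$ that runs from $\bar{x} - z_{\alpha/2}\, \sigma / \sqrt{n}$ to $\bar{x} + z_{\alpha/2}\, \sigma / \sqrt{n}$, with width $2 z_{\alpha/2}\, \sigma / \sqrt{n}$.]
Graphically…
[Figure: the same interval with three marked positions: the actual location of the population mean μ may be here, or here, or possibly even here.]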
Four commonly used confidence levels…
Confidence level (1 – α)    α       α/2      $z_{\alpha/2}$
0.90                        0.10    0.05     1.645
0.95                        0.05    0.025    1.96
0.98                        0.02    0.01     2.33
0.99                        0.01    0.005    2.575
Interval Width…
The width of the confidence interval estimate is a function of the confidence level, the population standard deviation, and the sample size…
$$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$
• A larger confidence level produces a wider confidence interval.
• Larger values of σ produce wider confidence intervals.
• Increasing the sample size decreases the width of the confidence interval while the confidence level can remain unchanged. Note: this also increases the cost of obtaining additional data.
Selecting the Sample Size…
We can control the width of the interval by determining the sample size necessary to produce narrow intervals.
Suppose we want to estimate the mean demand “to within 5 units”; i.e. we want the interval estimate to be $\bar{x} \pm 5$. Since the interval estimator is
$$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$
it follows that
$$z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = 5$$
Solve for n to get the requisite sample size!
Estimation with small samples: using the t distribution
• If:
– The sample size is small (<25 or so), and
– The true variance $\sigma^2$ is unknown
• Then the t distribution should be used instead of the standard Normal.
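Distribution of Sample Means for 100(1 – α)% Confidence
[Figure: normal curve of $\bar{X}$ centered at μ, with the corresponding Z axis centered at 0 and cutoffs at $-Z_{\alpha/2}$ and $Z_{\alpha/2}$. Each tail beyond a cutoff has area α/2. The central area is 1 – α, split into two halves of area $.5 - \alpha/2 = (1 - \alpha)/2$ on either side of the center.]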
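Probability Interpretation of the Level of Confidence
$$\text{Prob}\left[\bar{X} - Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}\right] = 1 - \alpha$$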
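Distribution of Sample Means for 95% Confidence
[Figure: normal curve of $\bar{X}$ with Z cutoffs at -1.96 and 1.96. The central 95% is split into areas of .4750 on each side of Z = 0, with .025 in each tail.]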
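95% Confidence Interval for μ
$$\bar{X} - Z \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z \frac{\sigma}{\sqrt{n}}$$
$$153 - 1.96 \frac{46}{\sqrt{85}} \le \mu \le 153 + 1.96 \frac{46}{\sqrt{85}}$$
$$153 - 9.78 \le \mu \le 153 + 9.78$$
$$143.22 \le \mu \le 162.78$$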
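As a hedged sketch, the interval above (sample mean 153, σ = 46, n = 85) can be recomputed in Python. The function name `z_confidence_interval` is an illustrative assumption; the critical value comes from `statistics.NormalDist` rather than from a printed table.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(xbar, sigma, n, confidence):
    """Confidence interval for the mean when the population sigma is known."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)   # z_{alpha/2}, about 1.96 for 95%
    margin = z * sigma / sqrt(n)                  # half-width of the interval
    return xbar - margin, xbar + margin

lcl, ucl = z_confidence_interval(xbar=153, sigma=46, n=85, confidence=0.95)
print(f"{lcl:.2f} <= mu <= {ucl:.2f}")
```

Passing `confidence=0.90` or `confidence=0.99` shows directly how the confidence level changes the interval width.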
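95% Confidence Intervals for μ
[Figure: sampling distribution of $\bar{X}$ with the central 95% region marked, and intervals constructed around several different sample means $\bar{X}$.]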
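Example
$\bar{X} = 10.455$, $\sigma = 7.7$, and $n = 44$. 90% confidence: $Z = 1.645$.
$$\bar{X} - Z \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z \frac{\sigma}{\sqrt{n}}$$
$$10.455 - 1.645 \frac{7.7}{\sqrt{44}} \le \mu \le 10.455 + 1.645 \frac{7.7}{\sqrt{44}}$$
$$10.455 - 1.91 \le \mu \le 10.455 + 1.91$$
$$8.545 \le \mu \le 12.365$$
$$\text{Prob}[8.545 \le \mu \le 12.365] = 0.90$$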
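Example
$\bar{X} = 34.3$, $\sigma = 8$, $N = 800$, and $n = 50$. 98% confidence: $Z = 2.33$.
$$\bar{X} - Z \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}} \le \mu \le \bar{X} + Z \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$$
$$34.3 - 2.33 \frac{8}{\sqrt{50}} \sqrt{\frac{800 - 50}{800 - 1}} \le \mu \le 34.3 + 2.33 \frac{8}{\sqrt{50}} \sqrt{\frac{800 - 50}{800 - 1}}$$
$$34.3 - 2.554 \le \mu \le 34.3 + 2.554$$
$$31.75 \le \mu \le 36.85$$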
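The finite-population example can be checked with a few lines of Python. This is a sketch with an assumed function name, `z_interval_finite`; it takes the critical value 2.33 as an input, as the slide does, instead of computing it.

```python
from math import sqrt

def z_interval_finite(xbar, sigma, n, N, z):
    """Z interval for the mean with the finite correction factor applied."""
    fpc = sqrt((N - n) / (N - 1))          # finite correction factor
    margin = z * sigma / sqrt(n) * fpc     # corrected half-width
    return xbar - margin, xbar + margin

lcl, ucl = z_interval_finite(xbar=34.3, sigma=8, n=50, N=800, z=2.33)
print(f"{lcl:.2f} <= mu <= {ucl:.2f}")
```

Since the correction factor is below 1 whenever n > 1, the corrected interval is always narrower than the uncorrected one.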
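Confidence Interval to Estimate μ when n is Large and σ is Unknown
$$\bar{X} \pm Z_{\alpha/2} \frac{S}{\sqrt{n}}$$
or
$$\bar{X} - Z_{\alpha/2} \frac{S}{\sqrt{n}} \le \mu \le \bar{X} + Z_{\alpha/2} \frac{S}{\sqrt{n}}$$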
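Example
$\bar{X} = 85.5$, $S = 19.3$, and $n = 110$. 99% confidence: $Z = 2.575$.
$$\bar{X} - Z \frac{S}{\sqrt{n}} \le \mu \le \bar{X} + Z \frac{S}{\sqrt{n}}$$
$$85.5 - 2.575 \frac{19.3}{\sqrt{110}} \le \mu \le 85.5 + 2.575 \frac{19.3}{\sqrt{110}}$$
$$85.5 - 4.7 \le \mu \le 85.5 + 4.7$$
$$80.8 \le \mu \le 90.2$$
$$\text{Prob}[80.8 \le \mu \le 90.2] = 0.99$$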
Sampling from normal distribution
• SBI of IITR Branch calculates that its individual savings accounts are normally distributed with a mean of 2,000 and a standard deviation of 600. If the bank takes a random sample of 100 accounts, what is the probability that the sample mean will lie between 1,900 and 2,050?
• Answer: 0.7492
Sampling from non-normal distribution
• The distribution of annual earnings of all bank tellers with five years' experience is negatively skewed, with mean 19,000 and standard deviation 2,000.
• If we draw a random sample of 30 tellers, what is the probability that their earnings will average more than 19,750 annually?
• Answer: 0.0202
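Both practice problems can be checked with the sampling distribution of the mean. The sketch below makes one assumption loudly: the teller threshold is taken as 19,750, since that is the value that reproduces the stated answer of 0.0202 with mean 19,000, standard deviation 2,000, and n = 30. The function name `sampling_distribution` is illustrative, and the results differ in the third or fourth decimal from table answers because Z is not rounded.

```python
from math import sqrt
from statistics import NormalDist

def sampling_distribution(mu, sigma, n):
    """Normal model for the sample mean: mean mu, standard error sigma / sqrt(n)."""
    return NormalDist(mu, sigma / sqrt(n))

# Savings accounts: normal population, mean 2,000, sd 600, n = 100.
accounts = sampling_distribution(2000, 600, 100)
p_between = accounts.cdf(2050) - accounts.cdf(1900)

# Bank tellers: skewed population, but n = 30 lets the central limit theorem apply.
# ASSUMPTION: threshold of 19,750 (reproduces the stated answer).
tellers = sampling_distribution(19000, 2000, 30)
p_above = 1.0 - tellers.cdf(19750)

print(round(p_between, 4), round(p_above, 4))
```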
• t-distribution
Estimating the Mean of a Normal Population: Small n and Unknown σ
• The population has a normal distribution.
• The value of the population standard deviation is unknown.
• The sample size is small, n < 30.
• Z distribution is not appropriate for these conditions
• t distribution is appropriate
The t Distribution
• Developed by British statistician, William Gosset
• A family of distributions -- a unique distribution for each value of its parameter, degrees of freedom (d.f.)
• Symmetric, Unimodal, Mean = 0, Flatter than a Z
• t formula
$$t = \frac{\bar{X} - \mu}{S / \sqrt{n}}$$
Degrees of freedom
• Example
• Number of values we can choose freely.
Comparison of Selected t Distributions to the Standard Normal
[Figure: density curves on an axis from -3 to 3 comparing the standard normal with t (d.f. = 25), t (d.f. = 5), and t (d.f. = 1).]
T-table
• The t table is more compact.
• It shows areas and t values for only a few percentages (1, 2, 5, 10, ...).
• The t table does not focus on the chance that the population parameter being estimated will fall within our confidence interval.
• Instead, it gives the chance that the population parameter we are estimating will not be within our confidence interval (that is, that it will lie outside it).
Table of Critical Values of t
df     t0.100   t0.050   t0.025   t0.010   t0.005
1      3.078    6.314    12.706   31.821   63.656
2      1.886    2.920    4.303    6.965    9.925
3      1.638    2.353    3.182    4.541    5.841
4      1.533    2.132    2.776    3.747    4.604
5      1.476    2.015    2.571    3.365    4.032
...
23     1.319    1.714    2.069    2.500    2.807
24     1.318    1.711    2.064    2.492    2.797
25     1.316    1.708    2.060    2.485    2.787
...
29     1.311    1.699    2.045    2.462    2.756
30     1.310    1.697    2.042    2.457    2.750
...
40     1.303    1.684    2.021    2.423    2.704
60     1.296    1.671    2.000    2.390    2.660
...
120    1.289    1.658    1.980    2.358    2.617
∞      1.282    1.645    1.960    2.327    2.576
With df = 24 and α = 0.05, $t_{\alpha}$ = 1.711.
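Confidence Intervals for μ of a Normal Population: Small n and Unknown σ
$$\bar{X} \pm t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}}$$
or
$$\bar{X} - t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}}$$
$$df = n - 1$$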
Example: Solution for Demonstration Problem
$\bar{X} = 2.14$, $S = 1.29$, $n = 14$, $df = n - 1 = 13$
$$\frac{\alpha}{2} = \frac{1 - .99}{2} = 0.005$$
$$t_{.005,\,13} = 3.012$$
$$\bar{X} - t \frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t \frac{S}{\sqrt{n}}$$
$$2.14 - 3.012 \frac{1.29}{\sqrt{14}} \le \mu \le 2.14 + 3.012 \frac{1.29}{\sqrt{14}}$$
$$2.14 - 1.04 \le \mu \le 2.14 + 1.04$$
$$1.10 \le \mu \le 3.18$$
$$\text{Prob}[1.10 \le \mu \le 3.18] = 0.99$$
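A short Python sketch of the t interval follows. Python's standard library has no t quantile function, so the critical value 3.012 is passed in from the t table, exactly as in the worked solution; the function name `t_confidence_interval` is an illustrative assumption.

```python
from math import sqrt

def t_confidence_interval(xbar, s, n, t_crit):
    """Interval for the mean of a normal population, small n, sigma unknown.

    t_crit is the table value t_{alpha/2, n-1}, supplied by the caller.
    """
    margin = t_crit * s / sqrt(n)      # t * S / sqrt(n)
    return xbar - margin, xbar + margin

# 99% confidence, n = 14, so df = 13 and the table gives t = 3.012.
lcl, ucl = t_confidence_interval(xbar=2.14, s=1.29, n=14, t_crit=3.012)
print(f"{lcl:.2f} <= mu <= {ucl:.2f}")
```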
Chi-Square distribution
Population Variance
• Variance is an inverse measure of the group’s homogeneity.
• Variance is an important indicator of total quality in standardized products and services.
• Managers improve processes to reduce variance.
• Variance is a measure of financial risk. Variance of rates of return helps managers assess financial and capital investment alternatives.
• Variability is a reality in global markets. Productivity, wages, and costs of living vary between regions and nations.
Estimating the Population Variance
• Population Parameter: $\sigma^2$
• Estimator of $\sigma^2$
$$S^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$$
• $\chi^2$ formula for Single Variance
$$\chi^2 = \frac{(n - 1) S^2}{\sigma^2}$$
degrees of freedom = n - 1
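Confidence Interval for $\sigma^2$
$$\frac{(n - 1) S^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n - 1) S^2}{\chi^2_{1 - \alpha/2}}$$
$$df = n - 1$$
$$\alpha = 1 - \text{level of confidence}$$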
Selected $\chi^2$ Distributions
[Figure: right-skewed density curves starting at 0 for df = 3, df = 5, and df = 10.]
$\chi^2$ Table
[Figure: $\chi^2$ distribution with df = 5 on an axis from 0 to 20; the upper-tail area of 0.10 lies beyond 9.23635.]
df     0.975         0.950         0.100      0.050      0.025
1      9.82068E-04   3.93219E-03   2.70554    3.84146    5.02390
2      0.0506357     0.102586      4.60518    5.99148    7.37778
3      0.2157949     0.351846      6.25139    7.81472    9.34840
4      0.484419      0.710724      7.77943    9.48773    11.14326
5      0.831209      1.145477      9.23635    11.07048   12.83249
6      1.237342      1.63538       10.6446    12.5916    14.4494
7      1.689864      2.16735       12.0170    14.0671    16.0128
8      2.179725      2.73263       13.3616    15.5073    17.5345
9      2.700389      3.32512       14.6837    16.9190    19.0228
10     3.24696       3.94030       15.9872    18.3070    20.4832
...
20     9.59077       10.8508       28.4120    31.4104    34.1696
21     10.28291      11.5913       29.6151    32.6706    35.4789
22     10.9823       12.3380       30.8133    33.9245    36.7807
23     11.6885       13.0905       32.0069    35.1725    38.0756
24     12.4011       13.8484       33.1962    36.4150    39.3641
25     13.1197       14.6114       34.3816    37.6525    40.6465
...
70     48.7575       51.7393       85.5270    90.5313    95.0231
80     57.1532       60.3915       96.5782    101.8795   106.6285
90     65.6466       69.1260       107.5650   113.1452   118.1359
100    74.2219       77.9294       118.4980   124.3421   129.5613
With df = 5 and α = 0.10, $\chi^2$ = 9.23635
Two Table Values of $\chi^2$
[Figure: $\chi^2$ distribution with df = 7 on an axis from 0 to 20. The upper-tail area beyond 14.0671 is .05, and the area to the right of 2.16735 is .95, leaving .05 in the lower tail.]
df     0.950         0.050
1      3.93219E-03   3.84146
2      0.102586      5.99148
3      0.351846      7.81472
4      0.710724      9.48773
5      1.145477      11.07048
6      1.63538       12.5916
7      2.16735       14.0671
8      2.73263       15.5073
9      3.32512       16.9190
10     3.94030       18.3070
...
20     10.8508       31.4104
21     11.5913       32.6706
22     12.3380       33.9245
23     13.0905       35.1725
24     13.8484       36.4150
25     14.6114       37.6525
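90% Confidence Interval for $\sigma^2$
$S^2 = .0022125$, $n = 8$, $df = n - 1 = 7$, $\alpha = .10$
$$\chi^2_{\alpha/2} = \chi^2_{.10/2} = \chi^2_{.05} = 14.0671$$
$$\chi^2_{1 - \alpha/2} = \chi^2_{1 - .10/2} = \chi^2_{.95} = 2.16735$$
$$\frac{(n - 1) S^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n - 1) S^2}{\chi^2_{1 - \alpha/2}}$$
$$\frac{(8 - 1)(.0022125)}{14.0671} \le \sigma^2 \le \frac{(8 - 1)(.0022125)}{2.16735}$$
$$0.001101 \le \sigma^2 \le 0.007146$$
$$\text{Prob}[0.001101 \le \sigma^2 \le 0.007146] = 0.90$$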
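Solution for Demonstration Problem
$S^2 = 1.2544$, $n = 25$, $df = n - 1 = 24$, $\alpha = .05$
$$\chi^2_{\alpha/2} = \chi^2_{.05/2} = \chi^2_{.025} = 39.3641$$
$$\chi^2_{1 - \alpha/2} = \chi^2_{1 - .05/2} = \chi^2_{.975} = 12.4011$$
$$\frac{(n - 1) S^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n - 1) S^2}{\chi^2_{1 - \alpha/2}}$$
$$\frac{(25 - 1)(1.2544)}{39.3641} \le \sigma^2 \le \frac{(25 - 1)(1.2544)}{12.4011}$$
$$0.7648 \le \sigma^2 \le 2.4277$$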
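Both variance intervals in this part of the deck can be recomputed with a small Python sketch. The standard library has no chi-square quantile function, so the two table values are passed in by the caller; the function and parameter names are illustrative assumptions.

```python
def variance_confidence_interval(s2, n, chi2_upper, chi2_lower):
    """Interval for the population variance.

    chi2_upper is the table value for alpha/2 (the larger one);
    chi2_lower is the table value for 1 - alpha/2 (the smaller one).
    """
    df = n - 1
    return df * s2 / chi2_upper, df * s2 / chi2_lower

# 95% interval: S^2 = 1.2544, n = 25, df = 24.
lo95, hi95 = variance_confidence_interval(1.2544, 25, chi2_upper=39.3641, chi2_lower=12.4011)

# 90% interval: S^2 = .0022125, n = 8, df = 7.
lo90, hi90 = variance_confidence_interval(0.0022125, 8, chi2_upper=14.0671, chi2_lower=2.16735)

print(round(lo95, 4), round(hi95, 4))
print(round(lo90, 6), round(hi90, 6))
```

The larger table value goes in the denominator of the lower limit, which is why the interval is not symmetric around $S^2$.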
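Determining Sample Size when Estimating μ
• Z formula
$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$
• Error of Estimation (tolerable error)
$$E = \bar{X} - \mu$$
• Estimated Sample Size
$$n = \frac{Z_{\alpha/2}^2\, \sigma^2}{E^2} = \left(\frac{Z_{\alpha/2}\, \sigma}{E}\right)^2$$
• Estimated σ
$$\sigma \approx \frac{1}{4}\, \text{range}$$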
Sample Size When Estimating μ: Example
$E = 1$, $\sigma = 4$. 90% confidence: $Z = 1.645$.
$$n = \frac{Z_{\alpha/2}^2\, \sigma^2}{E^2} = \frac{(1.645)^2 (4)^2}{1^2} = 43.30 \text{ or } 44$$
Solution for Demonstration Problem
$E = 2$, range = 25. 95% confidence: $Z = 1.96$.
$$\text{estimated } \sigma: \frac{1}{4}\, \text{range} = \frac{1}{4}(25) = 6.25$$
$$n = \frac{Z^2 \sigma^2}{E^2} = \frac{(1.96)^2 (6.25)^2}{2^2} = 37.52 \text{ or } 38$$
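The sample-size formula for a mean is easy to wrap in a helper. The sketch below uses an assumed function name, `sample_size_for_mean`, and always rounds up, since a fractional observation cannot be sampled and rounding down would miss the error target.

```python
from math import ceil

def sample_size_for_mean(z, sigma, error):
    """n = (z * sigma / E)^2, rounded up to the next whole observation."""
    return ceil((z * sigma / error) ** 2)

# E = 1, sigma = 4, 90% confidence (z = 1.645).
n_first = sample_size_for_mean(z=1.645, sigma=4, error=1)

# E = 2, sigma estimated as range / 4 = 25 / 4, 95% confidence (z = 1.96).
n_second = sample_size_for_mean(z=1.96, sigma=25 / 4, error=2)

print(n_first, n_second)
```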
Determining Sample Size when Estimating P
• Z formula
$$Z = \frac{\hat{p} - P}{\sqrt{\dfrac{PQ}{n}}}$$
• Error of Estimation (tolerable error)
$$E = \hat{p} - P$$
• Estimated Sample Size
$$n = \frac{Z^2 PQ}{E^2}$$
Solution for Demonstration Problem
$E = 0.03$. 98% confidence: $Z = 2.33$. Estimated $P = 0.40$, $Q = 1 - P = 0.60$.
$$n = \frac{Z^2 PQ}{E^2} = \frac{(2.33)^2 (0.40)(0.60)}{(.03)^2} = 1{,}447.7 \text{ or } 1{,}448$$
Example: Determining n when Estimating P with No Prior Information
$E = 0.05$. 90% confidence: $Z = 1.645$. With no prior estimate of P, use $P = 0.50$, $Q = 1 - P = 0.50$.
$$n = \frac{Z^2 PQ}{E^2} = \frac{(1.645)^2 (0.50)(0.50)}{(.05)^2} = 270.6 \text{ or } 271$$
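A matching sketch for proportions, again with an assumed function name (`sample_size_for_proportion`). It covers both proportion examples in this part of the deck and shows why P = 0.50 is the cautious default: the product PQ is largest there, so it gives the largest required n.

```python
from math import ceil

def sample_size_for_proportion(z, p, error):
    """n = z^2 * P * Q / E^2, rounded up, with Q = 1 - P."""
    q = 1.0 - p
    return ceil(z ** 2 * p * q / error ** 2)

# E = 0.03, 98% confidence (z = 2.33), estimated P = 0.40.
n_prior = sample_size_for_proportion(z=2.33, p=0.40, error=0.03)

# E = 0.05, 90% confidence (z = 1.645), no prior estimate so P = 0.50.
n_no_prior = sample_size_for_proportion(z=1.645, p=0.50, error=0.05)

print(n_prior, n_no_prior)
```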