lc07 sl estimation.ppt_0
![Page 1: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/1.jpg)
Inferences on Population Proportions
![Page 2: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/2.jpg)
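Use Calculation from Sample to Estimate Population Parameter
[Diagram: from the Population we select a Sample; from the Sample we calculate a Statistic; the Statistic is used to estimate a Parameter; the Parameter describes the Population. In the example, the statistic is $\hat{p} = 63\%$ and the parameter $p$ is unknown.]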
![Page 3: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/3.jpg)
Statistic vs. Parameter
Statistic
• Describes a sample.
• Always known.
• Changes upon repeated sampling.
• Examples: $\bar{x}$, $s$, $s^2$, $\hat{p}$
Parameter
• Describes a population.
• Usually unknown.
• Is fixed.
• Examples: $\mu$, $\sigma$, $\sigma^2$, $p$
![Page 4: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/4.jpg)
A Statistic is a Random Variable
Upon repeated sampling of the same population, the value of a statistic changes (variable).
While we don't know what the next sample will yield, we do know the overall pattern over many, many samplings (random).
The distribution of possible values of a statistic for repeated samples of the same size from a population is called the sampling distribution of the statistic.
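The idea of a sampling distribution can be illustrated by simulation. The sketch below is not part of the original slides: the true proportion 0.63, the sample size 200, the number of repetitions, and the function name `simulate_p_hats` are all assumptions chosen for illustration.

```python
import random
from statistics import mean, pstdev

def simulate_p_hats(p, n, repeats, seed=0):
    """Draw `repeats` samples of size n from a population whose true
    proportion is p, and return the sample proportion from each sample."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    p_hats = []
    for _ in range(repeats):
        successes = sum(1 for _ in range(n) if rng.random() < p)
        p_hats.append(successes / n)
    return p_hats

# Assumed values, for illustration only.
p_hats = simulate_p_hats(p=0.63, n=200, repeats=2000)
print(f"mean of the p-hat values:   {mean(p_hats):.3f}")
print(f"spread of the p-hat values: {pstdev(p_hats):.3f}")
```

Each simulated sample gives a different value of the statistic, but the collection of values settles into a stable pattern centered near the true proportion.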
![Page 5: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/5.jpg)
Proportion
We are interested in the distribution of $\hat{p} = y/n$.
Note, $\hat{p}$ is $cY$ where $c = 1/n$ is a constant and $Y$ is a binomial random variable.
If $Y$ is normally distributed, $cY$ will also be normally distributed.
![Page 6: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/6.jpg)
If Y is normal, cY is normal
For example: if $Y$ is $N(\mu = 100, \sigma^2 = 4)$, then $(0.5)Y$ is $N(\mu = 50, \sigma^2 = 1)$.
[Figure: density curves of $y$ and $cy = 0.5y$.]
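The numbers in this example follow from the standard rules for the mean and variance of a rescaled random variable. As a short worked check with $c = 0.5$:

$$E[cY] = c\,E[Y] = 0.5 \times 100 = 50, \qquad \operatorname{Var}(cY) = c^2\,\operatorname{Var}(Y) = 0.25 \times 4 = 1.$$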
![Page 7: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/7.jpg)
Distribution of Sample Proportions
A normal curve can be used to approximate the distribution of sample proportions if:
• The size of the sample or number of repetitions is relatively large (say 30 or more).
• The sample size is relatively small compared to the population size (say < 10%).
![Page 8: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/8.jpg)
Sampling Distribution for $\hat{p}$
The sampling distribution of $\hat{p}$ based on large n is approximately normal.
![Page 9: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/9.jpg)
Sampling Distribution for $\hat{p}$
To completely define the normal distribution of $\hat{p}$, we need the mean (expected value) and variance.
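Both quantities can be sketched from the binomial model. Assuming $Y$ is binomial with $n$ trials and success probability $p$, so that $E[Y] = np$ and $\operatorname{Var}(Y) = np(1-p)$, and writing $\hat{p} = Y/n$:

$$E[\hat{p}] = \frac{1}{n}E[Y] = p, \qquad \operatorname{Var}(\hat{p}) = \frac{1}{n^2}\operatorname{Var}(Y) = \frac{p(1-p)}{n}.$$

The standard deviation of $\hat{p}$ is therefore $\sqrt{p(1-p)/n}$.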
![Page 10: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/10.jpg)
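Sampling Distribution of $\hat{p}$
[Figure: normal curve centered at $p$, with central area $1-\alpha$ between $p - z_{\alpha/2}\sqrt{p(1-p)/n}$ and $p + z_{\alpha/2}\sqrt{p(1-p)/n}$.]
So, at most, $\hat{p}$ will be $z_{\alpha/2}\sqrt{p(1-p)/n}$ away from $p$, $(1-\alpha)100\%$ of the time. We call this $(1-\alpha)100\%$ the level of confidence.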
![Page 11: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/11.jpg)
Repeated Sampling of Size n
95% of the time our estimate will be within $1.96\sqrt{p(1-p)/n}$ of the truth.
[Figure: sampling distribution of $\hat{p}$ with the central 95% marked.]
![Page 12: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/12.jpg)
Standard Error
We don't know the value of $p$, so we will use $\hat{p}$.
When we use $\hat{p}$, we have an estimate of the standard deviation for the sampling distribution of $\hat{p}$.
We call this estimate the standard error: $\sqrt{\hat{p}(1-\hat{p})/n}$
![Page 13: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/13.jpg)
Confidence Interval for p
95% CI for p: $\hat{p} \pm 1.96\sqrt{\hat{p}(1-\hat{p})/n}$
(1-α)100% CI for p: $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$
Example:
![Page 14: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/14.jpg)
Confidence Interval Based on Normal Distribution
(Pt Est) ± (Z Value)(Standard Error)
$\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$
Standard error is our estimate of the standard deviation for the distribution of the point estimate.
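The recipe "point estimate ± z value × standard error" can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `proportion_ci` and the data (63 successes in a sample of 100) are assumptions, not values from the slides.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(x, n, confidence=0.95):
    """Normal-approximation interval: p_hat ± z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = x / n                                    # point estimate
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)          # z value for the chosen confidence
    standard_error = sqrt(p_hat * (1 - p_hat) / n)   # estimated SD of p_hat
    margin = z * standard_error
    return p_hat - margin, p_hat + margin

# Hypothetical data: 63 successes out of 100 (the sample size is assumed).
low, high = proportion_ci(63, 100)
print(f"95% CI for p: ({low:.3f}, {high:.3f})")
```

Passing a different `confidence` changes only the z value; the point estimate and standard error stay the same.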
![Page 15: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/15.jpg)
Sample Size
Maximum Error: $e = z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$
(solving for n)
$n = \dfrac{z_{\alpha/2}^2\,\hat{p}(1-\hat{p})}{e^2}$
![Page 16: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/16.jpg)
Sample Size
If you can't guess a value for $\hat{p}$:
$n = \dfrac{z_{\alpha/2}^2\,(.5)(.5)}{e^2} = \dfrac{z_{\alpha/2}^2}{4e^2}$
What sample size would you recommend to estimate the proportion of exceedances with a 95% confidence interval and maximum error of 2%?
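The question on this slide can be worked through numerically. The sketch below assumes the formula $n = z_{\alpha/2}^2\,p(1-p)/e^2$ with the fallback guess of 0.5 for the proportion; the function name `sample_size_for_proportion` is hypothetical.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(max_error, confidence=0.95, p_guess=0.5):
    """Smallest n whose margin of error is at most max_error.
    p_guess = 0.5 is the fallback when no better guess is available."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    n_exact = z ** 2 * p_guess * (1 - p_guess) / max_error ** 2
    return ceil(n_exact)  # always round up to a whole observation

n = sample_size_for_proportion(max_error=0.02, confidence=0.95)
print(n)
```

With a 95% confidence level and a maximum error of 2%, this gives n = 2401.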
![Page 17: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/17.jpg)
Research Update!
Recent research shows that we get better coverage for (1-α)100% CIs based on p-hat if we alter the CI formula.
95% CI: $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/(n+4)}$, where $\hat{p} = \dfrac{x+2}{n+4}$ (Agresti/Coull)
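The adjusted interval amounts to adding two successes and two failures before applying the usual formula. The sketch below follows the formula as given on the slide; the function name `adjusted_proportion_ci` and the small data set (3 successes in 10 trials) are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist

def adjusted_proportion_ci(x, n, confidence=0.95):
    """Interval using the adjusted estimate (x + 2) / (n + 4)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p_adj = (x + 2) / (n + 4)                        # adjusted point estimate
    margin = z * sqrt(p_adj * (1 - p_adj) / (n + 4))
    return p_adj - margin, p_adj + margin

# Hypothetical small sample: 3 successes out of 10.
low, high = adjusted_proportion_ci(3, 10)
print(f"adjusted 95% CI: ({low:.3f}, {high:.3f})")
```

The adjustment matters most for small samples or proportions near 0 or 1, where the unadjusted interval can have a lower limit very close to zero.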
![Page 18: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/18.jpg)
Estimation Procedures: An Example
![Page 19: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/19.jpg)
Basic Logic
In estimation procedures, statistics calculated from random samples are used to estimate the value of population parameters.
Example: If we know 42% of a random sample
drawn from a city are Republicans, we can estimate the percentage of all city residents who are Republicans.
![Page 20: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/20.jpg)
Basic Logic
Information from samples is used to estimate information about the population.
Statistics are used to estimate parameters.
[Diagram labels: POPULATION, SAMPLE, PARAMETER, STATISTIC]
![Page 21: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/21.jpg)
Basic Logic
The sampling distribution is the link between sample and population.
The values of the parameters are unknown, but the characteristics of the sampling distribution (S.D.) are defined by theorems.
[Diagram labels: POPULATION, SAMPLING DISTRIBUTION, SAMPLE]
![Page 22: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/22.jpg)
Two Estimation Procedures
A point estimate is a sample statistic used to estimate a population value.
• A newspaper story reports that 74% of a sample of randomly selected Americans support capital punishment.
Confidence intervals consist of a range of values.
• "Between 71% and 77% of Americans approve of capital punishment."
![Page 23: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/23.jpg)
Constructing Confidence Intervals For Means
• Set the alpha (probability that the interval will be wrong). Setting alpha equal to 0.05, a 95% confidence level, means the researcher is willing to be wrong 5% of the time.
• Find the Z score associated with alpha. If alpha is equal to 0.05, we would place half (0.025) of this probability in the lower tail and half in the upper tail of the distribution.
• Substitute values into equation.
![Page 24: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/24.jpg)
Confidence Intervals For Means
For a random sample of 178 households, average TV viewing was 6 hours/day with s = 3. Alpha = .05. N = 178.
c.i. = 6.0 ± 1.96(3/√177)
c.i. = 6.0 ± 1.96(3/13.30)
c.i. = 6.0 ± 1.96(.23)
c.i. = 6.0 ± .44
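The arithmetic in this example can be checked with a short Python sketch. It assumes the same form used on the slide, with the sample standard deviation divided by √(N − 1); the function name `mean_ci` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def mean_ci(x_bar, s, n, alpha=0.05):
    """Interval x_bar ± z * s / sqrt(n - 1), matching the slide's use of N - 1."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 when alpha = 0.05
    margin = z * s / sqrt(n - 1)
    return x_bar - margin, x_bar + margin, margin

low, high, margin = mean_ci(x_bar=6.0, s=3.0, n=178)
print(f"c.i. = 6.0 ± {margin:.2f}  ->  ({low:.2f}, {high:.2f})")
```

Carrying the unrounded intermediate values through the calculation gives a margin that rounds to .44.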
![Page 25: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/25.jpg)
Confidence Intervals For Means
We can estimate that households in this community average 6.0±.44 hours of TV watching each day.
Another way to state the interval: 5.56 ≤ μ ≤ 6.44
We estimate that the population mean is greater than or equal to 5.56 and less than or equal to 6.44.
This interval has a .05 chance of being wrong.
![Page 26: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/26.jpg)
Confidence Intervals For Means
Even if the statistic is as much as ±1.96 standard deviations from the mean of the sampling distribution the confidence interval will still include the value of μ.
Only rarely (5 times out of 100) will the interval not include μ.
![Page 27: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/27.jpg)
Other confidence levels
| Confidence level | Alpha | Alpha/2 | Z score |
| --- | --- | --- | --- |
| 90% | .10 | .05 | +/- 1.65 |
| 95% | .05 | .025 | +/- 1.96 |
| 99% | .01 | .0050 | +/- 2.58 |
| 99.9% | .001 | .0005 | +/- 3.29 |
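The Z scores in this table can be computed from the standard normal distribution rather than looked up. A minimal sketch using Python's standard library; the function name `z_for_confidence` is hypothetical.

```python
from statistics import NormalDist

def z_for_confidence(confidence):
    """Z score that leaves alpha/2 in each tail of the standard normal."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99, 0.999):
    alpha = 1 - level
    print(f"{level:.1%}  alpha = {alpha:.3f}  alpha/2 = {alpha / 2:.4f}  "
          f"z = +/- {z_for_confidence(level):.3f}")
```

To three decimals the 90% value is 1.645, which tables commonly show as 1.65 or 1.64 depending on the rounding convention.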
![Page 28: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/28.jpg)
Constructing Confidence Intervals For Proportions
Procedures:
• Set alpha.
• Find the associated Z score.
• Substitute the sample information into the formula.
![Page 29: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/29.jpg)
Confidence Intervals For Proportions
If 42% of a random sample of 764 from a Midwestern city are Republicans, what % of the entire city are Republicans?
Don’t forget to change the % to a proportion.
c.i. = .42 ± 1.96 √(.25/764)
c.i. = .42 ± 1.96 √(.00033)
c.i. = .42 ± 1.96 (.018)
c.i. = .42 ± .04
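The value .25 under the square root appears to be the most conservative choice, (.5)(.5), rather than the sample value (.42)(.58). The sketch below compares the two margins; the variable names are assumptions for illustration.

```python
from math import sqrt

z = 1.96      # Z score for alpha = .05
n = 764       # sample size
p_s = 0.42    # sample proportion

# Conservative margin: uses .25, the largest possible value of P(1 - P).
conservative_margin = z * sqrt(0.25 / n)

# Plug-in margin: uses the sample proportion itself.
plug_in_margin = z * sqrt(p_s * (1 - p_s) / n)

print(f"conservative margin: {conservative_margin:.4f}")
print(f"plug-in margin:      {plug_in_margin:.4f}")
```

Because the sample proportion is close to .5, the two margins differ only slightly here, and the conservative one rounds to .04 as in the worked example.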
![Page 30: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/30.jpg)
Confidence Intervals For Proportions
Changing back to %s, we can estimate that 42% ± 4% of city residents are Republicans.
Another way to state the interval: 38%≤Pu≤ 46%
We estimate the population value is greater than or equal to 38% and less than or equal to 46%.
This interval has a .05 chance of being wrong.
![Page 31: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/31.jpg)
SUMMARY
In this situation, identify the following:
• Population
• Sample
• Statistic
• Parameter
![Page 32: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/32.jpg)
SUMMARY
Population = All residents of the city.
Sample = the 764 people selected for the sample and interviewed.
Statistic = Ps = .42 (or 42%)
Parameter = unknown. The % of all residents of the city who are Republicans.
![Page 33: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/33.jpg)
Estimating Population Parameters: Another Example
![Page 34: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/34.jpg)
Estimating Population Proportion
Example: When surveying 500 people selected at random from the general population we get 200 responses of “yes” when asked if they like broccoli. Estimate the proportion of the population that likes broccoli.
p’ = x/n = 200/500 = 0.40
q’ = 1 - p’ = 0.60
np’ = 200 > 25 and nq’ = 300 > 25, so the normal approximation is OK.
At the 95% confidence level we have $\alpha/2 = 0.025$ and $z_{\alpha/2} = 1.96$.
The margin of error is $E = z_{\alpha/2}\sqrt{(.4)(.6)/500} = 1.96(.022) = 0.043$.
P(p’ - 0.043 < p < p’ + 0.043) = P(.357 < p < .443) = 0.95
![Page 35: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/35.jpg)
Estimating Population Proportion
There are two user-controlled factors that determine the margin of error
$E = z_{\alpha/2}\sqrt{p'q'/n}$ (1)
1. The confidence level $1 - \alpha$
2. The sample size n
The smaller $\alpha$, the greater $z_{\alpha/2}$ and the greater E.
The larger the value of n, the smaller E.
If the experimenter wants to fix the width of the confidence interval (by setting E to a pre-determined constant) and set the confidence level (by selecting a particular $\alpha$), then we can use equation (1) above to determine the sample size needed to achieve this level of precision.
![Page 36: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/36.jpg)
Estimating Population Proportion
$E = z_{\alpha/2}\sqrt{p'q'/n}$ (1)
Set E and $\alpha$ to desired values, so that E and $z_{\alpha/2}$ are constant. Solve equation (1) for n:
$n = z_{\alpha/2}^2\,(p'q')/E^2$ (2)
In equation (2) we have not yet taken a sample from the population, so we cannot be sure what the proportion of successes might be. The value of p’ that we use in this equation may be an estimate that we make based upon some prior knowledge, or we may choose p’ = q’ = 0.5, which maximizes n for a particular choice of $\alpha$ and E.
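The claim that p’ = q’ = 0.5 maximizes n can be checked by evaluating equation (2) over a grid of candidate values. In this sketch the function name `required_n`, the grid, and the default values z = 1.96 and E = 0.025 are illustrative assumptions.

```python
from math import ceil

def required_n(p_prime, z=1.96, E=0.025):
    """Equation (2): n = z^2 * p' * q' / E^2, with q' = 1 - p'."""
    return z ** 2 * p_prime * (1 - p_prime) / E ** 2

# Candidate values of p' from 0.1 to 0.9.
grid = [i / 10 for i in range(1, 10)]
for p_prime in grid:
    print(f"p' = {p_prime:.1f}  ->  n = {ceil(required_n(p_prime))}")

# The candidate that demands the largest sample.
worst_case = max(grid, key=required_n)
print("largest n occurs at p' =", worst_case)
```

Choosing 0.5 therefore guarantees a sample large enough for any true proportion, at the cost of possibly sampling more than strictly necessary.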
![Page 37: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/37.jpg)
Estimating Population Proportion
Returning to our previous example, suppose we choose $\alpha = 0.05$, E = 0.025, and have no prior knowledge about p’.
Then from equation (2) on the previous slide we obtain
$n = z_{\alpha/2}^2\,(p'q')/E^2$
where $z_{\alpha/2} = 1.96$ and p’ = q’ = 0.5, so
$n = (1.96)^2 (0.25)/(0.025)^2$
n = 1536.64, which rounds up to 1537
![Page 38: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/38.jpg)
Estimating the population mean, σ known
Assume:
• Sample size, n > 30
• Population standard deviation σ is known
Then from the Central Limit Theorem, we know that the sampling distribution of the sample mean for samples of size n
• Has mean $\mu_{x'} = \mu$, the mean of the original population
• Has standard deviation $\sigma_{x'} = \sigma/\sqrt{n}$
![Page 39: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/39.jpg)
Estimating the population mean, σ known
Let x’ be the mean of the sample of size n. Then the confidence interval at confidence level $1 - \alpha$ is given by
$P(x' - z_{\alpha/2}\,\sigma/\sqrt{n} < \mu < x' + z_{\alpha/2}\,\sigma/\sqrt{n}) = 1 - \alpha$
Let the margin of error be $E = z_{\alpha/2}\,\sigma/\sqrt{n}$.
Then with a probability of $1 - \alpha$ the population mean lies between x’ – E and x’ + E.
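As a sketch of where this interval comes from, assume the sample mean x’ is approximately normal with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$. Standardizing x’ and rearranging the inequality for $\mu$ gives:

$$P\left(-z_{\alpha/2} < \frac{x' - \mu}{\sigma/\sqrt{n}} < z_{\alpha/2}\right) = 1 - \alpha \;\Longrightarrow\; P\left(x' - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < x' + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha.$$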
![Page 40: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/40.jpg)
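Estimating the population mean, σ known
[Figure: x’, the mean from a sample of size n, lies a distance E from each of two candidate population means, μ1 and μ2; each shaded tail has area α/2.]
If the mean were μ1, the probability of getting a sample mean at least as extreme as x’ would be α/2.
If the mean were μ2, the probability of getting a sample mean at least as extreme as x’ would be α/2.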
![Page 41: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/41.jpg)
Estimating Population mean with σ unknown
If the population standard deviation σ is unknown, the sample will have to provide estimates of both the population mean and the standard deviation.
Estimation of the population mean will be similar to how it was done when σ is known, but the sample standard deviation will be used instead. Student’s t-distribution will be used to determine the margin of error, and the confidence interval will be somewhat wider than it would be for the same sample size if σ were known.
![Page 42: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/42.jpg)
Estimating Population mean with σ unknown
Step 1. For a sample of size n, n ≤ 30, calculate the sample mean x’ and the sample standard deviation s.
Recall: the sample variance $s^2 = \left(\sum f_i x_i^2 - (\sum f_i x_i)^2/n\right)/(n - 1)$
Step 2. Convert to a standard t-score
$t = (x' - \mu)/(s/\sqrt{n})$
where $\mu$ is the (unknown) mean of both the original population and the sampling distribution.
Step 3. Select a confidence level $1 - \alpha$, and determine $t_{\alpha/2}$.
Step 4. Find the margin of error $E = t_{\alpha/2}\,(s/\sqrt{n})$.
Then $P(x' - E \le \mu \le x' + E) = 1 - \alpha$
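The steps above can be sketched in Python. Because the standard library has no t-distribution, the critical value is passed in as read from a t table. The function name `t_interval` and the sample values (mean 50.0, standard deviation 6.0, n = 30) are hypothetical; 2.045 is the tabled two-sided 95% value for 29 degrees of freedom.

```python
from math import sqrt

def t_interval(x_bar, s, n, t_crit):
    """Interval x_bar ± t_crit * s / sqrt(n); t_crit comes from a t table
    with n - 1 degrees of freedom."""
    margin = t_crit * s / sqrt(n)   # Step 4: margin of error E
    return x_bar - margin, x_bar + margin

# Hypothetical sample summary, for illustration only.
low, high = t_interval(x_bar=50.0, s=6.0, n=30, t_crit=2.045)
print(f"95% CI for the mean: ({low:.2f}, {high:.2f})")
```

Using 1.96 in place of the t value would give a narrower interval, which is why the t-based interval is described as somewhat wider.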
![Page 43: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/43.jpg)
Estimating Population mean with σ unknown
Finding $t_{\alpha/2}$
Choose $\alpha = 0.05$, and assume n = 30. From Table A-3 (area in one tail):
| Degrees of freedom | 0.005 | 0.01 | 0.025 | 0.05 |
| --- | --- | --- | --- | --- |
| … | … | … | … | … |
| 29 | 2.756 | 2.462 | 2.045 | 1.699 |
Then $t_{\alpha/2} = 2.045$ for n – 1 = 29 degrees of freedom
![Page 44: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/44.jpg)
Estimating Population mean with σ unknown
Using the TI calculator to find a confidence interval for a statistic with a t-distribution
Let n = 106, x’ = 98.2, s = 0.62
Construct a 95% confidence interval
Step 1: Select STAT > TESTS and scroll down to 8: TInterval
Step 2: Select Stats if x’, n, and s are known
Select Data if these values are to be calculated from a list
Step 3: (Stats) Use the arrow key to move to each prompt and enter the values given above. Then select Calculate and press <enter>
Answer: TInterval (98.081, 98.319)
![Page 45: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/45.jpg)
Estimating Population Variance
Requirements:
1. The sample is a simple random sample
2. The population must have normally distributed values
The scaled sample variance has a $\chi^2$ distribution with n – 1 degrees of freedom:
$\chi^2 = (n - 1)s^2/\sigma^2$
where
n = sample size
$s^2$ = sample variance
$\sigma^2$ = population variance (to be determined)
![Page 46: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/46.jpg)
Estimating Population Standard Deviation
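Confidence Interval for the Population Variance:
$(n-1)s^2/\chi^2_R < \sigma^2 < (n-1)s^2/\chi^2_L$
[Figure: $\chi^2$ density starting at 0, with left critical value $\chi^2_L$ and right critical value $\chi^2_R$; each tail has area $\alpha/2$.]
The $\chi^2$ distribution is skewed right and always positive.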
![Page 47: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/47.jpg)
Estimating Population Standard Deviation
Example: Given the following data, find the 95% confidence interval for the population standard deviation
n = 41, x’ = 67200, s = 18277
$P\{(n-1)s^2/\chi^2_R < \sigma^2 < (n-1)s^2/\chi^2_L\} = 0.95$
First find $\chi^2_R$ and $\chi^2_L$ when each tail of the distribution contains 2.5% of the area under the curve
From Table A-4 for the Chi-Square Distribution (area to the right of the critical value):
| Degrees of Freedom | 0.995 | 0.99 | 0.975 | 0.95 | 0.10 | 0.05 | 0.025 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| … | … | … | … | … | … | … | … |
| 40 | 20.707 | 22.164 | 24.433 | 26.509 | 51.805 | 55.758 | 59.342 |
![Page 48: Lc07 SL Estimation.ppt_0](https://reader030.vdocuments.us/reader030/viewer/2022020802/577c7c671a28abe0549a771c/html5/thumbnails/48.jpg)
Estimating Population Standard Deviation
From the previous slide we have $\chi^2_R = 59.342$ and $\chi^2_L = 24.433$
And therefore:
$(n-1)s^2/\chi^2_R = 40\,(18277)^2/59.342 = 2.2517 \times 10^8$
and
$(n-1)s^2/\chi^2_L = 40\,(18277)^2/24.433 = 5.4688 \times 10^8$
Thus we have:
$P(2.2517 \times 10^8 < \sigma^2 < 5.4688 \times 10^8) = 0.95$
and for the standard deviation
$P(15{,}006 < \sigma < 23{,}385) = 0.95$
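The calculation in this example can be reproduced with a short Python sketch. The two chi-square critical values are taken from the table for 40 degrees of freedom rather than computed, and the function name `variance_ci` is hypothetical. Results may differ from hand-rounded figures in the last digit.

```python
from math import sqrt

def variance_ci(s, n, chi2_left, chi2_right):
    """Interval for the population variance:
    (n-1)s^2 / chi2_right  <  sigma^2  <  (n-1)s^2 / chi2_left."""
    scaled = (n - 1) * s ** 2
    return scaled / chi2_right, scaled / chi2_left

# Critical values for 40 degrees of freedom with 2.5% in each tail (from the table).
var_low, var_high = variance_ci(s=18277, n=41, chi2_left=24.433, chi2_right=59.342)

# Taking square roots converts the variance limits to standard deviation limits.
sd_low, sd_high = sqrt(var_low), sqrt(var_high)

print(f"variance:           {var_low:.4e} to {var_high:.4e}")
print(f"standard deviation: {sd_low:,.1f} to {sd_high:,.1f}")
```

Note that the larger critical value produces the lower limit, because it appears in the denominator.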