Chapter 8: Introduction to Statistical Inference

Apr 20, 2023
In Chapter 8:
8.1 Concepts
8.2 Sampling Behavior of a Mean
8.3 Sampling Behavior of a Count and Proportion
§8.1: Concepts

Statistical inference is the act of generalizing from a sample to a population with a calculated degree of certainty.
• We calculate statistics in the sample
• We are curious about parameters in the population
Parameters and Statistics

It is essential to draw the distinction between parameters and statistics.

|                     | Parameters | Statistics |
|---------------------|------------|------------|
| Source              | Population | Sample     |
| Calculated?         | No         | Yes        |
| Constant?           | Yes        | No         |
| Notation (examples) | μ, σ, p    | x̄, s, p̂    |
§8.2 Sampling Behavior of a Mean
• How precisely does a given sample mean reflect the underlying population mean?
• To answer this question, we must establish the sampling distribution of x-bar
• The sampling distribution of x-bar is the hypothetical distribution of means from all possible samples of size n taken from the same population
Simulation Experiment
• Population: N = 10,000 with lognormal distribution (positive skew), μ = 173, and σ = 30 (Figure A, next slide)
• Take repeated SRSs, each of n = 10, from this population
• Calculate x-bar in each sample
• Plot x-bars (Figure B , next slide)
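The steps above can be sketched in Python. This is a hypothetical re-creation, not the authors' code: the lognormal parameters are back-solved so the population has roughly μ = 173 and σ = 30, and the seed, population size, and number of replicates are illustrative choices.

```python
# Sketch of the simulation experiment: lognormal population (positive skew)
# with mean ~173 and SD ~30, repeated SRSs of n = 10, sample means recorded.
import math
import random
import statistics

random.seed(42)

# Back-solve lognormal parameters: if Y = exp(Z) with Z ~ N(m, s), then
# E[Y] = exp(m + s^2/2) and Var[Y] = (exp(s^2) - 1) * exp(2m + s^2).
mu, sigma = 173.0, 30.0
s2 = math.log(1 + (sigma / mu) ** 2)
m = math.log(mu) - s2 / 2

# Population of N = 10,000 individual values (Figure A).
population = [random.lognormvariate(m, math.sqrt(s2)) for _ in range(10_000)]

# Repeated SRSs of n = 10; record x-bar from each (Figure B).
n, reps = 10, 5_000
xbars = [statistics.fmean(random.sample(population, n)) for _ in range(reps)]

print(round(statistics.fmean(xbars), 1))   # close to mu = 173
print(round(statistics.stdev(xbars), 1))   # close to sigma/sqrt(n) ~ 9.5
```

Plotting a histogram of `xbars` reproduces the key findings on the next slide: the means cluster tightly and symmetrically around μ even though the population itself is skewed.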
A. Population (individual values)
B. Sampling distribution of x-bars
Findings

1. Distribution B is Normal even though Distribution A is not (central limit theorem)
2. Both distributions are centered on µ (“unbiasedness”)
3. The standard deviation of Distribution B is much less than the standard deviation of Distribution A (square root law)
Results from Simulation Experiment
• Finding 1 (central limit theorem) says the sampling distribution of x-bar tends toward Normality even when the population distribution is not Normal. This effect is strong in large samples.
• Finding 2 (unbiasedness) means that the expected value of x-bar is μ
• Finding 3 is related to the square root law, which says:

σx̄ = σ / √n
Standard Deviation (Error) of the Mean
• The standard deviation of the sampling distribution of the mean has a special name: it is called the “standard error of the mean” (SE)
• The square root law says the SE is inversely proportional to the square root of the sample size:
SEx̄ = σ / √n
Example: the Wechsler Adult Intelligence Scale has σ = 15

For n = 1: SEx̄ = 15 / √1 = 15
For n = 4: SEx̄ = 15 / √4 = 7.5
For n = 16: SEx̄ = 15 / √16 = 3.75

Quadrupling the sample size cuts the SE in half (square root law!)
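These three calculations can be verified in a couple of lines:

```python
# Verify the square root law for the Wechsler example: SE = sigma / sqrt(n).
import math

sigma = 15.0
se = {n: sigma / math.sqrt(n) for n in (1, 4, 16)}
print(se[1], se[4], se[16])   # 15.0 7.5 3.75
# Quadrupling n (1 -> 4, then 4 -> 16) halves the SE each time.
```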
Putting it together:

• The sampling distribution of x-bar is Normal with mean µ and standard deviation (SE) = σ / √n (when the population is Normal or n is large): x̄ ~ N(µ, SE)
• These facts make inferences about µ possible
• Example: Let X represent Wechsler adult intelligence scores: X ~ N(100, 15). Take an SRS of n = 10. Then SE = σ / √n = 15/√10 = 4.7, so x̄ ~ N(100, 4.7), and 68% of sample means will be in the range µ ± SE = 100 ± 4.7 = 95.3 to 104.7
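A quick simulation checks the 68% claim. This is a sketch under the example's assumptions (SRSs of n = 10 from X ~ N(100, 15)); the seed and replicate count are arbitrary.

```python
# Simulate sample means of n = 10 from N(100, 15) and count how many land
# within mu +/- SE; the 68-95-99.7 rule predicts about 68%.
import math
import random
import statistics

random.seed(1)
n, reps = 10, 20_000
se = 15 / math.sqrt(n)   # ~4.74, rounded to 4.7 in the example
means = [statistics.fmean(random.gauss(100, 15) for _ in range(n))
         for _ in range(reps)]
inside = sum(100 - se <= xb <= 100 + se for xb in means) / reps
print(round(inside, 2))  # close to 0.68
```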
Law of Large Numbers
• As a sample gets larger and larger, the sample mean tends to get closer and closer to μ
• This tendency is known as the Law of Large Numbers
This figure shows results from a sampling experiment in a population with μ = 173.3
As n increased, the sample mean became a better reflection of μ = 173.3
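The Law of Large Numbers can be illustrated with a running mean. This is a hypothetical stand-in for the figure's experiment: the population here is assumed Normal with μ = 173.3 and σ = 30, and the sample sizes checked are arbitrary.

```python
# Running mean of draws from a population with mu = 173.3: the error
# |x-bar - mu| tends to shrink as n grows (Law of Large Numbers).
import random

random.seed(7)
mu, sigma = 173.3, 30.0
total, errors = 0.0, {}
for i in range(1, 100_001):
    total += random.gauss(mu, sigma)
    if i in (10, 1_000, 100_000):
        errors[i] = abs(total / i - mu)

print(errors)  # the error at n = 100,000 is typically far below that at n = 10
```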
8.3 Sampling Behavior of Counts and Proportions
• Recall (from Ch 6) that a binomial random variable represents the random number of successes (X) in n independent “success/failure” trials; the probability of success for each trial is p
• Notation X~b(n,p)
• The sampling distribution X~b(10,0.2) is shown on the next slide: μ = 2 when the outcome is expressed as a count and μ = 0.2 when the outcome is expressed as a proportion.
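A small simulation of X ~ b(10, 0.2) confirms the two ways of expressing the center (the seed and replicate count below are illustrative):

```python
# Simulate X ~ b(10, 0.2): the mean count is np = 2, and the mean of the
# sample proportion p-hat = X/n is p = 0.2.
import random

random.seed(3)
n, p, reps = 10, 0.2, 50_000
counts = [sum(random.random() < p for _ in range(n)) for _ in range(reps)]
mean_count = sum(counts) / reps
mean_prop = mean_count / n
print(round(mean_count, 2), round(mean_prop, 3))  # close to 2 and 0.2
```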
Normal Approximation to the Binomial
• When n is large, the binomial distribution takes on Normal properties
• How large does the sample have to be to apply the Normal approximation?
• One rule says that the Normal approximation applies when npq ≥ 5
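The rule of thumb is easy to compute; a tiny helper (hypothetical, not from the text) makes the check explicit for the two distributions on the next slide:

```python
# Rule of thumb: the Normal approximation applies when n*p*q >= 5.
def npq(n, p):
    """Return n * p * (1 - p) for a binomial(n, p)."""
    return n * p * (1 - p)

print(round(npq(10, 0.2), 1))    # 1.6  -> approximation does not apply
print(round(npq(100, 0.2), 1))   # 16.0 -> approximation applies
```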
Top figure: X ~ b(10, 0.2): npq = 10 ∙ 0.2 ∙ (1 − 0.2) = 1.6 (less than 5) → Normal approximation does not apply

Bottom figure: X ~ b(100, 0.2): npq = 100 ∙ 0.2 ∙ (1 − 0.2) = 16 (greater than 5) → Normal approximation applies
Normal Approximation for a Binomial Count

When the Normal approximation applies:

X ~ N(np, √(npq))

with μ = np and σ = √(npq)
Normal Approximation for a Binomial Proportion

When the Normal approximation applies:

p̂ ~ N(p, √(pq/n))

with μ = p and σ = √(pq/n)
“p-hat” is the symbol for the sample proportion
Illustrative Example: Normal Approximation to the Binomial
• Suppose the prevalence of a risk factor in a population is p = 0.2
• Take an SRS of n = 100 from this population
• The number of cases in the sample will follow a binomial distribution with n = 100 and p = 0.2
Illustrative Example, cont.

The Normal approximation for the binomial count is:

μ = np = 100 ∙ 0.2 = 20 and σ = √(npq) = √(100 ∙ 0.2 ∙ 0.8) = 4

X ~ N(20, 4)

The Normal approximation for the binomial proportion is:

μ = p = 0.2 and σ = √(pq/n) = √((0.2 ∙ 0.8)/100) = 0.04

p̂ ~ N(0.2, 0.04)
1. Statement of a problem: Suppose we see a sample with 30 cases. What is the probability of seeing at least 30 cases under these circumstances, i.e., Pr(X ≥ 30) = ?, assuming X ~ N(20, 4)
2. Standardize: z = (30 – 20) / 4 = 2.5
3. Sketch: next slide
4. Table B: Pr(Z ≥ 2.5) = 0.0062
Illustrative Example, cont.
Illustrative Example, cont.

Binomial and superimposed Normal sampling distributions for the problem.

Our Normal approximation suggests that only 0.0062 of samples will see at least this many cases.
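As a closing check, the Normal-approximation tail can be compared against the exact binomial tail. This is a sketch: `math.erfc` stands in for the Normal CDF used in Table B, and the exact sum does not appear in the slides.

```python
# Compare Pr(X >= 30) under the Normal approximation N(20, 4) with the
# exact tail of X ~ b(100, 0.2).
import math

n, p = 100, 0.2

# Normal approximation: z = (30 - 20)/4 = 2.5, Pr(Z >= 2.5) via erfc.
z = (30 - 20) / 4
approx = 0.5 * math.erfc(z / math.sqrt(2))
print(round(approx, 4))   # 0.0062, matching Table B

# Exact binomial tail: Pr(X >= 30) = sum of binomial probabilities.
exact = sum(math.comb(n, k) * p**k * (1 - p) ** (n - k)
            for k in range(30, n + 1))
print(round(exact, 4))    # somewhat larger, since the binomial is skewed
```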