3 2 review sampling, ci
![Page 1: 3 2 Review Sampling, CI](https://reader034.vdocuments.us/reader034/viewer/2022042610/56d6be621a28ab301691e661/html5/thumbnails/1.jpg)
Multivariate Random Variables
• Consider two random variables X and Y
• Joint distribution
• Marginal distribution
• Conditional distribution
![Page 2: 3 2 Review Sampling, CI](https://reader034.vdocuments.us/reader034/viewer/2022042610/56d6be621a28ab301691e661/html5/thumbnails/2.jpg)
Joint Probability Mass Function
![Page 3: 3 2 Review Sampling, CI](https://reader034.vdocuments.us/reader034/viewer/2022042610/56d6be621a28ab301691e661/html5/thumbnails/3.jpg)
Joint Probability Density Function
![Page 4: 3 2 Review Sampling, CI](https://reader034.vdocuments.us/reader034/viewer/2022042610/56d6be621a28ab301691e661/html5/thumbnails/4.jpg)
Example of joint probability density
![Page 5: 3 2 Review Sampling, CI](https://reader034.vdocuments.us/reader034/viewer/2022042610/56d6be621a28ab301691e661/html5/thumbnails/5.jpg)
Conditional distributions
![Page 6: 3 2 Review Sampling, CI](https://reader034.vdocuments.us/reader034/viewer/2022042610/56d6be621a28ab301691e661/html5/thumbnails/6.jpg)
Independent Random Variables
![Page 7: 3 2 Review Sampling, CI](https://reader034.vdocuments.us/reader034/viewer/2022042610/56d6be621a28ab301691e661/html5/thumbnails/7.jpg)
Expected value
![Page 8: 3 2 Review Sampling, CI](https://reader034.vdocuments.us/reader034/viewer/2022042610/56d6be621a28ab301691e661/html5/thumbnails/8.jpg)
Correlation
![Page 9: 3 2 Review Sampling, CI](https://reader034.vdocuments.us/reader034/viewer/2022042610/56d6be621a28ab301691e661/html5/thumbnails/9.jpg)
Random samples
![Page 10: 3 2 Review Sampling, CI](https://reader034.vdocuments.us/reader034/viewer/2022042610/56d6be621a28ab301691e661/html5/thumbnails/10.jpg)
Linear Combinations and their means
![Page 11: 3 2 Review Sampling, CI](https://reader034.vdocuments.us/reader034/viewer/2022042610/56d6be621a28ab301691e661/html5/thumbnails/11.jpg)
Variances of linear combinations
![Page 12: 3 2 Review Sampling, CI](https://reader034.vdocuments.us/reader034/viewer/2022042610/56d6be621a28ab301691e661/html5/thumbnails/12.jpg)
The difference between random variables
![Page 13: 3 2 Review Sampling, CI](https://reader034.vdocuments.us/reader034/viewer/2022042610/56d6be621a28ab301691e661/html5/thumbnails/13.jpg)
The Case of Normal Random Variables
![Page 14: 3 2 Review Sampling, CI](https://reader034.vdocuments.us/reader034/viewer/2022042610/56d6be621a28ab301691e661/html5/thumbnails/14.jpg)
Sampling and Confidence Interval
![Page 15: 3 2 Review Sampling, CI](https://reader034.vdocuments.us/reader034/viewer/2022042610/56d6be621a28ab301691e661/html5/thumbnails/15.jpg)
Statistical Inference
• Statistical inference: study the population based on the data obtained from a sample from the population
• Random variable $X$ vs. pdf $f(x)$ vs. cdf $F(x)$ vs. parameters $(\mu, \sigma^2)$ vs. sample $(X_1, X_2, \dots, X_n)$ vs. estimate ($\bar{X}$) or any other statistic ($h(X_k)$)
• Estimates of parameters
– Point estimate of $\mu$: $\bar{X} = (X_1 + X_2 + \dots + X_n)/n$
– Point estimate of $\sigma^2$: $S^2 = \left((X_1-\bar{X})^2 + (X_2-\bar{X})^2 + \dots + (X_n-\bar{X})^2\right)/(n-1)$
• TAMU student height: average height of students; sample based on random sampling of 30 students in ISEN department
• Problem: variability; $\mu \neq \bar{x}$.
• Point estimate says nothing about how close it might be to $\mu$
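The two point estimates above can be sketched in a few lines of Python. The height values below are hypothetical, made up only to exercise the formulas, and the function names `sample_mean` and `sample_variance` are illustrative rather than taken from the slides.

```python
from statistics import mean, variance


def sample_mean(xs):
    """Point estimate of mu: (X_1 + ... + X_n) / n."""
    return sum(xs) / len(xs)


def sample_variance(xs):
    """Point estimate of sigma^2: sum of squared deviations over (n - 1)."""
    xbar = sample_mean(xs)
    return sum((x - xbar) ** 2 for x in xs) / (len(xs) - 1)


# Hypothetical heights in inches (not data from the slides).
heights = [66.0, 70.5, 68.0, 71.0, 67.5]

print(sample_mean(heights))      # estimate of mu
print(sample_variance(heights))  # estimate of sigma^2

# The standard library uses the same definitions (n - 1 in the denominator).
print(mean(heights), variance(heights))
```

A different sample of the same size would give different values, which is the variability problem the slide points out.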
![Page 16: 3 2 Review Sampling, CI](https://reader034.vdocuments.us/reader034/viewer/2022042610/56d6be621a28ab301691e661/html5/thumbnails/16.jpg)
Definition of a statistic
![Page 17: 3 2 Review Sampling, CI](https://reader034.vdocuments.us/reader034/viewer/2022042610/56d6be621a28ab301691e661/html5/thumbnails/17.jpg)
Interval Estimate
• Interval estimate: an entire interval of plausible values
– More information about a population characteristic than does a point estimate
– A confidence level for the estimate
– Such interval estimates are called confidence intervals (CI)
• Confidence level: a measure of the degree of reliability of the interval (e.g., 95%, 99%, 90%)
• Significance level ($\alpha$): $\alpha = 1 -$ confidence level
• Width of CI: for a given confidence level, a narrow interval means our knowledge of the value of the parameter is reasonably precise; a very wide CI indicates a large amount of uncertainty.
[Figure: a confidence interval on a number line, marking the lower confidence limit, the point estimate, the upper confidence limit, and the width of the confidence interval]
![Page 18: 3 2 Review Sampling, CI](https://reader034.vdocuments.us/reader034/viewer/2022042610/56d6be621a28ab301691e661/html5/thumbnails/18.jpg)
CI of Normal Distribution
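• A $100(1-\alpha)\%$ confidence interval for the mean $\mu$ of a normal population when the value of $\sigma$ is known is given by
$$\left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right)$$
[Figure: Z curve, with central area $1-\alpha$ between $-z_{\alpha/2}$ and $z_{\alpha/2}$]
$$P\left(-z_{\alpha/2} < Z < z_{\alpha/2}\right) = 1 - \alpha$$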
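As a rough illustration of where $z_{\alpha/2}$ comes from, the sketch below looks up the critical value with `statistics.NormalDist` from the Python standard library and checks that the central area is $1-\alpha$. The helper name `z_critical` is an assumption made for this example.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def z_critical(alpha):
    """Return z_{alpha/2}: the point with area alpha/2 to its right."""
    return STANDARD_NORMAL.inv_cdf(1 - alpha / 2)


for level in (0.90, 0.95, 0.99):
    alpha = 1 - level
    z = z_critical(alpha)
    # Central area between -z and z under the Z curve.
    area = STANDARD_NORMAL.cdf(z) - STANDARD_NORMAL.cdf(-z)
    print(f"confidence {level:.2f}: z = {z:.3f}, central area = {area:.3f}")
```

Higher confidence levels need larger critical values, so the interval widens for the same $\sigma$ and $n$.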
![Page 19: 3 2 Review Sampling, CI](https://reader034.vdocuments.us/reader034/viewer/2022042610/56d6be621a28ab301691e661/html5/thumbnails/19.jpg)
Example
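• TAMU student height:
--- Suppose students’ height follows a normal distribution with unknown mean $\mu$ and known $\sigma$ ($\sigma = 2$).
--- We have observations from the sample of IE students; each observation is a random sample from $N(\mu, \sigma^2)$.
--- The sample mean follows a normal distribution: $\bar{X} \sim N(\mu, \sigma^2/n)$.
--- Normalize $\bar{X}$: $Z = \dfrac{\bar{X}-\mu}{\sigma/\sqrt{n}}$; under the standard normal curve,
$$P(-1.96 < Z < 1.96) = 0.95$$
$$\rightarrow\ P\left(-1.96 < \frac{\bar{X}-\mu}{\sigma/\sqrt{n}} < 1.96\right) = 0.95$$
$$P\left(\bar{X} - 1.96\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}\right) = 0.95$$
95% confidence interval: $\left(\bar{X} - 1.96\dfrac{\sigma}{\sqrt{n}},\ \bar{X} + 1.96\dfrac{\sigma}{\sqrt{n}}\right)$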
![Page 20: 3 2 Review Sampling, CI](https://reader034.vdocuments.us/reader034/viewer/2022042610/56d6be621a28ab301691e661/html5/thumbnails/20.jpg)
Example
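• 95% confidence interval: $\left(\bar{X} - 1.96\dfrac{\sigma}{\sqrt{n}},\ \bar{X} + 1.96\dfrac{\sigma}{\sqrt{n}}\right)$
• The CI is a random interval because $\bar{X}$ is a random variable.
• The interval length $2 \times 1.96 \times \dfrac{\sigma}{\sqrt{n}}$ is not random; only the location (center $\bar{X}$) is random.
• The CI can be explained as follows: the probability is 0.95 that the random interval includes the true value of $\mu$.
• TAMU students’ height: $\left(68.46 - 1.96\dfrac{2}{\sqrt{24}},\ 68.46 + 1.96\dfrac{2}{\sqrt{24}}\right) = (67.66, 69.26)$
[Figure: interval centered at $\bar{X}$, starting at $\bar{X} - 1.96\dfrac{\sigma}{\sqrt{n}}$, with width $2 \times 1.96\dfrac{\sigma}{\sqrt{n}}$]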
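The arithmetic of the TAMU example can be checked with a short sketch. The numbers (sample mean 68.46, $\sigma = 2$, $n = 24$, critical value 1.96) are the ones on the slide; the function name `z_interval` is only illustrative.

```python
from math import sqrt


def z_interval(xbar, sigma, n, z=1.96):
    """Known-sigma CI for the mean: xbar -/+ z * sigma / sqrt(n)."""
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width


lower, upper = z_interval(xbar=68.46, sigma=2, n=24)
print(f"({lower:.2f}, {upper:.2f})")  # (67.66, 69.26)

# The width depends only on z, sigma and n, not on the observed mean.
print(f"width = {upper - lower:.2f}")
```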
![Page 21: 3 2 Review Sampling, CI](https://reader034.vdocuments.us/reader034/viewer/2022042610/56d6be621a28ab301691e661/html5/thumbnails/21.jpg)
Interpret CI
• Confidence level: the 95% is the probability 0.95 that the random interval includes $\mu$, a statement made about the interval before the sample is observed.
• WRONG: having calculated the CI (67.66, 69.26), it is wrong to say that $\mu$ lies within this fixed interval with probability 0.95. Once the interval is computed, neither its endpoints nor $\mu$ is random.
• We should use the long-run relative frequency interpretation of probability to explain the CI.
• We can take various samples (e.g., students from the business school, students enrolled in a certain class, or students currently at the library), and then we get a large enough number of intervals such as $\bar{X} -$ …
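The long-run relative frequency interpretation can be illustrated with a small simulation. This is a sketch under stated assumptions: the "true" mean 68 is a hypothetical value chosen for the simulation, $\sigma = 2$ and $n = 24$ follow the height example, and `coverage` is an illustrative helper name.

```python
import random
from math import sqrt


def coverage(mu, sigma, n, reps=2000, z=1.96, seed=0):
    """Fraction of simulated known-sigma CIs that contain the true mean."""
    rng = random.Random(seed)
    half_width = z * sigma / sqrt(n)
    hits = 0
    for _ in range(reps):
        # Draw a fresh sample of size n and compute its mean.
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if xbar - half_width < mu < xbar + half_width:
            hits += 1
    return hits / reps


# mu = 68 is an assumed value; in practice mu is unknown.
print(coverage(mu=68.0, sigma=2.0, n=24))
```

Each individual interval either contains $\mu$ or it does not; the 0.95 describes the proportion of intervals that do so over many repetitions.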
![Page 22: 3 2 Review Sampling, CI](https://reader034.vdocuments.us/reader034/viewer/2022042610/56d6be621a28ab301691e661/html5/thumbnails/22.jpg)
Large Sample CI For Mean
• In reality, TAMU students’ height may not follow a normal distribution.
• However, if the sample size is large enough, then according to the central limit theorem, $\bar{X}$ tends to follow a normal distribution, whatever the population distribution.
• Thus $Z = \dfrac{\bar{X}-\mu}{\sigma/\sqrt{n}}$ has approximately a standard normal distribution:
$$P\left(-z_{\alpha/2} < \frac{\bar{X}-\mu}{\sigma/\sqrt{n}} < z_{\alpha/2}\right) \approx 1 - \alpha$$
• When $n$ is large, the CI $\left(\bar{x} - z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}\right)$ remains approximately valid whatever the population distribution.
• Even if the height follows some non-Gaussian distribution with known $\sigma$, we can still use this CI, provided we have a large enough sample size $n$.
![Page 23: 3 2 Review Sampling, CI](https://reader034.vdocuments.us/reader034/viewer/2022042610/56d6be621a28ab301691e661/html5/thumbnails/23.jpg)
CI When Variance Unknown
• Assumption: population is normal, and random samples are from a normal distribution with both 𝜇 and 𝜎 unknown.
• Theorem: when $\bar{X}$ is the mean of a random sample of size $n$ from a normal distribution with mean $\mu$, the random variable $T = \dfrac{\bar{X}-\mu}{S/\sqrt{n}}$ follows a t distribution with $n-1$ degrees of freedom (DoF).
• Properties: let $T_\nu$ denote a t statistic with $\nu$ DoF.
--- Each $T_\nu$ pdf ($t_\nu$ curve) is bell shaped and centered at 0.
--- Each $t_\nu$ curve is more spread out than the standard normal (z) curve.
--- As $\nu$ increases, the spread of the corresponding $t_\nu$ curve decreases.
--- As $\nu \to \infty$, the $t_\nu$ curve approaches the standard normal curve.
• Critical value: let $t_{\alpha,\nu}$ denote the number on the measurement axis for which the area under the t curve with $\nu$ DoF to the right of $t_{\alpha,\nu}$ is $\alpha$; $t_{\alpha,\nu}$ is called a t critical value.
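The definition of $t_{\alpha,\nu}$ can be turned into a small numerical sketch using only the standard library. It integrates the t density with Simpson's rule and finds the critical value by bisection; this is one possible approach chosen for illustration (a statistics package or a t table would normally be used), and the function names are assumptions.

```python
import math


def t_pdf(t, nu):
    """Density of the t distribution with nu degrees of freedom."""
    log_c = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
             - 0.5 * math.log(nu * math.pi))
    return math.exp(log_c - (nu + 1) / 2 * math.log1p(t * t / nu))


def t_upper_tail(t, nu, panels=2000):
    """Area to the right of t >= 0: 0.5 minus Simpson's rule on [0, t]."""
    if t == 0:
        return 0.5
    h = t / panels  # panels must be even for Simpson's rule
    total = t_pdf(0.0, nu) + t_pdf(t, nu)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    return 0.5 - total * h / 3


def t_critical(alpha, nu, hi=200.0):
    """Bisection for t_{alpha, nu}; assumes the answer lies in (0, hi)."""
    lo = 0.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, nu) > alpha:
            lo = mid  # too much area to the right: move right
        else:
            hi = mid
    return (lo + hi) / 2


for nu in (4, 10, 30, 1000):
    print(nu, round(t_critical(0.025, nu), 3))
```

The printed values shrink toward the z critical value 1.96 as $\nu$ grows, in line with the properties listed above.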
![Page 24: 3 2 Review Sampling, CI](https://reader034.vdocuments.us/reader034/viewer/2022042610/56d6be621a28ab301691e661/html5/thumbnails/24.jpg)
CI When Variance Unknown
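• Let $\bar{x}$ and $s$ be the sample mean and sample standard deviation from a normal population with mean $\mu$. Then the $100(1-\alpha)\%$ confidence interval for $\mu$ is
$$\left(\bar{x} - t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}},\ \bar{x} + t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}}\right)$$
• An upper confidence bound for $\mu$ is $\bar{x} + t_{\alpha,\,n-1}\dfrac{s}{\sqrt{n}}$, with confidence level $100(1-\alpha)\%$.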
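A minimal sketch of the t interval and the upper confidence bound follows. The five height values are hypothetical, and the critical values $t_{0.025,4} = 2.776$ and $t_{0.05,4} = 2.132$ are standard t-table entries passed in by hand; the function names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev


def t_interval(sample, t_crit):
    """Two-sided CI: xbar -/+ t_{alpha/2, n-1} * s / sqrt(n)."""
    n = len(sample)
    xbar = mean(sample)
    s = stdev(sample)  # sample standard deviation (n - 1 denominator)
    half_width = t_crit * s / sqrt(n)
    return xbar - half_width, xbar + half_width


def upper_confidence_bound(sample, t_crit_one_sided):
    """One-sided bound: xbar + t_{alpha, n-1} * s / sqrt(n)."""
    n = len(sample)
    return mean(sample) + t_crit_one_sided * stdev(sample) / sqrt(n)


# Hypothetical heights in inches, n = 5, so n - 1 = 4 degrees of freedom.
heights = [66.0, 70.5, 68.0, 71.0, 67.5]

print(t_interval(heights, t_crit=2.776))                       # 95% CI
print(upper_confidence_bound(heights, t_crit_one_sided=2.132))  # 95% bound
```

The one-sided bound uses $t_{\alpha,n-1}$ rather than $t_{\alpha/2,n-1}$, so at the same confidence level it sits below the upper limit of the two-sided interval.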