"classical" inference
DESCRIPTION
"Classical" Inference. Two simple inference scenarios. Question 1: Are we in world A or world B?. Possible worlds: World A World B. Jerzy Neyman and Egon Pearson. D : Decision in favor of:. H 0 : Null Hypothesis. H 1 : Alternative Hypothesis. - PowerPoint PPT PresentationTRANSCRIPT
"Classical" Inference
Two simple inference scenarios
Question 1: Are we in world A or world B?
Possible worlds: World A, World B

World A (X falls in interval → percent of draws; "added" is the increment over the previous row):

X in:          percent   added
[-.5, .5]        38        38
[-1, 1]          68        30
[-1.5, 1.5]      87        19
[-2, 2]          95         8
[-2.5, 2.5]      99         4
(-∞, ∞)         100         1

World B:

X in:          percent   added
[4, 6]           38        38
[3, 7]           68        30
[2, 8]           87        19
[1, 9]           95         8
[0, 10]          99         4
(-∞, ∞)         100         1
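A minimal sketch of where these percentages come from, assuming (an inference from the numbers, not stated on the slides) that World A is X ~ N(0, 1) and World B is X ~ N(5, 2^2):

```python
# Sketch, assuming World A: X ~ N(0, 1) and World B: X ~ N(5, 2^2);
# the parameters are inferred from the percentages, not given on the slides.
from scipy.stats import norm

world_a = norm(loc=0, scale=1)
world_b = norm(loc=5, scale=2)

intervals_a = [(-0.5, 0.5), (-1, 1), (-1.5, 1.5), (-2, 2), (-2.5, 2.5)]
intervals_b = [(4, 6), (3, 7), (2, 8), (1, 9), (0, 10)]

for dist, intervals in [(world_a, intervals_a), (world_b, intervals_b)]:
    prev = 0.0
    for lo, hi in intervals:
        p = dist.cdf(hi) - dist.cdf(lo)  # Pr[lo <= X <= hi]
        print(f"[{lo}, {hi}]  {100 * p:3.0f}  (added: {100 * (p - prev):2.0f})")
        prev = p
```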
Jerzy Neyman and Egon Pearson
D: Decision in favor of H0 (Null Hypothesis) or H1 (Alternative Hypothesis)
T: The truth of the matter

              T = H0                             T = H1
D = H0   Correct acceptance of H0:          Type II error:
         Pr(D=H0 | T=H0) = (1 − α)          Pr(D=H0 | T=H1) = β
D = H1   Type I error:                      Correct acceptance of H1:
         Pr(D=H1 | T=H0) = α  [aka size]    Pr(D=H1 | T=H1) = (1 − β)  [aka power]
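As an illustration of the four cells, a sketch for a hypothetical threshold rule ``decide H1 when X > c'' applied to the two worlds above (the cutoff c = 2.5 is an assumption chosen for illustration):

```python
# Hypothetical threshold rule: decide H1 when X > 2.5, with
# H0 (World A): X ~ N(0, 1) and H1 (World B): X ~ N(5, 2^2).
from scipy.stats import norm

H0, H1, c = norm(0, 1), norm(5, 2), 2.5

alpha = H0.sf(c)   # Pr[D=H1 | T=H0]: type I error, aka size
beta = H1.cdf(c)   # Pr[D=H0 | T=H1]: type II error

print(f"Pr(D=H0 | T=H0) = {1 - alpha:.4f}  (correct acceptance of H0)")
print(f"Pr(D=H1 | T=H0) = {alpha:.4f}  (type I error, size)")
print(f"Pr(D=H0 | T=H1) = {beta:.4f}  (type II error)")
print(f"Pr(D=H1 | T=H1) = {1 - beta:.4f}  (power)")
```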
Definition. A subset C of the sample space is a best critical region of size α for testing the hypothesis H0 against the hypothesis H1 if:

(i) $\Pr[(X_1, \dots, X_n) \in C \mid H_0] = \alpha$

and for every subset A of the sample space, whenever:

(ii) $\Pr[(X_1, \dots, X_n) \in A \mid H_0] = \alpha$

we also have:

(iii) $\Pr[(X_1, \dots, X_n) \in C \mid H_1] \ge \Pr[(X_1, \dots, X_n) \in A \mid H_1]$
Neyman-Pearson Theorem:

Suppose that for some k > 0:

1. $\frac{\Pr[x \mid H_1]}{\Pr[x \mid H_0]} \ge k$ for every sample point x in C

2. $\frac{\Pr[x \mid H_1]}{\Pr[x \mid H_0]} \le k$ for every sample point x outside C

3. $\Pr[C \mid H_0] = \alpha$

Then C is a best critical region of size α for the test of H0 vs. H1.
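A sketch of the theorem for a single observation, under hypothetical simple hypotheses H0: X ~ N(0, 1) vs. H1: X ~ N(5, 1). The likelihood ratio here is increasing in x, so the region where it exceeds k is a right tail {x > c}, and choosing c so that Pr[C | H0] = α yields the best critical region:

```python
# Hypothetical simple hypotheses H0: X ~ N(0, 1) vs. H1: X ~ N(5, 1),
# one observation. Here Pr[x|H1]/Pr[x|H0] = exp(5x - 12.5) is increasing
# in x, so {ratio >= k} is a right tail {x > c}.
from scipy.stats import norm

alpha = 0.05
c = norm(0, 1).ppf(1 - alpha)               # Pr[X > c | H0] = alpha
k = norm(5, 1).pdf(c) / norm(0, 1).pdf(c)   # likelihood ratio at the boundary

size = norm(0, 1).sf(c)    # condition 3: Pr[C | H0] = alpha
power = norm(5, 1).sf(c)   # Pr[C | H1], maximal among size-alpha regions
print(f"c = {c:.3f}, k = {k:.4f}, size = {size:.3f}, power = {power:.4f}")
```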
• When the null and alternative hypotheses are both Normal, the relation between the power of a statistical test (1 − β) and the sample size n is given by the formula

$$(1 - \beta) = \Phi\left( \frac{|\mu_{H_1} - \mu_{H_0}|}{\frac{1}{\sqrt{n}}\,\sigma_{H_1}} - \frac{|q_\alpha|\,\sigma_{H_0}}{\sigma_{H_1}} \right)$$

where Φ is the cdf of N(0,1), and $q_\alpha$ is the quantile determined by α.

• α fixes the type I error probability, but increasing n reduces the type II error probability.
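A sketch evaluating this formula; the instance (μ_H0 = 0, μ_H1 = 5, σ_H0 = 1, σ_H1 = 2, echoing Worlds A and B above) and the choices of α and n are illustrative assumptions:

```python
# Evaluating the power formula; all parameter values below are
# illustrative assumptions (they echo Worlds A and B above).
import numpy as np
from scipy.stats import norm

def power(mu0, mu1, sigma0, sigma1, n, alpha):
    q = norm.ppf(alpha)  # quantile determined by alpha
    return norm.cdf(abs(mu1 - mu0) / (sigma1 / np.sqrt(n))
                    - abs(q) * sigma0 / sigma1)

# alpha stays fixed; raising n drives the type II error probability down.
for n in (1, 2, 4):
    print(n, power(mu0=0, mu1=5, sigma0=1, sigma1=2, n=n, alpha=0.05))
```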
Question 2: Does the evidence suggest our world is not like World A?
World A
X in:          percent   added
[-.5, .5]        38        38
[-1, 1]          68        30
[-1.5, 1.5]      87        19
[-2, 2]          95         8
[-2.5, 2.5]      99         4
(-∞, ∞)         100         1
Sir Ronald Aylmer Fisher
Fisherian theory
Significance tests: their disjunctive logic, and p-values as evidence:
``[This very low p-value] is amply low enough to exclude at a high level of significance any theory involving a random distribution… The force with which such a conclusion is supported is logically that of the simple disjunction: Either an exceptionally rare chance has occurred, or the theory of random distribution is not true.'' (Fisher 1959, 39)
Fisherian theory
``The meaning of `H' is rejected at level α' is `Either an event of probability α has occurred, or H is false', and our disposition to disbelieve H arises from our disposition to disbelieve in events of small probability.'' (Barnard 1967, 32)
• Common philosophical simplification:
• Hypothesis space given qualitatively;
• H0 vs. ¬H0,
• Murderer was Professor Plum, Colonel Mustard, Miss Scarlett, or Mrs. Peacock
• More typical situation:
• Very strong structural assumptions
• Hypothesis space given by unknown numeric `parameters'
• Test uses:
• a transformation of the raw data,
• a probability distribution for this transformation (≠ the original distribution of interest)
Three Commonly Used Facts
• Assume {X1,…, Xn} is a collection of independent and identically distributed (i.i.d.) random variables.
• Assume also that the Xi's share a mean of μ and a standard deviation of σ.
Three Commonly Used Facts
For the mean estimator $\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$:

1. $E[\bar{X}] = E[X] = \mu$

2. $\mathrm{Var}[\bar{X}] = \frac{1}{n} \mathrm{Var}[X] = \frac{\sigma^2}{n}$
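A quick simulation check of facts 1 and 2, using an arbitrary i.i.d. source (an exponential distribution with μ = σ = 2, chosen purely for illustration):

```python
# Simulation check of facts 1 and 2; the exponential source
# (mu = sigma = 2) is an arbitrary illustrative choice.
import numpy as np

rng = np.random.default_rng(0)
n, reps = 25, 100_000
samples = rng.exponential(scale=2.0, size=(reps, n))
xbar = samples.mean(axis=1)

print(xbar.mean())  # ~ 2.0:  E[Xbar] = E[X] = mu
print(xbar.var())   # ~ 0.16: Var[Xbar] = Var[X]/n = 4/25
```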
Three Commonly Used Facts
The Central Limit Theorem. If {X1,…, Xn} are i.i.d. random variables from a distribution with mean μ and variance σ², then:

3. $\lim_{n \to \infty} \frac{1}{\sigma \sqrt{n}} \sum_{i=1}^{n} (X_i - \mu) \sim N(0,1)$

Equivalently:

$\lim_{n \to \infty} \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \sim N(0,1)$
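A sketch of the theorem in action: standardized means of a deliberately skewed (exponential) distribution approach N(0,1) as n grows; the distribution choice is an illustrative assumption:

```python
# Standardized means of a skewed (exponential, mu = sigma = 2)
# distribution: the empirical cdf approaches the N(0,1) cdf.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
mu, sigma, n, reps = 2.0, 2.0, 200, 50_000
xbar = rng.exponential(scale=2.0, size=(reps, n)).mean(axis=1)
z = (xbar - mu) / (sigma / np.sqrt(n))

for c in (-2, -1, 0, 1, 2):
    print(c, round((z <= c).mean(), 4), round(norm.cdf(c), 4))
```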
Examples
• Data: January 2012 CPS
• Sample: PhDs, working full time, age 28-34
• H0: mean income is 75k
Hyp.   Value       Probability
H0     -1.024022   0.3138
[Histogram of WAGES: 32 observations, wages ranging roughly $10,000-$150,000]

Series: WAGES
Sample: 1 13566 IF PRTAGE > 27 AND PRTAGE < 35 AND PEEDUCA = 46 AND PEHRACTT > 39
Observations: 32

Mean         68898.16
Median       66300.00
Maximum      149999.7
Minimum      10140.00
Std. Dev.    33707.49
Skewness     0.624378
Kurtosis     3.253256
Jarque-Bera  2.164708
Probability  0.338797
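The reported test statistic can be reproduced from these summary statistics alone; a sketch (the raw CPS microdata is not reproduced here):

```python
# One-sample t test recomputed from the reported summary statistics.
import numpy as np
from scipy.stats import t

n, mean, sd, mu0 = 32, 68898.16, 33707.49, 75000.0
t_stat = (mean - mu0) / (sd / np.sqrt(n))
p_value = 2 * t.sf(abs(t_stat), df=n - 1)  # two-sided

print(t_stat, p_value)  # ~ -1.024022, ~ 0.3138, matching the table
```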
Comments
• The background conditions (e.g., the i.i.d. condition behind the sample) are a clear example of `Quine-Duhem’ conditions.
• When background conditions are met, ``large samples’’ don’t make inferences ``more certain’’
• Multiple tests
• Monitoring or ``peeking'' at data, etc.
Point estimates and Confidence Intervals
• Many desiderata of an estimator:
• Consistent
• Maximum Likelihood
• Unbiased
• Sufficient
• Minimum variance
• Minimum MSE (mean squared error)
• (most) efficient
• By CLT, approximately:

$\frac{\bar{X} - \mu}{\sigma_{\bar{X}}} \sim N(0,1)$

• Thus:

$\Pr[-2 \le \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} \le 2] \approx .95$

• By algebra:

$\Pr[\bar{X} - 2\sigma_{\bar{X}} \le \mu \le \bar{X} + 2\sigma_{\bar{X}}] \approx .95$

• So:

$\Pr[\mu \in \bar{X} \pm \tfrac{1}{\sqrt{n}} 2\sigma] \approx .95$
Interpreting confidence intervals
• The only probabilistic component that determines what occurs is $\bar{X}$.
• Everything else is a constant.
• Simulations, examples (see the sketch below).
• Question: Why ``center'' the interval?

$\Pr[\mu \in \bar{X} \pm \tfrac{1}{\sqrt{n}} 2\sigma] \approx .95$
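A minimal coverage simulation of this interpretation: μ is fixed, the interval is what varies, and roughly 95% of repeated intervals cover μ. The population N(10, 3²) and n = 25 are illustrative assumptions:

```python
# mu is fixed; the interval Xbar +/- 2*sigma/sqrt(n) is what varies.
# Population N(10, 3^2) and n = 25 are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(2)
mu, sigma, n, reps = 10.0, 3.0, 25, 100_000
xbar = rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)
moe = 2 * sigma / np.sqrt(n)

covered = (xbar - moe <= mu) & (mu <= xbar + moe)
print(covered.mean())  # ~ 0.954
```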
Confidence Intervals
• $68,898.16 ± $12,152.85
• ``C.I. = mean ± m.o.e.''
• = ($56,745.32, $81,051.01)
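A sketch recomputing this interval from the summary statistics above; note the slide's margin of error corresponds to the exact t quantile for df = 31 (≈ 2.04) rather than the rule-of-thumb 2:

```python
# Margin of error from the reported summary statistics; the slide's
# m.o.e. matches the exact t quantile for df = 31 (~2.04), not 2.
import numpy as np
from scipy.stats import t

n, mean, sd = 32, 68898.16, 33707.49
moe = t.ppf(0.975, df=n - 1) * sd / np.sqrt(n)

print(moe)                     # ~ 12152.85
print(mean - moe, mean + moe)  # ~ (56745.31, 81051.01)
```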
Using similar logic, but different computing formulae, one can extend these methods to address further questions
e.g., for standard deviations, equality of means across groups, etc.
Equality of Means: BAs
Sex    Count   Mean       Std. Dev.
1      223     63619.54   31370.01
2      209     51395.43   25530.66
All    432     57705.56   29306.13

Value      Probability
4.424943   0.0000
Equality of Means: PhDs
Sex    Count   Mean       Std. Dev.
1      21      66452.71   36139.78
2      11      73566.76   29555.10
All    32      68898.16   33707.49

Value       Probability
-0.560745   0.5791
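Both tables can be reproduced from the reported summary statistics with a pooled-variance two-sample t test, which is consistent with the reported values; a sketch:

```python
# Pooled-variance two-sample t tests from the reported summary statistics.
from scipy.stats import ttest_ind_from_stats

# BAs: sex 1 vs. sex 2 -> statistic ~ 4.424943, p ~ 0.0000
print(ttest_ind_from_stats(63619.54, 31370.01, 223,
                           51395.43, 25530.66, 209, equal_var=True))

# PhDs: sex 1 vs. sex 2 -> statistic ~ -0.560745, p ~ 0.5791
print(ttest_ind_from_stats(66452.71, 36139.78, 21,
                           73566.76, 29555.10, 11, equal_var=True))
```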