"classical" inference
DESCRIPTION
"Classical" Inference. Two simple inference scenarios. Question 1: Are we in world A or world B?. Possible worlds: World A World B. Jerzy Neyman and Egon Pearson. D : Decision in favor of:. H 0 : Null Hypothesis. H 1 : Alternative Hypothesis. - PowerPoint PPT PresentationTRANSCRIPT
"Classical" Inference
Two simple inference scenarios
Question 1: Are we in world A or world B?
Possible worlds: World A, World B

World A (X falls in interval → percent of draws; "added" is the increment over the previous row):

X in:          percent   added
[-.5, .5]        38        38
[-1, 1]          68        30
[-1.5, 1.5]      87        19
[-2, 2]          95         8
[-2.5, 2.5]      99         4
(-∞, ∞)         100         1

World B:

X in:          percent   added
[4, 6]           38        38
[3, 7]           68        30
[2, 8]           87        19
[1, 9]           95         8
[0, 10]          99         4
(-∞, ∞)         100         1
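A minimal sketch of where these percentages come from, assuming (an inference from the numbers, not stated on the slides) that World A is X ~ N(0, 1) and World B is X ~ N(5, 2^2):

```python
# Sketch, assuming World A: X ~ N(0, 1) and World B: X ~ N(5, 2^2);
# the parameters are inferred from the percentages, not given on the slides.
from scipy.stats import norm

world_a = norm(loc=0, scale=1)
world_b = norm(loc=5, scale=2)

intervals_a = [(-0.5, 0.5), (-1, 1), (-1.5, 1.5), (-2, 2), (-2.5, 2.5)]
intervals_b = [(4, 6), (3, 7), (2, 8), (1, 9), (0, 10)]

for dist, intervals in [(world_a, intervals_a), (world_b, intervals_b)]:
    prev = 0.0
    for lo, hi in intervals:
        p = dist.cdf(hi) - dist.cdf(lo)  # Pr[lo <= X <= hi]
        print(f"[{lo}, {hi}]  {100 * p:3.0f}  (added: {100 * (p - prev):2.0f})")
        prev = p
```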
Jerzy Neyman and Egon Pearson
D: Decision in favor of H0 (Null Hypothesis) or H1 (Alternative Hypothesis)
T: The truth of the matter

              T = H0                             T = H1
D = H0   Correct acceptance of H0:          Type II error:
         Pr(D=H0 | T=H0) = (1 − α)          Pr(D=H0 | T=H1) = β
D = H1   Type I error:                      Correct acceptance of H1:
         Pr(D=H1 | T=H0) = α  [aka size]    Pr(D=H1 | T=H1) = (1 − β)  [aka power]
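As an illustration of the four cells, a sketch for a hypothetical threshold rule ``decide H1 when X > c'' applied to the two worlds above (the cutoff c = 2.5 is an assumption chosen for illustration):

```python
# Hypothetical threshold rule: decide H1 when X > 2.5, with
# H0 (World A): X ~ N(0, 1) and H1 (World B): X ~ N(5, 2^2).
from scipy.stats import norm

H0, H1, c = norm(0, 1), norm(5, 2), 2.5

alpha = H0.sf(c)   # Pr[D=H1 | T=H0]: type I error, aka size
beta = H1.cdf(c)   # Pr[D=H0 | T=H1]: type II error

print(f"Pr(D=H0 | T=H0) = {1 - alpha:.4f}  (correct acceptance of H0)")
print(f"Pr(D=H1 | T=H0) = {alpha:.4f}  (type I error, size)")
print(f"Pr(D=H0 | T=H1) = {beta:.4f}  (type II error)")
print(f"Pr(D=H1 | T=H1) = {1 - beta:.4f}  (power)")
```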
Definition. A subset C of the sample space is a best critical region of size α for testing the hypothesis H0 against the hypothesis H1 if:

(i) $\Pr[(X_1, \dots, X_n) \in C \mid H_0] = \alpha$

and for every subset A of the sample space, whenever:

(ii) $\Pr[(X_1, \dots, X_n) \in A \mid H_0] = \alpha$

we also have:

(iii) $\Pr[(X_1, \dots, X_n) \in C \mid H_1] \ge \Pr[(X_1, \dots, X_n) \in A \mid H_1]$
Neyman-Pearson Theorem:

Suppose that for some k > 0:

1. $\frac{\Pr[x \mid H_1]}{\Pr[x \mid H_0]} \ge k$ for every sample point x in C

2. $\frac{\Pr[x \mid H_1]}{\Pr[x \mid H_0]} \le k$ for every sample point x outside C

3. $\Pr[C \mid H_0] = \alpha$

Then C is a best critical region of size α for the test of H0 vs. H1.
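A sketch of the theorem for a single observation, under hypothetical simple hypotheses H0: X ~ N(0, 1) vs. H1: X ~ N(5, 1). The likelihood ratio here is increasing in x, so the region where it exceeds k is a right tail {x > c}, and choosing c so that Pr[C | H0] = α yields the best critical region:

```python
# Hypothetical simple hypotheses H0: X ~ N(0, 1) vs. H1: X ~ N(5, 1),
# one observation. Here Pr[x|H1]/Pr[x|H0] = exp(5x - 12.5) is increasing
# in x, so {ratio >= k} is a right tail {x > c}.
from scipy.stats import norm

alpha = 0.05
c = norm(0, 1).ppf(1 - alpha)               # Pr[X > c | H0] = alpha
k = norm(5, 1).pdf(c) / norm(0, 1).pdf(c)   # likelihood ratio at the boundary

size = norm(0, 1).sf(c)    # condition 3: Pr[C | H0] = alpha
power = norm(5, 1).sf(c)   # Pr[C | H1], maximal among size-alpha regions
print(f"c = {c:.3f}, k = {k:.4f}, size = {size:.3f}, power = {power:.4f}")
```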
• When the null and alternative hypotheses are both Normal, the relation between the power of a statistical test (1 − β) and the sample size n is given by the formula

$$(1 - \beta) = \Phi\left( \frac{|\mu_{H_1} - \mu_{H_0}|}{\frac{1}{\sqrt{n}}\,\sigma_{H_1}} - \frac{|q_\alpha|\,\sigma_{H_0}}{\sigma_{H_1}} \right)$$

where Φ is the cdf of N(0,1), and $q_\alpha$ is the quantile determined by α.

• α fixes the type I error probability, but increasing n reduces the type II error probability.
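A sketch evaluating this formula; the instance (μ_H0 = 0, μ_H1 = 5, σ_H0 = 1, σ_H1 = 2, echoing Worlds A and B above) and the choices of α and n are illustrative assumptions:

```python
# Evaluating the power formula; all parameter values below are
# illustrative assumptions (they echo Worlds A and B above).
import numpy as np
from scipy.stats import norm

def power(mu0, mu1, sigma0, sigma1, n, alpha):
    q = norm.ppf(alpha)  # quantile determined by alpha
    return norm.cdf(abs(mu1 - mu0) / (sigma1 / np.sqrt(n))
                    - abs(q) * sigma0 / sigma1)

# alpha stays fixed; raising n drives the type II error probability down.
for n in (1, 2, 4):
    print(n, power(mu0=0, mu1=5, sigma0=1, sigma1=2, n=n, alpha=0.05))
```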
Question 2: Does the evidence suggest our world is not like World A?
World A
X in:          percent   added
[-.5, .5]        38        38
[-1, 1]          68        30
[-1.5, 1.5]      87        19
[-2, 2]          95         8
[-2.5, 2.5]      99         4
(-∞, ∞)         100         1
Sir Ronald Aylmer Fisher
Fisherian theory
Significance tests: their disjunctive logic, and p-values as evidence:
``[This very low p-value] is amply low enough to exclude at a high level of significance any theory involving a random distribution… The force with which such a conclusion is supported is logically that of the simple disjunction: Either an exceptionally rare chance has occurred, or the theory of random distribution is not true.'' (Fisher 1959, 39)
Fisherian theory
``The meaning of `H' is rejected at level α' is `Either an event of probability α has occurred, or H is false', and our disposition to disbelieve H arises from our disposition to disbelieve in events of small probability.'' (Barnard 1967, 32)
• Common philosophical simplification:
• Hypothesis space given qualitatively;
• H0 vs. ¬H0,
• Murderer was Professor Plum, Colonel Mustard, Miss Scarlett, or Mrs. Peacock
• More typical situation:
• Very strong structural assumptions
• Hypothesis space given by unknown numeric `parameters'
• Test uses:
• a transformation of the raw data,
• a probability distribution for this transformation (≠ the original distribution of interest)
Three Commonly Used Facts
• Assume {X1,…, Xn} is a collection of independent and identically distributed (i.i.d.) random variables.
• Assume also that the Xi's share a mean of μ and a standard deviation of σ.
Three Commonly Used Facts
For the mean estimator $\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$:

1. $E[\bar{X}] = E[X] = \mu$

2. $\mathrm{Var}[\bar{X}] = \frac{1}{n} \mathrm{Var}[X] = \frac{\sigma^2}{n}$
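A quick simulation check of facts 1 and 2, using an arbitrary i.i.d. source (an exponential distribution with μ = σ = 2, chosen purely for illustration):

```python
# Simulation check of facts 1 and 2; the exponential source
# (mu = sigma = 2) is an arbitrary illustrative choice.
import numpy as np

rng = np.random.default_rng(0)
n, reps = 25, 100_000
samples = rng.exponential(scale=2.0, size=(reps, n))
xbar = samples.mean(axis=1)

print(xbar.mean())  # ~ 2.0:  E[Xbar] = E[X] = mu
print(xbar.var())   # ~ 0.16: Var[Xbar] = Var[X]/n = 4/25
```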
Three Commonly Used Facts
The Central Limit Theorem. If {X1,…, Xn} are i.i.d. random variables from a distribution with mean μ and variance σ², then:

3. $\lim_{n \to \infty} \frac{1}{\sigma \sqrt{n}} \sum_{i=1}^{n} (X_i - \mu) \sim N(0,1)$

Equivalently:

$\lim_{n \to \infty} \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \sim N(0,1)$
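A sketch of the theorem in action: standardized means of a deliberately skewed (exponential) distribution approach N(0,1) as n grows; the distribution choice is an illustrative assumption:

```python
# Standardized means of a skewed (exponential, mu = sigma = 2)
# distribution: the empirical cdf approaches the N(0,1) cdf.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
mu, sigma, n, reps = 2.0, 2.0, 200, 50_000
xbar = rng.exponential(scale=2.0, size=(reps, n)).mean(axis=1)
z = (xbar - mu) / (sigma / np.sqrt(n))

for c in (-2, -1, 0, 1, 2):
    print(c, round((z <= c).mean(), 4), round(norm.cdf(c), 4))
```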
Examples
• Data: January 2012 CPS
• Sample: PhDs, working full time, age 28-34
• H0: mean income is 75k
Hyp.   Value       Probability
H0     -1.024022   0.3138
[Histogram of WAGES: 32 observations, wages ranging roughly $10,000-$150,000]

Series: WAGES
Sample: 1 13566 IF PRTAGE > 27 AND PRTAGE < 35 AND PEEDUCA = 46 AND PEHRACTT > 39
Observations: 32

Mean         68898.16
Median       66300.00
Maximum      149999.7
Minimum      10140.00
Std. Dev.    33707.49
Skewness     0.624378
Kurtosis     3.253256
Jarque-Bera  2.164708
Probability  0.338797
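The reported test statistic can be reproduced from these summary statistics alone; a sketch (the raw CPS microdata is not reproduced here):

```python
# One-sample t test recomputed from the reported summary statistics.
import numpy as np
from scipy.stats import t

n, mean, sd, mu0 = 32, 68898.16, 33707.49, 75000.0
t_stat = (mean - mu0) / (sd / np.sqrt(n))
p_value = 2 * t.sf(abs(t_stat), df=n - 1)  # two-sided

print(t_stat, p_value)  # ~ -1.024022, ~ 0.3138, matching the table
```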
Comments
• The background conditions (e.g., the i.i.d. condition behind the sample) are a clear example of `Quine-Duhem’ conditions.
• When background conditions are met, ``large samples’’ don’t make inferences ``more certain’’
• Multiple tests
• Monitoring or ``peeking'' at data, etc.
Point estimates and Confidence Intervals
• Many desiderata of an estimator:
• Consistent
• Maximum Likelihood
• Unbiased
• Sufficient
• Minimum variance
• Minimum MSE (mean squared error)
• (most) efficient
• By CLT, approximately:

$\frac{\bar{X} - \mu}{\sigma_{\bar{X}}} \sim N(0,1)$

• Thus:

$\Pr[-2 \le \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} \le 2] \approx .95$

• By algebra:

$\Pr[\bar{X} - 2\sigma_{\bar{X}} \le \mu \le \bar{X} + 2\sigma_{\bar{X}}] \approx .95$

• So:

$\Pr[\mu \in \bar{X} \pm \tfrac{1}{\sqrt{n}} 2\sigma] \approx .95$
Interpreting confidence intervals
• The only probabilistic component that determines what occurs is $\bar{X}$.
• Everything else is a constant.
• Simulations, examples (see the sketch below).
• Question: Why ``center'' the interval?

$\Pr[\mu \in \bar{X} \pm \tfrac{1}{\sqrt{n}} 2\sigma] \approx .95$
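A minimal coverage simulation of this interpretation: μ is fixed, the interval is what varies, and roughly 95% of repeated intervals cover μ. The population N(10, 3²) and n = 25 are illustrative assumptions:

```python
# mu is fixed; the interval Xbar +/- 2*sigma/sqrt(n) is what varies.
# Population N(10, 3^2) and n = 25 are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(2)
mu, sigma, n, reps = 10.0, 3.0, 25, 100_000
xbar = rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)
moe = 2 * sigma / np.sqrt(n)

covered = (xbar - moe <= mu) & (mu <= xbar + moe)
print(covered.mean())  # ~ 0.954
```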
Confidence Intervals
• $68,898.16 ± $12,152.85
• ``C.I. = mean ± m.o.e.''
• = ($56,745.32, $81,051.01)
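A sketch recomputing this interval from the summary statistics above; note the slide's margin of error corresponds to the exact t quantile for df = 31 (≈ 2.04) rather than the rule-of-thumb 2:

```python
# Margin of error from the reported summary statistics; the slide's
# m.o.e. matches the exact t quantile for df = 31 (~2.04), not 2.
import numpy as np
from scipy.stats import t

n, mean, sd = 32, 68898.16, 33707.49
moe = t.ppf(0.975, df=n - 1) * sd / np.sqrt(n)

print(moe)                     # ~ 12152.85
print(mean - moe, mean + moe)  # ~ (56745.31, 81051.01)
```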
Using similar logic, but different computing formulae, one can extend these methods to address further questions
e.g., for standard deviations, equality of means across groups, etc.
Equality of Means: BAs
Sex    Count   Mean       Std. Dev.
1      223     63619.54   31370.01
2      209     51395.43   25530.66
All    432     57705.56   29306.13

Value      Probability
4.424943   0.0000
Equality of Means: PhDs
Sex    Count   Mean       Std. Dev.
1      21      66452.71   36139.78
2      11      73566.76   29555.10
All    32      68898.16   33707.49

Value       Probability
-0.560745   0.5791
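Both tables can be reproduced from the reported summary statistics with a pooled-variance two-sample t test, which is consistent with the reported values; a sketch:

```python
# Pooled-variance two-sample t tests from the reported summary statistics.
from scipy.stats import ttest_ind_from_stats

# BAs: sex 1 vs. sex 2 -> statistic ~ 4.424943, p ~ 0.0000
print(ttest_ind_from_stats(63619.54, 31370.01, 223,
                           51395.43, 25530.66, 209, equal_var=True))

# PhDs: sex 1 vs. sex 2 -> statistic ~ -0.560745, p ~ 0.5791
print(ttest_ind_from_stats(66452.71, 36139.78, 21,
                           73566.76, 29555.10, 11, equal_var=True))
```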