hypothesis testing ii: z tests, t tests, and conﬁdence ... · same story form a hypothesis get...

Hypothesis Testing II: z tests, ttests, and confidence intervals

INFO-2301: Quantitative Reasoning 2Michael Paul and Jordan Boyd-GraberAPRIL 24, 2017

INFO-2301: Quantitative Reasoning 2 | Paul and Boyd-Graber Hypothesis Testing II: z tests, t tests, and confidence intervals | 1 of 4

Same Story

• Form a hypothesis

• Get some data

• Compute test statistic

• Reject hypothesis if p-value low enough


Important difference

• Assumes data come from normal distribution

• Can cause problems if diverges too much normal

• (How can you tell?) χ2 goodness of fit


Goodness of Fit

(But we’ll assume that data are normal)


Hypothesis Testing II: z tests


INFO-2301: Quantitative Reasoning 2 | Paul and Boyd-Graber Hypothesis Testing II: z tests | 1 of 5

z-test

• Suppose we have one observation from normal distribution with mean µand variance σ2

• Given an observation x we can compute the Z score as

Z =x −µσ

(1)

• H0: Our observation came from the normal distribution with µ=µ0

◦ Assume same known variance σ

◦ But we need to be more specific!


z-test

• Suppose we have one observation from normal distribution with mean µand variance σ2

• Given an observation x we can compute the Z score as

Z =x −µσ

(1)

• H0: Our observation came from the normal distribution with µ=µ0

◦ Assume same known variance σ◦ But we need to be more specific!


Two-tailed vs. one-tailed tests

• Two tail: Alternative µ 6=µ0

• One tail: Alternative µ>µ0


Multiple observations

If you observe x1 . . .xN from distribution with mean µ, test whether µ 6=µ0

• Compute test statistic

Z =x̄ −µ0

σ/p

N(2)

• If H0 were true, x̄ would be normal distribution with µ0 and variance σ2

N

• Now we can decide when to reject based on normal CDF


When to reject (two-tailed)


Hypothesis Testing II: ConfidenceIntervals


INFO-2301: Quantitative Reasoning 2 | Paul and Boyd-Graber Hypothesis Testing II: Confidence Intervals | 1 of 4

Conveying Uncertainty

• Suppose you make a measurement (e.g., a poll)

• Often useful to show all µ0 that could be excluded given themeasurement

• Fixed α

• Confidence interval


z-distribution Confidence

• Assume known variance σ

• You observe {x1 . . .xN}• Obtain mean x̄

• Recall test statisticx̄ −µσpN

(1)

• For what µ would we not reject the null?

• Note that x̄ and µ are symmetric


From the Distribution

• Set α= 0.05

• Solve for µ= NorCDF(0,0.025)

• x̄ ±1.96 σpn

• Some people just use 2.0

• More samples→ tighter bounds


Hypothesis Testing II: One Samplet Tests


INFO-2301: Quantitative Reasoning 2 | Paul and Boyd-Graber Hypothesis Testing II: One Sample t Tests | 1 of 6

What if you don’t know variance?

• t-test allows you to test hypothesisif you don’t know variance

• Sometimes called “small sampletest”: same as z test with enoughobservations

• William Gossett: check that yeastcontent matched Guiness’sstandard (but couldn’t publish)

• I.e., checking whether yeastcontent equal to µ0


t-test statistic

• Need to estimate variance

s2 =∑

i

(xi − x̄)2

N −1(1)

• n−1 removes bias (expected value is less than truth)

• Test statistic looks similar

T ≡x̄ −µ0

spN

(2)


Degrees of Freedom

• Like χ2, t-distribution parameterized by degrees of freedom

• ν= N −1 degress of freedom

PDF

Γ�

ν+12

�

pνπΓ

�

ν2

�

�

1 +x2

ν

�− ν+12

(3)

CDF

1

2+ xΓ

�

ν+ 1

2

�

2F1

�

12 , ν+1

2 ; 32 ;− x2

ν

�

pπνΓ

�

ν2

� (4)


Shape of t-distribution


Example

• Suppose observe {0,1,2,3,4,5}• Test whether µ 6= 1

• x̄ = 2.5, s2 = 3.5• T =

x̄−µ0Ç

s2N

= 2.5−1.0q

3.56

= 1.9640

• Double area under the at two tailed CDF


Example

• Suppose observe {0,1,2,3,4,5}• Test whether µ 6= 1• x̄ = 2.5, s2 = 3.5

• T =x̄−µ0Ç

s2N

= 2.5−1.0q

3.56

= 1.9640



Example

• Suppose observe {0,1,2,3,4,5}• Test whether µ 6= 1• x̄ = 2.5, s2 = 3.5• T =

x̄−µ0Ç

s2N

= 2.5−1.0q

3.56

= 1.9640



Hypothesis Testing II: Two Samplet Tests


INFO-2301: Quantitative Reasoning 2 | Paul and Boyd-Graber Hypothesis Testing II: Two Sample t Tests | 1 of 5

Comparing Two Samples

• Thus far, we’ve tested whether data is consistent with mean µ

• What if we want to test whether two samples are from the samedistribution

• Two-Sample t-test


Two-Sample (unpooled)

• Two samples X1 = {x1,1,x1,2 . . .x1,N1} and X2 = {x2,1,x2,2 . . .x2,N2

}• Doesn’t assume that variance is the same for both samples (unpooled)

• Compute mean and sample variance for sample 1 (x̄1,s21) and sample 2

(x̄2,s22)


Test Statistic

• T-statistic

T =(x1− x2)r

s21

N1+

s22

N2

(1)

• Plug into t-distrubtion with

ν=

�

s21

N1+

s22

N2

�2

�

s21

N1

�2

N1−1 +

�

s22

N2

�2

N2−1

(2)

• Intuition: Difference between x̄1 and x̄2 has variance that’s aninterpolation between the two samples

• Two-tailed vs. one-tailed distinction still applies


ν Example

s21 = 1,s2

2 = 2,n1 = 4,n2 = 8

ν=

�

s21

N1+

s22

N2

�2

�

s21

N1

�2

N1−1 +

�

s22

N2

�2

N2−1

(3)

(4)


ν Example

s21 = 1,s2

2 = 2,n1 = 4,n2 = 8

ν=

�

s21

N1+

s22

N2

�2

�

s21

N1

�2

N1−1 +

�

s22

N2

�2

N2−1

(3)

=

�

14 + 2

8

�2

13

�

14

�2+ 1

7

�

28

�2 (4)

(5)


ν Example

s21 = 1,s2

2 = 2,n1 = 4,n2 = 8

ν=

�

s21

N1+

s22

N2

�2

�

s21

N1

�2

N1−1 +

�

s22

N2

�2

N2−1

(3)

=

�

14 + 2

8

�2

13

�

14

�2+ 1

7

�

28

�2 (4)

=14

�

14

�2 � 13 + 1

7

�=

41021

=42

5(5)


Hypothesis Testing II


INFO-2301: Quantitative Reasoning 2 | Paul and Boyd-Graber Hypothesis Testing II | 1 of 4

Details

• Important to pre-register hypothesis

• Check assumptions of tests: distribution, randomness, response

• Many other tests that are possible


Problems with Statistical Tests’ Philosophy

• Failing to reject the null does not prove the null

• Cannot exclude data you don’t like

• Confusing statistical significance with practical significance (µ 6= 0 couldmean µ= 0.000001)

Often better:

• Present plot with confidence intervals, let people decide for themselves

• Create comprehensive model with explainable parameters, computeintervals for parameters


hypothesis testing ii: z tests, t tests, and conﬁdence ... · same story form a hypothesis get...

Documents