Econometrics: Hypothesis Testing (Inference) in Linear Regression Models
Yuyi LI, University of Manchester, 2012




Outline

- Finite-sample inference in linear models: the normality assumption
- Sampling distributions of the OLS estimators
- Hypothesis tests: a single restriction (the t test)
- Hypothesis tests: multiple linear restrictions (the F test)

Reading: Wooldridge Chapter 4, Appendix C6*

Mathematics and Statistics: standard normal, t and F distributions, statistical hypotheses, test statistics, rejection rules, significance level, critical values, p-value, degrees of freedom...


Finite-Sample Distributions of OLS Estimators: Normality in Finite Samples

Assumption MLR.6: Normality

Recall the MLR model

    y = β0 + β1x1 + β2x2 + ... + βk xk + u

We can test hypotheses about βj based on the sampling distribution of its estimator β̂j. This requires an extra assumption on u in finite samples.

Assumption MLR.6 (Normality): the error term u is independent of the regressors x1, x2, . . . , xk and is normally distributed: u ∼ N(0, σ²).

MLR.6 can be justified by the Central Limit Theorem as the sample size tends to infinity. MLR.1-MLR.6 are the Classical Linear Model (CLM) assumptions.


Normality of OLS Estimators

Normality of u carries over to the OLS estimators.

Theorem (Normality of OLS Estimators). Under the CLM assumptions (MLR.1-MLR.6),

    β̂j ∼ N(βj, Var(β̂j)),  j = 0, 1, 2, . . . , k

where Var(β̂j) = σ²[(X′X)⁻¹]jj and [(X′X)⁻¹]jj denotes the (j + 1)-th leading diagonal element of (X′X)⁻¹.

Standardising gives

    (β̂j − βj) / √Var(β̂j) ∼ N(0, 1)

which is infeasible in practice because Var(β̂j) depends on the unknown σ².


The t Distribution of OLS Estimators

Estimate Var(β̂j) by

    V̂ar(β̂j) = σ̂²[(X′X)⁻¹]jj = [se(β̂j)]², where σ̂² = SSR/(n − 1 − k)

Theorem (t Distribution of OLS Estimators). Under the CLM assumptions (MLR.1-MLR.6),

    (β̂j − βj) / se(β̂j) ∼ t_{n−1−k}, for j = 0, 1, . . . , k

where t_df denotes a t distribution with df degrees of freedom.

t_df approximates N(0, 1) when df is large (> 120). This can be used to test hypotheses about βj.


Finite-Sample Distributions of OLS Estimators: Hypothesis Testing (Inference)

Hypotheses

Consider the MLR model

    y = β0 + β1x1 + β2x2 + ... + βk xk + u

We test a null hypothesis H0 against an alternative hypothesis H1. Examples of hypotheses:

(i) H0 : β1 = 0 (a restriction on one parameter): "x1 has no impact on y"
(ii) H0 : β1 − β2 = 0 (a linear restriction on parameters): "x1 and x2 have equivalent effects on y"
(iii) H0 : β1 = β2 = βk = 0 (multiple linear restrictions): "x1, x2 and xk have no joint explanatory power on y"

Which test to use?
- t test: (i) a restriction on one parameter, or (ii) a single linear restriction on parameters
- F test: (iii) multiple linear restrictions (and also (i) and (ii))


Classical Testing Procedure

The classical procedure involves the following steps:
1. Select a significance level α: the probability of rejecting H0 when it is true. Common choices: 1%, 5%, 10%.
2. Choose H1: this may affect the rejection rule.
3. Calculate the test statistic.
4. Reject H0 if the calculated test statistic is in the rejection region; otherwise, do not reject H0.

Note, the selection of α is somewhat arbitrary. The rejection decision is made by comparing the test statistic with the corresponding critical value, and critical values depend on α, on H1, and on the distribution of the statistic under H0.


Hypothesis Testing Using the p-Value

The choice of α (e.g. 5%) is somewhat arbitrary. An alternative: given the test statistic, what is the smallest significance level at which H0 would be rejected? This is called the p-value of the test.

The p-value (p) is the probability of observing a test statistic (T) at least as extreme as the one we did, if H0 is true. E.g. for a t test of H0 : βj = 0 against H1 : βj ≠ 0,

    p = Pr(|T| > |t|)

where t is the observed t test statistic. Small p-values provide evidence against H0.

E.g. H0 : β1 = 0, H1 : β1 ≠ 0, T ∼ t930, observed value t = −2.137642 and p = 0.0328. Then H0 is rejected at the 5% level, but not at the 1% level. Precisely, H0 is rejected at all significance levels ≥ 3.28%.
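As a quick numeric check of the example above, the two-sided p-value can be computed directly. The sketch below is hedged: it approximates t930 by the standard normal (reasonable for df this large) and uses only the Python standard library.

```python
import math

# Two-sided p-value for an observed t statistic. With df = 930 the t
# distribution is very close to N(0, 1), so we approximate its CDF with
# the standard normal CDF built from math.erf. (The slide's exact t_930
# p-value is 0.0328; the normal approximation lands close to it.)
def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

t_obs = -2.137642                         # observed t statistic from the slide
p = 2.0 * (1.0 - normal_cdf(abs(t_obs)))  # Pr(|T| > |t|) under H0
print(round(p, 4))
```

An exact t-distribution CDF (e.g. from a statistics library) would reproduce 0.0328; the normal approximation differs only in the fourth decimal place.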


The t Test: A Single Parameter

t Test: A Restriction on One Parameter

The null hypothesis (for some j = 0, 1, 2, . . . , k):

    H0 : βj = c, where c is a constant

The t statistic:

    t = (β̂j − βj)/se(β̂j) = (β̂j − c)/se(β̂j) ∼ t_{n−1−k} under H0    (1)

since βj = c under H0.

Example: if β̂j = 0.8 and se(β̂j) = 0.2, the test statistic for H0 : βj = 1 is t = (0.8 − 1)/0.2 = −1.

Next, we introduce the special case where c = 0.
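The t statistic in (1) is a one-liner to compute; a minimal sketch using the example numbers above (β̂j = 0.8, se = 0.2, c = 1):

```python
# t statistic for H0: beta_j = c, as in equation (1).
def t_statistic(beta_hat, c, se):
    """Return (beta_hat - c) / se, which is t_{n-1-k}-distributed under H0."""
    return (beta_hat - c) / se

t = t_statistic(0.8, 1.0, 0.2)  # the slide's example
print(round(t, 6))  # -1.0
```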


Significant Regressor and Rejection Rule

If c = 0, then H0 : βj = 0: once x1, x2, . . . , xj−1, xj+1, . . . , xk have been controlled for, xj has no effect on the expected value of y (i.e. xj is statistically insignificant). The t statistic is

    t = (β̂j − βj)/se(β̂j) = (β̂j − 0)/se(β̂j) = β̂j/se(β̂j) ∼ t_{n−1−k} under H0

These t statistics are routinely reported in EViews.

Alternatives and rejection rules:

    Alternative:      H1 : βj > 0    H1 : βj < 0    H1 : βj ≠ 0
    Rejection rule:   t > t_cv       t < −t_cv      |t| > t_cv


Example

Test H0 : β1 = 1 against H1 : β1 < 1 with α = 0.01 in the model

    y = β0 + β1x1 + β2x2 + u

The estimated equation, with standard errors in brackets and n = 353, is

    ŷ = 0.2 + 0.8 x1 − 1.2 x2
      (0.03)  (0.2)   (0.05)

The t test statistic:

    t = (β̂1 − β1)/se(β̂1) ∼ t_{n−1−2} = t350 under H0
      = (0.8 − 1)/0.2 = −1

Critical value t_cv = 2.326 (from t350, one-tailed H1, α = 0.01).

Reject H0 if t < −t_cv. Here t = −1 > −2.326 = −t_cv, so we fail to reject H0.
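The one-tailed decision in this example can be sketched in a few lines; the critical value 2.326 is the slide's value for t350 at α = 0.01:

```python
# One-tailed t test: H0: beta1 = 1 vs H1: beta1 < 1 at alpha = 0.01.
beta1_hat, se_beta1, c = 0.8, 0.2, 1.0
t = (beta1_hat - c) / se_beta1      # observed t statistic, = -1 here
t_cv = 2.326                        # critical value from t_350 (slide value)
reject = t < -t_cv                  # left-tail rejection rule for H1: beta1 < 1
print(reject)  # False: fail to reject H0
```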


The t Test: A Single Linear Combination of Parameters

A Linear Restriction on Parameters

H0 may involve several parameters but only one restriction: e.g. H0 : β1 = β2 or H0 : β1 + β2 = β3.

Example: Cobb-Douglas production function

    Yi = A Li^β1 Ki^β2 Ui

where Y = production, A = technology, L = labour, K = capital and U = unobservables. Taking logs:

    log(Yi) = log(A) + β1 log(Li) + β2 log(Ki) + ui

Constant returns to scale corresponds to H0 : β1 + β2 = 1.

Note, t tests can be used to test such an H0, but the statistics are not easily computed due to the complexity of the corresponding standard errors.


Equivalence of Educations

Consider the population model

    log(wage) = β0 + β1x1 + β2x2 + β3x3 + u    (2)

where x1 = the number of years in junior college, x2 = the number of years in university, and x3 = the number of months in the workforce.

H0 : β1 = β2. What does it mean? H1 : β1 < β2. What does this imply?

The hypotheses can be rewritten as H0 : β1 − β2 = 0, H1 : β1 − β2 < 0.

The following slides are based on Wooldridge Section 4.4 (page 140- ), which mainly details two methods of obtaining the standard error of a linear combination of parameter estimators.


A t Statistic and a Problem

A t statistic can be constructed to test H0:

    t = (β̂1 − β̂2) / se(β̂1 − β̂2)    (3)

where the estimates β̂1 and β̂2 are easily obtained. Problem: how to get se(β̂1 − β̂2)?

Estimate the variance of (β̂1 − β̂2):

    Var(β̂1 − β̂2) = Var(β̂1) + Var(β̂2) − 2 Cov(β̂1, β̂2)

so that

    se(β̂1 − β̂2) = √( V̂ar(β̂1) + V̂ar(β̂2) − 2 Ĉov(β̂1, β̂2) )
                 = √( [se(β̂1)]² + [se(β̂2)]² − 2 s12 )

where a hat indicates an unbiased estimator and s12 = Ĉov(β̂1, β̂2).


The Problem and Solutions

Note, se(β̂1) and se(β̂2) are computed by any software, but s12 is not:
=⇒ se(β̂1 − β̂2) on the previous page is unknown
=⇒ statistic (3) is not computable
=⇒ testing H0 is not possible?

Solution: two methods.
- Method 1: compute s12 by estimating the covariance matrix of β̂
- Method 2: rewrite model (2) by introducing a new parameter

These two methods are explained in turn.


Method One (1/2)

Write model (2) in matrix form (revise notes L3):

    y = Xβ + u

where y = [y1, y2, · · · , yn]′ with y ≡ log(wage), X = [x1′, x2′, · · · , xn′]′ with xi = [1, xi1, xi2, xi3] for i = 1, 2, . . . , n, β = [β0, β1, β2, β3]′ and u = [u1, u2, · · · , un]′.

The OLS estimator is β̂ = (X′X)⁻¹X′y. Its covariance matrix is Var(β̂|X) = σ²(X′X)⁻¹, and an unbiased estimator of it is

    V̂ar(β̂|X) = σ̂²(X′X)⁻¹ = [SSR/(n − 1 − k)] (X′X)⁻¹

What is the structure of V̂ar(β̂|X)?


Method One (2/2)

    V̂ar(β̂|X) =
    [ [se(β̂0)]²   s01          s02          s03         ]
    [ s10          [se(β̂1)]²   s12          s13         ]
    [ s20          s21          [se(β̂2)]²   s23         ]
    [ s30          s31          s32          [se(β̂3)]²  ]

If we compute V̂ar(β̂|X), it is straightforward to obtain se(β̂1), se(β̂2) and s12:
=⇒ se(β̂1 − β̂2) can be obtained
=⇒ statistic (3) can be computed
=⇒ testing H0 is possible!

β̂ is 4 × 1 here. How to generalise to (k + 1) × 1?
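Given the relevant entries of V̂ar(β̂|X), Method 1 reduces to a one-line formula. The numbers below are hypothetical, purely for illustration, since the slide gives no numeric values:

```python
import math

# Method 1: standard error of (beta1_hat - beta2_hat) from the entries of
# the estimated covariance matrix Var(beta_hat | X).
def se_of_difference(se1, se2, s12):
    """sqrt( se(b1)^2 + se(b2)^2 - 2*Cov(b1, b2) )."""
    return math.sqrt(se1 ** 2 + se2 ** 2 - 2.0 * s12)

# hypothetical values, for illustration only
se1, se2, s12 = 0.007, 0.011, 0.00002
print(round(se_of_difference(se1, se2, s12), 6))
```

Note that ignoring s12 (i.e. pretending the estimators are uncorrelated) would give the wrong standard error whenever Cov(β̂1, β̂2) ≠ 0.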


Method Two (1/2)

Define θ = β1 − β2. Then
=⇒ (i) H0 : θ = 0 is the same as H0 : β1 − β2 = 0
=⇒ (ii) β1 = θ + β2
=⇒ (iii) statistic (3) becomes t = θ̂/se(θ̂)

Note: computing (iii) to test (i) is now the solution! How to compute (iii)? Rewrite model (2):

    y = β0 + β1x1 + β2x2 + β3x3 + u
      = β0 + (θ + β2)x1 + β2x2 + β3x3 + u
      = β0 + θx1 + β2x1 + β2x2 + β3x3 + u
      = β0 + θx1 + β2(x1 + x2) + β3x3 + u    (4)

Note, models (2) and (4) are equivalent.


Method Two (2/2)

Model (4) can be estimated by regressing y on an intercept, x1, (x1 + x2) and x3:
=⇒ this yields θ̂ and se(θ̂) directly
=⇒ the statistic t = θ̂/se(θ̂) in (iii) can be computed
=⇒ testing H0 : θ = 0 in (i) is possible!

Either method enables a t test on a hypothesis involving a single linear restriction on several parameters.

Note, the t test is used to test a single linear restriction involving one or more parameters. If there are multiple linear restrictions, the F test can be used.
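A quick numeric sanity check of the algebra behind (4): with θ = β1 − β2, the terms β1x1 + β2x2 and θx1 + β2(x1 + x2) are identical for any data, so (2) and (4) are the same model. The coefficients below are hypothetical:

```python
# Check that beta1*x1 + beta2*x2 == theta*x1 + beta2*(x1 + x2) when
# theta = beta1 - beta2, i.e. models (2) and (4) are equivalent.
b1, b2 = 0.5, 0.3                 # hypothetical coefficients
theta = b1 - b2
for x1, x2 in [(1.0, 2.0), (3.5, -1.0), (0.0, 4.0)]:
    lhs = b1 * x1 + b2 * x2
    rhs = theta * x1 + b2 * (x1 + x2)
    assert abs(lhs - rhs) < 1e-12  # identical up to floating-point noise
print("models (2) and (4) agree on all test points")
```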


The F Test: Multiple Linear Restrictions

Multiple Linear Restrictions

Consider a model

    y = β0 + β1x1 + β2x2 + β3x3 + β4x4 + u    (5)

Examples of hypotheses with q linear restrictions:
- H0 : β1 = β3 = 0 (exclusion restrictions)
- H0 : β1 = 0, β2 = β3 (two restrictions)
- H0 : β1 = β2 = β3 = β4 = 0 (overall significance of the regression)

What is q under each of the above H0?

The alternative hypothesis is H1 : H0 does not hold. Note, H0 is violated as long as one restriction fails.

To construct the F test, we need to identify the restricted model and the unrestricted model.


Restricted and Unrestricted Models

Restricted model: (5) with the q restrictions under H0 imposed. For example, the restricted models are:
- H0 : β1 = β3 = 0: (5) =⇒ y = β0 + β2x2 + β4x4 + u
- H0 : β1 = 0, β2 = β3: (5) =⇒ y = β0 + β2(x2 + x3) + β4x4 + u
- H0 : β1 = β2 = β3 = β4 = 0: (5) =⇒ y = β0 + u

Unrestricted model: (5) without any restrictions.

Estimating the restricted model yields SSRr (the sum of squared residuals in the restricted model); estimating the unrestricted model yields SSRur (the sum of squared residuals in the unrestricted model). Always, SSRur ≤ SSRr. OLS can be used to estimate the restricted model.


The F Statistic and Rejection Rule

The F statistic:

    F = [(SSRr − SSRur)/q] / [SSRur/(n − 1 − k)] ∼ F_{q, n−1−k} under H0

where q = number of restrictions under H0 (numerator df), n − 1 − k = unrestricted model df (denominator df), F_{q, n−1−k} = an F distribution with q and (n − 1 − k) df, and df = degrees of freedom.

F ≥ 0. Why? (SSRur ≤ SSRr always, so the numerator is non-negative.)

Reject H0 if F > F_cv (the critical value from F_{q, n−1−k}); fail to reject H0 otherwise.


An Example for the F Test (1/2)

Consider the wage equation model

    lw = β0 + β1 ed + β2 ex + β3 te + β4 hr + u    (6)

where lw = log(monthly wage), ed = years in education, ex = years of work experience, te = years in the current employment, hr = average weekly working hours, and u denotes the random error term.

Are te and hr irrelevant to wage, so that they can be dropped from (6)? Test H0 : β3 = β4 = 0 against H1 : β3 ≠ 0 and/or β4 ≠ 0.

OLS regression of the (unrestricted) model (6) yields SSRur. The restricted model (under H0; 2 restrictions, q = 2) is lw = β0 + β1 ed + β2 ex + u, which yields SSRr.


An Example for the F Test (2/2)

In our case, n = 935, k = 4, q = 2, SSRur = 139.28, SSRr = 143.98. The F statistic is

    F = [(SSRr − SSRur)/q] / [SSRur/(n − 1 − k)]
      = [(143.98 − 139.28)/2] / [139.28/(935 − 1 − 4)]
      ≈ 15.70

Critical value F_cv = 4.61 (from F_{2,930}, α = 0.01). Since F ≈ 15.70 > F_cv, H0 is rejected at the 1% significance level.

Statistically, years in the current employment (te) and average weekly working hours (hr) have some joint effect on wage.
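The calculation above can be reproduced in a few lines of code (a sketch using the slide's numbers):

```python
# F statistic from restricted/unrestricted sums of squared residuals.
def f_statistic(ssr_r, ssr_ur, q, n, k):
    """F = [(SSR_r - SSR_ur)/q] / [SSR_ur/(n - 1 - k)], ~ F_{q,n-1-k} under H0."""
    return ((ssr_r - ssr_ur) / q) / (ssr_ur / (n - 1 - k))

F = f_statistic(ssr_r=143.98, ssr_ur=139.28, q=2, n=935, k=4)
print(round(F, 2))   # about 15.7, matching the slide
F_cv = 4.61          # critical value from F_{2,930} at alpha = 0.01 (slide value)
print(F > F_cv)      # True: reject H0 at the 1% level
```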


The R-Squared Form of the F Statistic

The F statistic can also be computed from the R² values of the unrestricted and restricted models:

    F = [(R²_ur − R²_r)/q] / [(1 − R²_ur)/(n − 1 − k)]

where R²_ur is the R² from the unrestricted model and R²_r is the R² from the restricted model.

Note, this formula works only if the unrestricted and restricted models have the same dependent variable. For example, the R² form of the F statistic is invalid for testing the general linear hypothesis H0 : β0 = 0, β1 − β2 = 1, because imposing these restrictions changes the dependent variable of the restricted model.
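The R-squared form is equally direct to compute; the values below are hypothetical, for illustration only:

```python
# R-squared form of the F statistic (valid only when the restricted and
# unrestricted models share the same dependent variable).
def f_statistic_r2(r2_ur, r2_r, q, n, k):
    """F = [(R2_ur - R2_r)/q] / [(1 - R2_ur)/(n - 1 - k)]."""
    return ((r2_ur - r2_r) / q) / ((1.0 - r2_ur) / (n - 1 - k))

# hypothetical values for illustration
F = f_statistic_r2(r2_ur=0.30, r2_r=0.25, q=2, n=100, k=4)
print(round(F, 2))
```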


Overall Significance of the Model

This is an F test for the joint significance of all the regressors (excluding the constant). Consider the (unrestricted) model

    y = β0 + β1x1 + β2x2 + ... + βk xk + u

Test H0 : β1 = β2 = . . . = βk = 0 against H1 : βj ≠ 0 for at least one j = 1, 2, . . . , k.

Restricted model: y = β0 + e. Note, R²_r = 0. Why? (The restricted fitted value is just the sample mean of y, which explains none of its variation.) With q = k, the R² form of the F statistic becomes

    F = [(R²_ur − R²_r)/q] / [(1 − R²_ur)/(n − 1 − k)] = (R²_ur/k) / [(1 − R²_ur)/(n − 1 − k)]

This statistic is routinely reported in EViews and most statistical software. Can you find it?
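With R²_r = 0 and q = k, the overall-significance statistic collapses to a function of R²_ur alone. A sketch with a hypothetical R² value:

```python
# Overall-significance F statistic: the restricted model is y = beta0 + e,
# so R2_r = 0 and q = k.
def overall_f(r2, n, k):
    """F = (R2/k) / ((1 - R2)/(n - 1 - k))."""
    return (r2 / k) / ((1.0 - r2) / (n - 1 - k))

print(round(overall_f(r2=0.15, n=935, k=4), 2))  # hypothetical R-squared
```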


Appendix: Deriving the R-Squared Form of the F Statistic

R-Squared Form of the F Statistic. By definition,

    R² = SSE/SST = 1 − SSR/SST  =⇒  SSR = SST(1 − R²)

This means SSRur = SST(1 − R²_ur) and SSRr = SST(1 − R²_r) in the unrestricted and restricted models, where both share the same SST because they have the same dependent variable. The F statistic then becomes

    F = [(SSRr − SSRur)/q] / [SSRur/(n − 1 − k)]
      = {[SST(1 − R²_r) − SST(1 − R²_ur)]/q} / {[SST(1 − R²_ur)]/(n − 1 − k)}
      = {[(1 − R²_r) − (1 − R²_ur)]/q} / [(1 − R²_ur)/(n − 1 − k)]
      = [(R²_ur − R²_r)/q] / [(1 − R²_ur)/(n − 1 − k)]
