Source: lipas.uwasa.fi/~sjp/Teaching/ecmii/lectures/ecmiic1.pdf

Econometrics II

Seppo Pynnönen

Department of Mathematics and Statistics, University of Vaasa, Finland

Spring 2018

Seppo Pynnönen Econometrics II


Part I

Introduction

As of Jan 8, 2018


1 Introduction
  Econometrics
  Types of Economic Data
    Cross-sectional
    Time Series Data
    Pooled Cross-sections
    Panel Data
  The linear regression model
    Regression statistics
    Inference
    Nonlinear hypotheses
  Maximum Likelihood
    Maximum Likelihood Estimation
    Properties of Maximum Likelihood Estimators
    Likelihood Ratio, Wald, and Lagrange Multiplier tests


Econometrics

Econometrics is a discipline within statistics that specializes in using and developing mathematical and statistical tools for the empirical estimation of economic relationships, testing economic theories, making economic predictions, and evaluating government and business policy.

Data: Nonexperimental (observational)

Major tool: Regression analysis (in a wide sense)


Types of Economic Data

(a) Cross-sectional

Data collected at a given point in time, e.g. a sample of households or firms, from each of which a number of variables (turnover, operating margin, market value of shares, etc.) are measured.

From the econometric point of view it is important that the observations constitute a random sample from the underlying population.


(b) Time Series

A time series consists of observations on one or more variables over time. Typical examples are daily share prices, interest rates, and CPI values.

An important additional feature over cross-sectional data is the ordering of the observations, which may convey important information.

A further feature is the data frequency, which may require special attention.


(c) Pooled Cross-sections

Pooled cross-sections have both time series and cross-section features.

For example, a number of firms are randomly selected, say in 1990, and another sample is selected in 2000.

If the same features are measured in both samples, combining the two years forms a pooled cross-section data set.

Pooled cross-section data are analyzed in much the same way as ordinary cross-section data.

However, it may be important to pay special attention to the fact that there are 10 years between the two samples.

Usually the interest is in whether there are important changes between the time points. The statistical tools are usually the same as those used for analyzing differences between two independently sampled populations.


(d) Panel data

Panel data (longitudinal data) consist of time series data for the same cross-section units over time.

They allow the analysis of much richer dependencies than pure cross-section data.


Example 1

Job training data from Holzer et al. (1993), Are training subsidies effective? The Michigan experience, Industrial and Labor Relations Review 46, 625–636.

Excerpt from the data:

year  fcode   employ  sales     avgsal
1987  410032  100     4.70E+07  35000
1988  410032  131     4.30E+07  37000
1989  410032  123     4.90E+07  39000
1987  410440  12      1560000   10500
1988  410440  13      1970000   11000
1989  410440  14      2350000   11500
1987  410495  20      750000    17680
1988  410495  25      110000    18720
1989  410495  24      950000    19760
1987  410500  200     2.37E+07  13729
1988  410500  155     1.97E+07  14287
1989  410500  80      2.60E+07  15758
1987  410501  .       6000000   .
1988  410501  .       8000000   .
1989  410501  .       1.00E+07  .
etc.
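The excerpt can be read both as a set of cross-sections and as a set of time series. The following Python sketch is illustrative only: the names `rows`, `panel`, `cross_section_1988`, and `series_410032` are hypothetical, and only three of the firms in the excerpt are typed in. It indexes the observations by firm and year.

```python
# Rows copied from the excerpt: (year, fcode, employ, sales, avgsal).
rows = [
    (1987, 410032, 100, 4.70e7, 35000),
    (1988, 410032, 131, 4.30e7, 37000),
    (1989, 410032, 123, 4.90e7, 39000),
    (1987, 410440, 12, 1560000, 10500),
    (1988, 410440, 13, 1970000, 11000),
    (1989, 410440, 14, 2350000, 11500),
    (1987, 410500, 200, 2.37e7, 13729),
    (1988, 410500, 155, 1.97e7, 14287),
    (1989, 410500, 80, 2.60e7, 15758),
]

# A panel is indexed by (unit, time): the same firms are followed over the years.
panel = {(fcode, year): {"employ": employ, "sales": sales, "avgsal": avgsal}
         for year, fcode, employ, sales, avgsal in rows}

# Fixing the year gives a cross-section over firms ...
cross_section_1988 = {fcode: rec["employ"]
                      for (fcode, year), rec in panel.items() if year == 1988}

# ... and fixing the firm gives a time series.
series_410032 = [panel[(410032, year)]["avgsal"] for year in (1987, 1988, 1989)]

print(cross_section_1988)
print(series_410032)
```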


The linear regression model

The linear regression model is the single most useful tool in econometrics.

Assumption: each observation $i$, $i = 1, \ldots, n$, is generated by the underlying process

$$y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_k x_{ik} + u_i, \tag{1}$$

where $y_i$ is the dependent or explained variable, $x_{i1}, x_{i2}, \ldots, x_{ik}$ are the independent or explanatory variables, $u_i$ is the error term, and $\beta_0, \beta_1, \ldots, \beta_k$ are the regression coefficients ($\beta_1, \ldots, \beta_k$ are the slope coefficients and $\beta_0$ is called the intercept or constant term).


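A notational convenience:

$$y_i = \mathbf{x}_i'\boldsymbol{\beta} + u_i, \tag{2}$$

where $\mathbf{x}_i = (1, x_{i1}, \ldots, x_{ik})'$ and $\boldsymbol{\beta} = (\beta_0, \beta_1, \ldots, \beta_k)'$ are $(k+1)$-dimensional column vectors.

Stacking the x-observation vectors into an $n \times (k+1)$ matrix

$$\mathbf{X} =
\begin{pmatrix} \mathbf{x}_1' \\ \mathbf{x}_2' \\ \vdots \\ \mathbf{x}_i' \\ \vdots \\ \mathbf{x}_n' \end{pmatrix}
=
\begin{pmatrix}
1 & x_{11} & x_{12} & \cdots & x_{1k} \\
1 & x_{21} & x_{22} & \cdots & x_{2k} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & x_{i1} & x_{i2} & \cdots & x_{ik} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & x_{n1} & x_{n2} & \cdots & x_{nk}
\end{pmatrix} \tag{3}$$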


we can write

$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{u}, \tag{4}$$

where $\mathbf{y} = (y_1, \ldots, y_n)'$ and $\mathbf{u} = (u_1, \ldots, u_n)'$.


Example 2

In Example 1 the interest is in whether a grant for employee education decreases product failures. The model to be estimated is assumed to be

$$\log(\text{scrap}) = \beta_0 + \beta_1\,\text{grant} + \beta_2\,\text{grant}_{-1} + u, \tag{5}$$

where scrap is the scrap rate (per 100 items), grant = 1 if the firm received a grant in year t and grant = 0 otherwise, and grant$_{-1}$ = 1 if the firm received a grant in the previous year and grant$_{-1}$ = 0 otherwise.

The above model does not take into account that the data consist of measurements from three consecutive years on the same firms (i.e., panel data).


Ordinary Least Squares (OLS) estimation yields (Stata):

regress lscrap grant grant_1

   Source |       SS       df       MS         Number of obs =     162
----------+------------------------------      F(  2,   159) =    0.30
    Model | 1.34805124     2   .67402562       Prob > F      =  0.7395
 Residual | 354.397022   159  2.22891209       R-squared     =  0.0038
----------+------------------------------      Adj R-squared = -0.0087
    Total | 355.745073   161  2.20959673       Root MSE      =   1.493

---------------------------------------------------
   lscrap |     Coef.   Std. Err.      t    P>|t|
----------+----------------------------------------
    grant |  .0543534    .310501    0.18    0.861
  grant_1 | -.2652102     .36995   -0.72    0.474
    _cons |  .4150563    .139828    2.97    0.003
---------------------------------------------------

Neither of the coefficients is statistically significant, and grant even has a positive sign, although it is close to zero.

When dealing with panel estimation later we will see that the situation can be improved.


The problem with the above estimation is that the OLS assumptions are not met.

The OLS assumptions are:

(i) $E[u_i \mid \mathbf{X}] = 0$ for all $i$,

(ii) $\mathrm{var}[u_i \mid \mathbf{X}] = \sigma_u^2$ for all $i$,

(iii) $\mathrm{cov}[u_i, u_j \mid \mathbf{X}] = 0$ for all $i \neq j$,

(iv) $\mathbf{X}$ is an $n \times (k+1)$ matrix with rank $k+1$.

Remark 1.1: Assumption (i) implies

$$\mathrm{cov}[u_i, \mathbf{X}] = 0, \tag{6}$$

which is crucial in OLS estimation.


In panel data the homoscedasticity assumption (ii) is typically violated.

In order to capture the implied heteroscedasticity, the error term is generally modeled as

$$u_{it} = \alpha_i + \delta_t + v_{it}, \tag{7}$$

in which $v_{it}$ satisfies assumption (ii), $\alpha_i$ captures the (unobserved, time-invariant) individual effects, and $\delta_t$ captures the (unobserved, individual-invariant) time effects.

We return to these matters in more detail later.


Under assumptions (i)–(iv) the OLS estimator

$$\hat{\boldsymbol{\beta}} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y} \tag{8}$$

is the Best Linear Unbiased Estimator (BLUE) of the regression coefficients $\boldsymbol{\beta}$ of the linear model in equation (4).

This is known as the Gauss-Markov theorem.
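As a minimal sketch of formula (8), the following Python code solves the normal equations $\mathbf{X}'\mathbf{X}\hat{\boldsymbol{\beta}} = \mathbf{X}'\mathbf{y}$ using only the standard library. The data and the helper names (`transpose`, `matmul`, `solve`, `ols`) are made up for illustration and are not part of the lecture material. In practice a statistical package would be used instead.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def solve(a, b):
    """Solve a z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def ols(x_rows, y):
    """x_rows holds the regressors without the constant; returns beta-hat."""
    X = [[1.0] + list(row) for row in x_rows]   # add the column of ones
    Xt = transpose(X)
    XtX = matmul(Xt, X)                          # X'X
    Xty = [sum(xi * yi for xi, yi in zip(col, y)) for col in Xt]  # X'y
    return solve(XtX, Xty)                       # (X'X)^{-1} X'y

# Hypothetical toy data: one regressor, four observations.
x_rows = [[0.0], [1.0], [2.0], [3.0]]
y = [1.1, 2.9, 5.2, 6.8]
beta_hat = ols(x_rows, y)
print(beta_hat)
```

Solving the linear system directly avoids forming the inverse of $\mathbf{X}'\mathbf{X}$ explicitly, which is the usual numerical choice.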


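Sum of Squares (SS) identity:

$$SST = SSR + SSE, \tag{9}$$

where

Total: $$SST = \sum_{i=1}^{n} (y_i - \bar{y})^2, \tag{10}$$

Model: $$SSR = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2, \tag{11}$$

Residual: $$SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2, \tag{12}$$

with $\hat{y}_i = \mathbf{x}_i'\hat{\boldsymbol{\beta}}$, and $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$, the sample mean.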
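The identity (9) can be checked numerically. The sketch below uses hypothetical toy data and the closed-form OLS solution for a single regressor plus intercept. The identity relies on the model containing a constant term.

```python
# Hypothetical toy data (not from the lecture).
x = [0.0, 1.0, 2.0, 3.0]
y = [1.1, 2.9, 5.2, 6.8]
n = len(y)

x_bar = sum(x) / n
y_bar = sum(y) / n

# OLS with one regressor and an intercept (closed form).
b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
      / sum((xi - x_bar) ** 2 for xi in x))
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]

sst = sum((yi - y_bar) ** 2 for yi in y)               # total, eq. (10)
ssr = sum((fi - y_bar) ** 2 for fi in y_hat)           # model, eq. (11)
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # residual, eq. (12)

print(sst, ssr + sse)
```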


Goodness of fit: R-square, $R^2$

$$R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}. \tag{13}$$

Adjusted R-square (Adj R-square), $\bar{R}^2$

$$\bar{R}^2 = 1 - \frac{SSE/(n-k-1)}{SST/(n-1)} = 1 - \frac{s_u^2}{s_y^2}, \tag{14}$$

where

$$s_u^2 = \frac{1}{n-k-1}\sum_{i=1}^{n} \hat{u}_i^2 = \frac{SSE}{n-k-1}, \tag{15}$$

with residuals $\hat{u}_i = y_i - \hat{y}_i$, is an estimator of the variance $\sigma_u^2 = \mathrm{var}[u_i]$ of the error term ($s_u = \sqrt{s_u^2}$ is the "Root MSE" in the Stata output), and

$$s_y^2 = \frac{1}{n-1}\sum_{i=1}^{n} (y_i - \bar{y})^2 \tag{16}$$

is the sample variance of $y$.
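As a check of formulas (13)–(15), the sums of squares printed in the Stata output of Example 2 reproduce the reported R-squared, Adj R-squared, and Root MSE. The variable names are chosen here for illustration.

```python
import math

# Sums of squares and sample size taken from the Stata output of Example 2.
ss_model, ss_resid, ss_total = 1.34805124, 354.397022, 355.745073
n, k = 162, 2

r2 = ss_model / ss_total              # eq. (13)
s2_u = ss_resid / (n - k - 1)         # eq. (15)
s2_y = ss_total / (n - 1)             # eq. (16)
adj_r2 = 1 - s2_u / s2_y              # eq. (14)
root_mse = math.sqrt(s2_u)

print(round(r2, 4), round(adj_r2, 4), round(root_mse, 3))
```

The negative adjusted R-square arises because $s_u^2$ exceeds $s_y^2$: the degrees-of-freedom penalty outweighs the small reduction in the residual sum of squares.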


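Assumption

(v) $\mathbf{u} \sim N(\mathbf{0}, \sigma_u^2 \mathbf{I})$,

where $\mathbf{I}$ is an $n \times n$ identity matrix.

Individual coefficient restrictions:

Hypotheses are of the form

$$H_0 : \beta_j = \beta_j^*, \tag{17}$$

where $\beta_j^*$ is a given constant.

t-statistic:

$$t = \frac{\hat{\beta}_j - \beta_j^*}{\mathrm{s.e.}(\hat{\beta}_j)}, \tag{18}$$

where

$$\mathrm{s.e.}(\hat{\beta}_j) = s_u \sqrt{(\mathbf{X}'\mathbf{X})^{jj}}, \tag{19}$$

and $(\mathbf{X}'\mathbf{X})^{jj}$ is the $j$th diagonal element of $(\mathbf{X}'\mathbf{X})^{-1}$. Under the null hypothesis and assumptions (i)–(v), $t$ follows the t-distribution with $n - k - 1$ degrees of freedom.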
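The t-values in the Stata output of Example 2 are instances of (18) with $\beta_j^* = 0$. A short check, using the printed coefficients and standard errors (the dictionary layout is illustrative only):

```python
# (coefficient, standard error) pairs from the Stata output of Example 2.
estimates = {
    "grant": (0.0543534, 0.310501),
    "grant_1": (-0.2652102, 0.36995),
    "_cons": (0.4150563, 0.139828),
}

beta_star = 0.0  # hypothesized value under H0
t_stats = {name: (coef - beta_star) / se for name, (coef, se) in estimates.items()}

for name, t in t_stats.items():
    print(f"{name:8s} t = {t:6.2f}")
```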


F-test:

The overall hypothesis that none of the explanatory variables influences the y-variable, i.e.,

$$H_0 : \beta_1 = \beta_2 = \cdots = \beta_k = 0, \tag{20}$$

is tested by the F-statistic

$$F = \frac{SSR/k}{SSE/(n-k-1)}, \tag{21}$$

which is F-distributed with degrees of freedom $f_1 = k$ and $f_2 = n - k - 1$ if the null hypothesis is true.
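The overall F-statistic is the ratio of the model mean square to the residual mean square. Using the sums of squares from the Stata output of Example 2 (variable names are illustrative), the reported value F(2, 159) = 0.30 is recovered:

```python
# Model and residual sums of squares from the Stata output of Example 2.
ss_model, ss_resid = 1.34805124, 354.397022
n, k = 162, 2

ms_model = ss_model / k              # SSR / k
ms_resid = ss_resid / (n - k - 1)    # SSE / (n - k - 1)
f_stat = ms_model / ms_resid

print(round(f_stat, 2))
```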


General (linear) restrictions:

$$H_0 : \mathbf{R}\boldsymbol{\beta} = \mathbf{q}, \tag{22}$$

where $\mathbf{R}$ is a fixed $m \times (k+1)$ matrix and $\mathbf{q}$ is a fixed $m$-vector.

Here $m$ is the number of independent linear restrictions imposed on the coefficients.

The alternative hypothesis is

$$H_1 : \mathbf{R}\boldsymbol{\beta} \neq \mathbf{q}. \tag{23}$$

The null hypothesis in (22) can be tested with an F-statistic that under the null hypothesis has the F-distribution with degrees of freedom $f_1 = m$ and $f_2 = n - k - 1$.


Example 3

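Consider the model

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + u. \tag{24}$$

In terms of the general linear hypothesis (22), testing a single coefficient, e.g.,

$$H_0 : \beta_1 = 0, \tag{25}$$

is obtained by selecting

$$\mathbf{R} = (0 \;\; 1 \;\; 0 \;\; 0 \;\; 0) \quad\text{and}\quad q = 0. \tag{26}$$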


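The null hypothesis in (20), i.e.,

$$H_0 : \beta_1 = \beta_2 = \beta_3 = \beta_4 = 0, \tag{27}$$

is obtained by selecting

$$\mathbf{R} =
\begin{pmatrix}
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1
\end{pmatrix}
\quad\text{and}\quad
\mathbf{q} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}. \tag{28}$$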


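A null hypothesis of the form

$$H_0 : \beta_1 + \beta_2 = 1, \;\; \beta_3 = \beta_4 \tag{29}$$

corresponds to

$$\mathbf{R} =
\begin{pmatrix}
0 & 1 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & -1
\end{pmatrix}
\quad\text{and}\quad
\mathbf{q} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}. \tag{30}$$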
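To see how R and q encode the restrictions in (29), the following sketch evaluates $\mathbf{R}\boldsymbol{\beta} - \mathbf{q}$ for two coefficient vectors. Both vectors and the helper `restriction_gap` are hypothetical, chosen only so that one satisfies the null hypothesis and the other does not.

```python
# R and q from equation (30); columns of R correspond to (beta0, ..., beta4).
R = [[0, 1, 1, 0, 0],
     [0, 0, 0, 1, -1]]
q = [1, 0]

def restriction_gap(beta):
    """Return R beta - q; a zero vector means beta satisfies H0."""
    return [sum(r * b for r, b in zip(row, beta)) - qi for row, qi in zip(R, q)]

beta_ok = [2.0, 0.4, 0.6, 0.3, 0.3]    # beta1 + beta2 = 1 and beta3 = beta4
beta_bad = [2.0, 0.4, 0.4, 0.3, 0.1]   # violates both restrictions

print(restriction_gap(beta_ok))
print(restriction_gap(beta_bad))
```

Each row of R is one restriction, so the number of rows equals m, the numerator degrees of freedom of the F-test.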


Example 4

Consider the following consumption function

$$C_t = \beta_0 + \beta_1 Y_t + \beta_2 C_{t-1} + u_t. \tag{31}$$

Then $\beta_1$ is called the short-run MPC (marginal propensity to consume).

The long-run MPC is

$$\beta_{lrmpc} = \frac{\beta_1}{1 - \beta_2}. \tag{32}$$

Test the hypothesis that the long-run MPC = 1, i.e.,

$$H_0 : \frac{\beta_1}{1 - \beta_2} = 1. \tag{33}$$

This is equivalent to $\beta_1 + \beta_2 = 1$.
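The long-run MPC and the equivalence can be seen from a short derivation. It assumes a steady state in which consumption and income are constant ($C_t = C_{t-1} = C^*$, $Y_t = Y^*$), the error term is set to zero, and $\beta_2 \neq 1$; the starred symbols are introduced here for illustration.

```latex
% Steady state of (31) with u_t = 0: C_t = C_{t-1} = C^*, Y_t = Y^*
\begin{align*}
C^* &= \beta_0 + \beta_1 Y^* + \beta_2 C^* \\
(1-\beta_2)\,C^* &= \beta_0 + \beta_1 Y^* \\
C^* &= \frac{\beta_0}{1-\beta_2} + \frac{\beta_1}{1-\beta_2}\,Y^* ,
\end{align*}
% so the long-run effect of income on consumption is \beta_1/(1-\beta_2), and
\[
\frac{\beta_1}{1-\beta_2} = 1
\iff \beta_1 = 1 - \beta_2
\iff \beta_1 + \beta_2 = 1 .
\]
```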


Thus, the nonlinear hypothesis (33) reduces in this case to the linear hypothesis

$$H_0 : \beta_1 + \beta_2 = 1, \tag{34}$$

and we can use the general linear hypothesis of the form (22) with

$$\mathbf{R} = (0 \;\; 1 \;\; 1) \quad\text{and}\quad q = 1. \tag{35}$$


Remark 1.2: Hypotheses of the form (34) can be easily tested with the standard t-test by re-parameterizing the model.

Defining $Z_t = C_{t-1} - Y_t$, equation (31) is (statistically) equivalent to

$$C_t = \beta_0 + \gamma Y_t + \beta_2 Z_t + u_t, \tag{36}$$

where $\gamma = \beta_1 + \beta_2$.

Thus, in terms of (36), testing hypothesis (33) reduces to testing

$$H_0 : \gamma = 1, \tag{37}$$

which can be worked out with the usual t-statistic

$$t = \frac{\hat{\gamma} - 1}{\mathrm{s.e.}(\hat{\gamma})}. \tag{38}$$


Example 5

Generalized Cobb-Douglas production function in the transportation industry [a]: Y = value added (output), L = labor, K = capital, and N = the number of establishments in the transportation industry.

$$y = \beta_0 + \beta_1 k + \beta_2 l + u, \tag{39}$$

where $y = \log(Y/N)$, $k = \log(K/N)$, and $l = \log(L/N)$ (log is the natural logarithm).

lm(formula = y ~ k + l, data = dfr)

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  2.29326    0.10718  21.396 3.24e-16 ***
k            0.27898    0.08069   3.458  0.00224 **
l            0.92731    0.09832   9.431 3.46e-09 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1885 on 22 degrees of freedom
Multiple R-squared: 0.9597, Adjusted R-squared: 0.9561
F-statistic: 262.2 on 2 and 22 DF, p-value: 4.501e-16

[a] Zellner, A. and N. Revankar (1970). Review of Economic Studies 37, 241–250. Data: http://pages.stern.nyu.edu/~wgreene/Text/Edition7/tablelist8new.htm, Table F7.2


According to the results the capital elasticity is 0.279 and the labor elasticity is 0.927; production is thus labor intensive.

Let us test for constant returns to scale, i.e.,

$$H_0 : \beta_1 + \beta_2 = 1. \tag{40}$$

Using linearHypothesis() from the R car package, the general linear restriction method (22) yields

  Res.Df    RSS Df Sum of Sq      F   Pr(>F)
1     23 1.3079
2     22 0.7814  1   0.52645 14.822 0.000869 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

which rejects the null hypothesis.
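The F-value in this table can be reproduced from the two residual sums of squares. The sketch below assumes the standard restricted-versus-unrestricted form of the statistic, $F = \frac{(RSS_r - RSS_u)/m}{RSS_u/(n-k-1)}$, which the slides do not write out. Because the printed RSS values are rounded, the result matches the reported 14.822 only approximately.

```python
# Residual sums of squares from the linearHypothesis() table.
rss_restricted = 1.3079     # model with beta1 + beta2 = 1 imposed (Res.Df = 23)
rss_unrestricted = 0.7814   # unrestricted model (Res.Df = 22)
m = 1                       # number of restrictions
df_resid = 22               # n - k - 1 of the unrestricted model

f_stat = ((rss_restricted - rss_unrestricted) / m) / (rss_unrestricted / df_resid)
print(round(f_stat, 2))
```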

In order to demonstrate the re-parametrization approach, define the regression model

$$\log(Y/N) = \beta_0 + \gamma \log(K/N) + \beta_2 \log(L/K) + u, \tag{41}$$

where $\gamma = \beta_1 + \beta_2$, since $\log(L/K) = l - k$.

Estimation of this specification yields


lm(formula = y ~ k + lpk, data = dfr)

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  2.29326    0.10718  21.396 3.24e-16 ***
k            1.20629    0.05358  22.512  < 2e-16 ***
lpk          0.92731    0.09832   9.431 3.46e-09 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1885 on 22 degrees of freedom
Multiple R-squared: 0.9597, Adjusted R-squared: 0.9561
F-statistic: 262.2 on 2 and 22 DF, p-value: 4.501e-16

All the goodness-of-fit statistics of the two models are exactly the same, indicating the equivalence of the models in the statistical sense.


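The null hypothesis of constant returns to scale in terms of this model is

$$H_0 : \gamma = 1. \tag{42}$$

The t-value is

$$t = \frac{\hat{\gamma} - 1}{\mathrm{s.e.}(\hat{\gamma})} = \frac{1.206294 - 1}{0.053584} \approx 3.85 \tag{43}$$

with p-value = 0.0009, exactly the same as above (note that $t^2 \approx 14.82$ equals the F-value), again rejecting the null hypothesis.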
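The agreement between the two approaches can be verified from the printed estimates: the coefficient of k in the re-parameterized model equals the sum of the two elasticities, and the squared t-value equals the F-value of the linear restriction test up to rounding. Variable names are illustrative.

```python
# Estimates from the two R outputs of Example 5.
beta1_hat, beta2_hat = 0.27898, 0.92731     # original model: k and l
gamma_hat, se_gamma = 1.206294, 0.053584    # re-parameterized model: k

t_stat = (gamma_hat - 1.0) / se_gamma       # eq. (43)

print(round(beta1_hat + beta2_hat, 5))      # sum of the elasticities
print(round(t_stat, 2), round(t_stat ** 2, 2))
```

For a single linear restriction the F-statistic with (1, n − k − 1) degrees of freedom is the square of the corresponding t-statistic, which is why the p-values coincide.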


Remark 1.3: Estimates of the regression parameters under restrictions of the form $\mathbf{R}\boldsymbol{\beta} = \mathbf{q}$ are obtained by restricted least squares, which is provided by modern statistical packages.


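Confidence intervals:

A $100(1 - \alpha)\%$ confidence interval for a single parameter is of the form

$$\hat{\beta}_j \pm t_{\alpha/2}\, \mathrm{s.e.}(\hat{\beta}_j), \tag{44}$$

where $t_{\alpha/2}$ is the $1 - \alpha/2$ quantile (i.e., the $100(1-\alpha/2)$th percentile) of the t-distribution with $n - k - 1$ degrees of freedom.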
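As an illustration of (44), a 95% confidence interval for $\gamma$ in the re-parameterized model of Example 5 can be computed from the printed estimate and standard error. The critical value 2.074 is an assumption taken from standard t-tables (0.975 quantile with 22 degrees of freedom); it does not appear in the slides.

```python
# Estimate and standard error of gamma from the second R output of Example 5.
gamma_hat, se_gamma = 1.206294, 0.053584

# Assumed tabulated 0.975 quantile of the t-distribution with 22 degrees of freedom.
t_crit = 2.074

lower = gamma_hat - t_crit * se_gamma
upper = gamma_hat + t_crit * se_gamma
print(round(lower, 4), round(upper, 4))
```

The interval lies entirely above 1, which is consistent with rejecting constant returns to scale at the 5% level.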


Economic theory sometimes implies nonlinear hypotheses.

In fact, the long-run MPC example is an example of a nonlinear hypothesis that could be transformed into a linear hypothesis.

This is not always possible.

For example, a hypothesis of the form

$$H_0 : \beta_1 \beta_2 = 1 \tag{45}$$

is nonlinear.

Nonlinear hypotheses can be tested using the Wald test, the Lagrange multiplier (LM) test, or the likelihood ratio (LR) test.

Each of these has an asymptotic $\chi^2$-distribution with degrees of freedom equal to the number of restrictions imposed on the parameters.

These tests will be considered more closely later.


Maximum Likelihood

In addition to OLS, Maximum Likelihood (ML) is one of the most popular estimation methods in econometrics.

This method can be utilized if the joint distribution of the random variables is known.

Likelihood function

Generally, suppose that the probability distribution of a random variable $Y$ depends on a set of parameters $\boldsymbol{\theta}$. The probability density of the random variable is then denoted $f_Y(y; \boldsymbol{\theta})$.


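The joint density of the random variables $Y_1, \ldots, Y_n$ is $f_{\mathbf{Y}}(y_1, \ldots, y_n; \boldsymbol{\theta})$, where $\mathbf{Y} = (Y_1, \ldots, Y_n)'$ is the (column) vector of the observations (the prime denotes transposition).

In particular, if the random variables are independently distributed, then

$$f_{\mathbf{Y}}(y_1, \ldots, y_n; \boldsymbol{\theta}) = \prod_{i=1}^{n} f_{Y_i}(y_i; \boldsymbol{\theta}), \tag{46}$$

where

$$\prod_{i=1}^{n} f_{Y_i}(y_i; \boldsymbol{\theta}) = f_{Y_1}(y_1; \boldsymbol{\theta})\, f_{Y_2}(y_2; \boldsymbol{\theta}) \cdots f_{Y_n}(y_n; \boldsymbol{\theta}) \tag{47}$$

is the product of the marginal densities $f_{Y_i}(y_i; \boldsymbol{\theta})$.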


Moreover, if the random variables $Y_i$ are independently and identically distributed (i.i.d.), then the distributions $f_{Y_i}(\cdot\,; \boldsymbol{\theta})$ are the same for all $i = 1, \ldots, n$.

For simplicity, we then denote $f(y_i; \boldsymbol{\theta}) \equiv f_{Y_i}(y_i; \boldsymbol{\theta})$.


In statistical analysis we can consider a sample of observations $y_1, \ldots, y_n$ as a realization (observed values) of independent random variables $Y_1, \ldots, Y_n$.

Furthermore, assuming i.i.d. observations with observed values $y_i$, the function in equation (46) becomes a function of $\boldsymbol{\theta}$, and we can write

$$L(\boldsymbol{\theta}) \equiv L(\boldsymbol{\theta}; y_1, \ldots, y_n) = \prod_{i=1}^{n} f(y_i; \boldsymbol{\theta}), \tag{48}$$

which is called the likelihood function.

Taking (natural) logarithms of both sides, we get the log-likelihood function

$$\ell(\boldsymbol{\theta}) \equiv \log L(\boldsymbol{\theta}) = \sum_{i=1}^{n} \log f(y_i; \boldsymbol{\theta}). \tag{49}$$


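Denoting the log-likelihoods of individual observations as $\ell_i(\boldsymbol{\theta}) = \log f(y_i; \boldsymbol{\theta})$, we can write (49) as

$$\ell(\boldsymbol{\theta}) = \sum_{i=1}^{n} \ell_i(\boldsymbol{\theta}). \tag{50}$$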
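The relation between (48) and (49) can be illustrated numerically. The sketch below assumes, purely for illustration, an i.i.d. normal sample with $\boldsymbol{\theta} = (\mu, \sigma^2)$; the sample values and function names are made up.

```python
import math

def normal_pdf(y, mu, sigma2):
    """Density f(y; theta) of N(mu, sigma2), the distribution assumed here."""
    return math.exp(-(y - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

def likelihood(sample, mu, sigma2):
    """Product of the individual densities, as in eq. (48)."""
    return math.prod(normal_pdf(y, mu, sigma2) for y in sample)

def log_likelihood(sample, mu, sigma2):
    """Sum of the individual log-densities, as in eqs. (49)-(50)."""
    return sum(math.log(normal_pdf(y, mu, sigma2)) for y in sample)

sample = [1.2, 0.7, 1.9, 1.4]   # hypothetical observations
L = likelihood(sample, mu=1.0, sigma2=0.5)
ell = log_likelihood(sample, mu=1.0, sigma2=0.5)

print(math.log(L), ell)
```

For large n the product in (48) underflows quickly in floating-point arithmetic, which is one practical reason for working with the sum in (49).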


Example 6

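Under the normality assumption on the error term $u_i$ of the regression

$$y_i = \mathbf{x}_i'\boldsymbol{\beta} + u_i, \tag{51}$$

we have

$$u_i \sim N(0, \sigma_u^2). \tag{52}$$

It follows that, given $\mathbf{x}_i$,

$$y_i \mid \mathbf{x}_i \sim N(\mathbf{x}_i'\boldsymbol{\beta}, \sigma_u^2). \tag{53}$$

Thus, with $\boldsymbol{\theta} = (\boldsymbol{\beta}', \sigma_u^2)'$, the (conditional) density function is

$$f(y_i \mid \mathbf{x}_i; \boldsymbol{\theta}) = \frac{1}{\sqrt{2\pi\sigma_u^2}}\, \exp\!\left(-\frac{(y_i - \mathbf{x}_i'\boldsymbol{\beta})^2}{2\sigma_u^2}\right), \tag{54}$$

the log-likelihood of observation $i$ is

$$\ell_i(\boldsymbol{\theta}) = -\frac{1}{2}\log(2\pi) - \frac{1}{2}\log\sigma_u^2 - \frac{1}{2}\,\frac{(y_i - \mathbf{x}_i'\boldsymbol{\beta})^2}{\sigma_u^2}, \tag{55}$$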


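and the log-likelihood of the whole sample is

$$\ell(\theta) = -\frac{n}{2}\log(2\pi) - \frac{n}{2}\log\sigma_u^2 - \frac{1}{2}\sum_{i=1}^{n}\frac{(y_i - x_i'\beta)^2}{\sigma_u^2}. \tag{56}$$

In matrix form (56) becomes

$$\ell(\theta) = -\frac{n}{2}\log(2\pi) - \frac{n}{2}\log\sigma_u^2 - \frac{1}{2\sigma_u^2}(y - X\beta)'(y - X\beta). \tag{57}$$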
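As a quick check on Example 6, the sketch below evaluates the log-likelihood once as the sum of the individual contributions (55) and once through the quadratic form in (57). The data, the parameter values, and the function names are made up for illustration.

```python
import math

def loglik_by_observation(y, X, beta, sigma2):
    """Sum of the individual contributions l_i(theta) in (55)."""
    total = 0.0
    for y_i, x_i in zip(y, X):
        fitted = sum(x_ij * b_j for x_ij, b_j in zip(x_i, beta))  # x_i' beta
        total += (-0.5 * math.log(2.0 * math.pi)
                  - 0.5 * math.log(sigma2)
                  - 0.5 * (y_i - fitted) ** 2 / sigma2)
    return total

def loglik_matrix_form(y, X, beta, sigma2):
    """Sample log-likelihood (57), using the quadratic form (y - Xb)'(y - Xb)."""
    n = len(y)
    resid = [y_i - sum(x_ij * b_j for x_ij, b_j in zip(x_i, beta))
             for y_i, x_i in zip(y, X)]
    quad = sum(r * r for r in resid)
    return (-0.5 * n * math.log(2.0 * math.pi)
            - 0.5 * n * math.log(sigma2)
            - quad / (2.0 * sigma2))

# Made-up data: each row of X holds a constant and one regressor.
X = [[1.0, 0.5], [1.0, 1.5], [1.0, 2.0], [1.0, 3.5], [1.0, 4.0]]
y = [1.1, 2.3, 2.8, 4.9, 5.2]
beta = [0.4, 1.2]   # arbitrary trial value of beta
sigma2 = 0.25       # arbitrary trial value of the error variance

ll_sum = loglik_by_observation(y, X, beta, sigma2)
ll_matrix = loglik_matrix_form(y, X, beta, sigma2)
print(ll_sum, ll_matrix)  # the two values agree up to rounding
```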

Maximum Likelihood Estimation

We say that the parameter vector $\theta$ is identified or estimable if, for any other parameter vector $\theta^* \neq \theta$, there are some data $y$ for which $L(\theta^*; y) \neq L(\theta; y)$.

Given data $y$, the maximum likelihood estimate (MLE) of $\theta$ is the value $\hat\theta$ of the parameter for which

$$L(\hat\theta) = \max_{\theta} L(\theta), \tag{58}$$

i.e., the parameter value that maximizes the likelihood function.


In practice it is usually more convenient to maximize the log-likelihood, such that the MLE of $\theta$ is the value $\hat\theta$ which satisfies

$$\ell(\hat\theta) = \max_{\theta} \ell(\theta). \tag{59}$$
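Because the logarithm is strictly increasing, maximizing $L(\theta)$ and maximizing $\ell(\theta)$ lead to the same estimate. The sketch below illustrates this with a crude grid search. It assumes, purely for illustration, i.i.d. $N(\mu, 1)$ observations with the variance fixed at one; the data and function names are hypothetical.

```python
import math

def log_likelihood(mu, ys):
    """l(mu) for i.i.d. N(mu, 1) observations (variance fixed at 1)."""
    return sum(-0.5 * math.log(2.0 * math.pi) - 0.5 * (y - mu) ** 2 for y in ys)

def likelihood(mu, ys):
    """L(mu) = exp(l(mu)); safe here only because the sample is tiny."""
    return math.exp(log_likelihood(mu, ys))

ys = [1.8, 2.4, 0.9, 3.1, 2.3]               # made-up observations
grid = [i / 1000.0 for i in range(0, 4001)]  # candidate values 0.000, ..., 4.000

mu_hat_from_l = max(grid, key=lambda m: log_likelihood(m, ys))
mu_hat_from_L = max(grid, key=lambda m: likelihood(m, ys))
sample_mean = sum(ys) / len(ys)

# Both searches pick the same grid point, which is the sample mean here.
print(mu_hat_from_l, mu_hat_from_L, sample_mean)
```

A grid search is used only to keep the example transparent; in applied work a numerical optimizer would normally take its place.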


Example 7

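Consider the simple regression model

$$y_i = \beta_0 + \beta_1 x_i + u_i, \tag{60}$$

with $u_i \sim N(0, \sigma_u^2)$.

Given a sample of observations $(y_1, x_1), (y_2, x_2), \dots, (y_n, x_n)$, the log-likelihood is

$$\ell(\theta) = -\frac{1}{2}\left(n\log(2\pi) + n\log\sigma_u^2 + \sum_{i=1}^{n}(y_i - \beta_0 - \beta_1 x_i)^2/\sigma_u^2\right), \tag{61}$$

where $\theta = (\beta_0, \beta_1, \sigma_u^2)'$.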


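The maximum of (61) can be found by setting the partial derivatives to zero, i.e.,

$$\begin{aligned}
\frac{\partial \ell}{\partial \beta_0} &= \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)/\sigma_u^2 = 0, \\
\frac{\partial \ell}{\partial \beta_1} &= \sum_{i=1}^{n} x_i (y_i - \beta_0 - \beta_1 x_i)/\sigma_u^2 = 0, \\
\frac{\partial \ell}{\partial \sigma_u^2} &= -\frac{n}{2\sigma_u^2} + \frac{1}{2(\sigma_u^2)^2} \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2 = 0.
\end{aligned} \tag{62}$$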


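Solving these gives

$$\hat\beta_1 = \frac{\sum_{i=1}^{n}(x_i - \bar x)(y_i - \bar y)}{\sum_{i=1}^{n}(x_i - \bar x)^2}, \tag{63}$$

$$\hat\beta_0 = \bar y - \hat\beta_1 \bar x, \tag{64}$$

and

$$\hat\sigma_u^2 = \frac{1}{n}\sum_{i=1}^{n} \hat u_i^2, \tag{65}$$

where

$$\hat u_i = y_i - \hat\beta_0 - \hat\beta_1 x_i \tag{66}$$

is the regression residual and $\bar y$ and $\bar x$ are the sample means of $y_i$ and $x_i$.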
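The closed-form solutions (63)-(65) can be verified numerically: plugged back into the first-order conditions of the normal log-likelihood, they should make all three partial derivatives vanish. The sketch below does this on a small made-up data set; the function names are illustrative.

```python
def simple_regression_mle(x, y):
    """Closed-form ML estimates (63)-(65) for y_i = b0 + b1*x_i + u_i."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx                                      # (63)
    b0 = y_bar - b1 * x_bar                               # (64)
    resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]   # (66)
    sigma2 = sum(u * u for u in resid) / n                # (65)
    return b0, b1, sigma2, resid

def score(x, y, b0, b1, sigma2):
    """Partial derivatives of the normal log-likelihood w.r.t. b0, b1, sigma2."""
    n = len(x)
    resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    d_b0 = sum(resid) / sigma2
    d_b1 = sum(xi * u for xi, u in zip(x, resid)) / sigma2
    # d l / d sigma2 = -n / (2 sigma2) + sum(u^2) / (2 sigma2^2)
    d_s2 = -n / (2.0 * sigma2) + sum(u * u for u in resid) / (2.0 * sigma2 ** 2)
    return d_b0, d_b1, d_s2

# Made-up data for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 2.9, 4.2, 4.8, 6.1, 6.9]

b0, b1, sigma2, resid = simple_regression_mle(x, y)
print(b0, b1, sigma2)
print(score(x, y, b0, b1, sigma2))  # all three entries are zero up to rounding
```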


In this particular case the ML estimators of the regression parameters, $\beta_0$ and $\beta_1$, coincide with the OLS estimators.

In OLS the estimator of the error variance $\sigma_u^2$ is

$$s^2 = \frac{1}{n-2}\sum_{i=1}^{n} \hat u_i^2 = \frac{n}{n-2}\,\hat\sigma_u^2. \tag{67}$$
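Relation (67) also indicates how the two variance estimators differ in finite samples. As a short worked step, assume the classical result (not derived in these slides) that $s^2$ is unbiased for $\sigma_u^2$ under the assumptions of Example 7. Then:

```latex
E\left[\hat\sigma_u^2\right]
  = \frac{n-2}{n}\, E\left[s^2\right]
  = \frac{n-2}{n}\, \sigma_u^2
  < \sigma_u^2 .
```

Under that assumption the ML estimator of the error variance is biased downward in finite samples, and the bias factor $(n-2)/n$ tends to one as $n \to \infty$.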

Properties of Maximum Likelihood Estimators

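Let $\theta_0$ be the population value of the parameter (vector) $\theta$ and let $\hat\theta$ be the MLE of $\theta_0$.

Then

(a) Consistency: $\operatorname{plim} \hat\theta = \theta_0$, i.e., $\hat\theta$ is a consistent estimator of $\theta_0$.

(b) Asymptotic normality: $\hat\theta \sim N\left(\theta_0, I(\theta_0)^{-1}\right)$ asymptotically, where

$$I(\theta_0) = -E\left[\frac{\partial^2 \ell(\theta)}{\partial\theta\,\partial\theta'}\right]_{\theta = \theta_0}. \tag{68}$$

That is, $\hat\theta$ is asymptotically normally distributed.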


(c) Asymptotic efficiency: $\hat\theta$ is asymptotically efficient. That is, in the limit as the sample size grows, the MLE is unbiased and its (limiting) variance is smallest among estimators that are asymptotically unbiased.

(d) Invariance: The MLE of $\gamma_0 = g(\theta_0)$ is $g(\hat\theta)$, where $g$ is a (continuously differentiable) function.

Example 8

In Example 7 the MLE of the error variance $\sigma_u^2$ is given by $\hat\sigma_u^2$ defined in equation (65). Using property (d), the MLE of the standard deviation $\sigma_u = \sqrt{\sigma_u^2}$ is $\hat\sigma_u = \sqrt{\hat\sigma_u^2}$.


Remark 1.4: The above so-called large-sample properties (a)–(c) of the ML (and OLS) estimators stem from two important results in probability theory: the Law of Large Numbers (LLN) and the Central Limit Theorem (CLT).


The LLN:

Theorem 1

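Suppose $Y_1, \dots, Y_n$ are independent random variables with $E[Y_i] = \mu$ and $\mathrm{var}[Y_i] = \sigma^2 < \infty$. Then for any $\varepsilon > 0$

$$P\left(|\bar Y_n - \mu| > \varepsilon\right) \to 0 \tag{69}$$

as $n \to \infty$, where

$$\bar Y_n = \frac{1}{n}\sum_{i=1}^{n} Y_i \tag{70}$$

is the sample mean. We denote (69) in short as

$$\operatorname{plim} \bar Y_n = \mu$$

as $n \to \infty$. Alternatively, it is also denoted as $\bar Y_n \xrightarrow{P} \mu$.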
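A small simulation can make the statement of the LLN concrete. The sketch below draws independent Uniform(0, 1) variables, for which $\mu = 0.5$, and reports how far the sample mean is from $\mu$ as $n$ grows. The distribution, the seed, and the sample sizes are arbitrary illustrative choices.

```python
import random

def sample_mean(n, rng):
    """Mean of n independent Uniform(0, 1) draws, whose expectation is 0.5."""
    return sum(rng.random() for _ in range(n)) / n

rng = random.Random(12345)  # fixed seed so that the run is reproducible
mu = 0.5
for n in (10, 1_000, 100_000):
    y_bar = sample_mean(n, rng)
    # The absolute deviation from mu tends to shrink as n increases.
    print(n, y_bar, abs(y_bar - mu))
```

A single run only illustrates the tendency; the theorem itself is a statement about the probability of large deviations, not about any particular sequence of draws.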


The CLT:

Theorem 2

Suppose $Y_1, \dots, Y_n$ are i.i.d. random variables with $E[Y_i] = \mu$ and $\mathrm{var}[Y_i] = \sigma^2 < \infty$. Then the distribution of

$$\frac{\sqrt{n}\,(\bar Y_n - \mu)}{\sigma} \tag{71}$$

approaches the standard normal distribution, $N(0, 1)$, as $n \to \infty$, where $\bar Y_n$ is as defined in (70).
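The CLT can likewise be illustrated by simulation. The sketch below standardizes the mean of i.i.d. Exponential(1) draws as in (71) and records how often the result falls within $\pm 1.96$, which has probability of about 0.95 under $N(0, 1)$. The exponential distribution (with $\mu = \sigma = 1$), the seed, and the replication counts are illustrative assumptions.

```python
import math
import random

def standardized_mean(n, rng):
    """sqrt(n) * (Ybar_n - mu) / sigma for n i.i.d. Exponential(1) draws.

    Exponential(1) has mu = 1 and sigma = 1 and is clearly non-normal.
    """
    y_bar = sum(rng.expovariate(1.0) for _ in range(n)) / n
    return math.sqrt(n) * (y_bar - 1.0) / 1.0

def coverage(n, reps, rng):
    """Share of replications with |z| <= 1.96; about 0.95 under N(0, 1)."""
    hits = sum(abs(standardized_mean(n, rng)) <= 1.96 for _ in range(reps))
    return hits / reps

rng = random.Random(2018)  # fixed seed so that the run is reproducible
for n in (2, 20, 200):
    print(n, coverage(n, 2000, rng))
```

The reported shares are Monte Carlo estimates and therefore vary with the seed; for the larger sample sizes they should lie close to 0.95.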


Remark 1.5: The $I(\theta_0)$ defined in (68) is called the Fisher information matrix and

$$H = \frac{\partial^2 \ell(\theta)}{\partial\theta\,\partial\theta'} \tag{72}$$

is called the Hessian of the log-likelihood.
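As a worked illustration (an addition, not part of the original slides), the Hessian and the information matrix can be written out for the normal regression log-likelihood (57), treating $X$ as fixed and writing $u = y - X\beta$:

```latex
\frac{\partial^2 \ell}{\partial\beta\,\partial\beta'} = -\frac{1}{\sigma_u^2}\, X'X, \qquad
\frac{\partial^2 \ell}{\partial\beta\,\partial\sigma_u^2} = -\frac{1}{\sigma_u^4}\, X'u, \qquad
\frac{\partial^2 \ell}{\partial(\sigma_u^2)^2} = \frac{n}{2\sigma_u^4} - \frac{1}{\sigma_u^6}\, u'u .

% Taking expectations at the true parameter value, with E[u] = 0 and E[u'u] = n \sigma_u^2:
I(\theta_0) = -E[H] =
\begin{pmatrix}
  \dfrac{1}{\sigma_u^2}\, X'X & 0 \\[2ex]
  0' & \dfrac{n}{2\sigma_u^4}
\end{pmatrix},
\qquad
I(\theta_0)^{-1} =
\begin{pmatrix}
  \sigma_u^2 (X'X)^{-1} & 0 \\[1ex]
  0' & \dfrac{2\sigma_u^4}{n}
\end{pmatrix}.
```

The upper-left block of $I(\theta_0)^{-1}$ is the familiar covariance matrix $\sigma_u^2 (X'X)^{-1}$ of the OLS estimator, in line with the ML and OLS estimators of $\beta$ coinciding in this model.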

Likelihood Ratio, Wald, and Lagrange Multiplier tests

For testing general restrictions of the form

$$H_0 : c(\theta) = 0, \tag{73}$$

where $c(\cdot)$ is some (vector-valued) function, there are three general-purpose test methods: the likelihood ratio, Wald, and Lagrange multiplier tests.

Remark 1.6: We could specify the above hypothesis alternatively as

$$H_0 : r(\theta) = q, \tag{74}$$

where $r(\cdot)$ is some function and $q$ is some constant.

Defining $c(\theta) = r(\theta) - q$ reduces this back to hypothesis (73).


Likelihood ratio test (LR-test)

$$LR = -2\log\left(\frac{L_R}{L_U}\right) = -2(\ell_R - \ell_U), \tag{75}$$

where

$$L_R = \max_{\theta:\, c(\theta) = 0} L(\theta)$$

is the maximum of the likelihood under the restriction of hypothesis (73), and

$$L_U = \max_{\theta} L(\theta)$$

is the unrestricted maximum of the likelihood function ($\ell_U = \log L_U$ and $\ell_R = \log L_R$).

Remark 1.7: Use of the LR test requires computing both the restricted MLE of $\theta$ (to compute $\ell_R$) and the unrestricted MLE (to compute $\ell_U$).


Wald test

$$W = c(\hat\theta)'\, V^{-1} c(\hat\theta), \tag{76}$$

where $V$ is the asymptotic variance-covariance matrix of $c(\hat\theta)$.

Remark 1.8: Use of the Wald test requires only the unrestricted MLE.


Lagrange multiplier test (LM)

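$$LM = \left(\frac{\partial \ell(\hat\theta_R)}{\partial\theta}\right)' \left[I(\hat\theta_R)\right]^{-1} \left(\frac{\partial \ell(\hat\theta_R)}{\partial\theta}\right), \tag{77}$$

where $\hat\theta_R$ is the restricted MLE satisfying the restriction $c(\hat\theta_R) = 0$ of the general hypothesis (73).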

Remark 1.9: Use of the LM test requires only the restricted MLE.


Under the null hypothesis (73) each of these test statistics is asymptotically $\chi^2$-distributed with degrees of freedom equal to the number of restrictions.

Thus, they are asymptotically equivalent. In small samples the numerical values may differ, however.
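To see the three statistics side by side, the sketch below computes them for the single restriction $H_0: \beta_1 = 0$ in the simple regression model of Example 7. It is a minimal illustration under stated assumptions: the data are made up, the function name is hypothetical, and the information matrix of the normal regression model is taken to be block diagonal between the regression coefficients and the error variance, so that the relevant element of its inverse is $\sigma_u^2 / \sum_i (x_i - \bar x)^2$.

```python
import math

def three_tests(x, y):
    """LR, Wald and LM statistics for H0: beta1 = 0 in y_i = b0 + b1*x_i + u_i."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    # Unrestricted MLE, equations (63)-(65).
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    ssr_u = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    s2_u = ssr_u / n

    # Restricted MLE under b1 = 0: the intercept is the sample mean of y.
    ssr_r = sum((yi - y_bar) ** 2 for yi in y)
    s2_r = ssr_r / n

    def loglik(ssr, s2):
        """Normal log-likelihood evaluated at given residual sum of squares."""
        return -0.5 * n * (math.log(2.0 * math.pi) + math.log(s2)) - ssr / (2.0 * s2)

    # (75): LR = -2 (l_R - l_U).
    lr = -2.0 * (loglik(ssr_r, s2_r) - loglik(ssr_u, s2_u))

    # (76): c(theta) = b1, V = estimated asymptotic variance of b1-hat.
    wald = b1 ** 2 / (s2_u / s_xx)

    # (77): at the restricted MLE only the b1-component of the score is
    # non-zero; the matching element of the inverse information is s2_r / s_xx.
    score_b1 = sum(xi * (yi - y_bar) for xi, yi in zip(x, y)) / s2_r
    lm = score_b1 ** 2 * (s2_r / s_xx)

    return lr, wald, lm

# Made-up data for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.3, 1.9, 3.2, 2.8, 3.1, 4.0, 3.3, 4.4]

lr, wald, lm = three_tests(x, y)
print(lr, wald, lm)
```

With one restriction, each statistic would be compared with a $\chi^2(1)$ critical value. In this linear normal model the three numbers differ in any finite sample and satisfy the known ordering $W \ge LR \ge LM$, which illustrates the remark that small-sample values may differ.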
