STAT 497 LECTURE NOTES 8
ESTIMATION
ESTIMATION
• After specifying the order of a stationary ARMA process, we need to estimate the parameters.
• We will assume (for now) that:
1. The model order (p and q) is known, and
2. The data has zero mean.
• If (2) is not a reasonable assumption, we can subtract the sample mean $\bar{Y}$ and fit a zero-mean ARMA model to $X_t = Y_t - \bar{Y}$:
$$\phi(B)X_t = \theta(B)a_t,$$
then use $X_t + \bar{Y}$ as the model for $Y_t$.
ESTIMATION
• Method of Moment Estimation (MME)
• Ordinary Least Squares (OLS) Estimation
• Maximum Likelihood Estimation (MLE)
• Least Squares Estimation
– Conditional
– Unconditional
THE METHOD OF MOMENT ESTIMATION
• It is also known as Yule-Walker estimation. It is an easy but not efficient estimation method, and it works well only for AR models when n is large.
• BASIC IDEA: Equate sample moment(s) to population moment(s), and solve these equation(s) to obtain the estimator(s) of the unknown parameter(s).
$$E(Y_t) = \mu \;\Rightarrow\; \bar{Y} = \frac{1}{n}\sum_{t=1}^{n} Y_t$$
$$E(Y_t Y_{t+k}) = \gamma_k \;\Rightarrow\; \hat{\gamma}_k = \frac{1}{n}\sum_{t=1}^{n-k} Y_t Y_{t+k} \quad \text{or} \quad \rho_k \;\Rightarrow\; \hat{\rho}_k = \frac{\hat{\gamma}_k}{\hat{\gamma}_0}$$
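As an aside (not from the original notes), these sample moments are easy to compute directly in R. The series y below is a simulated AR(1) series used purely for illustration:

# Illustrative sketch: sample moments of a series y
set.seed(1)
y <- arima.sim(list(ar = 0.6), n = 200)   # hypothetical AR(1) data
n <- length(y)
ybar <- mean(y)                                             # estimates E(Y_t)
gamma0 <- sum((y - ybar)^2) / n                             # gamma-hat_0
gamma1 <- sum((y[1:(n - 1)] - ybar) * (y[2:n] - ybar)) / n  # gamma-hat_1
r1 <- gamma1 / gamma0                                       # rho-hat_1
acf(y, lag.max = 1, plot = FALSE)                           # built-in check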
THE METHOD OF MOMENT ESTIMATION
• Let $\hat{\Gamma}_n$ be the variance/covariance matrix of $X$ with the given parameter values.
• Yule-Walker for AR(p): Regress $X_t$ onto $X_{t-1}, \ldots, X_{t-p}$.
• Durbin-Levinson algorithm with $\gamma$ replaced by $\hat{\gamma}$.
• Yule-Walker for ARMA(p,q): Method of moments. Not efficient.
THE YULE-WALKER ESTIMATION
• For a stationary (causal) AR(p) process,
$$X_t = \phi_1 X_{t-1} + \cdots + \phi_p X_{t-p} + a_t,$$
multiplying both sides by $X_{t-k}$ and taking expectations gives
$$E(X_{t+k}X_t) - \sum_{j=1}^{p}\phi_j E(X_{t+k-j}X_t) = E(X_{t+k}a_t), \quad k = 0, 1, \ldots, p,$$
where
$$E(X_{t+k}a_t) = \begin{cases} \sigma_a^2, & k = 0, \\ 0, & k > 0. \end{cases}$$
Hence
$$\gamma_0 - \phi_1\gamma_1 - \cdots - \phi_p\gamma_p = \sigma_a^2 \quad \text{and} \quad \gamma_k = \phi_1\gamma_{k-1} + \cdots + \phi_p\gamma_{k-p}, \quad k = 1, \ldots, p.$$
• To calculate the values $E(X_{t+k}a_t)$ we have used the RSF (random shock form) of the process: $X_t = \psi(B)a_t$.
THE YULE-WALKER ESTIMATION
• To find the Yule-Walker estimators, we replace the theoretical autocovariances (autocorrelations) by their sample counterparts, $\hat{\gamma}_k$ or $\hat{\rho}_k$.
• Thus, the Yule-Walker equations for $\hat{\phi} = (\hat{\phi}_1, \ldots, \hat{\phi}_p)'$ are
$$\hat{\Gamma}_p\hat{\phi} = \hat{\gamma}_p \quad \text{and} \quad \hat{\sigma}_a^2 = \hat{\gamma}_0 - \hat{\phi}'\hat{\gamma}_p, \quad \text{where } \hat{\gamma}_p = (\hat{\gamma}_1, \ldots, \hat{\gamma}_p)'.$$
• These are forecasting equations.
• We can use the Durbin-Levinson algorithm.
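As a hedged aside, R's built-in ar() solves exactly these Yule-Walker equations via the Durbin-Levinson recursion when method = "yule-walker"; y is the simulated series from the earlier sketch:

fit.yw <- ar(y, method = "yule-walker", order.max = 1, aic = FALSE)
fit.yw$ar         # Yule-Walker estimate of phi
fit.yw$var.pred   # corresponding estimate of sigma_a^2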
THE YULE-WALKER ESTIMATION
• If $\hat{\gamma}_0 > 0$, then $\hat{\Gamma}_m$ is nonsingular.
• If $\{X_t\}$ is an AR(p) process, then asymptotically
$$\hat{\phi} \sim N\!\left(\phi,\, \frac{\sigma_a^2\,\Gamma_p^{-1}}{n}\right), \qquad \hat{\sigma}_a^2 \xrightarrow{p} \sigma_a^2,$$
and, for $k > p$,
$$\sqrt{n}\,\hat{\phi}_{kk} \sim N(0, 1).$$
Hence, we can use the sample PACF to test for AR order, and we can calculate approximate confidence intervals for the parameters.
THE YULE-WALKER ESTIMATION
• If $X_t$ is an AR(p) process and $n$ is large,
$$\sqrt{n}\left(\hat{\phi} - \phi\right) \overset{\text{approx}}{\sim} N\!\left(0,\, \hat{\sigma}_a^2\,\hat{\Gamma}_p^{-1}\right).$$
• A $100(1-\alpha)\%$ approximate confidence interval for $\phi_j$ is
$$\hat{\phi}_j \pm z_{\alpha/2}\, n^{-1/2}\left(\hat{\sigma}_a^2\left[\hat{\Gamma}_p^{-1}\right]_{jj}\right)^{1/2}.$$
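A minimal sketch of this interval in R, reusing the ar() fit above: for Yule-Walker fits, ar() returns asy.var.coef, the estimated asymptotic variance matrix of the coefficients (already scaled by 1/n):

fit.yw <- ar(y, method = "yule-walker", order.max = 1, aic = FALSE)
se <- sqrt(diag(fit.yw$asy.var.coef))       # standard error of phi-hat_j
fit.yw$ar + c(-1, 1) * qnorm(0.975) * se    # approximate 95% CI for phi_1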
THE YULE-WALKER ESTIMATION
• AR(1)
$$Y_t = \phi Y_{t-1} + a_t$$
Find the MME of $\phi$. It is known that $\rho_1 = \phi$, so
$$\hat{\phi} = r_1 = \frac{\sum_{t=2}^{n}(Y_t - \bar{Y})(Y_{t-1} - \bar{Y})}{\sum_{t=1}^{n}(Y_t - \bar{Y})^2}.$$
THE YULE-WALKER ESTIMATION
• So, the MME of $\phi$ is
$$\tilde{\phi} = \frac{\sum_{t=2}^{n}(Y_t - \bar{Y})(Y_{t-1} - \bar{Y})}{\sum_{t=1}^{n}(Y_t - \bar{Y})^2}.$$
• Also, $\sigma_a^2$ is unknown.
• Therefore, using the variance of the process,
$$\gamma_0 = \frac{\sigma_a^2}{1 - \phi^2},$$
we can obtain the MME of $\sigma_a^2$.
THE YULE-WALKER ESTIMATION
$$\hat{\sigma}_a^2 = \left(1 - \tilde{\phi}^2\right)\hat{\gamma}_0 = \left(1 - \tilde{\phi}^2\right)\frac{1}{n}\sum_{t=1}^{n}\left(Y_t - \bar{Y}\right)^2$$
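Written out directly in R (an illustrative sketch using the simulated y from before), the two AR(1) moment estimators are:

r1 <- acf(y, lag.max = 1, plot = FALSE)$acf[2]   # rho-hat_1 (lag 0 comes first)
phi.mme <- r1                                    # phi-tilde = r_1
gamma0 <- sum((y - mean(y))^2) / length(y)       # gamma-hat_0
sig2.mme <- (1 - phi.mme^2) * gamma0             # sigma-tilde_a^2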
THE YULE-WALKER ESTIMATION
• AR(2)
Find the MME of all unknown parameters:
$$Y_t = \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + a_t.$$
• Using the Yule-Walker equations,
$$\rho_1 = \phi_1 + \phi_2\rho_1 \quad \text{and} \quad \rho_2 = \phi_1\rho_1 + \phi_2.$$
THE YULE-WALKER ESTIMATION
• So, equate the population autocorrelations to the sample autocorrelations and solve for $\phi_1$ and $\phi_2$:
$$r_1 = \hat{\phi}_1 + \hat{\phi}_2 r_1 \quad \text{and} \quad r_2 = \hat{\phi}_1 r_1 + \hat{\phi}_2.$$
THE YULE-WALKER ESTIMATION
$$\tilde{\phi}_1 = \frac{r_1(1 - r_2)}{1 - r_1^2} \quad \text{and} \quad \tilde{\phi}_2 = \frac{r_2 - r_1^2}{1 - r_1^2}.$$
• To obtain the MME of $\sigma_a^2$, use the process variance formula. Using these estimates,
$$\tilde{\sigma}_a^2 = \left(1 - \tilde{\phi}_1 r_1 - \tilde{\phi}_2 r_2\right)\hat{\gamma}_0.$$
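The closed-form AR(2) solution translates into a small R function (a sketch; yw.ar2 is a hypothetical helper name, and y is the simulated series from before):

yw.ar2 <- function(r1, r2) {
  c(phi1 = r1 * (1 - r2) / (1 - r1^2),
    phi2 = (r2 - r1^2) / (1 - r1^2))
}
rk <- acf(y, lag.max = 2, plot = FALSE)$acf   # rk[2] = r_1, rk[3] = r_2
yw.ar2(rk[2], rk[3])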
THE YULE-WALKER ESTIMATION
• AR(1):
$$\gamma_0 = \frac{\sigma_a^2}{1 - \phi_1^2}, \qquad \hat{\phi}_1 = \frac{\hat{\gamma}_1}{\hat{\gamma}_0} = \hat{\rho}_1, \qquad \hat{\sigma}_a^2 = \hat{\gamma}_0 - \hat{\phi}_1\hat{\gamma}_1.$$
• AR(2):
$$\begin{pmatrix} \hat{\phi}_1 \\ \hat{\phi}_2 \end{pmatrix} = \begin{pmatrix} \hat{\gamma}_0 & \hat{\gamma}_1 \\ \hat{\gamma}_1 & \hat{\gamma}_0 \end{pmatrix}^{-1}\begin{pmatrix} \hat{\gamma}_1 \\ \hat{\gamma}_2 \end{pmatrix}, \qquad \hat{\sigma}_a^2 = \hat{\gamma}_0 - \hat{\phi}_1\hat{\gamma}_1 - \hat{\phi}_2\hat{\gamma}_2.$$
THE YULE-WALKER ESTIMATION
• MA(1)
$$Y_t = a_t - \theta a_{t-1}$$
• Again using the autocorrelation of the series at lag 1,
$$\rho_1 = \frac{-\theta}{1 + \theta^2} = r_1 \quad\Rightarrow\quad r_1\hat{\theta}^2 + \hat{\theta} + r_1 = 0,$$
$$\tilde{\theta} = \frac{-1 \pm \sqrt{1 - 4r_1^2}}{2r_1}.$$
• Choose the root satisfying the invertibility condition $|\tilde{\theta}| < 1$.
THE YULE-WALKER ESTIMATION
• For real roots, we need
$$1 - 4r_1^2 \geq 0 \;\Leftrightarrow\; r_1^2 \leq 0.25 \;\Leftrightarrow\; -0.5 \leq r_1 \leq 0.5.$$
• If $|r_1| = 0.5$: a unique real root exists, but it is non-invertible.
• If $|r_1| > 0.5$: no real root exists and the MME fails.
• If $|r_1| < 0.5$: a unique invertible real root exists (see the sketch below).
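A small R sketch of the MA(1) moment estimator that returns the invertible root and reports failure when |r1| > 0.5 (ma1.mme is a hypothetical helper name; it assumes r1 is nonzero):

ma1.mme <- function(r1) {
  if (abs(r1) > 0.5) return(NA)            # no real root: MME fails
  (-1 + sqrt(1 - 4 * r1^2)) / (2 * r1)     # the invertible root
}
ma1.mme(0.4)   # gives theta-tilde = -0.5, so rho_1 = -theta/(1 + theta^2) = 0.4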
THE YULE-WALKER ESTIMATION
• This example shows that the MMEs for MA and ARMA models are complicated.
• More generally, regardless of whether the model is AR, MA, or ARMA, the MMEs are sensitive to rounding errors. They are usually used only to provide initial estimates for a more efficient nonlinear estimation method.
• The moment estimators are not recommended for final estimation results and should not be used if the process is close to being nonstationary or noninvertible.
THE MAXIMUM LIKELIHOOD ESTIMATION
• Assume that $a_t \overset{iid}{\sim} N\left(0, \sigma_a^2\right)$.
• By this assumption we can use the joint pdf
$$f(a_1, \ldots, a_n) = \prod_{t=1}^{n} f(a_t)$$
instead of $f(y_1, \ldots, y_n)$, which cannot be written as a product of marginal pdfs because of the dependence between time series observations.
MLE METHOD
• For the general stationary ARMA(p,q) model,
$$\dot{Y}_t = \phi_1\dot{Y}_{t-1} + \cdots + \phi_p\dot{Y}_{t-p} + a_t - \theta_1 a_{t-1} - \cdots - \theta_q a_{t-q},$$
or
$$a_t = \dot{Y}_t - \phi_1\dot{Y}_{t-1} - \cdots - \phi_p\dot{Y}_{t-p} + \theta_1 a_{t-1} + \cdots + \theta_q a_{t-q},$$
where $\dot{Y}_t = Y_t - \mu$.
MLE
• The joint pdf of $(a_1, a_2, \ldots, a_n)$ is given by
$$f\left(a_1, \ldots, a_n \mid \phi, \mu, \theta, \sigma_a^2\right) = \left(2\pi\sigma_a^2\right)^{-n/2}\exp\!\left(-\frac{1}{2\sigma_a^2}\sum_{t=1}^{n} a_t^2\right),$$
where $\phi = (\phi_1, \ldots, \phi_p)'$ and $\theta = (\theta_1, \ldots, \theta_q)'$.
• Let $Y = (Y_1, \ldots, Y_n)'$ and assume that the initial conditions $Y_* = (Y_{1-p}, \ldots, Y_0)'$ and $a_* = (a_{1-q}, \ldots, a_0)'$ are known.
MLE
• The conditional log-likelihood function is given by
$$\ln L_*\left(\phi, \mu, \theta, \sigma_a^2\right) = -\frac{n}{2}\ln\left(2\pi\sigma_a^2\right) - \frac{S_*(\phi, \mu, \theta)}{2\sigma_a^2},$$
where
$$S_*(\phi, \mu, \theta) = \sum_{t=1}^{n} a_t^2\left(\phi, \mu, \theta \mid Y_*, a_*, Y\right)$$
is the conditional sum of squares.
• Initial conditions: $Y_* = \bar{Y}$ and $a_* = E(a_t) = 0$.
MLE
• Then, we can find the estimators of $\phi = (\phi_1, \ldots, \phi_p)'$, $\theta = (\theta_1, \ldots, \theta_q)'$, $\mu$ and $\sigma_a^2$ such that the conditional likelihood function is maximized. Usually, numerical nonlinear optimization techniques are required. After obtaining all the estimators,
$$\hat{\sigma}_a^2 = \frac{S_*\left(\hat{\phi}, \hat{\mu}, \hat{\theta}\right)}{\text{d.f.}},$$
where d.f. = (number of terms used in the sum of squares) $-$ (number of parameters) $= (n - p) - (p + q + 1) = n - (2p + q + 1)$.
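For reference (an aside, not from the notes), R's arima() offers exactly this choice: method = "CSS" minimizes the conditional sum of squares S*, while method = "ML" maximizes the exact likelihood; y is the simulated series from the earlier sketches:

fit.css <- arima(y, order = c(1, 0, 0), method = "CSS")   # conditional sum of squares
fit.ml  <- arima(y, order = c(1, 0, 0), method = "ML")    # exact (full) likelihood
fit.css$coef
fit.ml$coef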
MLE
• AR(1):
$$Y_t = \phi Y_{t-1} + a_t, \quad \text{where } a_t \overset{iid}{\sim} N\left(0, \sigma_a^2\right).$$
$$f\left(a_2, \ldots, a_n \mid \phi, \sigma_a^2\right) = \left(2\pi\sigma_a^2\right)^{-(n-1)/2}\exp\!\left(-\frac{1}{2\sigma_a^2}\sum_{t=2}^{n} a_t^2\right)$$
• Conditioning on $Y_1$, consider the transformation
$$a_2 = Y_2 - \phi Y_1 \;\Leftrightarrow\; Y_2 = \phi Y_1 + a_2$$
$$a_3 = Y_3 - \phi Y_2 \;\Leftrightarrow\; Y_3 = \phi Y_2 + a_3$$
$$\vdots$$
$$a_n = Y_n - \phi Y_{n-1} \;\Leftrightarrow\; Y_n = \phi Y_{n-1} + a_n$$
MLE
• The Jacobian of the transformation from $(a_2, \ldots, a_n)$ to $(Y_2, \ldots, Y_n)$ will be
$$J = \begin{vmatrix} \partial a_2/\partial Y_2 & \partial a_2/\partial Y_3 & \cdots & \partial a_2/\partial Y_n \\ \partial a_3/\partial Y_2 & \partial a_3/\partial Y_3 & \cdots & \partial a_3/\partial Y_n \\ \vdots & \vdots & \ddots & \vdots \\ \partial a_n/\partial Y_2 & \partial a_n/\partial Y_3 & \cdots & \partial a_n/\partial Y_n \end{vmatrix} = \begin{vmatrix} 1 & 0 & \cdots & 0 \\ -\phi & 1 & \cdots & 0 \\ \vdots & \ddots & \ddots & \vdots \\ 0 & \cdots & -\phi & 1 \end{vmatrix} = 1,$$
so that
$$f\left(Y_2, \ldots, Y_n \mid Y_1\right) = |J|\,f\left(a_2, \ldots, a_n\right) = f\left(a_2, \ldots, a_n\right).$$
MLE
• Then, the likelihood function can be written as
$$L\left(\phi, \mu, \sigma_a^2\right) = f\left(Y_1, \ldots, Y_n\right) = f(Y_1)\,f\left(Y_2, \ldots, Y_n \mid Y_1\right) = f(Y_1)\prod_{t=2}^{n} f(a_t),$$
where
$$Y_1 \sim N\!\left(\mu, \frac{\sigma_a^2}{1 - \phi^2}\right),$$
so that
$$L = \left(\frac{1-\phi^2}{2\pi\sigma_a^2}\right)^{1/2} e^{-\frac{\left(1-\phi^2\right)(Y_1-\mu)^2}{2\sigma_a^2}}\left(2\pi\sigma_a^2\right)^{-(n-1)/2} e^{-\frac{1}{2\sigma_a^2}\sum_{t=2}^{n}\left[(Y_t-\mu) - \phi(Y_{t-1}-\mu)\right]^2}.$$
MLE
• Hence,
$$L\left(\phi, \mu, \sigma_a^2\right) = \left(2\pi\sigma_a^2\right)^{-n/2}\left(1-\phi^2\right)^{1/2}\exp\!\left(-\frac{S(\phi, \mu)}{2\sigma_a^2}\right),$$
where
$$S(\phi, \mu) = \left(1-\phi^2\right)(Y_1-\mu)^2 + \sum_{t=2}^{n}\left[(Y_t-\mu) - \phi(Y_{t-1}-\mu)\right]^2 = \left(1-\phi^2\right)(Y_1-\mu)^2 + S_*(\phi, \mu).$$
• The log-likelihood function:
$$\ln L\left(\phi, \mu, \sigma_a^2\right) = -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln\sigma_a^2 + \frac{1}{2}\ln\left(1-\phi^2\right) - \frac{S(\phi, \mu)}{2\sigma_a^2}.$$
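A minimal sketch that maximizes this exact AR(1) log-likelihood numerically with optim(), parameterized as (phi, mu, log sigma_a^2); y is the simulated series used earlier:

negloglik.ar1 <- function(par, y) {
  phi <- par[1]; mu <- par[2]; s2 <- exp(par[3])
  if (abs(phi) >= 1) return(Inf)       # keep phi inside the stationary region
  n <- length(y)
  # Unconditional sum of squares S(phi, mu) from the slide above
  S <- (1 - phi^2) * (y[1] - mu)^2 +
       sum(((y[-1] - mu) - phi * (y[-n] - mu))^2)
  0.5 * (n * log(2 * pi * s2) - log(1 - phi^2) + S / s2)   # -ln L
}
opt <- optim(c(0, mean(y), log(var(y))), negloglik.ar1, y = y)
c(phi = opt$par[1], mu = opt$par[2], sigma2.a = exp(opt$par[3]))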
MLE
• Here, S*() is the conditional sum of squares and S() is the unconditional sum of squares.
• To find the value of $\sigma_a^2$ where the likelihood function is maximized, set
$$\frac{\partial \ln L\left(\phi, \mu, \sigma_a^2\right)}{\partial \sigma_a^2} = 0.$$
• Then,
$$\hat{\sigma}_a^2 = \frac{S\left(\hat{\phi}, \hat{\mu}\right)}{n}.$$
MLE
• If we neglect $\frac{1}{2}\ln\left(1-\phi^2\right)$, then maximizing the likelihood is equivalent to minimizing the unconditional sum of squares:
$$\max L\left(\phi, \mu, \sigma_a^2\right) \;\Leftrightarrow\; \min S(\phi, \mu).$$
• If we neglect both $\frac{1}{2}\ln\left(1-\phi^2\right)$ and $\left(1-\phi^2\right)(Y_1-\mu)^2$, then MLE = conditional LSE:
$$\max L\left(\phi, \mu, \sigma_a^2\right) \;\Leftrightarrow\; \min S_*(\phi, \mu).$$
MLE
• The MLEs are asymptotically unbiased, efficient, consistent, and sufficient for large sample sizes, but the joint pdf can be hard to deal with.
CONDITIONAL LEAST SQUARES ESTIMATION
• AR(1):
$$Y_t = \phi Y_{t-1} + a_t, \quad \text{where } a_t \overset{iid}{\sim} N\left(0, \sigma_a^2\right) \;\Rightarrow\; a_t = Y_t - \phi Y_{t-1}.$$
$$SSE = S_*(\phi) = \sum_{t=2}^{n} a_t^2 = \sum_{t=2}^{n}\left(Y_t - \phi Y_{t-1}\right)^2 \quad \text{for observed } Y_1, \ldots, Y_n.$$
$$\frac{dS_*(\phi)}{d\phi} = -2\sum_{t=2}^{n}\left(Y_t - \phi Y_{t-1}\right)Y_{t-1} = 0 \;\Rightarrow\; \hat{\phi} = \frac{\sum_{t=2}^{n} Y_t Y_{t-1}}{\sum_{t=2}^{n} Y_{t-1}^2}.$$
CONDITIONAL LSE
• If the process mean is different than zero
33
of MME
YY
YYYYn
tt
n
ttt
2
2
1
21ˆ
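Since conditional LS for an AR(1) is just a linear regression of Y_t on Y_(t-1), lm() reproduces this estimator (a sketch using the simulated y):

n <- length(y)
fit.ls <- lm(y[2:n] ~ y[1:(n - 1)])
coef(fit.ls)   # intercept estimates mu*(1 - phi); slope estimates phi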
CONDITIONAL LSE
• MA(1):
$$Y_t = a_t - \theta a_{t-1}, \quad a_t \sim \text{normal } WN\left(0, \sigma_a^2\right).$$
If we invert the process, $a_t = Y_t + \theta Y_{t-1} + \theta^2 Y_{t-2} + \cdots$, an AR($\infty$) representation, so the problem is:
– Non-linear in terms of the parameters
– A least squares problem
– $S_*(\theta)$ cannot be minimized analytically
– Numerical nonlinear optimization methods like Newton-Raphson or Gauss-Newton are needed (see the sketch after this list)
*There are similar problems in the ARMA case.
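A hedged sketch of this numerical approach for the MA(1): compute S*(theta) by the recursion a_t = Y_t + theta*a_(t-1) with a_0 = 0, then minimize over the invertible range. The helper css.ma1 and the simulated series y2 are illustrative assumptions; note that arima.sim() uses the sign convention Y_t = a_t + theta*a_(t-1), so ma = -0.5 corresponds to theta = 0.5 in the notes' convention:

set.seed(2)
y2 <- arima.sim(list(ma = -0.5), n = 200)   # hypothetical MA(1) data, theta = 0.5
css.ma1 <- function(theta, y) {
  a.prev <- 0                      # conditioning: a_0 = 0
  S <- 0
  for (t in seq_along(y)) {
    a.t <- y[t] + theta * a.prev   # recursion a_t = Y_t + theta * a_(t-1)
    S <- S + a.t^2
    a.prev <- a.t
  }
  S                                # conditional sum of squares S*(theta)
}
optimize(css.ma1, interval = c(-0.99, 0.99), y = y2)$minimum   # theta-hat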
UNCONDITIONAL LSE
$$\min_{\phi,\,\mu} S(\phi, \mu), \quad \text{where } S(\phi, \mu) = \sum_{t=2}^{n}\left[(Y_t-\mu) - \phi(Y_{t-1}-\mu)\right]^2 + \left(1-\phi^2\right)(Y_1-\mu)^2.$$
$$\frac{dS(\phi, \mu)}{d\phi} = 0 \;\Rightarrow\; \hat{\phi}$$
• This is nonlinear in $\phi$.
• We need nonlinear optimization techniques.
BACKCASTING METHOD
• Obtain the backward form of the ARMA(p,q) process,
$$\left(1 - \phi_1 F - \cdots - \phi_p F^p\right)\dot{Y}_t = \left(1 - \theta_1 F - \cdots - \theta_q F^q\right)e_t, \quad \text{where } F^j Y_t = Y_{t+j}.$$
• Instead of forecasting, backcast the past values of $Y_t$ and $a_t$, $t \leq 0$. Obtain the unconditional log-likelihood function, then obtain the estimators.
EXAMPLE
• If there are only 2 observations in the time series (not realistic),
$$Y_1 = a_1 \quad \text{and} \quad Y_2 = \theta a_1 + a_2, \quad \text{where } a_1, a_2 \overset{iid}{\sim} N\left(0, \sigma_a^2\right).$$
Find the MLE of $\theta$ and $\sigma_a^2$.
EXAMPLE
• US Quarterly Beer Production from 1975 to 1997
> par(mfrow=c(1,3))
> plot(beer)
> acf(as.vector(beer),lag.max=36)
> pacf(as.vector(beer),lag.max=36)
EXAMPLE (contd.)
> library(uroot)
Warning message: package 'uroot' was built under R version 2.13.0
> HEGY.test(wts = beer, itsd = c(1, 1, c(1:3)), regvar = 0, selectlags = list(mode = "bic", Pmax = 12))
Null hypothesis: Unit root.
Alternative hypothesis: Stationarity.
---- HEGY statistics:
          Stat.  p-value
tpi_1    -3.339    0.085
tpi_2    -5.944    0.010
Fpi_3:4  13.238    0.010
> CH.test(beer)
------ Canova & Hansen test ------
Null hypothesis: Stationarity.
Alternative hypothesis: Unit root.
L-statistic: 0.817
Critical values:  0.10   0.05  0.025   0.01
                 0.846   1.01   1.16   1.35
> plot(diff(beer),ylab='First Difference of Beer Production',xlab='Time')
> acf(as.vector(diff(beer)),lag.max=36)
> pacf(as.vector(diff(beer)),lag.max=36)
EXAMPLE (contd.)
> HEGY.test(wts = diff(beer), itsd = c(1, 1, c(1:3)), regvar = 0, selectlags = list(mode = "bic", Pmax = 12))
---- HEGY test ----
Null hypothesis: Unit root.
Alternative hypothesis: Stationarity.
---- HEGY statistics:
           Stat.  p-value
tpi_1     -6.067     0.01
tpi_2     -1.503     0.10
Fpi_3:4    9.091     0.01
Fpi_2:4    7.136       NA
Fpi_1:4   26.145       NA
EXAMPLE (contd.)
> fit1=arima(beer,order=c(3,1,0),seasonal=list(order=c(2,0,0),period=4))
> fit1
Call:
arima(x = beer, order = c(3, 1, 0), seasonal = list(order = c(2, 0, 0), period = 4))
Coefficients:
          ar1      ar2      ar3    sar1    sar2
      -0.7380  -0.6939  -0.2299  0.2903  0.6694
s.e.   0.1056   0.1206   0.1206  0.0882  0.0841
sigma^2 estimated as 1.79: log likelihood = -161.55, aic = 335.1
> fit2=arima(beer,order=c(3,1,0),seasonal=list(order=c(3,0,0),period=4))
> fit2
Call:
arima(x = beer, order = c(3, 1, 0), seasonal = list(order = c(3, 0, 0), period = 4))
Coefficients:
          ar1      ar2      ar3    sar1    sar2    sar3
      -0.8161  -0.8035  -0.3529  0.0444  0.5798  0.3387
s.e.   0.1065   0.1188   0.1219  0.1205  0.0872  0.1210
sigma^2 estimated as 1.646: log likelihood = -158.01, aic = 330.01