STAT 497 LECTURE NOTES 8
ESTIMATION
ESTIMATION
• After specifying the order of a stationary ARMA process, we need to estimate the parameters.
• We will assume (for now) that:
1. The model order (p and q) is known, and
2. The data has zero mean.
• If (2) is not a reasonable assumption, we can subtract the sample mean $\bar{Y}$ and fit a zero-mean ARMA model to $X_t = Y_t - \bar{Y}$:
$$\phi(B)X_t = \theta(B)a_t,$$
then use $X_t + \bar{Y}$ as the model for $Y_t$.
ESTIMATION
• Method of Moment Estimation (MME)
• Ordinary Least Squares (OLS) Estimation
• Maximum Likelihood Estimation (MLE)
• Least Squares Estimation
– Conditional
– Unconditional
THE METHOD OF MOMENT ESTIMATION
• It is also known as Yule-Walker estimation. It is an easy but not efficient estimation method, and it works well only for AR models when n is large.
• BASIC IDEA: Equate sample moment(s) to population moment(s), and solve these equation(s) to obtain the estimator(s) of the unknown parameter(s).
$$E(Y_t) = \mu \;\Rightarrow\; \bar{Y} = \frac{1}{n}\sum_{t=1}^{n} Y_t$$
$$E(Y_t Y_{t+k}) = \gamma_k \;\Rightarrow\; \hat{\gamma}_k = \frac{1}{n}\sum_{t=1}^{n-k} Y_t Y_{t+k} \quad \text{or} \quad \rho_k \;\Rightarrow\; \hat{\rho}_k = \frac{\hat{\gamma}_k}{\hat{\gamma}_0}$$
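As an aside (not from the original notes), these sample moments are easy to compute directly in R. The series y below is a simulated AR(1) series used purely for illustration:

# Illustrative sketch: sample moments of a series y
set.seed(1)
y <- arima.sim(list(ar = 0.6), n = 200)   # hypothetical AR(1) data
n <- length(y)
ybar <- mean(y)                                             # estimates E(Y_t)
gamma0 <- sum((y - ybar)^2) / n                             # gamma-hat_0
gamma1 <- sum((y[1:(n - 1)] - ybar) * (y[2:n] - ybar)) / n  # gamma-hat_1
r1 <- gamma1 / gamma0                                       # rho-hat_1
acf(y, lag.max = 1, plot = FALSE)                           # built-in check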
THE METHOD OF MOMENT ESTIMATION
• Let $\hat{\Gamma}_n$ be the variance/covariance matrix of $X$ with the given parameter values.
• Yule-Walker for AR(p): Regress $X_t$ onto $X_{t-1}, \ldots, X_{t-p}$.
• Durbin-Levinson algorithm with $\gamma$ replaced by $\hat{\gamma}$.
• Yule-Walker for ARMA(p,q): Method of moments. Not efficient.
THE YULE-WALKER ESTIMATION
• For a stationary (causal) AR(p) process,
$$X_t = \phi_1 X_{t-1} + \cdots + \phi_p X_{t-p} + a_t,$$
multiplying both sides by $X_{t-k}$ and taking expectations gives
$$E(X_{t+k}X_t) - \sum_{j=1}^{p}\phi_j E(X_{t+k-j}X_t) = E(X_{t+k}a_t), \quad k = 0, 1, \ldots, p,$$
where
$$E(X_{t+k}a_t) = \begin{cases} \sigma_a^2, & k = 0, \\ 0, & k > 0. \end{cases}$$
Hence
$$\gamma_0 - \phi_1\gamma_1 - \cdots - \phi_p\gamma_p = \sigma_a^2 \quad \text{and} \quad \gamma_k = \phi_1\gamma_{k-1} + \cdots + \phi_p\gamma_{k-p}, \quad k = 1, \ldots, p.$$
• To calculate the values $E(X_{t+k}a_t)$ we have used the RSF (random shock form) of the process: $X_t = \psi(B)a_t$.
THE YULE-WALKER ESTIMATION
• To find the Yule-Walker estimators, we replace the theoretical autocovariances (autocorrelations) by their sample counterparts, $\hat{\gamma}_k$ or $\hat{\rho}_k$.
• Thus, the Yule-Walker equations for $\hat{\phi} = (\hat{\phi}_1, \ldots, \hat{\phi}_p)'$ are
$$\hat{\Gamma}_p\hat{\phi} = \hat{\gamma}_p \quad \text{and} \quad \hat{\sigma}_a^2 = \hat{\gamma}_0 - \hat{\phi}'\hat{\gamma}_p, \quad \text{where } \hat{\gamma}_p = (\hat{\gamma}_1, \ldots, \hat{\gamma}_p)'.$$
• These are forecasting equations.
• We can use the Durbin-Levinson algorithm.
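As a hedged aside, R's built-in ar() solves exactly these Yule-Walker equations via the Durbin-Levinson recursion when method = "yule-walker"; y is the simulated series from the earlier sketch:

fit.yw <- ar(y, method = "yule-walker", order.max = 1, aic = FALSE)
fit.yw$ar         # Yule-Walker estimate of phi
fit.yw$var.pred   # corresponding estimate of sigma_a^2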
THE YULE-WALKER ESTIMATION
• If $\hat{\gamma}_0 > 0$, then $\hat{\Gamma}_m$ is nonsingular.
• If $\{X_t\}$ is an AR(p) process, then asymptotically
$$\hat{\phi} \sim N\!\left(\phi,\, \frac{\sigma_a^2\,\Gamma_p^{-1}}{n}\right), \qquad \hat{\sigma}_a^2 \xrightarrow{p} \sigma_a^2,$$
and, for $k > p$,
$$\sqrt{n}\,\hat{\phi}_{kk} \sim N(0, 1).$$
Hence, we can use the sample PACF to test for AR order, and we can calculate approximate confidence intervals for the parameters.
THE YULE-WALKER ESTIMATION
• If $X_t$ is an AR(p) process and $n$ is large,
$$\sqrt{n}\left(\hat{\phi} - \phi\right) \overset{\text{approx}}{\sim} N\!\left(0,\, \hat{\sigma}_a^2\,\hat{\Gamma}_p^{-1}\right).$$
• A $100(1-\alpha)\%$ approximate confidence interval for $\phi_j$ is
$$\hat{\phi}_j \pm z_{\alpha/2}\, n^{-1/2}\left(\hat{\sigma}_a^2\left[\hat{\Gamma}_p^{-1}\right]_{jj}\right)^{1/2}.$$
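A minimal sketch of this interval in R, reusing the ar() fit above: for Yule-Walker fits, ar() returns asy.var.coef, the estimated asymptotic variance matrix of the coefficients (already scaled by 1/n):

fit.yw <- ar(y, method = "yule-walker", order.max = 1, aic = FALSE)
se <- sqrt(diag(fit.yw$asy.var.coef))       # standard error of phi-hat_j
fit.yw$ar + c(-1, 1) * qnorm(0.975) * se    # approximate 95% CI for phi_1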
THE YULE-WALKER ESTIMATION
• AR(1)
$$Y_t = \phi Y_{t-1} + a_t$$
Find the MME of $\phi$. It is known that $\rho_1 = \phi$, so
$$\hat{\phi} = r_1 = \frac{\sum_{t=2}^{n}(Y_t - \bar{Y})(Y_{t-1} - \bar{Y})}{\sum_{t=1}^{n}(Y_t - \bar{Y})^2}.$$
THE YULE-WALKER ESTIMATION
• So, the MME of $\phi$ is
$$\tilde{\phi} = \frac{\sum_{t=2}^{n}(Y_t - \bar{Y})(Y_{t-1} - \bar{Y})}{\sum_{t=1}^{n}(Y_t - \bar{Y})^2}.$$
• Also, $\sigma_a^2$ is unknown.
• Therefore, using the variance of the process,
$$\gamma_0 = \frac{\sigma_a^2}{1 - \phi^2},$$
we can obtain the MME of $\sigma_a^2$.
THE YULE-WALKER ESTIMATION
$$\hat{\sigma}_a^2 = \left(1 - \tilde{\phi}^2\right)\hat{\gamma}_0 = \left(1 - \tilde{\phi}^2\right)\frac{1}{n}\sum_{t=1}^{n}\left(Y_t - \bar{Y}\right)^2$$
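Written out directly in R (an illustrative sketch using the simulated y from before), the two AR(1) moment estimators are:

r1 <- acf(y, lag.max = 1, plot = FALSE)$acf[2]   # rho-hat_1 (lag 0 comes first)
phi.mme <- r1                                    # phi-tilde = r_1
gamma0 <- sum((y - mean(y))^2) / length(y)       # gamma-hat_0
sig2.mme <- (1 - phi.mme^2) * gamma0             # sigma-tilde_a^2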
THE YULE-WALKER ESTIMATION
• AR(2)
Find the MME of all unknown parameters:
$$Y_t = \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + a_t.$$
• Using the Yule-Walker equations,
$$\rho_1 = \phi_1 + \phi_2\rho_1 \quad \text{and} \quad \rho_2 = \phi_1\rho_1 + \phi_2.$$
THE YULE-WALKER ESTIMATION
• So, equate the population autocorrelations to the sample autocorrelations and solve for $\phi_1$ and $\phi_2$:
$$r_1 = \hat{\phi}_1 + \hat{\phi}_2 r_1 \quad \text{and} \quad r_2 = \hat{\phi}_1 r_1 + \hat{\phi}_2.$$
THE YULE-WALKER ESTIMATION
$$\tilde{\phi}_1 = \frac{r_1(1 - r_2)}{1 - r_1^2} \quad \text{and} \quad \tilde{\phi}_2 = \frac{r_2 - r_1^2}{1 - r_1^2}.$$
• To obtain the MME of $\sigma_a^2$, use the process variance formula. Using these estimates,
$$\tilde{\sigma}_a^2 = \left(1 - \tilde{\phi}_1 r_1 - \tilde{\phi}_2 r_2\right)\hat{\gamma}_0.$$
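The closed-form AR(2) solution translates into a small R function (a sketch; yw.ar2 is a hypothetical helper name, and y is the simulated series from before):

yw.ar2 <- function(r1, r2) {
  c(phi1 = r1 * (1 - r2) / (1 - r1^2),
    phi2 = (r2 - r1^2) / (1 - r1^2))
}
rk <- acf(y, lag.max = 2, plot = FALSE)$acf   # rk[2] = r_1, rk[3] = r_2
yw.ar2(rk[2], rk[3])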
THE YULE-WALKER ESTIMATION
• AR(1):
$$\gamma_0 = \frac{\sigma_a^2}{1 - \phi_1^2}, \qquad \hat{\phi}_1 = \frac{\hat{\gamma}_1}{\hat{\gamma}_0} = \hat{\rho}_1, \qquad \hat{\sigma}_a^2 = \hat{\gamma}_0 - \hat{\phi}_1\hat{\gamma}_1.$$
• AR(2):
$$\begin{pmatrix} \hat{\phi}_1 \\ \hat{\phi}_2 \end{pmatrix} = \begin{pmatrix} \hat{\gamma}_0 & \hat{\gamma}_1 \\ \hat{\gamma}_1 & \hat{\gamma}_0 \end{pmatrix}^{-1}\begin{pmatrix} \hat{\gamma}_1 \\ \hat{\gamma}_2 \end{pmatrix}, \qquad \hat{\sigma}_a^2 = \hat{\gamma}_0 - \hat{\phi}_1\hat{\gamma}_1 - \hat{\phi}_2\hat{\gamma}_2.$$
THE YULE-WALKER ESTIMATION
• MA(1)
$$Y_t = a_t - \theta a_{t-1}$$
• Again using the autocorrelation of the series at lag 1,
$$\rho_1 = \frac{-\theta}{1 + \theta^2} = r_1 \quad\Rightarrow\quad r_1\hat{\theta}^2 + \hat{\theta} + r_1 = 0,$$
$$\tilde{\theta} = \frac{-1 \pm \sqrt{1 - 4r_1^2}}{2r_1}.$$
• Choose the root satisfying the invertibility condition $|\tilde{\theta}| < 1$.
THE YULE-WALKER ESTIMATION
• For real roots, we need
$$1 - 4r_1^2 \geq 0 \;\Leftrightarrow\; r_1^2 \leq 0.25 \;\Leftrightarrow\; -0.5 \leq r_1 \leq 0.5.$$
• If $|r_1| = 0.5$: a unique real root exists, but it is non-invertible.
• If $|r_1| > 0.5$: no real root exists and the MME fails.
• If $|r_1| < 0.5$: a unique invertible real root exists (see the sketch below).
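A small R sketch of the MA(1) moment estimator that returns the invertible root and reports failure when |r1| > 0.5 (ma1.mme is a hypothetical helper name; it assumes r1 is nonzero):

ma1.mme <- function(r1) {
  if (abs(r1) > 0.5) return(NA)            # no real root: MME fails
  (-1 + sqrt(1 - 4 * r1^2)) / (2 * r1)     # the invertible root
}
ma1.mme(0.4)   # gives theta-tilde = -0.5, so rho_1 = -theta/(1 + theta^2) = 0.4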
THE YULE-WALKER ESTIMATION
• This example shows that the MMEs for MA and ARMA models are complicated.
• More generally, regardless of whether the model is AR, MA, or ARMA, the MMEs are sensitive to rounding errors. They are usually used only to provide initial estimates for a more efficient nonlinear estimation method.
• The moment estimators are not recommended for final estimation results and should not be used if the process is close to being nonstationary or noninvertible.
THE MAXIMUM LIKELIHOOD ESTIMATION
• Assume that $a_t \overset{iid}{\sim} N\left(0, \sigma_a^2\right)$.
• By this assumption we can use the joint pdf
$$f(a_1, \ldots, a_n) = \prod_{t=1}^{n} f(a_t)$$
instead of $f(y_1, \ldots, y_n)$, which cannot be written as a product of marginal pdfs because of the dependence between time series observations.
MLE METHOD
• For the general stationary ARMA(p,q) model,
$$\dot{Y}_t = \phi_1\dot{Y}_{t-1} + \cdots + \phi_p\dot{Y}_{t-p} + a_t - \theta_1 a_{t-1} - \cdots - \theta_q a_{t-q},$$
or
$$a_t = \dot{Y}_t - \phi_1\dot{Y}_{t-1} - \cdots - \phi_p\dot{Y}_{t-p} + \theta_1 a_{t-1} + \cdots + \theta_q a_{t-q},$$
where $\dot{Y}_t = Y_t - \mu$.
MLE
• The joint pdf of $(a_1, a_2, \ldots, a_n)$ is given by
$$f\left(a_1, \ldots, a_n \mid \phi, \mu, \theta, \sigma_a^2\right) = \left(2\pi\sigma_a^2\right)^{-n/2}\exp\!\left(-\frac{1}{2\sigma_a^2}\sum_{t=1}^{n} a_t^2\right),$$
where $\phi = (\phi_1, \ldots, \phi_p)'$ and $\theta = (\theta_1, \ldots, \theta_q)'$.
• Let $Y = (Y_1, \ldots, Y_n)'$ and assume that the initial conditions $Y_* = (Y_{1-p}, \ldots, Y_0)'$ and $a_* = (a_{1-q}, \ldots, a_0)'$ are known.
MLE
• The conditional log-likelihood function is given by
$$\ln L_*\left(\phi, \mu, \theta, \sigma_a^2\right) = -\frac{n}{2}\ln\left(2\pi\sigma_a^2\right) - \frac{S_*(\phi, \mu, \theta)}{2\sigma_a^2},$$
where
$$S_*(\phi, \mu, \theta) = \sum_{t=1}^{n} a_t^2\left(\phi, \mu, \theta \mid Y_*, a_*, Y\right)$$
is the conditional sum of squares.
• Initial conditions: $Y_* = \bar{Y}$ and $a_* = E(a_t) = 0$.
MLE
• Then, we can find the estimators of $\phi = (\phi_1, \ldots, \phi_p)'$, $\theta = (\theta_1, \ldots, \theta_q)'$, $\mu$ and $\sigma_a^2$ such that the conditional likelihood function is maximized. Usually, numerical nonlinear optimization techniques are required. After obtaining all the estimators,
$$\hat{\sigma}_a^2 = \frac{S_*\left(\hat{\phi}, \hat{\mu}, \hat{\theta}\right)}{\text{d.f.}},$$
where d.f. = (number of terms used in the sum of squares) $-$ (number of parameters) $= (n - p) - (p + q + 1) = n - (2p + q + 1)$.
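For reference (an aside, not from the notes), R's arima() offers exactly this choice: method = "CSS" minimizes the conditional sum of squares S*, while method = "ML" maximizes the exact likelihood; y is the simulated series from the earlier sketches:

fit.css <- arima(y, order = c(1, 0, 0), method = "CSS")   # conditional sum of squares
fit.ml  <- arima(y, order = c(1, 0, 0), method = "ML")    # exact (full) likelihood
fit.css$coef
fit.ml$coef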
MLE
• AR(1):
$$Y_t = \phi Y_{t-1} + a_t, \quad \text{where } a_t \overset{iid}{\sim} N\left(0, \sigma_a^2\right).$$
$$f\left(a_2, \ldots, a_n \mid \phi, \sigma_a^2\right) = \left(2\pi\sigma_a^2\right)^{-(n-1)/2}\exp\!\left(-\frac{1}{2\sigma_a^2}\sum_{t=2}^{n} a_t^2\right)$$
• Conditioning on $Y_1$, consider the transformation
$$a_2 = Y_2 - \phi Y_1 \;\Leftrightarrow\; Y_2 = \phi Y_1 + a_2$$
$$a_3 = Y_3 - \phi Y_2 \;\Leftrightarrow\; Y_3 = \phi Y_2 + a_3$$
$$\vdots$$
$$a_n = Y_n - \phi Y_{n-1} \;\Leftrightarrow\; Y_n = \phi Y_{n-1} + a_n$$
MLE
• The Jacobian of the transformation from $(a_2, \ldots, a_n)$ to $(Y_2, \ldots, Y_n)$ will be
$$J = \begin{vmatrix} \partial a_2/\partial Y_2 & \partial a_2/\partial Y_3 & \cdots & \partial a_2/\partial Y_n \\ \partial a_3/\partial Y_2 & \partial a_3/\partial Y_3 & \cdots & \partial a_3/\partial Y_n \\ \vdots & \vdots & \ddots & \vdots \\ \partial a_n/\partial Y_2 & \partial a_n/\partial Y_3 & \cdots & \partial a_n/\partial Y_n \end{vmatrix} = \begin{vmatrix} 1 & 0 & \cdots & 0 \\ -\phi & 1 & \cdots & 0 \\ \vdots & \ddots & \ddots & \vdots \\ 0 & \cdots & -\phi & 1 \end{vmatrix} = 1,$$
so that
$$f\left(Y_2, \ldots, Y_n \mid Y_1\right) = |J|\,f\left(a_2, \ldots, a_n\right) = f\left(a_2, \ldots, a_n\right).$$
MLE
• Then, the likelihood function can be written as
$$L\left(\phi, \mu, \sigma_a^2\right) = f\left(Y_1, \ldots, Y_n\right) = f(Y_1)\,f\left(Y_2, \ldots, Y_n \mid Y_1\right) = f(Y_1)\prod_{t=2}^{n} f(a_t),$$
where
$$Y_1 \sim N\!\left(\mu, \frac{\sigma_a^2}{1 - \phi^2}\right),$$
so that
$$L = \left(\frac{1-\phi^2}{2\pi\sigma_a^2}\right)^{1/2} e^{-\frac{\left(1-\phi^2\right)(Y_1-\mu)^2}{2\sigma_a^2}}\left(2\pi\sigma_a^2\right)^{-(n-1)/2} e^{-\frac{1}{2\sigma_a^2}\sum_{t=2}^{n}\left[(Y_t-\mu) - \phi(Y_{t-1}-\mu)\right]^2}.$$
MLE
• Hence,
$$L\left(\phi, \mu, \sigma_a^2\right) = \left(2\pi\sigma_a^2\right)^{-n/2}\left(1-\phi^2\right)^{1/2}\exp\!\left(-\frac{S(\phi, \mu)}{2\sigma_a^2}\right),$$
where
$$S(\phi, \mu) = \left(1-\phi^2\right)(Y_1-\mu)^2 + \sum_{t=2}^{n}\left[(Y_t-\mu) - \phi(Y_{t-1}-\mu)\right]^2 = \left(1-\phi^2\right)(Y_1-\mu)^2 + S_*(\phi, \mu).$$
• The log-likelihood function:
$$\ln L\left(\phi, \mu, \sigma_a^2\right) = -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln\sigma_a^2 + \frac{1}{2}\ln\left(1-\phi^2\right) - \frac{S(\phi, \mu)}{2\sigma_a^2}.$$
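A minimal sketch that maximizes this exact AR(1) log-likelihood numerically with optim(), parameterized as (phi, mu, log sigma_a^2); y is the simulated series used earlier:

negloglik.ar1 <- function(par, y) {
  phi <- par[1]; mu <- par[2]; s2 <- exp(par[3])
  if (abs(phi) >= 1) return(Inf)       # keep phi inside the stationary region
  n <- length(y)
  # Unconditional sum of squares S(phi, mu) from the slide above
  S <- (1 - phi^2) * (y[1] - mu)^2 +
       sum(((y[-1] - mu) - phi * (y[-n] - mu))^2)
  0.5 * (n * log(2 * pi * s2) - log(1 - phi^2) + S / s2)   # -ln L
}
opt <- optim(c(0, mean(y), log(var(y))), negloglik.ar1, y = y)
c(phi = opt$par[1], mu = opt$par[2], sigma2.a = exp(opt$par[3]))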
MLE
• Here, S*() is the conditional sum of squares and S() is the unconditional sum of squares.
• To find the value of $\sigma_a^2$ where the likelihood function is maximized, set
$$\frac{\partial \ln L\left(\phi, \mu, \sigma_a^2\right)}{\partial \sigma_a^2} = 0.$$
• Then,
$$\hat{\sigma}_a^2 = \frac{S\left(\hat{\phi}, \hat{\mu}\right)}{n}.$$
MLE
• If we neglect $\frac{1}{2}\ln\left(1-\phi^2\right)$, then maximizing the likelihood is equivalent to minimizing the unconditional sum of squares:
$$\max L\left(\phi, \mu, \sigma_a^2\right) \;\Leftrightarrow\; \min S(\phi, \mu).$$
• If we neglect both $\frac{1}{2}\ln\left(1-\phi^2\right)$ and $\left(1-\phi^2\right)(Y_1-\mu)^2$, then MLE = conditional LSE:
$$\max L\left(\phi, \mu, \sigma_a^2\right) \;\Leftrightarrow\; \min S_*(\phi, \mu).$$
MLE
• The MLEs are asymptotically unbiased, efficient, consistent, and sufficient for large sample sizes, but the joint pdf can be hard to deal with.
CONDITIONAL LEAST SQUARES ESTIMATION
• AR(1):
$$Y_t = \phi Y_{t-1} + a_t, \quad \text{where } a_t \overset{iid}{\sim} N\left(0, \sigma_a^2\right) \;\Rightarrow\; a_t = Y_t - \phi Y_{t-1}.$$
$$SSE = S_*(\phi) = \sum_{t=2}^{n} a_t^2 = \sum_{t=2}^{n}\left(Y_t - \phi Y_{t-1}\right)^2 \quad \text{for observed } Y_1, \ldots, Y_n.$$
$$\frac{dS_*(\phi)}{d\phi} = -2\sum_{t=2}^{n}\left(Y_t - \phi Y_{t-1}\right)Y_{t-1} = 0 \;\Rightarrow\; \hat{\phi} = \frac{\sum_{t=2}^{n} Y_t Y_{t-1}}{\sum_{t=2}^{n} Y_{t-1}^2}.$$
CONDITIONAL LSE
• If the process mean is different than zero
33
of MME
YY
YYYYn
tt
n
ttt
2
2
1
21ˆ
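Since conditional LS for an AR(1) is just a linear regression of Y_t on Y_(t-1), lm() reproduces this estimator (a sketch using the simulated y):

n <- length(y)
fit.ls <- lm(y[2:n] ~ y[1:(n - 1)])
coef(fit.ls)   # intercept estimates mu*(1 - phi); slope estimates phi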
CONDITIONAL LSE
• MA(1):
$$Y_t = a_t - \theta a_{t-1}, \quad a_t \sim \text{normal } WN\left(0, \sigma_a^2\right).$$
If we invert the process, $a_t = Y_t + \theta Y_{t-1} + \theta^2 Y_{t-2} + \cdots$, an AR($\infty$) representation, so the problem is:
– Non-linear in terms of the parameters
– A least squares problem
– $S_*(\theta)$ cannot be minimized analytically
– Numerical nonlinear optimization methods like Newton-Raphson or Gauss-Newton are needed (see the sketch after this list)
*There are similar problems in the ARMA case.
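A hedged sketch of this numerical approach for the MA(1): compute S*(theta) by the recursion a_t = Y_t + theta*a_(t-1) with a_0 = 0, then minimize over the invertible range. The helper css.ma1 and the simulated series y2 are illustrative assumptions; note that arima.sim() uses the sign convention Y_t = a_t + theta*a_(t-1), so ma = -0.5 corresponds to theta = 0.5 in the notes' convention:

set.seed(2)
y2 <- arima.sim(list(ma = -0.5), n = 200)   # hypothetical MA(1) data, theta = 0.5
css.ma1 <- function(theta, y) {
  a.prev <- 0                      # conditioning: a_0 = 0
  S <- 0
  for (t in seq_along(y)) {
    a.t <- y[t] + theta * a.prev   # recursion a_t = Y_t + theta * a_(t-1)
    S <- S + a.t^2
    a.prev <- a.t
  }
  S                                # conditional sum of squares S*(theta)
}
optimize(css.ma1, interval = c(-0.99, 0.99), y = y2)$minimum   # theta-hat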
UNCONDITIONAL LSE
$$\min_{\phi,\,\mu} S(\phi, \mu), \quad \text{where } S(\phi, \mu) = \sum_{t=2}^{n}\left[(Y_t-\mu) - \phi(Y_{t-1}-\mu)\right]^2 + \left(1-\phi^2\right)(Y_1-\mu)^2.$$
$$\frac{dS(\phi, \mu)}{d\phi} = 0 \;\Rightarrow\; \hat{\phi}$$
• This is nonlinear in $\phi$.
• We need nonlinear optimization techniques.
BACKCASTING METHOD
• Obtain the backward form of the ARMA(p,q) process,
$$\left(1 - \phi_1 F - \cdots - \phi_p F^p\right)\dot{Y}_t = \left(1 - \theta_1 F - \cdots - \theta_q F^q\right)e_t, \quad \text{where } F^j Y_t = Y_{t+j}.$$
• Instead of forecasting, backcast the past values of $Y_t$ and $a_t$, $t \leq 0$. Obtain the unconditional log-likelihood function, then obtain the estimators.
EXAMPLE
• If there are only 2 observations in the time series (not realistic),
$$Y_1 = a_1 \quad \text{and} \quad Y_2 = \theta a_1 + a_2, \quad \text{where } a_1, a_2 \overset{iid}{\sim} N\left(0, \sigma_a^2\right).$$
Find the MLE of $\theta$ and $\sigma_a^2$.
EXAMPLE
• US Quarterly Beer Production from 1975 to 1997
> par(mfrow=c(1,3))
> plot(beer)
> acf(as.vector(beer),lag.max=36)
> pacf(as.vector(beer),lag.max=36)
EXAMPLE (contd.)
> library(uroot)
Warning message: package 'uroot' was built under R version 2.13.0
> HEGY.test(wts = beer, itsd = c(1, 1, c(1:3)), regvar = 0, selectlags = list(mode = "bic", Pmax = 12))
Null hypothesis: Unit root.
Alternative hypothesis: Stationarity.
---- HEGY statistics:
          Stat.  p-value
tpi_1    -3.339    0.085
tpi_2    -5.944    0.010
Fpi_3:4  13.238    0.010
> CH.test(beer)
------ Canova & Hansen test ------
Null hypothesis: Stationarity.
Alternative hypothesis: Unit root.
L-statistic: 0.817
Critical values:  0.10   0.05  0.025   0.01
                 0.846   1.01   1.16   1.35
> plot(diff(beer),ylab='First Difference of Beer Production',xlab='Time')
> acf(as.vector(diff(beer)),lag.max=36)
> pacf(as.vector(diff(beer)),lag.max=36)
EXAMPLE (contd.)
> HEGY.test(wts = diff(beer), itsd = c(1, 1, c(1:3)), regvar = 0, selectlags = list(mode = "bic", Pmax = 12))
---- HEGY test ----
Null hypothesis: Unit root.
Alternative hypothesis: Stationarity.
---- HEGY statistics:
           Stat.  p-value
tpi_1     -6.067     0.01
tpi_2     -1.503     0.10
Fpi_3:4    9.091     0.01
Fpi_2:4    7.136       NA
Fpi_1:4   26.145       NA
EXAMPLE (contd.)
> fit1=arima(beer,order=c(3,1,0),seasonal=list(order=c(2,0,0),period=4))
> fit1
Call:
arima(x = beer, order = c(3, 1, 0), seasonal = list(order = c(2, 0, 0), period = 4))
Coefficients:
          ar1      ar2      ar3    sar1    sar2
      -0.7380  -0.6939  -0.2299  0.2903  0.6694
s.e.   0.1056   0.1206   0.1206  0.0882  0.0841
sigma^2 estimated as 1.79: log likelihood = -161.55, aic = 335.1
> fit2=arima(beer,order=c(3,1,0),seasonal=list(order=c(3,0,0),period=4))
> fit2
Call:
arima(x = beer, order = c(3, 1, 0), seasonal = list(order = c(3, 0, 0), period = 4))
Coefficients:
          ar1      ar2      ar3    sar1    sar2    sar3
      -0.8161  -0.8035  -0.3529  0.0444  0.5798  0.3387
s.e.   0.1065   0.1188   0.1219  0.1205  0.0872  0.1210
sigma^2 estimated as 1.646: log likelihood = -158.01, aic = 330.01