
UNIVERSITY OF UTAH

GUIDED READING

TIME SERIES

Problems from Chapter 3 of Shumway and Stoffer’s Book

Author: Curtis MILLER

Supervisor: Prof. Lajos HORVATH

November 10, 2015

UNIVERSITY OF UTAH DEPARTMENT OF MATHEMATICS

ARIMA Models


1 ESTIMATION

1.1 AR(2) MODEL FOR cmort

To estimate the AR(2) process, I first use ordinary least squares (OLS). I then use the Yule-Walker estimate. This is shown in the R code below:

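library(astsa)  # provides the cmort series (assumed; the library calls are not shown in the original)

# OLS estimate
# demean = T fits the model to cmort - mean(cmort)
# intercept is left at its default (intercept = demean), so a small intercept is still estimated
cmort.ar2.ols <- ar.ols(cmort, order = 2, demean = T)

# Yule-Walker estimate
cmort.ar2.yw <- ar.yw(cmort, order = 2, demean = T)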

1.1.1 PARAMETER ESTIMATE COMPARISON

# OLS estimate
cmort.ar2.ols

##
## Call:
## ar.ols(x = cmort, order.max = 2, demean = T)
##
## Coefficients:
##      1       2
## 0.4286  0.4418
##
## Intercept: -0.04672 (0.2527)
##
## Order selected 2  sigma^2 estimated as  32.32

# Yule-Walker estimate
cmort.ar2.yw

##
## Call:
## ar.yw.default(x = cmort, order.max = 2, demean = T)
##
## Coefficients:
##      1       2
## 0.4339  0.4376
##
## Order selected 2  sigma^2 estimated as  32.84

Looking at the coefficients of the AR(2) model estimated using the two methods, I see very little difference: the OLS estimates are (0.4286, 0.4418) and the Yule-Walker estimates are (0.4339, 0.4376). OLS and Yule-Walker estimation produce similar results.
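The agreement between the two estimators is easier to see once the Yule-Walker estimate is written as a small linear system in the sample autocorrelations. The sketch below is an illustration only: it uses a simulated AR(2) series (the seed, sample size, and coefficients are my own choices, not values from the cmort analysis), solves that system by hand, and compares the result with ar.yw() and ar.ols().

```r
# Illustration on simulated data; not the cmort series from the text.
set.seed(1)
x <- arima.sim(n = 500, model = list(ar = c(0.43, 0.44)))

# Sample autocorrelations at lags 1 and 2 (acf() demeans by default)
rho <- acf(x, lag.max = 2, plot = FALSE)$acf[2:3]

# Yule-Walker equations for an AR(2): R %*% phi = rho
R <- matrix(c(1, rho[1], rho[1], 1), nrow = 2)
phi.hand <- solve(R, rho)

# Built-in Yule-Walker fit with the order fixed at 2
phi.yw <- as.numeric(ar.yw(x, order.max = 2, aic = FALSE)$ar)

# OLS fit for comparison
phi.ols <- as.numeric(ar.ols(x, order.max = 2, aic = FALSE, demean = TRUE)$ar)

print(rbind(hand = phi.hand, yule.walker = phi.yw, ols = phi.ols))
```

The hand-solved system should reproduce the ar.yw() coefficients, while the OLS estimates differ only through how the first and last few observations enter the sums.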

1.1.2 STANDARD ERROR COMPARISON

# The standard error of the OLS estimates
cmort.ar2.ols$asy.se.coef$ar

## [1] 0.03979433 0.03976163

# The variance matrix of the Yule-Walker estimates
cmort.ar2.yw$asy.var.coef

##              [,1]         [,2]
## [1,]  0.001601043 -0.001235314
## [2,] -0.001235314  0.001601043

# Corresponding standard error of both parameters
sqrt(cmort.ar2.yw$asy.var.coef[1,1])

## [1] 0.04001303

Looking at the above R output, the two methods give nearly the same standard errors for the parameters: about 0.0398 under OLS and 0.0400 under Yule-Walker.
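This is what large-sample theory predicts, since for a causal AR(2) process both estimators share the same asymptotic covariance matrix. As a rough check, the sketch below plugs the Yule-Walker estimates into that matrix; it assumes n = 508 observations in cmort, a figure that is not stated in the text.

```latex
\begin{pmatrix}\hat\phi_1\\ \hat\phi_2\end{pmatrix}
\;\approx\;
N\!\left(
\begin{pmatrix}\phi_1\\ \phi_2\end{pmatrix},\;
\frac{1}{n}
\begin{pmatrix}
1-\phi_2^2 & -\phi_1(1+\phi_2)\\
-\phi_1(1+\phi_2) & 1-\phi_2^2
\end{pmatrix}
\right),
\qquad
\sqrt{\frac{1-\hat\phi_2^{\,2}}{n}}
= \sqrt{\frac{1-0.4376^2}{508}}
\approx 0.040 .
```

The equal diagonal entries explain why a single Yule-Walker standard error applies to both coefficients, and the plug-in value agrees with the reported 0.0400 up to a small finite-sample adjustment.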


1.2 AR(1) SIMULATION AND ESTIMATION

First I generate the data:

# sd belongs to the innovation generator (rnorm), not inside the model list
ar1.sim <- arima.sim(n = 50, model = list(ar = 0.99), sd = 1)

I first estimate the parameter from the simulation using the Yule-Walker estimate.

ar1.sim.yw <- ar.yw(ar1.sim, order = 1)
# Model estimates
ar1.sim.yw

##
## Call:
## ar.yw.default(x = ar1.sim, order.max = 1)
##
## Coefficients:
##      1
## 0.8946
##
## Order selected 1  sigma^2 estimated as  1.144

# Model covariance matrix
ar1.sim.yw$asy.var.coef

##             [,1]
## [1,] 0.004158676

Here, I would perform inference on the parameter by treating the estimator as approximately Normally distributed. I would use the covariance matrix listed above to estimate the standard error.
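As a concrete illustration of that Normal-approximation inference, the sketch below builds a 95% confidence interval from the point estimate and variance printed above. The numbers are copied from the output; the variable names are my own.

```r
phi.hat <- 0.8946        # Yule-Walker estimate reported above
var.hat <- 0.004158676   # asymptotic variance reported above

se <- sqrt(var.hat)
ci <- phi.hat + c(-1, 1) * qnorm(0.975) * se

print(round(c(se = se, lower = ci[1], upper = ci[2]), 3))
```

The upper limit falls above 1, outside the stationary region, which suggests the Normal approximation is strained when the true coefficient is 0.99 and only 50 observations are available.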

Bootstrap results in R could be obtained as follows:

library(boot)  # provides tsboot() (assumed; the library calls are not shown in the original)

tsboot(ar1.sim, function(d) {
  return(ar.yw(d, order = 1)$ar)
}, R = 2000)

##
## MODEL BASED BOOTSTRAP FOR TIME SERIES
##
##
## Call:
## tsboot(tseries = ar1.sim, statistic = function(d) {
##     return(ar.yw(d, order = 1)$ar)
## }, R = 2000)
##
##
## Bootstrap Statistics :
##      original  bias    std. error
## t1* 0.8946416       0           0

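The bootstrap standard error is zero, while the theoretical standard error is non-zero. The zero is an artifact of the call rather than a property of the estimator: tsboot() defaults to sim = "model", and with no ran.gen supplied its default generator returns the original series unchanged, so all 2000 replicates reproduce the original estimate. A meaningful bootstrap standard error requires either a ran.gen function that simulates new series from the fitted AR(1) model or a block-resampling scheme (for example sim = "fixed" with a block length l).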
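A bootstrap that actually regenerates the series can be sketched in base R. The code below is a hand-rolled residual bootstrap, not the tsboot() call from the text: it resamples the centered AR(1) residuals, rebuilds a series from the fitted recursion, and re-estimates the coefficient each time. The seed, burn-in length, and number of replicates are arbitrary choices of mine, and the series is freshly simulated, so the numbers will not match the output above.

```r
# Illustration on a freshly simulated series; results will not match the output above.
set.seed(42)
n <- 50
x <- arima.sim(n = n, model = list(ar = 0.99), sd = 1)

fit <- ar.yw(x, order.max = 1, aic = FALSE)
phi.hat <- as.numeric(fit$ar)

# Centered residuals from the fitted AR(1); the first residual is NA
res <- as.numeric(na.omit(fit$resid))
res <- res - mean(res)

burn <- 50   # burn-in length (arbitrary choice)
B <- 500     # number of bootstrap replicates (arbitrary choice)

phi.star <- replicate(B, {
  e <- sample(res, size = n + burn, replace = TRUE)
  # Rebuild a series from the fitted recursion x*_t = phi.hat * x*_{t-1} + e*_t
  x.star <- stats::filter(e, filter = phi.hat, method = "recursive")
  x.star <- as.numeric(x.star)[-seq_len(burn)]
  as.numeric(ar.yw(x.star, order.max = 1, aic = FALSE)$ar)
})

boot.se <- sd(phi.star)
print(c(estimate = phi.hat, bootstrap.se = boot.se,
        asymptotic.se = sqrt(as.numeric(fit$asy.var.coef))))
```

Because every replicate is a different series, the bootstrap standard error is strictly positive and can be compared with the asymptotic one.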

2 INTEGRATED MODELS FOR NONSTATIONARY DATA

2.1 EWMA MODEL FOR GLACIAL VARVE DATA

Here I am interested in the varve dataset. In fact, I am interested in analyzing log(varve), since I believe this may actually be a stationary process. I will be estimating an EWMA model for this data.

logvarve <- log(varve)

# EWMA for logvarve with lambda = .25
logvarve.ima.25 <- HoltWinters(logvarve[1:100],
                               alpha = 1 - .25,
                               beta = FALSE,
                               gamma = FALSE)

logvarve.ima.5 <- HoltWinters(logvarve[1:100],
                              alpha = 1 - .5,
                              beta = FALSE,
                              gamma = FALSE)

logvarve.ima.75 <- HoltWinters(logvarve[1:100],
                               alpha = 1 - .75,
                               beta = FALSE,
                               gamma = FALSE)

# Plotting results
par(mfrow = c(3,1))
plot(logvarve.ima.25, main = "EWMA Fit with Lambda = .25")
plot(logvarve.ima.5, main = "EWMA Fit with Lambda = .5")
plot(logvarve.ima.75, main = "EWMA Fit with Lambda = .75")

The results are shown in Figure 2.1. With a small smoothing parameter (λ), the predictions are very sensitive to the immediate past, while a high smoothing parameter leads to more stable predictions.
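The effect of λ can be seen directly from the EWMA recursion, in which the next prediction is (1 - λ) times the latest observation plus λ times the previous prediction. The sketch below implements that recursion by hand on simulated data (the series, seed, and the function name ewma are my own, not part of the varve analysis); it is meant to correspond to HoltWinters() with alpha = 1 - lambda and the default start value.

```r
# Illustration on simulated data; not the varve series from the text.
set.seed(2)
y <- cumsum(rnorm(100, sd = 0.2)) + rnorm(100, sd = 0.5)

# One-step-ahead EWMA predictions for times 2, ..., n:
#   pred[t + 1] = (1 - lambda) * y[t] + lambda * pred[t],  started at pred[2] = y[1]
ewma <- function(y, lambda) {
  n <- length(y)
  pred <- numeric(n)
  pred[1] <- NA          # no prediction for the first observation
  pred[2] <- y[1]
  for (t in 2:(n - 1)) {
    pred[t + 1] <- (1 - lambda) * y[t] + lambda * pred[t]
  }
  pred
}

pred.25 <- ewma(y, lambda = 0.25)
pred.75 <- ewma(y, lambda = 0.75)

# Rougher predictions for small lambda: compare the variability of their changes
print(c(lambda.25 = sd(diff(pred.25), na.rm = TRUE),
        lambda.75 = sd(diff(pred.75), na.rm = TRUE)))
```

With λ = .25 three quarters of the weight falls on the most recent observation, so the predictions move more from one step to the next than they do with λ = .75.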

Figure 2.1: EWMA fit for different smoothing parameters (three stacked panels titled "EWMA Fit with Lambda = .25", "EWMA Fit with Lambda = .5", and "EWMA Fit with Lambda = .75"; each plots Observed / Fitted against Time, 0 to 100)

3 BUILDING ARIMA MODELS

3.1 AR(1) MODEL FOR GNP DATA

Here I investigate how well an AR(1) model fits the growth rate of U.S. GNP or, equivalently, how well an ARIMA(1,1,0) model fits the natural log of U.S. GNP. I estimate this ARIMA model.

gnpgr = diff(log(gnp))  # growth rate of GNP
gnp.model <- sarima(gnpgr, 1, 0, 0, details = F)  # AR(1) model fit

I see some disturbing patterns in the diagnostic plots shown in Figure 3.1. The standardized residuals should look like white noise, but their variance appears to decrease over time, and the Q-Q plot suggests non-normality. On the other hand, the ACF of the residuals and the p-values for the Ljung-Box statistic look as they should. Still, other models (probably ones that do not assume Gaussian white noise) may be better.
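The visual checks above have numerical counterparts. As a minimal sketch on simulated data (the series, seed, and lag choice are assumptions of mine, and arima() from base R stands in for sarima()), the Ljung-Box test summarizes residual autocorrelation and the Shapiro-Wilk test accompanies the Q-Q plot:

```r
# Illustration on simulated data; not the GNP series from the text.
set.seed(7)
y <- arima.sim(n = 200, model = list(ar = 0.35))

fit <- arima(y, order = c(1, 0, 0))
res <- residuals(fit)

# Ljung-Box test at lag 10; fitdf = 1 removes one degree of freedom for the AR coefficient
lb <- Box.test(res, lag = 10, type = "Ljung-Box", fitdf = 1)

# Shapiro-Wilk test as a numerical companion to the Q-Q plot
sw <- shapiro.test(as.numeric(res))

print(c(ljung.box.p = lb$p.value, shapiro.p = sw$p.value))
```

Large p-values in both tests are consistent with uncorrelated, Gaussian residuals; a small Shapiro-Wilk p-value alongside a large Ljung-Box p-value would mirror the situation described for the GNP fit.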

3.2 FITTING CRUDE OIL PRICES WITH AN ARIMA(p, d, q) MODEL

My objective is to fit an ARIMA(p, d, q) model for the oil dataset. I start by examining the data:

# Prepare layout
old.par <- par(mar = c(0, 0, 0, 0), oma = c(4, 4, 1, 1), mfrow = c(4, 1),
               cex.axis = .75)

plot(oil, xaxt = 'n')
mtext(text = "Oil Price", side = 2, line = 2, cex = .75)

plot(log(oil), xaxt = 'n')
mtext(text = "Natural Logarithm of Oil Price", side = 2, line = 2, cex = .75)

plot(diff(oil), xaxt = 'n')
mtext(text = "First Difference in Oil Price", side = 2, line = 2, cex = .75)

plot(diff(log(oil)))
mtext(text = "Percent Change in Oil Price", side = 2, line = 2, cex = .75)

The first plot in Figure 3.2 shows that oil prices clearly are not a stationary process, and it appears that the variance of the process increases with time. Taking the natural log of oil prices helps control the increasing variability, but not the nonstationary behavior of the series. When looking at the change in oil price from one period to the next, I do see a process that looks more stationary, but the nonconstant variance is not removed. The final attempt is to look at the differences in the natural log of oil prices (which can be interpreted as the approximate percentage change in oil prices). This appears to be stationary, with a mostly constant variance. However, there are large deviations around 2009, and even prior, that would lead one to conclude that the white noise is not Gaussian, which threatens estimation and inference.
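The percentage-change reading of the log difference follows from a first-order expansion of the logarithm. Writing r_t for the proportional change in price (notation introduced here for illustration):

```latex
\Delta \log(\mathrm{oil}_t)
= \log\!\left(\frac{\mathrm{oil}_t}{\mathrm{oil}_{t-1}}\right)
= \log(1 + r_t) \approx r_t,
\qquad
r_t = \frac{\mathrm{oil}_t - \mathrm{oil}_{t-1}}{\mathrm{oil}_{t-1}} .
```

The approximation is close for small changes and loosens for large ones; a 20% rise, for example, corresponds to a log difference of about 0.18.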

I now look at the ACF and PACF of ∆log(oil_t):

Figure 3.1: Diagnostic plots for the AR(1) model (panels: Standardized Residuals against Time, 1950 to 2000; ACF of Residuals; Normal Q-Q Plot of Std Residuals; p values for Ljung-Box statistic)

Figure 3.2: Basic plots of the oil series (panels: Oil Price; Natural Logarithm of Oil Price; First Difference in Oil Price; Percent Change in Oil Price; each against Time, 2000 to 2010)

par(mar = c(0, 0, 0, 0), oma = c(4, 4, 1, 1), mfrow = c(2, 1),
    cex.axis = .75)

acf(diff(log(oil)), xaxt = 'n')
mtext(text = "Sample ACF", side = 2, line = 2)

pacf(diff(log(oil)))
mtext(text = "Sample PACF", side = 2, line = 2)

When looking at the sample PACF in Figure 3.3, I see that the PACF is nonzero as far out as eight lags, which may suggest that p = 8 + 1 = 9; that is, we should consider AR terms out to as many as nine lags.

# write(capture.output(...), "") is used only to make presentation easier
write(capture.output(sarima(log(oil), p = 9, d = 1, q = 0))[32:38], "")

##
## Coefficients:
##          ar1      ar2     ar3      ar4     ar5      ar6
##       0.1678  -0.1189  0.1844  -0.0713  0.0486  -0.0715
## s.e.  0.0429   0.0432  0.0436   0.0442  0.0444   0.0443
##           ar7     ar8     ar9  constant
##       -0.0158  0.1135  0.0525    0.0017

The ninth lag does not appear statistically significant, so I drop the number of lags down to eight. I now use the following ARIMA model (with diagnostic plots shown in Figure 3.4):

oil.model <- sarima(log(oil), p = 8, d = 1, q = 0, details = F)

write(capture.output(oil.model)[8:14], "")

## Coefficients:
##          ar1      ar2     ar3      ar4     ar5      ar6
##       0.1742  -0.1200  0.1814  -0.0689  0.0448  -0.0621
## s.e.  0.0426   0.0433  0.0436   0.0442  0.0443   0.0437
##           ar7     ar8  constant
##       -0.0218  0.1224    0.0017
## s.e.   0.0435  0.0428    0.0026
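The coefficient-over-standard-error check used to judge individual lags can be applied to this output directly. In the sketch below the estimates and standard errors are copied from the table above; the 5% two-sided Normal cutoff and the variable names are my own choices.

```r
# Estimates and standard errors copied from the ARIMA(8,1,0) output above
est <- c(ar1 = 0.1742, ar2 = -0.1200, ar3 = 0.1814, ar4 = -0.0689,
         ar5 = 0.0448, ar6 = -0.0621, ar7 = -0.0218, ar8 = 0.1224,
         constant = 0.0017)
se  <- c(0.0426, 0.0433, 0.0436, 0.0442, 0.0443, 0.0437, 0.0435, 0.0428, 0.0026)

z <- est / se
significant <- names(z)[abs(z) > qnorm(0.975)]

print(round(z, 2))
print(significant)
```

By this rough rule only lags 1, 2, 3, and 8 stand out, and the ratio of about 2.86 at the eighth lag supports keeping AR terms out to lag eight.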

The residuals clearly do not appear to be Gaussian; there are large price movements that make this assumption doubtful, and the Q-Q plot does not support the Normality assumption. The ACF of the residuals gets large at some distant lags but otherwise stays within the band of reasonable values. The p-values of the Ljung-Box statistics suggest that we do not have dependence in our residuals at large lags. This may be the best fit an ARIMA model can provide.

Figure 3.3: Sample ACF and PACF for percentage change in oil price (panels: Sample ACF and Sample PACF of diff(log(oil)), each against Lag)

Figure 3.4: Diagnostic plots for the ARIMA(8,1,0) model for the log(oil) series (panels: Standardized Residuals against Time, 2000 to 2010; ACF of Residuals; Normal Q-Q Plot of Std Residuals; p values for Ljung-Box statistic)

4 REGRESSION WITH AUTOCORRELATED ERRORS

4.1 MONTHLY SALES DATA

4.1.1 ARIMA MODEL FITTING

The problem first asks for an ARIMA model for the sales data series. I first plot the series.


plot(sales, xaxt = 'n')
mtext(text = "Sales", side = 2, line = 2, cex = .75)

plot(diff(sales), xaxt = 'n')
mtext(text = "First Order Difference in Sales", side = 2, line = 2, cex = .75)

plot(diff(diff(sales)))
mtext(text = "Second Order Difference in Sales", side = 2, line = 2, cex = .75)

Figure 4.1 shows the plots of the sales series. Clearly, sales_t is not stationary. Surprisingly, neither is ∆sales_t; this series shows periodicity. It takes second-order differencing, ∆(∆sales_t), to find a stationary series.
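Second-order differencing expands to x_t - 2x_{t-1} + x_{t-2}, and R offers two equivalent ways to compute it. The sketch below checks this on a simulated series (the series and seed are my own, not the sales data):

```r
# Illustration on simulated data; not the sales series from the text.
set.seed(5)
x <- cumsum(cumsum(rnorm(150)))   # integrated twice, so two differences are needed

d2.nested <- diff(diff(x))
d2.direct <- diff(x, differences = 2)

# Second difference written out: x[t] - 2 * x[t-1] + x[t-2]
n <- length(x)
d2.manual <- x[3:n] - 2 * x[2:(n - 1)] + x[1:(n - 2)]

print(all.equal(d2.nested, d2.direct))
print(all.equal(d2.nested, d2.manual))
```

Each difference shortens the series by one observation, so the twice-differenced series has n - 2 values.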

I next examine the ACF and PACF functions to try to identify the order of the AR and MA terms.


acf(diff(diff(sales)), xaxt = 'n')
mtext(text = "Sample ACF", side = 2, line = 2)

pacf(diff(diff(sales)))
mtext(text = "Sample PACF", side = 2, line = 2)

As shown in Figure 4.2, the sample ACF cuts off after one lag and the sample PACF appears to be trailing off, so I believe that an ARIMA(0,2,1) should provide a good fit for the data.
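The identification rule used here (ACF cutting off after lag 1, PACF tailing off) is the theoretical signature of an MA(1) process. The sketch below computes both functions for an illustrative MA(1) with a negative coefficient; the value of theta is my own choice for illustration.

```r
theta <- -0.75   # illustrative MA(1) coefficient (my choice)

rho   <- ARMAacf(ma = theta, lag.max = 6)[-1]            # ACF at lags 1..6
alpha <- ARMAacf(ma = theta, lag.max = 6, pacf = TRUE)   # PACF at lags 1..6

print(round(rbind(ACF = rho, PACF = alpha), 3))
```

The ACF is theta / (1 + theta^2) at lag 1 and zero afterwards, while the PACF stays negative and shrinks geometrically, which is the pattern to look for in the sample plots.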

par(old.par)
sales.model <- sarima(sales, p = 0, d = 2, q = 1, details = F)

write(capture.output(sales.model)[8:14], "")

## Coefficients:
##           ma1
##       -0.7480
## s.e.   0.0662
##
## sigma^2 estimated as 1.866:  log likelihood = -256.57,  aic = 517.14

Figure 4.1: Basic plots of the sales series (panels: Sales; First Order Difference in Sales; Second Order Difference in Sales; each against Time, 0 to 150)

Figure 4.2: Sample ACF and PACF for second order difference in sales (panels: Sample ACF and Sample PACF of diff(diff(sales)), each against Lag)

Figure 4.3: Diagnostic plots for the ARIMA(0,2,1) model for the sales series (panels: Standardized Residuals against Time, 0 to 150; ACF of Residuals; Normal Q-Q Plot of Std Residuals; p values for Ljung-Box statistic)

Looking at the diagnostic plots in Figure 4.3, the ARIMA(0,2,1) seems to fit well. The error terms appear Gaussian, there are no strong autocorrelations in the residuals, and the error terms do not appear to be dependent.

4.1.2 RELATIONSHIP BETWEEN sales AND lead

I examine the CCF of the differenced series ∆sales_t and ∆lead_t, and a lag plot of sales_t against lead_{t−h} for h = 0, ..., 3, to determine if a regression of ∆sales_t on ∆lead_{t−3} is reasonable.

ccf(diff(sales), diff(lead), main = "CCF of sales and lead")

As seen in Figure 4.4, while the differenced sales and lead series are uncorrelated at most lags, around lag 3 they become highly correlated. This fact is emphasized by a lag plot.

lag2.plot(lead, sales, max.lag = 3)

Figure 4.5 shows a linear relationship between the third lag of lead and contemporary sales. This would justify regressing ∆sales_t on ∆lead_{t−3}.
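Reading the lag off a CCF plot depends on R's sign convention: ccf(x, y) at lag k estimates the correlation between x[t + k] and y[t]. The sketch below builds a simulated pair in which the response follows the leading series by exactly three periods (all names, the seed, and the coefficient are my own) and confirms where the peak appears.

```r
# Illustration on simulated data; not the sales and lead series from the text.
set.seed(11)
n <- 300
z <- rnorm(n + 3)

lead.sim  <- z[4:(n + 3)]                      # lead_t
sales.sim <- 2 * z[1:n] + rnorm(n, sd = 0.5)   # sales_t = 2 * lead_{t-3} + noise

# ccf(x, y) at lag k estimates cor(x[t + k], y[t])
cc <- ccf(sales.sim, lead.sim, lag.max = 10, plot = FALSE)
peak.lag <- cc$lag[which.max(abs(cc$acf))]
print(peak.lag)
```

With the response passed as the first argument, a dependence of sales_t on lead_{t-3} shows up as a spike at lag +3.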

4.1.3 REGRESSION WITH ARMA ERRORS

Given that the variable lead seems to provide useful information about sales, I try to regress sales on lead. More specifically, I try to regress ∆sales_t on ∆lead_{t−3}, while viewing the error term as being some unknown ARMA process.

saleslead <- ts.intersect(diff(sales), lag(diff(lead), k = -3))
salesnew <- saleslead[,1]
leadnew <- saleslead[,2]
fit <- lm(salesnew ~ leadnew)
acf2(resid(fit))

##         ACF  PACF
##  [1,]  0.59  0.59
##  [2,]  0.40  0.09
##  [3,]  0.34  0.11
##  [4,]  0.31  0.10
##  [5,]  0.23 -0.02
##  [6,]  0.15 -0.04
##  [7,]  0.13  0.03
##  [8,]  0.13  0.03
##  [9,]  0.01 -0.15
## [10,]  0.02  0.07
## [11,]  0.09  0.10
## [12,]  0.01 -0.13
## [13,] -0.01  0.03
## [14,] -0.07 -0.09
## [15,] -0.07 -0.04
## [16,] -0.02  0.09
## [17,] -0.05 -0.03
## [18,] -0.03  0.02
## [19,]  0.04  0.11
## [20,]  0.05  0.03
## [21,]  0.02 -0.07
## [22,]  0.00 -0.01
## [23,] -0.01 -0.04

Figure 4.4: CCF of sales and lead (CCF against Lag, −15 to 15)

Figure 4.5: Lag plot of sales and lead (panels: sales(t) against lead(t−0), lead(t−1), lead(t−2), and lead(t−3), with correlations 0.95, 0.95, 0.94, and 0.94)

Figure 4.6: Sample ACF and PACF for residuals from linear fit (panels: ACF and PACF of resid(fit), each against LAG)

Figure 4.6 shows the ACF and the PACF of the residuals of the "naïve" fit. The PACF cuts off at 1 and the ACF trails off, so this appears to be an AR(1) process.

arima.fit <- sarima(salesnew, 1, 0, 0, xreg=cbind(leadnew), details = F)

As shown in Figure 4.7, the diagnostic plots for the model, when the error terms are treated as an ARMA(1,0) process, look very good. Normality of the white noise residuals, the ACF of the white noise residuals, and the tests of dependence all show desirable properties.
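The two-step procedure (fit an ordinary regression, identify the error structure from its residuals, then refit with that structure) can be tried on data where the truth is known. The sketch below is an illustration under my own assumptions: simulated series, arbitrary true coefficients, and base R's arima() with xreg in place of sarima().

```r
# Illustration on simulated data; not the sales and lead series from the text.
set.seed(3)
n <- 150
lead.sim  <- rnorm(n)
err       <- arima.sim(n = n, model = list(ar = 0.6), sd = 0.8)   # AR(1) errors
sales.sim <- 0.4 + 2.8 * lead.sim + as.numeric(err)

# Step 1: ordinary regression, then inspect the residual autocorrelation
ols <- lm(sales.sim ~ lead.sim)
r1  <- acf(resid(ols), lag.max = 1, plot = FALSE)$acf[2]

# Step 2: refit with the regression and the AR(1) error model estimated jointly
fit <- arima(sales.sim, order = c(1, 0, 0), xreg = cbind(lead = lead.sim))

print(round(c(resid.acf1 = r1, coef(fit)), 3))
```

The lag-1 residual autocorrelation from the first step signals the AR(1) error structure, and the joint fit should recover both the regression slope and the AR coefficient to within sampling error.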

library(stargazer)  # provides stargazer() (assumed; the library calls are not shown in the original)

stargazer(arima.fit$fit,
          covariate.labels = c("$\\phi$", "Intercept",
                               "$\\Delta \\text{lead}_{t-3}$"),
          dep.var.labels = c("$\\Delta \\text{sales}_t$"),
          label = "tab:prob35h",
          title = "Coefficients of the model for $\\Delta \\text{sales}_t$",
          table.placement = "ht")

Table 4.1 shows the estimates of the coefficients of the model. The AR(1) term (φ) is statistically significant, and so are the intercept and the coefficient of the ∆lead_{t−3} term.

Figure 4.7: Diagnostic plots for the model for the sales series with ARMA(1,0) error terms (panels: Standardized Residuals against Time, 0 to 150; ACF of Residuals; Normal Q-Q Plot of Std Residuals; p values for Ljung-Box statistic)

Table 4.1: Coefficients of the model for ∆sales_t

                       Dependent variable:
                       ∆sales_t
φ                      0.645*** (0.063)
Intercept              0.362** (0.177)
∆lead_{t−3}            2.788*** (0.143)
Observations           146
Log Likelihood         −168.717
σ²                     0.588
Akaike Inf. Crit.      345.433
Note: *p<0.1; **p<0.05; ***p<0.01

5 MULTIPLICATIVE SEASONAL ARIMA MODELS

5.1 ACF OF AN ARIMA(p, d, q) × (P, D, Q)_s MODEL

The problem asks for a plot of the theoretical ACF of an ARIMA(0,0,1) × (1,0,0)_12 model, with Φ = 0.8 and θ = 0.5; that is, a seasonal AR(1) term at lag 12 combined with a nonseasonal MA(1) term. This model is:

$$x_t = 0.8\,x_{t-12} + w_t + 0.5\,w_{t-1} \tag{5.1}$$

The ACF is computed and plotted below:

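ACF <- ARMAacf(ar = c(rep(0, 11), .8), ma = .5, lag.max = 15)

# ARMAacf() returns lags 0, 1, ..., so plot against the lag names rather than the vector index
plot(as.numeric(names(ACF)), ACF, type = "h", xlab = "lag",
     xlim = c(1, 15), ylim = c(-.5, 1)); abline(h = 0)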

Figure 5.1 shows the theoretical ACF of the process.
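The shape of this ACF can be checked against a closed form. For a model with a seasonal AR(1) term Φ at lag 12 and a nonseasonal MA(1) term θ, the standard result is ρ(12h) = Φ^h and ρ(12h − 1) = ρ(12h + 1) = Φ^h θ / (1 + θ²), with zero elsewhere. The sketch below compares that formula with ARMAacf(); the variable names and the choice of lag.max = 24 are mine.

```r
Phi   <- 0.8   # seasonal AR coefficient at lag 12
theta <- 0.5   # nonseasonal MA coefficient

rho <- ARMAacf(ar = c(rep(0, 11), Phi), ma = theta, lag.max = 24)

# Closed form (standard result for this model, stated here as an assumption):
#   rho(12h) = Phi^h and rho(12h - 1) = rho(12h + 1) = Phi^h * theta / (1 + theta^2)
side <- theta / (1 + theta^2)

print(round(rho[c("1", "11", "12", "13", "24")], 3))
print(c(rho1 = side, rho11 = Phi * side, rho12 = Phi, rho13 = Phi * side, rho24 = Phi^2))
```

The spikes at lags 1, 11, 12, and 13 are therefore expected, and the values at lags between 2 and 10 are zero.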

Figure 5.1: ACF of a seasonal ARIMA process (ACF against lag, 1 to 15)
