Forecasting using R

2. The forecaster’s toolbox

OTexts.com/fpp/2/

Rob J Hyndman

Outline

1 Some simple forecasting methods

2 Forecast residuals

3 Evaluating forecast accuracy

Some simple forecasting methods

Average method

Forecast of all future values is equal to mean of historical data $\{y_1, \dots, y_T\}$.

Forecasts: $\hat{y}_{T+h|T} = \bar{y} = (y_1 + \cdots + y_T)/T$

Naïve method

Forecasts equal to last observed value.

Forecasts: $\hat{y}_{T+h|T} = y_T$.

Consequence of efficient market hypothesis.

Seasonal naïve method

Forecasts equal to last value from same season.

Forecasts: $\hat{y}_{T+h|T} = y_{T+h-km}$ where $m$ = seasonal period and $k = \lfloor (h-1)/m \rfloor + 1$.
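
The three formulas above can be written out in a few lines of base R. This is a minimal sketch, not the forecast package's implementation: the function names mean_forecast, naive_forecast and snaive_forecast and the example data are made up for illustration, and n stands in for $T$ because T is an alias for TRUE in R.

```r
# Base-R sketch of the three simple methods; function names are illustrative.
mean_forecast <- function(y, h) {
  rep(mean(y), h)                        # every horizon gets the historical mean
}

naive_forecast <- function(y, h) {
  rep(y[length(y)], h)                   # every horizon gets the last observation
}

snaive_forecast <- function(y, h, m) {
  n <- length(y)                         # n plays the role of T in the formulas
  k <- floor((seq_len(h) - 1) / m) + 1   # k = floor((h - 1)/m) + 1
  y[n + seq_len(h) - k * m]              # last observed value from the same season
}

# Two years of made-up quarterly data (m = 4)
y <- c(10, 20, 30, 40, 11, 21, 31, 41)
mean_forecast(y, 2)
naive_forecast(y, 2)
snaive_forecast(y, 6, m = 4)
```

With h = 6 and m = 4 the seasonal naïve forecasts cycle through the last observed year once and then start to repeat it, which is what the index $T + h - km$ encodes.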

[Figure: Forecasts for quarterly beer production, 1995 to 2005, from the mean method, naïve method, and seasonal naïve method]

Drift method

Forecasts equal to last value plus average change.

Forecasts:

$$\hat{y}_{T+h|T} = y_T + \frac{h}{T-1}\sum_{t=2}^{T}(y_t - y_{t-1}) = y_T + \frac{h}{T-1}(y_T - y_1).$$

Equivalent to extrapolating a line drawn between first and last observations.
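
A short base-R check of the drift formula may help here. The sketch below assumes a made-up series and an illustrative function name, drift_forecast; it computes the forecasts from the average change and compares them with the line through the first and last observations.

```r
# Base-R sketch of the drift method; drift_forecast is an illustrative name.
drift_forecast <- function(y, h) {
  n <- length(y)                         # n plays the role of T
  avg_change <- mean(diff(y))            # (1/(T-1)) * sum of (y_t - y_{t-1})
  y[n] + seq_len(h) * avg_change         # last value plus h times the average change
}

y <- c(3, 5, 4, 8, 11)                   # made-up data
slope <- (y[length(y)] - y[1]) / (length(y) - 1)   # (y_T - y_1)/(T - 1)

drift_forecast(y, 3)
y[length(y)] + (1:3) * slope             # same values: the sum of differences telescopes
```

The two printed vectors agree because the intermediate terms in the sum of differences cancel, leaving only $y_T - y_1$.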

[Figure: Dow Jones Index (daily ending 15 Jul 94), days 0 to 300, with forecasts from the mean method, naïve method, and drift model]

Mean: meanf(x, h=20)

Naive: naive(x, h=20) or rwf(x, h=20)

Seasonal naive: snaive(x, h=20)

Drift: rwf(x, drift=TRUE, h=20)

Homework 1

Any questions?

Forecasting residuals

Residuals in forecasting: difference between observed value and its forecast based on all previous observations: $e_t = y_t - \hat{y}_{t|t-1}$.

Assumptions

1 $\{e_t\}$ uncorrelated. If they aren’t, then information left in residuals that should be used in computing forecasts.

2 $\{e_t\}$ have mean zero. If they don’t, then forecasts are biased.

Useful properties (for prediction intervals)

3 $\{e_t\}$ have constant variance.

4 $\{e_t\}$ are normally distributed.

Forecasting Dow-Jones index

Naïve forecast:

$\hat{y}_{t|t-1} = y_{t-1}$

$e_t = y_t - y_{t-1}$

Note: $e_t$ are one-step-forecast residuals
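
For the naïve method the residuals are just first differences, so the mean-zero and uncorrelatedness assumptions are easy to inspect in base R. The sketch below uses a simulated random walk as a stand-in (it is not the Dow Jones data), and the choice of 10 lags is an arbitrary assumption for illustration.

```r
set.seed(1)
y <- cumsum(rnorm(300))                  # simulated random walk, not the Dow Jones series

e <- diff(y)                             # e_t = y_t - y_{t-1}, for t = 2, ..., T

mean(e)                                  # far from zero would suggest biased forecasts

r <- acf(e, lag.max = 10, plot = FALSE)  # sample autocorrelations at lags 0 to 10
round(r$acf[-1], 2)                      # drop lag 0, which is always 1

lb <- Box.test(e, lag = 10, type = "Ljung-Box")
lb$p.value                               # a small p-value suggests autocorrelation remains
```

Large autocorrelations or a small Ljung-Box p-value would indicate that information is left in the residuals that the forecasts are not using.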

[Figure: Histogram of residuals; x-axis: change in Dow-Jones index, from −100 to 50]

Normal?
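
One way to look at the normality question beyond a histogram is a normal Q-Q comparison and a Shapiro-Wilk test, both in base R. This is only a sketch: the residuals here are simulated stand-ins, not the Dow Jones residuals, so the numbers say nothing about the series in the figure.

```r
set.seed(2)
e <- rnorm(250)                          # stand-in residuals (simulated)

sw <- shapiro.test(e)                    # Shapiro-Wilk test of normality
sw$p.value                               # a small p-value is evidence against normality

qq <- qqnorm(e, plot.it = FALSE)         # theoretical (x) and sample (y) quantiles
cor(qq$x, qq$y)                          # close to 1 when the Q-Q points lie near a line
```

Normality is listed above as a useful property for prediction intervals, so a failed check here affects interval calculations rather than the point forecasts themselves.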

Measures of forecast accuracy

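Let $y_t$ denote the $t$th observation and $\hat{y}_{t|t-1}$ denote its forecast based on all previous data, where $t = 1, \dots, T$. Then the following measures are useful.

$$\text{MAE} = T^{-1}\sum_{t=1}^{T} |y_t - \hat{y}_{t|t-1}|$$

$$\text{MSE} = T^{-1}\sum_{t=1}^{T} (y_t - \hat{y}_{t|t-1})^2 \qquad \text{RMSE} = \sqrt{T^{-1}\sum_{t=1}^{T} (y_t - \hat{y}_{t|t-1})^2}$$

$$\text{MAPE} = 100\,T^{-1}\sum_{t=1}^{T} |y_t - \hat{y}_{t|t-1}|/|y_t|$$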

MAE, MSE, RMSE are all scale dependent.

MAPE is scale independent but is only sensible if $y_t \gg 0$ for all $t$, and $y$ has a natural zero.
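
The four measures above can be computed directly from a vector of observations and a vector of forecasts. The following base-R sketch uses an illustrative function name, forecast_errors, and made-up numbers; it is not the forecast package's accuracy() function.

```r
# Base-R sketch of MAE, MSE, RMSE and MAPE; forecast_errors is an illustrative name.
forecast_errors <- function(actual, fc) {
  e <- actual - fc                               # forecast errors
  c(MAE  = mean(abs(e)),
    MSE  = mean(e^2),
    RMSE = sqrt(mean(e^2)),
    MAPE = 100 * mean(abs(e) / abs(actual)))     # undefined if any actual value is 0
}

actual <- c(100, 110, 120, 130)                  # made-up observations
fc     <- c( 90, 115, 120, 140)                  # made-up forecasts
forecast_errors(actual, fc)
```

Multiplying both vectors by a constant rescales MAE and RMSE by that constant and MSE by its square, but leaves MAPE unchanged, which is the scale dependence noted above.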

Mean Absolute Scaled Error

$$\text{MASE} = T^{-1}\sum_{t=1}^{T} |y_t - \hat{y}_{t|t-1}|/Q$$

where $Q$ is a stable measure of the scale of the time series $\{y_t\}$.

Proposed by Hyndman and Koehler (IJF, 2006)

For non-seasonal time series,

$$Q = (T-1)^{-1}\sum_{t=2}^{T} |y_t - y_{t-1}|$$

works well. Then MASE is equivalent to MAE relative to a naïve method.

For seasonal time series,

$$Q = (T-m)^{-1}\sum_{t=m+1}^{T} |y_t - y_{t-m}|$$

works well. Then MASE is equivalent to MAE relative to a seasonal naïve method.
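
The "MAE relative to a (seasonal) naïve method" reading can be checked numerically. In this base-R sketch the function name mase, the argument names, and the data are illustrative assumptions; fitted values that are unavailable at the start of the series are passed as NA and skipped, which is a simplification of the $T^{-1}$ sum in the definition.

```r
# Base-R sketch of MASE; m = 1 gives the non-seasonal Q, m > 1 the seasonal Q.
mase <- function(y, fitted, m = 1) {
  n <- length(y)
  q <- mean(abs(y[(m + 1):n] - y[1:(n - m)]))    # Q: MAE of the (seasonal) naive method
  mean(abs(y - fitted), na.rm = TRUE) / q        # MAE of the method, scaled by Q
}

y <- c(10, 20, 30, 40, 12, 23, 31, 44)           # made-up quarterly data
naive_fit  <- c(NA, y[-length(y)])               # y_{t-1}
snaive_fit <- c(rep(NA, 4), y[1:4])              # y_{t-4}

mase(y, naive_fit, m = 1)                        # naive method scaled by itself
mase(y, snaive_fit, m = 4)                       # seasonal naive method scaled by itself
mase(y, rep(mean(y), length(y)), m = 1)          # mean method relative to naive
```

A value below 1 means the method's average absolute error is smaller than that of the naïve benchmark on the same data, and a value above 1 means it is larger.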

Mean method

RMSE     MAE      MAPE    MASE
38.0145  33.7776  8.1700  2.2990

Naïve method

[Figure: forecasts over days 0 to 300 from the mean method, naïve method, and drift model]

Training and test sets

Available data

Training set (e.g., 80%)    Test set (e.g., 20%)

The test set must not be used for any aspect of model development or calculation of forecasts.

Forecast accuracy is based only on the test set.

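library(fpp)  # assumption: ausbeer comes from the fpp package, which also loads forecast

# Training set: 1992 to the end of 2005; test set: 2006 onwards
beer3 <- window(ausbeer, start=1992, end=2005.99)
beer4 <- window(ausbeer, start=2006)

# Forecasts from the mean method and the naïve method
fit1 <- meanf(beer3, h=20)
fit2 <- rwf(beer3, h=20)

# Out-of-sample accuracy (test set)
accuracy(fit1, beer4)
accuracy(fit2, beer4)

# In-sample accuracy (one-step forecasts)
accuracy(fit1)
accuracy(fit2)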

Beware of over-fitting

A model which fits the data well does not necessarily forecast well.

A perfect fit can always be obtained by using a model with enough parameters. (Compare $R^2$)

Over-fitting a model to data is as bad as failing to identify the systematic pattern in the data.

Problems can be overcome by measuring true out-of-sample forecast accuracy. That is, total data divided into “training” set and “test” set. Training set used to estimate parameters. Forecasts are made for test set. Accuracy measures computed for errors in test set only.
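
A small simulation can illustrate the gap between fit and forecast accuracy. The sketch below is an assumption-laden toy, not an example from the slides: the data are simulated as a linear trend plus noise, polynomial regressions stand in for "models with more parameters", and the 80%/20% split mirrors the training and test sets described earlier.

```r
set.seed(42)
x <- 1:20
y <- 2 + 0.5 * x + rnorm(20)                     # simulated data: linear trend plus noise

train <- data.frame(x = x[1:16],  y = y[1:16])   # first 80%: used to estimate parameters
test  <- data.frame(x = x[17:20], y = y[17:20])  # last 20%: held out for evaluation

rmse <- function(actual, pred) sqrt(mean((actual - pred)^2))

compare_fit <- function(degree) {
  fit <- lm(y ~ poly(x, degree), data = train)   # polynomial of the given degree
  c(train = rmse(train$y, fitted(fit)),
    test  = rmse(test$y, predict(fit, newdata = test)))
}

results <- sapply(c(1, 3, 8), compare_fit)
colnames(results) <- paste0("degree_", c(1, 3, 8))
results
```

The training RMSE cannot increase as the degree grows, because each polynomial nests the lower-degree ones. The test RMSE carries no such guarantee and usually deteriorates for the high-degree fit, which is why accuracy is judged on the test set only.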

Poll: true or false?

1 Good forecast methods should have normally distributed residuals.

2 A model with small residuals will give good forecasts.

3 The best measure of forecast accuracy is MAPE.

4 If your model doesn’t forecast well, you should make it more complicated.

5 Always choose the model with the best forecast accuracy as measured on the test set.
