

International Review of Financial Analysis

13 (2004) 63–81

Another look at the forecast performance of

ARFIMA models

Craig Ellis a,*, Patrick Wilson b

a School of Economics and Finance, University of Western Sydney, Locked Bag 1797, Penrith South DC, Sydney NSW 1797, Australia

b School of Finance and Economics, University of Technology, Sydney, Australia

Abstract

This paper investigates the out-of-sample forecast performance of the autoregressive fractionally

integrated moving average [ARFIMA (0,d,0)] specification, both when the underlying value of the

fractional differencing parameter (d) is known a priori and when it is unknown. Forecast

performance is measured relative to simple deterministic models and a random walk model, for

forecast horizons up to 100 periods ahead. Overall, the linear models tend to outperform the

ARFIMA specification for both the positive and negative values of d for the simulated series, and for

positive d values from the real time-series data. The results of the study question the use of the

ARFIMA specification as a forecast tool.

© 2004 Elsevier Inc. All rights reserved.

JEL classification: C22; C52; G10

Keywords: Time series; Simulation; ARFIMA

1. Introduction

Traditional models of financial asset returns are based on a number of simplifying

assumptions. Among these is the primary assumption that consecutive price changes are

independent (i.e., follow a random walk). The development of financial asset pricing

models has also been based on this assumption. A general feature of these pricing models

is that the relationships between the model parameters are fundamentally linear. Under the

assumption that price changes are additionally stationary around some long-term mean,

1057-5219/$ - see front matter © 2004 Elsevier Inc. All rights reserved.

doi:10.1016/j.irfa.2004.01.005

* Corresponding author. Tel.: +61-2-4620-3250; fax: +61-2-4620-3787.

E-mail address: [email protected] (C. Ellis).

C. Ellis, P. Wilson / Int. Rev. Financ. Analy. 13 (2004) 63–81

linear forecast methods should therefore yield the best prediction of the future value of the series in the long term. Recent empirical research has, however, identified the possibility that these relations are deterministic and nonlinear (see, among others, Barkoulas & Baum, 1998; Barkoulas, Labys, & Onochie, 1999; Cheung, 1993; Diebold & Rudebusch, 1989; Fang, Lai, & Lai, 1994; Okunev & Wilson, 1997). These studies also question the accuracy of linear models for modelling and forecasting economic and financial time-series data.

A usual feature of nonlinear behaviour is the long-term dependence between time-series increments. Time series that follow a random walk typically exhibit autocorrelation functions that decay quickly to zero. By contrast, the autocorrelation functions of long-term dependent time series decay only slowly to zero. Rather than being

independent, the increments of long-term dependent time-series exhibit nonnegligible

positive or negative dependence, even over long intervals. For analysts, such autocorre-

lations imply that nonlinear forecast techniques using knowledge of past price changes

should yield statistically accurate long-term forecasts.

The intuition underlying the implied failure of simple linear models to accurately

describe the time-series properties of economic and financial variables is simple. Time

series for which the autocorrelation is negative over long intervals will exhibit mean-

reverting properties. Therefore, linear modelling techniques should be expected to

consistently overestimate or underestimate the price of the asset as the level of negative

dependence increases. For time-series exhibiting positive autocorrelation, successive

price changes will tend to be in the same direction. However, as the autocorrelation

coefficient tends to unity and the time-series itself becomes approximately linear, the

marginal benefit to using nonlinear methods over linear forecast techniques should

decline.

The forecast performance of the autoregressive fractionally integrated moving average

[ARFIMA (0,d,0)] specification, relative to simple autoregressive, AR( p), models, has

been previously examined by Ray (1993), Smith and Yadav (1994), and Barkoulas and

Baum (1997). Using high-order AR models, Ray finds evidence that these will outperform

the ARFIMA specification, even over forecast horizons of up to 20 periods ahead for

fractional noise processes with no explicit autoregressive or moving average (MA)

components. In similar tests, Smith and Yadav show that low-order AR models can also outperform the ARFIMA (0,d,0) specification, yet indicate that the relative performance of higher order AR models declines as the forecast horizon is increased beyond about 20 periods ahead. Compared with an AR(1), Barkoulas and Baum, by

contrast, show a significant improvement in forecast accuracy using an ARFIMA ( p,d,0)

for Eurocurrency deposit rates denominated in several currencies.

Using a variety of simulated AR(1), MA(q), and ARFIMA (1,d,0) processes, Andersson

(1998) tests the forecast performance of general ARFIMA ( p,d,q) models versus ARMA

( p,q) models. While concluding that it is generally worse to ignore fractional long-term

dependence than to impose it a priori, Andersson shows that the forecast power of the

general ARFIMA specification is largely a function of the method used to initially measure

the level of dependence (i.e., spectral regression, rescaled range, maximum likelihood) in

the underlying time series. Long-horizon ARFIMA ( p,d,q) forecasts based on the

maximum likelihood (ML) estimator are shown by Andersson to generally outperform

the ARMA specification.


Monte Carlo simulation techniques have also been used by Reisen and Lopes (1999)

and Bhansali and Kokoszka (2002) to test the relative forecast performance of ARFIMA

models. Applying the theory of forecasting ARIMA ( p,d,q) models to the ARFIMA

( p,d,q) specification, Reisen and Lopes use infinite AR and MA representations to forecast

ARFIMA models with different values for p, d>0, and q. Compared with ARMA ( p,q)

forecasts, the mean forecast error for the ARFIMA replication is lower for forecasts up to

five steps ahead, yet, the variance of errors and mean-squared error are not statistically

different when compared with the ARMA forecasts. Employing a similar methodology for

forecasts up to 15 steps ahead, Bhansali and Kokoszka conclude that the additional

estimation of the fractional differencing parameter, d, does not necessarily improve

forecast accuracy.

This paper investigates the forecast performance of various linear models: the last

observed value (Yt), the mean of all observed past values, an AR(1), and a random walk

model (et) against an ARFIMA (0,d,0) forecast model for up to 100 steps ahead. The

methodology employed extends the research of Ray (1993) and Smith and Yadav (1994)

to considering the relative forecast performance of pure fractional ARFIMA models versus

models other than the AR( p) specification over longer forecast horizons. Emphasis on the

pure fractional specification (i.e., p = q = 0 and d ≠ 0) follows from Davidson (2002, p. 193), who argues that pure fractional ARFIMA models quite often provide an “adequate representation [of real time-series], in terms of both maximizing the Schwarz criterion and showing no significant residual autocorrelation.”

Relative forecast performance is first measured using simulated long-term dependent

series with known values of the fractional differencing parameter. The study is then

extended to consider forecast performance using S&P 500 and Dow Jones Industrial

Average (DJIA) daily returns, for which values of the fractional differencing parameter are

unknown and must first be measured. The major finding of the paper is that the ARFIMA

specification generally underperforms both the simple average and the AR(1), yet,

outperforms forecasts based on the last observation or last observation plus a stochastic

error (i.e., random walk).

The remainder of the paper is organised as follows: The specification and properties of

the ARFIMA model are discussed in Section 2. Section 3 describes the Monte Carlo

simulation methodology used in this study and summarises the alternative measures of the

fractional differencing parameter used in this study when d is unknown. A detailed

analysis of the relative forecast performance of the various forecast models employed is

provided in Section 4, including a separate analysis of findings pertaining to the simulated

and real data sets. Finally, Section 5 concludes the paper.

2. Specification of the ARFIMA ( p,d,q) model

Two general forms of long-term dependent process that have been recently examined in

the financial literature are fractional Brownian motions (fBm) and ARFIMA processes.

Proposed by Granger and Joyeux (1980), the ARFIMA ( p,d,q) specification represents an

alternative model of long-term dependence to that of fBm. A particularly attractive feature

of the ARFIMA specification is that it allows short- and long-term dependencies to be


modelled separately from each other. This gives the model a distinct advantage over fBm,

which only considers long-term dependence.

Consider a function which conforms to an ARMA (p,q) process. Denoting Φ(L) and Ψ(L) to be polynomials of the order p and q, respectively, such that

$$\Phi(L) = 1 - \phi_1 L - \phi_2 L^2 - \ldots - \phi_p L^p$$

$$\Psi(L) = 1 - \psi_1 L - \psi_2 L^2 - \ldots - \psi_q L^q \tag{1}$$

and, introducing a mean, μ, the general form of the ARMA specification is

$$\Phi(L)(Y_t - \mu) = \Psi(L)\varepsilon_t, \qquad \varepsilon_t \sim (0, \sigma_\varepsilon^2), \qquad \mu = f(X, \beta) \tag{2}$$

The variable L in Eqs. (1) and (2) represents the backward shift operator. Under the assumption that the mean, μ, may be determined by some set of exogenous regressors, X represents a (T × k) matrix and β a (k × 1) vector.

When the function exhibits a nonstationarity of the type associated with a positive and real unit root, it follows that the function should become stationary when differenced. Introducing the backward difference operator ∇, Eq. (2) may be rewritten as

$$\Phi(L)\nabla^d (Y_t - \mu) = \Psi(L)\varepsilon_t \tag{3}$$

The parameter d defines the level of differencing required to induce stationarity. For the

value d = 1, the function is unit root nonstationary. Alternatively, for d = 0, the function is

itself stationary, and Eq. (3) reduces to Eq. (2).

When the differencing parameter is a real value, the correct model is the ARFIMA

( p,d,q) specification. Replacing the backward difference operator in Eq. (3) with an

infinite order autoregressive process, the ARFIMA specification may be written as

$$\Phi(L)(1 - L)^d (Y_t - \mu) = \Psi(L)\varepsilon_t \tag{4}$$

Stationarity and invertibility conditions require that the value of the fractional differencing parameter, d, is such that |d| < 0.5. For the values 0 < d < 0.5, the process is stationary and long-term dependent, with a correlation structure similar to an fBm with 0.5 < H < 1. Given −0.5 < d < 0, the resulting process exhibits negative long-term dependence similar to a strong mean-reverting process. However, for p = q = 0 and −0.5 < d < 0.5, the process is a type of fractionally differenced white noise whose general characteristics are similar to fBm for all 0 < H < 1, satisfying H = d + 0.5 (Hosking, 1984).
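To make the contrast between positive and negative d concrete, the sketch below tabulates the autocorrelations of fractionally differenced white noise together with the implied Hurst exponent H = d + 0.5. It relies on the recursion ρ_k = ρ_{k−1}(k − 1 + d)/(k − d) with ρ_0 = 1, a standard result for this process that is not stated in the paper itself; the function names are hypothetical.

```python
import numpy as np


def arfima0d0_acf(d, max_lag):
    """Autocorrelations rho_0..rho_max_lag of ARFIMA(0,d,0) noise, |d| < 0.5.

    Uses the recursion rho_k = rho_{k-1} * (k - 1 + d) / (k - d), rho_0 = 1
    (standard result, assumed here rather than taken from the paper).
    """
    rho = np.ones(max_lag + 1)
    for k in range(1, max_lag + 1):
        rho[k] = rho[k - 1] * (k - 1 + d) / (k - d)
    return rho


def hurst_from_d(d):
    """Hurst exponent implied by the relation H = d + 0.5."""
    return d + 0.5


if __name__ == "__main__":
    # The d values match those used for the simulations in the paper.
    for d in (-0.49, -0.1, 0.1, 0.49):
        rho = arfima0d0_acf(d, 100)
        print(f"d={d:+.2f}  H={hurst_from_d(d):.2f}  "
              f"rho_1={rho[1]:+.4f}  rho_100={rho[100]:+.4f}")
```

For d > 0 every autocorrelation is positive and decays slowly, while for d < 0 every autocorrelation beyond lag zero is negative, which is the mean-reverting case described above.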

3. Research methodology

The objective of this study is to investigate the out-of-sample forecast performance of

the ARFIMA (0,d,0) specification given different underlying levels of the fractional

differencing parameter, d. Given the conditions of stationarity and invertibility, the initial


values of d chosen in this study are d = −0.49, 0, and 0.49 for the Monte Carlo simulations. On the basis of estimated d values reported by various other authors for real time-series (see above), simulated series are then also generated for the underlying values d = −0.1 and 0.1. For each underlying value of d, 4000 individual series are then simulated. Different methods for simulating ARFIMA processes, based on the calculation of the autocovariance function, include the Cholesky decomposition algorithm (Geweke &

Porter-Hudak, 1983), inverse Fourier transformation, and recursive methods (Chung,

1994). An alternative simulation method is to use a type of truncated autoregression. In

this study, simulated ARFIMA processes are constructed by the truncated autoregression

method of Hosking (1984), where the infinite order autoregression in Eq. (4) is replicated

by the truncated process

$$(1 - L)^d (Y_t - \mu) = \sum_{k=0}^{j} (-1)^k \binom{d}{k} L^k (Y_t - \mu) \tag{5}$$

The parameter j in Eq. (5) represents the truncation lag of the differencing operator.

Compared with simulations based on the computing of the covariance matrix, the

truncated autoregression is both simpler and faster when the sample size is large (Martin

& Wilkins, 1999). The specifics of the truncation process, including the guidelines for the

selection of the size of the truncation lag, j, are given in detail by Hosking (1984). Martin

and Wilkins (1999) also describe an indirect simulation method for the application of the Cholesky decomposition and truncated autoregression.
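The truncated autoregression in Eq. (5) can be sketched as follows. This is a minimal illustration and not the authors' code: the truncation lag, the burn-in length, the zero pre-sample values, and the function names are all assumptions, and Hosking's (1984) guidelines for choosing the truncation lag are not reproduced. The weights π_k = (−1)^k C(d, k) are built with the recursion π_k = π_{k−1}(k − 1 − d)/k.

```python
import numpy as np


def frac_diff_weights(d, trunc_lag):
    """Coefficients pi_k = (-1)^k * binom(d, k), k = 0..trunc_lag, of (1 - L)^d."""
    pi = np.ones(trunc_lag + 1)
    for k in range(1, trunc_lag + 1):
        pi[k] = pi[k - 1] * (k - 1 - d) / k
    return pi


def simulate_arfima0d0(d, n, trunc_lag=1000, burn_in=1000,
                       mu=0.0, sigma=1.0, rng=None):
    """Simulate n observations by solving the truncated form of Eq. (5) for Y_t.

    sum_{k=0}^{j} pi_k (Y_{t-k} - mu) = eps_t, with pre-sample deviations set
    to zero and the first `burn_in` draws discarded (both are assumptions).
    """
    rng = np.random.default_rng() if rng is None else rng
    pi = frac_diff_weights(d, trunc_lag)
    total = n + burn_in
    eps = rng.normal(0.0, sigma, size=total)
    y = np.zeros(total)  # deviations from the mean
    for t in range(total):
        m = min(t, trunc_lag)
        # Pair pi_1..pi_m with y[t-1], y[t-2], ..., y[t-m].
        y[t] = eps[t] - np.dot(pi[1:m + 1], y[t - m:t][::-1])
    return mu + y[burn_in:]


if __name__ == "__main__":
    series = simulate_arfima0d0(0.1, 2100, rng=np.random.default_rng(42))
    print(series.shape, series[:5])
```

With d = 0 all weights beyond π_0 vanish and the simulated series reduces to the white-noise draws themselves, which gives a quick sanity check on the recursion.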

In addition to the Monte Carlo simulations, relative forecast performance is also tested

for two series of actual financial returns, the S&P 500 and the DJIA. The full-sample

period for both series is 1/01/1969 to 30/04/2002, and comprises 8293 observations for

the S&P 500, and 8273 for the DJIA. In contrast to the Monte Carlo simulations, where the

underlying value of d is known a priori, d values for the S&P 500 and DJIA must first be

measured. Due to the variability in the estimates of the fractional differencing parameter

attributable to different test methodologies (Andersson, 1998), two distinct methods for the

estimation of d from the S&P 500 and DJIA data are employed in this study: the modified

profile likelihood (MPL) and the classical rescaled adjusted range. These tests are

described separately in Section 3.1.

To test the robustness of our findings to the time period employed, the S&P 500 and

DJIA full-sample sets are further divided into four contiguous subsamples: 1/01/1969 to

31/12/1976, 3/01/1977 to 31/12/1984, 2/01/1985 to 31/12/1992, and 4/01/1993 to 30/04/

2002. The last 100 observations of each sample series comprise the out-of-sample forecast period, during which the relative forecast performance of each model is tested, while the preceding observations are used to measure the value of d. Summary statistics

for each series, including estimated values for the fractional differencing parameter, are

provided in Table 3.

3.1. Estimating the fractional differencing parameter when d is unknown

Various techniques for the estimation of the fractional differencing parameter for

time-series data include a general class of ML techniques, the Geweke and Porter-Hudak (1983) method (GPH), and the classical rescaled adjusted range (Hurst, 1951; Mandelbrot & Van Ness, 1968). While the ML method may be considered

parametric in nature, the GPH technique is only semiparametric, and the rescaled

range nonparametric.

Under the assumption of normality of the function Yt, such that $Y_t \sim N(\mu, \Sigma)$, the log-likelihood of the ARFIMA specification in Eq. (4) may be given as

$$\log L(d, \phi, \psi, \beta, \sigma_\varepsilon^2) = -\frac{T}{2}\log(2\pi) - \frac{1}{2}\log|\Sigma| - \frac{1}{2} z'\Sigma^{-1}z \tag{6}$$

where z = (Y − μ), Σ is a T × T covariance matrix, and μ, as given prior, is μ = f(X, β) (Hauser, 1999).
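As an illustration of Eq. (6) in the pure fractional case (p = q = 0), the sketch below evaluates the Gaussian log-likelihood by brute force. It assumes a known mean and innovation variance, and it builds Σ from the standard autocovariances of fractional noise, γ_0 = σ²Γ(1 − 2d)/Γ(1 − d)² and γ_k = γ_{k−1}(k − 1 + d)/(k − d), which are not given in the paper. The function names are hypothetical, and the dense O(T³) linear algebra is only practical for short series; it is not how dedicated ML packages proceed.

```python
import numpy as np
from math import exp, lgamma, log, pi


def arfima0d0_autocov(d, n, sigma2=1.0):
    """Autocovariances gamma_0..gamma_{n-1} of ARFIMA(0,d,0) noise, |d| < 0.5."""
    gamma = np.empty(n)
    gamma[0] = sigma2 * exp(lgamma(1 - 2 * d) - 2 * lgamma(1 - d))
    for k in range(1, n):
        gamma[k] = gamma[k - 1] * (k - 1 + d) / (k - d)
    return gamma


def gaussian_loglik(y, d, mu=0.0, sigma2=1.0):
    """Evaluate Eq. (6) for p = q = 0 with mu and sigma2 treated as known."""
    z = np.asarray(y, dtype=float) - mu
    T = z.size
    gamma = arfima0d0_autocov(d, T, sigma2)
    lags = np.abs(np.subtract.outer(np.arange(T), np.arange(T)))
    Sigma = gamma[lags]  # Toeplitz covariance matrix
    _, logdet = np.linalg.slogdet(Sigma)
    quad = z @ np.linalg.solve(Sigma, z)
    return -0.5 * T * log(2 * pi) - 0.5 * logdet - 0.5 * quad


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    y = rng.standard_normal(200)
    # Profile the likelihood over a coarse grid of d for a white-noise sample.
    for d in (-0.3, -0.1, 0.0, 0.1, 0.3):
        print(f"d={d:+.1f}  logL={gaussian_loglik(y, d):.3f}")
```

At d = 0 the covariance matrix is the identity (for σ² = 1) and the expression collapses to the usual i.i.d. normal log-likelihood.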

Derived by An and Bloomfield (1993) from the exact maximum likelihood (EML), the MPL ($\ell_M$), given stationary ARFIMA errors and f(X, β) = Xβ, is

$$\ell_M(d, \phi, \psi) = -\frac{T}{2}(1 + \log 2\pi) - \left(\frac{1}{2} - \frac{1}{T}\right)\log|R| - \frac{T - k - 2}{2}\log\left[T^{-1} z'R^{-1}z\right] - \frac{1}{2}\log|X'R^{-1}X| \tag{7}$$

Based on the suggestion by Cox and Reid (1987) that the presence of nuisance

parameters can bias the estimation of d when l is estimated from some set of exogenous

regressors, the MPL procedure has been proven by Hauser (1999) to be superior to the

EML for a variety of fractionally integrated series with finite sample lengths.

The classical rescaled adjusted range $(R/\sigma)_n$, as described by Mandelbrot and Van Ness (1968), is calculated as

$$(R/\sigma)_n = \frac{1}{\sigma_n}\left[\max_{1 \le k \le n}\sum_{j=1}^{k}(X_j - M_n) - \min_{1 \le k \le n}\sum_{j=1}^{k}(X_j - M_n)\right] \tag{8}$$

where $M_n$ is the sample mean, $(1/n)\sum_j X_j$, and $\sigma_n$ is the series standard deviation, given as

$$\sigma_n = \left[\frac{1}{n}\sum_{j=1}^{n}(X_j - M_n)^2\right]^{0.5} \tag{9}$$

For a given series of length N, the Mandelbrot and Van Ness procedure first requires that Eqs. (8) and (9) be estimated over several subseries of length n ≤ N. Using ordinary least squares regression, a global Hurst exponent, H, is then estimated. Following from Hosking (1984), the corresponding values of the fractional differencing parameter, d, are estimated from H using

$$H = d + 0.5, \qquad d \in (-0.5, 0.5) \tag{10}$$

One advantage of the classical rescaled range procedure over ML techniques in estimating d from real time-series data is the impact of unobservable values of Yt (for t = 0, −1, −2, ...). Whereas ML techniques require the unknown values of Yt in Eq. (4) to be replaced by some proxy, the classical rescaled adjusted range methodology does not require knowledge of the unobserved values.1 The estimation of d by the Hurst exponent has been considered theoretically by Geweke and Porter-Hudak (1983), Hosking, and, recently, Ellis (1999).

Estimates of d in this study, using the MPL procedure, were calculated using the

ARFIMA package 1.01 for Ox 3.00 (Doornik & Ooms, 2001). Estimates using the

classical rescaled range procedure were calculated using the authors’ own code.
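The rescaled range calculation in Eqs. (8) to (10) might be sketched as below. This is not the authors' code, which is not reproduced in the paper. The choice of non-overlapping blocks, block lengths that double from a hypothetical minimum of 16, and averaging (R/σ)_n within each length before the log-log regression are all assumptions; the paper states only that several subseries of length n ≤ N are used.

```python
import numpy as np


def rescaled_range(x):
    """(R/sigma)_n of Eq. (8), with sigma_n from Eq. (9), for one subseries."""
    x = np.asarray(x, dtype=float)
    dev = np.cumsum(x - x.mean())                  # partial sums, k = 1..n
    sd = np.sqrt(np.mean((x - x.mean()) ** 2))     # 1/n standard deviation
    return (dev.max() - dev.min()) / sd


def hurst_rs(x, min_len=16):
    """OLS slope of log (R/sigma)_n on log n over doubling block lengths."""
    x = np.asarray(x, dtype=float)
    N = x.size
    log_n, log_rs = [], []
    n = min_len
    while n <= N // 2:
        blocks = x[: (N // n) * n].reshape(-1, n)  # non-overlapping blocks
        rs = np.mean([rescaled_range(b) for b in blocks])
        log_n.append(np.log(n))
        log_rs.append(np.log(rs))
        n *= 2
    slope, _intercept = np.polyfit(log_n, log_rs, 1)
    return slope


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    returns = rng.standard_normal(4096)            # stand-in for a return series
    H = hurst_rs(returns)
    print(f"H = {H:.3f}, d = H - 0.5 = {H - 0.5:.3f}")
```

The classical statistic is known to be biased in short subseries, so the estimate for independent data will generally sit somewhat above 0.5; the minimum block length controls how much of that bias enters the regression.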

3.2. Measuring the relative forecast performance of ARFIMA models

The relative forecast performance of the ARFIMA (0,d,0) model is compared to three

simple deterministic models: the last observed value (Yt), the mean of all observed past

values, an AR(1), and a random walk model (et), over time horizons t + k periods ahead for

values of k = 1,2,. . .,100. For every value of the fractional differencing parameter, 1000

simulations are performed for each of the four models. The length of each simulated series

is 2100 observations. From these, the first 2000 observations are used to estimate the

model parameters.2 The out-of-sample performance of each model is then tested over the

last 100 observations. Summary results are then reported for forecast horizons of

k = 1,30,60, and 90 periods ahead. Relative forecast performance is measured in terms

of the difference between both the absolute errors and the squared errors of the four

alternative models versus the ARFIMA model. That each model is tested independently is

significant, as this serves to eliminate the correlation between forecast errors from the

different models, which may bias the comparisons of the overall forecast performance. The

process by which comparative forecast performance is measured can be described as

follows.

Denote the forecast error attributable to the alternative model as $e^1_{t+k}$ and the forecast error for the null forecast model (ARFIMA) as $e^2_{t+k}$. Given a specified function, g(e), of the forecast error, e, for a given pair of independent forecast errors, $e^1_{t+k}$ and $e^2_{t+k}$, the null hypothesis of the equality of expected forecast accuracy for levels in price is

$$E\left[\,|P^1_{t+k} - P_{t+k}| - |P^2_{t+k} - P_{t+k}|\,\right] = 0 \tag{11}$$

$$E\left[g(e^1_{t+k}) - g(e^2_{t+k})\right] = 0$$

From Eq. (11), the null hypothesis is that there is expected to be no significant difference between the out-of-sample performances of the null and alternate models (Harvey, Leybourne, & Newbold, 1997). The mean difference in relative forecast performance is denoted by

$$h_n\{t+k\} = g_n(e^1_{t+k}) - g_n(e^2_{t+k})$$

$$h = N^{-1}\sum_{n=1}^{N} h_n\{t+k\} \tag{12}$$

where the statistical significance of the observed value of h can be estimated using the familiar two-tailed t test for differences between two means. Following from the null hypothesis in Eq. (11), the expected value of h is h = 0 when the null and alternative models are equally accurate. As per Andersson (1998), the choice of the significance test is motivated by the large number of simulations and by the Central Limit Theorem.

1 For a more in-depth discussion of the implications of replacement of Yt by various proxies, see Hosking (1984).

2 In terms of the random walk model, the estimation of model parameters is replicated by simulating a set of 2000 independent and identically distributed variables. Consecutive k-step ahead forecasts are then randomly generated from this set. Unlike each of the other models, the random walk requires no a priori information about the past increments of the simulated series.

To test for the sensitivity of our findings to the underlying form of the forecast error specification function, g(e), two different error metrics are employed: the relative difference between the absolute errors, $|e^1_{t+k}| - |e^2_{t+k}|$, and the relative difference between the squared errors, $(e^1_{t+k})^2 - (e^2_{t+k})^2$, of the null and alternative models. A positive value of h indicates that the ARFIMA null model outperforms the alternative model on the basis of the chosen error metric (i.e., relative absolute or relative squared errors). Conversely, a negative h value will show that the null model has lower forecast power than the alternative model. The statistical significance of h may be determined using standard parametric tests. However, an important point to note is that if the variances of the distributions of the null and alternative models are significantly different, the standard error of the mean in Eq. (12) may be biased by the use of squared, as opposed to absolute, errors. The problem may arise because squaring will exaggerate any outlying errors and distort the size of the difference $(e^1_{t+k})^2 - (e^2_{t+k})^2$ if either of $e^1_{t+k}$ or $e^2_{t+k}$ is very large (small) relative to the mean. The choice of both absolute and squared errors is therefore justified as providing a test for the robustness of our findings to the underlying error metric.
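The comparison in Eqs. (11) and (12) can be sketched for a single forecast horizon as follows. The function name and the simulated inputs are hypothetical. The standard error here is computed from the replication-by-replication differences, which is one reasonable reading of the test described above rather than a confirmed reproduction of it, and the 1.96 cut-off is the large-sample two-tailed 5% value.

```python
import numpy as np


def relative_performance(err_alt, err_null, metric="absolute"):
    """Mean loss difference h, its standard error, and the t ratio.

    err_alt  : forecast errors of the alternative model, one per replication
    err_null : forecast errors of the null (ARFIMA) model, one per replication
    metric   : "absolute" for |e1| - |e2|, "squared" for e1^2 - e2^2
    """
    e1 = np.asarray(err_alt, dtype=float)
    e2 = np.asarray(err_null, dtype=float)
    g = np.abs if metric == "absolute" else np.square
    h_n = g(e1) - g(e2)
    h = h_n.mean()
    se = h_n.std(ddof=1) / np.sqrt(h_n.size)
    return h, se, h / se


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Made-up errors for N = 1000 replications; not data from the paper.
    e_alt = rng.normal(0.0, 1.0, 1000)
    e_null = rng.normal(0.0, 1.2, 1000)
    for metric in ("absolute", "squared"):
        h, se, t = relative_performance(e_alt, e_null, metric)
        flag = "*" if abs(t) > 1.96 else ""
        print(f"{metric:8s}  h={h:+.4f}{flag}  (se={se:.4f})  t={t:+.2f}")
```

Under the sign convention above, a negative h means the alternative model has the smaller loss, so the ARFIMA null model is the weaker forecaster for that metric.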

4. Comparative forecast performance

The relative forecast performance of the ARFIMA (0,d,0) specification versus the four

alternate models is examined using Monte Carlo simulation techniques and actual daily

returns for the S&P 500 and DJIA stock indices. The relative performance is measured

using the technique introduced by Harvey et al. (1997) as described in Eq. (12) for the

Monte Carlo simulated returns and in Eq. (11) for the S&P 500 and DJIA daily returns.

Results pertaining to the simulated data are described in two tables. The first table, Table 1,

provides a summary analysis of forecast performance using relative absolute errors. Table

2 provides the same analysis, but on the basis of relative squared errors. Values of h in both

tables are the mean of 1000 relative forecast errors each. The values in parentheses

represent the standard error of the mean. Results pertaining to forecasts of S&P 500 and DJIA daily returns are likewise considered using the relative absolute and relative squared errors. Results from each set of tables are discussed in turn.

Table 1

Summary analysis of relative absolute errors, $|e^1_{t+k}| - |e^2_{t+k}|$

| Horizon | d | Yt | Mean | AR(1) | et |
|---|---|---|---|---|---|
| t + 1 | −0.49 | 0.0326 (0.0563) | −0.1012* (0.0510) | −0.1010 (0.0532) | 0.0679 (0.0548) |
| t + 1 | −0.1 | −0.0493 (0.0367) | −0.3428* (0.0312) | −0.3402* (0.0313) | −0.0224 (0.0361) |
| t + 1 | 0.0 | −0.0091 (0.0368) | −0.3091* (0.0314) | −0.3087* (0.0314) | −0.0629 (0.0375) |
| t + 1 | 0.1 | −1.0474* (0.0599) | −1.4091* (0.0567) | −1.4136* (0.0568) | −1.1277* (0.0598) |
| t + 1 | 0.49 | 0.1428* (0.0407) | −0.2822* (0.0332) | −0.2473* (0.0331) | −0.0372 (0.0390) |
| t + 30 | −0.49 | 0.1287* (0.0562) | −0.2016* (0.0512) | −0.2017* (0.0512) | −0.0323 (0.0547) |
| t + 30 | −0.1 | 0.0143 (0.0380) | −0.2701* (0.0309) | −0.2701* (0.0309) | 0.0200 (0.0365) |
| t + 30 | 0.0 | 0.0533 (0.0356) | −0.2587* (0.0288) | −0.2586* (0.0288) | 0.0406 (0.0355) |
| t + 30 | 0.1 | −0.8117* (0.0501) | −1.1414* (0.0465) | −1.1414* (0.0465) | −0.8721* (0.0507) |
| t + 30 | 0.49 | −0.0172 (0.0406) | −0.3473* (0.0344) | −0.3473* (0.0344) | −0.0625 (0.0402) |
| t + 60 | −0.49 | 0.2691* (0.0578) | −0.1457* (0.0531) | −0.1455* (0.0531) | −0.0363 (0.0524) |
| t + 60 | −0.1 | 0.0123 (0.0364) | −0.2733* (0.0301) | −0.2733* (0.0301) | 0.0153 (0.0378) |
| t + 60 | 0.0 | 0.0645 (0.0371) | −0.2388* (0.0313) | −0.2389* (0.0313) | 0.0011 (0.0362) |
| t + 60 | 0.1 | −0.7673* (0.0522) | −1.0529* (0.0478) | −1.052* (0.0478) | −0.7407* (0.0520) |
| t + 60 | 0.49 | 0.0004 (0.0436) | −0.3644* (0.0360) | −0.3644* (0.0360) | −0.0541 (0.0403) |
| t + 90 | −0.49 | 0.2236* (0.0577) | −0.1932* (0.0529) | −0.1927* (0.0529) | −0.0306 (0.0542) |
| t + 90 | −0.1 | −0.0151 (0.0378) | −0.2513* (0.0304) | −0.2513* (0.0304) | −0.0405 (0.0354) |
| t + 90 | 0.0 | −0.0497 (0.0368) | −0.3038* (0.0306) | −0.3039* (0.0306) | −0.0067 (0.0378) |
| t + 90 | 0.1 | −0.7796* (0.0539) | −1.1116* (0.0491) | −1.1116* (0.0491) | −0.7472* (0.0536) |
| t + 90 | 0.49 | 0.0293 (0.0430) | −0.3585* (0.0351) | −0.3585* (0.0351) | −0.0695 (0.0401) |

* Indicates significantly different to zero at the .05 level.

Table 2

Summary analysis of relative squared errors, $(e^1_{t+k})^2 - (e^2_{t+k})^2$

| Horizon | d | Yt | Mean | AR(1) | et |
|---|---|---|---|---|---|
| t + 1 | −0.49 | 0.5278 (0.2889) | −0.5175* (0.2288) | −0.2297 (0.2591) | 0.3645 (0.2678) |
| t + 1 | −0.1 | −0.1558 (0.1042) | −1.0906* (0.0811) | −1.0868* (0.0812) | −0.0601 (0.1029) |
| t + 1 | 0.0 | −0.1007 (0.1070) | −1.1033* (0.0841) | −1.1022* (0.0841) | −0.2601 (0.1071) |
| t + 1 | 0.1 | −5.8439* (0.3574) | −7.0640* (0.3511) | −7.0701* (0.3513) | −6.1313* (0.3568) |
| t + 1 | 0.49 | 0.6209* (0.1409) | −0.9278* (0.0963) | −0.8650* (0.0969) | −0.0581 (0.1216) |
| t + 30 | −0.49 | 0.7988* (0.2798) | −0.9069* (0.2321) | −0.9076* (0.2320) | −0.1483 (0.2508) |
| t + 30 | −0.1 | 0.0789 (0.1115) | −0.9644* (0.0820) | −0.9644* (0.0820) | 0.0492 (0.1050) |
| t + 30 | 0.0 | 0.1677 (0.1001) | −0.8906* (0.0729) | −0.8905* (0.0729) | 0.1480 (0.1015) |
| t + 30 | 0.1 | −3.8781* (0.2519) | −4.9913* (0.2449) | −4.9913* (0.2449) | −4.0247* (0.2536) |
| t + 30 | 0.49 | −0.0468 (0.1341) | −1.2679* (0.1046) | −1.2679* (0.1046) | −0.2768 (0.1311) |
| t + 60 | −0.49 | 1.4471* (0.2945) | −0.7129* (0.2285) | −0.7120* (0.2285) | −0.1471 (0.2369) |
| t + 60 | −0.1 | 0.0847 (0.1069) | −0.9092* (0.0768) | −0.9091* (0.0768) | 0.0917 (0.1078) |
| t + 60 | 0.0 | 0.1533 (0.1097) | −0.9082* (0.0822) | −0.9083* (0.0822) | −0.0218 (0.0368) |
| t + 60 | 0.1 | −3.7394* (0.2509) | −4.7461* (0.2409) | −4.7461* (0.2409) | −3.6496* (0.2520) |
| t + 60 | 0.49 | 0.0709 (0.1533) | −1.3377* (0.1152) | −1.3378* (0.1152) | −0.2414 (0.1381) |
| t + 90 | −0.49 | 1.1940* (0.2986) | −1.0637* (0.2350) | −1.0618* (0.2350) | −0.2340 (0.2476) |
| t + 90 | −0.1 | −0.0054 (0.1101) | −0.9306 (0.0809) | −0.9306 (0.0809) | −0.1357 (0.1018) |
| t + 90 | 0.0 | −0.1684 (0.1079) | −1.0309* (0.0822) | −1.0309* (0.0822) | −0.0291 (0.1100) |
| t + 90 | 0.1 | −3.9384* (0.2756) | −5.0159* (0.2628) | −5.0159* (0.2628) | −3.7923* (0.2730) |
| t + 90 | 0.49 | 0.1397 (0.1483) | −1.2991* (0.1070) | −1.2990* (0.1070) | −0.3175 (0.1297) |

* Indicates significantly different to zero at the .05 level.

4.1. Relative forecast performance using Monte Carlo simulated returns

The relative forecast performance of the ARFIMA (0,d,0) specification is tested versus

four models: the last observed value, the mean of all values in the series, an AR(1), and a

random walk. Mean values of h in Table 1, based on both of the last observed value (Yt)

and the random walk model (et), are generally insignificant for all underlying values of the

fractional differencing parameter and across all forecast horizons. Significance is tested at

the .05 level using a t test for the difference between two means. As stated previously, the

null hypothesis is that the mean value of h is 0. The failure to reject the null hypothesis

implies that there is no statistical benefit in estimating forecasts using the ARFIMA (0,d,0)

specification over forecasts using either a simple random walk, with standard normally

distributed errors, or the last observed value.
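For reference, the three deterministic benchmarks can be sketched as below. The function name is hypothetical, and the AR(1) coefficient is estimated here by a simple least-squares ratio on the demeaned series, which is an assumption since the paper does not state its estimator. The random walk benchmark is left out because its construction, described in footnote 2, is not specified in enough detail to reproduce.

```python
import numpy as np


def benchmark_forecasts(y, k):
    """k-step-ahead forecasts from the last value, the mean, and an AR(1)."""
    y = np.asarray(y, dtype=float)
    mean = y.mean()
    z = y - mean
    # Least-squares slope of z_t on z_{t-1} (assumed estimator).
    phi = np.dot(z[1:], z[:-1]) / np.dot(z[:-1], z[:-1])
    return {
        "last": y[-1],
        "mean": mean,
        "ar1": mean + phi ** k * (y[-1] - mean),
    }


if __name__ == "__main__":
    rng = np.random.default_rng(3)
    y = rng.standard_normal(2000)          # stand-in estimation sample
    for k in (1, 30, 60, 90):
        f = benchmark_forecasts(y, k)
        print(k, {name: round(float(v), 4) for name, v in f.items()})
```

Because the AR(1) forecast is the mean plus φ^k times the last deviation, it converges to the mean forecast as k grows whenever |φ| < 1. That is one plausible reason why the Mean and AR(1) columns in Tables 1 and 2 are nearly identical at the longer horizons.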

The relative performance of the ARFIMA specification against both the mean model

and the AR(1) contrasts strongly with that just described for the random walk (et) and the last observed value (Yt). The significant negative values of h show that the ARFIMA model underperforms both of these alternative models when tested on the basis of relative absolute errors. The relative absolute errors are significant for all forecast horizons and all underlying values of d, except for the AR(1) model, at t + 1 periods ahead for d = −0.49.

The short forecast horizon results for the AR(1) model are consistent with those by Smith

and Yadav (1994) employing autoregressive lags up to p = 5, and by Ray (1993) for

autoregressive lags up to p = 20. The findings are significant in that they provide an

intuitive explanation for the apparent decline in power of the AR( p) forecast models over

long forecast horizons, which was noted by Smith and Yadav. Rather than diminish the

potential of the AR( p) model, the result highlights the necessity for the order of the

autoregressive lag ( p) to be at least as high as the length of the longest forecast horizon, if

the power of the model is to be maximized for all k-steps ahead.
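As a rough check on how the significance markers in Table 1 arise, the ratio of each reported mean to its standard error can be compared with the two-tailed 5% critical value of about 1.96. That value is the large-sample normal approximation and is an assumption here, since the exact critical value used is not stated. For d = −0.49 at t + 1:

$$t_{\text{Mean}} = \frac{-0.1012}{0.0510} \approx -1.98, \qquad t_{\text{AR(1)}} = \frac{-0.1010}{0.0532} \approx -1.90$$

The first ratio exceeds 1.96 in absolute value and the second does not, which agrees with the mean model being flagged as significant in that cell while the AR(1) is not.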

By contrast to the results provided in Table 1, Table 2 shows the relative difference in forecast performance, measured using the squared errors, $(e^1_{t+k})^2 - (e^2_{t+k})^2$. Consistent with results already discussed for the absolute errors, $|e^1_{t+k}| - |e^2_{t+k}|$, results in Table 2

confirm the underperformance of the ARFIMA specification relative to both the mean

model and the AR(1). The failure of the ARFIMA specification to outperform either of the

last observation (Yt) or the random walk models is similarly confirmed in the Table 2

results. That the significance of the relative errors is entirely consistent across both tables

proves the robustness of the findings to the choice of error metric (absolute or squared

error) employed.

The moments of the distribution of forecast errors can provide additional information

about forecast performance, above that gained from a simple analysis of the error metrics.3

For instance, a model may exhibit superior performance in the mean, but have a relatively

large variance. Higher variances should, of course, be discouraged as they increase the

probability of large forecast errors. Considering the first moment of the distribution of the

forecast errors, the mean and sum of forecast errors are generally negative for all five

models for values of the fractional differencing parameter d < 0. However, for

values of d ≥ 0, the error means and sums tend to be positive. Mean forecast errors for the

ARFIMA model are only significantly different from zero for values of d = -0.49, 0, and

0.49 at forecast horizons of t + 90 periods ahead. All remaining positive and negative

values for the mean forecast error are insignificant, implying no significant difference in

the out-of-sample performance of the different models tested.

The analysis of the variance of forecast errors provides information about the tendency

for the various forecast models to produce large errors. Relative to the variance of errors of

the mean model and AR(1), the variance of ARFIMA forecast errors is larger at all

forecast horizons for all underlying values of d. ARFIMA forecast variances are also

generally higher than those produced by the random walk model. Compared with the

variance of errors attributable to forecasts based on the last observation (Yt), ARFIMA

variances are systematically lower for d = -0.49, yet are comparable for all other

underlying values of d.

3 Summary statistics and normality tests of the distribution of forecast errors for each of the five models are

available from the authors by request.

The distribution of errors for all five models is mostly non-Gaussian due to their

slight leptokurtic nature, rather than due to skewness, which is insignificant for all

values of d and lengths of forecast horizon. Two normality tests are conducted: the

Anderson-Darling (an empirical cumulative distribution test) and Jarque-Bera (a

chi-squared test). Both tests reject normality for all models, except at d = -0.49,

for which normality is accepted (α > .01) for all the models at forecast horizons of

t + 30, t + 60, and t + 90 periods ahead. Normality is also accepted for the distribution

of ARFIMA forecast errors at all forecast horizons for d = 0.1. However,

caution should be exercised in interpreting these test results because they are

sensitive to even small deviations from normality when the sample size is large.

Therefore, while there are no significant differences in the mean forecast errors of

the five models, significant differences in the variance of forecast errors suggest that

the ARFIMA specification has a higher tendency to produce larger forecast errors

than the other models.
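The moment-based comparison discussed above can be sketched as below. The code computes the sample mean, variance, skewness and excess kurtosis of a set of forecast errors, together with the standard Jarque-Bera statistic and its asymptotic chi-squared p-value (two degrees of freedom). It is an illustration only: the function names are hypothetical, the Anderson-Darling test is omitted, and the paper's own summary statistics are available only from the authors.

```python
import math


def sample_moments(errors):
    """Mean, population variance, skewness and excess kurtosis of forecast errors."""
    n = len(errors)
    mean = sum(errors) / n
    m2 = sum((e - mean) ** 2 for e in errors) / n
    m3 = sum((e - mean) ** 3 for e in errors) / n
    m4 = sum((e - mean) ** 4 for e in errors) / n
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3.0
    return mean, m2, skew, excess_kurt


def jarque_bera(errors):
    """Jarque-Bera statistic and its asymptotic chi-squared(2) p-value."""
    n = len(errors)
    _, _, skew, excess_kurt = sample_moments(errors)
    jb = n / 6.0 * (skew ** 2 + excess_kurt ** 2 / 4.0)
    # For 2 degrees of freedom the chi-squared survival function is exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return jb, p_value
```

Because the statistic scales with n, even slight excess kurtosis yields a rejection in large samples, which is the sensitivity the text cautions about.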

4.2. Relative forecast performance using S&P 500 and DJIA daily returns

Summary statistics pertaining to the S&P 500 and DJIA daily returns [$\log(P_t/P_{t-1})$]

for each of the full-sample period and subsample periods are given in Table 3. As indicated

by the mean returns and points gain, each series exhibits an increase in value over the

period being observed. Both S&P 500 and DJIA full-sample returns exhibit significant

positive kurtosis and some skewness. Returns for both data sets are also highly skewed,

with significant positive kurtosis during the period from 2/01/1985 to 31/12/1992;

however, this is largely due to the impact of the October 1987 stock-market crash. Using

runs tests for the independence of each series, the null hypothesis of independence cannot be

rejected for the S&P 500 and DJIA for the 2/01/1985 to 31/12/1992 and 4/01/1993 to

30/04/2002 subperiods, and, additionally, for the DJIA for the 3/01/1977 to 31/12/1984

subperiod; it is rejected in the remaining cases (see the runs P values in Table 3). Dickey-Fuller (DF) and Augmented Dickey-Fuller (ADF) test results for the

stationarity of each series are also presented in Table 3, and confirm that all the series are

(first-difference) stationary at the 1% level of significance. The DF test is conducted using

a lag length of zero. Following Schwert (1989), the number of lagged dependent

variables used in the ADF test (denoted as p in Table 3) is estimated as $p = [4(N/100)^{0.25}]$,

where N is the length of each series and [·] denotes the integer part. For both the DF and ADF tests, the 1% critical value

is -3.96.
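As a worked illustration of the two formulas used in this paragraph, the sketch below computes log returns from a price series and the ADF lag length from the rule p = [4(N/100)^0.25], reading the brackets as the integer part. The function names are hypothetical; the sample sizes are taken from Table 3.

```python
import math


def log_returns(prices):
    """Daily continuously compounded returns log(P_t / P_{t-1})."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]


def schwert_lag(n_obs):
    """Integer part of 4 * (N / 100) ** 0.25."""
    return int(4.0 * (n_obs / 100.0) ** 0.25)


print(schwert_lag(8293))  # full-sample S&P 500 size from Table 3 -> 12
print(schwert_lag(1908))  # first S&P 500 subsample size from Table 3 -> 8
```

Under this reading, the full samples of roughly 8300 observations imply 12 lags, and each subsample of roughly 1900 to 2250 observations implies 8 lags.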

The estimated values of the fractional differencing parameter for each series are also

provided in Table 3. Consistent with Andersson (1998), results attributable to the

classical rescaled range are different from those obtained using the MPL technique.

Specifically, the classical estimates are consistently lower than those using the MPL

method when d is positive, but not when d is negative. Both methods, however, agree on

the sign of d, positive or negative, for each of the series under observation. Comparing

the estimates of d obtained using the classical technique with their expected value, E(d) (the latter

estimated via bootstrap), shows significant levels of positive dependence (d > 0) in the full

sample and the first two subsamples for both the S&P 500 and DJIA.4 Estimated t

4 Following the bootstrap methodology suggested by Ambrose, Ancel, and Griffiths (1992), each sample is

scrambled 1000 times. Values of d are then calculated for each iteration, and their mean and standard deviation

used to estimate upper and lower confidence intervals.

Table 3

S&P 500 and DJIA summary statistics and d values

Start–finish           1/01/69–30/04/02  1/01/69–31/12/76  3/01/77–31/12/84  2/01/85–31/12/92  4/01/93–30/04/02

S&P 500
Sample size            8293              1908              1915              1918              2249
Starting value         103.9             103.9             107.0             165.4             435.4
Points gain (loss)     973.1             3.6               60.2              270.3             641.5
Mean return            2.79E-04          1.70E-05          2.19E-04          4.74E-04          3.85E-04
Standard error         1.08E-04          2.08E-04          1.92E-04          2.47E-04          2.12E-04
Standard deviation     0.0099            0.0093            0.0086            0.0111            0.0103
Skewness               -1.5690           0.2972            0.2504            -4.6753           -0.2923
Kurtosis               38.1848           2.4885            1.6101            95.1824           4.3803
Minimum                -0.2283           -0.0404           -0.0405           -0.2283           -0.0711
Maximum                0.0871            0.0490            0.0465            0.0871            0.0499
Runs (P value)         .000              .0000             .0003             .7674             .5090
DF (0)                 -84.400           -35.308           -39.029           -42.278           -47.169
ADF (p)                -25.507           -15.291           -14.196           -14.720           -15.845
d value (classical)    0.01641           0.11934           0.05529           -0.02786          -0.01320
E(d)                   0.00205           0.01877           0.01560           0.02061           0.01295
d value [MPL]          0.03162           0.13210           0.07658           -0.00891          -0.02379
  (t value)            (3.40)            (6.28)            (3.88)            (-0.47)           (-1.39)
d value [no dummy] (t value): 0.03141 (3.42); -0.02458 (-1.43)

DJIA
Sample size            8273              1906              1909              1913              2242
Starting value         943.8             943.8             999.8             1198.9            3309.2
Points gain (loss)     9002.5            60.9              211.8             2102.3            6637.0
Mean return            2.81E-04          3.12E-05          9.32E-05          4.98E-04          4.71E-04
Standard error         1.11E-04          2.13E-04          2.00E-04          2.63E-04          2.09E-04
Standard deviation     0.0102            0.0096            0.0090            0.0118            0.0101
Skewness               -2.0084           0.2867            0.3962            -5.2913           -0.5501
Kurtosis               52.7647           1.6743            1.5821            116.7042          5.2503
Minimum                -0.2563           -0.0357           -0.0359           -0.2563           -0.0745
Maximum                0.0967            0.0495            0.0478            0.0967            0.0486
Runs (P value)         .0008             .0000             .3273             .3400             .7719
DF (0)                 -84.464           -34.206           -40.600           -43.060           -46.258
ADF (p)                -25.610           -15.185           -14.764           -14.725           -14.706
d value (classical)    0.01157           0.09715           0.02454           -0.01855          -0.03074
E(d)                   0.00211           0.01948           0.02184           0.01994           0.01329
d value [MPL]          0.02882           0.14614           0.05048           -0.02279          -0.00852
  (t value)            (3.12)            (6.67)            (2.62)            (-1.22)           (-0.49)
d value [no dummy] (t value): 0.02878 (3.12); -0.00815 (-0.47)

values attributable to the MPL technique similarly show significant levels of positive

dependence for these series. To test the robustness of these findings to potential biases

induced by the October 1987 crash, the MPL d value estimates have also been

Table 4

S&P 500 and DJIA summary analysis of relative absolute errors, $|e^{1}_{t+k}| - |e^{2}_{t+k}|$, using MPL estimates of d value

           t + 1                                t + 30
           Yt       Mean     AR(1)    et        Yt       Mean     AR(1)    et
1/01/69–30/04/02
S&P 500     0.0110  -0.0018   0.0093   1.3188   -1.0556  -1.0622  -1.0564   0.1359
DJIA       -0.0735  -0.0715  -0.0719   1.2779   -1.5100  -1.5080  -1.5093  -0.3622
1/01/69–31/12/76
S&P 500    -0.9037  -0.9008  -0.9079   0.4157   -0.4834  -0.4805  -0.4853  -0.4728
DJIA       -1.5158  -1.5132  -1.5133  -0.1964   -1.0810  -1.0784  -1.0756  -1.0825
3/01/77–31/12/84
S&P 500    -1.1040  -1.0811  -1.1035   0.2644   -0.5868  -0.5638  -0.5854   0.1918
DJIA       -0.7927  -0.7855  -0.7936   0.6065   -1.3089  -1.3017  -1.3108  -0.4986
2/01/85–31/12/92
S&P 500    -1.4329  -1.4283  -1.4328  -0.4748   -1.2903  -1.2949  -1.2907  -0.3336
DJIA       -0.9021  -0.9010  -0.9017   0.1197   -1.3680  -1.3669  -1.3680   0.1506
4/01/93–30/04/02
S&P 500    -1.5320  -1.5446  -1.5320  -1.1387   -0.6744  -0.6808  -0.6743  -0.0388
DJIA       -1.2585  -1.2567  -1.2586  -0.9113   -0.4588  -0.4570  -0.4591   0.7251

           t + 60                               t + 90
           Yt       Mean     AR(1)    et        Yt       Mean     AR(1)    et
1/01/69–30/04/02
S&P 500    -0.6055  -0.6183  -0.6069   0.0498   -1.0591  -1.0719  -1.0608  -0.4560
DJIA       -1.4227  -1.4247  -1.4241  -0.7802   -0.2704  -0.2723  -0.2720   0.3327
1/01/69–31/12/76
S&P 500    -0.4217  -0.4246  -0.4180   1.0130   -1.0790  -1.0819  -1.0746   0.0450
DJIA       -0.0431  -0.0457  -0.0374   1.3947   -0.5074  -0.5099  -0.5030   0.6166
3/01/77–31/12/84
S&P 500    -1.0840  -1.0611  -1.0837   0.3299   -0.3073  -0.2946  -0.3064   0.1471
DJIA       -1.2118  -1.2046  -1.2107   0.2154   -1.3531  -1.3459  -1.3526  -1.2912
2/01/85–31/12/92
S&P 500    -1.3566  -1.3611  -1.3565  -1.0154   -0.7258  -0.7270  -0.7256   0.9620
DJIA       -0.8396  -0.8385  -0.8395  -0.0370   -1.6192  -1.6181  -1.6191  -0.3847
4/01/93–30/04/02
S&P 500    -1.7182  -1.7308  -1.7181   0.0426   -1.3620  -1.3747  -1.3620  -0.8843
DJIA       -0.3681  -0.3699  -0.3680   1.2706   -0.5227  -0.5245  -0.5225  -0.0066

calculated using an ‘October dummy’ variable where applicable. As noted in Table 3, d

values with and without the inclusion of the dummy variable are consistent. Overall,

the results of both test methodologies confirm the presence of positive dependence in

the full sample and the first two subsamples for both the S&P 500 and DJIA, and

reject the presence of negative dependence in the latter two subsamples for both data

sets.
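The scrambling bootstrap described in footnote 4 can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' implementation: `estimate_d` uses a simple textbook rescaled-range regression (d = H - 0.5) that may differ in detail from the classical technique of Ellis (1999), the window sizes are arbitrary, the input is simulated noise rather than index returns, and the band of 1.96 standard deviations is an assumed way of turning the bootstrap mean and standard deviation into confidence limits. All function names are hypothetical.

```python
import math
import random
import statistics


def rescaled_range(window):
    """R/S of one window: range of cumulative mean deviations over the std. dev."""
    mean = statistics.fmean(window)
    cumulative, running = [], 0.0
    for x in window:
        running += x - mean
        cumulative.append(running)
    spread = max(cumulative) - min(cumulative)
    # Assumes the window is not constant, so the standard deviation is non-zero.
    return spread / statistics.pstdev(window)


def estimate_d(series, window_sizes=(8, 16, 32, 64)):
    """Slope of log(mean R/S) on log(window size) gives H; return d = H - 0.5."""
    xs, ys = [], []
    for size in window_sizes:
        windows = [series[i:i + size]
                   for i in range(0, len(series) - size + 1, size)]
        mean_rs = statistics.fmean([rescaled_range(w) for w in windows])
        xs.append(math.log(size))
        ys.append(math.log(mean_rs))
    x_bar, y_bar = statistics.fmean(xs), statistics.fmean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return slope - 0.5


def scrambled_d(series, n_scrambles=1000, seed=0):
    """Mean and standard deviation of d over randomly scrambled copies of the series."""
    rng = random.Random(seed)
    shuffled = list(series)
    estimates = []
    for _ in range(n_scrambles):
        rng.shuffle(shuffled)  # scrambling destroys any serial dependence
        estimates.append(estimate_d(shuffled))
    return statistics.fmean(estimates), statistics.stdev(estimates)


# Illustrative data only: simulated independent "returns".
rng = random.Random(1)
returns = [rng.gauss(0.0, 0.01) for _ in range(512)]
d_hat = estimate_d(returns)
expected_d, sd = scrambled_d(returns, n_scrambles=200)
lower, upper = expected_d - 1.96 * sd, expected_d + 1.96 * sd  # assumed band
print(f"d = {d_hat:.4f}, E(d) = {expected_d:.4f}, band = [{lower:.4f}, {upper:.4f}]")
```

An estimate of d lying outside the band would be read as dependence beyond what the scrambled (independent) series produce, which is the comparison against E(d) reported in Table 3.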

The relative forecast performance of the ARFIMA null model for multistep-ahead

forecasts of S&P 500 and DJIA daily returns is shown in four tables. Tables 4 and 5

provide a summary analysis of the relative absolute and relative squared forecast errors,

respectively, for ARFIMA forecasts, based upon estimated values of d using the MPL

Table 5

S&P 500 and DJIA summary analysis of relative squared errors, $(e^{1}_{t+k})^{2} - (e^{2}_{t+k})^{2}$, using MPL estimates of d value

           t + 1                                t + 30
           Yt       Mean     AR(1)    et        Yt       Mean     AR(1)    et
1/01/69–30/04/02
S&P 500     0.0004   0.0000   0.0003   1.7678   -3.1598  -3.1596  -3.1598  -1.1723
DJIA       -0.0068  -0.0067  -0.0067   1.8445   -1.2613  -1.2613  -1.2613   0.6813
1/01/69–31/12/76
S&P 500    -0.8327  -0.8326  -0.8327   0.9316   -0.0515  -0.0515  -0.0515   0.3736
DJIA       -2.3283  -2.3283  -2.3283  -0.5608   -1.9069  -1.9069  -1.9069  -1.4862
3/01/77–31/12/84
S&P 500    -1.2210  -1.2204  -1.2210   0.6543   -0.4351  -0.4346  -0.4351   1.5225
DJIA       -0.6648  -0.6644  -0.6648   1.3573   -0.8914  -0.8914  -0.8914   0.3654
2/01/85–31/12/92
S&P 500    -2.0554  -2.0554  -2.0554  -1.1361   -0.5856  -0.5856  -0.5856  -0.2132
DJIA       -0.8182  -0.8182  -0.8182   0.2309   -1.0016  -1.0016  -1.0016  -0.2740
4/01/93–30/04/02
S&P 500    -2.4132  -2.4136  -2.4132  -2.2415   -0.8639  -0.8636  -0.8639  -0.8208
DJIA       -1.6075  -1.6075  -1.6075  -1.4804   -0.3909  -0.3909  -0.3909   0.4841

           t + 60                               t + 90
           Yt       Mean     AR(1)    et        Yt       Mean     AR(1)    et
1/01/69–30/04/02
S&P 500    -1.6720  -1.6717  -1.6720   0.0137   -0.2285  -0.2279  -0.2285   1.1167
DJIA       -1.0067  -1.0068  -1.0067   0.7563   -0.5710  -0.5710  -0.5710   0.7410
1/01/69–31/12/76
S&P 500    -0.1254  -0.1253  -0.1254   2.0736   -2.3850  -2.3850  -2.3850  -2.0991
DJIA       -0.7525  -0.7525  -0.7525   1.4501   -0.1793  -0.1793  -0.1793   0.1096
3/01/77–31/12/84
S&P 500    -0.7169  -0.7161  -0.7169   0.5427   -1.1874  -1.1864  -1.1874  -0.8217
DJIA       -0.9582  -0.9581  -0.9582  -0.6062   -1.0964  -1.0964  -1.0964  -0.4876
2/01/85–31/12/92
S&P 500    -1.8111  -1.8111  -1.8111  -0.7634   -0.7893  -0.7893  -0.7893  -0.0566
DJIA       -0.0158  -0.0158  -0.0158   0.2532   -0.2617  -0.2617  -0.2617   1.6195
4/01/93–30/04/02
S&P 500    -1.4579  -1.4575  -1.4579   0.6062   -1.1781  -1.1775  -1.1781  -1.0561
DJIA       -1.6505  -1.6506  -1.6505  -1.0784   -1.1823  -1.1823  -1.1823   1.1098

technique. Tables 6 and 7 provide a like analysis for ARFIMA forecasts based upon

classical estimates of d.

Consistent with results already discussed for the Monte Carlo simulations with

lower levels of positive dependence (i.e., d = 0.1), the ARFIMA specification is

shown to largely underperform each of the four alternate models. Except for some

positive values, attributable to random walk forecasts of S&P 500 and DJIA daily

returns, hn(t + k) values in all the tables are negative. That there are also no significant

differences in the error metrics (i.e., the relative absolute errors pertaining to the

Table 6

S&P 500 and DJIA summary analysis of relative absolute errors, $|e^{1}_{t+k}| - |e^{2}_{t+k}|$, using classical estimates of d value

           t + 1                                t + 30
           Yt       Mean     AR(1)    et        Yt       Mean     AR(1)    et
1/01/69–30/04/02
S&P 500    -0.0428  -0.0299  -0.0316   1.2779   -1.5098  -1.5032  -1.5040  -0.3117
DJIA       -0.0715  -0.0735  -0.0719   1.2779   -1.5080  -1.5100  -1.5093  -0.3622
1/01/69–31/12/76
S&P 500    -0.4715  -0.4744  -0.4786   0.8450   -0.4670  -0.4699  -0.4718  -0.4593
DJIA       -0.2760  -0.2786  -0.2760   1.0408   -0.9686  -0.9712  -0.9658  -0.9727
3/01/77–31/12/84
S&P 500    -0.7817  -0.8047  -0.8042   0.5638   -0.5169  -0.5398  -0.5385   0.2387
DJIA       -0.3602  -0.3673  -0.3683   1.0318   -0.2566  -0.2638  -0.2657   0.5465
2/01/85–31/12/92
S&P 500    -0.9212  -0.9257  -0.9256   0.0323   -0.2217  -0.2171  -0.2176   0.7396
DJIA       -1.3743  -1.3753  -1.3750  -0.3536   -1.3846  -1.3857  -1.3857   0.1330
4/01/93–30/04/02
S&P 500    -0.8795  -0.8668  -0.8668  -0.4736   -0.6035  -0.5971  -0.5971   0.0385
DJIA       -1.6712  -1.6730  -1.6730  -1.3257   -1.1078  -1.1095  -1.1098   0.0744

           t + 60                               t + 90
           Yt       Mean     AR(1)    et        Yt       Mean     AR(1)    et
1/01/69–30/04/02
S&P 500    -1.4701  -1.4573  -1.4588  -0.8020   -0.3125  -0.2997  -0.3014   0.3034
DJIA       -1.4247  -1.4227  -1.4241  -0.7802   -0.2723  -0.2704  -0.2720   0.3327
1/01/69–31/12/76
S&P 500    -0.4381  -0.4352  -0.4315   0.9995   -0.3665  -0.3636  -0.3592   0.7605
DJIA       -1.3061  -1.3035  -1.2977   0.1343   -1.4197  -1.4172  -1.4128  -0.2932
3/01/77–31/12/84
S&P 500     0.0236   0.0007   0.0010   1.4146   -0.8618  -0.8745  -0.8736  -0.4201
DJIA       -1.5608  -1.5680  -1.5669  -0.1408   -0.6465  -0.6537  -0.6532  -0.5918
2/01/85–31/12/92
S&P 500    -1.4402  -1.4357  -1.4357  -1.0945   -0.8266  -0.8253  -0.8251   0.8625
DJIA       -0.2869  -0.2880  -0.2879   0.5146   -1.1078  -1.1089  -1.1088   0.1256
4/01/93–30/04/02
S&P 500    -0.9057  -0.8930  -0.8930   0.8677   -0.4731  -0.4604  -0.4604   0.0173
DJIA       -1.5613  -1.5596  -1.5595   0.0791   -0.9151  -0.9133  -0.9131  -0.3972

MPL estimates of d are similar to the relative absolute errors pertaining to the

classical estimates and, likewise, for relative squared errors) further implies that the

choice of method for estimating the value of d, ultimately, has little or no bearing on

out-of-sample forecast accuracy. While alternative methods for estimating d indeed

yield different values for the fractional differencing parameter, this result implies that

there are only limited practical consequences that may arise when varying estimates

of d are found for the same data sample. Overall, the findings with respect to the

S&P 500 and DJIA daily returns data confirm the poor ability of the pure fractional

Table 7

S&P 500 and DJIA summary analysis of relative squared errors, $(e^{1}_{t+k})^{2} - (e^{2}_{t+k})^{2}$, using classical estimates of d value

           t + 1                                t + 30
           Yt       Mean     AR(1)    et        Yt       Mean     AR(1)    et
1/01/69–30/04/02
S&P 500    -0.0026  -0.0022  -0.0023   1.7652   -2.2890  -2.2889  -2.2889  -0.8461
DJIA       -0.0067  -0.0068  -0.0067   1.8445   -2.3460  -2.3461  -2.3461  -0.9785
1/01/69–31/12/76
S&P 500    -0.2334  -0.2334  -0.2335   1.5308   -0.2340  -0.2341  -0.2341  -0.2337
DJIA       -0.0832  -0.0832  -0.0832   1.6843   -0.9823  -0.9824  -0.9821  -0.9825
3/01/77–31/12/84
S&P 500    -0.6485  -0.6490  -0.6490   1.2263   -0.3023  -0.3033  -0.3033   0.3200
DJIA       -0.1515  -0.1518  -0.1519   1.8702   -0.0710  -0.0711  -0.0711   0.5901
2/01/85–31/12/92
S&P 500    -0.8584  -0.8584  -0.8584   0.0609   -0.0526  -0.0525  -0.0525   0.8865
DJIA       -1.8984  -1.8984  -1.8984  -0.8493   -1.9551  -1.9552  -1.9552   0.3896
4/01/93–30/04/02
S&P 500    -0.7893  -0.7889  -0.7889  -0.6172   -0.3680  -0.3680  -0.3679   0.0482
DJIA       -2.8303  -2.8303  -2.8303  -2.7032   -1.2799  -1.2800  -1.2800   0.1740

           t + 60                               t + 90
           Yt       Mean     AR(1)    et        Yt       Mean     AR(1)    et
1/01/69–30/04/02
S&P 500    -2.1795  -2.1792  -2.1793  -1.7249   -0.1039  -0.1035  -0.1036   0.2878
DJIA       -2.0964  -2.0963  -2.0964  -1.6508   -0.0840  -0.0840  -0.0840   0.3039
1/01/69–31/12/76
S&P 500    -0.1994  -0.1994  -0.1993   1.8920   -0.1345  -0.1345  -0.1345   1.1363
DJIA       -1.7251  -1.7250  -1.7249   0.3709   -2.0157  -2.0157  -2.0157  -0.7465
3/01/77–31/12/84
S&P 500     0.0006   0.0000   0.0000   2.0013   -0.7734  -0.7737  -0.7737  -0.5626
DJIA       -2.4833  -2.4835  -2.4835  -0.4238   -0.4349  -0.4351  -0.4351  -0.4305
2/01/85–31/12/92
S&P 500    -2.0819  -2.0818  -2.0818  -1.9605   -0.6860  -0.6860  -0.6860   2.1725
DJIA       -0.0850  -0.0850  -0.0850   0.5649   -1.2506  -1.2506  -1.2506   0.2968
4/01/93–30/04/02
S&P 500    -0.8315  -0.8312  -0.8312   2.3354   -0.2333  -0.2329  -0.2329   0.0170
DJIA       -2.5108  -2.5108  -2.5108   0.2571   -0.8705  -0.8705  -0.8705  -0.5836

ARFIMA model to produce superior forecasts, as is shown for the Monte Carlo

simulated series, for which the value of d is known.

5. Summary

Under the assumptions that asset returns are independent and follow a Gaussian

distribution, information about past price changes should not be used to successfully

forecast future returns. Rather, the best unbiased estimate of price changes during the next

period should be the current period’s price change. However, recent empirical analysis

suggests that a variety of financial time series exhibit nonlinear dependence. An

implication of these findings is that such dependence can be efficiently modelled and

forecast forward.

This study examines the out-of-sample forecast performance of the ARFIMA (0,d,0)

specification, relative to various simple deterministic and stochastic models. Monte Carlo

techniques are used to simulate ARFIMA (0,d,0) processes with known levels of positive

and negative dependence, to which the models are then applied. Using a variety of different measures of

forecast performance, the ARFIMA specification fails to outperform forecasts derived

from both the simple average and AR(1) models, and performs only as well as a forecast

based on the last observed value or a random walk model. This result is irrespective of the

sign and magnitude of the underlying dependence in the simulated series, although the

statistical significance of the difference in relative forecast performance is generally higher

at longer forecast horizons.
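To make concrete what an ARFIMA (0,d,0) forecast involves, the sketch below uses the standard expansion of the fractional difference operator, (1 - L)^d = sum_j pi_j L^j with pi_0 = 1 and pi_j = pi_{j-1}(j - 1 - d)/j, and iterates the resulting autoregressive form forward with future shocks set to zero. It is an illustration under stated assumptions (a zero-mean series, d treated as known, and the infinite autoregression truncated at the available history), not a reproduction of the forecasting procedure used in this study; the function names are hypothetical.

```python
def frac_diff_weights(d, n_weights):
    """Coefficients pi_j of (1 - L)**d, from pi_j = pi_{j-1} * (j - 1 - d) / j."""
    weights = [1.0]
    for j in range(1, n_weights):
        weights.append(weights[-1] * (j - 1 - d) / j)
    return weights


def arfima_0d0_forecast(history, d, steps):
    """Recursive multi-step forecasts for a zero-mean ARFIMA(0,d,0) series.

    Uses the truncated AR(infinity) form y_t = -sum_{j>=1} pi_j * y_{t-j} + e_t,
    replacing unknown future values by their own forecasts.
    """
    values = list(history)
    forecasts = []
    for _ in range(steps):
        pi = frac_diff_weights(d, len(values) + 1)
        y_hat = -sum(pi[j] * values[-j] for j in range(1, len(values) + 1))
        forecasts.append(y_hat)
        values.append(y_hat)
    return forecasts


# Hypothetical example: three past observations and d = 0.1.
print(arfima_0d0_forecast([0.3, -0.2, 0.5], 0.1, 3))
```

For d = 0 every weight beyond pi_0 is zero and the forecast collapses to the (zero) mean, which is one way to see why the simple mean model is a natural benchmark for small values of d.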

Expanding the study to consider the relative forecast performance of the ARFIMA

specification using two sets of actual financial time-series returns, the poor out-of-sample

performance of the ARFIMA (0,d,0) is confirmed. Examining daily returns for the S&P

500 and DJIA with varying moderate levels of dependence, the ARFIMA model is unable

to outperform any of the alternate linear models or random walk on a consistent basis.

Given the computational complexity of the ARFIMA specification, this poor result may

lead us to question the applicability of the model. However, the fact that the simulated

series used in this study were, by constraint of the values of d used, all stationary should

not be overlooked.

Acknowledgements

Earlier versions of this paper were presented at the Twelfth Annual PACAP/FMA

Finance Conference and the 2003 FMA Annual Meeting. The authors would like to

gratefully acknowledge helpful comments received from participants at both meetings.

References

Ambrose, B. W., Ancel, E. W., & Griffiths, M. D. (1992). The fractal structure of real estate investment trust

returns: The search for evidence of market segmentation and non-linear dependency. Journal of the American

Real Estate and Urban Economics Association, 20(1), 25–54.

An, S. & Bloomfield, P. (1993). Cox and Reid’s modification in regression models with correlated errors.

Discussion Paper, Department of Statistics, North Carolina State University.

Andersson, M. K. (1998). On the effects of imposing or ignoring long memory when forecasting, Stockholm

School of Economics. Working Paper Series in Economics and Finance No. 225.

Barkoulas, J. T., & Baum, C. F. (1997). Fractional differencing modeling and forecasting of Eurocurrency deposit

rates. Journal of Financial Research, 20(3), 355–372.

Barkoulas, J. T., & Baum, C. F. (1998). Fractional dynamics in Japanese financial time series. Pacific-Basin

Finance Journal, 6, 115–124.

Barkoulas, J. T., Labys, W. C., & Onochie, J. I. (1999). Long memory in futures prices. Financial Review, 34,

91–100.

Bhansali, R. J., & Kokoszka, P. S. (2002). Prediction of long-memory time series: An overview. In P. Doukhan,

G. Oppenheim, & M. Taqqu (Eds.), Long range dependence: Theory and applications. Massachusetts:

Birkhauser.

Cheung, Y. W. (1993). Long memory in foreign exchange-rates. Journal of Business and Economic Statistics,

11(1), 93–101.

Chung, C. F. (1994). A note on calculating the autocovariances of the fractionally integrated ARMA models.

Economics Letters, 45, 293–297.

Cox, D. R., & Reid, N. (1987). Parameter orthogonality and approximate conditional inference (with discussion).

Journal of the Royal Statistical Society. Series B, Statistical Methodology, 49, 1–39.

Davidson, D. (2002). A model of fractional cointegration, and tests for cointegration using the bootstrap. Journal

of Econometrics, 110, 187–212.

Diebold, F. X., & Rudebusch, G. D. (1989). Long memory and persistence in aggregate output. Journal of

Monetary Economics, 24(2), 189–209.

Doornik, J. A. & Ooms, M. (2001). A package for estimating, forecasting and simulating ARFIMA models:

Arfima package 1.01 for Ox. Discussion paper, Nuffield College, Oxford.

Ellis, C. (1999). Estimation of the ARFIMA ( p, d, q) fractional differencing parameter (d) using the classical

rescaled adjusted range technique. International Review of Financial Analysis, 8(1), 53–65.

Fang, H., Lai, K. S., & Lai, M. (1994). Fractal structure in currency futures price dynamics. Journal of Futures

Markets, 14(2), 169–181.

Geweke, J., & Porter-Hudak, S. (1983). The estimation and application of long memory time series models.

Journal of Time Series Analysis, 4(4), 221–238.

Granger, C. W. J., & Joyeux, R. (1980). An introduction to long-memory time series models and fractional

differencing. Journal of Time Series Analysis, 1, 15–39.

Harvey, D., Leybourne, S., & Newbold, P. (1997). Testing the equality of mean square prediction errors.

International Journal of Forecasting, 13, 281–291.

Hauser, M. (1999). Maximum likelihood estimators for ARMA and ARFIMA models: A Monte Carlo study.

Journal of Statistical Planning and Inference, 80, 229–255.

Hosking, J. R. M. (1984). Modeling persistence in hydrological time series using fractional differencing. Water

Resources Research, 20(12), 1898–1908.

Hurst, H. E. (1951). Long term storage capacity of reservoirs. Transactions of the American Society of Civil

Engineers, 116, 770–799.

Mandelbrot, B. B., & Van Ness, J. W. (1968). Fractional Brownian motions, fractional noises and applications.

SIAM Review, 10(4), 422–437.

Martin, V. L., & Wilkins, N. P. (1999). Indirect estimation of ARFIMA and VARFIMA models. Journal of

Econometrics, 93, 149–175.

Okunev, J., & Wilson, P. (1997). Using nonlinear tests to examine integration between real estate and equity

markets. Real Estate Economics, 25(3), 487–503.

Ray, B. K. (1993). Modelling long-memory processes for optimal long-range prediction. Journal of Time Series

Analysis, 14, 511–525.

Reisen, V. A., & Lopes, S. (1999). Some simulations and applications of forecasting long-memory time-series

models. Journal of Statistical Planning and Inference, 80, 269–287.

Schwert, C. G. (1989). Tests for unit roots: A Monte Carlo investigation. Journal of Business and Economic

Statistics, 7, 147–159.

Smith, J., & Yadav, S. (1994). Forecasting costs incurred from unit differencing fractionally integrated processes.

International Journal of Forecasting, 10, 507–514.