International Review of Financial Analysis
13 (2004) 63–81
Another look at the forecast performance of
ARFIMA models
Craig Ellis a,*, Patrick Wilson b
a School of Economics and Finance, University of Western Sydney, Locked Bag 1797, Penrith South DC, Sydney NSW 1797, Australia
b School of Finance and Economics, University of Technology, Sydney, Australia
Abstract
This paper investigates the out-of-sample forecast performance of the autoregressive fractionally
integrated moving average [ARFIMA (0,d,0)] specification, both when the underlying value of the
fractional differencing parameter (d) is known a priori and when it is unknown. Forecast
performance is measured relative to simple deterministic models and a random walk model, for
forecast horizons up to 100 periods ahead. Overall, the linear models tend to outperform the
ARFIMA specification for both the positive and negative values of d for the simulated series, and for
positive d values from the real time-series data. The results of the study question the use of the
ARFIMA specification as a forecast tool.
© 2004 Elsevier Inc. All rights reserved.
JEL classification: C22; C52; G10
Keywords: Time series; Simulation; ARFIMA
1. Introduction
Traditional models of financial asset returns are based on a number of simplifying
assumptions. Among these is the primary assumption that consecutive price changes are
independent (i.e., follow a random walk). The development of financial asset pricing
models has also been based on this assumption. A general feature of these pricing models
is that the relationships between the model parameters are fundamentally linear. Under the
assumption that price changes are additionally stationary around some long-term mean,
1057-5219/$ - see front matter © 2004 Elsevier Inc. All rights reserved.
doi:10.1016/j.irfa.2004.01.005
* Corresponding author. Tel.: +61-2-4620-3250; fax: +61-2-4620-3787.
E-mail address: [email protected] (C. Ellis).
C. Ellis, P. Wilson / Int. Rev. Financ. Analy. 13 (2004) 63–81
linear forecast methods should therefore yield the best prediction of the future value of the
series in the long term. Recent empirical research has, however, identified the possibility
that these relations are deterministic nonlinear (including Barkoulas & Baum, 1998;
Barkoulas, Labys, & Onochie, 1999; Cheung, 1993; Diebold & Rudebusch, 1989; Fang,
Lai, & Lai, 1994; Okunev & Wilson, 1997). These studies also question the accuracy of
linear models for modelling and forecasting economic and financial time-series data.
A common feature of nonlinear behaviour is long-term dependence between time-series
increments. Time series that follow a random walk typically exhibit autocorrelation
functions that decay quickly to zero. By contrast, the autocorrelation functions of
long-term dependent time series decay only slowly to zero. Rather than being
independent, the increments of long-term dependent time series exhibit nonnegligible
positive or negative dependence, even over long intervals. For analysts, such
autocorrelations imply that nonlinear forecast techniques using knowledge of past price changes
should yield statistically accurate long-term forecasts.
The intuition underlying the implied failure of simple linear models to accurately
describe the time-series properties of economic and financial variables is simple. Time
series for which the autocorrelation is negative over long intervals will exhibit mean-
reverting properties. Therefore, linear modelling techniques should be expected to
consistently overestimate or underestimate the price of the asset as the level of negative
dependence increases. For time-series exhibiting positive autocorrelation, successive
price changes will tend to be in the same direction. However, as the autocorrelation
coefficient tends to unity and the time-series itself becomes approximately linear, the
marginal benefit to using nonlinear methods over linear forecast techniques should
decline.
The forecast performance of the autoregressive fractionally integrated moving average
[ARFIMA (0,d,0)] specification, relative to simple autoregressive, AR( p), models, has
been previously examined by Ray (1993), Smith and Yadav (1994), and Barkoulas and
Baum (1997). Using high-order AR models, Ray finds evidence that these will outperform
the ARFIMA specification, even over forecast horizons of up to 20 periods ahead for
fractional noise processes with no explicit autoregressive or moving average (MA)
components. In similar tests, Smith and Yadav show that low-order AR models can
also outperform the ARFIMA (0,d,0) specification, yet indicate that the relative
performance of higher order AR models declines as the forecast horizon is increased
beyond about 20 periods ahead. Compared with an AR(1), Barkoulas and Baum, by
contrast, show a significant improvement in forecast accuracy using an ARFIMA ( p,d,0)
for Eurocurrency deposit rates denominated in several currencies.
Using a variety of simulated AR(1), MA( q), and ARFIMA (1,d,0) processes, Andersson
(1998) tests the forecast performance of general ARFIMA ( p,d,q) models versus ARMA
( p,q) models. While concluding that it is generally worse to ignore fractional long-term
dependence than to impose it a priori, Andersson shows that the forecast power of the
general ARFIMA specification is largely a function of the method used to initially measure
the level of dependence (i.e., spectral regression, rescaled range, maximum likelihood) in
the underlying time series. Long-horizon ARFIMA ( p,d,q) forecasts based on the
maximum likelihood (ML) estimator are shown by Andersson to generally outperform
the ARMA specification.
Monte Carlo simulation techniques have also been used by Reisen and Lopes (1999)
and Bhansali and Kokoszka (2002) to test the relative forecast performance of ARFIMA
models. Applying the theory of forecasting ARIMA ( p,d,q) models to the ARFIMA
( p,d,q) specification, Reisen and Lopes use infinite AR and MA representations to forecast
ARFIMA models with different values for p, d>0, and q. Compared with ARMA ( p,q)
forecasts, the mean forecast error for the ARFIMA replication is lower for forecasts up to
five steps ahead, yet, the variance of errors and mean-squared error are not statistically
different when compared with the ARMA forecasts. Employing a similar methodology for
forecasts up to 15 steps ahead, Bhansali and Kokoszka conclude that the additional
estimation of the fractional differencing parameter, d, does not necessarily improve
forecast accuracy.
This paper investigates the forecast performance of various linear models: the last
observed value (Yt), the mean of all observed past values, an AR(1), and a random walk
model (et) against an ARFIMA (0,d,0) forecast model for up to 100 steps ahead. The
methodology employed extends the research of Ray (1993) and Smith and Yadav (1994)
to considering the relative forecast performance of pure fractional ARFIMA models versus
models other than the AR( p) specification over longer forecast horizons. Emphasis on the
pure fractional specification (i.e., p = q = 0 and d ≠ 0) follows from Davidson (2002, p.
193), who argues that pure fractional ARFIMA models quite often provide an “adequate
representation [of real time-series], in terms of both maximizing the Schwarz criterion and
showing no significant residual autocorrelation.”
Relative forecast performance is first measured using simulated long-term dependent
series with known values of the fractional differencing parameter. The study is then
extended to consider forecast performance using S&P 500 and Dow Jones Industrial
Average (DJIA) daily returns, for which values of the fractional differencing parameter are
unknown and must first be measured. The major finding of the paper is that the ARFIMA
specification generally underperforms both the simple average and the AR(1), yet,
outperforms forecasts based on the last observation or last observation plus a stochastic
error (i.e., random walk).
The remainder of the paper is organised as follows: The specification and properties of
the ARFIMA model are discussed in Section 2. Section 3 describes the Monte Carlo
simulation methodology used in this study and summarises the alternative measures of the
fractional differencing parameter used in this study when d is unknown. A detailed
analysis of the relative forecast performance of the various forecast models employed is
provided in Section 4, including a separate analysis of findings pertaining to the simulated
and real data sets. Finally, Section 5 concludes the paper.
2. Specification of the ARFIMA ( p,d,q) model
Two general forms of long-term dependent process that have been recently examined in
the financial literature are fractional Brownian motions (fBm) and ARFIMA processes.
Proposed by Granger and Joyeux (1980), the ARFIMA ( p,d,q) specification represents an
alternative model of long-term dependence to that of fBm. A particularly attractive feature
of the ARFIMA specification is that it allows short- and long-term dependencies to be
modelled separately from each other. This gives the model a distinct advantage over fBm,
which only considers long-term dependence.
Consider a function which conforms to an ARMA ( p,q) process. Denoting Φ(L) and
Ψ(L) to be polynomials of the order p and q, respectively, such that

$$\begin{aligned}
\Phi(L) &= 1 - \phi_1 L - \phi_2 L^2 - \ldots - \phi_p L^p \\
\Psi(L) &= 1 - \psi_1 L - \psi_2 L^2 - \ldots - \psi_q L^q
\end{aligned} \tag{1}$$

and, introducing a mean, μ, the general form of the ARMA specification is

$$\Phi(L)(Y_t - \mu) = \Psi(L)e_t, \qquad e_t \sim (0, \sigma_e^2), \qquad \mu = f(X, \beta) \tag{2}$$

The variable L in Eqs. (1) and (2) represents the backward shift operator. Under the
assumption that the mean, μ, may be determined by some set of exogenous regressors, X
represents a (T × k) matrix and β a (k × 1) vector.
When the function exhibits a nonstationarity of the type associated with a positive and
real unit root, it follows that the function should become stationary when differenced.
Introducing the backward difference operator ∇, Eq. (2) may be rewritten as

$$\Phi(L)\nabla^d (Y_t - \mu) = \Psi(L)e_t \tag{3}$$
The parameter d defines the level of differencing required to induce stationarity. For the
value d = 1, the function is unit root nonstationary. Alternatively, for d = 0, the function is
itself stationary, and Eq. (3) reduces to Eq. (2).
When the differencing parameter is a real value, the correct model is the ARFIMA
( p,d,q) specification. Replacing the backward difference operator in Eq. (3) with an
infinite order autoregressive process, the ARFIMA specification may be written as

$$\Phi(L)(1 - L)^d (Y_t - \mu) = \Psi(L)e_t \tag{4}$$

Stationarity and invertibility conditions require that the value of the fractional
differencing parameter, d, is such that |d| < 0.5. For the values 0 < d < 0.5, the process is
stationary and long-term dependent, with a correlation structure similar to an fBm with
0.5 < H < 1. Given −0.5 < d < 0, the resulting process exhibits negative long-term
dependence, similar to a strongly mean-reverting process. However, for p = q = 0 and
−0.5 < d < 0.5, the process is a type of fractionally differenced white noise whose general
characteristics are similar to fBm for all 0 < H < 1, satisfying H = d + 0.5 (Hosking,
1984).
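
The slow decay of the autocorrelations described above can be illustrated numerically. The following is a minimal sketch, not part of the original study: it relies on the standard result for fractionally differenced white noise that the lag-1 autocorrelation is d/(1 − d) and that later lags follow the recursion ρ_k = ρ_{k−1}(k − 1 + d)/(k − d). The function names are hypothetical.

```python
def fractional_noise_acf(d, max_lag):
    """Autocorrelations rho_0..rho_max_lag of an ARFIMA(0,d,0) process."""
    if not -0.5 < d < 0.5:
        raise ValueError("d must lie in (-0.5, 0.5)")
    rho = [1.0]
    for k in range(1, max_lag + 1):
        # rho_k = rho_{k-1} * (k - 1 + d) / (k - d), so rho_1 = d / (1 - d)
        rho.append(rho[-1] * (k - 1 + d) / (k - d))
    return rho


def hurst_from_d(d):
    """Map d to the corresponding Hurst exponent, H = d + 0.5."""
    return d + 0.5
```

Calling `fractional_noise_acf(0.49, 100)` returns autocorrelations that are all positive and decline only slowly with the lag, whereas for d < 0 every autocorrelation beyond lag zero is negative, in line with the mean-reverting behaviour described in the text.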
3. Research methodology
The objective of this study is to investigate the out-of-sample forecast performance of
the ARFIMA (0,d,0) specification given different underlying levels of the fractional
differencing parameter, d. Given the conditions of stationarity and invertibility, the initial
values of d chosen in this study are d = −0.49, 0, and 0.49 for the Monte Carlo
simulations. On the basis of estimated d values reported by various other authors for real
time-series (see above), simulated series are then also generated for the underlying values
d = −0.1 and 0.1. For each underlying value of d, 4000 individual series are then
simulated. Different methods for simulating ARFIMA processes, based on the calculation
of the autocovariance function, include the Cholesky decomposition algorithm (Geweke &
Porter-Hudak, 1983), inverse Fourier transformation, and recursive methods (Chung,
1994). An alternative simulation method is to use a type of truncated autoregression. In
this study, simulated ARFIMA processes are constructed by the truncated autoregression
method of Hosking (1984), where the infinite order autoregression in Eq. (4) is replicated
by the truncated process

$$(1 - L)^d (Y_t - \mu) = \sum_{k=0}^{j} (-1)^k \binom{d}{k} L^k (Y_t - \mu) \tag{5}$$

The parameter j in Eq. (5) represents the truncation lag of the differencing operator.
Compared with simulations based on the computing of the covariance matrix, the
truncated autoregression is both simpler and faster when the sample size is large (Martin
& Wilkins, 1999). The specifics of the truncation process, including the guidelines for the
selection of the size of the truncation lag, j, are given in detail by Hosking (1984). Martin
and Wilkins (1999) also describe an indirect simulation method for the application of the
Cholesky decomposition and truncated autoregression.
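
The truncated autoregression in Eq. (5) can be sketched in a few lines. The code below is an illustrative sketch only, not the authors' implementation: the truncation lag of 100, the burn-in of 500 observations, the zero pre-sample values, and the standard normal innovations are all assumptions made here for the example, and the function names are hypothetical. The coefficients (−1)^k C(d,k) are built with the recursion π_k = π_{k−1}(k − 1 − d)/k.

```python
import random


def frac_diff_weights(d, trunc):
    """Coefficients pi_k = (-1)^k * binom(d, k), k = 0..trunc, of (1 - L)^d."""
    weights = [1.0]
    for k in range(1, trunc + 1):
        weights.append(weights[-1] * (k - 1 - d) / k)
    return weights


def simulate_fractional_noise(d, n, trunc=100, burn_in=500, seed=None):
    """Simulate n observations of a zero-mean ARFIMA(0,d,0) series.

    Uses Y_t = e_t - sum_{k=1}^{trunc} pi_k * Y_{t-k} with standard normal
    innovations (an assumption). Pre-sample values are treated as zero and
    the first burn_in observations are discarded.
    """
    rng = random.Random(seed)
    pi = frac_diff_weights(d, trunc)
    y = []
    for t in range(n + burn_in):
        innovation = rng.gauss(0.0, 1.0)
        lags = min(t, trunc)
        ar_part = sum(pi[k] * y[t - k] for k in range(1, lags + 1))
        y.append(innovation - ar_part)
    return y[burn_in:]
```

For example, `simulate_fractional_noise(0.1, 2100, seed=1)` produces one series of the length used in this study; for d = 0 the weights beyond lag zero vanish and the output is plain white noise.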
In addition to the Monte Carlo simulations, relative forecast performance is also tested
for two series of actual financial returns, the S&P 500 and the DJIA. The full-sample
period for both series is 1/01/1969 to 30/04/2002, and comprises 8293 observations for
the S&P 500, and 8273 for the DJIA. In contrast to the Monte Carlo simulations, where the
underlying value of d is known a priori, d values for the S&P 500 and DJIA must first be
measured. Due to the variability in the estimates of the fractional differencing parameter
attributable to different test methodologies (Andersson, 1998), two distinct methods for the
estimation of d from the S&P 500 and DJIA data are employed in this study: the modified
profile likelihood (MPL) and the classical rescaled adjusted range. These tests are
described separately in Section 3.1.
To test the robustness of our findings to the time period employed, the S&P 500 and
DJIA full-sample sets are further divided into four contiguous subsamples: 1/01/1969 to
31/12/1976, 3/01/1977 to 31/12/1984, 2/01/1985 to 31/12/1992, and 4/01/1993 to 30/04/
2002. The last 100 observations for each of the sample series comprise the out-of-sample
forecast period, during which the relative forecast performance of each model is tested,
while the preceding observations are used to measure the value of d. Summary statistics
for each series, including estimated values for the fractional differencing parameter, are
provided in Table 3.
3.1. Estimating the fractional differencing parameter when d is unknown
Various techniques for the estimation of the fractional differencing parameter for
time-series data include a general class of ML techniques, the Geweke and Porter-Hudak
(1983) method (GPH), and the classical rescaled adjusted range (Hurst, 1951;
Mandelbrot & Van Ness, 1968). While the ML method may be considered
parametric in nature, the GPH technique is only semiparametric, and the rescaled
range nonparametric.
Under the assumption of normality of the function Yt, such that Yt ~ N(μ, Σ), the
log-likelihood of the ARFIMA specification in Eq. (4) may be given as

$$\log L(d, \phi, \psi, \beta, \sigma_e^2) = -\frac{T}{2}\log(2\pi) - \frac{1}{2}\log|\Sigma| - \frac{1}{2} z'\Sigma^{-1} z \tag{6}$$

where z = (Y − μ), Σ is a T × T covariance matrix, and μ, as given prior, is μ = f(X, β) (Hauser, 1999).
Derived by An and Bloomfield (1993) from the exact maximum likelihood (EML), the
MPL (ℓ_M), given stationary ARFIMA errors and f(X, β) = Xβ, is

$$\ell_M(d, \phi, \psi) = -\frac{T}{2}(1 + \log 2\pi) - \left(\frac{1}{2} - \frac{1}{T}\right)\log|R| - \frac{T - k - 2}{2}\log\left[T^{-1} z'R^{-1} z\right] - \frac{1}{2}\log|X'R^{-1}X| \tag{7}$$

Based on the suggestion by Cox and Reid (1987) that the presence of nuisance
parameters can bias the estimation of d when l is estimated from some set of exogenous
regressors, the MPL procedure has been shown by Hauser (1999) to be superior to the
EML for a variety of fractionally integrated series with finite sample lengths.
The classical rescaled adjusted range (R/σ)_n, as described by Mandelbrot and Van Ness
(1968), is calculated as

$$(R/\sigma)_n = \frac{1}{\sigma_n}\left[\max_{1 \le k \le n}\sum_{j=1}^{k}(X_j - M_n) - \min_{1 \le k \le n}\sum_{j=1}^{k}(X_j - M_n)\right] \tag{8}$$

where M_n is the sample mean, (1/n)Σ_j X_j, and σ_n is the series standard deviation,
given as

$$\sigma_n = \left[\frac{1}{n}\sum_{j=1}^{n}(X_j - M_n)^2\right]^{0.5} \tag{9}$$

For a given series of length N, the Mandelbrot and Van Ness procedure first requires
that Eqs. (8) and (9) be estimated over several subseries of length n ≤ N. Using ordinary
least squares regression, a global Hurst exponent, H, is then estimated. Following from
Hosking (1984), the corresponding values of the fractional differencing parameter, d, are
estimated from H using

$$H = d + 0.5, \qquad d \in (-0.5, 0.5) \tag{10}$$
One advantage of the classical rescaled range procedure over ML techniques in
estimating d from real time-series data concerns the treatment of unobservable values of Yt (for
t = 0, −1, −2, ...). Whereas ML techniques require the unknown values of Yt in Eq. (4) to
be replaced by some proxy, the classical rescaled adjusted range methodology does not
require knowledge of the unobserved values.¹ The estimation of d by the Hurst exponent
has been considered theoretically by Geweke and Porter-Hudak (1983), Hosking (1984), and,
more recently, Ellis (1999).
Estimates of d in this study, using the MPL procedure, were calculated using the
ARFIMA package 1.01 for Ox 3.00 (Doornik & Ooms, 2001). Estimates using the
classical rescaled range procedure were calculated using the authors’ own code.
3.2. Measuring the relative forecast performance of ARFIMA models
The relative forecast performance of the ARFIMA (0,d,0) model is compared to three
simple deterministic models (the last observed value, Yt; the mean of all observed past
values; and an AR(1)) and to a random walk model (et), over time horizons t + k periods ahead for
values of k = 1, 2, ..., 100. For every value of the fractional differencing parameter, 1000
simulations are performed for each of the four models. The length of each simulated series
is 2100 observations. Of these, the first 2000 observations are used to estimate the
model parameters.² The out-of-sample performance of each model is then tested over the
last 100 observations. Summary results are then reported for forecast horizons of
k = 1, 30, 60, and 90 periods ahead. Relative forecast performance is measured in terms
of the difference between both the absolute errors and the squared errors of the four
alternative models versus the ARFIMA model. That each model is tested independently is
significant, as this serves to eliminate the correlation between forecast errors from the
different models, which may bias the comparisons of the overall forecast performance. The
process by which comparative forecast performance is measured can be described as
follows.
Denote the forecast error attributable to the alternative model as e^1_{t+k} and the forecast
error for the null forecast model (ARFIMA) as e^2_{t+k}. Given a specified function, g(e), of
the forecast error, e, for a given pair of independent forecast errors, e^1_{t+k} and e^2_{t+k}, the null
hypothesis of the equality of expected forecast accuracy for levels in price is

$$\begin{aligned}
E\left[\,|P^1_{t+k} - P_{t+k}| - |P^2_{t+k} - P_{t+k}|\,\right] &= 0 \\
E\left[g(e^1_{t+k}) - g(e^2_{t+k})\right] &= 0
\end{aligned} \tag{11}$$

From Eq. (11), the null hypothesis is that there is expected to be no significant
difference between the out-of-sample performances of the null and alternate models
1 For a more in-depth discussion of the implications of replacement of Yt by various proxies, see Hosking
(1984).
2 In terms of the random walk model, the estimation of model parameters is replicated by simulating a set of
2000 independent and identically distributed variables. Consecutive k-step ahead forecasts are then randomly
generated from this set. Unlike each of the other models, the random walk requires no a priori information about
the past increments of the simulated series.
(Harvey, Leybourne, & Newbold, 1997). The mean difference in relative forecast
performance is denoted by

$$\begin{aligned}
\theta_n\{t+k\} &= g_n(e^1_{t+k}) - g_n(e^2_{t+k}) \\
\theta &= N^{-1}\sum_{n=1}^{N}\theta_n\{t+k\}
\end{aligned} \tag{12}$$

where the statistical significance of the observed value of θ can be estimated using
the familiar two-tailed t test for differences between two means. Following from the
null hypothesis in Eq. (12), the expected value of θ is θ = 0 when the null and
alternative models are equally accurate. As per Andersson (1998), the choice of
the significance test is motivated by the large number of simulations and by the
Central Limit Theorem.
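
The calculation in Eq. (12) can be sketched as below. This is a minimal illustration under stated assumptions rather than the procedure actually coded for the study: the significance test is read here as a t statistic on the mean of the N loss differentials (mean divided by its standard error), and the function name and argument names are hypothetical.

```python
import math


def relative_performance(alt_errors, arfima_errors, g=abs):
    """Mean loss differential, its standard error, and the t statistic.

    alt_errors    -- forecast errors of the alternative model (e^1)
    arfima_errors -- forecast errors of the ARFIMA null model (e^2)
    g             -- error function: abs for absolute errors, or a
                     squaring function for squared errors
    """
    diffs = [g(e1) - g(e2) for e1, e2 in zip(alt_errors, arfima_errors)]
    n = len(diffs)
    theta = sum(diffs) / n
    # Sample variance of the differentials (n - 1 in the denominator).
    variance = sum((x - theta) ** 2 for x in diffs) / (n - 1)
    std_error = math.sqrt(variance / n)
    return theta, std_error, theta / std_error
```

Passing `g=lambda e: e * e` gives the squared-error version. A positive mean indicates larger errors for the alternative model than for the ARFIMA model.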
To test for the sensitivity of our findings to the underlying form of the forecast error
specification function, g(e), two different error metrics are employed: the relative
difference between the absolute errors, |e^1_{t+k}| − |e^2_{t+k}|, and the relative difference between
the squared errors, (e^1_{t+k})^2 − (e^2_{t+k})^2, of the null and alternative models. A positive value of
θ indicates that the ARFIMA null model outperforms the alternative model on the basis of
the chosen error metric (i.e., relative absolute or relative squared errors). Conversely, a
negative θ value will show that the null model has lower forecast power than the
alternative model. The statistical significance of θ may be determined using standard
parametric tests. However, an important point to note is that if the variances of the
distributions of the null and alternative models are significantly different, the standard
error of the mean in Eq. (12) may be biased by the use of squared, as opposed to absolute,
errors. The problem may arise because squaring will exaggerate any outlying errors and
distort the size of the difference (e^1_{t+k})^2 − (e^2_{t+k})^2 if either of e^1_{t+k} or e^2_{t+k} is very large
(small) relative to the mean. The choice of both absolute and squared errors is therefore
justified as providing a test for the robustness of our findings to the underlying error
metric.
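
For concreteness, the k-step-ahead forecasts being compared might be generated along the following lines. This sketch rests on several assumptions that the paper does not confirm: the AR(1) coefficient is estimated by the lag-1 sample autocorrelation, the random walk forecast is taken to be the last observation plus a single standard normal draw, and the ARFIMA (0,d,0) forecast iterates the truncated autoregression of Eq. (5) with future innovations set to zero. All function names are hypothetical.

```python
import random


def forecast_last_value(history, k):
    """Last observed value, carried forward unchanged for any horizon k."""
    return history[-1]


def forecast_mean(history, k):
    """Mean of all observed past values, for any horizon k."""
    return sum(history) / len(history)


def forecast_ar1(history, k):
    """k-step AR(1) forecast: mu + phi**k * (Y_t - mu)."""
    mu = sum(history) / len(history)
    dev = [y - mu for y in history]
    # Assumption: phi is estimated by the lag-1 sample autocorrelation.
    phi = sum(a * b for a, b in zip(dev[1:], dev[:-1])) / sum(a * a for a in dev)
    return mu + phi ** k * dev[-1]


def forecast_random_walk(history, k, rng):
    """Assumption: last observation plus one standard normal draw."""
    return history[-1] + rng.gauss(0.0, 1.0)


def forecast_arfima_0d0(history, d, k, trunc=100):
    """k-step ARFIMA(0,d,0) forecast from the truncated AR form of Eq. (5)."""
    mu = sum(history) / len(history)
    path = [y - mu for y in history]
    pi = [1.0]
    for j in range(1, trunc + 1):
        pi.append(pi[-1] * (j - 1 - d) / j)  # pi_j = (-1)^j * binom(d, j)
    for _ in range(k):
        lags = min(trunc, len(path))
        # Future innovations are replaced by their expected value of zero.
        path.append(-sum(pi[j] * path[-j] for j in range(1, lags + 1)))
    return mu + path[-1]
```

With d = 0 the ARFIMA forecast collapses to the sample mean, which gives a simple check on the recursion.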
4. Comparative forecast performance
The relative forecast performance of the ARFIMA (0,d,0) specification versus the four
alternate models is examined using Monte Carlo simulation techniques and actual daily
returns for the S&P 500 and DJIA stock indices. The relative performance is measured
using the technique introduced by Harvey et al. (1997) as described in Eq. (12) for the
Monte Carlo simulated returns and in Eq. (11) for the S&P 500 and DJIA daily returns.
Results pertaining to the simulated data are described in two tables. The first table, Table 1,
provides a summary analysis of forecast performance using relative absolute errors. Table
2 provides the same analysis, but on the basis of relative squared errors. Values of θ in both
tables are each the mean of 1000 relative forecast errors. The values in parentheses
represent the standard error of the mean. Results pertaining to forecasts of S&P 500 and
Table 1
Summary analysis of relative absolute errors, |e^1_{t+k}| − |e^2_{t+k}|

            t + 1                                           t + 30
            Yt          Mean        AR(1)       et          Yt          Mean        AR(1)       et
d = −0.49   0.0326      −0.1012*    −0.1010     0.0679      0.1287*     −0.2016*    −0.2017*    −0.0323
            (0.0563)    (0.0510)    (0.0532)    (0.0548)    (0.0562)    (0.0512)    (0.0512)    (0.0547)
d = −0.1    −0.0493     −0.3428*    −0.3402*    −0.0224     0.0143      −0.2701*    −0.2701*    0.0200
            (0.0367)    (0.0312)    (0.0313)    (0.0361)    (0.0380)    (0.0309)    (0.0309)    (0.0365)
d = 0.0     −0.0091     −0.3091*    −0.3087*    −0.0629     0.0533      −0.2587*    −0.2586*    0.0406
            (0.0368)    (0.0314)    (0.0314)    (0.0375)    (0.0356)    (0.0288)    (0.0288)    (0.0355)
d = 0.1     −1.0474*    −1.4091*    −1.4136*    −1.1277*    −0.8117*    −1.1414*    −1.1414*    −0.8721*
            (0.0599)    (0.0567)    (0.0568)    (0.0598)    (0.0501)    (0.0465)    (0.0465)    (0.0507)
d = 0.49    0.1428*     −0.2822*    −0.2473*    −0.0372     −0.0172     −0.3473*    −0.3473*    −0.0625
            (0.0407)    (0.0332)    (0.0331)    (0.0390)    (0.0406)    (0.0344)    (0.0344)    (0.0402)

            t + 60                                          t + 90
            Yt          Mean        AR(1)       et          Yt          Mean        AR(1)       et
d = −0.49   0.2691*     −0.1457*    −0.1455*    −0.0363     0.2236*     −0.1932*    −0.1927*    −0.0306
            (0.0578)    (0.0531)    (0.0531)    (0.0524)    (0.0577)    (0.0529)    (0.0529)    (0.0542)
d = −0.1    0.0123      −0.2733*    −0.2733*    0.0153      −0.0151     −0.2513*    −0.2513*    −0.0405
            (0.0364)    (0.0301)    (0.0301)    (0.0378)    (0.0378)    (0.0304)    (0.0304)    (0.0354)
d = 0.0     0.0645      −0.2388*    −0.2389*    0.0011      −0.0497     −0.3038*    −0.3039*    −0.0067
            (0.0371)    (0.0313)    (0.0313)    (0.0362)    (0.0368)    (0.0306)    (0.0306)    (0.0378)
d = 0.1     −0.7673*    −1.0529*    −1.052*     −0.7407*    −0.7796*    −1.1116*    −1.1116*    −0.7472*
            (0.0522)    (0.0478)    (0.0478)    (0.0520)    (0.0539)    (0.0491)    (0.0491)    (0.0536)
d = 0.49    0.0004      −0.3644*    −0.3644*    −0.0541     0.0293      −0.3585*    −0.3585*    −0.0695
            (0.0436)    (0.0360)    (0.0360)    (0.0403)    (0.0430)    (0.0351)    (0.0351)    (0.0401)

* Indicates significantly different to zero at the .05 level.
Table 2
Summary analysis of relative squared errors, (e^1_{t+k})^2 − (e^2_{t+k})^2

            t + 1                                           t + 30
            Yt          Mean        AR(1)       et          Yt          Mean        AR(1)       et
d = −0.49   0.5278      −0.5175*    −0.2297     0.3645      0.7988*     −0.9069*    −0.9076*    −0.1483
            (0.2889)    (0.2288)    (0.2591)    (0.2678)    (0.2798)    (0.2321)    (0.2320)    (0.2508)
d = −0.1    −0.1558     −1.0906*    −1.0868*    −0.0601     0.0789      −0.9644*    −0.9644*    0.0492
            (0.1042)    (0.0811)    (0.0812)    (0.1029)    (0.1115)    (0.0820)    (0.0820)    (0.1050)
d = 0.0     −0.1007     −1.1033*    −1.1022*    −0.2601     0.1677      −0.8906*    −0.8905*    0.1480
            (0.1070)    (0.0841)    (0.0841)    (0.1071)    (0.1001)    (0.0729)    (0.0729)    (0.1015)
d = 0.1     −5.8439*    −7.0640*    −7.0701*    −6.1313*    −3.8781*    −4.9913*    −4.9913*    −4.0247*
            (0.3574)    (0.3511)    (0.3513)    (0.3568)    (0.2519)    (0.2449)    (0.2449)    (0.2536)
d = 0.49    0.6209*     −0.9278*    −0.8650*    −0.0581     −0.0468     −1.2679*    −1.2679*    −0.2768
            (0.1409)    (0.0963)    (0.0969)    (0.1216)    (0.1341)    (0.1046)    (0.1046)    (0.1311)

            t + 60                                          t + 90
            Yt          Mean        AR(1)       et          Yt          Mean        AR(1)       et
d = −0.49   1.4471*     −0.7129*    −0.7120*    −0.1471     1.1940*     −1.0637*    −1.0618*    −0.2340
            (0.2945)    (0.2285)    (0.2285)    (0.2369)    (0.2986)    (0.2350)    (0.2350)    (0.2476)
d = −0.1    0.0847      −0.9092*    −0.9091*    0.0917      −0.0054     −0.9306     −0.9306     −0.1357
            (0.1069)    (0.0768)    (0.0768)    (0.1078)    (0.1101)    (0.0809)    (0.0809)    (0.1018)
d = 0.0     0.1533      −0.9082*    −0.9083*    −0.0218     −0.1684     −1.0309*    −1.0309*    −0.0291
            (0.1097)    (0.0822)    (0.0822)    (0.0368)    (0.1079)    (0.0822)    (0.0822)    (0.1100)
d = 0.1     −3.7394*    −4.7461*    −4.7461*    −3.6496*    −3.9384*    −5.0159*    −5.0159*    −3.7923*
            (0.2509)    (0.2409)    (0.2409)    (0.2520)    (0.2756)    (0.2628)    (0.2628)    (0.2730)
d = 0.49    0.0709      −1.3377*    −1.3378*    −0.2414     0.1397      −1.2991*    −1.2990*    −0.3175
            (0.1533)    (0.1152)    (0.1152)    (0.1381)    (0.1483)    (0.1070)    (0.1070)    (0.1297)

* Indicates significantly different to zero at the .05 level.
DJIA daily returns are likewise considered using the relative absolute and relative squared
errors. Results from each set of tables are discussed in turn.
4.1. Relative forecast performance using Monte Carlo simulated returns
The relative forecast performance of the ARFIMA (0,d,0) specification is tested versus
four models: the last observed value, the mean of all values in the series, an AR(1), and a
random walk. Mean values of θ in Table 1 for both the last observed value (Yt)
and the random walk model (et) are generally insignificant for all underlying values of the
fractional differencing parameter and across all forecast horizons. Significance is tested at
the .05 level using a t test for the difference between two means. As stated previously, the
null hypothesis is that the mean value of θ is 0. The failure to reject the null hypothesis
implies that there is no statistical benefit in estimating forecasts using the ARFIMA (0,d,0)
specification over forecasts using either a simple random walk, with standard normally
distributed errors, or the last observed value.
The relative performance of the ARFIMA specification against both the mean model
and the AR(1) contrasts strongly with that just described for the random walk (et) and the last
observed value (Yt). The significant negative values of θ show that the ARFIMA model
underperforms both of these alternative models when tested on the basis of relative
absolute errors. The relative absolute errors are significant for all forecast horizons and all
underlying values of d, except for the AR(1) model at t + 1 periods ahead for d = −0.49.
The short forecast horizon results for the AR(1) model are consistent with those by Smith
and Yadav (1994) employing autoregressive lags up to p = 5, and by Ray (1993) for
autoregressive lags up to p = 20. The findings are significant in that they provide an
intuitive explanation for the apparent decline in power of the AR( p) forecast models over
long forecast horizons, which was noted by Smith and Yadav. Rather than diminish the
potential of the AR( p) model, the result highlights the necessity for the order of the
autoregressive lag ( p) to be at least as high as the length of the longest forecast horizon, if
the power of the model is to be maximized for all k-steps ahead.
Whereas Table 1 is based on absolute errors, Table 2 shows the relative difference in
forecast performance measured using the squared errors, (e^1_{t+k})^2 − (e^2_{t+k})^2. Consistent
with results already discussed for the absolute errors, |e^1_{t+k}| − |e^2_{t+k}|, results in Table 2
confirm the underperformance of the ARFIMA specification relative to both the mean
model and the AR(1). The failure of the ARFIMA specification to outperform either
the last observation (Yt) or the random walk model is similarly confirmed in the Table 2
results. That the significance of the relative errors is entirely consistent across both tables
supports the robustness of the findings to the choice of error metric (absolute or squared
error) employed.
The moments of the distribution of forecast errors can provide additional information
about forecast performance, above that gained from a simple analysis of the error metrics.³
For instance, a model may exhibit superior performance in the mean, but have a relatively
large variance. Higher variances are, of course, undesirable, as they increase the
probability of large forecast errors. Considering the first moment of the distribution of the
forecast errors, the mean and sum of forecast errors are generally negative for all five
models, given values of the fractional differencing parameter d < 0. However, for
values of d ≥ 0, the error means and sums tend to be positive. Mean forecast errors for the
ARFIMA model are only significantly different to zero for values of d = −0.49, 0, and
0.49 at forecast horizons of t + 90 periods ahead. All remaining positive and negative
values for the mean forecast error are insignificant, implying no significant difference in
the out-of-sample performance of the different models tested.
The analysis of the variance of forecast errors provides information about the tendency
for the various forecast models to produce large errors. Relative to the variance of errors of
the mean model and AR(1), the variance of ARFIMA forecast errors is larger at all
forecast horizons for all underlying values of d. ARFIMA forecast variances are also
generally higher than those produced by the random walk model. Compared with the
variance of errors attributable to forecasts based on the last observation (Yt), ARFIMA
variances are systematically lower for d = −0.49, yet are comparable for all other
underlying values of d.
The distribution of errors for all five models is mostly non-Gaussian due to their
slight leptokurtic nature, rather than due to skewness, which is insignificant for all
3 Summary statistics and normality tests of the distribution of forecast errors for each of the five models are
available from the authors by request.
values of d and lengths of forecast horizon. Two normality tests are conducted: the
Anderson-Darling (an empirical cumulative distribution test) and Jarque-Bera (a
chi-squared test). Both tests reject normality for all models, except at d = −0.49,
for which normality is accepted (α > .01) for all the models at forecast horizons of
t + 30, t + 60, and t + 90 periods ahead. Normality is also accepted for the distribution
of ARFIMA forecast errors at all forecast horizons for d = 0.1. However, caution
should be exercised in interpreting these test results because they are
sensitive to even small deviations from normality when the sample size is large.
Therefore, while there are no significant differences in the mean forecast errors of
the five models, significant differences in the variance of forecast errors suggest that
the ARFIMA specification has a higher tendency to produce larger forecast errors
than the other models.
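A minimal sketch of the Jarque-Bera statistic mentioned above is given below, using only the Python standard library. The function name `jarque_bera`, the seed and the simulated samples are illustrative assumptions and are not the forecast errors analysed in the paper. The p-value uses the fact that a chi-squared variable with two degrees of freedom has survival function exp(−x/2).

```python
import math
import random

def jarque_bera(x):
    """Return (JB statistic, asymptotic p-value, skewness, excess kurtosis)."""
    n = len(x)
    m = sum(x) / n
    # Central moments with an n denominator, as in the usual JB definition.
    m2 = sum((v - m) ** 2 for v in x) / n
    m3 = sum((v - m) ** 3 for v in x) / n
    m4 = sum((v - m) ** 4 for v in x) / n
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3.0
    jb = n / 6.0 * (skew ** 2 + excess_kurt ** 2 / 4.0)
    # Chi-squared with 2 degrees of freedom: P(X > x) = exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return jb, p_value, skew, excess_kurt

rng = random.Random(42)  # arbitrary seed, for repeatability only
gaussian = [rng.gauss(0.0, 1.0) for _ in range(2000)]
# Difference of two exponentials: symmetric but leptokurtic (Laplace-type).
fat_tailed = [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(2000)]

for label, sample in (("gaussian", gaussian), ("fat-tailed", fat_tailed)):
    jb, p, s, k = jarque_bera(sample)
    print(f"{label:10s} JB={jb:9.2f} p={p:.4f} skew={s:+.3f} kurt={k:+.3f}")
```

Because the statistic scales with the sample size n, even mild excess kurtosis produces a rejection in large samples, which is the sensitivity the text cautions about.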
4.2. Relative forecast performance using S&P 500 and DJIA daily returns
Summary statistics pertaining to the S&P 500 and DJIA daily returns [log(P_t/P_{t−1})]
for the full-sample period and each subsample period are given in Table 3. As indicated
by the mean returns and points gain, each series exhibits an increase in value over the
period being observed. Both S&P 500 and DJIA full-sample returns exhibit significant
positive kurtosis and some skewness. Returns for both data sets are also highly skewed,
with significant positive kurtosis during the period from 2/01/1985 to 31/12/1992;
however, this is largely due to the impact of the October 1987 stock-market crash. Using
runs tests for the independence of each series, the null hypothesis of independence cannot be
rejected for the S&P 500 and DJIA for the 2/01/1985 to 31/12/1992 and 4/01/1993 to
30/04/2002 subperiods, or for the DJIA for the 3/01/1977 to 31/12/1984
subperiod (see the runs-test P values in Table 3). Dickey-Fuller (DF) and Augmented Dickey-Fuller (ADF) test results for the
stationarity of each series are also presented in Table 3, and confirm that all the series are
(first-difference) stationary at the 1% level of significance. The DF test is conducted using
a lag length of zero. Following from Schwert (1989), the number of lagged dependent
variables used in the ADF test (denoted as p in Table 3) is estimated as p = [4(N/100)^0.25],
where N is the length of each series. For both the DF and ADF tests, the 1% critical value
is −3.96.
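As a small worked example of two calculations in this paragraph, the sketch below computes daily log returns and the Schwert lag length. The function names and the short price list are hypothetical, and reading the square brackets in the lag formula as "integer part" is an assumption. The sample sizes are the S&P 500 values reported in Table 3.

```python
import math

def log_returns(prices):
    """Daily returns log(P_t / P_{t-1}) from a list of price levels."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]

def schwert_lag(n):
    """Integer part of 4 * (n / 100) ** 0.25, with n the series length."""
    return int(4 * (n / 100) ** 0.25)

prices = [103.9, 104.5, 104.1, 105.0]  # hypothetical index levels
print(log_returns(prices))

for n in (8293, 1908, 2249):  # S&P 500 sample sizes from Table 3
    print(n, schwert_lag(n))
```

For the full S&P 500 sample (N = 8293) this gives 4 × 82.93^0.25 ≈ 12.07, so p = 12. The subsample lengths of 1908 and 2249 both give p = 8.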
The estimated values of the fractional differencing parameter for each series are also
provided in Table 3. Consistent with Andersson (1998), results attributable to the
classical rescaled range are different from those obtained using the MPL technique.
Specifically, the classical estimates are consistently lower than those using the MPL
method when d is positive, yet, not when d is negative. Both methods, however, agree on
the sign of d, positive or negative, for each of the series under observation. Comparing
the estimates of d, using the classical techniques, to their expected value, E(d)—the latter
estimated via bootstrap—shows significant levels of positive dependence (d>0) in the full
sample and the first two subsamples for both the S&P 500 and DJIA.4 Estimated t
4 Following the bootstrap methodology suggested by Ambrose, Ancel, and Griffiths (1992), each sample is
scrambled 1000 times. Values of d are then calculated for each iteration, and their mean and standard deviation
used to estimate upper and lower confidence intervals.
Table 3
S&P 500 and DJIA summary statistics and d values

Start–finish              1/01/69–30/04/02  1/01/69–31/12/76  3/01/77–31/12/84  2/01/85–31/12/92  4/01/93–30/04/02

S&P 500
Sample size               8293              1908              1915              1918              2249
Starting value            103.9             103.9             107.0             165.4             435.4
Points gain (loss)        973.1             3.6               60.2              270.3             641.5
Mean return               2.79E−04          1.70E−05          2.19E−04          4.74E−04          3.85E−04
Standard error            1.08E−04          2.08E−04          1.92E−04          2.47E−04          2.12E−04
Standard deviation        0.0099            0.0093            0.0086            0.0111            0.0103
Skewness                  −1.5690           0.2972            0.2504            −4.6753           −0.2923
Kurtosis                  38.1848           2.4885            1.6101            95.1824           4.3803
Minimum                   −0.2283           −0.0404           −0.0405           −0.2283           −0.0711
Maximum                   0.0871            0.0490            0.0465            0.0871            0.0499
Runs (P value)            .000              .0000             .0003             .7674             .5090
DF (0)                    −84.400           −35.308           −39.029           −42.278           −47.169
ADF (p)                   −25.507           −15.291           −14.196           −14.720           −15.845
d Value (classical)       0.01641           0.11934           0.05529           −0.02786          −0.01320
E(d)                      0.00205           0.01877           0.01560           0.02061           0.01295
d Value [MPL] (t value)   0.03162 (3.40)    0.13210 (6.28)    0.07658 (3.88)    −0.00891 (−0.47)  −0.02379 (−1.39)
d Value [no dummy] (t value), where applicable: 0.03141 (3.42); −0.02458 (−1.43)

DJIA
Sample size               8273              1906              1909              1913              2242
Starting value            943.8             943.8             999.8             1198.9            3309.2
Points gain (loss)        9002.5            60.9              211.8             2102.3            6637.0
Mean return               2.81E−04          3.12E−05          9.32E−05          4.98E−04          4.71E−04
Standard error            1.11E−04          2.13E−04          2.00E−04          2.63E−04          2.09E−04
Standard deviation        0.0102            0.0096            0.0090            0.0118            0.0101
Skewness                  −2.0084           0.2867            0.3962            −5.2913           −0.5501
Kurtosis                  52.7647           1.6743            1.5821            116.7042          5.2503
Minimum                   −0.2563           −0.0357           −0.0359           −0.2563           −0.0745
Maximum                   0.0967            0.0495            0.0478            0.0967            0.0486
Runs (P value)            .0008             .0000             .3273             .3400             .7719
DF (0)                    −84.464           −34.206           −40.600           −43.060           −46.258
ADF (p)                   −25.610           −15.185           −14.764           −14.725           −14.706
d Value (classical)       0.01157           0.09715           0.02454           −0.01855          −0.03074
E(d)                      0.00211           0.01948           0.02184           0.01994           0.01329
d Value [MPL] (t value)   0.02882 (3.12)    0.14614 (6.67)    0.05048 (2.62)    −0.02279 (−1.22)  −0.00852 (−0.49)
d Value [no dummy] (t value), where applicable: 0.02878 (3.12); −0.00815 (−0.47)
values attributable to the MPL technique similarly show significant levels of positive
dependence for these series. To test the robustness of these findings to potential biases
induced by the October 1987 crash, the MPL d value estimates have also been
Table 4
S&P 500 and DJIA summary analysis of relative absolute errors, |e^1_{t+k}| − |e^2_{t+k}|, using MPL estimates of d

                 t + 1                               t + 30
                 Yt     Mean    AR(1)       et       Yt     Mean    AR(1)       et
1/01/69–30/04/02
S&P 500      0.0110  −0.0018   0.0093   1.3188  −1.0556  −1.0622  −1.0564   0.1359
DJIA        −0.0735  −0.0715  −0.0719   1.2779  −1.5100  −1.5080  −1.5093  −0.3622
1/01/69–31/12/76
S&P 500     −0.9037  −0.9008  −0.9079   0.4157  −0.4834  −0.4805  −0.4853  −0.4728
DJIA        −1.5158  −1.5132  −1.5133  −0.1964  −1.0810  −1.0784  −1.0756  −1.0825
3/01/77–31/12/84
S&P 500     −1.1040  −1.0811  −1.1035   0.2644  −0.5868  −0.5638  −0.5854   0.1918
DJIA        −0.7927  −0.7855  −0.7936   0.6065  −1.3089  −1.3017  −1.3108  −0.4986
2/01/85–31/12/92
S&P 500     −1.4329  −1.4283  −1.4328  −0.4748  −1.2903  −1.2949  −1.2907  −0.3336
DJIA        −0.9021  −0.9010  −0.9017   0.1197  −1.3680  −1.3669  −1.3680   0.1506
4/01/93–30/04/02
S&P 500     −1.5320  −1.5446  −1.5320  −1.1387  −0.6744  −0.6808  −0.6743  −0.0388
DJIA        −1.2585  −1.2567  −1.2586  −0.9113  −0.4588  −0.4570  −0.4591   0.7251

                 t + 60                              t + 90
                 Yt     Mean    AR(1)       et       Yt     Mean    AR(1)       et
1/01/69–30/04/02
S&P 500     −0.6055  −0.6183  −0.6069   0.0498  −1.0591  −1.0719  −1.0608  −0.4560
DJIA        −1.4227  −1.4247  −1.4241  −0.7802  −0.2704  −0.2723  −0.2720   0.3327
1/01/69–31/12/76
S&P 500     −0.4217  −0.4246  −0.4180   1.0130  −1.0790  −1.0819  −1.0746   0.0450
DJIA        −0.0431  −0.0457  −0.0374   1.3947  −0.5074  −0.5099  −0.5030   0.6166
3/01/77–31/12/84
S&P 500     −1.0840  −1.0611  −1.0837   0.3299  −0.3073  −0.2946  −0.3064   0.1471
DJIA        −1.2118  −1.2046  −1.2107   0.2154  −1.3531  −1.3459  −1.3526  −1.2912
2/01/85–31/12/92
S&P 500     −1.3566  −1.3611  −1.3565  −1.0154  −0.7258  −0.7270  −0.7256   0.9620
DJIA        −0.8396  −0.8385  −0.8395  −0.0370  −1.6192  −1.6181  −1.6191  −0.3847
4/01/93–30/04/02
S&P 500     −1.7182  −1.7308  −1.7181   0.0426  −1.3620  −1.3747  −1.3620  −0.8843
DJIA        −0.3681  −0.3699  −0.3680   1.2706  −0.5227  −0.5245  −0.5225  −0.0066
calculated using an ‘October dummy’ variable where applicable. As noted in Table 3, d
values with and without the inclusion of the dummy variable are consistent. Overall,
the results of both test methodologies confirm the presence of positive dependence in
the full sample and the first two subsamples for both the S&P 500 and DJIA, and
reject the presence of negative dependence in the latter two subsamples for both data
sets.
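The scrambling bootstrap described in footnote 4 can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the classical rescaled range is computed over non-overlapping blocks of doubling length, d is taken as the regression slope minus 0.5 (the usual relation d = H − 0.5 for an ARFIMA(0,d,0) process), and the function names, block sizes, seed and simulated series are all hypothetical.

```python
import math
import random

def rescaled_range(block):
    """Classical R/S: range of cumulative mean deviations over the std dev."""
    n = len(block)
    m = sum(block) / n
    cum = lo = hi = ss = 0.0
    for v in block:
        cum += v - m
        lo, hi = min(lo, cum), max(hi, cum)
        ss += (v - m) ** 2
    sd = math.sqrt(ss / n)
    return (hi - lo) / sd if sd > 0 else None

def estimate_d(series, min_block=8):
    """Slope of log(mean R/S) on log(block length), minus 0.5 (assumed d = H - 0.5)."""
    xs, ys = [], []
    n = min_block
    while n <= len(series) // 2:
        rs = [rescaled_range(series[i:i + n])
              for i in range(0, len(series) - n + 1, n)]
        rs = [r for r in rs if r]  # drop degenerate blocks
        if rs:
            xs.append(math.log(n))
            ys.append(math.log(sum(rs) / len(rs)))
        n *= 2
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope - 0.5

def bootstrap_expected_d(series, n_iter=1000, seed=0):
    """Scramble the series n_iter times; return mean and std dev of d."""
    rng = random.Random(seed)
    work = list(series)
    ds = []
    for _ in range(n_iter):
        rng.shuffle(work)  # destroys any dependence, keeps the distribution
        ds.append(estimate_d(work))
    mean_d = sum(ds) / n_iter
    sd_d = math.sqrt(sum((d - mean_d) ** 2 for d in ds) / (n_iter - 1))
    return mean_d, sd_d

rng = random.Random(1)
returns = [rng.gauss(0.0, 0.01) for _ in range(512)]  # simulated, not index data
d_hat = estimate_d(returns)
e_d, sd_d = bootstrap_expected_d(returns, n_iter=200)
print(f"d = {d_hat:.4f}, E(d) = {e_d:.4f}, z = {(d_hat - e_d) / sd_d:.2f}")
```

Comparing the estimate with the bootstrap mean, rather than with zero, allows for the small-sample bias of the rescaled range, which is why E(d) in Table 3 is slightly positive for every sample.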
The relative forecast performance of the ARFIMA null model for multistep-ahead
forecasts of S&P 500 and DJIA daily returns is shown in four tables. Tables 4 and 5
provide a summary analysis of the relative absolute and relative squared forecast errors,
respectively, for ARFIMA forecasts, based upon estimated values of d using the MPL
Table 5
S&P 500 and DJIA summary analysis of relative squared errors, (e^1_{t+k})^2 − (e^2_{t+k})^2, using MPL estimates of d

                 t + 1                               t + 30
                 Yt     Mean    AR(1)       et       Yt     Mean    AR(1)       et
1/01/69–30/04/02
S&P 500      0.0004   0.0000   0.0003   1.7678  −3.1598  −3.1596  −3.1598  −1.1723
DJIA        −0.0068  −0.0067  −0.0067   1.8445  −1.2613  −1.2613  −1.2613   0.6813
1/01/69–31/12/76
S&P 500     −0.8327  −0.8326  −0.8327   0.9316  −0.0515  −0.0515  −0.0515   0.3736
DJIA        −2.3283  −2.3283  −2.3283  −0.5608  −1.9069  −1.9069  −1.9069  −1.4862
3/01/77–31/12/84
S&P 500     −1.2210  −1.2204  −1.2210   0.6543  −0.4351  −0.4346  −0.4351   1.5225
DJIA        −0.6648  −0.6644  −0.6648   1.3573  −0.8914  −0.8914  −0.8914   0.3654
2/01/85–31/12/92
S&P 500     −2.0554  −2.0554  −2.0554  −1.1361  −0.5856  −0.5856  −0.5856  −0.2132
DJIA        −0.8182  −0.8182  −0.8182   0.2309  −1.0016  −1.0016  −1.0016  −0.2740
4/01/93–30/04/02
S&P 500     −2.4132  −2.4136  −2.4132  −2.2415  −0.8639  −0.8636  −0.8639  −0.8208
DJIA        −1.6075  −1.6075  −1.6075  −1.4804  −0.3909  −0.3909  −0.3909   0.4841

                 t + 60                              t + 90
                 Yt     Mean    AR(1)       et       Yt     Mean    AR(1)       et
1/01/69–30/04/02
S&P 500     −1.6720  −1.6717  −1.6720   0.0137  −0.2285  −0.2279  −0.2285   1.1167
DJIA        −1.0067  −1.0068  −1.0067   0.7563  −0.5710  −0.5710  −0.5710   0.7410
1/01/69–31/12/76
S&P 500     −0.1254  −0.1253  −0.1254   2.0736  −2.3850  −2.3850  −2.3850  −2.0991
DJIA        −0.7525  −0.7525  −0.7525   1.4501  −0.1793  −0.1793  −0.1793   0.1096
3/01/77–31/12/84
S&P 500     −0.7169  −0.7161  −0.7169   0.5427  −1.1874  −1.1864  −1.1874  −0.8217
DJIA        −0.9582  −0.9581  −0.9582  −0.6062  −1.0964  −1.0964  −1.0964  −0.4876
2/01/85–31/12/92
S&P 500     −1.8111  −1.8111  −1.8111  −0.7634  −0.7893  −0.7893  −0.7893  −0.0566
DJIA        −0.0158  −0.0158  −0.0158   0.2532  −0.2617  −0.2617  −0.2617   1.6195
4/01/93–30/04/02
S&P 500     −1.4579  −1.4575  −1.4579   0.6062  −1.1781  −1.1775  −1.1781  −1.0561
DJIA        −1.6505  −1.6506  −1.6505  −1.0784  −1.1823  −1.1823  −1.1823   1.1098
technique. Tables 6 and 7 provide a like analysis for ARFIMA forecasts based upon
classical estimates of d.
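The relative absolute and relative squared errors that head Tables 4 to 7 can be computed as in the sketch below. The error series and function names are hypothetical, and the plain t statistic on the mean differential is only a stand-in: the h_n(t + k) statistic actually reported in the tables is defined earlier in the paper and is not reproduced here. Which model is labelled 1 and which is labelled 2 is likewise left open.

```python
import math

def relative_errors(e1, e2):
    """Per-period |e1| - |e2| and e1^2 - e2^2 for two forecast error series."""
    abs_diff = [abs(a) - abs(b) for a, b in zip(e1, e2)]
    sq_diff = [a * a - b * b for a, b in zip(e1, e2)]
    return abs_diff, sq_diff

def mean_t_stat(diffs):
    """t statistic for H0: the mean differential is zero (illustrative stand-in)."""
    n = len(diffs)
    m = sum(diffs) / n
    var = sum((d - m) ** 2 for d in diffs) / (n - 1)
    return m / math.sqrt(var / n)

# Hypothetical k-step-ahead forecast errors from two competing models.
errors_1 = [0.01, -0.02, 0.03, -0.01]
errors_2 = [0.02, -0.03, 0.05, -0.03]

abs_diff, sq_diff = relative_errors(errors_1, errors_2)
print("relative absolute errors:", [round(d, 4) for d in abs_diff])
print("relative squared errors: ", [round(d, 6) for d in sq_diff])
print("t (absolute):", round(mean_t_stat(abs_diff), 3))
print("t (squared): ", round(mean_t_stat(sq_diff), 3))
```

A negative differential means that series 1 has the smaller error in that period, so a consistently negative statistic favours the model labelled 1.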
Consistent with results already discussed for the Monte Carlo simulations with
lower levels of positive dependence (i.e., d = 0.1), the ARFIMA specification is
shown to largely underperform each of the four alternate models. Except for some
positive values, attributable to random walk forecasts of S&P 500 and DJIA daily
returns, h_n(t + k) values in all the tables are negative. That there are also no significant
differences in the error metrics (i.e., the relative absolute errors pertaining to the
Table 6
S&P 500 and DJIA summary analysis of relative absolute errors, |e^1_{t+k}| − |e^2_{t+k}|, using classical estimates of d

                 t + 1                               t + 30
                 Yt     Mean    AR(1)       et       Yt     Mean    AR(1)       et
1/01/69–30/04/02
S&P 500     −0.0428  −0.0299  −0.0316   1.2779  −1.5098  −1.5032  −1.5040  −0.3117
DJIA        −0.0715  −0.0735  −0.0719   1.2779  −1.5080  −1.5100  −1.5093  −0.3622
1/01/69–31/12/76
S&P 500     −0.4715  −0.4744  −0.4786   0.8450  −0.4670  −0.4699  −0.4718  −0.4593
DJIA        −0.2760  −0.2786  −0.2760   1.0408  −0.9686  −0.9712  −0.9658  −0.9727
3/01/77–31/12/84
S&P 500     −0.7817  −0.8047  −0.8042   0.5638  −0.5169  −0.5398  −0.5385   0.2387
DJIA        −0.3602  −0.3673  −0.3683   1.0318  −0.2566  −0.2638  −0.2657   0.5465
2/01/85–31/12/92
S&P 500     −0.9212  −0.9257  −0.9256   0.0323  −0.2217  −0.2171  −0.2176   0.7396
DJIA        −1.3743  −1.3753  −1.3750  −0.3536  −1.3846  −1.3857  −1.3857   0.1330
4/01/93–30/04/02
S&P 500     −0.8795  −0.8668  −0.8668  −0.4736  −0.6035  −0.5971  −0.5971   0.0385
DJIA        −1.6712  −1.6730  −1.6730  −1.3257  −1.1078  −1.1095  −1.1098   0.0744

                 t + 60                              t + 90
                 Yt     Mean    AR(1)       et       Yt     Mean    AR(1)       et
1/01/69–30/04/02
S&P 500     −1.4701  −1.4573  −1.4588  −0.8020  −0.3125  −0.2997  −0.3014   0.3034
DJIA        −1.4247  −1.4227  −1.4241  −0.7802  −0.2723  −0.2704  −0.2720   0.3327
1/01/69–31/12/76
S&P 500     −0.4381  −0.4352  −0.4315   0.9995  −0.3665  −0.3636  −0.3592   0.7605
DJIA        −1.3061  −1.3035  −1.2977   0.1343  −1.4197  −1.4172  −1.4128  −0.2932
3/01/77–31/12/84
S&P 500      0.0236   0.0007   0.0010   1.4146  −0.8618  −0.8745  −0.8736  −0.4201
DJIA        −1.5608  −1.5680  −1.5669  −0.1408  −0.6465  −0.6537  −0.6532  −0.5918
2/01/85–31/12/92
S&P 500     −1.4402  −1.4357  −1.4357  −1.0945  −0.8266  −0.8253  −0.8251   0.8625
DJIA        −0.2869  −0.2880  −0.2879   0.5146  −1.1078  −1.1089  −1.1088   0.1256
4/01/93–30/04/02
S&P 500     −0.9057  −0.8930  −0.8930   0.8677  −0.4731  −0.4604  −0.4604   0.0173
DJIA        −1.5613  −1.5596  −1.5595   0.0791  −0.9151  −0.9133  −0.9131  −0.3972
MPL estimates of d are similar to the relative absolute errors pertaining to the
classical estimates and, likewise, for relative squared errors) further implies that the
choice of method for estimating the value of d, ultimately, has little or no bearing on
out-of-sample forecast accuracy. While alternative methods for estimating d indeed
yield different values for the fractional differencing parameter, this result implies that
there are only limited practical consequences that may arise when varying estimates
of d are found for the same data sample. Overall, the findings with respect to the
S&P 500 and DJIA daily returns data confirm the poor ability of the pure fractional
Table 7
S&P 500 and DJIA summary analysis of relative squared errors, (e^1_{t+k})^2 − (e^2_{t+k})^2, using classical estimates of d

                 t + 1                               t + 30
                 Yt     Mean    AR(1)       et       Yt     Mean    AR(1)       et
1/01/69–30/04/02
S&P 500     −0.0026  −0.0022  −0.0023   1.7652  −2.2890  −2.2889  −2.2889  −0.8461
DJIA        −0.0067  −0.0068  −0.0067   1.8445  −2.3460  −2.3461  −2.3461  −0.9785
1/01/69–31/12/76
S&P 500     −0.2334  −0.2334  −0.2335   1.5308  −0.2340  −0.2341  −0.2341  −0.2337
DJIA        −0.0832  −0.0832  −0.0832   1.6843  −0.9823  −0.9824  −0.9821  −0.9825
3/01/77–31/12/84
S&P 500     −0.6485  −0.6490  −0.6490   1.2263  −0.3023  −0.3033  −0.3033   0.3200
DJIA        −0.1515  −0.1518  −0.1519   1.8702  −0.0710  −0.0711  −0.0711   0.5901
2/01/85–31/12/92
S&P 500     −0.8584  −0.8584  −0.8584   0.0609  −0.0526  −0.0525  −0.0525   0.8865
DJIA        −1.8984  −1.8984  −1.8984  −0.8493  −1.9551  −1.9552  −1.9552   0.3896
4/01/93–30/04/02
S&P 500     −0.7893  −0.7889  −0.7889  −0.6172  −0.3680  −0.3680  −0.3679   0.0482
DJIA        −2.8303  −2.8303  −2.8303  −2.7032  −1.2799  −1.2800  −1.2800   0.1740

                 t + 60                              t + 90
                 Yt     Mean    AR(1)       et       Yt     Mean    AR(1)       et
1/01/69–30/04/02
S&P 500     −2.1795  −2.1792  −2.1793  −1.7249  −0.1039  −0.1035  −0.1036   0.2878
DJIA        −2.0964  −2.0963  −2.0964  −1.6508  −0.0840  −0.0840  −0.0840   0.3039
1/01/69–31/12/76
S&P 500     −0.1994  −0.1994  −0.1993   1.8920  −0.1345  −0.1345  −0.1345   1.1363
DJIA        −1.7251  −1.7250  −1.7249   0.3709  −2.0157  −2.0157  −2.0157  −0.7465
3/01/77–31/12/84
S&P 500      0.0006   0.0000   0.0000   2.0013  −0.7734  −0.7737  −0.7737  −0.5626
DJIA        −2.4833  −2.4835  −2.4835  −0.4238  −0.4349  −0.4351  −0.4351  −0.4305
2/01/85–31/12/92
S&P 500     −2.0819  −2.0818  −2.0818  −1.9605  −0.6860  −0.6860  −0.6860   2.1725
DJIA        −0.0850  −0.0850  −0.0850   0.5649  −1.2506  −1.2506  −1.2506   0.2968
4/01/93–30/04/02
S&P 500     −0.8315  −0.8312  −0.8312   2.3354  −0.2333  −0.2329  −0.2329   0.0170
DJIA        −2.5108  −2.5108  −2.5108   0.2571  −0.8705  −0.8705  −0.8705  −0.5836
ARFIMA model to produce superior forecasts, as is shown for the Monte Carlo
simulated series, for which the value of d is known.
5. Summary
Under the assumptions that asset returns are independent and follow a Gaussian
distribution, information about past price changes should not be used to successfully
forecast future returns. Rather, the best unbiased estimate of price changes during the next
period should be the current period’s price change. However, recent empirical analysis
suggests that a variety of financial time series exhibit nonlinear dependence. An
implication of these findings is that such dependence can be efficiently modelled and
forecast.
This study examines the out-of-sample forecast performance of the ARFIMA (0,d,0)
specification, relative to various simple deterministic and stochastic models. Monte Carlo
techniques are used to simulate ARFIMA (0,d,0) processes with known levels of positive
and negative dependence, to which the forecast models are then applied. Using a variety of measures of
forecast performance, the ARFIMA specification fails to outperform forecasts derived
from either the simple average or AR(1) models, and performs only as well as a forecast
based on the last observed value or a random walk model. This result holds irrespective of the
sign and magnitude of the underlying dependence in the simulated series, although the
statistical significance of the difference in relative forecast performance is generally higher
at longer forecast horizons.
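For readers who wish to experiment with this kind of exercise, the sketch below simulates an ARFIMA (0,d,0) series and forms a one-step-ahead forecast from it. It is a minimal illustration, not the procedure used in this study: the truncated moving-average simulation, the truncation length, the seed and all function names are assumptions. The weights follow from the binomial expansions of (1 − L)^(−d) and (1 − L)^d.

```python
import random

def ma_weights(d, m):
    """psi_k of (1 - L)^(-d): psi_0 = 1, psi_k = psi_{k-1} * (k - 1 + d) / k."""
    psi = [1.0]
    for k in range(1, m):
        psi.append(psi[-1] * (k - 1 + d) / k)
    return psi

def ar_weights(d, m):
    """pi_k of (1 - L)^d: pi_0 = 1, pi_k = pi_{k-1} * (k - 1 - d) / k."""
    pi = [1.0]
    for k in range(1, m):
        pi.append(pi[-1] * (k - 1 - d) / k)
    return pi

def simulate_arfima_0d0(d, n, trunc=500, seed=0):
    """Approximate y_t = sum_k psi_k * eps_{t-k}, truncated at `trunc` terms."""
    rng = random.Random(seed)
    psi = ma_weights(d, trunc)
    eps = [rng.gauss(0.0, 1.0) for _ in range(n + trunc)]
    return [sum(psi[k] * eps[t - k] for k in range(trunc))
            for t in range(trunc, n + trunc)]

def forecast_one_step(history, d, trunc=500):
    """One-step forecast from the truncated AR form: -sum_{k>=1} pi_k * y_{t+1-k}."""
    pi = ar_weights(d, min(trunc, len(history)) + 1)
    return -sum(pi[k] * history[-k] for k in range(1, len(pi)))

d = 0.3  # illustrative value inside the stationary range |d| < 0.5
series = simulate_arfima_0d0(d, n=400)
train, actual = series[:-1], series[-1]
print("ARFIMA forecast:", round(forecast_one_step(train, d), 4))
print("mean forecast:  ", round(sum(train) / len(train), 4))
print("actual value:   ", round(actual, 4))
```

Repeating the last three lines over many simulated series and several horizons, and collecting the forecast errors of each model, gives the raw material for the error comparisons summarised above.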
Expanding the study to consider the relative forecast performance of the ARFIMA
specification using two sets of actual financial time-series returns, the poor out-of-sample
performance of the ARFIMA (0,d,0) is confirmed. Examining daily returns for the S&P
500 and DJIA with varying moderate levels of dependence, the ARFIMA model is unable
to outperform any of the alternate linear models or random walk on a consistent basis.
Given the computational complexity of the ARFIMA specification, this poor result may
lead us to question the applicability of the model. However, the fact that the simulated
series used in this study were, by constraint of the values of d used, all stationary should
not be overlooked.
Acknowledgements
Earlier versions of this paper were presented at the Twelfth Annual PACAP/FMA
Finance Conference and the 2003 FMA Annual Meeting. The authors would like to
gratefully acknowledge helpful comments received from participants at both meetings.
References
Ambrose, B. W., Ancel, E. W., & Griffiths, M. D. (1992). The fractal structure of real estate investment trust
returns: The search for evidence of market segmentation and non-linear dependency. Journal of the American
Real Estate and Urban Economics Association, 20(1), 25–54.
An, S., & Bloomfield, P. (1993). Cox and Reid’s modification in regression models with correlated errors.
Discussion Paper, Department of Statistics, North Carolina State University.
Andersson, M. K. (1998). On the effects of imposing or ignoring long memory when forecasting. Working Paper
Series in Economics and Finance No. 225, Stockholm School of Economics.
Barkoulas, J. T., & Baum, C. F. (1997). Fractional differencing modeling and forecasting of Eurocurrency deposit
rates. Journal of Financial Research, 20(3), 355–372.
Barkoulas, J. T., & Baum, C. F. (1998). Fractional dynamics in Japanese financial time series. Pacific-Basin
Finance Journal, 6, 115–124.
Barkoulas, J. T., Labys, W. C., & Onochie, J. I. (1999). Long memory in futures prices. Financial Review, 34,
91–100.
Bhansali, R. J., & Kokoszka, P. S. (2002). Prediction of long-memory time series: An overview. In P. Doukhan,
G. Oppenheim, & M. Taqqu (Eds.), Long range dependence: Theory and applications. Massachusetts:
Birkhäuser.
Cheung, Y. W. (1993). Long memory in foreign exchange-rates. Journal of Business and Economic Statistics,
11(1), 93–101.
Chung, C. F. (1994). A note on calculating the autocovariances of the fractionally integrated ARMA models.
Economics Letters, 45, 293–297.
Cox, D. R., & Reid, N. (1987). Parameter orthogonality and approximate conditional inference (with discussion).
Journal of the Royal Statistical Society. Series B, Statistical Methodology, 49, 1–39.
Davidson, D. (2002). A model of fractional cointegration, and tests for cointegration using the bootstrap. Journal
of Econometrics, 110, 187–212.
Diebold, F. X., & Rudebusch, G. D. (1989). Long memory and persistence in aggregate output. Journal of
Monetary Economics, 24(2), 189–209.
Doornik, J. A., & Ooms, M. (2001). A package for estimating, forecasting and simulating ARFIMA models:
Arfima package 1.01 for Ox. Discussion paper, Nuffield College, Oxford.
Ellis, C. (1999). Estimation of the ARFIMA ( p, d, q) fractional differencing parameter (d) using the classical
rescaled adjusted range technique. International Review of Financial Analysis, 8(1), 53–65.
Fang, H., Lai, K. S., & Lai, M. (1994). Fractal structure in currency futures price dynamics. Journal of Futures
Markets, 14(2), 169–181.
Geweke, J., & Porter-Hudak, S. (1983). The estimation and application of long memory time series models.
Journal of Time Series Analysis, 4(4), 221–238.
Granger, C. W. J., & Joyeux, R. (1980). An introduction to long-memory time series models and fractional
differencing. Journal of Time Series Analysis, 1, 15–39.
Harvey, D., Leybourne, S., & Newbold, P. (1997). Testing the equality of mean square prediction errors.
International Journal of Forecasting, 13, 281–291.
Hauser, M. (1999). Maximum likelihood estimators for ARMA and ARFIMA models: A Monte Carlo study.
Journal of Statistical Planning and Inference, 80, 229–255.
Hosking, J. R. M. (1984). Modeling persistence in hydrological time series using fractional differencing. Water
Resources Research, 20(12), 1898–1908.
Hurst, H. E. (1951). Long term storage capacity of reservoirs. Transactions of the American Society of Civil
Engineers, 116, 770–799.
Mandelbrot, B. B., & Van Ness, J. W. (1968). Fractional Brownian motions, fractional noises and applications.
SIAM Review, 10(4), 422–437.
Martin, V. L., & Wilkins, N. P. (1999). Indirect estimation of ARFIMA and VARFIMA models. Journal of
Econometrics, 93, 149–175.
Okunev, J., & Wilson, P. (1997). Using nonlinear tests to examine integration between real estate and equity
markets. Real Estate Economics, 25(3), 487–503.
Ray, B. K. (1993). Modelling long-memory processes for optimal long-range prediction. Journal of Time Series
Analysis, 14, 511–525.
Reisen, V. A., & Lopes, S. (1999). Some simulations and applications of forecasting long-memory time-series
models. Journal of Statistical Planning and Inference, 80, 269–287.
Schwert, G. W. (1989). Tests for unit roots: A Monte Carlo investigation. Journal of Business and Economic
Statistics, 7, 147–159.
Smith, J., & Yadav, S. (1994). Forecasting costs incurred from unit differencing fractionally integrated processes.
International Journal of Forecasting, 10, 507–514.