
Modeling Volatility of S&P 500 Index Daily Returns:

A comparison between model based forecasts and implied volatility

Huang Kun

Department of Finance and Statistics

Hanken School of Economics

Vasa

2011

HANKEN SCHOOL OF ECONOMICS

Department of: Finance and Statistics Type of work: Thesis

Author: Huang Kun Date: April, 2011

Title of thesis:

Modeling Volatility of S&P 500 Index Daily Returns: A comparison between model

based forecasts and implied volatility

Abstract:

The objective of this study is to investigate the predictability of model based

forecasts and the VIX index on forecasting future volatility of S&P 500 index daily

returns. The study period is from January 1990 to December 2010, including 5291

observations.

A variety of time series models were estimated, including random walk model,

GARCH (1,1), GJR(1,1) and EGARCH (1,1) models. The study results indicate that GJR

(1,1) outperforms other time series models for out-of-sample forecasting. The forecast

performance of VIX, GJR(1,1) and RiskMetrics were compared using various

approaches. The empirical evidence does not support the view that implied volatility

subsumes all information content, and the study results provide strong evidence

indicating that GJR (1,1) outperforms VIX and RiskMetrics for modeling future

volatility of S&P 500 index daily returns.

Additionally, the results of the encompassing regression for future realized

volatility at 5-, 10-, 15-, 30- and 60-day horizons, and the results of the encompassing

regression for squared return shocks suggest that the joint use of GJR (1,1) and

RiskMetrics can produce the best forecasts.

By and large, our findings indicate that implied volatility is inferior for forecasting future volatility, and that model based forecasts have more explanatory power for future volatility.

Keywords: volatility, S&P 500, GARCH, GJR, RiskMetrics, implied volatility

CONTENTS

1 Introduction………………………………………………………………………………………………………2

2 Literature Review……………………………………………………………………………………………….6

3 The CBOE Volatility Index – VIX………………………………………………………………………16

3.1 Implied Volatility……………………………………………………………………………………….16

3.2 The VIX Index……………………………………………………………………………………………17

4 Time Series Models for Volatility Forecasting…………………………………………………… 19

4.1 Random Walk Model………………………………………………………………………………….19

4.2 The ARCH(q) Model……………………………………………………………………….………… 19

4.3 The GARCH (p,q) Model………………………………………………………………….…………20

4.3.1 The Stylized Facts of Volatility……………………………………………….…………21

4.4 The GJR (p,q) Model…………………………………………………………………………………23

4.5 The EGARCH (p,q) Model…………………………………………………………………………..24

4.6 RiskMetrics Approach…………………………………………………………………………………25

5 Practical Issues for Model-building……………………………………………………………………26

5.1 Test ARCH Effect………………………………………………………………………………………26

5.2 Information Criterion…………………………………………………………………………………27

5.3 Evaluating the Volatility Forecasts……………………………………………………………….27

5.3.1 Out-of-sample Forecast………………………………………………………………………..27

5.3.2 Traditional Evaluation Statistics…………………………………………………………..28

6 Data………………………………………………………………………………………………………………30

6.1 S&P 500 Index Daily Returns………………………………………………………………………30

6.1.1 Autocorrelation of S&P 500 Index Daily Returns……………………………………32

6.1.2 Testing ARCH Effect of S&P 500 Index Daily Returns……………………………33

6.2 Properties of the VIX Index…………………………………………………………………………34

6.3 Study on S&P 500 Index and the VIX Index………………………………………………….34

6.3.1 Cross-correlation between S&P 500 Index and the VIX Index……………….34

6.3.2 S&P 500 Index Daily Returns and the VIX Index………………………………..37

7 Estimation and Discussion……………………………………………………………………………….43

7.1 Model Selection…………………………………………………………………………………………43

7.2 Test Numerical Accuracy of GARCH Estimates……………………………………………45

7.3 Estimates of Models…………………………………………………………………………………..46

7.4 BDS Test…………………………………………………………………………………………………...49

7.5 Graphical Diagnostic………………………………………………………………………………….51

8 Forecast Performance of Model Based Forecasts and VIX…………………………………..53

8.1 Out-of-sample Forecast Performance of GARCH Models……………………………..53

8.2 In-sample Forecast Performance of VIX……………………………………………………..54

8.3 Comparing Predictability of Time Series Models and VIX…………………………….56

8.3.1 Correlation between Realized Volatility and Volatility Forecasts…………59

8.3.2 Regression for In-sample Realized Volatility……………………………………..60

8.3.3 Residual Tests for Regression of In-sample Realized Volatility……………64

8.3.4 Regression for Out-of-sample Realized Volatility………………………………67

8.3.5 Residual Tests for Regression of Out-of-sample Realized Volatility…….70

8.3.6 Encompassing Regression for Realized Volatility………………………………72

8.3.7 Average Squared Deviation……………………………………………………………..75

8.3.8 Regression for Squared Return Shocks……………………………………………76

8.3.9 Encompassing Regression for Squared Daily Return Shocks……………..78

9 Conclusion…………………………………………………………………………………………………….80

References………………………………………………………………………………………………………….81

Appendix A. VIX and Future Realized Volatility…………………………………………………….86

Appendix B. Out-of-sample Forecast Performance on Realized Volatility………………..89

Appendix C. Residuals from Regression for Out-of-sample Realized Volatility…………91

TABLES

Table 1. Summary statistics for S&P 500 index daily returns 31

Table 2. Test for ARCH effect in S&P 500 daily index returns 33

Table 3. Summary statistics of the VIX index 34

Table 4. Cross-correlation between S&P 500 index daily returns and

implied volatility index 35

Table 5. Regression results for VIX changes and S&P 500 index daily returns 38

Table 6. Information criteria for estimated GARCH (p,q) models 44

Table 7. The summary statistics of estimated volatility models 47

Table 8. BDS test for serial independence in residuals 50

Table 9. Forecast Performance of GARCH models 53

Table 10. In-sample forecast performance of VIX and GARCH specifications 55

Table 11. Correlation between Realized Volatility and Alternative Forecasters 59

Table 12. Performance of regression for in-sample realized volatility 61

Table 13. Forecast performance on out-of-sample realized volatility 63

Table 14. Residual tests for regression for in-sample realized volatility 66

Table 15. Performance of regression for out-of-sample realized volatility 68

Table 16. Residual tests for regression for out-of-sample realized volatility 71

Table 17. Encompassing regression for realized volatility 74

Table 18. The average squared deviation from alternative approaches 76

Table 19. Regression results for squared return shocks 77

Table 20. Encompassing regression results for squared return shocks 78

FIGURES

Figure 1. Daily returns, squared daily returns and absolute daily returns

for the S&P 500 index 32

Figure 2. Autocorrelation of $r_t$, $r_t^2$ and $|r_t|$ for S&P 500 index 33

Figure 3. S&P 500 Index (logarithm) and the VIX Index 36

Figure 4. S&P 500 index daily returns and the VIX index 41

Figure 5. S&P 500 index absolute daily returns and the VIX index 42

Figure 6. Estimates from various GARCH (p,q) models 45

Figure 7. Graphical residual diagnostics from GARCH (1,1) to S&P 500 returns 52


1 Introduction

Volatility is computed as the standard deviation of equity returns. Modeling volatility in financial markets is important because volatility is often perceived as a significant element in asset evaluation, risk measurement, investment decision making, security valuation and monetary policy making.

Stock market volatility is inherently time-varying. The empirical evidence dates back to the well-known pioneering studies of Mandelbrot (1963) and Fama (1965), which demonstrated that large (small) price changes tend to be followed by large (small) price changes, implying that some periods display pronounced volatility clustering. It is widely accepted that volatility changes in financial markets are predictable. Various models have been applied in extensive empirical studies to forecast future volatility and to measure the predictability of volatility forecasts. However, there is little consensus on which model or family of models best describes asset returns.

To date, the two most popular approaches for forecasting future volatility are considered to be the Generalized Autoregressive Conditional Heteroskedasticity (GARCH) model and the RiskMetrics approach, introduced by Robert Engle (1982) and J. P. Morgan (1992), respectively. The forecasts of both approaches are derived from historical data. Additionally, the volatility implied from observed option prices is thought to be an efficient volatility forecast and is becoming increasingly popular for volatility forecasting, particularly in the U.S. market. A large body of empirical evidence documents that, in an efficient option market, implied volatility subsumes the forward-looking information contained in all other variables in the market's information set that help measure volatility over the option's lifetime. By and large, the conventional approaches for volatility forecasting fall into two categories: time series models based on historical data, and volatility implied from observed option prices.

The GARCH model is the natural extension of the autoregressive conditional heteroskedasticity (ARCH) model, which was thought to be a good description of stock returns and an efficient technique for estimating and analyzing time-varying volatility in stock returns. The seminal ARCH (q) model was pioneered by Engle (1982); it expresses the conditional variance of returns as a function of the squared returns of the past q periods and is estimated via a maximum likelihood procedure rather than the sample standard deviation. However, the ARCH (q) model has some limitations: it is unclear how to decide the appropriate number of lags of the squared residuals; a large value of q may induce a non-parsimonious conditional variance model; and the non-negativity constraints might be violated.

Some problems of the ARCH (q) model can be overcome by the GARCH (p,q) model, which incorporates additional dependence on p lags of past volatility, so that the variance of the residuals is modeled by an autoregressive moving average ARMA (p,q) process replacing the AR (q) process of the ARCH (q) model. The GARCH (p,q) model is widely used in practice. Extensive empirical evidence suggests that the GARCH (p,q) model is more parsimonious than the ARCH (q) model and provides a richer framework for time-varying volatility estimation. One of the outstanding features of the GARCH (p,q) model is that it can effectively remove the excess kurtosis in returns. In particular, the GARCH (1,1) model is widely recognized as the most popular framework for modeling the volatility of many financial time series.

However, the standard symmetric GARCH (p,q) model also has some underlying limitations. First, the requirement that the conditional variance be positive may be violated by the estimated model; the only way to avoid this problem is to impose constraints that force the coefficients to be non-negative. Second, it cannot explain the leverage effect, although it performs well in explaining volatility clustering and leptokurtosis in a time series. Third, the standard GARCH (p,q) model does not allow direct feedback between the conditional mean and the conditional variance.

In order to overcome the limitations of the standard symmetric GARCH (p,q) model, a number of extensions have been introduced, such as the asymmetric GJR (p,q) and EGARCH (p,q) models, which better capture the dynamics of the time series and make the modeling more flexible.

As another conventional approach for volatility forecasting, implied volatility is the volatility implied from the observed option price, computed by inverting option pricing formulas such as the Black-Scholes formula, which is widely used in practice. The required inputs for computing an option price with the Black-Scholes model are the stock price, strike price, risk free interest rate, time to maturity, volatility and dividends. Being the only unknown parameter, implied volatility is thought to represent the market's consensus forecast of future volatility, because options are priced on the basis of future payoffs.

Today, implied volatility indices have been constructed and published by stock exchanges in many countries, and it is widely recognized that implied volatility indices have superior predictability for future stock market volatility. A common question regarding implied volatility is whether the option price subsumes all relevant information about future volatility. A large body of empirical evidence from

previous studies (e.g., Fleming, Ostdiek and Whaley 1995, Christensen and Prabhala

1998, Giot 2005a, Giot 2005b, Corrado and Miller, JR. 2005, Giot and Laurent 2006,

Frijns, Tallau and Tourani-Rad 2008, Becker, Clements and McClelland 2009,

Becker, Clements and Coleman-Fenn 2009, Frijns, Tallau and Tourani-Rad 2010)

demonstrates that implied volatility is a forward-looking measure of market volatility.

However, the poor predictive power of implied volatility was also indicated by some

studies, such as Day and Lewis (1992), Canina and Figlewski (1993), Becker,

Clements and White (2006), Becker, Clements and White (2007) and Becker and

Clements (2008).

The objective of our study is to investigate whether model based forecasts or the CBOE volatility index (the VIX index published by the Chicago Board Options Exchange) are superior in forecasting the future volatility of S&P 500 index daily returns. The data used for our study range from January 1990 to December 2010. There are several reasons why we consider the use of the VIX index. First, it is based on the S&P 500 index, which is considered to be the core index for the U.S. equity market. Second, VIX is widely regarded as the market's expectation of S&P 500 index volatility. Third, VIX has a considerable data set of historical prices spanning over 20 years. Finally, the information content and performance of VIX have been examined by a large number of empirical studies using various approaches, but the study results conflict. Therefore, it is interesting to examine the performance of VIX in our own study.

The time series models studied in this paper include the random walk model, the ARCH (q) model, the GARCH (p,q) model, the GJR (p,q) model, the EGARCH (p,q) model and the RiskMetrics approach. We first estimated the parameters of each time series model and then examined their out-of-sample forecast performance. Our empirical evidence suggests that the GJR (1,1) model performs best for modeling the future volatility of S&P 500 index returns. Next, the predictive power of GJR (1,1), the RiskMetrics approach and VIX was compared using different approaches. We regressed future realized volatility at different forecasting horizons over both the in-sample and out-of-sample periods, and also studied their forecasting performance on squared daily return shocks. To guard against spurious inferences, diagnostic tests of the residuals were conducted.

Our study results are in line with Becker, Clements and White (2006), Becker,

Clements and White (2007) and Becker and Clements (2008). The empirical

evidence of our study does not support the view that implied volatility subsumes all

information content, and the study results provide strong evidence indicating that

GJR (1,1) is superior for modeling future volatility of S&P 500 index daily returns.

Additionally, the results of encompassing regression for future realized volatility at 5-,

10-, 15-, 30- and 60-day horizons, and the results of the encompassing regression for

squared return shocks suggest that the joint use of GJR(1,1) and RiskMetrics can

produce the best forecasts.

The rest of this paper is structured as follows. Section 2 reviews the literature. In section 3, implied volatility and the VIX index are introduced. The time series

models and practical issues for modeling are detailed in section 4 and section 5,

respectively. Section 6 outlines the data used for our study. The estimates of time

series models are discussed in section 7. Section 8 presents the empirical results of

comparison between VIX, RiskMetrics and GJR(1,1). Finally, section 9 concludes.


2 Literature Review

The predictability of the ARCH (q) model for the volatility of equity returns has been studied in an extensive literature. However, empirical evidence indicating good forecast performance of the ARCH (q) model is sporadic. The previous studies by Franses and Van Dijk (1996), Brailsford and Faff (1996) and Figlewski (1997) examined the out-of-sample forecast performance of ARCH (q) models, and their study results conflict. However, the common ground of their studies is that the regression of realized volatility produces quite a low R2 statistic. Since the average R2 is smaller than 0.1, they suggested that the ARCH (q) model has weak predictive power for future volatility.

There is a variety of restrictions influencing the forecasting performance of ARCH models. The frequency of the data is one of these restrictions, and it is an issue widely discussed in preceding papers. Nelson (1992) studied the ARCH model and documented that an ARCH model using high frequency data performs well for volatility forecasting, even when the model is severely misspecified. However, its out-of-sample forecasting ability for medium- and long-term volatility is poor.

The existing literature on GARCH-type models can be classified into two categories: investigations of the basic symmetric GARCH models, and studies of GARCH models with various volatility specifications.

Wilhelmsson (2006) investigated the forecast performance of the basic GARCH

(1,1) model by estimating it on S&P 500 index futures returns with nine different error

distributions, and found that allowing for a leptokurtic error distribution leads to

significant improvements in variance forecasts compared to using the normal

distribution. Additionally, the study also found that allowing for skewness and time

variation in the higher moments of the distribution does not further improve

forecasts.

Chuang, Lu and Lee (2007) studied the volatility forecasting performance of standard GARCH models under a group of distributional assumptions in the context of stock market indices and exchange rate returns. They found that the GARCH model combined with the logistic distribution, the scaled Student's t distribution and the RiskMetrics model is preferable in both stock markets and foreign exchange markets. However, a complex distribution does not always outperform a simpler one.

Franses and van Dijk (1996) examined the predictability of the standard

symmetric GARCH model as well as the asymmetric Quadratic GARCH and GJR

models on weekly stock market volatility forecasting, and the study results indicated

that the QGARCH model has the best forecasting ability on stock returns within the

sample period.

Brailsford and Faff (1996) investigated the predictive power of various models on

volatility of the Australian stock market. They tested the random walk model, the

historical mean model, the moving average model, the exponential smoothing model,

the exponential weighted moving average model, the simple regression model, the

symmetric GARCH models and two asymmetric GJR models. The empirical evidence

suggested that the GJR model is the best for forecasting the volatility of Australian stock market returns.

Chong, Ahmad and Abdullah (1999) compared the stationary GARCH,

unconstrained GARCH, non-negative GARCH, GARCH-M, exponential GARCH and

integrated GARCH models, and they found that exponential GARCH (EGARCH)

performs best in describing the often-observed skewness in stock market indices and

in out-of-sample (one-step-ahead) forecasting.

Awartani and Corradi (2005) studied the predictability of different GARCH

models, particularly focused on the predictive content of the asymmetric component.

The study results show that GARCH models allowing for asymmetries in volatility

produce more accurate volatility predictions.

Evans and McMillan (2007) studied the forecasting performance of nine

competing models for daily volatility for stock market returns of 33 economies. The

empirical results show that GARCH models allowing for asymmetries and

long-memory dynamics provide the best forecast performance.


By and large, the extensive empirical studies and evidence demonstrated that

GARCH models allowing for asymmetries perform very well for modeling future

volatility.

The EWMA model is also a widely used technique for modeling and forecasting the volatility of equity returns in financial markets, and the well-known RiskMetrics approach is essentially a variation of EWMA. A great deal of existing studies applying the EWMA model to various markets demonstrate that its performance differs across markets.

Akgiray (1989) first examined the forecast performance of the EWMA technique for volatility forecasting of stocks on the NYSE. The study also examined the predictability of ARCH and GARCH models. The findings indicated that the EWMA model is useful for forecasting time series; however, the GARCH model performs best for forecasting volatility.

Tse (1991) studied the volatility of stock returns in the Japanese market during the period 1986 to 1989 using ARCH, GARCH and EWMA models. The study results revealed that the EWMA model outperforms ARCH and GARCH models for forecasting the volatility of stock returns on the Tokyo Stock Exchange during the sample period.

Tse and Tung (1992) investigated monthly volatility movements in the Singapore stock market using three different volatility forecasting models: the naive method based on the historical sample variance, the EWMA model and the GARCH model. The study results suggested that the EWMA model is the best for predicting the volatility of monthly returns in the Singapore market.

Walsh and Tsou (1998) investigated the volatility of an Australian index from January 1, 1993 to December 31, 1995 using a variety of forecasting techniques: historical volatility, an improved extreme-value method, the ARCH/GARCH class of models, and the EWMA model. Hourly, daily and weekly data were used. The findings indicated that the EWMA model outperforms the other volatility forecasting techniques within the sample period.

Galdi and Pereira (2007) examined and compared the efficiency of the EWMA model, the GARCH model and stochastic volatility (SV) models for Value at Risk (VaR). The empirical results demonstrated that VaR calculated by the EWMA model was violated less often than that calculated by GARCH models and SV for a sample of 1500 observations.

Patev, Kanaryan and Lyroudi (2009) studied volatility forecasting on thin emerging stock markets, primarily focusing on the Bulgarian stock market. Three different models, namely RiskMetrics, EWMA with a t-distribution and EWMA with a GED distribution, were employed for the investigation. The study results suggested that both EWMA with a t-distribution and EWMA with a GED distribution perform well for modeling and forecasting the volatility of stock returns in the Bulgarian market. They also concluded that the EWMA model can be used effectively for volatility forecasting in emerging markets.

Implied volatility is another popular issue which has attracted a great deal of attention in empirical research. In particular, the information content of implied volatility is the subject of many studies, and it has been well documented that implied volatility is an efficient volatility forecast that subsumes all information contained in other variables. The predictability of model based forecasts and implied volatility has been compared by a number of studies, with the objective of finding out whether implied volatility or model based forecasts are superior for future volatility forecasting.

The implied volatility from index options has been widely studied, but the study results conflict. The studies by Day and Lewis (1992), Canina and Figlewski

(1993), Becker et al. (2006), Becker et al. (2007) and Becker and Clements (2008)

demonstrated that historical data subsumes important information that is not

incorporated into option prices, suggesting that implied volatility has poor

performance on volatility forecasting. However, the empirical evidence from the

studies by Poterba and Summers (1986), Sheikh (1989), Harvey and Whaley (1992),

Fleming, Ostdiek and Whaley (1995), Christensen and Prabhala (1998), Blair, Poon

and Taylor (2001), Poon and Granger (2001), Mayhew and Stivers (2003), Giot

(2005 a), Giot (2005 b), Corrado and Miller, JR. (2005), Giot and Laurent (2006),

Frijns et al. (2008), Becker, Clements and McClelland (2009), Becker, Clements and

Coleman-Fenn (2009) and Frijns et al. (2010) documented that the implied


volatilities from index options can capture most of the relevant information in the

historical data.

The implied volatility index (VIX) from the CBOE is widely used in empirical research on implied volatility. The VIX index was originally the volatility implied from the option prices of the S&P 100 index, and the calculation method was changed in 2003. Today, the VIX index is computed from the option prices of the S&P 500 index. Therefore, the literature on empirical studies of VIX can be classified into two categories: VIX based on the S&P 100 index and VIX based on the S&P 500 index.

Most studies found the volatility implied by S&P 100 index option prices to be a biased and inefficient forecast of future volatility, containing little or no incremental information beyond that in past realized volatility.

Day and Lewis (1992) examined the volatility implied from the call option prices

of S&P 100 index of the period from 1985 to 1989 by the use of the cross-sectional

regression. The information content of implied volatility was compared to the

conditional volatility of GARCH and EGARCH models of both in-sample and

out-of-sample periods. The information content of implied volatility of in-sample

period was examined by the likelihood ratio of the nested conditional volatility

GARCH and EGARCH models augmented with implied volatility as an exogenous

variable. The out-of-sample forecast performance of implied volatility and GARCH

and EGARCH models was studied by running the regression for the ex post volatility

on implied volatility and the volatility forecasts from GARCH and EGARCH models.

The study results show that implied volatility is biased and inefficient. The drawback

of their study may be the use of overlapping samples to predict the one-week ahead volatility of options which have a remaining life of up to 36 days.

Canina and Figlewski (1993) showed that implied volatility has virtually no correlation with future return volatility and does not incorporate information contained in recent observed volatility. According to the analysis by Canina and Figlewski (1993), one reason for their results could be the use of S&P 100 index options (OEX) and that index option markets process volatility information inefficiently. The second reason is that the Black-Scholes option pricing model may not be suitable for pricing index options because of the prohibitive transaction costs associated with hedging options in the cash index market. However, the Black-Scholes model does not require continuous trading in cash markets. Christensen and Prabhala (1998) mentioned that Constantinides (1994) has argued that transaction costs have no first-order effect on option prices. Therefore, transaction costs cannot explain the apparent failure of the Black-Scholes model for the OEX options market. It seems that the study results of Canina and Figlewski (1993) refute the basic principle of option pricing theory. (Christensen and Prabhala 1998)

The study by Christensen and Prabhala (1998) was a development of the study by Canina and Figlewski (1993). They reinvestigated the relation between implied volatility and realized volatility in the OEX options market, and they found different study results. Their finding indicates that implied volatility outperforms

past volatility in some of their specifications. Christensen and Prabhala (1998) argued

that the reason causing their study results to be different from Canina and Figlewski’s

(1993) is that they used a longer volatility series, and ‘this increases statistical power

and allows for evolution in the efficiency of the market for OEX index options since

their introduction in 1983'. Their sample data range from November 1983 to May 1995, which equals 11.5 years. However, the data used by Canina and Figlewski

(1993) was from March 15, 1983 to March 28, 1987, and this period preceded the

October 1987 crash. Christensen and Prabhala (1998) documented that there was a

regime shift around the crash period, and implied volatility is more biased before the

crash. The second reason is that they used monthly data to sample the implied and

realized volatility series, while the daily data was used by Canina and Figlewski

(1993). The lower frequency of data enables them to ‘construct volatility series with

nonoverlapping data with exactly one implied and one realized volatility covering each

time period’, and their ‘nonoverlapping sample yields more reliable regression

estimates relative to less precise and potentially inconsistent estimates obtained from

overlapping samples used in previous work’.


Blair et al. (2001) compared ARCH models and VIX based on the S&P 100 index

using both daily index returns and intraday returns. The data ranges from November

1983 to May 1995, and it spans a time period of 139 months which is approximately

11.5 years. The study results indicate VIX performs very well on volatility forecasting

and the volatility forecasts are unbiased.

The technique for computing VIX was improved in 2003. Since the new computation is based on the option prices of the S&P 500 index rather than the S&P 100 index, the evaluation of the performance of VIX in forecasting the future volatility of the S&P 500 index became the subject of most empirical research. However, the results of the various studies also conflict.

Corrado and Miller, JR. (2005) studied implied volatility indices VIX, VXO as

well as VXN which are based on S&P 500, S&P 100 and Nasdaq 100 indices,

respectively. The study period spans 16 years from January 1988 to December 2003.

They compared the results of OLS regression to the estimates derived from

instrumental variables regression, and the study results documented that implied

volatility indices VIX, VXO and VXN dominate historical realized volatility.

Particularly, VXN is nearly unbiased and it can produce more efficient forecasts than

realized volatility.

Giot and Laurent (2006) investigated information content of both VIX and VXO

implied volatility indices. The data used for their study ranges from January 1990 to

May 2003. The information content was evaluated by running an encompassing

regression of the jump/continuous components of historical volatility, and implied

volatility was augmented as an additional variable. The study results show that

implied volatility subsumes most relevant volatility information. They also indicated

that the addition of the jump/continuous components can hardly affect the

explanatory power of the encompassing regression.

Becker, Clements and McClelland (2009) examined the information content of VIX by seeking answers to two questions: first, whether the VIX index subsumes information regarding how historical jump activity contributed to price volatility; second, whether the VIX reflects any incremental information pertaining to future jump activity relative to model based forecasts. The empirical results of their study provide affirmative answers to these two questions.

Becker, Clements and Coleman-Fenn (2009) compared model based forecasts and VIX. They argued that the unadjusted implied volatility is inferior; however, the transformed VIX, adjusted for the volatility risk premium, can perform as well as model based forecasts.

The study results of Becker et al. (2006), Becker et al. (2007) and Becker and

Clements (2008) refute the hypothesis of VIX being an efficient volatility forecast.

The same data set was used for these three studies, ranging from January 1990 to

October 2003. The study results indicate that there is a significant and positive

relationship between VIX and future volatility, but the VIX is an inefficient volatility

forecast.

There are several determinant variables for computing implied volatility, such as the index level, the risk free interest rate, dividends, the contractual provisions of the option and the observed option price. Measurement errors in these variables may lead to biased estimates of implied volatilities. Since the implied volatilities used in early studies contain measurement errors whose magnitudes are unknown, this may be the primary reason for the conflicting results of the various studies.

In addition, bias in implied volatility estimates can also be induced by other factors: the relatively infrequent trading of the stocks in the index; the use of closing prices, given the different closing times of the stock and options markets; and bid-ask price effects, which may cause the first order autocorrelation of the implied volatility series to be negative.

Compared to index options, studies based on individual stock options are sporadic. The study by Latané and Rendleman (1976) was conducted with the expectation of favoring implied volatility; however, the results are less overwhelming because such studies predate the development of conditional heteroskedasticity models and applied naive models of historical volatility.


Lamoureux and Lastrapes (1993) examined implied volatility based on the option prices of 10 stocks over a short two-year period from April 1982 to March 1984.

They demonstrated that implied volatility is biased and inefficient, and the GARCH

model performs better on modeling the conditional variance. Additionally, they also

found that when implied volatility was included as a state variable in the GARCH

conditional variance equation, historical return shocks still provided important

additional information beyond that reflected in option prices. Their study results are

difficult to interpret because they used overlapping samples to examine the one-day-ahead forecasting ability of implied volatility computed from options that have a much longer remaining life of up to 129 trading days.

Based on the theory and methodology of the study by Lamoureux and Lastrapes

(1993), Mayhew and Stivers (2003) examined 50 firms with the highest option

volume traded on the CBOE between 1988 and 1995, and they used the daily time

series of the volatility index (VIX) from CBOE. During this period, the VIX

represented the implied volatility of an at-the-money option based on the S&P 100

Index with 22 trading days to expiration. Their study results show that the implied

volatility outperforms the GARCH specification. In addition, when implied volatility is added to the conditional variance equation, it captures most or all of the relevant

information in past return shocks, at least for stocks with actively-traded options.

Furthermore, they documented that return shocks from period 2 and older

provide reliable incremental volatility information for only a few firms in the

sample. Finally, they also found that the implied volatility from equity index options

provides incremental information about firm-level conditional volatility. For most of the firms, index implied volatility contains information beyond that in past return shocks, suggesting an alternative method for modeling volatility for stocks without traded options. For a small subset of firms with less actively-traded individual options, the index implied volatility provides incremental information beyond the firm's own implied volatility. Therefore, equity index options appear to impound systematic volatility information that is not available from less liquid stock options.


Frijns et al. (2008) and Frijns et al. (2010) studied the return volatility of the Australian stock market over different periods. Because there is no implied volatility index published by the Australian stock exchange, Frijns et al. (2010) computed an implied volatility index, namely AVX, on the basis of the European style index options traded on the Australian Securities Exchange. The approach to constructing AVX is similar to the way the CBOE computes VIX. The distinctive feature is that the implied

volatilities of eight near-the-money options were combined into a single

at-the-money implied volatility index with a constant time to maturity of three

months (Frijns et al. 2010: 31). Therefore, the computed AVX is considered to be the

forecasted future return volatility of S&P/ASX 200 over the subsequent three months.

The study results demonstrated that implied volatility outperforms RiskMetrics and

GARCH and provides important information for forecasting future return volatility of

the Australian stock market. Furthermore, it is proposed that AVX could be valuable

information to investors, corporations and financial institutions.

To summarize, the empirical results of recent studies favor the conclusion that implied volatility is more efficient and informative for forecasting the future volatility of asset returns.


3 The CBOE Volatility Index-VIX

3.1 Implied Volatility

Implied volatility is a prediction of process volatility rather than an estimate of it, and its horizon is given by the maturity of the option. In a constant volatility framework, implied volatility is the volatility of the underlying asset price process that is implicit in the

market price of an option according to a particular model. If the process volatility is

stochastic, implied volatility is considered to be the average volatility of the

underlying asset price process that is implicit in the market price of an option

(Alexander, 2001:22).

The market price of options can be computed using various models. A simple model, namely the Black-Scholes model, is widely used for pricing European options in practice. The theoretical price and the observed market price of an option may differ from each other, whereas the use of implied volatility can make these two prices equivalent (Alexander, 2001). A recognized fact is that different options on the same underlying asset can generate different implied volatilities. Furthermore, using different data can induce irreconcilably different inferences about parameter values.

Since implied volatilities are thought of as the market's forecast of the volatility of the underlying asset of an option, the calculation of an implied volatility is closely associated with the option valuation model. Blair et al. (2001) argued that the inappropriate use of an option valuation model can lead to mis-measurement of implied volatilities. For example, if the implied volatilities of S&P 100 index options, which are American style, are calculated with a European model, an error will be caused by the omission of the early exercise feature. In addition, Harvey and Whaley (1992) showed that if the option pricing model includes the early exercise feature but the timing and level of dividends are assumed to be constant, the option will be priced with error, so that implied volatilities will be mis-measured.
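To make the inversion concrete, the following minimal Python sketch backs an implied volatility out of an observed European call price by numerically solving the Black-Scholes pricing equation for the volatility. The function names, the dividend-yield treatment and the example numbers are illustrative assumptions and not the procedure used later in this thesis.

```python
import math
from scipy.optimize import brentq
from scipy.stats import norm

def bs_call_price(S, K, r, T, sigma, q=0.0):
    """Black-Scholes price of a European call with continuous dividend yield q."""
    d1 = (math.log(S / K) + (r - q + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S * math.exp(-q * T) * norm.cdf(d1) - K * math.exp(-r * T) * norm.cdf(d2)

def implied_volatility(price, S, K, r, T, q=0.0):
    """Back out the volatility that equates the model price to the observed price."""
    objective = lambda sigma: bs_call_price(S, K, r, T, sigma, q) - price
    return brentq(objective, 1e-6, 5.0)  # root-search over a wide volatility bracket

# Hypothetical example: an at-the-money call observed at 10.45, S=100, K=100, r=3%, T=0.5
print(implied_volatility(10.45, S=100, K=100, r=0.03, T=0.5))
```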


3.2 The VIX Index

The VIX index was introduced by the Chicago Board Options Exchange (CBOE) in 1993. By using the implied volatilities of various near-the-money options on the S&P 100 index, Whaley (1993) constructed the VIX index on the basis of a synthetic at-the-money option with a constant time to maturity of one month, and demonstrated that the VIX index is not only an efficient index for market volatility but could also be employed for hedging purposes by introducing options and futures on the VIX. The calculation approach of VIX was changed on September 22, 2003, and it is now calculated from the bid and ask quotes of options on the S&P 500 index rather than the S&P 100 index. The S&P 500 index is the most popular underlying asset as well as the most widely used benchmark in the U.S. market.

Before the calculation approach was changed, the VIX index based on the S&P 100 index was a weighted index of American implied volatilities derived from eight near-the-money, near-to-expiry S&P 100 call and put options, and it was considered able to eliminate smile effects and most problems of mis-measurement. It used binomial valuation methods with trees adjusted to reflect the actual amount and timing of anticipated cash dividends. The midpoint of the most recent bid/ask quotes was used to calculate the option price, a choice considered able to avoid problems induced by bid/ask bounce. Both call and put options were used in order to increase the amount of information and to eliminate problems caused by mis-measurement of the underlying index and by put/call option clientele effects. VIX based on the S&P 100 index represents a hypothetical option that is at-the-money and has a constant 22 trading days (30 calendar days) to expiry. It employed pairs of near-the-money exercise prices barely above and below the current index price. In addition, a pair of times to expiry was used: one at least eight calendar days to expiration and the other in the following contract month. Blair et al. (2001) showed that although VIX is robust to mis-measurement, it is still a biased predictor of subsequent volatility due to a trading time adjustment that typically multiplies conventional implied volatilities by approximately 1.2.


The new calculation approach brings the VIX index much closer to real financial practice, and it has become the practical standard for trading and hedging volatility. It is widely accepted and considered to be the market's expected volatility of the S&P 500 index. Since the computation incorporates a wide range of exercise prices, the VIX index based on the S&P 500 index is more robust. In addition, VIX is computed directly from option prices rather than backed out with the Black-Scholes option pricing model (Ahoniemi 2006). The popularity of VIX is growing rapidly and it has become the main index for U.S. stock market volatility. So far, VIX has been a tradable asset for both options and futures, with a 6-year history.

According to CBOE proprietary information (2009), VIX is computed from the prices of at-the-money and out-of-the-money call and put options using the formula

$$\sigma^2 = \frac{2}{T}\sum_i \frac{\Delta K_i}{K_i^2}\, e^{rT} Q(K_i) - \frac{1}{T}\left(\frac{F}{K_0}-1\right)^2 \qquad (1)$$

where σ denotes VIX divided by 100, T is the time to maturity, r is the risk free interest rate, F is the forward index level computed from the index option prices, $K_0$ denotes the first strike below the forward index level F, $K_i$ is the strike price of the ith out-of-the-money option (a call if $K_i > K_0$; a put if $K_i < K_0$; both call and put if $K_i = K_0$), $Q(K_i)$ stands for the midpoint of the bid-ask spread for each option with strike $K_i$, and $\Delta K_i$ is the interval between strike prices, calculated as half the difference between the strikes on either side of $K_i$: $\Delta K_i = (K_{i+1}-K_{i-1})/2$.

Since VIX forecasts the 30-day volatility of the S&P 500 index, the near-term and next-term put and call options of the first two contract months are used to compute VIX. For the near-term options, the time to maturity must be at least one week, in order to minimize the potential pricing anomalies that can occur close to expiration. If the near-term options have less than one week to expiration, the calculation rolls to the next two contract months (CBOE proprietary information 2009).
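The following Python sketch illustrates how equation (1) is applied to a single expiration, using a small hypothetical strip of out-of-the-money quotes. The full CBOE methodology additionally blends the near-term and next-term variances into a constant 30-day horizon, which is omitted here; all numbers below are illustrative assumptions.

```python
import math

def vix_variance(quotes, F, K0, r, T):
    """Apply equation (1) to one expiration.

    quotes: list of (strike, midpoint_price) for the out-of-the-money options,
            sorted by strike; at K0 the call and put midpoints are assumed averaged.
    F: forward index level, K0: first strike below F, r: risk-free rate,
    T: time to maturity in years.
    """
    strikes = [k for k, _ in quotes]
    total = 0.0
    for i, (K, Q) in enumerate(quotes):
        # Delta K: half the distance between the neighbouring strikes (one-sided at the ends)
        if i == 0:
            dK = strikes[1] - strikes[0]
        elif i == len(quotes) - 1:
            dK = strikes[-1] - strikes[-2]
        else:
            dK = (strikes[i + 1] - strikes[i - 1]) / 2.0
        total += dK / K ** 2 * math.exp(r * T) * Q
    return 2.0 / T * total - (F / K0 - 1.0) ** 2 / T

# Hypothetical strip for a single 30-day expiration
quotes = [(1800, 2.1), (1850, 3.4), (1900, 5.9), (1950, 10.2), (2000, 17.5)]
sigma2 = vix_variance(quotes, F=1902.0, K0=1900, r=0.02, T=30 / 365)
print(100 * math.sqrt(sigma2))  # quoted like VIX: annualised volatility in percent
```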


4 Time Series Models for Volatility Forecasting

4.1 Random Walk Model

Perhaps the random walk model is the simplest one for modeling the volatility of a time series. Under the efficient market hypothesis, stock price indices are virtually random. The standard model for estimating the volatility of stock returns by ordinary least squares is the random walk model based on historical prices:

$$r_t = \mu + \varepsilon_t \qquad (2)$$

where $r_t$ denotes the stock index return at time t; μ is the average return, which under the efficient market hypothesis is expected to equal zero; and $\varepsilon_t$ is the error term at time t, whose autocovariance should equal zero over time.

4.2 The ARCH (q) Model

Engle (1982) introduced the autoregressive conditional heteroskedasticity ARCH (q) model and documented that serially autocorrelated squared returns (conditional heteroskedasticity) can be modeled using an ARCH (q) model. The framework of the ARCH (q) model is:

$$r_t = \mu_t + \varepsilon_t \qquad (3)$$

$$\varepsilon_t = z_t \sigma_t \qquad (4)$$

$$\sigma_t^2 = \alpha_0 + \sum_{i=1}^{q}\alpha_i \varepsilon_{t-i}^2 \qquad (5)$$

where $\mu_t$ denotes the conditional mean given the information set available at time $t-1$, and $z_t$ represents a sequence of iid random variables with zero mean and unit variance. The parameter constraints $\alpha_0 > 0$ and $\alpha_i \geq 0$ $(i = 1,\dots,q)$ ensure that the conditional variance is non-negative.

Equation (5) for $\sigma_t^2$ can be expressed as an AR (q) process for the squared residuals:

$$\varepsilon_t^2 = \alpha_0 + \sum_{i=1}^{q}\alpha_i \varepsilon_{t-i}^2 + u_t \qquad (6)$$

where $u_t = \varepsilon_t^2 - \sigma_t^2$ is a martingale difference sequence (MDS), since $E[u_t \mid \Omega_{t-1}] = 0$, and it is assumed that $E[\varepsilon_t^4] < \infty$ (Zivot 2008:4). The condition for $\varepsilon_t^2$ to be covariance stationary is that the sum of the parameters on the past squared residuals, $\alpha_i$ $(i = 1,\dots,q)$, be smaller than unity. The persistence of $\sigma_t^2$ and the unconditional variance of $\varepsilon_t$ are measured by $\sum_{i=1}^{q}\alpha_i$ and $\alpha_0/\big(1-\sum_{i=1}^{q}\alpha_i\big)$, respectively.

4.3 The GARCH (p,q) Model

The generalized ARCH (GARCH) model, proposed by Bollerslev (1986), is an extension of the ARCH model. It is based on the assumption that the conditional variance depends on its own previous lags, and it replaces the AR (q) representation in equation (5) with an ARMA (p,q) process:

$$\sigma_t^2 = \alpha_0 + \sum_{i=1}^{q}\alpha_i \varepsilon_{t-i}^2 + \sum_{j=1}^{p}\beta_j \sigma_{t-j}^2 \qquad (7)$$

where the parameter constraints $\alpha_0 > 0$, $\alpha_i \geq 0$ $(i = 1,\dots,q)$ and $\beta_j \geq 0$ $(j = 1,\dots,p)$ assure that $\sigma_t^2 > 0$. Equation (7) together with equations (3) and (4) is known as the basic GARCH (p,q) model. If $p = 0$, the GARCH (p,q) model becomes an ARCH (q) model. For the coefficient estimates of the GARCH term to be identified, at least one of the parameters $\alpha_i$ $(i = 1,\dots,q)$ must be significantly different from zero. For the basic GARCH (p,q) model, the squared residuals $\varepsilon_t^2$ behave like an ARMA process. Covariance stationarity requires that $\sum_{i=1}^{q}\alpha_i + \sum_{j=1}^{p}\beta_j < 1$. The unconditional variance of $\varepsilon_t$ is computed as:

$$\bar{\sigma}^2 = \frac{\alpha_0}{1 - \sum_{i=1}^{q}\alpha_i - \sum_{j=1}^{p}\beta_j} \qquad (8)$$

In practice, the GARCH (1,1) model, comprising only three parameters in the conditional variance equation, is often sufficient to capture the volatility clustering in the data. The conditional variance equation of the GARCH (1,1) model is

$$\sigma_t^2 = \alpha_0 + \alpha_1 \varepsilon_{t-1}^2 + \beta_1 \sigma_{t-1}^2 \qquad (9)$$

Since $u_t = \varepsilon_t^2 - \sigma_t^2$, equation (9) can be rewritten as

$$\varepsilon_t^2 = \alpha_0 + (\alpha_1 + \beta_1)\varepsilon_{t-1}^2 + u_t - \beta_1 u_{t-1} \qquad (10)$$

Equation (10) is an ARMA (1,1) process for $\varepsilon_t^2$, and many properties of the GARCH (1,1) model follow from it. For instance, the persistence of the conditional volatility is captured by $\alpha_1 + \beta_1$, and the constraint $\alpha_1 + \beta_1 < 1$ assures covariance stationarity. The covariance stationary GARCH (1,1) model has an ARCH ($\infty$) representation with coefficients $\alpha_1 \beta_1^{i-1}$, and the unconditional variance of $\varepsilon_t$ is $\bar{\sigma}^2 = \alpha_0/(1 - \alpha_1 - \beta_1)$. (Zivot, 2008:6)
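As an illustration of how equation (9) is applied to data, the short Python sketch below filters the conditional variance through a series of residuals for given parameter values. The parameter values and the simulated residuals are hypothetical, and maximum likelihood estimation of the parameters is not shown.

```python
import numpy as np

def garch11_variance(eps, alpha0, alpha1, beta1):
    """Filter the conditional variance of equation (9) through a series of residuals."""
    sigma2 = np.empty_like(eps)
    sigma2[0] = alpha0 / (1.0 - alpha1 - beta1)      # start at the unconditional variance
    for t in range(1, len(eps)):
        sigma2[t] = alpha0 + alpha1 * eps[t - 1] ** 2 + beta1 * sigma2[t - 1]
    return sigma2

# Hypothetical daily parameters; persistence alpha1 + beta1 = 0.98 < 1
rng = np.random.default_rng(0)
eps = 0.01 * rng.standard_normal(1000)               # stand-in for demeaned daily returns
sigma2 = garch11_variance(eps, alpha0=2e-6, alpha1=0.08, beta1=0.90)
print(sigma2[-1], 2e-6 / (1 - 0.08 - 0.90))          # last fitted vs. unconditional variance
```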

4.3.1 The Stylized Facts of Volatility

The stylized facts about the volatility of economic and financial time series have

been studied extensively. The most important stylized facts are known as volatility

clustering, leptokurtosis, volatility mean reversion and leverage effect.

Volatility clustering can be interpreted through the GARCH (1,1) model of equation (9). For many daily or weekly financial time series, a distinctive feature is that the coefficient estimate of the GARCH term is approximately 0.9. This implies that a large (small) value of the conditional variance will be followed by a large (small) value. The same conclusion can be derived from the ARMA representation of GARCH models in equation (10), i.e. large changes in $\varepsilon_t^2$ will be followed by large changes, and small changes in $\varepsilon_t^2$ will be followed by small changes. (Zivot, 2008)

Compared to the normal distribution, the distributions of high frequency financial data usually have fatter tails and excess peakedness at the mean. This fact is known as leptokurtosis, and it suggests the frequent presence of extreme values. Kurtosis is a statistic measuring the peakedness of the distribution of a time series relative to a normally distributed random variable with constant mean and variance, and it is calculated as a function of the residuals $\varepsilon_t$ and their variance $\sigma^2$:

$$\text{kurtosis} = \frac{E[\varepsilon_t^4]}{(\sigma^2)^2} \qquad (11)$$

The kurtosis of a normal distribution is three, and the excess kurtosis, which equals the kurtosis minus three, is zero. The normal distribution with zero excess kurtosis is known as mesokurtic. A distribution with kurtosis larger than three (positive excess kurtosis) is referred to as leptokurtic, and a distribution is said to be platykurtic if its kurtosis is smaller than three (negative excess kurtosis).

Sometimes financial markets experience excessive volatility; however, the volatility ultimately seems to revert to its mean level. The unconditional variance of the residuals of the standard GARCH (1,1) model is $\bar{\sigma}^2 = \alpha_0/(1 - \alpha_1 - \beta_1)$. In order to clarify that the volatility is eventually driven back to this long run level, we rewrite the ARMA representation in equation (10) as

$$\varepsilon_t^2 - \bar{\sigma}^2 = (\alpha_1 + \beta_1)\left(\varepsilon_{t-1}^2 - \bar{\sigma}^2\right) + u_t - \beta_1 u_{t-1} \qquad (12)$$

and, by successively iterating k times,

$$\varepsilon_{t+k}^2 - \bar{\sigma}^2 = (\alpha_1 + \beta_1)^k\left(\varepsilon_{t}^2 - \bar{\sigma}^2\right) + \eta_{t+k} \qquad (13)$$

where $\eta_{t+k}$ is a moving average process. Since $\alpha_1 + \beta_1 < 1$ is required for a covariance stationary GARCH (1,1) model, $(\alpha_1 + \beta_1)^k$ approaches zero as k increases infinitely. Although $\varepsilon_t^2$ may deviate from the long run level at time t, $(\alpha_1 + \beta_1)^k(\varepsilon_t^2 - \bar{\sigma}^2)$ will approach zero as k becomes larger, which implies that the volatility will eventually go back to its long run level $\bar{\sigma}^2$. The half-life of a volatility shock gives the average time for $|\varepsilon_t^2 - \bar{\sigma}^2|$ to decrease by one half, and it is measured by $\ln(0.5)/\ln(\alpha_1 + \beta_1)$. Therefore, the speed of mean reversion is governed by $\alpha_1 + \beta_1$: if the value of $\alpha_1 + \beta_1$ is close to 1, the half-life of a volatility shock will be very long; if $\alpha_1 + \beta_1 > 1$, the GARCH model is non-stationary and the volatility will ultimately explode to infinity as k increases infinitely (Zivot 2008:8).
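A small numerical illustration of the half-life formula, using hypothetical persistence levels:

```python
import math

def half_life(alpha1, beta1):
    """Average time (in periods) for a volatility shock to decay by one half."""
    persistence = alpha1 + beta1
    return math.log(0.5) / math.log(persistence)

# Illustrative (hypothetical) parameter values
for a1, b1 in [(0.08, 0.90), (0.05, 0.94), (0.10, 0.895)]:
    print(f"alpha1+beta1 = {a1 + b1:.3f}  half-life = {half_life(a1, b1):.1f} days")
```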

The standard GARCH (p,q) model enforces a symmetric response of volatility to positive and negative shocks, because its conditional variance equation is a function of the lagged squared residuals but not their signs, i.e. the sign is lost when the lagged residuals are squared (Brooks, 2008). Therefore, the standard GARCH (p,q) model cannot capture the asymmetric effect, also known as the leverage effect, in the distribution of returns. One alternative is to model the conditional variance equation augmented with an asymmetry term. Another approach is to allow the residuals to have an asymmetric distribution (Zivot 2008). In order to overcome this limitation of the standard GARCH (p,q) model, a number of extensions have been built, such as the GJR and exponential GARCH (EGARCH) models.

4.4 The GJR (p,q) Model

The GJR (p,q) model is built on the assumption that unexpected changes in market returns have different effects on the conditional variance of returns. Compared to the basic GARCH (p,q) model, the GJR (p,q) model is augmented with an additional term which accounts for possible asymmetries. For the GJR (1,1) case, the functional form of the conditional variance is given by:

$$\sigma_t^2 = \alpha_0 + \alpha_1 \varepsilon_{t-1}^2 + \gamma\, I(\varepsilon_{t-1} < 0)\, \varepsilon_{t-1}^2 + \beta_1 \sigma_{t-1}^2 \qquad (14)$$

where I(·) represents the dummy variable that takes the value one if $\varepsilon_{t-1} < 0$ and zero otherwise. If $\gamma > 0$, a leverage effect is exhibited, suggesting that negative shocks have a larger impact on the conditional variance than positive shocks; if $\gamma \neq 0$, the news impact is asymmetric. Since the conditional variance should be positive, the parameter constraints are $\alpha_0 \geq 0$, $\alpha_1 \geq 0$, $\beta_1 \geq 0$ and $\alpha_1 + \gamma \geq 0$. When $\alpha_1 + \gamma \geq 0$, the model is still admissible even if $\gamma < 0$. The model is stationary if $\alpha_1 + \beta_1 + \gamma/2 < 1$.
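A minimal Python sketch of the GJR (1,1) recursion in equation (14), with hypothetical parameter values, shows how the extra term is switched on only after negative shocks:

```python
import numpy as np

def gjr11_variance(eps, alpha0, alpha1, gamma, beta1):
    """Conditional variance of equation (14): negative shocks receive the extra gamma term."""
    sigma2 = np.empty_like(eps)
    sigma2[0] = alpha0 / (1.0 - alpha1 - beta1 - gamma / 2.0)   # unconditional level
    for t in range(1, len(eps)):
        leverage = gamma * eps[t - 1] ** 2 if eps[t - 1] < 0 else 0.0
        sigma2[t] = alpha0 + alpha1 * eps[t - 1] ** 2 + leverage + beta1 * sigma2[t - 1]
    return sigma2

# Illustrative parameters: alpha1 + beta1 + gamma/2 = 0.98 < 1
rng = np.random.default_rng(1)
eps = 0.01 * rng.standard_normal(1000)
sigma2 = gjr11_variance(eps, alpha0=2e-6, alpha1=0.03, gamma=0.08, beta1=0.91)
print(sigma2[-5:])
```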

4.5 The EGARCH (p,q) Model

The exponential GARCH (EGARCH) model introduced by Nelson (1991) incorporates the leverage effect and specifies the conditional variance in logarithmic form. The conditional variance equation of the EGARCH (1,1) model can be expressed as:

$$\ln(\sigma_t^2) = \alpha_0 + \beta_1 \ln(\sigma_{t-1}^2) + \alpha_1\left(\left|\frac{\varepsilon_{t-1}}{\sigma_{t-1}}\right| + \gamma\,\frac{\varepsilon_{t-1}}{\sigma_{t-1}}\right) \qquad (15)$$

If $\varepsilon_{t-1} > 0$, i.e. there is an arrival of good news, the total effect of $\varepsilon_{t-1}$ is $\alpha_1(1 + \gamma)\,|\varepsilon_{t-1}/\sigma_{t-1}|$; if $\varepsilon_{t-1} < 0$, i.e. there is an arrival of bad news, the total effect of $\varepsilon_{t-1}$ is $\alpha_1(1 - \gamma)\,|\varepsilon_{t-1}/\sigma_{t-1}|$.

The EGARCH model has three advantages over the basic GARCH model. First, since the conditional variance is modeled in logarithmic form, the variance will always be positive even if the parameters are negative. With appropriate conditions on the parameters, this specification captures the fact that a negative shock leads to a higher conditional variance in the next period than a positive shock does. Second, asymmetries are allowed in the EGARCH model: if the relationship between volatility and returns is negative, the parameter of the asymmetry term, $\gamma$, will be negative. Third, the EGARCH model is stationary and has finite kurtosis if $|\beta_1| < 1$. Thus, no restriction on the leverage effect that the model can represent is imposed by the positivity, stationarity or finite fourth order moment restrictions.
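The sketch below filters the log conditional variance of equation (15) as reconstructed above; both the parameterization and the numbers are illustrative assumptions rather than estimates from this study.

```python
import numpy as np

def egarch11_log_variance(eps, alpha0, alpha1, gamma, beta1):
    """Log conditional variance as in equation (15); positivity holds by construction."""
    log_s2 = np.empty_like(eps)
    log_s2[0] = alpha0 / (1.0 - beta1)                 # long-run level of ln(sigma^2)
    for t in range(1, len(eps)):
        z = eps[t - 1] / np.exp(0.5 * log_s2[t - 1])   # standardised shock
        log_s2[t] = alpha0 + beta1 * log_s2[t - 1] + alpha1 * (abs(z) + gamma * z)
    return log_s2

# Illustrative parameters: negative gamma corresponds to the leverage effect
rng = np.random.default_rng(2)
eps = 0.01 * rng.standard_normal(1000)
sigma2 = np.exp(egarch11_log_variance(eps, alpha0=-0.3, alpha1=0.10, gamma=-0.6, beta1=0.97))
print(sigma2[-5:])
```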

4.6 RiskMetrics Approach

The RiskMetrics approach was introduced by J.P. Morgan (1992). It is a variation of the exponentially weighted moving average (EWMA) model, which can be expressed as

$$\sigma_t^2 = (1-\lambda)\sum_{i=1}^{\infty}\lambda^{i-1}\left(r_{t-i} - \bar{r}\right)^2 \qquad (16)$$

where $\bar{r}$ denotes the average return estimated from the observations, which is assumed to be zero by the RiskMetrics approach as well as by many empirical studies, and $\lambda$ is the decay factor determining the weights given to recent and older observations. The determination of the value of $\lambda$ is important. Although $\lambda$ can be estimated, it is conventionally restricted to 0.94 for daily data and 0.97 for monthly data, and these weights are recommended by the RiskMetrics approach. To be explicit, the specification of the RiskMetrics model is

$$\sigma_t^2 = \lambda\sigma_{t-1}^2 + (1-\lambda)r_{t-1}^2 \qquad (17)$$
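Equation (17) translates into a one-line recursion. The following sketch applies it with the conventional daily decay factor λ = 0.94; initialising with the first squared return is an assumption made purely for illustration.

```python
import numpy as np

def riskmetrics_variance(returns, lam=0.94):
    """Equation (17): exponentially weighted variance with decay factor lambda."""
    sigma2 = np.empty_like(returns)
    sigma2[0] = returns[0] ** 2                        # simple (illustrative) initialisation
    for t in range(1, len(returns)):
        sigma2[t] = lam * sigma2[t - 1] + (1.0 - lam) * returns[t - 1] ** 2
    return sigma2

rng = np.random.default_rng(3)
r = 0.01 * rng.standard_normal(500)                    # simulated stand-in for daily returns
print(riskmetrics_variance(r)[-5:])
```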


5 Practical Issues for Model-building

5.1 Test ARCH Effect

Volatility clustering is reflected in the autocorrelation of squared and absolute returns, or of the residuals from the estimated conditional mean equation (Zivot, 2008). There are different approaches for testing for the ARCH effect; two conventional methods are the Ljung-Box (1978) statistic and the Lagrange multiplier (LM) test suggested by Engle (1982).

Denoting the lag-i autocorrelation of the squared or absolute returns by $\rho_i$, the Ljung-Box statistic is computed as:

$$Q(q) = T(T+2)\sum_{i=1}^{q}\frac{\rho_i^2}{T-i} \sim \chi^2(q) \qquad (18)$$

The statistic of the LM test is given by

$$T \cdot R^2 \sim \chi^2(q) \qquad (19)$$

where q represents the number of restrictions placed on the model, T denotes the total number of observations, and $R^2$ is from the regression of equation (6). The hypotheses of the LM test are:

$H_0$: $\alpha_1 = \alpha_2 = \dots = \alpha_q = 0$ (suggesting there is no ARCH effect)

$H_1$: at least one $\alpha_i \neq 0$, $i = 1,\dots,q$ (suggesting there is an ARCH effect)

Lee and King (1993) documented that the LM test can also be used to test for GARCH effects. Lumsdaine and Ng (1999) argued that the LM test could fail if the conditional mean equation is specified inappropriately, since this can lead to serial autocorrelation of the estimated residuals as well as of the squared estimated residuals.
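A hand-rolled sketch of the LM statistic in equation (19): regress the squared residuals on q of their own lags, form T·R², and compare it with a χ²(q) distribution. The helper name and the lag choice are illustrative assumptions.

```python
import numpy as np
from scipy.stats import chi2

def arch_lm_test(resid, q=5):
    """Engle's LM test: T * R^2 from regressing e_t^2 on q lags of e_t^2."""
    e2 = np.asarray(resid) ** 2
    y = e2[q:]
    X = np.column_stack([np.ones(len(y))] + [e2[q - i:-i] for i in range(1, q + 1)])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta
    r2 = 1.0 - np.sum((y - fitted) ** 2) / np.sum((y - y.mean()) ** 2)
    stat = len(y) * r2
    return stat, chi2.sf(stat, df=q)       # statistic and p-value

rng = np.random.default_rng(4)
resid = 0.01 * rng.standard_normal(2000)   # iid residuals: expect no ARCH effect
print(arch_lm_test(resid, q=5))
```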


5.2 Information Criterion

An important issue in model-building is the determination of the orders of the ARCH and GARCH terms of the conditional variance equation. Because the GARCH model can be considered an ARMA process for the squared residuals, the conventional information criteria can be used for model selection. Three widely used information criteria are the Akaike information criterion (AIC), the Bayesian information criterion (SBIC) and the Hannan-Quinn criterion (HQIC), and their respective algebraic expressions are:

$$\text{AIC} = \ln(\hat{\sigma}^2) + \frac{2k}{T} \qquad (20)$$

$$\text{SBIC} = \ln(\hat{\sigma}^2) + \frac{k}{T}\ln(T) \qquad (21)$$

$$\text{HQIC} = \ln(\hat{\sigma}^2) + \frac{2k}{T}\ln\big(\ln(T)\big) \qquad (22)$$

where $\hat{\sigma}^2$ denotes the variance of the residuals, T represents the sample size, and k is the total number of estimated parameters, i.e. $k = p + q + 1$ for a GARCH (p,q) model. The model with the smallest value of AIC, SBIC and HQIC is considered to be the best one. However, a common finding in practice is that it is difficult to beat the GARCH (1,1) model.
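The three criteria of equations (20)-(22) are straightforward to compute. The sketch below assumes that the residuals of a fitted GARCH (p,q) model are available and that k = p + q + 1 as stated above; the simulated residuals are a placeholder.

```python
import numpy as np

def information_criteria(resid, p, q):
    """AIC, SBIC and HQIC of equations (20)-(22) for a GARCH(p,q) fit."""
    T = len(resid)
    k = p + q + 1                                  # number of estimated variance parameters
    log_var = np.log(np.var(resid))
    aic = log_var + 2.0 * k / T
    sbic = log_var + k / T * np.log(T)
    hqic = log_var + 2.0 * k / T * np.log(np.log(T))
    return aic, sbic, hqic

rng = np.random.default_rng(5)
resid = 0.01 * rng.standard_normal(5000)           # placeholder residual series
for (p, q) in [(1, 1), (1, 2), (2, 2)]:
    print((p, q), information_criteria(resid, p, q))
```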

5.3 Evaluating the Volatility Forecasts

5.3.1 Out-of-sample Forecast

The predictability of the estimated models is often evaluated by their out-of-sample forecast performance. Two common approaches used for out-of-sample forecasting are known as the recursive forecast and the rolling forecast. The recursive forecast has a fixed initial estimation date; the sample is increased by one observation and the model is re-estimated each time. For the L-step-ahead forecasts, this process is continued until no more L-step-ahead forecasts can be computed. The rolling forecast has a fixed length of the in-sample period used for estimating the model, i.e., both the start and the end estimation dates increase by one observation and the model is re-estimated each time. For the L-step-ahead forecasts, this process is continued until no more L-step-ahead forecasts can be computed. (Brooks, 2008)
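A minimal sketch of the rolling scheme is given below, assuming a GARCH(1,1) specification, a synthetic return series and an arbitrary in-sample window length; switching `sample = r[end - window:end]` to `sample = r[:end]` would give the recursive scheme instead.

```python
import numpy as np
from arch import arch_model

rng = np.random.default_rng(1)
r = rng.standard_normal(1500)        # stand-in return series (%)
window = 1000                        # fixed in-sample length for the rolling scheme

rolling_forecasts = []
for end in range(window, len(r)):
    sample = r[end - window:end]                     # both start and end dates move forward
    res = arch_model(sample, vol="GARCH", p=1, q=1).fit(disp="off")
    fc = res.forecast(horizon=1)                     # one-step-ahead variance forecast
    rolling_forecasts.append(fc.variance.values[-1, 0])

rolling_forecasts = np.array(rolling_forecasts)
```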

5.3.2 Traditional Evaluation Statistics

In most empirical studies, four error measurements are widely used to evaluate

the forecast performance of the estimated models. They are known as the root mean

square error (RMSE), the mean absolute error (MAE), the mean absolute percent

error (MAPE), and Theil’s U-statistic. These measurements are expressed as:

$$RMSE = \sqrt{\frac{1}{T-(T_1-1)}\sum_{t=T_1}^{T}\left(\sigma_t^2-\hat{\sigma}_t^2\right)^2} \qquad (23)$$

$$MAE = \frac{1}{T-(T_1-1)}\sum_{t=T_1}^{T}\left|\sigma_t^2-\hat{\sigma}_t^2\right| \qquad (24)$$

$$MAPE = \frac{100}{T-(T_1-1)}\sum_{t=T_1}^{T}\left|\frac{\sigma_t^2-\hat{\sigma}_t^2}{\sigma_t^2}\right| \qquad (25)$$

$$U = \frac{\sqrt{\sum_{t=T_1}^{T}\left(\sigma_t^2-\hat{\sigma}_t^2\right)^2}}{\sqrt{\sum_{t=T_1}^{T}\left(\sigma_t^2-\tilde{\sigma}_t^2\right)^2}} \qquad (26)$$

where T represents the total number of observations and $T_1$ is the first out-of-sample forecast observation. Therefore, the model is estimated on the observations from 1 to $(T_1-1)$, and the observations from $T_1$ to T are used for the out-of-sample forecasting. $\sigma_t^2$ and $\hat{\sigma}_t^2$ denote the actual and the estimated conditional variance at time t, respectively. $\tilde{\sigma}_t^2$ is obtained from a benchmark model, which is often a simple model such as the random walk model.

RMSE provides a quadratic loss function. A distinctive feature of RMSE is that it penalizes large forecast errors disproportionately, which is useful when large errors are especially costly; conversely, when large errors are not disproportionately serious, this becomes a disadvantage of RMSE. (Brooks, 2008)

MAE measures the average absolute forecast error. Although the functional forms of RMSE and MAE are simple, they are not invariant to scale transformations, and their symmetric treatment of over- and under-predictions is not very realistic in some cases. (Yu, 2002)

MAPE measures the percentage error, i.e. its value is restricted to lie between zero and one hundred percent. MAPE has the advantage of being useful for comparing the performance of the estimated models with that of the random walk model: for a random walk in the log level, the MAPE criterion is equal to one, so an estimated model with a MAPE smaller than one is considered better than the random walk model. However, if the series takes on absolute values smaller than one, MAPE is not reliable. (Brooks, 2008)

Since one term in Theil's U-statistic is the estimated conditional variance from the benchmark model, the forecast errors are standardized. The U-statistic can be used to compare the estimated model with the benchmark model. If the U-statistic equals one, the estimated model has the same accuracy as the benchmark model; if it is smaller than one, the estimated model is considered better than the benchmark model (Brooks, 2008). In contrast to MAE, Theil's U-statistic is invariant to scalar transformations, but it is still symmetric. (Yu, 2002)
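The four measures of equations (23)-(26) are straightforward to compute; the helper below is a sketch that takes aligned arrays of actual variances, model forecasts and benchmark forecasts over the out-of-sample period.

```python
import numpy as np

def forecast_errors(actual, forecast, benchmark):
    """RMSE, MAE, MAPE and Theil's U as in eqs. (23)-(26); `benchmark` holds the
    forecasts of the benchmark (e.g. random walk) model, aligned over T1,...,T."""
    actual, forecast, benchmark = map(np.asarray, (actual, forecast, benchmark))
    err = actual - forecast
    rmse = np.sqrt(np.mean(err ** 2))
    mae = np.mean(np.abs(err))
    mape = 100.0 * np.mean(np.abs(err / actual))                   # undefined if actual == 0
    theil_u = np.sqrt(np.sum(err ** 2)) / np.sqrt(np.sum((actual - benchmark) ** 2))
    return rmse, mae, mape, theil_u
```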


6 Data

The data used for our empirical study are the daily returns and daily implied volatilities of the S&P 500 Index over 5291 trading days in a 21-year period. The in-sample period ranges from 3 January 1990 to 31 December 2009, providing 5039 daily observations, followed by the out-of-sample period from 2 January 2010 to 31 December 2010, comprising 252 daily observations.

6.1 S&P 500 Index Daily Returns

Daily returns on the S&P 500 index are defined in the standard way as the natural logarithm of the ratio of consecutive daily closing levels. Index returns are adjusted for dividends. Denoting the price at the end of trading day t by $P_t$, the log return or continuously compounded return is computed as:

$$r_t = 100 \times \log\left(P_t / P_{t-1}\right) \qquad (27)$$
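For illustration, equation (27) and the Jarque-Bera statistic defined in equation (28) below can be computed as follows; the price vector shown is purely illustrative and scipy's jarque_bera routine is used in place of a hand-coded statistic.

```python
import numpy as np
from scipy import stats

# prices: daily S&P 500 closing levels (dividend-adjusted), oldest first.
# The values below are illustrative only.
prices = np.array([1469.25, 1455.22, 1399.42, 1402.11, 1403.45])

r = 100.0 * np.diff(np.log(prices))          # eq. (27): percentage log returns

skew = stats.skew(r)
kurt = stats.kurtosis(r, fisher=False)       # raw kurtosis, compared with 3 in Table 1
jb_stat, jb_pvalue = stats.jarque_bera(r)    # Jarque-Bera test of normality, eq. (28)
```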

Table 1 shows some standard summary statistics for both the full sample and the yearly sub-periods, along with the Jarque-Bera test for normality. The latter is defined as:

$$JB = T\left(\frac{S^2}{6} + \frac{(K-3)^2}{24}\right) \sim \chi^2(2) \qquad (28)$$

where S and K represent the sample skewness and kurtosis, respectively. The null hypothesis is that the observations are independently and identically distributed (iid) normal. JB is asymptotically distributed as chi-square with two degrees of freedom. As can be seen, the average daily return over the full sample period is 0.024% and the daily (annual) standard deviation is 1.17% (18.57%). As is expected for a time series of returns, the average daily returns of both the full sample period and all sub-periods are close to zero, and most of them are slightly positive.


Table 1. Summary statistics for S&P 500 index daily returns

Period Obs. Mean Max. Min. Median Std. Dev. Skewness Kurtosis JB

All 5291 0.02366 10.9572 -9.46951 0.05222 1.17112 -0.19939 11.86668 17367.04

1990 252 -0.03392 3.13795 -3.07110 0.10574 1.00134 -0.16909 3.62153 5.257010

1991 252 0.09268 3.66421 -3.72717 -0.00908 0.89962 0.17191 4.95451 41.35232

1992 254 0.01720 1.54441 -1.87401 0.00475 0.60972 0.05634 3.23772 0.732460

1993 253 0.02695 1.90943 -2.42929 0.00867 0.54192 -0.17885 5.41942 63.05525

1994 252 -0.00616 2.11232 -2.29358 0.01293 0.62069 -0.29147 4.27654 20.67846

1995 252 0.11647 1.85818 -1.55830 0.09443 0.49127 -0.07153 4.08430 12.56164

1996 254 0.07264 1.92519 -3.13120 0.05538 0.74320 -0.61248 4.75474 48.46755

1997 251 0.10761 4.98869 -7.11275 0.18832 1.14970 -0.67569 9.42657 451.0362

1998 252 0.09381 4.96460 -7.04376 0.14023 1.28147 -0.61991 7.72505 250.5634

1999 252 0.07078 3.46586 -2.84590 0.03313 1.13707 0.06162 2.86455 0.352110

2000 252 -0.04242 4.65458 -6.00451 -0.03791 1.40018 0.00075 4.38816 20.23325

2001 248 -0.05635 4.88840 -5.04679 -0.06114 1.35822 0.02048 4.44777 21.67631

2002 252 -0.10561 5.57443 -4.24234 -0.17836 1.63537 0.42507 3.66104 12.17688

2003 252 0.09291 3.48136 -3.58671 0.12758 1.07374 0.05323 3.75894 6.166869

2004 252 0.03417 1.62329 -1.64550 0.06359 0.69883 -0.11016 2.86226 0.708838

2005 252 0.01173 1.95440 -1.68168 0.05587 0.64773 -0.01553 2.84928 0.248659

2006 251 0.05087 2.13358 -1.84963 0.09829 0.63098 0.10281 4.15534 14.40212

2007 251 0.01382 2.87896 -3.53427 0.08083 1.00926 -0.49408 4.44814 32.14436

2008 253 -0.19206 10.9572 -9.46951 0.00000 2.58401 -0.03373 6.67544 142.4539

2009 252 0.08361 6.83664 -5.42620 0.18690 1.71760 -0.06047 4.85098 36.12797

2010 252 0.04774 4.30347 -3.97557 0.07988 1.13778 -0.21103 4.95993 42.20451


It is obvious that there is a large difference between the maximum and minimum returns, and this is a common feature of index returns. The time-varying standard deviations indicate that there is considerable fluctuation in S&P 500 daily returns. The distribution of daily index returns over the full sample period is clearly non-normal, with negative skewness and pronounced excess kurtosis. The skewness statistics are negative for 13 sub-periods and positive for the remaining 8 sub-periods; the values of kurtosis exceed 3 in all periods. The information observed from Table 1 indicates that the distribution of the observations does not match the normality assumption.

Figure 1 plots the daily log returns, squared returns, and absolute returns of the S&P 500 index over the whole study period from January 03, 1990 to December 31, 2010. There is no clear discernible pattern of behavior in the log returns, but there is some persistence in the plots of the squared and absolute returns, which represent the volatility of returns. In particular, the plots show evidence of volatility clustering, implying that low values of volatility tend to be followed by low values and high values of volatility tend to be followed by high values.

Figure 1. Daily returns, squared daily returns and absolute daily returns for the S&P 500 index

6.1.1 Autocorrelations of S&P 500 Index Daily Returns

The sample autocorrelations of the daily log returns, squared returns, and absolute returns of the S&P 500 index are presented in Figure 2. An autocorrelation is deemed significant at the 5% level if $|\text{autocorrelation}| > 1.96/\sqrt{5226}$.


As can be seen, the log returns show no evidence of serial correlation, while the autocorrelations of the squared and absolute returns alternate between positive and negative values. Further, the sample autocorrelations of the squared and absolute returns appear to decay slowly, which is evidence of long-memory behavior.

Figure 2. Autocorrelations of $r_t$, $r_t^2$ and $|r_t|$ for the S&P 500 index
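A sketch of how these sample autocorrelations and the significance bound can be computed is given below; the heavy-tailed synthetic series merely stands in for the S&P 500 returns.

```python
import numpy as np
from statsmodels.tsa.stattools import acf

# r: daily log returns (%); here a heavy-tailed stand-in series
r = np.random.default_rng(2).standard_t(df=4, size=5226)

nlags = 20
band = 1.96 / np.sqrt(len(r))                      # approximate 5% significance bound
acf_r   = acf(r, nlags=nlags)
acf_r2  = acf(r ** 2, nlags=nlags)
acf_abs = acf(np.abs(r), nlags=nlags)

significant_r2 = np.abs(acf_r2[1:]) > band         # lags with significant autocorrelation
```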

6.1.2 Testing ARCH Effect of S&P 500 Index Daily Returns

The test for the presence of ARCH effects is conducted with the Ljung-Box test computed from the daily squared returns, and with the LM test for different lags of the residuals from the estimation of S&P 500 index daily returns. The summary statistics are presented in Table 2. The results of both the Ljung-Box and the LM tests are statistically significant and indicate the presence of ARCH effects in S&P 500 daily index returns, showing evidence of volatility clustering.

Table 2. Test for ARCH effect in S&P 500 daily index returns

lag 1 5 10 15

Ljung-Box 225.51 2089.4 4097.0 5762.2

(0.0000) (0.0000) (0.0000) (0.0000)

LM 220.59 1208.01 1379.53 1529.60

(0.0000) (0.0000) (0.0000) (0.0000)

Notes: p-values are in parentheses


6.2 Properties of the VIX Index

Although the VIX has potential flaws, compared to other implied volatility indices it eliminates most of the problems of mis-measurement. Therefore, we use it as our measure of S&P 500 index implied volatility. Adjusted daily values of VIX at the close of option trading are used.

Table 3 presents the summary statistics of the VIX index for the sample period from January 03, 1990 to December 31, 2010. The average level of the implied volatility index is 20.3949% over the sample period. The autocorrelation statistics indicate that the series is highly persistent. The distribution of VIX is non-normal, with positive skewness and excess kurtosis. Since the statistic of the Augmented Dickey-Fuller test is -4.49 with a p-value of 0.0002, the null hypothesis of the presence of a unit root can be rejected at the 1% level.

Table 3. Summary statistics of the VIX index

Mean Std.Dev Skewness Kurtosis p(1) p(2) p(3) p(4) p(5) ADF

0.203949 0.082424 2.020700 10.26646 0.983* 0.969* 0.959* 0.950* 0.942* -4.49

(0.0002)

Note: P(i) denotes autocorrelations of series for i-lag; * is significant at 1% level; the P-value is in the parenthesis.
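The unit-root test reported in Table 3 can be reproduced along the following lines; the VIX series here is a synthetic stand-in and the automatic lag selection is an assumption, since the text does not state the lag choice.

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

# vix: daily VIX closes in decimal form (e.g. 0.20 for 20%); stand-in data here
vix = 0.20 + 0.05 * np.abs(np.random.default_rng(3).standard_normal(5291))

adf_stat, pvalue, usedlag, nobs, crit, icbest = adfuller(vix, autolag="AIC")
print(adf_stat, pvalue)        # reject a unit root if the p-value is below 0.01
```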

6.3 Study on S&P 500 Index and the VIX Index

6.3.1 Cross-correlations between S&P 500 Index and the VIX Index

Table 4 presents the cross-correlations between S&P 500 index daily returns and the VIX index for both the full sample and the yearly sub-periods. The contemporaneous cross-correlations for the full sample period and all yearly periods are negative, and those of 15 yearly sub-periods are highly significant.


Table 4. Cross-correlation between S&P 500 index daily returns and implied volatility index

Period Obs. -2 -1 0 +1 +2

All 5291 0.0135 0.0217 -0.1214* -0.1085* -0.00926*

1990 252 0.0463 0.0341 -0.1805* -0.2036* -0.1802*

1991 252 0.1840 0.1438 -0.0570 -0.0537 -0.0451

1992 254 0.0352 0.0156 -0.1583* -0.1789* -0.1256*

1993 253 0.1210 0.0939 -0.1795* -0.2546** -0.1913*

1994 252 0.0403 0.0403 -0.2850* -0.2743* -0.2723*

1995 252 -0.0108 -0.0345 -0.2921* -0.2613* -0.1789*

1996 254 0.1019 0.0378 -0.3134* -0.2398* -0.1970*

1997 251 0.0838 0.0863 -0.1273 -0.1246 -0.0935

1998 252 0.0599 0.0583 -0.1748* -0.1573* -0.1258*

1999 252 0.1301 0.1126 -0.2784* -0.2330* -0.2706*

2000 252 0.1012 0.0907 -0.2252* -0.2068* -0.1223*

2001 248 0.1438 0.1252 -0.1401 -0.1300 -0.1052

2002 252 0.0956 0.0951 -0.1150 -0.1148 -0.0999

2003 252 -0.0223 -0.0084 -0.1088 -0.0974 -0.0801

2004 252 0.0888 0.0788 -0.2422* -0.2219* -0.1907*

2005 252 0.1192 0.1524 -0.2500* -0.1649* -0.1795*

2006 251 0.0926 0.0598 -0.2606* -0.2476* -0.1505*

2007 251 0.0232 0.0788 -0.1839* -0.1243 -0.1122

2008 253 0.0348 0.0627 -0.1271 -0.0999 -0.0804

2009 252 -0.0313 0.0025 -0.1670* -0.1394* -0.1167

2010 252 0.0030 0.0359 -0.2826* -0.2690* -0.2314*


We also observed some significant cross-correlations at other leads for various yearly periods, but not at any lags.

Figure 3 further confirms the negative relationship between the S&P 500 Index and the VIX index, i.e. when the S&P 500 Index level peaks the VIX is at a trough, and vice versa. Two common explanations for the phenomenon in Figure 3 are the leverage effect and the time-varying risk-premium effect. The leverage effect implies that a decline in the stock price reduces the value of equity and thereby increases leverage, so the risk, i.e. the volatility of the stock market, will increase. The time-varying risk-premium effect, also known as the volatility feedback effect, suggests that an increase in expected volatility raises the asset's risk premium, leading to a higher expected return and a decrease in the current stock price.

Figure 3. S&P 500 Index (logarithm) and the VIX Index


6.3.2 S&P 500 Index Daily Returns and the VIX Index

The relationship between stock market returns and an implied volatility index was first investigated for the US stock market by Fleming et al. (1995), who demonstrated the presence of a significant negative and asymmetric relationship. The VIX index is widely recognized as an effective proxy for expected volatility. Since the VIX was calculated from the option prices of the S&P 100 index before 2003, it is interesting to study the contemporaneous relationship between S&P 500 index daily returns and the VIX index using 21 years of historical data, and we want to confirm whether the relationship between the S&P 500 index and the VIX based on it is still negative and asymmetric.

Following Fleming et al. (1995), we regress contemporaneous daily VIX changes on S&P 500 index daily returns together with their leads and lags. In order to evaluate whether there is an asymmetric contemporaneous relationship between S&P 500 index returns and the VIX index, the absolute daily return at a lag of zero is included. Additionally, the VIX change at a lag of one is also included to control for first-order autocorrelation. The regression has the form:

$$\Delta VIX_t = \alpha_0 + \sum_{j=1}^{2}\beta_{-j}R_{t-j} + \beta_0 R_t + \sum_{j=1}^{2}\beta_{+j}R_{t+j} + \gamma|R_t| + \delta\,\Delta VIX_{t-1} + \varepsilon_t \qquad (29)$$

In line with previous empirical studies by Fleming et al. (1995), Frijns et al. (2008) and Frijns et al. (2010), the parameter $\beta_0$ is expected to be negative. If $\gamma$ is positive and significant, the relationship between S&P 500 index returns and changes in VIX is asymmetric.
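A sketch of how regression (29) can be estimated by OLS is given below; the variable names, the simulated data and the use of statsmodels are illustrative assumptions, not part of the original study.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# r: S&P 500 daily returns (%); dvix: first differences of VIX; stand-in data here
rng = np.random.default_rng(4)
r = pd.Series(rng.standard_normal(5291))
dvix = pd.Series(-0.9 * r + 0.3 * rng.standard_normal(5291))

X = pd.DataFrame({
    "R_lag2": r.shift(2), "R_lag1": r.shift(1), "R_0": r,
    "R_lead1": r.shift(-1), "R_lead2": r.shift(-2),
    "absR_0": r.abs(),                    # asymmetry term
    "dVIX_lag1": dvix.shift(1),           # controls for first-order autocorrelation
})
X = sm.add_constant(X)

data = pd.concat([dvix.rename("dVIX"), X], axis=1).dropna()
res = sm.OLS(data["dVIX"], data.drop(columns="dVIX")).fit()
print(res.summary())
```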

Table 5 presents the regression results for VIX changes and intertemporal S&P 500 index daily returns for the full sample and the yearly sub-periods. For the full sample period, the sign of the parameter $\beta_0$ is as expected. The highly significant $\beta_0$, with a t-statistic of -90.43, confirms the negative contemporaneous relationship between VIX changes and S&P 500 index daily returns. The positive and significant $\gamma$, with a t-statistic of 9.62, shows evidence of an asymmetric relationship between VIX changes and S&P 500 index daily returns.


Table 5. Regression results for VIX changes and S&P 500 index daily returns

Period $\alpha_0$ $\beta_{-1}$ $\beta_{-2}$ $\beta_0$ $\gamma$ $\beta_{+1}$ $\beta_{+2}$ $\delta$ $R^2$

All -0.0926* (-5.30) 0.0738* (6.70) -0.0377** (-2.15) -0.9931* (-90.43) 0.1442* (9.62) 0.0060 (-0.55) 0.0429* (3.92) -0.0899* (-6.63) 0.6238
1990 -0.5052* (-3.86) 0.1357 (1.61) -0.0896 (-0.86) -0.9483* (-11.24) 0.6644* (5.00) 0.0107 (0.13) 0.0354 (0.43) -0.1161*** (-1.90) 0.4335
1991 -0.2125** (-2.28) 0.1071 (1.53) -0.0408 (-0.49) -0.8401* (-11.99) 0.3743* (3.56) -0.0915 (-1.32) 0.1276*** (1.84) -0.1064*** (-1.70) 0.3830
1992 -0.1979* (-3.33) 0.1579** (2.54) -0.1587** (-2.18) -0.6374* (-10.02) 0.3952* (4.01) -0.1131*** (-1.79) -0.0998 (-1.55) -0.1219** (-2.04) 0.3290
1993 -0.2033* (-3.74) 0.0970 (1.44) -0.2013** (-2.56) -0.6820* (-10.19) 0.5470* (5.42) -0.0254 (-0.38) 0.0544 (0.81) -0.1841* (-3.12) 0.3840
1994 -0.0835 (-1.34) 0.1325** (1.98) -0.2083** (-2.21) -1.0879* (-16.18) 0.1832*** (1.80) 0.0042 (0.06) 0.0247 (0.37) -0.2372* (-3.85) 0.5464
1995 -0.0549 (-0.93) 0.1517** (2.08) -0.0173 (-0.22) -0.5777* (-7.76) 0.3325* (3.02) -0.0508 (-0.70) -0.1233*** (-1.70) -0.1454** (-2.34) 0.2362
1996 -0.0971 (-1.44) 0.1498* (2.41) 0.1709** (2.06) -0.9191* (-15.45) 0.3105* (3.49) -0.0593 (-0.99) 0.0488 (0.83) -0.0997 (-1.62) 0.5313
1997 -0.2008** (-2.40) 0.0828*** (1.76) 0.1154*** (1.71) -0.7647* (-16.35) 0.3359* (4.65) -0.0428 (-0.90) -0.0817*** (-1.75) 0.1188*** (1.94) 0.5582
1998 0.0778 (0.85) 0.1566* (3.26) 0.0944 (1.02) -1.2421* (-26.57) 0.0134 (0.19) -0.0496 (-1.06) 0.0772*** (1.66) 0.0263 (0.42) 0.7521
1999 0.1539*** (1.92) -0.0058 (-0.13) 0.0182 (0.24) -1.0018* (-23.15) -0.1007 (-1.42) -0.0409 (-0.93) 0.1346* (3.17) -0.1211*** (-1.92) 0.7002
2000 -0.1155 (-1.41) 0.1335* (3.44) 0.0234 (0.38) -0.7496* (-20.02) 0.0969 (1.61) -0.0123 (-0.33) 0.0033 (0.09) -0.0266 (-0.43) 0.6469
2001 -0.3038* (-3.66) 0.0622 (1.53) -0.0326 (-0.46) -0.9238* (-22.95) 0.2281* (3.67) -0.0569 (-1.40) -0.0429 (-1.03) -0.1048*** (-1.67) 0.7019
2002 -0.2814* (-3.02) 0.0655*** (1.83) -0.0315 (-0.47) -0.8930* (-25.03) 0.1655* (2.88) -0.0282 (-0.79) 0.0079 (0.22) -0.0252 (-0.40) 0.7214
2003 -0.1754* (-2.63) 0.1053* (2.70) 0.0374 (0.69) -0.5826* (-14.80) 0.2182* (3.50) -0.0149 (-0.37) -0.0280 (-0.70) 0.0503 (0.80) 0.4901
2004 -0.1414* (-2.75) 0.0601 (1.30) -0.0546 (-0.77) -0.8820* (-19.40) 0.2789* (3.79) -0.0461 (-1.01) -0.0717 (-1.57) -0.1517** (-2.48) 0.6220
2005 -0.0933** (-2.28) -0.0002 (-0.00) -0.0207 (-0.31) -0.8988* (-23.59) 0.1887* (2.98) 0.0041 (0.11) 0.0458 (1.21) -0.1561** (-2.53) 0.7153
2006 -0.0414 (-0.80) 0.1711* (3.18) 0.0110 (0.12) -1.1016* (-20.09) 0.1827** (2.25) -0.1048*** (-1.92) 0.0965*** (1.78) -0.0296 (-0.47) 0.6664
2007 -0.0403 (-0.55) 0.1296** (2.51) -0.0924 (-0.92) -1.3498* (-25.79) 0.1485** (2.00) 0.0463* (0.91) 0.0249 (0.49) -0.1336** (-2.12) 0.7692
2008 -0.2082 (-1.42) -0.1042** (-2.38) -0.1385 (-1.60) -1.2136* (-26.88) 0.0132 (0.23) 0.0223 (0.51) 0.0978** (2.26) -0.0868 (-1.37) 0.7806
2009 -0.3577* (-3.39) 0.1740* (4.10) -0.1078 (-1.56) -0.8979* (-20.94) 0.2638* (4.29) 0.0871** (2.03) 0.0126 (0.29) -0.2142* (-3.64) 0.6910
2010 -0.2000** (-2.01) 0.1898* (3.12) 0.0473 (0.43) -1.4543* (-23.89) 0.3010* (3.43) 0.0829 (1.36) -0.0696 (-1.15) 0.0005 (0.01) 0.7185

Note: *, **, *** indicate significance at the 1%, 5% and 10% levels, respectively; t-statistics are in parentheses; column labels follow equation (29).

Since the parameter $\delta$ is negative and significant, with a t-statistic of -6.63, there is first-order autocorrelation in the VIX changes series. For the lags and leads, the coefficients at lag one and lag two are significant at the 1% and 5% levels, respectively, and the coefficient at lead two is highly significant at the 1% level while that at lead one is insignificant. The value of R² is moderate at 0.6238.

The results for the yearly sub-periods in Table 5 are similar to those for the full sample period. The coefficients $\beta_0$ are negative and highly significant at the 1% level for all sub-periods, clearly exposing the negative intertemporal relationship between VIX changes and S&P 500 index daily returns throughout the full sample. The sign of the parameter $\gamma$ in each sub-period is as expected except for four yearly sub-periods: 1998, 1999, 2000 and 2008. In general, the positive and highly significant $\gamma$ estimates support the asymmetric relationship between VIX changes and S&P 500 index daily returns. Although the negative and significant $\delta$ for the full sample period shows first-order autocorrelation in the VIX changes series, this cannot be observed in most sub-periods, particularly after 1996.

We also found that the significant lead and lag coefficients are sporadic across sub-periods, particularly during the period from 2000 to 2010. Finally, the minimum R² among the sub-periods is 0.2362, in 1995, and the maximum is 0.7806, in 2008. By and large, the values of R² are desirable.

We conclude that the contemporaneous relationship between VIX changes and S&P 500 index daily returns is significantly negative and asymmetric, and that the series of VIX changes has autocorrelation of order one.

Figure 4 shows the relationship between S&P 500 index daily returns and the VIX index. As can be seen, in some periods the positive returns associated with decreases in implied volatility are smaller than the negative returns associated with increases in implied volatility. The plot in Figure 4 explicitly exhibits the negative and asymmetric relationship between these two series as well.

Figure 4. S&P 500 index daily returns and the VIX index


The relationship between S&P 500 index absolute daily returns and the VIX index is also examined, and the time series plot is presented in Figure 5. There is obviously a close relationship between the S&P 500 index absolute daily returns and the VIX, because these two series broadly move in unison during the sample period. Therefore, the VIX performs well in capturing the market volatility of S&P 500 index returns.

Figure 5. S&P 500 index absolute daily returns and the VIX index


7 Estimation and Discussion

This section starts with the study of the information criteria used for the selection of the orders of the ARCH and GARCH terms of various GARCH (p,q) models. Then, the numerical accuracy of the estimates of the GARCH (p,q) models is examined by comparing them with the estimates from ARCH (p) models. Next, we detail the coefficient estimates of the respective GARCH (p,q) models as well as the benchmark model (the random walk model). In addition, the results of the standard diagnostics for the residuals from the estimated models are also analyzed. Finally, we provide the results of the BDS test and graphical diagnostics for the standardized residuals of the GARCH (p,q) models.

7.1 Model Selection

The information criteria AIC, SBIC and HQIC of various GARCH (p,q) models fitted to the daily returns of the S&P 500 index are presented in Table 6. When q=0, the GARCH (p,q) model reduces to the pure ARCH (p) model.

The information in Panel A indicates that GARCH (4,0), which is equivalent to ARCH (4), is selected by all information criteria. Panel B shows that GARCH (1,1) is picked by SBIC, while GARCH (2,2) is selected by both AIC and HQIC. For the GJR (p,q) models in Panel C, GJR (2,1) is selected by all information criteria. Finally, EGARCH (2,2) is picked by AIC and HQIC but EGARCH (2,1) is preferred by SBIC. The last column reports the log likelihood of each model, and the GARCH (4,0), GARCH (2,2), GJR (2,2) and EGARCH (2,2) models are selected by the maximum value of the log likelihood.

It appears that AIC and HQIC always provide the same information. An information criterion is only one criterion for model selection, and it cannot perfectly indicate whether the model with the smallest value of AIC, SBIC or HQIC is the best one.

The performance of an estimated model is affected by many factors. Since the GARCH (1,1) model is usually assumed to be hard to beat for modeling and forecasting financial time series, and the parsimonious GJR (1,1) and EGARCH (1,1) models are also widely used in empirical studies, we decided to model the S&P 500 index daily returns using the GARCH (1,1), GJR (1,1) and EGARCH (1,1) models.
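For reference, all three specifications can be estimated with the arch package as sketched below; the synthetic in-sample series and the normal error distribution mirror the assumptions stated in the text, but the code itself is only an illustration.

```python
import numpy as np
from arch import arch_model

r = np.random.default_rng(5).standard_t(df=5, size=5039)   # stand-in for in-sample returns (%)

specs = {
    "GARCH(1,1)":  dict(vol="GARCH",  p=1, o=0, q=1),
    "GJR(1,1)":    dict(vol="GARCH",  p=1, o=1, q=1),   # o=1 adds the asymmetry (GJR) term
    "EGARCH(1,1)": dict(vol="EGARCH", p=1, o=1, q=1),
}

fits = {}
for name, kw in specs.items():
    fits[name] = arch_model(r, mean="Constant", dist="normal", **kw).fit(disp="off")
    print(name, round(fits[name].loglikelihood, 1))
```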

Table 6. Information criteria for estimated GARCH (p,q) models

( p,q ) AIC SBIC HQIC LL

Panel A: GARCH(p,0)

(1,0) 3.0489 3.0328 3.0503 -7678.777

(2,0) 2.9144 2.9196 2.9163 -7338.925

(3,0) 2.8610 2.8674 2.8632 -7203.218

(4,0) 2.8043 2.8121 2.8070 -7059.384

Panel B: GARCH(p,q)

(1,1) 2.7115 2.7167 2.7134 -6827.712

(1,2) 2.7112 2.7177 2.7135 -6825.992

(2,1) 2.7105 2.7170 2.7127 -6824.053

(2,2) 2.7098 2.7175 2.7125 -6821.283

Panel C: GJR (p,q)

(1,1) 2.6846 2.6911 2.6869 -6758.813

(1,2) 2.6850 2.6928 2.6877 -6758.810

(2,1) 2.6807 2.6885 2.6835 -6748.130

(2,2) 2.6811 2.6902 2.6843 -6748.089

Panel D: EGARCH (p,q)

(1,1) 2.6823 2.6888 2.6846 -6753.040

(1,2) 2.6827 2.6905 2.6854 -6753.026

(2,1) 2.6783 2.6860 2.6810 -6741.874

(2,2) 2.6777 2.6868 2.6809 -6739.517

Note: LL denotes log likelihood.


7.2 Test Numerical Accuracy of GARCH Estimates

After a model has been estimated, it is necessary to check the numerical accuracy of the estimates to ensure that the estimated model is adequate for volatility estimation. Otherwise, inappropriate coefficient estimates can induce spurious inference.

Figure 6. Estimates from various GARCH (p,q) models

(Panels: S&P 500 index daily returns; estimates from GARCH (1,1), GJR (1,1), EGARCH (1,1), GARCH (1,0), GARCH (2,0), GARCH (3,0) and GARCH (4,0).)

Zivot (2008) suggests that the numerical accuracy of the model estimates can be examined by comparing the volatility estimates of the GARCH (1,1) model with the volatility estimates from ARCH (p) models. If the volatility estimates from these different models exhibit similar dynamics, then the coefficient estimates of the models are appropriate. Following Zivot (2008), we compared the graphical volatility of the GARCH (1,1), GJR (1,1) and EGARCH (1,1) models with that of the GARCH (1,0), GARCH (2,0), GARCH (3,0) and GARCH (4,0) models.

As can be seen from Figure 6, all models perform well in capturing the observed volatility clustering in S&P 500 index daily returns. In particular, they explicitly describe the tremendous fluctuation of volatility in the period from 2007 to 2009, a turbulent period that spanned both economic prosperity and economic crisis. Compared to the GARCH (p,0) models (p = 1, 2, 3, 4), which are equivalent to ARCH (p) models, the volatilities of GARCH (1,1), GJR (1,1) and EGARCH (1,1) are much smoother and display more persistence. Since the estimated volatilities from these models exhibit similar dynamics, the estimates of the GARCH (1,1), GJR (1,1) and EGARCH (1,1) models are appropriate.

7.3 Estimates of Models

Table 7 presents the estimates of the various models: the random walk model and the GARCH (1,1), GJR (1,1) and EGARCH (1,1) models. The second column presents the estimated parameters and diagnostic results of the random walk model. The coefficient estimate of the constant term is close to zero and statistically insignificant. The DW statistic, the result of the Durbin-Watson test, is very close to 2, suggesting that there is no first-order autocorrelation in the residuals. However, the Q-statistic and LM statistic from the Ljung-Box and ARCH-LM tests at lag 10 indicate the presence of autocorrelation.

The coefficient estimates of the conditional mean and conditional variance equations of the GARCH (1,1) model are shown in the third column of Table 7. We assumed that the residuals of the S&P 500 daily returns are normally distributed.


Table 7. The summary statistics of estimated volatility models

Models Random Walk GARCH(1,1) GJR(1,1) EGARCH(1,1)

Panel A: Estimates of Mean Equation
c 0.02245 (1.36) 0.0478* (4.09) 0.0230** (1.96) 0.0238** (2.14)

Panel B: Estimates of Conditional Variance Equation
$\omega$ – 0.0073* (7.14) 0.0181* (9.85) -0.0883* (-13.74)
$\alpha$ – 0.0635* (14.48) 0.0017 (0.38) 0.1123* (13.63)
$\beta$ – 0.9309* (196.06) 0.9322* (202.05) 0.9855* (689.48)
$\gamma$ – – 0.1132* (15.38) -0.0925* (-16.13)

Panel C: Diagnostic Results of Residuals
DW 2.11 2.11 2.11 2.11
Q-statistic(10) 53.636* 15.056 13.014 13.739
LM(10) 1296.372* 8.3173 7.6431 12.0981
LL -7952.988 -6827.712 -6758.813 -6750.04
Skewness -0.1987 -0.4026 -0.4077 -0.3819
Kurtosis 12.1684 4.9230 4.8199 4.6840

Note: * (**) denotes significance at the 1% (5%) level; z-statistics are in parentheses; the DW statistic is from the Durbin-Watson test; Q-statistic and LM are the results of the Ljung-Box and ARCH-LM tests, respectively. $\omega$, $\alpha$, $\beta$ and $\gamma$ denote the intercept, ARCH, GARCH and asymmetry terms of the conditional variance equation.


The coefficient estimates of the conditional variance equation are consistent with our expectations. The intercept term is very small (0.0073), the parameter of the ARCH term equals 0.0635, and the coefficient on the lagged conditional variance is 0.9309. The coefficients on both the lagged squared residual and the lagged conditional variance terms are highly significant, implying the presence of ARCH and GARCH effects. The sum of the coefficients on the ARCH and GARCH terms is very close to unity (0.9944), suggesting that the model is covariance stationary with a high degree of persistence and long memory in the conditional variance, i.e., a large positive or negative return will lead future forecasts of the variance to be high for a protracted period. The half-life of shocks to the volatility of the S&P 500 index, computed as $\ln(0.5)/\ln(\alpha+\beta) = \ln(0.5)/\ln(0.9944)$, is about 123 days. Additionally, the sum of the coefficients on the ARCH and GARCH terms is also an estimate of the rate at which the response function decays on a daily basis. The response function to shocks declines slowly because this rate is very high (0.9944), meaning that a new shock will affect the returns for a long period: old information remains important and the information decays very slowly. Furthermore, the highly statistically significant coefficient estimates of the ARCH and GARCH terms ($\alpha$ and $\beta$) suggest that the constant variance model can be rejected, at least within the sample period. Finally, the unconditional standard deviation of returns is 1.14, computed as $\sqrt{\omega/(1-\alpha-\beta)} = \sqrt{0.0073/(1-0.9944)}$, and it is very close to the sample standard deviation presented in Table 1, which equals 1.17. The DW statistic suggests there is no first-order autocorrelation, and the null hypothesis that the residuals are not serially correlated up to lag 10 is not rejected by the Q-statistic or the LM statistic. The statistics of skewness and kurtosis show that the residuals are non-normal.

As can be seen from the penultimate column of Table 7, the estimated parameters on the asymmetric term and the lagged conditional variance of the GJR (1,1) model are statistically significant but the ARCH parameter is insignificant, and the positive and significant coefficient of the asymmetric term implies the presence of a leverage effect. Since all coefficient estimates are positive, negative shocks imply a higher next-period conditional variance than positive shocks of the same size. This is consistent with our expectations for the application of a GARCH-type model to index returns. The sum of the coefficients on the lagged squared error and the lagged conditional variance is very close to unity (0.9339); thus, shocks to the conditional variance will be highly persistent. Because the coefficient estimate of the asymmetric term (0.1132) is smaller than $2(1-\alpha_1-\beta_1) = 0.1322$, the model is stationary. Since the DW statistic is not significantly different from 2 and both the Q-statistic and the LM statistic indicate there is no correlation up to lag 10, the residuals are not serially correlated.

The statistical properties of the EGARCH (1,1) model are presented in the last column of Table 7. The sum of the coefficient estimates of the ARCH and GARCH terms is approximately 1, implying that shocks to the conditional variance will be highly persistent. The negative coefficient of the asymmetric term suggests that positive shocks imply a higher next-period conditional variance than negative shocks of the same size, which is inconsistent with our expectation. The negative coefficient estimate of the asymmetric term thus suggests the absence of a leverage effect, and this conflicts with the inference from GJR (1,1). Since the absolute value of the coefficient estimate of the logarithmic GARCH term is less than 1, the model is stationary and has finite kurtosis. It is interesting to find that the DW statistics of all models studied in Table 7 have the same value. With respect to the serial correlation of the residuals of the estimated EGARCH (1,1) model, the DW statistic explicitly indicates that there is no first-order autocorrelation, and the Ljung-Box and ARCH-LM tests demonstrate that the residuals are independent up to lag 10. As with the GARCH (1,1) and GJR (1,1) models, the null hypothesis that the residuals of EGARCH (1,1) are normally distributed is rejected, given the negative skewness and excess kurtosis.

7.4 BDS Test

The nonparametric BDS test examines the nonlinearity of the residuals. The null hypothesis of the BDS diagnostic test is that the data are pure white noise. If the linear structure is removed from the data, the remaining structure should be due to an unknown nonlinear data-generating process (Magnus and Fosu, 2006:2046).

The ordinary residuals from the estimated random walk model and the standardized residuals from the estimated GARCH models were examined, and the results of the BDS diagnostic test are reported in Table 8. For the random walk model, the null hypothesis that the data are purely random is strongly rejected at the 1% level, implying that S&P 500 daily returns do not follow a random walk, so the random walk model cannot capture the features of the data.

Table 8. BDS test for serial independence in residuals

Random Walk GARCH GJR EGARCH

BDS Asymptotic (p values)

Dimension ε =0.95 ε =0.99 ε=0.95 ε=0.99 ε=0.95 ε=0.99 ε=0.95 ε=0.99

2 0.0000 0.0000 0.3581 0.8692 0.0235 0.6440 0.0501 0.8258

3 0.0000 0.0000 0.8231 0.7870 0.1117 0.7583 0.2537 0.8994

4 0.0000 0.0000 0.6805 0.7754 0.1323 0.7423 0.3161 0.8948

5 0.0000 0.0000 0.5209 0.8319 0.1962 0.7137 0.5483 0.8750

BDS Bootstrap (p values)

2 0.0000 0.0220 0.3692 0.8232 0.0096 0.2680 0.0344 0.9612

3 0.0000 0.0000 0.7792 0.3520 0.0876 0.8656 0.2500 0.7424

4 0.0000 0.0000 0.6480 0.3756 0.1172 0.8628 0.3212 0.7520

5 0.0000 0.0000 0.5088 0.4380 0.1824 0.8144 0.5948 0.7884

Note: The standardized residuals of GARCH models and the ordinary residuals of Random Walk Model

were used for BDS test. Bootstrap with 5000 repetitions. ε denotes fraction of pairs epsilon value.

The results of the BDS tests for the standardized residuals of the GARCH models are desirable. Both the asymptotic and bootstrap p-values of each model indicate that the null hypothesis of white noise cannot be rejected at the 0.99 epsilon bound, suggesting that all GARCH models are correctly specified and can capture the relevant features of S&P 500 index daily returns.

Additionally, the insignificant Ljung-Box and LM statistics for the standardized residuals of the GARCH models reported in Table 7 show that these models are successful at modeling the serial correlation structure in the conditional mean and conditional variance equations.

7.5 Graphical Diagnostics

We also examined the standardized residuals of the estimated GARCH models using the graphical diagnostics provided in Figure 7. As can be seen, the autocorrelation function (ACF) of each estimated model does not show significant autocorrelation. The normal qq-plot of the standardized residuals of each estimated model indicates strong departures from normality.

In addition, the standard statistical diagnostics of the Ljung-Box and Engle's LM tests for the estimated GARCH models in Table 7 are consistent with the ACFs in Figure 7, suggesting that there is no remaining ARCH effect, and the skewness and kurtosis statistics of each estimated model also confirm that the residuals are non-normal.


Figure 7. Graphical residual diagnostics from GARCH (1,1) to S&P 500 returns

(Panels: ACF and normal quantile-quantile plots of the standardized residuals from GARCH (1,1), GJR (1,1) and EGARCH (1,1).)

8 Forecast Performance of Estimated Models and VIX

The forecast performance of both the time series models and the implied volatility index (VIX) is discussed in this section. First, the out-of-sample forecast performance of the time series models is examined using the conventional error measurements. Next, the in-sample forecast performance of implied volatility is studied by running a GARCH model augmented with a dummy variable and an exogenous variable. In particular, the comparison of forecast performance between VIX, GJR (1,1) and RiskMetrics is investigated using a variety of approaches.

8.1 Out-of-sample Forecast Performance of GARCH Models

The out-of-sample forecast performance of the estimated GARCH models is evaluated using four conventional error measurements: the root mean square error, the mean absolute error, the mean absolute percentage error and Theil's U-statistic. The model with the smallest statistic is considered the best for modeling the conditional volatility of S&P 500 index daily returns.

As can be seen from Table 9, the estimated GARCH (1,1) is considered the best model for out-of-sample forecasting since it has the smallest RMSE, MAE and Theil's U-statistic. If the models are evaluated by MAPE, then GJR (1,1) is preferred.

Table 9. Forecast Performance of GARCH models

Models GARCH(1,1) GJR(1,1) EGARCH(1,1)

RMSE 1.1355 1.1357 1.1358

MAE 0.7946 0.7964 0.7964

MAPE 122.0995 107.4572 107.8961

Theil 0.9588 0.9795 0.9789


Theil’s U-statistics of each model is smaller than one, and this indicates that the

estimated models perform better than the benchmark model. However, the statistic

of MAPE of each model is more than 100%, suggesting that the benchmark model

outperforms the estimated models. Therefore, the result of the comparison between

the estimated models and the random walk model is mixing. It seems MAPE and

Theil’s U-statistic conflict in our test.

Due to the standard GARCH (1,1) cannot capture asymmetry of volatility, and

GJR (1,1) is indicated to be the best one by MAE, thus, GJR (1,1) will be further

studied and its forecast performance will be compared to VIX as well as RiskMetrics

approach.

8.2 In-sample Forecast Performance of VIX

We examined the in-sample forecast performance of the VIX by following Blair et al. (2001) and Frijns et al. (2008), running a GARCH model augmented with a dummy variable and an exogenous variable:

$$r_t = \mu + \varepsilon_t, \qquad \varepsilon_t \sim N(0, h_t)$$

$$h_t = \omega + \beta_1 h_{t-1} + \alpha_1\varepsilon_{t-1}^2 + \gamma D_{t-1}\varepsilon_{t-1}^2 + \delta\, VIX_{t-1}^2 \qquad (30)$$

where $r_t$ denotes the S&P 500 index daily return, $\mu$ is the average daily return, and $\varepsilon_t$ is the random error component of the mean equation. We assume that $\varepsilon_t$ is normally distributed with a mean of zero and a conditional variance equal to $h_t$. $D_{t-1}$ is a dummy variable which equals one if the innovation $\varepsilon_{t-1}$ is negative and zero otherwise, and it is used to capture the asymmetric impact of shocks on volatility.

Six different GARCH specifications are nested by equation (30) if we place restrictions on it:

(1) if $\gamma = \delta = 0$, equation (30) becomes a standard GARCH (1,1) model;

(2) if $\delta = 0$, equation (30) becomes a GJR (1,1) model, which can capture the asymmetric impact of shocks;

(3) if $\alpha_1 = \beta_1 = \gamma = 0$, the VIX term is the only component explaining the volatility process;

(4) if $\beta_1 = \gamma = 0$, equation (30) becomes a model consisting of the VIX and market shocks;

(5) if $\beta_1 = 0$, equation (30) includes asymmetry in market shocks;

(6) equation (30) without any restrictions is a GJR model with implied volatility as an exogenous variable.

Table 10. In-sample forecast performance of VIX and GARCH specifications

Model $\omega$ $\beta_1$ $\alpha_1$ $\gamma$ $\delta$ LL excess LL

GARCH 0.0073* 0.9309* 0.0635* – – -6827.712 –
GJR-GARCH 0.0108* 0.9322* 0.0017 0.1103* – -6758.813 68.899
VIX -0.5420* – – – 0.0040* -6485.995 341.717
ARCH-VIX -0.0183* – -0.0134** – 0.0028* -6746.639 81.073
GJR-ARCH-VIX -0.1058* – -0.0444* 0.0574* 0.0028* -6740.518 87.194
GJR-GARCH-VIX 0.0008 0.8520* -0.0344* 0.1684* 0.0002* -6703.526 124.186

Note: * (**) denotes significance at the 1% (5%) level. LL is the log-likelihood; excess LL is measured relative to the standard GARCH (1,1).

The in-sample forecast performance of the different models nested in equation (30) is presented in Table 10, including the parameter estimates, the log-likelihood statistics, and the excess log-likelihood relative to the standard GARCH (1,1) in the second row. The highly significant $\beta_1$ of 0.9309 for the GARCH (1,1) model confirms the strong persistence in volatility. The GJR-GARCH model with the asymmetric term performs better than GARCH (1,1) because the log-likelihood increases by approximately 69. Since the estimate of $\gamma$ for the GJR-GARCH model is positive, negative shocks imply a higher next-period conditional variance.


The parameter estimates in the fourth row are for the restricted model with $\alpha_1 = \beta_1 = \gamma = 0$, so that $\delta$ is the only free parameter apart from the intercept. This model thus describes the volatility process with the VIX series alone, and its large excess log-likelihood (approximately 342) implies that it performs better than GARCH (1,1). The nested ARCH-VIX and GJR-ARCH-VIX models incorporate the shock terms into the specifications, and it is interesting to find that they have the same value of $\delta$, which equals 0.0028 and is highly significant at the 1% level. However, the excess log-likelihoods of these two nested models indicate that GJR-ARCH-VIX performs better, because it has the larger excess value relative to the log-likelihood of the standard GARCH (1,1). In addition, the coefficient on the shock term of GJR-ARCH-VIX is highly significant at the 1% level, whereas that of ARCH-VIX is significant at the 5% level.

The last row shows the estimated parameters of the unrestricted model. It is obvious that GJR-GARCH-VIX significantly improves on the standard GARCH (1,1) and that it outperforms GJR-GARCH, ARCH-VIX and GJR-ARCH-VIX, since it has the largest excess log-likelihood among these alternatives. The parameter $\delta$ is highly significant at the 1% level. We also find that the incorporation of this exogenous variable reduces the parameter of the GARCH term relative to the standard GARCH (1,1) by approximately 0.08, suggesting that the VIX series captures part of the persistence in the volatility process.

Comparing all these nested specifications, we find that the addition of the VIX can significantly improve the standard GARCH (1,1). Since the nested VIX model in the fourth row shows the largest excess log-likelihood, we conclude that the volatility process can be reasonably described by the VIX series.

8.3 Comparing Predictability of Time Series Models and VIX

In this section, the forecast performance of the VIX is investigated and compared with the forecast performance of the RiskMetrics approach and the GJR (1,1) model by running regressions of realized volatility. We consider forecasting horizons of 5, 10, 15, 30 and 60 trading days. The objective is to investigate whether the VIX series incorporates all of the information contained in the time series models.

The time series plots of the VIX and the annualized future realized volatility over the different horizons are presented in the figures of Appendix A. The VIX appears to track the realized volatility closely at each horizon, although the levels of the VIX and the realized volatility differ, with the VIX overestimating the realized volatility at all horizons. Although the figures of Appendix A suggest that the VIX has good forecast performance for realized volatility, we need to confirm this with formal tests by running regressions of realized volatility. Additionally, we need to investigate whether the VIX is superior to the other approaches.

We define the realized volatility as the square root of the sum of the S&P 500 index squared daily returns, computed as

$$RV_{t,t+k} = \sqrt{\sum_{i=1}^{k} r_{t+i}^2} \qquad (31)$$

where k denotes the number of trading days and $r_{t+i}^2$ is the squared daily return on day t+i. The regression of the realized volatility is given by

$$RV_{t,t+k} = \alpha + \beta\, h_{t,k} + \varepsilon_t \qquad (32)$$

where $h_{t,k}$ denotes the k-day volatility forecast obtained from the alternative approaches, and $k \in \{5, 10, 15, 30, 60\}$.

In order to run equation (32), we first construct the $h_{t,k}$ series over the different horizons using the alternative approaches. Following Giot (2005b) and Frijns et al. (2008), the k-day forward-looking volatility forecast on day t from the VIX series is computed as

$$h_{t,k}^{VIX} = VIX_t\,\sqrt{\frac{k}{360}} \qquad (33)$$
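The construction of the realized-volatility target in equation (31) and the VIX-based forecast in equation (33) can be sketched as follows; the pandas-based implementation and the 360-day basis are assumptions consistent with the formulas above.

```python
import numpy as np
import pandas as pd

def realized_vol(returns, k):
    """Eq. (31): square root of the sum of the next k squared daily returns."""
    r2 = pd.Series(returns) ** 2
    # value at index t is r_{t+1}^2 + ... + r_{t+k}^2
    fwd_sum = r2.shift(-1).rolling(k).sum().shift(-(k - 1))
    return np.sqrt(fwd_sum)

def vix_forecast(vix, k, basis=360):
    """Eq. (33): scale the annualised VIX level to a k-day volatility forecast;
    the 360-day basis follows the convention quoted in the text."""
    return np.asarray(vix) * np.sqrt(k / basis)
```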


The RiskMetrics approach can be regarded as a simplified and restricted GARCH (1,1) model. We assume that

$$\sigma_{RM,t}^2 = \lambda\sigma_{RM,t-1}^2 + (1-\lambda)r_{t-1}^2 \qquad (34)$$

where $\sigma_{RM,t}^2$ denotes the variance according to the RiskMetrics approach, $r_t$ is the return of the S&P 500 index on day t, and $\lambda$ equals 0.94 and captures the persistence of volatility. Because the parameters $(1-\lambda)$ and $\lambda$ sum to one, a unit root is included in model (34), implying that model (34) is a specific parameterization of the Integrated GARCH (1,1) model. The forecast obtained from model (34) is the forecast for the next day. In order to obtain forecasts for longer horizons, the forecast is re-scaled, and the k-day forward-looking forecast, following Frijns et al. (2008), can be derived from

$$h_{t,k}^{RM} = \sqrt{k}\cdot\sigma_{RM,t+1} \qquad (35)$$

where $\sigma_{RM,t+1}$ is the daily forecast from the RiskMetrics approach.

The forecasts based on the GJR (1,1) model are given by equation (14), and the one-day-ahead forward-looking forecast can be obtained from

$$h_{t+1} = \omega + \alpha\varepsilon_t^2 + \gamma D_t\varepsilon_t^2 + \beta h_t \qquad (36)$$

For an s-day horizon ($s > 1$), the forward-looking forecast can be computed as

$$E\left[h_{t+s}\right] = \omega + \left(\alpha + 0.5\gamma + \beta\right)E\left[h_{t+s-1}\right] \qquad (37)$$

The total volatility k days ahead can then be derived from

$$h_{t,k}^{GJR} = \sqrt{\sum_{s=1}^{k} E\left[h_{t+s}\right]} \qquad (38)$$
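A sketch of the GJR forecast aggregation in equations (36)-(38) and of the evaluation regression (32) is given below; the function names and the use of statsmodels OLS are illustrative choices.

```python
import numpy as np
import statsmodels.api as sm

def gjr_k_day_vol(omega, alpha, gamma, beta, h_next, k):
    """Eqs. (36)-(38): iterate the GJR(1,1) variance forecast forward and
    aggregate it into a k-day volatility forecast. `h_next` is the one-day-ahead
    variance from eq. (36); E[D_t] = 0.5 gives the 0.5*gamma term of eq. (37)."""
    persistence = alpha + 0.5 * gamma + beta
    forecasts = [h_next]
    for _ in range(1, k):
        forecasts.append(omega + persistence * forecasts[-1])
    return np.sqrt(np.sum(forecasts))

def evaluate_forecaster(realized, forecast):
    """Eq. (32): OLS of realized volatility on a forecaster's k-day forecasts."""
    X = sm.add_constant(np.asarray(forecast))
    res = sm.OLS(np.asarray(realized), X, missing="drop").fit()
    return res.params, res.rsquared
```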

After deriving the forecasts from these three different approaches, we first examine the correlation between the forecasts and the realized volatility at the different horizons; our purpose is to establish whether the VIX is a better forecaster than the other approaches. Then, the forecast performances are evaluated by running the regression of realized volatility in equation (32). Furthermore, because our sample spans a considerably long period, the realized volatility at each horizon is regressed for both the in-sample and the out-of-sample period.

8.3.1 Correlation between Realized Volatility and Volatility Forecasts

The correlations between future realized volatility and the volatility forecasts from each forecaster at each forecasting horizon are reported in Table 11. It is obvious that GJR (1,1) has the highest correlation with realized volatility at each horizon. The VIX has a higher correlation than the RiskMetrics approach in most cases, except for the 60-day horizon. It is also interesting to find that the correlation between realized volatility and GJR (1,1) increases with the horizon. The information observed from Table 11 implies that GJR (1,1) may perform best in forecasting realized volatility against the VIX and the RiskMetrics approach.

Table 11. Correlation between Realized Volatility and Alternative Forecasters

5-day 10-day 15-day 30-day 60-day

VIX 0.7906 0.8141 0.8086 0.7762 0.7039

RiskMetrics 0.7486 0.7808 0.7813 0.7638 0.7135

GJR (1,1) 0.8482 0.9085 0.9311 0.9615 0.9796


8.3.2 Regression for In-sample Realized Volatility

Table 12 presents the results of the regressions of in-sample realized volatility on the estimated variance from the VIX series, the RiskMetrics approach and the GJR (1,1) model at the various horizons. The R² statistic is also used to evaluate the predictability of $h_{t,k}$ in equation (32). Since the requirements for $h_{t,k}$ to be an unbiased estimate of realized volatility are $\alpha = 0$ and $\beta = 1$, this joint hypothesis is tested and the F-statistic reports the test result.

Panel A of Table 12 shows the parameter estimates of the regression on the VIX series. The coefficients α are negative and significantly different from zero at the 1% level in most cases, except for the 60-day horizon. The coefficients β are close to one and highly significant at the 1% level at all horizons. In addition, the F-statistic at each horizon significantly rejects the null hypothesis that α = 0 and β = 1. Therefore, the estimates are biased in all cases. However, unbiasedness is not decisive for a good predictor because an observable and systematic bias can be controlled for. A high value of R² is a required property of a good forecaster. Looking at the R² of the different horizons, we find that the VIX performs best at the 10-day horizon and worst at the 60-day horizon, with R² equal to 0.6692 and 0.5103, respectively.

The regression results for the RiskMetrics approach are shown in Panel B of Table 12. As can be seen, the estimates are also biased in all cases, since both coefficients α and β are statistically significant at the 1% level at all horizons, and the F-statistic significantly rejects the joint hypothesis that α = 0 and β = 1 at the 1% level at each horizon as well. Evaluating the values of R², we find that the RiskMetrics approach performs best at the 15-day horizon, with R² equal to 0.6214, and worst at the 60-day horizon, with R² equal to 0.5230. Compared to the regression on the VIX series, the RiskMetrics approach only outperforms the VIX at the 60-day horizon, implying that the VIX is a better forecaster than the RiskMetrics approach in all other cases.

The regression results for the GJR (1,1) model are reported in the third panel of Table 12. When evaluating unbiasedness through the coefficients α and β, we find results comparable to Panels A and B.


Table 12. Performance of regression for in-sample realized volatility

Horizon α β R² F-statistic

Panel A: Forecasting regression by VIX
5-day -0.7643* (0.0340) [0.0000] 1.2172* (0.0131) [0.0000] 0.6302 315.91 [0.0000]
10-day -0.8980* (0.0430) [0.0000] 1.1863* (0.0118) [0.0000] 0.6692 259.14 [0.0000]
15-day -0.9188* (0.0520) [0.0000] 1.1519* (0.0116) [0.0000] 0.6624 192.7473 [0.0000]
30-day -0.7253* (0.0755) [0.0000] 1.0682* (0.0119) [0.0000] 0.6160 80.5313 [0.0000]
60-day 0.1900 (0.1162) [0.1022] 0.9353* (0.0130) [0.0000] 0.5103 42.4262 [0.0000]

Panel B: Forecasting regression by RiskMetrics
5-day 0.1863* (0.0278) [0.0000] 0.8638* (0.0106) [0.0000] 0.5697 121.6117 [0.0000]
10-day 0.3800* (0.0350) [0.0000] 0.8518* (0.0094) [0.0000] 0.6193 138.5070 [0.0000]
15-day 0.5784* (0.0417) [0.0000] 0.8327* (0.0092) [0.0000] 0.6214 173.5402 [0.0000]
30-day 1.1631* (0.0585) [0.0000] 0.7855* (0.0091) [0.0000] 0.5981 278.4340 [0.0000]
60-day 2.3794* (0.0870) [0.0000] 0.7062* (0.0096) [0.0000] 0.5230 473.5829 [0.0000]

Panel C: Forecasting regression by GJR (1,1)
5-day -0.2043* (0.0233) [0.0000] 1.0250* (0.0089) [0.0000] 0.7248 90.7276 [0.0000]
10-day -0.2779* (0.0247) [0.0000] 1.0436* (0.0067) [0.0000] 0.8292 88.1022 [0.0000]
15-day -0.3350* (0.0259) [0.0000] 1.0500* (0.0057) [0.0000] 0.8705 99.4965 [0.0000]
30-day -0.4797* (0.0269) [0.0000] 1.0604* (0.0042) [0.0000] 0.9274 163.9572 [0.0000]
60-day -0.7058* (0.0278) [0.0000] 1.0689* (0.0031) [0.0000] 0.9605 321.7394 [0.0000]

Note: * denotes significance at the 1% level; standard errors are in parentheses and p-values in square brackets; the F-statistic tests the joint hypothesis α = 0 and β = 1.

Both the estimates of α and β are highly significant at the 1% level in all cases, and the joint hypothesis α = 0 and β = 1 is also explicitly rejected by the F-statistic at each horizon. Therefore, the forecasts from the GJR (1,1) model are also biased forecasts of future volatility at all horizons. However, compared to the other two approaches, the coefficients α are closer to zero and the coefficients β are closer to one at each horizon. With respect to the predictive power of the GJR (1,1) model, the largest value of R², 0.9605, appears at the 60-day horizon, and the smallest, 0.7248, is obtained at the 5-day horizon. It is interesting to find that the R² of the GJR (1,1) model increases with the horizon, and that the values of R² are extremely high at each horizon. It appears that GJR (1,1) has particularly good forecast performance for long forecasting horizons.

To summarize, compared to the VIX series and the RiskMetrics approach, GJR (1,1) has the highest R² and coefficients β closer to unity at each horizon. Consequently, the GJR (1,1) model outperforms the other approaches in the regression for realized volatility in the sample period, and this finding is consistent with the information observed from Table 11.

We further investigate the out-of-sample forecast performance of each model using graphical volatility plots and the conventional error measurements. The figures of Appendix B plot the out-of-sample volatility forecasts of each model at the different horizons, and each model performs well in tracking the dynamics of future volatility. The error-measure statistics are provided in Table 13, and the information in each panel explicitly indicates that GJR (1,1) has the best forecast performance for future realized volatility.

Table 13. Forecast performance on out-of-sample realized volatility

VIX RiskMetrics GJR(1,1)

Panel A: RMSE

5-day 0.8809 1.0256 0.8012

10-day 1.1243 1.2924 0.8287

15-day 1.4051 1.5418 0.8842

30-day 2.2093 2.2978 0.9979

60-day 3.1862 3.1815 0.8467

Panel B: MAE

5-day 0.71956 0.8186 0.6705

10-day 0.8821 1.0238 0.6413

15-day 1.0964 1.1812 0.6837

30-day 1.7227 1.6903 0.7764

60-day 2.6637 2.5519 0.7310


Table 13 (Continued)

Panel C: MAPE

5-day 53.8270 56.2948 43.2352

10-day 34.5104 37.5946 23.5408

15-day 32.7246 33.7394 19.3809

30-day 30.6203 29.1853 13.6856

60-day 29.2769 27.0751 8.0997

Panel D: Theil’s Statistic

5-day 0.1710 0.2068 0.1605

10-day 0.1520 0.1812 0.1166

15-day 0.1560 0.1777 0.1011

30-day 0.1697 0.1824 0.0782

60-day 0.1698 0.1739 0.0456

8.3.3 Residual Tests for Regression of In-sample Realized Volatility

The future realized volatilities of the in-sample period are estimated by the classical linear regression model, which has five underlying assumptions:

1) $E(\varepsilon_t) = 0$

2) $Var(\varepsilon_t) = \sigma^2 < \infty$

3) $Cov(\varepsilon_i, \varepsilon_j) = 0$ for $i \neq j$

4) $Cov(\varepsilon_t, x_t) = 0$

5) $\varepsilon_t \sim N(0, \sigma^2)$

These assumptions give the ordinary least squares (OLS) estimator its desirable properties and allow hypothesis tests concerning the parameter estimates to be conducted validly. Violations of the assumptions can lead to problems such as incorrect parameter estimates and associated standard errors, and inappropriate distributions being assumed for the test statistics.

In order to confirm that the volatility forecasts are efficient and that our inferences based on the coefficient estimates of the regression for in-sample realized volatility are correct, residual diagnostic tests were conducted. The coefficients α are highly significant in all regressions; more importantly, because a constant is included in each regression model, the first assumption, that the mean of the residuals equals zero, will never be violated. Regarding $x_t$ in the fourth assumption, it denotes the independent variable of the regression equation; if the independent variable is stochastic but uncorrelated with the residual, the OLS estimates are consistent and unbiased.
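These diagnostics can be computed with standard routines; the sketch below runs the White test, the Durbin-Watson statistic and the residual-regressor correlation on a simulated version of regression (32), so all data shown are stand-ins.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_white
from statsmodels.stats.stattools import durbin_watson

# Stand-in regression: realized volatility on a volatility forecast (eq. 32)
rng = np.random.default_rng(6)
forecast = np.abs(rng.standard_normal(5287)) + 1.0
realized = 0.1 + 1.0 * forecast + 0.3 * rng.standard_normal(5287)

X = sm.add_constant(forecast)
res = sm.OLS(realized, X).fit()

w_stat, w_pvalue, f_stat, f_pvalue = het_white(res.resid, X)   # White heteroskedasticity test
dw = durbin_watson(res.resid)                                  # first-order autocorrelation
corr = np.corrcoef(res.resid, forecast)[0, 1]                  # residual-regressor correlation
```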

Table 14 reports the diagnostic results for the residuals of the regressions for in-sample realized volatility. The second column shows the results of the heteroskedasticity test by the White test. The autocorrelation of the residuals is examined by the Durbin-Watson test, and the results are presented in the third column. The correlation coefficients between residuals and regressors are listed in the fourth column. The skewness and kurtosis statistics in the last two columns describe departures from normality.
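As an illustration of how such diagnostics can be produced, the following Python sketch (a minimal example built on statsmodels and SciPy, with rv and fcst as hypothetical in-sample realized-volatility and forecast series) computes the White test, the Durbin-Watson statistic, the residual-regressor correlation, and the skewness and kurtosis of the residuals.

    import numpy as np
    import statsmodels.api as sm
    from statsmodels.stats.diagnostic import het_white
    from statsmodels.stats.stattools import durbin_watson
    from scipy.stats import skew, kurtosis

    def residual_diagnostics(rv, fcst):
        # OLS regression of realized volatility on a single forecast series
        X = sm.add_constant(np.asarray(fcst, dtype=float))
        res = sm.OLS(np.asarray(rv, dtype=float), X).fit()
        u = res.resid
        w_stat, w_pval, _, _ = het_white(u, X)   # White heteroskedasticity test (LM form)
        dw = durbin_watson(u)                    # first-order autocorrelation check
        corr = np.corrcoef(u, X[:, 1])[0, 1]     # residual-regressor correlation
        return w_stat, w_pval, dw, corr, skew(u), kurtosis(u, fisher=False)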

Unfortunately, the diagnostic results are undesirable. The null hypothesis that the residuals are homoscedastic is explicitly rejected for all regressions at every horizon, and the DW-statistic at each horizon indicates the presence of first-order autocorrelation in the residuals. The correlation coefficients between residuals and regressors are in line with our expectation. The skewness and kurtosis statistics imply that the distributions of the residuals are non-normal in all cases. For such a long in-sample period, with 5287 observations, the violation of the normality assumption is inconsequential.


Table 14. Residual tests for regression for in-sample realized volatility

          W-statistic          DW-statistic   Corr(u,x)   Skewness   Kurtosis

Panel A: Regression by VIX
5-day      689.86 [0.0000]      0.3678         0.0000      1.7703     13.7935
10-day     513.23 [0.0000]      0.1664         0.0000      2.2320     16.4138
15-day     377.04 [0.0000]      0.1294         0.0000      2.7128     21.2689
30-day     178.57 [0.0000]      0.0507         0.0000      3.5227     27.7184
60-day      48.87 [0.0000]      0.0167         0.0000      3.9749     28.3814

Panel B: Regression by RiskMetrics
5-day      559.77 [0.0000]      0.3698         0.0000      1.8001     14.6771
10-day     687.41 [0.0000]      0.1638         0.0000      1.8377     13.9682
15-day     589.31 [0.0000]      0.1397         0.0000      2.0736     16.2758
30-day     284.85 [0.0000]      0.0518         0.0000      2.8501     21.8930
60-day      91.20 [0.0000]      0.0154         0.0000      3.4874     24.0941

Panel C: Regression by GJR (1,1)
5-day     1003.478 [0.0000]     0.4566         0.0000      0.7187     10.9908
10-day    1178.237 [0.0000]     0.2247         0.0000      0.1098     11.3857
15-day    1132.649 [0.0000]     0.2163         0.0000     -0.2153     13.8315
30-day    1150.130 [0.0000]     0.1528         0.0000     -0.2749     15.0022
60-day    1568.914 [0.0000]     0.0464         0.0000     -0.2798     12.9577

Note: W-statistic is the result of the heteroskedasticity test by the White test, with p-values in square brackets; DW-statistic is the result of the first-order autocorrelation test by the Durbin-Watson test; Corr(u,x) reports the correlation coefficient between residuals and regressors.


8.3.4 Regression for Out-of-sample Realized Volatility

The information observed from Table 14 suggests that the coefficient estimates of the regressions for in-sample realized volatility may be distorted, so that inferences based on those estimates could be unreliable. In order to further investigate the predictability of VIX, RiskMetrics and GJR (1,1), we run the OLS regressions for realized volatility over the out-of-sample period.
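A minimal sketch of this forecast regression, assuming rv and fcst are the out-of-sample realized-volatility and forecast series for one horizon, is given below; the joint restriction α = 0 and β = 1 is tested with a standard F-test on the fitted model.

    import numpy as np
    import statsmodels.api as sm

    def forecast_regression(rv, fcst):
        # OLS: realized volatility regressed on a constant and one forecast series
        X = sm.add_constant(np.asarray(fcst, dtype=float))
        res = sm.OLS(np.asarray(rv, dtype=float), X).fit()
        # Joint null of forecast unbiasedness: intercept = 0 and slope = 1
        joint = res.f_test("const = 0, x1 = 1")
        return res.params, res.rsquared, joint

Printing the returned F-test object shows the F-statistic and its p-value for the joint hypothesis.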

Table 15 presents the coefficient estimates of the regressions for out-of-sample realized volatility, together with R2 and the F-statistic of the joint hypothesis test. Panel A shows the results of the regressions by VIX. The coefficients α are negative at the 5-, 10- and 15-day horizons and are statistically significant at the 1% level at all horizons except the 15-day horizon. The estimates of β are positive and highly significant at the 1% level in all cases except the 60-day horizon. We also find that the coefficient α increases while the coefficient β decreases with longer horizons, implying that VIX has weaker explanatory power at longer horizons. Since the estimate of α is insignificantly different from zero and β is significant and close to one at the 15-day horizon, it appears that VIX performs best for 15-day-ahead forecasts. However, the F-statistic explicitly rejects the joint null hypothesis that α = 0 and β = 1 at this horizon, and the same joint null hypothesis is rejected in the other cases as well. Therefore, the estimates of α and β are biased. R2 decreases with longer horizons, suggesting that VIX has the best predictability at the shortest horizon. Our finding regarding the performance of VIX differs from the previous finding based on the regression for in-sample realized volatility.

Panel B of Table 15 reports the results of the regressions by the RiskMetrics approach for the out-of-sample period. As can be seen, the coefficients α are positive at all horizons and statistically significant in all cases except the 5-day horizon. The parameter estimates of β are significant in all cases except the 60-day horizon. These findings are comparable with the results of Panel A; in particular, the estimate of α increases and the estimate of β decreases with longer horizons.


Table 15. Performance of regression for out-of-sample realized volatility

          α                           β                           R2        F-statistic

Panel A: Forecasting regression by VIX
5-day     -1.2619* (0.2397) [0.0000]   1.3222* (0.0875) [0.0000]   0.4825     34.1455 [0.0000]
10-day    -0.8393* (0.3133) [0.0079]   1.1023* (0.0805) [0.0000]   0.4387     21.3210 [0.0000]
15-day    -0.1003  (0.3969) [0.8008]   0.9136* (0.0829) [0.0000]   0.3398     16.7091 [0.0000]
30-day     3.1144* (0.5894) [0.0000]   0.4424* (0.0861) [0.0000]   0.1067     31.4245 [0.0000]
60-day     9.5196* (0.7259) [0.0000]  -0.0502  (0.0734) [0.4943]   0.0024    109.0566 [0.0000]

Panel B: Forecasting regression by RiskMetrics
5-day      0.3177  (0.2241) [0.1575]   0.7914* (0.0871) [0.0000]   0.2522      7.3609 [0.0008]
10-day     1.0219* (0.2848) [0.0004]   0.6623* (0.0778) [0.0000]   0.2320     11.3619 [0.0000]
15-day     1.8635* (0.3435) [0.0000]   0.5338* (0.0763) [0.0000]   0.1718     19.7635 [0.0000]
30-day     4.7194* (0.4703) [0.0000]   0.2181* (0.0729) [0.0031]   0.0389     57.7503 [0.0000]
60-day    10.0989* (0.5732) [0.0000]  -0.1175  (0.0607) [0.0544]   0.0191    169.3804 [0.0000]

Panel C: Forecasting regression by GJR (1,1)
5-day     -0.2366  (0.1557) [0.1300]   1.0337* (0.0607) [0.0000]   0.5416      4.7423 [0.0095]
10-day    -0.1803  (0.1706) [0.2916]   1.0189* (0.0467) [0.0000]   0.6645      2.3019 [0.1023]
15-day    -0.1720  (0.1886) [0.3629]   1.0153* (0.0420) [0.0000]   0.7124      1.7324 [0.1791]
30-day    -0.0614  (0.7929) [0.0000]   0.9945* (0.0363) [0.0000]   0.7724      1.0366 [0.3564]
60-day    -0.0873  (0.2649) [0.7422]   1.0069* (0.0284) [0.0000]   0.8669      0.2236 [0.8942]

Note: * denotes significance at the 1% level; standard errors are in parentheses; p-values are in square brackets; the F-statistic tests the joint null hypothesis that α = 0 and β = 1.

Additionally, the R2 decreases with longer horizons as well. As with VIX, the RiskMetrics approach has weaker explanatory power for future volatility at longer horizons. Compared to the regression for in-sample realized volatility by the RiskMetrics approach, the information in Panel B shows a different performance of RiskMetrics for forecasting out-of-sample realized volatility.

The performance of GJR (1,1) in the regressions for out-of-sample realized volatility is presented in the third panel of Table 15. The coefficient estimates of α are not statistically different from zero at any forecasting horizon. The estimates of β are close to one and highly significant at the 1% level in all cases. The joint null hypothesis that α = 0 and β = 1 is not rejected by the F-statistic at any horizon. Therefore, the estimates by GJR (1,1) are unbiased, and the model has strong explanatory power for realized volatility in the out-of-sample period. The value of R2 at each horizon is much higher than the R2 of the regressions by VIX and RiskMetrics, and it increases with longer horizons. Thus, GJR (1,1) has outstanding forecast performance for out-of-sample realized volatility and outperforms VIX and RiskMetrics; in particular, GJR (1,1) performs better at longer forecasting horizons.

To summarize, the information observed from Table 15 demonstrates that each approach performs differently on in-sample and out-of-sample realized volatility at each forecasting horizon. However, the results again confirm that GJR (1,1) outperforms the other approaches.

8.3.5 Residual Tests for Regression of Out-of-sample Realized Volatility

In order to confirm that the inferences based on the information in Table 15 are convincing, diagnostic tests for the residuals of the out-of-sample regressions are conducted and the results are shown in Table 16. Panel A presents the diagnostics for the residuals of the regressions by VIX. The W-statistics in the second column are the results of the heteroskedasticity test. As can be seen, the null hypothesis that the residuals are homoscedastic is not rejected at the 5- and 10-day horizons. The DW-statistic from the Durbin-Watson test indicates the presence of first-order autocorrelation in the residuals at every horizon, which is undesirable. The correlation coefficients between residuals and regressors equal zero at all horizons, suggesting that the OLS estimates are consistent and unbiased. The skewness statistics indicate that the residuals are positively skewed at all horizons, and the kurtosis reveals that the residuals are leptokurtic in most cases but platykurtic at the 60-day horizon. Thus, the normality assumption is violated.

The diagnostic results in Panel B of Table 16 are similar to those in Panel A. The residuals of the regressions by RiskMetrics are homoscedastic at the 5- and 10-day horizons, as documented by the W-statistics in the second column. The presence of first-order autocorrelation at each horizon is confirmed by the DW-statistics in the third column. The residuals are uncorrelated with the regressors, since the correlation coefficients equal zero at each horizon. The normality assumption is violated because the residuals are positively skewed and leptokurtic in most cases, though platykurtic at the 60-day horizon.


Table 16. Residual tests for regression for out-of-sample realized volatility

          W-statistic          DW-statistic   Corr(u,x)   Skewness   Kurtosis

Panel A: Regression by VIX
5-day       0.2196 [0.6394]     0.4846         0.0000      0.9849     4.5510
10-day      1.6266 [0.2022]     0.2285         0.0000      1.2872     5.2104
15-day      7.9010 [0.0049]     0.1534         0.0000      1.3408     4.9310
30-day     15.7494 [0.0001]     0.0598         0.0000      0.9751     3.5156
60-day      7.0272 [0.0080]     0.0678         0.0000      0.3166     1.7013

Panel B: Regression by RiskMetrics
5-day       0.0230 [0.8795]     0.4741         0.0000      0.8253     3.9860
10-day      2.3318 [0.1268]     0.2126         0.0000      0.9892     4.0637
15-day      8.2965 [0.0040]     0.1286         0.0000      0.9826     3.7893
30-day     18.9292 [0.0000]     0.0509         0.0000      0.8177     3.1486
60-day     37.7532 [0.0000]     0.0461         0.0000      0.1766     1.5969

Panel C: Regression by GJR (1,1)
5-day       6.4295 [0.0112]     0.7242         0.0000      0.4750     3.1025
10-day      3.5898 [0.0581]     0.3815         0.0000      0.5469     3.7150
15-day      1.9599 [0.1615]     0.1457         0.0000      0.6815     3.7572
30-day      1.9080 [0.1672]     0.1668         0.0000      0.7217     3.4385
60-day      9.3204 [0.0023]     0.2828         0.0000      0.3603     2.0008

Note: W-statistic is the result of the heteroskedasticity test by the White test, with p-values in square brackets; DW-statistic is the result of the first-order autocorrelation test by the Durbin-Watson test; Corr(u,x) reports the correlation coefficient between residuals and regressors.


Panel C of Table 16 reports the residual test results of the regressions by GJR (1,1) for out-of-sample realized volatility. As can be seen from the W-statistics in the second column, the null hypothesis that the residuals are homoscedastic is not rejected in most cases, the exception being the 60-day horizon. The DW-statistic confirms first-order autocorrelation of the residuals at each horizon. The correlation coefficients between residuals and regressors show that they are uncorrelated in all cases. The normality assumption is violated, since the residuals are positively skewed and leptokurtic in most cases.

The residuals from the regressions by each approach at each horizon are plotted in the figures of Appendix C. These figures suggest that the violation of the normality assumption in each case appears to be induced by a small number of very large positive or negative outliers. Although the DW-statistics of all regressions indicate the presence of first-order autocorrelation at each horizon, the figures of Appendix C show little visible autocorrelation in the residuals at the 5-, 10- and 15-day horizons, while the residuals are positively correlated over time at the 30- and 60-day horizons for all regressions.

By and large, the statistics in Table 16 demonstrate that GJR (1,1) performs better than the other two approaches, since the null hypothesis of the heteroskedasticity test is not rejected in most cases and the residuals of GJR (1,1) have a distribution that is much closer to normality.

8.3.6 Encompassing Regression for Realized Volatility

In the interest of investigating whether the forecast performance of one approach is superior to that of another, or whether two approaches complement each other, we run the encompassing regression following Frijns et al. (2008). The form of the regression is:

σ_RV,t = α + β1 σ_1,t + β2 σ_2,t + ε_t    (39)


where σ_RV,t denotes realized volatility at a given horizon, and σ_1,t and σ_2,t are the forecasts from two alternative approaches. The significance of β1 and β2 will reveal whether one approach dominates the other. For instance, if β1 is significant but β2 is not, it suggests that σ_1,t performs better than σ_2,t in forecasting future volatility. If both β1 and β2 are significant, it implies that the two approaches complement each other and each of them contains information not included in the other.
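A minimal sketch of such an encompassing regression, assuming rv is the realized-volatility series and f1 and f2 are the two competing forecast series, could look as follows; the significance of the two slope coefficients is then read off the reported p-values.

    import numpy as np
    import statsmodels.api as sm

    def encompassing_regression(rv, f1, f2):
        # Regress realized volatility on two competing forecasts (equation 39)
        X = sm.add_constant(np.column_stack([np.asarray(f1, dtype=float),
                                             np.asarray(f2, dtype=float)]))
        res = sm.OLS(np.asarray(rv, dtype=float), X).fit()
        # If both slopes are significant, the two forecasts complement each other;
        # if only one is significant, that forecast encompasses the other.
        return res.params, res.pvalues, res.rsquared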

Table 17 reports the estimates of the encompassing regressions at the different horizons, together with R2. Panel A presents the encompassing regression results for VIX against the RiskMetrics approach. Both estimates of β1 and β2 are highly significant at the 1% level with p-values equal to 0.0000. Therefore, VIX and the RiskMetrics approach complement each other, indicating that using both approaches may achieve the best forecast performance. This is confirmed by comparing R2 values: the R2 of the encompassing regression is larger than the R2 of the regression for realized volatility by VIX or by RiskMetrics alone at all horizons.

The forecasting performance of VIX against the GJR (1,1) model evaluated by the encompassing regression is presented in Panel B of Table 17. We find the same result as in Panel A: the estimated parameters on both VIX and the variance derived from the GJR (1,1) model are highly significant at each horizon, so VIX and the GJR (1,1) model complement each other. Since the R2 of the encompassing regression for VIX against GJR (1,1) is higher at each horizon than the R2 in Table 15 of the regression for realized volatility by VIX or by GJR (1,1) alone, the joint use of VIX and the GJR (1,1) variance may perform better for future volatility forecasting.

Panel C of Table 17 shows the regression results for the RiskMetrics approach against the GJR (1,1) model. Again, the coefficients on both the RiskMetrics forecast and the GJR (1,1) forecast are highly significant at the 1% level with p-values equal to 0.0000 in all cases, and α is negative and deviates significantly from zero at all horizons. Therefore, the information in Panel C indicates that the RiskMetrics approach and the GJR (1,1) model complement each other over all forecasting horizons. Compared to the R2 values of the regressions by


Table 17. Encompassing regression for realized volatility

          α                            β1                           β2                           R2

Panel A: VIX - RiskMetrics
5-day     -0.6588* (0.0372) [0.0000]    1.0075* (0.0324) [0.0000]    0.1705* (0.0243) [0.0000]    0.6285
10-day    -0.6770* (0.0468) [0.0000]    0.8958* (0.0288) [0.0000]    0.2338* (0.0216) [0.0000]    0.6700
15-day    -0.6009* (0.0564) [0.0000]    0.8213* (0.0284) [0.0000]    0.2643* (0.0213) [0.0000]    0.6637
30-day    -0.1218  (0.0816) [0.1357]    0.6474* (0.0290) [0.0000]    0.3323* (0.0217) [0.0000]    0.6194
60-day     1.2879* (0.1240) [0.0000]    0.4059* (0.0312) [0.0000]    0.4165* (0.0234) [0.0000]    0.5244

Panel B: VIX - GJR (1,1)
5-day     -0.2956* (0.0311) [0.0000]    0.1215* (0.0281) [0.0000]    0.9375* (0.0221) [0.0000]    0.7204
10-day    -0.1960* (0.0322) [0.0000]   -0.0753* (0.0197) [0.0001]    1.0972* (0.0156) [0.0000]    0.8258
15-day    -0.1198* (0.0327) [0.0003]   -0.1604* (0.0157) [0.0000]    1.1634* (0.0125) [0.0000]    0.8696
30-day    -0.1272* (0.0319) [0.0001]   -0.1741* (0.0093) [0.0000]    1.1788* (0.0076) [0.0000]    0.9292
60-day    -0.3572* (0.0315) [0.0000]   -0.1085* (0.0053) [0.0000]    1.1371* (0.0045) [0.0000]    0.9625

Panel C: RiskMetrics - GJR (1,1)
5-day     -0.2123* (0.0216) [0.0000]   -0.6575* (0.0251) [0.0000]    1.6798* (0.0263) [0.0000]    0.7517
10-day    -0.2680* (0.0220) [0.0000]   -0.5443* (0.0155) [0.0000]    1.5789* (0.0164) [0.0000]    0.8582
15-day    -0.3050* (0.0225) [0.0000]   -0.4560* (0.0116) [0.0000]    1.4927* (0.0123) [0.0000]    0.8972
30-day    -0.4047* (0.0230) [0.0000]   -0.2872* (0.0066) [0.0000]    1.3293* (0.0072) [0.0000]    0.9443
60-day    -0.5910* (0.0243) [0.0000]   -0.1521* (0.0039) [0.0000]    1.2027* (0.0043) [0.0000]    0.9687

Note: * denotes significance at the 1% level; standard errors are in parentheses; p-values are in square brackets.

each of them in Table 15, we find that the R2 of the encompassing regression is much higher. Consequently, the joint use of the RiskMetrics approach and the GJR (1,1) model can achieve better forecast performance than using only one of them.

To summarize, the encompassing regressions do not reveal one approach dominating the other. The R2 values observed in each panel of Table 17 indicate that forecasting future volatility by jointly using the RiskMetrics approach and the GJR (1,1) model performs best at all horizons.

8.3.7 Average Squared Deviation

Mayhew and Stivers (2003) examined the forecast performance of implied volatility and GARCH-type models for individual stocks by the average squared deviation (ASD). The basic idea of the ASD approach is to identify the better forecaster by comparing the average deviation between squared return shocks and the estimated volatility; the forecaster with the lowest average deviation is considered the best predictor among the competing approaches. Following Mayhew and Stivers (2003), we use the ASD approach to investigate the forecast performance of VIX, the RiskMetrics approach and the GJR (1,1) model. The ASD for the volatility forecast of each model is expressed as:

ASD = (1/T) Σ_{t=1}^{T} [(r_t - r̄)² - σ̂²_t]²    (40)

where r_t denotes the daily return, r̄ is the average return, T is the number of observations, and σ̂²_t is the out-of-sample conditional variance derived from the alternative model.
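A minimal sketch of this calculation, assuming r holds the out-of-sample daily returns and var_fcst the corresponding conditional variance forecasts of one model, is given below.

    import numpy as np

    def average_squared_deviation(r, var_fcst):
        # Squared return shocks: squared deviations of daily returns from their mean
        r = np.asarray(r, dtype=float)
        shocks_sq = (r - r.mean()) ** 2
        # Average squared deviation between the shocks and the variance forecasts
        return np.mean((shocks_sq - np.asarray(var_fcst, dtype=float)) ** 2)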

Table 18. The average squared deviation from alternative approaches

VIX RiskMetrics GJR (1,1)

ASD 6.1253 6.2622 5.9929

As can be seen from Table 18, GJR (1,1) has the lowest average squared deviation compared to VIX and the RiskMetrics approach. Table 18 thus again documents that GJR (1,1) beats VIX and the RiskMetrics approach for volatility forecasting.

8.3.8 Regression for Squared Return Shocks

We run the regression of squared return shocks on the out-of-sample conditional volatility forecasts from VIX, the RiskMetrics approach and the GJR (1,1) model, respectively. The objective is to examine which approach best tracks the dynamics of daily volatility. The regression has the form

(r_t - r̄)² = α + β σ̂²_t + ε_t    (41)

where r_t is the daily return on day t, r̄ denotes the average daily return, and σ̂²_t is the conditional volatility forecast from the respective approach. The conditions for σ̂²_t to be an unbiased forecaster are α = 0 and β = 1.
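This regression follows the same pattern as the earlier forecast regressions, with the squared return shock as the dependent variable; a minimal sketch, assuming r contains the out-of-sample daily returns and var_fcst one model's conditional variance forecasts, is shown below.

    import numpy as np
    import statsmodels.api as sm

    def squared_shock_regression(r, var_fcst):
        # Dependent variable: squared daily return shocks (equation 41)
        r = np.asarray(r, dtype=float)
        shocks_sq = (r - r.mean()) ** 2
        X = sm.add_constant(np.asarray(var_fcst, dtype=float))
        res = sm.OLS(shocks_sq, X).fit()
        # Unbiasedness requires an intercept of zero and a slope of one
        return res.params, res.tvalues, res.rsquared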

Table 19. Regression results for squared return shocks

              α                              β                             R2

VIX           -2.9370* (-4.4445) [0.0000]     3.5538* (6.5669) [0.0000]     0.1471
RiskMetrics    0.3182  (1.0663)  [0.2873]     0.7427* (3.8342) [0.0002]     0.0555
GJR (1,1)      0.1954  (0.7355)  [0.4627]     0.8427* (5.0624) [0.0000]     0.0930

Note: * denotes significance at the 1% level; t-statistics are in parentheses; p-values are in square brackets.

Table 19 reports the parameter estimates of the regressions of squared return shocks on the out-of-sample volatility forecasts. As can be seen, the coefficient α is negative and highly significant in the regression by VIX, and positive and insignificant in the other two cases. The coefficients β are positive and highly significant at the 1% level for all models. Compared to the regression by VIX, the coefficients β of the other two approaches are closer to unity and their coefficients α are insignificantly different from zero. However, the R2 values indicate that VIX clearly outperforms the other approaches.


8.3.9 Encompassing Regression for Squared Daily Return Shocks

In order to further investigate whether VIX dominates the other approaches in tracking the dynamics of daily volatility, we run the encompassing regression for squared daily return shocks. Following Mayhew and Stivers (2003), the form of the encompassing regression is:

(r_t - r̄)² = α + β1 σ²_VIX,t + β2 σ²_2,t + ε_t    (42)

where σ²_VIX,t is the volatility forecast by VIX, and σ²_2,t is the volatility forecast from the other approach.

Table 20. Encompassing regression results for squared return shocks

                  α                              β1                            β2                             R2

VIX-RiskMetrics   -4.6452* (-5.3597) [0.0000]     6.1189* (6.0492) [0.0000]    -1.0265* (-2.9833) [0.0031]    0.1766
VIX-GJR(1,1)      -5.3426* (-4.3891) [0.0000]     6.6711* (4.6532) [0.0000]    -1.0026** (-2.3446) [0.0198]   0.1655

Note: *, ** denote significance at the 1% and 5% levels, respectively; t-statistics are in parentheses; p-values are in square brackets.

Table 20 presents the parameter estimates of the encompassing regressions for squared return shocks by VIX against each of the other approaches. For the regression on the volatility forecasts from VIX and the RiskMetrics approach, the coefficient α is negative and deviates highly significantly from zero, and both parameter estimates β1 and β2 are highly significant at the 1% level. Therefore, VIX and the RiskMetrics approach complement each other, suggesting that each of them contains information not included in the other. For the regression by VIX against GJR (1,1), the coefficient α is also negative and highly significant at the 1% level, the coefficient β1 is positive and significant at the 1% level, and the coefficient β2 is negative and significant at the 5% level. Consequently, the statistics demonstrate that VIX dominates GJR (1,1) at the 1% level, while the two approaches complement each other at the 5% level. In addition, the parameter estimates of β1 in both encompassing regressions indicate that VIX has the stronger positive explanatory power. When the forecasting ability is evaluated by the R2 of each regression, the regression jointly using VIX and RiskMetrics, with the higher R2, outperforms the other.


9 Conclusion

The objective of this study is to examine the predictive power of model based

forecasts and the VIX index on forecasting future volatility of S&P 500 index daily

returns. The study period is from January 1990 to December 2010, including 5291

observations.

First, a variety of time series models were estimated, including the random walk model and the GARCH (1,1), GJR (1,1) and EGARCH (1,1) models. The analysis of the estimated models indicates that GJR (1,1) performs best for out-of-sample forecasting over the study period. Then, the forecast performance of VIX, GJR (1,1) and RiskMetrics was compared using various approaches, following Frijns et al. (2008), Giot (2005b) and Mayhew and Stivers (2003). The empirical results are detailed in section 8.

The results of our study are in line with Becker, Clements and White (2006),

Becker, Clements and White (2007) and Becker and Clements (2008). The empirical

evidence does not support the view that implied volatility subsumes all information

content, and the study results provide strong evidence indicating that GJR (1,1)

outperforms VIX and RiskMetrics for modeling future volatility of S&P 500 index

daily returns. In addition, the results of the encompassing regression for future

realized volatility at 5-, 10-, 15-, 30- and 60-day horizons, and the results of the

encompassing regression for squared return shocks suggest that the joint use of

GJR(1,1) and RiskMetrics can produce the best forecasts.

By and large, our finding indicates that implied volatility is inferior for future

volatility forecasting, and the model based forecasts have more explanatory power for

future volatility.


References

Ahoniemi, K. (2006) Modeling and forecasting implied volatility – an econometric

analysis of the VIX index. Helsinki Center of Economic Research, discussion

paper, No.129.

Akgiray, V. (1989). Conditional heteroscedasticity in time series of stock returns:

Evidence and forecasts. Journal of Business, 62, 55-79.

Alexander, C. (2001) Market Models: A guide to financial data analysis. John Wiley

& Sons, Ltd.

Awartani, B. M. A. and Corradi, V. (2005). Predicting the volatility of the S&P-500

stock index via GARCH models: The role of asymmetries. International Journal

of Forecasting, 21, 167-183.

Becker, R. and Clements, A.E. (2008). Are combination forecasts of S&P 500 volatility

statistically superior? International Journal of Forecasting, 24, 122-133.

Becker, R., Clements, A.E. and Coleman-Fenn, C.A. (2009). Forecast performance of

implied volatility and the impact of the volatility risk premium. NCER Working

Paper Series.

http://www.ncer.edu.au/papers/documents/WPNo45.pdf

Becker, R., Clements, A.E. and McClelland, A. (2009). The jump component of S&P

500 volatility and the VIX index. Journal of Banking & Finance, 33, 1033-1038.

Becker, R., Clements, A.E. and White, S.I. (2006). On the informational efficiency of

S&P 500 implied volatility. North American Journal of Economics and Finance,

17, 139-153.

Becker, R., Clements, A.E. and White, S.I. (2007). Does implied volatility provide any

information beyond that captured in model-based volatility forecasts? Journal of

Banking & Finance, 31, 2535-2549.

Blair, B. J., Poon, S.-H. and Taylor, S. J. (2001). Forecasting S&P 100 volatility: the

incremental information content of implied volatilities and high-frequency index

returns. Journal of Econometrics, 105, 5-26.

Bollerslev, T., Chou, R. Y. and Kroner, K. F. (1992). ARCH modeling in Finance: A


review of the theory and empirical evidence. Journal of Econometrics, 52, 1-2,

5-59.

Brailsford, T. J. and Faff, R. W. (1996). An evaluation of volatility forecasting techniques. Journal of Banking and Finance, 20, 419-438.

Brooks, C. (2008). Introductory Econometrics for Finance. Cambridge University

Press.

Canina, L. and Figlewski, S. (1993). The information content of implied volatility. The

Review of Financial Studies, 6, 3, 659-681.

Chong, C. W., Ahmad, M. I. and Abdullah, M. Y. (1999). Performance of GARCH

models in forecasting stock market volatility. Journal of Forecasting, 18,

333-343.

Christensen, B. J. and Prabhala, N. R. (1998). The relation between implied and

realized volatility. Journal of Financial Economics, 50, 125-150.

Chuang, I. Y., Lu, J. R. and Lee, P. H. (2007). Forecasting volatility in the financial

markets: A comparison of alternative distributional assumptions. Applied

Financial Economics, 17, 1051-1060.

Corrado, C. and Miller, JR.T.W. (2005). The forecast quality of CBOE implied

volatility indexes. The Journal of Futures Markets, 25, 4, 339-373.

Day, T. E. and Lewis, C. M. (1992). Stock market volatility and the information

content of stock index options. Journal of Econometrics, 52, 267-287.

Evans, T. and McMillan, D. G. (2007). Volatility forecasts: The role of asymmetric and

long-memory dynamics and regional evidence. Applied Financial Economics, 17,

1421-1430.

Figlewski, S. (1997). Forecasting volatility. Financial Markets, Institutions and

Instruments, 6, 1, 1-88.

Fleming, J., Ostdiek, B. and Whaley, R. E. (1995). Predicting stock market volatility: A

new measure. The Journal of Futures Markets, 15, 3, 265-302.

Franses, P. H. and van Dijk, R. (1996). Forecasting stock market volatility using

(non-linear) GARCH models. Journal of Forecasting, 15, 229-235.

Frijns, B., Tallau, C. and Tourani-Rad, A., (2008). The information content of implied


volatility: Evidence from Australia. 21st Australasian Finance and Banking

Conference 2008 Paper.

http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1246142

Frijns, B., Tallau, C. and Tourani-Rad, A., (2010). Australian Implied Volatility Index.

The Finsia Journal of Applied Finance, 1, 31-35.

Galdi, F. C. and Pereira, L. M. (2007). Value at Risk (VaR) using volatility forecasting

models: EWMA, GARCH and Stochastic volatility. Brazilian Business Review, 4,1,

74-94.

Giot, P. (2005a). Relationships between implied volatility indexes and stock index

returns. Journal of Portfolio Management Spring 2005, 31, 3, 92-100.

Giot, P. (2005b). Implied volatility indexes and daily value at risk models. The

Journal of Derivatives, 12, 54-64.

Giot, P. and Laurent, S. (2006). The information content of implied volatility in light

of the jump/continuous decomposition of realized volatility. Working Paper.

http://www.core.ucl.ac.be/econometrics/Giot/Papers/implied4_8.pdf

Harvey, C. R. and Whaley, R. E. (1992). Dividends and S&P 100 Index Option

Valuation. The Journal of Futures Markets, 12, 123-137.

Latané, H. A. and Rendleman, R. J. (1976). Standard deviations of stock price ratios

implied in option prices. The Journal of Finance, 31, 2, 369-381.

Lamoureux, C. G. and Lastrapes, W. D. (1993). Forecasting stock-return variance:

Toward an understanding of stochastic implied volatilities. The Review of

Financial Studies, 6, 2, 293-326.

Lee, J.H.H. and King, M.L. (1993). A locally most powerful based score test for ARCH

and GARCH regression disturbances. Journal of Business and Economic

Statistics, 7, 259-279.

Lumsdaine, R.L. and Ng, S. (1999). Testing for ARCH in the presence of a possibly

misspecified conditional mean. Journal of Econometrics, 93, 257-279.

Magnus, F.J. and Fosu, O.A.E. (2006). Modelling and forecasting volatility of returns

on the Ghana stock exchange using GARCH models. American Journal of

Applied Sciences, 3 (10), 2042-2048.


Mayhew, S. and Stivers, C. (2003). Stock return dynamics, option volume, and the

information content of implied volatility. The Journal of Futures Markets, 23, 7,

615-646.

Nelson, D. B. (1992). Filtering and forecasting with misspecified ARCH models I:

getting the right variance with the wrong model. Journal of Econometrics, 52,

61-90.

Patev, P., Kanaryan, N. and Lyroudi, K. (2009). Modelling and forecasting the

volatility of thin emerging stock markets: the case of Bulgaria. Comparative

Economic Research, 12, 4, 47-60.

Poon, S. and Granger, C. (2001). Forecasting financial market volatility: A view.

Working paper, University of Strathclyde and University of California, San

Diego.

Poterba, J. M. and Summers, L. H. (1986). The persistence of volatility and stock

market fluctuations. American Economic Review, 76, 1142-1151.

Sheikh, A. M. (1989). Stock splits, volatility increases, and implied volatilities. The

Journal of Finance, 44, 1361-1372.

The CBOE Volatility Index – VIX. (2009). CBOE Proprietary Information. Chicago

Board Options Exchange, Incorporated.

http://www.cboe.com/micro/VIX/vixwhite.pdf

Tse, Y. K. (1991). Stock returns volatility in the Tokyo Stock Exchange. Japan and the

World Economy, 3, 285-298.

Tse, Y. K. and Tung, S. H. (1992). Forecasting volatility in the Singapore stock market.

Asia Pacific Journal of Management, 9(1), 1-13.

Walsh, D. M. and Tsou, G. Y. (1998). Forecasting index volatility: sampling interval

and non-trading effects. Applied Financial Economics, 8, 477-485.

Whaley, R. E. (1993). Derivatives on market volatility: hedging tools long overdue.

Journal of Derivatives, 1, 1, 71-84.

Wilhelmsson, A. (2006). GARCH forecasting performance under different

distribution assumptions. Journal of Forecasting, 25, 561-578.

Yu, J. (2002). Forecasting volatility in the New Zealand stock market. Applied


Financial Economics, 12, 193-202.

Zivot, E. (2008). Practical issue in the analysis of univariate GARCH models.

Handbook of Financial Time Series.


Appendix A

VIX and Future Realized Volatility

Figure A.1 VIX and annualized future realized volatility (5 trading days)

Figure A.2 VIX and annualized future realized volatility (10 trading days)


Figure A.3 VIX and annualized future realized volatility (15 trading days)

Figure A.4 VIX and annualized future realized volatility (30 trading days)


Figure A.5 VIX and annualized future realized volatility (60 trading days)


Appendix B

Out-of-sample Forecast Performance on Realized Volatility

Figure B1. Out-of-sample 5-day ahead realized volatility forecasts

Figure B2. Out-of-sample 10-day ahead realized volatility forecasts

[Each figure contains three panels, plotting the out-of-sample forecasts by VIX, RiskMetrics and GJR (1,1) from April to October 2010.]

Figure B3. Out-of-sample 15-day ahead realized volatility forecasts

Figure B4. Out-of-sample 30-day ahead realized volatility forecasts

Figure B5. Out-of-sample 60-day ahead realized volatility forecasts


Appendix C

Residuals from Regressions for Out-of-sample Realized Volatility

[Panels plot the residuals from the regressions by VIX, RiskMetrics and GJR (1,1) at the 5-, 10-, 15-, 30- and 60-day horizons, from April to October 2010.]