
Option-Implied Correlation and Factor Betas Revisited∗

Adrian Buss† Grigory Vilkov‡

First draft: November 2008

This version: July 13, 2010

Abstract

We propose a new method of using option-implied information to construct heterogeneous implied correlations (HETIC) for all stock pairs and stock-factor combinations. We use implied correlations in computing forward-looking betas for arbitrary factors, which is not possible with the other option-implied methods for finding betas. Computed under the risk-neutral measure, our market betas on average do not suffer from the bias induced by the volatility and correlation risk premiums. For the S&P500 stocks in 1996–2009, HETIC betas outperform other historical and option-implied methods in predicting the realized betas in terms of average and number of best MSE and $R^2$. In market-neutral pairs trading and in portfolio exposure targeting applications, our betas outperform the others in terms of deviation from the target exposure. HETIC market betas clearly confirm the existence of a positive risk-return relation that incorrectly may be put in doubt if one uses daily betas.

JEL classification: G11, G12, G14, G17

Keywords: option-implied, correlation, beta, high-frequency data, risk-return relation

∗ We would like to thank Yacine Aït-Sahalia, Peter Christoffersen, Ralph Koijen, Christian Schlag, and Raman Uppal for comments and stimulating discussions. We received many helpful comments from the participants of the Adam Smith Asset Pricing Conference 2009, the Humboldt-Copenhagen Conference 2009, the 12th Conference of the Swiss Society for Financial Market Research 2009, the European Finance Association 2009, and the Brownbag seminar at the University of Frankfurt. We are also grateful to Alexandra Hansis and Yuliya Plyakha for their programming expertise. All remaining errors are our own.

† Graduate Program “Finance and Monetary Economics,” Goethe University, Grüneburgplatz 1 / Uni-Pf H 25, D-60323 Frankfurt am Main, Germany. Email: [email protected]

‡ Corresponding Author: Finance Department, Goethe University, Grüneburgplatz 1 / Uni-Pf H 25, D-60323 Frankfurt am Main, Germany. Tel.: +49 175 528-3918. Email: [email protected].


Contents

1 Introduction

2 Factor Betas
    2.1 Stock Market and Factor Betas Definition
    2.2 Volatility and Correlation Risk Premium Assumptions
    2.3 Defining HETIC and Competing Factor Betas
        2.3.1 Implied Betas
        2.3.2 Non-parametric Betas
        2.3.3 Parametric DCC-MIDAS Betas
    2.4 Implied Betas and Change of Measure: HETIC vs. Alternatives

3 Data Description and Preparations
    3.1 Stock and Option Data
    3.2 Realized Covariances under the P-measure
    3.3 Implied Volatility, Variance and Skewness under the Q-Measure

4 Factor Betas Horse Race: Empirical Investigation
    4.1 Factor Beta Estimation
    4.2 Realized Beta Predictability
        4.2.1 Market Factor
        4.2.2 Other Factors
    4.3 Applications
        4.3.1 Pairs Trading
        4.3.2 Portfolio Immunization
    4.4 Risk-Return Relation Revisited

5 Robustness Tests

6 Conclusion

A Proofs

B Construction of Risk-Neutral Moments

C Factor-Mimicking Portfolios

Figures

Tables

1 Introduction

The notion of beta, as emanating from the seminal work of Sharpe (1964) and Lintner (1965), is

the cornerstone of modern finance and one of the most important concepts in both theory and

practice. It represents the stock return sensitivity to movements of the market and other factors

and therefore its systematic risk. Wang (2003) and Ghysels and Jacquier (2006), among others,

emphasize the importance of an accurate measurement and even more of an accurate prediction

of individual stock betas. Many practical areas rely on betas — portfolio and risk management,

asset pricing, cost of capital estimation, and performance measurement. Investors will clearly

benefit from a better beta forecast in applications like building a tracking portfolio with a

targeted market beta, immunizing portfolios against economic factors, trading market-neutral

pairs, etc. Our objective is to use information from option prices to construct option-implied

betas and test if they are superior predictors of out-of-sample realized factor betas. Options

are forward-looking instruments with fixed maturity, subsuming the market expectations about

the foreseeable future, and hence we expect the proposed method to work better than the betas

based on historical data.

We make two important contributions. First, we propose a new efficient way to model the

heterogeneous implied correlations (HETIC) in two consistent versions. The full HETIC gives us

the full correlation matrix for a given stock universe, and the reduced HETIC gives the vector of

stock-factor correlations. The reduced version may be estimated for factors with traded options

on them (e.g., market index), and is computationally very efficient. To estimate the implied

correlation we add to the correlation predictor the correlation risk premium, and the latter is

modeled for all stock-stock or stock-factor pairs as a function of a single variable. This function is

positively related to the market volatility risk premium and negatively related to the individual

stock volatility risk premiums.

Second, we show how to use option-implied second moments to compute forward-looking

betas under the risk-neutral measure Q for arbitrary factors that do not have traded options on

them. This is not possible with the other option-implied methods. Empirically, we deal with

the standard factors1 and also with the orthogonal principal components of those. The HETIC

betas significantly outperform a number of rival betas, including betas from two option-implied

methodologies, historical rolling-window estimation from daily returns, and parametric DCC-MIDAS

betas. The only betas that HETIC cannot beat in all cases are the high-frequency

rolling window betas; we count as our additional contribution that we show how utilizing high-frequency

data for correlations and volatilities estimation makes the beta predictors more precise

compared to daily data.

1 Market, size, and value of Fama and French (1993), and momentum of Carhart (1997).

For the market betas our method significantly decreases the prediction error of the one-

month realized betas for the S&P500 stocks to 0.12 in terms of mean squared error (MSE) from

0.15 in the case of the standard rolling historical daily betas and the parametric DCC-MIDAS

betas. For the six-month betas the MSE goes down from 0.11 to 0.07 for the same set of betas.

Moreover, we observe a considerable boost in predictability precision for all factors: the MSE

decreases for one-month betas on average by 30%, compared to the MSE of daily betas.

As a result of enhanced predictability, investors can benefit from using HETIC factor betas

in practical applications. For instance, in the popular pairs trading strategy a hedge fund goes

long one stock and short the other from a pair of cointegrated stocks to profit from

their relative mispricing. Market movements can deteriorate the profitability of the strategy, and

one typically makes the portfolio market-neutral. We compare the realized market betas of the

market-neutral pairs’ portfolios for different beta methods. HETIC market betas statistically

and economically outperform all other betas in terms of MSE for relevant holding periods of 5,

10, and 21 trading days.2 For a more general application of the HETIC factor betas, we perform

an extension of the pairs trading to multiple stocks and factors. We immunize an arbitrary

portfolio with respect to the selected factors, i.e., we get expected factor exposures of zero and

show that our option-implied betas again outperform the other methods in terms of the error

in the realized portfolio factor betas.

We also show that the correct ex ante assessment of the conditional betas matters for the

asset pricing testing. Sorting available assets by the HETIC market betas, we clearly see a

positive risk-return relation out-of-sample, while the traditional daily betas can show a negative

relation, thus putting CAPM in doubt (as in Baker, Bradley, and Wurgler (2010)).

In the aforementioned performance tests we compare our option-implied betas with several

other methodologies proposed in the literature. One of the simplest techniques is to estimate

betas from historical stock returns and take this estimate as a forecast for the future. Since

betas are time-varying, the estimation is typically performed on a rolling window of historical

returns.3 This technique assumes that the future is sufficiently similar to the past and limits

itself to historical data. French, Groth, and Kolari (1983) (hereafter FGK) introduce the idea

of using option-implied information for beta forecasts. They continue to use rolling window

historical correlations but replace historical volatilities with the Black and Scholes (1973) implied

volatilities from stock and index options.

2 Typically, in pairs trading the positions are held less than 21 trading days, and also get rebalanced from time to time. Hence, short period risk targeting is especially important for these applications.

Chang, Christoffersen, Jacobs, and Vainberg (2009) (hereafter CCJV) extend the idea by

using only option-implied information for beta computations.4 Their technique is based exclusively

on traded stock and index option prices. The ratio of stock to market skewness serves as

a proxy for the expected correlation between the market and a stock, and the implied volatility

is used instead of the historical one.

In their papers, French, Groth, and Kolari and Chang, Christoffersen, Jacobs, and Vainberg

show that option-implied information improves the beta predictability and confirm that

implied volatilities and skewness contain useful information about future stock return distributions.

However, both methods ignore the fact that the option-implied moments used in the

estimation are computed under the risk-neutral probability measure, and hence deviate from

their counterparts under the objective measure due to the presence of non-zero volatility risk

premium and correlation risk premium.5 A higher index volatility risk premium in comparison

to the individual stock volatility risk premium drags the FGK betas down from the unbiased

values under the objective probability measure, while the CCJV betas empirically overshoot the

fair level.

The approach proposed in this paper takes into account the difference in expectations of

volatility and correlation under different probability measures. Moreover, we model the correlation

risk premium such that it fills in the gap between the index and individual volatility risk

premiums, and hence implicitly corrects the implied betas’ bias with respect to the objective

betas.

As an input to the HETIC identification procedure we may use any predictor of the future

pairwise stock or stock-to-factor correlation under the objective probability measure P and

adjust it later by the correlation risk premium. We consider several alternatives of the predictor,

including a simple historical rolling window estimator (from daily and high-frequency returns),

and a parametric estimator based on the DCC-MIDAS model by Colacito, Engle, and Ghysels

(2009). In our tests the betas based on high-frequency covariance estimators outperformed the

betas based on daily and parametric covariances, and therefore we selected the high-frequency

rolling window historical correlation as the HETIC input.

3 See Keim and Stambaugh (1986), and Breen, Glosten, and Jagannathan (1989), among others.

4 The first to solely use option-implied information for beta forecasts is Siegel (1995), creating an exchange option that implicitly reveals the market beta of a stock. Yet these options are not traded.

5 Many authors have documented a significantly higher market volatility under the risk-neutral probability measure than under the actual one, attributing it to the existence of a negative volatility risk premium, while for individual stocks the evidence is mixed (e.g., Carr and Wu (2009)). A negative correlation risk premium has been documented by Driessen, Maenhout, and Vilkov (2009).

As a proxy for the expected volatility under the risk-neutral measure Q in the main parts

of the analysis we use the parametric Black and Scholes (1973) implied volatility (IV) and for

robustness tests the model-free implied variance (MFIV), which quantifies the nonparametric

risk-neutral expectation of the return variance until the specific options expiration date.6 Our

results show that IV performs slightly better in improving betas’ predictive quality than MFIV.

In addition to the cited strands of the systematic risk literature, our work closely relates

to the high-frequency econometrics literature as we use intraday stock and index returns to

estimate volatilities and correlations. These estimates are used as inputs to the historical betas

formula (as in Andersen, Bollerslev, Diebold, and Wu (2006)) and to the betas using option-

implied information. For estimation purposes we utilize the realized (co)variation estimator, as

described in Karatzas and Shreve (1991), which is consistent in the absence of noise. However,

microstructure noise, such as bid-ask bounce, renders the estimator inconsistent and biased.

The estimation of covariation becomes additionally complicated by the nonsynchronicity of the

data that induces a downward bias and is known as the Epps effect (Epps (1979)). For our

work, we sample at frequencies (most of the time 60 minutes) where the effects of noise and

nonsynchronicity are marginal. In addition, we use subsampling and averaging as presented in

Zhang, Mykland, and Aït-Sahalia (2005) to increase the efficiency of the estimator.

The paper is organized as follows. Section 2 formulates our method for constructing heterogeneous

implied correlations and option-implied factor betas. We pay special attention to

the correlation risk premium modeling and discuss the biases of the implied betas with respect

to the objective betas. We also present in this section the other rival betas that we use in the

empirical section. The data used in the empirical part are described in Section 3. Section 4

discusses the estimation of factor betas with different methods and the results of a “horse race”

among various betas in several applications. Section 5 presents the robustness tests. Finally,

Section 6 concludes.

6 The MFIV was proposed by Dumas (1995), who built on the seminal work of Breeden and Litzenberger (1978), and was extended by Carr and Madan (1998), Britten-Jones and Neuberger (2000), Bakshi, Kapadia, and Madan (2003), and Jiang and Tian (2005), among others.

2 Factor Betas

We now introduce a method for computing factor betas using option-implied information,

namely, volatilities and correlations under the risk-neutral probability measure Q. Section 2.1

gives a brief introduction to factor betas; then we discuss our model assumptions for the volatility

and correlation risk premiums in Section 2.2, and Section 2.3 introduces all beta methods.

In Section 2.4 we discuss the effects of the risk premiums on the option-implied betas.

2.1 Stock Market and Factor Betas Definition

The stock market M consists of N risky assets indexed by $i = 1, \dots, N$, represented as a multidimensional

Ito process with a number of systematic Wiener processes (each driving multiple stocks), and

idiosyncratic ones (each affecting its own stock). The total instantaneous volatility $\sigma_{i,t}$ of each

stock i, as well as the correlation $\rho_{ij,t}$ for each pair of stocks i and j, are stochastic and follow

an Ito process. In other words, the instantaneous diffusion matrix is stochastic, and the only

condition we impose is positive definiteness. In the given setup the correlation $\rho_{iM,t}$ between

each stock i and the market M is also stochastic and follows an Ito process.

A factor beta reflects the linear relationship between a factor and a stock return. For

instance, the factor beta (here we use the market index M as a factor) of a stock is given by the

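well-known ratio of the covariance between the stock return $r_i$ and the factor (market) return

$r_M$ and the factor (market) return variance:7

\begin{align}
\beta_{iM} = \frac{\mathrm{Cov}(r_i, r_M)}{\mathrm{Var}(r_M)}
&= \frac{\sum_{j=1}^{N} w_j \sigma_j \sigma_i \rho_{ij}}{\sum_{i=1}^{N} \sum_{j=1}^{N} w_i w_j \sigma_j \sigma_i \rho_{ij}} \tag{1} \\
&= \frac{\sigma_i}{\sigma_M} \times \rho_{iM}, \tag{2}
\end{align}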

where $w_s$ is the weight of asset $s \in \{1, \dots, N\}$ in a given factor portfolio. In this paper we want to

utilize forward-looking information contained in option prices, namely, option-implied volatilities

and correlations, to compute betas under the risk-neutral measure Q. Depending on the available

information, implied betas can be computed in two ways:

\begin{align}
\beta^Q_{iM}
&= \frac{\sigma^Q_i \sum_{j=1}^{N} w_j \sigma^Q_j \rho^Q_{ij}}{\sum_{i=1}^{N} \sum_{j=1}^{N} w_i w_j \sigma^Q_i \sigma^Q_j \rho^Q_{ij}} \tag{3} \\
&= \frac{\sigma^Q_i}{\sigma^Q_M} \times \rho^Q_{iM}, \tag{4}
\end{align}

where the superscript denotes the respective probability measure.

7 For simplicity we will omit the time subscript unless it is essential for exposition.
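The equivalence of the two beta expressions can be checked numerically. The following is a minimal sketch, not the authors' code: the three-stock index, its weights, volatilities, and correlations are hypothetical numbers chosen for illustration, and the function names are ours. The second function first backs out the stock-to-index correlation from the same inputs, so the comparison only confirms that the matrix form and the ratio form agree.

```python
from math import sqrt

def index_variance(w, sigma, rho):
    """Index variance: sum_i sum_j w_i w_j sigma_i sigma_j rho_ij."""
    n = len(w)
    return sum(w[i] * w[j] * sigma[i] * sigma[j] * rho[i][j]
               for i in range(n) for j in range(n))

def beta_from_matrix(i, w, sigma, rho):
    """Beta of stock i from the full correlation matrix, as in (1) and (3)."""
    n = len(w)
    cov_i_index = sigma[i] * sum(w[j] * sigma[j] * rho[i][j] for j in range(n))
    return cov_i_index / index_variance(w, sigma, rho)

def beta_from_index_corr(i, w, sigma, rho):
    """Beta of stock i from its correlation with the index, as in (2) and (4)."""
    n = len(w)
    sigma_m = sqrt(index_variance(w, sigma, rho))
    cov_i_index = sigma[i] * sum(w[j] * sigma[j] * rho[i][j] for j in range(n))
    rho_im = cov_i_index / (sigma[i] * sigma_m)  # stock-to-index correlation
    return sigma[i] / sigma_m * rho_im

# Hypothetical three-stock index; every number here is illustrative.
weights = [0.5, 0.3, 0.2]
vols = [0.20, 0.30, 0.40]
corr = [[1.0, 0.3, 0.4],
        [0.3, 1.0, 0.5],
        [0.4, 0.5, 1.0]]

betas_full = [beta_from_matrix(i, weights, vols, corr) for i in range(3)]
betas_reduced = [beta_from_index_corr(i, weights, vols, corr) for i in range(3)]

# The index-weighted average beta equals one by construction.
weighted_beta = sum(w * b for w, b in zip(weights, betas_full))
```

Passing risk-neutral volatilities and correlations instead of objective ones gives the implied betas under Q with the same functions.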

The main ingredients of the implied beta above are the risk-neutral volatilities, $\sigma^Q_i$ and $\sigma^Q_M$,

and correlations, $\rho^Q_{ij}$ or $\rho^Q_{iM}$. Recent studies have shown that implied volatilities predict the

realized ones well (Blair, Poon, and Taylor (2001) and Jiang and Tian (2005), among

others), and we can also expect implied correlations to be superior predictors for the realized

ones. Hence, the risk-neutral betas should perform well in predicting the realized betas (under

the objective measure).

However, one should be careful in using the risk-neutral betas for forecasting. Recent literature

has shown that both stock return volatility (e.g., Carr and Wu (2009), and Todorov (2009))

and correlations between the stocks (Driessen, Maenhout, and Vilkov (2009)) are stochastic and

bear significant risk premiums. This makes the factor beta stochastic and potentially bearing

a non-zero risk premium, whose sign and magnitude is determined by the relation between the

volatility and correlation risk premiums. A high $R^2$ in the predictive regression of realized betas

on the risk-neutral ones does not guarantee the absence of a systematic bias, and we should take

this into account in our empirical analysis.

2.2 Volatility and Correlation Risk Premium Assumptions

The volatilities and correlations in our setup are stochastic and bear significant risk premiums.

As a consequence, due to Girsanov’s theorem, when we move from the objective measure P to

the risk-neutral measure Q, we not only change the drift of stock returns, but we also have to

adjust upwards the drift of their second moments — volatility and correlation. In the following

discussion we call this change in drift the volatility or correlation risk premiums, respectively.

In the option pricing literature, the variance (Heston (1993)) or volatility (Schöbel and Zhu

(1999)) risk premiums are typically assumed to be proportional to the current levels of variance

or volatility, respectively. Variance risk premium being proportional to the level of variance has

also been confirmed empirically by Carr and Wu (2009).

In order to be consistent with this finding, we define the volatility risk premium in relative

terms as the ratio of the expected finite period volatility under the risk-neutral measure to the

expected volatility under the objective measure:

$$
VRP_i \equiv \frac{\sigma^Q_i}{\sigma^P_i}, \tag{5}
$$

where $\sigma^Q_i$ and $\sigma^P_i$ are the expected finite period volatilities for an underlying $i \in \{M, 1, \dots, N\}$

under the respective measure. Going to the data, one can get both ingredients for the realized

volatility risk premium computation in (5), and we do not need to make any special identifying

assumptions.

In contrast, the modeling of the correlation risk premium poses a real challenge for several

reasons. First, there is no established standard for the correlation risk premium. All we know is

that the expected correlation under the risk-neutral measure Q is higher than the one under the

objective measure P , and that pairwise correlations cannot exceed one. Second, in an N × N

correlation matrix there are N × (N − 1)/2 unique off-diagonal elements, and one needs to

make clever assumptions on the correlation risk premium structure to be able to identify the

risk-neutral correlation matrix without needing options on each pair of the stocks in the sample.

Moreover, we have to preserve positive definiteness of the covariance matrix when changing from

the objective to the risk-neutral measure.

To get some insights into the possible parametric assumptions for the correlation risk premium,

we decompose the index volatility risk premium into the individual volatility and correlation

risk premiums in the spirit of Driessen, Maenhout, and Vilkov (2009). Given a set of

market index weights $\{w_i\}$, we can write the instantaneous market index variance $\sigma^2_M$ at a given

time point as follows:

\begin{align}
\sigma^2_M &= \sum_{i=1}^{N} \sum_{j=1}^{N} w_i w_j \sigma_i \sigma_j \rho_{ij} \tag{6} \\
&= \sum_{i=1}^{N} w_i \sigma_i \sigma_M \rho_{iM}. \tag{7}
\end{align}

Assuming for a moment that the weights $\{w_s\}$ are constant, viewing the index volatility as a

function of stochastic stock volatilities and correlations, and applying Ito’s Lemma to (6), we

can write the relative change in the index volatility drift as a weighted sum of relative changes

in the individual volatility and absolute changes in correlation drifts due to change in measure:

$$
\frac{E^Q[d\sigma_M] - E^P[d\sigma_M]}{\sigma_M}
= \sum_{i=1}^{N} \vartheta_i \frac{E^Q[d\sigma_i] - E^P[d\sigma_i]}{\sigma_i}
+ \sum_{i=1}^{N} \sum_{j=i+1}^{N} \frac{w_i w_j \sigma_i \sigma_j}{\sigma^2_M} \left( E^Q[d\rho_{ij}] - E^P[d\rho_{ij}] \right), \tag{8}
$$

where $\vartheta_i = w_i^2 \left( \frac{\sigma_i}{\sigma_M} \right)^2 + \sum_{j \neq i} \frac{w_i w_j \sigma_i \sigma_j \rho_{ij}}{\sigma^2_M}$. Empirically, $\vartheta_i$ is positive for all stocks in our sample.
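To make the decomposition in (8) concrete, the sketch below computes the loadings $\vartheta_i$ for a hypothetical three-stock index and compares a small actual change in index volatility with the first-order prediction. It treats the changes in drift as small deterministic bumps, which is an illustrative simplification rather than the stochastic setting of the paper; all numbers and function names are assumptions. The loadings sum to one, which follows directly from their definition and from (6).

```python
from math import sqrt

def index_vol(w, sigma, rho):
    """Index volatility implied by (6)."""
    n = len(w)
    return sqrt(sum(w[i] * w[j] * sigma[i] * sigma[j] * rho[i][j]
                    for i in range(n) for j in range(n)))

def theta(i, w, sigma, rho):
    """Loading of stock i's relative volatility change, as defined after (8)."""
    n = len(w)
    var_m = index_vol(w, sigma, rho) ** 2
    own = (w[i] * sigma[i]) ** 2
    cross = sum(w[i] * w[j] * sigma[i] * sigma[j] * rho[i][j]
                for j in range(n) if j != i)
    return (own + cross) / var_m

# Hypothetical inputs; every number here is illustrative.
weights = [0.5, 0.3, 0.2]
vols = [0.20, 0.30, 0.40]
corr = [[1.0, 0.3, 0.4],
        [0.3, 1.0, 0.5],
        [0.4, 0.5, 1.0]]

thetas = [theta(i, weights, vols, corr) for i in range(3)]

# Bump every volatility by the same small relative amount and the
# correlation of the pair (0, 1) by a small absolute amount.
eps_vol = 1e-6
eps_rho = 1e-6
bumped_vols = [s * (1.0 + eps_vol) for s in vols]
bumped_corr = [row[:] for row in corr]
bumped_corr[0][1] += eps_rho
bumped_corr[1][0] += eps_rho

base = index_vol(weights, vols, corr)
actual = (index_vol(weights, bumped_vols, bumped_corr) - base) / base

# Right-hand side of (8) for these bumps.
pair_loading = weights[0] * weights[1] * vols[0] * vols[1] / base ** 2
predicted = sum(thetas) * eps_vol + pair_loading * eps_rho
```

With these inputs the actual and predicted relative changes agree up to second-order terms in the bump sizes.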

For ease of exposition now consider the case where the index M consists only of two stocks

i = 1, 2. The change in drift for the correlation between these stocks is then given by:

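$$
E^Q[d\rho_{12}] - E^P[d\rho_{12}]
= \frac{\sigma^2_M}{w_1 w_2 \sigma_1 \sigma_2} \times \left[ \frac{E^Q[d\sigma_M] - E^P[d\sigma_M]}{\sigma_M} - \sum_{i=1}^{2} \vartheta_i \frac{E^Q[d\sigma_i] - E^P[d\sigma_i]}{\sigma_i} \right].
$$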

We can see that for positive ϑi the correlation risk premium decreases in the volatility risk

premiums of individual stocks and increases in the index volatility risk premium.

Using this observation, we formulate a model in the general N-stock case: we assume that

for each pair of stocks i, j, the correlation risk premium is increasing in the index volatility risk

premium and is decreasing in the individual volatility risk premiums of stocks i and j. In other

words, the volatility and correlation risk premiums are substitutes in terms of contributing to

the index volatility risk premium, and if one goes down, the other should go up to compensate

for the drop and to fill in the gap between the left- and right-hand side of the equation (8).

Moreover, we observe that the change in correlation due to change in measure enters the

expression (8) in an additive form, and this inspires us to define the correlation risk premium as

the difference between the expected finite period pairwise correlations under the two measures:

$$
CRP_{ij} \equiv \rho^Q_{ij} - \rho^P_{ij}. \tag{9}
$$

In order to preserve the joint dynamics of the volatility and correlation risk premiums as discussed

above, and to simplify identification, we propose that the correlation risk premium is a

function of the known volatility risk premiums only:

$$
CRP_{ij,t} = \rho_t \times \frac{VRP_{M,t}}{VRP_{i,t} \times VRP_{j,t}}, \tag{10}
$$

where $\rho_t \in (0, 1)$ defines the size of the correlation risk premium. In order to keep the risk-neutral

correlations $\rho^Q_{ij}$ below one, we normalize them in the spirit of the Dynamic Conditional

Correlation model in Engle (2002):

$$
\rho^Q_{ij} = \frac{\rho^P_{ij} + CRP_{ij}}{\sqrt{\rho^P_{ii} + CRP_{ii}} \sqrt{\rho^P_{jj} + CRP_{jj}}}. \tag{11}
$$

We identify the full heterogeneous implied correlation (HETIC) matrix $\Gamma^Q$ with elements $\rho^Q_{ij}$

from restriction (6) written under the risk-neutral measure Q, i.e., from equating the implied

variance of the index and the implied variance of the portfolio of index constituents, given the

correlation risk premium in the specified parametric form (10):

$$
\left( \sigma^Q_{M,t} \right)^2 = \sum_{i=1}^{N} \sum_{j=1}^{N} w_i w_j \sigma^Q_{i,t} \sigma^Q_{j,t}
\frac{\rho^P_{ij,t} + \rho_t \times \frac{VRP_{M,t}}{VRP_{i,t} \times VRP_{j,t}}}
{\sqrt{\rho^P_{ii,t} + \rho_t \times \frac{VRP_{M,t}}{VRP^2_{i,t}}} \sqrt{\rho^P_{jj,t} + \rho_t \times \frac{VRP_{M,t}}{VRP^2_{j,t}}}}, \tag{12}
$$

where the only unknown variable $\rho_t$ is found numerically. In Appendix A we prove the following

theorem, stipulating the conditions for positive definiteness of the resulting HETIC matrix $\Gamma^Q$.

Theorem 1 The heterogeneous implied correlation (HETIC) matrix $\Gamma^Q$ is positive definite if

and only if the correlation matrix under objective probability measure $\Gamma^P$ is positive definite and

$\rho_t > 0$.
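The identification of the full HETIC matrix can be sketched as a one-dimensional root search. The text only states that $\rho_t$ is found numerically; the bisection used below is our assumption, as are the function names and all input numbers. To make the result checkable, the example generates the index implied volatility from a known value of $\rho_t$ and then recovers that value from restriction (12).

```python
from math import sqrt

def hetic_matrix(rho_t, corr_p, vrp, vrp_m):
    """Risk-neutral correlations from (10) and (11) for a given rho_t."""
    n = len(vrp)
    crp = [[rho_t * vrp_m / (vrp[i] * vrp[j]) for j in range(n)]
           for i in range(n)]
    return [[(corr_p[i][j] + crp[i][j])
             / sqrt((corr_p[i][i] + crp[i][i]) * (corr_p[j][j] + crp[j][j]))
             for j in range(n)] for i in range(n)]

def portfolio_variance(w, sigma, corr):
    """Variance of the index constituents' portfolio, as in (6)."""
    n = len(w)
    return sum(w[i] * w[j] * sigma[i] * sigma[j] * corr[i][j]
               for i in range(n) for j in range(n))

def solve_rho_t(w, sigma_q, sigma_q_m, corr_p, vrp, vrp_m,
                lo=0.0, hi=1.0, tol=1e-12):
    """Bisection on restriction (12); needs a sign change on [lo, hi]."""
    def gap(x):
        implied = portfolio_variance(w, sigma_q,
                                     hetic_matrix(x, corr_p, vrp, vrp_m))
        return implied - sigma_q_m ** 2
    if gap(lo) * gap(hi) > 0:
        raise ValueError("restriction (12) has no sign change on the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(lo) * gap(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# Hypothetical inputs; every number here is illustrative.
weights = [0.5, 0.3, 0.2]
implied_vols = [0.22, 0.31, 0.38]   # sigma^Q_i
corr_p = [[1.0, 0.3, 0.4],
          [0.3, 1.0, 0.5],
          [0.4, 0.5, 1.0]]
vrp = [1.00, 1.05, 0.95]            # VRP_i as defined in (5)
vrp_m = 1.15                        # VRP_M

# Round trip: build the index implied volatility from a known rho_t ...
true_rho_t = 0.3
index_implied_vol = sqrt(portfolio_variance(
    weights, implied_vols, hetic_matrix(true_rho_t, corr_p, vrp, vrp_m)))

# ... and recover it from the identifying restriction.
recovered = solve_rho_t(weights, implied_vols, index_implied_vol,
                        corr_p, vrp, vrp_m)
hetic = hetic_matrix(recovered, corr_p, vrp, vrp_m)
```

In an application the index implied volatility would come from index options rather than from a chosen $\rho_t$, and the bracket may need to be adapted to the data.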

If we do not need the whole correlation matrix under the risk-neutral measure, but only the

stock-to-market correlations $\rho^Q_{iM}$, we can work out a reduced implied correlations model. Along

the lines of the previous derivation, but applying Ito’s Lemma to equation (7), one can show

that the relative change in the index volatility drift is also a weighted sum of changes in the

individual volatility and stock-to-market correlation drifts due to change in measure. It allows

us to write the reduced model for the stock-to-market correlation risk premium at time t as

follows:

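$$
CRP_{iM,t} = \rho^Q_{iM,t} - \rho^P_{iM,t} = \pi_t \times \frac{VRP_{M,t}}{VRP_{i,t}}, \tag{13}
$$

where $\pi_t$ is computed from the identifying restriction (7) written under the risk-neutral probability

measure Q:

$$
\pi_t = \frac{\sigma^Q_{M,t} - \sum_{i=1}^{N} w_i \sigma^Q_{i,t} \rho^P_{iM,t}}{\sum_{i=1}^{N} w_i \sigma^Q_{i,t} \frac{VRP_{M,t}}{VRP_{i,t}}}. \tag{14}
$$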

After recovering the correlation risk premium between stock i and the market M, we can compute

the risk-neutral stock-to-market correlations as $\rho^Q_{iM,t} = \rho^P_{iM,t} + CRP_{iM,t}$, and we call it reduced

HETIC. Compared with the full HETIC, the reduced one has fewer parameters to estimate and

does not require any identifying restrictions on the full correlation matrix. Empirically, one can

use the reduced HETIC for any factor that has options traded on it.
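Because the reduced model has a closed form, it can be sketched in a few lines. The code below follows (13) and (14) and then forms the betas as the implied volatility ratio times the implied stock-to-market correlation; the function names and all input numbers are hypothetical and serve only as an illustration.

```python
def reduced_hetic(w, sigma_q, sigma_q_m, corr_p_im, vrp, vrp_m):
    """Return pi_t from (14) and the implied stock-to-market correlations."""
    n = len(w)
    ratio = [vrp_m / vrp[i] for i in range(n)]
    numerator = sigma_q_m - sum(w[i] * sigma_q[i] * corr_p_im[i]
                                for i in range(n))
    denominator = sum(w[i] * sigma_q[i] * ratio[i] for i in range(n))
    pi_t = numerator / denominator
    # rho^Q_iM = rho^P_iM + CRP_iM, with CRP_iM given by (13).
    corr_q_im = [corr_p_im[i] + pi_t * ratio[i] for i in range(n)]
    return pi_t, corr_q_im

def implied_market_betas(sigma_q, sigma_q_m, corr_q_im):
    """Implied volatility ratio times implied stock-to-market correlation."""
    return [s / sigma_q_m * r for s, r in zip(sigma_q, corr_q_im)]

# Hypothetical inputs; every number here is illustrative.
weights = [0.5, 0.3, 0.2]
implied_vols = [0.22, 0.31, 0.38]     # sigma^Q_i
index_implied_vol = 0.23              # sigma^Q_M
corr_p_im = [0.70, 0.75, 0.72]        # predictor of stock-to-market correlation
vrp = [1.00, 1.05, 0.95]              # VRP_i
vrp_m = 1.15                          # VRP_M

pi_t, corr_q_im = reduced_hetic(weights, implied_vols, index_implied_vol,
                                corr_p_im, vrp, vrp_m)
betas = implied_market_betas(implied_vols, index_implied_vol, corr_q_im)

# Restriction (7) under Q holds by construction of pi_t.
implied_index_vol_check = sum(w * s * r for w, s, r
                              in zip(weights, implied_vols, corr_q_im))
weighted_beta = sum(w * b for w, b in zip(weights, betas))
```

Since $\pi_t$ is chosen to satisfy restriction (7) under Q, the index-weighted average of the resulting betas equals one.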

2.3 Defining HETIC and Competing Factor Betas

The ultimate goal in designing new betas is to improve the prediction of the realized betas.

In the empirical section we compare our HETIC betas’ performance with that of three other

general approaches: implied, non-parametric historical, and parametric historical betas. The

current section defines all the competing betas considered in our analysis.

2.3.1 Implied Betas

Using the correlation and volatility risk premiums assumptions stipulated in the previous section,

we define full HETIC betas for a factor F (including market factor M) using expression (1) with

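implied correlations $\rho^{HETIC}_{ij}$ given by (9), (10), (11), and identified from equation (12):

$$
\beta^{HETIC}_{iF} = \frac{\sigma^Q_i \sum_{j=1}^{N} w^F_j \sigma^Q_j \rho^{HETIC}_{ij}}{\sum_{i=1}^{N} \sum_{j=1}^{N} w^F_i w^F_j \sigma^Q_i \sigma^Q_j \rho^{HETIC}_{ij}}, \tag{15}
$$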

where $w^F_k$ is the weight of an asset $k \in \{1, \dots, N\}$ in the factor F. The reduced HETIC betas for a

factor with traded options on it (we use market M as the most appropriate example) are defined

by (2) with implied correlations $\rho^{HETIC}_{iM}$ given by (13) and identified from (14):

$$
\beta^{HETIC}_{iM} = \frac{\sigma^Q_i}{\sigma^Q_M} \times \rho^{HETIC}_{iM}. \tag{16}
$$

There are two alternative beta methodologies in the literature that use option-implied information.

Both methods can handle betas only with respect to factors that have traded options

on them. First, French, Groth, and Kolari (1983) (FGK) combine historical stock to market

correlations with option-implied volatilities for a stock and the market:

$$
\beta^{FGK}_{iM} = \rho^P_{iM} \times \frac{\sigma^Q_i}{\sigma^Q_M}, \tag{17}
$$

and second, Chang, Christoffersen, Jacobs, and Vainberg (2009) (CCJV) suggest using the

risk-neutral model-free skewness (MFIS) implied by current option prices, in addition to option-implied

volatilities:

$$
\beta^{CCJV}_{iM} = \left( \frac{MFIS_i}{MFIS_M} \right)^{\frac{1}{3}} \times \frac{\sigma^Q_i}{\sigma^Q_M}. \tag{18}
$$

The first part of this beta expression serves as a proxy for the risk-neutral correlation.
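The two competing option-implied betas are simple enough to write down directly. The sketch below implements (17) and (18) with hypothetical inputs. Taking a real, sign-preserving cube root of the skewness ratio is our implementation choice and is not discussed in the text.

```python
from math import copysign

def beta_fgk(corr_p_im, sigma_q_i, sigma_q_m):
    """FGK beta (17): historical correlation times implied volatility ratio."""
    return corr_p_im * sigma_q_i / sigma_q_m

def beta_ccjv(mfis_i, mfis_m, sigma_q_i, sigma_q_m):
    """CCJV beta (18): cube root of the skewness ratio times implied vol ratio."""
    ratio = mfis_i / mfis_m
    # Real cube root that keeps the sign of the ratio (assumed convention).
    cube_root = copysign(abs(ratio) ** (1.0 / 3.0), ratio)
    return cube_root * sigma_q_i / sigma_q_m

# Hypothetical inputs; every number here is illustrative.
fgk = beta_fgk(corr_p_im=0.6, sigma_q_i=0.30, sigma_q_m=0.20)
ccjv = beta_ccjv(mfis_i=-0.2, mfis_m=-1.6, sigma_q_i=0.30, sigma_q_m=0.20)
```

Both functions share the implied volatility ratio and differ only in the correlation proxy: a historical correlation for FGK and the skewness-based term for CCJV.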

2.3.2 Non-parametric Betas

Betas in the literature that are not based on options are typically simple rolling window historical

betas. The advantage of these betas is that they are simple to compute and can be constructed

for any factor. We evaluate two types of these betas — from daily and high-frequency returns

— by plugging the respective (by data frequency) second moments into expression (1) or (2)

and using weights from several factor-mimicking portfolios including the market index M .
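A rolling window historical beta can be sketched as the ratio of a sample covariance to a sample variance over the most recent observations. The version below works from a stock return series and a factor return series, which is equivalent to plugging the window's second moments into the covariance-over-variance definition; the window length, the return series, and the function name are hypothetical.

```python
def rolling_window_beta(stock_returns, factor_returns, window):
    """Beta over the last `window` observations: covariance over factor variance."""
    if len(stock_returns) != len(factor_returns):
        raise ValueError("return series must have equal length")
    if window < 2 or window > len(stock_returns):
        raise ValueError("window must be between 2 and the sample length")
    y = stock_returns[-window:]
    x = factor_returns[-window:]
    mean_x = sum(x) / window
    mean_y = sum(y) / window
    # The 1/(window - 1) scaling cancels in the ratio, so it is omitted.
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    return cov_xy / var_x

# Hypothetical daily returns; the stock is an exact linear function of the
# factor here so that the resulting beta is known in advance.
factor = [0.010, -0.004, 0.007, -0.012, 0.003, 0.009, -0.006, 0.002]
stock = [0.001 + 1.5 * r for r in factor]

beta_last_five = rolling_window_beta(stock, factor, window=5)
```

The same function applies to daily and to high-frequency returns; only the sampling frequency of the inputs changes.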

2.3.3 Parametric DCC-MIDAS Betas

The parametric model under the objective probability measure that we consider is the Dynamic

Conditional Correlation model with Mixed Data Sampling (DCC-MIDAS). The DCC model as

proposed by Engle (2002) is a GARCH-type model for conditional correlations. This model gives

a prediction at the sampling frequency of the input data, and as the model is typically fitted

to daily returns, it is not very suitable for longer horizon forecasts: the prediction converges

relatively fast to the unconditional mean. We utilize an extension to this base model — the

DCC-MIDAS model as in Colacito, Engle, and Ghysels (2009) that separates volatilities and

correlations into short-run and long-run components, and the latter can be used to predict the

second moments at a slower frequency than the input data.

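Specifically, volatilities follow the GARCH-MIDAS process in Engle, Ghysels, and Sohn

(2008):

\begin{align}
r_{k,t} &= \mu_k + \sqrt{m_{k,t} \times g_{k,t}} \, \xi_{k,t} \tag{19} \\
g_{k,t} &= (1 - \alpha_k - \kappa_k) + \alpha_k \frac{(r_{k,t} - \mu_k)^2}{m_{k,t}} + \kappa_k \cdot g_{k,t-1} \tag{20} \\
m_{k,t} &= m_k + \theta_k \sum_{l=1}^{L_v} \varphi(\omega_{k,v}) \times RV_{k,t-l}; \quad RV_{k,t} = \sum_{\tau=0}^{20} (r_{k,t-\tau})^2, \tag{21}
\end{align}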

where $r_{k,t}$ is the (daily) return of asset $k \in \{M, 1, \dots, N\}$, the short-run volatility component $g_{k,t}$

follows a unit GARCH process, and the long-run volatility component $m_{k,t}$ is the weighted sum

of monthly realized variances $RV_{k,t}$. Given the standardized residuals $\xi_{k,t} = \frac{r_{k,t} - \mu_k}{\sqrt{m_{k,t} \times g_{k,t}}}$, the

DCC-MIDAS conditional correlations $\rho_{ij,t} = \frac{q_{ij,t}}{\sqrt{q_{ii,t}} \sqrt{q_{jj,t}}}$ follow:

\begin{align}
q_{ij,t} &= \bar{\rho}_{ij,t} (1 - a - b) + a \times \xi_{i,t-1} \times \xi_{j,t-1} + b \times q_{ij,t-1} \tag{22} \\
\bar{\rho}_{ij,t} &= \sum_{l=1}^{L_c} \varphi(\omega_{k,c}) \, c_{ij,t-l}; \quad
c_{ij,t} = \frac{\sum_{\tau=0}^{20} \xi_{i,t-\tau} \cdot \xi_{j,t-\tau}}{\sqrt{\sum_{\tau=0}^{20} (\xi_{i,t-\tau})^2} \sqrt{\sum_{\tau=0}^{20} (\xi_{j,t-\tau})^2}}, \tag{23}
\end{align}

where $\bar{\rho}_{ij,t}$ denotes the long run correlation, computed as the weighted sum of monthly realized

correlations $c_{ij,t}$. The weights $\varphi(\omega_{k,\cdot})$ in both models are given by a Beta weight function

as in Colacito, Engle, and Ghysels (2009) and are chosen optimally together with the other

parameters. For estimation we use a two-step procedure in the spirit of Engle (2002), where we

first estimate the GARCH-MIDAS and then the DCC-MIDAS model using Maximum Likelihood

estimation.

The DCC-MIDAS beta for stock i with respect to factor M is then given by:

$$
\beta^{DCC}_{i,t} = \bar{\rho}_{iM,t} \times \frac{\sqrt{m_{i,t}}}{\sqrt{m_{M,t}}}, \tag{24}
$$

driven solely by the long-run components and suitable for longer horizon forecasts.
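As a minimal sketch of how the long-run components enter equation (24), the Python fragment below builds lag weights, a long-run component, and the resulting beta. The one-parameter form of the Beta weights, proportional to (1 − l/L)^(ω−1) with ω ≥ 1, is an assumption: the text only states that a Beta weight function is used. All function names are illustrative and not taken from the paper.

```python
import math

def beta_weights(omega, n_lags):
    """Assumed one-parameter Beta lag weights phi_l, l = 1..n_lags, summing to one.

    Requires omega >= 1 so that the weight at the last lag is well defined.
    """
    raw = [(1.0 - l / n_lags) ** (omega - 1.0) for l in range(1, n_lags + 1)]
    total = sum(raw)
    return [x / total for x in raw]

def long_run_component(intercept, theta, weights, lagged_values):
    """intercept + theta * sum_l phi_l * lagged_values[l-1], as in equation (21).

    With intercept=0 and theta=1 the same sum gives the long-run correlation
    built from lagged monthly realized correlations.
    """
    return intercept + theta * sum(w * v for w, v in zip(weights, lagged_values))

def dcc_midas_beta(rho_long_run, m_stock, m_market):
    """Equation (24): long-run correlation times the ratio of long-run volatilities."""
    return rho_long_run * math.sqrt(m_stock) / math.sqrt(m_market)
```

Because only the slowly moving components enter, a beta computed this way does not revert to the unconditional mean within a few days, which is the property the text relies on for one- and six-month horizons.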

2.4 Implied Betas and Change of Measure: HETIC vs. Alternatives

Given our stock market model, factor betas are stochastic, and by applying Ito’s Lemma to

equations (3) and (4) one can derive the respective stochastic differential equations from the

underlying volatility and correlation processes. Because of the non-zero volatility and correlation

risk premiums, betas may have different drifts under the objective and the risk-neutral

probability measures. We call this change of drift due to the change of measure the beta risk

premium. The presence of the risk premium translates into the presence of a systematic bias in

risk-neutral betas when used as predictors for the objective ones.

As HETIC and the two other option-implied betas methods use risk-neutral quantities,

they all may suffer from the risk premium induced bias, when used to predict objective betas.

Rewriting French, Groth, and Kolari betas (17) as

$$\beta^{FGK}_{iM} = \beta^{P}_{iM} \times \frac{VRP_i}{VRP_M}, \qquad (25)$$

we see that these betas get affected by the forward-looking information in the form of the

volatility risk premiums ratio. They will be individually biased if $VRP_i \neq VRP_M$ and biased

on aggregate if the weighted individual volatility risk premiums are not equal to the market

volatility risk premium. Recent research (Carr and Wu (2009) and others) shows that the market

volatility risk premium is greater in magnitude than the mean individual volatility risk premium,

and therefore we would expect the FGK betas to be biased downwards. The methodology does

not take into account the correlation risk premium that “absorbs” the difference in index versus

individual volatility risk premiums. As the assumed correlation for this model is equal to the

objective correlation ρPiM , the correlation risk premium is zero.

Second, Chang, Christoffersen, Jacobs, and Vainberg betas (18) can be written as:

$$\beta^{CCJV}_{iM} = \beta^{P}_{iM} \times \frac{1}{\rho^{P}_{i,M}} \left(\frac{MFIS_i}{MFIS_M}\right)^{1/3} \times \frac{VRP_i}{VRP_M}. \qquad (26)$$

The authors recognize that option-implied correlations are not exactly the same as the objective

(or historical) correlations, and use the MFIS ratio as a proxy for implied correlations. Hence,

the CCJV betas implicitly use the forward-looking information in the form of the volatility

risk premium ratio and ratio of implied to objective correlations (which is in effect the relative

correlation risk premium). The cubic root of risk-neutral skewness ratio can indeed serve as

a proxy for the risk-neutral correlation under the assumption of zero skewness of the market

regression residuals and if MFIS for the market and the stock have the same sign. Knowing that

the one-factor model for stock returns is often rejected, a priori we have no clear knowledge

about the sign or the magnitude of the bias in CCJV betas.

The FGK and CCJV methodologies can only be applied to computing betas for factors with

traded options, and hence are most directly comparable to our reduced HETIC betas:

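$$\beta^{HETIC}_{iM} = \frac{\sigma^Q_i}{\sigma^Q_M} \times \rho^{HETIC}_{iM} = \beta^{P}_{iM} \times \frac{VRP_i}{VRP_M} + \frac{\sigma^Q_i}{\sigma^Q_M} \times CRP_{iM} \qquad (27)$$

$$\phantom{\beta^{HETIC}_{iM}} = \beta^{FGK}_{iM} + \frac{\sigma^Q_i}{\sigma^Q_M} \times CRP_{iM}, \qquad (28)$$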

where the correlation risk premium CRPiM is defined by equation (13). As we see, the reduced

HETIC betas take into account the correlation risk premium (that “absorbs” the difference

between the index and individual volatility risk premiums), and add some positive bias correction

to the FGK betas.
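The decomposition in (25), (27), and (28) can be checked numerically. The sketch below assumes that the volatility risk premium is the ratio of implied to realized volatility (as in the historical proxy used later in the paper) and that the HETIC correlation equals the objective correlation plus the correlation risk premium, which is what (27) implies. All input numbers are hypothetical and chosen only for illustration.

```python
def objective_beta(rho_p, sigma_p_stock, sigma_p_market):
    """Beta under the objective measure: rho^P * sigma^P_i / sigma^P_M."""
    return rho_p * sigma_p_stock / sigma_p_market

def fgk_beta(rho_p, sigma_q_stock, sigma_q_market):
    """FGK beta: objective correlation combined with implied volatilities."""
    return rho_p * sigma_q_stock / sigma_q_market

def reduced_hetic_beta(rho_p, crp, sigma_q_stock, sigma_q_market):
    """Equation (27), assuming rho^HETIC = rho^P + CRP."""
    return (rho_p + crp) * sigma_q_stock / sigma_q_market

# Hypothetical inputs (not taken from the paper's tables).
rho_p, crp = 0.45, 0.09
sigma_p_stock, sigma_p_market = 0.30, 0.18
sigma_q_stock, sigma_q_market = 0.33, 0.22

vrp_stock = sigma_q_stock / sigma_p_stock      # about 1.10
vrp_market = sigma_q_market / sigma_p_market   # about 1.22

beta_p = objective_beta(rho_p, sigma_p_stock, sigma_p_market)
beta_fgk = fgk_beta(rho_p, sigma_q_stock, sigma_q_market)
beta_hetic = reduced_hetic_beta(rho_p, crp, sigma_q_stock, sigma_q_market)
```

With these inputs the objective beta is about 0.75, the FGK beta about 0.675 (pulled down because the market premium exceeds the stock premium), and the reduced HETIC beta about 0.81, so the correlation risk premium term adds back 0.135.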

While we might get a systematic bias in HETIC betas for each stock individually, we can see

that there will be no bias on average, because the mean betas under the risk-neutral measure Q

are equal to mean objective betas (assuming constant weights), and both are equal to one. For

the reduced HETIC we have

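$$\sum_{i=1}^{N} w_i \beta^Q_{iM} = \sum_{i=1}^{N} \frac{w_i \sigma^Q_i \rho^Q_{iM}}{\sigma^Q_M} = \sum_{i=1}^{N} \frac{w_i \sigma^Q_i \sigma^Q_M \left(\rho^P_{iM} + CRP_{iM}\right)}{\left(\sigma^Q_M\right)^2} = 1, \qquad (29)$$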

where the last equality follows from the identifying restriction (14). The same average relation

may be derived for the full HETIC betas using the linearity of the covariance in (1) and (4).

In contrast to HETIC betas, we expect to observe on average a downward bias in FGK betas,

and the average bias in CCJV betas is not clear a priori. For HETIC, the relation (29) will only

hold under an assumption of constant market (factor) weights wi, but empirically we expect it

to hold, at least for a short time.
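The averaging property can be illustrated with a small numerical sketch. It assumes only that the market return is the weighted sum of the constituent returns with constant weights; the covariance matrix and weights below are hypothetical, and the function names are illustrative.

```python
def market_betas(cov, weights):
    """Betas of each asset with respect to the weighted portfolio of all assets."""
    n = len(weights)
    # Cov(r_i, r_M) = sum_j w_j * Cov(r_i, r_j)
    cov_with_market = [sum(cov[i][j] * weights[j] for j in range(n)) for i in range(n)]
    # Var(r_M) = sum_i w_i * Cov(r_i, r_M)
    var_market = sum(weights[i] * cov_with_market[i] for i in range(n))
    return [c / var_market for c in cov_with_market]

def weighted_average_beta(cov, weights):
    """Weighted average of market betas; equals one by construction."""
    betas = market_betas(cov, weights)
    return sum(w * b for w, b in zip(weights, betas))

# Hypothetical three-asset example.
cov = [[0.04, 0.01, 0.00],
       [0.01, 0.09, 0.02],
       [0.00, 0.02, 0.16]]
weights = [0.5, 0.3, 0.2]
average = weighted_average_beta(cov, weights)
```

The result equals one up to floating-point error for any valid covariance matrix, which is why a weighted average beta away from one signals a bias in the inputs rather than a property of the market.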

3 Data Description and Preparations

Our study is based on the major U.S. market proxy, the S&P500 index, and its constituents

for a sample period from January 4, 1996, to October 30, 2009, a total of 3,482 trading days.

In Section 3.1 we briefly describe the stock and option data. In Sections 3.2 and 3.3 we then

discuss the estimation of the realized (co)variances and option-implied measures, respectively.

3.1 Stock and Option Data

The daily stock data consist of prices, returns, and number of shares outstanding and come

together with the S&P500 index returns from the CRSP database. Sorted by PERMNO, we

have in total 950 names in our data, which is more than 500 because of index additions/deletions.

To construct an index weights proxy we use COMPUSTAT: we first merge it with the daily

CRSP data, and then compute the weights on each day using the closing market capitalization

of all current index components on the previous day. Resulting index weights almost perfectly

coincide with the real S&P500 weights from Bloomberg LP for a sub-period of several years.
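A minimal sketch of the weight proxy, assuming the input is a mapping from each current index member to its closing market capitalization on the previous day. The function name and the tickers in the usage line are hypothetical.

```python
def index_weights(prev_day_market_caps):
    """Value weights from previous-day closing market capitalizations.

    prev_day_market_caps: dict mapping each current index member to its
    market capitalization at the previous day's close.
    """
    total = sum(prev_day_market_caps.values())
    return {name: cap / total for name, cap in prev_day_market_caps.items()}

# Hypothetical two-stock "index".
example = index_weights({"AAA": 300.0, "BBB": 100.0})
```

Using the previous day's close keeps the weights known at the start of the day on which they are applied.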

The high-frequency stock data consist of transaction prices from the NYSE’s trades and

quotes (TAQ) database. For the S&P500 we use transaction prices of the Standard & Poor’s

Depository Receipt (SPY) from TAQ.8 High-frequency prices are filtered from the official opening

at 9:30 EST until 16:00 EST and only include valid entries. In addition, we remove outliers

following the cleaning procedures in Barndorff-Nielsen, Hansen, Lunde, and Shephard (2009).

We construct a regularly spaced one minute price grid for every trading day using the volume-

weighted average price for a minute. We fill in missing minute VWAPs using past values, but

only within the last 30 minutes. Before filling in, trading is rather infrequent in the early

years. For example, in 1996 there are on average 95 minutes per day (out of a maximum of 390)

with valid trades for the index, and 132 minutes for stocks. We observe a relatively stable

number of trades per day only starting in 2002, when we get 350 non-empty minutes per day

for the index, and 273 for stocks. After filling in, we always have more than 387 valid entries

per day for the index, and about 385 for stocks.
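The fill rule can be sketched as follows. The interpretation that a past value is carried forward only if it is at most 30 minutes old is an assumption about the exact rule; the function name and the `None` convention for empty minutes are illustrative.

```python
def fill_forward(vwaps, max_age=30):
    """Fill empty minutes with the last observed VWAP if it is recent enough.

    vwaps: list of per-minute VWAPs, with None for minutes without trades.
    max_age: maximum age (in minutes) of a value that may be carried forward.
    """
    filled = []
    last_value, last_index = None, None
    for i, value in enumerate(vwaps):
        if value is not None:
            last_value, last_index = value, i
            filled.append(value)
        elif last_value is not None and i - last_index <= max_age:
            filled.append(last_value)
        else:
            filled.append(None)  # nothing recent enough to carry forward
    return filled
```

Minutes before the first trade of the day, and gaps longer than `max_age`, stay empty, which is consistent with the filled grid having slightly fewer than 390 entries per day.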

The data for the equity and index options are obtained from IvyDB OptionMetrics Volatility

Surface file that provides us with Black-Scholes implied volatilities for options with standard

maturities and moneyness levels. We select out-of-the-money (puts with deltas strictly larger

than −0.5 and calls with deltas smaller than 0.5) implied volatilities for maturities of one and

six months, and also the implied volatilities for at-the-money call and put options with the

same maturities. On average we have option data for 445 out of the 500 S&P500 stocks – the

number of available implied volatilities grows from 373 in 1996 (adding up to 76% of the index

in terms of the weight) to 483 (98% of the total weight) in 2009. Note that we use options not

as instruments for trading, but as an information source only. If an underlying option market is

not liquid or not efficient, and we can still take advantage of it for improving betas’ predictive

abilities, then with a “good” market we would be even better off.

8 The SPY, an exchange-traded fund that holds all S&P500 stocks, is highly liquid and can be redeemed for the underlying portfolio so that the fund’s price does not deviate from the S&P500.

3.2 Realized Covariances under the P-measure

The theory of quadratic variation (see Jacod (1994) and Jacod and Protter (1998)) implies that

we can consistently estimate integrated (co)variance over a period from t up to T by the realized

covariance matrix Σ(t, T ):

$$\Sigma(t, T) = \sum_{k=1}^{S} r(t + k\Delta t)\, r'(t + k\Delta t), \qquad (30)$$

where S denotes the number of sampling periods, ∆t ≡ (T − t)/S is the sampling frequency,

and r(t+ k∆t) ≡ logP (t+ k∆t)− logP (t+ (k− 1)∆t) is the vector of incremental log-returns.
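A minimal sketch of estimator (30), assuming all price series are already sampled on a common, regularly spaced grid. The function names are illustrative.

```python
import math

def log_returns(prices):
    """Incremental log-returns between consecutive grid points."""
    return [math.log(b) - math.log(a) for a, b in zip(prices, prices[1:])]

def realized_covariance(price_paths):
    """Realized covariance matrix: sum over periods of the outer product r r'.

    price_paths: list of N price series observed on the same grid.
    Returns an N x N matrix as a list of lists.
    """
    returns = [log_returns(p) for p in price_paths]
    n = len(returns)
    return [[sum(x * y for x, y in zip(returns[i], returns[j])) for j in range(n)]
            for i in range(n)]
```

The diagonal entries are realized variances and the off-diagonal entries realized covariances, so a realized beta is simply the ratio of a stock-market covariance entry to the market variance entry.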

The realized (co)variance measures based on high-frequency data deliver more accurate

ex-post integrated (co)variances (see Andersen, Bollerslev, Diebold, and Ebens (2001), and

Barndorff-Nielsen and Shephard (2002), among others) as they utilize more data compared to

their daily counterparts. However, at very high frequencies, microstructure effects render these

estimates inconsistent. On the one hand, noise effects9 arising from bid-ask bounce or discrete

trading drive realized variances to infinity (see Zhang, Mykland, and Aıt-Sahalia (2005), and

Bandi and Russell (2005)), while, on the other hand, the effect of nonsynchronicity, known as

the Epps effect, drives covariances to zero (see Epps (1979) and Zhang (2006)). Both problems

can be mitigated by reducing the sampling frequency. For instance, sampling every five minutes

works well for eliminating the noise effect (Bollerslev, Tauchen, and Zhou (2009)), while for the

Epps effect this is still too fast.

To obtain a reasonable sampling frequency, we use the cross-sectional property of stock

market betas, namely, that the weighted average market betas should be equal to one. While

this property holds only under the assumption of constant index weights, we expect it to hold

approximately in real conditions. We compute the market variance as well as the covariances

between the index components and the index at a frequency ∆t. If the frequency is too fast, the

Epps effect introduces a negative bias in the covariances, and the weighted market beta is below

one. Figure 1 shows the time series of weighted average market betas computed on a rolling

basis using data for 21 trading days (one month), for four different sampling frequencies: 30,

60, 120, and 180 minutes.

9 See Brown (1990), Zhou (1996), and Corsi, Zumbach, Muller, and Dacorogna (2001).

We find that a sampling frequency of 60 minutes is optimal for most of our sample; only

for the years 1996–2001 do we use a slower sampling frequency of 180 minutes, as there

is a very pronounced downward bias in the average betas at faster frequencies. Obviously, by using

slower frequencies and sampling only every 60th price observation, one throws away information

and thus loses efficiency. As a solution we follow Zhang, Mykland, and Aıt-Sahalia (2005) and

use subsampling and averaging. Initially, we have stock prices at a one minute grid: for minutes

1, 2, 3, . . . up to the end of the estimation window. To sample at a 60 minute frequency we use

60 different subsamples: for the first one we compute hourly returns from prices at minutes

1, 61, 121, . . .; for the second subsample we use the information from minutes 2, 62, 122, . . ., and

so on. For each of the resulting subsamples we compute estimator (30). Finally, we average over

the subsamples.
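The subsampling-and-averaging step can be sketched as below. This is only the plain average over offset grids described in the text, not a full implementation of the estimator in Zhang, Mykland, and Aıt-Sahalia (2005); in particular, no rescaling is applied to subsamples that contain one return fewer. Names are illustrative.

```python
def subsampled_realized_covariance(log_p_i, log_p_j, step=60):
    """Average realized covariance over `step` offset subsamples.

    log_p_i, log_p_j: log prices of two assets on the same one-minute grid.
    step: sampling interval in minutes (60 gives hourly returns).
    """
    estimates = []
    for offset in range(step):
        grid_i = log_p_i[offset::step]
        grid_j = log_p_j[offset::step]
        if len(grid_i) < 2:
            continue  # this offset yields no return inside the window
        rc = sum((a1 - a0) * (b1 - b0)
                 for a0, a1, b0, b1 in zip(grid_i, grid_i[1:], grid_j, grid_j[1:]))
        estimates.append(rc)
    if not estimates:
        raise ValueError("window too short for the chosen sampling interval")
    return sum(estimates) / len(estimates)
```

Passing the same series twice gives the subsampled realized variance, so the same function covers both the numerator and the denominator of a realized beta.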

As at a daily frequency the microstructure problems are not an issue, we can simply apply

the realized covariance estimator (30) for the daily data second moment estimates.

3.3 Implied Volatility, Variance and Skewness under the Q-Measure

There are two generally accepted ways for computing risk-neutral volatilities: a parametric and

a model-free one. The recent literature (e.g., Carr and Lee (2009); Carr and Wu (2009)) shows

that theoretically the risk-neutral expected volatility is best approximated by the Black and

Scholes (1973) at-the-money implied volatility (IV), while the risk-neutral expected variance is

best approximated by the model-free implied variance (MFIV). MFIV has more content as it

subsumes information from all options expiring on one date (Vanden (2008)), does not rely on

any model, and has a nice economic interpretation of a price of the linear portfolio of options.10

However, the simple at-the-money IV still seems to be a superior expected risk-neutral volatility

proxy, even though it is based on one option only and relies on restrictive assumptions.

In most of our computations we need a good predictor for volatility, and hence we use in the

main parts of the analysis the average of at-the-money call and put option IV

$$\sigma^Q_i = \frac{IV_{Call} + IV_{Put}}{2}$$

as a proxy for expected risk-neutral volatility $\sigma^Q_i$ for the respective maturity, and we use its

square as a proxy for the expected risk-neutral variance.

10 See Carr and Wu (2006) for a good discussion of why the industry has chosen MFIV over IV by switching the methodology for the CBOE-traded implied volatility index VIX.

We also compute model-free variance (MFIV) and skewness (MFIS) following the nonpara-

metric formulas of Bakshi, Kapadia, and Madan (2003). While MFIS is needed for CCJV betas,

we use MFIV for robustness tests in Section 5. The exact formulas are given in Appendix B.

For the HETIC computations we also need the predicted value of the volatility risk premium.

Following the recent literature (e.g., Carr and Wu (2009), and Carr and Lee (2009)), we proxy

for it by the historical volatility risk premium, estimated for stock i on day t as the average

realized volatility risk premium over the past year:

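$$VRP_{i,t} = \frac{1}{252 - \tau} \sum_{\delta = t-252}^{t-\tau-1} \frac{\sigma^Q_{i\delta}}{\sigma^P_{i\delta}}, \qquad (31)$$

where $\sigma^Q_{i\delta}$ and $\sigma^P_{i\delta}$ are the implied and realized volatilities for the maturity period from $\delta$ to

$\delta + \tau$. We compute the $VRP_{i,t}$ for two maturities $\tau \in \{21, 126\}$ (trading days) separately and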

use them for the respective correlation and beta prediction horizon.
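A minimal sketch of the estimator in (31), assuming two day-indexed lists: the implied volatility observed on day δ for maturity τ, and the volatility subsequently realized over δ to δ + τ. The summation stops at t − τ − 1 so that every realized volatility used is already known on day t. The function name and argument layout are illustrative.

```python
def historical_vrp(implied_vol, realized_vol, t, tau, window=252):
    """Average ratio of implied to subsequently realized volatility, as in (31).

    implied_vol[d]:  implied volatility on day d for maturity tau.
    realized_vol[d]: volatility realized over days d .. d + tau.
    """
    start, stop = t - window, t - tau  # stop is exclusive: last day is t - tau - 1
    if start < 0:
        raise ValueError("not enough history before day t")
    ratios = [implied_vol[d] / realized_vol[d] for d in range(start, stop)]
    return sum(ratios) / (window - tau)
```

The list `ratios` has exactly `window - tau` entries, matching the normalization 1/(252 − τ) in the formula.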

In the robustness section we also use variance risk premiums that are computed using the

same principle as in (31), but using MFIV and realized variances.

4 Factor Betas Horse Race: Empirical Investigation

We now assess the performance of HETIC betas compared to the competing betas. We first

briefly describe in Section 4.1 how we estimate the different betas, and in Sections 4.2 and 4.3

we then evaluate their performance in both statistically and economically driven applications.

In Section 4.4 we also compare the HETIC betas with the traditional rolling window ones in the

CAPM-type risk-return relation.

4.1 Factor Beta Estimation

All the betas we consider are derived from equations (1) or (2). The only difference in methods

lies in the second moments that we feed into these expressions.

As a common input to all implied beta methods we use the at-the-money IV’s for the

maturity coinciding with the horizon of the betas we want to predict — 21 and 126 trading

days, or one and six months. The value used for correlation then distinguishes each given

method. First, French, Groth, and Kolari betas in (17) are computed using the historical stock-

to-market correlation for the past 126 days in two versions — from daily and high-frequency data.

Second, for Chang, Christoffersen, Jacobs, and Vainberg betas in (18) we get the proxy for the

implied stock-to-market correlations using the model-free implied skewness for the respective

maturity of one or six months. Third, for HETIC betas we fit two models of the implied

correlation — the full and the reduced one. As a proxy for the expected correlation under

the objective measure, we use historical high-frequency stock-to-stock (full model) or stock-to-

market (reduced model) correlations for 126 trading days. After plugging into equations (12)

and (14) all previously computed variables — implied volatilities, volatility risk premiums, and

objective measure correlation proxy — we solve these equations for ρ and π on each day in the

sample period. Then we compute the respective implied correlations and use them in equations

(15) and (16) to derive the full and reduced HETIC betas, respectively. Note that for FGK and

CCJV we only compute market betas, and for HETIC we compute two versions of market betas

(full and reduced) and a number of factor betas using the full model.

For the rolling betas, correlations and volatilities are estimated from historical data. For the

high-frequency data we use on each day 21 and 126 days of history, depending on the horizon

of the future betas that we want to predict; whereas for daily data we always use the last 126

returns. For these types of betas we compute both market and factor betas.

The parametric betas are estimated by first fitting the DCC-MIDAS model described in

section 2.3.3 for each stock-to-market pair independently for the whole sample period, and then

using the long-run components of volatility and correlation on each day in equation (24). For

the parametric method we compute only the market betas, though one can certainly do that

exercise for other factors as well. Due to a large number of stock-to-market combinations we

report only some average parameter values, e.g. for the market variance we have αMkt = 0.065,

κMkt = 0.65, mMkt = 2.95 × 10^{-5}, θMkt = 0.025 and a weight parameter ωMkt of about 4. For

individual variances we have: α = 0.083, κ = 0.55, m = 1.08 × 10^{-4}, θ = 0.027 and a weight

parameter ω of about 3. The average parameters for the correlation part are a = 0.03, b = 0.60

and a weight parameter of about 2.

In Table 1 we provide summary statistics on the estimated market betas for two horizons —

one and six months to maturity. Among other statistics, we compute a market-cap-weighted average of

individual betas that should theoretically be equal to one, assuming constant market weights.

Across different time spans, the weighted average FGK betas are 10% to 19% lower, and CCJV

betas are 18% to 34% higher than one. Among other reasons, the positive bias of the CCJV betas can

be explained by the fact that these betas are positive by construction, e.g., the minimal beta

is around 0.01 for both maturities, while all other methodologies also give some highly negative

betas. For FGK betas the bias gets more severe with a longer time span, while for CCJV betas

we observe the opposite relation. The mean weighted HETIC betas are unbiased by construction.

The rolling window high-frequency betas are about 4% lower than one due to the Epps effect,

as we have seen from Figure 1, while daily betas, not suffering from the microstructure noise,

are literally equal to one. DCC-MIDAS betas demonstrate a small negative bias of about 5%.

Thus, the FGK and CCJV implied betas show the largest biases with respect to the objective

market beta, and to investigate the exact causes of this fact, we provide the stock-to-market

correlations used in the computation of the methods (for CCJV beta it is the skew-based proxy)

in Table 2, as well as the volatility risk premium for the index and the mean value of the risk

premium for its constituents in Table 3.

The average historical correlation is around 0.48 for the daily and around 0.45 for the high-

frequency returns. The average HETIC correlation is slightly higher due to the correlation risk

premium, and is given by 0.54 for one month and by 0.59/0.57 (full/reduced models) for six

months. The artificial CCJV implied correlation proxy is by construction always positive and

much higher, decreasing from around 0.79 for one month to 0.74 for half a year. Moreover,

consistent with previous research, we observe that the ratio of implied to realized volatility is

on average higher for an index than for its components. Interestingly, the discrepancy between

the volatility risk premium for the index and its constituents grows substantially with the time

span, mostly due to the decreasing volatility risk premium in the individual stocks. The volatility

risk premium is 1.22 in the index versus 1.14 in its components for one month, while for six

months the index risk premium decreases slightly to 1.17 and the individual risk premium

drops to 1.01. The difference in the volatility risk premiums explains the negative bias in

the FGK betas that increases with the time span. The correlation risk premium in HETIC

exactly compensates this volatility risk premium difference effect, and the skew-based proxy in

CCJV betas overcompensates it, making the betas biased upwards. A longer time span works

in the right direction for the CCJV betas, as more compensation is required from the implied

correlation proxy due to the increasing difference in the volatility risk premiums for the index

and its constituents.

4.2 Realized Beta Predictability

The main test for judging the performance of the beta methods lies in assessing their predictive

power with respect to the realized betas. In this section we carry out some time-series tests

to isolate the best methodology in predicting betas for each considered factor F . We use two

generally accepted performance metrics: mean-squared error (MSE), i.e., the squared deviation

of the ex-post realized betas from the predicted ones, and the R2 from the time-series regression

of realized betas on the predictors and an intercept:

$$\beta^{(F),Realized}_{i,t} = \upsilon_i + \lambda_i \beta^{(F),Predicted}_{i,t} + \varepsilon^{(F)}_{i,t}, \quad \forall t, \qquad (32)$$

where subscript i refers to a given stock. The results for the market factor are presented in

section 4.2.1, and the results for additional factors are provided in section 4.2.2.
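The two metrics can be sketched for a single stock as follows, assuming two equally long time series of realized and predicted betas. The regression is the univariate OLS in (32); function names are illustrative.

```python
def mse(realized, predicted):
    """Mean squared deviation of realized betas from predicted ones."""
    return sum((r - p) ** 2 for r, p in zip(realized, predicted)) / len(realized)

def regression_r2(realized, predicted):
    """OLS of realized on predicted betas with an intercept, as in (32).

    Returns (intercept, slope, R^2).
    """
    n = len(realized)
    mean_x = sum(predicted) / n
    mean_y = sum(realized) / n
    sxx = sum((x - mean_x) ** 2 for x in predicted)
    syy = sum((y - mean_y) ** 2 for y in realized)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(predicted, realized))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope, sxy ** 2 / (sxx * syy)
```

The two metrics capture different things: a predictor that is a perfect linear transformation of the realized beta has an R2 of one but can still have a large MSE, because the regression absorbs any level and scale bias while the MSE does not.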

4.2.1 Market Factor

As predicted values for the market betas we use the rolling window historical betas (calculated

from high-frequency and daily returns), the DCC-MIDAS betas, French, Groth, and Kolari betas

(FGK HF, using high-frequency return correlations, and FGK Daily), Chang, Christoffersen,

Jacobs, and Vainberg betas, and HETIC betas (full and reduced versions). As realized betas are

measured more precisely from high-frequency rather than daily data, we only use high-frequency

realized betas for 21 and 126 trading days (one and six months) as the ex-post measure. The

results of the performance tests are given in Table 4.

We first focus on the predictability results for the one-month horizon. The market betas

from full HETIC, having an average MSE of 0.1274, are significantly better than the other

competing methods in terms of MSE. For instance, compared to historical daily betas we achieve

a performance improvement of about 18%. The only betas that come close to full HETIC betas

are the high-frequency FGK betas. Using the reduced version of HETIC decreases the MSE even

more, to 0.1183. These reduced HETIC betas are significantly better, with t-statistics above

8, than all the other methods. Moreover, the reduced HETIC betas give for about 50% of all

stocks the lowest MSE, followed by the full HETIC model and the FGK HF model, with 15%

and 14% of lowest MSE stocks, respectively.

The picture for the explained variability is similar. While the R2 for the full HETIC model

is higher than for all other methodologies, the reduced version shows an even higher R2 with a

t-statistic above 10 for the difference with the next rival.

For the longer horizon, the predictability results are qualitatively the same. With the exception of

the historical high-frequency betas, the competing methods have MSEs that are more than 30%

higher and R2s that are more than 18% lower than the full HETIC model. While the historical

high-frequency betas show about the same performance as the full HETIC model in terms of

MSE, they have a significantly higher MSE and lower R2 than the reduced HETIC model. Thus,

there is clear evidence that the HETIC methodologies are the most efficient (in terms of the

explained variability) and the least biased (in terms of the MSE) predictor of the realized market

betas, with the reduced version being the overall best method.

Note that the three option-implied methodologies (FGK, CCJV and HETIC) only differ in

the correlation proxy used, as they are constructed from the same option-implied volatilities.

The differences in terms of MSE and R2 between these methods can therefore be attributed to

the option-implied correlation proxy proposed in Section 2.2.

4.2.2 Other Factors

As additional factors, we use the widely accepted Fama and French (1993) size (SMB), and

value (HML), and Carhart (1997) momentum (UMD) factors. Normally, one should use the

whole universe of traded stocks to replicate these factors. However, as we have discussed in

Section 2, we are limited to a smaller stock sample, as we can compute the implied pairwise

correlations only for the stocks in the considered index. To circumvent this problem we follow the

procedure in Breeden, Gibbons, and Litzenberger (1989), as well as Lamont (2001), to create

factor-mimicking portfolios using the stocks in the index that track the innovations in these

factors. The procedure is described in detail in Appendix C, and the most important statistics

for the derived factor-mimicking portfolios are given in Table 5.

As outlined before, FGK and CCJV betas can only be used for factors with traded options on

them, so that we now concentrate on the rolling window historical betas (calculated from high-

frequency and daily returns) as well as the forward-looking HETIC betas. We do not consider

DCC-MIDAS betas as they show about the same performance in predicting market factor betas

as daily historical betas, but are computationally more costly. Again we use high-frequency

betas as the ex-post measurement.

The results are presented in Table 6 and are very encouraging for the HETIC model. For

the short horizon and all three factors the HETIC betas have significantly lower MSEs than

other rivals with t-statistics around 8, as well as the lowest MSE for 63%, 84% and 83% of

the stocks, respectively. Furthermore, for all factors HETIC betas show the highest R2 with

extremely high t-statistics and the highest R2 for around 80% of the stocks. Note that while

we showed in Section 2.4 that HETIC betas are on aggregate unbiased for the market factor (as

they are computed from the variance decomposition of the market), such a result does not exist

for other factors. Thus, even being potentially biased, the HETIC betas are able to deliver very

good performance for the short horizon.

For longer horizon predictability, HETIC betas are still the best in terms of R2 for all three

factors while the historical high-frequency betas are now the best in terms of MSE. However,

the HETIC betas are still significantly better than the historical daily betas with t-statistics

above 7.5, showing performance improvements of more than 20% in terms of MSE.

4.3 Applications

We now evaluate the performance of the HETIC betas in two economic applications: first, the

market betas in a pairs trading setup, and then all factor betas in a portfolio risk exposure

targeting application.

4.3.1 Pairs Trading

Pairs trading consists of selecting a pair of stocks in order to go long one of the stocks and short

the other one. This is typical for “stock picking,” i.e., a situation where a fund expects the first

stock to outperform the second one. It is also typical for statistical arbitrage models in which a

fund trades two cointegrated stocks if the spread strongly deviates from its long-run mean and

profits if the spread returns to the equilibrium.11 To hedge against market movements that can

deteriorate the profitability of such trading strategies, hedge funds not only trade the two stocks

but also the market to render the overall portfolio market neutral. As we can neither compute

the possible cointegration relationship for all pairs, nor perform stock picking, we consider all

pairs of stocks and randomly decide on the trade direction (long/short).12 As pairs trading

positions are typically closed quickly, we only consider holding periods of one to four weeks

using options with the shortest maturity of one month.

11 For a good overview on pairs trading see Vidyamurthy (2004).

For each pair, each holding period and each beta method we create a portfolio consisting of

a long position in the first stock, a short position in the second one, and a position in the market

(with weight w) such that the expected market beta is zero:

(1− w)× (β1 − β2) + w × βM = 0, (33)

where βM = 1 denotes the beta of the market, and βi, i = 1, 2 is the beta of each stock with

respect to the market. Given the optimal weight w we then compute the realized market beta

of the portfolio, i.e., the deviation from the target beta of zero.
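Solving (33) for the market weight gives w = −(β1 − β2)/(βM − (β1 − β2)). The sketch below computes this weight from predicted betas and then evaluates the portfolio beta with realized betas; the function names and the numbers in the usage lines are hypothetical.

```python
def market_hedge_weight(beta_long, beta_short, beta_market=1.0):
    """Weight w in the market that solves equation (33) for a zero expected beta."""
    spread = beta_long - beta_short
    denominator = beta_market - spread
    if denominator == 0:
        raise ValueError("beta spread equals the market beta; (33) has no solution")
    return -spread / denominator

def portfolio_beta(weight, beta_long, beta_short, beta_market=1.0):
    """Left-hand side of (33) for given betas and market weight."""
    return (1.0 - weight) * (beta_long - beta_short) + weight * beta_market

# Hypothetical example: predicted betas 1.3 and 0.8, realized betas 1.2 and 0.9.
w = market_hedge_weight(1.3, 0.8)
expected_exposure = portfolio_beta(w, 1.3, 0.8)   # zero by construction
realized_exposure = portfolio_beta(w, 1.2, 0.9)   # deviation from the target
```

In this example the weight is −1 (a short market position), and the realized exposure of about −0.4 is the deviation that enters the squared-error comparison across beta methods.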

The results of this pairs trading application are presented in Table 7. For a holding period

of five days the performance of the full HETIC betas, with a mean squared error of 0.0322,

is significantly better than for all other competing methods, with t-statistics above 8. For the

reduced HETIC model the MSE is even lower. These results are confirmed for holding periods

of 10 and 21 trading days: full HETIC betas are always significantly better than the competing

methods, yielding a predictability improvement of about 25% compared to historical daily betas,

while the reduced HETIC model is the best overall method. For all considered methodologies,

the MSE decreases with the holding period because longer horizon betas are less volatile.

4.3.2 Portfolio Immunization

We now extend the pairs trading application in two dimensions. First, instead of considering

only the market factor, we also take into account the SMB, HML, and UMD factors. Second,

we concentrate on portfolios with one hundred stocks instead of just two stocks. The basic goal

stays the same: we want to immunize a portfolio against movements in the factors.

On each trading day we create two portfolios consisting of equal-weighted long positions in 50

randomly selected stocks and equal-weighted short positions in 50 other stocks. For each beta

methodology, we compute the expected portfolio factor betas and the corresponding neutral

positions in the factors such that the expected beta for each factor is zero. Specifically, we solve

the system of equations for the portfolio weights $\{w_{Pf}, w_{Mkt}, w_{SMB}, w_{HML}, w_{UMD}\}$:

$$w_{Pf} \times \beta_{Pf,k} + w_k \times \beta^{Long/Short}_k = 0 \quad \forall k \in \{Mkt, SMB, HML, UMD\} \qquad (34)$$

$$w_{Pf} + w_{Mkt} + w_{SMB} + w_{HML} + w_{UMD} = 1, \qquad (35)$$

where $\beta_{Pf,k}$ denotes the beta of the portfolio consisting of one hundred stocks with respect to factor

$k$ and $\beta^{Long/Short}_k$ the long/short beta of the $k$th factor (1 or −1). Given the optimal portfolio

weights, we compute the realized factor exposures, i.e., deviations from zero beta.

12 Since we evaluate only the realized beta of the portfolio and not the profitability, this random selection of pairs should not introduce any bias.
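The system (34) and (35) has a closed-form solution: each factor weight is proportional to the stock portfolio weight, and the budget constraint pins down the scale. A minimal sketch, with illustrative names and hypothetical betas:

```python
def immunizing_weights(portfolio_betas, factor_betas):
    """Solve (34)-(35) for the weights in the stock portfolio and the factors.

    portfolio_betas: dict factor -> expected beta of the stock portfolio.
    factor_betas:    dict factor -> long/short beta of the factor (1 or -1).
    Returns a dict with one weight per factor plus the key "Pf".
    """
    # From (34): w_k = -w_Pf * beta_Pf,k / beta_k
    ratios = {k: portfolio_betas[k] / factor_betas[k] for k in portfolio_betas}
    # From (35): w_Pf * (1 - sum_k ratio_k) = 1
    denominator = 1.0 - sum(ratios.values())
    if denominator == 0:
        raise ValueError("system (34)-(35) is singular for these betas")
    w_pf = 1.0 / denominator
    weights = {k: -w_pf * r for k, r in ratios.items()}
    weights["Pf"] = w_pf
    return weights

# Hypothetical expected portfolio betas.
pf_betas = {"Mkt": 0.20, "SMB": -0.10, "HML": 0.05, "UMD": 0.10}
unit = {"Mkt": 1.0, "SMB": 1.0, "HML": 1.0, "UMD": 1.0}
weights = immunizing_weights(pf_betas, unit)
```

The realized exposure for factor k is then the left-hand side of (34) evaluated with realized rather than predicted betas.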

The results of this immunization exercise are presented in Table 8. Again, we consider

only short-term holding periods of one to four weeks as such portfolios are typically at least

reshuffled every month. For a five day holding period, full HETIC betas and high-frequency

betas yield basically the same performance with a sum of squared deviations of about 0.0645

which is significantly lower than the MSE for daily betas. For holding periods of two or four

weeks the full HETIC betas are significantly better, yielding performance improvements of about

7% compared to the daily betas.

4.4 Risk-Return Relation Revisited

The Capital Asset Pricing Model (CAPM) as proposed by Sharpe (1964) and Lintner (1965),

stipulates a linear relation between a stock’s return and its market beta, the systematic risk.

This implies that a portfolio consisting of long positions in high beta stocks and short positions

in low beta stocks should have a significantly positive return as the high beta stocks bear a

higher systematic risk. However, in a recent study, Baker, Bradley, and Wurgler (2010) find

that high beta stocks have substantially underperformed low beta stocks over the last years.

The authors label this as perhaps the greatest anomaly in finance as it challenges the basic

notion of a risk-return tradeoff. The finding can be due either to failure of the CAPM, i.e., the

market beta is not a measure of total systematic risk, or due to poor measurement of the ex

ante systematic risk of stocks.

We now revisit this test of the risk-return relation and compare the historical daily

beta method used in the literature with the reduced HETIC beta method, which is the most

efficient and least biased method for estimating market beta, as shown in Section 4.1. On

each trading day t, we sort the stocks into N portfolios based on their forecasted beta over the

next month. We then compute the return on a high-minus-low beta portfolio as the difference

between the value-weighted return on the Nth portfolio, containing the stocks with the highest

betas, and the value-weighted return on the 1st portfolio, containing the stocks with the lowest

betas. Repeating this procedure for all trading days t we arrive at a return time-series for the

long/short beta portfolio.
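For a single trading day, the sorting step can be sketched as follows. This is an illustrative simplification, not the authors' code: the function name and inputs are hypothetical, and the sketch assumes equal-sized extreme portfolios of `len(stocks) // N` names, with any remainder left in the middle groups.

```python
def high_minus_low_return(forecast_betas, market_caps, realized_returns, n_portfolios):
    """Value-weighted return of the highest-beta group minus the lowest-beta group."""
    # Rank stock indices by forecasted beta, lowest first
    order = sorted(range(len(forecast_betas)), key=lambda i: forecast_betas[i])
    size = len(order) // n_portfolios
    if size == 0:
        raise ValueError("need at least one stock per portfolio")
    low, high = order[:size], order[-size:]

    def value_weighted(indices):
        total_cap = sum(market_caps[i] for i in indices)
        return sum(market_caps[i] * realized_returns[i] for i in indices) / total_cap

    return value_weighted(high) - value_weighted(low)

# Hypothetical four-stock example with N = 2 portfolios
hml = high_minus_low_return(
    forecast_betas=[0.5, 1.5, 1.0, 2.0],
    market_caps=[1.0, 1.0, 3.0, 1.0],
    realized_returns=[0.01, 0.03, 0.02, 0.05],
    n_portfolios=2,
)
```

Repeating the call for every trading day, with the forecasts available on that day, yields the return time series of the long/short beta portfolio described above.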

The results are very encouraging. For N = 10 portfolios, i.e., with about 35 stocks in each

portfolio, the average return on the high-minus-low beta portfolio using daily historical data for

forecasting is −0.36% per year. This result is in line with the previous findings and documents

a negative risk-return relation. However, if we use the reduced HETIC betas, we find a strong

positive risk-return relation with an average return on the long/short beta portfolio of 3.15%

per year. The difference between the two methodologies is highly significant with a t-statistic

of 4.7. If we consider N = 20 portfolios with about 18 stocks in each portfolio, the difference is

even more significant. While for the daily historical data the average return on the beta sorted

portfolio is −3.80%, the reduced HETIC betas yield an average return of 4.74%. The results for

N = 5 portfolios are qualitatively the same. While the long/short portfolio based on daily data

now also yields a positive return, the return based on the reduced HETIC betas is about 1.8%

higher. These results are confirmed by the beta forecasts based on high-frequency historical

information, where the returns on the long/short beta portfolio are 4.41%, 3.03% and 4.29% for

N = 5, 10, 20 portfolios, respectively.

Goyal and Saretto (2009) and Bali and Hovakimian (2009) show that stocks with high volatility

risk premiums outperform stocks with low premiums. Importantly, the high/low beta portfolios

for the historical daily method and the reduced HETIC method have comparable exposures

to volatility risk premiums, e.g., for N = 5 the average value-weighted volatility risk premium

for the high beta portfolio is 1.09 for the historical daily betas and 1.10 for the HETIC method,

and the values are similar for the low beta portfolio. Therefore, the differences in returns between the historical

daily method and the reduced HETIC method cannot be attributed to differences in volatility

risk premium exposures.

Overall, the results based on the reduced HETIC betas yield a strong positive risk-return relation,

whereas the relation based on historical daily betas is typically negative. This statistically

and economically significant difference can be attributed to the more accurate measurement of

future market betas. Though we are restricted to our sample of S&P500 stocks, whereas other

studies use the full stock universe, the fact that the sorting based on daily betas yields the

typically observed negative relationship is reassuring.

5 Robustness Tests

In this section, we describe the various tests that we have undertaken to verify the robustness

of the results from our empirical analysis.

First, we check the beta predictability and economic applications performance in different

sub-periods. We identify some periods naively, by just selecting visually good (with low volatility

and growing market) and bad (with negative market dynamics and high volatility) regimes, and

also apply a more formal Markov Regime Switching model to identify high and low volatility

regimes. The HETIC betas consistently outperform other betas in most of the regimes under

the various regime identification procedures.

Second, instead of taking the S&P500 index and its constituents, we also replicate the study

with the smaller S&P100 index for the period ending in 06/2007. The major inferences from

the S&P500 based analysis remain unchanged.

Third, instead of using the standard market factor, the two Fama and French (1993) factors, and the Carhart

(1997) momentum factor, we create four orthogonal factors by applying a principal component

identification procedure to the time series of the original factors. In real portfolio risk targeting

applications, one would like to be protected against the source of risk that may be driving several

commonly used economic factors, and not just against some economic factor.13 Then we carry

out the portfolio immunization exercise from Section 4.3, using these orthogonal factors. The

HETIC method still delivers by far the best results as compared to the other betas.

Fourth, we select a different proxy for the risk-neutral second moment of returns, and instead

of using the Black-Scholes at-the-money implied volatility (IV), we use the model-free implied

variance (MFIV). As MFIV is a better proxy for the risk-neutral variance, and hence the variance

risk premium, we reformulate the model in its terms. Assuming a constant set of index weights,

viewing the index variance as a function of stochastic stock variances and correlations, and

applying Ito’s Lemma to (6), we can write the index variance risk premium as a function of

individual variance risk premiums and the correlation risk premiums. As a result, the variance

and correlation risk premiums now can be viewed as complements in terms of contributing to

the index variance risk premium, and we model the correlation risk premium along the lines of

Section 2.2, but using the variance instead of volatility risk premiums everywhere. We repeat

the whole analysis in the paper with this model, and the results remain qualitatively unchanged.

Moreover, the new MFIV-based HETIC systematically loses (very slightly) in all applications

to the IV-based model, while still being better than all other methodologies.

13 It is well known that the standard 3+1 factors are far from being orthogonal, and that, when hedging against the market factor, one still remains exposed to part of the source of risk driving the market factor.

6 Conclusion

Linear factor models are by far the most popular models for asset pricing and portfolio management.

It is clear that for a good performance of these models, an accurate prediction of the stock

factor exposures, i.e., the betas, is crucial. In this paper, we show how to utilize forward-looking

information contained in current option prices to construct heterogeneous implied correlations

(HETIC) and for building option-implied betas. We propose two consistent HETIC models: the

full model that delivers us the stock-to-stock correlation matrix and the reduced one that gives

us the stock-to-factor implied correlation. We explicitly model the correlation risk premium as

a function of index and individual volatility risk premiums and identify it in a way that closes

the gap between the implied variances of the market index and its constituents’ portfolio.

This procedure turns out to be crucial for the performance of the betas in empirical applications.

The HETIC betas are significantly the most efficient predictors of realized factor betas,

among several option-implied and historical sample based alternatives. We also show that the

HETIC betas win the horse race in the popular market-neutral pairs trading strategy, where

the systematic (market) risk of a portfolio should be eliminated. Further, we extend the pairs

trading idea to a multiple factor portfolio immunization application, and again, our method

demonstrates a superior performance.

Finally, correct ex ante assessment of the factor betas is certainly important for conditional

asset pricing models. Using the forward-looking market HETIC betas, we confirm the positive

risk-return relation as suggested by the CAPM model and show that using traditional daily

return-based betas may lead to incorrect inferences in this respect.

The method of modeling option-implied correlations and their application to computing

forward-looking betas, developed in this paper, have numerous empirical applications and could

be used to address various research and practical problems.

Appendix A Proofs

Theorem 1. The heterogeneous implied correlation (HETIC) matrix ΓQ is positive definite if

and only if the correlation matrix under objective probability measure ΓP is positive definite

and ρt > 0.

Proof. The non-normalized correlation matrix $\Gamma^{Q,*}$ can be written as:

$$\Gamma^{Q,*} = \Gamma^P + \rho_t \times VRP_{M,t} \times \phi \times \phi',$$

where $\phi = \left(\frac{1}{VRP_{1,t}}, \ldots, \frac{1}{VRP_{N,t}}\right)'$ denotes a vector containing the inverse of the individual stock

volatility risk premiums and $\Gamma^P$ denotes the correlation matrix under the objective measure $P$.

While the correlation matrix $\Gamma^P$ is positive definite by construction, $\phi \times \phi'$ is a rank-one matrix and therefore

positive semidefinite: $c' \times \phi \times \phi' \times c = \left(\sum_{i=1}^N c_i \phi_i\right)^2 \geq 0$ for all $c \in \mathbb{R}^N$. Because the

scalar multiplication of a positive semidefinite matrix with a positive scalar is positive semidefinite, we

get that $\rho_t \times VRP_{M,t} \times \phi \times \phi'$ is positive semidefinite if $\rho_t > 0$ (with $VRP_{M,t}$ positive). Moreover, $\Gamma^{Q,*}$, as the

sum of a positive definite and a positive semidefinite matrix, is also positive definite. In case $\rho_t = 0$, $\Gamma^{Q,*}$ will be

positive definite because it is the sum of a positive definite matrix and zero. The normalized

correlation matrix under the $Q$ measure can be written as:

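$$\Gamma^Q = R^{-1} \times \Gamma^{Q,*} \times R^{-1},$$

where

$$R = \begin{pmatrix} \sqrt{\Gamma^{Q,*}_{1,1}} & 0 & 0 & \cdots & 0 \\ 0 & \sqrt{\Gamma^{Q,*}_{2,2}} & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & \sqrt{\Gamma^{Q,*}_{N,N}} \end{pmatrix}.$$

Given that $\sqrt{\Gamma^{Q,*}_{i,i}} > 0 \;\; \forall i$, the matrix $R$ is positive definite, and therefore, so is $R^{-1}$. Moreover,

because the product $M \times N \times M$ of positive definite matrices $M$ and $N$ is again positive

definite, we conclude that $\Gamma^Q$ is positive definite.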
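The construction in the proof can be checked numerically on a small example. The sketch below is illustrative only: the function names, the 3×3 matrix, and the premium values are assumptions, and positive definiteness is tested by attempting a Cholesky factorization rather than by the argument in the proof.

```python
import math

def hetic_matrix(gamma_p, vrp_stocks, vrp_market, rho):
    """Build Gamma^{Q,*} = Gamma^P + rho * VRP_M * phi phi' and normalize it to unit diagonal."""
    n = len(gamma_p)
    phi = [1.0 / v for v in vrp_stocks]  # inverse individual volatility risk premiums
    raw = [[gamma_p[i][j] + rho * vrp_market * phi[i] * phi[j] for j in range(n)]
           for i in range(n)]
    scale = [math.sqrt(raw[i][i]) for i in range(n)]  # diagonal of R
    return [[raw[i][j] / (scale[i] * scale[j]) for j in range(n)] for i in range(n)]

def is_positive_definite(matrix):
    """Attempt a Cholesky factorization; a non-positive pivot means the matrix is not positive definite."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = matrix[i][j] - sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return False
                lower[i][j] = math.sqrt(s)
            else:
                lower[i][j] = s / lower[j][j]
    return True

# Hypothetical inputs, for illustration only
gamma_p = [[1.0, 0.3, 0.2], [0.3, 1.0, 0.4], [0.2, 0.4, 1.0]]
gamma_q = hetic_matrix(gamma_p, vrp_stocks=[1.10, 1.20, 1.05], vrp_market=1.20, rho=0.1)
gamma_q_is_pd = is_positive_definite(gamma_q)
```

With a positive definite input and a positive ρ, the normalized matrix has a unit diagonal and passes the Cholesky check, in line with the theorem.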

Appendix B Construction of Risk-Neutral Moments

The formulas in this appendix closely follow the exposition in Bakshi, Kapadia, and Madan

(2003) and are given for completeness.

Let the τ -period return be given by the log price relative:

R(t, τ) ≡ ln[S(t+ τ)/S(t)].

Define a variance, a cubic, and a quartic contract with the following payoffs:

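$$H[S] = \begin{cases} R(t,\tau)^2, & \text{volatility contract;} \\ R(t,\tau)^3, & \text{cubic contract;} \\ R(t,\tau)^4, & \text{quartic contract.} \end{cases}$$

Let $V(t,\tau) \equiv E^Q_t\{e^{-r\tau}R(t,\tau)^2\}$, $W(t,\tau) \equiv E^Q_t\{e^{-r\tau}R(t,\tau)^3\}$, and $X(t,\tau) \equiv E^Q_t\{e^{-r\tau}R(t,\tau)^4\}$

represent the fair value of the respective payoff.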

The price of the variance contract, or the model-free implied variance (MFIV), is given by

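$$V(t,\tau) = \int_{S(t)}^{\infty} \frac{2\left(1 - \ln\left(\frac{K}{S(t)}\right)\right)}{K^2} \cdot C(t,\tau;K)\,dK + \int_{0}^{S(t)} \frac{2\left(1 - \ln\left(\frac{K}{S(t)}\right)\right)}{K^2} \cdot P(t,\tau;K)\,dK, \qquad (36)$$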

the price of the cubic contract is

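$$W(t,\tau) = \int_{S(t)}^{\infty} \frac{6\ln\left(\frac{K}{S(t)}\right) - 3\left(\ln\left(\frac{K}{S(t)}\right)\right)^2}{K^2} \cdot C(t,\tau;K)\,dK - \int_{0}^{S(t)} \frac{6\ln\left(\frac{S(t)}{K}\right) + 3\left(\ln\left(\frac{S(t)}{K}\right)\right)^2}{K^2} \cdot P(t,\tau;K)\,dK, \qquad (37)$$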

and the price of the quartic contract is

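$$X(t,\tau) = \int_{S(t)}^{\infty} \frac{12\left(\ln\left[\frac{K}{S(t)}\right]\right)^2 - 4\left(\ln\left[\frac{K}{S(t)}\right]\right)^3}{K^2} \cdot C(t,\tau;K)\,dK + \int_{0}^{S(t)} \frac{12\left(\ln\left[\frac{S(t)}{K}\right]\right)^2 + 4\left(\ln\left[\frac{S(t)}{K}\right]\right)^3}{K^2} \cdot P(t,\tau;K)\,dK. \qquad (38)$$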

Define

$$\mu(t,\tau) = e^{r\tau} - 1 - \frac{e^{r\tau}}{2}V(t,\tau) - \frac{e^{r\tau}}{6}W(t,\tau) - \frac{e^{r\tau}}{24}X(t,\tau). \qquad (39)$$

Then we can calculate τ -period model-free implied skewness (MFIS) as:

$$MFIS(t,\tau) = \frac{e^{r\tau}W(t,\tau) - 3\mu(t,\tau)e^{r\tau}V(t,\tau) + 2(\mu(t,\tau))^3}{\left(e^{r\tau}V(t,\tau) - (\mu(t,\tau))^2\right)^{3/2}}. \qquad (40)$$
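Once the three contract prices are available, (39) and (40) reduce to a few lines of arithmetic. The sketch below assumes the prices V, W, and X have already been computed; the function name and the numerical inputs are hypothetical and serve only to illustrate the formulas.

```python
import math

def model_free_skewness(v, w, x, r, tau):
    """Return (mu, MFIS) from the prices of the volatility, cubic, and quartic contracts."""
    growth = math.exp(r * tau)
    # Equation (39): approximate risk-neutral mean of the log return
    mu = growth - 1.0 - growth / 2.0 * v - growth / 6.0 * w - growth / 24.0 * x
    # Denominator of (40): risk-neutral variance of the log return
    variance = growth * v - mu ** 2
    if variance <= 0.0:
        raise ValueError("implied variance must be positive")
    # Equation (40): third central moment scaled by variance^(3/2)
    skew = (growth * w - 3.0 * mu * growth * v + 2.0 * mu ** 3) / variance ** 1.5
    return mu, skew

# Hypothetical contract prices, for illustration only
mu, skew = model_free_skewness(v=0.01, w=-0.0005, x=0.0004, r=0.0, tau=1.0)
```

A negative price of the cubic contract translates into negative implied skewness, which is the typical sign reported for both the index and its components in Table 3.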

To calculate the integrals in (36), (37), and (38) in principle, we need a continuum of option

prices. We discretize the respective integrals and approximate them from the available options.

We have 13 implied volatilities for OTM options from the surface file at our disposal for each

maturity. Using cubic splines, we interpolate these implied volatilities inside the available moneyness

range and extrapolate using the last known (boundary for each side) value to fill in 1,001

grid points in the moneyness range from 1/3 to 3. Then we calculate the option prices from

the interpolated volatilities, and we use these prices to compute the model-free variance (MFIV)

and risk-neutral skewness (MFIS) as in (36) and (40).
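The discretization described above can be sketched as follows for the variance contract in (36). This is a simplified illustration under stated assumptions, not the authors' implementation: it substitutes piecewise-linear interpolation for the cubic splines to stay within the standard library, prices options with the Black-Scholes formula without dividends, and integrates with the trapezoidal rule. All function names and the flat 20% surface are hypothetical.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_price(spot, strike, tau, r, sigma, is_call):
    """Black-Scholes price of a European call or put (no dividends)."""
    vol = sigma * math.sqrt(tau)
    d1 = (math.log(spot / strike) + (r + 0.5 * sigma ** 2) * tau) / vol
    d2 = d1 - vol
    discounted_strike = strike * math.exp(-r * tau)
    if is_call:
        return spot * norm_cdf(d1) - discounted_strike * norm_cdf(d2)
    return discounted_strike * norm_cdf(-d2) - spot * norm_cdf(-d1)

def interpolate_iv(m, knots, ivs):
    """Linear interpolation inside the knot range, flat extrapolation outside."""
    if m <= knots[0]:
        return ivs[0]
    if m >= knots[-1]:
        return ivs[-1]
    for i in range(1, len(knots)):
        if m <= knots[i]:
            weight = (m - knots[i - 1]) / (knots[i] - knots[i - 1])
            return (1.0 - weight) * ivs[i - 1] + weight * ivs[i]

def model_free_implied_variance(spot, tau, r, knots, ivs,
                                n_grid=1001, lo=1.0 / 3.0, hi=3.0):
    """Trapezoidal approximation of (36) on an equally spaced moneyness grid."""
    step = (hi - lo) / (n_grid - 1)
    total, previous = 0.0, None
    for j in range(n_grid):
        moneyness = lo + j * step
        strike = moneyness * spot
        sigma = interpolate_iv(moneyness, knots, ivs)
        # OTM options only: calls at or above the spot, puts below it
        price = bs_price(spot, strike, tau, r, sigma, is_call=strike >= spot)
        integrand = 2.0 * (1.0 - math.log(strike / spot)) / strike ** 2 * price
        if previous is not None:
            total += 0.5 * (previous + integrand) * step * spot
        previous = integrand
    return total

# Hypothetical flat surface: 13 moneyness knots, all with 20% implied volatility
knots = [0.70 + 0.05 * i for i in range(13)]
ivs = [0.20] * 13
variance_price = model_free_implied_variance(100.0, 1.0 / 12.0, 0.0, knots, ivs)
```

With a flat surface and a zero interest rate, the result can be compared with the Black-Scholes second moment of the log return, σ²τ + (σ²τ/2)², which gives a quick sanity check on the grid and the integration scheme.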

Appendix C Factor-Mimicking Portfolios

For each factor $F$, we run on each day the mimicking regression of the factor return $r^F_t$ on

a constant and the excess returns $X_t$ of the available index constituents, using the historical

returns from the last 1008 trading days (or four years), i.e.,

$$r^F_t = a_t + b_t' X_t + u_t. \qquad (41)$$

The coefficient $b_t'$ has the interpretation of weights in a zero-cost portfolio. Using these weights

and the next day excess returns of the available stocks, we compute the next day return of the

factor mimicking portfolio as $r^{Mim,F}_{t+1} = b_t' X_{t+1}$. This procedure gives us the time-series of daily

mimicking portfolio returns for each factor F .
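A single step of this procedure can be sketched with ordinary least squares. The code below is a toy illustration under stated assumptions: it solves the normal equations by Gaussian elimination in pure Python, uses two hypothetical stocks and six days instead of the index constituents and 1008 days, and all names are invented for the example.

```python
def solve_linear(a, y):
    """Solve a x = y by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [y[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(m[row][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for c in range(col, n + 1):
                m[row][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def mimicking_weights(factor_returns, stock_excess_returns):
    """OLS of the factor return on a constant and stock excess returns, as in (41)."""
    design = [[1.0] + list(row) for row in stock_excess_returns]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * f for d, f in zip(design, factor_returns)) for i in range(k)]
    coef = solve_linear(xtx, xty)
    return coef[0], coef[1:]  # intercept a_t and weights b_t

def mimicking_return(weights, next_day_excess_returns):
    """Next-day return of the factor mimicking portfolio, b_t' X_{t+1}."""
    return sum(b * x for b, x in zip(weights, next_day_excess_returns))

# Hypothetical data: the factor is an exact linear function of two stock returns
x1 = [0.010, -0.010, 0.020, 0.000, 0.015, -0.005]
x2 = [0.020, 0.005, -0.010, 0.010, 0.000, -0.020]
history = list(zip(x1, x2))
factor = [0.001 + 0.5 * u - 0.3 * v for u, v in history]
intercept, b = mimicking_weights(factor, history)
next_return = mimicking_return(b, [0.010, 0.020])
```

In the paper's setting the regression is re-estimated every day on a rolling window, so the weights and hence the mimicking portfolio change over time.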

In Table 5 we present summary statistics for the daily returns of the factors and the factor

mimicking portfolios as well as summary statistics for the mimicking procedure.

References

Andersen, T. G., T. Bollerslev, F. X. Diebold, and H. Ebens, 2001, “The Distribution of Stock

Return Volatility,” Journal of Financial Economics, 61(1), 43–76.

Andersen, T. G., T. Bollerslev, F. X. Diebold, and J. Wu, 2006, Advances in Econometrics:

Econometric Analysis of Economic and Financial Time Series, vol. B, chap. Realized Beta:

Persistence and Predictability, pp. 1–40.

Baker, M. P., B. Bradley, and J. A. Wurgler, 2010, “Benchmarks as Limits to Arbitrage:

Understanding the Low Volatility Anomaly,” SSRN eLibrary.

Bakshi, G. S., N. Kapadia, and D. B. Madan, 2003, “Stock Return Characteristics, Skew Laws,

and the Differential Pricing of Individual Equity Options,” The Review of Financial Studies,

16, 101–143.

Bali, T. G., and A. Hovakimian, 2009, “Volatility Spreads and Expected Stock Returns,” Management

Science, 55(11), 1797–1812.

Bandi, F. M., and J. R. Russell, 2005, “Realized Covariation, Realized Beta, and Microstructure

Noise,” Working paper series, University of Chicago.

Barndorff-Nielsen, O. E., P. R. Hansen, A. Lunde, and N. Shephard, 2009, “Realised Kernels in

Practice: Trades and Quotes,” Econometrics Journal.

Barndorff-Nielsen, O. E., and N. Shephard, 2002, “Econometric Analysis of Realized Volatility

and Its Use in Estimating Stochastic Volatility Models,” Journal of the Royal Statistical

Society Series B, 64(2), 253–280.

Black, F., and M. Scholes, 1973, “The Pricing of Options and Corporate Liabilities,” Journal of

Political Economy, 81(3), 637–654.

Blair, B. J., S.-H. Poon, and S. J. Taylor, 2001, “Forecasting S&P 100 Volatility: the Incremental

Information Content of Implied Volatilities and High-Frequency Index Returns,” Journal

of Econometrics, 105, 5–26.

Bollerslev, T., G. Tauchen, and H. Zhou, 2009, “Expected Stock Returns and Variance Risk

Premia,” Review of Financial Studies.

Breeden, D. T., M. R. Gibbons, and R. H. Litzenberger, 1989, “Empirical Tests of the

Consumption-Oriented CAPM,” Journal of Finance, 44, 231–262.

Breeden, D. T., and R. H. Litzenberger, 1978, “Prices of State-Contingent Claims Implicit in

Option Prices,” Journal of Business, 51(4), 621–652.

Breen, W., L. R. Glosten, and R. Jagannathan, 1989, “Economic Significance of Predictable

Variations in Stock Index Returns,” Journal of Finance, 44(5), 1177–1189.

Britten-Jones, M., and A. Neuberger, 2000, “Option Prices, Implied Price Processes, and

Stochastic Volatility,” Journal of Finance, 55, 839–866.

Brown, S. J., 1990, “Estimating Volatility,” in Financial Options: From Theory to Practice, ed.

by S. Figlewski, W. L. Silber, and M. G. Subrahmanyam. Business One-Irwin, Homewood,

IL, pp. 516–537.

Carhart, M. M., 1997, “On Persistence in Mutual Fund Performance,” Journal of Finance, 52(1),

57–82.

Carr, P., and R. Lee, 2009, “Volatility Derivatives,” Annual Review of Financial Economics, 1,

319–339.

Carr, P., and D. B. Madan, 1998, “Towards a Theory of Volatility Trading,” in Robert A. Jarrow,

ed.: Volatility: New Estimation Techniques for Pricing Derivatives (RISK Publications,

London).

Carr, P., and L. Wu, 2006, “A Tale of Two Indices,” Journal of Derivatives, pp. 13–29.

Carr, P., and L. Wu, 2009, “Variance Risk Premiums,” Review of Financial Studies, 22(3), 1311–1341.

Chang, B. Y., P. Christoffersen, K. Jacobs, and G. Vainberg, 2009, “Option-Implied Measures

of Equity Risk,” SSRN eLibrary.

Colacito, R., R. Engle, and E. Ghysels, 2009, “A Component Model of Dynamic Correlations,”

Working paper.

Corsi, F., G. Zumbach, U. Müller, and M. Dacorogna, 2001, “Consistent High-Precision Volatility

from High-Frequency Data,” Economic Notes, 30, 183–204.

Driessen, J., P. Maenhout, and G. Vilkov, 2009, “The Price of Correlation Risk: Evidence from

Equity Options,” Journal of Finance, 64(3), 1375–1404.

Dumas, B., 1995, “The Meaning of the Implicit Volatility Function in Case of Stochastic Volatil-

ity,” HEC Working Paper.

Engle, R. F., 2002, “Dynamic Conditional Correlation: A Simple Class of Multivariate Generalized

Autoregressive Conditional Heteroskedasticity Models,” Journal of Business and

Economic Statistics, 20(3), 339–350.

Engle, R. F., E. Ghysels, and B. Sohn, 2008, “On the Economic Sources of Stock Market

Volatility,” Working paper.

Epps, T. W., 1979, “Comovements in Stock Prices in the Very Short Run,” Journal of the

American Statistical Association, 74(366), 291–298.

Fama, E. F., and K. R. French, 1993, “Common Risk Factors in the Returns on Stocks and

Bonds,” Journal of Financial Economics, 33(1), 3–56.

French, D. W., J. C. Groth, and J. W. Kolari, 1983, “Current Investor Expectations and Better

Betas,” Journal of Portfolio Management, 10, 12–18.

Ghysels, E., and E. Jacquier, 2006, “Market Beta Dynamics and Portfolio Efficiency,” Working

Paper, University of North Carolina.

Goyal, A., and A. Saretto, 2009, “Cross-Section of Option Returns and Volatility,” Journal of

Financial Economics, 94(2), 310–326.

Heston, S. L., 1993, “A Closed-Form Solution for Options with Stochastic Volatility with Applications

to Bond and Currency Options,” Review of Financial Studies, 6(2),

327–343.

Jacod, J., 1994, “Limit of Random Measures Associated with the Increments of a Brownian

Semimartingale,” Preprint University Paris VI.

Jacod, J., and P. Protter, 1998, “Asymptotic Error Distributions for the Euler Method for

Stochastic Differential Equations,” Annals of Probability, 26(1), 267–307.

Jiang, G. J., and Y. S. Tian, 2005, “The Model-Free Implied Volatility and Its Information

Content,” Review of Financial Studies, 18, 1305–1342.

Karatzas, I., and S. E. Shreve, 1991, Brownian Motion and Stochastic Calculus. Springer.

Keim, D. B., and R. F. Stambaugh, 1986, “Predicting Returns in the Stock and Bond Markets,”

Journal of Financial Economics, 17(2), 357–390.

Lamont, O. A., 2001, “Economic Tracking Portfolios,” Journal of Econometrics, 105, 161–184.

Lintner, J., 1965, “The Valuation of Risky Assets and the Selection of Risky Investments in

Stock Portfolios and Capital Budgets,” Review of Economics and Statistics, 47, 13–37.

Petersen, M. A., 2009, “Estimating Standard Errors in Finance Panel Data Sets: Comparing

Approaches,” Review of Financial Studies, 22(1), 435–480.

Schöbel, R., and J. Zhu, 1999, “Stochastic Volatility with an Ornstein-Uhlenbeck Process: An

Extension,” European Finance Review, 3(1), 23–46.

Sharpe, W. F., 1964, “Capital Asset Prices: A Theory of Market Equilibrium under Conditions

of Risk,” Journal of Finance, 19(3), 425–442.

Siegel, A. F., 1995, “Measuring Systematic Risk Using Implicit Beta,” Management Science, 41,

124–128.

Todorov, V., 2009, “Variance Risk-Premium Dynamics: The Role of Jumps,” Review of Financial

Studies.

Vanden, J. M., 2008, “Information Quality and Options,” The Review of Financial Studies,

21(6), 2635–2676.

Vidyamurthy, G., 2004, Pairs Trading: Quantitative Methods and Analysis. Wiley Finance.

Wang, K. Q., 2003, “Asset Pricing with Conditioning Information: A New Test,” Journal of

Finance, 58(1), 161–196.

White, H., 1980, “A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct

Test for Heteroskedasticity,” Econometrica, 48(4), 817–838.

Zhang, L., 2006, “Estimating Covariation: Epps Effect, Microstructure Noise,” Working Paper,

University of Illinois at Chicago.

Zhang, L., P. A. Mykland, and Y. Aït-Sahalia, 2005, “A Tale of Two Time Scales: Determining

Integrated Volatility with Noisy High-Frequency Data,” Journal of the American Statistical

Association, 100(472), 1394–1411.

Zhou, B., 1996, “High-Frequency Data and Volatility in Foreign-Exchange Rates,” Journal of

Business and Economic Statistics, 14(1), 45–52.

Figure 1: Mean High-Frequency Market Betas

The figure shows the time series of the index weighted average market betas of all S&P500 stocks (with S&P500 as a market proxy), computed over a rolling 21 trading days window of high-frequency data. The high-frequency estimators are computed from (30) with the use of subsampling and averaging (Zhang, Mykland, and Aït-Sahalia (2005)), and for four different Slowscales (sampling frequencies). For this picture, we smooth the series using a 60-day moving average. The downward jump between the points Jan01 and Jan02 is due to the September 11, 2001 events.

[Figure: Weighted Market Beta for various Slowscales (30, 60, 120, and 180 minutes). Horizontal axis: Jan97 to Dec09; vertical axis: 0.65 to 1.05.]

Table 1: Market Beta Summary Statistics

This table provides summary statistics for the market betas based on different beta methodologies.

The sample period spans from January 1996 to October 2009. We compute on each trading day

the weighted beta using the index weights for stocks in the index. The statistic Mean Weighted

then reports the time-series average of these weighted betas. For the computation of the remaining

statistics we first pool the observations for all stocks in the index as well as all trading days and

then compute the standard deviation, min, median, and max. We report the summary statistics

separately for the two different maturities (one and six months) in Panels (a) and (b).

(a) 1 month maturity

                            Hist. HF Hist. Daily   DCC-MIDAS       HETIC  red. HETIC        CCJV      FGK HF         FGK
# Observations               1814053     1768530     1613420     1431399     1431399     1325650     1521995     1501198
Mean Weighted                 0.9603      0.9992      0.9468      1.0000      1.0000      1.3386      0.8702      0.9088
Std. Dev.                     0.5630      0.5126      0.4553      0.4514      0.4747      0.6659      0.4406      0.4723
min                          -3.0887     -1.3570     -0.8746     -3.0341     -1.2060      0.0127     -1.5112     -1.8525
median                        0.8466      0.8985      0.8702      0.9471      0.9237      1.3605      0.7891      0.8539
max                           7.1020      6.4488      4.1145     10.9851     10.8348     54.6351      9.5030     11.3612

(b) 6 month maturity

                            Hist. HF Hist. Daily   DCC-MIDAS       HETIC  red. HETIC        CCJV      FGK HF         FGK
# Observations               1791173     1768530     1613420     1370644     1370644     1410228     1521995     1501198
Mean Weighted                 0.9576      0.9992      0.9468      1.0000      1.0000      1.1855      0.8175      0.8540
Std. Dev.                     0.4693      0.5126      0.4553      0.4008      0.4428      0.4434      0.3954      0.4226
min                          -0.8396     -1.3570     -0.8746     -0.9610     -0.8914      0.0169     -1.0452     -1.2010
median                        0.8422      0.8985      0.8702      0.9568      0.9263      1.2073      0.7363      0.8006
max                           4.3435      6.4488      4.1145      6.2684      5.6580     15.0392      8.0527      9.1496

Table 2: Market Correlation Summary Statistics

This table provides summary statistics for the stock-to-market correlations that are used by the different

beta methodologies. Artificial CCJV denotes the correlation proxy used by Chang, Christoffersen,

Jacobs, and Vainberg (2009), i.e., the ratio of index to stock skew. The sample period spans

from January 1996 to October 2009. We first pool the observations for all stocks in the index as

well as all trading days and then compute the mean, standard deviation, minimum, and maximum.

We report the results separately for the two different option maturities.

                            Hist. HF Hist. Daily   DCC-MIDAS       HETIC  red. HETIC  artificial CCJV

1 month maturity

# Observations               1814053     1768530     1613429     1431285     1431285          1325650
Mean                          0.4484      0.4761      0.4812      0.5456      0.5390           0.7902
Std. Dev.                     0.1969      0.1869      0.1883      0.1919      0.1970           0.2176
Min                          -0.6279     -0.4168     -0.5952     -0.9196     -0.6129           0.0060
Max                           0.9276      0.9372      0.9504      0.9887      1.0000          17.4249

6 month maturity

# Observations               1791173     1768530     1613429     1370644     1370644          1410228
Mean                          0.4458      0.4761      0.4812      0.5908      0.5717           0.7390
Std. Dev.                     0.1632      0.1869      0.1883      0.1324      0.1612           0.1764
Min                          -0.4216     -0.4168     -0.5952     -0.4640     -0.3745           0.0106
Max                           0.8654      0.9372      0.9504      0.9479      1.0000           5.9293

Table 3: Realized and Implied Measures

This table provides summary statistics for the ATM implied volatilities, historical volatility risk

premiums, model-free implied variances, model-free implied skewness, as well as historical variance

risk premiums. We report statistics for the S&P500 index components and the S&P500 index

separately for the sample period from January 1996 to October 2009. For the index components

statistics we pool the observations for all stocks and trading days and then compute the reported

statistics. Volatilities as well as variances are annualized and the computations are described in

section 3.3. The summary statistics are reported for the two different option maturities separately.

                                        SP500 components     SP500 index

1 month maturity

# Observations                                   1421546            3335
Mean ATM implied volatility                       0.3696          0.2024
Mean hist. volatility risk premium                1.1421          1.2242
Mean model-free implied variance              (0.4340)^2      (0.2303)^2
Mean hist. variance risk premium              (1.2517)^2      (1.3181)^2
Mean model-free skewness                         -0.3031         -0.7073

6 month maturity

# Observations                                   1362976            3227
Mean ATM implied volatility                       0.3499          0.2023
Mean hist. volatility risk premium                1.0116          1.1717
Mean model-free implied variance              (0.4063)^2      (0.2272)^2
Mean hist. variance risk premium              (1.0774)^2      (1.2913)^2
Mean model-free skewness                         -0.3215         -0.7634

Table 4: Market Beta: Regressions

This table provides the summary of the market beta predictability over the sample period from

January 1996 to October 2009. For each stock and trading day we first compute the expected as well

as realized beta, and the corresponding Mean Squared Error (MSE) as well as the time-series mean

of the MSE. We then report the cross-sectional average of the MSE together with a t-statistic (based

on White (1980) std. errors) for the difference to HETIC. In addition we report the percentage of

stocks for which a specific beta methodology yields the lowest MSE. Moreover, we run for each stock

the regression of the daily time-series of the realized market beta on the predicted market beta and

a constant for several methods of beta construction: $\beta^{Realized}_{i,t} = \alpha_i + \lambda_i \beta^{Predicted}_{i,t} + \varepsilon_{i,t} \;\; \forall t$. We report

the cross-sectional average of the adjusted R2 together with a t-statistic (based on White (1980) std.

errors) for the difference in R2 compared to the HETIC methodology, and the percentage of stocks

for which a specific methodology yields the highest R2. The statistics are presented separately for

two different maturities in Panels (a) and (b).



(a)

                            Hist. HF Hist. Daily   DCC-MIDAS       HETIC  red. HETIC        CCJV      FGK HF         FGK
Avg. # Observations             2202        2202        2202        2202        2202        2202        2202        2202

Avg. MSE                      0.1415      0.1502      0.1530      0.1274      0.1183      0.6922      0.1307      0.1481
t-stat diff. to HETIC           8.10        9.77       10.12           -      -10.02       28.47        2.14       13.04
percent. of lowest MSE          9.47        6.11        4.63       15.16       49.47        0.00       13.89        1.26

Avg. R2                       0.3178      0.2644      0.2415      0.3191      0.3336      0.0973      0.2958      0.2569
t-stat diff. to HETIC          -0.39      -13.74      -18.01           -       13.39      -30.60      -12.38      -22.19
percent. of highest R2         33.26        6.53        4.63       12.63       34.32        1.89        5.47        1.26

(b)

                            Hist. HF Hist. Daily   DCC-MIDAS       HETIC  red. HETIC        CCJV      FGK HF         FGK
Avg. # Observations             2248        2248        2248        2248        2248        2248        2248        2248

Avg. MSE                      0.0714      0.1061      0.1062      0.0746      0.0685      0.2885      0.0975      0.1023
t-stat diff. to HETIC          -1.95       14.74       14.04           -       -6.71       27.25       10.75       14.74
percent. of lowest MSE         33.68        0.84        1.88       30.54       29.50        0.00        2.93        0.63

Avg. R2                       0.3341      0.2602      0.2407      0.3517      0.3513      0.1610      0.2958      0.2471
t-stat diff. to HETIC          -3.65      -15.83      -17.64           -       -0.16      -22.76      -12.51      -18.79
percent. of highest R2         19.87        5.65        5.44       34.52       15.06        6.90        7.95        4.60

Table 5: Factor Mimicking Portfolios: Summary Statistics

This table provides the summary statistics for the factor mimicking portfolios of the SMB, HML, and

UMD factors. On each trading day we run the mimicking regression of the daily factor return $r^k_t$ on a

constant as well as the daily excess returns $X_t$ of the available index components: $r^k_t = a_t + b_t' X_t + u_t$,

using historical return data from the last 1008 trading days. Thus, the sample period begins in

January 2000 and ends in October 2009. Using the coefficients $b_t$ of these regressions, which have

the interpretation of weights, we then compute the out-of-sample return of the mimicking portfolio

$r^{Mim,k}_{t+1}$ for the following trading day as $b_t' X_{t+1}$. That way we obtain a time-series of mimicking factor

portfolio returns and a time-series of regression information. We report the number of observations

and the standard deviation of daily returns for both the factor mimicking portfolios and the

factors. We also present the time-series mean, min, and max of the $R^2$ of the mimicking

regressions. Finally, we report the correlation between the factor mimicking portfolio returns

and the factor returns themselves.

                                         SMB    mim. SMB         HML    mim. HML         UMD    mim. UMD
# Observations                          2769        2769        2769        2769        2769        2769
Std. Dev. of daily returns            0.0065      0.0063      0.0073      0.0073      0.0117      0.0110

                                         SMB         HML         UMD
Mean R2 of mimick. regression         0.7820      0.8842      0.7459
Min R2 of mimick. regression          0.7281      0.8171      0.5567
Max R2 of mimick. regression          0.8289      0.9344      0.9272
Correlation FF factor + mim. pf.      0.6114      0.8156      0.6310

Table 6: Factor Betas: Regressions

This table provides the summary of the factor (SMB, HML, UMD) beta predictability over the

sample period from January 2000 to October 2009. For each stock and trading day we first compute

the expected as well as realized factor beta, and the corresponding Mean Squared Error (MSE)

as well as the time-series mean of the MSE. We then report the cross-sectional average of the

MSE together with a t-statistic (based on White (1980) std. errors) for the difference to HETIC.

In addition we report the percentage of stocks for which a specific beta methodology yields the

lowest MSE. Moreover, we run for each stock the regression of the daily time-series of the realized

factor beta on the predicted factor beta and a constant for several methods of beta construction:

$\beta^{(k),Realized}_{i,t} = \alpha_i + \lambda_i \beta^{(k),Predicted}_{i,t} + \varepsilon^{(k)}_{i,t} \;\; \forall t$, $k = SMB, HML, UMD$. We report the cross-sectional

average of the adjusted R2 together with a t-statistic (based on White (1980) std. errors) for the

difference in R2 compared to the HETIC methodology, and the percentage of stocks for which a

specific methodology yields the highest R2. The statistics are presented separately for two different

maturities in Panels (a) and (b).


SMB HML UMDHist. HF Hist. Daily HETIC Hist. HF Hist. Daily HETIC Hist. HF Hist. Daily HETIC

Avg. # Observations 2217 2217 2217 2217 2217 2217 2217 2217 2217

Avg. MSE 0.7413 0.8780 0.6809 0.9034 1.0646 0.7672 0.3760 0.5050 0.3168t-stat diff. to HETIC 8.04 14.76 - 16.55 12.99 - 15.43 17.75 -Percent. of lowest MSE 28.54 8.62 62.83 9.86 5.75 84.39 15.40 2.05 82.55

Avg. R2 0.3728 0.3833 0.4466 0.3684 0.3341 0.4225 0.4692 0.4890 0.5417t-stat diff. to HETIC -25.97 -21.92 - -18.41 -31.16 - -26.88 -19.82 -Percent. of highest R2 9.45 10.47 80.08 16.63 3.08 80.29 7.19 16.84 75.98


Panel (b)

                                             SMB                                 HML                                 UMD
                            Hist. HF Hist. Daily       HETIC    Hist. HF Hist. Daily       HETIC    Hist. HF Hist. Daily       HETIC

Avg. # Observations             2157        2157        2157        2157        2157        2157        2157        2157        2157

Avg. MSE                      0.4743      0.7982      0.6048      0.6415      0.9355      0.7847      0.3274      0.5429      0.3755
t-stat diff. to HETIC         -14.34       11.04           -      -21.06        7.51           -       -9.96       14.21           -
Percent. of lowest MSE         89.31        1.05        9.64       84.28        9.43        6.29       77.99        1.68       20.34

Avg. R2                       0.4106      0.3861      0.4723      0.3302      0.3106      0.3551      0.4447      0.4591      0.4635
t-stat diff. to HETIC         -20.34      -17.45           -       -8.49      -10.35           -       -4.49       -0.87           -
Percent. of highest R2          8.60       14.47       76.94       19.92       25.37       54.72       19.29       41.09       39.62
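
The per-stock statistics behind Table 6 can be illustrated with a minimal Python sketch. It assumes that the predicted and realized factor betas of one stock are already available as equal-length daily series. The function name `beta_forecast_stats` is hypothetical, and the sketch covers only the per-stock MSE and adjusted R2, not the cross-sectional averaging or the White (1980) t-statistics.

```python
import numpy as np


def beta_forecast_stats(predicted, realized):
    """Return (MSE, adjusted R^2) for one stock's daily beta series.

    Hypothetical helper: both inputs are 1-D sequences of equal length.
    """
    predicted = np.asarray(predicted, dtype=float)
    realized = np.asarray(realized, dtype=float)

    # Time-series mean of the squared forecast errors.
    mse = float(np.mean((realized - predicted) ** 2))

    # OLS of the realized beta on a constant and the predicted beta.
    X = np.column_stack([np.ones_like(predicted), predicted])
    coef, *_ = np.linalg.lstsq(X, realized, rcond=None)
    resid = realized - X @ coef

    ss_res = float(resid @ resid)
    ss_tot = float(np.sum((realized - realized.mean()) ** 2))
    n, p = len(realized), 1  # one regressor besides the constant
    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    return mse, adj_r2


# Illustrative numbers only; these are not taken from the paper's data.
mse, adj_r2 = beta_forecast_stats([0.9, 1.1, 1.0, 1.3], [1.0, 1.0, 1.2, 1.4])
```

Note that a method can score well on one measure and poorly on the other: the regression absorbs any constant bias or scaling error in the predicted beta, whereas the MSE penalizes both.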


Table 7: Market-Neutral Pairs Trading

This table provides the results of the market-neutral pairs trading application. At the end of each

month in the sample period from January 1996 to October 2009, and for each available pair, we

form a portfolio consisting of a long position in one stock of the pair, a short position in the other

stock, and a long/short position in the market such that the expected market beta of the portfolio

is zero, i.e., a market-neutral portfolio. Then we compute the realized market beta over the holding

period (5, 10, or 21 trading days) for each portfolio and month as well as the Mean Squared Error

(MSE), i.e., the squared deviation from the zero beta expectation, together with a t-statistic for the

difference to HETIC based on standard errors clustered by time and pair following Petersen (2009).

The results are reported separately for the three different holding periods.


5 trading days holding period

# Observations            6930194   6930194   6930194   6930194   6930194   6930194   6930194   6930194
Avg. MSE                   0.0574    0.0364    0.0409    0.0322    0.0315    0.0693    0.0418    0.0402
t-stat diff. to HETIC       11.42      8.70     11.47         -     -3.45     16.88      9.63     10.22
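
The construction of a single market-neutral pair portfolio can be sketched as follows. This is a minimal illustration under stated assumptions: unit positions in the two stocks (the caption does not specify the sizing), a market beta of exactly one for the market position, and hypothetical function names.

```python
def market_hedge(beta_long_pred, beta_short_pred):
    """Market position that sets the expected portfolio beta to zero.

    Assumes +1 in the long stock and -1 in the short stock.
    """
    return -(beta_long_pred - beta_short_pred)


def realized_portfolio_beta(beta_long_real, beta_short_real, hedge):
    """Realized beta of the pair plus the market position (market beta = 1)."""
    return beta_long_real - beta_short_real + hedge


def squared_deviation(realized_beta, target=0.0):
    """Squared deviation of the realized beta from the target exposure."""
    return (realized_beta - target) ** 2


# Illustrative numbers only; these are not taken from the paper's data.
hedge = market_hedge(1.3, 0.9)
realized = realized_portfolio_beta(1.25, 1.0, hedge)
error = squared_deviation(realized)
```

Averaging `error` over all pairs and months gives an MSE of the kind reported in the table.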






Table 8: Portfolio Immunization

This table provides the results of the portfolio immunization exercise. The sample period spans from

January 2000 to October 2009. On each trading day we randomly select two portfolios consisting

of equal-weighted long positions in 50 stocks and equal-weighted short positions in 50 other stocks.

For each portfolio and trading day we then compute the corresponding neutral positions in the

factor portfolios, i.e., we create an aggregate portfolio that has zero expected factor exposure for

all four factors (Market, SMB, HML and UMD). We compute the realized factor exposures for each

portfolio and trading day, and the corresponding Mean Squared Error (MSE), i.e., the squared deviation from the

zero expected factor exposure. We then sum up the MSEs over all four factors. For the summary

statistics we pool the observations for all portfolios as well as all trading days and report the sum

of MSEs together with a t-statistic for the difference to HETIC based on standard errors clustered

by time (Petersen (2009)). The results are reported for the three different holding periods (5, 10,

and 21 trading days) separately.

                            Hist. HF Hist. Daily       HETIC

# Observations                  5454        5454        5454
Avg. sum of MSEs              0.0643      0.0698      0.0646
t-stat diff. to HETIC          -1.12       18.88           -

# Observations                  5454        5454        5454
Avg. sum of MSEs              0.0637      0.0673      0.0628
t-stat diff. to HETIC           3.10       16.58           -

# Observations                  5454        5454        5454
Avg. sum of MSEs              0.0691      0.0709      0.0658
t-stat diff. to HETIC          11.00       17.95           -
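
The immunization step for one portfolio and one trading day can be sketched in Python as follows. The sketch assumes that each factor portfolio has unit exposure to its own factor and zero exposure to the other three, so that the neutralizing positions are simply the negative of the expected portfolio exposures. This is a simplifying assumption of the sketch rather than a statement about the paper's implementation, and all function names are hypothetical.

```python
import numpy as np


def long_short_weights(n_stocks, long_idx, short_idx):
    """Equal-weighted long and short legs (e.g. 50 stocks each)."""
    w = np.zeros(n_stocks)
    w[long_idx] = 1.0 / len(long_idx)
    w[short_idx] = -1.0 / len(short_idx)
    return w


def factor_hedge(weights, predicted_betas):
    """Positions in the factor portfolios that offset the expected exposures.

    predicted_betas has shape (n_stocks, n_factors), with one column per
    factor (Market, SMB, HML, UMD).
    """
    return -(weights @ predicted_betas)


def sum_squared_exposure(weights, realized_betas, hedge):
    """Sum over factors of the squared realized exposure after hedging."""
    exposure = weights @ realized_betas + hedge
    return float(np.sum(exposure ** 2))


# Illustrative random betas only; these are not taken from the paper's data.
rng = np.random.default_rng(0)
picks = rng.permutation(500)[:100]
w = long_short_weights(500, picks[:50], picks[50:])
pred = rng.normal(size=(500, 4))
real = pred + 0.1 * rng.normal(size=(500, 4))
loss = sum_squared_exposure(w, real, factor_hedge(w, pred))
```

Pooling `loss` over portfolios and trading days and averaging gives a summary of the kind reported as "Avg. sum of MSEs".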

