
Lecture 7: Correlation and Covariance Modelling

Prof. Massimo Guidolin

20192 – Financial Econometrics

Spring 2017

Overview


The true multivariate nature of empirical applications in finance: risk management examples

Factor models of conditional covariances and variances

Quick introduction to conditional covariance models: exponential smoothing and naïve GARCH models

Dynamic conditional correlation models (DCC)

Estimation of DCC models

Multivariate GARCH models

Lecture 7: Correlation Modelling – Prof. Guidolin

Multi-Dimensional Problems


Let’s recap where we are in the course. We proceed in three steps, following a stepwise distribution modeling (SDM) approach:
① Establish a variance forecasting model for each of the assets individually and introduce methods for evaluating the performance of these forecasts: DONE!
② Consider ways to model conditionally non-normal aspects of the assets in our portfolio, i.e., aspects that are not captured by conditional means, variances, and covariances: DONE!
③ Link individual variance forecasts with correlation models: NEXT

Most relevant (realistic) applications in empirical finance are actually multivariate: they involve n assets/securities/portfolios
• Portfolio choice is by construction multivariate
• Risk management is multivariate because it involves portfolios
• Only some asset (especially derivative) pricing may occasionally turn out to be simply univariate, but that is more the exception than the rule

Multi-Dimensional Problems


So far, all the work you have been doing on second moments was univariate
• This means that second moment = variance (volatility)
• Therefore, a further, useful step consists of developing multivariate methods for second moments

What happens if we have n assets in our portfolio? One way to proceed consists of modeling not the process of $R_{1,t+1}, R_{2,t+1}, \ldots, R_{n,t+1}$ but instead the process of portfolio returns,

$$R_{P,t+1} = w_1 R_{1,t+1} + w_2 R_{2,t+1} + \ldots + w_n R_{n,t+1},$$

in which case it is simply, e.g.:

$$VaR^{P}_{t+1}(p) = -\sigma_{P,t+1}\,\Phi^{-1}(p) - \mu_{P,t+1}$$

where $\sigma_{P,t+1}$ is a volatility forecast, possibly coming from some GARCH model fitted directly on PORTFOLIO RETURNS, and $\mu_{P,t+1}$ is the corresponding forecast of the mean portfolio return

However, this aggregate VaR method depends directly on the portfolio allocations $w_1, w_2, \ldots, w_n$

Modelling portfolio returns takes the weights as given


Active vs. Passive Risk Management


The aggregate approach requires us to re-estimate and predict volatility every time the portfolio is changed
• Aggregate methods are well suited to providing forecasts of portfolio-level risk measures such as aggregate VaR
• We speak of passive risk management, in which the portfolio structure tends to remain unchanged
• Aggregate methods are less well suited to an active risk management process

If a risk manager wants to know the sensitivity of portfolio VaR to increases in market volatility and correlations, which typically occur in times of market stress, a multivariate model is needed
• The same applies to risk measures that capture sensitivity to changes in the portfolio weights

Active risk management requires a multivariate model, which provides a forecast for the entire covariance matrix

While modeling aggregate portfolio returns directly may be appropriate for passive portfolio risk measurement, it is not as useful for active risk management




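From the fact that, for two assets,

$$Var[w_1 R_{1,t+1} + w_2 R_{2,t+1}] = w_1^2\,Var[R_{1,t+1}] + w_2^2\,Var[R_{2,t+1}] + 2 w_1 w_2\,Cov[R_{1,t+1}, R_{2,t+1}],$$

we derive that

$$\sigma^2_{PF,t+1} = \sum_{i=1}^{n}\sum_{j=1}^{n} w_i w_j\,\sigma_{ij,t+1} = \sum_{i=1}^{n}\sum_{j=1}^{n} w_i w_j\,\sigma_{i,t+1}\sigma_{j,t+1}\rho_{ij,t+1}$$

where $\sigma_{ii,t+1} \equiv \sigma^2_{i,t+1}$, $\rho_{ii,t+1} = 1$, and $\sigma_{ij,t+1} \equiv Cov_t[R_{i,t+1}, R_{j,t+1}]$

• $Cov_t[\cdot]$ is a covariance conditional on time t information

Using vector notation,

$$\sigma^2_{PF,t+1} = w'\Sigma_{t+1}w$$

If we are willing to assume normality, then (setting the conditional mean $\mu_{t+1} = 0$) the portfolio VaR is

$$VaR^{P}_{t+1}(p) = -[w'\Sigma_{t+1}w]^{1/2}\,\Phi^{-1}(p)$$

What are the inputs required? n variance forecasts; n(n-1)/2 covariances or correlations

Active risk management requires a multivariate model, which provides a forecast for the entire covariance matrix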
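The portfolio VaR computation under normality can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `portfolio_var`, the two-asset weights, the volatilities, and the "calm" and "stressed" correlation values are all hypothetical, and the conditional mean is set to zero.

```python
from statistics import NormalDist

import numpy as np


def portfolio_var(weights, vols, corr, p=0.01):
    """Zero-mean normal VaR: -(w' Sigma w)^(1/2) * Phi^{-1}(p)."""
    w = np.asarray(weights, dtype=float)
    d = np.diag(np.asarray(vols, dtype=float))      # volatilities on the diagonal
    sigma = d @ np.asarray(corr, dtype=float) @ d   # covariance matrix
    port_vol = np.sqrt(w @ sigma @ w)
    return -port_vol * NormalDist().inv_cdf(p)


# Hypothetical inputs: two assets with daily volatilities of 2% and 3%.
w = [0.6, 0.4]
vols = [0.02, 0.03]
var_calm = portfolio_var(w, vols, [[1.0, 0.2], [0.2, 1.0]])
var_stress = portfolio_var(w, vols, [[1.0, 0.9], [0.9, 1.0]])
print(f"1% VaR, calm correlation:     {var_calm:.4f}")
print(f"1% VaR, stressed correlation: {var_stress:.4f}")
```

Because the weights enter only through w, the same covariance forecast can be reused to evaluate any alternative allocation, which is the point of the multivariate approach.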



• For instance, with only 15 assets in a portfolio you will need: (i) 15 variance forecasts; (ii) (15 × 14)/2 = 105 correlations

• Problem: even ignoring means, there are 105 + 15 = 120 parameters; since each period supplies 15 return observations, you will need at least 120/15 = 8 periods of data to be able to estimate all the parameters

• But this gives you only one observation per parameter! A good rule is that you need at least 20 obs. per parameter

• This means that with 15 assets, you will need data on 160 periods! E.g., more than 13 years of monthly data, or 32 weeks of daily data

The ratio between the TOTAL number of observations and the number of parameters to be estimated is called the saturation ratio
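The arithmetic behind the 15-asset example can be checked with a small sketch. The function names are hypothetical, and means are ignored as in the example.

```python
def n_parameters(n_assets):
    """Variances plus distinct correlations, ignoring means."""
    return n_assets + n_assets * (n_assets - 1) // 2


def periods_needed(n_assets, obs_per_param):
    """Smallest T with (T * n_assets) / n_parameters >= obs_per_param."""
    total_obs = obs_per_param * n_parameters(n_assets)
    return -(-total_obs // n_assets)  # ceiling division


def saturation_ratio(n_assets, n_periods):
    """Total observations divided by the number of parameters."""
    return n_assets * n_periods / n_parameters(n_assets)


print(n_parameters(15))            # 120
print(periods_needed(15, 1))       # 8
print(periods_needed(15, 20))      # 160
print(saturation_ratio(15, 160))   # 20.0
```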

How can we simplify the task? We study three approaches:
① Modeling exposure mappings (using factor models, e.g., CAPM)
② Modeling conditional covariances (this is an intro to the “Multivariate ARCH” topic, M-ARCH)
③ Modeling dynamic conditional correlations (DCC)


Exposure Mapping Approach


A simple way to reduce the dimensionality of portfolio variance is to impose a factor structure
• E.g., it may be reasonable to assume that the CAPM holds,

$$R_{PF,t+1} = R_f + \beta_{PF}[R_{M,t+1} - R_f]$$

so that

$$\sigma^2_{PF,t+1} = \beta^2_{PF}\,\sigma^2_{Mkt,t+1}$$

• In this case all you need to estimate is the CAPM beta, $\beta_{PF}$

• Is this a reduced form model? Only apparently: because $\beta_{PF} = w_1\beta_1 + w_2\beta_2 + \ldots + w_n\beta_n$, i.e., the beta of a portfolio is the weighted sum of the individual betas, with weights equal to the portfolio weights, it follows that

$$\sigma^2_{PF,t+1} = [w_1\beta_1 + w_2\beta_2 + \ldots + w_n\beta_n]^2\,\sigma^2_{Mkt,t+1}$$

and it is easy to verify that the compact expression $\sigma^2_{PF,t+1} = \beta^2_{PF}\,\sigma^2_{Mkt,t+1}$ contains individual covariances between securities
• Let’s consider in detail the case of n = 2:

$$\sigma^2_{PF,t+1} = [w_1\beta_1 + w_2\beta_2]^2\,\sigma^2_{Mkt,t+1} = [w_1^2\beta_1^2 + w_2^2\beta_2^2 + 2w_1w_2\beta_1\beta_2]\,\sigma^2_{Mkt,t+1}$$

Under an exposure mapping approach, $\sigma^2_{PF,t+1} = \beta^2_{PF,1}\sigma^2_{f1,t+1} + \beta^2_{PF,2}\sigma^2_{f2,t+1} + \ldots + \beta^2_{PF,K}\sigma^2_{fK,t+1}$ if the K factors are uncorrelated




• This is equivalent to

$$\sigma^2_{PF,t+1} = w_1^2\sigma^2_{1,t+1} + w_2^2\sigma^2_{2,t+1} + 2w_1w_2\,\sigma_{12,t+1}$$

which is the standard expression, with $\sigma^2_{i,t+1} = \beta_i^2\sigma^2_{Mkt,t+1}$ and $\sigma_{12,t+1} \equiv Cov_t[R_{1,t+1}, R_{2,t+1}]$
• This follows from the fact that

$$Cov_t[R_{1,t+1}, R_{2,t+1}] = Cov_t[R_f + \beta_1(R_{M,t+1} - R_f),\; R_f + \beta_2(R_{M,t+1} - R_f)]$$

$$= Cov_t[\beta_1 R_{M,t+1},\; \beta_2 R_{M,t+1}] = \beta_1\beta_2\,Var_t[R_{M,t+1}] = \beta_1\beta_2\,\sigma^2_{Mkt,t+1}$$

Why does it matter that this is not a reduced form model? Because it means that if you know (i) the weights $w_1, w_2, \ldots, w_n$, (ii) the individual security betas $\beta_1, \beta_2, \ldots, \beta_n$, and (iii) $\sigma^2_{Mkt,t+1}$, you can compute your estimate of $\sigma^2_{PF,t+1}$

If the weights change, you can exploit your knowledge of the change without re-estimating the betas and $\sigma^2_{Mkt,t+1}$

Of course this only applies to portfolios that contain systematic risk but are diversified enough that the firm-specific idiosyncratic risk can be ignored

Active risk management based on factor models will work iff idiosyncratic risk may be effectively ignored
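A minimal sketch of the exposure mapping idea under a single-factor (CAPM) structure with no idiosyncratic risk. The betas, weights, and market variance are hypothetical numbers chosen only to show that the compact expression and the full covariance-matrix expression agree, and that a weight change needs no re-estimation.

```python
import numpy as np


def factor_portfolio_variance(weights, betas, market_var):
    """sigma^2_PF = (sum_i w_i * beta_i)^2 * sigma^2_Mkt."""
    beta_pf = float(np.dot(weights, betas))
    return beta_pf ** 2 * market_var


betas = np.array([0.8, 1.1, 1.4])    # hypothetical security betas
market_var = 0.04 ** 2               # hypothetical market variance forecast
w_old = np.array([0.5, 0.3, 0.2])
w_new = np.array([0.2, 0.3, 0.5])    # reallocation towards the high-beta asset

# Covariance matrix implied by the factor model: sigma_ij = beta_i * beta_j * sigma^2_Mkt
implied_cov = np.outer(betas, betas) * market_var

var_old = factor_portfolio_variance(w_old, betas, market_var)
var_new = factor_portfolio_variance(w_new, betas, market_var)
print(var_old, w_old @ implied_cov @ w_old)
print(var_new, w_new @ implied_cov @ w_new)
```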


Modeling GARCH Conditional Covariances


The simplest idea is to build time-varying estimates of covariances using rolling (moving) averages,

$$\sigma_{ij,t+1} = \frac{1}{m}\sum_{\tau=1}^{m} R_{i,t+1-\tau}R_{j,t+1-\tau}$$

• Not really satisfactory because the choice of m is problematic

We can use a simple exponential smoother on covariances:

$$\sigma_{ij,t+1} = (1-\lambda)R_{i,t}R_{j,t} + \lambda\,\sigma_{ij,t}$$

• The restriction that the coefficient (1 − λ) on the cross products and the coefficient λ on the past covariance sum to one is not necessarily desirable

• It implies that there is no mean-reversion in covariance

• A high covariance will remain high forever!
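The two naive covariance estimators can be sketched as follows. This is an illustrative implementation assuming zero-mean returns; the function names, the simulated data, the window m = 25, and the smoothing value 0.94 are assumptions made for the example.

```python
import numpy as np


def rolling_cov(r1, r2, m):
    """Moving average of the last m cross products (zero means assumed)."""
    cross = np.asarray(r1, dtype=float) * np.asarray(r2, dtype=float)
    out = np.full(len(cross), np.nan)
    for t in range(m - 1, len(cross)):
        out[t] = cross[t - m + 1 : t + 1].mean()
    return out  # out[t] is the forecast for t+1 made with data up to t


def ewma_cov(r1, r2, lam=0.94, init=0.0):
    """Exponential smoother: s_{t+1} = (1 - lam) * r1_t * r2_t + lam * s_t."""
    cross = np.asarray(r1, dtype=float) * np.asarray(r2, dtype=float)
    out = np.empty(len(cross))
    prev = init
    for t, x in enumerate(cross):
        prev = (1.0 - lam) * x + lam * prev
        out[t] = prev
    return out


# Hypothetical simulated returns for two correlated assets.
rng = np.random.default_rng(0)
r = 0.01 * rng.multivariate_normal([0.0, 0.0], [[1.0, 0.5], [0.5, 1.0]], size=250)
print(rolling_cov(r[:, 0], r[:, 1], m=25)[-1])
print(ewma_cov(r[:, 0], r[:, 1], lam=0.94)[-1])
```

Since the smoother's weights sum to one, its multi-step forecast stays flat at the latest value, which is the lack of mean reversion noted in the bullets.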




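The absence of mean reversion follows from the fact that the exponential smoother can be written in GARCH(1,1) form: $\sigma_{ij,t+1} = \omega_{ij} + \alpha R_{i,t}R_{j,t} + \beta\,\sigma_{ij,t}$ is to be compared to

$$\sigma_{ij,t+1} = (1-\lambda)R_{i,t}R_{j,t} + \lambda\,\sigma_{ij,t} \quad\Rightarrow\quad \omega_{ij} = 0,\; \alpha = 1-\lambda,\; \beta = \lambda,\; \alpha + \beta = 1$$

so that, as a result, $E[\sigma_{ij,t+1}] = \omega_{ij}/(1-\alpha-\beta)$ fails to be defined

The next step is then rather obvious: let’s not restrict α and β in the GARCH(1,1)-type model for conditional covariance:

$$\sigma_{ij,t+1} = \omega_{ij} + \alpha R_{i,t}R_{j,t} + \beta\,\sigma_{ij,t}$$

When α + β < 1, the process is stationary and the unconditional covariance equals $\omega_{ij}/(1-\alpha-\beta)$
• Why are we restricting α and β to NOT depend on the specific pair of securities/assets examined?
• Setting α and β (and λ) not to depend on i and j keeps the resulting covariance matrix well behaved (positive semi-definite)

GARCH models may be fruitfully extended to modelling covariances, even though restrictions are needed to keep the covariance matrix positive (semi-)definite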




Restricting the parameters not to depend on i and j guarantees that the resulting covariance matrix, which collects the GARCH(1,1) variances and covariances, is positive semi-definite (PSD), i.e., that for all possible vectors w,

$$w'\Sigma_{t+1}w \geq 0$$

Why is that relevant? Well, just recall that $\sigma^2_{PF,t+1} = w'\Sigma_{t+1}w$, and a portfolio variance cannot be negative
• This PSD condition is ensured by estimating volatilities and covariances in an “internally consistent fashion”
• A sufficient condition for internal consistency is the use of the same λ for every volatility and covariance in exponential smoothing
• Similarly, using a GARCH(1,1) model with α and β identical across variances and covariances is sufficient

It is not clear that the persistence parameters λ, α, and β should be the same for all variances and covariances: we need to develop better models

If you listen to trading desk and asset management lingo, they will hardly talk about covariances: instead the focus will be on CORRELATIONS, besides volatility
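The internal-consistency argument can be illustrated numerically. The sketch below runs a GARCH(1,1)-type recursion with the same scalar α and β for every variance and covariance and checks that each forecast matrix stays positive definite. The simulated returns, the parameter values, and the choice of setting the intercept matrix by covariance targeting are assumptions of this example.

```python
import numpy as np


def garch_cov_path(returns, alpha, beta):
    """Sigma_{t+1} = Omega + alpha * R_t R_t' + beta * Sigma_t, scalar alpha and beta.

    Omega = (1 - alpha - beta) * (sample second-moment matrix): covariance
    targeting, an assumption of this sketch.
    """
    r = np.asarray(returns, dtype=float)
    uncond = r.T @ r / len(r)               # zero means assumed
    omega = (1.0 - alpha - beta) * uncond
    sigma = uncond.copy()                   # start at the long-run level
    path = []
    for r_t in r:
        sigma = omega + alpha * np.outer(r_t, r_t) + beta * sigma
        path.append(sigma)
    return np.array(path)


# Hypothetical simulated daily returns for three assets.
rng = np.random.default_rng(0)
corr = np.array([[1.0, 0.5, 0.2], [0.5, 1.0, 0.3], [0.2, 0.3, 1.0]])
returns = 0.01 * rng.multivariate_normal(np.zeros(3), corr, size=500)

path = garch_cov_path(returns, alpha=0.05, beta=0.90)
min_eig = np.linalg.eigvalsh(path).min()    # smallest eigenvalue over all dates
print(min_eig > 0)
```

Each update adds a positive semi-definite outer product to positively weighted positive definite matrices, so no eigenvalue can turn negative; this no longer holds automatically once α and β differ across pairs.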


Dynamic Conditional Correlation Models


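• For instance, one interesting (worrisome) phenomenon is that all correlations tend to “skyrocket” during market crises (bear markets)

• “Skyrocket” is a figure of speech: you do recall that a correlation,

$$\rho_{ij,t+1} = \frac{\sigma_{ij,t+1}}{\sigma_{i,t+1}\,\sigma_{j,t+1}},$$

belongs to [-1, 1]
• A first, intuitive but mechanical approach consists in applying GARCH models to both variances and covariances in the definition of conditional correlation, e.g.:

$$\rho_{ij,t+1} = \frac{\omega_{ij} + \alpha R_{i,t}R_{j,t} + \beta\,\sigma_{ij,t}}{\sqrt{(\omega_{ii} + \alpha R^2_{i,t} + \beta\,\sigma^2_{i,t})(\omega_{jj} + \alpha R^2_{j,t} + \beta\,\sigma^2_{j,t})}}$$

A more fruitful approach still starts from the decomposition $\sigma_{ij,t+1} = \sigma_{i,t+1}\sigma_{j,t+1}\rho_{ij,t+1}$ but generalizes it to matrix form:

$$\Sigma_{t+1} = D_{t+1}\Gamma_{t+1}D_{t+1}$$

The dynamic conditional correlation approach is based on decomposing the covariance matrix into standard deviations and correlations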




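• Here $D_{t+1}$ is a matrix with the standard deviations, $\sigma_{i,t+1}$, on the ith diagonal position and zeros everywhere else

• $\Gamma_{t+1}$ is a matrix of correlations, $\rho_{ij,t+1}$, with ones on the diagonal
• E.g., for n = 2:

$$\Sigma_{t+1} = \begin{bmatrix}\sigma^2_{1,t+1} & \sigma_{12,t+1}\\ \sigma_{12,t+1} & \sigma^2_{2,t+1}\end{bmatrix} = \begin{bmatrix}\sigma_{1,t+1} & 0\\ 0 & \sigma_{2,t+1}\end{bmatrix}\begin{bmatrix}1 & \rho_{12,t+1}\\ \rho_{12,t+1} & 1\end{bmatrix}\begin{bmatrix}\sigma_{1,t+1} & 0\\ 0 & \sigma_{2,t+1}\end{bmatrix}$$

At this point we proceed in two steps:
① Volatilities of each asset are estimated through GARCH or one of the other methods considered in the first part of the course
② Conditional covariances of the standardized returns derived from the first step are modelled

The DCC approach is based on two steps: modelling the volatility of each individual asset; modelling the covariances of standardized residuals from the first step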




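• Luckily, the conditional covariance of the standardized returns, $z_{i,t+1} = R_{i,t+1}/\sigma_{i,t+1}$, equals the conditional correlation of the raw returns:

$$E_t[z_{i,t+1}z_{j,t+1}] = \frac{E_t[R_{i,t+1}R_{j,t+1}]}{\sigma_{i,t+1}\sigma_{j,t+1}} = \frac{\sigma_{ij,t+1}}{\sigma_{i,t+1}\sigma_{j,t+1}} = \rho_{ij,t+1}$$

• You need an auxiliary variable, $q_{ij,t+1}$, updated over time, to be able to compute conditional correlations:

$$\rho_{ij,t+1} = \frac{q_{ij,t+1}}{\sqrt{q_{ii,t+1}\,q_{jj,t+1}}}$$

• Why the need for the auxiliary variable $q_{ij,t+1}$? Because using the ratio above ensures that $\rho_{ij,t+1}$ falls in the interval [-1,1]

• At this point write a dynamic model for $q_{ij,t+1}$, like:

$$q_{ij,t+1} = (1-\lambda)z_{i,t}z_{j,t} + \lambda\,q_{ij,t}$$

in this case exponential smoothing with parameter λ

The DCC approach is based on applying GARCH/RiskMetrics-type models to auxiliary variables that ensure $\rho_{ij,t+1} \in [-1,1]$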




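• An obvious alternative is a GARCH-type dynamic model:

$$q_{ij,t+1} = \bar\rho_{ij} + \alpha(z_{i,t}z_{j,t} - \bar\rho_{ij}) + \beta(q_{ij,t} - \bar\rho_{ij})$$

where $\bar\rho_{ij} = E[z_{i,t}z_{j,t}]$ is the unconditional correlation between assets i and j; collecting the $\bar\rho_{ij}$ gives the unconditional correlation matrix (covariance targeting)

• Notice that the correlation persistence parameters α and β are common across i and j: the persistence of the correlation between any two assets in the portfolio is the same

• It does not, however, imply that the level of the correlations at any time is the same across pairs of assets

• Why the restriction? Usual reason: to guarantee $\rho_{ij,t+1} \in [-1,1]$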



• An important feature of these models is that the matrix $Q_{t+1}$ collecting the $q_{ij,t+1}$ auxiliary variables,

$$Q_{t+1} = (1-\alpha-\beta)\,E[z_tz_t'] + \alpha\,z_tz_t' + \beta\,Q_t,$$

will be positive definite as it is a weighted average of positive semi-definite and positive definite matrices

• This will in turn ensure that the correlation matrix $\Gamma_{t+1}$ and the covariance matrix $\Sigma_{t+1}$ will be positive semi-definite

• DCC models are enjoying massive popularity because they are easy to implement in 3 steps:
① First, all the individual variances are estimated one by one
② Second, the returns are standardized and the unconditional correlation matrix is estimated
③ Third, the correlation persistence parameters α and β (or λ) are estimated

The DCC approach guarantees that the estimated/predicted covariance matrices are always positive (semi-)definite
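The correlation step of the DCC recipe can be sketched as below, assuming the first step (standardizing returns by their own volatility forecasts) has already been done. The function name, the simulated standardized returns, and the values α = 0.05 and β = 0.90 are hypothetical; the unconditional correlation matrix is replaced by the sample second-moment matrix of the standardized returns.

```python
import numpy as np


def dcc_correlations(z, alpha, beta):
    """Path of correlation matrices implied by
    Q_{t+1} = (1 - alpha - beta) * Qbar + alpha * z_t z_t' + beta * Q_t,
    followed by rho_ij = q_ij / sqrt(q_ii * q_jj)."""
    z = np.asarray(z, dtype=float)
    q_bar = z.T @ z / len(z)      # sample counterpart of E[z z']
    q = q_bar.copy()
    corr_path = []
    for z_t in z:
        q = (1.0 - alpha - beta) * q_bar + alpha * np.outer(z_t, z_t) + beta * q
        scale = 1.0 / np.sqrt(np.diag(q))
        corr_path.append(q * np.outer(scale, scale))
    return np.array(corr_path)


# Hypothetical standardized returns for three assets.
rng = np.random.default_rng(42)
true_corr = np.array([[1.0, 0.6, 0.3], [0.6, 1.0, 0.4], [0.3, 0.4, 1.0]])
z = rng.multivariate_normal(np.zeros(3), true_corr, size=750)

corr_path = dcc_correlations(z, alpha=0.05, beta=0.90)
print(corr_path[-1].round(3))
```

Normalizing by the diagonal of Q is what forces ones on the diagonal and keeps every off-diagonal entry inside [-1, 1].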


Estimation of DCC Models


• Only a few parameters are estimated simultaneously using numerical optimization

• This feature makes DCC models extremely tractable for risk management of large portfolios

• Yes, but how do you estimate the parameters?
• Simple, just use QMLE methods “in waves”, i.e., sequentially
• Why QML? Because you want to be able to build standardized residuals from some GARCH volatility models first and then proceed to estimate α and β (or λ) for dynamic correlations

It is QMLE because you estimate conditioning on some first-stage parameter estimates of GARCH models for volatility
• E.g., in the case of n = 2 you maximize

$$\max\; \ln L_c = -\frac{1}{2}\sum_{t=1}^{T}\left[\ln(1-\rho^2_{12,t}) + \frac{z^2_{1,t} + z^2_{2,t} - 2\rho_{12,t}z_{1,t}z_{2,t}}{1-\rho^2_{12,t}}\right]$$

where $\rho_{12,t} = q_{12,t}/\sqrt{q_{11,t}\,q_{22,t}}$ and (e.g., in the case of RiskMetrics) $q_{ij,t+1} = (1-\lambda)z_{i,t}z_{j,t} + \lambda\,q_{ij,t}$

DCC models are estimated by QMLE because you first estimate the first-stage GARCH parameters and then the DCC ones, conditioning on the first-stage ML estimates

Estimation of DCC Models

• To initialize the recursion, set

$$q_{11,0} = 1,\qquad q_{22,0} = 1,\qquad q_{12,0} = \frac{1}{T}\sum_{t=1}^{T}z_{1,t}z_{2,t}$$

• Notice that the variables that enter the likelihood are the rescaled returns, $z_t$, and not the original raw returns, $R_t$
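A sketch of the second-stage estimation for n = 2 with RiskMetrics-style updating of the auxiliary variables. It assumes the standardized returns are already available from the first stage; the simulated data are hypothetical, and a coarse grid search over λ stands in for the numerical optimizer one would normally use.

```python
import numpy as np


def corr_loglik(lam, z1, z2):
    """Bivariate correlation log-likelihood with exponentially smoothed q's."""
    q11, q22 = 1.0, 1.0
    q12 = float(np.mean(z1 * z2))      # initialization of the recursion
    total = 0.0
    for a, b in zip(z1, z2):
        rho = q12 / np.sqrt(q11 * q22)  # uses information up to t-1 only
        total += -0.5 * (
            np.log(1.0 - rho**2)
            + (a * a + b * b - 2.0 * rho * a * b) / (1.0 - rho**2)
        )
        q11 = (1.0 - lam) * a * a + lam * q11
        q22 = (1.0 - lam) * b * b + lam * q22
        q12 = (1.0 - lam) * a * b + lam * q12
    return total


# Hypothetical standardized returns from the first (volatility) stage.
rng = np.random.default_rng(7)
z = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.5], [0.5, 1.0]], size=1000)
z1, z2 = z[:, 0], z[:, 1]

grid = np.linspace(0.80, 0.99, 20)     # candidate smoothing parameters
logliks = [corr_loglik(lam, z1, z2) for lam in grid]
lam_hat = grid[int(np.argmax(logliks))]
print(lam_hat)
```

Only one parameter is searched over here regardless of how the volatilities were modelled, which is what keeps the sequential approach tractable.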

Multivariate GARCH Models: VECH(1,1)


Multivariate GARCH models are in spirit similar to their univariate counterparts, except that they also specify equations for the conditional covariances, i.e., covariances move directly over time

Several different multivariate GARCH formulations exist, e.g., the VECH, the diagonal VECH, and the BEKK models
• Below it is assumed for simplicity that there are n = 2 assets

A VECH(1,1) model is specified as:

$$VECH(H_t) = C + A\,VECH(\Xi_{t-1}\Xi_{t-1}') + B\,VECH(H_{t-1}), \qquad \Xi_t\,|\,\psi_{t-1} \sim N(0, H_t)$$

• $H_t$ is a conditional covariance matrix, $\Xi_{t-1}$ is an innovation (disturbance) vector, $\psi_{t-1}$ represents the information set at time t − 1

• C is a 3 × 1 parameter vector, A and B are 3 × 3 parameter matrices and VECH(·) denotes the column-stacking operator applied to the upper portion of the symmetric matrix $H_t$

• The model requires the estimation of 21 parameters (3 in C plus 9 each in A and B), a lot!

A VECH GARCH(1,1) model is based on the idea of modeling covariance matrices as column-stacked vectors

Multivariate GARCH Models : VECH(1,1)


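• How the VECH operator works is shown below, writing the innovation vector as $\Xi_t = [u_{1,t}, u_{2,t}]'$:

$$VECH(H_t) = \begin{bmatrix}h_{11,t}\\ h_{22,t}\\ h_{12,t}\end{bmatrix}, \qquad VECH(\Xi_t\Xi_t') = \begin{bmatrix}u^2_{1,t}\\ u^2_{2,t}\\ u_{1,t}u_{2,t}\end{bmatrix}$$

• The elements for the case n = 2 are written out below:

$$h_{11,t} = c_{11} + a_{11}u^2_{1,t-1} + a_{12}u^2_{2,t-1} + a_{13}u_{1,t-1}u_{2,t-1} + b_{11}h_{11,t-1} + b_{12}h_{22,t-1} + b_{13}h_{12,t-1}$$

$$h_{22,t} = c_{21} + a_{21}u^2_{1,t-1} + a_{22}u^2_{2,t-1} + a_{23}u_{1,t-1}u_{2,t-1} + b_{21}h_{11,t-1} + b_{22}h_{22,t-1} + b_{23}h_{12,t-1}$$

$$h_{12,t} = c_{31} + a_{31}u^2_{1,t-1} + a_{32}u^2_{2,t-1} + a_{33}u_{1,t-1}u_{2,t-1} + b_{31}h_{11,t-1} + b_{32}h_{22,t-1} + b_{33}h_{12,t-1}$$

• Conditional variances and conditional covariances depend on the lagged values of all of the conditional variances of, and conditional covariances between, all of the asset returns in the series, as well as the lagged squared errors and the error cross-products

• As n increases, the estimation of the VECH model quickly becomes infeasible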
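To see how quickly the VECH parameter count grows, here is a small sketch. The `vech` helper stacks the variances first and the covariances after them, which is one possible ordering convention assumed here; both function names are hypothetical.

```python
import numpy as np


def vech(h):
    """Stack the distinct elements of a symmetric matrix:
    variances first, then the covariances above the diagonal."""
    h = np.asarray(h)
    rows, cols = np.triu_indices(h.shape[0], k=1)
    return np.concatenate([np.diag(h), h[rows, cols]])


def vech_param_count(n, diagonal=False):
    """Parameters in a VECH(1,1): C is m x 1, A and B are m x m (or diagonal)."""
    m = n * (n + 1) // 2               # distinct elements of an n x n covariance matrix
    return m + 2 * m if diagonal else m + 2 * m * m


print(vech([[1.0, 2.0], [2.0, 3.0]]))  # [1. 3. 2.]
for n in (2, 5, 10):
    print(n, vech_param_count(n), vech_param_count(n, diagonal=True))
```

With n = 2 this reproduces the 21 parameters of the full model; with n = 10 the full model would already need 6,105.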


Multivariate GARCH: Diagonal VECH and BEKK


The VECH conditional covariance matrix is often restricted so that A and B are assumed to be diagonal
• This reduces the number of parameters to be estimated to 9 (A and B each have 3 elements, as does C) and the model, known as a diagonal VECH, is:

$$h_{ij,t} = \omega_{ij} + \alpha_{ij}\,u_{i,t-1}u_{j,t-1} + \beta_{ij}\,h_{ij,t-1}, \qquad i,j = 1,2$$

where $u_{i,t}$ denotes the ith element of the innovation vector $\Xi_t$

A disadvantage of the VECH model is that there is no guarantee of a positive semi-definite covariance matrix
• It is this property which ensures that, whatever the weight of each asset in the portfolio, an estimated value-at-risk is always positive
• The BEKK model addresses the difficulty with VECH of ensuring that the H matrix is always positive definite:

$$H_t = W'W + A'H_{t-1}A + B'\,\Xi_{t-1}\Xi_{t-1}'\,B$$

• A and B are 2 × 2 matrices of parameters and W is upper triangular

A diagonal VECH GARCH(1,1) model is a VECH(1,1) in which the matrices of parameters are restricted to be diagonal

Estimation of Multivariate GARCH Models


The positive definiteness of the covariance matrix is ensured owing to the quadratic nature of the terms on the equation’s RHS
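The role of the quadratic terms can be checked numerically with a BEKK(1,1) recursion of the form H_t = W'W + A'H_{t-1}A + B'ξξ'B, where ξ is the lagged innovation vector. The parameter matrices below are hypothetical values picked for illustration, not estimates, and innovations are simulated from the current covariance matrix.

```python
import numpy as np


def bekk_step(h_prev, xi_prev, w, a, b):
    """One BEKK(1,1) update: H_t = W'W + A' H_{t-1} A + B' xi xi' B."""
    xi = np.asarray(xi_prev, dtype=float).reshape(-1, 1)
    return w.T @ w + a.T @ h_prev @ a + b.T @ (xi @ xi.T) @ b


# Hypothetical parameter matrices (W upper triangular, A and B unrestricted).
w = np.array([[0.010, 0.002], [0.000, 0.008]])
a = np.array([[0.90, 0.05], [-0.03, 0.85]])
b = np.array([[0.30, -0.10], [0.05, 0.25]])

rng = np.random.default_rng(1)
h = 1e-4 * np.eye(2)
min_eigs = []
for _ in range(200):
    # Draw an innovation with covariance h; Cholesky would fail if h were not PD.
    xi = np.linalg.cholesky(h) @ rng.standard_normal(2)
    h = bekk_step(h, xi, w, a, b)
    min_eigs.append(np.linalg.eigvalsh(h).min())

print(min(min_eigs) > 0)
```

Every term on the right-hand side has the form M'XM with X positive semi-definite, and W'W is positive definite when W has full rank, so the sum cannot lose positive definiteness whatever the signs of the entries in A and B.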

Under the assumption of conditional normality, the parameters of the multivariate GARCH models of any of the above specifications can be estimated by maximizing the log-likelihood function

$$\ell(\theta) = -\frac{TN}{2}\log 2\pi - \frac{1}{2}\sum_{t=1}^{T}\left(\log|H_t| + \Xi_t'H_t^{-1}\Xi_t\right)$$

• θ denotes all the unknown parameters to be estimated, N is the number of assets (i.e., the number of series in the system), and T is the number of observations

• The MLE for θ is asymptotically normal
• The additional complexity and extra parameters involved compared with univariate models make estimation a computationally more difficult task, although the principles are the same

Multivariate GARCH models are estimated by (Q)MLE, which is consistent and asymptotically normal
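A minimal sketch of the Gaussian log-likelihood evaluation that an optimizer would call repeatedly, assuming the sequence of conditional covariance matrices has already been generated by one of the recursions (VECH, diagonal VECH, or BEKK) for a given parameter vector. The function name and the convention of returning minus infinity for a non-positive-definite matrix are assumptions of this example.

```python
import numpy as np


def mgarch_loglik(residuals, covariances):
    """Gaussian log-likelihood: -(T*N/2) log(2 pi) - 0.5 * sum_t (log|H_t| + xi_t' H_t^{-1} xi_t)."""
    resid = np.asarray(residuals, dtype=float)
    n_obs, n_assets = resid.shape
    total = -0.5 * n_obs * n_assets * np.log(2.0 * np.pi)
    for xi, h in zip(resid, covariances):
        sign, logdet = np.linalg.slogdet(h)
        if sign <= 0:
            return -np.inf   # reject parameter values that give an invalid H_t
        total -= 0.5 * (logdet + xi @ np.linalg.solve(h, xi))
    return total


# Toy check with identity covariance matrices and N = 2.
identity_path = [np.eye(2)] * 3
print(mgarch_loglik(np.zeros((3, 2)), identity_path))
```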

Reading List/How to prepare for the exam

Carefully read these Lecture Slides + class notes

Possibly read CHRISTOFFERSEN, chapter 7

Possibly read BROOKS, chapter 9 (sections 21-30)

Lecture Notes are available on Prof. Guidolin’s personal web page

Andersen T., T. Bollerslev, and F. Diebold (2009) “Parametric and Nonparametric Volatility Measurement”, in Ait-Sahalia, Y. and L. P. Hansen (eds.), Handbook of Financial Econometrics, Elsevier.

Litterman R. and K. Winkelmann (1998) “Estimating Covariance Matrices”, Goldman Sachs Quantitative Strategies Research Notes.
