Lecture 7: Correlation and Covariance Modelling
Prof. Massimo Guidolin
20192 – Financial Econometrics
Spring 2017
Overview
The true multivariate nature of empirical applications in finance: risk management examples
Factor models of conditional covariances and variances
Quick introduction to conditional covariance models: exponential smoothing and naïve GARCH models
Dynamic conditional correlation models (DCC)
Estimation of DCC models
Multivariate GARCH models
Lecture 7: Correlation Modelling – Prof. Guidolin
Multi-Dimensional Problems
Let’s recap where we are in the course. We will proceed in three steps following a stepwise distribution modeling (SDM) approach:
① Establish a variance forecasting model for each of the assets individually and introduce methods for evaluating the performance of these forecasts: DONE!
② Consider ways to model conditionally non-normal aspects of the assets in our portfolio, i.e., aspects that are not captured by conditional means, variances, and covariances: DONE!
③ Link individual variance forecasts with correlation models: NEXT
Most relevant (realistic) applications in empirical finance are actually multivariate: they involve n assets/securities/portfolios
• Portfolio choice is by construction multivariate
• Risk management is multivariate because it involves portfolios
• Only some asset (especially derivative) pricing may occasionally turn out to be simply univariate, but that is more the exception than the rule
Multi-Dimensional Problems
So far, all the work you have been doing on second moments was univariate
• This means that second moment = variance (volatility)
• Therefore: a further, useful step consists of developing multivariate methods for second moments
What happens if we have n assets in our ptf.? One way to proceed consists of simply modeling not the process of $R^1_{t+1}, R^2_{t+1}, \dots, R^n_{t+1}$ but instead that of ptf. returns,
$$R^P_{t+1} = w_1 R^1_{t+1} + w_2 R^2_{t+1} + \dots + w_n R^n_{t+1},$$
in which case it is simply, e.g.:
$$VaR^P_{t+1}(p) = -\sigma^P_{t+1}\,\Phi^{-1}(p) - \mu^P_{t+1}$$
where $\mu^P_{t+1}$ is the conditional mean of ptf. returns and $\sigma^P_{t+1}$ is a volatility forecast possibly coming from some GARCH model directly fitted on PORTFOLIO RETURNS
However this aggregate VaR method is directly dependent on the portfolio allocations $w_1, w_2, \dots, w_n$
Modelling portfolio returns takes the weights as given
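A minimal sketch of the aggregate approach, assuming normality and using sample moments as stand-ins for the GARCH forecasts (the return numbers, the weights, and the function names are hypothetical). Note that the weights enter before any estimation takes place, so a new allocation means rebuilding the ptf. return series and re-estimating:

```python
from statistics import NormalDist, mean, stdev

def portfolio_returns(asset_returns, weights):
    """Build R^P_t = sum_i w_i * R^i_t for each period t."""
    return [sum(w * r for w, r in zip(weights, period))
            for period in asset_returns]

def normal_var(mu, sigma, p):
    """VaR(p) = -sigma * Phi^{-1}(p) - mu, reported as a positive loss."""
    return -sigma * NormalDist().inv_cdf(p) - mu

# Hypothetical daily returns for n = 2 assets (one tuple per period).
returns = [(0.010, -0.004), (-0.020, 0.006), (0.005, 0.001),
           (0.012, -0.009), (-0.007, 0.003), (0.001, 0.008)]
weights = (0.6, 0.4)

rp = portfolio_returns(returns, weights)
# Sample mean and standard deviation replace the model-based forecasts here.
var_1pct = normal_var(mean(rp), stdev(rp), p=0.01)
print(f"1% VaR of the portfolio: {var_1pct:.4f}")
```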
Active vs. Passive Risk Management
The aggregate method requires us to re-estimate and predict volatility every time the portfolio is changed
• Aggregate methods are well-suited to providing forecasts of portfolio-level risk measures such as aggregate VaR
• We speak about passive risk management, in which portfolio structure tends to remain unchanged
• Less well-suited for an active risk management process
If a risk manager wants to know the sensitivity of ptf. VaR to increases in market volatility and correlations, which typically occur in times of market stress, a multivariate model is needed
The same applies to the sensitivity of risk measures when ptf. weights change
Active risk management requires a multivariate model, which provides a forecast for the entire covariance matrix
While modeling aggregate portfolio returns directly may be appropriate for passive portfolio risk measurement, it is not as useful for active risk management
Active vs. Passive Risk Management
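From the fact that $R^P_{t+1} = \sum_{i=1}^{n} w_i R^i_{t+1}$ and, for two assets,
$$Var[w_1 R^1_{t+1} + w_2 R^2_{t+1}] = w_1^2\,Var[R^1_{t+1}] + w_2^2\,Var[R^2_{t+1}] + 2 w_1 w_2\,Cov[R^1_{t+1}, R^2_{t+1}],$$
we derive that
$$\sigma^2_{PF,t+1} = \sum_{i=1}^{n}\sum_{j=1}^{n} w_i w_j \sigma_{ij,t+1} = \sum_{i=1}^{n}\sum_{j=1}^{n} w_i w_j \sigma_{i,t+1}\sigma_{j,t+1}\rho_{ij,t+1}$$
where $\sigma_{ii,t+1} \equiv \sigma^2_{i,t+1}$, $\rho_{ii,t+1} = 1$, and $\sigma_{ij,t+1} \equiv Cov_t[R^i_{t+1}, R^j_{t+1}]$
• $Cov_t[\cdot]$ is a covariance conditional on time t information
Using vector notation,
$$\sigma^2_{PF,t+1} = w'\Sigma_{t+1}w$$
If we are willing to assume normality, then with $\mu_{t+1} = 0$ the ptf. VaR is
$$VaR^P_{t+1}(p) = -[w'\Sigma_{t+1}w]^{1/2}\,\Phi^{-1}(p)$$
What are the inputs required? n variance forecasts; n(n-1)/2 covariances or correlations
Active risk management requires a multivariate model, which provides a forecast for the entire covariance matrix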
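As an illustration of why a covariance matrix forecast suits active risk management, the sketch below computes the zero-mean normal VaR as a quadratic form in the weights. The volatility and correlation numbers and the function names are hypothetical. Once the matrix is available, a new allocation only requires re-evaluating the quadratic form, not re-estimating any model:

```python
from math import sqrt
from statistics import NormalDist

def covariance_matrix(vols, corr):
    """Sigma_ij = sigma_i * sigma_j * rho_ij."""
    n = len(vols)
    return [[vols[i] * vols[j] * corr[i][j] for j in range(n)]
            for i in range(n)]

def portfolio_variance(weights, sigma):
    """Quadratic form w' Sigma w written as a double sum."""
    n = len(weights)
    return sum(weights[i] * weights[j] * sigma[i][j]
               for i in range(n) for j in range(n))

def portfolio_var(weights, sigma, p):
    """Zero-mean normal VaR: -(w' Sigma w)^{1/2} * Phi^{-1}(p)."""
    return -sqrt(portfolio_variance(weights, sigma)) * NormalDist().inv_cdf(p)

vols = [0.02, 0.01]                 # hypothetical volatility forecasts
corr = [[1.0, 0.3], [0.3, 1.0]]     # hypothetical correlation forecast
sigma = covariance_matrix(vols, corr)

# Two candidate allocations evaluated against the same forecast.
for w in ([0.5, 0.5], [0.8, 0.2]):
    print(w, round(portfolio_var(w, sigma, 0.01), 5))
```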
Active vs. Passive Risk Management
• For instance, with only 15 assets in a portfolio you will need: (i) 15 variance forecasts; (ii) (15x14/2) = 105 correlations
• Problem: even ignoring means, with 15 assets you will need at least (105+15)/15 = 8 data points, or observations on 8 periods to be able to estimate all the parameters
• But this gives you approximately one observation per parameter! A good rule is that you need at least 20 obs. per parameter
• This means that with 15 assets, you will need data on 160 periods! E.g., almost 14 years of monthly data; or 32 weeks of daily data
The ratio btw. the TOTAL number of observations and the number of parameters to be estimated is called saturation ratio
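The arithmetic in the 15-asset example can be checked with a few lines; this is only a sketch, and the function names are hypothetical. It counts variances plus correlations, ignores means as the text does, and treats one period as delivering one observation per asset:

```python
def n_second_moment_params(n_assets):
    """n variances plus n(n-1)/2 distinct correlations."""
    return n_assets + n_assets * (n_assets - 1) // 2

def saturation_ratio(n_assets, n_periods):
    """Total observations (n_assets * n_periods) per parameter."""
    return n_assets * n_periods / n_second_moment_params(n_assets)

def periods_needed(n_assets, obs_per_param):
    """Smallest number of periods that reaches the target ratio."""
    total_obs = obs_per_param * n_second_moment_params(n_assets)
    return -(-total_obs // n_assets)   # ceiling division

print(n_second_moment_params(15))   # 120
print(saturation_ratio(15, 8))      # 1.0
print(periods_needed(15, 20))       # 160
```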
How can we simplify the task? We study three approaches:
① Modeling exposure mappings (using factor models, e.g., CAPM)
② Modeling conditional covariances (this is an Intro to the “Multivariate ARCH” topic, M-ARCH)
③ Modeling dynamic conditional correlations (DCC)
Exposure Mapping Approach
A simple way to reduce the dimensionality of portfolio variance is to impose a factor structure
• E.g., it may be reasonable to assume that the CAPM holds,
$$R_{PF,t+1} = R_f + \beta_{PF}[R_{M,t+1} - R_f]$$
so that
$$\sigma^2_{PF,t+1} = \beta^2_{PF}\,\sigma^2_{Mkt,t+1}$$
• In this case all you need to estimate is the CAPM beta, $\beta_{PF}$
• Is this a reduced form model? Only apparently: because $\beta_{PF} = w_1\beta_1 + w_2\beta_2 + \dots + w_n\beta_n$, or the beta of a ptf. is the weighted sum of the individual betas, with weights equal to the ptf. weights, it follows that
$$\sigma^2_{PF,t+1} = [w_1\beta_1 + w_2\beta_2 + \dots + w_n\beta_n]^2\,\sigma^2_{Mkt,t+1}$$
and it is easy to verify that the compact expression $\sigma^2_{PF,t+1} = \beta^2_{PF}\,\sigma^2_{Mkt,t+1}$ contains individual covariances btw. securities
• Let’s consider in detail the case of n = 2:
$$\sigma^2_{PF,t+1} = [w_1\beta_1 + w_2\beta_2]^2\,\sigma^2_{Mkt,t+1} = [w_1^2\beta_1^2 + w_2^2\beta_2^2 + 2 w_1 w_2 \beta_1\beta_2]\,\sigma^2_{Mkt,t+1}$$
Under an exposure mapping approach, $\sigma^2_{PF,t+1} = \beta^2_{PF,1}\sigma^2_{f_1,t+1} + \beta^2_{PF,2}\sigma^2_{f_2,t+1} + \dots + \beta^2_{PF,K}\sigma^2_{f_K,t+1}$ if the K factors are uncorrelated
Active vs. Passive Risk Management
• This is equivalent to
$$\sigma^2_{PF,t+1} = w_1^2\sigma^2_{1,t+1} + w_2^2\sigma^2_{2,t+1} + 2 w_1 w_2 \sigma_{12,t+1}$$
which is the standard expression, with $\sigma^2_{i,t+1} = \beta_i^2\sigma^2_{Mkt,t+1}$ and $\sigma_{12,t+1} \equiv Cov_t[R_{1,t+1}, R_{2,t+1}]$
• This follows from the fact that
$$Cov_t[R_{1,t+1}, R_{2,t+1}] = Cov_t[R_f + \beta_1(R_{M,t+1} - R_f),\; R_f + \beta_2(R_{M,t+1} - R_f)] = Cov_t[\beta_1 R_{M,t+1}, \beta_2 R_{M,t+1}] = \beta_1\beta_2\,Var_t[R_{M,t+1}] = \beta_1\beta_2\,\sigma^2_{Mkt,t+1}$$
Why does it matter that this is not a reduced form model? Because it means that if you know: (i) the weights $w_1, w_2, \dots, w_n$, (ii) the individual security betas, $\beta_1, \beta_2, \dots, \beta_n$, (iii) $\sigma^2_{Mkt,t+1}$, you can compute your estimate for $\sigma^2_{PF,t+1}$
If the weights change, you can exploit your knowledge of the change without re-estimating betas and $\sigma^2_{Mkt,t+1}$
Of course this only applies to portfolios that contain systematic risk, but which are diversified enough that the firm-specific idiosyncratic risk can be ignored
Active risk management based on factor models will work iff idiosyncratic risk may be effectively ignored
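The equivalence between the compact CAPM expression and the double sum over implied covariances can be verified numerically. The sketch below uses hypothetical betas, weights, and market variance, and it ignores idiosyncratic risk as the slide does. It also shows that a change of weights needs no new estimation:

```python
def portfolio_beta(weights, betas):
    """beta_PF = sum_i w_i * beta_i."""
    return sum(w * b for w, b in zip(weights, betas))

def variance_compact(weights, betas, market_var):
    """sigma2_PF = beta_PF^2 * sigma2_Mkt."""
    return portfolio_beta(weights, betas) ** 2 * market_var

def variance_expanded(weights, betas, market_var):
    """sum_i sum_j w_i w_j sigma_ij, with sigma_ij = beta_i beta_j sigma2_Mkt."""
    n = len(weights)
    return sum(weights[i] * weights[j] * betas[i] * betas[j] * market_var
               for i in range(n) for j in range(n))

betas = [1.3, 0.8, 0.5]   # hypothetical security betas, estimated once
market_var = 0.0004       # hypothetical forecast of the market variance

# Re-weighting the portfolio reuses the same betas and market variance.
for weights in ([0.4, 0.4, 0.2], [0.1, 0.3, 0.6]):
    compact = variance_compact(weights, betas, market_var)
    expanded = variance_expanded(weights, betas, market_var)
    print(weights, compact, expanded)
```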
Modeling GARCH Conditional Covariances
The simplest idea is to build time-varying estimates of covariances using rolling (moving) averages,
$$\sigma_{ij,t+1} = \frac{1}{m}\sum_{\tau=1}^{m} R_{i,t+1-\tau}R_{j,t+1-\tau}$$
• Not really satisfactory because the choice of m is problematic
We can use a simple exponential smoother on covariances:
$$\sigma_{ij,t+1} = (1-\lambda)R_{i,t}R_{j,t} + \lambda\,\sigma_{ij,t}$$
• The restriction that the coefficient $(1-\lambda)$ on the cross products and the coefficient $\lambda$ on past covariance sum to one is not necessarily desirable
• It implies that there is no mean-reversion in covariance
• A high covariance will remain high forever!
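Both estimators are easy to sketch in code, assuming zero-mean returns so that cross products estimate covariances. The return series, the window m, the starting value, and the smoothing parameter 0.94 are assumptions chosen only for illustration:

```python
def rolling_covariance(r_i, r_j, m):
    """Average of the last m cross products R_i * R_j."""
    return sum(a * b for a, b in zip(r_i[-m:], r_j[-m:])) / m

def ewma_covariance(r_i, r_j, lam, start):
    """Iterate sigma_{ij,t+1} = (1 - lam) * R_{i,t} * R_{j,t} + lam * sigma_{ij,t}."""
    cov = start
    for a, b in zip(r_i, r_j):
        cov = (1 - lam) * a * b + lam * cov
    return cov

# Hypothetical return histories for two assets, oldest observation first.
r1 = [0.010, -0.020, 0.005, 0.012, -0.007]
r2 = [0.004, -0.011, -0.002, 0.009, -0.003]

print(rolling_covariance(r1, r2, m=3))
print(ewma_covariance(r1, r2, lam=0.94, start=0.0001))
```

The rolling estimate moves abruptly whenever a large cross product drops out of the window, whereas the smoother discounts old observations gradually.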
Modeling GARCH Conditional Covariances
This follows from the fact that the model can be written in GARCH(1,1) form as
$$\sigma_{ij,t+1} = \omega_{ij} + \alpha R_{i,t}R_{j,t} + \beta\,\sigma_{ij,t}$$
which is to be compared to:
$$\sigma_{ij,t+1} = (1-\lambda)R_{i,t}R_{j,t} + \lambda\,\sigma_{ij,t} \;\Longrightarrow\; \omega_{ij} = 0,\; \alpha = 1-\lambda,\; \beta = \lambda,\; \alpha + \beta = 1$$
so that, as a result, $E[\sigma_{ij,t+1}] = \omega_{ij}/(1-\alpha-\beta)$ fails to be defined
The next step is then rather obvious: let’s not restrict $\alpha$ and $\beta$ in the GARCH(1,1) type model for conditional covariance:
$$\sigma_{ij,t+1} = \omega_{ij} + \alpha R_{i,t}R_{j,t} + \beta\,\sigma_{ij,t}$$
When $\alpha + \beta < 1$, the process is stationary and the unconditional covariance will equal $\omega_{ij}/(1-\alpha-\beta)$
• Why are we restricting $\alpha$ and $\beta$ to NOT depend on the specific pair of securities/assets examined?
• Setting $\alpha$ and $\beta$ (and $\lambda$) not to depend on i and j yields good outcomes
GARCH models may be fruitfully extended to modelling covariances, even though restrictions are needed to keep the covariance matrix (semi) positive definite
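Mean reversion can be made visible by iterating the covariance recursion in expectation, using the fact that the expected cross product equals the conditional covariance. The sketch below uses hypothetical parameter values and compares a stationary GARCH(1,1) covariance with the exponential smoother, which corresponds to a zero intercept and persistence equal to one:

```python
def unconditional_covariance(omega, alpha, beta):
    """omega / (1 - alpha - beta), defined only when alpha + beta < 1."""
    if alpha + beta >= 1:
        raise ValueError("alpha + beta must be below 1 for mean reversion")
    return omega / (1 - alpha - beta)

def expected_cov_path(omega, alpha, beta, cov_next, horizon):
    """E_t[sigma_{ij,t+k}] for k = 1..horizon.

    Each step applies E_t[sigma_{t+k+1}] = omega + (alpha + beta) * E_t[sigma_{t+k}].
    """
    path = [cov_next]
    for _ in range(horizon - 1):
        path.append(omega + (alpha + beta) * path[-1])
    return path

omega, alpha, beta = 1e-6, 0.05, 0.90   # hypothetical GARCH parameters
long_run = unconditional_covariance(omega, alpha, beta)
garch_path = expected_cov_path(omega, alpha, beta, cov_next=1e-4, horizon=250)
ewma_path = expected_cov_path(0.0, 0.06, 0.94, cov_next=1e-4, horizon=250)

print("long-run covariance:", long_run)
print("GARCH forecast at horizon 250:", garch_path[-1])
print("smoother forecast at horizon 250:", ewma_path[-1])
```

The GARCH forecast drifts from its high starting value towards the long-run level, while the smoother forecast stays where it started.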
Modeling GARCH Conditional Covariances
Restricting parameters to not depend on i and j guarantees that the resulting covariance matrix that collects GARCH(1,1) variances and covariances is (semi) positive definite, i.e., that for all possible vectors w,
$$w'\Sigma_{t+1}w \geq 0$$
Why is that relevant? Well, just recall that $\sigma^2_{PF,t+1} = w'\Sigma_{t+1}w$ is a portfolio variance, which cannot be negative
• This SPD condition is ensured by estimating volatilities and covariances in an “internally consistent fashion”
• A sufficient condition for internal consistency is the use of the same $\lambda$ for every volatility and covariance in exponential smoothing
• Similarly, using a GARCH(1,1) model with α and β identical across variances and covariances is sufficient
It is not clear that the persistence parameters $\lambda$, α, and β should be the same for all variances and covariances: need to develop better models
If you listen to trading desk and asset management lingo, they will hardly talk about covariances: instead the focus will be on CORRELATIONS, besides volatility
Dynamic Conditional Correlation Models
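• For instance, one interesting (worrisome) phenomenon is that all correlations tend to “skyrocket” during market crises (bear markets)
• “Skyrocket” is a figure of speech: you do recall that a correlation,
$$\rho_{ij,t+1} = \frac{\sigma_{ij,t+1}}{\sigma_{i,t+1}\sigma_{j,t+1}},$$
belongs to [-1, 1]
• A first, intuitive but mechanical approach consists in applying GARCH models to both variances and covariances in the definition of conditional correlation, e.g., using $\sigma_{ij,t+1} = \omega_{ij} + \alpha R_{i,t}R_{j,t} + \beta\,\sigma_{ij,t}$ in the numerator and the matching GARCH(1,1) variances $\sigma^2_{i,t+1}$ and $\sigma^2_{j,t+1}$ in the denominator
A more fruitful approach still starts from the decomposition
$$\sigma_{ij,t+1} = \sigma_{i,t+1}\sigma_{j,t+1}\rho_{ij,t+1}$$
but it generalizes it to matrix form:
$$\Sigma_{t+1} = D_{t+1}\Gamma_{t+1}D_{t+1}$$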
The dynamic conditional correlation approach is based on the decomposition of the covariance matrix into standard deviations and correlations, $\Sigma_{t+1} = D_{t+1}\Gamma_{t+1}D_{t+1}$
Dynamic Conditional Correlation Models
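• Here $D_{t+1}$ is a matrix of standard deviations, with $\sigma_{i,t+1}$ in the ith position on the diagonal and zero everywhere else
• $\Gamma_{t+1}$ is a matrix of correlations, $\rho_{ij,t+1}$, with ones on the diagonal
• E.g., for n = 2:
$$\Sigma_{t+1} = \begin{bmatrix} \sigma^2_{1,t+1} & \sigma_{12,t+1} \\ \sigma_{12,t+1} & \sigma^2_{2,t+1} \end{bmatrix} = \begin{bmatrix} \sigma_{1,t+1} & 0 \\ 0 & \sigma_{2,t+1} \end{bmatrix} \begin{bmatrix} 1 & \rho_{12,t+1} \\ \rho_{12,t+1} & 1 \end{bmatrix} \begin{bmatrix} \sigma_{1,t+1} & 0 \\ 0 & \sigma_{2,t+1} \end{bmatrix}$$
At this point we proceed in two steps:
① Volatilities of each asset are estimated through GARCH or one of the other methods considered in the first part of the course
② Model conditional covariances of standardized returns derived from the first step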
The DCC approach is based on two steps: modelling the volatility of each individual asset; modelling the covariances of standardized residuals from the first step
Dynamic Conditional Correlation Models
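• Luckily, the conditional covariance of the standardized returns $z_{i,t+1} \equiv R_{i,t+1}/\sigma_{i,t+1}$ equals the conditional correlation of the raw returns:
$$E_t[z_{i,t+1}z_{j,t+1}] = E_t\left[\frac{R_{i,t+1}}{\sigma_{i,t+1}}\frac{R_{j,t+1}}{\sigma_{j,t+1}}\right] = \frac{\sigma_{ij,t+1}}{\sigma_{i,t+1}\sigma_{j,t+1}} = \rho_{ij,t+1}$$
• You need to use an auxiliary variable $q_{ij,t+1}$ to be updated to be able to compute conditional correlations:
$$\rho_{ij,t+1} = \frac{q_{ij,t+1}}{\sqrt{q_{ii,t+1}\,q_{jj,t+1}}}$$
• Why a need for the $q_{ij,t+1}$ auxiliary variable? Because being able to use the ratio above ensures $\rho_{ij,t+1}$ falls in the interval [-1,1]
• At this point write a dynamic model for the conditional value of $q_{ij,t+1}$, like:
$$q_{ij,t+1} = (1-\lambda)\,z_{i,t}z_{j,t} + \lambda\,q_{ij,t}$$
in this case of exponential smoothing with parameter $\lambda$
The DCC approach is based on applying GARCH/RiskMetrics-type models to auxiliary variables that ensure $\rho_{ij,t+1} \in [-1,1]$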
Dynamic Conditional Correlation Models
• An obvious alternative is a GARCH-type dynamic model:
$$q_{ij,t+1} = \bar{\rho}_{ij} + \alpha\,(z_{i,t}z_{j,t} - \bar{\rho}_{ij}) + \beta\,(q_{ij,t} - \bar{\rho}_{ij})$$
where $\bar{\rho}_{ij} = E[z_{i,t}z_{j,t}]$ is the (i,j) element of the unconditional correlation matrix (covariance targeting)
• Notice that the correlation persistence parameters $\alpha$ and $\beta$ are common across i and j: the persistence of the correlation between any two assets in the portfolio is the same.
• It does not, however, imply that the level of the correlations at any time is the same across pairs of assets
• Why the restriction? Usual reason: to guarantee $\rho_{ij,t+1} \in [-1,1]$
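One step of the mean-reverting correlation recursion, followed by the normalization that maps the auxiliary variables into correlations, might look as follows. This is a sketch under assumed values: the target matrix, the parameters, the standardized returns, and the function names are all hypothetical.

```python
from math import sqrt

def dcc_update(q, z, rho_bar, alpha, beta):
    """q_{ij,t+1} = rho_bar_ij + alpha*(z_i z_j - rho_bar_ij) + beta*(q_ij - rho_bar_ij)."""
    n = len(z)
    return [[rho_bar[i][j]
             + alpha * (z[i] * z[j] - rho_bar[i][j])
             + beta * (q[i][j] - rho_bar[i][j])
             for j in range(n)] for i in range(n)]

def to_correlation(q):
    """rho_ij = q_ij / sqrt(q_ii * q_jj); the diagonal becomes exactly one."""
    n = len(q)
    return [[q[i][j] / sqrt(q[i][i] * q[j][j]) for j in range(n)]
            for i in range(n)]

rho_bar = [[1.0, 0.5], [0.5, 1.0]]   # hypothetical unconditional correlations
alpha, beta = 0.05, 0.90             # hypothetical persistence parameters

q = [row[:] for row in rho_bar]      # start the recursion at the target
# Hypothetical standardized returns from first-stage volatility models.
for z in ([2.0, -1.0], [0.3, 0.4], [-1.5, -1.2]):
    q = dcc_update(q, z, rho_bar, alpha, beta)
    print(round(to_correlation(q)[0][1], 4))
```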
Dynamic Conditional Correlation Models
• An important feature of these models is that the matrix $Q_{t+1}$ collecting the $q_{ij,t+1}$ auxiliary variables,
$$Q_{t+1} = (1-\alpha-\beta)\,E[z_t z_t'] + \alpha\,z_t z_t' + \beta\,Q_t,$$
will be positive definite as it is a weighted average of positive semi-definite and positive definite matrices
• This will in turn ensure that the correlation matrix $\Gamma_{t+1}$ and the covariance matrix, $\Sigma_{t+1}$, will be positive semi-definite
• DCC models are enjoying a massive popularity because they are easy to implement in 3 steps:
• First, all the individual variances are estimated one by one
• Second, the returns are standardized and the unconditional correlation matrix is estimated
• Third, the correlation persistence parameters $\alpha$ and $\beta$ (or $\lambda$) are estimated
The DCC approach guarantees that the estimated/predicted covariance matrices are always (semi) positive-definite
Estimation of DCC Models
• Only a few parameters are estimated simultaneously using numerical optimization
• This feature makes DCC models extremely tractable for risk management of large portfolios
• Yes, but how do you estimate the parameters?
• Simple, just use QMLE methods “in waves”, i.e., sequentially
• Why straight QML? Because you want to be able to build standardized residuals from some GARCH volatility models first and then proceed to estimate $\alpha$ and $\beta$ (or $\lambda$) for dynamic correlations
It is QMLE because you want to estimate conditioning on some first-stage parameter estimates of GARCH models for volatility
• E.g., in the case of n = 2 you maximize
$$\ln L_c = -\frac{1}{2}\sum_{t=1}^{T}\left[\ln(1-\rho^2_{12,t}) + \frac{z^2_{1,t} + z^2_{2,t} - 2\rho_{12,t}\,z_{1,t}z_{2,t}}{1-\rho^2_{12,t}}\right]$$
DCC models are estimated by QMLE because you first estimate first-stage GARCH parameters and then DCC ones conditioning on first-stage ML estimates
Estimation of DCC Models
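• To initialize the recursion (e.g., in the case of RiskMetrics, where $q_{ij,t+1} = (1-\lambda)z_{i,t}z_{j,t} + \lambda q_{ij,t}$ and the likelihood is maximized over $\lambda$), set
$$q_{11,0} = 1, \quad q_{22,0} = 1, \quad q_{12,0} = \frac{1}{T}\sum_{t=1}^{T} z_{1,t}z_{2,t}$$
• Notice that the variables that enter the likelihood are the rescaled returns, $z_t$, and not the original raw returns, $R_t$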
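The second estimation stage for n = 2 can be sketched as a one-parameter search. The code below evaluates the correlation part of the Gaussian quasi-likelihood under the exponential-smoothing recursion and picks the best smoothing parameter on a coarse grid. Assumptions are stated loudly: the standardized returns are made up, the grid replaces a proper numerical optimizer, and the initial values are used as the q variables for the first observation.

```python
from math import log, sqrt

def correlation_loglik(lam, z1, z2):
    """Correlation part of the Gaussian quasi-log-likelihood for n = 2."""
    T = len(z1)
    # Initial values: unit q11 and q22, sample cross moment for q12.
    q11, q22 = 1.0, 1.0
    q12 = sum(a * b for a, b in zip(z1, z2)) / T
    total = 0.0
    for a, b in zip(z1, z2):
        rho = q12 / sqrt(q11 * q22)
        if abs(rho) >= 1.0:
            raise ValueError("correlation left (-1, 1); check the inputs")
        total += log(1 - rho ** 2) + (a * a + b * b - 2 * rho * a * b) / (1 - rho ** 2)
        # Exponential smoothing (RiskMetrics-type) update of the q variables.
        q11 = (1 - lam) * a * a + lam * q11
        q22 = (1 - lam) * b * b + lam * q22
        q12 = (1 - lam) * a * b + lam * q12
    return -0.5 * total

# Hypothetical standardized returns from first-stage GARCH models.
z1 = [0.5, -1.2, 0.3, 1.5, -0.7, 0.1, -0.9, 1.1]
z2 = [0.4, -0.8, -0.2, 1.0, -1.1, 0.6, -0.3, 0.9]

grid = [0.90, 0.92, 0.94, 0.96, 0.98]
best = max(grid, key=lambda lam: correlation_loglik(lam, z1, z2))
print("lambda maximizing the likelihood on the grid:", best)
```

Only the rescaled returns enter the function; the raw returns and the first-stage volatility parameters are taken as given.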
Multivariate GARCH Models: VECH(1,1)
Multivariate GARCH models are in spirit similar to their univariate counterparts, except that they also specify conditional covariance functions, i.e., covariances directly move over time
Several different multivariate GARCH formulations exist, e.g., VECH, the diagonal VECH and the BEKK models
• Below it is assumed for simplicity that there are n = 2 assets
A VECH(1,1) model is specified as:
$$VECH(H_t) = C + A\,VECH(\Xi_{t-1}\Xi'_{t-1}) + B\,VECH(H_{t-1}), \qquad \Xi_t \mid \psi_{t-1} \sim N(0, H_t)$$
• $H_t$ is a conditional covariance matrix, $\Xi_t$ is an innovation (disturbance) vector, $\psi_{t-1}$ represents the information set at time t − 1
• C is a 3 × 1 parameter vector, A and B are 3 × 3 parameter matrices and VECH(·) denotes the column-stacking operator applied to the upper portion of the symmetric matrix $H_t$
• The model requires the estimation of 21 parameters (3 in C, 9 each in A and B), a lot!
A VECH GARCH(1,1) model is based on the idea of modeling covariance matrices as column-stacked vectors
Multivariate GARCH Models: VECH(1,1)
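• How the VECH operator works is shown below, with $\Xi_t = (u_{1t}, u_{2t})'$:
$$VECH(H_t) = \begin{bmatrix} h_{11t} \\ h_{22t} \\ h_{12t} \end{bmatrix}, \qquad VECH(\Xi_t\Xi'_t) = \begin{bmatrix} u^2_{1t} \\ u^2_{2t} \\ u_{1t}u_{2t} \end{bmatrix}$$
• The elements for the case n = 2 are written out below:
$$h_{11t} = c_{11} + a_{11}u^2_{1t-1} + a_{12}u^2_{2t-1} + a_{13}u_{1t-1}u_{2t-1} + b_{11}h_{11t-1} + b_{12}h_{22t-1} + b_{13}h_{12t-1}$$
$$h_{22t} = c_{21} + a_{21}u^2_{1t-1} + a_{22}u^2_{2t-1} + a_{23}u_{1t-1}u_{2t-1} + b_{21}h_{11t-1} + b_{22}h_{22t-1} + b_{23}h_{12t-1}$$
$$h_{12t} = c_{31} + a_{31}u^2_{1t-1} + a_{32}u^2_{2t-1} + a_{33}u_{1t-1}u_{2t-1} + b_{31}h_{11t-1} + b_{32}h_{22t-1} + b_{33}h_{12t-1}$$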
• Conditional variances and conditional covariances depend on the lagged values of all of the conditional variances of, and conditional covariances between, all of the asset returns in the series, as well as the lagged squared errors and the error cross-products
• As n increases, the estimation of the VECH model quickly becomes infeasible
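To see how quickly the parameter count grows, and how one VECH(1,1) step works for n = 2, consider the sketch below. The stacking order (two variances first, then the covariance) and all parameter values are assumptions made for illustration, and the function names are hypothetical:

```python
def vech_2x2(h):
    """Stack a symmetric 2x2 matrix as (h11, h22, h12)."""
    return [h[0][0], h[1][1], h[0][1]]

def vech_param_count(n, diagonal=False):
    """Parameters in C, A and B when VECH(H_t) has k = n(n+1)/2 elements."""
    k = n * (n + 1) // 2
    return 3 * k if diagonal else k + 2 * k * k

def vech_step(c, a, b, h_prev, u_prev):
    """VECH(H_t) = C + A * VECH(u u') + B * VECH(H_{t-1}) for n = 2."""
    uu = [u_prev[0] ** 2, u_prev[1] ** 2, u_prev[0] * u_prev[1]]
    hv = vech_2x2(h_prev)
    return [c[i]
            + sum(a[i][k] * uu[k] for k in range(3))
            + sum(b[i][k] * hv[k] for k in range(3))
            for i in range(3)]

for n in (2, 5, 15):
    print(n, vech_param_count(n), vech_param_count(n, diagonal=True))

# Hypothetical diagonal parameter matrices, previous covariance and shocks.
c = [1e-6, 1e-6, 5e-7]
a = [[0.05, 0, 0], [0, 0.05, 0], [0, 0, 0.05]]
b = [[0.90, 0, 0], [0, 0.90, 0], [0, 0, 0.90]]
h_prev = [[4e-4, 1e-4], [1e-4, 2e-4]]
print(vech_step(c, a, b, h_prev, u_prev=[0.02, -0.01]))
```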
Multivariate GARCH: Diagonal VECH and BEKK
The VECH conditional covariance matrix is often restricted so that A and B are assumed to be diagonal
• This reduces the number of parameters to be estimated to 9 (A and B each have 3 elements) and the model, known as a diagonal VECH, is:
$$h_{ij,t} = \omega_{ij} + \alpha_{ij}\,u_{i,t-1}u_{j,t-1} + \beta_{ij}\,h_{ij,t-1}, \qquad i, j = 1, 2$$
A disadvantage of the VECH model is that there is no guarantee of a positive semi-definite covariance matrix
• It is this property which ensures that, whatever the weight of each asset in the portfolio, an estimated value-at-risk is always positive
• The BEKK model addresses the difficulty with VECH of ensuring that the H matrix is always positive definite:
$$H_t = W'W + A'H_{t-1}A + B'\,\Xi_{t-1}\Xi'_{t-1}B$$
• A and B are 2 × 2 matrices of parameters and W is upper triangular
A Diagonal VECH GARCH(1,1) model is a VECH(1,1) in which the matrices of parameters are restricted to be diagonal
Estimation of Multivariate GARCH Models
The positive definiteness of the covariance matrix is ensured owing to the quadratic nature of the terms on the equation’s RHS
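The role of the quadratic terms can be illustrated with one BEKK step for n = 2, written with plain lists so that no library is required. The parameter matrices, the previous covariance matrix, and the shocks are hypothetical, and the definiteness check is the simple 2 × 2 test based on the diagonal and the determinant:

```python
def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def add(*mats):
    return [[sum(m[i][j] for m in mats) for j in range(len(mats[0][0]))]
            for i in range(len(mats[0]))]

def bekk_step(w, a, b, h_prev, xi_prev):
    """H_t = W'W + A' H_{t-1} A + B' xi xi' B, with xi the lagged shock vector."""
    outer = [[xi_prev[i] * xi_prev[j] for j in range(2)] for i in range(2)]
    return add(matmul(transpose(w), w),
               matmul(matmul(transpose(a), h_prev), a),
               matmul(matmul(transpose(b), outer), b))

def is_psd_2x2(h, tol=1e-18):
    det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
    return h[0][0] >= -tol and h[1][1] >= -tol and det >= -tol

w = [[0.010, 0.002], [0.0, 0.008]]   # upper triangular, hypothetical
a = [[0.90, 0.05], [-0.03, 0.85]]    # hypothetical, signs unrestricted
b = [[0.30, -0.10], [0.05, 0.25]]    # hypothetical, signs unrestricted
h_prev = [[4e-4, 1e-4], [1e-4, 2e-4]]

h_next = bekk_step(w, a, b, h_prev, xi_prev=[0.02, -0.015])
print(h_next, is_psd_2x2(h_next))
```

Each of the three terms has the form of a matrix times its own transpose, or of a positive semi-definite matrix sandwiched between a matrix and its transpose, which is why no sign restrictions on A and B are needed.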
Under the assumption of conditional normality, the parameters of the multivariate GARCH models of any of the above specifications can be estimated by maximizing the log-likelihood function
$$\ell(\theta) = -\frac{TN}{2}\log 2\pi - \frac{1}{2}\sum_{t=1}^{T}\left(\log|H_t| + \Xi'_t H_t^{-1}\Xi_t\right)$$
• θ denotes all the unknown parameters to be estimated, N is the number of assets (i.e. the number of series in the system), and T is the number of observations
• The MLE for θ is asymptotically normal
• The additional complexity and extra parameters involved compared with univariate models make estimation a computationally more difficult task, although the principles are the same
Multivariate GARCH models are estimated by (Q)MLE, which yields consistent and asymptotically normal estimators
Reading List/How to prepare the exam
Carefully read these Lecture Slides + class notes
Possibly read CHRISTOFFERSEN, chapter 7
Possibly read BROOKS, chapter 9 (sections 21-30)
Lecture Notes are available on Prof. Guidolin’s personal web page
Andersen T., T. Bollerslev, and F. Diebold (2009) “Parametric and Nonparametric Volatility Measurement”, in Ait-Sahalia, Y. and L. P. Hansen (eds.), Handbook of Financial Econometrics, Elsevier.
Litterman R. and K. Winkelmann (1998) “Estimating Covariance Matrices”, Goldman Sachs Quantitative Strategies Research Notes.