corresponding author – [email protected]
On the Derivation of the Black-Litterman Equation for Expected Excess Returns [1]
Harald Bogner
First version: May 17th 2015 This version: October 18th 2015
The Black-Litterman approach is a method to combine subjective views about the distributions of expected excess returns of portfolios consisting of subsets of available assets with distributions of risk premiums implied in current market prices for all available assets, under an assumption about expected excess returns in equilibrium. The method has been explained and discussed extensively in the literature. [2] The following therefore presents the approach only in a very compressed “executive summary style”. The focus of this text is instead on the derivation of a main result of the Black-Litterman model, their formula for expected excess returns. The derivation of this result was not given in the original papers by Black and Litterman. Black and Litterman suggested either using the Theil mixed estimation method to derive the formula, or “the Black-Litterman approach” [3], which is only very briefly explained in the appendix of one of their original papers on the model. [4] An endnote in another one of their papers referred to a “Bayesian approach”, and a later publication by He and Litterman also gave a hint, stating that the model “uses the Bayesian approach to infer the assets’ expected returns”, but neither provided a comprehensive derivation. [5] Several proofs using Bayes' theorem have been presented by other authors. [6] The derivation given in the following differs from these by relying explicitly on general results regarding the properties of the product of two different densities for the same multivariate normal random vector. For convenience, a
[1] The equation referred to here is given in step 8 of the appendix on page 42 of “Global Portfolio Optimization” by Fischer Black and Robert Litterman, published in the Financial Analysts Journal, 48, 5, Sep/Oct 1992, p. 28-43. This paper (in the following referred to as Black/Litterman 1992) is also the main source of the model description given and notation used here.
[2] For a detailed and comprehensive overview see the blog www.blacklitterman.org by Jay Walters.
[3] Black/Litterman 1992, p. 35.
[4] Ibid., appendix, p. 42.
[5] See endnote 11 in “Asset Allocation”, The Journal of Fixed Income, September 1991, by Fischer Black and Robert B. Litterman, and page 2 of “The Intuition Behind Black-Litterman Model Portfolios” by Guangliang He and Robert Litterman, available at SSRN: http://ssrn.com/abstract=334304 or http://dx.doi.org/10.2139/ssrn.334304, in the following referred to as He/Litterman. The above and all following online references were retrieved on May 17th 2015.
[6] Again, for a comprehensive overview see www.blacklitterman.org.
proof of these results, which follows the given references and relies on the information form [7] for normal densities, is shown in the appendix. [8]

Market-implied risk premiums

The Black-Litterman method is based on the assumption that, from the investor's perspective, not only the assets’ excess returns (i.e. returns after deducting the risk-free rate) but also the risk premiums (i.e. expected excess returns) are random. More precisely: the risk premiums, in the following written as the random vector $M$, with $\mu$ being the “realization” of $M$, i.e. the true risk premiums vector, have a multivariate normal distribution. This randomness of risk premiums can be considered as the result of market prices fluctuating around equilibrium prices. By assuming that the market is currently in a CAPM equilibrium, and with (an estimate for) the market price of risk (aka the market risk aversion parameter) and a known covariance matrix of excess returns $\Sigma$, “equilibrium risk premiums” can be derived, i.e. risk premiums that would be the true risk premiums if the market was in equilibrium at current price levels. [9] [10]

- This vector of equilibrium risk premiums is in the following written as $\pi$. Black and Litterman suggest using this vector as the mean vector of a market-implied distribution of risk premiums, with a covariance matrix that is proportional to the covariance matrix of excess returns, i.e. every covariance between two risk premiums (including the variances) is the covariance of the corresponding excess returns times a constant $\tau$.
- For the market-implied distribution, the covariance matrix of risk premiums is hence written as $\tau\Sigma$.

In summary, while the true risk premiums are unknown, as one possible model one can estimate a multivariate normal distribution for the vector of risk premiums with a mean corresponding to risk premiums implied in current market prices under an equilibrium assumption (the equilibrium risk premiums), and a covariance matrix proportional to the (estimated) covariance matrix of excess returns:

I. $M \sim N(\pi, \tau\Sigma)$
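As a minimal numerical sketch of the market-implied distribution, the following Python snippet builds its parameters for three hypothetical assets. The reverse-optimization relation `pi = delta * Sigma @ w_mkt` is the standard CAPM-based formula discussed in He/Litterman and is an assumption here, as the text above does not spell it out; all numbers and the names `w_mkt`, `delta` and `tau` are purely illustrative.

```python
import numpy as np

# Hypothetical inputs for three assets (illustrative numbers, not from the paper).
Sigma = np.array([[0.040, 0.012, 0.006],
                  [0.012, 0.025, 0.008],
                  [0.006, 0.008, 0.010]])   # covariance matrix of excess returns
w_mkt = np.array([0.5, 0.3, 0.2])           # assumed market-capitalisation weights
delta = 2.5                                 # assumed market risk aversion parameter
tau = 0.05                                  # assumed proportionality constant

# Reverse optimisation (assumption): equilibrium risk premiums implied by the market weights.
pi = delta * Sigma @ w_mkt

# Parameters of the market-implied (prior) distribution M ~ N(pi, tau * Sigma).
prior_mean = pi
prior_cov = tau * Sigma

print("equilibrium risk premiums:", prior_mean)
print("prior covariance matrix:\n", prior_cov)
```

The resulting `prior_mean` and `prior_cov` play the roles of the mean vector and the covariance matrix of distribution I.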
[7] One could certainly argue that the general derivation of the information form from the moment form is part of the proof, and depending on how detailed one formulates these steps, it may actually not be much shorter than the proofs given in other sources. However, even viewed like this, it may still provide an alternative structure that may contribute to a comprehensive understanding of the Black-Litterman result. Again for convenience and relying on hints in a reference given there, section A.I of the appendix shows how a density function written in information form can be derived from a density function written in moment form.
[8] Note that the last part of the appendix, the rewriting of the scaling factor, is given only for the sake of completeness; it is not relevant in the context of the Black-Litterman equation.
[9] The equilibrium Black/Litterman refer to is an extension of the Sharpe-Lintner-Mossin-Treynor CAPM, with the currency market also being in equilibrium, i.e. there is an equilibrium level of currency hedging, as described in a 1989 paper titled “Universal Hedging: Optimizing Currency Risk and Reward in International Equity Portfolios” by Fischer Black and derived in “Equilibrium Exchange Rate Hedging”, a 1989 working paper by the same author.
[10] This requires the estimation of a risk aversion parameter for the market (aka “market price of risk”; see the corresponding section in the post on Froot/Stein and Treynor/Black in this blog and the references given there), see He/Litterman page 3.
The distribution I. will in the following be referred to as the “market-implied” or, following He and Litterman, “prior” distribution for risk premiums.

Risk premiums implied in the investor’s subjective views

Besides the market-implied distribution, an investor may also have individual subjective views on one or more linear combinations of the assets’ risk premiums, i.e. on risk premiums of portfolios (consisting of long and short positions). [11] Black and Litterman express these views as follows:
- $P$ is a $k \times n$ matrix of weights in those portfolios specified by the investor, where $k$ is the total number of views and $n$ is the number of assets in the market.
- The views hold for the true vector of asset risk premiums $\mu$, i.e. the “realization” of $M$, and the elements of $P\mu$ are hence the values expected by the investor for the excess returns of the portfolios on which the investor has a view.

While the investor does not know the true risk premium for any of the portfolios, they know a model of their distribution, conditional on $\mu$:

II. $P\mu = q + \varepsilon$

where $q$ is a $k$-dimensional (column) vector of constants, known by the investor. The elements of the vector $\varepsilon$ are random and normally distributed with mean zero and a diagonal covariance matrix $\Omega$, i.e.:

$\varepsilon \sim N(0, \Omega)$, such that

III. $P\mu \sim N(q, \Omega)$

$P\mu = q + \varepsilon$ can also be (approximately) solved for the expected return vector:

IV. $\mu = P^{+}(q + \varepsilon)$
[11] A view on a single asset could be understood as a special case of a linear combination, where all the weights of other assets are zero. The interpretation of views as portfolios was introduced in He/Litterman.
where $P^{+}$ is the pseudoinverse of $P$. [12] This vector describes the investor’s information on risk premiums implied in their views. The vector will only contain risk premiums for assets which are affected by the investor’s views, and zeros for other assets.

From IV it follows that this expected return vector is random with a multivariate normal distribution: [13]

V. $M_v \sim N\left(P^{+}q,\; P^{+}\Omega P^{+T}\right)$

where $M_v$ is the random expected return vector, and the realization of $M_v$, i.e. the true expected return vector implied in the views, is labelled $\mu_v$. Here the index $v$ indicates that these are the view-implied parameters for the distribution of risk premiums. Note that the covariance matrix will include zeros for assets not affected by the views.

As the views are conditional on the unknown true risk premiums vector $\mu$, the risk premiums vector implied in the views is also conditional on $\mu$. The density function for the expected excess return vector $\mu_v$ implied in the views and conditional on the unknown true risk premiums vector $\mu$ is hence in the following written as:

$f_{M_v}(M_v \mid \mu)$
In summary, the investor has views in the form of a distribution for the risk premiums of one or more
portfolios formed with one or more of the assets available in the market. This distribution is
conditional on the true vector of risk premiums. It implies a conditional probability density function
for the random vector of asset risk premiums.
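The view model can be illustrated with a small numerical sketch. The matrix `P`, the vector `q` and the matrix `Omega` below are hypothetical values for two views on three assets, and the right inverse is used as pseudoinverse, which assumes that `P` has full row rank.

```python
import numpy as np

# Hypothetical example: n = 3 assets, k = 2 views (numbers are illustrative only).
P = np.array([[1.0, -1.0, 0.0],    # view 1: long asset 1, short asset 2
              [0.0,  0.0, 1.0]])   # view 2: absolute view on asset 3
q = np.array([0.02, 0.03])         # expected excess returns of the two view portfolios
Omega = np.diag([0.001, 0.002])    # diagonal covariance matrix of the view errors

# With full row rank, the right inverse P^T (P P^T)^{-1} is the pseudoinverse of P.
P_plus = P.T @ np.linalg.inv(P @ P.T)

# Parameters of the view-implied distribution of asset risk premiums.
view_mean = P_plus @ q
view_cov = P_plus @ Omega @ P_plus.T

print("pseudoinverse:\n", P_plus)
print("view-implied mean:", view_mean)
print("view-implied covariance:\n", view_cov)
```

In this example the view-implied covariance matrix is singular, since only two independent views are held on three assets.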
The two models for risk premiums discussed above, the market-implied and the subjective views-
implied risk premiums can be combined with the Bayes rule for probability density functions, as
described in the following.
Combining the market-implied and views-implied models for expected excess returns
According to the Bayes formula for densities [14], the density of the true vector of risk premiums
conditional on the risk premiums implied in the views, i.e. the posterior density function for the risk
premiums, is the product of the market-implied prior density and the conditional density implied in the
[12] As $P$ has full row rank (no view is a linear combination of one or more other views), the right-inverse $P^{T}(PP^{T})^{-1}$ could be used as pseudoinverse. Solving the equation as above with the pseudoinverse results in an approximation of $\mu$, such that the Euclidean norm of the distance between the estimate of $\mu$ based on the multiplication with $P^{+}$ and the true $\mu$ cannot be made smaller by changing the estimate of $\mu$. See page 451, “The Operator Theory of the Pseudo-Inverse, I. Bounded Operators”, by Frederick J. Beutler, Journal of Mathematical Analysis and Applications, 10, 1965, p. 451-470. Here however, due to a 2nd inversion at a later stage, which reverses the inversion here, it is not necessary to actually find the pseudoinverse.
[13] As $\Omega$ is diagonal, the covariance matrix of the elements of the vector $P^{+}\varepsilon$ can be written in matrix notation as $P^{+}\Omega P^{+T}$.
[14] See for example http://www.math.uah.edu/stat/dist/Conditional.html
views for the asset risk premiums, divided by the unconditional density of the risk premiums implied in
the views:
$$ f_M(\mu \mid M_v=\mu_v) = \frac{f_M(\mu)\, f_{M_v}(M_v=\mu_v \mid \mu)}{f_{M_v}(M_v=\mu_v)} $$

As shown in section A.II of the appendix (and the references given there), the numerator of the fraction on the right hand side of this equation is a Gaussian. For a given $\mu_v$, the denominator of the fraction on the right hand side, $f_{M_v}(M_v=\mu_v)$, is a constant. Further, as the Bayes rule for densities implies that the left hand side is a density function, the value of $f_{M_v}(M_v=\mu_v)$ must ensure that the right hand side is a proper density function as well, in particular that it integrates to 1. I.e. $1 / f_{M_v}(M_v=\mu_v)$ takes the role of a normalizing factor, and the right hand side is not only a Gaussian, but a density function of a multivariate normal distribution. This holds for every $\mu_v$, so that (now slightly changing the notation to emphasize that $\mu_v$ is variable),

$$ f_M(\mu \mid \mu_v) = \frac{f_M(\mu)\, f_{M_v}(\mu_v \mid \mu)}{f_{M_v}(\mu_v)} $$
is the probability density function of a variable with a multivariate normal distribution. The parameters of this distribution, i.e. its mean vector and covariance matrix, are given by the general equations for parameters of products of multivariate normal densities, which are derived in sections A.II and A.III of the appendix following the references given there. The equations are (A.9.d in section A.III of the appendix):

$$ \Sigma = \left(\Sigma_1^{-1} + \Sigma_2^{-1}\right)^{-1} $$
$$ \mu = \left(\Sigma_1^{-1} + \Sigma_2^{-1}\right)^{-1}\left(\Sigma_1^{-1}\mu_1 + \Sigma_2^{-1}\mu_2\right) $$
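To apply these results here, use the covariance matrices given in equations I. and V. to specify the covariance matrix of the posterior distribution: [15]

$$ \left[(\tau\Sigma)^{-1} + \left(P^{+}\Omega P^{+T}\right)^{-1}\right]^{-1} = \left[(\tau\Sigma)^{-1} + P^{T}\Omega^{-1}P\right]^{-1} $$

And with that, and using $PP^{+} = I$ (as $P$ has full row rank), the mean vector of the posterior distribution becomes:

$$ \bar{\mu} = \left[(\tau\Sigma)^{-1} + P^{T}\Omega^{-1}P\right]^{-1}\left[(\tau\Sigma)^{-1}\pi + P^{T}\Omega^{-1}q\right] $$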
[15] Note for the following that the (pseudo)inverse of a pseudoinverse of a matrix $A$ is the original matrix $A$, i.e. $(A^{+})^{+} = A$.
which is the Black-Litterman equation for expected excess returns.
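A minimal numerical sketch of this result, assuming hypothetical inputs for three assets and two views, is given below. It evaluates the posterior mean and covariance in the precision-weighted form and, as a plausibility check, compares them with the algebraically equivalent update form `pi + tau Sigma P^T (P tau Sigma P^T + Omega)^{-1} (q - P pi)`. That second form is not derived in this text; it follows from the matrix inversion lemma and is used here only as a cross-check.

```python
import numpy as np

# Hypothetical inputs for n = 3 assets and k = 2 views (illustrative numbers only).
Sigma = np.array([[0.040, 0.012, 0.006],
                  [0.012, 0.025, 0.008],
                  [0.006, 0.008, 0.010]])    # covariance matrix of excess returns
pi = np.array([0.062, 0.03775, 0.0185])      # equilibrium risk premiums
tau = 0.05                                   # assumed proportionality constant
P = np.array([[1.0, -1.0, 0.0],
              [0.0,  0.0, 1.0]])             # view portfolios
q = np.array([0.02, 0.03])                   # view values
Omega = np.diag([0.001, 0.002])              # covariance matrix of the view errors

prior_cov = tau * Sigma
prior_precision = np.linalg.inv(prior_cov)   # (tau Sigma)^{-1}
Omega_inv = np.linalg.inv(Omega)
view_precision = P.T @ Omega_inv @ P         # P^T Omega^{-1} P

# Precision-weighted form of the posterior parameters.
post_cov = np.linalg.inv(prior_precision + view_precision)
post_mean = post_cov @ (prior_precision @ pi + P.T @ Omega_inv @ q)

# Equivalent update form (an identity that is not derived in the text).
gain = prior_cov @ P.T @ np.linalg.inv(P @ prior_cov @ P.T + Omega)
post_mean_alt = pi + gain @ (q - P @ pi)
post_cov_alt = prior_cov - gain @ P @ prior_cov

print("posterior mean:", post_mean)
print("both forms agree:", np.allclose(post_mean, post_mean_alt))
```

The precision-weighted form requires inverting an n x n matrix, while the update form only inverts a k x k matrix, which can be convenient when there are few views on many assets.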
Appendix [16]
For the multiplication of densities in the next section, the following describes the information form for
densities.
A.I Density of a multivariate normal distribution in information form
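The density function $f_i$ of a multivariate-normally distributed $n$-dimensional vector $x = (x_1, \ldots, x_j, \ldots, x_n)^T$ is: [17]

A.1  $f_i(x) = \dfrac{1}{(2\pi)^{n/2}\,|\Sigma_i|^{1/2}}\; e^{-\frac{1}{2}(x-\mu_i)^T\Sigma_i^{-1}(x-\mu_i)}$

with a mean vector $\mu_i$ and a covariance matrix $\Sigma_i$.

Defining the variables

A.2  $\eta_i = \Sigma_i^{-1}\mu_i, \qquad \Lambda_i = \Sigma_i^{-1}$

and

A.3  $\zeta_i = -\frac{1}{2}\left(n\ln(2\pi) - \ln|\Lambda_i| + \eta_i^T\Lambda_i^{-1}\eta_i\right)$

the density can be written in the so-called information form (aka natural form or canonical form) [18]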
[16] As all functions appearing in the whole following text are densities on the same random vector, the notation is simplified: instead of $f_X(X=x)$, and using e.g. $g_X(X=x)$ for another density function for the same random vector, $f_i(x)$ is written, where different values for the index $i$ are used to distinguish different densities on the same random vector $x$.
[17] As in this first section of the appendix only one density is considered, the index $i$ would not be necessary and may be removed in future versions of this document, to then be reintroduced in the next section on products of densities.
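$f_i(x) = e^{\zeta_i + \eta_i^T x - \frac{1}{2}x^T\Lambda_i x}$

Proof:

Perform multiplications in the exponent of A.1:

$(x-\mu)^T\Sigma^{-1}(x-\mu) = (x^T - \mu^T)(\Sigma^{-1}x - \Sigma^{-1}\mu)$

A.4  $= x^T\Sigma^{-1}x - x^T\Sigma^{-1}\mu - \mu^T\Sigma^{-1}x + \mu^T\Sigma^{-1}\mu$

with

$(x^T\Sigma^{-1}\mu)^T = \mu^T\Sigma^{-1}x$

and as $x^T\Sigma^{-1}\mu$ is a scalar:

$x^T\Sigma^{-1}\mu = (x^T\Sigma^{-1}\mu)^T = \mu^T\Sigma^{-1}x$,

A.4 becomes:

$x^T\Sigma^{-1}x - x^T\Sigma^{-1}\mu - \mu^T\Sigma^{-1}x + \mu^T\Sigma^{-1}\mu = x^T\Sigma^{-1}x - 2\mu^T\Sigma^{-1}x + \mu^T\Sigma^{-1}\mu$

and hence the exponent in A.1 is:

$-\frac{1}{2}(x-\mu_i)^T\Sigma_i^{-1}(x-\mu_i) = -\frac{1}{2}\left(x^T\Sigma_i^{-1}x - 2\mu_i^T\Sigma_i^{-1}x + \mu_i^T\Sigma_i^{-1}\mu_i\right) = -\frac{1}{2}x^T\Sigma_i^{-1}x + \mu_i^T\Sigma_i^{-1}x - \frac{1}{2}\mu_i^T\Sigma_i^{-1}\mu_i$

so that A.1 becomes: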
[18] See e.g. p. 4 in “Some Properties Of the Gaussian Distribution”, Jianxin Wu, GVU Center and College of Computing, Georgia Institute of Technology, April 22, 2004, http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.1.2635&rep=rep1&type=pdf, and p. 2 in “Manipulating the Multivariate Gaussian Density”, Thomas B. Schoen and Fredrik Lindsten, Division of Automatic Control, Linkoeping University, January 11, 2011, http://user.it.uu.se/~thosc112/pubpdf/schonl2011.pdf.
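A.5  $f_i(x) = \dfrac{e^{-\frac{1}{2}\mu_i^T\Sigma_i^{-1}\mu_i}}{(2\pi)^{n/2}\,|\Sigma_i|^{1/2}}\; e^{-\frac{1}{2}x^T\Sigma_i^{-1}x + \mu_i^T\Sigma_i^{-1}x}$

Because

$|\Sigma^{-1}| = |\Sigma|^{-1}$

the fraction on the left can be written as follows:

A.6  $\dfrac{e^{-\frac{1}{2}\mu^T\Sigma^{-1}\mu}}{(2\pi)^{n/2}\,|\Sigma|^{0.5}} = e^{-\frac{1}{2}\mu^T\Sigma^{-1}\mu}\; e^{-\frac{n}{2}\ln(2\pi)}\; e^{\frac{1}{2}\ln|\Sigma^{-1}|} = e^{-\frac{1}{2}\left(n\ln(2\pi) - \ln|\Sigma^{-1}| + \mu^T\Sigma^{-1}\mu\right)}$,

and with the expressions introduced in A.2:

$\eta_i = \Sigma_i^{-1}\mu_i, \qquad \Lambda_i = \Sigma_i^{-1}$

the exponent on the right hand side of A.6 can be written as in A.3:

$-\frac{1}{2}\left(n\ln(2\pi) - \ln|\Sigma_i^{-1}| + \mu_i^T\Sigma_i^{-1}\mu_i\right) = -\frac{1}{2}\left(n\ln(2\pi) - \ln|\Lambda_i| + \eta_i^T\Lambda_i^{-1}\eta_i\right) = \zeta_i$

and the density A.5 can be expressed as: [19]

A.7  $f_i(x) = e^{\zeta_i + \eta_i^T x - \frac{1}{2}x^T\Lambda_i x}$

which completes the proof.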
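The equivalence of the moment form and the information form can also be checked numerically. The sketch below is only an illustration with hypothetical parameter values; the names `eta`, `Lam` and `zeta` stand for the precision-weighted mean, the precision matrix (inverse covariance matrix) and the log-normalizer of the information form.

```python
import numpy as np

def moment_pdf(x, mu, Sigma):
    """Multivariate normal density at x in moment form (mean and covariance)."""
    n = len(mu)
    d = x - mu
    norm_const = (2 * np.pi) ** (n / 2) * np.sqrt(np.linalg.det(Sigma))
    return np.exp(-0.5 * d @ np.linalg.solve(Sigma, d)) / norm_const

def information_pdf(x, mu, Sigma):
    """The same density in information form: exp(zeta + eta^T x - 0.5 x^T Lam x)."""
    n = len(mu)
    Lam = np.linalg.inv(Sigma)          # precision matrix
    eta = Lam @ mu                      # precision-weighted mean
    zeta = -0.5 * (n * np.log(2 * np.pi)
                   - np.log(np.linalg.det(Lam))
                   + eta @ np.linalg.solve(Lam, eta))
    return np.exp(zeta + eta @ x - 0.5 * x @ Lam @ x)

# Hypothetical parameters and evaluation point.
mu = np.array([0.05, 0.02])
Sigma = np.array([[0.04, 0.01],
                  [0.01, 0.09]])
x = np.array([0.10, -0.03])

value_moment = moment_pdf(x, mu, Sigma)
value_information = information_pdf(x, mu, Sigma)
print(value_moment, value_information)
```

Both functions should return the same value up to floating-point rounding for any admissible `x`, `mu` and positive definite `Sigma`.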
A.II Product of two multivariate normal densities:
Using A.7, a product of two multivariate normal densities for the same random vector x can be written
as:
A.8  $f_1(x)\,f_2(x) = e^{\zeta_1 + \zeta_2 + (\eta_1+\eta_2)^T x - \frac{1}{2}x^T(\Lambda_1+\Lambda_2)x}$
[19] See for example the references given in footnote 18 and also page 5 in “Products and Convolutions of Gaussian Probability Density Functions”, P.A. Bromiley, Imaging Sciences Research Group, Institute of Population Health, School of Medicine, University of Manchester, Tina Memo No. 2003-003, Internal Report, last updated 14 / 8 / 2014, http://www.tina-vision.net/docs/memos/2003-003.pdf, in the following referred to as “Bromiley”.
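with

A.8a  $\eta = \eta_1 + \eta_2, \qquad \Lambda = \Lambda_1 + \Lambda_2$

and $\zeta$ defined analogously to A.3:

$\zeta = -\frac{1}{2}\left(n\ln(2\pi) - \ln|\Lambda| + \eta^T\Lambda^{-1}\eta\right)$,

A.8 can be expressed as: [20]

A.9  $f_1(x)\,f_2(x) = e^{\zeta_1+\zeta_2-\zeta}\; e^{\zeta + \eta^T x - \frac{1}{2}x^T\Lambda x}$

where

A.9.a  $e^{\zeta + \eta^T x - \frac{1}{2}x^T\Lambda x}$

is a Gaussian pdf as defined by A.7 and the expressions introduced in A.2 and A.3.

The product of the two densities A.8 is hence a scaled Gaussian pdf and can be written as:

A.9.b  $e^{\zeta_1+\zeta_2-\zeta}\; e^{\zeta + \eta^T x - \frac{1}{2}x^T\Lambda x} = S\, e^{\zeta + \eta^T x - \frac{1}{2}x^T\Lambda x}$

where

A.9.c  $S = e^{\zeta_1+\zeta_2-\zeta}$

is the scaling factor. In other words, the product of two multivariate normal densities, multiplied with the reciprocal of that scaling factor, is a multivariate normal density. The reciprocal of that scaling factor is also referred to as normalizing constant.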
A.III Parameters of a scaled product of two multivariate Gaussian densities:
From A.9.b it follows that a product of two multivariate-normal densities divided by a scaling factor is a
multivariate normal density:
$$ f = \frac{f_1\, f_2}{S} \sim N(\mu, \Sigma) $$
[20] This is a special case with n=2 of the expression for the product of n multivariate normal densities given in Bromiley.
(with the scaling factor S defined as in A.9.c)
From A.2 and A.8.a, one can see the parameters of the multivariate normal vector x with density f:
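$\Sigma^{-1} = \Lambda = \Lambda_1 + \Lambda_2 = \Sigma_1^{-1} + \Sigma_2^{-1}$
$\Sigma^{-1}\mu = \eta = \eta_1 + \eta_2 = \Sigma_1^{-1}\mu_1 + \Sigma_2^{-1}\mu_2$

A.9.d
$\Sigma = \left(\Sigma_1^{-1} + \Sigma_2^{-1}\right)^{-1}$
$\mu = \left(\Sigma_1^{-1} + \Sigma_2^{-1}\right)^{-1}\left(\Sigma_1^{-1}\mu_1 + \Sigma_2^{-1}\mu_2\right)$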
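The statement that the product of two multivariate normal densities is a scaled normal density, with the covariance matrix and mean vector just given, can be checked numerically. The sketch below uses hypothetical parameters and verifies that the ratio of the product to the combined density is the same constant at several points; the helper `mvn_pdf` is an illustrative function, not part of any library.

```python
import numpy as np

def mvn_pdf(x, mu, Sigma):
    """Density of N(mu, Sigma) evaluated at x (moment form)."""
    n = len(mu)
    d = x - mu
    norm_const = np.sqrt((2 * np.pi) ** n * np.linalg.det(Sigma))
    return np.exp(-0.5 * d @ np.linalg.solve(Sigma, d)) / norm_const

# Two hypothetical densities for the same 2-dimensional random vector.
mu1 = np.array([0.0, 1.0])
Sigma1 = np.array([[1.0, 0.3], [0.3, 2.0]])
mu2 = np.array([1.0, -0.5])
Sigma2 = np.array([[0.5, -0.1], [-0.1, 1.5]])

# Parameters of the normalized product: precisions add, precision-weighted means add.
Lam1, Lam2 = np.linalg.inv(Sigma1), np.linalg.inv(Sigma2)
Sigma = np.linalg.inv(Lam1 + Lam2)
mu = Sigma @ (Lam1 @ mu1 + Lam2 @ mu2)

# The ratio f1(x) f2(x) / f(x) should not depend on x; it is the scaling factor.
points = [np.array(p) for p in [(0.0, 0.0), (1.0, -1.0), (-2.0, 0.5)]]
ratios = [mvn_pdf(x, mu1, Sigma1) * mvn_pdf(x, mu2, Sigma2) / mvn_pdf(x, mu, Sigma)
          for x in points]
print(ratios)
```

A constant ratio across all evaluation points is what one would expect if the combined parameters are correct; a wrong mean or covariance would make the ratio vary with `x`.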
A.IV Rewriting the scaling factor
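With

$\zeta_i = -\frac{1}{2}\left(n\ln(2\pi) - \ln|\Lambda_i| + \eta_i^T\Lambda_i^{-1}\eta_i\right)$

$\zeta = -\frac{1}{2}\left(n\ln(2\pi) - \ln|\Lambda_1+\Lambda_2| + (\eta_1+\eta_2)^T(\Lambda_1+\Lambda_2)^{-1}(\eta_1+\eta_2)\right)$

the exponent of the scaling factor can be written as:

$\zeta_1+\zeta_2-\zeta = -\frac{1}{2}\left(n\ln(2\pi) - \ln|\Lambda_1| + \eta_1^T\Lambda_1^{-1}\eta_1\right) - \frac{1}{2}\left(n\ln(2\pi) - \ln|\Lambda_2| + \eta_2^T\Lambda_2^{-1}\eta_2\right) + \frac{1}{2}\left(n\ln(2\pi) - \ln|\Lambda_1+\Lambda_2| + (\eta_1+\eta_2)^T(\Lambda_1+\Lambda_2)^{-1}(\eta_1+\eta_2)\right)$

Rearranging and simplifying this expression gives:

$\zeta_1+\zeta_2-\zeta = -\frac{1}{2}\left(n\ln(2\pi) - \ln|\Lambda_1| - \ln|\Lambda_2| + \ln|\Lambda_1+\Lambda_2| + \eta_1^T\Lambda_1^{-1}\eta_1 + \eta_2^T\Lambda_2^{-1}\eta_2 - (\eta_1+\eta_2)^T(\Lambda_1+\Lambda_2)^{-1}(\eta_1+\eta_2)\right)$
$= -\frac{1}{2}\left(n\ln(2\pi) - \ln|\Lambda_1| - \ln|\Lambda_2| + \ln|\Lambda_1+\Lambda_2| + a\right)$

where $a$ is defined as follows:

A.10  $a = \eta_1^T\Lambda_1^{-1}\eta_1 + \eta_2^T\Lambda_2^{-1}\eta_2 - (\eta_1+\eta_2)^T(\Lambda_1+\Lambda_2)^{-1}(\eta_1+\eta_2)$.

With

$e^{-\frac{1}{2}\left(-\ln|\Lambda_1| - \ln|\Lambda_2| + \ln|\Lambda_1+\Lambda_2|\right)} = \left(\dfrac{|\Lambda_1|\,|\Lambda_2|}{|\Lambda_1+\Lambda_2|}\right)^{1/2} = \left(\dfrac{1}{|\Sigma_1|\,|\Sigma_1^{-1}+\Sigma_2^{-1}|\,|\Sigma_2|}\right)^{1/2} = \dfrac{1}{|\Sigma_1+\Sigma_2|^{1/2}}$

and

$e^{-\frac{1}{2}n\ln(2\pi)} = (2\pi)^{-n/2} = \dfrac{1}{(2\pi)^{n/2}}$

the scaling factor $S$ can then be written as: [21]

A.11  $S = \dfrac{1}{(2\pi)^{n/2}\,|\Sigma_1+\Sigma_2|^{0.5}}\; e^{-\frac{1}{2}a}$

With

$\eta = \eta_1 + \eta_2$ and $\Lambda = \Lambda_1 + \Lambda_2$

$a$ becomes:

$a = \eta_1^T\Lambda_1^{-1}\eta_1 + \eta_2^T\Lambda_2^{-1}\eta_2 - (\eta_1+\eta_2)^T(\Lambda_1+\Lambda_2)^{-1}(\eta_1+\eta_2) = \eta_1^T\Lambda_1^{-1}\eta_1 + \eta_2^T\Lambda_2^{-1}\eta_2 - \eta^T\Lambda^{-1}\eta$

This expression is now simplified enough to replace the expressions introduced for the information form:

$a = \eta_1^T\Lambda_1^{-1}\eta_1 + \eta_2^T\Lambda_2^{-1}\eta_2 - \eta^T\Lambda^{-1}\eta = \mu_1^T\Sigma_1^{-1}\mu_1 + \mu_2^T\Sigma_2^{-1}\mu_2 - (\Sigma_1^{-1}\mu_1+\Sigma_2^{-1}\mu_2)^T(\Sigma_1^{-1}+\Sigma_2^{-1})^{-1}(\Sigma_1^{-1}\mu_1+\Sigma_2^{-1}\mu_2)$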
[21] Corresponds to equation 6d in “Gaussian Identities”, Sam Roweis (revised July 1999), https://www.cs.nyu.edu/~roweis/notes/gaussid.pdf
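A.12  $a = \mu_1^T\Sigma_1^{-1}\mu_1 + \mu_2^T\Sigma_2^{-1}\mu_2 - (\mu_1^T\Sigma_1^{-1}+\mu_2^T\Sigma_2^{-1})(\Sigma_1^{-1}+\Sigma_2^{-1})^{-1}(\Sigma_1^{-1}\mu_1+\Sigma_2^{-1}\mu_2)$

The three-brackets product on the right can be written as:

$(\mu_1^T\Sigma_1^{-1}+\mu_2^T\Sigma_2^{-1})(\Sigma_1^{-1}+\Sigma_2^{-1})^{-1}(\Sigma_1^{-1}\mu_1+\Sigma_2^{-1}\mu_2)$

A.13  $= \mu_1^T\Sigma_1^{-1}(\Sigma_1^{-1}+\Sigma_2^{-1})^{-1}(\Sigma_1^{-1}\mu_1+\Sigma_2^{-1}\mu_2) + \mu_2^T\Sigma_2^{-1}(\Sigma_1^{-1}+\Sigma_2^{-1})^{-1}(\Sigma_1^{-1}\mu_1+\Sigma_2^{-1}\mu_2)$

A special case of the matrix inversion lemma is: [22]

A.14  $(\Sigma_1^{-1}+\Sigma_2^{-1})^{-1} = \Sigma_1 - \Sigma_1(\Sigma_1+\Sigma_2)^{-1}\Sigma_1 = \Sigma_2 - \Sigma_2(\Sigma_1+\Sigma_2)^{-1}\Sigma_2$

So that the first summand in A.13 can be written as:

$\mu_1^T\Sigma_1^{-1}(\Sigma_1^{-1}+\Sigma_2^{-1})^{-1}(\Sigma_1^{-1}\mu_1+\Sigma_2^{-1}\mu_2)$
$= \mu_1^T\Sigma_1^{-1}\left(\Sigma_1 - \Sigma_1(\Sigma_1+\Sigma_2)^{-1}\Sigma_1\right)(\Sigma_1^{-1}\mu_1+\Sigma_2^{-1}\mu_2)$
$= \mu_1^T(\Sigma_1^{-1}\mu_1+\Sigma_2^{-1}\mu_2) - \mu_1^T(\Sigma_1+\Sigma_2)^{-1}\Sigma_1(\Sigma_1^{-1}\mu_1+\Sigma_2^{-1}\mu_2)$
$= \mu_1^T\Sigma_1^{-1}\mu_1 + \mu_1^T\Sigma_2^{-1}\mu_2 - \mu_1^T(\Sigma_1+\Sigma_2)^{-1}\mu_1 - \mu_1^T(\Sigma_1+\Sigma_2)^{-1}\Sigma_1\Sigma_2^{-1}\mu_2$

And the second summand:

$\mu_2^T\Sigma_2^{-1}(\Sigma_1^{-1}+\Sigma_2^{-1})^{-1}(\Sigma_1^{-1}\mu_1+\Sigma_2^{-1}\mu_2)$
$= \mu_2^T\Sigma_2^{-1}\left(\Sigma_2 - \Sigma_2(\Sigma_1+\Sigma_2)^{-1}\Sigma_2\right)(\Sigma_1^{-1}\mu_1+\Sigma_2^{-1}\mu_2)$
$= \mu_2^T(\Sigma_1^{-1}\mu_1+\Sigma_2^{-1}\mu_2) - \mu_2^T(\Sigma_1+\Sigma_2)^{-1}\Sigma_2(\Sigma_1^{-1}\mu_1+\Sigma_2^{-1}\mu_2)$
$= \mu_2^T\Sigma_1^{-1}\mu_1 + \mu_2^T\Sigma_2^{-1}\mu_2 - \mu_2^T(\Sigma_1+\Sigma_2)^{-1}\Sigma_2\Sigma_1^{-1}\mu_1 - \mu_2^T(\Sigma_1+\Sigma_2)^{-1}\mu_2$

So that the sum in A.13 becomes: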
[22] See p. 201 of “Gaussian Processes for Machine Learning”, by C. E. Rasmussen & C. K. I. Williams, The MIT Press, 2006, http://www.gaussianprocess.org/gpml/chapters/RWA.pdf, in the following referred to as “Rasmussen”.
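$\mu_1^T\Sigma_1^{-1}(\Sigma_1^{-1}+\Sigma_2^{-1})^{-1}(\Sigma_1^{-1}\mu_1+\Sigma_2^{-1}\mu_2) + \mu_2^T\Sigma_2^{-1}(\Sigma_1^{-1}+\Sigma_2^{-1})^{-1}(\Sigma_1^{-1}\mu_1+\Sigma_2^{-1}\mu_2)$
$= \mu_1^T\Sigma_1^{-1}\mu_1 + \mu_1^T\Sigma_2^{-1}\mu_2 + \mu_2^T\Sigma_1^{-1}\mu_1 + \mu_2^T\Sigma_2^{-1}\mu_2 - \mu_1^T(\Sigma_1+\Sigma_2)^{-1}\mu_1 - \mu_2^T(\Sigma_1+\Sigma_2)^{-1}\mu_2 - \mu_1^T(\Sigma_1+\Sigma_2)^{-1}\Sigma_1\Sigma_2^{-1}\mu_2 - \mu_2^T(\Sigma_1+\Sigma_2)^{-1}\Sigma_2\Sigma_1^{-1}\mu_1$

And with that one gets for A.12:

$a = \mu_1^T\Sigma_1^{-1}\mu_1 + \mu_2^T\Sigma_2^{-1}\mu_2 - (\mu_1^T\Sigma_1^{-1}+\mu_2^T\Sigma_2^{-1})(\Sigma_1^{-1}+\Sigma_2^{-1})^{-1}(\Sigma_1^{-1}\mu_1+\Sigma_2^{-1}\mu_2)$

A.15  $a = -\mu_1^T\Sigma_2^{-1}\mu_2 - \mu_2^T\Sigma_1^{-1}\mu_1 + \mu_1^T(\Sigma_1+\Sigma_2)^{-1}\mu_1 + \mu_2^T(\Sigma_1+\Sigma_2)^{-1}\mu_2 + \mu_1^T(\Sigma_1+\Sigma_2)^{-1}\Sigma_1\Sigma_2^{-1}\mu_2 + \mu_2^T(\Sigma_1+\Sigma_2)^{-1}\Sigma_2\Sigma_1^{-1}\mu_1$

To proceed, use the following result: [23]

$\Sigma_1(\Sigma_1+\Sigma_2)^{-1}\Sigma_2 = \Sigma_2(\Sigma_1+\Sigma_2)^{-1}\Sigma_1 = (\Sigma_1^{-1}+\Sigma_2^{-1})^{-1}$

Proof:

$\left(\Sigma_1(\Sigma_1+\Sigma_2)^{-1}\Sigma_2\right)^{-1} = \Sigma_2^{-1}(\Sigma_1+\Sigma_2)\Sigma_1^{-1} = \Sigma_2^{-1} + \Sigma_1^{-1}$

And:

$\left(\Sigma_2(\Sigma_1+\Sigma_2)^{-1}\Sigma_1\right)^{-1} = \Sigma_1^{-1}(\Sigma_1+\Sigma_2)\Sigma_2^{-1} = \Sigma_1^{-1} + \Sigma_2^{-1}$

Hence: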
[23] For example given at http://math.stackexchange.com/questions/934921/inverse-of-a-sum-of-symmetric-matrices
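$\Sigma_1(\Sigma_1+\Sigma_2)^{-1}\Sigma_2 = \Sigma_2(\Sigma_1+\Sigma_2)^{-1}\Sigma_1 = (\Sigma_1^{-1}+\Sigma_2^{-1})^{-1}$

And:

$(\Sigma_1+\Sigma_2)^{-1}\Sigma_1 = \Sigma_2^{-1}(\Sigma_1^{-1}+\Sigma_2^{-1})^{-1}$
$(\Sigma_1+\Sigma_2)^{-1}\Sigma_2 = \Sigma_1^{-1}(\Sigma_1^{-1}+\Sigma_2^{-1})^{-1}$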
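The two matrix results used in this section, the special case of the matrix inversion lemma (A.14) and the product identity just proven, can be checked numerically. The following sketch uses two hypothetical symmetric positive definite matrices; it is an illustration of the identities, not a proof.

```python
import numpy as np

# Two hypothetical symmetric positive definite covariance matrices.
Sigma1 = np.array([[1.0, 0.3], [0.3, 2.0]])
Sigma2 = np.array([[0.5, -0.1], [-0.1, 1.5]])

inv = np.linalg.inv
sum_inv = inv(Sigma1 + Sigma2)

# Left hand side shared by both results: (Sigma1^{-1} + Sigma2^{-1})^{-1}.
target = inv(inv(Sigma1) + inv(Sigma2))

# Special case of the matrix inversion lemma, in both of its symmetric variants.
via_lemma_1 = Sigma1 - Sigma1 @ sum_inv @ Sigma1
via_lemma_2 = Sigma2 - Sigma2 @ sum_inv @ Sigma2

# Product identity, in both orderings.
via_product_12 = Sigma1 @ sum_inv @ Sigma2
via_product_21 = Sigma2 @ sum_inv @ Sigma1

for name, value in [("lemma (Sigma1)", via_lemma_1), ("lemma (Sigma2)", via_lemma_2),
                    ("product 1-2", via_product_12), ("product 2-1", via_product_21)]:
    print(name, np.allclose(value, target))
```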
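Apply the result just proven on A.15:

$a = -\mu_1^T\Sigma_2^{-1}\mu_2 - \mu_2^T\Sigma_1^{-1}\mu_1 + \mu_1^T(\Sigma_1+\Sigma_2)^{-1}\mu_1 + \mu_2^T(\Sigma_1+\Sigma_2)^{-1}\mu_2 + \mu_1^T(\Sigma_1+\Sigma_2)^{-1}\Sigma_1\Sigma_2^{-1}\mu_2 + \mu_2^T(\Sigma_1+\Sigma_2)^{-1}\Sigma_2\Sigma_1^{-1}\mu_1$
$= -\mu_1^T\Sigma_2^{-1}\mu_2 - \mu_2^T\Sigma_1^{-1}\mu_1 + \mu_1^T(\Sigma_1+\Sigma_2)^{-1}\mu_1 + \mu_2^T(\Sigma_1+\Sigma_2)^{-1}\mu_2 + \mu_1^T\Sigma_2^{-1}(\Sigma_1^{-1}+\Sigma_2^{-1})^{-1}\Sigma_2^{-1}\mu_2 + \mu_2^T\Sigma_1^{-1}(\Sigma_1^{-1}+\Sigma_2^{-1})^{-1}\Sigma_1^{-1}\mu_1$

Apply now the special case of the matrix inversion lemma given earlier again and simplify:

$a = -\mu_1^T\Sigma_2^{-1}\mu_2 - \mu_2^T\Sigma_1^{-1}\mu_1 + \mu_1^T(\Sigma_1+\Sigma_2)^{-1}\mu_1 + \mu_2^T(\Sigma_1+\Sigma_2)^{-1}\mu_2 + \mu_1^T\Sigma_2^{-1}\left(\Sigma_2 - \Sigma_2(\Sigma_1+\Sigma_2)^{-1}\Sigma_2\right)\Sigma_2^{-1}\mu_2 + \mu_2^T\Sigma_1^{-1}\left(\Sigma_1 - \Sigma_1(\Sigma_1+\Sigma_2)^{-1}\Sigma_1\right)\Sigma_1^{-1}\mu_1$
$= -\mu_1^T\Sigma_2^{-1}\mu_2 - \mu_2^T\Sigma_1^{-1}\mu_1 + \mu_1^T(\Sigma_1+\Sigma_2)^{-1}\mu_1 + \mu_2^T(\Sigma_1+\Sigma_2)^{-1}\mu_2 + \mu_1^T\Sigma_2^{-1}\mu_2 - \mu_1^T(\Sigma_1+\Sigma_2)^{-1}\mu_2 + \mu_2^T\Sigma_1^{-1}\mu_1 - \mu_2^T(\Sigma_1+\Sigma_2)^{-1}\mu_1$
$= \mu_1^T(\Sigma_1+\Sigma_2)^{-1}\mu_1 + \mu_2^T(\Sigma_1+\Sigma_2)^{-1}\mu_2 - \mu_1^T(\Sigma_1+\Sigma_2)^{-1}\mu_2 - \mu_2^T(\Sigma_1+\Sigma_2)^{-1}\mu_1$

Rearrange:

$a = \mu_1^T(\Sigma_1+\Sigma_2)^{-1}(\mu_1-\mu_2) - \mu_2^T(\Sigma_1+\Sigma_2)^{-1}(\mu_1-\mu_2)$
$= (\mu_1-\mu_2)^T(\Sigma_1+\Sigma_2)^{-1}(\mu_1-\mu_2)$

Use this result for $a$ in the equation for the scaling factor A.11:

$S = \dfrac{1}{(2\pi)^{n/2}\,|\Sigma_1+\Sigma_2|^{0.5}}\; e^{-\frac{1}{2}a}$

Substitute for $a$: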
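A.16  $S = \dfrac{1}{(2\pi)^{n/2}\,|\Sigma_1+\Sigma_2|^{0.5}}\; e^{-\frac{1}{2}(\mu_1-\mu_2)^T(\Sigma_1+\Sigma_2)^{-1}(\mu_1-\mu_2)}$

So the scaling factor is the value of the density function of a multivariate Gaussian with mean $\mu_2$ and covariance matrix $\Sigma_1+\Sigma_2$, evaluated at $\mu_1$. [24]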
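The closed form of the scaling factor can be compared numerically with the ratio of the product of two densities to the combined density. The sketch below uses hypothetical parameters and an illustrative helper `mvn_pdf`; the evaluation point `x` is arbitrary, as the ratio should not depend on it.

```python
import numpy as np

def mvn_pdf(x, mu, Sigma):
    """Density of N(mu, Sigma) evaluated at x (moment form)."""
    n = len(mu)
    d = x - mu
    norm_const = np.sqrt((2 * np.pi) ** n * np.linalg.det(Sigma))
    return np.exp(-0.5 * d @ np.linalg.solve(Sigma, d)) / norm_const

# Two hypothetical densities for the same 2-dimensional random vector.
mu1 = np.array([0.0, 1.0])
Sigma1 = np.array([[1.0, 0.3], [0.3, 2.0]])
mu2 = np.array([1.0, -0.5])
Sigma2 = np.array([[0.5, -0.1], [-0.1, 1.5]])

# Parameters of the normalized product of the two densities.
Lam1, Lam2 = np.linalg.inv(Sigma1), np.linalg.inv(Sigma2)
Sigma = np.linalg.inv(Lam1 + Lam2)
mu = Sigma @ (Lam1 @ mu1 + Lam2 @ mu2)

# Scaling factor obtained directly as f1(x) f2(x) / f(x) at an arbitrary point x.
x = np.array([0.3, -0.2])
S_direct = mvn_pdf(x, mu1, Sigma1) * mvn_pdf(x, mu2, Sigma2) / mvn_pdf(x, mu, Sigma)

# Closed form: Gaussian density with mean mu2 and covariance Sigma1 + Sigma2, evaluated at mu1.
S_closed = mvn_pdf(mu1, mu2, Sigma1 + Sigma2)

print(S_direct, S_closed)
```

Exchanging the roles of `mu1` and `mu2` in the closed form gives the same value, since the quadratic form only depends on their difference.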
[24] This is the form for the normalizing factor as given in Rasmussen, equation A.8, page 200. Note that, as written there, $\mu_1$ and $\mu_2$ can be exchanged.