corresponding author – [email protected]

On the Derivation of the Black-Litterman Equation for Expected Excess Returns1

Harald Bogner

First version: May 17th 2015 This version: October 18th 2015

The Black-Litterman approach is a method for combining subjective views about the distributions of expected excess returns of portfolios consisting of subsets of the available assets with the distributions of risk premiums implied in current market prices for all available assets, under an assumption about expected excess returns in equilibrium. The method has been explained and discussed extensively in the literature.2 The following therefore presents the approach only in a very compressed “executive summary” style. The focus of this text is instead on the derivation of a main result of the Black-Litterman model, the formula for expected excess returns. The original papers by Black and Litterman do not give this derivation. Black and Litterman suggested deriving the formula either with the Theil mixed estimation method or with “the Black-Litterman approach”,3 which is explained only very briefly in the appendix of one of their original papers on the model.4 An endnote in another of their papers referred to a “Bayesian approach”, and a later publication by He and Litterman gave a hint, stating that the model “uses the Bayesian approach to infer the assets’ expected returns”, but neither provided a comprehensive derivation.5 Several proofs using Bayes' theorem have been presented by other authors.6 The derivation given in the following differs from these by relying explicitly on general results regarding the properties of the product of two different densities for the same multivariate normal random vector. For convenience, a

1 The equation referred to here is given in step 8 of the appendix on page 42 of “Global Portfolio Optimization” by Fischer Black and Robert Litterman, published in the Financial Analysts Journal, 48, 5, Sep/Oct 1992, p. 28-43. This paper (in the following referred to as Black/Litterman 1992) is also the main source of the model description given and the notation used here.

2 For a detailed and comprehensive overview see the blog www.blacklitterman.org by Jay Walters.

3 Black/Litterman 1992, p. 35

4 Ibid, appendix, p.42

5 See endnote 11 in “Asset Allocation”, The Journal of Fixed Income, September 1991, by Fischer Black and Robert B. Litterman, and page 2 of “The Intuition Behind Black-Litterman Model Portfolios” by Guangliang He and Robert Litterman, available at SSRN: http://ssrn.com/abstract=334304 or http://dx.doi.org/10.2139/ssrn.334304, in the following referred to as He/Litterman. The above and all following online references were retrieved on May 17th 2015.

6 Again, for a comprehensive overview see www.blacklitterman.org.


proof of these results that follows the given references, and relies on the information form7 for normal densities, is shown in the appendix.8

Market-implied risk premiums

The Black-Litterman method is based on the assumption that, from the investor's perspective, not only the assets’ excess returns (returns after deducting the risk-free rate) but also the risk premiums (expected excess returns) are random. More precisely, the risk premiums, in the following written as the random vector $M$, with $\mu$ being the “realization” of $M$, i.e. the true risk premiums vector, have a multivariate normal distribution. This randomness of risk premiums can be considered the result of market prices fluctuating around equilibrium prices. By assuming that the market is currently in a CAPM equilibrium, and with (an estimate for) the market price of risk (aka the market risk aversion parameter) and a known covariance matrix of excess returns $\Sigma$, “equilibrium risk premiums” can be derived, i.e. risk premiums that would be the true risk premiums if the market were in equilibrium at current price levels.9, 10

- This vector of equilibrium risk premiums is in the following written as $\pi$.

Black and Litterman suggest using this vector as the mean vector of a market-implied distribution of risk premiums, with a covariance matrix that is proportional to the covariance matrix of excess returns, i.e. every covariance between two risk premiums (including the variances) is the covariance of the corresponding excess returns times a constant $\tau$.

- For the market-implied distribution, the covariance matrix of risk premiums is hence written as $\tau\Sigma$.

In summary, while the true risk premiums are unknown, as one possible model one can estimate a multivariate normal distribution for the vector of risk premiums with a mean corresponding to the risk premiums implied in current market prices under an equilibrium assumption (the equilibrium risk premiums), and a covariance matrix proportional to the (estimated) covariance matrix of excess returns:

I. $M \sim N(\pi, \tau\Sigma)$
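The parameters of the market-implied distribution can be illustrated with a minimal sketch. It assumes the reverse-optimization relation $\pi = \delta \Sigma w$ described in He/Litterman (ignoring the currency-hedging extension of the equilibrium); the names `w_mkt`, `delta` and all numbers are hypothetical and chosen for illustration only.

```python
import numpy as np

# Hypothetical inputs for three assets (illustrative numbers, not from the paper).
Sigma = np.array([[0.040, 0.012, 0.006],
                  [0.012, 0.090, 0.018],
                  [0.006, 0.018, 0.160]])  # covariance matrix of excess returns
w_mkt = np.array([0.5, 0.3, 0.2])          # market-capitalization weights (assumed)
delta = 2.5                                 # market risk aversion parameter (assumed)
tau = 0.05                                  # proportionality constant tau (assumed)

# Equilibrium risk premiums by reverse optimization: pi = delta * Sigma * w_mkt
pi = delta * Sigma @ w_mkt

# Covariance matrix of the market-implied (prior) distribution M ~ N(pi, tau * Sigma)
prior_cov = tau * Sigma

print("equilibrium risk premiums pi:", pi)
print("prior covariance tau * Sigma:")
print(prior_cov)
```

The sketch only fixes the two parameters of equation I; how $\delta$ and $\tau$ are estimated is outside its scope.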

7 One could certainly argue that the general derivation of the information form from the moment form is part of the proof and that, depending on how detailed these steps are formulated, it may actually not be much shorter than the proofs given in other sources. However, even viewed like this, it may still provide an alternative structure that may contribute to a comprehensive understanding of the Black-Litterman result. Again for convenience, and relying on hints in a reference given there, section A.I of the appendix shows how a density function written in information form can be derived from a density function written in moment form.

8 Note that the last part of the appendix, the rewriting of the scaling factor, is given only for the sake of completeness; it is not relevant in the context of the Black-Litterman equation.

9 The equilibrium Black/Litterman refer to is an extension of the Sharpe-Lintner-Mossin-Treynor CAPM, with the currency market also being in equilibrium, i.e. there is an equilibrium level of currency hedging, as described in a 1989 paper titled “Universal Hedging: Optimizing Currency Risk and Reward in International Equity Portfolios” by Fischer Black and derived in “Equilibrium Exchange Rate Hedging”, a 1989 working paper by the same author.

10 This requires the estimation of a risk aversion parameter for the market (aka “market price of risk”; see the corresponding section in the post on Froot/Stein and Treynor/Black in this blog and the references given there), see He/Litterman page 3.


This will in the following be referred to as the “market-implied” or, following He and Litterman, the “prior” distribution for risk premiums.

Risk premiums implied in the investor’s subjective views

Besides the market-implied distribution, an investor may also have individual subjective views on one or more linear combinations of the assets’ risk premiums, i.e. on risk premiums of portfolios (consisting of long and short positions).11 Black and Litterman express these views as follows:

- $P$ is a $k \times n$ matrix of weights in those portfolios specified by the investor, where $k$ is the total number of views and $n$ is the number of assets in the market.

- The views hold for the true vector of asset risk premiums $\mu$, i.e. the “realization” of $M$, and the elements of $P\mu$ are hence the values expected by the investor for the excess returns of the portfolios on which the investor has a view.

While the investor does not know the true risk premium for any of the portfolios, they know a model of their distribution, conditional on $\mu$:

II. $P\mu = q + \varepsilon$

where $q$ is a $k$-dimensional (column) vector of constants, known by the investor. The elements of the vector $\varepsilon$ are random and normally distributed with mean zero and a diagonal covariance matrix $\Omega$, i.e. $\varepsilon \sim N(0, \Omega)$, such that

III. $P\mu \sim N(q, \Omega)$

$P\mu = q + \varepsilon$ can also be (approximately) solved for the expected return vector:

IV. $\mu \approx P^{+}(q + \varepsilon)$

11 A view on a single asset could be understood as a special case of a linear combination, where all the weights of other assets are zero. The interpretation of views as portfolios was introduced in He/Litterman.

where $P^{+}$ is the pseudoinverse of $P$.12 This vector describes the investor’s information on risk premiums implied in their views. The vector will only contain risk premiums for assets which are affected by the investor’s views, and zeros for other assets.

From IV it follows that this expected return vector is random with a multivariate normal distribution:13

V. $M_v \sim N\left(P^{+}q,\; P^{+}\Omega\,(P^{+})^{T}\right)$

where $M_v$ is the random expected return vector, and the realization of $M_v$, i.e. the true expected return vector implied in the views, is labelled $\mu_v$. Here the index $v$ indicates that these are the view-implied parameters for the distribution of risk premiums. Note that the covariance matrix will include zeros for assets not affected by the views.

As the views are conditional on the unknown true risk premiums vector $\mu$, the risk premiums vector implied in the views is also conditional on $\mu$. The density function for the expected excess return vector $\mu_v$ implied in the views and conditional on the unknown true risk premiums vector $\mu$ is hence in the following written as:

$$f_{M_v \mid M}(M_v = \mu_v \mid \mu)$$

In summary, the investor has views in the form of a distribution for the risk premiums of one or more portfolios formed with one or more of the assets available in the market. This distribution is conditional on the true vector of risk premiums. It implies a conditional probability density function for the random vector of asset risk premiums.

The two models for risk premiums discussed above, the market-implied and the subjective views-implied risk premiums, can be combined with the Bayes rule for probability density functions, as described in the following.
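As an illustration of the view-implied parameters, the following minimal sketch builds the right inverse $P^{T}(PP^{T})^{-1}$ mentioned in the footnote on the pseudoinverse and evaluates the mean and covariance matrix of equation V. The two views, the vector `q` and the matrix `Omega` are hypothetical numbers, not taken from the paper.

```python
import numpy as np

# Hypothetical example: n = 3 assets, k = 2 views (illustrative numbers only).
P = np.array([[1.0, -1.0, 0.0],    # view 1: asset 1 relative to asset 2
              [0.0,  0.0, 1.0]])   # view 2: asset 3 on its own
q = np.array([0.02, 0.07])          # expected excess returns of the view portfolios
Omega = np.diag([0.0004, 0.0009])   # diagonal covariance matrix of the view errors

# Right inverse P^T (P P^T)^{-1}; usable as pseudoinverse because P has full row rank.
P_plus = P.T @ np.linalg.inv(P @ P.T)

# Parameters of the view-implied distribution in equation V.
mean_v = P_plus @ q
cov_v = P_plus @ Omega @ P_plus.T

print("P+ =")
print(P_plus)
print("view-implied mean:", mean_v)
print("view-implied covariance:")
print(cov_v)
```

In this sketch `P @ mean_v` reproduces `q`, which is the sense in which the pseudoinverse "solves" equation II for the risk premiums.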

Combining the market-implied and views-implied models for expected excess returns

According to the Bayes formula for densities14, the density of the true vector of risk premiums

conditional on the risk premiums implied in the views, i.e. the posterior density function for the risk

premiums, is the product of the market-implied prior density and the conditional density implied in the

12 As $P$ has full row rank (no view is a linear combination of one or more other views), the right inverse $P^{T}(PP^{T})^{-1}$ could be used as pseudoinverse. Solving the equation as above with the pseudoinverse results in an approximation of $\mu$ such that the Euclidean norm of the distance between the estimate of $\mu$ based on the multiplication with $P^{+}$ and the true $\mu$ cannot be made smaller by changing the estimate of $\mu$. See page 451, “The Operator Theory of the Pseudo-Inverse, I. Bounded Operators”, by Frederick J. Beutler, Journal of Mathematical Analysis and Applications, 10, 1965, p. 451-470. Here, however, due to a second inversion at a later stage, which reverses the inversion here, it is not necessary to actually find the pseudoinverse.

13 As $\Omega$ is diagonal, the covariance matrix of the elements of the vector $P^{+}\varepsilon$ is diagonal as well, with the diagonal elements being the view variances scaled by the squared elements of $P^{+}$; in matrix notation, this can be written as $P^{+}\Omega\,(P^{+})^{T}$.

14 See for example http://www.math.uah.edu/stat/dist/Conditional.html

views for the asset risk premiums, divided by the unconditional density of the risk premiums implied in the views:

$$f_{M \mid M_v}(\mu \mid M_v = \mu_v) = \frac{f_{M_v \mid M}(M_v = \mu_v \mid \mu)\; f_M(\mu)}{f_{M_v}(M_v = \mu_v)}$$

As shown in section A.II of the appendix (and the references given there), the numerator of the fraction on the right-hand side of this equation is a Gaussian. For a given $\mu_v$, the denominator of the fraction on the right-hand side, $f_{M_v}(M_v = \mu_v)$, is a constant. Further, as the Bayes rule for densities implies that the left-hand side is a density function, the value of $f_{M_v}(M_v = \mu_v)$ must ensure that the right-hand side is a proper density function as well, in particular that it integrates to 1. That is, $1 / f_{M_v}(M_v = \mu_v)$ takes the role of a normalizing factor, and the right-hand side is not only a Gaussian, but a density function of a multivariate normal distribution. This holds for every $\mu_v$, so that (now slightly changing the notation to emphasize that $\mu_v$ is variable)

$$f_{M \mid M_v}(\mu \mid \mu_v) = \frac{f_{M_v \mid M}(\mu_v \mid \mu)\; f_M(\mu)}{f_{M_v}(\mu_v)}$$

is the probability density function of a variable with a multivariate normal distribution. The parameters of this distribution, i.e. its mean vector and covariance matrix, are given by the general equations for parameters of products of multivariate normal densities, which are derived in section A.III of the appendix following the references given there. The equations are (A.9.d in section A.III of the appendix):

$$\Sigma = \left(\Sigma_1^{-1} + \Sigma_2^{-1}\right)^{-1}$$

$$\mu = \left(\Sigma_1^{-1} + \Sigma_2^{-1}\right)^{-1}\left(\Sigma_1^{-1}\mu_1 + \Sigma_2^{-1}\mu_2\right)$$

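To apply these results here, use the covariance matrices given in equations I. and V. to specify the covariance matrix of the posterior distribution, written here as $\bar{\Sigma}$:15

$$\bar{\Sigma} = \left[(\tau\Sigma)^{-1} + \left(P^{+}\Omega\,(P^{+})^{T}\right)^{-1}\right]^{-1} = \left[(\tau\Sigma)^{-1} + P^{T}\Omega^{-1}P\right]^{-1}$$

And with that, using $\left(P^{+}\Omega\,(P^{+})^{T}\right)^{-1}P^{+}q = P^{T}\Omega^{-1}PP^{+}q = P^{T}\Omega^{-1}q$, the mean vector of the posterior distribution, written here as $\bar{\mu}$, becomes:

$$\bar{\mu} = \left[(\tau\Sigma)^{-1} + P^{T}\Omega^{-1}P\right]^{-1}\left[(\tau\Sigma)^{-1}\pi + P^{T}\Omega^{-1}q\right]$$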

15 Note for the following that the inverse of a pseudoinverse of a matrix A is the original matrix A.

which is the Black-Litterman equation for expected excess returns.
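The following minimal sketch evaluates the posterior mean vector and covariance matrix numerically. The function name `black_litterman_posterior` and all input numbers are hypothetical; only the two formulas themselves come from the text above.

```python
import numpy as np

def black_litterman_posterior(pi, Sigma, tau, P, q, Omega):
    """Posterior mean and covariance of the risk premiums.

    Hypothetical helper name; it evaluates
    cov  = [(tau*Sigma)^-1 + P^T Omega^-1 P]^-1
    mean = cov [(tau*Sigma)^-1 pi + P^T Omega^-1 q]
    """
    prior_precision = np.linalg.inv(tau * Sigma)   # (tau*Sigma)^{-1}
    Omega_inv = np.linalg.inv(Omega)               # Omega^{-1}
    post_cov = np.linalg.inv(prior_precision + P.T @ Omega_inv @ P)
    post_mean = post_cov @ (prior_precision @ pi + P.T @ Omega_inv @ q)
    return post_mean, post_cov

# Illustrative inputs (assumed numbers): three assets, two views.
Sigma = np.array([[0.040, 0.012, 0.006],
                  [0.012, 0.090, 0.018],
                  [0.006, 0.018, 0.160]])
pi = np.array([0.062, 0.0915, 0.101])
tau = 0.05
P = np.array([[1.0, -1.0, 0.0],
              [0.0,  0.0, 1.0]])
q = np.array([0.02, 0.07])
Omega = np.diag([0.0004, 0.0009])

post_mean, post_cov = black_litterman_posterior(pi, Sigma, tau, P, q, Omega)
print("posterior mean:", post_mean)
print("posterior covariance:")
print(post_cov)
```

Note that the pseudoinverse never has to be computed here, in line with the remark in the footnote on the pseudoinverse: only $P^{T}\Omega^{-1}P$ and $P^{T}\Omega^{-1}q$ enter the result.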

Appendix16

For the multiplication of densities in the next section, the following describes the information form for

densities.

A.I Density of a multivariate normal distribution in information form

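The density function $f_i$ of a multivariate-normally distributed $n$-dimensional vector $x = (x_1, \dots, x_j, \dots, x_n)^{T}$ is:17

A.1 $$f_i(x) = \frac{1}{(2\pi)^{n/2}\,|\Sigma_i|^{1/2}}\; e^{-\frac{1}{2}(x-\mu_i)^{T}\Sigma_i^{-1}(x-\mu_i)}$$

with a mean vector $\mu_i$ and a covariance matrix $\Sigma_i$.

Defining the variables

A.2 $$\eta_i = \Sigma_i^{-1}\mu_i, \qquad \Lambda_i = \Sigma_i^{-1}$$

and

A.3 $$\zeta_i = -\frac{1}{2}\left(n \ln 2\pi - \ln|\Lambda_i| + \eta_i^{T}\Lambda_i^{-1}\eta_i\right)$$

the density can be written in the so-called information form (aka natural form or canonical form)18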

16 As all functions appearing in the whole following text are densities on the same random vector, the notation is simplified: instead of $f_X(X=x)$, and using e.g. $g_X(X=x)$ for another density function for the same random vector, $f_i(x)$ is written, where different values for the index $i$ are used to distinguish different densities on the same random vector $x$.

17 As only one density is considered in this first section of the appendix, the index $i$ is not strictly necessary here and may be removed in future versions of this document, to then be reintroduced in the next section on products of densities.

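$$f_i(x) = e^{\zeta_i + \eta_i^{T}x - \frac{1}{2}x^{T}\Lambda_i x}$$

Proof:

Perform multiplications in the exponent of A.1 (dropping the index $i$ for readability):

$$(x-\mu)^{T}\Sigma^{-1}(x-\mu) = \left(x^{T}-\mu^{T}\right)\left(\Sigma^{-1}x - \Sigma^{-1}\mu\right)$$

A.4 $$= x^{T}\Sigma^{-1}x - x^{T}\Sigma^{-1}\mu - \mu^{T}\Sigma^{-1}x + \mu^{T}\Sigma^{-1}\mu$$

with (because $\Sigma^{-1}$ is symmetric)

$$\left(x^{T}\Sigma^{-1}\mu\right)^{T} = \mu^{T}\Sigma^{-1}x$$

and as $x^{T}\Sigma^{-1}\mu$ is a scalar:

$$x^{T}\Sigma^{-1}\mu = \left(x^{T}\Sigma^{-1}\mu\right)^{T} = \mu^{T}\Sigma^{-1}x,$$

A.4 becomes:

$$x^{T}\Sigma^{-1}x - x^{T}\Sigma^{-1}\mu - \mu^{T}\Sigma^{-1}x + \mu^{T}\Sigma^{-1}\mu = x^{T}\Sigma^{-1}x - 2\mu^{T}\Sigma^{-1}x + \mu^{T}\Sigma^{-1}\mu$$

and hence the exponent in A.1 is:

$$-\frac{1}{2}(x-\mu_i)^{T}\Sigma_i^{-1}(x-\mu_i) = -\frac{1}{2}x^{T}\Sigma_i^{-1}x + \mu_i^{T}\Sigma_i^{-1}x - \frac{1}{2}\mu_i^{T}\Sigma_i^{-1}\mu_i$$

so that A.1 becomes: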

18 See e.g. p. 4 in “Some Properties Of the Gaussian Distribution”, Jianxin Wu, GVU Center and College of Computing, Georgia Institute of Technology, April 22, 2004, http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.1.2635&rep=rep1&type=pdf, and p. 2 in “Manipulating the Multivariate Gaussian Density”, Thomas B. Schoen and Fredrik Lindsten, Division of Automatic Control, Linkoeping University, January 11, 2011, http://user.it.uu.se/~thosc112/pubpdf/schonl2011.pdf.

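A.5 $$f_i(x) = \frac{e^{-\frac{1}{2}\mu_i^{T}\Sigma_i^{-1}\mu_i}}{(2\pi)^{n/2}\,|\Sigma_i|^{1/2}}\; e^{-\frac{1}{2}x^{T}\Sigma_i^{-1}x + \mu_i^{T}\Sigma_i^{-1}x}$$

Because

$$|\Sigma|^{-1} = \left|\Sigma^{-1}\right|$$

the fraction on the left can be written as follows:

A.6 $$\frac{e^{-\frac{1}{2}\mu^{T}\Sigma^{-1}\mu}}{(2\pi)^{n/2}\,|\Sigma|^{0.5}} = e^{-\frac{1}{2}\mu^{T}\Sigma^{-1}\mu}\,(2\pi)^{-n/2}\,\left|\Sigma^{-1}\right|^{0.5} = e^{-\frac{1}{2}\mu^{T}\Sigma^{-1}\mu}\; e^{-\frac{n}{2}\ln 2\pi}\; e^{\frac{1}{2}\ln\left|\Sigma^{-1}\right|} = e^{-\frac{1}{2}\left(n\ln 2\pi - \ln\left|\Sigma^{-1}\right| + \mu^{T}\Sigma^{-1}\mu\right)},$$

and with the expressions introduced in A.2:

$$\eta_i = \Sigma_i^{-1}\mu_i, \qquad \Lambda_i = \Sigma_i^{-1}$$

the exponent on the right-hand side of A.6 can be written as in A.3:

$$-\frac{1}{2}\left(n\ln 2\pi - \ln\left|\Sigma_i^{-1}\right| + \mu_i^{T}\Sigma_i^{-1}\mu_i\right) = -\frac{1}{2}\left(n\ln 2\pi - \ln|\Lambda_i| + \eta_i^{T}\Lambda_i^{-1}\eta_i\right) = \zeta_i$$

and the density A.5 can be expressed as:19

A.7 $$f_i(x) = e^{\zeta_i + \eta_i^{T}x - \frac{1}{2}x^{T}\Lambda_i x}$$

which completes the proof.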
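The equivalence of the moment form A.1 and the information form A.7 can be checked numerically. The sketch below is only an illustration under assumed inputs: the function names and the two-dimensional example are hypothetical.

```python
import numpy as np

def density_moment_form(x, mu, Sigma):
    """Multivariate normal density in moment form (A.1)."""
    n = len(mu)
    diff = x - mu
    norm = (2 * np.pi) ** (-n / 2) * np.linalg.det(Sigma) ** (-0.5)
    return norm * np.exp(-0.5 * diff @ np.linalg.inv(Sigma) @ diff)

def density_information_form(x, mu, Sigma):
    """Same density in information form (A.7), using eta, Lambda, zeta from A.2 and A.3."""
    n = len(mu)
    Lam = np.linalg.inv(Sigma)              # Lambda = Sigma^{-1}
    eta = Lam @ mu                          # eta = Sigma^{-1} mu
    zeta = -0.5 * (n * np.log(2 * np.pi)
                   - np.log(np.linalg.det(Lam))
                   + eta @ np.linalg.inv(Lam) @ eta)
    return np.exp(zeta + eta @ x - 0.5 * x @ Lam @ x)

# Hypothetical two-dimensional example.
mu = np.array([0.5, -1.0])
Sigma = np.array([[1.0, 0.3],
                  [0.3, 2.0]])
x = np.array([0.3, -1.2])

moment_value = density_moment_form(x, mu, Sigma)
information_value = density_information_form(x, mu, Sigma)
print(moment_value, information_value)
```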

A.II Product of two multivariate normal densities:

Using A.7, a product of two multivariate normal densities for the same random vector $x$ can be written as:

A.8 $$f_1(x)\,f_2(x) = e^{\zeta_1 + \zeta_2 + (\eta_1+\eta_2)^{T}x - \frac{1}{2}x^{T}(\Lambda_1+\Lambda_2)x}$$

19 See for example the references given in footnote 18 and also page 5 in “Products and Convolutions of Gaussian Probability Density Functions”, P.A. Bromiley, Imaging Sciences Research Group, Institute of Population Health, School of Medicine, University of Manchester, Tina Memo No. 2003-003, Internal Report, last updated 14/8/2014, http://www.tina-vision.net/docs/memos/2003-003.pdf, in the following referred to as “Bromiley”.

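with

A.8.a $$\Lambda = \Lambda_1 + \Lambda_2, \qquad \eta = \eta_1 + \eta_2$$

and $\zeta$ defined analogously to A.3:

$$\zeta = -\frac{1}{2}\left(n\ln 2\pi - \ln|\Lambda| + \eta^{T}\Lambda^{-1}\eta\right),$$

A.8 can be expressed as:20

A.9 $$f_1(x)\,f_2(x) = e^{\zeta_1 + \zeta_2 - \zeta}\; e^{\zeta + \eta^{T}x - \frac{1}{2}x^{T}\Lambda x}$$

where

A.9.a $$e^{\zeta + \eta^{T}x - \frac{1}{2}x^{T}\Lambda x}$$

is a Gaussian pdf as defined by A.7 and the expressions introduced in A.2 and A.3.

The product of the two densities A.8 is hence a scaled Gaussian pdf and can be written as:

A.9.b $$f_1(x)\,f_2(x) = e^{\zeta_1 + \zeta_2 - \zeta}\; e^{\zeta + \eta^{T}x - \frac{1}{2}x^{T}\Lambda x} = S\, e^{\zeta + \eta^{T}x - \frac{1}{2}x^{T}\Lambda x}$$

where

A.9.c $$S = e^{\zeta_1 + \zeta_2 - \zeta}$$

is the scaling factor. In other words, the product of two multivariate normal densities, multiplied by the reciprocal of that scaling factor, is a multivariate normal density. The reciprocal of that scaling factor is also referred to as the normalizing constant.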

A.III Parameters of a scaled product of two multivariate Gaussian densities:

From A.9.b it follows that a product of two multivariate normal densities divided by a scaling factor is a multivariate normal density:

$$f(x) = \frac{f_1(x)\,f_2(x)}{S}, \qquad f \text{ being the density of } N(\mu, \Sigma)$$

20 This is a special case with n=2 of the expression for the product of n multivariate normal densities given in Bromiley.

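(with the scaling factor $S$ defined as in A.9.c)

From A.2 and A.8.a, one can see the parameters of the multivariate normal vector $x$ with density $f$:

$$\Sigma^{-1} = \Lambda = \Lambda_1 + \Lambda_2 = \Sigma_1^{-1} + \Sigma_2^{-1}$$

$$\Sigma^{-1}\mu = \eta = \eta_1 + \eta_2 = \Sigma_1^{-1}\mu_1 + \Sigma_2^{-1}\mu_2$$

A.9.d $$\Sigma = \left(\Sigma_1^{-1} + \Sigma_2^{-1}\right)^{-1}, \qquad \mu = \left(\Sigma_1^{-1} + \Sigma_2^{-1}\right)^{-1}\left(\Sigma_1^{-1}\mu_1 + \Sigma_2^{-1}\mu_2\right)$$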
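The parameter equations A.9.d can be checked with a short numerical sketch: if they are correct, the ratio of the product $f_1 f_2$ to the density with the combined parameters must be the same constant at every point. The function names and the example parameters are hypothetical.

```python
import numpy as np

def gaussian_pdf(x, mu, Sigma):
    """Density of N(mu, Sigma) at x, in the moment form A.1."""
    n = len(mu)
    diff = x - mu
    norm = (2 * np.pi) ** (-n / 2) * np.linalg.det(Sigma) ** (-0.5)
    return norm * np.exp(-0.5 * diff @ np.linalg.solve(Sigma, diff))

def product_parameters(mu1, Sigma1, mu2, Sigma2):
    """Mean and covariance of the normalized product of two Gaussian densities (A.9.d)."""
    L1, L2 = np.linalg.inv(Sigma1), np.linalg.inv(Sigma2)
    Sigma = np.linalg.inv(L1 + L2)
    mu = Sigma @ (L1 @ mu1 + L2 @ mu2)
    return mu, Sigma

# Hypothetical parameters of the two densities.
mu1 = np.array([0.0, 1.0])
Sigma1 = np.array([[1.0, 0.3], [0.3, 2.0]])
mu2 = np.array([1.0, -0.5])
Sigma2 = np.array([[0.5, -0.1], [-0.1, 1.5]])

mu, Sigma = product_parameters(mu1, Sigma1, mu2, Sigma2)

# The ratio f1 * f2 / f should not depend on the evaluation point.
points = [np.array([0.0, 0.0]), np.array([1.5, -2.0]), np.array([-0.7, 0.4])]
ratios = [gaussian_pdf(x, mu1, Sigma1) * gaussian_pdf(x, mu2, Sigma2)
          / gaussian_pdf(x, mu, Sigma) for x in points]
print(ratios)
```

The common value of the ratios is the scaling factor $S$ of A.9.c.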

A.IV Rewriting the scaling factor

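With

$$\zeta = -\frac{1}{2}\left(n\ln 2\pi - \ln|\Lambda| + \eta^{T}\Lambda^{-1}\eta\right) = -\frac{1}{2}\left(n\ln 2\pi - \ln|\Lambda_1+\Lambda_2| + (\eta_1+\eta_2)^{T}(\Lambda_1+\Lambda_2)^{-1}(\eta_1+\eta_2)\right)$$

the exponent of the scaling factor can be written as:

$$\zeta_1 + \zeta_2 - \zeta = -\frac{1}{2}\left(n\ln 2\pi - \ln|\Lambda_1| + \eta_1^{T}\Lambda_1^{-1}\eta_1\right) - \frac{1}{2}\left(n\ln 2\pi - \ln|\Lambda_2| + \eta_2^{T}\Lambda_2^{-1}\eta_2\right) + \frac{1}{2}\left(n\ln 2\pi - \ln|\Lambda_1+\Lambda_2| + (\eta_1+\eta_2)^{T}(\Lambda_1+\Lambda_2)^{-1}(\eta_1+\eta_2)\right)$$

Rearranging and simplifying this expression gives:

$$\zeta_1 + \zeta_2 - \zeta = -\frac{1}{2}n\ln 2\pi + \frac{1}{2}\ln|\Lambda_1| + \frac{1}{2}\ln|\Lambda_2| - \frac{1}{2}\ln|\Lambda_1+\Lambda_2| - \frac{1}{2}a$$

where $a$ is defined as follows:

A.10 $$a = \eta_1^{T}\Lambda_1^{-1}\eta_1 + \eta_2^{T}\Lambda_2^{-1}\eta_2 - (\eta_1+\eta_2)^{T}(\Lambda_1+\Lambda_2)^{-1}(\eta_1+\eta_2).$$

With

$$e^{\frac{1}{2}\ln|\Lambda_1| + \frac{1}{2}\ln|\Lambda_2| - \frac{1}{2}\ln|\Lambda_1+\Lambda_2|} = \left(|\Lambda_1|\,|\Lambda_1+\Lambda_2|^{-1}\,|\Lambda_2|\right)^{\frac{1}{2}} = \left|\Lambda_1(\Lambda_1+\Lambda_2)^{-1}\Lambda_2\right|^{\frac{1}{2}} = \left|\left(\Lambda_2^{-1}(\Lambda_1+\Lambda_2)\Lambda_1^{-1}\right)^{-1}\right|^{\frac{1}{2}} = \left|\left(\Lambda_1^{-1}+\Lambda_2^{-1}\right)^{-1}\right|^{\frac{1}{2}} = |\Sigma_1+\Sigma_2|^{-\frac{1}{2}}$$

and

$$e^{-\frac{1}{2}n\ln 2\pi} = (2\pi)^{-\frac{n}{2}}$$

the scaling factor $S$ can then be written as:21

A.11 $$S = (2\pi)^{-\frac{n}{2}}\,|\Sigma_1+\Sigma_2|^{-0.5}\, e^{-\frac{1}{2}a}$$

With

$$\eta = \eta_1 + \eta_2 \qquad \text{and} \qquad \Lambda = \Lambda_1 + \Lambda_2$$

$a$ becomes:

$$a = \eta_1^{T}\Lambda_1^{-1}\eta_1 + \eta_2^{T}\Lambda_2^{-1}\eta_2 - (\eta_1+\eta_2)^{T}(\Lambda_1+\Lambda_2)^{-1}(\eta_1+\eta_2) = \eta_1^{T}\Lambda_1^{-1}\eta_1 + \eta_2^{T}\Lambda_2^{-1}\eta_2 - \eta^{T}\Lambda^{-1}\eta$$

This expression is now simplified enough to replace the expressions introduced for the information form:

$$a = \eta_1^{T}\Lambda_1^{-1}\eta_1 + \eta_2^{T}\Lambda_2^{-1}\eta_2 - \eta^{T}\Lambda^{-1}\eta = \mu_1^{T}\Sigma_1^{-1}\mu_1 + \mu_2^{T}\Sigma_2^{-1}\mu_2 - \left(\Sigma_1^{-1}\mu_1 + \Sigma_2^{-1}\mu_2\right)^{T}\left(\Sigma_1^{-1}+\Sigma_2^{-1}\right)^{-1}\left(\Sigma_1^{-1}\mu_1 + \Sigma_2^{-1}\mu_2\right)$$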

21 Corresponds to equation 6d in “Gaussian Identities”, Sam Roweis, (revised July 1999), https://www.cs.nyu.edu/~roweis/notes/gaussid.pdf

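A.12 $$a = \mu_1^{T}\Sigma_1^{-1}\mu_1 + \mu_2^{T}\Sigma_2^{-1}\mu_2 - \left(\mu_1^{T}\Sigma_1^{-1} + \mu_2^{T}\Sigma_2^{-1}\right)\left(\Sigma_1^{-1}+\Sigma_2^{-1}\right)^{-1}\left(\Sigma_1^{-1}\mu_1 + \Sigma_2^{-1}\mu_2\right)$$

The three-brackets product on the right can be written as:

$$\left(\mu_1^{T}\Sigma_1^{-1} + \mu_2^{T}\Sigma_2^{-1}\right)\left(\Sigma_1^{-1}+\Sigma_2^{-1}\right)^{-1}\left(\Sigma_1^{-1}\mu_1 + \Sigma_2^{-1}\mu_2\right)$$

A.13 $$= \mu_1^{T}\Sigma_1^{-1}\left(\Sigma_1^{-1}+\Sigma_2^{-1}\right)^{-1}\left(\Sigma_1^{-1}\mu_1 + \Sigma_2^{-1}\mu_2\right) + \mu_2^{T}\Sigma_2^{-1}\left(\Sigma_1^{-1}+\Sigma_2^{-1}\right)^{-1}\left(\Sigma_1^{-1}\mu_1 + \Sigma_2^{-1}\mu_2\right)$$

A special case of the matrix inversion lemma is:22

A.14 $$\left(\Sigma_1^{-1}+\Sigma_2^{-1}\right)^{-1} = \Sigma_1 - \Sigma_1(\Sigma_1+\Sigma_2)^{-1}\Sigma_1 = \Sigma_2 - \Sigma_2(\Sigma_1+\Sigma_2)^{-1}\Sigma_2$$

So that the first summand in A.13 can be written as:

$$\mu_1^{T}\Sigma_1^{-1}\left(\Sigma_1^{-1}+\Sigma_2^{-1}\right)^{-1}\left(\Sigma_1^{-1}\mu_1 + \Sigma_2^{-1}\mu_2\right)$$

$$= \mu_1^{T}\Sigma_1^{-1}\left[\Sigma_1 - \Sigma_1(\Sigma_1+\Sigma_2)^{-1}\Sigma_1\right]\left(\Sigma_1^{-1}\mu_1 + \Sigma_2^{-1}\mu_2\right)$$

$$= \mu_1^{T}\left[I - (\Sigma_1+\Sigma_2)^{-1}\Sigma_1\right]\left(\Sigma_1^{-1}\mu_1 + \Sigma_2^{-1}\mu_2\right)$$

$$= \mu_1^{T}\Sigma_1^{-1}\mu_1 + \mu_1^{T}\Sigma_2^{-1}\mu_2 - \mu_1^{T}(\Sigma_1+\Sigma_2)^{-1}\mu_1 - \mu_1^{T}(\Sigma_1+\Sigma_2)^{-1}\Sigma_1\Sigma_2^{-1}\mu_2$$

And the second summand:

$$\mu_2^{T}\Sigma_2^{-1}\left(\Sigma_1^{-1}+\Sigma_2^{-1}\right)^{-1}\left(\Sigma_1^{-1}\mu_1 + \Sigma_2^{-1}\mu_2\right)$$

$$= \mu_2^{T}\Sigma_2^{-1}\left[\Sigma_2 - \Sigma_2(\Sigma_1+\Sigma_2)^{-1}\Sigma_2\right]\left(\Sigma_1^{-1}\mu_1 + \Sigma_2^{-1}\mu_2\right)$$

$$= \mu_2^{T}\left[I - (\Sigma_1+\Sigma_2)^{-1}\Sigma_2\right]\left(\Sigma_1^{-1}\mu_1 + \Sigma_2^{-1}\mu_2\right)$$

$$= \mu_2^{T}\Sigma_1^{-1}\mu_1 + \mu_2^{T}\Sigma_2^{-1}\mu_2 - \mu_2^{T}(\Sigma_1+\Sigma_2)^{-1}\Sigma_2\Sigma_1^{-1}\mu_1 - \mu_2^{T}(\Sigma_1+\Sigma_2)^{-1}\mu_2$$

So that the sum in A.13 becomes: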

22 See p. 201 of “Gaussian Processes for Machine Learning”, by C. E. Rasmussen & C. K. I. Williams, the MIT Press, 2006, http://www.gaussianprocess.org/gpml/chapters/RWA.pdf, in the following referred to as “Rasmussen”.

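$$\mu_1^{T}\Sigma_1^{-1}\mu_1 + \mu_2^{T}\Sigma_2^{-1}\mu_2 + \mu_1^{T}\Sigma_2^{-1}\mu_2 + \mu_2^{T}\Sigma_1^{-1}\mu_1 - \mu_1^{T}(\Sigma_1+\Sigma_2)^{-1}\mu_1 - \mu_2^{T}(\Sigma_1+\Sigma_2)^{-1}\mu_2 - \mu_1^{T}(\Sigma_1+\Sigma_2)^{-1}\Sigma_1\Sigma_2^{-1}\mu_2 - \mu_2^{T}(\Sigma_1+\Sigma_2)^{-1}\Sigma_2\Sigma_1^{-1}\mu_1$$

And with that one gets for A.12:

A.15 $$a = \mu_1^{T}(\Sigma_1+\Sigma_2)^{-1}\mu_1 + \mu_2^{T}(\Sigma_1+\Sigma_2)^{-1}\mu_2 - \mu_1^{T}\Sigma_2^{-1}\mu_2 - \mu_2^{T}\Sigma_1^{-1}\mu_1 + \mu_1^{T}(\Sigma_1+\Sigma_2)^{-1}\Sigma_1\Sigma_2^{-1}\mu_2 + \mu_2^{T}(\Sigma_1+\Sigma_2)^{-1}\Sigma_2\Sigma_1^{-1}\mu_1$$

To proceed, use the following result:23

$$(\Sigma_1+\Sigma_2)^{-1} = \Sigma_1^{-1}\left(\Sigma_1^{-1}+\Sigma_2^{-1}\right)^{-1}\Sigma_2^{-1} = \Sigma_2^{-1}\left(\Sigma_1^{-1}+\Sigma_2^{-1}\right)^{-1}\Sigma_1^{-1}$$

Proof:

$$\Sigma_1 + \Sigma_2 = \Sigma_1\left(\Sigma_2^{-1} + \Sigma_1^{-1}\right)\Sigma_2$$

And:

$$\Sigma_1 + \Sigma_2 = \Sigma_2\left(\Sigma_1^{-1} + \Sigma_2^{-1}\right)\Sigma_1$$

Hence: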

23 For example given at http://math.stackexchange.com/questions/934921/inverse-of-a-sum-of-symmetric-matrices

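$$(\Sigma_1+\Sigma_2)^{-1} = \Sigma_2^{-1}\left(\Sigma_1^{-1}+\Sigma_2^{-1}\right)^{-1}\Sigma_1^{-1}$$

And:

$$(\Sigma_1+\Sigma_2)^{-1} = \Sigma_1^{-1}\left(\Sigma_1^{-1}+\Sigma_2^{-1}\right)^{-1}\Sigma_2^{-1}$$

Apply the result just proven on A.15:

$$a = \mu_1^{T}(\Sigma_1+\Sigma_2)^{-1}\mu_1 + \mu_2^{T}(\Sigma_1+\Sigma_2)^{-1}\mu_2 - \mu_1^{T}\Sigma_2^{-1}\mu_2 - \mu_2^{T}\Sigma_1^{-1}\mu_1 + \mu_1^{T}\Sigma_2^{-1}\left(\Sigma_1^{-1}+\Sigma_2^{-1}\right)^{-1}\Sigma_1^{-1}\Sigma_1\Sigma_2^{-1}\mu_2 + \mu_2^{T}\Sigma_1^{-1}\left(\Sigma_1^{-1}+\Sigma_2^{-1}\right)^{-1}\Sigma_2^{-1}\Sigma_2\Sigma_1^{-1}\mu_1$$

$$= \mu_1^{T}(\Sigma_1+\Sigma_2)^{-1}\mu_1 + \mu_2^{T}(\Sigma_1+\Sigma_2)^{-1}\mu_2 - \mu_1^{T}\Sigma_2^{-1}\mu_2 - \mu_2^{T}\Sigma_1^{-1}\mu_1 + \mu_1^{T}\Sigma_2^{-1}\left(\Sigma_1^{-1}+\Sigma_2^{-1}\right)^{-1}\Sigma_2^{-1}\mu_2 + \mu_2^{T}\Sigma_1^{-1}\left(\Sigma_1^{-1}+\Sigma_2^{-1}\right)^{-1}\Sigma_1^{-1}\mu_1$$

Apply now the special case of the matrix inversion lemma given earlier again and simplify:

$$\mu_1^{T}\Sigma_2^{-1}\left(\Sigma_1^{-1}+\Sigma_2^{-1}\right)^{-1}\Sigma_2^{-1}\mu_2 = \mu_1^{T}\Sigma_2^{-1}\left[\Sigma_2 - \Sigma_2(\Sigma_1+\Sigma_2)^{-1}\Sigma_2\right]\Sigma_2^{-1}\mu_2 = \mu_1^{T}\Sigma_2^{-1}\mu_2 - \mu_1^{T}(\Sigma_1+\Sigma_2)^{-1}\mu_2$$

$$\mu_2^{T}\Sigma_1^{-1}\left(\Sigma_1^{-1}+\Sigma_2^{-1}\right)^{-1}\Sigma_1^{-1}\mu_1 = \mu_2^{T}\Sigma_1^{-1}\left[\Sigma_1 - \Sigma_1(\Sigma_1+\Sigma_2)^{-1}\Sigma_1\right]\Sigma_1^{-1}\mu_1 = \mu_2^{T}\Sigma_1^{-1}\mu_1 - \mu_2^{T}(\Sigma_1+\Sigma_2)^{-1}\mu_1$$

so that

$$a = \mu_1^{T}(\Sigma_1+\Sigma_2)^{-1}\mu_1 - \mu_1^{T}(\Sigma_1+\Sigma_2)^{-1}\mu_2 - \mu_2^{T}(\Sigma_1+\Sigma_2)^{-1}\mu_1 + \mu_2^{T}(\Sigma_1+\Sigma_2)^{-1}\mu_2$$

Rearrange:

$$a = (\mu_1-\mu_2)^{T}(\Sigma_1+\Sigma_2)^{-1}\mu_1 - (\mu_1-\mu_2)^{T}(\Sigma_1+\Sigma_2)^{-1}\mu_2 = (\mu_1-\mu_2)^{T}(\Sigma_1+\Sigma_2)^{-1}(\mu_1-\mu_2)$$

Use this result for $a$ in the equation for the scaling factor A.11:

$$S = (2\pi)^{-\frac{n}{2}}\,|\Sigma_1+\Sigma_2|^{-0.5}\, e^{-\frac{1}{2}a}$$

Substitute for $a$:

A.16 $$S = (2\pi)^{-\frac{n}{2}}\,|\Sigma_1+\Sigma_2|^{-\frac{1}{2}}\, e^{-\frac{1}{2}(\mu_1-\mu_2)^{T}(\Sigma_1+\Sigma_2)^{-1}(\mu_1-\mu_2)}$$

So the scaling factor is the value of the density function of a multivariate Gaussian with mean $\mu_2$ and covariance matrix $\Sigma_1+\Sigma_2$, evaluated at $\mu_1$.24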

24 This is the form for the normalizing factor as given in Rasmussen, equation A.8, page 200. Note that, as written there, $\mu_1$ and $\mu_2$ can be exchanged.
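The closed form A.16 for the scaling factor can be compared numerically with the ratio $f_1 f_2 / f$ from A.9.b. The sketch below is an illustration with hypothetical parameters; the helper name `gaussian_pdf` is not from the paper.

```python
import numpy as np

def gaussian_pdf(x, mu, Sigma):
    """Density of N(mu, Sigma) at x, in the moment form A.1."""
    n = len(mu)
    diff = x - mu
    norm = (2 * np.pi) ** (-n / 2) * np.linalg.det(Sigma) ** (-0.5)
    return norm * np.exp(-0.5 * diff @ np.linalg.solve(Sigma, diff))

# Hypothetical parameters of the two densities.
mu1 = np.array([0.0, 1.0])
Sigma1 = np.array([[1.0, 0.3], [0.3, 2.0]])
mu2 = np.array([1.0, -0.5])
Sigma2 = np.array([[0.5, -0.1], [-0.1, 1.5]])

# Parameters of the normalized product, as in A.9.d.
L1, L2 = np.linalg.inv(Sigma1), np.linalg.inv(Sigma2)
Sigma = np.linalg.inv(L1 + L2)
mu = Sigma @ (L1 @ mu1 + L2 @ mu2)

# Scaling factor obtained as the ratio f1 * f2 / f at an arbitrary point.
x = np.array([0.4, -0.9])
S_from_ratio = (gaussian_pdf(x, mu1, Sigma1) * gaussian_pdf(x, mu2, Sigma2)
                / gaussian_pdf(x, mu, Sigma))

# Closed form A.16: density of N(mu2, Sigma1 + Sigma2) evaluated at mu1.
S_closed_form = gaussian_pdf(mu1, mu2, Sigma1 + Sigma2)

print(S_from_ratio, S_closed_form)
```

Exchanging `mu1` and `mu2` in the last call gives the same value, in line with the note in the final footnote.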
