The 101 ways to measure portfolio performance
Philippe Cogneau
Researcher, University of Liège, HEC Management School
Email: [email protected]
Georges Hübner
Deloitte Professor of Financial Management, University of Liège, HEC Management School
Associate Professor of Finance, Maastricht University, Faculty of Economics and Business Administration
Mailing address: Université de Liège, Rue Louvrex 14, Bat. N1, B-4000 Liège, Belgium.
Phone : (+32) 4 2327428
Email: [email protected]
Abstract
This paper performs a census of the 101 performance measures for portfolios that have been proposed so
far in the scientific literature. We discuss their main strengths and weaknesses and provide a classification
based on their objectives, properties and degree of generalization. The measures are categorized based on
the general way they are computed: asset selection vs. market timing, standardized vs. individualized,
absolute vs. relative and excess return vs. gain measure. We show that several categories have been
exhausted while some others feature very heterogeneous ways to assess performance within the same sets
of objectives.
1. Introduction
Since the introduction of the Sharpe ratio in 1966, many different measures of portfolio performance have
been introduced in both the scientific and the practitioner literature. Yet, there exists no census of all of
them. The most complete study so far is due to Le Sourd [2007], but it mentions about fifty different
measures1.
From an exhaustive review of the relevant literature, we have identified one hundred and one
portfolio performance measures2. The main purpose of this paper is to provide a taxonomy of them. It
naturally involves the identification of categories, in which we gather those measures that display
common characteristics. Hence, we do not only provide an exhaustive list, but also a partition of the
performance measurement area in homogenous categories.3
The second objective of this article is to identify, among the categories, those that can be considered
as “dead-ends” in terms of further investigations. Whenever there exists a performance measure that
provides a proper generalization of any measure within the same category, then common sense dictates
the usage of this particular measure and the abandonment of any other attempt to research further in that
direction.
2. A general typology
Insert exhibit 1 approximately here
Exhibit 1 displays the structure of the simple binary classification tree proposed in this paper.
In the first level, we distinguish the types of skills reflected in the measures, namely asset selection
versus market timing. Measures that reflect asset selection are themselves split according to the
individualization of performance. We segregate the standardized risk-adjusted performance measures
versus those that explicitly depend on investors’ preferences. Finally, in the category of risk-adjusted
performance measures, all corresponding measures can be classified according to a double entry table.
The first dimension represents the measure of value creation, whether it is an excess return or a gain
potential. The second dimension reports the type of performance translation, in relative (ratio) or
absolute (difference) terms. Each category corresponds to a given section or sub-section.
3. Ratios performance / risk
Insert exhibit 2 approximately here
In the first class, we consider all measures that are computed as a ratio dividing the performance by a risk
measure (category Asset Selection/Standardized/Relative). The sub-classifications are made according to
how risk is measured.
3.1. Absolute risk
3.1.1. Sharpe ratio and close variations
The original measure of this kind is the Sharpe ratio [Sharpe, 1966], defined as the ratio of the mean
return in excess of the risk free rate over its standard deviation. It rests on the hypothesis that returns are
normally distributed and/or that the investor has a utility function whose only arguments are expectation
and variance of returns. Simplicity and ease of interpretation are the main strengths of this ratio4. For
these reasons, it is still widely used by financial institutions to compare the performance of mutual funds.
Central to the usefulness of the Sharpe Ratio is the fact that an excess return represents the result of a
"zero-investment strategy". So, it represents the payoff from a unit of investment financed by borrowing.
Because it refers to total risk, it can be used for a well-diversified financial portfolio, which is meant to
represent an individual’s total investment. Another important quality is that it cannot be manipulated by
leverage – which is a weakness of Jensen’s alpha that we present below.
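As a minimal sketch of the definition above, the Sharpe ratio can be computed from a series of periodic returns, and its invariance to leverage checked numerically. The function names, the return series and the constant risk-free rate are hypothetical and only serve as an illustration.

```python
from statistics import mean, stdev

def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by the sample standard deviation of excess returns."""
    excess = [r - risk_free for r in returns]
    return mean(excess) / stdev(excess)

def leveraged(returns, risk_free, leverage):
    """Returns of a position scaled by `leverage` and financed at the risk-free rate."""
    return [risk_free + leverage * (r - risk_free) for r in returns]

# Hypothetical monthly returns and per-period risk-free rate (illustrative only).
returns = [0.02, -0.01, 0.03, 0.015, -0.005, 0.01]
rf = 0.002

base = sharpe_ratio(returns, rf)
doubled = sharpe_ratio(leveraged(returns, rf, 2.0), rf)
# Leverage scales the mean excess return and its standard deviation by the
# same factor, so the ratio is unchanged (up to floating-point rounding).
```

Under these assumptions, `base` and `doubled` coincide, which is the sense in which the ratio cannot be manipulated by leverage.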
On the other hand, the Sharpe ratio exhibits numerous drawbacks as well. First, it does not quantify
the value added, if any: it is only a ranking criterion. It also assumes frictionless financial markets, so that
it is possible to borrow to invest more than 100% in a risky portfolio, which is not always the case in practice.
It further assumes that the risk-free rate is constant and identical for lending and borrowing. In its computation,
the choice of risk-free rate matters, as it affects rankings, though the impact is rather weak.
The Sharpe ratio is an absolute measure that does not refer to a benchmark.5 It equally measures the
performance of the portfolio and the performance of the market in which the portfolio is invested.
From the point of view of the investor, the investment horizon must match the performance
measurement period. Furthermore, as it measures total risk, the Sharpe ratio is only suitable for investors
who invest in only one fund. When portfolios are aggregated, its consolidation is not straightforward
because of the covariance effects between the constituent portfolios.
Its interpretation is also difficult when it is negative: if risk increases, the Sharpe ratio also increases.
To tackle this issue, Israelsen [2005] proposes Israelsen’s modified Sharpe ratio, in which the
denominator is raised to the power of the excess return divided by its absolute value. With this measure, the
values have a wider range in size, but do not give useful information in absolute terms.
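The effect of the exponent can be illustrated with a short sketch. Two hypothetical funds share the same negative mean excess return but differ in risk; the helper names and the data are assumptions made for the example.

```python
from statistics import mean, stdev

def sharpe_ratio(excess):
    """Standard Sharpe ratio computed on a series of excess returns."""
    return mean(excess) / stdev(excess)

def israelsen_sharpe(excess):
    """Excess return divided by the standard deviation raised to ER/|ER| (+1 or -1)."""
    er = mean(excess)
    if er == 0:
        return 0.0
    exponent = er / abs(er)
    return er / stdev(excess) ** exponent

# Hypothetical excess returns: same mean (about -1.25%), very different dispersion.
low_risk = [-0.01, -0.02, -0.015, -0.005]
high_risk = [0.03, -0.06, 0.02, -0.04]

# With the standard ratio, the riskier fund looks better (less negative).
standard_prefers_risky = sharpe_ratio(high_risk) > sharpe_ratio(low_risk)
# With the modified ratio, the less risky fund is ranked first.
modified_prefers_safe = israelsen_sharpe(low_risk) > israelsen_sharpe(high_risk)
```

When the excess return is negative the exponent is -1, so the excess return is multiplied by the standard deviation instead of divided by it. This restores a sensible ranking, although the resulting number has no natural unit.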
A problem rarely mentioned is the sampling error embedded in the values of the ratio. The estimate
of the standard deviation is measured with statistical noise. Vinod and Morey [2001] introduce the double
Sharpe ratio, computed as the quotient of the Sharpe ratio estimate by its standard deviation. To compute
it, they use a bootstrapping methodology and generate a great number of resamplings from the original
return sample.
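A rough sketch of this computation is given below. It assumes a plain i.i.d. bootstrap with replacement, which may differ from the resampling design actually used by Vinod and Morey; the function names, the number of resamplings and the data are hypothetical.

```python
import random
from statistics import mean, stdev

def sharpe_ratio(returns, risk_free=0.0):
    excess = [r - risk_free for r in returns]
    return mean(excess) / stdev(excess)

def double_sharpe(returns, risk_free=0.0, n_resamples=1000, seed=0):
    """Sharpe ratio estimate divided by the bootstrap standard deviation of that estimate."""
    rng = random.Random(seed)  # seeded so that the sketch is reproducible
    n = len(returns)
    estimates = []
    for _ in range(n_resamples):
        sample = rng.choices(returns, k=n)  # resample with replacement
        if max(sample) > min(sample):       # skip degenerate samples (zero dispersion)
            estimates.append(sharpe_ratio(sample, risk_free))
    return sharpe_ratio(returns, risk_free) / stdev(estimates)

# Hypothetical monthly returns (illustrative only).
monthly = [0.021, -0.013, 0.034, 0.008, -0.022, 0.015,
           0.027, -0.004, 0.011, 0.019, -0.017, 0.006]
ds = double_sharpe(monthly, risk_free=0.002, n_resamples=500, seed=1)
```

A large value of `ds` indicates that the Sharpe ratio estimate is large relative to its own sampling noise.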
The assumption of a Gaussian returns distribution does not hold for many funds, in particular for
hedge funds, so different statistical adaptations were proposed in the literature. Spurgin [2001] shows that
with the issuance of out-of-the-money options, the manager of a fund can increase the Sharpe ratio by
improving the mean-variance trade-off while altering the tail of his portfolio. Statistical variations are
proposed to tackle this issue, by including higher moments in the formula. Zakamouline and Koekebakker
[2008] propose the adjusted for skewness Sharpe ratio (ASSR), and even an adjusted for skewness
and kurtosis Sharpe ratio (ASKSR). Watanabe [2006] also considers these third and fourth moments,
but in a simpler form, in his Sharpe + skewness/kurtosis ratio.
Mahdavi [2004] introduces an adjusted Sharpe ratio (ASR) to evaluate assets whose return
distribution is not normal. The approach is to transform the payoff so that its distribution will match that
of the benchmark: once the return is transformed, the resulting Sharpe ratio of the asset can be directly
compared to that of the benchmark, knowing the total payoffs from both instruments have exactly the
same distributions.
Lo [2002] shows that hedge fund returns are serially correlated, which biases the standard deviation in the
denominator and can lead to Sharpe ratios that are up to 70% too high. He suggests a Sharpe ratio adapted to autocorrelation
whose formula includes a bias corrector. In fact, this is more a bias correction than a truly new measure.
Moreover, the idea of multiplying a performance measure by a bias corrector can be extended to any other
performance measure.
The reference value in the Sharpe ratio is the risk-free rate. An interesting variation was proposed by Roy
in 1952, fourteen years before Sharpe. He proposes to compare the return to a reserve return that is
specific to the investor. Roy’s measure thus makes it possible to consider different utility functions (in general,
the greater the reserve return, the better the portfolios with higher returns are ranked), but it shares all the other
drawbacks of the Sharpe ratio. Indeed, in many measures, authors use both the risk-free and the reserve
return in the numerator.
Despite all these statistical adaptations, most issues of the Sharpe ratio remain. This explains why
many variations of the Sharpe ratio were introduced.
3.1.2. Other absolute risk measures
3.1.2.1 Half- and semi-variance
By using standard deviation of returns, the Sharpe measure puts both positive and negative variations
from the average on the same level. But most investors are only afraid of negative variations. The Sharpe
ratio does not make any distinction between upside risk and downside risk.
In the reward to half-variance index, introduced by Ang and Chua [1979], the standard deviation is
replaced by the half-variance which considers only the returns lower than the mean. Pure downside-risk,
i.e. only pure losses with a return lower than zero, is considered in the downside-risk Sharpe ratio
[Ziemba, 2005].
Within this category, the most widely used measure is the Sortino ratio6 because of its flexibility. It
combines previous measures, subtracting like Roy a reserve return in the numerator, and considering the
same reserve return in the computation of the semi-variance at the denominator. Watanabe [2006]
improves it in the same direction as the Sharpe ratio, with his Sortino + skewness/kurtosis ratio.
A refined variation is the Sortino-Satchell ratio [Sortino, 2000; Sortino and Satchell, 2001]7, in
which the semi-variance related to a reserve return is replaced by a lower partial moment of order q; it
coincides with the Sortino ratio when q = 2. The introduction of a power index permits the consideration of
the investor’s degree of risk aversion: in practice, a value of q = 0.8 is used to describe an aggressive
investor and 2.5 for a conservative investor.
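The ratios of this subsection can be sketched with a single lower partial moment helper. Taking the q-th root of the moment in the denominator is an assumption made here so that the case q = 2 reduces to the Sortino ratio; the function names, the reserve return and the data are hypothetical.

```python
from statistics import mean

def lower_partial_moment(returns, reserve, order):
    """Average of the shortfalls below the reserve return, each raised to `order`."""
    return mean(max(reserve - r, 0.0) ** order for r in returns)

def sortino_satchell(returns, reserve, order=2.0):
    """Mean return over the reserve return, divided by the order-th root of the LPM."""
    lpm = lower_partial_moment(returns, reserve, order)
    return (mean(returns) - reserve) / lpm ** (1.0 / order)

def sortino(returns, reserve):
    """Special case q = 2: the denominator is the downside deviation."""
    return sortino_satchell(returns, reserve, 2.0)

# Hypothetical monthly returns and reserve return (illustrative only).
returns = [0.02, -0.01, 0.03, 0.015, -0.005, 0.01]
reserve = 0.005

base_case = sortino(returns, reserve)
aggressive = sortino_satchell(returns, reserve, 0.8)    # q = 0.8
conservative = sortino_satchell(returns, reserve, 2.5)  # q = 2.5
```

Only the two observations below the reserve return contribute to the denominator; the order q controls how strongly large shortfalls are penalized relative to small ones.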
3.1.2.2 VaR and CVaR
Another idea is to consider the Value at Risk (VaR) as a risk indicator. Value at Risk is the measure
selected by the investor who is mostly concerned with disasters, i.e. rare events. For instance, if we consider
a threshold α of 5%, VaRα will give the minimum loss that will happen in the worst 5% of the cases.
Dividing the VaRα by the initial value of the portfolio, we obtain a percentage of loss which is a risk
indicator and can be used as denominator in the Sharpe ratio. Dowd [1999, 2000] logically calls it the Sharpe
ratio based on the Value at Risk. This measure also tackles one important drawback of the Sharpe ratio,
its inability to distinguish between upside and downside risks. It also distinguishes irregular losses from
repeated losses. It is particularly useful when making hedging decisions, as it makes it possible to avoid
the excessive use of micro hedges against individual risk exposures.
The accurate numerical estimation of the VaR is computationally intensive and can be quite complex,
in particular because it requires large databases. Favre and Galeano [2002] therefore propose the Sharpe ratio based on
Cornish-Fisher VaR. Its formula includes the third and fourth moments of the distribution, and thus
has the added advantage of accommodating non-normal return distributions.
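A minimal sketch of this idea uses the standard Cornish-Fisher expansion, which adjusts the Gaussian quantile for skewness and excess kurtosis. The exact specification used by Favre and Galeano may differ in details; the function names, the use of population moments and the data are assumptions.

```python
from statistics import NormalDist, mean, pstdev

def cornish_fisher_quantile(alpha, skew, excess_kurt):
    """Gaussian quantile adjusted for skewness and excess kurtosis."""
    z = NormalDist().inv_cdf(alpha)
    return (z
            + (z ** 2 - 1) * skew / 6
            + (z ** 3 - 3 * z) * excess_kurt / 24
            - (2 * z ** 3 - 5 * z) * skew ** 2 / 36)

def cornish_fisher_var(returns, alpha=0.05):
    """Loss (as a positive fraction) at threshold alpha, from the first four moments."""
    mu, sigma = mean(returns), pstdev(returns)
    skew = mean(((r - mu) / sigma) ** 3 for r in returns)
    excess_kurt = mean(((r - mu) / sigma) ** 4 for r in returns) - 3.0
    return -(mu + cornish_fisher_quantile(alpha, skew, excess_kurt) * sigma)

def cf_var_sharpe(returns, risk_free=0.0, alpha=0.05):
    """Mean excess return divided by the Cornish-Fisher VaR."""
    return (mean(returns) - risk_free) / cornish_fisher_var(returns, alpha)

# Hypothetical returns with one large loss, i.e. negative skewness (illustrative only).
cf_returns = [0.03, 0.01, -0.02, 0.015, -0.08, 0.02, 0.005, -0.01, 0.025, 0.04]
cf_var = cornish_fisher_var(cf_returns, 0.05)
cf_ratio = cf_var_sharpe(cf_returns, risk_free=0.0, alpha=0.05)
```

With zero skewness and zero excess kurtosis the adjusted quantile collapses to the Gaussian one, while negative skewness pushes it further into the left tail and therefore increases the VaR.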
There are other issues related to the VaR. It is sensitive to the selected threshold, as conflicting
results sometimes arise at different confidence levels. Like any quantile measure, it is not sub-additive,
which implies that portfolio diversification may lead to an increase of risk. It does not measure
losses exceeding VaR, which are definitely of interest, even more than the VaR itself. Finally, VaR has
many local extremes, leading to unstable rankings.
Instead of using the VaR, the Sharpe ratio based on the Conditional Value at Risk (CVaR), i.e. the
average loss when it exceeds the VaR, introduced by Artzner et al. [1999]8, addresses the last two
drawbacks. It assesses how deep the loss is in case of a disaster, rather than estimating the
threshold beyond which one can speak of a disaster.
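The difference between the two risk indicators can be sketched with a simple historical (empirical) estimator, which is one possible choice among several. The function names, the threshold and the data are hypothetical.

```python
import math
from statistics import mean

def worst_tail(returns, alpha):
    """The worst alpha fraction of the observed returns (at least one observation)."""
    k = max(1, math.ceil(alpha * len(returns)))
    return sorted(returns)[:k]

def historical_var(returns, alpha=0.05):
    """Minimum loss within the worst alpha fraction of cases (positive for a loss)."""
    return -max(worst_tail(returns, alpha))

def historical_cvar(returns, alpha=0.05):
    """Average loss within the worst alpha fraction of cases."""
    return -mean(worst_tail(returns, alpha))

def var_sharpe(returns, risk_free=0.0, alpha=0.05):
    return (mean(returns) - risk_free) / historical_var(returns, alpha)

def cvar_sharpe(returns, risk_free=0.0, alpha=0.05):
    return (mean(returns) - risk_free) / historical_cvar(returns, alpha)

# Hypothetical returns; with alpha = 20% the tail holds the two worst observations.
tail_returns = [0.03, 0.01, -0.02, 0.015, -0.08, 0.02, 0.005, -0.01, 0.025, 0.04]
var_20 = historical_var(tail_returns, 0.2)    # loss on the second-worst return
cvar_20 = historical_cvar(tail_returns, 0.2)  # average loss on the two worst returns
```

The CVaR is never smaller than the VaR at the same threshold, because it averages the losses beyond it. Both ratios are only meaningful when the tail actually contains losses, so that the denominator is positive.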
3.1.2.3 Miscellaneous with absolute risk
Various suggestions to estimate risk have led to other versions of the Sharpe ratio. They are too different
to be attached to a specific group, and we list them with their main characteristics.
A possibility is to consider the mean absolute deviation in the denominator, as in the mean absolute
deviation (MAD) ratio of Konno and Yamazaki [1991]. This ratio is more robust to outliers than the
Sharpe ratio.
The Gini ratio, proposed by Yitzhaki [1982], is the ratio between the excess return over the risk-free
rate and its Gini coefficient. The Gini coefficient is a measure of dispersion that depends on the spread of
values among themselves, rather than on the deviations from some fixed central point like the mean, as is
the case for the standard deviation. It is often used in the economics literature to measure income dispersion
and the discriminatory power of rating models in credit risk management. It shares many properties with
the variance, but appears to be more informative for distributions that depart from normality. It also has
the advantage of being linear.
Young [1998] introduces the Minimax ratio as the ratio between the expected excess return and the
Minimax risk measure, the latter being the maximum loss over all past observations. On the one hand, it can
be seen as an extreme sub-case of the Sharpe ratio based on the Conditional Value at Risk; on the other,
one can see the MAD as based on an L1 risk measure (indifference to risk over any linear region of the
piecewise function), the Sharpe ratio as based on an L2 risk measure (risk is the square root of the sum of the
squared errors to the mean), and the Minimax as based on an L∞ risk measure (strong absolute aversion to
downside risk). The Minimax ratio is easy to compute, but strongly affected by outliers in the historical data.
Martin and McCann [1989] propose the Ulcer performance index. The denominator is the Ulcer
index, computed as the quadratic mean of the percentage drops in value during the observed period; the Ulcer
index measures the depth and the duration of percentage drawdowns in price from earlier highs. While
remaining easy to compute, it presents two concrete advantages compared to the Sharpe ratio: it
considers only downward changes, and it recognizes the strings of losses that result in significant drawdowns
in value.
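The Ulcer index can be sketched from a price series as follows. Measuring each drop against the running maximum and using the total return of the period in the numerator are assumptions of this example; the function names and the prices are hypothetical.

```python
def ulcer_index(prices):
    """Quadratic mean of the percentage drops from the running maximum."""
    peak = prices[0]
    squared_drops = []
    for price in prices:
        peak = max(peak, price)
        drop_pct = 100.0 * (price - peak) / peak  # zero at a new high, negative otherwise
        squared_drops.append(drop_pct ** 2)
    return (sum(squared_drops) / len(squared_drops)) ** 0.5

def ulcer_performance_index(prices, risk_free_pct=0.0):
    """Period return in percent, in excess of the risk-free return, over the Ulcer index."""
    total_return_pct = 100.0 * (prices[-1] / prices[0] - 1.0)
    return (total_return_pct - risk_free_pct) / ulcer_index(prices)

# Hypothetical price paths (illustrative only).
steady = [100.0, 110.0, 121.0]    # never below its previous high
dip = [100.0, 90.0, 100.0]        # one 10% drop, then a recovery
choppy = [100.0, 95.0, 105.0, 98.0, 112.0]
upi = ulcer_performance_index(choppy, risk_free_pct=1.0)
```

A path that never falls below an earlier high has an Ulcer index of zero, so the performance index is undefined for it. Long or deep drawdowns raise the index because every observation spent below the peak contributes to the quadratic mean.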
The Sharpe-Omega9 is introduced by Kazemi et al. [2004] as the ratio of the expected excess return
over the value of a put option on the return of the portfolio. It is assumed to be a reasonable measure of
the investment’s riskiness, as the price of the put option is the cost of protecting an investment’s return
below the target ratio.
Finally, the interest of the finance literature in stable modelling leads Rachev and Mittnik [2000] to consider
the stable ratio. Among the many non-Gaussian distributions that are proposed in the literature to model
asset returns that empirically exhibit excess kurtosis, the stable Paretian distribution has unique
distinctive characteristics that put it at the top of the list. The stable dispersion measure is the scale
parameter of a stable Paretian distribution.
3.1.3. Ratio of gain and shortfall aversion
The spirit of this class of measures is very close to the ratios “performance / risk” presented above. The
extension here is that performance is measured as a potential gain divided by a loss exposure.
3.1.3.1 Classical measures of loss
Bernardo and Ledoit [2000] introduce a measure defined as the ratio of the expectation of the positive
part of the returns divided by the expectation of the negative part. The Bernardo-Ledoit gain-loss ratio
has gained a lot of popularity thanks to Shadwick and Keating [2002] who rebrand it under the name
Omega. It is frequently used for hedge funds as it incorporates all the higher moments of the distribution.
The reserve return can be chosen arbitrarily. If it is set to the mean of the distribution, the measure
equals 1. It does not need any benchmark or index to be computed. However, Bernardo and Ledoit
propose a version in which the reserve return is replaced by an index, so that index funds will get a zero
performance and only those funds that beat the index will receive a positive score.
The ratio can be interpreted as the quotient of a call option and a put option, both having an exercise
price equal to the reserve return. Each element of the fraction can be approximated using the Black and
Scholes formula. The price of the call is the cost of acquiring the return above the threshold; the price of
the put is the cost of protecting the return below the threshold.
The upside potential ratio (UPR) proposed by Sortino et al. [1999] relies on a similar idea. The
numerator is the expected return above the reserve return and can be seen as the potential of success. The
denominator is downside risk as calculated in the Sortino ratio. Unlike the Sortino ratio, the UPR uses the
same reference rate for evaluating both profits and losses. Furthermore, the UPR increases with its
numerator – which measures the expected return above minimum acceptable return – and decreases as its
denominator – downside risk – increases. The UPR therefore delivers performance outputs that conform
to the wishes of the investors: to obtain upside potential while protecting against losses.
Farinelli and Tibiletti [2008] propose a generalized measure. The Farinelli-Tibiletti ratio is the ratio
of an upper partial moment of order p to a lower partial moment of order q. The values of p and q depend
on the desired relevance given to the magnitude of the deviations: the higher p and q, the higher the
investor’s preference for (expected gains with p) or dislike of (expected losses for q) extreme events. The
Bernardo-Ledoit measure or Omega is a particular case with p = 1 and q = 1, while the upside potential
ratio is another particular case, with p = 1 and q = 2.
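The nesting of these measures can be sketched with two partial moment helpers. Taking the p-th and q-th roots of the moments is an assumption made so that the special cases match the Omega and upside potential ratios described above; the function names and the data are hypothetical.

```python
from statistics import mean

def upper_partial_moment(returns, reserve, order):
    """Average of the gains above the reserve return, each raised to `order`."""
    return mean(max(r - reserve, 0.0) ** order for r in returns)

def lower_partial_moment(returns, reserve, order):
    """Average of the shortfalls below the reserve return, each raised to `order`."""
    return mean(max(reserve - r, 0.0) ** order for r in returns)

def farinelli_tibiletti(returns, reserve, p, q):
    upm = upper_partial_moment(returns, reserve, p)
    lpm = lower_partial_moment(returns, reserve, q)
    return upm ** (1.0 / p) / lpm ** (1.0 / q)

def omega(returns, reserve):
    """Bernardo-Ledoit gain-loss ratio (Omega): p = 1 and q = 1."""
    return farinelli_tibiletti(returns, reserve, 1, 1)

def upside_potential_ratio(returns, reserve):
    """Upside potential ratio: p = 1 and q = 2."""
    return farinelli_tibiletti(returns, reserve, 1, 2)

# Hypothetical monthly returns (illustrative only).
ft_returns = [0.02, -0.01, 0.03, 0.015, -0.005, 0.01]
omega_at_mean = omega(ft_returns, mean(ft_returns))
upr = upside_potential_ratio(ft_returns, 0.005)
```

Setting the reserve return to the sample mean gives an Omega of 1 (up to rounding), since the gains above the mean exactly offset the shortfalls below it.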
3.1.3.2 CVaR as measure of loss
As for the Sharpe ratio, the CVaR is worth considering as an alternative measure of risk: it is proposed
by Biglova et al. [2004] as the Rachev ratio. It is the ratio between the CVaR of the opposite of the excess
return at a given confidence level, α, and the CVaR of the excess return at another confidence level, β.
The values of the parameters can be adjusted to fit the investment style: taking α and β close to 0.5
corresponds to a moderate style, while lower α and β reflect more aggressive styles. The same paper
also proposes a Rachev generalized ratio in which the authors introduce power indexes that vary
with the investor’s degree of risk aversion and attraction to high returns.
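A minimal sketch of the Rachev ratio follows, using an empirical tail average as the CVaR estimator. The estimator, the function names, the thresholds and the data are assumptions made for the illustration.

```python
import math
from statistics import mean

def cvar(values, level):
    """Average loss over the worst `level` fraction of outcomes (loss = -value)."""
    k = max(1, math.ceil(level * len(values)))
    return -mean(sorted(values)[:k])

def rachev_ratio(returns, risk_free=0.0, alpha=0.05, beta=0.05):
    """CVaR of the opposite of the excess return over the CVaR of the excess return."""
    excess = [r - risk_free for r in returns]
    opposite = [-x for x in excess]
    # Numerator: average of the best alpha fraction of excess returns (reward).
    # Denominator: average loss in the worst beta fraction (risk).
    return cvar(opposite, alpha) / cvar(excess, beta)

# Hypothetical returns; with both thresholds at 20% each tail holds two observations.
rr_returns = [0.03, 0.01, -0.02, 0.015, -0.08, 0.02, 0.005, -0.01, 0.025, 0.04]
rr = rachev_ratio(rr_returns, risk_free=0.0, alpha=0.2, beta=0.2)
```

In this example the average of the two best returns is 3.5% and the average of the two worst losses is 5%, so the ratio is about 0.7.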
3.1.3.3 Maximum drawdown as measure of loss
Another idea is to replace the notion of standard deviation – or one of its variations – by the maximum
drawdown in the considered period, a parameter that investors often consider. Fundamentally, on a
considered period, this figure represents more a regret, the loss between a peak and a valley, than an
effective loss. Four measures emerge.
The Calmar ratio [Young, 1991] is simply the total amount of return divided by the maximum loss
on the considered period. An obvious drawback of this measure is its sensitivity to outliers, so Sterling
Jones10 proposes the Sterling ratio. The denominator is the average of the drawdowns during the period,
to which one adds an arbitrary threshold of 10%. It adjusts for the fact that short term calculations of
drawdown are understated compared with the annual drawdown figure. This adjustment presents a
drawback: if the average drawdown for any of the funds analyzed is less than minus this threshold, then
the denominator becomes negative and comparison with other funds is meaningless – an issue we already
met with Sharpe ratio. That is the reason why this threshold is sometimes omitted.
The Sterling-Calmar is an alternative to get the best out of these two ratios, considering the average
of the N maximum drawdowns on the denominator. Finally, in the Burke ratio [Burke, 1994], the
denominator is the square root of the sum of the squares of the N largest drawdowns. Like the Sterling
ratio, it is less sensitive to outliers.
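The drawdown-based ratios can be sketched from a price series. In this example drawdowns are expressed as positive fractions, one per peak-to-valley episode, the numerator is the total return over the period, and no risk-free rate is subtracted; these conventions, the function names and the prices are assumptions.

```python
from statistics import mean

def drawdowns(prices):
    """Peak-to-valley declines as positive fractions, one per drawdown episode."""
    episodes = []
    peak = trough = prices[0]
    for price in prices[1:]:
        if price >= peak:
            if trough < peak:
                episodes.append((peak - trough) / peak)  # close the current episode
            peak = trough = price
        else:
            trough = min(trough, price)
    if trough < peak:
        episodes.append((peak - trough) / peak)  # episode still open at the end
    return episodes

def total_return(prices):
    return prices[-1] / prices[0] - 1.0

def calmar_ratio(prices):
    """Return over the maximum drawdown (undefined if there is no drawdown)."""
    return total_return(prices) / max(drawdowns(prices))

def sterling_ratio(prices, threshold=0.10):
    """Return over the average drawdown plus an arbitrary threshold."""
    return total_return(prices) / (mean(drawdowns(prices)) + threshold)

def burke_ratio(prices, n_largest=None):
    """Return over the root of the sum of squares of the N largest drawdowns (all if None)."""
    largest = sorted(drawdowns(prices), reverse=True)[:n_largest]
    return total_return(prices) / sum(d * d for d in largest) ** 0.5

# Hypothetical price path with a 10% and a 20% drawdown (illustrative only).
path = [100.0, 110.0, 99.0, 105.0, 120.0, 96.0, 130.0]
dd = drawdowns(path)
```

With these conventions the Sterling denominator cannot become negative, which sidesteps the sign problem mentioned above.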
3.2. Systematic risk
3.2.1. Treynor ratio and variants
One year before Sharpe, Treynor [1965] introduces the Treynor ratio computed with a similar formula,
but considering the systematic risk of the portfolio at the denominator. Most of its drawbacks are those of
Sharpe ratio, with some specificities. It requires the choice of a good reference index, because the
denominator heavily depends on the selected benchmark. It is inadequate if the market exposure varies
because the beta can be distorted. Unlike the Sharpe ratio, its computation is straightforward for portfolio
aggregation, because the beta is a weighted sum of the constituents’ betas, and it is relevant for a portfolio
that does not cover the whole wealth of an individual.
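The aggregation property can be checked numerically with a short sketch. The beta is estimated here as the ratio of the sample covariance with the market to the market variance; the function names, the weights and the return series are hypothetical.

```python
from statistics import mean

def beta(asset, market):
    """Slope of the regression of asset returns on market returns."""
    ma, mm = mean(asset), mean(market)
    cov = sum((a - ma) * (m - mm) for a, m in zip(asset, market))
    var = sum((m - mm) ** 2 for m in market)
    return cov / var

def treynor_ratio(asset, market, risk_free=0.0):
    """Mean excess return divided by systematic risk (beta)."""
    return (mean(asset) - risk_free) / beta(asset, market)

# Hypothetical per-period returns (illustrative only).
market = [0.02, -0.01, 0.03, 0.01, -0.02, 0.015]
fund_a = [0.03, -0.02, 0.04, 0.01, -0.03, 0.02]
fund_b = [0.01, 0.0, 0.015, 0.01, -0.005, 0.005]

# A portfolio holding 60% of fund A and 40% of fund B.
w_a, w_b = 0.6, 0.4
combined = [w_a * a + w_b * b for a, b in zip(fund_a, fund_b)]

beta_combined = beta(combined, market)
beta_weighted = w_a * beta(fund_a, market) + w_b * beta(fund_b, market)
```

The two betas agree up to rounding, because the covariance with the market is linear in the portfolio weights. No such identity holds for standard deviations, which is why the Sharpe ratio does not aggregate as easily.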
As for the Sharpe ratio, three directions are proposed to give it more flexibility: introducing a reserve
return instead of the risk-free return, keeping only the negative deviations at the denominator, and finally
considering lower partial moments of order k. A generalized formula is proposed by Srivastava and
Essayyad [1994] for Treynor ratio based on lower partial moments.
3.2.2. Black-Treynor ratio and generalization
Treynor and Black [1973] consider alpha, which is an adequate measure of excess return, at the
numerator instead of the excess return. The so-called Black-Treynor ratio has all the advantages of alpha
(see below), and the division by beta permits the comparison of different portfolios, independently of their
systematic risk.
The original Jensen’s alpha is often replaced by a better alpha, extracted from the regression of a
multi-factor econometric model. Hübner [2005] introduces the Generalized Black-Treynor ratio that
combines the advantages of the Black-Treynor ratio with the use of the multi-dimensional model.
3.3. Non systematic risk
We consider here the risk that can be eliminated by diversification.
3.3.1. Moses, Cheney and Veit’s measure
Moses, Cheney and Veit [1987] propose a measure computed as the product of Jensen’s alpha by the
return in excess of the risk-free rate, divided by the non systematic risk. Moses, Cheney and Veit’s
measure shows the trade-off that the manager of a fund makes between the level of diversification of the
portfolio (in the denominator) and his performance compared to the market (in the numerator).
3.3.2. Information ratio and variations
The idea underlying the information ratio (or IR) – also called the appraisal ratio – proposed by Grinold
[1989] is to get the performance relative to a given reference portfolio. It measures the excess return of
the fund over a given benchmark, divided by the standard deviation of the excess return – or more
concretely, the degree of regularity in outperforming the benchmark.
The excess return over the benchmark results from the choices made by the manager to overweight
assets whose returns he hopes will exceed that of the benchmark. Passive management gives a null ratio. The
denominator, also called “tracking error”, reflects the cost of an active management.
This ratio has some major drawbacks. First, it requires much data to assess its significance. The
sensitivity to the selected benchmark is also a concern: Goodwin [1998] estimates that it has a notable
impact, which is contradicted by Gillet and Moussavou [2000]. Next, if a fund tracks an index closely,
with a small tracking error, little changes in excess return swing the information ratio from largely
positive to largely negative or vice versa. As for the Sharpe ratio, Israelsen [2005] partially tackles this
issue by introducing Israelsen’s modified information ratio where the tracking error is exponentiated.
Finally, this ratio also treats positive and negative variations from the index equally, an issue solved by
considering an information ratio based on semi-variance [Gillet and Moussavou, 2000].
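The instability of the ratio for a close index tracker can be illustrated with a short sketch. The function name and the data are hypothetical: the fund deviates from its benchmark by a few basis points per period, and a shift of two basis points in every period is enough to flip the sign of the ratio.

```python
from statistics import mean, stdev

def information_ratio(fund, benchmark):
    """Mean active return divided by the tracking error (std. dev. of active returns)."""
    active = [f - b for f, b in zip(fund, benchmark)]
    return mean(active) / stdev(active)

# Hypothetical benchmark returns and small active returns (illustrative only).
benchmark = [0.02, -0.01, 0.03, 0.01, -0.02, 0.015]
active = [0.0002, -0.0001, 0.0003, 0.0001, 0.0, 0.0001]

tracker_up = [b + a for b, a in zip(benchmark, active)]
tracker_down = [b + a - 0.0002 for b, a in zip(benchmark, active)]

ir_up = information_ratio(tracker_up, benchmark)
ir_down = information_ratio(tracker_down, benchmark)
```

Both trackers have the same tracking error, so the sign change comes entirely from a tiny change in the average active return.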
4. Incremental return
Insert exhibit 3 approximately here
In the second class, we consider all measures that are computed as an absolute return by subtracting a
penalty from the measure of wealth (category Asset Selection/Standardized/Absolute).
4.1. Incremental return versus market
4.1.1. Analytical measures
Starting from a certain portfolio, it is possible to borrow or lend at the risk-free rate to adjust portfolio risk
to the one of the market portfolio. The M² index (or RAP, for risk-adjusted performance) is thus introduced
by Modigliani and Modigliani [1997] as the incremental return added as compared to the level of market
risk. This measure, expressed in basis points, is easy to interpret. Rankings are independent of the chosen
benchmark, as it only plays the role of a scaling factor. However, it is just a linear function of the Sharpe
ratio and not really a new measure. As a consequence, it shares the disadvantages of Sharpe ratio.
Scholz and Wilkens [2005a] propose a similar measure, replacing the ratio in the formula by the
inverse of the beta of the fund. Their market risk-adjusted performance measure (MRAP) permits a
comparison of portfolio returns with those of the market, and it is easy to interpret. As it measures returns
relative to market risk instead of total risk, it is suitable for investors that invest in many different assets.
This index is equal to the Treynor ratio plus the risk-free rate. The same paper introduces the differential
return based on RAP. It is computed as the difference between the M² of the portfolio and the M² of the
market index (which is also its average return).
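The relations between these measures can be sketched as follows, assuming a constant risk-free rate and sample estimates of the standard deviation and the beta. The function names and the data are hypothetical.

```python
from statistics import mean, stdev

def beta(asset, market):
    ma, mm = mean(asset), mean(market)
    cov = sum((a - ma) * (m - mm) for a, m in zip(asset, market))
    return cov / sum((m - mm) ** 2 for m in market)

def sharpe_ratio(fund, rf):
    return (mean(fund) - rf) / stdev(fund)

def m_squared(fund, market, rf):
    """Return of the fund once levered or de-levered to the market's total risk."""
    return rf + (mean(fund) - rf) * stdev(market) / stdev(fund)

def mrap(fund, market, rf):
    """Same idea with the inverse of the fund's beta as scaling factor."""
    return rf + (mean(fund) - rf) / beta(fund, market)

def differential_return(fund, market, rf):
    """M² of the fund minus M² of the market, the latter being the market's mean return."""
    return m_squared(fund, market, rf) - mean(market)

# Hypothetical per-period returns and risk-free rate (illustrative only).
market = [0.02, -0.01, 0.03, 0.01, -0.02, 0.015]
fund_a = [0.03, -0.02, 0.04, 0.01, -0.03, 0.02]
fund_b = [0.01, 0.0, 0.015, 0.01, -0.005, 0.005]
rf = 0.002

treynor_a = (mean(fund_a) - rf) / beta(fund_a, market)
same_ranking = ((m_squared(fund_a, market, rf) > m_squared(fund_b, market, rf))
                == (sharpe_ratio(fund_a, rf) > sharpe_ratio(fund_b, rf)))
```

Since M² equals the risk-free rate plus the Sharpe ratio times the market's standard deviation, it ranks funds exactly as the Sharpe ratio does. In the same way, MRAP minus the risk-free rate is the Treynor ratio.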
Lobosco [1999] proposes the style risk-adjusted performance measure (SRAP). It looks like the
M², but uses a style benchmark instead of a single index. It enables a more accurate evaluation of the
manager’s performance.
Statman [1987] makes another attempt in this direction with the excess standard deviation adjusted
return (eSDAR). It represents the excess return of the fund over the market, where the fund is leveraged
to have the same standard deviation. Its value is equal to M² measure minus the return of the market.
Finally, Aftalion and Poncet [1991] introduce a variant where the unobservable market portfolio is
replaced by a benchmark representative of the portfolio universe. The Aftalion and Poncet index
measures the gap between the return of the portfolio and the return of its benchmark – positive
contribution in the formula – taking into account the difference in risk – negative contribution in the
index. The only difficulty is to estimate the market price of risk.
4.1.2. Efficient frontier based measures
Cantaluppi and Hug [2000] propose the efficiency ratio, which is the distance to the efficient frontier in a
two-dimensional risk/return space. Instead of answering the question “what is the performance of the
portfolio relative to others?”, this measure addresses “which performance could the portfolio achieve?”
Graham and Harvey [1997] tackle two main issues of Sharpe ratio: it assumes that the risk-free rate is
constant and not correlated to risky assets returns; and the estimates are not precise enough when fund
volatilities are too different. The Graham and Harvey measure 1 (GH1) derives from drawing a convex
efficient frontier using a reference index and T-bills. GH1 is the difference in return with the portfolio
located on the efficient frontier that has the same risk. As it is the under/over-performance compared to a
portfolio composed with the index of the market and cash, it is easy to interpret. Graham and Harvey’s
measure 2 (GH2) is obtained by constituting a set of portfolios that combines a given fund and cash, and
then considering the portfolio that has the same volatility as the market index. The measure is the
difference between the return of this portfolio and the market index return. It generalizes M², which
assumes that cash return has zero variance and zero covariance with other assets.
4.2. Incremental return versus benchmark
4.2.1. One factor model
4.2.1.1 Jensen’s alpha
The original measure in this class is Jensen’s alpha [1968]. It is defined as the difference between the
return in excess from the risk free rate, and the return at equilibrium in excess of the risk free rate, taking
into account the systematic risk of the portfolio. It has always been very popular, because it has the
dimension of a return and is easy to interpret. It reflects the manager’s ability to earn a return above the
equilibrium return indicated by the security market line. Like the Sharpe ratio, its drawbacks are
numerous. Jensen’s alpha depends on the choice of a benchmark11 to represent the market portfolio.
Being proportional to beta, it does not enable a comparison of portfolios with different levels of risk.
Thus, except in peer groups, it cannot be used as a ranking criterion. It is also inadequate when the
fund’s market exposure varies over time. It can also be manipulated by leverage. It also suffers from the limits of
the CAPM, whose assumptions are not often verified in reality.
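A minimal sketch of Jensen's alpha, and of its sensitivity to leverage, is given below. It assumes a constant risk-free rate and estimates the beta from sample moments; the function names and the data are hypothetical.

```python
from statistics import mean

def beta(asset_excess, market_excess):
    ma, mm = mean(asset_excess), mean(market_excess)
    cov = sum((a - ma) * (m - mm) for a, m in zip(asset_excess, market_excess))
    return cov / sum((m - mm) ** 2 for m in market_excess)

def jensen_alpha(fund, market, rf):
    """Mean excess return of the fund minus beta times the mean excess return of the market."""
    fund_excess = [r - rf for r in fund]
    market_excess = [r - rf for r in market]
    return mean(fund_excess) - beta(fund_excess, market_excess) * mean(market_excess)

def leveraged(returns, rf, leverage):
    """Returns of a position scaled by `leverage` and financed at the risk-free rate."""
    return [rf + leverage * (r - rf) for r in returns]

# Hypothetical per-period returns and risk-free rate (illustrative only).
market = [0.02, -0.01, 0.03, 0.01, -0.02, 0.015]
fund = [0.03, -0.02, 0.04, 0.01, -0.03, 0.02]
rf = 0.002

alpha_plain = jensen_alpha(fund, market, rf)
alpha_levered = jensen_alpha(leveraged(fund, rf, 2.0), market, rf)
```

Doubling the leverage doubles both the excess return and the beta, so the alpha doubles as well even though the manager's skill is unchanged.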
4.2.1.2 Variations over Jensen’s alpha
Before considering extensions of Jensen’s alpha, we enumerate some variations that have been proposed
in the financial literature, but that never became as popular as Jensen’s alpha.
The standardized Jensen’s alpha is computed by dividing Jensen’s alpha by its standard deviation. It is
linked to the alpha, but it includes the degree of confidence that we have in the estimation of the model: if
we consider two funds having the same alpha, but one being estimated with a good model and the other
with a worse one, the standardized alpha of the first can be above, and that of the second below, 1.96, which
corresponds to a two-sided confidence level of 95%.
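The standardized alpha can be sketched as the t-statistic of the intercept in a one-factor regression. The textbook ordinary least squares standard error is used here as an assumption; the function name and the data are hypothetical, and the residuals are constructed so that both funds have exactly the same alpha.

```python
from statistics import mean

def standardized_alpha(fund_excess, market_excess):
    """Return (alpha, alpha / standard error of alpha) from a one-factor OLS regression."""
    n = len(fund_excess)
    mx, my = mean(market_excess), mean(fund_excess)
    sxx = sum((x - mx) ** 2 for x in market_excess)
    sxy = sum((x - mx) * (y - my) for x, y in zip(market_excess, fund_excess))
    slope = sxy / sxx
    alpha = my - slope * mx
    ssr = sum((y - alpha - slope * x) ** 2 for x, y in zip(market_excess, fund_excess))
    residual_variance = ssr / (n - 2)
    se_alpha = (residual_variance * (1.0 / n + mx ** 2 / sxx)) ** 0.5
    return alpha, alpha / se_alpha

# Hypothetical market excess returns and a residual pattern that has zero mean
# and is orthogonal to the market, so that it leaves the estimated alpha untouched.
market_excess = [-0.02, -0.01, 0.0, 0.01, 0.02]
pattern = [1.0, -2.0, 2.0, -2.0, 1.0]

well_fitted = [0.005 + x + 0.001 * e for x, e in zip(market_excess, pattern)]
poorly_fitted = [0.005 + x + 0.01 * e for x, e in zip(market_excess, pattern)]

alpha_good, t_good = standardized_alpha(well_fitted, market_excess)
alpha_poor, t_poor = standardized_alpha(poorly_fitted, market_excess)
```

Both funds have an alpha of 0.5% per period, but the residuals of the second are ten times larger, so its standardized alpha is ten times smaller.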
Black [1972] shows that the CAPM theory was valid without the existence of a risk-free asset, and
develops a version of the model by replacing it with an asset or portfolio having a beta of zero: this
measure, called Alpha with Black’s zero-beta model, is not often used by practitioners, who have
various proxies for the risk-free rate at their disposal.
Brennan [1970] develops a version of the CAPM that allows the impact of taxes on the model to be
taken into account. He derives the alpha with Brennan’s model taking taxes into account.
Fama [1972] introduces the total risk alpha that measures the manager’s stock picking skills, and
can be explained this way: if we consider a target risk σp, a portfolio BP having this total risk can be
obtained by combining the market portfolio and the risk-free asset. A manager can try to obtain a
different return by stock-picking, building a portfolio P with this fixed level of risk. The difference of
returns Rp – RBP measures the manager’s stock picking skills. Unlike Jensen’s alpha, it integrates
total risk, as the benchmark portfolio represents the market index matched to the total risk of the fund.
For a portfolio invested on two markets, McDonald’s measure [McDonald, 1973] determines each
market’s contribution to the total performance of the portfolio. Pogue et al. [1973] generalize this formula
to a portfolio containing several asset classes and invested in several markets, allowing the evaluation of
the manager’s capacity to select the best performing assets and invest in the most profitable markets.
A Jensen’s alpha adjusted for stale prices is proposed by Scholes and Williams [1977] and
Dimson [1979]. It adds three lagged market betas β1, β2, and β3 to the contemporaneous beta β0. If the
lagged betas are found to be significant at the 5% level, then this α is retained; otherwise, the ordinary Jensen’s
alpha is used. An alternative version, proposed by Fung et al. [2004], is to consider this alpha when the sum of the
lagged betas is significant.
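As an illustration, the regression with lagged market returns might be set up as below. This is a sketch under assumptions: it uses NumPy, ordinary least squares, synthetic noise-free data and hypothetical names, and it leaves out the significance test on the lagged betas that decides whether this alpha is retained.

```python
import numpy as np

def lagged_beta_alpha(fund_excess, market_excess, n_lags=3):
    """Regress fund excess returns on the contemporaneous and lagged market
    excess returns; return (alpha, [beta0, beta1, ..., beta_n_lags])."""
    y = np.asarray(fund_excess, dtype=float)[n_lags:]
    m = np.asarray(market_excess, dtype=float)
    T = len(m)
    columns = [np.ones(T - n_lags)]
    for lag in range(n_lags + 1):
        # row i corresponds to date t = n_lags + i and holds m[t - lag]
        columns.append(m[n_lags - lag:T - lag])
    X = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[0], coef[1:]

# Synthetic, noise-free example: the fund reacts to the market with a delay
rng = np.random.default_rng(0)
market = rng.normal(0.005, 0.04, 60)
fund = np.zeros(60)
fund[3:] = (0.001 + 0.6 * market[3:] + 0.2 * market[2:-1]
            + 0.1 * market[1:-2] + 0.05 * market[:-3])

alpha_hat, betas_hat = lagged_beta_alpha(fund, market)
print(alpha_hat, betas_hat, betas_hat.sum())
```

The sum of the four betas (0.95 here) is the quantity on which the alternative version of Fung et al. [2004] focuses.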
Leland [1999] replaces the betam in Jensen’s formula by a betap adjusted to the utility of the
investors. Leland’s alpha relies on the hypothesis that the investor has a power utility function, and also
that there is an asymmetry in the evaluation of the systematic risk. It is useful in the context of nonlinear
instruments, to tackle the fact that Jensen’s alpha can be artificially increased by leverage.
4.2.2. Multi-factors models
4.2.2.1 Alpha with multi-factors models
The consideration of multi-factor models is justified by the weaknesses of the CAPM reported by
Roll [1977]; the CAPM underlies the classical measures discussed previously. These models try to explain
portfolio returns by sets of macroeconomic versus microeconomic, and explicit versus implicit risk
factors.
In this class, two occurrences are very popular. Alpha based on Fama and French’s three-factor
model is the first one. Fama and French [1992, 1993] show that, in addition to the
beta, the book-to-market ratio and company size measured by market capitalisation are two factors that
characterize a company’s risk. Carhart [1997] adds a fourth factor: alpha based on Carhart’s
four-factor model includes momentum, which is the difference between the average of the highest returns and
the average of the lowest returns over the previous year.
Other microeconomic multi-factor models are proposed in the literature. For instance, the multi-factor
Alpha for Hybrid funds, which is mentioned by Elton et al. [1993], adds to Fama and French’s
model a factor specific to funds that include bonds in their portfolio.
Finally, the alpha based on Barra’s model uses no fewer than thirteen risk indices [Sheikh, 1996].
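All the alphas of this class are intercepts of a regression of fund excess returns on a set of factor returns. A minimal sketch, assuming NumPy and ordinary least squares, is given below; the factor series are random numbers standing in for market, size, book-to-market and momentum factors, not real data, and all names are hypothetical.

```python
import numpy as np

def factor_alpha(fund_excess, factor_returns):
    """OLS intercept (alpha) and loadings of fund excess returns
    on a T x K matrix of factor returns."""
    y = np.asarray(fund_excess, dtype=float)
    F = np.asarray(factor_returns, dtype=float)
    X = np.column_stack([np.ones(len(y)), F])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[0], coef[1:]

# Hypothetical factors: columns stand for market, size, book-to-market, momentum
rng = np.random.default_rng(1)
factors = rng.normal(0.0, 0.03, size=(120, 4))
true_loadings = np.array([1.0, 0.3, -0.2, 0.1])
fund = 0.002 + factors @ true_loadings      # noise-free, so OLS recovers the inputs

four_factor_alpha, loadings = factor_alpha(fund, factors)
three_factor_alpha, _ = factor_alpha(fund, factors[:, :3])   # momentum omitted
print(four_factor_alpha, three_factor_alpha)
```

Dropping a factor to which the fund is exposed generally changes the estimated alpha, which is why the choice of model matters for the ranking of funds.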
4.2.2.2 Alpha with conditional models
A complementary way of computing alpha is to introduce conditional betas as in Ferson and Schadt
[1996]. The underlying idea is to remove, from the performance measure, the effect of any investment strategy that can
be replicated using public information. Conceptually, this class of models supposes that risk premiums at
time t can be predicted at t-1 using variables, called “instruments”, whose values are observed
at t-1. This idea of varying betas appears to be particularly relevant for at least three reasons: the betas of
the assets in a portfolio change over time; changes in prices induce a change in the weights of
even a passive portfolio; and active management, with its purchases and sales, is better modelled.
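A conditional beta of this kind can be sketched by letting beta be a linear function of one lagged instrument, which adds an interaction term to the regression. The single instrument, the demeaning step, the synthetic data and the names below are assumptions made for illustration; NumPy is used for the least squares fit.

```python
import numpy as np

def conditional_alpha(fund_excess, market_excess, lagged_instrument):
    """Fit r_p = alpha + b0 * r_m + b1 * (z * r_m), where z is the demeaned
    instrument observed at t-1 and aligned with the returns of date t.
    Returns the array [alpha, b0, b1]."""
    y = np.asarray(fund_excess, dtype=float)
    rm = np.asarray(market_excess, dtype=float)
    z = np.asarray(lagged_instrument, dtype=float)
    z = z - z.mean()
    X = np.column_stack([np.ones_like(rm), rm, z * rm])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Synthetic example: beta moves with public information, and there is no true skill
rng = np.random.default_rng(2)
rm = rng.normal(0.005, 0.04, 120)
z_raw = rng.normal(0.03, 0.01, 120)         # e.g. a lagged dividend yield (hypothetical)
beta_t = 0.9 + 0.5 * (z_raw - z_raw.mean())
fund = beta_t * rm

cond_coef = conditional_alpha(fund, rm, z_raw)
print(cond_coef)
```

Here the conditional alpha is zero because the whole variation in exposure is explained by the instrument; an unconditional regression on the same data would attribute part of it to the intercept.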
Christopherson et al. [1999] dig deeper into this idea, assuming that the alpha also follows a
conditional process. They propose to let excess performance vary over time. The conditional alpha
appears to answer a remark already made in Jensen’s original paper: alphas of funds are negative
more often than positive, which has been interpreted as inferior performance. However, using conditional
alphas, the distribution of alphas shifts to the right and is centred near zero.
4.2.2.3 Extensions of CAPM-based measures
Three measures rely on extensions of the CAPM. Harvey and Siddique [2000] generalise Fama and
French’s model by considering the third moment of the distributions12. Alpha based on Harvey and
Siddique’s model is then particularly suited to funds that present a non-normal distribution of returns.
Hwang and Satchell [1999] consider a three-moment CAPM and a quadratic return generating
process. The higher moment measure of Hwang and Satchell emphasizes the importance of
coskewness and cokurtosis, but suffers from the other limitations of Jensen’s alpha.
Gomez and Zapatero [2003] propose an alpha based on a two-factor CAPM. Together with the
market beta, a new risk factor – called active management risk – is brought into the analysis. The new
beta is defined as the covariance between the asset excess return and the excess return of the benchmark
index, normalized by the variance of the latter.
4.3. Difference between gain and shortfall aversion13
Melnikoff [1998] suggests characterizing the investor’s aversion to shortfall by a constant which
represents his gain-shortfall trade-off, i.e. the expected gain that he requires to make
up for a fixed shortfall risk. Melnikoff’s measure is computed as the difference between the return of the
portfolio and the average annual shortfall rate, the latter being multiplied by the weight of the gain-shortfall
aversion minus one. This measure clearly depends on the profile of the investor, which is an advantage (it is more
precise) but also a drawback: as two investors will obtain two different rankings, it is difficult to
compare the quality of this measure to another one.
This drawback is also shared by the Sharpe alpha, as mentioned by Plantinga and De Groot [2001],
defined as the return of the portfolio minus its variance multiplied by a coefficient of aversion to shortfall
specific to the investor with a quadratic utility function. The ranking depends on the chosen coefficient.
Fouse’s index [Sortino and Price, 1994] relies on downside risk through the semi-variance. In addition to the
coefficient of aversion to risk, a second parameter has to be selected here: the reserve return.
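The dependence of the ranking on the investor's coefficient can be made concrete with a short sketch. It assumes that the Sharpe alpha is the mean return minus the coefficient times the variance, and that Fouse's index replaces the variance by the mean squared shortfall below the reserve return; population moments (division by the number of observations) are a further assumption, and the return series are hypothetical.

```python
def sharpe_alpha(returns, aversion):
    """Mean return minus aversion coefficient times the variance."""
    n = len(returns)
    mean = sum(returns) / n
    variance = sum((r - mean) ** 2 for r in returns) / n
    return mean - aversion * variance

def fouse_index(returns, aversion, reserve_return):
    """Mean return minus aversion coefficient times the downside
    semi-variance measured below the reserve return."""
    n = len(returns)
    mean = sum(returns) / n
    semi_variance = sum(min(r - reserve_return, 0.0) ** 2 for r in returns) / n
    return mean - aversion * semi_variance

fund_x = [0.04, -0.02, 0.06, 0.00]     # higher mean, volatile
fund_y = [0.01, 0.01, 0.01, 0.01]      # lower mean, no volatility

sa_x_low, sa_y_low = sharpe_alpha(fund_x, 3), sharpe_alpha(fund_y, 3)
sa_x_high, sa_y_high = sharpe_alpha(fund_x, 15), sharpe_alpha(fund_y, 15)
fouse_x = fouse_index(fund_x, 3, reserve_return=0.0)
print(sa_x_low > sa_y_low, sa_x_high > sa_y_high, round(fouse_x, 4))
```

An investor with a coefficient of 3 prefers fund X, while an investor with a coefficient of 15 prefers fund Y: the two investors obtain opposite rankings from the same data.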
5. Preference-based measures
Insert exhibit 4 approximately here
We discuss performance measures that explicitly account for the investors’ risk preferences through the
use of a utility function (category Asset Selection/Individualized).
5.1. Direct translation of preferences
5.1.1. Utility functions based
Hodges [1998] relates the Sharpe ratio to investor preferences for an exponential utility function, in a
situation where returns are normally distributed. Relaxing the latter hypothesis, he determines a
generalized Sharpe ratio.
Stutzer [2000] assumes that investors aim to minimise the probability that the excess returns over a
given threshold will be negative over a long time horizon. When the portfolio has a positive expected
excess return, this probability decays asymptotically to zero at an exponential rate. Portfolios with high
probability decay rates are preferable to those with low decay rates. The maximum possible rate is
defined as the Stutzer index of convergence. Unfortunately, this measure is not intuitive. Furthermore,
this rate is the opposite of the maximum expected utility of an investment in the portfolio, computed with
an exponential utility function. So, it is linked to Hodges’s measure in a straightforward manner.
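The maximization behind the Stutzer index can be sketched numerically. The sketch assumes the sample version of the decay rate, namely minus the logarithm of the average of exp(θ r) over the observed excess returns r, maximized over negative θ; the ternary search, its bounds and the data are illustrative choices, not part of the original method.

```python
import math

def stutzer_index(excess_returns, theta_low=-500.0, theta_high=0.0, iterations=200):
    """Maximize the decay rate -log(mean(exp(theta * r))) over theta by ternary
    search (the objective is concave in theta). Returns (index, theta*)."""
    n = len(excess_returns)

    def decay_rate(theta):
        return -math.log(sum(math.exp(theta * r) for r in excess_returns) / n)

    lo, hi = theta_low, theta_high
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if decay_rate(m1) < decay_rate(m2):
            lo = m1          # the maximum lies to the right of m1
        else:
            hi = m2          # the maximum lies to the left of m2
    theta_star = (lo + hi) / 2.0
    return decay_rate(theta_star), theta_star

# Hypothetical excess returns with a positive mean
stutzer, theta_star = stutzer_index([0.03, -0.01])
print(round(stutzer, 4), round(theta_star, 2))
```

For this two-point sample the first-order condition can be solved by hand (θ* = -ln 3 / 0.04), which gives a simple check of the numerical search.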
Kaplan [2005] considers utility functions that are decomposed into an expected return component
and a loss penalty function of exponential type. He calls lambda the measure obtained by
considering the optimal utility.
These three measures share the drawback that their computation requires solving a
maximization problem.
Morningstar regularly publishes rankings of funds, based on its own methodology [Morningstar,
2007]. It tries to estimate the utility provided by the portfolio for an investor that has a power utility
function. The Morningstar risk adjusted return is very important in practice, because the rankings that
it publishes are followed by many investors.
Sharma [2004] proposes the alternative investments risk adjusted performance (AIRAP) which
differs only slightly from the previous measure, in two respects: it uses total returns instead of surplus returns, and
it computes an average yield. Among its advantages, this measure captures all observed higher moments,
works even when mean returns are negative, can be formulated as a modified Sharpe ratio, and is
invariant to wealth level. Downside variance is penalized more heavily than with the Sharpe ratio.
Ingersoll et al. [2007] define a manipulation-proof measure as one that has four properties which together ensure that
it is not vulnerable to manipulation. The manipulation-proof performance measure appears to be
similar in substance and nearly in form to the Morningstar measure.
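The text does not give the formula; the sketch below assumes the form commonly attributed to Ingersoll et al. [2007], in which the measure is the annualized logarithm of the average of the fund's gross return relative to the risk-free gross return, raised to the power 1 - ρ. The value of ρ, the monthly frequency and the data are hypothetical.

```python
import math

def mppm(returns, risk_free, rho=3.0, periods_per_year=12):
    """Assumed form: ln(mean(((1 + r) / (1 + rf)) ** (1 - rho))) / ((1 - rho) * dt).
    rho must differ from 1."""
    dt = 1.0 / periods_per_year
    n = len(returns)
    average = sum(((1.0 + r) / (1.0 + rf)) ** (1.0 - rho)
                  for r, rf in zip(returns, risk_free)) / n
    return math.log(average) / ((1.0 - rho) * dt)

rf = [0.002] * 12
# Two hypothetical funds with the same average gross excess return of 1%
steady = [(1.0 + f) * 1.01 - 1.0 for f in rf]
volatile = [(1.0 + f) * (1.06 if i % 2 == 0 else 0.96) - 1.0
            for i, f in enumerate(rf)]

theta_steady = mppm(steady, rf)
theta_volatile = mppm(volatile, rf)
print(round(theta_steady, 4), round(theta_volatile, 4))
```

The measure reads as an annualized risk-adjusted excess return: the steady fund scores 12 ln(1.01), and the volatile fund scores less although its average excess return is the same.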
Finally, Pézier [2008] introduces a certain equivalent return (CER) for an investor and an asset, as the
minimum sure excess return above the risk-free rate on total wealth that the investor would consider equally
attractive. Then, he defines the maximum certain equivalent excess return (CER*) as the CER of the
optimal allocation of the investor’s total wealth between the considered asset and the risk-free asset. This
measure is expressed in basis points, so it is easy to interpret. This generalization places no
restriction on the distribution of the returns, takes into account the investor's risk attitude (as any personal
utility function is possible), and allows the context of the investment to be considered (horizon, availability
of a risk-free asset…). A CER* can also be translated, by positive monotonic transformations, into
equivalent criteria on other scales such as generalized Sharpe ratios.
5.1.2. Miscellaneous
Many measures already seen have a parameter specific to the considered investor, but the next one, proposed
by Scholz and Wilkens [2005b], is particular in that it is based on the following situation. Let us consider an investor
who is already holding a portfolio P, and wants to invest additional money in a portfolio Di without
changing his initial portfolio. They define the investor specific performance measure as a measure
based on the variance of the new portfolios, considering that the Di with the lowest variance dominates all
others for a given expected return. In particular, if the portfolio P is the market index, this measure is
determined by the Sharpe and Treynor ratios and makes it possible to choose between two funds, one having the
best Sharpe ratio and the other the best Treynor ratio.
Muralidhar [2000, 2001] introduces the M³ or Muralidhar’s measure, indicating how to construct
portfolios that satisfy an investor’s objectives. The idea is to create portfolios invested in an investment
fund, a benchmark and the risk-free asset with proportions a, b and 1-a-b respectively. Assume that the
investor accepts a certain level of annualised tracking error compared to his benchmark, which we call
objective tracking error. Parameters a and b are computed in such a way that the portfolio obtained has a
tracking error equal to the objective tracking error and its standard deviation is equal to the standard
deviation of the benchmark. The obtained portfolio is called “correlation-adjusted portfolio” as the
constraint on the tracking error creates a target correlation between the portfolio and the benchmark. Once
the optimal proportions have been calculated, we compute the return of the correlation-adjusted portfolio
for the fund. Compared to the M² measure, it includes the differences in standard deviation and the
correlation of each portfolio with the benchmark and the correlations between the portfolios themselves.
One can observe that if no tracking error exists, M³ = M².
Muralidhar [2002] proposes the skill, history and risk-adjusted measure. This measure is the
product of M³ by a measure of confidence in skill. So this new measure has all the properties of M³, but
allows differences in data history to be taken into account: two portfolios with identical variances,
information ratios and tracking errors, but differing only in length of history will yield different
confidence levels in the skill of their managers.
5.1.3. Prospect theory based
Prospect theory is an alternative theory proposed by Kahneman and Tversky [1979], in reaction to the
expected utility theory. Expected utility theory is unable to explain why people are often simultaneously
attracted to both insurance and gambling. Under prospect theory, value is assigned to gains and losses
rather than to final assets; also probabilities are replaced by decision weights which are generally lower
than probabilities. It is in this context that a prospect ratio is introduced. As with the Sharpe and Sortino
ratios, Watanabe [2006] suggests a prospect + skewness/kurtosis ratio.
5.2. Indirect translation of preferences
When the composition of the fund is known, new measures are possible.
Cohen et al. [2005] propose two measures whose specificity is to exploit information contained in the
holdings and returns of other funds. Their idea is to evaluate a manager’s decisions by comparison to the
decisions of managers whose performances are superior – so they need to choose a first measure to
evaluate them, for instance the alpha. Cohen, Coval and Pastor’s measure based on levels of holding is
the weighted sum of a quality measure for each asset in the fund. This measure is derived from the
performance of all managers who had this asset in their portfolio.
In Cohen, Coval and Pastor’s measure based on changes in holding, the weights are covariances
between the changes in the portfolio and those of the other managers: a manager is rewarded if he buys
assets also purchased by managers with a good performance and if he sells assets purchased by managers
with a weak performance.
Daniel et al. [1997] decompose the total performance of a fund into characteristic selectivity,
characteristic timing and average style. These three Daniel’s measures are computed using a
method that forms benchmarks by directly matching the characteristics of the component stocks of the
fund being evaluated. The idea is not too far from the conditional alpha, where time-varying weights are
related to instruments, but here the varying weights are the concrete changes in stock holdings.
Continuing in this direction, Ferson and Khang [2002] introduce the conditional weight measure,
which combines the weights as in Daniel’s measure and the expected returns as in the conditional alpha.
6. Market timing
Insert exhibit 5 approximately here
Finally, market timing performance measures reflect the managerial skill of adequately timing the market
(category Market Timing).
6.1. Original measures
The first two measures are based on Jensen’s alpha and intend to determine whether its value is due to a
good market timing strategy (recall that good market timing negatively influences the alpha).
Treynor and Mazuy [1966] produce a single factor model derived from the CAPM in which a quadratic
term is added to reflect the market timing. Its coefficient is Treynor and Mazuy’s coefficient. If it is
positive, the portfolio has good market timing because the return of the portfolio rises more than
proportionally with the risk premium. This coefficient indeed measures the co-skewness with the benchmark portfolio. A
positive value for the independent term of the regression is considered a sign of superior stock
selection, but one has to be aware that it does not have the same meaning as Jensen’s alpha.
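The quadratic regression can be sketched as follows, assuming NumPy, ordinary least squares and excess returns for both the fund and the market; the synthetic fund, whose beta rises linearly with the market excess return, and the names used are hypothetical.

```python
import numpy as np

def treynor_mazuy(fund_excess, market_excess):
    """Fit r_p = alpha + beta * r_m + gamma * r_m**2; gamma is the timing coefficient."""
    y = np.asarray(fund_excess, dtype=float)
    rm = np.asarray(market_excess, dtype=float)
    X = np.column_stack([np.ones_like(rm), rm, rm ** 2])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    alpha, beta, gamma = coef
    return alpha, beta, gamma

# Synthetic timer: exposure 0.8 + 2 * r_m, hence a quadratic term of 2 * r_m**2
rng = np.random.default_rng(3)
rm = rng.normal(0.005, 0.04, 120)
fund = 0.001 + (0.8 + 2.0 * rm) * rm

tm_alpha, tm_beta, tm_gamma = treynor_mazuy(fund, rm)
print(tm_alpha, tm_beta, tm_gamma)
```

A positive `tm_gamma` signals that the manager raises market exposure when the market premium is high.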
Henriksson and Merton [1981] start from a similar idea, but provide a different interpretation of
market timing ability. Adding a term in the CAPM model that contains a dummy variable based on the
difference between market return and the risk-free rate, they permit managers to choose between two
levels of market risk: an up-market and a down-market beta. The difference between them is
Henriksson and Merton’s coefficient. Compared to Treynor and Mazuy’s model, it presents the
drawback that beta can only have two values, while intuitively the exposure to the market should increase
continuously with the risk premium. Furthermore, Goetzmann et al. [2000] show that this model gives weak results if
it is applied to monthly returns of a daily timer.
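The dummy-variable regression might be written as below. The sketch assumes the dummy equals one when the market excess return is positive, so that the coefficient on the interaction term is the difference between the up-market and down-market betas; NumPy, the synthetic data and the names are assumptions.

```python
import numpy as np

def henriksson_merton(fund_excess, market_excess):
    """Fit r_p = alpha + beta_down * r_m + gamma * D * r_m with D = 1 if r_m > 0.
    Returns (alpha, beta_down, beta_up, gamma)."""
    y = np.asarray(fund_excess, dtype=float)
    rm = np.asarray(market_excess, dtype=float)
    up = (rm > 0).astype(float)                 # up-market dummy
    X = np.column_stack([np.ones_like(rm), rm, up * rm])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    alpha, beta_down, gamma = coef
    return alpha, beta_down, beta_down + gamma, gamma

# Synthetic timer with a beta of 1.2 in up markets and 0.7 in down markets
rng = np.random.default_rng(4)
rm = rng.normal(0.005, 0.04, 120)
fund = 0.0005 + np.where(rm > 0, 1.2, 0.7) * rm

hm_alpha, hm_beta_down, hm_beta_up, hm_gamma = henriksson_merton(fund, rm)
print(hm_beta_down, hm_beta_up, hm_gamma)
```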
Chen and Stockum [1986], among others, show that the error term in both of these models is often
heteroscedastic, while Drew et al. [2002] also detect a problem of multicollinearity. These two issues have
to be resolved by ad hoc methods before using ordinary least squares regression.
Weigel [1991] extends Henriksson and Merton’s analysis, supposing that a fund can be invested in
three assets: risk-free, bonds and stocks. Weigel’s coefficient has a value of 1 if the manager has a perfect
forecast of the markets; it is between 0 and 1 if his forecasts are only partly accurate. If the coefficient is
negative, then his forecasts are bad.
6.2. Extension of original measures
6.2.1. Adding a cubic term
Coming back to funds invested in stocks only, Jagannathan and Korajczyk [1986] add a cubic term to the
original Treynor and Mazuy model. Their Treynor and Mazuy extended timing measure makes it possible to
detect when the cubic term is negative, corresponding to cases of artificial market timing as measured by
the original model.
6.2.2. Multi-factor versions
Bello and Janjigian [1997] propose an extended Treynor and Mazuy’s measure that covers assets
not included in the main index, so as to encompass the case of funds that include bonds.
For more general hybrid funds, Comer [2006] suggests a multi-factor timing measure to consider
systematic risks of the funds to the market, to small stocks, to growth stocks, to long maturity bonds, to
short maturity bonds, to high quality bonds and to low quality bonds.
Henriksson [1984] tries to solve problems that might happen due to both the omission of relevant
factors and issues concerning the choice of the benchmark portfolio in the Henriksson and Merton model.
His Henriksson and Merton extended measure of market timing includes two more factors and a
second dummy variable to introduce the excess return of an equally weighted portfolio of the funds.
Finally, Chan et al. [2002] propose a Henriksson and Merton timing measure in a three-factor
context, which is computed with the same three-factor model as that of Fama and French.
6.2.3. Conditional versions
We saw above that Ferson and Schadt [1996] propose a conditional model that produces conditional
betas. By extension, they propose to consider a conditional Treynor and Mazuy’s coefficient and a
conditional Henriksson and Merton’s coefficient. In general, a typical mutual fund increases its market
exposure when stock returns are low. Using the conditional market timing models, evidence of perverse
market timing for the typical fund can be reduced.
6.3. Period based measures
Grinblatt and Titman [1989a and b] suggest a method that takes portfolio returns over several periods and
attributes a positive weighting to each of them. The Grinblatt and Titman index is the weighted average
of the excess returns. To attribute a null performance to uninformed investors, the weighted average of
the returns of the reference portfolio in excess of the risk-free rate must be null. A positive measure indicates that the
manager accurately foresaw the evolution of the market, while an uninformed one has zero performance.
This approach is not very intuitive, and the computations to determine the weights can be complex,
but data requirements are simple. This measure generalizes other measures, such as Jensen’s alpha (equal to
this measure when all investors’ utility functions are quadratic) and the Treynor and Mazuy measure.
Cornell [1979] proposes a measure to evaluate the ability of a manager to pick stocks when they have
higher returns than usual. The Cornell measure is the average difference between the return of the
considered portfolio during the period in which the portfolio is held, and the return on a benchmark
portfolio with the same weightings, but considered over a different period. It does not use the market
portfolio: asset returns are the direct references used. Like Jensen’s measure, it attributes a null
performance to a portfolio that has no particular timing or selection skills. Unfortunately, it requires a
large number of calculations. There is also a possibility that certain securities disappear during the period.
Finally, it requires knowledge of the weightings of the assets that make up the portfolio.
Grinblatt and Titman [1993] propose the performance change measure, based on the study of
changes in the portfolio. It relies on the principle that an informed investor changes the weightings in his
portfolio according to his forecast of the evolution of the returns. His portfolio will thus display a
non-null covariance between the weightings of the assets in the portfolio and the returns on the same assets.
The measure is put together by aggregating the covariances. Unlike the Cornell measure, it does not use
any benchmark portfolio. However, it requires knowledge of asset returns and of their weightings
within the portfolio. It is limited by the significant number of calculations and the data requirements.
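The aggregation of covariances can be sketched with a small example. The sketch assumes the usual sample form of the measure: for each period, the change in each asset's weight over a chosen lag is multiplied by the asset's return in that period, and the sums are averaged over the periods. The lag of one period, the weights and the returns are hypothetical.

```python
def performance_change_measure(weights, returns, lag=1):
    """weights[t][j]: weight of asset j held at the start of period t;
    returns[t][j]: return of asset j over period t.
    Averages sum_j (w[t][j] - w[t - lag][j]) * r[t][j] over the usable periods."""
    periods = range(lag, len(returns))
    total = 0.0
    for t in periods:
        total += sum((w_now - w_old) * r
                     for w_now, w_old, r
                     in zip(weights[t], weights[t - lag], returns[t]))
    return total / len(periods)

# Two assets, three periods: the manager moves into each asset before it outperforms
weights = [[0.5, 0.5], [0.7, 0.3], [0.4, 0.6]]
returns = [[0.01, 0.02], [0.05, -0.01], [-0.02, 0.03]]
pcm = performance_change_measure(weights, returns)

# A passive portfolio with constant weights scores zero
passive = performance_change_measure([[0.5, 0.5]] * 3, returns)
print(round(pcm, 4), passive)
```

No benchmark return enters the computation; only the holdings and the asset returns are needed, which is both the appeal and the data burden of the measure.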
6.4. Miscellaneous
The measure of performance based on pure market timing introduced by Sweeney [1988] gives the
abnormal return during a defined period. It considers transaction costs as well as changes in the
portfolio. It is however limited to two assets, one risky and the other riskless, and supposes that the
portfolio is always fully invested in one of them.
Bhattacharya and Pfleiderer [1983] suggest a quadratic model with the same origin as the Treynor and
Mazuy model. In the Bhattacharya and Pfleiderer measure of market timing, timing ability is
defined as the correlation between the manager’s forecasts and the excess market return. The latter can be
estimated directly from the benchmark excess returns, while the former is estimated from
a quadratic model [Stevenson, 2004].
7. Conclusion
We showed in this paper that more than one hundred measures have been proposed in the literature to
evaluate the performance of a fund, including the notions of return and risk. Each of them has its
strengths, but also its weaknesses and limits. They encompass various dimensions that make sense for
most of them. Hence, it would be unfair to say that “one size fits all”. Our ongoing efforts try to
arbitrate between them and to distinguish those that can be considered the most significant in general
in explaining portfolio performance as well as persistence.
Next, one should also attempt to classify them in terms of their relevance under various economic
contexts (volatile or not, bear or bull…), regarding different types of funds (stocks only, including
bonds…) and durations (short term, medium term, long term). Eventually, studies of the persistence in
performance, and the detection of the best portfolio managers, should adequately encompass the relevant
dimensions of performance.
Bibliography
AFTALION Florin and PONCET Patrice (1991), “Les mesures de performance des OPCVM: Problèmes et
solutions”, Revue Banque, n°517, juin.
ANG James S. and CHUA Jess H. (1979), “Composite Measures for the Evaluation of Investment Performance”,
Journal of Financial and Quantitative Analysis, vol. 14, n° 2, pp. 361-384.
ARTZNER Philippe, DELBAEN Freddy, EBER Jean-Marc and HEATH David (1999), “Coherent Measures of
Risk”, Mathematical Finance, vol. 9, n° 3, pp. 203-228.
BAWA V.S. (1975), “Optimal Rules for Ordering Uncertain Prospects”, Journal of Financial Economics, vol. 2, n°
1, pp. 95-121.
BELLO Zakri Y. and JANJIGIAN Vahan (1997), “A Reexamination of the Market-Timing and Security-Selection
Performance of Mutual Funds”, Financial Analysts Journal, vol. 53, n° 5, pp. 24-30.
BERNARDO Antonio and LEDOIT Olivier (2000), “Gain, Loss and Asset Pricing”, Journal of Political Economy,
vol. 108, n° 1, pp. 144-172.
BHATTACHARYA S. and PFLEIDERER P. (1983), “A Note on Performance Evaluation”, Stanford University,
Technical Report 714.
BIGLOVA Almira, ORTOBELLI Sergio, RACHEV Svetlozar and STOYANOV Stoyan (2004), “Different
Approaches to Risk Estimation in Portfolio Theory”, Journal of Portfolio Management, vol. 31, n° 4, pp. 103-112.
BLACK Fischer (1972), “Capital Market Equilibrium with Restricted Borrowing”, Journal of Business, vol. 45, n°
3, pp. 444-455.
BRENNAN Michael J. (1970), “Taxes, Market Valuation and Corporate Financial Policy”, National Tax Journal,
vol. 23, n° 4, pp. 417-427.
BURKE Gibbons (1994), “A Sharper Sharpe Ratio”, Futures (Cedar Falls, Iowa), vol. 23, n° 3, p. 56.
CANTALUPPI Laurent and HUG Ruedi (2000), “Efficiency Ratio: A New Methodology for Performance
Measurement”, Journal of Investing, vol. 9, n° 2, pp. 19-25.
CARHART Mark M. (1997), “On Persistence in Mutual Fund Performance”, Journal of Finance, vol. 52, n° 1, pp.
57-82.
CHAN Louis K. C., CHEN Hsiu-Lang and LAKONISHOK Josef (2002), “On Mutual Fund Investment Styles”,
Review of Financial Studies, vol. 15, n° 5, pp. 1407-1437.
CHEN Carl R. and STOCKUM Steve (1986), “Selectivity, Market Timing and Random Beta Behavior of Mutual
Funds: A Generalised Model”, Journal of Financial Research, vol. 9, n° 1, pp. 87-96.
CHRISTOPHERSON Jon A., FERSON Wayne E. and TURNER Andrew L. (1999), “Performance Evaluation
using Conditional Alphas and Betas”, Journal of Portfolio Management, vol. 26, n° 5, pp. 59-72.
COHEN Randolph B., COVAL Joshua D. and PASTOR Lubos (2005), “Judging Fund Managers by the Company
they Keep”, Journal of Finance, vol. 60, n° 3, pp. 1057-1096.
COMER George (2006), “Hybrid Mutual Funds and Market Timing Performance”, Journal of Business, vol. 79, n°
2, pp. 771-797.
CORNELL Bradford (1979), “Asymmetric Information and Portfolio Performance Measurement”, Journal of
Financial Economics, vol. 7, n° 4, pp. 381-390.
DANIEL Kent, GRINBLATT Mark, TITMAN Sheridan and WERMERS Russ (1997), “Measuring Mutual Fund
Performance with Characteristic-Based Benchmarks”, Journal of Finance, vol. 52, n° 3, pp. 1035-1058.
DIMSON E. (1979), “Risk Measurement when Shares are Subject to Infrequent Trading”,
Journal of Financial Economics, vol. 7, n° 2, pp. 197-226.
DOWD Kevin (1999), “A Value at Risk Approach to Risk-Return Analysis”, Journal of Portfolio Management,
vol. 25, n° 4, pp. 60-67.
DOWD Kevin (2000), “Adjusting for Risk: An Improved Sharpe Ratio”, International Review of Economics and
Finance, vol. 9, n° 3, pp. 209-222.
DREW Michael E., VEERARAGHAVAN Madhu and WILSON Vanessa (2002), “Market Timing and Selectivity:
Evidence from Australian Equity Superannuation Funds”, Queensland University of Technology – Discussion
Papers in Economics, Finance and International Competitiveness, N° 105.
ELTON Edwin J., GRUBER Martin J., DAS Sanjiv and HLAVKA Matthew (1993), “Efficiency with Costly
Information: A Reinterpretation of Evidence from Managed Portfolios”, Review of Financial Studies, vol. 6, n° 1,
pp. 1-22.
FAMA Eugene F. (1972), "Components of Investment Performance", Journal of Finance, vol. 27, n° 3, pp. 551-
567.
FAMA Eugene F. and FRENCH Kenneth (1992), “The Cross-Section of Expected Stock Returns”, Journal of
Finance, vol. 47, n° 2, pp. 427-465.
FAMA Eugene F. and FRENCH Kenneth (1993), “Common Risk Factors in the Returns on Stocks and Bonds”,
Journal of Financial Economics, vol. 33, n° 1, pp. 3-56.
FARINELLI Simone and TIBILETTI Luisa (2008), “Sharpe Thinking in Asset Ranking with One-Sided
Measures”, European Journal of Operational Research, vol. 185, n° 3, pp. 1542–1547.
FAVRE Laurent and GALEANO José-Antonio (2002), “Mean-Modified Value-at-Risk Optimization with Hedge
Funds”, Journal of Alternative Investments, vol. 5, n° 2, pp. 21-25.
FERSON Wayne and KHANG Kenneth (2002), “Conditional Performance Measurement Using Portfolio Weights:
Evidence for Pension Funds”, Journal of Financial Economics, vol. 65, n° 2, pp. 249-282.
FERSON Wayne and SCHADT Rudi W. (1996), “Measuring Fund Strategy and Performance in Changing
Economic Conditions”, Journal of Finance, vol. 51, n° 2, pp. 425-461.
FUNG Hung-Gay, XU Xiaoqing Eleanor and YAU Jot (2004), “Do Hedge Managers Display Skill?”, Journal of
Alternative Investments, vol. 6, n° 4, pp. 22-31.
GILLET Philippe and MOUSSAVOU Jean (2000), “L’importance du choix du benchmark et du taux sans risque
dans la mesure des performances des fonds d’investissement”, The European Investment Review.
GOETZMANN W. N., INGERSOLL Jr J. and IVKOVIC Z. (2000), “Monthly Measurement of Daily Timers”,
Journal of Financial and Quantitative Analysis, vol. 35, n°3, pp. 257-290.
GOMEZ Juan-Pedro and ZAPATERO Fernando (2003), “Asset Pricing Implications of Benchmarking a Two-
Factor CAPM”, European Journal of Finance, vol. 9, pp. 343-357.
GOODWIN Thomas H. (1998), “The Information Ratio”, Financial Analysts Journal, vol. 54, n° 4, pp. 34-43.
GRAHAM John R. and HARVEY Campbell R. (1997), “Grading the Performance of Market-Timing Newsletters”,
Financial Analysts Journal, vol. 53, n° 6, pp. 54-66.
GRINBLATT Mark and TITMAN Sheridan (1989a), “Mutual Fund Performance: an Analysis of Quarterly
Portfolio Holdings”, Journal of Business, vol. 62, n° 3, pp. 393-416.
GRINBLATT Mark and TITMAN Sheridan (1989b), “Portfolio Performance Evaluation: Old Issues and New
Insights”, Review of Financial Studies, vol. 2, n° 3, pp. 393-421.
GRINBLATT Mark and TITMAN Sheridan (1993), “Performance Measurement without Benchmarks: an
Examination of Mutual Fund Returns”, Journal of Business, vol. 66, n° 1, pp. 47-68.
GRINOLD Richard C. (1989), “The Fundamental Law of Active Management”, Journal of Portfolio Management,
vol. 15, n° 3, pp. 30-37.
HARVEY Campbell R. and SIDDIQUE Akhtar (2000), “Conditional Skewness in Asset Pricing Tests”, Journal of
Finance, vol. 55, n° 3, pp. 1263-1295.
HENRIKSSON Roy D. (1984), “Market Timing and Mutual Fund Performance: an Empirical Investigation”,
Journal of Business, vol. 57, n° 1, pp. 73-96.
HENRIKSSON Roy D. and MERTON R. (1981), “On Market-timing and Investment Performance: II. Statistical
Procedures for Evaluating Forecasting Skills”, Journal of Business, vol. 54, n° 4, pp. 513-533.
HODGES Stewart D. (1998), “A Generalization of the Sharpe Ratio and its Applications to the Valuation Bounds
and Risk Measures”, Working Paper, University of Warwick.
HÜBNER Georges (2005), “The Generalized Treynor Ratio”, Review of Finance, vol. 9, n° 3, pp. 415-435.
HWANG S. and SATCHELL Stephen E. (1999), “Modelling Emerging Market Risk Premia Using Higher
Moments”, International Journal of Finance and Economics, vol. 4, pp. 271-296.
INGERSOLL Jonathan, SPIEGEL Matthew, GOETZMANN William and WELCH Ivo (2007), “Portfolio
Performance Manipulation and Manipulation-proof Performance Measures”, Review of Financial Studies, vol. 20,
n° 5, pp. 1503-1546.
ISRAELSEN Craig L. (2005), “A Refinement to the Sharpe Ratio and Information Ratio”, Journal of Asset
Management, vol. 5, n° 6, pp. 423-427.
JAGANNATHAN Ravi and KORAJCZYK Robert A. (1986), “Assessing the Market Timing Performance of
Managed Portfolios”, Journal of Business, vol. 59, n° 2, pp. 217-235.
JENSEN Michael C. (1968), “The Performance of Mutual Funds in the Period 1945-64”, Journal of Finance, vol.
23, n° 2, pp. 389-416.
KAHNEMAN D. and TVERSKY A. (1979), “Prospect Theory: An Analysis of Decision under Risk”,
Econometrica, vol. 47, n° 2, pp. 263-291.
KAPLAN Paul D. (2005), “A Unified Approach to Risk-Adjusted Performance”, Working Paper, Morningstar Inc.
KAPLAN Paul D. and KNOWLES James A. (2004), “Kappa: A Generalized Downside Risk-Adjusted
Performance Measure”, Journal of Performance Measurement, vol. 8, n° 3, pp. 42-54.
KAZEMI Hossein, SCHNEEWEIS Thomas and GUPTA Bhaswar (2004), “Omega as a Performance Measure”,
Journal of Performance Measurement, vol. 8, n° 3, pp. 16-25.
KEATING Con and SHADWICK William F. (2002), “A Universal Performance Measure”, Journal of
Performance Measurement, vol. 6, n° 3, pp. 59-84.
KESTNER Lars N. (1996), “Getting a Handle on True Performance”, Futures (Cedar Falls, Iowa), vol. 25, n° 1,
pp. 44-46.
KONNO H. and YAMAZAKI H. (1991), “Mean-Absolute Deviation Portfolio Optimization Model and its
Application to Tokyo Stock Market”, Management Science, vol. 37, n° 5, pp. 519-531.
LELAND Hayne E. (1999), “Beyond Mean-Variance: Performance Measurement in a Non-Symmetrical World”,
Financial Analysts Journal, vol. 55, n° 1, pp. 27-36.
LE SOURD Véronique (2007), “Performance Measurement for Traditional Investment”, EDHEC Risk and
Management Research Centre.
LO Andrew (2002), “The Statistics of Sharpe Ratios”, Financial Analysts Journal, vol. 58, n° 4, pp. 36-52.
LOBOSCO Angelo (1999), “Style/Risk-Adjusted Performance”, Journal of Portfolio Management, vol. 26, n° 4,
pp. 65-68.
MAHDAVI Mahnaz (2004), “Risk-Adjusted Return When Returns Are Not Normally Distributed: Adjusted
Sharpe Ratio”, Journal of Alternative Investments, vol. 6, n° 4, pp. 47-57.
MARTIN Peter and McCANN Byron (1989), “The Investor's Guide to Fidelity Funds: Winning Strategies for
Mutual Fund Investors”, John Wiley & Sons.
MARTIN R. Douglas, RACHEV Svetlozar and SIBOULET Frédéric (2003), “Phi-Alpha Optimal Portfolios &
Extreme Risk Management”, Wilmott, vol. 2003, n° 6, pp. 70-83.
McDONALD John (1973), “French Mutual Fund Performance: Evaluation of Internationally Diversified
Portfolios”, Journal of Finance, vol. 28, n° 5, pp. 1161-1180.
MELNIKOFF Meyer (1998), “Investment Performance Analysis for Investors”, Journal of Portfolio Management,
vol. 25, n° 1, pp. 95-107.
MODIGLIANI Franco and MODIGLIANI Leah (1997), “Risk Adjusted Performance”, Journal of Portfolio
Management, vol. 23, n° 2, pp. 45-54.
MORNINGSTAR (2007), “The Morningstar Rating Methodology”, Morningstar Methodology Paper.
MOSES Edward A., CHEYNEY John M. and VEIT E. Theodore (1987), “A new and more complete performance
measure”, Journal of Portfolio Management, vol. 13, n° 2, pp. 24-33.
MURALIDHAR Arun S. (2000), “Risk-Adjusted Performance: The Correlation Correction”, Financial Analysts
Journal, vol. 56, n° 5, pp. 63-71.
MURALIDHAR Arun S. (2001), “Optimal Risk-Adjusted Portfolios with Multiple Managers”, Journal of Portfolio
Management, vol. 27, n° 3, pp. 97-104.
MURALIDHAR Arun S. (2002), “Skill, History and Risk-Adjusted Performance”, Journal of Performance
Measurement, vol. 6, n° 2, pp. 53-66.
PEZIER Jacques P. (2008), “Maximum Certain Equivalent Excess Returns and Equivalent Preference Criteria”,
Working Paper.
PLANTINGA Auke and DE GROOT Sebastiaan (2001), “Risk-adjusted performance measures and implied risk-
attitudes”, Journal of Performance Measurement, vol. 6, n° 2, pp. 9-19.
POGUE Gerald A., SOLNIK Bruno H. and ROUSSELIN Antoine (1973), “The Impact of International
Diversification: A Study of the French Mutual Funds”, M.I.T., Working Paper, 658/73.
RACHEV Svetlozar T. and MITTNIK S. (2000), “Stable Paretian Models in Finance”, Wiley, Chichester.
ROLL Richard (1977), “A Critique of the Asset Pricing Theory’s Tests Part I: On Past and Potential Testability of
the Theory”, Journal of Financial Economics, vol. 4, pp. 129-176.
ROY A. D. (1952), “Safety First and the Holding of Assets”, Econometrica, vol. 20, n° 3, pp. 431-449.
SAWICKI Julia and ONG Fred (2000), “Evaluating Mutual Fund Performance Using Conditional Measures:
Australian Evidence”, Pacific-Basin Finance Journal, vol. 8, pp. 505-528.
SCHOLES M. and WILLIAMS J.T. (1977), “Estimating Betas from Nonsynchronous Data”, Journal of Financial
Economics, vol. 5, n° 3, pp. 309-327.
SCHOLZ Hendrik and WILKENS Marco (2005a), “A Jigsaw Puzzle of Basic Risk-adjusted Performance
Measures”, Journal of Performance Measurement, vol. 9, pp. 57-64.
SCHOLZ Hendrik and WILKENS Marco (2005b), “Investor Specific Performance Measurement: A Justification
of Sharpe Ratio and Treynor Ratio”, International Journal of Finance, vol. 17, n° 4, pp. 3671-3691.
SHARMA Milind (2004), “A.I.R.A.P. - Alternative RAPMs for Alternative Investments”, Journal of Investment
Management, vol. 2, n° 4, pp. 106-129.
SHARPE William F. (1966), “Mutual Fund Performance”, Journal of Business, vol. 39, n° 1 part 2, pp. 119-138.
SHARPE William F. (1994), “The Sharpe Ratio”, Journal of Portfolio Management, vol. 21, n° 1, pp. 49-58.
SHEIKH A. (1996), “Barra’s Risk Model”, Barra Research Insights.
SORTINO Frank A. (2000), “Measuring Risk: Upside-Potential Ratios Vary by Investment Style”, Pensions and
Investments, vol. 28, n° 22, pp. 30–35.
SORTINO Frank A. and PRICE Lee N. (1994), “Performance Measurement in a Downside Risk Framework”,
Journal of Investing, vol. 3, n° 3, pp. 59-64.
SORTINO Frank A. and SATCHELL Stephen E. (2001), “Managing downside risk in financial markets”,
Butterworth-Heinemann Finance, Oxford.
SORTINO Frank A. and VAN DER MEER Robert (1991), “Downside Risk”, Journal of Portfolio Management,
vol. 17, n° 4, pp. 27-31.
SORTINO Frank, VAN DER MEER Robert and PLANTINGA Auke (1999), “The Dutch Triangle”, Journal of
Portfolio Management, vol. 26, n° 5, pp. 50-58.
SPURGIN Richard B. (2001), “How to Game Your Sharpe Ratio”, Journal of Alternative Investments, vol. 4, n° 3,
pp. 38-46.
SRIVASTAVA Suresh C. and ESSAYYAD Musa (1994), “Investigating a New Methodology for Ranking
International Mutual Funds”, Journal of Economics and Finance, vol. 18, n° 3, pp. 241-260.
STATMAN Meir (1987), “How Many Stocks Make a Diversified Portfolio?”, Journal of Financial and
Quantitative Analysis, vol. 22, n° 3, pp. 353-363.
STEVENSON Simon (2004), “A Performance Evaluation of Portfolio Managers: Tests of Micro and Macro
Forecasting”, European Journal of Finance, vol. 10, n° 5, pp. 391-411.
STUTZER Michael (2000), “A Portfolio Performance Index”, Financial Analysts Journal, vol. 56, n° 3, pp. 52-61.
SWEENEY R. J. (1988), “Some New Filter Tests: Methods and Results”, Journal of Financial and Quantitative
Analysis, vol. 23, pp. 285-300.
SZEGÖ Giorgio (2002), “Measures of Risk”, Journal of Banking and Finance, vol. 26, n° 7, pp. 1253-1272.
TREYNOR Jack L. (1965), “How to Rate Management of Investment Funds”, Harvard Business Review, vol. 43, n°
1, pp. 63-75.
TREYNOR Jack L. and BLACK Fischer (1973), “How to Use Security Analysis to Improve Portfolio Selection”,
Journal of Business, vol. 46, n° 1, pp. 61-86.
TREYNOR Jack L. and MAZUY Kay K. (1966), “Can Mutual Funds Outguess the Market?”, Harvard Business
Review, vol. 44, n° 4, pp. 131-136.
VINOD H. D. and MOREY Matthew R. (2001), “A Double Sharpe Ratio”, Advances in Investment Analysis and
Portfolio Management, vol. 8, pp. 57-65.
WATANABE Yasuaki (2006), “Is Sharpe Ratio Still Effective?”, Journal of Performance Measurement, vol. 11,
n° 1, pp. 55-66.
WEIGEL Eric J. (1991), “The Performance of Tactical Asset Allocation”, Financial Analysts Journal, vol. 47, n°
5, pp. 63-70.
YITZHAKI Shlomo (1982), “Stochastic Dominance, Mean Variance and Gini's Mean Difference”, American
Economic Review, vol. 72, n° 1, pp. 178-185.
YOUNG Martin R. (1998), “A Minimax Portfolio Selection Rule with Linear Programming Solution”,
Management Science, vol. 44, n° 5, pp. 673-683.
YOUNG Terry W. (1991), “Calmar Ratio: A Smoother Tool”, Futures (Cedar Falls, Iowa), vol. 20, n° 11.
ZAKAMOULINE Valeri and KOEKEBAKKER Steen (2008), “Portfolio Performance Evaluation with
Generalized Sharpe Ratios: Beyond the Mean and Variance”, Working Paper, submitted to Journal of Banking and
Finance.
ZIEMBA William T. (2005), “The Symmetric Downside-Risk Sharpe Ratio”, Journal of Portfolio Management,
vol. 32, n° 1, pp. 108-122.
Exhibit 1.
[Diagram: classification of the portfolio performance measures into six categories]
- Return-based ratios (Section 3 except 3.1.3): 31 measures
- Return-based differences (Section 4 except 4.3): 26 measures
- Gain-based ratios (Subsection 3.1.3): 9 measures
- Gain-based differences (Subsection 4.3): 3 measures
- Preference-based measures (Section 5): 17 measures
- Market timing (Section 6): 15 measures
Axis labels: Return, Gain; Relative (ratio), Absolute (difference).
Other labels: Portfolio performance measures; Asset selection; Standardized risk-adjusted measures; Skill; Individualization; Value creation; Translation.
Exhibit 2.
3. Ratios performance / risk
3.1. Absolute risk
3.1.1. Sharpe ratio and close variations
- Sharpe ratio
- Israelsen’s modified Sharpe ratio
- Double Sharpe ratio
- Adjusted for skewness Sharpe ratio
- Adjusted for skewness and kurtosis Sharpe ratio
- Sharpe + skewness/kurtosis
- Adjusted Sharpe ratio
- Sharpe ratio adapted to autocorrelation
- Roy’s measure
3.1.2. Other absolute risk measures
3.1.2.1. Half- and semi-variance
- Reward to half-variance index
- Downside-risk Sharpe ratio
- Sortino ratio
- Sortino + skewness/kurtosis ratio
- Sortino-Satchell ratio (Kappa)
3.1.2.2. VaR and CVaR
- Sharpe ratio based on the VaR
- Sharpe ratio based on Cornish-Fisher VaR
- Sharpe ratio based on CVaR (STARR ratio)
3.1.2.3. Miscellaneous
- Mean absolute deviation ratio
- Gini ratio
- Minimax ratio
- Martin ratio (Ulcer performance index)
- Sharpe-Omega
- Stable ratio
3.1.3. Ratio of gain and shortfall aversion
3.1.3.1. Classical measures of shortfall aversion
- Bernardo-Ledoit gain-loss ratio (Omega)
- Upside potential ratio
- Farinelli-Tibiletti ratio
3.1.3.2. CVaR as measure of shortfall aversion
- Rachev ratio
- Rachev generalized ratio
3.1.3.3. Maximum drawdown as measure of shortfall aversion
- Calmar ratio
- Sterling ratio
- Sterling-Calmar ratio
- Burke ratio
3.2. Systematic risk
3.2.1. Treynor ratio and variants
- Treynor ratio
- Treynor ratio based on lower partial moments
3.2.2. Black-Treynor ratio and generalization
- Black-Treynor ratio
- Generalized Black-Treynor ratio
3.3. Non systematic risk
3.3.1. Moses, Cheyney and Veit’s measure
3.3.2. Information ratio and variations
- Information ratio
- Israelsen’s modified information ratio
- Information ratio based on semi-variance
Exhibit 3.
4. Incremental return
4.1. Incremental return vs market
- M² (risk adjusted performance)
- Market risk adjusted performance
- Differential return based on RAP
- Style risk adjusted performance measure
- Excess standard deviation adjusted return
- Aftalion and Poncet’s index
4.1.2. Efficient frontier based measures
- Efficiency ratio
- Graham and Harvey’s measure 1
- Graham and Harvey’s measure 2
4.2. Incremental return vs benchmark
4.2.1. One-factor model
4.2.1.1. Jensen alpha
4.2.1.2. Variations of Jensen alpha
- Standardized alpha
- Alpha with Black’s zero-beta model
- Alpha with Brennan’s model taking taxes into account
- Total risk alpha
- McDonald’s measure
- Jensen alpha adjusted for stale
- Leland alpha
4.2.2. Multi-factors model
4.2.2.1. Alpha with multi-factors
- Alpha based on Fama & French’s three factors model
- Alpha based on Carhart’s four factors model
- Multi-factors alpha for hybrid funds
- Alpha based on Barra’s model
4.2.2.2. Conditional models
- Alpha with conditional betas
- Conditional alpha
4.2.2.3. Extensions of CAPM based
- Alpha based on Harvey and Siddique’s model
- Higher moment of Hwang and Satchell
- Alpha based on a two-factor CAPM
4.3. Difference between gain and shortfall aversion
- Melnikoff’s measure
- Sharpe alpha
- Fouse’s index
Exhibit 4.
5. Preferences based
5.1. Direct
5.1.1. Utility functions based
- Hodges’s generalized Sharpe ratio
- Stutzer index of convergence
- Lambda
- Morningstar risk adjusted return
- Alternative investments risk adjusted performance
- Manipulation-proof performance measure
- Maximum certain equivalent excess return
5.1.2. Miscellaneous
- Investor specific performance measure
- M³ (Muralidhar’s measure)
- Skill, history and risk-adjusted measure
5.1.3. Prospect theory based
- Prospect ratio
- Prospect + skewness / kurtosis ratio
5.2. Indirect
- Cohen, Coval and Pastor’s measure based on levels of holding
- Cohen, Coval and Pastor’s measure based on changes in holding
- Daniel’s measures
- Conditional weight measure
Exhibit 5.
6. Market timing measures
6.1. Original measures
- Treynor and Mazuy’s coefficient
- Henriksson and Merton’s coefficient
- Weigel’s coefficient
6.2. Extension of original measures
6.2.1. Adding a cubic term
- Treynor and Mazuy’s extended timing measure
6.2.2. Multi-factors versions
- Extended Treynor and Mazuy’s measure
- Multi-factor timing model
- Henriksson and Merton’s extended measure of market timing
- Henriksson and Merton’s timing measure on a three factor context
6.2.3. Conditional versions
- Conditional Treynor and Mazuy’s coefficient
- Conditional Henriksson and Merton’s coefficient
6.3. Period-based measures
- Grinblatt and Titman index
- Cornell measure
- Performance change measure
6.4. Miscellaneous
- Performance based on pure market timing
- Bhattacharya and Pfleiderer measure of market timing
Endnotes
1 Her paper describes the measures in much greater depth than we do here, which makes it an excellent complement to this paper.
2 This count excludes redundant measures and measures that have been used in the empirical literature
without a formal discussion of their roots. Despite our best efforts, this survey may still omit
some recent or rarely used measures. Nevertheless, we are confident that it covers a very significant part of
this area.
3 The complete list and formulae of all the 101 measures are available upon request.
4 Furthermore, Sharpe (1994) showed that the Sharpe ratio can be interpreted as a t-statistic to test the hypothesis that the
return on the portfolio is equal to the risk-free return: t-stat = Sharpe × √T, where T is the number of return observations.
A higher Sharpe ratio is consistent with a higher probability that the portfolio return will exceed the risk-free return.
5 In fact, the implicit benchmark is the risk-free rate.
6 The ratio owes its name to the popularity it gained after the 1991 paper by Sortino and Van der Meer, but it had already
been mentioned by Ang and Chua in 1979 and even by Bawa in 1975.
7 Kaplan and Knowles (2004) introduce a measure named Kappa of order κ which is the same as the Sortino-Satchell ratio.
8 It was rediscovered by Martin et al. (2003) under the name STARR (Stable Tail Adjusted Return Ratio).
9 It is an intermediate measure between the Sharpe and Sortino ratios and the Omega, which is presented later in this paper.
10 Kestner [1996] is often mentioned as the originator of this ratio, but he appears only to be the first to mention it in
a paper. The ratio was initially attributed to Sterling Jones, but we found no paper by this author describing it.
11 Historically, Jensen’s alpha is the first benchmark-based measure.
12 Ang and Chua [1979] had the same idea of generalizing Jensen’s alpha by including skewness in the model.
13 This category represents a hybrid between standardized risk-adjusted and preference-based measures.
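Illustration for endnote 4. The relation between the Sharpe ratio and the t-statistic can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the Sharpe ratio is computed per period as the mean excess return divided by its sample standard deviation, T is the number of return observations, and the function names and the return series are hypothetical, introduced purely for illustration.

```python
import math
from statistics import mean, stdev


def sharpe_ratio(returns, risk_free):
    """Per-period Sharpe ratio: mean excess return over its sample standard deviation."""
    excess = [r - rf for r, rf in zip(returns, risk_free)]
    return mean(excess) / stdev(excess)


def sharpe_t_stat(returns, risk_free):
    """t-statistic of endnote 4: Sharpe ratio times the square root of T."""
    return sharpe_ratio(returns, risk_free) * math.sqrt(len(returns))


# Hypothetical monthly returns, for illustration only (not data from the paper).
portfolio = [0.02, -0.01, 0.03, 0.01, 0.00, 0.02]
risk_free = [0.002] * len(portfolio)

print("Sharpe ratio:", sharpe_ratio(portfolio, risk_free))
print("t-statistic: ", sharpe_t_stat(portfolio, risk_free))
```

Under these assumptions the value returned by sharpe_t_stat coincides with the usual one-sample t-statistic for the hypothesis that the mean excess return is zero, that is, the mean excess return divided by its standard error.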