


Higher Moment Coherent Risk Measures∗

Pavlo A. Krokhmal

Department of Mechanical and Industrial Engineering
The University of Iowa, 2403 Seamans Center, Iowa City, IA 52242

E-mail: [email protected]

April 2007

Abstract

The paper considers modeling of risk-averse preferences in stochastic programming problems using risk measures. We utilize the axiomatic foundation of coherent risk measures and deviation measures in order to develop simple representations that express risk measures via specially constructed stochastic programming problems. Using the developed representations, we introduce a new family of higher-moment coherent risk measures (HMCR), which includes, as a special case, the Conditional Value-at-Risk measure. It is demonstrated that the HMCR measures are compatible with the second-order stochastic dominance and utility theory, can be efficiently implemented in stochastic optimization models, and perform well in portfolio optimization case studies.

Keywords: Risk measures, stochastic programming, stochastic dominance, portfolio optimization

1 Introduction

Research and practice of portfolio management and optimization is driven to a large extent by tailoring the measures of reward (satisfaction) and risk (unsatisfaction/regret) of the investment venture to the specific preferences of an investor. While there exists a common consensus that an investment's reward may be adequately associated with its expected return, the methods for proper modeling and measurement of an investment's risk are subject to much more pondering and debate. In fact, the risk-reward or mean-risk models constitute an important part of the investment science subject and, more generally, the field of decision making under uncertainty.

The cornerstone of modern portfolio analysis was set up by Markowitz (1952, 1959), who advocated identification of the portfolio's risk with the volatility (variance) of its returns. On the other hand, Markowitz's work led to formalization of the fundamental view that any decision under uncertainty may be evaluated in terms of its risk and reward. Markowitz's seminal ideas are still widely used today in many areas of decision making, and the entire paradigm of bi-criteria "risk-reward" optimization has received extensive development in both directions of increasing the computational efficiency and enhancing the models for risk measurement and estimation.

At the same time, it has been recognized that the symmetric attitude of the classical Mean-Variance (MV) approach, where both the "positive" and "negative" deviations from the expected level are penalized equally, does not always yield an adequate estimation of the risks induced by the uncertainties. Hence, significant effort has been devoted to the development of downside risk measures and models. Replacing the variance by the lower standard semi-deviation as a measure of investment risk, so as to take into account only "negative" deviations from the expected level, was proposed as early as Markowitz (1959); see also the more recent works by Ogryczak and Ruszczynski (1999, 2001, 2002).

∗Supported in part by NSF grant DMI 0457473.


Among the popular downside risk models we mention the Lower Partial Moment and its special case, the Expected Regret, which is also known as Integrated Chance Constraint in stochastic programming (Bawa, 1975; Fishburn, 1977; Dembo and Rosen, 1999; Testuri and Uryasev, 2003; van der Vlerk, 2003). Widely known in the finance and banking industry is the Value-at-Risk measure (JP Morgan, 1994; Jorion, 1997; Duffie and Pan, 1997). Being simply a quantile of the loss distribution, the Value-at-Risk (VaR) concept has its counterparts in stochastic optimization (probabilistic, or chance programming, see Prekopa, 1995), reliability theory, etc. Yet, minimization or control of risk using the VaR measure proved to be technically and methodologically difficult, mainly due to VaR's notorious non-convexity as a function of the decision variables. A downside risk measure that circumvents the shortcomings of VaR while offering a similar quantile approach to estimation of risk is the Conditional Value-at-Risk measure (Rockafellar and Uryasev, 2000, 2002; Krokhmal et al., 2002a). Risk measures that are similar to CVaR and/or may coincide with it are Expected Shortfall and Tail VaR (Acerbi and Tasche, 2002); see also Conditional Drawdown-at-Risk (Chekhlov et al., 2005; Krokhmal et al., 2002b). A simple yet effective risk measure closely related to CVaR is the so-called Maximum Loss, or Worst-Case Risk (Young, 1998; Krokhmal et al., 2002b), whose use in problems with uncertainties is also known as the robust optimization approach (see, e.g., Kouvelis and Yu, 1997).

In the last few years, the formal theory of risk measures received a major impetus from the works of Artzner et al. (1999) and Delbaen (2002), who introduced an axiomatic approach to the definition and construction of risk measures by developing the concept of coherent risk measures. Among the risk measures that satisfy the coherency properties, there are Conditional Value-at-Risk, Maximum Loss (Pflug, 2000; Acerbi and Tasche, 2002), coherent risk measures based on one-sided moments (Fischer, 2003), etc. Recently, Rockafellar et al. (2006) have extended the theory of risk measures to the case of deviation measures, and demonstrated a close relationship between the coherent risk measures and deviation measures; spectral measures of risk have been proposed by Acerbi (2002).

An approach to decision making under uncertainty, different from the risk-reward paradigm, is embodied by the von Neumann and Morgenstern (vNM) utility theory, which exercises a mathematically sound axiomatic description of preferences and construction of the corresponding decision strategies. Along with its numerous modifications and extensions, the vNM utility theory is widely adopted as a basic model of rational choice, especially in economics and social sciences (see, among others, Fishburn, 1970, 1988; Karni and Schmeidler, 1991). Thus, substantial attention has been paid in the literature to the development of risk-reward optimization models and risk measures that are consistent with expected utility maximization. In particular, it has been shown that under certain conditions the Markowitz MV framework is consistent with the vNM theory (Kroll et al., 1984). Ogryczak and Ruszczynski (1999, 2001, 2002) developed mean-semideviation models that are consistent with stochastic dominance concepts (Fishburn, 1964; Rothschild and Stiglitz, 1970; Levy, 1998); a class of risk-reward models with SSD-consistent coherent risk measures was discussed in De Giorgi (2005). Optimization with stochastic dominance constraints was recently considered by Dentcheva and Ruszczynski (2003); stochastic dominance-based portfolio construction was discussed in Roman et al. (2006).

In this paper we aim to offer additional insight into the properties of axiomatically defined measures of risk by developing a number of representations that express risk measures via solutions of stochastic programming problems (Section 2.1); using the developed representations, we construct a new family of higher-moment coherent risk (HMCR) measures. In Section 2.2 it is demonstrated that the suggested representations are amenable to seamless incorporation into stochastic programming problems. In particular, implementation of the HMCR measures reduces to p-order conic programming, and can be approximated via linear programming. Section 2.3 shows that the developed results are applicable to deviation measures, while Section 2.4 illustrates that the HMCR measures are compatible with second-order stochastic dominance and utility theory. The conducted case study (Section 3) indicates that the family of HMCR measures has a strong potential for practical application in portfolio selection problems. Finally, the Appendix contains the proofs of the theorems introduced in the paper.

2 Modeling of risk measures as stochastic programs

The discussion in the Introduction has illustrated the variety of approaches to the definition and estimation of risk. Arguably, the recent advances in risk theory are associated with the axiomatic approach to construction of risk measures pioneered by Artzner et al. (1999). The present endeavor essentially exploits this axiomatic approach in order to devise simple computational recipes for dealing with several types of risk measures by representing them in the form of stochastic programming problems. These representations can be used to create new risk measures to be tailored to specific risk preferences, as well as to incorporate these preferences into stochastic programming problems. In particular, we present a new family of Higher Moment Coherent Risk measures (HMCR). It will be shown that the HMCR measures are well-behaved in terms of theoretical properties, and demonstrate very promising performance in test applications.

Within the axiomatic framework of risk analysis, a risk measure R(X) of a random outcome X from some probability space (Ω, F, µ) may be defined as a mapping R: X → R, where X is a linear space of F-measurable functions X: Ω → R. In a more general setting one may assume X to be a separated locally convex space; for our purposes it suffices to consider X = L_p(Ω, F, P), 1 ≤ p ≤ ∞, where the particular value of p shall be clear from the context. As is traditional in convex analysis, we call a function f: X → R ∪ {+∞} proper if f(X) > −∞ for all X ∈ X and dom f ≠ ∅, i.e., there exists X ∈ X such that f(X) < +∞ (see, e.g., Rockafellar, 1970; Zalinescu, 2002). In the remainder of the paper, we confine ourselves to risk measures that are proper and not identically equal to +∞. Also, throughout the paper it is assumed that X represents a loss function, i.e., small values of X are "good," and large values are "bad."

2.1 Convolution-type representations for coherent measures of risk

A coherent risk measure, according to Artzner et al. (1999) and Delbaen (2002), is defined as a mapping R: X → R that further satisfies the next four properties (axioms):

(A1) monotonicity: X ≤ 0 ⇒ R(X) ≤ 0 for all X ∈ X ,

(A2) sub-additivity: R(X + Y ) ≤ R(X) + R(Y ) for all X, Y ∈ X ,

(A3) positive homogeneity: R(λX) = λR(X) for all X ∈ X , λ > 0,

(A4) translation invariance: R(X + a) = R(X) + a for all X ∈ X , a ∈ R.

Observe that given the positive homogeneity (A3), the requirement of sub-additivity (A2) in the above definition can be equivalently replaced with the requirement of convexity (see also Schied and Follmer, 2002):

(A2′) convexity: R(λX + (1 − λ)Y) ≤ λR(X) + (1 − λ)R(Y) for all X, Y ∈ X, 0 ≤ λ ≤ 1.

From the axioms (A1)–(A4) one can easily derive the following useful properties of coherent risk measures (see, for example, Delbaen, 2002; Ruszczynski and Shapiro, 2006):

(C1) R(0) = 0 and, in general, R(a) = a for all a ∈ R,

(C2) X ≤ Y ⇒ R(X) ≤ R(Y ), and, in particular, X ≤ a ⇒ R(X) ≤ a, a ∈ R,

(C3) R(X − R(X)) = 0,

(C4) if X is a Banach lattice, R(X) is continuous in the interior of its effective domain,

(where the inequalities X ≥ a, X ≤ Y, etc., are assumed to hold almost surely). From the definition of coherent risk measures it is easy to see that, for example, EX and ess.sup X, where

    ess.sup X = min{ η ∈ R | X ≤ η }  if { η ∈ R | X ≤ η } ≠ ∅,  and  ess.sup X = +∞ otherwise,

are coherent risk measures; more examples can be found in Rockafellar et al. (2006). Below we present simple computational formulas that aid in the construction of coherent risk measures and their incorporation into stochastic programs. Namely, we execute the idea that one of the axioms (A3) or (A4) can be relaxed and then "reinstated" by solving an appropriately defined mathematical programming problem. In other words, one can construct a coherent risk measure via the solution of a stochastic programming problem that involves a function φ: X → R satisfying only three of the four axioms (A1)–(A4).


First we present a representation for coherent risk measures that is based on the relaxation of the translation invariance axiom (A4). The next theorem shows that if one selects a function φ: X → R satisfying axioms (A1)–(A3) along with additional technical conditions, then there exists a simple stochastic optimization problem involving φ whose optimal value satisfies (A1)–(A4).

Theorem 1 Let the function φ: X → R satisfy axioms (A1)–(A3), and be a lower semicontinuous (lsc) function such that φ(η) > η for all real η ≠ 0. Then the optimal value of the stochastic programming problem

    ρ(X) = inf_η { η + φ(X − η) }    (1)

is a proper coherent risk measure, and the infimum is attained for all X, so inf_η in (1) may be replaced by min_{η∈R}.

For proof of Theorem 1, as well as other theorems introduced in the paper, see the Appendix.

Remark 1.1 It is all-important that the stochastic programming problem (1) is convex, due to the convexity of the function φ. Also, it is worth mentioning that one cannot substitute a coherent risk measure itself for the function φ in (1), as this would violate the condition φ(η) > η of the Theorem.

Corollary 1.1 The set arg min_{η∈R} { η + φ(X − η) } ⊂ R of optimal solutions of (1) is closed.

Example 1.1 (Conditional Value-at-Risk) A famous special case of (1) is the optimization formula for Conditional Value-at-Risk (Rockafellar and Uryasev, 2000, 2002):

    CVaR_α(X) = min_{η∈R} { η + (1 − α)^{−1} E(X − η)_+ },  0 < α < 1,    (2)

where (X)_± = max{±X, 0}, and the function φ(X) = (1 − α)^{−1} E(X)_+ evidently satisfies the conditions of Theorem 1. The space X in this case can be selected as L_2(Ω, F, P). One of the many appealing features of (2) is that it has a simple intuitive interpretation: if X represents loss/unsatisfaction, then CVaR_α(X) is, roughly speaking, the conditional expectation of losses that may occur in (1 − α)·100% of the worst cases. In the case of a continuously distributed X, this rendition is exact: CVaR_α(X) = E[X | X ≥ VaR_α(X)], where VaR_α(X) is defined as the α-quantile of X: VaR_α(X) = inf{ ζ | P[X ≤ ζ] ≥ α }. In the general case, the formal definition of CVaR_α(X) becomes more intricate (Rockafellar and Uryasev, 2002), but representation (2) still applies.
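In computational terms, for a finite, equally weighted sample the objective in (2) is piecewise linear with breakpoints at the sample points, so the minimization can be carried out by scanning the sample. The following is a minimal sketch of this sample version (plain Python; the function name and the toy data are illustrative, not part of the original formulation):

```python
def cvar(losses, alpha):
    """Sample CVaR via representation (2): minimize over eta the
    function eta + (1 - alpha)^{-1} * E(X - eta)_+ .
    For an equally weighted sample the piecewise-linear objective
    attains its minimum at a sample point, so a scan suffices."""
    n = len(losses)

    def objective(eta):
        return eta + sum(max(x - eta, 0.0) for x in losses) / (n * (1.0 - alpha))

    return min(objective(eta) for eta in losses)


# With alpha = 0.8 and ten equally likely losses, (1 - alpha) * n = 2,
# so CVaR equals the average of the two worst losses.
losses = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(cvar(losses, 0.8))  # -> 9.5
```

The scan over sample points is justified only for discrete distributions; for general distributions any one-dimensional convex solver applied to (2) serves the same purpose.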

Example 1.2 A generalization of (2) can be constructed as (see also Ben-Tal and Teboulle, 1986)

    R_{α,β}(X) = min_{η∈R} { η + α E(X − η)_+ − β E(X − η)_− },    (3)

where, in accordance with the conditions of Theorem 1, one has to put α > 1 and 0 ≤ β < 1.

Example 1.3 (Maximum Loss) If the requirement of finiteness of φ in (1) is relaxed, i.e., the image of φ is (−∞, +∞], then the optimal value of (1) still defines a coherent risk measure, but the infimum may not be achievable. An example is served by the so-called MaxLoss measure,

    MaxLoss(X) = ess.sup X = inf_η { η + φ_∗(X − η) },  where  φ_∗(X) = 0 if X ≤ 0, and φ_∗(X) = ∞ if X > 0.

It is easy to see that φ_∗ is positive homogeneous, convex, non-decreasing, lsc, and satisfies φ_∗(η) > η for all η ≠ 0, but is not finite.

Example 1.4 (Higher Moment Coherent Risk Measures) Let X = L_p(Ω, F, P), and for some 0 < α < 1 consider φ(X) = (1 − α)^{−1} ‖(X)_+‖_p, where ‖X‖_p = (E|X|^p)^{1/p}. Clearly, φ satisfies the conditions of Theorem 1, thus we can define a family of higher-moment coherent risk measures (HMCR) as

    HMCR_{p,α}(X) = min_{η∈R} { η + (1 − α)^{−1} ‖(X − η)_+‖_p },  p ≥ 1,  α ∈ (0, 1).    (4)


From the fact that ‖X‖_p ≤ ‖X‖_q for 1 ≤ p < q it immediately follows that the HMCR measures are monotonic with respect to the order p:

    HMCR_{p,α}(X) ≤ HMCR_{q,α}(X)  for p < q and X ∈ L_q.    (5)

Of special interest is the case p = 2, which defines the second-moment coherent risk measure (SMCR):

    SMCR_α(X) = min_{η∈R} { η + (1 − α)^{−1} ‖(X − η)_+‖_2 },  0 < α < 1.    (6)
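Since the objective in (4) is convex in η, HMCR_{p,α} of a finite sample can be estimated by any one-dimensional convex minimization; the sketch below uses ternary search (plain Python; the bracket width, iteration count, and toy data are ad hoc illustrative choices). Relation (5) can then be observed directly:

```python
def hmcr(losses, p, alpha, iters=200):
    """Sample HMCR_{p,alpha} via representation (4): minimize the convex
    function eta + (1 - alpha)^{-1} * ||(X - eta)_+||_p over eta."""
    n = len(losses)

    def objective(eta):
        moment = sum(max(x - eta, 0.0) ** p for x in losses) / n
        return eta + moment ** (1.0 / p) / (1.0 - alpha)

    # the minimizer lies in a bracket around the sample range
    span = max(losses) - min(losses) + 1.0
    lo, hi = min(losses) - span, max(losses)
    for _ in range(iters):  # ternary search on a convex function
        m1, m2 = lo + (hi - lo) / 3.0, hi - (hi - lo) / 3.0
        if objective(m1) <= objective(m2):
            hi = m2
        else:
            lo = m1
    return objective((lo + hi) / 2.0)


losses = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
# monotonicity in p, relation (5): HMCR_1 = CVaR <= HMCR_2 <= HMCR_3
print([hmcr(losses, p, 0.8) for p in (1, 2, 3)])
```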

We will see below that SMCR_α(X) is quite similar in properties to CVaR_α(X) while measuring the risk in terms of the second moments of loss distributions. Implementation-wise, the SMCR measure can be incorporated into a mathematical programming problem via second-order cone constraints (see Section 3). Second-order cone programming (SOCP) is a well-developed topic in the field of convex optimization (see, e.g., a review by Alizadeh and Goldfarb, 2003), and a number of commercial off-the-shelf software packages are available for solving convex problems with second-order cone constraints.

Now we comment briefly on the relation between the HMCR family introduced above and other risk measures known in the literature that involve higher moments of distributions. The Lower Partial Moment (see, e.g., Bawa (1975); Fishburn (1977), and others)¹

    LPM_p(X; t) = E((X − t)_+)^p,  p ≥ 1,  t ∈ R,    (7)

is convex in X, but not positive homogeneous or translation invariant. In the context of axiomatically defined risk measures, an interesting example of a spectral risk measure that corresponds to a "pessimistic manipulation" of X and is sensitive to higher moments was considered by Tasche (2002). Closely related to the HMCR measures proposed here are the so-called coherent measures based on one-sided moments, or coherent measures of semi-L_p type (Fischer, 2003; Rockafellar et al., 2006):²

    R(X) = EX + β ‖(X − EX)_+‖_p,  p ≥ 1,  β ≥ 0.    (8)

A key difference between (8) and the HMCR measures (4) is that the HMCR family are tail risk measures, while the measures of type (8) are based on central semi-moments (see Example 2.3 below).

Example 1.5 (Composition of risk measures) Formula (1) readily extends to the case of multiple functions φ_i, i = 1, . . . , n, that are cumulatively used in measuring the risk of an element X ∈ X and conform to the conditions of Theorem 1. Namely, one has that

    ρ_n(X) = min_{η_i∈R, i=1,...,n} ∑_{i=1}^{n} ( η_i + φ_i(X − η_i) )    (9)

is a proper coherent risk measure.

The value of η that delivers the minimum in (1) also possesses some noteworthy properties as a function of X. In establishing these properties the following notation is convenient. Assuming that the set arg min_{x∈R} f(x) is closed for some function f: R → R, we denote its left endpoint as

    Arg min_{x∈R} f(x) = min{ y | y ∈ arg min_{x∈R} f(x) }.

¹Here, the traditional terminology is preserved: according to the convention adopted in this paper, X denotes losses, and therefore the proper term for (7) would be the upper partial moment.

²Interestingly, Fischer (2003) restricted the range of values for the constant β in (8) to β ∈ [0, 1], whereas Rockafellar et al. (2006) allowed β to take values in (0, ∞).


Theorem 2 Let the function φ: X → R satisfy the conditions of Theorem 1. Then the function

    η(X) = Arg min_{η∈R} { η + φ(X − η) }    (10)

exists and satisfies properties (A3) and (A4). If, additionally, φ(X) = 0 for every X ≤ 0, then η(X) satisfies (A1), along with the inequality η(X) ≤ ρ(X), where ρ(X) is the optimal value of (1).

Remark 2.1 If φ satisfies all the conditions of Theorem 2, the optimal solution η(X) of the stochastic optimization problem (1) complies with all the axioms for coherent risk measures except (A2), thereby failing to be convex.

Example 2.1 (Value-at-Risk) A well-known example of two risk measures obtained by solving a stochastic programming problem of type (1) is again provided by formula (2) due to Rockafellar and Uryasev (2000, 2002), and its counterpart

    VaR_α(X) = Arg min_{η∈R} { η + (1 − α)^{−1} E(X − η)_+ }.    (11)

The Value-at-Risk measure VaR_α(X), despite being adopted as a de facto standard for the measurement of risk in the finance and banking industries, is notorious for the difficulties it presents in risk estimation and control.
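For an equally weighted sample, the left endpoint of the arg min set in (11) can be found by the same scan used for (2), and it coincides with the α-quantile. A small sketch under those assumptions (the helper name and the data are illustrative):

```python
def var_argmin(losses, alpha):
    """Left endpoint of the arg min set in (11); for an equally weighted
    sample the endpoints of the minimizer set are sample points."""
    n = len(losses)

    def objective(eta):
        return eta + sum(max(x - eta, 0.0) for x in losses) / (n * (1.0 - alpha))

    best = min(objective(eta) for eta in losses)
    return min(eta for eta in losses if objective(eta) <= best + 1e-12)


losses = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
# alpha-quantile: inf{ zeta : P[X <= zeta] >= 0.8 } = 8
print(var_argmin(losses, 0.8))  # -> 8
```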

Example 2.2 (Higher-Moment Coherent Risk Measures) For higher-moment coherent risk measures, the function φ in (10) is taken as φ(X) = (1 − α)^{−1} ‖(X)_+‖_p, and the corresponding optimal η_{p,α}(X) satisfies the equation

    (1 − α)^{−1/(p−1)} = ‖(X − η_{p,α}(X))_+‖_p / ‖(X − η_{p,α}(X))_+‖_{p−1},  p > 1.    (12)

A formal derivation of equality (12) can be carried out using the techniques employed in Rockafellar and Uryasev (2002) to establish formula (11). Although the optimal η_{p,α}(X) in (12) is determined implicitly, Theorem 2 ensures that it has properties similar to those of VaR_α (monotonicity, positive homogeneity, etc.). Moreover, by plugging relation (12) with p = 2 into (6), the second-moment coherent risk measure (SMCR) can be presented in a form that involves only the first moment of losses in the tail of the distribution:

    SMCR_α(X) = η_{2,α}(X) + (1 − α)^{−2} ‖(X − η_{2,α}(X))_+‖_1
              = η_{2,α}(X) + (1 − α)^{−2} E(X − η_{2,α}(X))_+.    (13)

Note that in (13) the second-moment information is concealed in η_{2,α}(X). Further, by taking a CVaR measure with the confidence level α* = 2α − α², one can write

    SMCR_α(X) = η_smcr + (1 − α*)^{−1} E(X − η_smcr)_+
              ≥ η_cvar + (1 − α*)^{−1} E(X − η_cvar)_+ = CVaR_{α*}(X),    (14)

where η_smcr = η_{p,α}(X) as in (12) with p = 2, and η_cvar = VaR_{α*}(X); note that the inequality in (14) holds due to the fact that η_cvar = VaR_{α*}(X) minimizes the expression η + (1 − α*)^{−1} E(X − η)_+. In other words, with the above selection of α and α*, the expressions for SMCR_α(X) and CVaR_{α*}(X) differ only in the choice of η that delivers the minimum to the corresponding expressions (6) and (2). For Conditional Value-at-Risk, it is the α*-quantile of the distribution of X, whereas the optimal η_{2,α}(X) for the SMCR measure incorporates the information on the second moment of the losses X.
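Relations (13)–(14) are straightforward to verify numerically on a sample: minimize (6) to obtain SMCR_α together with the optimal η_{2,α}, then compare against CVaR at α* = 2α − α². A sketch (plain Python; the ternary-search solver and the data are illustrative choices, not from the original formulation):

```python
def smcr_with_eta(losses, alpha, iters=300):
    """Minimize (6) over eta; return (SMCR_alpha, optimal eta_{2,alpha})."""
    n = len(losses)

    def objective(eta):
        m2 = sum(max(x - eta, 0.0) ** 2 for x in losses) / n
        return eta + m2 ** 0.5 / (1.0 - alpha)

    span = max(losses) - min(losses) + 1.0
    lo, hi = min(losses) - span, max(losses)
    for _ in range(iters):  # ternary search on a convex function
        m1, m2 = lo + (hi - lo) / 3.0, hi - (hi - lo) / 3.0
        if objective(m1) <= objective(m2):
            hi = m2
        else:
            lo = m1
    eta = (lo + hi) / 2.0
    return objective(eta), eta


def cvar(losses, alpha):
    """Sample CVaR via representation (2), scanning sample points."""
    n = len(losses)

    def objective(eta):
        return eta + sum(max(x - eta, 0.0) for x in losses) / (n * (1.0 - alpha))

    return min(objective(eta) for eta in losses)


losses = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
alpha = 0.6
val, eta = smcr_with_eta(losses, alpha)
print(round(val, 6), round(eta, 6))  # -> 9.75 8.5
# identity (13): SMCR = eta + (1 - alpha)^{-2} * E(X - eta)_+
tail = sum(max(x - eta, 0.0) for x in losses) / len(losses)
print(abs(val - (eta + tail / (1.0 - alpha) ** 2)) < 1e-6)   # -> True
# inequality (14): SMCR_alpha >= CVaR_{alpha*}, alpha* = 2a - a^2
print(val >= cvar(losses, 2 * alpha - alpha ** 2) - 1e-9)    # -> True
```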

Example 2.3 (HMCR as tail risk measures) It is easy to see that the HMCR family are tail risk measures, namely that 0 < α_1 < α_2 < 1 implies η_{p,α_1}(X) ≤ η_{p,α_2}(X), where η_{p,α_i} = Arg min_η { η + (1 − α_i)^{−1} ‖(X − η)_+‖_p }; in addition, one has lim_{α→1} η_{p,α}(X) = ess.sup X, at least when ess.sup X is finite (see the Appendix).


These properties put the HMCR family in a favorable position compared to the coherent measures of type (8) (Rockafellar et al., 2006; Fischer, 2003), where the "tail cutoff" point, about which the partial moments are computed, is always fixed at EX. In contrast to (8), the location of the tail cutoff in the HMCR measures is determined by the optimal η_{p,α}(X) and is adjustable by means of the parameter α. In a special case, for example, the HMCR measures (4) can be reduced to form (8) with β = (1 − α_p)^{−1} > 1 by selecting α_p according to (12) as α_p = 1 − ( ‖(X − EX)_+‖_{p−1} / ‖(X − EX)_+‖_p )^{p−1}, p > 1.
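The behavior of the tail cutoff described in Example 2.3 can likewise be checked numerically: the minimizer η_{p,α} is nondecreasing in α and approaches ess.sup X as α → 1. A sketch (plain Python; the sample, bracket, and tolerances are arbitrary illustrative choices):

```python
def eta_p_alpha(losses, p, alpha, iters=300):
    """Approximate minimizer eta_{p,alpha} of
    eta + (1 - alpha)^{-1} * ||(X - eta)_+||_p for an equally weighted sample."""
    n = len(losses)

    def objective(eta):
        m = sum(max(x - eta, 0.0) ** p for x in losses) / n
        return eta + m ** (1.0 / p) / (1.0 - alpha)

    span = max(losses) - min(losses) + 1.0
    lo, hi = min(losses) - span, max(losses)
    for _ in range(iters):  # ternary search on a convex function
        m1, m2 = lo + (hi - lo) / 3.0, hi - (hi - lo) / 3.0
        if objective(m1) <= objective(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2.0


losses = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
etas = [eta_p_alpha(losses, 2, a) for a in (0.5, 0.7, 0.9, 0.99)]
# the cutoff moves into the tail as alpha grows, toward ess.sup X
print(all(e1 <= e2 + 1e-6 for e1, e2 in zip(etas, etas[1:])))  # -> True
```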

Representations based on relaxation of (A3) Observe that formula (1) in Theorem 1 is analogous to the operation of infimal convolution, well known in convex analysis:

    (f □ g)(x) = inf_y { f(x − y) + g(y) }.

Continuing this analogy, consider the operation of right scalar multiplication

    (φη)(X) = η φ(η^{−1} X),  η ≥ 0,

where for η = 0 we set (φ0)(X) = (φ0^+)(X). If φ is proper and convex, then it is known that (φη)(X) is a convex proper function in η ≥ 0 for any X ∈ dom φ (see, for example, Rockafellar, 1970). Interestingly enough, this fact can be pressed into service to formally define a coherent risk measure as

    ρ(X) = inf_{η ≥ 0} η φ(η^{−1} X),    (15)

if the function φ, along with some technical conditions similar to those of Theorem 1, satisfies axioms (A1), (A2′), and (A4). Note that excluding the positive homogeneity (A3) from the list of properties of φ denies also its convexity; thus one must replace (A2) with (A2′) to ensure convexity in (15). In the terminology of convex analysis, the function ρ(X) defined by (15) is known as the positively homogeneous convex function generated by φ. Likewise, by direct verification of conditions (A1)–(A4) it can be demonstrated that

    ρ(X) = sup_{η > 0} η φ(η^{−1} X)    (16)

is a proper coherent risk measure, provided that φ(X) satisfies (A1), (A2), and (A4). By (C1), axioms (A1) and (A2) imply that φ(0) = 0, which allows one to rewrite (16) as

    ρ(X) = sup_{η > 0} ( φ(ηX + 0) − φ(0) ) / η = φ0^+(X),    (17)

where the last equality in (17) comes from the definition of the recession function (Rockafellar, 1970; Zalinescu, 2002).

2.2 Implementation in stochastic programming problems

The developed results can be efficiently applied in the context of stochastic optimization, where the random outcome X = X(x, ω) can be considered as a function of the decision vector x ∈ R^m, convex in x over some closed convex set C ⊂ R^m. Firstly, representation (1) allows for efficient minimization of risk in stochastic programs. For a function φ that complies with the requirements of Theorem 1, denote

    Φ(x, η) = η + φ( X(x, ω) − η )  and  R(x) = ρ( X(x, ω) ) = min_{η∈R} Φ(x, η).    (18)

Then, clearly,

    min_{x∈C} ρ( X(x, ω) )  ⟺  min_{(x,η)∈C×R} Φ(x, η),    (19)

in the sense that both problems have the same optimal objective values and optimal vector x*. The last observation enables seamless incorporation of risk measures into stochastic programming problems, thereby facilitating the modeling of risk-averse preferences. For example, a generalization of the classical 2-stage stochastic linear programming (SLP) problem (see, e.g., Birge and Louveaux, 1997; Prekopa, 1995), where the outcome of the second-stage (recourse) action is evaluated by its risk rather than the expected value, can be formulated by replacing the expectation operator in the second-stage problem with a coherent risk measure R:

    min_{x≥0}  c^T x + R[ min_{y≥0} q(ω)^T y(ω) ]    (20)
    s. t.  Ax = b,  T(ω) x + W(ω) y(ω) = h(ω).

Note that the expectation operator is a member of the class of coherent risk measures defined by (A1)–(A4), whereby the classical 2-stage SLP problem is a special case of (20). Assuming that the risk measure R above is amenable to representation (1) via some function φ, problem (20) can be implemented by virtue of Theorem 1 as

    min  c^T x + η + φ( q(ω)^T y(ω) − η )
    s. t.  Ax = b,  T(ω) x + W(ω) y(ω) = h(ω),
           x ≥ 0,  y(ω) ≥ 0,  η ∈ R,

with all the distinctive properties of the standard SLP problems (e.g., the convexity of the recourse function, etc.) being preserved.
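As a toy illustration of replacing the expectation by a coherent risk measure as in (20), consider a single first-stage variable with simple shortage recourse, solved by brute-force grid search with R = CVaR. The instance (unit order cost, per-unit shortage penalty 3, three equally likely demands) is invented for illustration and is not from the paper:

```python
def cvar(costs, alpha):
    """Sample CVaR via representation (2), scanning sample points."""
    n = len(costs)

    def objective(eta):
        return eta + sum(max(c - eta, 0.0) for c in costs) / (n * (1.0 - alpha))

    return min(objective(eta) for eta in costs)


demands = [2.0, 4.0, 9.0]        # equally likely demand scenarios
shortage_penalty = 3.0           # per-unit second-stage shortage cost


def total_cost(x, risk):
    """First-stage cost x (unit price 1) plus the risk functional applied
    to the scenario-wise second-stage shortage costs."""
    recourse = [shortage_penalty * max(d - x, 0.0) for d in demands]
    return x + risk(recourse)


grid = [0.5 * i for i in range(25)]  # candidate first-stage orders 0, 0.5, ..., 12
risk_neutral = min(grid, key=lambda x: total_cost(x, lambda r: sum(r) / len(r)))
risk_averse = min(grid, key=lambda x: total_cost(x, lambda r: cvar(r, 2.0 / 3.0)))
print(risk_neutral, risk_averse)  # -> 4.0 9.0
```

With three equally likely scenarios, CVaR at α = 2/3 equals the worst-case recourse cost, so the risk-averse solution hedges the largest demand, while the expectation-based solution does not.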

Secondly, representation (1) also admits implementation of risk constraints in stochastic programs. Namely, let g(x) be a function that is convex on C; then the following two problems are equivalent, as demonstrated by Theorem 3 below:

    min_{x∈C} { g(x) | R(x) ≤ c },    (21a)

    min_{(x,η)∈C×R} { g(x) | Φ(x, η) ≤ c }.    (21b)

Theorem 3 Optimization problems (21a) and (21b) are equivalent in the sense that they achieve minima at the same values of the decision variable x and their optimal objective values coincide. Further, if the risk constraint in (21a) is binding at optimality, (x*, η*) achieves the minimum of (21b) if and only if x* is an optimal solution of (21a) and η* ∈ arg min_η Φ(x*, η).

In other words, one can implement the risk constraint ρ( X(x, ω) ) ≤ c by using representation (1) for the risk measure ρ with the infimum operator omitted.

HMCR measures in stochastic programming problems The introduced higher-moment coherent risk measures can be incorporated in stochastic programming problems via conic constraints of order p > 1. Namely, let {ω_1, . . . , ω_J} ⊆ Ω, where P{ω_j} = π_j ∈ (0, 1), be the scenario set of a stochastic programming model. Observe that a HMCR-based objective or constraint can be implemented via the constraint HMCR_{p,α}( X(x, ω) ) ≤ u, with u being either a variable or a constant, correspondingly. By virtue of Theorem 3, the latter constraint admits a representation by the set of inequalities

    u ≥ η + (1 − α)^{−1} t,    (22a)
    t ≥ (w_1^p + . . . + w_J^p)^{1/p},    (22b)
    w_j ≥ π_j^{1/p} ( X(x, ω_j) − η ),  j = 1, . . . , J,    (22c)
    w_j ≥ 0,  j = 1, . . . , J.    (22d)
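The lifting (22) is easy to check numerically: for a fixed trial η, take (22c)–(22d) tight to obtain w, (22b) tight to obtain t, and (22a) tight to obtain u; minimizing u over η then recovers HMCR_{p,α}. A sketch in plain Python (the function name and scenario data are illustrative):

```python
def hmcr_via_lifting(scen_losses, probs, p, alpha, eta_grid):
    """Evaluate HMCR_{p,alpha} through the constraints (22), taking each
    inequality tight for a trial eta and minimizing over a grid."""
    def u_of(eta):
        w = [pi ** (1.0 / p) * max(xj - eta, 0.0)        # (22c), (22d)
             for xj, pi in zip(scen_losses, probs)]
        t = sum(wj ** p for wj in w) ** (1.0 / p)        # (22b)
        return eta + t / (1.0 - alpha)                   # (22a)

    return min(u_of(eta) for eta in eta_grid)


losses = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
probs = [0.1] * 10
# p = 1 reduces to CVaR: with alpha = 0.8, the average of the two worst losses
print(round(hmcr_via_lifting(losses, probs, 1, 0.8, losses), 6))  # -> 9.5
# p = 2: on this sample the optimal eta for alpha = 0.6 happens to be 8.5
grid = [0.5 * i for i in range(21)]
print(round(hmcr_via_lifting(losses, probs, 2, 0.6, grid), 6))    # -> 9.75
```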

Note that the convexity of X as a function of the decision variables x implies convexity of (22c), and, consequently, convexity of the set (22). Constraint (22b) defines a (J + 1)-dimensional cone of order p, and is central to practical implementation of constraints (22) in mathematical programming models. In the special case of p = 2, it represents a second-order (quadratic) cone in R^{J+1}, and well-developed methods of second-order cone programming (SOCP) can be invoked to handle constructions of type (22). In the general case of p ∈ (1, ∞), the p-order cone within the positive orthant

    (w_1^p + . . . + w_J^p)^{1/p} ≤ t,  t, w_j ≥ 0,  j = 1, . . . , J,    (23)

can be approximated by linear inequalities when J = 2^d. Following Ben-Tal and Nemirovski (1999), the (2^d + 1)-dimensional p-order conic constraint (23) can be represented by a set of 3-dimensional p-order conic inequalities

    [ (w^{(k−1)}_{2j−1})^p + (w^{(k−1)}_{2j})^p ]^{1/p} ≤ w^{(k)}_j,  j = 1, . . . , 2^{d−k},  k = 1, . . . , d,    (24)

where w^{(d)}_1 ≡ t and w^{(0)}_j ≡ w_j (j = 1, . . . , 2^d). Each of the 3-dimensional p-order cones (24) can then be approximated by a set of linear inequalities. For any partition 0 ≡ α_0 < α_1 < . . . < α_m ≡ π/2 of the segment [0, π/2], an internal approximation of the p-order cone in the positive orthant of R^3

    ξ_3 ≥ (ξ_1^p + ξ_2^p)^{1/p},  ξ_1, ξ_2, ξ_3 ≥ 0,    (25)

can be written in the form

    ξ_3 ( sin^{2/p} α_{i+1} cos^{2/p} α_i − cos^{2/p} α_{i+1} sin^{2/p} α_i )
        ≥ ξ_1 ( sin^{2/p} α_{i+1} − sin^{2/p} α_i ) + ξ_2 ( cos^{2/p} α_i − cos^{2/p} α_{i+1} ),  i = 0, . . . , m − 1,    (26a)

and an external approximation can be constructed as

ξ3(

cosp αi + sinp αi) p−1

p ≥ ξ1 cosp−1 αi + ξ2 sinp−1 αi , i = 0, . . . , m. (26b)

For example, the uniform partition αi =π i2m (i = 0, . . . , m) generates the following approximations of a 3-

dimensional second-order cone:

ξ3 cos π4m ≥ ξ1 cos π(2i+1)

4m + ξ2 sin π(2i+1)4m , i = 0, . . . , m − 1,

ξ3 ≥ ξ1 cos π i2m + ξ2 sin π i

2m , i = 0, . . . , m.
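As a numerical sanity check of the constructions above, the following Python sketch (ours, not part of the paper; the function names are our own) implements the dyadic tower (24) and the inscribed and circumscribed linear approximations (26a)–(26b) of a 3-dimensional p-order cone for the uniform partition. The tower reproduces the J-dimensional p-norm exactly; the internal polyhedron lies inside the cone (touching it at the partition nodes α_i), while the external one contains it.

```python
import math

def tower_norm(w, p):
    # Dyadic tower (24): combine entries pairwise through 3-d p-cones.
    # For len(w) = 2^d this reproduces (w_1^p + ... + w_J^p)^(1/p) exactly.
    level = list(w)
    while len(level) > 1:
        level = [(level[2*i]**p + level[2*i + 1]**p) ** (1.0 / p)
                 for i in range(len(level) // 2)]
    return level[0]

def p_cone(xi1, xi2, p):
    # Exact bound of the 3-d p-cone (25): smallest feasible xi3.
    return (xi1**p + xi2**p) ** (1.0 / p)

def internal_required(xi1, xi2, p, m):
    # Smallest xi3 satisfying all inscribed inequalities (26a)
    # for the uniform partition alpha_i = pi*i/(2m).
    a = [math.pi * i / (2 * m) for i in range(m + 1)]
    s = [math.sin(t) ** (2.0 / p) for t in a]
    c = [math.cos(t) ** (2.0 / p) for t in a]
    best = 0.0
    for i in range(m):
        coef = s[i + 1] * c[i] - c[i + 1] * s[i]   # coefficient of xi3, > 0
        rhs = xi1 * (s[i + 1] - s[i]) + xi2 * (c[i] - c[i + 1])
        best = max(best, rhs / coef)
    return best

def external_required(xi1, xi2, p, m):
    # Smallest xi3 satisfying all circumscribed inequalities (26b).
    best = 0.0
    for i in range(m + 1):
        t = math.pi * i / (2 * m)
        coef = (math.cos(t)**p + math.sin(t)**p) ** ((p - 1.0) / p)
        rhs = xi1 * math.cos(t)**(p - 1) + xi2 * math.sin(t)**(p - 1)
        best = max(best, rhs / coef)
    return best
```

For a given (ξ_1, ξ_2), the minimal ξ_3 allowed by the internal inequalities is never below the true cone bound, and the external one never above it, with both gaps shrinking as m grows.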

2.3 Application to deviation measures

Since its introduction in Artzner et al. (1999), the axiomatic approach to construction of risk measures has been repeatedly employed by many authors for the development of other types of risk measures tailored to specific preferences and applications (see Rockafellar et al., 2006; Acerbi, 2002; Ruszczynski and Shapiro, 2006). In this subsection we consider deviation measures as introduced by Rockafellar et al. (2006). Namely, a deviation measure is a mapping D : X → [0, +∞] that satisfies

(D1) D(X) > 0 for any non-constant X ∈ X , whereas D(X) = 0 for constant X ,

(D2) D(X + Y ) ≤ D(X) + D(Y ) for all X, Y ∈ X ,

(D3) D(λX) = λD(X) for all X ∈ X , λ > 0,

(D4) D(X + a) = D(X) for all X ∈ X , a ∈ R.

Again, from axioms (D2) and (D3) follows convexity of D(X). In Rockafellar et al. (2006) it was shown that deviation measures that further satisfy

(D5) D(X) ≤ ess.sup X − EX for all X ∈ X ,


are characterized by the one-to-one correspondence

D(X) = R(X − EX) (27)

with expectation-bounded coherent risk measures, i.e., risk measures that satisfy (A1)–(A4) and an additional requirement

(A5) R(X) > EX for all non-constant X ∈ X , whereas R(X) = EX for all constant X .

Using this result, it is easy to provide an analog of formula (1) for deviation measures.

Theorem 4 Let function φ : X → R satisfy axioms (A1)–(A3), and be a lsc function such that φ(X) > EX for all X ≠ 0. Then the optimal value of the stochastic programming problem

D(X) = −EX + inf_η { η + φ(X − η) }   (28)

is a deviation measure, and the infimum is attained for all X, so that inf_η in (28) may be replaced by min_{η∈R}.

Given the close relationship between deviation measures and coherent risk measures, it is straightforward to apply the above results to deviation measures.

2.4 Connection with utility theory and second-order stochastic dominance

As it has been mentioned in the Introduction, considerable attention has been devoted in the literature to the development of risk models and measures compatible with the utility theory of von Neumann and Morgenstern (1944), which represents one of the cornerstones of the decision-making science.

The vNM theory argues that when the preference relation "≽" of the decision-maker satisfies certain axioms (completeness, transitivity, continuity, and independence), there exists a function u : R → R such that an outcome X is preferred to outcome Y ("X ≽ Y") if and only if E[u(X)] ≥ E[u(Y)]. If the function u is non-decreasing and concave, the corresponding preference is said to be risk averse. Rothschild and Stiglitz (1970) have bridged the vNM utility theory with the concept of second-order stochastic dominance by showing that X dominating Y by the second-order stochastic dominance, X ≽SSD Y, is equivalent to the relation E[u(X)] ≥ E[u(Y)] holding true for all concave non-decreasing functions u, where the inequality is strict for at least one such u. Recall that a random outcome X dominates outcome Y by the second-order stochastic dominance if

∫_{−∞}^{z} P[X ≤ t] dt ≤ ∫_{−∞}^{z} P[Y ≤ t] dt   for all z ∈ R.
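For discrete distributions given by equiprobable samples, the SSD relation above can be checked directly: X ≽SSD Y holds if and only if E[(t − X)+] ≤ E[(t − Y)+] for all t, and since both shortfall functions are piecewise linear with breakpoints at the sample atoms, it suffices to test t at the atoms. A minimal Python sketch (ours, not from the paper):

```python
def shortfall(sample, t):
    # E[(t - X)+] for an equiprobable sample of X
    return sum(max(t - x, 0.0) for x in sample) / len(sample)

def ssd_dominates(x, y):
    # True if the sample x dominates the sample y by SSD:
    # E[(t - X)+] <= E[(t - Y)+] for all t, checked at all atoms,
    # which suffices for piecewise-linear shortfall functions.
    return all(shortfall(x, t) <= shortfall(y, t) + 1e-12
               for t in set(x) | set(y))
```

For instance, [1, 2, 3, 4] SSD-dominates its mean-preserving spread [0, 2, 3, 5], but not vice versa.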

Since coherent risk measures are generally inconsistent with the second-order stochastic dominance (see an example in De Giorgi, 2005), it is of interest to introduce risk measures that comply with this property. To this end, we replace the monotonicity axiom (A1) in the definition of coherent risk measures by the requirement of SSD isotonicity (Pflug, 2000; De Giorgi, 2005):

(−X) ≽SSD (−Y) ⇒ R(X) ≤ R(Y).

Namely, we consider risk measures R : X → R that satisfy the following set of axioms:³

(A1′) SSD isotonicity: (−X) ≽SSD (−Y) ⇒ R(X) ≤ R(Y) for X, Y ∈ X ,

(A2′) convexity: R(λX + (1 − λ)Y) ≤ λR(X) + (1 − λ)R(Y), X, Y ∈ X , 0 ≤ λ ≤ 1,

(A3) positive homogeneity: R(λX) = λR(X), X ∈ X , λ > 0,

(A4) translation invariance: R(X + a) = R(X) + a, X ∈ X , a ∈ R.

³ See Mansini et al. (2003); Ogryczak and Opolska-Rutkowska (2006) for conditions under which SSD-isotonic measures also satisfy the coherence properties.


Note that unlike the system of axioms (A1)–(A4), the above axioms, and in particular (A1′), require X and Y to be integrable, i.e., one can take the space X in (A1′)–(A4) to be L1 (for a discussion of topological properties of sets defined by stochastic dominance relations, see, e.g., Dentcheva and Ruszczynski, 2004).

Again, it is possible to develop an analog of formula (1), which would allow for construction of risk measures with the above properties using functions that comply with (A1′), (A2′), and (A3).

Theorem 5 Let function φ : X → R satisfy axioms (A1′), (A2′), and (A3), and be a lsc function such that φ(η) > η for all real η ≠ 0. Then the optimal value of the stochastic programming problem

ρ(X) = min_{η∈R} { η + φ(X − η) }   (29)

exists and is a proper function that satisfies (A1′), (A2′), (A3), and (A4).

Obviously, by solving the risk-minimization problem

min_{x∈C} ρ(X(x, ω)),

where ρ is a risk measure that is both coherent and SSD-compatible in the sense of (A1′), one obtains a solution that is SSD-efficient, i.e., acceptable to any risk-averse rational utility maximizer, and that also bears the lowest risk in terms of coherence preference metrics. Below we illustrate that functions φ satisfying the conditions of Theorem 5 can be easily constructed in the scope of the presented approach.

Example 5.1 Let φ(X) = E[u(X)], where u : R → R is a convex, positively homogeneous, non-decreasing function such that u(η) > η for all η ≠ 0. Obviously, function φ(X) defined in this way satisfies the conditions of Theorem 5. Since −u(−η) is concave and non-decreasing, one has that −E[u(X)] ≥ −E[u(Y)], and, consequently, φ(X) ≤ φ(Y), whenever (−X) ≽SSD (−Y). It is easy to see that, for example, function φ of the form

φ(X) = αE(X)+ − βE(X)−,   α ∈ (1, +∞),  β ∈ [0, 1),

satisfies the conditions of Theorem 5. Thus, in accordance with Theorems 1 and 5, the coherent risk measure Rα,β (3) is also consistent with the second-order stochastic dominance. A special case of (3) is Conditional Value-at-Risk, which is known to be compatible with the second-order stochastic dominance (Pflug, 2000).

Example 5.2 (Higher-Moment Coherent Risk Measures) SMCR and, in general, the family of higher-moment coherent risk measures constitute another example of risk measures that are both coherent and compatible with the second-order stochastic dominance. Indeed, function u(η) = ((η)+)^p is convex and non-decreasing, whence (E[u(X)])^{1/p} ≤ (E[u(Y)])^{1/p} for any (−X) ≽SSD (−Y). Thus, the HMCR family, defined by (29) with φ(X) = (1 − α)^{−1} ‖(X)+‖_p,

HMCR_{p,α}(X) = min_{η∈R} η + (1 − α)^{−1} ‖(X − η)+‖_p,   p ≥ 1,

is both coherent and SSD-compatible, by virtue of Theorems 1 and 5. Implementation of such a risk measure in stochastic programming problems enables one to introduce risk preferences that are consistent with both concepts of coherence and second-order stochastic dominance.
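To make the definition concrete, the following sketch (ours, not from the paper) evaluates HMCR_{p,α} of an equiprobable scenario loss sample by minimizing the convex objective η + (1 − α)^{−1}‖(X − η)+‖_p over η with a ternary search; for p = 1 it reproduces CVaR. The search bracket below is an assumption adequate for the high tail levels (α = 0.9) used here, not a general-purpose bound.

```python
def hmcr(losses, p, alpha, iters=200):
    # HMCR_{p,alpha}(X) = min_eta { eta + (1-alpha)^{-1} ||(X - eta)+||_p }
    # for an equiprobable loss sample; the objective is convex in eta,
    # so ternary search converges to the minimum.
    n = len(losses)
    def obj(eta):
        m = sum(max(x - eta, 0.0) ** p for x in losses) / n
        return eta + m ** (1.0 / p) / (1.0 - alpha)
    # bracket assumption: wide enough for large alpha (optimal eta <= max)
    lo = min(losses) - (max(losses) - min(losses) + 1.0)
    hi = max(losses)
    for _ in range(iters):
        m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if obj(m1) <= obj(m2):
            hi = m2
        else:
            lo = m1
    return obj((lo + hi) / 2)
```

With ten equiprobable losses and α = 0.9, the p = 1 case equals CVaR_{0.9} (the worst-case atom), the p = 2 value is never smaller, a constant loss is its own risk, and translation invariance (A4) holds numerically.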

The developed family of higher-moment risk measures (4) possesses all the outstanding properties that are sought after in the realm of risk management and decision making under uncertainty: compliance with the coherence principles, amenability to efficient implementation in stochastic programming problems (e.g., via second-order cone programming), and compatibility with the second-order stochastic dominance and utility theory. The question that remains to be answered is whether these superior properties translate into an equally superior performance in practical risk management applications.

The next section reports a pilot study intended to investigate the performance of the HMCR measures in real-life risk management applications. It shows that the family of HMCR measures is a promising tool for tailoring risk preferences to the specific needs of decision-makers, and compares favorably with some of the most widely used risk management frameworks.

3 Portfolio choice using HMCR measures: An illustration

In this section we illustrate the practical utility of the developed Higher-Moment Coherent Risk measures on the example of portfolio optimization, a typical testing ground for many risk management and stochastic programming techniques. To this end, we compare portfolio optimization models that use the HMCR measures with p = 2 (SMCR) and p = 3 against portfolio allocation models based on two well-established and theoretically as well as practically proven methodologies: the Conditional Value-at-Risk measure and the Markowitz Mean-Variance framework.

This choice of benchmark models is further supported by the fact that the HMCR family is similar in construction and properties to CVaR (more precisely, CVaR is a HMCR measure with p = 1), but, while CVaR measures the risk in terms of the first moment of losses residing in the tail of the distribution, the SMCR measure quantifies risk using the second moments, in this way relating to the MV paradigm. The HMCR measure with p = 3 demonstrates the potential benefits of using higher-order tail moments of loss distributions for risk estimation.

Portfolio optimization models and implementation  The portfolio selection models employed in this case study have the general form

min_x  R(−r⊤x)   (30a)

s. t.  e⊤x = 1,   (30b)

       E[r⊤x] ≥ r0,   (30c)

       x ≥ 0,   (30d)

where x = (x_1, . . . , x_n)⊤ is the vector of portfolio weights, r = (r_1, . . . , r_n)⊤ is the random vector of assets' returns, and e = (1, . . . , 1)⊤. The risk measure R in (30a) is taken to be either SMCR (6), HMCR with p = 3 (4), CVaR (2), or the variance σ² of the negative portfolio return X = −r⊤x. In the above portfolio optimization problem, (30b) represents the budget constraint, which, together with the no-short-selling constraint (30d), ensures that all the available funds are invested, and (30c) imposes the minimal required level r0 for the expected return of the portfolio.

We have deliberately chosen not to include any additional trading or institutional constraints (transaction costs, liquidity constraints, etc.) in the portfolio allocation problem (30), so as to make the effect of the risk measure selection in (30a) on the resulting portfolio rebalancing strategy more marked and visible.

As is traditional in stochastic programming, the distribution of the random return r_i of asset i is modeled using a set of J discrete equiprobable scenarios {r_{i1}, . . . , r_{iJ}}. Then, optimization problem (30) reduces to a linear programming problem if CVaR is selected as the risk measure R in (30a) (see, for instance, Rockafellar and Uryasev, 2000; Krokhmal et al., 2002a). Within the Mean-Variance framework, (30) becomes a convex quadratic optimization problem with the objective

R(−r⊤x) = Σ_{i,k=1}^{n} σ_{ik} x_i x_k,   where   σ_{ik} = (1/(J − 1)) Σ_{j=1}^{J} (r_{ij} − r̄_i)(r_{kj} − r̄_k),   r̄_i = (1/J) Σ_{j=1}^{J} r_{ij}.   (31)
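The quadratic objective in (31) can be evaluated directly from the scenario matrix using the identity Σ_{i,k} σ_{ik} x_i x_k = (1/(J − 1)) Σ_j (Σ_i (r_{ij} − r̄_i) x_i)². A short Python sketch (ours, not from the paper):

```python
def mv_objective(R, x):
    # R[j][i]: return of asset i under scenario j; x: portfolio weights.
    # Computes sum_{i,k} sigma_ik x_i x_k with sigma_ik estimated as in (31).
    J, n = len(R), len(R[0])
    mean = [sum(R[j][i] for j in range(J)) / J for i in range(n)]
    var = 0.0
    for j in range(J):
        # centered portfolio return in scenario j
        d = sum((R[j][i] - mean[i]) * x[i] for i in range(n))
        var += d * d
    return var / (J - 1)
```

This avoids forming the n × n covariance matrix explicitly, which is convenient when J is large.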

In the case of R(X) = HMCR_{p,α}(X), problem (30) transforms into a problem with linear objective and constraints and a single p-order cone constraint (32e):

min  η + (1 − α)^{−1} J^{−1/p} t   (32a)

s. t.  Σ_{i=1}^{n} x_i = 1,   (32b)

       (1/J) Σ_{j=1}^{J} Σ_{i=1}^{n} r_{ij} x_i ≥ r0,   (32c)

       w_j ≥ −Σ_{i=1}^{n} r_{ij} x_i − η,   j = 1, . . . , J,   (32d)

       t ≥ (w_1^p + . . . + w_J^p)^{1/p},   (32e)

       x_i ≥ 0,   i = 1, . . . , n,   (32f)

       w_j ≥ 0,   j = 1, . . . , J.   (32g)

When p = 2, i.e., R is equal to SMCR, (32) reduces to a SOCP problem. In the case when R is selected as HMCR with p = 3, the 3rd-order cone constraint (32e) has been approximated via linear inequalities (26a) with m = 500, thereby transforming problem (32) into an LP. The resulting mathematical programming problems have been implemented in C++, and we used CPLEX 10.0 for solving the LP and QP problems, and MOSEK 4.0 for solving the SOCP problem.

Set of instruments and scenario data  Since the introduced family of HMCR risk measures quantifies risk in terms of higher tail moments of loss distributions, the portfolio optimization case studies were conducted using a data set that contained return distributions of fifty S&P500 stocks with so-called "heavy tails." Namely, for scenario generation we used 10-day historical returns over J = 1024 overlapping periods, calculated using daily closing prices from Oct 30, 1998 to Jan 18, 2006. From the set of S&P500 stocks (as of January 2006) we selected n = 50 instruments by picking the ones with the highest values of kurtosis of biweekly returns, calculated over the specified period. In such a way, the investment pool had an average kurtosis of 51.93, with 429.80 and 17.07 being the maximum and minimum kurtosis, respectively. The particular size of the scenario model, J = 1024 = 2^10, has been chosen so that the linear approximation techniques (26) can be employed for the HMCR measure with p = 3.

Out-of-sample simulations  The primary goal of our case study is to shed light on the potential "real-life" performance of the HMCR measures in risk management applications, and to this end we conducted the so-called out-of-sample experiments. As the name suggests, the out-of-sample tests determine the merits of a constructed solution using out-of-sample data that have not been included in the scenario model used to generate the solution. In other words, the out-of-sample setup simulates a common situation when the true realization of uncertainties happens to be outside of the set of the "expected," or "predicted," scenarios (as is the case for most portfolio optimization models). Here, we employ the out-of-sample method to compare simulated historical performances of four self-financing portfolio rebalancing strategies that are based on (30) with R chosen either as a member of the HMCR family with α = 0.90, namely CVaR0.90(·), SMCR0.90(·), or HMCR3,0.90(·), or as the variance σ²(·).

It may be argued that in the practice of portfolio management, instead of solving (30), it is of more interest to construct investment portfolios that maximize the expected return subject to risk constraint(s), e.g.,

max_{x≥0} { E[r⊤x] | R(−r⊤x) ≤ c0, e⊤x = 1 }.   (33)

Indeed, many investment institutions are required to keep their investment portfolios in compliance with numerous constraints, including constraints on risk. However, our main point is to gauge the effectiveness of the HMCR risk measures in portfolio optimization by comparing them against other well-established risk management methodologies, such as the CVaR and MV frameworks. And since these risk measures yield risk estimates on different scales, it is not obvious which risk tolerance levels c0 should be selected in (33) to make the resulting portfolios comparable.

Thus, to facilitate a fair "apple-to-apple" comparison, we construct self-financing portfolio rebalancing strategies by solving the risk-minimization problem (30), so that the resulting portfolios all have the same level r0 of expected return, and the success of a particular portfolio rebalancing strategy will depend on the actual amount of risk borne by the portfolio due to the utilization of the corresponding risk measure.

The out-of-sample experiments have been set up as follows. The initial portfolios were constructed on Dec 11, 2002 by solving the corresponding variant of problem (30), where the scenario set consisted of 1024 overlapping bi-weekly returns covering the period from Oct 30, 1998 to Dec 11, 2002. The duration of the rebalancing period for all strategies was set at two weeks (ten business days). On the next rebalancing date of Dec 26, 2002,⁴ the 10-day out-of-sample portfolio returns, r⊤x∗, were observed for each of the portfolios, where r is the vector of out-of-sample (Dec 11, 2002 – Dec 26, 2002) returns and x∗ is the corresponding optimal portfolio configuration obtained on Dec 11, 2002. Then, all portfolios were rebalanced by solving (30) with an updated scenario set. Namely, we included in the scenario set the ten vectors of overlapping biweekly returns that realized during the ten business days from Dec 11, 2002 to Dec 26, 2002, and discarded the oldest ten vectors from October–November of 1998. The process was repeated on Dec 26, 2002, and so on. In such a way, the out-of-sample experiment consisted of 78 biweekly rebalancing periods covering more than three years. We ran the out-of-sample tests for different values of the minimal required expected return r0, and typical results are presented in Figures 1 to 3.
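The rolling-horizon procedure just described can be sketched as follows (Python, ours, not from the paper; `optimize` is a hypothetical stand-in for solving (30) on the current scenario window, illustrated here by a trivial equal-weight rule):

```python
def run_backtest(returns, optimize, window):
    # returns[t]: vector of asset returns realized over period t.
    # At each step, weights are fit on the trailing `window` scenarios,
    # then the next period's out-of-sample return is realized.
    value, path = 100.0, [100.0]
    for t in range(window, len(returns)):
        x = optimize(returns[t - window:t])              # in-sample fit
        r = sum(w * ri for w, ri in zip(x, returns[t]))  # out-of-sample return
        value *= 1.0 + r
        path.append(value)
    return path

def equal_weight(scenarios):
    # placeholder optimizer: 1/n weights, regardless of the scenarios
    n = len(scenarios[0])
    return [1.0 / n] * n
```

In the actual experiments the window holds 1024 overlapping biweekly scenarios and each step advances by ten business days; the sketch only conveys the bookkeeping.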

[Figure 1: plot of portfolio value (% of initial investment) vs. rebalancing dates, Dec-11-2002 to Jan-18-2006, for the MV, CVaR, SMCR, and HMCR strategies.]

Figure 1: Out-of-sample performance of conservative (r0 = 0.5%) self-financing portfolio rebalancing strategies based on the MV model, and the CVaR, SMCR (p = 2), and HMCR (p = 3) measures of risk with α = 0.90.

Figure 1 reports the portfolio values (in percent of the initial investment) for the four portfolio rebalancing strategies based on (30) with SMCR0.90(·), CVaR0.90(·), variance σ²(·), and HMCR3,0.90(·) as R(·), and the required level r0 of the expected return set at 0.5%. One can observe that in this case the clear winner is the portfolio based on the HMCR measure with p = 3, the SMCR-based portfolio is the runner-up, and the CVaR- and MV-based portfolios fall behind these two. This situation is typical for smaller values of r0; as r0 increases and the rebalancing strategies become more aggressive, the CVaR- and MV-portfolios become more competitive, while the HMCR (p = 3) portfolio remains dominant most of the time (Fig. 2).

An illustration of the typical behavior of more aggressive rebalancing strategies is presented in Figure 3, where r0 is set

⁴ Holidays were omitted from the data.

[Figure 2: plot of portfolio value (% of initial investment) vs. rebalancing dates, Dec-11-2002 to Jan-18-2006, for the MV, CVaR, SMCR, and HMCR strategies.]

Figure 2: Out-of-sample performance of self-financing portfolio rebalancing strategies that have expected return r0 = 1.0% and are based on the MV model, and the CVaR, SMCR (p = 2), and HMCR (p = 3) risk measures with α = 0.90.

[Figure 3: plot of portfolio value (% of initial investment) vs. rebalancing dates, Dec-11-2002 to Jan-18-2006, for the MV, CVaR, SMCR, and HMCR strategies.]

Figure 3: Out-of-sample performance of aggressive (r0 = 1.3%) self-financing portfolio rebalancing strategies based on the SMCR (p = 2), CVaR, MV, and HMCR (p = 3) measures of risk.

at 1.3% (in the dataset used in this case study, infeasibilities in (30) started to occur for values of r0 > 0.013). As a general trend, the HMCR (p = 2, 3) and CVaR portfolios exhibit similar performance (which can be explained by the fact that at high values of r0 the set of instruments capable of providing such a level of expected return is rather limited). Still, it may be argued that the HMCR (p = 3) strategy can be preferred to the other three, based on its overall stable performance throughout the duration of the run.

Finally, to illuminate the effects of taking into account higher-moment information in estimation of risks using the HMCR family of risk measures, we compared the simulated historical performances of the SMCR measure with parameter α = 0.90 and the CVaR measure with confidence level α∗ = 0.99, so that the relation α∗ = 2α − α² holds (see the discussion in Example 2.2). Recall that in such a case the expressions for SMCR and CVaR differ only in the location of the optimal η∗ (see (14)). For CVaR, the optimal η equals VaR, and for SMCR the corresponding value of η is determined by (12) and depends on the second tail moment of the distribution.

Figure 4 presents a typical outcome for mid-range values of the expected return level r0: most of the time, the SMCR portfolio dominates the corresponding CVaR portfolio. However, for smaller values of expected return (e.g., r0 ≤ 0.005), as well as for values approaching r0 = 0.013, the SMCR- and CVaR-based rebalancing strategies demonstrated very close performance. This can be explained by the fact that lower values of r0 lead to rather conservative portfolios, while values of r0 close to the infeasibility barrier of 0.013 lead to very aggressive and poorly diversified portfolios that are comprised of a limited set of assets capable of achieving this high level of the expected return.

[Figure 4: plot of portfolio value (% of initial investment) vs. rebalancing dates, Dec-11-2002 to Jan-18-2006, for the SMCR and CVaR strategies.]

Figure 4: Out-of-sample performance of self-financing portfolio rebalancing strategies based on the SMCR0.90 and CVaR0.99 measures of risk (r0 = 1.0%).

Although the obtained results are data-specific, the presented preliminary case studies indicate that the developed HMCR measures demonstrate very promising performance, and can be successfully employed in the practice of risk management and portfolio optimization.

4 Conclusions

In this paper we have considered modeling of risk-averse preferences in stochastic programming problems using risk measures. We utilized the axiomatic approach to the construction of coherent risk measures and deviation measures in order to develop simple representations for these risk measures via solutions of specially designed stochastic programming problems. Using the developed general representations, we introduced a new family of higher-moment coherent risk measures (HMCR). In particular, we considered the second-moment coherent risk measure (SMCR), which is implementable in stochastic optimization problems using second-order cone programming, and the 3rd-order HMCR (p = 3). The conducted numerical studies indicate that the HMCR measures can be effectively used in the practice of portfolio optimization, and compare well with well-established benchmark models such as the Mean-Variance framework or CVaR.

References

Acerbi, C. (2002) "Spectral measures of risk: A coherent representation of subjective risk aversion," Journal of Banking and Finance, 26 (7), 1487–1503.

Acerbi, C. and Tasche, D. (2002) “On the coherence of expected shortfall,” Journal of Banking and Finance, 26 (7), 1487–1503.

Alizadeh, F. and Goldfarb, D. (2003) “Second-order cone programming,” Mathematical Programming, 95 (1), 3–51.

Artzner, P., Delbaen, F., Eber, J.-M., and Heath, D. (1999) “Coherent Measures of Risk,” Mathematical Finance, 9 (3), 203–228.

Bawa, V. S. (1975) “Optimal Rules For Ordering Uncertain Prospects,” Review of Financial Studies, 2 (1), 95–121.

Ben-Tal, A. and Nemirovski, A. (1999) "On polyhedral approximations of the second-order cone," Mathematics of Operations Research, 26 (2).

Ben-Tal, A. and Teboulle, M. (1986) "Expected Utility, Penalty Functions, and Duality in Stochastic Nonlinear Programming," Management Science, 32 (11), 1445–1466.

Birge, J. R. and Louveaux, F. (1997) Introduction to Stochastic Programming, Springer, New York.

Chekhlov, A., Uryasev, S., and Zabarankin, M. (2005) "Drawdown Measure in Portfolio Optimization," International Journal of Theoretical and Applied Finance, 8 (1), 13–58.

De Giorgi, E. (2005) "Reward-Risk Portfolio Selection and Stochastic Dominance," Journal of Banking and Finance, 29 (4), 895–926.

Delbaen, F. (2002) "Coherent risk measures on general probability spaces," in: K. Sandmann and P. J. Schonbucher (Eds.) "Advances in Finance and Stochastics: Essays in Honour of Dieter Sondermann," 1–37, Springer.

Dembo, R. S. and Rosen, D. (1999) "The Practice of Portfolio Replication: A Practical Overview of Forward and Inverse Problems," Annals of Operations Research, 85, 267–284.

Dentcheva, D. and Ruszczynski, A. (2003) "Optimization with Stochastic Dominance Constraints," SIAM Journal on Optimization, 14 (2), 548–566.

Dentcheva, D. and Ruszczynski, A. (2004) "Semi-Infinite Probabilistic Optimization: First Order Stochastic Dominance Constraints," Optimization, 53 (5–6), 583–601.

Duffie, D. and Pan, J. (1997) “An Overview of Value-at-Risk,” Journal of Derivatives, 4, 7–49.

Fischer, T. (2003) "Risk capital allocation by coherent risk measures based on one-sided moments," Insurance: Mathematics and Economics, 32 (1), 135–146.

Fishburn, P. C. (1964) "Stochastic Dominance and Moments of Distributions," Mathematics of Operations Research, 5, 94–100.

Fishburn, P. C. (1970) Utility Theory for Decision-Making, Wiley, New York.

Fishburn, P. C. (1977) "Mean-Risk Analysis with Risk Associated with Below-Target Returns," The American Economic Review, 67 (2), 116–126.

Fishburn, P. C. (1988) Non-Linear Preference and Utility Theory, Johns Hopkins University Press, Baltimore.

Jorion, P. (1997) Value at Risk: The New Benchmark for Controlling Market Risk, McGraw-Hill.

JP Morgan (1994) Riskmetrics, JP Morgan, New York.

Karni, E. and Schmeidler, D. (1991) "Utility Theory with Uncertainty," in: Hildenbrand and Sonnenschein (Eds.) "Handbook of Mathematical Economics," volume IV, North-Holland, Amsterdam.

Kouvelis, P. and Yu, G. (1997) Robust Discrete Optimization and Its Applications, Kluwer Academic Publishers, Dordrecht.

Krokhmal, P., Palmquist, J., and Uryasev, S. (2002a) "Portfolio Optimization with Conditional Value-At-Risk Objective and Constraints," Journal of Risk, 4 (2), 43–68.


Krokhmal, P., Uryasev, S., and Zrazhevsky, G. (2002b) "Risk Management for Hedge Fund Portfolios: A Comparative Analysis of Linear Rebalancing Strategies," Journal of Alternative Investments, 5 (1), 10–29.

Kroll, Y., Levy, H., and Markowitz, H. M. (1984) "Mean-Variance Versus Direct Utility Maximization," Journal of Finance, 39 (1), 47–61.

Levy, H. (1998) Stochastic Dominance, Kluwer Academic Publishers, Boston-Dordrecht-London.

Mansini, R., Ogryczak, W., and Speranza, M. G. (2003) "LP solvable models for portfolio optimization: a classification and computational comparison," IMA Journal of Management Mathematics, 14 (3), 187–220.

Markowitz, H. M. (1952) “Portfolio Selection,” Journal of Finance, 7 (1), 77–91.

Markowitz, H. M. (1959) Portfolio Selection, Wiley and Sons, New York, 1st edition.

Ogryczak, W. and Opolska-Rutkowska, M. (2006) "SSD consistent criteria and coherent risk measures," in: F. Ceragioli, A. Dontchev, H. Furuta, K. Marti, and L. Pandolfi (Eds.) "System Modeling and Optimization. Proceedings of the 22nd IFIP TC7 Conference," volume 199 of IFIP International Federation for Information Processing, 227–237.

Ogryczak, W. and Ruszczynski, A. (1999) "From stochastic dominance to mean-risk models: Semideviations as risk measures," European Journal of Operational Research, 116, 33–50.

Ogryczak, W. and Ruszczynski, A. (2001) "On consistency of stochastic dominance and mean-semideviation models," Mathematical Programming, 89, 217–232.

Ogryczak, W. and Ruszczynski, A. (2002) "Dual stochastic dominance and related mean-risk models," SIAM Journal on Optimization, 13 (1), 60–78.

Pflug, G. (2000) "Some Remarks on the Value-at-Risk and the Conditional Value-at-Risk," in: S. Uryasev (Ed.) "Probabilistic Constrained Optimization: Methodology and Applications," Kluwer Academic Publishers.

Prekopa, A. (1995) Stochastic Programming, Kluwer Academic Publishers.

Rockafellar, R. T. (1970) Convex Analysis, volume 28 of Princeton Mathematics, Princeton University Press.

Rockafellar, R. T. and Uryasev, S. (2000) “Optimization of Conditional Value-at-Risk,” Journal of Risk, 2, 21–41.

Rockafellar, R. T. and Uryasev, S. (2002) "Conditional Value-at-Risk for General Loss Distributions," Journal of Banking and Finance, 26 (7), 1443–1471.

Rockafellar, R. T., Uryasev, S., and Zabarankin, M. (2006) "Generalized Deviations in Risk Analysis," Finance and Stochastics, 10 (1), 51–74.

Roman, D., Darby-Dowman, K., and Mitra, G. (2006) "Portfolio construction based on stochastic dominance and target return distributions," Mathematical Programming, 108, 541–569.

Rothschild, M. and Stiglitz, J. (1970) “Increasing risk I: a definition,” Journal of Economic Theory, 2 (3), 225–243.

Ruszczynski, A. and Shapiro, A. (2006) "Optimization of Convex Risk Functions," Mathematics of Operations Research, 31 (3), 433–452.

Schied, A. and Follmer, H. (2002) "Robust preferences and convex measures of risk," in: K. Sandmann and P. J. Schonbucher (Eds.) "Advances in Finance and Stochastics: Essays in Honour of Dieter Sondermann," 39–56, Springer.

Tasche, D. (2002) “Expected shortfall and beyond,” Working paper, http://arxiv.org/abs/cond-mat/0203558.

Testuri, C. and Uryasev, S. (2003) "On Relation Between Expected Regret and Conditional Value-at-Risk," in: Z. Rachev (Ed.) "Handbook of Numerical Methods in Finance," Birkhauser.

van der Vlerk, M. H. (2003) “Integrated Chance Constraints in an ALM Model for Pension Funds,” Working paper.

von Neumann, J. and Morgenstern, O. (1944) Theory of Games and Economic Behavior, Princeton University Press, Princeton, NJ, 3rd (1953) edition.


Young, M. R. (1998) "A Minimax Portfolio Selection Rule with Linear Programming Solution," Management Science, 44 (5), 673–683.

Zalinescu, C. (2002) Convex Analysis in General Vector Spaces, World Scientific, Singapore.

Appendix

Proof of Theorem 1. Convexity, lower semicontinuity, and sublinearity of φ in X imply that the function φ_X(η) = η + φ(X − η) is also convex, lsc, and proper in η ∈ R for each fixed X ∈ X. For the infimum of φ_X(η) to be achievable at finite η, its recession function has to be positive: φ_X 0⁺(±1) > 0, which is equivalent to φ_X 0⁺(ξ) > 0, ξ ≠ 0, due to the positive homogeneity of φ_X. By definition of the recession function (Rockafellar, 1970; Zalinescu, 2002) and positive homogeneity of φ, we have that the last condition holds if φ(ξ) > ξ for all ξ ≠ 0:

φ_X 0⁺(ξ) = lim_{τ→∞} [η + τξ + φ(X − η − τξ) − η − φ(X − η)] / τ = ξ + φ(−ξ).

Hence, ρ(X) defined by (1) is a proper lsc function, and the minimum in (1) is attained at finite η for all X ∈ X. Below we verify that ρ(X) satisfies axioms (A1)–(A4).

(A1) Let X ≤ 0. Then φ(X) ≤ 0 as φ satisfies (A1), which implies

min_{η∈R} η + φ(X − η) ≤ 0 + φ(X − 0) ≤ 0.

(A2) For any Z ∈ X let η_Z ∈ arg min_{η∈R} {η + φ(Z − η)} ⊂ R; then

ρ(X) + ρ(Y) = η_X + φ(X − η_X) + η_Y + φ(Y − η_Y)
    ≥ η_X + η_Y + φ(X + Y − η_X − η_Y) ≥ η_{X+Y} + φ(X + Y − η_{X+Y}) = ρ(X + Y).

(A3) For any fixed λ > 0 we have

ρ(λX) = min_{η∈R} {η + φ(λX − η)} = λ min_{η∈R} {η/λ + φ(X − η/λ)} = λρ(X).   (34)

(A4) Similarly, for any fixed a ∈ R,

ρ(X + a) = min_{η∈R} {η + φ(X + a − η)} = a + min_{η∈R} {(η − a) + φ(X − (η − a))} = a + ρ(X).   (35)

Thus, ρ(X) defined by (1) is a proper coherent risk measure. □

Proof of Theorem 2. The conditions on the function φ ensure that the set of optimal solutions of problem (1) is closed and finite, whence follows the existence of η(X) in (10). Property (A3) is established by noting that for any λ > 0 equality (34) implies

    η(λX) = Arg min_{η∈R} {η + φ(λX − η)} = Arg min_{η∈R} {η/λ + φ(X − η/λ)},

from which it follows that η(λX) = λη(X). Similarly, by virtue of (35), we have

    η(X + a) = Arg min_{η∈R} {η + φ(X + a − η)} = Arg min_{η∈R} {(η − a) + φ(X − (η − a))},

which leads to the sought relation (A4): η(X + a) = η(X) + a.

To validate the remaining statements of the theorem, consider φ to be such that φ(X) = 0 for every X ≤ 0. Then (C2) immediately yields φ(X) ≥ 0 for all X ∈ X, which proves

    η(X) ≤ η(X) + φ(X − η(X)) = ρ(X).

By the definition of η(X), we have for all X ≤ 0

    η(X) + φ(X − η(X)) ≤ 0 + φ(X − 0) = 0,  or  η(X) ≤ −φ(X − η(X)).   (36)

Assume that η(X) > 0, which implies φ(−η(X)) = 0. From (A2) it follows that φ(X − η(X)) ≤ φ(X) + φ(−η(X)) = 0, leading to φ(X − η(X)) = 0 and, consequently, to η(X) ≤ 0 by (36). The contradiction furnishes the statement of the theorem. □
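The scaling and translation properties of η(X) established above can likewise be observed numerically. The sketch below takes φ(X) = (1 − α)^{−1} ‖X^+‖_2 (a second-order instance of the HMCR family) and approximates the minimizer η(X) by grid search; the grid resolution, sample, and seed are assumptions of the illustration.

```python
import numpy as np

# Illustrative sketch (assumed setup): with phi(X) = (1 - alpha)^{-1} ||X^+||_2,
# the minimizer eta(X) of eta + phi(X - eta) should satisfy
# eta(lam X) = lam eta(X) and eta(X + a) = eta(X) + a, as shown in the proof.

rng = np.random.default_rng(1)
X = rng.normal(size=1000)
alpha = 0.9

def eta_of(x, alpha=alpha, n_grid=4001):
    etas = np.linspace(x.min() - 1.0, x.max() + 1.0, n_grid)
    excess = np.maximum(x[None, :] - etas[:, None], 0.0)
    p_norm = np.sqrt(np.mean(excess ** 2, axis=1))   # empirical ||(x - eta)^+||_2
    return etas[np.argmin(etas + p_norm / (1.0 - alpha))]

lam, a = 3.0, -0.4
print(abs(eta_of(lam * X) - lam * eta_of(X)) < 0.05)   # eta(lam X) = lam eta(X)
print(abs(eta_of(X + a) - (eta_of(X) + a)) < 0.05)     # eta(X + a) = eta(X) + a
```

The objective is convex in η, so the grid argmin sits within one grid step of the true minimizer, which the tolerances above comfortably absorb.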

Proof of Theorem 3. Denote the feasible sets of (21a) and (21b), respectively, as

    S_a = {x ∈ C | R(x) ≤ c}  and  S_b = {(x, η) ∈ C × R | Φ(x, η) ≤ c}.

Now observe that the projection Π_C(S_b) of the feasible set of (21b) onto C,

    Π_C(S_b) = {x ∈ C | (x, η) ∈ S_b for some η ∈ R}
             = {x ∈ C | Φ(x, η) ≤ c for some η ∈ R},

coincides with the feasible set of (21a):

    S_a = Π_C(S_b).   (37)

Indeed, x′ ∈ S_a means that x′ ∈ C and R(x′) = min_η Φ(x′, η) ≤ c. By virtue of Theorem 1 there exists η′ ∈ R such that Φ(x′, η′) = min_η Φ(x′, η), whence (x′, η′) ∈ S_b and, consequently, x′ ∈ Π_C(S_b). If, on the other hand, x′′ ∈ Π_C(S_b), then there exists η′′ ∈ R such that (x′′, η′′) ∈ S_b and therefore Φ(x′′, η′′) ≤ c. By the definition of R(·), R(x′′) ≤ Φ(x′′, η′′) ≤ c, thus x′′ ∈ S_a.

Given (37), it is easy to see that (21a) and (21b) achieve their minima at the same values of x ∈ C and that their optimal objective values coincide. Indeed, if x∗ is an optimal solution of (21a), then x∗ ∈ S_a and g(x∗) ≤ g(x) holds for all x ∈ S_a. By (37), if x ∈ S_a then there exists some η ∈ R such that (x, η) ∈ S_b. Thus, for all (x, η) ∈ S_b one has g(x∗) ≤ g(x), meaning that (x∗, η∗) is an optimal solution of (21b), where η∗ ∈ R is such that (x∗, η∗) ∈ S_b. Inversely, if (x∗, η∗) solves (21b), then (x∗, η∗) ∈ S_b and g(x∗) ≤ g(x) for all (x, η) ∈ S_b. According to (37), (x, η) ∈ S_b also yields x ∈ S_a; hence for all x ∈ S_a one has g(x∗) ≤ g(x), i.e., x∗ is an optimal solution of (21a).

Finally, assume that the risk constraint in (21a) is binding at optimality. If (x∗, η∗) achieves the minimum of (21b), then Φ(x∗, η∗) ≤ c and, according to the above, x∗ is an optimal solution of (21a), whence c = R(x∗) ≤ Φ(x∗, η∗) ≤ c. From the last relation we have Φ(x∗, η∗) = R(x∗), and thus η∗ ∈ arg min_η Φ(x∗, η). Now consider x∗ that solves (21a) and η∗ such that η∗ ∈ arg min_η Φ(x∗, η). This implies that Φ(x∗, η∗) = R(x∗) = c, or (x∗, η∗) ∈ S_b. Taking into account that g(x∗) ≤ g(x) for all x ∈ S_a and, consequently, for all (x, η) ∈ S_b, one has that (x∗, η∗) is an optimal solution of (21b). □
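The projection argument above can be made concrete on a toy discretized problem. In the sketch below (all specifics are assumptions for the illustration: a single risky asset, g(x) = −x over C = [0, 1], a CVaR-type Φ, and a shared grid for x and η), the optimum obtained by first minimizing Φ over η as in (21a) coincides with the optimum over the joint (x, η) feasible set as in (21b).

```python
import numpy as np

# Illustrative toy version of Theorem 3 (assumed setup, not the paper's model):
# minimize g(x) = -x over x in C = [0, 1] subject to a CVaR-type risk constraint,
# posed via R(x) = min_eta Phi(x, eta) <= c as in (21a), or jointly via
# Phi(x, eta) <= c as in (21b). Both formulations share one (x, eta) grid.

rng = np.random.default_rng(4)
Xs = rng.normal(0.02, 0.1, size=500)    # scenario returns of a risky asset
alpha, c = 0.9, 0.05

xs = np.linspace(0.0, 1.0, 51)
etas = np.linspace(-1.0, 1.0, 201)

# Phi(x, eta) = eta + (1 - alpha)^{-1} E[(-x*Xs - eta)^+], with losses -x*Xs
loss = -xs[:, None, None] * Xs[None, None, :]                  # shape (x, 1, scenario)
Phi = etas[None, :] + np.mean(
    np.maximum(loss - etas[None, :, None], 0.0), axis=2) / (1 - alpha)   # (x, eta)

R = Phi.min(axis=1)                            # (21a): R(x) = min_eta Phi(x, eta)
best_a = (-xs[R <= c]).min()                   # optimal objective of (21a)
best_b = (-xs[(Phi <= c).any(axis=1)]).min()   # (21b): feasible set projected onto x

print(best_a == best_b)                        # the two optima coincide
```

The equality mirrors (37) exactly: on a common grid, the set {x | min_η Φ(x, η) ≤ c} is precisely the projection of {(x, η) | Φ(x, η) ≤ c}.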

Proof of Theorem 4. Since formula (28) differs from (1) by the constant summand (−EX), we only have to verify that R(X) = inf_η {η + φ(X − η)} satisfies (A5). As φ(X) > EX for all X ≠ 0, we have that φ(X − η_X) > E(X − η_X) for all non-constant X ∈ X, where η_X ∈ arg min_η {η + φ(X − η)}. From the last inequality it follows that η_X + φ(X − η_X) > EX, or R(X) > EX, for all non-constant X ∈ X. Thus, D(X) > 0 for all non-constant X. For a ∈ R, inf_η {η + φ(a − η)} = a, whence D(a) = 0. □

Proof of Theorem 5. The proof of existence and of all properties except (A1′) is identical to that of Theorem 1. Property (A1′) follows elementarily: if (−X) ≽_SSD (−Y), then (−X + c) ≽_SSD (−Y + c), and consequently φ(X − c) ≤ φ(Y − c) for all c ∈ R, whence

    ρ(X) = η_X + φ(X − η_X) ≤ η_Y + φ(X − η_Y) ≤ η_Y + φ(Y − η_Y) = ρ(Y),

where, as usual, η_Z ∈ arg min_η {η + φ(Z − η)} ⊂ R for any Z ∈ X. □

Example 2.3: Additional details. To demonstrate the monotonicity of η_{p,α}(X) with respect to α ∈ (0, 1), observe that by the definition of η_{p,α1}(X),

    η_{p,α1}(X) + (1 − α1)^{−1} ‖(X − η_{p,α1})^+‖_p ≤ η_{p,α2}(X) + (1 − α1)^{−1} ‖(X − η_{p,α2})^+‖_p.   (38)

Now assume that η_{p,α1}(X) > η_{p,α2}(X) for some α1 < α2; then (38) yields

    0 < η_{p,α1}(X) − η_{p,α2}(X) ≤ (1 − α1)^{−1} (‖(X − η_{p,α2})^+‖_p − ‖(X − η_{p,α1})^+‖_p)
                                  < (1 − α2)^{−1} (‖(X − η_{p,α2})^+‖_p − ‖(X − η_{p,α1})^+‖_p).

From the last inequality it follows directly that

    η_{p,α1}(X) + (1 − α2)^{−1} ‖(X − η_{p,α1})^+‖_p < η_{p,α2}(X) + (1 − α2)^{−1} ‖(X − η_{p,α2})^+‖_p,

which contradicts the definition of η_{p,α2}(X).
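The monotonicity just proved can be observed numerically. The sketch below (an illustration under assumed parameters, with grid search standing in for the exact minimization) computes η_{p,α}(X) for the empirical p-norm on a discrete sample and checks that it is nondecreasing in α.

```python
import numpy as np

# Illustrative sketch (assumed sample and parameters): eta_{p,alpha}(X) minimizes
# eta + (1 - alpha)^{-1} ||(X - eta)^+||_p; by the argument above it should be
# nondecreasing in alpha. A grid search approximates the exact minimizer.

rng = np.random.default_rng(2)
X = rng.normal(size=1000)
p = 3.0

def eta_p_alpha(alpha, x=X, p=p, n_grid=2001):
    etas = np.linspace(x.min(), x.max(), n_grid)
    excess = np.maximum(x[None, :] - etas[:, None], 0.0)
    p_norm = np.mean(excess ** p, axis=1) ** (1.0 / p)   # empirical ||(x - eta)^+||_p
    return etas[np.argmin(etas + p_norm / (1.0 - alpha))]

levels = [0.50, 0.80, 0.90, 0.95, 0.99]
etas = [eta_p_alpha(a) for a in levels]
print(all(e1 <= e2 + 1e-2 for e1, e2 in zip(etas, etas[1:])))   # nondecreasing in alpha
```

The small slack in the comparison absorbs the one-grid-step error of the convex grid search.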

The limiting behavior of η_{p,α}(X) can be verified by noting first that for 1 ≤ p < ∞

    lim_{α→1} HMCR_{p,α}(X) = ess.sup X.   (39)

Indeed, using the notation of Example 1.3, one has

    lim_{α→1} inf_η {η + (1 − α)^{−1} ‖(X − η)^+‖_p} ≤ inf_η lim_{α→1} {η + (1 − α)^{−1} ‖(X − η)^+‖_p}
                                                    = inf_η {η + φ∗(X − η)} = ess.sup X.

On the other hand, from the inequality (see Example 1.4)

    HMCR_{p,α}(X) ≤ HMCR_{q,α}(X) for 1 ≤ p < q,

and the fact that lim_{α→1} CVaR_α(X) = ess.sup X (see, e.g., Rockafellar et al. (2006)), we obtain

    ess.sup X = lim_{α→1} CVaR_α(X) ≤ lim_{α→1} HMCR_{p,α}(X),

which verifies (39). The existence of lim_{α→1} η_{p,α}(X) ∈ R follows from the monotonicity of η_{p,α}(X) with respect to α. Theorem 2 maintains that η_{p,α}(X) ≤ HMCR_{p,α}(X), whence

    lim_{α→1} η_{p,α}(X) ≤ ess.sup X.

In the case of finite ess.sup X, by rewriting (39) in the form

    lim_{α→1} {η_{p,α}(X) + (1 − α)^{−1} ‖(X − η_{p,α}(X))^+‖_p} = ess.sup X,

and assuming that lim_{α→1} η_{p,α}(X) = ess.sup X − ε for some ε ≥ 0, it is easy to see that the above equality holds only in the case ε = 0.
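On a finite sample, ess.sup X is simply the sample maximum, so the limit (39) can be observed directly. The sketch below (illustrative parameters and a grid search, not a method from the paper) checks that HMCR_{p,α}(X) is nondecreasing in α and approaches max X as α → 1.

```python
import numpy as np

# Illustrative sketch of (39) (assumed setup): for the empirical distribution of a
# finite sample, ess.sup X = max(X); HMCR_{p,alpha}(X) should be nondecreasing in
# alpha and approach max(X) as alpha -> 1. Grid search approximates the minimum.

rng = np.random.default_rng(3)
X = rng.uniform(-1.0, 1.0, size=1000)
p = 3.0

def hmcr(alpha, x=X, p=p, n_grid=4001):
    etas = np.linspace(x.min(), x.max() + 0.5, n_grid)
    excess = np.maximum(x[None, :] - etas[:, None], 0.0)
    p_norm = np.mean(excess ** p, axis=1) ** (1.0 / p)   # empirical ||(x - eta)^+||_p
    return np.min(etas + p_norm / (1.0 - alpha))

vals = [hmcr(a) for a in (0.5, 0.9, 0.999)]
print(vals[0] <= vals[1] <= vals[2])     # nondecreasing in alpha
print(abs(vals[2] - X.max()) < 1e-2)     # near ess.sup X for alpha close to 1
```

Monotonicity holds exactly here because, on a fixed η grid, the objective is pointwise nondecreasing in α.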
