selected topics from portfolio optimization
TRANSCRIPT
Selected Topics fromPortfolio Optimization
Including some topics on
Stochastic Optimization
Lecture Notes
Winter 2020
Alois Pichler
Faculty of Mathematics
DRAFTVersion as of May 28, 2021
2
rough draft: do not distribute
Preface and Acknowledgment
Le silence รฉternel de ces espacesinfinis mโeffraie.
Blaise Pascal, Pensรฉes,Fragment Transition no 7 / 8
The purpose of these lecture notes is to facilitate the content of the lecture and the course.From experience it is helpful and recommended to attend and follow the lectures in addition.These lecture notes do not cover the lectures completely.
I am indebted to Prof. Georg Ch. Pflug for numerous discussions in the area and signifi-cant support over years.
Please report mistakes, errors, violations of copyright, improvements or necessary com-pletions.
Content: https://www.tu-chemnitz.de/mathematik/studium/module/2013/M16.pdf
3
4
rough draft: do not distribute
Contents
1 Historical Milestones in Portfolio Optimization 91.1 In Banking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91.2 In Insurance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
Bibliography 10
2 Introduction and Classification of Stochastic Programs 132.1 Relations and Connections to Portfolio Optimization: Markowitz . . . . . . . . 132.2 Alternative Formulations of the Markowitz Problem . . . . . . . . . . . . . . . . 132.3 Involving Risk Functionals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.3.1 Risk Neutral . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142.3.2 Utility Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142.3.3 Robust Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152.3.4 Distributionally Robust Optimization . . . . . . . . . . . . . . . . . . . . 15
2.4 Probabilistic Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152.5 Stochastic Dominance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152.6 On General Difficulties In Stochastic Optimization . . . . . . . . . . . . . . . . . 15
3 The Markowitz Model 173.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173.2 Empirical Problem Formulation And Variables . . . . . . . . . . . . . . . . . . . 173.3 The Empirical/ Discrete Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193.4 The First Moment: Return . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203.5 The Second Moment: Risk . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213.6 The Non-Empirical Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . 223.7 The Capital Asset Pricing Model (CAPM) . . . . . . . . . . . . . . . . . . . . . 23
3.7.1 The Mean-Variance Plot . . . . . . . . . . . . . . . . . . . . . . . . . . . 253.7.2 Tangency portfolio . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263.7.3 The Two Fund Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.8 Markowitz Portfolio Including a Risk Free Asset . . . . . . . . . . . . . . . . . . 293.9 One Fund Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
3.9.1 Capital Asset Pricing Model (CAPM) . . . . . . . . . . . . . . . . . . . . 313.9.2 On systematic and specific risk . . . . . . . . . . . . . . . . . . . . . . . 333.9.3 Sharpe ratio . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
3.10 Alternative Formulations of the Markowitz Problem . . . . . . . . . . . . . . . . 333.11 Principal Components . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 343.12 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
5
6 CONTENTS
4 Value-at-Risk 374.1 Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 374.2 How about adding risk? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 374.3 Properties of the Value-at-Risk . . . . . . . . . . . . . . . . . . . . . . . . . . . 404.4 Profit versus loss . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 404.5 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
5 Axiomatic Treatment of Risk 43
6 Examples of Coherent Risk Functionals 456.1 Mean Semi-Deviation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 456.2 Average Value-at-Risk . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 466.3 Entropic Value-at-Risk . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 486.4 Spectral Risk Measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 486.5 Kusuokaโs Representation of Law Invariant Risk Measures . . . . . . . . . . . 506.6 Application in Insurance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 516.7 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
7 Portfolio Optimization Problems Involving Risk Measures 537.1 Integrated Risk Management Formulation . . . . . . . . . . . . . . . . . . . . . 537.2 Markowitz Type Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 537.3 Alternative Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
8 Expected Utility Theory 578.1 Examples of utility functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 578.2 ArrowโPratt measure of absolute risk aversion . . . . . . . . . . . . . . . . . . 578.3 Example: St. Petersburg Paradox1 . . . . . . . . . . . . . . . . . . . . . . . . . 588.4 Preferences and utility functions . . . . . . . . . . . . . . . . . . . . . . . . . . 59
9 Stochastic Orderings 619.1 Stochastic Dominance of First Order . . . . . . . . . . . . . . . . . . . . . . . . 619.2 Stochastic Dominance of Second Order . . . . . . . . . . . . . . . . . . . . . . 629.3 Portfolio Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 659.4 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
10 Arbitrage 6710.1 Type A . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6710.2 Type B . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
11 The Flowergirl Problem2 7111.1 The Flowergirl problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7111.2 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
12 Duality For Convex Risk Measures 73
1By Ruben Schlotter2Also: Newsboy, or Newsvendor problem
rough draft: do not distribute
CONTENTS 7
13 Stochastic Optimization: Terms, and Definitions, and the Deterministic Equiva-lent 7513.1 Expected Value of Perfect Information (EVPI) and Value of Stochastic Solution
(VSS) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7513.2 The Farmer Ted . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7513.3 The Risk-Neutral Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7513.4 Glossary/ Concept/ Definitions: . . . . . . . . . . . . . . . . . . . . . . . . . . . 7613.5 KKT for (13.2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7613.6 Deterministic Equivalent . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7713.7 L-Shaped Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7713.8 Farkasโ Lemma . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7813.9 L-Shaped Algorithm. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7813.10Variants of the Algorithm. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
14 Co- and Antimonotonicity 8114.1 Rearrangements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8114.2 Comonotonicity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8214.3 Integration of Random Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . 8414.4 Copula . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8414.5 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
15 Convexity 8715.1 Properties of Convex Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . 8715.2 Duality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8915.3 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
16 Sample Average Approximation (SAA) 9316.1 SAA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
16.1.1 Pointwise LLN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9316.1.2 Pointwise and Functional CLT . . . . . . . . . . . . . . . . . . . . . . . . 94
16.2 The ฮ-method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
17 Weak Topology of Measures 9717.1 General Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9717.2 The Wasserstein Distance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9817.3 The Real Line . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
18 Topologies For Set-Valued Convergence 10118.1 Topological features of Minkowski addition . . . . . . . . . . . . . . . . . . . . . 101
18.1.1 Topological features of convex sets . . . . . . . . . . . . . . . . . . . . . 10118.2 Preliminaries and Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
18.2.1 Convexity, and Conjugate Duality . . . . . . . . . . . . . . . . . . . . . . 10218.2.2 PompeiuโHausdorff Distance . . . . . . . . . . . . . . . . . . . . . . . . 103
18.3 Local description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
Version: May 28, 2021
8 CONTENTS
rough draft: do not distribute
1Historical Milestones in Portfolio Optimization
Probability is the foundation of banking.
Francis Ysidro Edgeworth, 1845โ1926,Anglo-Irish philosopher and political
economist. Edgeworth [1888]
1.1 IN BANKING
โฒ 1938: Bond Duration, Edgeworth [1888]
โฒ 1952: Markowitz mean-variance framework
โฒ 1963: Sharpโs capital asset pricing model
โฒ 1966: Multiple factor models
โฒ 1973: Black & Scholes option pricing model, the โGreeksโ
โฒ 1988: Risk weighted assets for banks
โฒ 1993: Value-at-Risk
โฒ 1994: Risk Metrics
โฒ 1997: Credit Metrics
โฒ 1998: Integration of credit and market risk
โฒ 1998: Risk Budgeting, the Basel Rules
โฒ 2007: Basel II
โฒ 2017: Basel III
1.2 IN INSURANCE
The natural business of insurance companies is concerned with Risk.
โฒ Pricing of individual Contracts
โฒ Reserving in the Portfolio
โฒ The Cramรฉr-Lundberg model
โฒ Solvability
9
10 HISTORICAL MILESTONES IN PORTFOLIO OPTIMIZATION
โฒ Solvency II
โฒ US and Canada Insurance Supervisory: Conditional Tail Expectation
rough draft: do not distribute
Bibliography
P. Artzner, F. Delbaen, and D. Heath. Thinking coherently. Risk, 10:68โ71, 1997. 43
P. Artzner, F. Delbaen, J.-M. Eber, and D. Heath. Coherent Measures of Risk. MathematicalFinance, 9:203โ228, 1999. doi:10.1111/1467-9965.00068. 43
A. Ben-Tal and A. Nemirovski. Lectures on modern convex optimization. SIAM Series onOptimization. 2001. 15
R. I. Bot, S.-M. Grad, and G. Wanka. Duality in Vector Optimization. Springer, 2009.doi:10.1007/978-3-642-02886-1. 87
C. Castaing and M. Valadier. Convex Analysis and Measurable Multifunctions. Number580 in Lecture Notes in Mathematics. Springer, 1977. doi:10.1007/BFb0087685. URLhttps://books.google.com/books?id=Fev0CAAAQBAJ. 103
G. Cornuejols and R. Tรผtรผncรผ. Optimization Methods in Finance. Cambridge UniversityPress (CUP), 2006. doi:10.1017/cbo9780511753886. 67
D. Denneberg. Non-additive measure and integral, volume 27. Springer Science & BusinessMedia, 1994. doi:10.1007/978-94-017-2434-0. 82
D. Dentcheva and A. Ruszczynski. Portfolio optimization with risk control by stochastic dom-inance constraints. In G. Infanger, editor, Stochastic Programming, volume 150 of Interna-tional series in Operations Research & Management Science, chapter 9, pages 189โ211.Springer Science+Business Media, LLC, 2011. doi:10.1007/978-1-4419-1642-6. 65
F. Y. Edgeworth. The mathematical theory of banking. Journal of the Royal Statistical Society,51(1):113โ127, 1888. 9
H. Fรถllmer and A. Schied. Stochastic Finance: An Introduction in Discrete Time. de GruyterStudies in Mathematics 27. Berlin, Boston: De Gruyter, 2004. ISBN 978-3-11-046345-3.doi:10.1515/9783110218053. URL http://books.google.com/books?id=cL-bZSOrqWoC.51
C. Hess. Set-valued integration and set-valued probability theory: An overview. In E. Pap,editor, Handbook of Measure Theory, volume I, II of Handbook of Measure Theory, chap-ter 14, pages 617โ673. Elsevier, 2002. doi:10.1016/B978-044450263-6/50015-4. 103
A. Mรผller and D. Stoyan. Comparison methods for stochastic models and risks. Wiley seriesin probability and statistics. Wiley, Chichester, 2002. ISBN 978-0-471-49446-1. URL https://books.google.com/books?id=a8uPRWteCeUC. 64
G. Ch. Pflug and A. Pichler. Multistage Stochastic Optimization. Springer Series inOperations Research and Financial Engineering. Springer, 2014. ISBN 978-3-319-08842-6. doi:10.1007/978-3-319-08843-3. URL https://books.google.com/books?id=q_VWBQAAQBAJ. 71, 98
11
12 BIBLIOGRAPHY
G. Ch. Pflug and W. Rรถmisch. Modeling, Measuring and Managing Risk. World Scientific,River Edge, NJ, 2007. doi:10.1142/9789812708724. 17, 39, 51, 72
S. T. Rachev and L. Rรผschendorf. Mass Transportation Problems Volume I: Theory, VolumeII: Applications, volume XXV of Probability and its applications. Springer, New York, 1998.doi:10.1007/b98893. 99
R. T. Rockafellar. Conjugate Duality and Optimization, volume 16. CBMS-NSF Regional Con-ference Series in Applied Mathematics. 16. Philadelphia, Pa.: SIAM, Society for Industrialand Applied Mathematics. VI, 74 p., 1974. doi:10.1137/1.9781611970524. 103
R. T. Rockafellar and R. J.-B. Wets. Variational Analysis. Springer Nature SwitzerlandAG, 1997. doi:10.1007/978-3-642-02431-3. URL https://books.google.com/books?id=w-NdOE5fD8AC. 103
A. Shapiro. Time consistency of dynamic risk measures. Operations Research Letters, 40(6):436โ439, 2012. doi:10.1016/j.orl.2012.08.007. 51
A. Shapiro, D. Dentcheva, and A. Ruszczynski. Lectures on Stochastic Programming. MOS-SIAM Series on Optimization. SIAM, second edition, 2014. doi:10.1137/1.9780898718751.51, 93
A. E. van Heerwaarden and R. Kaas. The Dutch premium principle. Insurance: Mathematicsand Economics, 11:223โ230, 1992. doi:10.1016/0167-6687(92)90049-H. 51
rough draft: do not distribute
2Introduction and Classification of Stochastic Programs
Universitรคten sind gefรคhrlicher alsHandgranaten.
Ruhollah Chomeini, 1902โ1989
We employ the usual axioms in probability theory and denote a probability space by
(ฮฉ, F , ๐).
Typically, we denote random variables mapping to a state space ฮ by
b : ฮฉโ ฮ
(or sometimes also ๐ : ฮฉโ R).
2.1 RELATIONS AND CONNECTIONS TO PORTFOLIO OPTIMIZATION: MARKOWITZ
See Markowitz, Section 3 below for details.
Definition 2.1. A portfolio ๐ฅโ โ R๐ฝ (with ๐ฝ indicating the number of stocks) is efficient if itsolves
minimize in ๐ฅโR๐ฝ var ๐ฅ>b (2.1)subject to E ๐ฅ>b โฅ `,
1> ๐ฅ โค 1,(๐ฅ โฅ 0).
2.2 ALTERNATIVE FORMULATIONS OF THE MARKOWITZ PROBLEM
Instead of Markowitz (2.1) one might consider the problem
maximize E ๐ฅ>b (2.2)subjet to var ๐ฅ>b โค ๐,
1> ๐ฅ โค 1,(๐ฅ โฅ 0).
13
14 INTRODUCTION AND CLASSIFICATION OF STOCHASTIC PROGRAMS
StochasticsProbability Theory
Optimization
Portfolio Optimization
Figure 2.1: Wide intersections: in theory and (economic) practice
2.3 INVOLVING RISK FUNCTIONALS
2.3.1 Risk Neutral
... is about the expectation, as
minimizein x E๐(๐ฅ, b)subject to E๐บ๐ (๐ฅ, b) โค 0, ๐ = 1, . . . , ๐
This may be reformulated by employing the risk functional
R(๐(๐ฅ, ยท)
)B Eb ๐(๐ฅ, b).
2.3.2 Utility Functions
A function ๐ข : Rโ R is a utility function if it satisfies some model-design properties in addition.Optimization problems involving utility function generally read
minimizein x E ๐ข(๐(๐ฅ, b)
)subject to E๐บ๐ (๐ฅ, b) โค 0, ๐ = 1, . . . , ๐
or
minimizein x R(๐(๐ฅ, b)
)subject to R
(๐บ๐ (๐ฅ, b)
)โค 0, ๐ = 1, . . . , ๐
where we might want to put
R(๐(๐ฅ, ยท)
)B E ๐ข
(๐(๐ฅ, b)
).
rough draft: do not distribute
2.4 PROBABILISTIC CONSTRAINTS 15
2.3.3 Robust Optimization
Robust optimization considers the problem (Ben-Tal and Nemirovski [2001])
minimizein x maxb โฮ
๐(๐ฅ, b)
subject to maxb โฮ
๐บ๐ (๐ฅ, b) โค 0, ๐ = 1, . . . , ๐
Note, that there is no probability measure
R(๐(๐ฅ, ยท)
)B ess sup๐(๐ฅ, b)
and the problem is basically about the support of the probability measure.
2.3.4 Distributionally Robust Optimization
Distributionally robust optimization involves the probability measure instead,
minimizein x max๐โ๐ซR๐
(๐(๐ฅ, b)
),
subject to R๐
(๐บ๐ (๐ฅ, b)
)โค 0, ๐ = 1, . . . , ๐,
where we indicate the probability measure ๐ โ ๐ซ explicitly.
2.4 PROBABILISTIC CONSTRAINTS
This is about the problem
minimize (in ๐ฅ) R(๐(๐ฅ, b)
)subject to ๐
(๐บ๐ (๐ฅ, b) โค 0
)โฅ ๐ผ, ๐ = 1, . . . , ๐
Example 2.2 (Economic example). Call Center
2.5 STOCHASTIC DOMINANCE
minimizein x E ๐ข(๐(๐ฅ, b)
)(2.3)
subject to ๐บ๐ (๐ฅ, b) < ๐, ๐ = 1, . . . , ๐
for some (stochastic) order relation <.
2.6 ON GENERAL DIFFICULTIES IN STOCHASTIC OPTIMIZATION
sample ofobservations
estimation/
model selectionprobability
modelscenario
generationscenariomodel
Problem 2.3 (Accuracy). How large/ small do we need Y > 0?
Version: May 28, 2021
16 INTRODUCTION AND CLASSIFICATION OF STOCHASTIC PROGRAMS
rough draft: do not distribute
3The Markowitz Model
The trend is your friend.
Bรถrsenweisheit
This section follows Pflug and Rรถmisch [2007, Section 4].The model of Markowitz1 is historically the first model to determine the decomposition of
an optimal portfolio.
3.1 INTRODUCTION
For portfolio optimization people often use the simple model provided by Harry Markowitzwhich involves the variance. Note that it is a significant drawback of the Markowitz modelthat positive deviations (profits โ this is what the investor wants) and negative deviations(losses โ this is what the investor tries to avoid) are treated exactly the same way. So theMarkowitz model is of historical interest (it was the first model on asset allocation with theobjective to reduce the variance) but it violates some natural objectives of an investor.
Extensions of the problem described at the end of this section avoid this downside.
3.2 EMPIRICAL PROBLEM FORMULATION AND VARIABLES
(i) ๐ฝ is the number of stocks considered (๐ฝ = 5 in the example which Table 3.1 displays);
(ii) each stock ๐ โ {1, . . . , ๐ฝ} is observed at ๐ + 1 consecutive times ๐ก0, . . . , ๐ก๐ (๐ = 12 inTable 3.1);
(iii) the price observed of stock ๐ at time ๐ก is ๐ ๐๐ก ;
(iv) b๐ =(b1๐, b2
๐, . . . b๐ฝ
๐
)> collects the annualized returns of all ๐ฝ stocks; note that b ๐๐= ๐>
๐b๐;
(v) ๐ฅ ๐ represents the fraction of cash invested in stock ๐ , ๐ โ {1, . . . , ๐ฝ}; we set ๐ฅ B(๐ฅ1, ๐ฅ2, . . . , ๐ฅ๐ฝ )>, ๐ฅ is the allocation vector;
(vi) The budget constraint: the total amount of cash to be invested is not more than thebudget available. 1C is the default value (or 1๐C, say): the budget constraint thusreads ๐ฅ> 1 โค 1C, where 1 = (1, . . . , 1)>;
(vii) Short-selling constraint: occasionally we do not allow short-selling (i.e., negative po-sitions), then the constraints ๐ฅ โฅ 0 has to be added. ๐ฅ โฅ 0 is understood as ๐ฅ ๐ โฅ 0,๐ โ {1, . . . , ๐ฝ}, for each stock.
To solve the problem one needs to specify the probability measure ๐ which is used to com-pute the expectation E and the variance var.
1Harry Max Markowitz. 1927. Nobel Memorial Price in Economic Sciences in 1990
17
18 THE MARKOWITZ MODEL
๐๐ก/C DAX RWE gold oil US-$/ C
๐ก0 January 9798.11 12.870 981.49 34.9 0.9228๐ก1 February 9495.40 10.540 1030.49 33.08 0.9197๐ก2 March 9965.51 11.375 1144.73 33.77 0.8787๐ก3 April 10038.97 13.045 1137.17 37.04 0.8729๐ก4 May 10262.74 11.765 1192.35 43.71 0.8983๐ก5 June 9680.09 14.190 1121.85 45.81 0.9005๐ก6 July 10337.50 15.905 1222.25 46.04 0.8949๐ก7 August 10592.69 14.665 1244.67 39.78 0.8962๐ก8 September 10511.02 15.335 1204.53 43.35 0.8896๐ก9 October 10665.01 14.460 1212.10 46.15 0.9107๐ก10 November 10640.30 11.860 1180.53 44.95 0.9444๐ก11 December 11481.06 11.815 1082.17 47.72 0.9509
๐ก12 January 11599.01 11.800 1081.43 52.69 0.9494
Table 3.1: Prices ๐ ๐๐ก๐
observed in 2016. www.investing.com
Jan 2016 Apr 2016 Jul 2016 Oct 2016 Jan 20170.8
0.9
1
1.1
1.2
1.3
1.4
1.5
1.6
DAX
RWE
gold
oil
Figure 3.1: Prices of Table 3.1
rough draft: do not distribute
3.3 THE EMPIRICAL/ DISCRETE MODEL 19
3.3 THE EMPIRICAL/ DISCRETE MODEL
Empirical models extract the probability model from historic observations.For this observe a stock at ๐ + 1 successive times (๐ก๐)๐๐=0 and collect the prices ๐๐ก๐ .
Definition 3.1. Its annualized return during the time-period [๐ก๐โ1, ๐ก๐] is2
b๐ B1
๐ก๐ โ ๐ก๐โ1ln
๐๐ก๐
๐๐ก๐โ1.
Define the weights (probabilities) ๐๐ B ๐ก๐โ๐ก๐โ1๐ก๐โ๐ก0 , a random variable b : ฮฉ โ R and a proba-
bility measure with
๐(b = b๐) B ๐๐ =๐ก๐ โ ๐ก๐โ1๐ก๐ โ ๐ก0
and we set
๐> B (๐1, . . . , ๐๐) B(๐ก1 โ ๐ก0๐ก๐ โ ๐ก0
,๐ก2 โ ๐ก1๐ก๐ โ ๐ก0
, . . . ,๐ก๐ โ ๐ก๐โ1๐ก๐ โ ๐ก0
).
Remark 3.2. Note thatโ๐
๐=1 ๐๐ = ๐> ยท 1 =
โ๐๐=1
๐ก๐โ๐ก๐โ1๐ก๐โ๐ก0 = 1, thus
1๐ก๐ โ ๐ก0
ln๐๐ก๐
๐๐ก0๏ธธ ๏ธท๏ธท ๏ธธannual return
=1
๐ก๐ โ ๐ก0
๐โ๐=1ln
๐๐ก๐
๐๐ก๐โ1(3.1)
=
๐โ๐=1
๐ก๐ โ ๐ก๐โ1๐ก๐ โ ๐ก0๏ธธ ๏ธท๏ธท ๏ธธ
๐๐
ยท 1๐ก๐ โ ๐ก๐โ1
ln๐๐ก๐
๐๐ก๐โ1๏ธธ ๏ธท๏ธท ๏ธธannualized return per period
=
๐โ๐=1
๐๐ ยท b๐๏ธธ ๏ธท๏ธท ๏ธธaverage of returns
= E๐ b. (3.2)
Based on this observation it follows that the annual return for the entire period is the expectedvalue of the annualized returns of successive periods (cf. Table 3.2).
Definition 3.3 (The first moment). The expected return is ๐ B E b.
Obviously, one may observe all ๐ฝ stocks in parallel, at the same time. So put
b๐
๐B
1๐ก๐ โ ๐ก๐โ1
ln๐๐๐ก๐
๐๐๐ก๐โ1
, ๐ = 1, . . . , ๐, ๐ = 1, . . . , ๐ฝ
and set b๐ B(b1๐, . . . , b๐ฝ
๐
). Collect all observations in the ๐ ร ๐ฝ-matrix
ฮ B(b๐
๐
) ๐=1:๐ฝ๐=1:๐
=
(1
๐ก๐ โ ๐ก๐โ1ln
๐๐๐ก๐
๐๐๐ก๐โ1
) ๐=1:๐ฝ
๐=1:๐
(cf. Table 3.2). For the random return vector we have that
๐
(b =
(b1๐ , . . . , b
๐ฝ๐
))= ๐ (b = ฮ๐) = ๐๐
2Here and always: logarithmus naturalis with basis ๐ = 2.718 . . .
Version: May 28, 2021
20 THE MARKOWITZ MODEL
annualized monthly returns b ๐๐ก DAX RWE gold oil US-$/ C
๐1 = 1/12 or ๐1 = 31/365 -37.7% -239.7% 58.5% -64.3% -4.0%๐2 = 1/12 or ๐2 = 28/365 58.0% 91.5% 126.2% 24.8% -54.7%๐3 = 1/12 or ๐3 = 31/365 8.8% 164.4% -8.0% 110.9% -7.9%๐4 = 1/12 or ๐4 = 30/365 26.5% -123.9% 56.9% 198.7% 34.4%๐5 = 1/12 or ๐5 = 31/365 -70.1% 224.9% -73.1% 56.3% 2.9%๐6 = 1/12 or ๐6 = 30/365 78.8% 136.9% 102.9% 6.0% -7.5%๐7 = 1/12 or ๐7 = 31/365 29.3% -97.4% 21.8% -175.4% 1.7%๐8 = 1/12 or ๐8 = 31/365 -9.3% 53.6% -39.3% 103.1% -8.9%๐9 = 1/12 or ๐9 = 30/365 17.5% -70.5% 7.5% 75.1% 28.1%๐10 = 1/12 or ๐10 = 31/365 -2.8% -237.9% -31.7% -31.6% 43.6%๐11 = 1/12 or ๐11 = 30/365 91.3% -4.6% -104.4% 71.8% 8.2%๐12 = 1/12 or ๐12 = 31/365 12.3% -1.5% -0.8% 118.9% -1.9%
average of monthly returns, (3.2) 16.9% -8.7% 9.7% 41.2% 2.8%annual return, (3.1) 16.9% -8.7% 9.7% 41.2% 2.8%
Table 3.2: Matrix ฮ, collecting the returns b ๐๐, cf. Table 3.1
where ฮ๐ is the ๐th row in the matrix ฮ). Note that the return of stock ๐ rewrites for theempirical probability measure
๐(ยท) =๐โ๐=1
๐๐ ยท ๐ฟ( b 1๐ ,..., b ๐ฝ๐ ) (ยท),
i.e., each stock is a random variable with ๐
(b ๐ = b
๐
๐
)= ๐๐, independently of ๐ . For every ๐
thus
E b ๐ =
๐โ๐=1
๐๐b๐
๐= ๐>b ๐ ,
where
๐> = (๐1, . . . , ๐๐) =(๐ก1 โ ๐ก0๐ก๐ โ ๐ก0
,๐ก2 โ ๐ก1๐ก๐ โ ๐ก0
, . . . ,๐ก๐ โ ๐ก๐โ1๐ก๐ โ ๐ก0
).
Remark 3.4. It follows from the Taylor series expansion
ln(1 + ๐ฅ) = ๐ฅ โ 12๐ฅ2 + 1
3๐ฅ3 โ . . .
for small ๐ฅ that
b๐
๐=
1๐ก๐ โ ๐ก๐โ1
ln๐๐๐ก๐
๐๐๐ก๐โ1
=1
๐ก๐ โ ๐ก๐โ1ln
(1 +
๐๐๐ก๐
๐๐๐ก๐โ1
โ 1)โ 1๐ก๐ โ ๐ก๐โ1
(๐๐๐ก๐
๐๐๐ก๐โ1
โ 1).
3.4 THE FIRST MOMENT: RETURN
Suppose an amount of ๐ฅ ๐ is invested in the stock ๐ . Then the total return of the investment is
๐ฅ>b =๐ฝโ๐=1๐ฅ ๐b
๐ .
rough draft: do not distribute
3.5 THE SECOND MOMENT: RISK 21
DAX RWE gold oil US-$/ C
return ๐ ๐ = E ๐>๐b 16.9% -8.7% 9.7% 41.2% 2.8%
variance ฮฃ ๐ ๐ = var(๐>๐b
)19.2% 208.8% 42.8% 88.9% 5.9%
Table 3.3: Return and variance (cf. Table 3.1)
Lemma 3.5. The expected return is ๐ B E b = ๐>ฮ.
The return observed of the portfolio in period ๐ isโ๐ฝ
๐=1 b๐
๐๐ฅ ๐ = (ฮ ยท ๐ฅ)๐ (the ๐th line in the
matrix ฮ ยท ๐ฅ). Markowith adds the constraint
E ๐ฅ>b =๐โ๐=1
๐๐ยฉยญยซ
๐ฝโ๐=1๐ฅ๐
๐b ๐
ยชยฎยฌ = ๐>ฮ๐ฅ = ๐>๐ฅ โฅ `,
which means, that a minimum return ` is required.
Remark 3.6. Suppose the total cash invested in stock ๐ is ๐ถ๐ (i.e., ๐ถ ๐/๐ ๐
0 is the number ofshares of stock ๐) with total initial cash ๐ถ0 =
โ๐ฝ๐=1๐ถ
๐ , then it is natural to define the fraction
๐ฅ ๐ B๐ถ ๐โ๐ฝ๐=1๐ถ
๐so that
โ๐ฝ๐=1 ๐ฅ ๐ = 1. The total portfolio value at time ๐ก then is ๐ถ๐ก = ๐ถ0 ยท
โ๐ฝ๐=1 ๐ฅ ๐
๐๐๐ก
๐๐
0.
Note, however, thatโ๐ฝ
๐=1 ๐ฅ ๐โ๐ผ
๐=1 ๐๐๐ก๐= ๐ถ๐ก๐ผ , but
โ๐ฝ๐=1 ๐ฅ ๐
โ๐ผ๐=1 b
๐ก๐๐โ
๐ถ๐ก๐
๐ถ๐ก0.
3.5 THE SECOND MOMENT: RISK
The covariance is
var ๐ฅ>b = E(๐ฅ>b โ E ๐ฅ>b
)2= E
(๐ฅ>b
)2 โ (E ๐ฅ>b
)2=
=
๐โ๐=1
๐๐ยฉยญยซ
๐ฝโ๐=1b๐
๐๐ฅ ๐
ยชยฎยฌ2
โ(
๐โ๐=1
๐๐b๐
๐๐ฅ ๐
)2=
๐โ๐=1
๐๐ยฉยญยซ
๐ฝโ๐=1
๐ฝโ๐โฒ=1
b๐
๐๐ฅ ๐ ยท b ๐
โฒ
๐๐ฅ ๐โฒ
ยชยฎยฌ โ๐โ๐=1
๐๐ยฉยญยซ
๐ฝโ๐=1b๐
๐๐ฅ ๐
ยชยฎยฌ ยท๐โ
๐โฒ=1๐๐โฒ
ยฉยญยซ๐ฝโ
๐โฒ=1b๐โฒ
๐โฒ ๐ฅ ๐โฒยชยฎยฌ
=
๐ฝโ๐=1๐ฅ ๐
๐ฝโ๐โฒ=1
๐ฅ ๐โฒ ยท๐โ๐=1
๐๐
(b๐
๐b๐โฒ
๐
)โ
๐ฝโ๐=1๐ฅ ๐
๐ฝโ๐โฒ=1
๐ฅ ๐โฒ ยท๐โ๐=1
๐๐b๐
๐
๐โ๐โฒ=1
๐๐โฒb๐โฒ
๐โฒ
=
๐ฝโ๐=1๐ฅ ๐
๐ฝโ๐โฒ=1
๐ฅ ๐โฒ ยท(
๐โ๐=1
๐๐ ยท b ๐๐ b๐โฒ
๐โ
๐โ๐=1
๐๐b๐
๐ยท
๐โ๐โฒ=1
๐๐โฒb๐โฒ
๐โฒ
)๏ธธ ๏ธท๏ธท ๏ธธ
=:ฮฃ ๐, ๐โฒ
and thus
var ๐ฅ>b =๐ฝโ๐=1๐ฅ ๐
๐ฝโ๐โฒ=1
๐ฅ ๐โฒฮฃ ๐ , ๐โฒ = ๐ฅ>ฮฃ๐ฅ,
Version: May 28, 2021
22 THE MARKOWITZ MODEL
19.2% 4.1% 7.9% 1.3% -1.9%4.1% 208.8% -7.4% 44.5% -18.9%7.9% -7.4% 42.8% -10.3% -6.7%1.3% 44.5% -10.3% 88.9% 2.9%
-1.9% -18.9% -6.7% 2.9% 5.9%
(a) Covariance matrix ฮฃ of returns in Table 3.1, cf. also Ta-ble 3.3
5.72 -0.04 -1.01 -0.20 0.69-0.04 1.03 0.74 -0.57 4.41-1.01 0.74 3.59 -0.14 6.22-0.20 -0.57 -0.14 1.49 -2.780.69 4.41 6.22 -2.78 39.83
(b) The inverse ฮฃโ1
Table 3.4: Covariance matrix ฮฃ of returns in Table 3.1 and its inverse
where ฮฃ is the covariance matrix (aka. variance-covariance matrix) with entries
ฮฃ ๐ , ๐โฒ =
๐โ๐=1
๐๐b๐
๐b๐โฒ
๐โ
๐โ๐=1
๐๐b๐
๐ยท
๐โ๐โฒ=1
๐๐โฒb๐โฒ
๐โฒ
= E b ๐b ๐โฒ โ E b ๐ ยท E b ๐โฒ = cov
(b ๐ , b ๐
โฒ).
Remark 3.7 (Besselโs correction). For the empirical measure ๐๐ = 1/๐, the entries of thevarianceโcovariance matrix are
ฮฃ ๐ , ๐โฒ = cov(b ๐ , b ๐
โฒ)=1๐
๐โ๐=1
(b๐
๐โ b ๐
) (b๐โฒ
๐โ b ๐
โฒ), where b
๐B
๐โ๐=1
b๐
๐.
Besselโs correction replaces this quantity by ฮฃ ๐ , ๐โฒ =1
๐โ1โ๐
๐=1
(b๐
๐โ b ๐
) (b๐โฒ
๐โ b ๐
โฒ).
Example 3.8. Cf. Table 3.4.
3.6 THE NON-EMPIRICAL FORMULATION
Here, the random variable is b : ฮฉโ R๐ฝ . We define ๐ B E ๐ฅ>b and observe that
var ๐ฅ>b = E(๐ฅ>b
)2 โ (E ๐ฅ>b
)2= E
ยฉยญยซ๐ฝโ๐=1b ๐๐ฅ ๐
ยชยฎยฌ2
โ ยฉยญยซ๐ฝโ๐=1๐ฅ ๐ E b
๐ยชยฎยฌ2
=
๐ฝโ๐=1
๐ฝโ๐โฒ=1
๐ฅ ๐๐ฅ ๐โฒ E(b ๐b ๐
โฒ)โ
๐ฝโ๐=1
๐ฝโ๐โฒ=1
๐ฅ ๐๐ฅ ๐โฒ(E b ๐
) (E b ๐
โฒ)
=
๐ฝโ๐=1
๐ฝโ๐โฒ=1
๐ฅ ๐๐ฅ ๐โฒ(E
(b ๐b ๐
โฒ)โ
(E b ๐
) (E b ๐
โฒ))
๏ธธ ๏ธท๏ธท ๏ธธcov( b ๐ , b ๐โฒ)
= ๐ฅ> cov(b) ๐ฅ.
Definition 3.9. The covariance matrix3 is
ฮฃ B cov(b) = E(b ยท b>
)โ (E b) ยท (E b)> .
3The covariance cov is a.k.a. variance matrix.
rough draft: do not distribute
3.7 THE CAPITAL ASSET PRICING MODEL (CAPM) 23
Figure 3.2: Harry Markowitz (1927) explains the CAPM and the mean variance plot. NobelMemorial Prize in Economic Sciences (1990)
Remark 3.10. Note, thatฮฃ = ฮ> ยท diag(๐) ยท ฮ โ ๐>ฮ๏ธธ๏ธท๏ธท๏ธธ
๐
ยท ฮ>๐๏ธธ๏ธท๏ธท๏ธธ๐>
and ฮฃ is symmetric.
3.7 THE CAPITAL ASSET PRICING MODEL (CAPM)
Markowitz considers the problem
minimize (in ๐ฅ โ R๐ฝ ) var ๐ฅ>b (3.3)subject to E ๐ฅ>b โฅ `,
1> ๐ฅ โค 1C,(๐ฅ โฅ 0)
The Markowitz problem (3.3) is quadratic, with linear constraints: ๐ฝ varialbes, 2 constraints.
Definition 3.11. A portfolio ๐ฅโ โ R๐ฝ is efficient if it solves (3.3).
Expressed by matrices the Markowitz problem (3.3) is
minimize in ๐ฅโR๐ฝ ๐ฅ>ฮฃ๐ฅ
subject to ๐ฅ>๐ โฅ `,๐ฅ> 1 โค 1,(๐ฅ โฅ 0).
Theorem 3.12. The efficient Markowitz portfolio is given by
๐ฅโ(`) = `(๐
๐ฮฃโ1๐ โ ๐
๐ฮฃโ1 1
)โ ๐๐ฮฃโ1๐ + ๐
๐ฮฃโ1 1, (3.4)
where ๐ B ๐>ฮฃโ1๐, ๐ B ๐>ฮฃโ1 1, ๐ B 1> ฮฃโ1 1 and ๐ B ๐๐ โ ๐2 are auxiliary quantities.
Version: May 28, 2021
24 THE MARKOWITZ MODEL
y
g(x,y) = c
f(x,y) = d1
f(x,y) = d2
x
f(x,y) = d3
Figure 3.3: Illustration of Lagrange multipliers, contour lines of ๐
Remark 3.13. The units of the auxiliary are [๐] = interest2variance , [๐] = interest
variance , [๐] = 1variance and
[ฮฃ] = variance.
Proof. Differentiate the Lagrangian4
๐ฟ (๐ฅ;_, ๐พ) B 12๐ฅ>ฮฃ๐ฅ โ _(๐>๐ฅ โ `) โ ๐พ(1> ๐ฅ โ 1)
to get the necessary conditions for optimality,
0 =๐๐ฟ
๐๐ฅ=12๐ฅ>ฮฃ + 1
2๐ฅ>ฮฃ โ _๐> โ ๐พ 1>, (3.5)
0 =๐๐ฟ
๐_= โ๐>๐ฅ + `, (3.6)
0 =๐๐ฟ
๐๐พ= โ1> ๐ฅ + 1. (3.7)
It follows from (3.5) that๐ฅโ = _ฮฃโ1๐ + ๐พฮฃโ1 1 . (3.8)
To determine the shadow prices _ and ๐พ we employ (3.6) and (3.7), i.e.,
` = ๐>๐ฅโ = _๐>ฮฃโ1๐ + ๐พ๐>ฮฃโ1 1 and (3.9)
1 = 1> ๐ฅโ = _ 1> ฮฃโ1๐ + ๐พ 1> ฮฃโ1 1 .
We may rewrite these latter equations as a usual matrix equation,(๐ ๐
๐ ๐
) (_
๐พ
)=
(`
1
)(3.10)
with solutions _โ = `๐โ๐๐๐โ๐2 and ๐พโ = ๐โ`๐
๐๐โ๐2 . Substitute them in (3.8) to get the assertion (3.4) ofthe theorem, i.e., the efficient portfolio. ๏ฟฝ
4We could choose ๐ฅ>ฮฃ๐ฅ orโ๐ฅ>ฮฃ๐ฅ equally well in the Lagrangian function ๐ฟ.
rough draft: do not distribute
3.7 THE CAPITAL ASSET PRICING MODEL (CAPM) 25
Corollary 3.14. Note from (3.9) that
E ๐ฅโ>b = ๐ฅโ>b = `
and
var(๐ฅโ>b
)= ๐ฅโ>ฮฃ๐ฅโ =
`2๐ โ 2`๐ + ๐๐๐ โ ๐2
, (3.11)
cov(๐ฅโ>b, b๐
)= ๐ฅโ>ฮฃ๐๐ =
`๐ โ ๐๐๐ โ ๐2
๐๐ +๐ โ `๐๐๐ โ ๐2
.
Proof. We have
var(๐ฅโ>b
)= ๐ฅโ>ฮฃ๐ฅโ =
(_ฮฃโ1๐ + ๐พฮฃโ1 1
)>ฮฃ๐ฅโ
= _๐>๐ฅโ + ๐พ 1> ๐ฅโ = _` + ๐พ
=`๐ โ ๐๐๐ โ ๐2
` + ๐ โ `๐๐๐ โ ๐2
= `2๐
๐๐ โ ๐2โ 2` ๐
๐๐ โ ๐2+ ๐
๐๐ โ ๐2=: ๐2(`).
๏ฟฝ
Corollary 3.15. It holds that ๐๐ > ๐2.
Proof. The matrix ฮฃ is positive definite as it is a covariance matrix, and so is its inverse. Itthus holds that ๐ = 1> ฮฃโ1 1 > 0. The variance is also positive for every ` > 0, so it followsfrom (3.11) that ๐๐ โ ๐2 > 0. ๏ฟฝ
3.7.1 The Mean-Variance Plot
This section studies the Markowitz problem as a function of the parameter `.
Corollary 3.16 (Mean-variance). Set ๐2 B var๐๐ฅโ , then we have (for ๐ > 1โ๐) that
`(๐) = ๐ +โ(๐๐ โ ๐2) (๐๐2 โ 1)
๐=๐
๐+
โ(๐ โ ๐
2
๐
) (๐2 โ 1
๐
). (3.12)
Proof. Solve (3.11) for var๐๐ฅ = ๐2 using the quadratic formula. ๏ฟฝ
Figure 3.4 graphs the relation (3.12), i.e., the mean and the variance of efficient portfolios.
Corollary 3.17. It holds that
`(๐) โค ๐๐+ ๐
โ๐ โ ๐
2
๐(3.13)
and every efficient portfolio satisfies ๐ โฅ 1โ๐
and ` โฅ ๐๐.
Proof. This is immediate from (3.12); cf. also Figure 3.4. ๏ฟฝ
Version: May 28, 2021
26 THE MARKOWITZ MODEL
0 5 ยท 10โ2 0.1 0.15 0.2 0.25 0.3
5 ยท 10โ2
0.1
0.15
0.2
risk, ๐
retu
rn,`
Figure 3.4: The mean-variance plot (3.12), the efficient frontier and asymptotic (3.13)
Remark 3.18. The Markowitz portfolio with smallest variance which does not include a riskfree asset is given for ` = ๐
๐(differentiate (3.11) with respect to `) and this portfolio thus is of
particular interest. In particular, note that its variance, by (3.11), is
๐2min = var ๐ฅโ(๐
๐
)>b =
(๐๐
)2๐ โ 2๐
๐๐ + ๐
๐ ๐ โ ๐2=1๐
๐2 โ 2๐2 + ๐ ๐๐ ๐ โ ๐2
=1๐.
Exercise 3.1. The portfolio with smallest-variance in our data is ` = ๐๐= 1.7666.3 = 2.66%, the
corresponding standard deviation, which cannot be improved, is ๐ = 1โ๐= 1โ
1> ฮฃโ1 1= 12.3%;
cf. Figure 3.4 and Figure 3.5.
3.7.2 Tangency portfolio
For some fixed risk free rate ๐0 we study the particular reward
`๐ B๐ โ ๐0 ๐๐ โ ๐0 ๐
. (3.14)
Lemma 3.19. The variance corresponding to the reward `๐ is
๐2๐ B ๐2(`๐) =๐ โ 2๐0 ๐ + ๐20 ๐(๐ โ ๐0 ๐)2
. (3.15)
Proof. From (3.11) it follows that
๐2(`๐) =๐`2๐ โ 2๐`๐ + ๐
๐๐ โ ๐2
=1
(๐ โ ๐0๐)2๐(๐ โ ๐0๐)2 + 2๐(๐ โ ๐0๐) (๐ โ ๐0๐) + ๐(๐ โ ๐0๐)2
๐๐ โ ๐2
= ยท ยท ยท =๐ โ 2๐0 ๐ + ๐20 ๐(๐ โ ๐0 ๐)2
rough draft: do not distribute
3.7 THE CAPITAL ASSET PRICING MODEL (CAPM) 27
after some annoying, but elementary algebra. ๏ฟฝ
Definition 3.20 (Sharpe ratio). The Sharpe ratio of the portfolio with reward `๐ is
๐ ๐ B`๐ โ ๐0๐๐
. (3.16)
Remark 3.21. Note first that `๐โ๐0๐โ๐0 ๐ = ๐2๐. It follows that
๐ โ ๐0 ๐ =`๐ โ ๐0๐2๐
=(3.16)
๐ ๐
๐๐
(3.17)
and with (3.14) that๐ โ ๐0 ๐ =
(3.14)`๐ (๐ โ ๐0 ๐) =
(3.17)
`๐ ยท ๐ ๐๐๐
.
From (3.15) we deduce further that
๐ โ 2๐0 ๐ + ๐20 ๐ =(3.15)
๐2๐ (๐ โ ๐0 ๐)2 =(3.17)
๐ 2๐. (3.18)
Definition 3.22 (Market portfolio, tangency portfolio). The market portfolio is
๐ฅ๐ B ๐ฅโ(`๐) =๐๐
๐ ๐ยท ฮฃโ1
(๐ โ ๐0 ยท 1
). (3.19)
Remark 3.23. It follows according (3.4) is
๐ฅ๐ = ๐ฅโ(`๐) =`๐๐ โ ๐๐๐ โ ๐2
ฮฃโ1๐ โ `๐๐ โ ๐๐๐ โ ๐2
ฮฃโ1 1
=1
๐ โ ๐0๐ฮฃโ1๐ โ ๐0
๐ โ ๐0๐ฮฃโ1 1
and with (3.17) thus (3.19).
Remark 3.24. For the line ๐ก (๐) B ๐0 + ๐ ยท `๐โ๐0๐๐it holds that
๐ก (๐๐) = `(๐๐) and๐ก โฒ(๐๐) = `โฒ(๐๐) (cf. (3.12)).
The line ๐ก thus is the tangent drawn from the point of the risk-free asset ๐ก (0) = ๐0 to thefeasible region for risky assets. The tangent line is called capital market line. The portfoliowith decomposition (3.19) is also called the most efficient portfolio, it has the highest reward-to-volatility ratio.
3.7.3 The Two Fund Theorem
Note that ` is a model-parameter in the Markowitz model (2.1). We now compare efficientportfolios for different returns `.
Theorem 3.25 (Two fund theorem). If ๐ฅโ1 and ๐ฅโ2 are different efficient portfolios (for different`s), then every efficient portfolio can be obtained as an affine combination of these two.
Version: May 28, 2021
28 THE MARKOWITZ MODEL
0.05 0.1 0.15 0.2
return ยต-0.1
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
allo
ca
tio
n
0.1
0.15
0.2
0.25
0.3
0.35
0.4
0.45
sta
nd
ard
de
via
tio
n ฯ
DAX
RWE
gold
oil
Figure 3.5: Asset allocation according to Markowitz for varying return `
Proof. Observe from (3.1) and (3.9) that
๐ฅโ(`) = `๐ โ ๐๐๐ โ ๐2
ฮฃโ1๐ + ๐ โ `๐๐๐ โ ๐2
ฮฃโ1 1
=
(โ ๐
๐๐ โ ๐2ฮฃโ1๐ + ๐
๐๐ โ ๐2ฮฃโ1 1
)+ `
(๐
๐๐ โ ๐2ฮฃโ1๐ โ ๐
๐๐ โ ๐2ฮฃโ1 1
).
๏ฟฝ
Exercise 3.2. The result for our data is (cf. Figure 3.5)
๐ฅโ(`) =
ยฉยญยญยญยญยญยซ2.7%10.5%14.3%โ7.8%80.2%
ยชยฎยฎยฎยฎยฎยฌ+ `
ยฉยญยญยญยญยญยซ+1.90โ0.80โ0.05+1.68โ2.73
ยชยฎยฎยฎยฎยฎยฌ;
the auxiliary quantities are ๐ = 0.40, ๐ = 1.76 and ๐ = 66.31.
rough draft: do not distribute
3.8 MARKOWITZ PORTFOLIO INCLUDING A RISK FREE ASSET 29
3.8 MARKOWITZ PORTFOLIO INCLUDING A RISK FREE ASSET
Set ๐ B E b, i.e., ๐ ๐ B E b ๐ , ๐ = 1, . . . , ๐ฝ. Further, let ๐0 be the return of the risk-free asset.We set
๐ =
ยฉยญยญยญยญยซ๐0๐1...
๐๐ฝ
ยชยฎยฎยฎยฎยฌ=
ยฉยญยญยญยซ๐0
๐
ยชยฎยฎยฎยฌ with ๐ =ยฉยญยญยซ๐1...
๐๐ฝ
ยชยฎยฎยฌ and ๐ฅ =ยฉยญยญยญยญยซ๐ฅ0๐ฅ1...
๐ฅ๐ฝ
ยชยฎยฎยฎยฎยฌ=
ยฉยญยญยญยซ๐ฅ0
๐ฅ
ยชยฎยฎยฎยฌ with ๐ฅ =ยฉยญยญยซ๐ฅ1...
๐ฅ๐ฝ
ยชยฎยฎยฌ .Note that ๐0๐ก = ๐
00 ยท๐
๐0๐ก and the annualized return b0๐= 1
๐ก๐+1โ๐ก๐ ln๐0๐ก+1
๐0๐ก+1
= ๐0 is the constant risk-free
interest rate. The risk free asset is not correlated with other assets. The covariance matrix
ฮฃ B
(0 00 ฮฃ
)thus is not invertible. Consequently, the results from the previous section do not apply.
Theorem 3.26. The Markowitz portfolio is given by
๐ฅโ(`) =(๐ฅโ0(`)๐ฅโ(`)
)=
(`๐โ``๐โ๐0
`โ๐0`๐โ๐0 ยท ๐ฅ๐
), (3.20)
where ๐ฅ๐ (cf. (3.19)) is the tangency portfolio (market portfolio).
Proof. Differentiate the Lagrangian (cf. Figure 3.3 for illustration)
๐ฟ (๐ฅ;_, ๐พ) B 12๐ฅ>ฮฃ๐ฅ โ _(๐>๐ฅ โ `) โ ๐พ(1> ๐ฅ โ 1)
to get the necessary conditions for optimality,
0 =๐๐ฟ
๐๐ฅ= ฮฃ๐ฅ โ _๐> โ ๐พ 1>, (3.21)
0 =๐๐ฟ
๐๐ฅ0= โ_๐0 โ ๐พ, (3.22)
0 =๐๐ฟ
๐_= ๐>๐ฅ โ ` = ๐0๐ฅ0 + ๐>๐ฅ โ `, (3.23)
0 =๐๐ฟ
๐๐พ= 1> ๐ฅ โ 1 = ๐ฅ0 + 1> ๐ฅ โ 1, (3.24)
We get from (3.21) that ๐ฅโ = _ฮฃโ1๐ + ๐พฮฃโ1 1. Substitute ๐ฅโ in (3.23) and (3.24), and aftercollecting terms in (3.22)โ(3.24) one finds (cf. (3.10))
ยฉยญยซ0 ๐0 1๐0 ๐ ๐
1 ๐ ๐
ยชยฎยฌ ยฉยญยซ๐ฅ0_
๐พ
ยชยฎยฌ =ยฉยญยซ0`
1
ยชยฎยฌ .This linear matrix equation has explicit solution
ยฉยญยซ๐ฅโ0_โ
๐พโ
ยชยฎยฌ =1
๐ โ 2๐๐0 + ๐๐20๏ธธ ๏ธท๏ธท ๏ธธ=๐ 2๐, cf. (3.18)
ยฉยญยซ๐ โ ๐๐0 + `(๐๐0 โ ๐)
โ๐0 + `๐20 โ ๐0`
ยชยฎยฌ . (3.25)
Version: May 28, 2021
30 THE MARKOWITZ MODEL
So we finally get
๐ฅโ =
(๐ฅโ0๐ฅโ
)=
(1๐ 2๐(๐ โ ๐๐0 + `(๐๐0 โ ๐))_โฮฃโ1๐ + ๐พโฮฃโ1 1
)=1๐ 2๐
(๐ 2๐ โ (๐ โ ๐0๐) (` โ ๐0)(` โ ๐0)
(ฮฃโ1๐ โ ๐0ฮฃโ1 1
) )from (3.24); cf. Exercise 3.3. ๏ฟฝ
Corollary 3.27. The variance (standard deviation, resp.) of the portfolio corresponding to `is
var(๐ฅโ(`)>b
)=
(` โ ๐0๐ ๐
)2(๐(`) = |` โ ๐0 |
๐ ๐, resp.).
Proof. Recall the special structure of ฮฃ. It thus follows from (3.20) that
var(๐ฅโ(`)>b
)= ๐ฅโ(`)>ฮฃ๐ฅโ(`)
=(` โ ๐0)2
๐ 4๐(๐ โ ๐0 1)> ฮฃโ1ฮฃฮฃโ1 (๐ โ ๐0 1)
=(` โ ๐0)2
๐ 4๐
(๐>ฮฃโ1๐ โ 2๐0 1> ฮฃโ1๐ + ๐20 1
> ฮฃโ1 1)
=(` โ ๐0)2
๐ 4๐
(๐ โ 2๐0๐ + ๐20๐
)=
(` โ ๐0๐ ๐
)2, (3.26)
which is the assertion. ๏ฟฝ
Remark 3.28. Note, that the portfolio with smallest variance is attained here for ห = ๐0, thecorresponding variance by (3.26) is zero , i.e, there is no risk. This is in significant contrastto Remark 3.18.
3.9 ONE FUND THEOREM
Theorem 3.29 (One fund theorem, Tobin5-separation, market portfolio). Every efficient port-folio is the affine combination of
(i) a portfolio without a risk-free asset (the market portfolio), and
(ii) the risk free asset.
Definition 3.30. The portfolio allocation in (i) is called market portfolio.
Proof. Choose
(i) ` B ๐0, then the portfolio in (3.20) is ๐ฅโ(๐0) =ยฉยญยญยญยญยซ10...
0
ยชยฎยฎยฎยฎยฌ; this portfolio does not involve stocks
and thus is completely free of risk, i.e., it consists of the risk-free asset solely;
5Tobin-Separation, 1918โ2002, American economist
rough draft: do not distribute
3.9 ONE FUND THEOREM 31
(ii) For the market portfolio, recall (cf. the tangency portfolio (3.14))
`๐ B `๐ก =๐ โ ๐0 ๐๐ โ ๐0 ๐
and ๐ฅ๐ B ๐ฅ๐ก . Then, by (3.19) and (3.20),
๐ฅโ(`๐) =(0
๐ฅโ(`๐ก )
)=
(0
๐๐
๐ ๐ฮฃโ1 (๐ โ ๐0 ยท 1)
)=
(0๐ฅ๐
),
which means that the portfolio ๐ฅโ(`๐) is free from risk-free assets, i.e., does not containthe risk free asset.
The assertion follows, as every portfolio is a linear combination of both portfolios by the TwoFund Theorem, Theorem 3.25. Explicitly, the optimal portfolio (cf. (3.20)) is
๐ฅโ(`) = `๐ โ ``๐ โ ๐0
(10
)+ ` โ ๐0`๐ โ ๐0
(0
๐๐
๐ ๐ฮฃโ1
(๐ โ ๐0 1
) ) =`๐ โ ``๐ โ ๐0
(10
)+ ` โ ๐0`๐ โ ๐0
(0๐ฅ๐
).
๏ฟฝ
Remark 3.31. The tangency portfolio coincides with the market portfolio, ๐ฅ๐ = ๐ฅ๐ก .
Example 3.32. For ๐0 B 2%, the optimal portfolios for our data are given according the onefund theorem as
๐ฅโ(`) = 83.3% โ `81.3%
ยฉยญยญยญยญยญยญยญยซ
100%00000
ยชยฎยฎยฎยฎยฎยฎยฎยฌ๏ธธ ๏ธท๏ธท ๏ธธno stocks
+` โ 2%81.3%
ยฉยญยญยญยญยญยญยญยซ
0160.7%โ56.0%10.3%132.2%โ147.3%
ยชยฎยฎยฎยฎยฎยฎยฎยฌ๏ธธ ๏ธท๏ธท ๏ธธno cash
;
cf. Figure 3.6.
3.9.1 Capital Asset Pricing Model (CAPM)
Recall that the market (or tangency) portfolio
๐ฅ๐ = ๐ฅ๐ก =๐๐
๐ ๐ฮฃโ1
(๐ โ ๐0 ยท 1
)has expectation (cf. (3.14))
E ๐ฅ>๐b = `๐
and variance (cf. (3.15))var
(๐ฅ>๐b
)= ๐2๐.
Remark 3.33 (Covariance of the market portfolio). The covariance of the market portfoliowith asset ๐ is
cov(๐>๐ b, ๐ฅ
>๐b
)= ๐>๐ ฮฃ๐ฅ๐ = ๐>๐ ฮฃ ยท
๐๐
๐ ๐ฮฃโ1 (๐ โ ๐0 ยท 1) =
๐๐
๐ ๐
(๐ ๐ โ ๐0),
where ๐ = E b and ๐ ๐ = E b ๐ .
Version: May 28, 2021
32 THE MARKOWITZ MODEL
0.05 0.1 0.15 0.2
ยต
-0.6
-0.4
-0.2
0
0.2
0.4
0.6
0.8
1
1.2
allo
ca
tio
n
0
0.05
0.1
0.15
0.2
0.25
0.3
0.35
0.4
sta
nd
ard
de
via
tio
n ฯ
DAX
RWE
gold
oil
cash
Figure 3.6: Markowitz portfolio including a risk free asset (cash) with return ๐0 = 2% forvarying return `
It follows from the definition of the Sharpe ratio (3.16) that
๐ฝ ๐ Bcov
(๐ฅ>๐b, ๐
>๐b
)var (๐ฅ>๐b)
=
๐๐
๐ ๐
(๐ ๐ โ ๐0)๐2๐
=๐ ๐ โ ๐0๐ ๐ ๐๐
=๐ ๐ โ ๐0`๐ โ ๐0
,
i.e.,
๐ ๐ = ๐0 + ๐ฝ ๐ ยท (`๐ โ ๐0) . (3.27)
The quantity ๐ฝ ๐ is the sensitivity of the expected excess asset returns to the expected excessmarket returns
The relation (3.27) is the core of the capital asset pricing model (CAPM). The graphof (3.27),
๐ฝ โฆโ ๐0 + ๐ฝ (`๐ โ ๐0)
is also called security market line (SML in the `-๐ฝ-diagram).
Remark 3.34. For the market portfolio ๐ฅ๐ it holds that ๐ฅ>๐๐ฝ = ๐ฝ๐ = 1.Indeed, with (3.27),
`๐ = ๐ฅ>๐๐ = ๐0๐ฅ>๐ 1+๐ฅ>๐๐ฝ ยท (`๐ โ ๐0) = ๐0 + ๐ฅ>๐๐ฝ ยท (`๐ โ ๐0)
and thus the assertion.
rough draft: do not distribute
3.10 ALTERNATIVE FORMULATIONS OF THE MARKOWITZ PROBLEM 33
3.9.2 On systematic and specific risk
Observe that the correlation of asset ๐ with the market is defined as ๐ ๐ ,๐ Bcov
(๐>๐b , b>๐ฅ>๐ b
)โvar(๐ฅ>๐ b) ยทvar
(๐>๐b
)so that
๐ฝ ๐ = ๐ ๐ ,๐ ยท๐๐
๐๐
.
It follows that
๐๐ = ๐ ๐ ,๐ ๐๐ +(1 โ ๐ ๐ ,๐
)๐๐ = ๐ฝ ๐ ๐๐๏ธธ๏ธท๏ธท๏ธธ
systematic
+(1 โ ๐ฝ ๐
๐๐
๐๐
)๐๐๏ธธ ๏ธท๏ธท ๏ธธ
specific
.
โฒ The systematic risk 6 is also called aggregate or undiversifiable risk ;
โฒ the specific risk 7 is also called unsystematic, residual or idiosyncratic risk.
3.9.3 Sharpe ratio
Note that E ๐>๐b = ๐ ๐ and var ๐>
๐b = ๐>
๐ฮฃ๐ ๐ = ฮฃ ๐ ๐ .
Definition 3.35. Then quantity๐ ๐ โ ๐0โ
ฮฃ ๐ ๐
is the Sharpe ratio of asset ๐ .8
It holds that
๐ฝ ๐ =cov
(b>๐ ๐ , b>๐ฅโ๐
)var (b>๐ฅโ๐)
=corr
(b>๐ ๐ , b>๐ฅโ๐
)๐๐
โฮฃ ๐ ๐
๐2๐=
โฮฃ ๐ ๐
๐๐
๐ ๐ ,๐.
The security market line (SML) is
SML : ๐ฝ โฆโ ๐0 + ๐ฝ ยท (`๐ โ ๐0) .
Note, from (3.27), that
SML(๐ฝ ๐) = ๐ ๐ ,SML(0) = ๐0 andSML(1) = ๐๐.
3.10 ALTERNATIVE FORMULATIONS OF THE MARKOWITZ PROBLEM
Instead of Markowitz (3.3) one may consider the problem
maximize ๐>๐ฅsubjet to ๐ฅ>ฮฃ๐ฅ โค ๐,
1> ๐ฅ โค 1,(๐ฅ โฅ 0)
6systematisches Risiko (dt.)7unsystematisches, spezifisches, diversifizierbares Risiko (dt.)8William Sharpe, 1934, Nobel memorial Price in Economic Sciences (1990)
Version: May 28, 2021
34 THE MARKOWITZ MODEL
Eigenvalue 2.254 0.773 0.440 0.165 0.024Variance explained 61.7% 21.2% 12.0% 4.5% 0.6%
(a) Eigenvalues and percentages of explained variance
PC1 PC2 PC3
DAX -0.02 -0.34 0.31RWE -0.95 0.30 0.05gold 0.05 -0.24 -0.91oil -0.31 0.91 0.26FX 0.08 0.14 0.13
(b) The first 3 principal components ex-plain 94.9%
Table 3.5: Principal component analysis
Proposition 3.36 (Utility maximization). The explicit solution of (^ > 0)
maximize E ๐ฅ>b โ ^2
var ๐ฅ>b (3.28)
subjet to 1> ๐ฅ โค 1,(๐ฅ โฅ 0)
is
๐ฅ =1^ฮฃโ1
(๐ + ^ โ 1
> ฮฃโ1๐
1> ฮฃโ1 11
).
Proof. The first order conditions for the Lagrangian ๐ฟ (๐ฅ;_) B ๐ฅ>๐ โ ^2 ๐ฅ>ฮฃ๐ฅ + _ (1> ๐ฅ โ 1) are
0 = ๐ โ ^ ฮฃ ๐ฅ + _ 1 and1 = 1> ๐ฅ.
from which follows that ๐ฅ = 1^ฮฃโ1(๐ + _ 1). Further, 1 = 1> ๐ฅ = 1
^1> ฮฃโ1(๐ + _ 1), i.e., _ =
^โ1> ฮฃโ1๐1> ฮฃโ1 1
. Hence the result. ๏ฟฝ
Remark 3.37. The portfolios of (3.28) and (3.3) coincide for ^ = ๐๐ `โ๐ . In this case, ` = ๐+๐ ^
๐ ^
and ๐2 = ๐+^2๐ ^2
.
3.11 PRINCIPAL COMPONENTS
Table 3.5 collects the eigenvalues and principal components of the covariance matrix ฮฃ forthe three components according the KarhunenโLoรจve decomposition. The first three princi-pal components explain 95 % of the data.
3.12 PROBLEMS
Exercise 3.3. Verify (3.25) and (3.20).
rough draft: do not distribute
3.12 PROBLEMS 35
Stocks:return ` ๐1 ๐2 ๐3 ๐4 ๐5
0% 2.8% 10.5% 14.3% โ7.8% 80.2%15% 31.2% โ1.5% 13.6% 17.4% 39.3%
5%
(a) Markowitz portfolio
Stocks:return ` ๐1 ๐2 ๐3
2% โ4% โ2% 106%14% 12% 6% 82%
(b) Markowitz portfolio
Table 3.6: Markowitz portfolios for various `
Exercise 3.4. The following portfolios (asset allocations, Table 3.6a) are efficient (in thesense of Markowitz). Give the Markowitz portfolio for ` = 5%?
Exercise 3.5. Is there a risk free asset among ๐1, . . . ๐5 in Table 3.6a?
Exercise 3.6. Give two pros and two cons for Markowtzโs model.
Exercise 3.7. The portfolios in Table 3.6b are efficient. What is the risk free rate?
Exercise 3.8. Give the portfolio in Table 3.6b which does not contain a risk free asset.
Exercise 3.9. Verify Remark 3.37.
Version: May 28, 2021
36 THE MARKOWITZ MODEL
rough draft: do not distribute
4Value-at-Risk
Never catch a falling knife.
investment strategy
4.1 DEFINITIONS
Definition 4.1 (Cumulative distribution function, cdf). Let ๐ : ฮฉโ R be a real-valued randomvariable. The cumulative distribution function (cdf, or just distribution function) is1
๐น๐ (๐ฅ) B ๐(๐ โค ๐ฅ). (4.1)
Definition 4.2. The Value-at-Risk at (confidence, or risk) level ๐ผ โ [0, 1] is2
V@R๐ผ (๐ ) B ๐นโ1๐ (๐ผ) = inf {๐ฅ : ๐ (๐ โค ๐ฅ) โฅ ๐ผ} . (4.2)
The Value-at-Risk is also called the quantile funciton ๐๐ผ (๐ ) B V@R๐ผ (๐ ) or generalizedinverse.
Example 4.3. Cf. Figur 4.1 and Table 4.1 or Figure 9.1.
4.2 HOW ABOUT ADDING RISK?
Fact. Consider the random variables (cf. Table 4.2) for which
V@R20%(๐) + V@R20%(๐ ) = 2 + 0 = 2 < 4 = V@R20%(๐ + ๐ ),
1Note that ๐น๐ (ยท) is cร dlร g, i.e., continue ร droite, limite ร gauche: right continuous with left limits2Note, that inf โ = +โ.
๐ 1 2 3 4 5
๐ฆ๐ 3 7 โ3 8 โ5๐ (๐ = ๐ฆ๐) 1/5 1/5 1/5 1/5 1/5
(a) Observations
๐ฆ โ5 โ3 3 7 8 3.9๐น๐ (๐ฆ) 1/5 2/5 3/5 4/5 5/5 3/5
(b) Cumulative distribution function
๐ผ 1/5 2/5 3/5 4/5 5/5V@R๐ผ (๐ ) โ5 โ3 3 7 8
(c) Value-at-Risk
Table 4.1: Value-at-Risk
37
38 VALUE-AT-RISK
๐ฅ0
1
๐๐(๐ โค ๐)
๐ V@R๐ผ (๐ )
๐ผ
๐ฆโฒ
๐(๐ = ๐)
(a) Cumulative distribution function, cdf
๐นโ1๐(๐ผ)
๐ผ0 1
๐ = V@R๐ (๐ )
๐ฆโฒ
๐ผ
โV@R1โ๐ (โ๐ )
๐ = ๐(๐ โค ๐ฆโฒ)
๐(๐ = ๐)
(b) Quantile function, the generalized inverse of Figure 4.1a
Figure 4.1: Cumulative distribution and its corresponding quantile function
Figure 4.2: Deutsche Bank, annual report 2014, Values-at-Risk
rough draft: do not distribute
4.2 HOW ABOUT ADDING RISK? 39
๐ (๐ = ๐ฅ๐ , ๐ = ๐ฆ๐) 1/3 1/3 1/3๐ 2 4 5๐ 7 0 6
๐ + ๐ 9 4 11
Table 4.2: Counterexample
butV@R40%(๐) + V@R40%(๐ ) = 4 + 6 = 10 > 9 = V@R40%(๐ + ๐ ).
Lemma 4.4 (Cf. Figur 4.1). It holds that
(i) ๐นโ1๐(๐ผ) โค ๐ฅ if and only if ๐ผ โค ๐น๐ (๐ฅ) (Galois inequalities);
(ii) ๐น๐ is continuous from right (upper semi-continuous);
(iii) ๐นโ1๐
is continuous from left (lower semi-continuous);
(iv) ๐นโ1๐(๐น๐ (๐ฅ)) โค ๐ฅ for all ๐ฅ โ R and ๐น๐
(๐นโ1๐(๐ผ)
)โฅ ๐ผ for all ๐ผ โ (0, 1);
(v) ๐นโ1๐
(๐น๐
(๐นโ1๐(๐ฆ)
) )= ๐นโ1
๐(๐ฆ) and ๐น๐
(๐นโ1๐(๐น๐ (๐ฆ))
)= ๐น (๐ฆ).
Remark 4.5 (Quantile transform). Let ๐ be uniformly distributed, i.e., ๐(๐ โค ๐ข) = ๐ข for every๐ข โ (0, 1). The random variables ๐ and ๐นโ1
๐(๐) share the same distribution.
Proof. ๐(๐นโ1๐(๐) โค ๐ฆ
)= ๐ (๐ โค ๐น๐ (๐ฆ)) = ๐น๐ (๐ฆ), the assertion. ๏ฟฝ
The converse does not hold true, i.e., ๐น๐ (๐ ) is not necessarily uniformly distributed. How-ever, we have the following:
Lemma 4.6 (The generalized quantile transform, Pflug and Rรถmisch [2007, Proposition 1.3]).Let ๐ be uniform and independent from ๐ . Then
๐น (๐,๐) B (1 โ๐) ยท ๐น (๐โ) +๐ ยท ๐น (๐ ) (4.3)
is uniformly [0, 1] and๐นโ1๐
(๐น (๐,๐)
)= ๐ almost surely,
where ๐น (๐ฅโ) B lim๐ฅโฒโ๐ฅ ๐น (๐ฅ โฒ).
Proof. For ๐ โ (0, 1) fixed let ๐ฆ๐ satisfy ๐น๐ (๐ฆ๐โ) โค ๐ โค ๐น (๐ฆ๐). Then
๐(๐น (๐,๐) โค ๐ |๐
)=
1 if ๐ < ๐ฆ๐
๐โ๐น (๐ฆ๐โ)๐น (๐ฆ๐)โ๐น (๐ฆ๐โ) if ๐ = ๐ฆ๐
0 if ๐ > ๐ฆ๐
and thus ๐(๐น (๐,๐) โค ๐
)= ๐น (๐ฆ๐โ) +
(๐น (๐ฆ๐) โ ๐น (๐ฆ๐โ)
) ๐โ๐น (๐ฆ๐โ)๐น (๐ฆ๐)โ๐น (๐ฆ๐โ) = ๐, i.e., ๐น (๐,๐) is
uniformly distributed.Conditional on {๐ = ๐ฆ} it holds that ๐น (๐,๐) โ
[๐นโ1(๐ฆโ), ๐นโ1(๐ฆ)
]. But ๐นโ1(๐ข) = ๐ฆ for every
๐ข โ[๐นโ1(๐ฆโ), ๐นโ1(๐ฆ)
]and thus the assertion. ๏ฟฝ
Version: May 28, 2021
40 VALUE-AT-RISK
4.3 PROPERTIES OF THE VALUE-AT-RISK
Nice properties (cf. Lemma 4.4)
(i) Homogeneity: it holds that V@R๐ผ (_๐ ) = _ ยท V@R๐ผ (๐ ) for _ โฅ0;3
๐น_๐ (ยท) = ๐น๐ ( ยท/_) .
(ii) Cash-invariance: V@R๐ผ (๐ + ๐) = ๐ + V@R๐ผ (๐ ), where ๐ โ R is a constant. Note alsothat
๐น๐+๐ (ยท) = ๐น๐ (ยท โ ๐) .
(iii) Law-invariance: if ๐ and ๐ share the same law, i.e., ๐(๐ โค ๐ง) = ๐(๐ โค ๐ง) for all ๐ง โ R,then V@R๐ผ (๐) = V@R๐ผ (๐ ) (note ๐ and ๐ may have the same law, even if ๐ (๐) โ ๐ (๐)for all ๐ โ ฮฉ and even if ๐ : ฮฉโ R and ๐ : ฮฉโฒโ R);
(iv) ๐น๐(๐นโ1๐(๐)
)โฅ ๐, with equality, if ๐ is in the range of ๐น๐ , equivalently, if ๐นโ1
๐(๐) is a point
of continuity of ๐น๐ ;
(v) ๐นโ1๐(๐น๐ (๐ข)) โค ๐ข, with equality, if ๐ข is in the range of ๐นโ1
๐, equivalently, if ๐น๐ (๐ข) is a point
of continuity of ๐นโ1๐
;
(vi) comonotonic additive (cf. Section 14), i.e., V@R๐ผ (๐ + ๐ ) = V@R๐ผ (๐) + V@R๐ผ (๐ ), pro-vided that ๐ and ๐ are comonotonic.
4.4 PROFIT VERSUS LOSS
For an illustration see Figure 4.3.
Lemma 4.7 (Profit vs. loss, cf. Figure 4.3). It holds that
V@R๐ผ (๐ ) โค โV@R1โ๐ผ (โ๐ ) (4.4)
with equality if ๐น๐ (๐ฆ + โ) > ๐น๐ (๐ฆ) for โ > 0 at ๐ฆ = V@R๐ผ (๐ ).
Proof. First,
โV@R1โ๐ผ (โ๐ ) = โ inf {๐ฆ : ๐(โ๐ โค ๐ฆ) โฅ 1 โ ๐ผ}= sup {โ๐ฆ : ๐(๐ โฅ โ๐ฆ) โฅ 1 โ ๐ผ}= sup {๐ฆ : ๐(๐ โฅ ๐ฆ) โฅ 1 โ ๐ผ}= sup {๐ฆ : 1 โ ๐(๐ โฅ ๐ฆ) โค ๐ผ}= sup {๐ฆ : ๐(๐ < ๐ฆ) โค ๐ผ} (4.5)
Now observe that{๐ฆ : ๐(๐ < ๐ฆ) โค ๐ผ} ยคโช {๐ฆ : ๐(๐ < ๐ฆ) > ๐ผ}
3Throughout this lecture shall investigate positively homogeneous acceptability functionals.
rough draft: do not distribute
4.4 PROFIT VERSUS LOSS 41
โAV@R95%(โ๐ ) = A(๐ )V@R5%(๐ )
0
E๐
loss profit
๐
V@R95%(โ๐ ) = โV@R5%(๐ )AV@R95%(โ๐ )
0
โE๐
profit loss
โ๐
Figure 4.3: Profit versus loss
Version: May 28, 2021
42 VALUE-AT-RISK
๐1 ๐2 ๐3
ฮ: 5% โ4% โ2%8% 2% 0%4% 1% 0%โ9% 0% โ10%
Table 4.3: Annualized, logarithmic returns
and disjoint intervals. It follows that
โV@R1โ๐ผ (โ๐ ) = inf {๐ฆ : ๐(๐ < ๐ฆ) > ๐ผ}โฅ inf {๐ฆ : ๐(๐ โค ๐ฆ) > ๐ผ}โฅ inf {๐ฆ : ๐(๐ โค ๐ฆ) โฅ ๐ผ} ,
the assertion.Now set ๐ฆ B V@R๐ผ (๐ ). As the cdf ๐ฆ โฆโ ๐(๐ โค ๐ฆ) is right-continuous it follows that ๐ฆ is
feasible for (4.2) and (4.5), i.e., ๐(๐ < ๐ฆ) โค ๐ผ โค ๐(๐ โค ๐ฆ). Hence the result. ๏ฟฝ
Problem 4.8. A Markowitz-like formulation involving the Value-at-Risk is
maximize(in ๐ฅ = (๐ฅ1, . . . ๐ฅ๐) )
1๐
โ๐๐=1 ๐ฅ
>b๐ = ๐ฅ>b
subject to V@R5% (๐ฅ>b) โฅ โ2$ 5% worst profits > โ2$๐ฅ1 + ยท ยท ยท + ๐ฅ๐ โค 1.000$ budget constraint(๐ฅ โฅ 0) shortselling allowed / not allowed
or, with (4.4),
maximize(in ๐ฅ = (๐ฅ1, . . . ๐ฅ๐) )
1๐
โ๐๐=1 ๐ฅ
>b๐ = ๐ฅ>b
subject to V@R95% (โ๐ฅ>b) โค 2$ 95% of all losses < 2$๐ฅ1 + ยท ยท ยท + ๐ฅ๐ โค 1.000$ budget constraint(๐ฅ โฅ 0) shortselling allowed / not allowed
4.5 PROBLEMS
Exercise 4.1. Is Problem 4.8 always feasible? Where is the Risk? Which statistics areinvolved?
Exercise 4.2. Is Problem 4.8 clever? โ Downsides? How can one obtain higher returns?
Exercise 4.3. The matrix ฮ in Table 4.3 contains logarithmic, annualized returns of 3 sharesat the end of 4 quarters. You are invested with ๐ฅ = (40%, 30%, 30%). What is the Value-at-Risk at risk-level ๐ผ = 30%, ๐ผ = 70% of your returns?
rough draft: do not distribute
5Axiomatic Treatment of Risk
If in trouble, double.
Bรถrsenweisheit
Definition 5.1 (Artzner et al. [1999, 1997]). A positively homogeneous risk measure, aka.risk functional or coherent risk measure is a mapping R : ๐ฟ ๐ โ R โช {โ} with the followingproperties:
(i) MONOTONICITY: R (๐1) โค R (๐2) whenever ๐1 โค ๐2 almost surely;
(ii) CONVEXITY: R ((1 โ _)๐0 + _๐1) โค (1 โ _) R (๐0) + _R (๐1) for 0 โค _ โค 1;
(iii) TRANSLATION EQUIVARIANCE:1 R(๐ + ๐) = R(๐ ) + ๐ if ๐ โ R;
(iv) POSITIVE HOMOGENEITY: R(_๐ ) = _R(๐ ) whenever _ > 0.
Throughout this lecture shall investigate positively homogeneous risk functionals.
Remark 5.2. In the present context ๐ is associated with loss. In the literature the mapping
๐ : ๐ โฆโ R(โ๐ )
is often called coherent risk measure instead, when ๐ is associated with a reward rather thana loss: Whereas R is more frequent in an actuarial (insurance) context, ๐ is typically used ina banking context.
The term acceptability functional (or Utility Function, cf. Figure 4.3) is frequently used forthe concave mapping
A : ๐ โฆโ โR (โ๐ ) . (5.1)
Example 5.3 (Simple Examples of risk measures). The functionals
R(๐ ) B E๐
andR(๐ ) B ess sup๐
are risk measures.
Example 5.4. The functional R(๐ ) B E๐๐ is a risk functional, provided that ๐ โฅ 0 andE ๐ = 1.
1In an economic or monetary environment this is often called CASH INVARIANCE instead.
43
44 AXIOMATIC TREATMENT OF RISK
Proof. By translation equivariance we have that
R(๐ ) + ๐ = R(๐ + ๐ 1) = E(๐ + ๐ 1)๐ = E๐ + ๐E ๐,
hence E ๐ = 1.Further, we have for all ๐1 โค ๐2 that E๐1๐ = R(๐1) โค R(๐1) = E๐2๐, i.e., E ๐๐ โฅ 0 for all
๐ โฅ 0. This can only hold true for ๐ โฅ 0. ๏ฟฝ
Theorem 5.5. Suppose a risk functional R(ยท) is well defined on ๐ฟโ and satisfies (i) and (iii)in Definition 5.1. Then R is Lipschitz-continuous with respect to โยทโโ; the Lipschitz constantis 1.
Proof. For ๐ and ๐ โ ๐ฟโ it is true that ๐ โ ๐ โค โ๐ โ ๐ โโ and hence ๐ โค โ๐ โ ๐ โโ + ๐ . Bymonotonicity and translation equivariance thus
R(๐) โค R(๐ + โ๐ โ ๐ โโ) = โ๐ โ ๐ โโ + R(๐ )
and thusR(๐) โ R(๐ ) โค โ๐ โ ๐ โโ ;
interchanging the role of ๐ and ๐ reveals the result. ๏ฟฝ
Lemma 5.6. If R1 and R2 are risk measures, then so are
โฒ 12R1 +
12R2 and
โฒ max {R1,R2}.
By the FenchelโMoreau theorem (see below), no other risk measures are possible.
rough draft: do not distribute
6Examples of Coherent Risk Functionals
Buy on rumors, sell on facts.
Bรถrsenweisheit
6.1 MEAN SEMI-DEVIATION
The semi-deviation risk measure is1
R(๐ ) B E๐ + ๐ฝ ยท โ (๐ โ E๐ )+โ ๐ , (6.1)
where ๐ฝ โ [0, 1] and ๐ โฅ 1.
Proposition 6.1. The semi-deviation (6.1) is a risk measure.
Proof. Convexity (ii), translation equivariance (iii) and homogeneity (iv) are evident. To showmonotonicity (i) assume that ๐ โค ๐ . By Jensens inequality we have that (๐ฅ + ๐ฆ)+ โค ๐ฅ+ + ๐ฆ+and it follows that
(๐ โ E ๐)+ =(๐ โ E๐ + (๐ โ ๐ โ E(๐ โ ๐ ))
)+
โค (๐ โ E๐ )+ +(๐ โ ๐ โ E(๐ โ ๐ )
)+.
Now we have that ๐ฅ+ = ๐ฅ + (โ๐ฅ)+ (cf. Footnote 1) and thus further
(๐ โ E ๐)+ โค (๐ โ E๐ )+ +(๐ โ ๐ โ E(๐ โ ๐ )
)+
(๐ โ ๐ โ E(๐ โ ๐)
)+.
Notice recall that ๐ โ ๐ โฅ 0 and further that E ๐ โค E๐ , and thus(๐ โ ๐ โE(๐ โ ๐)
)+ โค ๐ โ ๐.
Consequently
(๐ โ E ๐)+ โค (๐ โ E๐ )+ + ๐ โ ๐ โ E(๐ โ ๐ ) + ๐ โ ๐= (๐ โ E๐ )+ + E(๐ โ ๐).
It follows thatE ๐ + (๐ โ E ๐)+ โค E๐ + (๐ โ E๐ )+.
Now multiply the latter inequality with the density ๐ with ๐ โฅ 0, E ๐ = 1 and โ๐ โ๐ โค 1 toobtain
E ๐ + โ(๐ โ E ๐)+โ ๐ โค E๐ + โ(๐ โ E๐ )+โ ๐by Hรถlderโs inequality. Multiplying this inequality with ๐ฝ and adding (1โ ๐ฝ) times the inequalityE ๐ โค E๐ finally gives monotonicity (i). ๏ฟฝ
1๐ฅ+ B max {0, ๐ฅ}. Note, that ๐ฅ+ โ (โ๐ฅ)+ = ๐ฅ.
45
46 EXAMPLES OF COHERENT RISK FUNCTIONALS
6.2 AVERAGE VALUE-AT-RISK
The most important and prominent acceptability functional satisfying all axioms of the Defini-tion is the Average Value-at-Risk.
Definition 6.2. The Average Value-at-Risk 2 at level ๐ผ โ [0, 1] is given by
AV@R๐ผ (๐ ) B11 โ ๐ผ
โซ 1
๐ผ
V@R๐ (๐ ) d๐ =11 โ ๐ผ
โซ 1
๐ผ
๐นโ1๐ (๐ข) d๐ข (0 โค ๐ผ < 1)
andAV@R1 (๐ ) B ess sup๐ .
Proposition 6.3. Representations of the Average Value-at-Risk include(cf. Footnote 1 on thepreceding page)
AV@R๐ผ (๐ ) =11 โ ๐ผ
โซ 1
๐ผ
๐นโ1๐ (๐) d๐ (6.2)
= inf๐โR
๐ + 11 โ ๐ผ E (๐ โ ๐)+ (6.3)
= sup{E ๐๐ : 0 โค ๐ โค (1 โ ๐ผ)โ1 , E ๐ = 1
}(6.4)
= sup{E๐ ๐ :
d๐d๐โค 11 โ ๐ผ
}=
cf. Remark 6.4E
(๐ | ๐ > V@R๐ผ (๐ )
). (6.5)
The ๐ผ-quantile ๐โ = ๐นโ1๐(๐ผ) minimizes (6.3).
Remark 6.4. Equation (6.5) is correct, provided that ๐ (๐ > V@R๐ผ (๐ )) = 1โ๐ผ, i.e., ๐ (๐ โค V@R๐ผ (๐ )) =๐ผ, or ๐น๐
(๐นโ1๐(๐ผ
)= ๐ผ (cf. Lemma 4.4 (iv)). In this case (6.5) follows from (6.2).
Proof. Differentiate the objective in (6.3) with respect to ๐ to get the necessary condition ofoptimality
0 = 1 โ 11 โ ๐ผ E1{๐<๐ } = 1 โ
11 โ ๐ผ (1 โ ๐ (๐ โค ๐)) ,
i.e., ๐(๐ โค ๐) = ๐ผ, so that ๐โ = ๐นโ1๐(๐ผ) = V@R๐ผ (๐ ). The objective in (6.3) hence is
๐โ + 11 โ ๐ผ E (๐ โ ๐
โ)+ = ๐โ +11 โ ๐ผ
โซ 1
0
(๐นโ1๐ (๐ข) โ ๐โ
)+d๐ข
= ๐นโ1๐ (๐ผ) +11 โ ๐ผ
โซ 1
๐ผ
๐นโ1๐ (๐ข) โ ๐นโ1๐ (๐ผ) d๐ข
=11 โ ๐ผ
โซ 1
๐ผ
๐นโ1๐ (๐ข) d๐ข = AV@R๐ผ (๐ ).
2Theโฒ Average Value-at-Risk, or conditional Value-at-Risk,
is sometimes also calledโฒ Conditional Tail Expectation (CTE)โฒ expected shortfall,โฒ tail value-at-risk or newlyโฒ super-quantile.
rough draft: do not distribute
6.2 AVERAGE VALUE-AT-RISK 47
See Lemma (6.7) below for the remaining assertion. ๏ฟฝ
Lemma 6.5. It holds that
(i) AV@R0(๐ ) = E๐ ;
(ii) V@R๐ผ (๐ ) โค AV@R๐ผ (๐ );
(iii) AV@R๐ผโฒ (๐ ) โค AV@R๐ผ (๐ ), provided that ๐ผโฒ โค ๐ผ, and particularly
(iv) E๐๏ธธ๏ธท๏ธท๏ธธrisk neutral
โค AV@R๐ผ (๐ )๏ธธ ๏ธท๏ธท ๏ธธrisk averse
โค ess sup๐๏ธธ ๏ธท๏ธท ๏ธธcompletely risk averse
.
Proof. Indeed, substitute ๐ข โ ๐น๐ (๐ฆ) and we have AV@R0(๐ ) =โซ 10 ๐น
โ1๐(๐ข) d๐ข =
โซR๐ฆ d๐น๐ (๐ฆ) =
E๐ . In case a density is available then d๐น๐ (๐ฆ) = ๐๐ (๐ฆ) d๐ฆ and thus AV@R0(๐ ) =โซR๐ฆ ๐๐ (๐ฆ) d๐ฆ.
For the proof of (iii) see the more general Proposition 6.12 below. ๏ฟฝ
Proposition 6.6. The Average Value-at-Risk is a risk functional according the axioms ofDefinition 5.1.
Proof. Monotonicity, translation equivariance and positive homogeneity are evident by (6.2)and (6.3).
As for convexity let ๐โ0 (๐โ1, resp.) be optimal in (6.3) for the random variable ๐0 (๐1, resp.).Set ๐_ B (1 โ _)๐0 + _๐1 and ๐_ B (1 โ _)๐โ0 + _๐
โ1. Then
AV@R๐ผ
((1 โ _)๐0 + _๐1
)โค ๐_ +
11 โ ๐ผ E (๐_ โ ๐_)+
= (1 โ _)๐โ0 + _๐โ1 +
11 โ ๐ผ E
((1 โ _)
(๐0 โ ๐โ0
)+ _
(๐1 โ ๐โ1
) )+
โค (1 โ _)๐โ0 + _๐โ1 +1 โ _1 โ ๐ผ E
(๐0 โ ๐โ0
)+ +
_
1 โ ๐ผ E(๐1 โ ๐โ1
)+ (6.6)
= (1 โ _) AV@R๐ผ (๐0) + _ AV@R๐ผ (๐1),
where we have used Jensenโs inequality in (6.6) for the convex function ๐ฆ โฆโ (๐ฆ โ ๐)+; thusthe assertion. ๏ฟฝ
Lemma 6.7. It holds that
AV@R๐ผ (๐ ) = max{E๐๐ : 0 โค ๐ โค 1
1 โ ๐ผ , E ๐ = 1}= min
๐โR
{๐ + 11 โ ๐ผ E (๐ โ ๐)+
}. (6.7)
Proof. We provide a prove of the statement for discrete random variables based on duality.Recall first that the linear problems
minimize (in ๐ฅ) ๐>๐ฅsubject to ๐ด1๐ฅ = ๐1
๐ด2๐ฅ โฅ ๐2๐ฅ โฅ 0
and
maximize (in _,`) _>๐1 + `>๐2subject to _>๐ด1 + `>๐ด2 โค ๐>
` โฅ 0
Version: May 28, 2021
48 EXAMPLES OF COHERENT RISK FUNCTIONALS
are dual to each other. We rewrite the initial problem (6.7)
โminimize (in ๐) โ๐ โ๐๐๐๐๐๐
subject toโ
๐ ๐๐๐๐ = 1โ๐๐๐๐ โฅ โ๐๐ 11โ๐ผ๐๐ โฅ 0
with ๐๐ = โ๐๐๐๐, ๐ด1,๐ = ๐๐, ๐1 = 1, ๐ด2,๐ = โ๐๐ and ๐2,๐ = โ ๐๐1โ๐ผ . Inserting in the dual gives
โmaximize (in _, `) _ โโ๐
๐๐1โ๐ผ `๐
subject to _๐๐ โ ๐๐`๐ โค โ๐๐๐๐ ,`๐ โฅ 0.
(6.8)
Now note that the latter two inequalities are `๐ โฅ 0 and `๐ โฅ _ + ๐๐. The maximum in (6.8) isattained for `๐ = max {0, _ + ๐๐}. Hence (6.8) rewrites as
โmaximize_ = _ โ 11 โ ๐ผ
โ๐
๐๐ (๐๐ + _)+ = minimize_ = โ_ + 11 โ ๐ผ
โ๐
๐๐ (๐๐ + _)+ ,
from which the assertion follows. ๏ฟฝ
6.3 ENTROPIC VALUE-AT-RISK
Definition 6.8. The Entropic Value-at-Risk at risk level ๐ผ โ [0, 1) is
EV@R๐ผ (๐ ) = inf๐ก>0
1๐กlog
11 โ ๐ผ E ๐
๐ก๐ .
It is a risk measure satisfying all conditions (i)โ(iv) in Definition 5.1.
6.4 SPECTRAL RISK MEASURES
Definition 6.9. Spectral risk measures3 are
R๐ (๐ ) Bโซ 1
0๐(๐ข)๐นโ1๐ (๐ข) d๐ข
for some spectral function ๐ : [0, 1] โ R; occasionally, the function ๐(ยท) is also called spec-trum.
Proposition 6.10. Spectral functions ๐ : [0, 1) โ Rโฅ0 necessarily satisfy
(i)โซ 10 ๐(๐ข) d๐ข = 1 (by translation equivariance),
(ii) ๐(ยท) โฅ 0 (by monotonicity) and
(iii) ๐(ยท) is nondecreasing (by convexity).
Proof. See Exercise 6.3. ๏ฟฝ
3Also: distortion risk funtionals
rough draft: do not distribute
6.5 KUSUOKAโS REPRESENTATION OF LAW INVARIANT RISK MEASURES 49
Remark 6.11. The Average Value-at-Risk is a spectral risk measure for the spectrum
๐(๐ข) B{0 if ๐ข < ๐ผ,11โ๐ผ if ๐ข โฅ ๐ผ.
Proposition 6.12. Suppose thatโซ 1๐ผ๐1(๐ข) d๐ข โค
โซ 1๐ผ๐2(๐ข) d๐ข for all ๐ผ โ (0, 1), then R๐1 (๐ ) โค
R๐2 (๐ ).
Proof. We verify the statement for ๐ bounded. Set ฮฃ๐ (๐ข) Bโซ 1๐ข๐๐ (๐) d๐. Then
R๐1 (๐ ) =โซ 1
0๐1(๐ข)๐นโ1๐ (๐ข) d๐ข = โ
โซ 1
0๐นโ1๐ (๐ข) dฮฃ1(๐ข),
and by integration by parts
R๐1 (๐ ) = โ๐นโ1๐ (๐ข)ฮฃ1(๐ข)๏ฟฝ๏ฟฝ1๐ข=0 +
โซ 1
0ฮฃ1(๐ข) d๐นโ1๐ (๐ข) = ๐นโ1๐ (0) +
โซ 1
0ฮฃ1(๐ข) d๐นโ1๐ (๐ข).
Note, that ฮฃ1(ยท) โค ฮฃ2(ยท) by assumption and ๐นโ1๐(ยท) is an increasing function, thus
R๐1 (๐ ) = ๐นโ1๐ (0) +โซ 1
0ฮฃ1(๐ข) d๐นโ1๐ (๐ข) โค ๐นโ1๐ (0) +
โซ 1
0ฮฃ2(๐ข) d๐นโ1๐ (๐ข) = R๐2 (๐ ).
๏ฟฝ
Lemma 6.13. R` (๐ ) Bโซ 10 AV@R๐ผ (๐ ) ` (d๐ผ) is a spectral risk measure, provided that
โฒโซ 10 ` (d๐ผ) = 1 (to ensure translation equivariance) and
โฒ `(ยท) โฅ 0 (to ensure monotonicity).
Proof. Indeed,
R` (๐ ) =โซ 1
0AV@R๐ผ (๐ ) ` (d๐ผ) =
โซ 1
0V@R๐ผ (๐ ) ๐ (๐ผ) d๐ผ,
where ๐ (๐) =โซ ๐
0` (d๐ผ)1โ๐ผ is the spectrum. ๏ฟฝ
Lemma 6.14. For ๐ โฅ 0 a.s. we have the representation
R๐ (๐ ) =โซ โ
0ฮฃ (๐น๐ (๐)) d๐ (if ๐ โฅ 0 a.s.),
where ฮฃ (๐ผ) =โซ 1๐ผ๐ (๐) d๐ is the negative antiderivative.
Version: May 28, 2021
50 EXAMPLES OF COHERENT RISK FUNCTIONALS
Figure 6.1: AV@R๐ผโs are extreme points, so there is some Choquet-Bishop-de Leeuw Rep-resentation for A (Krein-Milman Theorem).
6.5 KUSUOKAโS REPRESENTATION OF LAW INVARIANT RISK MEASURES
A supremum of Choquet representations.
Theorem 6.15. Suppose R is a law invariant Risk measure. Then it has the Kusuoka-representation
R (๐ ) = sup`โM
โซ 1
0AV@R๐ผ (๐ ) ` (d๐ผ)
(M is a set of positive measures on [0, 1]).
Proof. We have that R(๐ ) = sup๐ E๐๐ โ Rโ(๐) = sup๐ โZ E๐๐ as Rโ(๐) โ {0,โ}. For ๐ โ Zgiven, let ๐ โฒ have the same distribution as ๐ so that ๐ โฒ and ๐ are comonotone. Then
R(๐ ) = R(๐ โฒ) = sup๐ โZ
E๐๐ โ Rโ(๐) = sup๐ โZ
โซ 1
0๐นโ1๐ (๐ข)๐นโ1๐ (๐ข) d๐ข = sup
๐โฮฃ
โซ 1
0๐(๐ข)๐นโ1๐ (๐ข) d๐ข,
where ฮฃ = {๐น๐ (ยท) : ๐ โ Z} collects all distribution functions of Z. Hence the result. ๏ฟฝ
Proposition 6.16. For any law invariant risk functional R it holds that
E๐ โค R(๐ ).
Proof. Consider the functional R๐ (ยท) first. Find ๏ฟฝ๏ฟฝ such that ๐(๐ข) โค 1 whenever ๐ข โค ๏ฟฝ๏ฟฝ and๐(๐ข) โฅ 1 for ๐ข โฅ ๏ฟฝ๏ฟฝ. Note as well that
โซ ๏ฟฝ๏ฟฝ
0 1 โ ๐(๐ข) d๐ข =โซ 1๏ฟฝ๏ฟฝ๐(๐ข) โ 1 d๐ข, as
(โซ ๏ฟฝ๏ฟฝ
0 +โซ 1๏ฟฝ๏ฟฝ
)๐(๐ข) d๐ข =(โซ ๏ฟฝ๏ฟฝ
0 +โซ 1๏ฟฝ๏ฟฝ
)1 d๐ข = 1. Thenโซ ๏ฟฝ๏ฟฝ
0(1 โ ๐(๐ข))๐นโ1๐ (๐ข) d๐ข โค
โซ ๏ฟฝ๏ฟฝ
0(1 โ ๐(๐ข))๐นโ1๐ (๏ฟฝ๏ฟฝ) d๐ข
=
โซ 1
๏ฟฝ๏ฟฝ
(๐(๐ข) โ 1)๐นโ1๐ (๏ฟฝ๏ฟฝ) d๐ข โคโซ 1
๏ฟฝ๏ฟฝ
(๐(๐ข) โ 1)๐นโ1๐ (๐ข) d๐ข.
The assertion follows from Kusuokaโs representation. ๏ฟฝ
Proposition 6.17. For any law invariant risk measure R and sub-sigma algebra G we havethat
R(E(๐ | G)
)โค R (๐ ) .
rough draft: do not distribute
6.6 APPLICATION IN INSURANCE 51
Proof. Note that ๐ฅ โฆโ (๐ฅ โ ๐)+ is convex. Thus, by the conditional Jensen inequality
(E(๐ | G) โ ๐)+ โค E ((๐ โ ๐)+ |G) .
It follows that
AV@R๐ผ
(E(๐ |G)
)= min
๐โR๐ + 11 โ ๐ผ E ((E๐ | G) โ ๐)+
โค min๐โR
๐ + 11 โ ๐ผ EE ((๐ โ ๐)+ | G)
= min๐โR
๐ + 11 โ ๐ผ E ((๐ โ ๐)+) = AV@R๐ผ (๐ ).
The assertion follows form Kusuokaโs representation. ๏ฟฝ
Remark 6.18. For the Average Value-at-Risk we have that AV@R๐ผ (E(๐ |F )) โค AV@R๐ผ (๐ ),where F is a sub-sigma algebra, and thus R (E(๐ |F )) โค R (๐ ) for law invariant risk function-als by Kusuokaโs theorem.
Proposition 6.19 (Cf. Fรถllmer and Schied [2004, Theorem 4.67]). AV@R๐ผ (ยท) is the smallestlaw invariant coherent risk functional dominating V@R๐ผ (ยท) (cf. Lemma 6.5 (ii) and Exer-cise 6.4).
Proof. By translation equivariance and for ๐ โ ๐ฟโ we may assume that ๐ > 0. Set ๐ด B{๐ > V@R๐ผ (๐ )} and consider ๐ B ๐ ยท 1
๐ด{+E(๐ |๐ด) ยท 1๐ด. Notice, that ๐ = E
(๐
๏ฟฝ๏ฟฝ๐ ยท 1๐ด{
)and
V@R๐ผ (๐) = E(๐ |๐ด). Suppose the coherent risk functional R(ยท) dominates V@R๐ผ (ยท), then
R(๐ ) โฅ R(๐) โฅ V@R๐ผ (๐) = E(๐ |๐ด) = AV@R๐ผ (๐ )
by (6.5). ๏ฟฝ
6.6 APPLICATION IN INSURANCE
The Dutch Premium Principle
A comprehensive list of Kusuoka representations for important risk functionals is providedin [Pflug and Rรถmisch, 2007]. A compelling example is the absolute semi-deviation riskmeasure (the Dutch premium principle, introduced in [van Heerwaarden and Kaas, 1992])for some fixed \ โ [0, 1],
R\ (๐ ) B E [๐ + \ ยท (๐ โ E๐ )+] ,
assigning an additional loading of \ to any loss ๐ฟ exceeding the net-premium price E ๐ฟ. ItsKusuoka representation is
R\ (๐ ) = sup0โค`โค1
(1 โ \ ยท `)E๐ + \ ` ยท AV@R1โ` (๐ )
(cf. Shapiro [2012], Shapiro et al. [2014]).
Theorem 6.20. R\ (ยท) is a convex risk functional.
Version: May 28, 2021
52 EXAMPLES OF COHERENT RISK FUNCTIONALS
Proof. We verify that R\ (ยท) is convex. Indeed, define ๐_ B (1 โ _)๐0 + _๐1. Then
(๐_ โ E๐_)+ = ((1 โ _) (๐0 โ E๐0) + _ (๐1 โ E๐1))+ โค (1 โ _) (๐0 โ E๐0)+ + _ (๐1 โ E๐1)+
and hence
E [๐_ + \ ยท (๐_ โ E๐_)+] โค (1 โ _)E๐0 + _E๐1 + \ (1 โ _) (๐0 โ E๐0)+ + _ (๐1 โ E๐1)+= (1 โ _) (E๐0 + \ (๐0 โ E๐0)+) + _ (E๐1 + \ (๐1 โ E๐1)+)= (1 โ _)R\ (๐0) + _R\ (๐1) ,
i.e., R\ is convex. ๏ฟฝ
6.7 PROBLEMS
Exercise 6.1. Compute the Average Value-at-Risk for the returns given in Table 9.2 for ๐ผ =
20%, 40%, 60% and ๐ผ = 80%.
Exercise 6.2 (Cf. Exercise 4.3). The matrix ฮ in Table 4.3 contains logarithmic, annualizedreturns of 3 shares at the end of 4 quarters.
(i) Compute the Average Value-at-Risk for the risk levels ๐ผ = 20%, 40%, 60% and ๐ผ = 80%for each asset.
(ii) You are invested with ๐ฅ = (40%, 30%, 30%). What is the Value-at-Risk of your returnsat the above risk-levels?
Exercise 6.3. Verify Proposition 6.10. Hint: try the random variables 1[_,1] (๐) to showmonotonicity, and ๐0 B 1[๐ข,1] (๐) and ๐1 B 1[๐ขโฮ,1โฮ] (๐) for convexity.
Exercise 6.4. Give a risk functional R and a random variable ๐ โ ๐ฟโ so that R(๐ ) > E๐ .
rough draft: do not distribute
7Portfolio Optimization Problems Involving Risk Measures
. . . : daร keines von ihnen verlorengehe.
Edith Stein, ESGA, Band 1
7.1 INTEGRATED RISK MANAGEMENT FORMULATION
The portfolio optimization problem we want to consider here for simplicity and introduction is(cf. Figure 4.3 and (iv) in Theorem 9.10)
maximizein ๐ฅ โR (โ๐ฅ>b) = A (๐ฅ>b)
subject to ๐ฅ> 1 โค 1C,๐ฅ โฅ 0,
where A(ยท) B โR(โ ยท) is an acceptability functional, cf. (5.1), Remark 5.2. The problem is no-tably unbounded without shortselling constraints. Equivalent is the formulation (cf. Figure 4.3,again)
minimizein ๐ฅ R (โ๐ฅ>b)
subject to ๐ฅ> 1 โค 1C,๐ฅ โฅ 0,
which is apparently a convex problem formulation.Typical risk functionals are R(๐ ) B (1 โ ๐พ)E๐ + ๐พ AV@R๐ผ (๐ ).
7.2 MARKOWITZ TYPE FORMULATION
Recall from Figure 4.3 that E๐ + R(โ๐ ) (โฅ 0) is a one-sided deviation from the mean, whichcan be interpreted as risk. The formulation
๐ฃ(`) B minimizein ๐ฅ E ๐ฅ>b + R
(โ๐ฅ>b
)(7.1)
subject to E ๐ฅ>b โฅ `,๐ฅ> 1 โค 1C,(๐ฅ โฅ 0)
specifies a convex problem in ๐ฅ, as the objective (i.e., R) is convex, the constraints are evenlinear. The function ๐ฃ is nondecreasing and it holds that 0 โค E๐ + R (โ๐ ), i.e., ๐ฃ(`) โฅ 0.
53
54 PORTFOLIO OPTIMIZATION PROBLEMS INVOLVING RISK MEASURES
Let ๐ฅ` denote the optimal diversification in (7.1). For _ โ (0, 1) set `_ B (1 โ _)`0 + _`1and ๐ฅ`_ B (1 โ _)๐ฅ`0 + _๐ฅ`1 . By linearity, ๐ฅ`_ is feasible for ๐ฃ
(`_
)and we have from convexity
of R that
๐ฃ(`_
)โค E ๐ฅ>`_b + R
(โ๐ฅ>`_b
)โค (1 โ _)๐ฃ(`0) + _๐ฃ(`1),
the function ๐ฃ(ยท) thus is convex. This gives rise to the efficient frontier ` โฆโ(๐ฃ(`)`
)(which is
concave) and an accordant tangency portfolio.
Example 7.1. Consider R(๐ ) B (1โ๐พ)E๐ +๐พ AV@R๐ผ (๐ ), then the problem of integrated riskmanagement is (cf. Figure 4.3, and again)
minimizein ๐ฅ ๐พ ยท E ๐ฅ>b + ๐พ ยท AV@R๐ผ (โ๐ฅ>b)
subject to E ๐ฅ>b โฅ `,๐ฅ> 1 โค 1C,(๐ฅ โฅ 0) ,
(7.2)
with parameters ๐ผ, ๐พ โ (0, 1).Recall that AV@R๐ผ (๐ ) = min๐โR
{๐ + 1
1โ๐ผ E(๐ โ ๐)+}. The discrete model problem (7.2)
thus may be rewritten as
minimizein ๐ฅ, ๐ ๐พ ยท ๐>ฮ๐ฅ + ๐พ ยท
(๐ + 1
1โ๐ผ ๐> (โฮ๐ฅ โ ๐)+
)subject to ๐>ฮ๐ฅ โฅ `,
๐ฅ> 1 โค 1C,(๐ฅ โฅ 0) .
To eliminate the nonlinear expression (ยท)+ define the slack variable ๐ง๐ B (โ๐ โ ๐ฅ>b๐)+ (๐ =1, . . . , ๐) and note that ๐ง๐ โฅ 0 and โ๐ โ ๐ฅ>b๐ โค ๐ง๐. So we get the linear program
minimizein ๐ฅ, ๐ง, ๐ ๐พ ยท ๐>ฮ๐ฅ + ๐พ ยท ๐ + ๐พ
1โ๐ผ ๐>๐ง
subject to โ๐ โ ๐ฅ>b๐ โค ๐ง๐ (๐ = 1, . . . ๐) ,๐>ฮ๐ฅ โฅ `,๐ฅ> 1 โค 1C,๐ง โฅ 0, (๐ฅ โฅ 0) ,
or re-written in matrix-form
minimizein ๐ฅ, ๐ง, ๐ โ๐พ๐>ฮ๐ฅ + ๐พ ยท ๐ + ๐พ
1โ๐ผ ๐>๐ง
subject to ยฉยญยซโฮ โ๐ผ๐ โ1๐1>๐
0 . . . 0 0โ๐>ฮ 0 0
ยชยฎยฌ ยฉยญยซ๐ฅ
๐ง
๐
ยชยฎยฌ โค ยฉยญยซ01โ`
ยชยฎยฌ ,๐ง โฅ 0, (๐ฅ โฅ 0) .
rough draft: do not distribute
7.3 ALTERNATIVE FORMULATION 55
Example 7.2. Consider R(๐ ) B EV@R๐ผ (๐ ), then the problem of integrated risk managementis
minimizein ๐ฅ, ๐ก E ๐ฅ>b + ๐ก log 1
1โ๐ผ E ๐โ๐ฅ> b/๐ก
subject to E ๐ฅ>b โฅ `,๐ฅ> 1 โค 1C,๐ก > 0, (๐ฅ โฅ 0) .
(7.3)
7.3 ALTERNATIVE FORMULATION
The formulation
๏ฟฝ๏ฟฝ(๐) B maximizein ๐ฅ E ๐ฅ>b
subject to โ R(โ๐ฅ>b
)โฅ ๐,
๐ฅ> 1 โค 1C,(๐ฅ โฅ 0)
specifies a convex problem in ๐ฅ โ R๐ as well (the objective is linear, the constraints convex).It holds that ๐ โค โR (โ๐ฅ>b) โค E ๐ฅ>b and thus ๏ฟฝ๏ฟฝ(๐) โฅ ๐. The function ๏ฟฝ๏ฟฝ(๐) is concave and the
frontier ๐ โฆโ(๐
๏ฟฝ๏ฟฝ(๐)
)(or ๐ โฆโ
(๐
๏ฟฝ๏ฟฝ(๐) โ ๐
)) is a concave efficient frontier, which again gives rise
for a tangency portfolio.
Version: May 28, 2021
56 PORTFOLIO OPTIMIZATION PROBLEMS INVOLVING RISK MEASURES
rough draft: do not distribute
8Expected Utility Theory
Buy on bad news, sell on good news.
Bรถrsenweisheit
The concept of utility functions dates back to Oskar Morgenstern1 and John von Neu-mann,2 expected utilities to Kenneth Arrow3 and John W. Pratt.4
Preference is given to ๐ over ๐, if
E ๐ข(๐) โค E ๐ข(๐ ). (8.1)
8.1 EXAMPLES OF UTILITY FUNCTIONS
The exponential utility for ๐พ โฅ 0 is defined as
๐ข(๐ฅ) = 1 โ ๐โ๐พ๐ฅ . (8.2)
For ๐ผ > 0, ๐ผ โ 1 the polynomial utility functions are defined as
๐ข(๐ฅ) ={
๐ฅ1โ๐ผ
1โ๐ผ ๐ฅ โฅ 0,โโ ๐ฅ < 0;
(8.3)
they are sometimes also termed power utility functions,
๐ข(๐ฅ) ={
๐ฅ^
^๐ฅ โฅ 0,
โโ ๐ฅ < 0;(8.4)
and in case of ๐ผ = 1,
๐ข(๐ฅ) ={log ๐ฅ ๐ฅ โฅ 0,โโ ๐ฅ < 0.
Definition 8.1 (HARA utilities). Hyperbolic risk aversion (HARA) utilities are๐ (๐ค) = 1โ๐พ๐พ
(๐๐ค1โ๐พ + ๐
)๐พ.
8.2 ARROWโPRATT MEASURE OF ABSOLUTE RISK AVERSION
Definition 8.2. The local risk aversion coefficient at ๐ (cf. ArrowโPratt measure of absoluterisk-aversion (ARA), also known as the coefficient of absolute risk), is
๐ด(๐) = โ๐ขโฒโฒ(๐)๐ขโฒ(๐) ,
11902 (in Gรถrlitz) โ 197721903 โ 195831921โ2017, Nobel memorial price in economic sciences in 197241931
57
58 EXPECTED UTILITY THEORY
the coefficient for relative risk aversion is
๐ (๐) = โ๐ ยท ๐ขโฒโฒ(๐)
๐ขโฒ(๐) .
For a motivation consider the Taylor-series expansion ๐ข(๐ฆ) โ ๐ข(๐ฅ)+(๐ฆโ๐ฅ)๐ขโฒ(๐ฅ)+ (๐ฆโ๐ฅ)2
2 ๐ขโฒโฒ(๐ฅ).At ๐ฅ = E๐ , ๐ฆ = ๐ and after taking expectations we obtain
E ๐ข(๐ ) โ ๐ข(E๐ ) + E(๐ โ E๐ )๐ขโฒ(E๐ ) + E(๐ โ E๐ )2
2๐ขโฒโฒ(E๐ ) = ๐ข(E๐ ) + var๐
2๐ขโฒโฒ(E๐ ). (8.5)
Now apply a Taylor-series expansion to the inverse ๐ขโ1(๐ฅ) โ ๐ขโ1(๐ฆ) + ๐ฅโ๐ฆ๐ขโฒ (๐ขโ1 (๐ฆ)) with ๐ฅ = E ๐ข(๐ )
and ๐ฆ = ๐ข(E๐ ) to get
๐ขโ1(E ๐ข(๐ )) โ E๐ + E ๐ข(๐ ) โ ๐ข(E๐ )๐ขโฒ(E๐ ) โ E๐ + ๐ขโฒโฒ(E๐ )
2๐ขโฒ(E๐ )๏ธธ ๏ธท๏ธท ๏ธธArrowโPratt at E๐
ยท var๐,
by (8.5).
Example 8.3. The risk aversion coefficient is โ๐พ (thus constant) for the utility function (8.2),while ๐ด(๐) = ๐ผ
๐and ๐ (๐) = ๐ผ for the utility given in (8.3).
Example 8.4. Consider ๐ข(๐ฅ) = log ๐ฅ, then ๐ด(๐) = โ๐ขโฒโฒ (๐)๐ขโฒ (๐) =
1๐.
8.3 EXAMPLE: ST. PETERSBURG PARADOX5
Consider the following game. A fair coin is tossed until heads appears for the first time andsuppose this happens at the Nth toss. The player will then get 2๐โ1 euros. What is the fairamount a player should pay in order to play the game?
The fee is given by the expected payout. By definition of the geometric distribution:
๐(๐ = ๐) = 2โ๐ ๐ = 1, 2, . . .
Therefore the expected payout is
E 2๐โ1 =โโ๐=12โ๐2๐โ1 =
โโ๐=1
12= โ.
This result obviously does not make sense. Several approaches were developed byN. Bernoulli and G. Cramer. Insetad of calculating the expected payout E
[2๐โ1
], ๐ =
๐ขโ1(E
[๐ข(2๐โ1)
] )with ๐ข(๐ฅ) = log(๐ฅ) or ๐ข(๐ฅ) =
โ๐ฅ is calculated. In the case of ๐ข(๐ฅ) = log(๐ฅ), it
follows that
E log 2๐โ1 =โโ๐=12โ๐ (๐ โ 1) log 2 = log(2)
โโ๐=0
๐ ยท 2โ๐
=log(2)4
โโ๐=0
๐2๐โ1 =log 24
1(1 โ 12
)2 = log 2
5by ruben schlotter
rough draft: do not distribute
8.4 PREFERENCES AND UTILITY FUNCTIONS 59
and ๐ = ๐log 2 = 2.In case of ๐ข(๐ฅ) =
โ๐ฅ,
Eโ2๐โ1 =
โโ๐=12
๐โ12 2โ๐ =
1โ2
โโ๐=1
(โ22
) ๐=1โ2
โ22
1 โโ22
=1
2 โโ2
and therefore ๐ =(2 โโ2)โ2โ 2.914.
The expected payout was weighted with ๐ข(ยท) which yields a finite value. The weightingwith ๐ข can be interpreted as giving less importance to very high payouts which have smallprobabilities of occurring.
Remark. Note the shape of both log(๐ฅ) andโ๐ฅ. Such functions are called utility functions.
8.4 PREFERENCES AND UTILITY FUNCTIONS
The main aim is to
โฒ model decisions under uncertainty
โฒ compare random payouts (lotteries)
Definition 8.5. A function ๐น : Rโ [0, 1] is called (cumulative) distribution function on R, if
โฒ F is monotone increasing and right continuous.
โฒ lim{๐ฅโโโ ๐น (๐ฅ) = 0 and lim{๐ฅโโ ๐น (๐ฅ) = 1
LetM be the set of all distribution functions onR. We define a relation onM. A preferenceonM is a relation 4 such that
โฒ ๐น 4 ๐น for all ๐น โ M
โฒ (๐น 4 ๐บ) โง (๐บ 4 ๐ป) implies that ๐น 4 ๐ป for all ๐น, ๐บ, ๐ป โ M
โฒ Either (๐น 4 ๐บ) or (๐บ 4 ๐น)
๐น 4 ๐บ is interpreted as ๐บ is preferred over ๐น. ๐น and ๐บ are called equivalent, denoted by๐น โผ๐บ if๐น 4 ๐บ and๐บ 4 ๐น. The preference4satisfies the continuity axiom if for all ๐น, ๐บ, ๐ป โ Msuch that ๐น 4 ๐บ 4 ๐ป there exists an ๐ผ โ [0, 1] with
(1 โ ๐ผ)๐น + ๐ผ๐ป โผ ๐บ.
The preference4satisfies the independence of irrelevant alternatives axiom, if for all ๐น, ๐บ, ๐ป โM and all ๐ผ โ [0, 1]
๐น 4 ๐บ โโ (1 โ ๐ผ)๐น + ๐ผ๐ป 4 (1 โ ๐ผ)๐บ + ๐ผ๐ป
Remark. The continuity axiom means that โgoodโ and โbadโ risks can be pooled into anaverage one.
Version: May 28, 2021
60 EXPECTED UTILITY THEORY
A preference4has a numerical representation if there is a mapping ๐ : M โ [โโ,โ) ,such that
๐น 4 ๐บ โโ ๐ (๐น) โค ๐ (๐บ)
This representation has a von Neumann-Morgenstern representation if there exists an-other function ๐ข : Rโ [โโ,โ) such that for all ๐ with (cumulative) distribution function ๐น
๐ (๐น) = E ๐ข(๐)
Theorem 8.6. Let 4 be a preference onM. Then the following are equivalent
(i) 4satisfies the continuity and the independence axiom
(ii) 4 has a vNM representation
Definition 8.7. ๐ข : Rโ [โโ,โ) is called Bernoulli-utility function if ๐ข is monotone increasingand strictly concave.
Theorem 8.8. Let 4be a preference with a vNM-representation. Then ๐ข is a Bernoulli utilityfunction if and only if
(i) ๐ 4 ๐ โโ ๐ โค ๐ for all ๐, ๐ โ R
(ii) ๐ 4 E ๐
Let๐ข be a Bernoulli utility function (wrt. .: (4๐ข) and ๐ have finite expectation then definethe certainty equivalent of ๐ by
๐ B ๐(๐, ๐ข) โ R, such that ๐ โผ๐ข ๐
In the introductory example the certainty equivalent of the payout ๐ with respect to 2different Bernoulli utility functions was calculated.
rough draft: do not distribute
9Stochastic Orderings
Ich bin so glรผcklich, ich habe meinenPosten verloren. Mein Chef ist nรคmlichin Konkurs gegangen. Mich bringtniemand mehr in ein Bankhaus.
Arnold Schรถnberg, 1874โ1951, anDavid Josef Bach
A particular utility function ๐ข(ยท) is occasionally considered as artifact which specifies avery particular and individual personal preference. Different investors might employ verydifferent utility functions to express their individual preference.
Some concepts of stochastic orderings robustify decisions by replacing a single utilityfunction by a class of functions, so that (8.1) holds for all of them.
9.1 STOCHASTIC DOMINANCE OF FIRST ORDER
Definition 9.1. For R-valued random variables ๐ and ๐ we say that ๐ is dominated by ๐ infirst order stochastic dominance,
๐ 4(1) ๐ or ๐ 4๐น๐๐ท ๐,
if (8.1) holds for all ๐ข โ U๐น๐๐ท B {๐ข : Rโ R nondecreasing} for which the integrals exists.
Remark 9.2. Not that the function ๐ข(๐ฅ) B ๐ฅ in nondecreasing, so that E ๐ โค E๐ whenever๐ 4(1) ๐ .
Remark 9.3. The relation๐ โค ๐ almost surely
(cf. Definition 5.1 (i)) is occasionally referred to as stochastic dominance of order 0 anddenoted ๐ 4(0) ๐ .
Theorem 9.4. The following are equivalent:
(i) ๐ 4(1) ๐ ,
(ii) ๐น๐ (ยท) โฅ ๐น๐ (ยท), i.e., ๐(๐ โค ๐ง) โฅ ๐(๐ โค ๐ง) for all ๐ง โ R and
(iii) ๐นโ1๐(ยท) โค ๐นโ1
๐(ยท), i.e., V@R๐ผ (๐) โค V@R๐ผ (๐ ) for all ๐ผ โ (0, 1).
Proof. The function ๐ข๐ง (๐ฅ) B 1(๐ง,โ) (๐ฅ) is nondecreasing and thus ๐ข๐ง โ U๐น๐๐ท. Note thatE ๐ข๐ง (๐) = ๐(๐ > ๐ง), thus (8.1) is equivalent to ๐น๐ (๐ง) = ๐(๐ โค ๐ง) = 1โ๐(๐ > ๐ง) = 1โE ๐ข๐ง (๐) โฅ1 โ E ๐ข๐ง (๐ ) = 1 โ ๐(๐ > ๐ง) = ๐(๐ โค ๐ง) = ๐น๐ (๐ง), so that (ii) follows from (i).
As for the converse note that every nondecreasing function ๐ข(ยท) may be approximated bya simple step function ๐ข๐ (ยท) =
โ๐๐=1 ๐ผ๐ 1(๐ง๐ ,โ) (ยท) with ๐ผ๐ > 0 so that |E ๐ข(๐) โ E ๐ข๐ (๐) | < Y and
61
62 STOCHASTIC ORDERINGS
probabilities 40% 20% 40%
๐0 = ๐ 2 4 4๐1 4 4 2
12 (๐0 + ๐1) 3 4 3
(a) The relation 4(1) is not convex, cf. Re-mark 9.5. Note that ๐0 โ ๐1, but ๐น๐0 = ๐น๐1 .
probabilities 40% 20% 10% 30%
๐ 0 2 3 3๐ 1 1 1 4
(b) ๐ $(1) ๐ , but ๐ 4(2) ๐ , cf. Figure 9.1
Table 9.1: Counterexamples
|E ๐ข(๐ ) โ E ๐ข๐ (๐ ) | < Y. With (ii) it follows that E ๐ข๐ (๐) =โ๐
๐=1 ๐ผ๐ ๐(๐ > ๐ง๐) โคโ๐
๐=1 ๐ผ๐ ๐(๐ >
๐ง๐) = E ๐ข๐ (๐ ), so that stochastic dominance in first order follows.It is evident that (ii) and (iii) are equivalent. ๏ฟฝ
Remark 9.5. Exercise 9.5 (Table 9.1a) demonstrates that the order 4(1) is not convex, i.e.,the sets
{๐ : ๐ 4(1) ๐
}and
{๐ : ๐ 4(1) ๐
}are not convex.
9.2 STOCHASTIC DOMINANCE OF SECOND ORDER
Definition 9.6. For R-valued random variables ๐ and ๐ we say that ๐ is dominated by ๐ insecond order stochastic dominance,
๐ 4(2) ๐ or ๐ 4๐๐๐ท ๐,
if (8.1) holds for all ๐ข โ U๐๐๐ท B {๐ข : Rโ R nondecreasing and concave}, for which theintegrals exists.
Remark 9.7. As in Remark 9.2 we have that that E ๐ โค E๐ whenever ๐ 4(2) ๐ .
Remark 9.8. By Jensenโs inequality ๐ข(E๐ ) โฅ E ๐ข(๐ ). This can be read as possessing (i.e.,not investing) the amount ๐ข(E๐ ) is given preference to investing.
Lemma 9.9. The set{๐ : ๐ 4(2) ๐
}is convex (cf. Remark 9.5).
Proof. Let ๐ 4(2) ๐0, ๐ 4(2) ๐1 and ๐ข โ U๐๐๐ท be chosen. Define ๐_ B (1 โ _)๐0 + _๐1. ByJensenโs inequality it holds that ๐ข
((1 โ _)๐0 + _๐1
)โฅ (1 โ _)๐ข(๐0) + _ ๐ข(๐1) and thus
E ๐ข(๐_) = E ๐ข((1 โ _)๐0 + _๐1
)โฅ
Jensenโs inequality(1 โ _)E ๐ข(๐0) + _E ๐ข(๐1)
โฅ (1 โ _)E ๐ข(๐) + _E ๐ข(๐) = E ๐ข(๐)
by Jensenโs inequality and hence ๐ 4(2) ๐_. ๏ฟฝ
Theorem 9.10. The following are equivalent:
(i) ๐ 4(2) ๐ ,
(ii)โซ ๐
โโ ๐น๐ (๐ง) d๐ง โฅโซ ๐
โโ ๐น๐ (๐ง) d๐ง for all ๐ โ R and
(iii)โซ ๐ผ
0 ๐นโ1๐(๐ข) d๐ข โค
โซ ๐ผ
0 ๐นโ1๐(๐ข) d๐ข (this is what is called the absolute Lorentz function) for
๐ผ โ (0, 1),
rough draft: do not distribute
9.2 STOCHASTIC DOMINANCE OF SECOND ORDER 63
0 1 2 3 40
0.2
0.4
0.6
0.8
1
๐ฅ
๐น๐ (๐ฅ)๐น๐ (๐ฅ)
(a) Cumulative distribution function, Theorem 9.4(ii)
0 0.2 0.4 0.6 0.8 10
1
2
3
4
๐ผ
V@R๐ผ (๐)V@R๐ผ (๐ )
(b) Generalized inverse, V@R, Theorem 9.4(iii)
0 1 2 3 40
0.5
1
1.5
2
2.5
๐ฅ
โซ ๐ฅ
โโ ๐น๐ (๐ฆ) d๐ฆโซ ๐ฅ
โโ ๐น๐ (๐ฆ) d๐ฆ
(c) Integrated cdf, Theorem 9.10(ii)
0 0.2 0.4 0.6 0.8 10
0.5
1
1.5
๐ผ
โซ ๐ผ
0 V@R๐ข (๐) d๐ขโซ ๐ผ
0 V@R๐ข (๐ ) d๐ข
(d) Integrated V@R, Theorem 9.10(iii)
Figure 9.1: Random variables ๐ (blue) and ๐ (orange) from Table 9.1b
Version: May 28, 2021
64 STOCHASTIC ORDERINGS
(iv) โAV@R๐ผ (โ๐) โค โAV@R๐ผ (โ๐ ) for all ๐ผ โ (0, 1).1
Proof. By RiemannโStieltjes integration by parts we have thatโซ ๐
โโ๐น๐ (๐ฅ) d๐ฅ = ๐ฅ ยท ๐น๐ (๐ฅ) |๐๐ฅ=โโ โ
โซ ๐
โโ๐ฅ d๐น๐ (๐ฅ) =
โซ ๐
โโ๐ โ ๐ฅ d๐น๐ (๐ฅ) (9.1)
=
โซ โ
โโ(๐ โ ๐ฅ)+ d๐น๐ (๐ฅ) = โE ๐ข๐ (๐),
where๐ข๐ (๐ฅ) B โ(๐ โ ๐ฅ)+.
The function ๐ข๐ (ยท) is nondecreasing and concave, hence it follows from (8.1) that E ๐ข๐ (๐) โคE ๐ข๐ (๐ ) and thus the assertion (ii).
A nondecreasing and concave function ๐ข(ยท) โ U๐๐๐ท can be approximated by ๐ข๐ (๐ฅ) Bโ๐๐=1 ๐ผ๐ ยท ๐ข๐๐ (๐ฅ) where ๐ผ๐ > 0 so that |E ๐ข(๐) โ E ๐ข๐ (๐) | < Y and |E ๐ข(๐ ) โ E ๐ข๐ (๐ ) | < Y (see
Mรผller and Stoyan [2002] for details). Assertion (i) then follows by combining (9.1) and (ii).Define
๐บ๐ (๐) Bโซ ๐
โโ๐น๐ (๐ฅ) d๐ฅ and ๐บโ1๐ (๐ผ) B
โซ ๐ผ
0๐นโ1๐ (๐) d๐.
Recall Youngโs inequality (15.7), i.e.,
๐ ๐ผ โคโซ ๐
0๐น๐ (๐ฅ) d๐ฅ๏ธธ ๏ธท๏ธท ๏ธธ๐บ๐ (๐)
+โซ ๐ผ
0๐นโ1๐ (๐) d๐๏ธธ ๏ธท๏ธท ๏ธธ๐บโ1
๐(๐ผ)
.
From (ii) we deduce that ๐ ๐ผ โ ๐บ๐ (๐) โค ๐ ๐ผ โ ๐บ๐ (๐) and thus
๐บโ1๐ (๐ผ) = sup๐โR{๐ ๐ผ โ ๐บ๐ (๐)} โค sup
๐โR{๐ ๐ผ โ ๐บ๐ (๐)} = ๐บโ1๐ (๐ผ).
The converse follows by the same reasoning.As for (iv) recall that E ๐ข๐ (๐) โค E ๐ข๐ (๐ ) is equivalent to E(๐โ๐)+ โฅ E(๐โ๐ )+, which holds
true for every ๐ โ R. Thus ๐ + 11โ๐ผ E(โ๐ โ ๐)+ โฅ ๐ +
11โ๐ผ E(โ๐ โ ๐)+, from which assertion (iv)
follows after taking the infimum with respect to ๐ โ R.As for the converse define ๐โ๐ผ B โV@R๐ผ (โ๐ ) and recall that AV@R๐ผ (โ๐ ) = โ๐โ๐ผ +
11โ๐ผ E(โ๐ + ๐
โ๐ผ)+. It follows that
โ๐โ๐ผ +11 โ ๐ผ E(โ๐ + ๐
โ๐ผ)+ โฅ AV@R๐ผ (โ๐) โฅ AV@R๐ผ (โ๐ ) = โ๐โ๐ผ +
11 โ ๐ผ E
(โ๐ + ๐โ๐ผ
)+ ,
and thus E ๐ข๐โ๐ผ (๐) โค E ๐ข๐โ๐ผ (๐ ). The assertion follows now, as ๐โ๐ผ can be adjusted for every๐ผ โ (0, 1). ๏ฟฝ
Remark 9.11. It is evident that ๐ 4(1) ๐ implies ๐ 4(2) ๐ , but the converse is not true: cf.Exercise 9.6 (Table 9.1b).
1Recall from Remark 5.2 that A(๐ ) B โAV@R๐ผ (โ๐ ) is an acceptability functional.
rough draft: do not distribute
9.3 PORTFOLIO OPTIMIZATION 65
9.3 PORTFOLIO OPTIMIZATION
Let ๐ be a random variable understood as a benchmark (cf. (2.3) in the introduction). ByLemma 9.9, the problem
maximizein ๐ฅ E ๐ฅ>b
subject to ๐ 4(2) ๐ฅ>b
๐ฅ> 1 โค 1C,(๐ฅ โฅ 0)
(9.2)
is a convex optimization problem.The relation ๐ 4๐๐๐ท ๐ฅ>b in (9.2) can be restated as
maximizein ๐ฅ E ๐ฅ>b
subject to โAV@R๐ผ (โ๐) โค โAV@R๐ผ (โ๐ฅ>b) for all ๐ผ โ (0, 1),๐ฅ> 1 โค 1C,(๐ฅ โฅ 0)
(9.3)
so that the problem has infinity many (uncountably many) constraints. We refer to Dentchevaand Ruszczynski [2011] for a discussion and numerical implementation schemes.
9.4 PROBLEMS
Exercise 9.1. Show that
(1 โ ๐ผ) AV@R๐ผ (๐ ) โ ๐ผ AV@R1โ๐ผ (โ๐ ) = E๐ .
Exercise 9.2. Compute the preferences E ๐ข๐พ (๐) โค E ๐ข๐พ (๐ ) for ๐ข๐พ (๐ฅ) B ๐ฅ๐พ
๐พwith ๐พ = 0.5 and
๐พ = 0.6 for the random variable specified in Table 4.2.
Exercise 9.3. Compute the preferences E ๐ข_(๐) for ๐ข_(๐ฅ) B 1 โ ๐โ_๐ฅ with selections of _ toget different preferences (again Table 4.2).
Exercise 9.4. Show by using Theorem 9.4 that we have ๐ $(1) ๐ and ๐ $(2) ๐ for the randomvariable in Table 4.2.
Exercise 9.5. Consider the random variables in Table 9.1a and verify that the set{๐ : ๐ 4(1) ๐
}is not convex.
Exercise 9.6. Verify for the random variables in Table 9.1b that ๐ $(1) ๐ , but ๐ 4(2) ๐ .
Exercise 9.7. Which portfolio is preferable in Table 9.2 if employing the utility function ๐ข(๐ฅ) =๐ฅ^ for ^ โ (0, 1)?
Version: May 28, 2021
66 STOCHASTIC ORDERINGS
probabilities 30% 70%
return ๐1 10% 25%return ๐2 25% 10%
Table 9.2: Return of different portfolios ๐1 and ๐2
rough draft: do not distribute
10Arbitrage
On ne peut vivre de frigidaires, depolitique, de bilans et de mots croisรฉs,voyez-vous! On ne peut plus vivre sanspoรฉsie, couleur ni amour.
Antoine de Saint-Exupรฉry,Lettre au gรฉnรฉral X, 30 juillet 1944
This section follows Cornuejols and Tรผtรผncรผ [2006, Chapter 4].
Definition 10.1. Arbitrage is a trading strategy,
Type A: that has a positive cash flow and no risk of a later loss;
Type B: that requires no initial cash input, has no risk of a loss, and a positive probability ofmaking profits in the future: ๐0 = 0, ๐(๐๐ก โฅ 0) = 1 and ๐(๐๐ก > 0) > 0, where ๐๐ก is theportfolio value at time ๐ก.
10.1 TYPE A
Consider the exchange rates in Table 10.1. Note, that converting any (!) currency forwardsand backwards will result in a loss; for example
1 EUR = 1.12 US$ = 1.12 โ 0.892 EUR = 0.99904 EUR < 1 EUR,
etc.Exchanging a sequence of currencies is a loss as well (in general), e.g.,
1 EUR = 121 JPY= 121 โ 0.00701 GBP= 121 โ 0.00701 โ 1.286 US$= 121 โ 0.00701 โ 1.286 โ 0.892 EUR = 0.97 EUR < 1 EUR. (10.1)
to: EUR US$ GBP JPY1 EUR = 1.12 0.87 121.01 US$ = 0.892 0.777 110.71 GBP = 1.149 1.286 142.61 JPY = 0.0082 0.00900 0.00701
Table 10.1: Exchange rates
67
68 ARBITRAGE
However, the table allows for arbitrage (a free lunch of 1.76%), for example by converting
1 EUR = 1.12 US$= 1.12 โ 0.777 GBP= 1.12 โ 0.777 โ 142.6 JPY= 1.12 โ 0.777 โ 142.6 โ 0.0082 EUR = 1.0176 EUR > 1 EUR. (10.2)
Investing 1C thus will result in a profit of EUR 0.0176 without risk. Note that (10.2) is actuallythe reverse order of (10.1).
How can one detect an opportunity for arbitrage?
Define the variables
๐ธ๐ท, etc.: quantity of EUR (i.e., number of EUR banknotes) changed to US$, etc.
๐ด: quantity of EUR generated by arbitrage.
Then we may consider the optimization problem
maximize ๐ด
subject to 0.892๐ท๐ธ โ ๐ธ๐ท + 1.149๐๐ธ โ ๐ธ๐ + 0.0082๐๐ธ โ ๐ธ๐ โฅ ๐ด, (10.3)1.12๐ธ๐ท โ ๐ท๐ธ + 1.286๐๐ท โ ๐ท๐ + 0.009๐๐ท โ ๐ท๐ โฅ 0,0.87๐ธ๐ โ ๐๐ธ + 0.777๐ท๐ โ ๐๐ท + 0.00701๐๐ โ ๐๐ โฅ 0,121๐ธ๐ โ ๐๐ธ + 110.7๐ท๐ โ ๐๐ท + 142.6๐๐ โ ๐๐ โฅ 0, (10.4)๐ธ๐ท + ๐ธ๐ + ๐ธ๐ โค 1000 EUR, (10.5)๐ธ๐ท โฅ 0, ๐ท๐ธ โฅ 0, etc. and ๐ด โฅ 0.
The constraints (10.3)โ(10.4) are balance equations (conservation equations1) for EUR, USD,GBP and JPY (resp.), while the budget constraint (10.5) limits the total, initial amount avail-able. (0, . . . , 0) is always a feasible solution with profit ๐ด = 0 (convert nothing, no arbitrage).We have found an arbitrage opportunity, if our solver returns a solution with objective ๐ด > 0.
Indeed, this is possible here as ๐ธ๐ท = 1, ๐ท๐ = 1.12, ๐๐ = 1.12 โ 0.777 = 0.87024, ๐๐ธ =
1.12 โ 0.777 โ 142.6 = 124.09 (all other are ๐ธ๐ = ๐ธ๐ = ๐ท๐ธ = ยท ยท ยท = 0) and ๐ด = 0.0176 > 0(cf. (10.2)) is feasible and indeed the optimal solution (up to scaling with 1000, cf. (10.5)).
10.2 TYPE B
Consider as series of options ๐ = 1, . . . ๐ with payoff ฮจ๐ (ยท), written on one single underlyingwith random terminal price ๐. By investing the amount of ๐ฅ๐ in each option (we allow short-selling, i.e., ๐ฅ๐ โค 0 or ๐ฅ๐ โฅ 0), we obtain the random payoff
ฮจ๐ฅ (๐) =๐โ๐=1
๐ฅ๐ ยท ฮจ๐ (๐).
1Erhaltungsgleichungen, in German
rough draft: do not distribute
10.2 TYPE B 69
For call and put options with strike ๐พ๐, the function ฮจ๐ฅ (ยท) is piecewise linear with kinks at thestrikes ๐พ๐ and thus everywhere nonnegative (thus generating arbitrage) in the range of theunderlying ๐ โ [0,โ), if
ฮจ๐ฅ (0) โฅ 0, ฮจ๐ฅ (๐พ ๐) โฅ 0 for all ๐ = 1, . . . , ๐ and ฮจโฒ๐ฅ (๐พ๐๐๐ฅ) โฅ 0,
where ๐พ๐๐๐ฅ B max ๐=1,...๐ ๐พ ๐ is the largest of all strikes.Assume the price of option ๐ is ๐๐ and solve the linear problem
minimize๐ฅโR๐
๐โ๐=1
๐ฅ๐๐๐ (10.6)
subject to๐โ๐=1
๐ฅ๐ ยท ฮจ๐ (0) โฅ 0,
๐โ๐=1
๐ฅ๐ ยท ฮจ๐ (๐พ ๐) โฅ 0 for ๐ = 1, . . . , ๐ and
๐โ๐=1
๐ฅ๐ ยท(ฮจ๐ (๐พ๐๐๐ฅ + 1) โ ฮจ๐ (๐พ๐๐๐ฅ)
)โฅ 0.
Then there is type B arbitrage, if the objective (10.6) โค 0 or unbounded; no arbitrage ispossible, if (10.6) > 0.
Example 10.2. How should one modify problem (10.6) to incorporate interest, for examplebecause the options are exercised in 1 year, e.g.?
Version: May 28, 2021
70 ARBITRAGE
rough draft: do not distribute
11The Flowergirl Problem1
Sell in May and go away.
investment strategy
11.1 THE FLOWERGIRL PROBLEM
Example 11.1 (The flowergirl problem, cf. [Pflug and Pichler, 2014]). A flowergirl has todecide how many flowers she orders from the wholesaler.
โฒ She
โ buys for the price ๐ per flower and
โ sells them for a price ๐ > ๐.
โ The random demand is b.
โ If the demand is higher than the available stock, she may procure additional flow-ers for an extra price ๐ > ๐.
โ Unsold flowers may be returned for a price of ๐ < ๐
โฒ What is the optimal order quantity ๐ฅโ, if the expected profit should be maximized?
We formulate the profit as negative costs (expenses minus revenues) are
total costs B initial purchase = ๐ ๐ฅ
โ revenue from sales โ ๐ b+ extra procurement costs + ๐ [b โ ๐ฅ]+โ revenue from returns. โ ๐ [b โ ๐ฅ]โ;
here, [๐]+ = max {๐, 0} is the positive part of ๐ and [๐]โ = max {โ๐, 0} is the negative part of๐. Since ๐ = [๐]+ โ [๐]โ, the cost function may be rewritten as
๐(๐ฅ, b) = (๐ โ ๐) ๐ฅ โ (๐ โ ๐) b + (๐ โ ๐) [b โ ๐ฅ]+
= (๐ โ ๐){๐ฅ + 11 โ ๐โ๐
๐โ๐[b โ ๐ฅ]+
}โ (๐ โ ๐) b.
Since AV@R๐ผ (๐ ) = min{๐ฅ + 1
1โ๐ผ E[๐ โ ๐ฅ]+ : ๐ฅ โ R}
(see Average Value-at-Risk below) it fol-lows that
min๐ฅโR
E๐(๐ฅ, b) = (๐ โ ๐) AV@R๐ผ (b) โ (๐ โ ๐) E b,
1Also: Newsboy, or Newsvendor problem
71
72 THE FLOWERGIRL PROBLEM2
where ๐ผ B ๐โ๐๐โ๐ .
To determine the order quantity consider the function
๐ฅ โฆโ (๐ โ ๐){๐ฅ + 11 โ ๐โ๐
๐โ๐E[b โ ๐ฅ]+
}โ (๐ โ ๐)E b (11.1)
and its derivative
0 = (๐ โ ๐){1 โ 11 โ ๐ผ E1b>๐ฅ
}= (๐ โ ๐)
{1 โ 11 โ ๐ผ๐ (b > ๐ฅ)
}.
Note next that this is equivalent to ๐ผ = ๐(b โค ๐ฅ). The optimal procurement quantity of theflowergirl thus has the explicit expression
๐ฅโ = V@R๐ผ (b) = V@R ๐โ๐๐โ๐(b) = ๐นโ1b
(๐ โ ๐๐ โ ๐
)(see, e.g., Pflug and Rรถmisch [2007, page 56]).
11.2 PROBLEMS
Exercise 11.1. Show that (11.1) is convex.
rough draft: do not distribute
12Duality For Convex Risk Measures
Risk measures, as introduced in Section 5, are convex. They hence have a representationR(๐ ) = sup {๐ฅโ(๐ง) โ Rโ(๐งโ) : ๐งโ โ ๐โ}. To this end we specify the domain, its dual and the innerproduct ๐ฅโ(๐ง).
Typical candidates for the domain of risk measures are ๐ฟ ๐ spaces, particularly ๐ฟโ. Recallthat the duals are ๐ฟ๐. Note further that the canonical inner product for the ๐ฟ ๐ โ ๐ฟ๐-duality is
๐ฟ ๐ ร ๐ฟ๐ 3 (๐, ๐) โฆโ E๐๐.
Proposition 12.1. Suppose that the risk measure R : ๐ฟ ๐ โ R โช {โ} is positively homoge-neous. Then Rโ(๐) โ {0,โ}.
Proof. Note that
Rโ(๐) = sup๐ โ๐ฟ๐
E๐๐ โ R(๐ )
= sup_โR
_ ยท sup๐ โ๐ฟ๐
(E๐๐ โ R(๐ )) โ {0,โ} .
๏ฟฝ
Proposition 12.2. Suppose that the risk measure R : ๐ฟ ๐ โ Rโช{โ} is translation equivariant.Then Rโ(๐) = โ unless E ๐ = 1.
Proof.
Rโ(๐) = sup๐ โ๐ฟ๐
E๐๐ โ R(๐ )
= sup๐ โ๐ฟ๐
sup๐โR(E(๐ + ๐)๐ โ R(๐ + ๐)) = sup
๐ โ๐ฟ๐
E(๐ )๐ โ R(๐ ) + sup๐โR
๐ (E ๐ โ 1) ,
from which the assertion follows. ๏ฟฝ
Proposition 12.3. Suppose that the risk measure R : ๐ฟ ๐ โ R โช {โ} is monotone. ThenRโ(๐) = โ unless ๐ โฅ 0 almost surely.
Proof. Suppose that ๐(๐ โค 0) > 0. Set ๐ด B {๐ โค 0} and ๐0 B 1๐ด. Note, that โ๐0 โค 0, henceR(โ๐0) โค R(0) and
Rโ(๐) = sup๐ โ๐ฟ๐
E๐๐ โ R(๐ )
โฅ sup_<0(E_๐0๐ โ R(_๐0)) โฅ sup
_<0E_ 1๐ด ๐ โ R(0) = โ.
Hence, ๐ โฅ 0 a.s. ๏ฟฝ
Definition 12.4. The support function of a set Z is ๐ Z (๐ ) B sup๐ โZ E๐๐.
Theorem 12.5. Define Z B {๐ : Rโ(๐) < โ}.
73
74 DUALITY FOR CONVEX RISK MEASURES
rough draft: do not distribute
13Stochastic Optimization: Terms, and Definitions, and theDeterministic Equivalent
13.1 EXPECTED VALUE OF PERFECT INFORMATION (EVPI) AND VALUE OF
STOCHASTIC SOLUTION (VSS)
The expected value of perfect information (EVPI) is the price that one would be willing to payin order to gain access to perfect information, that is to say the difference SP โ wait-and-see,
E[min๐ฅ๐ (๐ฅ, b)
]๏ธธ ๏ธท๏ธท ๏ธธ
wait-and-see
โค min๐ฅE ๐ (๐ฅ, b)๏ธธ ๏ธท๏ธท ๏ธธSP
โค E ๐ (๐ฅ, b). (13.1)
Both inequalities hold always.For ๐ (๐ฅ, ยท) concave Jensenโs inequality continues the sequence of inequalities above with
E ๐ (๐ฅ, b) โค ๐ (๐ฅ,E b) ,
drawing some attention to the strategy ๐ฅ0 โ argmin ๐ (๐ฅ,E b) (if this exists at all): Giventhis reference strategy ๐ฅ0 the distance RHS โ SP in (13.1) is called Value of the StochasticSolution (VSS). However, in a general context ๐ (๐ฅ, ยท) are rather convex and no comparisonof ๐ (๐ฅ0,E b) with (SP) is possible in (13.1) for this case (counter-example: Farmer Ted).
13.2 THE FARMER TED
See Jeffโs lecture, http://homepages.cae.wisc.edu/โผlinderot/classes/ie495/lecture2.pdf.
13.3 THE RISK-NEUTRAL PROBLEM
Two-stage stochastic linear program with fixed recourse:
minimize ๐>๐ฅ + Eb
[๐>b๐ฆ b
]= ๐ (๐ฅ) + EQ (๐ฅ, ยท)
(in ๐ฅ and ๐ฆ)subject to ๐ด๐ฅ = ๐ 1st stage constraints
๐b ๐ฅ +๐๐ฆ b = โb for a.e. b โ ฮ 2nd stage constraints๐ฅ โ ๐, ๐ฆ b โ ๐ for a.e. b โ ฮ
(13.2)
Note, that minimization is done over a deterministic ๐ฅ and a random variable ๐ฆ.
75
76STOCHASTIC OPTIMIZATION: TERMS, AND DEFINITIONS, AND THE DETERMINISTIC
EQUIVALENT
13.4 GLOSSARY/ CONCEPT/ DEFINITIONS:
โฒ(๐ฅ, ๐ฆ b
)โฆโ ๐>๐ฅ + Eb
[๐>๐ฆ b
]is the objective function;
โ ๐ฅ is called here-and-now decision (solution), 1st stage decision;
โ the (optimal) random variable ๐ฆ b is called wait-and-see decision (solution), 2๐๐stage decision or recourse action;
โฒ ๐ (deterministic) costs;
โฒ ๐: vector of recourse costs, which is sometimes considered random as well;
โฒ ๐ is the recourse matrix. Fixed recourse is given, if โ as in (13.2) โ ๐ = ๐ (b) (i.e., thematrix is deterministic/ nonrandom);
โฒ ๐b are sometimes called technology matrices;
โฒ ๐ : feasible set of recourse actions;
โฒ the function
๐ฃ๐ (๐ง) B{min๐ฆโ๐ {๐>๐ฆ : ๐๐ฆ = ๐ง} if feasible+โ else
is called second stage value function or recourse (penalty) function;
โฒ then defineQ (๐ฅ, b) B ๐ฃ
(โb โ ๐b ๐ฅ
)= min
๐ฆโ๐
{๐>๐ฆ : ๐๐ฆ = โb โ ๐b ๐ฅ
},
(notice: ๐ฅ โฆโ Q (๐ฅ, b) is lsc.)
โฒ andQ (๐ฅ) B Eb [Q (๐ฅ, b)] = Eb
[๐ฃ(โb โ ๐b ๐ฅ
) ]is called expected value function, or expected minimum recourse function.
โฒ A recourse is relatively complete if ๐ด๐ฅ = ๐, ๐ฅ โฅ 0 implies Q (๐ฅ, b) < โ for a.e. b โ ฮ.
โฒ A recourse is complete if โ๐ง : ๐ฃ (๐ง) < โ (i.e., there always exists a feasible recourseaction, โ๐ง โ๐ฆ : ๐๐ฆ = ๐ง). As a consequence, Q (๐ฅ, b) < โ.
13.5 KKT FOR (13.2)
โฒ ๐ฃ is a LP itself and consequently, from duality,
๐ฃ (๐ง) = min๐ฆโฅ0
{๐>๐ฆ : ๐๐ฆ = ๐ง
}= max
_
{_>๐ง : _>๐ โค ๐>
}(13.3)
where additionally _โ> โ ๐๐ฃ (๐ง) (_โ being the optimal (argmax) solution of the dual prob-lem).
โฒ From the chain rule, ๐๐ฅQ (๐ฅ, b) = ๐๐ฅ๐ฃ(โb โ ๐b ๐ฅ
)3 โ_โ>
b๐b .
rough draft: do not distribute
13.6 DETERMINISTIC EQUIVALENT 77
โฒ Suppose further the probability space is discrete, that is P =โ
b โฮ ๐ b ยท๐ฟb (๐ b B P [{b}]),then
Q (๐ฅ) = Eb [Q (๐ฅ, b)] =โb โฮ
๐ bQ (๐ฅ, b) . (13.4)
For ๐ข>bB โ_โ>
b๐b โ ๐๐ฅQ (๐ฅ, b) thus ๐ข> B Eb
[๐ข>b
]=
โb โฮ ๐ b๐ข
>bโ ๐Q (๐ฅ).
KKT, applied to the problem (13.2):๐ฅโ is an optimal solution of (13.2) iff โ_โ, `โ โฅ 0 st.
(i) 0 โ ๐> + ๐Q (๐ฅโ) + _โ>๐ด โ `โ>,
(ii) `โ>๐ฅโ = 0.
13.6 DETERMINISTIC EQUIVALENT
Given the situation, that ฮ consists of finitely many (๐ B |ฮ|) atoms, Equation (13.2) can bereformulated in its deterministic equivalent, i.e.
minimize ๐>๐ฅ+ ๐ b1๐>๐ฆ b1+ ๐ b2๐
>๐ฆ b2+ . . . +๐ b๐๐>๐ฆ b๐
(in x and y)subject to ๐ด๐ฅ = ๐
๐b1๐ฅ +๐๐ฆ b1 +0 . . . +0 = โb1
๐b2๐ฅ +0 +๐๐ฆ b2. . .
... = โb2...
.... . .
. . . +0...
...
๐b๐๐ฅ +0 . . . +0 +๐๐ฆ๐ = โb๐๐ฅ โ ๐ ๐ฆ b1 โ ๐ ๐ฆ b2 โ ๐ ๐ฆ b๐ โ ๐
(13.5)
where ๐ b B P [{b}].NB: (13.5) is a big LP (often too big, indeed), but linear and sparse. The size increases,
as the number of atoms (scenarios) ๐ increases. However, we expect that a lot of theseconstraints are redundant and we want to exploit this presumption.
13.7 L-SHAPED METHOD
i.e., Benderโs decomposition, applied to (13.5).Let _โ
b(๐ฅ)> โ argmax_
{_>
(โb โ ๐b ๐ฅ
): _>๐ โค ๐
}be an optimal, dual solution to the re-
course problem in scenario b, then ๐ข (๐ฅ)> B โโb ๐ b_
โ>b(๐ฅ) ๐b โ ๐Q (๐ฅ). Thus,
Q (๐ฅ) + ๐ข (๐ฅ)> (๐ฅ โ ๐ฅ) โค Q (๐ฅ) ,
hence ๐ฅ โฆโ Q (๐ฅ) + ๐ข (๐ฅ)> (๐ฅ โ ๐ฅ) is a supporting hyperplane, supporting Q from below. So isthe bundle,
Q๐ฟ (๐ฅ) B max๐โ๐ฟQ (๐ฅ๐) + ๐ข>๐ (๐ฅ โ ๐ฅ๐) โค Q (๐ฅ)
(where ๐ข๐ B ๐ข (๐ฅ๐)).
Version: May 28, 2021
78STOCHASTIC OPTIMIZATION: TERMS, AND DEFINITIONS, AND THE DETERMINISTIC
EQUIVALENT
13.8 FARKASโ LEMMA
Lemma 13.1 (A Theorem on the Alternative). Exactly one of these following two statementsholds true:
โฒ There exists ๐ฆ such that ๐๐ฆ = ๐ง and ๐ฆ โฅ 0;
โฒ There exists ๐ such that ๐>๐ โค 0 and ๐>๐ง > 0.
13.9 L-SHAPED ALGORITHM.
The initial problem (13.2) can be restated equivalently, getting rid of the random variable inthe objective function at the same time, as
(13.2) โโ
minimize (in x) ๐>๐ฅ + Q (๐ฅ)subject to ๐ด๐ฅ = ๐,
๐ฅ โ ๐โโ min
๐ฅโ๐
{๐>๐ฅ + Q (๐ฅ) : ๐ด๐ฅ = ๐
}โโ
minimize (in x, \) ๐>๐ฅ + \subject to ๐ด๐ฅ = ๐,
Q (๐ฅ) โค \,๐ฅ โ ๐
Algorithm 13.1 is based on the previous observation of supporting hyperplanes and the latter,equivalent formulation:
13.10 VARIANTS OF THE ALGORITHM.
โฒ Multicut: In view of linearity of (13.4) and (13.7) we may equally well start with the set
B B {(๐ฅ, \1, . . . \๐) : ๐ฅ โฅ 0, ๐ด๐ฅ = ๐, \๐ โฅ \0}
and solve the problem
min
{๐>๐ฅ +
โb
๐ b \ b : (๐ฅ, \1, . . . , \๐) โ B}
instead of (13.6) โ the problem is said to be separable into scenario sub-problems.
โ The abort criterion reads: Q (๐ฅ) โค โb ๐ b \ b?
โ The feasibility cut (singular!) remains unchanged, as they do not involve \s.โ The new optimality cuts (plural!) read
B โ B โฉโ
b : Q( ๏ฟฝ๏ฟฝ, b )>\b
{(๐ฅ, \1, . . . \ b . . . , \๐
): \ b โฅ Q (๐ฅ, b) + ๏ฟฝ๏ฟฝ>b (๐ฅ โ ๐ฅ)
}.
โฒ Chunked multicut: The same idea as multicut, but with a few b-clusters instead of theentire ฮ: ฮ = {b1, . . . b๐} = ยค
โ๐ถ
๐=1๐๐ . Define Q [๐๐ ] (๐ฅ) Bโ
b โ๐๐ ๐ bQ (๐ฅ, b) and proceedas for the multicut version.
rough draft: do not distribute
13.10 VARIANTS OF THE ALGORITHM. 79
Algorithm 13.1 L-Shaped Method
(i) Find \0 such that \0 โค Q (๐ฅ) (for all ๐ฅ) and define
B B {(๐ฅ, \) : ๐ฅ โฅ 0, ๐ด๐ฅ = ๐, \ โฅ \0} .
(B โ to some extent โ characterizes the epi-graph of the approximate Q๐ฟ and we haveB โ epiQ; if not available then choose \0 B โโ.)
(ii) Solve the problemmin
{๐>๐ฅ + \ : (๐ฅ, \) โ B
}(13.6)
and call the solution found(๐ฅ, \
).
(iii) Compute Q (๐ฅ).
(a) If Q (๐ฅ) โค \ < โ, then ๐ฅ is the best (i.e., minimal) solution of our original problem.
(b) Optimality cut : if \ < Q (๐ฅ) < โ, then put
๏ฟฝ๏ฟฝ> B โโb โฮ
๐ b_โb (๐ฅ)
> ๐b โ ๐Q (๐ฅ) (13.7)
and send the additional hyperplane, that is
B โ B โฉ{(๐ฅ, \) : \ โฅ Q (๐ฅ) + ๏ฟฝ๏ฟฝ> (๐ฅ โ ๐ฅ)
}.
Continue with step (ii).
(c) Feasibility cut : It holds that Q (๐ฅ) = โ. In this case there exists b, such that๐ฃ
(โ b โ ๐b ๐ฅ
)= โ, i.e.,
{๐>๐ฆ : ๐๐ฆ = โ b โ ๐b ๐ฅ
}= {}, i.e.
{๐ฆ : ๐๐ฆ = โ b โ ๐b ๐ฅ
}= {}.
Hence, โ b โ ๐b ๐ฅ is not feasible for ๐ฃ and thus ๐ฅ is not feasible for Q. Thus, byFarkasโ lemma (Lemma 13.1), find an appropriate ๏ฟฝ๏ฟฝ, such that ๏ฟฝ๏ฟฝ>๐ โค 0 and๏ฟฝ๏ฟฝ>
(โ b โ ๐b ๐ฅ
)> 0. To exclude this particular ๐ฅ for the future send the additional
conditionB โ B โฉ
{(๐ฅ, \) : ๏ฟฝ๏ฟฝ>
(โ b โ ๐b ๐ฅ
)โค 0
}and continue with step (ii).
Version: May 28, 2021
80STOCHASTIC OPTIMIZATION: TERMS, AND DEFINITIONS, AND THE DETERMINISTIC
EQUIVALENT
โฒ Numerical experiments show that the algorithm is sometimes flipping around withoutimproving the solution significantly. In order to stop this misbehavior search for animproved local solution, by modifying the objective function as follows:
โ Regularization
min
{๐>๐ฅ +
โ๐
๐๐๐ \๐ : (๐ฅ, \1, . . . , \๐ถ) โ B, ๐ฅ โ ๐ฅ๐ โค ฮ๐
}โ Regularized decomposition method
min
{๐>๐ฅ +
โ๐
๐๐๐ \๐ + ๐ฅ โ ๐ฅ๐ 22๐
: (๐ฅ, \1, . . . , \๐ถ) โ B}
Additional controls on ฮ and ๐ are available here to (intuitively) enforce a direc-tion of (significantly) good descent. The same techniques as for (unconstrained),nonlinear optimization apply here.
โฒ Parallelizing: ๐ข>bB โ_โ>
b๐b โ ๐Q (๐ฅ, b) has to be evaluated, which is an LP for all b โ ฮ
(or chunks). This work can be parallelized, reducing (optimistically) the total time bythe factor 1
# Computers .
โฒ Computing _โ>bโ ๐๐ฃ
(โb โ ๐b ๐ฅ
)is almost the same LP for all b โ ฮ, but without involving
๐ as a dimension: the constrains of the dual function stay unchanged, only the objectivefunction varies (if ๐ is nonrandom). The idea of bunching is to avoid all those evalua-tions and instead build a basis of representative directions such that _โ> = ๐>
๐ต๐โ1
๐ต(cf.
(13.3)). This is particularly advantageous if dimฮ ๏ฟฝ dim ๐. The statement is based onthe following
โ Theorem: Let ๐ฅ be optimal for (LPโ),
* then there is a basis such that ๐>๐โฅ ๐>
๐ต๐โ1
๐ต๐๐ for the appropriate decompo-
sition ๐ = (๐๐ต,๐๐ ) etc.;
* moreover, _โ> B ๐>๐ต๐โ1
๐ตis optimal for (DPโ).
rough draft: do not distribute
14Co- and Antimonotonicity
14.1 REARRANGEMENTS
Theorem 14.1 (Generalized Chebyshevโs sum inequality). Let ๐๐ โฅ 0 withโ๐
๐=1 ๐๐ = 1. Then,for ๐ฅ1 โค ๐ฅ2 ยท ยท ยท โค ๐ฅ๐ and ๐ฆ1 โค ๐ฆ2 ยท ยท ยท โค ๐ฆ๐ it holds that(
๐โ๐=1
๐๐๐ฅ๐
)ยท(
๐โ๐=1
๐๐๐ฆ๐
)โค
๐โ๐=1
๐๐๐ฅ๐๐ฆ๐ .
Proof. Note that๐โ๐=1
๐โ๐=1
๐ ๐ ๐๐ (๐ฅ ๐ โ ๐ฅ๐) (๐ฆ ๐ โ ๐ฆ๐)๏ธธ ๏ธท๏ธท ๏ธธโฅ0
โฅ 0,
as the components are increasing. Hence, by expanding,
0 โค๐โ๐=1
๐โ๐=1
๐ ๐ ๐๐๐ฅ ๐ ๐ฆ ๐ โ ๐ ๐ ๐๐๐ฅ ๐ ๐ฆ๐ โ ๐ ๐ ๐๐๐ฅ๐ ๐ฆ ๐ + ๐ ๐ ๐๐๐ฅ๐ ๐ฆ๐ = 2๐โ๐=1
๐ ๐๐ฅ ๐ ๐ฆ ๐ โ 2๐โ๐=1
๐ ๐๐ฅ ๐
๐โ๐=1
๐๐ ๐ฆ๐ ,
from which the result is immediate. ๏ฟฝ
Theorem 14.2 (Chebyshevโs sum inequality, the continuous version). Let ๐ , ๐ : [0, 1] โ R benondecreasing. Then it holds thatโซ 1
0๐ (๐ฅ) d๐ฅ ยท
โซ 1
0๐(๐ฅ) d๐ฅ โค
โซ 1
0๐ (๐ฅ)๐(๐ฅ) d๐ฅ.
Theorem 14.3 (The rearrangement inequality). Let ๐ฅ1 โค ๐ฅ2 ยท ยท ยท โค ๐ฅ๐ and ๐ฆ1 โค ๐ฆ2 ยท ยท ยท โค ๐ฆ๐.Then
๐โ๐=1
๐ฅ๐+1โ๐๐ฆ๐ โค๐โ๐=1
๐ฅ๐ (๐) ๐ฆ๐ โค๐โ๐=1
๐ฅ๐๐ฆ๐ (14.1)
for every permutation ๐ : {1, . . . , ๐} โ {1, . . . , ๐}.Proof. Suppose the permutation ๐ maximizing (14.1) were not the identity. Then find thesmallest ๐ so that ๐( ๐) โ ๐ . Note, that ๐( ๐) > ๐ and there is ๐ > ๐ so that ๐( ๐) = ๐. Now
๐ < ๐ =โ ๐ฆ ๐ โค ๐ฆ๐ and ๐ < ๐( ๐) =โ ๐ฅ ๐ โค ๐ฅ๐ ( ๐)and thus 0 โค
(๐ฅ๐ ( ๐) โ ๐ฅ ๐
) (๐ฆ๐ โ ๐ฆ ๐
), i.e.,
๐ฅ๐ ( ๐) ๐ฆ ๐ + ๐ฅ ๐ ๐ฆ๐ โค ๐ฅ ๐ ๐ฆ ๐ + ๐ฅ๐ ( ๐) ๐ฆ๐ . (14.2)
Define the permutation exchanging the values ๐( ๐) and ๐(๐), i.e., ๐(๐) B
๐ for ๐ โ {1, . . . , ๐}๐( ๐) if ๐ = ๐๐(๐) else
and
observe that the right hand side of (14.2) is better for ๐(ยท) and ๐( ๐) = ๐ . It follows that ๐(ยท) isthe identity. ๏ฟฝ
81
82 CO- AND ANTIMONOTONICITY
14.2 COMONOTONICITY
Definition 14.4. The random variables ๐๐, ๐ = 1, . . . , ๐ are comonotonic (aka. nondecreas-ing), if (
๐๐ (๐) โ ๐๐ (๏ฟฝ๏ฟฝ))ยท(๐ ๐ (๐) โ ๐ ๐ (๏ฟฝ๏ฟฝ)
)โฅ 0 for all ๐, ๏ฟฝ๏ฟฝ โ ๐ and ๐, ๐ โค ๐ (14.3)
and anti-monotone, if(๐๐ (๐) โ ๐๐ (๏ฟฝ๏ฟฝ)
)ยท(๐ ๐ (๐) โ ๐ ๐ (๏ฟฝ๏ฟฝ)
)โค 0 for all ๐, ๏ฟฝ๏ฟฝ โ ๐ and ๐, ๐ โค ๐,
where ๐(๐) = 1.
Theorem 14.5 (Cf. Denneberg [1994]). Let ๐ and ๐ be R-valued random variables. Thefollowing are equivalent:
(i) ๐ and ๐ are comonotonic;
(ii) there exists an R-valued random variable ๐ and nondecreasing functions ๐ฃ, ๐ค : Rโ R
so that๐ = ๐ฃ(๐) and ๐ = ๐ค(๐);
(iii) there are nondecreasing functions ๐ฃ, ๐ค : Rโ R so that
๐ = ๐ฃ(๐ + ๐ ) and ๐ = ๐ค(๐ + ๐ ).
Remark. The subsequent proof verifies that ๐ฃ and ๐ค are monotone on the range of ๐.
Proof. (๐) =โ (๐๐๐): Define ๐ B ๐ + ๐ . For ๐ง = ๐ (๐) define ๐ฃ(๐ง) B ๐ (๐) and ๐ค(๐ง) B ๐ (๐).To see that ๐ฃ(ยท) and ๐ค(ยท) are well-defined choose ๐1, ๐2 โ ๐โ1({๐ง}), then ๐ (๐1) + ๐ (๐1) =๐ (๐1) = ๐ง = ๐ (๐2) = ๐ (๐2) + ๐ (๐2), and thus
๐ (๐1) โ ๐ (๐2) = โ(๐ (๐1) โ ๐ (๐2)
). (14.4)
If ๐ (๐1) โ ๐ (๐2) โค 0, then ๐ (๐1) โ ๐ (๐2) โฅ 0 by (14.4) and ๐ (๐1) โ ๐ (๐2) โค 0 bycomonotonicity, thus ๐ (๐1) โ ๐ (๐2) = 0 and consequently ๐ (๐1) = ๐ (๐2) by (14.4). Hence,๐ฃ(ยท) and ๐ค(ยท) are well-defined.
To see that ๐ฃ(ยท) and ๐ค(ยท) are monotonic pick ๐1, ๐2 โ ฮฉ with ๐ง1 B ๐ (๐1) โค ๐ (๐2) =: ๐ง2,then we find
๐ (๐1) โ ๐ (๐2) โค โ(๐ (๐1) โ ๐ (๐2)
). (14.5)
If ๐ (๐1) > ๐ (๐2), then ๐ (๐1) < ๐ (๐2) by (14.5), which is in contrast to our assumption oncomonotonicity and hence ๐ (๐1) โค ๐ (๐2); similarly we find that ๐ (๐1) โค ๐ (๐2). It followsthat
๐ฃ(๐ง1) = ๐ (๐1) โค ๐ (๐2) = ๐ฃ(๐ง2) and๐ค(๐ง1) = ๐ (๐1) โค ๐ (๐2) = ๐ค(๐ง2),
i.e., ๐ฃ(ยท) and ๐ค(ยท) are nondecreasing.We shall verify next that ๐ฃ(ยท) and ๐ค(ยท) are Lipschitz with constant 1. Note that we have
๐ฃ(๐ง) + ๐ค(๐ง) = ๐ (๐) + ๐ (๐) = ๐ (๐) = ๐ง.
rough draft: do not distribute
14.2 COMONOTONICITY 83
If ๐ง1 โค ๐ง2, then
๐ง2 โ ๐ง1 = ๐ฃ(๐ง2) + ๐ค(๐ง2) โ ๐ฃ(๐ง1) โ ๐ค(๐ง1) โฅ ๐ฃ(๐ง2) โ ๐ฃ(๐ง1) = |๐ฃ(๐ง2) โ ๐ฃ(๐ง1) |
by monotonicity of ๐ค(ยท); if ๐ง1 โฅ ๐ง2, then
๐ง1 โ ๐ง2 = ๐ฃ(๐ง1) + ๐ค(๐ง1) โ ๐ฃ(๐ง2) โ ๐ค(๐ง2) โฅ ๐ฃ(๐ง1) โ ๐ฃ(๐ง2) = |๐ฃ(๐ง2) โ ๐ฃ(๐ง1) | ,
i.e., ๐ฃ(ยท) and ๐ค(๐ง) = ๐ง โ ๐ฃ(๐ง) are both Lipschitz on ๐ (ฮฉ).Finally note that a Lipschitz function ๐ (ยท) with Lipschitz constant ๐ฟ can be extended to the
entire domain by setting ๐ (๐ง) B inf๐ก โ๐ (ฮฉ) ๐ (๐ก) + ๐ฟ |๐ก โ ๐ง | while keeping the Lipschitz constant๐ฟ (cf. Exercise 14.1).(๐๐๐) =โ (๐๐) is evident by choosing the random variable ๐ B ๐ + ๐ .(๐๐) =โ (๐): Let ๐1, ๐2 โ ฮฉ. To prove (i) we may assume that ๐ (๐1) > ๐ (๐2), i.e., by
assumption, ๐ฃ(๐ (๐1)) > ๐ฃ(๐ (๐2)). As ๐ฃ(ยท) is monotone we deduce that ๐ (๐1) โฅ ๐ (๐2). As๐ค(ยท) is monotone as well it follows that ๐ (๐1) = ๐ค(๐ (๐1)) > ๐ค(๐ (๐2)) = ๐ (๐2) and hence (i),i.e., ๐ and ๐ are comonotonic. ๏ฟฝ
Corollary 14.6. The random variables ๐๐ are comonotonic iff
(๐1, . . . ๐๐) โผ(๐นโ1๐1 (๐), . . . , ๐น
โ1๐๐(๐)
)for one uniform random variable ๐.
Proof. It is evident that(๐นโ1๐1(๐), . . . , ๐นโ1
๐๐(๐)
)are comonotonic, as ๐นโ1
๐๐(ยท) are nondecreasing
and ๐นโ1๐๐(๐) โผ ๐๐.
As for the converse apply Theorem 14.5 (ii) and (4.3). Then ๐ = ๐นโ1๐(๐) = ๐ข โฆ ๐นโ1
๐(๐). ๏ฟฝ
Proposition 14.7 (Upper Frรฉchet1 bound). If ๐๐ are pairwise comonotonic, then
๐น๐1,...๐๐(๐ฅ1, . . . , ๐ฅ๐) = min
๐=1,...๐๐น๐๐(๐ฅ๐).
Proof. By Theorem 14.5 (ii) we have
๐น๐1,๐2 (๐ฅ1, ๐ฅ2) = ๐(๐1 โค ๐ฅ1, ๐2 โค ๐ฅ2) = ๐(๐ฃ(๐) โค ๐ฅ1, ๐ค(๐) โค ๐ฅ2).
As ๐ฃ(ยท) and ๐ค(ยท) are monotone we have either {๐ฃ(๐) โค ๐ฅ1} โ {๐ค(๐) โค ๐ฅ2} or {๐ฃ(๐) โค ๐ฅ1} โ{๐ค(๐) โค ๐ฅ2}.
If {๐ฃ(๐) โค ๐ฅ1} โ {๐ค(๐) โค ๐ฅ2}, then
๐น๐1,๐2 (๐ฅ1, ๐ฅ2) = ๐(๐1 โค ๐ฅ1, ๐2 โค ๐ฅ2) = ๐(๐ฃ(๐) โค ๐ฅ1) = ๐น๐1 (๐ฅ1) โค ๐น๐2 (๐ฅ2),
and if {๐ฃ(๐) โค ๐ฅ1} โ {๐ค(๐) โค ๐ฅ2}, then
๐น๐1,๐2 (๐ฅ1, ๐ฅ2) = ๐(๐ฃ(๐) โค ๐ฅ1, ๐ค(๐) โค ๐ฅ2) = ๐(๐ค(๐) โค ๐ฅ2) = ๐น๐2 (๐ฅ2) โค ๐น๐1 (๐ฅ1).
The assertion follows. ๏ฟฝ
Remark 14.8. It is always true that ๐น๐1,...๐๐(๐ฅ1, . . . , ๐ฅ๐) โค min๐=1,...๐ ๐น๐๐
(๐ฅ๐). For comonotonicrandom variables, however, the upper Frรฉchet bound is attained.
1Maurice Renรฉ Frรฉchet, 1878โ1973
Version: May 28, 2021
84 CO- AND ANTIMONOTONICITY
๐น (๐ฅ + ฮ๐ฅ, ๐ฆ + ฮ๐ฆ)
๐น (๐ฅ + ฮ๐ฅ, ๐ฆ)
๐น (๐ฅ, ๐ฆ + ฮ๐ฆ)
๐น (๐ฅ, ๐ฆ)
Figure 14.1: Probability of an area
Corollary 14.9. Let ๐ and ๐ be comonotonic. Then E๐ ยท E ๐ โค E๐๐.
Proof. Integrate (14.3) with respect to ๐(d๐) โ ๐(d๐โฒ) and proceed as in Chebyshevsโs suminequality, Theorem 14.1. ๏ฟฝ
Corollary 14.10. The covariance cov( ๏ฟฝ๏ฟฝ,๐ ) among all random variables with ๏ฟฝ๏ฟฝ โผ ๐ and ๐ โผ ๐is maximal, if ๏ฟฝ๏ฟฝ and ๐ are comonotonic.
14.3 INTEGRATION OF RANDOM VECTORS
For R-valued random variables we have
E ๐(๐) =โซฮฉ
๐(๐)๐(d๐) =โซ โ
โโ๐(๐ฅ) d๐น๐ (๐ฅ) =
โซ โ
โโ๐(๐ฅ) ๐๐ (๐ฅ) d๐ฅ,
where the latter is only possible if the derivative (density) d๐น๐ (๐ฅ) = ๐๐ (๐ฅ) d๐ฅ exists.How do these formulae generalize for higher dimensions?
E ๐(๐,๐ ) =โซฮฉ
๐(๐ (๐), ๐ (๐)
)๐(d๐) =
โฌR2๐(๐ฅ, ๐ฆ) d2๐น๐,๐ (๐ฅ, ๐ฆ) =
โฌR2๐(๐ฅ, ๐ฆ) ๐๐,๐ (๐ฅ, ๐ฆ) d๐ฅd๐ฆ.
(14.6)To this end observe that
๐(๐ โ [๐ฅ, ๐ฅ + ฮ๐ฅ), ๐ โ [๐ฆ, ๐ฆ + ฮ๐ฅ)
)(14.7)
= ๐น๐,๐ (๐ฅ + ฮ๐ฅ, ๐ฆ + ฮ๐ฆ) โ ๐น๐,๐ (๐ฅ, ๐ฆ + ฮ๐ฆ) โ ๐น๐,๐ (๐ฅ + ฮ๐ฅ, ๐ฆ) + ๐น๐,๐ (๐ฅ, ๐ฆ)
and it is thus evident what d2๐น๐,๐ (๐ฅ, ๐ฆ) in (14.6) has to stand for (cf. Figure 14.1).Generalizations to random vectors in R๐ are obvious, the general form for (14.7), however,
involves 2๐ evaluations of ๐น๐1,...๐๐(๐ฅ1, . . . , ๐ฅ๐).
14.4 COPULA
Definition 14.11. The copula function of a random vector (๐1, . . . , ๐๐) is the cdf ๐ถ : [0, 1]๐ โ[0, 1] on [0, 1]๐ expressing the joint distribution function by all marginal distributions, i.e.,
๐น๐1,...,๐๐(๐ฅ1, . . . , ๐ฅ๐) = ๐ถ
(๐น๐1 (๐ฅ1), . . . ๐น๐๐
(๐ฅ๐)).
Remark 14.12. Note, that the copula in dimension 1 is trivial, as ๐ถ (๐ข) = ๐ข.
Remark 14.13 (Independence copula). The independence copula
๐ถ (๐ข1, . . . ๐ข๐) = ๐ข1 ยท . . . ๐ข๐
governs independent random variables ๐1, . . . ๐๐.
rough draft: do not distribute
14.5 PROBLEMS 85
Lemma 14.14. Copulas functions can be assumed to be uniformly continuous; more pre-cisely, it holds that
๐ถ (๐ข1, . . . ๐ข๐) โ ๐ถ (๐ฃ1, . . . ๐ฃ๐) โค |๐ฃ1 โ ๐ข1 | + ยท ยท ยท + |๐ฃ๐ โ ๐ข๐ | .
Proof. Just observe that
๐ (๐ โค ๐ฅ2, ๐ โค ๐ฆ2) โ ๐ (๐ โค ๐ฅ1, ๐ โค ๐ฆ1) โค |๐ (๐ โค ๐ฅ2, ๐ โค ๐ฆ2) โ ๐ (๐ โค ๐ฅ1, ๐ โค ๐ฆ2) |+ |๐ (๐ โค ๐ฅ1, ๐ โค ๐ฆ2) โ ๐ (๐ โค ๐ฅ1, ๐ โค ๐ฆ1) |
โค |๐ (๐ โค ๐ฅ2) โ ๐ (๐ โค ๐ฅ1) | + |๐ (๐ โค ๐ฆ2) โ ๐ (๐ โค ๐ฆ1) |
and thus๐ถ (๐ข1, ๐ข2) โ ๐ถ (๐ฃ1, ๐ฃ2) โค |๐ข1 โ ๐ฃ1 | + |๐ข2 โ ๐ฃ2 | .
This generalizes to higher dimensions. ๏ฟฝ
Lemma 14.15. It holds that
E ๐(๐1, . . . ๐๐) =โซ 1
0ยท ยท ยท
โซ 1
0๐
(๐นโ1๐1 (๐ข1), . . . ๐น
โ1๐1(๐ข1)
)d๐๐ถ (๐ข1, . . . ๐ข๐).
Proof. By (14.6) and substituting the marginals ๐ฅ๐ โ ๐นโ1๐๐(๐ข๐) we have
E ๐(๐1, . . . ๐๐) =โซ โ
โโยท ยท ยท
โซ โ
โโ๐(๐ฅ1, . . . ๐ฅ๐) d๐๐น๐1,...๐๐
(๐ฅ1, . . . ๐ฅ๐)
=
โซ 1
0ยท ยท ยท
โซ 1
0๐
(๐นโ1๐1 (๐ข1), . . . ๐น
โ1๐1(๐ข1)
)d๐๐น๐1,...,๐๐
(๐นโ1๐1 (๐ข1), . . . ๐น
โ1๐1(๐ข1)
)=
โซ 1
0ยท ยท ยท
โซ 1
0๐
(๐นโ1๐1 (๐ข1), . . . ๐น
โ1๐1(๐ข1)
)d๐๐ถ (๐ข1, . . . ๐ข1) .
๏ฟฝ
Example 14.16. The copula for comonotonic random variables is๐ถ (๐ข1, . . . , ๐ข๐) = min๐=1,...๐ ๐ข๐ .
Lemma 14.17. For comonotonic random variables ๐๐, ๐ = 1, . . . , ๐, we have
E ๐(๐1, . . . , ๐๐) =โซ 1
0๐
(๐นโ1๐1 (๐ข), . . . ๐น
โ1๐๐(๐ข)
)d๐ข.
14.5 PROBLEMS
Exercise 14.1 (McShaneโs Lemma on Lipschitz extensions). Let (๐, ๐) be a set equippedwith a metric and let ๐ : ๐ โ R be Lipschitz with Lipschitz constant ๐ฟ, where ๐ โ ๐. Then๐ (๐ง) B inf๐ขโ๐ ๐ (๐ข) + ๐ฟ๐ (๐ข, ๐ง) is well-defined for ๐ง โ ๐, ๐ (๐ข) = ๐ (๐ข) for ๐ข โ ๐ and ๐ : ๐ โ R
has Lipschitz constant ๐ฟ.Kirszbraunโs theorem provides an extension for vector-valued functions, although the as-
sertion for general Lipschitz functions is false.
Version: May 28, 2021
86 CO- AND ANTIMONOTONICITY
rough draft: do not distribute
15Convexity
Some parts follow a lecture by Mete Soner, but the content can be found in many elementarytextbooks on convex analysis, for example in Bot et al. [2009].
In what follows ๐ is a real topological vector space.
15.1 PROPERTIES OF CONVEX FUNCTIONS
We consider functions to the extended reals, ๐ : ๐ โ R โช {ยฑโ}.
Definition 15.1. The domain of ๐ is dom ๐ B { ๐ < โ}. ๐ is proper, if its domain is not empty.
Definition 15.2. We shall say that ๐ : ๐ โ R โช {ยฑโ} is lower semicontinuous (lsc.) if thesets { ๐ > _} are open for all _ โ R. The sets { ๐ โค _} are called lower levelsets, sublevel setsor trenches.
Lemma 15.3. The following are equivalent.
(i) ๐ is lsc. at ๐ฅ0 โ ๐;
(ii) for every Y > 0 there exists a neighborhood so that ๐ (๐ฅ) > ๐ (๐ฅ0) โ Y for every ๐ฅ โ ๐;
(iii) then lim inf๐ฅโ๐ฅ0 ๐ (๐ฅ) โฅ ๐ (๐ฅ0).
Lemma 15.4. The following are equivalent.
(i) ๐ is lsc.;
(ii) the epigraph epi ๐ B {(๐ฅ, ๐ผ) โ ๐ ร R : ๐ผ โฅ ๐ (๐ฅ)} is closed in ๐ ร R;
(iii) the level sets { ๐ โค _} are closed for all _ โ R.
Example 15.5. ๐ฟ๐ด B
{0 ๐ฅ โ ๐ด+โ else
is lsc., iff ๐ด is closed.
Definition 15.6. We shall call ๐ : ๐ โ R
โฒ convex, if ๐((1 โ _)๐ฅ + _๐ฆ
)โค (1 โ _) ๐ (๐ฅ) + _ ๐ (๐ฆ) for _ โ [0, 1],
โฒ concave, if ๐((1 โ _)๐ฅ + _๐ฆ
)โฅ (1 โ _) ๐ (๐ฅ) + _ ๐ (๐ฆ) for _ โ [0, 1], and
โฒ affine, if ๐((1 โ _)๐ฅ + _๐ฆ
)= (1 โ _) ๐ (๐ฅ) + _ ๐ (๐ฆ) for all _ โ R.
Remark 15.7. Note, that ๐ is affine iff ๐ (๐ฅ) = ๐ + ๐ฅโ(๐ฅ) for some linear ๐ฅโ. Indeed, for ๐ affinewrite ๐ (๐ฅ) = ๐(0) + ๐ (๐ฅ) โ ๐(0) and show that ๐ (๐ฅ) โ ๐(0) is linear.
87
88 CONVEXITY
Lemma 15.8. If ๐ is lsc. and convex, then
๐ (๐ฅ) = sup๐ ( ยท) โค ๐ ( ยท)
๐(๐ฅ) for all ๐ฅ โ ๐,
where ๐ is affine and ๐(ยท) โค ๐ (ยท) iff ๐(๐ฅ) โค ๐ (๐ฅ) for all ๐ฅ โ ๐.
Proof. Note first that sup๐ ( ยท) โค ๐ ( ยท) ๐(๐ฅ) is convex and lsc.Conversely, consider
๐ B {(๐ฅโ, ๐ผ) โ ๐โ ร R : ๐ฅโ(๐ฅ) + ๐ผ โค ๐ (๐ฅ) for all ๐ฅ โ ๐} .
We show that ๐ is not empty.If ๐ โก +โ, the always (๐ฅโ, ๐ผ) โ ๐, hence ๐ โ โ .Otherwise, there is ๐ฆ โ ๐ so that ๐ (๐ฆ) โ R. Then epi ๐ โ โ and (๐ฆ, ๐ (๐ฆ) โ 1) โ epi ๐ . As
๐ is lsc., it follows from Lemma 15.4 that epi ๐ is closed and convex. By the HahnโBanachtheorem there exists (๐ฅโ, ๐ผ) โ ๐โ ร R so that
๐ฅโ(๐ฆ) + ๐ผ( ๐ (๐ฆ) โ 1) < ๐ฅโ(๐ฅ) + ๐ผ๐ for all (๐ฅ, ๐) โ epi ๐ . (15.1)
As (๐ฆ, ๐ (๐ฆ)) โ epi ๐ it follows that ๐ผ > 0 and by rescaling (๐ฅโ, ๐), we may assume that ๐ผ = 1and we get ๐ฅโ(๐ฆ โ ๐ฅ) + ๐ (๐ฆ) โ 1 < ๐ for all (๐ฅ, ๐) โ epi ๐ . For ๐ฅ โ dom ๐ we have that(๐ฅ, ๐ (๐ฅ)) โ epi ๐ , and thus ๐ฅโ(๐ฆ โ ๐ฅ) + ๐ (๐ฆ) โ 1 < ๐ (๐ฅ), which actually holds for all ๐ฅ โ ๐.Consequently, the function ๐ฅ โฆโ โ๐ฅโ(๐ฅ) + ๐ฅโ(๐ฆ) + ๐ (๐ฆ) โ 1 is a minorant and ๐ โ โ .
We thus have๐ (๐ฅ) โฅ sup {๐ฅโ(๐ฅ) + ๐ผ : (๐ฅโ, ๐ผ) โ ๐} ,
it remains to be shown that equality holds.Assume there were ๐ฅ โ ๐ and ๐ โ R such that
๐ (๐ฅ) > ๐ > sup {๐ฅโ(๐ฅ) + ๐ผ : (๐ฅโ, ๐ผ) โ ๐} . (15.2)
Then (๐ฅ, ๐) โ epi ๐ . Again, by HahnโBanach theorem, there are (๐ฅโ, ๏ฟฝ๏ฟฝ) โ ๐โ ร R and Y > 0such that
๐ฅโ(๐ฅ) + ๏ฟฝ๏ฟฝ๐ > ๐ฅโ(๐ฅ) + ๏ฟฝ๏ฟฝ๐ + Y for all (๐ฅ, ๐) โ epi ๐ . (15.3)
It follows that ๏ฟฝ๏ฟฝ โฅ 0, as (๐ฅ, ๐ โฒ) โ epi ๐ for every ๐ โฒ โฅ ๐.Assume that ๐ (๐ฅ) โ R. Then ๏ฟฝ๏ฟฝ (๐ โ ๐) > Y by (15.3), thus ๏ฟฝ๏ฟฝ > 0. It follows from (15.3) that
๐ (๐ฅ) > 1๏ฟฝ๏ฟฝ๐ฅโ(๐ฅ โ ๐ฅ) + ๐ + Y
๏ฟฝ๏ฟฝ. (15.4)
Hence, ๐ฅ โฆโ 1๏ฟฝ๏ฟฝ๐ฅโ(๐ฅ โ ๐ฅ) + ๐ + Y
๏ฟฝ๏ฟฝis a minorant of ๐ (ยท) which evaluates to ๐ + Y
๏ฟฝ๏ฟฝat ๐ฅ, so that we
get from (15.2) that ๐ (๐ฅ) > ๐ > ๐ + Y๏ฟฝ๏ฟฝ
, which is a contradiction.Hence, ๐ (๐ฅ) = โ, i.e., ๐ฅ โ dom ๐. As the domain of ๐ is convex, we may separate ๐ฅ from
dom ๐ , i.e., there is ๐ฅโ so that ๐ฅโ(๐ฅ โ ๐ฅ) > Y > 0 (i.e., we may choose ๏ฟฝ๏ฟฝ = 0 in (15.3)).Consider the function ๐งโ(๐ฅ) B โ๐ฅโ(๐ฅ โ ๐ฅ) + Y. By (15.3) we get that ๐งโ(๐ฅ) โค 0 for every
๐ฅ โ dom ๐ . As ๐ โ โ , there are ๐ฆโ โ ๐โ and ๐ฝ โ R such that ๐ฆโ(๐ฅ) + ๐ฝ โค ๐ (๐ฅ) for all ๐ฅ โ ๐. Itfollows from (15.2) that ๐ > ๐ฆโ(๐ฅ) + ๐ฝ, so ๐พ B 1
Y(๐ โ ๐ฆโ(๐ฅ) โ ๐ฝ) > 0.
The function ๐(๐ฅ) B ๐ฆโ(๐ฅ) โ ๐พ๐ฅโ(๐ฅ) + ๐พ๐ฅโ(๐ฅ) + ๐ฝ + ๐พY is affine. For ๐ฅ โ dom ๐ we have that
๐(๐ฅ) = ๐ฆโ(๐ฅ) + ๐ฝ + ๐พยฉยญยญยญยซโ๐ฅ
โ(๐ฅ โ ๐ฅ) + Y๏ธธ ๏ธท๏ธท ๏ธธ๐งโ (๐ฅ)
ยชยฎยฎยฎยฌ โค ๐ฆโ(๐ฅ) + ๐ฝ โค ๐ (๐ฅ). Hence ๐(ยท) is an affine minorant of ๐
and for ๐ฅ = ๐ฅ one gets ๐(๐ฅ) B ๐ฆโ(๐ฅ) โ ๐พ๐ฅโ(๐ฅ) + ๐พ๐ฅโ(๐ฅ) + ๐ฝ + ๐พY = ๐, which contradicts (15.4).Hence, ๐ is the pointwise supremum of affine functions. ๏ฟฝ
rough draft: do not distribute
15.2 DUALITY 89
15.2 DUALITY
Definition 15.9 (LegendreโFenchel1 transformation). The convex conjugate function is
๐ โ : ๐โ โ R โช {โ}๐ โ (๐ฅโ) B sup
๐ฅโ๐๐ฅโ(๐ฅ) โ ๐ (๐ฅ) (15.5)
and the bi-conjugate is
๐ โโ : ๐ โ R โช {โ}๐ โโ(๐ฅ) B sup
๐ฅโโ๐โ๐ฅโ(๐ฅ) โ ๐ โ(๐ฅโ).
Remark 15.10 (Fenchelโs inequality, or FenchelโYoung2 inequality). By (15.5) it holds that
๐ฅโ(๐ฅ) โค ๐ (๐ฅ) + ๐ โ(๐ฅโ) (15.6)
for all ๐ฅ โ ๐ and ๐ฅโ โ ๐โ.
Lemma 15.11. We have that ๐ โค ๐ implies that ๐ โ โฅ ๐โ.
Proof. Cf. Exercise 15.1. ๏ฟฝ
Lemma 15.12. If ๐(๐ฅ) = ๐ + ๐ฆโ(๐ฅ) is affine linear, then ๐โ(๐ฅโ) ={โ๐ if ๐ฅโ = ๐ฆโ
+โ else.and ๐โโ(๐ฅ) =
๐(๐ฅ).
Proof. Observe that
๐โ(๐ฅโ) = sup๐ฅโ๐
๐ฅโ(๐ฅ) โ ๐ โ ๐ฆโ(๐ฅ) ={โ๐ if ๐ฅโ = ๐ฆโ
+โ else.
Further,
๐โโ(๐ฅ) = sup๐ฅโโ๐โ
๐ฅโ(๐ฅ) โ ๐โ(๐ฅโ) = sup{๐ฆโ(๐ฅ) + ๐๏ธธ ๏ธท๏ธท ๏ธธ
๐ฅโ=๐ฆโ
, ๐ฅโ(๐ฅ) โ โ๏ธธ ๏ธท๏ธท ๏ธธ๐ฅโโ ๐ฆโ
}= ๐ฆโ(๐ฅ) + ๐ = ๐(๐ฅ)
for every ๐ฅ โ ๐, i.e., ๐ = ๐โโ. ๏ฟฝ
Example 15.13. On ๐ = R let ๐ (๐ฅ) B 1๐|๐ฅ |๐, then ๐ โ(๐ฆ) = 1
๐|๐ฆ |๐, where 1
๐+ 1
๐= 1.
Indeed, the maximum is attained at 0 = dd๐ฅ ๐ฅ๐ฆ โ
1๐|๐ฅ |๐ = ๐ฆ โ ๐ฅ๐โ1, so ๐ฅโ = ๐ฆ
1๐โ1 and thus
๐ โ(๐ฆ) = ๐ฅโ๐ฆ โ 1๐|๐ฅโ |๐ = ๐ฆ
1๐โ1 ๐ฆ โ 1
๐๐ฆ
๐
๐โ1 =๐ โ 1๐
๐ฆ๐
๐โ1 =1๐๐ฆ๐ .
Remark 15.14. ๐ โ and ๐ โโ are lsc.
1Werner Fenchel, 1905โ19882William Henry Young, 1863โ1942
Version: May 28, 2021
90 CONVEXITY
Theorem 15.15 (FenchelโMoreau Theorem, Rockafellar). Let ๐ be a Banach space. Let๐ : ๐ โ R be a proper, extended real valued lsc. and convex, function. Then ๐ = ๐ โโ, where
๐ โโ(๐ฅ) B sup๐ฅโโ๐โ
๐ฅโ(๐ฅ) โ ๐ โ(๐ฅโ).
Proof. By Fenchelโs inequality (15.6) we have that that ๐ (๐ฅ) โฅ ๐ฅโ(๐ฅ) โ ๐ โ(๐ฅโ), and thus
๐ (๐ฅ) โฅ sup๐ฅโโ๐โ
๐ฅโ(๐ฅ) โ ๐ โ(๐ฅโ) = ๐ โโ(๐ฅ),
i.e., ๐ โฅ ๐ โโ.Let ๐ be affine so that ๐ โค ๐ . Then ๐โ โฅ ๐ โ and ๐โโ โค ๐ โโ. Now by Lemma 15.12 we have
that ๐ = ๐โโ, hence๐ (๐ฅ) = sup
๐โค ๐๐(๐ฅ) โค sup
๐โค ๐ โโ๐(๐ฅ) = ๐ โโ(๐ฅ),
which is the converse inequality. ๏ฟฝ
Corollary 15.16 (The bipolar theorem). The polar cone is
๐ถโฆ B {๐ฆ โ ๐โ : ๐ฆ(๐) โค 0 for all ๐ โ ๐ถ} .
Let ๐ถ be a cone, then ๐ถโฆโฆ B (๐ถโฆ)โฆ = conv {_๐ : _ โฅ 0, ๐ โ ๐ถ}.
Proof. Consider the indicator function ๐ (๐) B ๐ฟ๐ถ (๐) B{0 if ๐ โ ๐ถ,+โ else.
Then ๐ฟโ๐ถ(๐ฆ) = sup๐โ๐ถ ๐ฆ(๐)
and ๐ฟโโ๐ถ(๐) = ๐ฟ๐ถโฆโฆ (๐) iff ๐ถ = ๐ถโฆโฆ. ๏ฟฝ
Corollary 15.17 (Youngโs inequality). For ๐(ยท) strictly increasing it holds that
๐ฅ๐ฆ โคโซ ๐ฅ
0๐(๐ข) d๐ข +
โซ ๐ฆ
0๐โ1(๐ฃ) d๐ฃ,
where ๐ฅ > 0 and ๐ฆ โ [0, ๐(๐ฅ)]; particularlyโซ ๐ฆ
0๐โ1(๐ฃ) d๐ฃ = sup
๐ฅ
{๐ฅ๐ฆ โ
โซ ๐ฅ
0๐(๐ข) d๐ข
}and
โซ ๐ฅ
0๐(๐ข) d๐ข = sup
๐ฆ
{๐ฅ๐ฆ โ
โซ ๐ฆ
0๐โ1(๐ฃ) d๐ฃ
}. (15.7)
Proof. Set ๐ (๐ฅ) Bโซ ๐ฅ
0 ๐(๐ข) d๐ข, then 0 = dd๐ฅ ๐ฅ๐ฆ โ
โซ ๐ฅ
0 ๐(๐ข) d๐ข = ๐ฆ โ ๐(๐ฅ), i.e., ๐ฅโ = ๐โ1(๐ฆ). Hence
๐ โ(๐ฆ) = ๐ฆ๐โ1(๐ฆ) โโซ ๐โ1 (๐ฆ)0 ๐(๐ข) d๐ข =
โซ ๐ฆ
0 ๐โ1(๐ฃ) d๐ฃ, and hence the assertion. ๏ฟฝ
15.3 PROBLEMS
Exercise 15.1. Verify Lemma 15.11.
Exercise 15.2. For a family ๐ ] it holds that (inf ] ๐ ])โ (๐ฅโ) = sup ] ๐ โ] (๐ฅโ), but(sup ] ๐ ]
)โ (๐ฅโ) โคinf ] ๐ โ] (๐ฅโ).
Exercise 15.3. Show that ((1 โ _) ๐0 + _ ๐1)โ โค (1 โ _) ๐ โ0 + _ ๐โ1 for _ โ [0, 1].
Exercise 15.4. Set ๐(๐ฅ) B ๐ผ + ๐ฝ ยท ๐ฅ + ๐พ ๐ (_๐ฅ + ๐ฟ), then ๐โ(๐ฅโ) = โ๐ผโ ๐ฟ ๐ฅโโ๐ฝ_+ ๐พ ๐ โ
(๐ฅโโ๐ฝ_๐พ
), where
_ โ 0 and ๐พ >0.
rough draft: do not distribute
15.3 PROBLEMS 91
Exercise 15.5 (Infimal convolution). Define the infimal convolution
( ๐๏ฟฝ๐) (๐ฅ) B inf { ๐ (๐ฅ โ ๐ฆ) + ๐(๐ฆ) : ๐ฆ โ R๐}
and more generally, ( ๐1๏ฟฝ . . .๏ฟฝ ๐๐) (๐ฅ) B inf{โ๐
๐=1 ๐๐ (๐ฅ๐) :โ๐
๐=1 = ๐ฅ}. Then, for ๐๐ proper, con-
vex and lsc., ( ๐1๏ฟฝ . . .๏ฟฝ ๐๐)โ = ๐ โ1 + ยท ยท ยท + ๐โ๐.
Exercise 15.6. Show that the conjugate of ๐ (๐ฅ) = ๐๐ฅ is ๐ โ(๐ฅโ) =
๐ฅโ log ๐ฅโ โ ๐ฅโ if ๐ฅโ > 00 if ๐ฅโ = 0+โ if ๐ฅโ < 0
.
Version: May 28, 2021
92 CONVEXITY
rough draft: do not distribute
16Sample Average Approximation (SAA)
This chapter is based on Shapiro et al. [2014]
16.1 SAA
Let ๐ โ R๐ be closed and ๐ โ โ . Consider the problem ๐โ B min๐ฅโ๐ E ๐น (๐ฅ, b)๏ธธ ๏ธท๏ธท ๏ธธ=: ๐ (๐ฅ)
which we
compare with ๏ฟฝ๏ฟฝ๐ B min๐ฅโ๐1๐
๐โ๐=1
๐น (๐ฅ, b ๐)๏ธธ ๏ธท๏ธท ๏ธธ=: ๐๐ (๐ฅ)
for the empirical measure ๐๐ = 1๐
โ๐๐=1 ๐ฟb๐ and iid
samples b๐.
16.1.1 Pointwise LLN
Suppose that E ๐น (๐ฅ, b) < โ.
Lemma 16.1. The following hold true:
(i) E ๐๐ (๐ฅ) = ๐ (๐ฅ), i.e., ๐๐ (๐ฅ) is an unbiased estimator for ๐ (๐ฅ);
(ii) (LLN) For every ๐ฅ โ ๐ it holds that ๐๐ (๐ฅ) โ ๐ (๐ฅ), as ๐ โโ with probability 1.
Proposition 16.2. The estimator ๏ฟฝ๏ฟฝ๐ is not necessarily consistent, it holds in general that
lim sup๐โโ
๏ฟฝ๏ฟฝ๐ โค ๐โ.
Proof. We have that ๏ฟฝ๏ฟฝ๐ โค ๐๐ (๐ฅ) for every ๐ฅ โ ๐, thus
lim sup๐โโ
๏ฟฝ๏ฟฝ๐ โค lim๐โโ
๐๐ (๐ฅ) = ๐ (๐ฅ)
by the Law of Large Numbers (ii). Thus
lim sup๐โโ
๏ฟฝ๏ฟฝ๐ โค inf ๐ (๐ฅ) = ๐โ.
๏ฟฝ
Proposition 16.3. The estimator ๏ฟฝ๏ฟฝ๐ is downside biased, it holds that E ๏ฟฝ๏ฟฝ๐+1 โค E ๏ฟฝ๏ฟฝ๐ โค ๐โ.Proof. It holds that
E ๏ฟฝ๏ฟฝ๐ = E ๐๐ (๐ฅ) = Emin๐ฅโ๐
1๐
๐โ๐=1
๐น (๐ฅ, b ๐) โค min๐ฅโ๐
E1๐
๐โ๐=1
๐น (๐ฅ, b ๐)
= min๐ฅโ๐
1๐
๐โ๐=1E ๐น (๐ฅ, b ๐) = min
๐ฅโ๐๐ (๐ฅ) = ๐โ.
93
94 SAMPLE AVERAGE APPROXIMATION (SAA)
Further, note that ๐๐+1(๐ฅ) = 1๐+1
โ๐+1๐=1 ๐น (๐ฅ, b๐) = 1
๐+1โ๐+1
๐=11๐
โ๐โ ๐ ๐น (๐ฅ, b ๐), thus
E ๏ฟฝ๏ฟฝ๐+1 = Emin๐ฅโ๐
๐๐+1(๐ฅ) = Emin๐ฅโ๐
1๐ + 1
๐+1โ๐=1
1๐
โ๐โ ๐
๐น (๐ฅ, b ๐)
โฅ E 1๐ + 1
๐+1โ๐=1min๐ฅโ๐
1๐
โ๐โ ๐
๐น (๐ฅ, b ๐)๏ธธ ๏ธท๏ธท ๏ธธ๏ฟฝ๏ฟฝ๐
= E1
๐ + 1
๐+1โ๐=1
๏ฟฝ๏ฟฝ๐ = E ๏ฟฝ๏ฟฝ๐ ,
thus the assertion. ๏ฟฝ
16.1.2 Pointwise and Functional CLT
Suppose here that
(i) ๐(๐ฅ)2 B var ๐น (๐ฅ,ฮ) < โ and
(ii) |๐น (๐ฅ, b) โ ๐น (๐ฅ โฒ, b) | โค ๐ถ (b) โ๐ฅ โ ๐ฅ โฒโ a.s. and E๐ถ (b)2 < โ.
Lemma 16.4. ๐ (ยท) is Lipschitz with constant E๐ถ (b).
Proof. It follows from (ii) by taking expectations that ๐ (๐ฅ) โ ๐ (๐ฅ โฒ) = E ๐น (๐ฅ, b) โ E ๐น (๐ฅ โฒ, b) โคE๐ถ โ๐ฅ โ ๐ฅ โฒโ, from which the assertion follows. ๏ฟฝ
We have that โ๐ ( ๐๐ (๐ฅ) โ ๐ (๐ฅ))
Dโโโ ๐ (๐ฅ) โผ N(0, ๐(๐ฅ)2
).
More generally,
โ๐ ( ๐๐ (๐ฅ1) โ ๐ (๐ฅ1), . . . ๐๐ (๐ฅ๐) โ ๐ (๐ฅ๐))
Dโโโ ๐ (๐ฅ) โผ N (0, ฮฃ) ,
where ฮฃ = cov (๐น (๐ฅ๐ , b), ๐น (๐ฅ๐ , b))๐๐, ๐=1.In a functional way,
โ๐ ( ๐๐ (ยท) โ ๐ (ยท))
Dโโโ ๐ : ฮฉโ ๐ถ (๐),
where ๐ : ฮฉโ ๐ถ (๐) is called a random element in ๐ถ (๐).
Theorem 16.5. If (i) and (ii), then
(i) ๏ฟฝ๏ฟฝ๐ = inf๐ฅ ๐๐ (๐ฅ) + ๐(๐โ1/2
)and
(ii) ๐1/2(๏ฟฝ๏ฟฝ๐ โ ๐โ
) Dโโโ inf๐ โ๐ ๐ (๐ ), where ๐ = argmin๐ฅโ๐ ๐ (๐ฅ) โ ๐.
Proof. The proof uses the ฮ-method described in Section 16.2 below for finite dimensions.๏ฟฝ
Remark 16.6. We obtain from (ii) that ๏ฟฝ๏ฟฝ๐ = ๐โ + ๐โ1/2 inf๐ โ๐ ๐ (๐ ) + ๐(๐โ1/2
). For ๐ = {๐ฅโ} we
have that inf๐ โ๐ ๐ (๐ ) = ๐ (๐ฅโ) โผ ๐(0, ๐(๐ฅโ)2
)and hence E ๏ฟฝ๏ฟฝ๐ = ๐โ + ๐
(๐โ1/2
).
However, convergence is slower, in general, if ๐ consists of more than 1 point.
rough draft: do not distribute
16.2 THE ฮ-METHOD 95
16.2 THE ฮ-METHOD
Proposition 16.7. Let ๐๐ โ R๐ be random vectors with, ๐๐ โ ` โ R๐ in probability and
R 3 ๐๐ โ โ deterministic numbers such that ๐๐ (๐๐ โ `)Dโโโ ๐ . Further, let ๐บ : R๐ โ R๐ be
differentiable at `. Then ๐๐ (๐บ (๐๐ ) โ ๐บ (`))Dโโโ ๐ฝ ยท ๐ , where ๐ฝ = โ๐บ (`) is the ๐ ร ๐ Jacobian
matrix at `.
Proof. Notice, that ๐บ (๐ฆ) โ ๐บ (`) = ๐ฝ (๐ฆ โ `) + ๐ (๐ฆ), where ๐ (๐ฆ) = ๐(โ๐ฆ โ `โ), so that wehave ๐๐ (๐บ (๐๐ ) โ ๐บ (`)) = ๐ฝ ๐๐ (๐๐ โ `)๏ธธ ๏ธท๏ธท ๏ธธ
Dโโโ๐
+๐๐ ๐ (๐๐ ). We have that ๐๐ (๐๐ โ `) = O(1) (as it
converges in distribution), hence โ๐๐ โ `โ = O(1) and thus ๐ (๐๐ ) = ๐ (โ๐๐ โ `โ) = ๐(๐โ1๐
).
Thus the result. ๏ฟฝ
Claim 16.8. For ๐1/2 (๐๐ โ `)Dโโโ N (0, ฮฃ) we have particularly that ๐1/2 (๐บ (๐๐ ) โ ๐บ (`))
DโโโN (0, ๐ฝฮฃ๐ฝ>).
Version: May 28, 2021
96 SAMPLE AVERAGE APPROXIMATION (SAA)
rough draft: do not distribute
17Weak Topology of Measures
17.1 GENERAL CHARACTERISTICS
Definition 17.1. Let (๐, ๐) be a metric space. The weak topology of probability measures ischaracterized byโซ
๐
โ d๐๐ โโโโโ๐โโ
โซ๐
โ d๐ for all bounded and continuous functions โ : ๐ โ R.
Theorem 17.2 (Riesz representation theorem). For any continuous linear functional ๐ : ๐ถ0(๐) โR (the bounded functions vanishing at infinity on a locally compact Hausdorff space ๐) thereis a regular, countatbly additive measure ` on the Borels so that
๐(โ) =โซ๐
โ d` for all โ โ ๐ถ0(๐).
Definition 17.3. Let `, a be (probability) measures. The LรฉvyโProkhorov metric is
๐(`, a) B inf {Y > 0: `(๐ด) โค a(๐ดY) + Y and a(๐ด) โค `(๐ดY) + Y for all ๐ด โ โฌ(๐)} , (17.1)
where ๐ดY Bโ
๐โ๐ด ๐ตY (๐) is the Y-fattening (or Y-enlargement) of ๐ด โ ๐.
Remark 17.4. Notice, that ๐(๐,๐) โค 1 for probability measures.
Definition 17.5. A collection ๐ โ P(๐) of probability measures on (๐, ๐) is tight iff for everyY > 0 there is a compact set ๐พY โ ๐ so that
`(๐พY) > 1 โ Y for all ๐ โ ๐.
Example 17.6. The collection {๐ฟ๐ : ๐ = 1, 2, . . . } on R is not tight, while{๐ฟ1/๐ : ๐ = 1, 2, . . .
}is.
Example 17.7. A collection of Gaussian measures {N (`๐ , ฮฃ๐) : ๐ โ ๐ผ} is tight, if {`๐ : ๐ โ ๐ผ}and {ฮฃ๐ : ๐ โ ๐ผ} are uniformly bounded.
Theorem 17.8 (Prokhorovโs theorem). The following hold true:
(i) The metric ๐ in (17.1) metrizes the weak topology of measures.
(ii) A set K โ P(๐) is tight iff K is sequentially compact.
(iii) If {๐๐ : ๐ = 1, 2, . . . } โ P(R๐
)is tight, then there is a subsequence and a measure
๐ โ P(R๐
)so that ๐๐ โ ๐ weakly.
97
98 WEAK TOPOLOGY OF MEASURES
Properties
(i) If (๐, ๐) is separable, convergence of measures in the LรฉvyโProkhorov metric is equiv-alent to weak convergence of measures. Thus, ๐ is a metrization of the topology ofweak convergence on P(๐).
(ii) The metric space (P(๐), ๐) is separable if and only if (๐, ๐) is separable.
(iii) If (P(๐), ๐) is complete then (๐, ๐) is complete. If all the measures in P(๐) haveseparable support, then the converse implication also holds: if (๐, ๐) is complete then(P(๐), ๐) is complete.
(iv) If (๐, ๐) is separable and complete, a subset K โ P(๐) is relatively compact if and onlyif its ๐-closure is ๐-compact.
17.2 THE WASSERSTEIN DISTANCE
Some points here follow [Pflug and Pichler, 2014].
Definition 17.9 (Optimal transportation cost). Given two probability spaces (ฮ, F , ๐) and(ฮ, F , ๏ฟฝ๏ฟฝ
), the Wasserstein distance of order ๐ โฅ 1 (optimal transportation costs) is
d๐(๐, ๏ฟฝ๏ฟฝ
)= inf
๐
(โฌฮรฮ
๐(b, b
)๐๐
(๐b, ๐b
) )1/๐, (17.2)
where the infimum is taken over all (bivariate) probability measures ๐ on ฮ ร ฮ having themarginals ๐ and ๏ฟฝ๏ฟฝ, that is
๐(๐ด ร ฮ
)= ๐(๐ด) and ๐ (ฮ ร ๐ต) = ๏ฟฝ๏ฟฝ(๐ต) (17.3)
for all measurable sets ๐ด โ F and ๐ต โ F . The optimal measure ๐ is called the optimaltransport plan.
Remark 17.10. Occasionally, the Wasserstein distance is also considered for a (convex)function ๐(๐ฅ, ๐ฆ) instead of the distance ๐ (๐ฅ, ๐ฆ)๐ .
Proposition 17.11 (Embedding). It holds that
d๐(๐, ๐ฟb0
)๐=
โซฮ
d (b, b0)๐ ๐ (db) ,
and the mapping
๐ : (ฮ, d) โ (P๐ (ฮ; d) , d๐ ) ,b โฆโ ๐ฟb (ยท)
assigning to each point b โ ฮ its point measure ๐ฟb located on b (Dirac measure1) is anisometric embedding for all 1 โค ๐ < โ ((ฮ, d) โฉโ P๐ (ฮ; d)).
1 ๐ฟb (๐ด) B 1๐ด(b) ={1 if b โ ๐ด0 if b โ ๐ด
is the usual Dirac measure.
rough draft: do not distribute
17.3 THE REAL LINE 99
Proof. There is just one single measure with marginals ๐ and ๐ฟb0 , which is the transport plan๐ = ๐ โ ๐ฟb0 . Hence
d๐(๐, ๐ฟb0
)๐=
โซฮ
โซฮ
d(b, b
)๐๐ฟb0
(๐b
)๐ (๐b) =
โซฮ
d (b, b0)๐ ๐ (๐b) ,
the first assertion.For the particular choice ๐ = ๐ฟb0 the latter formula simplifies to
d๐(๐ฟ b0 , ๐ฟb0
)๐=
โซฮ
d (b, b0)๐ ๐ฟ b0 (๐b) = d(b0, b0
)๐,
and hence b โฆโ ๐ฟb is an isometry. ๏ฟฝ
Notice that if d is inherited by b, then d๐ (๐, ๐ฟb0)๐ =โซฮโb โ b0โ๐ ๐(๐b).
17.3 THE REAL LINE
Theorem 17.12 (Cf. Rachev and Rรผschendorf [1998, Theorem 2.18]). Let ๐ and ๏ฟฝ๏ฟฝ be prob-ability measures on the real line with โcdfโ ๐น (๐ฅ) B ๐ ((โโ, ๐ฅ]) and ๐บ (๐ฅ) B ๏ฟฝ๏ฟฝ ((โโ, ๐ฅ]) . Let๐ be the measure on R2 with cdf. ๐ป (๐ฅ, ๐ฆ) B min {๐น (๐ฅ), ๐บ (๐ฆ)}. Then ๐ is optimal for the Kan-torovich transportation problem between ๐ and ๏ฟฝ๏ฟฝ for every cost function ๐(๐ฅ, ๐ฆ) = ๐(๐ฅ โ ๐ฆ)where ๐(ยท) is convex. Further,
d๐ (๐, ๏ฟฝ๏ฟฝ) =โซ 1
0๐
(๐นโ1(๐ข) โ ๐บโ1(๐ข)
)d๐ข.
Corollary 17.13. For the cost function ๐(ยท) = |ยท| we have further that d(๐, ๏ฟฝ๏ฟฝ) =โซ 10
๏ฟฝ๏ฟฝ๐นโ1(๐ข) โ ๐บโ1(๐ข)๏ฟฝ๏ฟฝ d๐ข =โซ โโโ |๐น (๐ฅ) โ ๐บ (๐ฅ) | d๐ฅ.
Remark 17.14.
(i) If ๐บ does not give mass to points, then one may define ๐ B ๐บโ1 โฆ ๐น and it holds thatโซ ๐ฅ
โโd๐ = ๐น (๐ฅ) = ๐บ (๐ (๐ฅ)) =
โซ ๐ (๐ฅ)
โโd๏ฟฝ๏ฟฝ (17.4)
The transport map ๐ is a monotone rearrangement of ๐ to ๏ฟฝ๏ฟฝ.
(ii) Suppose ๐ and ๏ฟฝ๏ฟฝ have the densities ๐ = ๐น โฒand ๐ = ๐บ โฒ. Then differentiating (17.4) gives
๐ (๐ฅ) = ๐(๐ (๐ฅ)) ยท ๐ โฒ(๐ฅ).
Version: May 28, 2021
100 WEAK TOPOLOGY OF MEASURES
rough draft: do not distribute
18Topologies For Set-Valued Convergence
18.1 TOPOLOGICAL FEATURES OF MINKOWSKI ADDITION
Theorem (Topological properties of Minkowski addition in locally compact vector spaces).Let ๐ด and ๐ต be sets.
(i)โฆ๐ด + ๐ต is open and
โฆ๐ด + ๐ต โ (๐ด + ๐ต)โฆ โ ๐ด + ๐ต. If ๐ด or ๐ต is open, then ๐ด + ๐ต = (๐ด + ๐ต)โฆ;
(ii) ๏ฟฝ๏ฟฝ + ๏ฟฝ๏ฟฝ โ ๐ด + ๐ต. If ๐ด or ๐ต is bounded, then ๐ด + ๐ต = ๏ฟฝ๏ฟฝ + ๏ฟฝ๏ฟฝ;
(iii) For ๐ด and ๐ต closed and ๐ด (or ๐ต) bounded, ๐ (๐ด + ๐ต) โ ๐๐ด + ๐๐ต.
Proof. Let ๐ฅ โโฆ๐ด + ๐ต have the composition ๐ฅ = ๐ + ๐ with ๐ โ ๐ต๐ (๐) โ
โฆ๐ด for some ๐ > 0 and
๐ โ ๐ต. Then ๐ฅ โ ๐ต๐ (๐ฅ) = ๐ต๐ (๐ + ๐) = ๐ต๐ (๐) + {๐} โโฆ๐ด + ๐ต, so
โฆ๐ด + ๐ต is open. As
โฆ๐ด + ๐ต โ ๐ด + ๐ต
andโฆ๐ด + ๐ต open it is immediate that
โฆ๐ด + ๐ต โ (๐ด + ๐ต)โฆ, which is ((i)) (the rest being obvious).
Let ๐ โ ๏ฟฝ๏ฟฝ and ๐ โ ๏ฟฝ๏ฟฝ. Choose ๐๐ โ ๐ด with ๐๐ โ ๐ โ ๏ฟฝ๏ฟฝ and ๐๐ โ ๐ต with ๐๐ โ ๐ โ ๏ฟฝ๏ฟฝ.Obviously ๐๐ + ๐๐ โ ๐ด + ๐ต and thus ๐ + ๐ โ ๐ด + ๐ต, whence ๏ฟฝ๏ฟฝ + ๏ฟฝ๏ฟฝ โ ๐ด + ๐ต.
As for the converse let ๐ฅ โ ๐ด + ๐ต, so there is a sequence ๐ฅ๐ = ๐๐ + ๐๐ with ๐๐ โ ๐ด and๐๐ โ ๐ต and ๐ฅ๐ โ ๐ฅ. Assume (wlog.) ๐ด bounded, thus there is a subsequence such that๐๐ โ ๐ โ ๏ฟฝ๏ฟฝ, and thus ๐๐ = ๐ฅ๐ โ ๐๐ โ ๐ฅ โ ๐ converges as well with ๐๐ โ ๐ฅ โ ๐ =: ๐ โ ๏ฟฝ๏ฟฝ. Thatis ๐ฅ = ๐ + ๐ โ ๏ฟฝ๏ฟฝ + ๏ฟฝ๏ฟฝ and thus ๐ด + ๐ต โ ๏ฟฝ๏ฟฝ + ๏ฟฝ๏ฟฝ.
Observe first that ๐ (๐ด + ๐ต) โ ๐ด + ๐ต โ ๏ฟฝ๏ฟฝ + ๏ฟฝ๏ฟฝ = ๐ด + ๐ต. Suppose that ๐ฅ โ ๐ (๐ด + ๐ต) can be
written as ๐ฅ = ๐ + ๐ for some ๐ โโฆ๐ด and ๐ โ ๐ต. Then ๐ฅ = ๐ + ๐ โ
โฆ๐ด + ๐ต โ (๐ด + ๐ต)โฆ, whence
๐ฅ โ ๐ (๐ด + ๐ต). This is a contradiction, so ๐ โโฆ๐ด, that is ๐ โ ๐๐ด. By similar reasoning (๐ด and ๐ต
reversed) we find that ๐ โ ๐๐ต as well, which is the desired assertion. ๏ฟฝ
The assertion bounded in (ii) may not be dropped: To see this consider the closed sets๐ด B R ร {0} โ R2 and ๐ต B
{(๐ฅ, eโ๐ฅ2
): ๐ฅ โ R
}. ๐ด + ๐ต is open though, and ๏ฟฝ๏ฟฝ + ๏ฟฝ๏ฟฝ ๐ด + ๐ต.
18.1.1 Topological features of convex sets
Theorem (Topological properties of convex sets).
โฒ If ๐ด is open, then conv ๐ด is open;1
โฒ If ๐ด is bounded, then conv ๐ด is bounded;
โฒ If ๐ด is closed and bounded, then conv ๐ด is closed and bounded;
โฒ conv (๐ด + ๐ต) = conv ๐ด + conv ๐ต.
1The statement if ๐ด is closed, then conv ๐ด is closed is wrong (why?).
101
102 TOPOLOGIES FOR SET-VALUED CONVERGENCE
Proof. Let ๐ โ conv ๐ด have a representation ๐ =โ๐
๐=1 _๐๐๐ with ๐๐ โ ๐ด, whence ๐ โ โ๐๐=1 _๐๐ด โ
conv ๐ด. As ๐ด is open it follows from Theorem (18.1) ((i)) thatโ๐
๐=1 _๐๐ด is open, whence conv ๐ดis open.
Boundedness is obvious.Let ๐ โ conv ๐ด. Then there is a sequence ๐๐ =
โ๐+1๐=1 _
(๐)๐๐(๐)๐โ conv ๐ด with ๐๐ โ ๐ (here
we use Carathรฉodoryโs theorem; the statement is wrong in non-finite dimensions). By pickingsubsequences we may assume that ๐ (๐)1 converges, ๐ (๐)2 converges, etc. and finally ๐ (๐+1)
๐,
and moreover all _ (๐)๐
. ๏ฟฝ
18.2 PRELIMINARIES AND DEFINITIONS
In a vector space ๐ the Minkowski sum (also known as dilation) of two sets ๐ด and ๐ต is๐ด + ๐ต B {๐ + ๐ : ๐ โ ๐ด, ๐ โ ๐ต}, and the product with a scalar ๐ is ๐ ยท ๐ด B {๐ ยท ๐ : ๐ โ ๐ด}.
18.2.1 Convexity, and Conjugate Duality
The support function of a set ๐ด โ ๐ is
๐ ๐ด (๐ฅโ) B sup๐โ๐ด
๐ฅโ (๐) , (18.1)
where ๐ฅโ โ ๐โ is from the dual of a Banach space (๐, โยทโ) with norm โยทโ; its dual we willdenote as (๐โ, โยทโ), as no confusion with denoting the norm in the dual again by โยทโ will bepossible anyway.
Remark 18.1 (A collection of properties). Important properties of the support function include
(i) ๐ ๐ด โค ๐ ๐ต whenever ๐ด โ ๐ต (more specifically, ๐ ๐ด (๐ฅโ) โค ๐ ๐ด (๐ฅโ) for all ๐ฅโ),
(ii) ๐ _ยท๐ด (๐ฅโ) = ๐ ๐ด (_ ยท ๐ฅโ) = _ ยท ๐ ๐ด (๐ฅโ) for _ > 0 (positive homogeneity) and ๐ ๐ด (0) = 0,
(iii) ๐ ๐ด+๐ต = ๐ ๐ด + ๐ ๐ต,
(iv) ๐ conv ๐ด = ๐ ๐ด2 and
(v) ๐ ๐ด is convex, that is ๐ ๐ด((1 โ _) ๐ฅโ0 + _๐ฅ
โ1)โค (1 โ _) ๐ ๐ด
(๐ฅโ0
)+ _๐ ๐ด
(๐ฅโ1
)whenever 0 โค _ โค 1.
By employing the indicator function of the set ๐ด, I๐ด (๐) B{0 if ๐ โ ๐ดโ else , it is immediate
that๐ ๐ด (๐ฅโ) = sup
๐ฅโ๐๐ฅโ (๐ฅ) โ I๐ด (๐ฅ) ,
where the supremum ranges over all ๐ฅ โ ๐ โ ๐ด now. The support function itself thus is theusual convex conjugate function of I๐ด, which we denote ๐ ๐ด = Iโ
๐ด. The bi-conjugate function
of I๐ด (the conjugate of ๐ ๐ด) is the function
๐ โ๐ด (๐) B sup๐ฅโโ๐โ
๐ฅโ (๐) โ ๐ ๐ด (๐ฅโ) ={0 if ๐ฅโ (๐) โค ๐ conv ๐ด (๐ฅ
โ) for all ๐ฅโ โ ๐โโ else,
2conv ๐ด B {โ๐ _๐๐๐ : _๐ โฅ 0,โ๐ _๐ = 1 and ๐๐ โ ๐ด} is the convex hull of ๐ด.
rough draft: do not distribute
18.2 PRELIMINARIES AND DEFINITIONS 103
and by the Rockafellar-Fenchel-Moreau-duality Theorem (cf. Rockafellar [1974]) one furtherinfers that ๐ โ
๐ด= Iconv ๐ด.
This also reveals the relation
conv ๐ด ={๐ โ๐ด < โ
}=
โ๐ฅโโ๐โ
{๐ : ๐ฅโ (๐) โค ๐ ๐ด (๐ฅโ)} =โ
๐ฅโโ๐โ{๐ฅโ โค ๐ ๐ด (๐ฅโ)} ,
from which follows that the correspondence ๐ด โฆโ ๐ ๐ด, restricted to convex, compact sets๐ด โ C, is one-to-one (injective).
18.2.2 PompeiuโHausdorff Distance
Having addition and multiplication available for sets an adequate and fitting notion of dis-tance is useful. For this define the distance from ๐ to a set ๐ต as ๐ (๐, ๐ต) B inf๐โ๐ต ๐ (๐, ๐),where ๐ is the distance function. The deviation of the set ๐ด from the set ๐ต is D (๐ด, ๐ต) Bsup๐โ๐ด ๐ (๐, ๐ต),3 and the Pompeiu-Hausdorff distance is H (๐ด, ๐ต) B max {D (๐ด, ๐ต) , D (๐ต, ๐ด)}(cf. Rockafellar and Wets [1997]). Note that D (๐ด, ๐ต) = 0 iff ๐ด is contained in the topologi-cal closure, ๐ด โ ๏ฟฝ๏ฟฝ, and H (๐ด, ๐ต) = 0 iff ๏ฟฝ๏ฟฝ = ๏ฟฝ๏ฟฝ; moreover H (๐ด, ๐ต) = H
(๐ด, ๐ต
)and obviously
H (๐ด, ๐ต) โค sup๐โ๐ด,๐โ๐ต ๐ (๐, ๐).In a normed space with ๐ (๐, ๐) = โ๐ โ ๐โ it is enough to consider the boundaries, as we
have in addition that H (๐ด, ๐ต) = H (๐๐ด, ๐๐ต) if ๐ด and ๐ต are (sequentially) compact (i.e., ๐ด and๐ต are relatively compact); moreover
H (๐ด, ๐ต) = โ๐ โ ๐โ (18.2)
for some ๐ โ ๐๐ด and ๐ โ ๐๐ต in this situation.
Lemma 18.2. The deviation D and the Pompeiu-Hausdorff distance H satisfy the triangleinequality, D (๐ด,๐ถ) โค D (๐ด, ๐ต) + D (๐ต,๐ถ) and H (๐ด,๐ถ) โค H (๐ด, ๐ต) + H (๐ต,๐ถ).(C, H), where C is the set of all nonempty, compact and convex subsets of ๐, is a Polish
space (i.e. a complete, separable and metric space), provided that (๐, ๐) is Polish.
Proof. See, e.g., Castaing and Valadier [1977]. ๏ฟฝ
The concept of the Hausdorff distance and the support functions introduced above linkas follows to a nice ensemble: in a normed space (๐ (๐, ๐) = โ๐ โ ๐โ) the deviation D,using Minkowski addition, rewrites as D (๐ด,๐ถ) = inf {๐ > 0: ๐ด โ ๐ถ + ๐ ยท ๐ต๐ } where ๐ต๐ =
{๐ฅ : โ๐ฅโ โค 1} is the unit ball and ๐ถ๐ B ๐ถ + ๐ ยท ๐ต๐ is the ๐-fattening of ๐ถ. If ๐ด and ๐ถ areconvex, then D (๐ด,๐ถ) = inf
{๐ > 0: ๐ ๐ด โค ๐ ๐ถ + ๐ ยท ๐ ๐ต๐
}, where โโคโ is the usual โโคโ-comparison
of functions (๐ ๐ด โค ๐ ๐ถ + ๐ ยท ๐ ๐ต๐iff ๐ ๐ด (๐ฅโ) โค ๐ ๐ถ (๐ฅโ) + ๐ ยท ๐ ๐ต๐
(๐ฅโ) for all ๐ฅโ โ ๐โ). As ๐ ๐ต๐(๐ฅโ) =
sup๐โ๐ต๐๐ฅโ (๐) = โ๐ฅโโ, the norm of ๐ฅโ in the dual (๐โ, โ.โ) by the Hahn-Banach Theorem, this
simplifies further and it follows for general sets that
D (conv ๐ด, conv๐ถ) = inf{๐ > 0: ๐ ๐ด โ ๐ ๐ถ โค ๐ ยท ๐ ๐ต๐
}= supโ๐ฅโ โโค1
๐ ๐ด (๐ฅโ) โ ๐ ๐ถ (๐ฅโ) ,
and the Pompei-Hausdorff distance thus is
H (conv ๐ด, conv๐ถ) = supโ๐ฅโ โโค1
|๐ ๐ด (๐ฅโ) โ ๐ ๐ถ (๐ฅโ) | (18.3)
3in some references Hess [2002] also called excess of ๐ด over ๐ต.
Version: May 28, 2021
104 TOPOLOGIES FOR SET-VALUED CONVERGENCE
in terms of seminorms. These observations convincingly relate the Pompeiu-Hausdorff dis-tance with Minkowski addition of convex sets.
It follows from the preceding discussion and remarks that for relatively compact sets ๐ด
and ๐ถ there are ๐ โ ๐๐ด, ๐ โ ๐๐ถ and โ๐ฅโโ โค 1 such that D (๐ด,๐ถ) = โ๐ โ ๐โ = ๐ฅโ (๐ โ ๐). ๐ฅโ isan outer normal for both sets, conv ๐ด and conv๐ถ.
18.3 LOCAL DESCRIPTION
The sub-differential of a real-valued function ๐ : ๐โ โ R at a point ๐ฅโ โ ๐โ is the set 4
๐ ๐ (๐ฅโ) B {๐ข โ ๐ : ๐ (๐งโ) โ ๐ (๐ฅโ) โฅ ๐งโ (๐ข) โ ๐ฅโ (๐ข) for all ๐งโ โ ๐โ} โ ๐.
Notably ๐ ๐ (๐ฅโ) is a subset of ๐, so ๐ ๐ is a set-valued mapping which is expressed by writing
๐ ๐ : ๐โ โ ๐
๐ฅโ โฆโ ๐ ๐ (๐ฅโ) .
The symbolโ ๐ indicates that the outcomes are subsets โ a collection of elements โ of ๐.With the sub-differential at hand we may add the following standard characterization of
the support function ๐ ๐ด of a set ๐ด, which will turn out useful for our purpose:
Lemma 18.3. The support function ๐ ๐ด has the sub-differential ๐๐ ๐ด (๐ฅโ) = argmaxconv ๐ด ๐ฅโ.5
Moreover ๐๐ ๐ด (๐ฅโ) โ ๐๐ด.
Proof. With ๐ข โ argmaxconv ๐ด ๐ฅโ โ conv ๐ด, for any ๐งโ โ ๐โ we have that ๐ ๐ด (๐งโ) โฅ ๐งโ (๐ข) =๐ ๐ด (๐ฅโ) + ๐งโ (๐ข) โ ๐ฅโ (๐ข) and hence ๐ข โ ๐๐ ๐ด (๐ฅโ).
Conversely, with ๐ โ ๐๐ ๐ด (๐ฅโ) we have that ๐ ๐ด (๐งโ) โ ๐ ๐ด (๐ฅโ) โฅ ๐งโ (๐) โ ๐ฅโ (๐) or
๐ฅโ (๐) โฅ ๐ ๐ด (๐ฅโ) + ๐งโ (๐) โ ๐ ๐ด (๐งโ) (18.4)
for all ๐งโ. For the particular choice ๐งโ = 0 we find that ๐ฅโ (๐) โฅ ๐ ๐ด (๐ฅโ) and it remains to showthat ๐ โ conv ๐ด. Suppose that ๐ โ conv ๐ด, then โ by the Hahn-Banach Theorem โ there is a ๐งโ
such that ๐งโ (๐) > sup{๐งโ (๐โฒ) : ๐โฒ โ conv ๐ด
}= ๐ ๐ด (๐งโ). This same equation holds for multiples
_ ยท ๐งโ (_ > 0), hence (18.4) cannot hold in general; thus, ๐ โ conv ๐ด.The second statement is obvious. ๏ฟฝ
4note, that ๐ ๐ (๐ฅโ) โ ๐ is a subset in the pre-dual ๐ rather than ๐โโ.5We shall abbreviate the argument of the maximum of a function ๐ restricted to ๐ท by argmax๐ท ๐ B
argmax { ๐ (๐ฅ) : ๐ฅ โ ๐ท}.
rough draft: do not distribute
Index
CCapital Asset Pricing Model (CAPM), 32capital market line, CML, 27
DDirac measure, 98distribution function
cumulative, cdf, 37
Mmarginal, 98
Pportfolio
market, 30most efficient, 27tangency, 27
Rrisk
aggregate, 33idiosyncratic, 33residual, 33systematic, 33undiversifiable, 33unsystematic, 33
Ssecurity market line (SML), 32Sharpe ratio, 33
105