Paris-Princeton Lectures on Mathematical Finance 2003


Lecture Notes in Mathematics 1847
Editors:
J.-M. Morel, Cachan
F. Takens, Groningen
B. Teissier, Paris


Tomasz R. Bielecki, Tomas Bjork, Monique Jeanblanc, Marek Rutkowski, Jose A. Scheinkman, Wei Xiong

Paris-Princeton Lectures on Mathematical Finance 2003

Editorial Committee:
R. A. Carmona, E. Cinlar, I. Ekeland, E. Jouini, J. A. Scheinkman, N. Touzi


Authors

Tomasz R. Bielecki
Department of Applied Mathematics
Illinois Institute of Technology
Chicago, IL 60616, USA
e-mail: [email protected]

Tomas Bjork
Department of Finance
Stockholm School of Economics
Box 6501
113 83 Stockholm, Sweden
e-mail: [email protected]

Monique Jeanblanc
Equipe d’Analyse et Probabilites
Universite d’Evry-Val d’Essonne
91025 Evry, France
e-mail: [email protected]

Marek Rutkowski
Faculty of Mathematics and Information Science
Warsaw University of Technology
Pl. Politechniki 1
00-661 Warsaw, Poland
e-mail: [email protected]

Jose A. Scheinkman
Bendheim Center of Finance
Princeton University
Princeton NJ 08530, USA
e-mail: [email protected]

Wei Xiong
Bendheim Center of Finance
Princeton University
Princeton NJ 08530, USA
e-mail: [email protected]

[The addresses of the volume editors appear on page IX]

Library of Congress Control Number: 2004110085

Mathematics Subject Classification (2000): 92B24, 91B28, 91B44, 91B70, 60H30, 93E20

ISSN 0075-8434
ISBN 3-540-22266-9 Springer Berlin Heidelberg New York
DOI 10.1007/b98353

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable for prosecution under the German Copyright Law.

Springer is part of Springer Science+Business Media
springeronline.com

© Springer-Verlag Berlin Heidelberg 2004
Printed in Germany

The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Typesetting: Camera-ready TeX output by the authors

41/3142-543210 - Printed on acid-free paper


Preface

This is the second volume of the Paris-Princeton Lectures in Mathematical Finance. The goal of this series is to publish cutting-edge research in self-contained articles prepared by well-known leaders in the field or promising young researchers invited by the editors. Particular attention is paid to the quality of the exposition, and the aim is to publish articles that can serve as an introductory reference for research in the field.

The series is a result of frequent exchanges between researchers in finance and financial mathematics in Paris and Princeton. Many of us felt that the field would benefit from timely exposés of topics in which there is important progress. Rene Carmona, Erhan Cinlar, Ivar Ekeland, Elyes Jouini, Jose Scheinkman and Nizar Touzi will serve on the first editorial board of the Paris-Princeton Lectures in Financial Mathematics. Although many of the chapters in future volumes will involve lectures given in Paris or Princeton, we will also invite other contributions. Given the current nature of the collaboration between the two poles, we expect to produce a volume per year. Springer Verlag kindly offered to host this enterprise under the umbrella of the Lecture Notes in Mathematics series, and we are thankful to Catriona Byrne for her encouragement and her help in the initial stage of the initiative.

This second volume contains three chapters. The first one is written by Tomasz Bielecki, Monique Jeanblanc and Marek Rutkowski. It reviews recent developments in the reduced-form approach to credit risk and offers an exhaustive presentation of the hedging issues when contingent claims are subject to counterparty default. The second chapter is contributed by Tomas Bjork and is based on a short course given by him during the Spring of 2003 at Princeton University. It gives a detailed introduction to the geometric approach to mathematical models of fixed income markets. This contribution is a welcome addition to the long list of didactic surveys written by the author. Like the previous ones, it is bound to become a reference for newcomers to mathematical finance interested in learning how and why the geometric point of view is so natural and so powerful as an analysis tool. The last chapter is due to Jose Scheinkman and Wei Xiong. It considers dynamic trading by agents with heterogeneous beliefs. Among other things, it uses short-sale constraints and overconfidence of groups of agents to show that equilibrium prices can be consistent with speculative bubbles.

It is anticipated that the publication of this volume will coincide with the Third World Congress of the Bachelier Finance Society, to be held in Chicago (July 21-24, 2004).

The Editors
Paris / Princeton
June 04, 2004.


Contents

Hedging of Defaultable Claims
Tomasz R. Bielecki, Monique Jeanblanc, Marek Rutkowski
Part I. Replication of Defaultable Claims
1 Preliminaries
2 Defaultable Claims
3 Properties of Trading Strategies
4 Replication of Defaultable Claims
5 Vulnerable Claims and Credit Derivatives
6 PDE Approach
Part II. Mean-Variance Approach
7 Mean-Variance Pricing and Hedging
8 Strategies Adapted to the Reference Filtration
9 Strategies Adapted to the Full Filtration
10 Risk-Return Portfolio Selection
Part III. Indifference Pricing
11 Hedging in Incomplete Markets
12 Optimization Problems and BSDEs
13 Quadratic Hedging
14 Optimization in Incomplete Markets
References

On the Geometry of Interest Rate Models
Tomas Bjork
1 Introduction
2 A Primer on Linear Realization Theory
3 The Consistency Problem
4 The General Realization Problem
5 Constructing Realizations
6 The Filipovic and Teichmann Extension
7 Stochastic Volatility Models
References

Heterogeneous Beliefs, Speculation and Trading in Financial Markets
Jose Scheinkman, Wei Xiong
1 Introduction
2 A Static Model with Heterogeneous Beliefs and Short-Sales Constraints
3 A Dynamic Model in Discrete Time with Short-Sales Constraints
4 No-Trade Theorem under Rational Expectations
5 Overconfidence as Source of Heterogeneous Beliefs
6 Trading and Equilibrium Price in Continuous Time
7 Other Related Models
8 Survival of Traders with Incorrect Beliefs
9 Some Remaining Problems
References


Editors

Rene A. Carmona
Paul M. Wythes ’55 Professor of Engineering and Finance
ORFE and Bendheim Center for Finance
Princeton University
Princeton NJ 08540, USA
email: [email protected]

Erhan Cinlar
Norman J. Sollenberger Professor of Engineering
ORFE and Bendheim Center for Finance
Princeton University
Princeton NJ 08540, USA
email: [email protected]

Ivar Ekeland
Canada Research Chair in Mathematical Economics
Department of Mathematics, Annex 1210
University of British Columbia
1984 Mathematics Road
Vancouver, B.C., Canada V6T 1Z2
email: [email protected]

Elyes Jouini
CEREMADE, UFR Mathematiques de la Decision
Universite Paris-Dauphine
Place du Marechal de Lattre de Tassigny
75775 Paris Cedex 16, France
email: [email protected]

Jose A. Scheinkman
Theodore Wells ’29 Professor of Economics
Department of Economics and Bendheim Center for Finance
Princeton University
Princeton NJ 08540, USA
email: [email protected]

Nizar Touzi
Centre de Recherche en Economie et Statistique
15 Blvd Gabriel Peri
92241 Malakoff Cedex, France
email: [email protected]


Hedging of Defaultable Claims

Tomasz R. Bielecki (1), Monique Jeanblanc (2) and Marek Rutkowski (3)

(1) Department of Applied Mathematics
Illinois Institute of Technology
Chicago, USA
email: [email protected]

(2) Equipe d’Analyse et Probabilites
Universite d’Evry-Val d’Essonne
Evry, France
email: [email protected]

(3) Faculty of Mathematics and Information Science
Warsaw University of Technology
and
Institute of Mathematics of the Polish Academy of Sciences
Warszawa, Poland
email: [email protected]

Summary. The goal of this chapter is to present a survey of recent developments in the practically important and challenging area of hedging credit risk. In a companion work, Bielecki et al. (2004a), we presented techniques and results related to the valuation of defaultable claims. It should be emphasized that in most existing papers on credit risk, the risk-neutral valuation of defaultable claims is not supported by any other argument than the desire to produce an arbitrage-free model of default-free and defaultable assets. Here, we focus on the possibility of a perfect replication of defaultable claims and, if the latter is not feasible, on various approaches to hedging in an incomplete setting.

Key words: Defaultable claims, credit risk, perfect replication, incomplete markets, mean-variance hedging, expected utility maximization, indifference pricing.
MSC 2000 subject classification. 91B24, 91B28, 91B70, 60H30, 93E20

Acknowledgements: Tomasz R. Bielecki was supported in part by NSF Grant 0202851. Monique Jeanblanc thanks T.R.B. and M.R. for their hospitality during her visits to Chicago and Warsaw. Marek Rutkowski thanks M.J. for her hospitality during his visit to Evry. Marek Rutkowski was supported in part by KBN Grant PBZ-KBN-016/P03/1999.

T.R. Bielecki et al.: LNM 1847, R.A. Carmona et al. (Eds.), pp. 1–132, 2004.
© Springer-Verlag Berlin Heidelberg 2004


Introduction

The present chapter is naturally divided into three different parts.

Part I is devoted to methods and results related to the replication of defaultable claims within the reduced-form approach (also known as the intensity-based approach). Let us mention that the replication of defaultable claims in the so-called structural approach, which was initiated by Merton (1973) and Black and Cox (1976), is entirely different (and rather standard), since the value of the firm is usually postulated to be a tradeable underlying asset. Since we work within the reduced-form framework, we focus on the possibility of an exact replication of a given defaultable claim through a trading strategy based on default-free and defaultable securities. First, we analyze (following, in particular, Vaillant (2001)) various classes of self-financing trading strategies based on default-free and defaultable primary assets. Subsequently, we present various applications of general results to financial models with default-free and defaultable primary assets. We develop a systematic approach to replication of a generic defaultable claim, and we provide closed-form expressions for prices and replicating strategies for several typical defaultable claims. Finally, we present a few examples of replicating strategies for particular credit derivatives. In the last section, we present, by means of an example, the PDE approach to the valuation and hedging of defaultable claims within the framework of a complete model.

In Part II, we formulate a new paradigm for pricing and hedging financial risks in incomplete markets, rooted in the classical Markowitz mean-variance portfolio selection principle and first examined within the context of credit risk by Bielecki and Jeanblanc (2003). We consider an investor who is interested in dynamic selection of her portfolio, so that the expected value of her wealth at the end of the pre-selected planning horizon is no less than some floor value, and so that the associated risk, as measured by the variance of the wealth at the end of the planning horizon, is minimized. If perfect replication is not possible, then the determination of a price that the investor is willing to pay for the opportunity will become subject to the investor’s overall attitude towards trading. In the case of our investor, the bid price and the corresponding hedging strategy are to be determined in accordance with the mean-variance paradigm.

The optimization techniques used in Part II are based on the mean-variance portfolio selection in continuous time. To the best of our knowledge, Zhou and Li (2000) were the first to use the embedding technique and linear-quadratic (LQ) optimal control theory to solve the continuous-time mean-variance problem with assets having deterministic diffusion coefficients. Their approach was subsequently developed in various directions by, among others, Li et al. (2001), Lim and Zhou (2002), Zhou and Yin (2002), and Bielecki et al. (2004b). For an excellent survey of most of these results, the interested reader is referred to Zhou (2003).

In the final part, we present a few alternative ways of pricing defaultable claims in the situation when perfect hedging is not possible. We study the indifference pricing approach, which was initiated by Hodges and Neuberger (1989). This method leads us to solving portfolio optimization problems in an incomplete market model, and we shall use the dynamic programming approach. In particular, we compare the indifference prices obtained using strategies adapted to the reference filtration to the indifference prices obtained using strategies based on the enlarged filtration, which also encompasses the observation of the default time. We also solve portfolio optimization problems for the case of the exponential utility; our method relies here on the ideas of Rouge and El Karoui (2000) and Musiela and Zariphopoulou (2004). Next, we study a particular indifference price based on the quadratic criterion; it will be referred to as the quadratic hedging price. In a default-free setting, a similar study was done by Kohlmann and Zhou (2000). Finally, we present a solution to a specific optimization problem, using the duality approach for exponential utilities.

Part I. Replication of Defaultable Claims

The goal of this part is to present some methods and results related to the replication of defaultable claims within the reduced-form approach (also known as the intensity-based approach). In contrast to some other related works, in which this issue was addressed by invoking a suitable version of the martingale representation theorem (see, for instance, Belanger et al. (2001) or Blanchet-Scalliet and Jeanblanc (2004)), we analyze here the possibility of a perfect replication of a given defaultable claim through a trading strategy based on default-free and defaultable securities. Therefore, the important issue of the choice of primary assets that are used to replicate a defaultable claim (e.g., a vulnerable option or a credit derivative) is emphasized. Let us stress that replication of defaultable claims within the structural approach to credit risk is rather standard, since in this approach the default time is, typically, a predictable stopping time with respect to the filtration generated by prices of primary assets.

By contrast, in the intensity-based approach, the default time is not a stopping time with respect to the filtration generated by prices of default-free primary assets, and it is a totally inaccessible stopping time with respect to the enlarged filtration, that is, the filtration generated by the prices of primary assets and the jump process associated with the random moment of default.

Our research in this part was motivated, in particular, by the paper by Vaillant (2001). Other related works include: Wong (1998), Arvanitis and Laurent (1999), Greenfield (2000), Lukas (2001), Collin-Dufresne and Hugonnier (2002) and Jamshidian (2002).

For a more exhaustive presentation of the mathematical theory of credit risk, we refer to the monographs by Cossin and Pirotte (2000), Arvanitis and Gregory (2001), Bielecki and Rutkowski (2002), Duffie and Singleton (2003), or Schonbucher (2003).


This part is organized as follows. Section 1 is devoted to a brief description of the basic concepts that are used in what follows. In Section 2, we formally introduce the definition of a generic defaultable claim $(X, C, Z, \tau)$ and we examine the basic features of its ex-dividend price and pre-default value. The well-known valuation results for defaultable claims are also provided. In the next section, we analyze (following, in particular, Vaillant (2001)) various classes of self-financing trading strategies based on default-free and defaultable primary assets.

Section 4 deals with applications of results obtained in the preceding section to financial models with default-free and defaultable primary assets. We develop a systematic approach to replication of a generic defaultable claim, and we provide closed-form expressions for prices and replicating strategies for several typical defaultable claims. A few examples of replicating strategies for particular credit derivatives are presented.

Finally, in the last section, we examine the PDE approach to the valuation and hedging of defaultable claims.

1 Preliminaries

In this section, we introduce the basic notions that will be used in what follows. First, we introduce a default-free market model; for the sake of concreteness we focus on default-free zero-coupon bonds. Subsequently, we shall examine the concept of a random time associated with a prespecified hazard process.

1.1 Default-Free Market

Consider an economy in continuous time, with the time parameter $t \in \mathbb{R}_+$. We are given a filtered probability space $(\Omega, \mathbb{F}, \mathbb{P}^*)$ endowed with a $d$-dimensional standard Brownian motion $W^*$. It is convenient to assume that $\mathbb{F}$ is the $\mathbb{P}^*$-augmented and right-continuous version of the natural filtration generated by $W^*$. As we shall see in what follows, the filtration $\mathbb{F}$ will also play the important role of a reference filtration for the intensity of the default event. Let us recall that any (local) martingale with respect to a Brownian filtration $\mathbb{F}$ is continuous; this well-known property will be used frequently in what follows.

In the first step, we introduce an arbitrage-free default-free market. In this market, we have the following primary assets:

• A money market account $B$ satisfying

$$dB_t = r_t B_t\,dt, \quad B_0 = 1,$$

or, equivalently,

$$B_t = \exp\Big(\int_0^t r_u\,du\Big),$$

where $r$ is an $\mathbb{F}$-progressively measurable stochastic process. Thus, $B$ is an $\mathbb{F}$-adapted, continuous, and strictly positive process of finite variation.

• Default-free zero-coupon bonds with prices

$$B(t,T) = B_t\,\mathbb{E}_{\mathbb{P}^*}\big(B_T^{-1} \mid \mathcal{F}_t\big), \quad \forall\, t \le T,$$

where $T$ is the bond’s maturity date. Since the filtration $\mathbb{F}$ is generated by a Brownian motion, for any maturity date $T > 0$ we have

$$dB(t,T) = B(t,T)\big(r_t\,dt + b(t,T)\,dW^*_t\big)$$

for some $\mathbb{F}$-predictable, $\mathbb{R}^d$-valued process $b(t,T)$, referred to as the bond’s volatility.
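As a numerical illustration of the two primary assets, the following minimal sketch assumes a deterministic short rate, in which case the conditional expectation is trivial and $B(t,T) = B_t / B_T$. The function names and the particular rate curve are hypothetical choices made for this example only.

```python
import math


def integrate(f, a, b, n=10_000):
    """Composite trapezoidal rule for a deterministic function f on [a, b]."""
    if a == b:
        return 0.0
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for k in range(1, n):
        total += f(a + k * h)
    return total * h


def money_market(r, t):
    """B_t = exp(int_0^t r_u du), so that B_0 = 1."""
    return math.exp(integrate(r, 0.0, t))


def zero_coupon_bond(r, t, T):
    """B(t, T) = B_t / B_T; valid only because r is deterministic here."""
    return money_market(r, t) / money_market(r, T)


def short_rate(u):
    """Hypothetical short rate: 2% today, rising by 1% per year."""
    return 0.02 + 0.01 * u


# Price at time 0 of the bond maturing at T = 2.
price = zero_coupon_bond(short_rate, 0.0, 2.0)
```

With a genuinely stochastic short rate the ratio $B_t / B_T$ would have to be averaged under $\mathbb{P}^*$ conditionally on $\mathcal{F}_t$, which this sketch does not attempt.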

For the sake of expositional simplicity, we shall postulate throughout that the default-free term structure model is complete. The probability $\mathbb{P}^*$ is thus the unique martingale measure for the default-free market model. This assumption is not essential, however. Notice that all price processes introduced above are continuous $\mathbb{F}$-semimartingales.

Remarks. The bond was chosen as a convenient and practically important example of a tradeable financial asset. We shall be illustrating our theoretical derivations with examples in which the bond market will play a prominent role. Most results can be easily applied to other classes of financial models, such as: models of equity markets, futures markets, or currency markets, as well as to models of LIBORs and swap rates.

1.2 Random Time

Let $\tau$ be a non-negative random variable on a probability space $(\Omega, \mathcal{G}, \mathbb{Q}^*)$, termed a random time (it will later be referred to as a default time). We introduce the jump process $H_t = \mathbf{1}_{\{\tau \le t\}}$ and we denote by $\mathbb{H}$ the filtration generated by this process.

Hazard process. We now assume that some reference filtration $\mathbb{F}$ such that $\mathcal{F}_t \subseteq \mathcal{G}$ is given. We set $\mathbb{G} = \mathbb{F} \vee \mathbb{H}$, so that $\mathcal{G}_t = \mathcal{F}_t \vee \mathcal{H}_t = \sigma(\mathcal{F}_t, \mathcal{H}_t)$ for every $t \in \mathbb{R}_+$. The filtration $\mathbb{G}$ is referred to as the full filtration: it includes the observations of the default event. It is clear that $\tau$ is an $\mathbb{H}$-stopping time, as well as a $\mathbb{G}$-stopping time (but not necessarily an $\mathbb{F}$-stopping time). The concept of the hazard process of a random time $\tau$ is closely related to the process $F_t$, which is defined as follows:

$$F_t = \mathbb{Q}^*\{\tau \le t \mid \mathcal{F}_t\}, \quad \forall\, t \in \mathbb{R}_+.$$

Let us denote $G_t = 1 - F_t = \mathbb{Q}^*\{\tau > t \mid \mathcal{F}_t\}$ and let us assume that $G_t > 0$ for every $t \in \mathbb{R}_+$ (hence, we exclude the case where $\tau$ is an $\mathbb{F}$-stopping time). Then the process $\Gamma : \mathbb{R}_+ \to \mathbb{R}_+$, given by the formula

$$\Gamma_t = -\ln(1 - F_t) = -\ln G_t, \quad \forall\, t \in \mathbb{R}_+,$$

is termed the hazard process of a random time $\tau$ with respect to the reference filtration $\mathbb{F}$, or briefly the $\mathbb{F}$-hazard process of $\tau$.

Notice that $\Gamma_\infty = \infty$ and $\Gamma$ is an $\mathbb{F}$-submartingale, in general. We shall only consider the case when $\Gamma$ is an increasing process (for a construction of a random time associated with a given hazard process $\Gamma$, see Section 1.2). This is not a serious loss of generality. We refer to Blanchet-Scalliet and Jeanblanc (2004) for a discussion regarding completeness of the underlying financial market and properties of the process $\Gamma$. They show that if the underlying financial market is complete then the so-called (H) hypothesis is satisfied and, as a consequence, the process $\Gamma$ is indeed increasing.

Remarks. The simplifying assumption that $\mathbb{Q}^*\{\tau > t \mid \mathcal{F}_t\} > 0$ for every $t \in \mathbb{R}_+$ can be relaxed. First, if we fix a maturity date $T$, it suffices to postulate that $\mathbb{Q}^*\{\tau > T \mid \mathcal{F}_T\} > 0$. Second, if we have $\mathbb{Q}^*\{\tau \le T\} = 1$, so that the default time is bounded by some $U = \operatorname{ess\,sup} \tau \le T$, then it suffices to postulate that $\mathbb{Q}^*\{\tau > t \mid \mathcal{F}_t\} > 0$ for every $t \in [0, U)$ and to examine separately the event $\{\tau = U\}$. For a general approach to hazard processes, the interested reader is referred to Belanger et al. (2001).

Deterministic intensity. The study of a simple case when the reference filtration $\mathbb{F}$ is trivial (or when a random time $\tau$ is independent of the filtration $\mathbb{F}$, and thus the hazard process is deterministic) may be instructive. Assume that $\tau$ is such that the cumulative distribution function $F(t) = \mathbb{Q}^*\{\tau \le t\}$ is an absolutely continuous function, that is,

$$F(t) = \int_0^t f(u)\,du$$

for some density function $f : \mathbb{R}_+ \to \mathbb{R}_+$. Then we have

$$F(t) = 1 - e^{-\Gamma(t)} = 1 - e^{-\int_0^t \gamma(u)\,du}, \quad \forall\, t \in \mathbb{R}_+,$$

where (recall that we postulated that $G(t) = 1 - F(t) > 0$)

$$\gamma(t) = \frac{f(t)}{1 - F(t)}, \quad \forall\, t \in \mathbb{R}_+.$$

The function $\gamma : \mathbb{R}_+ \to \mathbb{R}$ is non-negative and satisfies $\int_0^\infty \gamma(u)\,du = \infty$. It is called the intensity function of $\tau$ (or the hazard rate). It can be checked by direct calculations that the process $H_t - \int_0^{t \wedge \tau} \gamma(u)\,du$ is an $\mathbb{H}$-martingale.
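The relation between the density, the hazard rate and the distribution function can be checked numerically. The sketch below assumes, purely for illustration, that $\tau$ has a Weibull law; the parameter values and function names are hypothetical. It computes $\gamma = f/(1-F)$, integrates it to obtain $\Gamma(t)$, and recovers $F(t) = 1 - e^{-\Gamma(t)}$.

```python
import math

SHAPE, SCALE = 2.0, 1.5  # hypothetical Weibull parameters for the law of tau


def cdf(t):
    """F(t) = Q*{tau <= t} for the assumed Weibull law."""
    return 1.0 - math.exp(-((t / SCALE) ** SHAPE))


def density(t):
    """f(t) = F'(t)."""
    z = t / SCALE
    return (SHAPE / SCALE) * z ** (SHAPE - 1.0) * math.exp(-(z ** SHAPE))


def hazard_rate(t):
    """gamma(t) = f(t) / (1 - F(t))."""
    return density(t) / (1.0 - cdf(t))


def hazard_function(t, n=20_000):
    """Gamma(t) = int_0^t gamma(u) du, approximated by the midpoint rule."""
    h = t / n
    return h * sum(hazard_rate((k + 0.5) * h) for k in range(n))


def cdf_from_hazard(t):
    """F(t) recovered as 1 - exp(-Gamma(t))."""
    return 1.0 - math.exp(-hazard_function(t))
```

For this choice of law the hazard rate is linear in $t$, so $\Gamma(t) = (t/\text{SCALE})^2$ and the two expressions for $F(t)$ agree up to rounding error.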

Stochastic intensity. Assume that the hazard process $\Gamma$ is absolutely continuous with respect to the Lebesgue measure (and therefore an increasing process), so that there exists a process $\gamma$ such that $\Gamma_t = \int_0^t \gamma_u\,du$ for every $t \in \mathbb{R}_+$. Then the $\mathbb{F}$-predictable version of $\gamma$ is called the stochastic intensity of $\tau$ with respect to $\mathbb{F}$, or simply the $\mathbb{F}$-intensity of $\tau$. In terms of the stochastic intensity, the conditional probability of the default event $\{t < \tau \le T\}$, given the full information $\mathcal{G}_t$ available at time $t$, equals

$$\mathbb{Q}^*\{t < \tau \le T \mid \mathcal{G}_t\} = \mathbf{1}_{\{\tau > t\}}\,\mathbb{E}_{\mathbb{Q}^*}\Big(1 - e^{-\int_t^T \gamma_u\,du} \,\Big|\, \mathcal{F}_t\Big).$$

Thus

$$\mathbb{Q}^*\{\tau > T \mid \mathcal{G}_t\} = \mathbf{1}_{\{\tau > t\}}\,\mathbb{E}_{\mathbb{Q}^*}\Big(e^{-\int_t^T \gamma_u\,du} \,\Big|\, \mathcal{F}_t\Big).$$

It can be shown (see, for instance, Jeanblanc and Rutkowski (2002) or Bielecki and Rutkowski (2004)) that the process

$$H_t - \Gamma_{\tau \wedge t} = H_t - \int_0^{\tau \wedge t} \gamma_u\,du = H_t - \int_0^t (1 - H_u)\gamma_u\,du, \quad \forall\, t \in \mathbb{R}_+,$$

is a (purely discontinuous) $\mathbb{G}$-martingale.
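A martingale started at zero has zero mean at every date, and this necessary condition is easy to test by simulation. The sketch below assumes the degenerate case of a constant intensity $\gamma$, so that $\tau$ is exponentially distributed; the function names, sample size and seed are hypothetical. It estimates $\mathbb{E}(H_t - \gamma\,(t \wedge \tau))$ by Monte Carlo.

```python
import random


def compensated_jump(tau, t, gamma):
    """M_t = H_t - gamma * min(t, tau), for a constant intensity gamma."""
    h_t = 1.0 if tau <= t else 0.0
    return h_t - gamma * min(t, tau)


def mean_compensated_jump(t, gamma, n_paths=100_000, seed=0):
    """Monte Carlo estimate of E[M_t] when tau is exponential with rate gamma."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        tau = rng.expovariate(gamma)
        total += compensated_jump(tau, t, gamma)
    return total / n_paths
```

The estimate should be close to zero for every $t$, since both $\mathbb{E}(H_t)$ and $\mathbb{E}(\gamma\,(t \wedge \tau))$ equal $1 - e^{-\gamma t}$. This checks only the mean, not the full martingale property.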

Construction of a Random Time

We shall now briefly describe the most commonly used construction of a random time associated with a given hazard process $\Gamma$. It should be stressed that the random time obtained through this particular method (which will be called the canonical construction in what follows) has certain specific features that are not necessarily shared by all random times with a given $\mathbb{F}$-hazard process $\Gamma$. We start by assuming that we are given an $\mathbb{F}$-adapted, right-continuous, increasing process $\Gamma$ defined on a filtered probability space $(\Omega, \mathbb{F}, \mathbb{P}^*)$. As usual, we assume that $\Gamma_0 = 0$ and $\Gamma_\infty = +\infty$. In many instances, the hazard process $\Gamma$ is given by the equality

$$\Gamma_t = \int_0^t \gamma_u\,du, \quad \forall\, t \in \mathbb{R}_+,$$

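for some non-negative, $\mathbb{F}$-predictable, stochastic intensity $\gamma$. To construct a random time $\tau$ such that $\Gamma$ is the $\mathbb{F}$-hazard process of $\tau$, we need to enlarge the underlying probability space $\Omega$. This also means that $\Gamma$ is not the $\mathbb{F}$-hazard process of $\tau$ under $\mathbb{P}^*$, but rather the $\mathbb{F}$-hazard process of $\tau$ under a suitable extension $\mathbb{Q}^*$ of the probability measure $\mathbb{P}^*$. Let $\xi$ be a random variable defined on some probability space $(\hat\Omega, \hat{\mathcal{F}}, \hat{\mathbb{Q}})$, uniformly distributed on the interval $[0,1]$ under $\hat{\mathbb{Q}}$. We consider the product space $\tilde\Omega = \Omega \times \hat\Omega$, endowed with the product $\sigma$-field $\mathcal{G} = \mathcal{F}_\infty \otimes \hat{\mathcal{F}}$ and the product probability measure $\mathbb{Q}^* = \mathbb{P}^* \otimes \hat{\mathbb{Q}}$. The latter equality means that for arbitrary events $A \in \mathcal{F}_\infty$ and $B \in \hat{\mathcal{F}}$ we have $\mathbb{Q}^*\{A \times B\} = \mathbb{P}^*\{A\}\,\hat{\mathbb{Q}}\{B\}$. We define the random time $\tau : \tilde\Omega \to \mathbb{R}_+$ by setting

$$\tau = \inf\{t \in \mathbb{R}_+ : e^{-\Gamma_t} \le \xi\} = \inf\{t \in \mathbb{R}_+ : \Gamma_t \ge \eta\},$$

where the random variable $\eta = -\ln \xi$ has a unit exponential law under $\mathbb{Q}^*$. It is not difficult to find the process $F_t = \mathbb{Q}^*\{\tau \le t \mid \mathcal{F}_t\}$. Indeed, since clearly $\{\tau > t\} = \{\xi < e^{-\Gamma_t}\}$ and the random variable $\Gamma_t$ is $\mathcal{F}_\infty$-measurable, we obtain

$$\mathbb{Q}^*\{\tau > t \mid \mathcal{F}_\infty\} = \mathbb{Q}^*\{\xi < e^{-\Gamma_t} \mid \mathcal{F}_\infty\} = \hat{\mathbb{Q}}\{\xi < e^{-x}\}_{x = \Gamma_t} = e^{-\Gamma_t}.$$

Consequently, we have

$$1 - F_t = \mathbb{Q}^*\{\tau > t \mid \mathcal{F}_t\} = \mathbb{E}_{\mathbb{Q}^*}\big(\mathbb{Q}^*\{\tau > t \mid \mathcal{F}_\infty\} \mid \mathcal{F}_t\big) = e^{-\Gamma_t},$$

and so $F$ is an $\mathbb{F}$-adapted, right-continuous, increasing process. It is also clear that $\Gamma$ is the $\mathbb{F}$-hazard process of $\tau$ under $\mathbb{Q}^*$. Finally, it can be checked that any $\mathbb{P}^*$-Brownian motion $W^*$ with respect to $\mathbb{F}$ remains a Brownian motion under $\mathbb{Q}^*$ with respect to the enlarged filtration $\mathbb{G} = \mathbb{F} \vee \mathbb{H}$.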
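The canonical construction translates directly into a simulation recipe: draw $\eta = -\ln \xi$ and stop the first time the hazard process reaches $\eta$. The sketch below is a time-discretized version under stated assumptions: the intensity is passed as a function of time (a deterministic function here, or one simulated path of $\gamma$), $\Gamma$ is approximated by a left-endpoint Riemann sum, and all names, step sizes and seeds are hypothetical.

```python
import math
import random


def sample_default_time(intensity, rng, dt=0.01, horizon=100.0):
    """Approximate tau = inf{t : Gamma_t >= eta} on a grid of step dt."""
    # xi = 1 - U is uniform on (0, 1], so eta = -ln(xi) is unit exponential.
    eta = -math.log(1.0 - rng.random())
    hazard, t = 0.0, 0.0
    while t < horizon:
        hazard += intensity(t) * dt  # left-endpoint approximation of Gamma
        t += dt
        if hazard >= eta:
            return t
    return math.inf  # no default before the simulation horizon


def empirical_survival(intensity, t, n_paths=10_000, seed=1):
    """Fraction of simulated default times that exceed t."""
    rng = random.Random(seed)
    alive = sum(sample_default_time(intensity, rng) > t for _ in range(n_paths))
    return alive / n_paths
```

For a constant intensity $\gamma$ the empirical survival frequency at time $t$ should be close to $e^{-\gamma t}$, in agreement with $1 - F_t = e^{-\Gamma_t}$; the grid introduces a small upward bias of order `dt`.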

2 Defaultable Claims

A generic defaultable claim $(X, C, Z, \tau)$ with maturity date $T$ consists of:

• The default time $\tau$ specifying the random time of default and thus also the default events $\{\tau \le t\}$ for every $t \in [0, T]$. It is always assumed that $\tau$ is strictly positive with probability 1.

• The promised payoff $X$, which represents the random payoff received by the owner of the claim at time $T$, if there was no default prior to or at time $T$. The actual payoff at time $T$ associated with $X$ thus equals $X \mathbf{1}_{\{\tau > T\}}$.

• The finite variation process $C$ representing the promised dividends, that is, the stream of (continuous or discrete) random cash flows received by the owner of the claim prior to default or up to time $T$, whichever comes first. We assume that $C_T - C_{T-} = 0$.

• The recovery process $Z$, which specifies the recovery payoff $Z_\tau$ received by the owner of a claim at time of default, provided that the default occurs prior to or at maturity date $T$.

It is convenient to introduce the dividend process $D$, which represents all cash flows associated with a defaultable claim $(X, C, Z, \tau)$. Formally, the dividend process $D$ is defined through the formula

$$D_t = X \mathbf{1}_{\{\tau > T\}} \mathbf{1}_{[T,\infty)}(t) + \int_{(0,t]} (1 - H_u)\,dC_u + \int_{(0,t]} Z_u\,dH_u,$$

where both integrals are (stochastic) Stieltjes integrals.
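For a claim whose promised dividends are paid at finitely many dates, the two Stieltjes integrals reduce to sums, and the dividend process can be evaluated scenario by scenario. The sketch below assumes discrete coupons given as hypothetical `(time, amount)` pairs with no coupon at $T$ (so that $C_T - C_{T-} = 0$), a recovery function of the default date, and $t$ restricted to $[0, T]$; all names are illustrative.

```python
def dividend_process(t, X, T, tau, coupons, recovery):
    """D_t for one scenario: a fixed default time tau and discrete coupons."""
    if not 0.0 <= t <= T:
        raise ValueError("this sketch only covers 0 <= t <= T")
    # Promised payoff X, paid at T only if there was no default up to T.
    promised = X if (tau > T and t >= T) else 0.0
    # int_(0,t] (1 - H_u) dC_u: coupons dated in (0, t] and strictly before tau.
    paid = sum(c for (s, c) in coupons if 0.0 < s <= t and s < tau)
    # int_(0,t] Z_u dH_u: the recovery Z_tau, paid if default occurred by t.
    recovered = recovery(tau) if tau <= t else 0.0
    return promised + paid + recovered


# Hypothetical claim: payoff 100 at T = 2, coupons of 2, recovery of 40.
coupons = [(0.5, 2.0), (1.0, 2.0), (1.5, 2.0)]
no_default = dividend_process(2.0, 100.0, 2.0, 3.0, coupons, lambda s: 40.0)
default_at_1_2 = dividend_process(2.0, 100.0, 2.0, 1.2, coupons, lambda s: 40.0)
```

Since $1 - H_u = \mathbf{1}_{\{u < \tau\}}$, a coupon dated exactly at the default time is not paid in this sketch; the recovery payoff is received instead.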

Definition 1. The ex-dividend price process $U$ of a defaultable claim of the form $(X, C, Z, \tau)$ which settles at time $T$ is given as

$$U_t = B_t\,\mathbb{E}_{\mathbb{Q}^*}\Big(\int_{(t,T]} B_u^{-1}\,dD_u \,\Big|\, \mathcal{G}_t\Big), \quad \forall\, t \in [0, T),$$

where $\mathbb{Q}^*$ is the spot martingale measure and $B$ is the savings account. In addition, at maturity date we set $U_T = U_T(X) + U_T(Z) = X \mathbf{1}_{\{\tau > T\}} + Z_T \mathbf{1}_{\{\tau = T\}}$.

Observe that $U_t = U_t(X) + U_t(Z) + U_t(C)$, where the meaning of $U_t(X)$, $U_t(Z)$ and $U_t(C)$ is clear. Recall also that the filtration $\mathbb{G}$ models the full information, that is, the observations of the default-free market and of the default event.

2.1 Default Time

We assume from now on that we are given an $\mathbb{F}$-adapted, right-continuous, increasing process $\Gamma$ on $(\Omega, \mathbb{F}, \mathbb{P}^*)$ with $\Gamma_\infty = \infty$. The default time $\tau$ and the probability measure $\mathbb{Q}^*$ are constructed as in Section 1.2. The probability $\mathbb{Q}^*$ will play the role of the martingale probability for the defaultable market. It is essential to observe that:

• The Wiener process $W^*$ is also a Wiener process with respect to $\mathbb{G}$ under the probability measure $\mathbb{Q}^*$.

• We have $\mathbb{Q}^*|_{\mathcal{F}_t} = \mathbb{P}^*|_{\mathcal{F}_t}$ for every $t \in [0, T]$.

If the hazard process $\Gamma$ admits the integral representation $\Gamma_t = \int_0^t \gamma_u\,du$ then the process $\gamma$ is called the (stochastic) intensity of default with respect to the reference filtration $\mathbb{F}$.

2.2 Risk-Neutral Valuation

We shall now present the well-known valuation formulae for defaultable claims within the reduced-form setup (see, e.g., Lando (1998), Schönbucher (1998), Bielecki and Rutkowski (2004) or Bielecki et al. (2004a)).

Terminal payoff. The valuation of the terminal payoff is based on the following classic result.

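Lemma 1. For any $\mathcal{G}$-measurable, integrable random variable $X$ and any $t \le T$ we have

\[ \mathbb{E}_{\mathbb{Q}^*}(\mathbb{1}_{\{\tau>T\}} X \mid \mathcal{G}_t) = \mathbb{1}_{\{\tau>t\}}\, \frac{\mathbb{E}_{\mathbb{Q}^*}(\mathbb{1}_{\{\tau>T\}} X \mid \mathcal{F}_t)}{\mathbb{Q}^*(\tau > t \mid \mathcal{F}_t)}. \]

If, in addition, $X$ is $\mathcal{F}_T$-measurable then

\[ \mathbb{E}_{\mathbb{Q}^*}(\mathbb{1}_{\{\tau>T\}} X \mid \mathcal{G}_t) = \mathbb{1}_{\{\tau>t\}}\, \mathbb{E}_{\mathbb{Q}^*}(e^{\Gamma_t - \Gamma_T} X \mid \mathcal{F}_t). \]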
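The second formula of Lemma 1 can be checked numerically in the simplest possible setting. The sketch below rests on assumptions made only for illustration and not taken from the text: the hazard process is deterministic, $\Gamma_t = \gamma t$ with a constant $\gamma > 0$, the payoff is $X = 1$, and the default time is simulated as $\tau = \inf\{t : \Gamma_t \ge \zeta\}$ for a unit exponential random variable $\zeta$. Under these assumptions the lemma reduces to $\mathbb{Q}^*(\tau > T \mid \tau > t) = e^{-\gamma (T - t)}$. All function names and parameter values are hypothetical.

```python
import math
import random


def simulate_default_time(gamma, rng):
    """Draw tau = inf{t : gamma * t >= zeta} with zeta ~ Exp(1) (assumed construction)."""
    zeta = rng.expovariate(1.0)
    return zeta / gamma


def conditional_survival_mc(gamma, t, T, n_paths=200_000, seed=0):
    """Monte Carlo estimate of Q(tau > T | tau > t) for a constant intensity gamma."""
    rng = random.Random(seed)
    alive_at_t = 0
    alive_at_T = 0
    for _ in range(n_paths):
        tau = simulate_default_time(gamma, rng)
        if tau > t:
            alive_at_t += 1
            if tau > T:
                alive_at_T += 1
    return alive_at_T / alive_at_t


gamma, t, T = 0.3, 1.0, 3.0
mc = conditional_survival_mc(gamma, t, T)
exact = math.exp(-gamma * (T - t))  # e^{Gamma_t - Gamma_T} for Gamma_t = gamma * t
print(f"Monte Carlo: {mc:.4f}, Lemma 1: {exact:.4f}")
```

The two numbers should agree up to Monte Carlo error; with a stochastic hazard process the right-hand side would instead require a conditional expectation with respect to $\mathcal{F}_t$.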

Let $X$ be an $\mathcal{F}_T$-measurable random variable representing the promised payoff at maturity date $T$. We consider a defaultable claim of the form $\mathbb{1}_{\{\tau>T\}} X$ with zero recovery in case of default (i.e., we set $Z = C = 0$). Using the definition of the ex-dividend price of a defaultable claim, we get the following risk-neutral valuation formula

\[ U_t(X) = B_t\, \mathbb{E}_{\mathbb{Q}^*}(B_T^{-1} \mathbb{1}_{\{\tau>T\}} X \mid \mathcal{G}_t), \]

which holds for any $t < T$. The next result is a straightforward consequence of Lemma 1.

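Proposition 1. The price of the promised payoff $X$ satisfies for $t \in [0, T]$

\[ U_t(X) = B_t\, \mathbb{E}_{\mathbb{Q}^*}(B_T^{-1} X \mathbb{1}_{\{\tau>T\}} \mid \mathcal{G}_t) = \mathbb{1}_{\{\tau>t\}}\, \tilde U_t(X), \qquad (1) \]

where we define

\[ \tilde U_t(X) = B_t\, \mathbb{E}_{\mathbb{Q}^*}(B_T^{-1} e^{\Gamma_t - \Gamma_T} X \mid \mathcal{F}_t) = \tilde B_t\, \mathbb{E}_{\mathbb{Q}^*}(\tilde B_T^{-1} X \mid \mathcal{F}_t), \]

where the risk-adjusted savings account $\tilde B_t$ equals $\tilde B_t = B_t e^{\Gamma_t}$. If, in addition, the default time admits the intensity process $\gamma$ then

\[ \tilde B_t = \exp\Big( \int_0^t (r_u + \gamma_u)\, du \Big). \]

The process $\tilde U_t(X)$ represents the pre-default value at time $t$ of the promised payoff $X$. Notice that $\tilde U_T(X) = X$ and the process $\tilde U_t(X)/\tilde B_t$, $t \in [0, T]$, is a continuous $\mathbb{F}$-martingale (thus, the process $\tilde U(X)$ is a continuous $\mathbb{F}$-semimartingale).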
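To make formula (1) concrete, here is a minimal sketch under simplifying assumptions that are not part of the text: the short rate $r$ and the default intensity $\gamma$ are constants and the promised payoff $X$ is deterministic, so that the pre-default value is $X e^{-(r+\gamma)(T-t)}$. The function names are hypothetical.

```python
import math


def pre_default_value(X, r, gamma, t, T):
    """Pre-default value of the promised payoff X for constant r and gamma (assumed)."""
    if t > T:
        raise ValueError("t must not exceed T")
    # Discounting with the risk-adjusted savings account exp((r + gamma) * t).
    return X * math.exp(-(r + gamma) * (T - t))


def ex_dividend_price(X, r, gamma, t, T, default_time):
    """Formula (1): indicator of survival at t times the pre-default value."""
    alive = 1.0 if default_time > t else 0.0
    return alive * pre_default_value(X, r, gamma, t, T)


X, r, gamma, T = 100.0, 0.03, 0.02, 2.0
value_today = pre_default_value(X, r, gamma, 0.0, T)
value_at_maturity = pre_default_value(X, r, gamma, T, T)
price_after_default = ex_dividend_price(X, r, gamma, 1.0, T, default_time=0.5)
print(value_today, value_at_maturity, price_after_default)
```

The sketch reproduces the two facts stated above in this special case: the pre-default value equals $X$ at maturity, and the ex-dividend price vanishes once default has occurred.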

Remark. The valuation formula (1), as well as the concept of pre-default value, should be supported by replication arguments. To this end, we need first to construct a suitable model of a defaultable market. In fact, if we wish to use formula (1), we need to know the joint law of all random variables involved, and this appears to be a non-trivial issue, in general.

Recovery payoff. The following result appears to be useful in the valuation of the recovery payoff $Z_\tau$ which occurs at time $\tau$. The process $\tilde U(Z)$ introduced below represents the pre-default value of the recovery payoff.

For the proof of Proposition 2 we refer, for instance, to Bielecki and Rutkowski (2004) (see Propositions 5.1.1 and 8.2.1 therein).

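Proposition 2. Let the hazard process $\Gamma$ be continuous, and let $Z$ be an $\mathbb{F}$-predictable bounded process. Then for every $t \in [0, T]$ we have

\[ \begin{aligned} U_t(Z) &= B_t\, \mathbb{E}_{\mathbb{Q}^*}(B_\tau^{-1} Z_\tau \mathbb{1}_{\{t<\tau\le T\}} \mid \mathcal{G}_t) \\ &= \mathbb{1}_{\{\tau>t\}} B_t\, \mathbb{E}_{\mathbb{Q}^*}\Big( \int_t^T Z_u B_u^{-1} e^{\Gamma_t - \Gamma_u}\, d\Gamma_u \,\Big|\, \mathcal{F}_t \Big) \\ &= \mathbb{1}_{\{\tau>t\}}\, \tilde U_t(Z), \end{aligned} \]

where we set

\[ \tilde U_t(Z) = \tilde B_t\, \mathbb{E}_{\mathbb{Q}^*}\Big( \int_t^T Z_u \tilde B_u^{-1}\, d\Gamma_u \,\Big|\, \mathcal{F}_t \Big), \quad \forall\, t \in [0, T]. \]

If the default intensity $\gamma$ with respect to $\mathbb{F}$ exists then we have

\[ \tilde U_t(Z) = \mathbb{E}_{\mathbb{Q}^*}\Big( \int_t^T Z_u\, e^{-\int_t^u (r_v + \gamma_v)\, dv}\, \gamma_u\, du \,\Big|\, \mathcal{F}_t \Big). \]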
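The intensity-based formula for the pre-default value of the recovery payoff can be illustrated as follows. This is only a sketch under assumptions that are not in the text: $Z$, $r$ and $\gamma$ are constants with $r + \gamma > 0$, in which case the integral has the closed form $Z \gamma (1 - e^{-(r+\gamma)(T-t)})/(r+\gamma)$. The code compares this closed form with a midpoint-rule quadrature of the integrand; the function names are hypothetical.

```python
import math


def recovery_value_closed_form(Z, r, gamma, t, T):
    """Closed form of the recovery integral for constant Z, r, gamma (assumed)."""
    a = r + gamma  # assumed strictly positive
    return Z * gamma * (1.0 - math.exp(-a * (T - t))) / a


def recovery_value_quadrature(Z, r, gamma, t, T, n=10_000):
    """Midpoint rule for the integral of Z * exp(-(r + gamma)(u - t)) * gamma over [t, T]."""
    h = (T - t) / n
    total = 0.0
    for j in range(n):
        u = t + (j + 0.5) * h
        total += Z * math.exp(-(r + gamma) * (u - t)) * gamma * h
    return total


Z, r, gamma, T = 0.4, 0.03, 0.02, 5.0
closed = recovery_value_closed_form(Z, r, gamma, 0.0, T)
numeric = recovery_value_quadrature(Z, r, gamma, 0.0, T)
value_at_maturity = recovery_value_closed_form(Z, r, gamma, T, T)
print(closed, numeric, value_at_maturity)
```

For a stochastic intensity or recovery process the conditional expectation would have to be computed within a specific model, for instance by simulation.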

Remark. Notice that $\tilde U_T(Z) = 0$ while, in general, $U_T(Z) = Z_T \mathbb{1}_{\{\tau = T\}}$ is non-zero. Note, however, that if the hazard process $\Gamma$ is assumed to be continuous then we have $\mathbb{Q}^*\{\tau = T\} = 0$, and thus $U_T(Z) = 0 = \tilde U_T(Z)$.

Promised dividends. To value the promised dividends $C$ that are paid prior to default time $\tau$ we shall make use of the following result. Notice that at any date $t < T$ the process $\tilde U(C)$ gives the pre-default value of future promised dividends.

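Proposition 3. Let the hazard process $\Gamma$ be continuous, and let $C$ be an $\mathbb{F}$-predictable, bounded process of finite variation. Then for every $t \in [0, T]$

\[ \begin{aligned} U_t(C) &= B_t\, \mathbb{E}_{\mathbb{Q}^*}\Big( \int_{(t,T]} B_u^{-1} (1 - H_u)\, dC_u \,\Big|\, \mathcal{G}_t \Big) \\ &= \mathbb{1}_{\{\tau>t\}} B_t\, \mathbb{E}_{\mathbb{Q}^*}\Big( \int_{(t,T]} B_u^{-1} e^{\Gamma_t - \Gamma_u}\, dC_u \,\Big|\, \mathcal{F}_t \Big) \\ &= \mathbb{1}_{\{\tau>t\}}\, \tilde U_t(C), \end{aligned} \]

where we define

\[ \tilde U_t(C) = \tilde B_t\, \mathbb{E}_{\mathbb{Q}^*}\Big( \int_{(t,T]} \tilde B_u^{-1}\, dC_u \,\Big|\, \mathcal{F}_t \Big), \quad \forall\, t \in [0, T]. \]

If, in addition, the default time $\tau$ admits the intensity $\gamma$ with respect to $\mathbb{F}$ then

\[ \tilde U_t(C) = \mathbb{E}_{\mathbb{Q}^*}\Big( \int_{(t,T]} e^{-\int_t^u (r_v + \gamma_v)\, dv}\, dC_u \,\Big|\, \mathcal{F}_t \Big). \]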
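As an illustration of the last formula, the sketch below values a stream of discrete coupons. The assumptions are made here for illustration only: $r$ and $\gamma$ are constants and $C$ is a deterministic pure-jump process paying known amounts at known dates, so that the Stieltjes integral over $(t, T]$ becomes a finite sum of coupons discounted at the rate $r + \gamma$. The function name and the coupon schedule are hypothetical.

```python
import math


def promised_dividends_value(coupons, r, gamma, t, T):
    """Pre-default value at time t of deterministic coupons paid on (t, T].

    coupons: list of (payment_date, amount) pairs, each amount being a jump of C.
    """
    discount_rate = r + gamma  # default-adjusted short rate
    return sum(
        amount * math.exp(-discount_rate * (date - t))
        for date, amount in coupons
        if t < date <= T
    )


# No payment is scheduled at T itself, in line with the assumption C_T - C_{T-} = 0.
coupons = [(0.5, 2.0), (1.0, 2.0), (1.5, 2.0)]
r, gamma, T = 0.03, 0.02, 2.0
value_at_0 = promised_dividends_value(coupons, r, gamma, 0.0, T)
value_at_half = promised_dividends_value(coupons, r, gamma, 0.5, T)
print(value_at_0, value_at_half)
```

Because the integral runs over the half-open interval $(t, T]$, a coupon paid exactly at time $t$ is excluded from the value at $t$, which is why the second call only counts the last two coupons.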

2.3 Defaultable Term Structure

For a defaultable discount bond with zero recovery it is natural to adopt the following definition (the superscript 0 refers to the postulated zero recovery scheme) of the price

\[ D^0(t, T) = B_t\, \mathbb{E}_{\mathbb{Q}^*}(B_T^{-1} \mathbb{1}_{\{\tau>T\}} \mid \mathcal{G}_t) = \mathbb{1}_{\{\tau>t\}}\, \tilde D^0(t, T), \]

where $\tilde D^0(t, T)$ stands for the pre-default value of the bond, which is given by the following equality:

\[ \tilde D^0(t, T) = \tilde B_t\, \mathbb{E}_{\mathbb{Q}^*}(\tilde B_T^{-1} \mid \mathcal{F}_t). \]

Since $\mathbb{F}$ is the Brownian filtration, the process $\tilde D^0(t, T)/\tilde B_t$ is a continuous, strictly positive, $\mathbb{F}$-martingale. Therefore, the pre-default bond price $\tilde D^0(t, T)$ is a continuous, strictly positive, $\mathbb{F}$-semimartingale. In the special case, when $\Gamma$ is deterministic, we have $\tilde D^0(t, T) = e^{\Gamma_t - \Gamma_T} B(t, T)$.
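The deterministic special case lends itself to a short numerical sketch. The hazard function $\Gamma_t = 0.02\,t + 0.005\,t^2$, the flat short rate and the notion of a continuously compounded credit spread used below are assumptions chosen for illustration; none of them comes from the text.

```python
import math


def hazard(t):
    """Hypothetical deterministic, increasing hazard function Gamma_t."""
    return 0.02 * t + 0.005 * t * t


def pre_default_bond_price(B_tT, Gamma_t, Gamma_T):
    """Pre-default zero-recovery bond price exp(Gamma_t - Gamma_T) * B(t, T)."""
    return math.exp(Gamma_t - Gamma_T) * B_tT


def credit_spread(B_tT, D_tT, t, T):
    """Continuously compounded yield spread of the defaultable bond over the default-free one."""
    return (math.log(B_tT) - math.log(D_tT)) / (T - t)


t, T, r = 0.0, 5.0, 0.03
B_tT = math.exp(-r * (T - t))  # default-free bond price under a flat rate (assumed)
D_tT = pre_default_bond_price(B_tT, hazard(t), hazard(T))
spread = credit_spread(B_tT, D_tT, t, T)
print(D_tT, spread)
```

In this setting the spread equals the average hazard increment $(\Gamma_T - \Gamma_t)/(T - t)$ and does not depend on the default-free bond price.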


Remark. The case of zero recovery is, of course, only a particular example. For more general recovery schemes and the corresponding bond valuation results, we refer, for instance, to Section 2.2.4 in Bielecki et al. (2004a).

Let $\mathbb{Q}_T$ stand for the forward martingale measure, given on $(\Omega, \mathcal{G}_T)$ (as well as on $(\Omega, \mathcal{F}_T)$) through the formula

\[ \frac{d\mathbb{Q}_T}{d\mathbb{Q}^*} = \frac{1}{B_T B(0, T)}, \quad \mathbb{Q}^*\text{-a.s.}, \]

so that the process $W^T_t = W^*_t - \int_0^t b(u, T)\, du$ is a Brownian motion under $\mathbb{Q}_T$. Denote by $F(t, U, T) = B(t, U)(B(t, T))^{-1}$ the forward price of the $U$-maturity bond, so that

\[ dF(t, U, T) = F(t, U, T)\big( b(t, U) - b(t, T) \big)\, dW^T_t. \]

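Since the processes $B_t$ and $B(t, T)$ are $\mathbb{F}$-adapted, it can be shown (see, e.g., Jamshidian (2002)) that $\Gamma$ is also the $\mathbb{F}$-hazard process of $\tau$ under $\mathbb{Q}_T$, and thus, by Lemma 1 applied under $\mathbb{Q}_T$ with $X = 1$,

\[ \mathbb{Q}_T\{\tau > T \mid \mathcal{G}_t\} = \mathbb{1}_{\{\tau>t\}}\, \mathbb{E}_{\mathbb{Q}_T}(e^{\Gamma_t - \Gamma_T} \mid \mathcal{F}_t). \]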

Let us define an auxiliary process $\Gamma(t, T) = \tilde D^0(t, T)(B(t, T))^{-1}$ (for a fixed $T > 0$). The next result examines the basic properties of the process $\Gamma(t, T)$.

Lemma 2. Assume that the $\mathbb{F}$-hazard process $\Gamma$ is continuous. The process $\Gamma(t, T)$, $t \in [0, T]$, is a continuous $\mathbb{F}$-submartingale and

\[ d\Gamma(t, T) = \Gamma(t, T)\big( d\Gamma_t + \beta(t, T)\, dW^T_t \big) \qquad (2) \]

for some $\mathbb{F}$-predictable process $\beta(t, T)$. The process $\Gamma(t, T)$ is of finite variation if and only if the hazard process $\Gamma$ is deterministic. In the latter case, we have $\Gamma(t, T) = e^{\Gamma_t - \Gamma_T}$.

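Proof. Recall that $\tilde B_t = B_t e^{\Gamma_t}$ and notice that

\[ \Gamma(t, T) = \frac{\tilde D^0(t, T)}{B(t, T)} = \frac{\tilde B_t\, \mathbb{E}_{\mathbb{Q}^*}(\tilde B_T^{-1} \mid \mathcal{F}_t)}{B_t\, \mathbb{E}_{\mathbb{Q}^*}(B_T^{-1} \mid \mathcal{F}_t)} = \mathbb{E}_{\mathbb{Q}_T}(e^{\Gamma_t - \Gamma_T} \mid \mathcal{F}_t) = e^{\Gamma_t} M_t, \]

where we set $M_t = \mathbb{E}_{\mathbb{Q}_T}(e^{-\Gamma_T} \mid \mathcal{F}_t)$. Recall that the filtration $\mathbb{F}$ is generated by a process $W^*$, which is a Wiener process with respect to $\mathbb{P}^*$ and $\mathbb{Q}^*$, and all martingales with respect to a Brownian filtration are continuous processes.

We conclude that $\Gamma(t, T)$ is the product of a strictly positive, increasing, right-continuous, $\mathbb{F}$-adapted process $e^{\Gamma_t}$, and a strictly positive, continuous, $\mathbb{F}$-martingale $M$. Furthermore, there exists an $\mathbb{F}$-predictable process $\beta(t, T)$ such that $M$ satisfies

\[ dM_t = M_t\, \beta(t, T)\, dW^T_t \]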


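with the initial condition $M_0 = \mathbb{E}_{\mathbb{Q}_T}(e^{-\Gamma_T})$. Formula (2) follows by an application of Itô's formula to the product $\Gamma(t, T) = e^{\Gamma_t} M_t$: since $e^{\Gamma_t}$ is continuous and of finite variation, we get $d\Gamma(t, T) = \Gamma(t, T)\, d\Gamma_t + e^{\Gamma_t}\, dM_t$, so that (2) holds with the same process $\beta(t, T)$. To complete the proof, it suffices to recall that a continuous martingale is never of finite variation, unless it is a constant process.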

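Suppose that $\Gamma_t = \int_0^t \gamma_u\, du$. Then (2) yields

\[ d\Gamma(t, T) = \Gamma(t, T)\big( \gamma_t\, dt + \beta(t, T)\, dW^T_t \big). \]

Consequently, the pre-default price $\tilde D^0(t, T) = \Gamma(t, T) B(t, T)$ is governed by

\[ d\tilde D^0(t, T) = \tilde D^0(t, T)\big( (r_t + \gamma_t)\, dt + \tilde b(t, T)\, dW^*_t \big), \qquad (3) \]

where the volatility process equals $\tilde b(t, T) = \beta(t, T) + b(t, T)$. Here we used $dW^T_t = dW^*_t - b(t, T)\, dt$: the resulting drift term $-\beta(t, T) b(t, T)\, dt$ cancels the covariation term $\beta(t, T) b(t, T)\, dt$ in the product rule, in agreement with the fact that $\tilde D^0(t, T)/\tilde B_t$ is a martingale under $\mathbb{Q}^*$.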

3 Properties of Trading Strategies

In this section, we shall examine the most basic properties of the wealth process of a self-financing trading strategy. First, we concentrate on trading in default-free assets. In the next step, we also include defaultable assets in our portfolio.

3.1 Default-Free Primary Assets

Our goal in this section is to present some auxiliary results related to the concept of a self-financing trading strategy for a market model involving default-free and defaultable securities. For the sake of the reader's convenience, we shall first discuss briefly the classic concepts of self-financing cash and futures strategies in the context of a default-free market model. It appears that in case of defaultable securities only minor adjustments of definitions and results are needed (see Vaillant (2001) or Blanchet-Scalliet and Jeanblanc (2004)).

Cash Strategies

Let $Y^1_t$ and $Y^2_t$ stand for the cash prices at time $t \in [0, T]$ of two tradeable assets. We postulate that $Y^1$ and $Y^2$ are continuous semimartingales. We assume, in addition, that the process $Y^1$ is strictly positive, so that it can be used as a numeraire.

Remark. We chose the convention that price processes of default-free securities are continuous semimartingales. Results of this section can be extended to the case of general semimartingales (for instance, jump diffusions). Our choice was motivated by the desire of providing relatively simple closed-form expressions.


Let $\phi = (\phi^1, \phi^2)$ be a trading strategy for the default-free market so that, in particular, processes $\phi^1$ and $\phi^2$ are predictable with respect to the reference filtration $\mathbb{F}$ (the same measurability assumption will be valid for components $\phi^1, \dots, \phi^k$ of a $k$-dimensional trading strategy). The component $\phi^i_t$ represents the number of units of the $i$th asset held in the portfolio at time $t$.

Let $V_t(\phi)$ denote the wealth of the cash strategy $\phi = (\phi^1, \phi^2)$ at time $t$, so that

\[ V_t(\phi) = \phi^1_t Y^1_t + \phi^2_t Y^2_t, \quad \forall\, t \in [0, T]. \]

We say that the cash strategy $\phi$ is self-financing if

\[ V_t(\phi) = V_0(\phi) + \int_0^t \phi^1_u\, dY^1_u + \int_0^t \phi^2_u\, dY^2_u, \quad \forall\, t \in [0, T], \]

that is,

\[ dV_t(\phi) = \phi^1_t\, dY^1_t + \phi^2_t\, dY^2_t. \]

This yields

\[ dV_t(\phi) = (V_t(\phi) - \phi^2_t Y^2_t)(Y^1_t)^{-1}\, dY^1_t + \phi^2_t\, dY^2_t. \]

Let us introduce the relative values:

\[ V^1_t(\phi) = V_t(\phi)(Y^1_t)^{-1}, \quad Y^{2,1}_t = Y^2_t (Y^1_t)^{-1}. \]

A simple application of Itô's formula yields

\[ V^1_t(\phi) = V^1_0(\phi) + \int_0^t \phi^2_u\, dY^{2,1}_u. \]
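The representation of the relative wealth has an exact discrete-time analogue, which the following sketch verifies numerically. Everything here is an illustrative assumption rather than part of the continuous-time theory: prices are given on a finite time grid, the holding $\phi^2_n$ is chosen at time $n$ and kept until time $n+1$, and $\phi^1$ is adjusted at each date so that rebalancing costs nothing. The function names and price paths are hypothetical.

```python
def self_financing_wealth(phi1_0, phi2, Y1, Y2):
    """Wealth path of a discrete-time self-financing strategy in two cash assets."""
    V = [phi1_0 * Y1[0] + phi2[0] * Y2[0]]
    phi1 = phi1_0
    for n in range(len(Y1) - 1):
        # Wealth changes only through price moves of the positions held over (n, n+1].
        V.append(V[n] + phi1 * (Y1[n + 1] - Y1[n]) + phi2[n] * (Y2[n + 1] - Y2[n]))
        if n + 1 < len(phi2):
            # Rebalance: the new holding of asset 1 is financed by current wealth.
            phi1 = (V[n + 1] - phi2[n + 1] * Y2[n + 1]) / Y1[n + 1]
    return V


def relative_wealth_from_integral(V0, phi2, Y1, Y2):
    """Relative wealth computed as V0/Y1_0 plus the sum of phi2 times increments of Y2/Y1."""
    rel = [V0 / Y1[0]]
    for n in range(len(Y1) - 1):
        increment = Y2[n + 1] / Y1[n + 1] - Y2[n] / Y1[n]
        rel.append(rel[n] + phi2[n] * increment)
    return rel


Y1 = [1.0, 1.02, 1.05, 1.03]   # numeraire asset, strictly positive
Y2 = [10.0, 10.5, 9.8, 10.2]
phi2 = [1.0, -0.5, 2.0]
V = self_financing_wealth(5.0, phi2, Y1, Y2)
relative_direct = [v / y for v, y in zip(V, Y1)]
relative_integral = relative_wealth_from_integral(V[0], phi2, Y1, Y2)
print(relative_direct)
print(relative_integral)
```

Both lists coincide up to rounding, which mirrors the statement that the relative wealth depends on the strategy only through $\phi^2$ and the relative price $Y^{2,1}$.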

It is well known that a similar result holds for any finite number of cash assets. Let $Y^1_t, Y^2_t, \dots, Y^k_t$ represent the cash values at time $t$ of $k$ assets. We postulate that $Y^1, Y^2, \dots, Y^k$ are continuous semimartingales. Then the wealth $V_t(\phi)$ of a trading strategy $\phi = (\phi^1, \phi^2, \dots, \phi^k)$ equals

\[ V_t(\phi) = \sum_{i=1}^k \phi^i_t Y^i_t, \quad \forall\, t \in [0, T], \qquad (4) \]

and $\phi$ is said to be a self-financing cash strategy if

\[ V_t(\phi) = V_0(\phi) + \sum_{i=1}^k \int_0^t \phi^i_u\, dY^i_u, \quad \forall\, t \in [0, T]. \qquad (5) \]

Suppose that the process $Y^1$ is strictly positive. Then by combining the last two formulae, we obtain

\[ dV_t(\phi) = \Big( V_t(\phi) - \sum_{i=2}^k \phi^i_t Y^i_t \Big)(Y^1_t)^{-1}\, dY^1_t + \sum_{i=2}^k \phi^i_t\, dY^i_t. \]

The latter representation shows that the wealth process depends only on $k - 1$ components of $\phi$. Choosing $Y^1$ as a numeraire asset, and denoting $V^1_t(\phi) = V_t(\phi)(Y^1_t)^{-1}$, $Y^{i,1}_t = Y^i_t (Y^1_t)^{-1}$, we get the following well-known result.


Lemma 3. Let $\phi = (\phi^1, \phi^2, \dots, \phi^k)$ be a self-financing cash strategy. Then we have

\[ V^1_t(\phi) = V^1_0(\phi) + \sum_{i=2}^k \int_0^t \phi^i_u\, dY^{i,1}_u, \quad \forall\, t \in [0, T]. \]

Cash-Futures Strategies

Let us first consider the special case of two assets. Assume that $Y^1_t$ and $Y^2_t$ represent the cash and futures prices at time $t \in [0, T]$ of some assets, respectively. As before, we postulate that $Y^1$ and $Y^2$ are continuous semimartingales. Moreover, $Y^1$ is assumed to be a strictly positive process. In view of specific features of a futures contract, it is natural to postulate that the wealth $V_t(\phi)$ satisfies

\[ V_t(\phi) = \phi^1_t Y^1_t + \phi^2_t \cdot 0 = \phi^1_t Y^1_t, \quad \forall\, t \in [0, T]. \]

The cash-futures strategy $\phi = (\phi^1, \phi^2)$ is self-financing if

\[ dV_t(\phi) = \phi^1_t\, dY^1_t + \phi^2_t\, dY^2_t, \qquad (6) \]

which yields, provided that $Y^1$ is strictly positive,

\[ dV_t(\phi) = V_t(\phi)(Y^1_t)^{-1}\, dY^1_t + \phi^2_t\, dY^2_t. \]

Remark. Let us recall that the futures price $Y^2_t$ (that is, the quotation of a futures contract at time $t$) has different features than the cash price of an asset. Specifically, we make the standard assumption that it is possible to enter a futures contract at no initial cost. The gains or losses from futures contracts are associated with marking to market (see, for instance, Duffie (2003) or Musiela and Rutkowski (1997)). Note that the 0 in the formula defining $V_t(\phi)$ is aimed to represent the value of a futures contract at time $t$, as opposed to the futures price $Y^2_t$ at this date.

Lemma 4. Let $\phi = (\phi^1, \phi^2)$ be a self-financing cash-futures strategy. Suppose that the processes $Y^1$ and $Y^2$ are strictly positive. Then the relative wealth process $V^1_t(\phi) = V_t(\phi)(Y^1_t)^{-1}$ satisfies

\[ V^1_t(\phi) = V^1_0(\phi) + \int_0^t \hat\phi^{2,1}_u\, d\hat Y^{2,1}_u, \quad \forall\, t \in [0, T], \]

where $\hat\phi^{2,1}_t = \phi^2_t (Y^1_t)^{-1} e^{\alpha^{2,1}_t}$, $\hat Y^{2,1}_t = Y^2_t e^{-\alpha^{2,1}_t}$ and

\[ \alpha^{2,1}_t = \langle \ln Y^2, \ln Y^1 \rangle_t = \int_0^t (Y^2_u)^{-1} (Y^1_u)^{-1}\, d\langle Y^2, Y^1 \rangle_u. \]


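Proof. For brevity, we write $V_t = V_t(\phi)$ and $V^1_t = V^1_t(\phi)$. The Itô formula, combined with (6), yields

\[ \begin{aligned} dV^1_t &= (Y^1_t)^{-1}\, dV_t + V_t\, d(Y^1_t)^{-1} + d\langle (Y^1)^{-1}, V \rangle_t \\ &= \phi^1_t (Y^1_t)^{-1}\, dY^1_t + \phi^2_t (Y^1_t)^{-1}\, dY^2_t + \phi^1_t Y^1_t\, d(Y^1_t)^{-1} \\ &\quad - \phi^1_t (Y^1_t)^{-2}\, d\langle Y^1, Y^1 \rangle_t - \phi^2_t (Y^1_t)^{-2}\, d\langle Y^1, Y^2 \rangle_t \\ &= \phi^2_t (Y^1_t)^{-1}\, dY^2_t - \phi^2_t (Y^1_t)^{-2}\, d\langle Y^1, Y^2 \rangle_t \\ &= \phi^2_t e^{\alpha^{2,1}_t} (Y^1_t)^{-1} \big( e^{-\alpha^{2,1}_t}\, dY^2_t - Y^2_t e^{-\alpha^{2,1}_t}\, d\alpha^{2,1}_t \big) \\ &= \hat\phi^{2,1}_t\, d\hat Y^{2,1}_t \end{aligned} \]

and the result follows.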

Let $Y^1, \dots, Y^l$ be the cash prices of $l$ assets, and let $Y^{l+1}, \dots, Y^k$ represent the futures prices of $k - l$ assets. Then the wealth process of a trading strategy $\phi = (\phi^1, \phi^2, \dots, \phi^k)$ is given by the formula

\[ V_t(\phi) = \sum_{i=1}^l \phi^i_t Y^i_t, \quad \forall\, t \in [0, T], \qquad (7) \]

and $\phi$ is a self-financing cash-futures strategy whenever

\[ V_t(\phi) = V_0(\phi) + \sum_{i=1}^k \int_0^t \phi^i_u\, dY^i_u, \quad \forall\, t \in [0, T]. \]

The proof of the next result relies on calculations similar to those in the proofs of Lemmas 3 and 4.

Lemma 5. Let $\phi = (\phi^1, \phi^2, \dots, \phi^k)$ be a self-financing cash-futures strategy. Suppose that the processes $Y^1$ and $Y^{l+1}, \dots, Y^k$ are strictly positive. Then the relative wealth process $V^1_t(\phi) = V_t(\phi)(Y^1_t)^{-1}$ satisfies, for every $t \in [0, T]$,

\[ V^1_t(\phi) = V^1_0(\phi) + \sum_{i=2}^l \int_0^t \phi^i_u\, dY^{i,1}_u + \sum_{i=l+1}^k \int_0^t \hat\phi^{i,1}_u\, d\hat Y^{i,1}_u, \]

where we denote $Y^{i,1}_t = Y^i_t (Y^1_t)^{-1}$, $\hat\phi^{i,1}_t = \phi^i_t (Y^1_t)^{-1} e^{\alpha^{i,1}_t}$, $\hat Y^{i,1}_t = Y^i_t e^{-\alpha^{i,1}_t}$, and

\[ \alpha^{i,1}_t = \langle \ln Y^i, \ln Y^1 \rangle_t = \int_0^t (Y^i_u)^{-1} (Y^1_u)^{-1}\, d\langle Y^i, Y^1 \rangle_u. \]

Constrained Cash Strategies

We continue the analysis of cash strategies for some $k \ge 3$. Price processes $Y^1, Y^2, \dots, Y^k$ are assumed to be continuous semimartingales. We postulate, in addition, that $Y^1$ and $Y^{l+1}, \dots, Y^k$ are strictly positive processes, where $1 < l + 1 \le k$.


Let $\phi = (\phi^1, \phi^2, \dots, \phi^k)$ be a self-financing trading strategy, so that the wealth process $V(\phi)$ satisfies (4)-(5). We shall consider three particular cases of increasing generality.

Strategies with zero net investment in $Y^{l+1}, \dots, Y^k$. Assume first that at any time $t$ there is zero net investment in assets $Y^{l+1}, \dots, Y^k$. Specifically, we postulate that the strategy is subject to the following constraint:

\[ \sum_{i=l+1}^k \phi^i_t Y^i_t = 0, \quad \forall\, t \in [0, T], \qquad (8) \]

so that the wealth process $V_t(\phi)$ is given by (7). Equivalently, we have $\phi^k_t = -\sum_{i=l+1}^{k-1} \phi^i_t Y^i_t (Y^k_t)^{-1}$. Combining the last equality with (5), we obtain

\[ \begin{aligned} dV_t(\phi) &= \Big( V_t(\phi) - \sum_{i=2}^l \phi^i_t Y^i_t \Big)(Y^1_t)^{-1}\, dY^1_t \\ &\quad + \sum_{i=2}^l \phi^i_t\, dY^i_t + \sum_{i=l+1}^{k-1} \phi^i_t \big( dY^i_t - Y^i_t (Y^k_t)^{-1}\, dY^k_t \big). \end{aligned} \]

It is thus clear that the wealth process $V(\phi)$ depends only on $k - 2$ components $\phi^2, \dots, \phi^{k-1}$ of the $k$-dimensional trading strategy $\phi$. The following result, which can be seen as an extension of Lemma 4, provides a more convenient representation for the (relative) wealth process.
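The way the constraint (8) and the wealth formula (7) pin down the remaining two components at a fixed date can be sketched as follows. This is a static, single-date illustration with hypothetical names and numbers; it does not address the self-financing dynamics.

```python
def complete_strategy(V, phi_mid, Y, l):
    """Recover phi^1 and phi^k at one date from wealth V and holdings phi^2..phi^{k-1}.

    Y lists the prices Y^1..Y^k (Y^1 and Y^k nonzero); assets l+1..k have zero net investment.
    """
    k = len(Y)
    phi = [0.0] * k
    for i in range(2, k):          # copy phi^2, ..., phi^{k-1} (1-based asset index i)
        phi[i - 1] = phi_mid[i - 2]
    # Constraint (8): sum over i = l+1..k of phi^i Y^i equals zero, solved for phi^k.
    phi[k - 1] = -sum(phi[i - 1] * Y[i - 1] for i in range(l + 1, k)) / Y[k - 1]
    # Wealth formula (7): V equals the sum over i = 1..l of phi^i Y^i, solved for phi^1.
    phi[0] = (V - sum(phi[i - 1] * Y[i - 1] for i in range(2, l + 1))) / Y[0]
    return phi


Y = [1.0, 2.0, 4.0, 5.0]   # k = 4 assets
l = 2                       # assets 3 and 4 carry zero net investment
phi = complete_strategy(V=10.0, phi_mid=[1.5, 2.0], Y=Y, l=l)
total_wealth = sum(p * y for p, y in zip(phi, Y))
net_investment = sum(phi[i] * Y[i] for i in range(l, len(Y)))
print(phi, total_wealth, net_investment)
```

The check at the end confirms that the completed strategy has total wealth $V$ in the sense of (4) and satisfies the constraint (8).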

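Lemma 6. Let $\phi = (\phi^1, \phi^2, \dots, \phi^k)$ be a self-financing cash strategy such that (8) holds. Assume that the processes $Y^1, Y^{l+1}, \dots, Y^k$ are strictly positive. Then the relative wealth process $V^1_t(\phi) = V_t(\phi)(Y^1_t)^{-1}$ satisfies

\[ V^1_t(\phi) = V^1_0(\phi) + \sum_{i=2}^l \int_0^t \phi^i_u\, dY^{i,1}_u + \sum_{i=l+1}^{k-1} \int_0^t \hat\phi^{i,k,1}_u\, d\hat Y^{i,k,1}_u, \quad \forall\, t \in [0, T], \]

where we denote

\[ \hat\phi^{i,k,1}_t = \phi^i_t (Y^{1,k}_t)^{-1} e^{\alpha^{i,k,1}_t}, \quad \hat Y^{i,k,1}_t = Y^{i,k}_t e^{-\alpha^{i,k,1}_t}, \qquad (9) \]

with $Y^{i,k}_t = Y^i_t (Y^k_t)^{-1}$ and

\[ \alpha^{i,k,1}_t = \langle \ln Y^{i,k}, \ln Y^{1,k} \rangle_t = \int_0^t (Y^{i,k}_u)^{-1} (Y^{1,k}_u)^{-1}\, d\langle Y^{i,k}, Y^{1,k} \rangle_u. \qquad (10) \]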

Proof. Let us consider the relative values of all processes, with the price $Y^k$ chosen as a numeraire, and let us consider the process

\[ V^k_t(\phi) := V_t(\phi)(Y^k_t)^{-1} = \sum_{i=1}^k \phi^i_t Y^{i,k}_t. \]

In view of the constraint (8) we have that $V^k_t(\phi) = \sum_{i=1}^l \phi^i_t Y^{i,k}_t$. In addition, similarly as in Lemma 3, we obtain

\[ dV^k_t(\phi) = \sum_{i=1}^{k-1} \phi^i_t\, dY^{i,k}_t. \]

Since

\[ Y^{i,k}_t (Y^{1,k}_t)^{-1} = Y^{i,1}_t, \quad V^1_t(\phi) = V^k_t(\phi)(Y^{1,k}_t)^{-1}, \]

using an argument analogous to that in the proof of Lemma 4, we obtain

\[ V^1_t(\phi) = V^1_0(\phi) + \sum_{i=2}^l \int_0^t \phi^i_u\, dY^{i,1}_u + \sum_{i=l+1}^{k-1} \int_0^t \hat\phi^{i,k,1}_u\, d\hat Y^{i,k,1}_u, \quad \forall\, t \in [0, T], \]

where the processes $\hat\phi^{i,k,1}_t$, $\hat Y^{i,k,1}_t$ and $\alpha^{i,k,1}_t$ are given by (9)-(10).

Strategies with a pre-specified net investment $Z$ in $Y^{l+1}, \dots, Y^k$. We shall now postulate that the strategy $\phi$ is such that

\[ \sum_{i=l+1}^k \phi^i_t Y^i_t = Z_t, \quad \forall\, t \in [0, T], \qquad (11) \]

for a pre-specified, $\mathbb{F}$-progressively measurable, process $Z$. The following result is a rather straightforward extension of Lemma 6.

Lemma 7. Let $\phi = (\phi^1, \phi^2, \dots, \phi^k)$ be a self-financing cash strategy such that (11) holds. Assume that the processes $Y^1, Y^{l+1}, \dots, Y^k$ are strictly positive. Then the relative wealth process $V^1_t(\phi) = V_t(\phi)(Y^1_t)^{-1}$ satisfies

\[ \begin{aligned} V^1_t(\phi) &= V^1_0(\phi) + \sum_{i=2}^l \int_0^t \phi^i_u\, dY^{i,1}_u + \sum_{i=l+1}^{k-1} \int_0^t \hat\phi^{i,k,1}_u\, d\hat Y^{i,k,1}_u \\ &\quad + \int_0^t Z_u (Y^k_u)^{-1}\, d(Y^{1,k}_u)^{-1}, \end{aligned} \]

where $\hat\phi^{i,k,1}_t$, $\hat Y^{i,k,1}_t$ and $\alpha^{i,k,1}_t$ are given by (9)-(10).

Proof. Let us sketch the proof of the lemma for $k = 3$ and $l = 1$, so that $\phi^2_t Y^2_t + \phi^3_t Y^3_t = Z_t$ for every $t \in [0, T]$. Consequently, for the process $V^3(\phi) = V(\phi)(Y^3)^{-1}$ we get

\[ V^3_t(\phi) = \sum_{i=1}^3 \phi^i_t Y^i_t (Y^3_t)^{-1} = \phi^1_t Y^{1,3}_t + Z_t (Y^3_t)^{-1}, \quad \forall\, t \in [0, T]. \]

Furthermore, the self-financing condition yields

\[ dV^3_t(\phi) = \phi^1_t\, dY^{1,3}_t + \phi^2_t\, dY^{2,3}_t. \]

Proceeding in an analogous way as in the proof of Lemma 4, we obtain for $V^1_t(\phi) = V^3_t(\phi)(Y^{1,3}_t)^{-1}$

\[ \begin{aligned} dV^1_t(\phi) &= \phi^2_t e^{\alpha^{2,3,1}_t} (Y^{1,3}_t)^{-1} \big( e^{-\alpha^{2,3,1}_t}\, dY^{2,3}_t - Y^{2,3}_t e^{-\alpha^{2,3,1}_t}\, d\alpha^{2,3,1}_t \big) + Z_t (Y^3_t)^{-1}\, d(Y^{1,3}_t)^{-1} \\ &= \hat\phi^{2,3,1}_t\, d\hat Y^{2,3,1}_t + Z_t (Y^3_t)^{-1}\, d(Y^{1,3}_t)^{-1}, \end{aligned} \]

where $\hat\phi^{2,3,1}_t = \phi^2_t (Y^{1,3}_t)^{-1} e^{\alpha^{2,3,1}_t}$, $\hat Y^{2,3,1}_t = Y^{2,3}_t e^{-\alpha^{2,3,1}_t}$ and

\[ \alpha^{2,3,1}_t = \langle \ln Y^{2,3}, \ln Y^{1,3} \rangle_t = \int_0^t (Y^{2,3}_u)^{-1} (Y^{1,3}_u)^{-1}\, d\langle Y^{2,3}, Y^{1,3} \rangle_u. \]

The proof for the general case is based on similar calculations.

Strategies with consumption and a pre-specified net investment $Z$ in $Y^{l+1}, \dots, Y^k$. Let consumption be given by an $\mathbb{F}$-adapted process $A$ of finite variation, with $A_0 = 0$. We consider a self-financing cash strategy $\phi$ with consumption process $A$, so that the wealth process $V(\phi)$ satisfies:

\[ V_t(\phi) = \sum_{i=1}^k \phi^i_t Y^i_t = \sum_{i=1}^l \phi^i_t Y^i_t + Z_t, \quad \forall\, t \in [0, T], \]

and

\[ V_t(\phi) = V_0(\phi) + \sum_{i=1}^k \int_0^t \phi^i_u\, dY^i_u + A_t, \quad \forall\, t \in [0, T]. \]

Then it suffices to modify the formula established in Lemma 7 by adding a term associated with the consumption process $A$. Specifically, for the relative wealth process $V^1_t(\phi) = V_t(\phi)(Y^1_t)^{-1}$ we obtain the following integral representation, which is valid for every $t \in [0, T]$:

\[ \begin{aligned} V^1_t(\phi) &= V^1_0(\phi) + \sum_{i=2}^l \int_0^t \phi^i_u\, dY^{i,1}_u + \sum_{i=l+1}^{k-1} \int_0^t \hat\phi^{i,k,1}_u\, d\hat Y^{i,k,1}_u \\ &\quad + \int_0^t Z_u (Y^k_u)^{-1}\, d(Y^{1,k}_u)^{-1} + \int_0^t (Y^1_u)^{-1}\, dA_u. \end{aligned} \]

Remark. We use here a generic term 'consumption' to reflect the impact of $A$ on the wealth. The financial interpretation of $A$ depends on particular circumstances. For instance, an increasing process $A$ represents the inflows of cash, rather than the outflows of cash (the latter case is commonly referred to as consumption in the financial literature).


3.2 Defaultable and Default-Free Primary Assets

Let $Y^1, \dots, Y^m$ be prices of $m$ defaultable assets, and let $Y^{m+1}, \dots, Y^k$ represent prices of $k - m$ default-free assets. Processes $Y^{m+1}, \dots, Y^k$ are assumed to be continuous semimartingales. We make here an essential assumption that $\tau$ is the default time for each defaultable asset $Y^i$, $i = 1, \dots, m$. Of course, in the case of defaultable assets with different default times (e.g., when dealing with the first-to-default claim), some definitions should be modified in a natural way. A special case of first-to-default claims is examined in Section 4.4.

Self-Financing Trading Strategies

The following definition is a rather obvious extension of conditions (4)-(5). We postulate here that the processes $\phi^1, \dots, \phi^k$ are $\mathbb{G}$-predictable processes, in general.

Definition 2. The wealth $V_t(\phi)$ of a trading strategy $\phi = (\phi^1, \phi^2, \dots, \phi^k)$ equals $V_t(\phi) = \sum_{i=1}^k \phi^i_t Y^i_t$ for every $t \in [0, T]$. A strategy $\phi$ is said to be self-financing if for every $t \in [0, T]$

\[ V_t(\phi) = V_0(\phi) + \sum_{i=1}^m \int_0^t \phi^i_{u-}\, dY^i_u + \sum_{i=m+1}^k \int_0^t \phi^i_u\, dY^i_u. \]

Although Definition 2 is formulated in a general setup, it can be simplified for our further purposes. Indeed, since we shall deal only with defaultable claims with default time $\tau$, we shall only examine a particular trading strategy $\phi$ prior to and at default time $\tau$ or, more precisely, on the stochastic interval $[\![0, \tau \wedge T]\!]$, where $[\![0, \tau \wedge T]\!] = \{(t, \omega) \in \mathbb{R}_+ \times \Omega : 0 \le t \le \tau(\omega) \wedge T\}$.

In fact, we shall examine separately the following issues: (i) the behavior of the wealth process $V(\phi)$ on the random interval $[\![0, \tau \wedge T[\![ \; = \{(t, \omega) \in \mathbb{R}_+ \times \Omega : 0 \le t < \tau(\omega) \wedge T\}$ and (ii) the size of its jump at the random time $\tau \wedge T$ or, equivalently, the value of $V_{\tau \wedge T}$. Such a study is, of course, sufficient in our setup, since we only consider the case where a recovery payment (if any) is made at the default time (and not after this date). Consequently, since we never deal with a trading strategy after the random time $\tau \wedge T$, we may and do assume from now on that all components $\phi^1, \phi^2, \dots, \phi^k$ of a portfolio $\phi$ are $\mathbb{F}$-predictable, rather than $\mathbb{G}$-predictable processes.

It is worthwhile to mention that in the next two parts we will examine the importance of the measurability property of an admissible trading strategy within the framework of optimization problems in an incomplete market.

Remark. It can be formally shown that for any $\mathbb{R}^k$-valued $\mathbb{G}$-predictable process $\phi$ there exists a unique $\mathbb{F}$-predictable process $\psi$ such that the equality $\mathbb{1}_{\{\tau \ge t\}} \phi_t = \mathbb{1}_{\{\tau \ge t\}} \psi_t$ holds for every $t \in [0, T]$. In addition, we find it convenient to postulate, by convention, that the price processes $Y^{m+1}, \dots, Y^k$ are also stopped at the random time $\tau \wedge T$.

We have the following definition of a trading strategy.

Definition 3. By a trading strategy $\phi = (\phi^1, \phi^2, \dots, \phi^k)$ we mean a family $\phi^1, \phi^2, \dots, \phi^k$ of $\mathbb{F}$-predictable stochastic processes.

Let us stress that if a trading strategy considered in this section is self-financing on $[\![0, \tau \wedge T[\![$ then it is also self-financing on $[\![0, \tau \wedge T]\!]$. At the intuitive level, the portfolio is not rebalanced at time $\tau \wedge T$, but it is rather sold out in order to cover liabilities. Let $\tilde Y^i_t$ stand for the pre-default value of the $i$th defaultable asset at time $t$. We postulate throughout that processes $\tilde Y^i$, $i = 1, \dots, m$, are continuous $\mathbb{F}$-semimartingales.

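Definition 4. We define the pre-default wealth process $\tilde V(\phi)$ of a trading strategy $\phi = (\phi^1, \phi^2, \dots, \phi^k)$ by setting for every $t \in [0, T]$,

\[ \tilde V_t(\phi) = \sum_{i=1}^m \phi^i_t \tilde Y^i_t + \sum_{i=m+1}^k \phi^i_t Y^i_t. \]

A strategy $\phi$ is said to be self-financing prior to default if for every $t \in [0, T]$

\[ \tilde V_t(\phi) = \tilde V_0(\phi) + \sum_{i=1}^m \int_0^t \phi^i_u\, d\tilde Y^i_u + \sum_{i=m+1}^k \int_0^t \phi^i_u\, dY^i_u. \]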

Note that $\tilde V_0(\phi) = V_0(\phi)$, since $\mathbb{P}^*\{\tau > 0\} = 1$. Let us stress that if a trading strategy $\phi$ is self-financing prior to default then $\phi$ is also self-financing on $[0, T]$. Indeed, we always postulate that trading ceases at time of default, and the terminal wealth at time $\tau \wedge T$ equals

\[ V_{\tau \wedge T}(\phi) = \sum_{i=1}^k \phi^i_{\tau \wedge T} Y^i_{\tau \wedge T}. \]

Of course, on the event $\{\tau > T\}$ we also have

\[ V_{\tau \wedge T}(\phi) = V_T(\phi) = \tilde V_T(\phi) = \sum_{i=1}^m \phi^i_T \tilde Y^i_T + \sum_{i=m+1}^k \phi^i_T Y^i_T. \]

Hence, we shall not distinguish in what follows between the concept of a self-financing trading strategy and a trading strategy self-financing prior to default.


Zero Recovery for Defaultable Assets

The following assumption corresponds to the simplest situation of zero recovery for all defaultable primary assets that are used for replication. Manifestly, this assumption is not practical, and thus it will be later relaxed.

Assumption (A). The defaultable primary assets $Y^1, \dots, Y^m$ are all subject to the zero recovery scheme, and they have a common default time $\tau$.

By virtue of Assumption (A), the prices $Y^1, \dots, Y^m$ vanish at default time $\tau$, and thus also after this date. Consequently, for every $i = 1, \dots, m$ we have $Y^i_t = \mathbb{1}_{\{\tau > t\}} \tilde Y^i_t$ for every $t \in [0, T]$ for some $\mathbb{F}$-predictable processes $\tilde Y^1, \dots, \tilde Y^m$. In other words, for any $i = 1, \dots, m$ the price $Y^i$ jumps from $\tilde Y^i_{\tau-}$ to $Y^i_\tau = 0$ at the time of default. We make a technical assumption that the pre-default values $\tilde Y^1, \dots, \tilde Y^m$ are continuous $\mathbb{F}$-semimartingales.

In order to be able to use the price $Y^1$ as a numeraire prior to default, we assume that the pre-default price $\tilde Y^1$ is a strictly positive continuous $\mathbb{F}$-semimartingale. Notice that $\tilde Y^1_0 = Y^1_0$.

Assume first zero recovery for the defaultable contingent claim we wish to replicate. Thus, at time $\tau$ the wealth process of any strategy that is capable of replicating the claim $\mathbb{1}_{\{\tau > T\}} X$ should necessarily jump to zero, provided that $\tau \le T$. We can achieve this by considering only self-financing strategies $\phi = (\phi^1, \phi^2, \dots, \phi^k)$ such that at any time the net investment in default-free assets $Y^{m+1}, \dots, Y^k$ equals zero, so that we have

\[ \sum_{i=m+1}^k \phi^i_t Y^i_t = 0, \quad \forall\, t \in [0, T]. \qquad (12) \]

In the general case, that is, when $Z$ is a pre-specified non-zero recovery process for a defaultable claim under consideration, it suffices to consider self-financing strategies $\phi = (\phi^1, \phi^2, \dots, \phi^k)$ such that

\[ \sum_{i=m+1}^k \phi^i_t Y^i_t = Z_t, \quad \forall\, t \in [0, T]. \qquad (13) \]

Notice that prior to default time (that is, on the event $\{\tau > t\}$) we have $\tilde V_t(\phi) = \sum_{i=1}^m \phi^i_t \tilde Y^i_t + Z_t$, and the self-financing property of $\phi$ prior to default time $\tau$ takes the following form

\[ d\tilde V_t(\phi) = \sum_{i=1}^m \phi^i_t\, d\tilde Y^i_t + \sum_{i=m+1}^k \phi^i_t\, dY^i_t. \qquad (14) \]

At default time $\tau$, we have $V_\tau(\phi) = Z_\tau$ on the set $\{\tau \le T\}$.

The next goal is to examine the existence of $\phi$ with the properties described above. To this end, we denote $\tilde Y^{i,1}_t = \tilde Y^i_t (\tilde Y^1_t)^{-1}$ for $i = 2, \dots, m$ and $\tilde Y^{1,k}_t = \tilde Y^1_t (Y^k_t)^{-1}$. As before, we write $Y^{i,k}_t = Y^i_t (Y^k_t)^{-1}$. Using Lemma 7, we obtain the following auxiliary result that will be later used to establish the existence of a replicating strategy for a defaultable claim.

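Proposition 4. (i) Let $\phi = (\phi^1, \phi^2, \dots, \phi^k)$ be a self-financing strategy such that (13) holds. Assume that the processes $\tilde Y^1, Y^{m+1}, \dots, Y^k$ are strictly positive. Then the pre-default wealth process $\tilde V(\phi)$ satisfies for every $t \in [0, T]$

\[ \tilde V_t(\phi) = \tilde Y^1_t \Big( \tilde V^1_0(\phi) + \sum_{i=2}^m \int_0^t \phi^i_u\, d\tilde Y^{i,1}_u + \sum_{i=m+1}^{k-1} \int_0^t \hat\phi^{i,k,1}_u\, d\hat Y^{i,k,1}_u + \int_0^t Z_u (Y^k_u)^{-1}\, d(\tilde Y^{1,k}_u)^{-1} \Big), \]

where we denote

\[ \hat\phi^{i,k,1}_t = \phi^i_t (\tilde Y^{1,k}_t)^{-1} e^{\tilde\alpha^{i,k,1}_t}, \quad \hat Y^{i,k,1}_t = Y^{i,k}_t e^{-\tilde\alpha^{i,k,1}_t}, \]

and

\[ \tilde\alpha^{i,k,1}_t = \langle \ln Y^{i,k}, \ln \tilde Y^{1,k} \rangle_t = \int_0^t (Y^{i,k}_u)^{-1} (\tilde Y^{1,k}_u)^{-1}\, d\langle Y^{i,k}, \tilde Y^{1,k} \rangle_u. \]

In addition, at default time the wealth of $\phi$ equals $V_\tau(\phi) = Z_\tau$ on the event $\{\tau \le T\}$.

(ii) Suppose that the $\mathbb{F}$-predictable processes $\psi^i$, $i = 2, \dots, m$, and $\psi^{i,k,1}$, $i = m+1, \dots, k-1$, are given. For an arbitrary constant $c \in \mathbb{R}$, we define the process $V^1$ by setting, for $t \in [0, T]$,

\[ V^1_t = c + \sum_{i=2}^m \int_0^t \psi^i_u\, d\tilde Y^{i,1}_u + \sum_{i=m+1}^{k-1} \int_0^t \psi^{i,k,1}_u\, d\hat Y^{i,k,1}_u + \int_0^t Z_u (Y^k_u)^{-1}\, d(\tilde Y^{1,k}_u)^{-1}, \]

and we set $V_t = \tilde Y^1_t V^1_t$, in line with the representation in part (i). Then there exists a self-financing trading strategy $\phi = (\phi^1, \phi^2, \dots, \phi^k)$ such that:

(a) $\phi^i_t = \psi^i_t$ for $i = 2, \dots, m$ and $\phi^i_t = \psi^{i,k,1}_t \tilde Y^{1,k}_t e^{-\tilde\alpha^{i,k,1}_t}$ for $i = m+1, \dots, k-1$,

(b) $\phi$ satisfies (13), so that $\sum_{i=m+1}^k \phi^i_t Y^i_t = Z_t$ for every $t \in [0, T]$,

(c) the pre-default wealth $\tilde V(\phi)$ of $\phi$ equals $V$,

(d) at default time the wealth of $\phi$ equals $V_\tau(\phi) = Z_\tau$ on the event $\{\tau \le T\}$.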

Proof. Part (i) is an almost immediate consequence of Lemma 7. Therefore, we shall focus on the second part. The idea of the proof of part (ii) is also rather clear. First, let $\phi^i$, $i = 2, \dots, m$, and $\phi^i$, $i = m+1, \dots, k-1$, be defined from processes $\psi^i$ and $\psi^{i,k,1}$ as in (a). Given the processes $\phi^i$ for $i = m+1, \dots, k-1$, we observe that the component $\phi^k$ is uniquely specified by condition (13). Thus, it remains to check that there exists a (unique) component $\phi^1$ such that the resulting $k$-dimensional trading strategy is self-financing prior to default, in the sense of Definition 4. Let us set

\[ \phi^1_t = \Big( V_t - \sum_{i=2}^m \phi^i_t \tilde Y^i_t - Z_t \Big)(\tilde Y^1_t)^{-1} = \Big( V_t - \sum_{i=2}^m \phi^i_t \tilde Y^i_t - \sum_{i=m+1}^k \phi^i_t Y^i_t \Big)(\tilde Y^1_t)^{-1}. \]

It is clear that $\tilde V_t(\phi) = V_t$ for every $t \in [0, T]$. To show that the strategy $(\phi^1, \phi^2, \dots, \phi^k)$ described above is self-financing prior to default, it suffices to show that for the discounted pre-default wealth

\[ \tilde V^1_t(\phi) = \sum_{i=1}^m \phi^i_t \tilde Y^{i,1}_t + \sum_{i=m+1}^k \phi^i_t \tilde Y^{i,1}_t \]

(where $\tilde Y^{i,1}_t = Y^i_t (\tilde Y^1_t)^{-1}$ for $i = m+1, \dots, k$) we have for every $t \in [0, T]$

\[ \tilde V^1_t(\phi) = \tilde V^1_0(\phi) + \sum_{i=2}^m \int_0^t \phi^i_u\, d\tilde Y^{i,1}_u + \sum_{i=m+1}^k \int_0^t \phi^i_u\, d\tilde Y^{i,1}_u. \]

Towards this end, it is enough to observe that $\tilde V^1_t(\phi) = (\tilde Y^1_t)^{-1} V_t = V^1_t$, and then to verify that

\[ V^1_t = V^1_0 + \sum_{i=2}^m \int_0^t \phi^i_u\, d\tilde Y^{i,1}_u + \sum_{i=m+1}^k \int_0^t \phi^i_u\, d\tilde Y^{i,1}_u \]

for every $t \in [0, T]$. To establish that last equality, it suffices to use the definition of the process $V^1$ and to observe that

\[ \sum_{i=m+1}^{k-1} \psi^{i,k,1}_t\, d\hat Y^{i,k,1}_t = \sum_{i=m+1}^k \phi^i_t\, d\tilde Y^{i,1}_t, \]

which follows by direct calculations, using the definitions of $\phi^i$, $i = m+1, \dots, k$. It is easy to see that the strategy $\phi$ satisfies conditions (a)-(d).

Remarks. Let us observe that the equality established in Proposition 4 is in fact valid on the random interval $[\![0, \tau[\![$ on the event $\{\tau \le T\}$ and on the interval $[0, T]$ on the event $\{\tau > T\}$. It is also important to notice that the assumption of zero recovery for $Y^1, \dots, Y^m$ is not essential for the validity of the statements in the last result, except for the last part, that is, the equality $V_\tau(\phi) = Z_\tau$. Indeed, the proof of Proposition 4 relies on conditions (13) and (14). Therefore, if defaultable primary assets $Y^1, \dots, Y^m$ are subject to non-zero recovery, it will be possible to modify Proposition 4 accordingly (see Section 3.2 below).

When dealing with defaultable claims with no recovery, that is, claims for which the recovery process $Z$ vanishes, it will be convenient to use directly the following corollary to Proposition 4.

Corollary 1. Let $\phi = (\phi^1, \phi^2, \dots, \phi^k)$ be a self-financing strategy such that condition (12) holds.


(i) Assume that the processes $\tilde Y^1, Y^{m+1}, \dots, Y^k$ are strictly positive. Then the pre-default wealth process $\tilde V(\phi)$ satisfies for every $t \in [0, T]$

\[ \tilde V_t(\phi) = \tilde Y^1_t \Big( \tilde V^1_0(\phi) + \sum_{i=2}^m \int_0^t \phi^i_u\, d\tilde Y^{i,1}_u + \sum_{i=m+1}^{k-1} \int_0^t \hat\phi^{i,k,1}_u\, d\hat Y^{i,k,1}_u \Big). \]

(ii) Assume that all primary assets are defaultable, that is, $m = k$, and the pre-default value $\tilde Y^1$ is a strictly positive process. Then the pre-default wealth process $\tilde V(\phi)$ satisfies for every $t \in [0, T]$

\[ \tilde V_t(\phi) = \tilde Y^1_t \Big( \tilde V^1_0(\phi) + \sum_{i=2}^m \int_0^t \phi^i_u\, d\tilde Y^{i,1}_u \Big). \]

Of course, the counterparts of part (ii) in Proposition 4 are also valid and they will be used in what follows, although they are not explicitly formulated here.

Remark. Consider the special case of two primary assets, defaultable and default-free, with prices $Y^1_t = \mathbf{1}_{\{\tau > t\}} \tilde Y^1_t$ and $Y^2_t$, respectively, where the pre-default value $\tilde Y^1$ and the price $Y^2$ are strictly positive, continuous $\mathbb{F}$-semimartingales. Suppose we wish to replicate a defaultable claim with zero recovery. We have

$$V_t(\phi) = \phi^1_t Y^1_t + \phi^2_t Y^2_t = \phi^1_t \mathbf{1}_{\{\tau > t\}} \tilde Y^1_t + \phi^2_t Y^2_t$$

and

$$dV_t(\phi) = \big( V_{t-}(\phi) - \phi^2_t Y^2_t \big)(Y^1_t)^{-1} \, dY^1_t + \phi^2_t \, dY^2_t.$$

It is rather clear that the equality $V_t(\phi) = 0$ on the set $\{\tau \le t\}$ implies that $\phi^2_t = 0$ for every $t \in [0,T]$. Therefore,

$$dV_t(\phi) = V_{t-}(\phi)(Y^1_t)^{-1} \, dY^1_t$$

and the existence of a replicating strategy for a defaultable claim with zero recovery is unlikely within the present setup (except for some trivial cases).

Non-Zero Recovery for Defaultable Assets

In this section, the assumption of zero recovery for the defaultable primary assets $Y^1, \dots, Y^m$ is relaxed. To be more specific, Assumption (A) is replaced by the following weaker restriction.

Assumption (B). We assume that the defaultable assets $Y^1, \dots, Y^m$ are subject to an arbitrary recovery scheme, and they have a common default time $\tau$.

Under Assumption (B), condition (13) no longer implies that $V_\tau(\phi) = Z_\tau$ on the set $\{\tau \le T\}$. We can achieve this requirement by replacing (13) with the following constraint

$$\sum_{i=1}^{m} \phi^i_t \bar Y^i_t + \sum_{i=m+1}^{k} \phi^i_t Y^i_t = Z_t, \quad \forall\, t \in [0,T], \tag{15}$$

where $\bar Y^i$ represents the recovery payoff of the defaultable asset $Y^i$, so that $Y^i_\tau = \bar Y^i_\tau$ for $i = 1, 2, \dots, m$. In this general setup, condition (15) does not seem to be sufficiently restrictive for more explicit calculations. It is plausible, however, that it can be used to derive a replicating strategy in several non-trivial and practically interesting cases.

It is not difficult to see that Proposition 4 can be extended to the case of non-zero recovery for defaultable assets, provided, of course, that we are in a position to find a priori the wealth invested in non-defaultable assets, that is, if the process $\beta_t := \sum_{i=m+1}^{k} \phi^i_t Y^i_t$ is known beforehand. By arguing as in Proposition 4, we then obtain, for every $t \in [0,T]$,

$$V_t(\phi) = Y^1_t \Big( V^1_0(\phi) + \sum_{i=2}^{m} \int_0^t \phi^i_u \, dY^{i,1}_u + \sum_{i=m+1}^{k-1} \int_0^t \phi^{i,k,1}_u \, dY^{i,k,1}_u + \int_0^t \beta_u (Y^k_u)^{-1} \, d(Y^{1,k}_u)^{-1} \Big).$$

In view of (15), we also have that

$$\bar\alpha_t := \sum_{i=1}^{m} \phi^i_t \bar Y^i_t = Z_t - \beta_t, \quad \forall\, t \in [0,T], \tag{16}$$

thereby imposing an additional constraint on the wealth invested in defaultable assets. Condition (16) is not directly accounted for in the last formula for $V(\phi)$, however, and thus the problem at hand is not completely solved. For further considerations related to non-zero recovery of defaultable primary assets, see Sections 4.1 and 4.2.

Fractional recovery of market value. As an example of a non-zero recovery scheme, we consider the so-called fractional recovery of (pre-default) market value (FRMV) scheme with constant recovery rates $\delta_i \ne 1$ (typically, $0 \le \delta_i < 1$). Then we have $\bar Y^i_t = \delta_i \tilde Y^i_t$ for every $i = 1, 2, \dots, m$, where $\bar Y^i$ denotes the recovery payoff and $\tilde Y^i$ the pre-default value of the $i$th defaultable asset, and thus (15) becomes

$$\sum_{i=1}^{m} \phi^i_t \delta_i \tilde Y^i_t + \sum_{i=m+1}^{k} \phi^i_t Y^i_t = Z_t, \quad \forall\, t \in [0,T]. \tag{17}$$

Let us mention that in the case of a defaultable zero-coupon bond, the FRMV scheme results in the following expression for the pre-default value of a defaultable bond with unit face value (see, for instance, Section 2.2.4 in Bielecki et al. (2004a))

$$D^{\delta}_M(t,T) = \mathbb{E}_{\mathbb{Q}^*} \Big( e^{-\int_t^T (r_u + (1-\delta)\gamma_u) \, du} \,\Big|\, \mathcal{F}_t \Big),$$

where the recovery rate $\delta$ may depend on the bond's maturity $T$, in general. In particular, if the default intensity $\gamma$ is deterministic then we have

$$D^{\delta}_M(t,T) = e^{-\int_t^T (1-\delta)\gamma(u) \, du} \, B(t,T).$$

Manifestly, we always have $D^{\delta}_M(\tau, T) = \delta D^{\delta}_M(\tau-, T)$ on the set $\{\tau \le T\}$ under the FRMV scheme.
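The deterministic-intensity formula is easy to evaluate numerically. The following minimal sketch assumes, purely for illustration, a constant short rate `r` and a constant default intensity `gamma`, so that $B(t,T) = e^{-r(T-t)}$; the function name `frmv_bond_price` and the sample numbers are hypothetical and do not come from the text.

```python
import math

def frmv_bond_price(t, T, r, gamma, delta):
    """Pre-default value of a unit zero-coupon bond under FRMV.

    Illustrative assumptions: constant short rate r and constant
    default intensity gamma, so B(t, T) = exp(-r * (T - t)).
    """
    default_free = math.exp(-r * (T - t))                         # B(t, T)
    credit_factor = math.exp(-(1.0 - delta) * gamma * (T - t))    # exp(-int (1-delta) gamma du)
    return credit_factor * default_free

# Zero recovery (delta = 0) discounts at r + gamma; a higher delta reduces the credit spread.
print(frmv_bond_price(0.0, 5.0, r=0.03, gamma=0.02, delta=0.4))
```

Under these assumptions the effective credit spread is $(1-\delta)\gamma$, which is why the price increases with the recovery rate.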

4 Replication of Defaultable Claims

We are in a position to examine the issue of an exact replication of a generic defaultable claim. By a replicating strategy we mean here a self-financing trading strategy $\phi$ such that the wealth process $V(\phi)$ matches exactly the pre-default value of the claim at any time prior to default (and prior to the maturity date), and coincides with the claim's payoff at default time or at maturity date, whichever comes first. Using the notation introduced in Section 2, this can be formalized as follows.

Definition 5. A self-financing trading strategy $\phi$ is a replicating strategy for a defaultable claim $(X, 0, Z, \tau)$ if and only if the following hold:
(i) $V_t(\phi) = U_t(X) + U_t(Z)$ on the random interval $[\![0, \tau \wedge T[\![$,
(ii) $V_\tau(\phi) = Z_\tau$ on the set $\{\tau \le T\}$,
(iii) $V_T(\phi) = X$ on the set $\{\tau > T\}$.
We say that a defaultable claim is attainable if it admits at least one replicating strategy.

The last definition is suitable only in the case of a defaultable claim with no promised dividends. Some comments regarding replication of promised dividends are given in Section 4.3.

4.1 Replication of a Promised Payoff

We shall first examine the possibility of an exact replication of a defaultable contingent claim of the form $(X, 0, 0, \tau)$, that is, a defaultable claim with zero recovery and with no promised dividends. Our approach will be based on Proposition 4. Thus, we assume that processes $Y^1, \dots, Y^m$ represent prices of defaultable primary assets and $Y^{m+1}, \dots, Y^k$ are prices of default-free primary assets. Processes $Y^1, \dots, Y^m, Y^{m+1}, \dots, Y^k$ are assumed to be continuous $\mathbb{F}$-semimartingales, and processes $Y^1, Y^{m+1}, \dots, Y^k$ are strictly positive.

Zero Recovery for Defaultable Primary Assets

Unless explicitly stated otherwise, we postulate that Assumption (A) is valid. Recall that $U_t(X)$ stands for the pre-default value at time $t \in [0,T]$ of a defaultable claim $(X, 0, 0, \tau)$. In the statement of the following result we preserve the notation of Proposition 4.

Proposition 5. Suppose that there exist a constant $V^1_0$ and $\mathbb{F}$-predictable processes $\psi^i$, $i = 2, \dots, m$, and $\psi^{i,k,1}$, $i = m+1, \dots, k-1$, such that

$$Y^1_T \Big( V^1_0 + \sum_{i=2}^{m} \int_0^T \psi^i_u \, dY^{i,1}_u + \sum_{i=m+1}^{k-1} \int_0^T \psi^{i,k,1}_u \, dY^{i,k,1}_u \Big) = X. \tag{18}$$

Let $V_t = Y^1_t V^1_t$, where the process $V^1_t$ is defined as, for every $t \in [0,T]$,

$$V^1_t = V^1_0 + \sum_{i=2}^{m} \int_0^t \psi^i_u \, dY^{i,1}_u + \sum_{i=m+1}^{k-1} \int_0^t \psi^{i,k,1}_u \, dY^{i,k,1}_u.$$

Then the trading strategy $\phi = (\phi^1, \phi^2, \dots, \phi^k)$ defined by

$$\phi^1_t = \Big( V_t - \sum_{i=2}^{m} \psi^i_t Y^i_t \Big)(Y^1_t)^{-1},$$

$$\phi^i_t = \psi^i_t, \quad i = 2, \dots, m,$$

$$\phi^i_t = \psi^{i,k,1}_t Y^{1,k}_t e^{-\alpha^{i,k,1}_t}, \quad i = m+1, \dots, k-1,$$

$$\phi^k_t = -\sum_{i=m+1}^{k-1} \phi^i_t Y^i_t (Y^k_t)^{-1},$$

is self-financing and it replicates the claim $(X, 0, 0, \tau)$. In particular, we have $V_t(\phi) = V_t = U_t(X)$, so that $V$ represents the pre-default value of $(X, 0, 0, \tau)$.

Proof. The statement is an almost immediate consequence of part (ii) of Proposition 4 (see also Corollary 1). The strategy $(\phi^1, \phi^2, \dots, \phi^k)$ introduced in the statement of the proposition is self-financing, and at the default time $\tau$ the wealth $V(\phi)$ jumps to zero. Finally, $V_T(\phi) = V_T = X$ on the event $\{\tau > T\}$. We conclude that $\phi$ is self-financing and it replicates $(X, 0, 0, \tau)$.

The following corollary to Proposition 5 provides the risk-neutral characterization of the process $U_t(X)$, and thereby it furnishes a convenient method for the valuation of a promised payoff.

Corollary 2. Assume that a defaultable claim $(X, 0, 0, \tau)$ is attainable. Suppose that there exists a probability measure $\mathbb{Q}$ such that the processes $Y^{i,1}$, $i = 2, \dots, m$, and the processes $Y^{i,k,1}$, $i = m+1, \dots, k-1$, are $\mathbb{F}$-martingales under $\mathbb{Q}$. If all stochastic integrals in (18) are $\mathbb{Q}$-martingales, rather than $\mathbb{Q}$-local martingales, then the pre-default value of $(X, 0, 0, \tau)$ equals, for every $t \in [0,T]$,

$$U_t(X) = Y^1_t \, \mathbb{E}_{\mathbb{Q}} \big( X (Y^1_T)^{-1} \,\big|\, \mathcal{F}_t \big).$$


Defaultable asset and two default-free assets. In the case when $m = 1$ and $k = 3$, Proposition 5 reduces to the following result. Recall that we denote

$$\alpha^{2,3,1}_t = \langle \ln Y^{2,3}, \ln Y^{1,3} \rangle_t = \int_0^t (Y^{2,3}_u)^{-1} (Y^{1,3}_u)^{-1} \, d\langle Y^{2,3}, Y^{1,3} \rangle_u,$$

where in turn $Y^{1,3}_t = Y^1_t (Y^3_t)^{-1}$ and $Y^{2,3}_t = Y^2_t (Y^3_t)^{-1}$. Moreover, $Y^{2,3,1}_t = Y^{2,3}_t e^{-\alpha^{2,3,1}_t}$. We postulate that the processes $Y^1$, $Y^2$ and $Y^3$ are strictly positive.

Corollary 3. Suppose that there exists a constant $V^1_0$ and an $\mathbb{F}$-predictable process $\psi^{2,3,1}$ such that

$$Y^1_T \Big( V^1_0 + \int_0^T \psi^{2,3,1}_u \, dY^{2,3,1}_u \Big) = X. \tag{19}$$

Let us set $V_t = Y^1_t V^1_t$, where for every $t \in [0,T]$ the process $V^1_t$ is given by

$$V^1_t = V^1_0 + \int_0^t \psi^{2,3,1}_u \, dY^{2,3,1}_u. \tag{20}$$

Then the trading strategy $\phi = (\phi^1, \phi^2, \phi^3)$, given by the expressions

$$\phi^1_t = V_t (Y^1_t)^{-1}, \quad \phi^2_t = \psi^{2,3,1}_t Y^{1,3}_t e^{-\alpha^{2,3,1}_t}, \quad \phi^3_t = -\phi^2_t Y^2_t (Y^3_t)^{-1},$$

is self-financing prior to default and it replicates a claim $(X, 0, 0, \tau)$.

Assume that a claim $(X, 0, 0, \tau)$ is attainable, and let $\mathbb{Q}$ be a probability measure such that $Y^{2,3,1}$ is an $\mathbb{F}$-martingale under $\mathbb{Q}$. Then the pre-default value of $(X, 0, 0, \tau)$ equals, for every $t \in [0,T]$,

$$U_t(X) = Y^1_t \, \mathbb{E}_{\mathbb{Q}} \big( X (Y^1_T)^{-1} \,\big|\, \mathcal{F}_t \big), \tag{21}$$

provided that the integral in (20) is also a $\mathbb{Q}$-martingale.

Example 1. Assume that

$$dY^1_t = Y^1_t \big( \mu_t \, dt + \sigma^1_t \, dW^*_t \big)$$

and

$$dY^i_t = Y^i_t \big( r_t \, dt + \sigma^i_t \, dW^*_t \big)$$

for $i = 2, 3$, where $W^*$ is a one-dimensional standard Brownian motion with respect to the filtration $\mathbb{F} = \mathbb{F}^{W^*}$ under the martingale measure $\mathbb{Q}^*$. Then for the processes $Y^{1,3}_t = Y^1_t (Y^3_t)^{-1}$ and $Y^{2,3}_t = Y^2_t (Y^3_t)^{-1}$ we get

$$dY^{1,3}_t = Y^{1,3}_t \Big( \big( \mu_t - r_t + \sigma^3_t (\sigma^3_t - \sigma^1_t) \big) \, dt + (\sigma^1_t - \sigma^3_t) \, dW^*_t \Big),$$

$$dY^{2,3}_t = Y^{2,3}_t \Big( \sigma^3_t (\sigma^3_t - \sigma^2_t) \, dt + (\sigma^2_t - \sigma^3_t) \, dW^*_t \Big),$$

and thus

$$\alpha^{2,3,1}_t = \int_0^t (\sigma^3_u - \sigma^1_u)(\sigma^3_u - \sigma^2_u) \, du.$$

Hence, the process $Y^{2,3,1}_t = Y^{2,3}_t e^{-\alpha^{2,3,1}_t}$ satisfies

$$dY^{2,3,1}_t = Y^{2,3,1}_t \Big( \sigma^1_t (\sigma^3_t - \sigma^2_t) \, dt + (\sigma^2_t - \sigma^3_t) \, dW^*_t \Big).$$
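The drift of $Y^{2,3,1}$ can be double-checked by plain arithmetic. The sketch below assumes constant volatilities (a simplification that the text does not make) and uses hypothetical function names; it subtracts the growth rate of $\alpha^{2,3,1}$ from the relative drift of $Y^{2,3}$ and compares the result with the closed form $\sigma^1(\sigma^3 - \sigma^2)$.

```python
def alpha_rate(s1, s2, s3):
    """Integrand of alpha^{2,3,1}: (sigma3 - sigma1) * (sigma3 - sigma2)."""
    return (s3 - s1) * (s3 - s2)

def drift_y23(s2, s3):
    """Relative drift of Y^{2,3} = Y^2 / Y^3 under Q*."""
    return s3 * (s3 - s2)

def drift_y231(s1, s2, s3):
    """Relative drift of Y^{2,3,1} = Y^{2,3} * exp(-alpha^{2,3,1})."""
    return drift_y23(s2, s3) - alpha_rate(s1, s2, s3)

s1, s2, s3 = 0.30, 0.10, 0.20   # illustrative constant volatilities
print(drift_y231(s1, s2, s3), s1 * (s3 - s2))
```

The two printed numbers agree up to rounding, which mirrors the algebraic identity $\sigma^3(\sigma^3 - \sigma^2) - (\sigma^3 - \sigma^1)(\sigma^3 - \sigma^2) = \sigma^1(\sigma^3 - \sigma^2)$.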

If $\sigma^2 \ne \sigma^3$ then, under mild technical assumptions, there exists a probability measure $\mathbb{Q}$ such that $Y^{2,3,1}$ is a martingale. To conclude, it suffices to use the fact that an $\mathcal{F}_T$-measurable random variable $X (Y^1_T)^{-1}$ can be represented (by virtue of the predictable representation theorem) as follows

$$X (Y^1_T)^{-1} = U_0(X) + \int_0^T \phi^{2,3,1}_u \, dY^{2,3,1}_u$$

for some $\mathbb{F}$-predictable process $\phi^{2,3,1}$. It is natural to conjecture that within the present setup all defaultable claims with zero recovery and no promised dividends will be attainable, provided that the underlying default-free market is assumed to be complete, and provided we can use in our hedging portfolio a defaultable asset that is sensitive to the same default risk as the defaultable claim that we want to hedge.

Two defaultable assets. Let us examine the case when $m = k = 2$. We thus consider two defaultable primary assets $Y^1$ and $Y^2$ with zero recovery at default.

Corollary 4. Suppose that there exists a constant $V^1_0$ and an $\mathbb{F}$-predictable process $\psi^2$ such that

$$Y^1_T \Big( V^1_0 + \int_0^T \psi^2_u \, dY^{2,1}_u \Big) = X, \tag{22}$$

where $Y^{2,1}_t = Y^2_t (Y^1_t)^{-1}$. Let us set $V_t = Y^1_t V^1_t$, where for every $t \in [0,T]$ the process $V^1_t$ is given by

$$V^1_t = V^1_0 + \int_0^t \psi^2_u \, dY^{2,1}_u. \tag{23}$$

Then the trading strategy $\phi = (\phi^1, \phi^2)$ where, for every $t \in [0,T]$,

$$\phi^1_t = (V_t - \psi^2_t Y^2_t)(Y^1_t)^{-1}, \quad \phi^2_t = \psi^2_t,$$

is self-financing and it replicates a defaultable claim $(X, 0, 0, \tau)$.

Suppose that $(X, 0, 0, \tau)$ is an attainable claim. Let $\mathbb{Q}$ be a probability measure such that $Y^{2,1}$ is an $\mathbb{F}$-martingale under $\mathbb{Q}$. If the stochastic integral in (23) is a $\mathbb{Q}$-martingale, then the pre-default value of $(X, 0, 0, \tau)$ satisfies, for every $t \in [0,T]$,

$$U_t(X) = Y^1_t \, \mathbb{E}_{\mathbb{Q}} \big( X (Y^1_T)^{-1} \,\big|\, \mathcal{F}_t \big). \tag{24}$$


Remark. Under the assumptions of Corollary 4, a defaultable claim $(X, 0, 0, \tau)$ is attainable since the associated promised payoff $X$ can be achieved by trading in the pre-default values $Y^1$ and $Y^2$. If we introduce, in addition, some default-free assets, a replicating strategy for an arbitrary defaultable claim $(X, 0, 0, \tau)$ will typically have a zero net investment in default-free assets. Therefore, default-free assets are not relevant if we restrict our attention to defaultable claims of the form $(X, 0, 0, \tau)$.

Non-Zero Recovery for Defaultable Primary Assets

We relax Assumption (A), and we postulate instead that Assumption (B) is valid. Specifically, let us consider $m$ defaultable primary assets with a common default time $\tau$ that are subject to a fractional recovery of market value (see Section 3.2) with $\delta_i = \delta \ne 1$ for $i = 1, 2, \dots, m$. Let us denote

$$\alpha_t = \sum_{i=1}^{m} \phi^i_t Y^i_t, \quad \beta_t = \sum_{i=m+1}^{k} \phi^i_t Y^i_t,$$

so that $\alpha_t + \beta_t$ represents the pre-default wealth of $\phi$. As usual, $U_t(X)$ stands for the pre-default value at time $t$ of the promised payoff $X$. It is rather clear that the processes $\alpha_t$ and $\beta_t$ should be chosen in such a way that $\alpha_t + \beta_t = U_t(X)$ and $\bar\alpha_t + \beta_t = \delta\alpha_t + \beta_t = 0$ for every $t \in [0,T]$, where $\bar\alpha_t = \delta\alpha_t$ is the value of the defaultable assets upon default (for the latter equality, see (16) and (17)). By solving these equations, we obtain, for every $t \in [0,T]$,

$$\alpha_t = (1 - \delta)^{-1} U_t(X), \quad \beta_t = (\delta - 1)^{-1} \delta \, U_t(X).$$

We end up with the following equation

$$Y^1_T \Big( U_0(X) + \sum_{i=2}^{m} \int_0^T \phi^i_u \, dY^{i,1}_u + \sum_{i=m+1}^{k-1} \int_0^T \phi^{i,k,1}_u \, dY^{i,k,1}_u + \int_0^T \beta_u (Y^k_u)^{-1} \, d(Y^{1,k}_u)^{-1} \Big) = X.$$

Using the latter equation, one may try to establish a suitable extension of Proposition 5. Notice that the process $\beta$ depends explicitly on the pre-default value $U(X)$. In addition, we need to take care of the constraint $\alpha_t = (1 - \delta)^{-1} U_t(X)$ for every $t \in [0,T]$. Thus, the problem of replication of a promised payoff under non-zero recovery for defaultable primary assets seems to be rather difficult to solve, in general.

4.2 Replication of a Recovery Payoff

Let us now focus on the recovery payoff $Z$ at time of default. As before, we write $U_t(Z)$ to denote the pre-default value at time $t \in [0,T]$ of the claim $(0, 0, Z, \tau)$. Recall that the pre-default value satisfies $U_T(Z) = 0$ (and the price of the claim also vanishes at time $T$ on the event $\{\tau > T\}$).

Zero Recovery for Defaultable Primary Assets

In order to examine the replicating strategy, we shall once again make use of Proposition 4. As already explained, in this case we need to assume that condition (11) is imposed on a strategy $\phi$ we are looking for, that is, we necessarily have $\sum_{i=m+1}^{k} \phi^i_t Y^i_t = Z_t$ for every $t \in [0,T]$.

Proposition 6. Suppose that there exist a constant $V^1_0$ and $\mathbb{F}$-predictable processes $\psi^i$, $i = 2, \dots, m$, and $\psi^{i,k,1}$, $i = m+1, \dots, k-1$, such that

$$Y^1_T \Big( V^1_0 + \sum_{i=2}^{m} \int_0^T \psi^i_u \, dY^{i,1}_u + \sum_{i=m+1}^{k-1} \int_0^T \psi^{i,k,1}_u \, dY^{i,k,1}_u + \int_0^T Z_u (Y^k_u)^{-1} \, d(Y^{1,k}_u)^{-1} \Big) = 0. \tag{25}$$

Let $V_t = Y^1_t V^1_t$, where the process $V^1_t$ is defined as

$$V^1_t = V^1_0 + \sum_{i=2}^{m} \int_0^t \psi^i_u \, dY^{i,1}_u + \sum_{i=m+1}^{k-1} \int_0^t \psi^{i,k,1}_u \, dY^{i,k,1}_u + \int_0^t Z_u (Y^k_u)^{-1} \, d(Y^{1,k}_u)^{-1}.$$

Then the replicating strategy $\phi = (\phi^1, \phi^2, \dots, \phi^k)$ for $(0, 0, Z, \tau)$ is given by

$$\phi^1_t = \Big( V_t - Z_t - \sum_{i=2}^{m} \phi^i_t Y^i_t \Big)(Y^1_t)^{-1},$$

$$\phi^i_t = \psi^i_t, \quad i = 2, \dots, m,$$

$$\phi^i_t = \psi^{i,k,1}_t Y^{1,k}_t e^{-\alpha^{i,k,1}_t}, \quad i = m+1, \dots, k-1,$$

$$\phi^k_t = \Big( Z_t - \sum_{i=m+1}^{k-1} \phi^i_t Y^i_t \Big)(Y^k_t)^{-1}.$$

Proof. The proof is based on an application of part (ii) of Proposition 4. First, notice that by virtue of the specification of the strategy $\phi$ we have $V_t(\phi) = V_t$ for every $t \in [0,T]$. Moreover, $V_\tau(\phi) = Z_\tau$ on the set $\{\tau \le T\}$. Finally, $V_T(\phi) = V_T = 0$ on the event $\{\tau > T\}$.

Defaultable asset and two default-free assets. For ease of reference, we consider here a special case of Proposition 6. We take $m = 1$ and $k = 3$, and we postulate that the processes $Y^1$, $Y^2$ and $Y^3$ are strictly positive. Recall that the recovery process $Z$, and thus also its pre-default value process $U(Z)$, are prespecified.

Corollary 5. Suppose that there exists a constant $V^1_0$ and an $\mathbb{F}$-predictable process $\psi^{2,3,1}$ such that

$$Y^1_T \Big( V^1_0 + \int_0^T \psi^{2,3,1}_u \, dY^{2,3,1}_u + \int_0^T Z_u (Y^3_u)^{-1} \, d(Y^{1,3}_u)^{-1} \Big) = 0. \tag{26}$$

Let $V_t = Y^1_t V^1_t$, where the process $V^1_t$ is defined as

$$V^1_t = V^1_0 + \int_0^t \psi^{2,3,1}_u \, dY^{2,3,1}_u + \int_0^t Z_u (Y^3_u)^{-1} \, d(Y^{1,3}_u)^{-1}.$$

Then the replicating strategy for the claim $(0, 0, Z, \tau)$ equals

$$\phi^1_t = (V_t - Z_t)(Y^1_t)^{-1}, \quad \phi^2_t = \psi^{2,3,1}_t Y^{1,3}_t e^{-\alpha^{2,3,1}_t}, \quad \phi^3_t = (Z_t - \phi^2_t Y^2_t)(Y^3_t)^{-1}.$$

The existence of $\psi^{2,3,1}$, as well as the possibility of deriving a closed-form expression for $\phi$, are not obvious. One needs to impose more specific assumptions on the price processes of primary assets and the recovery process in order to obtain results that would be more practical.

If there exists a probability measure $\mathbb{Q}^*$ such that $Y^{2,3,1}$ is an $\mathbb{F}$-martingale, then the (ex-dividend) value of $Z$ equals

$$U_t(Z) = Y^1_t \, \mathbb{E}_{\mathbb{Q}^*} \Big( \int_t^T Z_u (Y^3_u)^{-1} \, d(Y^{1,3}_u)^{-1} \,\Big|\, \mathcal{F}_t \Big).$$

Two defaultable assets. Of course, if both defaultable primary assets are subject to the zero recovery scheme, and no other asset is available for trade, no replicating strategy exists in the case of a non-zero recovery process $Z$. Thus, we need to postulate a more general recovery scheme for defaultable assets if we wish to have a positive result.

Non-Zero Recovery for Defaultable Primary Assets

Suppose now that Assumption (B) is valid and $Y^1, \dots, Y^m$ are defaultable primary assets with a fractional recovery of market value. We assume that $\delta_i = \delta \ne 1$ for $i = 1, 2, \dots, m$, and we proceed along similar lines as in Section 4.1. Recall that we denote

$$\alpha_t = \sum_{i=1}^{m} \phi^i_t Y^i_t, \quad \beta_t = \sum_{i=m+1}^{k} \phi^i_t Y^i_t.$$

We now postulate that $\alpha_t + \beta_t = U_t(Z)$ and $\bar\alpha_t + \beta_t = \delta\alpha_t + \beta_t = Z_t$ for every $t \in [0,T]$, where $U_t(Z)$ is the pre-default value of $(0, 0, Z, \tau)$ and $\bar\alpha_t = \delta\alpha_t$ is the value of the defaultable assets upon default. Consequently, for every $t \in [0,T]$ we have

$$\alpha_t = (\delta - 1)^{-1} (Z_t - U_t(Z)), \quad \beta_t = (\delta - 1)^{-1} (\delta U_t(Z) - Z_t).$$
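The two linear constraints can be solved mechanically at any fixed time $t$. The following sketch (the function name `split_wealth` and the sample numbers are illustrative assumptions, not part of the text) solves $\alpha + \beta = U$ and $\delta\alpha + \beta = Z$ and returns the closed-form solution; setting $Z = 0$ recovers the promised-payoff case treated in Section 4.1.

```python
def split_wealth(u, z, delta):
    """Solve alpha + beta = u and delta * alpha + beta = z (delta != 1).

    u: pre-default value of the claim at a fixed time t,
    z: recovery payoff at that time,
    delta: common fractional recovery rate of the defaultable assets.
    """
    if delta == 1.0:
        raise ValueError("the system is singular when delta equals 1")
    alpha = (z - u) / (delta - 1.0)          # wealth held in defaultable assets
    beta = (delta * u - z) / (delta - 1.0)   # wealth held in default-free assets
    return alpha, beta

alpha, beta = split_wealth(u=0.9, z=0.5, delta=0.4)
print(alpha, beta)
```

The guard on `delta` reflects the standing assumption $\delta \ne 1$: with full recovery the two constraints are either contradictory or redundant.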

To find a replicating strategy for a defaultable claim $(0, 0, Z, \tau)$, we need, in particular, to find $\mathbb{F}$-predictable processes $\psi^i$ and $\psi^{i,k,1}$ such that the equality

$$U_t(Z) = Y^1_t \Big( U_0(Z) + \sum_{i=2}^{m} \int_0^t \psi^i_u \, dY^{i,1}_u + \sum_{i=m+1}^{k-1} \int_0^t \psi^{i,k,1}_u \, dY^{i,k,1}_u + \int_0^t \beta_u (Y^k_u)^{-1} \, d(Y^{1,k}_u)^{-1} \Big)$$

is satisfied for every $t \in [0,T]$. As in Section 4.1, we conclude that the considered problem is non-trivial, in general.

4.3 Replication of Promised Dividends

We return to the case of zero recovery for defaultable primary assets, and we consider a defaultable claim $(0, C, 0, \tau)$. In principle, replication of the stream of promised dividends can be reduced to previously considered cases (which is why it was possible to postulate in Definition 5 that $C = 0$). Specifically, it suffices to introduce the recovery process $Z^C$ generated by $C$ by setting, for every $t \in [0,T]$,

$$Z^C_t = \int_{(0,t)} B^{-1}(u,t) \, dC_u,$$

and to combine it with the terminal payoff $\mathbf{1}_{\{\tau > T\}} X^C$, where the promised payoff $X^C$ associated with $C$ equals

$$X^C = \int_{(0,T]} B^{-1}(u,T) \, dC_u.$$
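For a discrete dividend stream the two integrals reduce to finite sums. The sketch below assumes, for illustration only, a constant short rate `r`, so that $B^{-1}(u,t) = e^{r(t-u)}$, and dividends paid as lump sums at given dates; the function name and the sample coupons are hypothetical.

```python
import math

def accumulated_dividends(dividends, horizon, r, include_horizon):
    """Sum of dividends rolled forward to `horizon` at a constant rate r.

    dividends: list of (payment_time, amount) pairs with payment_time > 0.
    include_horizon: False integrates over (0, horizon), as for Z^C_t;
                     True integrates over (0, horizon], as for X^C.
    """
    total = 0.0
    for u, amount in dividends:
        inside = 0.0 < u < horizon or (include_horizon and u == horizon)
        if inside:
            total += amount * math.exp(r * (horizon - u))  # B^{-1}(u, horizon)
    return total

coupons = [(1.0, 5.0), (2.0, 5.0)]
z_c = accumulated_dividends(coupons, horizon=2.0, r=0.03, include_horizon=False)
x_c = accumulated_dividends(coupons, horizon=2.0, r=0.03, include_horizon=True)
print(z_c, x_c)
```

The only difference between the two calls is the treatment of a dividend paid exactly at the horizon, which matches the open interval $(0,t)$ in $Z^C_t$ and the half-closed interval $(0,T]$ in $X^C$.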

It should be stressed, however, that the pre-default price of an "equivalent" defaultable claim $(X^C, 0, Z^C, \tau)$ introduced above does not coincide with the pre-default price of the original claim $(0, C, 0, \tau)$, that is, processes $U(C)$ and $U(Z^C) + U(X^C)$ are not identical. But, clearly, the equality $U_0(C) = U_0(Z^C) + U_0(X^C)$ is satisfied, and thus at time 0 the replicating strategies for both claims coincide.

Remark. It is apparent that the concept of the (ex-dividend) pre-default price $U(C)$ does not fit well into the study of replication of promised dividends if one only considers non-dividend paying primary assets. It would be much more convenient to use it in the case of dividend-paying (default-free or defaultable) primary assets. For instance, it is sometimes legitimate to postulate the existence of a default-free version of the defaultable claim $(0, C, 0, \tau)$, that is, a default-free asset with the dividend stream $C$.

If we insist on working directly with the process $U(C)$, then we derive the following set of necessary conditions for a self-financing trading strategy $\phi$ with the consumption process $A = -C$:

$$\sum_{i=m+1}^{k} \phi^i_t Y^i_t = 0, \quad V_t(\phi) = \sum_{i=1}^{m} \phi^i_t Y^i_t = U_t(C), \tag{27}$$

and

$$dV_t(\phi) = \sum_{i=1}^{m} \phi^i_t \, dY^i_t + \sum_{i=m+1}^{k} \phi^i_t \, dY^i_t - dC_t = dU_t(C). \tag{28}$$

The existence of a strategy $\phi = (\phi^1, \phi^2, \dots, \phi^k)$ with consumption process $A = -C$ which satisfies (27)-(28) is not evident, however.

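Example 2. Let us take, for instance, $m = 1$ and $k = 3$. Then conditions (27)-(28) become:

$$\phi^1_t Y^1_t = U_t(C), \quad \phi^2_t Y^2_t + \phi^3_t Y^3_t = 0,$$

and

$$\phi^1_t \, dY^1_t + \phi^2_t \, dY^2_t + \phi^3_t \, dY^3_t = dU_t(C) + dC_t.$$

Assume that under $\mathbb{Q}^*$ we have

$$dY^1_t = \mu_t \, dt + \sigma^1_t \, dW^*_t,$$

$$dY^i_t = r_t \, dt + \sigma^i_t \, dW^*_t, \quad i = 2, 3,$$

$$dU_t(C) = a_t \, dt + b_t \, dW^*_t.$$

If, in addition, $dC_t = c_t \, dt$ then we obtain the following system of equations for $\phi = (\phi^1, \phi^2, \phi^3)$

$$\phi^1_t Y^1_t = U_t(C),$$

$$\phi^2_t Y^2_t + \phi^3_t Y^3_t = 0,$$

$$\phi^1_t \mu^1_t + \phi^2_t \mu^2_t + \phi^3_t \mu^3_t = a_t + c_t,$$

$$\phi^1_t \sigma^1_t + \phi^2_t \sigma^2_t + \phi^3_t \sigma^3_t = b_t,$$

where $\mu^1_t = \mu_t$ and $\mu^2_t = \mu^3_t = r_t$ are the drift coefficients of the primary assets.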

4.4 Replication of a First-to-Default Claim

Until now, we have always postulated that a random time $\tau$ represents a common default time for all defaultable primary assets, as well as for the defaultable contingent claim under consideration. This simplifying assumption manifestly fails to hold in the case of a credit derivative that explicitly depends on the default times of several (possibly independent) reference entities. Consequently, the issue of replication of a so-called first-to-default claim is more challenging, and the approach presented in the preceding sections needs to be extended.

Let the random times $\tau_1, \dots, \tau_m$ represent the default times of $m$ reference entities that underlie a given first-to-default claim. We assume that $\mathbb{Q}^*\{\tau_i = \tau_j\} = 0$ for every $i \ne j$, and we denote by $\tau_{(1)}$ the random moment of the first default, that is, we set $\tau_{(1)} = \min\{\tau_1, \tau_2, \dots, \tau_m\} = \tau_1 \wedge \tau_2 \wedge \dots \wedge \tau_m$. A first-to-default claim $(X, C, Z^1, \dots, Z^m, \tau_1, \dots, \tau_m)$ with maturity date $T$ can be described as follows. If $\tau_{(1)} = \tau_i \le T$ for some $i = 1, \dots, m$, then it pays at time $\tau_{(1)}$ the amount $Z^i_{\tau_{(1)}}$, where $Z^i$ is an $\mathbb{F}$-predictable recovery process. Otherwise, that is, if $\tau_{(1)} > T$, the claim pays at time $T$ an $\mathcal{F}_T$-measurable promised amount $X$. Finally, the claim pays the promised dividend stream $C$ prior to the default time $\tau_{(1)}$, more precisely, on the random interval $\mathbf{1}_{\{\tau_{(1)} \le T\}} [\![0, \tau_{(1)}[\![ \, \cup \, \mathbf{1}_{\{\tau_{(1)} > T\}} [0,T]$. It is clear that the dividend process of a generic first-to-default claim equals, for every $t \in [0,T]$,

$$D_t = X \mathbf{1}_{\{\tau_{(1)} > T\}} \mathbf{1}_{[T,\infty)}(t) + \int_{(0,t]} (1 - H^{(1)}_u) \, dC_u + \int_{(0,t]} Z^i_u \mathbf{1}_{\{\tau_{(1)} = \tau_i\}} \, dH^{(1)}_u,$$

where $H^{(1)}_t = 1 - \prod_{i=1}^{m} (1 - H^i_t)$ or, equivalently, $H^{(1)}_t = \mathbf{1}_{\{\tau_{(1)} \le t\}}$. Let $\mathbb{H}^i$ be the filtration generated by the process $H^i_t = \mathbf{1}_{\{\tau_i \le t\}}$ for $i = 1, 2, \dots, m$, and let the filtration $\mathbb{G}$ be given as $\mathbb{G} = \mathbb{F} \vee \mathbb{H}^1 \vee \mathbb{H}^2 \vee \dots \vee \mathbb{H}^m$. Then, by definition, the (ex-dividend) price of $(X, C, Z^1, \dots, Z^m, \tau_1, \dots, \tau_m)$ equals, for every $t \in [0,T]$,

$$U_t = B_t \, \mathbb{E}_{\mathbb{Q}^*} \Big( \int_{(t,T]} B_u^{-1} \, dD_u \,\Big|\, \mathcal{G}_t \Big).$$
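The payoff rule of a first-to-default claim with no promised dividends ($C = 0$) can be sketched for a single scenario. The code below is illustrative only: default times are given as plain numbers, recovery processes are passed as ordinary functions of time, and all names are hypothetical. It also checks the product formula for the first-default indicator $H^{(1)}$.

```python
def first_to_default_payoff(default_times, recoveries, promised, maturity):
    """Return (payment_time, amount) for one scenario, assuming C = 0.

    default_times: list of default times tau_1, ..., tau_m (no ties assumed).
    recoveries: list of functions t -> Z^i_t, one per reference entity.
    promised: promised payoff X, paid at maturity if no default occurs by then.
    """
    first = min(default_times)                 # tau_(1)
    if first <= maturity:
        i = default_times.index(first)         # index of the first defaulter
        return first, recoveries[i](first)
    return maturity, promised

def first_default_indicator(default_times, t):
    """H^(1)_t = 1 - prod_i (1 - H^i_t), with H^i_t = 1 if tau_i <= t else 0."""
    product = 1
    for tau in default_times:
        product *= 1 - (1 if tau <= t else 0)
    return 1 - product

taus = [3.2, 1.5, 7.0]
recs = [lambda t: 0.4, lambda t: 0.5, lambda t: 0.3]
print(first_to_default_payoff(taus, recs, promised=1.0, maturity=5.0))
```

The no-ties assumption in the docstring corresponds to the condition $\mathbb{Q}^*\{\tau_i = \tau_j\} = 0$ for $i \ne j$, which makes the identity of the first defaulter well defined.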

By a pre-default value of a claim we mean an $\mathbb{F}$-adapted process $\tilde U$ such that $U_t = \tilde U_t \mathbf{1}_{\{\tau_{(1)} > t\}}$ for every $t \in [0,T]$. The following definition is a direct extension of Definition 5 (thus, we maintain the assumption that $C = 0$). By a self-financing strategy we mean here a strategy which is self-financing prior to the first default (cf. Definition 4), and thus it is self-financing on $[0,T]$ as well.

Definition 6. A self-financing strategy $\phi$ is a replicating strategy for a first-to-default contingent claim $(X, 0, Z^1, \dots, Z^m, \tau_1, \dots, \tau_m)$ if and only if the following hold:
(i) $V_t(\phi) = \tilde U_t$ on the random interval $[\![0, \tau_{(1)} \wedge T[\![$,
(ii) $V_{\tau_{(1)}}(\phi) = Z^i_{\tau_{(1)}}$ on the event $\{\tau_{(1)} = \tau_i \le T\}$,
(iii) $V_T(\phi) = X$ on the event $\{\tau_{(1)} > T\}$.

In order to provide a replicating strategy for a first-to-default claim we postulate the existence of $m$ defaultable primary assets $Y^1, \dots, Y^m$ with the corresponding default times $\tau_1, \dots, \tau_m$. It is natural to postulate that these default times coincide with the default times of the $m$ reference entities that underlie the first-to-default claim under consideration. It should be stressed that, typically, the pre-default value $Y^j$ will follow a discontinuous process (for instance, it may have jumps at default times of other entities). Finally, let us recall that $\bar Y^i_t$ represents the recovery payoff of the $i$th defaultable asset if its default occurs at time $t$.

Case of zero promised dividends. We shall assume from now on that $C = 0$. For arbitrary $i \ne j$, let $Y^{ij}_t$ represent the pre-default value of the $i$th asset conditioned on the event $\{\tau_{(1)} = \tau_j = t\}$. More explicitly, $Y^{ij}_t$ is equal to $Y^i_t$ on the random interval $[\![\tau_{(1)} \mathbf{1}_D, \tau_{(2)} \mathbf{1}_D[\![$, where $D = \{\tau_{(1)} = \tau_j\}$ and $\tau_{(2)}$ is the time of the second default ($Y^{ij}_t$ is not defined outside the random interval introduced above). At the intuitive level, the process $Y^{ij}_t$ gives the value at time $t$ of the $i$th defaultable asset, provided that the first default has occurred at time $t$, and the $j$th entity is the first defaulting entity. Hence, $Y^{ij}_t$ is not a new process, but rather an additional notation introduced in order to simplify the formulae that follow.

Remark. It is important to stress that the notion of a 'defaultable asset' should not be understood literally. For instance, in the case of the so-called flight to quality the price of a default-free bond is discontinuous, and it jumps at the moment $\tau$ associated with some 'default event' (see, e.g., Collin-Dufresne et al. (2003)). Thus, from the perspective of hedging, a default-free bond may be formally classified as a 'defaultable asset'.

In order to find a replicating strategy $\phi$ for a first-to-default claim within the present setup, we need to impose the following $m$ conditions on its components $\phi^1, \dots, \phi^k$: for every $j = 1, \dots, m$ and every $t \in [0,T]$

$$\sum_{i=1,\, i \ne j}^{m} \phi^i_t Y^{ij}_t + \phi^j_t \bar Y^j_t + \sum_{i=m+1}^{k} \phi^i_t Y^i_t = Z^j_t, \tag{29}$$

where $\bar Y^j$ denotes the recovery payoff of the $j$th defaultable asset and $Z^1, \dots, Z^m$ is a given family of recovery processes. Recall that $Z^j$ specifies the payoff received by the owner of a claim if the first default occurs prior to or at $T$, and the first defaulting entity is the $j$th entity.

For the sake of concreteness, assume that

$$Z^j_t = g_j(t, Y^1_t, \dots, Y^m_t, \bar Y^1_t, \dots, \bar Y^m_t, Y^{m+1}_t, \dots, Y^k_t)$$

for some function $g_j : \mathbb{R}^{k+m+1} \to \mathbb{R}$, where $\bar Y^i$ stands for the recovery payoff of the $i$th defaultable asset. Under some additional assumptions, the system of equations (29) can be solved explicitly for $\phi^1, \dots, \phi^m$. In the second step, we need to choose processes $\phi^{m+1}, \dots, \phi^k$ in such a way that a strategy $\phi$ is self-financing prior to the first default, and thus also on the random interval $[\![0, \tau_{(1)} \wedge T]\!]$. Finally, the wealth of a strategy $\phi$ should match the promised payoff $X$ at time $T$ on the event $\{\tau_{(1)} > T\}$. Equivalently, the wealth of $\phi$ should coincide with the value of the considered claim prior to and at default, or up to time $T$ if there is no default in $[0,T]$. It is apparent that the problem of existence of a replicating strategy is non-trivial, but it can be solved in some circumstances.

A detailed analysis of an explicit replication result for a particular example of a first-to-default claim is given in Section 5.2.

5 Vulnerable Claims and Credit Derivatives

In this section, we present a few examples of models and simple defaultable claims for which an explicit replicating strategy exists. We maintain our assumption that the default time $\tau$ admits a continuous hazard process $\Gamma$ with respect to $\mathbb{F}$ under $\mathbb{Q}^*$, where $\mathbb{F} = \mathbb{F}^{W^*}$ is generated by a Brownian motion $W^*$. Recall that $\Gamma$ is also assumed to be an increasing process.

5.1 Vulnerable Claims

Let us fix $T > 0$. We postulate that the $T$-maturity default-free bond and the defaultable zero-coupon bond with zero recovery are also traded assets. As before, we assume that the risk-neutral dynamics of the default-free discount bond are

$$dB(t,T) = B(t,T) \big( r_t \, dt + b(t,T) \, dW^*_t \big)$$

for some $\mathbb{F}$-predictable volatility process $b(t,T)$.

Vulnerable Call Options

For a fixed $U > T$, we assume that the $U$-maturity default-free bond is also traded, and we consider a vulnerable European call option with the terminal payoff

$$C_T = \mathbf{1}_{\{\tau > T\}} (B(T,U) - K)^+ = \mathbf{1}_{\{\tau > T\}} X.$$

We thus deal with a defaultable claim $(X, 0, 0, \tau)$ with the promised payoff $X = (B(T,U) - K)^+$. The same method can be applied to an arbitrary $\mathcal{F}_T$-measurable promised payoff $X = g(B(T,U))$, where a function $g : \mathbb{R} \to \mathbb{R}$ satisfies the usual technical assumptions.

We consider here the situation when one defaultable asset and two default-free assets are traded; we thus place ourselves within the framework of Corollary 3. Specifically, we take $Y^1_t = D^0(t,T)$, $Y^2_t = B(t,U)$ and $Y^3_t = B(t,T)$. Consider a strategy $\phi = (\phi^1, \phi^2, \phi^3)$ such that $V_t(\phi) = \phi^1_t D^0(t,T)$ and $\phi^2_t B(t,U) + \phi^3_t B(t,T) = 0$ for every $t \in [0,T]$. Observe that in view of the definition of $\Gamma(t,T)$ (see Section 2.3) we have

$$Y^{1,3}_t = D^0(t,T)(B(t,T))^{-1} = \Gamma(t,T).$$

Moreover, $Y^{2,3}_t = F(t,U,T)$ and $Y^{2,3,1}_t = F(t,U,T) e^{-\alpha^{2,3,1}_t}$, where we denote $F(t,U,T) = B(t,U)(B(t,T))^{-1}$ and, by virtue of formula (2),

$$\alpha^{2,3,1}_t = \langle \ln F(\cdot, U, T), \ln \Gamma(\cdot, T) \rangle_t = \int_0^t \big( b(u,U) - b(u,T) \big) \beta(u,T) \, du.$$

Therefore, the dynamics of $Y^{2,3,1}$ under $\mathbb{Q}_T$ are

$$dY^{2,3,1}_t = Y^{2,3,1}_t \Big( \big( b(t,T) - b(t,U) \big) \beta(t,T) \, dt + \big( b(t,U) - b(t,T) \big) \, dW^T_t \Big) = Y^{2,3,1}_t \big( b(t,U) - b(t,T) \big) \big( dW^T_t - \beta(t,T) \, dt \big).$$


Let $\mathbb{Q}$ be a probability measure such that $Y^{2,3,1}$ is a martingale under $\mathbb{Q}$. By virtue of Girsanov's theorem, it is clear that the process $W$, given by the formula

$$W_t = W^T_t - \int_0^t \beta(u,T) \, du, \quad \forall\, t \in [0,T],$$

is a Brownian motion under $\mathbb{Q}$. Thus, the process $F(t,U,T)$ satisfies under $\mathbb{Q}$

$$dF(t,U,T) = F(t,U,T) \big( b(t,U) - b(t,T) \big) \big( dW_t + \beta(t,T) \, dt \big). \tag{30}$$

Since $D^0(T,T) = 1$, equation (19) becomes

$$C_0 + \int_0^T \phi^{2,3,1}_u \, dY^{2,3,1}_u = X = (F(T,U,T) - K)^+. \tag{31}$$

By a simple extension of (21), for any $t \in [0,T]$ the pre-default value of the option equals

$$C_t = D^0(t,T) \, \mathbb{E}_{\mathbb{Q}} \big( (F(T,U,T) - K)^+ \,\big|\, \mathcal{F}_t \big), \tag{32}$$

provided that the integral in (31) is a $\mathbb{Q}$-martingale, rather than a $\mathbb{Q}$-local martingale. Let us denote

$$f(t) = \beta(t,T) \big( b(t,U) - b(t,T) \big), \quad \forall\, t \in [0,T], \tag{33}$$

and let us assume that $f$ is a deterministic function. Then we have the following result, which extends the valuation formula for a call option written on a default-free zero-coupon bond within the framework of the Gaussian HJM model.

Proposition 7. The pre-default price $C_t$ of a vulnerable call option written on a default-free zero-coupon bond equals

$$C_t = D^0(t,T) \Big( F(t,U,T) \, e^{\int_t^T f(u) \, du} \, N\big( h_+(t,U,T) \big) - K N\big( h_-(t,U,T) \big) \Big),$$

where

$$h_\pm(t,U,T) = \frac{\ln F(t,U,T) - \ln K + \int_t^T f(u) \, du \pm \frac{1}{2} v^2(t,T)}{v(t,T)}$$

and

$$v^2(t,T) = \int_t^T |b(u,U) - b(u,T)|^2 \, du.$$

The replicating strategy $\phi = (\phi^1, \phi^2, \phi^3)$ for the option satisfies

$$\phi^1_t = C_t (D^0(t,T))^{-1},$$

$$\phi^2_t = e^{\alpha^{2,3,1}_T - \alpha^{2,3,1}_t} \, \Gamma(t,T) \, N\big( h_+(t,U,T) \big),$$

$$\phi^3_t = -\phi^2_t F(t,U,T).$$


Proof. Considering the Itô differential $d(C_t / D^0(t,T))$, and identifying terms in expression (31), we obtain that the process $\phi^{2,3,1}$ in the integral representation (31) is given by the formula

$$\phi^{2,3,1}_t = e^{\int_0^T f(u) \, du} \, N\big( h_+(t,U,T) \big) = e^{\alpha^{2,3,1}_T} N\big( h_+(t,U,T) \big).$$

Consequently, the valuation formula presented in the proposition is a rather straightforward consequence of (30) and (32).

Remark. Although we consider here the bond $B(t,U)$ as the underlying asset, it is apparent that the method (and thus also the result) can be applied to a much wider class of underlying assets. For instance, a zero-coupon bond can be substituted with a non-dividend paying stock with the price $S$ (this case was examined in Jeanblanc and Rutkowski (2003)). A suitable modification of the formulae established in Proposition 7 can also be used for the valuation and hedging of vulnerable caplets, swaptions, and other vulnerable derivatives in lognormal market models of (non-defaultable) LIBORs and swap rates.

Case of a deterministic hazard process. Assume now that the F-hazard process Γ of τ is deterministic. Then $\beta(t,T) = 0$ for every $t \in [0,T]$, and thus $\alpha^{2,3,1}_t = 0$ and $Y^{2,3,1}_t = F(t,U,T)$ for every $t \in [0,T]$. We thus obtain the following result.

Corollary 6. Let the F-hazard process Γ and the volatility $b(t,U) - b(t,T)$ of the forward price $F(t,U,T)$ be deterministic. Then the pre-default price $\tilde C_t$ of a vulnerable option satisfies $\tilde C_t = \Gamma(t,T)\,C_t$, where $C_t$ is the price of an equivalent non-vulnerable option

$$C_t = B(t,U)\,N\big(h_+(t,U,T)\big) - K\,B(t,T)\,N\big(h_-(t,U,T)\big),$$

where

$$h_\pm(t,U,T) = \frac{\ln F(t,U,T) - \ln K \pm \frac{1}{2}v^2(t,T)}{v(t,T)}$$

and

$$v^2(t,T) = \int_t^T |b(u,U) - b(u,T)|^2\,du.$$

The replicating strategy $\phi = (\phi^1,\phi^2,\phi^3)$ is given by

$$\phi^1_t = \tilde C_t\,\big(\Gamma(t,T)B(t,T)\big)^{-1},$$

$$\phi^2_t = \Gamma(t,T)\,N\big(h_+(t,U,T)\big),$$

$$\phi^3_t = -\phi^2_t\,F(t,U,T).$$
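The two valuation formulas above can be sketched numerically. The following Python fragment is only an illustration under stated assumptions: the function and argument names are hypothetical, and the inputs (bond prices, the survival factor Γ(t, T), the integral of f over [t, T], and v²(t, T)) are passed in as plain numbers instead of being derived from a term-structure model.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x: float) -> float:
    """Standard normal distribution function N(x)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def vulnerable_call(d0_tT: float, fwd: float, strike: float,
                    int_f: float, var: float) -> float:
    """Pre-default price as in Proposition 7.

    d0_tT : pre-default price of the T-maturity defaultable bond (assumed input)
    fwd   : forward price F(t, U, T) = B(t, U) / B(t, T)
    int_f : integral of f over [t, T] (assumed deterministic)
    var   : v^2(t, T)
    """
    vol = sqrt(var)
    h_plus = (log(fwd / strike) + int_f + 0.5 * var) / vol
    h_minus = h_plus - vol
    return d0_tT * (fwd * exp(int_f) * norm_cdf(h_plus) - strike * norm_cdf(h_minus))


def vulnerable_call_det_hazard(gamma_tT: float, b_tU: float, b_tT: float,
                               strike: float, var: float) -> float:
    """Pre-default price as in Corollary 6 (deterministic hazard process)."""
    vol = sqrt(var)
    h_plus = (log(b_tU / b_tT / strike) + 0.5 * var) / vol
    h_minus = h_plus - vol
    return gamma_tT * (b_tU * norm_cdf(h_plus) - strike * b_tT * norm_cdf(h_minus))


if __name__ == "__main__":
    # Illustrative numbers only.
    b_tU, b_tT, gamma_tT, strike, var = 0.90, 0.95, 0.97, 0.93, 0.01
    general = vulnerable_call(gamma_tT * b_tT, b_tU / b_tT, strike, 0.0, var)
    special = vulnerable_call_det_hazard(gamma_tT, b_tU, b_tT, strike, var)
    print(general, special)
```

With a vanishing integral of f and with the pre-default bond price set to Γ(t, T)B(t, T), the first function should agree with the second, which mirrors the passage from the proposition to the corollary.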

Vulnerable Bonds

Let us consider the payoff of the form $\mathbb{1}_{\{\tau>T\}}$ which occurs at some date U > T. This payoff is, of course, equivalent to the payoff $B(T,U)\,\mathbb{1}_{\{\tau>T\}}$ at time T. We interpret this claim as a vulnerable bond; Vaillant (2001) proposes to term such a claim a delayed defaultable bond. Although vulnerable bonds are not traded, under suitable assumptions one can show that they can be replicated by other liquid assets. Indeed, to replicate this claim within the framework of this section, it suffices to assume that default-free bonds with maturities T and U, as well as the defaultable bond with maturity T, are among primary traded assets.

Specifically, we postulate that $\phi^2_t B(t,U) + \phi^3_t B(t,T) = 0$ for every $t \in [0,T]$, and thus the total wealth is invested in defaultable bonds of maturity T, so that $\phi^1_t \tilde D^0(t,T) = \tilde U_t(X)$ for every $t \in [0,T]$, where $X = B(T,U) = F(T,U,T)$. Let $\tilde D^0(t,T,U)$ stand for the pre-default value of a vulnerable bond at time t < T. Then formulae (31) and (32) become

$$\tilde D^0(0,T,U) + \int_0^T \phi^{2,3,1}_u\,dY^{2,3,1}_u = F(T,U,T)$$

and

$$\tilde D^0(t,T,U) = \tilde D^0(t,T)\,\mathbb{E}_{\mathbb{Q}}\big(F(T,U,T)\,\big|\,\mathcal{F}_t\big),$$

respectively. Using dynamics (30), we obtain

$$\tilde D^0(t,T,U) = \tilde D^0(t,T)\,F(t,U,T)\,e^{\int_t^T f(u)\,du} = \tilde D^0(t,T)\,F(t,U,T)\,e^{\alpha^{2,3,1}_T - \alpha^{2,3,1}_t} \tag{34}$$

provided that $\alpha^{2,3,1}$ is deterministic.

5.2 Credit Derivatives

The most widely traded credit derivatives are credit default swaps and swaptions, total rate of return swaps and credit linked notes. Furthermore, a large class of basket credit derivatives have a special feature of being linked to the default risk of several reference entities. We shall consider here only two examples: a credit default swap and a first-to-default contract. Before proceeding to the analysis of more complex contracts, we shall first examine a standard (non-vulnerable) option written on a defaultable asset.

Options on a Defaultable Asset

We shall now consider a non-vulnerable call option written on a defaultable bond with maturity date U and zero recovery. Let T be the expiration date and let K > 0 stand for the strike. Formally, we deal with the terminal payoff $C_T$ given by

$$C_T = \big(D^0(T,U) - K\big)^+.$$

To replicate this option, we postulate that defaultable bonds of maturities U and T are primary assets. Notice also that

$$C_T = \big(\mathbb{1}_{\{\tau>T\}}\tilde D^0(T,U) - K\big)^+ = \mathbb{1}_{\{\tau>T\}}\big(\tilde D^0(T,U) - K\big)^+ = \mathbb{1}_{\{\tau>T\}}X,$$

where $X = (\tilde D^0(T,U) - K)^+$, so that once again we deal with a defaultable claim of the form (X, 0, 0, τ). It should be stressed, however, that since the underlying asset is now defaultable, the valuation result will differ from Proposition 7.

We shall use two defaultable primary assets for replication. Specifically, we shall now apply Corollary 4, by choosing $Y^1_t = D^0(t,T)$ and $Y^2_t = D^0(t,U)$ as primary assets. As before, we denote by $\tilde C_t$ the pre-default value of the option under consideration. By virtue of Corollary 4, it suffices to show that there exists a process $\phi^2$ such that

$$\tilde C_0 + \int_0^T \phi^2_u\,dY^{2,1}_u = X = \big(\tilde D^0(T,U) - K\big)^+ = \big(Y^{2,1}_T - K\big)^+, \tag{35}$$

where $Y^{2,1}_t = \tilde D^0(t,U)\big(\tilde D^0(t,T)\big)^{-1}$. Then the trading strategy $\phi = (\phi^1,\phi^2)$, where

$$\phi^1_t = \big(\tilde C_t - \phi^2_t\,\tilde D^0(t,U)\big)\big(\tilde D^0(t,T)\big)^{-1},$$

is self-financing and it replicates the option. To derive the valuation formula, it suffices to find the probability measure Q such that the process $Y^{2,1}$ is a Q-martingale, and to use the generic representation

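$$\tilde C_t = \tilde D^0(t,T)\,\mathbb{E}_{\mathbb{Q}}\big((Y^{2,1}_T - K)^+\,\big|\,\mathcal{F}_t\big).$$

Notice that, with the notation $\tilde D^0(t,U) = \Gamma(t,U)B(t,U)$, the price process $D^0(t,U)$ admits the representation $D^0(t,U) = \mathbb{1}_{\{\tau>t\}}\tilde D^0(t,U)$. Assume that τ has a stochastic intensity γ. Then we have (see (3))

$$d\tilde D^0(t,U) = \tilde D^0(t,U)\Big(\big(r_t + \gamma_t + \beta(t,U)b(t,U)\big)\,dt + \big(\beta(t,U) + b(t,U)\big)\,dW^*_t\Big),$$

and the dynamics of $Y^{2,1}_t = \tilde D^0(t,U)\big(\tilde D^0(t,T)\big)^{-1}$ under $\mathbb{Q}^*$ are

$$dY^{2,1}_t = Y^{2,1}_t\Big(\big(r_t + \gamma_t + \beta(t,U)b(t,U)\big)\,dt + \big(\beta(t,U) + b(t,U) - b(t,T)\big)\big(dW^*_t - b(t,T)\,dt\big)\Big).$$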

As we said above, it suffices to find the probability measure Q such that the process $Y^{2,1}$ is a Q-martingale. By applying the standard Girsanov transformation, we can construct a measure Q so that we have

$$dY^{2,1}_t = Y^{2,1}_t\big(\beta(t,U) + b(t,U) - b(t,T)\big)\,dW_t,$$

where W is a Brownian motion under Q.


Proposition 8. Assume that $\beta(t,U) + b(t,U) - b(t,T)$, $t \in [0,T]$, is a deterministic function. Then the pre-default price $\tilde C_t$ of a call option written on a U-maturity defaultable bond equals

$$\tilde C_t = \tilde D^0(t,U)\,N\big(k_+(t,U,T)\big) - K\,\tilde D^0(t,T)\,N\big(k_-(t,U,T)\big),$$

where

$$k_\pm(t,U,T) = \frac{\ln \tilde D^0(t,U) - \ln \tilde D^0(t,T) - \ln K \pm \frac{1}{2}v^2(t,T)}{v(t,T)}$$

and

$$v^2(t,T) = \int_t^T |\beta(u,U) + b(u,U) - b(u,T)|^2\,du.$$

The replicating strategy $\phi = (\phi^1,\phi^2)$ for the option is given by

$$\phi^1_t = \big(\tilde C_t - \phi^2_t\,\tilde D^0(t,U)\big)\big(\tilde D^0(t,T)\big)^{-1}, \qquad \phi^2_t = N\big(k_+(t,U,T)\big).$$

Case of a deterministic hazard process. Assume that the F-hazard process Γ and the volatility b(t, U) − b(t, T), t ∈ [0, T], of the forward price F(t, U, T) are deterministic.

Corollary 7. The pre-default price $\tilde C_t$ of a call option written on a U-maturity defaultable bond equals

$$\tilde C_t = e^{-\int_t^U \gamma(u)\,du}\,B(t,U)\,N\big(k_+(t,U,T)\big) - K\,e^{-\int_t^T \gamma(u)\,du}\,B(t,T)\,N\big(k_-(t,U,T)\big),$$

where

$$k_\pm(t,U,T) = \frac{\ln B(t,U) - \ln B(t,T) - \ln K - \int_T^U \gamma(u)\,du \pm \frac{1}{2}v^2(t,T)}{v(t,T)}$$

and

$$v^2(t,T) = \int_t^T |b(u,U) - b(u,T)|^2\,du.$$

The replicating strategy $\phi = (\phi^1,\phi^2)$ for the option is given by

$$\phi^1_t = \big(\tilde C_t - \phi^2_t\,\tilde D^0(t,U)\big)\big(\tilde D^0(t,T)\big)^{-1}, \qquad \phi^2_t = N\big(k_+(t,U,T)\big).$$

Notice that this is exactly the same result as in the case of a call option written on a zero-coupon bond in a default-free term structure model with the interest rate $r_t$ substituted with the default-risk adjusted rate $r_t + \gamma(t)$.
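The relation between the general formula and its deterministic-intensity version can be checked numerically. The Python sketch below is only an illustration: it assumes a constant intensity γ (a special case of a deterministic one), takes the default-free bond prices and v²(t, T) as given numbers, and uses hypothetical function names.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x: float) -> float:
    """Standard normal distribution function N(x)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def call_on_defaultable_bond(d0_tU: float, d0_tT: float,
                             strike: float, var: float) -> float:
    """Pre-default price as in Proposition 8, given pre-default bond prices."""
    vol = sqrt(var)
    k_plus = (log(d0_tU) - log(d0_tT) - log(strike) + 0.5 * var) / vol
    k_minus = k_plus - vol
    return d0_tU * norm_cdf(k_plus) - strike * d0_tT * norm_cdf(k_minus)


def call_const_intensity(b_tU: float, b_tT: float, strike: float, var: float,
                         gamma: float, t: float, T: float, U: float) -> float:
    """Pre-default price as in Corollary 7 with a constant intensity gamma (assumption)."""
    vol = sqrt(var)
    k_plus = (log(b_tU) - log(b_tT) - log(strike)
              - gamma * (U - T) + 0.5 * var) / vol
    k_minus = k_plus - vol
    surv_U = exp(-gamma * (U - t))  # exp(-integral of gamma over [t, U])
    surv_T = exp(-gamma * (T - t))  # exp(-integral of gamma over [t, T])
    return surv_U * b_tU * norm_cdf(k_plus) - strike * surv_T * b_tT * norm_cdf(k_minus)


if __name__ == "__main__":
    # Illustrative numbers only.
    t, T, U, gamma = 0.0, 1.0, 2.0, 0.02
    b_tU, b_tT, strike, var = 0.90, 0.95, 0.92, 0.01
    direct = call_const_intensity(b_tU, b_tT, strike, var, gamma, t, T, U)
    via_prop8 = call_on_defaultable_bond(exp(-gamma * (U - t)) * b_tU,
                                         exp(-gamma * (T - t)) * b_tT, strike, var)
    print(direct, via_prop8)
```

Feeding the pre-default bond prices exp(−γ(U − t))B(t, U) and exp(−γ(T − t))B(t, T) into the first function should reproduce the second, which is the substitution of r by r + γ described in the text.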


Credit Default Swaps

A generic credit default swap (CDS, for short) is a derivative contract which allows one to transfer directly the credit risk of the reference entity from one party (the risk seller) to another party (the risk buyer). The contingent payment is triggered by the pre-specified default event, provided that it happens before the maturity date T. The standard version of a credit default swap stipulates that the contract is settled at default time τ of the reference entity, and the recovery payoff equals $Z_\tau = 1 - \delta B(\tau,T)$, where δ represents the recovery rate at default of a reference entity. It is usually assumed that 0 ≤ δ < 1 is non-random, and known in advance. This convention corresponds to the fractional recovery of Treasury value scheme for a defaultable bond issued by the reference entity. Otherwise, that is, in case of no default prior to or at T, the contract expires at time T worthless. The following alternative market conventions are encountered in practice:

• The buyer of the insurance pays a lump sum at inception, and the contract is termed a default option.

• The buyer of the insurance pays annuities κ at the predetermined dates $0 < T_1 < \cdots < T_{n-1} < T_n = T$ prior to τ, so that the contract represents a plain-vanilla default swap.

In the former case, the (pre-default) value $\tilde U_0(Z)$ at time 0 of the default option equals

$$\tilde U_0(Z) = \mathbb{E}_{\mathbb{Q}^*}\Big(B_\tau^{-1}\big(1 - \delta B(\tau,T)\big)\mathbb{1}_{\{\tau \le T\}}\Big). \tag{36}$$

In the latter case, the level of the annuity κ should be chosen in such a way that the value of the contract at time 0 equals zero. The annuity κ can thus be specified by solving the following equation

$$\tilde U_0(Z) = \kappa\,\mathbb{E}_{\mathbb{Q}^*}\Big(\sum_{i=1}^n B_{T_i}^{-1}\,\mathbb{1}_{\{\tau > T_i\}}\Big),$$

where the value $\tilde U_0(Z)$ is given by (36).
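To make the two expectations concrete, here is a minimal Python sketch under simplifying assumptions that are not made in the text: a constant short rate r (so that B_t = exp(rt) and B(τ, T) = exp(−r(T − τ))) and a constant default intensity γ under Q∗. In that case both expectations reduce to elementary integrals of the exponential density of τ. The function names and sample numbers are hypothetical.

```python
from math import exp


def default_option_value(r: float, gamma: float, delta: float, T: float) -> float:
    """Value (36) at time 0 for constant r and constant intensity gamma (assumptions).

    E[ exp(-r*tau) * (1 - delta*exp(-r*(T - tau))) ; tau <= T ]
      = gamma/(r+gamma) * (1 - exp(-(r+gamma)*T))
        - delta * exp(-r*T) * (1 - exp(-gamma*T)).
    """
    protection = gamma / (r + gamma) * (1.0 - exp(-(r + gamma) * T))
    recovery = delta * exp(-r * T) * (1.0 - exp(-gamma * T))
    return protection - recovery


def cds_annuity(r: float, gamma: float, delta: float, dates: list[float]) -> float:
    """Annuity kappa that makes the contract worth zero at time 0.

    dates are the payment dates T_1 < ... < T_n = T; each payment kappa at T_i
    is made only on {tau > T_i}, whose discounted value is exp(-(r+gamma)*T_i).
    """
    maturity = dates[-1]
    annuity_factor = sum(exp(-(r + gamma) * t_i) for t_i in dates)
    return default_option_value(r, gamma, delta, maturity) / annuity_factor


if __name__ == "__main__":
    # Illustrative inputs only.
    kappa = cds_annuity(r=0.03, gamma=0.02, delta=0.4, dates=[1.0, 2.0, 3.0, 4.0, 5.0])
    print(kappa)
```

For stochastic rates or intensities the closed forms above no longer apply, and the expectations would have to be evaluated within the chosen model.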

Digital credit default swap. The fixed leg of a CDS can be represented as the sequence of payoffs $c_i = \kappa\,\mathbb{1}_{\{\tau>T_i\}}$ at the dates $T_i$ for i = 1, . . . , n. The fixed leg of a CDS can thus be seen as a portfolio of defaultable zero-coupon bonds with zero recovery, and thus the valuation of the fixed leg is rather straightforward. To simplify the valuation of the floating leg, we shall consider a digital CDS. Specifically, we postulate that the constant payoff δ is received at time $T_{i+1}$ if default occurs between $T_i$ and $T_{i+1}$. Therefore, the floating leg is represented by the following sequence of payoffs:

$$d_i = \delta\,\mathbb{1}_{\{T_i<\tau\le T_{i+1}\}} = \delta\,\mathbb{1}_{\{\tau\le T_{i+1}\}} - \delta\,\mathbb{1}_{\{\tau\le T_i\}}$$

at the dates $T_{i+1}$ for i = 1, . . . , n − 1. Clearly

$$d_i = \delta\big(1 - \mathbb{1}_{\{\tau>T_{i+1}\}}\big) - \delta\big(1 - \mathbb{1}_{\{\tau>T_i\}}\big).$$

We conclude that in order to analyze the floating leg of a digital CDS, it suffices to focus on the valuation and replication of the payoff $\mathbb{1}_{\{\tau>T_i\}}$ that occurs at time $T_{i+1}$, that is, a vulnerable bond. The latter problem was already examined in Section 5.1 (see, in particular, the valuation formula (34)).

First-to-Default Claims

We shall now focus on the issue of modeling dependent (“correlated”) defaults, which arises in the context of basket credit derivatives. In order to model dependent default times, we shall employ Kusuoka’s (1999) setting with n = 2 default times (for related results, see Jarrow and Yu (2001), Gregory and Laurent (2002, 2003), Bielecki and Rutkowski (2003), or Collin-Dufresne et al. (2003)). Our main goal is to show that the jump risk of a first-to-default claim can be perfectly hedged using the underlying defaultable zero-coupon bonds. Recovery schemes and the associated values of (deterministic) recovery rates should be specified a priori.

Construction of dependent defaults. Following Kusuoka (1999), we postulate that under the original probability Q the random times $\tau_i$, i = 1, 2, given on a probability space (Ω, G, Q), are mutually independent random variables with exponential laws with parameters $\lambda_1$ and $\lambda_2$, respectively. Let F be some reference filtration (generated by a Wiener process W, say) such that $\tau_1$ and $\tau_2$ are independent of F under Q. We write $\mathbb{H}^i$ to denote the filtration generated by the process $H^i_t = \mathbb{1}_{\{\tau_i\le t\}}$ for i = 1, 2, and we set $\mathbb{G} = \mathbb{F} \vee \mathbb{H}^1 \vee \mathbb{H}^2$. Notice that the process

$$M^i_t = H^i_t - \int_0^{t\wedge\tau_i} \lambda_i\,du = H^i_t - \lambda_i(\tau_i \wedge t)$$

is a G-martingale for i = 1, 2.

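For a fixed T > 0, we define a probability measure $\mathbb{Q}^*$ on $(\Omega, \mathcal{G}_T)$ by setting

$$\frac{d\mathbb{Q}^*}{d\mathbb{Q}} = \eta_T, \quad \mathbb{Q}\text{-a.s.},$$

where the Radon-Nikodym density process $\eta_t$, $t \in [0,T]$, satisfies

$$\eta_t = 1 + \sum_{i=1}^2 \int_{(0,t]} \eta_{u-}\,\kappa^i_u\,dM^i_u$$

with auxiliary processes $\kappa^1, \kappa^2$ given by

$$\kappa^1_t = \mathbb{1}_{\{\tau_2<t\}}\Big(\frac{\alpha_1}{\lambda_1} - 1\Big), \qquad \kappa^2_t = \mathbb{1}_{\{\tau_1<t\}}\Big(\frac{\alpha_2}{\lambda_2} - 1\Big).$$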

Let B(t, T) be the price of a zero-coupon bond, and let $\mathbb{Q}_T$ be the forward martingale measure for the date T. It appears that the ‘martingale intensities’ under $\mathbb{Q}^*$ and under $\mathbb{Q}_T$ are

$$\lambda^1_t = \lambda_1\mathbb{1}_{\{\tau_2>t\}} + \alpha_1\mathbb{1}_{\{\tau_2\le t\}}, \qquad \lambda^2_t = \lambda_2\mathbb{1}_{\{\tau_1>t\}} + \alpha_2\mathbb{1}_{\{\tau_1\le t\}}.$$

Specifically, the process $H^i_t - \int_0^{t\wedge\tau_i}\lambda^i_u\,du$ is a G-martingale under $\mathbb{Q}^*$ and under $\mathbb{Q}_T$ for i = 1, 2. Moreover, it is easily seen that the random times $\tau_1$ and $\tau_2$ are independent of the filtration F under $\mathbb{Q}^*$ and $\mathbb{Q}_T$. The following result shows that intensities $\lambda^1$ and $\lambda^2$ can be interpreted as local intensities of default with respect to the information available at time t. Therefore, the model can be reformulated as a two-dimensional Markov chain.

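Proposition 9. For i = 1, 2 and every $t \in [0,T]$ we have

$$\lambda_i = \lim_{h\downarrow 0} h^{-1}\,\mathbb{Q}_T\{t < \tau_i \le t+h \mid \mathcal{F}_t,\ \tau_1 > t,\ \tau_2 > t\}.$$

Moreover

$$\alpha_1 = \lim_{h\downarrow 0} h^{-1}\,\mathbb{Q}_T\{t < \tau_1 \le t+h \mid \mathcal{F}_t,\ \tau_1 > t,\ \tau_2 \le t\}$$

and

$$\alpha_2 = \lim_{h\downarrow 0} h^{-1}\,\mathbb{Q}_T\{t < \tau_2 \le t+h \mid \mathcal{F}_t,\ \tau_2 > t,\ \tau_1 \le t\}.$$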

Assume that defaultable zero-coupon bonds are subject to the zero recovery rule. Then the price of the bond issued by the ith entity is given by

$$D^0_i(t,T) = B(t,T)\,\mathbb{Q}_T\{\tau_i > T \mid \mathcal{G}_t\} = \mathbb{1}_{\{\tau_i>t\}}\tilde D^0_i(t,T),$$

where, as usual, $\tilde D^0_i(t,T)$ stands for the pre-default value of the bond. Let us denote $\lambda = \lambda_1 + \lambda_2$ and let us assume that $\lambda - \alpha_1 \ne 0$. Then straightforward calculations lead to an explicit formula for $\tilde D^0_1(t,T)$ (for details, see Bielecki and Rutkowski (2003)). Of course, an analogous expression holds for the pre-default price $\tilde D^0_2(t,T)$ provided that $\lambda - \alpha_2 \ne 0$.

Proposition 10. Assume that $\lambda - \alpha_1 \ne 0$. Then for every $t \in [0,T]$ the pre-default price $\tilde D^0_1(t,T)$ equals

$$\tilde D^0_1(t,T) = \mathbb{1}_{\{\tau_2>t\}}\,D^*_1(t,T) + \mathbb{1}_{\{\tau_2\le t\}}\,D_1(t,T),$$

where

$$D^*_1(t,T) = \frac{B(t,T)}{\lambda - \alpha_1}\Big(\lambda_2 e^{-\alpha_1(T-t)} + (\lambda_1 - \alpha_1)e^{-\lambda(T-t)}\Big)$$

represents the value of the bond prior to the first default, that is, on the random interval $[[0, \tau_{(1)} \wedge T[[$, and $D_1(t,T) = B(t,T)e^{-\alpha_1(T-t)}$ is the value of the bond after the default of the second entity, but prior to default of the issuer, that is, on $[[\tau_2 \wedge T, \tau_1 \wedge T[[$.
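The closed form for the value prior to the first default can be cross-checked numerically. The sketch below rests on an interpretation that is an assumption of this example rather than a statement from the text: name 1 survives until T either because no default occurs on (t, T], or because name 2 defaults first at some time u and name 1 then survives on (u, T] with intensity α1. The function names and parameter values are hypothetical.

```python
from math import exp


def pre_first_default_ratio(lam1: float, lam2: float, alpha1: float, s: float) -> float:
    """D*_1(t, T) / B(t, T) from Proposition 10, with s = T - t.

    Requires lam1 + lam2 != alpha1.
    """
    lam = lam1 + lam2
    return (lam2 * exp(-alpha1 * s) + (lam1 - alpha1) * exp(-lam * s)) / (lam - alpha1)


def survival_by_quadrature(lam1: float, lam2: float, alpha1: float,
                           s: float, n: int = 100_000) -> float:
    """Survival probability of name 1 over a horizon s, by the midpoint rule.

    exp(-lam*s)                               : no default at all on the horizon
    lam2*exp(-lam*u) * exp(-alpha1*(s - u))   : name 2 defaults first at u,
                                                name 1 then survives at rate alpha1
    """
    lam = lam1 + lam2
    h = s / n
    integral = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        integral += lam2 * exp(-lam * u) * exp(-alpha1 * (s - u))
    return exp(-lam * s) + integral * h


if __name__ == "__main__":
    # Illustrative intensities only.
    print(pre_first_default_ratio(0.02, 0.03, 0.04, 5.0))
    print(survival_by_quadrature(0.02, 0.03, 0.04, 5.0))
```

The two printed numbers should agree up to quadrature error, and setting α1 = λ1 (no contagion) should collapse the ratio to exp(−λ1 s).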


Let $\tau_{(1)} = \tau_1 \wedge \tau_2$ be the date of the first default. Consider a first-to-default claim with the terminal payoff $X\mathbb{1}_{\{\tau_{(1)}>T\}}$, where X is an $\mathcal{F}_T$-measurable random variable, and F-predictable recovery processes $Z^1$ and $Z^2$. As primary traded assets, we take defaultable zero-coupon bonds $D^0_1(t,T)$ and $D^0_2(t,T)$ with respective default times $\tau_1$ and $\tau_2$, as well as the default-free zero-coupon bond B(t, T).

In Section 4.4, we have examined the basic features of a replicating strategy for a first-to-default claim. Under the present assumptions, (29) yields

$$\phi^1_t B(t,T)e^{-\alpha_1(T-t)} + \phi^3_t B(t,T) = Z^2_t$$

and

$$\phi^2_t B(t,T)e^{-\alpha_2(T-t)} + \phi^3_t B(t,T) = Z^1_t.$$

A strategy φ should be self-financing prior to the first default (and thus also on the random interval $[[0, \tau_{(1)} \wedge T]]$). In other words, we are looking for φ such that the pre-default wealth process V(φ), given by the formula

$$V_t(\phi) = \phi^1_t D^*_1(t,T) + \phi^2_t D^*_2(t,T) + \phi^3_t B(t,T), \quad \forall\, t \in [0,T],$$

satisfies

$$dV_t(\phi) = \phi^1_t\,dD^*_1(t,T) + \phi^2_t\,dD^*_2(t,T) + \phi^3_t\,dB(t,T). \tag{37}$$

Finally, at time T the wealth of φ should coincide with the promised payoff X on the event $\{\tau_{(1)} > T\}$. This means that the pre-default wealth needs to satisfy $V_T(\phi) = X$, so that (37) becomes

$$V_0(\phi) + \int_0^T \phi^1_t\,dD^*_1(t,T) + \int_0^T \phi^2_t\,dD^*_2(t,T) + \int_0^T \phi^3_t\,dB(t,T) = X.$$

Equivalently, the pre-default wealth should coincide with the pre-default value of a first-to-default claim on the random interval $[[0, \tau_{(1)} \wedge T[[$, and the jump of the wealth at default time $\tau_{(1)}$ should adequately reproduce the behavior at $\tau_{(1)}$ of a first-to-default claim.

First-to-default credit swap. For the sake of concreteness, let us consider a first-to-default credit swap. Specifically, we shall examine replication of a first-to-default claim with X = 0 and $Z^i_t = \delta B(t,T)$ for i = 1, 2, where 0 ≤ δ ≤ 1. Let $U_t$ be the value of this claim at time $t \in [0,T]$. It can be shown that

$$\mathbb{Q}_T\{\tau_{(1)} > T \mid \mathcal{G}_t\} = \mathbb{1}_{\{\tau_{(1)}>t\}}\,e^{-\lambda(T-t)}.$$

Consequently, for every $t \in [0,T]$ we have

$$U_t = \mathbb{1}_{\{\tau_{(1)}>t\}}\,\delta\big(1 - e^{-\lambda(T-t)}\big)B(t,T) + \mathbb{1}_{\{\tau_{(1)}\le t\}}\,\delta B(t,T),$$

and thus the pre-default value equals

$$\tilde U_t = \delta\big(1 - e^{-\lambda(T-t)}\big)B(t,T).$$


To find the replicating strategy φ, we first observe that φ needs to satisfy, for every $t \in [0,T]$,

$$\phi^1_t e^{-\alpha_1(T-t)} + \phi^3_t = \delta, \qquad \phi^2_t e^{-\alpha_2(T-t)} + \phi^3_t = \delta. \tag{38}$$

Moreover, the pre-default wealth process V(φ), given by

$$V_t(\phi) = \phi^1_t D^*_1(t,T) + \phi^2_t D^*_2(t,T) + \phi^3_t B(t,T), \tag{39}$$

should satisfy $V_t(\phi) = \tilde U_t$ and

$$dV_t(\phi) = \phi^1_t\,dD^*_1(t,T) + \phi^2_t\,dD^*_2(t,T) + \phi^3_t\,dB(t,T). \tag{40}$$

It is convenient to work with relative prices, by taking B(t, T) as a numeraire, so that (39)-(40) become

$$V^B_t(\phi) = \phi^1_t Y^1_t + \phi^2_t Y^2_t + \phi^3_t = \delta\big(1 - e^{-\lambda(T-t)}\big) \tag{41}$$

and

$$V^B_t(\phi) = V^B_0(\phi) + \int_0^t \phi^1_u\,dY^1_u + \int_0^t \phi^2_u\,dY^2_u, \tag{42}$$

where $V^B_t(\phi) = V_t(\phi)B^{-1}(t,T)$ and

$$Y^1_t = \frac{D^*_1(t,T)}{B(t,T)} = \frac{1}{\lambda - \alpha_1}\Big(\lambda_2 e^{-\alpha_1(T-t)} + (\lambda_1 - \alpha_1)e^{-\lambda(T-t)}\Big)$$

and

$$Y^2_t = \frac{D^*_2(t,T)}{B(t,T)} = \frac{1}{\lambda - \alpha_2}\Big(\lambda_1 e^{-\alpha_2(T-t)} + (\lambda_2 - \alpha_2)e^{-\lambda(T-t)}\Big).$$

Working with relative values is here equivalent to setting B(t, T) = 1 for every $t \in [0,T]$ in equations (39)-(40), as well as in the pricing formulae of Proposition 10.

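From (38) it follows that $\phi^3$ equals

$$\phi^3_t = \delta - \phi^1_t e^{-\alpha_1(T-t)} = \delta - \phi^2_t e^{-\alpha_2(T-t)}, \tag{43}$$

where $\phi^1$ and $\phi^2$ are related to each other through the formula

$$\phi^2_t = \phi^1_t\,e^{(\alpha_2-\alpha_1)(T-t)}, \quad \forall\, t \in [0,T]. \tag{44}$$

By substituting the last equality in (41), we obtain the following expression for $\phi^1$

$$\phi^1_t = -\delta e^{-\lambda(T-t)}\Big(Y^1_t + Y^2_t e^{(\alpha_2-\alpha_1)(T-t)} - e^{-\alpha_1(T-t)}\Big)^{-1}.$$

More explicitly,

$$\phi^1_t = -\delta\,\xi_1\xi_2\,e^{-\xi_1(T-t)}\big(g(t)\big)^{-1}, \tag{45}$$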


where we denote $\xi_i = \lambda - \alpha_i$ for i = 1, 2 and where g(t) equals

$$g(t) = \lambda_2\xi_2 + (\lambda_1 - \alpha_1)\xi_2 e^{-\xi_1(T-t)} + \lambda_1\xi_1 + (\lambda_2 - \alpha_2)\xi_1 e^{-\xi_2(T-t)} - \xi_1\xi_2.$$

To determine $\phi^2$ we may either use (44) with (45), or observe that, by the symmetry of the problem,

$$\phi^2_t = -\delta e^{-\lambda(T-t)}\Big(Y^2_t + Y^1_t e^{(\alpha_1-\alpha_2)(T-t)} - e^{-\alpha_2(T-t)}\Big)^{-1}.$$

Of course, both methods yield, as expected, the same expression for $\phi^2$, namely,

$$\phi^2_t = -\delta\,\xi_1\xi_2\,e^{-\xi_2(T-t)}\big(g(t)\big)^{-1}.$$

Moreover, straightforward calculations show that for $\phi^1, \phi^2$ as above, we have

$$\phi^1_t\,dY^1_t + \phi^2_t\,dY^2_t = dV^B_t(\phi) = -\delta\lambda e^{-\lambda(T-t)}\,dt.$$

Finally, the component $\phi^3$ can be found from (43), and thus the calculation of a replicating strategy for the considered example of a first-to-default credit swap is completed.
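The algebra behind the replicating strategy can be verified numerically. The Python sketch below is an illustration with hypothetical names and parameter values: it evaluates the hedge ratios in relative prices (numeraire B(t, T)), checks the wealth identity (41), and approximates the time derivatives of Y¹ and Y² by central finite differences in order to test the self-financing condition.

```python
from math import exp


def relative_prices(lam1, lam2, a1, a2, s):
    """Relative pre-default bond prices Y^1, Y^2 as functions of s = T - t."""
    lam = lam1 + lam2
    y1 = (lam2 * exp(-a1 * s) + (lam1 - a1) * exp(-lam * s)) / (lam - a1)
    y2 = (lam1 * exp(-a2 * s) + (lam2 - a2) * exp(-lam * s)) / (lam - a2)
    return y1, y2


def hedge_ratios(lam1, lam2, a1, a2, delta, s):
    """Components (phi^1, phi^2, phi^3) from (45), its analogue for phi^2, and (43)."""
    lam = lam1 + lam2
    xi1, xi2 = lam - a1, lam - a2
    g = (lam2 * xi2 + (lam1 - a1) * xi2 * exp(-xi1 * s)
         + lam1 * xi1 + (lam2 - a2) * xi1 * exp(-xi2 * s) - xi1 * xi2)
    phi1 = -delta * xi1 * xi2 * exp(-xi1 * s) / g
    phi2 = -delta * xi1 * xi2 * exp(-xi2 * s) / g
    phi3 = delta - phi1 * exp(-a1 * s)
    return phi1, phi2, phi3


def self_financing_gap(lam1, lam2, a1, a2, delta, s, h=1e-5):
    """phi^1 dY^1/dt + phi^2 dY^2/dt minus the drift -delta*lam*exp(-lam*s).

    Moving forward in calendar time t decreases s = T - t, hence the
    finite difference is taken between s - h and s + h.
    """
    lam = lam1 + lam2
    phi1, phi2, _ = hedge_ratios(lam1, lam2, a1, a2, delta, s)
    y1_fwd, y2_fwd = relative_prices(lam1, lam2, a1, a2, s - h)
    y1_bwd, y2_bwd = relative_prices(lam1, lam2, a1, a2, s + h)
    drift = phi1 * (y1_fwd - y1_bwd) / (2 * h) + phi2 * (y2_fwd - y2_bwd) / (2 * h)
    return drift + delta * lam * exp(-lam * s)


if __name__ == "__main__":
    # Illustrative parameters only.
    params = (0.02, 0.03, 0.03, 0.04)
    delta, s = 0.6, 2.0
    phi1, phi2, phi3 = hedge_ratios(*params, delta, s)
    y1, y2 = relative_prices(*params, s)
    print("wealth (41):", phi1 * y1 + phi2 * y2 + phi3)
    print("target     :", delta * (1 - exp(-sum(params[:2]) * s)))
    print("self-financing gap:", self_financing_gap(*params, delta, s))
```

For these inputs the bond positions come out negative: the swap pays at the first default, so the hedge is short the defaultable bonds and long the default-free one.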

6 PDE Approach

Let us assume that two (defaultable, in general) assets are tradeable, with respective price processes

$$dY^1_t = Y^1_{t-}\big(\nu_1\,dt + \sigma_1\,dW_t + \varrho_1\,dM_t\big), \quad Y^1_0 > 0, \tag{46}$$

$$dY^2_t = Y^2_{t-}\big(\nu_2\,dt + \sigma_2\,dW_t + \varrho_2\,dM_t\big), \quad Y^2_0 > 0, \tag{47}$$

under the real-world probability Q, where W is a one-dimensional standard Brownian motion and the G-martingale M is given by

$$M_t = H_t - \int_0^t \mathbb{1}_{\{\tau>u\}}\,\varsigma_u\,du, \quad \forall\, t \in [0,T],$$

and the F-adapted intensity ς of the default time τ is strictly positive. We postulate that the interest rate is equal to a constant r, so that the money market account equals $Y^3_t = B_t = e^{rt}$. We assume that $\sigma_1 \ne 0$, $\sigma_2 \ne 0$ and the constants $\varrho_1$ and $\varrho_2$ are greater than or equal to −1, so that the price process $Y^i$ is non-negative for i = 1, 2.

Remark. It may happen that either $\varrho_1$ or $\varrho_2$ equals 0, and thus the corresponding asset is default-free. The case when $\varrho_1 = \varrho_2 = 0$ will be excluded, however (see condition (48) below).

We shall now examine the no-arbitrage property of this market. Specifically, we shall impose additional conditions on the model’s coefficients that will ensure the existence of an equivalent martingale measure. From Kusuoka’s (1999) representation theorem, any equivalent martingale measure $\mathbb{Q}^*$ on $(\Omega, \mathcal{G}_T)$ is of the form $d\mathbb{Q}^*|_{\mathcal{G}_t} = \eta_t\,d\mathbb{Q}|_{\mathcal{G}_t}$ for $t \in [0,T]$, where

$$d\eta_t = \eta_{t-}\big(\psi_t\,dW_t + \kappa_t\,dM_t\big), \quad \eta_0 = 1,$$

for some G-predictable processes ψ and κ. By applying Ito’s formula, we obtain for i = 1, 2,

$$Y^i_t\eta_t e^{-rt} = Y^i_0 + \int_0^t Y^i_u\eta_u e^{-ru}\big(\nu_i - r + \psi_u\sigma_i + \kappa_u\varrho_i\xi_u\big)\,du + \text{martingale},$$

where we denote $\xi_t = \varsigma_t\mathbb{1}_{\{\tau>t\}}$. Hence, the process $Y^i_t\eta_t e^{-rt}$ is a (local) G-martingale under Q for i = 1, 2 if and only if

$$\nu_i - r + \psi_t\sigma_i + \kappa_t\varrho_i\xi_t = 0$$

for i = 1, 2 and almost every $t \in [0,T]$. Hence, a density process η determines an equivalent martingale measure $\mathbb{Q}^*$ for the processes $Y^i_t e^{-rt}$, i = 1, 2, if and only if the processes ψ and κ are such that for every $t \in [0,T]$

$$\nu_1 - r + \psi_t\sigma_1 + \kappa_t\varrho_1\xi_t = 0, \qquad \nu_2 - r + \psi_t\sigma_2 + \kappa_t\varrho_2\xi_t = 0.$$

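Assume that $\varrho_1\sigma_2 - \varrho_2\sigma_1 \ne 0$. Then the unique solution to the two linear equations $\nu_i - r + \psi_t\sigma_i + \kappa_t\varrho_i\xi_t = 0$, i = 1, 2, is the pair of processes $(\psi_t, \kappa_t)$, $t \in [0,T]$, such that

$$\psi_t = \frac{(\nu_1 - r)\varrho_2 - (\nu_2 - r)\varrho_1}{\varrho_1\sigma_2 - \varrho_2\sigma_1}$$

and

$$\kappa_t\xi_t = \frac{(\nu_2 - r)\sigma_1 - (\nu_1 - r)\sigma_2}{\varrho_1\sigma_2 - \varrho_2\sigma_1}.$$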

Since η is a strictly positive process, we restrict our attention to parameters such that the process κ is greater than −1. Obviously, the value of the process κ after the default time τ is irrelevant. However, the pre-default value of κ is uniquely given as

$$\kappa_t = \frac{(\nu_2 - r)\sigma_1 - (\nu_1 - r)\sigma_2}{\varsigma_t(\varrho_1\sigma_2 - \varrho_2\sigma_1)},$$

and thus we postulate that the last formula holds for every $t \in [0,T]$. Let us set $\gamma_t = \varsigma_t(1 + \kappa_t)$. We thus have the following auxiliary result.

Lemma 8. Assume that $\varrho_1\sigma_2 - \varrho_2\sigma_1 \ne 0$ and

$$\frac{(\nu_2 - r)\sigma_1 - (\nu_1 - r)\sigma_2}{\varsigma_t(\varrho_1\sigma_2 - \varrho_2\sigma_1)} > -1, \quad \forall\, t \in [0,T]. \tag{48}$$

Then the market model defined by (46)-(47) and the money market account $Y^3_t = e^{rt}$ is complete and arbitrage-free. Moreover, under the unique equivalent martingale measure $\mathbb{Q}^*$ we have

$$\begin{aligned}
dY^1_t &= Y^1_{t-}\big(r\,dt + \sigma_1\,dW^*_t + \varrho_1\,dM^*_t\big),\\
dY^2_t &= Y^2_{t-}\big(r\,dt + \sigma_2\,dW^*_t + \varrho_2\,dM^*_t\big),\\
dY^3_t &= rY^3_t\,dt,
\end{aligned} \tag{49}$$

where $W^*$ is a Brownian motion under $\mathbb{Q}^*$, and where the process $M^*$, given by

$$M^*_t = M_t - \int_0^t \xi_u\kappa_u\,du = H_t - \int_0^t \mathbb{1}_{\{\tau>u\}}\gamma_u\,du,$$

follows a martingale under $\mathbb{Q}^*$.
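The change of measure can be illustrated with a short computation. The Python sketch below solves the two martingale conditions directly by Cramer's rule and then forms the risk-neutral intensity. It assumes constant coefficients, including a constant real-world intensity; the names `jump1` and `jump2` are hypothetical labels for the constant jump coefficients multiplying dM_t in (46)-(47), and the sample numbers are illustrative.

```python
def girsanov_kernels(nu1, nu2, sigma1, sigma2, jump1, jump2, r, varsigma):
    """Solve, before default (xi = varsigma), the linear system

        nu_i - r + psi * sigma_i + (kappa * varsigma) * jump_i = 0,  i = 1, 2,

    and return (psi, kappa, gamma) with gamma = varsigma * (1 + kappa).
    """
    det = sigma1 * jump2 - sigma2 * jump1
    if det == 0.0:
        raise ValueError("degenerate market: sigma1*jump2 - sigma2*jump1 = 0")
    # Cramer's rule with right-hand sides -(nu_i - r).
    psi = (-(nu1 - r) * jump2 + (nu2 - r) * jump1) / det
    kappa_xi = (-(nu2 - r) * sigma1 + (nu1 - r) * sigma2) / det
    kappa = kappa_xi / varsigma
    if kappa <= -1.0:
        raise ValueError("kappa <= -1: the density process would not stay positive")
    gamma = varsigma * (1.0 + kappa)
    return psi, kappa, gamma


if __name__ == "__main__":
    # Asset 1 is default-free (jump1 = 0), asset 2 loses all value at default (jump2 = -1).
    psi, kappa, gamma = girsanov_kernels(
        nu1=0.05, nu2=0.08, sigma1=0.2, sigma2=0.3,
        jump1=0.0, jump2=-1.0, r=0.03, varsigma=0.04,
    )
    print(psi, kappa, gamma)
```

In this example the first equation alone pins down psi = −(nu1 − r)/sigma1, the usual market price of diffusion risk with a minus sign, and the risk-neutral intensity gamma exceeds the real-world intensity because asset 2 earns a premium for bearing default risk.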

From now on, we shall conduct the analysis of the model given by (49) under the martingale measure Q∗.

6.1 Markovian Case

To proceed further it would be convenient to assume that ς, and thus also κ, are deterministic functions of the time parameter. In this case, the default intensity γ under $\mathbb{Q}^*$ would be a deterministic function as well. More generally, it suffices to postulate that the F-intensity of default under $\mathbb{Q}^*$ is of the form $\gamma_t = \gamma(t, Y^1_t, Y^2_t)$ for some sufficiently smooth function γ. For instance, γ(t, x, y) may be assumed to be piecewise continuous with respect to t and Lipschitz continuous with respect to x and y. Under this assumption, the process $(Y^1, Y^2, H)$, where the two-dimensional process $(Y^1, Y^2)$ is the unique solution to the SDE (49), is Markovian under $\mathbb{Q}^*$ (since $Y^3$ is deterministic, it is not essential here).

For the sake of concreteness, we shall frequently focus on a defaultable claim represented by the following payoff at the maturity date T

$$Y = \mathbb{1}_{\{\tau>T\}}\,g(Y^1_T, Y^2_T) + \mathbb{1}_{\{\tau\le T\}}\,h(Y^1_T, Y^2_T) \tag{50}$$

for some functions $g, h : \mathbb{R}^2_+ \to \mathbb{R}$ satisfying suitable integrability conditions. Hence, the price of Y is given by the risk-neutral valuation formula

$$\pi_t(Y) = B_t\,\mathbb{E}_{\mathbb{Q}^*}\big(B_T^{-1}Y \mid \mathcal{G}_t\big), \quad \forall\, t \in [0,T]. \tag{51}$$

Notice that $\pi_t(Y)$ represents the standard (cum-dividend) price of a European contingent claim Y, which settles at time T. Our goal is to find a quasi-explicit representation for a self-financing trading strategy ψ such that $\pi_t(Y) = V_t(\psi)$ for every $t \in [0,T]$, where $V_t(\psi) = \sum_{i=1}^3 \psi^i_t Y^i_t$ (see Section 6.3).

We shall first prove an auxiliary result, which shows that the arbitrage price of the claim Y splits in a natural way into the pre-default price and the post-default price.

Lemma 9. The price $\pi_t(Y)$ of the claim Y given by (50) satisfies

$$\pi_t(Y) = (1 - H_t)\,\tilde v(t, Y^1_t, Y^2_t) + H_t\,\bar v(t, Y^1_t, Y^2_t), \quad \forall\, t \in [0,T], \tag{52}$$

for some functions $\tilde v, \bar v : [0,T] \times \mathbb{R}^2_+ \to \mathbb{R}$ such that $\tilde v(T,x,y) = g(x,y)$ and $\bar v(T,x,y) = h(x,y)$.

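Proof. We have

$$\begin{aligned}
\pi_t(Y) &= B_t\,\mathbb{E}_{\mathbb{Q}^*}\big(B_T^{-1}Y \,\big|\, \mathcal{G}_t\big)\\
&= B_t\,\mathbb{E}_{\mathbb{Q}^*}\big(B_T^{-1}\mathbb{1}_{\{\tau>T\}}\,g(Y^1_T,Y^2_T) \,\big|\, \mathcal{G}_t\big) + B_t\,\mathbb{E}_{\mathbb{Q}^*}\big(B_T^{-1}\mathbb{1}_{\{\tau\le T\}}\,h(Y^1_T,Y^2_T) \,\big|\, \mathcal{G}_t\big)\\
&= \mathbb{1}_{\{\tau>t\}}\,B_t\,\mathbb{E}_{\mathbb{Q}^*}\big(\mathbb{1}_{\{\tau>T\}}B_T^{-1}g(Y^1_T,Y^2_T) + \mathbb{1}_{\{t<\tau\le T\}}B_T^{-1}h(Y^1_T,Y^2_T) \,\big|\, \mathcal{G}_t\big)\\
&\quad + \mathbb{1}_{\{\tau\le t\}}\,B_t\,\mathbb{E}_{\mathbb{Q}^*}\big(\mathbb{1}_{\{\tau\le t\}}B_T^{-1}h(Y^1_T,Y^2_T) \,\big|\, \mathcal{G}_t\big).
\end{aligned}$$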

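This shows that

$$\pi_t(Y) = \mathbb{1}_{\{\tau>t\}}\,\tilde u(t, Y^1_t, Y^2_t, 0) + \mathbb{1}_{\{\tau\le t\}}\,\bar u(t, Y^1_t, Y^2_t, 1),$$

where

$$\begin{aligned}
\tilde u(t, Y^1_t, Y^2_t, H_t) &= B_t\,\mathbb{E}_{\mathbb{Q}^*}\big(\mathbb{1}_{\{\tau>T\}}B_T^{-1}g(Y^1_T,Y^2_T) \,\big|\, \mathcal{G}_t\big) + B_t\,\mathbb{E}_{\mathbb{Q}^*}\big(\mathbb{1}_{\{t<\tau\le T\}}B_T^{-1}h(Y^1_T,Y^2_T) \,\big|\, \mathcal{G}_t\big)\\
&= B_t\,\mathbb{E}_{\mathbb{Q}^*}\big((1 - H_T)B_T^{-1}g(Y^1_T,Y^2_T) + (H_T - H_t)B_T^{-1}h(Y^1_T,Y^2_T) \,\big|\, Y^1_t, Y^2_t, H_t\big)
\end{aligned}$$

and

$$\begin{aligned}
\bar u(t, Y^1_t, Y^2_t, H_t) &= B_t\,\mathbb{E}_{\mathbb{Q}^*}\big(\mathbb{1}_{\{\tau\le t\}}B_T^{-1}h(Y^1_T,Y^2_T) \,\big|\, \mathcal{G}_t\big)\\
&= B_t\,\mathbb{E}_{\mathbb{Q}^*}\big(H_t B_T^{-1}h(Y^1_T,Y^2_T) \,\big|\, Y^1_t, Y^2_t, H_t\big).
\end{aligned}$$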

Let us set

$$\tilde v(t, Y^1_t, Y^2_t) = \tilde u(t, Y^1_t, Y^2_t, 0) = B_t\,\mathbb{E}_{\mathbb{Q}^*}\big(B_T^{-1}Y \,\big|\, Y^1_t, Y^2_t, H_t = 0\big) \tag{53}$$

and

$$\bar v(t, Y^1_t, Y^2_t) = \bar u(t, Y^1_t, Y^2_t, 1) = B_t\,\mathbb{E}_{\mathbb{Q}^*}\big(B_T^{-1}h(Y^1_T,Y^2_T) \,\big|\, Y^1_t, Y^2_t, H_t = 1\big). \tag{54}$$

It is clear that $\tilde v(T, Y^1_T, Y^2_T) = g(Y^1_T, Y^2_T)$ and $\bar v(T, Y^1_T, Y^2_T) = h(Y^1_T, Y^2_T)$. We conclude that the price of the claim Y is of the form $v(t, Y^1_t, Y^2_t)$, where

$$v(t, Y^1_t, Y^2_t) = \mathbb{1}_{\{\tau>t\}}\,\tilde v(t, Y^1_t, Y^2_t) + \mathbb{1}_{\{\tau\le t\}}\,\bar v(t, Y^1_t, Y^2_t).$$

Notice that $\tilde v(t, Y^1_t, Y^2_t)$ and $\bar v(t, Y^1_t, Y^2_t)$ represent the pre-default and post-default values of Y, respectively.

Post-default value. It should be stressed that the conditional expectation in (53) is to be evaluated using the dynamics of $(Y^1, Y^2, Y^3)$ given by (49). To compute the conditional expectation in (54), however, it is manifestly sufficient to make use of the post-default dynamics of $(Y^1, Y^2, Y^3)$, which are given by the following expressions, valid if $\varrho_1 > -1$ and $\varrho_2 > -1$,

$$\begin{aligned}
dY^1_t &= Y^1_{t-}\big(r\,dt + \sigma_1\,dW^*_t\big),\\
dY^2_t &= Y^2_{t-}\big(r\,dt + \sigma_2\,dW^*_t\big),\\
dY^3_t &= rY^3_t\,dt.
\end{aligned} \tag{55}$$

Using standard arguments, we conclude that if the function $\bar v = \bar v(t,x,y)$ is sufficiently regular then it satisfies the following PDE:

$$-r\bar v + \partial_t\bar v + rx\,\partial_x\bar v + ry\,\partial_y\bar v + \frac{1}{2}\big(\sigma_1^2x^2\,\partial^2_{xx}\bar v + \sigma_2^2y^2\,\partial^2_{yy}\bar v\big) + \sigma_1\sigma_2xy\,\partial^2_{xy}\bar v = 0 \tag{56}$$

with the terminal condition $\bar v(T,x,y) = h(x,y)$. Hence, the equation (56) can be referred to as the post-default pricing PDE for our claim. Of course, since after the default time our model becomes a default-free model, the use of such a PDE for the arbitrage valuation of path-independent European claims is fairly standard.

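If $\varrho_1 > -1$ and $\varrho_2 = -1$, then the process $Y^2$ jumps to zero at time of default, and thus the post-default pricing PDE becomes:

$$-r\bar v + \partial_t\bar v + rx\,\partial_x\bar v + \frac{1}{2}\sigma_1^2x^2\,\partial^2_{xx}\bar v = 0 \tag{57}$$

with the terminal condition $\bar v(T,x) = \bar h(x)$ for some function $\bar h : \mathbb{R}_+ \to \mathbb{R}$ (formally, $\bar h(x) = h(x,0)$).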

Recovery process. Following Jamshidian (2002) (see Theorem 2.1), one may check that for any $t \in [0,T]$ we have

$$B_t\,\mathbb{E}_{\mathbb{Q}^*}\big(B_T^{-1}\mathbb{1}_D\,h(Y^1_T,Y^2_T) \,\big|\, \mathcal{G}_t\big) = B_t\,\mathbb{E}_{\mathbb{Q}^*}\big(B_\tau^{-1}\mathbb{1}_D\,\bar v(\tau, Y^1_\tau, Y^2_\tau) \,\big|\, \mathcal{G}_t\big),$$

where $D = \{t < \tau \le T\}$. Hence, if we wish to compute the pre-default value of Y, it is tempting to consider the process $\bar v(t, Y^1_t, Y^2_t)$ as the recovery process Z. According to our convention, the recovery process Z should necessarily be an F-predictable process, and the process $\bar v(t, Y^1_t, Y^2_t)$ is not F-predictable, in general. Therefore, we formally define the recovery process Z associated with the claim Y by setting

$$Z_t = z\big(t, \tilde Y^1_t(1+\varrho_1), \tilde Y^2_t(1+\varrho_2)\big) = \bar v\big(t, \tilde Y^1_t(1+\varrho_1), \tilde Y^2_t(1+\varrho_2)\big), \tag{58}$$

where $\tilde Y^i$ is the pre-default value of the ith asset (so that $\tilde Y^i$ is manifestly an F-adapted, continuous process). It is clear that

$$Z_\tau = \bar v\big(\tau, \tilde Y^1_\tau(1+\varrho_1), \tilde Y^2_\tau(1+\varrho_2)\big) = \bar v(\tau, Y^1_\tau, Y^2_\tau), \quad \mathbb{Q}^*\text{-a.s.}$$

Notice that the pre-default value of the claim Y given by (50) coincides with the pre-default value of (X, Z, 0, τ), where the promised payoff is $X = g(Y^1_T, Y^2_T)$ and the F-predictable recovery process Z is given by (58).

6.2 Pricing PDE for the Pre-Default Value

Recall that the price process of the claim Y given by (50) admits the following representation, for every $t \in [0,T]$,

$$\pi_t(Y) = (1 - H_t)\,\tilde v(t, Y^1_t, Y^2_t) + H_t\,\bar v(t, Y^1_t, Y^2_t) \tag{59}$$

for some functions $\tilde v, \bar v : [0,T] \times \mathbb{R}^2_+ \to \mathbb{R}$ such that $\tilde v(T,x,y) = g(x,y)$ and $\bar v(T,x,y) = h(x,y)$. We assume that processes $\tilde v(t, Y^1_t, Y^2_t)$ and $\bar v(t, Y^1_t, Y^2_t)$ are semimartingales. We shall need the following simple version of the Ito integration by parts formula for (discontinuous) semimartingales.

Lemma 10. Assume that Z is a semimartingale and A is a bounded process of finite variation. Then

$$Z_tA_t = Z_0A_0 + \int_0^t Z_{u-}\,dA_u + \int_0^t A_u\,dZ_u = Z_0A_0 + \int_0^t Z_u\,dA_u + \int_0^t A_{u-}\,dZ_u.$$

Proof. Both formulae are almost immediate consequences of the general Ito formula for semimartingales (see, for instance, Protter (2003)), and the fact that under the present assumptions we have $[Z,A]_t = \sum_{0<s\le t}\Delta Z_s\,\Delta A_s$.
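For intuition, the two forms of the integration by parts formula can be checked exactly on piecewise-constant paths, where all integrals reduce to finite sums over jump times. This is a deliberately narrow special case chosen for this sketch (the lemma itself allows a general semimartingale Z), and the function name and sample paths are hypothetical.

```python
def integration_by_parts_gaps(z, a):
    """Check both forms of the formula for piecewise-constant paths.

    z[k], a[k] are the values of Z and A after the k-th jump time
    (z[0], a[0] are the initial values), so that the left limits at the
    k-th jump are z[k-1], a[k-1]. Returns the two residuals
        Z_t A_t - Z_0 A_0 - sum Z_{u-} dA_u - sum A_u dZ_u,
        Z_t A_t - Z_0 A_0 - sum Z_u dA_u - sum A_{u-} dZ_u.
    """
    lhs = z[-1] * a[-1] - z[0] * a[0]
    first = 0
    second = 0
    for k in range(1, len(z)):
        dz = z[k] - z[k - 1]
        da = a[k] - a[k - 1]
        first += z[k - 1] * da + a[k] * dz   # Z_{u-} dA_u + A_u dZ_u
        second += z[k] * da + a[k - 1] * dz  # Z_u dA_u + A_{u-} dZ_u
    return lhs - first, lhs - second


if __name__ == "__main__":
    # Integer-valued sample paths keep the arithmetic exact.
    print(integration_by_parts_gaps([1, 3, 2, 5], [2, 2, 4, 1]))
```

Both residuals vanish because each form absorbs the common-jump term ΔZ ΔA into one of the two sums, which is the role played by the bracket [Z, A] in the proof.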

Our next goal is to derive the partial differential equation satisfied by the pre-default pricing function $\tilde v$. The post-default pricing function $\bar v$ (or, equivalently, the recovery function z) is taken here as an input. Hence, the only unknown function at this stage is the pre-default pricing function $\tilde v$.

In view of the financial interpretation of the function $\tilde v$, the PDE derived in Proposition 11 will be referred to as the pre-default pricing PDE for a defaultable claim Y. For a related result, see Proposition 3.4 in Lukas (2001).

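Proposition 11. Assume that the function $\tilde v = \tilde v(t,x,y)$ belongs to the class $C^{1,2,2}([0,T] \times \mathbb{R}_+ \times \mathbb{R}_+)$ and that, in addition, it satisfies the PDE

$$\begin{aligned}
&-r\tilde v + \partial_t\tilde v + rx\,\partial_x\tilde v + ry\,\partial_y\tilde v + \frac{1}{2}\big(\sigma_1^2x^2\,\partial^2_{xx}\tilde v + \sigma_2^2y^2\,\partial^2_{yy}\tilde v\big) + \sigma_1\sigma_2xy\,\partial^2_{xy}\tilde v\\
&\quad + \gamma(t,x,y)\Big(\bar v\big(t, x(1+\varrho_1), y(1+\varrho_2)\big) - \tilde v - \varrho_1x\,\partial_x\tilde v - \varrho_2y\,\partial_y\tilde v\Big) = 0
\end{aligned}$$

with the terminal condition $\tilde v(T,x,y) = g(x,y)$. Let the process π(Y) be given by (59). Then the process $V^*_t = B_t^{-1}\pi_t(Y)$ stopped at τ is a G-martingale under $\mathbb{Q}^*$.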


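Proof. By applying the Ito integration by parts formula to both terms in the right-hand side of (59), we obtain

$$\begin{aligned}
d\pi_t(Y) &= (1 - H_t)\,d\tilde v(t, Y^1_t, Y^2_t) - \tilde v(t, Y^1_{t-}, Y^2_{t-})\,dH_t + H_{t-}\,d\bar v(t, Y^1_t, Y^2_t) + \bar v(t, Y^1_t, Y^2_t)\,dH_t\\
&= \mathbb{1}_{\{\tau>t\}}\,d\tilde v(t, Y^1_t, Y^2_t) + \big(\bar v(t, Y^1_t, Y^2_t) - \tilde v(t, Y^1_{t-}, Y^2_{t-})\big)\,dH_t + \mathbb{1}_{\{\tau<t\}}\,d\bar v(t, Y^1_t, Y^2_t).
\end{aligned}$$

Hence, the process $V^*_t = e^{-rt}\pi_t(Y)$ satisfies for every $t \in [0,T]$

$$\begin{aligned}
V^*_t = V^*_0 &- \int_0^{\tau\wedge t} re^{-ru}\,\tilde v(u, Y^1_u, Y^2_u)\,du + \int_{(0,\tau\wedge t)} e^{-ru}\,d\tilde v(u, Y^1_u, Y^2_u)\\
&+ \int_{(0,\tau\wedge t]} e^{-ru}\big(\bar v(u, Y^1_u, Y^2_u) - \tilde v(u, Y^1_{u-}, Y^2_{u-})\big)\,dH_u + \int_{(\tau\wedge t,\,t]} e^{-ru}\,d\bar v(u, Y^1_u, Y^2_u).
\end{aligned}$$

It is clear that if π(Y) is given by (51) then the process $V^*$ is a G-martingale under $\mathbb{Q}^*$ (see also Corollary 8 below). To derive the pre-default pricing PDE, it suffices to make use of the martingale property of the stopped process

$$V^*_{\tau\wedge t} = \mathbb{1}_{\{\tau>t\}}\,e^{-rt}\,\tilde v(t, Y^1_t, Y^2_t) + \mathbb{1}_{\{\tau\le t\}}\,e^{-r\tau}\,\bar v(\tau, Y^1_\tau, Y^2_\tau).$$

By applying Ito’s formula to $\tilde v(t, Y^1_t, Y^2_t)$ on $\{\tau > t\}$, we obtain

$$\begin{aligned}
V^*_{\tau\wedge t} = V^*_0 &+ \int_0^{\tau\wedge t} e^{-ru}\big(-r\tilde v_u + \partial_t\tilde v_u + rY^1_u\,\partial_x\tilde v_u + rY^2_u\,\partial_y\tilde v_u\big)\,du\\
&+ \int_0^{\tau\wedge t} \frac{1}{2}e^{-ru}\big(\sigma_1^2(Y^1_u)^2\,\partial_{xx}\tilde v_u + \sigma_2^2(Y^2_u)^2\,\partial_{yy}\tilde v_u + 2\sigma_1\sigma_2Y^1_uY^2_u\,\partial_{xy}\tilde v_u\big)\,du\\
&- \int_0^{\tau\wedge t} e^{-ru}\big(\varrho_1Y^1_u\,\partial_x\tilde v_u + \varrho_2Y^2_u\,\partial_y\tilde v_u\big)\gamma_u\,du\\
&+ \int_{(0,\tau\wedge t]} e^{-ru}\Big(\bar v\big(u, Y^1_{u-}(1+\varrho_1), Y^2_{u-}(1+\varrho_2)\big) - \tilde v(u, Y^1_{u-}, Y^2_{u-})\Big)\,dH_u\\
&+ \int_0^{\tau\wedge t} e^{-ru}\big(\sigma_1Y^1_u\,\partial_x\tilde v_u + \sigma_2Y^2_u\,\partial_y\tilde v_u\big)\,dW^*_u,
\end{aligned}$$

where $\tilde v_u = \tilde v(u, Y^1_u, Y^2_u)$, $\partial_x\tilde v_u = \partial_x\tilde v(u, Y^1_u, Y^2_u)$, $\gamma_u = \gamma(u, Y^1_u, Y^2_u)$, etc. The last formula can be rewritten as follows: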


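$$\begin{aligned}
V^*_{\tau\wedge t} = V^*_0 &+ \int_0^{\tau\wedge t} e^{-ru}\big(-r\tilde v_u + \partial_t\tilde v_u + rY^1_u\,\partial_x\tilde v_u + rY^2_u\,\partial_y\tilde v_u\big)\,du\\
&+ \int_0^{\tau\wedge t} \frac{1}{2}e^{-ru}\big(\sigma_1^2(Y^1_u)^2\,\partial_{xx}\tilde v_u + \sigma_2^2(Y^2_u)^2\,\partial_{yy}\tilde v_u + 2\sigma_1\sigma_2Y^1_uY^2_u\,\partial_{xy}\tilde v_u\big)\,du\\
&- \int_0^{\tau\wedge t} e^{-ru}\big(\varrho_1Y^1_u\,\partial_x\tilde v_u + \varrho_2Y^2_u\,\partial_y\tilde v_u\big)\gamma_u\,du\\
&+ \int_0^{\tau\wedge t} e^{-ru}\Big(\bar v\big(u, Y^1_{u-}(1+\varrho_1), Y^2_{u-}(1+\varrho_2)\big) - \tilde v(u, Y^1_{u-}, Y^2_{u-})\Big)\gamma_u\,du\\
&+ \int_{(0,\tau\wedge t]} e^{-ru}\Big(\bar v\big(u, Y^1_{u-}(1+\varrho_1), Y^2_{u-}(1+\varrho_2)\big) - \tilde v(u, Y^1_{u-}, Y^2_{u-})\Big)\,dM^*_u\\
&+ \int_0^{\tau\wedge t} e^{-ru}\big(\sigma_1Y^1_u\,\partial_x\tilde v_u + \sigma_2Y^2_u\,\partial_y\tilde v_u\big)\,dW^*_u.
\end{aligned}$$

Recall that the processes $W^*$ and $M^*$ are G-martingales under $\mathbb{Q}^*$. Thus, the stopped process $V^*_{t\wedge\tau}$ is a G-martingale if and only if for every $t \in [0,T]$

$$\begin{aligned}
&\int_0^{\tau\wedge t} e^{-ru}\big(-r\tilde v_u + \partial_t\tilde v_u + rY^1_u\,\partial_x\tilde v_u + rY^2_u\,\partial_y\tilde v_u\big)\,du\\
&\quad + \int_0^{\tau\wedge t} \frac{1}{2}e^{-ru}\big(\sigma_1^2(Y^1_u)^2\,\partial_{xx}\tilde v_u + \sigma_2^2(Y^2_u)^2\,\partial_{yy}\tilde v_u + 2\sigma_1\sigma_2Y^1_uY^2_u\,\partial_{xy}\tilde v_u\big)\,du\\
&\quad - \int_0^{\tau\wedge t} e^{-ru}\big(\varrho_1Y^1_u\,\partial_x\tilde v_u + \varrho_2Y^2_u\,\partial_y\tilde v_u\big)\gamma_u\,du\\
&\quad + \int_0^{\tau\wedge t} e^{-ru}\Big(\bar v\big(u, Y^1_u(1+\varrho_1), Y^2_u(1+\varrho_2)\big) - \tilde v(u, Y^1_u, Y^2_u)\Big)\gamma_u\,du = 0.
\end{aligned}$$

The last equality is manifestly satisfied if the function $\tilde v$ solves the PDE given in the statement of the proposition. Conversely, if the function $\tilde v$ in representation (59) is sufficiently regular, then it necessarily satisfies the last equation.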

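Corollary 8. Assume that the pricing functions $\bar v$ and $\tilde v$ belong to the class $C^{1,2,2}([0,T]\times\mathbb{R}_+\times\mathbb{R}_+)$ and satisfy the post-default and pre-default pricing PDEs, respectively. Then the discounted price process $V^*_t$, t ∈ [0, T], is a G-martingale under Q∗ and the dynamics of V∗ under Q∗ are

\[
\begin{aligned}
dV^*_t ={}& \mathbf{1}_{\{\tau>t\}}\, e^{-rt}\big(\sigma_1 Y^1_t\,\partial_x \tilde v_t + \sigma_2 Y^2_t\,\partial_y \tilde v_t\big)\,dW^*_t \\
&+ \mathbf{1}_{\{\tau<t\}}\, e^{-rt}\big(\sigma_1 Y^1_t\,\partial_x \bar v_t + \sigma_2 Y^2_t\,\partial_y \bar v_t\big)\,dW^*_t \\
&+ e^{-rt}\big(\bar v(t, Y^1_{t-}(1+\kappa_1), Y^2_{t-}(1+\kappa_2)) - \tilde v(t, Y^1_{t-}, Y^2_{t-})\big)\,dM^*_t.
\end{aligned}
\]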

Generic defaultable claim. The technique described above can be applied to the case of a general defaultable claim. Consider a generic defaultable claim (X, Z, 0, τ) with the promised payoff $X = g(Y^1_T, Y^2_T)$ and the recovery process $Z_t = z(t, Y^1_t, Y^2_t)$, where z is a continuous function. Then the discounted price process stopped at τ equals

\[
V^*_{\tau\wedge t} = \mathbf{1}_{\{\tau>t\}}\, e^{-rt}\, \tilde v(t, Y^1_t, Y^2_t) + \mathbf{1}_{\{\tau\le t\}}\, e^{-r\tau}\, z(\tau, Y^1_\tau, Y^2_\tau).
\]

The latter formula can also be rewritten as follows (note that the pre-default prices $\tilde Y^1$ and $\tilde Y^2$ are continuous)

\[
V^*_{\tau\wedge t} = \mathbf{1}_{\{\tau>t\}}\, e^{-rt}\, \tilde v(t, Y^1_t, Y^2_t) + \mathbf{1}_{\{\tau\le t\}}\, e^{-r\tau}\, z\big(\tau, \tilde Y^1_\tau(1+\kappa_1), \tilde Y^2_\tau(1+\kappa_2)\big).
\]

In this case, the pre-default pricing PDE reads

\[
\begin{aligned}
&-r\tilde v + \partial_t \tilde v + rx\,\partial_x \tilde v + ry\,\partial_y \tilde v + \tfrac12\big(\sigma_1^2 x^2\,\partial^2_{xx}\tilde v + \sigma_2^2 y^2\,\partial^2_{yy}\tilde v\big) + \sigma_1\sigma_2 xy\,\partial^2_{xy}\tilde v \\
&\quad+ \gamma(t,x,y)\big(z(t, x(1+\kappa_1), y(1+\kappa_2)) - \tilde v - \kappa_1 x\,\partial_x \tilde v - \kappa_2 y\,\partial_y \tilde v\big) = 0
\end{aligned}
\]

with the terminal condition $\tilde v(T, x, y) = g(x, y)$. According to our interpretation of the pre-default value U = U(X) + U(Z) of the claim (X, Z, 0, τ), the solution to the last equation is expected to satisfy $\tilde v(t, \tilde Y^1_t, \tilde Y^2_t) = U_t$ for every t ∈ [0, T].

6.3 Replicating Strategy

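Consider a claim Y of the form (50), and assume that for any t ∈ [0, T] we have

\[
\pi_t(Y) = \tilde v(t, Y^1_t, Y^2_t)\,\mathbf{1}_{\{\tau>t\}} + \bar v(t, Y^1_t, Y^2_t)\,\mathbf{1}_{\{\tau\le t\}},
\]

where the functions $\bar v$ and $\tilde v$ satisfy the post-default and pre-default pricing PDEs, respectively. In view of Corollary 8, we have (recall that the process M∗ is stopped at τ and the pre-default processes $\tilde Y^1$ and $\tilde Y^2$ are continuous)

\[
\begin{aligned}
dV^*_t ={}& \mathbf{1}_{\{\tau\ge t\}}\, e^{-rt}\,\tilde V_t\,dW^*_t + \mathbf{1}_{\{\tau<t\}}\, e^{-rt}\,\bar V_t\,dW^*_t \\
&+ e^{-rt}\big[\bar v(t, \tilde Y^1_t(1+\kappa_1), \tilde Y^2_t(1+\kappa_2)) - \tilde v(t, \tilde Y^1_t, \tilde Y^2_t)\big]\,dM^*_t,
\end{aligned}
\]

where the F-adapted process $\tilde V$ is given by

\[
\tilde V_t = \sigma_1 \tilde Y^1_t\,\partial_x \tilde v(t, \tilde Y^1_t, \tilde Y^2_t) + \sigma_2 \tilde Y^2_t\,\partial_y \tilde v(t, \tilde Y^1_t, \tilde Y^2_t) \tag{60}
\]

and $\bar V$ is the G-adapted process:

\[
\bar V_t = \sigma_1 Y^1_t\,\partial_x \bar v(t, Y^1_t, Y^2_t) + \sigma_2 Y^2_t\,\partial_y \bar v(t, Y^1_t, Y^2_t). \tag{61}
\]

As before, we denote the discounted prices by

\[
Y^{1,3}_t = Y^1_t / Y^3_t = Y^1_t e^{-rt}, \qquad Y^{2,3}_t = Y^2_t / Y^3_t = Y^2_t e^{-rt}.
\]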

Some algebra leads to


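\[
dW^*_t = \frac{1}{\kappa_2\sigma_1 - \kappa_1\sigma_2}\left(\frac{\kappa_2}{Y^{1,3}_{t-}}\,dY^{1,3}_t - \frac{\kappa_1}{Y^{2,3}_{t-}}\,dY^{2,3}_t\right),
\]

\[
dM^*_t = \frac{1}{\kappa_1\sigma_2 - \kappa_2\sigma_1}\left(\frac{\sigma_2}{Y^{1,3}_{t-}}\,dY^{1,3}_t - \frac{\sigma_1}{Y^{2,3}_{t-}}\,dY^{2,3}_t\right).
\]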
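The inversion above is a two-by-two linear solve: each discounted relative return equals $\sigma_i\,dW^* + \kappa_i\,dM^*$. The following minimal Python sketch checks that algebra on one pair of increments. The function name `recover_drivers` and all numerical values are illustrative assumptions, not part of the text.

```python
def recover_drivers(ret1, ret2, sigma1, sigma2, kappa1, kappa2):
    """Invert ret_i = sigma_i * dW + kappa_i * dM (i = 1, 2) for (dW, dM).

    ret_i stands for the discounted relative return dY^{i,3} / Y^{i,3}_{t-}.
    """
    c = kappa2 * sigma1 - kappa1 * sigma2
    if c == 0:
        raise ValueError("c = kappa2*sigma1 - kappa1*sigma2 must be non-zero")
    d_w = (kappa2 * ret1 - kappa1 * ret2) / c
    d_m = -(sigma2 * ret1 - sigma1 * ret2) / c
    return d_w, d_m


# Hypothetical coefficients and increments, chosen only for illustration.
sigma1, sigma2, kappa1, kappa2 = 0.2, 0.3, -0.4, 0.5
dW_true, dM_true = 0.013, -0.007

ret1 = sigma1 * dW_true + kappa1 * dM_true
ret2 = sigma2 * dW_true + kappa2 * dM_true

dW_rec, dM_rec = recover_drivers(ret1, ret2, sigma1, sigma2, kappa1, kappa2)
```

The guard on `c` mirrors the standing assumption that $c = \kappa_2\sigma_1 - \kappa_1\sigma_2 \neq 0$; without it the two assets carry proportional risks and the drivers cannot be separated.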

It should be stressed that the above representation for W∗ and M∗ is always valid, under the present assumptions, on the stochastic interval [[0, τ ∧ T]]. It also holds after default, provided that neither Y¹ nor Y² jumps to zero at time τ. Hence, the case when Y¹ (or Y²) becomes worthless at time τ (and thus also after τ) should be considered separately. It is worthwhile to emphasize that the strategy φ derived below is always the replicating strategy for the claim Y up to default time τ. Recall that we work under the standing assumption that $c = \kappa_2\sigma_1 - \kappa_1\sigma_2 \neq 0$. Hence, under the assumption that $\kappa_1 > -1$ and $\kappa_2 > -1$, we obtain

\[
\begin{aligned}
V^*_t = V^*_0
&+ \frac1c \int_{(0,t]} \Big[\kappa_2\big(\mathbf{1}_{\{\tau\ge u\}}\tilde V_u + \mathbf{1}_{\{\tau<u\}}\bar V_u\big) - \sigma_2\big(\bar v(u, \tilde Y^1_u(1+\kappa_1), \tilde Y^2_u(1+\kappa_2)) - \tilde v(u, \tilde Y^1_u, \tilde Y^2_u)\big)\Big] \frac{dY^{1,3}_u}{Y^1_{u-}} \\
&- \frac1c \int_{(0,t]} \Big[\kappa_1\big(\mathbf{1}_{\{\tau\ge u\}}\tilde V_u + \mathbf{1}_{\{\tau<u\}}\bar V_u\big) - \sigma_1\big(\bar v(u, \tilde Y^1_u(1+\kappa_1), \tilde Y^2_u(1+\kappa_2)) - \tilde v(u, \tilde Y^1_u, \tilde Y^2_u)\big)\Big] \frac{dY^{2,3}_u}{Y^2_{u-}} \\
={}& V^*_0 + \int_{(0,t]} \psi^1_u\,dY^{1,3}_u + \int_{(0,t]} \psi^2_u\,dY^{2,3}_u,
\end{aligned}
\]

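where the processes ψ¹ and ψ² are G-predictable. If we do not postulate that $\kappa_1 > -1$ and $\kappa_2 > -1$, then we obtain

\[
V^*_{t\wedge\tau} = V^*_0 + \int_0^{t\wedge\tau} e^{-ru}\,\tilde V_u\,dW^*_u + \int_{(0,t\wedge\tau]} e^{-ru}\big[\bar v(u, \tilde Y^1_u(1+\kappa_1), \tilde Y^2_u(1+\kappa_2)) - \tilde v(u, \tilde Y^1_u, \tilde Y^2_u)\big]\,dM^*_u
\]

or, equivalently,

\[
\begin{aligned}
V^*_{t\wedge\tau} = V^*_0
&+ \frac1c \int_{(0,t\wedge\tau]} \kappa_2 \tilde V_u\,\frac{dY^{1,3}_u}{\tilde Y^1_u} - \frac1c \int_{(0,t\wedge\tau]} \kappa_1 \tilde V_u\,\frac{dY^{2,3}_u}{\tilde Y^2_u} \\
&- \frac{\sigma_2}{c} \int_{(0,t\wedge\tau]} \big(\bar v(u, \tilde Y^1_u(1+\kappa_1), \tilde Y^2_u(1+\kappa_2)) - \tilde v(u, \tilde Y^1_u, \tilde Y^2_u)\big)\,\frac{dY^{1,3}_u}{\tilde Y^1_u} \\
&+ \frac{\sigma_1}{c} \int_{(0,t\wedge\tau]} \big(\bar v(u, \tilde Y^1_u(1+\kappa_1), \tilde Y^2_u(1+\kappa_2)) - \tilde v(u, \tilde Y^1_u, \tilde Y^2_u)\big)\,\frac{dY^{2,3}_u}{\tilde Y^2_u} \\
={}& V^*_0 + \int_{(0,t\wedge\tau]} \varphi^1_u\,dY^{1,3}_u + \int_{(0,t\wedge\tau]} \varphi^2_u\,dY^{2,3}_u,
\end{aligned}
\]

where the processes φ¹ and φ² are F-predictable.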


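We are in a position to state the following result, which establishes the formula for the replicating strategy for Y.

Proposition 12. Assume that $\kappa_1 > -1$ and $\kappa_2 > -1$. Then the replicating strategy for the defaultable claim Y defined by (50) is given as $\psi = (\psi^1, \psi^2, \pi(Y) - \psi^1 Y^1 - \psi^2 Y^2)$, where the G-predictable processes ψ¹ and ψ² are given by the expressions

\[
\psi^1_t = (cY^1_{t-})^{-1}\Big(\kappa_2\big(\mathbf{1}_{\{\tau\ge t\}}\tilde V_t + \mathbf{1}_{\{\tau<t\}}\bar V_t\big) - \sigma_2\big(\bar v(t, \tilde Y^1_t(1+\kappa_1), \tilde Y^2_t(1+\kappa_2)) - \tilde v(t, \tilde Y^1_t, \tilde Y^2_t)\big)\Big)
\]

and

\[
\psi^2_t = -(cY^2_{t-})^{-1}\Big(\kappa_1\big(\mathbf{1}_{\{\tau\ge t\}}\tilde V_t + \mathbf{1}_{\{\tau<t\}}\bar V_t\big) - \sigma_1\big(\bar v(t, \tilde Y^1_t(1+\kappa_1), \tilde Y^2_t(1+\kappa_2)) - \tilde v(t, \tilde Y^1_t, \tilde Y^2_t)\big)\Big)
\]

with the processes $\tilde V$ and $\bar V$ given by (60) and (61), respectively. The wealth process of ψ satisfies $V_t(\psi) = \pi_t(Y)$ for every t ∈ [0, T].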

It is worthwhile to stress that the replicating strategy ψ is understood in the standard sense, that is, it duplicates the payoff Y at the maturity date T. If we wish instead to use the convention adopted in Section 4, then we should focus on the defaultable claim (X, Z, 0, τ) associated with Y through equality (58) (in this case, $z(t, x, y) = \bar v(t, x, y)$); the strategy is then a replicating strategy for the associated defaultable claim (X, Z, 0, τ). The latter convention is particularly convenient if the assumption that both $\kappa_1$ and $\kappa_2$ are strictly greater than −1 is relaxed. Let us focus on the replication of the claim (X, Z, 0, τ) with the pre-default value

\[
U_t = U_t(X) + U_t(Z) = \tilde v(t, \tilde Y^1_t, \tilde Y^2_t).
\]

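Proposition 13. Assume that either $\kappa_1 > -1$ or $\kappa_2 > -1$, and let the process $\tilde V$ be given by (60). Then the replicating strategy for the defaultable claim (X, Z, 0, τ) is $\varphi = (\varphi^1, \varphi^2, U - \varphi^1 Y^1 - \varphi^2 Y^2)$, where the F-predictable processes φ¹ and φ² are given by the formulae

\[
\varphi^1_t = (c\tilde Y^1_t)^{-1}\Big(\kappa_2 \tilde V_t - \sigma_2\big(z(t, \tilde Y^1_t(1+\kappa_1), \tilde Y^2_t(1+\kappa_2)) - U_t\big)\Big),
\]

\[
\varphi^2_t = -(c\tilde Y^2_t)^{-1}\Big(\kappa_1 \tilde V_t - \sigma_1\big(z(t, \tilde Y^1_t(1+\kappa_1), \tilde Y^2_t(1+\kappa_2)) - U_t\big)\Big).
\]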
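The formulae of Proposition 13 can be read as matching two exposures: the diffusion exposure of the position must equal the pre-default volatility term from (60), and the jump exposure must equal the difference between the recovery payoff and the pre-default value. A minimal Python sketch of that reading follows. The function name `replicating_positions`, the argument names, and the numbers are assumptions made for illustration; `jump` stands for the recovery value at default minus the pre-default value U.

```python
def replicating_positions(y1, y2, v_vol, jump, sigma1, sigma2, kappa1, kappa2):
    """Positions (phi1, phi2) in the two risky assets, following the formulae above.

    y1, y2 : pre-default prices of the two assets
    v_vol  : the volatility term of (60)
    jump   : recovery value at default minus the pre-default value U
    """
    c = kappa2 * sigma1 - kappa1 * sigma2
    if c == 0:
        raise ValueError("c = kappa2*sigma1 - kappa1*sigma2 must be non-zero")
    phi1 = (kappa2 * v_vol - sigma2 * jump) / (c * y1)
    phi2 = -(kappa1 * v_vol - sigma1 * jump) / (c * y2)
    return phi1, phi2


# Hypothetical inputs: asset 1 default-free (kappa1 = 0), asset 2 with zero recovery.
y1, y2 = 100.0, 80.0
sigma1, sigma2, kappa1, kappa2 = 0.2, 0.3, 0.0, -1.0
v_vol, jump = 12.0, -30.0   # zero recovery with pre-default value U = 30

phi1, phi2 = replicating_positions(y1, y2, v_vol, jump, sigma1, sigma2, kappa1, kappa2)

# Exposures of the hedge; they should reproduce v_vol and jump respectively.
diffusion_exposure = phi1 * y1 * sigma1 + phi2 * y2 * sigma2
jump_exposure = phi1 * y1 * kappa1 + phi2 * y2 * kappa2
```

With these inputs the amount held in the defaultable asset, `phi2 * y2`, equals the pre-default value 30, which is the situation described for survival claims.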

Survival claim. Assume that the first tradeable asset is a default-free asset (that is, $\kappa_1 = 0$), and the second asset is a defaultable asset with zero recovery (hence, $\kappa_2 = -1$). Then we have

\[
\begin{aligned}
dY^1_t &= Y^1_t\big(r\,dt + \sigma_1\,dW^*_t\big), & Y^1_0 &> 0, \\
dY^2_t &= Y^2_{t-}\big(r\,dt + \sigma_2\,dW^*_t - dM^*_t\big), & Y^2_0 &> 0, \\
dY^3_t &= rY^3_t\,dt, & Y^3_0 &= 1.
\end{aligned}
\]

Notice that $c = \kappa_2\sigma_1 - \kappa_1\sigma_2 = -\sigma_1 \neq 0$. Consider a survival claim Y of the form $Y = \mathbf{1}_{\{\tau>T\}}\, g(Y^1_T)$, that is, a vulnerable claim with zero recovery written on the default-free asset Y¹. It is obvious that we may formally identify Y with the defaultable claim (X, 0, 0, τ) with the promised payoff $X = g(Y^1_T)$ and Z = 0. For the replicating strategy φ we obtain that

\[
\varphi^2_t \tilde Y^2_t = \tilde v(t, \tilde Y^1_t, \tilde Y^2_t) - \bar v(t, \tilde Y^1_t(1+\kappa_1), \tilde Y^2_t(1+\kappa_2)) = U_t,
\]

since $\bar v(t, x, y) = 0$. We conclude that the net investment in default-free assets equals 0 at any time t ∈ [0, T]. One can check, by inspection, that the strategy φ replicates the claim Y also after default (formally, we set $\varphi^i_t = 0$ for i = 1, 2, 3 on the event $\{\tau < t\}$).

Suppose that the risk-neutral intensity of default is of the form $\gamma_t = \gamma(t, Y^1_t)$. In this case, it is rather obvious that the pre-default pricing function $\tilde v$ does not depend on the variable y. In particular, the volatility coefficient σ₂ of the second asset plays no role in the risk-neutral valuation of Y; only the properties of the default time τ really matter. This feature of the function $\tilde v$ can be formally deduced from the representation (53) and the observation that if $\gamma_t = \gamma(t, Y^1_t)$ then the two-dimensional process (Y¹, H) is Markovian with respect to the filtration G. We conclude that the function $\tilde v = \tilde v(t, x)$ satisfies the following simple version of the pre-default pricing PDE

\[
-r\tilde v + \partial_t \tilde v + rx\,\partial_x \tilde v + \tfrac12 \sigma_1^2 x^2\,\partial^2_{xx}\tilde v - \gamma(t, x)\,\tilde v = 0
\]

with the terminal condition $\tilde v(T, x) = g(x)$.
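As an illustration of the last PDE, consider the special case of a constant intensity γ and a call payoff with strike K. These are assumptions made here for the sake of an example, not choices made in the text. In that case the candidate solution is the Black-Scholes call price multiplied by the survival factor exp(−γ(T − t)). The sketch below evaluates the PDE residual by central finite differences; all parameter values are hypothetical.

```python
import math


def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(t, x, strike, r, sigma, maturity):
    """Black-Scholes price at time t < maturity of a call on an asset worth x."""
    tau = maturity - t
    d1 = (math.log(x / strike) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return x * norm_cdf(d1) - strike * math.exp(-r * tau) * norm_cdf(d2)


def survival_call(t, x, strike, r, sigma, gamma, maturity):
    """Candidate pre-default value: default-free price times survival factor."""
    return math.exp(-gamma * (maturity - t)) * bs_call(t, x, strike, r, sigma, maturity)


def pde_residual(t, x, strike, r, sigma, gamma, maturity, ht=1e-4, hx=1e-2):
    """Left-hand side of -r v + v_t + r x v_x + 0.5 sigma^2 x^2 v_xx - gamma v."""
    def v(s, y):
        return survival_call(s, y, strike, r, sigma, gamma, maturity)

    v0 = v(t, x)
    v_t = (v(t + ht, x) - v(t - ht, x)) / (2.0 * ht)
    v_x = (v(t, x + hx) - v(t, x - hx)) / (2.0 * hx)
    v_xx = (v(t, x + hx) - 2.0 * v0 + v(t, x - hx)) / hx**2
    return -r * v0 + v_t + r * x * v_x + 0.5 * sigma**2 * x**2 * v_xx - gamma * v0


# Hypothetical parameters, for illustration only.
residual = pde_residual(t=0.5, x=100.0, strike=95.0, r=0.03, sigma=0.2,
                        gamma=0.05, maturity=1.0)
```

The residual should vanish up to discretization error, which reflects the fact that a constant intensity acts as an extra discount rate on the promised payoff.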

6.4 Generalizations

For the sake of simplicity, we have postulated that the prices Y¹, Y² and Y³ are given by the SDE (49) with constant coefficients. In order to cover a large class of defaultable assets, we should relax these restrictive assumptions by postulating, for instance, that the processes Y¹ and Y² are governed under Q∗ by

\[
dY^i_t = Y^i_{t-}\big(r_t\,dt + \sigma^i_t\,dW^*_t + \kappa^i_t\,dM^*_t\big), \qquad Y^i_0 > 0,
\]

where

\[
\sigma^i_t = \bar\sigma_i(t, T)\,\mathbf{1}_{\{\tau<t\}} + \tilde\sigma_i(t, T)\,\mathbf{1}_{\{\tau\ge t\}}
\]

for some pre-default and post-default volatilities $\tilde\sigma_i(t, T)$ and $\bar\sigma_i(t, T)$, and where $\kappa^i_t = \kappa_i(t, Y^1_{t-}, Y^2_{t-}, Y^3_{t-})$ for some functions $\kappa_i : [0, T] \times \mathbb{R}^3_+ \to [-1, \infty)$. The proposed dynamics for Y¹ and Y² have the following practical consequences. First, the choice of $\tilde\sigma_i$ and $\bar\sigma_i$ allows us to model the real-life fact that the character of a defaultable security may change essentially after default. Second, through a judicious specification of the function $\kappa_i$, we are able to examine various alternative recovery schemes at time of default. As the process Y³, we may take the price of a zero-coupon default-free bond. Hence, $Y^3_t = B(t, T)$ satisfies under Q∗

\[
dY^3_t = Y^3_t\big(r_t\,dt + b(t, T)\,dW^*_t\big).
\]

Example 3. Suppose that the process Y¹ represents the price of a generic defaultable zero-coupon bond with maturity date T. Then the bond is subject to the fractional recovery of market value scheme with recovery rate δ₁ ∈ [0, 1] if the process κ¹ is constant, specifically,

\[
\kappa^1_t = \kappa_1(t, Y^1_{t-}, Y^2_{t-}, Y^3_{t-}) = \delta_1 - 1.
\]

To model a defaultable bond with the fractional recovery of par value at default, we set

\[
\kappa^1_t = \kappa_1(t, Y^1_{t-}, Y^2_{t-}, Y^3_{t-}) = \delta_1 (Y^1_{t-})^{-1} - 1.
\]

Finally, the fractional recovery of Treasury value scheme corresponds to the following choice of the process $\kappa^1_t$ (recall that $Y^3_t = B(t, T)$, and thus it is a continuous process)

\[
\kappa^1_t = \kappa_1(t, Y^1_{t-}, Y^2_{t-}, Y^3_{t-}) = \delta_1 Y^3_t (Y^1_{t-})^{-1} - 1.
\]

In all cases, the post-default volatility $\bar\sigma_1(t, T)$ should coincide with the volatility of the default-free zero-coupon bond of maturity T. This corresponds to the natural interpretation that after default the recovery payoff is invested in default-free bonds.
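The three recovery schemes of Example 3 differ only in the jump coefficient, and the bond price just after default is the pre-default price multiplied by one plus that coefficient. A small Python sketch makes this explicit; the function names and the numerical values are illustrative assumptions.

```python
def kappa_market_value(delta):
    """Fractional recovery of market value: constant jump coefficient."""
    return delta - 1.0


def kappa_par_value(delta, y1_pre):
    """Fractional recovery of par value, given the pre-default bond price."""
    return delta / y1_pre - 1.0


def kappa_treasury_value(delta, y1_pre, treasury_price):
    """Fractional recovery of Treasury value, given the default-free bond price."""
    return delta * treasury_price / y1_pre - 1.0


def post_default_price(y1_pre, kappa):
    """Price immediately after default: pre-default price times (1 + kappa)."""
    return y1_pre * (1.0 + kappa)


# Hypothetical quotes per unit of face value.
delta, y1_pre, treasury_price = 0.4, 0.8, 0.95

price_rmv = post_default_price(y1_pre, kappa_market_value(delta))
price_rpv = post_default_price(y1_pre, kappa_par_value(delta, y1_pre))
price_rtv = post_default_price(y1_pre, kappa_treasury_value(delta, y1_pre, treasury_price))
```

In this sketch the post-default prices come out as the recovery rate times the pre-default price, the recovery rate itself, and the recovery rate times the default-free bond price, respectively.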

Part II. Mean-Variance Approach

In this part, we formulate a new paradigm for pricing and hedging financial risks in incomplete markets, rooted in the classical Markowitz mean-variance portfolio selection principle. We consider an underlying market of liquid financial instruments that are available to an investor (also called an agent) for investment. We assume that the underlying market is arbitrage-free and complete. We also consider an investor who is interested in dynamic selection of her portfolio, so that the expected value of her wealth at the end of the pre-selected planning horizon is no less than some floor value, and so that the associated risk, as measured by the variance of the wealth at the end of the planning horizon, is minimized.

When a new investment opportunity becomes available for the agent, in the form of some contingent claim, she needs to decide how much she is willing to pay for acquiring the opportunity. More specifically, she has to decide what portion of her current endowment she is willing to invest in a new opportunity. It is assumed that the new claim, if acquired, is held until the horizon date, and the remaining part of the endowment is dynamically invested in primary (liquid) assets. If the cash flows generated by the new opportunity can be perfectly replicated by the existing liquid market instruments already available for trading, then the price of the opportunity will be uniquely determined by the wealth of the replicating strategy. However, if perfect replication is not possible, then the determination of a purchase (or bid) price that the investor is willing to pay for the opportunity will become subject to the investor's overall attitude towards trading. In the case of our investor, the bid price and the corresponding hedging strategy will be determined in accordance with the mean-variance paradigm. Analogous remarks apply to an investor who engages in the creation of an investment opportunity and needs to decide about its selling (or ask) price.

As explained above, it suffices to focus on a situation when the newly available investment opportunity cannot be perfectly replicated by the instruments existing in the underlying market. Thus, the emerging investment opportunity is not attainable, and consequently the market model (that is, the underlying market and new investment opportunities) is incomplete.

It is well known (see, e.g., El Karoui and Quenez (1995) or Kramkov (1996)) that when a market is incomplete, then for any non-attainable contingent claim X there exists a non-empty interval of arbitrage prices, referred to as the no-arbitrage interval, determined by the maximum bid price $\pi^u(X)$ (the upper price) and the minimum ask price $\pi^l(X)$ (the lower price). The maximum bid price represents the cost of the most expensive dynamic portfolio that can be used to perfectly hedge the long position in the contingent claim. The minimum ask price represents the initial cost of the cheapest dynamic portfolio that can be used to perfectly hedge the short position in the contingent claim.

Put another way, the maximum bid price is the maximum amount that the agent purchasing the contingent claim can afford to pay for the claim, and still be sure to find an admissible portfolio that would fully manage her debt and repay it with cash flows generated by the strategy and the contingent claim, and end up with a non-negative wealth at the maturity date of the claim. Likewise, the minimum ask price is the minimum amount that the agent selling the claim can afford to charge for the claim, and still be sure to find an admissible portfolio that would generate enough cash flow to make good on her commitment to the buyer of the claim, and end up with a non-negative wealth at the maturity date of the claim.

As is well known, arbitrage opportunities are precluded if and only if the actual price of the contingent claim belongs to the no-arbitrage interval. But this means, of course, that perfect hedging will be accomplished neither by the short party nor by the long party. Thus, any price that precludes arbitrage enforces the possibility of a financial loss for either party at the maturity date. This observation gave rise to a quite abundant literature regarding the judicious choice of a specific price within the no-arbitrage interval by means of minimizing some functional that assesses the risk associated with potential losses.


We shall not be discussing this extensive literature here. Let us only observe that much work within this line of research has been done with regard to the so-called mean-variance hedging; we refer to the recent paper by Schweizer (2001) for an exhaustive survey of relevant results. It is worth stressing that the interpretation of the term “mean-variance hedging”, as defined in these works, is entirely different from what is meant here by mean-variance hedging.

The optimization techniques used in this part are based on mean-variance portfolio selection in continuous time. Probably the first work in this area was the paper by Zhou and Li (2000), who used the embedding technique and linear-quadratic (LQ) optimal control theory to solve the continuous-time, mean-variance problem with assets having deterministic diffusion coefficients. They essentially ended up with a problem that was inherently an indefinite stochastic LQ control problem, the theory of which has been developed only very recently (see, e.g., Yong and Zhou (1999), Chapter 6). In subsequent works, the techniques of stochastic LQ optimal control were heavily exploited in order to solve more sophisticated variants of the mean-variance portfolio selection in continuous time. For instance, Li et al. (2001) introduced a constraint on short-selling, Lim and Zhou (2002) allowed for stocks which are modeled by processes having random drift and diffusion coefficients, Zhou and Yin (2004) featured assets in a regime switching market, and Bielecki et al. (2004b) solved the problem with a positivity constraint imposed on the wealth process. An excellent survey of most of these results is presented in Zhou (2003), who also provided a number of examples that illustrate the similarities as well as differences between the continuous-time and single-period settings.

7 Mean-Variance Pricing and Hedging

We consider an economy in continuous time, t ∈ [0, T∗], and the underlying probability space $(\Omega, \mathcal{G}, P)$ endowed with a one-dimensional standard Brownian motion W (with respect to its natural filtration). The probability P plays the role of the statistical probability. We denote by F the P-augmentation of the filtration generated by W. Consider an agent who initially has two liquid assets available to invest in:

• a risky asset whose price dynamics are

\[
dZ^1_t = Z^1_t\big(\nu\,dt + \sigma\,dW_t\big), \qquad Z^1_0 > 0,
\]

for some constants ν and σ > 0,

• a money market account whose price dynamics under P are

\[
dZ^2_t = rZ^2_t\,dt, \qquad Z^2_0 = 1,
\]

where r is a constant interest rate.


Suppose for the moment that $\mathcal{G} = \mathcal{F}_{T^*}$. It is well known that in this case the underlying market, consisting of the two above assets, is complete. Thus the fair value of any contingent claim X which settles at time T ≤ T∗, and thus is formally defined as an $\mathcal{F}_T$-measurable random variable, is the (unique) arbitrage price of X, denoted as $\pi_0(X)$ in what follows.

Now let $\mathbb{H}$ be another filtration in $(\Omega, \mathcal{G}, P)$, which satisfies the usual conditions. We consider the enlarged filtration $\mathbb{G} = \mathbb{F} \vee \mathbb{H}$ and we postulate that $\mathcal{G} = \mathcal{G}_{T^*}$. We shall refer to $\mathbb{G}$ as the full filtration; the Brownian filtration $\mathbb{F}$ will be called the reference filtration. We make an important assumption that W is a standard Brownian motion with respect to the full filtration $\mathbb{G}$ under the probability P.

Let $\varphi^i_t$ represent the number of shares of asset i held in the agent's portfolio at time t. We consider trading strategies φ = (φ¹, φ²), where φ¹ and φ² are G-predictable processes. A strategy φ is self-financing if

\[
V_t(\varphi) = V_0(\varphi) + \int_0^t \varphi^1_u\,dZ^1_u + \int_0^t \varphi^2_u\,dZ^2_u, \qquad \forall\, t \in [0, T^*],
\]

where $V_t(\varphi) = \varphi^1_t Z^1_t + \varphi^2_t Z^2_t$ is the wealth of φ at time t. Thus, we postulate the absence of outside endowments and/or consumption.

Definition 7. We say that a self-financing strategy φ is admissible on the interval [0, T] if and only if for any t ∈ [0, T] the wealth $V_t(\varphi)$ is a P-square-integrable random variable.

The condition

\[
E_P\Big(\int_0^T (\varphi^i_u Z^i_u)^2\,du\Big) < \infty, \qquad i = 1, 2,
\]

is manifestly sufficient for the admissibility of φ on [0, T]. Let us fix T and let us denote by Φ(G) the linear space of all admissible trading strategies on the finite interval [0, T].

Suppose that the agent has at time t = 0 a positive amount v > 0 available for investment (we shall refer to v as the initial endowment). It is easily seen that for any φ ∈ Φ(G) the wealth process satisfies the following SDE

\[
dV^v_t(\varphi) = rV^v_t(\varphi)\,dt + \varphi^1_t\big(dZ^1_t - rZ^1_t\,dt\big), \qquad V^v_0(\varphi) = v.
\]

This shows that the wealth at time t depends exclusively on the initial endowment v and the component φ¹ of a self-financing strategy φ.
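The wealth equation above lends itself to a simple numerical experiment. The sketch below is an Euler discretization of that SDE along one simulated path of the risky asset, with the position in the risky asset supplied as a function. The function name `simulate_wealth`, the time grid, the seed, and all parameter values are assumptions made for illustration only.

```python
import math
import random


def simulate_wealth(v, phi1, nu, sigma, r, T, n_steps, z1_0=1.0, seed=0):
    """Euler scheme for dV = r V dt + phi1 (dZ1 - r Z1 dt) on one path of Z1.

    phi1 is a callable (t, z1, wealth) -> number of shares of the risky asset,
    evaluated at the start of each step. Returns (terminal wealth, terminal Z1).
    """
    rng = random.Random(seed)
    dt = T / n_steps
    z1, wealth = z1_0, v
    for k in range(n_steps):
        t = k * dt
        d_w = rng.gauss(0.0, math.sqrt(dt))
        # Exact one-step update of the geometric Brownian motion Z1.
        z1_next = z1 * math.exp((nu - 0.5 * sigma**2) * dt + sigma * d_w)
        shares = phi1(t, z1, wealth)
        wealth += r * wealth * dt + shares * ((z1_next - z1) - r * z1 * dt)
        z1 = z1_next
    return wealth, z1


# Hypothetical parameters.
v0, r, T = 100.0, 0.05, 1.0

# Money market only: the wealth should be close to v0 * exp(r * T).
cash_only, _ = simulate_wealth(v0, lambda t, z, w: 0.0,
                               nu=0.08, sigma=0.2, r=r, T=T, n_steps=10_000)

# Buy and hold: everything in the risky asset (Z1 starts at 1).
shares_held = v0 / 1.0
stock_only, z1_T = simulate_wealth(v0, lambda t, z, w: shares_held,
                                   nu=0.08, sigma=0.2, r=r, T=T, n_steps=10_000)
```

Only the initial endowment and the risky position enter the recursion, in line with the remark that φ² is not needed to determine the wealth.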

Now, imagine that a new investment opportunity becomes available for the agent. Namely, the agent may purchase at time t = 0 a contingent claim X, whose corresponding cash flow of X units of cash occurs at time T. We assume that X is not an $\mathcal{F}_T$-measurable random variable. Notice that this requirement alone may not suffice for the non-attainability of X. Indeed, in the present setup, we have the following definition of attainability.


Definition 8. A contingent claim X is attainable if there exists a strategy φ ∈ Φ(G) such that $X = V_T(\varphi)$ or, equivalently,

\[
X = V_0(\varphi) + \int_0^T \varphi^1_u\,dZ^1_u + \int_0^T \varphi^2_u\,dZ^2_u.
\]

If a claim X can be replicated by means of a trading strategy φ ∈ Φ(F), we shall say that X is F-attainable. According to the definition of admissibility, the square-integrability of X under P is a necessary condition for attainability. Notice, however, that it may happen that X is not an $\mathcal{F}_T$-measurable random variable, but it represents an attainable contingent claim according to the definition above.

Suppose now that the claim X under consideration is not attainable. The main question that we want to study is: how much would the agent be willing to pay at time t = 0 for X, and how should the agent hedge her investment? A symmetric study can be conducted for an agent creating such an investment opportunity by selling the claim. In what follows, we shall first present our results in a general framework of a generic $\mathcal{G}_T$-measurable claim; then we shall examine a particular case of defaultable claims.

7.1 Mean-Variance Portfolio Selection

We postulate that the agent's objective for investment is based on the classical mean-variance portfolio selection. Let $\mathbb{V}_P(Z)$ be the variance under P of a random variable Z. For any fixed date T, any initial endowment v > 0, and any given d ∈ ℝ, the agent is interested in solving the following problem:

Problem MV(d, v): Minimize $\mathbb{V}_P(V^v_T(\varphi))$ over all strategies φ ∈ Φ(G), subject to $E_P V^v_T(\varphi) \ge d$.

We shall show that, provided the parameters d and v satisfy certain additional conditions, the above problem admits a solution, so that there exists an optimal trading strategy, say φ∗(d, v). Let V∗(d, v) = V(φ∗(d, v)) stand for the optimal wealth process, and let us denote by v∗(d, v) the value of the variance $\mathbb{V}_P(V^*_T(d, v))$.

For simplicity of presentation, we did not postulate above that the agent's wealth should be non-negative at any time. Problem MV(d, v) with this additional restriction has been recently studied in Bielecki et al. (2004b).

Remark. It is apparent that the problem MV(d, v) is non-trivial only if $d > ve^{rT}$. Otherwise, investing in the money market alone generates the wealth process $V^v_t(\varphi) = ve^{rt}$, which obviously satisfies the terminal condition $E_P V^v_T(\varphi) = ve^{rT} \ge d$, and for which the variance of the terminal wealth $V^v_T(\varphi)$ is zero. Thus, when considering the problem MV(d, v) we shall always assume that $d > ve^{rT}$. Put another way, we shall only consider trading strategies φ for which the expected return satisfies $E_P(V^v_T(\varphi)/v) > e^{rT}$, that is, it is strictly higher than the return on the money market account.


Assume that a claim X is available for purchase at time t = 0. We postulate that the random variable X is $\mathcal{G}_T$-measurable and square-integrable under P. The agent shall decide whether to purchase X, and what is the maximal price she could offer for X. According to the mean-variance paradigm, her decision will be based on the following reasoning. First, for any p ∈ [0, v] the agent needs to solve the related mean-variance problem.

Problem MV(d, v, p, X): Minimize $\mathbb{V}_P(V^{v-p}_T(\varphi) + X)$ over all trading strategies φ ∈ Φ(G), subject to $E_P(V^{v-p}_T(\varphi) + X) \ge d$.

We shall show that if d, v, p and X satisfy certain sufficient conditions, then there exists an optimal strategy for this problem. We shall denote by φ∗(d, v, p, X) this optimal strategy, and by $V^*_T(d, v, p, X)$ the corresponding value of $V^{v-p}_T(\varphi^*(d, v, p, X))$, and we set $v^*(d, v, p, X) = \mathbb{V}_P\big(V^*_T(d, v, p, X) + X\big)$.

It is reasonable to expect that the agent will be willing to pay for the claim X a price that is no more than (by convention, sup ∅ = −∞)

\[
p^{d,v}(X) := \sup\big\{\, p \in [0, v] : \mathrm{MV}(d, v, p, X) \text{ admits a solution and } v^*(d, v, p, X) \le v^*(d, v) \,\big\}.
\]

This leads to the following definition of mean-variance price and hedging strategy.

Definition 9. The number $p^{d,v}(X)$ is called the buying agent's mean-variance price of X. The optimal trading strategy $\varphi^*(d, v, p^{d,v}(X), X)$ is called the agent's mean-variance hedging strategy for X.

Of course, in order to make the last definition operational, we need to be able to solve explicitly problems MV(d, v) and MV(d, v, p, X), at least in some special cases of common interest. These issues will be examined in some detail in the remaining part of this note, first for the special case of F-adapted trading strategies (see Section 8), and subsequently, in the general case of G-adapted strategies (see Section 9).

Remark. Let us denote $\mu_X = E_P X$. The inequality $E_P(V^{v-p}_T(\varphi) + X) \ge d$ is equivalent to $E_P V^{v-p}_T(\varphi) \ge d - \mu_X$. Observe that, unlike in the case of the problem MV(d, v), the problem MV(d, v, p, X) may be non-trivial even if $d - \mu_X \le e^{rT}(v - p)$. Although investing in a money market alone will produce in this case a wealth process for which the condition $E_P V^{v-p}_T(\varphi) \ge d - \mu_X$ is manifestly satisfied, the corresponding variance $\mathbb{V}_P\big(V^{v-p}_T(\varphi) + X\big) = \mathbb{V}_P(X)$ is not necessarily minimal.

Financial Interpretation

Let us denote by $\mathcal{N}(X)$ the no-arbitrage interval for the claim X, that is, $\mathcal{N}(X) = [\pi^l(X), \pi^u(X)]$. It may well happen that the mean-variance price $p^{d,v}(X)$ is outside this interval. Since this possibility may appear as an unwanted feature of the approach to pricing and hedging presented in this note, we shall comment briefly on this issue. When we consider the valuation of a claim X from the perspective of the entire market, then we naturally apply the no-arbitrage paradigm.

According to the no-arbitrage paradigm, the financial market as a whole will accept only those prices of a financial asset which fall into the no-arbitrage interval. Prices from outside this interval cannot be sustained in the longer term due to market forces, which will tend to eliminate any arbitrage opportunity.

Now, let us consider the same issue from the perspective of an individual. Suppose that an individual investor is interested in putting some of her initial endowment v > 0 into an investment opportunity provided by some claim X. Thus, the investor needs to decide whether to acquire the investment opportunity, and if so then how much to pay for it, based on her overall attitude towards risk and reward.

The number $p^{d,v}(X)$ is the price that the investor is willing to pay for the investment opportunity X, given her initial capital v, given her attitude towards risk and reward, and given the primary market. The investor “submits” her price to the market. Now, suppose that the no-arbitrage interval for X recognized by the market is $\mathcal{N}(X)$. If it happens that $p^{d,v}(X) \in \mathcal{N}(X)$ then the investor's bid price for X can be accepted by the market. In the opposite case, the investor's bid price may not be accepted by the market, and the investor may not enter into the investment opportunity.

8 Strategies Adapted to the Reference Filtration

In this section, we shall solve the problem MV(d, v) under the restriction that trading strategies are based on the reference filtration F. In other words, we postulate that φ belongs to the class Φ(F) of all admissible and F-predictable strategies φ. In this case, we shall say that a strategy φ is F-admissible. The assumption that φ is F-admissible implies, of course, that the terminal wealth $V^v_T(\varphi)$ is an $\mathcal{F}_T$-measurable random variable.

8.1 Solution to MV(d, v) in the Class Φ(F)

A general version of the problem MV(d, v) has been studied in Bielecki et al. (2004b). Because our problem is a very special version of the general one, we give below a complete solution tailored to the present set-up.

Reduction to Zero Interest Rate Case

Recall our standing assumption that $d > ve^{rT}$. Problem MV(d, v) is clearly equivalent to: minimize the variance $\mathbb{V}_P(e^{-rT}V_T(\varphi))$ under the constraint

\[
E_P\big(e^{-rT}V_T(\varphi)\big) \ge e^{-rT}d.
\]

For the sake of notational simplicity, we shall write $V_t$ instead of $V^v_t(\varphi)$. We set $\hat V_t = V_t (Z^2_t)^{-1} = e^{-rt}V_t$ and, likewise, $\hat Z^1_t = e^{-rt}Z^1_t$, so that

\[
d\hat V_t = \varphi^1_t\,d\hat Z^1_t = \varphi^1_t \hat Z^1_t\big(\hat\nu\,dt + \sigma\,dW_t\big), \tag{62}
\]

where we denote $\hat\nu = \nu - r$. So we can and do restrict our attention to the case r = 0. Thus, in what follows, we shall have $Z^2_t = 1$ for every t ∈ ℝ₊. In the rest of this note, unless explicitly stated otherwise, we assume that d > v.

Decomposition of Problem MV(d, v)

Let Q be a (unique) equivalent martingale measure on $(\Omega, \mathcal{F}_{T^*})$ for the underlying market. It is easily seen that

\[
\frac{dQ}{dP}\Big|_{\mathcal{F}_t} = \eta_t, \qquad \forall\, t \in [0, T^*],
\]

where we denote by η the Radon-Nikodym density process. Specifically, we have

\[
d\eta_t = -\theta\eta_t\,dW_t, \qquad \eta_0 = 1, \tag{63}
\]

or, equivalently,

\[
\eta_t = \exp\big({-\theta W_t} - \tfrac12\theta^2 t\big),
\]

where θ = ν/σ (recall that we have formally reduced the problem to the case r = 0). The process η is an F-martingale under P. Moreover,

\[
E_P(\eta_T^2 \mid \mathcal{F}_t) = \eta_t^2 \exp\big(\theta^2(T - t)\big), \tag{64}
\]

and thus $E_P(\eta_t^2) = \exp(\theta^2 t)$ for t ∈ [0, T∗]. It is easily seen that the price Z¹ is an F-martingale under Q, since

\[
dZ^1_t = \sigma Z^1_t\,d(W_t + \theta t) = \sigma Z^1_t\,d\widetilde W_t \tag{65}
\]

for the Q-Brownian motion $\widetilde W_t = W_t + \theta t$. The measure Q is thus the equivalent martingale measure for our primary market.
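The moment identities for the density process can be checked numerically, since the terminal value of η is an explicit function of a Gaussian variable. The sketch below integrates powers of η against the standard normal density with the trapezoidal rule. The function name `density_moment`, the grid, and the parameter values are assumptions made for illustration.

```python
import math


def density_moment(theta, horizon, power, n=20000, width=12.0):
    """E_P[eta^power] for eta = exp(-theta * W - theta^2 * horizon / 2),
    with W ~ N(0, horizon), by the trapezoidal rule on [-width, width]."""
    h = 2.0 * width / n
    total = 0.0
    for k in range(n + 1):
        x = -width + k * h                  # standardized Gaussian variable
        w = x * math.sqrt(horizon)          # value of the Brownian motion
        eta = math.exp(-theta * w - 0.5 * theta**2 * horizon)
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * eta**power * math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    return total * h


# Hypothetical market price of risk and horizon.
theta, T = 0.4, 2.0

first_moment = density_moment(theta, T, power=1)    # martingale property: should be 1
second_moment = density_moment(theta, T, power=2)   # should equal exp(theta^2 * T)
```

Applying the same computation with horizon T − t to the ratio of η at the two dates gives the conditional version (64), because that ratio is independent of the information available at time t.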

From (62), we have that

\[
V_t = v + \int_0^t \varphi^1_u\,dZ^1_u = v + \int_0^t \varphi^1_u \sigma Z^1_u\,d\widetilde W_u. \tag{66}
\]

Recall that if φ is an F-admissible strategy, that is, φ ∈ Φ(F), then $V_T$ is an $\mathcal{F}_T$-measurable random variable, which is P-square-integrable.

Let X be a P-square-integrable and $\mathcal{F}_T$-measurable random variable. It is easily seen that X is integrable with respect to Q (since $\eta_T$ is square-integrable with respect to P). The existence of a self-financing trading strategy that replicates X can be justified by the predictable representation theorem combined with the Bayes formula. We thus have the following result.


Lemma 11. Let X be a P-square-integrable and $\mathcal{F}_T$-measurable random variable. Then X is an F-attainable contingent claim, i.e., there exists a strategy $\varphi^X$ in Φ(F) such that $V_T(\varphi^X) = X$.

We shall argue that problem MV(d, v) can be split into two problems (see also Pliska (2001) and Bielecki et al. (2004b) in this regard). We first focus on the optimal terminal wealth $V^*_T(d, v)$. Let $L^2(\Omega, \mathcal{F}_T, P)$ denote the collection of P-square-integrable random variables that are $\mathcal{F}_T$-measurable. Thus the first problem we need to solve is:

Problem MV1: Minimize $\mathbb{V}_P(\xi)$ over all $\xi \in L^2(\Omega, \mathcal{F}_T, P)$, subject to $E_P\xi \ge d$ and $E_Q\xi = v$.

Lemma 12. Suppose that φ∗ = φ∗(d, v) solves the problem MV(d, v), and let V∗(d, v) = V(φ∗). Then the random variable $\xi^* = V^*_T(d, v)$ solves the problem MV1.

Proof. We argue by contradiction. Suppose that there exists a random variable $\xi \in L^2(\Omega, \mathcal{F}_T, P)$ such that $E_P\xi \ge d$, $E_Q\xi = v$ and $\mathbb{V}_P(\xi) < \mathbb{V}_P(\xi^*)$. Since ξ is P-square-integrable and $\mathcal{F}_T$-measurable, it represents an attainable contingent claim, so that there exists an F-admissible strategy φ such that $\xi = V_T(\varphi)$. Its initial wealth equals $E_Q\xi = v$, so φ is feasible for MV(d, v) and yields a strictly smaller variance than φ∗. Of course, this contradicts the assumption that φ∗ solves MV(d, v).

Denoting by ξ∗ the optimal solution to problem MV1, the second problem is:

Problem MV2: Find an F-admissible strategy φ∗ such that VT (φ∗) = ξ∗.

Since the next result is analogous to Theorem 2.1 in Bielecki et al. (2004b), its proof is omitted. It demonstrates that solving problem MV(d, v) is indeed equivalent to successfully solving problems MV1 and MV2. In the formulation of the result below we make use of a backward stochastic differential equation (BSDE). The reader can refer to El Karoui and Mazliak (1997), El Karoui and Quenez (1997), El Karoui et al. (1997), Ma and Yong (1999) or to the survey by Buckdahn (2000) for an introduction to the theory of backward stochastic differential equations and its applications in finance.

Proposition 14. Suppose that the problem MV1 has a solution $\xi^*$. The following BSDE

$$ dv_t = -\theta z_t\, dt + z_t\, dW_t, \quad v_T = \xi^*, \quad t \in [0, T], \tag{67} $$

has a unique, $\mathbb{P}$-square-integrable solution, denoted as $(v^*, z^*)$, which is adapted to $\mathbb{F}$. Moreover, if we define a process $\phi^{1*}$ by

$$ \phi^{1*}_t = z^*_t\, (\sigma Z^1_t)^{-1}, \quad \forall\, t \in [0, T], $$

then the $\mathbb{F}$-admissible trading strategy $\phi^* = (\phi^{1*}, \phi^{2*})$ with the wealth process $V_t(\phi^*) = v^*_t$ solves the problem MV$(d, v)$.


For the last statement, recall that if the first component $\phi^1$ of a self-financing strategy $\phi$ and its wealth process $V(\phi)$ are known, then the component $\phi^2$ is uniquely determined through the equality $V_t(\phi) = \phi^1_t Z^1_t + \phi^2_t Z^2_t$.

Remark. In what follows, we shall derive closed-form expressions for φ∗ and V (φ∗). It will be easily seen that the process V (φ∗) is not only P-square-integrable, but also Q-square-integrable. It should be stressed that Proposition 14 will not be used in the derivation of a solution to problem MV(d, v). In fact, we shall find a solution to MV(d, v) through explicit calculations.

Solution of Problem MV1

In order to make the problem MV1 non-trivial, we need to make an additional assumption that $\theta \ne 0$. Indeed, if $\theta = 0$ then we have $\mathbb{P} = \mathbb{Q}$, and thus the problem MV1 becomes:

Problem MV1: Minimize $\mathbb{V}_{\mathbb{P}}(\xi)$ over all $\xi \in L^2(\Omega, \mathcal{F}_T, \mathbb{P})$, subject to $\mathbb{E}_{\mathbb{P}}\xi \ge d$ and $\mathbb{E}_{\mathbb{P}}\xi = v$.

It is easily seen that this problem admits a solution for $d = v$ only, and the optimal solution is trivial, in the sense that the optimal variance is null. Consequently, for $\theta = 0$, the solution to MV$(d, v)$ exists if and only if $d = v$, and it is trivial: $\phi^* = (0, 1)$. Let us reiterate that we postulate that $d > v$ in order to avoid trivial solutions to MV$(d, v)$.

From now on, we assume that $\theta \ne 0$. We begin with the following auxiliary problem:

Problem MV1A: Minimize $\mathbb{V}_{\mathbb{P}}(\xi)$ over all $\xi \in L^2(\Omega, \mathcal{F}_T, \mathbb{P})$, subject to $\mathbb{E}_{\mathbb{P}}\xi = d$ and $\mathbb{E}_{\mathbb{Q}}\xi = v$.

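The previous problem is manifestly equivalent to:

Problem MV1B: Minimize $\mathbb{E}_{\mathbb{P}}\xi^2$ over all $\xi \in L^2(\Omega, \mathcal{F}_T, \mathbb{P})$, subject to $\mathbb{E}_{\mathbb{P}}\xi = d$ and $\mathbb{E}_{\mathbb{Q}}\xi = v$.

Since $\mathbb{E}_{\mathbb{Q}}\xi = \mathbb{E}_{\mathbb{P}}(\eta_T \xi)$, the corresponding Lagrangian is

$$ \mathbb{E}_{\mathbb{P}}\big(\xi^2 - \lambda_1 \xi - \lambda_2 \eta_T \xi\big) - d^2 + \lambda_1 d + \lambda_2 v. $$

The optimal random variable is given by $2\xi^* = \lambda_1 + \lambda_2 \eta_T$, where the Lagrange multipliers satisfy

$$ 2d = \lambda_1 + \lambda_2, \qquad 2v = \lambda_1 + \lambda_2 \exp(\theta^2 T). $$

Hence, we have

$$ \xi^* = \big(d e^{\theta^2 T} - v + (v - d)\eta_T\big)\big(e^{\theta^2 T} - 1\big)^{-1}, \tag{68} $$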


and the corresponding minimal variance is

$$ \mathbb{V}_{\mathbb{P}}(\xi^*) = \mathbb{E}_{\mathbb{P}}(\xi^*)^2 - d^2 = (d - v)^2 \big(e^{\theta^2 T} - 1\big)^{-1}. \tag{69} $$

Since we assumed that d > v, the minimal variance is an increasing function of the parameter d for any fixed value of the initial endowment v. Hence the constraint EPξ ≥ d is binding at the optimum, and we conclude that we have solved not only the problem MV1A, but the problem MV1 as well. We thus have the following result.

Proposition 15. The solution ξ∗ to problem MV1 is given by (68) and the minimal variance VP(ξ∗) is given by (69).
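The two constraints and the minimal variance of Proposition 15 can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes that the density has the Girsanov form ηT = exp(θWT − θ²T/2), which is consistent with EP(ηT²) = exp(θ²T), and the parameter values and the helper names `eta_T`, `xi_star` and `expect` are purely illustrative.

```python
import math

def eta_T(z, theta, T):
    # Assumed form of the density: eta_T = exp(theta*W_T - theta^2*T/2), W_T = sqrt(T)*z.
    return math.exp(theta * math.sqrt(T) * z - 0.5 * theta**2 * T)

def xi_star(z, d, v, theta, T):
    # Optimal terminal wealth, formula (68), as a function of the normal draw z.
    e = math.exp(theta**2 * T)
    return (d * e - v + (v - d) * eta_T(z, theta, T)) / (e - 1.0)

def expect(f, lo=-12.0, hi=12.0, n=20000):
    # E[f(Z)] for Z ~ N(0,1), computed with the composite trapezoidal rule.
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        z = lo + i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * f(z) * math.exp(-0.5 * z * z)
    return total * h / math.sqrt(2.0 * math.pi)

# Illustrative parameters (hypothetical values, with d > v and theta != 0).
d, v, theta, T = 1.2, 1.0, 0.4, 2.0

mean_P = expect(lambda z: xi_star(z, d, v, theta, T))                       # should equal d
mean_Q = expect(lambda z: eta_T(z, theta, T) * xi_star(z, d, v, theta, T))  # should equal v
var_P = expect(lambda z: xi_star(z, d, v, theta, T) ** 2) - d**2
var_closed = (d - v) ** 2 / (math.exp(theta**2 * T) - 1.0)                  # formula (69)
```

The quadrature is used instead of simulation so that the check is deterministic; `mean_Q` is computed as EP(ηT ξ∗), exactly as in the Lagrangian above.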

For an alternative approach to Problem MV1, in a fairly general setup, see Jankunas (2001).

Solution of Problem MV2

We maintain the assumption that $\theta \ne 0$. Thus, the optimal wealth for the terminal time $T$ is given by (68), that is, $V_T(\phi^*) = \xi^*$. Our goal is to determine an $\mathbb{F}$-admissible strategy $\phi^*$ for which the last equality is indeed satisfied. In view of (66), it suffices to find $\phi^{1*}$ such that the process $V^*_t$ given by

$$ V^*_t = v + \int_0^t \phi^{1*}_u\, dZ^1_u \tag{70} $$

satisfies $V^*_T = \xi^*$, and the strategy $\phi^* = (\phi^{1*}, \phi^{2*})$, where $\phi^{2*}$ is derived from $V^*_t = \phi^{1*}_t Z^1_t + \phi^{2*}_t Z^2_t$, is $\mathbb{F}$-admissible.

To this end, let us introduce an $\mathbb{F}$-martingale $V^*$ under $\mathbb{Q}$ by setting $V^*_t = \mathbb{E}_{\mathbb{Q}}(\xi^* \mid \mathcal{F}_t)$ (the integrability of $\xi^*$ under $\mathbb{Q}$ is rather obvious).

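It is easy to see that $V^*_T = \xi^*$ and $V^*_0 = v$. It thus remains to find the process $\phi^{1*}$. Using (64), we obtain

$$ V^*_t = \big(d e^{\theta^2 T} - v + (v - d)\eta_t e^{\theta^2 (T-t)}\big)\big(e^{\theta^2 T} - 1\big)^{-1}. $$

Consequently, in view of (63) and (65), we have

$$
\begin{aligned}
dV^*_t &= \frac{v - d}{e^{\theta^2 T} - 1}\Big(e^{\theta^2 (T-t)}\, d\eta_t - \eta_t e^{\theta^2 (T-t)} \theta^2\, dt\Big) \\
&= e^{\theta^2 (T-t)}\, \frac{\theta \eta_t (v - d)}{e^{\theta^2 T} - 1}\, (dW_t - \theta\, dt) \\
&= e^{\theta^2 (T-t)}\, \frac{d - v}{e^{\theta^2 T} - 1}\, \frac{\nu \eta_t}{\sigma^2}\, \frac{dZ^1_t}{Z^1_t}.
\end{aligned}
$$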

This shows that we may choose


$$ \phi^{1*}_t = e^{\theta^2 (T-t)}\, \frac{d - v}{e^{\theta^2 T} - 1}\, \frac{\nu}{\sigma^2}\, \frac{\eta_t}{Z^1_t}. \tag{71} $$

It is clear that φ∗ is F-admissible, since it is F-adapted, self-financing, and Vt(φ∗) is P-square-integrable for every t ∈ [0, T ].

Solution of Problem MV(d, v)

By virtue of Lemma 12, we conclude that $\phi^*$ solves MV$(d, v)$. In view of (69), the variance under $\mathbb{P}$ of the terminal wealth of the optimal strategy is

$$ v^*(d, v) = \mathbb{E}_{\mathbb{P}}(V^*_T)^2 - d^2 = \frac{(d - v)^2}{e^{\theta^2 T} - 1}. $$

Let us stress that since we did not impose any no-bankruptcy condition, that is, we do not require that the agent's wealth is non-negative, d can be any number greater than v.

We are in a position to state the following result, which summarizes the analysis above. For a fixed $T > 0$, we denote $\rho(\theta) = e^{\theta^2 T}(e^{\theta^2 T} - 1)^{-1}$ and $\eta_t(\theta) = \eta_t e^{-\theta^2 t}$, so that $\eta_0(\theta) = 1$.

Proposition 16. Assume that $\theta \ne 0$ and let $d > v$. Then a solution $\phi^*(d, v) = (\phi^{1*}(d, v), \phi^{2*}(d, v))$ to MV$(d, v)$ is given by

$$ \phi^{1*}_t(d, v) = (d - v)\rho(\theta)\, \frac{\nu\, \eta_t(\theta)}{\sigma^2 Z^1_t} \tag{72} $$

and $V^*_t(d, v) = V_t(\phi^*(d, v)) = \phi^{1*}_t(d, v) Z^1_t + \phi^{2*}_t(d, v)$, where the optimal wealth process equals

$$ V^*_t(d, v) = v + (d - v)\rho(\theta)\big(1 - \eta_t(\theta)\big). \tag{73} $$

The minimal variance $v^*(d, v)$ is given by

$$ v^*(d, v) = \mathbb{E}_{\mathbb{P}}(V^*_T(d, v))^2 - d^2 = \frac{(d - v)^2}{e^{\theta^2 T} - 1}. \tag{74} $$

Notice that the optimal trading strategy $\phi^*(d, v)$, the minimal variance $v^*(d, v)$ and the optimal gains process $G^*_t(d, v) = V^*_t(d, v) - v$ depend exclusively on the difference $d - v > 0$, rather than on parameters $d$ and $v$ themselves.
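Formulas (72) and (73) translate directly into code. The sketch below is illustrative only and rests on two assumptions that are not restated in this section: the density process is taken as ηt = exp(θWt − θ²t/2), and the market price of risk is taken as θ = −ν/σ, which is what the chain of equalities for dV∗ above suggests. The function names are hypothetical.

```python
import math

def rho(theta, T):
    # rho(theta) = e^{theta^2 T} / (e^{theta^2 T} - 1)
    e = math.exp(theta**2 * T)
    return e / (e - 1.0)

def eta(t, w, theta):
    # Assumed density process: eta_t = exp(theta*W_t - theta^2*t/2), with W_t = w.
    return math.exp(theta * w - 0.5 * theta**2 * t)

def optimal_wealth(t, w, d, v, theta, T):
    # Formula (73): V*_t = v + (d - v) * rho(theta) * (1 - eta_t(theta)),
    # where eta_t(theta) = eta_t * exp(-theta^2 t).
    eta_disc = eta(t, w, theta) * math.exp(-theta**2 * t)
    return v + (d - v) * rho(theta, T) * (1.0 - eta_disc)

def optimal_position(t, w, z1, d, v, nu, sigma, T):
    # Formula (72): units of the risky asset Z^1 held at time t when Z^1_t = z1.
    theta = -nu / sigma  # assumption: market price of risk implied by the dV* computation
    eta_disc = eta(t, w, theta) * math.exp(-theta**2 * t)
    return (d - v) * rho(theta, T) * nu * eta_disc / (sigma**2 * z1)
```

At t = 0 the wealth equals the initial endowment v, and at t = T the wealth coincides with the random variable ξ∗ of (68); both properties follow from (73) and can be verified for any value of WT.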

Efficient Portfolio

As observed above, the function $f(d) := v^*(d, v)$ is (strictly) increasing for $d \ge v$. Consider the following problem (as usual, for $d \ge v$):


Problem ME$(d, v)$: Maximize $\mathbb{E}_{\mathbb{P}} V^v_T(\phi)$ over all strategies $\phi \in \Phi(\mathbb{G})$, subject to $\mathbb{V}_{\mathbb{P}}(V^v_T(\phi)) = v^*(d, v)$.

Denote the maximal expectation in the above problem by µ∗(d, v). In view of the strict monotonicity of the function f(d) for d ≥ v, it is clear that µ∗(d, v) = d. Consequently, the minimum variance portfolio φ∗ is in fact an efficient portfolio.

8.2 Solution to MV(d, v, p, X) in the Class Φ(F)

Consider first the special case of an attainable claim, which is $\mathcal{F}_T$-measurable. Subsequently, we shall show that in general it suffices to decompose a general claim $X$ into an attainable component $\tilde X = \mathbb{E}_{\mathbb{P}}(X \mid \mathcal{F}_T) \in L^2(\Omega, \mathcal{F}_T, \mathbb{P})$, and a component $X - \tilde X$ which is orthogonal in $L^2(\Omega, \mathcal{G}_T, \mathbb{P})$ to the subspace $L^2(\Omega, \mathcal{F}_T, \mathbb{P})$ of admissible terminal wealths.

Case of an Attainable Claim

We shall verify that the mean-variance price coincides with the (unique) arbitrage price for any contingent claim that is attainable. Of course, this feature is a standard requirement for any reasonable valuation mechanism for contingent claims. Since in this section we consider only $\mathbb{F}$-adapted strategies, we postulate here that a claim $X$ is $\mathcal{F}_T$-measurable; the general case of a $\mathcal{G}_T$-measurable claim is considered in Section 9.1. Let $\phi^X \in \Phi(\mathbb{F})$ be a replicating strategy for $X$, so that $X$ is $\mathbb{F}$-attainable, and let $\pi_0(X) = \mathbb{E}_{\mathbb{Q}} X$ be the arbitrage price of $X$. Since $\Phi(\mathbb{F})$ is a linear space, it is easily seen that $\Phi(\mathbb{F}) = \Phi(\mathbb{F}) + \phi^X = \Phi(\mathbb{F}) - \phi^X$. The following lemma is thus easy to prove.

Lemma 13. Let $X$ be an $\mathbb{F}$-attainable contingent claim. In this situation, problem MV$(d, v, p, X)$ is equivalent to problem MV$(d, \tilde v)$ with $\tilde v = v - p + \pi_0(X)$.

Equivalence of problems MV$(d, v, p, X)$ and MV$(d, \tilde v)$ is understood in the following way: first, the minimal variance for both problems is identical. Second, if a strategy $\psi^*$ is a solution to MV$(d, v - p + \pi_0(X))$, then a strategy $\phi^* = \psi^* - \phi^X$ is a solution to the original problem MV$(d, v, p, X)$.

Corollary 9. Suppose that an $\mathcal{F}_T$-measurable random variable $X$ represents an $\mathbb{F}$-attainable claim.
(i) If the arbitrage price $\pi_0(X)$ satisfies $\pi_0(X) \in [0, v]$ then $p^{d,v}(X) = \pi_0(X)$.
(ii) If the arbitrage price $\pi_0(X)$ is strictly greater than $v$ then $p^{d,v}(X) = v$.

Proof. By definition, the mean-variance price of $X$ is the maximal value of $p \in [0, v]$ for which $v^*(d, v, p, X) = v^*(d, \tilde v) \le v^*(d, v)$, where $\tilde v = v - p + \pi_0(X)$. Recall that we assume that $d > v$ so that, in view of (74),

$$ v^*(d, v) = \frac{(d - v)^2}{e^{\theta^2 T} - 1}. $$

By applying this result to MV$(d, \tilde v)$ we obtain

$$ v^*(d, v, p, X) = \frac{(d - v + p - \pi_0(X))^2}{e^{\theta^2 T} - 1} $$

provided that $d > v - p + \pi_0(X)$. Assume that $p > \pi_0(X)$. Then $d > v - p + \pi_0(X)$ and thus $v^*(d, v, p, X) > v^*(d, v)$, since manifestly $(d - v + p - \pi_0(X))^2 > (d - v)^2$ in this case. This shows that $p^{d,v}(X) \le \pi_0(X)$. Of course, for $p = \pi_0(X)$ we have the equality of minimal variances. We conclude that $p^{d,v}(X) = \pi_0(X)$ provided that $\pi_0(X) \in [0, v]$. This completes the proof of part (i).

To prove part (ii), let us assume that $\pi_0(X) > v$. In this case, it suffices to take $p = v$ and to check that $v^*(d, v, v, X) = v^*(d, \pi_0(X)) \le v^*(d, v)$. This is again rather obvious since for $v < \pi_0(X) < d$ we have $(d - \pi_0(X))^2 < (d - v)^2$, and for $\pi_0(X) \ge d$ we have $v^*(d, \pi_0(X)) = 0$.

Case of a Generic Claim

Consider an arbitrary $\mathcal{G}_T$-measurable claim $X$, which is $\mathbb{P}$-square-integrable. Recall that our goal is to solve the following problem for $0 \le p \le v$.

Problem MV$(d, v, p, X)$: Minimize $\mathbb{V}_{\mathbb{P}}(V^{v-p}_T(\phi) + X)$ over all trading strategies $\phi \in \Phi(\mathbb{F})$, subject to $\mathbb{E}_{\mathbb{P}}(V^{v-p}_T(\phi) + X) \ge d$.

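Let us denote by $\tilde X$ the conditional expectation $\mathbb{E}_{\mathbb{P}}(X \mid \mathcal{F}_T)$. Then, of course, $\mathbb{E}_{\mathbb{P}} \tilde X = \mathbb{E}_{\mathbb{P}} X$. Moreover, $\tilde X$ is an attainable claim and its arbitrage price at time 0 equals

$$ \pi_0(\tilde X) = \mathbb{E}_{\mathbb{Q}} \tilde X = \mathbb{E}_{\mathbb{P}}\big(\eta_T\, \mathbb{E}_{\mathbb{P}}(X \mid \mathcal{F}_T)\big) = \mathbb{E}_{\mathbb{P}}(\eta_T X) = \mathbb{E}_{\mathbb{Q}} X, $$

where $\mathbb{Q}$ is the martingale measure introduced in Section 8.1. Let $\phi^{\tilde X}$ stand for the replicating strategy for $\tilde X$ in the class $\Phi(\mathbb{F})$. Arguing as in the previous case, we conclude that the problem MV$(d, v, p, X)$ is equivalent to the following problem. We set here $\tilde p = p - \pi_0(\tilde X)$.

Problem MV$(d, v, \tilde p, X - \tilde X)$: Minimize $\mathbb{V}_{\mathbb{P}}(V^{v-\tilde p}_T(\phi) + X - \tilde X)$ over all trading strategies $\phi \in \Phi(\mathbb{F})$, subject to $\mathbb{E}_{\mathbb{P}}(V^{v-\tilde p}_T(\phi) + X - \tilde X) \ge d$.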

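Recall that $\mathbb{E}_{\mathbb{P}} \tilde X = \mathbb{E}_{\mathbb{P}} X$ and denote $\gamma_X = \mathbb{V}_{\mathbb{P}}(X - \tilde X)$. Observe that for any $\phi \in \Phi(\mathbb{F})$ we have

$$ \mathbb{V}_{\mathbb{P}}\big(V^{v-\tilde p}_T(\phi) + X - \tilde X\big) = \mathbb{V}_{\mathbb{P}}\big(V^{v-\tilde p}_T(\phi)\big) + \mathbb{V}_{\mathbb{P}}(X - \tilde X) = \mathbb{V}_{\mathbb{P}}\big(V^{v-\tilde p}_T(\phi)\big) + \gamma_X. $$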
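The additivity of variances used here relies only on the fact that the residual X − EP(X | FT) is uncorrelated with every FT-measurable random variable. A minimal sketch on a toy four-point probability space (an illustration only, with hypothetical names and payoffs, and exact rational arithmetic) makes this concrete:

```python
from fractions import Fraction
from itertools import product

# Toy space: omega = (market state, default state), all four points equally likely.
# The first coordinate plays the role of F_T; the second is the extra information in G_T.
omega = list(product((0, 1), repeat=2))
prob = {w: Fraction(1, 4) for w in omega}

def mean(f):
    return sum(prob[w] * f(w) for w in omega)

def var(f):
    m = mean(f)
    return mean(lambda w: (f(w) - m) ** 2)

def cond_exp_F(f):
    # E(f | F_T): average over the second coordinate for a fixed first coordinate.
    def g(w):
        mass = sum(prob[(w[0], b)] for b in (0, 1))
        return sum(prob[(w[0], b)] * f((w[0], b)) for b in (0, 1)) / mass
    return g

X = lambda w: Fraction(3 * w[0] + 2 * w[1] + 4 * w[0] * w[1])  # G_T-measurable claim
V = lambda w: Fraction(5 - 2 * w[0])                           # F_T-measurable terminal wealth
X_tilde = cond_exp_F(X)
residual = lambda w: X(w) - X_tilde(w)

lhs = var(lambda w: V(w) + residual(w))   # variance of wealth plus unhedgeable residual
rhs = var(V) + var(residual)              # sum of the two variances
```

Here `lhs` and `rhs` agree exactly, and the residual has zero mean, mirroring the identity above with `var(residual)` in the role of γX.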

The problem MV$(d, v, \tilde p, X - \tilde X)$ can thus be represented as follows. We denote $\tilde v = v - \tilde p$.

Problem MV$(d, \tilde v; \gamma_X)$: Minimize $\mathbb{V}_{\mathbb{P}}(V^{\tilde v}_T(\phi)) + \gamma_X$ over all trading strategies $\phi \in \Phi(\mathbb{F})$, subject to $\mathbb{E}_{\mathbb{P}}(V^{\tilde v}_T(\phi)) \ge d$.

Observe that the problem MV$(d, \tilde v; \gamma_X)$ is formally equivalent to the original problem MV$(d, v, p, X)$ in the following sense: first, the minimal variances for both problems are identical, more precisely, we have

$$ v^*(d, v, p, X) = v^*(d, \tilde v) + \gamma_X, $$

where $v^*(d, \tilde v)$ is the minimal variance for MV$(d, \tilde v)$. Second, if a strategy $\psi^*$ is a solution to problem MV$(d, \tilde v)$, then $\phi^* = \psi^* - \phi^{\tilde X}$ is a solution to MV$(d, v, p, X)$.

Remark. It is interesting to notice that a solution to MV$(d, \tilde v; \gamma_X)$ does not depend explicitly on the expected value of $X$ under $\mathbb{P}$. Hence, the minimal variance for the problem MV$(d, v, p, X)$ is independent of $\mu_X = \mathbb{E}_{\mathbb{P}} X$ as well, but, of course, it depends on the price $\pi_0(\tilde X) = \mathbb{E}_{\mathbb{Q}} X$, which may in fact coincide with $\mu_X$ under some circumstances.

In view of the arguments above, it suffices to consider the problem MV$(d, \tilde v)$, where $\tilde v = v - p + \mathbb{E}_{\mathbb{Q}} X$. Since the problem of this form has been already solved in Section 8.1, we are in a position to state the following result, which is an immediate consequence of Proposition 16. Recall that $\rho(\theta) = e^{\theta^2 T}(e^{\theta^2 T} - 1)^{-1}$ and $\eta_t(\theta) = \eta_t e^{-\theta^2 t}$, so that $\eta_0(\theta) = 1$. Finally, $\tilde v = v - p + \mathbb{E}_{\mathbb{Q}} \tilde X = v - p + \mathbb{E}_{\mathbb{Q}} X$.

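Proposition 17. Assume that $\theta \ne 0$.
(i) Suppose that $d > \tilde v$. Then a solution $\phi^*(d, v, p, X)$ to MV$(d, v, p, X)$ is given as $\phi^*(d, v, p, X) = \psi^*(d, \tilde v) - \phi^{\tilde X}$, where $\psi^*(d, \tilde v) = (\psi^{1*}(d, \tilde v), \psi^{2*}(d, \tilde v))$ is such that $\psi^{1*}(d, \tilde v)$ equals

$$ \psi^{1*}_t(d, \tilde v) = (d - \tilde v)\rho(\theta)\, \frac{\nu\, \eta_t(\theta)}{\sigma^2 Z^1_t} \tag{75} $$

and $\psi^{2*}(d, \tilde v)$ satisfies $\psi^{1*}_t(d, \tilde v) Z^1_t + \psi^{2*}_t(d, \tilde v) = V^*_t(d, \tilde v)$ for $t \in [0, T]$, where in turn

$$ V^*_t(d, \tilde v) = \tilde v + (d - \tilde v)\rho(\theta)\big(1 - \eta_t(\theta)\big). \tag{76} $$

Thus the optimal wealth for the problem MV$(d, v, p, X)$ equals

$$ V^*_t(d, v, p, X) = v - p + (d - \tilde v)\rho(\theta)\big(1 - \eta_t(\theta)\big) + \mathbb{E}_{\mathbb{Q}} X - \mathbb{E}_{\mathbb{Q}}(X \mid \mathcal{F}_t) \tag{77} $$

and the minimal variance $v^*(d, v, p, X)$ is given by

$$ v^*(d, v, p, X) = \frac{(d - \tilde v)^2}{e^{\theta^2 T} - 1} + \gamma_X. \tag{78} $$

(ii) If $d \le \tilde v$ then the optimal wealth process equals

$$ V^*_t(d, v, p, X) = v - p + \mathbb{E}_{\mathbb{Q}} X - \mathbb{E}_{\mathbb{Q}}(X \mid \mathcal{F}_t) $$

and the minimal variance equals $\gamma_X$.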


Remark. Let us comment briefly on the assumption $\theta \ne 0$. Recall that if it fails to hold, the problem MV$(d, v)$ has no solution, unless $d = v$. Hence, for $\theta = 0$ we need to postulate that $d = v - p + \mathbb{E}_{\mathbb{P}} X$ (recall that $\theta = 0$ if and only if $\mathbb{Q} = \mathbb{P}$). The optimal strategy is then $\phi^* = (0, 1)$, and thus the solution to MV$(d, v, p, X)$ is exactly the same as in part (ii) of Proposition 17.

Mean-Variance Pricing and Hedging of a Generic Claim

Our next goal is to provide explicit representations for the mean-variance price of $X$. We maintain the assumption that the problem MV$(d, v, p, X)$ is examined in the class $\Phi(\mathbb{F})$. Thus, the mean-variance price considered in this section, denoted as $p^{d,v}_{\mathbb{F}}(X)$ in what follows, is relative to the reference filtration $\mathbb{F}$.

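Assume that $d > \tilde v = v - p + \mathbb{E}_{\mathbb{Q}} X$ (recall that $\mathbb{E}_{\mathbb{Q}} X = \mathbb{E}_{\mathbb{Q}} \tilde X = \pi_0(\tilde X)$). Then, by virtue of Proposition 17, we see that the minimal variance for the problem MV$(d, v, p, X)$ equals

$$ v^*(d, v, p, X) = \frac{(d - v + p - \mathbb{E}_{\mathbb{Q}} X)^2}{e^{\theta^2 T} - 1} + \gamma_X, $$

where $\gamma_X = \mathbb{V}_{\mathbb{P}}(X - \tilde X)$.

Of course, if $d \le \tilde v = v - p + \mathbb{E}_{\mathbb{Q}} X$ then we have $v^*(d, v, p, X) = \gamma_X$. Recall that we postulate that $d > v$, and thus the minimal variance for the problem MV$(d, v)$ equals

$$ v^*(d, v) = \frac{(d - v)^2}{e^{\theta^2 T} - 1}. $$

Let us denote

$$ \kappa = d - v - \mathbb{E}_{\mathbb{Q}} X, \qquad \rho = (d - v)^2 - \gamma_X\big(e^{\theta^2 T} - 1\big). $$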

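Proposition 18. (i) Suppose that $\pi_0(\tilde X) \ge d$ so that $\kappa \le -v$. If $\gamma_X \le v^*(d, v)$ then the mean-variance price equals $p^{d,v}_{\mathbb{F}}(X) = v$. Otherwise, $p^{d,v}_{\mathbb{F}}(X) = -\infty$.

(ii) Suppose that $d - v \le \pi_0(\tilde X) < d$ so that $-v < \kappa \le 0$. If, in addition, $\rho \ge 0$ then we have

$$ p^{d,v}_{\mathbb{F}}(X) = \min\{-\kappa + \sqrt{\rho},\; v\} \vee 0. \tag{79} $$

Otherwise, i.e., when $\rho < 0$, we have $p^{d,v}_{\mathbb{F}}(X) = -\kappa$ if $\gamma_X \le v^*(d, v)$, and $p^{d,v}_{\mathbb{F}}(X) = -\infty$ if $\gamma_X > v^*(d, v)$.

(iii) Suppose that $\pi_0(\tilde X) < d - v$ so that $\kappa > 0$. If $\rho \ge 0$ then $p^{d,v}_{\mathbb{F}}(X)$ is given by (79). Otherwise, we have $p^{d,v}_{\mathbb{F}}(X) = -\infty$.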

Proof. In case (i), we have $d - v - \mathbb{E}_{\mathbb{Q}} X \le -p$ for every $p \in [0, v]$. Thus $d \le v - p + \mathbb{E}_{\mathbb{Q}} X$, so that $v^*(d, v, p, X) = \gamma_X$. Therefore, if $\gamma_X \le v^*(d, v)$ it is clear that $p^{d,v}_{\mathbb{F}}(X) = v$. Otherwise, for every $p \in [0, v]$ we have $v^*(d, v, p, X) = \gamma_X > v^*(d, v)$ and thus $p^{d,v}_{\mathbb{F}}(X) = -\infty$.

In case (ii), it suffices to notice that $d \le v - p + \mathbb{E}_{\mathbb{Q}} X$ for any $p \in [0, -\kappa]$, and $d > v - p + \mathbb{E}_{\mathbb{Q}} X$ for any $p \in (-\kappa, v]$. Thus the maximal $p \in [0, v]$ for which $v^*(d, v, p, X) \le v^*(d, v)$ can be found from the equation

$$ (\kappa + p)^2 + \gamma_X\big(e^{\theta^2 T} - 1\big) = (d - v)^2, $$

which admits the solution $p = -\kappa + \sqrt{\rho}$ provided that $\rho \ge 0$. If $\rho < 0$, then we need to examine the case $p \in [0, -\kappa]$, and we see that $p^{d,v}_{\mathbb{F}}(X)$ equals either $-\kappa$ or $-\infty$, depending on whether $\gamma_X \le v^*(d, v)$ or $\gamma_X > v^*(d, v)$.

In case (iii), we have $d - v - \mathbb{E}_{\mathbb{Q}} X > 0$, which yields $d > v - p + \mathbb{E}_{\mathbb{Q}} X$ for any $p \in [0, v]$. Inequality $v^*(d, v, p, X) \le v^*(d, v)$ becomes

$$ (d - v + p - \mathbb{E}_{\mathbb{Q}} X)^2 + \gamma_X\big(e^{\theta^2 T} - 1\big) \le (d - v)^2. $$

If $\rho \ge 0$ then $p^{d,v}_{\mathbb{F}}(X)$ is given by (79). Otherwise, we have $p^{d,v}_{\mathbb{F}}(X) = -\infty$.

The mean-variance hedging strategy for a claim $X$ is now obtained as $\phi^{MV} = \phi^*(d, v, p^{d,v}_{\mathbb{F}}(X), X)$ for all cases above when $p^{d,v}_{\mathbb{F}}(X) \ne -\infty$.
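The case analysis of Proposition 18 can be written as a small function. This is a sketch under the standing assumptions θ ≠ 0 and d > v; the function name `mv_price` and its argument names are hypothetical, with `pi0` standing for the arbitrage price EQX of the attainable component and `gamma_x` for γX.

```python
import math

def mv_price(d, v, pi0, gamma_x, theta, T):
    # Mean-variance price following the three cases of Proposition 18.
    # Assumes theta != 0 and d > v; returns -inf when no admissible price exists.
    growth = math.exp(theta**2 * T) - 1.0
    v_star = (d - v) ** 2 / growth        # minimal variance without the claim, (74)
    kappa = d - v - pi0
    rho = (d - v) ** 2 - gamma_x * growth
    if pi0 >= d:                          # case (i): kappa <= -v
        return v if gamma_x <= v_star else -math.inf
    if rho >= 0.0:                        # cases (ii) and (iii): formula (79)
        return max(min(-kappa + math.sqrt(rho), v), 0.0)
    if kappa <= 0.0 and gamma_x <= v_star:  # case (ii) with rho < 0
        return -kappa
    return -math.inf
```

For an attainable claim (γX = 0) the function returns min{π0(X), v} whenever π0(X) ≥ 0, in agreement with Corollary 9, and a larger unhedgeable variance γX can only lower the price.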

8.3 Defaultable Claims

In order to provide a better intuition, we shall now examine in some detail two special cases. First, we shall assume that $X$ is independent of the σ-field $\mathcal{F}_T$. Since $X$ is $\mathcal{G}_T$-measurable, but obviously it is not $\mathcal{F}_T$-measurable, we shall refer to $X$ as a defaultable claim (a more general interpretation of $X$ is possible, however).

Although this case may look rather trivial at first glance, we shall see that some interesting conclusions can be obtained. Second, we shall analyze the case of a defaultable zero-coupon bond with fractional recovery of Treasury value. Of course, both examples are merely simple illustrations of Proposition 17, and thus they should not be considered as real-life applications.

Claim Independent of the Reference Filtration

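Consider a $\mathcal{G}_T$-measurable contingent claim $X$, such that $X$ is independent of the σ-field $\mathcal{F}_T$. Then for any strategy $\phi \in \Phi(\mathbb{F})$, the terminal wealth $V_T(\phi)$ and the payoff $X$ are independent random variables, so that

$$ \mathbb{V}_{\mathbb{P}}\big(V_T(\phi) + X\big) = \mathbb{V}_{\mathbb{P}}(V_T(\phi)) + \mathbb{V}_{\mathbb{P}}(X). $$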

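It is clear that if the variance $\mathbb{V}_{\mathbb{P}}(X)$ satisfies $\mathbb{V}_{\mathbb{P}}(X) > v^*(d, v)$, then $p^{d,v}_{\mathbb{F}}(X) = -\infty$ for every $v > 0$. Moreover, if $\mathbb{V}_{\mathbb{P}}(X) \le v^*(d, v)$ and $\mathbb{E}_{\mathbb{P}} X \ge d$, then $p^{d,v}_{\mathbb{F}}(X) = v$ for every $v > 0$.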


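It thus remains to examine the case when $\mathbb{V}_{\mathbb{P}}(X) \le v^*(d, v)$ and $\mathbb{E}_{\mathbb{P}} X < d$. Notice that $\tilde X = \mathbb{E}_{\mathbb{P}} X$ and thus $\pi_0(\tilde X) = \mathbb{E}_{\mathbb{P}} X$. In particular, since $\tilde X$ is constant, its replicating strategy is trivial, i.e. $\phi^{\tilde X} = 0$.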

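In view of Proposition 17, if $d > v - p + \mathbb{E}_{\mathbb{P}} X$ then the minimal variance for the problem MV$(d, v, p, X)$ equals

$$ v^*(d, v, p, X) = \frac{(d - v + p - \mu_X)^2}{e^{\theta^2 T} - 1} + \sigma^2_X, $$

where $\mu_X = \mathbb{E}_{\mathbb{P}} X$ and $\sigma^2_X = \mathbb{V}_{\mathbb{P}}(X) = \gamma_X$. Let us denote

$$ p^{d,v}(X) = -d + v + \mu_X + \sqrt{(d - v)^2 - \sigma^2_X\big(e^{\theta^2 T} - 1\big)}. $$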

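Proposition 19. The mean-variance price of the claim $X$ equals

$$ p^{d,v}_{\mathbb{F}}(X) = \min\{p^{d,v}(X),\; v\} \vee 0 $$

if $(d - v)^2 - \sigma^2_X(e^{\theta^2 T} - 1) \ge 0$, and $-\infty$ otherwise. The mean-variance hedging strategy is $\phi^{MV} = \psi^*$, where $\psi^*$ is such that

$$ \psi^{1*}_t = e^{\theta^2 (T-t)}\, \frac{d - v + p^{d,v}(X) - \mu_X}{e^{\theta^2 T} - 1}\, \frac{\nu}{\sigma^2}\, \frac{\eta_t}{Z^1_t}, \quad \forall\, t \in [0, T]. $$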

The mean-variance price depends, of course, on the initial value $v$ of the investor's capital. This dependence has very intuitive and natural properties, though. Let us denote

$$ k = d - \sqrt{(d - \mu_X)^2 + \sigma^2_X\big(e^{\theta^2 T} - 1\big)}, \qquad l = d - \sigma_X \sqrt{e^{\theta^2 T} - 1}. $$

We fix all parameters, except for $v$. Notice that the function $p(v) = p^{d,v}_{\mathbb{F}}(X)$ is non-negative and finite for $v \in [0, l \vee 0]$. Moreover, the function $p(v)$ is increasing for $v \in [0, k \vee 0)$, and it is decreasing on the interval $[k \vee 0, l \vee 0]$. Specifically,

$$
p(v) =
\begin{cases}
v & \text{if } 0 \le v < k \vee 0, \\
\mu_X - d + v + \sqrt{(d - v)^2 - \sigma^2_X\big(e^{\theta^2 T} - 1\big)} & \text{if } k \vee 0 \le v \le l \vee 0.
\end{cases}
$$

This conclusion is quite intuitive: once the initial level of investor's capital is big enough (that is, v ≥ l) the investor is less and less interested in purchasing the claim X. This is because when the initial endowment is sufficiently close to the expected terminal wealth level, the investor has enough leverage to meet this terminal objective at minimum risk; therefore, the investor is increasingly reluctant to purchase the claim X as this would introduce unwanted additional risk (unless of course σX = 0). For example, if v = d then the investor is not at all interested in purchasing the claim ($p^{v,v}_{\mathbb{F}}(X) = -\infty$ if σX > 0 and θ ≠ 0). For further properties of the mean-variance price of a claim X independent of FT , we refer to Bielecki and Jeanblanc (2003).
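The dependence of the price on the initial capital can be explored numerically. The sketch below is an illustration under the assumptions of this subsection (θ ≠ 0, d > v, EPX < d); the function names and the sample parameter values are hypothetical.

```python
import math

def price_independent_claim(v, d, mu_x, sigma_x, theta, T):
    # Mean-variance price of Proposition 19 for a claim independent of F_T.
    growth = math.exp(theta**2 * T) - 1.0
    disc = (d - v) ** 2 - sigma_x**2 * growth
    if disc < 0.0:
        return -math.inf
    p_aux = -d + v + mu_x + math.sqrt(disc)   # auxiliary quantity p^{d,v}(X)
    return max(min(p_aux, v), 0.0)

def thresholds(d, mu_x, sigma_x, theta, T):
    # Levels k and l of initial capital that separate the regimes of p(v).
    growth = math.exp(theta**2 * T) - 1.0
    k = d - math.sqrt((d - mu_x) ** 2 + sigma_x**2 * growth)
    l = d - sigma_x * math.sqrt(growth)
    return k, l

# Hypothetical parameters: target d = 2, claim mean 1, claim standard deviation 0.5.
k, l = thresholds(2.0, 1.0, 0.5, 0.5, 1.0)
prices = [price_independent_claim(v, 2.0, 1.0, 0.5, 0.5, 1.0) for v in (0.5, 1.2, 1.5, 1.9)]
```

With these values the price equals v below the level k, declines between k and l, and is −∞ once v exceeds l, which is the pattern described in the text.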


Defaultable Bond

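Let $\tau$ be a random time on the underlying probability space $(\Omega, \mathcal{G}, \mathbb{P})$. We define the indicator process $H$ associated with $\tau$ by setting $H_t = \mathbf{1}_{\{\tau \le t\}}$ for $t \in \mathbb{R}_+$, and we denote by $\mathbb{H}$ the natural filtration of $H$ ($\mathbb{P}$-completed). We take $\mathbb{H}$ to serve as the auxiliary filtration, so that $\mathbb{G} = \mathbb{F} \vee \mathbb{H}$. We assume that the default time $\tau$ is defined as follows:

$$ \tau = \inf\{\, t \in \mathbb{R}_+ : \Gamma_t > \zeta \,\}, \tag{80} $$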

where Γ is an increasing, F-adapted process, with Γ0 = 0, and ζ is an exponentially distributed random variable with parameter 1, independent of F. It is well known that any Brownian motion W with respect to F is also a Brownian motion with respect to G within the present setup (the latter property is closely related to the so-called hypothesis (H) frequently used in the modeling of default event, see Jeanblanc and Rutkowski (2000) or Bielecki et al. (2004)).
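The construction (80) is easy to simulate. The sketch below is illustrative and covers only the simplest special case of a deterministic hazard process Γt = γt with a constant intensity γ (an assumption made here for the sake of the example); the function names are hypothetical. It compares the simulated survival frequency with exp(−ΓT).

```python
import math
import random

def simulate_default_time(hazard_inverse, rng):
    # tau = inf{t : Gamma_t > zeta} with zeta ~ Exp(1), as in (80).
    # hazard_inverse maps a level to the first time Gamma exceeds that level.
    zeta = rng.expovariate(1.0)
    return hazard_inverse(zeta)

def survival_frequency(T, intensity, n_paths=100_000, seed=0):
    # Fraction of simulated default times exceeding T when Gamma_t = intensity * t.
    rng = random.Random(seed)
    inverse = lambda level: level / intensity
    survived = sum(simulate_default_time(inverse, rng) > T for _ in range(n_paths))
    return survived / n_paths

estimate = survival_frequency(T=2.0, intensity=0.1)
exact = math.exp(-0.1 * 2.0)   # P(tau > T) = exp(-Gamma_T)
```

For a stochastic, F-adapted Γ the same recipe applies path by path, with `hazard_inverse` replaced by the first passage time of the simulated hazard path.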

Now, suppose that a new investment opportunity becomes available for the agent. Namely, the agent may purchase a defaultable bond that matures at time $T \in (0, T^*]$. We postulate that the terminal payoff at time $T$ of the bond is $X = L\mathbf{1}_{\{\tau > T\}} + \delta L \mathbf{1}_{\{\tau \le T\}}$, where $L > 0$ is the bond's notional amount and $\delta \in [0, 1)$ is the (constant) recovery rate. In other words, we deal with a defaultable zero-coupon bond that is subject to the fractional recovery of Treasury value.

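Notice that the payoff $X$ can be represented as follows: $X = \delta L + Y$, where $Y = L(1 - \delta)\mathbf{1}_{\{\tau > T\}}$. According to our general definition, we associate to $X$ an $\mathcal{F}_T$-measurable random variable $\tilde X$ by setting

$$ \tilde X = \mathbb{E}_{\mathbb{P}}(X \mid \mathcal{F}_T) = \delta L + \mathbb{E}_{\mathbb{P}}(Y \mid \mathcal{F}_T). $$

In view of (80), we have

$$ \mathbb{E}_{\mathbb{P}}(Y \mid \mathcal{F}_T) = L(1 - \delta)\, \mathbb{P}\{\tau > T \mid \mathcal{F}_T\} = L(1 - \delta)\, e^{-\Gamma_T}, $$

and thus the arbitrage price at time 0 of the attainable claim $\tilde X$ equals (recall that we have reduced our problem to the case $r = 0$)

$$ \pi_0(\tilde X) = \mathbb{E}_{\mathbb{Q}} \tilde X = \delta L + L(1 - \delta)\, \mathbb{E}_{\mathbb{P}}\big(\eta_T e^{-\Gamma_T}\big). $$

Since clearly

$$ X - \tilde X = L(1 - \delta)\big(\mathbf{1}_{\{\tau > T\}} - \mathbb{P}\{\tau > T \mid \mathcal{F}_T\}\big), $$

we obtain

$$ \gamma_X = \mathbb{V}_{\mathbb{P}}(X - \tilde X) = L^2 (1 - \delta)^2\, \mathbb{E}_{\mathbb{P}}\big(\mathbf{1}_{\{\tau > T\}} - e^{-\Gamma_T}\big)^2. $$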

In order to find the mean-variance price $p^{d,v}_{\mathbb{F}}(X)$ at time 0 of a defaultable bond with respect to the reference filtration $\mathbb{F}$, it suffices to make use of Proposition 17 (or Proposition 18). If we wish to describe the mean-variance hedging strategy with respect to $\mathbb{F}$, we need also to know an explicit representation for the replicating strategy $\phi^{\tilde X}$ for the claim $\tilde X$. To this end, it suffices to find the integral representation of the random variable $\mathbb{E}_{\mathbb{P}}(Y \mid \mathcal{F}_T)$ with respect to the price process $Z^1$ or, equivalently, to find a process $\phi^{\tilde X}$ for which

$$ \tilde X = \pi_0(\tilde X) + \int_0^T \phi^{\tilde X}_t\, dZ^1_t. $$

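Example 4. In practical applications of the reduced-form approach, it is fairly common to postulate that the $\mathbb{F}$-hazard process $\Gamma$ is given as $\Gamma_t = \int_0^t \gamma_u\, du$, where $\gamma$ is a non-negative process, progressively measurable with respect to $\mathbb{F}$, referred to as the $\mathbb{F}$-intensity of default. Suppose, for the sake of simplicity, that the intensity of default $\gamma$ is deterministic, and let us set

$$ p_\gamma = \mathbb{P}\{\tau > T\} = \mathbb{Q}\{\tau > T\} = \exp\Big(-\int_0^T \gamma(t)\, dt\Big). $$

Then we get

$$ \pi_0(\tilde X) = \mathbb{E}_{\mathbb{Q}} \tilde X = \delta L + L(1 - \delta)\, p_\gamma $$

and

$$ \gamma_X = L^2 (1 - \delta)^2\, p_\gamma (1 - p_\gamma). $$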

Of course, in the case of a deterministic default intensity $\gamma$, in order to replicate the claim $\tilde X$, it suffices to invest the amount $\pi_0(\tilde X)$ in the savings account. For a more detailed analysis of the mean-variance price of a defaultable bond, the reader may consult Bielecki and Jeanblanc (2003).
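For a deterministic intensity, the inputs needed by Proposition 18 follow from the payoff X = δL + L(1 − δ)1{τ>T} by elementary computations: the attainable component is the constant δL + L(1 − δ)pγ, and the residual is L(1 − δ) times a centered Bernoulli variable with success probability pγ. The sketch below is illustrative; the function names and the sample values of L, δ and γ are hypothetical.

```python
import math

def survival_probability(intensity, T, n_steps=10_000):
    # p_gamma = exp(-int_0^T gamma(t) dt), with the integral computed by the trapezoidal rule.
    h = T / n_steps
    integral = 0.5 * (intensity(0.0) + intensity(T))
    integral += sum(intensity(i * h) for i in range(1, n_steps))
    return math.exp(-integral * h)

def bond_inputs(L, delta, p_gamma):
    # Price of the attainable part and variance of the unhedgeable residual.
    pi0 = delta * L + L * (1.0 - delta) * p_gamma
    gamma_x = (L * (1.0 - delta)) ** 2 * p_gamma * (1.0 - p_gamma)
    return pi0, gamma_x

p_gamma = survival_probability(lambda t: 0.1, T=2.0)   # constant intensity 0.1
pi0, gamma_x = bond_inputs(L=100.0, delta=0.4, p_gamma=p_gamma)
```

The pair (`pi0`, `gamma_x`) can then be used as the price of the attainable component and as γX in the case analysis of Proposition 18.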

9 Strategies Adapted to the Full Filtration

In this section, the mean-variance hedging and pricing is examined in the case of trading strategies adapted to the full filtration. Recall that $W$ is assumed to be a one-dimensional Brownian motion with respect to $\mathbb{F}$ under $\mathbb{P}$. We postulated, in addition, that $W$ is also a Brownian motion with respect to the filtration $\mathbb{G}$ under the probability $\mathbb{P}$. We define a new probability $\mathbb{Q}$ on $(\Omega, \mathcal{G}_{T^*})$ by setting

$$ \frac{d\mathbb{Q}}{d\mathbb{P}}\Big|_{\mathcal{G}_t} = \eta_t, \quad \forall\, t \in [0, T^*], $$

where the process $\eta$ is given by (63). Clearly, $\mathbb{Q}$ is an equivalent martingale probability for our primary market and the process $\eta$ is a $\mathbb{G}$-martingale under $\mathbb{P}$. Moreover, we have (cf. (64))

$$ \mathbb{E}_{\mathbb{P}}(\eta_T^2 \mid \mathcal{G}_t) = \eta_t^2\, e^{\theta^2 (T-t)}, $$

and thus $\mathbb{E}_{\mathbb{P}}(\eta_t^2) = \exp(\theta^2 t)$ for every $t \in [0, T^*]$. It is easy to check that the process $\widetilde W_t = W_t - \theta t$ is a martingale, and thus a Brownian motion, with respect to $\mathbb{G}$ under $\mathbb{Q}$.

From the P-square-integrability of ηT , it follows that for any strategy φ ∈ Φ(G) the terminal wealth VT (φ) is Q-integrable. Recall that a G-predictable process φ1 uniquely determines a self-financing strategy φ = (φ1, φ2), and thus we may formally identify φ1 with the associated strategy φ (and vice versa). The following lemma will prove useful.

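Lemma 14. Let $\mathcal{A}(\mathbb{Q})$ be the linear space of all $\mathbb{G}$-predictable processes $\psi$ such that the process $\int_0^t \psi_u\, dZ^1_u$ is a $\mathbb{Q}$-martingale and the integral $\int_0^T \psi_u\, dZ^1_u$ is in $L^2(\Omega, \mathcal{G}_T, \mathbb{P})$. Then $\mathcal{A}(\mathbb{Q}) = \Phi(\mathbb{G})$.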

Proof. It is clear that $\mathcal{A}(\mathbb{Q}) \subseteq \Phi(\mathbb{G})$. For the proof of the inclusion $\Phi(\mathbb{G}) \subseteq \mathcal{A}(\mathbb{Q})$, see Lemma 9 in Rheinlander and Schweizer (1997).

It is worthwhile to note that the class $\mathcal{A}(\mathbb{Q})$ corresponds to the set $\Theta_{GLP}$ ($\Theta$, respectively) considered in Schweizer (2001) (in Rheinlander and Schweizer (1997), respectively). The class $\Phi(\mathbb{G})$ corresponds to the class $\Theta_S$ ($\Theta$, respectively) considered in Schweizer (2001) (in Rheinlander and Schweizer (1997), respectively).

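Let us denote by $\mathbb{G}^1$ the filtration generated by all wealth processes:

$$ V^v_t(\phi) = v + \int_0^t \phi^1_u\, dZ^1_u, $$

where $v \in \mathbb{R}$ and $\phi = (\phi^1, \phi^2)$ belongs to $\Phi(\mathbb{G})$. Equivalently, $\mathbb{G}^1$ is generated by the processes

$$ x + \int_0^t \psi_u\, dZ^1_u $$

with $x \in \mathbb{R}$ and $\psi \in \mathcal{A}(\mathbb{Q})$. Also, we denote by $\mathcal{P}^0$ the following set of random variables:

$$ \mathcal{P}^0 = \Big\{ \xi \in L^2(\Omega, \mathcal{G}^1_T, \mathbb{P}) \;\Big|\; \xi = \int_0^T \psi_u\, dZ^1_u,\ \psi \in \mathcal{A}(\mathbb{Q}) \Big\}. $$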

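We write $\Pi^0_{\mathbb{P}}$ to denote the orthogonal projection (in the norm of the space $L^2(\Omega, \mathcal{G}_T, \mathbb{P})$) from $L^2(\Omega, \mathcal{G}_T, \mathbb{P})$ on the space $\mathcal{P}^0$. A similar notation will be also used for orthogonal projections on $\mathcal{P}^0$ under $\mathbb{Q}$. Let us mention that, in general, we shall have $\Pi^0_{\mathbb{P}}(Y) \ne \mathbb{E}_{\mathbb{P}}(Y \mid \mathcal{G}^1_T)$ for $Y \in L^2(\Omega, \mathcal{G}_T, \mathbb{P})$ and $\Pi^0_{\mathbb{Q}}(Y) \ne \mathbb{E}_{\mathbb{Q}}(Y \mid \mathcal{G}^1_T)$ for $Y \in L^2(\Omega, \mathcal{G}_T, \mathbb{Q})$ (see Section 9.3 for more details).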

9.1 Solution to MV(d, v) in the Class Φ(G)

Recall that our basic mean-variance problem has the following form:


Problem MV$(d, v)$: Minimize $\mathbb{V}_{\mathbb{P}}(V^v_T(\phi))$ over all strategies $\phi \in \Phi(\mathbb{G})$, subject to $\mathbb{E}_{\mathbb{P}} V^v_T(\phi) \ge d$.

As in Section 8.1, we postulate that $d > v$, since otherwise the problem is trivial. We shall argue that it suffices to solve a simpler problem:

Problem MVA$(d, v)$: Minimize $\mathbb{E}_{\mathbb{P}}(V^v_T(\phi))^2$ over all strategies $\phi \in \Phi(\mathbb{G})$, subject to $\mathbb{E}_{\mathbb{P}} V^v_T(\phi) = d$.

In view of the definition of class $\mathcal{A}(\mathbb{Q})$, Lemma 14, and the fact that $\mathbb{E}_{\mathbb{Q}} \xi = 0$ for any $\xi \in \mathcal{P}^0$, we see that it suffices to solve the problem

Problem MVB$(d, v)$: Minimize $\mathbb{E}_{\mathbb{P}}(v + \xi)^2$ over all random variables $\xi \in \mathcal{P}^0$, subject to $\mathbb{E}_{\mathbb{P}} \xi = d - v$.

The solution to the last problem is exactly the same as in the case of strategies from $\Phi(\mathbb{F})$. Indeed, by solving the last problem in the class $L^2(\Omega, \mathcal{G}_T, \mathbb{P})$ (rather than in $\mathcal{P}^0$), and with the additional constraint $\mathbb{E}_{\mathbb{Q}} \xi = 0$, we see that the optimal solution, given by (68), is in fact $\mathcal{F}_T$-measurable, and thus it belongs to the class $\mathcal{P}^0$ as well. In view of (69), the same random variable is a solution to MV$(d, v)$, that is, it represents the optimal terminal wealth. We conclude that a solution to MV$(d, v)$ in the class $\Phi(\mathbb{G})$ is given by the formulae (72)-(74) of Proposition 16, i.e., it coincides with a solution in the class $\Phi(\mathbb{F})$.

Assume that $X$ is an attainable contingent claim, in the sense that there exists a trading strategy $\phi \in \Phi(\mathbb{G})$ which replicates $X$. Then, arguing along the same lines as in Section 8.2, we get the following result.

Corollary 10. Let a $\mathcal{G}_T$-measurable random variable $X$ represent an attainable contingent claim. Then
(i) If the arbitrage price $\pi_0(X)$ satisfies $\pi_0(X) \in [0, v]$ then $p^{d,v}(X) = \pi_0(X)$.
(ii) If the arbitrage price $\pi_0(X)$ is strictly greater than $v$ then $p^{d,v}(X) = v$.

9.2 Solution to MV(d, v, p, X) in the Class Φ(G)

We shall study the problem MV$(d, v, p, X)$ for an arbitrary $\mathcal{G}_T$-measurable claim $X$, which is $\mathbb{P}$-square-integrable. Recall that we deal with the following problem:

Problem MV$(d, v, p, X)$: Minimize $\mathbb{V}_{\mathbb{P}}(V^{v-p}_T(\phi) + X)$ over all trading strategies $\phi \in \Phi(\mathbb{G})$, subject to $\mathbb{E}_{\mathbb{P}}(V^{v-p}_T(\phi) + X) \ge d$.

The basic idea of solving the problem MV$(d, v, p, X)$ with respect to $\mathbb{G}$-predictable strategies is similar to that used in the case of $\mathbb{F}$-predictable strategies. The main difference is that the auxiliary random variable $\tilde X$ will now be defined as the orthogonal projection $\Pi^0_{\mathbb{P}}(X)$ of $X$ on $\mathcal{P}^0$, rather than the conditional expectation $\mathbb{E}_{\mathbb{P}}(X \mid \mathcal{G}_T)$.


Let us denote $\tilde d = d - v + p$. The problem MV$(d, v, p, X)$ can be reformulated as follows:

Problem MV$(\tilde d, 0, 0, X)$: Minimize $\mathbb{V}_{\mathbb{P}}(V^0_T(\phi) + X)$ over all trading strategies $\phi \in \Phi(\mathbb{G})$, subject to $\mathbb{E}_{\mathbb{P}}(V^0_T(\phi) + X) \ge \tilde d$.

That is, if $V^{0,*}_T$ is the optimal wealth in problem MV$(\tilde d, 0, 0, X)$ then $V^{v-p,*}_T = V^{0,*}_T + v - p$ is the optimal wealth in problem MV$(d, v, p, X)$, and the optimal strategies as well as the optimal variances are the same in both problems.

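Let $X^0 = \Pi^0_{\mathbb{P}}(X)$ stand for the orthogonal projection of $X$ on $\mathcal{P}^0$, so that $\psi^{X^0}$ is a process from $\mathcal{A}(\mathbb{Q}) = \Phi(\mathbb{G})$, for which

$$ X^0 = \int_0^T \psi^{1,X^0}_t\, dZ^1_t \tag{81} $$

and $X - X^0 = X - \Pi^0_{\mathbb{P}}(X)$ is orthogonal to $\mathcal{P}^0$. The price of $X^0$ equals

$$ \pi_t(X^0) = \int_0^t \psi^{1,X^0}_u\, dZ^1_u = \mathbb{E}_{\mathbb{Q}}(X^0 \mid \mathcal{G}_t), \quad \forall\, t \in [0, T]. \tag{82} $$

Let $\psi^{X^0} \in \Phi(\mathbb{G})$ be a replicating strategy for the claim $X^0$. Explicitly, $\psi^{X^0} = (\psi^{1,X^0}, \psi^{2,X^0})$, where $\psi^{2,X^0}$ satisfies $\psi^{1,X^0}_t Z^1_t + \psi^{2,X^0}_t = \pi_t(X^0)$. Notice that $\pi_0(X^0) = \mathbb{E}_{\mathbb{Q}} X^0 = 0$ and, of course, $\pi_T(X^0) = X^0$. It thus suffices to consider the following problem:

Problem MV$(\tilde d, 0, 0, X - X^0)$: Minimize $\mathbb{V}_{\mathbb{P}}(V^0_T(\phi) + X - X^0)$ over all trading strategies $\phi \in \Phi(\mathbb{G})$, subject to $\mathbb{E}_{\mathbb{P}}(V^0_T(\phi) + X - X^0) \ge \tilde d$.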

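Since $X - X^0$ is orthogonal to $\mathcal{P}^0$, for any strategy $\phi \in \Phi(\mathbb{G})$ we have

$$ \mathbb{V}_{\mathbb{P}}\big(V^0_T(\phi) + X - X^0\big) = \mathbb{V}_{\mathbb{P}}\big(V^0_T(\phi)\big) + \mathbb{V}_{\mathbb{P}}(X - X^0) = \mathbb{V}_{\mathbb{P}}\big(V^0_T(\phi)\big) + \gamma^0_X, $$

where $\gamma^0_X = \mathbb{V}_{\mathbb{P}}(X - X^0)$. Let us denote $\hat d = d - v + p - \mathbb{E}_{\mathbb{P}} X + \mathbb{E}_{\mathbb{P}} X^0$. Then the problem MV$(\tilde d, 0, 0, X - X^0)$ can be simplified as follows:

Problem MV$(\hat d, 0; \gamma^0_X)$: Minimize $\mathbb{V}_{\mathbb{P}}(V^0_T(\phi)) + \gamma^0_X$ over all trading strategies $\phi \in \Phi(\mathbb{G})$, subject to $\mathbb{E}_{\mathbb{P}}(V^0_T(\phi)) \ge \hat d = d - v + p - \mathbb{E}_{\mathbb{P}} X + \mathbb{E}_{\mathbb{P}} X^0$.

Let us write $\tilde v = v - p + \mathbb{E}_{\mathbb{P}} X - \mathbb{E}_{\mathbb{P}} X^0$, so that $\hat d = d - \tilde v$. Then the minimal variance for the problem MV$(d, v, p, X)$ equals

$$ v^*(d, v, p, X) = v^*(\hat d, 0) + \gamma^0_X = v^*(d, \tilde v) + \gamma^0_X. $$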

Moreover, if $\psi^*$ is an optimal strategy for MV$(\hat d, 0)$, where $\hat d = d - \tilde v$, then $\phi^{1*} = \psi^{1*} - \psi^{1,X^0}$ is a solution to MV$(d, v, p, X)$. The proof of the next proposition is based on the considerations above, combined with Proposition 16. We use the standard notation $\rho(\theta) = e^{\theta^2 T}(e^{\theta^2 T} - 1)^{-1}$ and $\eta_t(\theta) = \eta_t e^{-\theta^2 t}$, so that $\eta_0(\theta) = 1$. Recall that $\mathbb{E}_{\mathbb{Q}} X^0 = 0$.

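Proposition 20. Assume that $\theta \ne 0$ and let $\psi^{X^0} \in \Phi(\mathbb{G})$ be a replicating strategy for $X^0 = \Pi^0_{\mathbb{P}}(X)$. Write $\hat d = d - \tilde v$.

(i) Suppose that $d > \tilde v$. Then an optimal strategy $\phi^*(d, v, p, X)$ for the problem MV$(d, v, p, X)$ is given as $\phi^{1*}(d, v, p, X) = \psi^{1*}(\hat d, 0) - \psi^{1,X^0}$ with $\psi^*(\hat d, 0) = (\psi^{1*}(\hat d, 0), \psi^{2*}(\hat d, 0))$ such that $\psi^{1*}(\hat d, 0)$ equals

$$ \psi^{1*}_t(\hat d, 0) = (d - \tilde v)\rho(\theta)\, \frac{\nu\, \eta_t(\theta)}{\sigma^2 Z^1_t} \tag{83} $$

and $\psi^{2*}(\hat d, 0)$ satisfies $\psi^{1*}_t(\hat d, 0) Z^1_t + \psi^{2*}_t(\hat d, 0) = V^*_t(\hat d, 0)$, where in turn

$$ V^*_t(\hat d, 0) = (d - \tilde v)\rho(\theta)\big(1 - \eta_t(\theta)\big). \tag{84} $$

Thus the optimal wealth for the problem MV$(d, v, p, X)$ equals

$$ V^*_t(d, v, p, X) = v - p + (d - \tilde v)\rho(\theta)\big(1 - \eta_t(\theta)\big) - \mathbb{E}_{\mathbb{Q}}(X^0 \mid \mathcal{G}_t). \tag{85} $$

The minimal variance $v^*(d, v, p, X)$ is given by

$$ v^*(d, v, p, X) = \frac{(d - \tilde v)^2}{e^{\theta^2 T} - 1} + \gamma^0_X. \tag{86} $$

(ii) If $d \le \tilde v$ then the optimal wealth process equals

$$ V^*_t(d, v, p, X) = v - p - \mathbb{E}_{\mathbb{Q}}(X^0 \mid \mathcal{G}_t) $$

and the minimal variance equals $\gamma^0_X$.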

Remark. It is natural to expect that the optimal variance given in (86) is not greater than the optimal variance given in (78). In fact, this is the case (see Proposition 5.4 in Bielecki and Jeanblanc (2003)).

Of course, the practical relevance of the last result hinges on the availability of an explicit representation for the orthogonal projection $X^0 = \Pi^0_{\mathbb{P}}(X)$ of $X$ on the space $\mathcal{P}^0$. This important issue will be examined in the next section in a general setup. We shall continue the study of this question in the framework of defaultable claims in Section 9.5.

9.3 Projection of a Generic Claim

Let us first recall two well-known results concerning the decomposition of a GT -measurable random variable, which represents a generic contingent claim in our financial model.


Galtchouk-Kunita-Watanabe decomposition under $\mathbb{Q}$. Suppose first that we work under $\mathbb{Q}$, so that the process $Z^1$ is a continuous martingale. Recall that by assumption $W$ is a Brownian motion with respect to $\mathbb{G}$ under $\mathbb{P}$; hence, the process $\widetilde W_t = W_t - \theta t$ is a Brownian motion with respect to $\mathbb{G}$ under $\mathbb{Q}$.

It is well known that any random variable $Y \in L^2(\Omega,\mathcal{G}_T,\mathbb{Q})$ can be represented by means of the Galtchouk-Kunita-Watanabe decomposition with respect to the martingale $Z^1$ under $\mathbb{Q}$. To be more specific, for any random variable $Y \in L^2(\Omega,\mathcal{G}_T,\mathbb{Q})$ there exists a $\mathbb{G}$-martingale $N^{Y,\mathbb{Q}}$, which is strongly orthogonal in the martingale sense to $Z^1$ under $\mathbb{Q}$, and a $\mathbb{G}$-adapted process $\psi^{Y,\mathbb{Q}}$, such that $Y$ can be represented as follows:

$$Y = \mathbb{E}_{\mathbb{Q}}Y + \int_0^T \psi^{Y,\mathbb{Q}}_t\,dZ^1_t + N^{Y,\mathbb{Q}}_T. \tag{87}$$

Furthermore, the process ψY,Q can be represented as follows:

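$$\psi^{Y,\mathbb{Q}}_t = \frac{d\langle \widehat{Y}, Z^1\rangle_t}{d\langle Z^1\rangle_t}, \tag{88}$$

where the $\mathbb{G}$-martingale $\widehat{Y}$ is defined as $\widehat{Y}_t = \mathbb{E}_{\mathbb{Q}}(Y \mid \mathcal{G}_t)$.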

Föllmer-Schweizer decomposition under P. Let us now consider the same issue, but under the original probability $\mathbb{P}$. The process $Z^1$ is a (continuous) semimartingale with respect to $\mathbb{G}$ under $\mathbb{P}$, and thus it admits a unique continuous martingale part under $\mathbb{P}$.

Any random variable $Y \in L^2(\Omega,\mathcal{G}_T,\mathbb{P})$ can be represented by means of the Föllmer-Schweizer decomposition. Specifically, there exists a $\mathbb{G}$-adapted process $\psi^{Y,\mathbb{P}}$, a $(\mathbb{G},\mathbb{P})$-martingale $N^{Y,\mathbb{P}}$, strongly orthogonal in the martingale sense to the continuous martingale part of $Z^1$, and a constant $y^{Y,\mathbb{P}}$, so that

$$Y = y^{Y,\mathbb{P}} + \int_0^T \psi^{Y,\mathbb{P}}_t\,dZ^1_t + N^{Y,\mathbb{P}}_T. \tag{89}$$

We shall see that it will not be necessary to compute the process $\psi^{Y,\mathbb{P}}$ for the purpose of finding a hedging strategy for the problem considered in this section.

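Projection on $\mathcal{P}^0$. As already mentioned, $\Pi^0_{\mathbb{Q}}(Y) \neq \mathbb{E}_{\mathbb{Q}}(Y \mid \mathcal{G}^1_T)$ for random variables $Y$ in $L^2(\Omega,\mathcal{G}_T,\mathbb{Q})$, as well as $\Pi^0_{\mathbb{P}}(Y) \neq \mathbb{E}_{\mathbb{P}}(Y \mid \mathcal{G}^1_T)$ for $Y \in L^2(\Omega,\mathcal{G}_T,\mathbb{P})$, in general. For instance, for any random variable $Y$ as in (87) we get $\Pi^0_{\mathbb{Q}}(Y) = \int_0^T \psi^{Y,\mathbb{Q}}_t\,dZ^1_t$, whereas

$$\mathbb{E}_{\mathbb{Q}}(Y \mid \mathcal{G}^1_T) = \Pi^0_{\mathbb{Q}}(Y) + \mathbb{E}_{\mathbb{Q}}Y.$$

The projection $\Pi^0_{\mathbb{Q}}(Y)$ differs here from the conditional expectation just by the expected value $\mathbb{E}_{\mathbb{Q}}Y$. Consequently, we have $\Pi^0_{\mathbb{Q}}(Y) = \mathbb{E}_{\mathbb{Q}}(Y \mid \mathcal{G}^1_T)$ for any $Y \in L^2(\Omega,\mathcal{G}_T,\mathbb{Q})$ with $\mathbb{E}_{\mathbb{Q}}Y = 0$. More importantly, observe that for $Y$ as in (89) we shall have, in general,

$$\Pi^0_{\mathbb{P}}(Y) \neq \int_0^T \psi^{Y,\mathbb{P}}_t\,dZ^1_t,$$

so that, in particular, $\Pi^0_{\mathbb{P}}(Y) \neq \mathbb{E}_{\mathbb{P}}(Y \mid \mathcal{G}^1_T)$ even if $\mathbb{E}_{\mathbb{P}}Y = 0$.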

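Our next goal is to compute the projection $\Pi^0_{\mathbb{P}}(Y)$ for any random variable $Y \in L^2(\Omega,\mathcal{G}_T,\mathbb{Q})$. We know that any such $Y$ can be represented as in (87). Due to linearity of the projection, it is enough to compute the projection of each component in the right-hand side of (87). Let us set $\widetilde{\eta}_t = \mathbb{E}_{\mathbb{Q}}(\eta_T \mid \mathcal{G}_t)$ for every $t \in [0,T]$, so that, in particular, $\widetilde{\eta}_T = \eta_T$. Since $\widetilde{\eta}$ is a square-integrable $\mathbb{G}$-martingale under $\mathbb{Q}$, there exists a process $\widetilde{\psi}$ in $\mathcal{A}(\mathbb{Q})$ such that

$$\widetilde{\eta}_t = \mathbb{E}_{\mathbb{Q}}\eta_T + \int_0^t \widetilde{\psi}_u\,dZ^1_u = \mathbb{E}_{\mathbb{Q}}\eta_T + Z^{\widetilde{\eta}}_t, \quad \forall\, t \in [0,T], \tag{90}$$

where we denote

$$Z^{\widetilde{\eta}}_t = \int_0^t \widetilde{\psi}_u\,dZ^1_u.$$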

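Lemma 15. We have

$$\widetilde{\psi}_t = -\frac{\theta\,\widetilde{\eta}_t}{\sigma Z^1_t} = -\frac{\theta e^{\theta^2 T}}{\sigma Z^1_t}\,\exp\Big(-\theta \widetilde{W}_t - \frac{1}{2}\theta^2 t\Big) \tag{91}$$

and the process $\widetilde{W}_t = W_t + \theta t$ is a Brownian motion under $\mathbb{Q}$.

Proof. Direct calculations show that for every $t \in [0,T]$

$$\widetilde{\eta}_t = \exp\Big(-\frac{\theta}{\sigma}\int_0^t \frac{dZ^1_u}{Z^1_u} - \frac{1}{2}\theta^2 (t-2T)\Big) = e^{\theta^2 T}\exp\Big(-\theta \widetilde{W}_t - \frac{1}{2}\theta^2 t\Big). \tag{92}$$

Hence, $\widetilde{\eta}$ solves the SDE

$$d\widetilde{\eta}_t = -\theta\,\widetilde{\eta}_t\,d\widetilde{W}_t = -\frac{\theta}{\sigma}\,\frac{\widetilde{\eta}_t}{Z^1_t}\,dZ^1_t$$

with the initial condition $\widetilde{\eta}_0 = \mathbb{E}_{\mathbb{Q}}\widetilde{\eta}_T = \mathbb{E}_{\mathbb{Q}}\eta_T = e^{\theta^2 T}$.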
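As a numerical sanity check, which is not part of the original argument, the initial condition $\mathbb{E}_{\mathbb{Q}}\eta_T = e^{\theta^2 T}$ can be verified by integrating $\eta_T$ against the Gaussian law of the $\mathbb{Q}$-Brownian motion $W_T + \theta T$. The sketch below assumes illustrative parameter values and a plain trapezoidal rule; the function name `expected_eta_T_under_Q` is hypothetical.

```python
import math

def expected_eta_T_under_Q(theta, T, n=20001, width=12.0):
    """Approximate E_Q[eta_T] with the trapezoidal rule.

    Under Q the process W_t + theta*t is a Brownian motion, so its value at T
    can be written sqrt(T)*x with x standard normal, and
    eta_T = exp(-theta*W_T - 0.5*theta^2*T) = exp(-theta*sqrt(T)*x + 0.5*theta^2*T).
    """
    sd = math.sqrt(T)
    centre = -theta * sd              # the integrand is centred near this point
    lo, hi = centre - width, centre + width
    h = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        x = lo + i * h
        density = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
        value = math.exp(-theta * sd * x + 0.5 * theta ** 2 * T) * density
        weight = 0.5 if i in (0, n - 1) else 1.0   # trapezoidal end weights
        total += weight * value * h
    return total

# Illustrative (assumed) parameters: theta = 0.4, T = 2.
approx = expected_eta_T_under_Q(0.4, 2.0)
exact = math.exp(0.4 ** 2 * 2.0)
```

With these assumed values the quadrature and the closed form $e^{\theta^2 T}$ should agree to high accuracy, which is consistent with the initial condition stated in the proof.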

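In the next result, we provide a general representation for the projection $\Pi^0_{\mathbb{P}}(Y)$ for a $\mathcal{G}_T$-measurable random variable $Y$, which is $\mathbb{P}$-square-integrable.

Proposition 21. Let $Y \in L^2(\Omega,\mathcal{G}_T,\mathbb{P})$. Then we have

$$\Pi^0_{\mathbb{P}}(Y) = \int_0^T \widehat{\psi}^{Y,\mathbb{P}}_t\,dZ^1_t,$$

where

$$\widehat{\psi}^{Y,\mathbb{P}}_t = \psi^{Y,\mathbb{Q}}_t - \widetilde{\psi}_t\Big(\widetilde{\eta}_0^{-1}\,\mathbb{E}_{\mathbb{Q}}Y + \int_0^t \widetilde{\eta}_u^{-1}\,dN^{Y,\mathbb{Q}}_u\Big) \tag{93}$$

and where the processes $\psi^{Y,\mathbb{Q}}$ and $N^{Y,\mathbb{Q}}$ are given by the Galtchouk-Kunita-Watanabe decomposition (87) of $Y$ under $\mathbb{Q}$, while $\widetilde{\eta}$ and $\widetilde{\psi}$ are given by (92) and (91). The hat distinguishes the integrand of the projection from the Föllmer-Schweizer integrand $\psi^{Y,\mathbb{P}}$ in (89).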

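Proof. First, we compute the projection of the constant $c = \mathbb{E}_{\mathbb{Q}}Y$. To this end, recall that $\widetilde{\eta}_T = \eta_T$ and by virtue of (90) we have $\eta_T = \widetilde{\eta}_0 + Z^{\widetilde{\eta}}_T$. Hence, for any $\psi \in \mathcal{A}(\mathbb{Q})$ we obtain

$$\mathbb{E}_{\mathbb{P}}\Big(\big(1 + \widetilde{\eta}_0^{-1} Z^{\widetilde{\eta}}_T\big)\int_0^T \psi_t\,dZ^1_t\Big) = \widetilde{\eta}_0^{-1}\,\mathbb{E}_{\mathbb{P}}\Big(\eta_T \int_0^T \psi_t\,dZ^1_t\Big) = \widetilde{\eta}_0^{-1}\,\mathbb{E}_{\mathbb{Q}}\Big(\int_0^T \psi_t\,dZ^1_t\Big) = 0,$$

and thus $\Pi^0_{\mathbb{P}}(1) = -\widetilde{\eta}_0^{-1} Z^{\widetilde{\eta}}_T$. We conclude that for any $c \in \mathbb{R}$

$$\Pi^0_{\mathbb{P}}(c) = c\,\Pi^0_{\mathbb{P}}(1) = -c\,\widetilde{\eta}_0^{-1} Z^{\widetilde{\eta}}_T = -c\,\widetilde{\eta}_0^{-1}\int_0^T \widetilde{\psi}_t\,dZ^1_t. \tag{94}$$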

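Next, it is obvious that the projection of the second term, that is, the projection of $\int_0^T \psi^{Y,\mathbb{Q}}_t\,dZ^1_t$, on $\mathcal{P}^0$ is equal to itself, so that

$$\Pi^0_{\mathbb{P}}\Big(\int_0^T \psi^{Y,\mathbb{Q}}_t\,dZ^1_t\Big) = \int_0^T \psi^{Y,\mathbb{Q}}_t\,dZ^1_t. \tag{95}$$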

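Finally, we shall compute the projection $\Pi^0_{\mathbb{P}}(N^{Y,\mathbb{Q}}_T)$. Recall that the process $N^{Y,\mathbb{Q}}$ is a $\mathbb{Q}$-martingale strongly orthogonal to $Z^1$ under $\mathbb{Q}$. Hence, for any $N^{Y,\mathbb{Q}}$-integrable process $\nu$ and any process $\psi \in \mathcal{A}(\mathbb{Q})$ we have

$$\mathbb{E}_{\mathbb{P}}\Big(\eta_T \int_0^T \nu_t\,dN^{Y,\mathbb{Q}}_t \int_0^T \psi_t\,dZ^1_t\Big) = 0.$$

Thus, it remains to find processes $\nu$ and $\bar{\psi} \in \mathcal{A}(\mathbb{Q})$ for which

$$\eta_T \int_0^T \nu_t\,dN^{Y,\mathbb{Q}}_t = N^{Y,\mathbb{Q}}_T - \int_0^T \bar{\psi}_t\,dZ^1_t, \tag{96}$$

in which case we shall have that $\Pi^0_{\mathbb{P}}(N^{Y,\mathbb{Q}}_T) = \int_0^T \bar{\psi}_t\,dZ^1_t$.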

Let us set $U_t = \widetilde{\eta}_t \int_0^t \nu_u\,dN^{Y,\mathbb{Q}}_u$ for every $t \in [0,T]$. Recall that (see (90)) there exists a process $\widetilde{\psi}$ in $\Phi(\mathbb{G}) = \mathcal{A}(\mathbb{Q})$ such that $d\widetilde{\eta}_t = \widetilde{\psi}_t\,dZ^1_t$. Using the product rule, and taking into account the orthogonality of $\widetilde{\eta}$ and $N^{Y,\mathbb{Q}}$ under $\mathbb{Q}$, we find that $U$ is a local martingale under $\mathbb{Q}$, and it satisfies

$$U_t = \int_0^t \widetilde{\eta}_{u-}\,\nu_u\,dN^{Y,\mathbb{Q}}_u + \int_0^t \Big(\int_0^u \nu_s\,dN^{Y,\mathbb{Q}}_s\Big)\widetilde{\psi}_u\,dZ^1_u. \tag{97}$$

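Consequently, upon letting

$$\nu_t = (\widetilde{\eta}_{t-})^{-1}, \quad \forall\, t \in [0,T], \tag{98}$$

we obtain from (97)

$$U_t = N^{Y,\mathbb{Q}}_t + \int_0^t \widetilde{\psi}_u\Big(\int_0^u \nu_s\,dN^{Y,\mathbb{Q}}_s\Big)dZ^1_u. \tag{99}$$

Note that the left-hand side of (96) is equal to $U_T$. Thus, comparing (96) and (99), we see that we may take

$$\bar{\psi}_t = -\widetilde{\psi}_t \int_0^t \nu_u\,dN^{Y,\mathbb{Q}}_u = -\widetilde{\psi}_t \int_0^t (\widetilde{\eta}_{u-})^{-1}\,dN^{Y,\mathbb{Q}}_u. \tag{100}$$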

It is clear that with $\nu$ defined in (98) the integral $\int_0^t \nu_u\,dN^{Y,\mathbb{Q}}_u$ is a $\mathbb{Q}$-martingale. Thus, the process $U$ is a martingale, rather than a local martingale, under $\mathbb{Q}$. Together with (99) this implies that the process

$$\int_0^t \widetilde{\psi}_u\Big(\int_0^u \nu_s\,dN^{Y,\mathbb{Q}}_s\Big)dZ^1_u$$

is a $\mathbb{Q}$-martingale. Consequently, the process $\bar{\psi}$ defined in (100) belongs to the class $\mathcal{A}(\mathbb{Q})$. To complete the proof, it suffices to combine (94), (95) and (100).

It should be acknowledged that the last result is not new. In fact, it is merely a special case of Theorem 6 in Rheinländer and Schweizer (1997). We believe, however, that our derivation of the result sheds a new light on the structure of the orthogonal projection computed above.

Remark. Although the above proposition provides us with the structure of the projection $\Pi^0_{\mathbb{P}}(Y)$, it is not easy in general to obtain closed-form expressions for the components on the right-hand side of (93) in terms of the initial data for the problem. Thus, one may need to resort to numerical approximations, which in principle can be obtained by solving the following problem

$$\min_{\xi \in \mathcal{P}^0} \mathbb{E}_{\mathbb{P}}(Y - \xi)^2. \tag{101}$$

An approximate solution to the last problem yields a process, say $\bar{\psi}^{Y,\mathbb{P}}$, so that $\Pi^0_{\mathbb{P}}(Y) \approx \int_0^T \bar{\psi}^{Y,\mathbb{P}}_t\,dZ^1_t$.

9.4 Mean-Variance Pricing and Hedging of a Generic Claim

Let us define

κ = d − v = d− v − EPX + EPX0.

For simplicity, we shall only consider the case when $\kappa > 0$. This is equivalent to assuming that $d > v - p$ for all $p \in [0,v]$. Thus, the results of Proposition 20 (i) apply. Consequently, denoting

$$\rho = (d-v)^2 - \gamma^0_X\,(e^{\theta^2 T} - 1),$$

we obtain the following result.

Proposition 22. Suppose that $\gamma^0_X \le (d-v)^2 (e^{\theta^2 T} - 1)^{-1}$. Then the buyer's mean-variance price is

$$p^{d,v}(X) = \min\{-\kappa + \sqrt{\rho},\, v\} \vee 0. \tag{102}$$

Otherwise, $p^{d,v}(X) = -\infty$.

In the case when $\gamma^0_X \le (d-v)^2 (e^{\theta^2 T} - 1)^{-1}$, the mean-variance hedging strategy for a generic claim $X$ is given by $\phi^*(d,v,p^{d,v}(X),X)$, where the process $\phi^*$ is defined in Proposition 20. The projection part of the strategy $\phi^*(d,v,p^{d,v}(X),X)$, that is, the process $\psi^{1,X^0}$, can be computed according to (93).

9.5 Projections of Defaultable Claims

In this section, we adopt the framework of Section 8.3. In particular, the default time $\tau$ is a random time on $(\Omega,\mathcal{G},\mathbb{P})$ given by formula (80), and the process $H$ is given as $H_t = \mathbf{1}_{\{\tau \le t\}}$ for every $t \in [0,T]$. The natural filtration $\mathbb{H}$ of $H$ is an auxiliary filtration, so that $\mathbb{G} = \mathbb{F} \vee \mathbb{H}$. Recall that we have assumed that $\tau$ admits the $\mathbb{F}$-hazard process $\Gamma$ under $\mathbb{P}$ and thus also, in view of the construction (80), under $\mathbb{Q}$. Suppose, in addition, that the hazard process $\Gamma$ is an increasing continuous process. Then the process $M_t = H_t - \Gamma_{t\wedge\tau}$ is known to be a $\mathbb{G}$-martingale under $\mathbb{Q}$. Any $\mathcal{G}_T$-measurable random variable $X$ is referred to as a defaultable claim.

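Recall that the process $\widetilde{W}_t = W_t + \theta t$ is a Brownian motion with respect to $\mathbb{F}$ under $\mathbb{Q}$, and thus the process $Z^1$ is a square-integrable $\mathbb{G}$-martingale under $\mathbb{Q}$, since

$$dZ^1_t = Z^1_t\,\sigma\,d\widetilde{W}_t, \quad Z^1_0 > 0.$$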

The following proposition is an important technical result.

Proposition 23. The filtration $\mathbb{G}^1$ is equal to the filtration $\mathbb{G}$, that is, $\mathcal{G}^1_t = \mathcal{G}_t$ for every $t \in \mathbb{R}_+$.

Proof. It is clear that $\mathbb{G}^1 \subseteq \mathbb{G}$. For a fixed $T > 0$, let $y_1, y_2 \in \mathbb{R}$ and let the processes $\psi^1, \psi^2$ belong to $\mathcal{A}(\mathbb{Q})$. Thus the processes

$$Y^1_t = y_1 + \int_0^t \psi^1_u\,dZ^1_u, \qquad Y^2_t = y_2 + \int_0^t \psi^2_u\,dZ^1_u$$

are $\mathbb{G}^1$-adapted processes. Then the process

$$Y^1_t Y^2_t = y_1 y_2 + \int_0^t Y^1_u \psi^2_u\,dZ^1_u + \int_0^t Y^2_u \psi^1_u\,dZ^1_u + \int_0^t \psi^1_u \psi^2_u\,d\langle Z^1\rangle_u$$

is also $\mathbb{G}^1$-adapted. It is easy to check that the processes

$$\int_0^t Y^1_u \psi^2_u\,dZ^1_u, \qquad \int_0^t Y^2_u \psi^1_u\,dZ^1_u$$

are $\mathbb{G}^1$-adapted. We thus conclude that for any processes $\psi^1$ and $\psi^2$ from $\mathcal{A}(\mathbb{Q})$, the process

$$\int_0^t \psi^1_u \psi^2_u\,d\langle Z^1\rangle_u = \int_0^t \psi^1_u \psi^2_u\,(Z^1_u)^2 \sigma^2\,du$$

is $\mathbb{G}^1$-adapted as well. In particular, it follows that for any bounded $\mathbb{G}$-adapted process $\zeta$ the integral $\int_0^t \zeta_u\,du$ defines a $\mathbb{G}^1$-adapted process. Let us take $\zeta_u = H_u$. Since $\int_0^t H_u\,du = t - \tau \wedge t$, we obtain that the process $\tau \wedge t$ is $\mathbb{G}^1$-adapted. Hence, it is easily seen that $\mathcal{G}_t \subseteq \mathcal{G}^1_t$ for $t \in [0,T]$. Since $T$ was an arbitrary positive number, we have shown that $\mathbb{G} = \mathbb{G}^1$.

Projection of a Survival Claim

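We shall now compute the process $\widehat{\psi}^{Y,\mathbb{P}}$, which occurs in the projection $\Pi^0_{\mathbb{P}}(Y)$ for a random variable $Y = Z\mathbf{1}_{\{\tau > T\}}$, where $Z \in L^2(\Omega,\mathcal{F}_T,\mathbb{Q})$. It is known that any random variable $Y$ from $L^2(\Omega,\mathcal{G}_T,\mathbb{Q}) = L^2(\Omega,\mathcal{G}^1_T,\mathbb{Q})$, which vanishes on the set $\{\tau \le T\}$, can indeed be represented in this way. Any random variable $Y$ of the form $Z\mathbf{1}_{\{\tau > T\}}$ is referred to as a survival claim with maturity date $T$, and the random variable $Z$ is said to be the promised payoff associated with $Y$.

It is known (see, e.g., Bielecki and Rutkowski (2004)) that

$$\mathbb{E}_{\mathbb{Q}}(Y \mid \mathcal{G}_t) = \mathbb{E}_{\mathbb{Q}}(Z\mathbf{1}_{\{\tau > T\}} \mid \mathcal{G}_t) = \mathbb{E}_{\mathbb{Q}}(Z\mathbf{1}_{\{\tau > T\}} \mid \mathcal{G}^1_t) = \mathbf{1}_{\{\tau > t\}}\,e^{\Gamma_t}\,\mathbb{E}_{\mathbb{Q}}(Z e^{-\Gamma_T} \mid \mathcal{F}_t) = L_t\,m^Z_t,$$

where $L_t := \mathbf{1}_{\{\tau > t\}}\,e^{\Gamma_t}$ is a $\mathbb{G}$-martingale and $m^Z_t = \mathbb{E}_{\mathbb{Q}}(Z e^{-\Gamma_T} \mid \mathcal{F}_t)$ is an $\mathbb{F}$-martingale. From the predictable representation theorem for a Brownian motion (or since the default-free market is complete), it follows that there exists an $\mathbb{F}$-adapted process $\mu^Z$ such that

$$m^Z_t = m^Z_0 + \int_0^t \mu^Z_u\,dZ^1_u. \tag{103}$$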
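For intuition, the formula $\mathbb{E}_{\mathbb{Q}}(Y \mid \mathcal{G}_t) = L_t m^Z_t$ becomes fully explicit when the hazard process $\Gamma$ is deterministic and the promised payoff $Z$ is a constant: then $m^Z_t = Z e^{-\Gamma_T}$ and the pre-default value is $Z e^{\Gamma_t - \Gamma_T}$. The sketch below evaluates this special case; the function name, the linear hazard function and the numbers are illustrative assumptions rather than part of the original text.

```python
import math

def survival_claim_value(t, T, payoff, hazard, default_time=None):
    """E_Q(Y | G_t) for Y = payoff * 1{tau > T}, assuming a deterministic
    hazard function `hazard` (Gamma) and a constant promised payoff.

    Returns 0 once default has been observed by time t; otherwise
    1{tau > t} * exp(Gamma_t) * payoff * exp(-Gamma_T).
    """
    if default_time is not None and default_time <= t:
        return 0.0
    return payoff * math.exp(hazard(t) - hazard(T))

# Assumed example: constant intensity 3% per year, so Gamma_t = 0.03 * t.
gamma = lambda s: 0.03 * s
value_today = survival_claim_value(0.0, 5.0, 100.0, gamma)
value_after_default = survival_claim_value(2.0, 5.0, 100.0, gamma, default_time=1.5)
```

In this deterministic case the martingale $m^Z$ is constant, so the integrand $\mu^Z$ in (103) vanishes.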

In Proposition 21, we have already described the structure of the process $\widehat{\psi}^{Y,\mathbb{P}}$ that specifies the projection of $Y$ on $\mathcal{P}^0$. In the next two results, we shall give more explicit formulae for $\psi^{Y,\mathbb{Q}}$ and $N^{Y,\mathbb{Q}}$ within the present setup.

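Lemma 16. Consider a survival claim $Y = Z\mathbf{1}_{\{\tau > T\}}$ with the promised payoff $Z \in L^2(\Omega,\mathcal{F}_T,\mathbb{Q})$. It holds that $\psi^{Y,\mathbb{Q}}_t = L_{t-}\,\mu^Z_t$ for every $t \in [0,T]$, where by convention $L_{0-} = 0$.

Proof. It is easy to check that $dL_t = -L_{t-}\,dM_t$. Since $\Gamma$ is increasing, the process $L$ is of finite variation, and thus

$$d(L_t m^Z_t) = L_{t-}\,dm^Z_t + m^Z_t\,dL_t = L_{t-}\,\mu^Z_t\,dZ^1_t + m^Z_t\,dL_t.$$

Writing $\widehat{Y}_t = \mathbb{E}_{\mathbb{Q}}(Y \mid \mathcal{G}_t) = L_t m^Z_t$, we thus obtain

$$d\langle \widehat{Y}, Z^1\rangle_t = L_{t-}\,\mu^Z_t\,d\langle Z^1\rangle_t$$

and $\psi^{Y,\mathbb{Q}}_t = L_{t-}\,\mu^Z_t$, which proves the result.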

For the proof of the next auxiliary result, the reader is referred, for instance, to Jeanblanc and Rutkowski (2000) or Bielecki and Rutkowski (2004).

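Lemma 17. Consider a survival claim $Y = Z\mathbf{1}_{\{\tau > T\}}$ with the promised payoff $Z \in L^2(\Omega,\mathcal{F}_T,\mathbb{Q})$. The process $N^{Y,\mathbb{Q}}$ in the Galtchouk-Kunita-Watanabe decomposition of $Y$ with respect to $Z^1$ under $\mathbb{Q}$ is given by the expression

$$N^{Y,\mathbb{Q}}_t = \int_{]0,t]} n^Z_u\,dM_u,$$

where the process $M_t = H_t - \Gamma_{t\wedge\tau}$ is a $\mathbb{G}$-martingale, strongly orthogonal in the martingale sense to $\widetilde{W}$ under $\mathbb{Q}$, and where

$$n^Z_t = -\mathbb{E}_{\mathbb{Q}}\big(Z e^{\Gamma_t - \Gamma_T} \,\big|\, \mathcal{F}_t\big). \tag{104}$$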

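By combining Proposition 21 with the last two results, we obtain the following corollary, which furnishes an almost explicit representation for the process $\widehat{\psi}^{Y,\mathbb{P}}$ associated with the projection on $\mathcal{P}^0$ of a survival claim.

Corollary 11. Let $Y = Z\mathbf{1}_{\{\tau > T\}}$ be a survival claim, where $Z$ belongs to $L^2(\Omega,\mathcal{F}_T,\mathbb{Q})$. Then $\Pi^0_{\mathbb{P}}(Y)$ is given by the following expression

$$\Pi^0_{\mathbb{P}}(Y) = \int_0^T \widehat{\psi}^{Y,\mathbb{P}}_t\,dZ^1_t,$$

where for every $t \in [0,T]$

$$\widehat{\psi}^{Y,\mathbb{P}}_t = L_{t-}\,\mu^Z_t - \widetilde{\psi}_t\Big(\widetilde{\eta}_0^{-1}\,\mathbb{E}_{\mathbb{Q}}Y + \int_0^t \widetilde{\eta}_u^{-1}\,n^Z_u\,dM_u\Big) \tag{105}$$

where in turn $L_t = \mathbf{1}_{\{\tau > t\}}\,e^{\Gamma_t}$ and the processes $\widetilde{\psi}$, $\widetilde{\eta}$, $\mu^Z$ and $n^Z$ are given by (91), (92), (103) and (104), respectively.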

Projection of a Defaultable Bond

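According to the adopted convention regarding the recovery scheme, the terminal payoff at time $T$ of a defaultable bond equals $X = L\mathbf{1}_{\{\tau > T\}} + \delta L\mathbf{1}_{\{\tau \le T\}}$ for some $L > 0$ and $\delta \in [0,1)$. Notice that the payoff $X$ can be represented as follows: $X = \delta L + (1-\delta)LY$, where $Y = \mathbf{1}_{\{\tau > T\}}$ is a simple survival claim, with the promised payoff $Z = 1$. Using the linearity of the projection $\Pi^0_{\mathbb{P}}$, we notice that $\Pi^0_{\mathbb{P}}(X)$ can be evaluated as follows

$$\Pi^0_{\mathbb{P}}(X) = \delta L\,\Pi^0_{\mathbb{P}}(1) + (1-\delta)L\,\Pi^0_{\mathbb{P}}(Y).$$

By virtue of (94) and Corollary 11, we conclude that

$$\Pi^0_{\mathbb{P}}(X) = -\delta L\,e^{-\theta^2 T}\,\Pi^0_{\mathbb{Q}}(\eta_T) + (1-\delta)L \int_0^T \widehat{\psi}_t\,dZ^1_t,$$

where (cf. (105))

$$\widehat{\psi}_t = \mathbf{1}_{\{\tau > t\}}\,e^{\Gamma_t}\,\mu_t - \widetilde{\psi}_t\Big(e^{-\theta^2 T}\,\mathbb{E}_{\mathbb{Q}}\big(e^{-\Gamma_T}\big) + \int_0^t \widetilde{\eta}_u^{-1}\,n_u\,dM_u\Big), \tag{106}$$

where in turn the process $\widetilde{\psi}$ is given by (91), $n$ by $n_t = -\mathbb{E}_{\mathbb{Q}}(e^{\Gamma_t - \Gamma_T} \mid \mathcal{F}_t)$, and the process $\mu$ is such that

$$\mathbb{E}_{\mathbb{Q}}(e^{-\Gamma_T} \mid \mathcal{F}_t) = \mathbb{E}_{\mathbb{Q}}Y + \int_0^t \mu_u\,dZ^1_u, \quad \forall\, t \in [0,T].$$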

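Example 5. Consider the special case when $\Gamma$ is deterministic. It is easily seen that we now have $\mu = 0$ and $n_t = -e^{\Gamma_t - \Gamma_T}$. Consequently, (106) becomes

$$\widehat{\psi}_t = -\widetilde{\psi}_t\,e^{-\Gamma_T}\Big(e^{-\theta^2 T} - \int_0^t \widetilde{\eta}_u^{-1}\,e^{\Gamma_u}\,dM_u\Big),$$

and thus

$$\Pi^0_{\mathbb{P}}(X) = -\delta L\,e^{-\theta^2 T}\,\Pi^0_{\mathbb{Q}}(\eta_T) - (1-\delta)L \int_0^T \widetilde{\psi}_t\,e^{-\Gamma_T}\Big(e^{-\theta^2 T} - \int_0^t \widetilde{\eta}_u^{-1}\,e^{\Gamma_u}\,dM_u\Big)dZ^1_t,$$

where the processes $\widetilde{\psi}$ and $\widetilde{\eta}$ are given by (91) and (92), respectively.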

10 Risk-Return Portfolio Selection

In the preceding sections, we have examined the Markowitz-type mean-variance hedging problem from the particular perspective of valuation of non-attainable contingent claims. In view of the dependence of the mean-variance price obtained through this procedure on the agent's preferences (formally reflected, among others, by the values of the parameters d and v), this specific application of the Markowitz-type methodology suffers from deficiencies, which may undermine its practical implementations.

In this section, we shall take a totally different perspective, and we shall assume that a given claim X can be purchased by an agent (an asset management fund, say) for some pre-specified price. For instance, the price of X can be given by an investment bank that is able to hedge this claim using some arbitrage-free model, or it can be simply given by the OTC market.

Let us emphasize that an agent is now assumed to be a price taker, so that the issue of preference-based valuation of a non-attainable claim will not be considered in this section.

We postulate that an agent would like to invest in X, but will not be able (or willing) to hedge this claim using the underlying primary assets (if any such assets are available). As a consequence, an agent will only have in its portfolio standard instruments that are widely available for trading. The two important issues we would like to address in this section are:

• What proportion of the initial endowment v should an agent invest in the claim X if the goal is to lower the standard deviation (or, equivalently, the variance) of return, and to keep the expected rate of return at the desired level?

• How much should an agent invest in X in order to enhance the expected rate of return, and to preserve at the same time the pre-specified level of risk, as measured by the standard deviation of the rate of return?

We shall argue that mathematical tools and results presented in the previous sections are sufficient to solve both these problems. It seems to us that this alternative application of the mean-variance methodology can be of practical importance as well.

For the sake of simplicity, we shall solve the optimization problems formulated above in the class $\Phi(\mathbb{F})$ of $\mathbb{F}$-admissible trading strategies. A similar study can be conducted for the case of $\mathbb{G}$-admissible strategies. For any $v > 0$ and any trading strategy $\phi \in \Phi(\mathbb{F})$, let $r(\phi)$ be the simple rate of return, defined as

$$r(\phi) = \frac{V^v_T(\phi) - v}{v}.$$

The minimization of the standard deviation of the rate of return, which equals

$$\sigma(r(\phi)) = \sqrt{\mathbb{V}_{\mathbb{P}}\Big(\frac{V^v_T(\phi) - v}{v}\Big)} = v^{-1}\sqrt{\mathbb{V}_{\mathbb{P}}\big(V^v_T(\phi)\big)},$$

is, of course, equivalent to the minimization of the variance $\mathbb{V}_{\mathbb{P}}(V^v_T(\phi))$. Within the present context, it is natural to introduce the constraint

$$\mathbb{E}_{\mathbb{P}}\big(v^{-1}V^v_T(\phi)\big) \ge d = 1 + d_r,$$

where $d_r > 0$ represents the desired minimal level of the expected rate of return.

10.1 Auxiliary Problems

The following auxiliary problem MV$(dv, v)$ is merely a version of the previously considered problem MV$(d, v)$:

Problem MV$(dv, v)$: For a fixed $v > 0$ and $d > 1$, minimize the variance $\mathbb{V}_{\mathbb{P}}(V^v_T(\phi))$ over all strategies $\phi \in \Phi(\mathbb{F})$, subject to $\mathbb{E}_{\mathbb{P}}V^v_T(\phi) \ge dv$.

We assume from now on that $\theta \neq 0$, and we denote by $\Theta$ the constant

$$\Theta = (e^{\theta^2 T} - 1)^{-1} > 0.$$

Recall that for the problem MV$(dv, v)$, the risk-return trade-off can be summarized by the minimal variance curve $v^*(dv, v)$. By virtue of Proposition 16, we have

$$v^*(dv, v) = \Theta v^2 (d-1)^2 = \Theta v^2 d_r^2. \tag{107}$$

Equivalently, the minimal standard deviation of the rate of return satisfies

$$\sigma^*_r = \sigma\big(r(\phi^*(dv, v))\big) = v^{-1}\sqrt{v^*(dv, v)} = \sqrt{\Theta}\,d_r,$$

so, as expected, it is independent of the value of the initial endowment $v$.
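The risk-return trade-off in (107) is easy to tabulate. The following sketch evaluates the constant $\Theta$, the minimal variance and the minimal standard deviation of the rate of return; the function names and the parameter values are illustrative assumptions.

```python
import math

def big_theta(theta, T):
    """Theta = 1 / (exp(theta^2 * T) - 1); requires theta != 0."""
    if theta == 0:
        raise ValueError("theta must be non-zero")
    return 1.0 / math.expm1(theta ** 2 * T)

def minimal_variance(theta, T, v, d_r):
    """Minimal variance of terminal wealth, formula (107)."""
    return big_theta(theta, T) * v ** 2 * d_r ** 2

def minimal_std_of_return(theta, T, d_r):
    """Minimal standard deviation of the rate of return, sqrt(Theta) * d_r."""
    return math.sqrt(big_theta(theta, T)) * d_r

# Assumed example: theta = 0.25, T = 1, target expected return d_r = 5%.
sigma_small = math.sqrt(minimal_variance(0.25, 1.0, 100.0, 0.05)) / 100.0
sigma_large = math.sqrt(minimal_variance(0.25, 1.0, 5000.0, 0.05)) / 5000.0
```

Both `sigma_small` and `sigma_large` should coincide with `minimal_std_of_return(0.25, 1.0, 0.05)`, illustrating that the minimal risk per unit of endowment does not depend on $v$.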

Suppose now that a claim $X$ is available for some price $p_X \neq 0$, referred to as the market price. It is convenient to introduce the normalized claim $\bar{X} = X p_X^{-1}$. Under this convention, by the postulated linearity property of the market price, the price $p_{\bar{X}}$ of one unit of $\bar{X}$ is manifestly equal to 1.

The next auxiliary problem we wish to solve reads: find $p \in \mathbb{R}$ such that the solution to the problem MV$(dv, v, p, p\bar{X})$ has the minimal variance. This means, of course, that we are looking for $p \in \mathbb{R}$ for which $v^*(dv, v, p, p\bar{X})$ is minimal. Notice that the constraint on the expected rate of return becomes

$$\mathbb{E}_{\mathbb{P}}\big(v^{-1}(V^{v-p}_T(\phi) + p\bar{X})\big) \ge d = 1 + d_r,$$

where $d_r > 0$. It is clear that the curve $v^*(dv, v, p, p\bar{X})$ can be derived from the general expression for $v^*(d, v, p, X)$, which was established in Proposition 17. Let us denote

$$\gamma_X = \mathbb{V}_{\mathbb{P}}\big(\bar{X} - \mathbb{E}_{\mathbb{P}}(\bar{X} \mid \mathcal{F}_T)\big) \quad \text{and} \quad \nu_X = \mathbb{E}_{\mathbb{Q}}\bar{X} - 1.$$

Let us notice that the condition $d - v + p - \mathbb{E}_{\mathbb{Q}}X > 0$, which was imposed in part (i) of Proposition 17, now corresponds to the following inequality: $v d_r > p\nu_X$. We shall assume from now on that $\bar{X} \neq 1$ (this assumption means simply that the claim $\bar{X}$ does not represent the savings account). Recall that $v > 0$ and $d_r = d - 1 > 0$.

Proposition 24. (i) If we assume that $\gamma_X > 0$ and that $\nu_X \neq 0$, then the problem MV$(dv, v, p, p\bar{X})$ has a solution with the minimal variance with respect to $p$. The minimal variance equals

$$v^*(dv, v, p^*, p^*\bar{X}) = \Theta v^2 d_r^2\Big(1 - \frac{\nu_X^2}{\Theta^{-1}\gamma_X + \nu_X^2}\Big) \tag{108}$$

and the optimal value of $p$ equals

$$p^* = \frac{v d_r \nu_X}{\Theta^{-1}\gamma_X + \nu_X^2}. \tag{109}$$

(ii) Let $\gamma_X > 0$ and $\nu_X = 0$. Then we have $p^* = 0$ and the minimal variance equals

$$v^*(dv, v, p^*, p^*\bar{X}) = \Theta v^2 d_r^2.$$

(iii) Let $\gamma_X = 0$ and $\nu_X \neq 0$. If the inequality $\nu_X > 0$ ($\nu_X < 0$, respectively) holds then for any $p \ge v d_r \nu_X^{-1}$ ($p \le v d_r \nu_X^{-1}$, respectively) the minimal variance $v^*(dv, v, p, p\bar{X})$ is minimal with respect to $p$ and it equals 0.

(iv) Let $\gamma_X = \nu_X = 0$. Then $\bar{X}$ is an attainable claim and $\mathbb{E}_{\mathbb{Q}}\bar{X} = 1$. In this case, for any $p \in \mathbb{R}$ the minimal variance equals

$$v^*(dv, v, p, p\bar{X}) = \Theta v^2 d_r^2.$$

Proof. Let us first prove parts (i)-(ii). It suffices to observe that, by virtue of Proposition 17, the minimal variance for the problem MV$(dv, v, p, p\bar{X})$ is given by the expression:

$$v^*(dv, v, p, p\bar{X}) = \Theta(d_r v - p\nu_X)^2 + p^2\gamma_X \tag{110}$$

provided that $v d_r > p\nu_X$. A simple argument shows that the minimal value for the right-hand side in (110) is obtained by setting $p = p^*$, where $p^*$ is given by (109), and the minimal variance is given by (108). Moreover, it is easily seen that for $p^*$ given by (109) the inequality $v d_r > p^*\nu_X$ is indeed satisfied, provided that $\gamma_X > 0$. Notice also that if $\mathbb{E}_{\mathbb{Q}}\bar{X} = 1$, we obviously have $v d_r > p\nu_X = 0$ for any $p \in \mathbb{R}_+$, and thus we obtain the following optimal values:

$$p^* = 0, \qquad v^*(dv, v, p^*, p^*\bar{X}) = \Theta v^2 d_r^2.$$

Assume now that $v d_r \le p\nu_X$, so that the case $\nu_X = 0$ (i.e., the case $\mathbb{E}_{\mathbb{Q}}\bar{X} = 1$) is excluded. Then, by virtue of part (ii) in Proposition 17, the minimal variance equals $p^2\gamma_X$ (notice that the assumption that $\gamma_X$ is strictly positive is not needed here). Assume first that $\mathbb{E}_{\mathbb{Q}}\bar{X} < 1$, so that $\nu_X < 0$. Then the condition $v d_r \le p\nu_X$ becomes $p \le v d_r \nu_X^{-1}$, and thus $p$ is necessarily negative. The minimal variance corresponds to $p^* = v d_r \nu_X^{-1}$, and it equals

$$v^*(dv, v, p^*, p^*\bar{X}) = (p^*)^2\gamma_X = v^2 d_r^2\,\nu_X^{-2}\,\gamma_X. \tag{111}$$

If, on the contrary, $\mathbb{E}_{\mathbb{Q}}\bar{X} > 1$, then $\nu_X > 0$ and we obtain $p \ge v d_r \nu_X^{-1}$, so that $p$ is strictly positive. Again, the minimal variance corresponds to $p^* = v d_r \nu_X^{-1}$, and it is given by (111). It is easy to check that the following inequality holds:

$$\Theta v^2 d_r^2\Big(1 - \frac{\nu_X^2}{\Theta^{-1}\gamma_X + \nu_X^2}\Big) < v^2 d_r^2\,\nu_X^{-2}\,\gamma_X.$$

By combining the considerations above, we conclude that statements (i)-(ii) are valid. The proof of part (iii) is also based on the analysis above. We thus proceed to the proof of the last statement.

Notice that $\Theta^{-1}\gamma_X + \nu_X^2 = 0$ if and only if $\gamma_X = 0$ and $\nu_X = 0$. This means that $\bar{X}$ is $\mathcal{F}_T$-measurable (and thus $\mathbb{F}$-attainable) and $\mathbb{E}_{\mathbb{Q}}\bar{X} = 1$ (so that the arbitrage price of $\bar{X}$ coincides with its market price $p_{\bar{X}}$). Condition $v d_r - p\nu_X > 0$ is now satisfied, and thus the minimal variance is given by (110), which now becomes

$$v^*(dv, v, p, p\bar{X}) = \Theta v^2 d_r^2, \quad \forall\, p \in \mathbb{R}_+.$$

Obviously, the result does not depend on $p$. This proves part (iv).
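The "simple argument" behind (108) and (109) is the minimization of the quadratic (110) in $p$. The sketch below evaluates (108)-(110) for an assumed set of inputs and compares the closed forms with the quadratic on a grid; the function names and numbers are illustrative, and the branch for $v d_r \le p\nu_X$ uses the value $p^2\gamma_X$ quoted in the proof.

```python
def variance_given_p(p, v, d_r, big_theta, gamma_x, nu_x):
    """Minimal variance for a fixed p: formula (110) when v*d_r > p*nu_x,
    and p^2 * gamma_x otherwise (the second case treated in the proof)."""
    if v * d_r > p * nu_x:
        return big_theta * (d_r * v - p * nu_x) ** 2 + p ** 2 * gamma_x
    return p ** 2 * gamma_x

def optimal_p(v, d_r, big_theta, gamma_x, nu_x):
    """Optimal amount p*, formula (109); assumes gamma_x > 0."""
    return v * d_r * nu_x / (gamma_x / big_theta + nu_x ** 2)

def minimal_variance_over_p(v, d_r, big_theta, gamma_x, nu_x):
    """Minimal variance with respect to p, formula (108); assumes gamma_x > 0."""
    ratio = nu_x ** 2 / (gamma_x / big_theta + nu_x ** 2)
    return big_theta * v ** 2 * d_r ** 2 * (1.0 - ratio)

# Assumed inputs: v = 100, d_r = 10%, Theta = 2, gamma_X = 0.5, nu_X = 0.2.
args = (100.0, 0.10, 2.0, 0.5, 0.2)
p_star = optimal_p(*args)
v_star = minimal_variance_over_p(*args)
grid = [variance_given_p(p_star + 0.5 * k, *args) for k in range(-40, 41)]
```

On this grid no value should fall below `v_star`, and the value at `p_star` should match it, in line with parts (i)-(ii) of Proposition 24.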

In the last proposition, no a priori restriction on the value of the parameter p was imposed. Of course, one can also consider a related constrained problem by postulating, for instance, that the price p belongs to the interval [0, v].

10.2 Minimization of Risk

We are in a position to examine the first question, which reads: how much to invest in the new opportunity in order to minimize the risk and to preserve at the same time the pre-specified level $d_r > 0$ of the expected rate of return.

Case of an attainable claim. Assume first that $\bar{X}$ is an $\mathbb{F}$-attainable contingent claim, so that $\mathbb{E}_{\mathbb{P}}(\bar{X} \mid \mathcal{F}_T) = \bar{X}$, and thus $\gamma_X = 0$. If the claim is correctly priced by the market, i.e., if $\mathbb{E}_{\mathbb{Q}}\bar{X} = p_{\bar{X}} = 1$ then, by virtue of part (iv) in Proposition 24, for any choice of $p$ the minimal variance is the same as in the problem MV$(dv, v)$. Hence, as expected, the possibility of investing in the claim has no bearing on the efficiency of trading.

Let us now consider the case where $\mathbb{E}_{\mathbb{Q}}\bar{X} \neq 1$, that is, the market price $p_{\bar{X}}$ does not coincide with the arbitrage price $\pi_0(\bar{X})$. Suppose first that $\mathbb{E}_{\mathbb{Q}}\bar{X} > 1$, that is, the claim is underpriced by the market. Then, in view of part (iii) in Proposition 24, the variance of the rate of return can be reduced to 0 by choosing $p$ which satisfies

$$p \ge v d_r (\mathbb{E}_{\mathbb{Q}}\bar{X} - 1)^{-1} > 0.$$

Similarly, if $\mathbb{E}_{\mathbb{Q}}\bar{X} < 1$ then for any $p$ such that

$$p \le v d_r (\mathbb{E}_{\mathbb{Q}}\bar{X} - 1)^{-1} < 0$$

the variance equals 0. Of course, this feature is due to the presence of arbitrage opportunities in the market. We conclude that, as expected, in the case of an attainable claim the solution to the problem considered in this section is rather trivial, and thus it has no practical appeal.

Case of a non-attainable claim. We now assume that $\gamma_X > 0$. Suppose first that $\mathbb{E}_{\mathbb{Q}}\bar{X} = 1$. By virtue of part (ii) in Proposition 24, under this assumption it is optimal not to invest in the claim. To better appreciate this result, notice that for the conditional expectation $\widehat{X} = \mathbb{E}_{\mathbb{P}}(\bar{X} \mid \mathcal{F}_T)$ we have $\mathbb{E}_{\mathbb{Q}}\widehat{X} = \mathbb{E}_{\mathbb{Q}}\bar{X} = 1$ and $\mathbb{E}_{\mathbb{P}}\widehat{X} = \mathbb{E}_{\mathbb{P}}\bar{X}$ (cf. Section 8.2). Therefore, trading in $\bar{X}$ is essentially equivalent to trading in the attainable claim $\widehat{X}$, but trading in $\bar{X}$ results in the residual variance $p^2\gamma_X$. This observation explains why the solution $p^* = 0$ is optimal.

Suppose now that $\mathbb{E}_{\mathbb{Q}}\bar{X} \neq 1$. Then part (i) of Proposition 24 shows that the variance of the rate of return can always be reduced by trading in the claim. Specifically, $p^*$ is strictly positive provided that $\mathbb{E}_{\mathbb{Q}}\bar{X} > 1 = p_{\bar{X}}$, that is, the expected value of $\bar{X}$ under the martingale measure $\mathbb{Q}$ for the underlying market is greater than its market price.

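Case of an independent claim. Assume that the claim $\bar{X}$ is independent of $\mathcal{F}_T$, so that $\gamma_X > 0$ is the variance of $\bar{X}$. In this case $\mathbb{E}_{\mathbb{Q}}\bar{X} = \mathbb{E}_{\mathbb{P}}\bar{X}$ and thus (108) becomes

$$v^* = \Theta v^2 d_r^2\Big(1 - \frac{(\mathbb{E}_{\mathbb{P}}\bar{X} - 1)^2}{\Theta^{-1}\mathbb{V}_{\mathbb{P}}(\bar{X}) + (\mathbb{E}_{\mathbb{P}}\bar{X} - 1)^2}\Big).$$

From the last formula, it is clear that an agent should always invest either a positive or negative amount of the initial endowment $v$ in an independent claim, except for the case where $\mathbb{E}_{\mathbb{P}}\bar{X} = 1$. If $\mathbb{E}_{\mathbb{P}}\bar{X} \neq 1$ then the optimal value of $p$ equals (cf. (109))

$$p^* = \frac{v d_r (\mathbb{E}_{\mathbb{P}}\bar{X} - 1)}{\Theta^{-1}\mathbb{V}_{\mathbb{P}}(\bar{X}) + (\mathbb{E}_{\mathbb{P}}\bar{X} - 1)^2}$$

so that it is positive if and only if $\mathbb{E}_{\mathbb{P}}\bar{X} > 1$.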

Case of a claim with zero market price. The case when the market price of $X$ is zero (that is, the equality $p_X = 0$ holds) is also of practical interest, since such a feature is typical for forward contracts. It should be stressed, however, that this particular case is not covered by Proposition 24.

In fact, we deal here with the following variant of the mean-variance problem:

Find $\alpha \in \mathbb{R}$ such that the solution to the problem MV$(dv, v, 0, \alpha X)$ has the minimal variance.

Under the assumption that $v d_r > \alpha\,\mathbb{E}_{\mathbb{Q}}X$, we have

$$v^*(dv, v, 0, \alpha X) = \Theta(v d_r - \alpha\,\mathbb{E}_{\mathbb{Q}}X)^2 + \alpha^2\gamma_X.$$

If, on the contrary, the inequality $v d_r \le \alpha\,\mathbb{E}_{\mathbb{Q}}X$ is valid, then the minimal variance equals $\alpha^2\gamma_X$. Of course, we necessarily have $\alpha \neq 0$ here (since $v d_r > 0$).
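The text does not state the minimizer for the zero-price case. As a hedged illustration, assuming $\gamma_X > 0$, the first-order condition for the quadratic $\Theta(v d_r - \alpha\,\mathbb{E}_{\mathbb{Q}}X)^2 + \alpha^2\gamma_X$ suggests $\alpha^* = \Theta v d_r\,\mathbb{E}_{\mathbb{Q}}X / (\Theta(\mathbb{E}_{\mathbb{Q}}X)^2 + \gamma_X)$, with resulting variance $\Theta v^2 d_r^2\gamma_X / (\Theta(\mathbb{E}_{\mathbb{Q}}X)^2 + \gamma_X)$. The sketch below checks this candidate numerically; the function names and inputs are assumptions, not results quoted from the source.

```python
def variance_zero_price(alpha, v, d_r, big_theta, gamma_x, eq_x):
    """Minimal variance of MV(dv, v, 0, alpha*X) as displayed in the text:
    the quadratic form when v*d_r > alpha*E_Q X, and alpha^2*gamma_x otherwise."""
    if v * d_r > alpha * eq_x:
        return big_theta * (v * d_r - alpha * eq_x) ** 2 + alpha ** 2 * gamma_x
    return alpha ** 2 * gamma_x

def candidate_alpha(v, d_r, big_theta, gamma_x, eq_x):
    """Stationary point of the quadratic branch (assumes gamma_x > 0)."""
    return big_theta * v * d_r * eq_x / (big_theta * eq_x ** 2 + gamma_x)

# Assumed inputs: v = 100, d_r = 10%, Theta = 2, gamma_X = 0.5, E_Q X = 0.3.
zargs = (100.0, 0.10, 2.0, 0.5, 0.3)
alpha_star = candidate_alpha(*zargs)
var_star = variance_zero_price(alpha_star, *zargs)
zgrid = [variance_zero_price(alpha_star + 0.25 * k, *zargs) for k in range(-40, 41)]
```

For these inputs the candidate satisfies $v d_r > \alpha^*\,\mathbb{E}_{\mathbb{Q}}X$, so it lies in the region where the quadratic expression applies.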

10.3 Maximization of Expected Return

Let us focus on part (i) in Proposition 24, that is, let us assume that $\gamma_X > 0$ and $\nu_X \neq 0$ (as was explained above, other cases examined in Proposition 24 are of minor practical interest). The question of maximization of the expected rate of return for a pre-specified level of risk can be easily solved by comparing (107) with (108). Indeed, for a given level $d_r$ of the expected rate of return, and thus a given level $v^*(dv, v)$ of the minimal variance, it suffices to find a number $\widehat{d}_r$ which solves the following equation

$$\Theta v^2 d_r^2 = \Theta v^2 \widehat{d}_r^{\,2}\Big(1 - \frac{\nu_X^2}{\Theta^{-1}\gamma_X + \nu_X^2}\Big).$$

It is obvious that the last equation has the unique positive solution

$$\widehat{d}_r = d_r\sqrt{1 + \frac{\nu_X^2}{\Theta^{-1}\gamma_X}} > d_r.$$

The corresponding value of $p^*$ is given by (109) with $d_r$ substituted with $\widehat{d}_r$. It is thus clear that, under the present assumptions, a new investment opportunity can be used to enhance the expected rate of return. If we insist, in addition, that $p > 0$, then the latter statement remains valid, provided that $\mathbb{E}_{\mathbb{Q}}\bar{X} > 1$.
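The enhancement of the expected rate of return can be checked numerically. The sketch below computes the enhanced level from the closed form above and verifies that, when inserted into (108), it reproduces the variance budget $\Theta v^2 d_r^2$ from (107); the function names and the inputs are illustrative assumptions.

```python
import math

def enhanced_return(d_r, big_theta, gamma_x, nu_x):
    """Enhanced expected rate of return d_r * sqrt(1 + nu_x^2 / (gamma_x / Theta));
    assumes gamma_x > 0."""
    return d_r * math.sqrt(1.0 + nu_x ** 2 / (gamma_x / big_theta))

def variance_with_claim(v, d_r, big_theta, gamma_x, nu_x):
    """Minimal variance (108) when the claim is available."""
    ratio = nu_x ** 2 / (gamma_x / big_theta + nu_x ** 2)
    return big_theta * v ** 2 * d_r ** 2 * (1.0 - ratio)

# Assumed inputs: v = 100, d_r = 10%, Theta = 2, gamma_X = 0.5, nu_X = 0.2.
v, d_r, th, g, nu = 100.0, 0.10, 2.0, 0.5, 0.2
d_hat = enhanced_return(d_r, th, g, nu)
budget = th * v ** 2 * d_r ** 2                 # variance level from (107)
used = variance_with_claim(v, d_hat, th, g, nu)  # (108) at the enhanced level
```

Here `used` should equal `budget` up to rounding, while `d_hat` exceeds `d_r`, which is the content of the displayed inequality.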

Part III. Indifference Pricing

In this part, we present a few alternative ways of pricing defaultable claims in the situation when perfect hedging is not possible. In the previous part, we have presented the mean-variance hedging framework. Now, we study the indifference price approach that was initiated by Hodges and Neuberger (1989). We shall refer to this approach as the “Hodges price” approach. This will lead us to solving portfolio optimization problems in an incomplete market, and we shall use the dynamic programming (DP) approach.

We also present the Hamilton-Jacobi-Bellman (HJB) equations, when appropriate, even though this method typically requires strong assumptions to give closed-form solutions. In particular, when dealing with the general DP approach, we need not make any Markovian assumption about the underlying processes; such assumptions are fundamental for the HJB methodology to work.

In Section 11, we define the Hodges indifference price associated with strategies adapted to the reference filtration F, and we solve the problem for exponential preferences and for some particular defaultable claims. We shall use results obtained here to provide a basis for a comparison between the historical spread and the risk-neutral one.

In Section 12, using backward stochastic differential equations (BSDEs), we work with G-adapted strategies, and we solve portfolio optimization problems for exponential utility functions. Our method relies on the ideas of Rouge and El Karoui (2000) and Musiela and Zariphopoulou (2004). The reader can refer to El Karoui and Mazliak (1997), El Karoui and Quenez (1997), El Karoui et al. (1997), or to the survey by Buckdahn (2000) for an introduction to the theory of backward stochastic differential equations and its applications in finance.

Section 13 is devoted to the study of a particular indifference price, based on the quadratic criterion; we call such a price the quadratic hedging price (see the introduction to Part II). In particular, we compare the indifference prices obtained using strategies adapted to the reference filtration F to the indifference prices obtained using strategies based on the enlarged filtration G. It is worthwhile to stress, though, that the quadratic utility alone is not quite adequate for pricing purposes, although it represents a good criterion for hedging purposes. This is one of the reasons we presented the mean-variance approach to pricing and hedging of defaultable claims in Part II.

In the last section, we present a very particular case of the duality approach for exponential utilities.

As in the previous part, we emphasize that a very important aspect of our analysis is the distinction between the case when admissible portfolios are adapted to the filtration F, and the case when admissible portfolios are adapted to the filtration G.

11 Hedging in Incomplete Markets

We recall briefly the probabilistic setting of Part II. The default-free asset is $Z^1$ with the dynamics

$$dZ^1_t = Z^1_t(\nu\,dt + \sigma\,dW_t), \quad Z^1_0 > 0,$$

and the price process of the money market account has the dynamics

$$dZ^2_t = rZ^2_t\,dt, \quad Z^2_0 = 1,$$

where $r$ is the constant interest rate. The default-free market is complete and arbitrage free: one can hedge perfectly any square-integrable contingent claim $X \in \mathcal{F}_T$. The default time is some random time $\tau$, and the default process is denoted as $H_t = \mathbf{1}_{\{\tau \le t\}}$. The reference filtration is the Brownian filtration $\mathcal{F}_t = \sigma(W_u, u \le t)$ and the enlarged filtration is $\mathcal{G}_t = \mathcal{F}_t \vee \mathcal{H}_t$ where $\mathcal{H}_t = \sigma(H_u, u \le t)$.

We assume that the hazard process $F_t = \mathbb{P}\{\tau \le t \mid \mathcal{F}_t\}$ is absolutely continuous with respect to Lebesgue measure, so that $F_t = \int_0^t f_u\,du$ (hence, it is an increasing process). Therefore, the process

$$M_t = H_t - \int_0^{t\wedge\tau} \gamma_u\,du = H_t - \int_0^{t\wedge\tau} \frac{f_u}{1 - F_u}\,du$$

is a $\mathbb{G}$-martingale, where $\gamma$ is the default intensity. Note that the stochastic intensity $\gamma$ is the intensity of the default time $\tau$ with respect to the reference filtration $\mathbb{F}$ generated by the Brownian motion $W$.

For a fixed $T > 0$, we introduce a risk-neutral probability $\mathbb{Q}$ for the market model $(Z^1, Z^2)$ by setting $d\mathbb{Q}|_{\mathcal{G}_t} = \eta_t\,d\mathbb{P}|_{\mathcal{G}_t}$ for $t \in [0,T]$, where the Radon-Nikodym density $\eta$ is the $\mathbb{F}$-martingale defined as

$$d\eta_t = -\theta\,\eta_t\,dW_t, \quad \eta_0 = 1,$$

where $\theta = (\nu - r)/\sigma$. Under $\mathbb{Q}$, the discounted process $\widetilde{Z}^1_t = e^{-rt}Z^1_t$ is a martingale. It should be emphasized that $\mathbb{Q}$ is not necessarily a martingale measure for defaultable assets. Let us recall, however, that if $\widetilde{\mathbb{Q}}$ is any equivalent martingale measure on $\mathbb{G}$ for the default-free and defaultable market, then the restriction of $\widetilde{\mathbb{Q}}$ to $\mathbb{F}$ is equal to the restriction of $\mathbb{Q}$ to $\mathbb{F}$. A defaultable claim is simply any random variable $X$, which is $\mathcal{G}_T$-measurable. Hence, default-free claims are formally considered as special cases of defaultable claims.

11.1 Hodges Indifference Price

We present a general framework of the Hodges and Neuberger (1989) approach with some strictly increasing, strictly concave and continuously differentiable mapping $u$, defined on $\mathbb{R}$. We solve explicitly the problem in the case of exponential utility for portfolios adapted to the reference filtration.

The Hodges approach to pricing of unhedgeable claims is a utility-based approach and can be summarized as follows: the issue at hand is to assess the value of some (defaultable) claim $X$ as seen from the perspective of an economic agent who optimizes his behavior relative to some utility function, say $u$. In order to provide such an assessment, one can argue that one should first consider the following possible modes of the agent's behavior and the associated optimization problems:

Problem (P): Optimization in the default-free market.

The agent invests his initial wealth $v > 0$ in the default-free financial market using a self-financing strategy. The associated optimization problem is

$$(P):\quad \mathcal{V}(v) := \sup_{\phi \in \Phi(\mathbb{F})} \mathbb{E}_{\mathbb{P}}\big\{u\big(V^v_T(\phi)\big)\big\},$$

where the wealth process $V_t = V^v_t(\phi)$, $t \in \mathbb{R}_+$, is a solution of

$$dV_t = rV_t\,dt + \phi_t(dZ^1_t - rZ^1_t\,dt), \quad V_0 = v. \tag{112}$$

Recall that $\Phi(\mathbb{F})$ is the class of all admissible, $\mathbb{F}$-adapted, self-financing trading strategies (for the definition of this class, see Part II).


Problem $(P^X_{\mathbb{F}})$: Optimization in the default-free market using $\mathbb{F}$-adapted strategies and buying the defaultable claim.

The agent buys the contingent claim $X$ at price $p$, and invests the remaining wealth $v - p$ in the financial market, using a trading strategy $\phi \in \Phi(\mathbb{F})$. The resulting global terminal wealth will be

$$V^{v-p,X}_T(\phi) = V^{v-p}_T(\phi) + X.$$

The associated optimization problem is

$$(P^X_{\mathbb{F}}):\quad \mathcal{V}_X(v - p) := \sup_{\phi \in \Phi(\mathbb{F})} \mathbb{E}_{\mathbb{P}}\big\{u\big(V^{v-p}_T(\phi) + X\big)\big\},$$

where the process $V^{v-p}(\phi)$ is a solution of (112) with the initial condition $V^{v-p}_0(\phi) = v - p$. We emphasize that the class $\Phi(\mathbb{F})$ of admissible strategies is the same as in the problem (P), that is, we restrict here our attention to trading strategies that are adapted to the reference filtration $\mathbb{F}$.

Problem $(P^X_{\mathbb{G}})$: Optimization in the default-free market using $\mathbb{G}$-adapted strategies and buying the defaultable claim.

The agent buys the contingent claim $X$ at price $p$, and invests the remaining wealth $v - p$ in the financial market, using a strategy adapted to the enlarged filtration $\mathbb{G}$. The associated optimization problem is

$$(P^X_{\mathbb{G}}):\quad \mathcal{V}^{\mathbb{G}}_X(v - p) := \sup_{\phi \in \Phi(\mathbb{G})} \mathbb{E}_{\mathbb{P}}\big\{u\big(V^{v-p}_T(\phi) + X\big)\big\},$$

where $\Phi(\mathbb{G})$ is the class of all $\mathbb{G}$-admissible trading strategies (for the definition of the class $\Phi(\mathbb{G})$, see Part II). Next, the utility-based assessment of the value (price) of the claim $X$, as seen from the agent's perspective, is given in terms of the following definition.

Definition 10. For a given initial endowment $v$, the $\mathbb{F}$-Hodges buying price of a defaultable claim $X$ is the real number $p^*_{\mathbb{F}}(v)$ such that $\mathcal{V}(v) = \mathcal{V}_X\big(v - p^*_{\mathbb{F}}(v)\big)$. Similarly, the $\mathbb{G}$-Hodges buying price of $X$ is the real number $p^*_{\mathbb{G}}(v)$ such that $\mathcal{V}(v) = \mathcal{V}^{\mathbb{G}}_X\big(v - p^*_{\mathbb{G}}(v)\big)$.

Remark. We can define the $\mathbb{F}$-Hodges selling price $p^{\mathbb{F}}_*(v)$ of $X$ by considering $-p$, where $p$ is the buying price of $-X$, as specified in Definition 10.

If the contingent claim $X$ is $\mathcal{F}_T$-measurable, then the $\mathbb{F}$- and the $\mathbb{G}$-Hodges prices coincide with the hedging price of $X$, i.e., $p^*_{\mathbb{F}}(v) = p^*_{\mathbb{G}}(v) = \pi_0(X) = \mathbb{E}_{\mathbb{P}}(\zeta_T X)$, where we denote $\zeta_t = \eta_t R_t$ with $R_t = (Z^2_t)^{-1} = e^{-rt}$. Indeed, assume that there exists a self-financing portfolio $\widehat{\phi}$ such that $X = V^{\pi_0(X)}_T(\widehat{\phi})$, and let $h$ be the $\mathbb{F}$-Hodges buying price. Suppose first that $h < \pi_0(X)$. Then for any $\phi$ we obtain

$$V^{v-h}_T(\phi) + X = V^{v-h}_T(\phi) + V^{\pi_0(X)}_T(\widehat{\phi}) = V^{v-h+\pi_0(X)}_T(\psi),$$

where we denote $\psi = \phi + \widehat{\phi} \in \Phi(\mathbb{F})$. Hence

$$\mathcal{V}_X(v - h) = \sup_{\phi \in \Phi(\mathbb{F})} \mathbb{E}_{\mathbb{P}}\big\{u\big(V^{v-h}_T(\phi) + X\big)\big\} = \sup_{\psi \in \Phi(\mathbb{F})} \mathbb{E}_{\mathbb{P}}\big\{u\big(V^{v-h+\pi_0(X)}_T(\psi)\big)\big\} \ge \mathcal{V}(v),$$

where the last inequality (which is a strict inequality) follows from $v < v - h + \pi_0(X)$ and the arbitrage principle. Therefore, the supremum over $\phi \in \Phi(\mathbb{F})$ of $\mathbb{E}_{\mathbb{P}}(u(V^{v-h}_T(\phi) + X))$ is greater than $\mathcal{V}(v)$. We conclude that the $\mathbb{F}$-Hodges buying price cannot be smaller than the hedging price. Arguing in a similar way, one can show that the $\mathbb{F}$-Hodges selling price of an $\mathcal{F}_T$-measurable claim cannot be smaller than the hedging price. Finally, almost identical arguments show that the $\mathbb{G}$-Hodges buying and selling prices of an $\mathcal{F}_T$-measurable claim are equal to the hedging price of $X$ (see Section 12.2).

Remark. It can be shown (see Rouge and El Karoui (2000), or Collin-Dufresne and Hugonnier (2002)) that in the general case of a non-hedgeable contingent claim, the Hodges price belongs to the open interval

$$\Big(\inf_{\mathbb{Q}} \mathbb{E}_{\mathbb{Q}}(Xe^{-rT}),\ \sup_{\mathbb{Q}} \mathbb{E}_{\mathbb{Q}}(Xe^{-rT})\Big),$$

where $\mathbb{Q}$ runs over the set of all equivalent martingale measures, and thus it cannot induce arbitrage opportunities.

11.2 Solution of Problem (P)

We briefly recall one of the solution methods for the problem (P). To this end, we first observe that in view of (112) the process $e^{-rt}V^v_t(\phi)$, $t \in \mathbb{R}_+$, is a martingale under any equivalent martingale measure; hence $\zeta_t V^v_t(\phi)$, $t \in \mathbb{R}_+$, is a $\mathbb{P}$-martingale and, in particular, $\mathbb{E}_{\mathbb{P}}(V^v_T(\phi)\zeta_T) = v$. It follows that in order to obtain a terminal wealth equal to, say, $V$, the initial endowment $v$ has to be greater than or equal to $\mathbb{E}_{\mathbb{P}}(V\zeta_T)$; this condition is commonly referred to as the budget constraint.

Now, let us denote by $I$ the inverse of the monotonic mapping $u'$ (the first derivative of $u$). It is well known (see, e.g., Karatzas and Shreve (1998)) that the optimal terminal wealth in the problem (P) is given by the formula

$$V^{v,*}_T = I(\mu\zeta_T), \quad \mathbb{P}\text{-a.s.}, \tag{113}$$

where $\mu$ is a real number such that the budget constraint is binding, that is,

$$v = \mathbb{E}_{\mathbb{P}}\big(\zeta_T V^{v,*}_T\big). \tag{114}$$

Consequently, the optimal value of the objective criterion for the problem (P) is $\mathcal{V}(v) = \mathbb{E}_{\mathbb{P}}(u(V^{v,*}_T))$.

The above results are obtained by means of convex duality theory. The disadvantage of this approach, however, is that it is typically very difficult to identify an optimal trading strategy. Thus, in general, using the convex duality approach we can only partially solve the problem (P). Specifically, we can compute the optimal value of the objective criterion, but we cannot identify the optimal strategy. Later in this part, we shall use the BSDE approach in a more general setting. It will be seen that this approach allows us to identify (at least in principle) an optimal trading strategy.
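To make formulas (113) and (114) concrete, the following sketch solves the binding budget constraint numerically. It rests on several assumptions that are not part of the general statement: an exponential utility $u(x) = 1 - e^{-\varrho x}$ (so that $I(y) = -\varrho^{-1}\ln(y/\varrho)$), zero interest rate, and hypothetical parameter values. The multiplier found by bisection is compared with the closed form $\mu = \varrho\exp(-\varrho v - \theta^2 T/2)$ that holds in this special case.

```python
import math

def normal_expectation(g, lo=-12.0, hi=12.0, n=2400):
    """Trapezoidal approximation of E[g(N)] for a standard normal N."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        x = lo + i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * g(x) * math.exp(-0.5 * x * x)
    return total * h / math.sqrt(2.0 * math.pi)

# Hypothetical values; r = 0, so zeta_T coincides with eta_T.
rho, theta, T, v = 2.0, 0.2, 1.0, 1.5

def zeta_T(x):
    return math.exp(-theta * math.sqrt(T) * x - 0.5 * theta**2 * T)

def inverse_marginal_utility(y):
    # u'(x) = rho*exp(-rho*x), hence I(y) = -ln(y/rho)/rho.
    return -math.log(y / rho) / rho

def budget(mu):
    # E_P[zeta_T * I(mu*zeta_T)], a decreasing function of mu.
    return normal_expectation(
        lambda x: zeta_T(x) * inverse_marginal_utility(mu * zeta_T(x)))

def solve_multiplier(target, lo=1e-12, hi=1e6, iterations=80):
    # Bisection on log(mu) for the equation budget(mu) = target.
    a, b = math.log(lo), math.log(hi)
    for _ in range(iterations):
        mid = 0.5 * (a + b)
        if budget(math.exp(mid)) > target:
            a = mid  # required wealth too large: increase mu
        else:
            b = mid
    return math.exp(0.5 * (a + b))

mu_numeric = solve_multiplier(v)
mu_closed = rho * math.exp(-rho * v - 0.5 * theta**2 * T)
```

The bisection only delivers the multiplier and hence the optimal terminal wealth; as noted above, it says nothing about the trading strategy that attains it.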

11.3 Solution of Problem $(P^X_{\mathbb{F}})$

In this subsection, we shall examine the problem $(P^X_{\mathbb{F}})$ for a defaultable claim of a particular form. First, we shall provide a solution $\mathcal{V}_X(v - p)$ to the related optimization problem. Next, we shall establish a quasi-explicit representation for the Hodges price of $X$ in the case of exponential utility. Finally, we shall compare the spread obtained via the risk-neutral valuation with the spread determined by the Hodges price of a defaultable zero-coupon bond. The reader can refer to Bernis and Jeanblanc (2003) for other comments.

Particular Form of a Defaultable Claim

We restrict our attention to the case when $X$ is of the form

$$X = X_1\mathbf{1}_{\{\tau > T\}} + X_2\mathbf{1}_{\{\tau \le T\}}, \tag{115}$$

where $X_i$, $i = 1, 2$, are $\mathbb{P}$-square-integrable and $\mathcal{F}_T$-measurable random variables. In this case, we have

$$V^{v-p,X}_T(\phi) = V^{v-p}_T(\phi) + X_1$$

if the default did not occur before the maturity date $T$, that is, on the set $\{\tau > T\}$, and

$$V^{v-p,X}_T(\phi) = V^{v-p}_T(\phi) + X_2$$

otherwise. In other words,

$$V^{v-p,X}_T(\phi) = \mathbf{1}_{\{\tau > T\}}\big(V^{v-p}_T(\phi) + X_1\big) + \mathbf{1}_{\{\tau \le T\}}\big(V^{v-p}_T(\phi) + X_2\big).$$

Observe that the pay-off $X_2$ is not paid at the time of default $\tau$, but at the terminal time $T$.

Since the trading strategies are $\mathbb{F}$-adapted, the terminal wealth $V^{v-p}_T(\phi)$ is an $\mathcal{F}_T$-measurable random variable. Consequently, it holds that

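$$\begin{aligned}
\mathbb{E}_{\mathbb{P}}\big\{u\big(V^{v-p,X}_T(\phi)\big)\big\}
&= \mathbb{E}_{\mathbb{P}}\big\{u\big(V^{v-p}_T(\phi) + X_1\big)\mathbf{1}_{\{\tau > T\}} + u\big(V^{v-p}_T(\phi) + X_2\big)\mathbf{1}_{\{\tau \le T\}}\big\} \\
&= \mathbb{E}_{\mathbb{P}}\big\{\mathbb{E}_{\mathbb{P}}\big(u\big(V^{v-p}_T(\phi) + X_1\big)\mathbf{1}_{\{\tau > T\}} + u\big(V^{v-p}_T(\phi) + X_2\big)\mathbf{1}_{\{\tau \le T\}} \,\big|\, \mathcal{F}_T\big)\big\} \\
&= \mathbb{E}_{\mathbb{P}}\big\{u\big(V^{v-p}_T(\phi) + X_1\big)(1 - F_T) + u\big(V^{v-p}_T(\phi) + X_2\big)F_T\big\},
\end{aligned}$$

where $F_T = \mathbb{P}\{\tau \le T \mid \mathcal{F}_T\}$. Define, for every $\omega \in \Omega$ and $y \in \mathbb{R}$,

$$J_X(y, \omega) = u(y + X_1(\omega))(1 - F_T(\omega)) + u(y + X_2(\omega))F_T(\omega).$$

Notice that under the present assumptions, the problem $(P^X_{\mathbb{F}})$ is equivalent to the following problem:

$$(P^X_{\mathbb{F}}):\quad \mathcal{V}_X(v - p) := \sup_{\phi \in \Phi(\mathbb{F})} \mathbb{E}_{\mathbb{P}}\big\{J_X\big(V^{v-p}_T(\phi), \omega\big)\big\}.$$

The mapping $J_X(\cdot, \omega)$ is a strictly concave and increasing real-valued mapping. Consequently, for any $\omega \in \Omega$ we can define the mapping $I_X(z, \omega)$ by setting $I_X(z, \omega) = \big(J'_X(\cdot, \omega)\big)^{-1}(z)$ for $z \in \mathbb{R}$, where $(J'_X(\cdot, \omega))^{-1}$ denotes the inverse mapping of the derivative of $J_X$ with respect to the first variable. To simplify the notation, we shall usually suppress the second variable, and we shall write $I_X(\cdot)$ in place of $I_X(\cdot, \omega)$.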

The following lemma provides the form of the optimal solution.

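Lemma 18. The optimal terminal wealth for the problem $(P^X_{\mathbb{F}})$ is given by $V^{v-p,*}_T = I_X(\lambda^*\zeta_T)$, $\mathbb{P}$-a.s., for some $\lambda^*$ such that

$$v - p = \mathbb{E}_{\mathbb{P}}\big(\zeta_T V^{v-p,*}_T\big). \tag{116}$$

Thus the optimal global wealth equals $V^{v-p,X,*}_T = V^{v-p,*}_T + X = I_X(\lambda^*\zeta_T) + X$ and the optimal value of the objective criterion for the problem $(P^X_{\mathbb{F}})$ is

$$\mathcal{V}_X(v - p) = \mathbb{E}_{\mathbb{P}}\big(u(V^{v-p,X,*}_T)\big) = \mathbb{E}_{\mathbb{P}}\big(u(I_X(\lambda^*\zeta_T) + X)\big). \tag{117}$$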

Proof. As a consequence of the predictable representation property (see, e.g., Karatzas and Shreve (1991)), one knows that in order to find the optimal wealth it is enough to maximize $u(\Delta)$ over the set of square-integrable and $\mathcal{F}_T$-measurable random variables $\Delta$, subject to the budget constraint, given by

$$\mathbb{E}_{\mathbb{P}}(\zeta_T\Delta) \le v - p.$$

The associated Lagrange multiplier, say $\lambda^*$, is non-negative. Moreover, by the strict monotonicity of $u$, we know that, at the optimum, the constraint is binding, and thus $\lambda^* > 0$. We check that $I_X(\lambda^*\zeta_T)$ is the optimal wealth.

The mapping $J_X(\cdot)$ is strictly concave (for all $\omega$). Hence, for every wealth process $V^{v-p}(\phi)$, starting from $v - p$, by the tangent inequality, we have

$$\mathbb{E}_{\mathbb{P}}\big\{J_X(V^{v-p}_T(\phi)) - J_X(V^{v-p,*}_T)\big\} \le \mathbb{E}_{\mathbb{P}}\big\{(V^{v-p}_T(\phi) - V^{v-p,*}_T)\,J'_X(V^{v-p,*}_T)\big\}.$$

Replacing $V^{v-p,*}_T$ by its expression given in Lemma 18 yields, for any $\phi \in \Phi(\mathbb{F})$,

$$\mathbb{E}_{\mathbb{P}}\big\{J_X(V^{v-p}_T(\phi)) - J_X(V^{v-p,*}_T)\big\} \le \lambda^*\,\mathbb{E}_{\mathbb{P}}\big\{\zeta_T(V^{v-p}_T(\phi) - V^{v-p,*}_T)\big\} \le 0,$$

where the last inequality follows from (116) and the budget constraint. To end the proof, it remains to observe that the first-order conditions are also sufficient in the case of a concave criterion. Moreover, by virtue of strict concavity of the function $J_X$, the optimum is unique.

Exponential Utility: Explicit Computation of the Hodges Price

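For the sake of simplicity, we assume here that $r = 0$. Let us state the following result, the proof of which stems from Lemma 18, by direct computations.

Proposition 25. Let $u(x) = 1 - \exp(-\varrho x)$ for some $\varrho > 0$. Assume that for $i = 1, 2$ the random variable $\zeta_T e^{-\varrho X_i}$ is $\mathbb{P}$-integrable. Then we have

$$p^*_{\mathbb{F}}(v) = -\frac{1}{\varrho}\,\mathbb{E}_{\mathbb{P}}\Big(\zeta_T \ln\big((1 - F_T)e^{-\varrho X_1} + F_T e^{-\varrho X_2}\big)\Big) = \mathbb{E}_{\mathbb{P}}(\zeta_T\Psi),$$

where the $\mathcal{F}_T$-measurable random variable $\Psi$ equals

$$\Psi = -\frac{1}{\varrho}\ln\big((1 - F_T)e^{-\varrho X_1} + F_T e^{-\varrho X_2}\big). \tag{118}$$

Thus, the $\mathbb{F}$-Hodges buying price $p^*_{\mathbb{F}}(v)$ is the arbitrage price of the associated claim $\Psi$. In addition, the claim $\Psi$ enjoys the following meaningful property:

$$\mathbb{E}_{\mathbb{P}}\big\{u(X - \Psi) \,\big|\, \mathcal{F}_T\big\} = 0. \tag{119}$$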

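Proof. In view of the form of the solution to the problem (P), we obtain (cf. (113))

$$V^{v,*}_T = -\frac{1}{\varrho}\ln\Big(\frac{\mu^*\zeta_T}{\varrho}\Big).$$

The budget constraint $\mathbb{E}_{\mathbb{P}}(\zeta_T V^{v,*}_T) = v$ implies that the Lagrange multiplier $\mu^*$ satisfies

$$\frac{1}{\varrho}\ln\Big(\frac{\mu^*}{\varrho}\Big) = -\frac{1}{\varrho}\,\mathbb{E}_{\mathbb{P}}\big(\zeta_T\ln\zeta_T\big) - v. \tag{120}$$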

In the case of an exponential utility, we have (recall that the variable $\omega$ is suppressed)

$$J_X(y) = \big(1 - e^{-\varrho(y + X_1)}\big)(1 - F_T) + \big(1 - e^{-\varrho(y + X_2)}\big)F_T,$$

so that

$$J'_X(y) = \varrho e^{-\varrho y}\big(e^{-\varrho X_1}(1 - F_T) + e^{-\varrho X_2}F_T\big).$$

Thus, setting

$$A = e^{-\varrho X_1}(1 - F_T) + e^{-\varrho X_2}F_T = e^{-\varrho\Psi},$$

we obtain

$$I_X(z) = -\frac{1}{\varrho}\ln\Big(\frac{z}{A\varrho}\Big) = -\frac{1}{\varrho}\ln\Big(\frac{z}{\varrho}\Big) - \Psi.$$

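It follows that the optimal terminal wealth for the initial endowment $v - p$ is

$$V^{v-p,*}_T = -\frac{1}{\varrho}\ln\Big(\frac{\lambda^*\zeta_T}{A\varrho}\Big) = -\frac{1}{\varrho}\ln\Big(\frac{\lambda^*}{\varrho}\Big) - \frac{1}{\varrho}\ln\zeta_T - \Psi,$$

where the Lagrange multiplier $\lambda^*$ is chosen so that

$$\frac{1}{\varrho}\ln\Big(\frac{\lambda^*}{\varrho}\Big) = -\frac{1}{\varrho}\,\mathbb{E}_{\mathbb{P}}\big(\zeta_T\ln\zeta_T\big) - \mathbb{E}_{\mathbb{P}}\big(\zeta_T\Psi\big) - v + p, \tag{121}$$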

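which guarantees that the budget constraint $\mathbb{E}_{\mathbb{P}}(\zeta_T V^{v-p,*}_T) = v - p$ is satisfied. The $\mathbb{F}$-Hodges buying price is a real number $p^* = p^*_{\mathbb{F}}(v)$ such that

$$\mathbb{E}_{\mathbb{P}}\big(\exp(-\varrho V^{v,*}_T)\big) = \mathbb{E}_{\mathbb{P}}\big(\exp(-\varrho(V^{v-p^*,*}_T + X))\big),$$

where $\mu^*$ and $\lambda^*$ are given by (120) and (121), respectively. After substitution and simplifications, we arrive at the following equality:

$$\mathbb{E}_{\mathbb{P}}\Big\{\zeta_T\exp\Big(-\varrho\big(\mathbb{E}_{\mathbb{P}}(\zeta_T\Psi) - p^* + X - \Psi\big)\Big)\Big\} = 1. \tag{122}$$

Using (115), it is easy to check that

$$\mathbb{E}_{\mathbb{P}}\big(e^{-\varrho(X - \Psi)} \,\big|\, \mathcal{F}_T\big) = 1, \tag{123}$$

so that equality (119) holds, and $\mathbb{E}_{\mathbb{P}}\big(\zeta_T e^{-\varrho(X - \Psi)}\big) = \mathbb{E}_{\mathbb{P}}(\zeta_T) = 1$. Combining (122) and (123), we conclude that $p^*_{\mathbb{F}}(v) = \mathbb{E}_{\mathbb{P}}(\zeta_T\Psi)$.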

We briefly provide the analog of (118) for the $\mathbb{F}$-Hodges selling price of $X$. We have $p^{\mathbb{F}}_*(v) = \mathbb{E}_{\mathbb{P}}(\zeta_T\widetilde{\Psi})$, where

$$\widetilde{\Psi} = \frac{1}{\varrho}\ln\big((1 - F_T)e^{\varrho X_1} + F_T e^{\varrho X_2}\big). \tag{124}$$

Remark. It is important to notice that the $\mathbb{F}$-Hodges prices $p^*_{\mathbb{F}}(v)$ and $p^{\mathbb{F}}_*(v)$ do not depend on the initial endowment $v$. This is an interesting property of the exponential utility function. In view of (119), the random variable $\Psi$ will be called the indifference conditional hedge.
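Formulas (118), (123) and (124) are easy to evaluate when the conditional default probability $F_T$ is a deterministic number. The sketch below is only an illustration under that assumption (together with $r = 0$, so that $\mathbb{E}_{\mathbb{P}}\zeta_T = 1$ and the prices coincide with the claims themselves); the function names, the symbol `rho` for the risk-aversion parameter, and all numerical values are hypothetical.

```python
import math

def psi_buy(x1, x2, F, rho):
    # Indifference conditional hedge (118) for the buyer.
    return -math.log((1.0 - F) * math.exp(-rho * x1)
                     + F * math.exp(-rho * x2)) / rho

def psi_sell(x1, x2, F, rho):
    # Analog (124) for the seller.
    return math.log((1.0 - F) * math.exp(rho * x1)
                    + F * math.exp(rho * x2)) / rho

rho, F = 1.5, 0.3      # hypothetical risk aversion and default probability
x1, x2 = 1.0, 0.0      # pay-offs of a zero-recovery bond

buy = psi_buy(x1, x2, F, rho)
sell = psi_sell(x1, x2, F, rho)
mean_payoff = (1.0 - F) * x1 + F * x2

# Conditional expectation in (123), computed over the two default scenarios.
check_123 = ((1.0 - F) * math.exp(-rho * (x1 - buy))
             + F * math.exp(-rho * (x2 - buy)))
```

By Jensen's inequality the buying value lies below the mean pay-off and the selling value above it, and `check_123` equals 1 up to rounding. The selling value is also recovered as minus the buying value of the claim $-X$, in line with the remark following Definition 10.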


Comparison with the Davis price. Let us present the results derived from the marginal utility pricing approach. The Davis price (see Davis (1997)) is given by

$$d^*(v) = \frac{\mathbb{E}_{\mathbb{P}}\big\{u'(V^{v,*}_T)X\big\}}{\mathcal{V}'(v)}.$$

In our context, since $u'(V^{v,*}_T) = \mu\zeta_T$ and $X$ is given by (115), conditioning on $\mathcal{F}_T$ yields

$$d^*(v) = \mathbb{E}_{\mathbb{P}}\big\{\zeta_T\big(X_1(1 - F_T) + X_2F_T\big)\big\}.$$

In this case, the risk aversion $\varrho$ has no influence on the pricing of the contingent claim. In particular, when $F$ is deterministic, the Davis price reduces to the arbitrage price of each (default-free) financial asset $X_i$, $i = 1, 2$, weighted by the corresponding probabilities $1 - F_T$ and $F_T$.

Risk-Neutral Spread Versus Hodges Spreads

Let us consider the case of a defaultable bond with zero recovery, so that $X_1 = 1$ and $X_2 = 0$. It follows from (118) and (124) that the $\mathbb{F}$-Hodges buying and selling prices of the bond are (it will be convenient here to indicate the dependence of the Hodges price on maturity $T$)

$$D^*_{\mathbb{F}}(0, T) = -\frac{1}{\varrho}\,\mathbb{E}_{\mathbb{P}}\big\{\zeta_T\ln\big(e^{-\varrho}(1 - F_T) + F_T\big)\big\}$$

and

$$D^{\mathbb{F}}_*(0, T) = \frac{1}{\varrho}\,\mathbb{E}_{\mathbb{P}}\big\{\zeta_T\ln\big(e^{\varrho}(1 - F_T) + F_T\big)\big\},$$

respectively. Let $\widetilde{\mathbb{Q}}$ be a risk-neutral probability for the filtration $\mathbb{G}$, that is, for the enlarged market. The "market" price at time $t = 0$ of the defaultable bond, denoted as $D^0(0, T)$, is thus equal to the expectation under $\widetilde{\mathbb{Q}}$ of its discounted pay-off, that is,

$$D^0(0, T) = \mathbb{E}_{\widetilde{\mathbb{Q}}}\big(\mathbf{1}_{\{\tau > T\}}R_T\big) = \mathbb{E}_{\widetilde{\mathbb{Q}}}\big((1 - \widetilde{F}_T)R_T\big),$$

where $\widetilde{F}_t = \widetilde{\mathbb{Q}}\{\tau \le t \mid \mathcal{F}_t\}$ for every $t \in [0, T]$. Let us emphasize that the risk-neutral probability $\widetilde{\mathbb{Q}}$ is chosen by the market, via the price of the defaultable asset. Hence, it should not be confused with the probability measure $\mathbb{Q}$, which combines, in a sense, the risk-neutral probability for the default-free market $(Z^1, Z^2)$ with the real-life intensity of default.

Let us recall that in our setting the price process of the $T$-maturity unit discount Treasury (default-free) bond is $B(t, T) = e^{-r(T-t)}$. The Hodges buying and selling spreads at time $t = 0$ are defined as

$$S^*(0, T) = -\frac{1}{T}\ln\frac{D^*_{\mathbb{F}}(0, T)}{B(0, T)}$$

and

$$S_*(0, T) = -\frac{1}{T}\ln\frac{D^{\mathbb{F}}_*(0, T)}{B(0, T)},$$

respectively. Likewise, the risk-neutral spread at time $t = 0$ is given as

$$S^0(0, T) = -\frac{1}{T}\ln\frac{D^0(0, T)}{B(0, T)}.$$

Since $D^*_{\mathbb{F}}(0, 0) = D^{\mathbb{F}}_*(0, 0) = D^0(0, 0) = 1$, the respective backward short spreads at time $t = 0$ are given by the following limits (provided the limits exist):

$$s^*(0) = \lim_{T \downarrow 0} S^*(0, T) = -\frac{d^+\ln D^*_{\mathbb{F}}(0, T)}{dT}\Big|_{T=0} - r$$

and

$$s_*(0) = \lim_{T \downarrow 0} S_*(0, T) = -\frac{d^+\ln D^{\mathbb{F}}_*(0, T)}{dT}\Big|_{T=0} - r,$$

respectively. We also set

$$s^0(0) = \lim_{T \downarrow 0} S^0(0, T) = -\frac{d^+\ln D^0(0, T)}{dT}\Big|_{T=0} - r.$$

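Assuming, as we do, that the processes $F_T$ and $\widetilde{F}_T$ are absolutely continuous with respect to the Lebesgue measure, and using the observation that the restriction of $\widetilde{\mathbb{Q}}$ to $\mathcal{F}_T$ is equal to $\mathbb{Q}$, we find that

$$\frac{D^*_{\mathbb{F}}(0, T)}{B(0, T)} = -\frac{1}{\varrho}\,\mathbb{E}_{\mathbb{Q}}\big\{\ln\big(e^{-\varrho}(1 - F_T) + F_T\big)\big\} = -\frac{1}{\varrho}\,\mathbb{E}_{\mathbb{Q}}\Big\{\ln\Big(e^{-\varrho}\Big(1 - \int_0^T f_t\,dt\Big) + \int_0^T f_t\,dt\Big)\Big\},$$

and

$$\frac{D^{\mathbb{F}}_*(0, T)}{B(0, T)} = \frac{1}{\varrho}\,\mathbb{E}_{\mathbb{Q}}\big\{\ln\big(e^{\varrho}(1 - F_T) + F_T\big)\big\} = \frac{1}{\varrho}\,\mathbb{E}_{\mathbb{Q}}\Big\{\ln\Big(e^{\varrho}\Big(1 - \int_0^T f_t\,dt\Big) + \int_0^T f_t\,dt\Big)\Big\}.$$

Furthermore,

$$\frac{D^0(0, T)}{B(0, T)} = \mathbb{E}_{\widetilde{\mathbb{Q}}}(1 - \widetilde{F}_T) = \mathbb{E}_{\widetilde{\mathbb{Q}}}\Big(1 - \int_0^T \widetilde{f}_t\,dt\Big).$$

Consequently,

$$s^*(0) = \frac{1}{\varrho}\big(e^{\varrho} - 1\big)f_0, \qquad s_*(0) = \frac{1}{\varrho}\big(1 - e^{-\varrho}\big)f_0,$$

and $s^0(0) = \widetilde{f}_0$. Now, if we postulate, for instance, that $s_*(0) = s^0(0)$ (it would be the case if the market price is the selling Hodges price), then we must have

$$\widetilde{\gamma}_0 = \widetilde{f}_0 = \frac{1}{\varrho}\big(1 - e^{-\varrho}\big)f_0 = \frac{1}{\varrho}\big(1 - e^{-\varrho}\big)\gamma_0,$$

so that $\widetilde{\gamma}_0 < \gamma_0$. Observe, however, that if the market price were equal to the buying Hodges price, that is, $s^*(0) = s^0(0)$, then we would need $\widetilde{\gamma}_0 > \gamma_0$. Similar calculations can be made for any $t \in [0, T)$.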
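The short-spread limits can be checked numerically in a simple special case. The sketch below assumes a constant real-world intensity `gamma`, so that $F_T = 1 - e^{-\gamma T}$ is deterministic and $f_0 = \gamma$, and takes $r = 0$; these assumptions, the function names and the parameter values are illustrative only. It evaluates the Hodges spreads of the zero-recovery bond at a very short maturity and compares them with the closed-form limits $(e^{\varrho} - 1)f_0/\varrho$ and $(1 - e^{-\varrho})f_0/\varrho$.

```python
import math

rho, gamma = 0.8, 0.05   # hypothetical risk aversion and default intensity

def default_prob(T):
    # F_T = 1 - exp(-gamma*T), computed stably for small T.
    return -math.expm1(-gamma * T)

def buying_spread(T):
    F = default_prob(T)
    price = -math.log(math.exp(-rho) * (1.0 - F) + F) / rho
    return -math.log(price) / T

def selling_spread(T):
    F = default_prob(T)
    price = math.log(math.exp(rho) * (1.0 - F) + F) / rho
    return -math.log(price) / T

T_small = 1e-5
s_buy_numeric = buying_spread(T_small)
s_sell_numeric = selling_spread(T_small)

# Closed-form short spreads with f_0 = gamma.
s_buy_limit = math.expm1(rho) / rho * gamma
s_sell_limit = -math.expm1(-rho) / rho * gamma
```

The ordering `s_sell_limit < gamma < s_buy_limit` is the numerical counterpart of the comparison between the market-implied and the real-world intensities made above.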

12 Optimization Problems and BSDEs

The major distinction between this section and the previous one is that here we consider strategies $\phi$ that are predictable with respect to the full filtration $\mathbb{G}$. Unless explicitly stated otherwise, the underlying probability measure is the real-world probability $\mathbb{P}$. We consider the following dynamics for the risky asset $Z^1$:

$$dZ^1_t = Z^1_{t-}(\nu\,dt + \sigma\,dW_t + \varphi\,dM_t), \tag{125}$$

where $M_t = H_t - \int_0^{t\wedge\tau}\gamma_s\,ds$, and where we impose the condition $\varphi > -1$, which ensures that the price $Z^1_t$ remains strictly positive.

In order to simplify notation, we shall denote by $\xi$ the process such that $dM_t = dH_t - \xi_t\,dt$ is a $\mathbb{G}$-martingale, i.e., $\xi_t = \gamma_t(1 - H_t)$. We assume that the hypothesis (H) holds, that is, any $\mathbb{F}$-martingale is a $\mathbb{G}$-martingale as well.

Throughout most of the section, we shall deal with the same market model as in the previous section, that is, we shall set $\varphi = 0$. Only in Section 14 do we generalize the dynamics of the risky asset to the case when $\varphi \ne 0$, so that the dynamics of the risky asset $Z^1$ are sensitive to the default risk. In particular, the limit case $\varphi = -1$ corresponds to the case where the underlying risky asset has value 0 after the default.

We assume for simplicity that $r = 0$, and we change the notational convention for an admissible portfolio to the one that will be more suitable for problems considered here: instead of using the number of shares $\phi$ as before, we set $\pi = \phi Z^1$, so that $\pi$ represents the value invested in the risky asset. The portfolio process $\pi_t$ should not be confused with the arbitrage price process $\pi_t(X)$. In addition, we adopt here the following relaxed definition of admissibility of a self-financing trading strategy.

Definition 11. The class $\Pi(\mathbb{F})$ (respectively $\Pi(\mathbb{G})$) of $\mathbb{F}$-admissible (respectively $\mathbb{G}$-admissible) trading strategies is the set of all $\mathbb{F}$-predictable (respectively $\mathbb{G}$-predictable) processes $\pi$ such that $\int_0^T \pi_t^2\,dt < \infty$, $\mathbb{P}$-a.s.

The wealth process of a strategy $\pi$ satisfies

$$dV_t(\pi) = \pi_t\big(\nu\,dt + \sigma\,dW_t + \varphi\,dM_t\big). \tag{126}$$

Note that with the present definition of admissible strategies the "martingale part" of the wealth process is a local martingale, in general.

Let $X$ be a given contingent claim, represented by a $\mathcal{G}_T$-measurable random variable. We shall study the following problem:

$$\sup_{\pi \in \Pi(\mathbb{G})} \mathbb{E}_{\mathbb{P}}\big\{u\big(V^v_T(\pi) + X\big)\big\}.$$

12.1 Exponential Utility

In this section, we shall examine the problem introduced above in the case of the exponential utility, setting $\varphi = 0$ in dynamics (125). First, we examine the existence and the form of a solution to the optimization problem, under additional technical assumptions. Subsequently, we shall derive the expression for the Hodges buying price.

Optimization Problem

Let $X \in \mathcal{G}_T$ be a given non-negative contingent claim, and let $v$ be the initial endowment of an agent. Our first goal is to solve an optimization problem for an agent who buys a claim $X$. To this end, it suffices to find a strategy $\pi \in \Pi(\mathbb{G})$ that maximizes $\mathbb{E}_{\mathbb{P}}(u(V^v_T(\pi) + X))$, where the wealth process $V_t = V^v_t(\pi)$ (for simplicity, we shall frequently skip $v$ and $\pi$ from the notation) satisfies

$$dV_t = \phi_t\,dZ^1_t = \pi_t(\nu\,dt + \sigma\,dW_t), \quad V_0 = v.$$

We consider the exponential utility function $u(x) = 1 - e^{-\varrho x}$, with $\varrho > 0$. Therefore, we deal with the following problem:

$$\sup_{\pi \in \Pi(\mathbb{G})} \mathbb{E}_{\mathbb{P}}\big\{u\big(V^v_T(\pi) + X\big)\big\} = 1 - \inf_{\pi \in \Pi(\mathbb{G})} \mathbb{E}_{\mathbb{P}}\big(e^{-\varrho V^v_T(\pi)}e^{-\varrho X}\big).$$

Let us describe the idea of a solution. Suppose that we can find a process $Z$ with $Z_T = e^{-\varrho X}$, which depends only on the claim $X$ and parameters $\varrho, \sigma, \nu$, and such that the process $e^{-\varrho V^v_t(\pi)}Z_t$ is a $\mathbb{G}$-submartingale under $\mathbb{P}$ for any admissible strategy $\pi$ and is a martingale under $\mathbb{P}$ for some admissible strategy $\pi^* \in \Pi(\mathbb{G})$. Then, we would have

$$\mathbb{E}_{\mathbb{P}}\big(e^{-\varrho V^v_T(\pi)}Z_T\big) \ge e^{-\varrho V^v_0(\pi)}Z_0 = e^{-\varrho v}Z_0$$

for any $\pi \in \Pi(\mathbb{G})$, with equality for some strategy $\pi^* \in \Pi(\mathbb{G})$. Consequently, we would obtain

$$\inf_{\pi \in \Pi(\mathbb{G})} \mathbb{E}_{\mathbb{P}}\big(e^{-\varrho V^v_T(\pi)}e^{-\varrho X}\big) = \mathbb{E}_{\mathbb{P}}\big(e^{-\varrho V^v_T(\pi^*)}e^{-\varrho X}\big) = e^{-\varrho v}Z_0, \tag{127}$$

and thus we would be in a position to conclude that $\pi^*$ is an optimal strategy. In fact, it will turn out that in order to implement the above idea we shall need to restrict further the class of $\mathbb{G}$-admissible trading strategies.

We shall search for an auxiliary process $Z$ in the class of all processes satisfying the following backward stochastic differential equation (BSDE)

$$dZ_t = f_t\,dt + z_t\,dW_t + \widehat{z}_t\,dM_t, \quad t \in [0, T), \quad Z_T = e^{-\varrho X}, \tag{128}$$

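where the process $f$ will be determined later (see equation (130) below). By applying Itô's formula, we obtain

$$d\big(e^{-\varrho V_t}\big) = e^{-\varrho V_t}\Big(\big(\tfrac{1}{2}\varrho^2\pi_t^2\sigma^2 - \varrho\pi_t\nu\big)dt - \varrho\pi_t\sigma\,dW_t\Big),$$

so that

$$d\big(e^{-\varrho V_t}Z_t\big) = e^{-\varrho V_t}\Big(f_t + \varrho Z_t\big(\tfrac{1}{2}\varrho\pi_t^2\sigma^2 - \pi_t\nu\big) - \varrho\pi_t\sigma z_t\Big)dt + e^{-\varrho V_t}\Big((z_t - \varrho\pi_t\sigma Z_t)\,dW_t + \widehat{z}_t\,dM_t\Big).$$

Let us choose $\pi^*$ such that it minimizes, for every $t$, the following expression:

$$\varrho Z_t\big(\tfrac{1}{2}\varrho\pi_t^2\sigma^2 - \pi_t\nu\big) - \varrho\pi_t\sigma z_t = -\varrho\pi_t(\nu Z_t + \sigma z_t) + \tfrac{1}{2}\varrho^2\pi_t^2\sigma^2Z_t.$$

It is easily seen that

$$\pi^*_t = \frac{\nu Z_t + \sigma z_t}{\varrho\sigma^2Z_t} = \frac{1}{\varrho\sigma}\Big(\theta + \frac{z_t}{Z_t}\Big). \tag{129}$$

Now, let us choose the process $f$, by postulating that

$$\begin{aligned}
f_t = f(Z_t, z_t) &= \varrho Z_t\big(\pi^*_t\nu - \tfrac{1}{2}\varrho(\pi^*_t)^2\sigma^2\big) + \varrho\pi^*_t\sigma z_t \\
&= \varrho\pi^*_t(Z_t\nu + \sigma z_t) - \tfrac{1}{2}\varrho^2(\pi^*_t)^2\sigma^2Z_t = \frac{(\nu Z_t + \sigma z_t)^2}{2\sigma^2Z_t}.
\end{aligned} \tag{130}$$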
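The choice of $\pi^*$ and of the driver $f$ amounts to minimizing a quadratic function of $\pi$ and setting $f$ equal to minus the minimum, so that the drift vanishes at $\pi^*$ and is non-negative otherwise. A minimal sketch of this pointwise computation follows; the function names, the symbol `rho` for the risk-aversion parameter, and the numerical values are hypothetical, and $Z > 0$ is assumed.

```python
def drift_term(pi, Z, z, nu, sigma, rho):
    # Strategy-dependent part of the drift of exp(-rho*V_t) * Z_t.
    return rho * Z * (0.5 * rho * pi**2 * sigma**2 - pi * nu) - rho * pi * sigma * z

def optimal_pi(Z, z, nu, sigma, rho):
    # First-order condition of the quadratic above, formula (129).
    return (nu * Z + sigma * z) / (rho * sigma**2 * Z)

def driver(Z, z, nu, sigma):
    # Formula (130): minus the minimal value of drift_term.
    return (nu * Z + sigma * z) ** 2 / (2.0 * sigma**2 * Z)

# Hypothetical values for illustration.
Z, z, nu, sigma, rho = 0.7, 0.2, 0.06, 0.3, 2.0
pi_star = optimal_pi(Z, z, nu, sigma, rho)
f_value = driver(Z, z, nu, sigma)
residual = f_value + drift_term(pi_star, Z, z, nu, sigma, rho)
```

Here `residual` vanishes up to rounding, while `f_value + drift_term(pi, ...)` is strictly positive for any other `pi`, which is the submartingale/martingale dichotomy used in the sequel.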

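In other words, we shall focus on the following BSDE:

$$dZ_t = \frac{(\nu Z_t + \sigma z_t)^2}{2\sigma^2Z_t}\,dt + z_t\,dW_t + \widehat{z}_t\,dM_t, \quad t \in [0, T), \quad Z_T = e^{-\varrho X}. \tag{131}$$

Recall that $W$ is a Brownian motion under $\mathbb{P}$, and that the risk-neutral probability $\mathbb{Q}$ is given by $d\mathbb{Q}|_{\mathcal{F}_t} = \eta_t\,d\mathbb{P}|_{\mathcal{F}_t}$, where $d\eta_t = -\eta_t\theta\,dW_t$ with $\theta = \nu/\sigma$ and $\eta_0 = 1$. Thus the process $W^{\mathbb{Q}}_t = W_t + \theta t$, $t \in [0, T]$, is a Brownian motion under $\mathbb{Q}$. It will be convenient to write equation (131) as

$$dZ_t = \big(\tfrac{1}{2}\theta^2Z_t + \theta z_t + \tfrac{1}{2}Z_t^{-1}z_t^2\big)dt + z_t\,dW_t + \widehat{z}_t\,dM_t, \quad t \in [0, T), \quad Z_T = e^{-\varrho X}.$$

Equivalently,

$$dZ_t = \big(\tfrac{1}{2}\theta^2Z_t + \tfrac{1}{2}Z_t^{-1}z_t^2\big)dt + z_t\,dW^{\mathbb{Q}}_t + \widehat{z}_t\,dM_t, \quad t \in [0, T), \quad Z_T = e^{-\varrho X}. \tag{132}$$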

Remark. To the best of our knowledge, no general theorem that would establish the existence of a solution to equation (132) is available. The comparison theorem works for BSDEs driven by a jump process when the drift satisfies some Lipschitz condition (see Royer (2003)). Hence, the proofs of Lepeltier and San-Martin (1997) and Kobylanski (2000), which rely on comparison results, may not carry over directly to the case of quadratic BSDEs driven by a jump process. We shall solve the BSDE (132) under rather restrictive assumptions on $X$. Hence, the general case remains an open problem.

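Lemma 19. Assume that there exist $\mathbb{G}$-predictable processes $k$, $\widehat{k} > -1$ and a constant $c$ such that

$$\exp(K_T)\,\mathcal{E}_T(\widehat{M}) = e^{-\varrho X}, \tag{133}$$

where

$$K_t = c + \int_0^t k_u\,dW^{\mathbb{Q}}_u, \qquad \widehat{M}_t = \int_0^t \widehat{k}_u\,dM_u,$$

and $\mathcal{E}(\widehat{M})$ is the Doléans exponential of $\widehat{M}$. Then $U_t = \exp(K_t)\,\mathcal{E}_t(\widehat{M})$ solves the following BSDE:

$$dU_t = \tfrac{1}{2}U_t^{-1}u_t^2\,dt + u_t\,dW^{\mathbb{Q}}_t + \widehat{u}_t\,dM_t, \quad t \in [0, T), \quad U_T = e^{-\varrho X}. \tag{134}$$

Proof. Since $d\mathcal{E}_t(\widehat{M}) = \mathcal{E}_{t-}(\widehat{M})\,d\widehat{M}_t$, the process $U$ defined above satisfies

$$dU_t = \tfrac{1}{2}U_tk_t^2\,dt + U_tk_t\,dW^{\mathbb{Q}}_t + U_{t-}\widehat{k}_t\,dM_t$$

and thus

$$dU_t = \tfrac{1}{2}U_t^{-1}u_t^2\,dt + u_t\,dW^{\mathbb{Q}}_t + \widehat{u}_t\,dM_t,$$

where we denote $u_t = U_tk_t$ and $\widehat{u}_t = U_{t-}\widehat{k}_t$. Since obviously $U_T = e^{-\varrho X}$, this ends the proof.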
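The Itô computation behind the first display of the proof can be spelled out as follows. This is a sketch under the assumptions of Lemma 19, in which $\widehat{M}_t = \int_0^t \widehat{k}_u\,dM_u$ denotes the stochastic integral with respect to the jump martingale $M$ (the hat marks the processes attached to $M$).

```latex
% K is continuous with dK_t = k_t dW^Q_t, hence by Ito's formula
d\big(e^{K_t}\big) = e^{K_t}\big(k_t\,dW^{\mathbb{Q}}_t + \tfrac{1}{2}k_t^2\,dt\big).
% The Doleans exponential of \widehat{M} has finite variation, so its
% covariation with the continuous process e^{K} vanishes, and the
% product rule gives
dU_t = \mathcal{E}_{t-}(\widehat{M})\,d\big(e^{K_t}\big) + e^{K_t}\,d\mathcal{E}_t(\widehat{M})
     = U_t\big(k_t\,dW^{\mathbb{Q}}_t + \tfrac{1}{2}k_t^2\,dt\big) + U_{t-}\widehat{k}_t\,dM_t .
% With u_t = U_t k_t one has U_t k_t^2 = U_t^{-1} u_t^2,
% which is the drift appearing in (134).
```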

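Corollary 12. Let $X$ be a $\mathcal{G}_T$-measurable claim such that (133) holds for some $\mathbb{G}$-predictable processes $k$, $\widehat{k} > -1$ and some constant $c$. Then there exists a solution $(Z, z, \widehat{z})$ of the BSDE (132). Moreover, the process $Z$ is strictly positive.

Proof. Let us set $Y_t = e^{-(T-t)\theta^2/2}$ and let $U$ be the process introduced in Lemma 19. Then the process $Z_t = U_tY_t$ satisfies

$$\begin{aligned}
dZ_t &= Y_t\,dU_t + \tfrac{1}{2}\theta^2Y_tU_t\,dt \\
&= \tfrac{1}{2}\theta^2Y_tU_t\,dt + \tfrac{1}{2}Y_tU_t^{-1}u_t^2\,dt + Y_tu_t\,dW^{\mathbb{Q}}_t + Y_t\widehat{u}_t\,dM_t \\
&= \tfrac{1}{2}\theta^2Z_t\,dt + \tfrac{1}{2}Z_t^{-1}Y_t^2u_t^2\,dt + Y_tu_t\,dW^{\mathbb{Q}}_t + Y_t\widehat{u}_t\,dM_t \\
&= \tfrac{1}{2}\theta^2Z_t\,dt + \tfrac{1}{2}Z_t^{-1}z_t^2\,dt + z_t\,dW^{\mathbb{Q}}_t + \widehat{z}_t\,dM_t,
\end{aligned}$$

where we set $z_t = Y_tu_t$ and $\widehat{z}_t = Y_t\widehat{u}_t$. It is also clear that $Z_T = U_T = e^{-\varrho X}$ and $Z$ is strictly positive.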

Recall that the process $Z$ depends on the choice of a contingent claim $X$, as well as on the model's parameters $\varrho$, $\sigma$ and $\nu$. The next lemma shows that the processes $Z$ and $\pi^*$ introduced above have indeed the desired properties that were described at the beginning of this section. To achieve our goal, we need to restrict the class of admissible trading strategies, however. We say that an admissible strategy $\pi$ is regular with respect to $X$ if the martingale part of the process $e^{-\varrho V^v_t(\pi)}Z_t$ is a martingale under $\mathbb{P}$, rather than a local martingale. We denote by $\Pi_X(\mathbb{G})$ the class of all admissible trading strategies which are regular with respect to $X$.

Lemma 20. Let $X$ be a $\mathcal{G}_T$-measurable claim such that (133) holds for some $\mathbb{G}$-predictable processes $k$, $\widehat{k}$ and some constant $c$. Assume that the default intensity $\gamma$ and the processes $k$, $\widehat{k}$ are bounded. Suppose that the process $Z = Z(X, \varrho, \sigma, \nu)$ is a solution to the BSDE (131) given in Corollary 12. Then:
(i) The process $e^{-\varrho V^v_t(\pi)}Z_t$ is a submartingale for any strategy $\pi \in \Pi_X(\mathbb{G})$.
(ii) The process $e^{-\varrho V^v_t(\pi^*)}Z_t$ is a martingale for the process $\pi^*$ given by expression (129).
(iii) The process $\pi^*$ belongs to the class $\Pi_X(\mathbb{G})$ of admissible trading strategies regular with respect to $X$.

Proof. In view of the definition of $\pi^*$ and the choice of the process $f$ (see formula (130)), the validity of part (i) is rather clear. To establish (ii), we shall first check that the process $e^{-\varrho V^*_t}Z_t$ is a martingale (and not only a local martingale) under $\mathbb{P}$, where $V^*_t = V^v_t(\pi^*)$. From the choice of $\pi^*$, we obtain

$$d\big(e^{-\varrho V^*_t}Z_t\big) = e^{-\varrho V^*_t}\big((z_t - \varrho\pi^*_t\sigma Z_t)\,dW_t + \widehat{z}_t\,dM_t\big) = -\theta e^{-\varrho V^*_t}Z_t\,dW_t + e^{-\varrho V^*_t}\widehat{z}_t\,dM_t.$$

This means that

$$e^{-\varrho V^*_t}Z_t = e^{-\varrho v}Z_0\exp\big(-\theta W_t - \tfrac{1}{2}\theta^2t\big)\exp\Big(-\int_0^t \frac{\widehat{z}_s}{Z_s}\,\xi_s\,ds\Big)\Big(1 + \frac{\widehat{z}_{\tau-}}{Z_{\tau-}}H_t\Big).$$

The quantity $e^{-\varrho v}Z_0\exp\big(-\theta W_t - \tfrac{1}{2}\theta^2t\big)$ is clearly a continuous martingale under $\mathbb{P}$. Recall that

$$\widehat{z}_t = Y_t\widehat{u}_t = \widehat{k}_tZ_t,$$

and thus $\widehat{z}_t/Z_t = \widehat{k}_t$ is a bounded process. We conclude that the process

$$\exp\Big(-\int_0^t \frac{\widehat{z}_s}{Z_s}\,\xi_s\,ds\Big)\Big(1 + \frac{\widehat{z}_{\tau-}}{Z_{\tau-}}H_t\Big)$$

is a bounded, purely discontinuous martingale under $\mathbb{P}$. To complete the proof, it remains to check that the process $\pi^*$ given by (129) is $\mathbb{G}$-admissible, in the sense of Definition 11. To this end, it suffices to check that

$$\int_0^T z_t^2Z_t^{-2}\,dt < \infty, \quad \mathbb{P}\text{-a.s.}$$

This is clear since the process $z_t/Z_t = k_t$ is bounded. We conclude that the strategy $\pi^*$ belongs to the class $\Pi_X(\mathbb{G})$.

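Recall now that in this section we examine the following problem:

$$\sup_{\pi \in \Pi_X(\mathbb{G})} \mathbb{E}_{\mathbb{P}}\big(u(V^v_T(\pi) + X)\big) = 1 - \inf_{\pi \in \Pi_X(\mathbb{G})} \mathbb{E}_{\mathbb{P}}\big(e^{-\varrho V^v_T(\pi)}e^{-\varrho X}\big).$$

We are in a position to state the following result.

Proposition 26. Let $X$ be a $\mathcal{G}_T$-measurable claim such that (133) holds for some $\mathbb{G}$-predictable processes $k$, $\widehat{k}$ and some constant $c$. Assume that the default intensity $\gamma$ and the processes $k$, $\widehat{k}$ are bounded. Then

$$\inf_{\pi \in \Pi_X(\mathbb{G})} \mathbb{E}_{\mathbb{P}}\big(e^{-\varrho V^v_T(\pi)}e^{-\varrho X}\big) = \mathbb{E}_{\mathbb{P}}\big(e^{-\varrho V^v_T(\pi^*)}e^{-\varrho X}\big) = e^{-\varrho v}Z^X_0,$$

where the optimal strategy $\pi^* \in \Pi_X(\mathbb{G})$ is given by the formula, for every $t \in [0, T]$,

$$\pi^*_t = \frac{1}{\varrho\sigma}\Big(\theta + \frac{z^X_t}{Z^X_t}\Big) = \frac{\theta + k_t}{\varrho\sigma},$$

where $Z^X_t = Z_t$ and $z^X_t = z_t$ are the first two components of a solution $(Z_t, z_t, \widehat{z}_t)$ of the BSDE

$$dZ_t = \frac{(\nu Z_t + \sigma z_t)^2}{2\sigma^2Z_t}\,dt + z_t\,dW_t + \widehat{z}_t\,dM_t, \quad Z_T = e^{-\varrho X}. \tag{135}$$

More explicitly (see Corollary 12), we have $z_t = k_tZ_t$ and

$$Z_t = e^{-(T-t)\theta^2/2}\exp(K_t)\,\mathcal{E}_t(\widehat{M}).$$

Proof. The proof is rather straightforward. We know that the process $Z$ which solves (135) is such that: (i) the process $Z_te^{-\varrho V^v_t(\pi^*)}$ is a martingale, and (ii) for any strategy $\pi \in \Pi_X(\mathbb{G})$ the process $Z_te^{-\varrho V^v_t(\pi)}$ is equal to a martingale plus an increasing process (since the drift term is non-negative), and thus it is a submartingale. This shows that (127) holds with $\Pi(\mathbb{G})$ replaced by $\Pi_X(\mathbb{G})$.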

It should be acknowledged that the assumptions of Proposition 26 are restrictive, so that it covers only a very special case of a claim $X$. Let us now comment briefly on the case of a general claim; we do not pretend here to give strict results, our aim is merely to give some hints on how one can deal with the general case.

Recall that our aim is to find a solution $(Z, z, \widehat{z})$ of the following BSDE:

$$dZ_t = \big(\tfrac{1}{2}\theta^2Z_t + \tfrac{1}{2}Z_t^{-1}z_t^2\big)dt + z_t\,dW^{\mathbb{Q}}_t + \widehat{z}_t\,dM_t, \quad t \in [0, T), \quad Z_T = e^{-\varrho X},$$

or equivalently, of the equation

$$dU_t = \tfrac{1}{2}U_t^{-1}u_t^2\,dt + u_t\,dW^{\mathbb{Q}}_t + \widehat{u}_t\,dM_t, \quad t \in [0, T), \quad U_T = e^{-\varrho X}.$$

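Assume that the process $U$ is strictly positive and set $X_t = \ln U_t$. Then, denoting $x_t = u_tU_t^{-1}$, $\widehat{x}_t = \widehat{u}_tU_{t-}^{-1}$ and applying Itô's formula, we obtain (recall that we denote $\xi_t = \gamma_t\mathbf{1}_{\{\tau > t\}}$)

$$\begin{aligned}
dX_t &= x_t\,dW^{\mathbb{Q}}_t + \widehat{x}_t\,dM_t + \big(\ln(1 + \widehat{x}_t) - \widehat{x}_t\big)\,dH_t \\
&= x_t\,dW^{\mathbb{Q}}_t + \widehat{x}_t\,dM_t + \big(\ln(1 + \widehat{x}_t) - \widehat{x}_t\big)(dM_t + \xi_t\,dt) \\
&= x_t\,dW^{\mathbb{Q}}_t + \ln(1 + \widehat{x}_t)\,dM_t + \big(\ln(1 + \widehat{x}_t) - \widehat{x}_t\big)\xi_t\,dt \\
&= x_t\,dW^{\mathbb{Q}}_t + x^*_t\,dM_t + \big(1 - e^{x^*_t} + x^*_t\big)\xi_t\,dt \\
&= x_t\,dW^{\mathbb{Q}}_t + x^*_t\,dH_t + \big(1 - e^{x^*_t}\big)\xi_t\,dt,
\end{aligned}$$

where $x^*_t = \ln(1 + \widehat{x}_t)$ and the terminal condition is $X_T = -\varrho X$. It thus suffices to solve the following BSDE:

$$dX_t = x_t\,dW^{\mathbb{Q}}_t + x^*_t\,dH_t + \big(1 - e^{x^*_t}\big)\xi_t\,dt, \quad t \in [0, T), \quad X_T = -\varrho X. \tag{136}$$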

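Assume first that $X \in \mathcal{F}_T$. In that case, it is obvious that we may take $\widehat{x} = x^* = 0$ and thus $X_t = -\varrho\,\mathbb{E}_{\mathbb{Q}}(X \mid \mathcal{G}_t) = -\varrho\,\mathbb{E}_{\mathbb{Q}}(X \mid \mathcal{F}_t)$ is a solution. In the general case, we note that the continuous $\mathbb{G}$-martingales are stochastic integrals with respect to the Brownian motion $W^{\mathbb{Q}}$. We may thus transform the problem: it suffices to find a process $x^*$ such that the process $R$, defined through the formula

$$R_t = \mathbb{E}_{\mathbb{Q}}\Big(-\varrho X + \int_0^T \big(e^{x^*_s} - 1\big)\xi_s\,ds - x^*_\tau\mathbf{1}_{\{\tau \le T\}} \,\Big|\, \mathcal{G}_t\Big),$$

is a continuous $\mathbb{G}$-martingale, so that $dR_t = x_t\,dW^{\mathbb{Q}}_t$ for some $\mathbb{G}$-predictable process $x$. Suppose that we can find a process $x^*$ for which the last property is valid. Then, by setting

$$\begin{aligned}
X_t &= R_t - \int_0^t \big(e^{x^*_s} - 1\big)\xi_s\,ds + x^*_\tau\mathbf{1}_{\{\tau \le t\}} \\
&= \mathbb{E}_{\mathbb{Q}}\Big(-\varrho X + \int_t^T \big(e^{x^*_s} - 1\big)\xi_s\,ds - x^*_\tau\mathbf{1}_{\{t < \tau \le T\}} \,\Big|\, \mathcal{G}_t\Big)
\end{aligned}$$

we obtain a solution $(X, x, x^*)$ to (136).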

Case of a survival claim. From now on, we shall focus on a survival claim X =Y 11τ>T, where Y is an FT -measurable random variable. Let us fix t ∈ [0, T ]. Onthe set t ≤ τ we obtain

EQ(Y 11τ>T | Gt) = eΓt EQ(e−ΓT Y | Ft)

and on the set τ < t, we have EQ(Y 11τ>T | Gt) = 0. The jump of the term At,defined as

Page 124: Paris-Princeton Lectures on Mathematical Finance 2003

116 T.R. Bielecki, M. Jeanblanc, and M. Rutkowski

At = EQ

(∫ T

t

(ex∗s − 1)ξs ds− x∗

τ11τ≤T∣∣∣Gt),

can be computed as follows. On the set t ≤ τ, we obtain

At =∫ T

t

EQ

((ex

∗s − 1)γs11τ>s | Gt

)ds − EQ

(x∗τ11τ≤T | Gt

)

= 11τ>teΓt EQ

(∫ T

t

(ex

∗s − 1 − x∗

s

)e−Γsγs ds

∣∣∣Ft).

On the set τ < t for At we have

EQ

(∫ T

t

(ex∗s − 1)γs11τ>s ds− x∗

τ11τ≤T∣∣∣Gt)

= −EQ

(x∗τ11τ≤t

∣∣Gt) = −x∗τ .

We conclude that our problem is to find a process x∗ such that

−EQ

(e−ΓT Y |Ft) = −e−Γtx∗

t − EQ

( ∫ T

t

(ex∗s − 1 − x∗

s)e−Γsγs ds

∣∣∣Ft).

In other words, we need to solve the following BSDE with F-adapted processes x∗

and κ

d(x∗t e

−Γt)

=(ex

∗t − 1 − x∗

t

)e−Γtγt dt + κt dW

Qt , t ∈ [0, T [, x∗

T = Y.

From integration by parts, this BSDE can be written

\[
dx^*_t = \big(e^{x^*_t} - 1\big)\, e^{-\Gamma_t} \gamma_t\,dt + \kappa_t\,dW^{\mathbb{Q}}_t, \quad t \in [0, T[, \qquad x^*_T = X.
\]

Unfortunately, the standard results for existence of solutions to BSDEs do not apply here because the drift term is not of linear growth with respect to $x^*$.

12.2 Hodges Buying and Selling Prices

Particular case. Assume, as before, that r = 0 and let us check that the Hodges buying price is the hedging price in case of attainable claims. Assume that a claim X is $\mathcal{F}_T$-measurable. By virtue of the predictable representation theorem, there exists a pair $(x, \hat{x})$, where x is a constant and $\hat{x}_t$ is an F-adapted process, such that $X = x + \int_0^T \hat{x}_u\,dW^{\mathbb{Q}}_u$, where $W^{\mathbb{Q}}_t = W_t + \theta t$. Here $x = \mathbb{E}_{\mathbb{Q}} X$ is the arbitrage price $\pi_0(X)$ of X and the replicating portfolio is obtained through $\hat{x}$. Hence, the time t value of X is $X_t = x + \int_0^t \hat{x}_u\,dW^{\mathbb{Q}}_u$. Then $dX_t = \hat{x}_t\,dW^{\mathbb{Q}}_t$ and the process

\[
Z_t = e^{-\theta^2 (T-t)/2}\, e^{-\varrho X_t}
\]

satisfies

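\[
\begin{aligned}
dZ_t &= Z_t \Big( \big( \tfrac12 \theta^2 + \tfrac12 \varrho^2 \hat{x}_t^2 \big)\,dt + \varrho\, \hat{x}_t\,dW^{\mathbb{Q}}_t \Big) \\
&= \frac{1}{2\sigma^2 Z_t} \big( \nu Z_t + \sigma \varrho Z_t \hat{x}_t \big)^2\,dt + \varrho Z_t \hat{x}_t\,dW_t .
\end{aligned}
\]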

Hence $(Z_t, \varrho Z_t \hat{x}_t, 0)$ is the solution of (135) with the terminal condition $e^{-\varrho X}$, and

\[
Z_0 = e^{-\theta^2 T/2}\, e^{-\varrho x}.
\]

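Note that, for X = 0, we get $Z_0 = e^{-\theta^2 T/2}$, therefore

\[
\inf_{\pi \in \Pi(\mathbf{G})} \mathbb{E}_{\mathbb{P}}\big( e^{-\varrho V^v_T(\pi)} \big) = e^{-\varrho v}\, e^{-\theta^2 T/2}.
\]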

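The G-Hodges buying price of X is the value of p such that

\[
\inf_{\pi \in \Pi(\mathbf{G})} \mathbb{E}_{\mathbb{P}}\big( e^{-\varrho V^v_T(\pi)} \big) = \inf_{\pi \in \Pi(\mathbf{G})} \mathbb{E}_{\mathbb{P}}\big( e^{-\varrho (V^{v-p}_T(\pi) + X)} \big),
\]

that is,

\[
e^{-\varrho v}\, e^{-\theta^2 T/2} = e^{-\varrho (v - p + \pi_0(X))}\, e^{-\theta^2 T/2}.
\]

We conclude easily that $p^{\mathbf{G}}_*(X) = \pi_0(X) = \mathbb{E}_{\mathbb{Q}} X$. Similar arguments show that $p^*_{\mathbf{G}}(X) = \pi_0(X)$.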

General case. Assume now that a claim X is $\mathcal{G}_T$-measurable and the assumptions of Proposition 26 are satisfied. Since the process Z introduced in Corollary 12 is strictly positive, we can use its logarithm. Let us assume that the processes $\hat{k}$ and $\tilde{k}$ are strictly positive, and let us denote $\hat{\psi}_t = Z_t/\hat{z}_t = \hat{k}_t^{-1}$, $\tilde{\psi}_t = Z_t/\tilde{z}_t = \tilde{k}_t^{-1}$ and

\[
\kappa_t = \frac{\tilde{\psi}_t}{\ln(1 + \tilde{\psi}_t)} \ge 0.
\]

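Then we get

\[
d(\ln Z_t) = \tfrac12 \theta^2\,dt + \hat{\psi}_t\,dW^{\mathbb{Q}}_t + \ln(1 + \tilde{\psi}_t) \big( dM_t + \xi_t (1 - \kappa_t)\,dt \big),
\]

and thus

\[
d(\ln Z_t) = \tfrac12 \theta^2\,dt + \hat{\psi}_t\,dW^{\mathbb{Q}}_t + \ln(1 + \tilde{\psi}_t)\,d\widetilde{M}_t,
\]

where

\[
d\widetilde{M}_t = dM_t + \xi_t (1 - \kappa_t)\,dt = dH_t - \xi_t \kappa_t\,dt.
\]

The process $\widetilde{M}$ is a martingale under the probability measure $\widetilde{\mathbb{Q}}$ defined as $d\widetilde{\mathbb{Q}}|_{\mathcal{G}_t} = \eta_t\,d\mathbb{P}|_{\mathcal{G}_t}$, where $\eta$ satisfies

\[
d\eta_t = -\eta_{t-} \big( \theta\,dW_t + \xi_t (1 - \kappa_t)\,dM_t \big)
\]

with $\eta_0 = 1$.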


Proposition 27. The G-Hodges buying price of X with respect to the exponential utility is the real number p such that $e^{-\varrho (v-p)} Z^X_0 = e^{-\varrho v} Z^0_0$, that is, $p^*_{\mathbf{G}}(X) = \varrho^{-1} \ln(Z^0_0 / Z^X_0)$ or, equivalently, $p^*_{\mathbf{G}}(X) = \mathbb{E}_{\widetilde{\mathbb{Q}}} X$.

Our previous study establishes that the dynamic hedging price of a claim X is the process $X_t = \mathbb{E}_{\widetilde{\mathbb{Q}}}(X \mid \mathcal{G}_t)$. This price is the expectation of the payoff, under some martingale measure, as is any price in the range of no-arbitrage prices.

13 Quadratic Hedging

We assume here that the wealth process follows

\[
dV^v_t(\pi) = \pi_t \big( \nu\,dt + \sigma\,dW_t \big), \qquad V^v_0(\pi) = v,
\]

where we assume that $\pi \in \Pi(\mathbf{F})$ or $\pi \in \Pi(\mathbf{G})$, depending on the case studied below. The more general case

\[
dV^v_t(\pi) = \pi_t \big( \nu\,dt + \sigma\,dW_t + \varphi\,dM_t \big), \qquad V^v_0(\pi) = v
\]

is too long to be presented here. In this section, we examine the issue of the quadratic pricing and hedging, specifically, for a given P-square-integrable claim X we solve the following minimization problems:

• For a given initial endowment v, solve the minimization problem:

\[
\min_{\pi}\ \mathbb{E}_{\mathbb{P}}\big( (V^v_T(\pi) - X)^2 \big).
\]

A solution to this problem provides the portfolio which, among the portfolios with a given initial wealth, has the closest terminal wealth to a given claim X, in the sense of the $L^2$-norm under P.

• Solve the minimization problem:

\[
\min_{\pi, v}\ \mathbb{E}_{\mathbb{P}}\big( (V^v_T(\pi) - X)^2 \big).
\]

The minimizing value of v is called the quadratic hedging price and the optimal π the quadratic hedging strategy.

The mean-variance hedging problem was examined in a fairly general framework of incomplete markets by means of BSDEs in several papers; see, for example, Mania (2000), Mania and Tevzadze (2003), Bobrovnytska and Schweizer (2004), Hu and Zhou (2004) or Lim (2004). Since this list is by no means exhaustive, the interested reader is referred to the references quoted in the above-mentioned papers.


13.1 Quadratic Hedging with F-Adapted Strategies

We shall first solve, for a given initial endowment v, the following minimization problem

\[
\min_{\pi \in \Pi(\mathbf{F})}\ \mathbb{E}_{\mathbb{P}}\big( (V^v_T(\pi) - X)^2 \big),
\]

where the claim $X \in \mathcal{G}_T$ is given as

\[
X = X_1 \mathbf{1}_{\{\tau > T\}} + X_2 \mathbf{1}_{\{\tau \le T\}}
\]

for some $\mathcal{F}_T$-measurable and P-square-integrable random variables $X_1$ and $X_2$. Using the same approach as in the previous section, we define the auxiliary function $J_X$ by setting

\[
J_X(y) = (y - X_1)^2 (1 - F_T) + (y - X_2)^2 F_T,
\]

so that its derivative equals

\[
J'_X(y) = 2 \big( y - X_1 (1 - F_T) - X_2 F_T \big).
\]

Hence

\[
I_X(z) = \tfrac12 z + X_1 (1 - F_T) + X_2 F_T,
\]

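and thus the optimal terminal wealth equals

\[
V^{v,*}_T = \tfrac12 \lambda^* \zeta_T + X_1 (1 - F_T) + X_2 F_T,
\]

where $\lambda^*$ is specified through the budget constraint:

\[
\mathbb{E}_{\mathbb{P}}(\zeta_T V^{v,*}_T) = \tfrac12 \lambda^*\, \mathbb{E}_{\mathbb{P}}(\zeta_T^2) + \mathbb{E}_{\mathbb{P}}\big(\zeta_T X_1 (1 - F_T)\big) + \mathbb{E}_{\mathbb{P}}(\zeta_T X_2 F_T) = v.
\]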

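We deduce that

\[
\begin{aligned}
\min_{\pi}\ \mathbb{E}_{\mathbb{P}}\big( (V^v_T - X)^2 \big)
&= \mathbb{E}_{\mathbb{P}}\Big[ \big( \tfrac12 \lambda^* \zeta_T + X_1 (1 - F_T) + X_2 F_T - X_1 \big)^2 (1 - F_T) \Big] \\
&\quad + \mathbb{E}_{\mathbb{P}}\Big[ \big( \tfrac12 \lambda^* \zeta_T + X_1 (1 - F_T) + X_2 F_T - X_2 \big)^2 F_T \Big] \\
&= \tfrac14 (\lambda^*)^2\, \mathbb{E}_{\mathbb{P}}(\zeta_T^2) + \mathbb{E}_{\mathbb{P}}\big( (X_1 - X_2)^2 F_T (1 - F_T) \big) \\
&= \frac{1}{\mathbb{E}_{\mathbb{P}}(\zeta_T^2)} \Big( v - \mathbb{E}_{\mathbb{P}}\big( \zeta_T (X_1 + F_T (X_2 - X_1)) \big) \Big)^2 + \mathbb{E}_{\mathbb{P}}\big( (X_1 - X_2)^2 F_T (1 - F_T) \big).
\end{aligned}
\]

The last equality follows by substituting $\tfrac12 \lambda^*\, \mathbb{E}_{\mathbb{P}}(\zeta_T^2) = v - \mathbb{E}_{\mathbb{P}}\big( \zeta_T (X_1 + F_T (X_2 - X_1)) \big)$ from the budget constraint.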

Therefore, we obtain the following result.


Proposition 28. If we restrict our attention to F-adapted strategies, the quadratic hedging price of the claim $X = X_1 \mathbf{1}_{\{\tau > T\}} + X_2 \mathbf{1}_{\{\tau \le T\}}$ equals

\[
\mathbb{E}_{\mathbb{P}}\big( \zeta_T (X_1 + F_T (X_2 - X_1)) \big) = \mathbb{E}_{\mathbb{Q}}\big( X_1 (1 - F_T) + F_T X_2 \big).
\]

The optimal quadratic hedging of X is the strategy which duplicates the $\mathcal{F}_T$-measurable contingent claim $X_1 (1 - F_T) + F_T X_2$.
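The pointwise minimization behind Proposition 28 can be checked numerically. The following is a minimal sketch, assuming illustrative scalar values for $X_1$, $X_2$ and $F_T$ on a single scenario (these numbers and the function names are hypothetical and not taken from the text). It compares the stationary point of $J_X$ with a brute-force grid search and evaluates the residual term $(X_1 - X_2)^2 F_T (1 - F_T)$.

```python
def J_X(y, x1, x2, f):
    """Auxiliary function J_X(y) = (y - X1)^2 (1 - F_T) + (y - X2)^2 F_T."""
    return (y - x1) ** 2 * (1.0 - f) + (y - x2) ** 2 * f


def minimizer(x1, x2, f):
    """Root of J_X'(y) = 2 (y - X1 (1 - F_T) - X2 F_T)."""
    return x1 * (1.0 - f) + x2 * f


# Hypothetical values of X1, X2 and F_T on one scenario.
x1, x2, f = 1.0, 0.4, 0.25
y_star = minimizer(x1, x2, f)

# Brute-force minimization over a fine grid of candidate values of y.
grid = [i / 10000.0 for i in range(-20000, 20001)]
y_grid = min(grid, key=lambda y: J_X(y, x1, x2, f))

# Minimal value of J_X and the closed-form residual (X1 - X2)^2 F_T (1 - F_T).
j_min = J_X(y_star, x1, x2, f)
residual = (x1 - x2) ** 2 * f * (1.0 - f)
print(y_star, y_grid, j_min, residual)
```

The grid search and the closed-form minimizer should agree up to the grid spacing, which illustrates why the optimal F-adapted hedge targets the claim $X_1 (1 - F_T) + F_T X_2$.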

Let us now examine the case of a generic $\mathcal{G}_T$-measurable random variable X. In this case, we shall only examine the solution of the second problem introduced above, that is,

\[
\min_{v, \pi}\ \mathbb{E}_{\mathbb{P}}\big( (V^v_T(\pi) - X)^2 \big).
\]

As we have explained in the previous part, this problem is essentially equivalent to a problem where we restrict our attention to the terminal wealth. From the properties of conditional expectations, we have

\[
\min_{V \in \mathcal{F}_T}\ \mathbb{E}_{\mathbb{P}}\big( (V - X)^2 \big) = \mathbb{E}_{\mathbb{P}}\big( (\mathbb{E}_{\mathbb{P}}(X \mid \mathcal{F}_T) - X)^2 \big)
\]

and the initial value of the strategy with terminal value $\mathbb{E}_{\mathbb{P}}(X \mid \mathcal{F}_T)$ is

\[
\mathbb{E}_{\mathbb{P}}\big( \zeta_T\, \mathbb{E}_{\mathbb{P}}(X \mid \mathcal{F}_T) \big) = \mathbb{E}_{\mathbb{P}}(\zeta_T X).
\]

In essence, the latter statement is a consequence of the completeness of the default-free market model. In conclusion, the quadratic hedging price equals $\mathbb{E}_{\mathbb{P}}(\zeta_T X) = \mathbb{E}_{\mathbb{Q}} X$ and the quadratic hedging strategy is the replicating strategy of the attainable claim $\mathbb{E}_{\mathbb{P}}(X \mid \mathcal{F}_T)$ associated with X.

13.2 Quadratic Hedging with G-Adapted Strategies

Our next goal is to solve, for a given initial endowment v, the following minimization problem

\[
\min_{\pi \in \Pi(\mathbf{G})}\ \mathbb{E}_{\mathbb{P}}\big( (V^v_T(\pi) - X)^2 \big).
\]

We have seen in Part II that one way of solving this problem is to project the random variable X on the set of stochastic integrals. Here, we present an alternative approach.

We are looking for G-adapted processes X, Θ and Ψ such that the process

\[
J_t(\pi) = \big( V^v_t(\pi) - X_t \big)^2 \Theta_t + \Psi_t, \quad \forall\, t \in [0, T], \tag{137}
\]

is a G-submartingale for any G-adapted trading strategy π and a G-martingale for some strategy $\pi^*$. In addition, we require that $X_T = X$, $\Theta_T = 1$, $\Psi_T = 0$. Let us assume that the dynamics of these processes are of the form

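\[
dX_t = x_t\,dt + \hat{x}_t\,dW_t + \tilde{x}_t\,dM_t, \tag{138}
\]

\[
d\Theta_t = \Theta_{t-} \big( \vartheta_t\,dt + \hat{\vartheta}_t\,dW_t + \tilde{\vartheta}_t\,dM_t \big), \tag{139}
\]

\[
d\Psi_t = \psi_t\,dt + \hat{\psi}_t\,dW_t + \tilde{\psi}_t\,dM_t, \tag{140}
\]

where the drifts $x_t$, $\vartheta_t$ and $\psi_t$ are yet to be determined. From Itô's formula, we obtain (recall that $\xi_t = \gamma_t \mathbf{1}_{\{\tau > t\}}$)

\[
\begin{aligned}
d(V_t - X_t)^2 &= 2 (V_t - X_t)(\pi_t \sigma - \hat{x}_t)\,dW_t - 2 (V_t - X_{t-})\, \tilde{x}_t\,dM_t \\
&\quad + \big[ (V_t - X_{t-} - \tilde{x}_t)^2 - (V_t - X_{t-})^2 \big]\,dM_t \\
&\quad + \Big( 2 (V_t - X_t)(\pi_t \nu - x_t) + (\pi_t \sigma - \hat{x}_t)^2 \\
&\qquad\ + \xi_t \big[ (V_t - X_t - \tilde{x}_t)^2 - (V_t - X_t)^2 \big] \Big)\,dt,
\end{aligned}
\]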

where we denote $V_t = V^v_t(\pi)$. The process J(π) is a martingale if and only if its drift term $k(t, \pi_t, x_t, \vartheta_t, \psi_t) = 0$ for every $t \in [0, T]$.

Straightforward calculations show that

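\[
\begin{aligned}
k(t, \pi_t, \vartheta_t, x_t, \psi_t) = \psi_t + \Theta_t \Big[ & \vartheta_t (V_t - X_t)^2 \\
& + 2 (V_t - X_t) \big[ (\pi_t \nu - x_t) + \hat{\vartheta}_t (\pi_t \sigma - \hat{x}_t) + \xi_t \tilde{x}_t \big] \\
& + (\pi_t \sigma - \hat{x}_t)^2 + \xi_t (\tilde{\vartheta}_t + 1) \big[ (V_t - X_t - \tilde{x}_t)^2 - (V_t - X_t)^2 \big] \Big].
\end{aligned}
\]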

In the first step, for any $t \in [0, T]$ we shall find $\pi^*_t$ such that the minimum of $k(t, \pi_t, x_t, \vartheta_t, \psi_t)$ is attained. Subsequently, we shall choose the auxiliary processes $x = x^*$, $\vartheta = \vartheta^*$ and $\psi = \psi^*$ in such a way that $k(t, \pi^*_t, x^*_t, \vartheta^*_t, \psi^*_t) = 0$. This choice will imply that $k(t, \pi_t, x^*_t, \vartheta^*_t, \psi^*_t) \ge 0$ for any trading strategy π and any $t \in [0, T]$.

The strategy $\pi^*$, which minimizes $k(t, \pi_t, x_t, \vartheta_t, \psi_t)$, is the solution of the following equation:

\[
\big( V^v_t(\pi) - X_t \big)(\nu + \hat{\vartheta}_t \sigma) + \sigma (\pi_t \sigma - \hat{x}_t) = 0, \quad \forall\, t \in [0, T].
\]

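Hence, the strategy $\pi^*$ is implicitly given by

\[
\pi^*_t = \sigma^{-1} \hat{x}_t - \sigma^{-2} (\nu + \hat{\vartheta}_t \sigma) \big( V^v_t(\pi^*) - X_t \big) = A_t - B_t \big( V^v_t(\pi^*) - X_t \big),
\]

where we denote

\[
A_t = \sigma^{-1} \hat{x}_t, \qquad B_t = \sigma^{-2} (\nu + \hat{\vartheta}_t \sigma).
\]

After some computations, we see that the drift term of the process J admits the following representation:

\[
\begin{aligned}
k(t, \pi_t, \vartheta_t, x_t, \psi_t) &= \psi_t + \Theta_t (V_t - X_t)^2 (\vartheta_t - \sigma^2 B_t^2) \\
&\quad + 2 \Theta_t (V_t - X_t) \big( \sigma^2 A_t B_t - \hat{\vartheta}_t \hat{x}_t - \tilde{\vartheta}_t \tilde{x}_t \xi_t - x_t \big) + \Theta_t \xi_t (\tilde{\vartheta}_t + 1)\, \tilde{x}_t^2 .
\end{aligned}
\]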


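From now on, we shall assume that the auxiliary processes ϑ, x and ψ are chosen as follows:

\[
\begin{aligned}
\vartheta_t &= \vartheta^*_t = \sigma^2 B_t^2, \\
x_t &= x^*_t = \sigma^2 A_t B_t - \hat{\vartheta}_t \hat{x}_t - \tilde{\vartheta}_t \tilde{x}_t \xi_t, \\
\psi_t &= \psi^*_t = -\Theta_t \xi_t (\tilde{\vartheta}_t + 1)\, \tilde{x}_t^2 .
\end{aligned}
\]

It is rather clear that if the drift coefficients ϑ, x, ψ in (138)-(140) are chosen as above, then the drift term in dynamics of J is always non-negative, and it is equal to 0 for the strategy $\pi^*$, where $\pi^*_t = A_t - B_t (V^v_t(\pi^*) - X_t)$.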

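Our next goal is to solve equations (138)-(140). Let us first consider equation (139). Since $\vartheta_t = \sigma^2 B_t^2$, it suffices to find the three-dimensional process $(\Theta, \hat{\vartheta}, \tilde{\vartheta})$ which is a solution to the following BSDE:

\[
d\Theta_t = \Theta_t \big( \sigma^{-2} (\nu + \hat{\vartheta}_t \sigma)^2\,dt + \hat{\vartheta}_t\,dW_t + \tilde{\vartheta}_t\,dM_t \big), \qquad \Theta_T = 1.
\]

It is obvious that the processes $\hat{\vartheta} = 0$, $\tilde{\vartheta} = 0$ and Θ, given as

\[
\Theta_t = \exp\big( -\theta^2 (T - t) \big), \quad \forall\, t \in [0, T], \tag{141}
\]

solve this equation.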
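As a short verification, written with the convention (assumed here for the garbled notation) that the hatted and tilded coefficients are those of $dW_t$ and $dM_t$ respectively: when both martingale coefficients of Θ vanish, the BSDE for Θ reduces to a deterministic linear equation, since $\sigma^{-2} \nu^2 = \theta^2$ with $\theta = \nu/\sigma$.

\[
\frac{d\Theta_t}{dt} = \theta^2\, \Theta_t, \quad \Theta_T = 1
\qquad \Longrightarrow \qquad
\Theta_t = \exp\Big( -\int_t^T \theta^2\,ds \Big) = e^{-\theta^2 (T - t)} .
\]

Differentiating the right-hand side gives $\theta^2 e^{-\theta^2 (T - t)}$, which matches the drift, and the terminal condition holds at $t = T$.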

In the next step, we search for a three-dimensional process $(X, \hat{x}, \tilde{x})$, which solves equation (138) with $x_t = x^*_t = \sigma^2 A_t (\nu/\sigma^2) = \theta \hat{x}_t$. It is clear that $(X, \hat{x}, \tilde{x})$ is the unique solution to the linear BSDE

\[
dX_t = \theta \hat{x}_t\,dt + \hat{x}_t\,dW_t + \tilde{x}_t\,dM_t, \qquad X_T = X.
\]

The unique solution to this equation is $X_t = \mathbb{E}_{\mathbb{Q}}(X \mid \mathcal{G}_t)$, where Q is the risk-neutral probability measure, so that $d\mathbb{Q} = \eta_t\,d\mathbb{P}$, where

\[
d\eta_t = -\theta \eta_t\,dW_t, \qquad \eta_0 = 1.
\]

The components $\hat{x}$ and $\tilde{x}$ are given by the integral representation of the G-martingale X with respect to $W^{\mathbb{Q}}$ and M. Notice also that since $\hat{\vartheta} = 0$, the optimal portfolio $\pi^*$ is given by the feedback formula

\[
\pi^*_t = \sigma^{-1} \big( \hat{x}_t - \theta (V^v_t(\pi^*) - X_t) \big).
\]

Finally, since $\tilde{\vartheta} = 0$, we have $\psi_t = -\xi_t \tilde{x}_t^2 \Theta_t$. Therefore, we can solve explicitly the BSDE (140) for the process Ψ. Indeed, we are now looking for a three-dimensional process $(\Psi, \hat{\psi}, \tilde{\psi})$, which is the unique solution of the BSDE

\[
d\Psi_t = -\Theta_t \xi_t \tilde{x}_t^2\,dt + \hat{\psi}_t\,dW_t + \tilde{\psi}_t\,dM_t, \qquad \Psi_T = 0.
\]

Noting that the process

\[
\Psi_t + \int_0^t \Theta_s \xi_s \tilde{x}_s^2\,ds
\]

is a G-martingale under P, we obtain the value of Ψ in a closed form:

\[
\Psi_t = \mathbb{E}_{\mathbb{P}}\Big( \int_t^T \Theta_s \xi_s \tilde{x}_s^2\,ds \,\Big|\, \mathcal{G}_t \Big). \tag{142}
\]

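Substituting (141) and (142) in (137), we conclude that the value function for our problem is $J^*_t = J_t(\pi^*)$, where in turn

\[
\begin{aligned}
J_t(\pi^*) &= \big( V^v_t(\pi^*) - X_t \big)^2 e^{-\theta^2 (T-t)} + \mathbb{E}_{\mathbb{P}}\Big( \int_t^T \Theta_s \xi_s \tilde{x}_s^2\,ds \,\Big|\, \mathcal{G}_t \Big) \\
&= \big( V^v_t(\pi^*) - X_t \big)^2 e^{-\theta^2 (T-t)} + \int_t^T e^{-\theta^2 (T-s)}\, \mathbb{E}_{\mathbb{P}}\big( \gamma_s \tilde{x}_s^2 \mathbf{1}_{\{\tau > s\}} \mid \mathcal{G}_t \big)\,ds \\
&= \big( V^v_t(\pi^*) - X_t \big)^2 e^{-\theta^2 (T-t)} + \mathbf{1}_{\{\tau > t\}} \int_t^T e^{-\theta^2 (T-s)}\, \mathbb{E}_{\mathbb{P}}\big( \gamma_s \tilde{x}_s^2 e^{\Gamma_t - \Gamma_s} \mid \mathcal{F}_t \big)\,ds,
\end{aligned}
\]

where we have identified the process $\tilde{x}$ with its F-adapted version (recall that any G-predictable process is equal, prior to default, to an F-predictable process). In particular,

\[
J^*_0 = e^{-\theta^2 T} \Big( (v - X_0)^2 + \mathbb{E}_{\mathbb{P}}\Big( \int_0^T e^{\theta^2 s} \gamma_s \tilde{x}_s^2 e^{-\Gamma_s}\,ds \Big) \Big).
\]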

From the last formula, it is obvious that the quadratic hedging price is $X_0 = \mathbb{E}_{\mathbb{Q}} X$. We are in a position to formulate the main result of this section. A corresponding theorem for a default-free financial model was established by Kohlmann and Zhou (2000).

Proposition 29. Let a claim X be $\mathcal{G}_T$-measurable and P-square-integrable. The optimal trading strategy $\pi^*$, which solves the quadratic problem

\[
\min_{\pi \in \Pi(\mathbf{G})}\ \mathbb{E}_{\mathbb{P}}\big( (V^v_T(\pi) - X)^2 \big),
\]

is given by the feedback formula

\[
\pi^*_t = \sigma^{-1} \big( \hat{x}_t - \theta (V^v_t(\pi^*) - X_t) \big),
\]

where $X_t = \mathbb{E}_{\mathbb{Q}}(X \mid \mathcal{G}_t)$ for every $t \in [0, T]$, and the process $\hat{x}_t$ is specified by

\[
dX_t = \hat{x}_t\,dW^{\mathbb{Q}}_t + \tilde{x}_t\,dM_t .
\]

The quadratic hedging price of X is equal to $\mathbb{E}_{\mathbb{Q}} X$.


Survival Claim

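Let us consider a simple survival claim $X = \mathbf{1}_{\{\tau > T\}}$, and let us assume that Γ is deterministic, specifically, $\Gamma(t) = \int_0^t \gamma(s)\,ds$. In that case, from the well-known representation theorem (see Bielecki and Rutkowski (2004), Page 159), we have $dX_t = \tilde{x}_t\,dM_t$ with $\tilde{x}_t = -e^{\Gamma(t) - \Gamma(T)}$. Hence

\[
\begin{aligned}
\Psi_t &= \mathbb{E}_{\mathbb{P}}\Big( \int_t^T \Theta_s \xi_s \tilde{x}_s^2\,ds \,\Big|\, \mathcal{G}_t \Big) \\
&= \mathbb{E}_{\mathbb{P}}\Big( \int_t^T \Theta_s \gamma(s) \mathbf{1}_{\{\tau > s\}}\, e^{2\Gamma(s) - 2\Gamma(T)}\,ds \,\Big|\, \mathcal{G}_t \Big) \\
&= \mathbf{1}_{\{\tau > t\}}\, e^{\Gamma(t) - 2\Gamma(T)}\, \mathbb{E}_{\mathbb{P}}\Big( \int_t^T e^{-\theta^2 (T-s)} \gamma(s) e^{\Gamma(s)}\,ds \,\Big|\, \mathcal{F}_t \Big) \\
&= \mathbf{1}_{\{\tau > t\}}\, e^{\Gamma(t) - 2\Gamma(T)} \int_t^T e^{-\theta^2 (T-s)} \gamma(s) e^{\Gamma(s)}\,ds .
\end{aligned}
\]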
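For a concrete feel of the last formula, consider the special case of a constant intensity, $\gamma(s) = \gamma$, so that $\Gamma(t) = \gamma t$ (an assumption made only for this illustration). At $t = 0$ the integral can then be evaluated in closed form as $\Psi_0 = \gamma e^{-\gamma T} (1 - e^{-(\theta^2 + \gamma) T}) / (\theta^2 + \gamma)$. The sketch below, with hypothetical function names and parameter values, compares this expression with a Simpson-rule quadrature of the integral.

```python
import math


def psi0_closed_form(gamma, theta, T):
    """Psi_0 for a constant intensity gamma, i.e. Gamma(t) = gamma * t."""
    a = theta ** 2 + gamma
    return gamma * math.exp(-gamma * T) * (1.0 - math.exp(-a * T)) / a


def psi0_quadrature(gamma, theta, T, n=2000):
    """Composite Simpson rule for the last integral in the text, at t = 0."""
    def integrand(s):
        return math.exp(-theta ** 2 * (T - s)) * gamma * math.exp(gamma * s)

    h = T / n  # n must be even for Simpson's rule
    total = integrand(0.0) + integrand(T)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * h)
    # Prefactor exp(Gamma(0) - 2 Gamma(T)) = exp(-2 gamma T).
    return math.exp(-2.0 * gamma * T) * total * h / 3.0


# Hypothetical parameters: intensity, market price of risk, horizon.
gamma, theta, T = 0.05, 0.3, 5.0
print(psi0_closed_form(gamma, theta, T), psi0_quadrature(gamma, theta, T))
```

The two values should agree to high accuracy, which is a simple sanity check on the algebra rather than a statement about any particular market.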

One can check that, at time 0, the value function is indeed smaller than the one obtained with F-adapted portfolios.

Case of an Attainable Claim

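Assume now that a claim X is $\mathcal{F}_T$-measurable. Then $X_t = \mathbb{E}_{\mathbb{Q}}(X \mid \mathcal{G}_t)$ is the price of X, and it satisfies $dX_t = \hat{x}_t\,dW^{\mathbb{Q}}_t$. The optimal strategy is, in a feedback form,

\[
\pi^*_t = \sigma^{-1} \big( \hat{x}_t - \theta (V_t - X_t) \big)
\]

and the associated wealth process satisfies

\[
dV_t = \pi^*_t (\nu\,dt + \sigma\,dW_t) = \pi^*_t \sigma\,dW^{\mathbb{Q}}_t = \sigma^{-1} \big( \sigma \hat{x}_t - \nu (V_t - X_t) \big)\,dW^{\mathbb{Q}}_t .
\]

Therefore,

\[
d(V_t - X_t) = -\theta (V_t - X_t)\,dW^{\mathbb{Q}}_t .
\]

Hence, if we start with an initial wealth equal to the arbitrage price $\pi_0(X)$ of X, then we obtain that $V_t = X_t$ for every $t \in [0, T]$, as expected.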
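The conclusion can be made explicit by solving the linear equation for the difference $V_t - X_t$. Since θ is constant, this difference is a stochastic exponential:

\[
V_t - X_t = (v - X_0)\, \exp\Big( -\theta W^{\mathbb{Q}}_t - \tfrac12 \theta^2 t \Big), \quad t \in [0, T].
\]

The exponential factor is strictly positive, so the hedging error vanishes identically if and only if $v = X_0 = \pi_0(X)$; for any other initial endowment it never changes sign.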

Hodges Price

Let us emphasize that the Hodges price has no real meaning here, since the problem $\min \mathbb{E}_{\mathbb{P}}((V^v_T)^2)$ has no financial interpretation. We have studied in the preceding part a more pertinent problem, with a constraint on the expected value of $V^v_T$ under P. Nevertheless, from a mathematical point of view, the Hodges price would be the value of p such that

\[
v^2 - (v - p)^2 = \int_0^T e^{\theta^2 s}\, \mathbb{E}_{\mathbb{P}}\big( \gamma_s \tilde{x}_s^2 e^{-\Gamma_s} \big) \mathbf{1}_{\{\tau > t\}}\,ds .
\]

In the case of the example studied in Section 13.2, the Hodges price would be the non-negative value of p such that

\[
2vp - p^2 = e^{-2\Gamma_T} \int_0^T e^{\theta^2 s} \gamma_s e^{\Gamma_s}\,ds .
\]

Let us also mention that our results are different from results of Lim (2004). Indeed, Lim studies a model with Poisson component, and thus in his approach the intensity of this process does not vanish after the first jump.

14 Optimization in Incomplete Markets

In this last section, we shall briefly (and rather informally) examine a specific optimization problem associated with a defaultable claim. The interested reader is referred to Lukas (2001) for more details on the approach examined in this section. We now assume that the only risky asset available in the market is

\[
dZ^1_t = Z^1_t \big( \nu\,dt + \sigma\,dW_t + \varphi\,dM_t \big),
\]

and we assume that r = 0. We deal with the following problem:

\[
\sup_{\pi}\ \mathbb{E}_{\mathbb{P}}\big( u(V^v_{\tau \wedge T}(\pi) + X) \big)
\]

for the claim X of the form

\[
X = \mathbf{1}_{\{\tau > T\}}\, g(Z^1_T) + \mathbf{1}_{\{\tau \le T\}}\, h(Z^1_\tau)
\]

for some functions $g, h : \mathbb{R} \to \mathbb{R}$. Note that here the recovery payment is paid at hit, that is, at the time of default. In addition, we assume that the default intensity γ under P is constant (hence, it is constant under any equivalent martingale measure as well). After time τ, the market reduces to a standard Black-Scholes model, and thus the solution to the corresponding optimization problem is well known.

In the particular case of the exponential utility $u(x) = 1 - \exp(-\varrho x)$, $\varrho > 0$, we are in a position to use the duality theory. This problem was studied by, among others, Rouge and El Karoui (2000), Delbaen et al. (2002) and Collin-Dufresne and Hugonnier (2002).

Let $H(\mathbb{Q} \mid \mathbb{P})$ stand for the relative entropy of Q with respect to P. Recall that if a probability measure Q is absolutely continuous with respect to P then

\[
H(\mathbb{Q} \mid \mathbb{P}) = \mathbb{E}_{\mathbb{P}}\Big( \frac{d\mathbb{Q}}{d\mathbb{P}} \ln \frac{d\mathbb{Q}}{d\mathbb{P}} \Big) = \mathbb{E}_{\mathbb{Q}}\Big( \ln \frac{d\mathbb{Q}}{d\mathbb{P}} \Big).
\]

Otherwise, the relative entropy $H(\mathbb{Q} \mid \mathbb{P})$ equals ∞.
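On a finite sample space the definition reduces to a finite sum, which the following minimal sketch evaluates. The function name and the list-based representation of the two measures are illustrative choices, not part of the text; the usual convention $0 \ln 0 = 0$ is assumed.

```python
import math


def relative_entropy(q, p):
    """H(Q|P) = sum_i q_i ln(q_i / p_i) on a finite sample space.

    Returns math.inf when Q is not absolutely continuous with respect to P.
    """
    total = 0.0
    for qi, pi in zip(q, p):
        if qi == 0.0:
            continue  # convention 0 ln 0 = 0
        if pi == 0.0:
            return math.inf  # Q charges an atom that is P-null
        total += qi * math.log(qi / pi)
    return total


# Hypothetical two-point measures.
print(relative_entropy([0.25, 0.75], [0.5, 0.5]))
print(relative_entropy([0.5, 0.5], [1.0, 0.0]))
```

The first call gives a strictly positive number because the two measures differ; the second returns infinity because Q is not absolutely continuous with respect to P.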

It is well known that, under suitable technical assumptions (see Rouge and El Karoui (2000) or Delbaen et al. (2002) for details), we have

\[
\sup_{\pi}\ \mathbb{E}_{\mathbb{P}}\big( 1 - e^{-(V^v_T(\pi) + X)} \big)
= 1 - \exp\Big( -\inf_{\pi} \inf_{\mathbb{Q} \in \mathcal{Q}_T} \big( H(\mathbb{Q} \mid \mathbb{P}) + \mathbb{E}_{\mathbb{Q}}(V^v_T(\pi) + X) \big) \Big),
\]

where π runs over a suitable class of admissible portfolios, and $\mathcal{Q}_T$ stands for the set of equivalent martingale measures on the σ-field $\mathcal{G}_T$.

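Since for any admissible portfolio π the expected value under any martingale measure $\mathbb{Q} \in \mathcal{Q}_T$ of the terminal wealth $V^v_T(\pi)$ equals v, we obtain

\[
\sup_{\pi}\ \mathbb{E}_{\mathbb{P}}\big( 1 - e^{-(V^v_T(\pi) + X)} \big) = 1 - \exp\Big( -\inf_{\mathbb{Q} \in \mathcal{Q}_T} \big( H(\mathbb{Q} \mid \mathbb{P}) + \mathbb{E}_{\mathbb{Q}} X + v \big) \Big).
\]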
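The duality can be illustrated in a toy setting that is much simpler than the model of this section: a one-period binomial market with zero interest rate, in which the martingale measure is unique and no default occurs. This is an assumption made purely for illustration, as are all parameter values and function names below, and the exponential utility is taken with unit risk aversion. The sketch minimizes the expected exponential loss over the position π by bisection on its derivative and compares the result with the entropy expression.

```python
import math


def primal_value(p, u, d, v, x_up, x_down, tol=1e-12):
    """inf over pi of E_P exp(-(V + X)), with V = v + pi * (S_1 - 1), S_0 = 1."""
    def expected_loss(pi):
        return (p * math.exp(-(v + pi * (u - 1.0) + x_up))
                + (1.0 - p) * math.exp(-(v + pi * (d - 1.0) + x_down)))

    def derivative(pi):
        return (-(u - 1.0) * p * math.exp(-(v + pi * (u - 1.0) + x_up))
                - (d - 1.0) * (1.0 - p) * math.exp(-(v + pi * (d - 1.0) + x_down)))

    # For d < 1 < u the derivative is increasing; the bracket below is
    # assumed wide enough to contain its root for the sample parameters.
    lo, hi = -50.0, 50.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if derivative(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return expected_loss(0.5 * (lo + hi))


def dual_value(p, u, d, v, x_up, x_down):
    """exp(-(H(Q|P) + E_Q X + v)) for the unique martingale measure Q."""
    q = (1.0 - d) / (u - d)
    entropy = q * math.log(q / p) + (1.0 - q) * math.log((1.0 - q) / (1.0 - p))
    return math.exp(-(entropy + q * x_up + (1.0 - q) * x_down + v))


# Hypothetical parameters: P(up), up and down factors, endowment, claim payoffs.
params = (0.6, 1.2, 0.9, 1.0, 0.5, 0.0)
print(primal_value(*params), dual_value(*params))
```

Both numbers should coincide up to the bisection tolerance. In the incomplete defaultable market of the text the infimum over martingale measures is no longer trivial, which is what the remainder of the section addresses.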

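Furthermore, since, without loss of generality, we may stop all the processes considered here at the default time τ, we end up with the following equality

\[
\inf_{\pi}\ \mathbb{E}_{\mathbb{P}}\big( e^{-(V^v_{T \wedge \tau}(\pi) + X)} \big) = \exp\Big( -\inf_{\mathbb{Q} \in \mathcal{Q}_{T \wedge \tau}} \big( H(\mathbb{Q} \mid \mathbb{P}) + \mathbb{E}_{\mathbb{Q}} X + v \big) \Big),
\]

where π runs over the class of all admissible trading strategies, and $\mathcal{Q}_{T \wedge \tau}$ stands for the set of equivalent martingale measures on the σ-field $\mathcal{G}_{T \wedge \tau}$. The following result provides a description of the class $\mathcal{Q}_{T \wedge \tau}$.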

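Lemma 21. The class $\mathcal{Q}_{T \wedge \tau}$ of all equivalent martingale measures on the space $(\Omega, \mathcal{G}_{T \wedge \tau})$ is the set of all probability measures $\mathbb{Q}^{k,h}$ of the form

\[
d\mathbb{Q}^{k,h}|_{\mathcal{G}_{T \wedge \tau}} = \eta_{T \wedge \tau}(k, h)\,d\mathbb{P},
\]

where the Radon-Nikodym density process η(k, h) is given by the formula

\[
\eta_t(k, h) = \mathcal{E}_t(kM)\, \mathcal{E}_t(hW), \quad \forall\, t \in [0, T],
\]

for some F-adapted process k such that the inequality $k_t > -1$ holds for every $t \in [0, T]$, and for the associated process $h_t = -\theta - \varphi \gamma \sigma^{-1} (1 + k_t)$, where $\theta = \nu/\sigma$. Under the martingale measure $\mathbb{Q} = \mathbb{Q}^{k,h}$ the process

\[
W^h_{t \wedge \tau} = W_{t \wedge \tau} - \int_0^{t \wedge \tau} h_s\,ds, \quad \forall\, t \in [0, T],
\]

is a stopped Brownian motion, and the process

\[
M^k_{t \wedge \tau} = M_{t \wedge \tau} - \int_0^{t \wedge \tau} \gamma k_s\,ds, \quad \forall\, t \in [0, T],
\]

is a martingale stopped at τ.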


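Straightforward calculations show that the relative entropy of a martingale measure $\mathbb{Q} = \mathbb{Q}^{k,h} \in \mathcal{Q}_{T \wedge \tau}$ with respect to P equals

\[
\mathbb{E}_{\mathbb{Q}}\Big( \int_0^{\tau \wedge T} h_s\,dW^h_s + \int_0^{\tau \wedge T} \big( \tfrac12 h_s^2 - \gamma k_s + \gamma (1 + k_s) \ln(1 + k_s) \big)\,ds \Big)
+ \mathbb{E}_{\mathbb{Q}}\Big( \int_0^{\tau \wedge T} \ln(1 + k_s)\,dM^k_s \Big).
\]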

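Consequently, the optimization problem

\[
\inf_{\mathbb{Q} \in \mathcal{Q}_{T \wedge \tau}} \big( H(\mathbb{Q} \mid \mathbb{P}) + \mathbb{E}_{\mathbb{Q}} X \big)
\]

can be reduced to the following problem

\[
\inf_{k, h}\ \mathbb{E}_{\mathbb{Q}}\Big( \int_0^{\tau \wedge T} \big( \tfrac12 h_s^2 - \gamma k_s + \gamma (1 + k_s) \ln(1 + k_s) \big)\,ds + X \Big), \tag{143}
\]

where the processes k and h are as specified in the statement of Lemma 21. Let us set

\[
\ell(k_s) = \tfrac12 h_s^2 - \gamma k_s + \gamma (1 + k_s) \ln(1 + k_s)
\]

so that, using $h = -\theta - \varphi \gamma \sigma^{-1} (1 + k)$ from Lemma 21,

\[
\ell(k) = \tfrac12 \big( \theta + \varphi \gamma \sigma^{-1} (1 + k) \big)^2 - \gamma k + \gamma (1 + k) \ln(1 + k). \tag{144}
\]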

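Consider a dynamic version of the minimization problem (143)

\[
\inf_{k, h}\ \mathbb{E}_{\mathbb{Q}}\Big( \int_t^{\tau \wedge T} \ell(k_s)\,ds + \mathbf{1}_{\{\tau \le T\}}\, h(Z^1_\tau) + \mathbf{1}_{\{\tau > T\}}\, g(Z^1_T) \,\Big|\, \mathcal{G}_t \Big).
\]

Let us denote $K^t_s = e^{-\int_t^s \gamma (1 + k_u)\,du}$ for $t \le s$. Then, on the pre-default event $\{\tau > t\}$, we obtain the following problem:

\[
\inf_{k, h}\ \mathbb{E}_{\mathbb{Q}}\Big( \int_t^T K^t_s \big( \ell(k_s) + \gamma (1 + k_s)\, h(Z^1_s (1 + \varphi)) \big)\,ds + K^t_T\, g(Z^1_T) \,\Big|\, \mathcal{F}_t \Big).
\]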

The value function J(t, x) of the latter problem satisfies the HJB equation

\[
\partial_t J(t, x) + \tfrac12 \sigma^2 x^2 \partial_{xx} J(t, x) + \inf_{k > -1} \big( -\varphi \gamma (1 + k)\, x\, \partial_x J(t, x) - \gamma (1 + k) J(t, x) + \psi(k, x) \big) = 0
\]

with the terminal condition J(T, x) = g(x), where we denote

\[
\psi(k, x) = \ell(k) + \gamma (1 + k)\, h(x (1 + \varphi))
\]

and where the function ℓ is given by (144). The minimizer is given by $k = k^*(t, x)$, which is the unique root of the following equation:

\[
\frac{\varphi}{\sigma^2} \big( \nu + \varphi \gamma (1 + k) \big) + \ln(1 + k) = J(t, x) + \varphi x\, \partial_x J(t, x) - h(x (1 + \varphi)),
\]

and the optimal portfolio $\pi^*$ is given by the formula

\[
\pi^*_t = (\sigma^2)^{-1} \big( \nu + \varphi \gamma (1 + k^*(t, Z^1_t)) - \sigma^2 Z^1_{t-} \partial_x J(t, Z^1_{t-}) \big).
\]
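The left-hand side of the equation for $k^*$ is strictly increasing in k on $(-1, \infty)$ when $\gamma \ge 0$, tends to $-\infty$ as $k \to -1$ and to $+\infty$ as $k \to \infty$, so the root can be located by bisection. The sketch below assumes that the right-hand side, $J(t, x) + \varphi x\, \partial_x J(t, x) - h(x(1 + \varphi))$, has already been evaluated and is passed as the number `rhs`; the function name, the bracketing scheme and the parameter values are hypothetical.

```python
import math


def solve_k_star(rhs, nu, sigma, phi, gamma, tol=1e-12):
    """Root k > -1 of (phi/sigma^2) (nu + phi*gamma*(1+k)) + ln(1+k) = rhs.

    Assumes gamma >= 0 and a moderate value of rhs, so that the bracket
    search below terminates well inside floating-point range.
    """
    def g(k):
        return ((phi / sigma ** 2) * (nu + phi * gamma * (1.0 + k))
                + math.log1p(k) - rhs)

    lo, hi = -0.5, 1.0
    while g(lo) > 0.0:  # move lo towards -1 until g(lo) <= 0
        lo = -1.0 + 0.5 * (lo + 1.0)
    while g(hi) < 0.0:  # enlarge hi until g(hi) >= 0
        hi = 2.0 * hi + 1.0
    while hi - lo > tol * (1.0 + abs(hi)):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical inputs; with phi = 0 the root is exp(rhs) - 1.
print(solve_k_star(0.3, 0.1, 0.2, 0.0, 0.05))
print(solve_k_star(0.3, 0.1, 0.2, -0.5, 0.05))
```

In an actual numerical scheme for the HJB equation this root search would be repeated at every grid point (t, x), with `rhs` computed from the current approximation of J.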

Remark. Note that in the case ϕ = 0 this result is consistent with our result established in Section 12.1. When ϕ = 0, the process $Z^1$ is continuous, and thus we obtain

\[
\pi^*_t = \sigma^{-1} \big( \theta - \sigma Z^1_t \partial_x J(t, Z^1_t) \big),
\]

where the value function J(t, x) satisfies the simplified HJB equation

\[
\partial_t J(t, x) + \tfrac12 \sigma^2 x^2 \partial_{xx} J(t, x) + \inf_{k > -1} \big( \ell(k) - \gamma (1 + k) J(t, x) + \gamma (1 + k)\, h(x) \big) = 0,
\]

where in turn

\[
\ell(k) = \tfrac12 \theta^2 - \gamma k + \gamma (1 + k) \ln(1 + k).
\]

References

Arvanitis, A. and Gregory, J. (2001) Credit: The Complete Guide to Pricing, Hedging and Risk Management. Risk Publications.

Arvanitis, A. and Laurent, J.-P. (1999) On the edge of completeness. Risk, October, 61–65.

Barles, G., Buckdahn, R. and Pardoux, E. (1997) Backward stochastic differential equations and integral-partial differential equations. Stochastics and Stochastics Reports 60, 57–83.

Belanger, A., Shreve, S. and Wong, D. (2001) A unified model for credit derivatives. To appear in Mathematical Finance.

Bernis, G. and Jeanblanc, M. (2003) Hedging defaultable derivatives via utility theory. Preprint, Evry University.

Bielecki, T.R. and Jeanblanc, M. (2003) Genuine mean-variance hedging of credit risk: A case study. Working paper.

Bielecki, T.R., Jeanblanc, M. and Rutkowski, M. (2004a) Modeling and valuation of credit risk. In: CIME-EMS Summer School on Stochastic Methods in Finance, Bressanone, July 6-12, 2003, Springer-Verlag, Berlin Heidelberg New York.

Bielecki, T.R. and Rutkowski, M. (2003) Dependent defaults and credit migrations. Applicationes Mathematicae 30, 121–145.

Bielecki, T.R. and Rutkowski, M. (2004) Credit Risk: Modeling, Valuation and Hedging. Corrected 2nd printing. Springer-Verlag, Berlin Heidelberg New York.

Bielecki, T.R., Jin, H., Pliska, S.R. and Zhou, X.Y. (2004b) Dynamic mean-variance portfolio selection with bankruptcy prohibition. Forthcoming in Mathematical Finance.


Black, F. and Cox, J.C. (1976) Valuing corporate securities: Some effects of bond indenture provisions. Journal of Finance 31, 351–367.

Blanchet-Scalliet, C. and Jeanblanc, M. (2004) Hazard rate for credit risk and hedging defaultable contingent claims. Finance and Stochastics 8, 145–159.

Bobrovnytska, O. and Schweizer, M. (2004) Mean-variance hedging and stochastic control: Beyond the Brownian setting. IEEE Transactions on Automatic Control 49, 396–408.

Buckdahn, R. (2001) Backward stochastic differential equations and viscosity solutions of semilinear parabolic deterministic and stochastic PDE of second order. In: Stochastic Processes and Related Topics, R. Buckdahn, H.-J. Engelbert and M. Yor, editors, Taylor and Francis, pp. 1–54.

Collin-Dufresne, P. and Hugonnier, J.-N. (2001) Event risk, contingent claims and the temporal resolution of uncertainty. Working paper, Carnegie Mellon University.

Collin-Dufresne, P. and Hugonnier, J.-N. (2002) On the pricing and hedging of contingent claims in the presence of extraneous risks. Preprint, Carnegie Mellon University.

Collin-Dufresne, P., Goldstein, R.S. and Hugonnier, J.-N. (2003) A general formula for valuing defaultable securities. Preprint.

Cossin, D. and Pirotte, H. (2000) Advanced Credit Risk Analysis. J. Wiley, Chichester.

Davis, M. (1997) Option pricing in incomplete markets. In: Mathematics of Derivative Securities, M.A.H. Dempster and S.R. Pliska, editors, Cambridge University Press, Cambridge, pp. 216–227.

Delbaen, F., Grandits, P., Rheinlander, Th., Samperi, D., Schweizer, M. and Stricker, Ch. (2002) Exponential hedging and entropic penalties. Mathematical Finance 12, 99–124.

Duffie, D. (2003) Dynamic Asset Pricing Theory. 3rd ed. Princeton University Press, Princeton.

Duffie, D. and Singleton, K. (2003) Credit Risk: Pricing, Measurement and Management. Princeton University Press, Princeton.

El Karoui, N. and Mazliak, L. (1997) Backward stochastic differential equations. Pitman Research Notes in Mathematics, Longman.

El Karoui, N., Peng, S. and Quenez, M.-C. (1997) Backward stochastic differential equations in finance. Mathematical Finance 7, 1–71.

El Karoui, N. and Quenez, M.-C. (1995) Dynamic programming and pricing of contingent claims in an incomplete market. SIAM Journal on Control and Optimization 33, 29–66.

El Karoui, N. and Quenez, M.-C. (1997) Imperfect markets and backward stochastic differential equations. In: Numerical Methods in Finance, L.C.G. Rogers, D. Talay, editors, Cambridge University Press, Cambridge, pp. 181–214.

Elliott, R.J., Jeanblanc, M. and Yor, M. (2000) On models of default risk. Mathematical Finance 10, 179–195.

Gourieroux, C., Laurent, J.-P. and Pham, N. (1998) Mean-variance hedging and numeraire. Mathematical Finance 8, 179–200.


Greenfield, Y. (2000) Hedging of credit risk embedded in derivative transactions. Thesis, Carnegie Mellon University.

Gregory, J. and Laurent, J.-P. (2003) I will survive. Risk, June, 103–107.

Hodges, S.D. and Neuberger, A. (1989) Optimal replication of contingent claim under transaction costs. Review of Futures Markets 8, 222–239.

Hu, Y. and Zhou, X.Y. (2004) Constrained stochastic LQ control with random coefficients and application to mean-variance portfolio selection. Preprint.

Hugonnier, J.-N., Kramkov, D. and Schachermayer, W. (2002) On the utility based pricing of contingent claims in incomplete markets. Preprint.

Jamshidian, F. (2002) Valuation of credit default swap and swaptions. To appear in Finance and Stochastics.

Jankunas, A. (2001) Optimal contingent claims. Annals of Applied Probability 11, 735–749.

Jarrow, R.A. and Yu, F. (2001) Counterparty risk and the pricing of defaultable securities. Journal of Finance 56, 1756–1799.

Jeanblanc, M. and Rutkowski, M. (2000) Modelling of default risk: Mathematical tools. Working paper, Universite d’Evry and Politechnika Warszawska.

Jeanblanc, M. and Rutkowski, M. (2002) Default risk and hazard processes. In: Mathematical Finance – Bachelier Congress 2000, H. Geman, D. Madan, S.R. Pliska and T. Vorst, editors, Springer-Verlag, Berlin Heidelberg New York, pp. 281–312.

Jeanblanc, M. and Rutkowski, M. (2003) Modelling and hedging of default risk. In: Credit Derivatives: The Definitive Guide, J. Gregory, editor, Risk Books, London, pp. 385–416.

Jeanblanc, M., Yor, M. and Chesney, M. (2004) Mathematical Methods for Financial Markets. Springer-Verlag. Forthcoming.

Karatzas, I. and Shreve, S.E. (1998) Brownian Motion and Stochastic Calculus. 2nd ed. Springer-Verlag, Berlin Heidelberg New York.

Karatzas, I. and Shreve, S.E. (1998) Methods of Mathematical Finance. Springer-Verlag, Berlin Heidelberg New York.

Kobylanski, M. (2000) Backward stochastic differential equations and partial differential equations with quadratic growth. Annals of Probability 28, 558–602.

Kohlmann, M. and Zhou, X.Y. (2000) Relationship between backward stochastic differential equations and stochastic controls: A linear-quadratic approach. SIAM Journal on Control and Optimization 38, 1392–1407.

Kramkov, D. (1996) Optional decomposition of supermartingales and hedging contingent claims in incomplete security markets. Probability Theory and Related Fields 105, 459–479.

Kusuoka, S. (1999) A remark on default risk models. Advances in Mathematical Economics 1, 69–82.

Lando, D. (1998) On Cox processes and credit-risky securities. Review of Derivatives Research 2, 99–120.

Last, G. and Brandt, A. (1995) Marked Point Processes on the Real Line: The Dynamic Approach. Springer-Verlag, Berlin Heidelberg New York.


Laurent, J.-P. and Gregory, J. (2002) Basket defaults swaps, CDOs and factor copulas. Working paper.

Lepeltier, J.-P. and San-Martin, J. (1997) Existence for BSDE with superlinear-quadratic coefficient. Stochastics and Stochastics Reports 63, 227–240.

Li, X., Zhou, X.Y. and Lim, A.E.B. (2001) Dynamic mean-variance portfolio selection with no-shorting constraints. SIAM Journal on Control and Optimization 40, 1540–1555.

Lim, A.E.B. (2004) Hedging default risk. Preprint.

Lim, A.E.B. and Zhou, X.Y. (2002) Mean-variance portfolio selection with random parameters. Mathematics of Operations Research 27, 101–120.

Lotz, C. (1998) Locally minimizing the credit risk. Working paper, University of Bonn.

Lukas, S. (2001) On pricing and hedging defaultable contingent claims. Thesis, Humboldt University.

Ma, J. and Yong, Y. (1999) Forward-Backward Stochastic Differential Equations and Their Applications. Springer-Verlag, Berlin Heidelberg New York.

Mania, M. (2000) A general problem of an optimal equivalent change of measure and contingent claim pricing in an incomplete market. Stochastic Processes and their Applications 90, 19–42.

Mania, M. and Tevzadze, R. (2003) Backward stochastic PDE and imperfect hedging. International Journal of Theoretical and Applied Finance 6, 663–692.

Merton, R. (1974) On the pricing of corporate debt: The risk structure of interest rates. Journal of Finance 29, 449–470.

Musiela, M. and Rutkowski, M. (1997) Martingale Methods in Financial Modelling. Springer-Verlag, Berlin Heidelberg New York.

Musiela, M. and Zariphopoulou, T. (2004) An example of indifference prices under exponential preferences. Finance and Stochastics 8, 229–239.

Pham, N., Rheinlander, Th. and Schweizer, M. (1998) Mean-variance hedging for continuous processes: New results and examples. Finance and Stochastics 2, 173–98.

Pliska, S.R. (2001) Dynamic Markowitz problems with no bankruptcy. Presentation at JAFEE Conference, 2001.

Protter, P. (2003) Stochastic Integration and Differential Equations. 2nd ed. Springer-Verlag, Berlin Heidelberg New York.

Rheinlander, Th. (1999) Optimal martingale measures and their applications in mathematical finance. Ph.D. Thesis, TU-Berlin.

Rheinlander, Th. and Schweizer, M. (1997) On L2-projections on a space of stochastic integrals. Annals of Probability 25, 1810–1831.

Rong, S. (1997) On solutions of backward stochastic differential equations with jumps and applications. Stochastic Processes and their Applications 66, 209–236.

Rouge, R. and El Karoui, N. (2000) Pricing via utility maximization and entropy. Mathematical Finance 10, 259–276.

Royer, M. (2003) Equations differentielles stochastiques retrogrades et martingales non-lineaires. Doctoral thesis.


Schonbucher, P.J. (1998) Pricing credit risk derivatives. Working paper, University of Bonn.

Schonbucher, P.J. (2003) Credit Derivatives Pricing Models. J. Wiley, Chichester.

Schweizer, M. (2001) A guided tour through quadratic hedging approaches. In: Option Pricing, Interest Rates and Risk Management, E. Jouini, J. Cvitanic and M. Musiela, editors, Cambridge University Press, pp. 538–574.

Vaillant, N. (2001) A beginner’s guide to credit derivatives. Working paper, Nomura International.

Wong, D. (1998) A unifying credit model. Working paper, Research Advisory Services, Capital Markets Group, Scotia Capital Markets.

Yong, J. and Zhou, X.Y. (1999) Stochastic Controls: Hamiltonian Systems and HJB Equations. Springer-Verlag, Berlin Heidelberg New York.

Zhou, X.Y. (2003) Markowitz’s world in continuous time, and beyond. In: Stochastic Modeling and Optimization, D.D. Yao et al., editors, Springer, New York, pp. 279–310.

Zhou, X.Y. and Li, D. (2000) Continuous time mean-variance portfolio selection: A stochastic LQ framework. Applied Mathematics and Optimization 42, 19–33.

Zhou, X.Y. and Yin, G. (2004) Dynamic mean-variance portfolio selection with regime switching: A continuous-time model. IEEE Transactions on Automatic Control 49, 349–360.


On the Geometry of Interest Rate Models

Tomas Bjork

Department of Finance
Stockholm School of Economics
Box 6501
S-113 83 Stockholm, SWEDEN
email: [email protected]

Summary. In this chapter, which is a substantial extension of an earlier essay [3], we give an overview of some recent work on the geometric properties of the evolution of the forward rate curve in an arbitrage free bond market. The main problems to be discussed are as follows.

• When is a given forward rate model consistent with a given family of forward rate curves?
• When can the inherently infinite dimensional forward rate process be realized by means of a Markovian finite dimensional state space model?

We consider interest rate models of Heath-Jarrow-Morton type, where the forward rates are driven by a multidimensional Wiener process, and where the volatility is allowed to be an arbitrary smooth functional of the present forward rate curve. Within this framework we give necessary and sufficient conditions for consistency, as well as for the existence of a finite dimensional realization, in terms of the forward rate volatilities. We also study stochastic volatility HJM models, and we provide a systematic method for the construction of concrete realizations.

Key words: HJM models, stochastic volatility, factor models, forward rates, state space mod-els, Markovian realizations, infinite dimensional stochastic differential equations, invariantmanifolds, geometry.MSC 2000 subject classification. 91B28, 91B70

Acknowledgements: support from the Jan Wallander and Tom Hedelius Foundation is gratefully acknowledged. I am grateful to D. Filipović, J. Teichmann and J. Zabczyk for a number of very helpful discussions. A number of valuable comments from unknown referees on the underlying papers helped to improve these considerably. I am also highly indebted to B.J. Christensen, C. Landén and L. Svensson for their generosity in letting me use our joint results for this overview.

T.R. Bielecki et al.: LNM 1847, R.A. Carmona et al. (Eds.), pp. 133–215, 2004. © Springer-Verlag Berlin Heidelberg 2004


1 Introduction

We start by presenting the probabilistic framework and formulating the main problems to be studied.

1.1 Setup

We consider a bond market model (see [4], [33]) living on a filtered probability space $(\Omega, \mathcal{F}, \mathbb{F}, Q)$ where $\mathbb{F} = \{\mathcal{F}_t\}_{t \ge 0}$. The basis is assumed to carry a standard $m$-dimensional Wiener process $W$, and we also assume that the filtration $\mathbb{F}$ is the internal one generated by $W$.

By $p_t(x)$ we denote the price, at $t$, of a zero coupon bond maturing at $t + x$, and the forward rates $r_t(x)$ are defined by

$$r_t(x) = -\frac{\partial \log p_t(x)}{\partial x}.$$
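As a small numerical illustration of this definition, the sketch below recovers a forward rate from a discount curve by a central difference, and conversely recovers a bond price from a forward rate curve by integration. The discount curve and the function names are hypothetical choices made only for this example; they do not come from the text.

```python
import math

def forward_rate(log_price, x, h=1e-5):
    """Central-difference approximation of r(x) = -d/dx log p(x)."""
    return -(log_price(x + h) - log_price(x - h)) / (2.0 * h)

# Hypothetical discount curve: p(x) = exp(-(0.03 x + 0.001 x^2)),
# whose exact forward rate curve is r(x) = 0.03 + 0.002 x.
def log_p(x):
    return -(0.03 * x + 0.001 * x * x)

def bond_price_from_forwards(r, x, n=2000):
    """Recover p(x) = exp(-int_0^x r(s) ds) with the midpoint rule."""
    step = x / n
    integral = sum(r((k + 0.5) * step) for k in range(n)) * step
    return math.exp(-integral)

r_at_5 = forward_rate(log_p, 5.0)
p_at_5 = bond_price_from_forwards(lambda s: 0.03 + 0.002 * s, 5.0)
```

Here `x` is time to maturity, in line with the Musiela parameterization used throughout the chapter.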

Note that we use the Musiela parameterization, where $x$ denotes the time to maturity. The short rate $R$ is defined as $R_t = r_t(0)$, and the money account $B$ is given by $B_t = \exp\left\{\int_0^t R_s\,ds\right\}$. The model is assumed to be free of arbitrage in the sense that the measure $Q$ above is a martingale measure for the model. In other words, for every fixed time of maturity $T \ge 0$, the process $Z_t(T) = p_t(T - t)/B_t$ is a $Q$-martingale.

Let us now consider a given forward rate model of the form

$$dr_t(x) = \beta_t(x)\,dt + \sigma_t(x)\,dW_t, \qquad r_0(x) = r_0^o(x), \tag{1}$$

where, for each $x$, $\beta(x)$ and $\sigma(x)$ are given optional processes. The initial curve $\{r_0^o(x);\ x \ge 0\}$ is taken as given. It is interpreted as the observed forward rate curve.

The standard Heath-Jarrow-Morton drift condition ([26]) can easily be transferred to the Musiela parameterization. The result (see [10], [32]) is as follows.

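Proposition 1.1 (The Forward Rate Equation) Under the martingale measure $Q$ the $r$-dynamics are given by

$$dr_t(x) = \left\{\frac{\partial}{\partial x} r_t(x) + \sigma_t(x)\int_0^x \sigma_t(u)^{\top}\,du\right\}dt + \sigma_t(x)\,dW_t, \tag{2}$$

$$r_0(x) = r_0^o(x), \tag{3}$$

where $^{\top}$ denotes transpose.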


1.2 The Geometric Point of View

At first glance it is natural to view the forward rate equation (2) as an infinite system of stochastic differential equations (SDEs in what follows): we have one equation for each fixed $x$, so we are handling a continuum of SDEs.

An alternative, more geometrically oriented, view of (2) is to regard it as a single equation for an infinite dimensional object. The infinite dimensional object under study is the forward rate curve, i.e. the curve $x \mapsto r_t(x)$. Denoting the forward rate curve at time $t$ by $r_t$, and the entire forward rate curve process by $r$, we thus take the point of view that $r$ is a process evolving in some infinite dimensional function space $\mathcal{H}$ of forward rate curves. For a fixed $t$ we will thus view each outcome of $r_t$ as a vector (or point) in $\mathcal{H}$. In the same way, the volatility process $\sigma = [\sigma_1, \ldots, \sigma_m]$ is viewed as a process evolving in $\mathcal{H}^m$, so that for each $t$ the outcome of $\sigma_t = [\sigma_{1t}, \ldots, \sigma_{mt}]$ is regarded as a point in $\mathcal{H}^m$.

In order to avoid detailed technical discussions at this preliminary level we postpone the precise definition of $\mathcal{H}$, as well as the necessary technical conditions on $\sigma$, until Section 3.3. For the time being the reader is asked to think (loosely) of $\mathcal{H}$ as a space of $C^\infty$ functions and to assume that all SDEs appearing in the text do admit unique strong solutions.

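In order to emphasize the geometric point of view, we can now rewrite the forward rate equation (2) as

$$dr_t = \left\{\mathbf{F}r_t + \sigma_t \mathbf{H}\sigma_t^{\top}\right\}dt + \sigma_t\,dW_t, \tag{4}$$

$$r_0 = r_0^o, \tag{5}$$

where the operators $\mathbf{F}$ and $\mathbf{H}$, both defined on $\mathcal{H}$, are given by

$$\mathbf{F} = \frac{\partial}{\partial x}, \tag{6}$$

$$[\mathbf{H}f](x) = \int_0^x f(s)\,ds, \quad \text{for } f \in \mathcal{H}, \tag{7}$$

and where we use the obvious interpretation $\mathbf{H}\sigma_t = [\mathbf{H}\sigma_{1t}, \ldots, \mathbf{H}\sigma_{mt}]$.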

1.3 Main Problems

Suppose now that we are given a concrete model $\mathcal{M}$ within the above framework, i.e. suppose that we are given a concrete specification of the volatility process $\sigma$. We now formulate a couple of natural problems:

1. Take, in addition to $\mathcal{M}$, also as given a parameterized family $\mathcal{G}$ of forward rate curves. Under which conditions is the family $\mathcal{G}$ consistent with the dynamics of $\mathcal{M}$? Here consistency is interpreted in the sense that, given an initial forward rate curve in $\mathcal{G}$, the interest rate model $\mathcal{M}$ will only produce forward rate curves belonging to the given family $\mathcal{G}$.


2. When does the given, inherently infinite dimensional, interest rate model $\mathcal{M}$ admit a finite dimensional realization? More precisely, we seek conditions under which the forward rate process $r_t(x)$ induced by the model $\mathcal{M}$ can be realized by a system of the form

$$dZ_t = a(Z_t)\,dt + b(Z_t)\,dW_t, \tag{8}$$

$$r_t(x) = G(Z_t, x), \tag{9}$$

where $Z$ (interpreted as the state vector process) is a finite dimensional diffusion, $a(z)$, $b(z)$ and $G(z, x)$ are deterministic functions and $W$ is the same Wiener process as in (2). Expressed in other terms, we thus wish to investigate under what conditions the HJM model is generated by a finite dimensional Markovian state space model.

As will be seen below, these two problems are intimately connected, and the main purpose of this chapter, which is a substantial extension of a previous paper [3], is to give an overview of some recent work in this area. The text is based on [6], [5], [9], [7], and [8], but the presentation given below is more focused on geometric intuition than the original articles, where full proofs, technical details and further results can be found. In the analysis below we use ideas from systems and control theory (see [30]) as well as from nonlinear filtering theory (see [12]). References to the literature will sometimes be given in the text, but will also be summarized in the Notes at the end of each section.

It should be noted that the functional analytical framework of the entire theory has recently been improved in a quite remarkable way by Filipović and Teichmann. In a series of papers these authors have considerably extended the Hilbert space framework of the papers mentioned above. In doing so, they have also clarified many structural problems and derived a large number of concrete results. However, a full understanding of these extensions requires a high degree of detailed technical knowledge in analysis on Fréchet spaces, so the scope of the present chapter prohibits us from doing complete justice to this beautiful part of the theory. The interested reader is referred to the original papers [23], [24], and [25].

The organization of the text is as follows. In Section 2 we treat the relatively simple case of linear realizations. Section 3 is devoted to a study of the general consistency problem, including a primer on the Filipović inverse consistency theory, and in Section 4 we use the consistency results from Section 3 in order to give a fairly complete picture of the nonlinear realization problem. The problem of actually constructing a concrete realization is treated in Section 5, in Section 6 we discuss very briefly the Filipović–Teichmann extensions, and in Section 7 we extend the theory to include stochastic volatility models.


2 A Primer on Linear Realization Theory

In the general case, the forward rate equation (4) is a highly nonlinear infinite dimensional SDE but, as can be expected, the special case of linear dynamics is much easier to handle. In this section we therefore concentrate on linear forward rate models, and look for finite dimensional linear realizations. Almost all geometric ideas presented in this chapter will then be generalized to the nonlinear case studied later in the text.

2.1 Deterministic Forward Rate Volatilities

For the rest of the section we only consider the case when the volatility process $\sigma_t(x) = [\sigma_{1t}(x), \ldots, \sigma_{mt}(x)]$ is a deterministic function $\sigma(x)$ of $x$ only.

Assumption 2.1 Each component $\sigma_i$ of the volatility process $\sigma$ is a deterministic vector in $\mathcal{H}$ for $i = 1, \ldots, m$. Equivalently, the volatility $\sigma$ is a $C^\infty$-mapping $\sigma : \mathbb{R}_+ \to \mathbb{R}^m$.

Under this assumption, the forward rate equation (4) takes the form

$$dr_t = \{\mathbf{F}r_t + D\}\,dt + \sigma\,dW_t, \tag{10}$$

where the function $D$, which we view as a vector $D \in \mathcal{H}$, is defined by

$$D(x) = \sigma(x)\int_0^x \sigma(s)^{\top}\,ds. \tag{11}$$

The point to note here is that, because of our choice of a deterministic volatility $\sigma(x)$, the forward rate equation (10) is a linear (or rather affine) SDE evolving in the infinite dimensional function space $\mathcal{H}$.

Because of the linear structure of the equation (albeit in infinite dimensions) we expect to be able to provide an explicit solution of (10). We now recall that a scalar equation of the form

$$dy_t = [ay_t + b]\,dt + c\,dW_t$$

has the solution

$$y_t = e^{at}y_0 + \int_0^t e^{a(t-s)}b\,ds + \int_0^t e^{a(t-s)}c\,dW_s,$$

and we are thus led to conjecture that the solution to (10) is given by the formal expression

$$r_t = e^{\mathbf{F}t}r_0 + \int_0^t e^{\mathbf{F}(t-s)}D\,ds + \int_0^t e^{\mathbf{F}(t-s)}\sigma\,dW_s.$$


We now have to make precise mathematical sense of the formal exponent $e^{\mathbf{F}t}$. From the context it is clear that it acts on vectors in $\mathcal{H}$, i.e. on real valued $C^\infty$ functions. It is in fact an operator $e^{\mathbf{F}t} : \mathcal{H} \to \mathcal{H}$ and we have to figure out how it acts. From the usual series expansion of the exponential function one is led to write

$$e^{\mathbf{F}t}f = \sum_{n=0}^{\infty}\frac{t^n}{n!}\mathbf{F}^n f. \tag{12}$$

In our case $\mathbf{F}^n = \partial^n/\partial x^n$, so we have

$$\left[e^{\mathbf{F}t}f\right](x) = \sum_{n=0}^{\infty}\frac{t^n}{n!}\frac{\partial^n f}{\partial x^n}(x). \tag{13}$$

This is, however, just a Taylor series expansion of $f$ around the point $x$, so for analytic $f$ we have $\left[e^{\mathbf{F}t}f\right](x) = f(x + t)$. We have in fact the following precise result (which can be proved rigorously).

Proposition 2.1 The operator $\mathbf{F}$ is the infinitesimal generator of the semigroup of left translations, i.e. for any $f \in \mathcal{H}$ (and in fact for any continuous $f$) we have

$$\left[e^{\mathbf{F}t}f\right](x) = f(t + x). \tag{14}$$

Furthermore, the solution of the forward rate equation (10) is given by

$$r_t = e^{\mathbf{F}t}r^o + \int_0^t e^{\mathbf{F}(t-s)}D\,ds + \int_0^t e^{\mathbf{F}(t-s)}\sigma\,dW_s \tag{15}$$

or, written in component form, by

$$r_t(x) = r^o(x + t) + \int_0^t D(x + t - s)\,ds + \int_0^t \sigma(x + t - s)\,dW_s. \tag{16}$$
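The identity (14) can be checked numerically for a simple analytic function. The sketch below evaluates a partial sum of the series (13) for $f(x) = e^{-ax}$ and compares it with the translated value $f(x + t)$. The decay rate `a`, the evaluation points and the helper names are assumptions made for this illustration only.

```python
import math

def exp_Ft_partial_sum(derivative, x, t, n_terms):
    """Partial sum of (13): sum over n < n_terms of t^n / n! * f^(n)(x)."""
    return sum(t ** n / math.factorial(n) * derivative(n, x)
               for n in range(n_terms))

a = 0.5  # hypothetical decay rate

def d_exp(n, x):
    """n-th derivative of f(x) = exp(-a x)."""
    return (-a) ** n * math.exp(-a * x)

x, t = 1.0, 2.0
approx = exp_Ft_partial_sum(d_exp, x, t, 30)
exact = math.exp(-a * (x + t))   # f(x + t), as in (14)
```

With thirty terms the truncated series and the shifted function agree to roughly machine precision for these inputs.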

From (15) it is clear by inspection that we may write the solution of the forward rate equation (10) as

$$r_t = q_t + \delta_t, \tag{17}$$

$$dq_t = \mathbf{F}q_t\,dt + \sigma\,dW_t, \tag{18}$$

$$q_0 = 0, \tag{19}$$

where $\delta$ is given by

$$\delta_t = e^{\mathbf{F}t}r^o + \int_0^t e^{\mathbf{F}(t-s)}D\,ds, \tag{20}$$

or, in component form,

$$\delta_t(x) = r^o(x + t) + \int_0^t D(x + t - s)\,ds. \tag{21}$$


Since $\delta_t(x)$ is not affected by the input $W$, we see that the problem of finding a realization for the term structure system (10) is equivalent to that of finding a realization for (18)-(19). Since we have a linear dynamical system it seems natural to look for linear realizations and we are thus led to the following definition.

Definition 2.1 A triple $[A, B, C(x)]$, where $A$ is an $(n \times n)$-matrix, $B$ is an $(n \times m)$-matrix and $C$ is an $n$-dimensional row vector function, is called an $n$-dimensional realization of the system (18) if $q$ has the representation

$$q_t(x) = C(x)Z_t, \tag{22}$$

$$dZ_t = AZ_t\,dt + B\,dW_t, \tag{23}$$

$$Z_0 = 0. \tag{24}$$

Our main problems are now as follows.

• Consider an a priori given volatility structure σ(x).

• When does there exist a finite dimensional realization?

• If there exists a finite dimensional realization, what is the minimal dimension?

• How do we construct a minimal realization from knowledge of σ?

2.2 Finite Dimensional Realizations

In this section we will investigate the existence of a finite dimensional realization (FDR) from a geometric point of view. There are also a number of other ways to attack the problem, but it is in fact the geometrical point of view which later in the text will be generalized to the nonlinear case. The discussion will be rather informal and some technical questions are sidestepped.

We recall the $q$-equation (18) as

$$dq_t = \mathbf{F}q_t\,dt + \sigma\,dW_t, \qquad q_0 = 0. \tag{25}$$

Expressing the operator exponential $e^{\mathbf{F}(t-s)}$ as a power series, and using Proposition 2.1, we may write the solution to (25) as

$$q_t = \int_0^t \sum_{n=0}^{\infty}\frac{(t - s)^n}{n!}\mathbf{F}^n\sigma\,dW_s. \tag{26}$$

From this expression we see that, for each $t$, the random vector $q_t$ is in fact given as a random infinite linear combination of the (deterministic) vectors $\sigma, \mathbf{F}\sigma, \mathbf{F}^2\sigma, \ldots$.


Thus we see that the $q$-process will in fact evolve in the (deterministic) subspace $\mathcal{R} \subseteq \mathcal{H}$ defined by

$$\mathcal{R} = \mathrm{span}\left[\sigma, \mathbf{F}\sigma, \mathbf{F}^2\sigma, \ldots\right]. \tag{27}$$

The subspace $\mathcal{R}$ is thus invariant under the action of the $q$ process, and it is rather obvious that it is in fact the minimal (under inclusion) invariant subspace for $q$.
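For volatilities of the special form $\sigma(x) = p(x)e^{-ax}$ with $p$ a polynomial, the dimension of the span in (27) can be computed mechanically, since $\mathbf{F}$ maps $p(x)e^{-ax}$ to $(p'(x) - ap(x))e^{-ax}$. The following sketch, which works with polynomial coefficient vectors and exact rational arithmetic, is only an illustration under that assumption; the function names and the choice $a = 1/2$ are hypothetical.

```python
from fractions import Fraction

def apply_F(coeffs, a):
    """Coefficients of p' - a*p, i.e. d/dx applied to p(x) * exp(-a x)."""
    n = len(coeffs)
    deriv = [(k + 1) * coeffs[k + 1] for k in range(n - 1)] + [Fraction(0)]
    return [deriv[k] - a * coeffs[k] for k in range(n)]

def rank(rows):
    """Rank of a list of rational vectors by Gaussian elimination."""
    rows = [list(r) for r in rows]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            factor = rows[i][col] / rows[rk][col]
            rows[i] = [u - factor * v for u, v in zip(rows[i], rows[rk])]
        rk += 1
    return rk

def dim_R(coeffs, a, n_iter=6):
    """Dimension of span{F^k sigma : k = 0, ..., n_iter - 1}."""
    vec = [Fraction(c) for c in coeffs]
    a = Fraction(a)
    krylov = []
    for _ in range(n_iter):
        krylov.append(vec)
        vec = apply_F(vec, a)
    return rank(krylov)

dim_exp = dim_R([1, 0, 0], a=Fraction(1, 2))      # sigma(x) = exp(-x/2)
dim_x_exp = dim_R([0, 1, 0], a=Fraction(1, 2))    # sigma(x) = x exp(-x/2)
dim_x2_exp = dim_R([0, 0, 1], a=Fraction(1, 2))   # sigma(x) = x^2 exp(-x/2)
```

Because the polynomial degree does not grow under $\mathbf{F}$ in this family, a fixed-length coefficient vector suffices and the rank stabilizes after finitely many iterations.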

The obvious conjecture is that there exists an FDR if and only if $\mathcal{R}$ is finite dimensional. This conjecture is in fact correct and we have the following main result.

Proposition 2.2 Consider a given volatility function

$$\sigma = [\sigma_1, \cdots, \sigma_m].$$

Then there exists an FDR if and only if

$$\dim(\mathcal{R}) < \infty, \tag{28}$$

with $\mathcal{R}$ defined by

$$\mathcal{R} = \mathrm{span}\left[\mathbf{F}^k\sigma_i;\ i = 1, \cdots, m;\ k = 0, 1, \cdots\right]. \tag{29}$$

Furthermore, the minimal dimension of an FDR, also known as the McMillan degree, is given by $\dim(\mathcal{R})$.

Proof. For brevity we only give the proof for the case $m = 1$. The proof for the general case is almost identical. Assume first that there exists an $n$-dimensional FDR of the form

$$q_t(x) = C(x)Z_t, \tag{30}$$

with $Z$-dynamics as in (23)-(24), and with

$$C(x) = [C_1(x), \ldots, C_n(x)].$$

Writing this as

$$q_t = CZ_t, \tag{31}$$

it is now obvious that the finite dimensional subspace of $\mathcal{H}$ spanned by the vectors $C_1, \ldots, C_n$ will in fact be invariant and thus contain $\mathcal{R}$ (since $\mathcal{R}$ is minimal invariant). Thus the existence of an FDR implies the finite dimensionality of $\mathcal{R}$.

Conversely, assume that $\mathcal{R}$ is $n$-dimensional. We now prove the existence of an FDR by actually constructing an explicit realization of the form (22)-(24). The finite dimensionality of $\mathcal{R}$ implies that (with $n$ as above) there exists a linear relation of the form

$$\mathbf{F}^n\sigma = \sum_{i=0}^{n-1}\gamma_i\mathbf{F}^i\sigma, \tag{32}$$

where $\gamma_0, \ldots, \gamma_{n-1}$ are real numbers. Thus, the vectors $\sigma, \mathbf{F}\sigma, \ldots, \mathbf{F}^{n-1}\sigma$ are linearly independent and span $\mathcal{R}$.


Since $\sigma, \mathbf{F}\sigma, \ldots, \mathbf{F}^{n-1}\sigma$ is a basis for the invariant subspace $\mathcal{R}$ we can write

$$q_t = \sum_{i=0}^{n-1} Z_t^i\,\mathbf{F}^i\sigma, \tag{33}$$

where the processes $Z^0, \ldots, Z^{n-1}$ are the coordinate processes of $q$ for the given basis. We now want to find the dynamics of $Z = \left[Z^0, \ldots, Z^{n-1}\right]^{\top}$ so we make the Ansatz

$$dZ_t^i = \alpha_i(Z_t)\,dt + \beta_i(Z_t)\,dW_t, \quad i = 0, \ldots, n - 1, \tag{34}$$

and the problem is to identify the unknown functions $\alpha$ and $\beta$.

Our strategy for finding α and β is as follows.

• Compute dq from (33)-(34).

• Compare the expression thus obtained with the original $q$ dynamics given by

$$dq_t = \mathbf{F}q_t\,dt + \sigma\,dW_t. \tag{35}$$

• Identify the coefficients.

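From (33)-(34) and using the notation

$$\sigma^{(i)} = \mathbf{F}^i\sigma,$$

we obtain

$$dq_t = \left(\sum_{i=0}^{n-1}\alpha_i(Z_t)\sigma^{(i)}\right)dt + \left(\sum_{i=0}^{n-1}\beta_i(Z_t)\sigma^{(i)}\right)dW_t. \tag{36}$$

We now want to compare this expression with the dynamics in (35). To do this we first use (33) to obtain

$$\mathbf{F}q_t = \sum_{i=0}^{n-1} Z_t^i\,\mathbf{F}^{i+1}\sigma = \sum_{i=0}^{n-1} Z_t^i\,\sigma^{(i+1)} = \sum_{i=1}^{n} Z_t^{i-1}\sigma^{(i)}. \tag{37}$$

After inserting (32) into (37) and collecting terms we have

$$\mathbf{F}q_t = \sum_{i=0}^{n-1} c_i(Z_t)\sigma^{(i)}, \tag{38}$$

where

$$c_i(Z) = Z^{i-1} + \gamma_i Z^{n-1}, \quad i = 0, \ldots, n - 1, \tag{39}$$

with the convention $Z^{-1} = 0$. We may thus write the $q$ dynamics in (35) as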


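$$dq_t = \left(\sum_{i=0}^{n-1} c_i(Z_t)\sigma^{(i)}\right)dt + \sigma^{(0)}\,dW_t. \tag{40}$$

We may now identify coefficients by comparing (36) with (40) to obtain

$$\beta_0(Z) = 1, \qquad \beta_i(Z) = 0, \quad i = 1, \ldots, n - 1,$$

and

$$\alpha_i(Z) = c_i(Z), \quad i = 0, \ldots, n - 1, \tag{41}$$

with $c_i$ as in (39). We have thus derived the explicit realization

$$q_t = CZ_t, \tag{42}$$

where

$$C = \left[\sigma, \mathbf{F}\sigma, \ldots, \mathbf{F}^{n-1}\sigma\right], \tag{43}$$

and where the $Z$ dynamics are given by

$$dZ_t^0 = \gamma_0 Z_t^{n-1}\,dt + dW_t, \tag{44}$$

$$dZ_t^i = \left(Z_t^{i-1} + \gamma_i Z_t^{n-1}\right)dt, \quad i = 1, \ldots, n - 1. \tag{45}$$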

We note in passing that the proof above, apart from proving the existence result, also provides us with the concrete realization (43)-(45). In the proof this is only done for the case of a scalar Wiener process, but the method can easily be extended to the multi-dimensional case. See [7] for worked out examples.
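The construction in the proof can be sanity-checked on a two-dimensional case. By (16) the process $q$ has kernel $\sigma(x + t - s)$, while a linear realization (22)-(24) has kernel $C(x)e^{A(t-s)}B$, so it suffices to check that $C(x)e^{At}B = \sigma(x + t)$. The sketch below does this for the hypothetical choice $\sigma(x) = xe^{-ax}$, for which $\mathbf{F}^2\sigma = -a^2\sigma - 2a\mathbf{F}\sigma$, with the drift matrix read off from the coefficients $c_i$ in (39). The value of `a`, the test points and the helper names are assumptions of this example, and the matrix exponential is a plain truncated power series.

```python
import math

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_exp(A, t, n_terms=40):
    """Truncated power series for exp(A t)."""
    n = len(A)
    term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [row[:] for row in term]
    for k in range(1, n_terms):
        scaled = [[A[i][j] * t / k for j in range(n)] for i in range(n)]
        term = mat_mul(term, scaled)
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

a = 0.7                                               # hypothetical decay rate
sigma = lambda x: x * math.exp(-a * x)                # sigma
F_sigma = lambda x: (1.0 - a * x) * math.exp(-a * x)  # F sigma
gamma0, gamma1 = -a * a, -2.0 * a   # F^2 sigma = gamma0*sigma + gamma1*F sigma

# Drift matrix of Z from alpha_i = c_i(Z) = Z^{i-1} + gamma_i Z^{n-1}, n = 2.
A = [[0.0, gamma0],
     [1.0, gamma1]]

def impulse_response(x, t):
    """C(x) exp(A t) B with C = [sigma, F sigma] and B = (1, 0)^T."""
    E = mat_exp(A, t)
    return sigma(x) * E[0][0] + F_sigma(x) * E[1][0]

max_err = max(abs(impulse_response(x, t) - sigma(x + t))
              for x in (0.0, 0.5, 2.0) for t in (0.1, 1.0, 3.0))
```

The check is purely deterministic: it compares kernels rather than simulated paths, which avoids discretization error from an SDE scheme.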

We now go on to find a more explicit characterization of condition (28). Recalling that the operator $\mathbf{F}$ is defined as $\mathbf{F} = \partial/\partial x$, we see from Proposition 2.2 that the forward rate system admits an FDR if and only if the space spanned by the components of $\sigma$ and all their derivatives is finite dimensional. In other words, there exists an FDR if and only if the components of $\sigma$ satisfy a linear system of ordinary differential equations (ODEs in what follows) with constant coefficients. This leads us to the topic of quasi-exponential functions.

Definition 2.2 A quasi-exponential (or QE) function is by definition any function of the form

$$f(x) = \sum_i e^{\lambda_i x} + \sum_j e^{\alpha_j x}\left[p_j(x)\cos(\omega_j x) + q_j(x)\sin(\omega_j x)\right], \tag{46}$$

where $\lambda_i, \alpha_j, \omega_j$ are real numbers, whereas $p_j$ and $q_j$ are real polynomials.

QE functions will turn up over and over again, so we list some simple well known properties.


Lemma 2.1 The following hold for the quasi-exponential functions.

• A function is QE if and only if it is a component of the solution of a vector valued linear ODE with constant coefficients.

• A function is QE if and only if it can be written as $f(x) = ce^{Ax}b$, where $c$ is a row vector, $A$ is a square matrix and $b$ is a column vector.

• If f is QE, then f ′ is QE.

• If f is QE, then its anti-derivative is QE.

• If f and g are QE, then fg is QE.

We can thus restate Proposition 2.2.

Proposition 2.3 The forward rate equation admits a finite dimensional realization if and only if each component of $\sigma$ is quasi-exponential.
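As a small worked illustration of the matrix representation in Lemma 2.1 (the particular function is chosen here only as an example), take $f(x) = xe^{-ax}$. With

$$c = (1, 0), \qquad A = \begin{pmatrix} -a & 1 \\ 0 & -a \end{pmatrix}, \qquad b = \begin{pmatrix} 0 \\ 1 \end{pmatrix},$$

we have

$$e^{Ax} = e^{-ax}\begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix}, \qquad ce^{Ax}b = xe^{-ax} = f(x),$$

and differentiating gives $f'(x) = cAe^{Ax}b = (1 - ax)e^{-ax}$, which is again of the same form, in line with the third item of the lemma.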

2.3 Economic Interpretation of the State Space

In general, the state space of a realization of a given system has no concrete (e.g. economic) interpretation. In the case of the forward rate equation, however, the states of the minimal realization turn out to have a simple economic interpretation.

Proposition 2.4 Assume that $\mathcal{R}$ is $n$-dimensional, so that the existence of an FDR is guaranteed. Then, for any minimal realization, i.e. a realization with an $n$-dimensional state vector $Z$, there will exist an affine transformation mapping the state vector into a vector of benchmark forward rates.

The moral of this is that, in a minimal realization, you can always choose your state variables as a fixed set of forward rates. It can also be shown that the maturities of the benchmark forward rates can be chosen without restrictions.

For precise statements, proofs and examples, see [6].

2.4 Connections to Systems and Control Theory

The geometric ideas of the previous section are in fact standard in the theory of mathematical systems and control. To see this, consider again the equation

$$dq_t = \mathbf{F}q_t\,dt + \sigma\,dW_t, \tag{47}$$

$$q_0 = 0. \tag{48}$$

Let us now formally “divide by dt”, which gives us


$$\frac{dq_t}{dt} = \mathbf{F}q_t + \sigma\frac{dW_t}{dt},$$

where the formal time derivative $dW_t/dt$ is interpreted as “white noise”. We interpret this equation as an input-output system where the random input signal $t \mapsto dW_t/dt$ is transformed into the infinite dimensional output signal $t \mapsto q_t$. We thus view the equation as a stochastic version of the following controlled ODE

$$\frac{dq_t}{dt} = \mathbf{F}q_t + \sigma u_t, \qquad q_0 = 0, \tag{49}$$

where $u$ is a deterministic input signal (which in our case is replaced by white noise). Generally speaking, tricks like this do not work directly, since we are ignoring the difference between standard differential calculus, which is used to analyze (49), and Itô calculus, which we use when dealing with SDEs. In this case, however, because of the linear structure, the second order Itô term will not come into play, so we are safe. (See the discussion in Section 3.4 around the Stratonovich integral for how to treat the nonlinear situation.)

The reader who is familiar with systems and control theory (see [11]) will now recognize the space $\mathcal{R}$ above as the reachable subspace of the control system (49). Not surprisingly, there is also a frequency domain approach to our realization problem. See [6] for details.

2.5 Examples

We now give some simple illustrations of the theory. We only consider the case of a scalar driving Wiener process.

Example 2.1 $\sigma(x) = \sigma e^{-ax}$

We consider a model driven by a one-dimensional Wiener process, having the forward rate volatility structure

$$\sigma(x) = \sigma e^{-ax},$$

where $\sigma$ in the right hand side denotes a constant. (The reader will probably recognize this example as the Hull-White model.) We start by determining $\mathcal{R}$ which in this case is defined as

$$\mathcal{R} = \mathrm{span}\left[\frac{d^k}{dx^k}\sigma e^{-ax};\ k \ge 0\right].$$

It is obvious that $\mathcal{R}$ is one-dimensional, so we can expect to find a one-dimensional realization. The existence of an FDR could of course also have been seen directly by observing that $\sigma$ is quasi-exponential. Since we have $\mathbf{F}\sigma = -a\sigma$ we see that, in the notation of (32), we have $\gamma_0 = -a$. Thus, denoting the single state variable $Z^0$ by $Z$, we may use (43)-(45) to obtain the realization


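$$q_t(x) = \sigma e^{-ax}Z_t, \tag{50}$$

$$dZ_t = -aZ_t\,dt + dW_t. \tag{51}$$

A full realization of the forward rate process $r_t$ is then obtained from (17), by evaluating the integral in (21) with $D$ as in (11), as

$$r_t(x) = \sigma e^{-ax}Z_t + r^o(x + t) + \frac{\sigma^2}{a^2}e^{-ax}\left\{1 - e^{-at}\right\} + \frac{1}{2}\frac{\sigma^2}{a^2}e^{-2ax}\left\{e^{-2at} - 1\right\}. \tag{52}$$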
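The deterministic part of this realization is the integral $\int_0^t D(x + t - s)\,ds$ from (21), with $D(x) = \sigma(x)\int_0^x\sigma(s)\,ds$ as in (11). For $\sigma(x) = \sigma e^{-ax}$ the integral can be done in closed form, and the sketch below compares that closed form with a direct midpoint-rule quadrature. The parameter values and function names are hypothetical and serve only as a numerical cross-check.

```python
import math

sigma0, a = 0.01, 0.1   # hypothetical volatility level and decay rate

def D(x):
    """D(x) = sigma(x) * int_0^x sigma(s) ds for sigma(x) = sigma0 * exp(-a x)."""
    return sigma0 ** 2 / a * math.exp(-a * x) * (1.0 - math.exp(-a * x))

def drift_integral_numeric(x, t, n=20000):
    """Midpoint rule for int_0^t D(x + t - s) ds."""
    h = t / n
    return sum(D(x + t - (k + 0.5) * h) for k in range(n)) * h

def drift_integral_closed(x, t):
    """Closed form obtained by integrating D term by term."""
    c = sigma0 ** 2 / a ** 2
    return (c * math.exp(-a * x) * (1.0 - math.exp(-a * t))
            + 0.5 * c * math.exp(-2 * a * x) * (math.exp(-2 * a * t) - 1.0))

x, t = 2.0, 5.0
numeric = drift_integral_numeric(x, t)
closed = drift_integral_closed(x, t)
```

The two terms of the closed form come from the two exponentials in $D(x) = (\sigma^2/a)\left(e^{-ax} - e^{-2ax}\right)$.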

Example 2.2 $\sigma(x) = e^{-x^2}$

In this example $\mathcal{R}$ is given by

$$\mathcal{R} = \mathrm{span}\left[\frac{d^k}{dx^k}e^{-x^2},\ k \ge 0\right] = \mathrm{span}\left[x^k e^{-x^2},\ k \ge 0\right],$$

which is easily seen to be infinite dimensional. Thus we see that in this case there exists no finite dimensional linear realization. We will return to this example later and we will in fact prove that there does not exist a non-linear FDR either.
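One way to see the infinite dimensionality concretely is to note that each derivative of $e^{-x^2}$ is a polynomial times $e^{-x^2}$, and that differentiation raises the degree of that polynomial by exactly one. The sketch below tracks the polynomial coefficients through a few derivatives; the function names are illustrative only.

```python
def next_derivative(p):
    """Given p with f = p(x) exp(-x^2), return q with f' = q(x) exp(-x^2)."""
    dp = [k * p[k] for k in range(1, len(p))]   # coefficients of p'
    minus_2xp = [0] + [-2 * c for c in p]       # coefficients of -2 x p
    dp += [0] * (len(minus_2xp) - len(dp))
    return [u + v for u, v in zip(dp, minus_2xp)]

def degree(p):
    return max(k for k, c in enumerate(p) if c != 0)

polys = [[1]]                  # sigma(x) = exp(-x^2)
for _ in range(5):
    polys.append(next_derivative(polys[-1]))
degrees = [degree(p) for p in polys]
```

Since the polynomial factors have pairwise distinct degrees, the first $N$ derivatives are linearly independent for every $N$, so no finite linear relation of the form (32) can hold.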

2.6 Notes

This section is to a large extent based on [6] where, however, the focus is more on the frequency domain approach. The first paper to appear in this area was to our knowledge the seminal preprint [32], where the Musiela parameterization is introduced and the space $\mathcal{R}$ is discussed in some detail. Because of the linear structure, the theory above is closely connected to (and in a sense inverse to) the theory of affine term structures developed in [17]. The standard reference on infinite dimensional SDEs is [16], where one also can find a presentation of the connections between control theory and infinite dimensional linear stochastic equations.

3 The Consistency Problem

We now turn to a more serious study of the geometric properties of the forward rate equation in the general nonlinear case. We begin by studying when a given submanifold of forward rate curves is consistent (a precise definition will be given below) with a given interest rate model. This problem is of interest from an applied as well as from a theoretical point of view. In particular we will use the results from this section to analyze problems about existence of finite dimensional factor realizations for interest rate models on forward rate form. Invariant manifolds are, however, also of interest in their own right, so we begin by discussing a concrete problem which naturally leads to the invariance concept.


3.1 Parameter Recalibration

A standard procedure when dealing with concrete interest rate models on a high frequency (say, daily) basis can be described as follows:

1. At time $t = 0$, use market data to fit (calibrate) the model to the observed bond prices.

2. Use the calibrated model to compute prices of various interest rate derivatives.

3. The following day ($t = 1$), repeat the procedure in step 1 above in order to recalibrate the model, etc.

To carry out the calibration in step 1 above, the analyst typically has to produce a forward rate curve $\{r^o(x);\ x \ge 0\}$ from the observed data. However, since only a finite number of bonds actually trade in the market, the data consist of a discrete set of points, and a need to fit a curve to these points arises. This curve-fitting may be done in a variety of ways. One way is to use splines, but also a number of parameterized families of smooth forward rate curves have become popular in applications, the most well-known probably being the Nelson-Siegel (see [34]) family. Once the curve $\{r^o(x);\ x \ge 0\}$ has been obtained, the parameters of the interest rate model may be calibrated to this.

Now, from a purely logical point of view, the recalibration procedure in step 3 above is of course slightly nonsensical: if the interest rate model at hand is an exact picture of reality, then there should be no need to recalibrate. The reason that everyone insists on recalibrating is of course that any model in fact only is an approximate picture of the financial market under consideration, and recalibration allows incorporating newly arrived information in the approximation. Even so, the calibration procedure itself ought to take into account that it will be repeated. It appears that the optimal way to do so would involve a combination of time series and cross-section data, as opposed to the purely cross-sectional curve-fitting, where the information contained in previous curves is discarded in each recalibration.

The cross-sectional fitting of a forward curve and the repeated recalibration is thus, in a sense, a pragmatic and somewhat non-theoretical endeavor. Nonetheless, there are some nontrivial theoretical problems to be dealt with in this context, and the problem to be studied in this section concerns the consistency between, on the one hand, the dynamics of a given interest rate model, and, on the other hand, the forward curve family employed.

What, then, is meant by consistency in this context? Assume that a given interest rate model $\mathcal{M}$ (e.g. the Hull–White model) in fact is an exact picture of the financial market. Now consider a particular family $\mathcal{G}$ of forward rate curves (e.g. the Nelson-Siegel family) and assume that the interest rate model is calibrated using this family. We then say that the pair $(\mathcal{M}, \mathcal{G})$ is consistent (or, that $\mathcal{M}$ and $\mathcal{G}$ are consistent) if all forward curves which may be produced by the interest rate model $\mathcal{M}$ are contained within the family $\mathcal{G}$. Otherwise, the pair $(\mathcal{M}, \mathcal{G})$ is inconsistent.


Thus, if $\mathcal{M}$ and $\mathcal{G}$ are consistent, then the interest rate model actually produces forward curves which belong to the relevant family. In contrast, if $\mathcal{M}$ and $\mathcal{G}$ are inconsistent, then the interest rate model will produce forward curves outside the family used in the calibration step, and this will force the analyst to change the model parameters all the time, not because the model is an approximation to reality, but simply because the family does not go well with the model.

Put into more operational terms this can be rephrased as follows.

• Suppose that you are using a fixed interest rate model $\mathcal{M}$. If you want to do recalibration, then your family $\mathcal{G}$ of forward rate curves should be chosen in such a way as to be consistent with the model $\mathcal{M}$.

Note however that the argument also can be run backwards, yielding the following conclusion for empirical work.

• Suppose that a particular forward curve family $\mathcal{G}$ has been observed to provide a good fit, on a day-to-day basis, in a particular bond market. Then this gives you modeling information about the choice of an interest rate model in the sense that you should try to use/construct an interest rate model which is consistent with the family $\mathcal{G}$.

We now have a number of natural problems to study.

I Given an interest rate model $\mathcal{M}$ and a family of forward curves $\mathcal{G}$, what are necessary and sufficient conditions for consistency?

II Take as given a specific family $\mathcal{G}$ of forward curves (e.g. the Nelson-Siegel family). Does there exist any interest rate model $\mathcal{M}$ which is consistent with $\mathcal{G}$?

III Take as given a specific interest rate model $\mathcal{M}$ (e.g. the Hull-White model). Does there exist any finitely parameterized family of forward curves $\mathcal{G}$ which is consistent with $\mathcal{M}$?

In this section we will mainly address Problem I above. Problem II has been studied, for special cases, in [19], [20], whereas Problem III can be shown (see Proposition 4.1) to be equivalent to the problem of finding a finite dimensional factor realization of the model $\mathcal{M}$ and we provide a fairly complete solution in Section 4.

3.2 Invariant Manifolds

We now move on to give a precise mathematical definition of the consistency property discussed above, and this leads us to the concept of an invariant manifold.

Definition 3.1 (Invariant Manifold) Take as given the forward rate process dynamics (2). Consider also a fixed family (manifold) of forward rate curves $\mathcal{G}$. We say that $\mathcal{G}$ is locally invariant under the action of $r$ if, for each point $(s, r) \in \mathbb{R}_+ \times \mathcal{G}$, the condition $r_s \in \mathcal{G}$ implies that $r_t \in \mathcal{G}$ on a (possibly random) time interval with positive length. If $r$ stays forever on $\mathcal{G}$, we say that $\mathcal{G}$ is globally invariant.

The purpose of this section is to characterize invariance in terms of local characteristics of $\mathcal{G}$ and $\mathcal{M}$, and in this context local invariance is the best one can hope for. In order to save space, local invariance will therefore be referred to as invariance.

To get some intuitive feeling for the invariance concepts one can consider the following two-dimensional deterministic system

$$\frac{dy_1}{dt} = y_2, \qquad \frac{dy_2}{dt} = -y_1.$$

For this system it is obvious that the unit circle $C = \left\{(y_1, y_2) : y_1^2 + y_2^2 = 1\right\}$ is globally invariant, i.e. if we start the system on $C$ it will stay forever on $C$. The ‘upper half’ of the circle, $C_u = \left\{(y_1, y_2) : y_1^2 + y_2^2 = 1,\ y_2 > 0\right\}$, is on the other hand only locally invariant, since the system will leave $C_u$ at the point $(1, 0)$. This geometric situation is in fact the generic one also for our infinite dimensional stochastic case. The forward rate trajectory will never leave a locally invariant manifold at a point in the relative interior of the manifold. Exit from the manifold can only take place at the relative boundary points. We have no general method for determining whether a locally invariant manifold also is globally invariant or not. Problems of this kind have to be solved separately for each particular case.
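The two-dimensional example is easy to reproduce, because the system is a clockwise rotation with an explicit solution. The sketch below follows a trajectory started at the (arbitrarily chosen) point $(0, 1)$, checks that the squared radius stays equal to one, and locates the time at which the trajectory leaves the upper half circle. The time grid and variable names are assumptions of this illustration.

```python
import math

def flow(y1, y2, t):
    """Exact solution of dy1/dt = y2, dy2/dt = -y1 (a clockwise rotation)."""
    return (y1 * math.cos(t) + y2 * math.sin(t),
            -y1 * math.sin(t) + y2 * math.cos(t))

start = (0.0, 1.0)                      # a point on the upper half circle
path = [flow(*start, 0.01 * k) for k in range(1000)]

# Global invariance of C: the squared radius is conserved along the path.
max_radius_error = max(abs(u * u + v * v - 1.0) for u, v in path)

# Local invariance of C_u: the path leaves y2 > 0 shortly after time pi/2.
exit_index = next(k for k, (_, v) in enumerate(path) if v <= 0.0)
exit_time = 0.01 * exit_index
exit_point = flow(*start, math.pi / 2)  # the boundary point (1, 0)
```

The exit happens at the relative boundary point $(1, 0)$ of $C_u$, never at an interior point of the arc.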

3.3 The Space

In order to study the consistency problem we need (see Remark 3.1 below) a very regular space to work in.

Definition 3.2 Consider a fixed real number $\gamma > 0$. The space $\mathcal{H}_\gamma$ is defined as the space of all infinitely differentiable functions

$$r : \mathbb{R}_+ \to \mathbb{R}$$

satisfying the norm condition $\|r\|_\gamma < \infty$. Here the norm is defined as

$$\|r\|_\gamma^2 = \sum_{n=0}^{\infty} 2^{-n}\int_0^\infty \left(\frac{d^n r}{dx^n}(x)\right)^2 e^{-\gamma x}\,dx.$$

Note that $\mathcal{H}$ is not a space of distributions, but a space of functions. We will often suppress the subindex $\gamma$. With the obvious inner product $\mathcal{H}$ is a pre-Hilbert space, and in [9] the following result is proved.


Proposition 3.1 The space H is a Hilbert space, i.e. it is complete. Furthermore, every function in the space is in fact real analytic, and can thus be uniquely extended to a holomorphic function in the entire complex plane.

Remark 3.1 The reason for our choice of H as the underlying space is that the linear operator F = d/dx is bounded in this space. Together with the assumptions above, this implies that both µ and σ are smooth vector fields on H, thus ensuring the existence of a strong local solution to the forward rate equation for every initial point $r^o \in H$.
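The boundedness of F can be read off directly from the norm in Definition 3.2. The following short computation is a sketch of that step, obtained by shifting the summation index; it is not quoted from [9].

$$\|Fr\|_\gamma^2 = \sum_{n=0}^{\infty} 2^{-n} \int_0^\infty \left(\frac{d^{n+1} r}{dx^{n+1}}(x)\right)^2 e^{-\gamma x}\,dx = 2\sum_{k=1}^{\infty} 2^{-k} \int_0^\infty \left(\frac{d^{k} r}{dx^{k}}(x)\right)^2 e^{-\gamma x}\,dx \le 2\,\|r\|_\gamma^2 .$$

Hence $\|Fr\|_\gamma \le \sqrt{2}\,\|r\|_\gamma$. The geometric weights $2^{-n}$ are what make this work: differentiating shifts each term one step along the sum at the cost of a factor 2.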

The Forward Curve Manifold

We consider as given a mapping

G : Z → H, (53)

where the parameter space Z is an open connected subset of $R^d$, i.e. for each parameter value $z \in Z \subseteq R^d$ we have a curve G(z) ∈ H. The value of this curve at the point x ∈ R+ will be written as G(z, x), so we see that G can also be viewed as a mapping

G : Z × R+ → R. (54)

The mapping G is thus a formalization of the idea of a finitely parameterized family of forward rate curves, and we now define the forward curve manifold as the set of all forward rate curves produced by this family.

Definition 3.3 The forward curve manifold G ⊆ H is defined as

G = Im (G) ,

where we use the notation Im (G) for the image (or range) of the mapping G.

The Interest Rate Model

We consider a given volatility σ of the form

σ : H× R+ → Rm.

In other words, σ(r, x) is a functional of the infinite dimensional r-variable, and a function of the real variable x. An alternative, and more instructive, way of viewing a component $\sigma_i$ is to see it as a mapping where each point r ∈ H is mapped to the real valued function $\sigma_i(r, \cdot)$. We will in fact assume that this real valued function is a member of H, which means that we can view each component $\sigma_i$ as a vector field $\sigma_i : H \to H$ on the space H. Denoting the forward rate curve at time t by $r_t$ we then have the following forward rate equation.


$$dr_t(x) = \left\{\frac{\partial}{\partial x} r_t(x) + \sigma(r_t, x)\int_0^x \sigma(r_t, u)\,du\right\}dt + \sigma(r_t, x)\,dW_t. \tag{55}$$

Remark 3.2 For notational simplicity we have assumed that the r-dynamics are time homogeneous. The case when σ is of the form σ(t, r, x) can be treated in exactly the same way. See [5].

We obviously need some regularity assumptions and these will be collected in Assumption 3.1 below. See [5] for further technical details.

The Problem

Our main problem is the following.

• Suppose that we are given

– A volatility σ, specifying an interest rate model M as in (55)

– A mapping G, specifying a forward curve manifold G.

• Is G then invariant under the action of r?

3.4 The Invariance Conditions

In order to study the invariance problem from a geometrical point of view we introduce some compact notation.

Definition 3.4 We define $\mathbf{H}\sigma$ by

$$\mathbf{H}\sigma(r, x) = \int_0^x \sigma(r, s)\,ds.$$

Suppressing the x-variable, the Ito dynamics for the forward rates are thus given by

$$dr_t = \left\{\frac{\partial}{\partial x} r_t + \sigma(r_t)\mathbf{H}\sigma(r_t)\right\}dt + \sigma(r_t)\,dW_t \tag{56}$$

and we write this more compactly as

$$dr_t = \mu_0(r_t)\,dt + \sigma(r_t)\,dW_t. \tag{57}$$

In this way we see clearly how (57) is an SDE on H, specified by its diffusion vector fields $\sigma_1, \ldots, \sigma_m$ and drift vector field $\mu_0$, where $\mu_0$ is given by the bracket term in (56). To get some intuition we now formally “divide by dt” and obtain

$$\frac{dr}{dt} = \mu_0(r_t) + \sigma(r_t)\dot{W}_t, \tag{58}$$


where the formal time derivative $\dot{W}_t$ is interpreted as an “input signal” chosen by chance. As in Section 2.4 we are thus led to study the associated deterministic control system

$$\frac{dr}{dt} = \mu_0(r_t) + \sigma(r_t)u_t. \tag{59}$$

The intuitive idea is now that G is invariant under (57) if and only if G is invariant under (59) for all choices of the input signal u. It is furthermore geometrically obvious that this happens if and only if the velocity vector $\mu_0(r) + \sigma(r)u$ is tangential to G for all points r ∈ G and all choices of $u \in R^m$. Since the tangent space of G at a point G(z) is given by $\mathrm{Im}[G'_z(z)]$, where $G'_z$ denotes the Frechet derivative (Jacobian), we are led to conjecture that G is invariant if and only if the condition

µ0(r) + σ(r)u ∈ Im [G′z(z)]

is satisfied for all u ∈ Rm. This can also be written

µ0(r) ∈ Im [G′z(z)] ,

σ(r) ∈ Im [G′z(z)] ,

where the last inclusion is interpreted component wise for σ.

This “result” is, however, not correct due to the fact that the argument above neglects the difference between ordinary calculus, which is used for (59), and Ito calculus, which governs (57). In order to bridge this gap we have to rewrite the analysis in terms of Stratonovich integrals instead of Ito integrals.

Definition 3.5 For given semimartingales X and Y, the Stratonovich integral of X with respect to Y, $\int_0^t X(s)\circ dY(s)$, is defined as

$$\int_0^t X_s \circ dY_s = \int_0^t X_s\,dY_s + \frac{1}{2}\langle X, Y\rangle_t. \tag{60}$$

The first term on the right hand side is the Ito integral. In the present case, with only Wiener processes as driving noise, we can define the ‘quadratic variation process’ $\langle X, Y\rangle$ in (60) by

$$d\langle X, Y\rangle_t = dX_t\,dY_t, \tag{61}$$

with the usual ‘multiplication rules’ dW · dt = dt · dt = 0, dW · dW = dt. We now recall the main result and raison d'être for the Stratonovich integral.

Proposition 3.2 (Chain Rule) Assume that the function F(t, y) is smooth. Then we have

$$dF(t, Y_t) = \frac{\partial F}{\partial t}(t, Y_t)\,dt + \frac{\partial F}{\partial y}(t, Y_t)\circ dY_t. \tag{62}$$

Thus, in the Stratonovich calculus, the Ito formula takes the form of the standard chain rule of ordinary calculus.
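The difference between the two integrals in (60) can be seen on a discretized path. The sketch below is only illustrative and assumes X = Y = W (a scalar Wiener process): it approximates the Ito integral by left-endpoint sums and the Stratonovich integral by midpoint sums. The function name, step count and seed are arbitrary choices.

```python
import random

def ito_and_stratonovich_sums(n_steps=10_000, horizon=1.0, seed=0):
    """Discrete approximations of the integral of W against W on [0, horizon]."""
    rng = random.Random(seed)
    dt = horizon / n_steps
    w = 0.0
    ito = strat = quad_var = 0.0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, dt ** 0.5)
        ito += w * dw                  # integrand at the left endpoint (Ito)
        strat += (w + 0.5 * dw) * dw   # integrand at the midpoint (Stratonovich)
        quad_var += dw * dw            # discrete quadratic variation
        w += dw
    return ito, strat, quad_var, w

ito, strat, quad_var, w_T = ito_and_stratonovich_sums()
```

By construction `strat - ito` equals `0.5 * quad_var`, the discrete counterpart of the correction term in (60), and the midpoint sum telescopes to `0.5 * w_T ** 2`, which is what the ordinary chain rule (62) gives for F(y) = y²/2.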


Returning to (57), the Stratonovich dynamics are given by

$$dr_t = \left\{\frac{\partial}{\partial x} r_t + \sigma(r_t)\mathbf{H}\sigma(r_t)\right\}dt - \frac{1}{2}\,d\langle\sigma(r_t), W_t\rangle + \sigma(r_t)\circ dW_t. \tag{63}$$

In order to compute the Stratonovich correction term above we use the infinite dimensional Ito formula (see [16]) to obtain

dσ(rt) = . . . dt + σ′r(rt)σ(rt)dWt, (64)

where σ′r denotes the Frechet derivative of σ w.r.t. the infinite dimensional r-variable.

From this we immediately obtain

d〈σ(rt),Wt〉 = σ′r(rt)σ(rt)dt. (65)

Remark 3.3 If the Wiener process W is multidimensional, then σ is a vector $\sigma = [\sigma_1, \ldots, \sigma_m]$, and the right hand side of (65) should be interpreted as

$$\sigma'_r(r_t)\sigma(r_t) = \sum_{i=1}^{m} \sigma'_{ir}(r_t)\sigma_i(r_t).$$

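Thus (63) becomes

$$dr_t = \left\{\frac{\partial}{\partial x} r_t + \sigma(r_t)\mathbf{H}\sigma(r_t) - \frac{1}{2}\sigma'_r(r_t)\sigma(r_t)\right\}dt + \sigma(r_t)\circ dW_t. \tag{66}$$

We now write (66) as

$$dr_t = \mu(r_t)\,dt + \sigma(r_t)\circ dW_t, \tag{67}$$

where

$$\mu(r, x) = \frac{\partial}{\partial x} r(x) + \sigma(r, x)\int_0^x \sigma(r, u)\,du - \frac{1}{2}\left[\sigma'_r(r)\sigma(r)\right](x). \tag{68}$$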

For all these arguments to make sense, we need some formal regularity assumptions.

Assumption 3.1 We assume the following.

• For each i = 1, . . . ,m the volatility vector field σi : H → H is smooth.

• The mapping

$$r \longmapsto \sigma(r)\mathbf{H}\sigma(r) - \frac{1}{2}\sigma'_r(r)\sigma(r)$$

is a smooth map from H to H.


• The mapping z −→ G(z) is a smooth embedding, so in particular the Frechet derivative $G'_z(z)$ is injective for all z ∈ Z.

Given the heuristics above, our main result is not surprising. The formal proof, which is somewhat technical, is left out. See [5].

Theorem 3.1 (Invariance Theorem) Under Assumption 3.1, the forward curve manifold G is locally invariant for the forward rate process $r_t(x)$ in M if and only if

$$G'_x(z) + \sigma(r)\mathbf{H}\sigma(r) - \frac{1}{2}\sigma'_r(r)\sigma(r) \in \mathrm{Im}[G'_z(z)], \tag{69}$$

$$\sigma(r) \in \mathrm{Im}[G'_z(z)], \tag{70}$$

hold for all z ∈ Z with r = G(z).

Here, $G'_z$ and $G'_x$ denote the Frechet derivatives of G with respect to z and x, respectively. The condition (70) is interpreted component wise for σ. Condition (69) is called the consistent drift condition, and (70) is called the consistent volatility condition.

Remark 3.4 It is easily seen that if the family G is invariant under shifts in the x-variable, then we will automatically have the relation

$$G'_x(z) \in \mathrm{Im}[G'_z(z)],$$

so in this case the relation (69) can be replaced by

$$\sigma(r)\mathbf{H}\sigma(r) - \frac{1}{2}\sigma'_r(r)\sigma(r) \in \mathrm{Im}[G'_z(z)],$$

with r = G(z) as usual.

3.5 Examples

The results above are extremely easy to apply in concrete situations. As a test case we consider the Nelson–Siegel (see [34]) family of forward rate curves. We analyze the consistency of this family with the Ho-Lee and Hull-White interest rate models. It should be emphasized that these examples are chosen only in order to illustrate the general methodology. For more examples and details, see [5].

The Nelson-Siegel Family

The Nelson–Siegel (henceforth NS) forward curve manifold G is parameterized by $z \in R^4$, with the curve x −→ G(z, x) given by


$$G(z, x) = z_1 + z_2 e^{-z_4 x} + z_3 x e^{-z_4 x}. \tag{71}$$

For $z_4 \neq 0$, the Frechet derivatives are easily obtained as

$$G'_z(z, x) = \left[1,\ e^{-z_4 x},\ x e^{-z_4 x},\ -(z_2 + z_3 x)\,x e^{-z_4 x}\right], \tag{72}$$

$$G'_x(z, x) = (z_3 - z_2 z_4 - z_3 z_4 x)\,e^{-z_4 x}. \tag{73}$$

In order for the image of this map to be included in $H_\gamma$, we need to impose the condition $z_4 > -\gamma/2$. In this case, the natural parameter space is thus $Z = \{z \in R^4 : z_4 \neq 0,\ z_4 > -\gamma/2\}$. However, as we shall see below, the results are uniform w.r.t. γ. Note that the mapping G indeed is smooth, and for $z_4 \neq 0$, G and $G'_z$ are also injective.

In the degenerate case $z_4 = 0$, we have

$$G(z, x) = z_1 + z_2 + z_3 x. \tag{74}$$

We return to this case below.

The Hull-White and Ho-Lee Models

As our test case, we analyze the Hull and White (henceforth HW) extension of the Vasicek model. On short rate form the model is given by

$$dR(t) = \{\Phi(t) - aR(t)\}\,dt + \sigma\,dW(t), \tag{75}$$

where a, σ > 0. As is well known, the corresponding forward rate formulation is

$$dr_t(x) = \beta(t, x)\,dt + \sigma e^{-ax}\,dW_t. \tag{76}$$

Thus, the volatility function is given by $\sigma(x) = \sigma e^{-ax}$, and the conditions of Theorem 3.1 become

$$G'_x(z, x) + \frac{\sigma^2}{a}\left[e^{-ax} - e^{-2ax}\right] \in \mathrm{Im}[G'_z(z, x)], \tag{77}$$

$$\sigma e^{-ax} \in \mathrm{Im}[G'_z(z, x)]. \tag{78}$$

To investigate whether the NS manifold is invariant under HW dynamics, we start with (78) and fix a z-vector. We then look for constants (possibly depending on z) A, B, C, and D, such that for all x ≥ 0 we have

$$\sigma e^{-ax} = A + Be^{-z_4 x} + Cxe^{-z_4 x} - D(z_2 + z_3 x)\,x e^{-z_4 x}. \tag{79}$$

This is possible if and only if $z_4 = a$, and since (78) must hold for all choices of z ∈ Z we immediately see that HW is inconsistent with the full NS manifold (see also the Notes below).
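One way to see why (79) forces $z_4 = a$ is through a differential operator that annihilates every function on the right hand side. The sketch below is a hypothetical helper, not part of the original text. It assumes a > 0 and $z_4 \neq 0$ and uses the fact that the functions 1, $e^{-z_4 x}$, $xe^{-z_4 x}$ and $x^2 e^{-z_4 x}$ all solve $D(D + z_4)^3 f = 0$ with D = d/dx, whereas applying this operator to $e^{-ax}$ multiplies it by the characteristic polynomial evaluated at −a.

```python
def annihilator_symbol(lam, z4):
    """Characteristic polynomial of the operator D (D + z4)^3, with D = d/dx.

    The functions 1, e^{-z4 x}, x e^{-z4 x} and x^2 e^{-z4 x} are annihilated
    by this operator, hence so is every element of Im[G'_z(z)] in (72).
    """
    return lam * (lam + z4) ** 3

def hw_volatility_is_tangent(a, z4, tol=1e-12):
    """Test whether sigma * e^{-a x} can lie in Im[G'_z(z)] (a > 0, z4 != 0)."""
    # A nonzero symbol at lam = -a rules out membership. For a > 0 the symbol
    # vanishes only when a = z4, and then e^{-a x} is the second entry of (72).
    return abs(annihilator_symbol(-a, z4)) < tol
```

For instance `hw_volatility_is_tangent(0.5, 0.5)` holds while `hw_volatility_is_tangent(0.5, 0.7)` does not. The tolerance is an arbitrary numerical choice, so values of $z_4$ extremely close to a are not resolved by this sketch.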


Proposition 3.3 (Nelson-Siegel and Hull-White) The Hull-White model is inconsistent with the NS family.

We have thus obtained a negative result for the HW model. The NS manifold is ‘too small’ for HW, in the sense that if the initial forward rate curve is on the manifold, then the HW dynamics will force the term structure off the manifold within an arbitrarily short period of time. For more positive results see [5].

Remark 3.5 It is an easy exercise to see that the minimal manifold which is consistent with HW is given by

$$G(z, x) = z_1 e^{-ax} + z_2 e^{-2ax}.$$

In the same way, one may easily test the consistency between NS and the model obtained by setting a = 0 in (75). This is the continuous time limit of the Ho and Lee model [27], and is henceforth referred to as HL. Since we have a pedagogical point to make, we give the results on consistency, which are as follows.

Proposition 3.4 (Nelson-Siegel and Ho-Lee)

(a) The full NS family is inconsistent with the Ho-Lee model.

(b) The degenerate family G(z, x) = z1 + z3x is in fact consistent with Ho-Lee.
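Proposition 3.4 can be checked against Theorem 3.1 in a few lines. The following is a sketch, assuming the Ho-Lee volatility is the constant σ(x) = σ obtained by setting a = 0. Then $\mathbf{H}\sigma(x) = \sigma x$ and $\sigma'_r = 0$, so conditions (69) and (70) read

$$G'_x(z, x) + \sigma^2 x \in \mathrm{Im}[G'_z(z, x)], \qquad \sigma \in \mathrm{Im}[G'_z(z, x)].$$

For the degenerate family $G(z, x) = z_1 + z_3 x$ we have $G'_z(z, x) = [1,\ x]$ and $G'_x(z, x) = z_3$, so both $z_3 + \sigma^2 x$ and σ lie in the span of 1 and x, and the conditions hold. For the full NS family with $z_4 \neq 0$, $G'_x$ lies in the span of $e^{-z_4 x}$ and $xe^{-z_4 x}$ by (73), but the term $\sigma^2 x$ is not a linear combination of the functions in (72), so (69) fails.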

Remark 3.6 We see that the minimal invariant manifold provides information about the model. From the result above, the HL model is closely tied to the class of affine forward rate curves. Such curves are unrealistic from an economic point of view, implying that the HL model is overly simplistic.

3.6 The Filipovic State Space Approach to Consistency

As we very easily detected above, neither the HW nor the HL model is consistent with the Nelson-Siegel family of forward rate curves. A much more difficult problem is to determine whether any interest rate model is. This is Problem II in Section 3.1 for the NS family, and in a very general setting, inverse consistency problems like this have been studied in great detail by Filipovic in [19], [20], and [21]. In this section we will give an introduction to the Filipovic state space approach to the (inverse) consistency problem, and we will also study a small laboratory example.

The study will be done within the framework of a factor model.

Definition 3.6 A factor model for the forward rate process r consists of the following objects and relations.


• A d-dimensional factor or state process Z with Q-dynamics of the form

dZt = a(Zt)dt + b(Zt)dWt, (80)

where W is an m-dimensional Wiener process. We denote by $a_i$ the i:th component of the column vector a, and by $b_i$ the i:th row of the matrix b.

• A smooth output mapping

$$G : R^d \to H.$$

For each $z \in R^d$, G(z) is thus a real valued $C^\infty$ function and its value at the point x ∈ R is denoted by G(z, x).

• The forward rate process is then defined by

$$r_t = G(Z_t), \tag{81}$$

or on component form

$$r_t(x) = G(Z_t, x). \tag{82}$$

Since we have given the Z dynamics under the martingale measure Q, it is obvious that there have to be some consistency requirements on the relations between a, b and G in order for r in (81) to be a specification of the forward rate process under Q. The obvious way of deriving the consistency requirements is to compute the r dynamics from (80)-(81) and then to compare the result with the general form of the forward rate equation in (4). For ease of notation we will use the shorthand notation

$$G_x = \frac{\partial G}{\partial x}, \qquad G_i = \frac{\partial G}{\partial z_i}, \qquad G_{ij} = \frac{\partial^2 G}{\partial z_i \partial z_j}. \tag{83}$$

From the Ito formula, (80), and (81) we obtain

$$dr_t = \left\{\sum_{i=1}^{d} G_i(Z_t)a_i(Z_t) + \frac{1}{2}\sum_{i,j=1}^{d} G_{ij}(Z_t)\,b_i(Z_t)b_j^\top(Z_t)\right\}dt \tag{84}$$

$$\qquad + \sum_{i=1}^{d} G_i(Z_t)\,b_i(Z_t)\,dW_t, \tag{85}$$

where $^\top$ denotes transpose. Going back to the forward rate equation (4) we can identify the volatility process as

$$\sigma_t = \sum_{i=1}^{d} G_i(Z_t)\,b_i(Z_t).$$

We now insert this into the drift part of (4). We then use (81) to deduce that $Fr_t = G_x(Z_t)$ and also insert this expression into the drift part of (4). Comparing the resulting equation with (84) gives us the required consistency conditions.


Proposition 3.5 (Filipovic) The following relation must hold identically in (z, x).

$$G_x(z, x) + \sum_{i,j=1}^{d} b_i(z)b_j^\top(z)\,G_i(z, x)\int_0^x G_j(z, s)\,ds = \sum_{i=1}^{d} G_i(z, x)a_i(z) + \frac{1}{2}\sum_{i,j=1}^{d} G_{ij}(z, x)\,b_i(z)b_j^\top(z). \tag{86}$$

We can view the consistency equation (86) in three different ways.

• We can check consistency for a given specification of G, a, b.

• We can specify a and b. Then (86) is a PDE for the determination of a consistent output function G.

• We can specify G, i.e. we can specify a finite dimensional manifold of forward rate curves, and then use (86) to investigate whether there exists an underlying consistent state vector process Z, and if so, to find a and b. This inverse problem is precisely Problem II in Section 3.1.

We will focus on the last inverse problem above, and to see how the consistency equation can be used, we now go on to study two simple laboratory examples.

Example 3.1 In this example we consider the 2-dimensional manifold of linear forward rate curves, i.e. the output function G defined by

$$G(z, x) = z_1 + z_2 x. \tag{87}$$

This is not a very natural example from a finance point of view, but it is a good illustration of technique. The question we ask is whether there exists some forward rate model consistent with the class of linear forward rate curves and, if so, what the factor dynamics look like. For simplicity we restrict ourselves to the case of a scalar driving Wiener process, but the reader is invited to analyze the (perhaps more natural) case with a two-dimensional W.

We thus model the factor dynamics as

$$dZ_{1,t} = a_1(Z_t)\,dt + b_1(Z_t)\,dW_t, \tag{88}$$

$$dZ_{2,t} = a_2(Z_t)\,dt + b_2(Z_t)\,dW_t. \tag{89}$$

In this case we have

$$G_x(z, x) = z_2, \quad G_1(z, x) = 1, \quad G_2(z, x) = x,$$

$$G_{11}(z, x) = 0, \quad G_{12}(z, x) = 0, \quad G_{22}(z, x) = 0,$$

and

$$\int_0^x G_1(z, s)\,ds = x, \qquad \int_0^x G_2(z, s)\,ds = \frac{1}{2}x^2,$$

so the consistency equation (86) becomes

$$z_2 + b_1^2(z)x + b_1(z)b_2(z)\frac{1}{2}x^2 + b_2(z)b_1(z)x^2 + b_2^2(z)\frac{1}{2}x^3 = a_1(z) + a_2(z)x. \tag{90}$$

Identifying coefficients we see directly that $b_2 = 0$, so the equation reduces to

$$z_2 + b_1^2(z)x = a_1(z) + a_2(z)x, \tag{91}$$

which gives us the relations $a_1 = z_2$ and $a_2 = b_1^2$. Thus we see that for this choice of G there does indeed exist a class of consistent factor models, with factor dynamics given by

$$dZ_{1,t} = Z_{2,t}\,dt + b_1(Z_t)\,dW_t, \tag{92}$$

$$dZ_{2,t} = b_1^2(Z_t)\,dt. \tag{93}$$

Here $b_1$ can be chosen completely freely (subject only to regularity conditions). Choosing $b_1(z) = 1$, we see that the factor $Z_2$ is essentially running time, and the model is then in fact a special case of the Ho-Lee model.
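The coefficient matching in Example 3.1 can be double checked numerically. The sketch below is illustrative only: it evaluates both sides of (86) for the linear output function (87) with a scalar Wiener process, so that $b_1$ and $b_2$ are numbers at a fixed z. The function names and test values are assumptions introduced here.

```python
def consistency_residual(z, x, a, b):
    """Left side minus right side of (86) for G(z, x) = z1 + z2 * x."""
    z1, z2 = z
    g_x = z2                      # G_x
    g = [1.0, x]                  # G_1 and G_2
    g_int = [x, 0.5 * x * x]      # integrals of G_1 and G_2 over [0, x]
    lhs = g_x + sum(b[i] * b[j] * g[i] * g_int[j]
                    for i in range(2) for j in range(2))
    rhs = g[0] * a[0] + g[1] * a[1]   # every second derivative G_ij vanishes
    return lhs - rhs

def consistent_coefficients(z, b1):
    """The choice found in Example 3.1: a1 = z2, a2 = b1^2, b2 = 0."""
    return (z[1], b1 * b1), (b1, 0.0)

z = (0.03, 0.01)
a, b = consistent_coefficients(z, 0.2)
residuals = [consistency_residual(z, x, a, b) for x in (0.0, 0.5, 1.0, 7.5)]
```

All entries of `residuals` vanish up to rounding, whereas a nonzero $b_2$, for example `b = (0.2, 0.1)`, leaves terms in x² and x³ that no choice of $a_1$ and $a_2$ can cancel.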

Example 3.2 We now go on to study the more complicated two-dimensional manifold of exponential forward rate curves, given by the output function

$$G(z, x) = z_1 e^{z_2 x}. \tag{94}$$

This is a simplified version of the Nelson-Siegel manifold, so it will give us some insight into the consistency problem for the NS case. In this case we will assume two independent driving Wiener processes $W^1$ and $W^2$, and we will assume factor dynamics of the form

$$dZ_{1,t} = a_1(Z_t)\,dt + b_{11}(Z_t)\,dW^1_t, \tag{95}$$

$$dZ_{2,t} = a_2(Z_t)\,dt + b_{22}(Z_t)\,dW^2_t. \tag{96}$$

Note that the factors are being driven by independent Wiener processes. The reader is invited to study the general case when both $W^1$ and $W^2$ enter into both equations. In our case we have


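$$b_1(z) = [b_{11}(z),\ 0], \tag{97}$$

$$b_2(z) = [0,\ b_{22}(z)], \tag{98}$$

$$G_x(z, x) = z_1 z_2 e^{z_2 x}, \tag{99}$$

$$G_1(z, x) = e^{z_2 x}, \tag{100}$$

$$\int_0^x G_1(z, s)\,ds = z_2^{-1}\left(e^{z_2 x} - 1\right), \tag{101}$$

$$G_{11}(z, x) = 0, \tag{102}$$

$$G_2(z, x) = z_1 x e^{z_2 x}, \tag{103}$$

$$G_{22}(z, x) = z_1 x^2 e^{z_2 x}, \tag{104}$$

$$\int_0^x G_2(z, s)\,ds = z_1 z_2^{-1} x e^{z_2 x} - z_1 z_2^{-2} e^{z_2 x} + z_1 z_2^{-2}. \tag{105}$$

The consistency equation thus becomes

$$
\begin{aligned}
& z_1 z_2 e^{z_2 x} + b_{11}^2(z) z_2^{-1} e^{2 z_2 x} - b_{11}^2(z) z_2^{-1} e^{z_2 x} \\
&\quad + b_{22}^2(z) z_1^2 z_2^{-1} x^2 e^{2 z_2 x} - b_{22}^2(z) z_1^2 z_2^{-2} x e^{2 z_2 x} + b_{22}^2(z) z_1^2 z_2^{-2} x e^{z_2 x} \\
&= a_1(z) e^{z_2 x} + a_2(z) z_1 x e^{z_2 x} + \frac{1}{2} b_{22}^2(z) z_1 x^2 e^{z_2 x}.
\end{aligned}
$$

Rearranging terms we have

$$
\begin{aligned}
& e^{z_2 x}\left[z_1 z_2 - b_{11}^2(z) z_2^{-1} - a_1(z)\right] + x e^{z_2 x}\left[b_{22}^2(z) z_1^2 z_2^{-2} - a_2(z) z_1\right] + x^2 e^{z_2 x}\left[-\frac{1}{2} b_{22}^2(z) z_1\right] \\
&\quad + e^{2 z_2 x}\left[b_{11}^2(z) z_2^{-1}\right] + x e^{2 z_2 x}\left[-b_{22}^2(z) z_1^2 z_2^{-2}\right] + x^2 e^{2 z_2 x}\left[b_{22}^2(z) z_1^2 z_2^{-1}\right] = 0.
\end{aligned}
$$

From the linear independence of the quasi-exponential functions, we immediately obtain

$$b_{11}(z) = 0, \quad b_{22}(z) = 0, \quad a_2(z) = 0, \quad a_1(z) = z_1 z_2.$$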

The only consistent Z-dynamics are thus given by

$$dZ_{1,t} = Z_{2,t} Z_{1,t}\,dt, \tag{106}$$

$$dZ_{2,t} = 0, \tag{107}$$

which implies that $Z_2$ is constant and that


$$Z_{1,t} = Z_{1,0}\,e^{Z_2 t}.$$

We thus see that, apart from allowing randomness in the initial values, both $Z_1$ and $Z_2$ evolve along deterministic paths, where in fact $Z_2$ stays constant whereas $Z_1$ grows exponentially at the rate $Z_2$. In other words, there exists no non-trivial factor model which is consistent with the class of exponential forward rate curves.
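As a short added remark on what this deterministic solution means for the forward curve itself, inserting the factor paths into the output function (94) gives

$$r_t(x) = G(Z_t, x) = Z_{1,0}\,e^{Z_2 t}\,e^{Z_2 x} = Z_{1,0}\,e^{Z_2 (t + x)} = r_0(x + t).$$

Each future forward rate curve is thus the initial curve shifted along the x-axis, so the only consistent model is the one in which today's forward rates are realized exactly.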

As we have seen, the calculations quickly become rather messy, and it is thus a formidable task to find the set of consistent factor models for a more complicated manifold like, say, the Nelson-Siegel family of forward rate curves. Since the NS family is four-dimensional we would need a four dimensional factor model with four independent Wiener processes (all of which would be driving each of the four equations).

In [19] the case of the NS family was indeed studied, and it was proved that no nontrivial Wiener driven model is consistent with NS. Thus, for a model to be consistent with Nelson-Siegel, it must be deterministic (apart from randomness in the initial conditions). In [20] (which is a technical tour de force) this result was then extended to a much larger exponential polynomial family than the NS family.

3.7 Notes

The section is largely based on [5] and [19]. In our presentation we have used strong solutions of the infinite dimensional forward rate SDE. This is of course restrictive. The invariance problem for weak solutions has been studied by Filipovic in [22] and [21]. An alternative way of studying invariance is by using some version of the Stroock–Varadhan support theorem, and this line of thought is carried out in depth in [38].

4 The General Realization Problem

We now turn to Problem 2 in Section 1.3, i.e. the problem of when a given forward rate model has a finite dimensional factor realization. For ease of exposition we mostly confine ourselves to time invariant forward rate dynamics. Time varying systems can be treated similarly (see [9]). We will use some ideas and concepts from differential geometry, and a general reference here is [37]. The section is based on [9].

4.1 Setup

We consider a given volatility structure $\sigma : H \to H^m$ and study the induced forward rate model (on Stratonovich form)


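$$dr_t = \mu(r_t)\,dt + \sigma(r_t)\circ dW_t, \tag{108}$$

where as before (see Section 3.4)

$$\mu(r) = \frac{\partial}{\partial x} r + \sigma(r)\mathbf{H}\sigma(r) - \frac{1}{2}\sigma'_r(r)\sigma(r). \tag{109}$$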

Throughout the rest of the section, Assumption 3.1 is in force.

4.2 The Geometric Problem

Given a specification of the volatility mapping σ, and an initial forward rate curve $r^o$, we now investigate when (and how) the corresponding forward rate process possesses a finite dimensional realization. We are thus looking for smooth d-dimensional vector fields a and b, an initial point $z_0 \in R^d$, and a mapping $G : R^d \to H$ such that r, locally in time, has the representation

dZt = a (Zt) dt + b (Zt) dWt, Z0 = z0 (110)

rt(x) = G (Zt, x) . (111)

Remark 4.1 Let us clarify some points. Firstly, note that in principle it may well happen that, given a specification of σ, the r-model has a finite dimensional realization given a particular initial forward rate curve $r^o$, while being infinite dimensional for all other initial forward rate curves in a neighborhood of $r^o$. We say that such a model is a non-generic or accidental finite dimensional model. If, on the other hand, r has a finite dimensional realization for all initial points in a neighborhood of $r^o$, then we say that the model is a generically finite dimensional model. In this text we are solely concerned with the generic problem. Secondly, let us emphasize that we are looking for local (in time) realizations.

We can now connect the realization problem to our studies of invariant manifolds.

Proposition 4.1 The forward rate process possesses a finite dimensional realization if and only if there exists an invariant finite dimensional submanifold G with $r^o \in G$.

Proof. See [5] for the full proof. The intuitive argument runs as follows. Suppose that there exists a finite dimensional invariant manifold G with $r^o \in G$. Then G has a local coordinate system, and we may define the Z process as the local coordinate process for the r-process. On the other hand it is clear that if r has a finite dimensional realization as in (110)-(111), then every forward rate curve that will be produced by the model is of the form x −→ G(z, x) for some choice of z. Thus there exists a finite dimensional invariant submanifold G containing the initial forward rate curve $r^o$, namely G = Im G.

Using Theorem 3.1 we immediately obtain the following geometric characterization of the existence of a finite realization.


Corollary 4.1 The forward rate process possesses a finite dimensional realization if and only if there exists a finite dimensional manifold G containing $r^o$, such that, for each r ∈ G the following conditions hold.

$$\mu(r) \in T_G(r), \qquad \sigma(r) \in T_G(r).$$

Here $T_G(r)$ denotes the tangent space to G at the point r, and the vector fields µ and σ are as above. The tangency condition for σ is as usual interpreted component wise.

4.3 The Main Result

Given the volatility vector fields $\sigma_1, \ldots, \sigma_m$, and hence also the field µ, we now are faced with the problem of determining if there exists a finite dimensional manifold G with the property that µ and $\sigma_1, \ldots, \sigma_m$ are tangential to G at each point of G. In the case when the underlying space is finite dimensional, this is a standard problem in differential geometry, and we will now give the heuristics.

To get some intuition we start with a simpler problem and therefore consider the space H (or any other Hilbert space), and a smooth vector field f on the space. For each fixed point $r^o \in H$ we now ask if there exists a finite dimensional manifold G with $r^o \in G$ such that f is tangential to G at every point. The answer to this question is yes, and the manifold can in fact be chosen to be one-dimensional. To see this, consider the infinite dimensional ODE

$$\frac{dr_t}{dt} = f(r_t), \tag{112}$$

$$r_0 = r^o. \tag{113}$$

If $r_t$ is the solution, at time t, of this ODE, we use the notation

$$r_t = e^{ft} r^o.$$

We have thus defined a group of operators $\{e^{ft} : t \in R\}$, and we note that the set $\{e^{ft} r^o : t \in R\} \subseteq H$ is nothing else than the integral curve of the vector field f, passing through $r^o$. If we define G as this integral curve, then our problem is solved, since f will be tangential to G by construction.

Let us now take two vector fields $f_1$ and $f_2$ as given, where the reader informally can think of $f_1$ as σ (in the case of a scalar Wiener process) and $f_2$ as µ. We also fix an initial point $r^o \in H$ and the question is if there exists a finite dimensional manifold G, containing $r^o$, with the property that $f_1$ and $f_2$ are both tangential to G at each point of G. We call such a manifold a tangential manifold for the vector fields. At first glance it would seem that there always exists a tangential manifold, and that it can even be chosen to be two-dimensional. The geometric idea is that we start at $r^o$ and let $f_1$ generate the integral curve $\{e^{f_1 s} r^o : s \ge 0\}$. For each point $e^{f_1 s} r^o$ on this


curve we now let $f_2$ generate the integral curve starting at that point. This gives us the object $e^{f_2 t} e^{f_1 s} r^o$ and thus it seems that we sweep out a two dimensional surface G in H. This is our obvious candidate for a tangential manifold.

In the general case this idea will, however, not work, and the basic problem is as follows. In the construction above we started with the integral curve generated by $f_1$ and then applied $f_2$, and there is of course no guarantee that we will obtain the same surface if we start with $f_2$ and then apply $f_1$. We thus have some sort of commutativity problem, and the key concept is the Lie bracket.

Definition 4.1 Given smooth vector fields f and g on H, the Lie bracket [f, g] is a new vector field defined by

$$[f, g](r) = f'(r)g(r) - g'(r)f(r). \tag{114}$$

The Lie bracket measures the lack of commutativity on the infinitesimal scale in our geometric program above, and for the procedure to work we need a condition which says that the lack of commutativity is “small”. It turns out that the relevant condition is that the Lie bracket should be in the linear hull of the vector fields.

Definition 4.2 Let $f_1, \ldots, f_n$ be smooth independent vector fields on some space X. Such a system is called a distribution, and the distribution is said to be involutive if

$$[f_i, f_j](x) \in \mathrm{span}\{f_1(x), \ldots, f_n(x)\}, \quad \forall i, j,$$

where the span is the linear hull over the real numbers.
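A finite dimensional toy computation may help to fix the sign convention in (114) and the meaning of involutivity. The sketch below works on R³ rather than on H, approximates the Frechet derivatives by central differences, and uses two hypothetical vector fields `f` and `g` chosen only for illustration.

```python
def jacobian_times(field, r, h, eps=1e-6):
    """Directional derivative field'(r) h, approximated by central differences."""
    plus = field([ri + eps * hi for ri, hi in zip(r, h)])
    minus = field([ri - eps * hi for ri, hi in zip(r, h)])
    return [(p - m) / (2.0 * eps) for p, m in zip(plus, minus)]

def lie_bracket(f, g, r):
    """[f, g](r) = f'(r) g(r) - g'(r) f(r), the sign convention of (114)."""
    fg = jacobian_times(f, r, g(r))
    gf = jacobian_times(g, r, f(r))
    return [u - v for u, v in zip(fg, gf)]

def det3(rows):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, k), (l, m, n) = rows
    return a * (e * n - k * m) - b * (d * n - k * l) + c * (d * m - e * l)

# Hypothetical example fields on R^3 (not taken from the text).
def f(r):
    return [1.0, 0.0, r[1]]

def g(r):
    return [0.0, 1.0, 0.0]

point = [0.3, -1.2, 2.0]
bracket = lie_bracket(f, g, point)              # approximately (0, 0, 1)
volume = det3([f(point), g(point), bracket])    # nonzero: bracket leaves the span
```

Since `volume` is nonzero, the bracket is linearly independent of `f(point)` and `g(point)`, so this pair is not involutive and no two-dimensional tangential manifold exists for it. Adjoining the bracket gives three independent fields that span all of R³.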

We now have the following basic result, which extends a classic result from finite dimensional differential geometry (see [37]).

Theorem 4.1 (Frobenius) Let $f_1, \ldots, f_k$ be independent smooth vector fields in H and consider a fixed point $r^o \in H$. Then the following statements are equivalent.

• For each point r in a neighborhood of $r^o$, there exists a k-dimensional tangential manifold passing through r.

• The system $f_1, \ldots, f_k$ of vector fields is involutive.

Proof. See [9], which provides a self contained proof of the Frobenius Theorem in Banach space.

Let us now go back to our interest rate model. We are thus given the vector fields µ, σ, and an initial point $r^o$, and the problem is whether there exists a finite dimensional tangential manifold containing $r^o$. Using the infinite dimensional Frobenius theorem, this situation is now easily analyzed. Suppose for simplicity that m = 1, i.e. that


we only have one scalar driving Wiener process. Now, if {µ, σ} is involutive then there exists a two dimensional tangential manifold. If {µ, σ} is not involutive, this means that the Lie bracket [µ, σ] is not in the linear span of µ and σ, so then we consider the system {µ, σ, [µ, σ]}. If this system is involutive there exists a three dimensional tangential manifold. If it is not involutive at least one of the brackets [µ, [µ, σ]], [σ, [µ, σ]] is not in the span of {µ, σ, [µ, σ]}, and we then adjoin this (these) bracket(s). We continue in this way, forming brackets of brackets, and adjoining these to the linear hull of the previously obtained vector fields, until the point when the system of vector fields thus obtained actually is closed under the Lie bracket operation.

Definition 4.3 Take the vector fields $f_1, \ldots, f_k$ as given. The Lie algebra generated by $f_1, \ldots, f_k$ is the smallest linear space (over R) of vector fields which contains $f_1, \ldots, f_k$ and is closed under the Lie bracket. This Lie algebra is denoted by

$$L = \{f_1, \ldots, f_k\}_{LA}.$$

The dimension of L is defined, for each point r ∈ H, as

$$\dim[L(r)] = \dim \mathrm{span}\{f_1(r), \ldots, f_k(r)\}.$$

Putting all these results together, we can now state the main result on finite dimensional realizations. As can be seen from the arguments above, the fact that we have been studying the particular case of the forward rate equation is not at all essential: all results will continue to hold for any SDE with smooth drift and diffusion vector fields, evolving on a Hilbert space. We therefore state the main realization theorem for an arbitrary SDE in Hilbert space.

Theorem 4.2 (Main Result) Consider the following Stratonovich SDE, evolving in a given Hilbert space H.

$$dr_t = \mu(r_t)\,dt + \sigma(r_t)\circ dW_t. \tag{115}$$

We assume that the drift and diffusion terms µ and σ are smooth vector fields on H.

Then the SDE (115) generically admits a finite dimensional realization if and only if

$$\dim\{\mu, \sigma_1, \ldots, \sigma_m\}_{LA} < \infty$$

in a neighborhood of $r^o$.
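As a hedged illustration of how the criterion is applied, consider the Hull-White volatility $\sigma(r, x) = \sigma e^{-ax}$ from Section 3.5 with m = 1. This worked computation is a sketch added here and is not quoted from the text. Since σ does not depend on r we have $\sigma'_r = 0$, so by (109) and (77)

$$\mu(r) = Fr + \frac{\sigma^2}{a}\left[e^{-ax} - e^{-2ax}\right], \qquad \mu'(r)h = Fh,$$

where F = d/dx. Using Definition 4.1,

$$[\mu, \sigma](r) = \mu'(r)\sigma - \sigma'_r(r)\mu(r) = F\sigma = -a\,\sigma e^{-ax} = -a\,\sigma(r).$$

The bracket is a constant multiple of σ, so the system {µ, σ} is already closed under the Lie bracket and the Lie algebra has dimension at most two. This agrees with the two-dimensional minimal consistent manifold in Remark 3.5.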

The result above thus provides a general solution to Problem 2 from Section 1.3. For any given specification of forward rate volatilities, the Lie algebra can in principle be computed, and the dimension can be checked. Note, however, that the theorem is a pure existence result. If, for example, the Lie algebra has dimension five, then we know that there exists a five-dimensional realization, but the theorem does not


directly tell us how to construct a concrete realization. This is the subject of Section 5 below. Note also that realizations are not unique, since any diffeomorphic mapping of the factor space $R^d$ onto itself will give a new equivalent realization.

When computing the Lie algebra generated by µ and σ, the following observations are often useful.

Lemma 4.1 Take the vector fields $f_1, \ldots, f_k$ as given. The Lie algebra $L = \{f_1, \ldots, f_k\}_{LA}$ remains unchanged under the following operations.

• The vector field $f_i(r)$ may be replaced by $\alpha(r)f_i(r)$, where α is any smooth nonzero scalar field.

• The vector field $f_i(r)$ may be replaced by

$$f_i(r) + \sum_{j \neq i} \alpha_j(r)f_j(r),$$

where $\alpha_j$ is any smooth scalar field.

Proof. The first point is geometrically obvious, since multiplication by a scalar field will only change the length of the vector field $f_i$, and not its direction, and thus not the tangential manifold. Formally it follows from the “Leibnitz rule” $[f, \alpha g] = \alpha[f, g] - (\alpha' f)g$. The second point follows from the bilinear property of the Lie bracket together with the fact that [f, f] = 0.

4.4 Constructing the Invariant Manifold

As we have seen above, there exists generically an FDR for (115) if and only ifthere exists, for any initial point near ro an invariant manifold containing the initialpoint, and this manifold will also be a tangential manifold for all vector fields inthe Lie algebra µ, σLA. In this section we provide a concrete parameterization ofthis invariant manifold. This result will be used in connection with the constructionproblem treated in Section 5.

Proposition 4.2 Consider the SDE (115), and assume that the Lie algebra $\mathcal{L} = \{\mu, \sigma\}_{LA}$ is finite dimensional near $r^o$. Assume furthermore that we have chosen an involutive system of independent vector fields $f_1, \dots, f_n$ such that $\mathcal{L} = \mathrm{span}\{f_1, \dots, f_n\}$. Now choose an initial point $r_0 \in H$ near $r^o$. Denote the induced invariant (and thus tangential) manifold through $r_0$ by $\mathcal{G}$. Define the mapping $G : R^n \to X$ by

$$G(z_1, \dots, z_n) = e^{f_n z_n} \cdots e^{f_1 z_1} r_0.$$

Then $G$ is a local parameterization of $\mathcal{G}$. Furthermore, the inverse of $G$ restricted to $V$ is a local coordinate system for $\mathcal{G}$ at $r_0$.

Proof. It follows directly from the definition of a tangential manifold that $G(z) \in \mathcal{G}$ for all $z$ near $0$ in $R^n$. Furthermore it is easy to see that $G'(0)h = \sum_{i=1}^n h_i f_i(r_0)$ and, since $f_1, \dots, f_n$ are independent, $G'(0)$ is injective. The inverse function theorem does the rest.

With this machinery we can also very easily solve a related question. Consider a fixed interest rate model, specified by the volatility $\sigma$, and also a fixed family of forward rate curves parameterized by the mapping $G_0 : R^k \to H$. Now, if $\mathcal{G}_0 = \mathrm{Im}[G_0]$ is invariant, then the interest rate model will, given any initial point $r^o$ in $\mathcal{G}_0$, only produce forward rate curves belonging to $\mathcal{G}_0$ or, in the terminology of Section 3, the given model and the family $\mathcal{G}_0$ are consistent. If the family is not consistent, then an initial forward rate curve in $\mathcal{G}_0$ may produce future forward rate curves outside $\mathcal{G}_0$, and the question arises how to construct the smallest possible family of forward rate curves which contains the initial family $\mathcal{G}_0$, and is consistent (i.e. invariant) w.r.t. the interest rate model. As a concrete example, one may want to find the minimal extension of the Nelson-Siegel family of forward rate curves (see [34], [5]) which is consistent with the Hull-White (extended Vasicek) model. In particular one would like to know under what conditions this minimal extension of $\mathcal{G}_0$ is finite dimensional.

In geometrical terms we thus want to construct the minimal manifold containing $\mathcal{G}_0$ which is tangential w.r.t. the vector fields $\mu, \sigma_1, \dots, \sigma_m$. The solution is obvious: for every point on $\mathcal{G}_0$ we construct the minimal tangential manifold through that point, and then we define the extension $\mathcal{G}$ as the union of all these fibers. Thus we have the following result, the proof of which is obvious. Concrete applications will be given below.

Proposition 4.3 Consider a fixed volatility mapping $\sigma$, and let $\mathcal{G}_0$ be a $k$-dimensional submanifold parameterized by $G_0 : R^k \to H$. Then $\mathcal{G}_0$ can be extended to a finite dimensional invariant submanifold $\mathcal{G}$ if and only if

$$\dim \{\mu, \sigma_1, \dots, \sigma_m\}_{LA} < \infty.$$

Moreover, if $\mathcal{G}_0$ is transversal to $\{\mu, \sigma\}_{LA}$ and if the Lie algebra is spanned by the independent vector fields $f_1, \dots, f_d$, then $\dim \mathcal{G} = k + d$ and a parameterization of $\mathcal{G}$ is given by the map $G : R^{k+d} \to H$, defined by

$$G(z_1, \dots, z_k, y_1, \dots, y_d) = e^{f_d y_d} \cdots e^{f_1 y_1} G_0(z_1, \dots, z_k). \tag{116}$$

Remark 4.2 The term "transversal" above means that no vector in the Lie algebra $\mathcal{L}(\mu, \sigma)$ is contained in the tangent space of $\mathcal{G}_0$ at any point of $\mathcal{G}_0$. This prevents an integral curve of $\mathcal{L}$ from being contained in $\mathcal{G}_0$, which otherwise would lead to an extension with lower dimension than $d + k$. In such a case the parameterization above would amount to an over-parameterization in the sense that $G$ would not be injective.

4.5 Applications

In this section we give some simple applications of the theory developed above. For more examples and results, see [9].

Constant Volatility: Existence of FDRs

We start with the simplest case, which is when the volatility $\sigma(r, x)$ does not depend on $r$. In other words, $\sigma$ is of the form $\sigma(r, x) = \sigma(x)$, and $\sigma$ is thus a constant vector field on $H$. We assume for brevity of notation that we have only one driving Wiener process. Since $\sigma$ is deterministic we have no Stratonovich correction term and the vector fields are given by

$$\mu(r, x) = Fr(x) + \sigma(x) \int_0^x \sigma(s)\,ds,$$

$$\sigma(r, x) = \sigma(x),$$

where as before $F = \partial/\partial x$.

The Fréchet derivatives are trivial in this case. Since $F$ is linear (and bounded in our space), and $\sigma$ is constant as a function of $r$, we obtain

$$\mu'_r = F, \qquad \sigma'_r = 0.$$

Thus the Lie bracket $[\mu, \sigma]$ is given by

$$[\mu, \sigma] = F\sigma,$$

and in the same way we have

$$[\mu, [\mu, \sigma]] = F^2\sigma.$$

Continuing in the same manner it is easily seen that the relevant Lie algebra $\mathcal{L}$ is given by

$$\mathcal{L} = \{\mu, \sigma\}_{LA} = \mathrm{span}\{\mu, \sigma, F\sigma, F^2\sigma, \dots\} = \mathrm{span}\{\mu, F^n\sigma;\ n = 0, 1, 2, \dots\},$$

and it is thus clear that $\mathcal{L}$ is finite dimensional (at each point $r$) if and only if the function space

$$\mathrm{span}\{F^n\sigma;\ n = 0, 1, 2, \dots\}$$

is finite dimensional. This, on the other hand, occurs if and only if each component of $\sigma$ solves a linear ODE with constant coefficients. This argument is easily extended to the case of a multidimensional driving Wiener process so, using Lemma 2.1, we can finally state the existence result for constant volatility models.

Proposition 4.4 Assume that the volatility components $\sigma_1, \dots, \sigma_m$ are deterministic, i.e. of the form

$$\sigma_i(r, x) = \sigma_i(x), \quad i = 1, \dots, m.$$

Then there exists a finite dimensional realization if and only if the function space

$$\mathrm{span}\{F^n\sigma_i;\ i = 1, \dots, m;\ n = 0, 1, 2, \dots\}$$

is finite dimensional. This occurs if and only if each component of $\sigma$ is a quasi-exponential function.
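The criterion in Proposition 4.4 can be checked mechanically for a volatility of the form $p(x)e^{\alpha x}$ with $p$ a polynomial. The following is a minimal sketch and not part of the original text: the function names and the coefficient-list representation of $p$ are assumptions made for illustration. Each derivative $F^n\sigma$ is again of the form $q(x)e^{\alpha x}$ with $q = p' + \alpha p$, so the dimension of the span is the rank of the matrix of polynomial coefficients, computed here in exact rational arithmetic.

```python
from fractions import Fraction


def derivative(coeffs, alpha):
    """Return q with d/dx [p(x) e^{alpha x}] = q(x) e^{alpha x}, i.e. q = p' + alpha*p.

    coeffs[k] is the coefficient of x**k in p.
    """
    n = len(coeffs)
    out = []
    for k in range(n):
        term = alpha * coeffs[k]
        if k + 1 < n:
            term += (k + 1) * coeffs[k + 1]
        out.append(term)
    return out


def rank(rows):
    """Rank of a list of equal-length rows via exact Gaussian elimination."""
    rows = [list(r) for r in rows]
    rk = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(len(rows)):
            if i != rk and rows[i][col] != 0:
                factor = rows[i][col] / rows[rk][col]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[rk])]
        rk += 1
    return rk


def span_dimension(coeffs, alpha, n_max=12):
    """Dimension of span{F^n sigma; n = 0..n_max} for sigma(x) = p(x) e^{alpha x}.

    n_max only needs to be at least the degree of p for the answer to be final.
    """
    current = [Fraction(c) for c in coeffs]
    alpha = Fraction(alpha)
    rows = []
    for _ in range(n_max + 1):
        rows.append(current)
        current = derivative(current, alpha)
    return rank(rows)
```

For instance, `span_dimension([1], -2)` corresponds to $\sigma(x) = e^{-2x}$ and `span_dimension([0, 1], -2)` to $\sigma(x) = x e^{-2x}$. A volatility that is not quasi-exponential cannot be represented in this form at all, which reflects the "only if" direction of the proposition.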

Constant Volatility: Invariant Manifolds

For models with constant volatility vector fields, we now turn to the construction of invariant manifolds, and to this end we assume that the Lie algebra above is finite dimensional. Thus it is spanned by a finite number of vector fields as

$$\{\mu, \sigma\}_{LA} = \mathrm{span}\{\mu, \sigma_i^{(k)};\ i = 1, \dots, m;\ k = 0, 1, \dots, n_i\},$$

where

$$\sigma_i^{(k)}(x) = \frac{\partial^k \sigma_i}{\partial x^k}(x).$$

In order to apply Proposition 4.2 and Proposition 4.3, we have to compute the operators $\exp[\mu t]$ and $\exp[\sigma_i^{(k)} t]$, i.e. we have to solve $H$-valued ODEs. We recall that

$$\mu(r) = Fr + D,$$

where the constant field $D$ is given by

$$D(x) = \sum_{i=1}^m \sigma_i(x) \int_0^x \sigma_i(s)\,ds,$$

which can be written as

$$D(x) = \frac{1}{2} \frac{\partial}{\partial x} \|S(x)\|^2,$$

where $S(x) = \int_0^x \sigma(s)\,ds$. Thus $e^{\mu t}$ is obtained by solving

$$\frac{dr}{dt} = Fr + D.$$

This is a linear equation, and from Proposition 2.1 we obtain the solution

$$r_t = e^{Ft} r_0 + \int_0^t e^{F(t-s)} D\,ds,$$

so

$$\left(e^{\mu t} r_0\right)(x) = r_0(x + t) + \frac{1}{2}\left(\|S(x + t)\|^2 - \|S(x)\|^2\right).$$

The vector fields $\sigma_i^{(k)}$ are constant, so the corresponding ODEs are trivial. We have

$$e^{\sigma_i^{(k)} t} r_0 = r_0 + \sigma_i^{(k)} t.$$
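As a sanity check on the closed form for $e^{\mu t} r_0$, one can verify numerically that it satisfies $\partial_t r = \partial_x r + D$. The sketch below is illustrative only: the exponential volatility, the parameter values and the initial curve `r0` are hypothetical choices, and the derivatives are approximated by central differences rather than computed exactly.

```python
import math

SIGMA, A = 0.02, 0.5  # hypothetical volatility parameters, sigma(x) = SIGMA * exp(-A x)


def sigma(x):
    return SIGMA * math.exp(-A * x)


def S(x):
    # S(x) = integral of sigma from 0 to x
    return SIGMA / A * (1.0 - math.exp(-A * x))


def D(x):
    # D(x) = sigma(x) * S(x) = (1/2) d/dx S(x)^2
    return sigma(x) * S(x)


def r0(x):
    # hypothetical smooth initial forward rate curve
    return 0.03 + 0.01 * math.exp(-x)


def flow(t, x):
    # candidate solution (e^{mu t} r0)(x)
    return r0(x + t) + 0.5 * (S(x + t) ** 2 - S(x) ** 2)


def residual(t, x, h=1e-5):
    # central-difference approximation of d/dt r - d/dx r - D
    dt = (flow(t + h, x) - flow(t - h, x)) / (2 * h)
    dx = (flow(t, x + h) - flow(t, x - h)) / (2 * h)
    return dt - dx - D(x)


max_residual = max(
    abs(residual(t, x)) for t in (0.1, 1.0, 3.0) for x in (0.0, 0.5, 2.0, 10.0)
)
```

The residual is zero up to discretization and rounding error, which is what the linear ODE $dr/dt = Fr + D$ requires.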

We thus have the following results on the parameterization of invariant manifolds. For a given mapping $G : R^n \to H$, we write $G(z)(x)$ or $G(z, x)$ to denote the function $G(z) \in H$ evaluated at $x \in R_+$.

Proposition 4.5 The invariant manifold generated by the initial forward rate curve $r_0$ is parameterized as

$$G(z_0, z_i^k;\ i = 1, \dots, m;\ k = 0, \dots, n_i)(x) = r_0(x + z_0) + \frac{1}{2}\left(\|S(x + z_0)\|^2 - \|S(x)\|^2\right) + \sum_{i=1}^m \sum_{k=0}^{n_i} \sigma_i^{(k)}(x) z_i^k.$$

If the $k$-dimensional manifold $\mathcal{G}_0$ is transversal to $\{\mu, \sigma\}_{LA}$ and parameterized by $G_0(y_1, \dots, y_k)$, then the minimal consistent (i.e. invariant) extension is parameterized as

$$G(y_1, \dots, y_k, z_0, z_i^k;\ i = 1, \dots, m;\ k = 0, \dots, n_i)(x) = G_0(y_1, \dots, y_k)(x + z_0) + \frac{1}{2}\left(\|S(x + z_0)\|^2 - \|S(x)\|^2\right) + \sum_{i=1}^m \sum_{k=0}^{n_i} \sigma_i^{(k)}(x) z_i^k.$$

Note that if $\mathcal{G}_0$ is invariant under shift in the $x$-variable (this is in fact the typical case), then a simpler parameterization of $\mathcal{G}$ is given by

$$G(y_1, \dots, y_k, z_0, z_i^k;\ i = 1, \dots, m;\ k = 0, \dots, n_i)(x) = G_0(y_1, \dots, y_k)(x) + \frac{1}{2}\left(\|S(x + z_0)\|^2 - \|S(x)\|^2\right) + \sum_{i=1}^m \sum_{k=0}^{n_i} \sigma_i^{(k)}(x) z_i^k.$$

As a concrete application let us consider the simple case when $m = 1$ and

$$\sigma(x) = \sigma e^{-ax},$$

where, with a slight abuse of notation, $a$ and $\sigma$ denote positive constants. As is well known, this is the HJM formulation of the Hull-White extension of the Vasicek model [28], [36]. In this case we have

$$S(x) = \frac{\sigma}{a}\left[1 - e^{-ax}\right].$$

The relevant function space

$$\mathrm{span}\{F^n\sigma;\ n \geq 0\} = \mathrm{span}\left\{\frac{\partial^n}{\partial x^n} e^{-ax};\ n \geq 0\right\}$$

is obviously one-dimensional and spanned by the single function $e^{-ax}$, so the Lie algebra is two-dimensional.

As the given manifold $\mathcal{G}_0$ we take the Nelson-Siegel ([34]) family of forward rate curves, parameterized as

$$G_0(y_1, \dots, y_4)(x) = y_1 + y_2 e^{-y_4 x} + y_3 x e^{-y_4 x}.$$

This family is obviously invariant under shift in $x$, so we have the following result.

Proposition 4.6 For a given initial forward rate curve $r_0$, the invariant manifold generated by the Hull-White extended Vasicek model is parameterized by

$$G(z_0, z_1)(x) = r_0(x + z_0) + e^{-ax} \frac{\sigma^2}{a^2}\left[1 - e^{-az_0}\right] - e^{-2ax} \frac{\sigma^2}{2a^2}\left[1 - e^{-2az_0}\right] + z_1 e^{-ax}.$$

The minimal extension of the NS family consistent with the Hull-White extended Vasicek model is parameterized by

$$G(z_0, z_1, y_1, \dots, y_4)(x) = y_1 + y_2 e^{-y_4 x} + y_3 x e^{-y_4 x} + e^{-ax} \frac{\sigma^2}{a^2}\left[1 - e^{-az_0}\right] - e^{-2ax} \frac{\sigma^2}{2a^2}\left[1 - e^{-2az_0}\right] + z_1 e^{-ax}.$$
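The closed form in Proposition 4.6 is the general formula of Proposition 4.5 with $S(x) = \frac{\sigma}{a}[1 - e^{-ax}]$ substituted. A small numerical comparison, sketched below, makes the substitution easy to check. The function names, the parameter values and the sample initial curve are assumptions for illustration only.

```python
import math


def general_form(r0, sigma, a, z0, z1, x):
    """Proposition 4.5 specialised to m = 1, n_1 = 0 and sigma(x) = sigma * exp(-a x)."""
    def S(u):
        return sigma / a * (1.0 - math.exp(-a * u))
    return r0(x + z0) + 0.5 * (S(x + z0) ** 2 - S(x) ** 2) + z1 * math.exp(-a * x)


def closed_form(r0, sigma, a, z0, z1, x):
    """Explicit expression stated in Proposition 4.6."""
    return (
        r0(x + z0)
        + math.exp(-a * x) * sigma ** 2 / a ** 2 * (1.0 - math.exp(-a * z0))
        - math.exp(-2 * a * x) * sigma ** 2 / (2 * a ** 2) * (1.0 - math.exp(-2 * a * z0))
        + z1 * math.exp(-a * x)
    )


def sample_curve(x):
    # hypothetical initial forward rate curve
    return 0.03 + 0.01 * math.exp(-x)
```

Evaluating both functions on a grid of `(z0, x)` values gives agreement up to floating point rounding, since $\frac{1}{2}(S(x+z_0)^2 - S(x)^2)$ expands to exactly the two exponential terms in the proposition.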

Constant Direction Volatility

We go on to study the most natural extension of the deterministic volatility case, namely the case when the volatility is of the form

$$\sigma(r, x) = \varphi(r)\lambda(x). \tag{117}$$

We restrict ourselves to the case of a scalar Wiener process. In this case the individual vector field $\sigma$ has the constant direction $\lambda \in H$, but is of varying length, determined by $\varphi$, where $\varphi$ is allowed to be any smooth functional of the entire forward rate curve. In order to avoid trivialities we make the following assumption.

Assumption 4.1 We assume that $\varphi(r) \neq 0$ for all $r \in H$.

After a simple calculation the drift vector $\mu$ turns out to be

$$\mu(r) = Fr + \varphi^2(r) D - \frac{1}{2}\varphi'(r)[\lambda]\varphi(r)\lambda, \tag{118}$$

where $\varphi'(r)[\lambda]$ denotes the Fréchet derivative $\varphi'(r)$ acting on the vector $\lambda$, and where the constant vector $D \in H$ is given by

$$D(x) = \lambda(x) \int_0^x \lambda(s)\,ds.$$

We now want to know under what conditions on $\varphi$ and $\lambda$ we have a finite dimensional realization, i.e. when the Lie algebra generated by

$$\mu(r) = Fr + \varphi^2(r) D - \frac{1}{2}\varphi'(r)[\lambda]\varphi(r)\lambda,$$

$$\sigma(r) = \varphi(r)\lambda,$$

is finite dimensional. Under Assumption 4.1 we can use Lemma 4.1 to see that the Lie algebra is in fact generated by the simpler system of vector fields

$$f_0(r) = Fr + \Phi(r) D,$$

$$f_1(r) = \lambda,$$

where we have used the notation

$$\Phi(r) = \varphi^2(r).$$

Since the field $f_1$ is constant, it has zero Fréchet derivative. Thus the first Lie bracket is easily computed as

$$[f_0, f_1](r) = F\lambda + \Phi'(r)[\lambda] D.$$

The next bracket to compute is $[[f_0, f_1], f_1]$, which is given by

$$[[f_0, f_1], f_1] = \Phi''(r)[\lambda; \lambda] D.$$

Note that $\Phi''(r)[\lambda; \lambda]$ is the second order Fréchet derivative of $\Phi$ operating on the vector pair $[\lambda; \lambda]$. This pair is to be distinguished from (notice the semicolon) the Lie bracket $[\lambda, \lambda]$ (with a comma), which of course would be equal to zero. We now make a further assumption.

Assumption 4.2 We assume that $\Phi''(r)[\lambda; \lambda] \neq 0$ for all $r \in H$.

Given this assumption we may again use Lemma 4.1 to see that the Lie algebra is generated by the following vector fields

$$f_0(r) = Fr,$$

$$f_1(r) = \lambda,$$

$$f_2(r) = F\lambda,$$

$$f_3(r) = D.$$

Of these vector fields, all but $f_0$ are constant, so all brackets are easy. After elementary calculations we see that in fact

$$\{\mu, \sigma\}_{LA} = \mathrm{span}\{Fr, F^n\lambda, F^n D;\ n = 0, 1, \dots\}.$$

From this expression it follows immediately that a necessary condition for the Lie algebra to be finite dimensional is that the vector space spanned by $\{F^n\lambda;\ n \geq 0\}$ is finite dimensional. This occurs if and only if $\lambda$ is quasi-exponential (see Remark 2.2). If, on the other hand, $\lambda$ is quasi-exponential, then we know from Lemma 2.1 that $D$ is also quasi-exponential, since it is the integral of the QE function $\lambda$ multiplied by the QE function $\lambda$. Thus the space $\{F^n D;\ n = 0, 1, \dots\}$ is also finite dimensional, and we have proved the following result.

Proposition 4.7 Under Assumptions 4.1 and 4.2, the interest rate model with volatility given by $\sigma(r, x) = \varphi(r)\lambda(x)$ has a finite dimensional realization if and only if $\lambda$ is a quasi-exponential function. The scalar field $\varphi$ is allowed to be any smooth field.

When is the Short Rate a Markov Process?

One of the classical problems concerning the HJM approach to interest rate modeling is that of determining when a given forward rate model is realized by a short rate model, i.e. when the short rate is Markovian. We now briefly indicate how the theory developed above can be used in order to analyze this question. For the full theory see [9].

Using the results above, we immediately have the following general necessary condition.

Proposition 4.8 The forward rate model generated by $\sigma$ is a generic short rate model, i.e. the short rate is generically a Markov process, only if

$$\dim \{\mu, \sigma\}_{LA} \leq 2. \tag{119}$$

Proof. If the model is really a short rate model, then bond prices are given as $p_t(x) = F(t, R_t, x)$ where $F$ solves the term structure PDE. Thus bond prices, and forward rates, are generated by a two dimensional factor model with time $t$ and the short rate $R$ as the state variables.

Remark 4.3 The most natural case is clearly $\dim \{\mu, \sigma\}_{LA} = 2$. However, it is an open problem whether there exists a non-deterministic generic short rate model with $\dim \{\mu, \sigma\}_{LA} = 1$.

Note that condition (119) is only a necessary condition for the existence of a short rate realization. It guarantees that there exists a two-dimensional realization, but the question remains whether the realization can be chosen in such a way that the short rate and running time are the state variables. This question is completely resolved by the following central result.

Theorem 4.3 Assume that the model is not deterministic, and take as given a time invariant volatility $\sigma(r, x)$. Then there exists a short rate realization if and only if the vector fields $[\mu, \sigma]$ and $\sigma$ are parallel, i.e. if and only if there exists a scalar field $\alpha(r)$ such that the following relation holds (locally) for all $r$.

$$[\mu, \sigma](r) = \alpha(r)\sigma(r). \tag{120}$$

Proof. See [9].

It turns out that the class of generic short rate models is very small indeed. We have, in fact, the following result, which was first proved in [31] (using techniques different from those above). See [9] for a proof based on Theorem 4.3.

Theorem 4.4 Consider an HJM model with one driving Wiener process and a volatility structure of the form

$$\sigma(r, x) = g(R, x),$$

where $R = r(0)$ is the short rate. Then the model is a generic short rate model if and only if $g$ has one of the following forms.

• There exists a constant $c$ such that

$$g(R, x) \equiv c.$$

• There exist constants $a$ and $c$ such that

$$g(R, x) = c e^{-ax}.$$

• There exist constants $a$ and $b$, and a function $\alpha(x)$, where $\alpha$ satisfies a certain Riccati equation, such that

$$g(R, x) = \alpha(x)\sqrt{aR + b}.$$

We immediately recognize these cases as the Ho-Lee model, the Hull-White extended Vasicek model, and the Hull-White extended Cox-Ingersoll-Ross model. Thus, in this sense the only generic short rate models are the affine ones, and the moral of this, perhaps somewhat surprising, result is that most short rate models considered in the literature are not generic but "accidental". To understand the geometric picture one can think of the following program.

1. Choose an arbitrary short rate model, say of the form

$$dR_t = a(R_t)\,dt + b(R_t)\,dW_t$$

with a fixed initial point $R_0$.

2. Solve the associated PDE in order to compute bond prices. This will also produce:

• An initial forward rate curve $\hat{r}^o(x)$.

• Forward rate volatilities of the form $g(R, x)$.

3. Forget about the underlying short rate model, and take the forward rate volatility structure $g(R, x)$ as given in the forward rate equation.

4. Initiate the forward rate equation with an arbitrary initial forward rate curve $r^o(x)$.

The question is now whether the forward rate model thus constructed will produce a Markovian short rate process. Obviously, if you choose the initial forward rate curve $r^o$ as $r^o = \hat{r}^o$, then you are back where you started, and everything is OK. If, however, you choose an initial forward rate curve other than $\hat{r}^o$, say the observed forward rate curve of today, then it is no longer clear that the short rate will be Markovian. What the theorem above says is that only the models listed above will produce a Markovian short rate model for all initial points in a neighborhood of $\hat{r}^o$. If you take another model (like, say, the Dothan model) then a generic choice of the initial forward rate curve will produce a short rate process which is not Markovian.
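For deterministic volatilities the parallelism condition (120) of Theorem 4.3 can be checked by hand, using the bracket $[\mu, \sigma] = F\sigma$ computed earlier for the constant volatility case. The following short verification is a sketch added for illustration; it covers only the first two cases of Theorem 4.4:

```latex
\begin{align*}
\text{Ho-Lee:}\quad \sigma(x) = c
  &\;\Longrightarrow\; [\mu, \sigma] = F\sigma = 0 = 0 \cdot \sigma, \\
\text{Hull-White:}\quad \sigma(x) = c\,e^{-ax}
  &\;\Longrightarrow\; [\mu, \sigma] = F\sigma = -a\,c\,e^{-ax} = -a\,\sigma .
\end{align*}
```

In both cases (120) holds with a constant scalar field, $\alpha \equiv 0$ and $\alpha \equiv -a$ respectively. Conversely, for a deterministic $\sigma(x)$ the relation $F\sigma = \alpha\sigma$ with $\alpha$ independent of $x$ forces $\sigma(x) = c\,e^{\alpha x}$, which matches the first two forms listed in Theorem 4.4.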

4.6 Notes

The section is based on [9] where full proofs and further results can be found, and where also the time varying case is considered. In our study of the constant direction model above, $\varphi$ was allowed to be any smooth functional of the entire forward rate curve. The simpler special case when $\varphi$ is a point evaluation of the short rate, i.e. of the form $\varphi(r) = h(r(0))$, has been studied in [1], [29] and [35]. All these cases fall within our present framework, and the results are included as special cases of the general theory above. A different case, treated in [14], occurs when $\sigma$ is a finite point evaluation, i.e. when $\sigma(t, r) = h(t, r(x_1), \dots, r(x_k))$ for fixed benchmark maturities $x_1, \dots, x_k$. In [14] it is studied when the corresponding finite set of benchmark forward rates is Markovian.

A classic paper on Markovian short rates is [13], where a deterministic volatility of the form $\sigma(t, x)$ is considered. Theorem 4.4 was first stated and proved in [31]. See [18] for an example with a driving Lévy process.

The geometric ideas presented above and in [9] are intimately connected to controllability problems in systems theory, where they have been used extensively (see [30]). They have also been used in filtering theory, where the problem is to find a finite dimensional realization of the unnormalized conditional density process, the evolution of which is given by the Zakai equation. See [12] for an overview of these areas.

5 Constructing Realizations

The purpose of this section is to present a systematic procedure for the construction of finite dimensional realizations for any model possessing a finite dimensional realization.

5.1 The Construction Algorithm

The method basically works as follows: From Theorem 4.2 we know that there exists an FDR if and only if the Lie algebra $\{\mu, \sigma\}_{LA}$ is finite dimensional. Given a set of generators for this Lie algebra we now show how to construct an FDR by essentially solving a finite number of ordinary differential equations in Hilbert space. The method will work for any Hilbert space SDE of the form (115) with smooth drift and diffusion vector fields, and in particular it can be applied to the forward rate equation.

Let us assume that the Lie algebra $\{\mu, \sigma\}_{LA}$ is finite dimensional near the point $r^o$. Then a finite dimensional realization can be constructed in the following way:

• Choose an involutive system of independent vector fields $f_1, \dots, f_d$ which span $\{\mu, \sigma\}_{LA}$. Lemma 4.1 is often useful for simplifying the vector fields.

• Compute the invariant manifold $G(z_1, \dots, z_d)$ using Proposition 4.2.

• Since $\mathcal{G}$ is invariant under $r$, we now know that $r_t = G(Z_t)$ for some state process $Z$. We thus make the following Ansatz for the dynamics of the state space variables $Z$

$$dZ_t = a(Z_t)\,dt + b(Z_t) \circ dW_t.$$

• From the Stratonovich version of the Itô formula it then follows that

$$G'a = \mu, \qquad G'b = \sigma, \tag{121}$$

where $G'$ denotes the Fréchet derivative of $G$.

• Use the equations in (121) to solve for the vector fields $a$ and $b$.

Before going on to concrete applications, let us make some remarks.

Remark 5.1

• We know that there will always exist solutions, a and b, to (121).

• It may be that the equations in (121) do not have unique solutions, but for us it is enough to find one solution, and any solution will do.

• Although we have to solve for the Stratonovich dynamics of the state variables, it turns out that the Itô dynamics are typically much nicer looking (see below). This is not surprising since this is also true for the forward rate dynamics themselves.

Again we emphasize that this method can be applied quite mechanically; the only choice to be made is that of vector fields which span the Lie algebra $\{\mu, \sigma\}_{LA}$. Generally we will want to choose these vector fields as simple as possible, and to do this we use Lemma 4.1. The reason why we want simple vector fields is that this simplifies the computation of the parameterization of the forward rate curves in the next step (recall that this requires solving $H$-valued ODEs with right hand sides equal to the generating vector fields).

In the next few sections we will apply this scheme repeatedly to various volatilities $\sigma$ and derive finite dimensional realizations.

5.2 Deterministic Volatility

Assume that

$$\sigma(r, x) = \sigma(x), \tag{122}$$

where each component of the vector $\sigma$ is of the following form

$$\sigma_i(x) = \sigma_i \lambda_i(x), \quad i = 1, \dots, m. \tag{123}$$

Here, with a slight abuse of notation, $\sigma_i$ on the right hand side denotes a constant, and $\lambda_i$ is a constant vector field. We know from Proposition 4.4 that the forward rate equation generated by this volatility structure has a finite dimensional realization if and only if

$$\dim\left(\mathrm{span}\{\sigma, F\sigma, F^2\sigma, \dots\}\right) < \infty,$$

where $F$ denotes the operator $\partial/\partial x$. We therefore assume that $\lambda_i$ solves the ODE

$$F^{n_i+1}\lambda_i(x) = \sum_{k=0}^{n_i} c_{i,k} F^k\lambda_i(x), \tag{124}$$

where the $c_{i,k}$ are constants. Since the Lie algebra spanned by $\mu$ and $\sigma$ for this case is given by

$$\{\mu, \sigma\}_{LA} = \mathrm{span}\{\mu, \sigma, F\sigma, F^2\sigma, \dots\},$$

we can choose the following generator system for the Lie algebra

$$\{\mu, \sigma\}_{LA} = \mathrm{span}\{\mu, F^k\lambda_i;\ i = 1, \dots, m;\ k = 0, 1, \dots, n_i\}.$$

The next step in constructing a finite dimensional realization is to compute the invariant manifold $G(z_0, z^i_k;\ i = 1, \dots, m;\ k = 0, 1, \dots, n_i)$. This means computing the operators $\exp\{\mu t\}$ and $\exp\{F^k\lambda_i t\}$, $i = 1, \dots, m$, $k = 0, \dots, n_i$. This has been done in Proposition 4.5, and the invariant manifold generated by the initial forward rate curve $r_0$ is parameterized as

$$G(z_0, z^i_k;\ i = 1, \dots, m;\ k = 0, 1, \dots, n_i)(x) = r_0(x + z_0) + \frac{1}{2}\left(\|S(x + z_0)\|^2 - \|S(x)\|^2\right) + \sum_{i=1}^m \sum_{k=0}^{n_i} F^k\lambda_i(x) z^i_k, \tag{125}$$

where

$$S(x) = \int_0^x \sigma(u)\,du.$$

We now proceed to the last step of the procedure, which is finding the dynamics of the state space variables. This means solving the equations (121). We therefore need the Fréchet derivative $G'$ of $G$. Simple calculations give

$$G'(z_0, z^i_k;\ i = 1, \dots, m;\ k = 0, 1, \dots, n_i) \begin{pmatrix} h_0 \\ h^1_0 \\ h^1_1 \\ \vdots \\ h^m_{n_m} \end{pmatrix}(x) = \frac{\partial}{\partial x} r_0(x + z_0) h_0 + D(x + z_0) h_0 + \sum_{i=1}^m \sum_{k=0}^{n_i} F^k\lambda_i(x) h^i_k,$$

where $D$ is the constant field given by

$$D(x) = \sum_{i=1}^m \sigma_i^2 \lambda_i(x) \int_0^x \lambda_i(u)\,du.$$

Since for this model the Fréchet derivative with respect to $r$ of each component of the volatility is zero, i.e. $\sigma_i'(r, x) = 0$, we obtain the following expression for $\mu$.

$$\mu(r) = Fr + D.$$

If we use that $r = G(z)$ we can obtain an expression for $Fr$, and the equation $G'a = \mu$ then reads

$$\frac{\partial}{\partial x} r_0(x + z_0) a_0 + D(x + z_0) a_0 + \sum_{j=1}^m \sum_{k=0}^{n_j} F^k\lambda_j(x) a^j_k = \frac{\partial}{\partial x} r_0(x + z_0) + D(x + z_0) + \sum_{j=1}^m \sum_{k=0}^{n_j} F^{k+1}\lambda_j(x) z^j_k.$$

Since this equality is to hold for all $x$, and $a$ is not allowed to depend on $x$, it is possible to identify what $a$ must look like. If we recall that $\lambda_j$ solves the ODE defined in (124) we obtain

$$a_0 = 1,$$

$$a^j_0 = c_{j,0} z^j_{n_j}, \quad j = 1, \dots, m,$$

$$a^j_k = c_{j,k} z^j_{n_j} + z^j_{k-1}, \quad j = 1, \dots, m;\ k = 1, \dots, n_j.$$

From $G'b^i(z)(x) = \sigma_i(x)$ we obtain the equation

$$\frac{\partial}{\partial x} r_0(x + z_0) b^i_0 + D(x + z_0) b^i_0 + \sum_{j=1}^m \sum_{k=0}^{n_j} F^k\lambda_j(x) b^i_{jk} = \sigma_i \lambda_i(x),$$

where $\sigma_i$ denotes a constant. Therefore we have that

$$b^i_{jk} = \sigma_i, \quad j = i,\ k = 0,$$

$$b^i_{jk} = 0, \quad \text{all other } j \text{ and } k.$$

From this we see that to each Wiener process there corresponds one state variable which is driven by this, and only this, Wiener process. The dynamics for these state variables are given by

$$dZ^j_0 = c_{j,0} Z^j_{n_j}\,dt + \sigma_j \circ dW^j_t, \quad j = 1, \dots, m.$$

Since $\sigma_j$ is a constant, the Itô dynamics will look the same, and we have thus proved the following proposition.

Proposition 5.1 Given the initial forward rate curve $r_0$, the forward rate system generated by the volatilities described in equations (122) through (124) has a finite dimensional realization given by

$$r_t = G(Z_t),$$

where $G$ was defined in (125) and the dynamics of the state space variables $Z$ are given by

$$dZ_0 = dt,$$

$$dZ^j_0 = c_{j,0} Z^j_{n_j}\,dt + \sigma_j\,dW^j_t, \quad j = 1, \dots, m,$$

$$dZ^j_k = \left(c_{j,k} Z^j_{n_j} + Z^j_{k-1}\right)dt, \quad j = 1, \dots, m;\ k = 1, \dots, n_j.$$

Remark 5.2 Note that the first state space variable represents running time. This will be the case for all realizations derived below.
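For a fixed Wiener process index $j$, the state dynamics in Proposition 5.1 are linear in $(Z^j_0, \dots, Z^j_{n_j})$, with a drift matrix determined by the ODE coefficients $c_{j,k}$ from (124). The sketch below builds that matrix and performs one Euler step. It is an illustration only: the function names are hypothetical, and the Euler discretization is an assumption, not something prescribed by the text.

```python
def drift_matrix(c):
    """Drift matrix A for one block of states, given c = [c_0, ..., c_n].

    Encodes dZ_0 = c_0 Z_n dt + sigma dW and dZ_k = (c_k Z_n + Z_{k-1}) dt for k >= 1.
    """
    n = len(c) - 1
    A = [[0.0] * (n + 1) for _ in range(n + 1)]
    for k in range(n + 1):
        A[k][n] += c[k]          # coupling to the last state Z_n
        if k >= 1:
            A[k][k - 1] += 1.0   # chain term Z_{k-1}
    return A


def euler_step(z, c, sigma_j, dt, dW):
    """One Euler step of the block; only the first state is hit by the noise."""
    A = drift_matrix(c)
    size = len(z)
    drift = [sum(A[k][l] * z[l] for l in range(size)) for k in range(size)]
    new_z = [z[k] + drift[k] * dt for k in range(size)]
    new_z[0] += sigma_j * dW
    return new_z
```

As an example, $\lambda(x) = x e^{-ax}$ satisfies $F^2\lambda = -a^2\lambda - 2aF\lambda$, so `drift_matrix([-a**2, -2*a])` describes the corresponding two-state block. With $\lambda(x) = e^{-ax}$ the block reduces to the single entry $-a$.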

Ho-Lee

As a special case of the deterministic volatilities studied in the previous section consider a volatility given by

$$\sigma(x) = \sigma, \tag{126}$$

where $\sigma$ is a scalar constant, that is we have only one driving Wiener process. In the formalism of the previous paragraph we have $\lambda(x) \equiv 1$, which satisfies the trivial ODE $F\lambda(x) = 0$. A direct application of Proposition 5.1 gives the following result.

Proposition 5.2 Given the initial forward rate curve $r_0$, the forward rate system generated by the volatility of equation (126) has a finite dimensional realization given by

$$r_t = G(Z_t),$$

where $G$ is given by

$$G(z_0, z_1)(x) = r_0(x + z_0) + \sigma^2\left(x z_0 + \frac{1}{2} z_0^2\right) + z_1,$$

and the dynamics of the state space variables $Z$ are given by

$$dZ_0(t) = dt,$$

$$dZ_1(t) = \sigma\,dW_t.$$
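The deterministic term in the Ho-Lee parameterization follows from (125) in one step. As a short worked check, with $\sigma(x) = \sigma$ we have $S(x) = \sigma x$, so that

```latex
\begin{align*}
\frac{1}{2}\left( S(x + z_0)^2 - S(x)^2 \right)
  &= \frac{\sigma^2}{2}\left( (x + z_0)^2 - x^2 \right) \\
  &= \sigma^2 \left( x z_0 + \frac{1}{2} z_0^2 \right).
\end{align*}
```

Since $\lambda \equiv 1$ satisfies $F\lambda = 0$, the ODE (124) holds with $n = 0$ and $c_0 = 0$, which is why the state variable $Z_1$ has no drift.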

Hull-White

Another special case of deterministic volatilities is

$$\sigma(x) = \sigma e^{-cx}, \tag{127}$$

where $\sigma$ and $c$ are scalar constants, so again there is only one driving Wiener process. This time we have $\lambda(x) = e^{-cx}$, which satisfies the ordinary differential equation $F\lambda(x) = -c\lambda(x)$. Applying Proposition 5.1 once more we obtain the following.

Proposition 5.3 Given the initial forward rate curve $r_0$, the forward rate system generated by the volatility of equation (127) has a finite dimensional realization given by

$$r_t = G(Z_t),$$

where $G$ is given by

$$G(z_0, z_1)(x) = r_0(x + z_0) + \frac{\sigma^2}{c^2}\left(e^{-cx}\left(1 - e^{-cz_0}\right) + \frac{e^{-2cx}}{2}\left(e^{-2cz_0} - 1\right)\right) + e^{-cx} z_1,$$

and the dynamics of the state space variables $Z$ are given by

$$dZ_0(t) = dt,$$

$$dZ_1(t) = -cZ_1(t)\,dt + \sigma\,dW_t.$$
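The two-factor Hull-White realization is easy to simulate. The sketch below is illustrative and rests on several assumptions: the names `make_G` and `simulate_state` are hypothetical, the Euler scheme is one possible discretization, and the last term of $G$ is taken as $e^{-cx} z_1$, following the general formula (125) with $\lambda(x) = e^{-cx}$.

```python
import math
import random


def make_G(r0, sigma, c):
    """Return G(z0, z1, x) for the volatility sigma * exp(-c x) and initial curve r0."""
    def G(z0, z1, x):
        deterministic = sigma ** 2 / c ** 2 * (
            math.exp(-c * x) * (1.0 - math.exp(-c * z0))
            + 0.5 * math.exp(-2 * c * x) * (math.exp(-2 * c * z0) - 1.0)
        )
        # last term: lambda(x) * z1 with lambda(x) = exp(-c x), as in (125)
        return r0(x + z0) + deterministic + math.exp(-c * x) * z1
    return G


def simulate_state(sigma, c, horizon, steps, rng):
    """Euler scheme for dZ0 = dt, dZ1 = -c Z1 dt + sigma dW, started at (0, 0)."""
    dt = horizon / steps
    z0, z1 = 0.0, 0.0
    for _ in range(steps):
        dW = rng.gauss(0.0, math.sqrt(dt))
        z1 += -c * z1 * dt + sigma * dW
        z0 += dt
    return z0, z1


def example_curve(x):
    # hypothetical initial forward rate curve
    return 0.03 + 0.01 * math.exp(-x)
```

A typical use is `z0, z1 = simulate_state(0.02, 0.5, 1.0, 250, random.Random(0))` followed by `make_G(example_curve, 0.02, 0.5)(z0, z1, x)` for the maturities `x` of interest. The whole forward rate curve at time $t$ is then recovered from two numbers, the running time and one Ornstein-Uhlenbeck state.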

5.3 Deterministic Direction Volatility

Consider a volatility structure of the form

σ(r, x) = ϕ(r)λ(x). (128)

Here $\varphi$ is a smooth functional of $r$, and $\lambda$ is a constant vector field. Note that we are now dealing with the case with only one driving Wiener process. Depending on whether $\varphi$ satisfies a certain non-degeneracy condition or not we get two cases. We next study these two cases separately.

The Generic Case

In the generic case ϕ satisfies the following assumption.

Assumption 5.1 We assume that

• $\varphi(r) \neq 0$ for all $r \in H$.

• $\Phi''(r)[\lambda; \lambda] \neq 0$ for all $r \in H$, where $\Phi(r) = \varphi^2(r)$ and $\Phi''(r)[\lambda; \lambda]$ denotes the second order Fréchet derivative of $\Phi$ operating on $[\lambda; \lambda]$.

Given these assumptions, Proposition 6.1 in [9] states that the system of forward rates generated by the volatility (128) possesses a finite dimensional realization if and only if $\lambda$ is a quasi-exponential function, i.e. of the form $\lambda(x) = c e^{Ax} b$, where $c$ is a row vector, $A$ is a square matrix and $b$ is a column vector. We will therefore assume that $\lambda$ is of the form

$$\lambda(x) = p(x) e^{\alpha x}, \tag{129}$$

where $p$ is a polynomial of degree $n$ and $\alpha$ is a scalar constant.

It is also shown in [9] that, given Assumption 5.1, the Lie algebra generated by $\mu$ and $\sigma$ is given by

$$\{\mu, \sigma\}_{LA} = \mathrm{span}\{Fr, F^i\lambda, F^i D;\ i = 0, 1, \dots\},$$

where

$$D(x) = \lambda(x) \int_0^x \lambda(u)\,du. \tag{130}$$

We may now note that $\lambda$, regardless of what $p$ looks like, satisfies the following ODE of order $n + 1$

$$(F - \alpha)^{n+1}\lambda(x) = 0.$$

This can also be written in the following way

$$F^{n+1}\lambda(x) = -\sum_{i=0}^{n} \binom{n+1}{i} (-\alpha)^{n+1-i} F^i\lambda(x). \tag{131}$$

Partial integration reveals that $D$ can be written as $D(x) = u(x) e^{2\alpha x} + \gamma\lambda(x)$, where $u$ is a polynomial of degree $q = 2n$ and $\gamma$ is a constant. Using Lemma 4.1 we see that we can use $\tilde{D}$ instead of $D$ to generate the Lie algebra, where $\tilde{D}$ is given by

$$\tilde{D}(x) = D(x) - \left[\sum_{i=0}^{n} \left(\frac{-1}{\alpha}\right)^{i+1} F^i p(0)\right] \cdot \lambda(x).$$

Here the sum on the right hand side equals $\gamma$. Therefore $\tilde{D}(x) = u(x) e^{2\alpha x}$ and thus $\tilde{D}$ satisfies the following ODE of order $q + 1$

$$(F - 2\alpha)^{q+1}\tilde{D}(x) = 0,$$

which we can also write as

$$F^{q+1}\tilde{D}(x) = -\sum_{j=0}^{q} \binom{q+1}{j} (-2\alpha)^{q+1-j} F^j\tilde{D}(x). \tag{132}$$

After these considerations we choose the following generator system for the Lie algebra

$$\{\mu, \sigma\}_{LA} = \mathrm{span}\{Fr, F^i\lambda, F^j\tilde{D};\ i = 0, 1, \dots, n;\ j = 0, 1, \dots, q\}.$$

We now turn to the task of finding a parameterization of the invariant manifold $G(z_0, z^1_i, z^2_j;\ i = 0, 1, \dots, n;\ j = 0, 1, \dots, q)$, which amounts to computing the operators $\exp\{Frt\}$, $\exp\{F^i\lambda t\}$, $i = 0, 1, \dots, n$, and $\exp\{F^j\tilde{D} t\}$, $j = 0, 1, \dots, q$. The operator $\exp\{Frt\}$ is obtained as the solution to

$$\frac{dy_t}{dt} = F y_t.$$

This is a linear equation and the solution is

$$y_t = e^{Ft} y_0,$$

which means that

$$(e^{Ft} r_0)(x) = r_0(x + t).$$

Since the rest of the generating fields are constant, the corresponding ODEs are trivial, and we have

$$(e^{F^i\lambda t} r_0)(x) = r_0(x) + F^i\lambda(x)\,t,$$

and

$$(e^{F^j\tilde{D} t} r_0)(x) = r_0(x) + F^j\tilde{D}(x)\,t,$$

respectively. The invariant manifold generated by the initial forward rate curve $r_0$ is thus parameterized as

$$G(z_0, z^1_i, z^2_j;\ i = 0, 1, \dots, n;\ j = 0, 1, \dots, q)(x) = r_0(x + z_0) + \sum_{i=0}^{n} F^i\lambda(x) z^1_i + \sum_{j=0}^{q} F^j\tilde{D}(x) z^2_j. \tag{133}$$

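To obtain the state space dynamics we solve the equations (121). The Fréchet derivative $G'$ of $G$ is given by

$$G'(z_0, z^1_i, z^2_j;\ i = 0, 1, \dots, n;\ j = 0, 1, \dots, q) \begin{pmatrix} h_0 \\ h^1_0 \\ h^1_1 \\ \vdots \\ h^2_q \end{pmatrix}(x) = \frac{\partial}{\partial x} r_0(x + z_0) h_0 + \sum_{i=0}^{n} F^i\lambda(x) h^1_i + \sum_{j=0}^{q} F^j\tilde{D}(x) h^2_j.$$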

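We have the following expression for $\mu$

$$\mu(r) = Fr + \varphi^2(r) D - \frac{1}{2}\varphi'(r)[\lambda]\varphi(r)\lambda,$$

where $D$ was defined in (130). Using that $r = G(z)$, the equation $G'a = \mu$ reads

$$\frac{\partial}{\partial x} r_0(x + z_0) a_0 + \sum_{i=0}^{n} F^i\lambda(x) a^1_i + \sum_{j=0}^{q} F^j\tilde{D}(x) a^2_j$$

$$= \frac{\partial}{\partial x} r_0(x + z_0) + \sum_{i=0}^{n} F^{i+1}\lambda(x) z^1_i + \sum_{j=0}^{q} F^{j+1}\tilde{D}(x) z^2_j + \varphi^2(G(z)) D(x) - \frac{1}{2}\varphi'(G(z))[\lambda]\varphi(G(z))\lambda(x).$$

This equality has to hold for all $x$, and $a$ is not allowed to depend on $x$. This allows us to identify what $a$ must look like. Recall that $\lambda$ solves the ODE defined in (131), and that $\tilde{D}$ solves the ODE in (132). Furthermore, recall that $D(x) = \tilde{D}(x) + \gamma\lambda(x)$, and let

$$c_i = -\binom{n+1}{i} (-\alpha)^{n+1-i} \quad \text{and} \quad d_j = -\binom{q+1}{j} (-2\alpha)^{q+1-j}. \tag{134}$$

We then obtain

$$a_0 = 1,$$

$$a^1_0 = c_0 z^1_n + \gamma\varphi^2(G(z)) - \frac{1}{2}\varphi'(G(z))[\lambda]\varphi(G(z)),$$

$$a^1_i = c_i z^1_n + z^1_{i-1}, \quad i = 1, \dots, n,$$

$$a^2_0 = d_0 z^2_q + \varphi^2(G(z)),$$

$$a^2_j = d_j z^2_q + z^2_{j-1}, \quad j = 1, \dots, q.$$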

From $G'b = \sigma$ we obtain the equation

$$\frac{\partial}{\partial x} r_0(x + z_0) b_0 + \sum_{i=0}^{n} F^i\lambda(x) b^1_i + \sum_{j=0}^{q} F^j\tilde{D}(x) b^2_j = \varphi(G(z))\lambda(x),$$

where we have used that $r = G(z)$. This gives us

$$b_0 = 0,$$

$$b^i_j = \varphi(G(z)), \quad i = 1,\ j = 0,$$

$$b^i_j = 0, \quad \text{all other } i, j.$$

Just as for the case with deterministic volatilities we see that the Wiener process only drives one of the state variables. In Stratonovich form the dynamics of $Z^1_0$ are

$$dZ^1_0 = \left(c_0 Z^1_n + \gamma\varphi^2(G(Z)) - \frac{1}{2}\varphi'(G(Z))[\lambda]\varphi(G(Z))\right)dt + \varphi(G(Z)) \circ dW_t.$$

Changing to Itô dynamics for $Z^1_0$ we have the following proposition.

Proposition 5.4 Given the initial forward rate curve $r_0$, the forward rate system generated by the volatility defined by the equations (128) and (129) has a finite dimensional realization given by

$$r_t = G(Z_t),$$

where $G$ was defined in (133) and the dynamics of the state space variables $Z$ are given by

$$dZ_0 = dt,$$

$$dZ^1_0 = \left[c_0 Z^1_n + \gamma\varphi^2(G(Z))\right]dt + \varphi(G(Z))\,dW_t,$$

$$dZ^1_i = \left(c_i Z^1_n + Z^1_{i-1}\right)dt, \quad i = 1, \dots, n,$$

$$dZ^2_0 = \left[d_0 Z^2_q + \varphi^2(G(Z))\right]dt,$$

$$dZ^2_j = \left(d_j Z^2_q + Z^2_{j-1}\right)dt, \quad j = 1, \dots, q.$$

Here $c_i$ and $d_j$ are given by (134).
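The coefficients in (134) come from expanding $(F - \alpha)^{n+1}$ and $(F - 2\alpha)^{q+1}$ with the binomial theorem. The following sketch computes them and checks, in exact arithmetic, that a function of the form $p(x)e^{\alpha x}$ satisfies the resulting ODE. The helper names and the coefficient-list representation of $p$ are assumptions made for illustration.

```python
from fractions import Fraction
from math import comb


def ode_coefficients(order, root):
    """Coefficients e_0..e_{order-1} such that F^order f = sum_i e_i F^i f
    whenever (F - root)^order f = 0.

    ode_coefficients(n + 1, alpha) gives the c_i of (134),
    ode_coefficients(q + 1, 2 * alpha) gives the d_j.
    """
    return [-comb(order, i) * (-root) ** (order - i) for i in range(order)]


def derivative(coeffs, alpha):
    """Polynomial factor of d/dx [p(x) e^{alpha x}], i.e. p' + alpha * p."""
    n = len(coeffs)
    out = []
    for k in range(n):
        term = alpha * coeffs[k]
        if k + 1 < n:
            term += (k + 1) * coeffs[k + 1]
        out.append(term)
    return out


def check_ode(p_coeffs, alpha):
    """Check F^{n+1} f = sum_i c_i F^i f for f = p(x) e^{alpha x}, deg p <= n."""
    n = len(p_coeffs) - 1
    alpha = Fraction(alpha)
    c = ode_coefficients(n + 1, alpha)
    derivs = [[Fraction(v) for v in p_coeffs]]
    for _ in range(n + 1):
        derivs.append(derivative(derivs[-1], alpha))
    rhs = [sum(c[i] * derivs[i][k] for i in range(n + 1)) for k in range(n + 1)]
    return derivs[n + 1] == rhs
```

For $\lambda(x) = e^{-cx}$ the call `ode_coefficients(1, -c)` returns the single coefficient $-c$, in agreement with $F\lambda = -c\lambda$ in the Hull-White case. The same function applied with root $2\alpha$ and order $q + 1$ gives the $d_j$ for the field $u(x)e^{2\alpha x}$.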

6 The Filipovic and Teichmann Extension

While in one sense the general FDR problem is more or less completely solved using the Lie algebra methodology of [9] described above, we still have a major technical problem to tackle. In the approach above, the framework was that of strong solutions of infinite dimensional SDEs in Hilbert space, and this forced us to construct the particular Hilbert space H of real analytic functions as the space of forward rate curves. While serving reasonably well, it was clear even at an early stage that this particular space was very small, and in particular it was pointed out by Filipovic and Teichmann that the space does not include the forward rate curves generated by the Cox-Ingersoll-Ross model (see [15]). It was therefore necessary to extend the theory to a larger space, but such an extension is far from trivial to carry out, the problem being that on a larger Hilbert space you will lose the smoothness of the differential operator ∂/∂x appearing in the drift term of the forward rate equation.

This problem was overcome with great elegance by Filipovic and Teichmann who, partly building on the geometric and analytic results from [22], in [23] managed to extend the Lie algebraic FDR theory to a much larger space of forward rate curves than the space H considered in [9]. In doing so, Filipovic and Teichmann first extend the space of [9] to a much larger Hilbert space. On the new space, however, the operator ∂/∂x becomes unbounded, so they then change the topology on the space, thus making it into a Fréchet space where the operator in fact is bounded. This approach, however, leads to new problems, since on a Fréchet space there is no easy way of introducing differential calculus; in fact there is not even an obvious way of defining the concept of smoothness, which is necessary in order to have a Frobenius theorem. In order to overcome this problem, Filipovic and Teichmann used the framework of so-called “convenient spaces” developed some ten years ago (see [23] for references) in order to carry out analysis on the enlarged space. The main result of all this is that the Lie algebra conditions obtained by Bjork and Svensson are shown to still hold in this more general setting.

At this point it is worth mentioning that the technical price one has to pay for going into the deep parts of the theory of convenient analysis is quite high. It is therefore fortunate that the Lie algebraic machinery of [23] can be used without going into these (sometimes very hard) technical details. In fact, one of the main results of [23] can be formulated in the following pedestrian terms for the working mathematician: “When you are searching for FDRs for equations of HJM type, you can compute the relevant Lie algebra without worrying about the space, since Filipovic and Teichmann will always provide you with a convenient space to work in”. In [23] and in the follow-up papers [24] and [25], the extended Lie algebra theory of [23] is used to analyze a number of concrete problems concerning the forward rate equation. In particular, Filipovic and Teichmann prove the remarkable result that any forward rate model admitting an FDR must necessarily have an affine term structure.

7 Stochastic Volatility Models

We now extend the theory developed above to include stochastic volatility models. More precisely we will study HJM models of the forward rates in which the volatility, apart from being dependent on the present forward rate curve, also is allowed to be modulated by a k-dimensional hidden Markov process y. The model is defined as follows.

Definition 7.1 The Itô formulation of the stochastic volatility model (henceforth SVM) is defined as the process pair (r, y), where the Q-dynamics of r and y are defined by the following system of SDEs.

$$dr_t(x) = \left\{\frac{\partial}{\partial x}r_t(x) + H\sigma(r_t, y_t, x)\right\}dt + \sigma(r_t, y_t, x)\,dW_t \tag{135}$$

$$dy_t = a^0(y_t)\,dt + b(y_t)\,dW_t, \tag{136}$$

where H is defined by

$$H\sigma(r, y, x) = \sigma(r, y, x)\int_0^x \sigma^{\star}(r, y, s)\,ds, \tag{137}$$

and $\star$ denotes transpose.
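To make the drift term (137) concrete, consider the scalar case m = 1, where the transpose is trivial. The following sketch is an illustration under stated assumptions, not part of the original text: it takes the deterministic volatility σ(x) = σe^{−ax} (so the dependence on r and y is ignored), uses hypothetical parameter values, approximates the integral by the trapezoidal rule, and compares the result with the closed form (σ²/a) e^{−ax}(1 − e^{−ax}) obtained by integrating the exponential by hand. The name `hjm_drift_term` is invented for this example.

```python
import math

def hjm_drift_term(sigma, x, steps=2000):
    """Approximate H sigma(x) = sigma(x) * int_0^x sigma(s) ds (scalar case, trapezoidal rule)."""
    if x == 0.0:
        return 0.0
    h = x / steps
    integral = 0.5 * (sigma(0.0) + sigma(x))
    for k in range(1, steps):
        integral += sigma(k * h)
    return sigma(x) * integral * h

# Hypothetical parameters for the exponential volatility sigma(x) = vol * exp(-a x).
vol, a = 0.02, 0.3

def sigma(s):
    return vol * math.exp(-a * s)

def closed_form(x):
    # sigma(x) * int_0^x vol*exp(-a s) ds = vol^2/a * exp(-a x) * (1 - exp(-a x))
    return vol ** 2 / a * math.exp(-a * x) * (1.0 - math.exp(-a * x))

numeric = hjm_drift_term(sigma, 5.0)
exact = closed_form(5.0)
```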


In this specification we consider the following objects as given a priori:

• The volatility structure σ for the forward rates, i.e. a deterministic mapping

$$\sigma : \mathcal{H} \times R^k \times R_+ \to R^m.$$

• The drift vector field $a^0$ for y, i.e. a deterministic mapping

$$a^0 : R^k \to R^k.$$

(The superscript on $a^0$ will be explained below.)

• The volatility vector field b for y, i.e. a deterministic mapping

$$b : R^k \to M(k, m),$$

where M(k, m) denotes the set of k × m matrices.

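We view σ as a row vector

$$\sigma(r, y, x) = [\sigma_1(r, y, x), \ldots, \sigma_m(r, y, x)],$$

the drift $a^0$ is viewed as a column vector and the volatility b as a matrix:

$$a^0(y) = \begin{bmatrix} a^0_1(y)\\ \vdots\\ a^0_k(y) \end{bmatrix}, \qquad
b(y) = \begin{bmatrix}
b_{11}(y) & b_{12}(y) & \cdots & b_{1m}(y)\\
b_{21}(y) & b_{22}(y) & \cdots & b_{2m}(y)\\
\vdots & \vdots & & \vdots\\
b_{k1}(y) & b_{k2}(y) & \cdots & b_{km}(y)
\end{bmatrix}.$$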

We note in particular that the forward rate volatility σ is allowed to be an arbitrary functional of the entire forward rate curve r, as well as a function of the k-dimensional variable y. We may also view each component of σ as a mapping from $\mathcal{H} \times R^k$ to a space of functions (parameterized by x), and we will in fact assume that each $\sigma_i$, viewed in this way, is a smooth mapping with values in $\mathcal{H}$, i.e.

$$\sigma_i : \mathcal{H} \times R^k \to \mathcal{H}.$$

We make the following regularity assumptions.

Assumption 7.1 From now on we assume that:

• The mappings $\sigma_i : \mathcal{H} \times R^k \to \mathcal{H}$ are smooth for i = 1, . . . , m.

• The mapping $H\sigma : \mathcal{H} \times R^k \to \mathcal{H}$, defined by (137), is smooth.

• The mappings $a^0$ and b are smooth on $R^k$.


In the forward rate dynamics (135) we recognize the drift term as the HJM drift condition, transferred into the Musiela parameterization. Note the particular structure of the equations (135)-(136): the y-process is feeding the drift and diffusion terms of the r-dynamics, but the r-process does not appear in the y-dynamics. Thus the y-process is a Markov process in its own right, but this is not the case for the r-process. The extended process $\hat{r} = (r, y)$ is, however, Markovian.

In many applications it is natural to study not only the full SVM above but also a restricted model, where we forget about the dynamics of y and consider y as a constant parameter. In this way we obtain a parameterized model, and the formal definition is as follows.

Definition 7.2 Consider the SVM defined by (135)-(136) above. For any fixed value of $y \in R^k$, the induced parameterized forward rate model is defined by the dynamics

$$dr^y_t(x) = \left\{\frac{\partial}{\partial x}r^y_t(x) + H\sigma(r^y_t, y, x)\right\}dt + \sigma(r^y_t, y, x)\,dW_t. \tag{138}$$

Note that in the parameterized model, the forward rate process $r^y$ itself is Markovian, whereas this is not the case in the full stochastic volatility model. For ease of reading we will sometimes drop the superscript y.

7.1 Problem Formulation

The basic problem to be discussed is under what conditions the, inherently infinite dimensional, SVM defined above by (135)-(136), with given initial conditions $r_0 = r^0$, $y_0 = y^0$, admits a generic finite dimensional Markovian realization in the sense of Section 4. More precisely we thus want to investigate under what conditions the extended process $\hat{r}_t = (r_t, y_t)$ possesses a local representation of the form

$$\hat{r}_t = G(Z_t), \quad Q\text{-a.s.} \tag{139}$$

where, for some d, Z satisfies a d-dimensional SDE of the form

$$dZ_t = A^0(Z_t)\,dt + B(Z_t)\,dW_t, \qquad Z_0 = z_0, \tag{140}$$

and where G is a smooth map $G : R^d \to \mathcal{H} \times R^k$. The drift and diffusion terms $A^0$ and B are assumed to be smooth and of suitable dimensions.

In a realization of this kind, the objects G, $A^0$, B and $z_0$ will typically depend upon the choice of starting point $(r^0, y^0)$. We recall that the term “generic” above means that we demand that there exists a realization, not only for the given initial point $(r^0, y^0)$, but in fact for all initial points $(r_0, y_0)$ in a neighborhood of $(r^0, y^0)$. When we speak of realizations in the sequel we always intend this to mean generic realizations.

Note that the state process Z above is driven by the same Wiener process as the $\hat{r}$ system, and that the realization above is assumed to hold almost surely and trajectorywise.

We may now formulate some natural problems.

Main problems:

• Find necessary and sufficient conditions for the existence of an FDR for a given stochastic volatility model.

• Assuming the existence of an FDR has been guaranteed, how do you construct it?

• How is the existence of an FDR for the full stochastic volatility model related to the existence of an FDR for the induced parameterized model? More precisely: is the existence of an FDR for the parameterized model necessary and/or sufficient for the existence of an FDR for the full model?

7.2 Test Examples: I.

To illustrate the technique, we now present four simple recurrent test examples. In all cases we assume a scalar driving Wiener process $W^r$ for the forward rates, a scalar y process and a scalar driving Wiener process $W^y$ for the y process. Furthermore we assume that $W^r$ and $W^y$ are independent. To motivate our choice of examples we recall (see [2]) the following well known (non-stochastic) HJM volatilities for the forward rates.

I. Hull-White extended Vasicek:

$$\sigma(r, x) = \sigma e^{-ax}. \tag{141}$$

Here a and the right hand side occurrence of σ are real constants. This HJM model has a short rate realization of the form

$$dR_t = \{\Phi(t) - aR_t\}\,dt + \sigma\,dW_t, \tag{142}$$

where the deterministic function Φ depends on the initial term structure (see [2]). The parameters σ and a are the same as in (141).

II. Hull-White extended Cox-Ingersoll-Ross:

$$\sigma(r, x) = \sigma\sqrt{r(0)}\cdot\lambda(x, \sigma, a). \tag{143}$$

Here a and the right hand side occurrence of σ are real constants, whereas λ is given by

$$\lambda(x, \sigma, a) = -\frac{\partial}{\partial x}\left(\frac{2(e^{\gamma x} - 1)}{(\gamma + a)(e^{\gamma x} - 1) + 2\gamma}\right), \tag{144}$$

where

$$\gamma = \sqrt{a^2 + 2\sigma^2}.$$

Also this HJM model admits a short rate realization, namely

$$dR_t = \{\Phi(t) - aR_t\}\,dt + \sigma\sqrt{R_t}\,dW_t. \tag{145}$$

The role of Φ is as in the extended Vasicek model above.
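The two short rate realizations (142) and (145) are one-dimensional SDEs and are easy to simulate. The sketch below uses a plain Euler-Maruyama discretization; it is only an illustration, and several ingredients are assumptions not taken from the text: the function name `simulate_short_rate`, the constant choice of Φ in the usage lines, the parameter values, and the device of truncating the rate at zero inside the square root so that the CIR step is always defined.

```python
import math
import random

def simulate_short_rate(r0, phi, a, sigma, horizon, steps, cir=False, seed=0):
    """Euler-Maruyama path of dR = (phi(t) - a R) dt + diffusion dW.

    cir=False: constant diffusion sigma, as in (142).
    cir=True : diffusion sigma * sqrt(R), as in (145), with R truncated at 0
               inside the square root (a numerical convenience, not from the text).
    """
    rng = random.Random(seed)
    dt = horizon / steps
    r = r0
    path = [r]
    for k in range(steps):
        dw = rng.gauss(0.0, math.sqrt(dt))
        diffusion = sigma * math.sqrt(max(r, 0.0)) if cir else sigma
        r = r + (phi(k * dt) - a * r) * dt + diffusion * dw
        path.append(r)
    return path

# Hypothetical inputs: constant Phi, chosen only for illustration.
def phi(t):
    return 0.03

vasicek_path = simulate_short_rate(0.01, phi, a=0.5, sigma=0.01, horizon=1.0, steps=1000)
cir_path = simulate_short_rate(0.01, phi, a=0.5, sigma=0.05, horizon=1.0, steps=1000, cir=True)

# Sanity check without noise: the scheme should track the exact ODE solution.
noise_free = simulate_short_rate(0.01, phi, a=0.5, sigma=0.0, horizon=1.0, steps=10000)
exact_noise_free = 0.06 + (0.01 - 0.06) * math.exp(-0.5)
```

In an actual application Φ would be calibrated to the initial term structure, as the text notes; the constant used here only keeps the example self-contained.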

It is now natural to ask if we can extend these models by allowing one or several parameters to be stochastic, and still retain the existence of a finite dimensional realization.

We consider the following extensions of the above volatility structures. In all cases we assume that the scalar y process has dynamics of the form

$$dy_t = a^0(y_t)\,dt + b(y_t)\,dW^y_t,$$

with b(y) ≠ 0 for all y.

1. HW with stochastic a:

$$\sigma(r, y, x) = \sigma e^{-yx} \tag{146}$$

2. HW with stochastic σ:

$$\sigma(r, y, x) = y e^{-ax} \tag{147}$$

3. CIR with stochastic σ:

$$\sigma(r, y, x) = y\sqrt{r(0)}\cdot\lambda(x, y, a) \tag{148}$$

4. CIR with stochastic a:

$$\sigma(r, y, x) = \sigma\sqrt{r(0)}\cdot\lambda(x, \sigma, y) \tag{149}$$
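The four volatility structures (146)-(149) can be written down as ordinary functions once the derivative in (144) is carried out. Differentiating the bracket in (144) by the quotient rule gives λ(x, σ, a) = −4γ²e^{γx} / ((γ + a)(e^{γx} − 1) + 2γ)², and the sketch below uses this closed form and checks it against a finite difference. The function names, the convention of passing the current short rate r(0) as a plain number `short_rate`, and the parameter values are assumptions made for this illustration only.

```python
import math

def cir_bracket(x, sigma, a):
    """The expression inside the x-derivative in (144)."""
    gamma = math.sqrt(a * a + 2.0 * sigma * sigma)
    u = math.exp(gamma * x) - 1.0
    return 2.0 * u / ((gamma + a) * u + 2.0 * gamma)

def cir_lambda(x, sigma, a):
    """lambda(x, sigma, a) of (144), with the derivative taken in closed form."""
    gamma = math.sqrt(a * a + 2.0 * sigma * sigma)
    e = math.exp(gamma * x)
    denom = (gamma + a) * (e - 1.0) + 2.0 * gamma
    return -4.0 * gamma * gamma * e / (denom * denom)

def vol_hw_stochastic_a(y, x, sigma):            # (146)
    return sigma * math.exp(-y * x)

def vol_hw_stochastic_sigma(y, x, a):            # (147)
    return y * math.exp(-a * x)

def vol_cir_stochastic_sigma(short_rate, y, x, a):      # (148)
    return y * math.sqrt(short_rate) * cir_lambda(x, y, a)

def vol_cir_stochastic_a(short_rate, y, x, sigma):      # (149)
    return sigma * math.sqrt(short_rate) * cir_lambda(x, sigma, y)

# Finite-difference check of the closed-form derivative (hypothetical parameters).
x0, s0, a0, h = 1.0, 0.1, 0.3, 1e-5
fd_lambda = -(cir_bracket(x0 + h, s0, a0) - cir_bracket(x0 - h, s0, a0)) / (2.0 * h)
```

Fixing y in any of these functions gives the volatility of the corresponding parameterized model.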

For all these models, the induced parameterized model admits, by construction, an FDR. It is now reasonable to ask if this also holds for the corresponding stochastic volatility models.

7.3 Finite Realizations for General Stochastic Volatility Models

In order to solve the FDR problem for stochastic volatility models we will of course use the Lie algebra theory for the existence of FDRs in Hilbert space, developed in Section 4.


Lie Algebra Conditions for the Existence of an FDR

Our problem is to study the existence of an FDR for a stochastic volatility model of the form

$$dr_t = \mu^0(r_t, y_t)\,dt + \sigma(r_t, y_t)\,dW_t \tag{150}$$

$$dy_t = a^0(y_t)\,dt + b(y_t)\,dW_t. \tag{151}$$

In the particular case of a forward rate model, the drift term is given by

$$\mu^0(r, y, x) = \frac{\partial}{\partial x}r(x) + H\sigma(r, y, x) \tag{152}$$

but none of the results in this section does in fact depend upon this particular structure of $\mu^0$. Therefore we will, for the rest of the section, consider a general abstract stochastic volatility model of the form (150)-(151).

To apply our earlier Lie algebra results to the present situation we proceed in the following way.

• Define the Hilbert space $\hat{\mathcal{H}}$ by $\hat{\mathcal{H}} = \mathcal{H} \times R^k$.

• Define the $\hat{\mathcal{H}}$-valued process $\hat{r}$ by

$$\hat{r}_t = \begin{bmatrix} r_t\\ y_t \end{bmatrix} \tag{153}$$

• Write the dynamics of $\hat{r}$ on Stratonovich form instead of the original Itô form.

• Use the abstract Lie algebraic result from Theorem 4.2 on the process $\hat{r}$.

We will thus view $\hat{r}$ as an infinite dimensional “column vector” process, and we will henceforth always write it on block vector form as above.

The Stratonovich dynamics of $\hat{r}$ are routinely derived as

$$dr_t = \mu(r_t, y_t)\,dt + \sigma(r_t, y_t)\circ dW_t \tag{154}$$

$$dy_t = a(y_t)\,dt + b(y_t)\circ dW_t, \tag{155}$$

where

$$\mu(r, y) = \mu^0(r, y) - \tfrac{1}{2}\sigma_r(r, y)\sigma(r, y) - \tfrac{1}{2}\sigma_y(r, y)b(y) \tag{156}$$

$$a(y) = a^0(y) - \tfrac{1}{2}b_y(y)b(y). \tag{157}$$

Here $\sigma_r$ denotes the partial Fréchet derivative of σ w.r.t. the vector variable r, and similarly for the other terms.
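The correction (157) is easy to evaluate in the simplest finite dimensional situation. The sketch below assumes a scalar y driven by a scalar Wiener process, approximates the derivative of b by a central difference, and tests the formula on the hypothetical linear coefficients a⁰(y) = θy and b(y) = κy, for which (157) gives a(y) = (θ − κ²/2)y. The function name `stratonovich_drift` and the values of θ and κ are assumptions for this illustration.

```python
def stratonovich_drift(a0, b, y, h=1e-5):
    """Scalar version of (157): a(y) = a0(y) - 0.5 * b'(y) * b(y).

    b'(y) is approximated by a central difference with step h.
    """
    db = (b(y + h) - b(y - h)) / (2.0 * h)
    return a0(y) - 0.5 * db * b(y)

# Hypothetical linear coefficients (geometric Brownian motion type dynamics for y).
theta, kappa = 0.1, 0.4

def a0(y):
    return theta * y

def b(y):
    return kappa * y

a_at_2 = stratonovich_drift(a0, b, 2.0)
expected = (theta - 0.5 * kappa ** 2) * 2.0
```

When b is constant the correction vanishes and the Itô and Stratonovich drifts of y coincide.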


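Written as a single equation on $\hat{\mathcal{H}}$ we thus have

$$d\hat{r}_t = \hat{\mu}(\hat{r}_t)\,dt + \hat{\sigma}(\hat{r}_t)\circ dW_t, \tag{158}$$

where $\hat{\mu}$ and $\hat{\sigma}$ are given by

$$\hat{\mu}(r, y) = \begin{bmatrix} \mu(r, y)\\ a(y) \end{bmatrix} \tag{159}$$

$$\hat{\sigma}(r, y) = \left[\hat{\sigma}_1(r, y), \ldots, \hat{\sigma}_m(r, y)\right] \tag{160}$$

Here the vector fields $\hat{\sigma}_1, \ldots, \hat{\sigma}_m$ are defined by

$$\hat{\sigma}_i(r, y) = \begin{bmatrix} \sigma_i(r, y)\\ b_i(y) \end{bmatrix} \tag{161}$$

where $b_i$ is the i-th column of the b matrix.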

We make the following standing regularity assumption which is assumed to hold throughout the entire chapter.

Assumption 7.2 We assume that the dimension (evaluated pointwise) of the Lie algebra

$$\{\hat{\mu}, \hat{\sigma}_1, \ldots, \hat{\sigma}_m\}_{LA} < \infty, \tag{162}$$

is constant in a neighborhood of $\hat{r}^0 \in \hat{\mathcal{H}}$.

Our first general result now follows immediately from Theorem 4.2. Note that when we below speak about the dimension of a Lie algebra, this is always to be interpreted in terms of pointwise evaluation.

Theorem 7.1 Under Assumption 7.2, the stochastic volatility model (150)-(151) will have a generic FDR at the point $\hat{r}^0$ if and only if

$$\dim\{\hat{\mu}, \hat{\sigma}_1, \ldots, \hat{\sigma}_m\}_{LA} < \infty, \tag{163}$$

in a neighborhood of $\hat{r}^0 \in \hat{\mathcal{H}}$.

For simplicity of notation we will often use the shorthand notation $\{\hat{\mu}, \hat{\sigma}\}_{LA}$ for the Lie algebra $\{\hat{\mu}, \hat{\sigma}_1, \ldots, \hat{\sigma}_m\}_{LA}$.

Geometric Intuition

At this level of generality it is hard to obtain more concrete results. As an example: there seems to be no simple result connecting the existence of an FDR for the full model with existence of an FDR for the parameterized model. The geometric intuition behind this is roughly as follows.


• From Proposition 4.1 we know that existence of an FDR for $\hat{r}$ is equivalent to the existence of a finite dimensional invariant manifold in $\hat{\mathcal{H}}$ passing through $\hat{r}^0$.

• If the parameterized model admits a generic FDR then, for every fixed y near $y^0$, there exists an invariant manifold $\mathcal{G}$ in $\mathcal{H}$ through $r^0$. Thus one would perhaps guess that the manifold $\mathcal{G} \times R^k$ would be invariant for $\hat{r}$, thus implying the existence of an FDR for $\hat{r}$.

• However, the manifold $\mathcal{G}$ above will generically depend on y. Writing it as $\mathcal{G}_y$, what may (and generically will) happen is that, as $r_t$ moves around in $\mathcal{H}$, $y_t$ will move in $R^k$ and the family $\{\mathcal{G}_{y_t};\ t \geq 0\}$ may sweep out an infinite dimensional manifold in $\mathcal{H}$. Thus the existence of an FDR for the parameterized model is not sufficient for the existence of an FDR for the full model.

• Conversely, the existence of an FDR for the parameterized model does not even seem to be necessary for the existence of an FDR for the full model. Suppose for example that, for each y, there does not exist an invariant manifold for the parameterized model. This means that the parameterized model does not possess an FDR. Despite this it could well happen that the process $\hat{r}$ does live on a finite dimensional invariant manifold (and thus possesses an FDR). The reason for this is that there could be a subtle interplay between the dynamics of r and y, and in particular one might intuitively expect this interplay to be possible if there is strong correlation between the Wiener process components driving r and y.

• From the argument above we are led to guess that the simplest structural situation occurs when r and y are driven by independent Wiener processes. Since in this case the evolution of y is independent of the present state of r, we may even guess (bravely) that any FDR properties of the full model will be “uniform” w.r.t. y in the sense that the results will not depend much on the particular dynamics of y.

As we shall see below, the intuition outlined above is basically substantiated.

7.4 General Orthogonal Noise Models

Based on the informal arguments in the previous section we now go on to study the case when r and y are driven by independent Wiener processes. We will refer to this type of model as an “orthogonal noise model”. We consider the case of a general SDE in Hilbert space.

Model Specification and Preliminary Results

Assumption 7.3 For the rest of the section we assume that we can write the Wiener process W on block vector form as

$$W_t = \begin{bmatrix} W^r_t\\ W^y_t \end{bmatrix}$$

where $W^r$ and $W^y$ are vector Wiener processes of dimensions $m_r$ and $m_y$ respectively. Furthermore we assume that the (r, y) dynamics are of the particular form

$$dr_t = \mu^0(r_t, y_t)\,dt + \sigma(r_t, y_t)\,dW^r_t \tag{164}$$

$$dy_t = a^0(y_t)\,dt + b(y_t)\,dW^y_t, \tag{165}$$

where the coefficients satisfy suitable smoothness conditions (see Section 7.3).

Under this assumption r and y are driven by orthogonal noise terms, and this leads to an important simplification of the geometric structure of the model.

Lemma 7.1 The Stratonovich formulation of (164)-(165) is given by

$$dr_t = \mu(r_t, y_t)\,dt + \sigma(r_t, y_t)\circ dW^r_t \tag{166}$$

$$dy_t = a(y_t)\,dt + b(y_t)\circ dW^y_t, \tag{167}$$

where

$$\mu(r, y) = \mu^0(r, y) - \tfrac{1}{2}\sigma_r(r, y)\sigma(r, y) \tag{168}$$

$$a(y) = a^0(y) - \tfrac{1}{2}b_y(y)b(y). \tag{169}$$

Proof. In order to find the Stratonovich form of the r dynamics we need to compute

$$d\langle\sigma, W^r\rangle_t = d\sigma(r_t, y_t)\,dW^r_t.$$

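The infinite dimensional Itô formula gives us

$$d\sigma(r_t, y_t) = (dt\text{-terms}) + \sigma_r(r_t, y_t)\sigma(r_t, y_t)\,dW^r_t + \sigma_y(r_t, y_t)b(y_t)\,dW^y_t.$$

We thus obtain

$$d\langle\sigma, W^r\rangle_t = \sigma_r(r_t, y_t)\sigma(r_t, y_t)\,d\langle W^r, W^r\rangle_t + \sigma_y(r_t, y_t)b(y_t)\,d\langle W^y, W^r\rangle_t.$$

Since $W^r$ and $W^y$ are independent this simplifies to

$$d\langle\sigma, W^r\rangle_t = \sigma_r(r_t, y_t)\sigma(r_t, y_t)\,dt.$$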

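In order to see more clearly the geometric structure of the orthogonal noise model we write it on block operator form as

$$d\begin{bmatrix} r_t\\ y_t \end{bmatrix} = \begin{bmatrix} \mu(r_t, y_t)\\ a(y_t) \end{bmatrix}dt + \begin{bmatrix} \sigma(r_t, y_t)\\ 0 \end{bmatrix}\circ dW^r_t + \begin{bmatrix} 0\\ b(y_t) \end{bmatrix}\circ dW^y_t \tag{170}$$

We thus have the following immediate and preliminary result.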


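Proposition 7.1 The orthogonal noise model (164)-(165) admits an FDR if and only if the Lie algebra generated by the vector fields

$$\begin{bmatrix} \mu(r, y)\\ a(y) \end{bmatrix},\ \begin{bmatrix} \sigma_1(r, y)\\ 0 \end{bmatrix}, \ldots, \begin{bmatrix} \sigma_{m_r}(r, y)\\ 0 \end{bmatrix},\ \begin{bmatrix} 0\\ b_1(y) \end{bmatrix}, \ldots, \begin{bmatrix} 0\\ b_{m_y}(y) \end{bmatrix}$$

is finite dimensional at $\hat{r}^0$.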

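More compactly we will often write the generators of the Lie algebra above as $\hat{\mu}$, $\hat{\sigma}$, and $\hat{b}$ where

$$\hat{\mu}(r, y) = \begin{bmatrix} \mu(r, y)\\ a(y) \end{bmatrix}, \quad \hat{\sigma}(r, y) = \begin{bmatrix} \sigma(r, y)\\ 0 \end{bmatrix}, \quad \hat{b}(y) = \begin{bmatrix} 0\\ b(y) \end{bmatrix} \tag{171}$$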

A very useful property of the orthogonal noise model is the simple structure of the Stratonovich formulation of the parameterized model. The proof is trivial.

Lemma 7.2 For the orthogonal noise model (164)-(165), the Itô formulation of the parameterized model is defined by

$$dr_t = \mu^0(r_t, y)\,dt + \sigma(r_t, y)\,dW^r_t, \tag{172}$$

and the Stratonovich formulation of the parameterized model is given by

$$dr_t = \mu(r_t, y)\,dt + \sigma(r_t, y)\circ dW^r_t, \tag{173}$$

with µ defined by (168).

The point of this Lemma is that it shows that, for orthogonal noise models, the operations “restrict to the parameterized model” and “compute the Stratonovich dynamics” commute, i.e. the Stratonovich formulation of the parameterized model is identical to the parameterized version of the Stratonovich formulation of the original model.

In order to obtain easily verifiable necessary and sufficient conditions for the existence of an FDR we will in the next sections introduce some further structural assumptions. In doing this we will have to deal with Lie brackets in several spaces, so we have to clarify some notation.

Definition 7.3 From now on, the following notation is in force:

• For any smooth vector fields $\hat{f}(r, y)$ and $\hat{g}(r, y)$ on $\hat{\mathcal{H}}$, the expression $[\hat{f}, \hat{g}]$ denotes the Lie bracket in $\hat{\mathcal{H}}$.

• For any smooth mapping f(r, y) where $f : \hat{\mathcal{H}} \to \mathcal{H}$ and for any fixed $y \in R^k$, the parameterized vector field $f^y : \mathcal{H} \to \mathcal{H}$ is defined by $f^y(r) = f(r, y)$.

• For any smooth mappings $f, g : \hat{\mathcal{H}} \to \mathcal{H}$, the expression $[f^y, g^y]$ denotes the Lie bracket on $\mathcal{H}$ between $f^y$ and $g^y$. This Lie bracket will sometimes also be denoted by $[f(\cdot, y), g(\cdot, y)]_{\mathcal{H}}$.

• For vector fields c(y) and d(y) on $R^k$, the notation [c, d] denotes the Lie bracket on $R^k$.

Necessary Conditions

It turns out that, in order to obtain easy necessary conditions, a crucial role is played by the geometric relation between the drift vector field a(y) and the Lie algebra on $R^k$ generated by the diffusion vector fields $b_1(y), \ldots, b_{m_y}(y)$.

Our first result relates the stochastic volatility model to the corresponding parameterized model.

Proposition 7.2 Consider the model (164)-(165). Assume that

$$a \in \{b_1, \ldots, b_{m_y}\}_{LA} \tag{174}$$

in a neighborhood of $y^0$. Under this assumption, a necessary condition for the existence of an FDR for the stochastic volatility model is that the corresponding parameterized model

$$dr_t = \mu(r_t, y)\,dt + \sigma(r_t, y)\,dW^r_t \tag{175}$$

admits a generic FDR at $y^0$.

Proof. We assume that the full stochastic volatility model admits an FDR, and we also assume that (174) is satisfied. We now have to show that, under these assumptions, the parameterized model admits an FDR, i.e. that the Lie algebra (on $\mathcal{H}$) of the parameterized model is finite dimensional near $r^0$, for every fixed y near $y^0$. From Lemma 7.2 we know that the Stratonovich formulation of the parameterized model is given by

$$dr_t = \mu(r_t, y)\,dt + \sigma(r_t, y)\circ dW^r_t, \tag{176}$$

which we write as

$$dr_t = \mu^y(r_t)\,dt + \sigma^y(r_t)\circ dW^r_t. \tag{177}$$

Our task is now to show that $\{\mu^y, \sigma^y\}_{LA}$ is finite dimensional near $r^0$ for all y near $y^0$.

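Since we assumed that the full model possessed an FDR we know that the Lie algebra

$$\{\hat{\mu}, \hat{\sigma}, \hat{b}\}_{LA} = \left\{\begin{bmatrix} \mu(r, y)\\ a(y) \end{bmatrix}, \begin{bmatrix} \sigma(r, y)\\ 0 \end{bmatrix}, \begin{bmatrix} 0\\ b(y) \end{bmatrix}\right\}_{LA}$$

is finite dimensional near $\hat{r}^0$. We now have the trivial inclusion

$$\{\hat{b}\}_{LA} \subseteq \{\hat{\mu}, \hat{\sigma}, \hat{b}\}_{LA},$$

and we go on to compute $\{\hat{b}\}_{LA} = \{\hat{b}_1, \ldots, \hat{b}_{m_y}\}_{LA}$. For any i and j, let us thus compute the Lie bracket $[\hat{b}_i, \hat{b}_j]$. We easily obtain the block matrix form for the Fréchet derivative of $\hat{b}_i$ on $\hat{\mathcal{H}}$ as

$$\hat{b}_i'(y) = \begin{bmatrix} 0 & 0\\ 0 & b_i'(y) \end{bmatrix}$$

where $b_i'$ denotes the Fréchet derivative on $R^k$ of the vector field $b_i$. Performing the same calculation for $\hat{b}_j$ we obtain

$$[\hat{b}_i, \hat{b}_j]_{\hat{\mathcal{H}}} = \begin{bmatrix} 0 & 0\\ 0 & b_i' \end{bmatrix}\begin{bmatrix} 0\\ b_j \end{bmatrix} - \begin{bmatrix} 0 & 0\\ 0 & b_j' \end{bmatrix}\begin{bmatrix} 0\\ b_i \end{bmatrix} = \begin{bmatrix} 0\\ [b_i, b_j]_{R^k} \end{bmatrix}.$$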

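Continuing in this way by taking repeated brackets, we see that if $\hat{\beta}$ denotes a generic element of $\{\hat{b}\}_{LA}$ then it has the form

$$\hat{\beta} = \begin{bmatrix} 0\\ \beta \end{bmatrix}$$

where β denotes a generic element of $\{b\}_{LA}$. We can formally write this as

$$\{\hat{b}\}_{LA} = \{\hat{b}_1, \ldots, \hat{b}_{m_y}\}_{LA} = \begin{bmatrix} 0\\ \{b_1, \ldots, b_{m_y}\}_{LA} \end{bmatrix} = \begin{bmatrix} 0\\ \{b\}_{LA} \end{bmatrix}$$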

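We assumed that $a \in \{b\}_{LA}$, so there exist vector fields $c_1(y), \ldots, c_n(y)$ in $\{b\}_{LA}$ and scalar fields $\alpha_1(y), \ldots, \alpha_n(y)$ on $R^k$ such that

$$a(y) = \sum_{i=1}^{n}\alpha_i(y)c_i(y)$$

for all y near $y^0$. Since $\{\hat{b}\}_{LA} \subseteq \{\hat{\mu}, \hat{\sigma}, \hat{b}\}_{LA}$ we see from the above that the vector fields $\hat{c}_1, \ldots, \hat{c}_n$ where

$$\hat{c}_i = \begin{bmatrix} 0\\ c_i(y) \end{bmatrix}$$

all lie in $\{\hat{\mu}, \hat{\sigma}, \hat{b}\}_{LA}$. From [9] we know that we are allowed to perform Gaussian elimination. More precisely, we may replace $\hat{\mu}$ by $\hat{\mu} - \sum_{i=1}^{n}\alpha_i\hat{c}_i$, and we obtain

$$\hat{\mu} - \sum_{i=1}^{n}\alpha_i\hat{c}_i = \begin{bmatrix} \mu\\ a \end{bmatrix} - \sum_{i=1}^{n}\alpha_i\begin{bmatrix} 0\\ c_i \end{bmatrix} = \begin{bmatrix} \mu\\ 0 \end{bmatrix}.$$

From this we see that the Lie algebra $\{\hat{\mu}, \hat{\sigma}, \hat{b}\}_{LA}$ for the full model is in fact generated by the much simpler system $\hat{m}$, $\hat{\sigma}$ and $\hat{b}$ where $\hat{m}$ is defined by

$$\hat{m} = \begin{bmatrix} \mu\\ 0 \end{bmatrix}.$$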

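Since we assumed that $\{\hat{\mu}, \hat{\sigma}, \hat{b}\}_{LA}$ was finite dimensional, the smaller Lie algebra

$$\{\hat{m}, \hat{\sigma}\}_{LA} = \left\{\begin{bmatrix} \mu\\ 0 \end{bmatrix}, \begin{bmatrix} \sigma\\ 0 \end{bmatrix}\right\}_{LA}$$

is necessarily also finite dimensional. In computing this latter Lie algebra we may now argue as for $\{\hat{b}\}_{LA}$ above. Let us, for example, compute the Lie bracket $[\hat{m}, \hat{\sigma}_i]$. We easily obtain

$$[\hat{m}, \hat{\sigma}_i] = \begin{bmatrix} \mu_r & \mu_y\\ 0 & 0 \end{bmatrix}\begin{bmatrix} \sigma_i\\ 0 \end{bmatrix} - \begin{bmatrix} \sigma_{ir} & \sigma_{iy}\\ 0 & 0 \end{bmatrix}\begin{bmatrix} \mu\\ 0 \end{bmatrix} = \begin{bmatrix} \mu_r\sigma_i - \sigma_{ir}\mu\\ 0 \end{bmatrix}$$

where the subindices r and y denote the partial Fréchet derivatives w.r.t. r and y. Now we observe that $\mu_r(r, y)\sigma_i(r, y) - \sigma_{ir}(r, y)\mu(r, y) = [\mu^y, \sigma^y_i](r)$ so we have

$$[\hat{m}, \hat{\sigma}_i](r, y) = \begin{bmatrix} [\mu^y, \sigma^y_i](r)\\ 0 \end{bmatrix},$$

and continuing in this way we obtain

$$\{\hat{m}, \hat{\sigma}\}_{LA}(r, y) = \left\{\begin{bmatrix} \mu\\ 0 \end{bmatrix}, \begin{bmatrix} \sigma\\ 0 \end{bmatrix}\right\}_{LA}(r, y) = \begin{bmatrix} \{\mu^y, \sigma^y\}_{LA}(r)\\ 0 \end{bmatrix}$$

Since $\{\hat{m}, \hat{\sigma}\}_{LA}$ is finite dimensional for all (r, y) near $(r^0, y^0)$ we thus see that $\{\mu^y, \sigma^y\}_{LA}$ has to be finite dimensional near $r^0$ for all y near $y^0$. This however is equivalent to the existence of an FDR for the parameterized model.

We have the following obvious corollary, which seems to be enough for many concrete applications.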

Corollary 7.1 Assume that the Lie algebra generated by b in $R^k$ is full, i.e. that

$$\{b_1, \ldots, b_{m_y}\}_{LA} = R^k. \tag{178}$$

Then, regardless of the form of a, the existence of an FDR for the parameterized model is necessary for the existence of an FDR for the full model. In particular, the assumption above is valid, and thus the conclusion holds, for the following special cases.

• $m_y = k$ and the k × k diffusion matrix b(y) is invertible near $y^0$.

• y is scalar and driven by a scalar Wiener process (i.e. $k = m_y = 1$), and the scalar field b(y) is nonzero near $y^0$.

We now go on to obtain more precise (but still easily verifiable) necessary conditions, and the simplest case is when the diffusion matrix b is square and invertible. Since the multidimensional case is a bit messy we start with the scalar case, and we will in fact use the scalar result in the proof of the multidimensional case.

Proposition 7.3 Assume that y and $W^y$ are scalar and that the (scalar) diffusion term b(y) is nonzero near $y^0$. Then the following conditions are necessary for the existence of an FDR for the full model.

• For every fixed r and y near $(r^0, y^0)$ the partial derivatives of µ(r, y) and $\sigma_i(r, y)$, $i = 1, \ldots, m_r$, w.r.t. y span a finite dimensional space in $\mathcal{H}$, i.e. there exists a finite number N such that for every (r, y)

$$\dim \operatorname{span}\left\{\frac{\partial^n\mu}{\partial y^n}(r, y);\ n = 0, 1, 2, \ldots\right\} \leq N \tag{179}$$

and

$$\dim \operatorname{span}\left\{\frac{\partial^n\sigma_i}{\partial y^n}(r, y);\ n = 0, 1, 2, \ldots\right\} \leq N \tag{180}$$

for every $i = 1, \ldots, m_r$.

• The drift term µ and each volatility component $\sigma_i$ have the form

$$\mu(r, y, x) = \sum_{j=1}^{n_0} c_{0j}(r, y)\lambda_{0j}(r, x) \tag{181}$$

and

$$\sigma_i(r, y, x) = \sum_{j=1}^{n_i} c_{ij}(r, y)\lambda_{ij}(r, x). \tag{182}$$

Proof. In order to obtain necessary conditions we assume that the full model admits an FDR, and for simplicity of notation we assume that $m_r = 1$ (this will not affect the proof). The Lie algebra for the full model is then finite dimensional and it is generated by

$$\hat{\mu} = \begin{bmatrix} \mu(r, y)\\ a(y) \end{bmatrix}, \quad \hat{\sigma} = \begin{bmatrix} \sigma\\ 0 \end{bmatrix}, \quad \hat{b} = \begin{bmatrix} 0\\ b(y) \end{bmatrix}.$$

Since b is scalar and nonzero we can use Gaussian elimination and locally replace $\hat{b}$ by

$$\frac{1}{b(y)}\hat{b}(y) = \begin{bmatrix} 0\\ 1 \end{bmatrix},$$

and, with further elimination, we see that the full Lie algebra is in fact generated by

$$\hat{\mu} = \begin{bmatrix} \mu(r, y)\\ 0 \end{bmatrix}, \quad \hat{\sigma} = \begin{bmatrix} \sigma\\ 0 \end{bmatrix}, \quad \hat{1} = \begin{bmatrix} 0\\ 1 \end{bmatrix}.$$

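We start by proving (180), the proof for (179) being identical. Since the full algebra is finite dimensional, also the smaller Lie algebra generated by $\hat{\sigma}$ and $\hat{1}$ has to be finite dimensional. In particular the space spanned in $\hat{\mathcal{H}}$ by the vector fields

$$\hat{\sigma},\ [\hat{\sigma}, \hat{1}],\ [[\hat{\sigma}, \hat{1}], \hat{1}],\ [[[\hat{\sigma}, \hat{1}], \hat{1}], \hat{1}],\ \ldots$$

obtained by starting with $\hat{\sigma}$ and then taking repeated brackets with $\hat{1}$, has to be finite dimensional at every point (r, y) near $\hat{r}^0$. We can write these vectors more compactly as

$$\mathrm{ad}^0_{\hat{1}}(\hat{\sigma}),\ \mathrm{ad}^1_{\hat{1}}(\hat{\sigma}),\ \mathrm{ad}^2_{\hat{1}}(\hat{\sigma}),\ \ldots$$

where for any vector field $\hat{f}$ the operators $\mathrm{ad}^n_{\hat{f}}$ are defined recursively by

$$\begin{aligned}
\mathrm{ad}^0_{\hat{f}}(\hat{g}) &= \hat{g},\\
\mathrm{ad}^1_{\hat{f}}(\hat{g}) &= [\hat{g}, \hat{f}],\\
\mathrm{ad}^{n+1}_{\hat{f}}(\hat{g}) &= [\mathrm{ad}^n_{\hat{f}}(\hat{g}), \hat{f}].
\end{aligned}$$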

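We easily obtain the Fréchet derivatives of $\hat{\sigma}$ and $\hat{1}$ as

$$\hat{\sigma}' = \begin{bmatrix} \partial_r\sigma & \partial_y\sigma\\ 0 & 0 \end{bmatrix}, \quad \hat{1}' = \begin{bmatrix} 0 & 0\\ 0 & 0 \end{bmatrix},$$

where $\partial_r$ and $\partial_y$ denote the corresponding partial Fréchet derivatives. Thus we have

$$\mathrm{ad}^1_{\hat{1}}(\hat{\sigma}) = [\hat{\sigma}, \hat{1}] = \begin{bmatrix} \partial_r\sigma & \partial_y\sigma\\ 0 & 0 \end{bmatrix}\begin{bmatrix} 0\\ 1 \end{bmatrix} - \begin{bmatrix} 0 & 0\\ 0 & 0 \end{bmatrix}\begin{bmatrix} \sigma\\ 0 \end{bmatrix} = \begin{bmatrix} \partial_y\sigma\\ 0 \end{bmatrix}$$

Similarly we have

$$\left(\mathrm{ad}^1_{\hat{1}}(\hat{\sigma})\right)' = \begin{bmatrix} \partial_r\partial_y\sigma & \partial^2_y\sigma\\ 0 & 0 \end{bmatrix}$$

and thus

$$\mathrm{ad}^2_{\hat{1}}(\hat{\sigma}) = [\mathrm{ad}^1_{\hat{1}}(\hat{\sigma}), \hat{1}] = \begin{bmatrix} \partial_r\partial_y\sigma & \partial^2_y\sigma\\ 0 & 0 \end{bmatrix}\begin{bmatrix} 0\\ 1 \end{bmatrix} - \begin{bmatrix} 0 & 0\\ 0 & 0 \end{bmatrix}\begin{bmatrix} \partial_y\sigma\\ 0 \end{bmatrix} = \begin{bmatrix} \partial^2_y\sigma\\ 0 \end{bmatrix}$$

Continuing this way we see by induction that

$$\mathrm{ad}^n_{\hat{1}}(\hat{\sigma}) = \begin{bmatrix} \partial^n_y\sigma\\ 0 \end{bmatrix}.$$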
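The induction can be checked in a finite dimensional toy version of the setting. In the sketch below the forward rate curve r is replaced by a single real number, so that the extended state (r, y) lives in R², and the brackets are computed by nested central differences with the convention [f, g] = f′g − g′f. The toy volatility σ(r, y) = r e^{−y}, the helper names and the step size are assumptions for this illustration; the infinite dimensional argument of the text is not reproduced.

```python
import math

def jacobian(field, point, h):
    """Central-difference Jacobian: entry [i][j] approximates d field_i / d point_j."""
    base = field(point)
    rows = [[0.0] * len(point) for _ in base]
    for j in range(len(point)):
        up = list(point)
        up[j] += h
        down = list(point)
        down[j] -= h
        f_up, f_down = field(up), field(down)
        for i in range(len(base)):
            rows[i][j] = (f_up[i] - f_down[i]) / (2.0 * h)
    return rows

def lie_bracket(f, g, point, h):
    """[f, g](point) = f'(point) g(point) - g'(point) f(point)."""
    jf, jg = jacobian(f, point, h), jacobian(g, point, h)
    fv, gv = f(point), g(point)
    k = len(point)
    return [sum(jf[i][j] * gv[j] - jg[i][j] * fv[j] for j in range(k))
            for i in range(k)]

def ad(g, f, n, h=1e-4):
    """Return the vector field ad^n_f(g): ad^0 = g, ad^{k+1} = [ad^k, f]."""
    field = g
    for _ in range(n):
        previous = field
        field = lambda p, previous=previous: lie_bracket(previous, f, p, h)
    return field

# Toy fields on R^2 with coordinates p = (r, y); sigma(r, y) = r * exp(-y) is hypothetical.
def sigma_hat(p):
    return [p[0] * math.exp(-p[1]), 0.0]

def one_hat(p):
    return [0.0, 1.0]

point = [1.5, 0.3]
first = ad(sigma_hat, one_hat, 1)(point)    # approximately [d_y sigma, 0]
second = ad(sigma_hat, one_hat, 2)(point)   # approximately [d_y^2 sigma, 0]
```

For this toy volatility every y-derivative is a multiple of e^{−y} times r, so the iterated brackets all point in the same direction and the span stays one dimensional.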

Since, by the argument above, $\{\mathrm{ad}^n_{\hat{1}}(\hat{\sigma})(r, y);\ n \geq 0\}$ span a finite dimensional subspace of $\hat{\mathcal{H}}$ for all (r, y) near $\hat{r}^0$, we thus see that $\{\partial^n_y\sigma(r, y);\ n \geq 0\}$ must span a finite dimensional subspace in $\mathcal{H}$ for all (r, y) near $\hat{r}^0$. We have thus proved (180) for the case when $W^r$ is scalar. The general case is proved by applying the above argument for each component of σ.

We now go on to prove the necessary condition (182) and we will in fact show that (182) follows from (180). Again we carry out a separate argument for each component $\sigma_i$, so without loss of generality we may assume that σ only has a single component (i.e. that $m_r = 1$). Now, if (180) holds and we denote the dimension of the spanned subspace by n + 1, there exist scalar fields $a_j(r, y)$, $j = 0, \ldots, n$, such that we have the following $\mathcal{H}$-valued vector identity holding locally at $\hat{r}^0$

$$\partial^{n+1}_y\sigma(r, y) = \sum_{j=0}^{n} a_j(r, y)\,\partial^j_y\sigma(r, y) \tag{183}$$

We now fix an arbitrary r, and for this fixed r we define the $\mathcal{H}$-vector functions $Z_0(y), Z_1(y), \ldots, Z_n(y)$ by

$$\begin{aligned}
Z_0(y) &= \sigma(r, y),\\
Z_1(y) &= \partial_y\sigma(r, y),\\
&\ \,\vdots\\
Z_n(y) &= \partial^n_y\sigma(r, y),
\end{aligned}$$

and the $\mathcal{H}^{n+1}$-valued block vector function Z(y) by

$$Z(y) = \begin{bmatrix} Z_0(y)\\ Z_1(y)\\ \vdots\\ Z_n(y) \end{bmatrix}$$

The point of this is that we can now write equation (183) as the linear ODE

$$\frac{dZ(y)}{dy} = (A(y) \otimes I)Z(y) \tag{184}$$

where ⊗ denotes the Kronecker product, and the (n + 1) × (n + 1) matrix function A is defined as the companion matrix

$$A(y) = \begin{bmatrix}
0 & 1 & 0 & \cdots & 0\\
0 & 0 & 1 & \cdots & 0\\
\vdots & \vdots & & \ddots & \vdots\\
0 & 0 & 0 & \cdots & 1\\
a_0(y) & a_1(y) & a_2(y) & \cdots & a_n(y)
\end{bmatrix}.$$

As one would perhaps guess, the solution of (184) can be shown to have the representation

$$Z(y) = [\Phi(y, y_0) \otimes I]Z(y_0), \tag{185}$$

where Φ is the transition matrix induced by A. In particular we thus obtain

$$Z_0(y) = \sum_{j=0}^{n} c_j(y)Z_j(y_0)$$

where $c_j(y) = \Phi(y, y_0)_{1,j}$. Recalling that there is a suppressed r and that $Z_j(y_0) = \partial^j_y\sigma(r, y_0)$ we obtain, taking $y_0 = 0$ for notational simplicity,

$$\sigma(r, y) = \sum_{j=0}^{n} c_j(r, y)\,\partial^j_y\sigma(r, 0), \tag{186}$$

which proves (182), with the functions $\partial^j_y\sigma(r, 0)$ playing the role of the $\lambda_{ij}$. The proof for (181) is identical.

In order to state the corresponding multidimensional result we need to introduce some notation.

Definition 7.4 A multi index $\alpha \in Z^k_+$ is any k-vector with nonnegative integer elements. For a multi index $\alpha = (\alpha_1, \ldots, \alpha_k)$ the differential operator $\partial^\alpha_y$ is defined by

$$\partial^\alpha_y = \frac{\partial^{\alpha_1}}{\partial y_1^{\alpha_1}}\frac{\partial^{\alpha_2}}{\partial y_2^{\alpha_2}}\cdots\frac{\partial^{\alpha_k}}{\partial y_k^{\alpha_k}}$$

We can now state the multidimensional version of the proposition above. The crucial assumption needed is that the Lie algebra generated by the diffusion matrix b(y) spans the entire space $R^k$. For the proof see [8].

Proposition 7.4 Assume that the condition

$$\{b_1, \ldots, b_{m_y}\}_{LA} = R^k \tag{187}$$

is satisfied near $y^0$.

Then the following conditions are necessary for the existence of an FDR for the stochastic volatility model.

• For every fixed r and y near $(r^0, y^0)$ the partial derivatives of µ(r, y) and $\sigma_i(r, y)$ w.r.t. y span a uniformly finite dimensional space in $\mathcal{H}$, i.e. there exists a number N such that for every (r, y)

$$\dim \operatorname{span}\left\{\partial^\alpha_y\mu(r, y);\ \alpha \in Z^k_+\right\} \leq N \tag{188}$$

and

$$\dim \operatorname{span}\left\{\partial^\alpha_y\sigma_i(r, y);\ \alpha \in Z^k_+\right\} \leq N \tag{189}$$

for every $i = 1, \ldots, m_r$.

• The drift µ and every volatility component $\sigma_i$ have the form

$$\mu(r, y, x) = \sum_{j=1}^{n_0} c_{0j}(r, y)\lambda_{0j}(r, x), \tag{190}$$

$$\sigma_i(r, y, x) = \sum_{j=1}^{n_i} c_{ij}(r, y)\lambda_{ij}(r, x). \tag{191}$$

Test Examples: II.

We illustrate the necessary conditions obtained so far by studying the test examples(146)-(149) of Section 7.2. By the assumptions of Section 7.2, all three examples arewithin the class of orthogonal noise models. We may thus directly apply Proposition7.2, or (since we have a scalar model) Corollary 7.1 and check whether the corre-sponding parameterized models possess finite dimensional realizations. In all thesecases, however, this test is trivially satisfied since the volatility structures were con-structed directly from HJM models possessing short rate realizations. Thus all themodels pass this necessary conditions.

We now go on to the necessary conditions of Proposition 7.3. From (182) and ocular inspection of the examples above we immediately have the following result.

Proposition 7.5 Assuming a scalar y-process with non zero diffusion term, the stochastic volatilities in (146), (148) and (149) do not admit an FDR.

Thus (146), (148) and (149) are out of the race. In particular we note that there is no stochastic volatility extension of the CIR forward rate volatility for which there exists a finite dimensional realization. In fact, it is easy to see that we have the following stronger result, where we allow both the parameters a and σ to depend upon the process y.

Proposition 7.6 Consider any stochastic volatility extension of the CIR model of the form

$$\sigma(r, y, x) = \sigma(y)\sqrt{r(0)} \cdot \lambda(x, \sigma(y), a(y)) \tag{192}$$

where the functions σ(y) and a(y) are assumed to be non-constant and where the y process is assumed to have non zero diffusion term. Then the stochastic volatility model does not possess an FDR.

It remains to study the volatility structure (147) in more detail, and this will be done below.


Necessary and Sufficient Conditions

In this section we provide necessary and sufficient conditions for the existence of an FDR in the case of an orthogonal noise model, thus improving upon the general results of Theorem 7.1.

We need the following definition.

Definition 7.5 Define, for each y, the parameterized Lie algebra $\mathcal{L}^y$ on H by

$$\mathcal{L}^y = \{\partial^{\alpha}_y \mu^y, \partial^{\alpha}_y \sigma^y_1, \ldots, \partial^{\alpha}_y \sigma^y_{m_r};\ \alpha \in Z^k_+\}_{LA}$$

In this expression $\partial^{\alpha}_y \mu^y$ is, for each fixed y, considered as a (parameterized) vector field on H, and correspondingly for the σ components.

In order to obtain reasonably concrete results we need to assume that the Lie algebra generated by the b matrix is full dimensional, leaving the general case as an open problem.

Proposition 7.7 Assume that

$$\dim \{b_1, \ldots, b_{m_y}\}_{LA} = k. \tag{193}$$

Under this assumption, a necessary and sufficient condition for the existence of an FDR for the stochastic volatility model is that, for each y, we have

$$\dim \mathcal{L}^y < \infty \tag{194}$$

near $r_0$.

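Proof. From Proposition 7.1 we know that there exists an FDR if and only if the Lie algebra $\mathcal{L}$ on H generated by

$$\begin{bmatrix} \mu(r, y) \\ a(y) \end{bmatrix},\ \begin{bmatrix} \sigma_1(r, y) \\ 0 \end{bmatrix}, \ldots, \begin{bmatrix} \sigma_{m_r}(r, y) \\ 0 \end{bmatrix},\ \begin{bmatrix} 0 \\ b_1(y) \end{bmatrix}, \ldots, \begin{bmatrix} 0 \\ b_{m_y}(y) \end{bmatrix}$$

is finite dimensional. Under the assumption (193), and using Gaussian elimination, we see that $\mathcal{L}$ is generated by

$$\begin{bmatrix} \mu(r, y) \\ 0 \end{bmatrix},\ \begin{bmatrix} \sigma_1(r, y) \\ 0 \end{bmatrix}, \ldots, \begin{bmatrix} \sigma_{m_r}(r, y) \\ 0 \end{bmatrix},\ \begin{bmatrix} 0 \\ I_k \end{bmatrix},$$

where $I_k$ is the identity matrix on $R^k$. Using the fact that repeated bracketing of a vector field of the form

$$\begin{bmatrix} f(r, y) \\ 0 \end{bmatrix}$$

with different columns in

$$\begin{bmatrix} 0 \\ I_k \end{bmatrix}$$

will produce a vector field of the form

$$\begin{bmatrix} \partial^{\alpha}_y f(r, y) \\ 0 \end{bmatrix}$$

it now follows that $\mathcal{L}$ is in fact generated by

$$\begin{bmatrix} \partial^{\alpha}_y \mu(r, y) \\ 0 \end{bmatrix},\ \begin{bmatrix} \partial^{\alpha}_y \sigma_1(r, y) \\ 0 \end{bmatrix}, \ldots, \begin{bmatrix} \partial^{\alpha}_y \sigma_{m_r}(r, y) \\ 0 \end{bmatrix},\ \begin{bmatrix} 0 \\ I_k \end{bmatrix};\quad \alpha \in Z^k_+$$

From this it is clear that $\mathcal{L}$ is generated by

$$\begin{bmatrix} \mathcal{L}^y \\ 0 \end{bmatrix},\ \begin{bmatrix} 0 \\ I_k \end{bmatrix};\quad y \in R^k,$$

and the proof is finished if we can show that for each multi index α we have

$$\partial^{\alpha}_y \mathcal{L}^y \subseteq \mathcal{L}^y. \tag{195}$$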

It follows by induction that in order to prove (195) we may WLOG assume that k = 1 (i.e. y is scalar) and that it is in fact enough to prove that

$$\partial_y \mathcal{L}^y \subseteq \mathcal{L}^y. \tag{196}$$

Now, it is easily seen that

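$$\mathcal{L}^y = \bigcup_{k=0}^{\infty} \mathcal{L}^y_k,$$

where

$$\mathcal{L}^y_0 = \operatorname{span}\{\partial^n_y \mu^y, \partial^n_y \sigma^y_1, \ldots, \partial^n_y \sigma^y_{m_r};\ n \ge 0\}$$

$$\mathcal{L}^y_{k+1} = \operatorname{span}\{\mathcal{L}^y_k, [\mathcal{L}^y_k, \mathcal{L}^y_k]\}, \quad k = 0, 1, \ldots$$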

so it is enough to prove that each $\mathcal{L}^y_k$ is invariant under $\partial_y$, and we prove this by induction. The case k = 0 is clear, so assume that

$$\partial_y \mathcal{L}^y_n \subseteq \mathcal{L}^y_n$$

for all n ≤ k. Now fix an arbitrary $f \in \mathcal{L}^y_{k+1}$. We start by considering two cases: the case when $f \in \mathcal{L}^y_k$ and the case when f = [g, h] with $g, h \in \mathcal{L}^y_k$. If $f \in \mathcal{L}^y_k$ then $\partial_y f \in \mathcal{L}^y_k$ by the induction assumption, so $\partial_y f \in \mathcal{L}^y_{k+1}$. If f = [g, h] with $g, h \in \mathcal{L}^y_k$ then an easy calculation shows that

$$\partial_y f = [\partial_y g, h] + [g, \partial_y h]$$

which is in $[\mathcal{L}^y_k, \mathcal{L}^y_k]$ by the induction assumption. Thus also in this case we have $\partial_y f \in \mathcal{L}^y_{k+1}$. A generic $f \in \mathcal{L}^y_{k+1}$ is, by definition, a linear combination of terms of the above type so we are finished.
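The Leibniz-type identity for the bracket used in the induction step can be spelled out as follows. This is only a sketch: it assumes the bracket convention $[g, h] = g'h - h'g$, where the prime denotes the Fréchet derivative with respect to r for fixed y, and it assumes enough smoothness for $\partial_y$ to commute with that derivative. The opposite sign convention for the bracket leads to the same conclusion.

```latex
\begin{align*}
\partial_y [g, h]
  &= \partial_y \bigl( g'h - h'g \bigr) \\
  &= (\partial_y g)'h + g'(\partial_y h) - (\partial_y h)'g - h'(\partial_y g) \\
  &= \bigl\{ (\partial_y g)'h - h'(\partial_y g) \bigr\}
   + \bigl\{ g'(\partial_y h) - (\partial_y h)'g \bigr\} \\
  &= [\partial_y g, h] + [g, \partial_y h].
\end{align*}
```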


A Simple Sufficient Condition

The object of this section is to show that, under some rather restrictive but nontrivial assumptions, it is possible to derive an extremely simple sufficient condition for the existence of an FDR for the full stochastic volatility model in terms of the FDR for the parameterized model. Furthermore, under these assumptions the realization for the full model can be constructed directly, and in a trivial manner, from the realization for the parameterized model.

Assumption 7.4

1. The Ito formulation of the r-dynamics of the stochastic volatility model is of the form

$$dr_t = \mu_0(r_t, y_t)\,dt + \sigma(r_t, y_t)\,dW_t. \tag{197}$$

2. We assume that y is independent of W. Apart from this assumption, the process y is allowed to be an arbitrary semimartingale with values in $R^k$.

3. For any fixed y, the parameterized r-model is assumed to possess an FDR of the form

$$r^y_t = G(Z^y_t), \tag{198}$$

$$dZ^y_t = A(Z^y_t, y)\,dt + B(Z^y_t, y) \circ dW_t, \tag{199}$$

where $Z^y$ is $R^d$ valued and G is a smooth mapping $G : R^d \to H$.

The important part of this assumption is that, for the parameterized model, the parameter y only appears in the $Z^y$ dynamics, but not in the output mapping G. We will discuss the geometric significance of this below, but first we state the result.

Proposition 7.8 Under Assumption 7.4, the stochastic volatility model possesses an FDR, and a concrete realization is in fact given by

$$r_t = G(Z_t), \tag{200}$$

$$dZ_t = A(Z_t, y_t)\,dt + B(Z_t, y_t) \circ dW_t, \tag{201}$$

with G, A and B as in (198)-(199).

Proof. From the independence between y and W it follows that the Stratonovich formulation of the r-dynamics is given by

$$dr_t = \mu(r_t, y_t)\,dt + \sigma(r_t, y_t) \circ dW_t, \tag{202}$$

where

$$\mu(r, y) = \mu_0(r, y) - \frac{1}{2}\sigma_r(r, y)\sigma(r, y).$$


Now let us consider (200)-(201) as an Ansatz. The r-dynamics induced by (200)-(201) are given by

$$dr_t = G'(Z_t)A(Z_t, y_t)\,dt + G'(Z_t)B(Z_t, y_t) \circ dW_t, \tag{203}$$

so it follows that (200)-(201) is a realization of (202) if and only if

$$\mu(r, y) = G_\star A(r, y), \tag{204}$$

$$\sigma(r, y) = G_\star B(r, y). \tag{205}$$

We thus have to prove that (204)-(205) hold, and to this end we use the fact that, by assumption, (198)-(199) is a realization for the parameterized model. The Stratonovich formulation for the parameterized model is easily seen to be given by

$$dr^y_t = \mu(r^y_t, y)\,dt + \sigma(r^y_t, y) \circ dW_t, \tag{206}$$

and the important point here is that this is precisely the parameterized version of the Stratonovich formulation of the original r-dynamics. The $r^y$-dynamics induced by (198)-(199) are given by

$$dr^y_t = G'(Z^y_t)A(Z^y_t, y)\,dt + G'(Z^y_t)B(Z^y_t, y) \circ dW_t, \tag{207}$$

and since this was assumed to be a realization of (206) we thus have

$$\mu(r, y) = G_\star A(r, y),$$

$$\sigma(r, y) = G_\star B(r, y),$$

which was to be proved.

Remark 7.1 If the Stratonovich differential in (199) is replaced by an Ito differential, i.e. by

$$dZ^y_t = A(Z^y_t, y)\,dt + B(Z^y_t, y)\,dW_t,$$

then the conclusion of Proposition 7.8 still holds if the Stratonovich differential in (201) is replaced by an Ito differential, i.e. by

$$dZ_t = A(Z_t, y_t)\,dt + B(Z_t, y_t)\,dW_t.$$

This is useful if the realization of the parameterized model is originally given in Ito form.

This very strong, but also very restrictive, result has a clear and simple geometric interpretation. First, we know from general (orthogonal noise) theory that a necessary condition for an FDR is that the parameterized model possesses an FDR. In general, the realization for the parameterized model will of course be of the form

$$r^y_t = G(Z^y_t, y), \tag{208}$$

$$dZ^y_t = A(Z^y_t, y)\,dt + B(Z^y_t, y) \circ dW_t, \tag{209}$$


where the output function G as well as the drift term A and diffusion term B depend upon y, but in Proposition 7.8 we have assumed that G does not in fact depend on y. To understand the geometric meaning of this assumption we recall from Proposition 4.1 that the parameterized model, for a fixed y, admits an FDR if and only if there exists an invariant manifold $\mathcal{G}^y$ passing through $r_0$, and in the generic case this invariant manifold will of course depend upon y. The relation between $\mathcal{G}^y$ and the realization (208)-(209) is that

$$\mathcal{G}^y = \operatorname{Im} G^y,$$

where the mapping $G^y : R^d \to H$ is defined by $G^y(z) = G(z, y)$. Thus, assuming that G does not depend upon the parameter y is equivalent to assuming that the invariant manifold for the parameterized model passing through $r_0$ does not depend upon y. In that case, denoting the invariant manifold by $\mathcal{G}$, it is of course geometrically obvious that $\mathcal{G} \times R^k$ will be a finite dimensional invariant manifold for the process $(r_t, y_t)$, thus guaranteeing the existence of an FDR for the full model.

Furthermore, it follows from Proposition 4.2 that the invariant manifold $\mathcal{G}^y$ is determined uniquely by the parameterized Lie algebra

$$\mathcal{L}^y = \{\mu^y, \sigma^y_1, \ldots, \sigma^y_{m_r}\}_{LA}, \tag{210}$$

so if $\mathcal{L}^y$ does not depend upon y then neither will G(z, y). We thus have the following result.

Proposition 7.9 Assume that

• The process y is an Rk-valued semimartingale which is independent of W .

• The parameterized model admits an FDR for every fixed y.

• The Lie algebra $\mathcal{L}^y$ defined in (210) does not depend upon the parameter y.

Then the full model will possess an FDR.

We finish this discussion by noticing that for the general Lie algebraic machinery to work it is essential that all processes are Wiener driven. The geometric reason for this is that the Wiener process acts locally in space (the infinitesimal generator is a partial differential operator) and this allows us to analyze the realization problems using differential geometry (i.e. local analysis). It is therefore noteworthy that in the simple situation discussed above in this section, we did not have to assume that y is driven by a Wiener process; it can also have jumps.

An Example

As an application of the results in Section 7.4, we consider the following volatility structure for a standard forward rate model driven by a scalar Wiener process $W^r$,


$$\sigma(r, x) = \varphi(r)e^{-\alpha x}. \tag{211}$$

Here ϕ is assumed to be an arbitrarily chosen smooth scalar field, and α is a positive constant. This is an extension of the model investigated in [35], where an FDR was constructed for the case when ϕ was assumed to be of the particular form $\varphi(r) = g(r(0))$, for some smooth function $g : R \to R$. As was shown in Section 5.3, the model admits an FDR of the following form.

Define the mapping $G : R_+ \times R^2 \to H$ by

$$G(t, z_1, z_2)(x) = r_0(x + t) + z_1 e^{-\alpha x} + z_2 e^{-2\alpha x}. \tag{212}$$

The realization is then given by

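$$r_t(x) = G(t, Z_1(t), Z_2(t))(x), \tag{213}$$

$$dZ_1(t) = \left\{ \frac{1}{\alpha}\varphi^2[G_t] - \alpha Z_1(t) \right\} dt + \varphi[G_t]\,dW^r_t, \tag{214}$$

$$dZ_2(t) = -\left\{ 2\alpha Z_2(t) + \frac{1}{\alpha}\varphi^2[G_t] \right\} dt, \tag{215}$$

where we have used the shorthand notation

$$G_t = G(t, Z_1(t), Z_2(t)).$$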

The important point to notice is that the mapping G in (212) does not involve ϕ. We may now extend the model above to a stochastic volatility model with an arbitrary scalar y-process (assumed to be independent of $W^r$), by defining the volatility structure as

$$\sigma(r, y, x) = \varphi(r, y)e^{-\alpha x}, \tag{216}$$

where ϕ is an arbitrarily chosen scalar field.

By construction, the parameterized model admits an FDR of the form (213)-(215) where G is exactly as above, and where $\varphi[G_t]$ is replaced by $\varphi[G_t, y]$. The point is again that G does not involve y, so it now follows immediately from Proposition 7.8 that a realization for the stochastic volatility model is given by

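$$r_t(x) = G(t, Z_1(t), Z_2(t))(x),$$

$$dZ_1(t) = \left\{ \frac{1}{\alpha}\varphi^2[G_t, y_t] - \alpha Z_1(t) \right\} dt + \varphi[G_t, y_t]\,dW^r_t,$$

$$dZ_2(t) = -\left\{ 2\alpha Z_2(t) + \frac{1}{\alpha}\varphi^2[G_t, y_t] \right\} dt.$$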

Remark 7.2 In this example we have used the Ito dynamics instead of the Stratonovich dynamics. The reason is that the Ito dynamics of the realization are simpler than the Stratonovich dynamics.


7.5 Forward Rate Stochastic Volatility Models

We now go on to apply the general results above to the more concrete case of forward rate models. We recall that the Ito formulation of the stochastic volatility forward rate model is given by

$$dr_t(x) = \left\{ \frac{\partial}{\partial x} r_t(x) + H\sigma(r_t, y_t, x) \right\} dt + \sigma(r_t, y_t, x)\,dW_t \tag{217}$$

$$dy_t = a_0(y_t)\,dt + b(y_t)\,dW_t, \tag{218}$$

where H is defined in (137). On Stratonovich form the model has the form

$$dr_t = \mu(r_t, y_t)\,dt + \sigma(r_t, y_t) \circ dW_t \tag{219}$$

$$dy_t = a(y_t)\,dt + b(y_t) \circ dW_t, \tag{220}$$

where

$$\mu(r, y) = Fr + H\sigma(r, y) - \frac{1}{2}\sigma_r(r, y)\sigma(r, y) - \frac{1}{2}\sigma_y(r, y)b(y) \tag{221}$$

$$a(y) = a_0(y) - \frac{1}{2}b_y(y)b(y). \tag{222}$$

As usual F denotes the operator ∂/∂x, $\sigma_r$ denotes the partial Fréchet derivative of σ w.r.t. the vector variable r and similarly for $\sigma_y$.
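The drift correction (222) is easy to check numerically in the scalar case. The sketch below is illustrative only: the function name `stratonovich_drift`, the finite difference step, and the square-root volatility process with parameters `kappa`, `theta` and `nu` are assumptions introduced here and are not taken from the text.

```python
import math


def stratonovich_drift(a0, b, y, h=1e-5):
    """Return a(y) = a0(y) - 0.5 * b'(y) * b(y) for a scalar y, as in (222).

    b'(y) is approximated by a central finite difference with step h.
    """
    b_prime = (b(y + h) - b(y - h)) / (2.0 * h)
    return a0(y) - 0.5 * b_prime * b(y)


# Hypothetical Ito dynamics, chosen only for illustration:
#   dy = kappa * (theta - y) dt + nu * sqrt(y) dW
kappa, theta, nu = 1.5, 0.04, 0.3


def a0(y):
    return kappa * (theta - y)


def b(y):
    return nu * math.sqrt(y)


y = 0.05
# For this choice b'(y) * b(y) = nu**2 / 2, so the correction is nu**2 / 4.
print(stratonovich_drift(a0, b, y), a0(y) - nu ** 2 / 4)
```

For this particular diffusion term the correction is a constant, which makes the comparison with the closed form straightforward.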

Necessary Conditions for Orthogonal Noise Models

In the orthogonal noise case the model has the following Stratonovich form

$$dr_t = \mu(r_t, y_t)\,dt + \sigma(r_t, y_t) \circ dW^r_t \tag{223}$$

$$dy_t = a(y_t)\,dt + b(y_t) \circ dW^y_t, \tag{224}$$

where

$$\mu(r, y) = Fr + H\sigma(r, y) - \frac{1}{2}\sigma_r(r, y)\sigma(r, y) \tag{225}$$

$$a(y) = a_0(y) - \frac{1}{2}b_y(y)b(y). \tag{226}$$

We now have the following surprisingly restrictive result.

Proposition 7.10 Assume the following:

• The model is an orthogonal noise model.


• The condition

$$\{b_1, \ldots, b_{m_y}\}_{LA} = R^k \tag{227}$$

is satisfied near $y_0$.

Then, a necessary condition for the existence of an FDR is that the volatility structure has the form

$$\sigma_i(r, y, x) = \sum_{j=1}^{N} \varphi_{ij}(r, y)\lambda_j(x), \quad i = 1, \ldots, m_r, \tag{228}$$

where $\lambda_1, \ldots, \lambda_N$ are constant vector fields, and $\varphi_{ij}$ are smooth scalar fields.

Proof. Since we have assumed orthogonal noise, Proposition 7.2 implies that a necessary condition for the existence of an FDR is that the parameterized model admits an FDR. Furthermore, applying Theorem 4.13 of [23] to the parameterized model it follows that the volatility must be of the form

$$\sigma_i(r, y, x) = \sum_{j=1}^{N} \varphi_{ij}(r, y)\lambda_j(y, x). \tag{229}$$

Given this expression, an application of Proposition 7.4 finishes the proof.

Given a volatility structure of the form (228) we now go on to find sufficient conditions for the existence of an FDR.

Sufficient Conditions for the General Noise Models

We now consider a multidimensional forward rate model of the form

$$dr_t = \mu(r_t, y_t)\,dt + \sigma(r_t, y_t) \circ dW_t \tag{230}$$

$$dy_t = a(y_t)\,dt + b(y_t) \circ dW_t, \tag{231}$$

where W is assumed to be m-dimensional, and y is as usual k-dimensional. We will assume that the volatility structure is of the form (228), but we stress the fact that we do not restrict ourselves to the orthogonal noise model.

We recall from Section 2.2 that a real valued function $f : R \to R$ is said to be quasi exponential if it is the solution of a linear ODE with constant coefficients, alternatively that it can be written as

$$f(x) = \sum_i e^{\gamma_i x} + \sum_j e^{\alpha_j x}\left[ p_j(x)\cos(\omega_j x) + q_j(x)\sin(\omega_j x) \right], \tag{232}$$

where $\gamma_i, \alpha_j, \omega_j$ are real numbers, whereas $p_j$ and $q_j$ are real polynomials.
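A function that solves a linear ODE with constant coefficients has derivatives that span a finite dimensional space. The sketch below illustrates this for a restricted family, namely finite sums of terms $c\,x^n e^{\gamma x}$ with real γ (the trigonometric terms in (232) are left out to keep the example short). The dictionary encoding and the function names are assumptions made for this illustration.

```python
from fractions import Fraction


def derivative(f):
    """Differentiate f = sum of c * x**n * exp(g*x), stored as {(g, n): c}."""
    out = {}
    for (g, n), c in f.items():
        out[(g, n)] = out.get((g, n), 0) + c * g  # derivative of exp(g*x)
        if n > 0:
            out[(g, n - 1)] = out.get((g, n - 1), 0) + c * n  # derivative of x**n
    return {key: val for key, val in out.items() if val != 0}


def rank(rows):
    """Rank of a list of equal-length rows, by exact Gaussian elimination."""
    rows = [list(r) for r in rows]
    rk = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            factor = rows[i][col] / rows[rk][col]
            rows[i] = [a - factor * p for a, p in zip(rows[i], rows[rk])]
        rk += 1
    return rk


def derivative_span_dim(f, max_order=10):
    """Dimension of span{f, f', ..., f^(max_order)}."""
    derivs = [f]
    for _ in range(max_order):
        derivs.append(derivative(derivs[-1]))
    basis = sorted({key for d in derivs for key in d})
    rows = [[Fraction(d.get(key, 0)) for key in basis] for d in derivs]
    return rank(rows)


print(derivative_span_dim({(-2, 0): 1}))               # exp(-2x): 1
print(derivative_span_dim({(-1, 1): 1}))               # x*exp(-x): 2
print(derivative_span_dim({(-1, 0): 1, (-2, 0): -1}))  # exp(-x) - exp(-2x): 2
```

The dimension stops growing after finitely many derivatives, which is the property that the Lie algebraic arguments below rely on.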

The main result is as follows.


Proposition 7.11 Consider the model (230)-(231) and assume that the components of σ are of the form

$$\sigma_i(r, y, x) = \sum_{j=1}^{N} \varphi_{ij}(r, y)\lambda_j(x), \quad i = 1, \ldots, m. \tag{233}$$

Under this assumption a sufficient condition for the existence of an FDR is that $\lambda_1(x), \ldots, \lambda_N(x)$ are quasi exponential. The scalar fields $\varphi_{ij}(r, y)$ are allowed to be arbitrary.

Proof. In order to avoid too much messy notation, we give the proof only for the simplified case when

$$\sigma_i(r, y, x) = \varphi_i(r, y)\lambda_i(x).$$

The arguments in the general case are almost identical. Under the given assumption the Stratonovich drift term of r is given by

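$$\mu = Fr + \sum_{i=1}^{m} \Phi_i D_i - \frac{1}{2}\sum_{i=1}^{m} \varphi_{ir}[\lambda_i]\varphi_i\lambda_i - \frac{1}{2}\sum_{i=1}^{m} \varphi_{iy}[b_i]\lambda_i \tag{234}$$

where $b_i$ denotes the i:th column of the matrix b. The Lie algebra $\mathcal{L}$ under study is the one generated by the vector fields

$$\begin{bmatrix} \mu \\ a \end{bmatrix},\ \begin{bmatrix} \varphi_1\lambda_1 \\ b_1 \end{bmatrix}, \ldots, \begin{bmatrix} \varphi_m\lambda_m \\ b_m \end{bmatrix}.$$

Obviously, $\mathcal{L}$ is included in the larger algebra $\mathcal{L}_1$, generated by

$$\begin{bmatrix} \mu \\ 0 \end{bmatrix},\ \begin{bmatrix} \varphi_1\lambda_1 \\ 0 \end{bmatrix}, \ldots, \begin{bmatrix} \varphi_m\lambda_m \\ 0 \end{bmatrix},\ \begin{bmatrix} 0 \\ a \end{bmatrix},\ \begin{bmatrix} 0 \\ b_1 \end{bmatrix}, \ldots, \begin{bmatrix} 0 \\ b_m \end{bmatrix}.$$

Using the structure of µ we can reduce this generator system to

$$\begin{bmatrix} Fr + \sum_{i=1}^{m} \Phi_i D_i \\ 0 \end{bmatrix},\ \begin{bmatrix} \lambda_1 \\ 0 \end{bmatrix}, \ldots, \begin{bmatrix} \lambda_m \\ 0 \end{bmatrix},\ \begin{bmatrix} 0 \\ a \end{bmatrix},\ \begin{bmatrix} 0 \\ b \end{bmatrix}.$$

From this we see that $\mathcal{L}_1$ is included in the algebra $\mathcal{L}_2$, generated by

$$\begin{bmatrix} Fr \\ 0 \end{bmatrix},\ \begin{bmatrix} D_1 \\ 0 \end{bmatrix}, \ldots, \begin{bmatrix} D_m \\ 0 \end{bmatrix},\ \begin{bmatrix} \lambda_1 \\ 0 \end{bmatrix}, \ldots, \begin{bmatrix} \lambda_m \\ 0 \end{bmatrix},\ \begin{bmatrix} 0 \\ a \end{bmatrix},\ \begin{bmatrix} 0 \\ b \end{bmatrix}.$$

As in Section 4.5 it now follows that $\mathcal{L}_2$ is finite dimensional if and only if $\lambda_1, \ldots, \lambda_m$ are quasi exponential.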


The Scalar Case

We finish by a reasonably complete investigation of the most important special case, which occurs when y is scalar, r and y are driven by scalar Wiener processes, and the volatility has the form

$$\sigma(r, y, x) = \varphi(r, y)\lambda(x). \tag{235}$$

Such a model will have the form

$$dr_t(x) = \left\{ Fr_t(x) + \Phi(r, y)D(x) \right\} dt + \varphi(r, y)\lambda(x)\,dW^r_t$$

$$dy_t = a_0(y_t)\,dt + b(y_t)\,dW^y_t,$$

where

$$\Phi(r, y) = \varphi^2(r, y),$$

$$D(x) = \lambda(x)\int_0^x \lambda(s)\,ds.$$
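For the exponential choice $\lambda(x) = e^{-\alpha x}$ used in the examples, the integral in the definition of D can be computed in closed form, giving $D(x) = (e^{-\alpha x} - e^{-2\alpha x})/\alpha$. The following sketch compares this with a numerical evaluation; the value of `alpha`, the number of grid points and the function names are assumptions chosen for illustration.

```python
import math


def D(lam, x, n=2000):
    """D(x) = lam(x) * integral_0^x lam(s) ds, via the composite trapezoidal rule."""
    if x == 0:
        return 0.0
    h = x / n
    integral = 0.5 * (lam(0.0) + lam(x))
    integral += sum(lam(i * h) for i in range(1, n))
    return lam(x) * integral * h


alpha = 0.5  # hypothetical positive constant


def lam(x):
    return math.exp(-alpha * x)


def closed_form(x):
    return (math.exp(-alpha * x) - math.exp(-2.0 * alpha * x)) / alpha


for x in (0.5, 2.0, 10.0):
    print(x, D(lam, x), closed_form(x))
```

Both λ and D are then quasi exponential, which is the situation covered by the sufficient conditions above.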

In order to allow for a correlation, ρ, between $W^r$ and $W^y$ we write them as

$$W^r_t = \rho W^1_t + \sqrt{1 - \rho^2}\,W^2_t,$$

$$W^y_t = W^1_t,$$

where $W^1$ and $W^2$ are independent Wiener processes. We then have the dynamics

$$dr_t = \left\{ Fr_t + \Phi D \right\} dt + \varphi\lambda\rho\,dW^1_t + \varphi\lambda\sqrt{1 - \rho^2}\,dW^2_t$$

$$dy_t = a_0\,dt + b\,dW^1_t.$$
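The decomposition of the two correlated drivers into independent Wiener processes can be checked by simulation. The sketch below generates increments of $W^r$ and $W^y$ from independent increments of $W^1$ and $W^2$ and estimates their correlation; the function names, the seed and the sample size are assumptions of this example.

```python
import math
import random


def correlated_increments(rho, dt, n, seed=0):
    """Return n pairs (dW_r, dW_y) built from independent increments dW1, dW2.

    dW_r = rho * dW1 + sqrt(1 - rho**2) * dW2 and dW_y = dW1.
    """
    rng = random.Random(seed)
    sd = math.sqrt(dt)
    pairs = []
    for _ in range(n):
        dw1 = rng.gauss(0.0, sd)
        dw2 = rng.gauss(0.0, sd)
        pairs.append((rho * dw1 + math.sqrt(1.0 - rho * rho) * dw2, dw1))
    return pairs


def sample_correlation(pairs):
    """Sample correlation coefficient of a list of pairs."""
    n = len(pairs)
    mx = sum(p[0] for p in pairs) / n
    my = sum(p[1] for p in pairs) / n
    cov = sum((p[0] - mx) * (p[1] - my) for p in pairs)
    vx = sum((p[0] - mx) ** 2 for p in pairs)
    vy = sum((p[1] - my) ** 2 for p in pairs)
    return cov / math.sqrt(vx * vy)


pairs = correlated_increments(rho=0.6, dt=0.01, n=50000, seed=1)
print(sample_correlation(pairs))  # close to 0.6, up to sampling error
```

In the perfectly correlated case ρ = 1 the second driver drops out, so that r and y are driven by the same Wiener process.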

We can now prove the following main result for the scalar case.

Proposition 7.12 Assume that $\varphi_y(r, y) \neq 0$, and that $b(y) \neq 0$, i.e. that the model is non trivial. Then the following hold.

• In the non-perfectly correlated case |ρ| < 1, a necessary and sufficient condition for the existence of an FDR is that the vector field λ is quasi exponential. The scalar field ϕ(r, y) is allowed to be arbitrary.

• In the perfectly correlated case |ρ| = 1, the condition above is sufficient.

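Proof. The Stratonovich dynamics of the model are given by

$$dr_t = \left\{ Fr_t + \Phi D - \frac{1}{2}\varphi_r[\lambda]\varphi\lambda - \frac{1}{2}\varphi_y b\lambda \right\} dt + \rho\varphi\lambda \circ dW^1_t + \sqrt{1 - \rho^2}\,\varphi\lambda \circ dW^2_t$$

$$dy_t = a\,dt + b \circ dW^1_t.$$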


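Thus the relevant Lie algebra $\mathcal{L}$ on H is generated by the vector fields

$$\begin{bmatrix} Fr + \Phi D - \frac{1}{2}\varphi_r[\lambda]\varphi\lambda - \frac{1}{2}\varphi_y b\lambda \\ a \end{bmatrix},\ \begin{bmatrix} \rho\varphi\lambda \\ b \end{bmatrix},\ \begin{bmatrix} \sqrt{1 - \rho^2}\,\varphi\lambda \\ 0 \end{bmatrix}.$$

We start with the non-perfectly correlated case, so we assume that |ρ| < 1. Then, by Gaussian elimination, the system of generators can immediately be reduced to

$$\begin{bmatrix} Fr + \Phi D \\ 0 \end{bmatrix},\ \begin{bmatrix} 0 \\ 1 \end{bmatrix},\ \begin{bmatrix} \lambda \\ 0 \end{bmatrix}.$$

The Lie bracket between the first two vector fields gives us

$$\begin{bmatrix} \Phi_y D \\ 0 \end{bmatrix},$$

so after reducing this field we have the generators

$$\begin{bmatrix} Fr + \Phi D \\ 0 \end{bmatrix},\ \begin{bmatrix} D \\ 0 \end{bmatrix},\ \begin{bmatrix} \lambda \\ 0 \end{bmatrix},\ \begin{bmatrix} 0 \\ 1 \end{bmatrix},$$

which finally reduce to

$$\begin{bmatrix} Fr \\ 0 \end{bmatrix},\ \begin{bmatrix} D \\ 0 \end{bmatrix},\ \begin{bmatrix} \lambda \\ 0 \end{bmatrix},\ \begin{bmatrix} 0 \\ 1 \end{bmatrix}.$$

From this it follows immediately that the Lie algebra is finite dimensional if and only if the linear span of

$$\{F^n\lambda, F^n D;\ n \ge 0\}$$

is a finite dimensional subspace in H. It is however easily seen that this happens if and only if λ is quasi exponential.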

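In the perfectly correlated case |ρ| = 1 we can WLOG assume that ρ = 1 and we are left with the following generators for the Lie algebra $\mathcal{L}$.

$$\begin{bmatrix} Fr + \Phi D - \frac{1}{2}\varphi_r[\lambda]\varphi\lambda - \frac{1}{2}\varphi_y b\lambda \\ a \end{bmatrix},\ \begin{bmatrix} \varphi\lambda \\ b \end{bmatrix}.$$

There seems to be no easy way of reducing this set of generators, but it is obvious that $\mathcal{L}$ is included in the Lie algebra $\mathcal{L}_{ext}$ generated by the fields

$$\begin{bmatrix} Fr + \Phi D - \frac{1}{2}\varphi_r[\lambda]\varphi\lambda - \frac{1}{2}\varphi_y b\lambda \\ a \end{bmatrix},\ \begin{bmatrix} \varphi\lambda \\ 0 \end{bmatrix},\ \begin{bmatrix} 0 \\ b \end{bmatrix}.$$

Thus a sufficient condition for an FDR is that the larger Lie algebra $\mathcal{L}_{ext}$ is finite dimensional. It is however easily seen that $\mathcal{L}_{ext}$ is identical with the algebra discussed in the non-perfectly correlated case above, so we are finished.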


Test Examples: III.

We can now continue our study of the test examples of Section 7.2. In fact, only one example is left in the race, namely

2. HW with stochastic σ:

$$\sigma(r, y, x) = ye^{-ax}. \tag{236}$$

We now have the following result, which is immediately obtained from Proposition 7.12.

Proposition 7.13 The stochastic volatility version of the Hull-White extended Vasicek volatility structure with stochastic σ, as in (236), admits an FDR.

Construction of Realizations

In the previous sections we have provided existence results for FDRs, but so far we have not actually constructed any concrete realizations. However, the construction technique outlined in Section 5 can immediately be adapted to the stochastic volatility framework, and we only give an illustrative example. The example is the Hull-White extended Vasicek model with stochastic σ as in (236) above. We already know that the forward rate model of the form (230)-(231), with volatilities given by (236), has a finite dimensional realization. Not surprisingly, y can be chosen as one of the state variables, and a concrete realization can be shown to be given by

$$\hat{r}_t = \hat{G}(Z_t, y_t). \tag{237}$$

Here $\hat{G}$ is defined by

$$\hat{G}(z_0, z_1, z_2, y) = \begin{bmatrix} G(z_0, z_1, z_2, y) \\ y_0 + y \end{bmatrix}, \tag{238}$$

where G is given by

$$G(z_0, z_1, z_2, y)(x) = r_0(x + z_0) + e^{-\alpha x}z_1 - \frac{e^{-2\alpha x}}{\alpha}z_2. \tag{239}$$

The dynamics of the state space variables are given by

$$\begin{aligned}
dZ_0 &= dt, \\
dZ_1 &= \left[ -\alpha Z_1 + \frac{1}{\alpha}(y_0 + y)^2 \right] dt + (y_0 + y)\,dW_t, \\
dZ_2 &= \left[ -2\alpha Z_2 + (y_0 + y)^2 \right] dt, \\
dy &= a_0(y)\,dt + b(y)\,dW_t.
\end{aligned} \tag{240}$$

Here $a_0(y) = a(y) + \frac{1}{2}b_y(y)b(y)$.
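The realization can be checked by a direct calculation. The following sketch assumes the Ito forward rate dynamics of the scalar case above, $dr_t = \{Fr_t + \Phi D\}dt + \varphi\lambda\,dW_t$, with $\varphi_t = y_0 + y_t$, and writes α for the exponent as in (239). Since G is linear in $z_1$ and $z_2$ and $Z_0$ has finite variation, no second order Ito term appears.

```latex
\begin{align*}
\lambda(x) &= e^{-\alpha x}, \qquad
D(x) = \lambda(x)\int_0^x \lambda(s)\,ds
     = \frac{1}{\alpha}\bigl( e^{-\alpha x} - e^{-2\alpha x} \bigr), \\
dr_t(x) &= \Bigl\{ \partial_x r_t(x)
          + \frac{\varphi_t^2}{\alpha}\bigl( e^{-\alpha x} - e^{-2\alpha x} \bigr) \Bigr\}\,dt
          + \varphi_t e^{-\alpha x}\,dW_t .
\end{align*}
% Insert r_t(x) = r_0(x + Z_0) + e^{-\alpha x} Z_1 - \alpha^{-1} e^{-2\alpha x} Z_2 with dZ_0 = dt:
\begin{align*}
dr_t(x) &= \partial_x r_0(x + Z_0)\,dt + e^{-\alpha x}\,dZ_1
          - \frac{e^{-2\alpha x}}{\alpha}\,dZ_2, \\
\partial_x r_t(x) &= \partial_x r_0(x + Z_0) - \alpha e^{-\alpha x} Z_1
          + 2 e^{-2\alpha x} Z_2 .
\end{align*}
% Match the coefficients of e^{-\alpha x} and e^{-2\alpha x}:
\begin{align*}
dZ_1 &= \Bigl[ -\alpha Z_1 + \frac{1}{\alpha}\varphi_t^2 \Bigr]\,dt + \varphi_t\,dW_t, \\
-\frac{1}{\alpha}\,dZ_2 &= \Bigl[ 2 Z_2 - \frac{1}{\alpha}\varphi_t^2 \Bigr]\,dt
  \quad\Longrightarrow\quad
  dZ_2 = \bigl[ -2\alpha Z_2 + \varphi_t^2 \bigr]\,dt .
\end{align*}
```

These are the $Z_1$ and $Z_2$ equations in (240).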


References

1. BHAR, R., AND CHIARELLA, C. Transformation of Heath-Jarrow-Morton models to Markovian systems. The European Journal of Finance 3 (1997), 1–26.

2. BJORK, T. Arbitrage Theory in Continuous Time. Oxford University Press, 1998.

3. BJORK, T. A geometric view of interest rate theory. In Option Pricing, Interest Rates and Risk Management, E. Jouini, J. Cvitanic, and M. Musiela, Eds. Cambridge University Press, 2001.

4. BJORK, T. Interest rate theory. In Financial Mathematics, Springer Lecture Notes in Mathematics, Vol 1656, W. Runggaldier, Ed. Springer Verlag, 2001.

5. BJORK, T., AND CHRISTENSEN, B. Interest rate dynamics and consistent forward rate curves. Mathematical Finance 9 (1999), 323–348.

6. BJORK, T., AND GOMBANI, A. Minimal realizations of interest rate models. Finance and Stochastics 3, 4 (1999), 413–432.

7. BJORK, T., AND LANDEN, C. On the construction of finite dimensional realizations for nonlinear forward rate models. Finance and Stochastics 6, 3 (2002), 303–331.

8. BJORK, T., LANDEN, C., AND SVENSSON, L. Finite dimensional markovian realizations for stochastic volatility forward rate models. In: Proceedings of the Royal Society, Vol. 460, No. 2041, 53–84, 2004.

9. BJORK, T., AND SVENSSON, L. On the existence of finite dimensional realizations for nonlinear forward rate models. Mathematical Finance 11, 2 (2001), 205–243.

10. BRACE, A., AND MUSIELA, M. A multifactor Gauss Markov implementation of Heath, Jarrow, and Morton. Mathematical Finance 4 (1994), 259–283.

11. BROCKET, P. Finite Dimensional Linear Systems. Wiley, 1970.

12. BROCKET, P. Nonlinear systems and nonlinear estimation theory. In Stochastic systems: The mathematics of filtering and identification and applications, M. Hazewinkel and J. Willems, Eds. Reidel, 1981.

13. CARVERHILL, A. When is the spot rate Markovian? Mathematical Finance 4 (1994), 305–312.

14. CHIARELLA, C., AND KWON, O. K. Forward rate dependent Markovian transformations of the Heath-Jarrow-Morton term structure model. Finance and Stochastics 5 (2001), 237–257.

15. COX, J., INGERSOLL, J., AND ROSS, S. A theory of the term structure of interest rates. Econometrica 53 (1985), 385–408.

16. DA PRATO, G., AND ZABCZYK, J. Stochastic Equations in Infinite Dimensions. Cambridge University Press, 1992.

17. DUFFIE, D., AND KAN, R. A yield factor model of interest rates. Mathematical Finance 6, 4 (1996), 379–406.

18. EBERLEIN, E., AND RAIBLE, S. Term structure models driven by general Levy processes. Mathematical Finance 9, 1 (1999), 31–53.

19. FILIPOVIC, D. A note on the Nelson-Siegel family. Mathematical Finance 9, 4 (1999), 349–359.

20. FILIPOVIC, D. Exponential-polynomial families and the term structure of interest rates. Bernoulli 6 (2000), 1–27.

21. FILIPOVIC, D. Invariant manifolds for weak solutions of stochastic equations. Probability Theory and Related Fields 118 (2000), 323–341.

22. FILIPOVIC, D. Consistency Problems for Heath-Jarrow-Morton Interest Rate Models. Springer Lecture Notes in Mathematics, Vol. 1760. Springer Verlag, 2001.


23. FILIPOVIC, D., AND TEICHMANN, J. Existence of finite dimensional realizations for stochastic equations, 2001. Forthcoming in J. Funct. Anal.

24. FILIPOVIC, D., AND TEICHMANN, J. On finite dimensional term structure models, 2002. Working paper.

25. FILIPOVIC, D., AND TEICHMANN, J. On the geometry of the term structure of interest rates, 2003. In: Proceedings of the Royal Society, Vol. 460, No. 2041, 129–168, 2004.

26. HEATH, D., JARROW, R., AND MORTON, A. Bond pricing and the term structure of interest rates: a new methodology for contingent claims valuation. Econometrica 60 (1992), 77–105.

27. HO, T., AND LEE, S. Term structure movements and pricing interest rate contingent claims. Journal of Finance 41 (1986), 1011–1029.

28. HULL, J., AND WHITE, A. Pricing interest-rate-derivative securities. Review of Financial Studies 3 (1990), 573–592.

29. INUI, K., AND KIJIMA, M. A Markovian framework in multi-factor Heath-Jarrow-Morton models. Journal of Financial and Quantitative Analysis 33 (1998), 423–440.

30. ISIDORI, A. Nonlinear Control Systems. Springer Verlag, 1989.

31. JEFFREY, A. Single factor Heath-Jarrow-Morton term structure models based on Markov spot interest rate dynamics. Journal of Financial and Quantitative Analysis 30 (1995), 619–642.

32. MUSIELA, M. Stochastic PDE:s and term structure models. Preprint, 1993.

33. MUSIELA, M., AND RUTKOWSKI, M. Martingale Methods in Financial Modelling. Springer-Verlag, 1997.

34. NELSON, C., AND SIEGEL, A. Parsimonious modelling of yield curves. Journal of Business 60 (1987), 473–489.

35. RITCHKEN, P., AND SANKARASUBRAMANIAN, L. Volatility structures of forward rates and the dynamics of the term structure. Mathematical Finance 5, 1 (1995), 55–72.

36. VASICEK, O. An equilibrium characterization of the term structure. Journal of Financial Economics 5 (1977), 177–188.

37. WARNER, F. Foundations of differentiable manifolds and Lie groups. Scott, Foresman, Hill, 1979.

38. ZABCZYK, J. Stochastic invariance and consistency of financial models. Preprint. Scuola Normale Superiore, Pisa, 2001.


Heterogeneous Beliefs, Speculation and Trading in Financial Markets

Jose Scheinkman and Wei Xiong

Bendheim Center for Finance
Princeton University
Princeton, NJ 08540
email: [email protected]
email: [email protected]

Summary. We survey recent developments in finance that analyze how heterogeneous beliefs among investors generate speculation and trading. We describe the joint effects of heterogeneous beliefs and short-sales constraints on asset prices, using both static and dynamic models, discuss the no-trade theorem in the rational expectations framework, and present investor overconfidence as a potential source of heterogeneous beliefs. We review recent results of Scheinkman and Xiong (2003) modeling the resale option that is embedded in share prices in the presence of short-sale constraints and heterogeneous beliefs, highlighting the implied correlation between stock prices and trading volume. Finally, we discuss the survival of investors with incorrect beliefs.

Key words: Heterogeneous beliefs, speculation, trading, overconfidence, resale option, bubble, optimal stopping, survival.

MSC 2000 subject classification. 91B24, 91B28, 91B44.

Acknowledgements: Research partially supported by the NSF through grant 0001647.

1 Introduction

Standard asset pricing theories have difficulty explaining episodes of asset price bubbles such as the one that seems to have occurred in the market for US internet stocks during the period of 1998-2000. In addition to asset prices that are difficult to justify by fundamentals such as expected future dividends, one typically observes inordinate increases in trading volume. For instance, Ofek and Richardson (2003) document that during the internet bubble of the late 90's, Internet stocks represented six percent of the market capitalization but accounted for 20% of the publicly traded volume of the U.S. stock market. Cochrane (2002) provides additional support for the correlation between bubbles and trading volume for US stocks during the late 90's. This evidence indicates that a satisfactory theory of bubbles should be able to explain simultaneously the level of prices and trading volume.

T.R. Bielecki et al.: LNM 1847, R.A. Carmona et al. (Eds.), pp. 217–250, 2004. © Springer-Verlag Berlin Heidelberg 2004

Several papers have been written over the last couple of decades that emphasize the role of heterogeneous beliefs in generating higher levels of asset prices and trading volume. In this chapter we present a selective survey of this literature.

We start by expositing a simple point made by Miller (1977), who argued that, if agents have heterogeneous beliefs about an asset's fundamentals and short sales are not allowed, equilibrium prices would, if opinions diverge enough, reflect the opinion of the more optimistic investor.

The Miller model is static and cannot be used to discuss the dynamics of trading. Harrison and Kreps (1978) exploit the dynamic consequences of heterogeneous beliefs. Since an investor knows that, in the future, there may be other investors that value the asset more than he does, the investor is willing to pay more for an asset than he would pay if he was forced to hold the asset forever. The difference between the investor's willingness to pay and his own discounted expected dividends reflects a speculative motive, the willingness to pay more than the intrinsic value of an asset because the ownership of the asset gives the owner the right to sell it in the future. To make this right valuable, short sales must be costly; in the Harrison and Kreps model it is simply assumed that short sales are not possible.

In the Harrison-Kreps model there is a single unit of an asset and several classes of risk neutral traders that disagree about the probability distribution of future dividends. The reservation price of a buyer is the supremum, over all stopping times, of the discounted cumulative dividends until the stopping time plus the discounted (ex-dividend) price that the owner can obtain by selling the asset at the stopping time. At each time, the agent in the group with highest reservation price would buy the asset. The equilibrium price process satisfies a simple recursive relationship. Furthermore, if there is a positive probability that at some future time a group that is currently not holding the asset would have a higher reservation price than the current owner, then the current price has to strictly exceed the maximum of the discounted expected future dividends among all groups; that is, the current price exceeds the maximum fundamental valuation of any agent in the economy. This difference between the current price and the maximal valuation can be identified as a bubble. Section 3 contains a summary of the Harrison-Kreps theory.

Harrison and Kreps do not discuss the source of heterogeneous beliefs. Although private information seems to be a natural source of disagreement and of trading, a series of results, known as "no-trade theorems", appear in the economics literature showing that if all agents are rational and share identical prior beliefs, heterogeneity of information cannot generate trading or cause a price bubble. These results are discussed in Section 4. Several possibilities exist to avoid these no-trade results, including the presence of noise-traders, who trade for liquidity reasons, or heterogeneous priors. Another option is to assume that agents have behavioral biases that preclude "full rationality." Several behavioral biases are suggested by the psychology literature. See Hirshleifer (2001) and Barberis and Thaler (2003) for detailed reviews of these biases.

Overconfidence, the tendency to overestimate the precision of one’s own opinion, is a well documented behavioral bias. Scheinkman and Xiong (2003) use overconfidence as a convenient way to generate a parameterized model of heterogeneous beliefs. They adopt a continuous time framework describing a market for a single risky asset with a limited supply, and many risk-neutral agents who can borrow and lend at a fixed rate of interest $r$. The current dividend of the asset is a noisy observation of a fundamental variable that will determine future dividends. More precisely:

\[ dD_t = f_t\,dt + \sigma_D\,dZ^D_t, \]

where $Z^D$ is a standard Brownian motion and $f$ is not observable. However, it satisfies:

\[ df_t = -\lambda(f_t - \bar{f})\,dt + \sigma_f\,dZ^f_t. \]

In addition to the dividends, there are two other sets of information available at each instant. These signals again satisfy linear SDEs:

\[ ds^A_t = f_t\,dt + \sigma_s\,dZ^A_t, \]

\[ ds^B_t = f_t\,dt + \sigma_s\,dZ^B_t. \]

The information is available to all agents. However, agents are divided into two groups, A and B, and group A (B) has more confidence in signal $s^A$ (resp. $s^B$). As a consequence, when forecasting future dividends, each group of agents places different weights on the three sets of information, resulting in different forecasts. In the parametric structure that Scheinkman and Xiong (2003) consider, linear filtering applies and the conditional beliefs are normally distributed with a common variance and means $f^A_t$ and $f^B_t$. Although agents in the model know exactly the amount by which their forecast of the fundamental variables exceeds that of agents in the other group, behavioral limitations lead them to continue to disagree. As information flows, the mean forecasts by agents of the two groups fluctuate, and the group of agents that is at one instant relatively more optimistic may become at a future date less optimistic than the agents in the other group. These changes in relative opinion generate trades.

Each agent in the model understands that the agents in the other group are placing different weights on the different sources of information. When deciding the value of the asset, agents consider their own view of the fundamentals as well as the fact that the owner of the asset has an option to sell the asset in the future to the agents in the other group. This option can be exercised at any time by the current owner, and the new owner gets in turn another option to sell the asset in the future. In the parametric example discussed by Scheinkman and Xiong (2003) it is natural to look for an equilibrium where the value of this option for the current owner is a function of differences in opinions; that is, if he belongs to group A (B), this value equals $q = q(f^B_t - f^A_t)$ (resp. $q = q(f^A_t - f^B_t)$). This option is “American,” and hence the value of the option is the value function of an optimal stopping problem. Since the buyer’s willingness to pay is a function of the value of the option that he acquires, the payoff from stopping is, in turn, related to the value of the option. This gives rise to a fixed point problem that the option value must satisfy. Scheinkman and Xiong (2003) show that the function $q$ must, in the absence of trading costs, satisfy:

\[ q(x) = \sup_{\tau \ge 0} E^o\left[\left(\frac{x_\tau}{r + \lambda} + q(x_\tau)\right) e^{-r\tau}\right], \]

where $E^o$ represents the expected value using the beliefs of the current owner. Scheinkman and Xiong (2003) write down an “explicit” solution to this equation that involves Kummer functions.

In equilibrium an asset owner will sell the asset to agents in the other group whenever his view of the fundamental is surpassed by the view of agents in the other group by a critical amount. When there are no trading costs, this critical amount is zero: it is optimal to sell the asset immediately after the valuation of the fundamentals of the asset owner is “crossed” by the valuation of agents in the other group. The agents’ beliefs satisfy simple stochastic differential equations and it is a consequence of properties of Brownian motion that once the beliefs of agents cross, they will cross infinitely many times in any finite period of time right afterwards. This results in a trading frenzy. Although agents’ profit from exercising the resale option is infinitesimal, the net value of the option is large because of the high frequency of trades. When trading costs are positive the duration between trades increases, but in a continuous manner. In this way the model predicts large trading volume in markets with small transaction costs.

When a trade occurs the buyer has the highest fundamental valuation among all agents, and because of the resale option the price he pays exceeds his fundamental valuation. Agents pay prices that exceed their own valuation of future dividends because they believe that in the future they will find a buyer willing to pay even more. This difference between the transaction price and the highest fundamental valuation can be reasonably called a bubble. Sections 5 and 6 contain an exposition of the model in Scheinkman and Xiong (2003).

The bubble in the Scheinkman and Xiong model, based on the expectations of traders to take advantage of future differences in opinions, is quite different from the more traditional “rational bubbles” that are discussed, for instance, in Blanchard and Watson (1982) or Santos and Woodford (1997). In the typical rational bubble model agents have identical rational expectations, but prices include an extra bubble component that is always expected to grow at a rate equal to the risk free rate. Models of rational bubbles are incapable of explaining the increase in trading volume that is typically observed in the historic bubble episodes, and have non-stationarity properties that are at odds with empirical observations. Nonetheless, because these models constituted the first attempt to develop equilibrium models of price bubbles, we briefly discuss them in Section 7.1.

While rational bubble models study asset prices without considering questions of trading volume, a complementary literature uses heterogeneous beliefs to study trading, without dealing with the impact of heterogeneous beliefs on asset prices. Section 7.2 discusses these models using a paper by Harris and Raviv (1993) as an example.

We also discuss two other related questions that have played an important role in the economics literature on asset pricing. The first one considers what economists have dubbed the equity premium puzzle. This puzzle is the observation that stock returns over the last 50 years have been too high to be justified by standard asset pricing models with reasonable risk aversion parameters for investors. Section 7.3 briefly describes this puzzle and reviews several models that use heterogeneous beliefs to explain it.

The second question, which dates back at least to Friedman (1953), is whether traders with irrational beliefs will lose money trading with rational traders and eventually disappear from the market. In Section 8, we discuss a model by Kogan, Ross, Wang and Westerfield (2004), who analyze this issue using a continuous-time equilibrium setup. In their model, in addition to a risk-free bond, there is a stock that pays a dividend $D_T$ at the consumption date $T$. This terminal dividend satisfies:

\[ dD_t = D_t(\mu\,dt + \sigma\,dZ_t). \]

There are two groups of traders of equal size. One group, the “rational” traders, knows the true probability $P$ determining the distribution of the final dividend. The second group believes incorrectly in a probability $Q$ that is absolutely continuous with respect to $P$. As a result, the two groups disagree on the drift of the process determining $D_T$. Irrational traders believe that

\[ dD_t = D_t\left[(\mu + \sigma^2\eta)\,dt + \sigma\,dZ^Q_t\right], \]

where $Z^Q_t$ is a Brownian motion under $Q$. Irrational traders are optimistic (pessimistic) if $\eta > 0$ (resp. $\eta < 0$). All agents maximize a constant relative risk aversion utility function that depends on final consumption; that is, they choose a trading strategy to maximize:

\[ E^{\Pi}_0\left[\frac{1}{1-\gamma}\,C_T^{1-\gamma}\right], \]

where $C_T$ is the consumption at $T$, and the measure $\Pi$ equals $P$ for rational agents and $Q$ for irrational agents. Kogan et al. use the fact that, since there is only one source of uncertainty and two traded assets, it is natural to expect that the markets are complete and, as a consequence, that the equilibrium is Pareto efficient. They find the set of all Pareto optimal allocations of final consumption across types by maximizing the weighted sum of utilities, and derive the corresponding support prices. Using these support prices and the budget constraint, Kogan et al. characterize the particular Pareto efficient allocation that corresponds to a competitive equilibrium, and show that the corresponding support prices are such that markets are complete. They then combine the expression for the equilibrium allocation with the strong law of large numbers to show that, if traders are sufficiently risk averse and irrational traders are moderately optimistic, irrational traders may survive in the long run. The reason is that they may earn higher returns, albeit by bearing excessive risk.

The papers we discuss in this chapter were chosen mainly on the basis of our views concerning their contribution to the understanding of speculation and trading. Some, such as Miller (1977), involve very little mathematics. Nonetheless we believe that the area is ripe for a more rigorous mathematical treatment. In fact, in Section 9 we discuss some open problems related to the speculative behavior of investors and the resale option component in asset prices. Typically these problems involve optimal stopping problems with nonlinear filtering and multiple state variables.

2 A Static Model with Heterogeneous Beliefs and Short-Sales Constraints

Miller (1977) argued that short-sales constraints can cause stocks to be overpriced when investors have heterogeneous beliefs about stock fundamentals. In the presence of short-sale constraints, stock prices reflect the views of the more optimistic participants. If some of the pessimistic investors that would like to short are not allowed to do it, prices will in general be higher than the price that would prevail in the absence of short-sale constraints.

We illustrate Miller’s argument using a version of the Lintner (1969) model, to which we add short-sales constraints. Related models can also be found in Jarrow (1980), Varian (1989), Chen, Hong and Stein (2002), and Gallmeyer and Hollifield (2004). The model has one period with two dates: $t = 0, 1$. There is one risky asset which will be liquidated at $t = 1$. The final liquidation value is

\[ f = \mu + \varepsilon, \]

where $\mu$ is a constant unknown to investors and $\varepsilon$ is normally distributed with mean zero. Investors have diverse opinions concerning the distribution of liquidation values. Investor $i$ believes that $f$ has a normal distribution with mean $\mu_i$ and variance $\sigma^2$. Since all investors share the same views concerning the variance, we index investors by their mean beliefs $\mu_i$, and assume that $\mu_i$ is uniformly distributed around $\mu$ in an interval $[\mu - \kappa, \mu + \kappa]$. The parameter $\kappa$ measures the heterogeneity of beliefs. In addition, we assume that all investors can borrow or lend at a risk-free interest rate of zero, short-sales of the risky asset are prohibited, and the total supply of the asset is $Q$.

At $t = 0$, each investor chooses his asset demand to maximize his expected utility at $t = 1$:

\[ \max_{x_i}\; E\left[-e^{-\gamma\left(W_0 + x_i(f - p_0)\right)}\right], \]

where $\gamma$ is the investor’s risk aversion, $W_0$ is the initial wealth, $p_0$ is the market price of the asset, and $x_i$ is the investor’s asset demand, subject to $x_i \ge 0$. It is immediate that

\[ x_i = \max\left\{\frac{\mu_i - p_0}{\gamma\sigma^2},\; 0\right\}. \]

The investor’s demand in the absence of the short-sale constraint would be $(\mu_i - p_0)/(\gamma\sigma^2)$. When these constraints are present, investors with mean beliefs $\mu_i$ below the market price stay out of the market. The market clearing condition,

\[ \int_i x_i = Q, \]

thus implies that

\[ \int_{\max\{p_0,\,\mu-\kappa\}}^{\mu+\kappa} \frac{\mu_i - p_0}{\gamma\sigma^2}\,\frac{d\mu_i}{2\kappa} = Q, \]

and the equilibrium price is

\[ p_0 = \begin{cases} \mu - \gamma\sigma^2 Q & \text{if } \kappa < \gamma\sigma^2 Q, \\ \mu + \kappa - 2\sqrt{\kappa\gamma\sigma^2 Q} & \text{if } \kappa \ge \gamma\sigma^2 Q. \end{cases} \]

In the absence of short-sales constraints the equilibrium price would be $\mu - \gamma\sigma^2 Q$. Thus, the short-sales constraints cause the asset price to become higher when the heterogeneity of investors’ beliefs $\kappa$ is greater than $\gamma\sigma^2 Q$. If heterogeneity of beliefs is small enough, then the no-short-sale constraint is not binding for any investor, and the equilibrium price is not affected by the presence of the constraint.
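As a numerical check of the static model, the following Python sketch evaluates the closed-form price above and confirms, by a simple midpoint-rule integration over the uniform distribution of beliefs, that it clears the market. The function names, the integration scheme and the parameter values in the note below are assumptions made for illustration only; they are not taken from Miller (1977) or Lintner (1969).

```python
import math


def demand(mu_i, p0, gamma, sigma2):
    """Demand of an investor with mean belief mu_i under the no-short-sale constraint."""
    return max((mu_i - p0) / (gamma * sigma2), 0.0)


def equilibrium_price(mu, kappa, gamma, sigma2, Q):
    """Closed-form equilibrium price of the static model with short-sales prohibited."""
    threshold = gamma * sigma2 * Q
    if kappa < threshold:
        # Low heterogeneity: the constraint binds for nobody.
        return mu - threshold
    # High heterogeneity: pessimists are sidelined and the price rises.
    return mu + kappa - 2.0 * math.sqrt(kappa * threshold)


def aggregate_demand(p0, mu, kappa, gamma, sigma2, n=20_000):
    """Average demand over mu_i uniform on [mu - kappa, mu + kappa] (midpoint rule)."""
    width = 2.0 * kappa / n
    total = 0.0
    for j in range(n):
        mu_i = mu - kappa + (j + 0.5) * width
        total += demand(mu_i, p0, gamma, sigma2)
    # Each cell carries probability mass width / (2 * kappa) = 1 / n.
    return total / n
```

For instance, with $\mu = 10$, $\gamma = \sigma^2 = Q = 1$ and $\kappa = 2$, the constrained price is $12 - 2\sqrt{2} \approx 9.17$, above the unconstrained price of 9, whereas for $\kappa = 0.5$ the two prices coincide.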

This simple model shows that short-sales constraints combined with heterogeneous beliefs can cause asset prices to become higher than they would be in the absence of the short-sales constraints. When beliefs are sufficiently heterogeneous, short-sale constraints ensure that asset prices reflect the opinion of the more optimistic investors. However, because of its static nature, the model has no prediction concerning the dynamics of trading. In the following sections, we discuss the effects of heterogeneous beliefs and short-sales constraints on asset prices and share turnovers in dynamic models.

3 A Dynamic Model in Discrete Time with Short-Sales Constraints

Harrison and Kreps (1978) say that investors exhibit speculative behavior if the right to resell an asset makes them willing to pay more for it than they would pay if obliged to hold it forever. This definition is particularly compelling when agents are risk-neutral, since in this case no risk-sharing benefits arise from trading. Harrison and Kreps constructed a model where, because risk-neutral agents have heterogeneous expectations and face short-sales constraints, speculative behavior arises.

In the model, there exists one unit of an asset that pays a non-negative random dividend $d_t$ at each time $t$. All agents are risk-neutral and discount future revenues at a constant rate $\gamma < 1$, or equivalently can borrow and lend at a rate $r = \frac{1-\gamma}{\gamma}$. Agents are divided into groups that differ in their views on the distribution of the stochastic process $d_t$. Harrison and Kreps allow for an arbitrary number of groups of agents, but the analysis they provide is well illustrated by treating the case of two groups.

Thus, we consider two groups $\{A, B\}$, each with an infinite number of agents. For simplicity we assume that each group $C \in \{A, B\}$ views $d_t$ as a stochastic process defined on a probability space $(\Omega, \mathcal{F}, P^C)$, and that $P^A \sim P^B$. We write $E^C$ for the expected value with respect to the probability distribution shared by all agents in group $C \in \{A, B\}$.

Write $\mathcal{F}_t$, $t \ge 0$, for the $\sigma$-algebra generated by the realizations of $d^t \equiv (d_1, \ldots, d_t)$. A price process is an $\mathcal{F}_t$-adapted non-negative process.

The owner of an asset at $t$ must decide on a strategy to sell all or part of his holdings in the future. Since agents are risk neutral it suffices to consider the strategy of selling one unit of the asset at a (possibly infinite) stopping time. For this reason we define a feasible selling strategy from time $t$ as a (possibly infinite) integer valued stopping time $T > t$.

Because each group has an infinite number of agents and there is a single unit of the asset, competition among buyers will lead to a price that equals the reservation price of the buyers. Since agents are risk-neutral and can choose any feasible selling strategy, the value of the asset at $t$ for an agent in group $C \in \{A, B\}$ is given by

\[ \sup_{T > t} E^C\left[\sum_{k=t+1}^{T} \gamma^{k-t} d_k + \gamma^{T-t} p_T \,\Big|\, \mathcal{F}_t\right], \]

where $\sum_{k=t+1}^{T} \gamma^{k-t} d_k$ represents the value of the discounted dividend stream received up to the moment of sale at $T$, and $\gamma^{T-t} p_T$ represents the discounted value from selling the asset at the prevailing market price at $T$. The buyers will belong to the group that places the highest valuation on the asset. Hence an equilibrium price process has to satisfy:

\[ p_t = \max_{C \in \{A,B\}} \sup_{T > t} E^C\left[\sum_{k=t+1}^{T} \gamma^{k-t} d_k + \gamma^{T-t} p_T \,\Big|\, \mathcal{F}_t\right]. \tag{1} \]

Since $T = \infty$ is a feasible strategy, it follows that

\[ p_t \ge \max_{C \in \{A,B\}} E^C\left[\sum_{k=t+1}^{\infty} \gamma^{k-t} d_k \,\Big|\, \mathcal{F}_t\right]. \tag{2} \]

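Since the right hand side of equation (2) represents the maximal value that any agent is willing to pay for the asset if resale is impossible, speculative behavior is equivalent to a strict inequality in equation (2). Suppose $F \in \mathcal{F}_t$ is such that $A$ realizes the maximum in the right-hand side of (1) for each $\omega \in F$. Suppose further that for some $t' > t$ and event $F' \in \mathcal{F}_{t'}$, $F' \subset F$, with $P^A(F') > 0$ (and hence $P^B(F') > 0$),

\[ E^B\left[\sum_{k=t'+1}^{\infty} \gamma^{k-t} d_k \,\Big|\, \mathcal{F}_{t'}\right](\omega) > E^A\left[\sum_{k=t'+1}^{\infty} \gamma^{k-t} d_k \,\Big|\, \mathcal{F}_{t'}\right](\omega), \]

for each $\omega \in F'$. Then a strict inequality must hold in equation (2), for $\omega \in F$.

In fact,

\[ \begin{aligned} E^A\left[\sum_{k=t+1}^{\infty} \gamma^{k-t} d_k \,\Big|\, \mathcal{F}_t\right] &= E^A\left[\sum_{k=t+1}^{t'} \gamma^{k-t} d_k \,\Big|\, \mathcal{F}_t\right] + E^A\left[E^A\left[\sum_{k=t'+1}^{\infty} \gamma^{k-t} d_k \,\Big|\, \mathcal{F}_{t'}\right] \Big|\, \mathcal{F}_t\right] \\ &< E^A\left[\sum_{k=t+1}^{t'} \gamma^{k-t} d_k \,\Big|\, \mathcal{F}_t\right] + E^A\left[\max_{C \in \{A,B\}} E^C\left[\sum_{k=t'+1}^{\infty} \gamma^{k-t} d_k \,\Big|\, \mathcal{F}_{t'}\right] \Big|\, \mathcal{F}_t\right] \\ &\le E^A\left[\sum_{k=t+1}^{t'} \gamma^{k-t} d_k \,\Big|\, \mathcal{F}_t\right] + E^A\left[\gamma^{t'-t} p_{t'} \,\Big|\, \mathcal{F}_t\right] \le p_t. \end{aligned} \]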

Speculative behavior arises because the owner of the asset retains, in addition to the flow of future dividends, an option to resell the asset to other investors. This option will become in-the-money when there are investors that have a relatively more optimistic view of future dividends than the current owner.

The following proposition allows one to characterize all pricing processes that satisfy equation (1) by a two period condition.

Proposition 1. A price process satisfies equation (1) if and only if, for each $t$,

\[ p_t = \max_{C \in \{A,B\}} E^C\left[\gamma d_{t+1} + \gamma p_{t+1} \mid \mathcal{F}_t\right]. \tag{3} \]

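Proof: Suppose (3) holds. Then for each $C$,

\[ p_t \ge E^C\left[\gamma d_{t+1} + \gamma p_{t+1} \mid \mathcal{F}_t\right]. \]

Hence the process $y_t = \sum_{s=1}^{t} \gamma^s d_s + \gamma^t p_t$ is a non-negative supermartingale and hence $\lim_{t \to \infty} y_t$ exists. Doob’s optional stopping theorem implies that $p_t \ge E^C\left[\sum_{k=t+1}^{T} \gamma^{k-t} d_k + \gamma^{T-t} p_T \mid \mathcal{F}_t\right]$ for every feasible selling strategy $T$. Since (3) shows that equality is attained at $T = t+1$ by the group that achieves the maximum, (1) must hold.

Conversely, suppose (1) holds, but that:

\[ p_t > \max_{C \in \{A,B\}} E^C\left[\gamma d_{t+1} + \gamma p_{t+1} \mid \mathcal{F}_t\right]. \]

The law of iterated expectations and equation (1) applied at $t + 1$ imply that:

\[ p_t > \max_{C \in \{A,B\}} \sup_{T > t} E^C\left[\sum_{k=t+1}^{T} \gamma^{k-t} d_k + \gamma^{T-t} p_T \,\Big|\, \mathcal{F}_t\right], \]

a contradiction.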

Suppose $d_t$ is a time-homogeneous Markov process, that is

\[ P^C[d_{t+s} \mid \mathcal{F}_t] = P^C[d_{t+s} \mid d_t] = P^C[d_{s+1} \mid d_1], \]

for each $C \in \{A, B\}$. Then it is natural to search for equilibrium prices that are of the form $p_t = p(d_t)$. If we write

\[ T p(d) = \max_{C \in \{A,B\}} E^C\left[\gamma d_{t+1} + \gamma p(d_{t+1}) \mid d_t = d\right], \tag{4} \]

then equation (3) can be rewritten as:

\[ T p = p. \tag{5} \]

The operator $T$ has a monotonicity property: if $p \ge q$ then $T(p) \ge T(q)$. The existence and uniqueness of (continuous) solutions to equation (5) are guaranteed if, for instance, the process $d$ stays in a bounded set.
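To make the fixed point in equation (5) concrete, here is a minimal Python sketch that iterates the operator of equation (4) on a two-state dividend chain until it converges, and compares the resulting price with each group’s hold-forever valuation from the right-hand side of (2). The two transition matrices, the discount factor 0.75 and all function names are hypothetical choices made for this illustration; nothing in the text above prescribes them. Convergence relies on the operator being a contraction when the state space is bounded and $\gamma < 1$.

```python
import numpy as np

GAMMA = 0.75                       # discount factor (assumed value)
DIVIDENDS = np.array([0.0, 1.0])   # dividend paid in each state (assumed)

# Hypothetical beliefs: row = current state, column = next state.
P = {
    "A": np.array([[1 / 2, 1 / 2], [2 / 3, 1 / 3]]),
    "B": np.array([[2 / 3, 1 / 3], [1 / 4, 3 / 4]]),
}


def apply_T(p):
    """One application of the operator in equation (4), state by state."""
    payoff = GAMMA * (DIVIDENDS + p)  # discounted dividend plus resale price
    return np.max([P[c] @ payoff for c in P], axis=0)


def solve_price(tol=1e-12, max_iter=10_000):
    """Iterate p <- T p from p = 0 until successive iterates agree to tol."""
    p = np.zeros_like(DIVIDENDS)
    for _ in range(max_iter):
        p_next = apply_T(p)
        if np.max(np.abs(p_next - p)) < tol:
            return p_next
        p = p_next
    raise RuntimeError("fixed-point iteration did not converge")


def fundamental_value(Pc):
    """Hold-forever value v = gamma * Pc (d + v), solved as a linear system."""
    n = len(DIVIDENDS)
    return np.linalg.solve(np.eye(n) - GAMMA * Pc, GAMMA * Pc @ DIVIDENDS)
```

With these numbers the iteration converges to $p(0) = 24/13$ and $p(1) = 27/13$, while the larger of the two hold-forever valuations is $16/11$ in state 0 and $21/11$ in state 1, so the inequality in (2) is strict in both states.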

Harrison and Kreps do not explicitly address the source of heterogeneous beliefs among investors. In what follows we will examine specific mechanisms to generate heterogeneity of beliefs. In Section 4 we describe some results concerning the difficulty of generating heterogeneous beliefs from the presence of private information. We then discuss a model that utilizes overconfidence, a behavioral limitation that is suggested by psychological studies, to parameterize heterogeneous beliefs. The model will then be used to link Harrison and Kreps’ speculative behavior to a resale option value and to explain some empirical regularities concerning trading volume and prices during asset “bubbles.”

4 No-Trade Theorem under Rational Expectations

A possible source of heterogeneous beliefs is private information. The presence of private information suggests that investors could use their information to trade and realize a profit. However, Tirole (1982) and Milgrom and Stokey (1982) prove that this cannot happen when all agents are rational and share identical prior beliefs, the conditions that are imposed in the standard rational expectations models. Thus, private information cannot be the source of speculative trading. Results of this kind are called “no-trade theorems”.

We use a static setup from Tirole (1982) to illustrate the main ideas. Consider a market with $I$ risk neutral traders: $i = 1, \ldots, I$. The traders exchange claims for an asset with random payoff $\tilde{p} \in E$, which will be realized after the trading. The set $E \subset \mathbb{R}$ is the set of all possible payoffs. Claims are traded at an equilibrium market price $p$. If a trader buys $x$ units of the asset and $\tilde{p}$ obtains, the realized profit is

\[ G = (\tilde{p} - p)\,x. \]

Each trader $i$ receives a private signal $s_i$ belonging to a possible set of signals $S_i$. Let $s = (s_1, \ldots, s_I) \in S = \prod_{i=1}^{I} S_i$, and $\Omega = E \times S$. Assume that all traders have a common prior $\nu$ on $\Omega$, and that each trader can only take a bounded position.

Trader $i = 1, \ldots, I$ chooses an amount $x_i$ to maximize his conditional expected value of $G$. In a Rational Expectations Equilibrium (REE), each trader uses all information at his disposal, including the observed market price $p$. In spite of its name, a REE does not involve only an equilibrium price, but a forecast function that maps each vector of all signals $s \in S$ into a price that establishes equilibrium in the market if the state $s$ obtains. This forecast function is typically not one-to-one. By observing the price $p$ as well as his own signal $s_i$, trader $i$ is not able to identify the full vector of signals $s$. However, a forecast function $\Phi : S \to \mathbb{R}$, an observed $p$, and a signal $s_i$ induce a conditional distribution on $E \times S$, $\Gamma^i_{\Phi,p,s_i}$.

Definition 1: A rational expectations equilibrium (REE) is a forecast function $\Phi : S \to \mathbb{R}$, and a set of trades $x_i(\Phi, p, s_i)$ for each trader $i$ such that

1. $x_i(\Phi, p, s_i)$ maximizes $E_\Phi(G_i \mid p, s_i) \equiv \int G\, d\Gamma^i_{\Phi,p,s_i}$.

2. The market clears for each $s \in S$: $\sum_i x_i(\Phi, p, s_i) = 0$.

The next proposition is a no-trade theorem.

Proposition 2. In a REE, $E_\Phi(G_i \mid p, s_i) = 0$. As a consequence, given a REE, there exists another REE with the same forecast function and $x_i \equiv 0$.

Proof: Since $x_i = 0$ is always a possible choice,

\[ E_\Phi(G_i \mid p, s_i) \ge 0. \tag{6} \]

The law of iterated expectations thus implies that

\[ E_\Phi(G_i \mid p) \ge 0. \tag{7} \]

The market clearing condition implies that for each realization of $p$ aggregate gains are null. Hence

\[ \sum_i E_\Phi(G_i \mid p) = 0, \]

and equality must hold in equations (6) and (7).

Proposition 2 rules out the possibility that investors that share the same prior can expect to profit from speculating against each other based on differences in information. As a consequence, they cannot do any better than by choosing not to trade. Although Proposition 2 only deals with the risk neutral case, it is intuitive that risk aversion would further reduce the net gain among investors from trading.

Tirole (1982) also analyzes a dynamic model with rational expectations and demonstrates that the no-trade theorem holds in a dynamic setup. He further shows that the resale options suggested by Harrison and Kreps cannot arise in asset prices in such an environment even if short-sales constraints are imposed. Diamond and Verrecchia (1987) also study the effects of short-sales constraints on asset prices in a rational expectations model with asymmetric information. They show that short-sales constraints reduce the adjustment speed of prices to private information, especially to bad news, since agents with negative information are prohibited from shorting the asset. However, Diamond and Verrecchia also confirm that short-sales constraints do not lead to an upward bias in prices since agents, when forming their own beliefs, could rationally take into account the fact that negative information may not be reflected in trading prices.

There are at least two ways to weaken the assumptions in Proposition 2 and avoid the no-trade result. First, one might consider some agents who trade for non-speculative reasons such as diversification or liquidity. The presence of such traders would make the trading among speculators a positive-sum game. This is the approach that has been adopted in several models of market microstructure such as Grossman and Stiglitz (1980), Kyle (1985), and Wang (1993).

Another possibility is to relax the assumption that agents share the same prior beliefs. This approach is pursued by Morris (1996), Biais and Bossaerts (1998), and Brav and Heaton (2002). Finally, one may assume that agents display behavioral biases. In this chapter, we discuss in detail overconfidence as a way to parameterize the dynamics of heterogeneous beliefs among agents. However, many other behavioral biases may generate heterogeneity of beliefs. For instance, heterogeneous beliefs can arise if agents gain utility from adopting certain beliefs, as discussed in Brunnermeier and Parker (2003).

5 Overconfidence as Source of Heterogeneous Beliefs

Overconfidence, the tendency of people to overestimate the precision of their knowledge, provides a convenient way to generate heterogeneous beliefs. Psychology studies suggest that people are overconfident. Alpert and Raiffa (1982), Brenner et al. (1996), and other calibration studies find that people overestimate the precision of their knowledge. Camerer (1995) argues that even experts can display overconfidence. Hirshleifer (2001) and Barberis and Thaler (2003) contain extensive reviews of the literature.

In finance, researchers have developed theoretical models to analyze the implications of overconfidence for financial markets. Kyle and Wang (1997) show that overconfidence can be used as a commitment device over competitors to improve one’s welfare. Daniel, Hirshleifer and Subrahmanyam (1998) use overconfidence to explain the predictable returns of financial assets. Odean (1998) demonstrates that overconfidence can cause excessive trading. Bernardo and Welch (2001) discuss the benefits of overconfidence to entrepreneurs through the reduced tendency to herd. In all these studies, overconfidence is modelled as overestimation of the precision of one’s information.

In this section we present the model in Scheinkman and Xiong (2003), which exploits the consequences of this overestimation in a dynamic model of pricing and trading. Since overconfident investors believe more strongly in their own assessments of an asset’s value than in the assessment of others, heterogeneous beliefs arise.

Consider a single risky asset with a dividend process that is the sum of two components. The first component is a fundamental variable that determines future dividends. The second is “noise”. The cumulative dividend process $D$ satisfies:

\[ dD_t = f_t\,dt + \sigma_D\,dZ^D_t, \tag{8} \]

where $Z^D$ is a standard Brownian motion and $\sigma_D$ is the volatility parameter. The stochastic process of fundamentals $f$ is not observable. However, it satisfies:

\[ df_t = -\lambda(f_t - \bar{f})\,dt + \sigma_f\,dZ^f_t, \tag{9} \]

where $\lambda \ge 0$ is the mean reversion parameter, $\bar{f}$ is the long-run mean of $f$, $\sigma_f > 0$ is a volatility parameter and $Z^f$ is a standard Brownian motion, uncorrelated with $Z^D$. The presence of dividend noise makes it impossible to infer $f$ perfectly from observations of the cumulative dividend process.

There are two sets of risk-neutral agents, who use the observations of $D$ and any other signals that are correlated with $f$ to infer current $f$ and to value the asset. In addition to the cumulative dividend process, all agents observe a vector of signals $s^A$ and $s^B$ that satisfy:

\[ ds^A_t = f_t\,dt + \sigma_s\,dZ^A_t, \tag{10} \]

\[ ds^B_t = f_t\,dt + \sigma_s\,dZ^B_t, \tag{11} \]

where $Z^A$ and $Z^B$ are standard Brownian motions, and $\sigma_s > 0$ is the common volatility of the signals. We assume that all four processes $Z^D$, $Z^f$, $Z^A$ and $Z^B$ are mutually independent.
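The dynamics in equations (8) to (11) can be simulated on a time grid with the Euler-Maruyama scheme. The sketch below is only an illustration under assumed parameter values; the function name `simulate_signals`, the default parameters and the choice of discretization are not part of the model description above.

```python
import numpy as np


def simulate_signals(T=1.0, n_steps=1000, lam=0.5, f_bar=1.0, f0=1.0,
                     sigma_f=0.3, sigma_D=0.5, sigma_s=0.4, seed=0):
    """Euler-Maruyama paths of f, D, s^A and s^B on a uniform grid."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    # Independent Brownian increments; columns: Z^D, Z^f, Z^A, Z^B.
    dZ = rng.normal(0.0, np.sqrt(dt), size=(n_steps, 4))

    f = np.empty(n_steps + 1)
    f[0] = f0
    dD = np.empty(n_steps)
    dsA = np.empty(n_steps)
    dsB = np.empty(n_steps)
    for k in range(n_steps):
        dD[k] = f[k] * dt + sigma_D * dZ[k, 0]                            # eq. (8)
        dsA[k] = f[k] * dt + sigma_s * dZ[k, 2]                           # eq. (10)
        dsB[k] = f[k] * dt + sigma_s * dZ[k, 3]                           # eq. (11)
        f[k + 1] = f[k] - lam * (f[k] - f_bar) * dt + sigma_f * dZ[k, 1]  # eq. (9)

    def cumulate(increments):
        # Cumulative process started at zero.
        return np.concatenate(([0.0], np.cumsum(increments)))

    return {"f": f, "D": cumulate(dD), "sA": cumulate(dsA), "sB": cumulate(dsB)}
```

Setting all three volatilities to zero switches off the noise, in which case the dividend and both signals simply accumulate the fundamental, which gives a quick sanity check of the discretization.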

Agents in group A (B) think of $s^A$ ($s^B$) as their own signal although they can also observe $s^B$ ($s^A$). Heterogeneous beliefs arise because each agent believes that the informativeness of his own signal is larger than its true informativeness. Agents of group A (B) believe that innovations $dZ^A$ ($dZ^B$) in the signal $s^A$ ($s^B$) are correlated with the innovations $dZ^f$ in the fundamental process, with $\phi$ ($0 < \phi < 1$) as the correlation parameter. Specifically, agents in group A believe that the process $s^A$ satisfies

\[ ds^A_t = f_t\,dt + \sigma_s\phi\,dZ^f_t + \sigma_s\sqrt{1 - \phi^2}\,dZ^A_t. \]

Although agents in group A perceive the correct unconditional volatility of the signal $s^A$, the correlation that they attribute to innovations causes them to over-react to signal $s^A$. Similarly, agents in group B believe the process $s^B$ satisfies

\[ ds^B_t = f_t\,dt + \sigma_s\phi\,dZ^f_t + \sigma_s\sqrt{1 - \phi^2}\,dZ^B_t. \]

On the other hand, agents in group A (B) believe (correctly) that innovations to $s^B$ ($s^A$) are uncorrelated with innovations to $Z^B$ ($Z^A$). We assume that the joint dynamics of the processes $D$, $f$, $s^A$ and $s^B$ in the mind of agents of each group is public information.

Since all variables are Gaussian, the filtering problem of the agents is standard. With Gaussian initial conditions, the conditional beliefs of agents in group $C \in \{A, B\}$ are Gaussian. Standard arguments, e.g. Section VI.9 in Rogers and Williams (1987) and Theorem 12.7 in Liptser and Shiryayev (1977), can be used to compute the variance of the stationary solution and the evolution of the conditional mean of beliefs. The variance of this stationary solution is the same for both groups of agents and equals

\[ \gamma \equiv \frac{\sqrt{(\lambda + \phi\sigma_f/\sigma_s)^2 + (1 - \phi^2)\left(2\sigma_f^2/\sigma_s^2 + \sigma_f^2/\sigma_D^2\right)} - (\lambda + \phi\sigma_f/\sigma_s)}{\dfrac{1}{\sigma_D^2} + \dfrac{2}{\sigma_s^2}}. \]

One can directly verify that the stationary variance $\gamma$ decreases with $\phi$. When $\phi > 0$, agents have an exaggerated view of the precision of their estimates of $f$. A larger $\phi$ leads to more overstatement of this precision. For this reason we refer to $\phi$ as the “overconfidence” parameter.
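The closed form for the stationary variance is easy to evaluate numerically. The sketch below codes the formula and checks two things: that $\gamma$ is the positive root of the quadratic $a\gamma^2 + 2b\gamma - c = 0$ with $a = 2/\sigma_s^2 + 1/\sigma_D^2$, $b = \lambda + \phi\sigma_f/\sigma_s$ and $c = (1 - \phi^2)\sigma_f^2$ (a purely algebraic rewriting of the formula above), and that $\gamma$ falls as $\phi$ rises for one set of parameters. The function names and parameter values are assumptions chosen for illustration.

```python
import math


def stationary_variance(lam, phi, sigma_f, sigma_s, sigma_D):
    """Stationary variance gamma of the agents' beliefs about f."""
    b = lam + phi * sigma_f / sigma_s
    a = 2.0 / sigma_s**2 + 1.0 / sigma_D**2
    c = (1.0 - phi**2) * sigma_f**2
    return (math.sqrt(b * b + a * c) - b) / a


def quadratic_residual(lam, phi, sigma_f, sigma_s, sigma_D):
    """Residual of a*g^2 + 2*b*g - c at g = gamma; should vanish up to rounding."""
    g = stationary_variance(lam, phi, sigma_f, sigma_s, sigma_D)
    b = lam + phi * sigma_f / sigma_s
    a = 2.0 / sigma_s**2 + 1.0 / sigma_D**2
    c = (1.0 - phi**2) * sigma_f**2
    return a * g * g + 2.0 * b * g - c


if __name__ == "__main__":
    # Illustrative parameters only: lambda = 0.5, all volatilities equal to 1.
    for phi in (0.0, 0.3, 0.6, 0.9):
        print(phi, stationary_variance(0.5, phi, 1.0, 1.0, 1.0))
```

The printed values decrease in $\phi$, in line with the statement that more overconfident agents perceive their estimates as more precise.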

The conditional mean of the beliefs of agents in group A satisfies:

\[ \begin{aligned} df^A_t = {} & -\lambda(f^A_t - \bar{f})\,dt + \frac{\phi\sigma_s\sigma_f + \gamma}{\sigma_s^2}\,(ds^A_t - f^A_t\,dt) \\ & + \frac{\gamma}{\sigma_s^2}\,(ds^B_t - f^A_t\,dt) + \frac{\gamma}{\sigma_D^2}\,(dD_t - f^A_t\,dt). \end{aligned} \tag{12} \]

Since $f$ mean-reverts, the conditional beliefs also mean-revert. The other three terms represent the effects of “surprises.” These surprises can be represented as standard mutually independent Brownian motions for agents in group A:

\[ dW^{A,A}_t = \frac{1}{\sigma_s}\,(ds^A_t - f^A_t\,dt), \tag{13} \]

\[ dW^{A,B}_t = \frac{1}{\sigma_s}\,(ds^B_t - f^A_t\,dt), \tag{14} \]

\[ dW^{A,D}_t = \frac{1}{\sigma_D}\,(dD_t - f^A_t\,dt). \tag{15} \]

Note that these processes are only Wiener processes in the mind of group A agents. Due to overconfidence ($\phi > 0$), agents in group A over-react to surprises in $s^A$.

Similarly, the conditional mean of the beliefs of agents in group B satisfies:

\[ \begin{aligned} df^B_t = {} & -\lambda(f^B_t - \bar{f})\,dt + \frac{\gamma}{\sigma_s^2}\,(ds^A_t - f^B_t\,dt) \\ & + \frac{\phi\sigma_s\sigma_f + \gamma}{\sigma_s^2}\,(ds^B_t - f^B_t\,dt) + \frac{\gamma}{\sigma_D^2}\,(dD_t - f^B_t\,dt), \end{aligned} \tag{16} \]

and the surprise terms can be represented as mutually independent Wiener processes: $dW^{B,A}_t = \frac{1}{\sigma_s}(ds^A_t - f^B_t\,dt)$, $dW^{B,B}_t = \frac{1}{\sigma_s}(ds^B_t - f^B_t\,dt)$, and $dW^{B,D}_t = \frac{1}{\sigma_D}(dD_t - f^B_t\,dt)$. These processes form a standard 3-d Brownian motion only for agents in group B.
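The belief dynamics in equations (12) and (16) share one structure: each group applies the gain $(\phi\sigma_s\sigma_f + \gamma)/\sigma_s^2$ to the surprise in its own signal, $\gamma/\sigma_s^2$ to the surprise in the other group’s signal, and $\gamma/\sigma_D^2$ to the dividend surprise. The Python sketch below performs Euler steps of both equations from given increments of $D$, $s^A$ and $s^B$. The helper names (`belief_step`, `run_filters`) and the discretization are assumptions for illustration and are not taken from the source.

```python
import math

import numpy as np


def stationary_variance(lam, phi, sigma_f, sigma_s, sigma_D):
    """Stationary variance gamma, from the closed form given in the text."""
    b = lam + phi * sigma_f / sigma_s
    a = 2.0 / sigma_s**2 + 1.0 / sigma_D**2
    return (math.sqrt(b * b + a * (1.0 - phi**2) * sigma_f**2) - b) / a


def belief_step(f_own, ds_own, ds_other, dD, dt,
                lam, f_bar, phi, sigma_f, sigma_s, sigma_D):
    """One Euler step of eq. (12) (group A) or eq. (16) (group B)."""
    gam = stationary_variance(lam, phi, sigma_f, sigma_s, sigma_D)
    k_own = (phi * sigma_s * sigma_f + gam) / sigma_s**2  # extra weight on own signal
    k_other = gam / sigma_s**2
    k_div = gam / sigma_D**2
    return (f_own - lam * (f_own - f_bar) * dt
            + k_own * (ds_own - f_own * dt)
            + k_other * (ds_other - f_own * dt)
            + k_div * (dD - f_own * dt))


def run_filters(dD, dsA, dsB, dt, fA0, fB0, **params):
    """Propagate both groups' conditional means along the same observed increments."""
    n = len(dD)
    fA = np.empty(n + 1)
    fB = np.empty(n + 1)
    fA[0], fB[0] = fA0, fB0
    for k in range(n):
        fA[k + 1] = belief_step(fA[k], dsA[k], dsB[k], dD[k], dt, **params)
        fB[k + 1] = belief_step(fB[k], dsB[k], dsA[k], dD[k], dt, **params)
    return fA, fB
```

Both groups see exactly the same increments; only the role of "own" and "other" signal is swapped. With $\phi = 0$ the two gains on the signals coincide, so two groups that start from the same belief never disagree.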

Since, in the stationary solution, the beliefs of all agents have constant variance, one may refer to the conditional mean of the beliefs as the agents’ beliefs. Let $g^A$ and $g^B$ denote the differences in beliefs:

\[ g^A = f^B - f^A, \qquad g^B = f^A - f^B. \]

The next proposition describes the evolution of these differences in beliefs:

Proposition 3.

\[ dg^A_t = -\rho\,g^A_t\,dt + \sigma_g\,dW^{A,g}_t, \tag{17} \]

where

\[ \rho = \sqrt{\left(\lambda + \phi\frac{\sigma_f}{\sigma_s}\right)^2 + (1 - \phi^2)\,\sigma_f^2\left(\frac{2}{\sigma_s^2} + \frac{1}{\sigma_D^2}\right)}, \tag{18} \]

\[ \sigma_g = \sqrt{2}\,\phi\,\sigma_f, \]

and $W^{A,g}$ is a standard Wiener process for agents in group A.

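Proof: From equations (12) and (16):

$$dg^A_t = df^B_t - df^A_t = -\left[\lambda + \frac{2\gamma + \phi\sigma_s\sigma_f}{\sigma_s^2} + \frac{\gamma}{\sigma_D^2}\right] g^A_t\,dt + \frac{\phi\sigma_f}{\sigma_s}\,(ds^B_t - ds^A_t).$$

Using the formula for γ, we may write the mean-reversion parameter as in equation (18). Using equations (13) and (14),

$$dg^A_t = -\rho\, g^A_t\,dt + \frac{\phi\sigma_f}{\sigma_s}\left(\sigma_s\,dW^{A,B}_t - \sigma_s\,dW^{A,A}_t\right).$$

The result follows by writing

$$W^{A,g} = \frac{1}{\sqrt{2}}\left(W^{A,B} - W^{A,A}\right).$$

It is easy to verify that innovations to $W^{A,g}$ are orthogonal to innovations to $f^A$ in the mind of agents in group A.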

Proposition 3 implies that the difference in beliefs $g^A$ follows a simple mean-reverting diffusion process in the mind of group A agents. In particular, the volatility of the difference in beliefs is zero in the absence of overconfidence. A larger φ leads to greater volatility. In addition, $-\rho/(2\sigma_g^2)$ measures the pull towards the origin. A simple calculation shows that this mean-reversion decreases with φ. A higher φ causes an increase in fluctuations of opinions and a slower mean-reversion.
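To get a feel for the dynamics in equation (17), one can simulate the process on a time grid. The sketch below assumes hypothetical values for ρ and σ_g and uses the exact Gaussian transition of a mean-reverting (Ornstein-Uhlenbeck) diffusion, whose stationary variance is σ_g²/(2ρ). The function name `simulate_ou` is an illustrative choice, not something from the source.

```python
import math
import random

def simulate_ou(rho, sigma_g, dt, n_steps, g0=0.0, seed=0):
    """Sample dg = -rho*g dt + sigma_g dW on a grid via its exact Gaussian transition."""
    rng = random.Random(seed)
    decay = math.exp(-rho * dt)
    # Standard deviation of the increment over one step of length dt.
    step_sd = sigma_g * math.sqrt((1.0 - decay ** 2) / (2.0 * rho))
    path = [g0]
    for _ in range(n_steps):
        path.append(decay * path[-1] + step_sd * rng.gauss(0.0, 1.0))
    return path

rho, sigma_g = 0.5, 0.2  # hypothetical values
path = simulate_ou(rho, sigma_g, dt=0.01, n_steps=400_000)
sample_var = sum(g * g for g in path) / len(path)
print("sample variance:     ", sample_var)
print("stationary variance: ", sigma_g ** 2 / (2.0 * rho))
```

A larger σ_g widens the stationary distribution, while a larger ρ narrows it, which is the trade-off described in the paragraph above.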


In an analogous fashion, for agents in group B, gB satisfies:

$$dg^B_t = -\rho\, g^B_t\,dt + \sigma_g\,dW^{B,g}_t, \tag{19}$$

where WB,g is a standard Wiener process.

Notice that although we started with a Markovian structure on dividends and signals, the beliefs depend on the history of dividends and signals: only the vector involving dividends, all signals and beliefs is Markovian. This is a consequence of the inference problem faced by investors. In contrast, in this model, the difference in beliefs is a Markov diffusion, which greatly facilitates the analysis that follows.

6 Trading and Equilibrium Price in Continuous Time

In the previous section we specified a particular model of heterogeneous beliefs, generated by overconfidence. Equations (17) and (19) state that, in each group's mind, the difference of opinions follows a mean-reverting diffusion process. The coefficients of this process are linked to the parameters describing the original uncertainty and the degree of overconfidence. In this section we derive implications of this particular model of heterogeneity for the equilibrium prices and trading behavior. We also summarize some results from Scheinkman and Xiong (2003) concerning the effect of the parameters that determine the original uncertainty and overconfidence on the prices and trading volume that obtain in equilibrium.

As in the Harrison and Kreps model described in Section 3, assume that each group of investors is large and there is no short selling of the risky asset. To value future cash flows, assume that every agent can borrow and lend at the same rate of interest r. These assumptions facilitate the calculation of equilibrium prices.

At each t, agents in group C ∈ {A, B} are willing to pay $p^C_t$ for a unit of the asset. As in the Harrison and Kreps model described in Section 3, the presence of the short-sale constraint, a finite supply of the asset, and an infinite number of prospective buyers guarantee that any successful bidder will pay his reservation price. The amount that an agent is willing to pay reflects the agent's fundamental valuation and the fact that he may be able to sell his holdings at a later date at the demand price of agents in the other group for a profit. If o ∈ {A, B} denotes the group of the current owner, $\bar o$ the other group, and $E^o_t$ the expectation of members of group o, conditional on the information they have at t, then:

$$p^o_t = \sup_{\tau \ge 0} E^o_t\left[\int_t^{t+\tau} e^{-r(s-t)}\,dD_s + e^{-r\tau}\left(p^{\bar o}_{t+\tau} - c\right)\right], \tag{20}$$

where τ is a stopping time, c is a transaction cost charged to the seller, and $p^{\bar o}_{t+\tau}$ is the reservation value of the buyer at the time of transaction t + τ.

Using the equations for the evolution of dividends and for the conditional mean of beliefs (equations (8), (12) and (16) above), one obtains:

$$\int_t^{t+\tau} e^{-r(s-t)}\,dD_s = \int_t^{t+\tau} e^{-r(s-t)}\left[\bar f + e^{-\lambda(s-t)}(f^o_t - \bar f)\right]ds + M_{t+\tau},$$

where $E^o_t M_{t+\tau} = 0$. Hence, we may rewrite equation (20) as:

$$p^o_t = \max_{\tau \ge 0} E^o_t\left\{\int_t^{t+\tau} e^{-r(s-t)}\left[\bar f + e^{-\lambda(s-t)}(f^o_t - \bar f)\right]ds + e^{-r\tau}\left(p^{\bar o}_{t+\tau} - c\right)\right\}. \tag{21}$$
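The integrand in equation (21) is the owner's forecast of the dividend rate, discounted at rate r. If the asset were held forever, integrating it would give the closed form $\bar f/r + (f^o_t - \bar f)/(r+\lambda)$. The sketch below checks this numerically with a midpoint rule. The function names and parameter values are hypothetical, and truncating the integral at a finite horizon is an approximation.

```python
import math

def fundamental_value(f_bar, f_now, r, lam):
    """Closed-form present value of expected dividends over an infinite horizon."""
    return f_bar / r + (f_now - f_bar) / (r + lam)

def discounted_expected_dividends(f_bar, f_now, r, lam, horizon=400.0, n=400_000):
    """Midpoint-rule integral of exp(-r*s) * (f_bar + exp(-lam*s) * (f_now - f_bar))."""
    ds = horizon / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * ds
        forecast = f_bar + math.exp(-lam * s) * (f_now - f_bar)
        total += math.exp(-r * s) * forecast * ds
    return total

# Hypothetical values: long-run mean 1.0, current belief 1.5.
closed = fundamental_value(f_bar=1.0, f_now=1.5, r=0.05, lam=0.1)
numeric = discounted_expected_dividends(f_bar=1.0, f_now=1.5, r=0.05, lam=0.1)
print(closed, numeric)
```

The two numbers agree up to the truncation and discretization error, which is small for the horizon and grid chosen here.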

Scheinkman and Xiong (2003) start by postulating a particular form for the equilibrium price function, equation (22) below. Proceeding in a heuristic fashion, they derive properties that the candidate equilibrium price function should satisfy. They then construct a function that satisfies these properties, and verify that they have produced an equilibrium.

Since all the relevant stochastic processes are Markovian and time-homogeneous, and traders are risk-neutral, it is natural to look for an equilibrium in which the demand price of the current owner satisfies

$$p^o_t = p^o(f^o_t, g^o_t) = \frac{\bar f}{r} + \frac{f^o_t - \bar f}{r+\lambda} + q(g^o_t), \tag{22}$$

with q > 0 and q′ > 0. This equation states that prices are the sum of two components. The first part, $\bar f/r + (f^o_t - \bar f)/(r+\lambda)$, is the expected present value of future dividends from the viewpoint of the current owner. The second is the value of the resale option, $q(g^o_t)$, which depends on the current difference between the beliefs of the other group's agents and the beliefs of the current owner. We call the first quantity the owner's fundamental valuation and the second the value of the resale option. Using (22) in equation (21) and collecting terms:

$$p^o_t = p^o(f^o_t, g^o_t) = \frac{\bar f}{r} + \frac{f^o_t - \bar f}{r+\lambda} + \sup_{\tau \ge 0} E^o_t\left[\left(\frac{g^o_{t+\tau}}{r+\lambda} + q(g^{\bar o}_{t+\tau}) - c\right)e^{-r\tau}\right].$$

Equivalently, the resale option value satisfies

$$q(g^o_t) = \sup_{\tau \ge 0} E^o_t\left[\left(\frac{g^o_{t+\tau}}{r+\lambda} + q(g^{\bar o}_{t+\tau}) - c\right)e^{-r\tau}\right]. \tag{23}$$

Hence to show that an equilibrium of the form (22) exists, it is necessary and sufficient to construct an option value function q that satisfies equation (23). This equation is similar to a Bellman equation. The current asset owner chooses an optimal stopping time to exercise his re-sale option. Upon the exercise of the option, the owner gets the “strike price” $\frac{g^o_{t+\tau}}{r+\lambda} + q(g^{\bar o}_{t+\tau})$, the amount of excess optimism that the buyer has about the asset's fundamental value and the value of the resale option to the buyer, minus the cost c. In contrast to the optimal exercise problem of American options, the “strike price” in this problem depends on the re-sale option value function itself.


The region where the value of the option equals that of an immediate sale is the stopping region. The complement is the continuation region. In the mind of the risk neutral asset holder, the discounted value of the option $e^{-rt}q(g^o_t)$ should be a martingale in the continuation region, and a supermartingale in the stopping region. Using Ito's lemma and the evolution equation for $g^o$, these conditions can be stated as:

$$q(x) \ge \frac{x}{r+\lambda} + q(-x) - c, \tag{24}$$

$$\frac{1}{2}\sigma_g^2\, q'' - \rho x\, q' - r q \le 0, \quad \text{with equality if (24) holds strictly.} \tag{25}$$

In addition, the function q should be continuously differentiable (smooth pasting). As usual, one first shows that there exists a smooth function q that satisfies equations (24) and (25) and then uses these properties and a growth condition on q to show that in fact the function q solves (23).

To construct the function q, guess that the continuation region will be an interval $(-\infty, k^*)$, with $k^* \ge 0$. $k^*$ is the minimum amount of difference in opinions that generates a trade. The second order ordinary differential equation that q must satisfy, albeit only in the continuation region, is:

$$\frac{1}{2}\sigma_g^2\, u'' - \rho x\, u' - r u = 0. \tag{26}$$

The following proposition describes all solutions of equation (26).

Proposition 4. Let

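$$h(x) = \begin{cases} U\!\left(\dfrac{r}{2\rho}, \dfrac{1}{2}, \dfrac{\rho}{\sigma_g^2}x^2\right) & \text{if } x \le 0, \\[2ex] \dfrac{2\pi}{\Gamma\!\left(\frac{1}{2} + \frac{r}{2\rho}\right)\Gamma\!\left(\frac{1}{2}\right)}\, M\!\left(\dfrac{r}{2\rho}, \dfrac{1}{2}, \dfrac{\rho}{\sigma_g^2}x^2\right) - U\!\left(\dfrac{r}{2\rho}, \dfrac{1}{2}, \dfrac{\rho}{\sigma_g^2}x^2\right) & \text{if } x > 0, \end{cases} \tag{27}$$

where Γ(·) is the Gamma function, and $M : \mathbb{R}^3 \to \mathbb{R}$ and $U : \mathbb{R}^3 \to \mathbb{R}$ are two Kummer functions described in chapter 13 of Abramowitz and Stegun (1964). h(x) is positive and increasing in (−∞, 0). In addition h solves equation (26) with

$$h(0) = \frac{\pi}{\Gamma\!\left(\frac{1}{2} + \frac{r}{2\rho}\right)\Gamma\!\left(\frac{1}{2}\right)}.$$

Any solution u(x) to equation (26) that is strictly positive and increasing in (−∞, 0) must satisfy: $u(x) = \beta_1 h(x)$ for some $\beta_1 > 0$.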
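The function h can be evaluated numerically with the standard library alone. The sketch below assumes the identity in Abramowitz and Stegun (13.1.3) that expresses U through M; for the second argument 1/2 this collapses both branches of (27) into one formula, an even part plus an odd part in x. The power-series routine is only adequate for moderate arguments, and the names `kummer_m`, `make_h`, `ode_residual` and all parameter values are hypothetical.

```python
import math

def kummer_m(a, b, z, tol=1e-15, max_terms=5000):
    """Kummer's M(a, b, z) summed from its power series (moderate z >= 0 only)."""
    term, total = 1.0, 1.0
    for n in range(max_terms):
        term *= (a + n) * z / ((b + n) * (n + 1))
        total += term
        if n > z and abs(term) < tol * abs(total):
            break
    return total

def make_h(r, rho, sigma_g):
    """Build h from equation (27), with U rewritten in terms of M."""
    a = r / (2.0 * rho)
    even = math.pi / (math.gamma(a + 0.5) * math.gamma(0.5))
    odd = math.pi / (math.gamma(a) * math.gamma(1.5))
    scale = math.sqrt(rho) / sigma_g
    def h(x):
        y = (scale * x) ** 2
        return even * kummer_m(a, 0.5, y) + odd * scale * x * kummer_m(a + 0.5, 1.5, y)
    return h

def ode_residual(h, x, r, rho, sigma_g, eps=1e-4):
    """Left-hand side of equation (26), with derivatives taken by central differences."""
    d1 = (h(x + eps) - h(x - eps)) / (2.0 * eps)
    d2 = (h(x + eps) - 2.0 * h(x) + h(x - eps)) / eps ** 2
    return 0.5 * sigma_g ** 2 * d2 - rho * x * d1 - r * h(x)

h = make_h(r=0.05, rho=0.5, sigma_g=0.2)  # hypothetical parameter values
print("h(0) =", h(0.0))
print("residual of (26) at x = 0.3:", ode_residual(h, 0.3, 0.05, 0.5, 0.2))
```

For strongly negative x the two terms nearly cancel, so a production implementation would likely switch to an asymptotic expansion or a dedicated special-function library there.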

Proof: Let v(y) be a solution to the differential equation

$$y\,v''(y) + \left(\tfrac{1}{2} - y\right)v'(y) - \frac{r}{2\rho}\,v(y) = 0. \tag{28}$$

It is straightforward to verify that $u(x) = v\!\left(\frac{\rho}{\sigma_g^2}x^2\right)$ satisfies equation (26). The general solution of equation (28) is

$$v(y) = \alpha\, M\!\left(\frac{r}{2\rho}, \frac{1}{2}, y\right) + \beta\, U\!\left(\frac{r}{2\rho}, \frac{1}{2}, y\right).$$

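Given a solution u to equation (26) one can construct two solutions v to equation (28), by using the values of the function for x < 0 and for x > 0. Write the corresponding linear combinations of M and U as $\alpha_1 M + \beta_1 U$ and $\alpha_2 M + \beta_2 U$. If these combinations are constructed from the same u their values and first derivatives must have the same limit as x → 0. To guarantee that u(x) is positive and increasing for x < 0, $\alpha_1$ must be zero. Therefore,

$$u(x) = \beta_1\, U\!\left(\frac{r}{2\rho}, \frac{1}{2}, \frac{\rho}{\sigma_g^2}x^2\right) \quad \text{if } x \le 0.$$

From the definition of the two Kummer functions, one can show that

$$x \to 0^-: \quad u(x) \to \frac{\beta_1 \pi}{\Gamma\!\left(\frac{1}{2} + \frac{r}{2\rho}\right)\Gamma\!\left(\frac{1}{2}\right)}, \qquad u'(x) \to \frac{\beta_1 \pi \sqrt{\rho}}{\sigma_g\,\Gamma\!\left(\frac{r}{2\rho}\right)\Gamma\!\left(\frac{3}{2}\right)},$$

$$x \to 0^+: \quad u(x) \to \alpha_2 + \frac{\beta_2 \pi}{\Gamma\!\left(\frac{1}{2} + \frac{r}{2\rho}\right)\Gamma\!\left(\frac{1}{2}\right)}, \qquad u'(x) \to -\frac{\beta_2 \pi \sqrt{\rho}}{\sigma_g\,\Gamma\!\left(\frac{r}{2\rho}\right)\Gamma\!\left(\frac{3}{2}\right)}.$$

By matching the values and first order derivatives of u(x) from the two sides of x = 0, we have

$$\beta_2 = -\beta_1, \qquad \alpha_2 = \frac{2\beta_1 \pi}{\Gamma\!\left(\frac{1}{2} + \frac{r}{2\rho}\right)\Gamma\!\left(\frac{1}{2}\right)}.$$

The function h is a solution to equation (26) that satisfies

$$h(0) = \frac{\pi}{\Gamma\!\left(\frac{1}{2} + \frac{r}{2\rho}\right)\Gamma\!\left(\frac{1}{2}\right)} > 0,$$

and h(−∞) = 0. Equation (26) guarantees that at any critical point where h < 0, h has a maximum, and at any critical point where h > 0 it has a minimum. Hence h is strictly positive and increasing in (−∞, 0).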

Further properties of the function h are summarized in the following Lemma.

Lemma 1. For each x ∈ R, h(x) > 0, h′(x) > 0, h′′(x) > 0, h′′′(x) > 0, $\lim_{x \to -\infty} h(x) = 0$, and $\lim_{x \to -\infty} h'(x) = 0$.

Since q must be positive and increasing in $(-\infty, k^*)$, it follows from Proposition 4 that

$$q(x) = \begin{cases} \beta_1 h(x), & \text{for } x < k^*, \\[1ex] \dfrac{x}{r+\lambda} + \beta_1 h(-x) - c, & \text{for } x \ge k^*. \end{cases} \tag{29}$$

Since q is continuous and continuously differentiable at $k^*$,

$$\beta_1 h(k^*) - \frac{k^*}{r+\lambda} - \beta_1 h(-k^*) + c = 0,$$

$$\beta_1 h'(k^*) + \beta_1 h'(-k^*) - \frac{1}{r+\lambda} = 0.$$

These equations imply that

$$\beta_1 = \frac{1}{\left(h'(k^*) + h'(-k^*)\right)(r+\lambda)}, \tag{30}$$

and $k^*$ satisfies

$$\left[k^* - c(r+\lambda)\right]\left(h'(k^*) + h'(-k^*)\right) - h(k^*) + h(-k^*) = 0. \tag{31}$$

The next proposition shows that for each c, there exists a unique pair $(k^*, \beta_1)$ that solves equations (30) and (31). The smooth pasting conditions are sufficient to determine the function q and the “trading point” $k^*$.

Proposition 5. For each trading cost c ≥ 0, there exists a unique k∗ that solves (31).If c = 0 then k∗ = 0. If c > 0, k∗ > c(r + λ).

Proof: Let l(k) = [k − c(r + λ)](h′(k) + h′(−k)) − h(k) + h(−k).

If c = 0, l(0) = 0, and l′(k) = k[h′′(k) − h′′(−k)] > 0, for all k ≠ 0. Therefore k∗ = 0 is the only root of l(k) = 0.

If c > 0, then l(k) < 0, for all k ∈ [0, c(r + λ)]. Since h′′ and h′′′ are increasing (Lemma 1), for all k > c(r + λ)

$$l'(k) = [k - c(r+\lambda)]\,[h''(k) - h''(-k)] > 0,$$

$$l''(k) = h''(k) - h''(-k) + [k - c(r+\lambda)]\,[h'''(k) - h'''(-k)] > 0.$$

Therefore l(k) = 0 has a unique solution k∗ > c(r + λ).
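The root of l(k) can be located numerically. The following sketch finds $k^*$ by bisection, starting from the lower bound c(r + λ) given in Proposition 5, and then evaluates $\beta_1$ from (30) and the product $\beta_1 h(-k^*)$. It is a minimal illustration under stated assumptions: h is evaluated through a power series for Kummer's M (using the Abramowitz and Stegun identity 13.1.3 for U), h′ is approximated by central differences, and all function names and parameter values are hypothetical.

```python
import math

def kummer_m(a, b, z, tol=1e-15, max_terms=5000):
    """Kummer's M(a, b, z) summed from its power series (moderate z >= 0 only)."""
    term, total = 1.0, 1.0
    for n in range(max_terms):
        term *= (a + n) * z / ((b + n) * (n + 1))
        total += term
        if n > z and abs(term) < tol * abs(total):
            break
    return total

def make_h(r, rho, sigma_g):
    """h from equation (27), with U rewritten in terms of M."""
    a = r / (2.0 * rho)
    even = math.pi / (math.gamma(a + 0.5) * math.gamma(0.5))
    odd = math.pi / (math.gamma(a) * math.gamma(1.5))
    scale = math.sqrt(rho) / sigma_g
    def h(x):
        y = (scale * x) ** 2
        return even * kummer_m(a, 0.5, y) + odd * scale * x * kummer_m(a + 0.5, 1.5, y)
    return h

def l_residual(h, k, c, r, lam, eps=1e-5):
    """l(k): the left-hand side of equation (31), with h' by central differences."""
    dh = lambda x: (h(x + eps) - h(x - eps)) / (2.0 * eps)
    return (k - c * (r + lam)) * (dh(k) + dh(-k)) - h(k) + h(-k)

def solve_barrier(h, c, r, lam, eps=1e-5):
    """Return (k_star, beta_1, beta_1 * h(-k_star)) using bisection on l(k)."""
    if c == 0.0:
        k_star = 0.0
    else:
        lo = c * (r + lam)          # l(lo) < 0 by Proposition 5
        hi = 2.0 * lo
        while l_residual(h, hi, c, r, lam, eps) <= 0.0:
            hi *= 2.0               # expand until the root is bracketed
        for _ in range(100):
            mid = 0.5 * (lo + hi)
            if l_residual(h, mid, c, r, lam, eps) < 0.0:
                lo = mid
            else:
                hi = mid
        k_star = 0.5 * (lo + hi)
    dh = lambda x: (h(x + eps) - h(x - eps)) / (2.0 * eps)
    beta_1 = 1.0 / ((dh(k_star) + dh(-k_star)) * (r + lam))
    return k_star, beta_1, beta_1 * h(-k_star)

h = make_h(r=0.05, rho=0.5, sigma_g=0.2)  # hypothetical parameter values
for c in (0.0, 0.001, 0.01):
    k_star, beta_1, option_at_trade = solve_barrier(h, c, r=0.05, lam=0.1)
    print(f"c={c}: k*={k_star:.4f}  beta_1={beta_1:.4f}  beta_1*h(-k*)={option_at_trade:.4f}")
```

Bisection is used here because l is negative at the lower bound and increasing beyond it, so a sign change is enough to pin down the unique root.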

The next proposition establishes that the function q described by equation (29), with $\beta_1$ and $k^*$ given by (30) and (31), solves (23). The proof consists of two parts. First, it is established that (24) and (25) hold and that q′ is bounded. A standard argument, e.g. Kobila (1993) or Scheinkman and Zariphopoulou (2001), is then used to show that in fact q solves equation (23).

Proposition 6. The function q constructed above is an equilibrium option value function. The optimal policy consists of exercising immediately if $g^o > k^*$, otherwise wait until the first time in which $g^o \ge k^*$.

Proof: Let

$$b \equiv q(-k^*) = \frac{1}{r+\lambda}\,\frac{h(-k^*)}{h'(k^*) + h'(-k^*)}, \tag{32}$$

then equation (29) implies

$$q(-x) = \begin{cases} \dfrac{b}{h(-k^*)}\,h(-x) & \text{for } x > -k^*, \\[2ex] \dfrac{-x}{r+\lambda} + \dfrac{b}{h(-k^*)}\,h(x) - c & \text{for } x \le -k^*. \end{cases}$$

Equation (24) may be rewritten as $U(x) = q(x) - \frac{x}{r+\lambda} - q(-x) + c \ge 0$. A simple calculation shows that

$$U(x) = \begin{cases} 2c & \text{for } x < -k^*, \\[1ex] \dfrac{-x}{r+\lambda} + \dfrac{b}{h(-k^*)}\,[h(x) - h(-x)] + c & \text{for } -k^* \le x \le k^*, \\[2ex] 0 & \text{for } x > k^*. \end{cases}$$

Thus, $U''(x) = \frac{b}{h(-k^*)}\,[h''(x) - h''(-x)]$, $-k^* \le x \le k^*$. Lemma 1 guarantees $U''(x) > 0$ for $0 < x < k^*$, and $U''(x) < 0$ for $-k^* < x < 0$. Since $U'(k^*) = 0$, $U'(x) < 0$ for $0 < x < k^*$. On the other hand, $U'(-k^*) = 0$, so $U'(x) < 0$ for $-k^* < x < 0$. Therefore U(x) is monotonically decreasing for $-k^* < x < k^*$. Since $U(-k^*) = 2c > 0$ and $U(k^*) = 0$, $U(x) > 0$ for $-k^* < x < k^*$. Hence equation (24) holds in $(-\infty, k^*)$.

Equation (25) holds by construction in the region $x \le k^*$. Therefore it is sufficient to show that equation (25) is valid for $x \ge k^*$. In this region, $q(x) = \frac{x}{r+\lambda} + \frac{b}{h(-k^*)}\,h(-x) - c$, thus $q'(x) = \frac{1}{r+\lambda} - \frac{b}{h(-k^*)}\,h'(-x)$ and $q''(x) = \frac{b}{h(-k^*)}\,h''(-x)$.

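Hence,

$$\begin{aligned} \frac{1}{2}\sigma_g^2\, q''(x) - \rho x\, q'(x) - r q(x) &= \frac{b}{h(-k^*)}\left[\frac{1}{2}\sigma_g^2\, h''(-x) + \rho x\, h'(-x) - r h(-x)\right] - \frac{r+\rho}{r+\lambda}\,x + rc \\ &= -\frac{r+\rho}{r+\lambda}\,x + rc \le -(r+\rho)c + rc = -\rho c < 0, \end{aligned}$$

where the inequality comes from the fact that $x \ge k^* > (r+\lambda)c$ (see Proposition 5).

Also q has an increasing derivative in $(-\infty, k^*)$ and has a derivative bounded in absolute value by $\frac{1}{r+\lambda}$ in $(k^*, \infty)$. Hence q′ is bounded.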

If τ is any stopping time, the version of Ito's lemma for twice differentiable functions with absolutely continuous first derivatives (e.g. Revuz and Yor (1999), Chapter VI) implies that

$$e^{-r\tau} q(g^o_\tau) = q(g^o_0) + \int_0^\tau e^{-rs}\left[\frac{1}{2}\sigma_g^2\, q''(g^o_s) - \rho\, g^o_s\, q'(g^o_s) - r q(g^o_s)\right]ds + \int_0^\tau e^{-rs}\sigma_g\, q'(g^o_s)\,dW_s.$$

Equation (25) states that the first integral is non positive, while the bound on q′ guarantees that the second integral is a martingale. Using equation (24) we obtain,

$$E^o\left\{e^{-r\tau}\left[\frac{g^o_\tau}{r+\lambda} + q(-g^o_\tau) - c\right]\right\} \le E^o\left[e^{-r\tau} q(g^o_\tau)\right] \le q(g^o_0).$$

This shows that no policy can yield more than q(x).

Now consider the stopping time $\tau = \inf\{t : g^o_t \ge k^*\}$. Such τ is finite with probability one, and $g^o_s$ is in the continuation region for s < τ. Using exactly the same reasoning as above, but recalling that in the continuation region (25) holds with equality we obtain

$$q(g^o) = E^o\left\{e^{-r\tau}\left[\frac{g^o_\tau}{r+\lambda} + q(-g^o_\tau) - c\right]\right\}.$$

It is a consequence of Proposition 6 that the process $g^o$ will have values in $(-\infty, k^*)$. The value $k^*$ acts as a barrier, and when $g^o$ reaches $k^*$, a trade occurs, the owner's group switches and the process is restarted at $-k^*$. $q(g^o)$ is the difference between the current owner's demand price and his fundamental valuation and can be legitimately called a “bubble”.

The model determines the magnitude of the bubble and the duration between trades. The magnitude of the bubble can be measured by b, as in equation (32), the value of the resale option when a trade occurs.

If we write h in equation (27) as a function of both x and r, Scheinkman and Xiong (2003) show that the expected duration between two trades is given by

$$E[\tau] = -\frac{\partial}{\partial r}\left[\frac{h(-k^*, r)}{h(k^*, r)}\right]\Bigg|_{r=0}.$$

When the trading cost is zero (c = 0), the trading barrier is $k^* = 0$, which implies that the expected duration between trades is also zero. This is due to the fact that once a Brownian motion hits a point, it will hit the same point infinitely many times in any given period immediately afterward.

Various comparative statics results are described in Scheinkman and Xiong (2003). As investors become more overconfident (φ increases), the volatility parameter of the difference in beliefs ($\sigma_g$) increases, resulting in more trades (shorter duration between trades) and a larger price bubble. As the signals become more informative, the mean reverting speed of the difference in beliefs (ρ) becomes larger, resulting in shorter duration between trades and a larger price bubble. As the trading cost c increases from zero, the duration between trades increases and the magnitude of the bubble is reduced, in a continuous manner. In particular the model predicts large trading volume in markets with sufficiently small transaction costs. The effect from an increase in trading cost is most dramatic for the duration between trades but the effects on the bubble are modest. This suggests that while a tax on transactions (Tobin Tax) would have some effect on trading volume, it would have a small effect on the size of a bubble.


In a risk-neutral world, one may consider several assets and analyze the equilibrium in each market independently. In this way the comparative statics properties described in the previous paragraph can be translated into results about correlations among equilibrium variables in the different markets. Thus this model is potentially capable of explaining the observed cross-sectional correlation between market/book ratio and turnover for U.S. stocks in the period of 1996-2000 as documented by Cochrane (2002). It is also able to account for the analogous cross-sectional correlation that has been found by Mei, Scheinkman, and Xiong (2003) between the price ratio of China's A shares to B shares and turnover.

7 Other Related Models

In this section, we discuss other models that have been proposed in the finance literature to study price bubbles and effects of heterogeneous beliefs on trading and asset prices.

7.1 Models on Rational Bubbles

There has been a large literature studying rational bubbles, including Blanchard and Watson (1982), Santos and Woodford (1997) and others. In these papers, all agents have identical rational expectations, and the asset prices can be decomposed into two parts, a fundamental component and a bubble component which is expected to grow at a rate equal to the risk free rate. In fact, such a rational bubble component can also be built into the models discussed in Sections 3 and 6. Given a price process $p_t$ that satisfies equation (1) and $m_t$, a non-negative $F_t$-martingale, $\tilde p_t = p_t + \gamma^{-t} m_t$ also satisfies equation (1). A corresponding remark holds for equation (20) that describes the equilibrium of the model based on overconfidence. We have ruled out such rational bubbles in our previous discussion.

Campbell, Lo, and MacKinlay (1997, pages 258-260) provide a detailed discussion on the properties of rational bubbles. To make this type of bubble sustainable, the asset must have a potentially infinite maturity. Another property of rational bubbles is that the asset price grows on average without bounds. In addition, the models of rational bubbles provide no explanation for the increase in trading that is often observed during historical bubble episodes.

As we discussed in the previous sections, the resale option provides an alternative way to analyze asset price bubbles, since its value is determined by heterogeneous beliefs among investors, which is a variable orthogonal to the fundamental value of the asset. In contrast to rational bubbles, the resale option component does not need to be explosive although its magnitude could be very significant due to its recursive structure. Consequently, in the model exposited in Section 6, variables such as the asset price in equation (22) and its change have stationary distributions. In addition, the size of the bubble generated from the resale option is positively correlated with trading volume, a property that is apparent in several actual episodes of price bubbles.

Finally, the resale option still exists for an asset with a fixed finite maturity, which is not possible for rational bubbles, which depend on a potentially infinite life. It should be apparent from the analysis in Section 6 that one can, in principle, treat an asset with a fixed terminal date. Equations (20) to (21) would apply with the obvious changes to account for the finite horizon. However, the option value q will now depend on the remaining life of the asset, introducing another dimension to the optimal stopping problem. The infinite horizon problem is stationary, greatly reducing the mathematical difficulty.

7.2 Trading Caused by Heterogeneous Beliefs

Several other models have been proposed to analyze asset trading based on heterogeneous beliefs, such as Varian (1989), Harris and Raviv (1993), Kandel and Pearson (1995), Kyle and Lin (2003), and Cao and Ou-Yang (2004).

Harris and Raviv (1993) analyze a model with two groups of risk neutral speculators who trade a risky asset at dates t = 1, 2, ..., T − 1. The final liquidation value of the asset at T is random and can be either high (H) or low (L). At each date a public noisy signal is revealed to the two groups of speculators, who assign different probability distributions for the signal and therefore would hold heterogeneous expectations of the final liquidation value of the asset. This mechanism for generating heterogeneous beliefs is similar to what we discuss in Section 5 with investor overconfidence, but with a different random process and noise distribution. The beliefs of the two groups are denoted by $\pi^j_H(t)$, the posterior probability that the final liquidation value will be high (j = 1, 2).

To analyze trading between the two groups of speculators resulting from differences in their beliefs, the model also imposes a short-sales constraint, and assumes that one group has sufficient market power to offer a price on a take-it-or-leave-it basis to the other group. As a result, the trading price of the asset will always equal the reservation price of the other group (the “price-taking” group). The existence of such a price-taking group is not a natural assumption and rules out the presence of a bubble in the observed trading prices, but greatly simplifies the recursive structure in the determination of equilibrium prices that arises in the models of Harrison and Kreps (1978) and Scheinkman and Xiong (2003). Harris and Raviv demonstrate in their model that trade will only occur when the two groups switch sides ($\pi^1_H$ and $\pi^2_H$ flip order), and that there is a positive correlation between trading volume and absolute price changes (but not necessarily the level of prices).


7.3 Effects of Heterogeneous Beliefs on Risk Premia

Lucas (1978) provides a simple and elegant equilibrium model to analyze the relation between equity premium and aggregate consumption. Consider an economy with a representative agent and an infinite time horizon (t = 1, 2, 3, ...). The agent maximizes his lifetime utility from consumption:

$$E\left[\sum_{t=0}^{\infty} \beta^t u(c_t)\right]$$

where β, (0 < β < 1), is the agent's subjective discount factor, $c_t$ is the consumption in period t, and the agent's utility from consumption u(c) is often assumed to have a power form: $u(c) = \frac{1}{1-\gamma}c^{1-\gamma}$. There are two assets: one is a risk-free asset and the other is a claim to aggregate endowment in the economy $c_t$ (t = 1, 2, ...). The risk-free asset is in zero net supply. The agent's marginal rate of substitution provides his pricing kernel for future cashflow:

$$m_{t+1} = \beta\,\frac{u'(c_{t+1})}{u'(c_t)}.$$

More specifically, a random cashflow of $x_{t+1}$ at t + 1 is worth

$$p_t = E_t(m_{t+1} x_{t+1})$$

at period t. Thus, the riskfree rate is given by

$$R^f = 1/E(m_t),$$

and the return on a risky asset $R^i$ should satisfy

$$E(m_t R^i_t) = 1.$$

By decomposing the last equation, we obtain $E(m_t)E(R^i_t) + \mathrm{cov}(m_t, R^i_t) = 1$. Therefore,

$$E(R^i_t) - R^f = -\frac{\mathrm{cov}(m_t, R^i_t)}{E(m_t)}.$$

This relation implies an upper bound on any risky asset's Sharpe ratio:

$$\left|\frac{E(R^i_t) - R^f}{\sigma(R^i_t)}\right| = \left|-\frac{\mathrm{cov}(m_t, R^i_t)}{E(m_t)\,\sigma(R^i_t)}\right| \le \frac{\sigma(m_t)}{E(m_t)}. \tag{33}$$

The return from the stock market portfolio provides a way to calibrate the relationship between the equity risk premium and the pricing kernel implied by aggregate consumption. In the case of the market portfolio $R^{mv}$, the relation in formula (33) holds exactly, and a power utility function would imply

$$\left|\frac{E(R^{mv}) - R^f}{\sigma(R^{mv})}\right| = \frac{\sigma(m_{t+1})}{E(m_{t+1})} = \frac{\sigma[(c_{t+1}/c_t)^{-\gamma}]}{E[(c_{t+1}/c_t)^{-\gamma}]} \approx \gamma\,\sigma(\Delta \ln c),$$

where $R^{mv}$ is the return on the market portfolio and $R^f$ the risk-free rate. As summarized by Cochrane (2001, page 23), “over the last 50 years in the United States, real stock returns have averaged 9% with a standard deviation of about 16%, while the real return on treasury bills has been about 1%. Thus, the historical annual market Sharpe ratio has been about 0.5. Aggregate nondurable and services consumption growth had a mean and standard deviation of about 1%. We can only reconcile these facts if investors have a risk-aversion coefficient of 50,” which is much higher than what economists have usually assumed. This is called an “equity premium puzzle” by Mehra and Prescott (1985).
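The arithmetic behind the quoted figure of 50 is short enough to spell out. The sketch below uses only the numbers quoted from Cochrane (2001) above. The helper `lognormal_bound` additionally assumes that log consumption growth is normally distributed, which is an assumption made here for illustration rather than something stated in the text; under it the ratio σ(m)/E(m) has the closed form $\sqrt{e^{\gamma^2\sigma^2} - 1}$.

```python
import math

# Figures quoted from Cochrane (2001) in the text above.
mean_stock, mean_bill, sd_stock = 0.09, 0.01, 0.16
sd_consumption_growth = 0.01

sharpe = (mean_stock - mean_bill) / sd_stock
# Invert the approximation Sharpe ratio ~ gamma * sigma(delta ln c).
gamma_implied = sharpe / sd_consumption_growth

def lognormal_bound(gamma, sd):
    """sigma(m)/E(m) if log consumption growth is normal with std. dev. sd (assumed)."""
    return math.sqrt(math.exp((gamma * sd) ** 2) - 1.0)

print("Sharpe ratio:          ", sharpe)
print("implied risk aversion: ", gamma_implied)
print("bound at gamma = 2:    ", lognormal_bound(2.0, sd_consumption_growth))
print("bound at gamma = 50:   ", lognormal_bound(50.0, sd_consumption_growth))
```

With a conventional risk aversion such as 2, the bound is about 0.02, far below the historical Sharpe ratio of about 0.5.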

The equity premium puzzle has motivated a large number of papers since the mid-eighties. The objective of these papers is to present modifications of the standard model that justify a Sharpe ratio that is considerably higher than the one implied by the standard model. Part of this literature has used heterogeneous beliefs. Williams (1977), Abel (1990), Detemple and Murthy (1994), Zapatero (1998), and Basak (2000), among others, analyze the effects of heterogeneous beliefs on the equilibrium risk premium and interest rates.

Detemple and Murthy (1994) consider a continuous time production economy of Cox, Ingersoll and Ross (1985) with a Brownian uncertainty structure. They assume a risky production technology with a return that is invariant to scale and that has an unobservable mean. In addition, there are two groups of risk-averse agents who have heterogeneous prior beliefs on the mean return of the production technology. There are two assets: a claim to the aggregate output and a risk-less asset in zero net supply. Heterogeneous beliefs motivate agents in one group to borrow from agents in the other group using the risk-less asset. In equilibrium, the interest rate and risk premium on risky securities are determined by the wealth distribution across the two groups and each group's estimates of the production growth rate. Zapatero (1998) considers a similar model in a pure exchange economy of Lucas (1978). He shows that heterogeneous beliefs can lead to a lower risk-free interest rate in equilibrium than in an otherwise identical economy with homogeneous beliefs. This lower interest rate induces a higher excess return. Basak (2000), using a model similar to Detemple and Murthy (1994), further shows that heterogeneous beliefs could add risk to investors' financial investment and therefore may lead to a greater equity premium than an economy with homogeneous beliefs.

8 Survival of Traders with Incorrect Beliefs

In the earlier sections, we discussed various effects that can arise when traders with heterogeneous beliefs interact with each other in an asset market. Some traders may have incorrect beliefs which are generated from incorrect prior beliefs or from incorrect information processing rules, while some others may be smarter and have beliefs that are closer to the objective ones. In such an environment, an important question is whether traders with incorrect beliefs will lose money trading with smarter traders and eventually disappear from the market.

There has been a long debate on this fundamental issue. Friedman (1953) argues that irrational traders who use wrong beliefs cannot survive in a competitive market, since they will eventually lose their wealth to rational traders in the long run. More recently, De Long, Shleifer, Summers and Waldmann (1991) suggest that traders with wrong beliefs may survive in the long run since they may hold a portfolio with excessive risk but also higher expected return and therefore their wealth can eventually outgrow that of rational traders. Several recent studies have been devoted to analyzing this issue, e.g. Sandroni (2000), Blume and Easley (2001), and Kogan, Ross, Wang and Westerfield (2004). However, the answer is still inconclusive. Depending on the model assumptions, different results have been found in these studies. Here, we discuss a model from Kogan et al. (2004) as an example.

Kogan et al. consider an economy that has a finite horizon and evolves in continuous time. Uncertainty is described by a one-dimensional, standard Brownian motion $Z_t$ for 0 ≤ t ≤ T, which is defined on a complete probability space (Ω, F, P). There is a single share of a risky asset in the economy, the stock, which pays a terminal dividend payment $D_T$ at time T, determined by the process

$$dD_t = D_t(\mu\,dt + \sigma\,dZ_t)$$

where $D_0 = 1$ and σ > 0. There is also a zero coupon bond available in zero net supply. Each unit of the bond makes a sure payment of one at time T. We use the risk-free bond as the numeraire and denote the price of the stock at time t by $S_t$.

There are two types of competitive traders in the economy. Each set corresponds to half of the total number of traders. At time zero, all traders have an equal endowment of the stock (that we normalize as 1/2) and have zero units of the bond. Traders differ in their beliefs of the drift parameters of the dividend process. One set of traders, the rational traders, knows the true probability measure P, while the other set of traders, the irrational traders, believes incorrectly that the probability measure is Q, under which

$$Z^Q_t = Z_t - \int_0^t (\sigma\eta)\,ds$$

is a standard Brownian motion. Hence, for an irrational trader

$$dD_t = D_t\left[(\mu + \sigma^2\eta)\,dt + \sigma\,dZ^Q_t\right],$$

where $Z^Q_t$ is a Brownian motion. The constant η parameterizes the irrationality of these traders. When η is positive, the irrational traders are optimistic about the prospects of the evolution of the dividend and overestimate the rate of growth of the dividend. Conversely, a negative η corresponds to a pessimistic view. According to this specification, the two types of traders do not update their beliefs of the drift rate, and one group will always stay as the optimistic one. This structure is different from the one introduced in Section 5, where agents' beliefs fluctuate with the flow of information.

Rational traders and irrational traders consume only at time T, and they share the same constant relative risk aversion. They can trade continuously before time T and would aim to maximize their utility based on their own probability measure. Thus, a rational trader's objective is to maximize

$$E^P_0\left[\frac{1}{1-\gamma}\,C_{r,T}^{1-\gamma}\right]$$

where $C_{r,T}$ is the consumption of the typical rational trader at time T. Correspondingly an irrational trader's objective is to maximize

$$E^Q_0\left[\frac{1}{1-\gamma}\,C_{n,T}^{1-\gamma}\right]$$

where $C_{n,T}$ is the consumption of an irrational trader at time T. In addition, the model assumes that there is no trading cost and short-sales of shares are allowed.

The probability measure used by irrational traders Q is absolutely continuous with respect to the objective measure P. Thus, the expectation using probability measure Q can be transformed into an expectation using probability measure P through the density (Radon-Nikodym derivative) of the probability measure Q with respect to P:

$$\xi_t \equiv \frac{dQ}{dP}\bigg|_t = e^{-\frac{1}{2}\eta^2\sigma^2 t + \eta\sigma Z_t}. \tag{34}$$

Hence the irrational trader maximizes

$$E^Q_0\left[\frac{1}{1-\gamma}\,C_{n,T}^{1-\gamma}\right] = E^P_0\left[\xi_T\,\frac{1}{1-\gamma}\,C_{n,T}^{1-\gamma}\right].$$

Effectively, an irrational trader is maximizing a state dependent utility function, $\xi_T\,\frac{1}{1-\gamma}\,C_{n,T}^{1-\gamma}$, under the true probability measure P.

The competitive equilibrium of the economy described above can be solved analytically. Since there is only one source of uncertainty in the economy and there are no trading costs or short-sales constraints, it is expected that continuous trading on the stock and the bond is sufficient to dynamically complete the markets. Since complete markets yield Pareto efficient allocations, it is natural to first examine the set of Pareto efficient allocations and then show that with the corresponding support prices the two assets actually yield complete markets. This is the route chosen by Kogan et al. First they examine the set of allocations that solve:

$$\max\ \frac{1}{1-\gamma}\,C_{r,T}^{1-\gamma} + b\,\xi_t\,\frac{1}{1-\gamma}\,C_{n,T}^{1-\gamma} \quad \text{s.t.} \quad C_{r,T} + C_{n,T} = D_T$$

where b is the ratio of the utility weights for the two types of traders. The optimal allocations are

$$C_{r,T} = \frac{1}{1 + (b\,\xi_t)^{1/\gamma}}\,D_T, \qquad C_{n,T} = \frac{(b\,\xi_t)^{1/\gamma}}{1 + (b\,\xi_t)^{1/\gamma}}\,D_T. \tag{35}$$
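As a quick check of equation (35), the following sketch computes the two shares for given values of b, the density, γ and the dividend. The function name `pareto_allocation` and the numerical inputs are assumptions chosen for illustration. The sketch verifies the resource constraint and the planner's first-order condition $C_{r,T}^{-\gamma} = b\,\xi\,C_{n,T}^{-\gamma}$, which is what produces the ratio $C_{n,T}/C_{r,T} = (b\,\xi)^{1/\gamma}$.

```python
def pareto_allocation(b, xi, gamma, dividend):
    """Split the aggregate dividend D_T between the two types as in equation (35)."""
    w = (b * xi) ** (1.0 / gamma)          # consumption ratio C_n / C_r
    c_rational = dividend / (1.0 + w)
    c_irrational = w * dividend / (1.0 + w)
    return c_rational, c_irrational

b, xi, gamma, D = 1.3, 0.8, 3.0, 10.0      # illustrative values, not from the text
c_r, c_n = pareto_allocation(b, xi, gamma, D)

# Resource constraint: the two shares exhaust the dividend.
resource_gap = c_r + c_n - D
# First-order condition of the planner's problem: equal weighted marginal utilities.
foc_gap = c_r ** (-gamma) - b * xi * c_n ** (-gamma)
```

Both gaps should be zero up to floating-point error. A larger value of the weighted density shifts consumption toward the irrational trader, and a higher γ dampens that shift.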

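Based on the traders’ marginal utilities, one can derive the supporting state price density given the information at time t as

$$\frac{\left(1 + (b\,\xi_t)^{1/\gamma}\right)^{\gamma} D_T^{-\gamma}}{E_t\left[\left(1 + (b\,\xi_t)^{1/\gamma}\right)^{\gamma} D_T^{-\gamma}\right]}.$$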
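The numerator of this state price density is the rational trader's marginal utility evaluated at the allocation in equation (35). The sketch below confirms that identity numerically; the function names and parameter values are assumptions made for illustration, and the normalizing conditional expectation in the denominator is not computed because it depends on the dividend dynamics specified earlier in the section.

```python
def state_price_numerator(b, xi, gamma, dividend):
    """Unnormalized state price density (1 + (b*xi)^(1/gamma))^gamma * D^(-gamma)."""
    w = (b * xi) ** (1.0 / gamma)
    return (1.0 + w) ** gamma * dividend ** (-gamma)

def rational_marginal_utility(b, xi, gamma, dividend):
    """Marginal utility C_r^(-gamma) at the rational trader's share in (35)."""
    w = (b * xi) ** (1.0 / gamma)
    c_rational = dividend / (1.0 + w)
    return c_rational ** (-gamma)

b, xi, gamma, D = 1.3, 0.8, 3.0, 10.0      # illustrative values, not from the text
spd = state_price_numerator(b, xi, gamma, D)
mu_r = rational_marginal_utility(b, xi, gamma, D)
```

The two numbers agree up to rounding, which is why either trader's marginal utility (the irrational trader's being weighted by b and the density) can serve as the pricing kernel.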

This state price density allows us to price any contingent claim in the economy, such as the dividend from the stock that is paid at T and the traders’ consumption allocations. Kogan et al. show that the equilibrium stock price is given by

$$P_t = \frac{E_t\left[\left(1 + (b\,\xi_t)^{1/\gamma}\right)^{\gamma} D_T^{-\gamma} Z_T\right]}{E_t\left[\left(1 + (b\,\xi_t)^{1/\gamma}\right)^{\gamma} D_T^{-\gamma}\right]}.$$

The ratio of utility weights between the two groups of traders, b, is determined so that the budget constraints for the two traders are satisfied. Since the traders start with the same endowments at time t = 0, their consumption allocations at time t = T should have the same value at t = 0. This gives an equation that identifies $b = e^{(\gamma-1)\eta\sigma^2 T}$. With these calculations, Kogan et al. further show that the stock and the bond do in fact dynamically complete markets.

Given the equilibrium consumptions for the two types, Kogan et al. use the asymptotic properties of the two consumptions to discuss the survival of traders. More specifically, they say that irrational traders experience relative extinction in the long run if

$$\lim_{T\to\infty} \frac{C_{n,T}}{C_{r,T}} = 0 \quad \text{a.s.}$$

The relative extinction of rational traders is defined symmetrically. Kogan et al. use the expression for consumption allocations in equation (35) above, the formula for the Radon-Nikodym derivative of Q with respect to P (equation (34)), and the strong law of large numbers for Brownian motion to establish:

Proposition 7. Suppose η ≠ 0. Let η∗ = 2(γ − 1). For γ > 1 and η ≠ η∗, typically only one type of trader survives. Furthermore:

Case 1: η < 0 (pessimistic irrational traders), the rational traders survive.

Case 2: 0 < η < η∗ (modestly optimistic irrational traders), the irrational traders survive.

Case 3: η > η∗ (strongly optimistic irrational traders), the rational traders survive.
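The three cases can be read off the sign of the long-run growth rate of the consumption ratio. By equation (35) the ratio $C_{n,T}/C_{r,T}$ equals the weighted density raised to the power 1/γ. Substituting $b = e^{(\gamma-1)\eta\sigma^2 T}$ and the density (34) evaluated at T, and using $Z_T/T \to 0$ almost surely, gives $\frac{1}{T}\log(C_{n,T}/C_{r,T}) \to \frac{\eta\sigma^2}{2\gamma}\,(2(\gamma-1) - \eta)$. The sketch below encodes this limit; the function names and the sample parameters are assumptions chosen for illustration, not part of Kogan et al.'s exposition.

```python
def log_ratio_drift(eta, gamma, sigma):
    """Almost-sure limit of (1/T) * log(C_n,T / C_r,T) as T grows."""
    eta_star = 2.0 * (gamma - 1.0)         # threshold at which the drift changes sign
    return eta * sigma**2 * (eta_star - eta) / (2.0 * gamma)

def survivor(eta, gamma, sigma=0.2):
    """Name the trader type that survives; sigma only scales the drift."""
    drift = log_ratio_drift(eta, gamma, sigma)
    if drift > 0:
        return "irrational"                # ratio C_n / C_r diverges
    if drift < 0:
        return "rational"                  # ratio C_n / C_r goes to zero
    return "undetermined"                  # knife-edge values excluded by the proposition

gamma = 3.0                                # illustrative; threshold is 2 * (3 - 1) = 4
cases = {eta: survivor(eta, gamma) for eta in (-1.0, 2.0, 5.0)}
```

For γ = 3 the sample values of η fall in Case 1, Case 2 and Case 3 respectively, so the dictionary maps them to the rational, irrational and rational trader in turn.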

Interestingly, the irrational trader could survive in the long run, as in Case 2 with a modestly optimistic belief. In such a case, the irrational trader takes more risk and is therefore able to outgrow the rational trader. This effect has been pointed out by De Long, Shleifer, Summers and Waldmann (1991) using a partial equilibrium model in which irrational traders’ trading has no impact on the price. Once their price impact is taken into account, as in this model, Kogan et al. show that the survival of irrational traders becomes less likely, although still possible. Kogan et al. also discuss the price impact of the irrational traders in the long run. They show that even if the irrational traders do not survive in the long run, they can still have a persistent impact on the stock price, since they are willing to bet strongly on some small-probability events when their probability assessments of these events differ greatly from those of rational traders.

9 Some Remaining Problems

Many interesting problems remain in modelling the effects of heterogeneous beliefs on financial markets. Rather than present a long list of unsolved questions we select a few problems that are particularly related to the models discussed above.

The model in Section 5 specifies a normal process for the unobservable fundamental variable. This assumption allows one to use the standard linear filtering technique to analyze the agents’ learning process. It also generates a particularly tractable form for the process determining the difference in beliefs. However, a lognormal process would help capture the limited liability feature of many assets such as stocks and bonds. A model of this kind would have to contemplate the difficulties involved in nonlinear filtering and optimal stopping with multiple state variables. Panageas (2003) provides an attempt along this line to analyze the effect of stock market bubbles on firm investment. Multi-dimensional stopping time problems will also result from the introduction of multiple classes of agents.

As discussed above, heterogeneous beliefs should be able to support a bubble even if an asset has a finite life in the context of the model in Section 6. However, in this case, the optimal stopping problem that defines the equilibrium would involve an extra dimension. Tackling this additional complication would allow one to analyze the exact impact of horizon on speculative trading and price bubbles, in addition to establishing rigorously that a bubble generated by heterogeneous beliefs can prevail in assets with a finite life.

The models in Sections 3 and 6 ignore risk aversion of agents by assuming risk neutrality. When agents are risk averse, subtle effects arise in equilibrium. As shown by Hong, Scheinkman and Xiong (2003) in a model with finite periods and myopic risk averse agents, the payoff from the resale option appears to be similar to a standard call option with a strike price determined by the asset float (number of tradable shares) and the risk bearing capacity of investors. The model suggests that asset float could have an important effect on price bubbles and trading volume. A more elaborate model remains to be developed to analyze the effects of risk aversion and asset float when agents have heterogeneous beliefs and short-sales of assets are constrained.

In the model in Section 6 the level of overconfidence is a constant. It is perhaps more realistic to assume that the level of investors’ overconfidence fluctuates as investors learn of their own ability to forecast. Gervais and Odean (2001) analyze a model in which a trader infers his ability to forecast from his past successes and failures. They show that overconfidence can be generated in this learning process if the trader attributes success to himself and failure to external forces, an attribution bias that has been documented in psychological studies of human behavior. It remains an open problem to analyze the dynamics of heterogeneous beliefs in the presence of endogenously generated overconfidence and the market equilibrium that would obtain.

References

Abel, Andrew (1990), Asset prices under heterogeneous beliefs: implications for the equity premium, Working paper, Wharton School, University of Pennsylvania.

Abramowitz, Milton and Irene Stegun (1964), Handbook of Mathematical Functions, Dover Publications, New York.

Alpert, Marc and Howard Raiffa (1982), A progress report on the training of probability assessors, in Daniel Kahneman, Paul Slovic, and Amos Tversky (ed.), Judgement under Uncertainty: Heuristics and Biases, Cambridge University Press, Cambridge.

Barberis, Nicholas and Richard Thaler (2003), A survey of behavioral finance, in George Constantinides, Milton Harris and Rene Stulz (ed.), Handbook of the Economics of Finance, North-Holland.

Basak, Suleyman (2000), A model of dynamic equilibrium asset pricing with heterogeneous beliefs and extraneous risk, Journal of Economic Dynamics and Control 24, 63-95.

Bernardo, Antonio, and Ivo Welch (2001), On the evolution of overconfidence and entrepreneurs, Journal of Economics and Management Strategy 10, 301-330.

Biais, Bruno and Peter Bossaerts (1998), Asset prices and trading volume in a beauty contest, Review of Economic Studies 65, 307-340.

Blanchard, Olivier and Mark Watson (1982), Bubbles, rational expectations and financial markets, in P. Wachtel (ed.), Crisis in the Economic and Financial Structure: Bubbles, Bursts, and Shocks, Lexington Press, Lexington, MA.

Blume, Lawrence and David Easley (2001), If you’re so smart, why aren’t you rich? Belief selection in complete and incomplete markets, Cowles Foundation Discussion Papers, 1319.

Brav, Alon and J.B. Heaton (2002), Competing theories of financial anomalies, Review of Financial Studies 15, 575-606.

Brenner, Lyle, Derek Koehler, Varda Liberman, and Amos Tversky (1996), Overconfidence in probability and frequency judgements, Organizational Behavior and Human Decision Processes 65 (3), 212-219.

Brunnermeier, Markus and Jonathan Parker (2003), Optimal expectations, Working paper, Princeton University.

Camerer, C. (1995), Individual decision making, in John Kagel and Alvin Roth (ed.), The Handbook of Experimental Economics, Princeton University Press, Princeton, NJ.

Campbell, John, Andrew Lo, and Craig MacKinlay (1997), The Econometrics of Financial Markets, Princeton University Press, Princeton, NJ.

Cao, Henry and Hui Ou-Yang (2004), Differences of opinion and options, Working paper, University of North Carolina and Duke University.

Chen, Joseph, Harrison Hong and Jeremy Stein (2002), Breadth of ownership and stock returns, Journal of Financial Economics 66, 171-205.

Cochrane, John (2001), Asset Pricing, Princeton University Press, Princeton, NJ.

Cochrane, John (2002), Stocks as money: convenience yield and the tech-stock bubble, NBER Working Paper 8987.

Cox, John, Jonathan Ingersoll and Stephen Ross (1985), An intertemporal general equilibrium model of asset prices, Econometrica 53, 363-384.

De Long, Bradford, Andrei Shleifer, Lawrence Summers, and Robert Waldmann (1991), The survival of noise traders in financial markets, Journal of Business 64, 1-19.

Detemple, Jerome and Shashidhar Murthy (1994), Intertemporal asset pricing with heterogeneous beliefs, Journal of Economic Theory 62, 294-320.

Diamond, Douglas and Robert Verrecchia (1987), Constraints on short-selling and asset price adjustment to private information, Journal of Financial Economics 18, 277-311.

Friedman, Milton (1953), The case for flexible exchange rates, Essays in Positive Economics, University of Chicago Press.

Gallmeyer, Michael and Burton Hollifield (2004), An examination of heterogeneous beliefs with a short sale constraint, Working paper, Carnegie Mellon University.

Gervais, Simon and Terrance Odean (2001), Learning to be overconfident, Review of Financial Studies 14, 1-27.

Grossman, Sanford and Joseph Stiglitz (1980), On the impossibility of informationally efficient markets, American Economic Review 70, 393-408.

Harris, Milton and Artur Raviv (1993), Differences of opinion make a horse race, Review of Financial Studies 6, 473-506.

Harrison, Michael and David Kreps (1978), Speculative investor behavior in a stock market with heterogeneous expectations, Quarterly Journal of Economics 92, 323-336.

Hirshleifer, David (2001), Investor psychology and asset pricing, Journal of Finance 56, 1533-1597.

Hong, Harrison, Jose Scheinkman, and Wei Xiong (2003), Asset float and speculative bubbles, Working paper, Princeton University.

Jarrow, Robert (1980), Heterogeneous expectations, restrictions on short-sales, and equilibrium asset prices, Journal of Finance 35, 1105-1113.

Kandel, Eugene and Neil Pearson (1995), Differential interpretation of public signals and trade in speculative markets, Journal of Political Economy 103, 831-872.

Kogan, Leonid, Stephen Ross, Jiang Wang, and Mark Westerfield (2004), The price impact and survival of irrational traders, Working paper, MIT.

Kyle, Albert (1985), Continuous auctions and insider trading, Econometrica 53, 1315-1336.

Kyle, Albert and Tao Lin (2003), Continuous trading with heterogeneous beliefs and no noise trading, Working paper, Duke University.

Kyle, Albert and Albert Wang (1997), Speculation duopoly with agreement to disagree: Can overconfidence survive the market test? Journal of Finance 52, 2073-2090.

Lintner, John (1969), The aggregation of investor’s diverse judgements and preferences in purely competitive security markets, Journal of Financial and Quantitative Analysis 4, 347-400.

Liptser, R. S. and A. N. Shiryayev (1977), Statistics of Random Processes, Springer-Verlag, New York.

Lucas, Robert (1978), Asset prices in an exchange economy, Econometrica 46, 1429-1446.

Mehra, Rajnish and Edward Prescott (1985), The equity premium puzzle, Journal of Monetary Economics 15, 145-161.

Mei, Jianping, Jose Scheinkman and Wei Xiong (2003), Speculative trading and stock prices: An analysis of Chinese A-B share premia, Working paper, Princeton University.

Milgrom, Paul and Nancy Stokey (1982), Information, trade and common knowledge, Journal of Economic Theory 12, 112-128.

Miller, Edward (1977), Risk, uncertainty and divergence of opinion, Journal of Finance 32, 1151-1168.

Morris, Stephen (1996), Speculative investor behavior and learning, Quarterly Journal of Economics 110, 1111-1133.

Odean, Terrance (1998), Volume, volatility, price, and profit when all traders are above average, Journal of Finance 53, 1887-1934.

Ofek, Eli and Matthew Richardson (2003), Dotcom mania: The rise and fall of internet stock prices, Journal of Finance 58, 1113-1137.

Panageas, Stavros (2003), Speculation, overpricing and investment – theory and empirical evidence, Job market paper, MIT.

Revuz, Daniel and Marc Yor (1999), Continuous Martingales and Brownian Motion, Springer, New York.

Rogers, L. C. G. and David Williams (1987), Diffusions, Markov Processes, and Martingales, Volume 2: Ito Calculus, John Wiley & Sons, New York.

Sandroni, Alvaro (2000), Do markets favor agents able to make accurate predictions? Econometrica 68, 1303-1341.

Santos, Manuel, and Michael Woodford (1997), Rational asset pricing bubbles, Econometrica 65, 19-57.

Scheinkman, Jose and Thaleia Zariphopoulou (2001), Optimal environmental management in the presence of irreversibilities, Journal of Economic Theory 96, 180-207.

Scheinkman, Jose and Wei Xiong (2003), Overconfidence and speculative bubbles, Journal of Political Economy 111, 1183-1219.

Tirole, Jean (1982), On the possibility of speculation under rational expectations, Econometrica 50, 1163-1181.

Varian, Hal (1989), Differences of opinion in financial markets, in Courtenay Stone (ed.), Financial Risk: Theory, Evidence and Implications, Kluwer Academic Publishers, Boston.

Wang, Jiang (1993), A model of intertemporal asset prices under asymmetric information, Review of Economic Studies 60, 249-282.

Williams, Joseph (1977), Capital asset prices with heterogeneous beliefs, Journal of Financial Economics 5, 219-239.

Zapatero, Fernando (1998), Effects of financial innovations on market volatility when beliefs are heterogeneous, Journal of Economic Dynamics and Control 22, 597-626.