interest rate modeling market models, products and risk...

Interest rate modeling

Market models, products and risk management(following [AP10-1], [AP10-2] and [AP10-3])

Alan Marc Watson

July 5, 2016

Abstract

This document contains a brief summary of Andersen and Piterbarg’s superb three-volume treatise on fixed-income derivatives. I have used this as a self-study guide and alsoto document my subsequent model implementations

Contents

1 Fundamentals of interest rate modeling 61.1 Fixed income notations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61.2 Fixed income probability measures . . . . . . . . . . . . . . . . . . . . . . . . 6

1.2.1 Risk neutral measure . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61.2.2 T-forward measure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71.2.3 Spot measure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71.2.4 Swap measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

1.3 The HJM analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81.3.1 The Gaussian model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91.3.2 Gaussian model with Markovian short rate . . . . . . . . . . . . . . . 10

2 Vanilla models with local volatility 112.1 General framework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112.2 The CEV model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

2.2.1 Valuation of European call options in the CEV model . . . . . . . . . 122.2.2 Regularization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122.2.3 Displaced diffusion models . . . . . . . . . . . . . . . . . . . . . . . . . 12

2.3 Finite difference solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132.3.1 Multiple λ and T . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

2.4 Forward equation for call options . . . . . . . . . . . . . . . . . . . . . . . . . 132.5 Time-dependent local volatility function ϕ . . . . . . . . . . . . . . . . . . . . 14

2.5.1 Separable case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142.5.2 Skew averaging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14

3 Vanilla models with stochastic volatility 173.1 Basic properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173.2 Fourier integration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

3.2.1 Calculation of option prices when MGF is available for ln(S(t)). . . . 173.2.2 Direct integration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

1

3.2.3 Fourier integration for arbitrary European Payoffs . . . . . . . . . . . 193.3 Averaging methods for models with time-dependent parameters . . . . . . . . 21

3.3.1 Volatility averaging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213.3.2 Skew averaging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223.3.3 Volatility of variance averaging . . . . . . . . . . . . . . . . . . . . . . 233.3.4 Calibration by parameter averaging . . . . . . . . . . . . . . . . . . . 23

4 One-factor short rate models 264.1 Review of Swaps and swaptions . . . . . . . . . . . . . . . . . . . . . . . . . . 264.2 The Ho-Lee model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 274.3 The mean-reverting GSR model . . . . . . . . . . . . . . . . . . . . . . . . . . 27

4.3.1 Swaption pricing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 294.3.2 Swaption calibration . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314.3.3 Monte Carlo simulation . . . . . . . . . . . . . . . . . . . . . . . . . . 32

4.4 The affine one-factor model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 334.4.1 Bond pricing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 344.4.2 Discount bond calibration . . . . . . . . . . . . . . . . . . . . . . . . . 344.4.3 European option pricing . . . . . . . . . . . . . . . . . . . . . . . . . . 354.4.4 Swaption calibration . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

4.5 Log-normal short-rate models . . . . . . . . . . . . . . . . . . . . . . . . . . . 374.6 Spanned and unspanned stochastic volatility . . . . . . . . . . . . . . . . . . . 38

5 Multi-factor short-rate models 395.1 The Gaussian multi-factor model . . . . . . . . . . . . . . . . . . . . . . . . . 395.2 Two-factor Gaussian model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 415.3 Swaption pricing (via Gaussian swap rate approximation) . . . . . . . . . . . 41

5.3.1 Calibration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 425.3.2 Monte Carlo simulation . . . . . . . . . . . . . . . . . . . . . . . . . . 43

6 The one-factor quasi-Gaussian model 456.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 456.2 Local volatility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46

6.2.1 Linear local volatility . . . . . . . . . . . . . . . . . . . . . . . . . . . 486.2.2 Volatility calibration . . . . . . . . . . . . . . . . . . . . . . . . . . . . 506.2.3 Computational tip . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 516.2.4 Mean reversion calibration . . . . . . . . . . . . . . . . . . . . . . . . . 51

6.3 Stochastic volatility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 516.4 Quasi-Gaussian multi-factor models . . . . . . . . . . . . . . . . . . . . . . . 52

7 The Libor market model 537.1 The basics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53

7.1.1 Probability measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . 537.1.2 Link to HJM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 547.1.3 Choice of σn(t) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54

7.2 Correlation structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 557.2.1 Empirical PCA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 557.2.2 Empirical estimates for forward rate correlations . . . . . . . . . . . . 56

7.3 Pricing European options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 567.3.1 Caplets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 567.3.2 Swaptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 577.3.3 Spread options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60

7.4 Calibration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 617.4.1 Calibration objective function . . . . . . . . . . . . . . . . . . . . . . . 61

2

7.4.2 Calibration algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . 62

7.4.3 Correlation calibration to spread options . . . . . . . . . . . . . . . . . 64

7.5 Monte Carlo simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

7.5.1 Euler-type schemes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

7.5.2 Special-purpose schemes with drift Predictor-Corrector . . . . . . . . . 65

7.6 Interpolation of front / back stubs . . . . . . . . . . . . . . . . . . . . . . . . 66

7.6.1 Back stub interpolation . . . . . . . . . . . . . . . . . . . . . . . . . . 66

7.6.2 Back stub interpolation inspired by the Gaussian model . . . . . . . . 67

7.6.3 Front stub interpolation inspired by the Gaussian model . . . . . . . . 68

7.7 Advanced approximation for swaption pricing via Markovian projection . . . 68

7.7.1 Advanced formula for the swap rate volatility . . . . . . . . . . . . . . 70

7.7.2 Advanced formula for the swap rate skew . . . . . . . . . . . . . . . . 70

7.7.3 Skew and smile calibration . . . . . . . . . . . . . . . . . . . . . . . . 70

7.8 Other extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72

7.8.1 Evolving separate discount and forward rate curves . . . . . . . . . . . 72

7.8.2 SV models with non-zero correlation . . . . . . . . . . . . . . . . . . . 72

7.8.3 Multi-stochastic volatility extensions . . . . . . . . . . . . . . . . . . . 72

8 Single-rate vanilla derivatives 73

8.1 European swaptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73

8.2 Caps and floors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74

8.3 Terminal swap rate (TSR) models . . . . . . . . . . . . . . . . . . . . . . . . 74

8.3.1 Linear TSR models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75

8.4 Libor-in-Arrears . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76

8.5 Libor-with-Delay . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76

8.6 CMS and CMS-Linked cash flows . . . . . . . . . . . . . . . . . . . . . . . . . 77

8.6.1 Linear TSR model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78

8.6.2 Libor market model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78

8.6.3 Quasi-Gaussian model . . . . . . . . . . . . . . . . . . . . . . . . . . . 79

8.6.4 Correcting non-arbitrage-free methods . . . . . . . . . . . . . . . . . . 79

8.6.5 CDF and PDF of CMS rate in Forward measure . . . . . . . . . . . . 80

8.6.6 SV model for CMS rate . . . . . . . . . . . . . . . . . . . . . . . . . . 81

8.6.7 Dynamics of CMS rate in forward measure . . . . . . . . . . . . . . . 82

8.7 Quanto CMS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83

8.8 Eurodollar futures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84

8.8.1 The problem and the plan . . . . . . . . . . . . . . . . . . . . . . . . . 85

8.8.2 Expansion around futures value . . . . . . . . . . . . . . . . . . . . . . 85

8.8.3 Forward rate variances . . . . . . . . . . . . . . . . . . . . . . . . . . . 86

8.8.4 Forward rate correlations . . . . . . . . . . . . . . . . . . . . . . . . . 86

9 Multi-rate vanilla derivatives 87

9.1 Dependence structure via copulas . . . . . . . . . . . . . . . . . . . . . . . . . 87

9.2 Gaussian copula method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87

9.3 General couplas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88

9.4 Copula methods for spread options . . . . . . . . . . . . . . . . . . . . . . . . 89

9.4.1 Power gaussian . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90

9.5 Limitations of the copula method . . . . . . . . . . . . . . . . . . . . . . . . . 90

3

10 Callable Libor Exotics 92

10.1 Review of basic notions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92

10.2 Model calibration for CLE’s . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93

10.3 Valuation theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93

10.4 Monte Carlo valuation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94

10.4.1 Regression for the underlying . . . . . . . . . . . . . . . . . . . . . . . 95

10.4.2 Using regressed variables for decision only . . . . . . . . . . . . . . . . 96

10.4.3 Controlling the MC estimate bias . . . . . . . . . . . . . . . . . . . . . 97

10.4.4 Upper bound . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97

10.4.5 Choice of regression variables . . . . . . . . . . . . . . . . . . . . . . . 97

10.5 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98

10.6 Valuation with low-dimensional models . . . . . . . . . . . . . . . . . . . . . . 99

10.7 Suitable analogue for core swap rates . . . . . . . . . . . . . . . . . . . . . . . 100

11 Bermudan swaptions 102

11.1 Local projection method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102

11.2 Amortizing, accreting and other non-standard Bermudan swaptions . . . . . . 103

11.2.1 Relating non-standard to standard swap rates . . . . . . . . . . . . . . 103

11.2.2 Representative swaption approach (payoff matching) . . . . . . . . . . 104

11.2.3 Basket approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105

11.2.4 American swaptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106

11.3 Flexi-swaps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106

11.3.1 Purely local bounds . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107

11.4 Fast pricing via exercise premia representation . . . . . . . . . . . . . . . . . 107

12 TARN’s, volatility swaps and other derivatives 108

12.1 TARN’s (targeted redemption notes) . . . . . . . . . . . . . . . . . . . . . . . 108

12.1.1 Local projection method . . . . . . . . . . . . . . . . . . . . . . . . . . 108

12.1.2 Effect of volatility smiles . . . . . . . . . . . . . . . . . . . . . . . . . . 109

12.2 Volatility swaps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109

12.2.1 Volatility swaps with shout option . . . . . . . . . . . . . . . . . . . . 109

12.2.2 Min-Max volatility swaps . . . . . . . . . . . . . . . . . . . . . . . . . 110

12.3 Forward swaption straddles . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110

13 Risk management 111

13.1 Value at risk . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111

14 Payoff smoothing for Monte Carlo 113

14.1 Tube Monte Carlo for digital options . . . . . . . . . . . . . . . . . . . . . . . 113

14.2 Tube Monte Carlo for barrier options . . . . . . . . . . . . . . . . . . . . . . . 114

15 Pathwise differentiation 116

15.1 On the unsuitability of the likelihood ratio method in interest rate modeling . 117

15.2 CLE’s . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118

15.3 Barrier options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118

15.3.1 Digital option warm-up . . . . . . . . . . . . . . . . . . . . . . . . . . 119

15.3.2 Barrier options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119

15.4 Pathwise differentiation for Monte Carlo based models . . . . . . . . . . . . . 120

4

16 Importance sampling and control variates 12116.1 Importance sampling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12116.2 Payoff smoothing via importance sampling . . . . . . . . . . . . . . . . . . . . 124

16.2.1 Binary options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12416.2.2 TARN’s . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124

16.3 Control variates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12616.3.1 Model-based control variates . . . . . . . . . . . . . . . . . . . . . . . 12816.3.2 Instrument-based control variates . . . . . . . . . . . . . . . . . . . . . 13016.3.3 Dynamic control variates . . . . . . . . . . . . . . . . . . . . . . . . . 131

5

1 Fundamentals of interest rate modeling

1.1 Fixed income notations

Some notations:

• P (t, T ): time-t price of a zero-coupon bond (ZCB) delivering $1 at time T ≥ t.• P (t, T, T + τ) = P (t,T+τ)

P (t,T ) : time-t forward price for the ZCB spanning [T, T + τ ]1.

• y(t, T, T + τ): continuously compounded yield, defined by

e−y(t,T,T+τ)τ = P (t, T, T + τ)

• L(t, T, T + τ) simple forward rate, defined by

1 + τL(t, T, T + τ) =1

P (t, T, T + τ)=⇒ L(t, T, T + τ) =

1

τ

[P (t, T )

P (t, T + τ)− 1

]L(t, T, T + τ) will generally be the Libor rates quoted in the interbank market.

• f(t, T ) is the time-t instantaneous forward rate

f(t, T ) = limτ→0

L(t, T, T + τ)

• Relation between f(t, T ) and P (t, T, T + τ)

P (t, T, T + τ) = exp

(−∫ T+τ

T

f(t, u) du

), f(t, T ) = − ∂

∂TlnP (t, T )

y(t, T, T + τ) =1

τ

∫ T+τ

T

f(t, u) du, f(t, T ) = limτ→0

y(t, T, T + τ)

• r(t) = f(t, t) short/spot rate.

• F (t, T, T + τ) futures rate.

1.2 Fixed income probability measures

We outline some of the main measures used in the study of fixed-income and list some of theirproperties. Let V (t) denote the time-t value of a derivative security with payout functionV (T ).

1.2.1 Risk neutral measure

The numraire defining the measure Q is the continuously compounded money market accountβ(t).

dβ(t) = β(t)dtβ(0) = 1

=⇒ β(t) = e∫ t0r(u) du

In the absence of arbitrage the numraire deflated process V (t)β(t) must be a martingale,

which yields the risk-neutral pricing formula

V (t) = EQt

[V (T )e−

∫ Ttr(u) du

]1This is the time-T purchase price of a bond maturing at T + τ , fixed at time t < T . To see this, at time t we

can buy a (T + t)-maturity bond for P (t, T + τ) and short P (t,T+τ)P (t,T )

T -maturity bonds, which costs us 0. At time

T , we close the short position on the T -maturity bond by paying P (t,T+τ)P (t,T )

· P (T, T ) to the buyer, and at time

T + τ we receive $1 for our (T + t)-maturity bond.

6

In particular, for V (T ) = 1 we obtain the price of a ZCB

P (t, T ) = EQt

[e−

∫ Ttr(u) du

]1.2.2 T-forward measure

The T-forward measure QT uses a T -maturity ZCB as numraire. The risk-neutral pricingformula in this case reads

V (t) = P (t, T )ETt [V (T )]

In the absence of arbitrage, the forward Libor rate L(t, T, T + τ) is a QT -martingale suchthat

L(t, T, T + τ) = ET+τt [L(T, T, T + τ)] , t ≤ T

1.2.3 Spot measure

Consider a tenor structure 0 < T0 < · · · < TN . Sometimes it is convenient to introduce anumraire that can be extended to arbitrary horizons by compounding. Consider the followingsituation:

(i) Time 0: invest $1 in 1P (0,T1) T1-maturity discount bonds, returning an amount

1

P (0, T1)= 1 + τ0L(0, 0, T1)

(ii) Time T1: Re-invest this amount in T2-maturity bonds, returning

1

P (0, T1)

1

P (T1, T2)= [1 + τ0L(0, 0, T1)][1 + τ1L(T1, T1, T2)]

(iii) Iterating this we obtain a process

B(t) = P (t, Ti+1)

i∏n=0

[1 + τnLn(Tn)] , Ti < t ≤ Ti+1

The spot measure QB is the measure induced by this process and the time-t value of asecutiry in this measure is by definition

V (t) = EBt

[V (T )

B(t)

B(T )

],

B(t)

B(T )=

j∏n=i+1

[1 + τnLn(Tn)]−1 P (t, Ti+1)

P (t, Tj+1)

for Ti < t ≤ Ti+1 and Tj < T ≤ Tj+1.

1.2.4 Swap measures

Given a tenor structure 0 ≤ T0 < T1 < · · · < TN , for k,m satisfying 0 ≤ k < N andk +m ≤ N we define:

(i) Annuity factor:

Ak,m(t) =

k+m−1∑n=k

P (t, Tn+1)τn, τn = Tn+1 − Tn

7

(ii) Swap rate:

Sk,m(t) =P (t, Tk)− P (t, Tk+m)

Ak,m(t)=

∑k+m−1n=k τnP (t, Tn+1)Ln(t)

Ak,m(t), t ≤ Tk

Being a combination of zero-coupon bonds, an annuity factor Ak,m(t) qualifies as a num-raire. Denoting by Qk,m the corresponding measure and by Ek,m the corresponding expec-tation, in the absence of arbitrage we have

V (t) = Ak,mEk,m

[V (T )

Ak,m(T )

]

1.3 The HJM analysis

The HJM framework attempts to model how an entire continuum of T -indexed bond pricesP (t, T ) jointly evolves through time, staring from a known condition P (0, T ).

In the absence of arbitrage, the deflated bond values are Q-martingales, so if Pβ(t, T ) =P (t,T )β(t) , by the martingale representation theorem there exists a d-dimensional stochastic

process σT (t, T ) with σP (T, T ) = 0 such that

dPβ(t, T ) = −Pβ(t, T )σP (t, T )tdW (t), t ≤ T

By It’s lemma and the previous SDE it follows that

dP (t, T )

P (t, T )= r(t)dt− σP (t, T )tdW (t)

By It’s lemma it also follows that the forward bond prices satisfy

dP (t, T, T + τ)

P (t, T, T + τ)= −[σP (t, T + τ)− σP (t, T )]tσP (t, T )dt− [σP (t, T + τ)− σP (t, T )]tdW (t)

In the T-forward measure, P (t, T, T + τ) is a martingale, so we must have

dP (t, T, T + τ)

P (t, T, T + τ)= −[σP (t, T + τ)− σP (t, T )]tdWT (t)

where WT (t) is a QT -Brownian motion.

Comparison of the last two expressions yields

dWT (t) = dW (t) + σP (t, T )dt

which by Girsanov’s theorem identifies the Radon-Nikodym process for the measure shift

ξ(t) = EQt

[dQT

dQ

]= Et(σP ·W ) = exp

(−∫ t

0

||σP (s, T )||2 ds−∫ t

0

σP (s, T ) dW (s)

)HJM models are traditionally stated in terms of instantaneous forward rates f(t, T ).

Once again by It’s lemma we have (omitting the drift term)

d lnP (t, T ) = O(dt)− σP (t, T )tdW (t)

Differentiating with respect to T on both sides and recalling that f(t, T ) = −∂P (t,T )∂T we have

df(t, T ) = µf (t, T ) +∂

∂TσP (t, T )t︸︷︷︸

:=σf (t,T )t

dW (t)

Concretely, we have the following:

8

Lemma 1.1. The process for f(t, T ) in the T-forward measure is

df(t, T ) = σf (t, T )tdWT (t)

and in the risk-neutral measure the process is

df(t, T ) = σf (t, T )tσP (t, T )dt+ σf (t, T )tdW (t)

= σf (t, T )t∫ T

t

σf (t, u) dudt+ σf (t, T )tdW (t)

We thus have:

• The HJM model is fully specified once the forward rate diffusion coefficients σf (t, T )have been specified for all t and T .

• The initial forward rates f(0, T ) are taken as exogenous inputs.

• The model suffers from an obvious dimensionality problem: in order to describe thetime-t state of a discount bond curve spanning [t, T ] we need to keep track of a contin-uum of forward rates f(t, u)t≤u≤T .

The short rate process in measure Q for the HJM model is

r(t) = f(t, t) = f(0, t) +

∫ t

0

σf (u, t)t∫ t

u

σf (u, s) ds du+

∫ t

0

σf (u, t)t dW (u) (1)

It is generally not Markovian: for the path-dependent term

D(t) :=

∫ t

0

σf (u, t)t dW (u)

we must have

D(T ) = D(t) +

∫ T

t

σf (u, T )> dW (u) +

[∫ t

0

σf (u, T ) dW (u)−∫ t

0

σf (u, t)> dW (u)

](2)

whereby

EQ [D(T )|D(t)] 6= EQt [D(T )]

unless the bracketed term in (2) is either non-random or a deterministic function of D(t).

1.3.1 The Gaussian model

If σP (t, T ) is assumed to be a bounded d-dimensional deterministic function, then forwardbond prices P (t, T, T + τ) are log-normally distributed and r(T ) = f(T, T ) is Gaussian withQ-moments

EQt [f(T, T )] =

∫ T

t

σf (u, T )tσP (u, T ) du, V arQt [f(T, T )] =

∫ T

t

σf (u, T )tσf (u, T ) du

Propositions 4.5.1, 4.5.2 and 4.5.3 in [AP10-1] illustrate how to easily price options onZCB’s, caplets and futures rates in the Gaussian HJM model.

9

1.3.2 Gaussian model with Markovian short rate

Caverhill showed in 1994 that the short rate process can be made Markovian by imposing aseparability condition on the deterministic forward rate volatility function, namely

σf (t, T ) = g(t)h(T ) (3)

where h : R→R is positive and g : R→Rd×1. Using (3), the short rate expression (1) becomes

r(t) = f(0, t) + h(t)

∫ t

0

g(u)>g(u)

(∫ t

u

h(s) ds

)du+ h(t)

∫ t

0

g(u)> dW (u)︸︷︷︸not= D(t)

(4)

and importantly, the term D(t) is now Matkov since

D(T ) =h(T )

h(t)D(t) + h(T ) +

∫ T

t

g(u)> dW (u)

The following proposition shows that under the separability condition (3), the short rateprocess is the solution of an SDE, and is hence Markovian.

Proposition 1.2 (Markovian short rate under separable Gaussian HJM). In the d-dimensionalGaussian HJM model2 and under the separability condition (3)

σf (t, T ) = g(t)h(T )

the short rate process satisfies an SDE

dr(t) = (a(t)− κ(t)r(t)) dt+ σr(t)>dW (t)

where

h(T ) = e−∫ T0κ(s) ds,

g(t) = e∫ t0κ(s) dsσr(t),

a(t) =∂f(0, t)

∂t+ κ(t)f(0, t) +

∫ t

0

σf (u, t)> du

2Namely, under the assumption that σP (t, T ) is a bounded (d-dimensional) deterministic function.

10

2 Vanilla models with local volatility

Local or deterministic volatility models are one-factor diffusive models where our ability toalter the terminal distribution stems from a single source: a swap rate dependent function.

• Via copula methods, vanilla models can be extended to describe joint terminal distri-butions of more than one rate.

• Vanilla models serve as a foundation for the development of more widely applicable fullterm structure models.

2.1 General framework

Notation:

• S(t): forward Libor rate

• P : measure under which S(t) is a martingale satisfying the SDE

dS(t) = λϕ(S(t))dW (t), λ = const, ϕ(0) = 0 (5)

• W (t): one-dimensional Brownian motion under P .

The role of the function ϕ is to match the distribution of S to that observed throughcalls and puts traded in the market. Remember that the time-t price of a European callstruck at K maturing at T is given by rhe risk-neutral pricing formula

c(t, S(t);T,K) = Et[(S(T )−K)+

]The time-t implied volatility function σB(t, S(t);T,K) is the implicit solution to the

Black-Scholes formula

c(t, S;T,K) = SN (d+)−KN (d−), d± =ln(S/K)± 1

2σ2B(t, S;T,K)(T − t)

σB(t, S;T,K)√T − t

The mapping T 7→ σB(t, s;T,K) is the T-maturity volatility smile.

2.2 The CEV model

The constant elasticity of variance (CEV) model is a local volatility model (5) withϕ(S) = Sp.

Proposition 2.1. See [AP10-1, Proposition 7.2.1]. Consider the SDE

∂

∂S(t) = λS(t)pdW (t), p > 0 (6)

1. All solutions of (6) are non-explosive.

2. For p ≥ 12 , the SDE (6) has a unique solution3.

3. For 0 < p < 1, S = 0 is an attainable boundary for (6); for p ≥ 1, S = 0 is anunattainable boundary for (6).

4. For 0 < p ≤ 1, S(t) in (6) is a martingale; for p > 1, S(t) is a strict supermartingale.

The transition probability of S(t) in (6) is known in closed form (see Lemma 7.2.4 in[AP10-1]).

3Hence, if the solution ever reaches S = 0, it must stay there (i.e. it is absorbed). If 0 < p < 12, a boundary

condition at S = 0 needs to be specified in order to get a unique solution.

11

2.2.1 Valuation of European call options in the CEV model

2.2.2 Regularization

The CEV process implies a positive probability of absorption at S = 0 for p < 1, whichmight create difficulties in pricing more exotic structures. To avoid absorption, we canspecify a regularized version of the CEV model by letting

ϕ(x) = xminεp−1, xp−1, ε > 1, p < 1

With ϕ(x) now being Lipschitz, the process S(t) can no longer reach the origin.

• The above specification might not allow for closed-form option pricing.

• For moderate (small) values of ε, Andersen and Andreasen show that the formulas insubsection ?? provide good approximations (c.f. [AP10-1, Proposition 7.2.10]).

2.2.3 Displaced diffusion models

These are local volatility models (5) with ϕ(S) = (α+ S)p, for some constant α.

Setting Z(t) = α+ S(t), by It’s lema we see that Z(t) satisfies the CEV SDE

dZ(t) = λ[Z(t)]pdW (t)

and call option pricing is hence straightforward:

Proposition 2.2. Let

cdCEV (t, S(t);T,K, α) = Et[(S(T )−K)+

]be the call option price associated with the displaced CEV process. Then

cdCEV (t, S(t);T,K, α) = cCEV (t, S + α;T,K + α), S,K > −α

In practice, the main use of displacement constants is for the special case of the displacedlog-normal process where p = 1.

Proposition 2.3. Consider the displaced log-normal process

dS(t) = λ(β + ξS(t))dW (t), ξ, λ 6= 0

where W (t) is a 1-dimensional Brownian motion in measure P . Assuming S(t),K > − βxi ,

we have

cDLN (t, S(t);T,K) = Et[(S(T )−K)+

]=

(S(t) +

β

ξ

)N (d+)−

(K +

β

ξ

)N (d−), d± = · · ·

Remark 2.1. A first-order approximation to any local volatility process is ofdisplaced normal type. Indeed, given a general local volatility model (5)

dS(t) = λϕ(S(t))dW (t), λ = const, ϕ(0) = 0,

expanding the local volatility function ϕ around at-the-money to first order, we obtain

dS(t) ' λϕ(S(0)) + ϕ′(S(0))(S(t)− S(0))dW (t)

which is of displaced log-normal type

dS(t) = σ(bS(t) + (1− b)L)dW (t), σ = λϕ(S(0))

S(0), b = ϕ′(S(0))

S(0)

ϕ(S(0)), L = S(0)

12

2.3 Finite difference solutions

For general specifications of the function ϕ defining the local volatility model

dS(t) = λϕ(S(t))dW (t)

closed-form solutions for European option will not exist. One may instead try to obtaina numerical solution: for fixed T,K, the Feynman-Kac theorem translates the risk-neutralpricing formula

c(t, S(t)) = c(t, S(t);Y,K) = Et[(S(T )−K)+

]into a PDE

∂c(t, S)

∂t+

1

2λ2ϕ(S)2 ∂

2c(t, S)

∂S2= 0, c(T, S) = (S −K)+

whose solution can be approximated using finite difference methods.

2.3.1 Multiple λ and T

In applications we may need to compute the values of c(t, S;T,K) for many differentvalues od T and/or λ (e.g. within a standard calibration exercise, where a root-searchalgorithm is used to to determine the value of λ that minimizes the error between computedand market prices). Instead of solving the PDE multiple times, one may use the follwingresult:

Proposition 2.4. Let g(τ, x) be the solution to the PDE

−∂g(τ, x)

∂τ+

1

2ϕ(x)2 ∂

2g(τ, x)

∂x2= 0, g(0, x) = (x−K)+ (7)

If c(t, S;λ) denotes the solution to the backward PDE

∂c(t, S)

∂t+

1

2λ2ϕ(S)2 ∂

2c(t, S)

∂S2= 0, c(T, S) = (S −K)+ (8)

for a given value of λ then

c(t, S;T,K) = g(λ2(T − t), S) (9)

Hence, one may use finite differences construct a solution g to the PDE (7) on a (τ, S)-grid and then use (9) to recover c(t, S;T,K) for arbitrary choices of S, λ, T by simple look-upor interpolation.

2.4 Forward equation for call options

Instead, we may want to use different strikes for different values of T . In order toaccomplish this (again, without having to solve the entire PDE each time) we may employDupire’s forward equation, in which calendar time t and the initial spot S are consid-ered fixed, whereas maturity T and strike K are variable. As before, let c(t, S(t);T,K) =Et [(S(T )−K)+] denote the price of a European call.

Proposition 2.5 (Dupire, c.g. Proposition 7.4.2 in [AP10-1]). The function c(T,K) =c(t, S;T,K) (with t, S fixed) satisfies the forward PDE

−∂c(T,K)

∂T+

1

2λ2ϕ(K)2 ∂

2c(T,K)

∂K2= 0, c(t,K) = (S −K)+, t > T (10)

An analogous version of Proposition also holds in this setting.

13

2.5 Time-dependent local volatility function ϕ

So far we have assumed that S(t) is driven by an SDE

dS(t) = λϕ(S(t))dW (t), λ = const, ϕ(0) = 0 (11)

and we will now be extending our treatment to models for which ϕ(t, S(t)) depends on t aswell (not just through S(t)).

Even though vanilla models with time-dependent local volatility functions ϕ have limiteduse in fixed income modeling, they do arise as describing approximate dynamics of swap orLibor rates in term structure models.

2.5.1 Separable case

The case

dS(t) = λ(t)ϕ(S(t))dW (t)

is treated via a simple time change argument that reduces it to a model of the type (11).

Lemma 2.6. See [AP10-1, Proposition 7.6.1]. Define

τ(t) =

∫ t

0

λ2(u) du

and define s(·) by S(t) = s(τ(t)), where S(t) follows

dS(t) = λ(t)ϕ(S(t))dW (t)

Then

ds(τ) = ϕ(s(τ))dW (τ), s(0) = S(0)

where W is Brownian motion.

The valuation problem

c(t, S(t);T,K) = Et[(S(T )−K)+

]can hence be rewritten as

c(τ, s(τ);T,K) = E[(s(τ)−K)+|Fτ(t)

]which can be solved using the methods outlined earlier.

2.5.2 Skew averaging

In order to handle the general case

dS(t) = ϕ(t, S(t))dW (t), λ = const, ϕ(0) = 0 (12)

one can look for a time-independent local volatility function (a time average of the time-dependent function) that yields European option prices approximately matching prices fromthe time-dependent model. We first rewrite (12) as

dS(t) = λ(t)g(t, S(t))dW (t), g(t, x) =ϕ(t, x)

ϕ(t, S0), X0 = S(0), λ(t) = ϕ(t,X0)

14

The idea is to fix T > 0 and to derive conditions that a time-independent function g(x)needs to satisfy so that

dS(t) = λ(t)g(t, S(t))dW (t)

can be replaced with

dS(t) = λ(t)g(S(t))dW (t)

for the purposes of valuing T -expiry European options of all strikes.

For ε ≥ 0, define

gε(t, x) = g(t,X0 + ε(x−X0)), gε(x) = g(X0 + ε(x−X0))

and define 2 sets of processes

dXε(t) = λ(t)gε(t,Xε(t))dW (t), Xε(0) = X0

dY ε(t) = λ(t)gε(Y ε(t))dW (t), Y ε(0) = X0

The requirement that the distributions of Xε and Y ε be close can be formalized as finding

g(·) = argming : q(ε) := E

[(Xε(T )− Y ε(T ))2

]Expanding

q(ε) = q(0) + εq′(0) +1

2ε2q′′(0) +O(ε3)

one shows that q(0) = q′(0) = 0 so minimizing q(ε) is equivalent to minimizing q′′(0). Thenone has the following result:

Proposition 2.7. See [AP10-1, Proposition 7.6.2]. Any function g that minimizes q′′(0)must satisfy the condition

∂g

∂xg(X0) =

∫ T

0

∂g

∂x(t,X0)ωT (t) dt (13)

where

ωT (t) =v2(t)λ2(t)∫ T

0v2(t)λ2(t) dt

, v2(t) = E[(X0(0)−X0)2

]Typical examples of time-dependent local volatility functions g(t, x), scaled to satisfy

g(t, S(0)) = 1, are:

• Displaced log-normal function g(t, x) = b(t) xS(0) + (1− b(t)).

• CEV function g(t, x) =(

xS(0)

)p(t).

The condition (13) does not define the function g uniquely, so one generally seeks to finda function g within the same family of functions they approximate. For instance, for theabove examples we would look for functions

g(x) = bx

S(0)+ (1− b), g(x) =

(x

S(0)

)pConcretely, we have the following useful result:

Corollary 2.8. Over the time horizon [0, T ]:

15

1. The effective skew b in g(x) = b xS(0) +(1−b) for the model defined by the time-dependent

local volatility function g(t, x) = b(t) xS(0) + (1− b(t)) is given by

b =

∫ T

0

b(t)ωT (t) dt

where

ωT (t) =v(t)2λ(t)2∫ T

0v(t)2λ(t)2 dt

,

v(t) =

∫ t

0

λ(s)2 ds

2. Similarly, the effective parameter p in g(x) =(

xS(0)

)pfor the model defined by the

time-dependent local volatility function g(t, x) =(

xS(0)

)p(t)is given by

p =

∫ T

0

p(t)ωT (t) dt

with ωT (t) and v(t) as above.

3. If, furthermore, the volatility is constant λ(t) ≡ λ, we obtain

v(t)2 = λ2t, ωT (t) =t

T 2/2

16

3 Vanilla models with stochastic volatility

Here we focus on models which incorporate a new stochastic component for the volatility:

dS(t) = λ (bS(t) + (1− b)L)√z(t)dW (t) (14)

dz(t) = θ(z0 − z(t))dt+ η√z(t)dZ(t) (15)

E[dW (t)dZ(t)] = ρdt

3.1 Basic properties

Proposition 3.1 (Explosions).

Proposition 3.2 (Martingale property).

The stochastic equation (14) can be integrated explicitly:

Proposition 3.3. In the model (14)-(15), we have

S(t) =1

b[(bS(0) + (1− b)L)X(t)− (1− b)L]

where

lnX(t) = λb

∫ t

0

√z(s)dW (s)− 1

2λ2b2

∫ t

0

z(s) ds (16)

As we will see, the moment generating function lnX(t) is of fundamental importance forEuropean option pricing:

Proposition 3.4. Consider the moment generating function of lnX(t)

ΨX(u; t)def= E[eu lnX(t)]

In the model (14)-(15), for any u ∈ C we have

ΨX(u; t) = Ψz

(1

2(λb)2u(u− 1), u; t

)where

Ψz(v, u; t) = EP[evz(t)

], z(t) =

∫ t

0

z(s) ds

Here P is the measure defined by Et (uλb√z ·W ).

3.2 Fourier integration

3.2.1 Calculation of option prices when MGF is available for ln(S(t)).

Theorem 3.5. Let ξ be a random variable with moment generating function (MGF) χ(u) =E[euξ]. For k ∈ R and any4 0 < α < 1 we have

E[(eξ − ek)+

]= χ(1)− ek

2π

∫ ∞−∞

e−k(α+iw)χ(α+ iw)

(α+ iw)(1− α− iw)dw (17)

= χ(1)− ek

π

∫ ∞0

Re

[e−k(α+iw)χ(α+ iw)

(α+ iw)(1− α− iw)

]dw (18)

4Here α is a dampening constant which is introduced in the proof as a factor e−αk in order to guarantee thatthe Fourier transform exists.

17

Combining Theorem 3.5 (using a damping factor α = 12 ) with the closed-form expression

for the moment generating function in the SV model given in Proposition 3.4, we obtain anefficient formula for pricing European call and put options in the SV model (14)-(15).

Theorem 3.6. The price of a call option cSV (0, S;T,K) in the SV model (14)-(15) is givenby

cSV (0, S;T,K) =1

bcB(0, S′;T,K ′, λb)− K ′

2πb

∫ ∞−∞

e(1/2+iw) ln(S′/K′)q(

12 + iw

)w2 + 1

4

dw (19)

where cB(0, S′;T,K ′, σ) is the Black formula for spot S′, strike K ′, expiry T and volatilityσ, with

S′ = bS + (1− b)L, K ′ = bK + (1− b)L

and

q(u) = ΨZ

(1

2(λb)2u(u− 1), u;T

)− e 1

2λ2b2z0Tu(u−1)

where Ψz is given in Proposition 3.4.

3.2.2 Direct integration

In order to compute the integral inside equation (19)∫ ∞−∞

e(1/2+iw) ln(S′/K′)q(

12 + iw

)w2 + 1

4

dw

we first need to truncate the domain.

Lemma 3.7. Given that |ρ| < 1, the function

q(u) = ΨZ

(1

2(λb)2u(u− 1), u;T

)− e 1

2λ2b2z0Tu(u−1)

defined in Theorem 3.6 satisfies

limw→+∞

1

wln

[q(

1

2+ iw)

]= −q∞

where

q∞ =λbz0

η

(√1− ρ2 + iρ

)(1 + θT )

One then shows easily that∣∣∣∣∣∫ ∞wmax

e(1/2+iw) ln(S′/K′)q( 12 + iw

w2 + 14

dw

∣∣∣∣∣ ≤√S′

K ′e−Re(q∞)wmax

wmax

so if εw is the absolute tolerance (generally ε = 10−3, 10−6) one can set the upper truncationlimit wmax by the condition

e−Re(q∞)wmax

wmax= εw

We can then apply anu quadrature rule in order to discretize the integral. For instance, thetrapezoidal rule yields

· · ·

18

3.2.3 Fourier integration for arbitrary European Payoffs

Given a payoff function f(x), recall that the value of a European option with payoff f(S(T ))is given by5

E[f(S(T ))] =

∫f(x)ϕS(T )(x) dx =

∫f(x)

∂2c(0, S(0);T, x)

∂x2dx

where ϕS(T )(x) is the density function of S(T ) and c(0, S(0);T,K) is the European calloption value for S(·). Integrating by parts we obtain a useful representation of a generalEuropean payoff in terms of European calls and puts.

Proposition 3.8. For any twice-continuously differentiable function f(x), the value of aEuropean option with payoff f(·) maturing at T equals the weighted integral

E[f(S(T ))] = f(K∗) + f ′(K∗)(S(0)−K∗) +

∫ K∗

−∞p(0, S(0);T,K)f ′′(K) dK

+

∫ ∞K∗

c(0, S(0);T,K)f ′′(K) dK (20)

for any K∗.

Hence, in order to compute the value of a European option with arbitrary payoff we needto simultaneously compute call option prices for different strikes, and the FFT methodprovides an efficient way of accomplishing this. The integrals we need to evaluate are

I(K ′) =

∫ ∞−∞

e−(1/2+iw) ln(S′/K′)

w2 + 14

q

(1

2+ iw

)dw

for various K ′, where recall

S′ = bS + (1− b)L, K ′ = bK + (1− b)L

We discretize K ′ in such a way that ln( S′

K′ ) are equidistant. In particular, choosing stepδ > 0 we define

xn = δn, K ′n = S′exn , n = −N, . . . , Nso that

ln

(S′

K ′n

)= −xn, Kn =

(S +

1− bb

L

)exn − 1− b

bL

and then

Innot= I(K ′n) =

∫ ∞−∞

e−(1/2+iw)δn

w2 + 14

q

(1

2+ iw

)dw = e−0.5δnJn

Jn =

∫ ∞−∞

e−iwδnq(

12 + iw

)w2 + 1

4

dw

Once this is accomplished, the value of an option with general payoff f(·) is obtained bydiscretizing the integrals in equation (20).

5The price of a call option is given by C(S,K) =∫∞KϕS(T )(x)(x − K) dx Differentiating with respect to K

twice we obtain

∂C(S,K)

∂K=

∂

∂K

[∫ ∞K

ϕS(T )(x) dx−K∫ ∞K

ϕS(T )(x) dx

]= −

∫ ∞K

ϕS(T )(x) dx

∂2C(S,K)

∂K2= ϕS(T )(K)

19

Proposition 3.9. Fix δ > 0. Let Kn,K′n be defined by

Kn =

(S +

1− bb

L

)exn − 1− b

bL, K ′n = S′exn , n = −N, . . . , N

The value of a call option with payoff f(·) at time T in the SV model (14)-(15) is ap-proximately given by

E[f(S(T ))] = f(S(0)) +

−1∑n=−N

pSV (0, S(0);T,Kn)f ′′(Kn)(Kn+1 −Kn)

+

N−1∑n=0

cSV (0, S(0);T,Kn)f ′′(Kn)(Kn+1 −Kn)

where

cSV (0, S(0);T,Kn) =1

bcB(0, S′;T,K ′n;λb)− K ′n

2πbe−0.5δnJn

pSV (0, S(0);T,Kn) =1

bpB(0, S′;T,K ′n;λb)− K ′n

2πbe−0.5δnJn

with JnNn=−N evaluated by an inverse FFT transform of the function

q(

12 + iw

)w2 + 1

4

, q(u) = ΨZ

(1

2(λb)2u(u− 1), u;T

)− e 1

2λ2b2z0Tu(u−1)

Remark 3.1 (FFT reminder). The discrete Fourier transform maps

x = (x1, . . . , xN ) 7→ x = (x1, . . . , xN ),

xk =

∑Nj=1 e

−i 2πN (j−1)(k−1)xj

xk = 1N

∑Nj=1 e

i 2πN (j−1)(k−1)xj

(21)

Computing these sums independently would requireN2 operations but the FFT algorithmaccomplishes this with only N log2(N) operations. The idea is to reduce the computation ofan integral to the computation of an inverse Fourier transform to which the FFT algorithmcan be applied. We seek to compute the integrals

Jn = 2

∫ ∞0

Re

[e−iwδn

q(

12 + iw

)w2 + 1

4

]dw, n = −N, · · · , N

In order to do so we truncate the domain of integration to [0, wmax] according to thediscussion in section 4.2.2 and we discretize the integral using, for instance, the trapezoidalrule∫ b

a

f(w) dw =

Nw∑k=0

αkf(wk), wk = kη, η =b− aNw

, αk =

1/2, k = 0, N1, 2 ≤ k ≤ N − 1

Our discretized integral hence becomes

Jn ' 2

Nw∑k=0

αkRe

[e−iwkδn

q(

12 + iwk

)w2k + 1

4

]=

Nw∑k=0

αkRe

[e−iηδkn

q(

12 + ikη

)(kη)2 + 1

4

], n = −N, · · · , N

20

and choosing η such that ηδ = 2πNw

our integrals become

Jn ' 2

Nw∑k=0

αkRe

[e−i

2πNw

kn q(

12 + ikη

)(kη)2 + 1

4

], n = −N, · · · , N

and from (21) we can identify the Jn’s as the inverse Fourier transforms of the xk’s where

xk =q( 1

2 +ikη)(kη)2+ 1

4

.

3.3 Averaging methods for models with time-dependent parameters

Models with time-dependent parameters emerge naturally when vanilla models are used toapproximate interest rate dynamics in a full term structure model. Our goal here is two-fold:

1. Given a stochastic volatility model with time dependent parameters

dS(t) = λ(t) [b(t)S(t) + (1− b(t))L]√z(t)dW (t) (22)

dz(t) = θ(z0 − z(t))dt+ η(t)√z(t)dZ(t) (23)


we find a constant time-averaged volatility λ, skew b and volatility of variance η suchthat the model

dS(t) = λ[bS(t) + (1− b)L

]√z(t)dW (t) (24)

dz(t) = θ(z0 − z(t))dt+ η√z(t)dZ(t) (25)


produces the same European option prices as the model (22)-(23).

2. We illustrate how to employ time-averaging techniques to calibrate the SV model withtime-dependent parameters (22)-(23) to quoted market prices.

3.3.1 Volatility averaging

We start by letting the volatility λ(t) be time-dependent, while holding the skew b and thevolatility of variance η constant.

Theorem 3.10 (effective volatility approximation). Values of European options with expiryT in the model

dS(t) = λ(t) [bS(t) + (1− b)L]√z(t)dW (t)

dz(t) = θ(z0 − z(t))dt+ η√z(t)dZ(t)


are well approximated by their values in the model

dS(t) = λ [bS(t) + (1− b)L]√z(t)dW (t)



where the effective SV volatility λ solves the equation

Ψz

(h′′(ζT )

h′(ζT )λ2, 0;T

)= Ψzλ2

(h′′(ζT )

h′(ζT )λ2, 0;T

)(26)

21

where

ζT = z0

∫ T

0

λ(t)2 dt, h(x) =bS(0) + (1− b)L

b

[2N

(b√x

2

)− 1

]and where Ψz and Ψzλ2 are moment-generating functions given by [AP10-1, Proposition8.3.8] and [AP10-1, Proposition 9.1.2] respectively (see remark below for details).

Remark 3.2. The LHS of equation (26) can be computed in closed form via [AP10-1,Proposition 8.3.8]:

Ψz(v, u;T ) = eA(v,u)+B(v,u)z0 ,

A(v, u) =θz0

η2

[2 ln

(2γ

θ′ + γ − e−γT (θ′ − γ)

)+ (θ′ − γ)T

],

B(v, u) =2v(1− e−γT )

(θ′ + γ)(1− e−γT ) + 2γe−γT,

γ = γ(v, u) =√

(θ′)2 − 2η2v,

θ′ = θ′(u) = θ − ρηλbu

As for the RHS, assuming that λ(t) is piecewise constant λ(t) =∑ni=1 λiI(ti−1,ti](t) for

some 0 = t0 < · · · < Tn = T , then on each interval (ti−1, ti] we can apply the closed-formsolution from the previous paragraph, and piece all these together.

Equation (26) can then be solved in a couple of iterations of the Netwon-Raphson method.

3.3.2 Skew averaging

We now also allow the skew b to be time-dependent and we find an effective skew b.

Proposition 3.11 (effective skew approximation). Values of European options with expiryT in the model

dS(t) = λ(t) [b(t)S(t) + (1− b(t))L]√z(t)dW (t)




dS(t) = λ(t)[bS(t) + (1− b)L

]√z(t)dW (t)



where the effective SV skew b is given by

b =

∫ T

0

b(t)ωT (t) dt

where the weights are given by

ωT (t) =v(t)2λ(t)2∫ T

0v(t)2λ(t)2 dt

,

v(t)2 = z20

∫ t

0

λ(s)2 ds+ z0η2e−θt

∫ t

0

λ(s)2 eθs − e−θs

2θds

22

3.3.3 Volatility of variance averaging

Finally we also allow the volatility of variance to be time-dependent.

Proposition 3.12 (effective volatility of variance approximation). Values of European op-tions with expiry T in the model


dz(t) = θ(z0 − z(t))dt+ η(t)√z(t)dZ(t)






where the effective SV volatility of variance η is given by

η2 =

∫ T0η(t)2ρT (t) dt∫ T0ρT (t) dt

where the weight function is given by

ρT (u) =

∫ T

u

ds

∫ T

s

dtλ(t)2λ(s)2e−θ(t−s)e−2θ(s−r).

3.3.4 Calibration by parameter averaging

Given:

• A collection of expiries is given 0 = T0 < · · · < TN .

• A collection of strikes K1, . . . ,KM .

• Market values of European call options with expiries Tn and strikes Km, namely

cn,m, n = 1, . . . , N, m = 1, . . . ,M

The goal is to find time-dependent model parameters λ(t), b(t) and η(t) such that themodel


dz(t) = θ(z0 − z(t))dt+ η(t)√z(t)dZ(t)


values the above options as close as possible. For instance, if X = λ(·), b(·), η(·) denotesthe state of the model and cn,m(X ) are the model option prices, we seek to find

λ(·), b(·), η(·)arg min∑n,m

(cn,m(X )− cn,m)2

Alternatively, we can work with market parameter values: denote by λ, b, η, n =1, . . . , N determined in such a way that the market prices of European options expiringat time Tn, namely cn,mMm=1 match the prices obtained in the model

dS(t) = λn

[bnS(t) + (1− bn)L

]√z(t)dW (t)

dz(t) = θ(z0 − z(t))dt+ ηn√z(t)dZ(t)

23

Denote by λ(X ), b(X ), η(X ) the averaged parameters to time Tn for the time-dependent

model (the results in the previous section relate these to λ, b, η). We can then insteadconsider the following more convenient minimization problem

λ(·), b(·), η(·) = arg min

Wλ

∑n,m

(λn(X )− λn

)2

+Wb

∑n,m

(bn(X )− bn

)2

+Wη

∑n,m

(ηn(X )− ηn)2

with appropriate user-specified weights Wλ, Wb and Wη.

Calibration can then be split into independent sub-calibrations as follows. Recall thethree averaging results we outlined in the previous section:

ηn(X )2 =

∫ Tn0

η(t)2ρTn(t, λ(·)) dt∫ Tn0

ρTn(t, λ(·)) dt, n = 1, . . . , N

bn(X ) =

∫ Tn

0

b(t)ωTn(t;λ(·), η(·)) dt, n = 1, . . . , N

λn(X ) = F(λ(·), bn(X ), ηn(X )

), where

F (λ(·); bn(X ), ηn(X )) =

√h′(ζTn)

h′′(ζTn)·Ψ−1

z

(Ψzλ2

(h′(ζTn)

h′′(ζTn), 0;T

)), 0;T

Assume the model parameters are piecewise constant between option expiry dates:

λ(t) =

N∑i=1

λiI(Ti−1,Ti](t), b(t) =

N∑i=1

biI(Ti−1,Ti](t), η(t) =

N∑i=1

ηiI(Ti−1,Ti](t)

We can then also discretize the weights as

ρTn(t, λ(·)) =

n∑i=1

ρn,i(λ(·))I(Ti−1,Ti](t),

ωTn(t;λ(·), η(·)) =

n∑i=1

ωn,i(λ(·), η(·))I(Ti−1,Ti](t)

and denote

ρn,i(λ(·)) =ρn,i(λ(·))∫ Tn

0ρTn(y;λ(·)) dt

We are thus reduced to solving the three systems of equations

n∑i=1

ρn,i(λ(·))(Ti − Ti−1)η2i = η2

n, (27)

n∑i=1

ωn,i(λ(·), η(·))(Ti − Ti−1)bi = bn, (28)

F(λ(·); bn(X ), ηn(X )

)= λn (29)

for n = 1, . . . , N .

This system can be solved iteratively as follows:

24

(i) We first solve equations (29) by replacing the model parameters bn(X ), ηn(X ) withtheir market values, namely

F(λ(·); b, η

)= F

(λ1, . . . , λn; b, η

)= λn, n = 1, . . . , N

which is a system of N decoupled one-dimensional equations. For instance, the n-thequation reads

F(λ∗1, . . . , λ

∗n−1, λn; b, η

)= λn

where λ∗i denote model parameters already solved for.

(ii) We next solve the system (27) for η2i , i = 1, . . . , N , using the values of λi found in (i),

namelyn∑i=1

ρn,i(λ∗(·))(Ti − Ti−1)η2

i = η2n, n = 1, . . . , N

This also be done sequentially.

(iii) Finally we solve the linear system

n∑i=1

ωn,i(λ∗(·), η∗(·))(Ti − Ti−1)bi = bn, n = 1, . . . , N

for bi, again sequentially.

25

4 One-factor short rate models

The general HJM class with its infinite-dimensional Markovian dynamics is to unwieldy towork in practice, so we seek to identify HJM model subclasses that involve a finite numberof Markovian state variables only.

We start by reviewing the notions of swaps and swaptions. We then outline the mostbasic one-factor models and show how to calibrate them to current prices of ZCB’s andswaptions. We also illustrate how to use these models to price European derivatives.

4.1 Review of Swaps and swaptions

A swap is a generic term for an OTC derivative in which two counterparties agree to exchangeone stream (leg) of cash flows against another stream. A fixed-for-floating interest rate swapis a swap in which one leg is a stream of fixed rate payments and the other is a stream ofpayments based on a floating rate, generally Libor.

Consider an increasing sequence of maturity times (the tenor structure)

0 ≤ T0 < T1 < · · · < TN , τn = Tn+1 − Tn

From the perspective of the fixed rate payer, the net cash flow of the swap at time Tn+1,per unit of notional, is

τn(Ln(Tn)− k), Ln(t) = L(t, Tn, Tn+1)

By the fundamental valuation result, the value os a swap is hence equal to

Vswap(t) = β(t)

N−1∑n=1

τnEt

[1

β(Tn+1)(Ln(Tn)− k)

]

=

N−1∑n=0

[P (t, Tn)− P (t, Tn+1)− τnkP (t, Tn+1)]

=

N−1∑n=0

τnP (t, Tn+1)(Ln(t)− k)

= A(t) [S(t)− k]

whereA(t) =, S(t) =

are the annuity and the forward rate o the swap, respectively. We hence observe that avanilla fixed-floating swap can be valued at time t using only the term structure of interestrates observed on that date.

A European swaption is an option giving the holder the right, but not he obligation, toenter a swap at a future date at a given fixed rate. Assuming the underlying swap starts onthe expiry date T0 of the option, for a payer swap we have

Vswaption(T0) = [Vswap(T0)]+

=

(N−1∑n=0

τnP (T0, Tn+1)(Ln(T0)− k)

)+

Vswaption(t) = β(t)Et

[1

β(T0)Vswaption(T0)

]

= β(t)

(1

β(T0)

N−1∑n=0

τnP (T0, Tn+1)(Ln(T0)− k)

)+

26

4.2 The Ho-Lee model

Assume that the short rate follows an SDE

dr(t) = σrdW (t) =⇒ r(t) = r(0) + σrW (t)

With only 2 parameters r(0) and σr it is not possible to match the observable discount bondprices, so instead we may consider

r(t) = r(0) + a(t) + σrW (t), a(0) = 0

Choosing a(t) accordingly, we can match the discount bond curve at time 0:

Lemma 4.1 (definition of the Ho-Lee model). Assume

r(t) = r(0) + a(t) + σrW (t),

a(t) = f(0, t)− r(0) +1

2σ2r t

2. [Ho-Lee model]

f(0, t) = −∂f(0, t)

∂t

Then for any T > 0 we have

Et

[e−

∫ T0r(u) du

]= P (0, T )

This model can be characterized further:

Lemma 4.2 (characterization of the Ho-Lee model). In the Ho-Lee model, the risk-neutralprocess for r(t) and the time-t bond prices are respectively given by

dr(t) =(∂tf(0, t) + σ2

r t)dt+ σrdW (t)

P (t, T ) =P (0, T )

P (0, t)exp

(−(r(t)− f(0, t))(T − t)− 1

2σtrt(T − t)2

)

A few observations are in order:

• The Ho-Lee model could have been specified as an HJM model with constant forwardrate volatility

σf (t, T ) = σr, σP (t, T ) =

∫ T

t

σf (t, u) du = σr(T − t)

• The constancy of σ(t, T ) is unrealistic and provides too few degrees of freedom forcalibration to quoted option prices.

• Interest rates are Gaussian, so they can become negative.

• With only one driving Brownian motion, the instantaneous moves of all forward ratesare perfectly correlated (this is obviously common to all one-factor models.

4.3 The mean-reverting GSR model

Empirical studies suggest that interest rates exhibit mean-reversion. For instance,Vasicek assumed that interest rates are governed by an SDE

dr(t) = κ(θ − r(t))dt+ σrdW (t)

27

Integrating the linear SDE it follows that

r(t) = θ + (r(0)− θ)e−κt + σr

∫ t

0

e−κ(t−s) dW (s)

so r(t) is a Gaussian random variable with moments

E[r(t)] = θ + [r(0)− θ]e−κt

V ar[r(t)] =σ2r

2κ(1− e−2κt)

Allowing all constants to be time-dependent in the Vasicek model we obtain the generalshort rate (GSR) model

dr(t) = κ(t) [θ(t)− r(t)] dt+ σr(t)dW (t)

Recall (c.f. Lemma 4.5.4 in [AP10-1]) that within an HJM setting we have

df(t, T ) = σf (t, T )t∫ T

t


σf (t, T ) = σr(t) exp

(−∫ T

t

κ(u) du

)

and in order to match the initial yield curve, we must have

θ(t) =1

κ(t)

∂f(0, t)

∂t+ f(0, t) +

1

κ(t)

∫ t

0

e−2∫ tuκ(s) dsσ2

r(u) du (30)

The term ∂f(0,t)∂t can be a nuisance in applications where the initial forward curve is not

smooth. In order to get rid of it we can perform a change of variables.

Lemma 4.3. Defining

x(t) := r(t)− f(0, t)

the GSR model becomes

dx(t) = [y(t)− κ(t)x(t)] dt+ σr(t)dW (t),

x(0) = 0,

y(t) =

∫ t

0


r(u) du

The bond reconstitution formula is

P (t, T ) =P (0, T )

P (0, t)exp

(−x(t)G(t, T )− 1

2y(t)G2(t, T )

)(31)

G(t, T ) =

∫ T

t

e−∫ utκ(s) ds du

Remark 4.1. Note that the term G(t, T ) can be expressed as

G(t, T ) = [G(0, T )−G(0, t)] e∫ t0κ(s) ds

which is useful in grid-based numerical work.

28

4.3.1 Swaption pricing

Assume that the function κ(t) is fixed exogenously (based on empirical observations). By(30), the function θ(t) is then uniquely fixed by the initial forward curve. In order to havea full model specification, it thus suffices to determine the function σr(t), which in practiceis done by calibration of the model to observed prices of liquid European options (i.e. capsand swaptions). In any Gaussian HJM model, caplets can be priced by simple Black-Scholesformulas, so:

• In this subsection we focus on pricing European swaptions in two different ways.

• In the next subsection we use these formulas to calibrate the model (i.e. to find σr(t)).

4.3.1.1 Jamshidian decomposition

For a payer swaption expiring at time T0 with the underlying swap paying an annualizedcoupon c at times T1 < · · · < TN , the swaption payout at time T0 is

Vswaption(T0) =

(1− P (T0, TN )− c

N−1∑i=0

τiP (T0, Ti+1)

)+

, τi = Ti+1 − Ti

Jamshidian re-writes the swaption payout from an option on a sum of discount bonds toa sum of options on discount bonds.

First write P (T0, TN ) = P (T0, TN ;x(T0)) and let x∗ be the critical value for which theswap at time T0 is exactly zero. This can be found by numerical root search on the equation

P (T0, TN , x∗) + c

N−1∑i=0

τiP (T0, Ti+1, x∗) = 1

Also define the strikes

Ki = P (T0, Ti, x∗)

From the expression of P (t, T ) in terms of x(t) it follows that P (T0, Ti, x(T0)) is monotonicallydecreasing in x(T0), the swaption only pays out a positive amount if x(T0) > x∗. After afew computations one concludes

Vswaption(T0) =

(1− P (T0, TN , x(T0))− c

N−1∑i=0

τiP (T0, Ti+1, x(T0))

)1x(T0)>x∗

= (KN − P (T0, TN , x(T0)))+

+ c

N−1∑i=0

τi ((Ki+1 − P (T0, Ti+1, x(T0)))+(32)

so the swaption payout has now been decomposed into N+1 put options on ZCB’s, whichallows us to price the swaption in closed form.

• The use of root-search algorithms and the need to price a potentially large amount ofZCB’s can be cumbersome.

• Alternative: Examine forward swap rate in appropriate annuity measure, introducingapproximations as needed to make the dynamics tractable (see next subsection).

29

4.3.1.2 Swap rate approximation

Consider again a payer swaption expiring at time T0 with the underlying swap paying anannualized coupon c at times T1 < · · · < TN . Re-write the swaption payout as

Vswaption(T0) = A(T0)(S(T0)− c)+

where, recall

A(t)not= A0,N (t) =

N−1∑i=0

τiP (t, Ti+1), S(t)not= S0,N (t) =

P (t, T0)− P (t, TN )

A(t)

In the QA measure we have

Vswaption(0) = A(0)EA[(S(T0)− c)+

]and it now suffices to establish tractable dynamics for S(t) in QA. S(t) is obviously aQA-martingale, and from the bond reconstitution formula (31)

P (t, T ) =P (0, T )

P (0, t)exp

(−x(t)G(t, T )− 1

2y(t)G2(t, T )

)G(t, T ) =

∫ T

t


it is clear that S(t) is a deterministic function S(t) = S(t, x(t)), so from It’s lemma

dS(t) = q(t, x(t))σr(t)dWA(t), q(t, x) =

∂

∂x

P (t, T0, x)− P (t, TN , x)∑N−1i=0 τiP (t, Ti+1, x)

where we used the bond reconstitution formula (31) to express the P (t, Ti)’s as functions ofx. Evaluating the partial derivatives one easily sees that

q(t, x) = −P (t, T0, x)G(t, T0)− P (t, TN , x)G(t, TN )

A(t, x)+S(t, x)

A(t, x)

N−1∑i=0

τiP (t, Ti+1, x)G(t, Ti+1)

The function q(t, x(t)) can be experimentally verified to be close to a constant in thex-direction, so

q(t, x(t)) ' q(t, x(t)), x(t) deterministic proxy for x(t)

For instance, setting x(t) ≡ 0 will yield good precision if σr(t) is not too high. More accurateapproximation will be given later on.

We are thus left with the problem


],

dS(t) = q(t, x(t))σr(t)dWA(t)

The dynamics of S(t) can be thought of as those of a CEV process

dS(t) = λ(t)Sp(t)dWA(t), λ(t) = q(t, x(t))σr(t)

with p = 0 and time-dependent volatility.

• By [AP10-1, Proposition 7.6.1], the prices of European options in the model dS(t) =λ(t)Sp(t)dWA(t) are the same as those of the time-averaged constant volatility model

dS(t) =√λSp(t)dWA(t), λ =

1

T0

∫ T0

0

λ2(u) du

30

• For p = 0 and constant λ, namely dS(t) = λdWA(t), a standard computation (c.f.[AP10-1, Remark 7.2.9]) shows that

c(0, S(0);T0, c, λ) = (S(0)− c)N (d) + λ√T0n(d), d =

S(0)− cλ√T0

In conclusion (simplifying√T0) we obtain:

Vswaption(0) = A(0)[(S(0)− c)N (d) +

√λ · n(d)

](33)

where

d =S(0)− c√

λ, λ =

∫ T0

0

q(t, x(t))2σr(t)2 dt

4.3.2 Swaption calibration

Using the option pricing formulas from the previous subsection, we can calibrate the model(i.e. find σr(t) to match the market prices of one or more calibration targets, generallyEuropean swaptions).

Assume that we are given a collection6 of N − 1 swaptions defined on a maturity grid0 = T0 < T1 < · · · < TN such that the i-th swaption expires at time Ti, for i = 1, . . . , N − 1.

• Assume that the mean reversion speed κ(t) is a constant and that it is fixed exogenously.

• Observe that the value of the swaption expiring at Ti depends on the volatility curveσr(s) for s ∈ [0, Ti] only7.

• Assume that the volatility function σr(t) is piecewise-constant on the maturity grid

σr(t) =

N−2∑i=0

σiI[Ti,Ti+1]

Calibration can hence be performed one swaption at a time via a series of one-dimensional root searches as follows:

1. Assume that σ0, . . . , σi−1 have been computed.

2. Set the value σi such that the model price for the swaption expiring at Ti+1 equals itsmarket price, by numerically inverting equation (33)

Vswaption(0) = A(0)[(S(0)− c)N (d) +

√λ · n(d)

]6Such a collection is called a swaption strip.7Indeed, the i-th swaption payout at time Ti

Vswaption(Ti) =

(1− P (Ti, TN )− c

N−1∑n=i

τnP (Ti, Tn+1)

)+

so our claim follows from the bond reconstitution formula (31)

P (Ti, T ) =P (0, T )

P (0, Ti)exp

(−x(Ti)G(Ti, T )− 1

2y(Ti)G

2(Ti, T )

)G(t, T ) =

∫ T

t

e−2∫ tu κ(s) ds du

y(t) =

∫ t

0

e−∫ tu κ(s) dsσ2

r(u) du

31

3. Repeat Step 2 for i = 0, . . . , N − 2.

More concretely:

(i) Pre-compute the terms

G(Ti, Tj) =

∫ Tj

Ti

e−

∫ uTiκ(s) ds

du, i ≤ j

via the recursionG(t, T ) = [G(0, T )−G(0, t)] e

∫ t0κ(s) ds

At the n-th step of the recursion, for n = 1, . . . , N − 1, we seek to obtain σn−1 so thatthe n-th swaption market price Vn is matched.

(ii) Evaluate y(Tn). This is done via the recursion

y(Tn) = e−2κn−1(Tn−1−Tn−2)y(Tn−1) +1

2σ2n−1(Tn−1 − Tn−2)

[e−2κn−1(Tn−1−Tn−2) + 1

],

y(T0) = 0

which is obtained by applying the trapezoidal quadrature rule to the integral

y(t) =

∫ t

0

e−2∫ tuκ(s) dsσr(u) du

Note that σn−1 is unknown at the n-th step: we are optimizing for it.

(iii) Use y(Tn) to calculate the discount factors P (Tn, Tj), for j > n using

P (Tn, Tj) =P (0, Tj)

P (0, Tn)e−

12y(Tn)G(Tn,Tj)

2

(iv) Compute An(t) ≡ An,N (t) and Sn(t) ≡ Sn,N (t) at t = T0, . . . , Tn.

(v) Compute q(t) for t = T0, . . . , Tn.

(vi) Compute λ2 = 1Tn

∫ Tn0

q(t, x(t))2σr(t)2 dt using the trapezoidal rule (we are using the

approximation x(t) ≡ 0

λ ' σ20(T1 − T0)

q(T0, 0)2 + q(T1, 0)2

2+ · · ·σ2

n−1(Tn−1 − Tn)q(Tn−1, 0)2 + q(Tn, 0)2

2

(vii) Use a one-dimensional root-search procedure to find σn−1 satisfying equation (33)

Vn(0) = An(0)

[(Sn(0)− c)N (d(σn−1)) +

√λ(σn−1) · n(d(σn−1))

]

4.3.3 Monte Carlo simulation

To price a derivative security paying V (T ) at time T , recalling that x(t) = r(t)− f(0, t), weneed to compute

V (0) = EQ[V (T )e−

∫ T0r(u) du

]= P (0, T )EQ

[V (T, x(t)Tt=0)e−

∫ T0x(u) du

](34)

The dynamics of x(t) in the GSR model are

dx(t) = [y(t)− κ(t)x(t)]dt+ σr(t)dW (t),

y(t) =

∫ t

0

e−2∫ tuκ(s) dsσr(u)2 du

32

Discretizing into a schedule8 t0 < t1 < · · · < tN we see that

x(ti+1) = E[x(ti+1)|x(ti)] +√

Var(x(ti+1)|x(ti))Zi, i = 0, . . . , N − 1,

E[x(ti+1)|x(ti)] = e−∫ ti+1ti

κ(u) dux(ti) +

∫ ti+1

ti

e−∫ ti+1s κ(u) duy(s) ds,

Var(x(ti+1)|x(ti)) =

∫ ti+1

ti

(e−

∫ ti+1s κ(u) duσr(s)

)2

ds

where Z0, . . . , ZN−1 is a sequence of independent standard Gaussian random variables.

For each simulated path x(t0), . . . , x(tN ), we can compute V (T, x(t)Tt=0). As for the

quantity I(T ) = −∫ T

0x(u) du that is also present in (34), we can proceed via a standard

quadrature (which will introduce a discretization bias) or we can use [AP10-2, Lemma10.1.11] to simulate it as

I(ti+1) = E[I(ti+1)|I(ti), x(ti)] +√

Var(I(ti+1)|I(ti), x(ti))Zi, i = 0, . . . , N − 1

where

E[I(ti+1)|I(ti), x(ti)] = I(ti)− x(ti)G(ti, ti+1)−∫ ti+1

ti

∫ u

ti

e−∫ usκ(v) dvy(s) ds du,

Var(I(ti+1)|I(ti), x(ti)) = 2

∫ ti+1

ti

∫ u

ti

e−∫ usκ(v) dvy(s) ds du− y(ti)G(ti, ti+1)2,

Cov(x(ti+1), I(ti+1)|I(ti), x(ti)) = −∫ ti+1

ti

∫ u

ti

σr(s)2e−

∫ usκ(v) dve−

∫ ti+1s κ(v) dv ds du

where Z0, . . . , ZN−1 are independent Gaussian variables satisfying

ρinot= Corr(Zi, Zi) =

Cov(x(ti+1), I(ti+1)|I(ti), x(ti))√Var(I(ti+1)|I(ti), x(ti))

√Var(x(ti+1)|I(ti), x(ti))

Using the Cholesky decomposition as usual, the sequence Zi can be constructed by generat-ing, for each i, two independent Gaussian variables Zi, Z

∗i and setting

Zi = ρiZi +√

1− ρ2iZ∗i

4.4 The affine one-factor model

The GSR model suffers from two main deficiencies:

(i) Interest rates distribute as normal variables, so they can become negative.

(ii) The short rate volatility σr(t) is independent of the short rate itself, so there is no wayto control the volatility skew implied by the model.

Both issues are addressed by the affine short rate model

dr(t) = κ(t) [θ(t)− r(t)] dt+ σ√α+ βr(t)dW (t)

8In order to see this, simply write down the solution to the linear SDE for x(t). The choice of the schedulet0 < t1 < · · · < tN depends on the particulars of the payoff. For instance, if V (T ) only depends on the yield curveat time T , we may take N = 1.

33

4.4.1 Bond pricing

In order to compute bond prices

P (t, T ) = Et

[e−

∫ Ttr(u) du

]one establishes more generally the extended transform

g(t, T ; c1, c2) = Et

[e−c1r(T )−c2

∫ Ttr(u) du

], c1, c2 ∈ C

Lemma 4.4. Wherever the extended transform g(t, T ; c1, c2) us defined, it is given by

g(t, T ; c1, c2) = exp [A(t, T ; c1, c2)−B(t, T ; c1, c2)r(t)]

where A,B satisfy a system of Riccati ODE’s

dA

dt− κ(t)θ(t)B +

1

2σ2(t)αB2 = 0,

−dBdt

+ κ(t)B +1

2σ2(t)βB2 = c2

subject to the terminal conditions A(T, T ; c1, c2) = B(T, T ; c1, c2) = 0.

• This system of ODE’s can be solved robustly by Runge-Kutta methods.

• If the parameters are piecewise constant, A,B can also be found analytically (c.f.section 10.2.2.2 in [AP10-2]).

4.4.2 Discount bond calibration

In the affine SDE

dr(t) = κ(t) [θ(t)− r(t)] dt+ σ√α+ βr(t)dW (t)

• The role of θ(t) is to calibrate the model to the initial term structure of discount bonds.

• The short rate volatility σ(t) will be determined through calibration against swaptionsand caps/floors.

As in the GSR model, it is convenient to eliminate the dependence of θ on ∂f(0,t)∂t . This

is accomplished via the change of variables x(t) = r(t)− f(0, t); the SDE for x(t) is

dx(t) = dr(t)− ∂tf(0, t)dt = (w(t)− κ(t)x(t)) dt+ σ(t)√ξ(t) + βx(t)dW (t) (35)

w(t) = κ(t)θ(t)− ∂tf(0, t)− κ(t)f(0, t)

ξ(t) = α+ βf(0, t)

The extended transform g(t, T ; c1, c2) is then expressed in terms of x(t)

g(t, T ; c1, c2) = e−c1f(0,T )P (0, T )c2

P (0, t)c2exp [C(t, T ; c1, c2)− x(t)B(t, T ; c1, c2)]

where B,C satisfy a system of Riccati ODE’s

dC

dt− w(t)B +

1

2σ2(t)ξ(t)B2 = 0, ξ(t) = α+ βf(0, t),

−dBdt

+ κ(t)B +1

2σ2(t)βB2 = c2

34

4.4.2.1 Algorithm for ω(t), for fixed α, β and known κ(t), σ(t)

The function ω(t) is determined by matching observed discount bond prices at time 0. Con-cretely, ω(t) is specified on a grid t0 < · · · < TN under a piecewise-constant assumption

ω(t) =∑N−1i=0 ωiI(ti,ti+1](t).

Note that

P (t, T ) = g(t, T ; 0, 1) =P (0, T )

P (0, t)ec(t,T )−x(t)b(t,T ) (36)

where

dc(t, T )

dt− ω(t)b(t, T ) +

1

2σ2(t)ξ(t)b2(t, T ) = 0, (37)

−db(t, T )

dt+ κ(t)b(t, T ) +

1

2σ2(t)βb2(t, T ) = 1 (38)

subject to c(T, T ) = b(T, T ) = 0.

Setting t = 0 in equation (36) establishes the fundamental calibration requirement

c(0, T ;ω(·)) = 0, ∀T

The idea is then to solve equation (38) for b(ti, tj), j < i and then find wi iteratively from

equation (37) by integrating it over [0, ti+1], and using the fact that∫ ti+1

0dc(t,ti+1)

dt dt =−c(0, ti+1) = 0. Concretely:

1. As a pre-processing step, find b(ti, tj) for all j < i by solving equation (38).

2. For a given i, assume that ωj is known for j < i, namely that ω(t) has been specifiedover [0, ti].

3. Integrating equation (37) over [0, ti+1] yields

−c(0, ti+1) = 0

=

not= Θ(ti)︷︸︸︷

−1

2

∫ ti+1

0

σ2(s)ξ(s)b2(s, ti+1) ds+

∫ ti

0

ω(s)b(s, ti+1) ds+

∫ ti+1

ti

ω(s)b(s, ti+1) ds

= Θ(ti) + ωi

∫ ti+1

ti

b(s, ti+1) ds

whereby one can solve for ωi and then repeat the process for i = 0, 1, . . . , N − 1.

4.4.3 European option pricing

In order to to determine the volatility function σ(t)), the affine model is calibrated to capletsand swaptions, so we need efficient schemes to compute prices thereof.

4.4.3.1 Caplets

Recall that, by Lemma 4.4, the moment generating function is available in the affine model

g(t, T ; c1, c2) = exp [A(t, T ; c1, c2)−B(t, T ; c1, c2)r(t)]

where A,B satisfy a system of Riccati ODE’s, so caplets (i.e. options on zero-coupon bonds,can be priced by Fourier methods.

35

Briefly: given a tenor structure 0 ≤ T0 < · · · < TN , a caplet fixing at Tn has time-Tn+1

payout

Vcaplet(Tn+1) = τn (Ln(Tn)− k)+

=

(1

P (Tn, Tn+1)− 1− τk

)+

So it’s time-Tn value is

Vcaplet(Tn) = P (Tn, Tn+1)ETn+1

Tn

[(1

P (Tn, Tn+1)− 1− τk

)+]

[∗]= (1− (1 + kτ)P (Tn, Tn+1))

+

where in [∗] we used that the expression inside the expectation is FTn-measurable, whencethe time-Tn value of the caplet can be represented as the payoff of a scaled put option on azero-coupon bond for which we have an m.g.f. available.

4.4.3.2 Swaptions

In order to price swaptions efficiently, we work out an approximation for the swap ratemartingale dynamics. Start by writing the swaption payout as

Vswaption(T0) = A(T0)(S(T0)− c)+

with

A(t)not= A0,N (t) =

N−1∑i=0

τiP (t, Ti+1), S(t)not= S0,N (t) =

P (t, T0)− P (t, TN )

A(t)

In the QA measure we have


]S(t) is obviously a QA-martingale, and from the bond reconstitution formula (36)

P (t, T ) = g(t, T ; 0, 1) =P (0, T )

P (0, t)ec(t,T )−x(t)b(t,T )

we obtain

dS(t) =∂S(t, x(t))

∂xσ(t)

√ξ(t) + βx(t)dWA(t)

where

∂S(t, x)

∂x= −b(t, T0)P (t, T0)− b(t, TN )P (t, TN )

A(t)+S(t)

A(t)

N−1∑i=0

τib(t, Ti+1)P (t, Ti+1)

S(t) can be empirically verified to be well approximated by a linear function of x(t), sayS(t) = ζ(t) + χ(t)x(t), whence

dS(t) ' σ(t)√ξs(t) + βs(t)S(t)dWA(t), for appropriate ξs, βs(t)

We can further reduce to

dS(t) = σ(t)√βs(t)

√ψ + S(t)dWA(t), ψ =

∫ T0

0σ(t)2βs(t)ξs(t) dt∫ T0

0σ(t)2βs(t) dt

36

Defining y(t) = ψ + S(t) our swaption pricing problem reduces to


]= A(0)EA

[(y(T0)− (c+ ψ))+

],

dy(t) = σ(t)√βs(t)

√y(t)dWA(t)

y(0) = ψ + S(0)

Now y(t) is simply a CEV process with power p = 12 and time dependent volatility

λ(t) := σ(t)√βs(t). By [AP10-1, Proposition 7.6.1], the swaption price will be that of a

swaption in a CEV with constant time-averaged volatility

dy(t) =√λ√y(t)dWA(t), λ =

1

T0

∫ T0

0

σ(u)√βs(u) du

By [AP10-1, Proposition 7.2.6], the swaption price will be

Vswaption(0) = Vswaption(0, y(0);T0, ψ + c) = y(0) [1−Υ(a, b+ 2, c)]− (ψ + c)Υ(c, b, a)

where

a =4(ψ + c)

λ2T0, b = 2, c =

4y(0)

λ2T0

where Υ(x, ν, γ) = P (χ2ν(γ) ≤ x) is the cumulative distribution function for a non-central

chi-square distribution χ2ν(γ) with ν degrees of freedom and non-centrality parameter γ.

4.4.4 Swaption calibration

We finally illustrate how to calibrate the volatility function σ(t) to a swaption strip definedon a maturity grid 0 = T0 < · · · < TN . Assume that all swaptions are written on swaps thatmature at time TN ...

4.5 Log-normal short-rate models

Some authors have attempted to specify short-rate models where the dynamics of r(t) are ofthe form

dr(t) = O(dt) + σr(t)r(t)dW (t)

A few approaches are outlined in [AP10-2, Section 11.1]:

• Black-Derman-Toy model

r(t) = U(t)eσr(t)W (t), U(t), σr(t) deterministic

This model lacks a bond reconstitution formula and additionally, ln r(t) can be seen tobe a mean-reverting Gaussian [AP10-2, Lemma 11.1.1] with a mean reversion speed

κ(t) = −σ′r(t)σr(t) that is beyond user control.

• The Black-Karasinski model is analogous to the BDT model, but it specified the meanreersion speed exogenously.

A common problem shared by all these log-normal models is that

Et

[1

P (t′, T )

]=∞, t < t′ < T

In particular since L(t, T ) = 1τ

(P (t,T )

P (t,T+τ) − 1)

, this implies the following unfortunate eco-

nomic corollaries:

37

• Et[L(T, T )] =∞ for T > t, which predicts that all future Libor rates should be infinite.

• Using Jensen’s inequality we obtain

1

P (t′, T )=

1

Et′[e−

∫ Tt′ r(u) du

] ≤ Et′ [e∫ Tt′ r(u) du]

whence Et

[e∫ Tt′ r(u) du

]= ∞, so that the expected return of investing in the continu-

ously compounded money market account for a finite period of time is infinite.

4.6 Spanned and unspanned stochastic volatility

We illustrate why the short rate framework is not particularly amenable to stochastic volatil-ity extensions. Consider the Fong-Vasicek model

dr(t) = κr(θr − r(t))dt+√z(t)dW1(t), κr, θr ∈ R+

dz(t) = κz(θz − z(t))dt+ η√z(t)dW2(t), κz, θz ∈ R+

dW1(t) · dW2(t) = ρdt

The bond prices in this model can be seen to satisfy

P (t, t+ δ) = eA(δ)+r(t)B(δ)+z(t)C(δ)

where A,B,C a deterministic functions satisfying a coupled system of ODE’s. Therefore,P (t, T ) is a deterministic function of the two state variables r(t) and z(t) and hence theexposure to both can be hedged out by taking positions in two discount bonds with differentmaturities9. When this happens, we say that the state variables are spanned by the discountcurve.

There is evidence, however, that interest rate option volatilities cannot be perfectlyhedged by trading only discount bonds, which implies that volatilities of discount bondsdepend on a vector of random state variables (z1(t), . . . , zn(t)) that are not included in thestate variables used in reconstitution formulas for the discount curve, namely

∂P (t, T )

∂zi(t)= 0, i = 1, . . . , n

9Namely, given observations at time t of the prices of two discount bonds with different maturities, we caninvert the reconstitution formula for P (t, T ) to uncover the current values of the two state variables.

38

5 Multi-factor short-rate models

Short rate models with a single driving Brownian motion:

1. Imply that the instantaneous correlation between forward rates at different maturitiesis one, which contradicts reality. As a general rule, all derivatives that have payoutsexhibiting significant convexity to non-parallel moves of the forward curve must not bepriced in a one-factor model.

2. Are unable to produce a time-stationary non-monotonic term structure of forward ratevolatilities (the so-called short-maturity volatility hump). Two-factor Gaussian modelsare, in contrast, capable of reproducing this volatility hump.

In this section we briefly sketch:

• The Gaussian multi-factor model, following [AP10-2, Section 12.1]: development fromseparability condition, swaption pricing and model calibration.

• The quasi-Gaussian multi-factor model, following [AP10-2, Section 13.3]

5.1 The Gaussian multi-factor model

One starts along the lines of the one-factor Gaussian model. A general d-factor model canbe written as

df(t, T ) = σf (t, T )>∫ t

0

σf (t, u) dudt+ σf (t, T )>dW (t),

dP (t, T )

P (t, T )= r(t)dt− σP (t, T )>dW (t)

where σP (t, T ) is a bounded d-dimensional function of time and W (t) is a d-dimensionalBrownian motion in the risk-neutral measure Q. This model can be made Markovian byimposing a separability condition:

Proposition 5.1. Assuming that

σf (t, T ) = g(t)h(T ) (39)

where g is a d×d deterministic matrix-valued function, and h is a d-dimensional deterministicvector. Then

f(t, T ) = f(0, T ) + Ω(t, T ) + h(T )>z(t), (40)

dz(t) = g(t)>dW (t), z(0) = 0, (41)

r(t) = f(0, t) + Ω(t, T ) + h(t)>z(t), (42)

Ω(t, T ) = h(T )>∫ t

0

g(s)>g(s)

∫ T

s

h(u) du ds (43)

If the separability condition is satisfied, the forward curve can be reconstructed from dGaussian martingale variables zi(t), which are not determined uniquely. Set

H(t) = diag(h(t)) ≡ diag(h1(t), . . . , hd(t)), hi(t) 6= 0 i = 1, . . . , d

which is clearly invertible, and define a diagonal d× d matrix10 by

κ(t) = −dH(t)

dtH(t)−1 (44)

10Recall that in the HJM setting κ(t) = −h′(t)h(t)

, where σf (t, T ) = g(t)h(T ).

39

Proposition 5.2. Let the forward rate volatility by separable σf (t, T ) = g(t)h(T ). Let κ(t)be defined as above and define

x(t) = H(t)

∫ t

0

g(s)>g(s)

∫ t

s

h(u) du ds+H(t)z(t) (45)

y(t) = H(t)

(∫ t

0

g(s)>g(s) ds

)H(t) (46)

(so that x(t) is a d-dimensional random vector and y(t) is a deterministic d× d symmetricmatrix). Then if 1 = (1, 1, . . . , 1)> ∈ Rd we have

dx(t) = [y(t)1− κ(t) · x(t)] dt+ σx(t)>dW (t),

σx(t) = g(t)H(t)

and with M(t, T ) := H(T )H(t)−11,

f(t, T ) = f(0, T ) +M(t, T )>

(x(t) + y(t)

∫ T

t

M(t, u) du

)(47)

r(t) = f(t, t) = f(0, t) + 1>x(t) = f(0, t) +

d∑i=1

xi(t) (48)

As for the bond prices, defining

G(t, T ) =

∫ T

t

M(t, u) du

we can write explicitly

P (t, T ) =P (0, T )

P (0, t)exp

[−G(t, T )> · x(t)− 1

2G(t, T )> · y(t) ·G(t, T )

](49)

More explicitly: writing

κ(t) = diag(κ1(t), . . . , κd(t))

it follows from the definitions above that

h(t) =(e−

∫ t0κ1(s) ds, . . . , e−

∫ t0κd(s) ds

)>(50)

M(t, T ) = H(T )H(t)−11 =(e−

∫ Ttκ1(s) ds, . . . , e−

∫ Ttκd(s) ds

)>(51)

y(t) =

∫ t

0

diag(M(s, t)) · σx(s)> · σx(s) · diag(M(s, t)) ds (52)

whereby specification of the mean reversion κ(t) and σx(t) fully determines the d-dimensionalGaussian model.

One important motivation for the introduction of a multi-factor interest rate model isthe ability to control correlations among the various points on the forward curve.

Lemma 5.3. In the model for r(t) described in Proposition 5.1, the time-t instantaneouscorrelation between the forward rates f(t, T1) and f(t, T2) is given by

ρ(t, T1, T2) =h(T1)> · g(t)> · g(t) · h(T2)√

h(T1)> · g(t)> · g(t) · h(T1) ·√h(T2)> · g(t)> · g(t) · h(T2)

(53)

40

5.2 Two-factor Gaussian model

In this case we have

h(t) =

(e−

∫ t0κ1(u) du

e−∫ t0κ2(u) du

), g(t) =

(σ11(t)e−

∫ t0κ1(u) du 0

σ21(t)e−∫ t0κ1(u) du σ22(t)e−

∫ t0κ2(u) du

)and writing x(t) = (x1(t), x2(t))>

dx(t) = [y(t)1− κ(t)x(t)]dt+ σx(t)>dW (t), σx(t) =

(σ11(t) 0σ21(t) σ22(t)

)The instantaneous correlation between x1(t) and x2(t) is

ρx(t) =σ22(t)σ21(t)√

σ11(t)2 + σ21(t)2 · ‖σ22(t)‖It is clear that the model has the forward rate process

df(t, T ) = O(dt) +

(σ1(t)e−

∫ Ttκ1(u) du

σ2(t)e−∫ Ttκ2(u) du

)dW ∗(t), dW ∗1 (t)dW ∗2 (t) = ρx(t)dt

with σ1(t) =√σ11(t)2 + σ21(t)2 and σ2(t) = ‖σ22(t)‖. In particular (c.f. [AP10-2, Lemma

12.1.11]) we have

ρ(t, T1, T2) := Corrt(df(t, T1), df(t, T2)) =b(t, T1, T2)√

b(t, T1, T1)b(t, T2, T2)(54)

where

b(t, T1, T2) = 1 + ρx(t)σ2(t)

σ1(t)

(e−

∫ T1t (κ2(u)−κ1(u)) du + e−

∫ T2t (κ2(u)−κ1(u)) du

)+

(σ2(t)

σ1(t)

)2

e−∫ T1t (κ2(u)−κ1(u)) du−

∫ T2t (κ2(u)−κ1(u)) du

5.3 Swaption pricing (via Gaussian swap rate approximation)

Swaptions in a multi-factor Gaussian model can be priced identically to the one-dimensionalsetting (c.f. Section 4.3.1.2).

The time-0 value of a swaption is given by Vswaption(0) = A(0)EA [(S(T0)− c)+] Theswap rate dynamics can be approximated as

dS(t) = q(t, x(t))>σx(t)>dWA(t), qj(t,x(t)) =∂S(t)

∂xj

The functions qj can be experimentally verified to be close to a constant, so we canapproximate

qj(t,x(t)) ' qj(t, x(t))

where x(t) is some deterministic proxy of the random vector x(t), for instance x(t) = 0.Then, as in the 1-dimensional case we have the following simple swaption pricing formula.

Lemma 5.4. Let x(t) be a deterministic function of time and assume that qj(t,x(t)) 'qj(t, x(t)). Then

Vswaption(0) = A(0)[(S(0)− c)N (d) +

√λn(d)

]where

d =S(0)− c√

λ, λ =

∫ T0

0

‖q(t, x(t))>σx(t)>‖2 dt

41

5.3.1 Calibration

In order to calibrate the Gaussian multi-factor model we need to specify the functions g(·)ad h(·). From (50)

h(t) =(e−

∫ t0κ1(s) ds, . . . , e−

∫ t0κd(s) ds

)>it is clear that specification of the mean reversions (κ1(t), . . . , κd(t))) fully determines thefunction h(·).

The functions g(·) ad h(·) affect both volatilities and correlations of market rates, so inorder to calibrate them to market instrument prices, we need both caplet/swaptions and alsocorrelation-sensitive instruments like spread options. The former are far more liquid thanthe latter, so it is a good idea to try to separate volatility and correlation calibration.

Consider the following:

• d benchmark tenors δ1 < · · · < δd.

• d forward rates fi(t) = f(t, t+ δi), i = 1, . . . , d, with instantaneous correlations11

λi(t) = h(t+ δi)>g(t)>, λi(t) := ‖λi(t)‖

• Denote by χi,j(t) = Corr(fi(t), fj(t)), i, j = 1, . . . , d.

• Denote the covariance matrix of the vector f(t) = (f1(t), . . . , fd(t))> by

[Rf (t)]i,j = λi(t)λ>j (t) = χi,j(t)λi(t)λj(t)

Also denote its Cholesky decomposition by

Rf (t) = Cf (t)>Cf (t)

• This instantaneous covariance matrix can also be written as

Hf (t)g(t)>g(t)Hf (t)>, Hf (t) =

h(t+ δ1)>

...h(t+ δd)

>

=

h1(t+ δ1) · · · hd(t+ δ1)...

. . ....

h1(t+ δd) · · · hd(t+ δd)

whence

Hf (t)g(t)> = Cf (t) =⇒ g(t)> = Hf (t)−1Cf (t)

• The matrix g(t) can also be determined by fitting the correlation rather than thecovariance. Concretely, defining

[Xf (t)]i,j := χi,j(t) and Xf (t) = Df (t)>Df (t)

11Recall the d-factor Gaussian model specification from Proposition 5.1

f(t, T ) = f(0, T ) + Ω(t, T ) + h(T )>z(t),

dz(t) = g(t)>dW (t), z(0) = 0,

Ω(t, T ) = h(T )>∫ t

0

g(s)>g(s)

∫ T

s

h(u) du ds

whence for T = t+ δi we have

dfi(t) = dfi(t, t+ δi) = dΩ(t, t+ δi)dt+ h(t+ δi)>dz(t) = dΩ(t, t+ δi)dt+ h(t+ δi)

>g(t)>︸︷︷︸:=λi(t)

dW (t).

42

one can obtain g(t) by solving

Hf (t)g(t)> = diag(λ1(t), . . . , λd(t))Df (t)>

with the advantage that Df (t) is independent of the volatilities λi(t) and it doesn’tneed to be recomputed after each update of λi(t), within a calibration iteration.

One can thus implement the following calibration algorithm:

1. Specify the vector h(t) by the mean-reversion parametrization (50)

h(t) =(e−

∫ t0κ1(s) ds, . . . , e−

∫ t0κd(s) ds

)>using d different constant mean reversions (expand on how to select κi.)

2. Fill in the correlation matrix [Xf (t)]i,j = χi,j(t), as explained later in Section 7.2.2.

3. Calibrate benchmark rate volatilities against swaptions using the pricing formula fromLemma 5.4.

Concretely, consider the 2-factor case and assume that

κ(t) = (κ1(t), κ2(t))>, and Rf (t) =

(1 χ12(t)

χ12(t) 1

)have been pre-specified. Consider two swaption strips (same fixing dates, differenttenors) on a tenor structure T0 < T1 < · · · < TN . Assume that σ11(t), σ21(t), σ22(t) are

piecewise constant, with σij(t) =∑N−1n=0 σij,nI[Tn,Tn+1]. For each n = 1, . . . , N − 1, use

Lemma 5.4 to find the model price of the two swaptions maturing at Tn. These dependon three unknown variables σij,n−1(t), which are related by equation (54)

Corr(df1(t), df2(t)) = χ12(t)

This gives us three equations for three unknown at each step n = 1, . . . , N − 1.

4. Recover the diffusion matrix g(t) by solving

Hf (t)g(t)> = diag(λ1(t), . . . , λd(t))Df (t)>

This procedure allows us to calibrate d swaption strips, ideally d strips with constantswap tenors matching the benchmark tenors δ1, . . . , δd.

5.3.2 Monte Carlo simulation

Monte Carlo methods for the d-dimensional Gaussian model are straightforward, as all statevariables are jointly Gaussian. Recalling that r(t) = f(0, t) + 1>x(t), for a security payingV (T ) at time T we need to compute

V (0) = EQ[V (T )e−

∫ T0r(u) du

]= P (0, T )EQ

[V (T ; x(t)Tt=0)e−

∫ T0

1>x(u) du]

The risk-neutral dynamics of x(t) = (x1(t), . . . , xd(t)) are

dx(t) = [y(t)1− κ(t)x(t)]dt+ σx(t)>dW (t), x(0) = 0

Given a discrete schedule tiNi=0, it is clear that x(ti+1)|x(ti) is d-dimensional Gaussianwith mean and covariance matrix

EQ [x(ti+1)|x(ti)] = e−∫ ti+1ti

κ(u) dux(ti) +

∫ ti+1

ti

e−∫ ti+1s κ(u) duy(s)1 ds

VarQ [x(ti+1)|x(ti)] =

∫ ti+1

ti

e−∫ ti+1s κ(u) duσx(s)σx(s)>e−

∫ ti+1s κ(u)> du

43

If C is the square root of the covariance matrix, for a draw of d independent Gaussian samplesZ = (Z1, . . . , Zd) the advancement of x(·) from ti to ti+1 is given by

x(ti+1) = EQ [x(ti+1)|x(ti)] + CZ

As for the quantity I(T ) = −∫ T

01>x(u) du:

• I(t) is a Gaussian process, so we can simulate it jointly with x(t) once we have workedout the moments of I(ti+1)|I(ti), x(ti), as in section 4.3.3.

• We can compute I(T ) by numerical integration I(T ) ' −1>∑Ni=1 x(ti).

44

6 The one-factor quasi-Gaussian model

We consider extensions of one-factor Gaussian short rate models with local and stochasticvolatility, in which the Markovian structure of the model comes at the expense of addingadditional state variables.

6.1 Definition

The Gaussian (GSR) model es given, within an HJM setting, by


df(t, T ) = σf (t, T )t∫ T

t



(−∫ T

t

κ(u) du

)Recall that in order to match the initial yield curve, we must have

θ(t) =1

κ(t)

∂f(0, t)

∂t+ f(0, t) +

1

κ(t)

∫ t

0


r(u) du

In order to make the short rate r(t) Markovian we imposed a separability conditionσf (t, T ) = g(t)h(T ), where g, h are deterministic functions. Quasi-Gaussian models arise byallowing the function g to be stochastic:

σf (t, T, w) = g(t, w)h(T )

Lemma 6.1 (description of quasi-Gaussian models). Consider a one-factor quasi-Gaussianmodel


df(t, T ) = σf (t, T )t∫ T

t



(−∫ T

t

κ(u) du

)dx(t) = [y(t)− κ(t)x(t)] dt+ σr(t, w)dW (t), x(0) = 0

dy(t) =[σ2r(t, w)− 2κ(t)y(t)

]dt, y(0) = 0

κ(t) = −h′(t)

h(t)

σr(t, w) = σf (t, t, w) = g(t, w)h(t)

All zero-coupon discount bonds are deterministic functions of the processes x(t), y(t)

P (t, T ) = P (t, T, x(t), y(t))

where

P (t, T, x, y) =P (0, T )

P (0, t)exp

(−G(t, T )x− 1

2G2(t, T )y

)(55)

G(t, T ) =1

h(t)

∫ T

t

h(s) ds

45

Note that:

• The evolution of the whole interest rate curve can be reduced to the evolution of justtwo state variables x(t) (main yield curve driver) and y(t) (auxiliary variable requiredto uphold the no-arbitrage condition).

• The qG model has a closed-form bond reconstitution formula for arbitrary choices ofg(t, ω) (we gain tractability at the expense of requiring two state variables, even thoughthe Brownian motion is one-dimensional.

6.2 Local volatility

A one-factor gQ model with local volatility is obtained by requiring the function g(·) to bedeterministic

g(t) = g(t, x(t), y(t)) and hence σr(t) = g(t, x(t), y(t))h(t)

with dynamics

dx(t) = [y(t)− κ(t)x(t)] dt+ σr(t, x(t), y(t))dW (t),

dy(t) =[σ2r(t, x(t), y(t))− 2κ(t)y(t)

]dt

Fix a tenor structure 0 < T0 < · · · < TN , with τn = Tn+1−tn and consider a forward swaprate12 S(t) with first fixing T0 and last payment TN . Since clearly S(t) = S(t, x(t), y(t)),using It’s lemma and noting that S(t) is a QA-martingale we conclude

dS(t) =∂S

∂x(t, x(t), y(t))σr(t, x(t), y(t))dWA(t), (56)

where

∂S

∂x(t, x, y) = − 1

A(t, x, y)[P (t, T0, x, y)G(t, T0)− P (t, TN , x, y)G(t, TN )]

+S(t, x, y)

A(t, x, y)

N−1∑n=0

τnP (t, Tn+1, x, y)G(t, Tn+1) (57)

Using the methods of Markovian projection one can show that the dynamics of the swaprate (56) can be written with a diffusion term that is just a function of the swap rate itself.The resulting SDE can hence be solved using standard methods for local volatility models.

Proposition 6.2 (Approximate local volatility dynamics for swap rate). The values ofall European options on the swap rate S(t) can be computed in a vanilla model with time-dependent local volatility function:

dS(t) = ϕ(t, S(t))dWA(t) (58)

ϕ2(t, s) = EA

[∂S

∂x(t, x(t), y(t))σr(t, x(t), y(t))

2

|S(t) = s

](59)

Evaluating conditional expectations like these can be hard. However:

12Recall that, given a tenor structure 0 < T0 < · · · < TN , we have

A(t) ≡ A0,N (t) =

N−1∑n=0

τnP (t, Tn+1), S(t) ≡ S0,N (t) =P (t, T0)− P (t, Tn)

A(t).

46

• If we assume that y(t) is well approximated by a deterministic function y(t), so thatS = S(t, x(t)) and conversely, x(t) = X(t, S(t)).

• If these functions y(t) and X(t, s) were known, then the evaluation of the conditionalexpectation (58) would boil down to

ϕ(t, s) =∂S

∂x(t,X(t, s), y(t)) · σr(t,X(t, s), y(t))

where ∂S∂x (t, x, y) is given by (57).

Once the volatility function ϕ(t, s) is determined, namely once we have approximatedy(t) and X(t, s), swaption values can be computed from (58) by using methods for localvolatility vanilla models (c.f. [AP10-1, Section 7.2]). We summarize these approximationsin the following proposition (see [AP10-2, Section 13.1.4] for details).

Remark 6.1 (A simple approximation for low volatilities). Setting y(t) ≡ 0 and applying alinear approximation

S(t, x, 0) = S(t, 0, 0) +∂S

∂x(t, 0, 0)x,

∂S

∂x(t, x, 0) =

∂S

∂x(t, 0, 0)

whence

x ' S(t, x, 0)− S(t, 0, 0)∂S∂x (t, 0, 0)

and we obtain

ϕ(t, s) ' ∂S

∂x(t, 0, 0)σr

(t,s− S(t, 0, 0)∂S∂x (t, 0, 0)

, 0

)

Proposition 6.3 (approximations of y(t) and X(t, s)). (i) [AP10-2, Proposition 13.1.4]We approximate y(t) by

y(t) ' EA[y(t)] = h2(t)

∫ t

0

[σr(s, 0, 0)

h(s)

]2

ds, t ∈ [0, T0] (60)

h(t) = exp

(−∫ t

0

κ(u) du

)(ii) [AP10-2, Lemma 13.1.6] We approximate x(t) by

x(t) = EA[x(t)] = x0(t) +1

2

∂2X

∂s2(t, S(0))VarA(S(t)), t ∈ [0, T0] (61)

where x0(t) is given as the solution of the equation

S(t, x0(t), y(t)) = S(0)

which can be solved in a few iterations of Newton’s algorithm (staring the search withx = 0), and where the variance VarA(S(t)) can be approximated by

VarA(S(t)) '∫ t

0

[∂S

∂x(s, 0, 0)σr(s, 0, 0)

]2

ds

The derivative ∂2X∂s2 (t, x) in (61) can be computed by differentiating the implicit defini-

tion S(t,X(t, s), y(t)) ≡ s twice13.

13Namely,

∂X

∂s(t, s) =

(∂S

∂x(t,

)−1

and∂2X

∂s2=

∂2S∂x2(∂S∂x

)2

47

(iii) [AP10-2, Proposition 13.1.8] In order to approximate X = X(t, s), since S(t, x) isobserved to be closely approximated as a quadratic function of x, we perform a secondorder expansion of

S(t,X(t, s), y(t)) = s

around x(t), namely we approximate X(t, s) with the solution ξ = ξ(t, s) of the quadraticequation

S(t, x(t), y(t)) +∂S

∂x(t, x(t), y(t))[ξ − x(t)] +

1

2

∂2S

∂x2(t, x(t), y(t))[ξ − x(t)]2 = s (62)

(iv) In conclusion, the conditional expectation (59)

ϕ2(t, s) = EA

[∂S

∂x(t, x(t), y(t))σr(t, x(t), y(t))

2

|S(t) = s

]

can be approximated by

ϕ(t, s) ≈ ∂S

∂x(t, ξ(t, s), y(t)) · σr(t, ξ(t, s), y(t)) (63)

where

• ∂S∂x (t, x, y) is given by (57).

• ∂2X∂s2 (x, t) can be computed by differentiating the implicit definition S(t,X(t, s), y(t)) ≡s twice.

• ξ(t, s) is the solution of the quadratic equation14 (62),

• x(t) is given by (61),

• y(t) is given by (60),

6.2.1 Linear local volatility

Even though one could use the previous results to calibrate the function σr(t, x, y) non-parametrically to the implied volatilities of a collection of swaptions across all strikes, it isrecommendable to chose σr(t, x, y) from a parametric family of monotone, downward slopingfunctions of state variables.

• We next make the assumption that the short rate local volatility function islinear

σr(t, x, y) = λr(t) [αr(t) + br(t)x] (64)

where the scale function αr(t) is fixed exogenously and the volatility λr(t) and skewbr(t) are calibrated to the market. The swap rate local volatility function ϕ(t, s) (63)thus becomes

ϕ(t, s) ≈ ∂S

∂x(t, ξ(t, s), y(t)) · λr(t) [αr(t) + br(t)ξ(t, s)]

• We further assume that the swap rate local volatility function ϕ(t, s) is wellapproximated by a linear function in s too, which leads to the following proposition[AP10-2, Corollary 13.1.9]:

14Any of the 2 solutions?

48

Proposition 6.4 (approximate swap rate dynamics with linear local volatility function).The swap rate local volatility function ϕ(t, S(t)) can be approximated by

ϕ(t, S(t)) ≈ ϕ(t, S(0)) +∂ϕ

∂s(t, S(0))[S(t)− S(0)]

= ϕ(t, S(0))

[∂ϕ∂s (t, S(0))

ϕ(t, S(0))S(t) +

(1−

∂ϕ∂s (t, S(0))

ϕ(t, S(0))

)S(0)

]not= λS(t) [bS(t)S(t) + (1− bS(t))S(0)]

where

λS(t) = λr(t)1

S(0)

∂S

∂x(t, x(t), y(t)) [αr(t) + br(t)x(t)] (65)

bS(t) =S(0)

αr(t) + br(t)x(t)

br(t)∂S∂x (t, x(t), y(t))

+S(0)∂

2S∂x2 (t, x(t), y(t))[

∂S∂x (t, x(t), y(t))

]2 (66)

The dynamics of the swap rate (58) are hence approximated by

dS(t) = ϕ(t, S(t))dWA(t) = λS(t) [bS(t)S(t) + (1− bS(t))S(0)] dWA(t) (67)

The model (67) is a displaced log-normal SDE with time-dependent volatility λS(t) andskew bS(t) and it can be converted into a displaced log-normal SDE with time-constantcoefficients using averaging methods. This yields the following approximate swaption pricingformula [AP10-2, Proposition 13.1.10].

Proposition 6.5 (approximate swaption pricing in local volatility qG model). Considera payer swaption with strike c and expiry T0 on the swap rate S(t). In the local-volatilityquasi-Gaussian model with linear short rate volatility

σr(t, x, y) = λr(t) [αr(t) + br(t)x] ,

the swaption price can be approximated by the displaced log-normal option formula

Vswaption(0) ' A(0)

[(S(0) + S(0)

1− bSbS

)N (d+)−

(c+ S(0)

1− bSbS

)N (d−)

]

d± =

ln

(S(0)+S(0)

1−bSbS

c+S(0)S(0)1−bSbS

)± 1

2 b2S λ

2ST0

bS λS√T0

where

λS =

(1

T0

∫ T0

0

λ2S(t) dt

)1/2

(68)

bS =

∫ T0

0

bS(t)ωS(t) dt (69)

ωS(t) =λS(t)2

∫ t0λS(s)2 ds∫ T0

0

(λS(u)2

∫ u0λS(s)2 ds

)du

(70)

49

6.2.2 Volatility calibration

For calibration purposes, we assume that the parameters αr(t), br(t), λr(t) defining the localvolatility function (64) of the qG model are piecewise constant on a maturity grid.

Concretely, given 0 = T0 < T1 < · · · < TN−1:

• Consider a collection of N − 1 coterminal options, with the n-th one expiring at Tn.

• Assume the n-th swaption has an underlying swap with µ(n) periods and denote thecorresponding swap rate and annuity

Sn(t) ≡ Sn,µ(n)(t), An(t) ≡ An,µ(n)(t), n = 1, . . . , N − 1

• The parameter αr(t) is redundant, so we elect to set αr(t) = Sn(0)1(Tn−1,Tn](t).

With these observations in mind, the local volatility definition (64) specializes to

σr(t, x, y) =

N−1∑n=1

λn [Sn(0) + bnDnx] 1(Tn−1,Tn](t), Dn =∂Sn∂x

(t, 0, 0)

where

λr(t) =

N−1∑n=1

λn1(Tn−1,Tn](t), αr(t) =

N−1∑n=1

αn1(Tn−1,Tn](t), br(t) =

N−1∑n=1

bn1(Tn−1,Tn](t)

Assume for now that the mean reversion function κ(t) is specified externally and that a

collection of market parameters (λSn , bSn), n = 1, . . . , N−1 is given15. From Proposition 6.5,the value of a swaption expiring at Tn depends only on parameters (λi, bi) for i = 1, . . . , n,so the local volatility qG model can be calibrated by a bootstrap method as follows:

1. Starting values: set the λn’s to volatilities obtained by calibrating a pure Gaussianmodel and bn = bSn , n = 1, . . . , N − 1. Set n = 1.

2. Given n, parameters (λ∗i , b∗i ) are already calibrated for i = 1, . . . , n − 1. Use (61) and

(60) to update x(t) and y(t) for t ∈ [0, Tn] using the former, together with the startingvalue (λn, bn).

3. Calculate λSn(t) and bSn(t) for t ∈ [0, Tn−1] using (65) and (66) respectively.

4. Make another guess for (λn, bn) and use it to update λSn(t) and bSn(t) for t ∈ [Tn−1, Tn],once again using (65) and (66) respectively.

5. Calculate the averaged parameters λSn and bSn using (68) and (69) from Proposition6.5.

6. If (λSn , bSn) is not equal to (λSn , bSn) within a given tolerance, go to Step 4.

Essentially, in this step we seek to obtain (λn, bn) such that∥∥∥(λSn − λSn , bSn − bSn)∥∥∥2

≤ ε

for a given tolerance ε. Once accomplished, move to the next step.

7. Set (λ∗n, b∗n) = (λn, bn), update n 7→ n+ 1 ≤ N − 1 and go to Step 2.

15These can be obtained by fitting a series of constant parameter displaced log-normal vanilla models to theobserved swaption volatility smiles at all expiries.

50

6.2.3 Computational tip

We explain in detail how to perform the n-th step...

y(t)

x(t)

6.2.4 Mean reversion calibration

The calibration of the mean reversion speed κ(t) of the local volatility qG model relies onthe observation that the ratio of volatilities of two swaptions with the same expiry dateis practically independent of volatility. This suggests using a second strip of swap rates todefine targets for mean reversion calibration as the ratios between these rates and the originalones.

Consider the original strip of swap rates Sn,ν(n)(·), n = 1, . . . , N − 1 and obtain a newone Sn,µ(n)(·), n = 1, . . . , N −1 with identical fixing dates, but spanning possibly differentperiods.

Proposition 6.6. Consider the quasi-Gaussian model with local short rate volatility functionσr(t) = g(t, x(t), y(t))h(t). The ratio of two swap rates fixing on Tn with m1 and m2 periods,respectively, is approximately given by

Var(Sn,m1(Tn))

Var(Sn,m2(Tn))≈

∫ Tn0

(∂Sn,m1

∂x (t, 0, 0))2

dt∫ Tn0

(∂Sn,m2

∂x (t, 0, 0))2

dt(71)

Note that the ratios (71) depend on the mean reversion parameter κ(t) only, so that itcan be calibrated independently of the other parameters. For calibration we add the usualpiecewise constant assumption

κ(t) =

N−1∑n=1

κn1(Tn−1,Tn](t) + κN1(TN−1,∞](t)

We then set the optimal mean reversion levels κn as the solution of the optimizationproblem

κ∗n = arg min

N−1∑n=1

[Var(Sn,m1(Tn))

Var(Sn,m2(Tn))

(κn)−Var(Sn,m1(Tn))

Var(Sn,m2(Tn))

]2

+ ω

N−1∑n=1

(κn+1 − κn)2

where we included a regularization term penalizing non-stationary behavior (dependent on

a user-specified weight ω) and where the V ar(Sn,m)’s are market-implied variances of swaprates, which in the case of the linear short rate volatility model under consideration can beapproximated16 by

V ar(Sn,m(Tn)) ≈(Sn,m(0)λSn,m

)2

Tn

6.3 Stochastic volatility

Local volatility models are characterized by a dependence of the function g in the factoriza-tion σf (t, T, w) = g(t, w)h(T ) on the state variables, namely g(t) ≡ g(t, x(t), y(t)). Instead,

16Market-implied variances can be calculated in more general models from values of options on the swap rates.

51

we can enrich the model by incorporating an additional stochastic factor via following spec-ification

g(t, w) ≡√z(t)g(t, x(t), y(t)) (72)

dz(t) = θ[z0 − z(t)]dt+ η(t)√z(t)dZ(t), (73)

E[dW (t), dZ(t)] = 0

The analysis from the previous section carries through similarly in this context. Inparticular, the model is defined by the collection of SDE’s

dx(t) = [y(t)− κ(t)x(t)] dt+ σr(t, x(t), y(t))dW (t), x(0) = 0,

dy(t) =[σ2r(t, x(t), y(t))− 2κ(t)y(t)

]dt, y(0) = 0,

dz(t) = θ(z0 − z(t))dt+ η(t)√z(t)dZ(t), z(0) = z0 = 1,

dZ(t)dWA(t) = 0

A few remarks are in order:

• The bond reconstitution formula (55) is the same as for any other quasi-Gaussian model(c.f. Lemma 6.1.

• In particular, the bond reconstitution formulas do not depend on the stochastic volatil-ity process z(t), so the SV-qG model is a true stochastic volatility model, in the sensethat the stochastic volatility is unspanned and cannot be hedged by discount bonds.

• The time-dependent volatility of variance is assumed to be piecewise constant η(t) =∑N−1n=1 ηnI(Tn−1,Tn](t).

An analysis identical to the one carried out in the local volatility setting yields thefollowing approximate dynamics for the short rate:

Proposition 6.7 (approximate dynamics for the swap rate S(t) in the SV-qV model).Assuming that the local short rate volatility is linear σr(t, x, y) = λr(t) [αr(t) + br(t)x], thedynamics of the swap rate S(t) in the stochastic volatility quasi-Gaussian model (??) aregiven approximately by

dS(t) =√z(t)λS(t) [bS(t)S(t) + (1− bS(t))S(0)] dWA(t),

dz(t) = θ(z0 − z(t))dt+ η(t)√z(t)dZ(t), z(0) = z0 = 1,

dZ(t)dWA(t) = 0

where λS(t) and bS(t) are given by equations (65)-(66).

This is a stochastic volatility model with time-dependent parameters for which the timeaveraging methods of [AP10-1, Sction 9.3] are available.

6.4 Quasi-Gaussian multi-factor models

52

7 The Libor market model

The Libor market model (LM) is a model capable of:

• Capturing the full correlation structure of across the entire yield curve.

• Allowing volatility calibration to a large enough set f European options that the volatil-ity characteristics of most exotic securities can be considered spanned by the calibration.

7.1 The basics

The HJM framework generally involves multiple driving Brownian motions and an infiniteset of state variables (the instantaneous forward rates). Working directly with forward ratesis unattractive for a number of reasons:

(i) Instantaneous forward rates are never quoted in the market: realistic securities in-volve simply compounded (Libor) rates. The development of market-consistent pricingexpressions is thus cumbersome.

(ii) An infinite set of instantaneous forward rates will require discretization.

(iii) Prescribing the form of the volatility function of instantaneous forward rates is com-plicated.

BGM, Jamshidian: All three issues can be solved by formulating the model in termsof a non-overlapping set of simply compounded Libor rates.

Fix a tenor structure

0 = T0 < T1 < · · · < TN , τn = Tn − Tn−1

Instead of keeping track of an entire yield curve, at any point in time we are focused onlyon a finite set of zero-coupon bonds

P (t, Tn), t < Tn ≤ TN

Define q(t) to be the tenor structure index of the shortest-dated discount bond still aliveat time t, namely

Tq(t)−1 ≤ t < Tq(t)

We define the Libor forward rates as

Ln(t) = L(t, Tn, Tn+1) =1

τn

(P (t, Tn)

P (t, Tn+1)− 1

), q(t) ≤ n ≤ N − 1

7.1.1 Probability measures

By construction, Ln(t) is a QTn+1-martingale, so by the martingale representation theoremthere is an adapted vector process σn(t) such that

dLn(t) = σn(t)>dWn+1(t)

In order towrite the process in measure QTn , consider the Radon-Nikodym process

ξ(t) =

(dQTn

dQTn+1

)t

= (1 + τnLn(t))P (0, Tn+1)

P (0, Tn),

dξ(t)

ξ(t)=τnσn(t)>dWn+1(t)

1 + τnLn(t)

53

From Girsanov’s theorem, dWn(t) = dWn+1(t)− τnσn(t)>

1+τnLn(t)dt is a QTn -BM and

dLn(t) = σn(t)>(

τnσn(t)

1 + τnLn(t)dt+ dWn(t)

)Iterating this argument, one easily sees that in the terminal measure QTN , the process

for Ln(t) is

dLn(t) = σn(t)>

− N−1∑j=n+1

τjσj(t)

1 + τjLj(t)dt+ dWN (t)

As for the spot measure, recall that it is the measure induced by taking as numeraire

B(t) = P (t, Tq(t))

q(t)−1∏n=0

(1 + τnLn(Tn))

The random part of the numeraire is the discount bond P (t, Tq(t)), so the dynamics in

measure QB coincide with the dynamics of QTq(T ) and arguing as above we see

dLn(t) = σn(t)>

n∑j=q(t)

τjσj(t)

1 + τjLj(t)dt+ dWB(t)

7.1.2 Link to HJM

Recall that HJM models have risk-neutral dynamics of the form

df(t, T ) = σf (t, T )>∫ T

t

σf (t, u) dudt+ σf (t, T )>dW (T )

The dynamics for the forward bond P (t, Tn, Tn+1) are of the form

dP (t, Tn, Tn+1)

P (t, Tn, Tn+1)= O(dt) +

(σP (t, Tn+1)> − σP (t, Tn)>

)dW (t)

By the definition of the forward Libor rate Ln(t) = 1τn

(P (t,Tn)P (t,Tn+1) − 1

)and using the diffusion

invariance principle one sees that

σn(t) =1

τn(1 + τnLn(t))

∫ Tn+1

Tn

σf (t, u) du

We thus see that the LM model is a special case of the HJM model: a full specificationof σf (t, T ) uniquelt determined the LM volatility σn(t), but the converse is not true.

7.1.3 Choice of σn(t)

To build a workable model, we need to be specific about σn(t).

(i) Local volatility LMσn(t) = λn(t)ϕ(Ln(t))

where λn(t) is a bounded vector-valued deterministic function and ϕ is a time-homogeneouslocal volatility function.

54

(ii) Stochastic volatility LM: we extend the previous local volatility model and we allowthe term on the Brownian motion to be scaled by a stochastic process so we have

dLn(t) =√z(t)ϕ(Ln(t))λn(t)>

(√z(t)µn(t)dt+ dWB(t)

)(74)

µn(t) =

n∑j=q(t)

τjϕ(Lj(t))λj(t)

1 + τjLj(t)

dz(t) = θ[z0 − z(t)]dt+ ηψ(z(t))dZ(t), z(0) = 0 (75)

• Note that a single common factor√z(t) simultaneously scales all forward rate

volatilities (even though this can be relaxed, see section ??).

• In order to make the variance process (75) more tractable, it is common to assumethat the Brownian motions Z(t) and WB(t) are uncorrelated.

7.2 Correlation structure

In one-factor models, where all the points on the forward curve move in the same direction.Empirical evidence contradicts this and indicates that various points on the forward curvedo not move co-monotonically with each other. The LM model gives us control over theinstantaneous correlation between various points, via the simple observation

Corr(dLk(t), dLj(t)) =λk(t)>λj(t)

‖λk(t)‖‖λj(t)‖

As we add more Brownian motions, our ability to capture increasingly complicated cor-relation structures progressively improves, at the cost of increasing the model complexity. Inorder to choose the model dimension m we may follow two approaches:

(i) Fix the correlation structure in the model to match empirical forward rate correlations,for instance via PCA.

(ii) Imply the correlation structure directly from traded market data by amending the setof calibration instruments with securities that have stronger sensitivity to forward ratecorrelations, like yield curve spread options.

7.2.1 Empirical PCA

For fixed τ (e.g. 0.25 or 0.5), define the rates

`(t, x) = L(t, t+ x, t+ τ)

Fix a set of tenors x1, . . . , xNx and a set of calendar times t1, . . . , tNt and set up the Nx×Ntobservation matrix

Oi,j =`(tj , xi)− `(tj−1, xi)√

tj − tj−1, i = 1, . . . , Nx, j = 1, . . . , Nt

where for each date in the time grid tj , we construct the forward curve (starting at tj) frommarket observable swaps, futures and deposits. Use this to construct a variance covariancematrix

C =OO>

NtIn order for the LM model to conform to empirical data, we need sufficiently many drivingBrownian motions m to replicate this variance matrix. This is accomplished by performinga PCA: compute the eigenvalues of C together with the percentage of variance that theyexplain, and decide how many are necessary to explain a reasonable amount thereof (generally4 or 5).

55

7.2.2 Empirical estimates for forward rate correlations

Introduce the diagonal matrix

c = diag(√

C1,1, . . . ,√CNx,Nx

)so that the empirical forward rate correlation matrix becomes

R = c−1Cc−1. (76)

The termRi,j provides a sample estimate of the instantaneous correlation between incrementsin `(t, xi) and `(t, xj)

Ri,j =1√

Ci,i√Cj,j

Ci,j =1√

Ci,i√Cj,j

1

Nt

Nt∑k=1

[`(tk, xi)− `(tk−1, xi)] · [`(tk, xj)− `(tk−1, xj)]

tk − tk−1.

In multi-factor yield curve modeling, it is common practice to work with simple para-metric forms for correlations

Corr(Lk(t), Lj(t)) = q(Tk − t, Tj − t)

For instance

q(x, y) = ρ∞(minx, y) + [1− ρ∞(minx, y)] e−a(minx,y)·|y−x|, (77)

a(z) = a∞ + (a0 − a∞)κz,

ρ∞(z) = b∞ + (b0 − b∞)e−αz

where ξ = (a0, a∞, κ, b0, b∞, α)> is a vector of parameters that are found by least-squaresoptimization against an empirical correlation matrix (76). Concretely, if R2(ξ) is the corre-lation matrix generated from the form (77), the optimal parameter vector ξ∗ is obtained byminimizing

ξ∗ = arg minξ[tr((R−R2(ξ)) · (R−R2(ξ))>

)], ξ ≥ 0

7.3 Pricing European options

In order to calibrate the model



)µn(t) =

n∑j=q(t)

τjϕ(Lj(t))λj(t)

1 + τjLj(t)

dz(t) = θ[z0 − z(t)]dt+ ηψ(z(t))dZ(t), z(0) = 0

the vectors λk(t) must be such that the model successfully reproduces the prices of liquidplain-vanilla derivatives (i.e. swaptions and caps). We thus need to find pricing formulas forvanilla options that are fast enough to be embedded into an iterative calibration algorithm.

7.3.1 Caplets

An interest rate cap is a security that allows one to benefit from low floating rates, yet beprotected from high rates. This is accomplished by giving the holder the right to pay thesmaller of two simple rates: the floating rate f and a fixed rate κ: the holder of a cap overthe interval [T, T + τ ], the interest rate paid at time T + τ on each dollar of principal is

56

minτκ, τf. Without the caplet the interest payment would be τf , so the caplet’s worthto the holder is

τf −minτf, τκ = (f − κ)+τ

Given a tenor structure T0 < T1 < · · · < TN and forward Libor rates Ln(t) = L(t;Tn, Tn+1),a cap is a stream of κ-caplets maturing at times Tn and paying out

Vcaplet(Tn+1) = τn (Ln(Tn)− c)+

The time-t value of a caplet is hence

Vcaplet(t) = B(t)Et

[1

B(Tn+1)Vcaplet(Tn+1)

]= B(t)Et

[1

B(Tn+1)τn (Ln(Tn)− c)+

]and the time-t value of a cap in the both the risk-neutral is

Vcap(t) = B(t)

N−1∑n=0

Et

[1

B(Tn+1)τn (Ln(Tn)− c)+

]Switching to the Tn+1-forward measure17 for the n-th caplet, the pricing formula becomes

Vcap(t) =

N−1∑n=0

τnP (t, Tn+1)ETn+1

t

[(Ln(Tn)− c)+

]Remark 7.1. As can be seen from the last expression, the terminal correlation between theforward rates Ln does not affect the cap value. Unfortunately, this is no longer the case forswaptions.

Proposition 7.1 (Caplet pricing in the SV-LM model (74)-(75)). Assume that the forwardrate dynamics in the spot measure are given by (74)-(75). Then

Vcaplet(0) = P (0, Tn+1)τnETn+1

[(Ln(Tn)− c)+

](78)

where

dLn(t) =√z(t)ϕ(Ln(t))‖λn(t)‖dY n+1(t)

dz(t) = θ[z0 − z(t)]dt+ ηψ(z(t))dZ(t)

Y n+1(t) =

∫ t

0

λn(s)>

‖λn(s)‖dWn+1(s)

We are thus reduced to computing the expectation of (Ln(Tn)− c)+, where the processLn(t) is a standard scalar stochastic volatility diffusion.

7.3.2 Swaptions

Recall that a fixed-for-floating interest rate swap allows to exchange a stream of cash-flowsat a floating (Libor) rate by a stream of fixed-rate cash-flows. Given a tenor structure0 ≤ T0 < T1 < · · · < TN , from the perspective of the fixed-rate payer, a swap pays out

Vswap(Tn+1) = τn(Ln(Tn)− κ)

17Recall that the pricing formula with numeraire g reads

V (t) = g(t)E

[1

g(T )V (T )|Ft

]

57

The time-t price of a swap is hence given by

Vswap(t) = β(t)

N−1∑n=0

Et

[1

β(Tn+1)τn(Ln(Tn)− κ)

][1]=

N−1∑n=0

τnP (t, Tn+1)(Ln(t)− κ)

[2]= A(t) (S(t)− κ)

where:

• Equality [1] shows that a vanilla fixed-floating swap can be valued on date t usingonly the term structure of interest rates observed on that date. In order to obtain [1],

one starts by using that P (Tn, Tn+1) = E[β(Tn)β(Tn+1)

], then the definition 1 + τnLn(t) =

P (t,Tn)P (t,Tn+1) and finally the fact that P (t,Tn)

β(t) is a martingale.

• In [2] we denoted

A(t) = A0,N (t) =

N−1∑n=0

τnP (t, Tn), S(t) = S0,N (t) =1

A(t)

N−1∑n=0

τnP (t, Tn)Ln(t)

A European swaption is a contract giving the holder the right, but no the obligation, toenter a swap at a future date, at a given fixed rate. A payer (receiver) swaption is an optionto pay (receive) the fixed leg on a fixed-floating swap.

The payoff for a payer swaption maturing at time Tj , with the underlying security beinga fixed-for-floating swap making payments at times Tj+1, . . . , Tk, j < k ≤ N is simply

Vswaption(Tj) = (Vswap(Tj))+

=

k∑n=j

τnP (Tj , Tn+1)(Ln(T0)− κ)

+

and for t < Tj , the time-t value of a payer swaption is given by

Vswaption(t) = β(t)Et

[1

β(Tj)Vswaption(Tj)

]

= β(t)Et

1

β(Tj)

k−1∑n=j

τnP (Tj , Tn+1)(Ln(T0)− κ)

+= β(t)Et

[1

β(Tj)A(Tj)(S(Tj)− κ)+

]= A(t)EAt

[(S(Tj)− κ)+

]where

A(t) = Aj,k−j(t) =

k−1∑n=j

P (t, Tn+1)τn, S(t) = Sj,k−j(t) =P (t, Tj)− P (t, Tk)

A(t)

Pricing swaptions in the Libor model is more involved than pricing caps and will requiresome amount of approximation if a quick algorithm is required. Since S(t) is clearly aQA-martingale, an application of Ito’s lemma easily yields the following:

58

Lemma 7.2 (Swaption pricing in the SV-LM model (74)-(75)). Assume that the forwardrate dynamics in the spot measure are given by (74)-(75) and let QA be the measure inducedby using A(t) as numeraire. Then

dS(t) =√z(t)ϕ(S(t))

k−1∑n=j

wn(t)λn(t)>dWA(t) (79)

where

wn(t) =ϕ(Ln(t))

ϕ(S(t))

∂S(t)

∂Ln(T )

=ϕ(Ln(t))

ϕ(S(t))× S(t)τn

1 + τnLn(t)×

[P (t, Tk)

P (t, Tj)− P (t, Tk)+

1

A(t)

k−1∑i=n

τiP (t, Ti+1)

](80)

Sketch of proof. Since S(t) is QA-martingale, dS(t) is driftless. Hence, by Ito’s lemma,since S(t) = S(Lj(t), . . . , Lk−1(t)), we have

dS(t) =

k−1∑n=j

∂S(t)

∂Ln(t)

√z(t)ϕ(Ln(t))λn(t)>dWA(t)

Evaluating the partial derivatives ∂S(t)∂Ln(t) , we obtain (79).

The swap rate dynamics in (79) are clearly too complicated to allow analytical treatment.For simplicity, one generally assumes that the weights wn(t) are constant, and equal to theirtime-0 value18.

Proposition 7.3 (Approximate swaption pricing in the SV-LM model (74)-(75)). The time-0 price of the Tj-maturity swaption is given by

Vswaption(0) = A(0)EA[(S(Tj)− c)+

](81)

If wn(t) are the weights in (80), define

λS(t) =

k−1∑n=j

wn(0)λn(t)

The swap dynamics in Proposition 7.2 can be approximated by

dS(t) '√z(t)ϕ(S(t))‖λS(t)‖dY A(t), (82)

dz(t) = θ[z0 − z(t)]dt+ ηψ(z(t))dZ(t),

‖λS(t)‖dY A(t) =

k−1∑n=j

wn(0)λn(t)>dWA(t) (83)

where Y A(t) and Z(t) are independent scalar Brownian motions in measure QA.

• the scalar term ‖λS(t)‖ is purely deterministic, so computation of the QA-expectation(81) can be obtained using standard analytical tools for scalar stochastic volatilitymodels (c.f. Chapter 8 in [AP10-1]).

We next give an example for displaced log-normal Libor rates.

18C.f. discussion at the bottom of page 617 in [AP10-2].

59

Corollary 7.4 (swaption pricing under displaced log-normal Libor rates). Assume that eachrate Ln(t) follows a displaced log-normal process in its own measure

dLn(t) = [bLn(t) + (1− b)Ln(0)]λn(t)>dW n+1(t), n = 1, . . . , N − 1

the time-0 price of a Tj-maturity swaption is given by

Vswaption,j(0) = A(0)cB

(0,S(0)

b;Tj , c− S(0) +

S(0)

b; bλSj

)with term swap rate volatility given by

λSj =1

Tj

(∫ Tj

0

‖λS(t)‖2 dt

)1/2

,

λS(t) =

k−1∑n=j

ωn(0)λn(t)

Above cB(t, S;T,K, σ) is Black’s call option formula with volatility σ, namely

cB(t, S;T,K, σ) = SN (d+)−K(d−), d± =ln S

K ±12σ

2(T − t)σ√T − t

7.3.3 Spread options

We now sketch a crude approach for pricing spread options that is adequate for calibrationpurposes. A more accurate evaluation using copulas is presented later. Consider a spreadoption paying

Vspread(T ) = (S1(T )− S2(T )−K)+, T ≤ minTj1 , Tj2

where S1(t) = Sj1,k1−j1(t) and S2(t) = Sj2,k2−j2(t) so that

dSi(t) = O(dt) +√z(t)ϕ(Si(t))λSi(t)

>dWB(t),

λSi(t) 'ki−1∑n=j

ωSi,n(0)λn(t)

Its time-0 value in the T -forward measure is given by

Vspread(0) = P (0, T )ET[(S1(T )− S2(T )−K)+

]Simplification. Assume that the spread ε(T ) = S1(T )−S2(T ) is a Gaussian with mean

and variance

ET [ε(T )] = ET [S1(T )]− ET [S2(T )] ' S1(0)− S2(0),

Var[ε(T )] =

2∑i=1

ϕ(Si(0))2z0

∫ T

0

‖λS1(t)‖2 dt

−2ρterm(0, T )z0

2∏i=1

ϕ(Si(0))

(∫ T

0

‖λS1(t)‖2 dt

)1/2

ρterm(T, T ′) = Corr(S1(T )− S1(T ′), S2(T )− S2(T ′))

='∫ TT ′λS1

(t)> · λS2(t)√∫ T

T ′‖λS1

(t)‖2 dt ·√∫ T

T ′‖λS2

(t)‖2 dt

60

Bachelier’s formula then yields

Vspread(0) = P (0, T )

√VarT (ε(T )) · [dΦ(d) + φ(d)] ,

d =ET [ε(T )]−K√

VarT (ε(T ))

7.4 Calibration

Assume that a tenor structure T0 < T1 < · · · < TN has been selected, that we have decidedupon the number of Brownian motions m to be used and that we have selected the basicform of the LM model (e.g. local/stochastic volatility) to be deployed. To complete themodel specification we need to establish the m-dimensional deterministic volatility vectorsλk(·), k = 1, . . . , N − 1. We follow these steps:

(i) Prescribe the basic form of ‖λk(t)‖ by introducing discrete time- and tenor-grids.

(ii) Use correlation information to obtain λk(t) from ‖λk(t)‖.(iii) Choose a set of observable securities against which to calibrate the model and establish

the norm to be used for calibration.

(iv) Recover λk(t) by norm optimization.

7.4.1 Calibration objective function

Assume that our calibration targets are

Vswaption,1, . . . , Vswaption,NS Vcap,1, . . . , Vcap,NC

Denote by V their quoted market prices and by V (G) their model-generated prices as func-tions of the volatility gird G. We will use the following calibration objective function:

I(G) =wSNS

NS∑i=1

(Vswaption,i(G)− Vswaption,i

)2

+wCNC

NC∑i=1

(Vcap,i(G)− Vcap,i

)2

w∂tNxNt

Nt,Nx∑i,j=1

(∂Gi,j∂ti

)2

+w∂xNxNt

Nt,Nx∑i,j=1

(∂Gi,j∂xj

)2

w∂t2

NxNt

Nt,Nx∑i,j=1

(∂2Gi,j∂t2i

)2

+w∂x2

NxNt

Nt,Nx∑i,j=1

(∂2Gi,j∂x2

j

)2

(84)

where wS , wC , w∂t, w∂x, w∂t2 , w∂x2 are exogenously specified weights which determine thetrade-off between accuracy and smoothness. Here is what all the terms above account for:

1. Terms (i) and (ii) measure how well the model is capable of reproducing the suppliedmarket prices.

2. Term (iii) measures the degree of volatility term structure time homogeneity, and pe-nalizes volatility functions that vary too much over calendar time.

3. Term (iv) measures the smoothness of the calendar time evolution of volatilities andpenalizes deviations from linear evolution.

4. Term (v) and (vi) measure constancy and smoothness of in the tenor direction.

61

7.4.1.1 Error function for implied volatilities

The error (objective) function (84) is often applied implied volatilities as opposed to out-right prices, so one normally institutes a pre-processing step where the market price of eachswaption Vswaption,i is converted into a constant implied volatility λSi , in such a way thatthe scalar SDE for the swap rate Si underlying the swaption Vswaption,i,

dSi(t) =√z(t)λSiϕ(Si(t))dYi(t)

reproduces the observed market price. If λSi(G) denotes the corresponding model volatility(and λCi(G) denotes the corresponding implied volatility for caps), one obtains an alternativecalibration norm

I(G) =wSNS

NS∑i=1

(λSi(G)− λSi

)2

+wCNC

NC∑i=1

(λCi(G)− λCi

)2

+ · · ·

There are two main reasons for proceeding in this fashion:

• The relative scaling of individual swaptions and caps is more natural (high-valuedtrades tend to be overweighed using outright prices).

• In some models, the implied volatilities λSi and λCi can be computed by simple timeintegration, so using them for calibration avoids the need of using time-consumingoption pricing formulas to obtain Vswaption,i and so forth. For instance, in Corollary7.4, where we illustrated how to price swaptions under displaced log-normal Libor rates

dLn(t) = [bLn(t) + (1− b)Ln(0)]λn(t)>dWn+1(t), n = 1, . . . , N − 1

we saw that the time-0 price of a Tj-maturity swaption is given by Black’s call optionformula

Vswaption,j(0) = A(0)cB

(0,S(0)

b;Tj , c− S(0) +

S(0)

b; bλSj

)with term swap rate volatility given by

λSj =1

Tj

(∫ Tj

0

‖λS(t)‖2 dt

)1/2

,

λS(t) =

k−1∑n=j

ωn(0)λn(t)

where the weights ωn(t) are given in (80). Given Vswaption,j , the implied volatility in

this case would be λSj := bλSj .

7.4.2 Calibration algorithm

Fix a number of factors m and fix a tenor structure 0 = T0 < T1 < · · · < TN and assumethat all the parameters in our Libor model have been fixed exogenously19 except for the m-dimensional vectors λk(t), k = 1, . . . , N−1, which we now seek to calibrate. Define functionsh : R2→Rm and g : R2→R as follows

λk(t) = h(t, Tk − t), ‖λk(t)‖ = g(t, Tk − t)19Namely, assume that we have exogenously fixed the parameters defining the square-root process for z(t) (i.e.

θ, η) and the parameters defining the local volatility function ϕ(·) (i.e. b if we use ϕ(Ln(t)) = bLn(t)+(1−b)Ln(0).

62

Fix a time/tenor grid

ti × xj, i = 1, . . . , Nt, j = 1, . . . , Nx, Nt, Nx ∼ 10 : 15

subject tot1 + xNx ≥ TN , and ti + xj ∈ Tn

have been selected, together with the number of Brownian motions m, a correlation matrixR, the set of calibration swaptions and caps and the weights in the calibration norm I.

Starting from some guess for the Nt ×Nx matrix G, Gij = g(ti, xj) (so that for some n

we have g(ti,

xj︷︸︸︷Tn − ti) = ‖λn(ti)‖), we run the following iterative algorithm:

1. Given G, use interpolation to obtain the full norm volatility grid ‖λn,k‖ for all Liborindices k = 1, . . . , N − 1 and all expiry indices n = 1, . . . , k. This is done by assumingthat, for each k, the norm ‖λk(t)‖ is piecewise constant in t

‖λk(t)‖ =

k∑n=1

I[Tn−1,Tn)(t)‖λn,k‖

‖λn,k‖ = w++Gi,j + w+−Gi,j−1 + w−+Gi−1,j + w−−Gi−1,j−1

and using linear interpolation in both dimensions of G.

Figure 1: Computation of ‖λn,k‖ = ‖λk(t)‖|[Tn−1,Tn) = g(t, Tk − t)

2. For each n = 1, . . . , N − 1, compute the (N − n)×m matrix H(Tn),

[H(Tn)]ji = hi(Tn, Tn+j−1 − Tn), i = 1, . . . ,m, j = 1, . . . , N − n

as follows: by construction such H(Tn)H(Tn)> will be the covariance matrix of the(N−n)-dimensional vector20 (Ln(Tn), . . . , LN−1(Tn)). We can construct it by definingan instantaneous correlation (N −n)× (N −n) matrix (constructed from an estimatedparametric form)

[R(Tn)]ij = Corr(Li(Tn), Lj(Tn)), i, j = n, . . . , N − 1 (85)

and a diagonal volatility matrix

[c(Tn)]jj = ‖λn,n+j−1‖, j = 1, . . . , N − n

so that the instantaneous covariance matrix will be

C(Tn) = c(Tn)R(Tn)c(Tn)

20Indeed, the (i, j)-term of the matrix H(Tn)H(Tn)> is

h(Tn, Tn+i−1)>h(Tn, Tn+j−1) = λn+i−1)(Tn)> · λn+j−1)(Tn) = Cov(dLn+i−1(Tn), dLn+j−1(Tn))

63

We can finally solve for H(Tn) in the equation

c(Tn)R(Tn)c(Tn) = H(Tn)H(Tn)>

via PCA.

(i) Covariance PCA: Let Λm(Tn) be a diagonal matrix containing the m largesteigenvalues of R(Tn), and let em(Tn) be an (N − n) ×m whose columns containthe corresponding eigenvectors. From the approximation21

c(Tn)R(Tn)c(Tn) ' em(Tn)Λm(Tn)em(Tn)> = em(Tn)√

Λm(Tn)√

Λm(Tn)em(Tn)>

it follows thatH(Tn) ' em(Tn)

√Λm(Tn)

(ii) Correlation PCA: It is more efficient22 to work with the correlation matrixdirectly and factor

R(Tn) = D(Tn)D(Tn)>

where D(Tn) is an (N − n)×m matrix obtained using PCA analysis, namely

D = arg minD′ h(D′;R) s.t. v(D) = 1 [∗]= arg minD′h(D′;R+ diag(µ)

where h(D;R) = tr((R−DD>)(R−DD>)>

), v(D) denotes the vector of diag-

onal elements of DD> and µ∗ is the solution of the equation

v(Dµ) = 1.

The passage from a constrained minimization problem to an unconstrained one in[∗] is justified by [AP10-2, Proposition 14.3.2].

3. Given λk(·) for all k = 1, . . . , N − 1, use the pricing formulas in Propositions 7.1 and7.3 to model prices for all swaptions and caps in the calibration set.

4. Establish the value of I(G) by direct computation.

5. Update G and repeat steps 1−4 until I(G) is minimized, using for instance the downhillsimplex method.

Remark 7.2. We will see how to calibrate the LM model to skews and smiles in section7.7.3.

7.4.3 Correlation calibration to spread options

In the above calibration algorithm, the instantaneous correlation matrix R (85) was specifiedexogenously by calibrating a parametric form such as (77) to an empirical forward ratecorrelation matrix (76). Instead, one can attempt to imply R from market data.

Assume the matrix R is time-inhomogeneous and is specified as some parametric functionof a low-dimension (unknown) parameter-vector23 ξ

R = R(ξ)

For this, let I(G, ξ) be our old objective function (which depends on ξ, since the prices ofcaps and swaptions depend on the correlation matrix). Introduce:

21Which minimizes the norm Tr[(C(Tn)− em(Tn)Λm(Tn)em(Tn)>) · (C(Tn)− em(Tn)Λm(Tn)em(Tn)>)

].

22Even though the PCA decomposition of a correlation matrix is technically more difficult, the correlation PCAis independent of the matrix c(Tn) and as such it won’t have to be updated when we update guesses for the Gmatrix in a calibration search loop.

23To be determined in the calibration procedure, along with the elements of G.

64

• Set of market-observable spread options Vspread,1, . . . , Vspread,NSP .

• Corresponding model-based prices Vspread,1(G, ξ), . . . , Vspread,NSP (G, ξ), computed us-ing ??

Update the old norm I to the norm

I∗(G, ξ) = I(G, ξ) +ωNSNSP

NSP∑i=1

(Vspread,i(G, ξ)− Vspread,i

)2

7.5 Monte Carlo simulation

Once the LM model has been calibrated to market data, we can proceed to use the param-eterized model for the pricing and risk management of non-vanilla options. Given the largenumber of Markov state variables (the full number of Libor forward rates), one must nearlyalways rely on Monte Carlo simulation.

Goal: given a probability measure and the state of the Libor forward curve at time t,we need to move the entire Libor curve forward to time t+ ∆, ∆ > 0.

7.5.1 Euler-type schemes

Assume we stand at time t and we know Lq(t)(t), Lq(t)+1(t), . . . , LN−1(t). We seek to advanceto time t+ ∆ and construct a sample of Lq(t+∆)(t+ ∆), Lq(t+∆)+1(t+ ∆), . . . , LN−1(t+ ∆).If q(t + ∆) exceeds q(t), some of the front-end forward rates expire and drop off the curveas we move to t+ ∆. Working in the spot measure QB , the general LM model dynamics are

dLn(t) = σn(t)>[µn(t)dt+ dWB(t)

]µn(t) =

n∑j=q(t)

τjσj(t)

1 + τjLj(t)

The Euler and log-Euler schemes for this model are:

Ln(t+ ∆) = Ln(t) + σn(t)>[µn(t)∆ +

√∆Z

],

Ln(t+ ∆) = Ln(∆) exp

[1

Ln(t)σn(t)>

((µn(t)− 1

2

σn(t)

Ln(t)

)∆ +

√∆Z

)]

where Z ∼ N = (0, Im) is a vector of m independent Gaussian draws.

7.5.2 Special-purpose schemes with drift Predictor-Corrector

In integrated form, the general LM dynamics become

Ln(t+ ∆) = Ln(t) +

∫ t+∆

t

σn(u)>µn(u) du+

∫ t+∆

t

σn(u)>dWB(u)

not= Ln(t) +Dn(t, t+ ∆) +Mn(t, t+ ∆) (86)

On occasions high-performance special-purpose schemes exist for simulation of Mn(t, t+∆). Generating M(t, t + ∆) by a special-purpose scheme, we can use a predictor-corrector

65

scheme for Ln:

Ln(t+ ∆) = Ln(t) + σn(t, L(t))>µn(t, L(t))∆ + M(t, t+ ∆)

Ln(t+ ∆) = Ln(t) + θPCσn(t, L(t))>µn(t, L(t))∆

+(1− θPC)σn(t+ ∆, L(t+ ∆))>µn(t+ ∆, L(t+ ∆))∆

+M(t, t+ ∆) (87)

M(t, t+ ∆) = σn(t)>√

∆Z

where L(t) = (L1(t), . . . , LN−1(t))>, L is the semi-implicit corrector scheme, L is theexplicit predictor scheme and θPC ∈ [0, 1] is the parameter that determined the amount ofimplicitness.

7.5.2.1 Lagging Predictor-Corrector scheme

7.6 Interpolation of front / back stubs

So far, we have seen how to obtain, at any time t, a vector of forward Libor rates

L(t) = (Lq(t)(t), Lq(t)+1, . . . , LN−1(t))

on a pre-specified tenor structure 0 ≤ T0 < · · · < TN .

In order to be able to compute P (t, Tn) for all n we need to additionally establish thefrontstub factor P (t, Tq(t)). Indeed, recall that

P (t, Tn) = P (t, Tq(t))

n−1∏i=q(t)

1

1 + τiLi(t)

Also, in order to compute P (t, T ) for arbitrary T , the back stub discount factor P (t, T, Tq(T )) =P (t,Tq(T ))

P (t,T ) is needed since

P (t, T ) =P (t, Tq(T ))

P (t, T, Tq(T ))=

1

P (t, T, Tq(T ))

P (t,Tq(T ))︷︸︸︷P (t, Tq(t))

q(T )−1∏i=q(t)

1

1 + τiLi(t)

The front and back stubs cannot be generally expressed in terms if Libor rates.

7.6.1 Back stub interpolation

A simple idea to obtain P (t, T, Tm) is to interpolate between P (t, Tm, Tm) = 1 and P (t, Tm−1, Tm) =1

1+τm−1Lm−1(t) , for instance setting

P (t, T, Tm) =T − Tm−1

Tm − Tm−1+

Tm − TTm − Tm−1

P (t, Tm−1, Tm)

Interpolations like these violate the basic no-arbitrage condition

P (0, Tm−1, Tm) = ETm−1 [P (t, Tm−1, Tm]

One way to remedy this is to consider choosing na arbitrary constant α(T ) and setting

P (t, Tm−1, Tm) = P (0, Tm−1, Tm) + α(T ) (P (t, Tm−1, Tm)− P (0, Tm−1, Tm))

66

subject to the consistency conditions α(Tm) = 1 and α(Tm−1) = 0. Note

dP (t, Tm−1, T ) = O(dt) + α(T )dP (t, Tm−1, Tm),

dP (t, Tm−1, T )

P (t, Tm−1, T )= O(dt) + [σP (t, Tm−1)− σP (t, T )] dW (t),

dP (t, Tm−1, Tm)

P (t, Tm−1, Tm)= O(dt) + [σP (t, Tm−1)− σP (t, Tm)] dW (t)

We may thus approximate α(T ) as

α(T ) ' P (0, Tm−1, T )

P (0, Tm−1, Tm)

‖σP (t, Tm−1)− σP (t, T )‖‖σP (t, Tm−1)− σP (t, Tm)‖

This approximation turns the problem of interpolating bond prices to that of interpolatingbond volatilities. For instance, in the one-dimensional Gaussian model with constant meanreversion κ we have

σP (t, T ) = σ(t)1− e−κ(T−t)

κ

Plugging this into the approximation above yields

α(T ) ' P (0, Tm−1, T )

P (0, Tm−1, Tm)

1− e−κ(T−Tm−1)

1− e−κ(Tm−Tm−1)

where κ could be either set as part of the user input or obtained by best-fitting theGaussian parametric form to the volatility structure of the LM model.

7.6.2 Back stub interpolation inspired by the Gaussian model

From the bond reconstitution formula in the Gaussian model we have

P (t, Tm−1, Tm) = P (0, Tm−1, Tm)× exp[−(G(Tm−1, Tm)e−κ(Tm−1−t)x(t′)

]× exp

[−1

2(G(t, Tm)2 −G(t, Tm−1)2)y(t)

]where

G(t, T ) =1− e−κ(T−t)

κ, y(t) = e−2κt

∫ t

0

e2κsσ(s)2 ds

A simple computation yields

P (t, Tm−1, T ) = P (0, Tm−1, T )

(P (t, Tm−1, Tm)

P (0, Tm−1, Tm)

) G(Tm−1,T )

G(Tm−1,Tm)

× exp

[1

2

G(Tm−1, T )

G(Tm−1, Tm)(G(t, Tm)2 −G(t, Tm−1)2)y(t)

]× exp

[−1

2(G(t, T )2 −G(t, Tm−1)2)y(t)

]where the volatility σ(t) and the mean reversion parameter κ can be obtained by fitting

the Gaussian volatility structure to the volatilities of Libor rates generated by the LM modelitself.

67

7.6.3 Front stub interpolation inspired by the Gaussian model

Using the forward bond prices we obtained in the previous subsection P (t′, Tm, Tm+1), t′ ≥m = q(t′), we can now construct P (t′, Tm) from these by employing a bond reconstitutionformula from a Gaussian model.

Assume that for short tenors, the LM model can be locally approximated by a one-factorGaussian model with constant mean reversion. In particular we have

P (t′, Tm) = P (0, t′, Tm) exp

[−G(t′, Tm)x(t′)− 1

2G(t′, Tm)2y(t′)

],

P (t′, Tm, Tm+1) = P (0, Tm, Tm+1)× exp [−(G(t′, Tm+1)−G(t′, Tm))x(t′)]

× exp

[−1

2(G(t′, Tm+1)2 −G(t′, Tm)2)y(t′)

]whereby one obtains

P (t′, Tm) = P (0, t′, Tm)

(P (t′, Tm, Tm+1)

P (0, Tm, Tm+1)

) G(t′,Tm)

G(t′,Tm+1)−G(t′,Tm)

×exp

(1

2G(t′, Tm)G(t′, Tm+1)y(t′)

)where

• The short rate volatility σ(t) can be approximated by the volatilities of the front Liborrate.

• The mean reversion parameter κ can be used to set the stub bond volatility near to itsmarket-implied value.

7.7 Advanced approximation for swaption pricing via Markovianprojection

So far we have seen how to calibrate the SV-LM model (74)-(75)



)µn(t) =

n∑j=q(t)

τjϕ(Lj(t))λj(t)

1 + τjLj(t)

dz(t) = θ[z0 − z(t)]dt+ ηψ(z(t))dZ(t), z(0) = 0

and how to use it to price non-vanilla derivatives via Monte Carlo discretization. In order tocalibrate the model to liquid caps and swaptions, we developed fast pricing formulas (78)-(81) for the prices thereof, which in the case of swaptions required the use of approximateddynamics for the swap rate S(t).

• When approximating the dynamics of the swap rate S(t) to price swaptions via (82)

dS(t) '√z(t)ϕ(S(t))‖λS(t)‖dY A(t)

the skew of the swap rate is taken to be the same as the (common) skew of all Liborrates, yet numerical simulation shows that this is not the case.

• The model (74)



)uses the same time-homogeneous local volatility function ϕ(·) for all Libor rates, whichimplies that swaptions of all tenors and expiries have essentially the same volatilityskew, a model feature which is again inconsistent with market reality.

68

We thus consider the following generalization the old LM model specification (74)-(75)

dLn(t) =√z(t)ϕn(t, Ln(t))λn(t)>dWTn+1(t), n = 1, . . . , N − 1 (88)

dz(t) = θ[z0 − z(t)]dt+ η(t)ψ(z(t))dZ(t), z(0) = 0 (89)

and assume

ϕn(t, Ln(0)) = 1

ϕn(t, x) ' 1 + bn(t) [x− Ln(0)] ,

bn(t) = ∂xϕn(t, Ln(0))

The dynamics of the swap rate S(t) = Sj,k−k(t) in this model are

dS(t) =√z(t)λS(t,L(t))>dWA(t) (90)

λS(t,L(t)) =

k−1∑n=j

∂S(t)

∂Ln(t)ϕn(t, Ln(t))λn(t)

or in one-dimensional form

dS(t) =√z(t)‖λS(t,L(t))‖dY A(t)

Using standard results on Markovian projection, we can derive the following approximation.

Proposition 7.5 (Swap rate dynamics approximation). For the purpose of European swap-tion valuation, the dynamics of the swap rate S(t) in measure QA are approximately givenby the following displaced log-normal stochastic volatility SDE

dS(t) '√z(t)pS(t) [1 + bS(t)(S(t)− S(0))] dY A(t) (91)

pS(t) = ‖λS(t, EA[L(t)]

)‖ (92)

bS(t) =1

pS(t)

k−1∑n=j

∂‖λS(t, EA[L(t)]

)‖

∂Ln(t)

∫ t0λn(u)>λS (u,L(0)) du∫ t0‖λS (u,L(0)) ‖2 du

(93)

As shown in Proposition 7.3, the price of a Tj-maturity European swaption is given by

Vswaption(0) = A(0)EA[(S(Tj)− c)+

]and S(T ) is driven by the stochastic volatility model with time-dependent parameters

dS(t) '√z(t)pS(t) [1 + bS(t)(S(t)− S(0))] dY A(t),

dz(t) = θ[z0 − z(t)]dt+ ηψ(z(t))dZ(t)

This is a stochastic volatility model with time-dependent parameters, so using the time-averaging techniques of [AP10-1, Section 9.3] we have devised a fast way of pricing Europeanswaptions.

The improvements over our early approximation essentially stem from the evaluationof the volatility function of the swap rate λS(t,L(t)) at L(t) = EA[L(t)] rather than atL(t) = L(0).

69

7.7.1 Advanced formula for the swap rate volatility

In order to compute the swap rate volatility (92)

pS(t) = ‖λS(t, EA[L(t)]

)‖

we need to approximate the expectations EA[Ln(t)]

Proposition 7.6 (Advanced formula for the swap rate volatility). For j ≤ n ≤ k.1, theexpected value of the n-th Libor rate in the annuity measure is approximately given by

EA[Ln(t)] = Ln(0)[1 + cn(t)]

cn(t) =1

Ln(0)Qn(0)

k−1∑i=j

∂Qn(0)

∂Li(0)

∫ t

0

λ>i (s)λn(s) ds,

Qn(t) =A(t)

P (t, Tn+1)

7.7.2 Advanced formula for the swap rate skew

We now show how to compute the swap rate skew (93).

Proposition 7.7 (Advanced formula for the swap rate skew). The time-dependent swaptionskew bS(t) 93 is approximately given by

bS(t) =

k−1∑i,n=j

rS,n(t)

rS,i(t)vi,n(0)ξi(t) +

k−1∑i=j

bi(t)vi(0)ξi(t)

rS,i(t) = λi(t)>λS(t,L(0)),

rS(t) = ‖λS(t,L(0))‖2

ξi(t) =rS,i(t)

∫ t0rS,i(u) du

rS(t)∫ t

0rS(u) du

7.7.3 Skew and smile calibration

The general model

dLn(t) =√z(t)ϕn(t, Ln(t))λn(t)>dWTn+1(t), n = 1, . . . , N − 1

dz(t) = θ[z0 − z(t)]dt+ η(t)ψ(z(t))dZ(t), z(0) = 0

ϕn(t, x) = 1 + bn(t) [x− Ln(0)]

has enough flexibility to match (in addition to the correlation structure of Libor rates):

• ATM volatilities of all European swaptions, using ‖λS(t)‖.• Slopes of volatility smiles (skews), using bn(t).

• Curvatures of smiles for swaptions of different expiries and fixed tenor, using η(t).

Consider NS swaptions Vswaption,1, . . . ,Vswaption,NS as calibration targets, where eachVswaption,i represents a collection of swaptions with fixed tenor and different strikes.

We first convert the observed market prices Vswaption,1, . . . , Vswaption,NS into market-

implied SV parameters, namely implied volatilities λSi , skews bSi and volatilities of variance

70

ηSi for i = 1, . . . , NS (these conversions are routinely maintained and updated by tradingdesks, c.f. [AP10-1, Page 378]), presumably by a simple root-search24.

As in the standard Libor model, consider a grid of calendar times tiNti=1 and tenors

xjNxj=1 and define:

• [Gλ]i,j = ‖λn(ti)‖, where we recall that n is such that Tn − ti = xj .

• [Gb]i,j = bn(ti).

• [Gη]i = η(ti).

The formulas from the previous section allow us to compute model term volatilities, skewsand volatilities of variance by

λSi(Gλ, Gb, Gη), bSi(Gλ, Gb, Gη), ηSi(Gλ, Gb, Gη)

Instead of defining a uniform norm incorporating all these terms (which would result ina hard multi-dimensional non-linear optimization problem), we define three norms25:

Iλ(Gλ, Gb, Gη) =ωS,λNS

NS∑i=1

(λSi(Gλ, Gb, Gη)− λSi

)2

+ · · ·

Ib(Gλ, Gb, Gη) =ωS,bNS

NS∑i=1

(bSi(Gλ, Gb, Gη)− bSi

)2

+ · · ·

Iη(Gλ, Gb, Gη) =ωS,ηNS

NS∑i=1

(ηSi(Gλ, Gb, Gη)− ηSi)2

+ · · ·

This leads to the following calibration algorithm:

1. Make initial guesses I0b and I0

η . For instance take

I0b =

1

NS

NS∑k=1

bSk , I0η =

1

NS

NS∑k=1

ηSk

2. Use the calibration algorithm in section 7.4.2 to obtain

G1λ := arg minGλIλ(Gλ, G

0b , G

0η).

3. Iterate over Gb to obtain

G1b := arg minGbIb(G

1λ, Gb, G

0η).

4. Iterate over Gη to obtain

G1η := arg minGηIη(G1

λ, G1b , Gη).

One could then iterate further starting with (G1b , G

1η) on the first step.

24See the end of section 7.4.1 for a concrete example.25Impact of changes in the Libor skews on term swaption volatilities, or their volatilities of variance, is rather

small.

71

7.8 Other extensions

7.8.1 Evolving separate discount and forward rate curves

A single yield curve for discounting and calculating Libor rates us not always compatible withno-arbitrage constraints of cross-currency markets. In general, separating the discountingcurve from the forward curve will ensure that linear instruments (i.e. swaps and bonds) arecorrectly priced at time 0.

7.8.2 SV models with non-zero correlation

So far we have assumed that the scalar Brownian motion Z(t) driving the variance processwas independent of all the components of the m-dimensional Brownian motion W (t). We canrelax this assumption and assume a non-zero deterministic correlation vector ρ(t) betweenZ(t) and WB(t).

7.8.3 Multi-stochastic volatility extensions

In the specification (74)-(75) of the LM model, a single stochastic volatility process√z(t) is

used to scale the diffusion coefficients of all forward rates. While sufficiently rich to introducethe volatility smile for all European swaptions, it presents some limitations.

(i) Interest rate exotics link to the spread of two CMS rates derive their values from thedistribution of the slope of the interest rate curve and a common stochastic volatilityfactor applied to all rates does not always allow for sufficient control over the distribu-tion of the slope of the interest rate curve.

(ii) The are of multi-dimensional stochastic volatility interest rate modeling is quite new.

72

8 Single-rate vanilla derivatives

Here we study derivatives whose payoffs can be decomposed as a linear combination offunctions of individual rate observations (either exactly or approximately).

8.1 European swaptions

Given a tenor structure 0 < T0 < · · · < TN , consider swaptions of different expiriesTnn=0,...,N−1 that can be exercised into swaps that start at Tn and cover m periods.Define

An,m(t) =

n+m−1∑i=n

τiP (t, Ti+1), Sn,m(t) =1

A(t)

n+m−1∑i=n

τiP (t, Ti+1)Li(t)

The time-t value of the m-period swaption expiring at Tn is

Vn,m(t) = An,m(t)En,mt[(Sn,m(t)− k)+

]For each swap rate Sn,m(t) we specify stochastic volatility (SV) dynamics in the annuity

measure

dSn,m(t) = λn,m [bn,mSn,m(t) + (1− bn,m)Sn,m(0)]√zn,m(t)dWn,m(t) (94)

dzn,m(t) = θ [1− zn,m(t)] dt+ ηn,m√zn,mdZ

n,m(t)

where recall

(i) θ is the mean-reversion of variance

• It controls the speed at which deviations of z(·) away from z0 are pulled back

towards this level. Increasing θ decreases the long-term variance of z().

• We assume it is global: same parameter for all swaptions.

• The idea is to choose θ in order to minimize the variability of ηn,mn,m acrossdifferent expiries n.

(ii) ηn,m is the volatility of variance and controls the curvature of the volatility smile.

Swaption calibration is performed individually for each grid point: fix a pair maturity-tenor (n,m). Suppose a collection of strikes K1, . . . ,KJ , along with corresponding marketprices V1, . . . , VJ (where we omit n,m from the notation since they are now fixed). Theoptimal parameters λ∗, b∗, η∗ are obtained as the solution of the following minimizationproblem

(λ∗, b∗, η∗) = argminλ,b,ηI(λ, b, η), I(λ, b, η) =

J∑j=1

wJ

(V (Kj ;λ, b, η)− Vj

)2

(95)

The algorithm to calibrate a SV-model to swaption prices is hence the following:

1. Choose θ.

2. For each maturity-tenor pair (n,m), find the optimal parameters (λ∗n,m, b∗n,m, η

∗n,m) by

solving the optimization problem (95).

3. If the ηn,m’s increase (decrease) with n, reduce (increment) θ and iterate.

73

8.2 Caps and floors

8.3 Terminal swap rate (TSR) models

• Pricing swaptions and caps is simple because valuation requires only knowledge of theterminal distribution of a single swap rate in the appropriate annuity measure.

1. Given a tenor structure 0 < T0 ≤ · · · ≤ TN , consider swaptions of different expiriesTnn=0,...,N−1 that can be exercised into swaps that start at Tn and cover mperiods. Define

An,m(t) =

n+m−1∑i=n

τiP (t, Ti+1), Sn,m(t) =1

A(t)

n+m−1∑i=n

τiP (t, Ti+1)Li(t)

The time-t value of the m-period swaption expiring at Tn is

An,m(t)En,mt[(Sn,m(t)− k)+

]2. A cap...

• Much more common a relatively simple payoffs that depend on the rate S(T ), butwhich also require the knowledge of additional discount bonds.

• Idea underlying TSR models: instead of using a full-blown term structure model,it’s better to functionally link the values of discount bonds on the date T to the drivingrate S(T ) through certain approximations.

Given a tenor structure 0 < T0 < T1 < · · · < TN and defining the annuity and swap rateas usual

A(t) = A0,N (t) =

N−1∑n=0

τnP (t, Tn+1), S(t) = S0,N (t) =P (t, T )− P (t, TN )

A(t)

the value of a derivative with payoff X in the annuity measure is

V (0) = A(0)EA[

X

A(T )

](96)

If P (T,M)M≥T is a family of discount bonds of various maturities (observed at timeT ), a TSR model exogenously specifies a family of maturity-indexed functionsπ(·,M)M≥T such that

P (T,M) = π(S(T ),M), ∀M ≥ T (97)

satisfying certain conditions

(i) No-arbitrage condition: when applying the valuation formula (96) to the specifica-tion (97) one should be able to reproduce the initial discount bond prices

P (0,M) = A(0)EA[π(S(T ),M)

A(T )

]= A(0)EA

[π(S(T ),M)∑N−1

n=0 τnπ(S(T ), Tn+1)

], M ≥ T

(98)

(ii) Consistency condition: the swap rate S(T ) itself is a function of discount factors

S(T ) = 1−P (T,TN )∑N−1n=0 τnP (T,Tn+1)

so we need to impose that

x =1− π(x, TN )∑N−1

n=0 τnπ(S(T ), Tn+1), ∀x (99)

74

(iii) Reasonable family of functions: we generally require the functions π(·,M)M torange between 0 and 1, to be monotonic in M and to be continuous in (x,M). Gener-ally one chooses a parametric class for the functions π(·,M)M and then one choosesfunctions within the parametric class satisfying the no-arbitrage and consistency con-ditions.

8.3.1 Linear TSR models

We use the linear specification

π(x,M)∑N−1n=0 τnπ(x, Tn+1)

= a(M)x+ b(M), M ≥ T (100)

for deterministic functions a(·), b(·).

• The no-arbitrage condition implies that

b(M) =P (0,M)

A(0)− a(M)S(0) (101)

• The consistency condition implies that

b(T0) = b(TN ) (102)

a(T0) = 1 + a(TN ) (103)

• The specification (100) itself imposes

N−1∑n=0

τna(Tn+1) = 0, (104)

N−1∑n=0

τnb(Tn+1) = 1. (105)

To complete the model specification one may proceed as follows:

(i) Choose coefficients a(T1), . . . , a(TN ) subject to condition (104), as described below.

(ii) Calculate a(T ) = a(T0) = 1 + a(TN ) from condition (103).

(iii) Calculate the rest of a(M)’s by interpolation of a(T1), . . . , a(TN ).(iv) Calculate the b(M)’s by condition (101).

The coefficients a(T1), . . . , a(TN ) can be connected to a mean reversion pa-rameter: this will not only reduce the number of parameters required, will also parameterizethe model with a single parameter that has a strong financial interpretation. The argumenton pages 713-714 on [AP10-2] shows that one may choose

a(M) =P (0,M) (γ −G(T,M))

P (0, TN )G(T, TN ) + S(0)γ

γ =

∑τnP (0, Tn+1)G(T, Tn+1)∑

τnP (0, Tn+1)

G(t, T ) =

∫ T

t


This also eliminates the need to interpolate in (iii) and facilitates better risk management(c.f. discussion on page 714 in [AP10-3]).

75

8.4 Libor-in-Arrears

A Libor-in-Arrears cash flow pays the Libor rate on the date when it fixes, rather than onthe date it matures. We focus on a single cash flow, the valuation of a full strip followingby additivity.Let T > 0 be the start date, and M the end date of the period covered by

Libor rate L(t) ≡ L(t;T,M) = P (t,T )−P (t,M)τP (t,M) , with τ = M − T . The time-0 value of a

Libor-in-Arrears cash flow is

VLIA(0) = β(0)E

[L(T )

β(T )

]One would then naturally switch to the T -forward measure, but traded caplets provide

information about the distribution of the Libor rate in the M -forward measure, not theT -forward measure26.

The valuation formula in the M -forward measure is

VLIA(0) = P (0,M)EM[

L(T )

P (T,M)

]= P (0,M)EM [(1 + τL(T ))P (T,M)]

= P (0,M)(L(0) + τEM

[L2(T )

])The term EM

[L2(T )

]can be computed by Proposition ??

EM[L2(T )

]= L2(0) + 2

∫ L(0)

−∞p(0, L(0);T,K) dK + 2

∫ ∞L(0)

c(0, L(0);T,K) dK (106)

where p(t, L;T,K) and c(t, L;T,K) are undiscounted values of put and call options on therate L(T ) with strike K (i.e. simple undiscounted floorlets and caplets).

8.5 Libor-with-Delay

A Libor-with-delay cash-flow pays the Libor rate L(t, T,M) at some arbitrary payment timeTp ≥ T . In the M-forward measure we have

VLD(0) = P (0,M)EM[P (T, Tp)

P (T,M)L(T )

]and we can now use a TSR model to express P (T, Tp) in terms of the Libor rate.

Alternatively one can draw inspiration from the quasi-Gaussian model to express P (T, Tp)in terms of L(T ). Recall that

P (T,M) = P (T,M ;x(T ), y(T )), P (T,M ;x, y) =P (0,M)

P (0, T )exp

(−G(T,M)x− 1

2G(T,M)2y

)Approximating y(T ) by a deterministic function y(T ) := E[y(T )] and denoting by X(T, `)

the inverse function in x to L(T, x, y(T )) we can write

P (T, Tp) = P (T, Tp;X(T, L(T )), y(T )) (107)

26Note that L(t) = L(t;T,M) is a martingale in the M -forward measure, but not in the T -forward measure, soET [L(T )] 6= L(0). To characterize this situation one defines the Libor-in-Arrears convexity adjustment

DLIA(0)def= ET [L(T )]− L(0).

In general, by convexity we generally mean the difference of valuations under different measures (in this case thedifference between the measure appropriate for the given payment date and the measure in which the market rateis a martingale).

76

Assuming a linear relationship between the Libor rate and the state variable x

L(T ) ' L(T, 0, 0) +∂L

∂x(T, 0, 0)x(T ) =⇒ Var(L(T )) '

(∂L

∂x(T, 0, 0)

)2

Var(x(T ))

whence

y(T ) = Var(x(T )) '(∂L

∂x(T, 0, 0)

)−2

Var(L(T ))

Thus given P (T, Tp), we can solve equation (107) implicitly for L(T ) to obtain our soughtfor relation. Note that this procedure might result in a mild violation of the no-arbitragecondition.

8.6 CMS and CMS-Linked cash flows

A CMS (linked) cash-flow pays (a function of) the swap rate S(T ) at time T ≤ Tp < T1.It is hence more naturally valued in Tp-forward measure, even though the market-implieddistribution of S(T ) is known in the annuity measure. We have

VCMS(0) = A(0)EA[P (T, Tp)

A(T )S(T )

](108)

Lemma 8.1 (Replication method for CMS pay-offs). Assuming that the annuity mapping

function α(S(T )) =P (T,Tp)A(T ) is known, then

VCMS(0) = A(0)S(0)α(S(0))+

∫ S(0)

−∞w(K)Vrec(0,K) dK+

∫ ∞S(0)

w(K)Vpay(0,K) dK (109)

where

Vrec(0,K) = A(0)EA[(K − S(T ))+

]Vpay(0,K) = A(0)EA

[(S(T )−K)+

]w(s) =

d2

ds2(α(s)s)

where Vrec(0,K), Vpay(0,K) are the time-0 values of receiver and payer swaptions.

Using iterated conditioning one easily sees:

Proposition 8.2. The annuity mapping function α(s) may be written as the conditionalexpectation

α(s) = EA[P (T, Tp)

A(T )

∣∣∣∣S(T ) = s

](110)

In order to compute the conditional expectation (110), it is useful to remember that giventwo random variables X,Y , the conditional expectation can be interpreted as

E[X|Y ] = f∗(Y ), f∗ = argminE[(X − f(Y ))2

], f ∈ B

for a space B of suitably regular functions on Y , generally defined by a parametric functionalform B =

f(y; θ), θ ∈ Θ ⊂ Rd

. Imposing the conditions

∂

∂θiE[(X − f(Y ; θ))2

]= 0, i = 1, . . . , d

we obtain the following Proposition.

77

Proposition 8.3. Given two random variables X and Y and a paremetric set of functionsf(y; θ), θ ∈ Θ ⊂ Rd, an approximation to E[X|Y ] is given by

E[X|Y ] ' f(Y ; θ∗)

where θ∗ is a solution to the set of equations

E

[X∂f

∂θi(Y ; θ)

]= E

[f(Y ; θ)

∂f

∂θi(Y ; θ)

], i = 1, . . . , d (111)

8.6.1 Linear TSR model

Assume α(s) = α1s+ α2. Imposing the no-arbitrage condition we obtain

P (0, Tp)

A(0)= EA

[P (T, Tp)

A(T )

]= EA [α1S(T ) + α2] = α1S(0) + α2

whence

α2 =P (0, Tp)

A(0)− α1S(0)

α1 can be considered an exogenous input. The value of the CMS payoff is hence


A(T )S(T )

]= EA[α(S(T ))S(T )] = EA [(α1S(T ) + α2)S(T )]

= P (0, Tp)S(0) + α1A(0)VarA[S(T )]

where the variance VarA[S(T )] can be computed (as in the Libor-in-arrears case) by thereplication method27

VarA[S(T )] = 2

∫ S(0)

−∞S(K)p(0, S(0);T,K) dK + 2

∫ ∞S(0)

S(K)c(0, S(0);T,K) dK

8.6.2 Libor market model

Establishing dependency explicitly would be too complicated if our sole purpose is to priceCMS cash-flows, but it would be useful for applications of Libor market models to exoticderivatives that are linked to CMS rates (e.g. callable CMS range accruals or CMS spreadTARN’s).

Proposition 8.4. See [AP10-3, Proposition 16.6.3]. The annuity mapping function (8.2)in the Libor market model is approximated by

α(s) = EA[P (T, Tp)

A(T )

∣∣∣∣S(T ) = s

]'

(N−1∑n=0

τn

n∏i=0

1

1 + τi`i(s)

)−1

where

`n(s) = Ln(0)

(1 + cn

s− S(0)

S(0)

)27Recall that for any twice-differentiable function we have

E[f(S(T ))] = f(K∗) + f ′(K∗)(S(0)−K∗) +

∫ K∗

−∞p(0, S(0);T,K)f ′′(K) dK

+

∫ ∞K∗

c(0, S(0);T,K)f ′′(K) dK (112)

In our case, we apply it with K∗ = S(0) and f(S(T )) = (S(T )− S(0))2.

78

8.6.3 Quasi-Gaussian model

Recall the bond reconstitution formula in the qG-model

P (T,M) = P (T,M ;x(T ), y(T )),

P (T,M ;x, y) =P (0,M)

P (0, T )exp

(−G(T,M)x− 1

2G(T,M)2y

)and recall that y(T ) is well approximated by a deterministic function

y(T ) =

(∂L

∂x(T, 0, 0)

)−2

VarA(S(T ))

where VarA(S(T )) is computed consistently with the market via replication as above. Onecan then define an annuity mapping

α(s)P (T, Tp;X(T, s), y(T ))

A(T,X(T, s), y(T ))

where X(T, s) is the inverse function (in x) to S(T, x, y(T )) which can then be used tocompute VCMS(0) via (111).

Computational remark. Note that

dα

ds=

1

A(T,X(T, s), y(T ))

[∂P (T, Tp, X(T, s), y(T ))

∂x

∂X(T, s)

∂sA(T,X(T, s), y(T ))

−P (T, Tp, X(T, s), y(T ))∂A(T,X(T, s), y(T ))

∂x

∂X(T, s)

∂s

]To compute dα(K)

ds we need

• X(T,K): by construction S(T,X(T, s), y(T )) = s so X(T,K) is the x-solution to theequation S(T, x, y(T )) = K.

• ∂X(T,K)∂s : differentiating implicitly we obtain

X(T, S(T, x, y(T ))) = x =⇒ ∂X(T, s)

∂s

∂S(T, x, y(T ))

∂x= 1

whence∂X

∂s(T,K) =

(∂S

∂x(T,K, y(T ))

)−1

One proceeds analogously to compute d2α(K)ds2 .

8.6.4 Correcting non-arbitrage-free methods

Some annuity mapping methods are not arbitrage-free by construction (e.g. Libor) and insome others, like the linear TSR model, despite being arbitrage-free theoretically, numericalmethods can induce slight errors. We introduce a modification to amend this error. Thevaluation formula for CMS cash-flows reads


A(T )S(T )

]= A(0)EA [α(S(T ))S(T )]

In order to have an arbitrage-free model, sinceP (T,Tp)A(T ) is a tradeable asset, we need

EA [α(S(T ))] = EA[P (T, Tp)

A(T )

]=P (0, Tp)

A(0)

79

Since this may fail in some models, we can re-scale the function α(s) to force the previousequality to be satisfied. Defining

α(s) =P (0, Tp)

A(0)

α(s)

EA [α(S(T ))],

it is clear that EA [α(S(T ))] =P (0,Tp)A(0) and we obtain the improved CMS valuation formula

VCMS(0) = A(0)EA [S(T )α(S(T ))] = P (0, Tp)EA [S(T )α(S(T ))]

EA [α(S(T ))]

This correction is useful even for arbitrage-free models, since there may be arbitrage inducedby the numerical scheme.

8.6.5 CDF and PDF of CMS rate in Forward measure

We look at an alternative to the replication method of Lemma 8.1, since the weights maybe hard to compute for discontinuous payoffs. The problem of pricing a cash-flow that paysg(S(T )) at time Tp ≥ T can be formulated as

ETp [g(S(T ))] =

∫ ∞−∞

g(s)ψTp(s) ds

where ψTp(s) is the density of S(T ) in the Tp-forward measure.The CDF and PDF of the swap rate S(T ) in the annuity measure28 are given respectively

by

ΨA(K) = 1 +∂

∂Kc(K), ψ(K) =

∂2

∂K2c(K)

where c(K) = EA [(S(T )−K)+]. Changing to the Tp-forward measure we obtain

Proposition 8.5. Given an annuity mapping function α(s) = EA[P (T,Tp)A(T )

∣∣∣S(T ) = s], the

PDF ψTp(s) and the CDF ΨTp(s) of the swap rate S(T ) in the Tp-forward measure are givenby

ψTp(s) =A(0)

P (0, Tp)α(s)ψA(s), (113)

ΨTp(s) =A(0)

P (0, Tp)

∫ s

−∞α(u)ψA(u) du (114)

Sketch of proof. Note that ψTp = ETp [δ(S(T )−K)], so changing to the annuity measureand using iterated conditioning yields the result.

When the annuity mapping function is linear α(s) = α1s+α2, expressions (113-114) takea simple form

28The price of a call option is given by C(S,K) =∫∞KϕS(T )(x)(x − K) dx Differentiating with respect to K

twice we obtain

∂C(S,K)

∂K=

∂

∂K

[∫ ∞K

ϕS(T )(x) dx−K∫ ∞K

ϕS(T )(x) dx

]= −

∫ ∞K

ϕS(T )(x) dx

∂2C(S,K)

∂K2= ϕS(T )(K)

80

Corollary 8.6. In the linear TSR model, namely when α(s) = α1s + α2, the CDF ΨTp(s)and the PDF ψTp(s) of the swap rate in the Tp-forward measure are given by

ΨTp(s) =A(0)

P (0, Tp)

[α1(S(0)− s− c(s)) + α(s)ΨA(s)

](115)

ψTp(s) =A(0)

P (0, Tp)(α1sα2)ψA(s), (116)

c(s) = EA[(S(T )− s)+

]Finally notice that since a CMS caplet payoff is given by

VCMS−caplet(0,K) = P (0, Tp)ETp[(S(T )−K)+

],

there is an explicit link between VCMS−caplet(0,K) and ψTp(s).

Lemma 8.7 (Link between CMS caplet payoff and swap rate in Tp-forward measure). Themarket-implied PDF ψTp(s) of the swap rate in the Tp-forward measure can be obtained fromvalues of CMS caplets by

ψTp(K) =1

P (0, Tp)

∂2

∂K2VCMS−caplet(0,K) (117)

8.6.6 SV model for CMS rate

It is useful, specially when dealing with vanilla derivatives linked to multiple market rates, tohave PDFs and CDFs thereof in the forward measure to be of tractable form. Unfortunately,this is not the case if one starts with the SV model in the annuity measure and then switchesto the forward measure. We seek to obtain an approximation to the PDF ψTp(s) that is fromthe same family as the PDF ψA(s).

Assume the swap rate follows

dS(t) = λ (bS(t) + (1− b)S(0))√z(t)dWA(t) (118)

dz(t) = θ(1− z(t))dt+ η√z(t)dZA(t), z(0) = 1 (119)

E[dWA(t)dZA(t)] = 0

This system defines the PDF ψA(·) of S(T ) in measure QA. Define an adjusted process S(t)given by the following dynamics in the Tp-forward measure

dS(t) = λ(bS(t) + (1− b)S(0)

)√z(t)dWTp(t)

dz(t) = θ(1− z(t))dt+ η√z(t)dZTp(t), z(0) = 1 (120)

E[dWTp(t)dZTp(t)] = 0

Align the mean of S(T ) to equal the CMS-adjusted value of S(T ), namely

S(0) = ETp [S(T )]

Goal: define ψTp(·) to be the PDF of S(T ) given by the previous system of SDEs andaim to set the adjusted parameters λ, b, η such that the distribution of S(T ) is as close aspossible to that of S(T ) in the Tp-forward measure.

• S(T ) = ETp [S(T )] is calculated by the replication method of Lemma 8.1.

81

• Since VCMS−caplet(0,K) = P (0, Tp)ETp [(S(T )−K)+] and S(0) = ETp [S(T )], by iter-

ated conditioning we see that in fact

VCMS−caplet(0,K) = P (0, Tp)ETp[(S(T )−K)+

]so that CMS caplets are nothing more than European calls on S(T ). Therefore, λ, b, ηcan be obtained by direct calibration of the SV model (120) (which is in the Tp-forwardmeasure) to prices of CMS caplets that are computed in the original SV model29 forS(T ) (with dynamics specified in the annuity measure).

8.6.7 Dynamics of CMS rate in forward measure

Consider a stochastic volatility model in the QA-measure

dS(t) = λϕ(S(t))√z(t)dWA(t) (121)

dz(t) = θ(1− z(t))dt+ ηψ(z(t))dZA(t), z(0) = 1 (122)

E[dWA(t)dZA(t)] = 0

In the previous subsection we absorbed the measure change to QTp into the SV diffusionparameters: namely, we defined an analogous system in the QTp -measure and adjusted itsparameters to obtain market-consistent results. We now seek to bring the two-dimensionalSDE (121)-122) into an approximation of the Tp-forward measure30. Concretely, we construct

a measure QTp such that for any function f(·) we have

EA[P (T, Tp)

A(T )f(S(T ))

]=P (0, Tp)

A(0)ETp [f(S(T ))]

Proposition 8.8. See [AP10-3, Proposition 16.6.8]. Defining the measure QTp by thecondition that the process (z(t), S(t)) satisfies the following SDE system in QTp

dS(t) = λϕ(S(t))√z(t)

(dWTp(t) + vS(t)dt

)(123)

dz(t) = θ(1− z(t))dt+ ηψ(z(t))(dZTp(t) + vZ(t)

), z(0) = 1

E[dWTpdZTp

]= 0

where the drift adjustments are given by

vZ(t) = ηψ(z(t))∂z ln [Λ(t, z(t), S(t))] (124)

vS(t) = λϕ(S(t))√z(t)∂S ln [Λ(t, z(t), S(t))]

and where the function Λ(t, z, s) satisfies the following PDE

∂tΛ(t, z, s) + θ(1− z)∂zΛ(t, z, s) +1

2η2ψ(z)2∂2

zzΛ(t, z, s)

+1

2λ2ϕ(s)2z∂2

ssΛ(t, z, s) = 0, t ∈ [0, T ] (125)

with terminal condition

Λ(T, z, s) = α(s) :=A(0)

P (0, Tp)α(s) (126)

29These are obtained by the usual replication method of Lemma 8.1.30Namely, a measure that will resemble a forward measure for European-type payoffs fixing at T and paying at

Tp.

82

Then for any function f(·) we have

A(0)

P (0, Tp)EA

[P (T, Tp)

A(T )f(S(T ))

]= ETp [f(S(T ))]

Remark 8.1. This Proposition establishes a numerical scheme for simulating (z(t), S(t)) inQTp for the purpose of pricing European-style derivatives fixing at T and paying at Tp.

1. Solve the PDE (125-126) numerically on a grid (t, z, s).

2. Perform Monte Carlo simulation for (123), using the drift adjustments vZ(t) and vS(t),which are computed for each path using (124).

Remark 8.2. If α = α1s+ α2 then no finite difference scheme is needed since

dvz(t) = 0,

dvS(t) = λϕ(S(t))√z(t)

α1

α1S(t) + α2

8.7 Quanto CMS

Given a swap rate S(T ) in domestic currency, a quanto CMS cash-flow pays g(S(T )) at timeTp ≥ T in some other foreign currency and it thus has value

VQuantoCMS(0) = βf (0)Ef[

1

βf (Tp)g(S(T ))

]Let X(t) be the foreign exchange rate expressing one unit of domestic currency in foreign

currency units. In the domestic risk-neutral measure (and in foreign currency units) thepricing formula can be written as

VQuantoCMS(0) =βd(0)

X(0)Ed[

1

βd(Tp)g(S(T ))X(Tp)

][1]=

βd(0)

X(0)Ed[

1

βd(T )g(S(T ))βd(T )EdT

[1

βd(Tp)X(Tp)

]][2]=

βd(0)

X(0)Ed[

1

βd(T )g(S(T ))P (T, Tp)XTp(T )

]where in [1] we used iterated conditioning and in [2] we used a change of measure βd(T )EdT

[1

βd(Tp)X(Tp)]

=

Pd(T, Tp)EdT [X(Tp)] and the notation XTp(T ) =

Pf (T,Tp)X(T )Pd(T,Tp) .

Changing to the annuity measure we obtain

VQuantoCMS(0) =A(0)

X(0)Ed[

1

A(T )g(S(T ))Pd(T, Tp)XTp(T )

]=A(0)

X(0)EA,d [g(S(T ))v(S(T ))]

where

v(s) = EA,d[Pd(T, Tp)

A(T )XTp(T )|S(T ) = s

].

Recalling the definition of the annuity mapping function α(s) = EA,d[P (T,Tp)A(T ) |S(T ) = s

]and defining χ(s) = EA,d

[XTp(T )|S(T ) = s

]we can approximate

v(s) ' α(s)χ(s)

83

so that

VQuantoCMS(0) ' A(0)

X(0)EA,d [g(S(T ))α(S(T ))χ(S(T ))]

and this can be computed via replication once the function χ(·) is determined.

In order to compute χ(s) = EA,d[XTp(T )|S(T ) = s

]one needs to specify a joint distri-

bution of the swap rate S(T ) and the forward FX rate XTp(T ) and this can be accomplishedby means of a Gaussian copula. The marginal one-dimensional PDF of S(T ) in measureQA,d is given by ψA(s) (given by the swaption model). Assuming that XTp(T ) is log-normalin QA,d we have

XTp(T ) = X(0)emXT+σX√Tξ1 , ξ1 ∼ N(0, 1).

where

• σX is obtained by calibrating to T -expiry ATM options on the FX rate. For long-datesquanto contracts, the FX smile make start to matter and we may have to allow σX tovary stochastically. This can be accomplished, for instance, using copulas.

• mX is clarified below.

Given ξ2 ∼ N(0, 1) with ρXS = Corr(ξ1, ξ2) we have

S(T )d= (ΨA)−1 (Φ(ξ2))

whence

χ(s) = EA,d[XTp(T )|S(T ) = s

]= X(0)emXTEA,d

[eσX√Tξ1 |ξ2 = Φ−1(ΨA(s))

]not= X(0)emXT χ(s),

χ(s) = exp

(ρXSσX

√TN−1(ΨA(s)) +

1

2σ2XT (1− ρ2

XS)

)In order to establish the constant mX we impose a no-arbitrage condition. Since XTp(·)

is a QTp,d-martingale we have

XTp(0) = ETp,d[XTp(T )] =A(0)

Pd(0, Tp)EA,d

[Pd(T, Tp)

A(T )XTp(T )

]=

A(0)

Pd(0, Tp)EA,d [α(S(T ))χ(S(T ))]

= X(0)emXTA(0)

Pd(0, Tp)EA,d [α(S(T ))χ(S(T ))]

and we can then solve for e−mXT .

8.8 Eurodollar futures

Eurodollar futures are exchange-traded contracts on Libor rates. The latter are inputs intoan interest rate curve construction whereas the former are liquidly quoted in the market.It is important to be able to value ED futures efficiently, due to their high trading volumeand their use in yield curve construction, which rules out Monte Carlo methods and thusnecessitates the development of analytic approximations that incorporate the volatility smile.

84

Daily mark-to-market causes forward and futures contracts values to differ. If F (t;T,M)and L(t;T,M) are the time-t futures and Libor rates respectively covering the period [T,M ],recall that L(t;T,M) = EMt [L(T ;T,M)].

For futures contracts that are marked-to-market continuously we have

F (t;T,M) = Et[L(T ;T,M)]

If mark-to-market happens discretely, consider a tenor structure 0 = T0 < T1 < · · · < TNand let Ln(t) = L(t;Tn, Tn+1). Denote by QB the spot Libor measure induced by the process

B(t) = P (t, Tq(t))

q(t)−1∏n=0

(1 + τnLn(Tn))

Then if the futures rate is marked-to-market only on the dates T0 < T1 < · · · < Tn = T < M ,we have

F (t, T ;M) = EBt [L(T ;T,M)]

8.8.1 The problem and the plan

Assuming that all futures rates Fn(0)n=0,...,N−1 are known (market-observable), our goalis to derive formulas that express forward Libor rates Ln(0) = L(0;Tn, Tn+1) in terms offutures Fn(0). The road map to derive the ED futures valuation formula is:

1. Use expansion technique to express a forward rate as a (model-independent) functionof a collection of futures rates with expiries on or before the expiry of the forward rate.

2. Separate variance terms appearing in previous formula into slow- and fast-moving.

3. Fast-moving volatility parameters

4. Slow-moving correlation parameters

8.8.2 Expansion around futures value

Since Ln(0) = En+1[Ln(Tn)] for all n = 0, . . . , N−1, changing to the spot measure we obtain

Ln(0) = EB[dQn+1

dQBLn(Tn)

]= EB

[Ln(Tn)

n∏i=0

1 + τiLi(0)

1 + τiLi(Ti)

](127)

We seek to express the expectation in the RHS of (127) in terms of market-observedquantities by expanding it in powers of a small parameter that measures the deviation ofeach Ln(Tn) from its mean in the spot measure Fn(0) = EB [Ln(Tn)]. Define

Lεn(t)def= Fn(0) + ε (Ln(t)− Fn(0))

V (ε) = Lεn(Tn)

n∏i=0

1 + τiLi(0)

1 + τiLεi(Ti)

Since clearly L1n(t) = Ln(t), we have Ln(0) = EB [V (1)]. Expanding V (ε) into a Taylor series

in ε yields

V (ε) = V (0) + EB[dV

dε(0)

]ε+

1

2EB

[d2V

dε2(0)

]ε2 +O(ε3) (128)

Computing these derivatives (c.f. [AP10-3, Lemma 16.8.4]) one obtains the following:

85

Theorem 8.9. For any n = 0, . . . , N − 1, an approximation to the forward rate Ln(0) isobtained from the futures Fi(0)ni=0 and forward rates for previous periods Li(0)n−1

i=0 bysequentially solving the equations

Ln(0) = EB [V (1)] = V (0)

1 +1

2

n∑j,m=0

Dj,mCovB(Lj(Tj), Lm(Tm))

(129)

with

V (0) = Fn(0)

n∏i=0

1 + τiLi(0)

1 + τiFi(0)

Dj,m = · · ·

8.8.3 Forward rate variances

The covariances in (129) are, by definition

CovB(Lj(Tj), Lm(Tm)) =

√VarB(Lj(Tj))

√VarB(Lm(Tm))CorrB(Lj(Tj), Lm(Tm))

The spot-measure variances VarB(Ln(Tn)) can be approximated in the forward measure31

VarB(Ln(Tn)) ' Varn+1(Ln(Tn))

The variances Varn+1(Ln(Tn)) can be computed by calibrating a low-parametric vanillamodel to liquid strikes. Assuming

dLn(t) = σn (bnLn(t) + (1− bn)Ln(0))√z(t)dWn+1(t)

dz(t) = θ (1− z(t)) dt+ ηn√z(t)dZn+1(t)

E[dWn+1(t)dZn+1(t)

]= 0

After calibrating the parameters (σn, bn, ηn) to swaption prices, the forward variances aregiven by

Varn+1(Ln(Tn)) =L2n(0)

b2n

[Ψz((σnbn)2, 0;Tn)

]Ψz(v, u; t) = · · ·

8.8.4 Forward rate correlations

See [AP10-3, Page 759].

31See [AP10-3, Page 757]. The error made by the approximation is claimed to be small.

86

9 Multi-rate vanilla derivatives

Here we look at payoffs that are linked to more than one swap rate, the CMS spread beingthe most prominent example, without resorting to a full term structure model. Concretely,given a payment date Tp, a collection of swap or Libor rates S1(t), . . . , Sd(t), a collectionof fixing dates t1, . . . , td and a d-argument function f(s1, . . . , sd), a multi-rate derivative isdefined by a time-Tp payoff

V (Tp) = f(S1(t1), . . . , Sd(td))

We outline the notion of copula and we indicate how it can be used to specify and controlthe dependence structure between the rates involved.

9.1 Dependence structure via copulas

The copula approach is a method of constructing a joint distribution of random variablesconsistently with pre-specified one-dimensional marginal distributions.

9.2 Gaussian copula method

Assume we are given one-dimensional CDF’s Ψ1(·), . . . ,Ψd(·) and that we want to construct amulti-dimensional random vector (X1, . . . , Xd) with a measure of control over the dependenceof the random variables Xi, constrained so that each variable Xi has CDF Ψi(·). We canaccomplish this by specifying Xi = Ψ−1

i (Φ(Zi) where (Z1, . . . , Zd) is a multi-dimensionalnormalized Gaussian vector with correlation matrix R and where Φ(·) is the standard one-dimensional Gaussian CDF. Clearly

P (Xi ≤ x) = P (Ψ−1i (Φ(Zi) ≤ x) = P (Zi ≤ Φ−1(Ψi(x))) = Φ

(Φ−1(Ψi(x))

)= Ψi(x)

so Xi indeed has CDF Ψi(·). Denote by

• Ψgauss(x1, . . . , xd) the CDF of (X1, . . . , Xd) just constructed.

• Φd(z1, . . . , zd;R) the CDF of (Z1, . . . , Zd).

Then

Ψgauss(x1, . . . , xd) = P (X1 ≤ x1, . . . , Xd ≤ xd)= P

(Z1 ≤ Φ−1(Ψ1(x)), . . . , Zd ≤ Φ−1(Ψd(x))

)= Φd

(Φ−1(Ψ1(x1)), . . . ,Φ−1(Ψd(xd));R

)(130)

One easily obtains the PDF as

ψ(x1, . . . , xd) =∂d

∂x1 · · · ∂xdΨgauss(x1, . . . , xd)

= φd(Φ−1(Ψ1(x1)), . . . ,Φ−1(Ψd(xd));R

)·d∏i=1

ψi(xi)

φ (Φ−1(Ψi(xi)))

where φ(z) and φd(z1, . . . , zd;R) are the one- and d-dimensional Gaussian PDF’s, andψi(xi) is the one-dimensional PDF of Xi.

87

9.3 General couplas

Note that we can re-write the Gaussian copula CDF (130) as

Ψgauss(x1, . . . , xd) = Φd(Φ−1(Ψ1(x1)), . . . ,Φ−1(Ψd(xd));R

)= Cgauss (Ψ1(x1), . . . ,Ψd(xd);R)

whereCgauss(u1, . . . , ud;R) = Φd

(Φ−1(u1),Φ−1(ud);R

)is a multi-dimensional distribution function for a vector of d uniform random variables32.

Definition 9.1 (general copula). A d-dimensional copula function is a function C : [0, 1]d→[0, 1],C(u1, . . . , ud) defining a valid joint distribution function for a d-dimenaional vector of ran-dom variables, with each variable being uniformly distributed on [0, 1].

Requiring that C(u1, . . . , ud) be a distribution imposes strong constraints on the form ofthe function C. In particular

C(u1, u2, . . . , ui−1, 0, ui+1, . . . , ud) = 0

C(1, 1, . . . , 1, ui, 1, . . . , 1) = ui

The simplest examples of couplas are the following:

1. Independence copula: if all d uniform random variables underlying the copula are in-dependent, we obtain

CID(u1, . . . , ud) =

d∏i=1

ui

2. Perfect dependence copula: introduce d uniform random variables U1 = · · · = Ud. Then

CD(u1, . . . , ud) = P (U1 ≤ u1, . . . , Ud ≤ ud) = P

(U1 ≤ min

i=1,...,dui

)= mini=1,...,d

ui

The following result allows us to construct copulas from other copulas and will be thebasis to obtain a useful copula for multi-rate derivatives.

Proposition 9.2 (making copulas from copulas). Let C1, . . . , CM be d-dimensional copulasand let qm,i : [0, 1]→[0, 1], m = 1, . . . ,M , i = 1, . . . , d be functions that are either strictly

increasing or identically equal to 1. Suppose that∏Mm=1 qm,i(u) = u for u ∈ [0, 1] and

limu→0+ qm,i(u) = qm,i(0). Then

C(u1, . . . , ud) =

M∏m=1

Cm(qm,1(u1), . . . , qm,d(ud))

is a coupla

In particular, takingM = 2, C1 =∏di=1 ui the independence copula, C2 = Cgauss(u1, . . . , ud;R),

q1,i(u) = u1−θi , q2,i(u) = uθi for θi ∈ [0, 1], i = 1, . . . , d we obtain the so-called power Gaus-sian copula.

Corollary 9.3 (power Gaussian copula). Let R be a d × d correlation matrix and θ =(θ1, . . . , θd) ∈ [0, 1]d a d-dimensional vector of parameters. Then the power Gaussian function

CPG(u1, . . . , ud;R) =

(d∏i=1

u1−θ1

)Cgauss(u

θ11 , . . . , u

θdd ;R) (131)

32Recall that if X is a random variable with CDF FX(x) then Y = FX(X) has a uniform distribution over(0, 1). Indeed,

P [Y ≤ y] = P [FX(X) ≤ y] = P[X ≤ F−1

X (y)]

= FX(F−1X (y)

)= y

88

9.4 Copula methods for spread options

Spread options are the most liquid among the multi-rate vanilla derivatives. The payoff ofa spread option is given by (S1(T )− S2(T )−K)

+at time Tp, so the (undiscounted) value f

the spreads options is

V (0;T,K) = ETp[(S1(T )− S2(T )−K)

+]

Using a two-dimensional Gaussian copula with correlation

ρ = Corr(S1(T ), S2(T ), R =

(1 ρρ 1

)1. Derive market-implied density for each swap rate Si(T ) under its own annuity measure

ψAii (s), i = 1, 2 using swaption prices33.

2. Use Proposition 8.5 to convert each ψAii (x) into the PDF ψTpi (x) in the Tp-forward

measure

ψTp(s) =A(0)

P (0, Tp)α(s)ψA(s), (132)

ΨTp(s) =A(0)

P (0, Tp)

∫ s

−∞α(u)ψA(u) du (133)

where α(s) = EA[P (T,Tp)A(T )

∣∣∣S(T ) = s].

3. Given the correlation matrix R, use the Gaussian copula function to construct a jointPDF

ψTp(x1, x2;R) = cgauss(ΨTp1 (x1),Ψ

Tp2 (x2)) · ψTp1 (x1)ψ

Tp2 (x2)

where

cgauss(u1, u2;R) =φ2

(Φ−1(u1),Φ−1(u2);R

)φ(Φ−1(u1))φ(Φ−1(u2))

4. Integrate the payoff (x1 − x2 − K)+ against the density ψTp(x1, x2;R) to obtain theundiscounted model price of the spread option

Vmdl(0;T,K,R) =

∫ ∫(x1 − x2 −K)+ψTp(x1, x2;R) dx1 dx2

Finally, in order to calibrate the model to market:

5. Find the implied spread correlation ρ = ρ(T,K) such that

Vmdl(0;T,K,R) = Vmkt(0;T,K)

The main issue with this algorithm stems from the so-called correlation smile: for a fixedmaturity date, one obtains a different implied spread correlation for each strike. A simpleGaussian copula has insufficient flexibility to capture the market-observed distributions ofCMS spreads. Essentially, varying ρ allows one to shift the implied spread option volatilitysmile in parallel, making it impossible to match the market-implied smile (see figure 2).

33Explain

89

Figure 2: Implied normal spread volatility for a 5-year spread option on the difference between10-year and 2-year swap rates, [AP10-3, Figure 17.1].

9.4.1 Power gaussian

One way of fixing the correlation smile issue outlined in the previous subsection is to resortto a two-dimensional power Gaussian copula

CPG(u, v; ρ, θ1, θ2) = u1−θ1v1−θ2Cgauss(uθ1 , vθ2 ; ρ)

where ρ is a correlation coefficient and θ1, θ2 ∈ [0, 1]. As above, observes empirically thatρ allows to shift the implied volatility spread vertically, while the parameters θ1, θ2 providegood control over the slope and curvature of the smile (see figure 3).

Figure 3: Implied normal spread volatility for a 5-year spread option on the difference between10-year and 2-year swap rates obtained using the power Gaussian copula in three parameterscenarios, [AP10-3, Figure 17.2].

9.5 Limitations of the copula method

• Copulas do not result n a dynamic model for the yield curve.

• Copulas allow us to easily ascribe a joint terminal distribution to a collection of CMSrates, for the purpose of pricing European multi-rate options.

• Copulas in practical use are not usually chosen for their links to observed financialrelationships but instead for their technical properties and ease of implementation.

90

• Even though the power Gaussian copula is capable of reproducing a wide range ofmarket-observed shapes of spread volatility smiles, to date no copula has been foundthat reproduced market volatilities of spread options exactly for all values of spreadstrikes. See [AP10-3, Section17.4.4] for some insights.

91

10 Callable Libor Exotics

10.1 Review of basic notions

• In an exotic swap, a regular floating Libor rate is swapped against structured couponsthat are allowed to be arbitrary functions of observed interest rates (e.g. Libor or CMSrates).

For instance, floor can be represented as an exotic swap in which a Libor rate is ex-changed for a floored payoff.

(k − Ln(Tn))+

= (k − Ln(Tn))+

+ Ln(Tn)− Ln(Tn) = max(k, Ln(Tn))− Ln(Tn)

• [AP10-1, Section 5.13] Exotic swaps start their life as bonds or notes, sold by banksto investors. The investor pays an upfront principal amount to the note issuer, who inturn pays the investor a structured coupon, and repays the principal at the maturity ofthe note. The principal amount is invested by the issuer and pays de Libor rate, plusor minus a spread. From the perspective f the issuer, the net cash-flows of the note arethose of an exotic swap.

• If Cn is the structured coupon for the n-th period, the value of the exotic swap equals(from perspective of structured leg buyer)

Vexotic(t) = β(t)

N−1∑n=0

τnEt

[1

β(Tn+1)(Cn − Ln(Tn))

]

• Cn can be a complicated function of interest rates, structured to reflect investors’ viewson the market.

• A Bermudan swaption is an option to enter into a fixed-floating swap on any of itsfixing dates (or a subset of them). Once exercised on date Tn, say, the holder entersthe swap with the first fixing date Tn and final payment date TN

At time Tn, the value of a payer swap, if exercised, is

Un(Tn) = β(Tn)

N−1∑i=n

τiETn

[1

β(Ti+1)(Li(Ti)− k)

]

A Bermudan contract swaption is hence an option to choose between Un(Tn) for n =0, . . . , N − 1 and its time Tn value is

Vbermudan−swaption(Tn) = maxUn(Tn), Hn(Tn)

where Hn(Tn)) is the value of a Bermudan swaption with exercise dates TiN−1i=n+1.

• A callable Libor exotic (CLE) is a Bermudan-style option to enter an exotic swap. Thedifference with respect to an ordinary exotic swap is that the issuer of the note has theright to cancel (or call) the note on a schedule of dates. The principal is then returnedto the investor and no further coupons are paid.

• Given an exotic swap with structured coupons CnN−1n=0 . Its time-Tn value if exercised

is

Un(Tn) = β(Tn)

N−1∑i=n

τiETn

[1

β(Ti+1)(Ci − Li(Ti))

]The n-th hold value of the CLE, denoted Hn(Tn)) is the time-Tn value of the CLEprovided it has not been exercised on or before Tn.

92

10.2 Model calibration for CLE’s

CLE’s present non-trivial dependencies on the dynamics of market rates and generally requiresophisticated term-structure models.

The values of exotic derivatives are not directly observable in the market and thereforemodels for these securities have to be calibrated indirectly to other market information thatis deemed relevant for the class of exotics under consideration.

There are three sources of potentially relevant information:

1. Market prices of liquid vanilla interest rate derivatives (or market-implied - spot -volatilities) of various strikes. Complex structured products can be decomposed intosimpler liquid derivatives. For instance, a callable inverse floater with exercise datesT1 < T2 < · · · < TN−1 is a Bermudan option to swap a Libor rate and a structuredcoupon

Cn = min max6%− Ln(Tn), 0%, 4%

and it can be decomposed as

Cn = (6− Ln(Tn))+ − (2− Ln(Tn))+

namely a portfolio of a long floorlet struck at 6% and a short floorlet struck at 2%. Thecorresponding market-implied volatilities should thus be included as targets for modelcalibration.

2. Volatilities of core / coterminal rates. Even though the underlying exotic swap hasno dependencies on spot volatilities other than those underlying the floorlets inherentin the CIF, the callability structure introduces additional dependence on other vanilladerivatives.

For instance, the callable inverse floater will depend on the implied volatility of theswap rate that fixes on Ti and runs for the period [Ti, TN ] for each i = 1, . . . , N − 1.

3. Historical information about market quantities, such as volatilities and correlations ofmarket rates. In the callable inverse floater, assuming that the option hasn’t beenexercised by time Tn, the time-Tn value of the remaining part of the underlying swapdepends on caplet volatilities observed at time Tn (which are only known at time Tnand are called forward volatilities.

Any model will impose certain dynamics on the volatility structure of interest rates,and one should make sure that the model projections for forward volatilities are in linewith market-implied information.

4. Modeler’s belief of what constitutes reasonable behavior of the model parameters.

10.3 Valuation theory

We use the spot Libor measure QB , with numraire B(t) defined on a tenor structure 0 =T0 < T1 < · · · < TN . The net payment seen by the structured coupon receiver at time Tn+1

is Xn = τn(Cn − Ln(Tn)).

• The n-th exercise value, namely the value of all future payments if the CLE is exercisedat time Tn is

Un(t) = B(t)

N−1∑i=n

Et

[Xi

B(Ti+1)

], UN (t) ≡ 0

93

• The n-th hold value is the value of a CLE where the exercise opportunities have beenrestricted to Tn+1, . . . , TN−1

Hn(Tn) = B(Tn) supξ∈Tn

ETn

[Uξ(Tξ)

B(Tξ)

]which can be computed recursively

Hn−1(Tn−1) = B(Tn−1)ETn−1

[1

B(Tn)maxHn(Tn), Un(Tn)

], (134)

HN−1 ≡ 0

Ifηn(ω) = min k > n : Uk(Tk) > Hk(Tk) ∧N

then

H0(0) = E

[Uη0

(Tη0)

B(Tη0)

]= E

[N−1∑n=η0

Xn

B(Tn+1)

]

10.4 Monte Carlo valuation

The regression-based least squares scheme relies on the assumption that the hold valuesHn(Tn) can be regressed against simulated state variables of the model at time Tn.

We fix some notation first. Consider:

• ζ(t) = (ζ1(t), . . . , ζq(t))> q-dimensional vector process of regression variables.

• X random variable, with value X(ω) on a Monte Carlo path ω.

• ω1, . . . , ωK K simulated Monte Carlo paths.

• RT (X) results of regression of the K-dimensional vector (X(ω1), . . . , X(ωK)) on theK × q matrix ζ(T )> of regression variable observations at time T , namely

RT (X) = ζ(T )>β, [ζ(T )]i,j = ζi(T, ωj)

where

β = arg minβ∥∥(X(ω1)− ζ(T, ω1)>β, . . . ,X(ωK)− ζ(T, ωK)>β

)∥∥2

The most basic least squares scheme for CLE valuation consists of replacing the condi-tional expected value operator ETn−1

in (134) with the regression operator RTn−1, namely

Hn−1(Tn−1) = RTn−1

Vn(ωk)︷︸︸︷[B(Tn−1)

B(Tn)max

Hn(Tn), Un(Tn)

], n = N − 1, . . . , 1 (135)

HN−1 ≡ 0

The full algorithm34 is the following:

1. Choose and calibrate a term structure model (e.g. LM).

2. Choose regression variables ζ(t).

3. Simulate K paths ω1, . . . , ωK , each one representing (in the LM case) one simulatedpath of all core Libor rates.

34See [AP10-3, Page 824].

94

4. For each path ωk, calculate B(Tn, ωk), n = 1, . . . , N − 1.

5. For each path, calculate the exercise value Un(Tn, ωk) of the underlying exotic swapon all exercise dates n = 1, . . . , N − 1 (more work may be needed here if these are notavailable in closed form, see Section 10.4.1 below).

6. For each path ωk, calculate the values of the q-dimensional regression variables ζ(Tn, ωk),n = 1, . . . , N − 1.

7. Set HN−1 ≡ 0.

8. For each n = N − 1, . . . , 1

(a) Form a K-dimensional vector Vn = (Vn(ω1), . . . , Vn(ωK))>,

Vn(ωk) =B(Tn−1, ωk)

B(Tn, ωk)max

Hn(Tn, ωk), Un(Tn, ωk)

, k = 1, . . . ,K

(b) CalculateHn−1(Tn−1) = RTn−1(Vn) = ζ(Tn−1)>β

by regressing the vector Vn against the matrix of regression variables observed ondate Tn−1, namely by finding a q-dimensional

β = arg minβ∥∥(Vn(ω1)− ζ(Tn−1, ω1)>β, . . . , Vn(ωK)− ζ(Tn−1, ωK)>β

)∥∥2

9. H0(T0) is our sought for estimate of the value of the CLE.

The scheme presents several shortcomings:

(i) The exercise values Un(Tn) may not me computable in closed form, and they arenecessary to evaluate (135) at each step.

(ii) The use of nested regression (we apply RTn to a function of a regressed value Hn(Tn))could spawn significant biases.

(iii) It is unclear whether the Hn’s are low- or high-biased estimates of Hn’s.

10.4.1 Regression for the underlying

The above scheme can be easily extended to arbitrary underlying swaps, which is useful ifthe latter are too complex and do not admit a closed form. The idea is to apply regressionto compute Un(Tn) using

Un(t) =

N−1∑i=n

Et

[B(t)

B(Ti+1)Xi

], Xn = τn(Cn − Ln(Tn))

1. For each path ωk and each fixing date Tn calculate the net coupons Xn(Tn, ωk).

2. Use regression to estimate the exercise value Un(Tn) as

Un(Tn) = RTn

(N−1∑i=n

B(Tn)

B(Ti+1)Xi

)= ζ(Tn)>β

where

β = arg minβ

∥∥∥∥∥(N−1∑i=n

B(Tn)

B(Ti+1)Xi(ω1)− ζ(Tn, ω1)>β, . . . ,

N−1∑i=n

B(Tn)

B(Ti+1)Xi(ωK)− ζ(Tn, ωK)>β

)∥∥∥∥∥2

95

This can be computed recursively as

Un(Tn) = RTn(Yn),

Yn =B(Tn)

B(Tn+1)(Xn + Yn+1), n = N − 1, . . . , 1

3. Replace the expression Vn(ωk) inside of RTn−1in (135) by

Vn(ωk) =B(Tn−1)

B(Tn)max

Hn(Tn), Un(Tn)

10.4.2 Using regressed variables for decision only

The values that are regressed at time Tn−1 themselves come as the result of a regressionat time Tn. Compounded regression can lead to substantial biases, even though these canbe reduced significantly by using regressed variables for decision-making purposes only, asexplained in [AP10-3, Section 18.3.4] and as we summarize next.

Define Gn(t) = Hn(t)− Un(t) and note (c.f. [AP10-3, Equation 18.18]) that

Gn−1(Tn−1) = ETn−1

[B(Tn−1)

B(Tn)

(−Xn−1 +Gn(Tn)+

)], n = N, . . . , 0

Hence G0(0) is the value of a swap paying coupons −Xn on dates Tn, plus the right to cancelit on any of the exercise dates (CLE).

This provides an alternative way of valuing the CLE : we can compute a least squaresapproximation of G0(0) and obtain the value of the CLE by subtracting the swap value fromit. The least squares version of the latter (i.e. replacing ETn−1

by RTn−1thus reads

Gn−1(Tn−1) = RTn−1(Vn), GN−1(TN−1) ≡ 0,

Vn(ωk) =B(Tn−1, ωk)

B(Tn, ωk)

(−Xn−1(ωk) + Gn(Tn, ωk)+

), k = 1, . . . ,K

We next show how to bypass the need to use compounded regression: note that

G0(0) = H0(0)− U0(0) = −E

[η−1∑n=0

Xn

B(Tn+1)

]= −E

[N−1∑n=0

Xn

B(Tn+1)Iη>n

]

= −E

[N−1∑n=0

Xn

B(Tn+1)

(n∏i=1

IGi(Ti)>0

)]

Defining

Vn =B(Tn)

B(Tn+1)

[−Xn + IGn+1(Tn+1)>0Vn+1

], n = N − 1, . . . , 0

we obtain

V0 = −N−1∑n=0

Xn

B(Tn+1)

(n∏i=1

IGi(Ti)>0

)whence taking expectations

G0(0) = E[V0]

96

Critically, the recursion for Vn involves the value of the cancelable note for exercise deci-sions only, whereas the coupon values are never regressed. We can thus manage to avoidcompounded regression by defining the following approximation:

Gn−1(Tn−1) =B(Tn−1)

B(Tn)

[−Xn−1 + IGn(Tn)>0Gn(Tn)

],

Gn−1(Tn−1) = RTn−1(Gn−1(Tn−1))

10.4.3 Controlling the MC estimate bias

So far the bias of our MC estimate for the value of a CLE is unknown:

• The exercise decisions used in the schemes are necessarily suboptimal, which suggeststhat the estimate is biased low.

• However, the previous schemes use the same set of sample paths to estimate the exercisedecision as to calculate the security value if exercised, which can lead to an upwardbias (since future information can affect our decision to exercise).

In order to remedy this and guarantee a low bias, one could use the LS regression toestimate the exercise decision rule only, and then use an independent simulation to calculatethe value of the CLE given that exercise rule. Concretely:

1. Run any of the regression schemes above to produce coefficients (betas) for the holdC(Hn(Tn)) and exercise values C(Un(Tn)) at all exercise times, n = 1, . . . , N − 1.

2. Simulate K ′ additional paths ω′1, . . . , ω′K′ that are independent from the paths used in

the regression scheme.

3. For each path ω′k, calculate the values of the q-dimensional regression variables processζ(Tn, ω

′k), n = 1, . . . , N − 1.

4. For each path ω′k calculate an estimate of the exercise index η by

η(ω′k) = minn ≥ 1 : C(Un(Tn))>ζ(Tn, ω

′k) ≥ C(Hn(Tn))>ζ(Tn, ω

′k)∧N

5. Calculate the CLE value as the MC value of a knock-in discrete barrier option

H0(0) ∼ 1

K ′

K′∑k=1

N−1∑n=η(ω′k)

1

B(Tn+1, ω′k)Xn(ω′k)

(136)

10.4.4 Upper bound

10.4.5 Choice of regression variables

The closer ζ(T ) = (ζ1(T ), . . . , ζq(T )) approximates the information in FT that is relevantfor the security, the better the regression method performs (namely, the smaller the bias ofthe lower bound estimates of the security value).

• The dimensionality of the state vector (i.e. Libor rates) is typically so large that theregression will suffer from numerical problems. This issue can be handled by formingthe first d principal components of the vector of Libor rates. Concretely, given a vectora (centered) still-alive forward Libor rates

A = (Ln(Tn)− E[Ln(Tn)], . . . , LN−1(Tn)− E[LN−1(Tn)])>, N − n ≥ d

97

one can estimate the term covariance matrix of the rates c = E[AA>

]and by PCA we

can find an (N − n)× d matrix D such that DD> is the closest rank-d approximationto c, namely

D = arg minD[tr(c−DD>)(c−DD>)>

]and then

A ' Dx, where x = arg minx‖A−Dx‖2 = (D>D)−1D>A

• In some situations, the information carried in the state variables is inadequate forthe security in question and in other situations the state variables may carry too muchinformation, adding unnecessary work to the numerical scheme and reducing the qualityof the CLE price estimate. See [AP10-3, Section 18.3.9.2].

10.5 Implementation

[AP10-3, Section 18.3.10] explores a number of ways in which the regression algorithm can bemade more robust. Recall that given a K-dimensional vector Y = (Y1, . . . , YK)> of simulatedvalues of a random variable Y and a K × q matrix Z of simulated values of the regressionvariables ζ(t), namely Zk,j = ζj(t, ωk), our goal in regression is to find a q-dimensional vectorβ such that solves the minimization problem

β = arg minβ‖Y − Zβ‖2

This problem has a naive solution β = (Z>Z)−1Z>Y .

• [AP10-3, Section 18.3.10.1] explains the R2 criterion for automated explanatory vari-able selection.

• [AP10-3, Section 18.3.10.4] explains two methods to make the computation β =(Z>Z)−1Z>Y more robust when the matrix Z>Z is ill-conditioned, namely close tosingular.

(i) Tikhonov regularization: consider the minimization problem β = arg minβ‖Y −Zβ‖2 + ωreg‖β‖2 where the scalar regularization weight ωreg can be selected in anumber of ways, for instance

ωreg = ε

√1

qtr(Z>ZZ>Z), ε = 10−4

The solution is given by

β = (Z>Z + ωrefI)−1Z>Y

where the matrix to be inverted Z>Z + ωrefI always has full rank.

(ii) Singular value decomposition. Write β = (Z>Z)−1Z>Y as a system of linearequations

Mβ = Z>Y, M = Z>Z

and decompose the q × q matrix M = UΣV > where...

98

10.6 Valuation with low-dimensional models

Single-rate CLE’s35 can be valued accurately using the so-called local-projection method,which consists of calibrating a local low-dimensional model to the volatility information thathas been identified as important to the CLE valuation.

As a starting point, any low-dimensional model should be calibrated to the followingtargets:

1. Underlying swap rate volatilities S1n(Tn)N−1

n=1 . Here we assume that each couponCn depends on S1

n(Tn) only and that Sn(t) ≡ Sn,µ(n)(t) where µ(n) is the number ofperiods.

2. Core swap rate volatilities S2n(Tn)N−1

n=1 , namely S2n(t) ≡ Sn,N−n(t). We have flexibility

in choosing strikes, but ATM strikes are common.

3. Core correlations for S2n(Tn)N−1

n=1 . These are not observable in the market, so we needto draw on an LM model which has been calibrated to the market as a whole. This isparamount: by including this dynamic information, the local model not only capturesthe static information about the interest rate volatilities at valuation time, but also thedynamics of the volatility structure.

There are several options for the local model, the one-dimensional quasi-Gaussian (qG)model being a natural one. Recall that a standard Gaussian model is given by


df(t, T ) = σf (t, T )t

[∫ T

t

σf (t, u) du

]dt+ σf (t, T )tdW (t)


(−∫ T

t

κ(u) du

)σf (t, T ) = g(t)h(T )

where one normally works with the state variables

x(t) := r(t)− f(0, t)

dx(t) = [y(t)− κ(t)x(t)] dt+ σr(t)dW (t), x(0) = 0,

y(t) =

∫ t

0


r(u) du

In a local volatility quasi-Gaussian (qG) model the function g(·) is allowed to depend on thestate variables, namely g(t) ≡ g(t, x(t), y(t). The approximate dynamics of a local volatilityquasi-Gaussian model are given by

dS(t) = ϕ(t, S(t))dWA(t)

ϕ2(t, s) = EA[∂xS(t, x(t), y(t))σr(t, x(t), y(t))2 |S(t) = s

]where

ϕ(t, S(t)) ≈ λS(t) [bS(t)S(t) + (1− bS(t))S(0)]

and

λr(t) =

N−1∑n=1

λn1(Tn−1,Tn](t), br(t) =

N−1∑n=1

bn1(Tn−1,Tn](t)

In this model we have;

35Namely, CLE’s for which each coupon Cn depends on at most a single market rate, which we denote byS1n(Tn).

99

• 2(N − 1) independent volatility parameters (λn, bn), n = 1, . . . , N − 1 which can beused to calibrate the model to term volatilities for one of the swap rate strips.

• N−1 parameters κn, n = 1, . . . , N−1 defining the mean reversion which can be used toeither match the term volatilities for the second swap rate strip or the core correlations.

Therefore, the one-factor local volatility qG model is not large enough to match all threesets of calibration targets identified above. Even though this is acceptable for some securities,it is not for some others, and in this case we might need to resort to a two-factor model.

10.7 Suitable analogue for core swap rates

We noted that the callability value is driven by the volatilities of core swap rates S2n(t) =

Sn,N−n(t), since CLE’s are related to standard Bermudan swaptions. In some cases though,it is more relevant to calibrate the model to other European swaptions36. We thus try torefine the selection of the volatility targets relevant for the callability option of a CLE.

We use the idea is that the local model should match the values of European options onexercise values Un(Tn), n = 1, . . . , N − 1. Since such options can be hard to come by, wecan linearize the underlying Un(Tn) of the CLE and use the resulting rate as a replacementfor the core swap rate.

We use an LM model as a backdrop. Assume that

Un(Tn) = fn(L(Tn)), n = 1, . . . , N − 1

L(Tn) = (Ln(Tn), . . . , LN−1(Tn))

Linearizing we obtain

Un(Tn) = fn(L(0)) +∇fn(L(0))[L(Tn)− L(0)]

so the value of the European option on the underlying can be approximated by

E

[(Un(Tn))+

B(Tn)

]≈ E

[1

B(Tn)(∇fn(L(0))L(Tn)− (∇fn(L(0))L(0)− fn(L(0))))

+

]and the relevant interest rate is hence

Rn(Tn) := ∇fn(L(0))L(Tn) =∑j

ωn,jLj(Tn), ωn,j =∂fn∂Lj

(L(0))

• The volatility of the rate Rn(Tn) can be approximated in an LM or local market models,so they can easily be used as volatility targets in place of core swap rates.

• Un(t) typically consists of options on market rates, so the derivatives ∂fn∂Lj

can be com-

puted with Black-type approximations of option values.

Example 10.1 (Bermudan swaptions). The exercise value of a Bermudan swaption on apayer swap is

Un(Tn) =

N−1∑i=n

τi(Li(Tn)−K)

P (Tn,Ti+1)︷︸︸︷(i∏

k=n

1

1 + τkLk(Tn)

)36For instance, in the case of Bermudan swaptions on amortizing swaps, the most relevant European swaptions

are not the standard core European swaptions, but instead swaptions with tenors based on the durations of theunderlying amortizing swap (see section ?? ahead).

100

with

fn(x) =

N−1∑i=n

τi(xi −K)

(i∏

k=n

1

1 + τkxk

)∂fn∂Lj

(x) = τj

j∏k=n

1

1 + τkxk− τj

1 + τjxj

N−1∑i=j

τi(xi −K)

(i∏

k=n

1

1 + τkLk(Tn)

)so

ωn,j = τjP (0;Tn, Tj+1)

[1− Uj(0)

P (0, Tj)

]and Rn(Tn) =

∑N−1j=n ωn,jLj(Tn) is quite close to the core swap rate Sn,N−n(Tn).

101

11 Bermudan swaptions

Given a tenor structure 0 = T0 < T1 < · · · < TN−1 , a Bermudan swaption is CLE with thecoupon paying a fixed rate Cn = k for n = 1, . . . , N − 1. Assume for simplicity that exerciseis possible on any tenor date. If exercised at time Tn, the value for a payer swap in is

Un(t) =

N−1∑i=n

P (t, Ti+1)ETi+1

t [τi(Li(Ti)− k)] =

N−1∑i=n

τiP (t, Ti+1)[Li(t)− k];

Un(Tn) = An(Tn)[Sn(Tn)− k]

where An(t) ≡ An,N−n(t) and Sn(t) ≡ Sn,N−n(t).

Bermudan swaptions are liquid and their trading volume is high, so the performanceadvantage of PDE methods over Monte Carlo simulation make low-factor Markovian modelsattractive, and the local projection method for single-rate CLE’s provides a sound frameworkfor this. The method takes a simple form for Bermudan swaptions since only the volatilityparameters of the core swap rates are relevant.

11.1 Local projection method

A Bermudan swaption gives the holder the right to exercise into one of several different swapsthat are observed on different exercise dates. Bermudan swaptions are thus conceptuallysimilar to a basket option with payoff maxLn(Tn), n = 1, . . . , N − 1 and their values arehence driven by the volatilities of core swap rates37 Sn(Tn)N−1

n=1 and by the inter-temporalcorrelations thereof38.

This observation allows one to use simple one-factor Gaussian or quasi-Gaussian mod-els for valuation ad risk-management of Bermudan swaptions, with the volatility function

37Recall that For each swap rate Sn,m(t) we specify stochastic volatility (SV) dynamics in the annuity measure

dSn,m(t) = λn,m [bn,mSn,m(t) + (1− bn,m)Sn,m(0)]√zn,m(t)dWn,m(t)

dzn,m(t) = θ [1− zn,m(t)] dt+ ηn,m√zn,mdZ

n,m(t)

38Recall that in a one-factor pure Gaussian model or a quasi-Gaussian model (with local volatility), the meanreversion parameter can be calibrated to a market-implied inter-temporal correlation matrix

χn1,n2 = Corr(Sn1(Tn1), Sn2(Tn2)), 1 ≤ n1 ≤ n2 < N − 1

independently of the volatility term. For instance in a pure Gaussian model with constant volatility σr and meanreversion κ we have (c.f. [AP10-2, Section 13.1.8.1])

Corr(F (T1;T1,M1), F (T2;T2,M2)) = Corr(x(T1), x(T2)) = e−κ(T2−T1)

(1− e−2κT1

1− e−2κT2

)1/2

where F (t;T,M) = − 1M−T ln P (t,M)

P (t,T ).

Likewise, in a one-factor quasi-Gaussian model we have

Corr(Sn1(Tn1), Sn2(Tn2)) '∫ Tn1

0

(∂Sn1

∂x(t, 0, 0)

)(∂Sn2

∂x(t, 0, 0)

)dt

×

(∫ Tn1

0

(∂Sn1

∂x(t, 0, 0)

)2

dt

)−1/2

×

(∫ Tn2

0

(∂Sn2

∂x(t, 0, 0)

)2

dt

)−1/2

102

calibrated to volatilities of core rates and the mean-reversion function calibrated to inter-temporal correlations thereof, with the latter being extracted from a global Libor model.

Several approaches are possible when it comes to choosing the calibration targets. Sup-pose we want to value and hedge a Bermudan swaption with fixed rate k and tenor structure0 = T0 < T1 < · · · < TN giving the holder the right to exercise into one of the swaps observedon each of the exercise dates T1, . . . , TN−1.

• For each expiry Tn, use the volatility of the appropriate ATM core European swap-tion. Unfortunately, this choice leads to inconsistent valuation between European andBermudan swaptions: given a Bermudan swaption model calibrated to ATM Europeanswaptions, applying it to a Bermudan swaption with a non-ATM strike and just a singleexercise date (namely, a non-ATM European swaption) the value will differ from thatobtained from that of the same derivative, priced as a European swaption.

• Instead, for each expiry Tn, use the volatility of the appropriate core European swaptioncorresponding to the strike k.

11.2 Amortizing, accreting and other non-standard Bermudan swap-tions

Standard Bermudan swaptions can be extended by replacing the unit notional by a time-dependent and deterministic notional Ri, so that the exercise value becomes

Un(t) =

N−1∑i=n

RiτiP (t, Ti+1)[Li(t)− k] (137)

• If the notional increases with the coupon index i, the Bermudan swaption is said to beaccreting.

• If the notional decreases with the coupon index i, the Bermudan swaption is said to beamortizing.

Valuation of such swaptions in a properly calibrated model doesn’t entail anything new.Nevertheless, while general models like LM present a product-independent calibration, mod-els requiring local calibration, such as the one-factor quasi-Gaussian model, calibration fornon-standard Bermudan swaptions requires a deeper analysis, since their liquidity is consid-erably poorer than for vanilla European swaptions.

We thus require a pre-processing step to extract amortizing European swaption prices froma model calibrated to liquid vanilla European swaptions.

11.2.1 Relating non-standard to standard swap rates

Consider a swap that starts at Tn and ends at TN , with a notional schedule Ri The annuityand swap rate that correspond to this non-standard swap are defined as

An(t) =

N−1∑i=n

RiτiP (t, Ti+1), Sn(t) =1

An(t)

N−1∑i=n

RiτiP (t, Ti+1)Li(t)

The non-standard swap rate can be decomposed as a linear combination of standard swapslike so39:

39The weights ωn,m(Tn) can be approximated reasonably well b their values at time 0.

103

Sn(t) =

N−1∑m=1

ωn,m(Tn)Sn,m(Tn), ωn,m(Tn) = (Rn+m−1 −Rn+m)An,m(Tn)

An(Tn)

Hence in order to price a non-standard Bermudan swaption in principle one needs tocalibrate the model to the volatilities of standard rates with all expiries T1, . . . , TN−1 andall maturities, something which a low-dimensional local model will virtually never be able todo. We next discuss a method to remedy this issue.

11.2.2 Representative swaption approach (payoff matching)

The idea is to choose a standard swap that approximates the non-standard swap in somereasonable sense, and then to calibrate the Bermudan swaption model to the market-impliedvolatilities of swaptions on these standard swaps, one per exercise date.

Consider a one-factor Gaussian model to illustrate the idea. Fix a start date Tn and letUn(Tn;x) be the value of a non-standard swap, where x = x(Tn) is the Gaussian short ratestate on date Tn. We have seen that

Un(Tn;x) =

N−m∑m=1

(Rn+m−1 −Rn+m)Vn,m(Tn)

=

N−m∑m=1

(Rn+m−1 −Rn+m)

[n+m−1∑i=n

τiP (Tn, Ti+1)[Li(Tn)− k]

]

Recall that by the Gaussian bond reconstitution formula we have

P (t, T ) =P (0, T )

P (0, t)exp

(−x(t)G(t, T )− 1

2y(t)G2(t, T )

)G(t, T ) =

∫ T

t

e−2∫ tuκ(s) ds du

y(t) =

∫ t

0


r(u) du

which explains why we indicate the dependence on x in the notation Un(Tn;x). Note alsothat Un(Tn;x) depends on the volatility up to time Tn only.

We now seek to find a standard swaption providing the best possible approximation toUn(Tn;x). Let V (x;R, q,m) denote the value of a standard swap starting on Tn as a functionof x, with constant notional R, fixed rate (strike) q and covering m periods (where we allowm to be a real number in order to avoid discontinuities.

In the so-called payoff matching method, the three parameters R, q,m are chosen so thatthe level, slope and curvature of the swap payoffs as functions of the state variable x, namelywe impose

V (x0;R, q,m) = Un(Tn;x0)

∂

∂xV (x0;R, q,m) =

∂

∂xUn(Tn;x0) (138)

∂2

∂x2V (x0;R, q,m) =

∂2

∂x2Un(Tn;x0)

where x0 = E[x(Tn)]. These systems are solved numerically and produce the optimal pa-rameters R∗n, q

∗n,m

∗n (which are numerically seen to depend only mildly on mean reversion).

104

Since Un(Tn;x) only depends on the volatility parameter40 σr(t) over [0, Tn], it is easyto incorporate the payoff matching methods into a volatility calibration scheme such as theone outlined in section 5.3.2. Specifically, we calibrate one swaption at a time:

1. Assume that the mean reversion has been fixed exogenously and that σ0, . . . , σi−1 havebeen computed.

2. Select the standard swaption Vi(x;R∗i , q∗i ,m

∗i ) that best approximates Ui(Ti;x) by solv-

ing the system of equations (138), which only requires knowledge of σ0, . . . , σi−1.

3. Set the value σi such that the model price for the standard swaption Vi(x;R∗i , q∗i ,m

∗i )

equals its market price, by numerically inverting Jamshidian’s formula (32)

4. Repeat Step 2 for i = 0, . . . , N − 2.

Remark 11.1. This method as is doesn’t work well for accreting Bermudan swaptions,since it would produce an optimal tenor m∗ that is longer than the tenor of the amortizingswap N − n (see discussion on [AP10-3, Page 884]). It turns out that in order to get areasonable calibration scheme for an accreting Bermudan swaption we would need to calibrateto two standard European swaptions per expiry and this can be difficult within a one-factorMarkovian model, so additional factors are required.

11.2.3 Basket approach

This approach is a creative way of using one-factor models for non-standard Bermudanswaptions by splitting the valuation into two stages:

(i) Some model is used to calculate values of core non-standard European swaptions.

(ii) A one-factor model is then calibrated to the previous values and subsequently used tocompute the value of non-standard Bermudan swaptions.

The variants of this method rely on different ways of implementing (i), namely of valuingnon-standard European swaptions. One could for instance use a globally calibrated modelsuch as LM to value non-standard European swaptions and then use the local projectionmethod for non-standard Bermudan swaptions (namely, calibrate a low-dimensional modelto the prices produced by the global model).

Alternatively, one may proceed as follows. Consider a swaption grid on a tenor T1 <· · · < TN .

1. For each n = 1, . . . , N − 1, fit an instance of a one-factor Gaussian or quasi-Gaussianmodel to the prices of European swaptions expiring at Tn with underlying swaps span-ning the intervals [Tn, Tn+1], . . . , [Tn, TN−1].

Note that the value of a swap fixing at Tn and covering m periods depends on the meanreversion function over [Tn, Tn+m] so that we can keep σr fixed at a reasonable level(e.g. 1%) and calibrate the function κ(t) one swaption at a time.

Choice of European swaption strikes: there is no fixed rule for this. The best optionis to use a one-factor quasi-Gaussian model with stochastic volatility, which allowscalibration to the volatility smile at more than a single strike, thereby alleviating thestrike selection problem.

2. For each Tn, use the relevant instance of the one-factor model calibrated in the previousstep to price the Tn-expiry non-vanilla European option that the Bermudan swaptioncan be exercised into.

40In [AP10-3, Section 19.4.3] the authors claim that Un(Tn;x) does NOT depend on the volatility parameter,which is not true for the Gaussian model as far as I’m concerned. The one-swaption-at-a-time volatility calibrationscheme can still be easily coupled with the payoff matching method though, since the time-Tn bond prices dependonly on the volatility parameter up to time Tn.

105

3. Calibrate a one-factor model to the prices of the non-standard swaptions obtained inthe previous step using the standard procedure (namely, calibrating the short ratevolatility function σr(t) one swaption at a time while keeping the mean reversion fixedat a user-specified value or at a level that makes inter-temporal correlations of coreswap rates match those coming from a global model.

11.2.4 American swaptions

American swaptions allow the holder to exercise on any given date after the lockout period.These are particularly popular in the US for hedges of mortgage bonds.

American swaptions present a subtlety: if exercise takes place during [Tn, Tn+1], theoption holder receives a swap starting at Tn+1 as well as an exercise fee equal to (Ln(Tn)−k)(Tn+1 − t), namely the exercise value per unit of notional is

UAn (t) = (Ln(Tn)− k)(Tn+1 − t) + Un+1(t)

Note that there time-t exercise value hence depends on the value of the Libor rate at Tn < tand is thus path-dependent.

11.2.4.1 American vs. high-frequency Bermudan swaptions

11.2.4.2 The Proxy Libor rate method

11.3 Flexi-swaps

A Bermudan swaption can be interpreted as as fixed-floating swap with zero notional and asingle option to increase the notional to a given level on any of the exercise dates. Similarly, acancelable swap can be seen as a swap of full notional with an option to decrease the notionalto zero on any of the exercise dates.

Flexi-swaps (a.k.a. chooser swaps) are contracts offering more flexibility in choosingswap notionals, namely swaps with multiple options to change the notional on a given set ofexercise dates, subject to certain constraints.

Consider a tenor structure Tn, n = 0, . . . , N and a collection of coupons Xn with unitnotional fixing at Tn and paying at Tn+1. A flexi-swap is a contract paying a net couponXnRn at time Tn+1, where R0 is fixed up-front, and the time-Tn notional Rn is chosen bythe holder of the option at time Tn, subject to some constraints:

• Global deterministic bonds: Rn ∈ [glon , ghin ].

• Local bounds which are functions of the current notional: Rn ∈ [llon (Rn−1), lhin (Rn−1)].

• Bounds that are functions of market data xn (e.g. Libor or swap rates): Rn ∈[mlo

n (xn),mhin (xn)].

Denote by Cn(Rn−1, xn) the set of time-Tn constraints, so that Rn ∈ Cn(Rn−1, xn). Thevaluation of a flexi-swap can be done via a backward recursion equation: if Vn(t, R) denotesthe time-t value of the part of a flexi-swap paying strictly after Tn, given that Rn = R, wehave

Vn−1(Tn−1, R) = P (Tn−1, Tn)Xn−1R+B(Tn−1)ETn−1

[1

B(Tn)max

R′∈Cn(R,xn)Vn(Tn, R

′)]

(139)for n = N, . . . , 1 and with terminal condition VN (TN , R) ≡ 0.

106

11.3.1 Purely local bounds

Assume that only local constraints are enforced and that these are of the form

`lon (Rn−1) = λlonRn−1, `hin (Rn−1) = λhin Rn−1, 0 ≤ λlon ≤ λhin

Also assume for simplicity that the value of the flexi-swap scales linearly in notional, namely

Vn(Tn, R) = RVn(Tn, 1)

Simple computations then show that the recursive equation (139) simplifies to

Vn−1(Tn−1) = P (Tn−1, Tn)Xn−1

+B(Tn−1)ETn−1

[1

B(Tn)Vn(Tn)

(λlon 1Vn(Tn)<0 + λhin 1Vn(Tn)>0

)](140)

11.4 Fast pricing via exercise premia representation

On occasions the speed of valuation of Bermudan swaptions is more important than accuracy(e.g. robust hedging, calculation of CVA). We now consider a useful approximation basedon the representation of a Bermudan swaption as a stream of coupons paid in the exerciseregion.

107

12 TARN’s, volatility swaps and other derivatives

12.1 TARN’s (targeted redemption notes)

A TARN pays structured coupons in exchange for Libor coupons until the cumulative amountof structured coupon payments exceeds a pre-agreed target R, at which point the derivativeterminates. For a given tenor structure 0 = T0 < T1 < · · · < TN , the time-0 value of aTARN is hence

Vtarn(0) = EB

[N−1∑n=1

1

B(Tn+1)τn (Cn − Ln(Tn)) IQn<B

],

Qn =

n−1∑i=1

τiCi

• Faithful reproduction of volatility smiles of various Libor rates is important for TARN’s,so it is recommendable to use globally calibrated models with stochastic volatility.

• The knockout feature of the TARN will introduce digital discontinuities (jumps com-ing from the step function IQn<B), so Monte Carlo errors of the contract value and,especially, its risk sensitivities can be large. Ways of amending this are described in[AP10-3, Chapters 23 and 24].

• The full power of LM and gG models may not be required for TARN’s can often do witha local projection method, as described below: in a nutshell, the idea is to find a simplelocal model that is calibrated the parts of a global model’s volatility structure (e.g.LM) that are relevant to the derivative being valued, in such a way as to approximatethe value of the global model for a particular derivative.

12.1.1 Local projection method

Consider the example of an inverse floating coupon Cn = (s− g · Ln(Tn))+

. The time-0value of the corresponding TARN with target B is

Vtarn(0) = EB

[N−1∑n=1

1

B(Tn+1)τn

((s− g · Ln(Tn))

+ − Ln(Tn))I∑n−1

i=1 τi(s−g·Li(Ti))+<B

]

which depends on the values

L = (L1(T1), . . . , LN−1(TN−1))

of Libor rates on their fixing dates only (the values at intermediate dates being irrelevant)so only the properties of the (N − 1)-dimensional vector L need be captured.

Assuming log-normal distributions for market rates, if two models assign the same valuesto the term variances of Libor rates Var(lnLn(Tn)) and inter-temporal correlations of Liborrates Corr(lnLn(Tn)), then the values of a TARN in both models should be the same. Oneshould thus apply the local projection method as follows:

1. Calibrate a Libor market model to the full swaption volatility grid.

2. Use the calibrated LM model to calculate the relevant term volatilities and inter-temporal correlations needed for the TARN.

3. Pick a simpler model and calibrate it to the volatilities and correlations extracted fromthe LM model.

4. Use the calibrated local model for valuing the TARN and computing risk sensitivities.

108

12.1.2 Effect of volatility smiles

A successful candidate for the local model should have the ability to calibrate to volatilitysmiles of all Libor rates, in addition to having a low number of state variables, for instancethe one-factor quasi-Gaussian model with stochastic volatility. One may proceed as follows:

1. Fix the mean reversion function to match the inter-temporal correlations of Libor ratesCorr(lnLn(Tn), lnLm(Tm)), n,m = 1, . . . , N − 1 as described in Section 6.2.4.

2. Calibrate to the volatility smiles of all Libor rates that appear in the payoff formula.

3. Choose the time-dependent local volatility function σr(t, x, y) and volatility of variancefunction η(t) to match the implied SV parameters of relevant caplets as in Section 6.2.2.

12.2 Volatility swaps

Volatility derivatives are contingent claims whose underlying is the volatility of a financialobservable, rather than a financial observable itself. An (interest rate) volatility swap is acontract the measures realized volatility of a given rate Xn(t). The most common coupons(uncapped and capped versions) are

Cn = |Xn+1(Tn+1)−Xn(Tn)|, Cn = min|Xn+1(Tn+1)−Xn(Tn)|, c

The value of the structured leg of the volatility swap measures the realized variation of therate Xn(·)

Vvolswap(t) = β(t)Et

[N−1∑n=0

τn1

β(Tn+1)|Xn+1(Tn+1)−Xn(Tn)|

]− Vfloatleg(t), (141)

Vfloatleg(t) =

N−1∑n=0

τnP (t, Tn+1)Ln(t) = P (t, T0)− P (t, TN )

Common choice for the rate Xn are

• Fixed-tenor volatility swap, involving a swap rate of the same tenor on each of thefixing dates Xn(t) = Sn,m(t), with fixed m.

• Fixed-expiry volatility swap, involving a swap rate with fixed expiry and tenor Xn(t) =SK,m(t).

• Volatility swaps on CMS rates, which measure the variation of the spread of two ratesXn(t) = Sn,m1

(t)− Sn,m2(t).

As was the case for TARN’s, the value of a volatility swap (141) depends on the valuesof the swap rates Sn on their fixing dates only, so we may apply the local projection method(instead of a full-blown globally calibrated market model).

12.2.1 Volatility swaps with shout option

Some volatility swaps carry the option to shout, namely to choose when the fixing date occursfor the purpose of calculating the coupon payoff. The payoff for the n-th coupon is hence

Cn = |Xn+1(ηn)−Xn(Tn)|

with ηn ∈ [Tn, Tn+1] chosen by the investor. This coupon is paid on date Tn+1. At time Tn,the option looks like an American option with exercise value

P (ηn, Tn+1)|Sn+1(ηn)− Sn(Tn)|

109

• For valuation purposes, the shout option can be ignored for uncappedcoupons. By Jensen’s inequality we have

P (ηn, Tn+1)ETn+1ηn [|Sn+1(Tn+1)− Sn(Tn)|] ≥ P (ηn, Tn+1)

∣∣ETn+1ηn [Sn+1(Tn+1)]− Sn(Tn)

∣∣' P (ηn, Tn+1) |Sn+1(ηn)− Sn(Tn)|

whence the early-exercise value is negligible.

• Consider now the case of a capped coupon with a shout option

C ′n = min|Sn+1(ηn)− Sn(Tn)|, c

It seems as though one requires an estimation of the optimal exercise rule via regression,but it turns out that the situation is much simpler, as the following result shows.

Proposition 12.1 (Broadie and Detemple). The value of an American option on a cappedstraddle with payoff C ′n = min|Sn+1(ηn)−Sn(Tn)|, c is equal to the value of a straddle with

a barrier, so that ETn+1

Tn[C ′n] = E

Tn+1

Tn[C ′′n ] where

C ′′n = c·Imaxt∈[Tn,Tn+1]|Sn+1(t)−Sn(t)|≥c+|Sn+1(t)−Sn(t)|·Imaxt∈[Tn,Tn+1]|Sn+1(t)−Sn(t)|<c

so that the optimal exercise strategy is known analytically: for the period [Tn, Tn+1] oneshould exercise the shout option on the first time t when Sn+1(t) hits either of the barriersSn(t)± c.

Having replaced an American option with a barrier option, we can value capped volatilityswaps in standard Monte Carlo.

12.2.2 Min-Max volatility swaps

The structured coupon for a min-max volatility swap is

Cn = Mn −mm, where Mn = maxt∈[Tn,Tn+1]

Xn(t), mn = mint∈[Tn,Tn+1]

Xn(t)

12.3 Forward swaption straddles

110

13 Risk management

Consider the following notation:

• Θmkt(t): Nmkt-dimensional vector of observable market data at time t (swap andfutures rates for yield curve construction, cap and swaption prices for volatility cali-bration.

• Θprm(t): Nprm-dimensional vector of additional parameters that are not directly ob-served, but are estimated from historical data, such as short rate mean reversion,correlation parameterizations, SV mean reversion speeds, local volatility parameters,etc.

• Θnum(t): additional parameters that control the numerical schemes used in the model,such as the number of MC paths, the size of the discretization steps, etc.

One first uses Θmkt(t) and Θparm(t) to construct the yield curve and calibrate the model,that is to obtain a new vector

Θmdl = C(Θmkt(t); Θprm(t))

Given the time-t yield curve, a set of model parameters Θmdl(t) and a set of numericalscheme parameters Θnum(t) we can compute the value V (t) of our portfolio of derivativesecurities Vi(t), namely

V (t) = V1(t) + · · ·+ Vn(t) = M(Θmdl(t); Θnum(t)) = H(Θmkt(t); Θprm(t); Θnum(t)) (142)

The role of a trading desk is then to construct a hedge around (142), with the hedgeaiming to neutralize, in a standard Taylor-series sense, as many of the movements in theentire market data vector Θmkt(t) as possible.

The dimension of the market data vector Nmkt can be very high and it may be too costlyto hedge against all components of Θmkt(t), so some type of principal component analysismay be undertaken.

13.1 Value at risk

The risk management team in a bank is primarily focused on analyzing the distribution offuture portfolio values, in order to gauge the overall riskiness of the portfolio. The value atrisk is one of the most commonly used risk measures.

VaR at level α, denoted Λα and typically a negative value, is the (1α)-percentile of thedistribution of the P&L move V (t+ h)− V (t) in the real-life measure P , namely

P (V (t+ h)− V (t) ≤ Λα|Ft) = 1− α

In other words, the probability of losing more than −Λα over the time interval [t, t + h] isless than 1− α. Typically, α is set to 99% or 95%, and h to one business day.

Another commonly used risk measure is conditional value-at-risk (cVaR) which is definedas the conditional expectation

Ξα,h = EP [V (t+ h)− V (t)|V (t+ h)− V (t) ≤ Λα]

To compute Λα one needs a statistical description for the market data increment vector41

δ, for which one may follow different methodologies:

41Namely, a vector of variables affecting the portfolio value, such as exchange rates, equity prices, interest ratesand so forth.

111

1. Using the historical distribution of δ directly: ones takes the actual realizations of δover the last NV aR trading days and applies them to the current market data, therebygenerating the empirical distribution of V (t+ h)− V (t).

The calculation of VaR then amounts to ranking the impact of the last NV aR marketmoves on the current portfolio, worst to best, and using the impact on the day withrank (1− α)NV aR as VaR.

2. Using a parameterized, rather than historical, distribution of market moves. One as-sumes that

δi ∼ N (0, s2i ), si = σi

√h, i = 1, . . . , Nmkt

where σi is annualized basis point volatility of the i-th market element.

The co-dependence between the elements of δ is captured in a correlation matrix whichis typically estimated from historical time series.

To compute VaR in the Gaussian setup, we may use equation

V (t+ h) = V (t) +∇H(t)δ +1

2δ>AH(t)δ

Computations are then analytically tractable and are encapsulated in the followingproposition.

Proposition 13.1. Neglecting the Hessian AH(t) = 0, namely assuming that

V (t+ h) = V (t) +∇H(t)δ

and assuming that the elements of the Nmkt-dimensional vector δ have correlation ma-trix R and satisfy δi ∼ N (0, s2

i ) we have

Λα = vΦ−1(1− α)

Ξα = − v

1− αφ(Φ−1(1− α)

),

v2 = ∇H(t)diag(s)Rdiag(s)∇H(t)>

where s = (s1, . . . , sNmkt) and Φ(x) = 1√2π

∫ x−∞ e−x

2/2 dx is the Gaussian CDF.

Proof. Note that the covariance matrix of δ is C = diag(s)Rdiag(s)>. Since V (t+h) =V (t) +∇H(t)δ, this implies that V (t+ h)− V (t) ∼ N(0, v2) ∼ vN(0, 1) where

v2 = ∇H(t)C∇H(t)>.

Hence if Z ∼ N (0, 1) we have

P (V (t+ h)− V (t) ≤ Λα) = P

(Z ≤ Λα

v

)= Φ

(Λαv

)≡ 1− α =⇒ Λα = vΦ−1(1− α)

As for cVaR we have

EP [∆V |∆V ≤ Λα] =1

P (∆V ≤ Λα)EP

[∆V I∆V≤Λα

]=

v

1− αEP

[ZIZ≤Λα

v

]

Remark 13.1. 1. If we wish to include AH in the computation, the distribution of V (t+h)−V (t) is no longer simple. One can show that it is expressible as a sum of independentnon-central chi-square random variables. From this representation, its characteristicfunction can be computed, and this can be turned numerically into a CDF from whichVaR can be computed.

2. The key to a good VaR computation is a reliable and accurate estimate for ∇H andAH .

112

14 Payoff smoothing for Monte Carlo

• Practical risk management of a portfolio of interest rate securities revolves aroundprice sensitivities with respect to various valuation inputs, such as market prices andmodel parameters. These sensitivities are computed by applying small perturbationsto market and model parameters, followed by re-pricing of the securities portfolio inquestion.

• Price sensitivities are inherently less smooth than the price function itself, and this lackof smoothness will often put significant stress on a numerical scheme.

• It is thus important to try to adapt numerical schemes to avoid introducing spuriousinstabilities into the calculation of greeks.

We focus here on the so-called Tube Monte Carlo method for illustrative purposes42 andwe apply it to digital options, barrier options and CLE’s. The method is designed to beapplied when calculating sensitivities by direct perturbation.

14.1 Tube Monte Carlo for digital options

Consider a digital option with payoff

VT = IS(T )>B

where S(t) is the process for the underlying that is simulated using Monte Carlo. Thestandard Monte Carlo estimate (where ωj, j = 1, . . . , J are the sample paths) is given by

V ' 1

J

J∑j=1

IS(T,ωj)>B

The idea is to replace the point sample of the payoff IS(T )>B at S(T, ωj) with an averageestimate of the payoff in a small interval around S(T, ωj), namely

V ' 1

J

J∑j=1

Vj , Vj = E[IS(T )>B|Aj

]Aj = ω : S(T, ω) ∈ [S(T, ωj)− ε, S(T, ωj) + ε] , ε > 0

Clearly:

• If B /∈ Aj , then E[IS(T )>B|Aj

]= IS(T,ωj)>B.

• If B ∈ Aj and ε is small, using the uniform distribution approximation we obtain

E[IS(T )>B|Aj

]= Q (S(T ) > B|S(T, ωj)− ε ≤ S(T ) < S(T, ωj) + ε)

=S(T, ωj) + ε−B

2ε

Hence,

Vj = E[IS(T )>B|Aj

]=

S(T, ωj) + ε−B2ε

IS(T,ωj)−ε≤B<S(T,ωj)+ε + 1 · IS(T,ωj)+ε≤B + 0 · IB<S(T,ωj)+ε

42For a detailed discussion, including payoff smoothing methods for PDE’s, see [AP10-3, Chapter 23].

113

14.2 Tube Monte Carlo for barrier options

Consider a derivative which is a knock-in barrier into a stream of coupons X1, . . . , XN−1,with the knock-in feature defined by a stopping time index η, so that the derivative payscoupons Xi at Ti+1 for i = η, . . . , N − 1 and has value

V (0) = E

[N−1∑i=1

1

B(Ti+1)XiIi≥η

]

The standard Monte Carlo estimate is

V (0) =1

J

J∑j=1

(N−1∑i=1

1

B(Ti+1, ωj)Xi(ωj)Ii≥η(ωj)

)

As above, the indicator Ii≥η(ωj) will introduce digital discontinuities in the payoff whichlead to poor stability and risk sensitivities. The idea of the tube method is again to replacethe point estimates of the payoff with payoff averages over appropriately defined tubes. Ifx(t, ω) denotes the d-dimensional vector of state variables of the underlying model, define

Aεj =

N−1⋂i=1

Aεj,i,

Aεj,i = ω : ‖x(Ti, ω)− x(Ti, ωj)‖ < ε

Observe that Aεj denotes the set of sample paths that come within ε-distance of x(Ti, ωj)for all Ti, i = 1, . . . , N − 1. We may then take our new Monte Carlo estimator to be

V (0) =1

J

J∑j=1

Vj , Vj = E

[N−1∑i=1

1

B(Ti+1, ω)Xi(ω)Ii≥η(ω)|Aεj

]

Being smooth functions of the path ω, we may just evaluate 1B(Ti+1,ω) and Xi(ω) at the

sample path, so that Vj becomes

Vj =

N−1∑i=1

1

B(Ti+1, ωj)Xi(ωj)E

[Ii≥η(ω)|Aεj

]An approximation of the probabilities qi(ωj) := E

[Ii≥η(ω)|Aεj

]is provided in [AP10-3,

Proposition 23.4.1] under the assumption that stopping time index η can be written as thefirst hitting time of a state-dependent boundary

η(ω) = minn ≥ 1 : ψn(x(Tn, ω)) ≥ 0 ∩N

and is given by

1− qi(ωj) =

i∏n=1

(1− pn(ωj)),

pn(ωj) =

1, ψn,j − δn,j ≥ 0,ψn,j+δn,j

2δn,j, ψn,j − δn,j < 0 < ψn,j + δn,j

0, ψn,j + δn,j ≤ 0

where

ψn,j = ψn(x(Tn, ωj)), δn,j = ε‖∇ψn,j‖, ‖∇ψn,j‖ = ∇ψn(x)|x=x(Tn,ωj)

114

Remark 14.1 (Tube Monte Carlo for CLE’s). Recall that CLE’s can be valued in MonteCarlo by representing them as knock-in discrete-barrier options with the knock-in defined byan estimate of the exercise index43 so we may apply the tube Monte Carlo method we justoutlined to smooth out the payoff. Concretely, the function ψ(x(Tn, ω)) in this case wouldbe

ψ(x(Tn, ω)) = C(Un(Tn))>ζ(Tn, ω′k)− C(Hn(Tn))>ζ(Tn, ω

′k)

Remark 14.2 (Tube Monte Carlo for TARN’s). A TARN can be represented as a derivativethat pays a stream of coupons until a knock-out event takes place when a sum of structuredcoupons exceeds a certain target.

VTARN (0) = E

(N−1∑i=1

1

B(Ti+1)XiIi≤η

),

η(ω) = min n ≥ 1 : Qn(ω)−R ≥ 0 ∧N

43C.f. equation ( 136).

115

15 Pathwise differentiation

We start by recalling the pathwise differentiation for European options, following [AP10-1,Section 3.3.2].

Take a European T -maturity option on a dividend-free stock with price process S(t) withpayout function g(S(T )) and consider the problem of computing

dV

dS0= limh→0

V (S0 + h)− V (S0)

h

By the risk-neutral pricing formula V (S0) = E [g(S(T ))]. Assuming that the stock pricefollows a GBM we have

S(T ) = S0 exp

[(r − σ2

2

)T + σ

√TZ

], Z ∼ N(0, 1)

Sh(T ) = (S0 + h) exp

[(r − σ2

2

)T + σ

√TZ

]so Sh(T ) = (S0 + h)S(T )

S0. If the payout function g(·) is regular enough so that the

expectation and the limit are interchangeable, then

dV

dS0= e−rTE

[limh→0

g(Sh(T ))− g(S(T ))

h

]= e−rTE

[g′(S(T ))

S(T )

S0

](143)

so we can implement (143) by generating samples of S(T ) and recording the sample averages

of g′(S(T ))S(T )S0

.

Remark 15.1. For discontinuous payoffs, such as g(x) = 1x>K care must be taken uponimplementing this. In this case, (143) reads

dV

dS0= e−rTE

[δ(S(T )−K)

S(T )

S0

]= e−rT

K

S0ϕS(K)

where δ(·) is the delta function. This is correct (provided that the terminal stock densityϕS(·) is known, but it is unsuited for Monte Carlo simulation since the likelihood of δ(S(T )−K) being non-zero is zero.

More generally, consider estimating dVdα where V (α) = E[Y (α)]. Under certain regularity

conditions (c.f. [AP10-1, Proposition 3.3.1]), we seek to write

dV

dα=

d

dαE[Y (α)] = E

[d

dαY (α)

]Assuming that Y represents a payout function, namely

Y (α) = g(X(α)), X(α) = (X1(α, . . . ,Xq(α))

we can further writed

dαY (α) =

q∑i=1

∂g

∂Xi

dXi

dα

In order to compute the derivatives dXidα , assuming that X(t) is driven by a scalar SDE

dX(t) = µ(t,X(t))dt+ σ(t,X(t))dW (t)

116

differentiating with respect to α on both sides and writing Dα(t) = dX(t)dα we obtain

dDα(t) =d

dxµ(t,X(t))Dα(t)dt+

d

dxσ(t,X(t))Dα(t)dW (t)

and the latter SDE can be discretized and simulated in parallel with the simulation of theSDE for X(t) itself.

We next illustrate how to apply this method to compute greeks of CLE’s and barrieroptions, but we first make a few comments on the likelihood ratio method.

15.1 On the unsuitability of the likelihood ratio method in interestrate modeling

In general, an alternative method to compute sensitivities by Monte Carlo simulation is thelikelihood ratio method. In general, assume that Y (α) represents a deflated payout functiong(·) applied to a vector of random variables X(α) = (S1(α), Sq(α)) with joint densityf(x;α), x ∈ Rq. Then

V (α) = E [g(S(T ))] =

∫Rqg(x)f(x;α) dx

Assuming that the density function f(x;α) is a smooth, we can exchange differentiation andintegration

∂V (α)

∂α=

∫Rqg(x)

∂f(x;α)

∂αdx =

∫Rqg(x)

∂ ln f(x;α)

∂αf(x;α) dx = E [g(X(T ))`(X(T ))]

where `(x)∂ ln f(x;α)∂α is the so-called log-likelihood ratio.

Example 15.1. In the Black-Scholes setting S(T ) = S0 exp[(r − σ2

2

)T + σ

√TZ], Z ∼

N (0, 1) and Y (T ) = lnS(T ), the log-likelihood ratio is given by

`(Y (T )) =∂ lnϕ(Y (T );S0)

∂S0=

1

S0σ√TZ

• The likelihood ratio method does not require smoothness of the payout function, butonly of the density function, which is a mild assumption.

• Knowledge of the density function is required, although one always has the option ofusing a Gaussian approximation based on an Euler discretization.

• In general, it is the earliest observation T1 of the underlying asset process44 that de-termines how fast its variance explodes (note the

√T in the denominator of the log-

likelihood ratio in the Black-Scholes model). Because of the regular structure of mostinterest rate derivatives, the time T1 will in most cases be rather short, resulting inhigh variance of the estimate.

44For instance, the first coupon fixing date, the first exercise date of a CLE or the first knock-out date of abarrier.

117

15.2 CLE’s

Recall the main valuation recursion for CLE’s (in the spot measure QB

Hn−1(Tn−1)

B(Tn−1)= EBTn−1

[1

B(Tn)maxHn(Tn), Un(Tn)

],

Un(t) = B(t)

N−1∑i=n

Et

[1

B(Ti+1)Xi

],

Xi = τi(Ci − Li(Ti))

where Xi are the net coupon payments, Ci are structured coupons, Hn(t) is the n-th holdvalue and Un(t) is the n-th exercise value. Denote by ∆α a pathwise differentiation operatorwith respect to a given parameter α.

Proposition 15.1. Assuming that the coupons Xn, n = 1, . . . , N − 1 and the inverse num-raire 1

B(t) are Lipschitz continuous functions of the parameter α, for any n = 1, . . . , N − 1

we have

∆α

(Hn−1(Tn−1)

B(Tn−1)

)= ETn−1

[∆α

(Un(Tn)

B(Tn)

)IUn(Tn)>Hn(Tn)

]+ ETn−1

[∆α

(Hn(Tn)

B(Tn)

)IHn(Tn)>Un(Tn)

]Unwrapping this recursive statement, if η is the optimal exercise time index then

∆α(H0(0)) = E

[N−1∑n=η

∆α

(Xn

B(Tn+1)

)]

Remark 15.2. From the fact that

H0(0) = E

[N−1∑n=η

Xn

B(Tn+1)

], ∆α(H0(0)) = E

[N−1∑n=η

∆α

(Xn

B(Tn+1)

)]

is looks as though one can compute the derivative ∆α by differentiating the first sum andpretending that the optimal exercise time index η is independent of α. This is not the casein most cases: see [AP10-3, Section 24.1.1.2] for a justification.

Remark 15.3. When using brute-force perturbation methods to evaluate CLE greeks, thereare different alternatives when deciding how to treat the exercise decision in the perturbedmarket data scenario. As justified in [AP10-3, Section 24.1.1.3], it is convenient to re-use theestimate of the optimal exercise index from the base scenario in order to avoid introducingdiscontinuities.

15.3 Barrier options

Recall that a CLE can be interpreted as a knock-in barrier option (where the barrier conditionis defined by the optimal exercise rule) which suggests that the pathwise differentiationmethod outlined in the previous section could be extended to general barrier options.

118

15.3.1 Digital option warm-up

Consider a T -maturity European option with payoff

X = IG>hR, G,R FT -measurable, h ∈ R

Differentiating with respect to α we obtain

∆α

(IG>hR

)= IG>h∆αR+

∂IG>h∂G

R∆αG = IG>h∆αR+ δ(G− h)R∆αG

Hence,

∆αE

[X

B(T )

][∗]= E

[∆α

(X

B(T )

)]= E

[∆α

(1

B(T )

)IG>hR

]+ E

[1

B(T )IG>h∆αR

]+ E

[1

B(T )δ(G− h)R∆αG

]= E

[∆α

(1

B(T )

)IG>hR

]+ E

[1

B(T )IG>h∆αR

]+γG(h) · E

[1

B(T )R∆αG|G = h

]where γG(h) is the PDF of G evaluated at h.

• In [∗] we assumed that we can exchange ∆α and E[·], even though we can’t due to thedigital discontinuity of IG>h. The result of the computation is nevertheless true, andcan be justified using Malliavin Calculus.

• The expected values on the RHS can be computed in a numerical scheme such as MonteCarlo as long as the density γH(·) is known and the conditional expectation

E

[1

B(T )R∆αG|G = h

]can be evaluated. Both can be computed using Malliavin calculus, even though in somecases these can be approximated in closed form.

15.3.2 Barrier options

Consider a barrier schedule TnN−1n=1 to which we associate knockout variables Gn and barrier

levels hn. Consider an option that pays Rn on the first date Tn where Gn > hn, so that

V (0) = E

[Rη

B(Tη)

],

η = mink ≥ 1 : Gk > hk ∧N

More generally, consider the option with the barrier condition checked at times Tn+1, . . . , TN−1

only, namely

Vn(t) = B(t)Et

[Rηn

B(Tηn)

],

ηn = mink ≥ n+ 1 : Gk > hk ∧N

119

Proposition 15.2. (c.f. [AP10-3, Proposition 24.1.3]) For the barrier option describedabove, which pays Rn on the first Tn where Gn > hn, n = 1, . . . , N − 1, the pathwisederivative with respect to a parameter α is given by

∆αV (0) = E

[1

B(Tη)∆αRη|n=η

]+ E

[1

B(Tη)γη(hη) (Rη − Vη(Tη)) ∆αGn|n=η

∣∣∣∣Gη = hη

]

Sketch of proof. The following recursion is clear

Vki,n(Tn)

B(Tn)= ETn

[Rn+1

B(Tn+1)IGn+1>hn+1

]+ ETn

[1

B(Tn+1)Vki,n+1(Tn+1)IGn+1≤hn+1

]Formal differentiation yields the following recursion for ∆αVki,n(Tn)

∆α

(Vki,n(Tn)

B(Tn)

)= ETn

[∆α

(Rn+1

B(Tn+1)

)IGn+1>hn+1

]+ ETn

[∆α

(1

B(Tn+1)Vki,n+1(Tn+1)

)IGn+1≤hn+1

]+ ETn

[1

B(Tn+1)(Rn+1 − Vki,n+1(Tn+1)) δ(Gn+1 − hn+1)

]Unwrapping the recursion the claim follows.

Remark 15.4. The method applies directly to CLE’s through their interpretation as knock-in options. Recall that

VCLE(0) = E

[Uη(Tη)

B(Tη)

]= E

[N−1∑n=η

Xn

B(Tn+1)

], η = minn : Un(Tn) > Hn(Tn)

so that Proposition 15.1 follows directly from Proposition 15.2 upon setting Rn = Un(Tn),Gn = Un(Tn)−Hn(Tn) and hn = 0.

Remark 15.5. These results are rather complex and require knowledge of the transitiondensities and conditional probabilities, which are often hard to compute. Therefore, it maybe more fruitful to apply payoff smoothing techniques (e.g. tube Monte Carlo method) orimportance sampling techniques to integrate any discontinuities before applying the pathwisedifferentiation method.

15.4 Pathwise differentiation for Monte Carlo based models

Consider an LM model with separable deterministic local volatility with dynamics in thespot measure QB given by

dLn(t) = ϕ(Ln(t))λn(t)>(µn(t)dt+ dWB(t)

)µn(t) =

n∑j=q(t)

τjϕ(Lj(t))λj(t)

1 + τjLj(t)

Assume that the LM model and the security to be priced share the same tenor structure0 = T0 < · · · < TN .

120

16 Importance sampling and control variates

In this section we explore two common variance reduction techniques (importance samplingand control variates). We start by outlining the basic principles upon which they rely and byillustrating them via simple examples extracted from [Gla03, Chapter 4]. We then describeapplications to fixed-income derivatives following [AP10-3, Chapter 25].

16.1 Importance sampling

The idea of importance sampling is to use a change of measure to reduce variance. Assumewe seek to estimate µ = EP [Y ]. Let P be an equivalent measure to P ; then

µ = EP [Y ] = EP[YdP

dP

]not= EP

[Y

R

]where R = dP

dP is the Radon-Nikodym derivative. The goal of the method is to choose the

new measure P optimally in order to minimize the P -variance of YR .

Assume Y = g(X) where g : Rp→R is a well-behaved function and X is a p-dimensionalrandom vector with density fX : Rp→R. Then

µ = EP [g(X)] =

∫Rpg(x)fX(x) dx

can be estimated by µn = 1n

∑ni=1 g(Xi) for independent samples X1, . . . , Xn of X.

If h : Rp→R is the (strictly positive) density function of X in a nother measure P , wecan represent µ as

µ = EP [g(X)] =

∫Rpg(x)

fX(x)

h(x)h(x) dx = EP

[g(X)

f(X)

h(X)

]which in turn has a Monte Carlo estimate µhn = 1

n

∑ni=1 g(Xi)

f(Xi)h(Xi)

, where now X1, . . . , Xn

are independent draws from h. The quotient f(x)(h(x) is the so-called likelihood ratio.

We now compare both variances:

Var[µhn] =1

n

(EP

[g(X)2 f(X)2

h(X)2

]− µ2

)=

1

n

(EP

[g(X)2 f(X)

h(X)

]− µ2

)Var[µn] =

1

n

(EP

[g(X)2

]− µ2

)and we note that importance sampling will lower variance if

EP[g(X)2 f(X)

h(X)

]< EP

[g(X)2

]Now assume that the p-dimensional process X(t) is driven by an SDE

dX(t) = µ(t,X(t))dt+ σ(t,X(t))dW (t) (144)

where W (t) is a d-dimensional Brownian motion and that we seek to evaluate

EP [g(X(T ))]

Assume that the SDE is simulated by an m-dimensional scheme so that

g(X(T )) = G(Z1, . . . , Zm), G : Rp×m→R

121

where Z1, . . . , Zm are independent p-dimensional Gaussian vectors with density φ(z). Byindependence,

EP [g(X(T ))] =

∫Rp×m

G(z1, . . . , zm)

m∏i=1

φ(zi) dzi

The likelihood ratio `(z) =∏miφ(zi)h(zi)

performs a change of measure (to P , say) that preserves

independence of Zi and alters the common marginal density from φ(z) to h(z) so that

EP [g(X(T ))] = EP

[G(Z1, . . . , Zm)

m∏i=1

φ(Zi)

h(Zi)

]

Example 16.1. Assume p = 1 and consider shifting the means of the Zi from 0 to somescalar µ, so that

h(zi) =1√2π

exp

(−1

2(zi − µ)2

)whereby

`(z;µ) = exp

(−µ

m∑i=1

zi +m

2µ2

)The idea would then be to set µ to minimize the P -variance of the term G(Z)`(Z;µ). Thisminimization problem is generally handled numerically.

Example 16.2. (c.f. [AP10-1, Section 3.4.4.5]) Consider the problem of estimating byMonte Carlo

P (Z > c)

where Z ∼ N (0, 1) and large c, so that the event Z > c is rare and a standard MonteCarlo simulation might be inaccurate.

In ordinary Monte Carlo we would use the sample mean estimator

P (Z > c) = EP [IZ>c] '1

n

n∑i=1

IZi>c

where Z1, . . . , Zn ∼ N (0, 1) are independent standard Gaussian samples. The variance ofthis estimator is easily seen to be

VarP (IZ>c) = P (Z > c) [1− P (Z > c)]

Introducing a probability that shifts the mean of Z from 0 to some µ, by Example 16.1 thelikelihood ratio is

`(z) =dP

dP= e−µz+

12 z

2

soP (Z > c) = EP [IZ>c] = EP

[e−µz+

12 z

2

IZ>c], Z ∼ N (µ, 1)

and a Monte Carlo estimator for this is then

1

n

n∑i=1

e−µ(Zi+µ)+ 12 z

2

IZi+µ>c, Z1, . . . , Zn ∼ N (0, 1)

As for the variance of the estimator one can compute that

VarP(e−µz+

12 z

2

IZ>c)

= eµ2

P (Z > µ+ c)− P (Z > c)2

122

and the choice of µ that minimizes this P -variance is

µ∗ = arg minµ

eµ

2

P (Z > µ+ c)

= µ∗ : 2µ∗ [1−N (c+ µ∗)]− n(c+ µ∗)

This holds more generally in higher dimensions and for more general payoffs. Assume wewant to estimate E[G(Z)] and assume that we restrict ourselves to changes of distributionthat change the mean of Z from 0 to µ. In this case we have

E[G(Z)] = EPµ

[G(Z)e−µ

>Z+ 12µ>µ], µ ∈ Rn, Z

Pµ∼ N (µ, I)

[1]' EPµ

[eF (Z)e−µ

>Z+ 12µ>µ]

' E[eF (µ+Z)e−µ

>Z− 12µ>µ]

[2]' E

[eF (µ)+∇F (µ)Ze−µ

>Z− 12µ>µ]

where in [1] we expressed F (Z) = lnG(Z) and in [2] we linearized F (µ + Z). Note thatchoosing

∇F (µ)Z = µ> (145)

we eliminate the dependence on Z, and hence the variance of the estimator (although [2] isjust an approximation, so we end up with a low-variance estimator).

Example 16.3. (c.f. [Gla03, Section 4.6.2]) Consider for instance the case of an arithmeticAsian option. Ignoring the discount factor e−rT we have a payoff

G(Z) =[S(Z)−K

]+, S =

1

m

m∑i=1

S(ti), 0 = t0 < t1 < · · · < tm

S(ti) = S(ti−1) exp

((r − 1

2σ2)(ti − ti−1) + σ

√ti − ti−1Zi

), Zi ∼ N (0, 1) (146)

Imposing condition (145)

∇(lnG(Z)) = Z =⇒ ∂

∂Zjln(S(Z)−K) =

1

S −K∂S

∂Zj= Zj

=⇒ ∂S

∂Zj− (S −K)Zj = 0

Using the discretization of S this condition becomes

Zj =1

mG(Z)

m∑i=j

σ√ti − ti−1S(ti)

which, given an equally spaced time grid with ti − ti−1 = h yields a recursion

z1 =σ√h[G(Z) +K]

G(Z), zj+1 = zj −

σ√hS(tj)

mG(Z), j = 1, . . . ,m− 1 (147)

Given a payoff vale y ≡ G(Z), this scheme determines Z (if y ≡ G(Z), apply (147) tocalculate Z1, ten apply (ch25-importance-sampling-example-1) to calculate S(t1), then apply(ch25-importance-sampling-example-2) again to obtain Z2 and so forth). Each payoff valuey = G(Z) thus determines Z, and hence a path S(ti; y)mi=1 and we seek to find the y forwhich

1

m

m∑i=1

S(ti; y)−K = y

123

This equation can be solved for y∗ easily using a 1-dimensional search and this determinesZ∗ = Z(y∗). To simulate, we then simply set µ = Z∗ and apply importance sampling withmean µ.

16.2 Payoff smoothing via importance sampling

We illustrate how use the importance sampling method to produce payoff smoothing byconsidering two examples: binary options and TARN’s.

16.2.1 Binary options

Assume X ∼ N (µ, σ2) and consider en option that pays g(X) provided that X is below acertain barrier b, namely

V = EP[g(X)IX<b

], P some pricing measure

• Valuing by Monte Carlo entails simulating independent Gaussian samples, discardingthose that end up above the barrier b and averaging the payoff values over non-discardedsamples.

• If b is low, the proportion of samples that contribute to the average is small, whichleads to large simulation error.

• The digital feature of the payoff reduces the accuracy and stability of Monte Carloestimates of greeks.

We thus seek to change the probability to increase the proportion of interesting samples.Conditioning on the survival we obtain

V = E[g(X)|X < b]P (X < b) = E[g(X)|X < b]N(b− µσ

)= E

[g(N−1(U ′))

]N(b− µσ

)where U ′ ∼ U(0,N (b)). Since g is smooth, a Monte Carlo evaluation of E

[g(N−1(U ′))

]will

have good convergence and exhibit stable greek estimates.

Setting Λ =IX<bP (X<b) we can write

V = E[g(X)IX<b

]= E[g(X)Λ]P (X < b) = E[g(X)]P (X < b), Λ =

dP

dP

so conditioning on survival can be interpreted as a change of measure defined by the Radon-Nikodym derivative Λ.

16.2.2 TARN’s

Recall that the TARN valuation formula under the spot measure QB reads

Vtarn(0) = EB

[N−1∑n=1

1

B(Tn+1)Xn(Tn)IQb<R

], Qn =

n−1∑i=1

τiCi

where the Xn’s are net coupons and R is the total return.

Example 16.4 (Removing first digital discontinuity). We first try to apply the approachfrom the previous section to remove the first digital discontinuity. Assume structured couponsof the form

Cn = (s− gLn(Tn))+

124

and note that

Q2 < R⇐⇒ L1(T1) >1

g

(s− R

τ1

)def= b1

Denote by V the path value of the coupons that depend on the first knockout event

V =

N−1∑n=2

1

B(Tn+1)Xn(Tn)IQn<R

and note that since E[V |L1(T1) ≤ b1] = 0 we have

E[V] = E[V|L1(T1) > b1]QB(L1(T1) > b)

• Since T1 is typically small, the probability QB(L1(T1) > b) of not knocking out can beapproximated analytically with high precision.

• E[V|L1(T1) > b1] can be interpreted as the value of the TARN under the conditionthat it does not knock out on the date T1, which is less discontinuous than our originalpayoff function.

More generally Piterbarg shows that all discontinuities can in fact be integrated outsideof Monte Carlo.

Proposition 16.1. Consider a TARN with structured coupon Cn = (s − gLn(Tn))+ and

denote bn = 1g

(s− R−Qn

τn

). Its value is given by

Vtarn(0) =

N−1∑n=1

EB[ψnETn−1

[Xn(Tn)

B(Tn+1)

]], (148)

ψn =

n−1∏k=1

QBTk−1(Lk(Tk)− bk) (149)

where the measure QB is defined by the Radon-Nikodym derivative process Λ(t) = Et

[dQB

dQB

]where

Λ(t) =QBt (Lm+1(Tm+1) > bm+1)

QBTm(Lm+1(Tm+1) > bm+1)

m−1∏k=1

ILm+1(Tm+1)>bm+1

QBTk(Lm+1(Tk+1) > bk+1), t ∈ [Tm, tm+1) (150)

• The quantities ψn in (149) can be approximated analytically since each term of theform QBTk−1

(Lk(Tk > bk) involves an expected value over a short period [Tk−1, Tk], so

short-time (e.g. Gaussian) approximations to the distribution of Lk over [Tk−1, Tk]may be applied effectively (in fact, in some simulation schemes the ψn’s come for free).

• Note that the functions ψn replace the indicator functions IQn<R as weights on thecoupons in the TARN valuation formula. The former are much smoother functions ofa simulated path than are indicator functions.

• In measure QB , the TARN never knocks out. However, in order to use the method inpractice we need to establish how to simulate model dynamics in this measure.

16.2.2.1 Simulating under survival measure using conditional Gaussiandraws

Consider the case of a single-factor Libor market model

dLi(t) = λi(t)ϕ(Li(t))[µi(t)dt+ dWB(t)

], i = 1, . . . , N − 1

125

In order to implement TARN pricing formula (148) we need to simulate Libor rates underthe measure QB , namely, in such a way that Ln(Tn) > bn for each n. Consider a simulationstep from Tn−1 to Tn. Employing an Euler scheme we can approximate that QB-dynamicsas

Ln(Tn) = Ln(Tn−1) + λn(Tn−1)ϕ(Ln(Tn−1))[µn(Tn−1)τn−1 +

√τn−1Z

], Z ∼ N (0, 1)

In order to ensure that Ln(Tn) > bn we need to make sure that Z satisfies

Ln(Tn−1) + λn(Tn−1)ϕ(Ln(Tn−1))[µn(Tn−1)τn−1 +

√τn−1Z

]> bn

namely

Z > Zmindef=

bn −mn

vn,

vn = λn(Tn−1)ϕ(Ln(Tn−1))

√τn−1,

mn = Ln(Tn−1) + vnµn(Tn−1)√τn−1

so that Ln(Tn) = mn + vnZ. In order to simulate Z conditioned on it being above a certainlevel Zmin we proceed as above

Z = N−1 [N (Zmin) + (1−N (Zmin))U ] , U ∼ U(0, 1)

It now suffices to verify that this measure just constructed actually coincides with the measuredefined by the Radon-Nikodym process (150) (c.f. [AP10-3, Page 1071]).

• Once Z has been drawn, all Libor rates can be evaluated using

Li(Tn) = Li(Tn−1) + λi(Tn−1)ϕ(Li(Tn−1))[µi(Tn−1)τn−1 +

√τn−1Z

]• The weights ψn in (149) come for free within this framework, since

QBTn−1(Ln(Tn) > bn) = 1−N (Zmin)

16.3 Control variates

Suppose we wish to estimate E[Y ] by Monte Carlo; the idea of the control variates method isto is a possibly multi-dimensional random variable with known mean which is closely relatedto Y in order to reduce the variance of the simulation. Concretely, if Y c = (Y c1 , . . . , Y

cq )> is

a q-dimensional vector of random variables with known mean µc = E[Y c] = (µc1, . . . , µcq)>

and β = (β1, . . . , βq)> is a constant vector, we may form a new random variable

X = Y − β> · (Y c − µc)

Clearly E[X] = E[Y ] so we can run a Monte Carlo simulation on X to estimate the meanof Y .

Denote the q × q covariance matrix of Y c by

ΣY c , [ΣY c ]i,j = Cov(Y ci , Ycj )

and define the q-dimensional vector

ΣY Y c , [ΣY Y c ]i = Cov(Y, Y ci )

Then clearly

Var(X) = Var(Y ) + β>ΣY cβ − 2β>ΣY Y c

126

Lemma 16.2. Var(X) is minimized at β∗ = Σ−1Y cΣY Y c and the minimum value is given by

minβ

Var(X) = (1−R2)Var(Y ), R2 =1

Var(Y )Σ>Y Y cΣ

−1Y cΣY Y c

where we identify R2 as the R-squared of a multi-dimensional regression of Y against Y c

and the components of the optimal vector β∗ as the corresponding regression coefficients.

The ratio of variances is henceminβ VarX

VarY = 1−R2 so the larger the correlation betweenthe Y and the control variate, the larger the variance reduction.

In our usual derivative pricing setting, we seek to replace our standard Monte Carloestimate

1

K

K∑j=1

Y (ωj)

with

1

K

K∑j=1

(Y (ωj)− β>(Y c(ωj)− E[Y c])

)where Y c(ωj) are random samples of the multi-dimensional control variate Y c, chosen suchthat E[Y c] is available in closed form.

If Y is the value of a security under a given model, there are multiple ways to select thecontrol variate Y c.

• Y c may represent the value of the same security under a different (but closely related)model.

• Y c may represent the value of a different, but closely related, security under the samemodel.

• Y c may represent the value of an approximate hedging strategy.

• Y c may be a weighted combination of any of the above control variates.

We explore some of these in the next subsections, but we first outline a few elementaryexamples from [Gla03, Section 4.1].

Example 16.5 (pricing complex options using more tractable options as variates). Considerpricing an arithmetic Asian option with payoff SA = 1

n

∑ni=1 S(ti) in the Black-Scholes

model. No closed-form solution is known, so one must rely on simulation. In contrast, calls

and puts on the geometric average SG = (∏ni=1 S(ti))

1/ncan be priced in closed form and are

good candidates45 to be used as control variates in pricing options on SA. For an arithmeticAsian call option, one could form a controlled estimator using independent replications of

(SA(ω)−K)+ − b

(SG(ω)−K)+ − E[e−rT (SG −K)+]

Example 16.6 (pricing options using prices in simpler models as variates). Consider pricingan option on an asset with dynamics

dS(t)

S(t)= rdt+ σ(t)dW (t)

We can simulate S at dates t1, . . . , tn using

S(ti+1) = S(ti) exp

([r − 1

2σ(ti)

2](ti+1 − ti) + σ(ti)√ti+1 − tiZi+1

), Zi ∼ N (0, 1)

45 [Gla03, Figure 4.2] shows that there is an almost perfect correlation between the payoff of a call option onarithmetic average and the payoff of a call option on the geometric average.

127

Assume the option in question had a more tractable price if the asset were a geometricBrownian motion. We could simulate

S(ti+1) = S(ti) exp

([r − 1

2σ2](ti+1 − ti) + σ

√ti+1 − tiZi+1

), Zi ∼ N (0, 1)

for some constant σ and initial condition S(0) = S(0). For instance, for a standard call stuckat K, we could form a controlled estimator using independent replications of

(S(tn)−K)+ − b

(S(tn)K)+ − E[(S(tn)−K)+]

For effective variance reduction, one should choose σ to be close to a typical value of σ.

16.3.1 Model-based control variates

We fix some notation:

• Let Vorig be a Monte Carlo estimate for the true security value in the original pricingmodel

• Let Vproxy be a Monte Carlo estimate for the same security in a proxy model where a

highly accurate price estimate E[Vproxy] is available.

• Denote VPDE = E[Vproxy].

We then introduce a corrected value estimate as

Vcorrected = Vorig − β(Vproxy − VPDE

)where β is the appropriate regression coefficient. If the path values used to compute Vorigare positively correlated with the ones used to obtain Vproxy, the variance of the estimate isreduced.

We illustrate how to construct a PDE-friendly model proxy for the LM model.Recall that the LM model is Markovian only in the full set of all forward Libor rates on theyield curve, plus any additional variables required to model unspanned SV. For concreteness,consider a one-factor model with deterministic local volatility

dLn(t) = ϕ(Ln(t))λn(t)>(µn(t,L(t))dt+ dWB(t)

)µn(t,L(t)) =

n∑j=q(t)

τjϕ(Lj(t))λj(t)

1 + τjLj(t)

We seek to construct a low-dimensional Markovian approximation in a number of steps.

1. Eliminate the local volatility ϕ(Ln(t)) via the transform

f(x) =

∫ x

x0

1

ϕ(ξ)dξ, `n(t)

def= f(Ln(t)), n = 1, . . . , N − 1

which eliminates ϕ from the diffusion part of the SDE and yields

d`n(t) = λn(t)>[(µn(t,L(t))− 1

2λn(t)ϕ′(Ln(t))

)dt+ dW (t)

]2. The LM model drift terms are generally small, so we may approximate

µn(t,L(t)) ' µn(t,L(0))

128

3. For local volatility functions ϕ(x) which are close to linear, we may further approximate

ϕ′(Ln(t)) ' ϕ′(Ln(0))

which yields

d`n(t) = λn(t)>[(µn(t,L(0))− 1

2λn(t)ϕ′(Ln(0))

)dt+ dW (t)

]so that each `n(t) is an integral of λn(t) against a Brownian motion with deterministicdrift.

4. To make all variables functions of the same state variable, approximate the volatilitystructure with a separable one

λn(t) ' λn(t) = σnα(t), n = 1, . . . , N − 1

where α is a function of time common to all Libor rates and σn’s are Libor-specificscalars. Defining a one-dimensional Markovian state variable by

dX(t) = α(t)dW (t)

we obtain that all the variables `n(t) are deterministic functions of X(t):

`n(t) = `n(0) + dn(t) + σnX(t),

dn(t) =

∫ t

0

λn(s) ·(µn(s,L(0))− 1

2λn(s)ϕ′(Ln(0))

)ds

which translated back into Libor forwards reads

Ln(t) = f−1 (f(Ln(0)) + dn(t) + σnX(t)) , n = 1, . . . , N − 1

5. With this representation, at each point in time t, the value of any path-independentderivative V can be expressed as a function

V = V (t,X(t))

satisfying the PDE

∂V (t, x)

∂t+

1

2α2(t)

∂2V (t, x)

∂x2= r(t, x)V (t, x)

subject to appropriate boundary and jump conditions.

To use the Markov approximation as the control variate, we define:

• Vproxy: value of the security in the Markov LM model computed by Monte Carlo.

• VPDE : value of the security computed by the PDE method in the same model.

In order to achieve the required high correlation between the path values of a derivativein the original and proxy models, it is necessary to use the same simulation seed for randomnumber generation and also compatible discretization schemes.

The potential downside of the method is the fact that its scope is limited by the needto perform a PDE valuation, with three dimensions being the practical maximum for areasonably quick PDE scheme.

129

16.3.2 Instrument-based control variates

The idea here is to keep the model fixed but change the payoff; we illustrate how the methodworks when applied to Bermudan swaptions and CLE’s.

Consider a Bermudan swaption with N − 1 exercise opportunities T1, . . . , TN−1, withcorresponding exercise values

Un(t) = B(t)Et

[N−1∑i=n

1

B(Ti+1)Xi

], Xi = τi(Li(Ti)− κ)

The value of the Bermudan swaption is given by

H0(0) = E

N−1∑i=η

1

B(Ti+1)Xi

where η is the optimal exercise index, and the corresponding K-path Monte Carlo estimatevalue is

H0(0) ' 1

K

K∑k=1

Y (ωk), Y (ω) =

N−1∑i=η(ω)

1

B(Ti+1, ω)Xi(ω)

where ω1, . . . , ωK are simulated paths.

A naive set of control variates that we could introduce is based on the exercise values

Y cn (ω) = Un(Tn, ω) = B(Tn)

[N−1∑i=n

1

B(Ti+1, ω)Xi(ω)

]but this presents two major drawbacks:

1. The control variates Y cn , n = 1, . . . , N − 1 are linear functions of rates, whereas thepayoff of a Bermudan swaption is not well approximated by a linear function.

2. More subtly, the value of a Bermudan swaption (or CLE) along a path ω involves cash-flows fixed at times Tη(ω), . . . , TN , whereas all the cash-flows in the definition of Y cnare sampled at a single time. Since the number of coupons included in the Bermudanswaption and the controls differ, the correlation between them is likely to be low.

16.3.2.1 Timing mismatch

The timing mismatch in (2) can be rectified by sampling the controls at the Bermudanswaption exercise time, since:

Lemma 16.3. Let Un = Un(Tn) and let Zn, n = 0, . . . , N − 1 be a collection of martingaleswith respect to the filtration FTnNn=0. If η, σ ∈ 1, . . . , N − 1 are stopping times satisfyingη ≤ σ, then

[Corr(Uη, Zη)]2 ≥ [Corr(Uη, Zσ)]2

Therefore if a European option maturing at Tn is used as a control variate, and for aparticular Monte Carlo path η < n, the Lemma implies that we should use the value of theoption at time Tη to generate a control variate (higher correlation) rather than wait untilthe maturity date Tn. Namely, we could consider control variates

Y cndef= Un(Tη) = B(Tη)

η−1∑i=n

Xi

B(Ti+1)+B(Tη)ETη

N−1∑maxη,n

Xi

B(Ti+1)

, n = 1, . . . , N − 1

Note also that by the optional sampling theorem Y cn = Un(Tη) are martingales, whence

E[Y cn ] = E[Un(Tη)] = Un(0)

130

16.3.2.2 Non-linear controls

To introduce non-linearity into the controls for Bermudan swaptions, one can consider usingcaps, which can be valued exactly in the majority of LM models. Concretely, for a Bermudaswaption with strike κ whose exercise values are

Un(t) = B(t)

N−1∑i=n

Et

[1

B(Ti+1)τi(Li(Ti)− κ)

]consider the set of caps

Vcap,n(t) = B(t)

N−1∑i=n

Et

[1

B(Ti+1)τi(Li(Ti)− κ)+

], n = 1, . . . , N − 1

These can be used to construct a few possible controls. For instance, we can use a collectionof all caps, observed at the Bermudan exercise time

Y c = (Y c1 , . . . , YcN−1)>,

Y cn = Vcap,n(Tη) = B(Tη)

η−1∑i=n

1


+B(Tη)ETη

N−1∑maxη,n

1


For securities more complex than the standard Bermudan swaptions, the idea of sampling

the controls at the exercise time still applies, but the choice of controls becomes non-trivialand has to be done on a case-by-case basis.

16.3.3 Dynamic control variates

We present here the delta method of Clewlow and Carverhill. Let X(t) be a p-dimensionalvector process al of whose components are martingales (e.g. assets deflated by a numraire)and g : Rp→R a smooth function. Consider estimating the expected value

V (0) = E[g(X(T ))]

Under regularity conditions we have

V (T ) = V (0) +

∫ T

0

p∑i=1

∂V (t)

∂Xi(t)dXi(t)

On a simulation time line tjmj=1 this can be written in the style of an Euler scheme

V (T ) = V (0) +

m∑j=1

p∑i=1

∂V (tj−1)

∂Xi(t)(Xi(tj)−Xi(tj−1)) .

The zero-mean quantity

m∑j=1

p∑i=1

∂V (tj−1)

∂Xi(t)(Xi(tj)−Xi(tj−1))

is likely to have high correlation to V (T ), so we can consider using it as a control variate. Inpractice, one can use deltas constructed using approximate risk sensitivities in order buildthe control variate.

131

• For the LM model, deltas could originate from an approximate Markov model as theone in Section 16.3.1.

• In [AP10-3, Section 25.5], a general technique due to Moni is described which is suitablefor CLE’s, which we outline below.

Take as regression variables polynomials of explanatory variables x(t) = (x1(t), . . . , xd(t))and approximate the hold value using least squares

Hn(Tn) = pn(x(Tn)), n = 0, . . . , N − 1

Approximate sensitivities can then be computed as

∆Hn(Tn)

∆xm(Tn)' ∆Hn(Tn)

∆xm(Tn)=

∂pn∂xm

(X(Tn)), m = 1, . . . , d

which yields an approximate hedging strategy

Vhs(Tn) = H0(0) +

n−1∑j=0

(d∑

m=0

∂pn∂xm

(X(Tj)) [xm(Tj+1)− xm(Tj)]

), n = 1, . . . , N − 1

with time-0 expected value

E [Vhs(Tn)] = H0(0) +

n−1∑j=0

E

[d∑

m=0

∂pn∂xm

(X(Tj))(E[xm(Tj+1)|FTj ]− xm(Tj)

)] [∗]= H0(0)

where in [∗] we assumed that we selected the explanatory variable to be martingales, wherebyE[xm(Tj+1)|FTj ] = xm(Tj).

A control variate can then be defined as the hedging strategy stopped at the exercisetime

Y cdef= Vhs(Tη)

which according to tests by Moni yields reductions in standard error by a factor of 2 to 3.

132

References

[AP10-1] L. Andersen, V. Piterbarg, Interest rate modeling. Volume 1: Foundations andvanilla models, Atlantic Financial Press, 2010

[AP10-2] L. Andersen, V. Piterbarg, Interest rate modeling. Volume 2: Term structure mod-els, Atlantic Financial Press, 2010

[AP10-3] L. Andersen, V. Piterbarg, Interest rate modeling. Volume 3: Products and riskmanagement, Atlantic Financial Press, 2010

[Gla03] P. Glasserman, Monte Carlo Methods in Financial Engineering, Springer, 2003

[Shr04] S. E. Shreve, Stochastic calculus for finance II, Springer, 2004

133

interest rate modeling market models, products and risk...

Documents