interest rate modeling market models, products and risk...
TRANSCRIPT
Interest rate modeling
Market models, products and risk management(following [AP10-1], [AP10-2] and [AP10-3])
Alan Marc Watson
July 5, 2016
Abstract
This document contains a brief summary of Andersen and Piterbarg’s superb three-volume treatise on fixed-income derivatives. I have used this as a self-study guide and alsoto document my subsequent model implementations
Contents
1 Fundamentals of interest rate modeling 61.1 Fixed income notations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61.2 Fixed income probability measures . . . . . . . . . . . . . . . . . . . . . . . . 6
1.2.1 Risk neutral measure . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61.2.2 T-forward measure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71.2.3 Spot measure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71.2.4 Swap measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.3 The HJM analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81.3.1 The Gaussian model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91.3.2 Gaussian model with Markovian short rate . . . . . . . . . . . . . . . 10
2 Vanilla models with local volatility 112.1 General framework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112.2 The CEV model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.2.1 Valuation of European call options in the CEV model . . . . . . . . . 122.2.2 Regularization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122.2.3 Displaced diffusion models . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.3 Finite difference solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132.3.1 Multiple λ and T . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.4 Forward equation for call options . . . . . . . . . . . . . . . . . . . . . . . . . 132.5 Time-dependent local volatility function ϕ . . . . . . . . . . . . . . . . . . . . 14
2.5.1 Separable case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142.5.2 Skew averaging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
3 Vanilla models with stochastic volatility 173.1 Basic properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173.2 Fourier integration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.2.1 Calculation of option prices when MGF is available for ln(S(t)). . . . 173.2.2 Direct integration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
1
3.2.3 Fourier integration for arbitrary European Payoffs . . . . . . . . . . . 193.3 Averaging methods for models with time-dependent parameters . . . . . . . . 21
3.3.1 Volatility averaging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213.3.2 Skew averaging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223.3.3 Volatility of variance averaging . . . . . . . . . . . . . . . . . . . . . . 233.3.4 Calibration by parameter averaging . . . . . . . . . . . . . . . . . . . 23
4 One-factor short rate models 264.1 Review of Swaps and swaptions . . . . . . . . . . . . . . . . . . . . . . . . . . 264.2 The Ho-Lee model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 274.3 The mean-reverting GSR model . . . . . . . . . . . . . . . . . . . . . . . . . . 27
4.3.1 Swaption pricing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 294.3.2 Swaption calibration . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314.3.3 Monte Carlo simulation . . . . . . . . . . . . . . . . . . . . . . . . . . 32
4.4 The affine one-factor model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 334.4.1 Bond pricing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 344.4.2 Discount bond calibration . . . . . . . . . . . . . . . . . . . . . . . . . 344.4.3 European option pricing . . . . . . . . . . . . . . . . . . . . . . . . . . 354.4.4 Swaption calibration . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
4.5 Log-normal short-rate models . . . . . . . . . . . . . . . . . . . . . . . . . . . 374.6 Spanned and unspanned stochastic volatility . . . . . . . . . . . . . . . . . . . 38
5 Multi-factor short-rate models 395.1 The Gaussian multi-factor model . . . . . . . . . . . . . . . . . . . . . . . . . 395.2 Two-factor Gaussian model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 415.3 Swaption pricing (via Gaussian swap rate approximation) . . . . . . . . . . . 41
5.3.1 Calibration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 425.3.2 Monte Carlo simulation . . . . . . . . . . . . . . . . . . . . . . . . . . 43
6 The one-factor quasi-Gaussian model 456.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 456.2 Local volatility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
6.2.1 Linear local volatility . . . . . . . . . . . . . . . . . . . . . . . . . . . 486.2.2 Volatility calibration . . . . . . . . . . . . . . . . . . . . . . . . . . . . 506.2.3 Computational tip . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 516.2.4 Mean reversion calibration . . . . . . . . . . . . . . . . . . . . . . . . . 51
6.3 Stochastic volatility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 516.4 Quasi-Gaussian multi-factor models . . . . . . . . . . . . . . . . . . . . . . . 52
7 The Libor market model 537.1 The basics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
7.1.1 Probability measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . 537.1.2 Link to HJM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 547.1.3 Choice of σn(t) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
7.2 Correlation structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 557.2.1 Empirical PCA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 557.2.2 Empirical estimates for forward rate correlations . . . . . . . . . . . . 56
7.3 Pricing European options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 567.3.1 Caplets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 567.3.2 Swaptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 577.3.3 Spread options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
7.4 Calibration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 617.4.1 Calibration objective function . . . . . . . . . . . . . . . . . . . . . . . 61
2
7.4.2 Calibration algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
7.4.3 Correlation calibration to spread options . . . . . . . . . . . . . . . . . 64
7.5 Monte Carlo simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
7.5.1 Euler-type schemes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
7.5.2 Special-purpose schemes with drift Predictor-Corrector . . . . . . . . . 65
7.6 Interpolation of front / back stubs . . . . . . . . . . . . . . . . . . . . . . . . 66
7.6.1 Back stub interpolation . . . . . . . . . . . . . . . . . . . . . . . . . . 66
7.6.2 Back stub interpolation inspired by the Gaussian model . . . . . . . . 67
7.6.3 Front stub interpolation inspired by the Gaussian model . . . . . . . . 68
7.7 Advanced approximation for swaption pricing via Markovian projection . . . 68
7.7.1 Advanced formula for the swap rate volatility . . . . . . . . . . . . . . 70
7.7.2 Advanced formula for the swap rate skew . . . . . . . . . . . . . . . . 70
7.7.3 Skew and smile calibration . . . . . . . . . . . . . . . . . . . . . . . . 70
7.8 Other extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
7.8.1 Evolving separate discount and forward rate curves . . . . . . . . . . . 72
7.8.2 SV models with non-zero correlation . . . . . . . . . . . . . . . . . . . 72
7.8.3 Multi-stochastic volatility extensions . . . . . . . . . . . . . . . . . . . 72
8 Single-rate vanilla derivatives 73
8.1 European swaptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
8.2 Caps and floors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
8.3 Terminal swap rate (TSR) models . . . . . . . . . . . . . . . . . . . . . . . . 74
8.3.1 Linear TSR models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
8.4 Libor-in-Arrears . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
8.5 Libor-with-Delay . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
8.6 CMS and CMS-Linked cash flows . . . . . . . . . . . . . . . . . . . . . . . . . 77
8.6.1 Linear TSR model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
8.6.2 Libor market model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
8.6.3 Quasi-Gaussian model . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
8.6.4 Correcting non-arbitrage-free methods . . . . . . . . . . . . . . . . . . 79
8.6.5 CDF and PDF of CMS rate in Forward measure . . . . . . . . . . . . 80
8.6.6 SV model for CMS rate . . . . . . . . . . . . . . . . . . . . . . . . . . 81
8.6.7 Dynamics of CMS rate in forward measure . . . . . . . . . . . . . . . 82
8.7 Quanto CMS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
8.8 Eurodollar futures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
8.8.1 The problem and the plan . . . . . . . . . . . . . . . . . . . . . . . . . 85
8.8.2 Expansion around futures value . . . . . . . . . . . . . . . . . . . . . . 85
8.8.3 Forward rate variances . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
8.8.4 Forward rate correlations . . . . . . . . . . . . . . . . . . . . . . . . . 86
9 Multi-rate vanilla derivatives 87
9.1 Dependence structure via copulas . . . . . . . . . . . . . . . . . . . . . . . . . 87
9.2 Gaussian copula method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
9.3 General couplas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
9.4 Copula methods for spread options . . . . . . . . . . . . . . . . . . . . . . . . 89
9.4.1 Power gaussian . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
9.5 Limitations of the copula method . . . . . . . . . . . . . . . . . . . . . . . . . 90
3
10 Callable Libor Exotics 92
10.1 Review of basic notions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
10.2 Model calibration for CLE’s . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
10.3 Valuation theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
10.4 Monte Carlo valuation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
10.4.1 Regression for the underlying . . . . . . . . . . . . . . . . . . . . . . . 95
10.4.2 Using regressed variables for decision only . . . . . . . . . . . . . . . . 96
10.4.3 Controlling the MC estimate bias . . . . . . . . . . . . . . . . . . . . . 97
10.4.4 Upper bound . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
10.4.5 Choice of regression variables . . . . . . . . . . . . . . . . . . . . . . . 97
10.5 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
10.6 Valuation with low-dimensional models . . . . . . . . . . . . . . . . . . . . . . 99
10.7 Suitable analogue for core swap rates . . . . . . . . . . . . . . . . . . . . . . . 100
11 Bermudan swaptions 102
11.1 Local projection method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
11.2 Amortizing, accreting and other non-standard Bermudan swaptions . . . . . . 103
11.2.1 Relating non-standard to standard swap rates . . . . . . . . . . . . . . 103
11.2.2 Representative swaption approach (payoff matching) . . . . . . . . . . 104
11.2.3 Basket approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
11.2.4 American swaptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
11.3 Flexi-swaps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
11.3.1 Purely local bounds . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
11.4 Fast pricing via exercise premia representation . . . . . . . . . . . . . . . . . 107
12 TARN’s, volatility swaps and other derivatives 108
12.1 TARN’s (targeted redemption notes) . . . . . . . . . . . . . . . . . . . . . . . 108
12.1.1 Local projection method . . . . . . . . . . . . . . . . . . . . . . . . . . 108
12.1.2 Effect of volatility smiles . . . . . . . . . . . . . . . . . . . . . . . . . . 109
12.2 Volatility swaps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
12.2.1 Volatility swaps with shout option . . . . . . . . . . . . . . . . . . . . 109
12.2.2 Min-Max volatility swaps . . . . . . . . . . . . . . . . . . . . . . . . . 110
12.3 Forward swaption straddles . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
13 Risk management 111
13.1 Value at risk . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
14 Payoff smoothing for Monte Carlo 113
14.1 Tube Monte Carlo for digital options . . . . . . . . . . . . . . . . . . . . . . . 113
14.2 Tube Monte Carlo for barrier options . . . . . . . . . . . . . . . . . . . . . . . 114
15 Pathwise differentiation 116
15.1 On the unsuitability of the likelihood ratio method in interest rate modeling . 117
15.2 CLE’s . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
15.3 Barrier options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
15.3.1 Digital option warm-up . . . . . . . . . . . . . . . . . . . . . . . . . . 119
15.3.2 Barrier options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
15.4 Pathwise differentiation for Monte Carlo based models . . . . . . . . . . . . . 120
4
16 Importance sampling and control variates 12116.1 Importance sampling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12116.2 Payoff smoothing via importance sampling . . . . . . . . . . . . . . . . . . . . 124
16.2.1 Binary options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12416.2.2 TARN’s . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
16.3 Control variates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12616.3.1 Model-based control variates . . . . . . . . . . . . . . . . . . . . . . . 12816.3.2 Instrument-based control variates . . . . . . . . . . . . . . . . . . . . . 13016.3.3 Dynamic control variates . . . . . . . . . . . . . . . . . . . . . . . . . 131
5
1 Fundamentals of interest rate modeling
1.1 Fixed income notations
Some notations:
• P (t, T ): time-t price of a zero-coupon bond (ZCB) delivering $1 at time T ≥ t.• P (t, T, T + τ) = P (t,T+τ)
P (t,T ) : time-t forward price for the ZCB spanning [T, T + τ ]1.
• y(t, T, T + τ): continuously compounded yield, defined by
e−y(t,T,T+τ)τ = P (t, T, T + τ)
• L(t, T, T + τ) simple forward rate, defined by
1 + τL(t, T, T + τ) =1
P (t, T, T + τ)=⇒ L(t, T, T + τ) =
1
τ
[P (t, T )
P (t, T + τ)− 1
]L(t, T, T + τ) will generally be the Libor rates quoted in the interbank market.
• f(t, T ) is the time-t instantaneous forward rate
f(t, T ) = limτ→0
L(t, T, T + τ)
• Relation between f(t, T ) and P (t, T, T + τ)
P (t, T, T + τ) = exp
(−∫ T+τ
T
f(t, u) du
), f(t, T ) = − ∂
∂TlnP (t, T )
y(t, T, T + τ) =1
τ
∫ T+τ
T
f(t, u) du, f(t, T ) = limτ→0
y(t, T, T + τ)
• r(t) = f(t, t) short/spot rate.
• F (t, T, T + τ) futures rate.
1.2 Fixed income probability measures
We outline some of the main measures used in the study of fixed-income and list some of theirproperties. Let V (t) denote the time-t value of a derivative security with payout functionV (T ).
1.2.1 Risk neutral measure
The numraire defining the measure Q is the continuously compounded money market accountβ(t).
dβ(t) = β(t)dtβ(0) = 1
=⇒ β(t) = e∫ t0r(u) du
In the absence of arbitrage the numraire deflated process V (t)β(t) must be a martingale,
which yields the risk-neutral pricing formula
V (t) = EQt
[V (T )e−
∫ Ttr(u) du
]1This is the time-T purchase price of a bond maturing at T + τ , fixed at time t < T . To see this, at time t we
can buy a (T + t)-maturity bond for P (t, T + τ) and short P (t,T+τ)P (t,T )
T -maturity bonds, which costs us 0. At time
T , we close the short position on the T -maturity bond by paying P (t,T+τ)P (t,T )
· P (T, T ) to the buyer, and at time
T + τ we receive $1 for our (T + t)-maturity bond.
6
In particular, for V (T ) = 1 we obtain the price of a ZCB
P (t, T ) = EQt
[e−
∫ Ttr(u) du
]1.2.2 T-forward measure
The T-forward measure QT uses a T -maturity ZCB as numraire. The risk-neutral pricingformula in this case reads
V (t) = P (t, T )ETt [V (T )]
In the absence of arbitrage, the forward Libor rate L(t, T, T + τ) is a QT -martingale suchthat
L(t, T, T + τ) = ET+τt [L(T, T, T + τ)] , t ≤ T
1.2.3 Spot measure
Consider a tenor structure 0 < T0 < · · · < TN . Sometimes it is convenient to introduce anumraire that can be extended to arbitrary horizons by compounding. Consider the followingsituation:
(i) Time 0: invest $1 in 1P (0,T1) T1-maturity discount bonds, returning an amount
1
P (0, T1)= 1 + τ0L(0, 0, T1)
(ii) Time T1: Re-invest this amount in T2-maturity bonds, returning
1
P (0, T1)
1
P (T1, T2)= [1 + τ0L(0, 0, T1)][1 + τ1L(T1, T1, T2)]
(iii) Iterating this we obtain a process
B(t) = P (t, Ti+1)
i∏n=0
[1 + τnLn(Tn)] , Ti < t ≤ Ti+1
The spot measure QB is the measure induced by this process and the time-t value of asecutiry in this measure is by definition
V (t) = EBt
[V (T )
B(t)
B(T )
],
B(t)
B(T )=
j∏n=i+1
[1 + τnLn(Tn)]−1 P (t, Ti+1)
P (t, Tj+1)
for Ti < t ≤ Ti+1 and Tj < T ≤ Tj+1.
1.2.4 Swap measures
Given a tenor structure 0 ≤ T0 < T1 < · · · < TN , for k,m satisfying 0 ≤ k < N andk +m ≤ N we define:
(i) Annuity factor:
Ak,m(t) =
k+m−1∑n=k
P (t, Tn+1)τn, τn = Tn+1 − Tn
7
(ii) Swap rate:
Sk,m(t) =P (t, Tk)− P (t, Tk+m)
Ak,m(t)=
∑k+m−1n=k τnP (t, Tn+1)Ln(t)
Ak,m(t), t ≤ Tk
Being a combination of zero-coupon bonds, an annuity factor Ak,m(t) qualifies as a num-raire. Denoting by Qk,m the corresponding measure and by Ek,m the corresponding expec-tation, in the absence of arbitrage we have
V (t) = Ak,mEk,m
[V (T )
Ak,m(T )
]
1.3 The HJM analysis
The HJM framework attempts to model how an entire continuum of T -indexed bond pricesP (t, T ) jointly evolves through time, staring from a known condition P (0, T ).
In the absence of arbitrage, the deflated bond values are Q-martingales, so if Pβ(t, T ) =P (t,T )β(t) , by the martingale representation theorem there exists a d-dimensional stochastic
process σT (t, T ) with σP (T, T ) = 0 such that
dPβ(t, T ) = −Pβ(t, T )σP (t, T )tdW (t), t ≤ T
By It’s lemma and the previous SDE it follows that
dP (t, T )
P (t, T )= r(t)dt− σP (t, T )tdW (t)
By It’s lemma it also follows that the forward bond prices satisfy
dP (t, T, T + τ)
P (t, T, T + τ)= −[σP (t, T + τ)− σP (t, T )]tσP (t, T )dt− [σP (t, T + τ)− σP (t, T )]tdW (t)
In the T-forward measure, P (t, T, T + τ) is a martingale, so we must have
dP (t, T, T + τ)
P (t, T, T + τ)= −[σP (t, T + τ)− σP (t, T )]tdWT (t)
where WT (t) is a QT -Brownian motion.
Comparison of the last two expressions yields
dWT (t) = dW (t) + σP (t, T )dt
which by Girsanov’s theorem identifies the Radon-Nikodym process for the measure shift
ξ(t) = EQt
[dQT
dQ
]= Et(σP ·W ) = exp
(−∫ t
0
||σP (s, T )||2 ds−∫ t
0
σP (s, T ) dW (s)
)HJM models are traditionally stated in terms of instantaneous forward rates f(t, T ).
Once again by It’s lemma we have (omitting the drift term)
d lnP (t, T ) = O(dt)− σP (t, T )tdW (t)
Differentiating with respect to T on both sides and recalling that f(t, T ) = −∂P (t,T )∂T we have
df(t, T ) = µf (t, T ) +∂
∂TσP (t, T )t︸ ︷︷ ︸
:=σf (t,T )t
dW (t)
Concretely, we have the following:
8
Lemma 1.1. The process for f(t, T ) in the T-forward measure is
df(t, T ) = σf (t, T )tdWT (t)
and in the risk-neutral measure the process is
df(t, T ) = σf (t, T )tσP (t, T )dt+ σf (t, T )tdW (t)
= σf (t, T )t∫ T
t
σf (t, u) dudt+ σf (t, T )tdW (t)
We thus have:
• The HJM model is fully specified once the forward rate diffusion coefficients σf (t, T )have been specified for all t and T .
• The initial forward rates f(0, T ) are taken as exogenous inputs.
• The model suffers from an obvious dimensionality problem: in order to describe thetime-t state of a discount bond curve spanning [t, T ] we need to keep track of a contin-uum of forward rates f(t, u)t≤u≤T .
The short rate process in measure Q for the HJM model is
r(t) = f(t, t) = f(0, t) +
∫ t
0
σf (u, t)t∫ t
u
σf (u, s) ds du+
∫ t
0
σf (u, t)t dW (u) (1)
It is generally not Markovian: for the path-dependent term
D(t) :=
∫ t
0
σf (u, t)t dW (u)
we must have
D(T ) = D(t) +
∫ T
t
σf (u, T )> dW (u) +
[∫ t
0
σf (u, T ) dW (u)−∫ t
0
σf (u, t)> dW (u)
](2)
whereby
EQ [D(T )|D(t)] 6= EQt [D(T )]
unless the bracketed term in (2) is either non-random or a deterministic function of D(t).
1.3.1 The Gaussian model
If σP (t, T ) is assumed to be a bounded d-dimensional deterministic function, then forwardbond prices P (t, T, T + τ) are log-normally distributed and r(T ) = f(T, T ) is Gaussian withQ-moments
EQt [f(T, T )] =
∫ T
t
σf (u, T )tσP (u, T ) du, V arQt [f(T, T )] =
∫ T
t
σf (u, T )tσf (u, T ) du
Propositions 4.5.1, 4.5.2 and 4.5.3 in [AP10-1] illustrate how to easily price options onZCB’s, caplets and futures rates in the Gaussian HJM model.
9
1.3.2 Gaussian model with Markovian short rate
Caverhill showed in 1994 that the short rate process can be made Markovian by imposing aseparability condition on the deterministic forward rate volatility function, namely
σf (t, T ) = g(t)h(T ) (3)
where h : R→R is positive and g : R→Rd×1. Using (3), the short rate expression (1) becomes
r(t) = f(0, t) + h(t)
∫ t
0
g(u)>g(u)
(∫ t
u
h(s) ds
)du+ h(t)
∫ t
0
g(u)> dW (u)︸ ︷︷ ︸not= D(t)
(4)
and importantly, the term D(t) is now Matkov since
D(T ) =h(T )
h(t)D(t) + h(T ) +
∫ T
t
g(u)> dW (u)
The following proposition shows that under the separability condition (3), the short rateprocess is the solution of an SDE, and is hence Markovian.
Proposition 1.2 (Markovian short rate under separable Gaussian HJM). In the d-dimensionalGaussian HJM model2 and under the separability condition (3)
σf (t, T ) = g(t)h(T )
the short rate process satisfies an SDE
dr(t) = (a(t)− κ(t)r(t)) dt+ σr(t)>dW (t)
where
h(T ) = e−∫ T0κ(s) ds,
g(t) = e∫ t0κ(s) dsσr(t),
a(t) =∂f(0, t)
∂t+ κ(t)f(0, t) +
∫ t
0
σf (u, t)> du
2Namely, under the assumption that σP (t, T ) is a bounded (d-dimensional) deterministic function.
10
2 Vanilla models with local volatility
Local or deterministic volatility models are one-factor diffusive models where our ability toalter the terminal distribution stems from a single source: a swap rate dependent function.
• Via copula methods, vanilla models can be extended to describe joint terminal distri-butions of more than one rate.
• Vanilla models serve as a foundation for the development of more widely applicable fullterm structure models.
2.1 General framework
Notation:
• S(t): forward Libor rate
• P : measure under which S(t) is a martingale satisfying the SDE
dS(t) = λϕ(S(t))dW (t), λ = const, ϕ(0) = 0 (5)
• W (t): one-dimensional Brownian motion under P .
The role of the function ϕ is to match the distribution of S to that observed throughcalls and puts traded in the market. Remember that the time-t price of a European callstruck at K maturing at T is given by rhe risk-neutral pricing formula
c(t, S(t);T,K) = Et[(S(T )−K)+
]The time-t implied volatility function σB(t, S(t);T,K) is the implicit solution to the
Black-Scholes formula
c(t, S;T,K) = SN (d+)−KN (d−), d± =ln(S/K)± 1
2σ2B(t, S;T,K)(T − t)
σB(t, S;T,K)√T − t
The mapping T 7→ σB(t, s;T,K) is the T-maturity volatility smile.
2.2 The CEV model
The constant elasticity of variance (CEV) model is a local volatility model (5) withϕ(S) = Sp.
Proposition 2.1. See [AP10-1, Proposition 7.2.1]. Consider the SDE
∂
∂S(t) = λS(t)pdW (t), p > 0 (6)
1. All solutions of (6) are non-explosive.
2. For p ≥ 12 , the SDE (6) has a unique solution3.
3. For 0 < p < 1, S = 0 is an attainable boundary for (6); for p ≥ 1, S = 0 is anunattainable boundary for (6).
4. For 0 < p ≤ 1, S(t) in (6) is a martingale; for p > 1, S(t) is a strict supermartingale.
The transition probability of S(t) in (6) is known in closed form (see Lemma 7.2.4 in[AP10-1]).
3Hence, if the solution ever reaches S = 0, it must stay there (i.e. it is absorbed). If 0 < p < 12, a boundary
condition at S = 0 needs to be specified in order to get a unique solution.
11
2.2.1 Valuation of European call options in the CEV model
2.2.2 Regularization
The CEV process implies a positive probability of absorption at S = 0 for p < 1, whichmight create difficulties in pricing more exotic structures. To avoid absorption, we canspecify a regularized version of the CEV model by letting
ϕ(x) = xminεp−1, xp−1, ε > 1, p < 1
With ϕ(x) now being Lipschitz, the process S(t) can no longer reach the origin.
• The above specification might not allow for closed-form option pricing.
• For moderate (small) values of ε, Andersen and Andreasen show that the formulas insubsection ?? provide good approximations (c.f. [AP10-1, Proposition 7.2.10]).
2.2.3 Displaced diffusion models
These are local volatility models (5) with ϕ(S) = (α+ S)p, for some constant α.
Setting Z(t) = α+ S(t), by It’s lema we see that Z(t) satisfies the CEV SDE
dZ(t) = λ[Z(t)]pdW (t)
and call option pricing is hence straightforward:
Proposition 2.2. Let
cdCEV (t, S(t);T,K, α) = Et[(S(T )−K)+
]be the call option price associated with the displaced CEV process. Then
cdCEV (t, S(t);T,K, α) = cCEV (t, S + α;T,K + α), S,K > −α
In practice, the main use of displacement constants is for the special case of the displacedlog-normal process where p = 1.
Proposition 2.3. Consider the displaced log-normal process
dS(t) = λ(β + ξS(t))dW (t), ξ, λ 6= 0
where W (t) is a 1-dimensional Brownian motion in measure P . Assuming S(t),K > − βxi ,
we have
cDLN (t, S(t);T,K) = Et[(S(T )−K)+
]=
(S(t) +
β
ξ
)N (d+)−
(K +
β
ξ
)N (d−), d± = · · ·
Remark 2.1. A first-order approximation to any local volatility process is ofdisplaced normal type. Indeed, given a general local volatility model (5)
dS(t) = λϕ(S(t))dW (t), λ = const, ϕ(0) = 0,
expanding the local volatility function ϕ around at-the-money to first order, we obtain
dS(t) ' λϕ(S(0)) + ϕ′(S(0))(S(t)− S(0))dW (t)
which is of displaced log-normal type
dS(t) = σ(bS(t) + (1− b)L)dW (t), σ = λϕ(S(0))
S(0), b = ϕ′(S(0))
S(0)
ϕ(S(0)), L = S(0)
12
2.3 Finite difference solutions
For general specifications of the function ϕ defining the local volatility model
dS(t) = λϕ(S(t))dW (t)
closed-form solutions for European option will not exist. One may instead try to obtaina numerical solution: for fixed T,K, the Feynman-Kac theorem translates the risk-neutralpricing formula
c(t, S(t)) = c(t, S(t);Y,K) = Et[(S(T )−K)+
]into a PDE
∂c(t, S)
∂t+
1
2λ2ϕ(S)2 ∂
2c(t, S)
∂S2= 0, c(T, S) = (S −K)+
whose solution can be approximated using finite difference methods.
2.3.1 Multiple λ and T
In applications we may need to compute the values of c(t, S;T,K) for many differentvalues od T and/or λ (e.g. within a standard calibration exercise, where a root-searchalgorithm is used to to determine the value of λ that minimizes the error between computedand market prices). Instead of solving the PDE multiple times, one may use the follwingresult:
Proposition 2.4. Let g(τ, x) be the solution to the PDE
−∂g(τ, x)
∂τ+
1
2ϕ(x)2 ∂
2g(τ, x)
∂x2= 0, g(0, x) = (x−K)+ (7)
If c(t, S;λ) denotes the solution to the backward PDE
∂c(t, S)
∂t+
1
2λ2ϕ(S)2 ∂
2c(t, S)
∂S2= 0, c(T, S) = (S −K)+ (8)
for a given value of λ then
c(t, S;T,K) = g(λ2(T − t), S) (9)
Hence, one may use finite differences construct a solution g to the PDE (7) on a (τ, S)-grid and then use (9) to recover c(t, S;T,K) for arbitrary choices of S, λ, T by simple look-upor interpolation.
2.4 Forward equation for call options
Instead, we may want to use different strikes for different values of T . In order toaccomplish this (again, without having to solve the entire PDE each time) we may employDupire’s forward equation, in which calendar time t and the initial spot S are consid-ered fixed, whereas maturity T and strike K are variable. As before, let c(t, S(t);T,K) =Et [(S(T )−K)+] denote the price of a European call.
Proposition 2.5 (Dupire, c.g. Proposition 7.4.2 in [AP10-1]). The function c(T,K) =c(t, S;T,K) (with t, S fixed) satisfies the forward PDE
−∂c(T,K)
∂T+
1
2λ2ϕ(K)2 ∂
2c(T,K)
∂K2= 0, c(t,K) = (S −K)+, t > T (10)
An analogous version of Proposition also holds in this setting.
13
2.5 Time-dependent local volatility function ϕ
So far we have assumed that S(t) is driven by an SDE
dS(t) = λϕ(S(t))dW (t), λ = const, ϕ(0) = 0 (11)
and we will now be extending our treatment to models for which ϕ(t, S(t)) depends on t aswell (not just through S(t)).
Even though vanilla models with time-dependent local volatility functions ϕ have limiteduse in fixed income modeling, they do arise as describing approximate dynamics of swap orLibor rates in term structure models.
2.5.1 Separable case
The case
dS(t) = λ(t)ϕ(S(t))dW (t)
is treated via a simple time change argument that reduces it to a model of the type (11).
Lemma 2.6. See [AP10-1, Proposition 7.6.1]. Define
τ(t) =
∫ t
0
λ2(u) du
and define s(·) by S(t) = s(τ(t)), where S(t) follows
dS(t) = λ(t)ϕ(S(t))dW (t)
Then
ds(τ) = ϕ(s(τ))dW (τ), s(0) = S(0)
where W is Brownian motion.
The valuation problem
c(t, S(t);T,K) = Et[(S(T )−K)+
]can hence be rewritten as
c(τ, s(τ);T,K) = E[(s(τ)−K)+|Fτ(t)
]which can be solved using the methods outlined earlier.
2.5.2 Skew averaging
In order to handle the general case
dS(t) = ϕ(t, S(t))dW (t), λ = const, ϕ(0) = 0 (12)
one can look for a time-independent local volatility function (a time average of the time-dependent function) that yields European option prices approximately matching prices fromthe time-dependent model. We first rewrite (12) as
dS(t) = λ(t)g(t, S(t))dW (t), g(t, x) =ϕ(t, x)
ϕ(t, S0), X0 = S(0), λ(t) = ϕ(t,X0)
14
The idea is to fix T > 0 and to derive conditions that a time-independent function g(x)needs to satisfy so that
dS(t) = λ(t)g(t, S(t))dW (t)
can be replaced with
dS(t) = λ(t)g(S(t))dW (t)
for the purposes of valuing T -expiry European options of all strikes.
For ε ≥ 0, define
gε(t, x) = g(t,X0 + ε(x−X0)), gε(x) = g(X0 + ε(x−X0))
and define 2 sets of processes
dXε(t) = λ(t)gε(t,Xε(t))dW (t), Xε(0) = X0
dY ε(t) = λ(t)gε(Y ε(t))dW (t), Y ε(0) = X0
The requirement that the distributions of Xε and Y ε be close can be formalized as finding
g(·) = argming : q(ε) := E
[(Xε(T )− Y ε(T ))2
]Expanding
q(ε) = q(0) + εq′(0) +1
2ε2q′′(0) +O(ε3)
one shows that q(0) = q′(0) = 0 so minimizing q(ε) is equivalent to minimizing q′′(0). Thenone has the following result:
Proposition 2.7. See [AP10-1, Proposition 7.6.2]. Any function g that minimizes q′′(0)must satisfy the condition
∂g
∂xg(X0) =
∫ T
0
∂g
∂x(t,X0)ωT (t) dt (13)
where
ωT (t) =v2(t)λ2(t)∫ T
0v2(t)λ2(t) dt
, v2(t) = E[(X0(0)−X0)2
]Typical examples of time-dependent local volatility functions g(t, x), scaled to satisfy
g(t, S(0)) = 1, are:
• Displaced log-normal function g(t, x) = b(t) xS(0) + (1− b(t)).
• CEV function g(t, x) =(
xS(0)
)p(t).
The condition (13) does not define the function g uniquely, so one generally seeks to finda function g within the same family of functions they approximate. For instance, for theabove examples we would look for functions
g(x) = bx
S(0)+ (1− b), g(x) =
(x
S(0)
)pConcretely, we have the following useful result:
Corollary 2.8. Over the time horizon [0, T ]:
15
1. The effective skew b in g(x) = b xS(0) +(1−b) for the model defined by the time-dependent
local volatility function g(t, x) = b(t) xS(0) + (1− b(t)) is given by
b =
∫ T
0
b(t)ωT (t) dt
where
ωT (t) =v(t)2λ(t)2∫ T
0v(t)2λ(t)2 dt
,
v(t) =
∫ t
0
λ(s)2 ds
2. Similarly, the effective parameter p in g(x) =(
xS(0)
)pfor the model defined by the
time-dependent local volatility function g(t, x) =(
xS(0)
)p(t)is given by
p =
∫ T
0
p(t)ωT (t) dt
with ωT (t) and v(t) as above.
3. If, furthermore, the volatility is constant λ(t) ≡ λ, we obtain
v(t)2 = λ2t, ωT (t) =t
T 2/2
16
3 Vanilla models with stochastic volatility
Here we focus on models which incorporate a new stochastic component for the volatility:
dS(t) = λ (bS(t) + (1− b)L)√z(t)dW (t) (14)
dz(t) = θ(z0 − z(t))dt+ η√z(t)dZ(t) (15)
E[dW (t)dZ(t)] = ρdt
3.1 Basic properties
Proposition 3.1 (Explosions).
Proposition 3.2 (Martingale property).
The stochastic equation (14) can be integrated explicitly:
Proposition 3.3. In the model (14)-(15), we have
S(t) =1
b[(bS(0) + (1− b)L)X(t)− (1− b)L]
where
lnX(t) = λb
∫ t
0
√z(s)dW (s)− 1
2λ2b2
∫ t
0
z(s) ds (16)
As we will see, the moment generating function lnX(t) is of fundamental importance forEuropean option pricing:
Proposition 3.4. Consider the moment generating function of lnX(t)
ΨX(u; t)def= E[eu lnX(t)]
In the model (14)-(15), for any u ∈ C we have
ΨX(u; t) = Ψz
(1
2(λb)2u(u− 1), u; t
)where
Ψz(v, u; t) = EP[evz(t)
], z(t) =
∫ t
0
z(s) ds
Here P is the measure defined by Et (uλb√z ·W ).
3.2 Fourier integration
3.2.1 Calculation of option prices when MGF is available for ln(S(t)).
Theorem 3.5. Let ξ be a random variable with moment generating function (MGF) χ(u) =E[euξ]. For k ∈ R and any4 0 < α < 1 we have
E[(eξ − ek)+
]= χ(1)− ek
2π
∫ ∞−∞
e−k(α+iw)χ(α+ iw)
(α+ iw)(1− α− iw)dw (17)
= χ(1)− ek
π
∫ ∞0
Re
[e−k(α+iw)χ(α+ iw)
(α+ iw)(1− α− iw)
]dw (18)
4Here α is a dampening constant which is introduced in the proof as a factor e−αk in order to guarantee thatthe Fourier transform exists.
17
Combining Theorem 3.5 (using a damping factor α = 12 ) with the closed-form expression
for the moment generating function in the SV model given in Proposition 3.4, we obtain anefficient formula for pricing European call and put options in the SV model (14)-(15).
Theorem 3.6. The price of a call option cSV (0, S;T,K) in the SV model (14)-(15) is givenby
cSV (0, S;T,K) =1
bcB(0, S′;T,K ′, λb)− K ′
2πb
∫ ∞−∞
e(1/2+iw) ln(S′/K′)q(
12 + iw
)w2 + 1
4
dw (19)
where cB(0, S′;T,K ′, σ) is the Black formula for spot S′, strike K ′, expiry T and volatilityσ, with
S′ = bS + (1− b)L, K ′ = bK + (1− b)L
and
q(u) = ΨZ
(1
2(λb)2u(u− 1), u;T
)− e 1
2λ2b2z0Tu(u−1)
where Ψz is given in Proposition 3.4.
3.2.2 Direct integration
In order to compute the integral inside equation (19)∫ ∞−∞
e(1/2+iw) ln(S′/K′)q(
12 + iw
)w2 + 1
4
dw
we first need to truncate the domain.
Lemma 3.7. Given that |ρ| < 1, the function
q(u) = ΨZ
(1
2(λb)2u(u− 1), u;T
)− e 1
2λ2b2z0Tu(u−1)
defined in Theorem 3.6 satisfies
limw→+∞
1
wln
[q(
1
2+ iw)
]= −q∞
where
q∞ =λbz0
η
(√1− ρ2 + iρ
)(1 + θT )
One then shows easily that∣∣∣∣∣∫ ∞wmax
e(1/2+iw) ln(S′/K′)q( 12 + iw
w2 + 14
dw
∣∣∣∣∣ ≤√S′
K ′e−Re(q∞)wmax
wmax
so if εw is the absolute tolerance (generally ε = 10−3, 10−6) one can set the upper truncationlimit wmax by the condition
e−Re(q∞)wmax
wmax= εw
We can then apply anu quadrature rule in order to discretize the integral. For instance, thetrapezoidal rule yields
· · ·
18
3.2.3 Fourier integration for arbitrary European Payoffs
Given a payoff function f(x), recall that the value of a European option with payoff f(S(T ))is given by5
E[f(S(T ))] =
∫f(x)ϕS(T )(x) dx =
∫f(x)
∂2c(0, S(0);T, x)
∂x2dx
where ϕS(T )(x) is the density function of S(T ) and c(0, S(0);T,K) is the European calloption value for S(·). Integrating by parts we obtain a useful representation of a generalEuropean payoff in terms of European calls and puts.
Proposition 3.8. For any twice-continuously differentiable function f(x), the value of aEuropean option with payoff f(·) maturing at T equals the weighted integral
E[f(S(T ))] = f(K∗) + f ′(K∗)(S(0)−K∗) +
∫ K∗
−∞p(0, S(0);T,K)f ′′(K) dK
+
∫ ∞K∗
c(0, S(0);T,K)f ′′(K) dK (20)
for any K∗.
Hence, in order to compute the value of a European option with arbitrary payoff we needto simultaneously compute call option prices for different strikes, and the FFT methodprovides an efficient way of accomplishing this. The integrals we need to evaluate are
I(K ′) =
∫ ∞−∞
e−(1/2+iw) ln(S′/K′)
w2 + 14
q
(1
2+ iw
)dw
for various K ′, where recall
S′ = bS + (1− b)L, K ′ = bK + (1− b)L
We discretize K ′ in such a way that ln( S′
K′ ) are equidistant. In particular, choosing stepδ > 0 we define
xn = δn, K ′n = S′exn , n = −N, . . . , Nso that
ln
(S′
K ′n
)= −xn, Kn =
(S +
1− bb
L
)exn − 1− b
bL
and then
Innot= I(K ′n) =
∫ ∞−∞
e−(1/2+iw)δn
w2 + 14
q
(1
2+ iw
)dw = e−0.5δnJn
Jn =
∫ ∞−∞
e−iwδnq(
12 + iw
)w2 + 1
4
dw
Once this is accomplished, the value of an option with general payoff f(·) is obtained bydiscretizing the integrals in equation (20).
5The price of a call option is given by C(S,K) =∫∞KϕS(T )(x)(x − K) dx Differentiating with respect to K
twice we obtain
∂C(S,K)
∂K=
∂
∂K
[∫ ∞K
ϕS(T )(x) dx−K∫ ∞K
ϕS(T )(x) dx
]= −
∫ ∞K
ϕS(T )(x) dx
∂2C(S,K)
∂K2= ϕS(T )(K)
19
Proposition 3.9. Fix δ > 0. Let Kn,K′n be defined by
Kn =
(S +
1− bb
L
)exn − 1− b
bL, K ′n = S′exn , n = −N, . . . , N
The value of a call option with payoff f(·) at time T in the SV model (14)-(15) is ap-proximately given by
E[f(S(T ))] = f(S(0)) +
−1∑n=−N
pSV (0, S(0);T,Kn)f ′′(Kn)(Kn+1 −Kn)
+
N−1∑n=0
cSV (0, S(0);T,Kn)f ′′(Kn)(Kn+1 −Kn)
where
cSV (0, S(0);T,Kn) =1
bcB(0, S′;T,K ′n;λb)− K ′n
2πbe−0.5δnJn
pSV (0, S(0);T,Kn) =1
bpB(0, S′;T,K ′n;λb)− K ′n
2πbe−0.5δnJn
with JnNn=−N evaluated by an inverse FFT transform of the function
q(
12 + iw
)w2 + 1
4
, q(u) = ΨZ
(1
2(λb)2u(u− 1), u;T
)− e 1
2λ2b2z0Tu(u−1)
Remark 3.1 (FFT reminder). The discrete Fourier transform maps
x = (x1, . . . , xN ) 7→ x = (x1, . . . , xN ),
xk =
∑Nj=1 e
−i 2πN (j−1)(k−1)xj
xk = 1N
∑Nj=1 e
i 2πN (j−1)(k−1)xj
(21)
Computing these sums independently would requireN2 operations but the FFT algorithmaccomplishes this with only N log2(N) operations. The idea is to reduce the computation ofan integral to the computation of an inverse Fourier transform to which the FFT algorithmcan be applied. We seek to compute the integrals
Jn = 2
∫ ∞0
Re
[e−iwδn
q(
12 + iw
)w2 + 1
4
]dw, n = −N, · · · , N
In order to do so we truncate the domain of integration to [0, wmax] according to thediscussion in section 4.2.2 and we discretize the integral using, for instance, the trapezoidalrule∫ b
a
f(w) dw =
Nw∑k=0
αkf(wk), wk = kη, η =b− aNw
, αk =
1/2, k = 0, N1, 2 ≤ k ≤ N − 1
Our discretized integral hence becomes
Jn ' 2
Nw∑k=0
αkRe
[e−iwkδn
q(
12 + iwk
)w2k + 1
4
]=
Nw∑k=0
αkRe
[e−iηδkn
q(
12 + ikη
)(kη)2 + 1
4
], n = −N, · · · , N
20
and choosing η such that ηδ = 2πNw
our integrals become
Jn ' 2
Nw∑k=0
αkRe
[e−i
2πNw
kn q(
12 + ikη
)(kη)2 + 1
4
], n = −N, · · · , N
and from (21) we can identify the Jn’s as the inverse Fourier transforms of the xk’s where
xk =q( 1
2 +ikη)(kη)2+ 1
4
.
3.3 Averaging methods for models with time-dependent parameters
Models with time-dependent parameters emerge naturally when vanilla models are used toapproximate interest rate dynamics in a full term structure model. Our goal here is two-fold:
1. Given a stochastic volatility model with time dependent parameters
dS(t) = λ(t) [b(t)S(t) + (1− b(t))L]√z(t)dW (t) (22)
dz(t) = θ(z0 − z(t))dt+ η(t)√z(t)dZ(t) (23)
E[dW (t)dZ(t)] = ρdt
we find a constant time-averaged volatility λ, skew b and volatility of variance η suchthat the model
dS(t) = λ[bS(t) + (1− b)L
]√z(t)dW (t) (24)
dz(t) = θ(z0 − z(t))dt+ η√z(t)dZ(t) (25)
E[dW (t)dZ(t)] = ρdt
produces the same European option prices as the model (22)-(23).
2. We illustrate how to employ time-averaging techniques to calibrate the SV model withtime-dependent parameters (22)-(23) to quoted market prices.
3.3.1 Volatility averaging
We start by letting the volatility λ(t) be time-dependent, while holding the skew b and thevolatility of variance η constant.
Theorem 3.10 (effective volatility approximation). Values of European options with expiryT in the model
dS(t) = λ(t) [bS(t) + (1− b)L]√z(t)dW (t)
dz(t) = θ(z0 − z(t))dt+ η√z(t)dZ(t)
E[dW (t)dZ(t)] = ρdt
are well approximated by their values in the model
dS(t) = λ [bS(t) + (1− b)L]√z(t)dW (t)
dz(t) = θ(z0 − z(t))dt+ η√z(t)dZ(t)
E[dW (t)dZ(t)] = ρdt
where the effective SV volatility λ solves the equation
Ψz
(h′′(ζT )
h′(ζT )λ2, 0;T
)= Ψzλ2
(h′′(ζT )
h′(ζT )λ2, 0;T
)(26)
21
where
ζT = z0
∫ T
0
λ(t)2 dt, h(x) =bS(0) + (1− b)L
b
[2N
(b√x
2
)− 1
]and where Ψz and Ψzλ2 are moment-generating functions given by [AP10-1, Proposition8.3.8] and [AP10-1, Proposition 9.1.2] respectively (see remark below for details).
Remark 3.2. The LHS of equation (26) can be computed in closed form via [AP10-1,Proposition 8.3.8]:
Ψz(v, u;T ) = eA(v,u)+B(v,u)z0 ,
A(v, u) =θz0
η2
[2 ln
(2γ
θ′ + γ − e−γT (θ′ − γ)
)+ (θ′ − γ)T
],
B(v, u) =2v(1− e−γT )
(θ′ + γ)(1− e−γT ) + 2γe−γT,
γ = γ(v, u) =√
(θ′)2 − 2η2v,
θ′ = θ′(u) = θ − ρηλbu
As for the RHS, assuming that λ(t) is piecewise constant λ(t) =∑ni=1 λiI(ti−1,ti](t) for
some 0 = t0 < · · · < Tn = T , then on each interval (ti−1, ti] we can apply the closed-formsolution from the previous paragraph, and piece all these together.
Equation (26) can then be solved in a couple of iterations of the Netwon-Raphson method.
3.3.2 Skew averaging
We now also allow the skew b to be time-dependent and we find an effective skew b.
Proposition 3.11 (effective skew approximation). Values of European options with expiryT in the model
dS(t) = λ(t) [b(t)S(t) + (1− b(t))L]√z(t)dW (t)
dz(t) = θ(z0 − z(t))dt+ η√z(t)dZ(t)
E[dW (t)dZ(t)] = ρdt
are well approximated by their values in the model
dS(t) = λ(t)[bS(t) + (1− b)L
]√z(t)dW (t)
dz(t) = θ(z0 − z(t))dt+ η√z(t)dZ(t)
E[dW (t)dZ(t)] = ρdt
where the effective SV skew b is given by
b =
∫ T
0
b(t)ωT (t) dt
where the weights are given by
ωT (t) =v(t)2λ(t)2∫ T
0v(t)2λ(t)2 dt
,
v(t)2 = z20
∫ t
0
λ(s)2 ds+ z0η2e−θt
∫ t
0
λ(s)2 eθs − e−θs
2θds
22
3.3.3 Volatility of variance averaging
Finally we also allow the volatility of variance to be time-dependent.
Proposition 3.12 (effective volatility of variance approximation). Values of European op-tions with expiry T in the model
dS(t) = λ(t) [b(t)S(t) + (1− b(t))L]√z(t)dW (t)
dz(t) = θ(z0 − z(t))dt+ η(t)√z(t)dZ(t)
E[dW (t)dZ(t)] = ρdt
are well approximated by their values in the model
dS(t) = λ(t) [b(t)S(t) + (1− b(t))L]√z(t)dW (t)
dz(t) = θ(z0 − z(t))dt+ η√z(t)dZ(t)
E[dW (t)dZ(t)] = ρdt
where the effective SV volatility of variance η is given by
η2 =
∫ T0η(t)2ρT (t) dt∫ T0ρT (t) dt
where the weight function is given by
ρT (u) =
∫ T
u
ds
∫ T
s
dtλ(t)2λ(s)2e−θ(t−s)e−2θ(s−r).
3.3.4 Calibration by parameter averaging
Given:
• A collection of expiries is given 0 = T0 < · · · < TN .
• A collection of strikes K1, . . . ,KM .
• Market values of European call options with expiries Tn and strikes Km, namely
cn,m, n = 1, . . . , N, m = 1, . . . ,M
The goal is to find time-dependent model parameters λ(t), b(t) and η(t) such that themodel
dS(t) = λ(t) [b(t)S(t) + (1− b(t))L]√z(t)dW (t)
dz(t) = θ(z0 − z(t))dt+ η(t)√z(t)dZ(t)
E[dW (t)dZ(t)] = ρdt
values the above options as close as possible. For instance, if X = λ(·), b(·), η(·) denotesthe state of the model and cn,m(X ) are the model option prices, we seek to find
λ(·), b(·), η(·)arg min∑n,m
(cn,m(X )− cn,m)2
Alternatively, we can work with market parameter values: denote by λ, b, η, n =1, . . . , N determined in such a way that the market prices of European options expiringat time Tn, namely cn,mMm=1 match the prices obtained in the model
dS(t) = λn
[bnS(t) + (1− bn)L
]√z(t)dW (t)
dz(t) = θ(z0 − z(t))dt+ ηn√z(t)dZ(t)
23
Denote by λ(X ), b(X ), η(X ) the averaged parameters to time Tn for the time-dependent
model (the results in the previous section relate these to λ, b, η). We can then insteadconsider the following more convenient minimization problem
λ(·), b(·), η(·) = arg min
Wλ
∑n,m
(λn(X )− λn
)2
+Wb
∑n,m
(bn(X )− bn
)2
+Wη
∑n,m
(ηn(X )− ηn)2
with appropriate user-specified weights Wλ, Wb and Wη.
Calibration can then be split into independent sub-calibrations as follows. Recall thethree averaging results we outlined in the previous section:
ηn(X )2 =
∫ Tn0
η(t)2ρTn(t, λ(·)) dt∫ Tn0
ρTn(t, λ(·)) dt, n = 1, . . . , N
bn(X ) =
∫ Tn
0
b(t)ωTn(t;λ(·), η(·)) dt, n = 1, . . . , N
λn(X ) = F(λ(·), bn(X ), ηn(X )
), where
F (λ(·); bn(X ), ηn(X )) =
√h′(ζTn)
h′′(ζTn)·Ψ−1
z
(Ψzλ2
(h′(ζTn)
h′′(ζTn), 0;T
)), 0;T
Assume the model parameters are piecewise constant between option expiry dates:
λ(t) =
N∑i=1
λiI(Ti−1,Ti](t), b(t) =
N∑i=1
biI(Ti−1,Ti](t), η(t) =
N∑i=1
ηiI(Ti−1,Ti](t)
We can then also discretize the weights as
ρTn(t, λ(·)) =
n∑i=1
ρn,i(λ(·))I(Ti−1,Ti](t),
ωTn(t;λ(·), η(·)) =
n∑i=1
ωn,i(λ(·), η(·))I(Ti−1,Ti](t)
and denote
ρn,i(λ(·)) =ρn,i(λ(·))∫ Tn
0ρTn(y;λ(·)) dt
We are thus reduced to solving the three systems of equations
n∑i=1
ρn,i(λ(·))(Ti − Ti−1)η2i = η2
n, (27)
n∑i=1
ωn,i(λ(·), η(·))(Ti − Ti−1)bi = bn, (28)
F(λ(·); bn(X ), ηn(X )
)= λn (29)
for n = 1, . . . , N .
This system can be solved iteratively as follows:
24
(i) We first solve equations (29) by replacing the model parameters bn(X ), ηn(X ) withtheir market values, namely
F(λ(·); b, η
)= F
(λ1, . . . , λn; b, η
)= λn, n = 1, . . . , N
which is a system of N decoupled one-dimensional equations. For instance, the n-thequation reads
F(λ∗1, . . . , λ
∗n−1, λn; b, η
)= λn
where λ∗i denote model parameters already solved for.
(ii) We next solve the system (27) for η2i , i = 1, . . . , N , using the values of λi found in (i),
namelyn∑i=1
ρn,i(λ∗(·))(Ti − Ti−1)η2
i = η2n, n = 1, . . . , N
This also be done sequentially.
(iii) Finally we solve the linear system
n∑i=1
ωn,i(λ∗(·), η∗(·))(Ti − Ti−1)bi = bn, n = 1, . . . , N
for bi, again sequentially.
25
4 One-factor short rate models
The general HJM class with its infinite-dimensional Markovian dynamics is to unwieldy towork in practice, so we seek to identify HJM model subclasses that involve a finite numberof Markovian state variables only.
We start by reviewing the notions of swaps and swaptions. We then outline the mostbasic one-factor models and show how to calibrate them to current prices of ZCB’s andswaptions. We also illustrate how to use these models to price European derivatives.
4.1 Review of Swaps and swaptions
A swap is a generic term for an OTC derivative in which two counterparties agree to exchangeone stream (leg) of cash flows against another stream. A fixed-for-floating interest rate swapis a swap in which one leg is a stream of fixed rate payments and the other is a stream ofpayments based on a floating rate, generally Libor.
Consider an increasing sequence of maturity times (the tenor structure)
0 ≤ T0 < T1 < · · · < TN , τn = Tn+1 − Tn
From the perspective of the fixed rate payer, the net cash flow of the swap at time Tn+1,per unit of notional, is
τn(Ln(Tn)− k), Ln(t) = L(t, Tn, Tn+1)
By the fundamental valuation result, the value os a swap is hence equal to
Vswap(t) = β(t)
N−1∑n=1
τnEt
[1
β(Tn+1)(Ln(Tn)− k)
]
=
N−1∑n=0
[P (t, Tn)− P (t, Tn+1)− τnkP (t, Tn+1)]
=
N−1∑n=0
τnP (t, Tn+1)(Ln(t)− k)
= A(t) [S(t)− k]
whereA(t) =, S(t) =
are the annuity and the forward rate o the swap, respectively. We hence observe that avanilla fixed-floating swap can be valued at time t using only the term structure of interestrates observed on that date.
A European swaption is an option giving the holder the right, but not he obligation, toenter a swap at a future date at a given fixed rate. Assuming the underlying swap starts onthe expiry date T0 of the option, for a payer swap we have
Vswaption(T0) = [Vswap(T0)]+
=
(N−1∑n=0
τnP (T0, Tn+1)(Ln(T0)− k)
)+
Vswaption(t) = β(t)Et
[1
β(T0)Vswaption(T0)
]
= β(t)
(1
β(T0)
N−1∑n=0
τnP (T0, Tn+1)(Ln(T0)− k)
)+
26
4.2 The Ho-Lee model
Assume that the short rate follows an SDE
dr(t) = σrdW (t) =⇒ r(t) = r(0) + σrW (t)
With only 2 parameters r(0) and σr it is not possible to match the observable discount bondprices, so instead we may consider
r(t) = r(0) + a(t) + σrW (t), a(0) = 0
Choosing a(t) accordingly, we can match the discount bond curve at time 0:
Lemma 4.1 (definition of the Ho-Lee model). Assume
r(t) = r(0) + a(t) + σrW (t),
a(t) = f(0, t)− r(0) +1
2σ2r t
2. [Ho-Lee model]
f(0, t) = −∂f(0, t)
∂t
Then for any T > 0 we have
Et
[e−
∫ T0r(u) du
]= P (0, T )
This model can be characterized further:
Lemma 4.2 (characterization of the Ho-Lee model). In the Ho-Lee model, the risk-neutralprocess for r(t) and the time-t bond prices are respectively given by
dr(t) =(∂tf(0, t) + σ2
r t)dt+ σrdW (t)
P (t, T ) =P (0, T )
P (0, t)exp
(−(r(t)− f(0, t))(T − t)− 1
2σtrt(T − t)2
)
A few observations are in order:
• The Ho-Lee model could have been specified as an HJM model with constant forwardrate volatility
σf (t, T ) = σr, σP (t, T ) =
∫ T
t
σf (t, u) du = σr(T − t)
• The constancy of σ(t, T ) is unrealistic and provides too few degrees of freedom forcalibration to quoted option prices.
• Interest rates are Gaussian, so they can become negative.
• With only one driving Brownian motion, the instantaneous moves of all forward ratesare perfectly correlated (this is obviously common to all one-factor models.
4.3 The mean-reverting GSR model
Empirical studies suggest that interest rates exhibit mean-reversion. For instance,Vasicek assumed that interest rates are governed by an SDE
dr(t) = κ(θ − r(t))dt+ σrdW (t)
27
Integrating the linear SDE it follows that
r(t) = θ + (r(0)− θ)e−κt + σr
∫ t
0
e−κ(t−s) dW (s)
so r(t) is a Gaussian random variable with moments
E[r(t)] = θ + [r(0)− θ]e−κt
V ar[r(t)] =σ2r
2κ(1− e−2κt)
Allowing all constants to be time-dependent in the Vasicek model we obtain the generalshort rate (GSR) model
dr(t) = κ(t) [θ(t)− r(t)] dt+ σr(t)dW (t)
Recall (c.f. Lemma 4.5.4 in [AP10-1]) that within an HJM setting we have
df(t, T ) = σf (t, T )t∫ T
t
σf (t, u) dudt+ σf (t, T )tdW (t)
σf (t, T ) = σr(t) exp
(−∫ T
t
κ(u) du
)
and in order to match the initial yield curve, we must have
θ(t) =1
κ(t)
∂f(0, t)
∂t+ f(0, t) +
1
κ(t)
∫ t
0
e−2∫ tuκ(s) dsσ2
r(u) du (30)
The term ∂f(0,t)∂t can be a nuisance in applications where the initial forward curve is not
smooth. In order to get rid of it we can perform a change of variables.
Lemma 4.3. Defining
x(t) := r(t)− f(0, t)
the GSR model becomes
dx(t) = [y(t)− κ(t)x(t)] dt+ σr(t)dW (t),
x(0) = 0,
y(t) =
∫ t
0
e−2∫ tuκ(s) dsσ2
r(u) du
The bond reconstitution formula is
P (t, T ) =P (0, T )
P (0, t)exp
(−x(t)G(t, T )− 1
2y(t)G2(t, T )
)(31)
G(t, T ) =
∫ T
t
e−∫ utκ(s) ds du
Remark 4.1. Note that the term G(t, T ) can be expressed as
G(t, T ) = [G(0, T )−G(0, t)] e∫ t0κ(s) ds
which is useful in grid-based numerical work.
28
4.3.1 Swaption pricing
Assume that the function κ(t) is fixed exogenously (based on empirical observations). By(30), the function θ(t) is then uniquely fixed by the initial forward curve. In order to havea full model specification, it thus suffices to determine the function σr(t), which in practiceis done by calibration of the model to observed prices of liquid European options (i.e. capsand swaptions). In any Gaussian HJM model, caplets can be priced by simple Black-Scholesformulas, so:
• In this subsection we focus on pricing European swaptions in two different ways.
• In the next subsection we use these formulas to calibrate the model (i.e. to find σr(t)).
4.3.1.1 Jamshidian decomposition
For a payer swaption expiring at time T0 with the underlying swap paying an annualizedcoupon c at times T1 < · · · < TN , the swaption payout at time T0 is
Vswaption(T0) =
(1− P (T0, TN )− c
N−1∑i=0
τiP (T0, Ti+1)
)+
, τi = Ti+1 − Ti
Jamshidian re-writes the swaption payout from an option on a sum of discount bonds toa sum of options on discount bonds.
First write P (T0, TN ) = P (T0, TN ;x(T0)) and let x∗ be the critical value for which theswap at time T0 is exactly zero. This can be found by numerical root search on the equation
P (T0, TN , x∗) + c
N−1∑i=0
τiP (T0, Ti+1, x∗) = 1
Also define the strikes
Ki = P (T0, Ti, x∗)
From the expression of P (t, T ) in terms of x(t) it follows that P (T0, Ti, x(T0)) is monotonicallydecreasing in x(T0), the swaption only pays out a positive amount if x(T0) > x∗. After afew computations one concludes
Vswaption(T0) =
(1− P (T0, TN , x(T0))− c
N−1∑i=0
τiP (T0, Ti+1, x(T0))
)1x(T0)>x∗
= (KN − P (T0, TN , x(T0)))+
+ c
N−1∑i=0
τi ((Ki+1 − P (T0, Ti+1, x(T0)))+(32)
so the swaption payout has now been decomposed into N+1 put options on ZCB’s, whichallows us to price the swaption in closed form.
• The use of root-search algorithms and the need to price a potentially large amount ofZCB’s can be cumbersome.
• Alternative: Examine forward swap rate in appropriate annuity measure, introducingapproximations as needed to make the dynamics tractable (see next subsection).
29
4.3.1.2 Swap rate approximation
Consider again a payer swaption expiring at time T0 with the underlying swap paying anannualized coupon c at times T1 < · · · < TN . Re-write the swaption payout as
Vswaption(T0) = A(T0)(S(T0)− c)+
where, recall
A(t)not= A0,N (t) =
N−1∑i=0
τiP (t, Ti+1), S(t)not= S0,N (t) =
P (t, T0)− P (t, TN )
A(t)
In the QA measure we have
Vswaption(0) = A(0)EA[(S(T0)− c)+
]and it now suffices to establish tractable dynamics for S(t) in QA. S(t) is obviously aQA-martingale, and from the bond reconstitution formula (31)
P (t, T ) =P (0, T )
P (0, t)exp
(−x(t)G(t, T )− 1
2y(t)G2(t, T )
)G(t, T ) =
∫ T
t
e−∫ utκ(s) ds du
it is clear that S(t) is a deterministic function S(t) = S(t, x(t)), so from It’s lemma
dS(t) = q(t, x(t))σr(t)dWA(t), q(t, x) =
∂
∂x
P (t, T0, x)− P (t, TN , x)∑N−1i=0 τiP (t, Ti+1, x)
where we used the bond reconstitution formula (31) to express the P (t, Ti)’s as functions ofx. Evaluating the partial derivatives one easily sees that
q(t, x) = −P (t, T0, x)G(t, T0)− P (t, TN , x)G(t, TN )
A(t, x)+S(t, x)
A(t, x)
N−1∑i=0
τiP (t, Ti+1, x)G(t, Ti+1)
The function q(t, x(t)) can be experimentally verified to be close to a constant in thex-direction, so
q(t, x(t)) ' q(t, x(t)), x(t) deterministic proxy for x(t)
For instance, setting x(t) ≡ 0 will yield good precision if σr(t) is not too high. More accurateapproximation will be given later on.
We are thus left with the problem
Vswaption(0) = A(0)EA[(S(T0)− c)+
],
dS(t) = q(t, x(t))σr(t)dWA(t)
The dynamics of S(t) can be thought of as those of a CEV process
dS(t) = λ(t)Sp(t)dWA(t), λ(t) = q(t, x(t))σr(t)
with p = 0 and time-dependent volatility.
• By [AP10-1, Proposition 7.6.1], the prices of European options in the model dS(t) =λ(t)Sp(t)dWA(t) are the same as those of the time-averaged constant volatility model
dS(t) =√λSp(t)dWA(t), λ =
1
T0
∫ T0
0
λ2(u) du
30
• For p = 0 and constant λ, namely dS(t) = λdWA(t), a standard computation (c.f.[AP10-1, Remark 7.2.9]) shows that
c(0, S(0);T0, c, λ) = (S(0)− c)N (d) + λ√T0n(d), d =
S(0)− cλ√T0
In conclusion (simplifying√T0) we obtain:
Vswaption(0) = A(0)[(S(0)− c)N (d) +
√λ · n(d)
](33)
where
d =S(0)− c√
λ, λ =
∫ T0
0
q(t, x(t))2σr(t)2 dt
4.3.2 Swaption calibration
Using the option pricing formulas from the previous subsection, we can calibrate the model(i.e. find σr(t) to match the market prices of one or more calibration targets, generallyEuropean swaptions).
Assume that we are given a collection6 of N − 1 swaptions defined on a maturity grid0 = T0 < T1 < · · · < TN such that the i-th swaption expires at time Ti, for i = 1, . . . , N − 1.
• Assume that the mean reversion speed κ(t) is a constant and that it is fixed exogenously.
• Observe that the value of the swaption expiring at Ti depends on the volatility curveσr(s) for s ∈ [0, Ti] only7.
• Assume that the volatility function σr(t) is piecewise-constant on the maturity grid
σr(t) =
N−2∑i=0
σiI[Ti,Ti+1]
Calibration can hence be performed one swaption at a time via a series of one-dimensional root searches as follows:
1. Assume that σ0, . . . , σi−1 have been computed.
2. Set the value σi such that the model price for the swaption expiring at Ti+1 equals itsmarket price, by numerically inverting equation (33)
Vswaption(0) = A(0)[(S(0)− c)N (d) +
√λ · n(d)
]6Such a collection is called a swaption strip.7Indeed, the i-th swaption payout at time Ti
Vswaption(Ti) =
(1− P (Ti, TN )− c
N−1∑n=i
τnP (Ti, Tn+1)
)+
so our claim follows from the bond reconstitution formula (31)
P (Ti, T ) =P (0, T )
P (0, Ti)exp
(−x(Ti)G(Ti, T )− 1
2y(Ti)G
2(Ti, T )
)G(t, T ) =
∫ T
t
e−2∫ tu κ(s) ds du
y(t) =
∫ t
0
e−∫ tu κ(s) dsσ2
r(u) du
31
3. Repeat Step 2 for i = 0, . . . , N − 2.
More concretely:
(i) Pre-compute the terms
G(Ti, Tj) =
∫ Tj
Ti
e−
∫ uTiκ(s) ds
du, i ≤ j
via the recursionG(t, T ) = [G(0, T )−G(0, t)] e
∫ t0κ(s) ds
At the n-th step of the recursion, for n = 1, . . . , N − 1, we seek to obtain σn−1 so thatthe n-th swaption market price Vn is matched.
(ii) Evaluate y(Tn). This is done via the recursion
y(Tn) = e−2κn−1(Tn−1−Tn−2)y(Tn−1) +1
2σ2n−1(Tn−1 − Tn−2)
[e−2κn−1(Tn−1−Tn−2) + 1
],
y(T0) = 0
which is obtained by applying the trapezoidal quadrature rule to the integral
y(t) =
∫ t
0
e−2∫ tuκ(s) dsσr(u) du
Note that σn−1 is unknown at the n-th step: we are optimizing for it.
(iii) Use y(Tn) to calculate the discount factors P (Tn, Tj), for j > n using
P (Tn, Tj) =P (0, Tj)
P (0, Tn)e−
12y(Tn)G(Tn,Tj)
2
(iv) Compute An(t) ≡ An,N (t) and Sn(t) ≡ Sn,N (t) at t = T0, . . . , Tn.
(v) Compute q(t) for t = T0, . . . , Tn.
(vi) Compute λ2 = 1Tn
∫ Tn0
q(t, x(t))2σr(t)2 dt using the trapezoidal rule (we are using the
approximation x(t) ≡ 0
λ ' σ20(T1 − T0)
q(T0, 0)2 + q(T1, 0)2
2+ · · ·σ2
n−1(Tn−1 − Tn)q(Tn−1, 0)2 + q(Tn, 0)2
2
(vii) Use a one-dimensional root-search procedure to find σn−1 satisfying equation (33)
Vn(0) = An(0)
[(Sn(0)− c)N (d(σn−1)) +
√λ(σn−1) · n(d(σn−1))
]
4.3.3 Monte Carlo simulation
To price a derivative security paying V (T ) at time T , recalling that x(t) = r(t)− f(0, t), weneed to compute
V (0) = EQ[V (T )e−
∫ T0r(u) du
]= P (0, T )EQ
[V (T, x(t)Tt=0)e−
∫ T0x(u) du
](34)
The dynamics of x(t) in the GSR model are
dx(t) = [y(t)− κ(t)x(t)]dt+ σr(t)dW (t),
y(t) =
∫ t
0
e−2∫ tuκ(s) dsσr(u)2 du
32
Discretizing into a schedule8 t0 < t1 < · · · < tN we see that
x(ti+1) = E[x(ti+1)|x(ti)] +√
Var(x(ti+1)|x(ti))Zi, i = 0, . . . , N − 1,
E[x(ti+1)|x(ti)] = e−∫ ti+1ti
κ(u) dux(ti) +
∫ ti+1
ti
e−∫ ti+1s κ(u) duy(s) ds,
Var(x(ti+1)|x(ti)) =
∫ ti+1
ti
(e−
∫ ti+1s κ(u) duσr(s)
)2
ds
where Z0, . . . , ZN−1 is a sequence of independent standard Gaussian random variables.
For each simulated path x(t0), . . . , x(tN ), we can compute V (T, x(t)Tt=0). As for the
quantity I(T ) = −∫ T
0x(u) du that is also present in (34), we can proceed via a standard
quadrature (which will introduce a discretization bias) or we can use [AP10-2, Lemma10.1.11] to simulate it as
I(ti+1) = E[I(ti+1)|I(ti), x(ti)] +√
Var(I(ti+1)|I(ti), x(ti))Zi, i = 0, . . . , N − 1
where
E[I(ti+1)|I(ti), x(ti)] = I(ti)− x(ti)G(ti, ti+1)−∫ ti+1
ti
∫ u
ti
e−∫ usκ(v) dvy(s) ds du,
Var(I(ti+1)|I(ti), x(ti)) = 2
∫ ti+1
ti
∫ u
ti
e−∫ usκ(v) dvy(s) ds du− y(ti)G(ti, ti+1)2,
Cov(x(ti+1), I(ti+1)|I(ti), x(ti)) = −∫ ti+1
ti
∫ u
ti
σr(s)2e−
∫ usκ(v) dve−
∫ ti+1s κ(v) dv ds du
where Z0, . . . , ZN−1 are independent Gaussian variables satisfying
ρinot= Corr(Zi, Zi) =
Cov(x(ti+1), I(ti+1)|I(ti), x(ti))√Var(I(ti+1)|I(ti), x(ti))
√Var(x(ti+1)|I(ti), x(ti))
Using the Cholesky decomposition as usual, the sequence Zi can be constructed by generat-ing, for each i, two independent Gaussian variables Zi, Z
∗i and setting
Zi = ρiZi +√
1− ρ2iZ∗i
4.4 The affine one-factor model
The GSR model suffers from two main deficiencies:
(i) Interest rates distribute as normal variables, so they can become negative.
(ii) The short rate volatility σr(t) is independent of the short rate itself, so there is no wayto control the volatility skew implied by the model.
Both issues are addressed by the affine short rate model
dr(t) = κ(t) [θ(t)− r(t)] dt+ σ√α+ βr(t)dW (t)
8In order to see this, simply write down the solution to the linear SDE for x(t). The choice of the schedulet0 < t1 < · · · < tN depends on the particulars of the payoff. For instance, if V (T ) only depends on the yield curveat time T , we may take N = 1.
33
4.4.1 Bond pricing
In order to compute bond prices
P (t, T ) = Et
[e−
∫ Ttr(u) du
]one establishes more generally the extended transform
g(t, T ; c1, c2) = Et
[e−c1r(T )−c2
∫ Ttr(u) du
], c1, c2 ∈ C
Lemma 4.4. Wherever the extended transform g(t, T ; c1, c2) us defined, it is given by
g(t, T ; c1, c2) = exp [A(t, T ; c1, c2)−B(t, T ; c1, c2)r(t)]
where A,B satisfy a system of Riccati ODE’s
dA
dt− κ(t)θ(t)B +
1
2σ2(t)αB2 = 0,
−dBdt
+ κ(t)B +1
2σ2(t)βB2 = c2
subject to the terminal conditions A(T, T ; c1, c2) = B(T, T ; c1, c2) = 0.
• This system of ODE’s can be solved robustly by Runge-Kutta methods.
• If the parameters are piecewise constant, A,B can also be found analytically (c.f.section 10.2.2.2 in [AP10-2]).
4.4.2 Discount bond calibration
In the affine SDE
dr(t) = κ(t) [θ(t)− r(t)] dt+ σ√α+ βr(t)dW (t)
• The role of θ(t) is to calibrate the model to the initial term structure of discount bonds.
• The short rate volatility σ(t) will be determined through calibration against swaptionsand caps/floors.
As in the GSR model, it is convenient to eliminate the dependence of θ on ∂f(0,t)∂t . This
is accomplished via the change of variables x(t) = r(t)− f(0, t); the SDE for x(t) is
dx(t) = dr(t)− ∂tf(0, t)dt = (w(t)− κ(t)x(t)) dt+ σ(t)√ξ(t) + βx(t)dW (t) (35)
w(t) = κ(t)θ(t)− ∂tf(0, t)− κ(t)f(0, t)
ξ(t) = α+ βf(0, t)
The extended transform g(t, T ; c1, c2) is then expressed in terms of x(t)
g(t, T ; c1, c2) = e−c1f(0,T )P (0, T )c2
P (0, t)c2exp [C(t, T ; c1, c2)− x(t)B(t, T ; c1, c2)]
where B,C satisfy a system of Riccati ODE’s
dC
dt− w(t)B +
1
2σ2(t)ξ(t)B2 = 0, ξ(t) = α+ βf(0, t),
−dBdt
+ κ(t)B +1
2σ2(t)βB2 = c2
34
4.4.2.1 Algorithm for ω(t), for fixed α, β and known κ(t), σ(t)
The function ω(t) is determined by matching observed discount bond prices at time 0. Con-cretely, ω(t) is specified on a grid t0 < · · · < TN under a piecewise-constant assumption
ω(t) =∑N−1i=0 ωiI(ti,ti+1](t).
Note that
P (t, T ) = g(t, T ; 0, 1) =P (0, T )
P (0, t)ec(t,T )−x(t)b(t,T ) (36)
where
dc(t, T )
dt− ω(t)b(t, T ) +
1
2σ2(t)ξ(t)b2(t, T ) = 0, (37)
−db(t, T )
dt+ κ(t)b(t, T ) +
1
2σ2(t)βb2(t, T ) = 1 (38)
subject to c(T, T ) = b(T, T ) = 0.
Setting t = 0 in equation (36) establishes the fundamental calibration requirement
c(0, T ;ω(·)) = 0, ∀T
The idea is then to solve equation (38) for b(ti, tj), j < i and then find wi iteratively from
equation (37) by integrating it over [0, ti+1], and using the fact that∫ ti+1
0dc(t,ti+1)
dt dt =−c(0, ti+1) = 0. Concretely:
1. As a pre-processing step, find b(ti, tj) for all j < i by solving equation (38).
2. For a given i, assume that ωj is known for j < i, namely that ω(t) has been specifiedover [0, ti].
3. Integrating equation (37) over [0, ti+1] yields
−c(0, ti+1) = 0
=
not= Θ(ti)︷ ︸︸ ︷
−1
2
∫ ti+1
0
σ2(s)ξ(s)b2(s, ti+1) ds+
∫ ti
0
ω(s)b(s, ti+1) ds+
∫ ti+1
ti
ω(s)b(s, ti+1) ds
= Θ(ti) + ωi
∫ ti+1
ti
b(s, ti+1) ds
whereby one can solve for ωi and then repeat the process for i = 0, 1, . . . , N − 1.
4.4.3 European option pricing
In order to to determine the volatility function σ(t)), the affine model is calibrated to capletsand swaptions, so we need efficient schemes to compute prices thereof.
4.4.3.1 Caplets
Recall that, by Lemma 4.4, the moment generating function is available in the affine model
g(t, T ; c1, c2) = exp [A(t, T ; c1, c2)−B(t, T ; c1, c2)r(t)]
where A,B satisfy a system of Riccati ODE’s, so caplets (i.e. options on zero-coupon bonds,can be priced by Fourier methods.
35
Briefly: given a tenor structure 0 ≤ T0 < · · · < TN , a caplet fixing at Tn has time-Tn+1
payout
Vcaplet(Tn+1) = τn (Ln(Tn)− k)+
=
(1
P (Tn, Tn+1)− 1− τk
)+
So it’s time-Tn value is
Vcaplet(Tn) = P (Tn, Tn+1)ETn+1
Tn
[(1
P (Tn, Tn+1)− 1− τk
)+]
[∗]= (1− (1 + kτ)P (Tn, Tn+1))
+
where in [∗] we used that the expression inside the expectation is FTn-measurable, whencethe time-Tn value of the caplet can be represented as the payoff of a scaled put option on azero-coupon bond for which we have an m.g.f. available.
4.4.3.2 Swaptions
In order to price swaptions efficiently, we work out an approximation for the swap ratemartingale dynamics. Start by writing the swaption payout as
Vswaption(T0) = A(T0)(S(T0)− c)+
with
A(t)not= A0,N (t) =
N−1∑i=0
τiP (t, Ti+1), S(t)not= S0,N (t) =
P (t, T0)− P (t, TN )
A(t)
In the QA measure we have
Vswaption(0) = A(0)EA[(S(T0)− c)+
]S(t) is obviously a QA-martingale, and from the bond reconstitution formula (36)
P (t, T ) = g(t, T ; 0, 1) =P (0, T )
P (0, t)ec(t,T )−x(t)b(t,T )
we obtain
dS(t) =∂S(t, x(t))
∂xσ(t)
√ξ(t) + βx(t)dWA(t)
where
∂S(t, x)
∂x= −b(t, T0)P (t, T0)− b(t, TN )P (t, TN )
A(t)+S(t)
A(t)
N−1∑i=0
τib(t, Ti+1)P (t, Ti+1)
S(t) can be empirically verified to be well approximated by a linear function of x(t), sayS(t) = ζ(t) + χ(t)x(t), whence
dS(t) ' σ(t)√ξs(t) + βs(t)S(t)dWA(t), for appropriate ξs, βs(t)
We can further reduce to
dS(t) = σ(t)√βs(t)
√ψ + S(t)dWA(t), ψ =
∫ T0
0σ(t)2βs(t)ξs(t) dt∫ T0
0σ(t)2βs(t) dt
36
Defining y(t) = ψ + S(t) our swaption pricing problem reduces to
Vswaption(0) = A(0)EA[(S(T0)− c)+
]= A(0)EA
[(y(T0)− (c+ ψ))+
],
dy(t) = σ(t)√βs(t)
√y(t)dWA(t)
y(0) = ψ + S(0)
Now y(t) is simply a CEV process with power p = 12 and time dependent volatility
λ(t) := σ(t)√βs(t). By [AP10-1, Proposition 7.6.1], the swaption price will be that of a
swaption in a CEV with constant time-averaged volatility
dy(t) =√λ√y(t)dWA(t), λ =
1
T0
∫ T0
0
σ(u)√βs(u) du
By [AP10-1, Proposition 7.2.6], the swaption price will be
Vswaption(0) = Vswaption(0, y(0);T0, ψ + c) = y(0) [1−Υ(a, b+ 2, c)]− (ψ + c)Υ(c, b, a)
where
a =4(ψ + c)
λ2T0, b = 2, c =
4y(0)
λ2T0
where Υ(x, ν, γ) = P (χ2ν(γ) ≤ x) is the cumulative distribution function for a non-central
chi-square distribution χ2ν(γ) with ν degrees of freedom and non-centrality parameter γ.
4.4.4 Swaption calibration
We finally illustrate how to calibrate the volatility function σ(t) to a swaption strip definedon a maturity grid 0 = T0 < · · · < TN . Assume that all swaptions are written on swaps thatmature at time TN ...
4.5 Log-normal short-rate models
Some authors have attempted to specify short-rate models where the dynamics of r(t) are ofthe form
dr(t) = O(dt) + σr(t)r(t)dW (t)
A few approaches are outlined in [AP10-2, Section 11.1]:
• Black-Derman-Toy model
r(t) = U(t)eσr(t)W (t), U(t), σr(t) deterministic
This model lacks a bond reconstitution formula and additionally, ln r(t) can be seen tobe a mean-reverting Gaussian [AP10-2, Lemma 11.1.1] with a mean reversion speed
κ(t) = −σ′r(t)σr(t) that is beyond user control.
• The Black-Karasinski model is analogous to the BDT model, but it specified the meanreersion speed exogenously.
A common problem shared by all these log-normal models is that
Et
[1
P (t′, T )
]=∞, t < t′ < T
In particular since L(t, T ) = 1τ
(P (t,T )
P (t,T+τ) − 1)
, this implies the following unfortunate eco-
nomic corollaries:
37
• Et[L(T, T )] =∞ for T > t, which predicts that all future Libor rates should be infinite.
• Using Jensen’s inequality we obtain
1
P (t′, T )=
1
Et′[e−
∫ Tt′ r(u) du
] ≤ Et′ [e∫ Tt′ r(u) du]
whence Et
[e∫ Tt′ r(u) du
]= ∞, so that the expected return of investing in the continu-
ously compounded money market account for a finite period of time is infinite.
4.6 Spanned and unspanned stochastic volatility
We illustrate why the short rate framework is not particularly amenable to stochastic volatil-ity extensions. Consider the Fong-Vasicek model
dr(t) = κr(θr − r(t))dt+√z(t)dW1(t), κr, θr ∈ R+
dz(t) = κz(θz − z(t))dt+ η√z(t)dW2(t), κz, θz ∈ R+
dW1(t) · dW2(t) = ρdt
The bond prices in this model can be seen to satisfy
P (t, t+ δ) = eA(δ)+r(t)B(δ)+z(t)C(δ)
where A,B,C a deterministic functions satisfying a coupled system of ODE’s. Therefore,P (t, T ) is a deterministic function of the two state variables r(t) and z(t) and hence theexposure to both can be hedged out by taking positions in two discount bonds with differentmaturities9. When this happens, we say that the state variables are spanned by the discountcurve.
There is evidence, however, that interest rate option volatilities cannot be perfectlyhedged by trading only discount bonds, which implies that volatilities of discount bondsdepend on a vector of random state variables (z1(t), . . . , zn(t)) that are not included in thestate variables used in reconstitution formulas for the discount curve, namely
∂P (t, T )
∂zi(t)= 0, i = 1, . . . , n
9Namely, given observations at time t of the prices of two discount bonds with different maturities, we caninvert the reconstitution formula for P (t, T ) to uncover the current values of the two state variables.
38
5 Multi-factor short-rate models
Short rate models with a single driving Brownian motion:
1. Imply that the instantaneous correlation between forward rates at different maturitiesis one, which contradicts reality. As a general rule, all derivatives that have payoutsexhibiting significant convexity to non-parallel moves of the forward curve must not bepriced in a one-factor model.
2. Are unable to produce a time-stationary non-monotonic term structure of forward ratevolatilities (the so-called short-maturity volatility hump). Two-factor Gaussian modelsare, in contrast, capable of reproducing this volatility hump.
In this section we briefly sketch:
• The Gaussian multi-factor model, following [AP10-2, Section 12.1]: development fromseparability condition, swaption pricing and model calibration.
• The quasi-Gaussian multi-factor model, following [AP10-2, Section 13.3]
5.1 The Gaussian multi-factor model
One starts along the lines of the one-factor Gaussian model. A general d-factor model canbe written as
df(t, T ) = σf (t, T )>∫ t
0
σf (t, u) dudt+ σf (t, T )>dW (t),
dP (t, T )
P (t, T )= r(t)dt− σP (t, T )>dW (t)
where σP (t, T ) is a bounded d-dimensional function of time and W (t) is a d-dimensionalBrownian motion in the risk-neutral measure Q. This model can be made Markovian byimposing a separability condition:
Proposition 5.1. Assuming that
σf (t, T ) = g(t)h(T ) (39)
where g is a d×d deterministic matrix-valued function, and h is a d-dimensional deterministicvector. Then
f(t, T ) = f(0, T ) + Ω(t, T ) + h(T )>z(t), (40)
dz(t) = g(t)>dW (t), z(0) = 0, (41)
r(t) = f(0, t) + Ω(t, T ) + h(t)>z(t), (42)
Ω(t, T ) = h(T )>∫ t
0
g(s)>g(s)
∫ T
s
h(u) du ds (43)
If the separability condition is satisfied, the forward curve can be reconstructed from dGaussian martingale variables zi(t), which are not determined uniquely. Set
H(t) = diag(h(t)) ≡ diag(h1(t), . . . , hd(t)), hi(t) 6= 0 i = 1, . . . , d
which is clearly invertible, and define a diagonal d× d matrix10 by
κ(t) = −dH(t)
dtH(t)−1 (44)
10Recall that in the HJM setting κ(t) = −h′(t)h(t)
, where σf (t, T ) = g(t)h(T ).
39
Proposition 5.2. Let the forward rate volatility by separable σf (t, T ) = g(t)h(T ). Let κ(t)be defined as above and define
x(t) = H(t)
∫ t
0
g(s)>g(s)
∫ t
s
h(u) du ds+H(t)z(t) (45)
y(t) = H(t)
(∫ t
0
g(s)>g(s) ds
)H(t) (46)
(so that x(t) is a d-dimensional random vector and y(t) is a deterministic d× d symmetricmatrix). Then if 1 = (1, 1, . . . , 1)> ∈ Rd we have
dx(t) = [y(t)1− κ(t) · x(t)] dt+ σx(t)>dW (t),
σx(t) = g(t)H(t)
and with M(t, T ) := H(T )H(t)−11,
f(t, T ) = f(0, T ) +M(t, T )>
(x(t) + y(t)
∫ T
t
M(t, u) du
)(47)
r(t) = f(t, t) = f(0, t) + 1>x(t) = f(0, t) +
d∑i=1
xi(t) (48)
As for the bond prices, defining
G(t, T ) =
∫ T
t
M(t, u) du
we can write explicitly
P (t, T ) =P (0, T )
P (0, t)exp
[−G(t, T )> · x(t)− 1
2G(t, T )> · y(t) ·G(t, T )
](49)
More explicitly: writing
κ(t) = diag(κ1(t), . . . , κd(t))
it follows from the definitions above that
h(t) =(e−
∫ t0κ1(s) ds, . . . , e−
∫ t0κd(s) ds
)>(50)
M(t, T ) = H(T )H(t)−11 =(e−
∫ Ttκ1(s) ds, . . . , e−
∫ Ttκd(s) ds
)>(51)
y(t) =
∫ t
0
diag(M(s, t)) · σx(s)> · σx(s) · diag(M(s, t)) ds (52)
whereby specification of the mean reversion κ(t) and σx(t) fully determines the d-dimensionalGaussian model.
One important motivation for the introduction of a multi-factor interest rate model isthe ability to control correlations among the various points on the forward curve.
Lemma 5.3. In the model for r(t) described in Proposition 5.1, the time-t instantaneouscorrelation between the forward rates f(t, T1) and f(t, T2) is given by
ρ(t, T1, T2) =h(T1)> · g(t)> · g(t) · h(T2)√
h(T1)> · g(t)> · g(t) · h(T1) ·√h(T2)> · g(t)> · g(t) · h(T2)
(53)
40
5.2 Two-factor Gaussian model
In this case we have
h(t) =
(e−
∫ t0κ1(u) du
e−∫ t0κ2(u) du
), g(t) =
(σ11(t)e−
∫ t0κ1(u) du 0
σ21(t)e−∫ t0κ1(u) du σ22(t)e−
∫ t0κ2(u) du
)and writing x(t) = (x1(t), x2(t))>
dx(t) = [y(t)1− κ(t)x(t)]dt+ σx(t)>dW (t), σx(t) =
(σ11(t) 0σ21(t) σ22(t)
)The instantaneous correlation between x1(t) and x2(t) is
ρx(t) =σ22(t)σ21(t)√
σ11(t)2 + σ21(t)2 · ‖σ22(t)‖It is clear that the model has the forward rate process
df(t, T ) = O(dt) +
(σ1(t)e−
∫ Ttκ1(u) du
σ2(t)e−∫ Ttκ2(u) du
)dW ∗(t), dW ∗1 (t)dW ∗2 (t) = ρx(t)dt
with σ1(t) =√σ11(t)2 + σ21(t)2 and σ2(t) = ‖σ22(t)‖. In particular (c.f. [AP10-2, Lemma
12.1.11]) we have
ρ(t, T1, T2) := Corrt(df(t, T1), df(t, T2)) =b(t, T1, T2)√
b(t, T1, T1)b(t, T2, T2)(54)
where
b(t, T1, T2) = 1 + ρx(t)σ2(t)
σ1(t)
(e−
∫ T1t (κ2(u)−κ1(u)) du + e−
∫ T2t (κ2(u)−κ1(u)) du
)+
(σ2(t)
σ1(t)
)2
e−∫ T1t (κ2(u)−κ1(u)) du−
∫ T2t (κ2(u)−κ1(u)) du
5.3 Swaption pricing (via Gaussian swap rate approximation)
Swaptions in a multi-factor Gaussian model can be priced identically to the one-dimensionalsetting (c.f. Section 4.3.1.2).
The time-0 value of a swaption is given by Vswaption(0) = A(0)EA [(S(T0)− c)+] Theswap rate dynamics can be approximated as
dS(t) = q(t, x(t))>σx(t)>dWA(t), qj(t,x(t)) =∂S(t)
∂xj
The functions qj can be experimentally verified to be close to a constant, so we canapproximate
qj(t,x(t)) ' qj(t, x(t))
where x(t) is some deterministic proxy of the random vector x(t), for instance x(t) = 0.Then, as in the 1-dimensional case we have the following simple swaption pricing formula.
Lemma 5.4. Let x(t) be a deterministic function of time and assume that qj(t,x(t)) 'qj(t, x(t)). Then
Vswaption(0) = A(0)[(S(0)− c)N (d) +
√λn(d)
]where
d =S(0)− c√
λ, λ =
∫ T0
0
‖q(t, x(t))>σx(t)>‖2 dt
41
5.3.1 Calibration
In order to calibrate the Gaussian multi-factor model we need to specify the functions g(·)ad h(·). From (50)
h(t) =(e−
∫ t0κ1(s) ds, . . . , e−
∫ t0κd(s) ds
)>it is clear that specification of the mean reversions (κ1(t), . . . , κd(t))) fully determines thefunction h(·).
The functions g(·) ad h(·) affect both volatilities and correlations of market rates, so inorder to calibrate them to market instrument prices, we need both caplet/swaptions and alsocorrelation-sensitive instruments like spread options. The former are far more liquid thanthe latter, so it is a good idea to try to separate volatility and correlation calibration.
Consider the following:
• d benchmark tenors δ1 < · · · < δd.
• d forward rates fi(t) = f(t, t+ δi), i = 1, . . . , d, with instantaneous correlations11
λi(t) = h(t+ δi)>g(t)>, λi(t) := ‖λi(t)‖
• Denote by χi,j(t) = Corr(fi(t), fj(t)), i, j = 1, . . . , d.
• Denote the covariance matrix of the vector f(t) = (f1(t), . . . , fd(t))> by
[Rf (t)]i,j = λi(t)λ>j (t) = χi,j(t)λi(t)λj(t)
Also denote its Cholesky decomposition by
Rf (t) = Cf (t)>Cf (t)
• This instantaneous covariance matrix can also be written as
Hf (t)g(t)>g(t)Hf (t)>, Hf (t) =
h(t+ δ1)>
...h(t+ δd)
>
=
h1(t+ δ1) · · · hd(t+ δ1)...
. . ....
h1(t+ δd) · · · hd(t+ δd)
whence
Hf (t)g(t)> = Cf (t) =⇒ g(t)> = Hf (t)−1Cf (t)
• The matrix g(t) can also be determined by fitting the correlation rather than thecovariance. Concretely, defining
[Xf (t)]i,j := χi,j(t) and Xf (t) = Df (t)>Df (t)
11Recall the d-factor Gaussian model specification from Proposition 5.1
f(t, T ) = f(0, T ) + Ω(t, T ) + h(T )>z(t),
dz(t) = g(t)>dW (t), z(0) = 0,
Ω(t, T ) = h(T )>∫ t
0
g(s)>g(s)
∫ T
s
h(u) du ds
whence for T = t+ δi we have
dfi(t) = dfi(t, t+ δi) = dΩ(t, t+ δi)dt+ h(t+ δi)>dz(t) = dΩ(t, t+ δi)dt+ h(t+ δi)
>g(t)>︸ ︷︷ ︸:=λi(t)
dW (t).
42
one can obtain g(t) by solving
Hf (t)g(t)> = diag(λ1(t), . . . , λd(t))Df (t)>
with the advantage that Df (t) is independent of the volatilities λi(t) and it doesn’tneed to be recomputed after each update of λi(t), within a calibration iteration.
One can thus implement the following calibration algorithm:
1. Specify the vector h(t) by the mean-reversion parametrization (50)
h(t) =(e−
∫ t0κ1(s) ds, . . . , e−
∫ t0κd(s) ds
)>using d different constant mean reversions (expand on how to select κi.)
2. Fill in the correlation matrix [Xf (t)]i,j = χi,j(t), as explained later in Section 7.2.2.
3. Calibrate benchmark rate volatilities against swaptions using the pricing formula fromLemma 5.4.
Concretely, consider the 2-factor case and assume that
κ(t) = (κ1(t), κ2(t))>, and Rf (t) =
(1 χ12(t)
χ12(t) 1
)have been pre-specified. Consider two swaption strips (same fixing dates, differenttenors) on a tenor structure T0 < T1 < · · · < TN . Assume that σ11(t), σ21(t), σ22(t) are
piecewise constant, with σij(t) =∑N−1n=0 σij,nI[Tn,Tn+1]. For each n = 1, . . . , N − 1, use
Lemma 5.4 to find the model price of the two swaptions maturing at Tn. These dependon three unknown variables σij,n−1(t), which are related by equation (54)
Corr(df1(t), df2(t)) = χ12(t)
This gives us three equations for three unknown at each step n = 1, . . . , N − 1.
4. Recover the diffusion matrix g(t) by solving
Hf (t)g(t)> = diag(λ1(t), . . . , λd(t))Df (t)>
This procedure allows us to calibrate d swaption strips, ideally d strips with constantswap tenors matching the benchmark tenors δ1, . . . , δd.
5.3.2 Monte Carlo simulation
Monte Carlo methods for the d-dimensional Gaussian model are straightforward, as all statevariables are jointly Gaussian. Recalling that r(t) = f(0, t) + 1>x(t), for a security payingV (T ) at time T we need to compute
V (0) = EQ[V (T )e−
∫ T0r(u) du
]= P (0, T )EQ
[V (T ; x(t)Tt=0)e−
∫ T0
1>x(u) du]
The risk-neutral dynamics of x(t) = (x1(t), . . . , xd(t)) are
dx(t) = [y(t)1− κ(t)x(t)]dt+ σx(t)>dW (t), x(0) = 0
Given a discrete schedule tiNi=0, it is clear that x(ti+1)|x(ti) is d-dimensional Gaussianwith mean and covariance matrix
EQ [x(ti+1)|x(ti)] = e−∫ ti+1ti
κ(u) dux(ti) +
∫ ti+1
ti
e−∫ ti+1s κ(u) duy(s)1 ds
VarQ [x(ti+1)|x(ti)] =
∫ ti+1
ti
e−∫ ti+1s κ(u) duσx(s)σx(s)>e−
∫ ti+1s κ(u)> du
43
If C is the square root of the covariance matrix, for a draw of d independent Gaussian samplesZ = (Z1, . . . , Zd) the advancement of x(·) from ti to ti+1 is given by
x(ti+1) = EQ [x(ti+1)|x(ti)] + CZ
As for the quantity I(T ) = −∫ T
01>x(u) du:
• I(t) is a Gaussian process, so we can simulate it jointly with x(t) once we have workedout the moments of I(ti+1)|I(ti), x(ti), as in section 4.3.3.
• We can compute I(T ) by numerical integration I(T ) ' −1>∑Ni=1 x(ti).
44
6 The one-factor quasi-Gaussian model
We consider extensions of one-factor Gaussian short rate models with local and stochasticvolatility, in which the Markovian structure of the model comes at the expense of addingadditional state variables.
6.1 Definition
The Gaussian (GSR) model es given, within an HJM setting, by
dr(t) = κ(t) [θ(t)− r(t)] dt+ σr(t)dW (t)
df(t, T ) = σf (t, T )t∫ T
t
σf (t, u) dudt+ σf (t, T )tdW (t)
σf (t, T ) = σr(t) exp
(−∫ T
t
κ(u) du
)Recall that in order to match the initial yield curve, we must have
θ(t) =1
κ(t)
∂f(0, t)
∂t+ f(0, t) +
1
κ(t)
∫ t
0
e−2∫ tuκ(s) dsσ2
r(u) du
In order to make the short rate r(t) Markovian we imposed a separability conditionσf (t, T ) = g(t)h(T ), where g, h are deterministic functions. Quasi-Gaussian models arise byallowing the function g to be stochastic:
σf (t, T, w) = g(t, w)h(T )
Lemma 6.1 (description of quasi-Gaussian models). Consider a one-factor quasi-Gaussianmodel
dr(t) = κ(t) [θ(t)− r(t)] dt+ σr(t)dW (t)
df(t, T ) = σf (t, T )t∫ T
t
σf (t, u) dudt+ σf (t, T )tdW (t)
σf (t, T ) = σr(t) exp
(−∫ T
t
κ(u) du
)dx(t) = [y(t)− κ(t)x(t)] dt+ σr(t, w)dW (t), x(0) = 0
dy(t) =[σ2r(t, w)− 2κ(t)y(t)
]dt, y(0) = 0
κ(t) = −h′(t)
h(t)
σr(t, w) = σf (t, t, w) = g(t, w)h(t)
All zero-coupon discount bonds are deterministic functions of the processes x(t), y(t)
P (t, T ) = P (t, T, x(t), y(t))
where
P (t, T, x, y) =P (0, T )
P (0, t)exp
(−G(t, T )x− 1
2G2(t, T )y
)(55)
G(t, T ) =1
h(t)
∫ T
t
h(s) ds
45
Note that:
• The evolution of the whole interest rate curve can be reduced to the evolution of justtwo state variables x(t) (main yield curve driver) and y(t) (auxiliary variable requiredto uphold the no-arbitrage condition).
• The qG model has a closed-form bond reconstitution formula for arbitrary choices ofg(t, ω) (we gain tractability at the expense of requiring two state variables, even thoughthe Brownian motion is one-dimensional.
6.2 Local volatility
A one-factor gQ model with local volatility is obtained by requiring the function g(·) to bedeterministic
g(t) = g(t, x(t), y(t)) and hence σr(t) = g(t, x(t), y(t))h(t)
with dynamics
dx(t) = [y(t)− κ(t)x(t)] dt+ σr(t, x(t), y(t))dW (t),
dy(t) =[σ2r(t, x(t), y(t))− 2κ(t)y(t)
]dt
Fix a tenor structure 0 < T0 < · · · < TN , with τn = Tn+1−tn and consider a forward swaprate12 S(t) with first fixing T0 and last payment TN . Since clearly S(t) = S(t, x(t), y(t)),using It’s lemma and noting that S(t) is a QA-martingale we conclude
dS(t) =∂S
∂x(t, x(t), y(t))σr(t, x(t), y(t))dWA(t), (56)
where
∂S
∂x(t, x, y) = − 1
A(t, x, y)[P (t, T0, x, y)G(t, T0)− P (t, TN , x, y)G(t, TN )]
+S(t, x, y)
A(t, x, y)
N−1∑n=0
τnP (t, Tn+1, x, y)G(t, Tn+1) (57)
Using the methods of Markovian projection one can show that the dynamics of the swaprate (56) can be written with a diffusion term that is just a function of the swap rate itself.The resulting SDE can hence be solved using standard methods for local volatility models.
Proposition 6.2 (Approximate local volatility dynamics for swap rate). The values ofall European options on the swap rate S(t) can be computed in a vanilla model with time-dependent local volatility function:
dS(t) = ϕ(t, S(t))dWA(t) (58)
ϕ2(t, s) = EA
[∂S
∂x(t, x(t), y(t))σr(t, x(t), y(t))
2
|S(t) = s
](59)
Evaluating conditional expectations like these can be hard. However:
12Recall that, given a tenor structure 0 < T0 < · · · < TN , we have
A(t) ≡ A0,N (t) =
N−1∑n=0
τnP (t, Tn+1), S(t) ≡ S0,N (t) =P (t, T0)− P (t, Tn)
A(t).
46
• If we assume that y(t) is well approximated by a deterministic function y(t), so thatS = S(t, x(t)) and conversely, x(t) = X(t, S(t)).
• If these functions y(t) and X(t, s) were known, then the evaluation of the conditionalexpectation (58) would boil down to
ϕ(t, s) =∂S
∂x(t,X(t, s), y(t)) · σr(t,X(t, s), y(t))
where ∂S∂x (t, x, y) is given by (57).
Once the volatility function ϕ(t, s) is determined, namely once we have approximatedy(t) and X(t, s), swaption values can be computed from (58) by using methods for localvolatility vanilla models (c.f. [AP10-1, Section 7.2]). We summarize these approximationsin the following proposition (see [AP10-2, Section 13.1.4] for details).
Remark 6.1 (A simple approximation for low volatilities). Setting y(t) ≡ 0 and applying alinear approximation
S(t, x, 0) = S(t, 0, 0) +∂S
∂x(t, 0, 0)x,
∂S
∂x(t, x, 0) =
∂S
∂x(t, 0, 0)
whence
x ' S(t, x, 0)− S(t, 0, 0)∂S∂x (t, 0, 0)
and we obtain
ϕ(t, s) ' ∂S
∂x(t, 0, 0)σr
(t,s− S(t, 0, 0)∂S∂x (t, 0, 0)
, 0
)
Proposition 6.3 (approximations of y(t) and X(t, s)). (i) [AP10-2, Proposition 13.1.4]We approximate y(t) by
y(t) ' EA[y(t)] = h2(t)
∫ t
0
[σr(s, 0, 0)
h(s)
]2
ds, t ∈ [0, T0] (60)
h(t) = exp
(−∫ t
0
κ(u) du
)(ii) [AP10-2, Lemma 13.1.6] We approximate x(t) by
x(t) = EA[x(t)] = x0(t) +1
2
∂2X
∂s2(t, S(0))VarA(S(t)), t ∈ [0, T0] (61)
where x0(t) is given as the solution of the equation
S(t, x0(t), y(t)) = S(0)
which can be solved in a few iterations of Newton’s algorithm (staring the search withx = 0), and where the variance VarA(S(t)) can be approximated by
VarA(S(t)) '∫ t
0
[∂S
∂x(s, 0, 0)σr(s, 0, 0)
]2
ds
The derivative ∂2X∂s2 (t, x) in (61) can be computed by differentiating the implicit defini-
tion S(t,X(t, s), y(t)) ≡ s twice13.
13Namely,
∂X
∂s(t, s) =
(∂S
∂x(t,
)−1
and∂2X
∂s2=
∂2S∂x2(∂S∂x
)2
47
(iii) [AP10-2, Proposition 13.1.8] In order to approximate X = X(t, s), since S(t, x) isobserved to be closely approximated as a quadratic function of x, we perform a secondorder expansion of
S(t,X(t, s), y(t)) = s
around x(t), namely we approximate X(t, s) with the solution ξ = ξ(t, s) of the quadraticequation
S(t, x(t), y(t)) +∂S
∂x(t, x(t), y(t))[ξ − x(t)] +
1
2
∂2S
∂x2(t, x(t), y(t))[ξ − x(t)]2 = s (62)
(iv) In conclusion, the conditional expectation (59)
ϕ2(t, s) = EA
[∂S
∂x(t, x(t), y(t))σr(t, x(t), y(t))
2
|S(t) = s
]
can be approximated by
ϕ(t, s) ≈ ∂S
∂x(t, ξ(t, s), y(t)) · σr(t, ξ(t, s), y(t)) (63)
where
• ∂S∂x (t, x, y) is given by (57).
• ∂2X∂s2 (x, t) can be computed by differentiating the implicit definition S(t,X(t, s), y(t)) ≡s twice.
• ξ(t, s) is the solution of the quadratic equation14 (62),
• x(t) is given by (61),
• y(t) is given by (60),
6.2.1 Linear local volatility
Even though one could use the previous results to calibrate the function σr(t, x, y) non-parametrically to the implied volatilities of a collection of swaptions across all strikes, it isrecommendable to chose σr(t, x, y) from a parametric family of monotone, downward slopingfunctions of state variables.
• We next make the assumption that the short rate local volatility function islinear
σr(t, x, y) = λr(t) [αr(t) + br(t)x] (64)
where the scale function αr(t) is fixed exogenously and the volatility λr(t) and skewbr(t) are calibrated to the market. The swap rate local volatility function ϕ(t, s) (63)thus becomes
ϕ(t, s) ≈ ∂S
∂x(t, ξ(t, s), y(t)) · λr(t) [αr(t) + br(t)ξ(t, s)]
• We further assume that the swap rate local volatility function ϕ(t, s) is wellapproximated by a linear function in s too, which leads to the following proposition[AP10-2, Corollary 13.1.9]:
14Any of the 2 solutions?
48
Proposition 6.4 (approximate swap rate dynamics with linear local volatility function).The swap rate local volatility function ϕ(t, S(t)) can be approximated by
ϕ(t, S(t)) ≈ ϕ(t, S(0)) +∂ϕ
∂s(t, S(0))[S(t)− S(0)]
= ϕ(t, S(0))
[∂ϕ∂s (t, S(0))
ϕ(t, S(0))S(t) +
(1−
∂ϕ∂s (t, S(0))
ϕ(t, S(0))
)S(0)
]not= λS(t) [bS(t)S(t) + (1− bS(t))S(0)]
where
λS(t) = λr(t)1
S(0)
∂S
∂x(t, x(t), y(t)) [αr(t) + br(t)x(t)] (65)
bS(t) =S(0)
αr(t) + br(t)x(t)
br(t)∂S∂x (t, x(t), y(t))
+S(0)∂
2S∂x2 (t, x(t), y(t))[
∂S∂x (t, x(t), y(t))
]2 (66)
The dynamics of the swap rate (58) are hence approximated by
dS(t) = ϕ(t, S(t))dWA(t) = λS(t) [bS(t)S(t) + (1− bS(t))S(0)] dWA(t) (67)
The model (67) is a displaced log-normal SDE with time-dependent volatility λS(t) andskew bS(t) and it can be converted into a displaced log-normal SDE with time-constantcoefficients using averaging methods. This yields the following approximate swaption pricingformula [AP10-2, Proposition 13.1.10].
Proposition 6.5 (approximate swaption pricing in local volatility qG model). Considera payer swaption with strike c and expiry T0 on the swap rate S(t). In the local-volatilityquasi-Gaussian model with linear short rate volatility
σr(t, x, y) = λr(t) [αr(t) + br(t)x] ,
the swaption price can be approximated by the displaced log-normal option formula
Vswaption(0) ' A(0)
[(S(0) + S(0)
1− bSbS
)N (d+)−
(c+ S(0)
1− bSbS
)N (d−)
]
d± =
ln
(S(0)+S(0)
1−bSbS
c+S(0)S(0)1−bSbS
)± 1
2 b2S λ
2ST0
bS λS√T0
where
λS =
(1
T0
∫ T0
0
λ2S(t) dt
)1/2
(68)
bS =
∫ T0
0
bS(t)ωS(t) dt (69)
ωS(t) =λS(t)2
∫ t0λS(s)2 ds∫ T0
0
(λS(u)2
∫ u0λS(s)2 ds
)du
(70)
49
6.2.2 Volatility calibration
For calibration purposes, we assume that the parameters αr(t), br(t), λr(t) defining the localvolatility function (64) of the qG model are piecewise constant on a maturity grid.
Concretely, given 0 = T0 < T1 < · · · < TN−1:
• Consider a collection of N − 1 coterminal options, with the n-th one expiring at Tn.
• Assume the n-th swaption has an underlying swap with µ(n) periods and denote thecorresponding swap rate and annuity
Sn(t) ≡ Sn,µ(n)(t), An(t) ≡ An,µ(n)(t), n = 1, . . . , N − 1
• The parameter αr(t) is redundant, so we elect to set αr(t) = Sn(0)1(Tn−1,Tn](t).
With these observations in mind, the local volatility definition (64) specializes to
σr(t, x, y) =
N−1∑n=1
λn [Sn(0) + bnDnx] 1(Tn−1,Tn](t), Dn =∂Sn∂x
(t, 0, 0)
where
λr(t) =
N−1∑n=1
λn1(Tn−1,Tn](t), αr(t) =
N−1∑n=1
αn1(Tn−1,Tn](t), br(t) =
N−1∑n=1
bn1(Tn−1,Tn](t)
Assume for now that the mean reversion function κ(t) is specified externally and that a
collection of market parameters (λSn , bSn), n = 1, . . . , N−1 is given15. From Proposition 6.5,the value of a swaption expiring at Tn depends only on parameters (λi, bi) for i = 1, . . . , n,so the local volatility qG model can be calibrated by a bootstrap method as follows:
1. Starting values: set the λn’s to volatilities obtained by calibrating a pure Gaussianmodel and bn = bSn , n = 1, . . . , N − 1. Set n = 1.
2. Given n, parameters (λ∗i , b∗i ) are already calibrated for i = 1, . . . , n − 1. Use (61) and
(60) to update x(t) and y(t) for t ∈ [0, Tn] using the former, together with the startingvalue (λn, bn).
3. Calculate λSn(t) and bSn(t) for t ∈ [0, Tn−1] using (65) and (66) respectively.
4. Make another guess for (λn, bn) and use it to update λSn(t) and bSn(t) for t ∈ [Tn−1, Tn],once again using (65) and (66) respectively.
5. Calculate the averaged parameters λSn and bSn using (68) and (69) from Proposition6.5.
6. If (λSn , bSn) is not equal to (λSn , bSn) within a given tolerance, go to Step 4.
Essentially, in this step we seek to obtain (λn, bn) such that∥∥∥(λSn − λSn , bSn − bSn)∥∥∥2
≤ ε
for a given tolerance ε. Once accomplished, move to the next step.
7. Set (λ∗n, b∗n) = (λn, bn), update n 7→ n+ 1 ≤ N − 1 and go to Step 2.
15These can be obtained by fitting a series of constant parameter displaced log-normal vanilla models to theobserved swaption volatility smiles at all expiries.
50
6.2.3 Computational tip
We explain in detail how to perform the n-th step...
y(t)
x(t)
6.2.4 Mean reversion calibration
The calibration of the mean reversion speed κ(t) of the local volatility qG model relies onthe observation that the ratio of volatilities of two swaptions with the same expiry dateis practically independent of volatility. This suggests using a second strip of swap rates todefine targets for mean reversion calibration as the ratios between these rates and the originalones.
Consider the original strip of swap rates Sn,ν(n)(·), n = 1, . . . , N − 1 and obtain a newone Sn,µ(n)(·), n = 1, . . . , N −1 with identical fixing dates, but spanning possibly differentperiods.
Proposition 6.6. Consider the quasi-Gaussian model with local short rate volatility functionσr(t) = g(t, x(t), y(t))h(t). The ratio of two swap rates fixing on Tn with m1 and m2 periods,respectively, is approximately given by
Var(Sn,m1(Tn))
Var(Sn,m2(Tn))≈
∫ Tn0
(∂Sn,m1
∂x (t, 0, 0))2
dt∫ Tn0
(∂Sn,m2
∂x (t, 0, 0))2
dt(71)
Note that the ratios (71) depend on the mean reversion parameter κ(t) only, so that itcan be calibrated independently of the other parameters. For calibration we add the usualpiecewise constant assumption
κ(t) =
N−1∑n=1
κn1(Tn−1,Tn](t) + κN1(TN−1,∞](t)
We then set the optimal mean reversion levels κn as the solution of the optimizationproblem
κ∗n = arg min
N−1∑n=1
[Var(Sn,m1(Tn))
Var(Sn,m2(Tn))
(κn)−Var(Sn,m1(Tn))
Var(Sn,m2(Tn))
]2
+ ω
N−1∑n=1
(κn+1 − κn)2
where we included a regularization term penalizing non-stationary behavior (dependent on
a user-specified weight ω) and where the V ar(Sn,m)’s are market-implied variances of swaprates, which in the case of the linear short rate volatility model under consideration can beapproximated16 by
V ar(Sn,m(Tn)) ≈(Sn,m(0)λSn,m
)2
Tn
6.3 Stochastic volatility
Local volatility models are characterized by a dependence of the function g in the factoriza-tion σf (t, T, w) = g(t, w)h(T ) on the state variables, namely g(t) ≡ g(t, x(t), y(t)). Instead,
16Market-implied variances can be calculated in more general models from values of options on the swap rates.
51
we can enrich the model by incorporating an additional stochastic factor via following spec-ification
g(t, w) ≡√z(t)g(t, x(t), y(t)) (72)
dz(t) = θ[z0 − z(t)]dt+ η(t)√z(t)dZ(t), (73)
E[dW (t), dZ(t)] = 0
The analysis from the previous section carries through similarly in this context. Inparticular, the model is defined by the collection of SDE’s
dx(t) = [y(t)− κ(t)x(t)] dt+ σr(t, x(t), y(t))dW (t), x(0) = 0,
dy(t) =[σ2r(t, x(t), y(t))− 2κ(t)y(t)
]dt, y(0) = 0,
dz(t) = θ(z0 − z(t))dt+ η(t)√z(t)dZ(t), z(0) = z0 = 1,
dZ(t)dWA(t) = 0
A few remarks are in order:
• The bond reconstitution formula (55) is the same as for any other quasi-Gaussian model(c.f. Lemma 6.1.
• In particular, the bond reconstitution formulas do not depend on the stochastic volatil-ity process z(t), so the SV-qG model is a true stochastic volatility model, in the sensethat the stochastic volatility is unspanned and cannot be hedged by discount bonds.
• The time-dependent volatility of variance is assumed to be piecewise constant η(t) =∑N−1n=1 ηnI(Tn−1,Tn](t).
An analysis identical to the one carried out in the local volatility setting yields thefollowing approximate dynamics for the short rate:
Proposition 6.7 (approximate dynamics for the swap rate S(t) in the SV-qV model).Assuming that the local short rate volatility is linear σr(t, x, y) = λr(t) [αr(t) + br(t)x], thedynamics of the swap rate S(t) in the stochastic volatility quasi-Gaussian model (??) aregiven approximately by
dS(t) =√z(t)λS(t) [bS(t)S(t) + (1− bS(t))S(0)] dWA(t),
dz(t) = θ(z0 − z(t))dt+ η(t)√z(t)dZ(t), z(0) = z0 = 1,
dZ(t)dWA(t) = 0
where λS(t) and bS(t) are given by equations (65)-(66).
This is a stochastic volatility model with time-dependent parameters for which the timeaveraging methods of [AP10-1, Sction 9.3] are available.
6.4 Quasi-Gaussian multi-factor models
52
7 The Libor market model
The Libor market model (LM) is a model capable of:
• Capturing the full correlation structure of across the entire yield curve.
• Allowing volatility calibration to a large enough set f European options that the volatil-ity characteristics of most exotic securities can be considered spanned by the calibration.
7.1 The basics
The HJM framework generally involves multiple driving Brownian motions and an infiniteset of state variables (the instantaneous forward rates). Working directly with forward ratesis unattractive for a number of reasons:
(i) Instantaneous forward rates are never quoted in the market: realistic securities in-volve simply compounded (Libor) rates. The development of market-consistent pricingexpressions is thus cumbersome.
(ii) An infinite set of instantaneous forward rates will require discretization.
(iii) Prescribing the form of the volatility function of instantaneous forward rates is com-plicated.
BGM, Jamshidian: All three issues can be solved by formulating the model in termsof a non-overlapping set of simply compounded Libor rates.
Fix a tenor structure
0 = T0 < T1 < · · · < TN , τn = Tn − Tn−1
Instead of keeping track of an entire yield curve, at any point in time we are focused onlyon a finite set of zero-coupon bonds
P (t, Tn), t < Tn ≤ TN
Define q(t) to be the tenor structure index of the shortest-dated discount bond still aliveat time t, namely
Tq(t)−1 ≤ t < Tq(t)
We define the Libor forward rates as
Ln(t) = L(t, Tn, Tn+1) =1
τn
(P (t, Tn)
P (t, Tn+1)− 1
), q(t) ≤ n ≤ N − 1
7.1.1 Probability measures
By construction, Ln(t) is a QTn+1-martingale, so by the martingale representation theoremthere is an adapted vector process σn(t) such that
dLn(t) = σn(t)>dWn+1(t)
In order towrite the process in measure QTn , consider the Radon-Nikodym process
ξ(t) =
(dQTn
dQTn+1
)t
= (1 + τnLn(t))P (0, Tn+1)
P (0, Tn),
dξ(t)
ξ(t)=τnσn(t)>dWn+1(t)
1 + τnLn(t)
53
From Girsanov’s theorem, dWn(t) = dWn+1(t)− τnσn(t)>
1+τnLn(t)dt is a QTn -BM and
dLn(t) = σn(t)>(
τnσn(t)
1 + τnLn(t)dt+ dWn(t)
)Iterating this argument, one easily sees that in the terminal measure QTN , the process
for Ln(t) is
dLn(t) = σn(t)>
− N−1∑j=n+1
τjσj(t)
1 + τjLj(t)dt+ dWN (t)
As for the spot measure, recall that it is the measure induced by taking as numeraire
B(t) = P (t, Tq(t))
q(t)−1∏n=0
(1 + τnLn(Tn))
The random part of the numeraire is the discount bond P (t, Tq(t)), so the dynamics in
measure QB coincide with the dynamics of QTq(T ) and arguing as above we see
dLn(t) = σn(t)>
n∑j=q(t)
τjσj(t)
1 + τjLj(t)dt+ dWB(t)
7.1.2 Link to HJM
Recall that HJM models have risk-neutral dynamics of the form
df(t, T ) = σf (t, T )>∫ T
t
σf (t, u) dudt+ σf (t, T )>dW (T )
The dynamics for the forward bond P (t, Tn, Tn+1) are of the form
dP (t, Tn, Tn+1)
P (t, Tn, Tn+1)= O(dt) +
(σP (t, Tn+1)> − σP (t, Tn)>
)dW (t)
By the definition of the forward Libor rate Ln(t) = 1τn
(P (t,Tn)P (t,Tn+1) − 1
)and using the diffusion
invariance principle one sees that
σn(t) =1
τn(1 + τnLn(t))
∫ Tn+1
Tn
σf (t, u) du
We thus see that the LM model is a special case of the HJM model: a full specificationof σf (t, T ) uniquelt determined the LM volatility σn(t), but the converse is not true.
7.1.3 Choice of σn(t)
To build a workable model, we need to be specific about σn(t).
(i) Local volatility LMσn(t) = λn(t)ϕ(Ln(t))
where λn(t) is a bounded vector-valued deterministic function and ϕ is a time-homogeneouslocal volatility function.
54
(ii) Stochastic volatility LM: we extend the previous local volatility model and we allowthe term on the Brownian motion to be scaled by a stochastic process so we have
dLn(t) =√z(t)ϕ(Ln(t))λn(t)>
(√z(t)µn(t)dt+ dWB(t)
)(74)
µn(t) =
n∑j=q(t)
τjϕ(Lj(t))λj(t)
1 + τjLj(t)
dz(t) = θ[z0 − z(t)]dt+ ηψ(z(t))dZ(t), z(0) = 0 (75)
• Note that a single common factor√z(t) simultaneously scales all forward rate
volatilities (even though this can be relaxed, see section ??).
• In order to make the variance process (75) more tractable, it is common to assumethat the Brownian motions Z(t) and WB(t) are uncorrelated.
7.2 Correlation structure
In one-factor models, where all the points on the forward curve move in the same direction.Empirical evidence contradicts this and indicates that various points on the forward curvedo not move co-monotonically with each other. The LM model gives us control over theinstantaneous correlation between various points, via the simple observation
Corr(dLk(t), dLj(t)) =λk(t)>λj(t)
‖λk(t)‖‖λj(t)‖
As we add more Brownian motions, our ability to capture increasingly complicated cor-relation structures progressively improves, at the cost of increasing the model complexity. Inorder to choose the model dimension m we may follow two approaches:
(i) Fix the correlation structure in the model to match empirical forward rate correlations,for instance via PCA.
(ii) Imply the correlation structure directly from traded market data by amending the setof calibration instruments with securities that have stronger sensitivity to forward ratecorrelations, like yield curve spread options.
7.2.1 Empirical PCA
For fixed τ (e.g. 0.25 or 0.5), define the rates
`(t, x) = L(t, t+ x, t+ τ)
Fix a set of tenors x1, . . . , xNx and a set of calendar times t1, . . . , tNt and set up the Nx×Ntobservation matrix
Oi,j =`(tj , xi)− `(tj−1, xi)√
tj − tj−1, i = 1, . . . , Nx, j = 1, . . . , Nt
where for each date in the time grid tj , we construct the forward curve (starting at tj) frommarket observable swaps, futures and deposits. Use this to construct a variance covariancematrix
C =OO>
NtIn order for the LM model to conform to empirical data, we need sufficiently many drivingBrownian motions m to replicate this variance matrix. This is accomplished by performinga PCA: compute the eigenvalues of C together with the percentage of variance that theyexplain, and decide how many are necessary to explain a reasonable amount thereof (generally4 or 5).
55
7.2.2 Empirical estimates for forward rate correlations
Introduce the diagonal matrix
c = diag(√
C1,1, . . . ,√CNx,Nx
)so that the empirical forward rate correlation matrix becomes
R = c−1Cc−1. (76)
The termRi,j provides a sample estimate of the instantaneous correlation between incrementsin `(t, xi) and `(t, xj)
Ri,j =1√
Ci,i√Cj,j
Ci,j =1√
Ci,i√Cj,j
1
Nt
Nt∑k=1
[`(tk, xi)− `(tk−1, xi)] · [`(tk, xj)− `(tk−1, xj)]
tk − tk−1.
In multi-factor yield curve modeling, it is common practice to work with simple para-metric forms for correlations
Corr(Lk(t), Lj(t)) = q(Tk − t, Tj − t)
For instance
q(x, y) = ρ∞(minx, y) + [1− ρ∞(minx, y)] e−a(minx,y)·|y−x|, (77)
a(z) = a∞ + (a0 − a∞)κz,
ρ∞(z) = b∞ + (b0 − b∞)e−αz
where ξ = (a0, a∞, κ, b0, b∞, α)> is a vector of parameters that are found by least-squaresoptimization against an empirical correlation matrix (76). Concretely, if R2(ξ) is the corre-lation matrix generated from the form (77), the optimal parameter vector ξ∗ is obtained byminimizing
ξ∗ = arg minξ[tr((R−R2(ξ)) · (R−R2(ξ))>
)], ξ ≥ 0
7.3 Pricing European options
In order to calibrate the model
dLn(t) =√z(t)ϕ(Ln(t))λn(t)>
(√z(t)µn(t)dt+ dWB(t)
)µn(t) =
n∑j=q(t)
τjϕ(Lj(t))λj(t)
1 + τjLj(t)
dz(t) = θ[z0 − z(t)]dt+ ηψ(z(t))dZ(t), z(0) = 0
the vectors λk(t) must be such that the model successfully reproduces the prices of liquidplain-vanilla derivatives (i.e. swaptions and caps). We thus need to find pricing formulas forvanilla options that are fast enough to be embedded into an iterative calibration algorithm.
7.3.1 Caplets
An interest rate cap is a security that allows one to benefit from low floating rates, yet beprotected from high rates. This is accomplished by giving the holder the right to pay thesmaller of two simple rates: the floating rate f and a fixed rate κ: the holder of a cap overthe interval [T, T + τ ], the interest rate paid at time T + τ on each dollar of principal is
56
minτκ, τf. Without the caplet the interest payment would be τf , so the caplet’s worthto the holder is
τf −minτf, τκ = (f − κ)+τ
Given a tenor structure T0 < T1 < · · · < TN and forward Libor rates Ln(t) = L(t;Tn, Tn+1),a cap is a stream of κ-caplets maturing at times Tn and paying out
Vcaplet(Tn+1) = τn (Ln(Tn)− c)+
The time-t value of a caplet is hence
Vcaplet(t) = B(t)Et
[1
B(Tn+1)Vcaplet(Tn+1)
]= B(t)Et
[1
B(Tn+1)τn (Ln(Tn)− c)+
]and the time-t value of a cap in the both the risk-neutral is
Vcap(t) = B(t)
N−1∑n=0
Et
[1
B(Tn+1)τn (Ln(Tn)− c)+
]Switching to the Tn+1-forward measure17 for the n-th caplet, the pricing formula becomes
Vcap(t) =
N−1∑n=0
τnP (t, Tn+1)ETn+1
t
[(Ln(Tn)− c)+
]Remark 7.1. As can be seen from the last expression, the terminal correlation between theforward rates Ln does not affect the cap value. Unfortunately, this is no longer the case forswaptions.
Proposition 7.1 (Caplet pricing in the SV-LM model (74)-(75)). Assume that the forwardrate dynamics in the spot measure are given by (74)-(75). Then
Vcaplet(0) = P (0, Tn+1)τnETn+1
[(Ln(Tn)− c)+
](78)
where
dLn(t) =√z(t)ϕ(Ln(t))‖λn(t)‖dY n+1(t)
dz(t) = θ[z0 − z(t)]dt+ ηψ(z(t))dZ(t)
Y n+1(t) =
∫ t
0
λn(s)>
‖λn(s)‖dWn+1(s)
We are thus reduced to computing the expectation of (Ln(Tn)− c)+, where the processLn(t) is a standard scalar stochastic volatility diffusion.
7.3.2 Swaptions
Recall that a fixed-for-floating interest rate swap allows to exchange a stream of cash-flowsat a floating (Libor) rate by a stream of fixed-rate cash-flows. Given a tenor structure0 ≤ T0 < T1 < · · · < TN , from the perspective of the fixed-rate payer, a swap pays out
Vswap(Tn+1) = τn(Ln(Tn)− κ)
17Recall that the pricing formula with numeraire g reads
V (t) = g(t)E
[1
g(T )V (T )|Ft
]
57
The time-t price of a swap is hence given by
Vswap(t) = β(t)
N−1∑n=0
Et
[1
β(Tn+1)τn(Ln(Tn)− κ)
][1]=
N−1∑n=0
τnP (t, Tn+1)(Ln(t)− κ)
[2]= A(t) (S(t)− κ)
where:
• Equality [1] shows that a vanilla fixed-floating swap can be valued on date t usingonly the term structure of interest rates observed on that date. In order to obtain [1],
one starts by using that P (Tn, Tn+1) = E[β(Tn)β(Tn+1)
], then the definition 1 + τnLn(t) =
P (t,Tn)P (t,Tn+1) and finally the fact that P (t,Tn)
β(t) is a martingale.
• In [2] we denoted
A(t) = A0,N (t) =
N−1∑n=0
τnP (t, Tn), S(t) = S0,N (t) =1
A(t)
N−1∑n=0
τnP (t, Tn)Ln(t)
A European swaption is a contract giving the holder the right, but no the obligation, toenter a swap at a future date, at a given fixed rate. A payer (receiver) swaption is an optionto pay (receive) the fixed leg on a fixed-floating swap.
The payoff for a payer swaption maturing at time Tj , with the underlying security beinga fixed-for-floating swap making payments at times Tj+1, . . . , Tk, j < k ≤ N is simply
Vswaption(Tj) = (Vswap(Tj))+
=
k∑n=j
τnP (Tj , Tn+1)(Ln(T0)− κ)
+
and for t < Tj , the time-t value of a payer swaption is given by
Vswaption(t) = β(t)Et
[1
β(Tj)Vswaption(Tj)
]
= β(t)Et
1
β(Tj)
k−1∑n=j
τnP (Tj , Tn+1)(Ln(T0)− κ)
+= β(t)Et
[1
β(Tj)A(Tj)(S(Tj)− κ)+
]= A(t)EAt
[(S(Tj)− κ)+
]where
A(t) = Aj,k−j(t) =
k−1∑n=j
P (t, Tn+1)τn, S(t) = Sj,k−j(t) =P (t, Tj)− P (t, Tk)
A(t)
Pricing swaptions in the Libor model is more involved than pricing caps and will requiresome amount of approximation if a quick algorithm is required. Since S(t) is clearly aQA-martingale, an application of Ito’s lemma easily yields the following:
58
Lemma 7.2 (Swaption pricing in the SV-LM model (74)-(75)). Assume that the forwardrate dynamics in the spot measure are given by (74)-(75) and let QA be the measure inducedby using A(t) as numeraire. Then
dS(t) =√z(t)ϕ(S(t))
k−1∑n=j
wn(t)λn(t)>dWA(t) (79)
where
wn(t) =ϕ(Ln(t))
ϕ(S(t))
∂S(t)
∂Ln(T )
=ϕ(Ln(t))
ϕ(S(t))× S(t)τn
1 + τnLn(t)×
[P (t, Tk)
P (t, Tj)− P (t, Tk)+
1
A(t)
k−1∑i=n
τiP (t, Ti+1)
](80)
Sketch of proof. Since S(t) is QA-martingale, dS(t) is driftless. Hence, by Ito’s lemma,since S(t) = S(Lj(t), . . . , Lk−1(t)), we have
dS(t) =
k−1∑n=j
∂S(t)
∂Ln(t)
√z(t)ϕ(Ln(t))λn(t)>dWA(t)
Evaluating the partial derivatives ∂S(t)∂Ln(t) , we obtain (79).
The swap rate dynamics in (79) are clearly too complicated to allow analytical treatment.For simplicity, one generally assumes that the weights wn(t) are constant, and equal to theirtime-0 value18.
Proposition 7.3 (Approximate swaption pricing in the SV-LM model (74)-(75)). The time-0 price of the Tj-maturity swaption is given by
Vswaption(0) = A(0)EA[(S(Tj)− c)+
](81)
If wn(t) are the weights in (80), define
λS(t) =
k−1∑n=j
wn(0)λn(t)
The swap dynamics in Proposition 7.2 can be approximated by
dS(t) '√z(t)ϕ(S(t))‖λS(t)‖dY A(t), (82)
dz(t) = θ[z0 − z(t)]dt+ ηψ(z(t))dZ(t),
‖λS(t)‖dY A(t) =
k−1∑n=j
wn(0)λn(t)>dWA(t) (83)
where Y A(t) and Z(t) are independent scalar Brownian motions in measure QA.
• the scalar term ‖λS(t)‖ is purely deterministic, so computation of the QA-expectation(81) can be obtained using standard analytical tools for scalar stochastic volatilitymodels (c.f. Chapter 8 in [AP10-1]).
We next give an example for displaced log-normal Libor rates.
18C.f. discussion at the bottom of page 617 in [AP10-2].
59
Corollary 7.4 (swaption pricing under displaced log-normal Libor rates). Assume that eachrate Ln(t) follows a displaced log-normal process in its own measure
dLn(t) = [bLn(t) + (1− b)Ln(0)]λn(t)>dW n+1(t), n = 1, . . . , N − 1
the time-0 price of a Tj-maturity swaption is given by
Vswaption,j(0) = A(0)cB
(0,S(0)
b;Tj , c− S(0) +
S(0)
b; bλSj
)with term swap rate volatility given by
λSj =1
Tj
(∫ Tj
0
‖λS(t)‖2 dt
)1/2
,
λS(t) =
k−1∑n=j
ωn(0)λn(t)
Above cB(t, S;T,K, σ) is Black’s call option formula with volatility σ, namely
cB(t, S;T,K, σ) = SN (d+)−K(d−), d± =ln S
K ±12σ
2(T − t)σ√T − t
7.3.3 Spread options
We now sketch a crude approach for pricing spread options that is adequate for calibrationpurposes. A more accurate evaluation using copulas is presented later. Consider a spreadoption paying
Vspread(T ) = (S1(T )− S2(T )−K)+, T ≤ minTj1 , Tj2
where S1(t) = Sj1,k1−j1(t) and S2(t) = Sj2,k2−j2(t) so that
dSi(t) = O(dt) +√z(t)ϕ(Si(t))λSi(t)
>dWB(t),
λSi(t) 'ki−1∑n=j
ωSi,n(0)λn(t)
Its time-0 value in the T -forward measure is given by
Vspread(0) = P (0, T )ET[(S1(T )− S2(T )−K)+
]Simplification. Assume that the spread ε(T ) = S1(T )−S2(T ) is a Gaussian with mean
and variance
ET [ε(T )] = ET [S1(T )]− ET [S2(T )] ' S1(0)− S2(0),
Var[ε(T )] =
2∑i=1
ϕ(Si(0))2z0
∫ T
0
‖λS1(t)‖2 dt
−2ρterm(0, T )z0
2∏i=1
ϕ(Si(0))
(∫ T
0
‖λS1(t)‖2 dt
)1/2
ρterm(T, T ′) = Corr(S1(T )− S1(T ′), S2(T )− S2(T ′))
='∫ TT ′λS1
(t)> · λS2(t)√∫ T
T ′‖λS1
(t)‖2 dt ·√∫ T
T ′‖λS2
(t)‖2 dt
60
Bachelier’s formula then yields
Vspread(0) = P (0, T )
√VarT (ε(T )) · [dΦ(d) + φ(d)] ,
d =ET [ε(T )]−K√
VarT (ε(T ))
7.4 Calibration
Assume that a tenor structure T0 < T1 < · · · < TN has been selected, that we have decidedupon the number of Brownian motions m to be used and that we have selected the basicform of the LM model (e.g. local/stochastic volatility) to be deployed. To complete themodel specification we need to establish the m-dimensional deterministic volatility vectorsλk(·), k = 1, . . . , N − 1. We follow these steps:
(i) Prescribe the basic form of ‖λk(t)‖ by introducing discrete time- and tenor-grids.
(ii) Use correlation information to obtain λk(t) from ‖λk(t)‖.(iii) Choose a set of observable securities against which to calibrate the model and establish
the norm to be used for calibration.
(iv) Recover λk(t) by norm optimization.
7.4.1 Calibration objective function
Assume that our calibration targets are
Vswaption,1, . . . , Vswaption,NS Vcap,1, . . . , Vcap,NC
Denote by V their quoted market prices and by V (G) their model-generated prices as func-tions of the volatility gird G. We will use the following calibration objective function:
I(G) =wSNS
NS∑i=1
(Vswaption,i(G)− Vswaption,i
)2
+wCNC
NC∑i=1
(Vcap,i(G)− Vcap,i
)2
w∂tNxNt
Nt,Nx∑i,j=1
(∂Gi,j∂ti
)2
+w∂xNxNt
Nt,Nx∑i,j=1
(∂Gi,j∂xj
)2
w∂t2
NxNt
Nt,Nx∑i,j=1
(∂2Gi,j∂t2i
)2
+w∂x2
NxNt
Nt,Nx∑i,j=1
(∂2Gi,j∂x2
j
)2
(84)
where wS , wC , w∂t, w∂x, w∂t2 , w∂x2 are exogenously specified weights which determine thetrade-off between accuracy and smoothness. Here is what all the terms above account for:
1. Terms (i) and (ii) measure how well the model is capable of reproducing the suppliedmarket prices.
2. Term (iii) measures the degree of volatility term structure time homogeneity, and pe-nalizes volatility functions that vary too much over calendar time.
3. Term (iv) measures the smoothness of the calendar time evolution of volatilities andpenalizes deviations from linear evolution.
4. Term (v) and (vi) measure constancy and smoothness of in the tenor direction.
61
7.4.1.1 Error function for implied volatilities
The error (objective) function (84) is often applied implied volatilities as opposed to out-right prices, so one normally institutes a pre-processing step where the market price of eachswaption Vswaption,i is converted into a constant implied volatility λSi , in such a way thatthe scalar SDE for the swap rate Si underlying the swaption Vswaption,i,
dSi(t) =√z(t)λSiϕ(Si(t))dYi(t)
reproduces the observed market price. If λSi(G) denotes the corresponding model volatility(and λCi(G) denotes the corresponding implied volatility for caps), one obtains an alternativecalibration norm
I(G) =wSNS
NS∑i=1
(λSi(G)− λSi
)2
+wCNC
NC∑i=1
(λCi(G)− λCi
)2
+ · · ·
There are two main reasons for proceeding in this fashion:
• The relative scaling of individual swaptions and caps is more natural (high-valuedtrades tend to be overweighed using outright prices).
• In some models, the implied volatilities λSi and λCi can be computed by simple timeintegration, so using them for calibration avoids the need of using time-consumingoption pricing formulas to obtain Vswaption,i and so forth. For instance, in Corollary7.4, where we illustrated how to price swaptions under displaced log-normal Libor rates
dLn(t) = [bLn(t) + (1− b)Ln(0)]λn(t)>dWn+1(t), n = 1, . . . , N − 1
we saw that the time-0 price of a Tj-maturity swaption is given by Black’s call optionformula
Vswaption,j(0) = A(0)cB
(0,S(0)
b;Tj , c− S(0) +
S(0)
b; bλSj
)with term swap rate volatility given by
λSj =1
Tj
(∫ Tj
0
‖λS(t)‖2 dt
)1/2
,
λS(t) =
k−1∑n=j
ωn(0)λn(t)
where the weights ωn(t) are given in (80). Given Vswaption,j , the implied volatility in
this case would be λSj := bλSj .
7.4.2 Calibration algorithm
Fix a number of factors m and fix a tenor structure 0 = T0 < T1 < · · · < TN and assumethat all the parameters in our Libor model have been fixed exogenously19 except for the m-dimensional vectors λk(t), k = 1, . . . , N−1, which we now seek to calibrate. Define functionsh : R2→Rm and g : R2→R as follows
λk(t) = h(t, Tk − t), ‖λk(t)‖ = g(t, Tk − t)19Namely, assume that we have exogenously fixed the parameters defining the square-root process for z(t) (i.e.
θ, η) and the parameters defining the local volatility function ϕ(·) (i.e. b if we use ϕ(Ln(t)) = bLn(t)+(1−b)Ln(0).
62
Fix a time/tenor grid
ti × xj, i = 1, . . . , Nt, j = 1, . . . , Nx, Nt, Nx ∼ 10 : 15
subject tot1 + xNx ≥ TN , and ti + xj ∈ Tn
have been selected, together with the number of Brownian motions m, a correlation matrixR, the set of calibration swaptions and caps and the weights in the calibration norm I.
Starting from some guess for the Nt ×Nx matrix G, Gij = g(ti, xj) (so that for some n
we have g(ti,
xj︷ ︸︸ ︷Tn − ti) = ‖λn(ti)‖), we run the following iterative algorithm:
1. Given G, use interpolation to obtain the full norm volatility grid ‖λn,k‖ for all Liborindices k = 1, . . . , N − 1 and all expiry indices n = 1, . . . , k. This is done by assumingthat, for each k, the norm ‖λk(t)‖ is piecewise constant in t
‖λk(t)‖ =
k∑n=1
I[Tn−1,Tn)(t)‖λn,k‖
‖λn,k‖ = w++Gi,j + w+−Gi,j−1 + w−+Gi−1,j + w−−Gi−1,j−1
and using linear interpolation in both dimensions of G.
Figure 1: Computation of ‖λn,k‖ = ‖λk(t)‖|[Tn−1,Tn) = g(t, Tk − t)
2. For each n = 1, . . . , N − 1, compute the (N − n)×m matrix H(Tn),
[H(Tn)]ji = hi(Tn, Tn+j−1 − Tn), i = 1, . . . ,m, j = 1, . . . , N − n
as follows: by construction such H(Tn)H(Tn)> will be the covariance matrix of the(N−n)-dimensional vector20 (Ln(Tn), . . . , LN−1(Tn)). We can construct it by definingan instantaneous correlation (N −n)× (N −n) matrix (constructed from an estimatedparametric form)
[R(Tn)]ij = Corr(Li(Tn), Lj(Tn)), i, j = n, . . . , N − 1 (85)
and a diagonal volatility matrix
[c(Tn)]jj = ‖λn,n+j−1‖, j = 1, . . . , N − n
so that the instantaneous covariance matrix will be
C(Tn) = c(Tn)R(Tn)c(Tn)
20Indeed, the (i, j)-term of the matrix H(Tn)H(Tn)> is
h(Tn, Tn+i−1)>h(Tn, Tn+j−1) = λn+i−1)(Tn)> · λn+j−1)(Tn) = Cov(dLn+i−1(Tn), dLn+j−1(Tn))
63
We can finally solve for H(Tn) in the equation
c(Tn)R(Tn)c(Tn) = H(Tn)H(Tn)>
via PCA.
(i) Covariance PCA: Let Λm(Tn) be a diagonal matrix containing the m largesteigenvalues of R(Tn), and let em(Tn) be an (N − n) ×m whose columns containthe corresponding eigenvectors. From the approximation21
c(Tn)R(Tn)c(Tn) ' em(Tn)Λm(Tn)em(Tn)> = em(Tn)√
Λm(Tn)√
Λm(Tn)em(Tn)>
it follows thatH(Tn) ' em(Tn)
√Λm(Tn)
(ii) Correlation PCA: It is more efficient22 to work with the correlation matrixdirectly and factor
R(Tn) = D(Tn)D(Tn)>
where D(Tn) is an (N − n)×m matrix obtained using PCA analysis, namely
D = arg minD′ h(D′;R) s.t. v(D) = 1 [∗]= arg minD′h(D′;R+ diag(µ)
where h(D;R) = tr((R−DD>)(R−DD>)>
), v(D) denotes the vector of diag-
onal elements of DD> and µ∗ is the solution of the equation
v(Dµ) = 1.
The passage from a constrained minimization problem to an unconstrained one in[∗] is justified by [AP10-2, Proposition 14.3.2].
3. Given λk(·) for all k = 1, . . . , N − 1, use the pricing formulas in Propositions 7.1 and7.3 to model prices for all swaptions and caps in the calibration set.
4. Establish the value of I(G) by direct computation.
5. Update G and repeat steps 1−4 until I(G) is minimized, using for instance the downhillsimplex method.
Remark 7.2. We will see how to calibrate the LM model to skews and smiles in section7.7.3.
7.4.3 Correlation calibration to spread options
In the above calibration algorithm, the instantaneous correlation matrix R (85) was specifiedexogenously by calibrating a parametric form such as (77) to an empirical forward ratecorrelation matrix (76). Instead, one can attempt to imply R from market data.
Assume the matrix R is time-inhomogeneous and is specified as some parametric functionof a low-dimension (unknown) parameter-vector23 ξ
R = R(ξ)
For this, let I(G, ξ) be our old objective function (which depends on ξ, since the prices ofcaps and swaptions depend on the correlation matrix). Introduce:
21Which minimizes the norm Tr[(C(Tn)− em(Tn)Λm(Tn)em(Tn)>) · (C(Tn)− em(Tn)Λm(Tn)em(Tn)>)
].
22Even though the PCA decomposition of a correlation matrix is technically more difficult, the correlation PCAis independent of the matrix c(Tn) and as such it won’t have to be updated when we update guesses for the Gmatrix in a calibration search loop.
23To be determined in the calibration procedure, along with the elements of G.
64
• Set of market-observable spread options Vspread,1, . . . , Vspread,NSP .
• Corresponding model-based prices Vspread,1(G, ξ), . . . , Vspread,NSP (G, ξ), computed us-ing ??
Update the old norm I to the norm
I∗(G, ξ) = I(G, ξ) +ωNSNSP
NSP∑i=1
(Vspread,i(G, ξ)− Vspread,i
)2
7.5 Monte Carlo simulation
Once the LM model has been calibrated to market data, we can proceed to use the param-eterized model for the pricing and risk management of non-vanilla options. Given the largenumber of Markov state variables (the full number of Libor forward rates), one must nearlyalways rely on Monte Carlo simulation.
Goal: given a probability measure and the state of the Libor forward curve at time t,we need to move the entire Libor curve forward to time t+ ∆, ∆ > 0.
7.5.1 Euler-type schemes
Assume we stand at time t and we know Lq(t)(t), Lq(t)+1(t), . . . , LN−1(t). We seek to advanceto time t+ ∆ and construct a sample of Lq(t+∆)(t+ ∆), Lq(t+∆)+1(t+ ∆), . . . , LN−1(t+ ∆).If q(t + ∆) exceeds q(t), some of the front-end forward rates expire and drop off the curveas we move to t+ ∆. Working in the spot measure QB , the general LM model dynamics are
dLn(t) = σn(t)>[µn(t)dt+ dWB(t)
]µn(t) =
n∑j=q(t)
τjσj(t)
1 + τjLj(t)
The Euler and log-Euler schemes for this model are:
Ln(t+ ∆) = Ln(t) + σn(t)>[µn(t)∆ +
√∆Z
],
Ln(t+ ∆) = Ln(∆) exp
[1
Ln(t)σn(t)>
((µn(t)− 1
2
σn(t)
Ln(t)
)∆ +
√∆Z
)]
where Z ∼ N = (0, Im) is a vector of m independent Gaussian draws.
7.5.2 Special-purpose schemes with drift Predictor-Corrector
In integrated form, the general LM dynamics become
Ln(t+ ∆) = Ln(t) +
∫ t+∆
t
σn(u)>µn(u) du+
∫ t+∆
t
σn(u)>dWB(u)
not= Ln(t) +Dn(t, t+ ∆) +Mn(t, t+ ∆) (86)
On occasions high-performance special-purpose schemes exist for simulation of Mn(t, t+∆). Generating M(t, t + ∆) by a special-purpose scheme, we can use a predictor-corrector
65
scheme for Ln:
Ln(t+ ∆) = Ln(t) + σn(t, L(t))>µn(t, L(t))∆ + M(t, t+ ∆)
Ln(t+ ∆) = Ln(t) + θPCσn(t, L(t))>µn(t, L(t))∆
+(1− θPC)σn(t+ ∆, L(t+ ∆))>µn(t+ ∆, L(t+ ∆))∆
+M(t, t+ ∆) (87)
M(t, t+ ∆) = σn(t)>√
∆Z
where L(t) = (L1(t), . . . , LN−1(t))>, L is the semi-implicit corrector scheme, L is theexplicit predictor scheme and θPC ∈ [0, 1] is the parameter that determined the amount ofimplicitness.
7.5.2.1 Lagging Predictor-Corrector scheme
7.6 Interpolation of front / back stubs
So far, we have seen how to obtain, at any time t, a vector of forward Libor rates
L(t) = (Lq(t)(t), Lq(t)+1, . . . , LN−1(t))
on a pre-specified tenor structure 0 ≤ T0 < · · · < TN .
In order to be able to compute P (t, Tn) for all n we need to additionally establish thefrontstub factor P (t, Tq(t)). Indeed, recall that
P (t, Tn) = P (t, Tq(t))
n−1∏i=q(t)
1
1 + τiLi(t)
Also, in order to compute P (t, T ) for arbitrary T , the back stub discount factor P (t, T, Tq(T )) =P (t,Tq(T ))
P (t,T ) is needed since
P (t, T ) =P (t, Tq(T ))
P (t, T, Tq(T ))=
1
P (t, T, Tq(T ))
P (t,Tq(T ))︷ ︸︸ ︷P (t, Tq(t))
q(T )−1∏i=q(t)
1
1 + τiLi(t)
The front and back stubs cannot be generally expressed in terms if Libor rates.
7.6.1 Back stub interpolation
A simple idea to obtain P (t, T, Tm) is to interpolate between P (t, Tm, Tm) = 1 and P (t, Tm−1, Tm) =1
1+τm−1Lm−1(t) , for instance setting
P (t, T, Tm) =T − Tm−1
Tm − Tm−1+
Tm − TTm − Tm−1
P (t, Tm−1, Tm)
Interpolations like these violate the basic no-arbitrage condition
P (0, Tm−1, Tm) = ETm−1 [P (t, Tm−1, Tm]
One way to remedy this is to consider choosing na arbitrary constant α(T ) and setting
P (t, Tm−1, Tm) = P (0, Tm−1, Tm) + α(T ) (P (t, Tm−1, Tm)− P (0, Tm−1, Tm))
66
subject to the consistency conditions α(Tm) = 1 and α(Tm−1) = 0. Note
dP (t, Tm−1, T ) = O(dt) + α(T )dP (t, Tm−1, Tm),
dP (t, Tm−1, T )
P (t, Tm−1, T )= O(dt) + [σP (t, Tm−1)− σP (t, T )] dW (t),
dP (t, Tm−1, Tm)
P (t, Tm−1, Tm)= O(dt) + [σP (t, Tm−1)− σP (t, Tm)] dW (t)
We may thus approximate α(T ) as
α(T ) ' P (0, Tm−1, T )
P (0, Tm−1, Tm)
‖σP (t, Tm−1)− σP (t, T )‖‖σP (t, Tm−1)− σP (t, Tm)‖
This approximation turns the problem of interpolating bond prices to that of interpolatingbond volatilities. For instance, in the one-dimensional Gaussian model with constant meanreversion κ we have
σP (t, T ) = σ(t)1− e−κ(T−t)
κ
Plugging this into the approximation above yields
α(T ) ' P (0, Tm−1, T )
P (0, Tm−1, Tm)
1− e−κ(T−Tm−1)
1− e−κ(Tm−Tm−1)
where κ could be either set as part of the user input or obtained by best-fitting theGaussian parametric form to the volatility structure of the LM model.
7.6.2 Back stub interpolation inspired by the Gaussian model
From the bond reconstitution formula in the Gaussian model we have
P (t, Tm−1, Tm) = P (0, Tm−1, Tm)× exp[−(G(Tm−1, Tm)e−κ(Tm−1−t)x(t′)
]× exp
[−1
2(G(t, Tm)2 −G(t, Tm−1)2)y(t)
]where
G(t, T ) =1− e−κ(T−t)
κ, y(t) = e−2κt
∫ t
0
e2κsσ(s)2 ds
A simple computation yields
P (t, Tm−1, T ) = P (0, Tm−1, T )
(P (t, Tm−1, Tm)
P (0, Tm−1, Tm)
) G(Tm−1,T )
G(Tm−1,Tm)
× exp
[1
2
G(Tm−1, T )
G(Tm−1, Tm)(G(t, Tm)2 −G(t, Tm−1)2)y(t)
]× exp
[−1
2(G(t, T )2 −G(t, Tm−1)2)y(t)
]where the volatility σ(t) and the mean reversion parameter κ can be obtained by fitting
the Gaussian volatility structure to the volatilities of Libor rates generated by the LM modelitself.
67
7.6.3 Front stub interpolation inspired by the Gaussian model
Using the forward bond prices we obtained in the previous subsection P (t′, Tm, Tm+1), t′ ≥m = q(t′), we can now construct P (t′, Tm) from these by employing a bond reconstitutionformula from a Gaussian model.
Assume that for short tenors, the LM model can be locally approximated by a one-factorGaussian model with constant mean reversion. In particular we have
P (t′, Tm) = P (0, t′, Tm) exp
[−G(t′, Tm)x(t′)− 1
2G(t′, Tm)2y(t′)
],
P (t′, Tm, Tm+1) = P (0, Tm, Tm+1)× exp [−(G(t′, Tm+1)−G(t′, Tm))x(t′)]
× exp
[−1
2(G(t′, Tm+1)2 −G(t′, Tm)2)y(t′)
]whereby one obtains
P (t′, Tm) = P (0, t′, Tm)
(P (t′, Tm, Tm+1)
P (0, Tm, Tm+1)
) G(t′,Tm)
G(t′,Tm+1)−G(t′,Tm)
×exp
(1
2G(t′, Tm)G(t′, Tm+1)y(t′)
)where
• The short rate volatility σ(t) can be approximated by the volatilities of the front Liborrate.
• The mean reversion parameter κ can be used to set the stub bond volatility near to itsmarket-implied value.
7.7 Advanced approximation for swaption pricing via Markovianprojection
So far we have seen how to calibrate the SV-LM model (74)-(75)
dLn(t) =√z(t)ϕ(Ln(t))λn(t)>
(√z(t)µn(t)dt+ dWB(t)
)µn(t) =
n∑j=q(t)
τjϕ(Lj(t))λj(t)
1 + τjLj(t)
dz(t) = θ[z0 − z(t)]dt+ ηψ(z(t))dZ(t), z(0) = 0
and how to use it to price non-vanilla derivatives via Monte Carlo discretization. In order tocalibrate the model to liquid caps and swaptions, we developed fast pricing formulas (78)-(81) for the prices thereof, which in the case of swaptions required the use of approximateddynamics for the swap rate S(t).
• When approximating the dynamics of the swap rate S(t) to price swaptions via (82)
dS(t) '√z(t)ϕ(S(t))‖λS(t)‖dY A(t)
the skew of the swap rate is taken to be the same as the (common) skew of all Liborrates, yet numerical simulation shows that this is not the case.
• The model (74)
dLn(t) =√z(t)ϕ(Ln(t))λn(t)>
(√z(t)µn(t)dt+ dWB(t)
)uses the same time-homogeneous local volatility function ϕ(·) for all Libor rates, whichimplies that swaptions of all tenors and expiries have essentially the same volatilityskew, a model feature which is again inconsistent with market reality.
68
We thus consider the following generalization the old LM model specification (74)-(75)
dLn(t) =√z(t)ϕn(t, Ln(t))λn(t)>dWTn+1(t), n = 1, . . . , N − 1 (88)
dz(t) = θ[z0 − z(t)]dt+ η(t)ψ(z(t))dZ(t), z(0) = 0 (89)
and assume
ϕn(t, Ln(0)) = 1
ϕn(t, x) ' 1 + bn(t) [x− Ln(0)] ,
bn(t) = ∂xϕn(t, Ln(0))
The dynamics of the swap rate S(t) = Sj,k−k(t) in this model are
dS(t) =√z(t)λS(t,L(t))>dWA(t) (90)
λS(t,L(t)) =
k−1∑n=j
∂S(t)
∂Ln(t)ϕn(t, Ln(t))λn(t)
or in one-dimensional form
dS(t) =√z(t)‖λS(t,L(t))‖dY A(t)
Using standard results on Markovian projection, we can derive the following approximation.
Proposition 7.5 (Swap rate dynamics approximation). For the purpose of European swap-tion valuation, the dynamics of the swap rate S(t) in measure QA are approximately givenby the following displaced log-normal stochastic volatility SDE
dS(t) '√z(t)pS(t) [1 + bS(t)(S(t)− S(0))] dY A(t) (91)
pS(t) = ‖λS(t, EA[L(t)]
)‖ (92)
bS(t) =1
pS(t)
k−1∑n=j
∂‖λS(t, EA[L(t)]
)‖
∂Ln(t)
∫ t0λn(u)>λS (u,L(0)) du∫ t0‖λS (u,L(0)) ‖2 du
(93)
As shown in Proposition 7.3, the price of a Tj-maturity European swaption is given by
Vswaption(0) = A(0)EA[(S(Tj)− c)+
]and S(T ) is driven by the stochastic volatility model with time-dependent parameters
dS(t) '√z(t)pS(t) [1 + bS(t)(S(t)− S(0))] dY A(t),
dz(t) = θ[z0 − z(t)]dt+ ηψ(z(t))dZ(t)
This is a stochastic volatility model with time-dependent parameters, so using the time-averaging techniques of [AP10-1, Section 9.3] we have devised a fast way of pricing Europeanswaptions.
The improvements over our early approximation essentially stem from the evaluationof the volatility function of the swap rate λS(t,L(t)) at L(t) = EA[L(t)] rather than atL(t) = L(0).
69
7.7.1 Advanced formula for the swap rate volatility
In order to compute the swap rate volatility (92)
pS(t) = ‖λS(t, EA[L(t)]
)‖
we need to approximate the expectations EA[Ln(t)]
Proposition 7.6 (Advanced formula for the swap rate volatility). For j ≤ n ≤ k.1, theexpected value of the n-th Libor rate in the annuity measure is approximately given by
EA[Ln(t)] = Ln(0)[1 + cn(t)]
cn(t) =1
Ln(0)Qn(0)
k−1∑i=j
∂Qn(0)
∂Li(0)
∫ t
0
λ>i (s)λn(s) ds,
Qn(t) =A(t)
P (t, Tn+1)
7.7.2 Advanced formula for the swap rate skew
We now show how to compute the swap rate skew (93).
Proposition 7.7 (Advanced formula for the swap rate skew). The time-dependent swaptionskew bS(t) 93 is approximately given by
bS(t) =
k−1∑i,n=j
rS,n(t)
rS,i(t)vi,n(0)ξi(t) +
k−1∑i=j
bi(t)vi(0)ξi(t)
rS,i(t) = λi(t)>λS(t,L(0)),
rS(t) = ‖λS(t,L(0))‖2
ξi(t) =rS,i(t)
∫ t0rS,i(u) du
rS(t)∫ t
0rS(u) du
7.7.3 Skew and smile calibration
The general model
dLn(t) =√z(t)ϕn(t, Ln(t))λn(t)>dWTn+1(t), n = 1, . . . , N − 1
dz(t) = θ[z0 − z(t)]dt+ η(t)ψ(z(t))dZ(t), z(0) = 0
ϕn(t, x) = 1 + bn(t) [x− Ln(0)]
has enough flexibility to match (in addition to the correlation structure of Libor rates):
• ATM volatilities of all European swaptions, using ‖λS(t)‖.• Slopes of volatility smiles (skews), using bn(t).
• Curvatures of smiles for swaptions of different expiries and fixed tenor, using η(t).
Consider NS swaptions Vswaption,1, . . . ,Vswaption,NS as calibration targets, where eachVswaption,i represents a collection of swaptions with fixed tenor and different strikes.
We first convert the observed market prices Vswaption,1, . . . , Vswaption,NS into market-
implied SV parameters, namely implied volatilities λSi , skews bSi and volatilities of variance
70
ηSi for i = 1, . . . , NS (these conversions are routinely maintained and updated by tradingdesks, c.f. [AP10-1, Page 378]), presumably by a simple root-search24.
As in the standard Libor model, consider a grid of calendar times tiNti=1 and tenors
xjNxj=1 and define:
• [Gλ]i,j = ‖λn(ti)‖, where we recall that n is such that Tn − ti = xj .
• [Gb]i,j = bn(ti).
• [Gη]i = η(ti).
The formulas from the previous section allow us to compute model term volatilities, skewsand volatilities of variance by
λSi(Gλ, Gb, Gη), bSi(Gλ, Gb, Gη), ηSi(Gλ, Gb, Gη)
Instead of defining a uniform norm incorporating all these terms (which would result ina hard multi-dimensional non-linear optimization problem), we define three norms25:
Iλ(Gλ, Gb, Gη) =ωS,λNS
NS∑i=1
(λSi(Gλ, Gb, Gη)− λSi
)2
+ · · ·
Ib(Gλ, Gb, Gη) =ωS,bNS
NS∑i=1
(bSi(Gλ, Gb, Gη)− bSi
)2
+ · · ·
Iη(Gλ, Gb, Gη) =ωS,ηNS
NS∑i=1
(ηSi(Gλ, Gb, Gη)− ηSi)2
+ · · ·
This leads to the following calibration algorithm:
1. Make initial guesses I0b and I0
η . For instance take
I0b =
1
NS
NS∑k=1
bSk , I0η =
1
NS
NS∑k=1
ηSk
2. Use the calibration algorithm in section 7.4.2 to obtain
G1λ := arg minGλIλ(Gλ, G
0b , G
0η).
3. Iterate over Gb to obtain
G1b := arg minGbIb(G
1λ, Gb, G
0η).
4. Iterate over Gη to obtain
G1η := arg minGηIη(G1
λ, G1b , Gη).
One could then iterate further starting with (G1b , G
1η) on the first step.
24See the end of section 7.4.1 for a concrete example.25Impact of changes in the Libor skews on term swaption volatilities, or their volatilities of variance, is rather
small.
71
7.8 Other extensions
7.8.1 Evolving separate discount and forward rate curves
A single yield curve for discounting and calculating Libor rates us not always compatible withno-arbitrage constraints of cross-currency markets. In general, separating the discountingcurve from the forward curve will ensure that linear instruments (i.e. swaps and bonds) arecorrectly priced at time 0.
7.8.2 SV models with non-zero correlation
So far we have assumed that the scalar Brownian motion Z(t) driving the variance processwas independent of all the components of the m-dimensional Brownian motion W (t). We canrelax this assumption and assume a non-zero deterministic correlation vector ρ(t) betweenZ(t) and WB(t).
7.8.3 Multi-stochastic volatility extensions
In the specification (74)-(75) of the LM model, a single stochastic volatility process√z(t) is
used to scale the diffusion coefficients of all forward rates. While sufficiently rich to introducethe volatility smile for all European swaptions, it presents some limitations.
(i) Interest rate exotics link to the spread of two CMS rates derive their values from thedistribution of the slope of the interest rate curve and a common stochastic volatilityfactor applied to all rates does not always allow for sufficient control over the distribu-tion of the slope of the interest rate curve.
(ii) The are of multi-dimensional stochastic volatility interest rate modeling is quite new.
72
8 Single-rate vanilla derivatives
Here we study derivatives whose payoffs can be decomposed as a linear combination offunctions of individual rate observations (either exactly or approximately).
8.1 European swaptions
Given a tenor structure 0 < T0 < · · · < TN , consider swaptions of different expiriesTnn=0,...,N−1 that can be exercised into swaps that start at Tn and cover m periods.Define
An,m(t) =
n+m−1∑i=n
τiP (t, Ti+1), Sn,m(t) =1
A(t)
n+m−1∑i=n
τiP (t, Ti+1)Li(t)
The time-t value of the m-period swaption expiring at Tn is
Vn,m(t) = An,m(t)En,mt[(Sn,m(t)− k)+
]For each swap rate Sn,m(t) we specify stochastic volatility (SV) dynamics in the annuity
measure
dSn,m(t) = λn,m [bn,mSn,m(t) + (1− bn,m)Sn,m(0)]√zn,m(t)dWn,m(t) (94)
dzn,m(t) = θ [1− zn,m(t)] dt+ ηn,m√zn,mdZ
n,m(t)
where recall
(i) θ is the mean-reversion of variance
• It controls the speed at which deviations of z(·) away from z0 are pulled back
towards this level. Increasing θ decreases the long-term variance of z().
• We assume it is global: same parameter for all swaptions.
• The idea is to choose θ in order to minimize the variability of ηn,mn,m acrossdifferent expiries n.
(ii) ηn,m is the volatility of variance and controls the curvature of the volatility smile.
Swaption calibration is performed individually for each grid point: fix a pair maturity-tenor (n,m). Suppose a collection of strikes K1, . . . ,KJ , along with corresponding marketprices V1, . . . , VJ (where we omit n,m from the notation since they are now fixed). Theoptimal parameters λ∗, b∗, η∗ are obtained as the solution of the following minimizationproblem
(λ∗, b∗, η∗) = argminλ,b,ηI(λ, b, η), I(λ, b, η) =
J∑j=1
wJ
(V (Kj ;λ, b, η)− Vj
)2
(95)
The algorithm to calibrate a SV-model to swaption prices is hence the following:
1. Choose θ.
2. For each maturity-tenor pair (n,m), find the optimal parameters (λ∗n,m, b∗n,m, η
∗n,m) by
solving the optimization problem (95).
3. If the ηn,m’s increase (decrease) with n, reduce (increment) θ and iterate.
73
8.2 Caps and floors
8.3 Terminal swap rate (TSR) models
• Pricing swaptions and caps is simple because valuation requires only knowledge of theterminal distribution of a single swap rate in the appropriate annuity measure.
1. Given a tenor structure 0 < T0 ≤ · · · ≤ TN , consider swaptions of different expiriesTnn=0,...,N−1 that can be exercised into swaps that start at Tn and cover mperiods. Define
An,m(t) =
n+m−1∑i=n
τiP (t, Ti+1), Sn,m(t) =1
A(t)
n+m−1∑i=n
τiP (t, Ti+1)Li(t)
The time-t value of the m-period swaption expiring at Tn is
An,m(t)En,mt[(Sn,m(t)− k)+
]2. A cap...
• Much more common a relatively simple payoffs that depend on the rate S(T ), butwhich also require the knowledge of additional discount bonds.
• Idea underlying TSR models: instead of using a full-blown term structure model,it’s better to functionally link the values of discount bonds on the date T to the drivingrate S(T ) through certain approximations.
Given a tenor structure 0 < T0 < T1 < · · · < TN and defining the annuity and swap rateas usual
A(t) = A0,N (t) =
N−1∑n=0
τnP (t, Tn+1), S(t) = S0,N (t) =P (t, T )− P (t, TN )
A(t)
the value of a derivative with payoff X in the annuity measure is
V (0) = A(0)EA[
X
A(T )
](96)
If P (T,M)M≥T is a family of discount bonds of various maturities (observed at timeT ), a TSR model exogenously specifies a family of maturity-indexed functionsπ(·,M)M≥T such that
P (T,M) = π(S(T ),M), ∀M ≥ T (97)
satisfying certain conditions
(i) No-arbitrage condition: when applying the valuation formula (96) to the specifica-tion (97) one should be able to reproduce the initial discount bond prices
P (0,M) = A(0)EA[π(S(T ),M)
A(T )
]= A(0)EA
[π(S(T ),M)∑N−1
n=0 τnπ(S(T ), Tn+1)
], M ≥ T
(98)
(ii) Consistency condition: the swap rate S(T ) itself is a function of discount factors
S(T ) = 1−P (T,TN )∑N−1n=0 τnP (T,Tn+1)
so we need to impose that
x =1− π(x, TN )∑N−1
n=0 τnπ(S(T ), Tn+1), ∀x (99)
74
(iii) Reasonable family of functions: we generally require the functions π(·,M)M torange between 0 and 1, to be monotonic in M and to be continuous in (x,M). Gener-ally one chooses a parametric class for the functions π(·,M)M and then one choosesfunctions within the parametric class satisfying the no-arbitrage and consistency con-ditions.
8.3.1 Linear TSR models
We use the linear specification
π(x,M)∑N−1n=0 τnπ(x, Tn+1)
= a(M)x+ b(M), M ≥ T (100)
for deterministic functions a(·), b(·).
• The no-arbitrage condition implies that
b(M) =P (0,M)
A(0)− a(M)S(0) (101)
• The consistency condition implies that
b(T0) = b(TN ) (102)
a(T0) = 1 + a(TN ) (103)
• The specification (100) itself imposes
N−1∑n=0
τna(Tn+1) = 0, (104)
N−1∑n=0
τnb(Tn+1) = 1. (105)
To complete the model specification one may proceed as follows:
(i) Choose coefficients a(T1), . . . , a(TN ) subject to condition (104), as described below.
(ii) Calculate a(T ) = a(T0) = 1 + a(TN ) from condition (103).
(iii) Calculate the rest of a(M)’s by interpolation of a(T1), . . . , a(TN ).(iv) Calculate the b(M)’s by condition (101).
The coefficients a(T1), . . . , a(TN ) can be connected to a mean reversion pa-rameter: this will not only reduce the number of parameters required, will also parameterizethe model with a single parameter that has a strong financial interpretation. The argumenton pages 713-714 on [AP10-2] shows that one may choose
a(M) =P (0,M) (γ −G(T,M))
P (0, TN )G(T, TN ) + S(0)γ
γ =
∑τnP (0, Tn+1)G(T, Tn+1)∑
τnP (0, Tn+1)
G(t, T ) =
∫ T
t
e−∫ utκ(s) ds du
This also eliminates the need to interpolate in (iii) and facilitates better risk management(c.f. discussion on page 714 in [AP10-3]).
75
8.4 Libor-in-Arrears
A Libor-in-Arrears cash flow pays the Libor rate on the date when it fixes, rather than onthe date it matures. We focus on a single cash flow, the valuation of a full strip followingby additivity.Let T > 0 be the start date, and M the end date of the period covered by
Libor rate L(t) ≡ L(t;T,M) = P (t,T )−P (t,M)τP (t,M) , with τ = M − T . The time-0 value of a
Libor-in-Arrears cash flow is
VLIA(0) = β(0)E
[L(T )
β(T )
]One would then naturally switch to the T -forward measure, but traded caplets provide
information about the distribution of the Libor rate in the M -forward measure, not theT -forward measure26.
The valuation formula in the M -forward measure is
VLIA(0) = P (0,M)EM[
L(T )
P (T,M)
]= P (0,M)EM [(1 + τL(T ))P (T,M)]
= P (0,M)(L(0) + τEM
[L2(T )
])The term EM
[L2(T )
]can be computed by Proposition ??
EM[L2(T )
]= L2(0) + 2
∫ L(0)
−∞p(0, L(0);T,K) dK + 2
∫ ∞L(0)
c(0, L(0);T,K) dK (106)
where p(t, L;T,K) and c(t, L;T,K) are undiscounted values of put and call options on therate L(T ) with strike K (i.e. simple undiscounted floorlets and caplets).
8.5 Libor-with-Delay
A Libor-with-delay cash-flow pays the Libor rate L(t, T,M) at some arbitrary payment timeTp ≥ T . In the M-forward measure we have
VLD(0) = P (0,M)EM[P (T, Tp)
P (T,M)L(T )
]and we can now use a TSR model to express P (T, Tp) in terms of the Libor rate.
Alternatively one can draw inspiration from the quasi-Gaussian model to express P (T, Tp)in terms of L(T ). Recall that
P (T,M) = P (T,M ;x(T ), y(T )), P (T,M ;x, y) =P (0,M)
P (0, T )exp
(−G(T,M)x− 1
2G(T,M)2y
)Approximating y(T ) by a deterministic function y(T ) := E[y(T )] and denoting by X(T, `)
the inverse function in x to L(T, x, y(T )) we can write
P (T, Tp) = P (T, Tp;X(T, L(T )), y(T )) (107)
26Note that L(t) = L(t;T,M) is a martingale in the M -forward measure, but not in the T -forward measure, soET [L(T )] 6= L(0). To characterize this situation one defines the Libor-in-Arrears convexity adjustment
DLIA(0)def= ET [L(T )]− L(0).
In general, by convexity we generally mean the difference of valuations under different measures (in this case thedifference between the measure appropriate for the given payment date and the measure in which the market rateis a martingale).
76
Assuming a linear relationship between the Libor rate and the state variable x
L(T ) ' L(T, 0, 0) +∂L
∂x(T, 0, 0)x(T ) =⇒ Var(L(T )) '
(∂L
∂x(T, 0, 0)
)2
Var(x(T ))
whence
y(T ) = Var(x(T )) '(∂L
∂x(T, 0, 0)
)−2
Var(L(T ))
Thus given P (T, Tp), we can solve equation (107) implicitly for L(T ) to obtain our soughtfor relation. Note that this procedure might result in a mild violation of the no-arbitragecondition.
8.6 CMS and CMS-Linked cash flows
A CMS (linked) cash-flow pays (a function of) the swap rate S(T ) at time T ≤ Tp < T1.It is hence more naturally valued in Tp-forward measure, even though the market-implieddistribution of S(T ) is known in the annuity measure. We have
VCMS(0) = A(0)EA[P (T, Tp)
A(T )S(T )
](108)
Lemma 8.1 (Replication method for CMS pay-offs). Assuming that the annuity mapping
function α(S(T )) =P (T,Tp)A(T ) is known, then
VCMS(0) = A(0)S(0)α(S(0))+
∫ S(0)
−∞w(K)Vrec(0,K) dK+
∫ ∞S(0)
w(K)Vpay(0,K) dK (109)
where
Vrec(0,K) = A(0)EA[(K − S(T ))+
]Vpay(0,K) = A(0)EA
[(S(T )−K)+
]w(s) =
d2
ds2(α(s)s)
where Vrec(0,K), Vpay(0,K) are the time-0 values of receiver and payer swaptions.
Using iterated conditioning one easily sees:
Proposition 8.2. The annuity mapping function α(s) may be written as the conditionalexpectation
α(s) = EA[P (T, Tp)
A(T )
∣∣∣∣S(T ) = s
](110)
In order to compute the conditional expectation (110), it is useful to remember that giventwo random variables X,Y , the conditional expectation can be interpreted as
E[X|Y ] = f∗(Y ), f∗ = argminE[(X − f(Y ))2
], f ∈ B
for a space B of suitably regular functions on Y , generally defined by a parametric functionalform B =
f(y; θ), θ ∈ Θ ⊂ Rd
. Imposing the conditions
∂
∂θiE[(X − f(Y ; θ))2
]= 0, i = 1, . . . , d
we obtain the following Proposition.
77
Proposition 8.3. Given two random variables X and Y and a paremetric set of functionsf(y; θ), θ ∈ Θ ⊂ Rd, an approximation to E[X|Y ] is given by
E[X|Y ] ' f(Y ; θ∗)
where θ∗ is a solution to the set of equations
E
[X∂f
∂θi(Y ; θ)
]= E
[f(Y ; θ)
∂f
∂θi(Y ; θ)
], i = 1, . . . , d (111)
8.6.1 Linear TSR model
Assume α(s) = α1s+ α2. Imposing the no-arbitrage condition we obtain
P (0, Tp)
A(0)= EA
[P (T, Tp)
A(T )
]= EA [α1S(T ) + α2] = α1S(0) + α2
whence
α2 =P (0, Tp)
A(0)− α1S(0)
α1 can be considered an exogenous input. The value of the CMS payoff is hence
VCMS(0) = A(0)EA[P (T, Tp)
A(T )S(T )
]= EA[α(S(T ))S(T )] = EA [(α1S(T ) + α2)S(T )]
= P (0, Tp)S(0) + α1A(0)VarA[S(T )]
where the variance VarA[S(T )] can be computed (as in the Libor-in-arrears case) by thereplication method27
VarA[S(T )] = 2
∫ S(0)
−∞S(K)p(0, S(0);T,K) dK + 2
∫ ∞S(0)
S(K)c(0, S(0);T,K) dK
8.6.2 Libor market model
Establishing dependency explicitly would be too complicated if our sole purpose is to priceCMS cash-flows, but it would be useful for applications of Libor market models to exoticderivatives that are linked to CMS rates (e.g. callable CMS range accruals or CMS spreadTARN’s).
Proposition 8.4. See [AP10-3, Proposition 16.6.3]. The annuity mapping function (8.2)in the Libor market model is approximated by
α(s) = EA[P (T, Tp)
A(T )
∣∣∣∣S(T ) = s
]'
(N−1∑n=0
τn
n∏i=0
1
1 + τi`i(s)
)−1
where
`n(s) = Ln(0)
(1 + cn
s− S(0)
S(0)
)27Recall that for any twice-differentiable function we have
E[f(S(T ))] = f(K∗) + f ′(K∗)(S(0)−K∗) +
∫ K∗
−∞p(0, S(0);T,K)f ′′(K) dK
+
∫ ∞K∗
c(0, S(0);T,K)f ′′(K) dK (112)
In our case, we apply it with K∗ = S(0) and f(S(T )) = (S(T )− S(0))2.
78
8.6.3 Quasi-Gaussian model
Recall the bond reconstitution formula in the qG-model
P (T,M) = P (T,M ;x(T ), y(T )),
P (T,M ;x, y) =P (0,M)
P (0, T )exp
(−G(T,M)x− 1
2G(T,M)2y
)and recall that y(T ) is well approximated by a deterministic function
y(T ) =
(∂L
∂x(T, 0, 0)
)−2
VarA(S(T ))
where VarA(S(T )) is computed consistently with the market via replication as above. Onecan then define an annuity mapping
α(s)P (T, Tp;X(T, s), y(T ))
A(T,X(T, s), y(T ))
where X(T, s) is the inverse function (in x) to S(T, x, y(T )) which can then be used tocompute VCMS(0) via (111).
Computational remark. Note that
dα
ds=
1
A(T,X(T, s), y(T ))
[∂P (T, Tp, X(T, s), y(T ))
∂x
∂X(T, s)
∂sA(T,X(T, s), y(T ))
−P (T, Tp, X(T, s), y(T ))∂A(T,X(T, s), y(T ))
∂x
∂X(T, s)
∂s
]To compute dα(K)
ds we need
• X(T,K): by construction S(T,X(T, s), y(T )) = s so X(T,K) is the x-solution to theequation S(T, x, y(T )) = K.
• ∂X(T,K)∂s : differentiating implicitly we obtain
X(T, S(T, x, y(T ))) = x =⇒ ∂X(T, s)
∂s
∂S(T, x, y(T ))
∂x= 1
whence∂X
∂s(T,K) =
(∂S
∂x(T,K, y(T ))
)−1
One proceeds analogously to compute d2α(K)ds2 .
8.6.4 Correcting non-arbitrage-free methods
Some annuity mapping methods are not arbitrage-free by construction (e.g. Libor) and insome others, like the linear TSR model, despite being arbitrage-free theoretically, numericalmethods can induce slight errors. We introduce a modification to amend this error. Thevaluation formula for CMS cash-flows reads
VCMS(0) = A(0)EA[P (T, Tp)
A(T )S(T )
]= A(0)EA [α(S(T ))S(T )]
In order to have an arbitrage-free model, sinceP (T,Tp)A(T ) is a tradeable asset, we need
EA [α(S(T ))] = EA[P (T, Tp)
A(T )
]=P (0, Tp)
A(0)
79
Since this may fail in some models, we can re-scale the function α(s) to force the previousequality to be satisfied. Defining
α(s) =P (0, Tp)
A(0)
α(s)
EA [α(S(T ))],
it is clear that EA [α(S(T ))] =P (0,Tp)A(0) and we obtain the improved CMS valuation formula
VCMS(0) = A(0)EA [S(T )α(S(T ))] = P (0, Tp)EA [S(T )α(S(T ))]
EA [α(S(T ))]
This correction is useful even for arbitrage-free models, since there may be arbitrage inducedby the numerical scheme.
8.6.5 CDF and PDF of CMS rate in Forward measure
We look at an alternative to the replication method of Lemma 8.1, since the weights maybe hard to compute for discontinuous payoffs. The problem of pricing a cash-flow that paysg(S(T )) at time Tp ≥ T can be formulated as
ETp [g(S(T ))] =
∫ ∞−∞
g(s)ψTp(s) ds
where ψTp(s) is the density of S(T ) in the Tp-forward measure.The CDF and PDF of the swap rate S(T ) in the annuity measure28 are given respectively
by
ΨA(K) = 1 +∂
∂Kc(K), ψ(K) =
∂2
∂K2c(K)
where c(K) = EA [(S(T )−K)+]. Changing to the Tp-forward measure we obtain
Proposition 8.5. Given an annuity mapping function α(s) = EA[P (T,Tp)A(T )
∣∣∣S(T ) = s], the
PDF ψTp(s) and the CDF ΨTp(s) of the swap rate S(T ) in the Tp-forward measure are givenby
ψTp(s) =A(0)
P (0, Tp)α(s)ψA(s), (113)
ΨTp(s) =A(0)
P (0, Tp)
∫ s
−∞α(u)ψA(u) du (114)
Sketch of proof. Note that ψTp = ETp [δ(S(T )−K)], so changing to the annuity measureand using iterated conditioning yields the result.
When the annuity mapping function is linear α(s) = α1s+α2, expressions (113-114) takea simple form
28The price of a call option is given by C(S,K) =∫∞KϕS(T )(x)(x − K) dx Differentiating with respect to K
twice we obtain
∂C(S,K)
∂K=
∂
∂K
[∫ ∞K
ϕS(T )(x) dx−K∫ ∞K
ϕS(T )(x) dx
]= −
∫ ∞K
ϕS(T )(x) dx
∂2C(S,K)
∂K2= ϕS(T )(K)
80
Corollary 8.6. In the linear TSR model, namely when α(s) = α1s + α2, the CDF ΨTp(s)and the PDF ψTp(s) of the swap rate in the Tp-forward measure are given by
ΨTp(s) =A(0)
P (0, Tp)
[α1(S(0)− s− c(s)) + α(s)ΨA(s)
](115)
ψTp(s) =A(0)
P (0, Tp)(α1sα2)ψA(s), (116)
c(s) = EA[(S(T )− s)+
]Finally notice that since a CMS caplet payoff is given by
VCMS−caplet(0,K) = P (0, Tp)ETp[(S(T )−K)+
],
there is an explicit link between VCMS−caplet(0,K) and ψTp(s).
Lemma 8.7 (Link between CMS caplet payoff and swap rate in Tp-forward measure). Themarket-implied PDF ψTp(s) of the swap rate in the Tp-forward measure can be obtained fromvalues of CMS caplets by
ψTp(K) =1
P (0, Tp)
∂2
∂K2VCMS−caplet(0,K) (117)
8.6.6 SV model for CMS rate
It is useful, specially when dealing with vanilla derivatives linked to multiple market rates, tohave PDFs and CDFs thereof in the forward measure to be of tractable form. Unfortunately,this is not the case if one starts with the SV model in the annuity measure and then switchesto the forward measure. We seek to obtain an approximation to the PDF ψTp(s) that is fromthe same family as the PDF ψA(s).
Assume the swap rate follows
dS(t) = λ (bS(t) + (1− b)S(0))√z(t)dWA(t) (118)
dz(t) = θ(1− z(t))dt+ η√z(t)dZA(t), z(0) = 1 (119)
E[dWA(t)dZA(t)] = 0
This system defines the PDF ψA(·) of S(T ) in measure QA. Define an adjusted process S(t)given by the following dynamics in the Tp-forward measure
dS(t) = λ(bS(t) + (1− b)S(0)
)√z(t)dWTp(t)
dz(t) = θ(1− z(t))dt+ η√z(t)dZTp(t), z(0) = 1 (120)
E[dWTp(t)dZTp(t)] = 0
Align the mean of S(T ) to equal the CMS-adjusted value of S(T ), namely
S(0) = ETp [S(T )]
Goal: define ψTp(·) to be the PDF of S(T ) given by the previous system of SDEs andaim to set the adjusted parameters λ, b, η such that the distribution of S(T ) is as close aspossible to that of S(T ) in the Tp-forward measure.
• S(T ) = ETp [S(T )] is calculated by the replication method of Lemma 8.1.
81
• Since VCMS−caplet(0,K) = P (0, Tp)ETp [(S(T )−K)+] and S(0) = ETp [S(T )], by iter-
ated conditioning we see that in fact
VCMS−caplet(0,K) = P (0, Tp)ETp[(S(T )−K)+
]so that CMS caplets are nothing more than European calls on S(T ). Therefore, λ, b, ηcan be obtained by direct calibration of the SV model (120) (which is in the Tp-forwardmeasure) to prices of CMS caplets that are computed in the original SV model29 forS(T ) (with dynamics specified in the annuity measure).
8.6.7 Dynamics of CMS rate in forward measure
Consider a stochastic volatility model in the QA-measure
dS(t) = λϕ(S(t))√z(t)dWA(t) (121)
dz(t) = θ(1− z(t))dt+ ηψ(z(t))dZA(t), z(0) = 1 (122)
E[dWA(t)dZA(t)] = 0
In the previous subsection we absorbed the measure change to QTp into the SV diffusionparameters: namely, we defined an analogous system in the QTp -measure and adjusted itsparameters to obtain market-consistent results. We now seek to bring the two-dimensionalSDE (121)-122) into an approximation of the Tp-forward measure30. Concretely, we construct
a measure QTp such that for any function f(·) we have
EA[P (T, Tp)
A(T )f(S(T ))
]=P (0, Tp)
A(0)ETp [f(S(T ))]
Proposition 8.8. See [AP10-3, Proposition 16.6.8]. Defining the measure QTp by thecondition that the process (z(t), S(t)) satisfies the following SDE system in QTp
dS(t) = λϕ(S(t))√z(t)
(dWTp(t) + vS(t)dt
)(123)
dz(t) = θ(1− z(t))dt+ ηψ(z(t))(dZTp(t) + vZ(t)
), z(0) = 1
E[dWTpdZTp
]= 0
where the drift adjustments are given by
vZ(t) = ηψ(z(t))∂z ln [Λ(t, z(t), S(t))] (124)
vS(t) = λϕ(S(t))√z(t)∂S ln [Λ(t, z(t), S(t))]
and where the function Λ(t, z, s) satisfies the following PDE
∂tΛ(t, z, s) + θ(1− z)∂zΛ(t, z, s) +1
2η2ψ(z)2∂2
zzΛ(t, z, s)
+1
2λ2ϕ(s)2z∂2
ssΛ(t, z, s) = 0, t ∈ [0, T ] (125)
with terminal condition
Λ(T, z, s) = α(s) :=A(0)
P (0, Tp)α(s) (126)
29These are obtained by the usual replication method of Lemma 8.1.30Namely, a measure that will resemble a forward measure for European-type payoffs fixing at T and paying at
Tp.
82
Then for any function f(·) we have
A(0)
P (0, Tp)EA
[P (T, Tp)
A(T )f(S(T ))
]= ETp [f(S(T ))]
Remark 8.1. This Proposition establishes a numerical scheme for simulating (z(t), S(t)) inQTp for the purpose of pricing European-style derivatives fixing at T and paying at Tp.
1. Solve the PDE (125-126) numerically on a grid (t, z, s).
2. Perform Monte Carlo simulation for (123), using the drift adjustments vZ(t) and vS(t),which are computed for each path using (124).
Remark 8.2. If α = α1s+ α2 then no finite difference scheme is needed since
dvz(t) = 0,
dvS(t) = λϕ(S(t))√z(t)
α1
α1S(t) + α2
8.7 Quanto CMS
Given a swap rate S(T ) in domestic currency, a quanto CMS cash-flow pays g(S(T )) at timeTp ≥ T in some other foreign currency and it thus has value
VQuantoCMS(0) = βf (0)Ef[
1
βf (Tp)g(S(T ))
]Let X(t) be the foreign exchange rate expressing one unit of domestic currency in foreign
currency units. In the domestic risk-neutral measure (and in foreign currency units) thepricing formula can be written as
VQuantoCMS(0) =βd(0)
X(0)Ed[
1
βd(Tp)g(S(T ))X(Tp)
][1]=
βd(0)
X(0)Ed[
1
βd(T )g(S(T ))βd(T )EdT
[1
βd(Tp)X(Tp)
]][2]=
βd(0)
X(0)Ed[
1
βd(T )g(S(T ))P (T, Tp)XTp(T )
]where in [1] we used iterated conditioning and in [2] we used a change of measure βd(T )EdT
[1
βd(Tp)X(Tp)]
=
Pd(T, Tp)EdT [X(Tp)] and the notation XTp(T ) =
Pf (T,Tp)X(T )Pd(T,Tp) .
Changing to the annuity measure we obtain
VQuantoCMS(0) =A(0)
X(0)Ed[
1
A(T )g(S(T ))Pd(T, Tp)XTp(T )
]=A(0)
X(0)EA,d [g(S(T ))v(S(T ))]
where
v(s) = EA,d[Pd(T, Tp)
A(T )XTp(T )|S(T ) = s
].
Recalling the definition of the annuity mapping function α(s) = EA,d[P (T,Tp)A(T ) |S(T ) = s
]and defining χ(s) = EA,d
[XTp(T )|S(T ) = s
]we can approximate
v(s) ' α(s)χ(s)
83
so that
VQuantoCMS(0) ' A(0)
X(0)EA,d [g(S(T ))α(S(T ))χ(S(T ))]
and this can be computed via replication once the function χ(·) is determined.
In order to compute χ(s) = EA,d[XTp(T )|S(T ) = s
]one needs to specify a joint distri-
bution of the swap rate S(T ) and the forward FX rate XTp(T ) and this can be accomplishedby means of a Gaussian copula. The marginal one-dimensional PDF of S(T ) in measureQA,d is given by ψA(s) (given by the swaption model). Assuming that XTp(T ) is log-normalin QA,d we have
XTp(T ) = X(0)emXT+σX√Tξ1 , ξ1 ∼ N(0, 1).
where
• σX is obtained by calibrating to T -expiry ATM options on the FX rate. For long-datesquanto contracts, the FX smile make start to matter and we may have to allow σX tovary stochastically. This can be accomplished, for instance, using copulas.
• mX is clarified below.
Given ξ2 ∼ N(0, 1) with ρXS = Corr(ξ1, ξ2) we have
S(T )d= (ΨA)−1 (Φ(ξ2))
whence
χ(s) = EA,d[XTp(T )|S(T ) = s
]= X(0)emXTEA,d
[eσX√Tξ1 |ξ2 = Φ−1(ΨA(s))
]not= X(0)emXT χ(s),
χ(s) = exp
(ρXSσX
√TN−1(ΨA(s)) +
1
2σ2XT (1− ρ2
XS)
)In order to establish the constant mX we impose a no-arbitrage condition. Since XTp(·)
is a QTp,d-martingale we have
XTp(0) = ETp,d[XTp(T )] =A(0)
Pd(0, Tp)EA,d
[Pd(T, Tp)
A(T )XTp(T )
]=
A(0)
Pd(0, Tp)EA,d [α(S(T ))χ(S(T ))]
= X(0)emXTA(0)
Pd(0, Tp)EA,d [α(S(T ))χ(S(T ))]
and we can then solve for e−mXT .
8.8 Eurodollar futures
Eurodollar futures are exchange-traded contracts on Libor rates. The latter are inputs intoan interest rate curve construction whereas the former are liquidly quoted in the market.It is important to be able to value ED futures efficiently, due to their high trading volumeand their use in yield curve construction, which rules out Monte Carlo methods and thusnecessitates the development of analytic approximations that incorporate the volatility smile.
84
Daily mark-to-market causes forward and futures contracts values to differ. If F (t;T,M)and L(t;T,M) are the time-t futures and Libor rates respectively covering the period [T,M ],recall that L(t;T,M) = EMt [L(T ;T,M)].
For futures contracts that are marked-to-market continuously we have
F (t;T,M) = Et[L(T ;T,M)]
If mark-to-market happens discretely, consider a tenor structure 0 = T0 < T1 < · · · < TNand let Ln(t) = L(t;Tn, Tn+1). Denote by QB the spot Libor measure induced by the process
B(t) = P (t, Tq(t))
q(t)−1∏n=0
(1 + τnLn(Tn))
Then if the futures rate is marked-to-market only on the dates T0 < T1 < · · · < Tn = T < M ,we have
F (t, T ;M) = EBt [L(T ;T,M)]
8.8.1 The problem and the plan
Assuming that all futures rates Fn(0)n=0,...,N−1 are known (market-observable), our goalis to derive formulas that express forward Libor rates Ln(0) = L(0;Tn, Tn+1) in terms offutures Fn(0). The road map to derive the ED futures valuation formula is:
1. Use expansion technique to express a forward rate as a (model-independent) functionof a collection of futures rates with expiries on or before the expiry of the forward rate.
2. Separate variance terms appearing in previous formula into slow- and fast-moving.
3. Fast-moving volatility parameters
4. Slow-moving correlation parameters
8.8.2 Expansion around futures value
Since Ln(0) = En+1[Ln(Tn)] for all n = 0, . . . , N−1, changing to the spot measure we obtain
Ln(0) = EB[dQn+1
dQBLn(Tn)
]= EB
[Ln(Tn)
n∏i=0
1 + τiLi(0)
1 + τiLi(Ti)
](127)
We seek to express the expectation in the RHS of (127) in terms of market-observedquantities by expanding it in powers of a small parameter that measures the deviation ofeach Ln(Tn) from its mean in the spot measure Fn(0) = EB [Ln(Tn)]. Define
Lεn(t)def= Fn(0) + ε (Ln(t)− Fn(0))
V (ε) = Lεn(Tn)
n∏i=0
1 + τiLi(0)
1 + τiLεi(Ti)
Since clearly L1n(t) = Ln(t), we have Ln(0) = EB [V (1)]. Expanding V (ε) into a Taylor series
in ε yields
V (ε) = V (0) + EB[dV
dε(0)
]ε+
1
2EB
[d2V
dε2(0)
]ε2 +O(ε3) (128)
Computing these derivatives (c.f. [AP10-3, Lemma 16.8.4]) one obtains the following:
85
Theorem 8.9. For any n = 0, . . . , N − 1, an approximation to the forward rate Ln(0) isobtained from the futures Fi(0)ni=0 and forward rates for previous periods Li(0)n−1
i=0 bysequentially solving the equations
Ln(0) = EB [V (1)] = V (0)
1 +1
2
n∑j,m=0
Dj,mCovB(Lj(Tj), Lm(Tm))
(129)
with
V (0) = Fn(0)
n∏i=0
1 + τiLi(0)
1 + τiFi(0)
Dj,m = · · ·
8.8.3 Forward rate variances
The covariances in (129) are, by definition
CovB(Lj(Tj), Lm(Tm)) =
√VarB(Lj(Tj))
√VarB(Lm(Tm))CorrB(Lj(Tj), Lm(Tm))
The spot-measure variances VarB(Ln(Tn)) can be approximated in the forward measure31
VarB(Ln(Tn)) ' Varn+1(Ln(Tn))
The variances Varn+1(Ln(Tn)) can be computed by calibrating a low-parametric vanillamodel to liquid strikes. Assuming
dLn(t) = σn (bnLn(t) + (1− bn)Ln(0))√z(t)dWn+1(t)
dz(t) = θ (1− z(t)) dt+ ηn√z(t)dZn+1(t)
E[dWn+1(t)dZn+1(t)
]= 0
After calibrating the parameters (σn, bn, ηn) to swaption prices, the forward variances aregiven by
Varn+1(Ln(Tn)) =L2n(0)
b2n
[Ψz((σnbn)2, 0;Tn)
]Ψz(v, u; t) = · · ·
8.8.4 Forward rate correlations
See [AP10-3, Page 759].
31See [AP10-3, Page 757]. The error made by the approximation is claimed to be small.
86
9 Multi-rate vanilla derivatives
Here we look at payoffs that are linked to more than one swap rate, the CMS spread beingthe most prominent example, without resorting to a full term structure model. Concretely,given a payment date Tp, a collection of swap or Libor rates S1(t), . . . , Sd(t), a collectionof fixing dates t1, . . . , td and a d-argument function f(s1, . . . , sd), a multi-rate derivative isdefined by a time-Tp payoff
V (Tp) = f(S1(t1), . . . , Sd(td))
We outline the notion of copula and we indicate how it can be used to specify and controlthe dependence structure between the rates involved.
9.1 Dependence structure via copulas
The copula approach is a method of constructing a joint distribution of random variablesconsistently with pre-specified one-dimensional marginal distributions.
9.2 Gaussian copula method
Assume we are given one-dimensional CDF’s Ψ1(·), . . . ,Ψd(·) and that we want to construct amulti-dimensional random vector (X1, . . . , Xd) with a measure of control over the dependenceof the random variables Xi, constrained so that each variable Xi has CDF Ψi(·). We canaccomplish this by specifying Xi = Ψ−1
i (Φ(Zi) where (Z1, . . . , Zd) is a multi-dimensionalnormalized Gaussian vector with correlation matrix R and where Φ(·) is the standard one-dimensional Gaussian CDF. Clearly
P (Xi ≤ x) = P (Ψ−1i (Φ(Zi) ≤ x) = P (Zi ≤ Φ−1(Ψi(x))) = Φ
(Φ−1(Ψi(x))
)= Ψi(x)
so Xi indeed has CDF Ψi(·). Denote by
• Ψgauss(x1, . . . , xd) the CDF of (X1, . . . , Xd) just constructed.
• Φd(z1, . . . , zd;R) the CDF of (Z1, . . . , Zd).
Then
Ψgauss(x1, . . . , xd) = P (X1 ≤ x1, . . . , Xd ≤ xd)= P
(Z1 ≤ Φ−1(Ψ1(x)), . . . , Zd ≤ Φ−1(Ψd(x))
)= Φd
(Φ−1(Ψ1(x1)), . . . ,Φ−1(Ψd(xd));R
)(130)
One easily obtains the PDF as
ψ(x1, . . . , xd) =∂d
∂x1 · · · ∂xdΨgauss(x1, . . . , xd)
= φd(Φ−1(Ψ1(x1)), . . . ,Φ−1(Ψd(xd));R
)·d∏i=1
ψi(xi)
φ (Φ−1(Ψi(xi)))
where φ(z) and φd(z1, . . . , zd;R) are the one- and d-dimensional Gaussian PDF’s, andψi(xi) is the one-dimensional PDF of Xi.
87
9.3 General couplas
Note that we can re-write the Gaussian copula CDF (130) as
Ψgauss(x1, . . . , xd) = Φd(Φ−1(Ψ1(x1)), . . . ,Φ−1(Ψd(xd));R
)= Cgauss (Ψ1(x1), . . . ,Ψd(xd);R)
whereCgauss(u1, . . . , ud;R) = Φd
(Φ−1(u1),Φ−1(ud);R
)is a multi-dimensional distribution function for a vector of d uniform random variables32.
Definition 9.1 (general copula). A d-dimensional copula function is a function C : [0, 1]d→[0, 1],C(u1, . . . , ud) defining a valid joint distribution function for a d-dimenaional vector of ran-dom variables, with each variable being uniformly distributed on [0, 1].
Requiring that C(u1, . . . , ud) be a distribution imposes strong constraints on the form ofthe function C. In particular
C(u1, u2, . . . , ui−1, 0, ui+1, . . . , ud) = 0
C(1, 1, . . . , 1, ui, 1, . . . , 1) = ui
The simplest examples of couplas are the following:
1. Independence copula: if all d uniform random variables underlying the copula are in-dependent, we obtain
CID(u1, . . . , ud) =
d∏i=1
ui
2. Perfect dependence copula: introduce d uniform random variables U1 = · · · = Ud. Then
CD(u1, . . . , ud) = P (U1 ≤ u1, . . . , Ud ≤ ud) = P
(U1 ≤ min
i=1,...,dui
)= mini=1,...,d
ui
The following result allows us to construct copulas from other copulas and will be thebasis to obtain a useful copula for multi-rate derivatives.
Proposition 9.2 (making copulas from copulas). Let C1, . . . , CM be d-dimensional copulasand let qm,i : [0, 1]→[0, 1], m = 1, . . . ,M , i = 1, . . . , d be functions that are either strictly
increasing or identically equal to 1. Suppose that∏Mm=1 qm,i(u) = u for u ∈ [0, 1] and
limu→0+ qm,i(u) = qm,i(0). Then
C(u1, . . . , ud) =
M∏m=1
Cm(qm,1(u1), . . . , qm,d(ud))
is a coupla
In particular, takingM = 2, C1 =∏di=1 ui the independence copula, C2 = Cgauss(u1, . . . , ud;R),
q1,i(u) = u1−θi , q2,i(u) = uθi for θi ∈ [0, 1], i = 1, . . . , d we obtain the so-called power Gaus-sian copula.
Corollary 9.3 (power Gaussian copula). Let R be a d × d correlation matrix and θ =(θ1, . . . , θd) ∈ [0, 1]d a d-dimensional vector of parameters. Then the power Gaussian function
CPG(u1, . . . , ud;R) =
(d∏i=1
u1−θ1
)Cgauss(u
θ11 , . . . , u
θdd ;R) (131)
32Recall that if X is a random variable with CDF FX(x) then Y = FX(X) has a uniform distribution over(0, 1). Indeed,
P [Y ≤ y] = P [FX(X) ≤ y] = P[X ≤ F−1
X (y)]
= FX(F−1X (y)
)= y
88
9.4 Copula methods for spread options
Spread options are the most liquid among the multi-rate vanilla derivatives. The payoff ofa spread option is given by (S1(T )− S2(T )−K)
+at time Tp, so the (undiscounted) value f
the spreads options is
V (0;T,K) = ETp[(S1(T )− S2(T )−K)
+]
Using a two-dimensional Gaussian copula with correlation
ρ = Corr(S1(T ), S2(T ), R =
(1 ρρ 1
)1. Derive market-implied density for each swap rate Si(T ) under its own annuity measure
ψAii (s), i = 1, 2 using swaption prices33.
2. Use Proposition 8.5 to convert each ψAii (x) into the PDF ψTpi (x) in the Tp-forward
measure
ψTp(s) =A(0)
P (0, Tp)α(s)ψA(s), (132)
ΨTp(s) =A(0)
P (0, Tp)
∫ s
−∞α(u)ψA(u) du (133)
where α(s) = EA[P (T,Tp)A(T )
∣∣∣S(T ) = s].
3. Given the correlation matrix R, use the Gaussian copula function to construct a jointPDF
ψTp(x1, x2;R) = cgauss(ΨTp1 (x1),Ψ
Tp2 (x2)) · ψTp1 (x1)ψ
Tp2 (x2)
where
cgauss(u1, u2;R) =φ2
(Φ−1(u1),Φ−1(u2);R
)φ(Φ−1(u1))φ(Φ−1(u2))
4. Integrate the payoff (x1 − x2 − K)+ against the density ψTp(x1, x2;R) to obtain theundiscounted model price of the spread option
Vmdl(0;T,K,R) =
∫ ∫(x1 − x2 −K)+ψTp(x1, x2;R) dx1 dx2
Finally, in order to calibrate the model to market:
5. Find the implied spread correlation ρ = ρ(T,K) such that
Vmdl(0;T,K,R) = Vmkt(0;T,K)
The main issue with this algorithm stems from the so-called correlation smile: for a fixedmaturity date, one obtains a different implied spread correlation for each strike. A simpleGaussian copula has insufficient flexibility to capture the market-observed distributions ofCMS spreads. Essentially, varying ρ allows one to shift the implied spread option volatilitysmile in parallel, making it impossible to match the market-implied smile (see figure 2).
33Explain
89
Figure 2: Implied normal spread volatility for a 5-year spread option on the difference between10-year and 2-year swap rates, [AP10-3, Figure 17.1].
9.4.1 Power gaussian
One way of fixing the correlation smile issue outlined in the previous subsection is to resortto a two-dimensional power Gaussian copula
CPG(u, v; ρ, θ1, θ2) = u1−θ1v1−θ2Cgauss(uθ1 , vθ2 ; ρ)
where ρ is a correlation coefficient and θ1, θ2 ∈ [0, 1]. As above, observes empirically thatρ allows to shift the implied volatility spread vertically, while the parameters θ1, θ2 providegood control over the slope and curvature of the smile (see figure 3).
Figure 3: Implied normal spread volatility for a 5-year spread option on the difference between10-year and 2-year swap rates obtained using the power Gaussian copula in three parameterscenarios, [AP10-3, Figure 17.2].
9.5 Limitations of the copula method
• Copulas do not result n a dynamic model for the yield curve.
• Copulas allow us to easily ascribe a joint terminal distribution to a collection of CMSrates, for the purpose of pricing European multi-rate options.
• Copulas in practical use are not usually chosen for their links to observed financialrelationships but instead for their technical properties and ease of implementation.
90
• Even though the power Gaussian copula is capable of reproducing a wide range ofmarket-observed shapes of spread volatility smiles, to date no copula has been foundthat reproduced market volatilities of spread options exactly for all values of spreadstrikes. See [AP10-3, Section17.4.4] for some insights.
91
10 Callable Libor Exotics
10.1 Review of basic notions
• In an exotic swap, a regular floating Libor rate is swapped against structured couponsthat are allowed to be arbitrary functions of observed interest rates (e.g. Libor or CMSrates).
For instance, floor can be represented as an exotic swap in which a Libor rate is ex-changed for a floored payoff.
(k − Ln(Tn))+
= (k − Ln(Tn))+
+ Ln(Tn)− Ln(Tn) = max(k, Ln(Tn))− Ln(Tn)
• [AP10-1, Section 5.13] Exotic swaps start their life as bonds or notes, sold by banksto investors. The investor pays an upfront principal amount to the note issuer, who inturn pays the investor a structured coupon, and repays the principal at the maturity ofthe note. The principal amount is invested by the issuer and pays de Libor rate, plusor minus a spread. From the perspective f the issuer, the net cash-flows of the note arethose of an exotic swap.
• If Cn is the structured coupon for the n-th period, the value of the exotic swap equals(from perspective of structured leg buyer)
Vexotic(t) = β(t)
N−1∑n=0
τnEt
[1
β(Tn+1)(Cn − Ln(Tn))
]
• Cn can be a complicated function of interest rates, structured to reflect investors’ viewson the market.
• A Bermudan swaption is an option to enter into a fixed-floating swap on any of itsfixing dates (or a subset of them). Once exercised on date Tn, say, the holder entersthe swap with the first fixing date Tn and final payment date TN
At time Tn, the value of a payer swap, if exercised, is
Un(Tn) = β(Tn)
N−1∑i=n
τiETn
[1
β(Ti+1)(Li(Ti)− k)
]
A Bermudan contract swaption is hence an option to choose between Un(Tn) for n =0, . . . , N − 1 and its time Tn value is
Vbermudan−swaption(Tn) = maxUn(Tn), Hn(Tn)
where Hn(Tn)) is the value of a Bermudan swaption with exercise dates TiN−1i=n+1.
• A callable Libor exotic (CLE) is a Bermudan-style option to enter an exotic swap. Thedifference with respect to an ordinary exotic swap is that the issuer of the note has theright to cancel (or call) the note on a schedule of dates. The principal is then returnedto the investor and no further coupons are paid.
• Given an exotic swap with structured coupons CnN−1n=0 . Its time-Tn value if exercised
is
Un(Tn) = β(Tn)
N−1∑i=n
τiETn
[1
β(Ti+1)(Ci − Li(Ti))
]The n-th hold value of the CLE, denoted Hn(Tn)) is the time-Tn value of the CLEprovided it has not been exercised on or before Tn.
92
10.2 Model calibration for CLE’s
CLE’s present non-trivial dependencies on the dynamics of market rates and generally requiresophisticated term-structure models.
The values of exotic derivatives are not directly observable in the market and thereforemodels for these securities have to be calibrated indirectly to other market information thatis deemed relevant for the class of exotics under consideration.
There are three sources of potentially relevant information:
1. Market prices of liquid vanilla interest rate derivatives (or market-implied - spot -volatilities) of various strikes. Complex structured products can be decomposed intosimpler liquid derivatives. For instance, a callable inverse floater with exercise datesT1 < T2 < · · · < TN−1 is a Bermudan option to swap a Libor rate and a structuredcoupon
Cn = min max6%− Ln(Tn), 0%, 4%
and it can be decomposed as
Cn = (6− Ln(Tn))+ − (2− Ln(Tn))+
namely a portfolio of a long floorlet struck at 6% and a short floorlet struck at 2%. Thecorresponding market-implied volatilities should thus be included as targets for modelcalibration.
2. Volatilities of core / coterminal rates. Even though the underlying exotic swap hasno dependencies on spot volatilities other than those underlying the floorlets inherentin the CIF, the callability structure introduces additional dependence on other vanilladerivatives.
For instance, the callable inverse floater will depend on the implied volatility of theswap rate that fixes on Ti and runs for the period [Ti, TN ] for each i = 1, . . . , N − 1.
3. Historical information about market quantities, such as volatilities and correlations ofmarket rates. In the callable inverse floater, assuming that the option hasn’t beenexercised by time Tn, the time-Tn value of the remaining part of the underlying swapdepends on caplet volatilities observed at time Tn (which are only known at time Tnand are called forward volatilities.
Any model will impose certain dynamics on the volatility structure of interest rates,and one should make sure that the model projections for forward volatilities are in linewith market-implied information.
4. Modeler’s belief of what constitutes reasonable behavior of the model parameters.
10.3 Valuation theory
We use the spot Libor measure QB , with numraire B(t) defined on a tenor structure 0 =T0 < T1 < · · · < TN . The net payment seen by the structured coupon receiver at time Tn+1
is Xn = τn(Cn − Ln(Tn)).
• The n-th exercise value, namely the value of all future payments if the CLE is exercisedat time Tn is
Un(t) = B(t)
N−1∑i=n
Et
[Xi
B(Ti+1)
], UN (t) ≡ 0
93
• The n-th hold value is the value of a CLE where the exercise opportunities have beenrestricted to Tn+1, . . . , TN−1
Hn(Tn) = B(Tn) supξ∈Tn
ETn
[Uξ(Tξ)
B(Tξ)
]which can be computed recursively
Hn−1(Tn−1) = B(Tn−1)ETn−1
[1
B(Tn)maxHn(Tn), Un(Tn)
], (134)
HN−1 ≡ 0
Ifηn(ω) = min k > n : Uk(Tk) > Hk(Tk) ∧N
then
H0(0) = E
[Uη0
(Tη0)
B(Tη0)
]= E
[N−1∑n=η0
Xn
B(Tn+1)
]
10.4 Monte Carlo valuation
The regression-based least squares scheme relies on the assumption that the hold valuesHn(Tn) can be regressed against simulated state variables of the model at time Tn.
We fix some notation first. Consider:
• ζ(t) = (ζ1(t), . . . , ζq(t))> q-dimensional vector process of regression variables.
• X random variable, with value X(ω) on a Monte Carlo path ω.
• ω1, . . . , ωK K simulated Monte Carlo paths.
• RT (X) results of regression of the K-dimensional vector (X(ω1), . . . , X(ωK)) on theK × q matrix ζ(T )> of regression variable observations at time T , namely
RT (X) = ζ(T )>β, [ζ(T )]i,j = ζi(T, ωj)
where
β = arg minβ∥∥(X(ω1)− ζ(T, ω1)>β, . . . ,X(ωK)− ζ(T, ωK)>β
)∥∥2
The most basic least squares scheme for CLE valuation consists of replacing the condi-tional expected value operator ETn−1
in (134) with the regression operator RTn−1, namely
Hn−1(Tn−1) = RTn−1
Vn(ωk)︷ ︸︸ ︷[B(Tn−1)
B(Tn)max
Hn(Tn), Un(Tn)
], n = N − 1, . . . , 1 (135)
HN−1 ≡ 0
The full algorithm34 is the following:
1. Choose and calibrate a term structure model (e.g. LM).
2. Choose regression variables ζ(t).
3. Simulate K paths ω1, . . . , ωK , each one representing (in the LM case) one simulatedpath of all core Libor rates.
34See [AP10-3, Page 824].
94
4. For each path ωk, calculate B(Tn, ωk), n = 1, . . . , N − 1.
5. For each path, calculate the exercise value Un(Tn, ωk) of the underlying exotic swapon all exercise dates n = 1, . . . , N − 1 (more work may be needed here if these are notavailable in closed form, see Section 10.4.1 below).
6. For each path ωk, calculate the values of the q-dimensional regression variables ζ(Tn, ωk),n = 1, . . . , N − 1.
7. Set HN−1 ≡ 0.
8. For each n = N − 1, . . . , 1
(a) Form a K-dimensional vector Vn = (Vn(ω1), . . . , Vn(ωK))>,
Vn(ωk) =B(Tn−1, ωk)
B(Tn, ωk)max
Hn(Tn, ωk), Un(Tn, ωk)
, k = 1, . . . ,K
(b) CalculateHn−1(Tn−1) = RTn−1(Vn) = ζ(Tn−1)>β
by regressing the vector Vn against the matrix of regression variables observed ondate Tn−1, namely by finding a q-dimensional
β = arg minβ∥∥(Vn(ω1)− ζ(Tn−1, ω1)>β, . . . , Vn(ωK)− ζ(Tn−1, ωK)>β
)∥∥2
9. H0(T0) is our sought for estimate of the value of the CLE.
The scheme presents several shortcomings:
(i) The exercise values Un(Tn) may not me computable in closed form, and they arenecessary to evaluate (135) at each step.
(ii) The use of nested regression (we apply RTn to a function of a regressed value Hn(Tn))could spawn significant biases.
(iii) It is unclear whether the Hn’s are low- or high-biased estimates of Hn’s.
10.4.1 Regression for the underlying
The above scheme can be easily extended to arbitrary underlying swaps, which is useful ifthe latter are too complex and do not admit a closed form. The idea is to apply regressionto compute Un(Tn) using
Un(t) =
N−1∑i=n
Et
[B(t)
B(Ti+1)Xi
], Xn = τn(Cn − Ln(Tn))
1. For each path ωk and each fixing date Tn calculate the net coupons Xn(Tn, ωk).
2. Use regression to estimate the exercise value Un(Tn) as
Un(Tn) = RTn
(N−1∑i=n
B(Tn)
B(Ti+1)Xi
)= ζ(Tn)>β
where
β = arg minβ
∥∥∥∥∥(N−1∑i=n
B(Tn)
B(Ti+1)Xi(ω1)− ζ(Tn, ω1)>β, . . . ,
N−1∑i=n
B(Tn)
B(Ti+1)Xi(ωK)− ζ(Tn, ωK)>β
)∥∥∥∥∥2
95
This can be computed recursively as
Un(Tn) = RTn(Yn),
Yn =B(Tn)
B(Tn+1)(Xn + Yn+1), n = N − 1, . . . , 1
3. Replace the expression Vn(ωk) inside of RTn−1in (135) by
Vn(ωk) =B(Tn−1)
B(Tn)max
Hn(Tn), Un(Tn)
10.4.2 Using regressed variables for decision only
The values that are regressed at time Tn−1 themselves come as the result of a regressionat time Tn. Compounded regression can lead to substantial biases, even though these canbe reduced significantly by using regressed variables for decision-making purposes only, asexplained in [AP10-3, Section 18.3.4] and as we summarize next.
Define Gn(t) = Hn(t)− Un(t) and note (c.f. [AP10-3, Equation 18.18]) that
Gn−1(Tn−1) = ETn−1
[B(Tn−1)
B(Tn)
(−Xn−1 +Gn(Tn)+
)], n = N, . . . , 0
Hence G0(0) is the value of a swap paying coupons −Xn on dates Tn, plus the right to cancelit on any of the exercise dates (CLE).
This provides an alternative way of valuing the CLE : we can compute a least squaresapproximation of G0(0) and obtain the value of the CLE by subtracting the swap value fromit. The least squares version of the latter (i.e. replacing ETn−1
by RTn−1thus reads
Gn−1(Tn−1) = RTn−1(Vn), GN−1(TN−1) ≡ 0,
Vn(ωk) =B(Tn−1, ωk)
B(Tn, ωk)
(−Xn−1(ωk) + Gn(Tn, ωk)+
), k = 1, . . . ,K
We next show how to bypass the need to use compounded regression: note that
G0(0) = H0(0)− U0(0) = −E
[η−1∑n=0
Xn
B(Tn+1)
]= −E
[N−1∑n=0
Xn
B(Tn+1)Iη>n
]
= −E
[N−1∑n=0
Xn
B(Tn+1)
(n∏i=1
IGi(Ti)>0
)]
Defining
Vn =B(Tn)
B(Tn+1)
[−Xn + IGn+1(Tn+1)>0Vn+1
], n = N − 1, . . . , 0
we obtain
V0 = −N−1∑n=0
Xn
B(Tn+1)
(n∏i=1
IGi(Ti)>0
)whence taking expectations
G0(0) = E[V0]
96
Critically, the recursion for Vn involves the value of the cancelable note for exercise deci-sions only, whereas the coupon values are never regressed. We can thus manage to avoidcompounded regression by defining the following approximation:
Gn−1(Tn−1) =B(Tn−1)
B(Tn)
[−Xn−1 + IGn(Tn)>0Gn(Tn)
],
Gn−1(Tn−1) = RTn−1(Gn−1(Tn−1))
10.4.3 Controlling the MC estimate bias
So far the bias of our MC estimate for the value of a CLE is unknown:
• The exercise decisions used in the schemes are necessarily suboptimal, which suggeststhat the estimate is biased low.
• However, the previous schemes use the same set of sample paths to estimate the exercisedecision as to calculate the security value if exercised, which can lead to an upwardbias (since future information can affect our decision to exercise).
In order to remedy this and guarantee a low bias, one could use the LS regression toestimate the exercise decision rule only, and then use an independent simulation to calculatethe value of the CLE given that exercise rule. Concretely:
1. Run any of the regression schemes above to produce coefficients (betas) for the holdC(Hn(Tn)) and exercise values C(Un(Tn)) at all exercise times, n = 1, . . . , N − 1.
2. Simulate K ′ additional paths ω′1, . . . , ω′K′ that are independent from the paths used in
the regression scheme.
3. For each path ω′k, calculate the values of the q-dimensional regression variables processζ(Tn, ω
′k), n = 1, . . . , N − 1.
4. For each path ω′k calculate an estimate of the exercise index η by
η(ω′k) = minn ≥ 1 : C(Un(Tn))>ζ(Tn, ω
′k) ≥ C(Hn(Tn))>ζ(Tn, ω
′k)∧N
5. Calculate the CLE value as the MC value of a knock-in discrete barrier option
H0(0) ∼ 1
K ′
K′∑k=1
N−1∑n=η(ω′k)
1
B(Tn+1, ω′k)Xn(ω′k)
(136)
10.4.4 Upper bound
10.4.5 Choice of regression variables
The closer ζ(T ) = (ζ1(T ), . . . , ζq(T )) approximates the information in FT that is relevantfor the security, the better the regression method performs (namely, the smaller the bias ofthe lower bound estimates of the security value).
• The dimensionality of the state vector (i.e. Libor rates) is typically so large that theregression will suffer from numerical problems. This issue can be handled by formingthe first d principal components of the vector of Libor rates. Concretely, given a vectora (centered) still-alive forward Libor rates
A = (Ln(Tn)− E[Ln(Tn)], . . . , LN−1(Tn)− E[LN−1(Tn)])>, N − n ≥ d
97
one can estimate the term covariance matrix of the rates c = E[AA>
]and by PCA we
can find an (N − n)× d matrix D such that DD> is the closest rank-d approximationto c, namely
D = arg minD[tr(c−DD>)(c−DD>)>
]and then
A ' Dx, where x = arg minx‖A−Dx‖2 = (D>D)−1D>A
• In some situations, the information carried in the state variables is inadequate forthe security in question and in other situations the state variables may carry too muchinformation, adding unnecessary work to the numerical scheme and reducing the qualityof the CLE price estimate. See [AP10-3, Section 18.3.9.2].
10.5 Implementation
[AP10-3, Section 18.3.10] explores a number of ways in which the regression algorithm can bemade more robust. Recall that given a K-dimensional vector Y = (Y1, . . . , YK)> of simulatedvalues of a random variable Y and a K × q matrix Z of simulated values of the regressionvariables ζ(t), namely Zk,j = ζj(t, ωk), our goal in regression is to find a q-dimensional vectorβ such that solves the minimization problem
β = arg minβ‖Y − Zβ‖2
This problem has a naive solution β = (Z>Z)−1Z>Y .
• [AP10-3, Section 18.3.10.1] explains the R2 criterion for automated explanatory vari-able selection.
• [AP10-3, Section 18.3.10.4] explains two methods to make the computation β =(Z>Z)−1Z>Y more robust when the matrix Z>Z is ill-conditioned, namely close tosingular.
(i) Tikhonov regularization: consider the minimization problem β = arg minβ‖Y −Zβ‖2 + ωreg‖β‖2 where the scalar regularization weight ωreg can be selected in anumber of ways, for instance
ωreg = ε
√1
qtr(Z>ZZ>Z), ε = 10−4
The solution is given by
β = (Z>Z + ωrefI)−1Z>Y
where the matrix to be inverted Z>Z + ωrefI always has full rank.
(ii) Singular value decomposition. Write β = (Z>Z)−1Z>Y as a system of linearequations
Mβ = Z>Y, M = Z>Z
and decompose the q × q matrix M = UΣV > where...
98
10.6 Valuation with low-dimensional models
Single-rate CLE’s35 can be valued accurately using the so-called local-projection method,which consists of calibrating a local low-dimensional model to the volatility information thathas been identified as important to the CLE valuation.
As a starting point, any low-dimensional model should be calibrated to the followingtargets:
1. Underlying swap rate volatilities S1n(Tn)N−1
n=1 . Here we assume that each couponCn depends on S1
n(Tn) only and that Sn(t) ≡ Sn,µ(n)(t) where µ(n) is the number ofperiods.
2. Core swap rate volatilities S2n(Tn)N−1
n=1 , namely S2n(t) ≡ Sn,N−n(t). We have flexibility
in choosing strikes, but ATM strikes are common.
3. Core correlations for S2n(Tn)N−1
n=1 . These are not observable in the market, so we needto draw on an LM model which has been calibrated to the market as a whole. This isparamount: by including this dynamic information, the local model not only capturesthe static information about the interest rate volatilities at valuation time, but also thedynamics of the volatility structure.
There are several options for the local model, the one-dimensional quasi-Gaussian (qG)model being a natural one. Recall that a standard Gaussian model is given by
dr(t) = κ(t) [θ(t)− r(t)] dt+ σr(t)dW (t)
df(t, T ) = σf (t, T )t
[∫ T
t
σf (t, u) du
]dt+ σf (t, T )tdW (t)
σf (t, T ) = σr(t) exp
(−∫ T
t
κ(u) du
)σf (t, T ) = g(t)h(T )
where one normally works with the state variables
x(t) := r(t)− f(0, t)
dx(t) = [y(t)− κ(t)x(t)] dt+ σr(t)dW (t), x(0) = 0,
y(t) =
∫ t
0
e−2∫ tuκ(s) dsσ2
r(u) du
In a local volatility quasi-Gaussian (qG) model the function g(·) is allowed to depend on thestate variables, namely g(t) ≡ g(t, x(t), y(t). The approximate dynamics of a local volatilityquasi-Gaussian model are given by
dS(t) = ϕ(t, S(t))dWA(t)
ϕ2(t, s) = EA[∂xS(t, x(t), y(t))σr(t, x(t), y(t))2 |S(t) = s
]where
ϕ(t, S(t)) ≈ λS(t) [bS(t)S(t) + (1− bS(t))S(0)]
and
λr(t) =
N−1∑n=1
λn1(Tn−1,Tn](t), br(t) =
N−1∑n=1
bn1(Tn−1,Tn](t)
In this model we have;
35Namely, CLE’s for which each coupon Cn depends on at most a single market rate, which we denote byS1n(Tn).
99
• 2(N − 1) independent volatility parameters (λn, bn), n = 1, . . . , N − 1 which can beused to calibrate the model to term volatilities for one of the swap rate strips.
• N−1 parameters κn, n = 1, . . . , N−1 defining the mean reversion which can be used toeither match the term volatilities for the second swap rate strip or the core correlations.
Therefore, the one-factor local volatility qG model is not large enough to match all threesets of calibration targets identified above. Even though this is acceptable for some securities,it is not for some others, and in this case we might need to resort to a two-factor model.
10.7 Suitable analogue for core swap rates
We noted that the callability value is driven by the volatilities of core swap rates S2n(t) =
Sn,N−n(t), since CLE’s are related to standard Bermudan swaptions. In some cases though,it is more relevant to calibrate the model to other European swaptions36. We thus try torefine the selection of the volatility targets relevant for the callability option of a CLE.
We use the idea is that the local model should match the values of European options onexercise values Un(Tn), n = 1, . . . , N − 1. Since such options can be hard to come by, wecan linearize the underlying Un(Tn) of the CLE and use the resulting rate as a replacementfor the core swap rate.
We use an LM model as a backdrop. Assume that
Un(Tn) = fn(L(Tn)), n = 1, . . . , N − 1
L(Tn) = (Ln(Tn), . . . , LN−1(Tn))
Linearizing we obtain
Un(Tn) = fn(L(0)) +∇fn(L(0))[L(Tn)− L(0)]
so the value of the European option on the underlying can be approximated by
E
[(Un(Tn))+
B(Tn)
]≈ E
[1
B(Tn)(∇fn(L(0))L(Tn)− (∇fn(L(0))L(0)− fn(L(0))))
+
]and the relevant interest rate is hence
Rn(Tn) := ∇fn(L(0))L(Tn) =∑j
ωn,jLj(Tn), ωn,j =∂fn∂Lj
(L(0))
• The volatility of the rate Rn(Tn) can be approximated in an LM or local market models,so they can easily be used as volatility targets in place of core swap rates.
• Un(t) typically consists of options on market rates, so the derivatives ∂fn∂Lj
can be com-
puted with Black-type approximations of option values.
Example 10.1 (Bermudan swaptions). The exercise value of a Bermudan swaption on apayer swap is
Un(Tn) =
N−1∑i=n
τi(Li(Tn)−K)
P (Tn,Ti+1)︷ ︸︸ ︷(i∏
k=n
1
1 + τkLk(Tn)
)36For instance, in the case of Bermudan swaptions on amortizing swaps, the most relevant European swaptions
are not the standard core European swaptions, but instead swaptions with tenors based on the durations of theunderlying amortizing swap (see section ?? ahead).
100
with
fn(x) =
N−1∑i=n
τi(xi −K)
(i∏
k=n
1
1 + τkxk
)∂fn∂Lj
(x) = τj
j∏k=n
1
1 + τkxk− τj
1 + τjxj
N−1∑i=j
τi(xi −K)
(i∏
k=n
1
1 + τkLk(Tn)
)so
ωn,j = τjP (0;Tn, Tj+1)
[1− Uj(0)
P (0, Tj)
]and Rn(Tn) =
∑N−1j=n ωn,jLj(Tn) is quite close to the core swap rate Sn,N−n(Tn).
101
11 Bermudan swaptions
Given a tenor structure 0 = T0 < T1 < · · · < TN−1 , a Bermudan swaption is CLE with thecoupon paying a fixed rate Cn = k for n = 1, . . . , N − 1. Assume for simplicity that exerciseis possible on any tenor date. If exercised at time Tn, the value for a payer swap in is
Un(t) =
N−1∑i=n
P (t, Ti+1)ETi+1
t [τi(Li(Ti)− k)] =
N−1∑i=n
τiP (t, Ti+1)[Li(t)− k];
Un(Tn) = An(Tn)[Sn(Tn)− k]
where An(t) ≡ An,N−n(t) and Sn(t) ≡ Sn,N−n(t).
Bermudan swaptions are liquid and their trading volume is high, so the performanceadvantage of PDE methods over Monte Carlo simulation make low-factor Markovian modelsattractive, and the local projection method for single-rate CLE’s provides a sound frameworkfor this. The method takes a simple form for Bermudan swaptions since only the volatilityparameters of the core swap rates are relevant.
11.1 Local projection method
A Bermudan swaption gives the holder the right to exercise into one of several different swapsthat are observed on different exercise dates. Bermudan swaptions are thus conceptuallysimilar to a basket option with payoff maxLn(Tn), n = 1, . . . , N − 1 and their values arehence driven by the volatilities of core swap rates37 Sn(Tn)N−1
n=1 and by the inter-temporalcorrelations thereof38.
This observation allows one to use simple one-factor Gaussian or quasi-Gaussian mod-els for valuation ad risk-management of Bermudan swaptions, with the volatility function
37Recall that For each swap rate Sn,m(t) we specify stochastic volatility (SV) dynamics in the annuity measure
dSn,m(t) = λn,m [bn,mSn,m(t) + (1− bn,m)Sn,m(0)]√zn,m(t)dWn,m(t)
dzn,m(t) = θ [1− zn,m(t)] dt+ ηn,m√zn,mdZ
n,m(t)
38Recall that in a one-factor pure Gaussian model or a quasi-Gaussian model (with local volatility), the meanreversion parameter can be calibrated to a market-implied inter-temporal correlation matrix
χn1,n2 = Corr(Sn1(Tn1), Sn2(Tn2)), 1 ≤ n1 ≤ n2 < N − 1
independently of the volatility term. For instance in a pure Gaussian model with constant volatility σr and meanreversion κ we have (c.f. [AP10-2, Section 13.1.8.1])
Corr(F (T1;T1,M1), F (T2;T2,M2)) = Corr(x(T1), x(T2)) = e−κ(T2−T1)
(1− e−2κT1
1− e−2κT2
)1/2
where F (t;T,M) = − 1M−T ln P (t,M)
P (t,T ).
Likewise, in a one-factor quasi-Gaussian model we have
Corr(Sn1(Tn1), Sn2(Tn2)) '∫ Tn1
0
(∂Sn1
∂x(t, 0, 0)
)(∂Sn2
∂x(t, 0, 0)
)dt
×
(∫ Tn1
0
(∂Sn1
∂x(t, 0, 0)
)2
dt
)−1/2
×
(∫ Tn2
0
(∂Sn2
∂x(t, 0, 0)
)2
dt
)−1/2
102
calibrated to volatilities of core rates and the mean-reversion function calibrated to inter-temporal correlations thereof, with the latter being extracted from a global Libor model.
Several approaches are possible when it comes to choosing the calibration targets. Sup-pose we want to value and hedge a Bermudan swaption with fixed rate k and tenor structure0 = T0 < T1 < · · · < TN giving the holder the right to exercise into one of the swaps observedon each of the exercise dates T1, . . . , TN−1.
• For each expiry Tn, use the volatility of the appropriate ATM core European swap-tion. Unfortunately, this choice leads to inconsistent valuation between European andBermudan swaptions: given a Bermudan swaption model calibrated to ATM Europeanswaptions, applying it to a Bermudan swaption with a non-ATM strike and just a singleexercise date (namely, a non-ATM European swaption) the value will differ from thatobtained from that of the same derivative, priced as a European swaption.
• Instead, for each expiry Tn, use the volatility of the appropriate core European swaptioncorresponding to the strike k.
11.2 Amortizing, accreting and other non-standard Bermudan swap-tions
Standard Bermudan swaptions can be extended by replacing the unit notional by a time-dependent and deterministic notional Ri, so that the exercise value becomes
Un(t) =
N−1∑i=n
RiτiP (t, Ti+1)[Li(t)− k] (137)
• If the notional increases with the coupon index i, the Bermudan swaption is said to beaccreting.
• If the notional decreases with the coupon index i, the Bermudan swaption is said to beamortizing.
Valuation of such swaptions in a properly calibrated model doesn’t entail anything new.Nevertheless, while general models like LM present a product-independent calibration, mod-els requiring local calibration, such as the one-factor quasi-Gaussian model, calibration fornon-standard Bermudan swaptions requires a deeper analysis, since their liquidity is consid-erably poorer than for vanilla European swaptions.
We thus require a pre-processing step to extract amortizing European swaption prices froma model calibrated to liquid vanilla European swaptions.
11.2.1 Relating non-standard to standard swap rates
Consider a swap that starts at Tn and ends at TN , with a notional schedule Ri The annuityand swap rate that correspond to this non-standard swap are defined as
An(t) =
N−1∑i=n
RiτiP (t, Ti+1), Sn(t) =1
An(t)
N−1∑i=n
RiτiP (t, Ti+1)Li(t)
The non-standard swap rate can be decomposed as a linear combination of standard swapslike so39:
39The weights ωn,m(Tn) can be approximated reasonably well b their values at time 0.
103
Sn(t) =
N−1∑m=1
ωn,m(Tn)Sn,m(Tn), ωn,m(Tn) = (Rn+m−1 −Rn+m)An,m(Tn)
An(Tn)
Hence in order to price a non-standard Bermudan swaption in principle one needs tocalibrate the model to the volatilities of standard rates with all expiries T1, . . . , TN−1 andall maturities, something which a low-dimensional local model will virtually never be able todo. We next discuss a method to remedy this issue.
11.2.2 Representative swaption approach (payoff matching)
The idea is to choose a standard swap that approximates the non-standard swap in somereasonable sense, and then to calibrate the Bermudan swaption model to the market-impliedvolatilities of swaptions on these standard swaps, one per exercise date.
Consider a one-factor Gaussian model to illustrate the idea. Fix a start date Tn and letUn(Tn;x) be the value of a non-standard swap, where x = x(Tn) is the Gaussian short ratestate on date Tn. We have seen that
Un(Tn;x) =
N−m∑m=1
(Rn+m−1 −Rn+m)Vn,m(Tn)
=
N−m∑m=1
(Rn+m−1 −Rn+m)
[n+m−1∑i=n
τiP (Tn, Ti+1)[Li(Tn)− k]
]
Recall that by the Gaussian bond reconstitution formula we have
P (t, T ) =P (0, T )
P (0, t)exp
(−x(t)G(t, T )− 1
2y(t)G2(t, T )
)G(t, T ) =
∫ T
t
e−2∫ tuκ(s) ds du
y(t) =
∫ t
0
e−2∫ tuκ(s) dsσ2
r(u) du
which explains why we indicate the dependence on x in the notation Un(Tn;x). Note alsothat Un(Tn;x) depends on the volatility up to time Tn only.
We now seek to find a standard swaption providing the best possible approximation toUn(Tn;x). Let V (x;R, q,m) denote the value of a standard swap starting on Tn as a functionof x, with constant notional R, fixed rate (strike) q and covering m periods (where we allowm to be a real number in order to avoid discontinuities.
In the so-called payoff matching method, the three parameters R, q,m are chosen so thatthe level, slope and curvature of the swap payoffs as functions of the state variable x, namelywe impose
V (x0;R, q,m) = Un(Tn;x0)
∂
∂xV (x0;R, q,m) =
∂
∂xUn(Tn;x0) (138)
∂2
∂x2V (x0;R, q,m) =
∂2
∂x2Un(Tn;x0)
where x0 = E[x(Tn)]. These systems are solved numerically and produce the optimal pa-rameters R∗n, q
∗n,m
∗n (which are numerically seen to depend only mildly on mean reversion).
104
Since Un(Tn;x) only depends on the volatility parameter40 σr(t) over [0, Tn], it is easyto incorporate the payoff matching methods into a volatility calibration scheme such as theone outlined in section 5.3.2. Specifically, we calibrate one swaption at a time:
1. Assume that the mean reversion has been fixed exogenously and that σ0, . . . , σi−1 havebeen computed.
2. Select the standard swaption Vi(x;R∗i , q∗i ,m
∗i ) that best approximates Ui(Ti;x) by solv-
ing the system of equations (138), which only requires knowledge of σ0, . . . , σi−1.
3. Set the value σi such that the model price for the standard swaption Vi(x;R∗i , q∗i ,m
∗i )
equals its market price, by numerically inverting Jamshidian’s formula (32)
4. Repeat Step 2 for i = 0, . . . , N − 2.
Remark 11.1. This method as is doesn’t work well for accreting Bermudan swaptions,since it would produce an optimal tenor m∗ that is longer than the tenor of the amortizingswap N − n (see discussion on [AP10-3, Page 884]). It turns out that in order to get areasonable calibration scheme for an accreting Bermudan swaption we would need to calibrateto two standard European swaptions per expiry and this can be difficult within a one-factorMarkovian model, so additional factors are required.
11.2.3 Basket approach
This approach is a creative way of using one-factor models for non-standard Bermudanswaptions by splitting the valuation into two stages:
(i) Some model is used to calculate values of core non-standard European swaptions.
(ii) A one-factor model is then calibrated to the previous values and subsequently used tocompute the value of non-standard Bermudan swaptions.
The variants of this method rely on different ways of implementing (i), namely of valuingnon-standard European swaptions. One could for instance use a globally calibrated modelsuch as LM to value non-standard European swaptions and then use the local projectionmethod for non-standard Bermudan swaptions (namely, calibrate a low-dimensional modelto the prices produced by the global model).
Alternatively, one may proceed as follows. Consider a swaption grid on a tenor T1 <· · · < TN .
1. For each n = 1, . . . , N − 1, fit an instance of a one-factor Gaussian or quasi-Gaussianmodel to the prices of European swaptions expiring at Tn with underlying swaps span-ning the intervals [Tn, Tn+1], . . . , [Tn, TN−1].
Note that the value of a swap fixing at Tn and covering m periods depends on the meanreversion function over [Tn, Tn+m] so that we can keep σr fixed at a reasonable level(e.g. 1%) and calibrate the function κ(t) one swaption at a time.
Choice of European swaption strikes: there is no fixed rule for this. The best optionis to use a one-factor quasi-Gaussian model with stochastic volatility, which allowscalibration to the volatility smile at more than a single strike, thereby alleviating thestrike selection problem.
2. For each Tn, use the relevant instance of the one-factor model calibrated in the previousstep to price the Tn-expiry non-vanilla European option that the Bermudan swaptioncan be exercised into.
40In [AP10-3, Section 19.4.3] the authors claim that Un(Tn;x) does NOT depend on the volatility parameter,which is not true for the Gaussian model as far as I’m concerned. The one-swaption-at-a-time volatility calibrationscheme can still be easily coupled with the payoff matching method though, since the time-Tn bond prices dependonly on the volatility parameter up to time Tn.
105
3. Calibrate a one-factor model to the prices of the non-standard swaptions obtained inthe previous step using the standard procedure (namely, calibrating the short ratevolatility function σr(t) one swaption at a time while keeping the mean reversion fixedat a user-specified value or at a level that makes inter-temporal correlations of coreswap rates match those coming from a global model.
11.2.4 American swaptions
American swaptions allow the holder to exercise on any given date after the lockout period.These are particularly popular in the US for hedges of mortgage bonds.
American swaptions present a subtlety: if exercise takes place during [Tn, Tn+1], theoption holder receives a swap starting at Tn+1 as well as an exercise fee equal to (Ln(Tn)−k)(Tn+1 − t), namely the exercise value per unit of notional is
UAn (t) = (Ln(Tn)− k)(Tn+1 − t) + Un+1(t)
Note that there time-t exercise value hence depends on the value of the Libor rate at Tn < tand is thus path-dependent.
11.2.4.1 American vs. high-frequency Bermudan swaptions
11.2.4.2 The Proxy Libor rate method
11.3 Flexi-swaps
A Bermudan swaption can be interpreted as as fixed-floating swap with zero notional and asingle option to increase the notional to a given level on any of the exercise dates. Similarly, acancelable swap can be seen as a swap of full notional with an option to decrease the notionalto zero on any of the exercise dates.
Flexi-swaps (a.k.a. chooser swaps) are contracts offering more flexibility in choosingswap notionals, namely swaps with multiple options to change the notional on a given set ofexercise dates, subject to certain constraints.
Consider a tenor structure Tn, n = 0, . . . , N and a collection of coupons Xn with unitnotional fixing at Tn and paying at Tn+1. A flexi-swap is a contract paying a net couponXnRn at time Tn+1, where R0 is fixed up-front, and the time-Tn notional Rn is chosen bythe holder of the option at time Tn, subject to some constraints:
• Global deterministic bonds: Rn ∈ [glon , ghin ].
• Local bounds which are functions of the current notional: Rn ∈ [llon (Rn−1), lhin (Rn−1)].
• Bounds that are functions of market data xn (e.g. Libor or swap rates): Rn ∈[mlo
n (xn),mhin (xn)].
Denote by Cn(Rn−1, xn) the set of time-Tn constraints, so that Rn ∈ Cn(Rn−1, xn). Thevaluation of a flexi-swap can be done via a backward recursion equation: if Vn(t, R) denotesthe time-t value of the part of a flexi-swap paying strictly after Tn, given that Rn = R, wehave
Vn−1(Tn−1, R) = P (Tn−1, Tn)Xn−1R+B(Tn−1)ETn−1
[1
B(Tn)max
R′∈Cn(R,xn)Vn(Tn, R
′)]
(139)for n = N, . . . , 1 and with terminal condition VN (TN , R) ≡ 0.
106
11.3.1 Purely local bounds
Assume that only local constraints are enforced and that these are of the form
`lon (Rn−1) = λlonRn−1, `hin (Rn−1) = λhin Rn−1, 0 ≤ λlon ≤ λhin
Also assume for simplicity that the value of the flexi-swap scales linearly in notional, namely
Vn(Tn, R) = RVn(Tn, 1)
Simple computations then show that the recursive equation (139) simplifies to
Vn−1(Tn−1) = P (Tn−1, Tn)Xn−1
+B(Tn−1)ETn−1
[1
B(Tn)Vn(Tn)
(λlon 1Vn(Tn)<0 + λhin 1Vn(Tn)>0
)](140)
11.4 Fast pricing via exercise premia representation
On occasions the speed of valuation of Bermudan swaptions is more important than accuracy(e.g. robust hedging, calculation of CVA). We now consider a useful approximation basedon the representation of a Bermudan swaption as a stream of coupons paid in the exerciseregion.
107
12 TARN’s, volatility swaps and other derivatives
12.1 TARN’s (targeted redemption notes)
A TARN pays structured coupons in exchange for Libor coupons until the cumulative amountof structured coupon payments exceeds a pre-agreed target R, at which point the derivativeterminates. For a given tenor structure 0 = T0 < T1 < · · · < TN , the time-0 value of aTARN is hence
Vtarn(0) = EB
[N−1∑n=1
1
B(Tn+1)τn (Cn − Ln(Tn)) IQn<B
],
Qn =
n−1∑i=1
τiCi
• Faithful reproduction of volatility smiles of various Libor rates is important for TARN’s,so it is recommendable to use globally calibrated models with stochastic volatility.
• The knockout feature of the TARN will introduce digital discontinuities (jumps com-ing from the step function IQn<B), so Monte Carlo errors of the contract value and,especially, its risk sensitivities can be large. Ways of amending this are described in[AP10-3, Chapters 23 and 24].
• The full power of LM and gG models may not be required for TARN’s can often do witha local projection method, as described below: in a nutshell, the idea is to find a simplelocal model that is calibrated the parts of a global model’s volatility structure (e.g.LM) that are relevant to the derivative being valued, in such a way as to approximatethe value of the global model for a particular derivative.
12.1.1 Local projection method
Consider the example of an inverse floating coupon Cn = (s− g · Ln(Tn))+
. The time-0value of the corresponding TARN with target B is
Vtarn(0) = EB
[N−1∑n=1
1
B(Tn+1)τn
((s− g · Ln(Tn))
+ − Ln(Tn))I∑n−1
i=1 τi(s−g·Li(Ti))+<B
]
which depends on the values
L = (L1(T1), . . . , LN−1(TN−1))
of Libor rates on their fixing dates only (the values at intermediate dates being irrelevant)so only the properties of the (N − 1)-dimensional vector L need be captured.
Assuming log-normal distributions for market rates, if two models assign the same valuesto the term variances of Libor rates Var(lnLn(Tn)) and inter-temporal correlations of Liborrates Corr(lnLn(Tn)), then the values of a TARN in both models should be the same. Oneshould thus apply the local projection method as follows:
1. Calibrate a Libor market model to the full swaption volatility grid.
2. Use the calibrated LM model to calculate the relevant term volatilities and inter-temporal correlations needed for the TARN.
3. Pick a simpler model and calibrate it to the volatilities and correlations extracted fromthe LM model.
4. Use the calibrated local model for valuing the TARN and computing risk sensitivities.
108
12.1.2 Effect of volatility smiles
A successful candidate for the local model should have the ability to calibrate to volatilitysmiles of all Libor rates, in addition to having a low number of state variables, for instancethe one-factor quasi-Gaussian model with stochastic volatility. One may proceed as follows:
1. Fix the mean reversion function to match the inter-temporal correlations of Libor ratesCorr(lnLn(Tn), lnLm(Tm)), n,m = 1, . . . , N − 1 as described in Section 6.2.4.
2. Calibrate to the volatility smiles of all Libor rates that appear in the payoff formula.
3. Choose the time-dependent local volatility function σr(t, x, y) and volatility of variancefunction η(t) to match the implied SV parameters of relevant caplets as in Section 6.2.2.
12.2 Volatility swaps
Volatility derivatives are contingent claims whose underlying is the volatility of a financialobservable, rather than a financial observable itself. An (interest rate) volatility swap is acontract the measures realized volatility of a given rate Xn(t). The most common coupons(uncapped and capped versions) are
Cn = |Xn+1(Tn+1)−Xn(Tn)|, Cn = min|Xn+1(Tn+1)−Xn(Tn)|, c
The value of the structured leg of the volatility swap measures the realized variation of therate Xn(·)
Vvolswap(t) = β(t)Et
[N−1∑n=0
τn1
β(Tn+1)|Xn+1(Tn+1)−Xn(Tn)|
]− Vfloatleg(t), (141)
Vfloatleg(t) =
N−1∑n=0
τnP (t, Tn+1)Ln(t) = P (t, T0)− P (t, TN )
Common choice for the rate Xn are
• Fixed-tenor volatility swap, involving a swap rate of the same tenor on each of thefixing dates Xn(t) = Sn,m(t), with fixed m.
• Fixed-expiry volatility swap, involving a swap rate with fixed expiry and tenor Xn(t) =SK,m(t).
• Volatility swaps on CMS rates, which measure the variation of the spread of two ratesXn(t) = Sn,m1
(t)− Sn,m2(t).
As was the case for TARN’s, the value of a volatility swap (141) depends on the valuesof the swap rates Sn on their fixing dates only, so we may apply the local projection method(instead of a full-blown globally calibrated market model).
12.2.1 Volatility swaps with shout option
Some volatility swaps carry the option to shout, namely to choose when the fixing date occursfor the purpose of calculating the coupon payoff. The payoff for the n-th coupon is hence
Cn = |Xn+1(ηn)−Xn(Tn)|
with ηn ∈ [Tn, Tn+1] chosen by the investor. This coupon is paid on date Tn+1. At time Tn,the option looks like an American option with exercise value
P (ηn, Tn+1)|Sn+1(ηn)− Sn(Tn)|
109
• For valuation purposes, the shout option can be ignored for uncappedcoupons. By Jensen’s inequality we have
P (ηn, Tn+1)ETn+1ηn [|Sn+1(Tn+1)− Sn(Tn)|] ≥ P (ηn, Tn+1)
∣∣ETn+1ηn [Sn+1(Tn+1)]− Sn(Tn)
∣∣' P (ηn, Tn+1) |Sn+1(ηn)− Sn(Tn)|
whence the early-exercise value is negligible.
• Consider now the case of a capped coupon with a shout option
C ′n = min|Sn+1(ηn)− Sn(Tn)|, c
It seems as though one requires an estimation of the optimal exercise rule via regression,but it turns out that the situation is much simpler, as the following result shows.
Proposition 12.1 (Broadie and Detemple). The value of an American option on a cappedstraddle with payoff C ′n = min|Sn+1(ηn)−Sn(Tn)|, c is equal to the value of a straddle with
a barrier, so that ETn+1
Tn[C ′n] = E
Tn+1
Tn[C ′′n ] where
C ′′n = c·Imaxt∈[Tn,Tn+1]|Sn+1(t)−Sn(t)|≥c+|Sn+1(t)−Sn(t)|·Imaxt∈[Tn,Tn+1]|Sn+1(t)−Sn(t)|<c
so that the optimal exercise strategy is known analytically: for the period [Tn, Tn+1] oneshould exercise the shout option on the first time t when Sn+1(t) hits either of the barriersSn(t)± c.
Having replaced an American option with a barrier option, we can value capped volatilityswaps in standard Monte Carlo.
12.2.2 Min-Max volatility swaps
The structured coupon for a min-max volatility swap is
Cn = Mn −mm, where Mn = maxt∈[Tn,Tn+1]
Xn(t), mn = mint∈[Tn,Tn+1]
Xn(t)
12.3 Forward swaption straddles
110
13 Risk management
Consider the following notation:
• Θmkt(t): Nmkt-dimensional vector of observable market data at time t (swap andfutures rates for yield curve construction, cap and swaption prices for volatility cali-bration.
• Θprm(t): Nprm-dimensional vector of additional parameters that are not directly ob-served, but are estimated from historical data, such as short rate mean reversion,correlation parameterizations, SV mean reversion speeds, local volatility parameters,etc.
• Θnum(t): additional parameters that control the numerical schemes used in the model,such as the number of MC paths, the size of the discretization steps, etc.
One first uses Θmkt(t) and Θparm(t) to construct the yield curve and calibrate the model,that is to obtain a new vector
Θmdl = C(Θmkt(t); Θprm(t))
Given the time-t yield curve, a set of model parameters Θmdl(t) and a set of numericalscheme parameters Θnum(t) we can compute the value V (t) of our portfolio of derivativesecurities Vi(t), namely
V (t) = V1(t) + · · ·+ Vn(t) = M(Θmdl(t); Θnum(t)) = H(Θmkt(t); Θprm(t); Θnum(t)) (142)
The role of a trading desk is then to construct a hedge around (142), with the hedgeaiming to neutralize, in a standard Taylor-series sense, as many of the movements in theentire market data vector Θmkt(t) as possible.
The dimension of the market data vector Nmkt can be very high and it may be too costlyto hedge against all components of Θmkt(t), so some type of principal component analysismay be undertaken.
13.1 Value at risk
The risk management team in a bank is primarily focused on analyzing the distribution offuture portfolio values, in order to gauge the overall riskiness of the portfolio. The value atrisk is one of the most commonly used risk measures.
VaR at level α, denoted Λα and typically a negative value, is the (1α)-percentile of thedistribution of the P&L move V (t+ h)− V (t) in the real-life measure P , namely
P (V (t+ h)− V (t) ≤ Λα|Ft) = 1− α
In other words, the probability of losing more than −Λα over the time interval [t, t + h] isless than 1− α. Typically, α is set to 99% or 95%, and h to one business day.
Another commonly used risk measure is conditional value-at-risk (cVaR) which is definedas the conditional expectation
Ξα,h = EP [V (t+ h)− V (t)|V (t+ h)− V (t) ≤ Λα]
To compute Λα one needs a statistical description for the market data increment vector41
δ, for which one may follow different methodologies:
41Namely, a vector of variables affecting the portfolio value, such as exchange rates, equity prices, interest ratesand so forth.
111
1. Using the historical distribution of δ directly: ones takes the actual realizations of δover the last NV aR trading days and applies them to the current market data, therebygenerating the empirical distribution of V (t+ h)− V (t).
The calculation of VaR then amounts to ranking the impact of the last NV aR marketmoves on the current portfolio, worst to best, and using the impact on the day withrank (1− α)NV aR as VaR.
2. Using a parameterized, rather than historical, distribution of market moves. One as-sumes that
δi ∼ N (0, s2i ), si = σi
√h, i = 1, . . . , Nmkt
where σi is annualized basis point volatility of the i-th market element.
The co-dependence between the elements of δ is captured in a correlation matrix whichis typically estimated from historical time series.
To compute VaR in the Gaussian setup, we may use equation
V (t+ h) = V (t) +∇H(t)δ +1
2δ>AH(t)δ
Computations are then analytically tractable and are encapsulated in the followingproposition.
Proposition 13.1. Neglecting the Hessian AH(t) = 0, namely assuming that
V (t+ h) = V (t) +∇H(t)δ
and assuming that the elements of the Nmkt-dimensional vector δ have correlation ma-trix R and satisfy δi ∼ N (0, s2
i ) we have
Λα = vΦ−1(1− α)
Ξα = − v
1− αφ(Φ−1(1− α)
),
v2 = ∇H(t)diag(s)Rdiag(s)∇H(t)>
where s = (s1, . . . , sNmkt) and Φ(x) = 1√2π
∫ x−∞ e−x
2/2 dx is the Gaussian CDF.
Proof. Note that the covariance matrix of δ is C = diag(s)Rdiag(s)>. Since V (t+h) =V (t) +∇H(t)δ, this implies that V (t+ h)− V (t) ∼ N(0, v2) ∼ vN(0, 1) where
v2 = ∇H(t)C∇H(t)>.
Hence if Z ∼ N (0, 1) we have
P (V (t+ h)− V (t) ≤ Λα) = P
(Z ≤ Λα
v
)= Φ
(Λαv
)≡ 1− α =⇒ Λα = vΦ−1(1− α)
As for cVaR we have
EP [∆V |∆V ≤ Λα] =1
P (∆V ≤ Λα)EP
[∆V I∆V≤Λα
]=
v
1− αEP
[ZIZ≤Λα
v
]
Remark 13.1. 1. If we wish to include AH in the computation, the distribution of V (t+h)−V (t) is no longer simple. One can show that it is expressible as a sum of independentnon-central chi-square random variables. From this representation, its characteristicfunction can be computed, and this can be turned numerically into a CDF from whichVaR can be computed.
2. The key to a good VaR computation is a reliable and accurate estimate for ∇H andAH .
112
14 Payoff smoothing for Monte Carlo
• Practical risk management of a portfolio of interest rate securities revolves aroundprice sensitivities with respect to various valuation inputs, such as market prices andmodel parameters. These sensitivities are computed by applying small perturbationsto market and model parameters, followed by re-pricing of the securities portfolio inquestion.
• Price sensitivities are inherently less smooth than the price function itself, and this lackof smoothness will often put significant stress on a numerical scheme.
• It is thus important to try to adapt numerical schemes to avoid introducing spuriousinstabilities into the calculation of greeks.
We focus here on the so-called Tube Monte Carlo method for illustrative purposes42 andwe apply it to digital options, barrier options and CLE’s. The method is designed to beapplied when calculating sensitivities by direct perturbation.
14.1 Tube Monte Carlo for digital options
Consider a digital option with payoff
VT = IS(T )>B
where S(t) is the process for the underlying that is simulated using Monte Carlo. Thestandard Monte Carlo estimate (where ωj, j = 1, . . . , J are the sample paths) is given by
V ' 1
J
J∑j=1
IS(T,ωj)>B
The idea is to replace the point sample of the payoff IS(T )>B at S(T, ωj) with an averageestimate of the payoff in a small interval around S(T, ωj), namely
V ' 1
J
J∑j=1
Vj , Vj = E[IS(T )>B|Aj
]Aj = ω : S(T, ω) ∈ [S(T, ωj)− ε, S(T, ωj) + ε] , ε > 0
Clearly:
• If B /∈ Aj , then E[IS(T )>B|Aj
]= IS(T,ωj)>B.
• If B ∈ Aj and ε is small, using the uniform distribution approximation we obtain
E[IS(T )>B|Aj
]= Q (S(T ) > B|S(T, ωj)− ε ≤ S(T ) < S(T, ωj) + ε)
=S(T, ωj) + ε−B
2ε
Hence,
Vj = E[IS(T )>B|Aj
]=
S(T, ωj) + ε−B2ε
IS(T,ωj)−ε≤B<S(T,ωj)+ε + 1 · IS(T,ωj)+ε≤B + 0 · IB<S(T,ωj)+ε
42For a detailed discussion, including payoff smoothing methods for PDE’s, see [AP10-3, Chapter 23].
113
14.2 Tube Monte Carlo for barrier options
Consider a derivative which is a knock-in barrier into a stream of coupons X1, . . . , XN−1,with the knock-in feature defined by a stopping time index η, so that the derivative payscoupons Xi at Ti+1 for i = η, . . . , N − 1 and has value
V (0) = E
[N−1∑i=1
1
B(Ti+1)XiIi≥η
]
The standard Monte Carlo estimate is
V (0) =1
J
J∑j=1
(N−1∑i=1
1
B(Ti+1, ωj)Xi(ωj)Ii≥η(ωj)
)
As above, the indicator Ii≥η(ωj) will introduce digital discontinuities in the payoff whichlead to poor stability and risk sensitivities. The idea of the tube method is again to replacethe point estimates of the payoff with payoff averages over appropriately defined tubes. Ifx(t, ω) denotes the d-dimensional vector of state variables of the underlying model, define
Aεj =
N−1⋂i=1
Aεj,i,
Aεj,i = ω : ‖x(Ti, ω)− x(Ti, ωj)‖ < ε
Observe that Aεj denotes the set of sample paths that come within ε-distance of x(Ti, ωj)for all Ti, i = 1, . . . , N − 1. We may then take our new Monte Carlo estimator to be
V (0) =1
J
J∑j=1
Vj , Vj = E
[N−1∑i=1
1
B(Ti+1, ω)Xi(ω)Ii≥η(ω)|Aεj
]
Being smooth functions of the path ω, we may just evaluate 1B(Ti+1,ω) and Xi(ω) at the
sample path, so that Vj becomes
Vj =
N−1∑i=1
1
B(Ti+1, ωj)Xi(ωj)E
[Ii≥η(ω)|Aεj
]An approximation of the probabilities qi(ωj) := E
[Ii≥η(ω)|Aεj
]is provided in [AP10-3,
Proposition 23.4.1] under the assumption that stopping time index η can be written as thefirst hitting time of a state-dependent boundary
η(ω) = minn ≥ 1 : ψn(x(Tn, ω)) ≥ 0 ∩N
and is given by
1− qi(ωj) =
i∏n=1
(1− pn(ωj)),
pn(ωj) =
1, ψn,j − δn,j ≥ 0,ψn,j+δn,j
2δn,j, ψn,j − δn,j < 0 < ψn,j + δn,j
0, ψn,j + δn,j ≤ 0
where
ψn,j = ψn(x(Tn, ωj)), δn,j = ε‖∇ψn,j‖, ‖∇ψn,j‖ = ∇ψn(x)|x=x(Tn,ωj)
114
Remark 14.1 (Tube Monte Carlo for CLE’s). Recall that CLE’s can be valued in MonteCarlo by representing them as knock-in discrete-barrier options with the knock-in defined byan estimate of the exercise index43 so we may apply the tube Monte Carlo method we justoutlined to smooth out the payoff. Concretely, the function ψ(x(Tn, ω)) in this case wouldbe
ψ(x(Tn, ω)) = C(Un(Tn))>ζ(Tn, ω′k)− C(Hn(Tn))>ζ(Tn, ω
′k)
Remark 14.2 (Tube Monte Carlo for TARN’s). A TARN can be represented as a derivativethat pays a stream of coupons until a knock-out event takes place when a sum of structuredcoupons exceeds a certain target.
VTARN (0) = E
(N−1∑i=1
1
B(Ti+1)XiIi≤η
),
η(ω) = min n ≥ 1 : Qn(ω)−R ≥ 0 ∧N
43C.f. equation ( 136).
115
15 Pathwise differentiation
We start by recalling the pathwise differentiation for European options, following [AP10-1,Section 3.3.2].
Take a European T -maturity option on a dividend-free stock with price process S(t) withpayout function g(S(T )) and consider the problem of computing
dV
dS0= limh→0
V (S0 + h)− V (S0)
h
By the risk-neutral pricing formula V (S0) = E [g(S(T ))]. Assuming that the stock pricefollows a GBM we have
S(T ) = S0 exp
[(r − σ2
2
)T + σ
√TZ
], Z ∼ N(0, 1)
Sh(T ) = (S0 + h) exp
[(r − σ2
2
)T + σ
√TZ
]so Sh(T ) = (S0 + h)S(T )
S0. If the payout function g(·) is regular enough so that the
expectation and the limit are interchangeable, then
dV
dS0= e−rTE
[limh→0
g(Sh(T ))− g(S(T ))
h
]= e−rTE
[g′(S(T ))
S(T )
S0
](143)
so we can implement (143) by generating samples of S(T ) and recording the sample averages
of g′(S(T ))S(T )S0
.
Remark 15.1. For discontinuous payoffs, such as g(x) = 1x>K care must be taken uponimplementing this. In this case, (143) reads
dV
dS0= e−rTE
[δ(S(T )−K)
S(T )
S0
]= e−rT
K
S0ϕS(K)
where δ(·) is the delta function. This is correct (provided that the terminal stock densityϕS(·) is known, but it is unsuited for Monte Carlo simulation since the likelihood of δ(S(T )−K) being non-zero is zero.
More generally, consider estimating dVdα where V (α) = E[Y (α)]. Under certain regularity
conditions (c.f. [AP10-1, Proposition 3.3.1]), we seek to write
dV
dα=
d
dαE[Y (α)] = E
[d
dαY (α)
]Assuming that Y represents a payout function, namely
Y (α) = g(X(α)), X(α) = (X1(α, . . . ,Xq(α))
we can further writed
dαY (α) =
q∑i=1
∂g
∂Xi
dXi
dα
In order to compute the derivatives dXidα , assuming that X(t) is driven by a scalar SDE
dX(t) = µ(t,X(t))dt+ σ(t,X(t))dW (t)
116
differentiating with respect to α on both sides and writing Dα(t) = dX(t)dα we obtain
dDα(t) =d
dxµ(t,X(t))Dα(t)dt+
d
dxσ(t,X(t))Dα(t)dW (t)
and the latter SDE can be discretized and simulated in parallel with the simulation of theSDE for X(t) itself.
We next illustrate how to apply this method to compute greeks of CLE’s and barrieroptions, but we first make a few comments on the likelihood ratio method.
15.1 On the unsuitability of the likelihood ratio method in interestrate modeling
In general, an alternative method to compute sensitivities by Monte Carlo simulation is thelikelihood ratio method. In general, assume that Y (α) represents a deflated payout functiong(·) applied to a vector of random variables X(α) = (S1(α), Sq(α)) with joint densityf(x;α), x ∈ Rq. Then
V (α) = E [g(S(T ))] =
∫Rqg(x)f(x;α) dx
Assuming that the density function f(x;α) is a smooth, we can exchange differentiation andintegration
∂V (α)
∂α=
∫Rqg(x)
∂f(x;α)
∂αdx =
∫Rqg(x)
∂ ln f(x;α)
∂αf(x;α) dx = E [g(X(T ))`(X(T ))]
where `(x)∂ ln f(x;α)∂α is the so-called log-likelihood ratio.
Example 15.1. In the Black-Scholes setting S(T ) = S0 exp[(r − σ2
2
)T + σ
√TZ], Z ∼
N (0, 1) and Y (T ) = lnS(T ), the log-likelihood ratio is given by
`(Y (T )) =∂ lnϕ(Y (T );S0)
∂S0=
1
S0σ√TZ
• The likelihood ratio method does not require smoothness of the payout function, butonly of the density function, which is a mild assumption.
• Knowledge of the density function is required, although one always has the option ofusing a Gaussian approximation based on an Euler discretization.
• In general, it is the earliest observation T1 of the underlying asset process44 that de-termines how fast its variance explodes (note the
√T in the denominator of the log-
likelihood ratio in the Black-Scholes model). Because of the regular structure of mostinterest rate derivatives, the time T1 will in most cases be rather short, resulting inhigh variance of the estimate.
44For instance, the first coupon fixing date, the first exercise date of a CLE or the first knock-out date of abarrier.
117
15.2 CLE’s
Recall the main valuation recursion for CLE’s (in the spot measure QB
Hn−1(Tn−1)
B(Tn−1)= EBTn−1
[1
B(Tn)maxHn(Tn), Un(Tn)
],
Un(t) = B(t)
N−1∑i=n
Et
[1
B(Ti+1)Xi
],
Xi = τi(Ci − Li(Ti))
where Xi are the net coupon payments, Ci are structured coupons, Hn(t) is the n-th holdvalue and Un(t) is the n-th exercise value. Denote by ∆α a pathwise differentiation operatorwith respect to a given parameter α.
Proposition 15.1. Assuming that the coupons Xn, n = 1, . . . , N − 1 and the inverse num-raire 1
B(t) are Lipschitz continuous functions of the parameter α, for any n = 1, . . . , N − 1
we have
∆α
(Hn−1(Tn−1)
B(Tn−1)
)= ETn−1
[∆α
(Un(Tn)
B(Tn)
)IUn(Tn)>Hn(Tn)
]+ ETn−1
[∆α
(Hn(Tn)
B(Tn)
)IHn(Tn)>Un(Tn)
]Unwrapping this recursive statement, if η is the optimal exercise time index then
∆α(H0(0)) = E
[N−1∑n=η
∆α
(Xn
B(Tn+1)
)]
Remark 15.2. From the fact that
H0(0) = E
[N−1∑n=η
Xn
B(Tn+1)
], ∆α(H0(0)) = E
[N−1∑n=η
∆α
(Xn
B(Tn+1)
)]
is looks as though one can compute the derivative ∆α by differentiating the first sum andpretending that the optimal exercise time index η is independent of α. This is not the casein most cases: see [AP10-3, Section 24.1.1.2] for a justification.
Remark 15.3. When using brute-force perturbation methods to evaluate CLE greeks, thereare different alternatives when deciding how to treat the exercise decision in the perturbedmarket data scenario. As justified in [AP10-3, Section 24.1.1.3], it is convenient to re-use theestimate of the optimal exercise index from the base scenario in order to avoid introducingdiscontinuities.
15.3 Barrier options
Recall that a CLE can be interpreted as a knock-in barrier option (where the barrier conditionis defined by the optimal exercise rule) which suggests that the pathwise differentiationmethod outlined in the previous section could be extended to general barrier options.
118
15.3.1 Digital option warm-up
Consider a T -maturity European option with payoff
X = IG>hR, G,R FT -measurable, h ∈ R
Differentiating with respect to α we obtain
∆α
(IG>hR
)= IG>h∆αR+
∂IG>h∂G
R∆αG = IG>h∆αR+ δ(G− h)R∆αG
Hence,
∆αE
[X
B(T )
][∗]= E
[∆α
(X
B(T )
)]= E
[∆α
(1
B(T )
)IG>hR
]+ E
[1
B(T )IG>h∆αR
]+ E
[1
B(T )δ(G− h)R∆αG
]= E
[∆α
(1
B(T )
)IG>hR
]+ E
[1
B(T )IG>h∆αR
]+γG(h) · E
[1
B(T )R∆αG|G = h
]where γG(h) is the PDF of G evaluated at h.
• In [∗] we assumed that we can exchange ∆α and E[·], even though we can’t due to thedigital discontinuity of IG>h. The result of the computation is nevertheless true, andcan be justified using Malliavin Calculus.
• The expected values on the RHS can be computed in a numerical scheme such as MonteCarlo as long as the density γH(·) is known and the conditional expectation
E
[1
B(T )R∆αG|G = h
]can be evaluated. Both can be computed using Malliavin calculus, even though in somecases these can be approximated in closed form.
15.3.2 Barrier options
Consider a barrier schedule TnN−1n=1 to which we associate knockout variables Gn and barrier
levels hn. Consider an option that pays Rn on the first date Tn where Gn > hn, so that
V (0) = E
[Rη
B(Tη)
],
η = mink ≥ 1 : Gk > hk ∧N
More generally, consider the option with the barrier condition checked at times Tn+1, . . . , TN−1
only, namely
Vn(t) = B(t)Et
[Rηn
B(Tηn)
],
ηn = mink ≥ n+ 1 : Gk > hk ∧N
119
Proposition 15.2. (c.f. [AP10-3, Proposition 24.1.3]) For the barrier option describedabove, which pays Rn on the first Tn where Gn > hn, n = 1, . . . , N − 1, the pathwisederivative with respect to a parameter α is given by
∆αV (0) = E
[1
B(Tη)∆αRη|n=η
]+ E
[1
B(Tη)γη(hη) (Rη − Vη(Tη)) ∆αGn|n=η
∣∣∣∣Gη = hη
]
Sketch of proof. The following recursion is clear
Vki,n(Tn)
B(Tn)= ETn
[Rn+1
B(Tn+1)IGn+1>hn+1
]+ ETn
[1
B(Tn+1)Vki,n+1(Tn+1)IGn+1≤hn+1
]Formal differentiation yields the following recursion for ∆αVki,n(Tn)
∆α
(Vki,n(Tn)
B(Tn)
)= ETn
[∆α
(Rn+1
B(Tn+1)
)IGn+1>hn+1
]+ ETn
[∆α
(1
B(Tn+1)Vki,n+1(Tn+1)
)IGn+1≤hn+1
]+ ETn
[1
B(Tn+1)(Rn+1 − Vki,n+1(Tn+1)) δ(Gn+1 − hn+1)
]Unwrapping the recursion the claim follows.
Remark 15.4. The method applies directly to CLE’s through their interpretation as knock-in options. Recall that
VCLE(0) = E
[Uη(Tη)
B(Tη)
]= E
[N−1∑n=η
Xn
B(Tn+1)
], η = minn : Un(Tn) > Hn(Tn)
so that Proposition 15.1 follows directly from Proposition 15.2 upon setting Rn = Un(Tn),Gn = Un(Tn)−Hn(Tn) and hn = 0.
Remark 15.5. These results are rather complex and require knowledge of the transitiondensities and conditional probabilities, which are often hard to compute. Therefore, it maybe more fruitful to apply payoff smoothing techniques (e.g. tube Monte Carlo method) orimportance sampling techniques to integrate any discontinuities before applying the pathwisedifferentiation method.
15.4 Pathwise differentiation for Monte Carlo based models
Consider an LM model with separable deterministic local volatility with dynamics in thespot measure QB given by
dLn(t) = ϕ(Ln(t))λn(t)>(µn(t)dt+ dWB(t)
)µn(t) =
n∑j=q(t)
τjϕ(Lj(t))λj(t)
1 + τjLj(t)
Assume that the LM model and the security to be priced share the same tenor structure0 = T0 < · · · < TN .
120
16 Importance sampling and control variates
In this section we explore two common variance reduction techniques (importance samplingand control variates). We start by outlining the basic principles upon which they rely and byillustrating them via simple examples extracted from [Gla03, Chapter 4]. We then describeapplications to fixed-income derivatives following [AP10-3, Chapter 25].
16.1 Importance sampling
The idea of importance sampling is to use a change of measure to reduce variance. Assumewe seek to estimate µ = EP [Y ]. Let P be an equivalent measure to P ; then
µ = EP [Y ] = EP[YdP
dP
]not= EP
[Y
R
]where R = dP
dP is the Radon-Nikodym derivative. The goal of the method is to choose the
new measure P optimally in order to minimize the P -variance of YR .
Assume Y = g(X) where g : Rp→R is a well-behaved function and X is a p-dimensionalrandom vector with density fX : Rp→R. Then
µ = EP [g(X)] =
∫Rpg(x)fX(x) dx
can be estimated by µn = 1n
∑ni=1 g(Xi) for independent samples X1, . . . , Xn of X.
If h : Rp→R is the (strictly positive) density function of X in a nother measure P , wecan represent µ as
µ = EP [g(X)] =
∫Rpg(x)
fX(x)
h(x)h(x) dx = EP
[g(X)
f(X)
h(X)
]which in turn has a Monte Carlo estimate µhn = 1
n
∑ni=1 g(Xi)
f(Xi)h(Xi)
, where now X1, . . . , Xn
are independent draws from h. The quotient f(x)(h(x) is the so-called likelihood ratio.
We now compare both variances:
Var[µhn] =1
n
(EP
[g(X)2 f(X)2
h(X)2
]− µ2
)=
1
n
(EP
[g(X)2 f(X)
h(X)
]− µ2
)Var[µn] =
1
n
(EP
[g(X)2
]− µ2
)and we note that importance sampling will lower variance if
EP[g(X)2 f(X)
h(X)
]< EP
[g(X)2
]Now assume that the p-dimensional process X(t) is driven by an SDE
dX(t) = µ(t,X(t))dt+ σ(t,X(t))dW (t) (144)
where W (t) is a d-dimensional Brownian motion and that we seek to evaluate
EP [g(X(T ))]
Assume that the SDE is simulated by an m-dimensional scheme so that
g(X(T )) = G(Z1, . . . , Zm), G : Rp×m→R
121
where Z1, . . . , Zm are independent p-dimensional Gaussian vectors with density φ(z). Byindependence,
EP [g(X(T ))] =
∫Rp×m
G(z1, . . . , zm)
m∏i=1
φ(zi) dzi
The likelihood ratio `(z) =∏miφ(zi)h(zi)
performs a change of measure (to P , say) that preserves
independence of Zi and alters the common marginal density from φ(z) to h(z) so that
EP [g(X(T ))] = EP
[G(Z1, . . . , Zm)
m∏i=1
φ(Zi)
h(Zi)
]
Example 16.1. Assume p = 1 and consider shifting the means of the Zi from 0 to somescalar µ, so that
h(zi) =1√2π
exp
(−1
2(zi − µ)2
)whereby
`(z;µ) = exp
(−µ
m∑i=1
zi +m
2µ2
)The idea would then be to set µ to minimize the P -variance of the term G(Z)`(Z;µ). Thisminimization problem is generally handled numerically.
Example 16.2. (c.f. [AP10-1, Section 3.4.4.5]) Consider the problem of estimating byMonte Carlo
P (Z > c)
where Z ∼ N (0, 1) and large c, so that the event Z > c is rare and a standard MonteCarlo simulation might be inaccurate.
In ordinary Monte Carlo we would use the sample mean estimator
P (Z > c) = EP [IZ>c] '1
n
n∑i=1
IZi>c
where Z1, . . . , Zn ∼ N (0, 1) are independent standard Gaussian samples. The variance ofthis estimator is easily seen to be
VarP (IZ>c) = P (Z > c) [1− P (Z > c)]
Introducing a probability that shifts the mean of Z from 0 to some µ, by Example 16.1 thelikelihood ratio is
`(z) =dP
dP= e−µz+
12 z
2
soP (Z > c) = EP [IZ>c] = EP
[e−µz+
12 z
2
IZ>c], Z ∼ N (µ, 1)
and a Monte Carlo estimator for this is then
1
n
n∑i=1
e−µ(Zi+µ)+ 12 z
2
IZi+µ>c, Z1, . . . , Zn ∼ N (0, 1)
As for the variance of the estimator one can compute that
VarP(e−µz+
12 z
2
IZ>c)
= eµ2
P (Z > µ+ c)− P (Z > c)2
122
and the choice of µ that minimizes this P -variance is
µ∗ = arg minµ
eµ
2
P (Z > µ+ c)
= µ∗ : 2µ∗ [1−N (c+ µ∗)]− n(c+ µ∗)
This holds more generally in higher dimensions and for more general payoffs. Assume wewant to estimate E[G(Z)] and assume that we restrict ourselves to changes of distributionthat change the mean of Z from 0 to µ. In this case we have
E[G(Z)] = EPµ
[G(Z)e−µ
>Z+ 12µ>µ], µ ∈ Rn, Z
Pµ∼ N (µ, I)
[1]' EPµ
[eF (Z)e−µ
>Z+ 12µ>µ]
' E[eF (µ+Z)e−µ
>Z− 12µ>µ]
[2]' E
[eF (µ)+∇F (µ)Ze−µ
>Z− 12µ>µ]
where in [1] we expressed F (Z) = lnG(Z) and in [2] we linearized F (µ + Z). Note thatchoosing
∇F (µ)Z = µ> (145)
we eliminate the dependence on Z, and hence the variance of the estimator (although [2] isjust an approximation, so we end up with a low-variance estimator).
Example 16.3. (c.f. [Gla03, Section 4.6.2]) Consider for instance the case of an arithmeticAsian option. Ignoring the discount factor e−rT we have a payoff
G(Z) =[S(Z)−K
]+, S =
1
m
m∑i=1
S(ti), 0 = t0 < t1 < · · · < tm
S(ti) = S(ti−1) exp
((r − 1
2σ2)(ti − ti−1) + σ
√ti − ti−1Zi
), Zi ∼ N (0, 1) (146)
Imposing condition (145)
∇(lnG(Z)) = Z =⇒ ∂
∂Zjln(S(Z)−K) =
1
S −K∂S
∂Zj= Zj
=⇒ ∂S
∂Zj− (S −K)Zj = 0
Using the discretization of S this condition becomes
Zj =1
mG(Z)
m∑i=j
σ√ti − ti−1S(ti)
which, given an equally spaced time grid with ti − ti−1 = h yields a recursion
z1 =σ√h[G(Z) +K]
G(Z), zj+1 = zj −
σ√hS(tj)
mG(Z), j = 1, . . . ,m− 1 (147)
Given a payoff vale y ≡ G(Z), this scheme determines Z (if y ≡ G(Z), apply (147) tocalculate Z1, ten apply (ch25-importance-sampling-example-1) to calculate S(t1), then apply(ch25-importance-sampling-example-2) again to obtain Z2 and so forth). Each payoff valuey = G(Z) thus determines Z, and hence a path S(ti; y)mi=1 and we seek to find the y forwhich
1
m
m∑i=1
S(ti; y)−K = y
123
This equation can be solved for y∗ easily using a 1-dimensional search and this determinesZ∗ = Z(y∗). To simulate, we then simply set µ = Z∗ and apply importance sampling withmean µ.
16.2 Payoff smoothing via importance sampling
We illustrate how use the importance sampling method to produce payoff smoothing byconsidering two examples: binary options and TARN’s.
16.2.1 Binary options
Assume X ∼ N (µ, σ2) and consider en option that pays g(X) provided that X is below acertain barrier b, namely
V = EP[g(X)IX<b
], P some pricing measure
• Valuing by Monte Carlo entails simulating independent Gaussian samples, discardingthose that end up above the barrier b and averaging the payoff values over non-discardedsamples.
• If b is low, the proportion of samples that contribute to the average is small, whichleads to large simulation error.
• The digital feature of the payoff reduces the accuracy and stability of Monte Carloestimates of greeks.
We thus seek to change the probability to increase the proportion of interesting samples.Conditioning on the survival we obtain
V = E[g(X)|X < b]P (X < b) = E[g(X)|X < b]N(b− µσ
)= E
[g(N−1(U ′))
]N(b− µσ
)where U ′ ∼ U(0,N (b)). Since g is smooth, a Monte Carlo evaluation of E
[g(N−1(U ′))
]will
have good convergence and exhibit stable greek estimates.
Setting Λ =IX<bP (X<b) we can write
V = E[g(X)IX<b
]= E[g(X)Λ]P (X < b) = E[g(X)]P (X < b), Λ =
dP
dP
so conditioning on survival can be interpreted as a change of measure defined by the Radon-Nikodym derivative Λ.
16.2.2 TARN’s
Recall that the TARN valuation formula under the spot measure QB reads
Vtarn(0) = EB
[N−1∑n=1
1
B(Tn+1)Xn(Tn)IQb<R
], Qn =
n−1∑i=1
τiCi
where the Xn’s are net coupons and R is the total return.
Example 16.4 (Removing first digital discontinuity). We first try to apply the approachfrom the previous section to remove the first digital discontinuity. Assume structured couponsof the form
Cn = (s− gLn(Tn))+
124
and note that
Q2 < R⇐⇒ L1(T1) >1
g
(s− R
τ1
)def= b1
Denote by V the path value of the coupons that depend on the first knockout event
V =
N−1∑n=2
1
B(Tn+1)Xn(Tn)IQn<R
and note that since E[V |L1(T1) ≤ b1] = 0 we have
E[V] = E[V|L1(T1) > b1]QB(L1(T1) > b)
• Since T1 is typically small, the probability QB(L1(T1) > b) of not knocking out can beapproximated analytically with high precision.
• E[V|L1(T1) > b1] can be interpreted as the value of the TARN under the conditionthat it does not knock out on the date T1, which is less discontinuous than our originalpayoff function.
More generally Piterbarg shows that all discontinuities can in fact be integrated outsideof Monte Carlo.
Proposition 16.1. Consider a TARN with structured coupon Cn = (s − gLn(Tn))+ and
denote bn = 1g
(s− R−Qn
τn
). Its value is given by
Vtarn(0) =
N−1∑n=1
EB[ψnETn−1
[Xn(Tn)
B(Tn+1)
]], (148)
ψn =
n−1∏k=1
QBTk−1(Lk(Tk)− bk) (149)
where the measure QB is defined by the Radon-Nikodym derivative process Λ(t) = Et
[dQB
dQB
]where
Λ(t) =QBt (Lm+1(Tm+1) > bm+1)
QBTm(Lm+1(Tm+1) > bm+1)
m−1∏k=1
ILm+1(Tm+1)>bm+1
QBTk(Lm+1(Tk+1) > bk+1), t ∈ [Tm, tm+1) (150)
• The quantities ψn in (149) can be approximated analytically since each term of theform QBTk−1
(Lk(Tk > bk) involves an expected value over a short period [Tk−1, Tk], so
short-time (e.g. Gaussian) approximations to the distribution of Lk over [Tk−1, Tk]may be applied effectively (in fact, in some simulation schemes the ψn’s come for free).
• Note that the functions ψn replace the indicator functions IQn<R as weights on thecoupons in the TARN valuation formula. The former are much smoother functions ofa simulated path than are indicator functions.
• In measure QB , the TARN never knocks out. However, in order to use the method inpractice we need to establish how to simulate model dynamics in this measure.
16.2.2.1 Simulating under survival measure using conditional Gaussiandraws
Consider the case of a single-factor Libor market model
dLi(t) = λi(t)ϕ(Li(t))[µi(t)dt+ dWB(t)
], i = 1, . . . , N − 1
125
In order to implement TARN pricing formula (148) we need to simulate Libor rates underthe measure QB , namely, in such a way that Ln(Tn) > bn for each n. Consider a simulationstep from Tn−1 to Tn. Employing an Euler scheme we can approximate that QB-dynamicsas
Ln(Tn) = Ln(Tn−1) + λn(Tn−1)ϕ(Ln(Tn−1))[µn(Tn−1)τn−1 +
√τn−1Z
], Z ∼ N (0, 1)
In order to ensure that Ln(Tn) > bn we need to make sure that Z satisfies
Ln(Tn−1) + λn(Tn−1)ϕ(Ln(Tn−1))[µn(Tn−1)τn−1 +
√τn−1Z
]> bn
namely
Z > Zmindef=
bn −mn
vn,
vn = λn(Tn−1)ϕ(Ln(Tn−1))
√τn−1,
mn = Ln(Tn−1) + vnµn(Tn−1)√τn−1
so that Ln(Tn) = mn + vnZ. In order to simulate Z conditioned on it being above a certainlevel Zmin we proceed as above
Z = N−1 [N (Zmin) + (1−N (Zmin))U ] , U ∼ U(0, 1)
It now suffices to verify that this measure just constructed actually coincides with the measuredefined by the Radon-Nikodym process (150) (c.f. [AP10-3, Page 1071]).
• Once Z has been drawn, all Libor rates can be evaluated using
Li(Tn) = Li(Tn−1) + λi(Tn−1)ϕ(Li(Tn−1))[µi(Tn−1)τn−1 +
√τn−1Z
]• The weights ψn in (149) come for free within this framework, since
QBTn−1(Ln(Tn) > bn) = 1−N (Zmin)
16.3 Control variates
Suppose we wish to estimate E[Y ] by Monte Carlo; the idea of the control variates method isto is a possibly multi-dimensional random variable with known mean which is closely relatedto Y in order to reduce the variance of the simulation. Concretely, if Y c = (Y c1 , . . . , Y
cq )> is
a q-dimensional vector of random variables with known mean µc = E[Y c] = (µc1, . . . , µcq)>
and β = (β1, . . . , βq)> is a constant vector, we may form a new random variable
X = Y − β> · (Y c − µc)
Clearly E[X] = E[Y ] so we can run a Monte Carlo simulation on X to estimate the meanof Y .
Denote the q × q covariance matrix of Y c by
ΣY c , [ΣY c ]i,j = Cov(Y ci , Ycj )
and define the q-dimensional vector
ΣY Y c , [ΣY Y c ]i = Cov(Y, Y ci )
Then clearly
Var(X) = Var(Y ) + β>ΣY cβ − 2β>ΣY Y c
126
Lemma 16.2. Var(X) is minimized at β∗ = Σ−1Y cΣY Y c and the minimum value is given by
minβ
Var(X) = (1−R2)Var(Y ), R2 =1
Var(Y )Σ>Y Y cΣ
−1Y cΣY Y c
where we identify R2 as the R-squared of a multi-dimensional regression of Y against Y c
and the components of the optimal vector β∗ as the corresponding regression coefficients.
The ratio of variances is henceminβ VarX
VarY = 1−R2 so the larger the correlation betweenthe Y and the control variate, the larger the variance reduction.
In our usual derivative pricing setting, we seek to replace our standard Monte Carloestimate
1
K
K∑j=1
Y (ωj)
with
1
K
K∑j=1
(Y (ωj)− β>(Y c(ωj)− E[Y c])
)where Y c(ωj) are random samples of the multi-dimensional control variate Y c, chosen suchthat E[Y c] is available in closed form.
If Y is the value of a security under a given model, there are multiple ways to select thecontrol variate Y c.
• Y c may represent the value of the same security under a different (but closely related)model.
• Y c may represent the value of a different, but closely related, security under the samemodel.
• Y c may represent the value of an approximate hedging strategy.
• Y c may be a weighted combination of any of the above control variates.
We explore some of these in the next subsections, but we first outline a few elementaryexamples from [Gla03, Section 4.1].
Example 16.5 (pricing complex options using more tractable options as variates). Considerpricing an arithmetic Asian option with payoff SA = 1
n
∑ni=1 S(ti) in the Black-Scholes
model. No closed-form solution is known, so one must rely on simulation. In contrast, calls
and puts on the geometric average SG = (∏ni=1 S(ti))
1/ncan be priced in closed form and are
good candidates45 to be used as control variates in pricing options on SA. For an arithmeticAsian call option, one could form a controlled estimator using independent replications of
(SA(ω)−K)+ − b
(SG(ω)−K)+ − E[e−rT (SG −K)+]
Example 16.6 (pricing options using prices in simpler models as variates). Consider pricingan option on an asset with dynamics
dS(t)
S(t)= rdt+ σ(t)dW (t)
We can simulate S at dates t1, . . . , tn using
S(ti+1) = S(ti) exp
([r − 1
2σ(ti)
2](ti+1 − ti) + σ(ti)√ti+1 − tiZi+1
), Zi ∼ N (0, 1)
45 [Gla03, Figure 4.2] shows that there is an almost perfect correlation between the payoff of a call option onarithmetic average and the payoff of a call option on the geometric average.
127
Assume the option in question had a more tractable price if the asset were a geometricBrownian motion. We could simulate
S(ti+1) = S(ti) exp
([r − 1
2σ2](ti+1 − ti) + σ
√ti+1 − tiZi+1
), Zi ∼ N (0, 1)
for some constant σ and initial condition S(0) = S(0). For instance, for a standard call stuckat K, we could form a controlled estimator using independent replications of
(S(tn)−K)+ − b
(S(tn)K)+ − E[(S(tn)−K)+]
For effective variance reduction, one should choose σ to be close to a typical value of σ.
16.3.1 Model-based control variates
We fix some notation:
• Let Vorig be a Monte Carlo estimate for the true security value in the original pricingmodel
• Let Vproxy be a Monte Carlo estimate for the same security in a proxy model where a
highly accurate price estimate E[Vproxy] is available.
• Denote VPDE = E[Vproxy].
We then introduce a corrected value estimate as
Vcorrected = Vorig − β(Vproxy − VPDE
)where β is the appropriate regression coefficient. If the path values used to compute Vorigare positively correlated with the ones used to obtain Vproxy, the variance of the estimate isreduced.
We illustrate how to construct a PDE-friendly model proxy for the LM model.Recall that the LM model is Markovian only in the full set of all forward Libor rates on theyield curve, plus any additional variables required to model unspanned SV. For concreteness,consider a one-factor model with deterministic local volatility
dLn(t) = ϕ(Ln(t))λn(t)>(µn(t,L(t))dt+ dWB(t)
)µn(t,L(t)) =
n∑j=q(t)
τjϕ(Lj(t))λj(t)
1 + τjLj(t)
We seek to construct a low-dimensional Markovian approximation in a number of steps.
1. Eliminate the local volatility ϕ(Ln(t)) via the transform
f(x) =
∫ x
x0
1
ϕ(ξ)dξ, `n(t)
def= f(Ln(t)), n = 1, . . . , N − 1
which eliminates ϕ from the diffusion part of the SDE and yields
d`n(t) = λn(t)>[(µn(t,L(t))− 1
2λn(t)ϕ′(Ln(t))
)dt+ dW (t)
]2. The LM model drift terms are generally small, so we may approximate
µn(t,L(t)) ' µn(t,L(0))
128
3. For local volatility functions ϕ(x) which are close to linear, we may further approximate
ϕ′(Ln(t)) ' ϕ′(Ln(0))
which yields
d`n(t) = λn(t)>[(µn(t,L(0))− 1
2λn(t)ϕ′(Ln(0))
)dt+ dW (t)
]so that each `n(t) is an integral of λn(t) against a Brownian motion with deterministicdrift.
4. To make all variables functions of the same state variable, approximate the volatilitystructure with a separable one
λn(t) ' λn(t) = σnα(t), n = 1, . . . , N − 1
where α is a function of time common to all Libor rates and σn’s are Libor-specificscalars. Defining a one-dimensional Markovian state variable by
dX(t) = α(t)dW (t)
we obtain that all the variables `n(t) are deterministic functions of X(t):
`n(t) = `n(0) + dn(t) + σnX(t),
dn(t) =
∫ t
0
λn(s) ·(µn(s,L(0))− 1
2λn(s)ϕ′(Ln(0))
)ds
which translated back into Libor forwards reads
Ln(t) = f−1 (f(Ln(0)) + dn(t) + σnX(t)) , n = 1, . . . , N − 1
5. With this representation, at each point in time t, the value of any path-independentderivative V can be expressed as a function
V = V (t,X(t))
satisfying the PDE
∂V (t, x)
∂t+
1
2α2(t)
∂2V (t, x)
∂x2= r(t, x)V (t, x)
subject to appropriate boundary and jump conditions.
To use the Markov approximation as the control variate, we define:
• Vproxy: value of the security in the Markov LM model computed by Monte Carlo.
• VPDE : value of the security computed by the PDE method in the same model.
In order to achieve the required high correlation between the path values of a derivativein the original and proxy models, it is necessary to use the same simulation seed for randomnumber generation and also compatible discretization schemes.
The potential downside of the method is the fact that its scope is limited by the needto perform a PDE valuation, with three dimensions being the practical maximum for areasonably quick PDE scheme.
129
16.3.2 Instrument-based control variates
The idea here is to keep the model fixed but change the payoff; we illustrate how the methodworks when applied to Bermudan swaptions and CLE’s.
Consider a Bermudan swaption with N − 1 exercise opportunities T1, . . . , TN−1, withcorresponding exercise values
Un(t) = B(t)Et
[N−1∑i=n
1
B(Ti+1)Xi
], Xi = τi(Li(Ti)− κ)
The value of the Bermudan swaption is given by
H0(0) = E
N−1∑i=η
1
B(Ti+1)Xi
where η is the optimal exercise index, and the corresponding K-path Monte Carlo estimatevalue is
H0(0) ' 1
K
K∑k=1
Y (ωk), Y (ω) =
N−1∑i=η(ω)
1
B(Ti+1, ω)Xi(ω)
where ω1, . . . , ωK are simulated paths.
A naive set of control variates that we could introduce is based on the exercise values
Y cn (ω) = Un(Tn, ω) = B(Tn)
[N−1∑i=n
1
B(Ti+1, ω)Xi(ω)
]but this presents two major drawbacks:
1. The control variates Y cn , n = 1, . . . , N − 1 are linear functions of rates, whereas thepayoff of a Bermudan swaption is not well approximated by a linear function.
2. More subtly, the value of a Bermudan swaption (or CLE) along a path ω involves cash-flows fixed at times Tη(ω), . . . , TN , whereas all the cash-flows in the definition of Y cnare sampled at a single time. Since the number of coupons included in the Bermudanswaption and the controls differ, the correlation between them is likely to be low.
16.3.2.1 Timing mismatch
The timing mismatch in (2) can be rectified by sampling the controls at the Bermudanswaption exercise time, since:
Lemma 16.3. Let Un = Un(Tn) and let Zn, n = 0, . . . , N − 1 be a collection of martingaleswith respect to the filtration FTnNn=0. If η, σ ∈ 1, . . . , N − 1 are stopping times satisfyingη ≤ σ, then
[Corr(Uη, Zη)]2 ≥ [Corr(Uη, Zσ)]2
Therefore if a European option maturing at Tn is used as a control variate, and for aparticular Monte Carlo path η < n, the Lemma implies that we should use the value of theoption at time Tη to generate a control variate (higher correlation) rather than wait untilthe maturity date Tn. Namely, we could consider control variates
Y cndef= Un(Tη) = B(Tη)
η−1∑i=n
Xi
B(Ti+1)+B(Tη)ETη
N−1∑maxη,n
Xi
B(Ti+1)
, n = 1, . . . , N − 1
Note also that by the optional sampling theorem Y cn = Un(Tη) are martingales, whence
E[Y cn ] = E[Un(Tη)] = Un(0)
130
16.3.2.2 Non-linear controls
To introduce non-linearity into the controls for Bermudan swaptions, one can consider usingcaps, which can be valued exactly in the majority of LM models. Concretely, for a Bermudaswaption with strike κ whose exercise values are
Un(t) = B(t)
N−1∑i=n
Et
[1
B(Ti+1)τi(Li(Ti)− κ)
]consider the set of caps
Vcap,n(t) = B(t)
N−1∑i=n
Et
[1
B(Ti+1)τi(Li(Ti)− κ)+
], n = 1, . . . , N − 1
These can be used to construct a few possible controls. For instance, we can use a collectionof all caps, observed at the Bermudan exercise time
Y c = (Y c1 , . . . , YcN−1)>,
Y cn = Vcap,n(Tη) = B(Tη)
η−1∑i=n
1
B(Ti+1)τi(Li(Ti)− κ)+
+B(Tη)ETη
N−1∑maxη,n
1
B(Ti+1)τi(Li(Ti)− κ)+
For securities more complex than the standard Bermudan swaptions, the idea of sampling
the controls at the exercise time still applies, but the choice of controls becomes non-trivialand has to be done on a case-by-case basis.
16.3.3 Dynamic control variates
We present here the delta method of Clewlow and Carverhill. Let X(t) be a p-dimensionalvector process al of whose components are martingales (e.g. assets deflated by a numraire)and g : Rp→R a smooth function. Consider estimating the expected value
V (0) = E[g(X(T ))]
Under regularity conditions we have
V (T ) = V (0) +
∫ T
0
p∑i=1
∂V (t)
∂Xi(t)dXi(t)
On a simulation time line tjmj=1 this can be written in the style of an Euler scheme
V (T ) = V (0) +
m∑j=1
p∑i=1
∂V (tj−1)
∂Xi(t)(Xi(tj)−Xi(tj−1)) .
The zero-mean quantity
m∑j=1
p∑i=1
∂V (tj−1)
∂Xi(t)(Xi(tj)−Xi(tj−1))
is likely to have high correlation to V (T ), so we can consider using it as a control variate. Inpractice, one can use deltas constructed using approximate risk sensitivities in order buildthe control variate.
131
• For the LM model, deltas could originate from an approximate Markov model as theone in Section 16.3.1.
• In [AP10-3, Section 25.5], a general technique due to Moni is described which is suitablefor CLE’s, which we outline below.
Take as regression variables polynomials of explanatory variables x(t) = (x1(t), . . . , xd(t))and approximate the hold value using least squares
Hn(Tn) = pn(x(Tn)), n = 0, . . . , N − 1
Approximate sensitivities can then be computed as
∆Hn(Tn)
∆xm(Tn)' ∆Hn(Tn)
∆xm(Tn)=
∂pn∂xm
(X(Tn)), m = 1, . . . , d
which yields an approximate hedging strategy
Vhs(Tn) = H0(0) +
n−1∑j=0
(d∑
m=0
∂pn∂xm
(X(Tj)) [xm(Tj+1)− xm(Tj)]
), n = 1, . . . , N − 1
with time-0 expected value
E [Vhs(Tn)] = H0(0) +
n−1∑j=0
E
[d∑
m=0
∂pn∂xm
(X(Tj))(E[xm(Tj+1)|FTj ]− xm(Tj)
)] [∗]= H0(0)
where in [∗] we assumed that we selected the explanatory variable to be martingales, wherebyE[xm(Tj+1)|FTj ] = xm(Tj).
A control variate can then be defined as the hedging strategy stopped at the exercisetime
Y cdef= Vhs(Tη)
which according to tests by Moni yields reductions in standard error by a factor of 2 to 3.
132
References
[AP10-1] L. Andersen, V. Piterbarg, Interest rate modeling. Volume 1: Foundations andvanilla models, Atlantic Financial Press, 2010
[AP10-2] L. Andersen, V. Piterbarg, Interest rate modeling. Volume 2: Term structure mod-els, Atlantic Financial Press, 2010
[AP10-3] L. Andersen, V. Piterbarg, Interest rate modeling. Volume 3: Products and riskmanagement, Atlantic Financial Press, 2010
[Gla03] P. Glasserman, Monte Carlo Methods in Financial Engineering, Springer, 2003
[Shr04] S. E. Shreve, Stochastic calculus for finance II, Springer, 2004
133