
Evaluating Sensitivities of Bermudan Swaptions

Sebastian Schlenkrich

Christ Church College

University of Oxford

A thesis submitted in partial fulfillment of the MSc in

Mathematical Finance

March 20, 2011

Acknowledgements

I would like to thank my supervisor Prof. Mike Giles for helpful advice and discussions during the preparation of this thesis. Moreover I would like to express my gratitude to d-fine GmbH for giving me the opportunity to attend the MSc in Mathematical Finance programme. Furthermore I wish to thank Jan Riehme for supporting me with ADTAGEO, many fruitful discussions, and helpful manuscript proofreading.

Above all I thank Sybille for her endless patience and support.

Contents

1 Scope and Setting
  1.1 Notation and general setting
  1.2 Reference test problem

2 Analytical Pricing Formulas for Swaptions and Bond Options
  2.1 Black’76 formula for European swaptions
  2.2 Analytical Formulas for the Hull White model
    2.2.1 Risk-neutral drift
    2.2.2 Bond and bond option formulas
    2.2.3 Specifying the model parameters
  2.3 Calibration of the Hull White Model
    2.3.1 Formulation of the optimization problem
    2.3.2 Iterative solution of the non-linear problem

3 Discretisation and Numerical Solution
  3.1 Spatial Discretisation
    3.1.1 Discretisation Schemes
    3.1.2 Determining the computational domain
    3.1.3 Using variable grids
  3.2 Time Integration

4 Sensitivity Evaluation by Automatic Differentiation
  4.1 Differentiating the European swaption prices
  4.2 Differentiating the Calibration
  4.3 Evaluating Derivatives of the PDE solution
  4.4 Linking the partial derivatives

5 Conclusions

A Derivative Approximation for Normally Distributed Grids

B Fundamentals of Automatic Differentiation
  B.1 Evaluation procedure
  B.2 Forward mode
  B.3 Reverse mode
  B.4 General Graph Reduction

C Option Pricing via Integration
  C.1 Risk Neutral Dynamics
  C.2 Forward Neutral Dynamics
  C.3 Solving the Integral

Bibliography

Chapter 1

Scope and Setting

The Hull White interest rate model is one of the classical interest rate models in

finance. It was proposed in [HW90] as an extension of the Vasicek model. The model

yields analytical formulas for bonds and European bond options. With time inhomo-

geneous model parameters it can be fitted to an observed term structure of interest

rates and a term structure of volatilities. The resulting calibrated model can then

be used to price more exotic interest rate derivatives. Particular financial deriva-

tives priced by the Hull White model are Bermudan bond options and Bermudan

swaptions.

The evaluation of sensitivities in the Hull White model with respect to changes in the yield curve (i.e. Deltas and Gammas) is discussed, e.g., in [Hen04]. Risk sensitivities of Bermudan swaptions (also with a focus on changes in the yield curve) are elaborated in [Pit04].

Key risk factors for Bermudan swaptions are market observed Black’76 volatilities of European swaptions. Hence the sensitivity of the price with respect to changes in the volatility is of particular interest. A market standard method for sensitivity evaluation is to bump the input risk factors, re-evaluate the derivative price, and compute a finite difference approximation of the sensitivity. This approach may not work appropriately for Bermudan swaptions since the pricing involves an iterative calibration procedure and a numerical solution on a PDE grid (or a tree), both of which introduce numerical errors.

We propose the application of methods of Automatic Differentiation to the pricing

procedure for Bermudan swaptions. This approach yields derivatives of the numerical

scheme within machine precision. It incorporates differentiating the calibration and

the parabolic PDE integration. Our numerical results confirm the applicability and

accuracy of this approach.

The thesis is organized as follows: In this chapter we define the general notation

and setting. Moreover we describe a reference test problem and market data used

for the evaluation of the numerical results. Chapter 2 presents analytical formulas

for European swaptions and bond options in the Black’76 and Hull White model.

These formulas are used to define the objective function for the calibration. The

numerical solution of Bermudan bond options on a PDE grid is elaborated in Chapter

3. We study several aspects of the efficient discretisation and parametrization of the

numerical scheme. In Chapter 4 we elaborate the sensitivity evaluation in our Hull

White model. Finally we summarize our conclusions in Chapter 5.

1.1 Notation and general setting

We aim at pricing Bermudan swaptions. A Bermudan swaption gives the option holder the right to enter an interest rate swap at predefined dates. The underlying swap is assumed to exchange a fixed simple compounding rate $R$ against a floating rate $L_i$. Typical floating rate indices are interbank offered rates (Ibor) such as Euribor or Libor. In the following we use the general term Ibor rate. The subscript $i$ indicates the dependence of the floating rate on the accrual period.

Fixed leg coupon payment dates are denoted by $S_1, \dots, S_M$ and $S_0$ is the start date of the first coupon period. Year fractions associated with the fixed leg coupon periods are $\tau_1, \dots, \tau_M$. Moreover we assume that fixed leg payment dates are a subset of the floating rate payment dates. In the Euro market fixed payments are usually annual and the fixed leg day count convention is 30/360.

Floating leg payment dates are given by $\tilde S_1, \dots, \tilde S_{\tilde M}$ and $\tilde S_0$ is again the start date of the first coupon period. Corresponding year fractions are $\eta_1, \dots, \eta_{\tilde M}$. The fixed and float leg must belong to the same time period, i.e. $S_0 = \tilde S_0$ and $S_M = \tilde S_{\tilde M}$.

The price of a risk free zero coupon bond at observation time t = 0 with maturity

T (t ≤ T ) is given by P (t, T ). The mapping T ↦ P (t, T ) represents the yield curve

at observation time t. The yield curve may be inferred from deposit, forward rate,

and swap rates quoted in the market. It is usually given as an interpolated set of zero

coupon bond prices (i.e. discount factors). We work in a single yield curve setting. For

example, if we price a swaption on a standard swap exchanging fixed against 6-month

Euribor the yield curve must be bootstrapped from quotes depending on the 6-month

Euribor, i.e. standard 6-month Euribor swap quotes. By this single curve setting we

assume that we can fund and invest at rates given by the yield curve. This assumption

is important to bridge the gap from the pricing of swaptions to the pricing of bond

options. However in recent market situations we face significant liquidity, basis spread

and credit risks which are not covered by this setting. Although to our knowledge

this setting is market standard for the pricing of Bermudan swaptions it should be

used with care.

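Ibor floating rates are reference rates for short term deposits. The discount factor $P(t,T)$ inferred from an Ibor rate $L^{\text{fix}}$ fixed for the time period $[t,T]$ is given by

$$P(t,T) = \frac{1}{1 + L^{\text{fix}}\,\eta(t,T)}.$$

Here $\eta(t,T)$ denotes the year fraction between observation time $t$ and maturity time $T$. Correspondingly, the future Ibor rate $L_j^{\text{fix}}$ as observed at $\tilde S_{j-1}$ for the accrual period $[\tilde S_{j-1}, \tilde S_j]$ becomes

$$L_j^{\text{fix}} = \frac{1 - P(\tilde S_{j-1}, \tilde S_j)}{\eta(\tilde S_{j-1}, \tilde S_j)\,P(\tilde S_{j-1}, \tilde S_j)}.$$

The corresponding forward Ibor rate $L_j(t)$ as seen at observation time $t$ is defined as

$$L_j(t) = \begin{cases} \dfrac{P(t, \tilde S_{j-1}) - P(t, \tilde S_j)}{\eta(\tilde S_{j-1}, \tilde S_j)\,P(t, \tilde S_j)} & \text{for } t \le \tilde S_{j-1}, \\[2ex] L_j^{\text{fix}} & \text{for } t > \tilde S_{j-1}. \end{cases}$$

Using risk neutral valuation [BM07] we find that

$$L_j(t) = E^{\tilde S_j}\left[ L_j^{\text{fix}} \mid \mathcal{F}(t) \right].$$

The expectation $E^{\tilde S_j}$ is taken in the $\tilde S_j$-forward measure conditional on the information at observation time $t$ (characterized by the filtration $\mathcal{F}(t)$). As a consequence the time $t$ ($t < \tilde S_{j-1}$) price of a cash flow $L_j^{\text{fix}}\,\eta(\tilde S_{j-1}, \tilde S_j)$ paid at $\tilde S_j$ becomes

$$P(t, \tilde S_j)\,E^{\tilde S_j}\left[ L_j^{\text{fix}}\,\eta(\tilde S_{j-1}, \tilde S_j) \mid \mathcal{F}(t) \right] = P(t, \tilde S_j)\,L_j(t)\,\eta(\tilde S_{j-1}, \tilde S_j) = P(t, \tilde S_{j-1}) - P(t, \tilde S_j).$$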

The price of the underlying swap at time $t \le S_0$ is determined by discounting the fixed and forward floating leg cash flows. For a (fixed) receiver swap it becomes

$$\text{Swap}(t) = \underbrace{R \sum_{i=1}^{M} \tau_i\,P(t, S_i)}_{\text{FixedLeg}(t)} - \underbrace{\sum_{j=1}^{\tilde M} L_j(t)\,\eta_j\,P(t, \tilde S_j)}_{\text{FloatLeg}(t)}.$$

We require that the swap's float leg day count convention coincides with the day count convention of the underlying Ibor rate. That is, $\eta_j = \eta(\tilde S_{j-1}, \tilde S_j)$. Moreover, the accrual periods of the Ibor rates $L_j^{\text{fix}}$ must start and end at the payment dates exactly. Since we are working in our single curve environment the time $t$ price of the floating leg can be simplified to

$$\text{FloatLeg}(t) = \sum_{j=1}^{\tilde M} \left[ P(t, \tilde S_{j-1}) - P(t, \tilde S_j) \right] = P(t, \tilde S_0) - P(t, \tilde S_{\tilde M}) = P(t, S_0) - P(t, S_M).$$

Thus we obtain the swap pricing formula

$$\text{Swap}(t) = R \sum_{i=1}^{M} \tau_i\,P(t, S_i) - \left[ P(t, S_0) - P(t, S_M) \right]$$

which only depends on the fixed rate $R$, the fixed leg schedule $S_0, \dots, S_M$, and the yield curve $T \mapsto P(t,T)$. Rearranging terms yields that the swap can also be interpreted as the time $t$ price of a risk free (forward) bond contract with unit bond price paid at $S_0$, fixed coupons $R\tau_i$ paid at $S_i$ for $i = 1, \dots, M$ and unit notional payment at $S_M$. That is

$$\text{Bond}(t) = \underbrace{-P(t, S_0)}_{\text{bond price}} + \underbrace{\sum_{i=1}^{M} R\,\tau_i\,P(t, S_i) + P(t, S_M)}_{\text{coupons and notional}},$$

$$\text{Swap}(t) = \text{Bond}(t).$$
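The telescoping of the floating leg and the resulting swap formula can be checked numerically. The following is a minimal sketch under stated assumptions, not the implementation used in this thesis: the flat 3% curve, the annual and semi-annual schedules, and all function names are hypothetical, and year fractions are taken as plain time differences.

```python
import math


def swap_value(fixed_rate, fixed_times, year_fractions, discount):
    """Receiver swap: R * sum_i tau_i P(t, S_i) - [P(t, S_0) - P(t, S_M)]."""
    annuity = sum(tau * discount(s)
                  for tau, s in zip(year_fractions, fixed_times[1:]))
    float_leg = discount(fixed_times[0]) - discount(fixed_times[-1])
    return fixed_rate * annuity - float_leg


def float_leg_from_forwards(float_times, discount):
    """Sum of forward Ibor coupons L_j(t) * eta_j discounted from each payment date."""
    total = 0.0
    for start, end in zip(float_times[:-1], float_times[1:]):
        eta = end - start  # assumption: year fraction = plain time difference
        forward = (discount(start) - discount(end)) / (eta * discount(end))
        total += forward * eta * discount(end)
    return total


def discount(T):
    """Hypothetical flat curve with a 3% continuously compounded rate."""
    return math.exp(-0.03 * T)


fixed_times = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]        # S_0, ..., S_M (annual)
year_fractions = [1.0] * 5                          # tau_1, ..., tau_M
float_times = [1.0 + 0.5 * k for k in range(11)]    # semi-annual floating dates

float_leg = float_leg_from_forwards(float_times, discount)
telescoped = discount(fixed_times[0]) - discount(fixed_times[-1])
annuity = sum(tau * discount(s)
              for tau, s in zip(year_fractions, fixed_times[1:]))
par_rate = telescoped / annuity  # fixed rate for which the swap value vanishes

print(float_leg, telescoped)
print(swap_value(par_rate, fixed_times, year_fractions, discount))
```

In this single curve sketch the sum of discounted forward coupons agrees with the difference of the two boundary discount factors up to rounding, and the swap value at the par rate is zero.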

In practice swaps are usually priced incorporating the basis spread between differ-

ent forward Ibor indices, e.g. 6-month Euribor against 3-month Euribor. This means

that the forward Ibor rates Lj(t) may be evaluated from a yield curve different to the

discount curve given by the discount factors P (t, T ). Additionally, rolling and busi-

ness day conventions of the swap may yield payment dates Sj that have a small lag

(of one or two days) to the end dates of the accrual periods of the floating rates Lfixj .

In such situations the above equivalence between swaps and (risk free) bonds only

holds approximately. Nevertheless, for the pricing of swaptions as bond options (or

vice versa) the equivalence between swaps and bonds is usually assumed in practice.

A swaption gives its holder the right to enter a swap at a given strike rate $R$. The swaption is considered to be of European style if the right may be exercised at a single predefined date. Bermudan swaptions bear the right to enter a fixed maturity swap at several predefined exercise dates. At an exercise date $T$ with $S_{i_0} = \min_{i=1,\dots,M} \{ S_i \mid S_i \ge T \}$ the payoff of the swaption is given by

$$\text{Swaption}(T) = \left[ \omega\,\text{Swap}(T) \right]^+ = \left[ \omega \left( R \sum_{i=i_0+1}^{M} \tau_i\,P(T, S_i) - \left[ P(T, S_{i_0}) - P(T, S_M) \right] \right) \right]^+.$$

Here $\omega \in \{-1, +1\}$ distinguishes between a payer ($-1$) and receiver ($+1$) swaption and $[\,\cdot\,]^+$ abbreviates $\max\{\,\cdot\,, 0\}$.

Exploiting the equivalence between swaps and bonds yields that the swaption may be interpreted as a coupon bond option with payoff

$$\text{CBO}(T) = \left[ \omega \left( \underbrace{\sum_{i=i_0+1}^{M} R\,\tau_i\,P(T, S_i)}_{\text{coupons}} + \underbrace{P(T, S_M)}_{\text{notional}} - \underbrace{P(T, S_{i_0})}_{\text{strike}} \right) \right]^+.$$

The underlying bond coupons equal the fixed leg payments of the swap. At maturity the unit notional is paid and the option strike equals the unit notional. In order to obtain analytical formulas for bond options in the Hull White model we have to restrict the exercise dates to the settlement dates. That is, we require that $T = S_{i_0}$. In cases where the option is not settled at the exercise date, i.e. the strike is not paid at $T$, a corresponding approximated time-$T$ strike could be determined by (forward) discounting from $T$ to $S_{i_0}$. However, this approximation only holds if the settlement offset is small.

1.2 Reference test problem

We use a standard setting for a test problem to verify the applicability of the methods

elaborated in this thesis. If not stated otherwise we consider the pricing of a EUR

fixed maturity Bermudan Swaption. The underlying swap receives 4% fixed on an

annual 30/360 day count basis against an Ibor floating rate. The swap matures in

31 years. It may be entered annually with the first exercise date being in one year.

Hence we have 30 exercise dates in 1Y, 2Y, to 30Y. We use reference market data as

of February 2010. The yield curve and swaption volatility surface are illustrated in

Figure 1.1 and Figure 1.2.

Figure 1.1: EUR yield curve for reference test problem. [Plot omitted; horizontal axis: time to maturity in years, 0 to 35.]

Figure 1.2: Implicit EUR swaption volatility surface for a strike of 4%. [Plot omitted; axes: swap tenor in years, time to exercise in years, Black’76 volatilities.]

The Hull White model is calibrated to replicate the prices of reference European swaptions. In our test setting the 30 reference European swaptions correspond to each of the Bermudan exercise dates. Denote by $\text{Swaption}_i$ the reference European swaption with exercise in $i$ years ($i = 1, \dots, 30$). Its payoff is equivalent to a coupon bond call option $\text{CBO}_i$ with $31 - i$ coupon payments in $i + 1$ to 31 years plus a unit notional payment in 31 years. The reference European swaptions are particularly used to estimate the accuracy of the numerical PDE methods.

Chapter 2

Analytical Pricing Formulas for Swaptions and Bond Options

In this chapter we elaborate analytical pricing formulas for swaptions and bond op-

tions. The formulas are used to calibrate the Hull White model to market observable

data. Moreover we use the analytical pricing formulas of the Hull White model to

verify the numerical methods.

2.1 Black’76 formula for European swaptions

European swaptions may be valued in the Black’76 framework. For a derivation of the formulas see for example [BM07]. The payoff of the swaption is rewritten as

$$\text{Swaption}(T) = \left[ \omega \sum_{i=i_0+1}^{M} \tau_i\,P(T, S_i) \left( R - \frac{P(T, S_{i_0}) - P(T, S_M)}{\sum_{i=i_0+1}^{M} \tau_i\,P(T, S_i)} \right) \right]^+.$$

In this representation the annuity is given by

$$\text{Annuity}(T) = \sum_{i=i_0+1}^{M} \tau_i\,P(T, S_i)$$

and the (forward) par swap rate is denoted by

$$Y(T) = \frac{P(T, S_{i_0}) - P(T, S_M)}{\sum_{i=i_0+1}^{M} \tau_i\,P(T, S_i)}.$$

With this notation the swaption payoff becomes

$$\text{Swaption}(T) = \text{Annuity}(T) \left[ -\omega \left( Y(T) - R \right) \right]^+.$$

Thus a European receiver (payer) swaption is equivalent to a European put (call) on the forward par swap rate $Y(T)$ with strike $R$.

The Black’76 model assumes that conditional on the information available at observation time $t$ the forward par swap rate $Y(T)$ is log-normally distributed with mean $Y(t)$ and variance $\sigma_{B76}^2\,\delta(t,T)$. The function $\delta(\cdot,\cdot)$ denotes the year fraction function applied to scale the Black’76 volatility $\sigma_{B76}$. The resulting price of the European swaption at observation time $t$ becomes

$$\text{Swaption}(t) = \text{Annuity}(t) \cdot \text{Black76}(Y(t), R, \sigma_{B76}, \delta(t,T), -\omega). \tag{2.1}$$

The Black’76 formula for European puts ($\omega = -1$) and calls ($\omega = +1$) with forward price $F$, strike $K$, volatility $\sigma$, and time to maturity $\tau$ is

$$\text{Black76}(F, K, \sigma, \tau, \omega) = \omega \left[ F\,\Phi(\omega d_1) - K\,\Phi(\omega d_2) \right], \qquad d_{1,2} = \frac{\log(F/K)}{\sigma\sqrt{\tau}} \pm \frac{\sigma\sqrt{\tau}}{2}.$$
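The Black’76 formula and Equation (2.1) translate directly into code. The following is a minimal sketch assuming the standard normal distribution function is computed from the error function; the function names and the sample inputs are hypothetical and not taken from the reference test problem.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black76(forward, strike, sigma, tau, omega):
    """Undiscounted Black'76 price; omega = +1 for a call, -1 for a put."""
    std = sigma * math.sqrt(tau)
    d1 = math.log(forward / strike) / std + 0.5 * std
    d2 = d1 - std
    return omega * (forward * norm_cdf(omega * d1) - strike * norm_cdf(omega * d2))


def european_swaption(annuity, par_rate, strike_rate, sigma, tau, omega):
    """Equation (2.1); omega = +1 receiver, -1 payer (note the sign flip)."""
    return annuity * black76(par_rate, strike_rate, sigma, tau, -omega)


# Hypothetical inputs: annuity 8.5, forward par rate 3.8%, strike 4%,
# Black'76 volatility 15%, time to exercise 5 years.
receiver = european_swaption(8.5, 0.038, 0.04, 0.15, 5.0, +1)
payer = european_swaption(8.5, 0.038, 0.04, 0.15, 5.0, -1)
print(receiver, payer)
```

The sign flip in `european_swaption` reflects that a receiver swaption is a put on the forward par swap rate. The difference between the receiver and the payer price equals the annuity times $R - Y(t)$, which is the usual put-call parity.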

Market prices of European swaptions are quoted in terms of implicit Black’76

volatilities. These quotes are given for several times to maturity δ(t, T ) and remaining

swap tenors δ(Si0 , SM). Moreover the prices and implicit volatilities depend on the

moneyness of the swaption strike rate R. The moneyness for swaptions is measured by

the absolute difference R−Y (t). As a result a swaption volatility cube is spanned by

the time to exercise, the remaining swap tenor, and the moneyness. To evaluate prices

of specific swaptions for which no direct volatility quote is available interpolation

schemes are used.

Considering the smile. For the calibration of the Hull White model we require

swaption volatilities for several exercise dates and swap tenors. The fixed rates of

the swaptions in question are equal to the corresponding fixed rate of the Bermudan

swaption characterizing the calibration problem. Since the forward par swap rates

vary for the European swaptions also the moneynesses of the European swaptions

differ. Therefore we have to interpolate the smiles of the swaption volatilities.

As reported for example in [BM07] it is market practice to interpolate volatility

smiles by the approximated formulas of the SABR model [HKLW02]. We calibrate the

SABR model separately to available market quotes of swaption volatilities for a given

pair of exercise date and swap tenor. Using the resulting grid of SABR parameters

we build a volatility surface for the required strike R. Then the resulting volatility

surface is interpolated linearly and extrapolated constantly. A graph of the SABR

swaption volatility surface used for the numerical tests is given in Figure 1.2.

The modelling of the volatility smile is crucial for the pricing of swaptions. Neither the Black’76 nor the Hull White model models the smile. In the Black’76 model we use implicit volatilities which are determined from market prices of European swaptions by inverting the same Black’76 model. The dependence of the Hull White option prices on the strike is only taken into account by the calibration. Given prices of reference European swaptions incorporate the smile. If the Hull White model is calibrated to these European swaptions it may replicate the smile of these specific instruments. As the smile modelling is such an important issue it is subject to a wide range of research activities. For further details see, for example, [RMW09] and references therein.

In our setting we assume the required Black’76 volatility surface given for the

strike rate in question. As mentioned before this surface may be evaluated using a

SABR model. In a simplified setting one could also neglect the smile and use the

quoted at-the-money swaption volatility surface.

2.2 Analytical Formulas for the Hull White model

The Hull White model [HW90] specifies a stochastic process for the short rate r(t).

The model is given by

dr(t) = [θ(t)− ar(t)] dt+ σ(t)dW (t).

Here θ(t) denotes the risk neutral drift, a the constant mean reversion parameter,

and σ(t) the volatility of the short rate. W (t) is a Brownian motion under the risk

neutral measure with the bank account as numeraire.

2.2.1 Risk-neutral drift

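The stochastic differential equation of the short rate may be solved for $r$. We find that for $t > T$

$$r(t) = e^{-a(t-T)} \left[ r(T) + \int_T^t e^{a(u-T)} \left( \theta(u)\,du + \sigma(u)\,dW(u) \right) \right].$$

Given the information at time $T$ the price of a zero coupon bond with maturity $S > T$ becomes

$$\text{ZCB}(T, S) = E^{Q} \left[ e^{-\int_T^S r(t)\,dt} \mid \mathcal{F}(T) \right].$$

It follows that

$$\int_T^S r(t)\,dt = \int_T^S e^{-a(t-T)} \left[ r(T) + \int_T^t e^{a(u-T)} \left( \theta(u)\,du + \sigma(u)\,dW(u) \right) \right] dt. \tag{2.2}$$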

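The integral may be decomposed as $B(T,S)\,r(T) + X(T,S) + Y(T,S)$ with

$$B(T,S) = \int_T^S e^{-a(t-T)}\,dt = \frac{1 - e^{-a(S-T)}}{a},$$

$$X(T,S) = \int_T^S e^{-a(t-T)} \int_T^t e^{a(u-T)}\,\theta(u)\,du\,dt = \int_T^S \theta(u)\,B(u,S)\,du,$$

$$Y(T,S) = \int_T^S e^{-a(t-T)} \int_T^t e^{a(u-T)}\,\sigma(u)\,dW(u)\,dt = \int_T^S \sigma(u)\,B(u,S)\,dW(u).$$

The term for $X(T,S)$ may be derived by changing the order of integration

$$\int_T^S \int_T^t e^{-a(t-T)}\,e^{a(u-T)}\,\theta(u)\,du\,dt = \int_T^S \int_u^S e^{-a(t-T)}\,e^{a(u-T)}\,\theta(u)\,dt\,du = \int_T^S \left( \int_u^S e^{-a(t-u)}\,dt \right) \theta(u)\,du = \int_T^S \theta(u)\,B(u,S)\,du.$$

$Y(T,S)$ follows analogously. Moreover, we have for $B(T,S)$ that $B(S,S) = 0$, $B_S(T,S) = e^{-a(S-T)}$, $B_S(S,S) = 1$, and $B_{SS}(T,S) = -a\,B_S(T,S)$.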

The only stochastic term in (2.2) is $Y(T,S)$. Since $Y(T,S)$ is an Ito integral with deterministic integrand we find that

$$Y(T,S) \sim N\left( 0, \int_T^S \sigma^2(u)\,B^2(u,S)\,du \right).$$

For reference see, for example [Shr04, Th. 4.4.9]. Thus, given the information at time $T$, we find that $e^{-\int_T^S r(t)\,dt}$ is log-normally distributed and

$$\text{ZCB}(T,S) = \exp\left\{ -\left( B(T,S)\,r(T) + \int_T^S \theta(u)\,B(u,S)\,du - \frac{1}{2} \int_T^S \sigma^2(u)\,B^2(u,S)\,du \right) \right\}.$$

The Hull White model should reproduce the initial market observed yield curve. Therefore we must have that at observation time $t$ the formula for the zero coupon bond prices $\text{ZCB}(t,S)$ equals the discount factors $P(t,S)$ for all $S \ge t$. For that purpose we define the continuously compounded forward rates $f(t,S)$ as

$$f(t,S) = -\frac{\partial \log P(t,S)}{\partial S}.$$

We assume that the yield curve is modelled such that $f(t,S)$ is continuous in $S$ and at least piecewise continuously differentiable. Then we can write

$$f(t,S) = r(t) + \int_t^S f_S(t,u)\,du$$

and

$$P(t,S) = e^{-\int_t^S f(t,v)\,dv} = e^{-\int_t^S \left( r(t) + \int_t^v f_S(t,u)\,du \right) dv}.$$

From $P(t,S) = \text{ZCB}(t,S)$ and the properties of $B(t,S)$ it follows that

$$f(t,S) = B_S(t,S)\,r(t) + \int_t^S \theta(u)\,B_S(u,S)\,du - \int_t^S \sigma^2(u)\,B(u,S)\,B_S(u,S)\,du,$$

$$f_S(t,S) = \theta(S) - a\,f(t,S) - \int_t^S \sigma^2(u)\,B_S^2(u,S)\,du.$$

This yields the risk neutral drift of the Hull White model as

$$\theta(S) = f_S(t,S) + a\,f(t,S) + \int_t^S \sigma^2(u)\,B_S^2(u,S)\,du.$$

A Hull White model with risk neutral drift is arbitrage free in an economy with all

zero coupon bonds as tradeable assets.
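That the drift formula reproduces the initial yield curve can be checked numerically. The following is a sketch under simplifying assumptions that are not part of the thesis setting: observation time $t = 0$, a constant volatility $\sigma$, and a hypothetical linear forward curve $f(0,S) = r_0 + bS$, for which the volatility integral in $\theta$ has the closed form $\sigma^2 (1 - e^{-2aS})/(2a)$. All parameter values and function names are illustrative.

```python
import math

A_MR, SIGMA = 0.05, 0.01     # hypothetical mean reversion and constant volatility
R0, SLOPE = 0.02, 0.002      # hypothetical forward curve f(0, S) = R0 + SLOPE * S


def forward(S):
    return R0 + SLOPE * S


def bond_B(T, S):
    return (1.0 - math.exp(-A_MR * (S - T))) / A_MR


def theta(S):
    """Risk neutral drift f_S + a f + int_0^S sigma^2 B_S(u, S)^2 du."""
    variance_term = SIGMA ** 2 * (1.0 - math.exp(-2.0 * A_MR * S)) / (2.0 * A_MR)
    return SLOPE + A_MR * forward(S) + variance_term


def simpson(func, lo, hi, n=400):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * func(lo + k * h)
    return total * h / 3.0


def model_log_discount(S):
    """log ZCB(0, S) from the bond formula with r(0) = f(0, 0)."""
    drift = simpson(lambda u: theta(u) * bond_B(u, S), 0.0, S)
    convexity = 0.5 * simpson(lambda u: SIGMA ** 2 * bond_B(u, S) ** 2, 0.0, S)
    return -(bond_B(0.0, S) * forward(0.0) + drift - convexity)


def market_log_discount(S):
    """log P(0, S) implied by the linear forward curve."""
    return -(R0 * S + 0.5 * SLOPE * S ** 2)


for maturity in (1.0, 5.0, 10.0):
    print(maturity, model_log_discount(maturity), market_log_discount(maturity))
```

For each maturity the model and market log discount factors agree up to the quadrature error of the Simpson rule.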

2.2.2 Bond and bond option formulas

We use the Hull White model with risk neutral drift calibrated to the observed yield

curve to determine prices of bonds and bond options in the Hull White model. The

formulas are applied to calibrate the short rate volatility as described in the following section.

Zero coupon bonds. We derive an expression of the zero coupon bond $\text{ZCB}(T,S)$ given the information about the short rate $r(T)$ at time $T$ in a model that replicates the yield curve at observation time $t \le T$. For that purpose we consider again the prices of the discount factors $P(t,T)$ and $P(t,S)$ in the Hull White model. We have

$$P(t,S) = \exp\left\{ -\left( B(t,S)\,r(t) + \int_t^S \theta(u)\,B(u,S)\,du - \frac{1}{2} \int_t^S \sigma^2(u)\,B^2(u,S)\,du \right) \right\}.$$

For the price of the discount factor $P(t,T)$ we substitute the equivalence $B(u,T) = B(u,S) - B_S(u,T)\,B(T,S)$ and get

$$P(t,T) = \exp\left\{ -\left( B(t,T)\,r(t) + \int_t^T \theta(u) \left[ B(u,S) - B_S(u,T)\,B(T,S) \right] du - \frac{1}{2} \int_t^T \sigma^2(u) \left[ B(u,S) - B_S(u,T)\,B(T,S) \right]^2 du \right) \right\}.$$

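It follows that

$$\begin{aligned} \log\left( \frac{P(t,T)}{P(t,S)} \right) = {} & \left[ B(t,S) - B(t,T) \right] r(t) + \int_T^S \theta(u)\,B(u,S)\,du + B(T,S) \int_t^T \theta(u)\,B_S(u,T)\,du \\ & - \frac{1}{2} \int_T^S \sigma^2(u)\,B^2(u,S)\,du - B(T,S) \int_t^T \sigma^2(u)\,B(u,S)\,B_S(u,T)\,du \\ & + \frac{1}{2} B^2(T,S) \int_t^T \sigma^2(u)\,B_S^2(u,T)\,du. \end{aligned}$$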

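Rearranging terms and further substitutions yield

$$\begin{aligned} \log\left( \frac{P(t,T)}{P(t,S)} \right) = {} & \int_T^S \theta(u)\,B(u,S)\,du - \frac{1}{2} \int_T^S \sigma^2(u)\,B^2(u,S)\,du + B(T,S)\,f(t,T) \\ & - B(T,S) \int_t^T \sigma^2(u) \left[ B(u,S) - B(u,T) \right] B_S(u,T)\,du + \frac{1}{2} B^2(T,S) \int_t^T \sigma^2(u)\,B_S^2(u,T)\,du \\ = {} & \int_T^S \theta(u)\,B(u,S)\,du - \frac{1}{2} \int_T^S \sigma^2(u)\,B^2(u,S)\,du + B(T,S)\,f(t,T) \\ & - \frac{1}{2} B^2(T,S) \int_t^T \sigma^2(u)\,B_S^2(u,T)\,du. \end{aligned}$$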

We end up with

$$\int_T^S \theta(u)\,B(u,S)\,du - \frac{1}{2} \int_T^S \sigma^2(u)\,B^2(u,S)\,du = \log\left( \frac{P(t,T)}{P(t,S)} \right) - B(T,S)\,f(t,T) + \frac{1}{2} B^2(T,S) \int_t^T \sigma^2(u)\,B_S^2(u,T)\,du.$$

Given that the Hull White model is calibrated to data at time $t$ the price of a zero coupon bond at time $T$, with maturity $S$, and realized short rate $r$ becomes

$$\text{ZCB}(t; T, S, r) = A(t; T, S)\,e^{-B(T,S)\,r},$$

$$A(t; T, S) = \frac{P(t,S)}{P(t,T)} \exp\left\{ B(T,S)\,f(t,T) - \frac{B(T,S)^2}{2} \int_t^T \sigma^2(u)\,B_S^2(u,T)\,du \right\}.$$

Zero coupon bond options. Consequently, we can derive the price of a zero coupon bond option at observation time $t$. We have from (2.2) that for $T > t$

$$\text{Var}\left[ r(T) \mid \mathcal{F}(t) \right] = \int_t^T \sigma^2(u)\,B_S^2(u,T)\,du.$$

Hence, conditional on $\mathcal{F}(t)$, the time $T$ price of a zero coupon bond is log-normally distributed with variance

$$\sigma_P^2 = B^2(T,S) \int_t^T \sigma^2(u)\,B_S^2(u,T)\,du.$$

Moreover, we have in the $T$-forward measure that

$$E^T\left[ \text{ZCB}(t; T, S, r(T)) \mid \mathcal{F}(t) \right] = \frac{\text{ZCB}(t; t, S, r(t))}{P(t,T)} = \frac{P(t,S)}{P(t,T)}.$$

As a result we find that in the $T$-forward measure the price of a zero coupon bond is log-normally distributed with mean $P(t,S)/P(t,T)$ and variance $\sigma_P^2$. Hence for a call or put option on a zero coupon bond we can use the Black’76 formula. The time $t$ price of an option on a zero coupon bond with exercise date $T$, bond maturity $S$, and strike price $K$ paid at $T$ is given by

$$\text{ZCO}(t; T, S, K, \omega) = P(t,T)\,\text{Black76}\left( P(t,S)/P(t,T), K, \sigma_P, 1, \omega \right).$$
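The zero coupon bond option formula can be sketched in a few lines for the special case of a constant short rate volatility, where $\int_t^T \sigma^2 B_S^2(u,T)\,du = \sigma^2 (1 - e^{-2a(T-t)})/(2a)$. The discount factors, the volatility, and the mean reversion below are those of the small test problem (2.4) in Section 2.2.3; the strike of 0.98 and all function names are assumptions made for illustration only.

```python
import math


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black76(forward, strike, sigma, tau, omega):
    """Undiscounted Black'76 price; omega = +1 for a call, -1 for a put."""
    std = sigma * math.sqrt(tau)
    d1 = math.log(forward / strike) / std + 0.5 * std
    d2 = d1 - std
    return omega * (forward * norm_cdf(omega * d1) - strike * norm_cdf(omega * d2))


def bond_volatility(a, sigma, t, T, S):
    """sigma_P = B(T, S) * sqrt(Var[r(T) | F(t)]) for constant sigma."""
    B = (1.0 - math.exp(-a * (S - T))) / a
    var_r = sigma ** 2 * (1.0 - math.exp(-2.0 * a * (T - t))) / (2.0 * a)
    return B * math.sqrt(var_r)


def zero_bond_option(p_T, p_S, strike, a, sigma, t, T, S, omega):
    """ZCO(t; T, S, K, omega) = P(t, T) * Black76(P(t, S) / P(t, T), K, sigma_P, 1, omega)."""
    sigma_p = bond_volatility(a, sigma, t, T, S)
    return p_T * black76(p_S / p_T, strike, sigma_p, 1.0, omega)


p_1y, p_2y = 0.988003933, 0.969893541   # discount factors of yield curve (2.4)
strike = 0.98                           # hypothetical strike, not from the thesis
call = zero_bond_option(p_1y, p_2y, strike, 0.05, 0.05, 0.0, 1.0, 2.0, +1)
put = zero_bond_option(p_1y, p_2y, strike, 0.05, 0.05, 0.0, 1.0, 2.0, -1)
print(call, put)
```

Since the strike used for Table 2.1 is not stated here, the printed prices are not meant to reproduce the table. A useful sanity check is put-call parity: the call minus the put price equals $P(t,S) - K\,P(t,T)$.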

Coupon bonds and coupon bond options. A coupon bond with cash flows $c_i$ at coupon payment dates $S_i$ is determined by the sum of the scaled zero coupon bond prices, i.e.

$$\text{CB}(t; T, S_1, \dots, S_M, r) = \sum_{S_i \ge T} c_i\,\text{ZCB}(t; T, S_i, r).$$

An option on a coupon bond with cash flows $c_i$ at coupon payment dates $S_i$, exercise date $T$, and strike price $K$ may be valued using Jamshidian’s decomposition [Jam89]. This approach requires solving the equation

$$\text{CB}(t; T, S_1, \dots, S_M, r^\star) = K$$

for the short rate $r^\star$. Using the resulting short rate $r^\star$ we can evaluate corresponding individual strikes $K_i$ by

$$K_i = \text{ZCB}(t; T, S_i, r^\star).$$

With these individual strikes the coupon bond option can be priced as a sum of zero coupon bond options. We have that

$$\text{CBO}(t, T, S_1, \dots, S_M, \omega) = \sum_{S_i \ge T} c_i\,\text{ZCO}(t, T, S_i, K_i, \omega). \tag{2.3}$$
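The root-finding step of Jamshidian’s decomposition can be sketched as follows. This is an illustration under assumptions, not the solver used for the numerical results: the coefficients $A_i = A(t;T,S_i)$ and $B_i = B(T,S_i)$ are taken as given (hypothetical) numbers, and plain bisection is used, which works because the coupon bond price is strictly decreasing in $r$ for positive cash flows and positive $B_i$.

```python
import math


def coupon_bond(r, cashflows, A, B):
    """CB(r) = sum_i c_i * A_i * exp(-B_i * r)."""
    return sum(c * a_i * math.exp(-b_i * r)
               for c, a_i, b_i in zip(cashflows, A, B))


def solve_critical_rate(cashflows, A, B, strike, lo=-1.0, hi=1.0, tol=1e-14):
    """Bisection for r* with CB(r*) = strike on the bracket [lo, hi]."""
    if not coupon_bond(lo, cashflows, A, B) > strike > coupon_bond(hi, cashflows, A, B):
        raise ValueError("strike is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if coupon_bond(mid, cashflows, A, B) > strike:
            lo = mid   # bond still too expensive, so the rate must be higher
        else:
            hi = mid
    return 0.5 * (lo + hi)


def individual_strikes(cashflows, A, B, strike):
    """Return r* and the strikes K_i = A_i * exp(-B_i * r*)."""
    r_star = solve_critical_rate(cashflows, A, B, strike)
    return r_star, [a_i * math.exp(-b_i * r_star) for a_i, b_i in zip(A, B)]


# Hypothetical three-cash-flow bond: two 4% coupons and a final coupon plus notional.
cashflows = [0.04, 0.04, 1.04]
A = [0.97, 0.94, 0.91]
B = [0.98, 1.90, 2.78]
r_star, strikes = individual_strikes(cashflows, A, B, 1.0)
print(r_star, strikes)
```

By construction the individual strikes satisfy $\sum_i c_i K_i = K$, so the coupon bond option payoff splits into a sum of zero coupon bond option payoffs that are all exercised in the same states of $r(T)$.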

2.2.3 Specifying the model parameters

So far we did not specify how we model the functional parameters in the Hull White

model. The derived formulas in the Hull White model require the current yield curve

and the short rate volatility function.

Piecewise constant volatility. It is common to assume that the volatility is piecewise constant between two exercise dates of a Bermudan swaption. Let $T_0 = t$ and denote the Bermudan exercise dates by $T_1, \dots, T_N$. Then we have that

$$\sigma(t) = \sigma_j \quad \text{for } t \in (T_{j-1}, T_j], \quad j = 1, \dots, N.$$

In this setting we can evaluate $C(T,S) = \int_t^T \sigma^2(u)\,B_S^2(u,T)\,du$. Additionally, we incorporate the year fraction function $\delta(\cdot,\cdot)$ which measures the time in years in the model between two dates. Usually the year fraction is determined on an act/365 day count basis. The zero coupon bond at an exercise date $T_j$ becomes

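$$\text{ZCB}(t; T_j, S, r) = A(t; T_j, S)\,e^{-B(T_j,S)\,r},$$

$$A(t; T_j, S) = \frac{P(t,S)}{P(t,T_j)} \exp\left\{ B(T_j,S)\,f(t,T_j) - \frac{B(T_j,S)^2}{2}\,C(T_j,S) \right\},$$

$$B(T_j, S) = \frac{1 - e^{-a\,\delta(T_j,S)}}{a},$$

$$C(T_j, S) = \sum_{k=1}^{j} \frac{\sigma_k^2}{2a} \left[ e^{-2a\,\delta(T_k,T_j)} - e^{-2a\,\delta(T_{k-1},T_j)} \right].$$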

Moreover, the variance of the bond price can be rewritten as

$$\sigma_P^2 = B^2(T_j, S)\,C(T_j, S).$$

For the calibration procedure it is important to note that in this setting prices of zero and coupon bonds as well as zero and coupon bond options with exercise date $T_j$ depend only on short rate volatilities $\sigma_1$ to $\sigma_j$. The price is independent of volatilities corresponding to times larger than $T_j$.
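The variance integral for piecewise constant volatility can be illustrated as follows. The sketch evaluates $\int_t^{T_j} \sigma^2(u)\,e^{-2a\,\delta(u,T_j)}\,du$ interval by interval and compares it with a simple midpoint quadrature. It assumes that the year fraction is the plain difference of times in years; the exercise dates, the volatility values, and the function names are hypothetical.

```python
import math


def variance_integral(a, sigmas, times, j):
    """Closed form of int_{T_0}^{T_j} sigma(u)^2 exp(-2a (T_j - u)) du.

    times = [T_0, T_1, ..., T_N], sigmas = [sigma_1, ..., sigma_N],
    with sigma(u) = sigma_k on (T_{k-1}, T_k].
    """
    T_j = times[j]
    total = 0.0
    for k in range(1, j + 1):
        upper = math.exp(-2.0 * a * (T_j - times[k]))
        lower = math.exp(-2.0 * a * (T_j - times[k - 1]))
        total += sigmas[k - 1] ** 2 / (2.0 * a) * (upper - lower)
    return total


def variance_integral_numeric(a, sigmas, times, j, steps=2000):
    """Midpoint rule on each interval (T_{k-1}, T_k]; for comparison only."""
    T_j = times[j]
    total = 0.0
    for k in range(1, j + 1):
        h = (times[k] - times[k - 1]) / steps
        for m in range(steps):
            u = times[k - 1] + (m + 0.5) * h
            total += sigmas[k - 1] ** 2 * math.exp(-2.0 * a * (T_j - u)) * h
    return total


a = 0.05
times = [0.0, 1.0, 2.0, 3.0]        # T_0 = t and three exercise dates
sigmas = [0.010, 0.012, 0.009]      # sigma_1, sigma_2, sigma_3

for j in (1, 2, 3):
    print(j, variance_integral(a, sigmas, times, j),
          variance_integral_numeric(a, sigmas, times, j))
```

The loop over `k` stops at `j`, which makes the triangular dependence explicit: changing $\sigma_{j+1}, \dots, \sigma_N$ leaves the value for exercise date $T_j$ unchanged.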

Continuous forward rates. A key feature of the implementation is the specification of the initial yield curve. As pointed out in Section 1.1 the yield curve is

determined in terms of a set of discount factors 1 = P (t, T0), . . . , P (t, TN) for given

maturities t = T0 to TN . The grid of discount factors is in general independent

(and different) from the grid of Bermudan exercise and cash flow dates. Between the

dates of the yield curve grid we have to interpolate the discount factors. The type of

interpolation specifies the functional relation of the yield curve.

The calibration of the Hull White model to the initial yield curve is based on the assumption that discount factors $P(t,S)$ can be represented in terms of first derivatives $f_S(t,S)$ of the forward rates, i.e.

$$P(t,S) = \exp\left\{ -\int_t^S \left( r(t) + \int_t^v f_S(t,u)\,du \right) dv \right\}.$$

This assumption is equivalent to the requirement that the forward rates

$$f(t,S) = -\frac{\partial \log P(t,S)}{\partial S}$$

are continuous and (at least) piecewise continuously differentiable. Such a strong smoothness condition is not obvious because the zero coupon bond option prices depend only on discount factors and bond prices depend only on forward rates. However, this condition becomes observable if one compares analytic and numeric prices in the Hull White model.

For example, log-linear interpolation of discount factors is often used to model the

initial yield curve. Unfortunately such a model implies piecewise constant forward

rates which are in general not continuous. A similar situation occurs if zero rates are

interpolated linearly to obtain discount factors. The resulting forward rates are not

continuous. Some results related to that issue are elaborated in [HW06].

In our setting we model the initial yield curve by cubic C2-spline interpolation applied to the set of points

$$\left\{ (T_0, \log P(t,T_0)), \dots, (T_N, \log P(t,T_N)) \right\}.$$

Cubic C2-splines are twice continuously differentiable and thus the forward rates are continuously differentiable. The drawback of cubic C2-spline interpolation is, however, that it does not necessarily preserve monotonicity. This could in principle yield negative forward rates. In such situations a monotonicity preserving c-spline interpolation can be used instead.

We study the practical influence of the yield curve interpolation to our Hull White

PDE model by a simple test problem. For that purpose we use a yield curve given

by the following three discount factors

{(0, 1), (1Y, 0.988003933), (2Y, 0.969893541)} , (2.4)

a constant short rate volatility of 5% and mean reversion of 5%. The price of a Euro-

pean zero coupon bond option with exercise in 1Y and maturity in 2Y is evaluated by

the analytical formula and the PDE solver of the Hull White model. We compare the

resulting prices using log-linear and log-cubic C2-spline interpolation of the discount

factors in the upper part (2.4) of Table 2.1. The computations are repeated for an

alternative yield curve given by

{(0, 1), (6M, 0.995578534), (1Y, 0.988003933), (2Y, 0.969893541)} (2.5)

which includes an additional 6M discount factor. The corresponding numerical results

are given in the lower part (2.5) of Table 2.1.

Table 2.1: Approximation of the numerical solution for log-linear and log-cubic interpolation.

Yield Curve   Interpolation   Analytic Price   Numerical Price   Relative Error
(2.4)         log-linear      1.091061e-02     1.091060e-02      6.1e-07
(2.4)         log-cubic       1.091061e-02     1.091059e-02      2.0e-06
(2.5)         log-linear      1.091061e-02     1.314207e-02      2.0e-01
(2.5)         log-cubic       1.091061e-02     1.091052e-02      8.2e-06

As expected the analytical price of the zero coupon bond option is independent

of the yield curve and interpolation method applied. This is because the required

discount factors at 1Y and 2Y are given exactly and do not need to be interpolated.

For the three point yield curve we find accurate approximations of the numerical

scheme for both log-linear and log-cubic interpolation. In this situation log-linear

interpolation implies a constant and continuous forward rate curve between 0 and

If we include an additional 6M discount factor, log-linear interpolation implies a piecewise constant forward rate curve which is (in general) not continuous. Thus it cannot be modelled by the Hull White model. As a result we see that the numerical PDE solution of the zero coupon bond option deteriorates. In contrast to the log-linear discount factor interpolation, log-cubic discount factor interpolation yields an accurate approximation of the numerical solution.
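The forward rates implied by log-linear interpolation can be computed directly from the discount factors of the yield curves (2.4) and (2.5). The following sketch uses the fact that on each interval between two curve dates the forward rate equals the negative slope of $\log P$; the function name is hypothetical and times are measured in plain years.

```python
import math


def loglinear_forward_rates(times, discount_factors):
    """Piecewise constant forward rates on (T_{k-1}, T_k] for log-linear interpolation."""
    rates = []
    for k in range(1, len(times)):
        dt = times[k] - times[k - 1]
        slope = (math.log(discount_factors[k]) - math.log(discount_factors[k - 1])) / dt
        rates.append(-slope)
    return rates


# Yield curve (2.4): 0, 1Y, 2Y
rates_24 = loglinear_forward_rates([0.0, 1.0, 2.0],
                                   [1.0, 0.988003933, 0.969893541])
# Yield curve (2.5): 0, 6M, 1Y, 2Y
rates_25 = loglinear_forward_rates([0.0, 0.5, 1.0, 2.0],
                                   [1.0, 0.995578534, 0.988003933, 0.969893541])

print(rates_24)
print(rates_25)
```

For yield curve (2.4) there is a single forward rate on the interval up to the exercise date in 1Y, whereas for yield curve (2.5) the rates on the two half-year intervals before 1Y differ, so the forward rate curve jumps at 6M.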

2.3 Calibration of the Hull White Model

In Section 1.1 we illustrate that a forward receiver swap contract is equivalent to a

contract to buy a coupon bond. Consequently, a European receiver swaption with

strike rate R and fixed leg year fractions τ1, . . . , τM is equivalent to a call option on

a coupon bond with coupons Rτi and unit strike price.

Calibrating the Hull White model means choosing the model parameters such that

the model prices for European coupon bond options given by Equation (2.3) coincide

in a well defined way with market prices of European swaptions determined from

quoted Black’76 swaption volatilities and Equation (2.1). In our setting the model

parameters are the piecewise constant Hull White volatility values σj and the mean

reversion parameter a.

In a test case we analyse the calibration of the mean reversion a and a constant

short rate volatility σ to 10 EUR denominated at-the-money European swaptions

with exercises ranging from 10 to 19 years and fixed maturity in 20 years. We use

market data as of September 2009. The objective function to be minimized and the convergence history of an optimization run with a constrained Newton method are illustrated in Figure 2.1.

Figure 2.1: Convergence history for the simultaneous calibration of volatility and mean reversion. [Panels: objective function f(x), accumulating over the 10 swaptions the deviations between the Hull White prices PV^HW_i and the Black'76 prices PV^B76_i, plotted over mean reversion and short rate volatility; convergence history; zoom to the last seven iteration steps.]

The numerical example demonstrates that there is only very limited progress w.r.t.

the objective function in the last iterations. Moreover there is a steep descent towards

a convergence valley. This observation shows that calibrating the mean reversion to

European swaptions is a rather ill-posed problem. As a result we assume the mean

reversion parameter a to be predefined. However, Bermudan prices do depend on the

choice of the mean reversion. Hence, if market prices for Bermudan swaptions are

available then the mean reversion parameter can be chosen to replicate these prices.

This coincides with procedures proposed for example in [Hag].

2.3.1 Formulation of the optimization problem

For the calibration of the Hull White model we specify a mean reversion parameter a.

Then we consider the difference between the Hull White model price and the market

price of a set of European swaptions. For each exercise date j = 1, . . . , N of the

underlying Bermudan swaption a European swaption with swap maturity equal to

the Bermudan swap maturity is chosen. That is, in terms of the Equations (2.3) and

(2.1) we consider the differences

CBO(t, Tj, S1, . . . , SM , ω)− Swaptionj(t) for j = 1, . . . , N.

Since we use a piecewise constant short rate volatility we may write the Hull White

model price in terms of the relevant short rate volatility values. Thus we have

CBO(σ1, . . . , σj; t, . . .)− Swaptionj(t) for j = 1, . . . , N.

The resulting objective function is formulated by reordering independent vari-

ables (i.e. short rate volatilities) and dependent variables (i.e. residuals of reference

swaption prices). We define

$$x = (x_1, \ldots, x_N)^\top = (\sigma_N, \ldots, \sigma_1)^\top,$$

$$g_i(x) = \mathrm{CBO}(x; T_{N+1-i}, S_1, \ldots, S_M, \omega) - \mathrm{Swaption}_{N+1-i}(t) \quad \text{for } i = 1, \ldots, N,$$

$$\text{and } F: \mathbb{R}^N \to \mathbb{R}^N, \quad F(x) = (g_1(x), \ldots, g_N(x))^\top.$$

The reordering of the volatilities and the reference instruments has no influence on the modelling of the calibration problem. However, it ensures that each individual function $g_i(x)$ depends non-trivially only on the elements $x_i$ to $x_N$ of $x$. This gives the advantageous property that the Jacobian of $F$ is an upper triangular matrix.

In principle one can aim at solving the non-linear system of equations F (x) = 0.

Since the Jacobian of F has an upper triangular form this problem could be split into

a sequence of one dimensional non-linear problems. One could start to solve for gN ,

then gN−1 up to g1. Unfortunately this approach fails if at least one of the market

prices could not be replicated by the model exactly. However we want our calibration

to be robust enough to cope with such market situations.

Instead of solving the set of non-linear equations F (x) = 0 directly we consider

the non-linear least squares formulation

$$\frac{1}{2} F(x)^\top F(x).$$

Moreover we want to require the volatilities to lie within some reasonable bounds. In

particular, we want to ensure that the volatilities are positive. Therefore we consider

some scalar boundaries a and b and the component wise box constraint a ≤ x ≤ b.

Hence we aim at solving

$$\min_{a \le x \le b} \frac{1}{2} F(x)^\top F(x). \tag{2.6}$$

The formulation as non-linear least squares problem also allows us in principle to

choose a coarser time discretisation of the short rate volatilities. In market situations

with few liquid volatility quotes this yields more stable calibration results.

2.3.2 Iterative solution of the non-linear problem

The non-linear least squares problem in (2.6) may be solved iteratively by a Gauss-

Newton method. For further reference see for example [NW06]. In each Gauss Newton

iteration step with given iterate x a step direction s is evaluated by solving the linear

system

$$F'(x)^\top F'(x)\, s = -F'(x)^\top F(x).$$

In general the linear system is solved by computing a QR factorization of the Jacobian $F'(x)$. This requires $O(N^3)$ operations. Since the Jacobian $F'(x)$ of the Hull White calibration problem is already an upper triangular matrix, its QR factorization is trivial. Thus the linear system may be solved within $O(N^2)$ operations.
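To illustrate why the triangular structure helps: if the upper triangular Jacobian J = F'(x) is non-singular, the normal equations J^T J s = -J^T F reduce to J s = -F, which back substitution solves in O(N^2) operations. The following minimal Python sketch assumes a dense list-of-lists matrix with non-zero diagonal entries; the function names are hypothetical and not taken from the thesis implementation.

```python
def solve_upper_triangular(J, rhs):
    """Solve J s = rhs for an upper triangular J by back substitution."""
    n = len(rhs)
    s = [0.0] * n
    for i in range(n - 1, -1, -1):
        acc = rhs[i]
        for k in range(i + 1, n):
            acc -= J[i][k] * s[k]
        s[i] = acc / J[i][i]  # assumption: diagonal entry is non-zero
    return s


def gauss_newton_direction(J, F):
    """Gauss-Newton direction for a square, non-singular upper triangular J."""
    return solve_upper_triangular(J, [-f for f in F])
```

The two nested loops visit each entry of the upper triangle once, which gives the quadratic operation count.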

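The Gauss Newton iteration is stabilized by a line search in the range of the objective function $F$. For given function values $F(x)$ and $F(x+s)$ a step multiplier $\lambda \in \mathbb{R}$ is determined by

$$\lambda = \frac{F(x)^\top \left[F(x) - F(x+s)\right]}{\left[F(x) - F(x+s)\right]^\top \left[F(x) - F(x+s)\right]}.$$

The state $x+s$ is accepted as the next iterate if $\lambda \ge 1/2 + \varepsilon$ for a suitable choice of $\varepsilon \in (0, 1/6)$. If $x+s$ is not accepted the Gauss Newton step $s$ is replaced by $\tilde{\lambda} \cdot s$ with

$$\tilde{\lambda} = \begin{cases} \lambda, & \text{for } \varepsilon < |\lambda| < 1/2 + \varepsilon \\ \varepsilon, & \text{for } 0 \le \lambda \le \varepsilon \\ -\varepsilon, & \text{for } -\varepsilon \le \lambda < 0 \\ -(1/2 + \varepsilon), & \text{for } \lambda < -(1/2 + \varepsilon) \end{cases}$$

The line search procedure is repeated with the function values $F(x)$ and $F(x + \tilde{\lambda} \cdot s)$ until acceptance. Further references and convergence results concerning this line search approach can be found in [Gri86].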
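The step multiplier and its restriction can be sketched in a few lines of Python. This is a minimal illustration under assumptions: residual vectors are plain lists, the default ε = 0.1 is an arbitrary value from the admissible interval (0, 1/6), the function names are hypothetical, and the boundary case λ = -(1/2 + ε), which the case distinction leaves open, is mapped to -(1/2 + ε).

```python
def step_multiplier(F_x, F_xs):
    """lambda = F(x)^T [F(x) - F(x+s)] / ||F(x) - F(x+s)||^2."""
    d = [a - b for a, b in zip(F_x, F_xs)]
    denom = sum(v * v for v in d)
    if denom == 0.0:
        raise ZeroDivisionError("F(x) and F(x+s) coincide")
    return sum(a * v for a, v in zip(F_x, d)) / denom


def restrict_multiplier(lam, eps=0.1):
    """Damped multiplier for a step that was not accepted (lam < 1/2 + eps)."""
    cap = 0.5 + eps
    if lam <= -cap:
        return -cap
    if lam < -eps:
        return lam
    if lam < 0.0:
        return -eps
    if lam <= eps:
        return eps
    return lam  # eps < lam < 1/2 + eps
```

For a linear residual function the full Gauss Newton step gives F(x+s) = 0 and hence λ = 1, so the step is accepted without damping.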

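If we find that a Gauss-Newton step $\bar{x} = x + s$ would result in an iterate violating the box constraints $a \le \bar{x} \le b$ then the components of $\bar{x}$ are projected to the bounds. That is, for $\bar{x} = (\bar{x}_1, \ldots, \bar{x}_N)^\top$ we set $\hat{x} = (\hat{x}_1, \ldots, \hat{x}_N)^\top$ with

$$\hat{x}_i = \begin{cases} a, & \text{for } \bar{x}_i < a \\ \bar{x}_i, & \text{for } a \le \bar{x}_i \le b \\ b, & \text{for } \bar{x}_i > b \end{cases}, \quad i = 1, \ldots, N.$$

In such a case we get a new step direction $\hat{s} = \hat{x} - x$. The line search is then performed along the altered step direction $\hat{s}$.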
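The projection onto the box constraints amounts to a component-wise clamp. A minimal Python sketch, assuming scalar bounds and list-valued iterates (the function name is hypothetical):

```python
def project_step(x, s, lower, upper):
    """Clamp the trial iterate x + s to [lower, upper]; return the altered step."""
    x_hat = [min(max(xi + si, lower), upper) for xi, si in zip(x, s)]
    return [xh - xi for xh, xi in zip(x_hat, x)]
```

With the bounds 0.1% and 50% used in the numerical results, a step from x = (0.10, 0.10, 0.10) by s = (0.5, -0.2, 0.05) is altered to (0.4, -0.099, 0.05).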

The Gauss Newton iteration may be summarized as follows: Starting with an initial guess $x^{(0)}$ we evaluate for $i = 0, 1, 2, \ldots$

1. solve $F'(x^{(i)})^\top F'(x^{(i)})\, s^{(i)} = -F'(x^{(i)})^\top F(x^{(i)})$,

2. determine a suitable step multiplier $\lambda^{(i)} \in \mathbb{R}$ (by considering the box constraints),

3. evaluate the next iterate $x^{(i+1)} = x^{(i)} + \lambda^{(i)} \cdot s^{(i)}$.

Numerical results. The Hull White model is parametrized with a mean reversion

of 5%. The upper bound of the short rate volatility is given by 50%. As lower bound of

the short rate volatility we choose 0.1%. The iterations are initialized with a constant

volatility of 10%. Figure 2.2 illustrates the convergence history of the Gauss Newton

method applied to our reference test problem. The resulting piecewise constant short

rate volatility function is shown in Figure 2.3. This short rate volatility function is

applied in the PDE methods described in the forthcoming chapters.

Figure 2.2: Convergence history for the calibration of the short rate volatility by Gauss Newton method. [Axes: iteration count i; residual ‖F(x^(i))‖_2.]

Figure 2.3: Short rate volatility calibrated to coterminal swaptions. [Axis: time in year fraction.]

Chapter 3

Discretisation and Numerical Solution

In Section 2.2 we elaborate (auxiliary) analytical pricing formulas for European bond

options in the Hull White model. The aim of this thesis is the pricing and sensitivity

evaluation of Bermudan swaptions which are replicated as Bermudan bond options.

However there are no analytical pricing formulas for Bermudan bond options in the

Hull White model. Hence we have to apply numerical schemes for the pricing of

Bermudan bond options.

The pricing of exotic derivatives (in particular Bermudan bond options) in the

Hull White model may be realized by discretising the stochastic process of the short

rate r(t) in a recombining tree. For details see for example [BM07]. Under certain

constraints the trinomial tree method is equivalent to an explicit Euler finite differ-

ence scheme ([Duf06]). The explicit Euler scheme has limited stability properties

and may yield artificial oscillations in the numerical solution. Therefore we use the

alternative approach of formulating and solving the Hull White partial differential

equation (PDE).

In this chapter we describe the numerical methods for the pricing of Bermudan

swaptions in the Hull White model. The methods elaborated could easily be adapted

to other interest rate payoffs. The time t price V (t, r(t)) of a contingent claim with

payoff V (T, r(T )) = p(r(T )) (T > t) is given by

$$V(t, r(t)) = \mathbb{E}^{Q}\left[ e^{-\int_t^T r(s)\,ds}\, p(r(T)) \,\middle|\, \mathcal{F}(t) \right].$$

The process of the short rate r(t) in the Hull White model is

dr(t) = [θ(t)− ar(t)] dt+ σ(t)dW (t)

with W (t) a Brownian motion under the risk neutral measure Q. Applying Feynman-

Kac’s theorem (see for example [Shr04]) yields that the price V (t, r) for t < T satisfies

the partial differential equation

$$V_t + \frac{\sigma^2(t)}{2} V_{rr} + [\theta(t) - a r]\, V_r - r V = 0 \quad \text{with terminal condition } V(T, r) = p(r). \tag{3.1}$$

The payoff structure of a Bermudan option may be considered as a sequence of

terminal conditions. Let T1 to TN be the Bermudan exercise dates. If TN−1 < t < TN

and the option is not exercised then the time t price of the Bermudan option is

equivalent to a European option with exercise date TN . Denote pj(r) the payoff of

the Bermudan option at exercise date Tj. Then the price of the Bermudan at Tj is

$$V(T_j, r) = \max\left\{ p_j(r),\ \lim_{t \downarrow T_j} V(t, r) \right\}.$$

Between the exercise dates and before the first exercise date the Hull White PDE holds. Hence we may solve the Bermudan option by successively integrating backwards from $T_N$ to observation time $t = 0$.

Figure 3.1 illustrates the numerical solution V (t, r) of the Bermudan swaption

reference test problem. Moreover, we mark the solution V (t, f(0, t)) along the forward

rate observed at observation time t = 0. The desired time t = 0 price of the Bermudan

swaption becomes V (0, f(0, 0)).

In the forthcoming sections we describe the spatial and temporal discretisation of

the terminal value problem (3.1). We discuss different approaches for the numerical

solution and address the issues of approximating the boundary conditions.

3.1 Spatial Discretisation

For the discretisation of the Hull White terminal value problem we use the method of

lines [GR94]. That is we first discretise the spatial direction which is the short rate r.

This yields a system of ordinary differential equations with a terminal condition. The

system of ordinary differential equations is subsequently solved in a general setting.

The terminal value problem (3.1) is rewritten using the linear operator L as

$$V_t(t, r) = -L[V](t, r) \quad \text{with} \quad L[V] = \frac{\sigma^2(t)}{2} V_{rr} + [\theta(t) - a r]\, V_r - r V.$$

For the spatial discretisation we define a grid of n+ 1 short rate values

r0 < r1 < . . . < rn.

Figure 3.1: Numerical solution of the Bermudan swaption reference test problem. [Panels: numerical solution V(t, r) of the Bermudan swaption; forward rate solution V(t, f(t)). Axes: time to maturity t; short rate r.]

The discretisation width is given by

$$\delta = \max_{i=1,\ldots,n} \{ r_i - r_{i-1} \}.$$

Each short rate grid point $r_i$ is associated with a component function $V^\delta_i(t) \approx V(t, r_i)$ for $i = 0, \ldots, n$. The component functions $V^\delta_i$ are combined in the vector valued function $V^\delta$, i.e.

$$V^\delta(t) = \left( V^\delta_0(t), \ldots, V^\delta_n(t) \right)^\top.$$

The partial derivatives at the interior short rate grid points $r_1, \ldots, r_{n-1}$ are approximated by centred finite differences which are defined as

$$V^\delta_{r,i}(t) = \frac{V^\delta_{i+1}(t) - V^\delta_{i-1}(t)}{r_{i+1} - r_{i-1}},$$

$$V^\delta_{rr,i}(t) = 2\, \frac{\left[ V^\delta_{i+1}(t) - V^\delta_i(t) \right] (r_i - r_{i-1}) - \left[ V^\delta_i(t) - V^\delta_{i-1}(t) \right] (r_{i+1} - r_i)}{(r_i - r_{i-1})(r_{i+1} - r_{i-1})(r_{i+1} - r_i)}.$$

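If the solution $V$ of the Hull White terminal value problem is sufficiently smooth (4th derivatives must be continuous) and the spatial discretisation is exact, i.e. $V^\delta_i(t) = V(t, r_i)$ for $i = 0, \ldots, n$, then we get from Taylor expansion that

$$V^\delta_{r,i}(t) = \frac{\partial V}{\partial r}(t, r_i) + \frac{1}{2} \frac{\partial^2 V}{\partial r^2}(t, r_i) \left[ (r_{i+1} - r_i) - (r_i - r_{i-1}) \right] + O(\delta^2), \tag{3.2}$$

$$V^\delta_{rr,i}(t) = \frac{\partial^2 V}{\partial r^2}(t, r_i) + \frac{1}{3} \frac{\partial^3 V}{\partial r^3}(t, r_i) \left[ (r_{i+1} - r_i) - (r_i - r_{i-1}) \right] + O(\delta^2). \tag{3.3}$$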

Hence for general grids the approximation is first order accurate and for equidistant

grids with δ = ri+1− ri = ri− ri−1 for i = 1, . . . , n− 1 the approximation is of second

order. If there is a smooth transformation from a uniform grid to the general short

rate grid we may also get a second order approximation. For further details see, e.g.

[HV03].
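The centred differences on a general grid can be checked numerically with a short Python sketch. It assumes grid values and solution values are given as plain lists; the function name is hypothetical.

```python
def centred_differences(r, V, i):
    """Return (V_r, V_rr) at interior grid point i of a possibly non-uniform grid."""
    h_minus = r[i] - r[i - 1]
    h_plus = r[i + 1] - r[i]
    v_r = (V[i + 1] - V[i - 1]) / (h_plus + h_minus)
    v_rr = 2.0 * (
        (V[i + 1] - V[i]) * h_minus - (V[i] - V[i - 1]) * h_plus
    ) / (h_minus * (h_plus + h_minus) * h_plus)
    return v_r, v_rr
```

For V(r) = r^2 on the grid (0, 0.1, 0.3) the second difference reproduces V'' = 2, while the first difference at r = 0.1 gives 0.3 instead of 0.2. The deviation 0.1 equals the difference of the two neighbouring grid widths multiplied by V''/2, which is the first order term that vanishes on equidistant grids.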

The Hull White terminal value problem (3.1) is defined for r ∈ R. For the numer-

ical treatment we have to truncate the domain for the short rate r. At the boundaries

some reasonable conditions have to be imposed. As there are no natural choices for

the boundary conditions it is market standard to use linear boundary conditions.

That is, we impose that

$$\frac{\partial^2 V}{\partial r^2}(t, r_0) = \frac{\partial^2 V}{\partial r^2}(t, r_n) = 0.$$

The stability of the linear boundary condition in the related context of solving the

Black Scholes PDE is analysed, for example in [WFV04]. With this condition the

numerical approximation of the second derivative V δrr also vanishes for r0 and rn.

Using ghost grid points at the boundaries together with the linear boundary condition, the numerical approximation of the first derivative $V^\delta_r$ becomes a one-sided finite difference:

$$V^\delta_{r,0}(t) = \frac{V^\delta_1(t) - V^\delta_0(t)}{r_1 - r_0}, \quad V^\delta_{r,n}(t) = \frac{V^\delta_n(t) - V^\delta_{n-1}(t)}{r_n - r_{n-1}}.$$

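The spatial discretisation of the partial derivatives of $V$ allows us to define the discretised linear operator $L^\delta$ of the Hull White PDE. We have

$$L^\delta[V^\delta](t) = \left( \frac{\sigma^2(t)}{2} V^\delta_{rr,i}(t) + [\theta(t) - a r_i]\, V^\delta_{r,i}(t) - r_i V^\delta_i(t) \right)^\top_{i=0,\ldots,n}.$$

Substituting the definitions for $V^\delta_{r,i}$ and $V^\delta_{rr,i}$ yields the matrix vector representation of $L^\delta$ as

$$L^\delta[V^\delta](t) = L^\delta(t) V^\delta(t) \quad \text{with} \quad L^\delta(t) \in \mathbb{R}^{(n+1) \times (n+1)}.$$

The $(n+1)$-dimensional matrix $L^\delta(t)$ is of the form

$$L^\delta(t) = \begin{pmatrix} d_0 & u_0 & & & \\ l_1 & d_1 & u_1 & & \\ & \ddots & \ddots & \ddots & \\ & & l_{n-1} & d_{n-1} & u_{n-1} \\ & & & l_n & d_n \end{pmatrix}.$$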

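For the matrix elements we get the expressions

$$l_i = \frac{\sigma^2(t)}{(r_i - r_{i-1})(r_{i+1} - r_{i-1})} - \frac{\theta(t) - a r_i}{r_{i+1} - r_{i-1}},$$

$$d_i = -\frac{\sigma^2(t)}{(r_{i+1} - r_i)(r_i - r_{i-1})} - r_i,$$

$$u_i = \frac{\sigma^2(t)}{(r_{i+1} - r_i)(r_{i+1} - r_{i-1})} + \frac{\theta(t) - a r_i}{r_{i+1} - r_{i-1}}$$

for $i = 1, \ldots, n-1$. The first and last rows of the matrix $L^\delta(t)$ are determined by the boundary conditions of the discretised Hull White PDE. We have

$$d_0 = -\frac{\theta(t) - a r_0}{r_1 - r_0} - r_0, \quad u_0 = \frac{\theta(t) - a r_0}{r_1 - r_0},$$

$$l_n = -\frac{\theta(t) - a r_n}{r_n - r_{n-1}}, \quad d_n = \frac{\theta(t) - a r_n}{r_n - r_{n-1}} - r_n.$$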

As a result we get the linear time-dependent system of ordinary differential equations

$$\frac{dV^\delta}{dt}(t) = -L^\delta(t) V^\delta(t)$$

with terminal condition

$$V^\delta(T) = (p(r_0), \ldots, p(r_n))^\top.$$
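The assembly of the tridiagonal matrix can be sketched as follows. This is an illustrative Python version under assumptions: the three diagonals are stored as lists of length n + 1, the values of σ(t) and θ(t) at the time of interest are passed as plain numbers, and the function name is hypothetical.

```python
def assemble_operator(r, sigma, theta, a):
    """Return the sub-, main and super-diagonal (l, d, u) of L^delta at one time t."""
    n = len(r) - 1
    l = [0.0] * (n + 1)  # l[0] is unused
    d = [0.0] * (n + 1)
    u = [0.0] * (n + 1)  # u[n] is unused
    for i in range(1, n):
        hm = r[i] - r[i - 1]
        hp = r[i + 1] - r[i]
        drift = (theta - a * r[i]) / (hp + hm)
        l[i] = sigma**2 / (hm * (hp + hm)) - drift
        d[i] = -sigma**2 / (hp * hm) - r[i]
        u[i] = sigma**2 / (hp * (hp + hm)) + drift
    # boundary rows from the linear boundary condition (one-sided differences)
    drift_0 = (theta - a * r[0]) / (r[1] - r[0])
    d[0], u[0] = -drift_0 - r[0], drift_0
    drift_n = (theta - a * r[n]) / (r[n] - r[n - 1])
    l[n], d[n] = -drift_n, drift_n - r[n]
    return l, d, u
```

A simple consistency check is that every row sums to -r_i: applying the operator to the constant function V = 1 removes all derivative terms and leaves only the discounting term -rV.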

3.1.1 Discretisation Schemes

In this section we describe two approaches for the definition of the short rate grid

r0 < r1 < . . . < rn.

Uniform discretisation scheme. As a standard short rate grid we choose an odd number of grid points $r_0$ to $r_n$. The center of the short rate grid $r_{n/2}$ equals the short rate at observation time $t = 0$, i.e.

$$r_{n/2} = r(0) = f(0, 0).$$

Thus the time $t = 0$ option value is directly given by $V^\delta_{n/2}(0)$ and does not need to be interpolated. Moreover, we specify a maximum deviation $s$ from the current short rate and set $r_0 = f(0, 0) - s$ and $r_n = f(0, 0) + s$. The remaining grid points are placed uniformly between $r_0$ and $r_n$. The discretisation width becomes

$$\delta = \frac{2s}{n}.$$

This discretisation scheme yields a second order accurate spatial discretisation.

The (numerical) order of convergence is verified and illustrated in Figure 3.2. We

show the relative error

$$\mathrm{err}_i = \left| \frac{\mathrm{CBO}^{\mathrm{PDE}}_i - \mathrm{CBO}^{\mathrm{analytical}}_i}{\mathrm{CBO}^{\mathrm{analytical}}_i} \right|$$

between the analytical and numerical price of the $i$th European reference bond option

(i ∈ {1, 10, 20, 30}) of our reference test problem. The deviation s is chosen as 20%

and the relative error in the time integration is limited to 1.0e− 8.

Normally distributed discretisation scheme. For the pricing of an option in

the Hull White model we are usually interested in the time t option value at the

current realised short rate r(t) = f(t, t). Thus it appears reasonable to improve the

approximation at the current short rate by choosing a smaller grid size near r(t) than

at the boundaries of the numerical domain. Moreover, the short rate in the Hull

White model is normally distributed.

Therefore we construct a short rate grid which mimics a normal distribution of the

grid points. We choose again an odd number of grid points r0 to rn and a maximum

deviation s. The short rate grid is defined as

$$r_i = \frac{\Phi^{-1}\left( \frac{i+1}{n+2} \right)}{\Phi^{-1}\left( \frac{n+1}{n+2} \right)}\, s + r(t) \quad \text{for } i = 0, \ldots, n.$$

Figure 3.2: Asymptotic approximation of equidistantly discretised PDE solution of reference European swaptions. [Curves: exercise dates 1, 10, 20, 30 and reference 10/n^2. Axes: number of spatial grid points n (11 to 1001); relative error of numerical solution.]

The function $\Phi^{-1}$ is the inverse cumulative normal distribution function. As for the uniform grid we get $r_{n/2} = r(t)$, $r_0 = r(t) - s$, and $r_n = r(t) + s$. The discretisation width becomes

$$\delta = \left[ 1 - \frac{\Phi^{-1}\left( \frac{n}{n+2} \right)}{\Phi^{-1}\left( \frac{n+1}{n+2} \right)} \right] s.$$

We mentioned before that the spatial approximation with central differences in (3.2) and (3.3) is only first order accurate for general grids. However, as elaborated in Appendix A, we get for the normally distributed grid that

$$(r_{i+1} - r_i) - (r_i - r_{i-1}) \le C \delta^2.$$

The positive constant C is essentially independent of the discretisation. Hence we

get that in practice this discretisation scheme is also second order accurate.
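Both grid constructions can be sketched with the Python standard library, whose `statistics.NormalDist` class provides the inverse cumulative normal distribution function as `inv_cdf`. The function names are hypothetical, and n is assumed to be even so that the grid has an odd number of points.

```python
from statistics import NormalDist


def uniform_grid(center, s, n):
    """n + 1 equidistant points on [center - s, center + s]."""
    return [center - s + 2.0 * s * i / n for i in range(n + 1)]


def normal_grid(center, s, n):
    """n + 1 points placed at scaled standard normal quantiles."""
    inv = NormalDist().inv_cdf
    scale = s / inv((n + 1) / (n + 2))
    return [inv((i + 1) / (n + 2)) * scale + center for i in range(n + 1)]
```

In both cases the midpoint of the grid coincides with the centre and the end points lie at distance s from it. The normally distributed grid is finer near the centre and coarser near the boundaries.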

Figure 3.3 demonstrates the numerical order of convergence for the normally dis-

tributed grid applied to our reference test problem. The setting is equal to the

numerical tests for the uniform short rate grid.

Comparing the numerical results in Figure 3.2 and Figure 3.3 shows that the

approximation with normally distributed short rate grid yields more accurate results

than the approximation with uniformly distributed short rate grid.

3.1.2 Determining the computational domain

The truncation of the domain in which we numerically solve the Hull White PDE

and the linear boundary condition impose an error on the numerical solution. Given

a fixed number of n+ 1 short rate grid points it is a priori not clear where to place r0

and rn. On the one hand it is advantageous to place them far away of the initial short

rate. This reduces the error of the wrong boundary condition. On the other hand a

large computational domain requires a large discretisation width. This increases the

error of the approximation of the derivatives by central differences.

We investigate the influence of the choice of the computational domain. For that

purpose we solve our reference test problem with uniform and normally distributed

grids, 1001 short rate grid points, and varying short rate deviation s. The relative

errors erri between the analytical and numerical price of the ith European reference

bond option (i ∈ {1, 10, 20, 30}) of our reference test problem are compared in Figure 3.4.

Figure 3.3: Asymptotic approximation of normally distributed short rate grid PDE solution of reference European swaptions. [Curves: exercise dates 1, 10, 20, 30 and reference 10/n^2. Axes: number of spatial grid points n; relative error of numerical solution.]

Figure 3.4: Approximation of numerical solution of reference European swaptions for varying sizes s of the computational domain. [Panels: uniform short rate grid; normally distributed short rate grid. Curves: exercise dates 1, 10, 20, 30. Axis: deviation s of short rate grid.]

The results in Figure 3.4 demonstrate that the overall approximation error for small deviation s usually first declines rapidly. For small values of s the distance between the boundaries of the domain and point of interest $r_{n/2}$ (i.e. the midpoint of the domain) is small. By the linear boundary condition of the Hull White PDE

we introduce a modelling error rather close to the final point of interest. In contrast

the spatial discretisation width is small and thus the numerical approximation error

is small. Hence for small s the error of the linear boundary condition dominates the

approximation error of the finite difference approximation. For larger s the overall

approximation error increases moderately. Now the approximation error of the finite

difference approximation dominates the error of the linear boundary condition.

Unfortunately the optimal deviation which minimizes the overall approximation

error is not obviously determined. It depends on the spatial discretisation and the

integration time. Since the error of the finite difference approximation grows substantially more slowly than the error of the linear boundary condition decreases, it appears reasonable to overestimate the computational domain. For our computations we use values of s larger than or equal to 0.2.

Additionally, the results in Figure 3.4 demonstrate that the normally distributed

short rate grid usually yields a more accurate approximation of the numerical solution.

3.1.3 Using variable grids

So far we considered placing the center of the short rate grid at the value of the

current observable short rate. As an alternative approach we investigate the strategy

to place the center of the short rate grid at the effective strike rate at the exercise

dates Tj of the Bermudan. This method is motivated by the fact that the payoff is in general not differentiable at the strike rate and that this region influences the option price significantly.

Suppose t ∈ (Tj−1, Tj] and pj(r) is the payoff of the Bermudan option if exercised

at Tj. We assume that pj(r) is monotone in r. This holds in particular for call and

put options on bonds where we have that

$$p_j(r) = \omega \left( CB_j(t, T_j, \ldots, r) - K_j \right).$$

Here $CB_j(t, T_j, \ldots, r)$ is the coupon bond, $K_j$ the strike price at $T_j$, and $\omega \in \{-1, +1\}$ distinguishes call and put options. The price of the coupon bond is monotone in $r$. Hence $p_j$ is also monotone in $r$. We determine $r^\star_j$ such that

$$p_j(r^\star_j) = \lim_{t \downarrow T_j} V^\delta(t, r^\star_j). \tag{3.4}$$

The function V δ(t, r) ≈ V (t, r) describes the available numerical solution of the option

price function. In our implementation it is obtained by C2-spline interpolation of the

components of V δ(t). For t > TN we set V (t, r) = 0. Equation (3.4) is solved by

Newton’s method. This approach ensures that for the Bermudan exercise condition

$$V^\delta(T_j, r) = \max\left\{ p_j(r),\ \lim_{t \downarrow T_j} V^\delta(t, r) \right\}$$

there is a short rate grid point at the short rate $r^\star_j$, where the solution is in general not differentiable.

The accuracy of the varying grid approach is analysed by means of our reference

test problem. We repeat the numerical computations of the uniform and normally

distributed grids for varying deviations s. However, at each exercise date we re-center

the short rate grid as described above. At observation time t = 0 the final option

value is also determined by C2-spline interpolation of the available numerical solution.

The results are illustrated in Figure 3.5.

Figure 3.5: Comparison of fixed and variable grid approximation of numerical solution of reference European swaptions for varying sizes s of the computational domain. [Panels: variable normally distributed short rate grid; fixed normally distributed short rate grid. Curves: exercise dates 20 and 30. Axis: deviation s of short rate grid.]

The results in Figure 3.5 indicate that the variable grid approach does not improve

the approximation of the numerical solutions substantially. Therefore we do not use

it for further computations.

3.2 Time Integration

The spatial discretisation yields a system of ordinary differential equations

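$$\frac{dV^\delta}{dt}(t) = -L^\delta(t) V^\delta(t), \quad V^\delta(t) \in \mathbb{R}^{n+1}, \quad L^\delta(t) \in \mathbb{R}^{(n+1) \times (n+1)}$$

with terminal conditions

$$V^\delta(T_j) = V^{\delta,j} \quad \text{for } j = 1, \ldots, N.$$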

We solve the sequence of terminal value problems by Crank-Nicolson’s method. Let $V^\delta_t \approx V^\delta(t)$ be the numerical solution of the terminal value problem at time $t$. The numerical solution $V^\delta_{t-h}$ at a previous time step $t - h$ for $h > 0$ is determined by the equation

$$\frac{V^\delta_t - V^\delta_{t-h}}{h} = -\frac{L^\delta(t) V^\delta_t + L^\delta(t-h) V^\delta_{t-h}}{2} \approx \frac{dV^\delta}{dt}(t - h/2).$$

Solving for $V^\delta_{t-h}$ yields

$$V^\delta_{t-h} = \left[ I - \frac{h}{2} L^\delta(t-h) \right]^{-1} \left[ I + \frac{h}{2} L^\delta(t) \right] V^\delta_t. \tag{3.5}$$

Since $L^\delta$ is of tridiagonal form the matrix vector multiplication $U^\delta_t = \left[ I + \frac{h}{2} L^\delta(t) \right] V^\delta_t$ may be evaluated within $O(n+1)$ operations. The linear system

$$\left[ I - \frac{h}{2} L^\delta(t-h) \right] V^\delta_{t-h} = U^\delta_t$$

is solved directly by an LU decomposition. Again, since $L^\delta$ is of tridiagonal form the LU decomposition and the system solution require $O(n+1)$ operations.
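One Crank-Nicolson step with a tridiagonal solve may be sketched as follows. The sketch makes several assumptions: the operator is passed as three lists (l, d, u) of equal length with l[0] and u[-1] unused, the LU decomposition is carried out without pivoting (the Thomas algorithm), and the function names are hypothetical.

```python
def solve_tridiagonal(l, d, u, rhs):
    """Thomas algorithm: LU elimination without pivoting, linear cost in n."""
    n = len(d)
    c = [0.0] * n  # modified super-diagonal
    y = [0.0] * n  # modified right-hand side
    c[0] = u[0] / d[0]
    y[0] = rhs[0] / d[0]
    for i in range(1, n):
        piv = d[i] - l[i] * c[i - 1]
        c[i] = u[i] / piv
        y[i] = (rhs[i] - l[i] * y[i - 1]) / piv
    x = [0.0] * n
    x[-1] = y[-1]
    for i in range(n - 2, -1, -1):
        x[i] = y[i] - c[i] * x[i + 1]
    return x


def crank_nicolson_step(V, h, op_t, op_tmh):
    """Step from t back to t - h; op_t and op_tmh hold (l, d, u) at t and t - h."""
    l1, d1, u1 = op_t
    n = len(V)
    U = []  # U = [I + h/2 L(t)] V
    for i in range(n):
        val = (1.0 + 0.5 * h * d1[i]) * V[i]
        if i > 0:
            val += 0.5 * h * l1[i] * V[i - 1]
        if i < n - 1:
            val += 0.5 * h * u1[i] * V[i + 1]
        U.append(val)
    l0, d0, u0 = op_tmh
    # solve [I - h/2 L(t - h)] V_new = U
    return solve_tridiagonal(
        [-0.5 * h * v for v in l0],
        [1.0 - 0.5 * h * v for v in d0],
        [-0.5 * h * v for v in u0],
        U,
    )
```

If the operator reduces to the discounting term, i.e. a diagonal matrix with entries -r_i, each component is multiplied by (1 - h r_i / 2) / (1 + h r_i / 2), which approximates the discount factor exp(-r_i h).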

The Crank-Nicolson method is consistent of order 2. That is, if the solution is sufficiently smooth and $V^\delta_t = V^\delta(t)$ then

$$\left\| V^\delta_{t-h} - V^\delta(t-h) \right\| \le C h^2$$

for a positive constant $C$ independent of $h$. Moreover, the method is A-stable. For a proof see, for example, [HNW93] and [HW91].

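The solution of the Bermudan option is evaluated in the following fashion. Starting at the last exercise date $T_N$ we set the terminal value $V^\delta_{T_N}$ equal to the payoff, i.e.

$$V^\delta_{T_N} = V^{\delta,N} = \begin{pmatrix} \max\left\{ \omega \left[ CB_N(t, T_N, \ldots, r_0) - K_N \right], 0 \right\} \\ \vdots \\ \max\left\{ \omega \left[ CB_N(t, T_N, \ldots, r_n) - K_N \right], 0 \right\} \end{pmatrix}.$$

Then we choose a sequence of time steps $t_j$ such that $T_{N-1} = t_0 < t_1 < \ldots < t_m = T_N$ with step sizes $h_j = t_j - t_{j-1}$ for $j = 1, \ldots, m$. We evaluate successively

$$V^\delta_{t_{j-1}} = \left[ I - \frac{h_j}{2} L^\delta(t_{j-1}) \right]^{-1} \left[ I + \frac{h_j}{2} L^\delta(t_j) \right] V^\delta_{t_j} \quad \text{for } j = m, \ldots, 1.$$

The payoff condition at $T_{N-1}$ is imposed by

$$V^{\delta,N-1} = \begin{pmatrix} \max\left\{ \omega \left[ CB_{N-1}(t, T_{N-1}, \ldots, r_0) - K_{N-1} \right], V^\delta_{t_0,0} \right\} \\ \vdots \\ \max\left\{ \omega \left[ CB_{N-1}(t, T_{N-1}, \ldots, r_n) - K_{N-1} \right], V^\delta_{t_0,n} \right\} \end{pmatrix}.$$

Here $V^\delta_{t_0,i}$ ($i = 0, \ldots, n$) denotes the $i$th component of the vector

$$V^\delta_{t_0} = \left( V^\delta_{t_0,0}, \ldots, V^\delta_{t_0,n} \right)^\top.$$

The numerical integration and the evaluation of the Bermudan payoff condition are repeated analogously down to the first exercise date $T_1$. Finally the solution at observation time $t = 0 < T_1$ is evaluated by numerical integration backwards from $T_1$ to 0.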

Numerical stability of Crank-Nicolson’s scheme. Although Crank-Nicolson’s scheme is A-stable it may yield poor numerical results if the initial data is not continuous. This drawback can be overcome by the use of Rannacher’s startup procedure. For details see, e.g. [GC06]. This approach uses initial Backward Euler steps and then proceeds with Crank-Nicolson steps.

For our setting and test problem we also tried Backward Euler steps and compared

the accuracy of the numerical solution to the available analytical solution of the

reference European swaptions. However, for the prices as well as for the Vegas with

Backward Euler we did not find significant differences in the approximation compared

to Crank-Nicolson’s scheme (despite the property that Backward Euler is only first

order consistent).

Adaptive step size control. The overall error of the numerical solution of the Hull

White terminal value problem consists of the modelling error of the linear boundary

condition, the error by spatial discretisation, and the error by numerical integration.

The impact of the modelling and spatial discretisation error are analysed in Section

3.1. To control the overall integration error we aim at estimating and limiting the

local numerical error of the time integration steps.

In each integration step we evaluate two numerical solutions $V^\delta_t$ and $\hat{V}^\delta_t$. The solution $V^\delta_t$ is evaluated by Crank-Nicolson’s method with step size $h$. Thus the local error of $V^\delta_t$ is of order $h^2$. $\hat{V}^\delta_t$ is a more accurate numerical solution with a local error of order $h^3$. The more accurate solution acts as a local proxy of the exact solution of the ODE terminal value problem. The local relative error is determined by

$$\mathrm{err}_{rel} = \max_{i=0,\ldots,n} \frac{\left| V^\delta_{t,i} - \hat{V}^\delta_{t,i} \right|}{\max\left\{ |\hat{V}^\delta_{t,i}|, \varepsilon \right\}}.$$

The parameter $\varepsilon$ is chosen as 1.0e-8 to prevent floating point overflow in the implementation.

The higher order numerical solution $\hat{V}^\delta_t$ is evaluated by Richardson extrapolation. For reference see for example [HNW93]. Since $V^\delta_t$ is evaluated by Crank-Nicolson’s method with step size $h$ we have that

$$V^\delta_t = V^\delta(t) + C h^2 + O(h^3).$$

Richardson extrapolation is based on evaluating a second numerical solution $\tilde{V}^\delta_t$ by two Crank-Nicolson steps with half the step size. This yields that

$$\tilde{V}^\delta_t = V^\delta(t) + C \frac{h^2}{4} + O(h^3).$$

The higher order solution is given by

$$\hat{V}^\delta_t = \frac{4 \tilde{V}^\delta_t - V^\delta_t}{3} = V^\delta(t) + O(h^3).$$
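The extrapolation and the resulting error estimate can be illustrated on a scalar toy problem. The sketch below assumes the one-dimensional operator L = -r with constant r, for which the exact backward solution over a step h is the discount factor exp(-r h); the function names and the choice of test problem are illustrative and not part of the thesis.

```python
def cn_scalar_step(v, h, r):
    """Crank-Nicolson step from t to t - h for the scalar operator L = -r."""
    return (1.0 - 0.5 * h * r) / (1.0 + 0.5 * h * r) * v


def richardson_step(v, h, r, eps=1.0e-8):
    """Return (extrapolated value, relative error estimate) for one step of size h."""
    coarse = cn_scalar_step(v, h, r)          # one step of size h
    half = cn_scalar_step(v, 0.5 * h, r)
    fine = cn_scalar_step(half, 0.5 * h, r)   # two steps of size h/2
    extrapolated = (4.0 * fine - coarse) / 3.0
    err_rel = abs(coarse - extrapolated) / max(abs(extrapolated), eps)
    return extrapolated, err_rel
```

For this toy problem the extrapolated value lies closer to the exact discount factor than the single Crank-Nicolson step, and the error estimate is of the same magnitude as the actual error of the single step.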

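Using Richardson extrapolation we take $\mathrm{err}_{rel}$ as the local relative error. Given that $\mathrm{err}_{rel} \le tol$ for a globally defined relative tolerance $tol$ the integration step from $t + h$ to $t$ is accepted. If $\mathrm{err}_{rel} > tol$ the integration step is rejected and the integration step size $h$ is reduced. As adaptive step selection strategy we use the simple approach of evaluating a preliminary step multiplier

$$\hat{\lambda} = \sqrt{\frac{tol}{\mathrm{err}_{rel}}}.$$

The step multiplier $\lambda$ applied to the current step size $h$ is determined considering some technical restrictions:

$$\lambda = \begin{cases} \lambda_{min} & \text{for } \hat{\lambda} < \lambda_{min} \\ \hat{\lambda} & \text{for } \lambda_{min} \le \hat{\lambda} < \lambda_{floor} \\ \lambda_{floor} & \text{for } \lambda_{floor} \le \hat{\lambda} < 1 \\ 1 & \text{for } 1 \le \hat{\lambda} < \lambda_{cap} \\ \hat{\lambda} & \text{for } \lambda_{cap} \le \hat{\lambda} < \lambda_{max} \\ \lambda_{max} & \text{for } \lambda_{max} \le \hat{\lambda} \end{cases}$$

The mapping $\hat{\lambda} \mapsto \lambda$ is also illustrated in Figure 3.6. The restricting parameters are chosen rather conservatively as $\lambda_{min} = 0.5$, $\lambda_{floor} = 0.8$, $\lambda_{cap} = 1.5$, and $\lambda_{max} = 2.0$. These choices should reduce the number of rejected steps as much as possible. The new step size for both accepted and rejected integration steps is then given by

$$h_{new} = \lambda h.$$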
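The step size restriction described above is a piecewise mapping that is easy to get wrong at the interval boundaries, so a short Python sketch may help. The function name is hypothetical, the default parameters are the values quoted in the text, and the treatment of a vanishing error estimate (use the maximum multiplier) is an assumption of this sketch.

```python
import math


def restricted_multiplier(tol, err_rel, lam_min=0.5, lam_floor=0.8,
                          lam_cap=1.5, lam_max=2.0):
    """Map the preliminary multiplier sqrt(tol / err_rel) to the applied one."""
    # assumption: a zero error estimate allows the largest admissible growth
    lam_hat = math.sqrt(tol / err_rel) if err_rel > 0.0 else lam_max
    if lam_hat < lam_min:
        return lam_min
    if lam_hat < lam_floor:
        return lam_hat
    if lam_hat < 1.0:
        return lam_floor
    if lam_hat < lam_cap:
        return 1.0
    if lam_hat < lam_max:
        return lam_hat
    return lam_max
```

The two plateaus around 1 mean that the step size is left unchanged after a step that was accepted with little margin and is reduced by at least a factor of 0.8 after a rejection.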

Figure 3.6: Step size restriction function for adaptive time stepping. [Axis marks: λmin, λfloor, λcap.]

Moreover we impose a lower bound of 1.0e-8 on the resulting new step size $h_{new}$. This restriction should prevent the integration algorithm from getting stuck. Additionally, we restrict the step size if it would integrate over the next exercise date or the observation time.

Chapter 4

Sensitivity Evaluation by Automatic Differentiation

The key issue of this thesis is the efficient evaluation of sensitivities for Bermudan

Swaptions in the Hull White interest rate model. A key input parameter of the pricing

problem of Bermudan swaptions is the Black’76 European swaption volatility surface.

In this chapter the focus lies on the evaluation of Vegas, i.e. the sensitivities of the

price with respect to the volatilities.

We elaborate how these sensitivities may be evaluated by applying methods of Automatic Differentiation to the numerical PDE solver. Moreover, we propagate the PDE sensitivities backwards by differentiating the calibration procedure. Finally we obtain the sensitivity of the Bermudan price with respect to the Black’76 volatilities in the volatility surface.

Methods of Automatic Differentiation apply the chain rule of differentiation to the elementary operations of a computer program. For a comprehensive treatment of these methods we refer to the literature, e.g. [GW08]. Appendix B illustrates the key concepts by means of an example. In this chapter we mention the Automatic Differentiation (AD) tools and methods applied but do not explain them in detail.

The numerical results of this thesis are evaluated by programs written in C++. All

routines relevant for the sensitivity evaluation are defined as function templates. For

the Bermudan price evaluation the function templates are instantiated using intrinsic

floating point data types (i.e. doubles). For the sensitivity evaluation we use AD

tools based on operator overloading. These tools define an active data type which

facilitates the automatic sensitivity evaluation. The pricing routines are instantiated

with this active data type if sensitivities should be evaluated.

The combination of template based programming and operator overloading AD

tools yields an efficient and flexible approach to incorporate sensitivity evaluation

to pricing procedures. Usually, only minor changes to the original (double-based)

program are necessary to allow for the sensitivity evaluations. A key advantage is that the template based programming requires maintaining only one source code. That is, the same template functions are used for both price evaluation and sensitivity evaluation (even with possibly different AD tools).
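The idea of writing a pricing routine once and instantiating it with either a floating point type or an active data type can be illustrated with a small forward-mode sketch. The thesis implementation uses C++ function templates and dedicated AD tools; the Python analogue below relies on duck typing instead of templates, and the class `Dual`, the helper `exp` and the toy routine `discount_factor` are hypothetical illustrations, not the interface of any AD tool mentioned in this thesis.

```python
import math


class Dual:
    """Value together with a directional derivative (forward-mode AD)."""

    def __init__(self, val, dot=0.0):
        self.val = val
        self.dot = dot

    @staticmethod
    def lift(x):
        return x if isinstance(x, Dual) else Dual(x)

    def __add__(self, other):
        o = Dual.lift(other)
        return Dual(self.val + o.val, self.dot + o.dot)

    __radd__ = __add__

    def __sub__(self, other):
        o = Dual.lift(other)
        return Dual(self.val - o.val, self.dot - o.dot)

    def __rsub__(self, other):
        return Dual.lift(other) - self

    def __mul__(self, other):
        o = Dual.lift(other)
        return Dual(self.val * o.val, self.dot * o.val + self.val * o.dot)

    __rmul__ = __mul__


def exp(x):
    """Exponential that accepts plain floats and Dual numbers."""
    if isinstance(x, Dual):
        e = math.exp(x.val)
        return Dual(e, e * x.dot)
    return math.exp(x)


def discount_factor(rate, maturity):
    """Toy pricing routine: written once, evaluated with float or Dual inputs."""
    return exp(-1.0 * rate * maturity)
```

Calling `discount_factor(0.03, 5.0)` returns a plain price, while `discount_factor(Dual(0.03, 1.0), 5.0)` returns the same value together with its derivative with respect to the rate in the `dot` attribute.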

Comparison of Automatic Differentiation to finite difference approximations. In practice sensitivities are often evaluated as finite difference approximations. That is, the input parameters are bumped by a small value and the objective price is re-evaluated. This approach is easy to implement in an existing pricing or risk management system and hence widely applied.

It is well known that the finite difference approach may be less accurate and not

efficient, particularly for many input parameters. For our reference test problem we

evaluate the average Vega as the sensitivity of the Bermudan price with respect to

a small parallel shift of the Black’76 volatility surface. We compare the results for

several spatial discretisations obtained by Automatic Differentiation and central finite differences with varying finite difference step size $h = 10^{-2}, \ldots, 10^{-8}$. The results are given in Figure 4.1.

We find from the results in Figure 4.1 that particularly for smaller problem dis-

cretisations finite differences yield no acceptable sensitivities. Even for larger discreti-

sations the Vega determined by finite differences deteriorates if the step size is chosen

a little too small. However, for larger finite difference step sizes the approximation

error may become significant. In contrast to the finite difference approach Automatic

Differentiation yields stable and accurate approximations of the Bermudan Vega.

In the numerical experiment corresponding to the results of Figure 4.1 we did not

fix the time discretisation of the numerical PDE solver. Thus the bumping of the

input parameters (implicitly) also changes the adaptive time stepping. This may be a significant reason for the poor behaviour of the finite difference approach.
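The step size trade-off of central finite differences can be reproduced on a toy problem. The sketch below uses the exponential function instead of the Bermudan pricer, which is an assumption made purely for illustration; the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Central finite difference approximation of f'(x) with step size h.
template <typename Func>
double centralDifference(Func f, double x, double h) {
    assert(h > 0.0);
    return (f(x + h) - f(x - h)) / (2.0 * h);
}

// Absolute error of the central difference for f = exp at x = 1,
// where the exact derivative exp(1) is known.
inline double expDerivativeError(double h) {
    const double exact = std::exp(1.0);
    const double approx =
        centralDifference([](double x) { return std::exp(x); }, 1.0, h);
    return std::fabs(approx - exact);
}
```

For large step sizes the truncation error of order $h^2$ dominates, while for very small step sizes the cancellation in the numerator is amplified by the division by $2h$. A pricing routine with adaptive time stepping adds further noise to the bumped evaluations.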

Sensitivities with respect to the interest rate curve. For hedging and risk

management sensitivities of the price with respect to the yield curve (i.e. Deltas)

are also relevant. In principle these sensitivities can also be evaluated by methods of Automatic Differentiation. In this thesis we do not cover the evaluation of these

sensitivities. The evaluation of risk sensitivities (in particular Deltas) for Bermudan

swaptions is elaborated, e.g. in [Pit04]. Sensitivities of European swaptions in the

Hull White model are discussed, e.g. in [Hen04].

Figure 4.1: Comparison of average Vega evaluated by central finite differences and Automatic Differentiation. [Panels for 101, 201, 501 and 1001 grid points; horizontal axis: finite difference step size from 1e-08 to 0.01; curves: average Vega by finite differences and average Vega by Automatic Differentiation.]

Evaluating Vegas in the Hull White PDE model. In this chapter we describe

the sensitivity evaluation of the Hull White Bermudan swaption price PV HW with

respect to the Black’76 swaption volatilities σB76 of the European reference swaptions

and the swaption volatility surface ΣB76. The pricing of Bermudans in the Hull White

model may be split into the following steps:

1. Interpolating the Black’76 volatility surface and evaluating prices for reference

European swaptions (see Section 2.1).

2. Evaluating a piecewise constant short rate volatility function by calibrating the

Hull White model to the prices of reference European swaptions (see Section

3. Solving the Hull White PDE model for the Bermudan swaption price (see Chap-

ter 3).

Hence we get the following mapping

\[
\Sigma^{B76} \;\overset{\text{Surface Interpolation}}{\longmapsto}\; \sigma^{B76} \;\overset{\text{Swaption}}{\longmapsto}\; PV^{B76} \;\overset{\text{Calibration}}{\longmapsto}\; \sigma^{HW} \;\overset{\text{Bermudan}^{HW}}{\longmapsto}\; PV^{HW}. \tag{4.1}
\]

Here

\[
\Sigma^{B76} =
\begin{pmatrix}
\sigma^{B76}_{1,1} & \cdots & \sigma^{B76}_{1,L} \\
\vdots & & \vdots \\
\sigma^{B76}_{K,1} & \cdots & \sigma^{B76}_{K,L}
\end{pmatrix}
\]

represents the European swaption Black’76 volatility surface and

\[
\sigma^{B76} = \left(\sigma^{B76}_1, \ldots, \sigma^{B76}_N\right)^\top, \qquad
PV^{B76} = \left(PV^{B76}_1, \ldots, PV^{B76}_N\right)^\top
\]

describe the interpolated Black’76 volatilities and corresponding prices for the reference European swaptions. The calibrated values of the piecewise constant short rate volatility function are denoted by

\[
\sigma^{HW} = \left(\sigma^{HW}_1, \ldots, \sigma^{HW}_N\right)^\top
\]

and the resulting price of the Bermudan swaption is $PV^{HW}$. The sensitivity of the Bermudan swaption price $PV^{HW}$ with respect to the Black’76 volatility surface $\Sigma^{B76}$ is determined by differentiating each mapping in the evaluation procedure (4.1). The resulting sensitivity is determined as the product of the individual derivatives by applying the chain rule of differentiation.

For the sensitivity of the Bermudan swaption price in the Hull White model PV HW

with respect to the Black’76 swaption volatility surface ΣB76 we get

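\[
\frac{dPV^{HW}}{d\Sigma^{B76}} = \frac{dPV^{HW}}{d\sigma^{HW}} \cdot \frac{d\sigma^{HW}}{dPV^{B76}} \cdot \frac{dPV^{B76}}{d\sigma^{B76}} \cdot \frac{d\sigma^{B76}}{d\Sigma^{B76}}.
\]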

The first derivative $dPV^{HW}/d\sigma^{HW}$ can also be considered the gradient of the PDE solution w.r.t. the short rate volatility values. Its evaluation is described in Section 4.3. The second term $d\sigma^{HW}/dPV^{B76}$ corresponds to the differentiation of the Hull White model calibration. This is elaborated in Section 4.2. The Black’76 European swaption Vegas and the derivatives of the Black’76 volatility surface interpolation are given by $dPV^{B76}/d\sigma^{B76}$ and $d\sigma^{B76}/d\Sigma^{B76}$, respectively. They are described in Section 4.1.

4.1 Differentiating the European swaption prices

The Black’76 swaption volatility surface $\Sigma^{B76}$ is interpolated bilinearly to obtain the reference Black’76 volatilities $\sigma^{B76}_i$ for $i = 1, \ldots, N$. Suppose $T_i$ is the exercise date of the $i$th European reference swaption and $S_i = S_M - T_i$ its underlying swap tenor (in years). Let $T^{B76}_1, \ldots, T^{B76}_K$ be the exercise dates and $S^{B76}_1, \ldots, S^{B76}_L$ be the swap tenors of the volatility surface $\Sigma^{B76}$. We choose $T^{B76}_k$ and $S^{B76}_l$ such that

\[
T^{B76}_k < T_i \le T^{B76}_{k+1} \quad \text{and} \quad S^{B76}_l < (S_M - T_i) \le S^{B76}_{l+1}.
\]

The interpolated Black’76 volatility becomes

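\[
\begin{aligned}
\sigma^{B76}_i ={}& \frac{(T^{B76}_{k+1} - T_i)(S^{B76}_{l+1} - S_i)}{(T^{B76}_{k+1} - T^{B76}_k)(S^{B76}_{l+1} - S^{B76}_l)}\, \sigma^{B76}_{k,l}
+ \frac{(T_i - T^{B76}_k)(S^{B76}_{l+1} - S_i)}{(T^{B76}_{k+1} - T^{B76}_k)(S^{B76}_{l+1} - S^{B76}_l)}\, \sigma^{B76}_{k+1,l} \\
&+ \frac{(T^{B76}_{k+1} - T_i)(S_i - S^{B76}_l)}{(T^{B76}_{k+1} - T^{B76}_k)(S^{B76}_{l+1} - S^{B76}_l)}\, \sigma^{B76}_{k,l+1}
+ \frac{(T_i - T^{B76}_k)(S_i - S^{B76}_l)}{(T^{B76}_{k+1} - T^{B76}_k)(S^{B76}_{l+1} - S^{B76}_l)}\, \sigma^{B76}_{k+1,l+1}.
\end{aligned}
\]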

Consequently the sensitivity of the reference volatility $\sigma^{B76}_i$ with respect to the volatility quote $\sigma^{B76}_{k,l}$ in the surface is

\[
\frac{d\sigma^{B76}_i}{d\sigma^{B76}_{k,l}} = \frac{(T^{B76}_{k+1} - T_i)(S^{B76}_{l+1} - S_i)}{(T^{B76}_{k+1} - T^{B76}_k)(S^{B76}_{l+1} - S^{B76}_l)}.
\]

The sensitivity of the reference volatility $\sigma^{B76}_i$ with respect to the other volatility quotes used for the interpolation follows analogously.
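Because the bilinear interpolation is linear in the quotes, the four interpolation weights are at the same time the four non-zero derivatives of the interpolated volatility. A minimal sketch, with hypothetical function names and a fixed ordering of the four surrounding quotes chosen here for illustration:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Weights of the four surrounding quotes in the bilinear interpolation.
// Order: (k,l), (k+1,l), (k,l+1), (k+1,l+1).
// These weights equal the derivatives of the interpolated volatility
// with respect to the corresponding quotes.
inline std::array<double, 4> bilinearWeights(double T, double S,
                                             double Tk, double Tk1,
                                             double Sl, double Sl1) {
    assert(Tk < Tk1 && Sl < Sl1);
    const double area = (Tk1 - Tk) * (Sl1 - Sl);
    return {(Tk1 - T) * (Sl1 - S) / area,
            (T - Tk) * (Sl1 - S) / area,
            (Tk1 - T) * (S - Sl) / area,
            (T - Tk) * (S - Sl) / area};
}

// Interpolated volatility from the four quotes, same ordering as the weights.
inline double interpolate(const std::array<double, 4>& weights,
                          const std::array<double, 4>& quotes) {
    double sigma = 0.0;
    for (std::size_t j = 0; j < 4; ++j) sigma += weights[j] * quotes[j];
    return sigma;
}
```

The weights are non-negative inside the interpolation cell and sum to one, so a flat surface is reproduced exactly.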

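The European swaption price in the Black’76 formula (2.1) is given by

\[
\text{Swaption}(t) = \text{Annuity}(t) \cdot (-\omega) \left[ Y(t)\,\Phi(-\omega d_1) - R\,\Phi(-\omega d_2) \right],
\]

\[
d_{1,2} = \frac{\log\left(Y(t)/R\right)}{\sigma^{B76}\sqrt{\tau}} \pm \frac{\sigma^{B76}\sqrt{\tau}}{2}.
\]

The derivative of the swaption price with respect to the input volatility is given by

\[
\frac{d\,\text{Swaption}(t)}{d\sigma^{B76}} = \text{Annuity}(t)\, Y(t)\, \frac{e^{-d_1^2/2}}{\sqrt{2\pi}}\, \sqrt{\tau}.
\]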

Hence we can evaluate the sensitivity of the Black’76 prices of the reference European

swaptions with respect to the volatility surface by analytical formulas.
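The analytical Vega can be checked against a bumped price. The following sketch implements the Black’76 swaption formula and its Vega as stated above; the function names are hypothetical, and the annuity, forward rate $Y$ and strike $R$ are passed in as plain numbers rather than derived from a yield curve.

```cpp
#include <cassert>
#include <cmath>

// Standard normal cumulative distribution function.
inline double normalCdf(double x) { return 0.5 * std::erfc(-x / std::sqrt(2.0)); }

// Black'76 swaption price in the form used in the text:
// Annuity * (-omega) * [Y * Phi(-omega*d1) - R * Phi(-omega*d2)].
inline double black76Swaption(double annuity, double Y, double R,
                              double sigma, double tau, int omega) {
    assert(sigma > 0.0 && tau > 0.0 && (omega == 1 || omega == -1));
    const double stdDev = sigma * std::sqrt(tau);
    const double d1 = std::log(Y / R) / stdDev + 0.5 * stdDev;
    const double d2 = d1 - stdDev;
    return annuity * (-omega) *
           (Y * normalCdf(-omega * d1) - R * normalCdf(-omega * d2));
}

// Analytical Vega: Annuity * Y * phi(d1) * sqrt(tau), independent of omega.
inline double black76Vega(double annuity, double Y, double R,
                          double sigma, double tau) {
    const double pi = 3.14159265358979323846;
    const double stdDev = sigma * std::sqrt(tau);
    const double d1 = std::log(Y / R) / stdDev + 0.5 * stdDev;
    return annuity * Y * std::exp(-0.5 * d1 * d1) / std::sqrt(2.0 * pi) *
           std::sqrt(tau);
}
```

Since the Vega does not depend on $\omega$, the same value applies to both swaption types.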

The implementation of the analytical formulas for the interpolation and the swaption formula in a computer program would require a careful tracking of which surface volatility $\sigma^{B76}_{k,l}$ contributes to a reference swaption volatility $\sigma^{B76}_i$. We circumvent this effort by applying the AD tool ADTAGEO [RG09] to the interpolation and swaption price evaluation.

ADTAGEO builds up the internal computational graph between the input volatilities of the volatility surface and the output swaption prices. It evaluates the Jacobian of all dependent variables with respect to all independent variables. That is, it allows the evaluation of the derivatives of each swaption price with respect to each volatility input. This approach may not be optimal in terms of computational effort. An AD reverse mode implementation could be more efficient.

However the overall computational effort of the evaluation of Black’76 prices in the

Bermudan pricing procedure is negligible compared to the calibration and the PDE

solution. We prefer to exploit the easy to use programming interface of ADTAGEO

compared to a slightly more sophisticated incorporation of the pure AD reverse mode.

4.2 Differentiating the Calibration

In the calibration procedure of the Hull White model we aim at solving the minimisation problem

\[
\min_{a \le \sigma^{HW} \le b} \frac{1}{2} F(\sigma^{HW})^\top F(\sigma^{HW}),
\]

see (2.6). The objective function $F : \mathbb{R}^N \to \mathbb{R}^N$ may be formulated using the notation in (4.1) as

\[
F(\sigma^{HW}) = CBO^{HW}(\sigma^{HW}) - PV^{B76}.
\]

The function CBOHW (σHW ) summarizes the evaluation of the swaption equivalent

bond options by formula (2.3) depending on the short rate volatility values in σHW .

We point out that the structure of the short rate volatility $\sigma^{HW} \in \mathbb{R}^N$ and the number of reference European option prices $PV^{B76}_1, \ldots, PV^{B76}_N$ are chosen such that the function $F$ maps from $\mathbb{R}^N$ to $\mathbb{R}^N$.

In usual market situations one can find a solution $\sigma^{HW}$ of the calibration problem such that

\[
F(\sigma^{HW}) = 0 \in \mathbb{R}^N, \quad \text{i.e.} \quad CBO^{HW}(\sigma^{HW}) = PV^{B76}.
\]

Moreover it appears reasonable to assume that the Jacobian F ′(σHW ) = CBOHW ′(σHW )

at the solution is non-singular. We would like to point out that these assumptions

are local properties close to the optimal value σHW . The more general formulation as

minimisation problem in Section 2.3.1 is intended to avoid iteration steps to regions

where the objective function may not be evaluated and the algorithm breaks down.

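Since we assume $F'(\sigma^{HW})$ non-singular and $F(\sigma^{HW}) = 0$ we may apply the implicit function theorem. This yields that there is a vicinity $U \subset \mathbb{R}^N$ around $PV^{B76}$ and a differentiable function $(CBO^{HW})^{-1} : U \to \mathbb{R}^N$ with

\[
\sigma^{HW} = (CBO^{HW})^{-1}(PV^{B76}).
\]

We get for the Jacobian of $(CBO^{HW})^{-1}$ that

\[
\frac{d(CBO^{HW})^{-1}}{dPV}(PV^{B76}) = \left[ F'(\sigma^{HW}) \right]^{-1} = \left[ \frac{dCBO^{HW}}{d\sigma}(\sigma^{HW}) \right]^{-1}.
\]

That is, the sensitivity of the calibrated short rate volatility $\sigma^{HW}$ with respect to the input Black’76 prices of reference swaptions $PV^{B76}$ is given by

\[
\frac{d\sigma^{HW}}{dPV^{B76}} = \left[ \frac{dCBO^{HW}}{d\sigma}(\sigma^{HW}) \right]^{-1}.
\]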

If the calibration problem is solved by a Gauss-Newton method as described in Section 2.3.2 the Jacobian $F'(\sigma^{HW}) = \frac{dCBO^{HW}}{d\sigma}(\sigma^{HW})$ is already available from the last iteration. Moreover the Jacobian is already decomposed into a QR factorisation which allows an efficient evaluation of inverse Jacobian vector products. In our situation we are particularly interested in evaluating

\[
\frac{dPV^{HW}}{d\sigma^{HW}} \cdot \frac{d\sigma^{HW}}{dPV^{B76}} = \left(\nabla PV^{HW}\right)^\top \cdot \left[ \frac{dCBO^{HW}}{d\sigma}(\sigma^{HW}) \right]^{-1}.
\]

The derivative evaluation of the coupon bond option formula in the Hull White model is again implemented using the AD tool ADTAGEO.
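The product of the gradient with the inverse Jacobian never requires forming the inverse explicitly: the row vector $z^\top = g^\top J^{-1}$ is the solution of the transposed linear system $J^\top z = g$. The sketch below illustrates this with a plain Gaussian elimination, which here stands in for the QR factorisation available from the Gauss-Newton iteration; the names `Matrix`, `Vector` and `gradientTimesInverseJacobian` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<double>>;
using Vector = std::vector<double>;

// Computes the row vector z^T = g^T * J^{-1} by solving J^T z = g.
// Gaussian elimination with partial pivoting is used for simplicity.
inline Vector gradientTimesInverseJacobian(const Matrix& J, const Vector& g) {
    const std::size_t n = g.size();
    assert(J.size() == n);
    // Build the augmented system [J^T | g].
    Matrix A(n, Vector(n + 1));
    for (std::size_t i = 0; i < n; ++i) {
        assert(J[i].size() == n);
        for (std::size_t j = 0; j < n; ++j) A[j][i] = J[i][j];
        A[i][n] = g[i];
    }
    // Forward elimination.
    for (std::size_t c = 0; c < n; ++c) {
        std::size_t p = c;
        for (std::size_t r = c + 1; r < n; ++r)
            if (std::fabs(A[r][c]) > std::fabs(A[p][c])) p = r;
        assert(A[p][c] != 0.0);  // the Jacobian is assumed non-singular
        std::swap(A[c], A[p]);
        for (std::size_t r = c + 1; r < n; ++r) {
            const double f = A[r][c] / A[c][c];
            for (std::size_t k = c; k <= n; ++k) A[r][k] -= f * A[c][k];
        }
    }
    // Back substitution.
    Vector z(n);
    for (std::size_t i = n; i-- > 0;) {
        double s = A[i][n];
        for (std::size_t k = i + 1; k < n; ++k) s -= A[i][k] * z[k];
        z[i] = s / A[i][i];
    }
    return z;
}
```

With an existing factorisation $J = QR$ the same system reduces to one triangular solve with $R^\top$ followed by a multiplication with $Q$.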

If the Jacobian of the objective function is singular at the solution, or if the calibration problem is not formulated as a system of (non-linear) equations, then we cannot use the approach above. In such situations a general fixed point

iteration may be used to calibrate the model. The derivative of the calibrated short

rate volatility with respect to input reference prices may then be evaluated by reverse

accumulation as described in [Chr94].

4.3 Evaluating Derivatives of the PDE solution

The price of the Bermudan swaption is evaluated as the solution $V(t, f(t,t))$ of the Hull White PDE

\[
V_t + \frac{\sigma^2(t)}{2} V_{rr} + \left[ \theta(t) - a r \right] V_r - r V = 0
\]

with Bermudan exercise conditions

\[
V(T_i, r) = \max\left\{ \lim_{t \downarrow T_i} V(t, r),\; \left[ \omega \left( CB_i(T_i, r, \sigma(\cdot), \ldots) - K_i \right) \right]^+ \right\}, \quad i = 1, \ldots, N.
\]

The volatility is piecewise constant between the exercise dates $T_1, \ldots, T_N$ ($T_0 = t$),

\[
\sigma(t) = \sigma^{HW}_i \quad \text{for } T_{i-1} < t \le T_i.
\]

For details see Section 3.

The numerical Bermudan swaption price $PV^{HW} = V^\delta(t, f(t,t))$ can be considered a function of the short rate volatility values $\sigma^{HW}_1, \ldots, \sigma^{HW}_N$. The volatilities enter the evaluation procedure of $PV^{HW}$ in the (boundary) exercise conditions and in the linear operator of the differential equation. Thus we can evaluate the sensitivity of the Bermudan price $PV^{HW}$ with respect to the short rate volatilities $\sigma^{HW} = (\sigma^{HW}_1, \ldots, \sigma^{HW}_N)^\top$ by differentiating all the elementary operations involved in the evaluation.

In our implementation we use the tapeless vector forward mode of the AD tool ADOL-C [GJU96] to differentiate the Hull White PDE solver. This approach is based on

operator overloading. For each elementary operation the corresponding derivatives are

directly evaluated and propagated along with function values. Since we are interested

in the derivative of the Bermudan price with respect to each of the N short rate

volatilities we apply the vector forward mode with N tangential directions.
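The principle of a tapeless vector forward mode can be sketched with a small active type that carries $N$ tangent components next to the value. The type `VDual` and the helper `independent` are hypothetical and only mimic the idea; they do not reproduce the ADOL-C interface.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical active type: a value and N tangent components,
// one per independent variable (e.g. one per short rate volatility).
template <std::size_t N>
struct VDual {
    double val{};
    std::array<double, N> dot{};
};

// Marks input number i as independent by seeding the i-th unit direction.
template <std::size_t N>
VDual<N> independent(double value, std::size_t i) {
    assert(i < N);
    VDual<N> x;
    x.val = value;
    x.dot[i] = 1.0;
    return x;
}

template <std::size_t N>
VDual<N> operator+(const VDual<N>& a, const VDual<N>& b) {
    VDual<N> r;
    r.val = a.val + b.val;
    for (std::size_t k = 0; k < N; ++k) r.dot[k] = a.dot[k] + b.dot[k];
    return r;
}

template <std::size_t N>
VDual<N> operator*(const VDual<N>& a, const VDual<N>& b) {
    VDual<N> r;
    r.val = a.val * b.val;
    for (std::size_t k = 0; k < N; ++k)
        r.dot[k] = a.dot[k] * b.val + a.val * b.dot[k];  // product rule
    return r;
}

template <std::size_t N>
VDual<N> sqrt(const VDual<N>& a) {
    VDual<N> r;
    r.val = std::sqrt(a.val);
    for (std::size_t k = 0; k < N; ++k) r.dot[k] = 0.5 * a.dot[k] / r.val;
    return r;
}
```

Every overloaded operation propagates all $N$ directional derivatives together with the value, so no intermediate values have to be stored. After a single evaluation the `dot` member of the result holds the full gradient.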

The Bermudan price as a function of the short rate volatilities can be considered

a mapping of N independent variables to one dependent variable. The derivative

dPV HW/dσHW is essentially a gradient. For such problems it is usually computation-

ally more efficient to use the reverse mode of AD. However, the reverse mode requires

the recording (taping) of all intermediate variables of the evaluation procedure. For

applications involving time stepping the memory requirement may be tremendous. This is particularly the case for fine discretisations in space and time. The memory requirement may be reduced by checkpointing strategies [SW10] which successively re-evaluate and store sequences of the evaluation procedure.

Bermudan swaptions usually involve a rather limited number $N$ of exercise dates. Hence the number of short rate volatility values is also moderate. In our (rather

extreme) reference test problem we incorporate 30 exercise dates. Consequently the

additional computational effort of the forward mode compared to the reverse mode

is still acceptable.

The more important point for the application of the AD forward mode in the PDE

solver is that it allows us to solve for several solutions of the Hull White PDE and its

derivatives in parallel. This is particularly important if we want to estimate the accu-

racy of the solution and the gradient of the numerical scheme. In our implementation

we solve the Hull White PDE for the exercise conditions of the Bermudan and in

addition for the boundary conditions of each of the N European reference swaptions.

The numerical results for prices and gradients are compared to the (semi-) analytical

prices and gradients available from the calibration procedure.

We illustrate the accuracy of the differentiation of the PDE code for our reference

test problem by evaluating the sensitivity of the numerical prices of the ith reference

swaptions PV HWi with respect to a parallel shift in the short rate volatility curve.

That is, we evaluate

\[
\dot{PV}^{HW}_i = \sum_{j=1}^{N} \frac{dPV^{HW}_i}{d\sigma^{HW}_j}.
\]

This quantity is compared to the (average) derivative of the corresponding analytical bond option price

\[
\dot{CBO}_i(t, \ldots) = \sum_{j=1}^{N} \frac{dCBO_i(t, \ldots)}{d\sigma^{HW}_j}.
\]

The relative error of the derivative of the numerical scheme is evaluated as

\[
err_i = \left| \frac{\dot{PV}^{HW}_i}{\dot{CBO}_i(t, \ldots)} - 1 \right|.
\]

Figure 4.2 illustrates the results of the accuracy of the derivative evaluation for reference swaptions of our standard test problem.

Figure 4.2: Asymptotic approximation of average sensitivities by Automatic Differentiation of reference European swaptions. [Plot; horizontal axis: number of spatial grid points n (101, 201, 501, 1001); vertical axis: relative error of sensitivity; reference line 10/n^2.]

The results in Figure 4.2 show that the derivatives evaluated by differentiating the

numerical scheme approximate the analytical solutions. However, the convergence of

the numerical derivative to the analytical derivative is considerably slower than the

convergence of the numerical PDE solution to the analytical solution. This is related to observations reported, for example, in [EB99] and [AS09].

4.4 Linking the partial derivatives

According to the description in Sections 4.1, 4.2, and 4.3 we can evaluate the partial derivatives of the Bermudan swaption pricing procedure. The individual partial derivatives are combined in reverse order of the evaluation procedure. That is, we first evaluate the $N$-dimensional transposed-vector-matrix multiplication

\[
\frac{dPV^{HW}}{dPV^{B76}} = \frac{dPV^{HW}}{d\sigma^{HW}} \cdot \frac{d\sigma^{HW}}{dPV^{B76}}.
\]

Here $N$ is the number of exercise dates. This results in a (row) vector of dimension $N$.

In the next step we evaluate

\[
\frac{dPV^{HW}}{d\sigma^{B76}} = \frac{dPV^{HW}}{dPV^{B76}} \cdot \frac{dPV^{B76}}{d\sigma^{B76}}.
\]

The derivative $dPV^{B76}/d\sigma^{B76}$ is of diagonal form. Hence this requires only $N$ multiplications. As a result we get the sensitivity of the Bermudan price with respect to the interpolated Black’76 volatilities. Figure 4.3 shows the resulting sensitivities $dPV^{HW}/d\sigma^{B76}$ for our reference test problem.

If we are also interested in the sensitivity of the Bermudan price with respect to the quoted Black’76 volatilities we may apply the derivatives of the surface interpolation. That is, we evaluate

\[
\frac{dPV^{HW}}{d\Sigma^{B76}} = \frac{dPV^{HW}}{d\sigma^{B76}} \cdot \frac{d\sigma^{B76}}{d\Sigma^{B76}}.
\]

The derivative $d\sigma^{B76}/d\Sigma^{B76}$ is an $N$-by-$(K \cdot L)$ matrix where $K \cdot L$ is the number of quotes in the volatility surface. This matrix can be rather big. However, it is also sparse. For our reference test problem the final sensitivities of the Bermudan price with respect to the quoted Black’76 volatilities are given in Table 4.1.
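The last two linking steps can be sketched in a few lines that exploit the diagonal and the sparse structure. The data layout is an assumption for illustration: the surface is flattened into a vector of quotes, and each reference swaption stores only the quotes used in its interpolation. All names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One non-zero entry of a row of d sigma^{B76} / d Sigma^{B76}:
// the flat index of a surface quote and its interpolation weight.
struct SurfaceWeight {
    std::size_t quoteIndex;
    double weight;
};

// Combines dPV^{HW}/dPV^{B76} (row vector), the diagonal of
// dPV^{B76}/dsigma^{B76} (European Vegas) and the sparse interpolation
// derivative into dPV^{HW}/dSigma^{B76}.
inline std::vector<double> surfaceVega(
    const std::vector<double>& dBermudan_dPrices,
    const std::vector<double>& europeanVegas,
    const std::vector<std::vector<SurfaceWeight>>& interpolationRows,
    std::size_t numberOfQuotes) {
    const std::size_t n = dBermudan_dPrices.size();
    assert(europeanVegas.size() == n && interpolationRows.size() == n);
    std::vector<double> result(numberOfQuotes, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        // Diagonal matrix: one multiplication per reference swaption.
        const double dBermudan_dSigma = dBermudan_dPrices[i] * europeanVegas[i];
        // Sparse matrix: only quotes used in the interpolation contribute.
        for (const SurfaceWeight& w : interpolationRows[i]) {
            assert(w.quoteIndex < numberOfQuotes);
            result[w.quoteIndex] += dBermudan_dSigma * w.weight;
        }
    }
    return result;
}
```

With bilinear interpolation each row holds at most four entries, so the effort is proportional to $N$ and independent of the size $K \cdot L$ of the surface.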

Figure 4.3: Sensitivity of Bermudan swaption price with respect to interpolated Black’76 volatilities of reference European swaptions. [Plot; horizontal axis: exercises i; vertical axis: $dPV^{HW}/d\sigma^{B76}$.]

Table 4.1: Sensitivity of Bermudan swaption price with respect to Black’76 volatility surface.

Exercise    Swap tenors (years)
(years)     1          5          10         20         30
2           0          0          0          5.9E-04    9.6E-03
3           0          0          0          6.3E-03    2.5E-02
4           0          0          0          1.7E-02    3.9E-02
5           0          0          0          2.2E-02    3.3E-02
6           0          0          0          2.6E-02    2.6E-02
7           0          0          0          2.7E-02    1.8E-02
8           0          0          0          2.5E-02    1.1E-02
9           0          0          0          3.3E-02    8.2E-03
10          0          0          5.2E-03    7.8E-02    3.0E-03
15          0          0          4.2E-02    7.1E-02    0
20          0          5.0E-03    6.9E-02    1.4E-02    0
25          2.4E-03    2.8E-02    1.8E-02    0          0
30          7.0E-03    5.5E-03    0          0          0

Chapter 5

Conclusions

In this thesis we investigate the sensitivity evaluation of Bermudan swaption prices

with respect to the market observable Black’76 volatilities. For the pricing of the

Bermudan swaption we apply a Hull White model with piecewise constant short

rate volatility function. The Bermudan swaption pricing involves the evaluation of

reference European swaption prices, the calibration of the short rate volatilities of the

Hull White model, and the numerical evaluation of the Bermudan price. Sensitivities

of all the valuation steps are evaluated by methods of Automatic Differentiation (AD).

Numerical results are reported for a reference test problem.

A key result of this thesis is that Automatic Differentiation yields accurate and

efficient sensitivities of Bermudan swaptions (see Chapter 4). It circumvents the

disadvantages of finite difference approximations. We find that template based pro-

gramming and AD tools based on operator overloading provide an efficient and flexible

approach to incorporate sensitivity evaluation in complex pricing procedures.

Besides the sensitivity evaluation we analysed the implementation of the Hull

White model on a PDE grid. It became visible in Subsection 2.2.3 that the numerical

solution of the Hull White model on a PDE grid requires an initial yield curve that

provides continuous forward rates. The PDE discretisation requires bounding the computational domain and approximating the boundary condition. We find in Subsection 3.1.2 that the boundary discretisation error is small if the domain is chosen sufficiently large. Our numerical results indicate that a short rate radius of 20% to 30% around the initial short rate value appears to be a reasonable choice. Moreover, we find that the accuracy of the numerical solution may be increased substantially if we apply the normally distributed short rate grid as elaborated in Subsection 3.1.1. For the time integration we find that adaptive step size control by Richardson’s extrapolation (see Section 3.2) is crucial for the efficiency of the overall procedure.

Additionally, the calibration procedure of the Hull White model may be differ-

entiated with only minor additional computational effort. If the short rate volatility

calibration problem is formulated as in Subsection 4.2 the required derivatives are

already available in a computationally desirable form from the calibration procedure

itself.

Consequently, we can consider the results in this thesis to be another proof of concept that Automatic Differentiation may improve the sensitivity evaluation in a wide range of applications. Further research might concern the sensitivity evaluation in other financial models and with respect to further market and model parameters like interest rates, mean reversion speeds, or correlations. The results may yield a better understanding of the risk profile of exotic products. The adaptation to other models could be realized, for example, by extending existing pricing libraries to allow the incorporation of Automatic Differentiation tools.

Appendix A

Derivative Approximation for Normally Distributed Grids

In Section 3.1 we elaborate that the discretisation error of the finite difference approximations $V^\delta_r$ and $V^\delta_{rr}$ to the first and second spatial derivatives $\frac{\partial V}{\partial r}$ and $\frac{\partial^2 V}{\partial r^2}$ becomes

\[
V^\delta_r(t, r_i) - \frac{\partial V}{\partial r}(r_i, t) = \frac{\partial^2 V}{\partial r^2}(r_i, t) \left[ (r_{i+1} - r_i) - (r_i - r_{i-1}) \right] + O(\delta^2),
\]

\[
V^\delta_{rr}(t, r_i) - \frac{\partial^2 V}{\partial r^2}(r_i, t) = \frac{\partial^3 V}{\partial r^3}(r_i, t) \left[ (r_{i+1} - r_i) - (r_i - r_{i-1}) \right] + O(\delta^2),
\]

for $i = 1, \ldots, n-1$. The spatial discretisation width $\delta$ is

\[
\delta = \max_{i=1,\ldots,n} \{ r_i - r_{i-1} \}.
\]

For uniform grids with $(r_{i+1} - r_i) - (r_i - r_{i-1}) = 0$ we get immediately that the discretisation error is of order $O(\delta^2)$. This yields quadratic order of convergence of the numerical scheme. If we have a non-uniform grid the derivative approximation is in the general case only of order $O(\delta)$ which reduces the convergence order of the numerical scheme accordingly.

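For the normally distributed short rate grid we choose the grid points $r_i$ according to

\[
r_i = \frac{\Phi^{-1}\left(\frac{i+1}{n+2}\right)}{\Phi^{-1}\left(\frac{n+1}{n+2}\right)}\, s + r(t) \quad \text{for } i = 0, \ldots, n.
\]

Denote $u_i = \frac{i+1}{n+2}$ and $h_n = \frac{1}{n+2}$. Then

\[
(r_{i+1} - r_i) - (r_i - r_{i-1}) = \frac{s}{\Phi^{-1}(u_n)} \underbrace{\left[ \left( \Phi^{-1}(u_i + h_n) - \Phi^{-1}(u_i) \right) - \left( \Phi^{-1}(u_i) - \Phi^{-1}(u_i - h_n) \right) \right]}_{d^2\Phi^{-1}(u_i)}
\]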

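for $i = 1, \ldots, n-1$. Since $\Phi^{-1}$ is continuously differentiable there are constants $\rho, \eta \in [0, 1]$ such that

\[
d^2\Phi^{-1}(u_i) = {\Phi^{-1}}'(u_i + \rho h_n)\, h_n - {\Phi^{-1}}'(u_i - \eta h_n)\, h_n.
\]

Moreover, since ${\Phi^{-1}}'$ is also continuously differentiable there is a constant $\lambda \in [-1, 1]$ such that

\[
d^2\Phi^{-1}(u_i) = {\Phi^{-1}}''(u_i + \lambda h_n)(\rho + \eta) h_n^2
= \frac{{\Phi^{-1}}''(u_i + \lambda h_n)}{{\Phi^{-1}}'(u_i + \rho h_n)^2} (\rho + \eta) \left( \Phi^{-1}(u_i + h_n) - \Phi^{-1}(u_i) \right)^2.
\]

Since

\[
\frac{{\Phi^{-1}}''(u_i + \lambda h_n)}{{\Phi^{-1}}''(u_i)} \to 1 \quad \text{and} \quad \frac{{\Phi^{-1}}'(u_i)}{{\Phi^{-1}}'(u_i + \rho h_n)} \to 1 \quad \text{for } h_n \to 0
\]

we get

\[
\left| (r_{i+1} - r_i) - (r_i - r_{i-1}) \right| \le C\, \frac{\Phi^{-1}(u_n)}{s} \left| \frac{{\Phi^{-1}}''(u_i)}{{\Phi^{-1}}'(u_i)^2} \right| \left| r_{i+1} - r_i \right|^2
\]

for a positive constant $C < \infty$ (usually in the order of one). Moreover it follows that

\[
{\Phi^{-1}}'(u_i) = \frac{1}{\Phi'\left(\Phi^{-1}(u_i)\right)} \quad \text{and} \quad {\Phi^{-1}}''(u_i) = -\frac{\Phi''\left(\Phi^{-1}(u_i)\right)}{\Phi'\left(\Phi^{-1}(u_i)\right)^3}.
\]

Exploiting that

\[
\Phi'(x) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{x^2}{2}\right) \quad \text{and thus} \quad \Phi''(x) = -x\,\Phi'(x)
\]

yields

\[
\left| (r_{i+1} - r_i) - (r_i - r_{i-1}) \right| \le \frac{C}{s}\, \Phi^{-1}(u_n) \left| \Phi^{-1}(u_i) \right| \left| r_{i+1} - r_i \right|^2. \tag{A.1}
\]

Since $|\Phi^{-1}(u_i)| < \Phi^{-1}(u_n)$ it follows that

\[
\left| (r_{i+1} - r_i) - (r_i - r_{i-1}) \right| < \frac{C}{s} \left[ \Phi^{-1}(u_n) \right]^2 \left| r_{i+1} - r_i \right|^2.
\]

Unfortunately $\Phi^{-1}(u_n) = -\Phi^{-1}\left(\frac{1}{n+2}\right)$ is not bounded for $n \to \infty$. However, it grows only very moderately. For (computationally) reasonable numbers of discretisation grid points $n$ we may consider it bounded above. Figure A.1 illustrates the growth of the factor $[\Phi^{-1}(u_n)]^2$.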

Moreover, we have in Equation (A.1) that $\Phi^{-1}(u_i) = 0$ for $u_i = \frac{1}{2}$. Hence for $u_i$ in a vicinity of $\frac{1}{2}$ we may expect that the additional error by the non-uniform grid discretisation almost vanishes. For $s = 1$ and $n = 1001$ we illustrate the factor

\[
\frac{(r_{i+1} - r_i) - (r_i - r_{i-1})}{\max\left\{ (r_{i+1} - r_i)^2, (r_i - r_{i-1})^2 \right\}}
\]

for $i = 1, \ldots, n-1$ in Figure A.2.

Figure A.1: NormGridFactor. [Plot of $[\Phi^{-1}(u_n)]^2$ against the number of grid points $n$.]

Figure A.2: RFactor. [Plot of $\frac{(r_{i+1} - r_i) - (r_i - r_{i-1})}{\max\{(r_{i+1} - r_i)^2, (r_i - r_{i-1})^2\}}$ against the spatial dimension $r$ from $-1$ to $1$.]

We may conclude that the discretisation error of the finite difference approximation for the first and second derivatives of $V$ with normally distributed grid points becomes

\[
\left| V^\delta_r(t, r_i) - \frac{\partial V}{\partial r}(r_i, t) \right| = \left[ \Phi^{-1}(u_n) \right]^2 O(\delta^2) + O(\delta^2),
\]

\[
\left| V^\delta_{rr}(t, r_i) - \frac{\partial^2 V}{\partial r^2}(r_i, t) \right| = \left[ \Phi^{-1}(u_n) \right]^2 O(\delta^2) + O(\delta^2),
\]

for $i = 1, \ldots, n-1$. Hence for the normally distributed short rate grid we get essentially a second order approximation. This confirms the quadratic convergence observed in the results of Figure 3.3.
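The normally distributed grid defined in this appendix can be generated as sketched below. The inverse normal distribution function is obtained by a simple bisection, which is an assumption made to keep the example self-contained; a production code would rather use a dedicated approximation. The function names are hypothetical, and `r0` stands for the centre $r(t)$.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Standard normal cumulative distribution function.
inline double normalCdf(double x) { return 0.5 * std::erfc(-x / std::sqrt(2.0)); }

// Inverse of the standard normal CDF by bisection; simple rather than fast.
inline double inverseNormalCdf(double p) {
    assert(p > 0.0 && p < 1.0);
    double lo = -40.0, hi = 40.0;
    for (int it = 0; it < 200; ++it) {
        const double mid = 0.5 * (lo + hi);
        if (normalCdf(mid) < p) lo = mid; else hi = mid;
    }
    return 0.5 * (lo + hi);
}

// Grid points r_i = Phi^{-1}((i+1)/(n+2)) / Phi^{-1}((n+1)/(n+2)) * s + r0
// for i = 0, ..., n.
inline std::vector<double> normalGrid(std::size_t n, double s, double r0) {
    assert(n >= 1 && s > 0.0);
    const double scale = inverseNormalCdf((n + 1.0) / (n + 2.0));
    std::vector<double> r(n + 1);
    for (std::size_t i = 0; i <= n; ++i)
        r[i] = inverseNormalCdf((i + 1.0) / (n + 2.0)) / scale * s + r0;
    return r;
}
```

The grid covers the interval $[r(t) - s, r(t) + s]$, is symmetric around $r(t)$, and is finest in the centre where $\Phi^{-1}(u_i)$ is close to zero.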

Appendix B

Fundamentals of Automatic Differentiation

Automatic Differentiation (AD) covers principles and techniques of derivative evalua-

tions that make use of the computational representation of a function. In applications

a mathematical function is usually given as a computer program (in some high level

programming language) with input and output variables. Methods of AD allow the

efficient evaluation of the derivatives of selected outputs with respect to selected in-

puts. The concepts of AD are implemented by tools that use the computational rep-

resentation of a function given in a programming language like C/C++, Fortran,

or Matlab. The following details are mainly based on the book by A. Griewank [GW08], to which we also refer for a more detailed discussion.

The main ideas of AD are the representation of a functional relation as a se-

quence of elemental differentiable operations, and the application of the chain-rule of

differentiation in forward or reverse order to this sequence.

Besides the elemental concepts of the forward and reverse mode, there are also

methods for the evaluation of higher order derivatives and the efficient evaluation of

derivatives in a number of directions. For a more detailed discussion, we refer to the references at www.autodiff.org.

B.1 Evaluation procedure

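To illustrate the principles of AD, consider the function described by the Black’76 formula. The Black’76 formula for European puts ($\omega = -1$) and calls ($\omega = +1$) with forward price $F$, strike $K$, volatility $\sigma$, and time to maturity $\tau$ is

\[
\text{Black76} : (0, \infty)^4 \times \{-1, +1\} \to (0, \infty),
\]

\[
\text{Black76}(F, K, \sigma, \tau, \omega) = \omega \left[ F\,\Phi(\omega d_1) - K\,\Phi(\omega d_2) \right],
\]

\[
d_{1,2} = \frac{\log(F/K)}{\sigma\sqrt{\tau}} \pm \frac{\sigma\sqrt{\tau}}{2}.
\]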

By splitting the evaluation of Black76() into a sequence of unary and binary elemental operations, one may get a list of operations as shown in Figure B.1, where $v_i$ denote intermediate variables and $f_i(\cdot)$ stands for the elemental operations. This representation of Black76() is considered as an evaluation procedure. Note that elementary functions may involve constants and parameters which are not considered to be differentiated. For example, in Figure B.1 $f_6(v_4) = 0.5 \cdot v_4$ or $f_9(v_7) = \omega \cdot v_7$ involve factors $0.5$ or $\omega$ which act as constants. The evaluation procedure of a function can be represented as a directed acyclic computational graph connecting independent input variables with dependent output variables. A computational graph of the Black76() function based on the elementary operation sequence in Figure B.1 is given in Figure B.2.

Figure B.1: Evaluation procedure of Black’76 formula

\[
\begin{aligned}
v_{-3} &= x_1 = F \\
v_{-2} &= x_2 = K \\
v_{-1} &= x_3 = \sigma \\
v_0 &= x_4 = \tau \\
v_1 &= v_{-3}/v_{-2} \equiv f_1(v_{-3}, v_{-2}) \\
v_2 &= \log(v_1) \equiv f_2(v_1) \\
v_3 &= \sqrt{v_0} \equiv f_3(v_0) \\
v_4 &= v_{-1} \cdot v_3 \equiv f_4(v_{-1}, v_3) \\
v_5 &= v_2/v_4 \equiv f_5(v_2, v_4) \\
v_6 &= 0.5 \cdot v_4 \equiv f_6(v_4) \\
v_7 &= v_5 + v_6 \equiv f_7(v_5, v_6) \\
v_8 &= v_7 - v_4 \equiv f_8(v_7, v_4) \\
v_9 &= \omega \cdot v_7 \equiv f_9(v_7) \\
v_{10} &= \omega \cdot v_8 \equiv f_{10}(v_8) \\
v_{11} &= \Phi(v_9) \equiv f_{11}(v_9) \\
v_{12} &= \Phi(v_{10}) \equiv f_{12}(v_{10}) \\
v_{13} &= v_{-3} \cdot v_{11} \equiv f_{13}(v_{-3}, v_{11}) \\
v_{14} &= v_{-2} \cdot v_{12} \equiv f_{14}(v_{-2}, v_{12}) \\
v_{15} &= v_{13} - v_{14} \equiv f_{15}(v_{13}, v_{14}) \\
v_{16} &= \omega \cdot v_{15} \equiv f_{16}(v_{15}) \\
y_1 &= v_{16}
\end{aligned}
\]

Figure B.2: Computational Graph of Black’76 Formula

Generally, let $F : \mathbb{R}^n \to \mathbb{R}^m$ and $f_i : \mathbb{R}^{n_i} \to \mathbb{R}^{m_i}$. The relation $j \prec i$ denotes that $v_i \in \mathbb{R}$ depends directly on $v_j \in \mathbb{R}$. If for all $x \in \mathbb{R}^n$ and $y \in \mathbb{R}^m$ with $y = F(x)$ it holds that

\[
\begin{aligned}
v_{i-n} &= x_i, & i &= 1, \ldots, n, \\
v_i &= f_i(v_j)_{j \prec i}, & i &= 1, \ldots, l, \\
y_{m-i} &= v_{l-i}, & i &= m-1, \ldots, 0,
\end{aligned}
\tag{B.1}
\]

then (B.1) is called an evaluation procedure of $F$ with elementary operations $f_i$. We assume differentiability of all elementary operations $f_i$ ($i = 1, \ldots, l$). Then the resulting function $F$ is also differentiable.

B.2 Forward mode

By differentiating the elementary operations $f_i$ in the evaluation procedure (B.1) totally and applying the chain-rule of differentiation, directional derivatives can be evaluated. Denote $u_i = (v_j)_{j \prec i} \in \mathbb{R}^{n_i}$. Then $u_i$ collects the arguments of the operation $f_i$. In addition to each elementary function evaluation $v_i = f_i(u_i)$ we evaluate the derivatives $\dot{v}_i$ by

\[
\dot{v}_i = \sum_{j \prec i} \frac{\partial}{\partial v_j} f_i(u_i) \cdot \dot{v}_j.
\]

Abbreviating $\dot{u}_i = (\dot{v}_j)_{j \prec i}$ and $\dot{f}_i(u_i, \dot{u}_i) = f_i'(u_i) \cdot \dot{u}_i$ leads to the forward mode of AD by

\[
\begin{aligned}
\left[ v_{i-n}, \dot{v}_{i-n} \right] &= \left[ x_i, \dot{x}_i \right], & i &= 1, \ldots, n, \\
\left[ v_i, \dot{v}_i \right] &= \left[ f_i(u_i), \dot{f}_i(u_i, \dot{u}_i) \right], & i &= 1, \ldots, l, \\
\left[ y_{m-i}, \dot{y}_{m-i} \right] &= \left[ v_{l-i}, \dot{v}_{l-i} \right], & i &= m-1, \ldots, 0.
\end{aligned}
\]

Here, the initializing derivative values $\dot{x}_i$ for $i = 1, \ldots, n$ are given and determine the direction of the tangent. This representation can also be interpreted as an evaluation procedure. Hence with $\dot{x} = (\dot{x}_i) \in \mathbb{R}^n$ and $\dot{y} = (\dot{y}_i) \in \mathbb{R}^m$, the forward mode of AD evaluates

\[
\dot{y} = F'(x)\,\dot{x}.
\]

For a proof of this statement, we refer to [GW08]. The forward differentiation of the Black’76 example is given in Figure B.3.

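The additional computational effort of the forward differentiation consists of the differentiation of the elemental operation $f_i(u_i)$, the multiplication with the corresponding values in $\dot{u}_i$, and the summation. A separate count of memory moves, additions, multiplications, and non-linear operations together with a weighted measure of its computational effort yields, according to [GW08], that

\[
\text{OPS}\left( F'(x)\,\dot{x} \right) \le c_f\, \text{OPS}\left( F(x) \right), \quad \text{where } c_f \in \left[ 2, \tfrac{5}{2} \right].
\]

$\text{OPS}(\cdot)$ is a measure of the computational effort and the multiplier $c_f$ depends on the weighting of the effort of moves, additions, multiplications and non-linear operations. With this estimation, one may state that the computational effort of the evaluation of the forward mode of AD is bounded by the computational effort of $\frac{5}{2}$ function evaluations. Note that the forward mode includes a function evaluation, because the evaluation of $f_i'$ also requires $u_i$ for all $i$.

Figure B.3: Forward Differentiation of Black’76 Formula

\[
\begin{aligned}
v_{-3} &= x_1 = F & \dot{v}_{-3} &= 0 \\
v_{-2} &= x_2 = K & \dot{v}_{-2} &= 0 \\
v_{-1} &= x_3 = \sigma & \dot{v}_{-1} &= 1 \\
v_0 &= x_4 = \tau & \dot{v}_0 &= 0 \\
v_1 &= v_{-3}/v_{-2} & \dot{v}_1 &= \dot{v}_{-3}/v_{-2} - v_1 \cdot \dot{v}_{-2}/v_{-2} \\
v_2 &= \log(v_1) & \dot{v}_2 &= \dot{v}_1/v_1 \\
v_3 &= \sqrt{v_0} & \dot{v}_3 &= 0.5 \cdot \dot{v}_0/v_3 \\
v_4 &= v_{-1} \cdot v_3 & \dot{v}_4 &= \dot{v}_{-1} \cdot v_3 + v_{-1} \cdot \dot{v}_3 \\
v_5 &= v_2/v_4 & \dot{v}_5 &= \dot{v}_2/v_4 - v_5 \cdot \dot{v}_4/v_4 \\
v_6 &= 0.5 \cdot v_4 & \dot{v}_6 &= 0.5 \cdot \dot{v}_4 \\
v_7 &= v_5 + v_6 & \dot{v}_7 &= \dot{v}_5 + \dot{v}_6 \\
v_8 &= v_7 - v_4 & \dot{v}_8 &= \dot{v}_7 - \dot{v}_4 \\
v_9 &= \omega \cdot v_7 & \dot{v}_9 &= \omega \cdot \dot{v}_7 \\
v_{10} &= \omega \cdot v_8 & \dot{v}_{10} &= \omega \cdot \dot{v}_8 \\
v_{11} &= \Phi(v_9) & \dot{v}_{11} &= \varphi(v_9) \cdot \dot{v}_9 \\
v_{12} &= \Phi(v_{10}) & \dot{v}_{12} &= \varphi(v_{10}) \cdot \dot{v}_{10} \\
v_{13} &= v_{-3} \cdot v_{11} & \dot{v}_{13} &= \dot{v}_{-3} \cdot v_{11} + v_{-3} \cdot \dot{v}_{11} \\
v_{14} &= v_{-2} \cdot v_{12} & \dot{v}_{14} &= \dot{v}_{-2} \cdot v_{12} + v_{-2} \cdot \dot{v}_{12} \\
v_{15} &= v_{13} - v_{14} & \dot{v}_{15} &= \dot{v}_{13} - \dot{v}_{14} \\
v_{16} &= \omega \cdot v_{15} & \dot{v}_{16} &= \omega \cdot \dot{v}_{15} \\
y_1 &= v_{16} & \dot{y}_1 &= \dot{v}_{16}
\end{aligned}
\]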

In practice, a concrete run-time ratio of the derivative evaluation compared to the

function evaluation may differ from the theoretical result. It depends on the specific

program of the function, the AD tool applied, and the hardware used.
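The forward mode code list of the Black’76 example translates almost line by line into a program. The sketch below writes each intermediate variable $v_i$ next to its tangent; the naming scheme (`vm3` for $v_{-3}$, suffix `d` for the dotted quantity) and the function names are assumptions made for this illustration.

```cpp
#include <cassert>
#include <cmath>

struct ValueAndTangent { double value; double tangent; };

inline double normalCdf(double x) { return 0.5 * std::erfc(-x / std::sqrt(2.0)); }
inline double normalPdf(double x) {
    const double pi = 3.14159265358979323846;
    return std::exp(-0.5 * x * x) / std::sqrt(2.0 * pi);
}

// Forward mode code list of the Black'76 formula.
// (dF, dK, dSigma, dTau) is the tangent direction; (0, 0, 1, 0) yields Vega.
inline ValueAndTangent black76Forward(double F, double K, double sigma,
                                      double tau, int omega, double dF,
                                      double dK, double dSigma, double dTau) {
    const double vm3 = F, vm3d = dF;
    const double vm2 = K, vm2d = dK;
    const double vm1 = sigma, vm1d = dSigma;
    const double v0 = tau, v0d = dTau;
    const double v1 = vm3 / vm2, v1d = vm3d / vm2 - v1 * vm2d / vm2;
    const double v2 = std::log(v1), v2d = v1d / v1;
    const double v3 = std::sqrt(v0), v3d = 0.5 * v0d / v3;
    const double v4 = vm1 * v3, v4d = vm1d * v3 + vm1 * v3d;
    const double v5 = v2 / v4, v5d = v2d / v4 - v5 * v4d / v4;
    const double v6 = 0.5 * v4, v6d = 0.5 * v4d;
    const double v7 = v5 + v6, v7d = v5d + v6d;  // d1
    const double v8 = v7 - v4, v8d = v7d - v4d;  // d2
    const double v9 = omega * v7, v9d = omega * v7d;
    const double v10 = omega * v8, v10d = omega * v8d;
    const double v11 = normalCdf(v9), v11d = normalPdf(v9) * v9d;
    const double v12 = normalCdf(v10), v12d = normalPdf(v10) * v10d;
    const double v13 = vm3 * v11, v13d = vm3d * v11 + vm3 * v11d;
    const double v14 = vm2 * v12, v14d = vm2d * v12 + vm2 * v12d;
    const double v15 = v13 - v14, v15d = v13d - v14d;
    const double v16 = omega * v15, v16d = omega * v15d;
    return {v16, v16d};
}
```

One call yields one directional derivative. The full gradient with respect to $F$, $K$, $\sigma$ and $\tau$ requires four calls with the four unit directions.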

B.3 Reverse mode

In the forward mode, the chain-rule is applied in the same order as the function evaluation itself, starting at the independent variables and evaluating the dependent variables. In contrast, the chain-rule can be applied in the reverse order of the function evaluation. This means that the sensitivities of the dependent variables $y_i$ with respect to the intermediate variables $v_i$ are computed successively and in reverse order. Note that this requires a preceding function evaluation where all overwritten program variables are stored for restoration in the reverse sweep.

The reverse mode evaluation requires auxiliary derivative values $\bar{v}_j$. These intermediate variables are initialized to zero before the reverse mode evaluation. For each elementary operation $f_i$ and all intermediate variables $v_j$ with $j \prec i$,

\[
\bar{v}_j \mathrel{+}= \bar{v}_i \cdot \frac{\partial}{\partial v_j} f_i(u_i)
\]

is evaluated. In other words, for each argument of $f_i$ the partial derivative is evaluated and accumulated.

Denoting $\bar{u}_i = (\bar{v}_j)_{j \prec i} \in \mathbb{R}^{n_i}$ and $\bar{f}_i(u_i, \bar{v}_i) = \bar{v}_i \cdot f_i'(u_i)$, the incremental reverse mode of AD is given by the evaluation procedure

\[
\begin{aligned}
v_{i-n} &= x_i, & i &= 1, \ldots, n, \\
v_i &= f_i(v_j)_{j \prec i}, & i &= 1, \ldots, l, \\
y_{m-i} &= v_{l-i}, & i &= m-1, \ldots, 0, \\
\bar{v}_{l-i} &= \bar{y}_{m-i}, & i &= 0, \ldots, m-1, \\
\bar{u}_i &\mathrel{+}= \bar{f}_i(u_i, \bar{v}_i), & i &= l, \ldots, 1, \\
\bar{x}_i &= \bar{v}_{i-n}, & i &= n, \ldots, 1.
\end{aligned}
\]

In this representation all intermediate variables $v_i$ are assigned only once. Here the initializing values $\bar{y}_i$ are given. They represent a weighting of the dependent variables $y_i$. The vector $\bar{y} = (\bar{y}_i)$ can also be interpreted as normal vector of a hyperplane in the range of $F$. With $\bar{y} = (\bar{y}_i)$ and $\bar{x} = (\bar{x}_i)$, the reverse mode of AD yields

\[
\bar{x}^T = \nabla \left[ \bar{y}^T F(x) \right] = \bar{y}^T F'(x).
\]

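Figure B.4: Reverse Mode Evaluation of Black’76 formula

Forward sweep (function evaluation):

\[
\begin{aligned}
v_{-3} &= x_1 = F, \quad v_{-2} = x_2 = K, \quad v_{-1} = x_3 = \sigma, \quad v_0 = x_4 = \tau \\
v_1 &= v_{-3} / v_{-2} \\
v_2 &= \log(v_1) \\
v_3 &= \sqrt{v_0} \\
v_4 &= v_{-1} \cdot v_3 \\
v_5 &= v_2 / v_4 \\
v_6 &= 0.5 \cdot v_4 \\
v_7 &= v_5 + v_6 \\
v_8 &= v_7 - v_4 \\
v_9 &= \omega \cdot v_7 \\
v_{10} &= \omega \cdot v_8 \\
v_{11} &= \Phi(v_9) \\
v_{12} &= \Phi(v_{10}) \\
v_{13} &= v_{-3} \cdot v_{11} \\
v_{14} &= v_{-2} \cdot v_{12} \\
v_{15} &= v_{13} - v_{14} \\
v_{16} &= \omega \cdot v_{15} \\
y_1 &= v_{16}
\end{aligned}
\]

Reverse sweep (derivative evaluation):

\[
\begin{aligned}
\bar v_{16} &= \bar y_1 = 1 \\
\bar v_{15} &\mathrel{+}= \omega \cdot \bar v_{16} \\
\bar v_{13} &\mathrel{+}= \bar v_{15}; \quad \bar v_{14} \mathrel{+}= (-1) \cdot \bar v_{15} \\
\bar v_{-2} &\mathrel{+}= v_{12} \cdot \bar v_{14}; \quad \bar v_{12} \mathrel{+}= v_{-2} \cdot \bar v_{14} \\
\bar v_{-3} &\mathrel{+}= v_{11} \cdot \bar v_{13}; \quad \bar v_{11} \mathrel{+}= v_{-3} \cdot \bar v_{13} \\
\bar v_{10} &\mathrel{+}= \varphi(v_{10}) \cdot \bar v_{12} \\
\bar v_9 &\mathrel{+}= \varphi(v_9) \cdot \bar v_{11} \\
\bar v_8 &\mathrel{+}= \omega \cdot \bar v_{10} \\
\bar v_7 &\mathrel{+}= \omega \cdot \bar v_9 \\
\bar v_7 &\mathrel{+}= \bar v_8; \quad \bar v_4 \mathrel{+}= (-1) \cdot \bar v_8 \\
\bar v_5 &\mathrel{+}= \bar v_7; \quad \bar v_6 \mathrel{+}= \bar v_7 \\
\bar v_4 &\mathrel{+}= 0.5 \cdot \bar v_6 \\
\bar v_2 &\mathrel{+}= \bar v_5 / v_4; \quad \bar v_4 \mathrel{+}= (-1) \cdot \bar v_5 \cdot v_5 / v_4 \\
\bar v_{-1} &\mathrel{+}= v_3 \cdot \bar v_4; \quad \bar v_3 \mathrel{+}= v_{-1} \cdot \bar v_4 \\
\bar v_0 &\mathrel{+}= 0.5 \cdot \bar v_3 / v_3 \\
\bar v_1 &\mathrel{+}= \bar v_2 / v_1 \\
\bar v_{-3} &\mathrel{+}= \bar v_1 / v_{-2}; \quad \bar v_{-2} \mathrel{+}= (-1) \cdot \bar v_1 \cdot v_1 / v_{-2} \\
\bar\tau &= \bar x_4 = \bar v_0 \\
\bar\sigma &= \bar x_3 = \bar v_{-1} \\
\bar K &= \bar x_2 = \bar v_{-2} \\
\bar F &= \bar x_1 = \bar v_{-3}
\end{aligned}
\]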

For a proof of this statement, see [GW08]. An example code list of the reverse mode

differentiated Black’76 formula is given in Figure B.4.

In analogy to the forward mode, the additional computational effort of the reverse mode can be estimated by considering memory moves, additions, multiplications, and non-linear operations. This again includes a function evaluation. According to [GW08] it follows that

\[
\mathrm{OPS}\left(\bar y^T F'(x)\right) \le c_r \, \mathrm{OPS}\left(F(x)\right), \quad \text{where } c_r \in [3, 4].
\]

It is of particular importance that this bound for the reverse mode is independent of the number $n$ of input variables. In the special case of scalar functions $F : \mathbb{R}^n \to \mathbb{R}$ the gradient $\nabla F(x)$ can be evaluated at an effort of at most four function evaluations, independent of the number $n$ of input variables.
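The reverse sweep of Figure B.4 can be written down by hand. The following Python sketch is only an illustration and not the ADOL-C or ADTAGEO code used in this thesis; the function name `black76_reverse` and the variable names `b*` (standing for the barred quantities) are assumptions made here. Variables that are used only once are assigned directly instead of being incremented from zero.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def black76_reverse(F, K, sigma, tau, omega=1.0):
    """Undiscounted Black'76 value and its gradient w.r.t. (F, K, sigma, tau)."""
    # forward sweep: store all intermediate values
    v1 = F / K
    v2 = math.log(v1)
    v3 = math.sqrt(tau)
    v4 = sigma * v3
    v5 = v2 / v4
    v6 = 0.5 * v4
    v7 = v5 + v6
    v8 = v7 - v4
    v9 = omega * v7
    v10 = omega * v8
    v11 = norm_cdf(v9)
    v12 = norm_cdf(v10)
    v13 = F * v11
    v14 = K * v12
    v15 = v13 - v14
    y = omega * v15
    # reverse sweep: propagate sensitivities from the output back to the inputs
    b15 = omega * 1.0            # seed: bar y = 1
    b13, b14 = b15, -b15
    bK, b12 = v12 * b14, K * b14
    bF, b11 = v11 * b13, F * b13
    b10 = norm_pdf(v10) * b12
    b9 = norm_pdf(v9) * b11
    b8 = omega * b10
    b7 = omega * b9
    b7 += b8
    b4 = -b8
    b5, b6 = b7, b7
    b4 += 0.5 * b6
    b2 = b5 / v4
    b4 += -b5 * v5 / v4
    bsigma, b3 = v3 * b4, sigma * b4
    btau = 0.5 * b3 / v3
    b1 = b2 / v1
    bF += b1 / K
    bK += -b1 * v1 / K
    return y, (bF, bK, bsigma, btau)
```

A single call returns the value and all four sensitivities, which illustrates why the cost of the reverse mode does not grow with the number of inputs.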

B.4 General Graph Reduction

The computational graph of a function can be exploited to evaluate derivatives. If the

evaluation procedure of a function with given input variables and parameters is known

we can in principle build up the computational graph. Each vertex in the graph is

associated with an intermediate variable with known value. The edge connecting an

independent variable vj with a dependent variable vi is associated with the derivative

dvi/dvj. As the immediate functional relation between vj and vi is known by the

elementary operation for vi and the intermediate variable values are known we can

evaluate all derivatives between intermediate variables.

Derivatives of dependent output variables with respect to independent input vari-

ables may be evaluated by the repeated elimination of the vertices of intermediate

variables. This can be done by applying the chain rule of differentiation. For a rigorous description of that procedure we refer to [GW08]. In the following we illustrate the principles of the elimination by means of an example.

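A subgraph of the evaluation procedure for the Black’76 formula is given in Figure B.5. It describes the evaluation of

\[
v_8 = \frac{v_2}{v_4} - \frac{v_4}{2} \quad \text{and} \quad v_9 = \omega \left( \frac{v_2}{v_4} + \frac{v_4}{2} \right).
\]

With $v_5 = v_2 / v_4$ the total derivatives are given by

\[
\frac{dv_8}{dv_2} = \frac{1}{v_4}, \quad
\frac{dv_8}{dv_4} = -\frac{v_5}{v_4} - \frac{1}{2}, \quad
\frac{dv_9}{dv_2} = \frac{\omega}{v_4}, \quad
\frac{dv_9}{dv_4} = \omega \left( -\frac{v_5}{v_4} + \frac{1}{2} \right).
\]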

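Figure B.5: Subgraph of Black’76 Formula

Vertices: $v_5 = v_2 / v_4$, $v_6 = v_4 / 2$, $v_7 = v_5 + v_6$, $v_8 = v_7 - v_4$, $v_9 = \omega v_7$.

Edges:

\[
\frac{dv_5}{dv_2} = \frac{1}{v_4}, \quad
\frac{dv_5}{dv_4} = -\frac{v_5}{v_4}, \quad
\frac{dv_6}{dv_4} = \frac{1}{2}, \quad
\frac{dv_7}{dv_5} = 1, \quad
\frac{dv_7}{dv_6} = 1, \quad
\frac{dv_8}{dv_4} = -1, \quad
\frac{dv_8}{dv_7} = 1, \quad
\frac{dv_9}{dv_7} = \omega.
\]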

The derivatives of the dependent variables v8 and v9 with respect to the inde-

pendent variables v2 and v4 may be determined by successively eliminating vertices

in the graph of Figure B.5. As a first step we may eliminate the vertex of v6.

The chain rule of differentiation yields that the derivative of v7 w.r.t. v4 becomes

dv7/dv4 = dv7/dv6 · dv6/dv4. The resulting graph is illustrated in Figure B.6.

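Figure B.6: Subgraph of Black’76 Formula after Elimination of v6

Edges:

\[
\frac{dv_7}{dv_4} = \frac{1}{2} \cdot 1, \quad
\frac{dv_5}{dv_2} = \frac{1}{v_4}, \quad
\frac{dv_7}{dv_5} = 1, \quad
\frac{dv_5}{dv_4} = -\frac{v_5}{v_4}, \quad
\frac{dv_8}{dv_4} = -1, \quad
\frac{dv_8}{dv_7} = 1, \quad
\frac{dv_9}{dv_7} = \omega.
\]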

In the next step we may eliminate the vertex of v5. This introduces a new direct edge between v2 and v7. The corresponding derivative is dv7/dv2 = dv7/dv5 · dv5/dv2. Moreover, besides the existing edge between v4 and v7, the elimination of v5 introduces another connection of v4 to v7 with derivative dv7/dv5 · dv5/dv4. The value of that derivative is added to the existing derivative dv7/dv4. The resulting graph is shown in Figure B.7.

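Figure B.7: Subgraph of Black’76 Formula after Elimination of v6 and v5

Edges:

\[
\frac{dv_7}{dv_2} = \frac{1}{v_4}, \quad
\frac{dv_7}{dv_4} = \frac{1}{2} - \frac{v_5}{v_4}, \quad
\frac{dv_8}{dv_4} = -1, \quad
\frac{dv_8}{dv_7} = 1, \quad
\frac{dv_9}{dv_7} = \omega.
\]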

The last step requires the elimination of the vertex of v7. The derivatives are evaluated analogously to the preceding steps. This yields the graph illustrated in Figure B.8. All dependent variables are directly connected to all independent variables. The derivatives at the edges describe the complete Jacobian of the mapping

\[
\begin{pmatrix} v_2 \\ v_4 \end{pmatrix} \mapsto \begin{pmatrix} v_8 \\ v_9 \end{pmatrix}.
\]

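Figure B.8: Subgraph of Black’76 Formula after Elimination of v6, v5, and v7

Edges:

\[
\frac{dv_8}{dv_2} = \frac{1}{v_4}, \quad
\frac{dv_9}{dv_2} = \frac{1}{v_4} \cdot \omega, \quad
\frac{dv_8}{dv_4} = -\frac{1}{2} - \frac{v_5}{v_4}, \quad
\frac{dv_9}{dv_4} = \left( \frac{1}{2} - \frac{v_5}{v_4} \right) \cdot \omega.
\]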

Both the forward and reverse mode of AD may be described in the context of

graph elimination. The order of the vertex elimination in these cases is determined

by the selected AD mode. The additional computational effort for the derivative

evaluation depends on the order of the vertex elimination. The problem of finding the optimal order, i.e. the order which causes the minimum number of additional computations, is known as the Optimal Jacobian Accumulation problem and is NP-complete [Nau08].
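The vertex elimination illustrated in Figures B.5 to B.8 can be sketched in a few lines of Python. This is a toy illustration only and not how ADTAGEO is implemented; the dictionary layout `edges[(source, target)]` and the function names `eliminate` and `accumulate_jacobian` are assumptions made for this example.

```python
def eliminate(edges, v):
    """Eliminate vertex v; edges maps (source, target) to d target / d source."""
    preds = [(s, c) for (s, d), c in edges.items() if d == v]
    succs = [(d, c) for (s, d), c in edges.items() if s == v]
    # chain rule: connect every predecessor of v with every successor of v,
    # adding to an edge that already exists
    for s, c_in in preds:
        for d, c_out in succs:
            edges[(s, d)] = edges.get((s, d), 0.0) + c_in * c_out
    # remove all edges attached to the eliminated vertex
    for s, _ in preds:
        del edges[(s, v)]
    for d, _ in succs:
        del edges[(v, d)]
    return edges

def accumulate_jacobian(v2, v4, omega, order=(6, 5, 7)):
    """Jacobian of (v2, v4) -> (v8, v9) for the subgraph of Figure B.5."""
    v5 = v2 / v4
    edges = {
        (2, 5): 1.0 / v4, (4, 5): -v5 / v4, (4, 6): 0.5,
        (5, 7): 1.0, (6, 7): 1.0,
        (4, 8): -1.0, (7, 8): 1.0, (7, 9): omega,
    }
    for v in order:
        eliminate(edges, v)
    return edges
```

After all intermediate vertices are eliminated, only the four edges from v2 and v4 to v8 and v9 remain. Any elimination order gives the same Jacobian entries; the orders differ only in the number of multiplications and additions they need.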

For this thesis we apply the AD tools ADOL-C [GJU96] and ADTAGEO [RG09].

ADOL-C is based on operator overloading in C++. It allows the evaluation of first

and higher order derivatives with forward and reverse mode. We use particularly its

tapeless vector forward mode.

The AD tool ADTAGEO is based on graph elimination. It builds up a computa-

tional graph of independent and dependent variables. However, it does not construct

the entire computational graph. Whenever a variable is deallocated or reassigned

its vertex in the graph is immediately eliminated. Graph elimination AD tools

often require more computational effort than tools based on operator overloading or

source transformation. However, ADTAGEO’s programming interface allows easy

and flexible derivative evaluation.

Appendix C

Option Pricing via Integration

In this chapter we elaborate an alternative approach for the pricing of Bermudan

bond options in the Hull White model. The method is based on the reformulation of

the Hull White model in the time-T neutral measure. For references, see for example

[BM07]. The fundamental theorem of asset pricing yields for the price V (t, r(t)) of a

security depending on the time t and a (also time-dependent) risk factor r(t) that

V (t, r(t)) = ZCB(0; t, T, r(t)) · ET [V (T, r(T )) | F(t)] for t < T.

In this representation ZCB(0; t, T, r(t)) is the time-t price of a zero coupon bond

maturing at time T . The zero coupon price depends on the model calibration at

time-0 and the time-t state of the risk factor r(t). For the Hull White model the risk

factor is the short rate. The analytical formula for ZCB() is elaborated in Section

2.2.2.

The expectation $E^T$ conditional on the information at time t is evaluated in the time-T neutral measure. That is, the numeraire applied is the zero coupon bond maturing at time T. In our setting the price of the numeraire is given by the ZCB() formula. For the pricing of the option in this setting we require the dynamics of the short rate r(t) in the time-T neutral measure.

Provided we can evaluate $E^T[V(T, r(T)) \mid \mathcal{F}(t)]$ for a given option price or payoff at time T, we can also price Bermudan bond options. We discretise the short rate by a grid $r_0, \ldots, r_n$. Analogously to the PDE approach we start at the last exercise date $T_N$ and work backwards in time. We evaluate the auxiliary option price

\[
\tilde V(T_{N-1}, r_j) = ZCB(0; T_{N-1}, T_N, r_j) \cdot E^{T_N}\left[p_N(r(T_N)) \mid \mathcal{F}(T_{N-1})\right]
\]

for $j = 0, \ldots, n$. Here $p_N(r)$ is the Nth payoff function of the Bermudan. The option price at $T_{N-1}$ then becomes

\[
V(T_{N-1}, r_j) = \max\left\{\tilde V(T_{N-1}, r_j),\; p_{N-1}(r_j)\right\}
\]

for $j = 0, \ldots, n$. The resulting discrete points $V(T_{N-1}, r_0), \ldots, V(T_{N-1}, r_n)$ are interpolated to model the option price function $V(T_{N-1}, r)$ at time $T_{N-1}$ for intermediate short rate points r. We may proceed evaluating

\[
\begin{aligned}
\tilde V(T_{N-2}, r_j) &= ZCB(0; T_{N-2}, T_{N-1}, r_j) \cdot E^{T_{N-1}}\left[V(T_{N-1}, r(T_{N-1})) \mid \mathcal{F}(T_{N-2})\right], \\
V(T_{N-2}, r_j) &= \max\left\{\tilde V(T_{N-2}, r_j),\; p_{N-2}(r_j)\right\}
\end{aligned}
\]

for $j = 0, \ldots, n$. These steps are repeated until $V(T_1, r)$ is available. The desired price of the Bermudan option is finally determined as

\[
V(0, r(0)) = P(0, T_1) \cdot E^{T_1}\left[V(T_1, r(T_1)) \mid \mathcal{F}(0)\right].
\]
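The backward induction described above may be sketched as follows. This is a minimal illustration under several assumptions that are not part of the thesis: the callables `zcb`, `mean` and `std` are hypothetical user-supplied model functions, the interpolation is linear with flat extrapolation (`numpy.interp`), and the expectation is approximated by Gauss-Hermite quadrature instead of the adaptive scheme of Section C.3.

```python
import numpy as np

def bermudan_by_integration(r0, grid, times, payoffs, zcb, mean, std, n_quad=32):
    """times = [T_1, ..., T_N]; payoffs[k](r) is the payoff at times[k].
    zcb(t, T, r): zero coupon bond price; mean(t, T, r), std(t, T): conditional
    mean and standard deviation of r(T) given r(t) = r in the time-T measure."""
    x, w = np.polynomial.hermite.hermgauss(n_quad)  # rule for weight exp(-x^2)

    def expectation(values, t, T, r):
        # E^T[V(T, r(T)) | r(t) = r] for normally distributed r(T);
        # V(T, .) is the linear interpolant of `values` on `grid`
        u = mean(t, T, r) + np.sqrt(2.0) * std(t, T) * x
        return np.dot(w, np.interp(u, grid, values)) / np.sqrt(np.pi)

    values = payoffs[-1](grid)                       # V(T_N, r_j) = p_N(r_j)
    for k in range(len(times) - 2, -1, -1):
        t, T = times[k], times[k + 1]
        cont = np.array([zcb(t, T, r) * expectation(values, t, T, r) for r in grid])
        values = np.maximum(cont, payoffs[k](grid))  # exercise decision at T_k
    # final discounting from T_1 to time 0
    return zcb(0.0, times[0], r0) * expectation(values, 0.0, times[0], r0)
```

As a plausibility check, a payoff that equals one at every exercise date and non-negative rates give a price equal to the discount factor to the first exercise date, since immediate exercise is then always optimal.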

In the following sections we elaborate the dynamics of the short rate in the Hull

White model. We start by describing the risk neutral dynamics since the model is

initially formulated in this setting. By changing the measure we continue elaborating

the dynamics in the time-T neutral measure. Finally we describe the numerical

evaluation of the expectation.

C.1 Risk Neutral Dynamics

The Hull White model in Section 2.2 yields the short rate process

\[
r(T) = e^{-a(T-t)} \left[ r(t) + \int_t^T e^{a(u-t)} \left( \theta(u)\,du + \sigma(u)\,dW(u) \right) \right]
\]

for $t < T$. Here $W(t)$ is a Brownian motion under the risk neutral measure. The numeraire under the risk neutral measure is the generic bank account, which accrues at the short rate. The price of the bank account $Q(t)$ is given by

\[
Q(t) = \exp\left\{ \int_0^t r(u)\,du \right\}.
\]

We find immediately that $r(T)$ conditional on the information at time t is normally distributed. The variance is given by

\[
Var\left[r(T) \mid \mathcal{F}(t)\right] = \int_t^T e^{-2a(T-u)} \sigma^2(u)\,du.
\]

For piecewise constant volatility functions $\sigma(t)$ the variance can be evaluated analytically. The expectation becomes

\[
E^Q\left[r(T) \mid \mathcal{F}(t)\right] = e^{-a(T-t)} r(t) + \int_t^T e^{-a(T-u)} \theta(u)\,du.
\]

Our model is calibrated to the observed yield curve at time 0. Hence the drift $\theta(u)$ is

\[
\theta(u) = \frac{\partial f(0,u)}{\partial u} + a f(0,u) + \int_0^u e^{-2a(u-s)} \sigma^2(s)\,ds.
\]

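Consequently, we have to evaluate

\[
\int_t^T e^{-a(T-u)} \theta(u)\,du
= \underbrace{\int_t^T e^{-a(T-u)} \left[ \frac{\partial f(0,u)}{\partial u} + a f(0,u) \right] du}_{I_1(t,T)}
+ \underbrace{\int_t^T e^{-a(T-u)} \int_0^u e^{-2a(u-s)} \sigma^2(s)\,ds\,du}_{I_2(t,T)}.
\]

The first integral $I_1(t,T)$ may be easily solved as

\[
I_1(t,T) = \left[ e^{-a(T-u)} f(0,u) \right]_{u=t}^{T} = f(0,T) - e^{-a(T-t)} f(0,t).
\]

The second integral is split again as

\[
I_2(t,T) = \underbrace{\int_t^T e^{-a(T-u)} \int_0^t e^{-2a(u-s)} \sigma^2(s)\,ds\,du}_{I_3(t,T)}
+ \underbrace{\int_t^T e^{-a(T-u)} \int_t^u e^{-2a(u-s)} \sigma^2(s)\,ds\,du}_{I_4(t,T)}.
\]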

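Now we can change the order of integration. For the integral $I_3(t,T)$ we get

\[
\begin{aligned}
I_3(t,T) &= \int_0^t \int_t^T e^{-a(T-u)} e^{-2a(u-s)} \sigma^2(s)\,du\,ds \\
&= \int_0^t \sigma^2(s) \int_t^T e^{-a(T-u)-2a(u-s)}\,du\,ds \\
&= \int_0^t \sigma^2(s) \left[ \frac{e^{-a(T-u)-2a(u-s)}}{-a} \right]_{u=t}^{T} ds \\
&= \int_0^t \frac{\sigma^2(s)}{a} \left[ e^{-a(T-t)} e^{-2a(t-s)} - e^{-2a(T-s)} \right] ds.
\end{aligned}
\]

The integral $I_4(t,T)$ becomes

\[
\begin{aligned}
I_4(t,T) &= \int_t^T \int_s^T e^{-a(T-u)} e^{-2a(u-s)} \sigma^2(s)\,du\,ds \\
&= \int_t^T \sigma^2(s) \int_s^T e^{-a(T-u)-2a(u-s)}\,du\,ds \\
&= \int_t^T \sigma^2(s) \left[ \frac{e^{-a(T-u)-2a(u-s)}}{-a} \right]_{u=s}^{T} ds \\
&= \int_t^T \frac{\sigma^2(s)}{a} \left[ e^{-a(T-s)} - e^{-2a(T-s)} \right] ds.
\end{aligned}
\]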

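As a result we end up with

\[
\begin{aligned}
E^Q\left[r(T) \mid \mathcal{F}(t)\right] = {} & f(0,T) + e^{-a(T-t)} \left[ r(t) - f(0,t) \right] \\
& + \int_0^t \frac{\sigma^2(s)}{a} \left[ e^{-a(T-t)} e^{-2a(t-s)} - e^{-2a(T-s)} \right] ds \\
& + \int_t^T \frac{\sigma^2(s)}{a} \left[ e^{-a(T-s)} - e^{-2a(T-s)} \right] ds.
\end{aligned}
\]

The expectation $E^Q[r(T) \mid \mathcal{F}(t)]$ and the variance $Var[r(T) \mid \mathcal{F}(t)]$ completely describe the normally distributed short rate $r(T)$.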
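For a piecewise constant volatility the variance integral of this section reduces to a sum of closed-form terms, one per volatility interval. The following Python sketch illustrates this; the function name `short_rate_variance` and the representation of the volatility by the lists `knots` and `sigmas` are assumptions made for this example, and a non-zero mean reversion `a` is assumed.

```python
import math

def short_rate_variance(a, t, T, knots, sigmas):
    """Var[r(T) | F(t)] = int_t^T exp(-2a(T-u)) sigma(u)^2 du, where
    sigma(u) = sigmas[k] for knots[k] <= u < knots[k+1] and a != 0."""
    var = 0.0
    for k, sigma in enumerate(sigmas):
        lo, hi = max(t, knots[k]), min(T, knots[k + 1])
        if lo < hi:
            # closed-form integral of exp(-2a(T-u)) over [lo, hi]
            var += sigma ** 2 * (
                math.exp(-2.0 * a * (T - hi)) - math.exp(-2.0 * a * (T - lo))
            ) / (2.0 * a)
    return var
```

For a single interval with constant volatility the sum collapses to the familiar expression sigma^2 (1 - exp(-2a(T-t))) / (2a).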

C.2 Forward Neutral Dynamics

The dynamics of the short rate in the time-T neutral measure are derived by comparing the dynamics of the bank account with the dynamics of the zero coupon bond maturing at time T. The dynamics of the bank account are given by

dQ(t) = r(t)Q(t)dt.

In the Hull White model the price of a zero coupon bond is given by

\[
ZCB(t,T,r(t)) = \exp\left\{ -\left( B(t,T) r(t) + \int_t^T \theta(u) B(u,T)\,du - \frac{1}{2} \int_t^T \sigma^2(u) B^2(u,T)\,du \right) \right\}
\]

with $B(t,T) = \left[1 - e^{-a(T-t)}\right] / a$. Applying Ito’s lemma yields the zero coupon bond dynamics

\[
\frac{dZCB(t,T,r(t))}{ZCB(t,T,r(t))} = \left[ -B_t(t,T) r(t) + \theta(t) B(t,T) - \frac{1}{2} \sigma^2(t) B^2(t,T) \right] dt - B(t,T)\,dr(t) + \frac{1}{2} B^2(t,T) \left[dr(t)\right]^2.
\]

With $[dr(t)]^2 = \sigma^2(t)\,dt$ and $B_t(t,T) = aB(t,T) - 1$ it follows that

\[
\begin{aligned}
\frac{dZCB(t,T,r(t))}{ZCB(t,T,r(t))} &= \left[ -B_t(t,T) r(t) + \theta(t) B(t,T) \right] dt - B(t,T)\,dr(t) \\
&= \left[ (1 - aB(t,T)) r(t) + \theta(t) B(t,T) \right] dt - B(t,T)\,dr(t).
\end{aligned}
\]

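We choose $ZCB(t,T,r(t))$ as the numeraire. The fundamental theorem of asset pricing yields that the price process of any asset in our economy discounted with that numeraire is a martingale. In particular, $Q(t)/ZCB(t,T,r(t))$ must be a martingale. We get

\[
d\left( \frac{Q(t)}{ZCB(t,\cdot)} \right) = \frac{Q(t)}{ZCB(t,\cdot)} \left[ \frac{dQ(t)}{Q(t)} - \frac{dZCB(t,\cdot)}{ZCB(t,\cdot)} + \left( \frac{dZCB(t,\cdot)}{ZCB(t,\cdot)} \right)^2 \right].
\]

Substituting the processes for $Q(t)$ and $ZCB(t,T,r(t))$ yields

\[
\begin{aligned}
d\left( \frac{Q(t)}{ZCB(t,\cdot)} \right) &= \frac{Q(t)}{ZCB(t,\cdot)} \left[ r(t)\,dt - (1 - aB(t,T)) r(t)\,dt - \theta(t) B(t,T)\,dt + B(t,T)\,dr(t) + B(t,T)^2 \sigma^2(t)\,dt \right] \\
&= \frac{Q(t)}{ZCB(t,\cdot)} B(t,T) \left[ dr(t) - \left( \theta(t) - B(t,T) \sigma^2(t) - a r(t) \right) dt \right].
\end{aligned}
\]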

The martingale condition of $Q(t)/ZCB(t,T,r(t))$ and the fact that the variance of $r(t)$ is invariant with respect to a change of measure yield that

\[
dr(t) - \left[ \theta(t) - B(t,T) \sigma^2(t) - a r(t) \right] dt = \sigma(t)\,dW^T(t).
\]

The process $W^T(t)$ is a Brownian motion in the time-T forward measure with $ZCB(t,T,r(t))$ as numeraire. Consequently, the dynamics of the Hull White model in that forward measure are

\[
dr(t) = \left[ \theta(t) - B(t,T) \sigma^2(t) - a r(t) \right] dt + \sigma(t)\,dW^T(t).
\]

Thus the change of measure yields the additional drift term $-B(t,T) \sigma^2(t)\,dt$ in the time-T forward measure compared to the risk neutral measure.

Using the results in Section C.1 for the expectation of $r(T)$ in the risk neutral measure we get that the expectation $E^T$ in the forward neutral measure becomes

\[
E^T\left[r(T) \mid \mathcal{F}(t)\right] = E^Q\left[r(T) \mid \mathcal{F}(t)\right] - \int_t^T \sigma^2(u) e^{-a(T-u)} B(u,T)\,du.
\]

Moreover, we find that

\[
\int_t^T \sigma^2(u) e^{-a(T-u)} B(u,T)\,du = \int_t^T \frac{\sigma^2(u)}{a} \left[ e^{-a(T-u)} - e^{-2a(T-u)} \right] du = I_4(t,T).
\]

The integral term $I_4(t,T)$ already appeared in the risk neutral expectation in Section C.1. As a result we get the forward neutral expectation as

\[
E^T\left[r(T) \mid \mathcal{F}(t)\right] = f(0,T) + e^{-a(T-t)} \left[ r(t) - f(0,t) \right] + \int_0^t \frac{\sigma^2(s)}{a} \left[ e^{-a(T-t)} e^{-2a(t-s)} - e^{-2a(T-s)} \right] ds.
\]

In particular, we see that the expectation depends on the volatility only from time 0

to time t. The volatility between t and T only occurs in the variance term.
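As a worked special case, consider a constant volatility $\sigma(s) \equiv \sigma$; this simplification is an assumption made only for illustration. The remaining integral in the forward neutral expectation can then be evaluated in closed form:

```latex
\begin{aligned}
\int_0^t \frac{\sigma^2}{a} \left[ e^{-a(T-t)} e^{-2a(t-s)} - e^{-2a(T-s)} \right] ds
&= \frac{\sigma^2}{2a^2} \left[ e^{-a(T-t)} \left(1 - e^{-2at}\right) - e^{-2a(T-t)} \left(1 - e^{-2at}\right) \right] \\
&= \frac{\sigma^2}{2a^2} \left(1 - e^{-2at}\right) \left( e^{-a(T-t)} - e^{-2a(T-t)} \right),
\end{aligned}
\qquad\text{so that}\qquad
E^T\left[ r(T) \mid \mathcal{F}(t) \right]
= f(0,T) + e^{-a(T-t)} \left[ r(t) - f(0,t) \right]
+ \frac{\sigma^2}{2a^2} \left(1 - e^{-2at}\right) \left( e^{-a(T-t)} - e^{-2a(T-t)} \right).
```

For $t = 0$ the volatility term vanishes and the forward neutral expectation of $r(T)$ equals the forward rate $f(0,T)$.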

C.3 Solving the Integral

At the beginning of this chapter we showed that an option price in the Hull White model can be evaluated by

\[
V(t, r(t)) = ZCB(0; t, T, r(t)) \cdot E^T\left[V(T, r(T)) \mid \mathcal{F}(t)\right] \quad \text{for } t < T.
\]

In this section we focus on the numerical evaluation of $E^T[V(T, r(T)) \mid \mathcal{F}(t)]$. The risk factor (i.e. the short rate) is normally distributed with mean $\mu = E^T[r(T) \mid \mathcal{F}(t)]$ and variance $\sigma^2 = Var[r(T) \mid \mathcal{F}(t)]$. The expressions for $\mu$ and $\sigma^2$ are derived in the preceding sections.

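The distributional properties of $r(T)$ yield that

\[
E^T\left[V(T, r(T)) \mid \mathcal{F}(t)\right] = \frac{1}{\sqrt{2\pi\sigma^2}} \int_{-\infty}^{+\infty} \underbrace{V(T,u) \exp\left\{ -\frac{(u-\mu)^2}{2\sigma^2} \right\}}_{q(u)} du.
\]

The integration is split into two parts as

\[
E^T\left[V(T, r(T)) \mid \mathcal{F}(t)\right] = \frac{1}{\sqrt{2\pi\sigma^2}} \left[ \int_{-\infty}^{\mu} q(u)\,du + \int_{\mu}^{+\infty} q(u)\,du \right].
\]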

For the numerical integration of $\int_{\mu}^{+\infty} q(u)\,du$ we specify a tolerance tol and use an adaptive discretisation scheme. We set $u_0 = \mu$ and guess an initial discretisation width $h_0$. In each step $i = 0, 1, \ldots$ the numerical integral from $u_i$ to $u_i + h_i$ is evaluated by the trapezoidal rule and Simpson’s rule, i.e.

\[
\begin{aligned}
I_i^{Trap} &= \frac{h_i}{2} \left( q(u_i) + q(u_i + h_i) \right), \\
I_i^{Simp} &= \frac{h_i}{6} \left( q(u_i) + 4 q(u_i + h_i/2) + q(u_i + h_i) \right).
\end{aligned}
\]

Provided $q(u)$ is sufficiently smooth for $u \in [u_i, u_i + h_i]$ we get the approximations

\[
\int_{u_i}^{u_i + h_i} q(u)\,du = I_i^{Trap} - \frac{h_i^3}{12} q''(\xi) = I_i^{Simp} - \frac{h_i^5}{2880} q^{(4)}(\eta)
\]

for $\xi, \eta \in [u_i, u_i + h_i]$. The error of the numerical integration $err_i$ is estimated by

\[
err_i = \left| I_i^{Trap} - I_i^{Simp} \right|.
\]

The step i is accepted if

\[
err_i \le h_i \cdot tol.
\]

If the step is accepted we set $u_{i+1} = u_i + h_i$. The new step size $h_{i+1}$ is determined by

\[
h_{i+1} = h_i \cdot \min\left\{ \sqrt{\frac{tol}{err_i + \varepsilon}},\; 2 \right\}.
\]

The constant $\varepsilon = 10^{-32}$ is incorporated to avoid division by zero. If the integration step is not accepted, i.e. $err_i > h_i \cdot tol$, then we set $h_i \leftarrow h_i / 2$ and repeat step i until acceptance.

The numerical integration is repeated for $i = 0, 1, \ldots, n$ until the residual part $\int_{u_{n+1}}^{+\infty} q(u)\,du$ is sufficiently small. Unfortunately, for general payoffs this residual cannot be evaluated exactly. We use a heuristic argument and integrate until

{[1− Φ−1

(un+1 − µ

)], V (un+1)

[1− Φ−1

(un+1 − µ

)]}≤ tol2.

Finally the numerical integral is summed up as

\[
\int_{\mu}^{+\infty} q(u)\,du \approx \sum_{i=0}^{n} I_i^{Simp}.
\]

The second integral $\int_{-\infty}^{\mu} q(u)\,du$ is treated analogously. As a result we end up with an approximation of $E^T[V(T, r(T)) \mid \mathcal{F}(t)]$ whose error is considered to be bounded by tol.
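The adaptive scheme of this section can be sketched in Python as follows. The sketch is an illustration under stated assumptions and not the implementation used for the results in this thesis: the function name `adaptive_integral` is made up, the step size update uses the factor min{sqrt(tol / (err + eps)), 2} as read here, and the heuristic stopping rule is replaced by a fixed upper integration limit `u_end`.

```python
import math

def adaptive_integral(q, u_start, u_end, tol=1e-6, h0=0.1, eps=1e-32):
    """Integrate q over [u_start, u_end] with adaptive trapezoid/Simpson steps."""
    total, u, h = 0.0, u_start, h0
    while u < u_end:
        h = min(h, u_end - u)                 # do not step beyond the upper limit
        q0, qm, q1 = q(u), q(u + 0.5 * h), q(u + h)
        trap = 0.5 * h * (q0 + q1)
        simp = h / 6.0 * (q0 + 4.0 * qm + q1)
        err = abs(trap - simp)                # error estimate of the step
        if err <= h * tol:
            # accept the step, keep the Simpson value, enlarge the step size
            total += simp
            u = u_end if u + h >= u_end else u + h
            h *= min(math.sqrt(tol / (err + eps)), 2.0)
        else:
            h *= 0.5                          # reject the step and retry with half the width
    return total
```

For example, with $V \equiv 1$, $\mu = 0$ and $\sigma = 1$ the integrand is `q = lambda u: math.exp(-0.5 * u * u)`, and `adaptive_integral(q, 0.0, 10.0)` approximates one half of $\sqrt{2\pi}$; the lower half is obtained in the same way by integrating `lambda u: q(-u)`.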

Figure C.1: Asymptotic approximation of PDE solution of reference European swaptions.

[Four panels: Exercise Date 1, Exercise Date 10, Exercise Date 20 and Exercise Date 30. Horizontal axis: number of spatial grid points n (201, 501, 1001). Each panel includes the reference line 10/n².]

Figure C.2: Asymptotic approximation of average sensitivities of reference European swaptions

[Panels recovered: Exercise Date 20 and Exercise Date 30. Vertical axis: relative error of sensitivity. Grid sizes shown: 101, 201, 501, 1001. The panels include the reference line 10/n².]

References

[AS09] M. Alexe and A. Sandu. On the discrete adjoints of adaptive time stepping algorithms. Journal of Computational and Applied Mathematics, 233:1005–1020, 2009.

[BM07] D. Brigo and F. Mercurio. Interest Rate Models - Theory and Practice. Springer-Verlag, 2007.

[Chr94] B. Christianson. Reverse accumulation and attractive fixed points. Optimization Methods and Software, 3:311–326, 1994.

[Duf06] D.J. Duffy. Finite Difference Methods in Financial Engineering. John Wiley & Sons Ltd, 2006.

[EB99] P. Eberhard and C. Bischof. Automatic differentiation of numerical integration algorithms. Mathematics of Computation, 68:717–731, 1999.

[GC06] M.B. Giles and R. Carter. Convergence analysis of Crank-Nicolson and Rannacher time-marching. Journal of Computational Finance, 9:89–112, 2006.

[GJU96] A. Griewank, D. Juedes, and J. Utke. ADOL-C: A package for automatic differentiation of algorithms written in C/C++. ACM Transactions on Mathematical Software, 22:131–167, 1996.

[GR94] Chr. Großmann and H.-G. Roos. Numerik partieller Differentialgleichungen. Teubner, 2nd edition, 1994.

[Gri86] A. Griewank. The "global" convergence of Broyden-like methods with a suitable line search. J. Austral. Math. Soc. Ser. B, 28:75–92, 1986.

[GW08] A. Griewank and A. Walther. Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiation. SIAM, 2nd edition, 2008.

[Hag] P.S. Hagan. Evaluating and Hedging Exotic Swap Instruments via LGM. Bloomberg Technical Report.

[Hen04] M. Henrard. Semi-explicit delta and gamma for European swaptions in Hull-White one-factor model. (Ewp-fin 0411036), 2004.

[HKLW02] P.S. Hagan, D. Kumar, A.S. Lesniewski, and D.E. Woodward. Managing smile risk. Wilmott magazine, September:84–108, 2002.

[HNW93] E. Hairer, S.P. Norsett, and G. Wanner. Solving Ordinary Differential Equations I. Springer, 1993.

[HV03] W. Hundsdorfer and J.G. Verwer. Numerical Solution of Time-Dependent Advection-Diffusion-Reaction Equations. Springer, 2003.

[HW90] J.C. Hull and A. White. Pricing interest-rate-derivative securities. The Review of Financial Studies, 3:573–592, 1990.

[HW91] E. Hairer and G. Wanner. Solving Ordinary Differential Equations II. Springer, 1991.

[HW06] P.S. Hagan and G. West. Interpolation methods for curve construction. Applied Mathematical Finance, 13:89–129, 2006.

[Jam89] F. Jamshidian. An exact bond option pricing formula. The Journal of Finance, 44:205–209, 1989.

[Nau08] U. Naumann. Optimal Jacobian accumulation is NP-complete. Mathematical Programming, 112:427–441, April 2008.

[NW06] J. Nocedal and S.J. Wright. Numerical Optimization. Springer-Verlag, 2006.

[Pit04] V.V. Piterbarg. Risk sensitivities of Bermudan swaptions. International Journal of Theoretical and Applied Finance, 7:465–510, 2004.

[RG09] J. Riehme and A. Griewank. Algorithmic differentiation through automatic graph elimination ordering (ADTAGEO). In U. Naumann, O. Schenk, H.D. Simon, and S. Toledo, editors, Combinatorial Scientific Computing, number 09061 in Dagstuhl Seminar Proceedings. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, Germany, 2009.

[RMW09] R. Rebonato, K. McKay, and R. White. The SABR/LIBOR Market Model: Pricing, Calibration and Hedging for Complex Interest-Rate Derivatives. John Wiley & Sons Ltd, 2009.

[Shr04] S. Shreve. Stochastic Calculus for Finance II - Continuous-Time Models. Springer-Verlag, 2004.

[SW10] P. Stumm and A. Walther. New algorithms for optimal online checkpointing. SIAM Journal on Scientific Computing, 32:836–854, 2010.

[WFV04] H. Windcliff, P.A. Forsyth, and K.R. Vetzal. Analysis of the stability of the linear boundary condition for the Black-Scholes equation. J. Computational Finance, 8:65–92, 2004.
