University of Rome II “Tor Vergata”
Faculty of Engineering
Master of Science in Models and Systems Engineering
Master’s Thesis
Affine Models in Credit Risk
Student
Vincenzo Ferrazzano
Advisors
Prof. Claudia Klüppelberg
Technische Universität München
Prof. Benedetto Scoppola
University of Rome II “Tor Vergata”
Academic year 2007-2008
In loving memory of two great persons:
To my grandfather Vincenzo, the very first engineer of my family;
I bear his name with pride.
To Prof. Roberta DalPasso, who showed me the rewards of hard work.
Martyrdom, sir, is what these people like: it is the only way
in which a man can become famous without ability.
George Bernard Shaw, “The Devil’s Disciple, Act II”.
Shut up and calculate!
Attributed to Richard Feynman.
Contents

Introduction

1 A Primer in Credit Risk
  1.1 Bonds
  1.2 Term structure
  1.3 Credit rating
  1.4 Historical data of default
  1.5 Recovery rate
  1.6 Netting
  1.7 Credit Default Swaps
  1.8 Credit spread options

2 Intensity-Based Modelling of Default
  2.1 Counting processes
  2.2 Poisson processes
  2.3 Doubly stochastic processes
  2.4 Risk-neutral probability
  2.5 Useful results
    2.5.1 Survival analysis
    2.5.2 Correlated jumps

3 Affine processes and transforms
  3.1 Affine processes
  3.2 First examples of affine processes
  3.3 Extending the transform
    3.3.1 Extended transform
    3.3.2 Fourier transform inversion
    3.3.3 Fourier representation
    3.3.4 Time dependence and multiple jump types
  3.4 An optimization idea
  3.5 A more general result
    3.5.1 Diagonal diffusion matrix
  3.6 More examples
    3.6.1 CIR with jumps
    3.6.2 Bates model
  3.7 Further considerations and references
    3.7.1 Infinite activity vs. finite activity
    3.7.2 Statistical estimation

4 Numerical Methods
  4.1 Runge-Kutta methods
    4.1.1 Facts on ODEs
    4.1.2 Remarks on Generalized Riccati Equations
    4.1.3 Analysis of one-step methods
    4.1.4 Runge-Kutta methods
    4.1.5 Derivation of an explicit RK method
    4.1.6 Global error
    4.1.7 On higher-order RK methods
    4.1.8 Step adaptivity
    4.1.9 Our choice: the Dormand-Prince method
  4.2 Numerical integration
    4.2.1 One-dimensional integration
    4.2.2 Step adaptivity
    4.2.3 Domain transformation
    4.2.4 Multidimensional integrals
  4.3 Main sources and further readings

5 Applications
  5.1 Defaultable claims
    5.1.1 No recovery
    5.1.2 Claims with recovery
    5.1.3 Unpredictable default recovery
    5.1.4 Fractional loss of value on default
    5.1.5 Netting
  5.2 Credit derivatives
    5.2.1 Credit spread options
    5.2.2 Credit Default Swaps
  5.3 A multiname model
    5.3.1 Pricing a CDS

6 A numerical example

A Measure Theory
  A.1 Stieltjes-Lebesgue integration
  A.2 Lebesgue measure theorems

B Stochastic processes
  B.1 Definitions and basic results
  B.2 Lévy Processes
    B.2.1 Compound Poisson process
  B.3 Infinitesimal generator of a Markov process

C Risk-neutral Valuation
  C.1 The market
  C.2 Risk-neutral Valuation

D Numerical Codes
List of Tables

1.1 Cumulative probabilities of default (%). Source: Moody's (1970-2003)
1.2 Recovery rates on corporate bonds. Source: Moody's (1982-2003)
6.1 Parameters for the Bates model.
List of Figures

3.1 Solution of the coefficients α and β for the Vasiček model, with T = 10, σ = 0.012, k = 0.05, γ = 0.03.
4.1 How to use the local truncation error to estimate the global error.
4.2 Example of the evaluation algorithm, to get a 2^3 = 8 interval integration without evaluating the same point twice.
4.3 The change of variable y = sinh((π/2) sinh x) (lower curve) and its derivative (upper curve).
6.1 Local error tolerance: 10^-1, evaluations needed: 37; the algorithm shows a numerical instability.
6.2 Local error tolerance: 10^-2, evaluations needed: 42; no instability.
6.3 Local error tolerance: 10^-5, evaluations needed: 127; perfect matching.
6.4 Local error tolerance: 10^-5, evaluations needed: 127; perfect matching. The price increases with maturity, i.e. it is convenient to perform a long-run investment.
6.5 Local error tolerance: 10^-5, evaluations needed: 127; l_Y = l_λ = l_r = 0; the price still grows, but at a slower pace.
6.6 Local error tolerance: 10^-5, evaluations needed: 127; γ_λ = 0.50. This asset is very likely to default in the future, so the price drops.
6.7 Local error tolerance: 10^-5, evaluations needed: 127; λ_0 = 0.80, δ = 2, µ_S = 0.85. This asset is very likely to default in the near future, so the price drops, but the mean reversion makes the price rise in the long run.
6.8 Local error tolerance: 10^-5, evaluations needed: 127. Upper picture: discounted price of the asset; middle picture: price of the asset without discount; bottom picture: ratio.
6.9 Local error tolerance: 10^-5, evaluations needed: 127; λ_0 = 0.50, l_λ = 0.15. Upper picture: discounted price of the asset; middle picture: price of the asset without discount; bottom picture: ratio.
Acknowledgments
This thesis is based upon studies conducted from March 2008 to September 2008 at
the Chair of Statistical Mathematics, Department of Mathematics, Technische Universität
München, Germany.
First and foremost I would like to thank Professor Doktor Claudia Klüppelberg for all
the support, the advice and the kind hospitality. Her experience and kindness made this
thesis possible.
In addition I am very grateful to Peter Hepperger for all the advice and corrections
on numerics and for proofreading my manuscript; special thanks go to Frau Grant for
helping me settle into a fruitful life in Munich.
Last but not least, I sincerely thank my parents for their love and support. I am very
grateful that their confidence and encouragement allowed me to enjoy so many opportu-
nities.
Introduction
Risk is, in its broadest meaning, the possibility that something goes worse than expected,
involving a loss or a missed gain. It seems clear that this concept is indissolubly tied to
the more abstract and mathematical ones of probability and uncertainty. In finance there
are many different risks to be taken into account:
• Market risk;
• Liquidity risk;
• Currency risk;
• Interest rate risk;
• Credit risk;
• and so on. . .
This thesis is focused on credit risk, which is the risk related to the probability that one
of the counterparties of a contract does not honor its obligations, e.g., the contractor of a
loan is not able to give the money back at the arranged date or to pay interest. This event
is usually called a credit event, and the counterparty is said to be in default.
Thinking about risk usually implies both losses and gains, since in everyday life,
the more we risk, the more we can perhaps gain. That is in general not true for credit
risk: if everything goes smoothly, an investor gets the promised amount; otherwise,
if something goes bad, the investor loses an amount of money, depending on the nature
of the contract he signed. So, to counterbalance the risk, an investor has to buy some
form of protection and to act on the interest rate, asking for a discount. In the light of
those considerations and of the recent subprime mortgage crisis, we can understand why
financial institutions actively seek (and sell) protection via credit derivatives like Credit
Default Swaps (CDS) and other derivatives.
A central problem in Mathematical Finance is to price products, and the question we
have in mind is: “what is a fair price for that protection?”.
It is well known that the main tool used in practice to price derivatives is the Monte
Carlo Method (MCM), and it is just as well known that such a method is slow, inaccurate
and in general inefficient for low-dimensional problems. On the other hand, we often have
to deal with problems depending on few econometric variables for the sake of calibration
effort, and a prompt and fast solution would be a desirable feature for practitioners.
Those drawbacks can be addressed using an affine framework: the class of affine pro-
cesses is composed of n-dimensional stochastic processes (X_t)_{t≥0} characterized by an
“affine” generator. Let us fix a complete probability space (Ω, F, P), a filtration (F_t)_{t≥0}
satisfying some usual conditions (cf. Section B.2), and let us denote the conditional ex-
pectation E[·|F_t] =: E_t[·]. The affine processes' “core” feature is the following result:

E_t[ e^{∫_t^T R(X_s) ds + v·X_T} ] = e^{α + β·X_t},  0 ≤ t ≤ T  (I.1)

with R(X_t) = ρ_0 + ρ_1·X_t, denoting with · the usual inner product in R^n. The quantities
α and β can be explicitly evaluated by solving n + 1 ordinary differential equations (ODEs),
called generalized Riccati equations (GREs). Solving ODEs instead of using the MCM pro-
vides some straightforward useful features:

• The solution of ODEs is one of the most studied numerical problems, with many
established results and well-tested algorithms and routines.

• Algorithms to solve an ODE can be very fast, very accurate and often both, if the
RHS of the ODE is sufficiently well-behaved (and that can happen in our case).
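To make the ODE route concrete, here is a minimal sketch assuming Vasiček short-rate dynamics dr_t = k(γ − r_t)dt + σ dW_t and the bond-price convention P(t, T) = e^{α(τ)+β(τ)r_t} with τ = T − t, for which the GREs reduce to β′ = −1 − kβ and α′ = kγβ + σ²β²/2 with α(0) = β(0) = 0 (a standard textbook reduction, not code from this thesis's Appendix D); a hand-rolled classical Runge-Kutta step is enough for the sketch, while Chapter 4 motivates the adaptive Dormand-Prince method:

```python
# Sketch: pricing a Vasicek zero-coupon bond by solving the generalized
# Riccati equations with a hand-rolled classical Runge-Kutta (RK4) loop.
# Assumed conventions (not taken verbatim from the thesis):
#   dr_t = k*(gamma - r_t) dt + sigma dW_t,  P(t,T) = exp(alpha(tau) + beta(tau)*r_t)
# with tau = T - t and ODEs (in tau):
#   beta'  = -1 - k*beta,                          beta(0)  = 0
#   alpha' = k*gamma*beta + 0.5*sigma**2*beta**2,  alpha(0) = 0
import math

def vasicek_riccati(tau, k=0.05, gamma=0.03, sigma=0.012, n_steps=1000):
    """Integrate the two Riccati ODEs from 0 to tau with classical RK4."""
    def rhs(state):
        alpha, beta = state
        return (k * gamma * beta + 0.5 * sigma**2 * beta**2, -1.0 - k * beta)

    h = tau / n_steps
    alpha, beta = 0.0, 0.0
    for _ in range(n_steps):
        k1 = rhs((alpha, beta))
        k2 = rhs((alpha + 0.5 * h * k1[0], beta + 0.5 * h * k1[1]))
        k3 = rhs((alpha + 0.5 * h * k2[0], beta + 0.5 * h * k2[1]))
        k4 = rhs((alpha + h * k3[0], beta + h * k3[1]))
        alpha += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        beta += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return alpha, beta

# Sanity check against the known closed form beta(tau) = -(1 - exp(-k*tau))/k
tau, k = 10.0, 0.05
alpha, beta = vasicek_riccati(tau, k=k)
beta_exact = -(1.0 - math.exp(-k * tau)) / k
assert abs(beta - beta_exact) < 1e-10
price = math.exp(alpha + beta * 0.03)  # bond price for a current short rate of 3%
```

The parameter values mirror those quoted for Figure 3.1; the closed-form check on β illustrates the accuracy claim made above.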
Why affine processes and credit risk? To begin with, let us take a look at the pricing
problem. As customary in mathematical finance, the price S_t of a financial product at
time t is computed via the formula

S_t = E_t[ e^{−∫_t^T r_s ds} f(. . . , T) ],  0 ≤ t ≤ T  (I.2)

where (r_t)_{t≥0} is the short rate process and f is the payoff of the considered product. The
short rate is defined as the risk-free rate of return of an investment over an infinitesimal
period. Just compare the previous equation with (I.1): if we let r and f depend on an
affine process (X_t)_{t≥0} in an affine and exponential-affine fashion, we can easily solve
the evaluation problem. The meaning of those dependences will be made clear in Chapter 3.
In the second place, we recall that in a credit risk setting, the payoff function is a
contingent claim of the form

f(. . . , t) = F 1_{t<τ} + R 1_{t≥τ}

where τ is the time of the credit event (default time), F is the promised amount and R
is the recovery, if any, in case of default. The quantities F and R are in general
F_T-measurable random variables.
For the sake of simplicity, let us suppose that there is no recovery. Then, if we let all
the default-related quantities depend on an exogenous parameter λ_t, called the default
intensity, we can show, under some technical conditions, that we can rewrite the price as

E_t[ e^{−∫_t^T r_s ds} F 1_{T<τ} ] = E_t[ e^{−∫_t^T (r_s+λ_s) ds} F ],  0 ≤ t ≤ T.  (I.3)

Intuitively, the parameter λ_t acts as a risk premium, correcting the interest rate to
compensate the risk associated with default.
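For constant r and λ, (I.3) collapses to S_t = e^{−(r+λ)(T−t)}F: the intensity simply adds to the discount rate. A minimal numerical illustration (the parameter values are invented for the example):

```python
# Constant-parameter illustration of (I.3): the default intensity lambda
# acts as an extra discount rate applied to the promised amount F.
import math

r, lam = 0.04, 0.02       # short rate and default intensity (made-up values)
F, T = 1.0, 5.0           # promised amount and maturity, priced at t = 0

riskless = F * math.exp(-r * T)              # price without default risk
defaultable = F * math.exp(-(r + lam) * T)   # price with default intensity

# The ratio of the two prices is exactly the survival factor exp(-lam*T)
assert abs(defaultable / riskless - math.exp(-lam * T)) < 1e-12
```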
If my four (or even fewer) readers are patient enough, I will explain all the relations
presented, showing applications and numerical results throughout the chapters of this work.
Chapter 1
A Primer in Credit Risk
This chapter is meant to be an introduction to some concepts of credit risk
and credit-related products. We refer the interested reader to [34].
1.1 Bonds
A bond is a debt security, in which the authorized issuer owes the holders a debt
and is obliged to repay the principal and the interest (the coupon) at a later date, called
the maturity.
A bond is simply a loan in the form of a security with different terminology: the
issuer is equivalent to the borrower, the bond holder to the lender, and the coupon to the
interest. Bonds enable the issuer to finance long-term investments with external funds.
A bond can be issued by a firm (corporate bond) or by national entities, both
in local currency (government bond) and in foreign currency (sovereign bond). Usually
government bonds are considered risk-free1, but other kinds of risk can occur, since that
form of investment depends both on inflation (for domestic investors) and on currency
exchange rates.
Bonds and stocks are both securities, but the major difference between the two is that
stock-holders are the owners of the company (i.e., they have an equity stake), whereas
bond-holders are lenders to the issuing company. Another difference is that bonds usually
have a defined term, or maturity, after which the bond is redeemed.
To define a bond we have to specify some features, i.e.:
• Issue price or, shortly, price.
• Face value F, or principal.
• Maturity T.
• Coupon and coupon dates.
• Indentures and covenants.
Other options may be embedded in a bond, but those are the minimal features.
1 Default of a nation on such kinds of bonds is an extremely rare, but not impossible, event; e.g. consider the Russian crisis of 1998.
As said before, when an investor buys a bond he is, as a matter of fact, lending money
to the issuer. That amount is called the issue price, and is the price actually paid for the
bond. At a certain date, the maturity, the bond expires and an amount of money is paid,
the face value. Normally some smaller amounts are regularly paid before the maturity,
called coupons2; those coupons can be paid with various frequencies, but usually on
a semiannual basis in Europe and yearly in the US; if no coupons are paid, we have a zero-coupon
bond, otherwise we are dealing with a coupon-bearing bond.
With a bond comes a formal debt agreement (the indenture), and each term of this
agreement is called a covenant. A positive covenant requires certain actions, and a negative
covenant limits certain actions. An indenture is a legally binding contract and can be
enforced by the law. It is then appropriate to make a distinction between technical default
and debt-service default: the first is the default induced by breaking a covenant, and
the second occurs when the borrower has not made a scheduled payment of interest
or principal. Throughout we will denote with the term default the debt-service default,
although those two forms of default are correlated (cf. [4]).
1.2 Term structure
Let us consider a zero-coupon bond, assuming that no default can occur and the
interest rate is fixed. Then the price of such a bond is given by

S_t = e^{−r(T−t)} F,  0 ≤ t ≤ T.

Then

r = − (1/(T − t)) log(S_t/F),  0 ≤ t ≤ T  (1.1)

is the zero rate, which is the rate that quantifies the performance of the bond. The zero-coupon
bond is the simplest example of a credit derivative and, since the payoff function
is fixed at the issue of the bond, a zero-coupon bond can be defined, for the sake of
simplicity, with F = 1. Similarly, to quantify the performance of a coupon-bearing bond,
let us consider the coupon dates 0 < T_1 < . . . < T_n = T, with coupon c. Then we want
to find the rate y such that

S_0 = c Σ_{i=1}^{n} e^{−yT_i} + e^{−yT} F = c Σ_{i=1}^{n−1} e^{−yT_i} + e^{−yT} (F + c)  (1.2)

holds. This rate is called the yield rate, and it makes the actualized promised payments equal
to the market price.
2 This name dates back to the time when bonds were actually made of paper, with the coupon a small piece of paper to be physically detached from the bond in order for the interest to be paid.
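Equation (1.2) has no closed form in y, but the actualized payments are strictly decreasing in y, so any bracketing search finds the yield; a minimal sketch (the bond data are made-up values, and bisection stands in for whatever root-finder one prefers):

```python
# Solve (1.2) for the yield y by bisection: the actualized promised
# payments are strictly decreasing in y, so the root is easy to bracket.
import math

def npv(y, price, face, coupon, dates):
    """Actualized promised payments minus the market price, as a function of y."""
    total = sum(coupon * math.exp(-y * Ti) for Ti in dates)
    total += face * math.exp(-y * dates[-1])
    return total - price

def yield_rate(price, face, coupon, dates, lo=-0.5, hi=1.0, tol=1e-12):
    # npv(lo) > 0 > npv(hi) for any sensible bond, so [lo, hi] brackets the root
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if npv(mid, price, face, coupon, dates) > 0:
            lo = mid    # payments still worth more than the price: raise y
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical bond: face 100, annual coupon 5, coupon dates 1..5, price 97
dates = [1.0, 2.0, 3.0, 4.0, 5.0]
y = yield_rate(97.0, 100.0, 5.0, dates)
# At the computed yield, the promised payments actualize to the market price
assert abs(npv(y, 97.0, 100.0, 5.0, dates)) < 1e-6
```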
It is common in finance to consider only zero-coupon bonds, since a coupon-bearing
bond can be treated like a portfolio of zero-coupon bonds having face value c and
maturity T_i. When inferred from market data, it is common practice to plot the zero rate
r as a function of T − t, to get information about the interest rate. The curve r(T − t) is
called the term structure of interest rates, zero curve or yield curve.
Depending on the shape of the curve r(T − t), it is possible to get information about the
attitude of investors toward the economy; usually three different shapes are observed:
• Increasing curve: in a normal economy the curve should have this shape. This
means that at the moment all the economic indicators involved are positive (e.g.
inflation, interest rates, . . . ) and investors are prone to risk. Under such conditions,
investors expect higher yields for fixed-income instruments with long-term maturities
that occur further into the future. Besides this, the yield is higher to compensate the
prolonged exposure to uncertainty.
• Flat curve: the market is sending mixed signals to investors, and they do not know
what the interest rate will do. Therefore, a flat yield curve can be the herald of a
trend change in market behavior. This situation, if the long-term rate decreases,
can degenerate into a reverse curve. In that condition investors can maximize their
risk/return tradeoff by choosing fixed-income securities with the least risk, or highest
credit quality.
• Decreasing curve: these yield curves are rare, and they form during extraordinary
market conditions wherein the expectations of investors are completely the inverse
of those demonstrated by the normal yield curve. In such abnormal market envi-
ronments, bonds with maturity dates further into the future are expected to offer
lower yields than bonds with shorter maturities. The inverted yield curve indicates
that the market currently expects interest rates to decline as time moves farther
into the future, which in turn means the market expects yields of long-term bonds
to decline.
Recall our claim (I.1). Then the price of a zero-coupon bond, considering also the
possibility of default (i.e. (r_t)_{t≥0} is not the risk-free rate, or short rate), is

S_t = E_t[ e^{−∫_t^T r_s ds} F ] = F e^{α(t,T)+β(t,T) r_t}

and using (1.1) we have

r = − (1/(T − t)) (α(t, T) + β(t, T) r_t).  (1.3)
What is the role of credit risk here? Consider the case of a defaultable bond and
remember (I.3). Then the r inferred from corporate bonds also includes a credit premium
λ, called the spread, which is an award to the investor to compensate the risk of
default.
A common practice is to use as reference rates the treasury zero rates, which are the
zero rates inferred from government bonds and are considered risk-free. For a more detailed
discussion of term structure theory, we refer to [28].
1.3 Credit rating
Rating agencies such as Moody's and Standard & Poor's (S&P) provide ratings de-
scribing the creditworthiness of corporate bonds. Hence, the rating is valuable informa-
tion to investors, and reputation and reliability are fundamental for a rating agency.
As a matter of fact, many US state and federal laws and regulations3, and many corporate
bylaws, require the rating agency to be accredited as a “Nationally Recognized Statisti-
cal Rating Organization” (NRSRO) by the Securities and Exchange Commission (SEC).
That condition, known as a regulatory barrier, makes the rating business a quite closed one.
Using Moody’s system, the best rating is Aaa. Bonds with this rating are considered
to have almost no chance of defaulting. The next best rating is Aa, followed by A, Baa,
Ba, B, and Caa. Only bonds with ratings of Baa or above are considered to be investment
grade. The S&P ratings corresponding to Moody’s Aaa, Aa, A, Baa, Ba, B, and Caa
are AAA, AA, A, BBB, BB, B, and CCC, respectively. To create finer rating measures,
Moody's divides the Aa rating category into Aa1, Aa2, and Aa3; it divides A into A1, A2
and A3; and so on. Similarly S&P divides its AA rating category into AA+, AA, and AA-;
it divides its A rating category into A+, A, and A-; and so on. (Only the Aaa category
for Moody’s and the AAA category for S&P are not subdivided.)
Bond traders have developed procedures for taking credit risk into account when pric-
ing corporate bonds. They collect market data on actively traded bonds to calculate a
generic zero-coupon yield curve for each credit rating category. These zero-coupon yield
curves are then used to value other bonds. For example, a newly issued A-rated bond would
be priced using the zero-coupon yield curve calculated from other A-rated bonds.
The spread normally increases as the rating declines, and it increases with maturity.
We point out that the spread tends to increase faster with maturity for low credit ratings
than for high credit ratings. For example, the difference between the five-year spread and
the one-year spread for a BBB-rated bond is greater than that for a AAA-rated bond.
3 E.g. money funds can invest only in AAA-rated bonds, and many pension funds are bound to investment-grade bonds.
1.4 Historical data of default
We report in Table 1.1 some historical data, relative to corporate bonds, aggregated
by rating; they show how the probability of default of bonds with a certain initial rating
varies over the years. We can see the difference in behavior between investment-grade
bonds and those of speculative grade: while the healthy firms need some time to default,
in the speculative zone the first years are crucial. Of course, bonds with lower ratings offer
higher spreads to compensate the credit risk, hence the adjective “speculative”.
Table 1.1: Cumulative probabilities of default (%). Source: Moody’s (1970-2003)
Time (years)
Rating 1 2 3 4 5 7 10 15 20
Aaa 0.00 0.00 0.00 0.04 0.12 0.29 0.62 1.21 1.55
Aa 0.02 0.03 0.06 0.15 0.24 0.43 0.68 1.51 2.70
A 0.02 0.09 0.23 0.38 0.54 0.91 1.59 2.94 5.24
Baa 0.20 0.57 1.03 1.62 2.16 3.24 5.10 9.12 12.59
Ba 1.26 3.48 6.00 8.59 11.17 15.44 21.01 30.88 38.56
B 6.21 13.76 20.65 26.66 31.99 40.79 50.02 59.21 60.73
Caa 23.65 37.20 48.02 55.56 60.83 69.36 77.91 80.23 80.23
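The cumulative figures in Table 1.1 can be turned into conditional (“forward”) default probabilities, P(default in (t₁, t₂] | survival to t₁) = (P₂ − P₁)/(1 − P₁), which makes the investment/speculative contrast explicit. A small sketch using the year-1 and year-2 entries of the Baa and B rows:

```python
# Conditional default probability between two horizons, derived from
# cumulative default probabilities: P(t1 < tau <= t2 | tau > t1) = (P2 - P1)/(1 - P1).
def conditional_pd(p1, p2):
    return (p2 - p1) / (1.0 - p1)

# Years 1 and 2 cumulative probabilities from Table 1.1, as fractions
baa_1, baa_2 = 0.0020, 0.0057   # Baa row
b_1, b_2 = 0.0621, 0.1376       # B row

# A surviving B-rated issuer is far more likely to default during year 2
# than a surviving Baa-rated issuer: the first years are the crucial ones.
assert conditional_pd(b_1, b_2) > conditional_pd(baa_1, baa_2)
```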
1.5 Recovery rate
When a corporation defaults on its obligations, counterparties look for an agreement
to settle the debt. If the firm is unable to fulfil all the requests, it can file voluntarily, or be
filed, for bankruptcy, with modalities that vary from nation to nation. Usually, the money
to refund the investors is collected by selling the firm's assets (Liquidation, Chapter 7 of
the US Bankruptcy Code), or it is possible to keep the firm in business while a bankruptcy
court supervises the “reorganization” of the company's contractual and debt obligations
(Reorganization, Chapter 11 of the US Bankruptcy Code). The largest Chapter 11 case
is the 2008 Lehman Brothers bankruptcy, with $613 billion of debts4.
Of course, there is no refund available for everybody. As a general rule, some creditors
have a higher priority class than others, specified as covenants in the indenture.
Table 1.2: Recovery rates on corporate bonds. Source: Moody’s (1982-2003)
Class Average (%)
Senior secured 51.61
Senior unsecured 36.1
Senior subordinated 32.5
Subordinated 31.1
Junior subordinated 24.5
Thus, we can express the price of a corporate bond with recovery as

S_t = E_t[ e^{−∫_t^T r_s ds} (F 1_{T<τ} + R 1_{T≥τ}) ] = E_t[ e^{−∫_t^T r_s ds} F 1_{T<τ} ] + E_t[ e^{−∫_t^τ r_s ds} R 1_{T≥τ} ].  (1.4)

It is evident that the price is the sum of the price of the zero-coupon bond and the
price of the protection.
1.6 Netting
A complication in the estimation of the losses that will be incurred in the event of a
counterparty default is netting. This is a clause in most contracts written by financial
institutions. It states that if a counterparty defaults on one contract with the financial
institution, then it must default on all outstanding contracts with the financial institution.
That is useful because two financial institutions are usually mutually in short or long
positions with each other, and this prevents a counterparty from voluntarily defaulting on some
contracts while keeping others which are more advantageous (“cherry picking”). This
situation is known as moral hazard, defined as: “the risk that a party to a transaction has
not entered into the contract in good faith, has provided misleading information about its
assets, liabilities or credit capacity, or has an incentive to take unusual risks in a desperate
attempt to earn a profit before the contract settles.”5 We present an example from [34]
to clarify the matter:
4 http://www.bloomberg.com/apps/news?pid=20601103&sid=aI_Hue3zUKgs&refer=us
5 http://www.investopedia.com/terms/m/moralhazard.asp
Consider a financial institution that has three contracts outstanding with a particular counter-
party. The contracts are worth $10 million, $30 million, and −$25 million to the financial
institution. Suppose the counterparty runs into financial difficulties and defaults on its
outstanding obligations. To the counterparty, the three contracts have values of −$10 mil-
lion, −$30 million, and +$25 million, respectively. Without netting, the counterparty would
default on the first two contracts and retain the third, for a loss to the financial institution
of $40 million. With netting, it is compelled to default on all three contracts, for a loss to
the financial institution of $15 million. (If the third contract had been worth −$45 million
to the financial institution, the counterparty would choose not to default and there would
be no loss to the financial institution.)
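The arithmetic of the example above is just a clipped sum; a two-function sketch:

```python
# Loss to the financial institution on a counterparty default, computed from
# the per-contract values (positive = in the institution's favor).
def loss_without_netting(values):
    # The counterparty defaults only on the contracts it owes money on
    return sum(v for v in values if v > 0)

def loss_with_netting(values):
    # All contracts are netted into a single exposure, floored at zero
    return max(sum(values), 0.0)

contracts = [10.0, 30.0, -25.0]   # $ millions, as in the example
assert loss_without_netting(contracts) == 40.0
assert loss_with_netting(contracts) == 15.0
```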
We will now present some credit derivatives that can be priced within the affine framework.
1.7 Credit Default Swaps
A credit default swap (CDS) is a contract that provides insurance against the risk of
default by a particular company. The company is known as the reference entity. The buyer
of the insurance obtains the right to sell a particular bond issued by the company for its
face value when a credit event occurs. The bond is known as the reference obligation and
the total face value of the bond that can be sold is known as the swap's notional principal.
The buyer of the CDS makes periodic payments to the seller until the end of the life of
the CDS or until a credit event occurs. In the latter case, the recovery is determined and
the seller refunds the buyer with the difference between the face value and the recovery.
The settlement is then evaluated as 1 − W of the face value, where W is the recovery.
Therefore CDSs were conceived as nothing but an additional form of protection from losses;
like many other derivatives, in recent times CDSs have lost much of their original “protection”
function, becoming a speculative instrument: since an investor does not have to actually
own the bond underlying the CDS, he is actually betting on the default of a security.
In the light of those considerations, binary CDSs were developed, which consist of a
CDS with a fixed refund. We will deal only with standard CDSs, it being understood that a
binary CDS is a standard CDS whose recovery is fixed.
As said in Section 1.5, in practice the determination of the actual recovery rate could
take a long time, so it is common practice to use market data to estimate the recovery;
a common method is to compute the mid-price, i.e. the mean between the bid and the
offer for a similar bond. CDSs allow companies to manage their credit risk actively; it is
quite common for a financial institution to sell protection in markets which differ from
its usual field of operation, in order to diversify risks. That is known by practitioners as
the commandment: “don't put all your eggs in the same basket”. The CDS market has
been growing larger and larger, reaching the astounding value of $58 trillion6 ($58 · 10^12).
For a quick comparison, we point out that the world's gross domestic product (GDP) was
$54 trillion in 20067.
It is common practice to use CDSs to calibrate models, since they are so widespread and
are more liquid than the underlying bonds (i.e. they can be much more easily sold and bought
at any time). This assumption is often questioned in the light of asymmetric-information
arguments: roughly speaking, the CDS rate depends strongly on the default
probability, which is in general known with varying degrees of precision, while for other
kinds of derivatives the information is the same for everybody (e.g. interest rates,
exchange rates, etc.). E.g. the CEO of a firm knows the firm's real status better than
a private citizen who wants to invest his savings. We refer to [34], p. 537, and to [8] for
further readings.
1.8 Credit spread options
A credit spread option is basically a European option written on a bond. Let us
consider, for simplicity, a zero-coupon bond with maturity T. Let us suppose that the
bond has zero rate r = Y_t + s, where Y_t is some reference index (e.g. Libor) and s is
the spread of the owned bond. We denote by S_t the spread of similar bonds available on the
market at time t. Furthermore, let us suppose that we are able to exercise the option
only if the underlying bond has not defaulted at time t.
The option is exercised at time t if there are bonds with a greater spread, selling
the old bond to buy the one with the greater spread (i.e. better performance); the payoff
of such an option is then

    Z_t = ( e^{−(Y_t+s)(T−t)} − e^{−(Y_t+S_t)(T−t)} )^+ . (1.5)
On the other hand, for a call option, we would buy the bond only if its spread is
higher than the spreads available on the market, netting:

    Z'_t = ( e^{−(Y_t+S_t)(T−t)} − e^{−(Y_t+s)(T−t)} )^+ .
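As a quick numerical illustration, the two payoffs above can be computed directly; this is a minimal sketch, with parameter values that are purely illustrative (not taken from the text):

```python
import math

def put_spread_payoff(Y_t, s, S_t, t, T):
    """Payoff (1.5): sell the owned bond (spread s) and buy a market bond
    with spread S_t; positive only when the market spread exceeds s."""
    return max(math.exp(-(Y_t + s) * (T - t)) - math.exp(-(Y_t + S_t) * (T - t)), 0.0)

def call_spread_payoff(Y_t, s, S_t, t, T):
    """Mirror payoff Z'_t: positive only when the owned bond's spread s
    exceeds the market spread S_t."""
    return max(math.exp(-(Y_t + S_t) * (T - t)) - math.exp(-(Y_t + s) * (T - t)), 0.0)

# Illustrative numbers: reference index 3%, owned spread 1%, market spread 2%.
z = put_spread_payoff(0.03, 0.01, 0.02, 0.0, 5.0)   # positive: market spread is larger
```

Note that for any given (s, S_t) at most one of the two payoffs is nonzero.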
6 http://www.sec.gov/news/testimony/2008/ts092308cc.htm
7 http://siteresources.worldbank.org/DATASTATISTICS/Resources/GDP.pdf
Chapter 2
Intensity-Based Modelling of Default
As we have seen in the Introduction, the critical aspect in credit risk is to predict
if and when a default event is likely to occur. To achieve that, two kinds of
approaches have been used in the literature to describe default dynamics: the structural approach and the
reduced-form approach.
Structural models use the value of the firm to determine the default time τ, defined as
the time at which the firm’s value falls below a certain level, e.g. the total value of liabilities;
the reduced-form approach, instead, models the firm’s status (alive or defaulted) as a jump
process, with τ the time of the first jump.
The most prominent examples of structural models are [39] and [6], both based on the
Black–Scholes framework; cf. for instance [24] for a description and some criticism of
such models.
In this thesis we adopt the reduced-form approach, as it leads naturally to affine
processes. We assume throughout that all stochastic elements are defined on a complete
probability space (Ω, F, P) equipped with a filtration (F_t)_{t≥0} satisfying the usual conditions (Definition B.2).
2.1 Counting processes
Definition 2.1 (Counting process). Let (T_n)_{n≥0} be an increasing sequence of random
variables with T_0 = 0 a.s. A stochastic process (N_t)_{t≥0} is the counting process
associated to the sequence (T_n)_{n≥0} if

    N_t = n  if t ∈ [T_n, T_{n+1}),    N_t = ∞  if t ≥ lim_{n→∞} T_n.

If lim_{n→∞} T_n = ∞, then (N_t)_{t≥0} is called non-explosive. We point out that a counting process
is right-continuous by definition.
Definition 2.2 (Intensity of a counting process). Let (λ_t)_{t≥0} be a non-negative predictable
process such that

    ∫_0^t λ_s ds < ∞ a.s., for all t.

Let (N_t)_{t≥0} be a counting process. If (N_t − ∫_0^t λ_s ds)_{t≥0} is a local martingale, then (λ_t)_{t≥0}
is called an intensity of (N_t)_{t≥0}.
We can regard the intensity as essentially unique; indeed, the following result holds.
Theorem 2.1.1 ([7]). Let (λ_t)_{t≥0} and (λ̃_t)_{t≥0} be two intensities for the counting process
(N_t)_{t≥0}. Then

    ∫_0^t |λ_s − λ̃_s| λ_s ds = 0 a.s., for all t. (2.1)

If we take the intensities to be strictly positive, then from (2.1) it follows that λ_t = λ̃_t
a.s., for all t ≥ 0.
We can dispense with the localness by assuming some technical conditions, leading to the
following result:

Proposition 2.1.2. Suppose (N_t)_{t≥0} is an (F_t)-adapted counting process and (λ_t)_{t≥0} is
a non-negative (F_t)-predictable process such that E[∫_0^t λ_s ds] < ∞ for all t, with (F_t)_{t≥0}
satisfying the usual conditions. Then the following are equivalent:

(i) (N_t)_{t≥0} is non-explosive and (λ_t)_{t≥0} is the intensity of (N_t)_{t≥0}.

(ii) (N_t − ∫_0^t λ_s ds)_{t≥0} is a martingale.
Proof. (ii ⇒ i). A martingale is, obviously, also a local martingale, so (λ_t)_{t≥0} is the intensity of
(N_t)_{t≥0}. By the definition of a martingale,

    E[N_t − ∫_0^t λ_s ds] ≤ E[|N_t − ∫_0^t λ_s ds|] < ∞,

hence (N_t)_{t≥0} is non-exploding.
(i ⇒ ii). Since N_t and ∫_0^t λ_u du are increasing and positive, we have for all t that

    |N_t − ∫_0^t λ_u du| ≤ |N_t| + |∫_0^t λ_u du| = N_t + ∫_0^t λ_u du

and

    sup_{s≤t} |N_s − ∫_0^s λ_u du| ≤ N_t + ∫_0^t λ_u du.
Then

    E[sup_{s≤t} |N_s − ∫_0^s λ_u du|] < ∞

and by Theorem B.1.3 we have the desired result. □
Remark 1. From Proposition 2.1.2 (ii) we can observe, for s > t:

    E_t[N_s − ∫_0^s λ_u du] = N_t − ∫_0^t λ_u du  ⇒  E_t[N_s − N_t] = E_t[∫_t^s λ_u du].
Proposition 2.1.3 (T8, T9, [7], 27-28). Suppose (N_t)_{t≥0} is a non-explosive (F_t)-adapted
counting process with intensity (λ_t)_{t≥0}, with ∫_0^t λ_s ds < ∞ a.s. for all t, and (F_t)_{t≥0}
satisfying the usual conditions. Let M_t = N_t − ∫_0^t λ_s ds, t ≥ 0. Then for every predictable
process (H_t)_{t≥0} such that ∫_0^t |H_s| λ_s ds < ∞ a.s. for all t, the process

    Y_t = ∫_0^t H_s dM_s = ∫_0^t H_s dN_s − ∫_0^t H_s λ_s ds

is well defined and is a local martingale. In addition, if E[∫_0^t |H_s| λ_s ds] < ∞ for all t,
then (Y_t)_{t≥0} is a martingale.
Remark 2. If (Y_t)_{t≥0} is a martingale, then for s > t

    E_t[Y_s] = Y_t  ⇒  E_t[∫_t^s H_u dN_u] = E_t[∫_t^s H_u λ_u du].

The first jump of a counting process will be central in our context, hence we pose the following definition.
Definition 2.3 (Intensity for a stopping time). Let (N_t)_{t≥0} be a non-explosive counting
process with intensity (λ_t)_{t≥0}, and let τ := inf{t : N_t = 1}. Then the stopping time τ
is said to have intensity (λ_t)_{t≥0}.
For multiname risk modelling, it can be useful to characterize the first jump among n counting
processes:

Lemma 2.1.4. Let τ_i, i = 1, . . . , n, be stopping times with intensities (λ^i_t)_{t≥0}.
If P[τ_i = τ_j] = δ_{ij}, then τ := min(τ_1, . . . , τ_n) has intensity Σ_{i=1}^n λ^i.
Proof. By Definition 2.3, for each i, M^i_t = N^i_t − ∫_0^t λ^i_s ds is a local martingale.
Then also

    M_t := Σ_{i=1}^n M^i_t = Σ_{i=1}^n N^i_t − ∫_0^t Σ_{i=1}^n λ^i_u du (2.2)

is a local martingale. Let us denote by (N_t)_{t≥0} the counting process associated to τ;
since we supposed that P[τ_i = τ_j] = δ_{ij}, we can write

    N_t = N^1_t + · · · + N^n_t, t ≤ τ;

then, by (2.2), Σ_{i=1}^n λ^i is the intensity of τ. □
2.2 Poisson processes
We present one of the many equivalent definitions of a Poisson process, which qualifies
it as a Lévy process (cf. Definition B.10).

Definition 2.4 (Poisson process). An (F_t)-adapted, non-exploding counting process
(N_t)_{t≥0} is a Poisson process if

1. for all s, t with 0 ≤ s ≤ t < ∞, N_t − N_s is independent of G_s := σ(N_u : u ≤ s), the
history of the process up to time s (independent increments);

2. for all s, t, u, v with 0 ≤ s ≤ t < ∞, 0 ≤ u ≤ v < ∞ and v − u = t − s, we have
N_t − N_s =_d N_v − N_u (stationary increments).
Theorem 2.2.1. Let (N_t)_{t≥0} be a Poisson process. Then

    P[N_t = n] = e^{−λt} (λt)^n / n!, n ∈ N_0, (2.3)

for some λ > 0 and all t ≥ 0. That is, the random variable N_t has a Poisson distribution
with parameter λt, for some deterministic λ > 0. For a proof, see [46].
Theorem 2.2.2. Let (N_t)_{t≥0} be a Poisson process. Then

    E[N_t] = λt, Var(N_t) = λt.

In addition, (N_t − λt)_{t≥0} and ((N_t − λt)² − λt)_{t≥0} are martingales.

Proof. The calculation of mean and variance is straightforward and will be omitted. Since
λt is deterministic,

    E[N_t] = λt ⇒ E[N_t − λt] = 0,
    E[(N_t − λt)²] = λt ⇒ E[(N_t − λt)² − λt] = 0.

To verify that these “compensated” processes are martingales, we note that for all t ≥ s:

    E[N_t − λt − (N_s − λs) | G_s] = E[N_t − N_s − λ(t − s)] = 0.

The same holds for ((N_t − λt)² − λt)_{t≥0}. □
Definition 2.2 in combination with Theorem 2.2.2 shows that λ is the (deterministic)
intensity of the Poisson process (N_t)_{t≥0}.
As can be seen in the proof of Theorem 2.2.2, we have

    E[N_t − N_s] = λ(t − s).

Hence λ represents the jump rate per unit time, and the name “intensity” is well chosen.
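These moment identities are easy to check by simulating the process from its i.i.d. exponential inter-arrival times (a standard construction; λ and t below are arbitrary):

```python
import random

def poisson_count(lam, t, rng):
    """Number of jumps in [0, t] of a rate-lam Poisson process, built by
    summing i.i.d. Exp(lam) inter-arrival times."""
    n, clock = 0, rng.expovariate(lam)
    while clock <= t:
        n += 1
        clock += rng.expovariate(lam)
    return n

rng = random.Random(1)
lam, t, trials = 2.0, 3.0, 100_000
samples = [poisson_count(lam, t, rng) for _ in range(trials)]
mean = sum(samples) / trials
var = sum((x - mean) ** 2 for x in samples) / trials
# Theorem 2.2.2 predicts mean = var = lam * t = 6.
```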
2.3 Doubly stochastic process
Now we will extend the concept of the Poisson process. Let (G_t)_{t≥0} and (F_t)_{t≥0} be
filtrations, and write F_s ∨ G_t := σ(F_s ∪ G_t).

Definition 2.5 (Doubly stochastic process). Let (N_t)_{t≥0} be a non-explosive counting
process with intensity (λ_t)_{t≥0}, and let (F_t)_{t≥0} satisfy the usual conditions, with F_t ⊂ G_t, t ≥ 0.
If (λ_t)_{t≥0} is (F_t)-predictable and, for all t and s > t, N_s − N_t, conditioned on F_s ∨ G_t, has
a Poisson distribution with parameter ∫_t^s λ_u du, then (N_t)_{t≥0} is called a doubly stochastic
process, driven by (F_t)_{t≥0}. That is,

    P[N_s − N_t = n | F_s ∨ G_t] = (∫_t^s λ_u du)^n / n! · e^{−∫_t^s λ_u du}. (2.4)
Remark 3. If we multiply (2.4) by n and sum over n ∈ N_0, recalling that Σ_{i=0}^∞ x^i/i! = e^x, we
get

    E[N_s − N_t | F_s ∨ G_t] = ∫_t^s λ_u du, 0 ≤ t < s;

once again, the name “intensity” is well chosen. Of course, the quantity ∫_t^s λ_u du is a stochastic
process; otherwise we would get the same result as for the Poisson process. An alternative
representation is by means of the characteristic function:

    E[e^{iu(N_s−N_t)} | F_s ∨ G_t] = exp[(e^{iu} − 1) ∫_t^s λ_u du], 0 ≤ t < s, u ∈ R. (2.5)
We will use a doubly stochastic process to model the status of a debtor: the process
(N_t)_{t≥0} starts with N_0 = 0, and we denote by τ the time of the first jump (i.e.
τ := inf{t : N_t = 1}). Then, for N_t = 0, t < τ, the debtor is alive; at time τ the process
jumps and we have the credit event. We see from the definition of a doubly stochastic
process that there are two different flows of information, (F_t)_{t≥0} and (G_t)_{t≥0}, the first
smaller than the other.
So, at initial time t, investors know λ_t from historical data, and they want to
forecast the probability of default, i.e. P[N_s − N_t > 0 | G_t], and other related quantities.
To know the probability of having n jumps in the time interval [t, s], we need the
information contained in F_s, in order for ∫_t^s λ_u du to make sense, together with the
information contained in G_t.
For our applications, we can think of the two filtrations as representing two different
actors: one equipped with the σ-algebra F_s, which contains the statistical information about
the process (N_t)_{t≥0} up to time s, and the other equipped with G_t, which has information
about (λ_t)_{t≥0} only up to time t, but has access to other kinds of information. The notion
of a doubly stochastic process can be effortlessly generalized to a multidimensional context.
Definition 2.6. The process N_t = (N^1_t, . . . , N^n_t), t ≥ 0, is said to be an n-dimensional
doubly stochastic process, driven by (F_t)_{t≥0}, with intensity λ_t = (λ^1_t, . . . , λ^n_t), t ≥ 0, if

• (N^i_t)_{t≥0} is a doubly stochastic process, driven by (F_t)_{t≥0}, with intensity (λ^i_t)_{t≥0}, for
i = 1, . . . , n;

• N^i_u − N^i_t, t ≤ u ≤ s, i = 1, . . . , n, conditional on the σ-algebra G_t ∨ F_s, are
independent.
As we noted before, the emphasis is on the filtration (G_t)_{t≥0}, and usually it can be
constructed from (F_t)_{t≥0}.
Proposition 2.3.1. Let (F_t)_{t≥0} satisfy the usual conditions, and let (λ_t)_{t≥0} be an (F_t)-predictable,
non-negative process with ∫_0^t λ_s ds < ∞ a.s. for all t. Let (Z_i)_{i∈N} be i.i.d. random variables
with standard exponential distribution, independent of (F_t)_{t≥0}.
If a counting process (N_t)_{t≥0} is associated to the sequence

    T_0 = 0,  T_n = inf{t ≥ T_{n−1} : ∫_{T_{n−1}}^t λ_s ds ≥ Z_n}, n ∈ N, (2.6)

then (N_t)_{t≥0} is a non-exploding counting process with intensity (λ_t)_{t≥0}, doubly stochastic
driven by (F_t)_{t≥0}, with G_t the σ-algebra generated by F_t and σ(N_s : 0 ≤ s ≤ t).
Proof. Let us take n = 1; then

    T_1 = inf{t ≥ 0 : ∫_0^t λ_s ds ≥ Z_1};

by definition, since Z_1 is standard exponential,

    P[T_1 > t | F_t] = P[Z_1 > ∫_0^t λ_s ds | F_t] = e^{−∫_0^t λ_s ds}.

By the strong Markov property we can apply the same reasoning, starting from
(T_1, Z_1). From the independence of Z_1 and Z_2 it follows that T_2 − T_1 is independent of
T_1; hence the increments (T_i − T_{i−1})_{i∈N} are i.i.d. Therefore N_t, conditioned on F_t, has a Poisson
distribution with parameter ∫_0^t λ_s ds, and N_s − N_t, conditional on F_s ∨ G_t, has a
Poisson distribution, i.e. (N_t)_{t≥0} is a doubly stochastic process. □
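The construction (2.6) is also a practical simulation recipe. The sketch below uses a deterministic intensity path λ(t) (a degenerate special case of an (F_t)-predictable intensity) and a crude Riemann sum for the accumulated intensity; both choices are simplifications for illustration:

```python
import math
import random

def cox_jump_times(lam, horizon, rng, dt=0.01):
    """Jump times on [0, horizon] via (2.6): draw Z ~ Exp(1) and jump as
    soon as the intensity accumulated since the last jump reaches Z."""
    times, t, acc, z = [], 0.0, 0.0, rng.expovariate(1.0)
    while t < horizon:
        acc += lam(t) * dt          # left Riemann sum for the integral
        t += dt
        if acc >= z:
            times.append(t)
            acc, z = 0.0, rng.expovariate(1.0)
    return times

rng = random.Random(2)
lam = lambda t: 1.0 + 0.5 * math.sin(t)     # hypothetical intensity path
runs, horizon = 5_000, 4.0
mean_jumps = sum(len(cox_jump_times(lam, horizon, rng)) for _ in range(runs)) / runs
# Conditional on the path, E[N_horizon] = integral of lam over [0, 4]
# = 4 + 0.5*(1 - cos 4), roughly 4.83.
```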
2.4 Risk-neutral probability
Up to now, all definitions depended on the physical probability measure P; now we
need a risk-neutral probability measure Q to price derivatives.
We recall the staples of risk-neutral pricing theory in Appendix C; in addition, we
will suppose the existence of a short interest rate (r_t)_{t≥0} such that ∫_0^t r_u du < ∞ a.s. for
all t (in analogy with the hypothesis made on (λ_t)_{t≥0}).
Before proceeding, we answer the following question: is a counting process with respect to P
still a counting process under an equivalent probability measure Q?
Here the following result helps:
Proposition 2.4.1 ([1]). Suppose that a non-explosive counting process (N_t)_{t≥0} has a
P-intensity process (λ_t)_{t≥0}, and that Q is any probability measure equivalent to P. Then
(N_t)_{t≥0} has a Q-intensity process (λ^Q_t)_{t≥0}.
The ratio λ^Q_t/λ_t, t ≥ 0, represents the risk premium associated with the uncertainty
about the time of default. Of course, we suppose (λ_t)_{t≥0} is strictly positive, since we
suppose that there is always a possibility of a default event.
Lemma 2.4.2. Let (N_t)_{t≥0} be a non-explosive counting process with intensity (λ_t)_{t≥0}, with
T_i := inf{t : N_t = i} (i.e. T_i is the i-th jump time of N_t). Let (ϕ_t)_{t≥0} be a strictly positive
predictable process such that ∫_0^T λ_s ϕ_s ds < ∞ a.s., for fixed T. Then

    ξ_t = exp(∫_0^t (1 − ϕ_s) λ_s ds) Π_{i: T_i ≤ t} ϕ_{T_i}, t ≤ T, (2.7)

is well defined and a local martingale.
Proof. Comparing (2.7) with (A.1.1), we can see from Theorem A.1.1, posing a(t) =
M_t = N_t − ∫_0^t λ_u du and u(t) = ϕ_t − 1, that ξ_t solves the equation

    ξ_t = 1 + ∫_0^t ξ_{s−}(ϕ_s − 1) dM_s, t ≥ 0, (2.8)

since ∆M_t = ∆N_t = 1 and M^c_t = −∫_0^t λ_u du. Looking at (2.8), we see that the integrand is
predictable; then, by Proposition 2.1.3, (ξ_t)_{t≥0} is a local martingale. □
We will present a version of Girsanov’s Theorem that allows us to find a probability
measure under which a non-explosive counting process remains a counting process. We
first present some sufficient conditions for (2.7) to be a martingale.
Lemma 2.4.3. Let (N_t)_{t≥0} be a non-explosive counting process with intensity (λ_t)_{t≥0}. If
(λ_t)_{t≥0} is bounded and deterministic and (ϕ_t)_{t≥0} is bounded on [0, T], then (ξ_t)_{t∈[0,T]}, as
defined in (2.7), is a martingale.

Proof. Let us consider for simplicity λ = 1 and T = 1. Since (ϕ_t)_{t≥0} is bounded, there
exists K > 1 such that ϕ_t ≤ K, t ∈ [0, 1]; from (2.7) we have

    ξ_t ≤ e^{Kt} K^{N_t}, (2.9)

and then

    ∫_0^t |ξ_{s−}(ϕ_s − 1)| ds ≤ (K + 1) ∫_0^t e^{Ks} K^{N_{s−}} ds ≤ (K + 1) e^K K^{N_1},

where we used (2.9) and the fact that e^{Ks} K^{N_{s−}} is increasing in s, for s ∈ [0, 1]. Then

    E[∫_0^t |ξ_{s−}(ϕ_s − 1)| ds] ≤ (K + 1) E[e^K K^{N_1}] < ∞,

since N_1 has a Poisson distribution, so all its exponential moments are finite; then by
Proposition 2.1.3

    ∫_0^t ξ_{s−}(ϕ_s − 1) dM_s, t ∈ [0, 1],

is a martingale. Then, in light of (2.8), (ξ_t)_{t∈[0,T]} is also a martingale. □
Theorem 2.4.4 (Girsanov’s Theorem - 1, [21], 57 and [7], T3, 166-167). Suppose that the
local martingale (ξ_t)_{t∈[0,T]} is a martingale. Then an equivalent probability measure Q is
defined by dQ/dP = ξ_T. Restricted to the time interval [0, T], under the probability measure
Q, (N_t)_{t≥0} is a non-explosive counting process with intensity λ^Q_t = λ_t ϕ_t, t ∈ [0, T].

We note that the Radon–Nikodym derivative is totally determined by the process
ϕ_t = λ^Q_t/λ_t. Hence, if we are able to find (ϕ_t)_{t≥0}, we have an equivalent measure under
which the non-explosive counting property still holds.
By asking more, we can also preserve the doubly stochastic property.

Lemma 2.4.5. Let (N_t)_{t≥0} be doubly stochastic, driven by (F_t)_{t≥0}, with intensity (λ_t)_{t≥0}.
For a fixed time T > 0, let (ϕ_t)_{t≥0} be an (F_t)-predictable process, with (λ_t)_{t≥0} and (ϕ_t)_{t≥0}
bounded on [0, T]. Then (ξ_t)_{t∈[0,T]}, defined as in (2.7), is a martingale.

Proof. By the law of iterated expectations, for a random variable ξ_t:

    E[E[ξ_t | F_t]] = E[ξ_t].

The proof is then analogous to that of Lemma 2.4.3, using the doubly stochastic
property. □
Theorem 2.4.6 (Girsanov’s Theorem bis, [21], 57). Suppose (N_t)_{t≥0} is doubly stochastic,
driven by (F_t)_{t≥0}, with intensity (λ_t)_{t≥0}, where (G_t)_{t≥0} is the completion of
σ(N_s : 0 ≤ s ≤ t) ∨ F_t, t ≥ 0. For a fixed time T > 0, let (ϕ_t)_{t≥0} be an (F_t)-predictable process with
∫_0^T ϕ_s λ_s ds < ∞ a.s. Let (ξ_t)_{t∈[0,T]} be defined by (2.7), and suppose that (ξ_t)_{t∈[0,T]} is a
martingale. Let Q be the probability measure with dQ/dP = ξ_T. Then, restricted to the time
interval [0, T], under the probability measure Q and with respect to the filtration (G_t)_{t≥0},
(N_t)_{t≥0} is doubly stochastic, driven by (F_t)_{t≥0}, with intensity λ^Q_t = ϕ_t λ_t, t ∈ [0, T].
Remark 4. As said before, the σ-algebra G_t contains the market information up to time
t. The hypotheses in Theorem 2.4.6 and Proposition 2.3.1 strengthen this interpretation, since
they tell us that it has to contain information about λ_t and N_t (in our context, the intensity
of default and the status of the firm, respectively), which are market data.
We have given a density ξ which provides an equivalent measure Q, but we also need the
discounted price of an asset to be a martingale under Q. That casts some restrictions
on the parameters of an affine model; we refer to [23], Sections 3.1-3.2.
2.5 Useful results
Now we are ready to derive some useful relations, which can be effectively evaluated
jointly with affine process theory. We will often use the following classical result in probability:

Theorem 2.5.1 (Law of iterated expectations). Let A, G ⊂ F be two σ-algebras such
that A ⊂ G. Then E[X | A] = E[E[X | A] | G] = E[E[X | G] | A].

Proof. E[X | A] is, by definition, an A-measurable random variable and, since A ⊂ G,
it is also G-measurable; therefore E[X | A] = E[E[X | A] | G].
Let us prove the other relation: each A ∈ A belongs also to G, so

    E[E[X | G] 1_A] = E[X 1_A].

The relation then follows from the definition of conditional expectation. □
2.5.1 Survival analysis
Let us denote by A the event {N_s − N_t = 0}, i.e. no default in the time interval [t, s].
Then the survival probability, conditioned on the information available to investors up to
time t, is, exploiting Theorem 2.5.1:

    P[τ > s | G_t] = E[1_A | G_t] = E[E[1_A | G_t ∨ F_s] | G_t] = E[P[N_s − N_t = 0 | G_t ∨ F_s] | G_t].

But recalling (2.4) we have

    P[τ > s | G_t] = E[e^{−∫_t^s λ_u du} | G_t], (2.10)

and comparing (2.10) with (I.2): evaluating the survival probability from
the market data contained in G_t is equivalent to pricing a zero-coupon bond discounted by
the intensity (λ_t)_{t≥0}.
Credit risk can be naturally approached with survival analysis, and the reduced-form
approach yields some nice results.
Definition 2.7 (Hazard rate). If τ is an absolutely continuous non-negative random
variable, its hazard rate function is defined by

    h(t) = f(t)/S(t), t ≥ 0,

where f(t) is the density of τ and S(t) is the survival function S(t) = 1 − ∫_0^t f(u) du.
Note that P(τ ≤ t + dt | τ > t) ≈ h(t) dt; the hazard rate is thus the instantaneous rate
of failure at time t.
Let us denote by p(t) = P(τ > t) the survival function p : [0, ∞) → [0, 1]. The
density of the stopping time is π(t) = −dp/dt, and the hazard rate h : [0, ∞) → [0, ∞) is

    h(t) = π(t)/p(t) = −(d/dt) log p(t); (2.11)

integrating both sides, we get

    p(t) = P(τ > t) = e^{−∫_0^t h(u) du}.
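The identity p(t) = e^{−∫_0^t h(u) du} is easy to verify numerically; the sketch below uses a hypothetical linear hazard, for which the integral is also available in closed form:

```python
import math

def survival_from_hazard(h, t, n=10_000):
    """p(t) = exp(-integral of h over [0, t]), via the trapezoidal rule."""
    dt = t / n
    integral = sum(0.5 * (h(i * dt) + h((i + 1) * dt)) * dt for i in range(n))
    return math.exp(-integral)

# Hypothetical hazard h(t) = 0.02 + 0.01 t; then
# p(t) = exp(-(0.02 t + 0.005 t^2)) exactly.
h = lambda u: 0.02 + 0.01 * u
p5 = survival_from_hazard(h, 5.0)
exact = math.exp(-(0.02 * 5.0 + 0.005 * 5.0 ** 2))
```

Since the trapezoidal rule is exact for linear integrands, the two values agree up to rounding error.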
In an analogous way we can define all these quantities conditioned on G_t, adding the
subscript t as a shorthand notation.
Then we know from (2.10) that

    p_t(s) = P_t(τ > s) = E_t[e^{−∫_t^s λ_u du}]

and, therefore,

    π_t(s) = −(d/ds) p_t(s) = −(d/ds) E_t[e^{−∫_t^s λ_u du}].

If we are allowed to exchange expectation and derivative, we easily obtain

    π_t(s) = E_t[e^{−∫_t^s λ_u du} λ_s]. (2.12)

Since λ_s is usually positive and bounded, we can use standard results from measure
theory to exchange expectation and derivative. Actually, more can be said:
Theorem 2.5.2 ([31], 106-107). Let (N_t)_{t≥0} be a doubly stochastic process, driven by
(F_t)_{t≥0}, such that

• there exists a constant C such that E(λ_t²) < C for all t;

• for all ε > 0 and a.e. t, lim_{δ→0} P(|λ_{t+δ} − λ_t| < ε) = 1.

Then

    (d/ds) E_t[e^{−∫_t^s λ_u du}] = E_t[−λ_s e^{−∫_t^s λ_u du}].
If we assume that (I.1) holds, in a doubly stochastic setting with λ_t = l_0 + l_1 · X_{t−}
and (X_t)_{t≥0} an affine process, then all these quantities can be easily calculated.
Let us begin with the survival function:

    p_t(s) = P_t(τ > s) = E_t[e^{−∫_t^s λ_u du}] = e^{α(t,s)+β(t,s)·X_t},

where α(t, s) and β(t, s) solve the GREs. Recalling equation (2.11),

    h_t(s) = −(d/ds) log p_t(s) = −(d/ds)(α(t, s) + β(t, s) · X_t) = −∂_s α − ∂_s β · X_t.

Once we have solved the GREs, we also know explicitly the time derivatives involved in
the calculation of h_t(s).
Again from (2.11), the density of the stopping time is

    π_t(s) = h_t(s) p_t(s) = −(∂_s α + ∂_s β · X_t) e^{α(t,s)+β(t,s)·X_t}.

Therefore, by solving the GREs we can find all the related quantities, and recalling (2.12)
we obtain

    E_t[e^{−∫_t^s (l_0+l_1·X_u) du} (l_0 + l_1 · X_s)] = −(∂_s α + ∂_s β · X_t) e^{α(t,s)+β(t,s)·X_t}.

Actually, a more general result will be shown, but this example already shows how, in some
sense, the affine structure can be extended.
2.5.2 Correlated jumps
The doubly stochastic framework allows us to obtain some results also in the
n-dimensional case, where we can consider each component of an n-dimensional doubly
stochastic process (N_t)_{t≥0} as a different entity, which is only conditionally independent of
the other components.
Proposition 2.5.3. Let (N_t)_{t≥0} be an n-dimensional doubly stochastic process with
intensity (λ_t)_{t≥0}, and let τ_i := inf{t : N^i_t ≥ 1}. Then:

(i) P_t[τ > T] = E_t[e^{−∫_t^T Λ_s ds}];

(ii) P_t[τ_1 > t_1, . . . , τ_n > t_n] = E_t[e^{−∫_t^{t_n} Γ_s ds}];

where τ = inf{τ_1, . . . , τ_n}, Λ_s = Σ_{i=1}^n λ^i_s, Γ_s = Σ_{i: t_i > s} λ^i_s and t ≤ t_1 ≤ . . . ≤ t_n.
Proof. (i) By the law of iterated expectations,

    P_t[τ > T] = E_t[P[τ > T | F_T ∨ G_t]].

By the definition of an n-dimensional doubly stochastic process, conditioned
on F_T ∨ G_t the jump times are independent, so the result follows from Lemma
2.1.4 and (2.10).

(ii) By the law of iterated expectations and conditional independence we have

    P_t[τ_1 > t_1, . . . , τ_n > t_n] = E_t[P[τ_1 > t_1, . . . , τ_n > t_n | F_{t_n} ∨ G_t]]
    = E_t[P[τ_1 > t_1 | F_{t_n} ∨ G_t] · · · P[τ_n > t_n | F_{t_n} ∨ G_t]] = E_t[e^{−∫_t^{t_1} λ^1_s ds} · · · e^{−∫_t^{t_n} λ^n_s ds}]. □
We observe that, if the intensities are affine, then Λ_s is affine and Γ_s is piecewise affine,
so both are tractable within the affine framework. Let us consider case (ii), since (i) is
trivial.
Let (I.1) hold; since Γ_s = Γ(X_s, s) is affine on each interval (t_k, t_{k+1}), we have

    E_{t_{k+1}}[e^{−∫_{t_{k+1}}^{t_{k+2}} Γ(X_s, s) ds}] = e^{α(k+1)+β(k+1)·X_{t_{k+1}}}

and, by the law of iterated expectations,

    E_{t_k}[e^{−∫_{t_k}^{t_{k+2}} Γ(X_s, s) ds}] = E_{t_k}[E_{t_{k+1}}[e^{−∫_{t_k}^{t_{k+2}} Γ(X_s, s) ds}]]
    = E_{t_k}[e^{−∫_{t_k}^{t_{k+1}} Γ(X_s, s) ds} E_{t_{k+1}}[e^{−∫_{t_{k+1}}^{t_{k+2}} Γ(X_s, s) ds}]]
    = E_{t_k}[e^{−∫_{t_k}^{t_{k+1}} Γ(X_s, s) ds} e^{α(k+1)+β(k+1)·X_{t_{k+1}}}] = e^{α(k)+β(k)·X_{t_k}}.

Solving the latter equation backward, up to time t_0 = t, we obtain, by Proposition
2.5.3 (ii),

    P_t[τ_1 > t_1, . . . , τ_n > t_n] = E_t[e^{−∫_t^{t_n} Γ_s ds}] = e^{α(0)+β(0)·X_t};

i.e. we have an analytical expression for the joint distribution of the default times.
The special case n = 2 is of interest: many contracts involving two actors are
valid if and only if both actors have not defaulted at maturity, i.e. {τ_1 > T} ∩ {τ_2 > T}.
We are, of course, in case (i) of Proposition 2.5.3, i.e.

    P_t[{τ_1 > T} ∩ {τ_2 > T}] = E_t[e^{−∫_t^T (λ^1_s + λ^2_s) ds}].

Resorting to the inclusion-exclusion formula, it is easy to find also an expression for the
event {τ_1 > T} ∪ {τ_2 > T}, which means that at least one actor is alive at maturity:

    P_t[{τ_1 > T} ∪ {τ_2 > T}] = P_t[τ_1 > T] + P_t[τ_2 > T] − P_t[{τ_1 > T} ∩ {τ_2 > T}];

using the doubly stochastic property we have, as usual,

    P_t[{τ_1 > T} ∪ {τ_2 > T}] = E_t[e^{−∫_t^T λ^1_s ds}] + E_t[e^{−∫_t^T λ^2_s ds}] − E_t[e^{−∫_t^T (λ^1_s + λ^2_s) ds}]. (2.13)
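With constant (hence trivially affine) intensities, the expectations in (2.13) collapse to exponentials, and the formula can be checked by Monte Carlo. A sketch with arbitrarily chosen constants:

```python
import math
import random

def union_survival_mc(l1, l2, T, n, rng):
    """Estimate P[{tau1 > T} ∪ {tau2 > T}] for two conditionally independent
    names; with constant intensities the default times are independent
    exponentials with rates l1 and l2."""
    hits = sum(1 for _ in range(n)
               if rng.expovariate(l1) > T or rng.expovariate(l2) > T)
    return hits / n

rng = random.Random(3)
l1, l2, T = 0.3, 0.7, 2.0
est = union_survival_mc(l1, l2, T, 200_000, rng)
# Formula (2.13) with constant intensities:
exact = math.exp(-l1 * T) + math.exp(-l2 * T) - math.exp(-(l1 + l2) * T)
```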
Chapter 3
Affine processes and transforms
Now we will present and prove the results behind (I.1). Here we deal with a
simpler theory of affine processes, developed mainly in [23] for jump-diffusion processes; a
fully fledged theory can be found in [22]. For notational simplicity we suppose that none of the
functions considered depends explicitly on time and that there is only one type of jump,
but all the results also apply in a wider context, as discussed in Section 3.3.4.
We denote by · the usual inner product in R^n and by ⊗ the dyadic product

    (a ⊗ b)_{ij} := a_i b_j, a, b ∈ R^n, a ⊗ b ∈ R^{n×n},

while : is the scalar product over the space of tensors,

    A : B := tr[A B^T] = Σ_{i,j=1}^n (A)_{ij} (B)_{ij}, A, B ∈ R^{n×n}.
3.1 Affine processes
Let us pose some definitions that will turn out to be useful. Throughout, all random
elements are defined on a filtered probability space (Ω, F, (F_t)_{t≥0}, P). Details and definitions
can be found in Appendix B.
Definition 3.1 (Jump-diffusion process). Let us fix the probability space (Ω, F, P) and
a filtration (F_t)_{t≥0} satisfying the usual hypotheses, and suppose that (X_t)_{t≥0} is a Markov
process in some state space D ⊂ R^n. (X_t)_{t≥0} is a jump-diffusion (JD) process if its transition
semigroup has an infinitesimal generator D of the Lévy type, defined for any bounded
f : D → R, f ∈ C²(D), with bounded derivatives, by

    D f(x) = ∇_x f · µ(x) + ½ ∇²_x f : (σ(x)σ^T(x)) + λ(x) ∫_{R^n} [f(x + z) − f(x)] dν(z), (3.1)
where µ : D → R^n, σ : D → R^{n×n}, and ν is the fixed jump distribution (on R^n) of a
compound Poisson process (Z_t)_{t≥0} with intensity (λ(X_t))_{t≥0}, where λ : D → R_+.
Defining this generator, we have defined a process driven by a finite-activity Lévy
process (cf. Proposition B.3.3) with triplet (0, 0, λ_t ν).
Anyway, we can think of (X_t)_{t≥0} as a process solving

    dX_t = µ(X_t) dt + σ(X_t) dW_t + dZ_t, (3.2)

where X_0 has, for simplicity, a known distribution. We denote by (W_t)_{t≥0} an (F_t)-Brownian
motion in R^n and by (Z_t)_{t≥0} a compound Poisson process valued in R^n with fixed jump
distribution ν and intensity (λ(X_t))_{t≥0}.
The choice of D is arbitrary and casts restrictions on µ, σ, ν, along with the necessity of
having a strong solution to (3.2). Prior attempts to define those conditions and the form
of D were made in [20] and [16]. In [22] the authors established a full characterization of
affine processes on the state space D = R^m_+ × R^{n−m}, m = 1, . . . , n, which is considered
the standard state space for financial applications. We will present those conditions in
Section 3.5, referring to [22] for the details.
Definition 3.2 (Affine characteristic of a JD process). Let (X_t)_{t≥0} be a JD process
with parameters µ, λ, σ, ν; we define the Laplace transform of the jump distribution as

    θ(c) = ∫_{R^n} e^{c·z} dν(z), c ∈ C^n. (3.3)

If we let the parameters depend affinely on (X_t)_{t≥0},

    µ(x) = K_0 + K_1 x, K := (K_0, K_1) ∈ (R^n, R^{n×n}),
    (σ(x)σ^T(x))_{ij} = (H_0)_{ij} + Σ_{k=1}^n (H_1)_{ijk} x_k, H := (H_0, H_1) ∈ (R^{n×n}, R^{n×n×n}),
    λ(x) = l_0 + l_1 · x, l := (l_0, l_1) ∈ (R, R^n), (3.4)

then the quadruple (K, H, l, θ) is called the (affine) characteristic of (X_t)_{t≥0}, and (X_t)_{t≥0}
is called an affine jump-diffusion (AJD) process or, for short, affine.
Definition 3.3 (Transform). Let (X_t)_{t≥0} be an affine process with characteristic χ =
(K, H, l, θ), and let R : D → R_+ be a discount-rate function such that R(x) = ρ_0 + ρ_1 · x,
ρ_0 ∈ R, ρ_1 ∈ R^n. Then for t ≤ T the function ψ : C^n × D × R_+ × R_+ → C is well defined
by

    ψ(u, X_t, t, T) = E_t[e^{−∫_t^T R(X_s) ds} e^{u·X_T}]. (3.5)
Definition 3.4. A characteristic χ = (K, H, l, θ) is well-behaved at (u, T) ∈ C^n × [0, ∞)
if α and β solve the GREs

    β̇(t) = ρ_1 − K_1^T β(t) − ½ β^T(t) H_1 β(t) − l_1 (θ(β(t)) − 1),
    α̇(t) = ρ_0 − K_0 · β(t) − ½ β^T(t) H_0 β(t) − l_0 (θ(β(t)) − 1),
    α(T) = 0, β(T) = u, (3.6)

and if

1. E[∫_0^T |γ_t| dt] < ∞, with γ_t = Ψ_t (θ(β(t)) − 1) λ(X_t);

2. E[∫_0^T η_t · η_t dt] < ∞, with η_t = Ψ_t β^T(t) σ(X_t);

3. E[|Ψ_T|] < ∞;

where Ψ_t = e^{−∫_0^t R(X_s) ds} e^{α(t)+β(t)·X_t}.
We observe that actually only the first equation is an ODE; the second is just a
definite integral once β is known. The equations (3.6) are ODEs backwards in time, and
it is useful to apply the transformation t → s = T − t, obtaining

    β̇(s) = −ρ_1 + K_1^T β(s) + ½ β^T(s) H_1 β(s) + l_1 (θ(β(s)) − 1),
    α̇(s) = −ρ_0 + K_0 · β(s) + ½ β^T(s) H_0 β(s) + l_0 (θ(β(s)) − 1),
    α(0) = 0, β(0) = u. (3.7)

In the following pages the time dependence will be dropped for ease of notation. We keep
the time dependence in the boundary conditions, with the understanding that if the boundary
conditions are α(0) and β(0), we are solving the initial-time problem (i.e. (3.7)); otherwise
we are solving the backward-time problem.
And here is the result we were all waiting for:

Theorem 3.1.1. Let (K, H, l, θ) be well-behaved at (u, T). Then

    ψ(u, X_t, t, T) = e^{α(t)+β(t)·X_t}. (3.8)

Proof. We have to show that (Ψ_t)_{t≥0} is a martingale, because then, for s ≥ t,

    E_t[Ψ_s] = Ψ_t ⇒ e^{∫_0^t R(X_u) du} E_t[Ψ_s] = e^{∫_0^t R(X_u) du} Ψ_t. (3.9)

Since e^{∫_0^t R(X_u) du} is G_t-measurable, taking s = T and using the boundary conditions in (3.6),

    E_t[e^{−∫_t^T R(X_s) ds} e^{u·X_T}] = e^{α(t)+β(t)·X_t}. (3.10)
We denote by (T_i)_{i≥0} the jump times of (X_t)_{t≥0} and by (N_t)_{t≥0} the counting process
associated to that sequence.
Applying Itô’s formula (Theorem B.3.4) to the real and complex parts, we can write

    Ψ_t = Ψ_0 + ∫_0^t (∂Ψ_s/∂s) ds + ∫_0^t ∇_x Ψ_s · dX^c_s + ½ ∫_0^t ∇²_x Ψ_s : (σ(X_s) σ^T(X_s)) ds + Σ_{0<T_i≤t} (Ψ_{T_i} − Ψ_{T_i−}).
We can compute the derivatives involved, recalling the affine hypothesis (3.4):

    (∂/∂t) Ψ_t = Ψ_t [α̇ − ρ_0 + X_t · (β̇ − ρ_1)],
    ∇_x Ψ_t = Ψ_t β  ⇒  ∇_x Ψ_t · dX^c_t = Ψ_t β · [(K_0 + K_1 X_t) dt + σ(X_t) dW_t],
    ∇²_x Ψ_t = Ψ_t (β ⊗ β)  ⇒  ∇²_x Ψ_t : (σσ^T) = Ψ_t β^T (H_0 + H_1 X_t) β;

then, adding and subtracting ∫_0^t γ_s ds and grouping all the terms, we get

    Ψ_t = Ψ_0 + ∫_0^t Ψ_s [α̇ − ρ_0 + β · K_0 + ½ β^T H_0 β + l_0(θ(β) − 1)] ds
        + ∫_0^t Ψ_s β^T σ dW_s
        + ∫_0^t Ψ_s X_s · [β̇ − ρ_1 + K_1^T β + ½ β^T H_1 β + l_1(θ(β) − 1)] ds
        + Σ_{0<T_i≤t} (Ψ_{T_i} − Ψ_{T_i−}) − ∫_0^t γ_s ds.
Since α and β solve the GREs, the first and third integrands are equal to 0, and
under hypothesis 2 the second integral is a martingale (cf. [43]).
It remains to show that the last term

    J_t := Σ_{0<T_i≤t} (Ψ_{T_i} − Ψ_{T_i−}) − ∫_0^t γ_s ds

is a martingale. That holds if and only if

    E_t[Σ_{0<T_i≤s} (Ψ_{T_i} − Ψ_{T_i−}) − ∫_0^s γ_u du] = Σ_{0<T_i≤t} (Ψ_{T_i} − Ψ_{T_i−}) − ∫_0^t γ_u du.

Splitting the expectations (allowed by hypothesis 1) we get

    E_t[Σ_{t<T_i≤s} (Ψ_{T_i} − Ψ_{T_i−})] = E_t[∫_t^s γ_u du].

We observe that

    Ψ_{T_i} − Ψ_{T_i−} = e^{−∫_0^{T_i} R(X_s) ds} e^{α(T_i)+β(T_i)·X_{T_i}} − e^{−∫_0^{T_i−} R(X_s) ds} e^{α(T_i−)+β(T_i−)·X_{T_i−}},

but since α, β and the integral are continuous functions, we can regroup:

    Ψ_{T_i} − Ψ_{T_i−} = e^{−∫_0^{T_i} R(X_s) ds} e^{α(T_i)+β(T_i)·X_{T_i−}} (e^{β(T_i)·(X_{T_i}−X_{T_i−})} − 1) = Ψ_{T_i−} (e^{β(T_i)·∆X_{T_i}} − 1).
Then, using the law of iterated expectations and the above relation,

    E_t[Σ_{t<T_i≤s} (Ψ_{T_i} − Ψ_{T_i−})] = E_t[Σ_{t<T_i≤s} E[(Ψ_{T_i} − Ψ_{T_i−}) | X_{T_i−}]]
    = E_t[Σ_{t<T_i≤s} Ψ_{T_i−} (θ(β(T_i)) − 1)]
    = E_t[Σ_i ∫_{T_{i−1}+}^{T_i} Ψ_{u−} (θ(β(u)) − 1) dN_u]
    = E_t[∫_t^s Ψ_{u−} (θ(β(u)) − 1) dN_u]. (3.11)

In the light of Remark 2, given hypothesis 1, (Ψ_{t−}(θ(β(t)) − 1))_{t≥0} is a
(G_t)-predictable process and (λ_t)_{t≥0} is the intensity of the jump-counting process (N_t)_{t≥0}, so

    E_t[∫_t^s Ψ_{u−} (θ(β(u)) − 1) dN_u] = E_t[∫_t^s Ψ_u (θ(β(u)) − 1) λ_u du],

whose right-hand side is E_t[∫_t^s γ_u du] by the definition of γ_t.
Hence (J_t)_{t≥0}, and therefore also (Ψ_t)_{t≥0}, is a martingale. □
Remark 5. As we have seen, up to now we have dealt with three different filtrations.
In Chapter 2 we had a process λ^D_t, which is (F_t)-predictable, where (F_t)_{t≥0} is a
filtration satisfying the usual hypotheses such that F_t ⊂ G_t. We write λ^D_t to stress
the fact that the intensity of the doubly stochastic process is not the same as that of the pure
jump process seen in (3.2). In addition, we have seen that the filtration (G_t)_{t≥0} can be
built directly from (F_t)_{t≥0}. On the other hand, in the current chapter we only need a
filtration with respect to which (X_t)_{t≥0} is adapted; then (G_t)_{t≥0} fits well, since
F_t ⊂ G_t.
In order to apply the affine property to the right-hand side of Claim I.3, it appears
clear that λ^D_t also has to be affine. Then λ^D_t = Λ(X_{t−}), where Λ(x) is an affine function,
and a good choice is F_t := σ(X_s : s ≤ t). For the conditions under which a Markov process
generates a filtration satisfying the usual conditions, see [13], Theorem 4, 61.
From now on E_t := E[· | G_t], and we will use E[· | F_t] or E[· | X_t] interchangeably.
3.2 First examples of affine processes
An easy first application of the affine property provided by Theorem 3.1.1 is the
evaluation of a zero-coupon bond, posing u = 0 in (3.8). Zero-coupon bonds are, fundamentally,
interest rate derivatives, so it is natural to take as driving process an interest rate process;
we will take a one-dimensional interest rate factor for simplicity of notation.
In other words, we want to evaluate

    E_t[e^{−∫_t^T R(X_s) ds}]

with R(X_t) = ρ_0 + ρ_1 r_t = 0 + 1 · r_t; the focus then moves to the choice of the process
(r_t)_{t≥0}, which has to be affine.
Two popular affine models are the Vasíček model, described by

    dr_t = k(γ − r_t) dt + σ dW_t,

and the Cox–Ingersoll–Ross (CIR) model

    dr_t = k(γ − r_t) dt + σ √r_t dW_t.
By simple inspection we can find the characteristics of both models:

         Vasíček      CIR
    K    (kγ, −k)     (kγ, −k)
    H    (σ², 0)      (0, σ²)
    l    (0, 0)       (0, 0)
    θ    1            1
    ρ    (0, 1)       (0, 1)

and then we can easily write down the associated GREs.
For the Vasicek model we have

β̇ = −1 − kβ
α̇ = kγβ + ½σ²β²
α(0) = 0, β(0) = 0,

and the solution is

β(t) = (1/k)(e^{−k(T−t)} − 1)
α(t) = (T−t)(σ²/(2k²) − γ) + (1 − e^{−k(T−t)})(γ/k − σ²/k³) + (σ²/(4k³))(1 − e^{−2k(T−t)}).
On the other hand, the associated GREs for the CIR model are

β̇ = −1 − kβ + ½σ²β²
α̇ = kγβ
α(0) = 0, β(0) = 0;
this is a classical homogeneous Riccati equation, and the following result holds:
Figure 3.1: Solution of the coefficients α and β for the Vasicek model, with T = 10, σ = 0.012, k = 0.05, γ = 0.03.
Lemma 3.2.1. Let us consider the following initial value problem (homogeneous Riccati equation):

ẋ = Ax² + Bx − C, x(0) = x₀ ≤ 0.

Then the unique solution of (3.2.1) is

x(t) = [−2C(e^{ρt} − 1) − (ρ(e^{ρt} + 1) + B(e^{ρt} − 1))x₀] / [(ρ − B)(e^{ρt} − 1) + 2ρ − 2A(e^{ρt} − 1)x₀],

where ρ := √(B² + 4AC), for A, C ≥ 0 and B ∈ R.
Therefore, setting a = √(k² + 2σ²), the solution is

β(t) = −2(e^{a(T−t)} − 1) / [(a + k)(e^{a(T−t)} − 1) + 2a]
α(t) = (2γk/σ²) ln[ 2a e^{(a+k)(T−t)/2} / ((a + k)(e^{a(T−t)} − 1) + 2a) ].
These results are classical; they can also be derived with an equilibrium approach (cf. the original papers [15], [53]). The existence of analytical solutions, together with (1.3), explains the popularity of these models. We can (and will) use these analytical solutions to test our numerical algorithms.
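As a concrete instance of this testing idea, the following sketch (our own illustration, not from the thesis; parameter values are arbitrary) integrates the CIR GRE for β with a classical fourth-order Runge–Kutta scheme, in time-to-maturity s = T − t, and compares the result with the closed-form solution above:

```python
import math

# Our own sketch (not from the thesis): test a numerical ODE solver against
# the closed-form CIR coefficient beta. In time-to-maturity s = T - t the
# CIR GRE reads beta'(s) = -1 - k*beta + 0.5*sigma^2*beta^2, beta(0) = 0,
# with closed form beta(s) = -2(e^{as}-1)/((a+k)(e^{as}-1)+2a), a = sqrt(k^2+2*sigma^2).
k, sigma, S = 0.05, 0.1, 10.0
a = math.sqrt(k * k + 2.0 * sigma * sigma)

def beta_closed(s):
    e = math.exp(a * s) - 1.0
    return -2.0 * e / ((a + k) * e + 2.0 * a)

def rhs(b):
    return -1.0 - k * b + 0.5 * sigma * sigma * b * b

# classical fourth-order Runge-Kutta integration of the GRE
n = 4000
h = S / n
b = 0.0
for _ in range(n):
    k1 = rhs(b)
    k2 = rhs(b + 0.5 * h * k1)
    k3 = rhs(b + 0.5 * h * k2)
    k4 = rhs(b + h * k3)
    b += (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

err = abs(b - beta_closed(S))
print(err)
```

The same pattern applies to α and to the Vasicek coefficients.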
3.3 Extending the transform
The transform (3.8) can be somewhat limiting for pricing purposes, since the payoff must have the particular form e^{v·X_T}. While it is trivial to observe that we can handle the case e^{u+v·X_T} simply by replacing the condition α(0) = 0 with α(0) = u, a bit more can actually be said.
3.3.1 Extended transform
Intuitively, we can exploit the approach used on page 19, differentiating both sides of (3.8) and moving the derivative through the expectation, to get

E_t[e^{−∫_t^T R(X_s)ds} (u·X_T) e^{v·X_T}] = e^{α(t)+β(t)·X_t} (A(t) + B(t)·X_t)   (3.12)

where A(t) and B(t) have the same dynamics as α, β, but different boundary conditions, in order to satisfy the identity between the RHS and the LHS of (3.12) for t = T.

Then A and B have to solve the following linear ODEs, obtained by differentiating both sides of (3.6):

Ḃ(t) = −K₁ᵀ B(t) − βᵀ(t) H₁ B(t) − l₁ ∇_x θ(β(t)) · B(t)
Ȧ(t) = −K₀ᵀ · B(t) − βᵀ(t) H₀ B(t) − l₀ ∇_x θ(β(t)) · B(t)
A(T) = 0, B(T) = u.   (3.13)
A fully fledged proof of this moves along the lines of the proof of Theorem 3.1.1, so we will only state the result, together with a necessary definition containing technical hypotheses fully analogous to the ones in Definition 3.4.

Definition 3.5. A characteristic (K, H, l, θ) is extended well-behaved at (u, v, T) if:

• (3.6) is solved uniquely by α and β;
• the Laplace jump transform θ is differentiable at β(t), t ≤ T (it suffices that ν is well defined and finite at β(t));
• (3.13) is solved uniquely by A and B;

and

1. E[∫₀^T |γ_t| dt] < ∞, with γ_t = λ(X_t)(Φ_t(θ(β(t)) − 1) + Ψ_t ∇_x θ(β(t)) · B(t));
2. E[∫₀^T η_t · η_t dt] < ∞, with η_t = Φ_t(βᵀ(t) + Bᵀ(t)) σ(X_t);
3. E[Φ_T] < ∞;

where Φ_t = Ψ_t(A(t) + B(t)·X_t).
Given Definition 3.5 , we have the extended result:
Theorem 3.3.1. Let (K,H, l, θ) be extended well-behaved, then (3.12) holds.
Of course all the considerations made for (3.6) hold also for (3.13).
3.3.2 Fourier transform inversion
The transform can be further extended to evaluate payoffs of the form (e^{d·X_T} − c)⁺, i.e. an option. Let us define

C(d, c, t, T) := E_t[e^{−∫_t^T R(X_s)ds} (e^{d·X_T} − c)⁺];

noting that (e^{d·X_T} − c)⁺ = (e^{d·X_T} − c) 1_{d·X_T ≥ ln c}, we can rewrite

C(d, c, t, T) = G_{d,−d}(−ln c; X_t, t, T) − c G_{0,−d}(−ln c; X_t, t, T)   (3.14)

denoting with

G_{a,b}(y; X_t, t, T) = E_t[e^{−∫_t^T R(X_s)ds} e^{a·X_T} 1_{b·X_T ≤ y}].   (3.15)
Performing the Fourier–Stieltjes transform (cf. Definition A.3) of G_{a,b}(y; X_t, t, T), denoted by Ĝ_{a,b}(v; X_t, t, T), we have

Ĝ_{a,b}(v; X_t, t, T) = ∫_R e^{ivy} dG_{a,b}(y; X_t, t, T);

differentiating through the expectation, exchanging integral and expectation, and observing that (d/dy) 1_{b·X_T ≤ y} = δ(b·X_T − y), we have

Ĝ_{a,b}(v; X_t, t, T) = E_t[e^{−∫_t^T R(X_s)ds} e^{(a+ivb)·X_T}] = ψ(a + ivb, x, t, T),

using Theorem 3.1.1.
Then, knowing the parameters of an option, we are able to determine easily the Fourier–Stieltjes transform of (3.14).

Theorem 3.3.2 (Transform inversion). Let (K, H, l, θ) be well-behaved at (a + ivb, T) for fixed T ∈ [0, +∞), a, b ∈ Rⁿ and any v ∈ R, and let

∫_R |ψ(a + ivb, x, t, T)| dv < ∞.   (3.16)

Then G_{a,b} is well defined by (3.15) and can be expressed as

G_{a,b}(y; X_t, t, T) = ψ(a, X_t, t, T)/2 − (1/π) ∫₀^∞ ℑ[ψ(a + ivb, X_t, t, T) e^{−ivy}] / v dv   (3.17)

where ℑ[c] denotes the imaginary part of any c ∈ C.
Proof. Let us fix y ∈ R; for any τ ∈ (0, ∞),

(1/2π) ∫_{−τ}^{τ} [e^{ivy} ψ(a−ivb, x, t, T) − e^{−ivy} ψ(a+ivb, x, t, T)] / (iv) dv
= (1/2π) ∫_{−τ}^{τ} ∫_R [e^{−iv(z−y)} − e^{iv(z−y)}] / (iv) dG_{a,b}(z; x, t, T) dv.

Since |e^{iv} − e^{iu}| ≤ |v − u| for all u, v ∈ R, we have

|e^{−iv(z−y)} − e^{iv(z−y)}| / |iv| ≤ 2|z − y|,

and since

lim_{y→+∞} G_{a,b}(y; x, t, T) = ψ(a, x, t, T) < ∞,   lim_{y→−∞} G_{a,b}(y; x, t, T) = 0,   (3.18)

we can use Fubini's theorem to exchange the order of integration, getting

(1/2π) ∫_R ∫_{−τ}^{τ} [e^{−iv(z−y)} − e^{iv(z−y)}] / (iv) dv dG_{a,b}(z; x, t, T).   (3.19)

Recalling Euler's formula, (e^{ia} − e^{−ia})/(2i) = sin(a) for all a ∈ R, we can write

(1/2π) [e^{−iv(z−y)} − e^{iv(z−y)}] / (iv) = −sin(v(z−y)) / (πv) = −sgn(z−y) sin(v|z−y|) / (πv),

therefore (3.19) can be rewritten as

−∫_R sgn(z−y) [∫_{−τ}^{τ} sin(v|z−y|) / (πv) dv] dG_{a,b}(z; x, t, T),   (3.20)

with the inner integral bounded for every τ, z and for a fixed y.

Recalling that ∫_R sin(αx)/x dx = π for all α > 0, the bounded convergence theorem yields

lim_{τ→∞} (1/2π) ∫_{−τ}^{τ} [e^{ivy} ψ(a−ivb, x, t, T) − e^{−ivy} ψ(a+ivb, x, t, T)] / (iv) dv
= −∫_R sgn(z−y) dG_{a,b}(z; x, t, T)
= −[∫_y^∞ dG_{a,b}(z; x, t, T) − ∫_{−∞}^{y−} dG_{a,b}(z; x, t, T)]
= −G_{a,b}(z; x, t, T)|_{z=y}^{z=∞} + G_{a,b}(z; x, t, T)|_{z=−∞}^{z=y−}.

Recalling the conditions (3.18) we get

= −ψ(a, x, t, T) + G_{a,b}(y; x, t, T) + G_{a,b}(y−; x, t, T),

where

G_{a,b}(y−; x, t, T) = lim_{z→y, z≤y} E_t[e^{−∫_t^T R(X_s)ds} e^{a·X_T} 1_{b·X_T ≤ z}]|_{X_t = x}.

Using (3.16), by the dominated convergence theorem we have

G_{a,b}(y−; x, t, T) = G_{a,b}(y; x, t, T);

then

G_{a,b}(y; x, t, T) = ψ(a, x, t, T)/2 + (1/4π) ∫_{−∞}^{∞} [e^{ivy} ψ(a−ivb, x, t, T) − e^{−ivy} ψ(a+ivb, x, t, T)] / (iv) dv.

Then, observing that e^{ivy} ψ(a−ivb, x, t, T) is the complex conjugate of e^{−ivy} ψ(a+ivb, x, t, T), we get (3.17) using the fact that, for any c = ℜ[c] + iℑ[c] ∈ C, c* − c = ℜ[c] − iℑ[c] − ℜ[c] − iℑ[c] = −2iℑ[c], where c* denotes the complex conjugate of c.
Remark 6. This result is of little practical utility when ψ is only characterized via the GREs, since each evaluation of the integrand in

∫₀^∞ ℑ[ψ(a + ivb, X_t, t, T) e^{−ivy}] / v dv

requires the solution of a set of complex GREs (the boundary condition of the GREs depends on v), which can be painful from the computational point of view. The case is different when the transform is explicitly known; then (3.17) only requires the computation of a complex integral.
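When ψ is explicit, (3.17) is indeed a routine numerical integral. A minimal sketch (our own check, not from the thesis): take R = 0, a = 0, b = 1 and X_T standard normal, so that ψ(ivb) = e^{−v²/2}; then (3.17) must reproduce the standard normal distribution function (the classical Gil-Pelaez inversion):

```python
import math
import numpy as np

# Our own sketch of the inversion formula (3.17) in the simplest explicit
# case: R = 0, a = 0, b = 1 and X_T ~ N(0,1), so psi(i v) = exp(-v^2/2)
# and G_{0,1}(y) must equal the standard normal CDF.
def G(y, v_max=40.0, n=20001):
    v = np.linspace(1e-8, v_max, n)
    integrand = np.imag(np.exp(-0.5 * v**2 - 1j * v * y)) / v
    dv = v[1] - v[0]
    integral = np.sum((integrand[1:] + integrand[:-1]) * 0.5) * dv  # trapezoid
    return 0.5 - integral / math.pi

def normal_cdf(y):
    return 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))

for y in (-1.0, 0.0, 0.7):
    print(y, abs(G(y) - normal_cdf(y)))
```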
3.3.3 Fourier representation
Theorem 3.1.1 allows us to price a payoff of the form e^{u·X_T}, u ∈ Cⁿ, and it is straightforward to apply the transform to the class of functions that admit an integral representation, like the Fourier or Laplace ones.

Proposition 3.3.3. Let the hypotheses of Theorem 3.1.1 hold and suppose that the payoff of a contingent claim admits a multidimensional Fourier representation, with ω, ω₀ ∈ Rⁿ,

f(X_T) = ∫_{Rⁿ} e^{(ω₀ + iω)·X_T} F(ω) dω.

If E_t[e^{−∫_t^T R(X_s)ds + ω₀·X_T}] < ∞, then the price can be expressed by the formula

S_t = ∫_{Rⁿ} ψ(ω₀ + iω, X_t, t, T) F(ω) dω.   (3.21)
Proof.

S_t = E_t[e^{−∫_t^T R(X_s)ds} ∫_{Rⁿ} e^{(ω₀+iω)·X_T} F(ω) dω].

Using Fubini's theorem,

S_t = ∫_{Rⁿ} E_t[e^{−∫_t^T R(X_s)ds} e^{(ω₀+iω)·X_T} F(ω)] dω = ∫_{Rⁿ} E_t[e^{−∫_t^T R(X_s)ds} e^{(ω₀+iω)·X_T}] F(ω) dω.

Then, by Theorem 3.1.1, the result follows.
Remark 7. As with Fourier inversion, (3.21) can be very hard to evaluate, since it requires integration over Rⁿ and the integrand itself is expensive to evaluate: for each ω, the whole trajectory of the GREs has to be recomputed.
3.3.4 Time dependence and multiple jump types
All the proofs so far were carried out with the process (Xt)t≥0 defined by the generator (3.1). We can redefine D as a subset of Rⁿ × [0, ∞) and take the infinitesimal generator

Df(x) = ∂f/∂t + ∇_x f · μ(x,t) + ½ ∇²_x f : (σ(x,t)σᵀ(x,t)) + Σ_{i=1}^{m} λ_i(x,t) ∫_{Rⁿ} [f(x+z) − f(x)] dν_t^i(z),

with f : D → R a sufficiently smooth function and ν_t^i(z), i = 1,…,m, time-dependent jump distributions. If we suppose that all the quantities listed above are continuous and bounded with respect to t on the interval [0, ∞), set θ_i(c,t) := ∫_{Rⁿ} e^{c·z} dν_t^i(z) and

μ(x,t) = K₀(t) + K₁(t)x, K(t) := (K₀(t), K₁(t)) ∈ (Rⁿ, R^{n×n}),
(σ(x,t)σᵀ(x,t))_{ij} = (H₀(t))_{ij} + (H₁(t))_{ij} · x, H(t) := (H₀(t), H₁(t)) ∈ (R^{n×n}, R^{n×n×n}),
λ_i(x,t) = l₀^i(t) + l₁^i(t)·x, l^i(t) := (l₀^i(t), l₁^i(t)) ∈ (R, Rⁿ),   (3.22)

then Theorems 3.1.1 and 3.3.1 still hold, simply replacing l₀(θ(c) − 1) and l₁(θ(c) − 1) in the GREs with Σ_{i=1}^{m} l₀^i(θ_i(c,t) − 1) and Σ_{i=1}^{m} l₁^i(θ_i(c,t) − 1).
3.4 An optimization idea
Let us consider a firm issuing a contingent claim as a form of financing, with a fixed maturity date T and a payoff S_T = F(v) = e^{v·X_T}. The payoff then depends on the linear combination, with coefficients v_i, i = 1,…,n, of the n econometric variables X_T^i. Up to now we have treated the coefficients v as given, but a question arises:

How should the firm choose the coefficients v?

A possible answer is: the firm should choose v so as to raise the maximum amount of money by selling the claim. In other words,

max_{v∈V} S₀ := E₀[e^{−∫₀^T R(X_s)ds} e^{v·X_T}],

where V ⊂ Rⁿ is the (closed) set of all admissible v.
This can be written, using the affine framework, as an optimization problem:

max_{v∈V} α(T) + X₀·β(T)
s.t.
β̇ = −ρ₁ + K₁ᵀβ + ½βᵀH₁β + l₁(θ(β) − 1)
α̇ = −ρ₀ + K₀ᵀ·β + ½βᵀH₀β + l₀(θ(β) − 1)
α(0) = 0
β(0) = v   (3.23)
A more reasonable choice criterion, since the dynamics are nonlinear, would be to look for a static control v that maximises the income/outgo ratio, that is

max_{v∈V} E₀[e^{−∫₀^T R(X_s)ds} e^{v·X_T}] / E₀[e^{v·X_T}].

Of course the maximum attainable value is 1, and this can be seen as the firm looking for a way to pay as little interest as possible on the bond.
In our friendly affine context, this can be written as

max_{v∈V} α(T) − ᾱ(T) + X₀·(β(T) − β̄(T))
s.t.
β̇ = −ρ₁ + K₁ᵀβ + ½βᵀH₁β + l₁(θ(β) − 1)
α̇ = −ρ₀ + K₀ᵀ·β + ½βᵀH₀β + l₀(θ(β) − 1)
β̄̇ = K₁ᵀβ̄ + ½β̄ᵀH₁β̄ + l₁(θ(β̄) − 1)
ᾱ̇ = K₀ᵀ·β̄ + ½β̄ᵀH₀β̄ + l₀(θ(β̄) − 1)
α(0) = ᾱ(0) = 0
β(0) = β̄(0) = v   (3.24)
It is possible to show that those problems are well posed.
Proposition 3.4.1. Let V ≠ ∅ be a closed and bounded set. Then a solution to problem (3.24) exists.

Proof. By the Heine–Borel theorem, V is a compact set. The solutions of the GREs can be expressed by functions α(v,t), ᾱ(v,t), β(v,t), β̄(v,t) which, by Theorems 4.1.2–4.1.3, are continuous with respect to v.

Then, by Weierstrass' theorem, the continuous function v ↦ α(v,T) − ᾱ(v,T) + X₀·(β(v,T) − β̄(v,T)) attains an extremal value over V.

This approach can be extended to any setting in which the objective function can be expressed in a form tractable with affine processes.
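A minimal numerical sketch of problem (3.23) (our own illustration; the one-dimensional Vasicek characteristics and all parameter values are assumptions chosen for the example) integrates the GREs with Runge–Kutta for each candidate v on a grid and picks the best one:

```python
import numpy as np

# Our own sketch of (3.23) for a one-dimensional Vasicek-type driver
# (illustrative parameters): integrate the GREs by RK4 with beta(0) = v
# and grid-search the admissible set V = [-1, 0].
k, gamma, sigma, rho0, rho1 = 0.05, 0.03, 0.012, 0.0, 1.0
T, X0 = 10.0, 0.03

def gre_rhs(y):
    alpha, beta = y
    return np.array([-rho0 + k * gamma * beta + 0.5 * sigma**2 * beta**2,
                     -rho1 - k * beta])

def solve_gre(v, n=1000):
    h = T / n
    y = np.array([0.0, v])                 # (alpha(0), beta(0)) = (0, v)
    for _ in range(n):                     # classical RK4 steps
        k1 = gre_rhs(y)
        k2 = gre_rhs(y + 0.5 * h * k1)
        k3 = gre_rhs(y + 0.5 * h * k2)
        k4 = gre_rhs(y + h * k3)
        y = y + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return y                               # (alpha(T), beta(T))

grid = np.linspace(-1.0, 0.0, 101)
values = []
for v in grid:
    aT, bT = solve_gre(v)
    values.append(aT + X0 * bT)            # objective alpha(T) + X0 * beta(T)
best = grid[int(np.argmax(values))]
print(best)
```

For this particular parameter set the objective is increasing in v, so the grid search selects the right endpoint of V.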
3.5 A more general result
As said in Section 3.1, the state space is taken to be D := R₊^m × R^{n−m}, m = 1,…,n. We will denote by Sem_n the space of n×n positive semi-definite symmetric matrices, and set I := {1,…,m} and Ī := {m+1,…,n}. Moreover, in this section we will denote

a · b = a₁b₁ + … + a_n b_n, a, b ∈ Cⁿ,

which is not the standard inner product on Cⁿ.
Definition 3.6. A Markov process (Xt)t≥0 is called regular affine if its characteristic function has exponential-affine dependence on the initial state, i.e. for t ∈ R₊, u ∈ iRⁿ, there exist φ(t,u) ∈ C and ψ(t,u) ∈ Cⁿ such that for all x ∈ D

E[e^{u·X_t} | X₀ = x] = e^{φ(t,u) + ψ(t,u)·x}.   (3.25)

Moreover, the functions φ and ψ are continuous in t, and the derivatives ∂ψ/∂t(t,u)|_{t=0+} and ∂φ/∂t(t,u)|_{t=0+} exist and are continuous at u = 0.
It has been shown that regular affine processes are fully characterized by the following result.

Theorem 3.5.1 (Theorem 2.7, [22]). A regular affine process is a Feller process with infinitesimal generator

Df(x) = ∇²_x f : A(x) + ∇_x f · B(x) − C(x)f(x) + ∫_{D\{0}} (f(x+ξ) − f(x) − ∇_x f · χ(ξ)) M(x, dξ)   (3.26)

for f ∈ C²_c(D), where

A(x) = a + Σ_{i=1}^{m} x_i α_i,  a, α_i ∈ R^{n×n},
B(x) = b + Σ_{i=1}^{n} x_i β_i,  b, β_i ∈ Rⁿ,
C(x) = c + Σ_{i=1}^{m} x_i γ_i,  c, γ_i ∈ R₊,
M(x, dξ) = m(dξ) + Σ_{i=1}^{m} x_i μ_i(dξ),  m, μ_i Radon measures on D\{0},   (3.27)

and χ : Rⁿ → Rⁿ, χ_i(ξ) = min(1, |ξ_i|) sgn(ξ_i). In order to ensure that the process does not leave D (recall I = {1,…,m}, Ī = {m+1,…,n}):

• a ∈ Sem_n, with (a)_{ii} = 0, i ∈ I.
• α_j ∈ Sem_n, with (α_j)_{kk} = 0, j, k ∈ I, k ≠ j.
• b ∈ D.
• (β_j)_i = 0, i ∈ I, j ∈ Ī; (β_i)_j ∈ R₊, i, j ∈ I, i ≠ j.
• ∫_{D\{0}} (Σ_{i∈I} χ_i(ξ) + Σ_{j∈Ī} |χ_j(ξ)|²) m(dξ) < ∞.
• ∫_{D\{0}} (Σ_{j∈I\{i}} χ_j(ξ) + Σ_{j∈Ī∪{i}} |χ_j(ξ)|²) μ_i(dξ) < ∞, i ∈ I.

Furthermore, ψ and φ in (3.25) solve the generalized Riccati equations

∂φ/∂t (t,u) = F(ψ(t,u)),
∂ψ/∂t (t,u) = R(ψ(t,u)),
φ(0,u) = 0,
ψ(0,u) = u,   (3.28)

where

F(u) = au·u + b·u − c + ∫_{D\{0}} (e^{u·ξ} − 1 − u·χ(ξ)) m(dξ),
R_i(u) = α_i u·u + β_i·u − γ_i + ∫_{D\{0}} (e^{u·ξ} − 1 − u·χ(ξ)) μ_i(dξ), i ∈ I,
R_i(u) = β_i·u, i ∈ Ī.   (3.29)

Conversely, for any choice of admissible parameters a, α_i, b, β_i, c, γ_i, m, μ_i, there exists a unique regular affine process with generator (3.26).
Remark 8. Let β ∈ R^{n×n} be the matrix whose i-th column is β_i; then β has the block structure

β = [ β_{II}  0 ; β_{ĪI}  β_{ĪĪ} ]   (3.30)

where the m×m block β_{II} has arbitrary real diagonal entries (∗) and nonnegative off-diagonal entries (+), the m×(n−m) upper-right block is zero, and the bottom blocks are arbitrary real (∗).

Then the Ī-components of the ψ-equation in (3.28) form a linear autonomous system that can be solved separately, yielding

ψ_j(t,u) = (e^{β̄t} w)_{j−m}, j ∈ Ī,
∂ψ_i/∂t (t,u) = R_i(ψ(t,u)), i ∈ I,
φ(t,u) = ∫₀^t F(ψ(s,u)) ds,
ψ(0,u) = u,   (3.31)

where w ∈ R^{n−m} is the vector containing the last n−m components of u, β̄ ∈ R^{(n−m)×(n−m)} is the bottom-right submatrix of β and e^{β̄t} = Σ_{k=0}^{∞} (β̄t)^k / k!.
Remark 9. A is the diffusion matrix σσᵀ. The restrictions on a and the α_i imply that (a)_{kl} = (a)_{lk} = (α_j)_{kl} = (α_j)_{lk} = 0 for j, k ∈ I, k ≠ j, imposing a rigid dependence structure on the matrix A. E.g., for n = 3: if m = 0, A is an arbitrary positive semi-definite matrix and cannot depend on any component of (Xt)t≥0. For 1 ≤ m ≤ 3 the patterns of a and α_i, i ∈ I, are

m = 1:  a = [0 0 0; 0 + ∗; 0 ∗ +],  α₁ = [+ ∗ ∗; ∗ + ∗; ∗ ∗ +];

m = 2:  a = [0 0 0; 0 0 0; 0 0 +],  α₁ = [+ 0 ∗; 0 0 0; ∗ 0 +],  α₂ = [0 0 0; 0 + ∗; 0 ∗ +];

m = 3:  a = 0,  α₁ = [+ 0 0; 0 0 0; 0 0 0],  α₂ = [0 0 0; 0 + 0; 0 0 0],  α₃ = [0 0 0; 0 0 0; 0 0 +];

where + denotes a nonnegative number and ∗ a number such that positive semi-definiteness holds. For m = n, A is diagonal with nonnegative elements and has a straightforward square root; then a regular affine process is Rⁿ₊-valued if and only if it has a multifactor CIR diffusion σ(X). A small code was written to plot the structure of those matrices (cf. Appendix D).
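We do not reproduce Appendix D here, but a hedged sketch of such pattern-plotting code, implementing only the zero/nonnegative/free pattern implied by the admissibility conditions above, could look as follows:

```python
import numpy as np

# Hedged sketch in the spirit of the Appendix D code (which we do not
# have): mark each entry of a and alpha_i as '0' (forced zero),
# '+' (nonnegative) or '*' (free up to positive semi-definiteness),
# following the admissibility conditions of Theorem 3.5.1.
def diffusion_patterns(n, m):
    I = set(range(m))                      # coordinates restricted to R_+
    a = np.full((n, n), '*')
    for k in range(n):
        for l in range(n):
            if k in I or l in I:           # rows/columns in I vanish in a
                a[k, l] = '0'
    for k in range(n):
        if a[k, k] == '*':
            a[k, k] = '+'                  # diagonal entries nonnegative
    alphas = []
    for i in range(m):
        ai = np.full((n, n), '*')
        for k in range(n):
            for l in range(n):
                if (k in I and k != i) or (l in I and l != i):
                    ai[k, l] = '0'         # only row/column i survives in I
        for k in range(n):
            if ai[k, k] == '*':
                ai[k, k] = '+'
        alphas.append(ai)
    return a, alphas

a, alphas = diffusion_patterns(3, 2)       # reproduces the m = 2 case above
print(a)
print(alphas[0])
```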
Remark 10. The parameters in (3.27) are not time dependent. As in the simpler case of jump-diffusion processes, the setting can be extended to the time-dependent case, assuming that all parameters are continuous in time; for further details we refer to [27].

The reader may already have noticed that equations (3.28) are similar to (3.7) if C(x) = 0, but that the discounting part is missing; to clarify this, the following result was shown in [22], Section 11:
Proposition 3.5.2 (Proposition 11.2, [22], 45). Let (Xt)t≥0 be an affine process with c = 0 and γ_i = 0 (cf. Theorem 3.5.1), and let (x, r) ∈ D × R. Then

E[e^{qR_t^r + u·X_t} | X₀ = x] = e^{φ′(t,u,q) + ψ′(t,u,q)·x + qr},

where R_t^r = r + ∫₀^t (l + λ·X_s) ds, l, r ∈ R, λ ∈ Rⁿ, q ∈ iR, and ψ′, φ′ solve

∂φ′/∂t (t,u,q) = F(ψ′(t,u,q)) + lq,
∂ψ′/∂t (t,u,q) = R(ψ′(t,u,q)) + λq,
φ′(0,u,q) = 0,
ψ′(0,u,q) = u.   (3.32)

Although the case of interest, q = −1, is not strictly covered by this proposition (iR := {c ∈ C : ℜ[c] = 0}), it can be extended to include q = −1 as well if

E[e^{−R_t^r} | X₀ = x] < ∞, ∀x ∈ D,

which is always satisfied if we assume R_t^r to be positive (as is usual in financial applications, since R_t^r plays the role of a discount rate).
Then, in the light of Proposition 3.5.2, Theorem 3.5.1 fully extends the results given in Section 3.1, providing existence, restrictions on the parameters, and allowing for infinite-activity jumps.
3.5.1 Diagonal diffusion matrix
It is common in the literature to consider σ a diagonal matrix (e.g. [16]), but imposing instantaneously uncorrelated diffusions can lead to a loss of generality, with a poor fit to data as a possible consequence. Suppose we have an affine jump-diffusion process (Xt)t≥0 with state space D := R₊^m × R^{n−m}, and a linear transformation Λ : D → D, where Λ ∈ R^{n×n} is a nonsingular matrix. Applying the Ito formula to Y_t = ΛX_t, we get

dY_t = (Λb + ΛβΛ⁻¹Y_t) dt + Λσ(Λ⁻¹Y_t) dW_t + Λ dZ_t,

which is still affine, and b′ := Λb, β′ := ΛβΛ⁻¹, m′(dξ) := m(Λ dξ) and σ′ := Λσ(Λ⁻¹Y_t) satisfy the conditions of Theorem 3.5.1. If we look at the diffusion matrix of (Yt)t≥0 we get

σ′(σ′)ᵀ = ΛaΛᵀ + Λα₁Λᵀ x₁ + … + Λα_m Λᵀ x_m.

If we are able to find a nonsingular matrix Λ such that ΛaΛᵀ, Λα₁Λᵀ, …, Λα_mΛᵀ are diagonal, then we can assume a diagonal diffusion matrix without loss of generality.
To partially clarify the matter, there is the following result:

Theorem 3.5.3 ([12]). Let (Xt)t≥0 be an affine jump-diffusion process with state space D and diffusion matrix A : D → R^{n×n}. If m ≤ 1 or m ≥ n−1, then there exists a regular n×n matrix Λ, with Λ : D → D, such that ΛaΛᵀ, Λα₁Λᵀ, …, Λα_mΛᵀ are diagonal.

In particular, for n ≤ 3 one of those conditions is always satisfied.
Proof. The proof is essentially constructive, and deals separately with the four cases m = 0, m = 1, m = n−1, m = n. Let us denote by e_i the i-th element of the standard basis of Rⁿ. For summations we use the Einstein convention, i.e. if an index is repeated twice, the expression is summed over all the possible values of that index; we refer to [52] for more details. Finally, we recall that Λα_kΛᵀ is diagonal if and only if

(Λα_kΛᵀ)_{ij} = e_i · Λα_kΛᵀ e_j = Λᵀe_i · α_k Λᵀe_j = 0, k = 0,…,m, i, j = 1,…,n, i ≠ j.

Case m = 0: we have only a, an arbitrary positive semi-definite matrix; then there exists an orthogonal matrix Λ such that ΛaΛᵀ is diagonal.
Case m = 1: we have a and α₁, with

a = [0 0; 0 A],

where A is an (n−1)×(n−1) positive semi-definite matrix. Considering α₁, there are two cases: (α₁)₁₁ = 0 and (α₁)₁₁ > 0. If (α₁)₁₁ > 0 we define

(ᾱ₁)_{ij} = (α₁)_{ij} − (α₁)_{i1}(α₁)_{1j} / (α₁)₁₁,

so that (ᾱ₁)_{i1} = (ᾱ₁)_{1j} = 0, i, j = 1,…,n; from the definition of ᾱ₁ it follows that

x·ᾱ₁x = x_i(ᾱ₁)_{ij}x_j = x_i(α₁)_{ij}x_j − x_i(α₁)_{i1}(α₁)_{1j}x_j/(α₁)₁₁ = x_i(α₁)_{ij}x_j − (x_i(α₁)_{ik}δ_{k1})(δ_{k1}(α₁)_{kj}x_j)/(α₁)₁₁,   (3.33)

where δ_{ij} is the Kronecker delta. Since (e_i)_j = δ_{ij} and α₁ is symmetric, it is straightforward to recognize that

x·ᾱ₁x = x·α₁x − (x·α₁e₁)²/(α₁)₁₁.

By the Cauchy–Schwarz inequality, x·α₁e₁ ≤ √(x·α₁x)·√(e₁·α₁e₁); hence ᾱ₁ is also positive semi-definite. On the other hand, if (α₁)₁₁ = 0, we set ᾱ₁ := α₁. In both cases ᾱ₁ is of the form

ᾱ₁ = [0 0; 0 B],

with B positive semi-definite. Then by Theorem 8.7.1 of [30] there exists an orthogonal matrix Q ∈ R^{(n−1)×(n−1)} such that QAQᵀ and QBQᵀ are both diagonal. Let us consider the matrix

Λ := [1 0 … 0; Λ₂₁ ; ⋮ Q ; Λ_{n1}]   (3.34)

(first row (1, 0, …, 0), first column (1, Λ₂₁, …, Λ_{n1})ᵀ and bottom-right block Q), where the (Λ)_{k1} are chosen such that Λᵀe_k · α₁e₁ = (Λ)_{kj}(α₁)_{j1} = 0, k ≥ 2. Dropping the summation convention for a moment, that means computing

Λ_{k1} = −[Σ_{j≥2} (α₁)_{j1}(Λ)_{kj}] / (α₁)₁₁, k ≥ 2,

if (α₁)₁₁ > 0, while the condition is always satisfied if (α₁)₁₁ = 0 (in which case we set Λ_{k1} = 0). With such a Λ both ΛaΛᵀ and Λᾱ₁Λᵀ are diagonal, and if (α₁)₁₁ = 0 then Λα₁Λᵀ = Λᾱ₁Λᵀ is diagonal as well. Otherwise, if (α₁)₁₁ > 0, reasoning in a way similar to (3.33), we have

Λᵀe_i · α₁Λᵀe_j = e_i · Λᾱ₁Λᵀe_j + (Λᵀe_i · α₁e₁)(Λᵀe_j · α₁e₁)/(α₁)₁₁ = 0, i ≠ j,

i.e. Λα₁Λᵀ is diagonal. In addition, it is straightforward to see from (3.34) that Λ : R₊ × R^{n−1} → R₊ × R^{n−1}.
Case m = n−1: here the only nonzero element of a is (a)_{nn} ≥ 0, while, for i = 1,…,n−1, the nonzero elements of α_i are (α_i)_{ii}, (α_i)_{in}, (α_i)_{ni} and (α_i)_{nn}. We take Λ equal to the identity matrix except for the last row:

Λ = [1 0 … 0; 0 ⋱ ⋮; ⋮ 1 0; Λ_{n1} … Λ_{n,n−1} Λ_{nn}],   (3.35)

where

Λ_{ni} = −(α_i)_{ni}/(α_i)_{ii}, 1 ≤ i ≤ n−1, if (α_i)_{ii} > 0,
Λ_{ni} = 0, 1 ≤ i ≤ n−1, if (α_i)_{ii} = 0,
Λ_{nn} = 1.   (3.36)

Since (a)_{nn} is the only nonzero entry of a and Λ_{kn} = δ_{kn}, ΛaΛᵀ = a is diagonal. To verify that such a Λ diagonalizes the α_i, let us write

(Λα_iΛᵀ)_{lj} = (Λ)_{lh}(α_i)_{hk}(Λ)_{jk}, i = 1,…,n−1, j ≠ l.

Writing the summation explicitly (h, k ∈ {i, n}) and recalling the restrictions on α_i, we have

(Λ)_{li}(α_i)_{ii}(Λ)_{ji} + (Λ)_{li}(α_i)_{in}(Λ)_{jn} + (Λ)_{ln}(α_i)_{ni}(Λ)_{ji} + (Λ)_{ln}(α_i)_{nn}(Λ)_{jn}, i = 1,…,n−1, j ≠ l.

For j, l < n, j ≠ l, all the components of Λ involved in the summation vanish. It remains to check l = n, j < n (and, by symmetry, j = n, l < n), where the summation reduces to (Λ)_{jj}[Λ_{ni}(α_i)_{ii} + (α_i)_{ni}]δ_{ji}, which is 0 by the definition of Λ_{ni} (if (α_i)_{ii} = 0, positive semi-definiteness forces (α_i)_{ni} = 0 as well). It is also clear that Λ : R₊^{n−1} × R → R₊^{n−1} × R.

Case m = n: a is filled with zeroes, and the only nonzero element of α_i, i = 1,…,n, is (α_i)_{ii}. Then Λ can be taken to be the identity matrix.
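The m = n−1 construction can be checked numerically; the following sketch (our own test with random positive semi-definite data, not part of the thesis) builds Λ as in (3.35)–(3.36) and verifies that every Λα_iΛᵀ is diagonal up to floating-point error:

```python
import numpy as np

# Our own numerical sanity check of the m = n-1 construction (3.35)-(3.36):
# with alpha_i supported on the index pair {i, n} (0-based: {i, n-1}),
# Lambda equal to the identity except for the last row, with
# Lambda[n-1, i] = -alpha_i[n-1, i] / alpha_i[i, i], must make every
# Lambda alpha_i Lambda^T diagonal. The matrices below are random test data.
rng = np.random.default_rng(0)
n = 4
alphas = []
for i in range(n - 1):
    B = rng.normal(size=(2, 2))
    block = B @ B.T                        # random 2x2 PSD block
    ai = np.zeros((n, n))
    idx = [i, n - 1]
    ai[np.ix_(idx, idx)] = block           # place it on coordinates {i, n-1}
    alphas.append(ai)

Lam = np.eye(n)
for i in range(n - 1):
    Lam[n - 1, i] = -alphas[i][n - 1, i] / alphas[i][i, i]

def offdiag_max(M):
    return np.abs(M - np.diag(np.diag(M))).max()

off = max(offdiag_max(Lam @ ai @ Lam.T) for ai in alphas)
print(off)
```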
Remark 11. In practice, we do not need to know the matrix Λ explicitly; it suffices to know that we can use a diagonal diffusion matrix without fear of losing the ability to match important features of the data. On the other hand, for an Rⁿ₊-valued affine process the diagonal diffusion matrix is the only admissible one.

Remark 12. It was shown in [12] that there exists an affine process in R²₊ × R² whose diffusion matrix cannot be diagonalized with a regular matrix Λ. Therefore, care has to be taken with the assumption of instantaneously uncorrelated state variables in the cases not covered by Theorem 3.5.3.
3.6 More examples
We will present two affine jump-diffusion models, showing how simple generalizations can make solving the GREs explicitly much more difficult.

We will also give a hint of how the affine framework works in our credit risk setting (anticipating the treatment given in Chapter 5), introducing a complete 4-dimensional model which features mean reversion and stochastic volatility.
3.6.1 CIR with jumps
Let (Xt)t≥0 be an R₊-valued Markov process with generator

Df(x) = σ²x ∂²f(x)/∂x² + k(γ − x) ∂f(x)/∂x + (l/d) ∫_{R₊} (f(x+z) − f(x)) e^{−z/d} dz.   (3.37)

The process is a jump-diffusion CIR model, with jump sizes exponentially distributed with mean d and with jump intensity l. The jump transform is

θ(s) = (1/d) ∫_{R₊} e^{sz} e^{−z/d} dz = 1/(1 − ds), ℜ[s] < 1/d.

Therefore the GREs associated with the problem E_t^Q[e^{−∫_t^T (ρ₀ + ρ₁X_{u−})du} e^{v + uX_T}] are

β̇ = −ρ₁ − kβ + ½σ²β²,
α̇ = −ρ₀ + kγβ + l·dβ/(1 − dβ),
α(0) = v,
β(0) = u.
These equations admit a closed-form solution (we used Mathematica for the formal calculations):

β(t) = (1 + a₁e^{b₁(T−t)}) / (c₁ + d₁e^{b₁(T−t)}),
α(t) = v + (kγ/c₁ + l/c₂ − l − ρ₀)(T−t) + [kγ(a₁c₁ − d₁)/(b₁c₁d₁)] log[(c₁ + d₁e^{b₁(T−t)})/(c₁ + d₁)] + [l(a₂c₂ − d₂)/(b₂c₂d₂)] log[(c₂ + d₂e^{b₂(T−t)})/(c₂ + d₂)],

where the coefficients are

c₁ = (k + √(k² + 2σ²ρ₁))/(−2ρ₁),
d₁ = (1 − c₁u)(−k + σ²u + √(k² + 2σ²ρ₁))/(2u + σ²u² − 2ρ₁),
a₁ = (d₁ + c₁)u − 1,
b₁ = −[d₁(k + 2ρ₁c₁) + a₁(σ² − kc₁)]/(a₁c₁ − d₁),
a₂ = d₁/c₁,
b₂ = b₁,
c₂ = 1 − dc₁,
d₂ = (d₁ − da₁)/c₁.

This model can be used to model default intensities or the short rate.
3.6.2 Bates model
The Bates model was presented in [2] to extend the Heston model, allowing jumps in the log-price; it is described by the following SDE:

dS_t/S_t = μ_S dt + √(V_t) dW_t^S + dJ_t,
dV_t = k_V(γ_V − V_t) dt + σ_V √(V_t) dW_t^V.   (3.38)

This model features a stochastic volatility V_t, which is modelled with a mean-reverting CIR process, and jumps in the price S_t, with intensity λ and jump distribution f, while the Brownian motions (W_t^S)_{t≥0} and (W_t^V)_{t≥0} are correlated with constant coefficient ρ. At first glance the process is not affine, but if we perform the transformation Y_t = ln S_t, applying the Ito formula to (3.38) we get

dY_t = (μ_S − ½V_t) dt + √(V_t) dW_t^S + dJ̄_t,
dV_t = k_V(γ_V − V_t) dt + σ_V √(V_t) dW_t^V,   (3.39)

where

J̄_t = Σ_{i=1}^{N_t} ln(1 + ΔS_{T_i}/S_{T_i−}).
Model (3.39) is affine, with characteristics

μ = (μ_S, k_V γ_V)ᵀ + [0 −½; 0 −k_V] (Y_t, V_t)ᵀ,
σσᵀ = [1 ρσ_V; ρσ_V σ_V²] V_t,
θ(c) = ∫_R e^{cz} f̄(z) dz,

where f̄ is the jump-size distribution of (J̄_t)_{t≥0}. Usually f is chosen so that (J̄_t)_{t≥0} has a desired jump distribution; e.g. in [2] the jump sizes of (J_t)_{t≥0} are log-normally distributed, so that the jump sizes of (J̄_t)_{t≥0} are normally distributed, a case in which explicit solutions are available.
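For normally distributed jump sizes the jump transform is explicit, θ(c) = e^{cμ_J + c²σ_J²/2} for jump sizes Z ∼ N(μ_J, σ_J²); a quick Monte Carlo sanity check (our own, with illustrative parameter values):

```python
import numpy as np

# Our own sanity check with illustrative parameters: for normal jump sizes
# Z ~ N(muJ, sJ^2) in the log-price, the jump transform is explicit,
# theta(c) = E[e^{cZ}] = exp(c*muJ + 0.5*c^2*sJ^2); compare with Monte Carlo.
muJ, sJ, c = -0.1, 0.15, 2.0
rng = np.random.default_rng(42)
Z = rng.normal(muJ, sJ, size=2_000_000)
mc = np.exp(c * Z).mean()
exact = np.exp(c * muJ + 0.5 * c**2 * sJ**2)
print(abs(mc - exact) / exact)
```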
Remark 13. Let us model the price of a defaultable claim with the Bates model; then the price of that claim would be, using (I.3):

E_t^Q[e^{−∫_t^T r_s ds} S_T 1_{τ≥T}] = E_t^Q[e^{−∫_t^T (r_s + λ_s) ds} e^{Y_T}];

if we model (rt)t≥0 and (λt)t≥0 with one-dimensional affine processes (e.g. CIR with jumps), then the price of this claim falls within the cases handled by Theorem 3.1.1. More precisely, we have an affine process (Xt)t≥0 in R³₊ × R, with X_t = (V_t, r_t, λ_t, Y_t) and coefficients (cf. Definition 3.4): u = (0, 0, 0, 1), ρ₀ = 0, ρ₁ = (0, 1, 1, 0), while the remaining coefficients depend on the choices made for (rt)t≥0 and (λt)t≥0. We will return to this complete example in Chapter 6.

We want to point out that, for such a model, the hypotheses of Theorem 3.5.3 hold, and therefore, for calibration purposes, the diffusion part of the model can be taken diagonal.
3.7 Further considerations and references
Here we mention two topics that are of practical interest but were not treated in detail, and give references for the interested reader.
3.7.1 Infinite activity vs. finite activity
We have seen, in Sections 3.1 and 3.5, similar results for two classes of processes: affine jump-diffusion processes and regular affine processes, respectively.

The difference between these processes lies basically in the jump part: the former are ordinary diffusion processes punctuated by "rare" jumps, which can represent sudden events like upheavals, crashes, discoveries, etc.; the latter, on the other hand, exhibit infinitely many small jumps, which can "move" the process even without a diffusion part.

Affine jump-diffusion processes are easy to simulate, say for a Monte Carlo method, and the dynamical structure of the process is easy to understand and describe, since the distribution of the jump sizes is known. They are often used for implied-volatility smile interpolation, as in [23]; for more on this subject we refer to [14], Chapter 13.

On the other hand, for infinite-activity processes the familiar concept of a jump distribution does not apply, and they are less trivial to simulate, but in return they are considered able to reproduce historical price data in a realistic way.

The choice between compound Poisson processes and infinite-activity processes is thus a matter of modelling preference, and there is no ultimate answer.
3.7.2 Statistical estimation
One important aspect of any model is calibration, i.e. finding the parameters of the model that reflect real market conditions. A wide class of techniques is based on the inversion of the characteristic function

Φ(u, X_t, t, T) = E_t[e^{iu·X_T}], u ∈ Rⁿ.

Let us consider the characteristic function Φ(u, X_{t_n}, t_n, t_{n+1}); using a standard Fourier analysis result we get the conditional density

f(X_{t_{n+1}} | X_{t_n}) = (1/(2π)ⁿ) ∫_{Rⁿ} e^{−iu·X_{t_{n+1}}} Φ(u, X_{t_n}, t_n, t_{n+1}) du.   (3.40)

In [50], (3.40) is exploited to construct generalized method-of-moments and maximum likelihood estimators. However, this approach can be slow, since the evaluation of (3.40), as already pointed out, can be a demanding problem.
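In the rare cases where the characteristic function is known in closed form, (3.40) is cheap to evaluate; e.g., for a standard normal transition (a pure illustration, not one of the models above), the inversion integral must return the N(0,1) density:

```python
import math
import numpy as np

# Our own sketch of (3.40) in a case with a known characteristic function:
# for a standard normal transition, Phi(u) = exp(-u^2/2), and the
# inversion integral must return the N(0,1) density.
def density(x, u_max=40.0, n=20001):
    u = np.linspace(-u_max, u_max, n)
    vals = np.real(np.exp(-1j * u * x - 0.5 * u**2))
    du = u[1] - u[0]
    return np.sum((vals[1:] + vals[:-1]) * 0.5) * du / (2.0 * math.pi)  # trapezoid

for x in (-1.0, 0.0, 2.0):
    print(x, abs(density(x) - math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)))
```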
If the state process contains latent components which cannot be directly inferred from data sources (e.g. the volatility in a stochastic volatility model), additional filtering procedures come into question. Typically, Kalman filters are used for all model classes: see, e.g., [19] and [32].

Still concerning latent variables, Bates in [3] proposed a "direct filtration-based maximum likelihood method" for his model, which we already encountered in Section 3.6.2.
Chapter 4
Numerical Methods
So far we have seen that we face two classical problems in numerical analysis: the solution of a set of n+1 ODEs, and the integration of a function Rⁿ → R. We will devote one section of this chapter to each problem: the first is devoted to Runge–Kutta (RK) methods, the second to numerical integration.
4.1 Runge-Kutta methods
RK methods are the most famous and most widely used family of algorithms for the approximation of solutions to ODEs. To begin with, we recall some fundamental results about ODEs.
4.1.1 Facts on ODEs
The Cauchy problem (also known as the initial-value problem) consists of finding the solution of an ODE, in the scalar or vector case, given suitable initial conditions. In particular, in the scalar case, denoting by I ⊂ R an interval containing the point t₀, the Cauchy problem associated with a first-order ODE reads: find a real-valued function y ∈ C¹(I) such that

ẏ(t) = f(t, y(t)), t ∈ I,
y(t₀) = y₀,   (4.1)

where f(t, y) is a given real-valued function on the strip S = I × (−∞, ∞), continuous with respect to both variables. If f depends on t only through y, the differential equation is called autonomous.
Theorem 4.1.1 (Existence and uniqueness theorem, [29]). Let Ω be an open set in R × Rⁿ. If f : Ω → Rⁿ is a continuous function and

∃ δ, ε, L > 0 : [t₀ − δ, t₀ + δ] × B̄_ε(y₀) ⊆ Ω,
‖f(t, y′) − f(t, y″)‖ ≤ L‖y′ − y″‖ (Lipschitz in the second variable)
∀ t ∈ [t₀ − δ, t₀ + δ], ∀ y′, y″ ∈ B̄_ε(y₀)   (4.2)

holds, then there exists an open interval containing t₀ on which one and only one solution to the Cauchy problem (4.1) is defined. We denote by B_ε(y₀) the open ball centred at y₀ with radius ε, and by B̄_ε(y₀) its closure.

Therefore, we will suppose throughout the chapter that the RHS of (4.1) is Lipschitz.
We have seen in Section 3.4 that we face a problem with variable initial value; some regularity results are available.

Theorem 4.1.2 (Continuous dependence theorem, [29]). Let Ω be an open set in R × Rⁿ and f_k : Ω → Rⁿ a sequence of continuous functions such that f_k → f pointwise on Ω as k → ∞. In addition, let us suppose that on every compact set K ⊂ Ω the convergence is uniform, and that the properties (4.2) hold for f and every f_k, with the constants δ, ε and L independent of k.

Moreover, let t₀ ∈ R and y_k⁰ ∈ Rⁿ be a sequence such that (t₀, y_k⁰) → (t₀, y⁰), where (t₀, y_k⁰), (t₀, y⁰) ∈ Ω for every k.

Then the problem (4.1) and the problems

ẏ_k = f_k(t, y_k(t)), t ∈ I,
y_k(t₀) = y_k⁰

have unique solutions y(t) and y_k(t) such that y_k(t) → y(t) uniformly.
We can see the solution of (4.1) as a function ϕ(t, y₀) that is continuous with respect to both t and y₀. Of course the solution is differentiable with respect to t, while for the differentiability with respect to y₀ the following theorem holds:

Theorem 4.1.3 ([44]). Let the hypotheses of the existence and uniqueness theorem hold, and let the partial derivatives (∇_y f)_{ij} be continuous over Ω for each i, j = 1,…,n. Then the derivatives (∇_{y₀}ϕ)_{ij} exist and are continuous. Moreover, the derivatives ∂/∂t (∇_{y₀}ϕ)_{ij} are also continuous.
4.1.2 Remarks on Generalized Riccati Equations
Looking at (3.28), the presence of ∫_{D\{0}} (e^{u·ξ} − 1 − u·χ(ξ)) μ_i(dξ) does not ensure that R(u) is Lipschitz.
E.g., let us consider a process (Xt)t≥0 in R₊ with generator

Df(x) = (2x/√π) ∂f/∂x + ∫_{R₊\{0}} (f(x+ξ) − f(x) − (∂f/∂x)χ(ξ)) (x/(2√π)) dξ/ξ^{3/2}.

Then it is easy to recognize that our characteristics are

μ(dξ) = (1/(2√π)) dξ/ξ^{3/2}, β = 2/√π,

and the GREs are

∂φ/∂t = 0,
∂ψ/∂t = (2/√π)ψ + ∫_{R₊\{0}} (e^{ψξ} − 1 − ψχ(ξ)) (1/(2√π)) dξ/ξ^{3/2},
φ(0) = 0,
ψ(0) = v.   (4.3)

The integral can be explicitly evaluated, yielding

∫_{R₊\{0}} (e^{ψξ} − 1 − ψχ(ξ)) (1/(2√π)) dξ/ξ^{3/2} = −√(−ψ) − 2ψ/√π, ℜ[ψ] ≤ 0.

Of course the RHS of (4.3) is not Lipschitz at the origin, and a solution is

φ(t, v) = 0,
ψ(t, v) = −(2√(−v) + t)²/4.   (4.4)
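The explicit solution (4.4) can be verified directly: the ψ-proportional terms of (4.3) cancel exactly, so ψ must satisfy ∂ψ/∂t = −√(−ψ). A small numerical check (our own):

```python
import math

# Our own check of the explicit solution (4.4): since the 2*psi/sqrt(pi)
# terms of (4.3) cancel exactly, psi(t, v) = -(2*sqrt(-v) + t)^2 / 4 must
# satisfy d(psi)/dt = -sqrt(-psi) with psi(0) = v.
v = -1.0

def psi(t):
    return -(2.0 * math.sqrt(-v) + t) ** 2 / 4.0

t, eps = 0.7, 1e-6
lhs = (psi(t + eps) - psi(t - eps)) / (2.0 * eps)   # central difference
rhs = -math.sqrt(-psi(t))
print(abs(lhs - rhs))
```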
To ensure the existence and uniqueness of a solution to the GREs, the following result holds:

Theorem 4.1.4 (Proposition 6.1, [22], 24). For every u ∈ C^m_{−−} × iR^{n−m} there exists a unique solution ψ(·, u) and φ(·, u) to (3.28), with values in C^m_{−−} × iR^{n−m} and C, respectively. Moreover, φ and ψ are continuous on R₊ × C^m_{−−} × iR^{n−m}. We denote C_{−−} = {c ∈ C : ℜ[c] < 0}.
4.1.3 Analysis of one step methods
Let us pose some definitions that will be useful in the treatment of RK methods, which are the most prominent class in the family of one-step methods. Our analysis concentrates on a single differential equation (the scalar case), but all the presented results extend effortlessly to the general n-dimensional case by using an appropriate norm in place of the modulus.

Fix 0 < T < ∞, let I = (t_0, t_0 + T) and, for h > 0, define t_n = t_0 + nh, with n = 1, …, N_h, where N_h is the greatest integer such that t_{N_h} ≤ t_0 + T. Moreover, let us denote by u_j the approximation of the exact solution y_j = y(t_j). Similarly, we define f_j := f(t_j, u_j).
Definition 4.1. A numerical method for the approximation of problem (4.1) is called a one-step method if, for n = 1, …, N_h, u_{n+1} depends only on u_n. Otherwise, the scheme is called a multistep method.

Definition 4.2 (Explicit and implicit methods). A method is called explicit if u_{n+1} can be computed directly in terms of (some of) the previous values u_k, k ≤ n. A method is said to be implicit if u_{n+1} depends implicitly on itself through f.
We want to point out that one-step methods are well suited to step-adaptive techniques: instead of fixing h at the beginning, we use the time-increment rule t_{n+1} = t_n + h_n, with h_n chosen by some error-controlling criterion. Such techniques can provide a great increase in performance: if an error estimate is available, we can spare computational power on the easier parts of the integration process while keeping the error at bay.
Each one-step method can be written in the form

    u_{n+1} = u_n + h Φ(t_n, u_n, f_n; h)        (4.5)

where the function Φ is called the increment function. A straightforward property that we assume for a method is

    lim_{h→0} Φ(t_n, y_n, f(t_n, y_n); h) = f(t_n, y_n).        (4.6)
In terms of the exact solution y we can write

    y_{n+1} = y_n + h Φ(t_n, y_n, f(t_n, y_n); h) + ε_{n+1}        (4.7)

where ε_{n+1} is the error we commit using the numerical scheme under the assumption u_n = y_n, that is, without taking error propagation into account. We can rewrite

    ε_{n+1} = h τ_{n+1}(h),

where τ_{n+1}(h) is the local truncation error at time t_{n+1}, and we define

    τ(h) = max_{0 ≤ n ≤ N_h − 1} |τ_{n+1}(h)|

as the global truncation error. We want to point out that the truncation errors (local and global) depend on the solution y of (4.1). In other words, the numerical method has to be chosen according to the problem.
We pose in addition the following definitions:
Definition 4.3 (Consistency). A numerical method is said to be consistent if and only if

    lim_{h→0} τ(h) = 0.
Remark 14. The property (4.6) implies consistency.
Definition 4.4 (Order of a method). A numerical method is said to be of order p if and only if, for all t ∈ I, the solution y(t) of (4.1) is such that

    τ(h) = O(h^p),   h → 0.
Up to now we have worked with exact arithmetic, never taking into account the (actual) problem of working in finite-precision arithmetic (we want the computer to do the work for us!). To address this question, we give the following definition:
Definition 4.5 (Zero-stability of one-step methods). The numerical method (4.5) for the approximation of the solution to problem (4.1) is zero-stable if, for a fixed ε,

    ∃ h_0 > 0, ∃ C > 0 : ∀ h ∈ (0, h_0],  |z_n^{(h)} − u_n^{(h)}| ≤ Cε,  0 ≤ n ≤ N_h,        (4.8)

where z_n^{(h)}, u_n^{(h)} are the solutions of the problems

    z_{n+1}^{(h)} = z_n^{(h)} + h [ Φ(t_n, z_n^{(h)}, f(t_n, z_n^{(h)}); h) + δ_{n+1} ],   z_0^{(h)} = y_0 + δ_0,        (4.9)

    u_{n+1}^{(h)} = u_n^{(h)} + h Φ(t_n, u_n^{(h)}, f(t_n, u_n^{(h)}); h),   u_0^{(h)} = y_0,        (4.10)

for 0 ≤ n ≤ N_h − 1 and |δ_k| ≤ ε, 0 ≤ k ≤ N_h.
Zero-stability thus requires that, on a bounded interval, (4.8) holds for any value h ≤ h_0. This property deals, in particular, with the behavior of the numerical method in the limit case h → 0, and this justifies the name zero-stability. It is therefore a distinguishing property of the numerical method itself, not of the Cauchy problem (which, indeed, is stable due to the uniform Lipschitz continuity of f). Property (4.8) ensures that the numerical method has weak sensitivity to small changes in the data, so requiring a zero-stable numerical method is an answer to the problems that can arise in the presence of finite-precision arithmetic.
We want to point out that the name zero-stability comes from the fact that we want this property to hold for h in a neighborhood of the origin. A similar concept is that of "absolute stability", which holds for h not necessarily close to zero.
Theorem 4.1.5 (Zero-stability of one-step methods). Consider the explicit one-step method (4.5) for the numerical solution of the Cauchy problem (4.1). Assume that the increment function Φ is Lipschitz continuous with respect to its second argument, with constant Λ independent of h and of the nodes t_j ∈ [t_0, t_0 + T], that is,

    ∃ h_0 > 0, ∃ Λ > 0 : ∀ h ∈ (0, h_0], 0 ≤ n ≤ N_h,
    | Φ(t_n, u_n^{(h)}, f(t_n, u_n^{(h)}); h) − Φ(t_n, z_n^{(h)}, f(t_n, z_n^{(h)}); h) | ≤ Λ |u_n^{(h)} − z_n^{(h)}|.        (4.11)

Then the method (4.5) is zero-stable.
Proof. Let us define w_j^{(h)} := z_j^{(h)} − u_j^{(h)}, and subtract (4.10) from (4.9) to get

    w_{j+1}^{(h)} = w_j^{(h)} + h [ Φ(t_j, z_j^{(h)}, f(t_j, z_j^{(h)}); h) − Φ(t_j, u_j^{(h)}, f(t_j, u_j^{(h)}); h) + δ_{j+1} ],

with |δ_k| ≤ ε and w_0^{(h)} = δ_0. Summing over j we get, for n = 1, …, N_h,

    w_n^{(h)} = w_0^{(h)} + h Σ_{j=0}^{n−1} ( Φ(t_j, z_j^{(h)}, f(t_j, z_j^{(h)}); h) − Φ(t_j, u_j^{(h)}, f(t_j, u_j^{(h)}); h) ) + h Σ_{j=0}^{n−1} δ_{j+1}.

Taking absolute values on both sides and using (4.11), we have

    |w_n^{(h)}| ≤ |w_0^{(h)}| + hΛ Σ_{j=0}^{n−1} |w_j^{(h)}| + h Σ_{j=0}^{n−1} |δ_{j+1}|,   n = 1, …, N_h.        (4.12)

Applying the discrete Gronwall lemma, stated below, with g_0 = |w_0^{(h)}| ≤ ε and p_s = h|δ_{s+1}| ≤ hε, we get

    |w_n^{(h)}| ≤ (1 + hn) ε e^{nhΛ},   n = 1, …, N_h,

and noticing that n ≤ N_h implies hn ≤ hN_h ≤ T,

    |w_n^{(h)}| ≤ (1 + T) e^{TΛ} ε = Cε.
Lemma 4.1.6 (Discrete Gronwall lemma, [47]). Let k_n be a nonnegative sequence and φ_n a sequence such that

    φ_0 ≤ g_0,
    φ_n ≤ g_0 + Σ_{s=0}^{n−1} p_s + Σ_{s=0}^{n−1} k_s φ_s.

If g_0 ≥ 0 and p_n ≥ 0 for all n ≥ 0, then

    φ_n ≤ ( g_0 + Σ_{s=0}^{n−1} p_s ) exp( Σ_{s=0}^{n−1} k_s ).
To proceed, let us state another definition.
Definition 4.6. A method is said to be convergent if

    |u_n − y_n| ≤ C(h),   ∀ n = 0, …, N_h,

where C(h) → 0 as h → 0. In that case, it is said to be convergent with order p if there exists C > 0 such that C(h) = Ch^p.
Theorem 4.1.7 (Convergence of one-step methods). Under the same assumptions as in Theorem 4.1.5, we have

    |y_n − u_n| ≤ ( |y_0 − u_0| + nh τ(h) ) e^{nhΛ},   1 ≤ n ≤ N_h.

Therefore, if the consistency assumption (4.6) holds and |y_0 − u_0| → 0 as h → 0, then the method is convergent. Moreover, if |y_0 − u_0| = O(h^p) and the method has order p, then it is also convergent with order p.
Proof. Let w_j := y_j − u_j, subtract (4.5) from (4.7) and proceed along the lines of the previous proof to get

    |w_n| ≤ |y_0 − u_0| + hΛ Σ_{j=0}^{n−1} |w_j| + h Σ_{j=0}^{n−1} |τ_{j+1}(h)|,   n = 1, …, N_h;

applying again the discrete Gronwall lemma yields the stated bound. Since nh ≤ T and τ(h) = O(h^p),

    |y_n − u_n| ≤ ( |y_0 − u_0| + T τ(h) ) e^{TΛ},   1 ≤ n ≤ N_h;

therefore we can find a constant C, depending on T and Λ but not on h, such that

    |y_n − u_n| ≤ Ch^p.
These results are very important: to ensure convergence and zero-stability, we only need to check the hypotheses of Theorem 4.1.5 and property (4.6).
4.1.4 Runge-Kutta methods

Now we are going to discuss the method chosen to solve our GREs, and we will stick again to the one-dimensional case, since RK methods work in the same way in the general multi-dimensional case. Denoting by u_n the approximate solution at step n, an RK method can be written in its most general form as

    u_{n+1} = u_n + h F(t_n, u_n, h; f)        (4.13)
where f is the RHS of (4.1), h is the temporal increment, t_n is the value of time at stage n of the integration, and F is the increment function, defined as

    F(t_n, u_n, h; f) = Σ_{i=1}^{s} b_i K_i,
    K_i = f( t_n + c_i h, u_n + h Σ_{j=1}^{s} a_{ij} K_j ),        (4.14)

where s denotes the number of stages of the method. We see that an s-stage method involves in each step at least s evaluations of the RHS of (4.1); thus higher-stage RK methods can be inappropriate for systems with a hard-to-evaluate RHS.

In our case of study the RHS is quite easy to evaluate, so RK appears the most suitable method for our applications.
Things can turn out differently if we allow the process to jump, since we add the (expensive!) jump transform to the RHS of the GREs (cf. Section 4.2). Anyway, for the most common jump distributions the transform is known in closed form, so RK remains the method of choice.
At first glance, we can see from (4.14) that an RK method satisfies (4.6) if and only if Σ_{i=1}^{s} b_i = 1. Actually more can be said; see [38] for more general conditions. In addition, the increment function is Lipschitz in its second argument (being a combination, with coefficients b_i, of Lipschitz functions). Therefore the method is zero-stable by Theorem 4.1.5 and, once consistency is ensured, convergent as well by Theorem 4.1.7.
The coefficients a_{ij}, c_i and b_i fully characterize an RK method and are usually collected in the so-called Butcher array (or tableau)

    c_1 | a_11  a_12  …  a_1s
    c_2 | a_21  a_22  …  a_2s
     …  |  …     …        …
    c_s | a_s1  a_s2  …  a_ss
        | b_1   b_2   …  b_s

or, in compact notation,

    c | A
      | b^T

with A ∈ R^{s×s}, (A)_{ij} = a_{ij}, b, c ∈ R^s, (b)_i = b_i, (c)_i = c_i, for all i, j = 1, …, s. Usually the coefficients c_i are taken as c_i = Σ_{j=1}^{s} a_{ij}.
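As an illustration, the data in a Butcher tableau translate directly into one step of (4.13)-(4.14). A Python sketch for explicit methods, where each K_i depends only on the previously computed stages (the classic RK4 tableau below is a standard example, not one from the text):

```python
def rk_explicit_step(f, t, u, h, A, b, c):
    # One step u_{n+1} = u_n + h * sum_i b_i K_i of an explicit RK method,
    # with K_i = f(t + c_i h, u + h * sum_{j<i} a_ij K_j) -- cf. (4.14)
    K = []
    for i in range(len(b)):
        stage_u = u + h * sum(A[i][j] * K[j] for j in range(i))
        K.append(f(t + c[i] * h, stage_u))
    return u + h * sum(bi * Ki for bi, Ki in zip(b, K))

# classic 4th-order RK tableau; row i holds the i entries a_i1..a_i(i-1)
A4 = [[], [0.5], [0.0, 0.5], [0.0, 0.0, 1.0]]
b4 = [1/6, 1/3, 1/3, 1/6]
c4 = [0.0, 0.5, 0.5, 1.0]
```

With the test equation y' = y, one step from u = 1 with h = 0.1 reproduces e^{0.1} up to the O(h^5) local error.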
At this stage, we can distinguish three different kinds of RK methods:

1. Fully implicit methods, characterized by a full matrix A.
2. Semi-implicit methods, with A lower triangular, diagonal included (that is, a_{ij} = 0 for j > i).
3. Explicit methods, with A strictly lower triangular (that is, a_{ij} = 0 for j ≥ i).

An RK method requires, at each step, finding the values K_i; that involves, respectively, the solution of a full nonlinear system of s equations, of s fixed-point problems, or a simple recursion.

The first two choices skyrocket the computational demands of the method; for non-stiff problems, where absolute stability is not a must-have feature, the first two variants are considered exaggeratedly expensive. So we have chosen to stick to explicit methods, which are a good compromise between performance and accuracy.
We anticipate here that our choice will be an order-5 method with a built-in error estimate, in order to use step adaptivity.
4.1.5 Derivation of an explicit RK method
The standard technique for deriving an explicit RK method consists of enforcing that the highest number of terms in the Taylor expansion of the exact solution y_{n+1} at t_n coincide with those of the approximate solution u_{n+1}, assuming that we take one step of the RK method starting from the exact solution y_n. We provide an example of this technique for an explicit 2-stage RK method, assuming we have at our disposal the exact solution y_n at the n-th step. Then

    u_{n+1} = y_n + h F(t_n, y_n, h; f) = y_n + h (b_1 K_1 + b_2 K_2),
    K_1 = f_n,   K_2 = f(t_n + c_2 h, y_n + a_21 K_1 h).

If we perform a Taylor expansion of K_2, we get

    K_2 = f_n + h c_2 ∂f_n/∂t + h a_21 K_1 ∂f_n/∂y + O(h^2).

Putting the linearized K's into the expression for u_{n+1} we get

    u_{n+1} = y_n + h (b_1 + b_2) f_n + h^2 b_2 ( c_2 ∂f_n/∂t + a_21 f_n ∂f_n/∂y ) + O(h^3).

Performing the Taylor expansion of the exact solution up to third order,

    y_{n+1} = y_n + h f_n + (h^2/2) ( ∂f_n/∂t + f_n ∂f_n/∂y ) + O(h^3).

If we subtract the scheme expansion from that of the exact solution, recalling that c_i = Σ_{j=1}^{s} a_{ij}, we can write two equations:

    b_1 + b_2 = 1,
    b_2 c_2 = 1/2.
This leads to a method which has a local truncation error of order 2 and is convergent with order 2.¹ In general, in this way we can find constraints on the constants involved in an explicit RK method, which are not sufficient to determine them uniquely.
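For instance, picking b_1 = b_2 = 1/2, c_2 = a_21 = 1 in the conditions above gives Heun's method, and its second-order convergence can be verified empirically: halving h should divide the global error by roughly four. A sketch, with y' = y on [0, 1] as an arbitrary test problem:

```python
import math

def heun_step(f, t, u, h):
    # Heun's method: b1 = b2 = 1/2, c2 = a21 = 1, one solution of the
    # order-2 conditions b1 + b2 = 1, b2*c2 = 1/2
    k1 = f(t, u)
    k2 = f(t + h, u + h * k1)
    return u + 0.5 * h * (k1 + k2)

def global_error(h, n_steps):
    # integrate y' = y, y(0) = 1 up to t = n_steps*h; exact value is e^t
    t, u = 0.0, 1.0
    for _ in range(n_steps):
        u = heun_step(lambda s, y: y, t, u, h)
        t += h
    return abs(u - math.exp(t))
```

The ratio global_error(0.01, 100) / global_error(0.005, 200) is close to 4, consistent with order 2.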
Just to make this statement clear, we report some calculations for an RK method with s = 4. For a problem of the form (4.1), writing f_x and f_t for the partial derivatives of f,

    y' = f,
    y'' = f_t + f_x f,
    y''' = f_tt + 2 f_xt f + f_x f_t + f_xx f^2 + f_x^2 f,
    y'''' = f_ttt + 3 f_xtt f + 3 f_xt f_t + 3 f_xx f_t f + 5 f_x f_xt f + f_xxx f^3 + f_x f_tt + 3 f_xxt f^2 + f_x^2 f_t + 4 f_xx f_x f^2 + f_x^3 f,
    ...

and we can see how the derivatives grow in complexity. Using those derivatives in a Taylor expansion, retaining all the terms up to fourth order, we reach these equations:

    b_1 + b_2 + b_3 + b_4 = 1
    b_2 c_2 + b_3 c_3 + b_4 c_4 = 1/2
    b_2 c_2^2 + b_3 c_3^2 + b_4 c_4^2 = 1/3
    b_3 a_32 c_2 + b_4 a_42 c_2 + b_4 a_43 c_3 = 1/6
    b_2 c_2^3 + b_3 c_3^3 + b_4 c_4^3 = 1/4
    b_3 c_3 a_32 c_2 + b_4 c_4 a_42 c_2 + b_4 c_4 a_43 c_3 = 1/8
    b_3 a_32 c_2^2 + b_4 a_42 c_2^2 + b_4 a_43 c_3^2 = 1/12
    b_4 a_43 a_32 c_2 = 1/24

This system has no unique solution, and can lead to various 4th-order methods. In general, to solve those equations we need to impose some additional conditions, e.g. the minimization of some form of error. All those conditions can be generalized in an algebraic setting, but the theory is far too complex to be presented here, so we refer the reader to [9].
4.1.6 Global error
Now we explain the relation between the local truncation error and the convergence order of an RK method. We anticipate that a Runge-Kutta method with truncation error of order p is also convergent with order p.

To begin with, we show a preliminary result:
Lemma 4.1.8. Let f : R × R → R be the RHS of (4.1), Lipschitz with constant L. Let y_0, z_0 ∈ R be two input values to a step of the RK method (A, b^T, c), using stepsize h ≤ h_0, with h_0 L ρ(|A|) < 1, and let y_1, z_1 be the corresponding output values. Then

    |y_1 − z_1| ≤ (1 + hL*) |y_0 − z_0|,

where L* = L |b^T| (I − h_0 L |A|)^{−1} 1, denoting (1)_i = 1, (|b|)_i = |b_i|, (|A|)_{ij} = |(A)_{ij}|, i, j = 1, …, s.

¹This claim will become clearer in the following pages.
Proof. Let us denote by Y_i and Z_i the increments K_i, defined in (4.14), of the two instances. We easily obtain

    Y_i − Z_i = y_0 − z_0 + h Σ_{j=1}^{s} a_{ij} ( f(t, Y_j) − f(t, Z_j) ).

From the triangle inequality and the Lipschitz property of f we get

    |Y_i − Z_i| ≤ |y_0 − z_0| + h_0 L Σ_{j=1}^{s} |a_{ij}| |Y_j − Z_j|;

solving this vector inequality for the |Y_j − Z_j| (which is possible since h_0 L ρ(|A|) < 1) and substituting into

    |y_1 − z_1| ≤ |y_0 − z_0| + hL Σ_{j=1}^{s} |b_j| |Y_j − Z_j|,

we get the desired result.
And this is the main theorem:

Theorem 4.1.9. Let h_0 and L* be as above, and let the local truncation error be bounded by

    τ_k(h) ≤ Ch^p,   ∀ k = 1, …, N_h,  h ≤ h_0.

Then the global error is bounded by

    |u_n − y_n| ≤ (e^{L*T} − 1)/L* · Ch^p   if L* > 0,
    |u_n − y_n| ≤ CTh^p                     if L* = 0.

Therefore the scheme is convergent with order p.
Proof. Let us consider the RK method starting at time t_0 up to time t_0 + T. As is clear from Figure 4.1, we can estimate the error by

    |u_n − y_n| ≤ Σ_{i=1}^{N_h − 1} Δ_i + δ_{N_h},        (4.15)

where Δ_i is the distance, at time t_0 + T, between two approximate solutions, one originating at y_i and the other at y_{i−1}.

Figure 4.1: How to use the local truncation error to estimate the global error.

We denote by δ_i the error between the exact solution y_i and the approximate solution started at the previous time t_{i−1}; the initial error δ_i propagates, forming Δ_i, as shown in Figure 4.1. In addition, from the definition of τ_n(h) it follows that δ_i ≤ Ch^{p+1}.

We can apply Lemma 4.1.8 repeatedly to get an estimate of the error Δ_i:

    Δ_i ≤ δ_i (1 + hL*)^i ≤ Ch^{p+1} (1 + hL*)^i,
thus (4.15) becomes

    |u_n − y_n| ≤ Ch^{p+1} Σ_{i=0}^{N_h − 1} (1 + hL*)^i.

The case L* = 0 follows directly from the fact that hN_h = T, while if L* > 0 we recall

    Σ_{i=0}^{n−1} r^i = (1 − r^n)/(1 − r).        (4.16)

Then the value of our sum is

    Σ_{i=0}^{N_h − 1} (1 + hL*)^i = ( (1 + hL*)^{N_h} − 1 ) / (hL*).

Since (1 + hL*)^{N_h} ≤ e^{hN_h L*} = e^{TL*}, we obtain the desired result.
This theoretical estimate is too complicated to be used in practice, and we will present an alternative strategy to evaluate the error committed by the method.
4.1.7 On higher order RK methods
Up to now, we have shown that an RK method with truncation order p is also convergent with order p. The main question at the moment is: what is the role of the number of stages s?

An answer to this question exists, but it is too complex to be shown here and goes beyond the scope of this thesis, so we will report some results, referring the reader once again to [9] for proofs. To begin with, we recall that each step of an s-stage RK method requires s evaluations of the RHS of the problem. Intuitively, it would seem clear that higher-stage methods bring higher accuracy², but this gain has to be quantified.

For this purpose, we cite the following:

Theorem 4.1.10. If an explicit s-stage RK method has order p, then s ≥ p.

Reasonably, this states that "there is no free lunch", i.e. we cannot get an order-p method without evaluating the RHS at least p times. The following theorem also holds:
Theorem 4.1.11. If an explicit s-stage RK method has order p ≥ 5, then s > p. Moreover, the following conditions hold:

    s − p ≥ 1   if p ≥ 5,
    s − p ≥ 2   if p ≥ 7,
    s − p ≥ 3   if p ≥ 8.
That theorem says that if we want more and more precision, we are doomed to use a lot of evaluations of the RHS of (4.1). That is why the RK method of order 4 is so popular: it has the best trade-off between accuracy and computational cost. To make even clearer how the number of stages diverges with the order of the method, we state this

Theorem 4.1.12. For any positive integer p, an explicit RK method exists with order p and s stages, where

    s = (3p^2 − 10p + 24)/8   if p is even,
    s = (3p^2 − 4p + 9)/8     if p is odd.
²This statement is quite vague; we point out that higher-order methods are actually more accurate for very regular RHS, since higher-order Taylor expansions are involved.
This result does not prevent a lower-stage method from existing: actually, these are the minimum stages required to reach the orders from 1 to 8:

    order  1  2  3  4  5  6  7  8
    s_min  1  2  3  4  6  7  9  11        (4.17)
Up to now, we have worked in a scalar setting, claiming that all the methods can be ported to a multidimensional setting. That is true: all the results we have shown are independent of the dimension of the problem, and all of them take as an assumption that the considered method has order p. The main problem is that the order of an RK method in the scalar case does not necessarily coincide with that in the vector case. In general, a method with order p ≥ 5 in the scalar case does not retain order p in the vector case, while the converse is always true.
An intuitive motivation of this claim can again be found in the increasing complexity of the derivatives; a precise motivation can be found in the usual [9], par. 316, pp. 148-149.
4.1.8 Step adaptivity
The main idea behind step adaptivity is to adapt the step, or another error-control parameter, to keep the error under a user-specified tolerance. One-step methods are well suited to adapting the stepsize h, provided that an efficient estimator of the local error is available.

Usually, a tool of this kind is an a posteriori error estimator, since a priori local error estimates are too complicated to be used in practice.
Roughly speaking, the process can be schematized in the following way:

1. From u_n, calculate u_{n+1} using the one-step method.
2. Use the estimator to evaluate the local truncation error.
3. Choose a new step h, according to a rule depending on the error estimate.

This technique is fundamental in every well-designed code: always getting just the desired precision allows us to spare computational power.
One possible method is to use two RK methods of different orders, respectively p and p + 1, but with the same number s of stages and the same values K_i. We can denote this kind of method with a modified version of the Butcher tableau

    c | A
      | b^T
      | b̂^T
      | E^T
where b^T denotes the coefficients of the order-p method and b̂^T those of the order-(p+1) method. We denote the order-p solution by u_{n+1} and the order-(p+1) solution by û_{n+1}.

We can estimate the error of the order-p solution in this way:

    û_{n+1} − u_{n+1} = h Σ_{i=1}^{s} K_i (b̂_i − b_i) = h Σ_{i=1}^{s} K_i E_i.

This estimate tends to underestimate the local truncation error: the identity

    û_{n+1} − u_{n+1} = O(h^{p+1}) + O(h^{p+2}) = O(h^{p+1})

holds only as h ↓ 0, so the estimate is not reliable for large values of h. We want to point out that the estimate is related to the order-p method; but since an order-(p+1) solution is available, it is best practice to use the higher-order solution instead of the lower-order one. The estimate is then even more crude, and these methods are not suitable when extreme accuracy is an essential feature.
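The mechanism can be illustrated with the smallest possible embedded pair: explicit Euler (p = 1) inside Heun (p = 2), sharing the stages K_1, K_2. This toy pair is our own illustration, not one of the production tableaux discussed below:

```python
def euler_heun_step(f, t, u, h):
    # Embedded pair sharing stages: Euler (order 1) and Heun (order 2).
    # Their difference estimates the local error of the lower-order
    # solution; we advance with the higher-order one, as in the text.
    k1 = f(t, u)
    k2 = f(t + h, u + h * k1)
    u_low = u + h * k1
    u_high = u + 0.5 * h * (k1 + k2)
    return u_high, abs(u_high - u_low)
```

For y' = y with u = 1, h = 0.1, the pair returns u_high = 1.105 and the estimate 0.005, with no extra RHS evaluations beyond the two stages.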
Let us now consider an integration problem over an interval [a, b], using a method of order p with a step h ≪ 1. Integrating from x with a step h, we commit a truncation error C(x)h^p. Since we are using a (to-be-determined) step-adaptation strategy, the step is a function of the position: h = H(x). If the steps are small, we can approximate the global error as

    E(H) = ∫_a^b C(x) H^p(x) dx

and the number of steps as

    S(H) = ∫_a^b dx / H(x).

An optimal policy is to find a function H(x) that minimizes E(H) while keeping S(H) bounded, i.e., in the language of optimization,

    min_H E(H)   s.t.   S(H) ≤ N.        (4.18)

It is well known that a necessary condition (Karush-Kuhn-Tucker) for H* to solve (4.18) is that H* also minimizes the Lagrangian, for some λ ≥ 0:

    L(H) = E(H) + λ (S(H) − N) = ∫_a^b [ C(x) H^p(x) + λ / H(x) ] dx − λN;

we refer the interested reader to any book on optimization theory.

We want to recall this basic result of the calculus of variations:
Theorem 4.1.13 (Euler-Lagrange formula, [44], p. 570). Let f : R × R^n × R^n → R and let L be the functional

    L(f) = ∫ f( x, H(x), H'(x) ) dx.

Then L(f) has a stationary value if the Euler-Lagrange differential equation is satisfied:

    ∂f/∂H − d/dx ( ∂f/∂H' ) = 0.
Since our Lagrangian does not depend on H', applying the Euler-Lagrange formula reduces to solving

    ∂/∂H ( C(x) H^p(x) + λ / H(x) ) = 0,

which leads to

    C(x) H^{p+1}(x) = λ/p = c,

where c is a constant. The optimal policy is therefore to keep the local truncation error constant, so we choose the new step h' as

    h' = h (tol / error)^{1/(p+1)};
we want to point out that we use the error estimated in the previous step to generate the new step length, instead of the error of the step we are about to perform. Due to this further approximation, we reduce the suggested step by a factor γ. On the other hand, we want the step length h to follow as smoothly as possible the evolution of the problem, avoiding abrupt variations in the step size, since those can lead to strange behavior on the error side.

Putting all those considerations together, we can write down the updating rule for the step size

    h' = rh,   r = max( α, min( β, γ (tol/error)^{1/(p+1)} ) )        (4.19)

where α, β, γ are design parameters; typical values are α = 0.5, β = 2.0, γ = 0.9. The value of h at the first step can be taken directly from (4.7) and from the definition of the truncation error; therefore for the first step we can take

    h = tol^{1/(p+1)} / 2.

In some applications it can be useful to impose a minimum value h_min, e.g. the machine precision. In this case the rule (4.19) can be readily rewritten in the form

    h' = max( h_min, rh ),   r = max( α, min( β, γ (tol/error)^{1/(p+1)} ) ).
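The step-size rule with the optional floor h_min is a one-liner in code; parameter defaults follow the typical values quoted above:

```python
def next_step(h, error, tol, p, alpha=0.5, beta=2.0, gamma=0.9, h_min=0.0):
    # Step-size update: clamp the suggested ratio to [alpha, beta],
    # damp it by the safety factor gamma, and never go below h_min
    r = max(alpha, min(beta, gamma * (tol / error) ** (1.0 / (p + 1))))
    return max(h_min, r * h)
```

With error equal to tol the step shrinks by the safety factor γ; with a very large error it shrinks by at most the factor α, and with a tiny error it grows by at most the factor β.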
The value h_min can also be imposed to create a lower bound on the running time, since there is a class of numerical problems called stiff: if a numerical method is forced to use, in a certain interval of integration, a step length which is excessively small in relation to the smoothness of the exact solution in that interval, then the problem is said to be stiff in that interval.

Usually it is not possible to foretell whether the problem is stiff or not; a typical approach is to test the problem with a Runge-Kutta method of order 4-5, check the behavior of the method, and if necessary adopt a more suitable method.
4.1.9 Our choice: Dormand-Prince method
Up to now, we have shown some general results and techniques. Let us recall the features we want for our method:

• Explicit Runge-Kutta method.
• Step adaptivity.
• Medium accuracy (∼ 10^{-4}–10^{-6}).

In the light of those considerations, we have chosen to use RK methods of order 4-5 with embedded error estimation, which, at the price of 6 function evaluations³ and one more computation with respect to the ordinary RK5, allow us to control the stepsize.
The first one of this family is the Runge-Kutta-Fehlberg method (RKF45):
0
1/4 1/4
3/8 3/32 9/32
12/13 1932/2197 −7200/2197 7296/2197
1 439/216 −8 3680/513 −845/4104
1/2 −8/27 2 −3544/2565 1859/4104 −11/40
25/216 0 1408/2565 2197/4104 −1/5 0
16/135 0 6656/12825 28561/56430 −9/50 2/55
(4.20)
³Compare with table (4.17).
Two variants of this method are the Cash-Karp method (RKCK):
0
1/5 1/5
3/10 3/40 9/40
3/5 3/10 −9/10 6/5
1 −11/54 5/2 −70/27 35/27
7/8 1631/55296 175/512 575/13824 44275/110592 253/4096
37/378 0 250/621 125/594 0 512/1771
2825/27648 0 18575/48384 13525/55296 277/14336 1/4
(4.21)
and Dormand-Prince (RKDP):
0
1/5 1/5
3/10 3/40 9/40
4/5 44/45 −56/15 32/9
8/9 19372/6561 −25360/2187 64448/6561 −212/729
1 9017/3168 −355/33 46732/5247 49/176 −5103/18656
1 35/384 0 500/1113 125/192 −2187/6784 11/84
5179/57600 0 7571/16695 393/640 −92097/339200 187/2100 1/40
35/384 0 500/1113 125/192 −2187/6784 11/84 0
(4.22)
Those methods were described for the first time, respectively, in [26], [10] and [18].

The method (4.22) is a 7-stage method, but actually only 6 evaluations per step are needed. This property is called FSAL (First Same As Last), since the first stage of step n + 1 is the same as the last stage of step n.
To show this, consider the method (4.22) at step n:

    K_{7,n} = f( t_n + c_7 h, u_n + h (a_71 K_1 + a_72 K_2 + a_73 K_3 + a_74 K_4 + a_75 K_5 + a_76 K_6) ),        (4.23)

and the solution at step n + 1 is

    u_{n+1} = u_n + h ( b_1 K_1 + b_2 K_2 + b_3 K_3 + b_4 K_4 + b_5 K_5 + b_6 K_6 + b_7 K_7 ).        (4.24)

When we start step n + 1,

    K_{1,n+1} = f(t_{n+1}, u_{n+1}).

Since c_7 = 1, t_{n+1} = t_n + h and a_{7i} = b_i, it is clear from (4.23) and (4.24) that K_{1,n+1} = K_{7,n}.

What makes method (4.22) different from (4.20) and (4.21) is that its coefficients are chosen to minimize the norm of the error of the 5th-order method, while the others are designed to be used only with the 4th-order method. That makes method (4.22) fitter to be used in 5th-order mode, while the other two methods are often used improperly.

This method is therefore the common choice for solving non-stiff problems, used also in Matlab™ and Octave, and it will be our choice as well.
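Tableau (4.22) can be transcribed directly into code. The sketch below performs one RKDP step, returning the 5th-order solution together with the embedded error estimate; the FSAL saving (reusing K_7 as the next step's K_1) is left out for brevity:

```python
# Dormand-Prince coefficients from tableau (4.22); the 7th row of A equals
# the 5th-order weights B5, which is what yields the FSAL property
A = [
    [],
    [1/5],
    [3/40, 9/40],
    [44/45, -56/15, 32/9],
    [19372/6561, -25360/2187, 64448/6561, -212/729],
    [9017/3168, -355/33, 46732/5247, 49/176, -5103/18656],
    [35/384, 0.0, 500/1113, 125/192, -2187/6784, 11/84],
]
C = [0.0, 1/5, 3/10, 4/5, 8/9, 1.0, 1.0]
B5 = [35/384, 0.0, 500/1113, 125/192, -2187/6784, 11/84, 0.0]
B4 = [5179/57600, 0.0, 7571/16695, 393/640, -92097/339200, 187/2100, 1/40]

def dopri_step(f, t, u, h):
    # One Dormand-Prince step: 5th-order solution and |u5 - u4| error estimate
    K = []
    for i in range(7):
        stage_u = u + h * sum(a * k for a, k in zip(A[i], K))
        K.append(f(t + C[i] * h, stage_u))
    u5 = u + h * sum(b * k for b, k in zip(B5, K))
    u4 = u + h * sum(b * k for b, k in zip(B4, K))
    return u5, abs(u5 - u4)
```

On y' = y with h = 0.1 the 5th-order solution matches e^{0.1} far beyond the tolerance range quoted above, while |u5 − u4| provides the input for the step-update rule (4.19).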
4.2 Numerical integration
The other numerical duty required by the use of affine models is numerical integration: if we allow our model to jump with law ν(z), we have to add to the RHS of the GREs the term

    θ(c) = ∫_{R^n} e^{c·z} dν(z),   c ∈ C^n;

therefore, for an n-dimensional affine process with jumps, we have to perform an integration over R^n. We want to point out that for the most common laws the transform is known in closed form, but we want to provide a routine to test different laws, and to evaluate inverse transforms for option-pricing applications (Theorem 3.3.2).
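As an illustration of the task, take a one-dimensional exponential jump law with density λe^{−λz} on (0, ∞) (an example of ours, not from the text): the transform has the closed form θ(c) = λ/(λ − c) for Re[c] < λ, against which a quadrature routine can be validated. A sketch using a truncated trapezium rule, with n and z_max chosen arbitrarily:

```python
import math

def theta_numeric(c, lam, n=20000, z_max=50.0):
    # theta(c) = ∫_0^∞ e^{c z} lam e^{-lam z} dz, computed by the extended
    # trapezium rule after truncating the domain at z_max
    h = z_max / n
    g = lambda z: lam * math.exp((c - lam) * z)
    total = 0.5 * (g(0.0) + g(z_max)) + sum(g(i * h) for i in range(1, n))
    return h * total
```

For λ = 2 and c = 0.5 this reproduces λ/(λ − c) = 4/3 to about six digits.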
4.2.1 One dimensional integration
We will present some formulas and focus on implementation issues, without claiming to be exhaustive; we refer the reader to any good book on numerical analysis, and to the given references.
The classical trapezium rule, for x_{i−1} < x_i, is

    ∫_{x_{i−1}}^{x_i} f(x) dx ≈ (x_i − x_{i−1})/2 · [ f(x_{i−1}) + f(x_i) ].        (4.25)

The error committed using this formula is, if f ∈ C^2([x_{i−1}, x_i]),

    E_tr = −(x_i − x_{i−1})^3 / 12 · f''(ξ),   ξ ∈ [x_{i−1}, x_i].

Since the integral is additive, we can partition [a, b] into N equally spaced intervals, with x_0 = a, x_N = b, such that

    ∫_a^b f(x) dx = Σ_{i=1}^{N} ∫_{x_{i−1}}^{x_i} f(x) dx ≈ h [ (1/2) f_0 + f_1 + … + f_{N−1} + (1/2) f_N ]        (4.26)

where h = (b − a)/N. Using this extended formula, the error committed is

    E_tr^N = −(b − a)^3 / (12 N^2) · f''(ξ),   ξ ∈ [a, b].
We will call equation (4.25) trapezium rule and (4.26) extended trapezium rule.
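The extended trapezium rule (4.26) transcribes directly:

```python
def trapezium(f, a, b, n):
    # Extended trapezium rule (4.26) with n equally spaced subintervals
    h = (b - a) / n
    interior = sum(f(a + i * h) for i in range(1, n))
    return h * (0.5 * (f(a) + f(b)) + interior)
```

The O(1/N^2) error term is visible by comparing, e.g., trapezium(f, 0, 1, N) for increasing N against a known integral.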
Now we turn our attention to the Cavalieri-Simpson rule:

    ∫_{x_{i−1}}^{x_i} f(x) dx ≈ (x_i − x_{i−1})/6 · [ f(x_{i−1}) + 4 f( (x_i + x_{i−1})/2 ) + f(x_i) ];        (4.27)

the error committed using this formula is, if f ∈ C^4([x_{i−1}, x_i]),

    E_CS = −(x_i − x_{i−1})^5 / (2^5 · 90) · f^{IV}(ξ),   ξ ∈ [x_{i−1}, x_i].

In (4.27), for one interval [x_{i−1}, x_i] we need to evaluate the integrand f three times; as done in (4.26), we can get the extended Cavalieri-Simpson rule with N intervals:

    h [ (1/3) f_0 + (4/3) f_1 + (2/3) f_2 + (4/3) f_3 + … + (2/3) f_{2N−2} + (4/3) f_{2N−1} + (1/3) f_{2N} ]        (4.28)

where h = (b − a)/(2N) and

    f_i = f( (x_i + x_{i−1})/2 )   if i is odd,
    f_i = f(x_i)                    if i is even.

The error for this method is

    E_CS^N = −( (b − a)/2 )^5 · 1/(90 N^4) · f^{IV}(ξ),   ξ ∈ [a, b].

Usually the integration domain is fixed, so in the error terms we emphasize the number of nodes, which is the parameter to vary to enhance accuracy. As can be seen from the error terms, the second rule has higher accuracy only if f is regular enough, so (4.26) is more suitable in the presence of an irregular integrand.
4.2.2 Step adaptivity
A simple rule to check convergence is to double the number of evaluation points. Let us fix an error tolerance tol and consider the numerical integration of f over [a, b] with the extended trapezium rule, subdividing the domain into 2^n subintervals of length h = (b − a)/2^n. We will denote this by I_{2^n}(f), leaving the extremes of the integration domain unexpressed for simplicity of notation.

We will stop the integration process when

    | I_{2^{n+1}}(f) − I_{2^n}(f) | ≤ tol.

To avoid early convergence, we impose at least 5 refinements of the integration domain. With a clever choice of the points, the trapezium rule is quite effective when used in combination with step adaptivity.
Let us denote by I_n the sets of new evaluation points, taken in the following way:

    I_0 = { x_0 = a, x_1 = b },
    I_n = { x_i = a + h_n (i + 1/2), i = 0, …, 2^{n−1} − 1 },   h_n = (b − a)/2^{n−1},   n ≥ 1,

so that |I_n| = 2^{n−1} for n ≥ 1: each level adds the midpoints of the subintervals of the previous level. Denote

    Λ_n = Σ_{x ∈ I_n} f(x);

then we can write the trapezium-rule integrals as

    I_1(f) = (b − a)/2 · Λ_0,
    I_{2^n}(f) = (b − a)/2^n · ( (1/2) Λ_0 + Σ_{i=1}^{n} Λ_i ),   n ≥ 1.

In this way we can reuse the function evaluations of the previous steps, as shown in Figure 4.2.

Figure 4.2: Example of the evaluation algorithm: an integration with 2^3 = 8 intervals, without evaluating the same point twice.

The integration process can then be schematized as follows:

1. At step n, compute Λ_n (cost: 2^{n−1} evaluations).
2. Add Λ_n to λ = (1/2) Λ_0 + Σ_{i=1}^{n−1} Λ_i, obtained at the previous step, and update the value of λ.
3. The value of the integral is I_{2^n}(f) = hλ, where h = (b − a)/2^n.
4. Compute | I_{2^n}(f) − I_{2^{n−1}}(f) |; if it is less than or equal to tol, return I_{2^n}(f); otherwise increase n by one, halve h, and go ahead.

We impose a limit N_MAX = 20 on the iterations of the above algorithm. That implies, recalling the identity (4.16), at most

    2 + Σ_{n=1}^{N_MAX − 1} 2^{n−1} = 2 + 2^{N_MAX − 1} − 1 = 2^{19} + 1 = 524289

evaluations, which is sufficient for most common applications.
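The doubling scheme above can be sketched as follows; each refinement level only evaluates f at the new midpoints, accumulating Λ_n into λ (the early-convergence guard n_min and the level cap n_max follow the values in the text):

```python
def adaptive_trapezium(f, a, b, tol=1e-8, n_max=20, n_min=5):
    # Trapezium rule with point doubling: level n adds 2^(n-1) midpoints,
    # so previous function evaluations are never recomputed
    lam = 0.5 * (f(a) + f(b))            # (1/2) * Lambda_0
    integral = (b - a) * lam             # I_1(f)
    for n in range(1, n_max):
        m = 2 ** (n - 1)                 # number of new midpoints at this level
        step = (b - a) / m
        lam += sum(f(a + step * (i + 0.5)) for i in range(m))
        new_integral = (b - a) / 2 ** n * lam
        if n >= n_min and abs(new_integral - integral) <= tol:
            return new_integral
        integral = new_integral
    return integral
```

For a smooth integrand such as x^2 on [0, 1] the loop stops well before the evaluation cap is reached.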
The main advantage of the trapezium rule is that we can use it as a building block for the Cavalieri-Simpson rule. Let us compute

    S_{2^n}(f) := (4/3) I_{2^n}(f) − (1/3) I_{2^{n−1}}(f);

if we perform that calculation explicitly with (4.26) in mind, we get exactly (4.28). Since to get I_{2^n}(f) we necessarily have to compute I_{2^{n−1}}(f), we have an almost free higher-order integration method.
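In code, with a trapezium routine available, the extended Cavalieri-Simpson value comes for free by this Richardson-type combination:

```python
def simpson_from_trapezium(f, a, b, n):
    # S_{2n} = (4/3) I_{2n} - (1/3) I_n reproduces the extended
    # Cavalieri-Simpson rule (4.28) from two trapezium-rule values
    def trap(m):
        h = (b - a) / m
        return h * (0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, m)))
    return (4.0 * trap(2 * n) - trap(n)) / 3.0
```

In an adaptive doubling loop the two trapezium values are already available, so the combination costs no extra function evaluations.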
4.2.3 Domain transformation
Let us take a look at the problem

    I(f) = ∫_{−∞}^{∞} f(x) dx,

where we assume that the integrand f has no singularities on the real axis. A naive approach is to approximate

    ∫_{−∞}^{∞} f(x) dx ≈ ∫_{−a}^{a} f(x) dx,

taking a large, e.g. 10^34, and using an ordinary integration routine like (4.26) on this integral. It is interesting to notice that, for such an approach, the trapezium rule is optimal in the class of quadrature methods with constant step ([42], Section 2.3), i.e., under some technical hypotheses, it attains the minimum error for a given step length.

This approach is nevertheless not efficient, since (4.26) uses a constant integration step over the domain, and a necessary condition on the integrand to have |I(f)| < ∞ is

    |f(x)| ≤ 1/x^{1+ε},   x → ±∞,  ε > 0,

which means that the evaluations performed near the extremes are bounded by 1/a, i.e. almost zero. In other words, the most meaningful values of f are packed far away from the extremes of the domain. Then two possible approaches are available:
• adapt the step size locally;
• perform a change of variable, to "tame" the integrand.
The first option would force us to rewrite part of the code written for the finite-domain case, so we opt for the second one. Let us choose a change of variable y(x), with y : [c, d] → [a, b]; then
∫_a^b f(y) dy = ∫_c^d f(y(x)) (dy/dx) dx.
A possible change of variable for our purposes is the so-called Double Exponential (DE) rule:
y = sinh(c sinh x), dy/dx = c cosh(c sinh x) cosh x,

Figure 4.3: The change of variable y = sinh((π/2) sinh x) (lower curve) and its derivative (upper curve).

where typical values of c are 1 or π/2. The usual interval of integration is [−4, 4], which is roughly equivalent to [−2·10^{18}, 2·10^{18}].
If the integration interval is only (0, ∞), as in the case of the CIR with jumps model (Section 3.6.1), the DE rule can easily be written as
y = e^{2c sinh x}, dy/dx = 2c cosh(x) e^{2c sinh x}.
This rule derives from analogous techniques developed to cope with integrals with end-point singularities, and we refer the interested reader to [42] and to the original papers [40] and [41], besides the sources listed in Section 4.3. It has to be said that the DE rule is optimal with respect to the trapezoidal rule, in the sense that no other transformation allows a lower error with the same h. We want to point out that domain transformation techniques really depend on the integrand, and therefore on the specific singularity. We limit ourselves to a general-purpose integrator, and for real-world applications we suggest choosing the integration method according to the nature of the integrand in (3.3).
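A minimal C++ sketch of the DE rule (constant-step trapezium weights in the transformed variable over [−4, 4], as above; the default c = π/2 and the number of panels are our choices, not fixed by the thesis):

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Integrate f over (-inf, +inf) with the Double Exponential change of
// variable y = sinh(c*sinh(x)): the trapezium rule with constant step is
// applied in x on [-4, 4], which maps to roughly [-2e18, 2e18] in y.
double integrate_DE(const std::function<double(double)>& f,
                    double c = 1.5707963267948966 /* pi/2 */,
                    int N = 160) {
    const double h = 8.0 / N;                 // step in the transformed variable
    double sum = 0.0;
    for (int i = 0; i <= N; ++i) {
        double x  = -4.0 + i * h;
        double y  = std::sinh(c * std::sinh(x));
        double dy = c * std::cosh(c * std::sinh(x)) * std::cosh(x); // dy/dx
        double w  = (i == 0 || i == N) ? 0.5 : 1.0;                 // trapezium weights
        sum += w * f(y) * dy;
    }
    return h * sum;
}
```

For a rapidly decaying integrand such as a Gaussian, this already reproduces ∫ e^{−y²} dy = √π to high accuracy with a few hundred evaluations; the terms far in the tails underflow harmlessly to zero.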
4.2.4 Multidimensional integral
Up to now we have presented methods to deal with one-dimensional integration. To extend those methods to n-dimensional integrals like (3.3), we have a classical result from multidimensional Riemann integral calculus, a special case of the Fubini Theorem (cf. Theorem A.2.3):
Theorem 4.2.1 (reduction theorem, [29]). Let f : R^m × R^n → R^p be an integrable function. For every y ∈ R^n consider the function
x ↦ f(x, y), x ∈ R^m,
and suppose it is integrable over R^m. Then the function
y ↦ ∫_{R^m} f(x, y) dx, y ∈ R^n,
is integrable over R^n, and
∫_{R^m×R^n} f(x, y) dx dy = ∫_{R^n} ( ∫_{R^m} f(x, y) dx ) dy.
Nothing stops us from using the reduction theorem repeatedly, getting
∫_{R^n} f(x_1, …, x_n) dx_1 … dx_n = ∫_R ∫_R … ∫_R f(x_1, …, x_n) dx_n … dx_2 dx_1. (4.29)
We can then evaluate the integral one variable at a time; in this way our one-dimensional routine can be used recursively to solve an n-dimensional integral. Needless to say, recursion is the enemy of efficiency, so this method is unsuitable for high-dimensional integrals, say n ≥ 4. Anyway, we are dealing with low-dimensional processes, so the integration itself will not be a problem.
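The recursive use of (4.29) can be sketched as follows, with a fixed-step trapezium rule per axis for brevity (the thesis routine is adaptive) and a box domain assumed; all names are ours:

```cpp
#include <cassert>
#include <cmath>
#include <functional>
#include <vector>

// Recursive reduction of an n-dimensional integral over the box [lo, hi]
// to nested one-dimensional trapezium rules, one coordinate at a time.
// `x` is the work vector holding the currently fixed coordinates; the
// recursion bottoms out when all coordinates are fixed and f is evaluated.
double integrate_box(const std::function<double(std::vector<double>&)>& f,
                     const std::vector<double>& lo, const std::vector<double>& hi,
                     std::vector<double>& x, std::size_t dim, int N = 64) {
    if (dim == x.size()) return f(x);       // innermost level: evaluate f
    const double a = lo[dim], b = hi[dim], h = (b - a) / N;
    double sum = 0.0;
    for (int i = 0; i <= N; ++i) {          // trapezium rule along axis `dim`
        x[dim] = a + i * h;
        double w = (i == 0 || i == N) ? 0.5 : 1.0;
        sum += w * integrate_box(f, lo, hi, x, dim + 1, N);
    }
    return h * sum;
}
```

The cost is (N+1)^n evaluations of f, which makes the n ≥ 4 warning above concrete.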
Some problems could arise when using an RK method to integrate the GREs, if we are too demanding with precision: each step of an s-stage RK method requires s evaluations of the RHS, i.e. s numerical integrations per step.
Assuming that we use the same error tolerance for the RK method and for the integration method, requesting great precision implies increasing the number of time steps, and therefore the number of evaluations of the RHS of the GREs, which becomes harder and harder for high precision requirements. If high precision is needed, it could be useful to choose another method: possible candidates are the Bulirsch-Stoer method and predictor-corrector methods. Both are suitable for problems that require high precision with a hard-to-evaluate RHS. The first is the common choice for high-precision problems; the second performs better with a very smooth RHS, a property that, in our case, depends mainly on ν(z) in (3.3).
We chose RK methods for their applicability to a broad spectrum of problems but, as usual, if more information is available on the nature of the integrand(s), a specifically designed method should be adopted.
4.3 Main sources and further readings
There are many books on numerical calculus, and we suggest to the reader some of the authors we consulted while writing this part of the thesis.
To the practitioner we definitely recommend beginning with two handbooks: [47] and [45], the first devoted to a quick review of a great number of numerical methods, the second to implementation issues in C++. Earlier editions of [45], dealing with algorithms written in Fortran and C, can also be found. Both books are supplied with broad and useful references.
To go into the details of RK methods, we refer the reader to the comprehensive [9], whose author is the putative father of modern RK theory; a useful addition is [38].
With regard to numerical integration, as said before, we do not refer to any particular book, since details can be found in virtually any introductory book on Numerical Calculus, like [51]. Regarding the DE rule, we refer to [42], an interesting article dealing, from a historical and technical point of view, with the discovery of this optimal rule for evaluating improper integrals.
Chapter 5
Applications
Now we are ready to make joint use of the theory exposed so far, and we will show why affine processes and reduced-form models work nicely together. All the measure-dependent quantities (expectation, probability, intensity, etc.) are considered under the risk-neutral measure Q. To stress this fact, we use the superscript Q, e.g. E^Q, λ^Q and so on.
The goal of this chapter is to bring the payoff of as many financial products as possible into a form tractable with the theory of affine processes, and we start with the last claim left to be proved, the price of a defaultable zero-coupon bond.
5.1 Defaultable claims
Now we show how the doubly stochastic framework works jointly with affine process theory. First, we give a precise definition of a defaultable contingent claim.
Definition 5.1. A defaultable contingent claim with maturity T is a claim whose payoff is defined as
F 1_{τ>T} + W_τ 1_{τ≤T},
where
• F is a G_T-measurable bounded random variable, called the promised payout;
• (W_t)_{t≥0} is a (G_t)-adapted stochastic process with W_t = 0 for t > T, called the recovery process.
Remark 15. The case of a bond is covered simply by taking F = 1 and W_t ≤ 1 for all t.
5.1.1 No recovery
Theorem 5.1.1. Suppose we have a defaultable contingent claim without recovery, and that (r_t)_{t≥0} and (λ^Q_t)_{t≥0} are bounded processes, where (r_t)_{t≥0} is (F_t)-adapted. Furthermore, suppose that, under Q, τ is doubly stochastic driven by a filtration (F_t)_{t≥0}, with intensity process λ^Q. Fix any t < T; then, for t ≥ τ, we have S_t = 0, otherwise
S_t = E^Q_t[ e^{−∫_t^T (r_u + λ^Q_u) du} F ], t < τ. (5.1)
Proof. Since τ is doubly stochastic, there exists a filtration (G_t)_{t≥0} such that F_t ⊂ G_t for every t. Then, by the law of iterated expectations,
S_t = E^Q_t[ E^Q[ e^{−∫_t^T r_u du} 1_{τ>T} F | G_t ∨ F_T ] ] = E^Q_t[ e^{−∫_t^T r_u du} F E^Q[ 1_{τ>T} | G_t ∨ F_T ] ]
by the measurability hypotheses on (r_t)_{t≥0} and F.
Recalling that E^Q[ 1_{τ>T} | G_t ∨ F_T ] = P^Q(τ > T | G_t ∨ F_T) = P^Q(N_T − N_t = 0 | G_t ∨ F_T), by the definition of doubly stochastic process we have
P^Q[ N_T − N_t = 0 | G_t ∨ F_T ] = e^{−∫_t^T λ^Q_u du},
and the result, together with claim I.3, follows.
Remark 16. In Chapter 2, the process that here plays the role of the interest rate is supposed to be (F_t)-predictable, not (F_t)-adapted. But if we take r_t = Λ(X_{t−}), as said in Remark 5, we have
∫_0^t Λ(X_{s−}) ds = ∫_0^t Λ(X_s) ds, a.s. for each t.
5.1.2 Claims with recovery
As we saw in Section 1.5, the price of a defaultable bond with recovery can be split into two parts: one inherent to the face value, the other inherent to the recovery. That case is analyzed in the following:
Theorem 5.1.2. Consider a contingent claim with payoff F and recovery (w_t)_{t≥0}. Suppose that (w_t)_{t≥0}, (λ^Q_t)_{t≥0} and (r_t)_{t≥0} are bounded. Furthermore, suppose that τ is doubly stochastic under Q, driven by a filtration (F_t)_{t≥0} with the property that (r_t)_{t≥0} and (w_t)_{t≥0} are F_t-adapted. Then, for t > τ, we have S_t = 0; otherwise,
S_t = E^Q_t[ e^{−∫_t^T (r_u + λ^Q_u) du} F ] + ∫_t^T Σ(t, u) du, t < τ; (5.2)
where
Σ(t, u) = E^Q_t[ e^{−∫_t^u (r_z + λ^Q_z) dz} λ^Q_u w_u ]. (5.3)
Proof. We already know the first part of (5.2); it remains to evaluate
E^Q_t[ e^{−∫_t^{τ∧T} r_z dz} 1_{τ≤T} w_τ ]. (5.4)
From (2.12) we already know that the conditional density of τ is π_t(u) = E^Q_t[ e^{−∫_t^u λ^Q_z dz} λ^Q_u ]; then (5.4) can be expressed as an integral with respect to π_t(u) du, getting
E^Q_t[ e^{−∫_t^{τ∧T} r_z dz} 1_{τ≤T} w_τ ] = ∫_t^T e^{−∫_t^u r_z dz} w_u π_t(u) du
= ∫_t^T e^{−∫_t^u r_z dz} w_u E^Q_t[ e^{−∫_t^u λ^Q_z dz} λ^Q_u ] du
= ∫_t^T E^Q_t[ e^{−∫_t^u (r_z + λ^Q_z) dz} λ^Q_u w_u ] du
by the measurability hypotheses made on (r_t)_{t≥0} and (w_t)_{t≥0}.
While the first addend can be evaluated with Theorem 3.1.1, the integrand (5.3), provided that w_τ = e^{a+b·X_{τ−}}, is naturally evaluated via Theorem 3.3.1.
If the GREs associated with (5.3) are, as expected, to be solved numerically, we obtain the evaluation of the integral in (5.2) for free: Σ(t, u) is of the form (3.12), and a RK method employed on the associated GREs will return α(t_i − t), β(t_i − t), A(t_i − t), B(t_i − t), for i = 1, …, N.
In order to have better control on the error of ∫_t^T Σ(t, u) du, we can force the RK method to work with a fixed step h; then, using the trapezium rule with the same h, we have
∫_t^T Σ(t, u) du = ∫_t^T (A(u − t) + B(u − t)·X_t) e^{α(u−t)+β(u−t)·X_t} du
≈ h Σ_{i=1}^{N−1} (A(t_i − t) + B(t_i − t)·X_t) e^{α(t_i−t)+β(t_i−t)·X_t}
+ (h/2) [ w_t λ^Q_t + (A(T − t) + B(T − t)·X_t) e^{α(T−t)+β(T−t)·X_t} ].
Of course, the integration with RK and the evaluation of this formula can be performed in parallel, without any need to store the whole GRE trajectory.
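A hedged sketch of this running accumulation, with a scalar state for simplicity and arbitrary callables standing in for the GRE solutions A, B, α, β (which in the actual code come from the RK steps, one grid point at a time — nothing here reproduces the thesis implementation):

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Trapezium accumulation of  integral over s in [0, m] of
//   Sigma(t, t+s) = (A(s) + B(s)*X) * exp(alpha(s) + beta(s)*X)
// on a fixed grid of N panels. In practice each grid value would be
// consumed right after the corresponding fixed-step RK step, so the
// whole GRE trajectory never needs to be stored.
double accumulate_sigma(const std::function<double(double)>& A,
                        const std::function<double(double)>& B,
                        const std::function<double(double)>& alpha,
                        const std::function<double(double)>& beta,
                        double X, double m, int N) {
    const double h = m / N;
    double sum = 0.0;
    for (int i = 0; i <= N; ++i) {
        double s = i * h;
        double Sigma = (A(s) + B(s) * X) * std::exp(alpha(s) + beta(s) * X);
        sum += (i == 0 || i == N ? 0.5 : 1.0) * Sigma;   // trapezium weights
    }
    return h * sum;
}
```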
5.1.3 Unpredictable Default Recovery
The previous theorem supposes that the recovery process is known as time passes, but this can be considered an unrealistic hypothesis. Without modifying the other hypotheses, a result similar to Theorem 5.1.2 can be obtained if the size of the recovery is revealed only at the default time.
Theorem 5.1.3. Let the hypotheses of Theorem 5.1.2 hold, with the difference that w is a G_τ-measurable random variable. Then there exists a (G_t)-predictable process (W_t)_{t≥0} such that
Σ(t, u) = E^Q_t[ e^{−∫_t^u (r_z + λ^Q_z) dz} λ^Q_u W_u ]. (5.5)
Proof. From [17], Theorem IV.67(b), there is a (G_t)-predictable process (W_t)_{t≥0} such that W_s = E^Q[ w_τ 1_{τ≤T} | G_{s−} ]. Then, since s > t, by the law of iterated expectations
E^Q_t[ e^{−∫_t^{s∧T} r_z dz} 1_{τ≤T} w_τ ] = E^Q_t[ e^{−∫_t^{s∧T} r_z dz} E^Q[ w_τ 1_{τ≤T} | G_{s−} ] ]
= E^Q_t[ e^{−∫_t^{s∧T} r_z dz} W_s ] = ∫_t^T E^Q_t[ e^{−∫_t^u (r_z + λ^Q_z) dz} λ^Q_u W_u ] du (5.6)
where the last equality, and the result, is obtained moving along the lines of Theorem 5.1.2.
5.1.4 Fractional loss of value on default
A particularly interesting result is obtained if we suppose the recovery to be a fraction of the security's value just before default.
Definition 5.2 (recovery of market value (RMV)). Let V^{RMV}_t be the value of a contingent claim that pays F at time T. The recovery in case of default at time τ is given by
W_τ = (1 − L_τ) V^{RMV}_{τ−}, τ ≤ T.
The fractional loss process L = (L_t)_{t≥0} is assumed to take values in [0, 1] and to be predictable.
Theorem 5.1.4 ([33]). Consider a contingent claim that pays F at time T, where F is G_T-measurable. The recovery process (W_t)_{t≥0} is (G_t)-adapted and defined by the RMV assumption. Let S^{RMV}(t) denote the price of the claim. With q_t = r_t + λ^Q_t L_t, let
S_t = E^Q_t[ e^{−∫_t^T q_s ds} F ], t ≤ T,
S_t = 0, t > T,
where it is assumed that (S_t)_{t≥0} does not jump at the default time τ (i.e. ΔS_τ = 0, a.s.). Then, for t < τ, we have
S^{RMV}_t = S_t.
Unfortunately, to keep the modified discount rate (q_t)_{t≥0} affine, we have to impose that the fractional loss process (L_t)_{t≥0} is deterministic.
5.1.5 Netting
Let us consider two financial institutions entering a contract with netting as a covenant. The first firm sells a defaultable claim that pays A at time T and buys from the other firm another defaultable claim that pays B. Those claims have, respectively, recovery at default W^A and W^B. We assume for simplicity that only two claims are involved, while the assumption of equal maturity is made so as not to give an advantage to the firm which bought the shorter-lived claim: if at any time t a firm has already monetized the claim it possessed, it has no more reason not to default voluntarily.
Then the total value of the agreement, from the first firm's perspective, is, for A > 0 and B < 0,
S_t = E^Q_t[ e^{−∫_t^T r_u du} (A + B) ] + E^Q_t[ e^{−∫_t^{τ∧T} r_u du} (W^A + W^B) 1_{τ≤T} ],
where τ = τ_1 ∧ τ_2. Let us suppose that W^A and W^B fall within the hypotheses of Theorem 5.1.4; then, using Proposition 2.5.3,
S_t = E^Q_t[ e^{−∫_t^T [r_u + L^A_u(λ^A_u + λ^B_u)] du} A ] + E^Q_t[ e^{−∫_t^T [r_u + L^B_u(λ^A_u + λ^B_u)] du} B ]; (5.7)
the RHS of (5.7) can easily be evaluated within the affine framework, imposing the usual structure on the parameters. We point out that the intensities (λ^A_t)_{t≥0}, (λ^B_t)_{t≥0} and the interest rate have to be taken as affine, positive processes. It follows from Remark 9 that they must be instantaneously uncorrelated, which is not a strong limitation since we can consider that the netting clause gets rid of moral hazard.
Let us compare (5.7) with the value of the same agreement without netting:
V_t = E^Q_t[ e^{−∫_t^T (r_u + L^A_u λ'^A_u) du} A ] + E^Q_t[ e^{−∫_t^T (r_u + L^B_u λ'^B_u) du} B ];
we used different intensities λ'^A_t and λ'^B_t since, without netting, the moral hazard component is relevant, increasing the intensity of a (perhaps voluntary) default. In this case, the imposition of instantaneously uncorrelated intensities could be limiting.
5.2 Credit derivatives
5.2.1 Credit spread options
The case of an option with an exponential-affine payout was discussed in Section 3.3.2; here we deal with a put credit spread option, since the call option can be approached analogously via the put-call parity formula. Taking up the notation of Section 1.8, the price of a credit spread option is
E^Q[ e^{−∫_0^t r_u du} 1_{τ>t} Z_t ]. (5.8)
In (1.5) the price of the bond can be evaluated as usual, again using the affine framework. If the conditions of Theorem 5.1.4 hold, with L_t = l deterministic and constant, we have, using the fact that r_t + lλ^Q_t is affine and Theorem 3.1.1,
e^{−(Y_t+S_t)(T−t)} = E^Q_t[ e^{−∫_t^T (r_u + lλ^Q_u) du} ] = e^{ᾱ(t)+β̄(t)·X_t}. (5.9)
Then (Y_t + S_t)(T − t) is affine, and since t is fixed (it is the maturity of the option) we drop the dependence on t from all the deterministic coefficients. We then define
m = T − t, Y_t = y_0 + y_1·X_t, S_t = s_0 + s_1·X_t,
ᾱ = (y_0 + s_0)m, β̄ = (y_1 + s_1)m,
α = (y_0 + s)m, β = y_1 m,
d = β̄ − β, c = α − ᾱ, r_t + lλ^Q_t = ρ_0 + ρ_1·X_{t−},
pointing out that the coefficients depend implicitly on l, (λ_t)_{t≥0} and (r_t)_{t≥0}, since ᾱ and β̄ are obtained by solving the GREs involved in the last identity of (5.9). The payoff (1.5) can then be rewritten as
Z_t = ( e^{ᾱ+β̄·X_t} − e^{α+β·X_t} ) 1_{d·X_t ≥ c}
and (5.8), recalling the definition (3.15), becomes
E^Q[ e^{−∫_0^t r_u du} 1_{τ>t} Z_t ] = e^{ᾱ} G_{−β̄,−d}(−c; X_0, 0, t) − e^{α} G_{−β,−d}(−c; X_0, 0, t),
where the RHS can be evaluated using Theorem 3.3.2.
5.2.2 Credit Default Swaps
Let us consider a CDS on a bond, with recovery W. Then the value of the protection is
B = E^Q_t[ e^{−∫_0^{τ∧T} r_u du} (1 − W) 1_{τ≤T} ] (5.10)
where T is the maturity of the bond and τ the time of default. The buyer has to pay a rate r at fixed times t_1 < … < t_n = T, whose market value is
A = r Σ_{i=1}^n E^Q_t[ e^{−∫_t^{t_i} r_u du} 1_{τ>t_i} ]. (5.11)
In the light of no-arbitrage considerations, r must make (5.11) and (5.10) equal, which yields
r = E^Q_t[ e^{−∫_0^{τ∧T} r_u du} (1 − W) 1_{τ≤T} ] / Σ_{i=1}^n E^Q_t[ e^{−∫_0^{t_i} r_u du} 1_{τ>t_i} ]. (5.12)
The numerator of (5.12), depending on the nature of W, can be valued with Theorems 5.1.2, 5.1.3 and 5.1.4, while the denominator is simply a sum of defaultable bonds with maturity t_i.
On the other hand, a binary CDS with premium F has a rate/premium ratio
r/F = E^Q_t[ e^{−∫_0^{τ∧T} r_u du} 1_{τ≤T} ] / Σ_{i=1}^n E^Q_t[ e^{−∫_0^{t_i} r_u du} 1_{τ>t_i} ]
= ∫_t^T E^Q_t[ e^{−∫_t^u (r_z + λ^Q_z) dz} λ^Q_u ] du / Σ_{i=1}^n E^Q_t[ e^{−∫_0^{t_i} r_u du} 1_{τ>t_i} ], (5.13)
using (5.6).
5.3 A multiname model
The following model was presented in [11]. Let us consider a scenario with n firms, and let us define a Markov process (X_t)_{t≥0} with values in R^{2n+1}_+, X_t = (X^0_t, …, X^{2n}_t), t ≥ 0, with infinitesimal generator
D f(x) = Σ_{i=0}^{n} α_i x_i ∂²f(x)/∂x_i² + Σ_{i=0}^{n} (b_i + β_i·x) ∂f(x)/∂x_i + Σ_{p∈I} [ f( x + Σ_{i=n+1}^{2n} p_i e_i ) − f(x) ] (l_p + λ_p·x) (5.14)
where e_i denotes the i-th element of the standard basis of R^{2n+1} (i.e. (e_i)_j = δ_{ij}) and p = (p_{n+1}, …, p_{2n}) ∈ I = {0, 1}^n. Of course, such a process is an affine jump-diffusion process.
The coefficients α_i, b_i, l_p are non-negative scalars, λ_p ∈ R^{2n+1}_+, and β_i = (β_{i,0}, …, β_{i,2n}). The process (X^0_t)_{t≥0} is the short rate, (X^i_t)_{t≥0}, i = 1, …, n, is the rating of the i-th firm, and (X^{i+n}_t)_{t≥0}, i = 1, …, n, is the status of the i-th firm (i.e. defaulted or not).
Ratings are taken with the convention "the lower, the better"; if X^{i+n}_t = 0, the i-th firm has not defaulted by time t, and we assume that X^{i+n}_0 = 0. Moreover, the default time of the i-th firm is τ_i = inf{ t ≥ 0 : X^{n+i}_t = 1 }.
In the light of those considerations we can motivate and interpret the parameters and restrictions involved.
• α_i ≥ 0 since we are in R^{2n+1}_+, which is a consequence of Theorem 3.5.1; therefore all diffusion parts are uncorrelated, and the correlation between ratings has to be achieved acting on the drift part.
• The quantity b_i + β_i·x describes the dependence of the credit rating of firm i on all the entities in the market. More precisely, the coefficient β_{i,j} quantifies the dependence on X^j_t, which can be positive or not. We will assume that β_{i,j} ≥ 0 for all i ≠ j.
• The vector p is a possible failure scenario, and to each scenario is associated an intensity l_p + λ_p·x; the considerations made on β_i also apply to λ_p. If a scenario p is considered impossible, just set λ_p and l_p to 0. We point out that when the i-th firm defaults, X^i_t keeps evolving and influencing the other variables. This effect can be corrected (or removed) by adding −Σ_{i=1}^n (∂f/∂x_{i+n}) β_{i+n} x_{i+n} to the generator (5.14).
• The short rate X^0_t acts on all the quantities via β_{0,i}, but it is unreasonable that the other quantities act on the short rate. Therefore, β_{i,0} = 0, i = 1, …, 2n.
The default indicator function can be written as the limit of an exponential function, i.e.
lim_{k→∞} e^{−k X^{i+n}_t} = 1_{τ_i > t}.
Thanks to the peculiar form of this model, we can evaluate the price of a contingent claim simply by solving a slightly modified version of the GREs. In the following we consider the case n = 3.
Proposition 5.3.1. For t ≤ T, v ∈ R^7_−, δ ≥ 0 and p ∈ I we have
E_t[ e^{−δ∫_t^T X^0_s ds} e^{v·X_T} lim_{k→∞} e^{−k(p_4 X^4_T + p_5 X^5_T + p_6 X^6_T)} ]
= e^{φ(T−t,v;δ;p) + Σ_{i∈{0,…,3}∪J_0(p)} ψ_i(T−t,v;δ;p) X^i_t} ∏_{j∈J_1(p)} 1_{X^j_t = 0},
where J_0(p) := {4 ≤ j ≤ 6 : p_j = 0}, J_1(p) := {4 ≤ j ≤ 6 : p_j = 1}, and the R_−-valued functions φ = φ(T−t, v; δ; p) and ψ_i = ψ_i(T−t, v; δ; p) solve
∂φ/∂t = Σ_{k=0}^{3} b_k ψ_k + Σ_{q∈I_0(p)} l_q ( e^{q_4ψ_4+q_5ψ_5+q_6ψ_6} − 1 ) − Σ_{q∈I_1(p)} l_q
∂ψ_i/∂t = α_i ψ_i² + Σ_{k=0}^{3} β_{k,i} ψ_k + Σ_{q∈I_0(p)} λ_{q,i} ( e^{q_4ψ_4+q_5ψ_5+q_6ψ_6} − 1 ) − Σ_{q∈I_1(p)} λ_{q,i} − δ 1_{i=0}
∂ψ_j/∂t = Σ_{k=0}^{3} β_{k,j} ψ_k + Σ_{q∈I_0(p)} λ_{q,j} ( e^{q_4ψ_4+q_5ψ_5+q_6ψ_6} − 1 ) − Σ_{q∈I_1(p)} λ_{q,j}
φ(0, v; δ; p) = 0, ψ_i(0, v; δ; p) = v_i, ψ_j(0, v; δ; p) = v_j, (5.15)
for i = 0, …, 3 and j ∈ J_0(p), where I_0(p) := { q ∈ I : q_j = 0 ∀ j ∈ J_1(p) } and I_1(p) := I \ I_0(p).
Proof. By dominated convergence,
E_t[ e^{−δ∫_t^T X^0_s ds} e^{v·X_T} lim_{k→∞} e^{−k(p_4X^4_T + p_5X^5_T + p_6X^6_T)} ] = lim_{k→∞} E_t[ e^{−δ∫_t^T X^0_s ds} e^{v·X_T} e^{−k(p_4X^4_T + p_5X^5_T + p_6X^6_T)} ]
= lim_{k→∞} e^{φ(T−t, v − k(p_4e_4+p_5e_5+p_6e_6); δ) + ψ(T−t, v − k(p_4e_4+p_5e_5+p_6e_6); δ)·X_t} (5.16)
where ψ and φ solve the usual GREs:
∂φ/∂t = Σ_{k=0}^{3} b_k ψ_k + Σ_{q∈I} l_q ( e^{q_4ψ_4+q_5ψ_5+q_6ψ_6} − 1 )
∂ψ_i/∂t = α_i ψ_i² + Σ_{k=0}^{3} β_{k,i} ψ_k + Σ_{q∈I} λ_{q,i} ( e^{q_4ψ_4+q_5ψ_5+q_6ψ_6} − 1 ) − δ 1_{i=0}
∂ψ_j/∂t = Σ_{k=0}^{3} β_{k,j} ψ_k + Σ_{q∈I} λ_{q,j} ( e^{q_4ψ_4+q_5ψ_5+q_6ψ_6} − 1 )
φ(0, v; δ; p) = 0, ψ_i(0, v; δ; p) = v_i, ψ_j(0, v; δ; p) = v_j, (5.17)
with i = 0, …, 3 and j = 4, 5, 6. By Theorem 4.1.4 we have ψ ∈ R^7_−, so ∂ψ_j/∂t ≤ 0; therefore
ψ_j(t, v − k(p_4e_4+p_5e_5+p_6e_6); δ) ≤ ψ_j(0, v − k(p_4e_4+p_5e_5+p_6e_6); δ) = v_j − k 1_{p_j=1}, t ≥ 0.
It is then straightforward to split {4, 5, 6} into two sets that depend on the choice of p, J_1(p) and J_0(p); we have
lim_{k→∞} ψ_j(t, v − k(p_4e_4+p_5e_5+p_6e_6); δ) = ψ_j(t, v; δ) for j ∈ J_0(p), and −∞ for j ∈ J_1(p).
We also split I into two sets, I_0(p) and I_1(p); in the light of the new notation, we can rewrite (5.16) and (5.17) as
lim_{k→∞} e^{φ + Σ_{i∈{0,…,3}∪J_0(p)} ψ_i X^i_t} ∏_{j∈J_1(p)} e^{ψ_j X^j_t}
and
∂φ/∂t = Σ_{k=0}^{3} b_k ψ_k + Σ_{q∈I_0(p)} l_q ( e^{q_4ψ_4+q_5ψ_5+q_6ψ_6} − 1 ) + Σ_{q∈I_1(p)} l_q ( e^{q_4ψ_4+q_5ψ_5+q_6ψ_6} − 1 )
∂ψ_i/∂t = α_i ψ_i² + Σ_{k=0}^{3} β_{k,i} ψ_k + Σ_{q∈I_0(p)} λ_{q,i} ( e^{q_4ψ_4+q_5ψ_5+q_6ψ_6} − 1 ) + Σ_{q∈I_1(p)} λ_{q,i} ( e^{q_4ψ_4+q_5ψ_5+q_6ψ_6} − 1 ) − δ 1_{i=0}
∂ψ_j/∂t = Σ_{k=0}^{3} β_{k,j} ψ_k + Σ_{q∈I_0(p)} λ_{q,j} ( e^{q_4ψ_4+q_5ψ_5+q_6ψ_6} − 1 ) + Σ_{q∈I_1(p)} λ_{q,j} ( e^{q_4ψ_4+q_5ψ_5+q_6ψ_6} − 1 ).
The result follows from the observations that lim_{k→∞} e^{ψ_j X^j_t} = 1_{X^j_t=0} for j ∈ J_1(p), and lim_{k→∞} e^{q_4ψ_4+q_5ψ_5+q_6ψ_6} = 0 for q ∈ I_1(p).
Remark 17. Proposition 5.3.1 considers a contingent claim of the form
e^{v·X_T} lim_{k→∞} e^{−k(p_4X^4_T + p_5X^5_T + p_6X^6_T)},
which obviously depends on the choice of a default scenario p ∈ I. E.g., let us take p = (1, 1, 1) or p = (1, 0, 1); then
e^{v·X_T} lim_{k→∞} e^{−k(X^4_T + X^5_T + X^6_T)} = e^{v·X_T} 1_{τ_1>T} 1_{τ_2>T} 1_{τ_3>T} = e^{v·X_T} 1_{τ_1∧τ_2∧τ_3>T}
or
e^{v·X_T} lim_{k→∞} e^{−k(X^4_T + X^6_T)} = e^{v·X_T} 1_{τ_1>T} 1_{τ_3>T} = e^{v·X_T} 1_{τ_1∧τ_3>T}.
It is then possible to price a contingent claim depending on a precise default scenario, simply observing that 1_{τ≤T} = 1 − 1_{τ>T} and 1_{s<τ≤t} = 1_{τ>s} − 1_{τ>t}.
5.3.1 Pricing a CDS
In Section 5.2.2 we already approached the evaluation of a CDS, assuming that neither the buyer nor the seller of the protection can default, mainly due to the difficulty of expressing more than two defaults in the doubly stochastic setting. Let us consider the time between two payments, t ∈ (t_{k−1}, t_k], k ≤ n; then three different scenarios can occur:
1. If none of the actors defaults up to t_k (i.e. t_k < τ_1 ∧ τ_2 ∧ τ_3), the buyer (firm 2) makes a payment to the seller (firm 3), amounting to a rate r to be determined.
2. If the reference entity (firm 1) has defaulted in the period (t_{k−1}, t_k] (t_{k−1} < τ_1 ≤ t_k), the seller has not defaulted yet (t_k < τ_3), and the buyer had not defaulted by t_{k−1} (t_{k−1} < τ_2), then the seller pays 1 − e^{a+b·X_{t_k}}, a ∈ R_−, b ∈ R^7_−, and the contract terminates.
3. In any other situation, the contract resolves without any effect.
Then, taking up again the notation of Section 5.2.2, we have, for t ≤ t_1,
B = Σ_{i=1}^n E^Q_t[ e^{−∫_t^{t_i} X^0_u du} (1 − e^{a+b·X_{t_i}}) 1_{t_{i−1}<τ_1≤t_i} 1_{t_i<τ_3} 1_{t_{i−1}<τ_2} ]
and
A = r Σ_{i=1}^n E^Q_t[ e^{−∫_t^{t_i} X^0_u du} 1_{t_i<τ_1∧τ_2∧τ_3} ]. (5.18)
According to Remark 17, we can write 1_{t_i<τ_1∧τ_2∧τ_3} = lim_{k→∞} e^{−k(X^4_{t_i} + X^5_{t_i} + X^6_{t_i})} and
1_{t_{i−1}<τ_1≤t_i} 1_{t_i<τ_3} 1_{t_{i−1}<τ_2} = lim_{k,l→∞} ( e^{−l X^4_{t_{i−1}}} − e^{−k X^4_{t_i}} ) e^{−l X^5_{t_{i−1}} − k X^6_{t_i}}.
Then, as in (5.12), we can write
r = Σ_{i=1}^n ( B^{1,i}_t − B^{2,i}_t − B^{3,i}_t + B^{4,i}_t ) / Σ_{i=1}^n E^Q_t[ e^{−∫_t^{t_i} X^0_u du} 1_{t_i<τ_1∧τ_2∧τ_3} ],
where
B^{1,i}_t = E^Q_t[ e^{−∫_t^{t_i} X^0_u du} lim_{k,l→∞} e^{−l(X^4_{t_{i−1}} + X^5_{t_{i−1}}) − k X^6_{t_i}} ]
B^{2,i}_t = E^Q_t[ e^{−∫_t^{t_i} X^0_u du} lim_{k,l→∞} e^{−k(X^4_{t_i} + X^6_{t_i}) − l X^5_{t_{i−1}}} ]
B^{3,i}_t = E^Q_t[ e^{−∫_t^{t_i} X^0_u du} e^{a+b·X_{t_i}} lim_{k,l→∞} e^{−l(X^4_{t_{i−1}} + X^5_{t_{i−1}}) − k X^6_{t_i}} ]
B^{4,i}_t = E^Q_t[ e^{−∫_t^{t_i} X^0_u du} e^{a+b·X_{t_i}} lim_{k,l→∞} e^{−k(X^4_{t_i} + X^6_{t_i}) − l X^5_{t_{i−1}}} ] (5.19)
The terms of the sum in (5.18) can be evaluated simply by applying Proposition 5.3.1, while the terms in (5.19) can be evaluated using the law of iterated expectations, taking E_{t_{i−1}} as the inner conditional expectation and then applying Proposition 5.3.1 twice. For the sake of brevity, we do not report those passages here, referring to [11], pp. 13-14, for the complete expressions.
Chapter 6
A numerical example
We take up again the examples shown in Section 3.6, combining them to build a complete model for evaluating a defaultable claim. Let us consider, as introduced in Remark 13, a defaultable claim modelled with the Bates model; for simplicity, let us suppose that there is no recovery. Then the price of such a claim is
E^Q_t[ e^{−∫_t^T (r_s+λ_s) ds} e^{Y_T} ] = e^{α(t,T) + β_1(t,T)V_t + β_2(t,T)r_t + β_3(t,T)λ_t + β_4(t,T)Y_t}; (6.1)
where
dV_t = k_V (γ_V − V_t) dt + σ_V √V_t dW^V_t
dr_t = k_r (γ_r − r_t) dt + σ_r √r_t dW^r_t + dJ^r_t
dλ_t = k_λ (γ_λ − λ_t) dt + σ_λ √λ_t dW^λ_t + dJ^λ_t
dY_t = (μ_S − (1/2)V_t) dt + √V_t dW^S_t + dJ^S_t. (6.2)
We have modelled (λ_t)_{t≥0} and (r_t)_{t≥0} with two CIR-with-jumps models, and the log-price with the affine Bates model with Gaussian jumps, distributed as N(ln(1+k) − δ/2, δ²/4), according to [14], Section 15.2; the respective jump intensities are denoted by l_λ, l_r, l_Y. Moreover, we suppose that (W^V_t)_{t≥0} and (W^S_t)_{t≥0} are instantaneously correlated with correlation ρ.
Then we can readily write the characteristics of this model, recalling that, in the case of multiple jump types, the GREs have to be modified according to Section 3.3.4:
μ = [ k_V γ_V ; k_r γ_r ; k_λ γ_λ ; μ_S ] + [ −k_V, 0, 0, 0 ; 0, −k_r, 0, 0 ; 0, 0, −k_λ, 0 ; −1/2, 0, 0, 0 ] [ V_t ; r_t ; λ_t ; Y_t ]; (6.3)
σσ^T = [ σ_V², 0, 0, ρσ_V ; 0, 0, 0, 0 ; 0, 0, 0, 0 ; ρσ_V, 0, 0, 1 ] V_t + [ 0, 0, 0, 0 ; 0, σ_r², 0, 0 ; 0, 0, 0, 0 ; 0, 0, 0, 0 ] r_t + [ 0, 0, 0, 0 ; 0, 0, 0, 0 ; 0, 0, σ_λ², 0 ; 0, 0, 0, 0 ] λ_t;
θ_1(c) = 1/(1 − d_r c_2), θ_2(c) = 1/(1 − d_λ c_3), θ_3(c) = exp( [ln(1+k) − δ/2] c_4 + (δc_4)²/8 );
l^1_0 = l_r, l^2_0 = l_λ, l^3_0 = l_Y;
ρ_1 = (0, 1, 1, 0).
All the above characteristics satisfy the conditions of Theorem 3.5.1; we want to point out that, unrealistically, the process (Y_t)_{t≥0} is independent of (λ_t)_{t≥0}. While the structure of the covariance matrix is fixed by Theorem 3.5.1, we could add negative components in the last row of the matrix in (6.3), so that the higher the intensity of default, the lower the price.
Now we can write down the GREs associated with the problem (6.1):
β̇_1 = −k_V β_1 − (1/2)β_4 + (1/2)β_1² + (1/2)ρσ_V β_1 β_4
β̇_2 = −1 − k_r β_2 + (1/2)σ_r² β_2²
β̇_3 = −1 − k_λ β_3 + (1/2)σ_λ² β_3²
β̇_4 = (1/2)ρσ_V β_1² + (1/2)σ_V² β_1 β_4
α̇ = k_V γ_V β_1 + k_r γ_r β_2 + k_λ γ_λ β_3 + μ_S β_4 + l_r d_r β_2/(1 − d_r β_2) + l_λ d_λ β_3/(1 − d_λ β_3) + l_Y ( e^{[ln(1+k) − δ/2]β_4 + (δβ_4)²/8} − 1 )
α(0) = 0, β(0) = (0, 0, 0, 1). (6.4)
Here β_2 and β_3 are decoupled and can be solved explicitly: an application of Lemma 3.2.1 yields
β_2(t) = −2( e^{ρ_2(T−t)} − 1 ) / [ (ρ_2 + k_r)( e^{ρ_2(T−t)} − 1 ) + 2ρ_2 ]
β_3(t) = −2( e^{ρ_3(T−t)} − 1 ) / [ (ρ_3 + k_λ)( e^{ρ_3(T−t)} − 1 ) + 2ρ_3 ]
together with the remaining system
β̇_1 = −k_V β_1 − (1/2)β_4 + (1/2)β_1² + (1/2)ρσ_V β_1 β_4
β̇_4 = (1/2)ρσ_V β_1² + (1/2)σ_V² β_1 β_4
α̇ = k_V γ_V β_1 + k_r γ_r β_2 + k_λ γ_λ β_3 + μ_S β_4 + l_r d_r β_2/(1 − d_r β_2) + l_λ d_λ β_3/(1 − d_λ β_3) + l_Y ( e^{[ln(1+k) − δ/2]β_4 + (δβ_4)²/8} − 1 )
α(0) = 0, β_1(0) = 0, β_4(0) = 0;
where ρ_2 = √(k_r² + 2σ_r²) and ρ_3 = √(k_λ² + 2σ_λ²). The knowledge of β_2(t) and β_3(t) in closed form can be exploited to test our implementation of RKDP.
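Such a test can be sketched as follows: a classical fixed-step RK4 (a stand-in for RKDP, which is not reproduced here) integrates the decoupled Riccati equation satisfied by β_2 (and, mutatis mutandis, β_3), and the result is compared with the closed form above; all names are ours.

```cpp
#include <cassert>
#include <cmath>

// Right-hand side of the decoupled Riccati equation
//   beta' = -1 - k*beta + (1/2)*sigma^2*beta^2,   beta(0) = 0,
// satisfied by beta_2 with k = k_r, sigma = sigma_r.
static double rhs(double beta, double k, double sigma) {
    return -1.0 - k * beta + 0.5 * sigma * sigma * beta * beta;
}

// Classical RK4 with N fixed steps over [0, tau].
double beta_rk4(double tau, double k, double sigma, int N = 5000) {
    const double h = tau / N;
    double b = 0.0;
    for (int i = 0; i < N; ++i) {
        double k1 = rhs(b, k, sigma);
        double k2 = rhs(b + 0.5 * h * k1, k, sigma);
        double k3 = rhs(b + 0.5 * h * k2, k, sigma);
        double k4 = rhs(b + h * k3, k, sigma);
        b += h / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4);
    }
    return b;
}

// Closed-form CIR-type solution, with rho = sqrt(k^2 + 2*sigma^2).
double beta_exact(double tau, double k, double sigma) {
    double rho = std::sqrt(k * k + 2.0 * sigma * sigma);
    double E = std::exp(rho * tau);
    return -2.0 * (E - 1.0) / ((rho + k) * (E - 1.0) + 2.0 * rho);
}
```

With the Table 6.1 values k_r = 2, σ_r = 0.65 the two agree to many digits over the whole plotting range [0, 5].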
Table 6.1: Parameters for Bates model.
drift part diffusion part jump part Initial values
(Vt)t≥0 kV = 0.9 γV = 0.6 σV = 0.6 − − − V0 = 0.70
(rt)t≥0 kr = 2 γr = 0.07 σr = 0.65 lr = 0.12 dr = 0.1 − r0 = 0.08
(λt)t≥0 kλ = 1.5 γλ = 0.08 σλ = 0.4 lλ = 0.09 dλ = 0.15 − λ0 = 0.05
(Yt)t≥0 µS = 0.65 − ρ = −0.50 lY = 0.1 k = 1 δ = 0.5 Y0 = 5
Table 6.1 reports the parameters we used to integrate (6.4); we assume those parameters in each figure, unless stated otherwise.
To begin with, we test our algorithm by integrating β_2 with different values of the tolerance (Figures 6.1, 6.2 and 6.3); for comparison, we have plotted the exact solution with a continuous red line. We want to point out how well the step adaptivity works in Figure 6.3:
Figure 6.1: Plot of β_2 against t; local error tolerance 10^{−1}, evaluations needed: 37; the algorithm shows a numerical instability.
it relaxes the step length on the linear part and tightens it where the solution varies faster.
For our pricing purposes we are particularly interested in the values of α(0, T) and
Figure 6.2: Plot of β_2 against t; local error tolerance 10^{−2}, evaluations needed: 42; no instability.
Figure 6.3: Plot of β_2 against t; local error tolerance 10^{−5}, evaluations needed: 127; perfect matching.
β(0, T). Let us consider the forward GREs (cf. (3.7)); in order to ensure that we obtain the solution at s = T, we check at every step that s_n ≤ T. If at any time s_n > T and |s_n − T| ≥ h_min, we force the algorithm to perform the last step again with s_n = T.
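A minimal sketch of this end-point control (with a plain Euler update and a fixed proposed step, only to keep the example self-contained; in the thesis the step proposal comes from RKDP):

```cpp
#include <cassert>
#include <cmath>

// Integrate y' = f(s, y) from s0 to exactly T: whenever the proposed step
// would overshoot the horizon, it is clamped so that the final node lands
// on s_n = T, as described above.
double integrate_to_T(double s0, double y0, double T,
                      double h_proposed, double (*f)(double, double)) {
    double s = s0, y = y0;
    while (s < T) {
        double h = h_proposed;
        if (s + h > T) h = T - s;   // clamp the last step so that s_n = T
        y += h * f(s, y);
        s += h;
    }
    return y;
}
```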
Figure 6.4: α, β_1, β_2, β_3, β_4 and the price, plotted against the maturity T; local error tolerance 10^{−5}, evaluations needed: 127. The price is increasing with maturity, i.e. it is convenient to perform a long-run investment.
Since (6.4) is an autonomous system (all the coefficients are assumed constant), the solutions depend only on T − t; we will therefore consider the problem
E^Q_0[ e^{−∫_0^T (r_s+λ_s) ds} e^{Y_T} ] = e^{α(0,T) + β_1(0,T)V_0 + β_2(0,T)r_0 + β_3(0,T)λ_0 + β_4(0,T)Y_0}; (6.5)
i.e. we look at the same product, having at hand the initial values at time t = 0,
but for different maturities T, and see how the price changes. We have run some simulations with different parameters, to show how many price structures can be reproduced by this toy model.
We observe that β_1, β_2 and β_3 are negative, which is reasonable, since higher initial values of, respectively, volatility, interest rate and default intensity make the investment less appealing, i.e. lower its price, even with the same initial value of Y_0. On the other hand, β_4 is positive and decreasing, so a higher value of Y_0 means a higher price, but this effect becomes less significant as maturity increases.
Figure 6.5: α, β_1, β_2, β_3, β_4 and the price, plotted against T; local error tolerance 10^{−5}, evaluations needed: 127; l_Y = l_λ = l_r = 0. The price still grows, but at a slower pace.
Another computation that can be carried out is the evaluation of the income/outcome ratio defined in Section 3.4; we can see how this ratio varies as the maturity increases. Figures 6.8 and 6.9 show a decreasing ratio; the effect of the corrected interest rate (corrected since it also contains the risk spread) becomes more and more relevant as the maturity increases.
Figure 6.6: α, β_1, β_2, β_3, β_4 and the price, plotted against T; local error tolerance 10^{−5}, evaluations needed: 127; γ_λ = 0.50. This asset is very likely to default in the future, so the price drops.
Figure 6.7: α, β_1, β_2, β_3, β_4 and the price, plotted against T; local error tolerance 10^{−5}, evaluations needed: 127; λ_0 = 0.80, δ = 2, μ_S = 0.85. This asset is very likely to default in the near future, so the price drops, but the mean reversion makes the price rise in the long run.
Figure 6.8: Local error tolerance 10^{−5}, evaluations needed: 127. Upper panel: discounted price of the asset; middle panel: price of the asset without discount; bottom panel: their ratio.
Figure 6.9: Local error tolerance 10^{−5}, evaluations needed: 127; λ_0 = 0.50, l_λ = 0.15. Upper panel: discounted price of the asset; middle panel: price of the asset without discount; bottom panel: their ratio.
Conclusion
In this thesis, affine models and their applications were studied and investigated. Our additions to this already well-developed theory are some minor proofs and the pricing of some derivatives.
Our focus was on the numerical solution of GREs, and we wrote a state-of-the-art Runge-Kutta routine from scratch in C++. That routine performs very well in the given example, and in many other instances. Execution times were always on the order of a second on an ordinary laptop, for a local error tolerance of 10^{−5}. Hence the evaluation of products modelled with affine processes can be carried out on any computer, including a personal one.
Future developments in this field of application could be:
• To develop more products that can be priced within an affine framework.
• To explore the effect of adding jumps whose Laplace transform is not known in closed form. Then the integral
∫_{D\{0}} ( e^{u·ξ} − 1 − u·χ(ξ) ) μ(dξ)
has to be evaluated numerically, and the code has to be written according to the particular jump measure μ.
• Every evaluation of the RHS of the GREs can be very expensive, so methods that require fewer evaluations of the RHS should be employed, like Bulirsch-Stoer.
• It could be useful to develop an "affine numerical toolbox", i.e. a black-box software package oriented to the evaluation of financial products modelled with affine processes. It should include all the improvements proposed in the previous points, along with model calibration and a user-friendly interface, to fulfill the needs of practitioners.
Appendix A
Measure Theory
A.1 Stieltjes-Lebesgue integration
We present here some definitions and results about Stieltjes-Lebesgue integration.
Definition A.1. Let f : [0, t] → R be such that
Vf(t) = sup_D Σ_{i=1}^{N} |f(t_i) − f(t_{i−1})| < ∞,
where D is the set of finite partitions of [0, t]:
0 = t_0 < t_1 < . . . < t_N = t.
Then Vf(t) is called the variation of f over [0, t], and f is said to be of finite variation if Vf(t) < ∞ on each compact interval of R_+.
It is well known that any function of finite variation can be decomposed into the difference of two increasing functions: if g : [0,∞) → R is a function of finite variation, then there exist monotone increasing functions a : [0,∞) → R and b : [0,∞) → R such that g(t) = a(t) − b(t). To a and b correspond two measures
µ_a((0, t]) = a(t),   µ_b((0, t]) = b(t).
Thus it is sufficient to define the Stieltjes integral for monotone increasing functions, since for all measurable functions u and g we can write
∫_0^t u(s) dg(s) = ∫_0^t u(s) da(s) − ∫_0^t u(s) db(s).
Definition A.2. Let f : [0,∞) → R be a deterministic function and g : [0,∞) → R a monotone increasing function. Let p be a finite partition of the interval [a, b], and let |p| := sup_i |t_{i+1} − t_i|. The Stieltjes integral of the function f with respect to the function g over the interval [a, b] is defined as
∫_a^b f(s) dg(s) = lim_{|p|→0} Σ_{i=1}^{n} f(ε_i)(g(t_{i+1}) − g(t_i)),
where ε_i ∈ [t_i, t_{i+1}].
Theorem A.1.1 (Exponential formula, [7], T4, 337). Let a be a right continuous increasing function with a(0) = 0 and let u be such that
∫_0^t |u(s)| da(s) < ∞, t ≥ 0.
Then the equation
x(t) = x(0) + ∫_0^t x(s−) u(s) da(s), t ≥ 0
admits a unique locally bounded (sup_{s∈[0,t]} |x(s)| < ∞, t ≥ 0) solution, given by
x(t) = x(0) Π_{0<s≤t} (1 + u(s)Δa(s)) exp(∫_0^t u(s) da^c(s)), t ≥ 0, (A.1.1)
where Δa(t) = a(t) − a(t−) and a^c(t) = a(t) − Σ_{0<s≤t} Δa(s).
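As a quick sanity check of (A.1.1), one can take the simplest data with a single jump (the values below are chosen purely for illustration):

```latex
% Take x(0) = 1, u(s) \equiv c and a(t) = t + \mathbf{1}_{\{t \ge 1\}},
% so that a^c(t) = t and \Delta a(1) = 1. Formula (A.1.1) gives
x(t) = (1+c)^{\mathbf{1}_{\{t \ge 1\}}}\, e^{ct}.
% The jump of x at t = 1 matches the equation term by term:
x(1) - x(1-) = (1+c)e^{c} - e^{c} = c\, e^{c} = x(1-)\, u(1)\, \Delta a(1).
```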
It is possible to write the Fourier transform of a function in terms of a Stieltjes integral:
Definition A.3 (Fourier-Stieltjes transform). Let α be a monotone increasing, real-valued function of finite variation; then
f(x) = ∫_R e^{ixs} dα(s)
is well defined. The function f(x) is called the Fourier-Stieltjes transform of α.
A.2 Lebesgue measure theorems
These are some classical convergence results for the Lebesgue measure. All the functions are considered to be positive, since it is always possible to write f = f+ − f−, where f+ and f− are positive functions. We take [37] as our reference.
Theorem A.2.1 (Dominated convergence theorem). If the sequence f_n(x) → f(x) as n → ∞ and if, for all n,
f_n(x) ≤ ϕ(x),
where ϕ(x) is integrable, then the limit function f(x) is integrable and
∫_A f_n(x) dµ(x) → ∫_A f(x) dµ(x) as n → ∞.
Theorem A.2.2 (Bounded convergence theorem). If the sequence f_n(x) → f(x) as n → ∞ and if, for all n,
f_n(x) ≤ K,
then the limit function f(x) is integrable and
∫_A f_n(x) dµ(x) → ∫_A f(x) dµ(x) as n → ∞.
Theorem A.2.3 (Fubini theorem). Let the measures µ_x and µ_y be defined on Borel rings, σ-additive and complete; let, moreover,
µ = µ_x ⊗ µ_y,
and let the function f(x, y) be integrable with respect to the measure µ on the set
A = A_{x0} × A_{y0}.
Then
∫_A f(x, y) dµ = ∫_X (∫_{A_x} f(x, y) dµ_y) dµ_x = ∫_Y (∫_{A_y} f(x, y) dµ_x) dµ_y,
where we denote A_x = {y : (x, y) ∈ A} and A_y = {x : (x, y) ∈ A}.
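As a concrete instance of Theorem A.2.3, take Lebesgue measure on the unit square:

```latex
\int_{[0,1]^2} (x+y)\, d(x,y)
  = \int_0^1 \Bigl(\int_0^1 (x+y)\, dy\Bigr) dx
  = \int_0^1 \bigl(x + \tfrac{1}{2}\bigr) dx = 1,
% and integrating first in x, then in y, gives the same value 1.
```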
Appendix B
Stochastic Processes
Here we will recall results and definitions used throughout the thesis.
B.1 Definitions and basic results
Definition B.1 (Filtration). A filtration on (Ω,F ,P) is a family (Ft)t≥0 of sigma alge-
bras Ft ⊂ F such that
Ft ⊂ Fs, 0 ≤ t < s
(i.e. it is increasing). When it is clear from the context, the dependence on the elements
of the probability space will be omitted.
Definition B.2 (Usual conditions). A filtration (Ft)t≥0 is said to satisfy the usual con-
ditions if
• F0 contains all null sets included in F
• ∀t,Ft =⋂s>t Fs, i.e. the filtration is right continuous.
Definition B.3 (Martingale). An n−dimensional stochastic process (Xt)t≥0 on (Ω,F ,P)
is called martingale with respect to the filtration (Ft)t≥0 (for short, (Ft)−martingale or
P−martingale) if
(i) Xt is Ft−measurable for all t.
(ii) EP [|Xt|] <∞ for all t.
(iii) EP [Xs|Ft] = Xt for all 0 ≤ t ≤ s.
If it is clear from the context, the dependence on P or (Ft)t≥0 will be omitted.
Definition B.4 (Adapted process). Let (F_t)_{t≥0} be a filtration defined on the probability space (Ω,F,P). A real-valued process (X_t(ω))_{t≥0} is said to be (F_t)-adapted if for each t ≥ 0 the function
ω → X_t(ω)
is F_t-measurable.
Definition B.5 (Predictable process). Let (F_t)_{t≥0} be a filtration defined on the probability space (Ω,F,P), and let P(F_t) be the σ-algebra generated by the rectangles of the form
(s, t] × A; 0 ≤ s ≤ t, A ∈ F_s.
A real-valued process (X_t)_{t≥0} such that X_0 is F_0-measurable and the mapping (t, ω) → X_t(ω) is P(F_t)-measurable is said to be (F_t)-predictable.
A useful class of predictable processes is the one formed by the left continuous processes, as shown by this theorem:
Theorem B.1.1. An R^n-valued process (X_t)_{t≥0} adapted to (F_t)_{t≥0} and left continuous is (F_t)-predictable.
Proof. It suffices to prove the theorem for an R-valued process; in the case of R^n, the proof is carried out component-wise.
Since X_t(ω) is left continuous, for all (t, ω) ∈ [0,+∞) × Ω we have
X_t(ω) = lim_{n→∞} [ Σ_{q=0}^{n2^n − 1} X_{q/2^n}(ω) 1_{q/2^n < t ≤ (q+1)/2^n} + X_n(ω) 1_{t>n} ].
For any n ≥ 1 and 0 ≤ q ≤ n2^n − 1, X_n and X_{q/2^n} are F_n- and F_{q/2^n}-measurable. Then X_{q/2^n}(ω) 1_{q/2^n < t ≤ (q+1)/2^n} and X_n(ω) 1_{t>n} are P(F_{q/2^n})- and P(F_n)-measurable. Summing over q and passing to the limit in n, we obtain that X_t(ω) is P(F_t)-measurable for all t ≥ 0.
In our context all the predictable processes of use will be left continuous, and it is therefore common to use the class of adapted, left continuous processes as a definition of predictable processes (cf. [21] and [14]).
Theorem B.1.2 (Corollary 3.2.6, [43]). Let f_t(ω) : [0,∞) × Ω → R^n be a function such that, for all t ≥ 0:
(i) (t, ω) → f_t(ω) is B × F-measurable, where B denotes the Borel σ-algebra on [0,∞).
(ii) f_t(ω) is F_t-adapted.
(iii) E[∫_0^t f_s(ω) · f_s(ω) ds] < ∞.
Then the integral, in the Ito sense,
∫_0^t f_s dW_s
is an (F_t)-martingale.
Definition B.6 (Local martingale). Let (X_t)_{t≥0} be a cadlag, adapted process. (X_t)_{t≥0} is a local martingale if there exists an increasing sequence of stopping times (T_n)_{n∈N} with lim_{n→∞} T_n = ∞ a.s., such that
(X_{t∧T_n} 1_{T_n>0})_{t≥0}
is a martingale.
Theorem B.1.3. Let (X_t)_{t≥0} be a local martingale. If E[sup_{s≤t} |X_s|] < ∞, then (X_t)_{t≥0} is a martingale.
Proof. Let (T_n)_{n∈N} be a sequence of stopping times for (X_t)_{t≥0}. Then E[X_{t∧T_n} | F_s] = X_{s∧T_n}. Letting n tend to infinity, the dominated convergence theorem yields E[X_t | F_s] = X_s.
Definition B.7 (Cadlag processes). A process (X_t)_{t≥0} is said to be cadlag if
• lim_{s↓t} X_s = X_t a.s. for all t ≥ 0, i.e. it is right continuous;
• lim_{s↑t} X_s exists a.s. for all t ≥ 0, i.e. it has left limits.
Definition B.8 (Modification). Two processes (X_t)_{t≥0} and (Y_t)_{t≥0} are modifications of each other if X_t = Y_t a.s., for each t ≥ 0.
Theorem B.1.4 (Corollary 1, p. 8, [46]). If X = (X_t)_{0≤t<∞} is a martingale, then there exists a unique modification Y of (X_t)_{t≥0} such that Y is cadlag.
B.2 Levy Processes
Here we present the definition and some results about Levy processes; we refer to [14] for all the proofs and details, located mainly in Chapters 3 and 4. We will denote a set A equipped with a σ-algebra A by (A, A), and the usual Borel σ-algebra by B.
Definition B.9 (Radon measure). Let E ⊂ Rd. A Radon measure on (E,B) is a measure
µ such that for every compact measurable set B ∈ B, µ(B) <∞.
Definition B.10 (Levy process). A cadlag stochastic process (X_t)_{t≥0} on (Ω,F,P) with values in R^d such that X_0 = 0 is called a Levy process if:
1. Independent increments: for every increasing sequence of times t_0, . . . , t_n, the random variables X_{t_0}, X_{t_1} − X_{t_0}, . . . , X_{t_n} − X_{t_{n−1}} are independent.
2. Stationary increments: X_{t+h} − X_t =_d X_{s+h} − X_s, for fixed h and all t ≥ 0.
3. Stochastic continuity: lim_{h→0} P[|X_{t+h} − X_t| ≥ ε] = 0 for all ε > 0.
Definition B.11. Let (Ω,F,P) be a probability space, E ⊂ R^d and µ a given (positive) Radon measure on (E, E). A Poisson random measure on E with intensity measure µ is an integer-valued random measure
M : Ω × E → N
such that:
1. For almost all ω ∈ Ω, M(ω, ·) is an integer-valued Radon measure on E: for any bounded measurable A ⊂ E, M(A) < ∞ is an integer-valued random variable.
2. For each measurable set A ⊂ E, M(·, A) = M(A) is a Poisson random variable with parameter µ(A):
P[M(A) = k] = e^{−µ(A)} (µ(A))^k / k!, k ∈ N.
3. For disjoint measurable sets A_1, . . . , A_n, the variables M(A_1), . . . , M(A_n) are independent.
Definition B.12 (Jump measure). Let (X_t)_{t≥0} be a cadlag process on R^d. The jump measure J_X on [0,∞) × R^d is defined by
J_X(B) = |{t : (t, X_t − X_{t−}) ∈ B}|
for any measurable set B ⊂ [0,∞) × R^d. We denote by | · | the cardinality of a set.
Definition B.13 (Levy measure). Let (X_t)_{t≥0} be a Levy process on R^d. The measure ν on R^d defined by
ν(A) = E[ |{t ∈ [0, 1] : ΔX_t ≠ 0, ΔX_t ∈ A}| ], A ∈ B(R^d),
is called the Levy measure of (X_t)_{t≥0}: ν(A) is the expected number, per unit time, of jumps whose size belongs to A.
Proposition B.2.1 (Levy-Ito decomposition). Let (X_t)_{t≥0} be a Levy process on R^d and ν its Levy measure.
• ν is a Radon measure on R^d \ {0} and verifies
∫_{|x|≤1} |x|² ν(dx) < ∞,   ∫_{|x|≥1} ν(dx) < ∞.
• The jump measure of (X_t)_{t≥0}, denoted by J_X, is a Poisson random measure on [0,∞) × R^d with intensity measure ν(dx)dt.
• There exist a vector γ and a Brownian motion (W_t)_{t≥0} with covariance matrix A such that
X_t = γt + W_t + X^l_t + lim_{ε↓0} X^ε_t,
X^l_t = ∫_{|x|≥1, s∈[0,t]} x J_X(ds × dx),
X^ε_t = ∫_{ε≤|x|≤1, s∈[0,t]} x [J_X(ds × dx) − ν(dx)ds].  (B.2.1)
The terms in (B.2.1) are independent, and the convergence in the last term is a.s. and uniform in t ∈ [0, T]. The triplet (A, γ, ν) is said to be the characteristic triplet of (X_t)_{t≥0}.
The first result tells us that ν is a Radon measure over R^d \ {0}, but nothing prevents ν(R^d) from being infinite. Since ν can diverge at the origin, we write lim_{ε↓0} X^ε_t. In our context we are interested in finite activity processes (i.e. ν(R^d) < ∞), so we have no problems setting ε = 0 directly.
B.2.1 Compound Poisson process
Definition B.14 (Compound Poisson process). A compound Poisson process (or pure jump process) with intensity λ > 0 and jump size distribution f is a stochastic process (X_t)_{t≥0} defined as
X_t = Σ_{i=1}^{N_t} Y_i,
where the jump sizes Y_i are i.i.d. with distribution f and (N_t)_{t≥0} is a Poisson process with intensity λ, independent of (Y_i)_{i≥1}.
The compound Poisson process is, of course, a Levy process. More can be said:
Proposition B.2.2. (Xt)t≥0 is a compound Poisson process if and only if it is a Levy
process and its sample paths are piecewise constant functions.
This result, combined with the following
Proposition B.2.3 (Jump measure of a compound Poisson process). Let (X_t)_{t≥0} be a compound Poisson process with intensity λ and jump size distribution f. Its jump measure J_X is a Poisson random measure on R^d × [0,∞) with intensity measure µ(dx × dt) = ν(dx)dt = λf(dx)dt,
allows us to recognize that the characteristic triplet of a compound Poisson process is (0, 0, λf).
B.3 Infinitesimal generator of a Markov Process
We present here some results about the infinitesimal generator, without claiming to be exhaustive; we refer to [25] for further details.
Definition B.15 (Markov process). An Rn-valued process (Xt)t≥0 is called Markov pro-
cess if
P [Xt ∈ B |σ (X(u) : 0 ≤ u ≤ s) ] = P [Xt ∈ B |σ (X(s)) ]
for all 0 ≤ s ≤ t and for all B ∈ B.
The definition of a Levy process, namely the independent increments hypothesis, classifies it as a Markov process. A stronger property holds:
Theorem B.3.1 (Strong Markov property). Let (X_t)_{t≥0} be a Levy process. If T is a nonanticipating random time, then the process Y_t = X_{t+T} − X_T, t ≥ 0, is again a Levy process, independent of F_T and with the same law as (X_t)_{t≥0}.
Theorem B.3.2. Let (X_t)_{t≥0} be a Markov process and f : R^n → R, and let C_0 be the set of continuous functions vanishing at infinity. Its transition operator, defined as
P_t f(x) = E[f(x + X_t)],
is a semigroup, i.e.
P_t P_s = P_{t+s},
and, if P_t f ∈ C_0,
lim_{t↓0} P_t f(x) = f(x), ∀x ∈ R^n (Feller property),
where the convergence is with respect to the sup norm on C_0.
Definition B.16. Let f ∈ C_0 and (X_t)_{t≥0} a Markov process. Then its infinitesimal generator is defined as
Df = lim_{t↓0} (P_t f − f)/t.
Again, the limit is taken with respect to the sup norm on C_0.
The following result allows us to link the infinitesimal generator of a Levy process to its characteristic triplet:
Proposition B.3.3. Let (X_t)_{t≥0} be a Levy process on R^d with characteristic triplet (A, γ, ν). Then the infinitesimal generator of (X_t)_{t≥0} is defined for any f ∈ C²_0(R^d) as
Df(x) = ∇_x f · γ + (1/2) ∇²_x f : A + ∫_{R^d} [f(x + z) − f(x) − ∇_x f(x) · z 1_{|z|≤1}] dν(z),  (B.3.2)
where C²_0(R^d) is the set of twice continuously differentiable functions vanishing at infinity.
Since we are dealing with finite activity processes, we do not need the jump limiter 1_{|z|≤1}, and we can neglect the last term in the integral. For a Levy process the characteristics are independent of t and ω, but the results can be extended to a more general class of processes, considering Levy processes as building blocks (cf. [36], [35] and references therein). Roughly speaking, it is possible to think of a more general process with characteristics (b, c, F) which, locally after time t, resembles a Levy process with triplet (b, c, F)(ω, t). The generator of such a process has the same form, with parameters depending on ω and t.
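As an illustration of (B.3.2) in the finite-activity case just described, consider a compound Poisson process, whose triplet was identified as (0, 0, λf) in Section B.2.1; to avoid a clash with the test function f, we write its jump size distribution as φ here:

```latex
% Compound Poisson case: A = 0, \gamma = 0, \nu = \lambda\varphi, and the
% truncation term can be dropped, so (B.3.2) reduces to
\mathcal{D}f(x) = \lambda \int_{\mathbb{R}^d}
    \bigl[ f(x+z) - f(x) \bigr]\, \varphi(dz),
% the classical generator of a pure-jump Markov process: jumps arrive at
% rate \lambda and displace the state by z \sim \varphi.
```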
Theorem B.3.4 (Ito formula for jump-diffusion processes). Let (X_t)_{t≥0} be a diffusion process with jumps, defined in integral notation as
X_t = X_0 + ∫_0^t µ_s ds + ∫_0^t σ_s dW_s + Σ_{i=1}^{N_t} ΔX_i,
where µ_t and σ_t are continuous adapted processes with
E[∫_0^T σ_s² ds] < ∞.
Then, for any f : [0, T] × R → R which is once differentiable in the first variable and twice in the second one, the process Y_t = f(t, X_t), t ≥ 0, can be represented as
f(t, X_t) = f(0, X_0) + ∫_0^t [∂f/∂s (s, X_s) + µ_s ∂f/∂x (s, X_s)] ds + (1/2) ∫_0^t σ_s² ∂²f/∂x² (s, X_s) ds + ∫_0^t σ_s ∂f/∂x (s, X_s) dW_s + Σ_{i≥1, T_i≤t} [f(T_i, X_{T_i−} + ΔX_{T_i}) − f(T_i, X_{T_i−})].
Of course this result also holds in the n-dimensional case, substituting derivatives and products with appropriate gradients and inner products.
Appendix C
Risk-neutral Valuation
For an intuitive introduction to the concept of risk neutrality we refer to [34] and [48].
For the theory of continuous time finance we refer to [5]. In the following we will present
the main results for the pricing of financial derivatives.
C.1 The market
We assume that we have a continuous-time security market where investors are allowed to trade continuously up to some fixed finite planning horizon T. Uncertainty in the financial market is modelled by a probability space (Ω,F,P), equipped with a filtration (F_t)_{t≥0} which satisfies the usual conditions. Let (r_t)_{t≥0} be a short rate process, and define B(t) as the riskless money market account, where we assume B(0) = 1 and
B(t) = e^{∫_0^t r_s ds}.
We will take B(t) as our numeraire, a French term for an item or commodity acting as a measure of value or as a standard for currency exchange. A mathematical definition is:
Definition C.1 (Numeraire). A numeraire is a price process X = (X_t)_{t≥0} that is strictly positive a.s. for each t ∈ [0, T].
Another fundamental concept is that of an equivalent martingale measure:
Definition C.2 (Equivalent martingale measure). A probability measure Q defined on (Ω,F) is an equivalent martingale measure if:
(i) Q is equivalent to P, i.e. Q(A) = 0 if and only if P[A] = 0, for all A ∈ F.
(ii) For every dividend-free traded asset with price process P(t), the discounted price process P(t)/B(t) is a martingale under Q.
The market is said to be arbitrage free if there are no arbitrage opportunities. By an arbitrage we mean an investment that requires no initial contribution and produces, with positive probability, a positive payoff without any risk of loss.
It can be shown that the existence of an equivalent martingale measure implies that the market is arbitrage free. This strengthens the intuition that a martingale is a "fair" game:
Theorem C.1.1 (Arbitrage-free market, [5], Theorem 6.1.1). If an equivalent martingale measure exists, then the market model contains no arbitrage opportunities.
C.2 Risk-neutral Valuation
If we want to determine the fair price of a financial instrument, we must ensure that the discounted price process of the asset is a martingale (recall that a martingale can be seen as a fair game). This approach is known as risk-neutral pricing.
Theorem C.2.1 (Risk-neutral valuation formula, [49], Theorem 4.7). Let X be an (F_T)-measurable random variable that is bounded from below, and let Q be a martingale measure on the market for the underlying assets. An arbitrage-free price of a contingent claim paying X at T > t is
S_t = E_Q[ (B(t)/B(T)) X | F_t ],
where the expectation is taken under the pricing measure Q.
Under sufficient technical conditions (market completeness) this price can be taken to be unique.
Appendix D
Numerical Codes
Here we report, thoroughly commented, all the C++ code written for this thesis. Why C++? We have chosen C++ because:
• it is easy to write, yet powerful, flexible and fast;
• there is a lot of source code available in C++, and a large literature about numerical calculus written in C++, like [45];
• it is widely employed in the financial sector by practitioners, e.g. the MoSes™ Financial Modeling Software is basically a platform written in C++.
The codes are based on the algorithms explained in Chapter 4, and the only dependency is the library Template Numerical Toolkit for C++ (TNT)1.
This is a linear algebra library developed by the National Institute of Standards and Technology (NIST), and it allows simple usage of linear algebra data structures. The library was adopted to avoid the "reinventing the squared wheel" issue: if something was already well implemented, why not use it?
There were some problems with the code: some bugs had to be fixed, and we added some operations between tensors that the author of the library had not yet implemented.
Last but not least, we decided to adopt a structured style of programming (i.e. without objects), since we felt that the code is not complex enough to embark on that task.
All the codes are on the enclosed CD-ROM.
1Freely available at http://math.nist.gov/tnt/
Acknowledgments
Here comes the hardest part: the acknowledgments! I entrust myself to my mother tongue, because only through it can I fully express what I think and feel.
First of all I want to thank my family: Fulvio, Gina and Mario. Without them I would not be where I am now, and Munich, which for me used to be just the folkloristic city of Brez'n and chubby drunk Germans, would never have become first a dream, and then a reality of life and work. Thanks for everything, once again.
Now on to my friends.
Luca "Cube" Burini: director of the London branch of IMS-Electronics, avid kitesurfer, scourge of French (and not only French) chicks, may your kidneys forgive you for the protein abuse. And remember: he who wounds by turkey...
Claudio "Skw" Palandra: originally the businessman behind the IMS-Electronics business plan, now a consultant at KPMG Italia. Whether you are stealing money from little old ladies in Verona or licking stamps in Rome, you will always be the biggest con. Oh dear, have I violated some privacy regulation?!?!?!? I am still waiting for a real "Last Kebab Standing"...
Michele "Dottor Morte\Migga da Nigga" Schirru: a person noble of heart and of pure feelings (although advanced studies have detected none), he decided to devote 12 years of his life so as to devote what remains of it to others. At least until his horde of macrocephalic, orchitis-carrying killer zombies invades the earth... A friend since forever, I am sure you will do great things! And that I will never be your patient...
Sirio "I'd be the maddest one if..." Valent: the (pen-)armed wing of a deviant lodge of Confindustria, he tries to convince us that the ultimate goal of Italian scientific research is the cloning of our "dear leader". And being so absurd, it is surely true, and nobody believes it! May you realize a small part of your dreams, or at least not die crushed under their weight.
Marco "Pikachu" Petralia: so many things in a single name: decadent aesthete, approximate physician, amateur of the bizarre, prankster Viking, lover of animals (sic!) and of young maidens whose clothes are soaked with the scents of our joyful Lazio countryside: caciotta cheese, hay, manure... Try to come of age, brain-wise, before I retire!
Emanuele "Ghiottolino" Mattei: although the nickname might suggest a teddy bear, he is better! Less hairy, he eats less without, however, compromising his softness to the touch, and if you squeeze his tummy he even speaks Japanese! What more could you want from life? Courage, Lele, the big break lies in wait for everyone! Just like a Senegalese armed with a club...
Matteo "CoffeeMan" Benvenuti: analyze carefully what is written in these pages; I want your professional opinion... Don't tell me things I already know, such as:
1. That I am crazy.
2. That I have an unhealthy relationship with sex.
3. That my mother did not breastfeed me.
Surprise me!
Daniele "Gufo" De Carolis: what can I say; if he did not exist, he would have to be invented. Broken. With cheap materials. And the designers, hired by the Iranian government to create an Islamic supersoldier, would all be executed after the failure. I love you as if you were the brother I never wanted!
Dario "Were-doormat" Cardilli: by day a distinguished consultant for Accenture Technology Solution, by night (and also on weekends and holidays) a pet and a piece of furniture of vague Ikea inspiration (model "Dorrmatta"). Finding the purpose of your own life in a person other than yourself is as clever as locking yourself inside a lion's cage to protect yourself from thieves: the solution will end up hurting you more than the problem.
Ambra: the first person I ever saw as a life companion; thank you for giving me a dream, even if it died at the dawn of a new day. I hope you will know how to behave better with the next person at your side.
I dedicate this thesis also to my friends who, by putting up with me for months, made Munich an experience full of emotions:
Monica known as "Chicca", Elisa, Sabine, Fiorella, Alex known as "Sacco", Elisabetta known as "Fratello", Valeria, Stefano, Maria and many others.
If it had not been for you, Bavaria would not be emerald green, the Madonna would not shine golden over Marienplatz, and the Weissbier would not taste so good.
Last but not least, I thank God for making me an atheist2, caffeine, Weissbier and heavy metal in all its forms.
Alza tu cerveza, brinda por la libertad [. . .] (Raise your beer, toast to freedom [. . .]
llegar a la meta no es vencer. reaching the goal is not winning.
Lo importante es el camino y en el, What matters is the road, and on it:
caer, levantarse, insistir, aprender. to fall, to rise again, to persist, to learn.)
La Posada De Los Muertos - Mago de Oz
2I have been waiting years for this moment!!!!
Bibliography
[1] P. Artzner and F. Delbaen. Default risk and incomplete insurance
markets. Mathematical Finance, 5:187–195, 1995.
[2] D. Bates. Jumps and stochastic volatility: the exchange rate processes
implicit in deutschemark options. Review of Financial Studies, 9:69–
107, 1996.
[3] D. Bates. Maximum likelihood estimation of latent affine processes.
Review of Financial Studies, 19(3):909–965, 2006.
[4] Messod D. Beneish and Eric G. Press. Interrelation among events of
default. Contemporary Accounting Research, 12:57–84, 1995.
[5] N. Bingham and R. Kiesel. Risk-Neutral Valuation : Pricing and Hedg-
ing of Financial Derivatives. Springer Verlag, 2004.
[6] F. Black and J. Cox. Valuing corporate securities: Some effects of
bond indenture provisions. Journal of Finance, 31:351–367, 1976.
[7] P. Bremaud. Point Processes and Queues: Martingale Dynamics.
Springer Verlag, 1981.
[8] Wolfgang Buhler and Monika Trapp. Credit and liquidity risk in bond
and CDS markets. Working Paper, University of Mannheim, 2005.
[9] J. C. Butcher. Numerical Methods for Ordinary Differential Equations.
John Wiley and Sons, 2003.
[10] J. R. Cash and A. H. Karp. A variable order Runge-Kutta method
for initial value problems with rapidly varying right-hand sides. ACM
Transactions on Mathematical Software, 16:201–222, 1990.
[11] Li Chen and Damir Filipovic. Credit derivatives in an affine frame-
work. Asia-Pacific Financial Markets, 13:123–140, 2007.
[12] Patrick Cheridito, Damir Filipovic, and Robert L. Kimmel. A note
on the dai–singleton canonical representation of affine term structure
models. forthcoming in Mathematical Finance.
[13] K. Chung. Lectures from Markov Process to Brownian Motion.
Springer Verlag, 1982.
[14] Rama Cont and Peter Tankov. Financial modelling with Jump Pro-
cesses. Chapman & Hall - CRC Press, 2003.
[15] J. Cox, J.E. Ingersoll, and S.A. Ross. A theory of the term structure
of interest rates. Econometrica, 53:385–402, 1985.
[16] Qiang Dai and Kenneth J. Singleton. Specification analysis of affine
term structure models. The Journal of Finance, 55(5):1943–1978,
2000.
[17] C. Dellacherie and P.-A. Meyer. Probabilities and Potential. North
Holland, 1978.
[18] J. R. Dormand and P. J. Prince. A family of embedded Runge-Kutta
formulae. Journal of Computational and Applied Mathematics, 6(1):19–
26, 1980.
[19] J.-C. Duan and J.G. Simonato. Estimating and testing exponential
affine term structure models by Kalman filter. Review of Quantitative
Finance and Accounting, 13(2):111–135, 1999.
[20] Darrell Duffie and Rui Kan. A yield-factor model of interest rates.
Mathematical Finance, 6(4):379–406, 1996.
[21] Darrell Duffie. Credit risk modeling with affine processes. Technical
report, Stanford University and Scuola Normale Superiore, Pisa, 2004.
[22] Darrell Duffie, Damir Filipovic, and Walter Schachermayer. Affine
processes and applications in finance. Annals of Applied Probability,
13(3):984–1053, 2003.
[23] Darrell Duffie, Jun Pan, and Kenneth Singleton. Transform analysis
and asset pricing for affine jump-diffusions. Econometrica, 68(6):1343–
1376, 2000.
[24] Abel Elizalde. Credit risk models II: Structural models. CEMFI Work-
ing Paper No. 0606, September 2003.
[25] S. Ethier and T. Kurtz. Markov Processes, Characterization and Con-
vergence. John Wiley and Sons, 1986.
[26] Erwin Fehlberg. Low-order classical Runge-Kutta formulas with step
size control and their application to some heat transfer problems.
NASA Technical Report, 315, 1969.
[27] Damir Filipovic. Time-inhomogeneous affine processes. Stochastic
Processes and Their Applications, 115:639–659, 2005.
[28] Damir Filipovic. Term Structure Models, an introduction. Springer
Verlag, 2008.
[29] Gianni Gilardi. Analisi due. McGraw-Hill, 1996.
[30] Gene H. Golub and Charles F. Van Loan. Matrix Computations. Johns
Hopkins University Press, 1996.
[31] Jan Grandell. Doubly stochastic Poisson processes, volume 529 of Lec-
ture Notes in Mathematics. Springer Verlag, 1976.
[32] A. Harvey, E. Ruiz, and N. Shepard. Multivariate stochastic variance
models. Review of Economic Studies, 61(2):247–264, 1994.
[33] Stephanie Hofling. Credit risk modeling and valuation: The reduced
form approach and copula models. Master’s thesis, Technische Uni-
versitat Munchen, 2006.
[34] John C. Hull. Options, futures and other derivatives, 6th edition. Pren-
tice Hall, 2006.
[35] J. Jacod and A.N. Shiryaev. Limit Theorems for Stochastic Processes.
Springer Verlag, 1987.
[36] Jan Kallsen. A didactic note on affine volatility models. In From
Stochastic Calculus to Mathematical Finance, pages 343–368. Springer
Verlag, 2006.
[37] A.N. Kolmogorov and S.V. Fomin. Measure, Lebesgue Integrals, and
Hilbert Space. Academic Press, 1961.
[38] J. Lambert. Numerical Methods for Ordinary Differential Systems.
John Wiley and Sons, 1991.
[39] R. Merton. On the pricing of corporate debt: The risk structure of
interest rates. Journal of Finance, 29:449–470, 1974.
[40] H. Mori and H. Takahasi. Quadrature formulas obtained by variable
transformation. Numerische Mathematik, 21:206–219, 1973.
[41] M. Mori. Quadrature formulas obtained by variable transformation
and the DE rule. Journal of Computational and Applied Mathematics, 12-
13:119–130, 1985.
[42] Masatake Mori. Discovery of the double exponential transformation
and its developments. Publications of the Research Institute for Math-
ematical Sciences, 41:897–935, 2005.
[43] Bernt Oksendal. Stochastic Differential Equations. Springer Verlag,
1991.
[44] C.D. Pagani and S.Salsa. Analisi Matematica, Volume 2. Masson,
2004.
[45] William H. Press, Saul A. Teukolsky, William T. Vetterling, and
Brian P. Flannery. Numerical Recipes: The Art of Scientific Com-
puting, 3rd edition. Cambridge University Press, 2007.
[46] Philip E. Protter. Stochastic Integration and Differential Equations.
Springer Verlag, 2005.
[47] Alfio Quarteroni, Riccardo Sacco, and Fausto Saleri. Numerical Math-
ematics. Springer Verlag, 2000.
[48] Walter Schachermayer. The notion of arbitrage and free lunch in math-
ematical finance. In Aspects of Mathematical Finance, pages 15–22.
Springer Verlag, 2008.
[49] P. J. Schonbucher. Credit Derivatives Pricing Models. John Wiley and
Sons, 2003.
[50] Kenneth J. Singleton. Estimation of affine asset pricing models us-
ing the empirical characteristic function. Journal of Econometrics,
102:111–141, 2001.
[51] J. Stoer and R. Bulirsch. Introduction to Numerical Analysis. Springer
Verlag, 1991.
[52] Morton Gurtin. An Introduction to Continuum Mechanics. Academic
Press, 1982.
[53] Oldrich Vasicek. An equilibrium characterization of the term struc-
ture. Journal of Financial Economics, 5:177–188, November 1977.