Katholieke Universiteit Leuven
FACULTY OF SCIENCE
DEPARTMENT OF MATHEMATICS
MODERN RESERVING TECHNIQUES
FOR THE INSURANCE BUSINESS
by
Tom HOEDEMAKERS
Supervisors:
Prof. Dr. J. Beirlant
Prof. Dr. J. Dhaene
Dissertation submitted for the
degree of
Doctor of Science
Leuven 2005
Acknowledgments
Four years ago I became part of the stimulating and renowned academic
environment at K. U. Leuven, the Department of Applied Economics, and
the AFI Leuven Research Center in particular. As a researcher, I had the
opportunity to interact, work with and learn from many interesting people.
I consider myself extremely fortunate to have had the support of the
following people in the realization of this thesis.
I feel very privileged to have worked with my two supervisors, Jan
Beirlant and Jan Dhaene. To each of them I owe a great debt of gratitude
for their continuous encouragement, patience, inspiration and friendship.
I especially want to thank them for the freedom they allowed me to seek
satisfaction in research, for supporting me in my choices and for believing
in me. They carefully answered the many (sometimes not well-defined)
questions that I had and they always found a way to make themselves
available for yet another meeting. Each chapter of this thesis has benefitted
from their critical comments, which often inspired me to do further research
and to improve the vital points of the argument. It has been a privilege
to study under Jan and Jan, and to them goes my highest personal and
professional respect.
I am also grateful to Marc Goovaerts for giving me the opportunity to
start my thesis in one of the world-leading actuarial research centers. Marc
Goovaerts and Jan Dhaene have taught me a great deal about the field of
actuarial science by sharing with me the joy of discovery and investigation
that is the heart of research. They brought me in contact with a lot of
interesting people in the actuarial world and gave me the possibility to
present my work at different congresses all over the world.
I would also like to thank the other members of the doctoral committee
Michel Denuit, Rob Kaas, Wim Schoutens and Jef Teugels for their valuable
contributions as committee members. Their detailed comments as
well as their broader reactions definitely helped me to improve the quality
of my research and its write-up.
Many thanks also go to my (ex-)colleagues Ales, Bjorn, David, Grzegorz,
Katrien, Marc, Piotr, Steven and Yuri for their enthusiasm and stimulating
cooperation. A lot of sympathy goes to Emiliano Valdez for the serious
discussions, and even more important, for the fun we had during his stay
at the K. U. Leuven in the beginning of this year.
After the professionals, a word of thanks is addressed to all my friends
and fellow students for their friendship and support.
Last but not least, I would like to thank my parents and my sister Leen
for their love, guidance and support. They constantly reminded me of their
confidence and encouraged me to pursue my scientific vocation, especially
in moments of doubt. You have always believed in me and that was a great
moral support.
Tom
Leuven, 2005
Table of Contents
Acknowledgments
Preface
Publications
List of abbreviations and symbols
1 Risk and comonotonicity in the actuarial world
1.1 Fundamental concepts in actuarial risk theory
1.1.1 Dependent risks
1.1.2 Risk measures
1.1.3 Actuarial ordering of risks
1.2 Comonotonicity
2 Convex bounds
2.1 Introduction
2.2 Convex bounds for sums of dependent random variables
2.2.1 The comonotonic upper bound
2.2.2 The improved comonotonic upper bound
2.2.3 The lower bound
2.2.4 Moments based approximations
2.3 Upper bounds for stop-loss premiums
2.3.1 Upper bounds based on lower bound plus error term
2.3.2 Bounds by conditioning through decomposition of the stop-loss premium
2.3.3 Partially exact/comonotonic upper bound
2.3.4 The case of a sum of lognormal random variables
2.4 Application: discounted loss reserves
2.4.1 Framework and notation
2.4.2 Calculation of convex lower and upper bounds
2.5 Convex bounds for scalar products of random vectors
2.5.1 Theoretical results
2.5.2 Stop-loss premiums
2.5.3 The case of log-normal discount factors
2.6 Application: the present value of stochastic cash flows
2.6.1 Stochastic returns
2.6.2 Lognormally distributed payments
2.6.3 Elliptically distributed payments
2.6.4 Independent and identically distributed payments
2.7 Proofs
3 Reserving in life insurance business
3.1 Introduction
3.2 Modelling stochastic decrements
3.3 The distribution of life annuities
3.3.1 A single life annuity
3.3.2 A homogeneous portfolio of life annuities
3.3.3 An ‘average’ portfolio of life annuities
3.3.4 A numerical illustration
3.4 Conclusion
4 Reserving in non-life insurance business
4.1 Introduction
4.2 The claims reserving problem
4.3 Model set-up: regression models
4.3.1 Lognormal linear models
4.3.2 Loglinear location-scale models
4.3.3 Generalized linear models
4.3.4 Linear predictors and the discounted IBNR reserve
4.4 Convex bounds for the discounted IBNR reserve
4.4.1 Asymptotic results in generalized linear models
4.4.2 Lower and upper bounds
4.5 The bootstrap methodology in claims reserving
4.5.1 Introduction
4.5.2 Central idea
4.5.3 Bootstrap confidence intervals
4.5.4 Bootstrap in claims reserving
4.6 Three applications
4.6.1 Lognormal linear models
4.6.2 Loglinear location-scale models
4.6.3 Generalized linear models
4.7 Conclusion
5 Other approximation techniques for sums of dependent random variables
5.1 Introduction
5.2 Moment matching approximations
5.2.1 Two well-known moment matching approximations
5.2.2 Application: discounted loss reserves
5.3 Asymptotic approximations
5.3.1 Preliminaries for heavy-tailed distributions
5.3.2 Asymptotic results
5.3.3 Application: discounted loss reserves
5.4 The Bayesian approach
5.4.1 Introduction
5.4.2 Prior choice
5.4.3 Iterative simulation methods
5.4.4 Bayesian model set-up
5.5 Applications in claims reserving
5.5.1 The comonotonicity approach versus the Bayesian approximations
5.5.2 The comonotonicity approach versus the asymptotic and moment matching approximations
5.6 Proofs
Samenvatting in het Nederlands (Summary in Dutch)
Bibliography
Preface
Uncertainty is very much a part of the world in which we live. Indeed, one
often hears the well-known cliche that the only certainties in life are death
and taxes. However, even these supposed certainties are far from being
completely certain, as any actuary or accountant can attest. For although
one’s eventual death and the requirement that one pay taxes may be facts
of life, the timing of one’s death and the amount of taxes to pay are far from
certain and are generally beyond one’s control. Uncertainty can make life
interesting. Indeed, the world would likely be a very dull place if everything
were perfectly predictable. However, uncertainty can also cause grief and
suffering.
Actuarial science is the subject whose primary focus is analyzing the
financial consequences of future uncertain events. In particular, it is con-
cerned with analyzing the adverse financial consequences of large, unpre-
dictable losses and with designing mechanisms to cushion the harmful fi-
nancial effects of such losses.
Insurance is based on the premise that individuals faced with large and
unpredictable losses can reduce the variability of such losses by forming a
group and sharing the losses incurred by the group as a whole. This im-
portant principle of loss sharing, known as the insurance principle, forms
the foundation of actuarial science. It can be justified mathematically
using the Central Limit Theorem from probability theory. For the insur-
ance principle to be valid, essentially four conditions should hold (or very
nearly hold). The losses should be unpredictable. The risks should be
independent in the sense that a loss incurred by one member of the group
makes additional losses by other members of the group no more or less
likely. The risks should be homogeneous in the sense that a loss incurred
by one member of the group is not expected to be any different in size or
likelihood from losses incurred by other members of the group. Finally,
the group should be sufficiently large so that the portion of the total loss
that each individual is required to pay becomes relatively certain. In prac-
tice, risks are not truly independent or homogeneous. Moreover, there will
always be situations where the condition of unpredictability is violated.
Actuarial science seeks to address the following three problems associated
with any such insurance arrangement:
1. Given the nature of the risk being assumed, what price (i.e. premium)
should the insurance company charge?
2. Given the nature of the overall risks being assumed, how much of
the aggregate premium income should the insurance company set
aside in a reserve to meet contractual obligations (i.e. pay insurance
claims) as they arise?
3. Given the importance to society and the general economy of having
sound financial institutions able to meet all their obligations, how
much capital should an insurance company have above and beyond
its reserves to absorb losses that are larger than expected? Given the
actual level of an insurance company’s capital, what is the probability
of the company remaining solvent?
These are generally referred to as the problems of pricing, reserving, and
solvency.
This thesis focuses on the problem of reserving and total balance sheet
requirements. A reserving analysis involves the determination of the ran-
dom present value of an unknown amount of future loss payments. For a
property/casualty insurance company this uncertain amount is usually the
most important number on its financial statement. The care and expertise
with which that number is developed are crucial to the company and to its
policyholders. It is important not to let the inherent uncertainties serve as
an excuse for providing anything less than a rigorous scientific analysis.
Among those who rely on reserve estimates, interests and priorities may
vary. To company management the reserve estimate should provide reliable
information in order to maximize the company’s viability and profitabil-
ity. To the insurance regulator, concerned with company solvency, reserves
should be set conservatively to reduce the probability of failure of the in-
surance company. To the tax agent charged with ensuring timely reporting
of earned income, the reserves should reflect actual payments as “nearly
as it is possible to ascertain them”. The policyholder is most concerned
that reserves are adequate to pay insured claims, but does not want to be
overcharged.
Setting the techniques aside, the primary goal of the reserving process
can be stated quite simply. As of a given date, an insurer is liable for
all claims incurred from that date on, both for claims that arise from
events that have already occurred and for claims that arise from risks
covered by the insurer for which the uncertain event has not yet occurred.
Costs
associated with these claims fall into two categories: those which have been
paid and those which have not. The primary goal of the reserving process
is to estimate those which have not yet been paid (i.e. unpaid losses). As
of a given reserve date, the distribution of possible aggregate unpaid loss
amounts may be represented as a probability density function. Much has
been written about the statistical distributions that have proven to be
most useful in the study of risk and insurance. In practice full information
about the underlying distributions is hardly ever available. For this reason
one often has to rely on partial information, for example estimates of the
first few moments. Not only the basic summary measures, but also
more sophisticated risk measures (such as measures of skewness or extreme
percentiles of the distribution) which require much deeper knowledge about
the underlying distributions are of interest. The computation of the first
few moments may be seen as just a first attempt to explore the properties of
a random distribution. Moreover in general the variance does not appear to
be the most suitable risk measure to determine the solvency requirements
for an insurance portfolio. As a two-sided risk measure it takes into account
both positive and negative discrepancies which leads to underestimation
of the reserve in the case of a skewed distribution. Moreover it does not
emphasize the tail properties of the distribution. In this case it seems
much more appropriate to use the Value-at-Risk (the p-th quantile) or
the Tail Value-at-Risk (which is essentially an average of all quantiles
above a predefined level p). Risk measures based on stop-loss premiums
(for example the Expected Shortfall) can also be used in this context.
These trends are also reflected in the recent regulatory changes in
banking and insurance (Basel 2 and Solvency 2), which stress the role of
the risk-based approach in asset-liability management. This creates a need
for new methodological tools that make it possible to obtain more
sophisticated information about the underlying risks, such as upper
quantiles, stop-loss premiums and others.
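The risk measures named in this paragraph can be made concrete on a sample of simulated or observed losses. The following is a minimal sketch, not taken from the thesis: the function names are hypothetical, the quantile convention (smallest observation with at least a fraction p of the sample at or below it) is one common choice among several, and the Expected Shortfall is taken here to be the stop-loss premium with retention equal to the Value-at-Risk.

```python
import math


def value_at_risk(losses, p):
    """Empirical p-quantile: smallest observation x such that at least
    a fraction p of the sample is <= x."""
    ordered = sorted(losses)
    n = len(ordered)
    # The small tolerance guards against floating-point noise in p * n.
    k = max(math.ceil(p * n - 1e-12) - 1, 0)
    return ordered[k]


def tail_value_at_risk(losses, p):
    """Average of the empirical quantile function over the levels (p, 1]."""
    ordered = sorted(losses)
    n = len(ordered)
    total = 0.0
    for k, x in enumerate(ordered, start=1):
        # The k-th order statistic is the quantile for levels in ((k-1)/n, k/n].
        lower = max((k - 1) / n, p)
        upper = k / n
        if upper > lower:
            total += x * (upper - lower)
    return total / (1.0 - p)


def stop_loss_premium(losses, retention):
    """Empirical E[(S - d)+] for retention d."""
    return sum(max(x - retention, 0.0) for x in losses) / len(losses)


def expected_shortfall(losses, p):
    """Stop-loss premium with retention equal to the Value-at-Risk at level p."""
    return stop_loss_premium(losses, value_at_risk(losses, p))
```

With these conventions the empirical measures satisfy the identity TVaR_p = VaR_p + ESF_p / (1 - p), which shows how the tail measures build on the quantile and the stop-loss premium.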
There is little in the actuarial literature which considers the adequate
computation of the distribution of reserve outcomes. Several methods exist
for efficiently approximating the distribution functions of sums of
independent risks (e.g. Panjer’s recursion, convolution, ...). Moreover, if
the number of risks in an insurance portfolio is large enough, the Central
Limit Theorem yields a normal approximation for aggregate claims.
Therefore, even if the independence assumption is not justified (e.g.
when it is rejected by formal statistical tests), it is often used in
practice because of its mathematical convenience. In many practical
applications the independence assumption is violated, which can lead to
significant underestimation of the riskiness of the portfolio. This is the
case for example when the actuarial technical risk is combined with the
financial investment risk.
Unlike in finance, the concept of stochastic interest rates emerged
quite recently in insurance. Traditionally actuaries rely on deterministic
interest rates. Such a simplification makes it possible to handle summary
measures of financial contracts, such as the mean, the standard deviation
or the upper quantiles, efficiently. However, due to the high uncertainty
about future investment results, actuaries are forced to adopt very
conservative assumptions in order to calculate insurance premiums or
mathematical reserves. As
a result the diversification effects between returns in different investment
periods cannot be taken into account (i.e. the fact that poor investment
results in some periods are usually compensated by very good results in
others). This additional cost is transferred either to the insureds who
have to pay higher insurance premiums or to the shareholders who have
to provide more economic capital.
For these reasons the need for models with stochastic interest rates is
now well understood in the actuarial world as well. The move toward
stochastic modelling of interest rates is further encouraged by the latest
regulatory changes in banking and insurance (Basel 2, Solvency 2), which
promote the risk-based approach to determining economic capital. That is,
they state that traditional actuarial conservatism should be replaced by
fair value reserving, with the regulatory capital determined solely on the
basis of unexpected losses, which can be estimated e.g. by taking the
Value-at-Risk at an appropriate probability level p. Projecting cash flows
with stochastic rates of return is also crucial in pricing applications in
insurance, such as the embedded value (the present value of cash flows
generated only by policies-in-force) and the appraisal value (the present
value of cash flows generated both by policies-in-force and by new
business, i.e. the policies which will be written in the future).
The problem discussed above can be summarized mathematically as follows.
Let $X_i$ denote a random amount to be paid at time $t_i$, $i = 1, \ldots, n$,
and let $V_i$ denote the discount factor over the period $[0, t_i]$. We will
consider the present value of future payments, which is a scalar product of
the form
$$S = \sum_{i=1}^{n} X_i V_i. \tag{1}$$
The random vector $\vec{X} = (X_1, X_2, \ldots, X_n)$ may reflect e.g. the
insurance or credit risk, while the vector $\vec{V} = (V_1, V_2, \ldots, V_n)$
represents the financial/investment risk. In general we assume that these
vectors are mutually independent. In practical applications the independence
assumption is often violated, e.g. due to an inflation factor which strongly
influences both payments and investment results. One can however tackle this
problem by considering sums of the form
$$S = \sum_{i=1}^{n} \tilde{X}_i \tilde{V}_i,$$
where $\tilde{X}_i = X_i / Z_i$ and $\tilde{V}_i = V_i Z_i$ are the adjusted
values expressed in real terms ($Z_i$ denotes here an inflation factor over
the period $[0, t_i]$). For this reason the assumption of independence between
the insurance risk and the financial risk is in most cases realistic and can
be efficiently applied to obtain various quantities describing risk within
financial institutions, e.g. discounted insurance claims or the
embedded/appraisal value of a company.
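The scalar product (1) can be explored by simulation once concrete distributions are chosen. The sketch below is only an illustration under assumptions that are not fixed by the text at this point: payments are taken to be independent lognormal variables, yearly log-returns are taken to be i.i.d. normal, so that the discount factor over [0, t_i] is the exponential of minus the accumulated return, and all function and parameter names are hypothetical.

```python
import math
import random


def simulate_present_value(n_sims, n_periods, pay_mu, pay_sigma,
                           ret_mu, ret_sigma, seed=0):
    """Draw n_sims outcomes of S = sum_i X_i * V_i.

    Assumptions (illustrative only):
      X_i ~ lognormal(pay_mu, pay_sigma), independent of the returns;
      V_i = exp(-(Y_1 + ... + Y_i)) with Y_k i.i.d. normal(ret_mu, ret_sigma).
    """
    rng = random.Random(seed)
    outcomes = []
    for _ in range(n_sims):
        cumulative_return = 0.0
        present_value = 0.0
        for _ in range(n_periods):
            # Accumulate the log-return up to the current payment date.
            cumulative_return += rng.gauss(ret_mu, ret_sigma)
            payment = rng.lognormvariate(pay_mu, pay_sigma)
            present_value += payment * math.exp(-cumulative_return)
        outcomes.append(present_value)
    return outcomes
```

Because the discount factors share the accumulated returns of earlier periods, the summands of S are strongly dependent even though payments and returns are independent of each other. This is the dependence that the approximation techniques discussed next are meant to handle.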
Typically the distribution function of such a sum is rather involved, for
two main reasons. First of all, the distribution of a sum of random
variables whose marginal distributions belong to the same distribution
class does not in general belong to that class. Secondly, the stochastic
dependence between the elements in the sum precludes convolution and
complicates matters considerably.
Consequently, in order to compute functionals of sums of dependent
random variables, approximation methods are generally indispensable.
Provided that the whole dependency structure is known, one can use Monte
Carlo simulation to obtain empirical distribution functions. However, this
is typically a time-consuming approach, in particular if we want to
approximate tail probabilities, which would require an excessive number of
simulations. Therefore, alternative methods need to be explored. In this
thesis we discuss the most frequently used approximation techniques for
reserving applications.
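A standard back-of-the-envelope calculation (not taken from the thesis) indicates why tail probabilities are expensive to simulate. Assume $N$ independent draws $S^{(1)}, \ldots, S^{(N)}$ of the sum and let $p = \Pr[S > x]$ be the tail probability of interest; the symbols $N$, $x$ and $\hat{p}_N$ are introduced here for illustration only.

```latex
\hat{p}_N = \frac{1}{N} \sum_{k=1}^{N} \mathbf{1}\{S^{(k)} > x\},
\qquad
\operatorname{Var}[\hat{p}_N] = \frac{p(1-p)}{N},
\qquad
\frac{\sqrt{\operatorname{Var}[\hat{p}_N]}}{p} = \sqrt{\frac{1-p}{N p}} .
```

For a relative standard error of 10% at $p = 0.001$ one needs $N = (1-p)/(0.01\,p) \approx 10^5$ draws, and the required number grows like $1/p$ as one moves further into the tail.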
The central idea in this work is the concept of comonotonicity. We
propose to solve the problem described above by calculating upper and
lower bounds for the sum of dependent random variables, making efficient
use of the available information. These bounds are based on a general
technique for deriving lower and upper bounds for stop-loss premiums of
sums of dependent random variables, as explained in Kaas et al. (2000) and
Dhaene et al. (2002a,b), among others.
The first approximation we will consider for the distribution function of
the discounted reserve is derived by approximating the dependence struc-
ture between the random variables involved by a comonotonic dependence
structure. In this way the multi-dimensional problem is reduced to a two-
dimensional one which can easily be solved by conditioning and using some
numerical techniques. It is argued that this approach is plausible in ac-
tuarial applications because it leads to prudent and conservative values of
the reserves and solvency margin. If the dependency structure between
the summands of S is strong enough, this upper bound in convex order
performs reasonably well.
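What makes the comonotonic approximation convenient is that the quantiles of a comonotonic sum are simply the sums of the marginal quantiles. As a minimal sketch, assume a sum of lognormal terms $\alpha_i e^{Z_i}$ with non-negative weights $\alpha_i$ and normal $Z_i$ with mean $\mu_i$ and standard deviation $\sigma_i$; the function name and parameter names below are hypothetical, and the lognormal setting is chosen only as an example.

```python
import math
from statistics import NormalDist


def comonotonic_quantile(p, alphas, mus, sigmas):
    """p-quantile of the comonotonic upper bound of sum_i alpha_i * exp(Z_i).

    Assumes alpha_i >= 0 and Z_i ~ normal(mu_i, sigma_i). Every term is
    driven by the same standard normal quantile, so the marginal
    quantiles simply add up.
    """
    z = NormalDist().inv_cdf(p)
    return sum(a * math.exp(m + s * z)
               for a, m, s in zip(alphas, mus, sigmas))
```

No information about the dependence between the terms is needed, which explains both the ease of computation and the conservative character of this bound.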
The second approximation, which is derived by considering conditional
expectations, takes part of the dependence structure into account. This
lower bound in convex order turns out to be extremely useful to evaluate
the quality of approximation provided by the upper bound. The lower
bound can also be applied as an approximation of the underlying
distribution. This choice is not actuarially prudent; however, the relative
error of this approximation is significantly smaller than that of the upper
bound. For this reason, the lower bound will always be preferable in the
applications which require high precision of approximations, like pricing of
exotic derivatives (e.g. Decamps et al. (2004), Deelstra et al. (2004) and
Vyncke et al. (2004)) or optimal portfolio selection problems (e.g. Dhaene
et al. (2005)).
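For orientation, the two bounds can be written out in the lognormal case, following the general construction in the comonotonicity literature cited above. The notation is introduced here as an assumption: $S = \sum_i \alpha_i e^{Z_i}$ with $\alpha_i \ge 0$, $(Z_1, \ldots, Z_n)$ multivariate normal, $\Lambda$ a linear combination of the $Z_i$ used as conditioning variable, $r_i$ the correlation between $Z_i$ and $\Lambda$, and $U$ a uniform(0,1) random variable.

```latex
S^{l} = \mathrm{E}[S \mid \Lambda]
      = \sum_{i=1}^{n} \alpha_i
        \exp\!\Big( \mathrm{E}[Z_i] + r_i \sigma_{Z_i} \Phi^{-1}(U)
                    + \tfrac{1}{2} (1 - r_i^2) \sigma_{Z_i}^2 \Big),
\qquad
S^{c} = \sum_{i=1}^{n} \alpha_i
        \exp\!\Big( \mathrm{E}[Z_i] + \sigma_{Z_i} \Phi^{-1}(U) \Big),
\qquad
S^{l} \le_{cx} S \le_{cx} S^{c} .
```

All three random variables have the same mean. When every $r_i$ is non-negative, $S^{l}$ is itself a comonotonic sum, so its quantiles are again obtained by adding the quantiles of the individual terms.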
This thesis is set out as follows.
The first chapter recalls the basics of actuarial risk theory. We define
some frequently used measures of dependence and the most important
orderings of risks for actuarial applications. We further introduce several
well-known risk measures and the relations that hold between them. We
summarize properties of these risk measures that can be used to facilitate
decision making. Finally, we provide theoretical background for the
concept of comonotonicity and we review the most important properties of
comonotonic risks.
In Chapter 2 we recall how the comonotonic bounds can be derived and
illustrate the theoretical results by means of an application in the
context of discounted loss reserves. The advantage of working with a sum of
comonotonic variables is that the distribution of such a sum is quite easy
to calculate. In particular this technique is very useful for finding
reliable estimates of upper quantiles and stop-loss premiums.
In practical applications the comonotonic upper bound seems to be
useful only in the case of a very strong dependency between successive
summands. Even then the bounds for stop-loss premiums provided by
the comonotonic approximation are often not satisfactory. In this chapter
we present a number of techniques for determining much more
efficient upper bounds for stop-loss premiums. To this end, we use on the
one hand the method of conditioning as in Curran (1994) and in Rogers
and Shi (1995), and on the other hand the upper and lower bounds for
stop-loss premiums of sums of dependent random variables. We also show
how to apply the results to the case of sums of lognormally distributed
random variables. Such sums are widely encountered in practice, both in
actuarial science and in finance.
We derive comonotonic approximations for the scalar product of random
vectors of the form (1) and explain a general procedure to obtain
accurate estimates for quantiles and stop-loss premiums. We study the
distribution of the present value function of a series of random payments
in a stochastic financial environment described by a lognormal discounting
process. Such distributions occur naturally in a wide range of applications
within the fields of insurance and finance. Accurate approximations are
obtained by developing upper and lower bounds in the convex order sense for
such present value functions. Finally, we consider several applications for
discounted claim processes under the Black & Scholes setting. In particular
we analyze in detail the cases when the random variables $X_i$ denote
insurance losses modelled by lognormal, normal (more generally: elliptical)
and gamma or inverse Gaussian (more generally: tempered stable)
distributions.
As we demonstrate by means of a series of numerical illustrations, the
methodology provides an excellent framework to get accurate and easily
obtainable approximations of distribution functions for random variables
of the form (1).
Chapters 3 and 4 apply the obtained results to two important reserving
problems in insurance business and illustrate them numerically.
In Chapter 3 we consider an important application in the life insurance
business. We aim to provide conservative estimates both for high
quantiles and stop-loss premiums for a single life annuity and for a whole
portfolio. We focus here only on life annuities; however, similar techniques
may be used to get analogous estimates for more general life contingencies.
Our solution makes it possible to solve with great accuracy personal
finance problems such as: How much does one need to invest now to ensure,
given a periodical (e.g. yearly) consumption pattern, that the probability
of outliving one's money is very small (e.g. less than 1%)?
The case of a portfolio of life annuity policies has been studied
extensively in the literature, but only in the limiting case of homogeneous
portfolios in which the mortality risk is fully diversified. However, the
applicability of these results in insurance practice may be questioned:
especially in the life annuity business, a typical portfolio does not
contain enough policies to speak of full diversification. For this reason
we propose to approximate the number of active policies in subsequent years
using a normal power distribution (by fitting the first three moments of
the corresponding binomial distributions) and to model the present value
of future benefits as a scalar product of mutually independent random
vectors.
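The idea of fitting three binomial moments can be sketched as follows. The example assumes that the number of policies still active out of n is binomial with survival probability q, and it uses the textbook normal power quantile formula x_p = mean + std * (z_p + skewness / 6 * (z_p^2 - 1)); the function names are hypothetical and the exact form used in Chapter 3 may differ.

```python
import math
from statistics import NormalDist


def binomial_moments(n, q):
    """Mean, standard deviation and skewness of a Binomial(n, q) count."""
    mean = n * q
    std = math.sqrt(n * q * (1.0 - q))
    skewness = (1.0 - 2.0 * q) / std
    return mean, std, skewness


def normal_power_quantile(mean, std, skewness, level):
    """Normal power approximation of the quantile at the given level.

    The normal quantile z is corrected by a skewness term; with zero
    skewness this reduces to the plain normal approximation.
    """
    z = NormalDist().inv_cdf(level)
    return mean + std * (z + skewness / 6.0 * (z * z - 1.0))
```

For survival probabilities close to one the binomial count is clearly left-skewed, which is exactly the feature a two-moment normal approximation would miss. The normal power formula is usually recommended only in the upper tail and for moderate skewness.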
Chapter 4 focuses on the claims reserving problem. To get the correct
picture of its liabilities, a company should set aside the correctly estimated
amount to meet claims arising in the future on the written policies. The
past data used to construct estimates for the future payments consist of a
triangle of incremental claims.
The purpose is to complete this run-off triangle to a square, and even
to a rectangle if estimates are required for development years for
which no data are recorded in the run-off triangle at hand. To this end, the
actuary can make use of a variety of techniques. The inherent uncertainty
is described by the distribution of possible outcomes, and one needs to
arrive at the best estimate of the reserve. In this chapter we look at the
discounted reserve and impose an explicit margin based on a risk measure
from the distribution of the total discounted reserve. We will model the
claim payments using lognormal linear, loglinear location-scale and gener-
alized linear models, and derive accurate comonotonic approximations for
the discounted loss reserve.
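To make the idea of completing a run-off triangle concrete, the sketch below uses the classical chain-ladder method. This is only a familiar illustration, not the approach of this thesis, which relies on regression models and discounting; the function name and the data layout (one list of incremental claims per accident year, earliest year first and fully developed) are assumptions.

```python
from itertools import accumulate


def chain_ladder(incremental):
    """Complete a run-off triangle of incremental claims to a square.

    Returns the development factors, the completed cumulative rows and
    the total undiscounted reserve (ultimate minus latest observed).
    """
    cumulative = [list(accumulate(row)) for row in incremental]
    n_dev = len(cumulative[0])
    factors = []
    for j in range(n_dev - 1):
        # Use only accident years for which development year j + 1 is observed.
        rows = [row for row in cumulative if len(row) > j + 1]
        factors.append(sum(r[j + 1] for r in rows) / sum(r[j] for r in rows))
    completed = []
    for row in cumulative:
        full = list(row)
        for j in range(len(row) - 1, n_dev - 1):
            # Project the next cumulative amount with the estimated factor.
            full.append(full[-1] * factors[j])
        completed.append(full)
    reserve = sum(full[-1] - row[-1] for full, row in zip(completed, cumulative))
    return factors, completed, reserve
```

For the small triangle [[100, 50, 25], [110, 55], [120]] the development factors are 1.5 and 7/6, and the reserve is 117.5. Such a point estimate says nothing about the distribution of the reserve, which is what the models in Chapter 4 aim to describe.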
The bootstrap technique has proved to be a very useful tool in many
statistical applications and can be particularly interesting to assess the
variability of the claim reserving predictions and to construct upper limits
at an adequate confidence level. Its popularity is due to a combination of
available computing power and theoretical development. One advantage of
the bootstrap is that the technique can be applied to any data set without
having to assume an underlying distribution. Moreover, most computer
packages can handle very large numbers of repeated samplings, and this
should not limit the accuracy of the bootstrap estimates.
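The central bootstrap idea, resampling the observed data with replacement and recomputing the quantity of interest, can be sketched generically. The example below is a plain nonparametric percentile bootstrap with hypothetical names; in claims reserving one would typically resample residuals of a fitted model rather than the raw observations, which is not shown here.

```python
import random
from statistics import fmean


def bootstrap_upper_limit(data, statistic, level=0.95,
                          n_resamples=2000, seed=0):
    """Percentile-bootstrap upper limit for statistic(data).

    Each replicate applies the statistic to a resample drawn with
    replacement from the observed data; no distribution is assumed.
    """
    rng = random.Random(seed)
    n = len(data)
    replicates = sorted(
        statistic([rng.choice(data) for _ in range(n)])
        for _ in range(n_resamples)
    )
    # Index of the empirical quantile at the requested confidence level.
    index = min(int(level * n_resamples), n_resamples - 1)
    return replicates[index]
```

A call such as bootstrap_upper_limit(claims, fmean) would give an upper limit for the mean claim size, where claims is a list of observed amounts supplied by the user.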
In the last chapter we derive, review and discuss some other methods
to obtain approximations for S. In the first section we recall two well-
known moment matching approximations: the lognormal and the recip-
rocal gamma approximation. Practitioners often use a moment matching
lognormal approximation for the distribution of S. The lognormal and
reciprocal gamma approximations are chosen such that their first two mo-
ments are equal to the corresponding moments of S.
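Matching two moments has a closed-form solution for both approximations. The sketch below assumes that the mean and variance of S are already available; the function names are hypothetical, and the reciprocal gamma distribution is parametrized so that 1/S follows a gamma distribution with the returned shape and rate, which is one of several conventions.

```python
import math


def lognormal_match(mean, variance):
    """Parameters (mu, sigma) of the lognormal with the given mean and variance."""
    sigma2 = math.log(1.0 + variance / mean ** 2)
    mu = math.log(mean) - 0.5 * sigma2
    return mu, math.sqrt(sigma2)


def reciprocal_gamma_match(mean, variance):
    """Shape and rate such that 1/S ~ Gamma(shape, rate) has the given moments of S.

    Uses E[S] = rate / (shape - 1) and
    Var[S] = rate^2 / ((shape - 1)^2 * (shape - 2)), valid for shape > 2.
    """
    shape = 2.0 + mean ** 2 / variance
    rate = mean * (shape - 1.0)
    return shape, rate
```

Both fits reproduce the first two moments exactly but differ in their tail behaviour, which is why the choice between them matters for upper quantiles.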
Although the comonotonic bounds in convex order have proven to be
good approximations in case the variance of the random sum is sufficiently
small, they perform much worse when the variance gets large. In actuarial
applications it is often merely the tail of the distribution function that is
of interest. Indeed, one may think of Value-at-Risk, Conditional Tail
Expectation or Expected Shortfall estimates. Therefore, approximations
for functionals of sums of dependent random variables may alternatively
be obtained through the use of asymptotic relations. Although asymptotic
results are valid at infinity, they may also serve as approximations far
out in the tail. We establish some asymptotic results for the tail
probability of a sum of heavy-tailed dependent random variables. In
particular, we derive an asymptotic result for the randomly weighted sum
of a sequence of
non-negative numbers. Furthermore, we establish under two different sets
of conditions, an asymptotic result for the randomly weighted sum of a
sequence of independent random variables that consist of a random and a
deterministic component. Throughout, the random weights are products of
i.i.d. random variables and thus exhibit an explicit dependence structure.
Since the early 1990s, statistics has seen an explosion in applied Bayesian
research. This explosion has had little to do with a warming of the statistics
and econometrics communities to the theoretical foundation of Bayesian-
ism, or to a sudden awakening to the merits of the Bayesian approach
over frequentist methods, but instead can be primarily explained on prag-
matic grounds. Bayesian inference is the process of fitting a probability
model to a set of data and summarizing the result by a probability distri-
bution on the parameters of the model and on unobserved quantities such
as predictions for new observations. Simple simulation methods exist to
draw samples from posterior and predictive distributions, automatically
incorporating uncertainty in the model parameters. An advantage of the
Bayesian approach is that we can compute, using simulation, the posterior
predictive distribution for any data summary, so we do not need to put a
lot of effort into estimating the sampling distribution of test statistics. The
development of powerful computational tools (and the realization that ex-
isting statistical tools could prove quite useful for fitting Bayesian models)
has drawn a number of researchers to use the Bayesian approach in prac-
tice. Indeed, the use of such tools often enables researchers to estimate
complicated statistical models that would be quite difficult, if not virtu-
ally impossible, using standard frequentist techniques. The purpose of this
third section is to sketch, in very broad terms, basic elements of Bayesian
computation.
Finally, we compare these approximations with the comonotonic ap-
proximations of the previous chapter in the context of claims reserving. In
case the underlying variance of the statistical and financial part of the dis-
counted IBNR reserve gets large, the comonotonic approximations perform
worse. We will illustrate this observation by means of a simple example
and propose to solve this problem using the derived asymptotic results for
the tail probability of a sum of dependent random variables, in the presence
of heavy-tailedness conditions. These approximations are compared with
the lognormal moment matching approximations. We finally consider the
distribution of the discounted loss reserve when the data in the run-off tri-
angle is modelled by a generalized linear model and compare the outcomes
of the Bayesian approach with the comonotonic approximations.
Publications
• Ahcan A., Darkiewicz G., Hoedemakers T., Dhaene J. and Goovaerts
M.J. (2004), “Optimal portfolio selection: Applications in insurance
business”, Proceedings of the 8th International Congress on Insur-
ance: Mathematics & Economics, June 14-16, Rome, pp. 40.
• Ahcan A., Darkiewicz G., Goovaerts M.J. and Hoedemakers T. (2005),
“Computation of convex bounds for present value functions of ran-
dom payments”, Journal of Computational and Applied Mathema-
tics, to be published.
• Antonio K., Goovaerts M.J. and Hoedemakers T. (2004), “On the
distribution of discounted loss reserves”, Medium Econometrische
Toepassingen, vol. 12, no. 2, pp. 14-18.
• Antonio K., Beirlant J. and Hoedemakers T. (2005), Discussion of
“A Bayesian generalized linear model for the Bornhuetter-Ferguson
method of claims reserving” by Richard Verrall, North American
Actuarial Journal, to be published.
• Antonio K., Beirlant J., Hoedemakers T. and Verlaak R. (2005),
“On the use of general linear mixed models in loss reserving”, North
American Actuarial Journal, submitted.
• Darkiewicz G. and Hoedemakers T. (2005), “How the co-integration
analysis can help in mortality forecasting”, British Actuarial Journal,
submitted.
• Hoedemakers T., Beirlant J., Goovaerts M.J. and Dhaene J. (2003),
“Confidence bounds for discounted loss reserves”, Insurance: Mathe-
matics & Economics, vol. 33, no. 2, pp. 297-316.
• Hoedemakers T. and Goovaerts M.J. (2004), Discussion of “Risk and
discounted loss reserves” by Greg Taylor, North American Actuarial
Journal, vol. 8, no. 4, pp. 146-150.
• Hoedemakers T., Beirlant J., Goovaerts M.J. and Dhaene J. (2005),
“On the distribution of discounted loss reserves using generalized
linear models”, Scandinavian Actuarial Journal, vol. 2005, no. 1, pp.
25-45.
• Hoedemakers T., Darkiewicz G. and Goovaerts M.J. (2005), “Ap-
proximations for life annuity contracts in a stochastic financial envi-
ronment”, Insurance: Mathematics & Economics, to be published.
• Hoedemakers T., Darkiewicz G., Deelstra G., Dhaene J. and Van-
maele M. (2005), “Bounds for stop-loss premiums of stochastic sums
(with applications to life contingencies)”, Scandinavian Actuarial
Journal, submitted.
• Hoedemakers T., Goovaerts M.J. and Dhaene J. (2003), “IBNR pro-
blematiek in historisch perspectief”, De Actuaris, vol. 11, no. 2, pp.
27-29.
• Hoedemakers T., Goovaerts M.J. and Dhaene J. (2004), “De IBNR-
discussie”, De Actuaris, vol. 11, no. 4, pp. 26-29.
• Laeven R.J.A, Goovaerts M.J. and Hoedemakers T. (2005), “Some
asymptotic results for sums of dependent random variables with ac-
tuarial applications”, Insurance: Mathematics & Economics, to be
published.
• Vanduffel S., Hoedemakers T. and Dhaene J. (2005), “Comparing
approximations for risk measures of sums of non-independent log-
normal random variables”, North American Actuarial Journal, to be
published.
List of abbreviations and
symbols
Abbreviation Explanation
or symbol
ARMA(p, q) AutoRegressive-Moving Average Process of
order (p, q)
cdf cumulative distribution function
c.f. characteristic function
CLT Central Limit Theorem
Corr(X,Y ) = r(X,Y ) Pearson’s correlation coefficient between
the r.v.’s X and Y
Cov[X,Y ] covariance between the r.v.’s X and Y
D class of dominatedly varying functions
d.f. distribution function
E exponential r.v.
$E_n(\vec{\mu}, \Sigma, \phi)$ n-dimensional elliptical distribution with
parameters $\vec{\mu}$, $\Sigma$ and $\phi$
$F$ d.f. and distribution of a r.v.
$\bar{F}$ tail of the d.f. $F$: $\bar{F} = 1 - F$
$F^{*n}$ n-fold convolution of the d.f. or distribution $F$
$\Gamma(x)$ gamma function: $\Gamma(x) = \int_0^\infty t^{x-1} e^{-t}\,dt$, $x > 0$
Gamma(a, b) gamma distribution with parameters a and b:
$f(x) = b^a (\Gamma(a))^{-1} x^{a-1} e^{-bx}$, $x \ge 0$
I(a, x) incomplete gamma function:
$\Gamma_a(x) = (\Gamma(a))^{-1} \int_x^\infty e^{-t} t^{a-1}\,dt$, $x \ge 0$
GPD Generalized Pareto Distribution
I(.) indicator function: I(c) = 1 if the condition c is true
and I(c) = 0 if it is not
i.i.d. independent, identically distributed
L class of long-tailed distributions
LLN Law of Large Numbers
logN(µ, σ2) lognormal distribution with parameters µ and σ2:
$f(x) = \frac{1}{x\sigma\sqrt{2\pi}}\, e^{-\frac{(\log x - \mu)^2}{2\sigma^2}}$, $x > 0$
MLE Maximum Likelihood Estimator
N(µ, σ2),N(µ,Σ) Gaussian (normal) distribution with mean µ and
variance σ2 or covariance matrix Σ
N(0, 1) standard normal distribution
o(1) $a(x) = o(b(x))$ as $x \to x_0$ means that
$\lim_{x \to x_0} a(x)/b(x) = 0$
O(1) $a(x) = O(b(x))$ as $x \to x_0$ means that
$\lim_{x \to x_0} |a(x)/b(x)| < \infty$
$\varphi_X(t)$ c.f. of the r.v. X: $\varphi_X(t) = E[e^{itX}]$
Φ(.) the cdf of the standard normal r.v.
$\limsup_{n \to \infty}(x_n)$ limit superior of the bounded sequence $\{x_n\}$:
$:= \lim(s_n)$, where $s_n = \sup_{k \ge n} x_k = \sup\{x_n, x_{n+1}, \ldots\}$
$\liminf_{n \to \infty}(x_n)$ limit inferior of the bounded sequence $\{x_n\}$:
$:= \lim(t_n)$, where $t_n = \inf_{k \ge n} x_k = \inf\{x_n, x_{n+1}, \ldots\}$
p.d.f probability density function
Pr[.] probability measure
p(.|.) conditional probability density
p(.) marginal distribution
R class of the d.f.’s with regularly varying right tail
Rα class of the regularly varying functions with index α
R−∞ class of the rapidly varying functions
r.v. random variable
S class of the subexponential distributions
$\sigma_X^2$ variance of the r.v. X
$\sigma_{X_i X_j}$ Cov[Xi, Xj]
sign(a) sign of the real number a
T S(δ, a, b) tempered stable law with parameters δ, a and b
U(a, b) uniform random variable on (a, b)
UMVUE Uniformly Minimum Variance Unbiased Estimator
Var[X] variance of the r.v. X
∼ a(x) ∼ b(x) as x→ x0 means that limx→x0 a(x)/b(x) = 1
a(x) ∼ 0 means a(x) = o(1)
≈ a(x) ≈ b(x) as x→ x0 means that a(x) is approximately
(roughly) of the same order as b(x) as x→ x0.
It is only used in a heuristic sense.
≍ $a(x) \asymp b(x)$ as $x \to x_0$ means that $0 < \liminf_{x \to x_0} a(x)/b(x)$
$\le \limsup_{x \to x_0} a(x)/b(x) < \infty$
$\xrightarrow{d}$ convergence in distribution
$\overset{d}{=}$ equal in distribution
⌊.⌋ floor function: ⌊x⌋ is the largest integer less than or
equal to x
⌈.⌉ ceiling function: ⌈x⌉ is the smallest integer greater than or
equal to x
(x− d)+ max(x− d, 0)
=: or := notation
Chapter 1
Risk and comonotonicity in
the actuarial world
Summary In order to make decisions one has to evaluate the (distri-
bution function of the) multivariate risk (or random variable) one faces.
In this chapter we recall the basics of actuarial risk theory. We define
some frequently used measures of dependence and the most important or-
derings of risks for actuarial applications. We further introduce several
well-known risk measures and the relations that hold between them. We
summarize properties of these risk measures that can be used to facilitate
decision-taking. Finally, we provide theoretical background for the con-
cept of comonotonicity and we review the most important properties of
comonotonic risks.
1.1 Fundamental concepts in actuarial risk theory
In this section we briefly recall the most important concepts in actuar-
ial risk theory. The study of dependence has become of major concern
in actuarial research. We start by defining three important measures of
dependence: Pearson’s correlation coefficient, Kendall’s τ and Spearman’s
ρ. Once dependence measures are defined, one could use them to compare
the strength of dependence between random variables.
The determination of capital requirements for an insurance company
is a complex and non-trivial task. From their nature, capital requirements
are numeric values expressed in monetary units and based on quantifiable
measures of risks. Formally a risk measure is defined as a mapping from
the set of risks at hand to the real numbers. In other words, with any
potential loss X one associates a real number ρ[X]. Thus a risk measure
summarizes the riskiness of the underlying distribution in one single num-
ber. Usually such quantification serves as a risk management tool (e.g. an
insurance premium or an economic capital), but it can be also helpful in
overall decision making. We review and place the four popular risk mea-
sures (Value-at-Risk, Tail Value-at-Risk, Conditional Tail Expectation and
Expected Shortfall) in their context.
In the actuarial literature, orderings of risks are an important tool for
comparing the attractiveness of different risks. The essential tool for the
comparison of different concepts of orderings of risks will be the stop-loss
transform/premium and its properties. In the actuarial literature it is a
common feature to replace a risk by a “less favorable” risk that has a
simpler structure, making it easier to determine the distribution function.
We clarify what we mean with a “less favorable” risk and define the three
most important orderings of risks for actuarial applications: stochastic
dominance, stop-loss order and convex order.
This chapter is essentially based on Dhaene, Denuit, Goovaerts, Kaas &
Vyncke (2002a) and Dhaene, Vanduffel, Tang, Goovaerts, Kaas & Vyncke
(2004).
1.1.1 Dependent risks
In risk theory, all the random variables are traditionally assumed to be
mutually independent. It is clear that this assumption is made for mathe-
matical convenience. In some situations however, insured risks tend to act
similarly. The independence assumption is then violated and is not an ade-
quate way to describe the relations between the different random variables
involved. The individual risks of an earthquake or flooding risk portfolio
which are located in the same geographic area are correlated, since indi-
vidual claims are contingent on the occurrence and severity of the same
earthquake or flood. On a foggy day all cars of a region have higher prob-
ability to be involved in an accident. During dry hot summers, all wooden
cottages are more exposed to fire. More generally, one can say that if the
density of insured risks in a certain area or organization is high enough,
then catastrophes such as storms, explosions, earthquakes, epidemics and
so on can cause an accumulation of claims for the insurer. In life insurance,
there is ample evidence that the lifetimes of husbands and their wives are
positively associated. There may be certain selection mechanisms in the
matching of couples (“birds of a feather flock together”): both partners
often belong to the same social class and have the same life style. Fur-
ther, it is known that the mortality rate increases after the passing away
of one’s spouse (the “broken heart syndrome”). These phenomena have
implications on the valuation of aggregate claims in life insurance portfo-
lios. Another example in a life insurance context is a pension fund that
covers the pensions of persons working for the same company. These per-
sons work at the same location, they take the same flights. It is evident
that the mortality of these persons will be dependent, at least to a certain
extent.
The study of dependence has become of major concern in actuarial
research. There are a variety of ways to measure dependence.
First, Pearson's product-moment correlation coefficient captures the
linear dependence between couples of random variables. For a random
couple (X1, X2) having marginals with finite variances, Pearson's
product-moment correlation coefficient is defined by
\[ \mathrm{Corr}(X_1, X_2) = \frac{\mathrm{Cov}[X_1, X_2]}{\sqrt{\mathrm{Var}[X_1]\,\mathrm{Var}[X_2]}}. \]
Pearson’s correlation coefficient contains information on both the strength
and direction of a linear relationship between two random variables. If one
variable is an exact linear function of the other variable, a positive relation-
ship leads to correlation coefficient 1, while a negative relationship leads
to correlation coefficient −1. If there is no linear predictability between
the two variables, the correlation is 0.
Kendall’s τ is a nonparametric measure of association based on the
probabilities of concordances and discordances in paired observations. Con-
cordance occurs when paired observations vary together, and discordance
occurs when paired observations vary differently. Specifically, Kendall’s τ
for a random couple (X1, X2) of random variables with continuous cdf’s is
defined as
\[
\begin{aligned}
\tau(X_1, X_2) &= \Pr[(X_1 - X_1')(X_2 - X_2') > 0] - \Pr[(X_1 - X_1')(X_2 - X_2') < 0] \\
&= 2\Pr[(X_1 - X_1')(X_2 - X_2') > 0] - 1,
\end{aligned}
\]
where $(X_1', X_2')$ is an independent copy of $(X_1, X_2)$.
Contrary to Pearson’s r, Kendall’s τ is invariant under strictly mono-
tone transformations, that is, if φ1 and φ2 are strictly increasing (or de-
creasing) functions on the supports of X1 and X2, respectively, then
$\tau(\phi_1(X_1), \phi_2(X_2)) = \tau(X_1, X_2)$, provided the cdf's of X1 and X2 are
continuous. Further, (X1, X2) are perfectly dependent if and only if
$|\tau(X_1, X_2)| = 1$.
Another very useful dependence measure is Spearman’s ρ. The idea
behind this dependence measure is very simple. Given random variables X1
and X2 with continuous cdf’s FX1 and FX2 , we first create U1 = FX1(X1)
and U2 = FX2(X2), which are uniformly distributed over [0, 1] and then
use Pearson’s r. Spearman’s ρ is thus defined as ρ(X1, X2) = r(U1, U2).
Dependence measures can be used to compare the strength of depen-
dence between random variables.
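The three dependence measures can be compared on simulated data. The following is a minimal sketch using sample analogues of the definitions above (ranks stand in for the transforms U1 = FX1(X1) and U2 = FX2(X2)); the helper names, the simulated data and the chosen transformations are illustrative assumptions, not part of the original text. It illustrates that Kendall's τ and Spearman's ρ are unchanged by strictly increasing transformations, whereas Pearson's r is not.

```python
import math
import random

def pearson(x, y):
    """Sample version of Pearson's correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def kendall_tau(x, y):
    """(concordant pairs - discordant pairs) / number of pairs; assumes no ties."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)
    return s / (n * (n - 1) / 2)

def ranks(v):
    """Ranks 1..n of the observations; assumes no ties."""
    order = sorted(range(len(v)), key=v.__getitem__)
    r = [0] * len(v)
    for rank, idx in enumerate(order, start=1):
        r[idx] = rank
    return r

def spearman_rho(x, y):
    """Pearson's r applied to the ranks (empirical analogue of r(U1, U2))."""
    return pearson(ranks(x), ranks(y))

# Illustrative data: a positively dependent Gaussian couple (assumed example).
rng = random.Random(0)
x1 = [rng.gauss(0, 1) for _ in range(300)]
x2 = [0.6 * a + 0.8 * rng.gauss(0, 1) for a in x1]

# Strictly increasing transformations of both components.
t1 = [math.exp(a) for a in x1]
t2 = [b ** 3 for b in x2]

print("Pearson :", pearson(x1, x2), "->", pearson(t1, t2))
print("Kendall :", kendall_tau(x1, x2), "->", kendall_tau(t1, t2))
print("Spearman:", spearman_rho(x1, x2), "->", spearman_rho(t1, t2))
```

The rank-based measures depend only on the ordering of the observations, which is why they are unaffected by the transformations.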
1.1.2 Risk measures
Measuring risk and measuring preferences are not the same. When ordering
preferences, activities (for example, alternatives A and B with financial
consequences XA and XB) are compared in order of preference
under conditions of risk. A preference order A ≻ B means that A is
preferred to B. This order is represented by a preference function Ψ
with A ≻ B ⇔ Ψ[XA] > Ψ[XB]. In contrast, a risk order A ≻R B
means that A is riskier than B and is represented by a function ρ with
A ≻R B ⇔ ρ[XA] > ρ[XB]. Every such function ρ is called a risk measure.
Models in actuarial science are used both for quantifying risks and
for pricing risks. Quantifying risk requires a risk measure to convert a
random future gain or loss into a certainty equivalent that can then be
used to order different risks and for decision-making purposes. In order
to quantify risk, it is necessary to specify the probability distributions of
the risks involved and to apply a preference function to these probability
distributions. Thus, this process involves both statistical assumptions and
economic assumptions. Individuals are assumed to be risk averse and to
have a preference to diversify risks.
Banks and regulatory agencies use monetary measures of risk to assess
the risk taken by financial investors; important examples are given by the
so-called Value-at-Risk and Tail Value-at-Risk.
Two-sided risk measures measure the magnitude of the distance (in
both directions) from X to E[X]. Different functions of distance lead to
different risk measures. Looking, for instance, at quadratic deviations,
this leads to the risk measure variance or to the risk measure standard
deviation. These risk measures have been the traditional measures in eco-
nomics and finance since the pioneering work of Markowitz. They exhibit
a number of nice technical properties. For instance, the variance of a port-
folio return is the sum of the variances and covariances of the individual
returns. Furthermore, the variance is used as a standard optimization
function (quadratic optimization).
On the other hand, a two-sided risk measure contradicts the intuitive
notion of risk that only negative deviations are dangerous. In addition
variance does not account for fat tails of the underlying distribution and
for the corresponding tail risk. For this reason, people include higher
(normalized) central moments, as for example, skewness and kurtosis, into
the analysis to assess risk more properly.
Perhaps the most popular risk measure is the Value-at-Risk (VaR). Let
L be the potential loss of a financial position. The VaR at confidence level
p (0 < p < 1) is then defined by the requirement
\[ \Pr\big[L > \mathrm{VaR}_p[L]\big] = 1 - p. \tag{1.1} \]
An intuitive interpretation of the VaR is that of a probable maximum loss
or, more concretely, a 100 × p% maximal loss, because $\Pr[L \le \mathrm{VaR}_p[L]] = p$,
which means that in 100 × p% of the cases, the loss is smaller than or equal to
VaRp[L]. Interpreting the VaR as necessary underlying capital to bear risk,
relation (1.1) implies that this capital will, on average, not be exhausted in
100 × p% of the cases. Obviously, the VaR is identical to the p-quantile of
the loss distribution, that is $\mathrm{VaR}_p[L] = F_L^{-1}(p)$. It is important to remark
that the VaR does not take into account the severity of potential losses
in the 100 × (1 − p)% worst cases. A regulator for instance is not only
concerned with the frequency of default, but also about the severity of
default. Also shareholders and management should be concerned with the
question “how bad is bad?” when they want to evaluate the risks at hand
in a consistent way. Therefore, one often uses another risk measure which
is called the Tail Value-at-Risk (TVaR) and defined by
\[ \mathrm{TVaR}_p[L] = \frac{1}{1-p} \int_p^1 \mathrm{VaR}_q[L]\,dq, \qquad p \in (0, 1). \]
It is the arithmetic average of the quantiles of L, from p on. Since the
quantiles are non-decreasing in the confidence level, the TVaR is never
smaller than the corresponding VaR.
We will define the other popular risk measures in terms of L for a
better comparison to the VaR. The Conditional Tail Expectation (CTE)
at confidence level p is defined by
\[ \mathrm{CTE}_p[L] = E\big[L \mid L > \mathrm{VaR}_p[L]\big], \qquad p \in (0, 1). \]
On the basis of the interpretation of the VaR as a 100 × p%-maximum
loss, the CTE can be interpreted as the average maximal loss in the worst
100 × (1 − p)% cases. Notice that in case of continuous distributions the
CTE and TVaR coincide.
Measures of shortfall risk are one-sided risk measures that measure
the shortfall risk relative to a target variable. This may be the expected
value, but in general, it is an arbitrary deterministic target or a stochastic
benchmark. The Expected Shortfall (ESF) at confidence level p is defined
as
ESFp[L] = E[max(L− VaRp[L], 0)], p ∈ (0, 1).
The following relations hold between the four risk measures defined above.
Theorem 1 (Relation between VaR, TVaR, CTE and ESF).
For p ∈ (0, 1), we have that
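\[
\begin{aligned}
\mathrm{TVaR}_p[X] &= \mathrm{VaR}_p[X] + \frac{1}{1-p}\,\mathrm{ESF}_p[X], \\
\mathrm{CTE}_p[X] &= \mathrm{VaR}_p[X] + \frac{1}{1 - F_X(\mathrm{VaR}_p[X])}\,\mathrm{ESF}_p[X], \\
\mathrm{CTE}_p[X] &= \mathrm{TVaR}_{F_X(\mathrm{VaR}_p[X])}[X].
\end{aligned}
\]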
Proof. See Dhaene et al. (2004).
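The first two relations of Theorem 1 can be checked numerically on the empirical distribution of a simulated sample. The sketch below is only an illustration: the function names, the lognormal test sample and the confidence level are assumptions made for the example. The VaR is computed as the left-continuous inverse of the empirical cdf, and the TVaR by integrating the (piecewise constant) empirical quantile function exactly.

```python
import math
import random

def value_at_risk(sample, p):
    """Empirical p-quantile: inf{x : F_n(x) >= p}."""
    xs = sorted(sample)
    k = max(1, math.ceil(len(xs) * p - 1e-12))  # small guard against rounding
    return xs[k - 1]

def expected_shortfall(sample, p):
    """ESF_p = E[max(L - VaR_p, 0)] under the empirical distribution."""
    v = value_at_risk(sample, p)
    return sum(max(x - v, 0.0) for x in sample) / len(sample)

def conditional_tail_expectation(sample, p):
    """CTE_p = E[L | L > VaR_p] under the empirical distribution."""
    v = value_at_risk(sample, p)
    tail = [x for x in sample if x > v]
    return sum(tail) / len(tail)

def tail_value_at_risk(sample, p):
    """TVaR_p = (1/(1-p)) * integral of VaR_q over q in (p, 1)."""
    xs = sorted(sample)
    n = len(xs)
    total = 0.0
    for k in range(1, n + 1):
        # The empirical quantile equals xs[k-1] for q in ((k-1)/n, k/n].
        width = k / n - max((k - 1) / n, p)
        if width > 0:
            total += xs[k - 1] * width
    return total / (1 - p)

rng = random.Random(42)
losses = [rng.lognormvariate(0.0, 1.0) for _ in range(1000)]  # assumed test data
p = 0.95

var_p = value_at_risk(losses, p)
esf_p = expected_shortfall(losses, p)
tvar_p = tail_value_at_risk(losses, p)
cte_p = conditional_tail_expectation(losses, p)
cdf_at_var = sum(x <= var_p for x in losses) / len(losses)

print("TVaR:", tvar_p, " VaR + ESF/(1-p):", var_p + esf_p / (1 - p))
print("CTE :", cte_p, " VaR + ESF/(1-F(VaR)):", var_p + esf_p / (1 - cdf_at_var))
```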
Researchers have long sought a set of properties (axioms) that any
risk measure should satisfy. Recently the class of coherent risk measures,
introduced in Artzner (1999) and Artzner et al. (1999), has drawn a lot
of attention in the actuarial literature. The authors postulated that every
‘coherent’ risk measure should satisfy the following four properties:
1. monotonicity, i.e. X ≤ Y ⇒ ρ[X] ≤ ρ[Y ];
2. subadditivity, i.e. ρ[X + Y ] ≤ ρ[X] + ρ[Y ];
3. translation invariance, i.e. ρ[X + c] = ρ[X] + c ∀ c ∈ R;
4. positive homogeneity, i.e. ρ[aX] = aρ[X] ∀ a ≥ 0.
It can be demonstrated that the Value-at-Risk and the Expected Short-
fall are in general not subadditive. On the other hand, the TVaR is subad-
ditive. The desirability of the subadditivity property of risk measures has
been a major topic for research and discussion. Some researchers believe
that the axiom of subadditivity of risk measures used to determine the
solvency capital, reflects the risk diversification. However other authors
argue that the diversification benefits should be considered rather in terms
of subadditivity of the corresponding shortfalls.
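A small discrete example makes the failure of subadditivity for the VaR concrete. The following sketch is an illustration under assumed numbers that do not appear in the original text: two independent losses, each equal to 100 with probability 0.04 and 0 otherwise, evaluated at p = 0.95. Exact rational arithmetic is used so that the comparisons are not affected by rounding.

```python
from fractions import Fraction
from itertools import product

def value_at_risk(dist, p):
    """dist maps loss value -> probability. Returns inf{x : F(x) >= p}."""
    cum = Fraction(0)
    for x in sorted(dist):
        cum += dist[x]
        if cum >= p:
            return x
    raise ValueError("probabilities do not sum to one")

def tail_value_at_risk(dist, p):
    """(1/(1-p)) * integral of VaR_q over q in (p, 1), exact for discrete losses."""
    cum, integral = Fraction(0), Fraction(0)
    for x in sorted(dist):
        lower, cum = cum, cum + dist[x]
        width = cum - max(lower, p)
        if width > 0:
            integral += x * width
    return integral / (1 - p)

def convolve(d1, d2):
    """Distribution of the sum of two independent discrete losses."""
    out = {}
    for (x, px), (y, py) in product(d1.items(), d2.items()):
        out[x + y] = out.get(x + y, Fraction(0)) + px * py
    return out

loss = {0: Fraction(24, 25), 100: Fraction(1, 25)}  # hypothetical single risk
pooled = convolve(loss, loss)                        # sum of two independent copies
p = Fraction(19, 20)

print("VaR  single, pooled:", value_at_risk(loss, p), value_at_risk(pooled, p))
print("TVaR single, pooled:", tail_value_at_risk(loss, p), tail_value_at_risk(pooled, p))
```

In this example the VaR of the pooled position (100) exceeds the sum of the individual VaRs (0 + 0), while the TVaR of the pooled position stays below the sum of the individual TVaRs.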
It is an open question whether the coherent set of axioms is indeed the
‘best one’. For a relevant discussion we refer to e.g. Dhaene et al. (2003),
Goovaerts et al. (2003, 2004) and Darkiewicz et al. (2005a). It should
be noted that in spite of the disagreement in the scientific community
about the axioms of coherency, a lot of well-known risk measures satisfy
conditions (1)-(4) (e.g. the TVaR).
The expressions for the discussed risk measures of normal and lognor-
mal losses are given in the next two examples, which will be used in the
remainder of this thesis. For a proof of these examples, we refer to Dhaene
et al. (2004).
Example 1 (normal losses).
Consider a random variable X ∼ N(µ, σ2). The VaR, ESF and CTE at
confidence level p (p ∈ (0, 1)) of X are given by
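\[ \mathrm{VaR}_p[X] = \mu + \sigma\,\Phi^{-1}(p), \tag{1.2} \]
\[ \mathrm{ESF}_p[X] = \sigma\,\phi\big(\Phi^{-1}(p)\big) - \sigma\,\Phi^{-1}(p)\,(1 - p), \tag{1.3} \]
\[ \mathrm{CTE}_p[X] = \mu + \sigma\,\frac{\phi\big(\Phi^{-1}(p)\big)}{1 - p}, \tag{1.4} \]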
where φ(x) = Φ′(x) denotes the density function of the standard normal
distribution.
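Formulas (1.2)-(1.4) translate directly into code. The sketch below uses `statistics.NormalDist` from the Python standard library for Φ, Φ⁻¹ and φ; the function names and the parameter values are illustrative assumptions. It also checks the first relation of Theorem 1, using the fact that the CTE and the TVaR coincide for continuous distributions.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal: cdf, pdf and inv_cdf

def normal_var(mu, sigma, p):
    """Formula (1.2)."""
    return mu + sigma * STD.inv_cdf(p)

def normal_esf(mu, sigma, p):
    """Formula (1.3)."""
    z = STD.inv_cdf(p)
    return sigma * STD.pdf(z) - sigma * z * (1 - p)

def normal_cte(mu, sigma, p):
    """Formula (1.4)."""
    z = STD.inv_cdf(p)
    return mu + sigma * STD.pdf(z) / (1 - p)

mu, sigma, p = 10.0, 2.0, 0.99  # assumed example values
print("VaR:", normal_var(mu, sigma, p))
print("ESF:", normal_esf(mu, sigma, p))
print("CTE:", normal_cte(mu, sigma, p))
```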
Example 2 (lognormal losses).
Consider a random variable X ∼ logN(µ, σ2). The VaR, ESF and CTE at
confidence level p (p ∈ (0, 1)) of X are given by
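\[ \mathrm{VaR}_p[X] = e^{\mu + \sigma\Phi^{-1}(p)}, \tag{1.5} \]
\[ \mathrm{ESF}_p[X] = e^{\mu + \sigma^2/2}\,\Phi\big(\sigma - \Phi^{-1}(p)\big) - e^{\mu + \sigma\Phi^{-1}(p)}\,(1 - p), \tag{1.6} \]
\[ \mathrm{CTE}_p[X] = e^{\mu + \sigma^2/2}\,\frac{\Phi\big(\sigma - \Phi^{-1}(p)\big)}{1 - p}. \tag{1.7} \]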
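The lognormal formulas (1.5)-(1.7) can be sketched in the same way. As a plausibility check, the code below also approximates the TVaR directly from its definition, by a midpoint-rule integration of the quantiles VaRq over (p, 1), and compares it with (1.7). The function names, the number of integration steps and the parameter values are assumptions made for this illustration.

```python
import math
from statistics import NormalDist

STD = NormalDist()

def lognormal_var(mu, sigma, p):
    """Formula (1.5)."""
    return math.exp(mu + sigma * STD.inv_cdf(p))

def lognormal_esf(mu, sigma, p):
    """Formula (1.6)."""
    z = STD.inv_cdf(p)
    return (math.exp(mu + sigma ** 2 / 2) * STD.cdf(sigma - z)
            - math.exp(mu + sigma * z) * (1 - p))

def lognormal_cte(mu, sigma, p):
    """Formula (1.7)."""
    z = STD.inv_cdf(p)
    return math.exp(mu + sigma ** 2 / 2) * STD.cdf(sigma - z) / (1 - p)

def tvar_by_quadrature(mu, sigma, p, steps=20000):
    """Midpoint-rule approximation of (1/(1-p)) * integral_p^1 VaR_q dq."""
    h = (1 - p) / steps
    total = sum(lognormal_var(mu, sigma, p + (i + 0.5) * h) for i in range(steps))
    return total * h / (1 - p)

mu, sigma, p = 0.0, 0.5, 0.95  # assumed example values
print("CTE (1.7)        :", lognormal_cte(mu, sigma, p))
print("TVaR (quadrature):", tvar_by_quadrature(mu, sigma, p))
print("VaR + ESF/(1-p)  :", lognormal_var(mu, sigma, p) + lognormal_esf(mu, sigma, p) / (1 - p))
```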
We end this section with a note about inverse distribution functions.
Inverse distribution functions
The cdf FX(x) = Pr[X ≤ x] of a random variable X is a right continuous
non-decreasing function with
\[ F_X(-\infty) = \lim_{x \to -\infty} F_X(x) = 0, \qquad F_X(+\infty) = \lim_{x \to +\infty} F_X(x) = 1. \]
The classical definition of the inverse of a distribution function is the
non-decreasing and left-continuous function $F_X^{-1}(p)$ defined by
\[ F_X^{-1}(p) = \inf\{x \in \mathbb{R} \mid F_X(x) \ge p\}, \qquad p \in [0, 1], \]
with inf ∅ = +∞ by convention. For all x ∈ R and p ∈ [0, 1], we have
\[ F_X^{-1}(p) \le x \Leftrightarrow p \le F_X(x). \tag{1.8} \]
In this thesis we will use a more sophisticated definition for inverses of
distribution functions. For any real p ∈ [0, 1], a possible choice for the
inverse of FX in p is any point in the closed interval
\[ \big[\inf\{x \in \mathbb{R} \mid F_X(x) \ge p\},\ \sup\{x \in \mathbb{R} \mid F_X(x) \le p\}\big], \]
where, as before, inf ∅ = +∞, and also sup ∅ = −∞. Taking the left hand
border of this interval to be the value of the inverse cdf at p, we get $F_X^{-1}(p)$.
Similarly, we define $F_X^{-1+}(p)$ as the right hand border of the interval:
\[ F_X^{-1+}(p) = \sup\{x \in \mathbb{R} \mid F_X(x) \le p\}, \qquad p \in [0, 1], \]
which is a non-decreasing and right-continuous function. Note that $F_X^{-1}(0) = -\infty$,
$F_X^{-1+}(1) = +\infty$ and that all the probability mass of X is contained
in the interval $\big[F_X^{-1+}(0), F_X^{-1}(1)\big]$. Also note that $F_X^{-1}(p)$ and $F_X^{-1+}(p)$ are
finite for all p ∈ (0, 1). In the sequel we will always use p as a value ranging
over the open interval (0, 1), unless stated otherwise.
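The two inverses differ only at levels p where the cdf has a horizontal segment. The following sketch illustrates this for a small discrete distribution; the atoms, the probabilities and the function names are assumptions chosen for the example, and exact fractions are used to avoid rounding issues at the jump heights.

```python
from fractions import Fraction
from itertools import accumulate

# Hypothetical discrete risk: Pr[X=0] = 1/2, Pr[X=1] = 3/10, Pr[X=2] = 1/5.
atoms = [0, 1, 2]
probs = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]
cdf_at_atoms = list(accumulate(probs))  # F(0), F(1), F(2)

def left_inverse(p):
    """F^{-1}(p) = inf{x : F(x) >= p}, for 0 < p <= 1."""
    return next(x for x, c in zip(atoms, cdf_at_atoms) if c >= p)

def right_inverse(p):
    """F^{-1+}(p) = sup{x : F(x) <= p}, for 0 < p < 1."""
    return next(x for x, c in zip(atoms, cdf_at_atoms) if c > p)

for p in (Fraction(1, 2), Fraction(3, 5), Fraction(4, 5)):
    print(p, left_inverse(p), right_inverse(p))
```

At p = 1/2 the cdf is flat on [0, 1), so any point of the closed interval [0, 1] is a possible inverse; at p = 3/5 the two inverses coincide.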
In the following lemma, we state the relation between the inverse dis-
tribution functions of the random variables X and g(X) for a monotone
function g.
Lemma 1 (Inverse distribution function of g(X)).
Let X and g(X) be real-valued random variables and 0 < p < 1.
(a) If g is non-decreasing and left-continuous, then
\[ F_{g(X)}^{-1}(p) = g\big(F_X^{-1}(p)\big). \]
(b) If g is non-decreasing and right-continuous, then
\[ F_{g(X)}^{-1+}(p) = g\big(F_X^{-1+}(p)\big). \]
(c) If g is non-increasing and left-continuous, then
\[ F_{g(X)}^{-1+}(p) = g\big(F_X^{-1}(1 - p)\big). \]
(d) If g is non-increasing and right-continuous, then
\[ F_{g(X)}^{-1}(p) = g\big(F_X^{-1+}(1 - p)\big). \]
Proof. See Dhaene et al. (2002a).
Hereafter, we will reserve the notation U and V for U(0, 1) random variables,
i.e. $F_U(p) = p$ and $F_U^{-1}(p) = p$ for all 0 < p < 1, and the same for
V . One can prove that
\[ X \overset{d}{=} F_X^{-1}(U) \overset{d}{=} F_X^{-1+}(U). \tag{1.9} \]
The first distributional equality is known as the quantile transform theo-
rem and follows immediately from (1.8). It states that a sample of random
numbers from a general cumulative distribution function FX can be gen-
erated from a sample of uniform random numbers. Note that FX has at
most a countable number of horizontal segments, implying that the last
two random variables in (1.9) only differ in a null-set of values of U . This
means that these random variables are equal with probability one.
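The quantile transform can be sketched as a sampling routine. The example below is only an illustration: it assumes an exponential distribution with cdf F(x) = 1 − exp(−λx), whose inverse is available in closed form, and the rate, seed and sample size are arbitrary choices.

```python
import math
import random

def exp_quantile(u, rate):
    """F^{-1}(u) for the exponential cdf F(x) = 1 - exp(-rate * x), 0 <= u < 1."""
    return -math.log(1.0 - u) / rate

rng = random.Random(1)
rate = 2.0

# X = F^{-1}(U) with U uniform on (0, 1) has cdf F.
sample = [exp_quantile(rng.random(), rate) for _ in range(50_000)]

sample_mean = sum(sample) / len(sample)
empirical_cdf_at_1 = sum(x <= 1.0 for x in sample) / len(sample)

print("sample mean      :", sample_mean, " (theoretical 1/rate =", 1 / rate, ")")
print("empirical F_n(1) :", empirical_cdf_at_1, " (theoretical", 1 - math.exp(-rate), ")")
```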
1.1.3 Actuarial ordering of risks
In the actuarial literature, orderings of risks are an important tool for
comparing the attractiveness of different risks. Many examples and results
can be found in the work of Goovaerts et al. (1990), Van Heerwaarden
(1991) and Kaas et al. (1998).
The essential tool for the comparison of different concepts of order-
ings of risks will be the stop-loss transform/premium and its properties.
Throughout this section a risk X will be a random variable with finite
mean. The distribution function of X is denoted by $F_X$, and $\bar{F}_X = 1 - F_X$
is the corresponding survival function.
In the actuarial literature it is a common feature to replace a risk by
a “less favorable” risk that has a simpler structure, making it easier to
determine the distribution function. Of course, we have to clarify what we
mean with a “less favorable” risk. Therefore, we first introduce the notion
of “stop-loss premium” of a distribution function.
Definition 1 (Stop-loss premium).
The stop-loss premium with retention d of a risk X is defined by
\[ \pi(X, d) := E\big[(X - d)_+\big] = \int_d^\infty \bar{F}_X(x)\,dx, \qquad -\infty < d < +\infty, \tag{1.10} \]
with the notation (x− d)+ = max(x− d, 0).
From this formula it is clear that the stop-loss premium with retention
d can be considered as the weight of an upper tail of (the distribution
function of) X. Indeed, it is the surface between the cdf FX of X and
the constant function 1, from d on. For these reasons stop-loss premiums
contain a lot of information about riskiness of underlying distributions.
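The interpretation of (1.10) as the area between the cdf and the constant function 1 can be checked numerically. The sketch below assumes an exponential risk with rate 2, for which the stop-loss premium is exp(−2d)/2 when d ≥ 0 and 1/2 − d when d < 0; the truncation point of the integral, the step count and the function names are illustrative assumptions.

```python
import math

RATE = 2.0  # assumed exponential risk with mean 1/RATE

def survival(x):
    """Survival function 1 - F_X(x) of the exponential risk."""
    return math.exp(-RATE * x) if x >= 0 else 1.0

def stop_loss_closed_form(d):
    """E[(X - d)+] for the exponential risk."""
    return math.exp(-RATE * d) / RATE if d >= 0 else 1.0 / RATE - d

def stop_loss_by_integration(d, upper=40.0, steps=20000):
    """Midpoint-rule approximation of the integral of the survival function over (d, upper)."""
    h = (upper - d) / steps
    return h * sum(survival(d + (i + 0.5) * h) for i in range(steps))

for d in (-1.0, 0.0, 0.5, 2.0):
    print(d, stop_loss_closed_form(d), stop_loss_by_integration(d))
```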
The following properties of the stop-loss premium can easily be deduced
from the definition.
Theorem 2 (Stop-loss properties).
The stop-loss premium π(X, .) has the following properties:
(i) π(X, .) is decreasing and convex;
(ii) The right-hand derivative $\pi'_+(X, .)$ exists and $-1 \le \pi'_+(X, .) \le 0$;
(iii) $\lim_{d \to +\infty} \pi(X, d) = 0$.
To every function π : R+ → R, that fulfils (i)-(iii) there is a risk X, such
that π is the stop-loss premium of X. The distribution function of X is
given by FX(d) = π′+(X, d) + 1.
There are many concepts for comparing random variables. The most fa-
miliar one is the usual stochastic order introduced by Lehmann (1955).
In the actuarial and economic literature this ordering is sometimes called
stochastic dominance, see e.g. Goovaerts et al. (1990) and Van Heerwaar-
den (1991).
Definition 2 (Stochastic order).
We say that risk Y stochastically dominates risk X, written X ≤st Y , if
and only if FX(t) ≥ FY (t) for all t ∈ R.
In other words, X ≤st Y if their corresponding quantiles are ordered.
Note that the condition for stochastic dominance is very strong: it can
be easily seen that X ≤st Y if and only if there exists a bivariate vector
(X ′, Y ′) with the same marginal distributions as X and Y and such that
X ′ ≤ Y ′ almost surely.
Several results for this ordering can be found in Shaked and Shanthikumar
(1994). In the following lemma, some equivalent characterizations
are given for stochastic dominance.
Lemma 2 (Characterizations for stochastic dominance).
X ≤st Y holds if and only if any of the following equivalent conditions is
satisfied:
1. Pr[X ≥ t] ≤ Pr[Y ≥ t], for all t ∈ R;
2. Pr[X > t] ≤ Pr[Y > t], for all t ∈ R;
3. E[φ(X)] ≤ E[φ(Y )], for all non-decreasing functions φ(.);
4. E[ψ(−X)] ≥ E[ψ(−Y )], for all non-decreasing functions ψ(.);
5. The function t→ π(Y, t) − π(X, t) is non-increasing.
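Several of these characterizations can be verified on a simple pair of risks. The sketch below assumes two exponential risks, X with rate 2 and Y with rate 1, so that X ≤st Y; the grid, the seed and the function names are illustrative assumptions. It checks Definition 2 on a grid, builds the coupling X′ = F⁻¹X(U), Y′ = F⁻¹Y(U) from a common uniform U, and checks that t → π(Y, t) − π(X, t) is non-increasing for t ≥ 0.

```python
import math
import random

def quantile_exp(u, rate):
    """Inverse cdf of the exponential distribution."""
    return -math.log(1.0 - u) / rate

def cdf_exp(t, rate):
    return 1.0 - math.exp(-rate * t) if t > 0 else 0.0

def stop_loss_exp(t, rate):
    """E[(X - t)+] for an exponential risk and retention t >= 0."""
    return math.exp(-rate * t) / rate

rate_x, rate_y = 2.0, 1.0            # assumed rates; X is the smaller risk
grid = [0.05 * k for k in range(101)]

# Definition 2: F_X(t) >= F_Y(t) for all t.
cdf_ordered = all(cdf_exp(t, rate_x) >= cdf_exp(t, rate_y) for t in grid)

# Coupling through a common uniform: X' <= Y' for every draw.
rng = random.Random(7)
us = [rng.random() for _ in range(1000)]
coupled = [(quantile_exp(u, rate_x), quantile_exp(u, rate_y)) for u in us]
coupling_ordered = all(x <= y for x, y in coupled)

# Condition 5: the difference of stop-loss premiums is non-increasing.
gaps = [stop_loss_exp(t, rate_y) - stop_loss_exp(t, rate_x) for t in grid]
gap_non_increasing = all(b <= a + 1e-12 for a, b in zip(gaps, gaps[1:]))

print(cdf_ordered, coupling_ordered, gap_non_increasing)
```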
A consequence of stochastic order X ≤st Y , i.e. a necessary condition for
it, is obviously that E[X] ≤ E[Y ], and even E[X] < E[Y ] unless $X \overset{d}{=} Y$.
The stochastic dominance has a natural interpretation in terms of utility
theory. We have that X ≤st Y holds if and only if E[u(−X)] ≥ E[u(−Y )]
for every non-decreasing utility function u. So the pairs of risks X and
Y with X ≤st Y are exactly those pairs of losses about which all decision
makers with an increasing utility function agree.
For actuarial applications the stop-loss order is much more interesting.
This ordering was investigated by Buhlmann et al. (1977), Goovaerts et al.
(1990) and Van Heerwaarden (1991). It is equivalent to increasing convex
order, which is well known in operations research and statistics.
Definition 3 (Stop-loss order).
If X and Y are two risks, then X precedes Y in stop-loss order, written
X ≤sl Y , if and only if
π(X, d) ≤ π(Y, d) for all −∞ < d < +∞. (1.11)
In other words two risks are ordered in the stop-loss sense if their corres-
ponding stop-loss premiums are ordered. It is clear that stochastic order
induces stop-loss order.
Like stochastic order, stop-loss order between two risks X and Y implies
a corresponding ordering of their means. To prove this, assume that d < 0.
From the expression (1.10) in Definition 1 of stop-loss premiums as upper
tails, we immediately find the following equality:
\[ d + \pi(X, d) = -\int_d^0 F_X(x)\,dx + \int_0^\infty \big(1 - F_X(x)\big)\,dx \tag{1.12} \]
and also, letting d→ −∞,
\[ \lim_{d \to -\infty} \big(d + \pi(X, d)\big) = E[X]. \]
Hence, adding d to both sides of the inequality (1.11) in Definition 3 and
taking the limit for d→ −∞, we get E[X] ≤ E[Y ].
A sufficient condition for X ≤sl Y to hold is that E[X] ≤ E[Y ], together
with the condition that their cumulative distribution functions only cross
once. This means that there exists a real number c such that $F_X(x) \ge F_Y(x)$
for x ≥ c, but $F_X(x) \le F_Y(x)$ for x < c. Indeed, considering the
function f(d) = π(Y, d) − π(X, d), we have that
\[ \lim_{d \to -\infty} f(d) = E[Y] - E[X] \ge 0, \quad \text{and} \quad \lim_{d \to +\infty} f(d) = 0. \]
Further, f(d) first increases, and then decreases (from c on) but remains
non-negative.
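The once-crossing argument can be illustrated with a concrete pair of risks. The sketch below assumes X uniform on (0, 1.5), with mean 0.75, and Y exponential with mean 1; both distributions, the grid and the function names are choices made only for this example. It counts the sign changes of FX − FY and evaluates f(d) = π(Y, d) − π(X, d) on a grid of retentions.

```python
import math

B = 1.5  # X ~ Uniform(0, B), so E[X] = 0.75; Y ~ Exponential(1), so E[Y] = 1

def cdf_x(x):
    return min(max(x / B, 0.0), 1.0)

def cdf_y(x):
    return 1.0 - math.exp(-x) if x > 0 else 0.0

def stop_loss_x(d):
    if d < 0:
        return B / 2 - d
    return (B - d) ** 2 / (2 * B) if d < B else 0.0

def stop_loss_y(d):
    return math.exp(-d) if d >= 0 else 1.0 - d

grid = [-1 + 0.01 * k for k in range(701)]  # retentions from -1 to 6

# Sign changes of F_X - F_Y on the positive half-line.
diffs = [cdf_x(x) - cdf_y(x) for x in grid if x > 0]
crossings = sum(1 for a, b in zip(diffs, diffs[1:]) if a * b < 0)

# f(d) = pi(Y, d) - pi(X, d): increases up to the crossing point, then decreases.
f = [stop_loss_y(d) - stop_loss_x(d) for d in grid]
d_peak = grid[f.index(max(f))]

print("number of crossings:", crossings)
print("min f:", min(f), " argmax f:", d_peak)
```

In this example the cdfs cross once, near c ≈ 0.87, which is also where f attains its maximum; f stays non-negative on the whole grid, consistent with X ≤sl Y.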
If two risks X and Y are ordered in the stop-loss sense, X ≤sl Y , this
means that X has uniformly smaller upper tails than Y , which in turn
means that a risk X is more attractive than a risk Y for an insurance
company. Moreover stop-loss order has a natural economic interpretation
in terms of expected utility. Indeed, it can be shown that X ≤sl Y if and
only if $E[u(-X)] \ge E[u(-Y)]$ holds for all non-decreasing concave real
functions u. This means that any risk-averse decision maker will prefer
to pay X instead of Y , which implies that acting as if the obligations X
are replaced by Y indeed leads to conservative or prudent decisions. This
characterization of stop-loss order in terms of utility functions is equivalent
to $E[v(X)] \le E[v(Y)]$ holding for all non-decreasing convex functions v.
For this reason stop-loss order is alternatively called an increasing convex
order and denoted by ≤icx.
Recall that our original problem was to replace a risk X by a less
favorable risk Y , for which the distribution function is easier to obtain. If
X ≤sl Y , then also E[X] ≤ E[Y ], and it is intuitively clear that the best
approximations arise in the borderline case where E[X] = E[Y ]. This leads
to the so-called convex order.
Definition 4 (Convex order).
If X and Y are two risks, then X precedes Y in convex order, written
X ≤cx Y , if and only if
E[X] = E[Y ] and π(X, d) ≤ π(Y, d) for all −∞ < d < +∞. (1.13)
A sufficient condition for X ≤cx Y to hold is that E[X] = E[Y ], together
with the condition that their cumulative distribution functions only cross
once. This once-crossing condition can be observed to hold in most natural
examples, but it is of course easy to construct examples with X ≤cx Y and
distribution functions that cross more than once.
It can also be proven that X ≤cx Y if and only if E[v(X)] ≤ E[v(Y )] for
all convex functions v. This explains the name “convex order”. Note that
when characterizing stop-loss order, the convex functions v are additionally
required to be non-decreasing. Hence, stop-loss order is weaker: more pairs
of random variables are ordered.
In the utility context one will reformulate this condition to E[X] = E[Y ]
and E[u(−X)] ≥ E[u(−Y )] for all non-decreasing concave functions u.
These conditions represent the common preferences of all risk-averse deci-
sion makers between risks with equal mean. We summarize the properties
of convex order in the following lemma.
14 Chapter 1 - Risk and comonotonicity in the actuarial world
Lemma 3 (Characterizations for convex order).
X ≤cx Y if and only if any of the following equivalent conditions is satis-
fied:
1. E[X] = E[Y ] and π(X, d) ≤ π(Y, d) for all d ∈ R;
2. E[X] = E[Y ] and E[(d−X)+] ≤ E[(d− Y )+] for all d ∈ R;
3. π(X, d) ≤ π(Y, d) and E[(d−X)+] ≤ E[(d− Y )+] for all d ∈ R;
4. E[X] = E[Y ] and E[u(−X)] ≥ E[u(−Y )] for all concave functions
u(.);
5. E[v(X)] ≤ E[v(Y )] for all convex functions v(.).
In case X ≤cx Y , the upper tails as well as the lower tails of Y eclipse the
corresponding tails of X, which means that extreme values are more likely
to occur for Y than for X. This observation also implies that X ≤cx Y is
equivalent to −X ≤cx −Y . Hence, the interpretation of risks as payments
or as incomes is irrelevant for the convex order.
Note that with stop-loss order, we are concerned with large values of
a random loss, and call the risk Y less attractive than X if the expected
values of all top parts (Y − d)+ are larger than those of X. Negative
values for these random variables are actually gains. With stability in
mind, excessive gains might also be unattractive for the decision maker,
for instance for tax reasons. In this situation, X could be considered to
be more attractive than Y if both the top parts (X − d)+ and the bottom
parts (d −X)+ have a lower expected value than for Y . Both conditions
just define the convex order introduced above.
Corollary 1 (Convex order and variance).
If X ≤cx Y then Var[X] ≤ Var[Y ].
Proof. It suffices to take the convex function v(x) = x² and to recall that E[X] = E[Y ].
Notice that the reverse implication does not hold in general. Compar-
ing variances is meaningful when comparing stop-loss premiums of convex
ordered risks. The following corollary links variances and stop-loss premi-
ums.
Corollary 2 (Variance and stop-loss premiums).
For any random variable X we can write
$$\frac{1}{2}\,\mathrm{Var}[X] = \int_{-\infty}^{+\infty} \Bigl(\pi(X,t) - \bigl(E[X] - t\bigr)_+\Bigr)\,dt. \tag{1.14}$$
Proof. See e.g. Kaas et al. (1998).
From relation (1.14) in Corollary 2 we deduce that if X ≤cx Y ,
$$\int_{-\infty}^{+\infty} \bigl|\pi(Y,t) - \pi(X,t)\bigr|\,dt = \frac{1}{2}\bigl(\mathrm{Var}[Y] - \mathrm{Var}[X]\bigr). \tag{1.15}$$
Thus, if X ≤cx Y , their stop-loss distance, i.e. the integrated absolute
difference of their respective stop-loss premiums, equals half the variance
difference between these two random variables.
As the integrand in (1.15) is non-negative, we find that if X ≤cx Y and
in addition Var[X] = Var[Y ], then X and Y must necessarily have equal
stop-loss premiums and hence the same distribution. We also find that
if X ≤cx Y , and X and Y are not equal in distribution, then Var[X] <
Var[Y ] must hold. Note that (1.14) and (1.15) have been derived under
the additional condition that X and Y have finite second moments, hence
both lim x→∞ x²(1 − FX(x)) and lim x→−∞ x²FX(x) are equal to 0 (and
similarly for Y ).
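
Relations (1.14) and (1.15) can be checked numerically on a small example. The sketch below is only an illustration: the two discrete distributions and all helper names are assumptions introduced here, not taken from the text. For a discrete risk the stop-loss premium is piecewise linear in the retention, so the trapezoidal rule on the atoms (together with the mean) integrates it exactly.

```python
def mean(dist):
    """Mean of a discrete distribution given as [(value, probability), ...]."""
    return sum(p * x for x, p in dist)

def variance(dist):
    m = mean(dist)
    return sum(p * (x - m) ** 2 for x, p in dist)

def stop_loss(dist, d):
    """Stop-loss premium pi(X, d) = E[(X - d)_+]."""
    return sum(p * max(x - d, 0.0) for x, p in dist)

def integrate(f, knots):
    """Trapezoidal rule on the sorted knots; exact if f is linear in between."""
    ks = sorted(set(knots))
    return sum(0.5 * (f(a) + f(b)) * (b - a) for a, b in zip(ks, ks[1:]))

# Hypothetical example: Y is a mean-preserving spread of X, hence X <=cx Y.
X = [(1.0, 0.5), (3.0, 0.5)]   # mean 2, variance 1
Y = [(0.0, 0.5), (4.0, 0.5)]   # mean 2, variance 4

def half_variance_via_stop_loss(dist):
    """Right-hand side of (1.14); the integrand vanishes outside the atoms."""
    m = mean(dist)
    knots = [x for x, _ in dist] + [m]
    return integrate(lambda t: stop_loss(dist, t) - max(m - t, 0.0), knots)

# Left-hand side of (1.15): the stop-loss distance between X and Y.
knots_xy = [x for x, _ in X + Y]
stop_loss_distance = integrate(
    lambda t: abs(stop_loss(Y, t) - stop_loss(X, t)), knots_xy)

print(half_variance_via_stop_loss(X), 0.5 * variance(X))
print(stop_loss_distance, 0.5 * (variance(Y) - variance(X)))
```

In each printed pair the two numbers should coincide, the first pair illustrating (1.14) and the second (1.15).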
In the following theorem we recall the characterization of stochastic dom-
inance in terms of Value-at-Risk, and a similar result characterizing stop-
loss order by Tail Value-at-Risk.
Theorem 3. For any random pair (X,Y ) we have that
1. X ≤st Y ⇔ VaRp[X] ≤ VaRp[Y ] for all p ∈ (0, 1);
2. X ≤sl Y ⇔ TVaRp[X] ≤ TVaRp[Y ] for all p ∈ (0, 1).
Proof. See Dhaene et al. (2004).
1.2 Comonotonicity
In an insurance context, one is often interested in the distribution function
of a sum of random variables. Such a sum appears for instance when con-
sidering the aggregate claims of an insurance portfolio over a certain refer-
ence period. In traditional risk theory, the individual risks of a portfolio are
usually assumed to be mutually independent. This is very convenient from
a mathematical point of view as the standard techniques for determining
the distribution function of aggregate claims, such as Panjer’s recursion
and convolution, are based on the independence assumption. Moreover, in
general the statistics gathered by the insurer only give information about
the marginal distributions of the risks, not about their joint distribution,
i.e. the way these risks are interrelated. The assumption of mutual independence,
however, does not always comply with reality, which may result
in an underestimation of the total risk. On the other hand, the mathematics
for dependent variables is less tractable, except when the variables are
comonotonic.
This section provides theoretical background for the concept of comono-
tonicity.
We start by defining comonotonicity of a set A of n-vectors in Rn. We
will denote an n-vector (x1, x2, . . . , xn) by ~x. For two n-vectors ~x and ~y, the
notation ~x ≤ ~y will be used for the componentwise order which is defined
by xi ≤ yi for all i = 1, 2, . . . , n. We will denote the (i, j)-projection of a
set A in Rn by Ai,j . It is formally defined by Ai,j = {(xi, xj)|~x ∈ A}.
Definition 5 (Comonotonic set).
The set A ⊆ Rn is said to be comonotonic if for any ~x and ~y in A, either
~x ≤ ~y or ~y ≤ ~x holds.
Equivalently, a set A ⊆ Rn is comonotonic if, for any ~x and ~y in A, xi < yi for some
i implies that ~x ≤ ~y. Hence, a comonotonic set is simultaneously non-decreasing
in each component. Notice that a comonotonic set is a ‘thin’
set: it cannot contain any subset of dimension larger than 1. Any subset of
a comonotonic set is also comonotonic. The proof of the following lemma
is straightforward.
Lemma 4. A ⊆ Rn is comonotonic if and only if the set Ai,j is comonotonic
for all i ≠ j in {1, 2, . . . , n}.
For a general set A, comonotonicity of the (i, i + 1)-projections Ai,i+1,
(i = 1, 2, . . . , n− 1), will not necessarily imply that A is comonotonic. As
a counterexample, consider the set A = {(x1, 1, x3)|0 < x1, x3 < 1}. This
set is not comonotonic, although A1,2 and A2,3 are comonotonic.
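
The counterexample can be reproduced on a finite sample of the set A. The sketch below is an illustration only: the grid of points and the function names are assumptions made here, and indices are 0-based, so `projection(A, 0, 1)` plays the role of A1,2.

```python
from itertools import combinations

def leq(x, y):
    """Componentwise order x <= y."""
    return all(a <= b for a, b in zip(x, y))

def is_comonotonic(points):
    """Definition 5 on a finite set: every two points are ordered componentwise."""
    return all(leq(x, y) or leq(y, x) for x, y in combinations(points, 2))

def projection(points, i, j):
    """(i, j)-projection of a finite set of vectors (0-based indices)."""
    return {(p[i], p[j]) for p in points}

# Finite sample of A = {(x1, 1, x3) | 0 < x1, x3 < 1} on a hypothetical grid.
grid = [0.25, 0.5, 0.75]
A = [(x1, 1.0, x3) for x1 in grid for x3 in grid]

print(is_comonotonic(A))                    # the full set
print(is_comonotonic(projection(A, 0, 1)),  # A_{1,2}
      is_comonotonic(projection(A, 1, 2)))  # A_{2,3}
print(is_comonotonic(projection(A, 0, 2)))  # A_{1,3}
```

The projection A1,3 is the one that fails, which is consistent with Lemma 4: all pairwise projections are needed, not only the consecutive ones.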
Next, we define the notion of support of an n-dimensional random
vector ~X = (X1, . . . , Xn). Any subset A ⊆ Rn will be called a support
of ~X if Pr[~X ∈ A] = 1 and Pr[~X ∉ A] = 0. In general we will be
interested in supports which are “as small as possible”. Informally, the
smallest support of a random vector ~X is the subset of Rn that is obtained
by deleting from Rn all points which have a zero-probability neighborhood
(with respect to ~X). This support can be interpreted as the set of all
possible outcomes of ~X.
Definition 6 (Comonotonic random vector).
A random vector ~X = (X1, X2, . . . , Xn) is said to be comonotonic if it has
a comonotonic support.
From Definition 6 we can conclude that comonotonicity is a very strong
positive dependency structure. Indeed, if ~x and ~y are elements of the
comonotonic support of ~X, i.e. ~x and ~y are possible outcomes of ~X, then
they must be ordered component by component. This explains the term
comonotonic (common monotonic).
Comonotonicity of a random vector ~X implies that the higher the value
of one component Xj , the higher the value of any other component Xk.
This means that comonotonicity entails that no Xj is in any way a ‘hedge’
for another component Xk.
In the following theorem, some equivalent characterizations are given
for comonotonicity of a random vector.
Theorem 4 (Characterizations for comonotonicity).
A random vector ~X = (X1, X2, . . . , Xn) is comonotonic if and only if one
of the following equivalent conditions is satisfied:
1. ~X has a comonotonic support;
2. For all ~x = (x1, x2, . . . , xn), we have
$$F_{\vec{X}}(\vec{x}) = \min\bigl\{F_{X_1}(x_1), F_{X_2}(x_2), \ldots, F_{X_n}(x_n)\bigr\}; \tag{1.16}$$
3. For U ∼ U(0, 1), we have
$$\vec{X} \overset{d}{=} \bigl(F^{-1}_{X_1}(U), F^{-1}_{X_2}(U), \ldots, F^{-1}_{X_n}(U)\bigr); \tag{1.17}$$
4. There exist a random variable Z and non-decreasing functions fi
(i = 1, 2, . . . , n), such that
$$\vec{X} \overset{d}{=} \bigl(f_1(Z), f_2(Z), \ldots, f_n(Z)\bigr).$$
Proof. See Dhaene et al. (2002a).
From (1.16) we see that, in order to find the probability of all the outcomes
of n comonotonic risks Xi being less than xi (i = 1, . . . , n) one simply
takes the probability of the least likely of these n events. It is obvious
that for any random vector (X1, . . . , Xn), not necessarily comonotonic,
the following inequality holds:
$$\Pr[X_1 \le x_1, \ldots, X_n \le x_n] \le \min\bigl\{F_{X_1}(x_1), \ldots, F_{X_n}(x_n)\bigr\}, \tag{1.18}$$
and it is well-known that the function $\min\{F_{X_1}(x_1), \ldots, F_{X_n}(x_n)\}$ is indeed
the multivariate cdf of the random vector $(F^{-1}_{X_1}(U), \ldots, F^{-1}_{X_n}(U))$, which
has the same marginal distributions as (X1, . . . , Xn). Inequality (1.18)
states that in the class of all random vectors (X1, . . . , Xn) with the same
marginal distributions, the probability that all Xi simultaneously realize
small values is maximized if the vector is comonotonic, suggesting that
comonotonicity is indeed a very strong positive dependency structure. In
the special case that all marginal distribution functions FXi are identical,
we find from (1.17) that comonotonicity of ~X is equivalent to saying that
X1 = X2 = · · · = Xn holds almost surely.
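
Characterization (1.17) and the bound (1.18) lend themselves to a quick simulation. The following sketch is illustrative only: the exponential and normal marginals, the evaluation point (a, b), the seed and all names are assumptions made for this example. A comonotonic pair is generated from a single uniform U, an independent pair from two uniforms, and the empirical joint cdf is compared with the minimum of the empirical marginal cdfs.

```python
import math
import random
from statistics import NormalDist

random.seed(0)
NORMAL = NormalDist(mu=1.0, sigma=2.0)      # hypothetical marginal of X2

def exp_quantile(u, rate=0.5):
    """Quantile function of the hypothetical exponential marginal of X1."""
    return -math.log(1.0 - u) / rate

def uniform():
    """Uniform draw kept strictly inside (0, 1) so that inv_cdf is defined."""
    return min(max(random.random(), 1e-12), 1.0 - 1e-12)

def joint_cdf(xs, ys, a, b):
    """Empirical Pr[X1 <= a, X2 <= b]."""
    return sum(1 for x, y in zip(xs, ys) if x <= a and y <= b) / len(xs)

n = 10_000
u = [uniform() for _ in range(n)]
v = [uniform() for _ in range(n)]
x1 = [exp_quantile(ui) for ui in u]
x2_como = [NORMAL.inv_cdf(ui) for ui in u]    # same U as x1: comonotonic, cf. (1.17)
x2_indep = [NORMAL.inv_cdf(vi) for vi in v]   # separate uniform: independent

a, b = 1.0, 1.5
f1 = sum(1 for x in x1 if x <= a) / n         # empirical F_X1(a)
f2 = sum(1 for y in x2_como if y <= b) / n    # empirical F_X2(b)
joint_como = joint_cdf(x1, x2_como, a, b)
joint_indep = joint_cdf(x1, x2_indep, a, b)

print(joint_como, min(f1, f2), joint_indep)
```

For the comonotonic pair the joint frequency should match min(f1, f2), as in (1.16), while the independent pair stays below that bound, as (1.18) requires.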
A standard way of modelling situations where individual random vari-
ables X1, . . . , Xn are subject to the same external mechanism is to use a
secondary mixing distribution. The uncertainty about the external mech-
anism is then described by a structure variable z, which is a realization of
a random variable Z and acts as a (random) parameter of the distribution
of ~X. The aggregate claims can then be seen as a two-stage process: first,
the external parameter Z = z is drawn from the distribution function FZ
of Z. The claim amount of each individual risk Xi is then obtained as a
realization from the conditional distribution function of Xi given Z = z.
A special type of such a mixing model is the case where, given Z = z, the
claim amounts Xi are degenerate on xi, where the xi = xi(z) are non-decreasing
in z. This means that $(X_1, \ldots, X_n) \overset{d}{=} (f_1(Z), \ldots, f_n(Z))$ where
all functions fi are non-decreasing. Hence, (X1, . . . , Xn) is comonotonic.
Such a model is in a sense an extreme form of a mixing model, as in this
case the external parameter Z = z completely determines the aggregate
claims.
If U ∼ U(0, 1), then also 1 − U ∼ U(0, 1). This implies that comonotonicity
of ~X can also be characterized by
$$\vec{X} \overset{d}{=} \bigl(F^{-1}_{X_1}(1-U), F^{-1}_{X_2}(1-U), \ldots, F^{-1}_{X_n}(1-U)\bigr).$$
Similarly, one can prove that ~X is comonotonic if and only if there exist a
random variable Z and non-increasing functions fi, (i = 1, 2, . . . , n), such
that
$$\vec{X} \overset{d}{=} \bigl(f_1(Z), f_2(Z), \ldots, f_n(Z)\bigr).$$
In the sequel, for any random vector (X1, . . . , Xn), the notation $(X^c_1, \ldots, X^c_n)$
will be used to indicate a comonotonic random vector with the same
marginals as (X1, . . . , Xn). From (1.17) we find that for any random vector
~X the outcome of its comonotonic counterpart $\vec{X}^c = (X^c_1, \ldots, X^c_n)$ lies
with probability one in the following set
$$\bigl\{\bigl(F^{-1}_{X_1}(p), F^{-1}_{X_2}(p), \ldots, F^{-1}_{X_n}(p)\bigr) \bigm| 0 < p < 1\bigr\}.$$
The following theorem states essentially that the comonotonicity of a random
vector is equivalent to pairwise comonotonicity.
Theorem 5 (Pairwise comonotonicity).
A random vector ~X is comonotonic if and only if the couples (Xi, Xj) are
comonotonic for all i and j in {1, 2, . . . , n}.
The next theorem characterizes a comonotonic random couple by means
of Pearson’s correlation coefficient r.
Theorem 6 (Comonotonicity and maximum correlation).
For any random vector (X1, X2) the following inequality holds:
$$r(X_1, X_2) \le r\bigl(F^{-1}_{X_1}(U), F^{-1}_{X_2}(U)\bigr), \tag{1.19}$$
with strict inequality when (X1, X2) is not comonotonic.
As a special case of (1.19), we find that $r(F^{-1}_{X_1}(U), F^{-1}_{X_2}(U)) \ge 0$ always
holds. In Denuit & Dhaene (2003) it is shown that other dependence
measures such as Kendall’s τ and Spearman’s ρ equal 1 (and thus are also
maximal) if and only if the variables are comonotonic.
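
A sample analogue of Theorem 6 is easy to try out: for two fixed samples, every pairing has the same empirical marginals, and pairing the sorted values is the comonotonic one. By the rearrangement inequality this pairing maximizes the sum of products and therefore Pearson's r. The sketch below is an illustration under assumptions made here (lognormal and exponential samples, seed, number of shuffles, helper names).

```python
import random
from statistics import fmean, pstdev

random.seed(1)

def pearson(xs, ys):
    """Pearson correlation of two paired samples (population version)."""
    mx, my = fmean(xs), fmean(ys)
    cov = fmean((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / (pstdev(xs) * pstdev(ys))

n = 2_000
xs = [random.lognormvariate(0.0, 0.5) for _ in range(n)]   # hypothetical X1 sample
ys = [random.expovariate(1.0) for _ in range(n)]           # hypothetical X2 sample

# Comonotonic pairing: i-th smallest x with i-th smallest y.
r_como = pearson(sorted(xs), sorted(ys))

# Other pairings with the same marginals: random permutations of ys.
r_other = []
for _ in range(20):
    shuffled = ys[:]
    random.shuffle(shuffled)
    r_other.append(pearson(xs, shuffled))

print(r_como, max(r_other))
```

The comonotonic correlation is positive but stays below 1 here, because the two marginals are not of the same type; it is nevertheless the largest value attainable for these marginals.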
In the following theorem we recall that the Value-at-Risk (VaRp), the
Tail Value-at-Risk (TVaRp) and the Expected Shortfall (ESFp) are addi-
tive for comonotonic risks.
Theorem 7 (Comonotonicity and risk measures).
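Consider a comonotonic random vector $(X^c_1, X^c_2, \ldots, X^c_n)$, and let
$S^c = X^c_1 + X^c_2 + \cdots + X^c_n$. Then for all p ∈ (0, 1) one has that
$$\mathrm{VaR}_p[S^c] = \sum_{i=1}^{n} \mathrm{VaR}_p[X_i]; \tag{1.20}$$
$$\mathrm{TVaR}_p[S^c] = \sum_{i=1}^{n} \mathrm{TVaR}_p[X_i]; \tag{1.21}$$
$$\mathrm{ESF}_p[S^c] = \sum_{i=1}^{n} \mathrm{ESF}_p[X_i]. \tag{1.22}$$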
Proof. See Dhaene et al. (2004).
The computation of the most important risk measures is very easy for sums
of comonotonic random variables, since it suffices to perform calculations
for marginal distributions and add up the resulting values. Throughout
the rest of this thesis we will use the property of additivity of a quantile
function for comonotonic risks.
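
The additivity (1.20) can be observed directly on simulated data. The following is a minimal sketch under assumptions made here: the exponential and lognormal marginals, the level p, the seed and the function names are hypothetical. Both components are generated from the same uniform numbers, so the sample is comonotonic and the empirical quantile of the sum coincides with the sum of the empirical quantiles.

```python
import math
import random
from statistics import NormalDist

random.seed(7)
STD_NORMAL = NormalDist()

def exp_quantile(u, rate=1.0):
    """Quantile function of an exponential risk (hypothetical marginal)."""
    return -math.log(1.0 - u) / rate

def lognormal_quantile(u, mu=0.0, sigma=0.8):
    """Quantile function of a lognormal risk (hypothetical marginal)."""
    return math.exp(mu + sigma * STD_NORMAL.inv_cdf(u))

def var_p(sample, p):
    """Empirical VaR_p: the smallest sample value x with F_n(x) >= p."""
    ordered = sorted(sample)
    return ordered[max(math.ceil(p * len(ordered)) - 1, 0)]

n = 20_000
u = [min(max(random.random(), 1e-12), 1.0 - 1e-12) for _ in range(n)]
x1 = [exp_quantile(ui) for ui in u]
x2 = [lognormal_quantile(ui) for ui in u]     # same U as x1: comonotonic pair
s_c = [a + b for a, b in zip(x1, x2)]         # comonotonic sum S^c

p = 0.99
var_of_sum = var_p(s_c, p)
sum_of_vars = var_p(x1, p) + var_p(x2, p)
print(var_of_sum, sum_of_vars)
```

The two printed values should agree up to rounding. For a non-comonotonic pairing of the same marginals no such identity is available in general.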
Chapter 2
Convex bounds
Summary In many actuarial and financial problems the distribution of
a sum of dependent random variables is of interest. In general, however,
this distribution function cannot be obtained analytically because of the
complex underlying dependency structure. Kaas et al. (2000) and Dhaene
et al. (2002a) propose a possible way out by considering upper and lower
bounds for (the distribution function of) such a sum that allow explicit cal-
culations of various actuarial quantities. When lower and upper bounds are
close to each other, together they can provide reliable information about
the original and more complex variable. In particular this technique is very
useful to find reliable estimations of upper quantiles and stop-loss premi-
ums. We summarize the main results for deriving lower and upper bounds
and we construct sharper upper bounds for stop-loss premiums, based upon
the traditional comonotonic bounds. The idea of convex upper and lower
bounds is generalized to the case of scalar products of non-negative random
variables. We apply the derived results to the case of general discounted
cash flows, with stochastic payments. Numerous numerical illustrations
are provided, demonstrating that the derived methodology gives very ac-
curate approximations for the underlying distribution functions and the
corresponding risk measures, like quantiles and stop-loss premiums.
2.1 Introduction
In many financial and actuarial applications where a sum of stochastic
terms is involved, the distribution of the quantity under investigation is too
difficult to obtain. It is well-known that in general the distribution function
of a sum of dependent random variables cannot be determined analytically.
Therefore, instead of aiming to calculate the exact distribution, we will look
for approximations (bounds), in the convex order sense, with a simpler
structure.
The first approximation we will consider for the distribution function
of a sum of dependent random variables is derived by approximating the
dependence structure between the random variables involved by a comono-
tonic dependence structure. If the dependency structure between the sum-
mands of such a sum is strong enough, this upper bound in convex order
performs reasonably well.
The second approximation, which is derived by considering conditional
expectations, takes part of the dependence structure into account. This
lower bound in convex order turns out to be extremely useful to evaluate
the quality of approximation provided by the upper bound. The lower
bound can also be applied as an approximation of the underlying distribu-
tion. This choice is not actuarially prudent, but the relative error of this
approximation significantly outperforms the relative error of the upper
bound.
When lower and upper bounds are close to each other, together they
can provide reliable information about the original and more complex vari-
able. We emphasize that the bounds are in convex order, which does not
mean that the real value always lies between these two approximations. In
particular this technique is very useful to find reliable estimations of upper
quantiles and stop-loss premiums.
Section 2 recalls these theoretical results of Dhaene et al. (2002a).
The lower bound approximates the real stop-loss premium very accurately,
but the comonotonic upper bounds perform rather poorly. Therefore, in
Section 3 we construct sharper upper bounds based upon the traditional
comonotonic bounds. Making use of the ideas of Rogers and Shi (1995),
the first upper bound is obtained as the comonotonic lower bound plus
an error term. Next, this bound is refined by making the error term de-
pendent on the retention in the stop-loss premium. Further, we study the
case that the stop-loss premium can be decomposed into two parts. One
part can be evaluated exactly; to the other part, comonotonic bounds are
applied. The application to the lognormal case is presented at the end of
Section 3.
In Section 4 we illustrate the accuracy of the comonotonic approxima-
tions by means of an application in the context of discounted reserves.
Section 5 extends the methodology of Dhaene et al. (2002a,b) for deriv-
ing lower and upper bounds of a sum of dependent variables to the case of
scalar products of independent random vectors. We derive a procedure for
calculating the lower and upper bounds in case one of the vectors follows
the multivariate lognormal law.
In Section 6 we apply these results to the case of general discounted
cash flows, with stochastic payments. Numerous numerical illustrations
are provided, demonstrating that the derived methodology gives very ac-
curate approximations for the underlying distribution functions and the
corresponding risk measures.
Sections 2 and 3 in this chapter are mainly based on Hoedemakers, Darkiewicz,
Deelstra, Dhaene & Vanmaele (2005). The results in Section 4
come from Hoedemakers & Goovaerts (2004). The generalization to the
scalar product of two random vectors in Section 5 is based on Hoedemak-
ers, Darkiewicz & Goovaerts (2005) and Section 6 is taken from Ahcan,
Darkiewicz, Goovaerts & Hoedemakers (2005).
2.2 Convex bounds for sums of dependent random variables
In the actuarial context one encounters quite often random variables of the
type
S = X1 +X2 + · · · +Xn,
where the terms Xi are not mutually independent, and the multivariate
distribution function of the random vector ~X = (X1, X2, . . . , Xn) is not
completely specified: one only knows the marginal distribution functions
of the random variables Xi. In such cases, to be able to make decisions
it may be helpful to find the dependence structure for the random vector
(X1, . . . , Xn) producing the least favorable aggregate claims S with given
marginals. Therefore, given the marginal distributions of the terms in a
random variable $S = \sum_{i=1}^{n} X_i$, we shall look for a joint distribution with
a smaller resp. larger sum, in the convex order sense.
If S consists of a sum of random variables (X1, . . . , Xn), replacing the
joint distribution of (X1, . . . , Xn) by the comonotonic joint distribution
yields an upper bound for S in the convex order. On the other hand, applying
conditioning to S provides us with a lower bound. Finally, if we combine
both ideas, then we end up with an improved upper bound. This is formal-
ized in the following theorem, which is taken from Dhaene et al. (2002a)
and Kaas et al. (2000).
Theorem 8 (Bounds for a sum of random variables).
Consider a sum of random variables S = X1 +X2 + . . . +Xn and define
the following related random variables:
$$S^l = E[X_1|\Lambda] + E[X_2|\Lambda] + \ldots + E[X_n|\Lambda], \tag{2.1}$$
$$S^c = F^{-1}_{X_1}(U) + F^{-1}_{X_2}(U) + \ldots + F^{-1}_{X_n}(U), \tag{2.2}$$
$$S^u = F^{-1}_{X_1|\Lambda}(U) + F^{-1}_{X_2|\Lambda}(U) + \ldots + F^{-1}_{X_n|\Lambda}(U), \tag{2.3}$$
with U a U(0,1) random variable and Λ an arbitrary random variable.
Here $F^{-1}_{X_i|\Lambda}(U)$ is the notation for the random variable $f_i(U,\Lambda)$, with the
function fi defined by $f_i(u,\lambda) = F^{-1}_{X_i|\Lambda=\lambda}(u)$.
The following relations then hold:
$$S^l \le_{cx} S \le_{cx} S^u \le_{cx} S^c.$$
Proof. See e.g. Dhaene et al. (2002a).
The comonotonic upper bound changes the original copula, but keeps the
marginal distributions unchanged. The comonotonic lower bound on the
other hand, changes both the copula and the marginals involved. Intu-
itively, one can expect that an appropriate choice of the conditioning vari-
able Λ will lead to much better approximations compared to the upper
bound.
The upper bound Sc is the most dangerous sum of random variables
with the same marginal distributions as the original terms Xj in S. Indeed,
the upper bound Sc now consists of a sum of comonotonic variables all
depending on the same random variable U . If one can find a conditioning
random variable Λ with the property that all random variables E[Xj |Λ] are
non-increasing functions of Λ (or all are non-decreasing functions of Λ),
then the lower bound $S^l = \sum_{j=1}^{n} E[X_j|\Lambda]$ is also a sum of n comonotonic
random variables.
We recall from Dhaene et al. (2002a) and the references therein the
procedures for obtaining the lower and upper bounds for stop-loss pre-
miums of sums S of dependent random variables by using the notion of
comonotonicity.
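
By Corollary 1, the ordering of Theorem 8 implies Var[Sl] ≤ Var[S] ≤ Var[Sc], which can be checked in closed form in a lognormal setting. The sketch below is an illustration under assumptions made here, not a procedure from the text: two terms Xi = exp(Zi) with (Z1, Z2) bivariate normal, hypothetical parameter values, and the conditioning variable Λ = Z1 + Z2. It relies on the standard bivariate normal fact that E[Xi|Λ] is again lognormal, with log-standard deviation ri σi where ri = corr(Zi, Λ).

```python
import math

# Hypothetical parameters of the normal vector (Z1, Z2); X_i = exp(Z_i).
mu = [0.0, 0.0]
sigma = [0.3, 0.5]
rho = 0.4                     # correlation between Z1 and Z2

def cov_z(i, j):
    """Covariance of (Z_i, Z_j)."""
    return sigma[i] ** 2 if i == j else rho * sigma[i] * sigma[j]

# Conditioning variable Lambda = Z1 + Z2 and r_i = corr(Z_i, Lambda).
var_lambda = sum(cov_z(i, j) for i in range(2) for j in range(2))
r = [sum(cov_z(i, j) for j in range(2)) / (sigma[i] * math.sqrt(var_lambda))
     for i in range(2)]

mean_x = [math.exp(mu[i] + 0.5 * sigma[i] ** 2) for i in range(2)]

def variance_of_sum(log_cov):
    """Variance of a sum of lognormal terms with means mean_x, where the
    covariance of terms i and j equals mean_x[i] * mean_x[j] * (exp(c_ij) - 1)."""
    return sum(mean_x[i] * mean_x[j] * (math.exp(log_cov(i, j)) - 1.0)
               for i in range(2) for j in range(2))

var_s = variance_of_sum(cov_z)                                   # S
var_c = variance_of_sum(lambda i, j: sigma[i] * sigma[j])        # S^c
var_l = variance_of_sum(
    lambda i, j: r[i] * r[j] * sigma[i] * sigma[j])              # S^l = E[S | Lambda]

print(var_l, var_s, var_c)
```

With this choice of Λ the lower-bound variance is expected to lie close to Var[S], whereas the comonotonic variance is visibly larger.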
2.2.1 The comonotonic upper bound
As proven in Dhaene et al. (2002a), the convex-largest sum of the components
of a random vector with given marginals is obtained by the comonotonic
sum $S^c = X^c_1 + X^c_2 + \cdots + X^c_n$ with
$$S^c \overset{d}{=} \sum_{i=1}^{n} F^{-1}_{X_i}(U), \tag{2.4}$$
where U denotes, here and in the following, a U(0, 1) random variable.
Kaas et al. (2000) have proved that the inverse distribution function of
a sum of comonotonic random variables is simply the sum of the inverse
distribution functions of the marginal distributions. See also Theorem 7.
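Therefore, given the inverse functions $F^{-1}_{X_i}$, the cumulative distribution
function of $S^c = X^c_1 + X^c_2 + \cdots + X^c_n$ can be determined as follows:
$$\begin{aligned}
F_{S^c}(x) &= \sup\bigl\{p \in (0,1) \bigm| F_{S^c}(x) \ge p\bigr\} \\
&= \sup\bigl\{p \in (0,1) \bigm| F^{-1}_{S^c}(p) \le x\bigr\} \\
&= \sup\Bigl\{p \in (0,1) \Bigm| \sum_{i=1}^{n} F^{-1}_{X_i}(p) \le x\Bigr\}.
\end{aligned} \tag{2.5}$$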
Moreover, in case of strictly increasing and continuous marginals, the cdf
FSc(x) is uniquely determined by
$$F^{-1}_{S^c}\bigl(F_{S^c}(x)\bigr) = \sum_{i=1}^{n} F^{-1}_{X_i}\bigl(F_{S^c}(x)\bigr) = x, \qquad F^{-1+}_{S^c}(0) < x < F^{-1}_{S^c}(1). \tag{2.6}$$
Hereafter we restrict ourselves to this case of strictly increasing and con-
tinuous marginals.
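
Relations (2.5) and (2.6) translate into a one-dimensional root search. The sketch below is a possible illustration, assuming lognormal marginals with hypothetical parameters; the function names and the bisection tolerance are choices made here. Since the sum of the marginal quantile functions is increasing in p, the cdf of Sc at x is found by bisection on p.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def lognormal_quantile(p, mu, sigma):
    """Quantile function of a lognormal(mu, sigma^2) marginal."""
    return math.exp(mu + sigma * STD_NORMAL.inv_cdf(p))

def comonotonic_quantile(p, params):
    """Quantile of S^c: the sum of the marginal quantiles."""
    return sum(lognormal_quantile(p, mu, sigma) for mu, sigma in params)

def comonotonic_cdf(x, params, tol=1e-12):
    """F_{S^c}(x) = sup{p in (0,1) : sum_i F_i^{-1}(p) <= x}, cf. (2.5),
    located by bisection on p."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if comonotonic_quantile(mid, params) <= x:
            lo = mid
        else:
            hi = mid
    return lo

# Hypothetical marginals (mu_i, sigma_i).
params = [(0.0, 0.3), (0.5, 0.5), (1.0, 0.2)]

p = 0.9
x = comonotonic_quantile(p, params)
print(x, comonotonic_cdf(x, params))   # the second value should be close to p
```

For a single marginal the routine reduces to the ordinary lognormal cdf, which gives a simple consistency check.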
The following theorem, proved in Dhaene et al. (2000), shows that the
stop-loss premiums of a sum of comonotonic random variables can easily
be obtained from the stop-loss premiums of the terms.
Theorem 9 (Stop-loss premium of comonotonic sum).
The stop-loss premium, denoted by $\pi^{cub}(S, d)$, of the sum $S^c$ of the components
of the comonotonic random vector $(X^c_1, X^c_2, \ldots, X^c_n)$ at retention
d is given by
$$\pi^{cub}(S,d) = \sum_{i=1}^{n} \pi\Bigl(X_i, F^{-1}_{X_i}\bigl(F_{S^c}(d)\bigr)\Bigr), \qquad F^{-1+}_{S^c}(0) < d < F^{-1}_{S^c}(1). \tag{2.7}$$
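
Formula (2.7) can be tried out on lognormal marginals, for which the individual stop-loss premium has the well-known closed form E[X]Φ(d1) − dΦ(d1 − σ) with d1 = (μ + σ² − ln d)/σ. The sketch below is an illustration under assumptions made here (parameter values, retention, function names, grid size). It compares the decomposition (2.7) with a direct numerical integration of the quantile function of Sc over (0, 1).

```python
import math
from statistics import NormalDist

PHI = NormalDist()

def quantile(p, mu, sigma):
    """Quantile function of a lognormal(mu, sigma^2) marginal."""
    return math.exp(mu + sigma * PHI.inv_cdf(p))

def lognormal_stop_loss(d, mu, sigma):
    """E[(X - d)_+] for X lognormal(mu, sigma^2) and retention d > 0."""
    d1 = (mu + sigma ** 2 - math.log(d)) / sigma
    return math.exp(mu + 0.5 * sigma ** 2) * PHI.cdf(d1) - d * PHI.cdf(d1 - sigma)

def comonotonic_cdf(x, params, tol=1e-12):
    """F_{S^c}(x) by bisection on the sum of the marginal quantiles."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if sum(quantile(mid, mu, s) for mu, s in params) <= x:
            lo = mid
        else:
            hi = mid
    return lo

def comonotonic_stop_loss(d, params):
    """(2.7): sum of marginal stop-loss premiums at retentions F_i^{-1}(F_{S^c}(d))."""
    p = comonotonic_cdf(d, params)
    return sum(lognormal_stop_loss(quantile(p, mu, s), mu, s) for mu, s in params)

def stop_loss_by_integration(d, params, n=20_000):
    """E[(S^c - d)_+] as the integral over p of (Q(p) - d)_+, midpoint rule."""
    h = 1.0 / n
    return h * sum(
        max(sum(quantile((k + 0.5) * h, mu, s) for mu, s in params) - d, 0.0)
        for k in range(n))

params = [(0.0, 0.3), (0.5, 0.5)]   # hypothetical (mu_i, sigma_i)
d = 3.5                             # hypothetical retention
print(comonotonic_stop_loss(d, params), stop_loss_by_integration(d, params))
```

The decomposition needs only one root search and n closed-form evaluations, whereas the direct integration has to evaluate the quantile function of the sum on the whole grid.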
If the only information available concerning the multivariate distribution
function of the random vector (X1, . . . , Xn) consists of the marginal distribution
functions of the Xi, then the distribution function of
$S^c = F^{-1}_{X_1}(U) + F^{-1}_{X_2}(U) + \cdots + F^{-1}_{X_n}(U)$ is a prudent choice for approximating
the unknown distribution function of S = X1 +X2 + · · · +Xn. It is a
supremum in terms of convex order. It is the best upper bound that can
be derived under the given conditions.
We end this part about the comonotonic upper bound by summarizing
the main advantages of using $S^c = X^c_1 + X^c_2 + \cdots + X^c_n$ instead of S =
X1 +X2 + · · · +Xn:
• Replacing the distribution function of S by the distribution function
of Sc is a prudent strategy in the framework of utility theory: the
real distribution function is replaced by a less attractive one.
• The random variables S and Sc have the same expected value. As
these random variables are ordered in the convex order sense, we
have that every moment of order 2k (k = 1, 2, . . .) of S is smaller
than the corresponding moment of Sc. Many actuarially relevant
quantities reflect convex order, for instance both the ruin probability
and the Lundberg upper bound for it increase when the claim size
distribution is replaced by a convex larger one. Other examples are
zero-utility premiums such as the exponential premium, and of course
stop-loss premiums for any retention d.
• The cdf of Sc can easily be obtained; essentially, Sc has a one-
dimensional distribution, depending only on the random variable U .
The distribution function of S can only be obtained if the dependency
structure is known. Even if this dependency structure is known, it
can be hard to determine the distribution function of S from it.
• The stop-loss premiums of Sc follow from stop-loss premiums of the
marginal random variables involved. Computing the stop-loss pre-
miums of S can only be carried out when the dependency structure
is known, and in general requires n integrations to be performed.
2.2.2 The improved comonotonic upper bound
Let us now assume that we have some additional information available
concerning the stochastic nature of (X1, . . . , Xn). More precisely, we assume
that there exists some random variable Λ with a given distribution
function, such that we know the conditional cumulative distribution func-
tions, given Λ = λ, of the random variables Xi, for all possible values of λ.
In fact, Kaas et al. (2000) define the improved comonotonic upper bound
Su as
$$S^u = F^{-1}_{X_1|\Lambda}(U) + F^{-1}_{X_2|\Lambda}(U) + \cdots + F^{-1}_{X_n|\Lambda}(U). \tag{2.8}$$
In order to obtain the distribution function of Su, observe that given the
event Λ = λ, the random variable Su is a sum of comonotonic random
variables.
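Hence,
$$F^{-1}_{S^u|\Lambda=\lambda}(p) = \sum_{i=1}^{n} F^{-1}_{X_i|\Lambda=\lambda}(p), \qquad p \in (0,1).$$
Given Λ = λ, the cdf of Su is defined by
$$F_{S^u|\Lambda=\lambda}(x) = \sup\Bigl\{p \in (0,1) \Bigm| \sum_{i=1}^{n} F^{-1}_{X_i|\Lambda=\lambda}(p) \le x\Bigr\}.$$
The cdf of Su then follows from
$$F_{S^u}(x) = \int_{-\infty}^{+\infty} F_{S^u|\Lambda=\lambda}(x)\,dF_{\Lambda}(\lambda).$$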
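If the marginal cdf’s FXi|Λ=λ are strictly increasing and continuous, then
FSu|Λ=λ(x) is a solution to
$$\sum_{i=1}^{n} F^{-1}_{X_i|\Lambda=\lambda}\bigl(F_{S^u|\Lambda=\lambda}(x)\bigr) = x, \qquad x \in \bigl(F^{-1+}_{S^u|\Lambda=\lambda}(0),\, F^{-1}_{S^u|\Lambda=\lambda}(1)\bigr). \tag{2.9}$$
In this case, we also find that for any $d \in \bigl(F^{-1+}_{S^u|\Lambda=\lambda}(0),\, F^{-1}_{S^u|\Lambda=\lambda}(1)\bigr)$:
$$E\bigl[(S^u - d)_+ \bigm| \Lambda = \lambda\bigr] = \sum_{i=1}^{n} E\Bigl[\bigl(X_i - F^{-1}_{X_i|\Lambda=\lambda}\bigl(F_{S^u|\Lambda=\lambda}(d)\bigr)\bigr)_+ \Bigm| \Lambda = \lambda\Bigr],$$
from which the stop-loss premium at retention d of Su, which we will
denote by πicub(S, d,Λ), can be determined by weighted integration with
respect to λ over the real line.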
2.2.3 The lower bound
Let ~X = (X1, . . . , Xn) be a random vector with given marginal cumula-
tive distribution functions FX1 , FX2 , . . . , FXn . Let us now assume that we
have some additional information available concerning the stochastic na-
ture of (X1, . . . , Xn). More precisely, we assume that there exists some
random variable Λ with a given distribution function, such that we know
the conditional distribution, given Λ = λ, of the random variables Xi, for
all possible values of λ. We recall from Kaas et al. (2000) that a lower
bound, in the sense of convex order, for S = X1 +X2 + · · · +Xn is
Sl = E [S|Λ] . (2.10)
This idea can also be found in Rogers and Shi (1995) for the continuous
and lognormal case. Let us further assume that the random variable Λ is
such that all E [Xi|Λ] are non-decreasing and continuous functions of Λ,
then Sl is a comonotonic sum.
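The quantiles of the lower bound Sl then follow from
$$F^{-1}_{S^l}(p) = \sum_{i=1}^{n} F^{-1}_{E[X_i|\Lambda]}(p) = \sum_{i=1}^{n} E\bigl[X_i \bigm| \Lambda = F^{-1}_{\Lambda}(p)\bigr], \qquad p \in (0,1), \tag{2.11}$$
and the cdf of Sl is according to (2.5) given by
$$F_{S^l}(x) = \sup\Bigl\{p \in (0,1) \Bigm| \sum_{i=1}^{n} E\bigl[X_i \bigm| \Lambda = F^{-1}_{\Lambda}(p)\bigr] \le x\Bigr\}. \tag{2.12}$$
Using Theorem 9, the stop-loss premiums with retention d, $F^{-1+}_{S^l}(0) < d < F^{-1}_{S^l}(1)$, read
$$\pi^{lb}(S,d,\Lambda) = \sum_{i=1}^{n} \pi\Bigl(E[X_i|\Lambda],\, F^{-1}_{E[X_i|\Lambda]}\bigl(F_{S^l}(d)\bigr)\Bigr).$$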
When in addition the cdf’s of the random variables E [Xi|Λ] are strictly
increasing and continuous, then the cdf of Sl is also strictly increasing and
continuous, and we get analogously to (2.6) for all $x \in \bigl(F^{-1+}_{S^l}(0),\, F^{-1}_{S^l}(1)\bigr)$,
$$\sum_{i=1}^{n} F^{-1}_{E[X_i|\Lambda]}\bigl(F_{S^l}(x)\bigr) = x \iff \sum_{i=1}^{n} E\bigl[X_i \bigm| \Lambda = F^{-1}_{\Lambda}\bigl(F_{S^l}(x)\bigr)\bigr] = x, \tag{2.13}$$
which unambiguously determines the cdf of the convex order lower bound
Sl for S. In order to derive the above equivalence, we used the results of
Lemma 1.
Invoking Theorem 9, the stop-loss premium πlb(S, d,Λ) of Sl can be
computed as:
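$$\pi^{lb}(S,d,\Lambda) = \sum_{i=1}^{n} \pi\Bigl(E[X_i|\Lambda],\, E\bigl[X_i \bigm| \Lambda = F^{-1}_{\Lambda}\bigl(F_{S^l}(d)\bigr)\bigr]\Bigr), \tag{2.14}$$
which holds for all retentions $d \in \bigl(F^{-1+}_{S^l}(0),\, F^{-1}_{S^l}(1)\bigr)$.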
So far, we considered the case that all E [Xi|Λ] are non-decreasing func-
tions of Λ. The case where all E [Xi|Λ] are non-increasing and continuous
functions of Λ also leads to a comonotonic vector (E [X1|Λ] , . . . , E [Xn|Λ]),
and can be treated in a similar way.
In case the random variables E [Xi|Λ] are neither all non-decreasing nor all
non-increasing continuous functions of Λ, the lower bound Sl is not comonotonic
anymore, and its stop-loss premiums can be determined as follows:
$$\pi^{lb}(S,d,\Lambda) = \int_{-\infty}^{+\infty} \Bigl(\sum_{i=1}^{n} E[X_i|\Lambda = \lambda] - d\Bigr)_+ dF_{\Lambda}(\lambda).$$
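
When the E[Xi|Λ] are comonotonic, the integral formula above and the decomposition (2.14) must give the same premium. The sketch below checks this numerically in a lognormal setting. It is an illustration only, under assumptions introduced here: Xi = exp(Zi) with Zi normal, a normal conditioning variable Λ, hypothetical values for (μi, σi, ri) with ri = corr(Zi, Λ) > 0, and the standard fact that E[Xi|Λ] = exp(μi + ri σi w + ½(1 − ri²)σi²), where w is the standardized value of Λ.

```python
import math
from statistics import NormalDist

PHI = NormalDist()

# Hypothetical terms (mu_i, sigma_i, r_i), r_i = corr(Z_i, Lambda) > 0.
terms = [(0.0, 0.3, 0.74), (0.0, 0.5, 0.91)]

def cond_mean(w, mu, sigma, r):
    """E[X_i | Lambda] as a function of the standardized value w of Lambda."""
    return math.exp(mu + r * sigma * w + 0.5 * (1.0 - r * r) * sigma ** 2)

def lower_sum(w):
    """S^l = sum_i E[X_i | Lambda], as a function of w."""
    return sum(cond_mean(w, *t) for t in terms)

def stop_loss_by_integration(d, lo=-10.0, hi=10.0, n=20_000):
    """Integral formula: (sum_i E[X_i | Lambda] - d)_+ integrated against dF_Lambda."""
    h = (hi - lo) / n
    total = 0.0
    for k in range(n):
        w = lo + (k + 0.5) * h
        total += max(lower_sum(w) - d, 0.0) * PHI.pdf(w)
    return h * total

def lognormal_stop_loss(d, m, s):
    """E[(Y - d)_+] for Y lognormal(m, s^2) and retention d > 0."""
    d1 = (m + s ** 2 - math.log(d)) / s
    return math.exp(m + 0.5 * s ** 2) * PHI.cdf(d1) - d * PHI.cdf(d1 - s)

def stop_loss_comonotonic(d):
    """(2.14): solve sum_i E[X_i | w*] = d, then add the individual premiums."""
    lo, hi = -10.0, 10.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if lower_sum(mid) <= d:
            lo = mid
        else:
            hi = mid
    w_star = 0.5 * (lo + hi)
    total = 0.0
    for mu, sigma, r in terms:
        m = mu + 0.5 * (1.0 - r * r) * sigma ** 2   # E[X_i|Lambda] ~ lognormal(m, (r sigma)^2)
        total += lognormal_stop_loss(cond_mean(w_star, mu, sigma, r), m, r * sigma)
    return total

d = 2.5   # hypothetical retention
print(stop_loss_by_integration(d), stop_loss_comonotonic(d))
```

The integral version keeps working when the monotonicity assumption fails, at the price of a numerical integration for every retention.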
2.2.4 Moments based approximations
The lower and upper bounds can be considered as approximations for the
distribution of a sum S of random variables. On the other hand, any
convex combination of the stop-loss premiums of the lower bound Sl and
the upper bounds Sc or Su could also serve as an approximation for the
stop-loss premium of S. Since the bounds Sl and Sc have the same mean
as S, any random variable Sm defined by its stop-loss premiums
πm(S, d,Λ) = zπlb(S, d,Λ) + (1 − z)πcub(S, d), 0 ≤ z ≤ 1,
will also have the same mean as S. By taking the (right-hand) derivative
we find
FSm(x) = zFSl(x) + (1 − z)FSc(x), 0 ≤ z ≤ 1,
so the distribution function of the approximation can be calculated fairly
easily. By choosing the optimal weight z, we want Sm to be as close as
possible to S. In Vyncke et al. (2004) z is chosen as
$$z = \frac{\mathrm{Var}[S^c] - \mathrm{Var}[S]}{\mathrm{Var}[S^c] - \mathrm{Var}[S^l]}. \tag{2.15}$$
This choice does not depend on the retention and it leads to equal variances
Var[Sm] = Var[S].
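
The equality of variances can be traced back to relation (1.14). The following short derivation is a sketch that uses only (1.14), the definition of πm and the fact that Sm, Sl and Sc all have mean E[S]:

```latex
\begin{aligned}
\tfrac{1}{2}\,\mathrm{Var}[S^m]
  &= \int_{-\infty}^{+\infty} \Bigl(\pi^{m}(S,t,\Lambda) - \bigl(E[S]-t\bigr)_+\Bigr)\,dt \\
  &= z \int_{-\infty}^{+\infty} \Bigl(\pi^{lb}(S,t,\Lambda) - \bigl(E[S]-t\bigr)_+\Bigr)\,dt
   + (1-z) \int_{-\infty}^{+\infty} \Bigl(\pi^{cub}(S,t) - \bigl(E[S]-t\bigr)_+\Bigr)\,dt \\
  &= \tfrac{1}{2}\Bigl(z\,\mathrm{Var}[S^l] + (1-z)\,\mathrm{Var}[S^c]\Bigr).
\end{aligned}
```

Requiring the last expression to equal ½Var[S] and solving for z gives z = (Var[Sc] − Var[S]) / (Var[Sc] − Var[Sl]), which is the weight in (2.15).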
As an alternative one could consider the improved upper bound Su and
define a second approximation as follows
πm2(S, d,Λ) = zπlb(S, d,Λ) + (1 − z)πicub(S, d,Λ),
now with
$$z = \frac{\mathrm{Var}[S^u] - \mathrm{Var}[S]}{\mathrm{Var}[S^u] - \mathrm{Var}[S^l]}.$$
2.3 Upper bounds for stop-loss premiums
One of the most important tasks of actuaries is to assess the degree of dan-
gerousness of a risk X — either by finding the (approximate) distribution
or at least by summarizing its properties quantitatively by means of risk
measures to determine an insurance premium or a sufficient reserve with
solvency margin.
A stop-loss premium π(X, d) = E[(X−d)+] = E[max(0, X−d)] is one of
the most important risk measures. The retention d is usually interpreted as
an amount retained by an insured (or an insurer) while an amount X − d
is ceded to an insurer (or a reinsurer). In this case π(X, d) has a clear
interpretation as a pure insurance (reinsurance) premium.
Another practical application of stop-loss premiums is the following:
Suppose that a financial institution faces a risk X to which a capital K is
allocated. Then the residual risk R = (X−K)+ is a quantity of concern to
the society and regulators. Indeed, it represents the pessimistic case when
the random loss X exceeds the available capital. The value E[R] is often
referred to as the “expected shortfall” as explained in Subsection 1.1.2,
with K a VaR at some level.
It is not always straightforward to compute stop-loss premiums. In
the actuarial literature a lot of attention has been devoted to determining
bounds for stop-loss premiums in case only partial information about the
claim size distribution is available (e.g. De Vylder & Goovaerts (1982),
Jansen et al. (1986), Hürlimann (1996, 1998), among others).
Other types of problems appear in the case of sums of random vari-
ables S = X1+· · ·+Xn when full information about marginal distributions
is available but the dependency structure is not known. In the previous
section it is explained how the upper bound Sc of the sum S in the so-called
convex order sense can be calculated by replacing the unknown joint
distribution of the random vector (X1, X2, . . . , Xn) by the most dangerous
comonotonic joint distribution. One can also obtain a lower bound Sl
through conditioning. Such an approach allows one to determine analytical
bounds for stop-loss premiums πlb(S, d,Λ) ≤ π(S, d) ≤ πcub(S, d).
In practical applications the comonotonic upper bound seems to be
useful only in the case of a very strong dependency between successive
summands. Even then the bounds for stop-loss premiums provided by
the comonotonic approximation are often not satisfactory. In this section
we present a number of techniques which allow one to determine much
sharper upper bounds for stop-loss premiums. To this end, we use on the
one hand the method of conditioning as in Curran (1994) and in Rogers &
Shi (1995), and on the other hand the upper and lower bounds for
stop-loss premiums of sums of dependent random variables as explained in the
previous subsection.
2.3.1 Upper bounds based on lower bound plus error term
Following the ideas of Rogers and Shi (1995), we derive an upper bound
based on the lower bound Sl.
Lemma 5.
For any random variable X we have the following inequality
$$ E[X_+] \le E[X]_+ + \tfrac{1}{2}\,\mathrm{Var}^{1/2}(X). \tag{2.16} $$
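Proof. Define $X_+^-$ as follows
$$ X_+^- := \max(-X, 0) = (-X)_+ = -\min(X, 0). $$
Using Jensen’s inequality twice we have
$$
\begin{aligned}
0 \le E[X_+] - E[X]_+
&= \tfrac{1}{2}\big\{(E[X_+] - E[X]_+) + (E[X_+^-] - E[X]_+^-)\big\} \\
&= \tfrac{1}{2}\big\{E[X_+ + X_+^-] - E[X]_+ - E[X]_+^-\big\} \\
&= \tfrac{1}{2}\big\{E[|X|] - |E[X]|\big\} \\
&\le \tfrac{1}{2}\,E\big[|X - E[X]|\big] \\
&\le \tfrac{1}{2}\,\mathrm{Var}^{1/2}(X).
\end{aligned}
$$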
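The inequality of Lemma 5 can be checked numerically on a discrete distribution. The sketch below uses a hypothetical helper `lemma5_sides`; the symmetric two-point example is chosen because both sides coincide there, which suggests that the constant 1/2 cannot be improved.

```python
from math import sqrt


def lemma5_sides(values, probs):
    """Return (E[X_+], E[X]_+ + 0.5*sd(X)) for a discrete random variable X."""
    mean = sum(p * x for x, p in zip(values, probs))
    var = sum(p * (x - mean) ** 2 for x, p in zip(values, probs))
    lhs = sum(p * max(x, 0.0) for x, p in zip(values, probs))
    rhs = max(mean, 0.0) + 0.5 * sqrt(var)
    return lhs, rhs


# Symmetric two-point variable: X = -1 or +1 with probability 1/2 each.
lhs_sym, rhs_sym = lemma5_sides([-1.0, 1.0], [0.5, 0.5])

# A skewed three-point variable, where the inequality is strict.
lhs_skew, rhs_skew = lemma5_sides([-2.0, 0.0, 3.0], [0.2, 0.5, 0.3])
```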
Applying Lemma 5 conditionally on Z, we find for any random variables Y and Z:
$$ 0 \le E\big[E[Y_+ \mid Z] - E[Y \mid Z]_+\big] \le \tfrac{1}{2}\,E\Big[\sqrt{\mathrm{Var}[Y \mid Z]}\Big]. \tag{2.17} $$
Taking Y equal to S − d and Z equal to our conditioning variable Λ, we
obtain an error bound
$$ 0 \le E\big[E[(S - d)_+ \mid \Lambda] - (S^l - d)_+\big] \le \tfrac{1}{2}\,E\Big[\sqrt{\mathrm{Var}[S \mid \Lambda]}\Big], \tag{2.18} $$
which is only useful if the retention d is strictly positive.
Consequently, we find as upper bound for the stop-loss premium of S
π(S, d) ≤ πeub(S, d,Λ), (2.19)
with πeub(S, d,Λ) given by
$$ \pi^{eub}(S, d, \Lambda) = \pi^{lb}(S, d, \Lambda) + \tfrac{1}{2}\,E\Big[\sqrt{\mathrm{Var}[S \mid \Lambda]}\Big]. \tag{2.20} $$
The second term on the right hand side takes the form
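$$
\begin{aligned}
E\Big[\sqrt{\mathrm{Var}[S \mid \Lambda]}\Big]
&= E\Big[\big(E[S^2 \mid \Lambda] - (E[S \mid \Lambda])^2\big)^{1/2}\Big] \\
&= E\Big[\Big(\sum_{i=1}^{n}\sum_{j=1}^{n} E[X_i X_j \mid \Lambda] - (S^l)^2\Big)^{1/2}\Big],
\end{aligned}
\tag{2.21}
$$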
and once the distributions of Xi and Λ are specified and known, it can be
written out more explicitly.
2.3.2 Bounds by conditioning through decomposition of the stop-loss premium
Decomposition of the stop-loss premium
In this part we show how to improve the bounds introduced in Section
2.2 and Subsection 2.3.1. By conditioning S on some random variable Λ,
the stop-loss premium can be decomposed in two parts, one of which can
either be computed exactly or by using numerical integration, depending
on the distribution of the underlying random variable. For the remaining
part we first derive a lower and an upper bound based on comonotonic
risks, and another upper bound equal to that lower bound plus an error
term. This idea of decomposition goes back at least to Curran (1994).
By the tower property for conditional expectations the stop-loss premium
π(S, d) with $S = \sum_{i=1}^{n} X_i$ equals
$$ E\big[E[(S - d)_+ \mid \Lambda]\big], $$
for every conditioning variable Λ, say with cdf FΛ.
If in addition there exists a dΛ such that Λ ≥ dΛ implies that S ≥ d,
we can decompose the stop-loss premium of S as follows
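$$
\begin{aligned}
\pi(S, d) &= \int_{-\infty}^{d_\Lambda} E[(S - d)_+ \mid \Lambda = \lambda]\,dF_\Lambda(\lambda) + \int_{d_\Lambda}^{+\infty} E[S - d \mid \Lambda = \lambda]\,dF_\Lambda(\lambda) \\
&=: I_1 + I_2.
\end{aligned}
\tag{2.22}
$$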
Notice that the other case (Λ ≤ dΛ implies that S ≥ d) can be treated
in a similar way with the appropriate integration bounds. In practical
applications the existence of such a dΛ depends on the actual form of S
and Λ = λ.
The second integral can further be simplified to
$$ I_2 = \int_{d_\Lambda}^{+\infty} \sum_{i=1}^{n} E[X_i \mid \Lambda = \lambda]\,dF_\Lambda(\lambda) - d\,\big(1 - F_\Lambda(d_\Lambda)\big), \tag{2.23} $$
and can be written out explicitly if the bivariate distribution of (Xi,Λ) is
known for all i.
Deriving bounds for the first part I1 in decomposition (2.22) and adding
the exact part (2.23) gives us the bounds for the stop-loss premium.
Lower bound
By means of Jensen’s inequality, the first integral I1 of (2.22) can be
bounded below:
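$$ I_1 \ge \int_{-\infty}^{d_\Lambda} \big(E[S \mid \Lambda = \lambda] - d\big)_+\,dF_\Lambda(\lambda) = \int_{-\infty}^{d_\Lambda} \Big(\sum_{i=1}^{n} E[X_i \mid \Lambda = \lambda] - d\Big)_+\,dF_\Lambda(\lambda). \tag{2.24} $$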
By adding the exact part (2.23) and introducing notation (2.10), we end
up with the inequality of Section 2.2.3:
π(S, d) ≥ πlb(S, d,Λ).
When Sl is a sum of n comonotonic risks we can apply (2.14) which holds
even when we do not know or find a dΛ.
When Sl is not comonotonic we use the decomposition
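$$
\begin{aligned}
\pi^{lb}(S, d, \Lambda) ={}& \int_{-\infty}^{d_\Lambda} \Big(\sum_{i=1}^{n} E[X_i \mid \Lambda = \lambda] - d\Big)_+\,dF_\Lambda(\lambda) \\
&+ \int_{d_\Lambda}^{+\infty} \sum_{i=1}^{n} E[X_i \mid \Lambda = \lambda]\,dF_\Lambda(\lambda) - d\,\big(1 - F_\Lambda(d_\Lambda)\big).
\end{aligned}
$$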
Upper bound based on lower bound
In this part we improve the bound (2.19) by applying (2.17) to (2.24):
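$$
\begin{aligned}
0 &\le E\big[E[(S - d)_+ \mid \Lambda] - (S^l - d)_+\big] \\
&= \int_{-\infty}^{d_\Lambda} \Big(E[(S - d)_+ \mid \Lambda = \lambda] - \big(E[S \mid \Lambda = \lambda] - d\big)_+\Big)\,dF_\Lambda(\lambda) \\
&\le \tfrac{1}{2}\int_{-\infty}^{d_\Lambda} \big(\mathrm{Var}[S \mid \Lambda = \lambda]\big)^{1/2}\,dF_\Lambda(\lambda) \qquad (2.25) \\
&\le \tfrac{1}{2}\,\big(E[\mathrm{Var}[S \mid \Lambda]\,I_{(\Lambda < d_\Lambda)}]\big)^{1/2}\,\big(E[I_{(\Lambda < d_\Lambda)}]\big)^{1/2} =: \varepsilon(d_\Lambda), \qquad (2.26)
\end{aligned}
$$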
where Hölder’s inequality has been applied in the last inequality. We will
denote this upper bound by πdeub(S, d,Λ). So we have that
πdeub(S, d,Λ) = πlb(S, d,Λ) + ε(dΛ). (2.27)
We remark that the error bound (2.18), and hence also the upper bound
πeub(S, d,Λ), is independent of dΛ and corresponds to the limiting case
of (2.25) where dΛ equals infinity. Obviously, the error bound (2.25)
improves the error bound (2.18). In practical applications, the additional
error introduced by Hölder’s inequality turns out to be much smaller than
the difference $\tfrac{1}{2}E\big[\sqrt{\mathrm{Var}[S \mid \Lambda]}\big] - \varepsilon(d_\Lambda)$.
2.3.3 Partially exact/comonotonic upper bound
We bound the first term I1 of (2.22) above by replacing S|Λ = λ by its
comonotonic upper bound Su (in convex order sense):
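$$ \int_{-\infty}^{d_\Lambda} E[(S - d)_+ \mid \Lambda = \lambda]\,dF_\Lambda(\lambda) \le \int_{-\infty}^{d_\Lambda} E[(S^u - d)_+ \mid \Lambda = \lambda]\,dF_\Lambda(\lambda). \tag{2.28} $$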
Adding (2.28) to the exact part (2.23) of the decomposition (2.22) results
in the so-called partially exact/comonotonic upper bound for a stop-loss
premium. We will use the notation πpecub(S, d,Λ) to indicate this upper
bound.
It is easily seen that
πpecub(S, d,Λ) ≤ πicub(S, d,Λ),
while for two distinct conditioning variables Λ1 and Λ2 it does not
necessarily hold that
πpecub(S, d,Λ1) ≤ πicub(S, d,Λ2).
2.3.4 The case of a sum of lognormal random variables
We show how to apply our results to the case of sums of lognormally
distributed random variables. Such sums are widely encountered in practice,
both in actuarial science and in finance. Typical examples are present val-
ues of future cash flows with stochastic (Gaussian) returns (see Dhaene et
al. (2002b)), Asian options (see e.g. Simon et al. (2000), Vanmaele et al.
(2004b) and Albrecher et al. (2005)) and basket options (see Deelstra et
al. (2004) and Vanmaele et al. (2004a)).
We assume that $X_i = \alpha_i e^{Z_i}$ with $Z_i \sim N(E[Z_i], \sigma_{Z_i}^2)$ and $\alpha_i \in \mathbb{R}$. We
develop the expressions for the lower and upper bounds for the following
sum S
$$ S = \sum_{i=1}^{n} X_i = \sum_{i=1}^{n} \alpha_i e^{Z_i}. \tag{2.29} $$
In this case the stop-loss premium π(Xi, di) with some retention di is well-
known from the following lemma.
Lemma 6 (Stop-loss premium of lognormal random variable).
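Let X be a lognormal random variable of the form $\alpha e^{Z}$ with $Z \sim N(E[Z], \sigma_Z^2)$
and $\alpha \in \mathbb{R}$. Then the stop-loss premium with retention d equals for αd > 0
$$ \pi(X, d) = \mathrm{sign}(\alpha)\,e^{\mu + \frac{\sigma^2}{2}}\,\Phi\big(\mathrm{sign}(\alpha)\,b_1\big) - d\,\Phi\big(\mathrm{sign}(\alpha)\,b_2\big), \tag{2.30} $$
where
$$ \mu = \ln|\alpha| + E[Z], \qquad \sigma = \sigma_Z, \qquad b_1 = \frac{\mu + \sigma^2 - \ln|d|}{\sigma}, \qquad b_2 = b_1 - \sigma. \tag{2.31} $$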
The case αd < 0 is trivial.
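Formula (2.30) translates directly into a few lines of code. The sketch below is illustrative only: the function name `lognormal_stop_loss` and the parameter values are assumptions, and Φ is taken from the Python standard library.

```python
from math import exp, log
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal cdf


def lognormal_stop_loss(alpha, mean_z, sigma_z, d):
    """pi(X, d) for X = alpha*exp(Z), Z ~ N(mean_z, sigma_z^2), when alpha*d > 0."""
    if alpha * d <= 0:
        raise ValueError("formula (2.30) requires alpha * d > 0")
    sign = 1.0 if alpha > 0 else -1.0
    mu = log(abs(alpha)) + mean_z
    b1 = (mu + sigma_z ** 2 - log(abs(d))) / sigma_z
    b2 = b1 - sigma_z
    return sign * exp(mu + 0.5 * sigma_z ** 2) * PHI(sign * b1) - d * PHI(sign * b2)


# Illustrative parameters: X = exp(Z) with Z ~ N(0, 0.25^2) and retention d = 1.
premium = lognormal_stop_loss(alpha=1.0, mean_z=0.0, sigma_z=0.25, d=1.0)
```

With a negative α and a negative d the same function returns the expected value of (|d| − |X|)+, since X − d = |d| − |X| in that case.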
We now consider a normally distributed random variable Λ. The following
results are analogous to Theorem 1 in Dhaene et al. (2002b).
Theorem 10 (Bounds for a sum of lognormal random variables).
Let S be given by (2.29) and consider a normally distributed random
variable Λ which is such that (Zi,Λ) is bivariate normally distributed for all
i. Then the distributions of the lower bound $S^l$, the improved comonotonic
upper bound $S^u$ and the comonotonic upper bound $S^c$ are given by
$$ S^l = \sum_{i=1}^{n} \alpha_i \exp\Big(E[Z_i] + r_i\,\sigma_{Z_i}\,\Phi^{-1}(V) + \tfrac{1}{2}(1 - r_i^2)\,\sigma_{Z_i}^2\Big), \tag{2.32} $$
$$ S^u = \sum_{i=1}^{n} \alpha_i \exp\Big(E[Z_i] + r_i\,\sigma_{Z_i}\,\Phi^{-1}(V) + \mathrm{sign}(\alpha_i)\sqrt{1 - r_i^2}\,\sigma_{Z_i}\,\Phi^{-1}(U)\Big), \tag{2.33} $$
$$ S^c = \sum_{i=1}^{n} \alpha_i \exp\Big(E[Z_i] + \mathrm{sign}(\alpha_i)\,\sigma_{Z_i}\,\Phi^{-1}(U)\Big), \tag{2.34} $$
where U and $V = \Phi\big(\frac{\Lambda - E[\Lambda]}{\sigma_\Lambda}\big)$ are mutually independent U(0,1) random
variables, and $r_i$, i = 1, . . . , n, are correlations defined by
$$ r_i = \mathrm{Corr}(Z_i, \Lambda) = \frac{\mathrm{Cov}[Z_i, \Lambda]}{\sigma_{Z_i}\,\sigma_\Lambda}. $$
If, for all i, $\mathrm{sign}(\alpha_i) = \mathrm{sign}(r_i)$, or, for all i, $\mathrm{sign}(\alpha_i) = -\mathrm{sign}(r_i)$ with
$r_i \neq 0$, then $S^l$ is comonotonic.
Proof. See Dhaene et al. (2002b).
Comonotonic upper bound
The quantile function of Sc results from (1.20) in Theorem 7 and is given
by
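$$ F^{-1}_{S^c}(p) = \sum_{i=1}^{n} \alpha_i \exp\Big(E[Z_i] + \mathrm{sign}(\alpha_i)\,\sigma_{Z_i}\,\Phi^{-1}(p)\Big), \qquad p \in (0, 1). \tag{2.35} $$
Since the cdf’s $F_{X_i}$ are strictly increasing and continuous, it follows from
(2.6) and (2.34) that for $x \in \big(F^{-1+}_{S^c}(0), F^{-1}_{S^c}(1)\big)$, the cdf of the comonotonic
sum $F_{S^c}(x)$ can be found by solving
$$ \sum_{i=1}^{n} \alpha_i \exp\Big(E[Z_i] + \mathrm{sign}(\alpha_i)\,\sigma_{Z_i}\,\Phi^{-1}\big(F_{S^c}(x)\big)\Big) = x. $$
Combination of Theorem 9 and Lemma 6 yields the following expression
for the stop-loss premium of $S^c$ at retention d with $F^{-1+}_{S^c}(0) < d < F^{-1}_{S^c}(1)$:
$$ \pi^{cub}(S, d) = \sum_{i=1}^{n} \alpha_i\,e^{E[Z_i] + \frac{\sigma_{Z_i}^2}{2}}\,\Phi\Big[\mathrm{sign}(\alpha_i)\,\sigma_{Z_i} - \Phi^{-1}\big(F_{S^c}(d)\big)\Big] - d\,\big(1 - F_{S^c}(d)\big). $$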
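The two steps above (solve for the cdf of the comonotonic sum at d, then evaluate the closed-form premium) can be sketched as follows. This is a minimal illustration restricted to the assumption that all αi are strictly positive; the function name, the bisection bracket and the parameter values are hypothetical choices.

```python
from math import exp
from statistics import NormalDist

N01 = NormalDist()  # standard normal; N01.cdf is Phi


def comonotonic_stop_loss(d, alphas, means, sigmas):
    """Stop-loss premium of S^c at retention d, assuming every alpha_i > 0."""
    def quantile_at(z):
        # F^{-1}_{S^c}(Phi(z)) from (2.35), written in terms of z = Phi^{-1}(p)
        return sum(a * exp(m + s * z) for a, m, s in zip(alphas, means, sigmas))

    lo, hi = -10.0, 10.0  # assumed to bracket Phi^{-1}(F_{S^c}(d))
    for _ in range(100):  # bisection: quantile_at is increasing in z
        mid = 0.5 * (lo + hi)
        if quantile_at(mid) < d:
            lo = mid
        else:
            hi = mid
    z_d = 0.5 * (lo + hi)
    p_d = N01.cdf(z_d)  # F_{S^c}(d)
    expected = sum(a * exp(m + 0.5 * s * s) * N01.cdf(s - z_d)
                   for a, m, s in zip(alphas, means, sigmas))
    return expected - d * (1.0 - p_d)


# Two identical lognormal terms: S^c = 2*exp(Z), so retention 2 on S^c
# corresponds to retention 1 on a single term.
premium = comonotonic_stop_loss(2.0, [1.0, 1.0], [0.0, 0.0], [0.25, 0.25])
single = comonotonic_stop_loss(1.0, [1.0], [0.0], [0.25])
```

Bisecting on z rather than on p avoids evaluating the inverse normal cdf near 0 or 1.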
Improved comonotonic upper bound
We now determine the cdf of $S^u$ and the stop-loss premium $\pi^{icub}(S, d, \Lambda)$,
where we condition on a normally distributed random variable Λ or
equivalently on the U(0, 1) random variable introduced in Theorem 10:
$$ V = \Phi\Big(\frac{\Lambda - E[\Lambda]}{\sigma_\Lambda}\Big). $$
The conditional probability $F_{S^u \mid V = v}(x)$, also denoted by $F_{S^u}(x \mid V = v)$,
is the cdf of a sum of n comonotonic random variables and follows for
$F^{-1+}_{S^u \mid V = v}(0) < x < F^{-1}_{S^u \mid V = v}(1)$, according to (2.9) and (2.33), implicitly
from:
$$ \sum_{i=1}^{n} \alpha_i \exp\Big(E[Z_i] + r_i\,\sigma_{Z_i}\,\Phi^{-1}(v) + \mathrm{sign}(\alpha_i)\sqrt{1 - r_i^2}\,\sigma_{Z_i}\,\Phi^{-1}\big(F_{S^u}(x \mid V = v)\big)\Big) = x. \tag{2.36} $$
The cdf of $S^u$ is then given by
$$ F_{S^u}(x) = \int_0^1 F_{S^u \mid V = v}(x)\,dv. $$
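We now look for an expression for the stop-loss premium at retention d
with $F^{-1+}_{S^u \mid V = v}(0) < d < F^{-1}_{S^u \mid V = v}(1)$ for $S^u$:
$$
\begin{aligned}
\pi^{icub}(S, d, \Lambda) &= \int_0^1 E\big[(S^u - d)_+ \mid V = v\big]\,dv \\
&= \sum_{i=1}^{n} \int_0^1 E\Big[\big(F^{-1}_{X_i \mid \Lambda}(U \mid V = v) - d_i\big)_+\Big]\,dv
\end{aligned}
$$
with $d_i = F^{-1}_{X_i \mid \Lambda}\big(F_{S^u}(d \mid V = v) \mid V = v\big)$ and with U a random variable
which is uniformly distributed on (0, 1). Since $\mathrm{sign}(\alpha_i)\,F^{-1}_{X_i \mid \Lambda}(U \mid V = v)$
follows a lognormal distribution whose underlying normal distribution has
mean and standard deviation
$$ \mu_v(i) = \ln|\alpha_i| + E[Z_i] + r_i\,\sigma_{Z_i}\,\Phi^{-1}(v), \qquad \sigma_v(i) = \sqrt{1 - r_i^2}\,\sigma_{Z_i}, $$
one obtains that
$$ d_i = \alpha_i \exp\Big[E[Z_i] + r_i\,\sigma_{Z_i}\,\Phi^{-1}(v) + \mathrm{sign}(\alpha_i)\sqrt{1 - r_i^2}\,\sigma_{Z_i}\,\Phi^{-1}\big(F_{S^u \mid V = v}(d)\big)\Big]. $$
Formula (2.30) then yields
$$ E\big[(S^u - d)_+ \mid V = v\big] = \sum_{i=1}^{n} \Big[\mathrm{sign}(\alpha_i)\,e^{\mu_v(i) + \frac{\sigma_v^2(i)}{2}}\,\Phi\big(\mathrm{sign}(\alpha_i)\,b_{i,1}\big) - d_i\,\Phi\big(\mathrm{sign}(\alpha_i)\,b_{i,2}\big)\Big], $$
with, according to (2.31),
$$ b_{i,1} = \frac{\mu_v(i) + \sigma_v^2(i) - \ln|d_i|}{\sigma_v(i)}, \qquad b_{i,2} = b_{i,1} - \sigma_v(i). $$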
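Substitution of the corresponding expressions and integration over the
interval [0, 1] leads to the following result
$$
\begin{aligned}
\pi^{icub}(S, d, \Lambda) ={}& \sum_{i=1}^{n} \alpha_i\,e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2(1 - r_i^2)} \int_0^1 e^{r_i\,\sigma_{Z_i}\,\Phi^{-1}(v)}\,\Phi\Big(\mathrm{sign}(\alpha_i)\sqrt{1 - r_i^2}\,\sigma_{Z_i} - \Phi^{-1}\big(F_{S^u \mid V = v}(d)\big)\Big)\,dv \\
&- d\,\big(1 - F_{S^u}(d)\big).
\end{aligned}
\tag{2.37}
$$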
Lower bound
In this subsection, we study the case that, for all i, sign(αi) = sign(ri)
when ri ≠ 0. For simplicity we take all αi ≥ 0 and assume that the
conditioning variable Λ is normally distributed and has the right sign such
that the correlation coefficients ri are all positive. These conditions ensure
that Sl is the sum of n comonotonic random variables. The case that, for
all i, sign(αi) = −sign(ri) when ri ≠ 0 can be dealt with in an analogous
way.
The quantile function of Sl results from (1.20) in Theorem 7 and is given
by
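$$ F^{-1}_{S^l}(p) = \sum_{i=1}^{n} \alpha_i \exp\Big(E[Z_i] + r_i\,\sigma_{Z_i}\,\Phi^{-1}(p) + \tfrac{1}{2}(1 - r_i^2)\,\sigma_{Z_i}^2\Big), \qquad p \in (0, 1). \tag{2.38} $$
Since by our assumptions E[Xi|Λ] is increasing, we can obtain $F_{S^l}(x)$
according to (2.13) and (2.32) from
$$ \sum_{i=1}^{n} \alpha_i \exp\Big(E[Z_i] + r_i\,\sigma_{Z_i}\,\Phi^{-1}\big(F_{S^l}(x)\big) + \tfrac{1}{2}(1 - r_i^2)\,\sigma_{Z_i}^2\Big) = x. \tag{2.39} $$
Moreover as $S^l$ is the sum of n lognormally distributed random variables,
the stop-loss premium at retention d (> 0) can be expressed explicitly by
invoking Theorem 9 and Lemma 6:
$$ \pi^{lb}(S, d, \Lambda) = \sum_{i=1}^{n} \alpha_i\,e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2}\,\Phi\Big[r_i\,\sigma_{Z_i} - \Phi^{-1}\big(F_{S^l}(d)\big)\Big] - d\,\big(1 - F_{S^l}(d)\big). \tag{2.40} $$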
Upper bound based on lower bound
From (2.21) we obtain that
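$$ E\Big[\sqrt{\mathrm{Var}[S \mid \Lambda]}\Big] = \int_{-\infty}^{+\infty} \Big\{\sum_{i=1}^{n}\sum_{j=1}^{n} E[X_i X_j \mid \Lambda = \lambda] - \big(E[S \mid \Lambda = \lambda]\big)^2\Big\}^{1/2}\,dF_\Lambda(\lambda). \tag{2.41} $$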
Now consider the first term in the right hand side of (2.41). Because of
the properties of lognormally distributed random variables, the product of
lognormals is again lognormal if the underlying vector is multivariate nor-
mal distributed, and conditioning a lognormal variate on a normal variate
yields a lognormally distributed variable.
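We can proceed by denoting $Z_{ij} = Z_i + Z_j$ with $E[Z_{ij}] = E[Z_i] + E[Z_j]$
and
$$ \sigma_{Z_{ij}}^2 = \sigma_{Z_i}^2 + \sigma_{Z_j}^2 + 2\,\sigma_{Z_i Z_j}, $$
where $\sigma_{Z_i Z_j} := \mathrm{Cov}[Z_i, Z_j]$. Note that
$$ r_{ij} = \frac{\mathrm{Cov}[Z_{ij}, \Lambda]}{\sigma_{Z_{ij}}\,\sigma_\Lambda} = \frac{\mathrm{Cov}[Z_i, \Lambda]}{\sigma_{Z_{ij}}\,\sigma_\Lambda} + \frac{\mathrm{Cov}[Z_j, \Lambda]}{\sigma_{Z_{ij}}\,\sigma_\Lambda} = \frac{\sigma_{Z_i}}{\sigma_{Z_{ij}}}\,r_i + \frac{\sigma_{Z_j}}{\sigma_{Z_{ij}}}\,r_j. $$
Conditionally, given Λ = λ, the random variable $Z_{ij}$ is normally
distributed with parameters $\mu(i, j) = E[Z_{ij}] + r_{ij}\,\frac{\sigma_{Z_{ij}}}{\sigma_\Lambda}\,(\lambda - E[\Lambda])$ and
$\sigma^2(i, j) = (1 - r_{ij}^2)\,\sigma_{Z_{ij}}^2$. Hence, conditionally, given Λ = λ, the random variable
$e^{Z_{ij}}$ is lognormally distributed with parameters µ(i, j) and σ2(i, j). As
$E[e^{Z_{ij}} \mid \Lambda = \lambda] = e^{\mu(i, j) + \frac{1}{2}\sigma^2(i, j)}$, we find
$$ E\big[e^{Z_{ij}} \mid \Lambda\big] = \exp\Big(E[Z_{ij}] + r_{ij}\,\sigma_{Z_{ij}}\,\Phi^{-1}(V) + \tfrac{1}{2}(1 - r_{ij}^2)\,\sigma_{Z_{ij}}^2\Big), $$
where the random variable $V = \Phi\big(\frac{\Lambda - E[\Lambda]}{\sigma_\Lambda}\big)$ is uniformly distributed on
the interval (0, 1).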
Thus, the first term in (2.41) equals
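$$ \sum_{i=1}^{n}\sum_{j=1}^{n} E[X_i X_j \mid \Lambda] = \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i \alpha_j \exp\Big(E[Z_{ij}] + r_{ij}\,\sigma_{Z_{ij}}\,\Phi^{-1}(V) + \tfrac{1}{2}(1 - r_{ij}^2)\,\sigma_{Z_{ij}}^2\Big), \tag{2.42} $$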
while the second term consists of (2.32). Hence (2.41) can be written out
explicitly and by using (2.20), we have that the upper bound (2.19) is given
by
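$$
\begin{aligned}
\pi^{eub}(S, d, \Lambda) ={}& \sum_{i=1}^{n} \alpha_i\,e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2}\,\Phi\Big[r_i\,\sigma_{Z_i} - \Phi^{-1}\big(F_{S^l}(d)\big)\Big] - d\,\big(1 - F_{S^l}(d)\big) \\
&+ \frac{1}{2}\int_0^1 \Big\{\sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i \alpha_j\,e^{E[Z_{ij}] + r_{ij}\,\sigma_{Z_{ij}}\,\Phi^{-1}(v) + \frac{1}{2}(1 - r_{ij}^2)\,\sigma_{Z_{ij}}^2} \\
&\qquad\qquad - \Big(\sum_{i=1}^{n} \alpha_i\,e^{E[Z_i] + r_i\,\sigma_{Z_i}\,\Phi^{-1}(v) + \frac{1}{2}(1 - r_i^2)\,\sigma_{Z_i}^2}\Big)^2\Big\}^{1/2}\,dv.
\end{aligned}
$$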
Bounds by conditioning through decomposition of stop-loss premium
In this part we apply the theory of Subsection 2.3.2 to the sum of lognor-
mal random variables (2.29). We give here the analytical expressions for
the two upper bounds πdeub(S, d,Λ) and πpecub(S, d,Λ). For more details
concerning the calculation of the bounds the reader is referred to the last
section of this chapter.
The following auxiliary result is needed in order to write out the bounds
explicitly.
Lemma 7.
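For any constant $a \in \mathbb{R}$ and any normally distributed random variable Λ
$$ \int_{-\infty}^{d_\Lambda} e^{a\,\Phi^{-1}(v)}\,dF_\Lambda(\lambda) = e^{\frac{a^2}{2}}\,\Phi(d_\Lambda^* - a), \tag{2.43} $$
where $d_\Lambda^* = \frac{d_\Lambda - E[\Lambda]}{\sigma_\Lambda}$ and $\Phi^{-1}(v) = \frac{\lambda - E[\Lambda]}{\sigma_\Lambda}$.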
Lower bound
Note that the lower bound via the decomposition equals the lower bound
without the decomposition. So the lower bound in the lognormal and
comonotonic case is given by expression (2.40).
Upper bound based on lower bound
The upper bound (2.27) can be written out explicitly as follows
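$$
\begin{aligned}
\pi^{deub}(S, d, \Lambda) ={}& \sum_{i=1}^{n} \alpha_i\,e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2}\,\Phi\Big[r_i\,\sigma_{Z_i} - \Phi^{-1}\big(F_{S^l}(d)\big)\Big] - d\,\big(1 - F_{S^l}(d)\big) \\
&+ \frac{1}{2}\,\Phi(d_\Lambda^*)^{1/2}\,\Big\{\sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i \alpha_j\,e^{E[Z_{ij}] + \frac{1}{2}(\sigma_{Z_i}^2 + \sigma_{Z_j}^2)} \\
&\qquad \times \Phi\big(d_\Lambda^* - (r_i\,\sigma_{Z_i} + r_j\,\sigma_{Z_j})\big)\,\Big(e^{\sigma_{Z_i Z_j}} - e^{\sigma_{Z_i}\sigma_{Z_j} r_i r_j}\Big)\Big\}^{1/2}.
\end{aligned}
\tag{2.44}
$$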
Proof. See Section 2.7.
Partially exact/comonotonic upper bound
The partially exact/comonotonic upper bound of Subsection 2.3.3 is given
by
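$$
\begin{aligned}
\pi^{pecub}(S, d, \Lambda) ={}& \sum_{i=1}^{n} \alpha_i\,e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2(1 - r_i^2)}\,\Big\{e^{\frac{r_i^2\sigma_{Z_i}^2}{2}}\,\Phi(r_i\,\sigma_{Z_i} - d_\Lambda^*) \\
&\quad + \int_0^{\Phi(d_\Lambda^*)} e^{r_i\,\sigma_{Z_i}\,\Phi^{-1}(v)}\,\Phi\Big(\mathrm{sign}(\alpha_i)\sqrt{1 - r_i^2}\,\sigma_{Z_i} - \Phi^{-1}\big(F_{S^u \mid V = v}(d)\big)\Big)\,dv\Big\} \\
&- d\,\Big(1 - \int_0^{\Phi(d_\Lambda^*)} F_{S^u \mid V = v}(d)\,dv\Big).
\end{aligned}
\tag{2.45}
$$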
Proof. See Section 2.7.
Choice of the conditioning variable
If X ≤cx Y , and X and Y are not equal in distribution, then Var[X] <
Var[Y ] must hold. An equality in variance would imply that X and Y are
equal in distribution. This shows that if we want to replace S by the convex
smaller Sl, the best approximations will occur when the variance of Sl is
‘as close as possible’ to the variance of S. Hence we should choose Λ such
that the goodness-of-fit expressed by the ratio $z = \frac{\mathrm{Var}[S^l]}{\mathrm{Var}[S]}$ is as close as
possible to 1. Of course one can always use numerical procedures to
optimize z but this would undermine one of the main features of the convex
bounds, namely that the different relevant actuarial quantities (quantiles,
stop-loss premiums) can be easily obtained. Having a ready-to-use
approximation that can be easily implemented and used by all kinds of
end-users is important from a business point of view.
Notice that the expected values of the random variables S, Sc and Sl
are all equal:
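$$ E[S] = E[S^l] = E[S^c] = \sum_{i=1}^{n} \alpha_i\,e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2}, \tag{2.46} $$
while their variances are given by
$$ \mathrm{Var}[S] = \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i \alpha_j\,e^{E[Z_i] + E[Z_j] + \frac{1}{2}(\sigma_{Z_i}^2 + \sigma_{Z_j}^2)}\,\big(e^{\sigma_{Z_i Z_j}} - 1\big), \tag{2.47} $$
$$ \mathrm{Var}[S^l] = \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i \alpha_j\,e^{E[Z_i] + E[Z_j] + \frac{1}{2}(\sigma_{Z_i}^2 + \sigma_{Z_j}^2)}\,\big(e^{r_i r_j \sigma_{Z_i}\sigma_{Z_j}} - 1\big) \tag{2.48} $$
and
$$ \mathrm{Var}[S^c] = \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i \alpha_j\,e^{E[Z_i] + E[Z_j] + \frac{1}{2}(\sigma_{Z_i}^2 + \sigma_{Z_j}^2)}\,\big(e^{\sigma_{Z_i}\sigma_{Z_j}} - 1\big), \tag{2.49} $$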
respectively.
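The three variance formulas share the same weights and differ only in the exponent of the last factor, which makes them easy to evaluate together. The following sketch assumes positive αi; the function name, the covariance matrix and the choice Λ = Z1 + Z2 used to obtain the correlations ri are illustrative assumptions.

```python
from math import exp, expm1, sqrt


def lognormal_sum_variances(alphas, means, cov, r):
    """Return (Var[S], Var[S^l], Var[S^c]) from (2.47), (2.48) and (2.49)."""
    n = len(alphas)
    sig = [sqrt(cov[i][i]) for i in range(n)]
    var_s = var_l = var_c = 0.0
    for i in range(n):
        for j in range(n):
            weight = alphas[i] * alphas[j] * exp(
                means[i] + means[j] + 0.5 * (cov[i][i] + cov[j][j]))
            var_s += weight * expm1(cov[i][j])
            var_l += weight * expm1(r[i] * r[j] * sig[i] * sig[j])
            var_c += weight * expm1(sig[i] * sig[j])
    return var_s, var_l, var_c


# Illustrative covariance matrix of (Z_1, Z_2); Lambda = Z_1 + Z_2 is assumed.
cov = [[0.04, 0.01], [0.01, 0.09]]
var_lambda = 0.04 + 0.09 + 2 * 0.01
r = [(0.04 + 0.01) / sqrt(0.04 * var_lambda),
     (0.01 + 0.09) / sqrt(0.09 * var_lambda)]
var_s, var_l, var_c = lognormal_sum_variances([1.0, 1.0], [0.0, 0.0], cov, r)
ratio = var_l / var_s  # goodness-of-fit ratio z = Var[S^l] / Var[S]
```

In this example the ratio is close to, but below, one, consistent with the ordering Var[Sl] ≤ Var[S] ≤ Var[Sc].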
We propose here three conditioning random variables. The first two are
linear combinations of the random variables Zi:
$$ \Lambda = \sum_{i=1}^{n} \gamma_i Z_i, \tag{2.50} $$
for particular choices of the coefficients γi.
Kaas et al. (2000) propose the following choice for the parameters γi
when computing the lower bound Sl:
$$ \gamma_i = \alpha_i\,e^{E[Z_i]}, \qquad i = 1, \ldots, n. \tag{2.51} $$
This choice makes Λ a linear transformation of a first order approximation
to S. This can be seen from the following derivation:
$$ S = \sum_{i=1}^{n} \alpha_i\,e^{E[Z_i] + (Z_i - E[Z_i])} \approx \sum_{i=1}^{n} \alpha_i\,e^{E[Z_i]}\,(1 + Z_i - E[Z_i]) = C + \sum_{i=1}^{n} \alpha_i\,e^{E[Z_i]} Z_i, \tag{2.52} $$
where C is constant. Hence Sl will be “close” to S, provided (Zi − E[Zi])
is sufficiently small, or equivalently, $\sigma_{Z_i}^2$ is sufficiently small. One
intuitively expects that for this choice for Λ, E[Var[S|Λ]] is “small” and, since
Var[S] = E[Var[S|Λ]] + Var[Sl], this exactly means that one expects the
ratio $z = \frac{\mathrm{Var}[S^l]}{\mathrm{Var}[S]}$ to be close to one.
A possible decomposition variable is in this case given by
$$ d_\Lambda = d - C = d - \sum_{i=1}^{n} \alpha_i\,e^{E[Z_i]}\,(1 - E[Z_i]). $$
Using the property that $e^x \ge 1 + x$ and (2.52), we have, for nonnegative
coefficients αi, that Λ ≥ dΛ implies that S ≥ d.
A second conditioning variable is proposed by Vanduffel et al. (2004).
They propose the following choice for the parameters γi when computing
the lower bound Sl:
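$$ \gamma_i = \alpha_i\,e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2}, \qquad i = 1, \ldots, n. \tag{2.53} $$
In this case the first order approximation of the variance of Sl will be
maximized. Indeed, from (2.48) we find that
$$
\begin{aligned}
\mathrm{Var}[S^l] &\approx \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i \alpha_j\,e^{E[Z_i] + E[Z_j] + \frac{1}{2}(\sigma_{Z_i}^2 + \sigma_{Z_j}^2)}\,\big(r_i r_j \sigma_{Z_i}\sigma_{Z_j}\big) \\
&= \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i \alpha_j\,e^{E[Z_i] + E[Z_j] + \frac{1}{2}(\sigma_{Z_i}^2 + \sigma_{Z_j}^2)}\,\Big(\frac{\mathrm{Cov}[Z_i, \Lambda]\,\mathrm{Cov}[Z_j, \Lambda]}{\mathrm{Var}[\Lambda]}\Big) \\
&= \frac{\Big(\mathrm{Cov}\Big[\sum_{i=1}^{n} \alpha_i\,e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2} Z_i,\ \Lambda\Big]\Big)^2}{\mathrm{Var}[\Lambda]} \\
&= \Big(\mathrm{Corr}\Big(\sum_{i=1}^{n} \alpha_i\,e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2} Z_i,\ \Lambda\Big)\Big)^2\,\mathrm{Var}\Big[\sum_{i=1}^{n} \alpha_i\,e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2} Z_i\Big].
\end{aligned}
$$
Hence, the first order approximation of Var[Sl] is maximized when Λ is
given by
$$ \Lambda = \sum_{i=1}^{n} \alpha_i\,e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2} Z_i. \tag{2.54} $$
One can easily prove that the first order approximation for Var[Sl] with Λ
given by (2.54) is equal to the first order approximation of Var[S]. This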
observation gives an additional indication that this particular choice for Λ
will provide a good fit.
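For a linear conditioning variable Λ = Σ γi Zi, the correlations ri follow directly from the covariance matrix of the Zi, so the two choices of γi can be compared numerically. The helper names and the covariance matrix below are hypothetical.

```python
from math import exp, sqrt


def kaas_weights(alphas, means):
    """gamma_i = alpha_i * exp(E[Z_i]), the choice (2.51)."""
    return [a * exp(m) for a, m in zip(alphas, means)]


def max_variance_weights(alphas, means, cov):
    """gamma_i = alpha_i * exp(E[Z_i] + sigma_i^2 / 2), the choice (2.53)."""
    return [a * exp(m + 0.5 * cov[i][i])
            for i, (a, m) in enumerate(zip(alphas, means))]


def conditioning_correlations(gammas, cov):
    """r_i = Cov[Z_i, Lambda] / (sigma_i * sigma_Lambda), Lambda = sum gamma_i Z_i."""
    n = len(gammas)
    cov_with_lambda = [sum(gammas[k] * cov[i][k] for k in range(n)) for i in range(n)]
    var_lambda = sum(gammas[i] * cov_with_lambda[i] for i in range(n))
    return [cov_with_lambda[i] / sqrt(cov[i][i] * var_lambda) for i in range(n)]


# Illustrative covariance matrix of (Z_1, Z_2), with alpha_i = 1 and E[Z_i] = 0.
cov = [[0.04, 0.01], [0.01, 0.09]]
r_kaas = conditioning_correlations(kaas_weights([1.0, 1.0], [0.0, 0.0]), cov)
r_maxvar = conditioning_correlations(
    max_variance_weights([1.0, 1.0], [0.0, 0.0], cov), cov)
```

The resulting ri can be inserted into (2.48) to evaluate the ratio Var[Sl]/Var[S] for each candidate Λ.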
For this ‘maximal variance’ conditioning variable a possible choice for
dΛ is given by
$$ d_\Lambda = d - \sum_{i=1}^{n} \alpha_i\,e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2}\,\Big(1 - E[Z_i] - \frac{1}{2}\sigma_{Z_i}^2\Big). \tag{2.55} $$
A third conditioning variable is based on the standardized logarithm of the
geometric average $G = \big(\prod_{i=1}^{n} X_i\big)^{1/n}$ as in Nielsen and Sandmann (2003):
$$ \Lambda = \frac{\ln G - E[\ln G]}{\sqrt{\mathrm{Var}[\ln G]}} = \frac{\sum_{i=1}^{n}(Z_i - E[Z_i])}{\sqrt{\mathrm{Var}\big[\sum_{i=1}^{n} Z_i\big]}}. $$
Using the fact that the geometric average is not greater than the arithmetic
average, a possible decomposition variable is here given by
$$ d_\Lambda = \frac{n \ln\big(\frac{d}{n}\big) - \sum_{i=1}^{n} E[Z_i]}{\sqrt{\mathrm{Var}\big[\sum_{i=1}^{n} Z_i\big]}}, $$
so that Λ ≥ dΛ implies that S ≥ d.
Generalization to sums of lognormals with a stochastic time horizon
Suppose that S is a sum of lognormal variables with a stochastic time
horizon T
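$$ S = \sum_{i=1}^{T} \alpha_i\,e^{Z_i}, $$
with $\alpha_i \in \mathbb{R}$, T a random variable with life time probability distribution
$F_T(t)$ and $Z_i \sim N(E[Z_i], \sigma_{Z_i}^2)$ independent of T . Using the tower property
for conditional expectations, we can calculate the stop-loss premium of S
as follows
$$
\begin{aligned}
\pi(S, d) &= \pi\Big(\sum_{i=1}^{T} \alpha_i\,e^{Z_i},\ d\Big) \\
&= E_T\Big[E\Big[\Big(\sum_{i=1}^{T} \alpha_i\,e^{Z_i} - d\Big)_+ \,\Big|\, T\Big]\Big] \\
&= \sum_{j=1}^{\infty} \Pr[T = j]\,\pi\Big(\sum_{i=1}^{j} \alpha_i\,e^{Z_i},\ d\Big) \\
&= \sum_{j=1}^{\infty} \Pr[T = j]\,\pi(S_j, d),
\end{aligned}
\tag{2.56}
$$
with
$$ S_j := \sum_{i=1}^{j} \alpha_i\,e^{Z_i}. $$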
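Relation (2.56) is a probability-weighted sum of fixed-horizon stop-loss premiums, which can be sketched generically. The function name, the horizon distribution and the deterministic toy cash flows below are hypothetical; in an application `premium_for_horizon` would return π(Sj, d) or one of its bounds.

```python
def stop_loss_random_horizon(horizon_pmf, premium_for_horizon):
    """Sum over j of Pr[T = j] * pi(S_j, d); horizon_pmf maps j to Pr[T = j]."""
    return sum(prob * premium_for_horizon(j) for j, prob in horizon_pmf.items())


# Toy example: deterministic payments of 1 per period, so S_j = j, retention 1.5.
pmf = {1: 0.2, 2: 0.5, 3: 0.3}
retention = 1.5
value = stop_loss_random_horizon(pmf, lambda j: max(j * 1.0 - retention, 0.0))
```

Truncating the dictionary at a finite horizon corresponds to replacing the infinite sum by a finite one.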
Notice that in practical applications the infinite time horizon is often re-
placed by a finite number. In this part of the thesis, the choice of Λ
will be dependent on the time horizon n. To indicate this dependence,
we introduce the notation Λn for the used conditioning variable Λ. It is
straightforward to obtain a lower bound, denoted as πlb(S, d,Λ), by looking
at the combination
$$ \pi^{lb}(S, d, \Lambda) = \sum_{j=1}^{\infty} \Pr[T = j]\,\pi^{lb}(S_j, d, \Lambda_j), $$
with Λ = (Λ1,Λ2, . . .) and $\pi^{lb}(S_j, d, \Lambda_j)$ given by (2.40) for n = j. The
same reasoning can be followed for obtaining the comonotonic upper bound
πcub(S, d), the improved comonotonic upper bound πicub(S, d,Λ) and the
partially exact/comonotonic upper bound πpecub(S, d,Λ).
For each term π(Sj , d) in the sum (2.56) we can take the minimum of
two or more of the above defined upper bounds. We propose two upper
bounds based on this simple idea.
The first bound takes each time the minimum of the error term (2.18),
which is independent of the retention, and the error term (2.26), which
depends on the retention. Combining this with the stop-loss premium of the
lower bound Sl results in the following upper bound
$$ \pi^{emub}(S, d, \Lambda) = \sum_{j=1}^{\infty} \Pr[T = j]\,\min\Big(\tfrac{1}{2}\,E\Big[\sqrt{\mathrm{Var}[S_j \mid \Lambda_j]}\Big],\ \varepsilon(d_{\Lambda_j})\Big) + \pi^{lb}(S, d, \Lambda). $$
Calculating for each term the minimum of all the presented upper bounds,
$$ \pi^{min}(S, d, \Lambda) = \sum_{j=1}^{\infty} \Pr[T = j]\,\min\big(\pi^{cub}(S_j, d),\ \pi^{icub}(S_j, d, \Lambda_j),\ \pi^{pecub}(S_j, d, \Lambda_j),\ \pi^{emub}(S_j, d, \Lambda_j)\big), $$
will of course provide the best upper bound among those presented.
Note that
$$ \pi^{emub}(S_j, d, \Lambda_j) = \pi^{lb}(S_j, d, \Lambda_j) + \min\Big(\tfrac{1}{2}\,E\Big[\sqrt{\mathrm{Var}[S_j \mid \Lambda_j]}\Big],\ \varepsilon(d_{\Lambda_j})\Big). $$
2.4 Application: discounted loss reserves
Loss reserving deals with the determination of the random present value
of future payments. Since this amount is very important for an insurance
company and its policyholders, these inherent uncertainties are no excuse
for providing anything less than a rigorous scientific analysis. Since the
reserve is a provision for the future payments, the estimated loss reserve
should reflect the time value of money. At the same time, it may be
necessary or desirable for those reserves to contain a security margin that
produces p×100% confidence in their adequacy, where p is a suitably high
number.
In many situations knowledge of the d.f. of this discounted reserve is
useful, for example dynamic financial analysis, assessing profitability and
pricing, identifying risk based capital needs, loss portfolio transfers, etc.
This application is concerned with the evaluation of loss reserves of this
type according to financial economics (see Panjer (1998)).
2.4.1 Framework and notation
Consider an insurance portfolio subject to liability payments L(i) ≥ 0 at
times i = 1, 2, . . ., where i = 0 denotes the present. Let L(i) be a random
variable and suppose that it is modified by certain forces that influence
the liability over time.
For example, suppose that $L_t^{(i)}$ denotes the amount of the liability payable
at time i, expressed in money values of time t. Then $L_t^{(i)}$ evolves in the sense that
$$ L_t^{(i)} = L_{t-1}^{(i)} R_{Lt}, \qquad t = 1, \ldots, i, $$
where the $R_{Lt}$ are strictly positive random variables of the form
$$ R_{Lt} = 1 + r_{Lt}, $$
with $r_{Lt}$ the inflation of claims costs over the interval (t − 1, t]. The liability
finally paid is
$$ L^{(i)} = L_i^{(i)}. $$
As an example, $L_{t-1}^{(i)}$ and $R_{Lt}$ might be independently distributed as follows:
$$ L_{t-1}^{(i)} \sim \log N(\nu, \tau^2) \quad\text{and}\quad R_{Lt} \sim \log N(\mu, \sigma^2). $$
It is emphasized that, in this example, $r_{Lt}$ denotes claims inflation. This
might include influences other than simple community inflation, such as
the particular pressures of the legal and health care environments on claim
costs.
Similarly, a holding of assets of value $A_{t-1}$ at time t − 1 accumulates
at time t to
$$ A_t = A_{t-1} R_{At}, $$
with
$$ R_{At} = 1 + r_{At}. $$
Assume that $R_{Xt}$, where X is either A or L, follows the Capital Asset
Pricing Model (CAPM):
$$ r_{Xt} = r_{Ft} + \beta_X \Delta_t + \varepsilon_{Xt}, \tag{2.57} $$
where $r_{Ft}$ is the risk-free rate in period t, $\beta_X$ is the CAPM beta associated
with X, $\varepsilon_{Xt}$ is the idiosyncratic risk associated with X, and
$$ \Delta_t = r_{Mt} - r_{Ft}, $$
with $r_{Mt}$ denoting the period increase in value of the economy-wide
portfolio of assets. The distribution of ∆t is assumed independent of t. The
assumption of CAPM returns is consistent with an assumption that assets
and liabilities here are marked to market.
Henceforth, it will be assumed that rFt = rF , independent of t. This
simplifies the following algebraic development considerably. It should be
emphasized, however, that the whole development generalizes to the case
in which rFt varies with t. The generalization is theoretically straight-
forward, but adds considerable notational baggage without yielding any
deeper insight.
Assume that the εAt are i.i.d. and similarly the εLt. Assume that all
variables εAt, εLt and ∆t are stochastically independent, and that E[εXt] =
0. Let us further denote the variance of $\varepsilon_{Xt}$ by $\omega_X^2$.
It follows that the $R_{At}$ are independent and identically distributed over
time, and similarly the $R_{Lt}$. Suppose now the following distribution assumptions:
$$ L_0^{(i)} \sim \log N\Big(\nu_{L0}^{(i)},\ \big(\tau_{L0}^{(i)}\big)^2\Big) \quad\text{and}\quad R_{Xt} \sim \log N\big(\mu_X,\ \sigma_X^2\big), \tag{2.58} $$
with stochastic independence between $L_0^{(i)}$ and $R_{Xt}$ for all i, t, and X =
A,L.
Denote
$$ \rho = \mathrm{Corr}(\log R_{At}, \log R_{Lt}) $$
and
$$ \kappa^{(rs)} = \mathrm{Corr}\big(\log L_0^{(r)}, \log L_0^{(s)}\big). $$
Define the accumulation factor
$$ R_{Xt:u} = R_{X,t+1} R_{X,t+2} \cdots R_{Xu}, \qquad u = t + 1, t + 2, \ldots $$
Note that $R_{Xt:t+1} = R_{X,t+1}$.
By relation (2.58) and the independence between distinct time intervals,
$$ R_{Xt:u} \sim \log N\big((u - t)\mu_X,\ (u - t)\sigma_X^2\big). $$
The implicit asset allocation is any that is consistent with relation (2.58).
One might assume, for example, a constant allocation by asset sector, with
continuous rebalancing and sector-specific returns that are constant over
time. As remarked earlier in this section, the last of these assumptions
could be weakened. Indeed, if the assumptions of constant returns over
time were weakened, no assumption would be required with respect to
asset allocation. Define the discounted liability payment
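$$
\begin{aligned}
V^{(i)} &= L_i^{(i)} R_{A0:i}^{-1} \\
&= L_0^{(i)} R_{L0:i} R_{A0:i}^{-1} \\
&= L_0^{(i)} \prod_{j=1}^{i} \big(R_{Lj} R_{Aj}^{-1}\big) \\
&\sim \log N\Big(\alpha^{(i)},\ \big(\delta^{(i)}\big)^2\Big),
\end{aligned}
$$
with $\alpha^{(i)} = \nu_{L0}^{(i)} + i(\mu_L - \mu_A)$ and $\big(\delta^{(i)}\big)^2 = \big(\tau_{L0}^{(i)}\big)^2 + i(\sigma_L^2 + \sigma_A^2 - 2\rho\sigma_L\sigma_A)$. The
present value S is given by
$$ S = \sum_{i=1}^{n} V^{(i)} := \sum_{i=1}^{n} e^{Z_i}, \tag{2.59} $$
with n the number of cash-flow liabilities in the discounted value of the
total outstanding losses of the portfolio.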
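The lognormal parameters of a single discounted payment follow from the expressions for α(i) and δ2(i) given above. The sketch below uses a hypothetical function name and purely illustrative parameter values.

```python
def discounted_payment_params(i, nu_l0, tau_l0, mu_l, mu_a, sigma_l, sigma_a, rho):
    """Return (alpha_i, delta_i^2), the log-scale mean and variance of V^(i)."""
    sigma2 = sigma_l ** 2 + sigma_a ** 2 - 2.0 * rho * sigma_l * sigma_a
    alpha_i = nu_l0 + i * (mu_l - mu_a)
    delta2_i = tau_l0 ** 2 + i * sigma2
    return alpha_i, delta2_i


# Illustrative values only (not taken from the tables of this chapter).
alpha_2, delta2_2 = discounted_payment_params(
    i=2, nu_l0=0.0, tau_l0=0.10, mu_l=0.05, mu_a=0.08,
    sigma_l=0.05, sigma_a=0.15, rho=0.2)
```

The log-scale variance grows linearly with the payment time i, because each additional period contributes one more independent factor to the product defining V(i).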
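In Taylor (2004), the mean and variance of S are calculated and given
by
$$
\begin{aligned}
E[S] &= \sum_{s=1}^{n} E[V^{(s)}] \\
&= \sum_{s=1}^{n} E\big[L_0^{(s)}\big]\,\Big[\frac{R_L}{R_A}\,\Big(\frac{1 + (\beta_A^2\sigma_M^2 + \omega_A^2)/R_A^2}{1 + \beta_A\beta_L\sigma_M^2/(R_A R_L)}\Big)\Big]^{s},
\end{aligned}
$$
$$
\begin{aligned}
\mathrm{Var}[S] &= \sum_{r,s=1}^{n} \mathrm{Cov}\big[V^{(r)}, V^{(s)}\big] \\
&= \sum_{r,s=1}^{n} E[V^{(r)}]\,E[V^{(s)}]\,\Big(\exp\Big[\kappa^{(rs)}\,\tau_{L0}^{(r)}\,\tau_{L0}^{(s)} + \min(r, s)\,\big[\sigma_L^2 + \sigma_A^2 - 2\rho\sigma_A\sigma_L\big]\Big] - 1\Big),
\end{aligned}
$$
with $R_X = E[R_{Xt}]$ and $\sigma_M^2 = \mathrm{Var}[r_{Mt}]$. We will denote the variance of S
by $\sigma_S^2$.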
There are now three relevant values of the loss reserve:
• $\sum_{s=1}^{n} E\big[L_0^{(s)}\big]$, which is the CAPM-based economic value of the
liability.
• E[S], which is the expected value of the discounted liability cash
flows, the discount rate taking into account the insurer’s asset
holdings.
• $A_p = F_S^{-1}(p) = E[S]\exp\big(\sigma_S\Phi^{-1}(p) - \tfrac{1}{2}\sigma_S^2\big)$, which is the p × 100%-
confidence loss reserve.
It may be convenient to write the last of these in the form
$$ A_p = [1 + \eta(p, \sigma_S)]\,E[S], $$
where η(p, σS) may be regarded as a security loading. Note, however,
that the security loading in this formulation is applied to E[S] and not
to the economic value of the liability. The first two of the above three
possibilities for loss reserve are the ones involved in the current debate
over the appropriate rate(s) at which to discount liabilities. The quantity
E[S] is obtained using the expectations of discount factors that reflect
the insurer’s expected returns. In broad (though not quite precise) terms,
it may be thought of as the amount of assets which, accumulating with
expected investment return, will be sufficient to meet liabilities as they
are required to be paid. This value depends on the insurer-specific asset
holdings, and so cannot be the market or fair value of the liabilities. This is
given by the first of the above three candidates for loss reserves.
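Taking the expression for Ap above at face value, the security loading depends only on p and σS. The sketch below uses a hypothetical function name and illustrative inputs, and relies on the standard library for the inverse normal cdf.

```python
from math import exp
from statistics import NormalDist


def security_loading(p, sigma_s):
    """eta with A_p = (1 + eta)*E[S], for A_p = E[S]*exp(sigma_S*z_p - sigma_S^2/2)."""
    z_p = NormalDist().inv_cdf(p)  # Phi^{-1}(p)
    return exp(sigma_s * z_p - 0.5 * sigma_s ** 2) - 1.0


# Illustrative values: the loading rises with the confidence level p.
loading_70 = security_loading(0.70, 0.2)
loading_975 = security_loading(0.975, 0.2)
```

Because the exponent σS Φ−1(p) − σS²/2 has derivative Φ−1(p) − σS with respect to σS, the loading increases with σS only while Φ−1(p) exceeds σS.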
Taylor (1996) pointed out that for high security margins (Φ−1(p) > σS), the
size of the security margin increases with increasing asset beta. However,
for low security margins (Φ−1(p) < σS), the size of the security margin
decreases with increasing asset beta. In this latter case the additional
yield expected from an increased asset risk outweighs the additional risk.
Taylor (2004) defines the security margin for confidence level p as
SMp[S] := η(p, σS) = (VaRp[S]/E[S]) − 1, which is based on the
quantile risk measure from the distribution of the discounted reserve S. In
general, it is hard or even impossible to determine the quantiles of the
discounted reserve analytically, because in any realistic model for the return
process the random variable S will be a sum of strongly dependent
random variables. Here, S is a finite sum of correlated lognormal random
variables. This implies that its cumulative distribution function cannot be
determined analytically and is too cumbersome to work with numerically.
An interesting solution to this difficulty consists of determining the lower
bound Sl and the upper bound Sc as explained earlier in this chapter.
2.4.2 Calculation of convex lower and upper bounds
To calculate the security margin η(p, σS) expressions for the quantiles and
the expected value of Sl and Sc are needed. The expressions for the quan-
tile function of the lower and upper bound of a sum of lognormal random
variables are given by (2.35) and (2.38) in the case of αi = 1 for all i.
The expression for the expected value is given by (2.46). To calculate the
lower bound we choose the ‘maximal variance’ conditioning variable given
by (2.50) and (2.53):
$$ \Lambda = \sum_{i=1}^{n} e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2} Z_i. $$
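We find that
$$ E[Z_i] = \nu_{L0}^{(i)} + i\,\log\Big[\frac{R_L}{R_A}\,\Big(\frac{1 + (\beta_A^2\sigma_M^2 + \omega_A^2)/R_A^2}{1 + (\beta_L^2\sigma_M^2 + \omega_L^2)/R_L^2}\Big)^{1/2}\Big], $$
$$ \mathrm{Var}[Z_i] = \sigma_{Z_i}^2 = \big(\tau_{L0}^{(i)}\big)^2 + i\,\sigma^2, $$
where the variability of the discounting structure $\sigma^2 := \sigma_L^2 + \sigma_A^2 - 2\rho\sigma_L\sigma_A$
is given by
$$ \log\Big\{\frac{\big[1 + (\beta_A^2\sigma_M^2 + \omega_A^2)/R_A^2\big]\,\big[1 + (\beta_L^2\sigma_M^2 + \omega_L^2)/R_L^2\big]}{\big[1 + \beta_A\beta_L\sigma_M^2/(R_A R_L)\big]^2}\Big\}. $$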
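The correlation between Zi and Λ is given by
$$ r_i = \frac{\mathrm{Cov}[Z_i, \Lambda]}{\sigma_{Z_i}\,\sigma_\Lambda} = \frac{\sum_{k=1}^{n} \beta_k\,\big(\sigma^2 \min(i, k) + \eta^{(i,k)}\big)}{\sigma_{Z_i}\,\sqrt{\sum_{k=1}^{n}\sum_{l=1}^{n} \beta_k \beta_l\,\big(\sigma^2 \min(k, l) + \eta^{(k,l)}\big)}}, $$
where $\beta_k = e^{E[Z_k] + \frac{1}{2}\sigma_{Z_k}^2}$ denotes the coefficient of $Z_k$ in Λ (not to be
confused with the CAPM betas), and with
$$ \eta^{(k,l)} = \mathrm{Cov}\big[\log L_0^{(k)}, \log L_0^{(l)}\big] = \kappa^{(kl)}\,\tau_{L0}^{(k)}\,\tau_{L0}^{(l)}. $$
Notice that if the liability cash flows are independent, $\eta^{(k,l)} = \big(\tau_{L0}^{(k)}\big)^2 I_{(k = l)}$.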
We will compare the performance of the lower and upper bound approach
with Monte Carlo simulation results, obtained by generating 1 000 000
random paths, which serve as a benchmark. Note that the random paths are
based on antithetic variables in order to reduce the variance of the Monte
Carlo estimate.
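The variance-reduction idea behind antithetic variables can be shown on a one-dimensional toy problem: every simulated normal draw z is paired with −z and the two function values are averaged. This is only a sketch of the principle, not the simulation set-up used for the tables; the function name, seed and sample size are hypothetical.

```python
import random
from math import exp


def antithetic_mean(func, pairs, seed=0):
    """Estimate E[func(Z)], Z ~ N(0, 1), averaging func over pairs (z, -z)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(pairs):
        z = rng.gauss(0.0, 1.0)
        total += 0.5 * (func(z) + func(-z))  # antithetic pair average
    return total / pairs


# Toy target: E[exp(sigma*Z)] = exp(sigma^2 / 2) for a lognormal factor.
sigma = 0.25
estimate = antithetic_mean(lambda z: exp(sigma * z), pairs=20000)
exact = exp(0.5 * sigma ** 2)
```

For a monotone function such as exp(σz) the two members of each pair are negatively correlated, which is what lowers the variance of the pair average.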
We use the notation $SM_p[S^l]$ and $SM_p[S^c]$ to denote the security
margin for confidence level p approximated by the lower bound and the upper
bound approximation respectively. The different tables display the Monte
Carlo simulation result (MC) for the security margin, as well as the
percentage deviations of the different approximation methods, relative to the
Monte Carlo result. These percentage deviations are defined as follows:
$$ LB := \frac{SM_p[S^l] - SM_p[S^{MC}]}{SM_p[S^{MC}]} \times 100\%, $$
$$ UB := \frac{SM_p[S^c] - SM_p[S^{MC}]}{SM_p[S^{MC}]} \times 100\%, $$
where $S^l$ and $S^c$ correspond to the lower bound approach and the upper
bound approach, and $S^{MC}$ denotes the Monte Carlo simulation result. The
figures displayed in bold in the tables correspond to the best
approximations, that is, the ones with the smallest percentage deviation compared
to the Monte Carlo results.
We set βL equal to zero and choose as financial parameters rF = 6%,
E[∆] = 6% and βA = 0.9. The tables list the results for different values of
the parameters ωL, ωA, σM and n.
We construct two different cash flow structures. Table 2.1 displays the
first structure of the liability cash flows (ex. 1), each of which is assumed
lognormally distributed, and all of which are stochastically independent.
Time i   $E[L^{(i)}]$   $E[L_0^{(i)}]$   $\nu_0^{(i)}$   $\tau_0^{(i)}$
1        5%             4.7%             −3.059          10%
2        15%            13.3%            −2.019          10%
3        25%            21.0%            −1.566          10%
4        20%            15.8%            −1.854          15%
5        15%            11.2%            −2.120          15%
6        10%            7.0%             −2.663          15%
7        5%             3.3%             −3.424          20%
8        5%             3.1%             −3.493          25%
Total    100%           79.6%
Table 2.1: Structure of stochastic liability cash flow (ex. 1).
The profile of the cash flows is intended to resemble a medium-term
casualty payment pattern. It is assumed that ωL = 5% and as financial
parameters σM = 20% and ωA = 0. It follows from equation (2.57) that
RL = 1.06. Further, we have for this example µL = 0.0570 and σL =
0.0471.
Table 2.2 summarizes the results for the 70% security margin for different
market volatilities $\sigma_M$. The lower bound gives the best fit of the
security margins for all values of the parameters. The standard error of
the Monte Carlo estimate is displayed in parentheses.
Table 2.3 compares the approximations for some selected confidence
levels $p$. For this example we have that $\sigma_A = 16.1\%$, $\sigma_L = 4.7\%$,
$\mu_A = 9.5\%$ and $\mu_L = 5.7\%$, with $\mu_X$ and $\sigma_X^2$ such that
$R_X = \exp(\mu_X + \frac{1}{2}\sigma_X^2)$. The results are in line with the
previous ones. The lower bound approach gives excellent results for high
as well as for low values of $p$.
Table 2.4 displays the approximated and simulated 97.5% margins for
some selected market volatilities. These parameters are consistent with
historical capital market values as reported by Ibbotson Associates (2002).
The presented figures again indicate that the lower bound is the most
precise method.
$\sigma_M$:           0.05      0.15      0.25      0.35
LB                −0.25%    −0.09%    −0.12%    −0.00%
UB                +19.86%   +12.12%   +5.37%    −1.62%
MC                0.0853    0.1090    0.1309    0.1370
(s.e. × 10^7)     (1.11)    (2.47)    (6.15)    (8.18)
Table 2.2: (ex. 1) Approximations for the security margin $SM_{0.70}[V]$ for
different market volatilities and $\omega_L = 0.1$ and $\omega_A = 0.05$.

$p$:                0.995     0.975     0.95      0.90      0.80      0.70
LB                −0.38%    −0.21%    −0.16%    −0.08%    −0.00%    −0.00%
UB                +26.26%   +23.44%   +21.80%   +19.76%   +16.38%   +11.25%
MC                1.0348    0.6927    0.5421    0.3859    0.2192    0.1124
(s.e. × 10^5)     (2.49)    (0.46)    (0.26)    (0.10)    (0.06)    (0.04)
Table 2.3: (ex. 1) Approximations for some selected confidence levels
of $SM_p[V]$. The market volatility is set equal to 20%. ($\omega_L = 0.05$ and
$\omega_A = 0$)

$\sigma_M$:           0.05      0.10      0.15      0.20      0.25      0.30      0.35
LB                −0.19%    −0.15%    −0.23%    −0.16%    −0.11%    −0.17%    −0.38%
UB                +31.74%   +27.72%   +24.12%   +21.81%   +20.31%   +19.18%   +18.13%
MC                0.4390    0.5250    0.6528    0.8103    0.9924    1.1970    1.4232
(s.e. × 10^5)     (0.15)    (0.29)    (0.41)    (0.69)    (1.22)    (3.78)    (4.16)
Table 2.4: (ex. 1) Approximations for the security margin $SM_{0.975}[V]$ for
different market volatilities.
We include an additional example (ex. 2) with a different stochastic liability
cash-flow structure. We fix the number of liabilities at $n = 30$. Further,
we choose $\nu_0^{(i)} = -4.46$ for $i = 1, \ldots, 30$ and
$$\tau_0^{(i)} = \begin{cases}
5\% & i \le 5; \\
10\% & 5 < i \le 15; \\
15\% & 15 < i \le 25; \\
20\% & 25 < i \le 28; \\
25\% & 28 < i \le 30.
\end{cases}$$
$p$:                0.995     0.975     0.95      0.90      0.80      0.70
LB                −0.93%    −0.04%    −0.02%    −0.18%    −0.03%    −0.6%
UB                +24.59%   +19.86%   +16.94%   +12.95%   +5.16%    −30.40%
MC                4.4521    2.2264    1.4998    0.8814    0.3508    0.0761
(s.e. × 10^5)     (37.63)   (2.99)    (7.44)    (2.79)    (0.78)    (0.27)
Table 2.5: (ex. 2) Approximations for some selected confidence levels of
$SM_p[V]$. The market volatility is set equal to 25%.
This means that the sum of the expected cash flows $E[L^{(i)}]$ is equal to
100% and the sum of the $E[L_0^{(i)}]$ is equal to 35.51%. In this example we
fix the parameters $\omega_L$ and $\omega_A$ equal to 10% and 5% respectively.
The same conclusions as for ex. 1 can be drawn from the results in
Table 2.5. This table reports the discussed approximations for SMp[V ] for
different probability levels and a fixed market volatility σM = 0.25. Note
that for the parameters in Table 2.5 σA = 20.5%, σL = 9.4%, µA = 8.7%
and µL = 5.4%.
Overall, the comonotonic lower bound approach provides a very accurate
fit under different parameter assumptions. These assumptions are in line
with realistic market values. Moreover, the comonotonic approximations
have the advantage that they are easily computable for any risk measure
that is additive for comonotonic risks, such as Value-at-Risk and the wider
class of distortion risk measures (see e.g. Dhaene et al. (2004)).
2.5 Convex bounds for scalar products of random
vectors
Within the fields of finance and actuarial science one is often confronted
with the problem of determining the distribution function of a scalar
product of two random vectors of the form
$$S = \sum_{i=1}^{n} X_i Y_{t_i}, \tag{2.60}$$
where the nominal random payments Xi are due at fixed and known times
ti, i = 1, . . . , n and Yt denotes the nominal discount factor over the interval
[0, t], t ≥ 0. This means that the amount one needs to invest at time 0
to get an amount 1 at time $t$ is the random variable $Y_t$. By nominal we
mean that there is no correction for inflation. Notice that here the random
vector $\vec{X} = (X_1, X_2, \ldots, X_n)$ may reflect e.g. the insurance or credit
risk, while the vector $\vec{Y} = (Y_{t_1}, Y_{t_2}, \ldots, Y_{t_n})$ represents the
financial/investment risk. If the payments $X_i$ at time $t_i$ are independent
of inflation, then the vectors $\vec{X}$ and $\vec{Y}$ can be assumed to be mutually
independent. On the other hand, if the payments are adjusted for inflation,
the vectors $\vec{X}$ and $\vec{Y}$ are no longer mutually independent. Denoting the
inflation factor over the period $[0, t]$ by $Z_t$, the random variable $S$ can
be rewritten as
$$S = \sum_{i=1}^{n} \tilde{X}_i \tilde{Y}_{t_i},$$
where the real payments $\tilde{X}_i$ and the real discount factors $\tilde{Y}_{t_i}$ are given
by $\tilde{X}_i = X_i/Z_{t_i}$ and $\tilde{Y}_{t_i} = Y_{t_i} Z_{t_i}$. Hence, in this case $S$ is the scalar
product of two mutually independent random vectors $(\tilde{X}_1, \tilde{X}_2, \ldots, \tilde{X}_n)$
and $(\tilde{Y}_{t_1}, \tilde{Y}_{t_2}, \ldots, \tilde{Y}_{t_n})$. For this reason the assumption of independence
between the insurance risk and the financial risk is in most cases realis-
tic and can be efficiently deployed to obtain various quantities describing
risk within financial institutions, e.g. discounted insurance claims or the
embedded/appraisal value of a company.
Distributions of sums of the form (2.60) are often encountered in prac-
tice and need to be analyzed thoroughly by actuaries and other practition-
ers involved in the risk management process. Not only the basic summary
measures (like the first few moments) have to be computed, but also more
sophisticated risk measures which require much deeper knowledge about
the underlying distributions (e.g. the Value-at-Risk).
Unfortunately there are no analytical methods to compute distribution
functions for random variables of this form. That is why usually one has
to rely on volatile and time consuming Monte Carlo simulations. In spite
of the enormous increase in computational power observed within the last
few decades, computing time remains a serious drawback of Monte Carlo
simulations, especially when one is interested in estimating very high values
of quantiles (note that a solvency capital of an insurance company may
be determined e.g. as the 99.95%-quantile, which is extremely difficult to
estimate within reasonable time by simulation methods).
In this section we propose an alternative solution. By extending the
methodology of Section 2.2 to the case of scalar products of independent
random vectors, we obtain convex upper and lower bounds for sums of the
form (2.60). As we demonstrate by means of a series of numerical illus-
trations, the methodology provides an excellent framework to get accurate
and easily obtainable approximations of distribution functions for random
variables of the form (2.60).
We first give the theoretical foundations for convex lower and upper
bounds in the case of scalar products of independent random vectors. Next,
we demonstrate how to obtain the bounds for (2.60) in the convex order
sense in the case where $\vec{Y}$ follows the lognormal law. Finally, we present
several applications for discounted claim processes in a Black & Scholes setting.
2.5.1 Theoretical results
Consider sums of the form:
S = X1Y1 +X2Y2 + . . .+XnYn, (2.61)
where the random vectors $\vec{X} = (X_1, X_2, \ldots, X_n)$ and $\vec{Y} = (Y_1, Y_2, \ldots, Y_n)$
are assumed to be mutually independent. Theoretically, the techniques
developed in Section 2.2 can also be applied in this case (one can take
$V_j = X_j Y_j$). Such an approach is however not very practical. First of all,
it is not always easy to find the marginal distributions of $V_j$. Secondly, it
is usually very difficult to find a suitable conditioning random variable $\Lambda$
which is a good approximation to the whole scalar product, taking
into account the riskiness of the random vectors $\vec{X}$ and $\vec{Y}$ simultaneously.
The following theorem provides a more suitable approach to deal with
scalar products. Before we prove the theorem we recall a helpful lemma.
Lemma 8 (Scalar products and convex order).
Assume that $\vec{X} = (X_1, \ldots, X_n)$, $\vec{Y} = (Y_1, \ldots, Y_n)$ and $\vec{Z} = (Z_1, \ldots, Z_n)$
are non-negative random vectors and that $\vec{X}$ is mutually independent of
the vectors $\vec{Y}$ and $\vec{Z}$. If for all possible outcomes $x_1, \ldots, x_n$ of $\vec{X}$
$$\sum_{i=1}^{n} x_i Y_i \le_{cx} \sum_{i=1}^{n} x_i Z_i,$$
then the corresponding scalar products are ordered in the convex order
sense, i.e.
$$\sum_{i=1}^{n} X_i Y_i \le_{cx} \sum_{i=1}^{n} X_i Z_i.$$
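Proof. Let $\phi$ be a convex function. By conditioning on $\vec{X}$ and taking the
assumptions into account, we find that
$$E\left[\phi\left(\sum_{i=1}^{n} X_i Y_i\right)\right] = E_{\vec{X}}\left[E\left[\phi\left(\sum_{i=1}^{n} X_i Y_i\right) \Big|\, \vec{X}\right]\right]
\le E_{\vec{X}}\left[E\left[\phi\left(\sum_{i=1}^{n} X_i Z_i\right) \Big|\, \vec{X}\right]\right]
= E\left[\phi\left(\sum_{i=1}^{n} X_i Z_i\right)\right]$$
holds for any convex function $\phi$.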
Theorem 11 (Bounds for scalar products of random vectors).
Consider the following sum of random variables
$$S = \sum_{i=1}^{n} X_i Y_i. \tag{2.62}$$
Assume that the vectors $\vec{X} = (X_1, X_2, \ldots, X_n)$ and $\vec{Y} = (Y_1, Y_2, \ldots, Y_n)$
are mutually independent. Define the following quantities:
$$S^c = \sum_{i=1}^{n} F_{X_i}^{-1}(U)\, F_{Y_i}^{-1}(V), \tag{2.63}$$
$$S^l = \sum_{i=1}^{n} E[X_i \mid \Gamma]\, E[Y_i \mid \Lambda], \tag{2.64}$$
where $U$ and $V$ are independent standard uniform random variables, $\Gamma$ is
a random variable independent of $\vec{Y}$ and $\Lambda$, and the second conditioning
random variable $\Lambda$ is independent of $\vec{X}$ and $\Gamma$. Then, the following relation
holds:
$$S^l \le_{cx} S \le_{cx} S^c.$$
Proof. The proof is based on a multiple application of Lemma 8.
1. First, we prove that $\sum_{i=1}^{n} X_i Y_i \le_{cx} \sum_{i=1}^{n} F_{X_i}^{-1}(U)\, F_{Y_i}^{-1}(V)$.
From Theorem 8 it follows that for all possible outcomes $(x_1, \ldots, x_n)$
of $\vec{X}$ the following inequality holds:
$$\sum_{i=1}^{n} x_i Y_i \le_{cx} \sum_{i=1}^{n} F_{x_i Y_i}^{-1}(V) = \sum_{i=1}^{n} x_i F_{Y_i}^{-1}(V).$$
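Thus from Lemma 8 it follows immediately that
$\sum_{i=1}^{n} X_i Y_i \le_{cx} \sum_{i=1}^{n} X_i F_{Y_i}^{-1}(V)$. The same reasoning can be applied to show that
$$\sum_{i=1}^{n} X_i F_{Y_i}^{-1}(V) \le_{cx} \sum_{i=1}^{n} F_{X_i}^{-1}(U)\, F_{Y_i}^{-1}(V).$$
2. In a similar way, one can show that
$$\sum_{i=1}^{n} E[X_i \mid \Gamma]\, E[Y_i \mid \Lambda] \le_{cx} \sum_{i=1}^{n} X_i\, E[Y_i \mid \Lambda] \le_{cx} \sum_{i=1}^{n} X_i Y_i.$$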
Remark 1. Notice that $\sum_{i=1}^{n} F_{X_i}^{-1}(U)\, F_{Y_i}^{-1}(V) \le_{cx} \sum_{i=1}^{n} F_{X_i Y_i}^{-1}(U)$. Thus
the upper bound (2.63) is improved compared to the comonotonic upper
bound. It takes into account the information that the vectors $\vec{X}$ and $\vec{Y}$
are independent.
Remark 2. One can also calculate the improved upper bound
$$S^u = \sum_{i=1}^{n} F_{X_i \mid \Gamma}^{-1}(U)\, F_{Y_i \mid \Lambda}^{-1}(V),$$
but since the improved upper bound $S^u$ is very close to the comonotonic
upper bound $S^c$ and requires much more computational time, we concentrate
in this thesis only on the lower bound $S^l$ and the comonotonic
upper bound $S^c$ as approximations for $S$.
Remark 3. Having obtained the convex upper and lower bounds, one can
also obtain the moments based approximation $S^m$ as described in Subsection
2.2.4, i.e. by determining the distribution function as follows:
$$F_{S^m}(t) = z F_{S^l}(t) + (1 - z) F_{S^c}(t), \tag{2.65}$$
where
$$z = \frac{\mathrm{Var}[S^c] - \mathrm{Var}[S]}{\mathrm{Var}[S^c] - \mathrm{Var}[S^l]}. \tag{2.66}$$
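The weight in (2.66) and the mixture in (2.65) translate directly into code. The sketch below is illustrative only: the function names `moment_weight` and `moment_cdf` are hypothetical, and the three variances are assumed to be available from elsewhere.

```python
def moment_weight(var_s, var_lower, var_upper):
    """Weight z of (2.66): (Var[S^c] - Var[S]) / (Var[S^c] - Var[S^l])."""
    if not var_lower <= var_s <= var_upper:
        raise ValueError("expected Var[S^l] <= Var[S] <= Var[S^c]")
    if var_upper == var_lower:
        raise ValueError("the bounds have equal variance; z is undefined")
    return (var_upper - var_s) / (var_upper - var_lower)


def moment_cdf(t, cdf_lower, cdf_upper, z):
    """Distribution function of S^m at t, as in (2.65)."""
    return z * cdf_lower(t) + (1.0 - z) * cdf_upper(t)


# Hypothetical variances, for illustration only.
z = moment_weight(var_s=1.2, var_lower=1.0, var_upper=2.0)
```

Because $S^l$, $S$ and $S^c$ share the same mean, the variance of the mixture is $z\,\mathrm{Var}[S^l] + (1-z)\,\mathrm{Var}[S^c]$, which equals $\mathrm{Var}[S]$ for this choice of $z$.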
2.5.2 Stop-loss premiums
The stop-loss premiums of $S^c$ and $S^l$ provide natural bounds for the
stop-loss premiums of the underlying scalar product of random vectors. More
precisely, one has the following relationship:
$$\pi^{lb}(S, d, \Gamma, \Lambda) \le \pi(S, d) \le \pi^{cub}(S, d).$$
The values $\pi^{cub}(S, d)$ and $\pi^{lb}(S, d, \Gamma, \Lambda)$ can be easily computed. Below we
give the computational procedure in detail.
First, consider a sum of the form
$$(S^c \mid U = u) = \sum_{i=1}^{n} F_{X_i}^{-1}(u)\, F_{Y_i}^{-1}(V).$$
It can be easily seen that it is a sum of the components of a comonotonic
vector, and hence the conditional stop-loss premiums of $S^c$ (given $U = u$)
can be found, in the case where the distribution functions of the $Y_i$ are
continuous and strictly increasing, by applying Theorem 9. Then, the overall
stop-loss premium of $S^c$ can be computed by conditioning:
$$\pi^{cub}(S, d) = E\left[E\left[(S^c - d)_+ \mid U\right]\right]
= \int_0^1 \sum_{i=1}^{n} F_{X_i}^{-1}(u)\, \pi\left(Y_i, F_{Y_i}^{-1}\left(F_{S^c \mid U=u}(d)\right)\right) du. \tag{2.67}$$
In general it is more difficult to calculate stop-loss premiums for the lower
bound. However, it can be done similarly as in the case of the upper
bound if one additionally assumes that the conditioning variables $\Gamma$ and
$\Lambda$ can be chosen in such a way that for any fixed $\gamma \in \mathrm{supp}(\Gamma)$ all
components $E[X_i \mid \Gamma = \gamma]\, E[Y_i \mid \Lambda = \lambda]$ are non-decreasing (or equivalently
non-increasing) in $\lambda$. Then the vector
$$\left(E[X_1 \mid \Gamma = \gamma]\, E[Y_1 \mid \Lambda],\; E[X_2 \mid \Gamma = \gamma]\, E[Y_2 \mid \Lambda],\; \ldots,\; E[X_n \mid \Gamma = \gamma]\, E[Y_n \mid \Lambda]\right)$$
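is comonotonic and Theorem 9 can be applied. Thus, one gets
$$\pi^{lb}(S, d, \Gamma, \Lambda) = E\left[E\left[(S^l - d)_+ \mid \Gamma\right]\right]
= \int_0^1 \sum_{i=1}^{n} E\left[X_i \mid \Gamma = F_\Gamma^{-1}(u)\right] \times \pi\left(E[Y_i \mid \Lambda],\; F_{E[Y_i \mid \Lambda]}^{-1}\left(F_{S^l \mid \Gamma = F_\Gamma^{-1}(u)}(d)\right)\right) du. \tag{2.68}$$
Hence, as soon as one can compute stop-loss premiums of $Y_i$ and $E[Y_i \mid \Lambda]$, one
can also compute stop-loss premiums of $S^c$ and $S^l$.
Note that stop-loss premiums of the moments based approximation $S^m$
can be easily calculated as
$$\pi^{m}(S, d, \Gamma, \Lambda) = z\,\pi^{lb}(S, d, \Gamma, \Lambda) + (1 - z)\,\pi^{cub}(S, d).$$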
2.5.3 The case of log-normal discount factors
In the sequel we develop a framework for computing convex bounds for
random variables of the form
$$S = \sum_{i=1}^{n} \alpha_i X_i e^{Z_i}, \tag{2.69}$$
where the vectors $\vec{X}$ and $\vec{Z}$ satisfy the usual conditions (see Section 2.5.1).
We assume $\alpha_i > 0$ and $Z_i \sim N(E[Z_i], \sigma_{Z_i}^2)$. In this section we consider
the problem in general, without imposing any conditions on the random
variables $X_i$. In particular we do not discuss the choice of the conditioning
variable $\Gamma$.
The upper bound
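From Theorem 11 it follows that
$$S^c = \sum_{i=1}^{n} F_{X_i}^{-1}(U)\, F_{\alpha_i e^{Z_i}}^{-1}(V)
= \sum_{i=1}^{n} F_{X_i}^{-1}(U)\, \alpha_i\, e^{E[Z_i] + \mathrm{sign}(\alpha_i)\sigma_{Z_i}\Phi^{-1}(V)}, \tag{2.70}$$
where $U$ and $V$ are independent standard uniform random variables.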
The cumulative distribution function of Sc can be calculated in three
steps:
1. Suppose that $U = u$ is fixed. Then from (2.70) it follows that conditional
quantiles can be computed as
$$F_{S^c \mid U=u}^{-1}(p) = \sum_{i=1}^{n} F_{X_i}^{-1}(u)\, \alpha_i\, e^{E[Z_i] + \mathrm{sign}(\alpha_i)\sigma_{Z_i}\Phi^{-1}(p)}; \tag{2.71}$$
2. Obviously for any $u$ the function given by (2.71) is continuous and
strictly increasing. Thus for any $y \ge 0$ one can compute the value
of the conditional distribution function using one of the well-known
numerical methods (e.g. Newton-Raphson) as a solution of
$$\sum_{i=1}^{n} F_{X_i}^{-1}(u)\, \alpha_i\, e^{E[Z_i] + \mathrm{sign}(\alpha_i)\sigma_{Z_i}\Phi^{-1}\left(F_{S^c \mid U=u}(y)\right)} = y; \tag{2.72}$$
3. The cumulative distribution function of $S^c$ can now be derived as
$$F_{S^c}(y) = \int_0^1 F_{S^c \mid U=u}(y)\, du.$$
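The three steps above can be sketched numerically. The code below is a minimal illustration under explicit assumptions: the payments $X_i$ are taken to be lognormal (so that $F_{X_i}^{-1}$ has a closed form), all $\alpha_i$ are positive, bisection is used in place of Newton-Raphson for the root search in step 2, and the integral in step 3 is approximated by a midpoint rule. All function and parameter names are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def cond_quantile(p, u, alpha, x_params, z_params):
    """Step 1, (2.71): quantile of S^c given U = u (all alpha_i > 0).

    x_params[i] = (m, s): X_i assumed lognormal, F_X^{-1}(u) = exp(m + s * Phi^{-1}(u)).
    z_params[i] = (E[Z_i], sd of Z_i).
    """
    q_u = STD_NORMAL.inv_cdf(u)
    q_p = STD_NORMAL.inv_cdf(p)
    return sum(
        a * math.exp(m + s * q_u) * math.exp(mz + sz * q_p)
        for a, (m, s), (mz, sz) in zip(alpha, x_params, z_params)
    )


def cond_cdf(y, u, alpha, x_params, z_params, tol=1e-10):
    """Step 2, (2.72): solve cond_quantile(p) = y for p by bisection."""
    lo, hi = 1e-12, 1.0 - 1e-12
    if cond_quantile(lo, u, alpha, x_params, z_params) >= y:
        return 0.0
    if cond_quantile(hi, u, alpha, x_params, z_params) <= y:
        return 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cond_quantile(mid, u, alpha, x_params, z_params) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def upper_bound_cdf(y, alpha, x_params, z_params, n_grid=200):
    """Step 3: F_{S^c}(y) as the integral over u of the conditional cdf."""
    total = 0.0
    for k in range(n_grid):
        u = (k + 0.5) / n_grid  # midpoint of the k-th cell of (0, 1)
        total += cond_cdf(y, u, alpha, x_params, z_params)
    return total / n_grid
```

For a single term ($n = 1$) the bound is the product of two independent lognormals, so $\ln S^c$ is normal and the result can be checked against the exact normal distribution function.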
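The stop-loss premiums of the upper bound can be computed as follows.
For simplicity of notation let us denote
$$d_{u,i} = F_{\alpha_i e^{Z_i}}^{-1}\left(F_{S^c \mid U=u}(d)\right) = \alpha_i\, e^{E[Z_i] + \mathrm{sign}(\alpha_i)\sigma_{Z_i}\Phi^{-1}\left(F_{S^c \mid U=u}(d)\right)}. \tag{2.73}$$
Then one has
$$\pi\left(\alpha_i e^{Z_i}, d_{u,i}\right) = \alpha_i\, e^{E[Z_i] + \frac{\sigma_{Z_i}^2}{2}}\, \Phi\left(\mathrm{sign}(\alpha_i)\, b_{u,i}^{(1)}\right) - d_{u,i}\, \Phi\left(\mathrm{sign}(\alpha_i)\, b_{u,i}^{(2)}\right), \tag{2.74}$$
where, using Lemma 6,
$$b_{u,i}^{(1)} = \frac{E[Z_i] + \sigma_{Z_i}^2 - \ln(d_{u,i})}{\sigma_{Z_i}}, \qquad b_{u,i}^{(2)} = b_{u,i}^{(1)} - \sigma_{Z_i}.$$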
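The building block behind (2.74) is the stop-loss premium of a single lognormal variable. A minimal sketch for the case $\alpha_i = 1$ follows; the function name is hypothetical, and a positive factor $\alpha_i$ can be absorbed by adding $\ln \alpha_i$ to the mean of $Z_i$, since $\alpha e^{Z} = e^{Z + \ln\alpha}$.

```python
import math
from statistics import NormalDist


def lognormal_stop_loss(mean_z, sd_z, d):
    """E[(e^Z - d)_+] for Z ~ N(mean_z, sd_z^2) and retention d > 0."""
    phi = NormalDist().cdf
    b1 = (mean_z + sd_z ** 2 - math.log(d)) / sd_z
    b2 = b1 - sd_z
    return math.exp(mean_z + 0.5 * sd_z ** 2) * phi(b1) - d * phi(b2)


# Illustrative values (assumptions): Z ~ N(0, 0.25^2), retention d = 1.
premium = lognormal_stop_loss(0.0, 0.25, 1.0)
```

As the retention tends to zero the premium tends to the mean $e^{E[Z] + \sigma_Z^2/2}$, which gives a quick sanity check.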
Then the stop-loss premium of Sc with retention d can be computed by
plugging (2.74) into (2.67) and is given by
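$$\pi^{cub}(S, d) = \int_0^1 \sum_{i=1}^{n} F_{X_i}^{-1}(u)\, \pi\left(\alpha_i e^{Z_i}, d_{u,i}\right) du
= \sum_{i=1}^{n} \alpha_i\, e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2} \int_0^1 F_{X_i}^{-1}(u)\, \Phi\left(\mathrm{sign}(\alpha_i)\sigma_{Z_i} - \Phi^{-1}\left(F_{S^c \mid U=u}(d)\right)\right) du \;-\; d\left(1 - F_{S^c}(d)\right). \tag{2.75}$$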
The lower bound
The computations for the lower bound are performed similarly; however,
the quality of the bound depends heavily on the choice of the conditioning
random variables. Recall that from Theorem 11 it follows that
$$S^l = \sum_{i=1}^{n} E[X_i \mid \Gamma]\, E\left[\alpha_i e^{Z_i} \mid \Lambda\right], \tag{2.76}$$
where the first conditioning variable $\Gamma$ is independent of $\Lambda$ and $\vec{Y}$ and
where the second conditioning variable $\Lambda$ is independent of $\Gamma$ and $\vec{X}$. In
this section the choice of $\Gamma$ will not be discussed and the random variable
$\Lambda$ will be assumed to be of the ‘maximal variance’ form (2.54)
$$\Lambda = \sum_{i=1}^{n} \beta_i Z_i = \sum_{i=1}^{n} \alpha_i\, E[X_i]\, e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2}\, Z_i. \tag{2.77}$$
Under these assumptions the vectors of the form $(Z_i, \Lambda)$ have a bivariate
normal distribution. Thus, $Z_i \mid \Lambda = \lambda$ will be normally distributed with
mean $\mu_{i,\lambda}$ and variance $\sigma_{i,\lambda}^2$ given by
$$\mu_{i,\lambda} = E[Z_i] + \frac{\mathrm{Cov}[Z_i, \Lambda]}{\mathrm{Var}[\Lambda]}\left(\lambda - E[\Lambda]\right)$$
and
$$\sigma_{i,\lambda}^2 = \sigma_{Z_i}^2 - \frac{\mathrm{Cov}[Z_i, \Lambda]^2}{\mathrm{Var}[\Lambda]}.$$
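The conditional moments above, and the resulting conditional expectation $E[e^{Z_i} \mid \Lambda = \lambda] = e^{\mu_{i,\lambda} + \sigma_{i,\lambda}^2/2}$, can be written as two small helpers. This is only a sketch; the function names are hypothetical and the inputs are assumed to be the moments of a bivariate normal pair $(Z_i, \Lambda)$.

```python
import math


def conditional_normal(mean_z, var_z, mean_lam, var_lam, cov_z_lam, lam):
    """Mean and variance of Z given Lambda = lam for bivariate normal (Z, Lambda)."""
    cond_mean = mean_z + cov_z_lam / var_lam * (lam - mean_lam)
    cond_var = var_z - cov_z_lam ** 2 / var_lam
    return cond_mean, cond_var


def conditional_lognormal_mean(mean_z, var_z, mean_lam, var_lam, cov_z_lam, lam):
    """E[exp(Z) | Lambda = lam] = exp(conditional mean + conditional variance / 2)."""
    cond_mean, cond_var = conditional_normal(
        mean_z, var_z, mean_lam, var_lam, cov_z_lam, lam
    )
    return math.exp(cond_mean + 0.5 * cond_var)
```

Two limiting cases are easy to check by hand: with zero covariance the conditioning has no effect, and with $Z_i = \Lambda$ the conditional variance vanishes.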
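The lower bound (2.76) can be written out as
$$S^l = \sum_{i=1}^{n} E[X_i \mid \Gamma]\, E\left[\alpha_i e^{Z_i} \mid \Lambda\right]
= \sum_{i=1}^{n} E[X_i \mid \Gamma]\, \alpha_i\, e^{\mu_{i,\Lambda} + \frac{\sigma_{i,\Lambda}^2}{2}}
= \sum_{i=1}^{n} E[X_i \mid \Gamma]\, \alpha_i\, e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2(1 - r_i^2) + \sigma_{Z_i} r_i \Phi^{-1}(U)}, \tag{2.78}$$
with $U$ a standard uniform random variable and correlations given by
$$r_i = \mathrm{Corr}(Z_i, \Lambda) = \frac{\mathrm{Cov}[Z_i, \Lambda]}{\sigma_{Z_i}\sigma_\Lambda}
= \frac{\sum_{j=1}^{n} \alpha_j E[X_j]\, e^{E[Z_j] + \frac{1}{2}\sigma_{Z_j}^2}\, \sigma_{Z_i Z_j}}{\sigma_{Z_i}\sqrt{\sum_{1 \le k,l \le n} \alpha_k \alpha_l E[X_k] E[X_l]\, e^{E[Z_k] + E[Z_l] + \frac{1}{2}(\sigma_{Z_k}^2 + \sigma_{Z_l}^2)}\, \sigma_{Z_k Z_l}}}. \tag{2.79}$$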
Note that the $r_i$'s are non-negative and the random variable $S^l$ is (given a
value $\Gamma = \gamma$) the sum of the components of a comonotonic vector. Thus the
cumulative distribution function of the lower bound $S^l$ can be computed,
similarly to the case of the upper bound $S^c$, in three steps:
1. From (2.78) it follows that the conditional quantiles (given $\Gamma = \gamma$)
can be computed as
$$F_{S^l \mid \Gamma=\gamma}^{-1}(p) = \sum_{i=1}^{n} E[X_i \mid \Gamma = \gamma]\, \alpha_i\, e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2(1 - r_i^2) + \sigma_{Z_i} r_i \Phi^{-1}(p)}; \tag{2.80}$$
2. The conditional distribution function is computed as the solution of
$$\sum_{i=1}^{n} E[X_i \mid \Gamma = \gamma]\, \alpha_i\, e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2(1 - r_i^2) + \sigma_{Z_i} r_i \Phi^{-1}\left(F_{S^l \mid \Gamma=\gamma}(y)\right)} = y; \tag{2.81}$$
3. Finally, the cumulative distribution function of $S^l$ can be derived as
$$F_{S^l}(y) = \int_0^1 F_{S^l \mid \Gamma = F_\Gamma^{-1}(u)}(y)\, du.$$
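The stop-loss premiums are computed as follows. Let us denote
$$d_{\gamma,i} = F_{E[\alpha_i e^{Z_i} \mid \Lambda]}^{-1}\left(F_{S^l \mid \Gamma=\gamma}(d)\right)
= \alpha_i\, e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2(1 - r_i^2) + \sigma_{Z_i} r_i \Phi^{-1}\left(F_{S^l \mid \Gamma=\gamma}(d)\right)}.$$
Then one has
$$\pi\left(E\left[\alpha_i e^{Z_i} \mid \Lambda\right], d_{\gamma,i}\right) = \alpha_i\, e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2}\, \Phi\left(\mathrm{sign}(\alpha_i)\, b_{\gamma,i}^{(1)}\right) - d_{\gamma,i}\, \Phi\left(\mathrm{sign}(\alpha_i)\, b_{\gamma,i}^{(2)}\right), \tag{2.82}$$
with
$$b_{\gamma,i}^{(1)} = \frac{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2(1 - r_i^2) + \sigma_{Z_i}^2 r_i^2 - \ln(d_{\gamma,i})}{\sigma_{Z_i} r_i}, \qquad b_{\gamma,i}^{(2)} = b_{\gamma,i}^{(1)} - \sigma_{Z_i} r_i.$$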
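Then the stop-loss premium of $S^l$ with retention $d$ can be computed by
plugging (2.82) into (2.68) and is given by
$$\pi^{lb}(S, d, \Gamma, \Lambda) = \int_0^1 \sum_{i=1}^{n} E\left[X_i \mid \Gamma = F_\Gamma^{-1}(u)\right] \pi\left(E\left[\alpha_i e^{Z_i} \mid \Lambda\right], d_{\gamma,i}\right) du
= \sum_{i=1}^{n} \alpha_i\, e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2} \int_0^1 E\left[X_i \mid \Gamma = F_\Gamma^{-1}(u)\right] \Phi\left(r_i\sigma_{Z_i} - \Phi^{-1}\left(F_{S^l \mid \Gamma=\gamma}(d)\right)\right) du \;-\; d\left(1 - F_{S^l}(d)\right), \tag{2.83}$$
where $\gamma = F_\Gamma^{-1}(u)$ inside the integrals.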
Moments based approximations
For computing the moments based approximation as defined in (2.65), one
has to calculate the variances of $S$, $S^l$ and $S^c$. In general the problem
is easily solvable for the upper and the lower bound. For the exact
distribution it is more difficult to find a universal solution and the problem
needs to be considered case by case. In the general case one would face the
problem of computing multiple integrals, which usually requires too much
computational time.
Note that the upper and the lower bound of $S$, as described in Subsection
2.5.3, can be seen as special cases of the following random
variable $X$ with general form given by
$$X = \sum_{i=1}^{n} \alpha_i f_i(U)\, g_i(V), \tag{2.84}$$
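where $(\alpha_1, \alpha_2, \ldots, \alpha_n)$ is a vector of non-negative numbers, $f_i(\cdot)$ and $g_i(\cdot)$
are non-negative functions and $U$ and $V$ are two independent standard uniform
random variables. Indeed, in the case of the upper bound one takes
$$f_i(U) = F_{X_i}^{-1}(U) \quad \text{and} \quad g_i(V) = F_{e^{Z_i}}^{-1}(V)$$
and in the case of the lower bound
$$f_i(U) = E[X_i \mid \Gamma] \quad \text{and} \quad g_i(V) = E\left[e^{Z_i} \mid \Lambda\right].$$
The variance of $X$ in expression (2.84) can be computed as follows:
$$\mathrm{Var}[X] = E\left[\mathrm{Var}[X \mid U]\right] + \mathrm{Var}\left[E[X \mid U]\right]
= \int_0^1 \mathrm{Var}\left[\sum_{i=1}^{n} \alpha_i f_i(u) g_i(V)\right] du + \int_0^1 \left(E\left[\sum_{i=1}^{n} \alpha_i f_i(u) g_i(V)\right]\right)^2 du - \left(\int_0^1 E\left[\sum_{i=1}^{n} \alpha_i f_i(u) g_i(V)\right] du\right)^2.$$
Thus the problem of computing the variance of $X$ is always solvable if one
is able to compute the expectation and the variance of random variables
$\hat{X}$ of the form
$$\hat{X} = \sum_{i=1}^{n} \hat{\alpha}_i\, g_i(V),$$
for any vector of non-negative numbers $(\hat{\alpha}_1, \hat{\alpha}_2, \ldots, \hat{\alpha}_n)$ (here $\hat{\alpha}_i = \alpha_i f_i(u)$).
For the comonotonic upper bound (2.70), i.e. $g_i(V) = e^{E[Z_i] + \sigma_{Z_i}\Phi^{-1}(V)}$, the
variance of $\hat{X}$ is given by
$$\mathrm{Var}[\hat{X}] = \sum_{i=1}^{n}\sum_{j=1}^{n} \hat{\alpha}_i \hat{\alpha}_j\, e^{E[Z_i] + E[Z_j] + \frac{\sigma_{Z_i}^2 + \sigma_{Z_j}^2}{2}} \left(e^{\sigma_{Z_i}\sigma_{Z_j}} - 1\right)$$
and for the lower bound (2.76), i.e. $g_i(V) = e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2(1 - r_i^2) + \sigma_{Z_i} r_i \Phi^{-1}(V)}$,
by
$$\mathrm{Var}[\hat{X}] = \sum_{i=1}^{n}\sum_{j=1}^{n} \hat{\alpha}_i \hat{\alpha}_j\, e^{E[Z_i] + E[Z_j] + \frac{\sigma_{Z_i}^2 + \sigma_{Z_j}^2}{2}} \left(e^{r_i r_j \sigma_{Z_i}\sigma_{Z_j}} - 1\right).$$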
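Both double sums have the same shape, so a single helper covers them: setting all $r_i = 1$ gives the comonotonic upper bound. The sketch below assumes fixed non-negative weights (the $\hat{\alpha}_i$ for one value of $u$); the function name and the numerical values are hypothetical.

```python
import math


def lognormal_sum_variance(weights, means, sds, r):
    """Variance of sum_i w_i * exp(E[Z_i] + sd_i^2 (1 - r_i^2) / 2 + sd_i r_i Phi^{-1}(V)).

    With r_i = 1 for all i this is the comonotonic upper-bound formula.
    """
    n = len(weights)
    total = 0.0
    for i in range(n):
        for j in range(n):
            scale = weights[i] * weights[j] * math.exp(
                means[i] + means[j] + 0.5 * (sds[i] ** 2 + sds[j] ** 2)
            )
            total += scale * (math.exp(r[i] * r[j] * sds[i] * sds[j]) - 1.0)
    return total


# Hypothetical inputs, for illustration only.
weights, means, sds = [1.0, 2.0], [0.0, -0.1], [0.2, 0.3]
var_upper = lognormal_sum_variance(weights, means, sds, [1.0, 1.0])
var_lower = lognormal_sum_variance(weights, means, sds, [0.9, 0.95])
```

Since $0 \le r_i \le 1$, every term of the lower-bound sum is dominated by the corresponding upper-bound term, so `var_lower` cannot exceed `var_upper`.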
2.6 Application: the present value of stochastic
cash flows
In this section we derive convex upper and lower bounds for general
discounted cash flows of the form
$$S = \sum_{i=1}^{n} X_i e^{-Y(i)}, \tag{2.85}$$
where the random variables $X_i$ denote future (non-negative) payments due
at time $i$ and $Y(t)$ is a stochastic process describing returns on investment
in the period $(0, t)$.
We give explicit results for convex upper and lower bounds in three
specific cases:
(i) The vector $\ln \vec{X} = \left(\ln(X_1), \ln(X_2), \ldots, \ln(X_n)\right)$ has a multivariate
normal distribution and hence the losses are log-normally distributed.
(ii) The vector $\vec{X} = (X_1, X_2, \ldots, X_n)$ has a multivariate elliptical
distribution. Formally the described methodology is valid only in the case
when $X_i > 0$.
(iii) The yearly payments $X_i$ are independent and identically distributed.
2.6.1 Stochastic returns
We start with a general definition of a Gaussian process.
Definition 7 (Gaussian process).
A stochastic process $\{Y(t) \mid t \ge 0\}$ is called Gaussian if for any
$0 < t_1 < t_2 < \ldots < t_n$ the vector $\left(Y(t_1), Y(t_2), \ldots, Y(t_n)\right)$ has a
multivariate normal distribution.
Gaussian processes have a lot of desirable properties. They are very easy to
handle since they are completely determined by their mean and covariance
functions
m(t) = E[Y (t)] and c(s, t) = Cov[Y (s), Y (t)]. (2.86)
For an introduction to Gaussian processes, see e.g. Karatzas & Shreve
(1991). The normality assumption for modelling returns on investment
has been questioned in the financial literature for the short term setting
(e.g. daily returns — see Schoutens (2003)). In the long term however
Gaussian models provide a satisfactory approximation since the Central
Limit Theorem is applicable under the reasonable assumptions of indepen-
dent returns with finite variance (some empirical evidence is provided e.g.
in Cesari & Cremonini (2003)). Therefore in the framework of this thesis
we restrict ourselves to two simple Gaussian models for future returns Y (t).
More precisely, we will focus on modelling returns by means of a Brownian
motion with drift (the Black & Scholes model) and an Ornstein-Uhlenbeck
process. This limitation is very convenient because it leads to closed-form
formulas for convex upper and lower bounds of future cash flows.
The Black & Scholes setting (B-SM)
We assume that a process $X(t)$ satisfies the following stochastic differential
equation:
$$dX(t) = X(t)\left(\mu + \tfrac{1}{2}\sigma^2\right) dt + X(t)\,\sigma\, dW_1(t), \tag{2.87}$$
where $W_1(t)$ denotes a standard Brownian motion. It is well-known that
(2.87) has a unique solution of the form
$$X(t) = X(0)\, e^{\mu t + \sigma W_1(t)},$$
and thus the return on investment process $Y(t) = \log\left(\frac{X(t)}{X(0)}\right)$ is Gaussian
with mean and covariance functions given by
$$m(t) = \mu t \quad \text{and} \quad c(s, t) = \min(s, t)\,\sigma^2.$$
One of the most important features of the return process $Y(t)$ is the
property of independent increments. Indeed, it is straightforward to verify that
for every $0 < s < t < u$ one has that
$$\mathrm{Cov}\left[Y(u) - Y(t),\; Y(t) - Y(s)\right] = 0.$$
For this reason we often consider yearly rates of return
$$Y_i = Y(i) - Y(i-1) \quad \text{for } i = 1, 2, \ldots \tag{2.88}$$
which are independent and normally distributed with mean equal to $\mu$ and
variance equal to $\sigma^2$.
The Ornstein-Uhlenbeck model (O-UM)
In the Ornstein-Uhlenbeck model the return process is described as
Y (t) = µt+ Z(t),
where Z(t) is the solution of the following stochastic differential equation:
dZ(t) = −aZ(t)dt+ σdW1(t),
with $a$ and $\sigma$ being positive constants. Then $Y(t)$ is again Gaussian with
mean and covariance functions given by
$$m(t) = \mu t \quad \text{and} \quad c(s, t) = \frac{\sigma^2}{2a}\left(e^{-a|t-s|} - e^{-a(t+s)}\right). \tag{2.89}$$
We refer to e.g. Arnold (1974) for more details about the derivation.
Note that for $a = 0$ the Ornstein-Uhlenbeck process degenerates to
an ordinary Brownian motion with drift and is equivalent to the Black &
Scholes setting. When $a > 0$, the process $Y(t)$ no longer has independent
increments. Moreover, it becomes mean reverting. Intuitively, mean reversion
means that the process $Y(t)$ cannot deviate too far from its
mean function $m(t)$; the parameter $a$ measures how strongly paths
of $Y(t)$ are attracted to the mean function. The value $a = 0$ corresponds to
the case of no attraction, and as a consequence the increments
become independent. Figure 2.1 illustrates typical sample paths of
the Ornstein-Uhlenbeck model for different values of the parameter $a$.
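The two covariance functions are short enough to write down directly, which also makes the limiting relation between the models easy to check numerically. This is a minimal sketch; the function names are hypothetical.

```python
import math


def cov_black_scholes(s, t, sigma):
    """c(s, t) = min(s, t) * sigma^2 for the Black & Scholes return process."""
    return sigma ** 2 * min(s, t)


def cov_ornstein_uhlenbeck(s, t, sigma, a):
    """c(s, t) of (2.89); requires a > 0."""
    return sigma ** 2 / (2.0 * a) * (
        math.exp(-a * abs(t - s)) - math.exp(-a * (t + s))
    )
```

For a very small mean-reversion parameter, e.g. `cov_ornstein_uhlenbeck(2.0, 5.0, 0.07, 1e-6)`, the value is numerically close to `cov_black_scholes(2.0, 5.0, 0.07)`, in line with the remark that the case $a = 0$ corresponds to the Black & Scholes setting.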
In particular we will concentrate on the case when Y (i) is defined by one
of these models. Then the sum S in (2.85) has a clear interpretation: it is
the discounted value of future benefits Xi with returns described by one of
the well-known Gaussian models. The input variables of the two discussed
return models are displayed in Table 2.6.
[Figure 2.1: four panels plotting $Y(t)$ against $t$ for $t$ from 0 to 10: a) $a = 0$; b) $a = 0.02$; c) $a = 0.1$; d) $a = 0.5$.]
Figure 2.1: Typical paths for the Ornstein-Uhlenbeck process with mean
µ = 0.05, volatility σ = 0.07 and different values of parameter a.
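Model   Variable              Formula
B-SM    $E[Y(i)]$             $i\mu$
        $\mathrm{Var}[Y(i)]$  $i\sigma^2$
        $\mathrm{Var}[\Lambda]$  $\sum_{j=1}^{n} j\beta_j^2\sigma^2 + \sum_{1 \le j < k \le n} 2j\beta_j\beta_k\sigma^2$
        $\mathrm{Cov}[Y(i),\Lambda]$  $\sum_{j=1}^{n} \min(i,j)\,\beta_j\sigma^2$
O-UM    $E[Y(i)]$             $i\mu$
        $\mathrm{Var}[Y(i)]$  $\frac{\sigma^2}{2a}\left(1 - e^{-2ia}\right)$
        $\mathrm{Var}[\Lambda]$  $\frac{\sigma^2}{2a}\left(\sum_{j=1}^{n} \beta_j^2\left(1 - e^{-2ja}\right) + \sum_{1 \le j < k \le n} 2\beta_j\beta_k\left(e^{-(k-j)a} - e^{-(j+k)a}\right)\right)$
        $\mathrm{Cov}[Y(i),\Lambda]$  $\frac{\sigma^2}{2a}\sum_{j=1}^{n} \beta_j\left(e^{-|i-j|a} - e^{-(i+j)a}\right)$
Table 2.6: Input variables for returns. We take $\Lambda = \sum_{i=1}^{n} \beta_i Y(i)$.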
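The entries of Table 2.6 all follow from the covariance function of the return process: $\mathrm{Cov}[Y(i), \Lambda] = \sum_j \beta_j c(i, j)$ and $\mathrm{Var}[\Lambda] = \sum_{j,k} \beta_j\beta_k c(j, k)$. The sketch below computes both generically and compares the result with the closed form listed for the Black & Scholes model. The function names and the values of `BETA` and `SIGMA` are hypothetical.

```python
import math


def cov_with_lambda(i, beta, cov):
    """Cov[Y(i), Lambda] for Lambda = sum_j beta_j Y(j); cov(s, t) is the covariance function."""
    return sum(b * cov(i, j) for j, b in enumerate(beta, start=1))


def var_lambda(beta, cov):
    """Var[Lambda] as the full double sum over the covariance function."""
    n = len(beta)
    return sum(
        beta[j - 1] * beta[k - 1] * cov(j, k)
        for j in range(1, n + 1)
        for k in range(1, n + 1)
    )


def var_lambda_bs_closed_form(beta, sigma):
    """B-SM row of Table 2.6: sum_j j beta_j^2 sigma^2 + sum_{j<k} 2 j beta_j beta_k sigma^2."""
    n = len(beta)
    diagonal = sum(j * beta[j - 1] ** 2 for j in range(1, n + 1))
    off_diagonal = sum(
        2 * j * beta[j - 1] * beta[k - 1]
        for j in range(1, n + 1)
        for k in range(j + 1, n + 1)
    )
    return sigma ** 2 * (diagonal + off_diagonal)


# Hypothetical inputs, for illustration only.
SIGMA = 0.1
BETA = [0.9, 0.8, 0.7, 0.6]


def bs_cov(s, t):
    return SIGMA ** 2 * min(s, t)
```

Passing a different covariance function, such as the Ornstein-Uhlenbeck one in (2.89), to `cov_with_lambda` and `var_lambda` gives the corresponding O-UM quantities without further changes.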
2.6.2 Lognormally distributed payments
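Consider a sum of the form
$$S_{LN} = \sum_{i=1}^{n} e^{N_i} e^{-Y(i)}, \tag{2.90}$$
where $\vec{N} = (N_1, N_2, \ldots, N_n) = \left(\ln(X_1), \ln(X_2), \ldots, \ln(X_n)\right)$ is a normally
distributed random vector with mean $\vec{\mu}_{\vec{N}} = (\mu_{N_1}, \mu_{N_2}, \ldots, \mu_{N_n})$ and
covariance matrix $\Sigma_{\vec{N}} = \left[\sigma_{ij}^{\vec{N}}\right]_{1 \le i,j \le n}$. The corresponding variances are
denoted by $\sigma_{N_i}^2 := \sigma_{ii}^{\vec{N}}$.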
There are two different approaches to derive convex upper and lower
bounds for $S_{LN}$ as defined in (2.90). In the first approach independent
parts of the scalar product are treated separately (this approach is
consistent with the methodology described in Subsections 2.5.1 and 2.5.3). In
the second approach we treat $S_{LN}$ unidimensionally, by noticing that it
can be rewritten as
$$S_{LN} = \sum_{i=1}^{n} \tilde{X}_i = \sum_{i=1}^{n} e^{\tilde{N}_i}, \tag{2.91}$$
where $\vec{\tilde{N}} = (\tilde{N}_1, \tilde{N}_2, \ldots, \tilde{N}_n) = \left(N_1 - Y(1), N_2 - Y(2), \ldots, N_n - Y(n)\right)$ has
a multivariate normal distribution with parameters
$$\vec{\mu}_{\tilde{N}} = \left(\mu_{\tilde{N}_1}, \mu_{\tilde{N}_2}, \ldots, \mu_{\tilde{N}_n}\right) \quad \text{and} \quad \Sigma_{\tilde{N}} = \left[\sigma_{ij}^{\tilde{N}}\right]_{1 \le i,j \le n}, \tag{2.92}$$
with
$$\mu_{\tilde{N}_i} = \mu_{N_i} - m(i) \quad \text{and} \quad \sigma_{ij}^{\tilde{N}} = \sigma_{ij}^{\vec{N}} + c(i, j),$$
where $m(\cdot)$ and $c(\cdot, \cdot)$ denote the mean and covariance functions of the process
$Y(\cdot)$, as defined in (2.86). We further use the notations
$\sigma_{\tilde{N}_i}^2 := \sigma_{ii}^{\tilde{N}}$, $\mu_i := -m(i)$ and $\sigma_i^2 := c(i, i)$. Thus one can derive convex upper
and lower bounds of (2.91) just by adapting the methodology described in
Section 2.3.4.
Below we work out both approaches explicitly. The main advantage of
the first method is a better recognition of the dependency structure and
this results in more precise estimates (especially the upper bound). On
the other hand the second method is much less time-consuming because
the problem is reduced to only one dimension.
The upper bound
The upper bound can be written as
$$S_{LN}^c = \sum_{i=1}^{n} e^{\mu_{N_i} + \mu_i + \sigma_{N_i}\Phi^{-1}(U) + \sigma_i\Phi^{-1}(V)}$$
and its distribution function can be computed as described in Subsection
2.5.3.
The lower bound
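To compute the lower bound we propose to define a conditioning random
variable $\Gamma$ symmetrically to the conditioning variable $\Lambda$, i.e.
$$\Gamma = \sum_{i=1}^{n} E\left[e^{-Y(i)}\right] e^{\mu_{N_i} + \frac{1}{2}\sigma_{N_i}^2}\, N_i = \sum_{i=1}^{n} e^{\mu_{N_i} + \mu_i + \frac{1}{2}\left(\sigma_{N_i}^2 + \sigma_i^2\right)}\, N_i.$$
The conditioning variable $\Lambda$ is chosen as in (2.77), which gives after the
obvious substitution
$$\Lambda = -\sum_{i=1}^{n} e^{\mu_{N_i} + \mu_i + \frac{1}{2}\left(\sigma_{N_i}^2 + \sigma_i^2\right)}\, Y(i). \tag{2.93}$$
Now the corresponding lower bound can be written as
$$S_{LN}^{l1} = \sum_{i=1}^{n} e^{\mu_{N_i} + \mu_i + \frac{1}{2}\sigma_{N_i}^2\left(1 - r_{N_i}^2\right) + \frac{1}{2}\sigma_i^2\left(1 - r_i^2\right) + \sigma_{N_i} r_{N_i}\Phi^{-1}(U) + \sigma_i r_i\Phi^{-1}(V)},$$
where the correlations $r_i = r(-Y(i), \Lambda)$ are defined as in (2.79) and
$$r_{N_i} = r(N_i, \Gamma) = \frac{\sum_{j=1}^{n} e^{\mu_{N_j} + \mu_j + \frac{1}{2}\left(\sigma_{N_j}^2 + \sigma_j^2\right)}\, \sigma_{ij}^{\vec{N}}}{\sigma_{N_i}\sqrt{\sum_{k,l=1}^{n} e^{\mu_{N_k} + \mu_{N_l} + \mu_k + \mu_l + \frac{1}{2}\left(\sigma_{N_k}^2 + \sigma_{N_l}^2 + \sigma_k^2 + \sigma_l^2\right)}\, \sigma_{kl}^{\vec{N}}}}.$$
Its distribution function can be computed by conditioning on $U$, as
described in Section 2.5.3.
From Remark 1 it follows that
$$S_{LN}^c \le_{cx} \sum_{i=1}^{n} F_{e^{\tilde{N}_i}}^{-1}(U),$$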
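and thus we do not consider the comonotonic upper bound for (2.91). To
compute the lower bound we apply directly the results of Section 2.3.4.
Therefore, we take as conditioning random variable
$$\tilde{\Lambda} = \sum_{i=1}^{n} e^{\mu_{\tilde{N}_i} + \frac{1}{2}\sigma_{\tilde{N}_i}^2}\, \tilde{N}_i. \tag{2.94}$$
Then the lower bound is given explicitly as
$$S_{LN}^{l2} = \sum_{i=1}^{n} e^{\mu_{\tilde{N}_i} + \frac{1}{2}\sigma_{\tilde{N}_i}^2\left(1 - r_{\tilde{N}_i}^2\right) + \sigma_{\tilde{N}_i} r_{\tilde{N}_i}\Phi^{-1}(U)},$$
where
$$r_{\tilde{N}_i} = r(\tilde{N}_i, \tilde{\Lambda}) = \frac{\sum_{j=1}^{n} e^{\mu_{\tilde{N}_j} + \frac{1}{2}\sigma_{\tilde{N}_j}^2}\, \sigma_{ij}^{\tilde{N}}}{\sigma_{\tilde{N}_i}\sqrt{\sum_{k,l=1}^{n} e^{\mu_{\tilde{N}_k} + \mu_{\tilde{N}_l} + \frac{1}{2}\left(\sigma_{\tilde{N}_k}^2 + \sigma_{\tilde{N}_l}^2\right)}\, \sigma_{kl}^{\tilde{N}}}}.$$
Note that in order to obtain a comonotonic lower bound one has to ensure
additionally that $r_{\tilde{N}_i} > 0$ for all $i$.
Suppose that this lower bound is comonotonic. Then its quantiles are
given by a closed-form expression:
$$F_{S_{LN}^{l2}}^{-1}(p) = \sum_{i=1}^{n} e^{\mu_{\tilde{N}_i} + \frac{1}{2}\sigma_{\tilde{N}_i}^2\left(1 - r_{\tilde{N}_i}^2\right) + \sigma_{\tilde{N}_i} r_{\tilde{N}_i}\Phi^{-1}(p)},$$
from which one can easily find values of the corresponding distribution
function e.g. by means of the Newton-Raphson method.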
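The second approach can be sketched end to end: build the moments of the combined exponents, compute the correlations with the conditioning variable, evaluate the closed-form quantile, and invert it numerically. The code below is an illustration under stated assumptions, not the thesis implementation: bisection replaces Newton-Raphson, the returns follow the Black & Scholes model, the payments are independent of the returns, and the horizon `n = 5` and all parameter values and function names are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def lower_bound_inputs(mu_tilde, cov_tilde):
    """Standard deviations and correlations r_i with Lambda = sum_j exp(mu_j + var_j / 2) N_j."""
    n = len(mu_tilde)
    sd = [math.sqrt(cov_tilde[i][i]) for i in range(n)]
    beta = [math.exp(mu_tilde[j] + 0.5 * cov_tilde[j][j]) for j in range(n)]
    var_lam = sum(
        beta[k] * beta[l] * cov_tilde[k][l] for k in range(n) for l in range(n)
    )
    r = [
        sum(beta[j] * cov_tilde[i][j] for j in range(n)) / (sd[i] * math.sqrt(var_lam))
        for i in range(n)
    ]
    return sd, r


def lower_bound_quantile(p, mu_tilde, sd, r):
    """Closed-form quantile of the comonotonic lower bound (all r_i > 0)."""
    q = STD_NORMAL.inv_cdf(p)
    return sum(
        math.exp(m + 0.5 * s * s * (1.0 - ri * ri) + s * ri * q)
        for m, s, ri in zip(mu_tilde, sd, r)
    )


def lower_bound_cdf(y, mu_tilde, sd, r, tol=1e-10):
    """Invert the increasing quantile function by bisection."""
    lo, hi = 1e-12, 1.0 - 1e-12
    if lower_bound_quantile(lo, mu_tilde, sd, r) >= y:
        return 0.0
    if lower_bound_quantile(hi, mu_tilde, sd, r) <= y:
        return 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if lower_bound_quantile(mid, mu_tilde, sd, r) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical setup: n = 5 payments, Black & Scholes returns.
n = 5
mu_n, var_n = -0.5 * math.log(1.01), math.log(1.01)
drift, vol = 0.05, 0.1
corr = {0: 1.0, 1: 0.5, 2: 0.2}  # correlation of N_i and N_j by lag |i - j|
mu_tilde = [mu_n - drift * i for i in range(1, n + 1)]
cov_tilde = [
    [var_n * corr.get(abs(i - j), 0.0) + vol ** 2 * min(i, j) for j in range(1, n + 1)]
    for i in range(1, n + 1)
]
sd, r = lower_bound_inputs(mu_tilde, cov_tilde)
median = lower_bound_quantile(0.5, mu_tilde, sd, r)
```

Bisection is chosen here only because it needs no derivative and cannot overshoot; any root finder for a strictly increasing function would serve equally well.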
The moments based approximation
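It is also possible to derive the moments based approximations $S^{m1}$ and
$S^{m2}$ as described in (2.65) since there are explicit solutions for the
variances:
$$\mathrm{Var}[S_{LN}] = \sum_{i=1}^{n}\sum_{j=1}^{n} e^{\mu_{\tilde{N}_i} + \mu_{\tilde{N}_j} + \frac{1}{2}\left(\sigma_{\tilde{N}_i}^2 + \sigma_{\tilde{N}_j}^2\right)}\left(e^{\sigma_{ij}^{\tilde{N}}} - 1\right),$$
$$\mathrm{Var}[S_{LN}^c] = \sum_{i=1}^{n}\sum_{j=1}^{n} e^{\mu_{\tilde{N}_i} + \mu_{\tilde{N}_j} + \frac{1}{2}\left(\sigma_{\tilde{N}_i}^2 + \sigma_{\tilde{N}_j}^2\right)}\left(e^{\sigma_{N_i}\sigma_{N_j} + \sigma_i\sigma_j} - 1\right),$$
$$\mathrm{Var}[S_{LN}^{l1}] = \sum_{i=1}^{n}\sum_{j=1}^{n} e^{\mu_{\tilde{N}_i} + \mu_{\tilde{N}_j} + \frac{1}{2}\left(\sigma_{\tilde{N}_i}^2 + \sigma_{\tilde{N}_j}^2\right)}\left(e^{r_{N_i} r_{N_j}\sigma_{N_i}\sigma_{N_j} + r_i r_j\sigma_i\sigma_j} - 1\right),$$
$$\mathrm{Var}[S_{LN}^{l2}] = \sum_{i=1}^{n}\sum_{j=1}^{n} e^{\mu_{\tilde{N}_i} + \mu_{\tilde{N}_j} + \frac{1}{2}\left(\sigma_{\tilde{N}_i}^2 + \sigma_{\tilde{N}_j}^2\right)}\left(e^{r_{\tilde{N}_i} r_{\tilde{N}_j}\sigma_{\tilde{N}_i}\sigma_{\tilde{N}_j}} - 1\right).$$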
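After obvious substitutions in formulas (2.75) and (2.83) one gets the
following expressions for stop-loss premiums in the first approach:
$$\pi^{cub}(S_{LN}, d) = \sum_{i=1}^{n} e^{\mu_i + \frac{1}{2}\sigma_i^2} \int_0^1 e^{\mu_{N_i} + \sigma_{N_i}\Phi^{-1}(u)}\, \Phi\left(\sigma_i - \Phi^{-1}\left(F_{S_{LN}^c \mid U=u}(d)\right)\right) du \;-\; d\left(1 - F_{S_{LN}^c}(d)\right),$$
$$\pi^{lb1}(S_{LN}, d, \Gamma, \Lambda) = \sum_{i=1}^{n} e^{\mu_i + \frac{1}{2}\sigma_i^2} \int_0^1 e^{\mu_{N_i} + \frac{1}{2}\left(1 - r_{N_i}^2\right)\sigma_{N_i}^2 + r_{N_i}\sigma_{N_i}\Phi^{-1}(u)}\, \Phi\left(r_i\sigma_i - \Phi^{-1}\left(F_{S_{LN}^{l1} \mid \Gamma = F_\Gamma^{-1}(u)}(d)\right)\right) du \;-\; d\left(1 - F_{S_{LN}^{l1}}(d)\right).$$
In the second approach the expression for stop-loss premiums of the lower
bound follows straightforwardly from (2.40):
$$\pi^{lb2}(S_{LN}, d, \tilde{\Lambda}) = \sum_{i=1}^{n} e^{\mu_{\tilde{N}_i} + \frac{1}{2}\sigma_{\tilde{N}_i}^2}\, \Phi\left(r_{\tilde{N}_i}\sigma_{\tilde{N}_i} - \Phi^{-1}\left(F_{S_{LN}^{l2}}(d)\right)\right) - d\left(1 - F_{S_{LN}^{l2}}(d)\right).$$
Finally, the corresponding stop-loss premiums for the moments based
approximations are given by
$$\pi^{m1}(S_{LN}, d) = z_1\,\pi^{lb1}(S_{LN}, d) + (1 - z_1)\,\pi^{cub}(S_{LN}, d),$$
$$\pi^{m2}(S_{LN}, d) = z_2\,\pi^{lb2}(S_{LN}, d) + (1 - z_2)\,\pi^{cub}(S_{LN}, d),$$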
$p$       $S_{LN}^{l1}$   $S_{LN}^{l2}$   $S_{LN}^{m1}$   $S_{LN}^{m2}$   $S_{LN}^{c}$    MC (s.e. × 10^4)
0.75    14.6818   14.6822   14.6847   14.6839   15.0295   14.6795 (0.71)
0.90    17.0976   17.1024   17.1067   17.1078   18.0976   17.1019 (1.06)
0.95    18.7642   18.7723   18.7788   18.7815   20.2580   18.7769 (1.45)
0.975   20.3631   20.3753   20.3843   20.3882   22.3610   20.3881 (2.08)
0.995   23.9603   23.9823   24.0032   24.0082   27.1914   24.0237 (4.59)
Table 2.7: Approximations for some selected quantiles with probability
level $p$ of $S_{LN}$.
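where
$$z_1 = \frac{\mathrm{Var}[S_{LN}^c] - \mathrm{Var}[S_{LN}]}{\mathrm{Var}[S_{LN}^c] - \mathrm{Var}[S_{LN}^{l1}]} \quad \text{and} \quad z_2 = \frac{\mathrm{Var}[S_{LN}^c] - \mathrm{Var}[S_{LN}]}{\mathrm{Var}[S_{LN}^c] - \mathrm{Var}[S_{LN}^{l2}]}.$$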
A numerical illustration
We examine the accuracy and efficiency of the derived approximations for
the present value of a cash flow with lognormally distributed payments.
For the purpose of this numerical illustration we choose the parameters
$\mu_{N_i} = -\frac{\ln(1.01)}{2}$ and $\sigma^2_{N_i} = \ln(1.01)$ (note that these values
correspond to $E[X] = 1$ and $\mathrm{Var}[X] = 0.01$). Moreover, we allow for some
dependence between the payments by imposing correlations between the
normal exponents:

$$r(N_i, N_j) = \begin{cases} 1 & \text{if } i = j, \\ 0.5 & \text{if } |i-j| = 1, \\ 0.2 & \text{if } |i-j| = 2, \\ 0 & \text{if } |i-j| > 2. \end{cases}$$

We restrict ourselves to the case of a Black & Scholes setting with drift
µ = 0.05 and volatility σ = 0.1. We compare the distribution functions of
the upper bound $S^{c}_{LN}$ and the lower bounds $S^{l1}_{LN}$ (obtained by taking two
conditioning random variables) and $S^{l2}_{LN}$ (with one conditioning variable)
with the original distribution function of $S_{LN}$, obtained by means of a
Monte Carlo (MC) simulation based on generating 500 × 100 000 sample
paths.
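The parameter choice above can be checked with a few lines of Python. The sketch below is only illustrative: the helper names `lognormal_params` and `band_correlation` and the matrix size 5 are assumptions introduced here, not part of the original study. It recovers the normal parameters from the target mean and variance of a lognormal payment and builds the banded correlation matrix of the exponents.

```python
import math

def lognormal_params(mean, var):
    """Return (mu, sigma2) of N such that X = exp(N) has the given mean and variance."""
    sigma2 = math.log(1.0 + var / mean ** 2)
    mu = math.log(mean) - 0.5 * sigma2
    return mu, sigma2

def band_correlation(n):
    """Correlation matrix with r = 1, 0.5, 0.2 at lags 0, 1, 2 and 0 beyond."""
    by_lag = {0: 1.0, 1: 0.5, 2: 0.2}
    return [[by_lag.get(abs(i - j), 0.0) for j in range(n)] for i in range(n)]

mu_N, sigma2_N = lognormal_params(mean=1.0, var=0.01)

# Moments of X = exp(N) recomputed from (mu_N, sigma2_N).
mean_X = math.exp(mu_N + 0.5 * sigma2_N)
var_X = (math.exp(sigma2_N) - 1.0) * math.exp(2.0 * mu_N + sigma2_N)

R = band_correlation(5)  # size 5 is an arbitrary illustrative choice
```

Here `mu_N` and `sigma2_N` reproduce $-\ln(1.01)/2$ and $\ln(1.01)$, and the recomputed moments return the targets 1 and 0.01.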
Table 2.7 illustrates the performance of the different approximations.
One can see that the upper bound $S^{c}_{LN}$ gives a poor approximation. The
main reason is the relatively weak dependence between the payments,
for which the comonotonic approximation significantly overestimates the
tails. On the other hand, both lower bounds $S^{l1}_{LN}$ and $S^{l2}_{LN}$ give excellent
approximations. The performance of the second lower bound is perhaps
surprising: the results for one conditioning random variable turn out to be
no less accurate than those for two conditioning random variables. The
table also includes the two moments based approximations $S^{m1}_{LN}$ and
$S^{m2}_{LN}$, which perform excellently as well.

Finally, the stop-loss premiums for the different approximations are
compared in Table 2.8. This study confirms the high accuracy of the
lower bounds and moments based approximations, which are very close to
the Monte Carlo estimates. The overestimation of the stop-loss premiums
by the convex upper bound is considerable.

  d     S^{l1}_LN   S^{l2}_LN   S^{m1}_LN   S^{m2}_LN   S^c_LN     MC (s.e. x 10^4)
  0     12.8928     12.8928     12.8928     12.8928     12.8928    12.8931 (4.37)
  5      7.8928      7.8928      7.8928      7.8928      7.8931     7.8931 (4.37)
  10     3.0854      3.0856      3.0871      3.0866      3.2521     3.0870 (4.11)
  15     0.5589      0.5602      0.5615      0.5618      0.8216     0.5613 (2.14)
  20     0.0658      0.0663      0.0668      0.0669      0.1647     0.0672 (0.72)
  25     0.0070      0.0071      0.0072      0.0072      0.0315     0.0074 (0.25)
  30     0.0008      0.0008      0.0008      0.0008      0.0062     0.0008 (0.08)

Table 2.8: Approximations for some selected stop-loss premiums with
retention d of S_LN.
2.6.3 Elliptically distributed payments
The class of elliptical distributions is a natural extension of the normal
law. We say that a random vector $\vec X = (X_1, X_2, \ldots, X_n)$ has an
$n$-dimensional elliptical distribution with parameters $\vec\mu = (\mu_1, \mu_2, \ldots, \mu_n)$,
$\Sigma = [\sigma_{ij}]_{1\le i,j\le n}$ (a symmetric and positive definite matrix) and
characteristic generator $\phi(\cdot)$ if the characteristic function of $\vec X$ is given by

$$\varphi_{\vec X}(\vec t) = e^{i\,\vec t\,'\vec\mu}\,\phi\left(\vec t\,'\Sigma\,\vec t\right).$$

We write $\vec X \sim E_n(\vec\mu, \Sigma, \phi)$. Obviously the normal distribution satisfies this
definition, with $\phi(y) = e^{-\frac{1}{2}y}$. Elliptical distributions are very useful for
several reasons. First of all, they are very easy to manipulate because they
inherit surprisingly many properties from the normal law. Secondly, the
normal distribution itself is not very flexible in modelling tails (in practice
we often encounter much heavier tails than the Gaussian ones), whereas the
class of elliptical laws offers a full variety of distributions, from very
heavy-tailed ones (like Cauchy or stable distributions), through distributions
with tails of the polynomial type (Student's t) and the exponentially-tailed
Laplace and logistic distributions, to the light-tailed Gaussian distribution.
Below we give a brief overview of the properties of elliptical distributions.
For more information about elliptical distributions we refer to Fang
et al. (1990). The generalization of some of the results on comonotonic
bounds for $\sum_{i=1}^{n} X_i$ to the multivariate elliptical case can be found in
Valdez & Dhaene (2004).

1. $E[X_i] = \mu_i$, $\mathrm{Var}[X_i] = -2\phi'(0)\sigma_{ii}$ and $\mathrm{Cov}[X_i, X_j] = -2\phi'(0)\sigma_{ij}$,
   provided the corresponding moments exist. Here, $\phi'(\cdot)$ is the first
   derivative of the characteristic generator $\phi(\cdot)$.

2. Let $\vec Y = A\vec X + \vec b$, where $A$ denotes an $m \times n$ matrix and $\vec b$ is a vector
   in $\mathbb{R}^m$. Then $\vec Y \sim E_m\left(A\vec\mu + \vec b, A\Sigma A', \phi\right)$.

3. If the density function $f_{\vec X}(\cdot)$ exists, it is given by the formula

   $$f_{\vec X}(\vec x) = \frac{c}{\sqrt{\det[\Sigma]}}\; g\left((\vec x - \vec\mu)'\,\Sigma^{-1}(\vec x - \vec\mu)\right)$$

   for any non-negative function $g$ satisfying

   $$0 < \int_0^\infty z^{\frac{n}{2}-1} g(z)\,dz < \infty$$

   and $c$ being a normalizing constant. The function $g(\cdot)$ is called the
   density generator of the distribution $E_n(\vec\mu, \Sigma, \phi)$. A detailed proof
   of these results, using spherical transformations of rectangular
   coordinates, can be found in Landsman & Valdez (2002).

4. Let $\vec X = (\vec X_1, \vec X_2)$ denote an $E_{n+m}(\vec\mu, \Sigma, \phi)$-distributed random vector,
   where $\vec\mu = (\vec\mu_1, \vec\mu_2)$ and

   $$\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}.$$

   Then, conditionally on $\vec X_2 = \vec x_2$, the vector $\vec X_1$ has the
   $E_n(\vec\mu_{1|2}, \Sigma_{11|2}, \phi_{x_2})$-distribution with parameters given by

   $$\vec\mu_{1|2} = \vec\mu_1 + \Sigma_{12}\Sigma_{22}^{-1}\left(\vec x_2 - \vec\mu_2\right) \quad\text{and}\quad \Sigma_{11|2} = \Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}.$$

   Notice that in general (unlike in the normal case) the characteristic
   generator of the conditional distribution is not known explicitly and
   depends on the value of $\vec x_2$.
Consider now sums of the form

$$S_{el} = \sum_{i=1}^{n} X_i\, e^{-Y(i)},$$

where the return process $Y(t)$ is, as in the previous example, described
by the Black & Scholes model and $\vec X = (X_1, X_2, \ldots, X_n)$ is elliptically
distributed with parameters $\vec\mu_{\vec X} = (\mu_{X_1}, \mu_{X_2}, \ldots, \mu_{X_n})$,
$\Sigma_{\vec X} = [\sigma^{\vec X}_{ij}]_{1\le i,j\le n}$ and characteristic generator $\phi(\cdot)$. We recall
that for $\phi(u) = e^{-\frac{u}{2}}$ one gets a multivariate normal distribution with mean
vector $\vec\mu_{\vec X}$ and covariance matrix $\Sigma_{\vec X}$.

Note that elliptical random variables take both positive and negative
values, and therefore Theorem 11 cannot be applied immediately. We
propose, pragmatically, to consider only the cases where the probability
$\Pr[X_i < 0]$ is very small. This can be achieved by choosing the parameters
in such a way that the ratio $\mu_{X_i}/\sigma_{X_i}$ is much larger than 0, where we use
the conventional notation $\sigma^2_{X_i} := \sigma^{\vec X}_{ii}$.
The upper bound
The computation of the upper bound is straightforward if the inverse
distribution function of the specific elliptical distribution is available in
the software package. The comonotonic upper bound is given by

$$S^{c}_{el} = \sum_{i=1}^{n} F^{-1}_{E_n\left(\mu_{X_i},\,\sigma^2_{X_i},\,\phi\right)}(U)\; e^{\mu_i+\sigma_i\Phi^{-1}(V)}, \tag{2.95}$$

where by convention $\mu_i = -m(i)$ and $\sigma_i^2 = c(i, i)$, with $m(\cdot)$ and $c(\cdot,\cdot)$
denoting the mean and covariance functions of the process $Y(i)$ described
previously in this subsection.

Note that for the most interesting case of a multivariate normal
distribution one gets

$$S^{c}_{N} = \sum_{i=1}^{n} \left(\mu_{X_i} + \sigma_{X_i}\Phi^{-1}(U)\right) e^{\mu_i+\sigma_i\Phi^{-1}(V)}.$$
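As a small illustration of the normal-case formula, the following Python sketch evaluates $S^c_N$ at given values $u$ and $v$ of the two uniform random variables. The function name and all inputs are assumptions made for this example: 20 payments with mean 1 and standard deviation 0.1, and a return process with $m(i) = 0.05\,i$ and $c(i,i) = 0.01\,i$.

```python
import math
from statistics import NormalDist

PHI = NormalDist()  # standard normal distribution

def comonotonic_upper_bound_normal(u, v, mu_X, sigma_X, m, s):
    """Evaluate S^c_N at (U, V) = (u, v).

    mu_X, sigma_X: means and standard deviations of the payments X_i.
    m, s: means and standard deviations of Y(i), so mu_i = -m[i], sigma_i = s[i].
    """
    zu, zv = PHI.inv_cdf(u), PHI.inv_cdf(v)
    return sum((mx + sx * zu) * math.exp(-mi + si * zv)
               for mx, sx, mi, si in zip(mu_X, sigma_X, m, s))

# Illustrative inputs (assumptions, see the lead-in).
n = 20
mu_X = [1.0] * n
sigma_X = [0.1] * n
m = [0.05 * i for i in range(1, n + 1)]
s = [0.1 * math.sqrt(i) for i in range(1, n + 1)]

median_point = comonotonic_upper_bound_normal(0.5, 0.5, mu_X, sigma_X, m, s)
stressed = comonotonic_upper_bound_normal(0.95, 0.95, mu_X, sigma_X, m, s)
```

Because every term is non-decreasing in both $u$ and $v$ when the payment quantiles are positive, `stressed` exceeds `median_point`; this monotonicity is what makes the sum comonotonic in each of the two sources of randomness.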
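The corresponding expressions for stop-loss premiums are given by

$$\pi^{cub}(S_{el}, d) = \sum_{i=1}^{n} e^{\mu_i+\frac{1}{2}\sigma_i^2} \int_0^1 \left\{ F^{-1}_{E_n\left(\mu_{X_i},\,\sigma^2_{X_i},\,\phi\right)}(u)\; \Phi\left(\sigma_i - \Phi^{-1}\left(F_{S^u_{el}|U=u}(d)\right)\right) \right\} du \;-\; d\left(1 - F_{S^u_{el}}(d)\right) \tag{2.96}$$

and

$$\pi^{cub}(S_{N}, d) = \sum_{i=1}^{n} e^{\mu_i+\frac{1}{2}\sigma_i^2} \int_0^1 \left\{ \left(\mu_{X_i} + \sigma_{X_i}\Phi^{-1}(u)\right) \Phi\left(\sigma_i - \Phi^{-1}\left(F_{S^u_{N}|U=u}(d)\right)\right) \right\} du \;-\; d\left(1 - F_{S^u_{N}}(d)\right).$$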
The lower bound
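To compute the lower bound, we define the conditioning random variable
$\Gamma$ as follows:

$$\Gamma = \sum_{j=1}^{n} E\left[e^{-Y(j)}\right] X_j = \sum_{j=1}^{n} e^{\mu_j+\frac{1}{2}\sigma_j^2} X_j.$$

Then the random vector $(X_i, \Gamma)$ has a bivariate elliptical distribution with
parameters $\vec\mu_{\Gamma,i} = (\mu_{X_i}, \mu_\Gamma)$ and $\Sigma_{\Gamma,i} = [\sigma^{\Gamma,i}_{kl}]_{1\le k,l\le 2}$, where

$$\mu_\Gamma = \sum_{j=1}^{n} e^{\mu_j+\frac{1}{2}\sigma_j^2}\mu_{X_j},$$

$$\sigma^2_{X_i} := \sigma^{\Gamma,i}_{11}, \qquad \sigma^{\Gamma,i}_{12} = \sigma^{\Gamma,i}_{21} = \sum_{j=1}^{n} e^{\mu_j+\frac{1}{2}\sigma_j^2}\,\sigma^{\vec X}_{ij}$$

and

$$\sigma^2_\Gamma := \sigma^{\Gamma,i}_{22} = \sum_{j=1}^{n}\sum_{k=1}^{n} e^{\mu_j+\mu_k+\frac{1}{2}\left(\sigma_j^2+\sigma_k^2\right)}\,\sigma^{\vec X}_{jk}.$$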
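From property (4) of the elliptical distributions it follows that, given
$\Gamma = \gamma$, the r.v. $X_i$ is elliptically distributed with parameters

$$\mu_{X_i,\Gamma} = \mu_{X_i} + \frac{\sigma^{\Gamma,i}_{12}}{\sigma^2_\Gamma}\left(\Gamma - \mu_\Gamma\right), \qquad \sigma^2_{X_i,\Gamma} = \sigma^2_{X_i} - \frac{\left(\sigma^{\Gamma,i}_{12}\right)^2}{\sigma^2_\Gamma} \tag{2.97}$$

and an unknown characteristic generator $\phi_a(\cdot)$ depending on
$a = \frac{(\Gamma-\mu_\Gamma)^2}{\sigma^2_\Gamma}$ (recall that in the multivariate normal case the conditional
distribution remains normal). Note that in our application it does not really
matter that the characteristic generator $\phi_a(\cdot)$ is not known; it suffices
to notice that

$$E[X_i \mid \Gamma] = \mu_{X_i,\Gamma} = \mu_{X_i} + \frac{\sigma^{\Gamma,i}_{12}}{\sigma^2_\Gamma}\left(\Gamma - \mu_\Gamma\right).$$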
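The conditional mean above only needs the weights $e^{\mu_j+\frac{1}{2}\sigma_j^2}$ and the matrix $\Sigma_{\vec X}$. A minimal Python sketch follows; the function names, the three-payment example and the assumption $Y(i) \sim N(0.05\,i,\; 0.01\,i)$ used to build the weights are illustrative choices, not taken from the text.

```python
import math

def discount_weights(n, mu=0.05, sigma=0.1):
    # Assumption: Y(i) ~ N(mu*i, sigma^2*i), so E[exp(-Y(i))] = exp(-mu*i + sigma^2*i/2).
    return [math.exp(-mu * i + 0.5 * sigma ** 2 * i) for i in range(1, n + 1)]

def conditional_mean(i, gamma, mu_X, Sigma_X, w):
    """E[X_i | Gamma = gamma] for Gamma = sum_j w[j] * X_j (0-based index i)."""
    n = len(w)
    mu_G = sum(w[j] * mu_X[j] for j in range(n))
    cov_iG = sum(w[j] * Sigma_X[i][j] for j in range(n))   # sigma_12^{Gamma,i}
    var_G = sum(w[j] * w[k] * Sigma_X[j][k]
                for j in range(n) for k in range(n))       # sigma_Gamma^2
    return mu_X[i] + cov_iG / var_G * (gamma - mu_G)

# Small illustrative example with three payments.
n = 3
w = discount_weights(n)
mu_X = [1.0] * n
by_lag = {0: 1.0, 1: 0.5, 2: 0.2}
Sigma_X = [[0.01 * by_lag.get(abs(i - j), 0.0) for j in range(n)] for i in range(n)]
gamma = 3.0
cond = [conditional_mean(i, gamma, mu_X, Sigma_X, w) for i in range(n)]
```

A useful sanity check is that the weighted conditional means add up to the conditioning value, $\sum_i e^{\mu_i+\frac{1}{2}\sigma_i^2} E[X_i \mid \Gamma = \gamma] = \gamma$, because $\sum_i e^{\mu_i+\frac{1}{2}\sigma_i^2}\sigma^{\Gamma,i}_{12} = \sigma^2_\Gamma$.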
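The second conditioning random variable is chosen analogously to (2.93):

$$\Lambda = -\sum_{i=1}^{n} E[X_i]\, e^{\mu_i+\frac{1}{2}\sigma_i^2}\, Y(i) = -\sum_{i=1}^{n} \mu_{X_i}\, e^{\mu_i+\frac{1}{2}\sigma_i^2}\, Y(i).$$

From Section 2.5.1 it follows that the lower bound is given by the following
expression:

$$S^{l}_{el} = \sum_{i=1}^{n} \left(\mu_{X_i} + \frac{\sigma^{\Gamma,i}_{12}}{\sigma^2_\Gamma}\left(F^{-1}_\Gamma(U) - \mu_\Gamma\right)\right) e^{\mu_i+\frac{1}{2}\sigma_i^2\left(1-r_i^2\right)+r_i\sigma_i\Phi^{-1}(V)}, \tag{2.98}$$

where the correlations $r_i = r(-Y(i), \Lambda)$ are defined as in (2.79) (with $E[X_i]$
replaced by $\mu_{X_i}$). Note that expression (2.98) simplifies in the normal
case to

$$S^{l}_{N} = \sum_{i=1}^{n} \left(\mu_{X_i} + r_{X_i}\sigma_{X_i}\Phi^{-1}(U)\right) e^{\mu_i+\frac{1}{2}\sigma_i^2\left(1-r_i^2\right)+r_i\sigma_i\Phi^{-1}(V)}$$

with

$$r_{X_i} = r(X_i, \Gamma) = \frac{\sigma^{\Gamma,i}_{12}}{\sigma_{X_i}\sigma_\Gamma} = \frac{\sum_{j=1}^{n} e^{\mu_j+\frac{1}{2}\sigma_j^2}\,\sigma^{\vec X}_{ij}}{\sigma_{X_i}\sqrt{\sum_{k,l=1}^{n} e^{\mu_k+\mu_l+\frac{1}{2}\left(\sigma_k^2+\sigma_l^2\right)}\,\sigma^{\vec X}_{kl}}}.$$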
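Finally, the corresponding stop-loss premiums are computed according to
the following expressions:

$$\pi^{lb}(S_{el}, d, \Gamma, \Lambda) = \sum_{i=1}^{n} e^{\mu_i+\frac{1}{2}\sigma_i^2} \int_0^1 \left\{ \left(\mu_{X_i} + \frac{\sigma^{\Gamma,i}_{12}}{\sigma^2_\Gamma}\left(F^{-1}_\Gamma(u) - \mu_\Gamma\right)\right) \Phi\left(r_i\sigma_i - \Phi^{-1}\left(F_{S^l_{el}|\Gamma=F^{-1}_\Gamma(u)}(d)\right)\right) \right\} du \;-\; d\left(1 - F_{S^l_{el}}(d)\right),$$

$$\pi^{lb}(S_{N}, d, \Gamma, \Lambda) = \sum_{i=1}^{n} e^{\mu_i+\frac{1}{2}\sigma_i^2} \int_0^1 \left\{ \left(\mu_{X_i} + r_{X_i}\sigma_{X_i}\Phi^{-1}(u)\right) \Phi\left(r_i\sigma_i - \Phi^{-1}\left(F_{S^l_{N}|\Gamma=F^{-1}_\Gamma(u)}(d)\right)\right) \right\} du \;-\; d\left(1 - F_{S^l_{N}}(d)\right).$$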
The moments based approximation
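It is also possible to find the moments based approximation $S^{m}_{N}$ from
formula (2.65), since one can compute the variance of $S_N$ as

$$\begin{aligned}
\mathrm{Var}[S_N] &= E_{\vec X}\left[\mathrm{Var}\left[S_N \mid \vec X\right]\right] + \mathrm{Var}_{\vec X}\left[E\left[S_N \mid \vec X\right]\right] \\
&= E_{\vec X}\left[\sum_{i=1}^{n}\sum_{j=1}^{n} X_iX_j\, e^{\mu_i+\mu_j+\frac{1}{2}\left(\sigma_i^2+\sigma_j^2\right)}\left(e^{\sigma_{ij}}-1\right)\right] + \mathrm{Var}_{\vec X}\left[\sum_{i=1}^{n} X_i\, e^{\mu_i+\frac{1}{2}\sigma_i^2}\right] \\
&= \sum_{i=1}^{n}\sum_{j=1}^{n} \left(\sigma^{\vec X}_{ij} + \mu_{X_i}\mu_{X_j}\right) e^{\mu_i+\mu_j+\frac{1}{2}\left(\sigma_i^2+\sigma_j^2\right)+\sigma_{ij}} - \sum_{i=1}^{n}\sum_{j=1}^{n} \mu_{X_i}\mu_{X_j}\, e^{\mu_i+\mu_j+\frac{1}{2}\left(\sigma_i^2+\sigma_j^2\right)}.
\end{aligned}$$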
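The closed-form variance is easy to evaluate numerically. The Python sketch below is a hedged illustration: the function name is hypothetical, and the inputs (20 yearly payments, $E[Y(i)] = 0.05\,i$, $\mathrm{Cov}[Y(i), Y(j)] = 0.01\min(i,j)$, payments with mean 1, variance 0.01 and correlations 1, 0.5, 0.2 at lags 0, 1, 2) are assumptions chosen to resemble the numerical illustration of this subsection, not values confirmed by the text.

```python
import math

def variance_normal_case(mu_X, Sigma_X, m, C):
    """Var[S_N] for S_N = sum_i X_i exp(-Y(i)), X multivariate normal, independent of Y.

    m[i] = E[Y(i)] and C[i][j] = Cov[Y(i), Y(j)], so mu_i = -m[i], sigma_ij = C[i][j].
    """
    n = len(mu_X)
    total = 0.0
    for i in range(n):
        for j in range(n):
            base = math.exp(-m[i] - m[j] + 0.5 * (C[i][i] + C[j][j]))
            total += (Sigma_X[i][j] + mu_X[i] * mu_X[j]) * base * math.exp(C[i][j])
            total -= mu_X[i] * mu_X[j] * base
    return total

# Illustrative inputs (assumptions, see the lead-in).
n, mu, sigma = 20, 0.05, 0.1
m = [mu * i for i in range(1, n + 1)]
C = [[sigma ** 2 * min(i, j) for j in range(1, n + 1)] for i in range(1, n + 1)]
by_lag = {0: 1.0, 1: 0.5, 2: 0.2}
Sigma_X = [[0.01 * by_lag.get(abs(i - j), 0.0) for j in range(n)] for i in range(n)]
mu_X = [1.0] * n

var_SN = variance_normal_case(mu_X, Sigma_X, m, C)
mean_SN = sum(mu_X[i] * math.exp(-m[i] + 0.5 * C[i][i]) for i in range(n))
```

Under these assumed inputs the mean should come out close to the stop-loss premium at retention 0 in the tables of this section, which gives a quick plausibility check on the parameterisation.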
Here, the variances of the upper and the lower bound are computed as
explained in Section 2.5.3.

We remark that for $\vec X$ having a multivariate elliptical distribution the
computations are almost identical, the only difference being the formula
for the covariances:

$$\mathrm{Cov}[X_i, X_j] = -2\phi'(0)\,\sigma^{\vec X}_{ij}.$$

The stop-loss premium of the moments based approximation is then
obtained as a convex combination

$$\pi^{m}(S_{el}, d, \Gamma, \Lambda) = z\,\pi^{lb}(S_{el}, d, \Gamma, \Lambda) + (1 - z)\,\pi^{cub}(S_{el}, d),$$

where $z$ is defined as in (2.66).
A numerical illustration
We study the case of normally distributed payments with mean $\mu_{X_i} = 1$
and variance $\sigma^2_{X_i} = 0.01$. Note that the mean and the variance are the same
as in the lognormal case. Moreover we assume the following correlation
pattern for the payments:

$$r(X_i, X_j) = \begin{cases} 1 & \text{if } i = j, \\ 0.5 & \text{if } |i-j| = 1, \\ 0.2 & \text{if } |i-j| = 2, \\ 0 & \text{if } |i-j| > 2. \end{cases}$$

As in the previous example, we work in the Black & Scholes setting with
drift parameter µ = 0.05 and volatility σ = 0.1. We compare the
performance of the lower bound $S^{l}_{N}$, the upper bound $S^{c}_{N}$ and the moments
based approximation $S^{m}_{N}$ with the distribution of the present value $S_N$
obtained by a Monte Carlo simulation (MC) based on 500 × 100 000
simulated paths.

The performance of the approximations is illustrated by the numerical
values of some upper quantiles displayed in Table 2.9. The same conclusions
can be drawn as in the lognormal case: the upper bound $S^{c}_{N}$ gives
a rather poor approximation, while the lower bound $S^{l}_{N}$ and the moments
based approximation perform excellently.

The study of stop-loss premiums in Table 2.10 confirms this observation.

  p       S^l_N      S^m_N      S^c_N      MC (s.e. x 10^3)
  0.75    14.6820    14.6849    15.0368    14.6820 (0.70)
  0.90    17.0978    17.1068    18.0992    17.1025 (1.02)
  0.95    18.7642    18.7787    20.2522    18.7789 (1.46)
  0.975   20.3630    20.3840    22.3456    20.3895 (2.11)
  0.995   23.9599    24.0020    27.1468    24.0354 (4.61)

Table 2.9: Approximations for some selected quantiles with probability
level p of S_N.

  d     S^l_N      S^m_N      S^c_N      MC (s.e. x 10^4)
  0     12.8928    12.8928    12.8928    12.8923 (4.50)
  5      7.8928     7.8929     7.8931     7.8923 (4.50)
  10     3.0855     3.0872     3.2544     3.0863 (4.16)
  15     0.5589     0.5615     0.8213     0.5610 (2.11)
  20     0.0658     0.0668     0.1636     0.0671 (0.74)
  25     0.0070     0.0072     0.0309     0.0073 (0.25)
  30     0.0008     0.0008     0.0060     0.0008 (0.08)

Table 2.10: Approximations for some selected stop-loss premiums with
retention d of S_N.
2.6.4 Independent and identically distributed payments
Finally, we consider the case where the payments $X_i$ are independent and
identically distributed. The independence assumption allows for more
flexibility in modelling the underlying marginal distributions; however,
unlike in the lognormal and elliptical cases, it imposes a rigid condition
on the dependence structure. We start by defining the class of tempered
stable distributions, for which the methodology works particularly efficiently.
Tempered stable distributions
The Tempered Stable law $TS(\delta, a, b)$ for $a, b > 0$ and $0 < \delta < 1$ is a
one-dimensional distribution given by the characteristic function

$$\varphi_{TS}(t; \delta, a, b) = e^{ab - a\left(b^{\frac{1}{\delta}} - 2it\right)^{\delta}}. \tag{2.99}$$
For more details we refer to e.g. Schoutens (2003). This class of distributions
has the special property that the sum of independent and identically
distributed tempered stable random variables is again tempered stable.
This is formalized in the following lemma:

Lemma 9 (Sum of tempered stable random variables).
If $X_i$ are i.i.d. $TS(\kappa, a, b)$-distributed random variables for $i = 1, 2, \ldots, n$,
then their sum $X_1 + X_2 + \cdots + X_n$ is $TS(\kappa, na, b)$-distributed.

Proof. Consider the corresponding characteristic functions. We get

$$\varphi_{X_1+X_2+\cdots+X_n}(t) = \left(\varphi_{TS}(t; \kappa, a, b)\right)^{n} = e^{(na)b - (na)\left(b^{\frac{1}{\kappa}} - 2it\right)^{\kappa}} = \varphi_{TS}(t; \kappa, na, b).$$

The first two moments of a random variable $X \sim TS(\delta, a, b)$ are given by

$$E[X] = 2a\delta\, b^{\frac{\delta-1}{\delta}} \quad\text{and}\quad \mathrm{Var}[X] = 4a\delta(1-\delta)\, b^{\frac{\delta-2}{\delta}}.$$

In the sequel we provide more details about two well-known special
cases: the gamma distribution and the inverse Gaussian distribution.

The gamma distribution Gamma(a, b) corresponds to the limiting case
$\delta \to 0$. The characteristic function of the gamma distribution is
given by

$$\varphi(t; a, b) = \left(1 - \frac{it}{b}\right)^{-a}.$$

Notice that for $X \sim \mathrm{Gamma}(a, b)$ one has $E[X] = \frac{a}{b}$ and $\mathrm{Var}[X] = \frac{a}{b^2}$.

The inverse Gaussian distribution is a member of the class of Tempered
Stable distributions with $\delta = \frac{1}{2}$. Thus, the characteristic function is given
by

$$\varphi(t; a, b) = e^{-a\left(\sqrt{-2it+b^2}\,-\,b\right)}.$$

Moreover the mean and variance of $X \sim IG(a, b)$ are given by $E[X] = \frac{a}{b}$
and $\mathrm{Var}[X] = \frac{a}{b^3}$.
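The relations between the tempered stable law, its moments and the inverse Gaussian special case can be verified numerically. The following Python sketch uses only the characteristic function (2.99); the function names and the parameter values in the checks are arbitrary illustrative choices.

```python
import cmath

def ts_cf(t, delta, a, b):
    """Characteristic function (2.99) of TS(delta, a, b)."""
    return cmath.exp(a * b - a * (b ** (1.0 / delta) - 2j * t) ** delta)

def ts_mean(delta, a, b):
    """E[X] = 2 a delta b^((delta - 1) / delta)."""
    return 2.0 * a * delta * b ** ((delta - 1.0) / delta)

def ts_var(delta, a, b):
    """Var[X] = 4 a delta (1 - delta) b^((delta - 2) / delta)."""
    return 4.0 * a * delta * (1.0 - delta) * b ** ((delta - 2.0) / delta)

def ig_cf(t, a, b):
    """Inverse Gaussian characteristic function, i.e. the case delta = 1/2."""
    return cmath.exp(-a * (cmath.sqrt(-2j * t + b ** 2) - b))

# Mean recovered from the characteristic function by a central difference:
# phi'(0) = i E[X], so the imaginary part of the difference quotient estimates E[X].
h = 1e-5
delta, a, b = 0.4, 1.5, 2.0
numeric_mean = ((ts_cf(h, delta, a, b) - ts_cf(-h, delta, a, b)) / (2.0 * h)).imag
```

With $\delta = \frac{1}{2}$ the moment formulas reduce to $a/b$ and $a/b^3$, and raising `ts_cf` to the power $n$ reproduces the characteristic function with $a$ replaced by $na$, which is the statement of Lemma 9.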
We now consider sums of the form

$$S_{ind} = \sum_{i=1}^{n} X_i\, e^{-Y(i)}, \tag{2.100}$$

where the process $Y(i)$ is defined as in the previous examples and the
payments $X_i$ are independent and follow the law defined by the cdf $F_X(\cdot)$.
The upper bound
The computation of the upper bound is straightforward (as described in
Section 2.5.3):

$$S^{c}_{ind} = F^{-1}_{X}(U) \sum_{i=1}^{n} e^{\mu_i+\sigma_i\Phi^{-1}(V)}. \tag{2.101}$$

The stop-loss premiums for the upper bound are given by an expression
analogous to (2.75), with $S^c$ replaced by $S^{c}_{ind}$.
The lower bound
To compute the lower bound, we start by defining the conditioning random
variables $\Gamma$ and $\Lambda$. Let

$$\Gamma = X_1 + X_2 + \cdots + X_n.$$

If the distribution of the $X_i$ is known, the distribution of the sum $\Gamma$ is
also known. In particular, for gamma distributed $X_i$ the sum $\Gamma$ remains gamma
distributed, and the same holds for inverse Gaussian distributed $X_i$.

As in the previous examples, the conditioning random variable $\Lambda$ is
chosen as

$$\Lambda = -\sum_{i=1}^{n} E[X_i]\, e^{\mu_i+\frac{1}{2}\sigma_i^2}\, Y(i). \tag{2.102}$$

Now the lower bound can be written as

$$S^{l}_{ind} = \frac{1}{n} F^{-1}_{\Gamma}(U) \sum_{i=1}^{n} e^{\mu_i+\frac{1}{2}\left(1-r_i^2\right)\sigma_i^2+r_i\sigma_i\Phi^{-1}(V)},$$

where the correlations $r_i = r\left(-Y(i), \Lambda\right)$ are defined as in (2.79).

Note that the computation of the stop-loss premiums of the lower bound
is straightforward, by applying (2.83) and replacing $S^l$ by $S^{l}_{ind}$.
Cumulative distribution functions
In this case there is a more efficient method to compute the distribution
functions than the one described in Section 2.5.3.

Remark 4. The cumulative distribution function of the product $W$ of two
non-negative independent random variables $X$ and $Y$ can be written as

$$F_W(z) = \int_{-\infty}^{\infty} F_Y\left(\frac{z}{x}\right) dF_X(x) = \int_0^1 F_Y\left(\frac{z}{F_X^{-1}(u)}\right) du. \tag{2.103}$$

Using this result one can compute the cumulative distribution functions of
the upper and the lower bound as

$$F_{S^{c}_{ind}}(y) = \int_0^1 F_X\left(\frac{y}{F^{-1}_{S^c}(v)}\right) dv, \qquad F_{S^{l}_{ind}}(y) = \int_0^1 F_{\frac{1}{n}\Gamma}\left(\frac{y}{F^{-1}_{S^l}(v)}\right) dv,$$

where

$$S^c = \sum_{i=1}^{n} e^{\mu_i+\sigma_i\Phi^{-1}(V)}, \qquad S^l = \sum_{i=1}^{n} e^{\mu_i+\frac{1}{2}\left(1-r_i^2\right)\sigma_i^2+r_i\sigma_i\Phi^{-1}(V)},$$

$$F^{-1}_{S^c}(v) = \sum_{i=1}^{n} e^{\mu_i+\sigma_i\Phi^{-1}(v)}, \qquad F^{-1}_{S^l}(v) = \sum_{i=1}^{n} e^{\mu_i+\frac{1}{2}\left(1-r_i^2\right)\sigma_i^2+r_i\sigma_i\Phi^{-1}(v)}.$$
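The one-dimensional integral for $F_{S^c_{ind}}$ lends itself to simple quadrature. The Python sketch below is one possible implementation under stated assumptions: payments are Gamma(100, 100) (integer shape, so the Erlang form of the cdf can be used), $\mu_i = -0.05\,i$, $\sigma_i = 0.1\sqrt{i}$, 20 payments, and a midpoint rule with 2000 nodes. The function names and the quadrature rule are choices made for this illustration, not the method used for the reported tables.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def erlang_cdf(x, shape, rate):
    """CDF of Gamma(shape, rate) for integer shape: 1 - sum_{k<shape} e^{-l} l^k / k!."""
    if x <= 0.0:
        return 0.0
    lam = rate * x
    term = math.exp(-lam)   # k = 0 term of the Poisson sum
    tail = term
    for k in range(1, shape):
        term *= lam / k
        tail += term
    return max(0.0, 1.0 - tail)

def inv_cdf_Sc(v, n, mu, sigma):
    """F^{-1}_{S^c}(v) = sum_i exp(mu_i + sigma_i * Phi^{-1}(v)), mu_i = -mu*i, sigma_i = sigma*sqrt(i)."""
    z = STD_NORMAL.inv_cdf(v)
    return sum(math.exp(-mu * i + sigma * math.sqrt(i) * z) for i in range(1, n + 1))

def cdf_Sc_ind(y, n=20, mu=0.05, sigma=0.1, shape=100, rate=100.0, nodes=2000):
    """Midpoint-rule approximation of int_0^1 F_X(y / F^{-1}_{S^c}(v)) dv."""
    total = 0.0
    for k in range(nodes):
        v = (k + 0.5) / nodes
        total += erlang_cdf(y / inv_cdf_Sc(v, n, mu, sigma), shape, rate)
    return total / nodes

values = [cdf_Sc_ind(y) for y in (5.0, 12.0, 20.2563, 40.0)]
```

The same structure applies to the lower bound: replace `inv_cdf_Sc` by the quantile function of $S^l$ and the payment cdf by the cdf of $\Gamma/n$. Quantiles can then be obtained by a root search on `cdf_Sc_ind`.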
The moments based approximation
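The moments based approximation of $S_{ind}$ can be found in a similar way
to the moments based approximation for elliptical distributions. The key
step is to compute the variance of $S_{ind}$. Since the payments are independent,
$E[X_iX_j] = E[X_i]E[X_j]$ for $i \ne j$ and $E[X_i^2] = (E[X_i])^2 + \mathrm{Var}[X_i]$, so that

$$\begin{aligned}
\mathrm{Var}[S_{ind}] &= E_{\vec X}\left[\mathrm{Var}\left[S_{ind} \mid \vec X\right]\right] + \mathrm{Var}_{\vec X}\left[E\left[S_{ind} \mid \vec X\right]\right] \\
&= E_{\vec X}\left[\sum_{i=1}^{n}\sum_{j=1}^{n} X_iX_j\, e^{\mu_i+\mu_j+\frac{1}{2}\left(\sigma_i^2+\sigma_j^2\right)}\left(e^{\sigma_{ij}}-1\right)\right] + \mathrm{Var}_{\vec X}\left[\sum_{i=1}^{n} X_i\, e^{\mu_i+\frac{1}{2}\sigma_i^2}\right] \\
&= \sum_{i=1}^{n}\sum_{j=1}^{n} E[X_i]E[X_j]\, e^{\mu_i+\mu_j+\frac{1}{2}\left(\sigma_i^2+\sigma_j^2\right)}\left(e^{\sigma_{ij}}-1\right) + \sum_{i=1}^{n} \mathrm{Var}[X_i]\, e^{2\mu_i+2\sigma_i^2}.
\end{aligned} \tag{2.104}$$

The variances of the upper and the lower bound are computed as explained
in Subsection 2.5.3.

Consequently, the stop-loss premium of the moments based approximation
is obtained as a convex combination

$$\pi^{m}(S_{ind}, d, \Gamma, \Lambda) = z\,\pi^{lb}(S_{ind}, d, \Gamma, \Lambda) + (1 - z)\,\pi^{cub}(S_{ind}, d),$$

where $z$ is defined as in (2.66).

  p       S^l_ind    S^m_ind    S^c_ind    MC (s.e. x 10^3)
  0.75    14.6709    14.6723    15.0320    14.6820 (0.70)
  0.90    17.0767    17.0810    18.0984    17.1025 (1.02)
  0.95    18.7372    18.7443    20.2563    18.7789 (1.46)
  0.975   20.3309    20.3412    22.3560    20.3895 (2.11)
  0.995   23.9183    23.9390    27.1762    24.0354 (4.61)

Table 2.11: Approximations for some selected quantiles with probability
level p of S_ind for gamma i.i.d. liabilities.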
A numerical illustration
In this application we consider independent Gamma(100, 100) distributed
future payments. Note that this choice of parameters implies that $E[X] = 1$
and $\mathrm{Var}[X] = 0.01$, i.e. we take the same mean and variance of the liabilities
as in the lognormal and normal cases. As before we work in a Black &
Scholes setting with drift µ = 0.05 and volatility σ = 0.1. We compare
the performance of the lower bound $S^{l}_{ind}$, the upper bound $S^{c}_{ind}$ and the
moments based approximation $S^{m}_{ind}$ with the values for $S_{ind}$ obtained by a
Monte Carlo simulation (MC) based on 500 × 100 000 simulated paths.

The results are very similar to those in the normal and lognormal cases. It is
worth noticing that the variance of $S_{ind}$ (10.1489) is slightly lower than in the
lognormal case (10.2789) and in the normal case (10.2792). This is due
to the independence of the gamma payments, whereas we imposed a slight
positive dependence in the previous cases.

The quality of the approximations is illustrated by some upper quantiles
displayed in Table 2.11. The lower bound $S^{l}_{ind}$ and the moments
based approximation $S^{m}_{ind}$ perform well, but not as well as in the lognormal
and normal cases (probably because the conditioning random variable
$\Gamma$ does not take discounting factors into account). The study of stop-loss
premiums in Table 2.12 is in line with these findings.

  d     S^l_ind    S^m_ind    S^c_ind    MC (s.e. x 10^4)
  0     12.8928    12.8928    12.8928    12.8921 (4.44)
  5      7.8928     7.8928     7.8931     7.8921 (4.44)
  10     3.0813     3.0821     3.2528     3.0821 (4.06)
  15     0.5540     0.5553     0.8215     0.5549 (2.08)
  20     0.0647     0.0652     0.1644     0.0655 (0.77)
  25     0.0068     0.0069     0.0313     0.0071 (0.27)
  30     0.0007     0.0008     0.0061     0.0008 (0.09)

Table 2.12: Approximations for some selected stop-loss premiums with
retention d of S_ind for gamma i.i.d. liabilities.
2.7 Proofs
Upper bound based on lower bound (2.44)
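In the following we derive an easily computable expression for (2.26).
The second expectation term in the product (2.26) equals, denoting
by $F_\Lambda(\cdot)$ the normal cumulative distribution function of $\Lambda$,

$$E\left[I_{(\Lambda<d_\Lambda)}\right] = 0\cdot\Pr[\Lambda \ge d_\Lambda] + 1\cdot\Pr[\Lambda < d_\Lambda] = F_\Lambda(d_\Lambda) = \Phi(d^*_\Lambda). \tag{2.105}$$

The first expectation term in the product (2.26) can be expressed as

$$E\left[\mathrm{Var}[S|\Lambda]\, I_{(\Lambda<d_\Lambda)}\right] = E\left[E[S^2|\Lambda]\, I_{(\Lambda<d_\Lambda)}\right] - E\left[(E[S|\Lambda])^2\, I_{(\Lambda<d_\Lambda)}\right]. \tag{2.106}$$

Now consider the second term on the right-hand side of (2.106):

$$E\left[(E[S|\Lambda])^2\, I_{(\Lambda<d_\Lambda)}\right] = \int_{-\infty}^{d_\Lambda} \left(E[S|\Lambda=\lambda]\right)^2 dF_\Lambda(\lambda). \tag{2.107}$$

According to (2.32) and using the notation $Z_{ij}$ introduced before, we can
express (2.107) as

$$\begin{aligned}
E\left[(E[S|\Lambda])^2\, I_{(\Lambda<d_\Lambda)}\right]
&= \int_{-\infty}^{d_\Lambda} \left(\sum_{i=1}^{n} E[X_i|\Lambda=\lambda]\right)^2 dF_\Lambda(\lambda) \\
&= \int_{-\infty}^{d_\Lambda} \left(\sum_{i=1}^{n} \alpha_i\, e^{E[Z_i]+r_i\sigma_{Z_i}\Phi^{-1}(v)+\frac{1}{2}\left(1-r_i^2\right)\sigma^2_{Z_i}}\right)^2 dF_\Lambda(\lambda) \\
&= \int_{-\infty}^{d_\Lambda} \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i\alpha_j\, e^{E[Z_{ij}]+\left(r_i\sigma_{Z_i}+r_j\sigma_{Z_j}\right)\Phi^{-1}(v)}\, e^{\frac{1}{2}\left(\left(1-r_i^2\right)\sigma^2_{Z_i}+\left(1-r_j^2\right)\sigma^2_{Z_j}\right)}\, dF_\Lambda(\lambda) \\
&= \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i\alpha_j\, e^{E[Z_{ij}]+\frac{1}{2}\left(\left(1-r_i^2\right)\sigma^2_{Z_i}+\left(1-r_j^2\right)\sigma^2_{Z_j}\right)} \int_{-\infty}^{d_\Lambda} e^{\left(r_i\sigma_{Z_i}+r_j\sigma_{Z_j}\right)\Phi^{-1}(v)}\, dF_\Lambda(\lambda).
\end{aligned} \tag{2.108}$$

Next, applying Lemma 7 to (2.108) with $a = r_i\sigma_{Z_i} + r_j\sigma_{Z_j}$ yields

$$E\left[(E[S|\Lambda])^2\, I_{(\Lambda<d_\Lambda)}\right] = \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i\alpha_j\, e^{E[Z_{ij}]+\frac{1}{2}\left(\sigma^2_{Z_i}+\sigma^2_{Z_j}+2r_ir_j\sigma_{Z_i}\sigma_{Z_j}\right)}\, \Phi\left(d^*_\Lambda - \left(r_i\sigma_{Z_i} + r_j\sigma_{Z_j}\right)\right). \tag{2.109}$$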
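The step from (2.108) to (2.109) rests on the closed form $\int_0^{\Phi(d^*)} e^{a\Phi^{-1}(v)}\,dv = e^{a^2/2}\,\Phi(d^* - a)$, which is the form in which the truncated integral appears here after the substitution $v = F_\Lambda(\lambda)$. The Python sketch below checks this identity numerically; the function names, the midpoint rule and the test values $a = 0.3$, $d^* = 0.8$ are illustrative assumptions.

```python
import math
from statistics import NormalDist

PHI = NormalDist()  # standard normal distribution

def truncated_exp_integral(a, d_star, nodes=20000):
    """Midpoint rule for int_0^{Phi(d*)} exp(a * Phi^{-1}(v)) dv."""
    upper = PHI.cdf(d_star)
    h = upper / nodes
    return h * sum(math.exp(a * PHI.inv_cdf((k + 0.5) * h)) for k in range(nodes))

def closed_form(a, d_star):
    """exp(a^2 / 2) * Phi(d* - a)."""
    return math.exp(0.5 * a * a) * PHI.cdf(d_star - a)

numeric = truncated_exp_integral(0.3, 0.8)
exact = closed_form(0.3, 0.8)
```

Combining the factor $e^{a^2/2}$ with the exponent in (2.108) gives $\frac{1}{2}\left(\sigma^2_{Z_i}+\sigma^2_{Z_j}+2r_ir_j\sigma_{Z_i}\sigma_{Z_j}\right)$, the exponent appearing in (2.109).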
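Now consider the first term on the right-hand side of expression (2.106),
$E\left[E[S^2|\Lambda]\, I_{(\Lambda<d_\Lambda)}\right]$. The term $E[S^2|\Lambda]$ is given by (2.42). By applying
(2.43) with $a = r_{ij}\sigma_{Z_{ij}} = r_i\sigma_{Z_i} + r_j\sigma_{Z_j}$, and simplifying, we obtain

$$\begin{aligned}
E\left[E[S^2|\Lambda]\, I_{(\Lambda<d_\Lambda)}\right]
&= \sum_{i=1}^{n}\sum_{j=1}^{n} \int_{-\infty}^{d_\Lambda} \alpha_i\alpha_j\, e^{E[Z_{ij}]+r_{ij}\sigma_{Z_{ij}}\Phi^{-1}(v)+\frac{1}{2}\left(1-r_{ij}^2\right)\sigma^2_{Z_{ij}}}\, dF_\Lambda(\lambda) \\
&= \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i\alpha_j\, e^{E[Z_{ij}]+\frac{1}{2}\left(1-r_{ij}^2\right)\sigma^2_{Z_{ij}}} \int_{-\infty}^{d_\Lambda} e^{r_{ij}\sigma_{Z_{ij}}\Phi^{-1}(v)}\, dF_\Lambda(\lambda) \\
&= \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i\alpha_j\, e^{E[Z_{ij}]+\frac{1}{2}\left(1-r_{ij}^2\right)\sigma^2_{Z_{ij}}+\frac{r_{ij}^2\sigma^2_{Z_{ij}}}{2}}\, \Phi\left(d^*_\Lambda - r_{ij}\sigma_{Z_{ij}}\right) \\
&= \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i\alpha_j\, e^{E[Z_{ij}]+\frac{\sigma^2_{Z_{ij}}}{2}}\, \Phi\left(d^*_\Lambda - \left(r_i\sigma_{Z_i} + r_j\sigma_{Z_j}\right)\right).
\end{aligned} \tag{2.110}$$

Combining (2.110) and (2.109) into (2.106), and then substituting (2.105)
and (2.106) into (2.26), we get the following expression for the error bound
$\varepsilon(d_\Lambda)$ in (2.26):

$$\begin{aligned}
\varepsilon(d_\Lambda)
&= \frac{1}{2}\left(\Phi(d^*_\Lambda)\right)^{\frac{1}{2}} \left\{ \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i\alpha_j \left[ e^{E[Z_{ij}]+\frac{\sigma^2_{Z_{ij}}}{2}}\, \Phi\left(d^*_\Lambda - \left(r_i\sigma_{Z_i} + r_j\sigma_{Z_j}\right)\right) \right.\right. \\
&\qquad\qquad \left.\left. -\; e^{E[Z_{ij}]+\frac{1}{2}\left(\sigma^2_{Z_i}+\sigma^2_{Z_j}+2r_ir_j\sigma_{Z_i}\sigma_{Z_j}\right)}\, \Phi\left(d^*_\Lambda - \left(r_i\sigma_{Z_i} + r_j\sigma_{Z_j}\right)\right) \right] \right\}^{\frac{1}{2}} \\
&= \frac{1}{2}\left(\Phi(d^*_\Lambda)\right)^{\frac{1}{2}} \left\{ \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i\alpha_j\, e^{E[Z_{ij}]}\, \Phi\left(d^*_\Lambda - \left(r_i\sigma_{Z_i} + r_j\sigma_{Z_j}\right)\right) \left( e^{\frac{1}{2}\left(\sigma^2_{Z_i}+\sigma^2_{Z_j}+2\sigma_{Z_iZ_j}\right)} - e^{\frac{1}{2}\left(\sigma^2_{Z_i}+\sigma^2_{Z_j}+2r_ir_j\sigma_{Z_i}\sigma_{Z_j}\right)} \right) \right\}^{\frac{1}{2}} \\
&= \frac{1}{2}\left(\Phi(d^*_\Lambda)\right)^{\frac{1}{2}} \left\{ \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i\alpha_j\, e^{E[Z_{ij}]+\frac{1}{2}\left(\sigma^2_{Z_i}+\sigma^2_{Z_j}\right)}\, \Phi\left(d^*_\Lambda - \left(r_i\sigma_{Z_i} + r_j\sigma_{Z_j}\right)\right) \left( e^{\sigma_{Z_iZ_j}} - e^{\sigma_{Z_i}\sigma_{Z_j}r_ir_j} \right) \right\}^{\frac{1}{2}}.
\end{aligned}$$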
Partially exact/comonotonic upper bound (2.45)
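Applying Lemma 7 with $a = r_i\sigma_{Z_i}$, and using (2.32), we can express the
second term $I_2$ in (2.22) in closed form:

$$\begin{aligned}
\int_{d_\Lambda}^{+\infty} E[S - d \mid \Lambda=\lambda]\, dF_\Lambda(\lambda)
&= \int_{d_\Lambda}^{+\infty} E[S \mid \Lambda=\lambda]\, dF_\Lambda(\lambda) - d\left(1 - F_\Lambda(d_\Lambda)\right) \\
&= \sum_{i=1}^{n} \alpha_i\, e^{E[Z_i]+\frac{1}{2}\left(1-r_i^2\right)\sigma^2_{Z_i}} \int_{d_\Lambda}^{+\infty} e^{r_i\sigma_{Z_i}\Phi^{-1}(v)}\, dF_\Lambda(\lambda) - d\left(1 - \Phi(d^*_\Lambda)\right) \\
&= \sum_{i=1}^{n} \alpha_i\, e^{E[Z_i]+\frac{\sigma^2_{Z_i}}{2}}\, \Phi\left(r_i\sigma_{Z_i} - d^*_\Lambda\right) - d\,\Phi(-d^*_\Lambda).
\end{aligned} \tag{2.111}$$

Substituting (2.33) in (2.28) we end up with the following upper bound for
$I_1$, similar to (2.37) but now with an integral from zero to $\Phi(d^*_\Lambda)$:

$$\begin{aligned}
\int_{-\infty}^{d_\Lambda} E[(S - d)_+ \mid \Lambda=\lambda]\, dF_\Lambda(\lambda)
&\le \int_{-\infty}^{d_\Lambda} E[(S^u - d)_+ \mid \Lambda=\lambda]\, dF_\Lambda(\lambda) \\
&= \int_0^{\Phi(d^*_\Lambda)} E[(S^u - d)_+ \mid V=v]\, dv \\
&= \sum_{i=1}^{n} \alpha_i\, e^{E[Z_i]+\frac{1}{2}\sigma^2_{Z_i}\left(1-r_i^2\right)} \int_0^{\Phi(d^*_\Lambda)} e^{r_i\sigma_{Z_i}\Phi^{-1}(v)}\, \Phi\left(\mathrm{sign}(\alpha_i)\sqrt{1-r_i^2}\,\sigma_{Z_i} - \Phi^{-1}\left(F_{S^u|V=v}(d)\right)\right) dv \\
&\qquad - d\left(\Phi(d^*_\Lambda) - \int_0^{\Phi(d^*_\Lambda)} F_{S^u|V=v}(d)\, dv\right),
\end{aligned} \tag{2.112}$$

where we recall that $d^*_\Lambda$ is defined as in (2.43), and the conditional cumulative
distribution function $F_{S^u|V=v}(d)$ is, according to (2.36), determined by

$$\sum_{i=1}^{n} \alpha_i\, e^{E[Z_i]+r_i\sigma_{Z_i}\Phi^{-1}(v)+\mathrm{sign}(\alpha_i)\sqrt{1-r_i^2}\,\sigma_{Z_i}\Phi^{-1}\left(F_{S^u|V=v}(d)\right)} = d.$$

Finally, adding (2.112) to the exact part (2.111) of the decomposition (2.22)
results in the partially exact/comonotonic upper bound.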
Chapter 3
Reserving in life insurance business
Summary In the traditional approach to life contingencies only decrements
are assumed to be stochastic. In this contribution we consider the
distribution of a life annuity (and a portfolio of life annuities) when the
stochastic nature of interest rates is also taken into account. Although the
literature concerning this topic is already quite rich, the authors usually
restrict themselves to the computation of the first two or three moments.
However, if one wants to determine e.g. capital requirements using more
sophisticated risk measures like Value-at-Risk or Tail Value-at-Risk, more
detailed knowledge about the underlying distributions is required. For this
purpose, we propose to use the theory of comonotonic risks introduced in
Chapter 2. This methodology allows one to obtain reliable approximations of
the underlying distribution functions, in particular very accurate estimates
of upper quantiles and stop-loss premiums. Several numerical illustrations
confirm the very high accuracy of the methodology.
3.1 Introduction
Unlike in finance, in insurance the concept of stochastic interest rates
emerged quite recently. In the traditional approach to life contingencies
only decrements are assumed to be stochastic; see e.g. Bowers et al.
(1986), Wolthuis & Van Hoek (1986). Such a simplification allows one to treat
effectively summary measures of financial contracts such as the mean, the
standard deviation or the upper quantiles. For a more detailed discussion
about the distributions in life insurance under deterministic interest rates,
see e.g. Dhaene (1990).

In non-life insurance the use of deterministic interest rates may be
justified by the short terms of insurance commitments. In the case of the life
insurance and the life annuity business, durations of contracts are typically
very long (often 30 years or even more). The uncertainty about future
rates of return then becomes very high. Moreover the financial and investment
risk, unlike the mortality risk, cannot be diversified by increasing
the number of policies. Therefore, in order to calculate insurance premiums
or mathematical reserves, actuaries are forced to adopt very conservative
assumptions. As a result, the diversification effects between interest rates
in different investment periods may not be taken into account (i.e. the fact
that poor investment results in some periods are usually compensated by very
good ones in others), and the life insurance business becomes too expensive,
both for the insureds, who have to pay higher insurance premiums, and for
the shareholders, who have to provide more capital than necessary. Profit
sharing can partially solve this problem. For these reasons the necessity of
introducing models with stochastic interest rates has been well understood
in the actuarial world.
In the actuarial literature numerous papers have treated random
interest rates. In Boyle (1976) autoregressive models of order one are
introduced to model interest rates. Bellhouse & Panjer (1980, 1981) use similar
models to compute moments of insurance and annuity functions. In Wilkie
(1976) the force of interest is assumed to follow a Gaussian random walk.
Waters (1978) computes the moments of actuarial functions when the
interest rates are independent and identically Gaussian distributed. He also
computes moments of portfolios of policies and approximates the limiting
distribution by Pearson's curves. In Dhaene (1989) the force of interest
is modelled as an ARIMA(p, d, q) process, and this model is used to compute
the moments of present value functions. Norberg (1990) provides an
axiomatic approach to stochastic interest rates and the valuation of payment
streams. Parker (1994d) compares two approaches to the randomness of
interest rates: modelling only the accumulated interest and modelling
the force of interest. Both methodologies are illustrated by calculating the
mean, the standard deviation and the skewness of the annuity-immediate.

An overview of stochastic life contingencies with solvency valuation is
presented in Frees (1990). In the papers of Beekman & Fuelling (1990,
1991) the mean and the standard deviation of continuous-time life annuities
are calculated with the force of interest modelled as an Ornstein-Uhlenbeck
and a Wiener process respectively. In Beekman & Fuelling
(1993) expressions are given for the mean and the standard deviation of
the future life insurance payments. Norberg (1993) derives the first two
moments of the present value of stochastic payment streams. The first
three moments of homogeneous portfolios of life insurance and endowment
policies are calculated in Parker (1994a,b), and the results are generalized
to heterogeneous portfolios in Parker (1997). The same author (1994c,
1996) provides a recursive formula to calculate an approximate distribution
function of the limiting homogeneous portfolio of term life insurance
and endowment policies. In Debicka (2003) the mean and the variance are
calculated for the present value of discrete-time payment streams in life
insurance.
Although the literature on stochastic interest rates in life insurance is
already quite rich, for most of the problems no satisfactory solutions have
been found as yet. In almost all papers the authors restrict themselves
to calculating the first two or three moments of the present value function
(except Waters (1978), Parker (1994d, 1996)). The computation of the first
few moments may be seen as just a first attempt to explore the properties of
a random distribution. Moreover, in general the variance does not appear to
be the most suitable risk measure to determine the solvency requirements
for an insurance portfolio. As a two-sided risk measure it takes into account
both positive and negative discrepancies, which leads to underestimation of
the reserve in the case of a skewed distribution. It does not emphasize the
tail properties of the distribution and does not give any reliable estimates of
the Value-at-Risk or other tail-related risk measures, for which simulation
methods have to be deployed. The same applies to risk measures based on
stop-loss premiums, like Expected Shortfall.

In this chapter we aim to provide some conservative estimates both
for high quantiles and for stop-loss premiums, for an individual policy and for
a whole portfolio. We focus here only on life annuities; however, similar
techniques may be used to get analogous estimates for more general life
contingencies. Using the results of Chapter 2 we approximate the
quantiles of the present value of a life annuity and of a portfolio of life
annuities.

We perform our analysis separately for a single life annuity and for a whole
portfolio of policies. Our solution makes it possible to solve with great accuracy
personal finance problems such as: how much does one need to invest now
to ensure, given a periodical (e.g. yearly) consumption pattern, that
the probability of outliving one's money is very small (e.g. less than 1%)?
Similar problems were studied by Dufresne (2004) and Milevsky & Wang
(2004).

The case of a portfolio of life annuity policies has been studied extensively
in the literature, but only in the limiting case: for homogeneous
portfolios, when the mortality risk is fully diversified. However, the
applicability of these results in insurance practice may be questioned: especially
in the case of the life annuity business a typical portfolio does not contain
enough policies to speak of full diversification. For this reason we
propose to approximate the number of active policies in subsequent years
using a normal power distribution and to model the present value of future
benefits as a scalar product of mutually independent random vectors.

This chapter is mainly based on Hoedemakers, Darkiewicz & Goovaerts
(2005) and is organized as follows. In Section 2 we give a summary of
the model assumptions and properties for the mortality process that are
needed to reach our goal. In the first part of Section 3 we apply the
results of Chapter 2 to the present value of a single life annuity policy.
In the second part of that section we present the convex bounds for a
homogeneous portfolio of policies. A numerical illustration is provided at
the end of each part. We also illustrate the obtained results graphically.
3.2 Modelling stochastic decrements
A life annuity may be defined as a series of periodic payments where each
payment will actually be made only if a designated life is alive at the time
the payment is due. Let us consider a person aged x years, also called a
life aged x and denoted by (x). We denote his or her future lifetime by Tx.
Thus x+ Tx will be the age of death of the person. The future lifetime Tx
is a random variable with a probability distribution function
Gx(t) = Pr[Tx ≤ t] = tqx, t ≥ 0.
The function Gx represents the probability that the person will die within t
years, for any fixed t. We assume that Gx is known. We define
$K_x = \lfloor T_x \rfloor$, the number of completed future years lived by (x),
or the curtate future lifetime of (x). The probability distribution of the
integer valued random variable Kx is given by
\[ \Pr[K_x = k] = \Pr[k \le T_x < k+1] = {}_{k+1}q_x - {}_{k}q_x = {}_{k|}q_x, \qquad k = 0, 1, \ldots \]
Let us denote the lifetime from birth by the random variable T. We assume
\[ \Pr[T_x \le t] = \Pr[T \le x + t \mid T \ge x]. \]
With this notation, $T \overset{d}{=} T_0$. Further, the ultimate age of the life
table is denoted by ω; this means that ω − x is the first remaining lifetime
of (x) for which ${}_{\omega-x}q_x = 1$, or equivalently, $G_x^{-1}(1) = \omega - x$.
In the remainder of this chapter we will always use the standard actuarial
notation:
\[ \Pr[T_x > t] = {}_{t}p_x, \quad \Pr[T_x > 1] = p_x, \quad \Pr[T_x \le t] = {}_{t}q_x, \quad \Pr[T_x \le 1] = q_x. \]
In this chapter we consider three types of annuities. The present value of
a single life annuity for a person aged x paying periodically (e.g. yearly) a
fixed amount of $\alpha_i$ ($i = 1, \ldots, \lfloor \omega - x \rfloor$) can be expressed as
\[ S_{sp,x} = \sum_{i=1}^{K_x} \alpha_i e^{-Y(i)} = \sum_{i=1}^{\lfloor \omega - x \rfloor} I_{(T_x > i)}\, \alpha_i e^{-Y(i)}. \tag{3.1} \]
We also consider the present value of a homogeneous portfolio of life
annuities. This random variable is particularly interesting for an insurer
who has to determine a sufficient level of the reserve and the solvency
margin. Assuming that every beneficiary gets a fixed amount of $\alpha_i$
($i = 1, \ldots, \lfloor \omega - x \rfloor$) per year, the present value can be expressed as follows
\[ S_{pp,x} = \sum_{i=1}^{\lfloor \omega - x \rfloor} \alpha_i N_i e^{-Y(i)}, \tag{3.2} \]
where Ni denotes the remaining number of policies-in-force in year i.
Finally, consider a portfolio of N0 homogeneous life annuity contracts
for which the future lifetimes of the insureds $T_x^{(1)}, T_x^{(2)}, \ldots, T_x^{(N_0)}$
are assumed to be independent. Then the insurer faces two risks: mortality
risk and investment risk. Note that from the Law of Large Numbers the
mortality risk decreases with the number of policies N0 while the
investment risk remains the same (each of the policies is exposed to the
same investment risk). Thus, for sufficiently large N0 we have that
\[ \sum_{i=1}^{\lfloor \omega - x \rfloor} \alpha_i N_i e^{-Y(i)} = N_0 \sum_{i=1}^{\lfloor \omega - x \rfloor} \alpha_i \frac{N_i}{N_0} e^{-Y(i)} \approx N_0 \sum_{i=1}^{\lfloor \omega - x \rfloor} \alpha_i\, {}_{i}p_x\, e^{-Y(i)}. \]
Hence in the case of large portfolios of life annuities it suffices to compute
risk measures of an ‘average’ portfolio Sapp,x given by
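\[ S_{app,x} = \sum_{i=1}^{\lfloor \omega - x \rfloor} \alpha_i\, {}_{i}p_x\, e^{-Y(i)} = \mathrm{E}\big[ S_{sp,x} \mid Y(1), \ldots, Y(\lfloor \omega - x \rfloor) \big]. \tag{3.3} \]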
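Remark 5. For the random variables $S_{app,x}$ and $S_{sp,x}$ one has that
$S_{app,x} \le_{cx} S_{sp,x}$ and consequently $\mathrm{Var}[S_{app,x}] \le \mathrm{Var}[S_{sp,x}]$.
Indeed, let Γ denote a random variable independent of Tx. Then, it follows
immediately from Theorem 8 that
\[
\begin{aligned}
S_{sp,x} &= \sum_{i=1}^{\lfloor \omega - x \rfloor} I_{(T_x > i)}\, \alpha_i e^{-Y(i)} \\
&\ge_{cx} \sum_{i=1}^{\lfloor \omega - x \rfloor} \mathrm{E}\big[ I_{(T_x > i)} \mid \Gamma \big]\, \alpha_i e^{-Y(i)} \\
&= \sum_{i=1}^{\lfloor \omega - x \rfloor} {}_{i}p_x\, \alpha_i e^{-Y(i)} \\
&= S_{app,x}.
\end{aligned}
\]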
Obviously $S_{sp,x}$, $S_{pp,x}$ and $S_{app,x}$ depend on the distribution of the total
lifetime T. We assume that T follows the Gompertz-Makeham law, i.e.
the force of mortality at age ξ is given by the formula
\[ \mu_\xi = \alpha + \beta c^{\xi}, \]
where α > 0 is a constant component, interpreted as capturing accident
hazard, and $\beta c^{\xi}$ is a variable component capturing the hazard of aging
with β > 0 and c > 1. This leads to the survival probability
\[ {}_{t}p_x = \Pr[T_x > t] = e^{-\int_x^{x+t} \mu_\xi \, d\xi} = s^{t} g^{c^{x+t} - c^{x}}, \]
where
\[ s = e^{-\alpha} \quad \text{and} \quad g = e^{-\beta / \log c}. \tag{3.4} \]
In numerical illustrations we use the Belgian analytic life tables MR and
FR for life annuity valuation, with corresponding constants for males: s =
0.999441703848, g = 0.999733441115 and c = 1.101077536030 and for fe-
males: s = 0.999669730966, g = 0.999951440171 and c = 1.116792453830.
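The survival probabilities above are straightforward to evaluate. The following is a minimal sketch, using the MR constants quoted in the text; the function names `survival_prob` and `deferred_death_prob` are illustrative and do not appear in the original.

```python
import math

# Constants of the Belgian analytic life table MR (males), as quoted in the text.
S_MR, G_MR, C_MR = 0.999441703848, 0.999733441115, 1.101077536030


def survival_prob(t, x, s=S_MR, g=G_MR, c=C_MR):
    """tpx = s^t * g^(c^(x+t) - c^x) under the Makeham-Gompertz law."""
    return s ** t * g ** (c ** (x + t) - c ** x)


def deferred_death_prob(k, x, s=S_MR, g=G_MR, c=C_MR):
    """k|qx = Pr[Kx = k] = kpx - (k+1)px."""
    return survival_prob(k, x, s, g, c) - survival_prob(k + 1, x, s, g, c)


if __name__ == "__main__":
    # Survival probabilities of a 65-year-old male for a few horizons.
    for t in (1, 10, 20, 30):
        print(t, survival_prob(t, 65))
```

The deferred probabilities telescope, so summing `deferred_death_prob(k, x)` over k = 0, ..., n−1 gives 1 − `survival_prob(n, x)`, which is a convenient sanity check.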
Denote by $T'$ and $T'_x$ the corresponding random variables from the
Gompertz family, the subclass of the Makeham-Gompertz family with
the force of mortality given by
\[ \mu'_\xi = \beta c^{\xi}. \]
It is straightforward to show that
\[ T_x \overset{d}{=} \min(T'_x, E/\alpha), \tag{3.5} \]
where E denotes a random variable from the standard exponential
distribution, independent of $T'$. Indeed, one has that
\[
\begin{aligned}
\Pr[\min(T'_x, E/\alpha) > t] &= \Pr[T'_x > t] \, \Pr[E > \alpha t] \\
&= e^{-\int_x^{x+t} \mu'_\xi \, d\xi} \, e^{-\alpha t} \\
&= e^{-\int_x^{x+t} \mu_\xi \, d\xi} \\
&= \Pr[T_x > t].
\end{aligned}
\]
The cumulative distribution function for the Gompertz law, unlike for the
Makeham-Gompertz law in general, has an analytical expression for the
inverse function and therefore (3.5) can be used for simulations.
For generating one random variate from Makeham’s law, we use the
composition method (Devroye, 1986) and perform the following steps
1. Generate G from the Gompertz’s law by the well-known inversion
method
2. Generate E from the exponential(1) distribution
3. Retain T = min(G,E/α),
where α = − log s, see (3.4).
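The three steps above can be sketched as follows. This is only an illustration under the MR constants quoted earlier; the helper names are hypothetical, and the Gompertz inversion is obtained by solving $g^{c^{x+t} - c^{x}} = u$ for t.

```python
import math
import random

# Constants of the Belgian analytic life table MR (males), as quoted in the text.
S_MR, G_MR, C_MR = 0.999441703848, 0.999733441115, 1.101077536030


def sample_gompertz(x, g, c, rng):
    """Step 1: invert the Gompertz survival function g^(c^(x+t) - c^x) = u."""
    u = 1.0 - rng.random()  # uniform on (0, 1], so log(u) is finite
    return math.log(c ** x + math.log(u) / math.log(g)) / math.log(c) - x


def sample_makeham(x, s=S_MR, g=G_MR, c=C_MR, rng=random):
    """Draw one future lifetime Tx as min(T'x, E / alpha), with alpha = -log s."""
    alpha = -math.log(s)
    gompertz_part = sample_gompertz(x, g, c, rng)      # step 1
    exponential_part = rng.expovariate(1.0) / alpha    # step 2, rescaled
    return min(gompertz_part, exponential_part)        # step 3


if __name__ == "__main__":
    rng = random.Random(2005)
    draws = [sample_makeham(65, rng=rng) for _ in range(10)]
    print(draws)
```

Passing a seeded `random.Random` instance as `rng` keeps the draws reproducible; the empirical survival frequencies of a large sample can then be compared with the closed-form ${}_{t}p_x$.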
3.3 The distribution of life annuities
This section is organized into 2 subsections. In the first subsection we
derive upper and lower bounds in convex order for the distribution of the
present value of a single life annuity given a mortality law T and a model
for the returns. This distribution is very important in the context of
so-called personal finance problems. Suppose that (x) disposes of a lump
sum L. What is the amount that (x) can consume yearly to be sure with a
sufficiently high probability (e.g. p = 99%) that the money will not run
out before death? Obviously, to answer this question one has to compute
the Value-at-Risk measure of the distribution at an appropriately chosen
level.
In the second part of this section we will consider the distribution of
a homogeneous and ‘average’ portfolio of life annuities. An insurer has
to derive this distribution to determine its future liabilities and solvency
margin. Notice that the presented methodology is appropriate not only in
the case of large portfolios when the limiting distribution can be used on
the basis of the law of large numbers but also for portfolios of average size
(e.g. 1 000 - 5 000) which are typical for the life annuity business.
The vector $\vec{Y} = (Y(1), Y(2), \ldots, Y(n))$ is assumed to have an n-dimensional
normal distribution with given mean vector
\[ \vec{\mu} = (\mu_1, \ldots, \mu_n) = \big( \mathrm{E}[Y(1)], \mathrm{E}[Y(2)], \ldots, \mathrm{E}[Y(n)] \big) \]
and covariance matrix
\[ \Sigma = [\sigma_{ij}]_{1 \le i,j \le n} = \big[ \mathrm{Cov}(Y(i), Y(j)) \big]_{1 \le i,j \le n}. \]
In the above notation we will denote $\sigma_{ii}$ by $\sigma_i^2$.
3.3.1 A single life annuity
In this subsection we consider a whole life annuity of αi (> 0) payable at
the end of each year i while (x) survives, described by the formula
\[ S_{sp,x} = \sum_{i=1}^{K_x} \alpha_i e^{-Y(i)} = \sum_{i=1}^{\lfloor \omega - x \rfloor} I_{(T_x > i)}\, \alpha_i e^{-Y(i)}. \]
The upper bound
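The random variable $X_i = I_{(T_x > i)}$ is Bernoulli(${}_{i}p_x$) distributed and thus
the inverse distribution function is given by
\[ F^{-1}_{X_i}(p) = \begin{cases} 1 & \text{for } p > {}_{i}q_x, \\ 0 & \text{for } p \le {}_{i}q_x. \end{cases} \]
This leads to the following formula for the upper bound
\[ S^{c}_{sp,x} = \sum_{i=1}^{\lfloor \omega - x \rfloor} F^{-1}_{X_i}(U)\, F^{-1}_{\alpha_i e^{-Y(i)}}(V) = \sum_{i=1}^{\lfloor F^{-1}_{T_x}(U) \rfloor} F^{-1}_{\alpha_i e^{-Y(i)}}(V), \]
where U and V are independent standard uniformly distributed random
variables. Thus the conditional quantiles are given by
\[ F^{-1}_{S^{c}_{sp,x} \mid T_x = t}(p) = \sum_{i=1}^{\lfloor t \rfloor} F^{-1}_{\alpha_i e^{-Y(i)}}(p) \]
and the conditional distribution function can be computed numerically
from the identity (with $k = \lfloor t \rfloor$)
\[ \sum_{i=1}^{\lfloor t \rfloor} \alpha_i e^{-\mu_i + \mathrm{sign}(\alpha_i) \sigma_i \Phi^{-1}\left( F_{S^{c}_{sp,x} \mid T_x = t}(y) \right)} = \sum_{i=1}^{k} \alpha_i e^{-\mu_i + \mathrm{sign}(\alpha_i) \sigma_i \Phi^{-1}\left( F_{S^{c}_{sp,x} \mid K_x = k}(y) \right)} = y. \]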
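Define $S_k$ as follows:
\[ S_k = \sum_{i=1}^{k} \alpha_i e^{-Y(i)}, \tag{3.6} \]
then $S_k \overset{d}{=} (S_{sp,x} \mid K_x = k)$. Hence, the distribution function of $S^{c}_{sp,x}$ can be
computed as
\[
\begin{aligned}
F_{S^{c}_{sp,x}}(y) &= \sum_{k=1}^{\lfloor \omega - x \rfloor} \Pr[K_x = k] \, F_{S^{c}_{sp,x} \mid K_x = k}(y) \\
&= \sum_{k=1}^{\lfloor \omega - x \rfloor} {}_{k|}q_x \, F_{S^{c}_{k}}(y) \\
&= \sum_{k=1}^{\lfloor \omega - x \rfloor} {}_{k|}q_x \, \Pr\left[ \sum_{i=1}^{k} \alpha_i e^{-\mu_i + \mathrm{sign}(\alpha_i) \sigma_i \Phi^{-1}(U)} \le y \right],
\end{aligned}
\]
with $S^{c}_{k} = \sum_{i=1}^{k} F^{-1}_{\alpha_i e^{-Y(i)}}(U)$ and U a standard uniform random variable.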
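Because each term of $S^{c}_{k}$ is increasing in U when all $\alpha_i$ are positive, $F_{S^{c}_{k}}(y)$ can be found by solving a one-dimensional monotone equation. The sketch below does this by bisection; it assumes positive payments, takes the vectors of $\mu_i$ and $\sigma_i$ as given inputs, and follows the displayed mixture, which runs over k ≥ 1. The function names are illustrative only.

```python
import math
from statistics import NormalDist

PHI = NormalDist()  # standard normal distribution


def comonotonic_cdf(y, alpha, mu, sigma, lo=-10.0, hi=10.0, iters=80):
    """F_{S^c_k}(y) for S^c_k = sum_i alpha_i * exp(-mu_i + sigma_i * Phi^{-1}(U)).

    Assumes alpha_i > 0, so the sum is increasing in z = Phi^{-1}(U).
    """
    if y <= 0.0:
        return 0.0

    def h(z):
        return sum(a * math.exp(-m + s * z) for a, m, s in zip(alpha, mu, sigma))

    if h(lo) >= y:
        return PHI.cdf(lo)
    if h(hi) <= y:
        return PHI.cdf(hi)
    for _ in range(iters):  # bisection on the increasing function h
        mid = 0.5 * (lo + hi)
        if h(mid) < y:
            lo = mid
        else:
            hi = mid
    return PHI.cdf(0.5 * (lo + hi))


def upper_bound_cdf(y, deferred_q, alpha, mu, sigma):
    """Mixture sum_{k>=1} k|qx * F_{S^c_k}(y); deferred_q[k-1] holds k|qx."""
    total = 0.0
    for k, q in enumerate(deferred_q, start=1):
        total += q * comonotonic_cdf(y, alpha[:k], mu[:k], sigma[:k])
    return total
```

Bisection is used here only for robustness; any root finder for monotone functions would serve equally well.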
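The computation of the corresponding stop-loss premiums is also
straightforward:
\[
\begin{aligned}
\pi^{cub}(S_{sp,x}, d) &= \mathrm{E}_{K_x}\Big[ \mathrm{E}\big[ (S^{c}_{sp,x} - d)_+ \mid K_x \big] \Big] \\
&= \sum_{k=1}^{\lfloor \omega - x \rfloor} {}_{k|}q_x \, \pi^{cub}(S_k, d) \\
&= \sum_{k=1}^{\lfloor \omega - x \rfloor} {}_{k|}q_x \left( \sum_{i=1}^{k} \pi\big( \alpha_i e^{-Y(i)}, d^{c}_{k,i} \big) \right),
\end{aligned}
\]
where $d^{c}_{k,i}$ is defined analogously to (2.73) as
\[ d^{c}_{k,i} = \alpha_i e^{-\mu_i + \mathrm{sign}(\alpha_i) \sigma_i \Phi^{-1}\left( F_{S^{c}_{k}}(d) \right)} \]
and the values of $\pi(\alpha_i e^{-Y(i)}, d^{c}_{k,i})$ are computed as in (2.74). The stop-loss
premium of $S^{c}_{sp,x}$ at retention d can be written out explicitly as follows
\[ \pi^{cub}(S_{sp,x}, d) = \sum_{k=1}^{\lfloor \omega - x \rfloor} {}_{k|}q_x \left\{ \sum_{i=1}^{k} \alpha_i e^{-\mu_i + \frac{\sigma_i^2}{2}} \, \Phi\Big( \mathrm{sign}(\alpha_i) \sigma_i - \Phi^{-1}\big( F_{S^{c}_{k}}(d) \big) \Big) - d \big( 1 - F_{S^{c}_{k}}(d) \big) \right\}. \]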
The lower bound
For the lower bound one faces the problem of choosing appropriate
conditioning random variables Γ and Λ. The random variables $X_i$ are in fact
comonotonic and depend only on the future lifetime $T_x$, thus $\Gamma = T_x$ is the
most natural choice. As a result one simply gets
\[ \mathrm{E}\big[ I_{(T_x > i)} \mid T_x \big] = I_{(T_x > i)}. \]
The choice of the second conditioning random variable Λ is less obvious.
We propose two different approaches:
1. $\Lambda^{(a)} = \sum_{i=1}^{\lfloor \omega - x \rfloor} {}_{i}p_x \, \alpha_i e^{-\mu_i + \frac{1}{2} \sigma_i^2} \, Y(i)$. Intuitively it means that the
conditioning random variable is chosen as a first order approximation to
the present value of the limiting portfolio $S_{app,x}$ in (3.3).
2. Consider the 'maximal variance' conditioning random variables of
the form $\Lambda_j = \sum_{i=1}^{j} \alpha_i e^{-\mu_i + \frac{1}{2} \sigma_i^2} \, Y(i)$ ($j = 1, \ldots, \lfloor \omega - x \rfloor$) and the
corresponding lower bounds
\[ S^{l,j}_{sp,x} = \sum_{i=1}^{K_x} \mathrm{E}\big[ \alpha_i e^{-Y(i)} \mid \Lambda_j \big], \qquad j = 1, \ldots, \lfloor \omega - x \rfloor, \]
from which one chooses the lower bound with the largest variance.
The corresponding conditioning random variable will be denoted as
$\Lambda^{(m)}$. This choice can be motivated as follows. For two random
variables X and Y with $X \le_{cx} Y$ one has that $\mathrm{Var}[X] \le \mathrm{Var}[Y]$. As
discussed in Chapter 2 we should choose Λ such that the goodness-of-fit
expressed by the ratio $z = \mathrm{Var}[S^{l}_{sp,x}] / \mathrm{Var}[S_{sp,x}]$ is as close as possible to
1. Hence one can expect that a lower bound with a larger variance
will provide a better fit to the original random variable.
Having chosen the conditioning random variable Λ one proceeds as in the
case of the upper bound: the first step requires the computation of the
conditional distribution of the lower bound from the formula
\[ \sum_{i=1}^{k} \alpha_i e^{-\mu_i + \frac{1}{2} \sigma_i^2 (1 - r_i^2) + \sigma_i r_i \Phi^{-1}\left( F_{S^{l}_{sp,x} \mid K_x = k}(y) \right)} = y. \]
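The cumulative distribution function of $S^{l}_{sp,x}$ can then be computed as
\[
\begin{aligned}
F_{S^{l}_{sp,x}}(y) &= \sum_{k=1}^{\lfloor \omega - x \rfloor} {}_{k|}q_x \, F_{S^{l}_{sp,x} \mid K_x = k}(y) \\
&= \sum_{k=1}^{\lfloor \omega - x \rfloor} {}_{k|}q_x \, F_{S^{l}_{k}}(y) \\
&= \sum_{k=1}^{\lfloor \omega - x \rfloor} {}_{k|}q_x \, \Pr\left[ \sum_{i=1}^{k} \alpha_i e^{-\mu_i - r_i \sigma_i \Phi^{-1}(U) + \frac{1}{2} (1 - r_i^2) \sigma_i^2} \le y \right],
\end{aligned}
\]
with $S^{l}_{k} = \mathrm{E}[S_k \mid \Lambda]$ and U a standard uniform random variable.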
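The computation of the corresponding stop-loss premium is similar to
the one of the upper bound and as a result one gets the following explicit
solution
\[
\begin{aligned}
\pi^{lb}(S_{sp,x}, d, \Gamma, \Lambda) &= \mathrm{E}_{K_x}\Big[ \mathrm{E}\big[ (S^{l}_{sp,x} - d)_+ \mid K_x \big] \Big] \\
&= \sum_{k=1}^{\lfloor \omega - x \rfloor} {}_{k|}q_x \, \pi^{lb}(S_k, d, \Lambda) \\
&= \sum_{k=1}^{\lfloor \omega - x \rfloor} {}_{k|}q_x \left( \sum_{i=1}^{k} \pi\big( \mathrm{E}[\alpha_i e^{-Y(i)} \mid \Lambda], d^{l}_{k,i} \big) \right),
\end{aligned}
\]
with $d^{l}_{k,i}$ given by
\[ d^{l}_{k,i} = \alpha_i e^{-\mu_i + \frac{1}{2} \sigma_i^2 (1 - r_i^2) + \sigma_i r_i \Phi^{-1}\left( F_{S^{l}_{sp,x} \mid K_x = k}(d) \right)}. \]
Note that the values of $\pi\big( \mathrm{E}[\alpha_i e^{-Y(i)} \mid \Lambda], d^{l}_{k,i} \big)$ can be computed as in
(2.82). The stop-loss premium of $S^{l}_{sp,x}$ at retention d can be written out
explicitly as follows
\[ \pi^{lb}(S_{sp,x}, d, \Gamma, \Lambda) = \sum_{k=1}^{\lfloor \omega - x \rfloor} {}_{k|}q_x \left\{ \sum_{i=1}^{k} \alpha_i e^{-\mu_i + \frac{\sigma_i^2}{2}} \, \Phi\Big( r_i \sigma_i - \Phi^{-1}\big( F_{S^{l}_{k}}(d) \big) \Big) - d \big( 1 - F_{S^{l}_{k}}(d) \big) \right\}. \]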
The lower bound based on a lifetime dependent conditioning random variable
In this subsection we show how it is possible to improve the lower bound of
a scalar product if one of the vectors is comonotonic. We state this result
in the following lemma.
Lemma 10.
Consider a scalar product of random variables $S = \sum_{i=1}^{n} X_i Y_i$, where the
random vectors $\vec{X}$ and $\vec{Y}$ are independent and $\vec{X}$ is additionally assumed
to be comonotonic, i.e. $\vec{X} = \big( F^{-1}_{X_1}(U), F^{-1}_{X_2}(U), \ldots, F^{-1}_{X_n}(U) \big)$. Let Λ(u) be
a random variable which is defined for each u ∈ (0, 1) separately. Define
$S^{cl}(u)$ as follows:
\[ S^{cl}(u) = \sum_{i=1}^{n} F^{-1}_{X_i}(u) \, \mathrm{E}\big[ Y_i \mid \Lambda(u) \big], \]
then $S^{cl}(u) \overset{d}{=} (S^{cl} \mid U = u)$. Define the random variable $S^{cl}$ through its
distribution function
\[ F_{S^{cl}}(y) = \int_0^1 F_{S^{cl} \mid U = u}(y) \, du. \]
Then $S^{cl} \le_{cx} S$.
Remark 6. Obviously the conditioning random variable U can be replaced
by any other random variable which determines the comonotonic vector
$\vec{X}$ by a functional relationship. We consider here the case when
$X_i = I_{(T_x > i)} = I_{(K_x \ge i)}$ and therefore it is convenient to condition on the future
lifetime $K_x$.
Proof. Let S(u) denote a random variable distributed as S given that
U = u. From Definition 1b of convex order, it follows immediately that
\[ S^{cl}(u) \le_{cx} S(u). \]
Indeed, let v(.) be an arbitrary convex function. Then we get
\[ \mathrm{E}\big[ v(S^{cl}) \big] = \int_0^1 \mathrm{E}\big[ v(S^{cl}(u)) \big] \, du \le \int_0^1 \mathrm{E}\big[ v(S(u)) \big] \, du = \mathrm{E}\big[ v(S) \big], \]
which completes the proof.
Because of Lemma 10, one can determine a lower bound of a single life
annuity using the following conditioning random variable:
\[ \Lambda_{K_x} = \sum_{i=1}^{K_x} \alpha_i e^{-\mu_i + \frac{1}{2} \sigma_i^2} \, Y(i). \]
Intuitively it is clear that the lower bound defined by the random variable
$\Lambda_{K_x}$ should approximate the underlying distribution better than those
defined by the conditioning random variables $\Lambda^{(a)}$ and $\Lambda^{(m)}$. As before, one
starts with computing the conditional distributions for the lower bound
$S^{cl}_{sp,x}$ numerically by considering the equation
\[ \sum_{i=1}^{k} \alpha_i e^{-\mu_i + \frac{1}{2} (1 - r_{i,k}^2) \sigma_i^2 + r_{i,k} \sigma_i \Phi^{-1}\left( F_{S^{cl}_{sp,x} \mid K_x = k}(y) \right)} = y, \]
with correlations $r_{i,k}$ given by
\[ r_{i,k} = \frac{\mathrm{Cov}\big[ Y(i), \Lambda_k \big]}{\sqrt{\mathrm{Var}[Y(i)]} \, \sqrt{\mathrm{Var}[\Lambda_k]}}. \]
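Since $\Lambda_k$ is a linear combination of the $Y(j)$, the correlations $r_{i,k}$ follow directly from the covariance matrix Σ: $\mathrm{Cov}[Y(i), \Lambda_k] = \sum_{j \le k} \beta_j \sigma_{ij}$ with $\beta_j = \alpha_j e^{-\mu_j + \sigma_j^2/2}$. A minimal sketch, with a hypothetical function name and zero-based Python indices:

```python
import math


def correlations_with_lambda_k(k, alpha, mu, cov):
    """Return [r_{1,k}, ..., r_{k,k}] for Lambda_k = sum_{j<=k} beta_j * Y(j).

    alpha, mu: payments and means E[Y(j)]; cov: covariance matrix of the Y(j),
    given as a list of lists (cov[i][i] is sigma_i^2).
    """
    beta = [alpha[j] * math.exp(-mu[j] + 0.5 * cov[j][j]) for j in range(k)]
    # Cov[Y(i), Lambda_k] = sum_j beta_j * sigma_ij
    cov_y_lambda = [sum(beta[j] * cov[i][j] for j in range(k)) for i in range(k)]
    # Var[Lambda_k] = sum_i sum_j beta_i * beta_j * sigma_ij
    var_lambda = sum(beta[i] * cov_y_lambda[i] for i in range(k))
    return [cov_y_lambda[i] / math.sqrt(cov[i][i] * var_lambda) for i in range(k)]


if __name__ == "__main__":
    # Illustrative covariance sigma^2 * min(i, j); this Brownian-type structure
    # is an assumption of the example, not taken from this section.
    n, sigma2 = 5, 0.01
    cov = [[sigma2 * min(i, j) for j in range(1, n + 1)] for i in range(1, n + 1)]
    mu = [0.045 * i for i in range(1, n + 1)]
    print(correlations_with_lambda_k(n, [1.0] * n, mu, cov))
```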
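Consequently, the distribution function of $S^{cl}_{sp,x}$ can be obtained as
\[ F_{S^{cl}_{sp,x}}(y) = \sum_{k=1}^{\lfloor \omega - x \rfloor} \Pr[K_x = k] \, F_{S^{cl}_{sp,x} \mid K_x = k}(y) = \sum_{k=1}^{\lfloor \omega - x \rfloor} {}_{k|}q_x \, F_{S^{cl}_{k}}(y), \]
with
\[ S^{cl}_{k} = \mathrm{E}\big[ S_k \mid \Lambda_k \big]. \tag{3.7} \]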
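The stop-loss premiums of $S^{cl}_{sp,x}$ can be computed as follows
\[
\begin{aligned}
\pi^{clb}(S_{sp,x}, d, \Gamma, \Lambda) &= \mathrm{E}_{K_x}\Big[ \mathrm{E}\big[ (S^{cl}_{sp,x} - d)_+ \mid K_x \big] \Big] \\
&= \sum_{k=1}^{\lfloor \omega - x \rfloor} {}_{k|}q_x \, \pi^{lb}(S_k, d, \Lambda_k) \\
&= \sum_{k=1}^{\lfloor \omega - x \rfloor} {}_{k|}q_x \left( \sum_{i=1}^{k} \pi\big( \mathrm{E}[\alpha_i e^{-Y(i)} \mid \Lambda_k], d^{cl}_{k,i} \big) \right),
\end{aligned}
\]
with $d^{cl}_{k,i}$ given by
\[ d^{cl}_{k,i} = \alpha_i e^{-\mu_i + \frac{1}{2} \sigma_i^2 (1 - r_{i,k}^2) + \sigma_i r_{i,k} \Phi^{-1}\left( F_{S^{cl}_{k}}(d) \right)}. \]
The stop-loss premium of $S^{cl}_{sp,x}$ at retention d can be written out explicitly
as follows
\[ \pi^{clb}(S_{sp,x}, d, \Gamma, \Lambda) = \sum_{k=1}^{\lfloor \omega - x \rfloor} {}_{k|}q_x \left\{ \sum_{i=1}^{k} \alpha_i e^{-\mu_i + \frac{\sigma_i^2}{2}} \, \Phi\Big( r_{i,k} \sigma_i - \Phi^{-1}\big( F_{S^{cl}_{k}}(d) \big) \Big) - d \big( 1 - F_{S^{cl}_{k}}(d) \big) \right\}. \]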
The moments based approximation
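Having computed the upper bound $S^{c}_{sp,x}$ and the lower bounds $S^{l}_{sp,x}$ and
$S^{cl}_{sp,x}$, one can compute two moments based approximations as described
in Subsection 2.2.4. To find the coefficient z given by (2.15) one needs to
calculate the variances of $S^{c}_{sp,x}$, $S^{l}_{sp,x}$, $S^{cl}_{sp,x}$ and $S_{sp,x}$. The variance of $S^{c}_{sp,x}$
and $S^{l}_{sp,x}$ can be computed as explained in Subsection 2.5.3. The variance
of $S_{sp,x}$ and $S^{cl}_{sp,x}$ can be treated very similarly. Indeed, after some simple
calculations one gets
\[
\begin{aligned}
\mathrm{Var}\big[ S^{cl}_{sp,x} \big] &= \mathrm{E}_{K_x}\Big[ \mathrm{E}\big[ (S^{cl}_{sp,x})^2 \mid K_x \big] \Big] - \big( \mathrm{E}[S^{cl}_{sp,x}] \big)^2 \\
&= \sum_{k=1}^{\lfloor \omega - x \rfloor} {}_{k|}q_x \, \mathrm{E}\big[ (S^{cl}_{k})^2 \big] - \big( \mathrm{E}[S^{cl}_{sp,x}] \big)^2, \\
\mathrm{Var}\big[ S_{sp,x} \big] &= \mathrm{E}_{K_x}\Big[ \mathrm{E}\big[ (S_{sp,x})^2 \mid K_x \big] \Big] - \big( \mathrm{E}[S_{sp,x}] \big)^2 \\
&= \sum_{k=1}^{\lfloor \omega - x \rfloor} {}_{k|}q_x \, \mathrm{E}\big[ (S_{k})^2 \big] - \big( \mathrm{E}[S_{sp,x}] \big)^2,
\end{aligned}
\]
where $S^{cl}_{k}$ and $S_k$ are defined as in (3.7) and (3.6) respectively. Thus it
suffices to plug in
\[
\begin{aligned}
\mathrm{E}\big[ S^{cl}_{k} \big] &= \mathrm{E}\big[ S_{k} \big] = \sum_{i=1}^{k} \alpha_i e^{-\mu_i + \frac{\sigma_i^2}{2}}, \\
\mathrm{E}\big[ (S^{cl}_{k})^2 \big] &= \sum_{i=1}^{k} \sum_{j=1}^{k} \alpha_i \alpha_j e^{-\mu_i - \mu_j + \frac{1}{2} (\sigma_i^2 + \sigma_j^2) + r_{i,k} r_{j,k} \sigma_i \sigma_j}, \\
\mathrm{E}\big[ (S_{k})^2 \big] &= \sum_{i=1}^{k} \sum_{j=1}^{k} \alpha_i \alpha_j e^{-\mu_i - \mu_j + \frac{1}{2} (\sigma_i^2 + \sigma_j^2) + \sigma_{ij}},
\end{aligned}
\]
and
\[ \mathrm{E}\big[ S_{sp,x} \big] = \mathrm{E}\big[ S^{cl}_{sp,x} \big] = \sum_{k=1}^{\lfloor \omega - x \rfloor} {}_{k|}q_x \, \mathrm{E}\big[ S_k \big] = \sum_{k=1}^{\lfloor \omega - x \rfloor} {}_{k|}q_x \, \mathrm{E}\big[ S^{cl}_{k} \big]. \]
Now one can compute distributions of the moment based approximations
from the formulas
\[
\begin{aligned}
F_{S^{m}_{sp,x}}(y) &= z_1 F_{S^{l}_{sp,x}}(y) + (1 - z_1) F_{S^{c}_{sp,x}}(y), \\
F_{S^{cm}_{sp,x}}(y) &= z_2 F_{S^{cl}_{sp,x}}(y) + (1 - z_2) F_{S^{c}_{sp,x}}(y)
\end{aligned}
\]
and their corresponding stop-loss premiums as
\[
\begin{aligned}
\pi^{m}(S_{sp,x}, d, \Gamma, \Lambda) &= z_1 \pi^{lb}(S_{sp,x}, d, \Gamma, \Lambda) + (1 - z_1) \pi^{cub}(S_{sp,x}, d), \\
\pi^{cm}(S_{sp,x}, d, \Gamma, \Lambda) &= z_2 \pi^{clb}(S_{sp,x}, d, \Gamma, \Lambda) + (1 - z_2) \pi^{cub}(S_{sp,x}, d),
\end{aligned}
\]
where
\[ z_1 = \frac{\mathrm{Var}[S^{c}_{sp,x}] - \mathrm{Var}[S_{sp,x}]}{\mathrm{Var}[S^{c}_{sp,x}] - \mathrm{Var}[S^{l}_{sp,x}]} \quad \text{and} \quad z_2 = \frac{\mathrm{Var}[S^{c}_{sp,x}] - \mathrm{Var}[S_{sp,x}]}{\mathrm{Var}[S^{c}_{sp,x}] - \mathrm{Var}[S^{cl}_{sp,x}]}. \]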
A numerical illustration
We examine the accuracy and efficiency of the derived approximations
for a single life annuity of a 65-year-old male person with yearly unit
payments. We restrict ourselves to the case of a Black & Scholes setting
(model BS) with drift µ = 0.05 and volatility σ = 0.1. We assume further
that the future lifetime $T_{65}$ follows the Makeham-Gompertz law with the
corresponding coefficients of the Belgian analytic life table MR (see Section
3.2). We compare the distribution functions of the upper bound $S^{c}_{sp,65}$ and
the lower bounds $S^{l}_{sp,65}$ and $S^{cl}_{sp,65}$, as described in the previous sections,
with the original distribution function of $S_{sp,65}$ based on extensive Monte
Carlo (MC) simulation. We generated 500 × 100 000 paths and for each
estimate we computed the standard error (s.e.). As is well-known, the
(asymptotic) 95% confidence interval is given by the estimate plus or minus
1.96 times the standard error. Note also that the random paths are based
on antithetic variables in order to reduce the variance. Notice that to
compute the lower bound we use as conditioning random variable
$\Lambda^{(m)} = \Lambda_{24}$ (the value j = 24 was found to be the one that maximizes the variance
as described in Section 3.3.1).
Figure 3.1 shows the cumulative distribution functions of the
approximations, compared to the empirical distribution. One can see that the
lower bound $S^{cl}_{sp,65}$ is almost indistinguishable from the original
distribution. In order to have a better view on the behavior of the approximations
in the tail, we consider a QQ-plot where the quantiles of $S^{l}_{sp,65}$, $S^{cl}_{sp,65}$ and
$S^{c}_{sp,65}$ are plotted against the quantiles of $S_{sp,65}$ obtained by simulation.
The different bounds will be good approximations if the plotted points
$\big( F^{-1}_{S_{sp,65}}(p), F^{-1}_{S^{l}_{sp,65}}(p) \big)$, $\big( F^{-1}_{S_{sp,65}}(p), F^{-1}_{S^{cl}_{sp,65}}(p) \big)$ and $\big( F^{-1}_{S_{sp,65}}(p), F^{-1}_{S^{c}_{sp,65}}(p) \big)$ for
all values of p in (0, 1) do not deviate too much from the line y = x. From
the QQ-plot in Figure 3.2, we can conclude that the comonotonic upper
bound slightly overestimates the tails of $S_{sp,65}$, whereas the accuracy of
the lower bounds $S^{l}_{sp,65}$ and $S^{cl}_{sp,65}$ is extremely high; the corresponding
QQ-plot is indistinguishable from a perfect straight line. These visual
observations are confirmed by the numerical values of some upper quantiles
displayed in Table 3.1, which also reports the moments based
approximations $S^{m}_{sp,65}$ and $S^{cm}_{sp,65}$.
Stop-loss premiums for the different approximations are compared in
Figure 3.3 and Table 3.2. This study confirms the high accuracy of the
derived bounds. Note that for very high values of d the differences become
larger; however, these cases are of no practical importance. All
Monte Carlo estimates are very close to $\pi^{clb}(S_{sp,65}, d, \Gamma, \Lambda)$ and some of
them even turn out to be smaller than this lower bound. This not only
demonstrates the difficulty of estimating stop-loss premiums by simulation,
but it also indicates the accuracy of the lower bound $\pi^{clb}(S_{sp,65}, d, \Gamma, \Lambda)$.
Indeed, since the Monte Carlo estimate is based on random paths, it can
be smaller than $\pi^{clb}(S_{sp,65}, d, \Gamma, \Lambda)$ and this is very likely to happen if the
lower bound is close to the real stop-loss premium.

p       S^l_{sp,65}   S^cl_{sp,65}   S^m_{sp,65}   S^cm_{sp,65}   S^c_{sp,65}   MC (s.e. × 10^3)
0.75    14.1741       14.1887        14.1750       14.1887        14.1867       14.1887 (0.978)
0.90    17.5905       17.5972        17.6250       17.6008        18.0797       17.5969 (1.420)
0.95    19.9565       19.9713        20.0232       19.9783        20.8754       19.9731 (1.896)
0.975   22.2495       22.2875        22.3559       22.2986        23.6574       22.2839 (2.816)
0.995   27.5124       27.6700        27.7498       27.6943        30.2983       27.6933 (6.324)
Table 3.1: Approximations for some selected quantiles with probability
level p of $S_{sp,65}$.

d    S^l_{sp,65}   S^cl_{sp,65}   S^m_{sp,65}   S^cm_{sp,65}   S^c_{sp,65}   MC (s.e. × 10^4)
0    11.0944       11.0944        11.0944       11.0944        11.0944       11.0937 (9.43)
5    6.3715        6.3756         6.3721        6.3756         6.3792        6.3748 (8.67)
10   2.5956        2.6071         2.6029        2.6078         2.6900        2.6068 (5.89)
15   0.7151        0.7201         0.7265        0.7213         0.8629        0.7201 (0.34)
20   0.1628        0.1664         0.1698        0.1671         0.2536        0.1668 (0.21)
25   0.0357        0.0379         0.0388        0.0382         0.0758        0.0382 (0.10)
30   0.0080        0.0091         0.0092        0.0092         0.0239        0.0093 (0.02)
35   0.0019        0.0023         0.0024        0.0023         0.0081        0.0024 (0.004)
Table 3.2: Approximations for some selected stop-loss premiums with
retention d of $S_{sp,65}$.

Table 3.3 compares the stop-loss premium of the comonotonic upper bound
with the partially exact/comonotonic upper bound $\pi^{pecub}(S_{sp,65}, d, \Lambda, \Gamma)$
(PECUB) and the two combination bounds $\pi^{eub}(S_{sp,65}, d, \Lambda, \Gamma)$ (EMUB)
(upper bounds based on the lower bound $S^{l}_{sp,65}$) and $\pi^{min}(S_{sp,65}, d, \Lambda, \Gamma)$
(MIN). For the partially exact/comonotonic upper bound we use the same
conditioning variable as for the lower bound $S^{cl}_{sp,65}$. Remark that the
decomposition variable is of the form (2.55) with $\Lambda \equiv \Lambda_n$.
For the important retentions d = 5, 10, 15 and 20 the upper bound
$\pi^{min}(S_{sp,65}, d, \Lambda, \Gamma)$ really improves the comonotonic upper bound. Notice
that for the extreme cases the values are more or less the same.
Figure 3.1: The cdf's of '$S_{sp,65}$' (MC) (solid grey line), $S^{l}_{sp,65}$ (•-line),
$S^{cl}_{sp,65}$ (▲-line) and $S^{c}_{sp,65}$ (dashed line). Horizontal axis: outcome;
vertical axis: cdf.
Figure 3.2: QQ-plot of the quantiles of $S^{l}_{sp,65}$ (◦), $S^{cl}_{sp,65}$ (△) and $S^{c}_{sp,65}$
versus those of '$S_{sp,65}$' (MC).
Figure 3.3: Stop-loss premiums for '$S_{sp,65}$' (MC) (solid grey line), $S^{l}_{sp,65}$
(•-line), $S^{cl}_{sp,65}$ (▲-line) and $S^{c}_{sp,65}$ (dashed line). Horizontal axis: outcome;
vertical axis: stop-loss premium.
d    MIN       EMUB      PECUB     CUB       MC (s.e. × 10^4)
0    11.0944   11.0944   11.0944   11.0944   11.0937 (9.43)
5    6.3759    6.3761    6.3775    6.3792    6.3748 (8.67)
10   2.6153    2.6164    2.6523    2.6900    2.6068 (5.89)
15   0.7484    0.7532    0.8025    0.8629    0.7201 (0.34)
20   0.2066    0.2207    0.2331    0.2536    0.1668 (0.21)
25   0.0684    0.1009    0.0711    0.0758    0.0382 (0.10)
30   0.0223    0.0738    0.0223    0.0239    0.0093 (0.02)
35   0.0074    0.0672    0.0074    0.0081    0.0024 (0.004)
Table 3.3: Upper bounds for some selected stop-loss premiums with
retention d of $S_{sp,65}$.
3.3.2 A homogeneous portfolio of life annuities
We consider now the distribution of the present value of a homogeneous
portfolio of N0 life annuities paying a fixed amount of $\alpha_i$ (> 0) at the end
of each year i. This present value can be expressed by the formula
\[ S_{pp,x} = \sum_{i=1}^{\lfloor \omega - x \rfloor} N_i \, \alpha_i e^{-Y(i)}, \]
where $N_i$ denotes the number of survivors in year i and can be written as
\[ N_i = I_{(T_x^{(1)} > i)} + I_{(T_x^{(2)} > i)} + \ldots + I_{(T_x^{(N_0)} > i)}, \]
where $T_x^{(j)}$ denotes the future lifetime of the j-th insured. We assume that
these random variables are mutually independent. So the random
variables $N_i$ are binomially distributed with parameters $n = N_0$ and success
parameter ${}_{i}p_x$.
Note that
\[ S_{pp,x} = \sum_{j=1}^{N_0} S^{(j)}_{sp,x}, \tag{3.8} \]
with $S^{(j)}_{sp,x}$ given by
\[ S^{(j)}_{sp,x} = \sum_{i=1}^{\lfloor \omega - x \rfloor} I_{(T_x^{(j)} > i)} \, \alpha_i e^{-Y(i)}. \]
The computation of the convex upper and lower bound for the case of a
portfolio of life annuities is more complicated than in the case of a single
life annuity. The binomially distributed random variables $N_i$ are not very
useful in practical computations, because there exist no closed-form
expressions for the cumulative and the inverse distribution functions. This
problem can be dealt with by replacing the random variables $N_i$ by more
convenient continuous approximations $\tilde{N}_i$. We propose to approximate the
distribution of $N_i$ by the Normal Power Approximation (NPA). Unlike a
normal approximation, the NPA incorporates the skewness, which matters
because the binomial distribution is very skewed (unless either the parameter
n is very high or the success parameter p is close to 1/2). The distribution
function of the NPA $\tilde{N}_i$ is given by the formula
\[ F_{\tilde{N}_i}(x) = \Phi\left( -\frac{3}{\gamma_{N_i}} + \sqrt{ \frac{9}{\gamma_{N_i}^2} + \frac{6 (x - \mu_{N_i})}{\gamma_{N_i} \sigma_{N_i}} + 1 } \right), \]
where
\[
\begin{aligned}
\mu_{N_i} &= \mathrm{E}[N_i] = N_0 \, {}_{i}p_x, \\
\sigma^2_{N_i} &= \mathrm{Var}[N_i] = N_0 \, {}_{i}p_x \, {}_{i}q_x, \\
\gamma_{N_i} &= \frac{\mathrm{E}\big[ (N_i - \mu_{N_i})^3 \big]}{\sigma^3_{N_i}} = \frac{1 - 2 \, {}_{i}p_x}{\sqrt{N_0 \, {}_{i}p_x \, {}_{i}q_x}}.
\end{aligned}
\]
Then the p-th quantile of $\tilde{N}_i$ is given by
\[ F^{-1}_{\tilde{N}_i}(p) = \mu_{N_i} + \sigma_{N_i} \Phi^{-1}(p) + \frac{\gamma_{N_i} \sigma_{N_i}}{6} \Big( \big( \Phi^{-1}(p) \big)^2 - 1 \Big). \tag{3.9} \]
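The NPA quantile (3.9) and the distribution function above are easy to code. The following sketch uses hypothetical function names; the closed-form cdf is only exercised here for positive skewness (${}_{i}p_x < 1/2$), where it inverts (3.9) whenever $\Phi^{-1}(p) > -3/\gamma_{N_i}$.

```python
import math
from statistics import NormalDist

PHI = NormalDist()  # standard normal distribution


def npa_parameters(n0, p):
    """Mean, standard deviation and skewness of a Binomial(n0, p) count."""
    q = 1.0 - p
    mean = n0 * p
    sd = math.sqrt(n0 * p * q)
    skew = (1.0 - 2.0 * p) / math.sqrt(n0 * p * q)
    return mean, sd, skew


def npa_quantile(prob, n0, p):
    """Quantile (3.9): mean + sd * z + skew * sd / 6 * (z^2 - 1), z = Phi^{-1}(prob)."""
    mean, sd, skew = npa_parameters(n0, p)
    z = PHI.inv_cdf(prob)
    return mean + sd * z + skew * sd / 6.0 * (z * z - 1.0)


def npa_cdf(x, n0, p):
    """Closed-form NPA distribution function; used here with positive skewness."""
    mean, sd, skew = npa_parameters(n0, p)
    inner = 9.0 / skew ** 2 + 6.0 * (x - mean) / (skew * sd) + 1.0
    return PHI.cdf(-3.0 / skew + math.sqrt(inner))


if __name__ == "__main__":
    # 1 000 policies with survival probability 0.3 (illustrative numbers only).
    x95 = npa_quantile(0.95, 1000, 0.3)
    print(x95, npa_cdf(x95, 1000, 0.3))
```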
The upper bound
The upper bound $S^{c}_{pp,x}$ is computed as described in Section 2.5.3. The
only difference is that in the formulas (2.71), (2.72) and (2.75) $F^{-1}_{X_i}(u)$ has
to be replaced by the approximation given in (3.9).
The lower bound
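To compute the lower bound one has to choose two conditioning variables:
Γ and Λ. For the first conditioning random variable Γ we propose to take
$N_{i_0}$, the number of policies-in-force in the year $i_0$. Note that
\[ \mathrm{E}\big[ N_i \mid N_{i_0} = n_0 \big] = {}_{i - i_0}p_{x + i_0} \, n_0 \qquad \text{for } i \ge i_0. \]
For $i < i_0$, $\Pr[N_i = n \mid N_{i_0} = n_0]$ can be computed from Bayes' theorem.
As a result one gets the following formula for the conditional expectation:
\[
\begin{aligned}
\mathrm{E}\big[ N_i \mid N_{i_0} = n_0 \big] &= \sum_{k=n_0}^{N_0} k \, \frac{\Pr[N_{i_0} = n_0 \mid N_i = k] \, \Pr[N_i = k]}{\Pr[N_{i_0} = n_0]} \\
&= \sum_{k=n_0}^{N_0} k \, \frac{\binom{k}{n_0} \binom{N_0}{k}}{\binom{N_0}{n_0}} \, \frac{ {}_{i_0 - i}p_{x+i}^{\,n_0} \; {}_{i_0 - i}q_{x+i}^{\,k - n_0} \; {}_{i}p_x^{\,k} \; {}_{i}q_x^{\,N_0 - k} }{ {}_{i_0}p_x^{\,n_0} \; {}_{i_0}q_x^{\,N_0 - n_0} } \\
&= \sum_{k=n_0}^{N_0} k \binom{N_0 - n_0}{k - n_0} \frac{ {}_{i}p_x^{\,k - n_0} \; {}_{i_0 - i}q_{x+i}^{\,k - n_0} \; {}_{i}q_x^{\,N_0 - k} }{ {}_{i_0}q_x^{\,N_0 - n_0} }.
\end{aligned}
\]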
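For mathematical convenience we rewrite this formula for non-integer
values of $N_{i_0}$ as follows
\[ \mathrm{E}\big[ N_i \mid N_{i_0} = y \big] = \sum_{k=\lceil y \rceil}^{N_0} k \binom{N_0 - \lceil y \rceil}{k - \lceil y \rceil} \frac{ {}_{i}p_x^{\,k - \lceil y \rceil} \; {}_{i_0 - i}q_{x+i}^{\,k - \lceil y \rceil} \; {}_{i}q_x^{\,N_0 - k} }{ {}_{i_0}q_x^{\,N_0 - \lceil y \rceil} }. \tag{3.10} \]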
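Formula (3.10) can be evaluated term by term. The sketch below groups the powers as $a^{k - \lceil y \rceil} b^{N_0 - k}$ with $a = {}_{i}p_x \, {}_{i_0 - i}q_{x+i} / {}_{i_0}q_x$ and $b = {}_{i}q_x / {}_{i_0}q_x$, which is algebraically the same expression but avoids dividing by a very small power. Since a + b = 1, the sum is the mean of a shifted binomial, so it should agree with $\lceil y \rceil + (N_0 - \lceil y \rceil) a$; this closed form is an observation added here as a check, not a formula from the text. The function name and argument layout are illustrative.

```python
import math


def cond_expected_survivors(n_total, y, p_i, p_i0):
    """E[N_i | N_{i0} = y] for i < i0, following (3.10).

    n_total: initial number of policies N0
    y:       (possibly non-integer) number in force in year i0
    p_i:     survival probability ipx
    p_i0:    survival probability i0px (p_i0 < p_i)
    """
    n = math.ceil(y)
    m = n_total - n                      # policies that lapsed by death before i0
    q_between = 1.0 - p_i0 / p_i         # (i0-i)q_{x+i}
    q_i0 = 1.0 - p_i0                    # i0qx
    a = p_i * q_between / q_i0
    b = (1.0 - p_i) / q_i0
    total = 0.0
    for k in range(n, n_total + 1):
        j = k - n
        total += k * math.comb(m, j) * a ** j * b ** (m - j)
    return total


if __name__ == "__main__":
    # Illustrative numbers: 20 policies, 12 still in force in year i0.
    print(cond_expected_survivors(20, 12, 0.9, 0.7))
```

For portfolios with very many policies the binomial coefficients become too large for floating point, so a production version would work with logarithms or use the closed form directly.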
We propose to take $\Lambda^{(a)}$, as defined in Section 3.3.1, for the second
conditioning random variable Λ. Now one can perform step by step the
computations described in Subsection 2.5.3 with the only exception that
$\mathrm{E}[X_i \mid \Gamma = \gamma]$ has to be replaced in the formulas (2.80) and (2.81) by
$\mathrm{E}[N_i \mid N_{i_0} = y]$ in (3.10).
Also the stop-loss premiums are calculated according to the methodology
presented in Section 5.3 and 2.5.3, the only difference being the
replacement of $\mathrm{E}\big[ X_i \mid \Gamma = F^{-1}_{\Gamma}(u) \big]$ in formula (2.83) by the approximation given
in (3.10).
The moments based approximation
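As in the case of a single life annuity, the only problem in the computation
of the weight z given by (2.66) is to find expressions for the variances of
$S^{c}_{pp,x}$, $S^{l}_{pp,x}$ and $S_{pp,x}$. For the upper and the lower bound we have deployed
a procedure, described in Section 2.5.3, with $f_i(u)$ replaced by
\[ f_i(u) = F^{-1}_{\tilde{N}_i}(u) \quad \text{for the upper bound} \]
and
\[ f_i(u) = \mathrm{E}\big[ N_i \mid N_{i_0} = F^{-1}_{\tilde{N}_{i_0}}(u) \big] \quad \text{for the lower bound.} \]
The variance of $S_{pp,x}$ can be computed from (3.8) and by noticing that,
given the returns $\vec{Y} = (Y(1), \ldots, Y(\lfloor \omega - x \rfloor))$, the random variables $S^{(1)}_{sp,x}$,
$S^{(2)}_{sp,x}, \ldots, S^{(N_0)}_{sp,x}$ are conditionally independent. Hence, we have that
\[
\begin{aligned}
\mathrm{Var}\big[ S_{pp,x} \big] &= \mathrm{E}_{\vec{Y}}\Big[ \mathrm{Var}\big[ S_{pp,x} \mid \vec{Y} \big] \Big] + \mathrm{Var}_{\vec{Y}}\Big[ \mathrm{E}\big[ S_{pp,x} \mid \vec{Y} \big] \Big] \\
&= N_0 \, \mathrm{E}_{\vec{Y}}\Big[ \mathrm{Var}\big[ S_{sp,x} \mid \vec{Y} \big] \Big] + N_0^2 \, \mathrm{Var}_{\vec{Y}}\Big[ \mathrm{E}\big[ S_{sp,x} \mid \vec{Y} \big] \Big] \\
&= N_0 \, \mathrm{Var}\big[ S_{sp,x} \big] + (N_0^2 - N_0) \, \mathrm{Var}_{\vec{Y}}\Big[ \mathrm{E}\big[ S_{sp,x} \mid \vec{Y} \big] \Big],
\end{aligned}
\]
where $\mathrm{Var}[S_{sp,x}]$ is calculated in Subsection 3.3.1 and
\[ \mathrm{Var}_{\vec{Y}}\Big[ \mathrm{E}\big[ S_{sp,x} \mid \vec{Y} \big] \Big] = \sum_{i=1}^{\lfloor \omega - x \rfloor} \sum_{j=1}^{\lfloor \omega - x \rfloor} {}_{i}p_x \, {}_{j}p_x \, \alpha_i \alpha_j e^{-\mu_i - \mu_j + \frac{\sigma_i^2 + \sigma_j^2}{2}} \big( e^{\sigma_{ij}} - 1 \big). \]

p       S^l_{pp,65}   S^m_{pp,65}   S^c_{pp,65}   MC (s.e.)
0.75    12 574        12 577        12 821        12 577 (3.90)
0.90    14 565        14 574        15 290        14 568 (5.08)
0.95    15 937        15 951        17 029        15 947 (8.15)
0.975   17 252        17 272        18 722        17 276 (8.80)
0.995   20 209        20 250        22 620        20 242 (22.09)
Table 3.4: Approximations for some selected quantiles with probability
level p of $S_{pp,65}$.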
A numerical illustration
To test the quality of the derived approximations we present a numerical
illustration similar to that of Subsection 3.3.1. As before we work
in a Black & Scholes setting with drift µ = 0.05 and volatility σ = 0.1
and we use the Makeham-Gompertz law to describe the mortality process
of 65-year-old male persons. We compare the performance of the lower
bound $S^{l}_{pp,65}$, the upper bound $S^{c}_{pp,65}$ and the moments based approximation
$S^{m}_{pp,65}$ with the real value $S_{pp,65}$, obtained by extensive simulation,
for a portfolio of 1 000 policies. The number of policies-in-force after the
first year $N_1$ is taken as the conditioning random variable Γ for the lower
bound. This choice seems reasonable to us: other choices improve the
performance of the lower bound only slightly, at the cost of a significant
increase in computational time. The Monte Carlo (MC) study of
$S_{pp,65}$ is based on 30 × 50 000 simulated paths. Antithetic variables are
used in order to reduce the variance of the Monte Carlo estimates.
The quality of the approximations is illustrated in Figures 3.4 and 3.5.
One can see that the lower bound $S^{l}_{pp,65}$ indeed performs very well. The fit
of the upper bound is a bit poorer but still reasonable. The moments based
approximation $S^{m}_{pp,65}$ performs extremely well. These visual observations
are confirmed by the numerical values of some upper quantiles displayed
in Table 3.4 and by the study of stop-loss premiums in Figure 3.6 and in
Table 3.5.
Figure 3.4: The cdf's of '$S_{pp,65}$' (MC) (solid grey line), $S^{l}_{pp,65}$ (•-line),
$S^{m}_{pp,65}$ (▲-line) and $S^{c}_{pp,65}$ (dashed line). Horizontal axis: outcome;
vertical axis: cdf.
Figure 3.5: QQ-plot of the quantiles of $S^{l}_{pp,65}$ (◦), $S^{m}_{pp,65}$ (△) and $S^{c}_{pp,65}$
versus those of '$S_{pp,65}$' (MC).
Figure 3.6: Stop-loss premiums for '$S_{pp,65}$' (MC) (solid grey line), $S^{l}_{pp,65}$
(•-line), $S^{m}_{pp,65}$ (▲-line) and $S^{c}_{pp,65}$ (dashed line). Horizontal axis: outcome;
vertical axis: stop-loss premium.
d        S^l_{pp,65}   S^m_{pp,65}   S^c_{pp,65}   MC (s.e.)
0        11 094        11 094        11 094        11 098 (2.11)
5 000    6 094         6 094         6 095         6 098 (2.10)
10 000   1 608         1 610         1 793         1 611 (1.95)
15 000   153.7         155.3         278.4         155.3 (1.78)
20 000   10.23         10.57         36.02         10.67 (1.26)
25 000   0.680         0.734         4.816         0.743 (0.09)
30 000   0.051         0.059         0.711         0.036 (0.02)
Table 3.5: Approximations for some selected stop-loss premiums with
retention d of $S_{pp,65}$.
3.3.3 An ‘average’ portfolio of life annuities
As explained in Section 3.2, in the case of large portfolios of life annuities
it suffices to compute risk measures of an 'average' portfolio given by
\[ S_{app,x} = \sum_{i=1}^{\lfloor \omega - x \rfloor} {}_{i}p_x \, \alpha_i e^{-Y(i)}, \]
where we assume that the payments $\alpha_i$ are positive and due at times
$i = 1, \ldots, \lfloor \omega - x \rfloor$ (payable at the end of each year). Notice that $S_{app,x}$ is of the
form (2.29) and that $S_{app,x} = \mathrm{E}[S_{sp,x} \mid Y(1), \ldots, Y(\lfloor \omega - x \rfloor)]$. Comonotonic
approximations for this type of sums have been studied extensively by Kaas
et al. (2000), Dhaene et al. (2002a,b), Vyncke (2003), Darkiewicz (2005b)
and Vanduffel (2005), among others.
It turns out that for this application the conditioning variable of the
'maximal variance' form gives very accurate results. This means that we
define Λ here as
\[ \Lambda = \sum_{i=1}^{\lfloor \omega - x \rfloor} {}_{i}p_x \, \alpha_i e^{-\mu_i + \frac{1}{2} \sigma_i^2} \, Y(i). \]
Notice that this conditioning variable could also be used in order to derive
the lower bound for a single life annuity.
To compute the comonotonic approximations for the quantiles and
stop-loss premiums, notice that the correlations $r_i$ are given by
\[ r_i = \mathrm{Corr}(Y(i), \Lambda) = \frac{\mathrm{Cov}[Y(i), \Lambda]}{\sigma_i \sigma_\Lambda}. \]
Because all correlation coefficients $r_i$ are positive, the
lower bound is a comonotonic sum (all the terms in the sum are
non-decreasing functions of the same standard uniform random variable U).
This implies that the quantiles related to the lower and upper bound can
be computed by summing the corresponding quantiles for the marginals
involved. We find the following expressions for the quantiles and stop-loss
premiums of $S^{l}_{app,x}$ and $S^{c}_{app,x}$:
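\[
\begin{aligned}
F^{-1}_{S^{l}_{app,x}}(p) &= \sum_{i=1}^{\lfloor \omega - x \rfloor} {}_{i}p_x \, \alpha_i e^{-\mu_i + r_i \sigma_i \Phi^{-1}(p) + \frac{1}{2} (1 - r_i^2) \sigma_i^2}, \\
F^{-1}_{S^{c}_{app,x}}(p) &= \sum_{i=1}^{\lfloor \omega - x \rfloor} {}_{i}p_x \, \alpha_i e^{-\mu_i + \mathrm{sign}({}_{i}p_x \alpha_i) \sigma_i \Phi^{-1}(p)}, \\
\pi^{lb}(S_{app,x}, d, \Lambda) &= \sum_{i=1}^{\lfloor \omega - x \rfloor} {}_{i}p_x \, \alpha_i e^{-\mu_i + \frac{\sigma_i^2}{2}} \, \Phi\Big[ r_i \sigma_i - \Phi^{-1}\big( F_{S^{l}_{app,x}}(d) \big) \Big] - d \big( 1 - F_{S^{l}_{app,x}}(d) \big), \\
\pi^{cub}(S_{app,x}, d) &= \sum_{i=1}^{\lfloor \omega - x \rfloor} {}_{i}p_x \, \alpha_i e^{-\mu_i + \frac{\sigma_i^2}{2}} \, \Phi\Big[ \mathrm{sign}({}_{i}p_x \alpha_i) \sigma_i - \Phi^{-1}\big( F_{S^{c}_{app,x}}(d) \big) \Big] - d \big( 1 - F_{S^{c}_{app,x}}(d) \big).
\end{aligned}
\]
To calculate the moments based approximation we need the expressions
for the variances of $S_{app,x}$, $S^{l}_{app,x}$ and $S^{c}_{app,x}$. These are given by
\[
\begin{aligned}
\mathrm{Var}[S_{app,x}] &= \sum_{i=1}^{\lfloor \omega - x \rfloor} \sum_{j=1}^{\lfloor \omega - x \rfloor} {}_{i}p_x \, {}_{j}p_x \, \alpha_i \alpha_j e^{-\mu_i - \mu_j + \frac{\sigma_i^2 + \sigma_j^2}{2}} \big( e^{\sigma_{ij}} - 1 \big), \\
\mathrm{Var}[S^{l}_{app,x}] &= \sum_{i=1}^{\lfloor \omega - x \rfloor} \sum_{j=1}^{\lfloor \omega - x \rfloor} {}_{i}p_x \, {}_{j}p_x \, \alpha_i \alpha_j e^{-\mu_i - \mu_j + \frac{\sigma_i^2 + \sigma_j^2}{2}} \big( e^{r_i r_j \sigma_i \sigma_j} - 1 \big), \\
\mathrm{Var}[S^{c}_{app,x}] &= \sum_{i=1}^{\lfloor \omega - x \rfloor} \sum_{j=1}^{\lfloor \omega - x \rfloor} {}_{i}p_x \, {}_{j}p_x \, \alpha_i \alpha_j e^{-\mu_i - \mu_j + \frac{\sigma_i^2 + \sigma_j^2}{2}} \big( e^{\sigma_i \sigma_j} - 1 \big).
\end{aligned}
\]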
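For the 'average' portfolio all quantities are closed-form sums, so the quantiles of both bounds and the three variances can be sketched in a few lines. The code below assumes positive payments and takes the survival probabilities, means and covariance matrix as inputs; the function names are illustrative, not from the text.

```python
import math
from statistics import NormalDist

PHI = NormalDist()  # standard normal distribution


def lambda_correlations(surv, alpha, mu, cov):
    """r_i = Corr(Y(i), Lambda), Lambda = sum_i ipx * alpha_i * exp(-mu_i + s_i^2/2) * Y(i)."""
    n = len(alpha)
    w = [surv[i] * alpha[i] * math.exp(-mu[i] + 0.5 * cov[i][i]) for i in range(n)]
    cov_y_lam = [sum(w[j] * cov[i][j] for j in range(n)) for i in range(n)]
    sd_lam = math.sqrt(sum(w[i] * cov_y_lam[i] for i in range(n)))
    return [cov_y_lam[i] / (math.sqrt(cov[i][i]) * sd_lam) for i in range(n)]


def quantile_lower(p, surv, alpha, mu, cov):
    """Quantile of the lower bound of S_app,x at level p."""
    r = lambda_correlations(surv, alpha, mu, cov)
    z = PHI.inv_cdf(p)
    return sum(
        surv[i] * alpha[i]
        * math.exp(-mu[i] + r[i] * math.sqrt(cov[i][i]) * z
                   + 0.5 * (1.0 - r[i] ** 2) * cov[i][i])
        for i in range(len(alpha))
    )


def quantile_upper(p, surv, alpha, mu, cov):
    """Quantile of the comonotonic upper bound (positive payments assumed)."""
    z = PHI.inv_cdf(p)
    return sum(
        surv[i] * alpha[i] * math.exp(-mu[i] + math.sqrt(cov[i][i]) * z)
        for i in range(len(alpha))
    )


def three_variances(surv, alpha, mu, cov):
    """Return (Var[lower bound], Var[S_app,x], Var[upper bound])."""
    n = len(alpha)
    r = lambda_correlations(surv, alpha, mu, cov)
    sd = [math.sqrt(cov[i][i]) for i in range(n)]

    def double_sum(cross):
        return sum(
            surv[i] * surv[j] * alpha[i] * alpha[j]
            * math.exp(-mu[i] - mu[j] + 0.5 * (cov[i][i] + cov[j][j]))
            * (math.exp(cross(i, j)) - 1.0)
            for i in range(n)
            for j in range(n)
        )

    lower = double_sum(lambda i, j: r[i] * r[j] * sd[i] * sd[j])
    exact = double_sum(lambda i, j: cov[i][j])
    upper = double_sum(lambda i, j: sd[i] * sd[j])
    return lower, exact, upper
```

The weight of the moments based approximation then follows as (upper − exact) / (upper − lower), in line with the definition of z used for the single life annuity.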
3.3.4 A numerical illustration
In this subsection we illustrate our findings numerically and graphically.
We use the same parameters for the financial and mortality process as in
the two previous illustrations, namely a Black & Scholes model for the
returns with µ = 0.05, σ = 0.1 and the Makeham-Gompertz law with
corresponding coefficients of the Belgian analytic life table MR. We will
compare the different approximations for quantiles and stop-loss premiums
with the values obtained by Monte Carlo simulation (MC). The simulation
3.3. The distribution of life annuities 121
results are based on generating 500 × 100000 random paths. The estimates
obtained from this time-consuming simulation will serve as benchmark.
The random paths are based on antithetic variables in order to reduce the
variance of the Monte Carlo estimates.
Figure 3.7 shows the distribution functions of the lower bound $S^l_{app,65}$,
the upper bound $S^c_{app,65}$, the moments based approximation $S^m_{app,65}$ and the
simulated one $S_{app,65}$. Again the lower bound and the moments based
approximation prove to be very good approximations for the real cumulative
distribution function of $S_{app,65}$. To assess the accuracy of the bounds in
the tails, we plot their quantiles against those of $S_{app,65}$ in Figure 3.8. The
largest quantile (p = 0.995) of $S^m_{app,65}$ in this QQ-plot underestimates the
exact quantile by only 0.06%. Table 3.6 shows the numerical values for
some high quantiles. The stop-loss premiums for different choices of d are
shown in Figure 3.9 and in Table 3.7. The lower bound and the moments
based approximation give very accurate results compared to the real value
of the stop-loss premium. The comonotonic upper bound performs rather
badly for some retentions. However, using the results of Chapter 2, we can
construct upper bounds that are sharper than the traditional comonotonic
upper bound.
In Table 3.8 we compare the stop-loss premium of the comonotonic upper
bound with the partially exact/comonotonic upper bound
$\pi^{pecub}(S_{app,65}, d, \Lambda)$ (PECUB) and the two upper bounds based on the
lower bound $S^l_{app,65}$ plus an error term: one in which the error term depends
on the retention, $\pi^{deub}(S_{app,65}, d, \Lambda)$ (DEUB), and one in which it is
independent of the retention, $\pi^{eub}(S_{app,65}, d, \Lambda)$ (EUB). For the partially
exact/comonotonic upper bound we use the same conditioning variable
as for the lower bound $S^l_{app,65}$. The decomposition variable used in this
illustration is given by
\[ d_\Lambda = d - \sum_{i=1}^{\lfloor \omega-x \rfloor} {}_ip_x\, \alpha_i\, e^{-\mu_i + \frac{\sigma_i^2}{2}} \left(1 + \mu_i - \frac{1}{2}\sigma_i^2\right). \]
The results for the different upper bounds are in line with the previous ones
for a single life annuity. Note that for very high values of d the differences
become larger, but these cases are of no practical importance.
We can conclude that in both cases the upper bound based on the lower
bound plus an error term dependent on the retention, $\pi^{deub}(\cdot, d, \Lambda)$, performs
very well.
p        S^l_{app,65}   S^m_{app,65}   S^c_{app,65}   MC (s.e. × 10^4)
0.75     12.5745        12.5760        12.8192        12.574  (0.03)
0.90     14.5649        14.5698        15.2819        14.5699 (0.07)
0.95     15.9364        15.9444        17.0152        15.9448 (0.14)
0.975    17.2513        17.2628        18.703         17.2683 (0.24)
0.995    20.2073        20.2303        22.5847        20.2425 (1.58)
Table 3.6: Approximations for some selected quantiles with probability
level p of $S_{app,65}$.

d        S^l_{app,65}   S^m_{app,65}   S^c_{app,65}   MC (s.e. × 10^4)
0        11.0944        11.0944        11.0944        11.0948 (8.22)
5         6.0945         6.0945         6.0951         6.0948 (7.67)
10        1.6081         1.6094         1.7910         1.6097 (4.45)
15        0.1536         0.1545         0.2766         0.1549 (1.01)
20        0.0102         0.0104         0.0355         0.0105 (0.31)
25        0.0007         0.0007         0.0047         0.0007 (0.01)
Table 3.7: Approximations for some selected stop-loss premiums with
retention d of $S_{app,65}$.
Notice that for the retention d = 0 all values (except the value for EUB,
because there the error term is independent of the retention) in both tables
are identical and equal to 11.0944. This follows from the fact that in this
case the expected value of $S_{sp,65}$ equals the expected value of $S_{app,65}$. Note
also that the values in Tables 3.2 and 3.3 are typically larger than the
corresponding values in Tables 3.7 and 3.8. This is not surprising. From
Remark 5 it immediately follows that $S_{app,65} \leq_{cx} S_{sp,65}$ and hence for any
retention d > 0 one has
\[ \pi(S_{app,65}, d) \leq \pi(S_{sp,65}, d). \]
[Figure 3.7; horizontal axis: outcome, vertical axis: cdf]
Figure 3.7: The cdf's of '$S_{app,65}$' (MC) (solid grey line), $S^l_{app,65}$ (•-line),
$S^m_{app,65}$ (▲-line) and $S^c_{app,65}$ (dashed line).
[Figure 3.8]
Figure 3.8: QQ-plot of the quantiles of $S^l_{app,65}$ (◦), $S^m_{app,65}$ (△) and
$S^c_{app,65}$ versus those of '$S_{app,65}$' (MC).
[Figure 3.9; horizontal axis: outcome, vertical axis: stop-loss premium]
Figure 3.9: Stop-loss premiums for '$S_{app,65}$' (MC) (solid grey line),
$S^l_{app,65}$ (•-line), $S^m_{app,65}$ (▲-line) and $S^c_{app,65}$ (dashed line).
d        EUB       DEUB      PECUB     CUB       MC (s.e. × 10^4)
0        11.1652   11.0944   11.0944   11.0944   11.0948 (8.22)
5         6.1653    6.0948    6.0948    6.0951    6.0948 (7.67)
10        1.6789    1.6240    1.6980    1.7910    1.6097 (4.45)
15        0.2244    0.2144    0.2559    0.2766    0.1549 (1.01)
20        0.0810    0.0809    0.0328    0.0355    0.0105 (0.31)
25        0.0715    0.0715    0.0041    0.0047    0.0007 (0.01)
Table 3.8: Upper bounds for some selected stop-loss premiums with
retention d of $S_{app,65}$.
3.4 Conclusion
In this chapter we studied the case of life annuities. The aggregate distri-
bution function of such stochastic sums of dependent random variables is
very difficult to calculate. Usually it is only possible to get formulae for the
first couple of moments. To compute more cumbersome risk measures, like
stop-loss premiums or upper quantiles, one has to rely on time consuming
simulations.
We derived comonotonicity based approximations both for the case of
a single life annuity and a homogeneous portfolio of life annuities. The
numerical illustrations confirm the very high accuracy of the bounds
(especially the lower bound). These observations are confirmed by the results
for the stop-loss premiums. One might get the impression that the upper
bound, which performs worse than the lower bound in all cases, is
not worth studying. In actuarial applications, however, the upper
bound deserves a lot of attention because one is usually interested in
conservative estimates of quantities of interest. Indeed, when an actuary
calculates reserves he has to take into account some additional sources of
uncertainty, such as the choice of the interest rates model, the estimation
of parameters, the assumptions about mortality, the longevity risk and
many others. For these the estimates provided by the upper bound in con-
vex order can be in many cases more appropriate than the more accurate
approximations obtained from the lower bound in convex order.
Chapter 4
Reserving in non-life
insurance business
Summary In this chapter we present some methods to set up confidence
bounds for the discounted IBNR reserve. We first model the claim pay-
ments by means of a lognormal and a loglinear location-scale regression
model. We further extend this to the class of generalized linear models.
The knowledge of the distribution function of the discounted IBNR re-
serve will help us to determine the initial reserve, e.g. through the quantile
risk measure. The results are based on the comonotonic approximations
explained in Chapter 2.
4.1 Introduction
To get the correct picture of its liabilities, a company should set aside the
correctly estimated amount of money to meet claims arising in the future
on the written policies. The past data used to construct estimates for the
future payments consist of a triangle of incremental claims Yij , as depicted
in Figure 4.1. This is the simplest shape of data that can be obtained and
it avoids having to introduce complicated notation to cope with all possible
situations. We use the standard notation, with the random variables Yij for
i = 1, 2, . . . , t; j = 1, 2, . . . , s denoting the claim figures for year of origin (or
accident year) i and development year j, meaning that the claim amounts
were paid in calendar year i+j−1. Year of origin, year of development and
calendar year act as possible explanatory variables for the observation Yij .
Year of     Development year
origin      1      2      ...    j      ...    t-1        t
1           Y11    Y12    ...    Y1j    ...    Y1,t-1     Y1t
2           Y21    Y22    ...    Y2j    ...    Y2,t-1
...         ...    ...    ...    ...    ...
i           Yi1    ...    ...    Yij
...         ...    ...    ...
t           Yt1
Figure 4.1: Random variables in a run-off triangle
Most claims reserving methods assume that t = s. For (i, j) combinations
with i + j ≤ t+ 1, Yij has already been observed, otherwise it is a future
observation. Next to claims actually paid, these figures can also be used
to denote quantities such as loss ratios. To a large extent, it is irrelevant
whether incremental or cumulative data are used when considering claims
reserving in a stochastic context.
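The indexing conventions of the run-off triangle can be made concrete with a short sketch. The helper names below are hypothetical; indices are 1-based as in the text, with t = s.

```python
def is_observed(i, j, t):
    """Cell (i, j) lies in the observed upper triangle when i + j <= t + 1."""
    return i + j <= t + 1

def calendar_year(i, j):
    """Claims of accident year i and development year j are paid in year i + j - 1."""
    return i + j - 1

def future_cells(t):
    """Cells of the lower triangle, i.e. the payments still to be predicted."""
    return [(i, j) for i in range(2, t + 1) for j in range(t + 2 - i, t + 1)]

def to_cumulative(incremental_row):
    """Cumulative claims of one accident year from its incremental claims."""
    cumulative, running = [], 0.0
    for y in incremental_row:
        running += y
        cumulative.append(running)
    return cumulative

print(future_cells(4))
print(to_cumulative([100.0, 50.0, 25.0]))
```

For t = 4 the lower triangle contains t(t-1)/2 = 6 cells, and every one of them has a calendar year later than t.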
We consider annual development (the methods can be extended easily
to semi-annual, quarterly or monthly development) and we assume that
the time it takes for the claims to be completely paid is fixed and known.
The triangle is augmented each year by the addition of a new diagonal.
The purpose is to complete this run-off triangle to a square, or to
a rectangle if estimates are required pertaining to development years of
which no data are recorded in the run-off triangle at hand. To this end, the
actuary can make use of a variety of techniques. The inherent uncertainty
is described by the distribution of possible outcomes, and one needs to
arrive at the best estimate of the reserve.
The choice of an appropriate statistical model is an important matter.
Furthermore within a stochastic framework, there is considerable flexibil-
ity in the choice of predictor structures. In England & Verrall (2002) the
reader finds an excellent review of possible stochastic models. An appro-
priate model will enable the calculation of the distribution of the reserve
that reflects the process variability producing the future payments, and
accounts for the estimation error and statistical uncertainty (in the sense
given in Taylor & Ashe (1983)). It is necessary to be able to estimate the
variability of claims reserves, and ideally to be able to estimate the full dis-
tribution of possible outcomes so that percentiles (or other risk measures
of this distribution) can be obtained. Next, recognizing the estimation
error involved with the parameter estimates, confidence intervals for these
measures constitute another desirable part of the output.
Here, putting the emphasis on the discounting aspect of the reserve,
we first consider simple lognormal linear models. Doray (1996) studied
the loglinear models extensively, taking into account the estimation error
on the parameters and the statistical prediction error in the model. Such
models have some significant disadvantages. Predictions from this model
can yield unusable results, and we need to impose that each incremental
value should be greater than zero. So, it is not possible to model negative
or zero claims. From the nature of the claims reserving problem, it is
expected that a higher proportion of zeros would be observed in the later
stages of the incremental loss data triangle. In reinsurance, zero claims
are also frequently observed in incremental loss data triangles for excess
layers. Negative incremental values can result from salvage recoveries,
payments from third parties, total or partial cancellation of outstanding
claims (due to initial overestimation of the loss or to a jury decision
in favor of the insurer), rejection by the insurer, or just plain errors.
In Goovaerts & Redant (1999) a lognormal linear regression model is used
to model the random fluctuations in the direction of the calendar years,
taking into account the apparatus of financial mathematics. The results are
based on supermodularity order, such that, in the framework of stop-loss
ordering one obtains the distribution of the IBNR reserve corresponding
to an extremal element in this ordering, when some marginals are fixed.
The lognormal linear model is a member of the broader class of loglinear
location-scale regression models. In Doray (1994) the reader can find
an overview of many characteristics of the different distributions in
this class. The logarithm of the error is assumed to follow certain known
distributions (normal, extreme value, generalized loggamma, logistic and
log inverse Gaussian). Doray studied these models extensively. He has de-
rived certain theoretical properties of these distributions and proved that
the MLE’s of the regression and scale parameters exist and are unique,
when the error has a log-concave density.
Claim sizes can often be described by distributions with a subexpo-
nential right tail. Furthermore, the phenomena to be modelled are rarely
additive in the collateral data. A multiplicative model is much more plau-
sible. These problems cannot be solved by working with ordinary linear
models, but with generalized linear models. The generalization is twofold.
First, it is allowed that the random deviations from the mean follow an-
other distribution than the normal. In fact, one can take any distribution
from the exponential dispersion family, including for instance the Poisson,
the binomial, the gamma and the inverse Gaussian distributions. Second,
it is no longer necessary that the mean of the random variable is a linear
function of the explanatory variables, but it only has to be linear on a
certain scale. If this scale for instance is logarithmic, we have in fact a
multiplicative model instead of an additive model.
Loss reserving deals with the determination of the (characteristics of
the) d.f. of the random present value of an unknown amount of future
payments. Since this d.f. is very important for an insurance company and
its policyholders, these inherent uncertainties are no excuse for providing
anything less than a rigorous scientific analysis. In order for the reserve
estimate truly to represent the actuary’s “best estimate” of the needed
reserve, both the determination of the expected value of unpaid losses and
the appropriate discount should reflect the actuary’s best estimates (i.e.
should not be dictated by others or by regulatory requirements). Since
the reserve is a provision for the future payment of unpaid losses, we be-
lieve the estimated loss reserve should reflect the time value of money. In
many situations this discounted reserve is useful, for example dynamic fi-
nancial analysis, assessing profitability and pricing, identifying risk based
capital needs, loss portfolio transfers, profit testing, and so on. Ideally the
discounted loss reserve would also be acceptable for regulatory reporting.
However, many current regulations do not permit it. It could be argued
that reserves set on an undiscounted basis include an implicit margin for
prudence, although, in the current climate of low interest rates, that mar-
gin is very much reduced. If reserves are set on a discounted basis, there is
a strong case for including an explicit prudential margin. As such, a risk
margin based on a risk measure from a predictive distribution of claims
reserves is a strong contender.
One of the sub-problems in this respect consists of the discounting of
the future estimates in the run-off triangle, where returns (and inflation)
are not known for certain. We will model the stochastic discount factor
using a Brownian motion with drift. When determining the discounted
loss reserve, we impose an explicit margin based on a risk measure (for
example Value-at-Risk) from the total distribution of the discounted re-
serve. Considering the discounted IBNR reserve, we have to incorporate a
certain dependence structure. In general, it is hard or even impossible to
determine the quantiles of the discounted loss reserve analytically, because
in any realistic model for the return process this random variable will be
a sum of strongly dependent random variables. The “true” multivariate
distribution function of the lower triangle cannot be determined analyti-
cally in most cases, because the mutual dependencies are not known, or
are difficult to cope with. We suggest to solve this problem by calculating
upper and lower bounds making efficient use of the available information.
This chapter is set out as follows. Section 2 places the claims reserving
problem in a broader context. Section 3 gives a brief review of loglinear
and generalized linear models and their applications to claims reserving.
To be able to use the results of Chapter 2 we need some asymptotic results
for model parameter estimates in generalized linear models. Section 4 de-
scribes how convex lower and upper bounds can be obtained for discounted
IBNR evaluations. Some numerical illustrations for a simulated data set
are provided in Section 5, together with a discussion of the estimation error
using a bootstrap approach. We also graphically illustrate the obtained
bounds.
The results of this chapter come from Hoedemakers, Beirlant, Goovaerts
& Dhaene (2003, 2005).
4.2 The claims reserving problem
As a rule not all claims on a general insurance portfolio will have been paid
by the end of the calendar year of an insurance company. There can be
several reasons for the delay in payment, e.g. delays in reporting the claim,
long legal procedures, difficulties in determining the size of the claim, and
so on. It is also possible that the claim still has to occur, but that the
cause of the claim lies in the past (e.g. exposure to asbestos). This of
course depends on what is insured in the policy. The delay in payment can
vary from a couple of days up to some years depending on the complexity
and the severity of the damage. To be able to pay these claims the insurer
has to keep reserves which should enable him to pay all future outstanding
claims.
Claims reserving is a vital area of insurance company management,
which is receiving close attention from shareholders, auditors, tax author-
ities and regulators. For insurance companies, the claims reserve is a very
substantial balance sheet item, which can be large in relation to
shareholders' funds. Actuaries are now well established in the area of claims
reserving for non-life insurance business. In many countries there is already
a statutory requirement for actuarial certification of reserves. Even in
jurisdictions where there is no such requirement, the substantial contribu-
tion actuaries can make to estimating future liabilities has been recognized
across the market.
Failure to reserve accurately for outstanding and IBNR claims will
adversely affect a company’s future financial development. Any current
reserve inadequacy will give rise to losses in subsequent years. Conversely,
premium calculations based on a too pessimistic evaluation of current lia-
bilities will damage the company’s competitive position.
The reserves held by a general insurance company can be divided into the
following categories:
• Claims reserves representing the estimated outstanding claims pay-
ments that are to be covered by premiums already earned by the
company. These reserves are sometimes called IBNS reserves (In-
curred But Not Settled). These can in turn be divided into
1. IBNYR reserves, representing the estimated claims payments
for claims which have already Incurred, But which are Not Yet
Reported to the company.
2. RBNS reserves, being the reserves required in respect of claims
which have been Reported to the company, But are Not yet
fully Settled. A special case of RBNS reserves are case reserves,
which are the individual reserves set by the claim handlers in
the claims handling process.
• Unearned premium reserves (UPR). Because the insurance premiums
are paid up-front, the company will, at any given accounting date,
need to hold a reserve representing the liability that a part of the
paid premium should be paid back to the policyholder in the event
that insurance policies were to be cancelled at that date. Unearned
premium reserves are pure accounting reserves, calculated on a pro
rata basis.
• Unexpired risk reserves (URR). While the policyholder only in special
cases has the option to cancel a policy before the agreed insurance
term has expired, he certainly always has the option to continue the
policy for the rest of the term. The insurance company, therefore,
runs the risk that the unearned premium will prove insufficient to
cover the corresponding unexpired risk, and hence the unexpired risk
reserve is set up to cover the probable losses resulting from insufficient
written but yet unearned premiums.
• CBNI reserves. Essentially the same as unearned premium reserves,
but to take into account possible seasonal variations in the risk pat-
tern, they are not necessarily calculated pro rata, so that they also
incorporate the function of the unexpired risk reserves. Their pur-
pose is to provide for Covered But Not Incurred (CBNI) claims.
• The sum of the CBNI and IBNS reserves is sometimes called the
Covered But Not Settled (CBNS) reserve.
• Fluctuation reserves (equalization reserves) do not represent a future
obligation, but are used as a buffer capital to safeguard against ran-
dom fluctuations in future business results. The use of fluctuation
reserves varies from country to country.
The loss reserves considered here only refer to the claims that result from
already occurred events; the so-called IBNS reserves. Notice that often the
terminology is not used uniformly: the abbreviation IBNR is used when
speaking of loss reserving problems as a whole.
4.3 Model set-up: regression models
The problem of estimating IBNR claims consists in predicting, for each
accident year, the ultimate amount of claims incurred. The amount paid
by the insurance company for those claims, when it comes due, is then sub-
tracted, leaving the reserve the insurer should hold for future payments. To
calculate the reserve, all methods or models usually assume that the pat-
tern of cumulative or incremental claims incurred or paid is stable across
the development years, for each accident year. Since for the last accident
year, only one amount will be available, the reserve will be highly sensitive
to this amount. Moreover, because of growth experienced by the company,
it will be larger than any other amount in the data set, hence the im-
portance of verifying that the development pattern of the claims has not
changed over the years. One of the earliest methods, and now the most
commonly used in the actuarial profession, is the chain-ladder method.
Assuming that for each accident year, the development pattern remains
stable, development factors are calculated by dividing cumulative paid or
incurred claims after j periods by the cumulative amount after j − 1 pe-
riods. The year-to-year development factors are then applied to the most
recent amount for each accident year, i.e. the amounts on the right-most
diagonal. Many variations have been presented for the basic chain-ladder
method just introduced; a linear trend or an exponential growth may be
assumed to be present among the development factors. Instead of taking
their weighted average, they could be extrapolated into the future. The
chain-ladder method can also be adjusted for inflation. However, the chain-
ladder method suffers from the following deficiencies:
1. It explicitly assumes too many parameters (one for each column).
2. It does not give any idea of the variability of the reserve estimate, or
a confidence interval for the reserve.
3. It is negatively biased, which could lead to serious underreserving, a
threat to the insurer’s solvency.
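The basic chain-ladder calculation described above can be sketched in a few lines. This is a minimal illustration, not a production implementation: it assumes a complete triangle of cumulative claims stored as ragged rows, uses volume-weighted development factors (ratio of column sums), and works on a hypothetical three-year triangle.

```python
def chain_ladder(cumulative):
    """Chain-ladder factors and reserves.

    cumulative[i][j] holds the cumulative claims of accident year i after
    j + 1 development periods (0-based); row i contains t - i observed values.
    """
    t = len(cumulative)
    factors = []
    for j in range(1, t):
        # Only accident years observed in both development periods contribute.
        rows = [row for row in cumulative if len(row) > j]
        factors.append(sum(row[j] for row in rows) / sum(row[j - 1] for row in rows))
    reserves = []
    for row in cumulative:
        ultimate = row[-1]  # most recent amount, on the last diagonal
        for f in factors[len(row) - 1:]:
            ultimate *= f
        reserves.append(ultimate - row[-1])
    return factors, reserves

# Hypothetical cumulative triangle (not taken from the text).
triangle = [
    [100.0, 150.0, 165.0],
    [110.0, 168.0],
    [120.0],
]
factors, reserves = chain_ladder(triangle)
print(factors)
print(reserves)
```

The output is a single point estimate per accident year, which illustrates the second deficiency in the list: nothing in the calculation quantifies the variability of the reserve.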
Therefore stochastic models have been developed which make it possible to
calculate an amount such that there is a high probability that the reserve
will be sufficient to cover the liabilities generated by the current block of
business.
In claims reserving, we are interested in the aggregated value
\[ \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} Y_{ij}. \]
In this section we give an overview of the different regression models used
in claims reserving.
We use the following notation throughout this section:
$\vec{Y} = (Y_{11}, \ldots, Y_{1t}, Y_{21}, \ldots, Y_{t1})$ is the vector of claims,
$\vec{\beta} = (\beta_1, \ldots, \beta_p)$ are model parameters, $U$ is the regression matrix
corresponding to the upper triangle, of dimension $\frac{t(t+1)}{2} \times p$, and $R$ is the
regression matrix corresponding to the complete square, of dimension $t^2 \times p$.
4.3.1 Lognormal linear models
We consider the following loglinear regression model in matrix notation
\[ \vec{Z} = \ln \vec{Y} = R\vec{\beta} + \vec{\varepsilon}, \qquad \vec{\varepsilon} \sim N(0, \sigma^2 I), \tag{4.1} \]
where $\vec{\varepsilon}$ is the vector of independent normal random errors with mean 0
and variance $\sigma^2$.
So, the normal responses $Z_{ij}$ are assumed to decompose (additively)
into a deterministic component with mean $(R\vec{\beta})_{ij}$ and a
homoscedastic normally distributed random error component with zero mean.
The parameters are estimated by the maximum likelihood method,
which in the case of the normal error structure is equivalent to minimizing
the residual sum of squares. The unknown variance $\sigma^2$ is estimated by the
residual sum of squares divided by the degrees of freedom (the number of
observations minus the number of regression parameters estimated):
\[ \tilde{\sigma}^2 = \frac{1}{n-p} (\vec{Z} - U\hat{\vec{\beta}})'(\vec{Z} - U\hat{\vec{\beta}}). \tag{4.2} \]
This is an unbiased estimator of $\sigma^2$. The maximum likelihood estimator
of $\sigma^2$ is given by
\[ \hat{\sigma}^2 = \frac{1}{n} (\vec{Z} - U\hat{\vec{\beta}})'(\vec{Z} - U\hat{\vec{\beta}}), \tag{4.3} \]
while the maximum likelihood estimator of $\vec{\beta}$ is
\[ \hat{\vec{\beta}} = (U'U)^{-1}U'\vec{Z}. \tag{4.4} \]
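The estimators (4.2)-(4.4) can be computed with elementary linear algebra. The sketch below uses only the standard library and solves the normal equations by Gauss-Jordan elimination; the function names, the two-column design matrix and the four log-observations are hypothetical and serve only to illustrate the formulas.

```python
def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(m[k][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for k in range(n):
            if k != col:
                f = m[k][col] / m[col][col]
                m[k] = [x - f * y for x, y in zip(m[k], m[col])]
    return [m[k][n] / m[k][k] for k in range(n)]

def fit_lognormal_linear(U, z):
    """Return (beta_hat, unbiased variance estimate, ML variance estimate)."""
    n, p = len(U), len(U[0])
    utu = [[sum(U[k][a] * U[k][b] for k in range(n)) for b in range(p)]
           for a in range(p)]
    utz = [sum(U[k][a] * z[k] for k in range(n)) for a in range(p)]
    beta_hat = solve(utu, utz)  # normal equations (U'U) beta = U'z
    residuals = [z[k] - sum(U[k][a] * beta_hat[a] for a in range(p))
                 for k in range(n)]
    rss = sum(e * e for e in residuals)
    return beta_hat, rss / (n - p), rss / n

# Hypothetical design: intercept plus one covariate; z plays the role of ln Y.
U = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
z = [1.1, 2.9, 5.2, 6.8]
beta_hat, sigma2_unbiased, sigma2_mle = fit_lognormal_linear(U, z)
print(beta_hat, sigma2_unbiased, sigma2_mle)
```

The two variance estimates differ only by the factor (n - p)/n, so the maximum likelihood estimate is always the smaller of the two.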
Now we can forecast the total IBNR reserve with
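\[ \text{IBNR reserve} = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} e^{(R\hat{\vec{\beta}})_{ij} + \varepsilon_{ij}}. \tag{4.5} \]
This definition of the IBNR reserve can, among others, be found in Doray
(1996). Here $(R\hat{\vec{\beta}})_{ij}$ and $\varepsilon_{ij}$ are independent.
We have that
\[ \varepsilon_{ij} \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2), \tag{4.6} \]
\[ (R\hat{\vec{\beta}})_{ij} \sim N\left((R\vec{\beta})_{ij},\; \sigma^2 \left(R(U'U)^{-1}R'\right)_{ij}\right). \tag{4.7} \]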
Starting from model (4.1), we summarize now some properties of the IBNR
reserve (4.5), which can be found in Doray (1996).
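1. The mean of the IBNR reserve equals
\[ W = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} e^{(R\vec{\beta})_{ij} + \frac{1}{2}\sigma^2\left(1 + (R(U'U)^{-1}R')_{ij}\right)}. \tag{4.8} \]
2. The unique UMVUE of the mean of the IBNR reserve is given by
\[ W_U = {}_0F_1\left(\frac{n-p}{2}; \frac{SS_z}{4}\right) \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} e^{(R\hat{\vec{\beta}})_{ij}}, \tag{4.9} \]
where ${}_0F_1(\alpha; z)$ denotes the hypergeometric function.
3. The MLE of the mean of the IBNR reserve:
\[ W_M = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} e^{(R\hat{\vec{\beta}})_{ij} + \frac{1}{2}\hat{\sigma}^2\left(1 + (R(U'U)^{-1}R')_{ij}\right)}. \tag{4.10} \]
Verrall (1991) has considered an estimator similar to $W_M$, but with $\hat{\sigma}^2$
replaced with $\tilde{\sigma}^2$:
\[ W_V = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} e^{(R\hat{\vec{\beta}})_{ij} + \frac{1}{2}\tilde{\sigma}^2\left(1 + (R(U'U)^{-1}R')_{ij}\right)}. \tag{4.11} \]
The simple estimator
\[ W_D = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} e^{(R\hat{\vec{\beta}})_{ij} + \frac{1}{2}\tilde{\sigma}^2}, \tag{4.12} \]
was considered in Doray (1996).
Also, we have the order relation
\[ W_U < W_D < W_V, \tag{4.13} \]
which implies that
\[ W = E[W_U] < E[W_D] < E[W_V]. \tag{4.14} \]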
Hence both the estimators WD and WV exhibit a positive bias.
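The ordering between the simple estimator and Verrall's estimator can be checked numerically. The sketch below is only an illustration under stated assumptions: both estimators are evaluated with the same variance estimate, `eta_hat` holds hypothetical fitted linear predictors for the cells of the lower triangle, and `leverage` holds hypothetical values of the corresponding diagonal elements of R(U'U)^{-1}R', which are positive.

```python
from math import exp

def w_d(eta_hat, sigma2):
    """Simple estimator: fitted cell means with a common variance correction."""
    return sum(exp(e + 0.5 * sigma2) for e in eta_hat)

def w_v(eta_hat, leverage, sigma2):
    """Verrall-type estimator: the correction also includes the leverage term."""
    return sum(exp(e + 0.5 * sigma2 * (1.0 + h))
               for e, h in zip(eta_hat, leverage))

# Hypothetical fitted values for the six future cells of a 4 x 4 triangle.
eta_hat = [6.1, 5.8, 5.2, 5.9, 5.3, 4.7]
leverage = [0.45, 0.50, 0.70, 0.55, 0.75, 1.10]
sigma2 = 0.04

print(w_d(eta_hat, sigma2), w_v(eta_hat, leverage, sigma2))
```

Since every leverage value is positive, each term of the second sum exceeds the corresponding term of the first, which is the second inequality in the order relation.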
This Lognormal Linear (LL) model with normal random error is a special
case of the class of loglinear location-scale models. Other choices possible
for the distribution of the random error are the extreme value distribution,
leading to the Weibull-extreme value regression model, the generalized
loggamma, the logistic, and the log inverse Gaussian distribution. In what
follows we shortly recall this class of regression models.
4.3.2 Loglinear location-scale models
For a general introduction to survival analysis we refer to Kalbfleisch &
Prentice (1980), Lawless (1982), Cohen & Whitten (1985), among others.
In this section we recall the structure of this model and the main charac-
teristics of the distributions for the error component.
A location-scale model has a cumulative distribution function of the
form
\[ F_X(x) = G\left(\frac{x - \mu}{\sigma}\right), \tag{4.15} \]
where $\mu$ is the location parameter, $\sigma$ is the scale parameter, and $G$ is the
standardized form ($\mu = 0$, $\sigma = 1$) of the cumulative distribution function.
The parameter vector is $\vec{\theta} = (\mu, \sigma)$.
We consider the following Loglinear Location-Scale (LLS) regression
model in matrix notation
\[ \vec{Z} = \ln \vec{Y} = R\vec{\beta} + \sigma\vec{\varepsilon}, \tag{4.16} \]
where $(R\vec{\beta})_{ij}$ is the linear predictor or location parameter for $Z_{ij}$, $\sigma$ is the
scale parameter and $\vec{\varepsilon}$ is a random error with known density $f_{\vec{\varepsilon}}(\cdot)$.
It should also be noticed that in general the scale parameter estimator
is not independent of the location parameter estimator, as is the case in
normal regression.
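It is clear that the random variable $Z_{ij}$ has the following density
\[ \frac{1}{\sigma} f_{\vec{\varepsilon}}\left(\frac{z_{ij} - (R\vec{\beta})_{ij}}{\sigma}\right), \]
with $-\infty < z_{ij} < \infty$. This model can only be applied if all data points are
strictly positive, since the logarithm of the claims is modelled. The parameters
are estimated by maximum likelihood.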
Doray (1994) showed that the maximum likelihood estimators of the
regression and scale parameters exist and are unique when the error $\vec{\varepsilon}$ in the
loglinear location-scale regression model has a log-concave density. This is
the case for the five distributions we consider in Table 4.1. Note that the
exponential distribution is a special case of the Weibull distribution when
the shape parameter is equal to 1. The generalized gamma distribution is a
flexible family of distributions containing as special cases the exponential,
the Weibull and the gamma distribution.
The IBNR reserve under this class of regression models is given by
\[ \text{IBNR reserve} = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} e^{(R\hat{\vec{\beta}})_{ij} + \sigma\varepsilon_{ij}}. \]
Table 4.2 displays the mean, cumulative distribution function and
inverse distribution function of $X_{ij} = e^{(R\hat{\vec{\beta}})_{ij} + \sigma\varepsilon_{ij}}$ for the different regression
models in the LLS family.
Notice that the definition of the IBNR reserve here differs from
definition (4.3.2) under the lognormal linear model. We use here $e^{(R\hat{\vec{\beta}})_{ij} + \sigma\varepsilon_{ij}}$
instead of $e^{(R\hat{\vec{\beta}})_{ij} + \hat{\sigma}\varepsilon_{ij}}$, where $\hat{\vec{\beta}}$ and $\hat{\sigma}$ represent the MLE's of $\vec{\beta}$ and $\sigma$
respectively. Also this definition of the IBNR reserve can, among others,
be found in Doray (1996). This approach partly uses the information
contained in the upper triangle (through $\hat{\vec{\beta}}$), and acknowledges the underlying
stochastic structure (through $\varepsilon_{ij}$).
Regression model | Error distribution | Density ($-\infty < x < \infty$)
Lognormal linear | $\varepsilon_{ij} \sim$ i.i.d. $N(0, 1)$ | $\frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}x^2}$
Weibull-extreme value | $\varepsilon_{ij} \sim$ Gumbel | $e^{x - e^x}$
Logistic | $\varepsilon_{ij} \sim$ standard logistic | $\frac{e^x}{(1+e^x)^2}$
Generalized loggamma | $\varepsilon_{ij} \sim$ loggamma | $\frac{k^{k-\frac{1}{2}}}{\Gamma(k)} e^{\sqrt{k}x - ke^{x/\sqrt{k}}}$, with $0 < k < +\infty$
Log inverse Gaussian | $\varepsilon_{ij} \sim$ log inverse Gaussian | $(2\pi\lambda)^{-\frac{1}{2}} e^{-\frac{x}{2}} e^{\frac{1}{\lambda}} e^{-\frac{1}{\lambda}\cosh(x)}$, with $\lambda > 0$
Table 4.1: Characteristics of the random error $\varepsilon_{ij}$ in the regression models
of the LLS family.
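Regression model | $E[X_{ij}]$ | $F_{X_{ij}}(x_{ij})$ | $F^{-1}_{X_{ij}}(p)$
Lognormal linear | $e^{(R\hat{\vec{\beta}})_{ij} + \frac{\sigma^2}{2}}$ | $\Phi\left(\frac{\ln(x_{ij}) - (R\hat{\vec{\beta}})_{ij}}{\sigma}\right)$ | $e^{(R\hat{\vec{\beta}})_{ij} + \sigma\Phi^{-1}(p)}$
Weibull-extreme value | $e^{(R\hat{\vec{\beta}})_{ij}}\, \Gamma(1 + \sigma)$ | $1 - \exp\left[-\left(\frac{x_{ij}}{e^{(R\hat{\vec{\beta}})_{ij}}}\right)^{\frac{1}{\sigma}}\right]$ | $e^{(R\hat{\vec{\beta}})_{ij}} (-\ln(1-p))^{\sigma}$
Logistic | $e^{(R\hat{\vec{\beta}})_{ij}}\, \Gamma(1+\sigma)\Gamma(1-\sigma)$ for $\sigma < 1$; $\left(e^{(R\hat{\vec{\beta}})_{ij}}\right)^{2-\frac{1}{\sigma}} (1 - 2\sigma)\pi\, \mathrm{cosec}(2\pi\sigma)$ for $1 < \sigma < 2$ | $1 - \left(1 + \left(x_{ij} e^{-(R\hat{\vec{\beta}})_{ij}}\right)^{\frac{1}{\sigma}}\right)^{-1}$ | $e^{(R\hat{\vec{\beta}})_{ij}} \left(\frac{p}{1-p}\right)^{\sigma}$
Generalized loggamma | $\frac{k^{-\sigma\sqrt{k}}\, e^{(R\hat{\vec{\beta}})_{ij}}}{\Gamma(k)}\, \Gamma(k + \sigma\sqrt{k})$ | $I\left(k, \left(\frac{x_{ij}}{k^{-\sigma\sqrt{k}}\, e^{(R\hat{\vec{\beta}})_{ij}}}\right)^{\frac{1}{\sigma\sqrt{k}}}\right)$ | /
Log inverse Gaussian | $e^{(R\hat{\vec{\beta}})_{ij}}$ | $\Phi\left[\sqrt{\frac{e^{(R\hat{\vec{\beta}})_{ij}}}{\lambda x_{ij}}} + \sqrt{\frac{x_{ij}}{\lambda e^{(R\hat{\vec{\beta}})_{ij}}}}\right] - e^{\frac{2}{\lambda}}\, \Phi\left[\sqrt{\frac{e^{(R\hat{\vec{\beta}})_{ij}}}{\lambda x_{ij}}} - \sqrt{\frac{x_{ij}}{\lambda e^{(R\hat{\vec{\beta}})_{ij}}}}\right]$ | /
Table 4.2: Characteristics of $X_{ij} = e^{(R\hat{\vec{\beta}})_{ij} + \sigma\varepsilon_{ij}}$ in the regression models of the LLS family.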
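The closed-form distribution and quantile functions listed for the lognormal, Weibull-extreme value and logistic cases are straightforward to evaluate. The sketch below is an illustration only: `eta` stands for the value of the linear predictor of one cell, the function names are hypothetical, and the round trip at the end checks that each quantile function inverts the matching distribution function.

```python
from math import exp, log
from statistics import NormalDist

STD_NORMAL = NormalDist()

def lognormal_cdf(x, eta, sigma):
    return STD_NORMAL.cdf((log(x) - eta) / sigma)

def lognormal_quantile(p, eta, sigma):
    return exp(eta + sigma * STD_NORMAL.inv_cdf(p))

def weibull_cdf(x, eta, sigma):
    return 1.0 - exp(-((x / exp(eta)) ** (1.0 / sigma)))

def weibull_quantile(p, eta, sigma):
    return exp(eta) * (-log(1.0 - p)) ** sigma

def logistic_cdf(x, eta, sigma):
    return 1.0 - 1.0 / (1.0 + (x * exp(-eta)) ** (1.0 / sigma))

def logistic_quantile(p, eta, sigma):
    return exp(eta) * (p / (1.0 - p)) ** sigma

MODELS = {
    "lognormal": (lognormal_cdf, lognormal_quantile),
    "weibull": (weibull_cdf, weibull_quantile),
    "logistic": (logistic_cdf, logistic_quantile),
}

eta, sigma = 1.0, 0.5  # hypothetical linear predictor and scale
for name, (cdf, quantile) in MODELS.items():
    x = quantile(0.9, eta, sigma)
    print(name, x, cdf(x, eta, sigma))
```

Having the quantile function in closed form is what makes these three models convenient for the comonotonic approximations, where quantiles of sums are obtained by adding marginal quantiles.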
4.3.3 Generalized linear models
For a general introduction to Generalized Linear Models (GLIMs) we refer
to McCullagh & Nelder (1992). This family encompasses normal error
linear regression models and the nonlinear exponential, logistic and Poisson
regression models, as well as many other models, such as loglinear models
for categorical data. In this subsection we recall the structure of GLIMs
in the framework of claims reserving.
The first component of a GLIM, the random component, assumes that
the response variables Yij are independent and that the density function
of Yij belongs to the exponential family with densities of the form
f(yij ; θij , φ) = exp {[yijθij − b(θij)] /a(φ) + c(yij , φ)} , (4.17)
where a(·), b(·) and c(·, ·) are given functions. The function a(φ) often has
the form a(φ) = φ, where φ is called the dispersion parameter.
When φ is a known constant, (4.17) simplifies to the natural exponen-
tial family
f(yij ; θij) = a(θij)b(yij)exp {yijQ(θij)} . (4.18)
We identify Q(θ) with θ/a(φ), a(θ) with exp{−b(θ)/a(φ)}, and b(y) with
exp{c(y, φ)}. The more general formula (4.17) is useful for two-parameter
families, such as the normal or gamma, in which φ is a nuisance parameter.
Denoting the mean of $Y_{ij}$ by $\mu_{ij}$, it is known that
$$\mu_{ij} = E[Y_{ij}] = b'(\theta_{ij}) \quad \text{and} \quad \mathrm{Var}[Y_{ij}] = b''(\theta_{ij})\, a(\phi), \tag{4.19}$$
where the primes denote derivatives with respect to $\theta$. The variance can
be expressed as a function of the mean by
$$\mathrm{Var}[Y_{ij}] = a(\phi) V(\mu_{ij}) = \phi V(\mu_{ij}),$$
where $V(\cdot)$ is called the variance function. The variance function $V$ captures
the relationship, if any, between the mean and variance of $Y_{ij}$.
The possible distributions to work with in claims reserving include for
instance the normal, Poisson, gamma and inverse Gaussian distributions.
Table 4.3 shows some of their characteristics. For a given distribution,
link functions other than the canonical link function can also be used. For
example, the log-link is often used with the gamma distribution.
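The systematic component of a GLIM is based on a linear predictor
$$\eta_{ij} = (R\vec{\beta})_{ij} = \beta_1 R_{ij,1} + \cdots + \beta_p R_{ij,p}, \qquad i, j = 1, \ldots, t. \tag{4.20}$$

Distribution             Density                                                                                          $\phi$        Canonical link $\theta(\mu)$    $\mu(\theta) = b'(\theta)$    $V(\mu) = b''(\theta)$
$N(\mu, \sigma^2)$       $\frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(y-\mu)^2}{2\sigma^2}\right)$                    $\sigma^2$    $\mu$                           $\theta$                      $1$
Poisson$(\mu)$           $e^{-\mu} \frac{\mu^y}{y!}$                                                                     $1$           $\log(\mu)$                     $e^{\theta}$                  $\mu$
Gamma$(\mu, \nu)$        $\frac{1}{\Gamma(\nu)} \left(\frac{\nu y}{\mu}\right)^{\nu} \exp\left(-\frac{\nu y}{\mu}\right) \frac{1}{y}$    $1/\nu$       $1/\mu$                         $-1/\theta$                   $\mu^2$
IG$(\mu, \sigma^2)$      $\frac{y^{-3/2}}{\sqrt{2\pi\sigma^2}} \exp\left(\frac{-(y-\mu)^2}{2 y \sigma^2 \mu^2}\right)$    $\sigma^2$    $1/\mu^2$                       $(-2\theta)^{-1/2}$           $\mu^3$

Table 4.3: Characteristics of some frequently used distributions in loss
reserving.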
Various choices are possible for this linear predictor. In Subsection 4.3.4
we give a short overview of frequently used parametric structures in claims
reserving applications.
The link function, the third component of a GLIM, connects the expectation
$\mu_{ij}$ of $Y_{ij}$ to the linear predictor by
$$\eta_{ij} = g(\mu_{ij}), \tag{4.21}$$
where $g$ is a monotone, differentiable function. Thus, a GLIM links the
expected value of the response to the explanatory variables through the
equation
$$g(\mu_{ij}) = (R\vec{\beta})_{ij}, \qquad i, j = 1, \ldots, t. \tag{4.22}$$
For the canonical link $g$, for which $g(\mu_{ij}) = \theta_{ij}$ in (4.17), there is a direct
relationship between the natural parameter and the linear predictor. Since
$\mu_{ij} = b'(\theta_{ij})$, the canonical link is the inverse function of $b'$.
Generalized linear models may have nonconstant variances $\sigma^2_{ij}$ for the
responses $Y_{ij}$. Then the variance $\sigma^2_{ij}$ can be taken as a function of the
predictor variables through the mean response $\mu_{ij}$, or the variance can
be modelled using a parameterized structure (see Renshaw (1994)). Any
regression model that belongs to the family of generalized linear models
can be analyzed in a unified fashion. The maximum likelihood estimates of
the regression parameters can be obtained by iteratively reweighted least
squares (naturally extending ordinary least squares for normal error linear
regression models).
Supposing that the claim amounts follow a lognormal distribution,
taking the logarithm of all Yij ’s implies that they have a normal distri-
bution. So, the link function is given by ηij = µij and the scale parameter
is the variance of the normal distribution, i.e. φ = σ2. We remark that
each incremental claim must be greater than zero, and predictions from
this model can yield unusable results.
The predicted value under a generalized linear model will be given by
$$\text{IBNR reserve} = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} \hat{\mu}_{ij}, \tag{4.23}$$
with $\hat{\mu}_{ij} = g^{-1}\left((R\hat{\vec{\beta}})_{ij}\right)$ for a given link function $g$.
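The double sum in (4.23) runs over the unobserved lower-right part of the run-off square. A minimal sketch of that summation is given below; the function name `ibnr_reserve` and the 3 x 3 matrix of fitted means are purely illustrative assumptions, not values from this chapter.

```python
def ibnr_reserve(mu_hat):
    """Sum the fitted means over the future cells i = 2..t, j = t+2-i..t (1-based)."""
    t = len(mu_hat)
    total = 0.0
    for i in range(2, t + 1):
        for j in range(t + 2 - i, t + 1):
            total += mu_hat[i - 1][j - 1]  # shift to 0-based list indices
    return total


# Hypothetical fitted means for a t = 3 run-off square (rows: origin years).
mu_hat = [
    [100.0, 50.0, 10.0],
    [110.0, 55.0, 11.0],
    [120.0, 60.0, 12.0],
]
reserve = ibnr_reserve(mu_hat)  # cells (2,3), (3,2) and (3,3)
```

Only the cells below the last observed diagonal contribute, so the first row never enters the reserve.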
We end this section with some extra comments concerning GLIMs.
The need for more general GLIM models for modelling claims reserves be-
comes clear in the column of variance functions in Table 4.3. If the variance
of the claims is proportional to the square of the mean, the gamma family
of distributions can accommodate this characteristic. The Poisson and in-
verse Gaussian provide alternative variance functions. However, it may be
that the relationship between the mean and the variance falls somewhere
between the inverse Gaussian and the gamma models. Quasi-likelihood is
designed to handle this broader class of mean-variance relationships. This
is a very simple and robust alternative, introduced in Wedderburn (1974),
which uses only the most elementary information about the response vari-
able, namely the mean-variance relationship. This information alone is
often sufficient to stay close to the full efficiency of maximum likelihood
estimators. Suppose that we know that the response is always positive,
the data are invariably skew to the right, and the variance increases with
the mean. This does not enable us to specify a particular distribution
(for example it does not discriminate between Poisson or negative bino-
mial errors), hence one cannot use techniques like maximum likelihood or
likelihood ratio tests. However, quasi-likelihood estimation allows one to
model the response variable in a regression context without specifying its
distribution. We need only to specify the link and variance functions to
estimate regression coefficients. Although the link and variance functions
determine a theoretical likelihood, the likelihood itself is not specified so
fewer assumptions are required for estimation and inference. This is analo-
gous to the connection between normal-theory regression models and least-
squares estimates. Least-squares estimation provides identical parameter
estimates to those obtained from normal-theory models, but least-squares
estimation assumes far less. Only second moment assumptions are made by
least-squares compared to full distribution assumptions of normal-theory
models. For quasi-likelihood, specification of a variance function deter-
mines a corresponding quasi-likelihood element for each observation:
$$Q(\mu_{ij}; y_{ij}) = \int_{y_{ij}}^{\mu_{ij}} \frac{y_{ij} - t}{\phi V(t)}\, dt, \tag{4.24}$$
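where $Q(\mu_{ij}; y_{ij})$ satisfies a number of properties in common with the
log-likelihood. Specifically, if $K = k(\mu_{ij}; Y_{ij}) = (Y_{ij} - \mu_{ij})/(\phi V(\mu_{ij}))$, then
$$E(K) = 0, \qquad \mathrm{Var}(K) = \frac{1}{\phi V(\mu_{ij})}, \qquad -E\left(\frac{\partial K}{\partial \mu_{ij}}\right) = \frac{1}{\phi V(\mu_{ij})}. \tag{4.25}$$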
According to McCullagh & Nelder (1992), since most first-order asymp-
totic theory regarding likelihood functions is based on the three proper-
ties (4.25), we can expect Q(µij ; yij) to behave like a log-likelihood under
certain broad conditions. Summing (4.24) over all yij-values yields the
quasi-likelihood for the complete data. The quasi-deviance D(yij ;µij) is
similarly defined to be the sum over all yij-values of
$$-2\phi\, Q(\mu_{ij}; y_{ij}) = 2 \int_{\mu_{ij}}^{y_{ij}} \frac{y_{ij} - t}{V(t)}\, dt. \tag{4.26}$$
Parameter estimation proceeds by maximizing the quasi-likelihood. Since
the quasi-likelihood behaves like an ordinary likelihood, it inherits all the
large sample properties of likelihoods: approximate unbiasedness and nor-
mality of the parameter estimates. For example, through the use of the
quasi-likelihood
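$$Q(\mu_{ij}; y_{ij}) = \int_{y_{ij}}^{\mu_{ij}} \frac{y_{ij} - t}{\phi\, t^{2.5}}\, dt = \frac{1}{\phi \mu_{ij}^{2.5}} \left( \frac{\mu_{ij} y_{ij}}{(-1.5)} - \frac{\mu_{ij}^{2}}{(-0.5)} \right) \tag{4.27}$$
(where the last expression omits an additive term that depends on $y_{ij}$ only
and therefore does not affect the estimation of $\mu_{ij}$)
we could model a variance function between those of the gamma and inverse
Gaussian families: $V(\mu_{ij}) = \mu_{ij}^{2.5}$.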
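As a numerical check of the quasi-likelihood with $V(\mu) = \mu^{2.5}$, the sketch below integrates (4.24) with Simpson's rule and compares it with the closed-form expression in (4.27). Since the closed form drops a term that depends on the observation only, the comparison is made on differences between two values of $\mu$. The function names and the numbers used are illustrative assumptions.

```python
def quasi_likelihood(y, mu, phi=1.0, power=2.5, steps=2000):
    """Simpson's rule for the integral of (y - t) / (phi * t**power) from y to mu."""
    if steps % 2:
        steps += 1
    h = (mu - y) / steps  # negative when mu < y, which handles the orientation

    def integrand(t):
        return (y - t) / (phi * t ** power)

    total = integrand(y) + integrand(mu)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(y + k * h)
    return total * h / 3


def closed_form(y, mu, phi=1.0):
    """Right-hand side of (4.27); it omits a term that depends on y only."""
    return (mu * y / -1.5 - mu ** 2 / -0.5) / (phi * mu ** 2.5)


y = 2.0
numeric_diff = quasi_likelihood(y, 1.5) - quasi_likelihood(y, 3.0)
closed_diff = closed_form(y, 1.5) - closed_form(y, 3.0)
```

The quasi-likelihood is zero at $\mu = y$ and negative elsewhere, so maximizing it pulls the fitted mean towards the observation, just as a log-likelihood would.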
When using the canonical link function, the quasi-likelihood equations
are given by
$$\sum_{j=1}^{t+1-i} \hat{\mu}_{ij} = \sum_{j=1}^{t+1-i} Y_{ij}, \quad 1 \le i \le t; \qquad \sum_{i=1}^{t+1-j} \hat{\mu}_{ij} = \sum_{i=1}^{t+1-j} Y_{ij}, \quad 1 \le j \le t. \tag{4.28}$$
As can easily be seen from these equations in case of the Poisson model
with logarithmic link function, it is necessary to impose the constraint
that the sum of the incremental claims in every row and column has to
be non-negative. For example, this assumption makes the model unsuit-
able for incurred triangles, which may contain many negatives in the later
development periods due to overestimates of case reserves in the earlier
development periods.
We recall that the only distributional assumptions used in GLIMs are
the functional relationship between variance and mean and the fact that
the distribution belongs to the exponential family. When we consider the
Poisson case, this relationship can be expressed as
Var[Yij ] = E[Yij ]. (4.29)
One can allow for more or less dispersion in the data by generalizing (4.29)
to Var[Yij ]=φE[Yij ] (φ ∈ (0,∞)) without any change in the form and
solution of the likelihood equations. For example, it is well known that an
over-dispersed Poisson model with the chain-ladder type linear predictor
gives the same predictions as those obtained by the deterministic chain-
ladder method (see Renshaw & Verrall, 1994).
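The link between the marginal-sum equations (4.28) and the chain-ladder method can be checked numerically. The sketch below computes the deterministic chain-ladder development factors for a small, made-up triangle of incremental claims, back-fits the observed cells from the latest diagonal, and compares the fitted row and column totals with the observed ones. The triangle and the function name `chain_ladder_fitted` are illustrative assumptions.

```python
def chain_ladder_fitted(incr):
    """incr[i] holds the t - i observed incremental claims of origin year i (0-based)."""
    t = len(incr)
    cum = []
    for row in incr:
        running, c = 0.0, []
        for x in row:
            running += x
            c.append(running)
        cum.append(c)
    # Development factor j uses the rows in which columns j and j + 1 are both observed.
    f = []
    for j in range(t - 1):
        rows = range(t - 1 - j)
        f.append(sum(cum[i][j + 1] for i in rows) / sum(cum[i][j] for i in rows))
    fitted = []
    for i in range(t):
        last = t - 1 - i  # latest observed development year of row i
        c = [0.0] * (last + 1)
        c[last] = cum[i][last]
        for j in range(last - 1, -1, -1):
            c[j] = c[j + 1] / f[j]  # back-fit cumulative claims from the diagonal
        fitted.append([c[0]] + [c[j] - c[j - 1] for j in range(1, last + 1)])
    return f, fitted


triangle = [[100.0, 60.0, 10.0], [120.0, 50.0], [150.0]]
factors, fitted = chain_ladder_fitted(triangle)
col_fit = [sum(fitted[i][j] for i in range(3 - j)) for j in range(3)]
col_obs = [sum(triangle[i][j] for i in range(3 - j)) for j in range(3)]
```

In this example the fitted and observed column totals agree, and the row totals agree by construction, which is what (4.28) requires of the Poisson fit with chain-ladder type predictor.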
Modelling the incremental claim amounts as independent gamma re-
sponse variables, with a logarithmic link function and the chain-ladder
type linear predictor produces exactly the same results as obtained by
Mack (1991). The relationship between this generalized linear model and
the model proposed by Mack was first pointed out by Renshaw & Verrall
(1994). The mean-variance relationship for the gamma model is given by
$$\mathrm{Var}[Y_{ij}] = \phi\, (E[Y_{ij}])^2. \tag{4.30}$$
Using this model gives predictions close to those from the deterministic
chain-ladder technique, but not exactly the same. Notice that we need to
impose that each incremental value should be positive (non-negative) if
we work with gamma (Poisson) models. This restriction can be overcome
using a quasi-likelihood approach.
As in normal regression, the search for a suitable model may encompass
a wide range of possibilities. The Bayesian information criterion (BIC)
and the Akaike Information Criterion (AIC) are model selection devices
that emphasize parsimony by penalizing models for having large numbers
of parameters. Tests for model development to determine whether some
predictor variables may be dropped from the model can be conducted
using partial deviances. Two measures for the goodness-of-fit of a given
generalized linear model are the scaled deviance and Pearson’s chi-square
statistic.
In cases where the dispersion parameter is not known, an estimate can
be used to obtain an approximation to the scaled deviance and Pearson’s
chi-square statistic. One strategy is to fit a model that contains a sufficient
number of parameters so that all systematic variation is removed, estimate
φ from this model, and then use this estimate in computing the scaled
deviance of sub-models. The deviance or Pearson’s chi-square divided by
its degrees of freedom is sometimes used as an estimate of the dispersion
parameter φ.
4.3.4 Linear predictors and the discounted IBNR reserve
Various choices are possible for the linear predictor in claims reserving
applications. We give here a short overview of frequently used parametric
structures.
A well-known and widely used predictor is the chain-ladder type
ηij = αi + βj , (4.31)
(αi is the parameter for each year of origin i and βj for each development
year j). It should be noted that this representation implies the same
development pattern for all years of origin, where that pattern is defined
by the parameters βj . Notice that a parameter, for example β1, must be set
equal to zero, in order to have a non-singular regression matrix. Another
natural and frequently used restriction on the parameters is to impose that
β1 + · · ·+ βt = 1, since this allows the βj to be interpreted as the fraction
of claims settled in development year j.
The separation predictor takes into account the calendar years and
replaces in (4.31) αi with γk (k = i + j − 1). It combines the effects of
monetary inflation and changing jurisprudence.
For a general model with parameters in the three directions, we refer to
De Vylder & Goovaerts (1979). We give here some frequently used special
cases:
• The probabilistic trend family (PTF) of models as suggested in Barnett
& Zehnwirth (1998)
$$\eta_{ij} = \alpha_i + \sum_{k=1}^{j-1} \beta_k + \sum_{t=1}^{i+j-2} \gamma_t, \tag{4.32}$$
where γ denotes the calendar year effect; it combines the effects of
monetary inflation and changing jurisprudence.
• The Hoerl curve as in Zehnwirth (1985)
$$\eta_{ij} = \alpha_i + \beta_i \log(j) + \gamma_i j \qquad (j > 0). \tag{4.33}$$
This model has the advantage that one can predict payments by
extrapolation for j > t, because development year j is considered
as a continuous covariate. This is useful in estimating tail factors.
Wright (1990) extends this Hoerl curve further to model possible
claim inflation.
• A mixture of models (4.31) and (4.33) as in England & Verrall (2001)
$$\eta_{ij} = \begin{cases} \alpha_i + \beta_j & \text{if } j \le q; \\ \alpha_i + \beta_i \log(j) + \gamma_i j & \text{if } j > q \end{cases} \tag{4.34}$$
for some integer q specified by the modeller.
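The predictors (4.31), (4.33) and (4.34) are simple functions of the origin year $i$ and the development year $j$. A small sketch follows; the function names and the parameter values are hypothetical and serve only to show how the pieces combine.

```python
import math


def eta_chain_ladder(alpha, beta, i, j):
    """Chain-ladder type predictor (4.31); i and j are 1-based."""
    return alpha[i - 1] + beta[j - 1]


def eta_hoerl(alpha, beta, gamma, i, j):
    """Hoerl curve (4.33); j may exceed t because it enters as a continuous covariate."""
    return alpha[i - 1] + beta[i - 1] * math.log(j) + gamma[i - 1] * j


def eta_mixture(alpha, beta_cl, beta_h, gamma_h, i, j, q):
    """Mixture (4.34): chain-ladder up to development year q, Hoerl curve beyond."""
    if j <= q:
        return eta_chain_ladder(alpha, beta_cl, i, j)
    return eta_hoerl(alpha, beta_h, gamma_h, i, j)


# Hypothetical parameter values for t = 2 origin years.
alpha = [1.0, 2.0]
beta_cl = [0.0, -0.5]  # beta_1 = 0 keeps the regression matrix non-singular
beta_h = [0.3, 0.4]
gamma_h = [-0.1, -0.2]
tail_value = eta_mixture(alpha, beta_cl, beta_h, gamma_h, i=1, j=5, q=2)
```

Because `eta_hoerl` accepts any positive `j`, the same call extrapolates beyond the last observed development year, which is the property used for tail factors.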
In the case that the type of business allows for discounting we add a dis-
counting process. Of course, the level of the required reserve will strongly
depend on how we will invest this reserve. We define the discounted IBNR
reserve S under one of the discussed regression models as follows
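lognormal linear model:
$$S_{LL} = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} e^{(R\hat{\vec{\beta}})_{ij} + \varepsilon_{ij} - Y(i+j-t-1)},$$
loglinear location-scale model:
$$S_{LLS} = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} e^{(R\vec{\beta})_{ij} + \sigma\varepsilon_{ij} - Y(i+j-t-1)},$$
generalized linear model:
$$S_{GLIM} = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} g^{-1}\left((R\hat{\vec{\beta}})_{ij}\right) e^{-Y(i+j-t-1)},$$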
where the returns are modelled by means of a Brownian motion described
by the following equation
$$Y(i) = \left(\delta + \frac{\varsigma^2}{2}\right) i + \varsigma B(i), \tag{4.35}$$
where B(i) is the standard Brownian motion, ς is the volatility and δ is a
constant force of interest.
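With the drift $\delta + \varsigma^2/2$ in (4.35), the expected discount factor is $E[e^{-Y(i)}] = e^{-(\delta + \varsigma^2/2) i + \varsigma^2 i/2} = e^{-\delta i}$, so on average discounting takes place at the force of interest $\delta$. The Monte Carlo sketch below illustrates this; the parameter values, the seed and the function name are assumptions chosen for illustration.

```python
import math
import random


def simulate_discount_factors(delta, vol, horizon, rng):
    """One path of exp(-Y(i)), i = 1..horizon, with Y(i) = (delta + vol^2/2) i + vol B(i)."""
    b, factors = 0.0, []
    for i in range(1, horizon + 1):
        b += rng.gauss(0.0, 1.0)  # Brownian increment B(i) - B(i-1) ~ N(0, 1)
        factors.append(math.exp(-((delta + 0.5 * vol ** 2) * i + vol * b)))
    return factors


rng = random.Random(2005)
paths = [simulate_discount_factors(0.05, 0.1, 3, rng) for _ in range(20000)]
mc_mean = sum(path[2] for path in paths) / len(paths)  # estimate of E[exp(-Y(3))]
exact = math.exp(-0.05 * 3)
```

The simulated mean should lie close to `exact`; the remaining difference is Monte Carlo noise.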
4.4 Convex bounds for the discounted IBNR reserve
Before we can apply the results of Chapter 2 in order to derive the comonotonic
approximations for $S$, we have to specify further the distribution of
$\hat{\vec{\mu}} = g^{-1}(R\hat{\vec{\beta}})$. This is done in what follows.
4.4.1 Asymptotic results in generalized linear models
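Let $\hat{\phi}$, $\hat{\vec{\beta}}$, $\hat{\vec{\eta}} = R\hat{\vec{\beta}}$ and $\hat{\vec{\mu}} = g^{-1}(\hat{\vec{\eta}})$ be the maximum likelihood estimates of
$\phi$, $\vec{\beta}$, $\vec{\eta}$ and $\vec{\mu}$ respectively. The estimation equation for $\hat{\vec{\beta}}$ is then given by
$$U'\hat{W}U\hat{\vec{\beta}} = U'\hat{W}\hat{\vec{y}}^{*}, \tag{4.36}$$
where $W = \mathrm{diag}\{w_{11}, \cdots, w_{t1}\}$, with $w_{ij} = \mathrm{Var}[Y_{ij}]^{-1}(d\mu_{ij}/d\eta_{ij})^2$, $\vec{y}^{*} =
(y^*_{11}, \cdots, y^*_{t1})'$, and $y^*_{ij} = \eta_{ij} + (y_{ij} - \mu_{ij})\, d\eta_{ij}/d\mu_{ij}$, where the $y_{ij}$ denote
the sample values. Note that $\hat{W}$ is $W$ evaluated at $\hat{\vec{\beta}}$. It is well-known
that for asymptotically normal statistics, many functions of such statistics
are also asymptotically normal. Because $R\hat{\vec{\beta}} = \left((R\hat{\vec{\beta}})_{11}, \cdots, (R\hat{\vec{\beta}})_{tt}\right)$ is
asymptotically multivariate normal with mean $R\vec{\beta} = \left((R\vec{\beta})_{11}, \cdots, (R\vec{\beta})_{tt}\right)$
and variance-covariance matrix $\Sigma(R\hat{\vec{\beta}}) = \Sigma^a = \{\sigma^a_{ij}\} = R(U'WU)^{-1}R'$
and $g^{-1}(\eta_{11}, \cdots, \eta_{tt})$ has a nonzero differential $\vec{\psi} = (\psi_{11}, \cdots, \psi_{tt})$ at $R\vec{\beta}$,
where $\psi_{ij} = d\mu_{ij}/d\eta_{ij}$, it follows from the delta method that
$$\left[\hat{\vec{\mu}} - \vec{\mu}\right] \xrightarrow{d} N\left(0, \Sigma(\hat{\vec{\mu}})\right), \tag{4.37}$$
where $\Sigma(\hat{\vec{\mu}}) = \vec{\psi}'\Sigma^a \vec{\psi}$. Hence, for large samples the distribution of $\hat{\vec{\mu}} =
g^{-1}(R\hat{\vec{\beta}})$ can be approximated by a normal distribution with mean $\vec{\mu}$ and
variance-covariance matrix $\Sigma(\hat{\vec{\mu}})$.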
Maximum likelihood estimates may be biased when the sample size or
the total Fisher information is small. The bias is usually ignored in prac-
tice, because it is negligible compared with the standard errors. In small
or moderate-sized samples, however, a bias correction can be necessary,
and it is helpful to have a rough estimate of its size.
In deriving the convex bounds, one needs the expected values. Since
there is no exact expression for the expectation of $\hat{\vec{\mu}}$, we approximate it
using a general formula for the first-order bias of the estimate of $\vec{\mu}$.
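Cordeiro & McCullagh (1991) derived the first order bias of $\hat{\vec{\beta}}$. In
matrix notation this bias reduces to the simple form
$$B(\hat{\vec{\beta}}) = -\frac{1}{2} \Sigma^b U' \Sigma^c_d F_d \mathbf{1}, \tag{4.38}$$
with $\Sigma^b = \Sigma(\hat{\vec{\beta}}) = \{\sigma^b_{ij}\} = (U'WU)^{-1}$, $\Sigma^c = \Sigma(U\hat{\vec{\beta}}) = \{\sigma^c_{ij}\} = U\Sigma^b U'$,
$\Sigma^a_d = \mathrm{diag}\{\sigma^a_{11}, \cdots, \sigma^a_{tt}\}$, $\Sigma^c_d = \mathrm{diag}\{\sigma^c_{11}, \cdots, \sigma^c_{t1}\}$, $\mathbf{1}$ is a $\frac{t(t+1)}{2} \times 1$ vector of
ones, and $F_d = \mathrm{diag}\{f_{11}, \cdots, f_{t1}\}$ with
$$f_{ij} = \mathrm{Var}[Y_{ij}]^{-1} \left(\frac{d\mu_{ij}}{d\eta_{ij}}\right) \left(\frac{d^2\mu_{ij}}{d\eta_{ij}^2}\right).$$
It follows that the $n^{-1}$ bias of $\hat{\vec{\eta}}$ also has a simple expression:
$$B(\hat{\vec{\eta}}) = -\frac{1}{2} R\Sigma^b U' \Sigma^c_d F_d \mathbf{1}. \tag{4.39}$$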
To evaluate the $n^{-1}$ biases of $\hat{\vec{\beta}}$ and $\hat{\vec{\eta}}$ we need only the variance and the link
functions with their first and second derivatives. In the right-hand sides of
equations (4.38) and (4.39), which are of order $n^{-1}$, consistent estimates
of the parameters $\vec{\mu}$ can be inserted to define the corrected maximum
likelihood estimates $\hat{\vec{\eta}}_c = \hat{\vec{\eta}} - \hat{B}(\hat{\vec{\eta}})$ and $\hat{\vec{\beta}}_c = \hat{\vec{\beta}} - \hat{B}(\hat{\vec{\beta}})$, which should
have smaller biases than the corresponding $\hat{\vec{\eta}}$ and $\hat{\vec{\beta}}$. From now on $\hat{B}(\cdot)$
means the value of $B(\cdot)$ at the point $\hat{\vec{\mu}}$. Expressions (4.38) and (4.39)
are applicable even if the link is not the same for each observation. For
the linear model with any distribution in the exponential family $B(\hat{\vec{\beta}})$ and
$B(\hat{\vec{\eta}})$ are zero, because the identity link has $d^2\mu_{ij}/d\eta_{ij}^2 = 0$ and hence $F_d = 0$.
This is to be expected for the normal linear model or for the
inverse Gaussian non-intercept linear regression model. However it is not
obvious that this happens for any distribution in the exponential family
(4.17) with identity link since $\hat{\vec{\beta}}$ is obtained, apart from these cases, from
the non-linear equation (4.36), because of the dependence of $\hat{\vec{\beta}}$
on $\hat{W}$ and $\hat{\vec{y}}^{*}$.
We now give the $n^{-1}$ bias of $\hat{\vec{\mu}}$. Because $\hat{\mu}_{ij} = g^{-1}(\hat{\eta}_{ij}) =
g^{-1}((R\hat{\vec{\beta}})_{ij})$ and the link function is monotone and twice differentiable, we
can apply a Taylor series expansion of $\hat{\mu}_{ij}$ around $\eta_{ij}$:
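$$\hat{\mu}_{ij} \cong \mu_{ij} + \frac{d\mu_{ij}}{d\eta_{ij}} (\hat{\eta}_{ij} - \eta_{ij}) + \frac{1}{2} \frac{d^2\mu_{ij}}{d\eta_{ij}^2} (\hat{\eta}_{ij} - \eta_{ij})^2,$$
$$\hat{\mu}_{ij} - \mu_{ij} \cong \frac{d\mu_{ij}}{d\eta_{ij}} (\hat{\eta}_{ij} - \eta_{ij}) + \frac{1}{2} \frac{d^2\mu_{ij}}{d\eta_{ij}^2} (\hat{\eta}_{ij} - \eta_{ij})^2,$$
$$E[\hat{\mu}_{ij} - \mu_{ij}] \cong \frac{d\mu_{ij}}{d\eta_{ij}} E[\hat{\eta}_{ij} - \eta_{ij}] + \frac{1}{2} \frac{d^2\mu_{ij}}{d\eta_{ij}^2} \mathrm{Var}[\hat{\eta}_{ij}].$$
In matrix notation
$$E[\hat{\vec{\mu}} - \vec{\mu}] \cong G_1 E[\hat{\vec{\eta}} - \vec{\eta}] + \frac{1}{2} G_2 [\mathrm{Var}(\hat{\vec{\eta}})]
\cong -\frac{1}{2} G_1 R\Sigma^b U' \Sigma^c_d F_d \mathbf{1} + \frac{1}{2} G_2 \Sigma^a_d \mathbf{1}
= \frac{1}{2} \left\{ G_2 \Sigma^a_d \mathbf{1} - G_1 R\Sigma^b U' \Sigma^c_d F_d \mathbf{1} \right\}.$$
So, the first order bias of $\hat{\vec{\mu}}$ in matrix notation is given by the following
equation:
$$B(\hat{\vec{\mu}}) = \frac{1}{2} \left\{ G_2 \Sigma^a_d \mathbf{1} - G_1 R\Sigma^b U' \Sigma^c_d F_d \mathbf{1} \right\}, \tag{4.40}$$
where $\mathbf{1}$ is a $t^2 \times 1$ vector of ones and $G_1 = \mathrm{diag}\{\psi_{11}, \cdots, \psi_{tt}\}$, $G_2 =
\mathrm{diag}\{\varphi_{11}, \cdots, \varphi_{tt}\}$, where $\psi_{ij} = \frac{d\mu_{ij}}{d\eta_{ij}}$ and $\varphi_{ij} = \frac{d^2\mu_{ij}}{d\eta_{ij}^2}$.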
So, we can define adjusted values as $\hat{\vec{\mu}}_c = \hat{\vec{\mu}} - \hat{B}(\hat{\vec{\mu}})$, which should have
smaller biases than the corresponding $\hat{\vec{\mu}}$. Note that $\hat{B}(\cdot)$ means here the
value of $B(\cdot)$ taken at $(\hat{\phi}, \hat{\vec{\mu}})$.
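To get a feeling for the size of the correction, consider a single cell under a log link, where $\mu = e^{\eta}$ and both derivatives $d\mu/d\eta$ and $d^2\mu/d\eta^2$ equal $\mu$. Elementwise, (4.40) then reads: bias of $\hat{\mu}$ is approximately $\mu B(\hat{\eta}) + \frac{1}{2}\mu \mathrm{Var}[\hat{\eta}]$. The sketch below compares this with the exact bias under the additional assumption, made only for this illustration, that $\hat{\eta}$ is exactly normal and unbiased; the numerical values are hypothetical.

```python
import math


def first_order_bias_mu(mu, var_eta, bias_eta=0.0):
    """Single element of (4.40) for a log link, where psi = phi = mu."""
    psi = mu  # d mu / d eta
    phi = mu  # d^2 mu / d eta^2
    return psi * bias_eta + 0.5 * phi * var_eta


mu, var_eta = 1000.0, 0.04
approx_bias = first_order_bias_mu(mu, var_eta)
# If eta_hat ~ N(eta, var_eta), then E[exp(eta_hat)] = mu * exp(var_eta / 2).
exact_bias = mu * (math.exp(var_eta / 2.0) - 1.0)
mu_corrected = mu - approx_bias
```

For a small variance of the linear predictor the first-order term captures almost all of the bias; the gap widens as the variance grows.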
4.4.2 Lower and upper bounds
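In this subsection we will derive the upper and lower bounds in convex order,
as described in Chapter 2, for the discounted IBNR reserve $S_{LL}$, $S_{LLS}$
and $S_{GLIM}$ under the different regression models.
Using the results of Chapter 2, we derive a convex lower and upper
bound for $S = \sum_i \sum_j X_{ij} Z_{ij}$ given by
$$\underbrace{\sum_i \sum_j E[X_{ij}]\, E[Z_{ij} | \Lambda]}_{S^l} \;\le_{cx}\; \underbrace{\sum_i \sum_j X_{ij} Z_{ij}}_{S} \;\le_{cx}\; \underbrace{\sum_i \sum_j F^{-1}_{X_{ij}}(U)\, F^{-1}_{Z_{ij}}(V)}_{S^c},$$
with
$$X_{ij} = \begin{cases} e^{\varepsilon_{ij}} & (S_{LL}); \\ e^{(R\vec{\beta})_{ij} + \sigma\varepsilon_{ij}} & (S_{LLS}); \\ \hat{\mu}_{ij} & (S_{GLIM}), \end{cases}
\qquad
Z_{ij} = \begin{cases} e^{(R\hat{\vec{\beta}})_{ij} - Y(i+j-t-1)} & (S_{LL}); \\ e^{-Y(i+j-t-1)} & (S_{LLS}); \\ e^{-Y(i+j-t-1)} & (S_{GLIM}). \end{cases}$$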
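We introduce the random variables $W_{ij}$ and $\widetilde{W}_{ij}$ defined by
$$W_{ij} = (R\hat{\vec{\beta}})_{ij} - Y(i+j-t-1) \quad \text{and} \quad \widetilde{W}_{ij} = -Y(i+j-t-1), \tag{4.41}$$
with
$$E[W_{ij}] = (R\vec{\beta})_{ij} - \left(\delta + \tfrac{1}{2}\varsigma^2\right)(i+j-t-1),$$
$$E[\widetilde{W}_{ij}] = -\left(\delta + \tfrac{1}{2}\varsigma^2\right)(i+j-t-1),$$
$$\mathrm{Var}[W_{ij}] = \sigma^2_{W_{ij}} = \sigma^2 \left(R(U'U)^{-1}R'\right)_{ij} + (i+j-t-1)\varsigma^2,$$
$$\mathrm{Var}[\widetilde{W}_{ij}] = \sigma^2_{\widetilde{W}_{ij}} = (i+j-t-1)\varsigma^2.$$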
The lower bound
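To compute the lower bound we consider the following conditioning normal
random variable of the form (2.53)
$$\Lambda = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} \nu_{ij}\, Y(i+j-t-1), \tag{4.42}$$
with
$$\nu_{ij} = \begin{cases} e^{(R\vec{\beta})_{ij}}\, e^{-(i+j-t-1)\delta} & (S^l_{LL}); \\ E\left[e^{(R\vec{\beta})_{ij} + \sigma\varepsilon_{ij}}\right] e^{-(i+j-t-1)\delta} & (S^l_{LLS}); \\ \left(\mu_{ij} + B(\hat{\vec{\mu}})_{ij}\right) e^{-(i+j-t-1)\delta} & (S^l_{GLIM}). \end{cases} \tag{4.43}$$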
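Notice that $(W_{ij}, \Lambda)$ has a bivariate normal distribution. Conditionally
given $\Lambda = \lambda$, $W_{ij}$ has a univariate normal distribution with mean and
variance given by
$$E[W_{ij} | \Lambda = \lambda] = E[W_{ij}] + \rho_{ij} \frac{\sigma_{W_{ij}}}{\sigma_{\Lambda}} (\lambda - E[\Lambda]) \tag{4.44}$$
and
$$\mathrm{Var}[W_{ij} | \Lambda = \lambda] = \sigma^2_{W_{ij}} \left(1 - \rho^2_{ij}\right), \tag{4.45}$$
where $\rho_{ij}$ denotes the correlation between $\Lambda$ and $W_{ij}$. The same is true for
$(\widetilde{W}_{ij}, \Lambda)$, where we denote the correlation between $\Lambda$ and $\widetilde{W}_{ij}$ by $\tilde{\rho}_{ij}$.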
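The lower bound can be written as
$$S^l_{LL} = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} E[X_{ij}]\, e^{E[W_{ij}] + \rho_{ij}\sigma_{W_{ij}}\Phi^{-1}(V) + \frac{1}{2}(1-\rho^2_{ij})\sigma^2_{W_{ij}}},$$
$$S^l_{LLS} = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} E[X_{ij}]\, e^{E[\widetilde{W}_{ij}] + \tilde{\rho}_{ij}\sigma_{\widetilde{W}_{ij}}\Phi^{-1}(V) + \frac{1}{2}(1-\tilde{\rho}^2_{ij})\sigma^2_{\widetilde{W}_{ij}}},$$
$$S^l_{GLIM} = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} E[X_{ij}]\, e^{E[\widetilde{W}_{ij}] + \tilde{\rho}_{ij}\sigma_{\widetilde{W}_{ij}}\Phi^{-1}(V) + \frac{1}{2}(1-\tilde{\rho}^2_{ij})\sigma^2_{\widetilde{W}_{ij}}},$$
with
$$E[X_{ij}] = \begin{cases} E\left[e^{\varepsilon_{ij}}\right] = e^{\frac{1}{2}\sigma^2} & (S_{LL}); \\ E\left[e^{(R\vec{\beta})_{ij} + \sigma\varepsilon_{ij}}\right] = \text{see Table 4.2} & (S_{LLS}); \\ E\left[g^{-1}\left((R\hat{\vec{\beta}})_{ij}\right)\right] = \mu_{ij} + B(\hat{\vec{\mu}})_{ij} & (S_{GLIM}). \end{cases}$$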
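The correlations $\rho_{ij}$ and $\tilde{\rho}_{ij}$ are given by
$$\rho_{ij} = \frac{\mathrm{Cov}[\Lambda, W_{ij}]}{\sigma_{\Lambda}\sigma_{W_{ij}}}, \qquad \tilde{\rho}_{ij} = \frac{\mathrm{Cov}[\Lambda, \widetilde{W}_{ij}]}{\sigma_{\Lambda}\sigma_{\widetilde{W}_{ij}}},$$
with
$$\mathrm{Cov}[\Lambda, W_{ij}] = \mathrm{Cov}[\Lambda, \widetilde{W}_{ij}] = -\varsigma^2 \sum_{k=2}^{t} \sum_{l=t+2-k}^{t} \nu_{kl} \min(i+j-t-1,\, k+l-t-1)$$
and
$$\mathrm{Var}[\Lambda] = \sigma^2_{\Lambda} = \varsigma^2 \sum_{r=2}^{t} \sum_{s=t+2-r}^{t} \sum_{v=2}^{t} \sum_{w=t+2-v}^{t} \nu_{rs}\nu_{vw} \min(r+s-t-1,\, v+w-t-1).$$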
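The covariance and variance formulas above rest only on the Brownian motion property that the covariance of $B(s)$ and $B(u)$ equals $\min(s, u)$. A minimal sketch of the computation of the correlations between $\Lambda$ and the pure discounting terms $-Y(i+j-t-1)$ follows; the weights $\nu_{ij}$ and the volatility are hypothetical inputs, and the function names are our own.

```python
import math


def future_cells(t):
    """Cells (i, j) below the last observed diagonal, 1-based."""
    return [(i, j) for i in range(2, t + 1) for j in range(t + 2 - i, t + 1)]


def lambda_moments(nu, vol, t):
    """Var[Lambda] and Cov[Lambda, -Y(i+j-t-1)] per cell, for nu = {(i, j): nu_ij}."""
    cells = future_cells(t)
    lag = {c: c[0] + c[1] - t - 1 for c in cells}  # years until payment
    var_lam = vol ** 2 * sum(
        nu[a] * nu[b] * min(lag[a], lag[b]) for a in cells for b in cells
    )
    cov = {
        c: -vol ** 2 * sum(nu[a] * min(lag[c], lag[a]) for a in cells)
        for c in cells
    }
    return var_lam, cov


def rho_tilde(nu, vol, t):
    """Correlation between Lambda and -Y(i+j-t-1) for every future cell."""
    var_lam, cov = lambda_moments(nu, vol, t)
    return {
        c: cov[c] / (math.sqrt(var_lam) * vol * math.sqrt(c[0] + c[1] - t - 1))
        for c in cov
    }


nu = {cell: 1.0 for cell in future_cells(3)}  # hypothetical equal weights
rho = rho_tilde(nu, vol=0.1, t=3)
```

With positive weights every correlation is negative, in line with the monotonicity argument used for the quantile function of the lower bound.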
By conditioning on one of the standard uniform random variables one can
compute the distribution function of the lower bound. See Subsection 2.5.3
for more details.
For the lognormal linear and loglinear location-scale models there exists
a closed-form expression for the quantile function of $S^l$.
Taking into account that $\Lambda = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} \nu_{ij} Y(i+j-t-1)$ is
normally distributed, we find that
$$F^{-1}_{\Lambda}(1-p) = E[\Lambda] - \sigma_{\Lambda}\Phi^{-1}(p),$$
and hence, for $p \in (0, 1)$,
$$F^{-1}_{S^l}(p) = F^{-1}_{\sum_{i=2}^{t}\sum_{j=t+2-i}^{t} E[X_{ij}]E[Z_{ij}|\Lambda]}(p)
= \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} F^{-1}_{E[X_{ij}]E[Z_{ij}|\Lambda]}(p)
= \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} E[X_{ij}]\, E\left[Z_{ij} \,|\, \Lambda = F^{-1}_{\Lambda}(1-p)\right].$$
In order to derive the above result, we used the fact that for a non-increasing
continuous function $g$, we have
$$F^{-1}_{g(X)}(p) = g\left(F^{-1}_{X}(1-p)\right), \qquad p \in (0, 1). \tag{4.46}$$
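Here, $g = E[Z_{ij}|\Lambda]$ is a non-increasing function of $\Lambda$ since $\rho_{ij}$ ($\tilde{\rho}_{ij}$) is always
negative. All terms of $S^l$ are therefore non-increasing functions of the same
random variable $\Lambda$, so that their quantiles add up. So, we have that
$$F^{-1}_{S^l}(p) = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} E[X_{ij}]\, e^{E[W_{ij}] - \rho_{ij}\sigma_{W_{ij}}\Phi^{-1}(p) + \frac{1}{2}(1-\rho^2_{ij})\sigma^2_{W_{ij}}}, \qquad \text{(LL)}$$
$$F^{-1}_{S^l}(p) = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} E[X_{ij}]\, e^{E[\widetilde{W}_{ij}] - \tilde{\rho}_{ij}\sigma_{\widetilde{W}_{ij}}\Phi^{-1}(p) + \frac{1}{2}(1-\tilde{\rho}^2_{ij})\sigma^2_{\widetilde{W}_{ij}}}, \qquad \text{(LLS)}$$
and $F_{S^l}(x)$ can be obtained from solving the equation
$$\sum_{i=2}^{t} \sum_{j=t+2-i}^{t} E[X_{ij}]\, e^{E[W_{ij}] - \rho_{ij}\sigma_{W_{ij}}\Phi^{-1}\left(F_{S^l_{LL}}(x)\right) + \frac{1}{2}(1-\rho^2_{ij})\sigma^2_{W_{ij}}} = x, \qquad \text{(LL)}$$
$$\sum_{i=2}^{t} \sum_{j=t+2-i}^{t} E[X_{ij}]\, e^{E[\widetilde{W}_{ij}] - \tilde{\rho}_{ij}\sigma_{\widetilde{W}_{ij}}\Phi^{-1}\left(F_{S^l_{LLS}}(x)\right) + \frac{1}{2}(1-\tilde{\rho}^2_{ij})\sigma^2_{\widetilde{W}_{ij}}} = x. \qquad \text{(LLS)}$$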
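Since the quantile function of the lower bound is available in closed form and is increasing in $p$, the distribution function can be obtained by a one-dimensional root search. The following sketch uses bisection; the input tuples (expected value of $X_{ij}$, mean and standard deviation of the discounting exponent, correlation with $\Lambda$) are hypothetical numbers and the function names are our own.

```python
import math
from statistics import NormalDist

_PHI_INV = NormalDist().inv_cdf


def lower_bound_quantile(p, terms):
    """Quantile of S^l; terms holds one (EX, EW, rho, sigma_W) tuple per future cell."""
    z = _PHI_INV(p)
    return sum(
        ex * math.exp(ew - rho * sw * z + 0.5 * (1.0 - rho ** 2) * sw ** 2)
        for ex, ew, rho, sw in terms
    )


def lower_bound_cdf(x, terms, tol=1e-10):
    """Solve lower_bound_quantile(p) = x for p by bisection (all rho must be negative)."""
    lo, hi = 1e-12, 1.0 - 1e-12
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if lower_bound_quantile(mid, terms) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical inputs for three future cells.
terms = [
    (100.0, -0.055, -0.95, 0.10),
    (60.0, -0.055, -0.95, 0.10),
    (12.0, -0.110, -0.89, 0.14),
]
q95 = lower_bound_quantile(0.95, terms)
p_back = lower_bound_cdf(q95, terms)
```

Any other bracketing root finder would do equally well; bisection is used here only because it needs nothing beyond monotonicity.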
The upper bound
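The upper bound can be written as
$$S^c_{LL} = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} F^{-1}_{X_{ij}}(U)\, e^{E[W_{ij}] + \sigma_{W_{ij}}\Phi^{-1}(V)},$$
$$S^c_{LLS} = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} F^{-1}_{X_{ij}}(U)\, e^{E[\widetilde{W}_{ij}] + \sigma_{\widetilde{W}_{ij}}\Phi^{-1}(V)},$$
$$S^c_{GLIM} = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} F^{-1}_{X_{ij}}(U)\, e^{E[\widetilde{W}_{ij}] + \sigma_{\widetilde{W}_{ij}}\Phi^{-1}(V)},$$
with
$$F^{-1}_{X_{ij}}(U) = \begin{cases} F^{-1}_{e^{\varepsilon_{ij}}}(U) = e^{\sigma\Phi^{-1}(U)} & (S_{LL}); \\ F^{-1}_{e^{(R\vec{\beta})_{ij} + \sigma\varepsilon_{ij}}}(U) = \text{see Table 4.2} & (S_{LLS}); \\ F^{-1}_{g^{-1}\left((R\hat{\vec{\beta}})_{ij}\right)}(U) = \mu_{ij} + B(\hat{\vec{\mu}})_{ij} + \sqrt{\Sigma(\hat{\vec{\mu}})_{ij}}\, \Phi^{-1}(U) & (S_{GLIM}). \end{cases}$$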
The cdf of the upper bound can be computed as described in Subsection
2.5.3. Using Remark 4 one can calculate the distribution function of $S^c_{LL}$
and $S^c_{LLS}$ more efficiently. We start with the cdf of $S^c_{LL}$.
From previous results
$$F_{S^c_{LL}}(y) = \int_0^1 F_N\left(\ln(y) - \ln\left(F^{-1}_{S^{c'}_{LL}}(u)\right)\right) du,$$
with $F_N(x)$ the cdf of $N(0, \sigma^2)$ and
$$S^{c'}_{LL} = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} \exp\left(F^{-1}_{(R\hat{\vec{\beta}})_{ij} - Y(i+j-t-1)}(U)\right)
= \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} e^{(R\vec{\beta})_{ij} - (\delta + \frac{1}{2}\varsigma^2)(i+j-t-1)} \times e^{\sqrt{\sigma^2 (R(U'U)^{-1}R')_{ij} + \varsigma^2 (i+j-t-1)}\, \Phi^{-1}(U)}$$
and
$$F^{-1}_{S^{c'}_{LL}}(u) = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} e^{(R\vec{\beta})_{ij} - (\delta + \frac{1}{2}\varsigma^2)(i+j-t-1)} \times e^{\sqrt{\sigma^2 (R(U'U)^{-1}R')_{ij} + \varsigma^2 (i+j-t-1)}\, \Phi^{-1}(u)}.$$
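We can write the upper bound of $S_{LLS}$ as
$$S^c_{LLS} = G \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} e^{E[\widetilde{W}_{ij}] + \sigma_{\widetilde{W}_{ij}}\Phi^{-1}(V)}\, e^{(R\vec{\beta})_{ij}},$$
with
$$G = \begin{cases} e^{\sigma\Phi^{-1}(U)} & \text{(Lognormal linear)}; \\ (-\log(1-U))^{\sigma} & \text{(Weibull-extreme value)}; \\ \left(\frac{U}{1-U}\right)^{\sigma} & \text{(Logistic)}. \end{cases}$$
The distribution function of $G$ is given by
$$F_G(x) = \begin{cases} \Phi\left(\frac{\ln x}{\sigma}\right) & \text{(Lognormal linear)}; \\ 1 - e^{-x^{1/\sigma}} & \text{(Weibull-extreme value)}; \\ 1 - \left(1 + x^{1/\sigma}\right)^{-1} & \text{(Logistic)}. \end{cases}$$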
Using Remark 4 we can write the cdf of $S^c_{LLS}$ for the lognormal linear, the
Weibull-extreme value and the logistic regression model as follows
$$F_{S^c_{LLS}}(y) = \int_0^1 F_G\left(\frac{y}{F^{-1}_{S^{c'}_{LLS}}(u)}\right) du,$$
with
$$S^{c'}_{LLS} = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} \exp\left(F^{-1}_{(R\vec{\beta})_{ij} - Y(i+j-t-1)}(U)\right)
= \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} e^{(R\vec{\beta})_{ij} - (\delta + \frac{1}{2}\varsigma^2)(i+j-t-1) + \varsigma\sqrt{i+j-t-1}\, \Phi^{-1}(U)}$$
and
$$F^{-1}_{S^{c'}_{LLS}}(u) = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} e^{(R\vec{\beta})_{ij} - (\delta + \frac{1}{2}\varsigma^2)(i+j-t-1) + \varsigma\sqrt{i+j-t-1}\, \Phi^{-1}(u)}.$$
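The one-dimensional integral for the cdf of the upper bound is easily evaluated numerically. The sketch below does so with a midpoint rule for the lognormal linear case of $G$; the linear predictor values, the payment lags and the parameters are hypothetical, and the function names are our own. For zero volatility the comonotonic sum of discounted predictors is a constant $c$, so the integral must reduce to $F_G(y/c)$, which gives a convenient check.

```python
import math
from statistics import NormalDist

_N = NormalDist()


def comonotonic_discount_sum(u, cells, delta, vol):
    """Quantile of the comonotonic sum; cells holds (eta_ij, lag) with lag = i+j-t-1."""
    z = _N.inv_cdf(u)
    return sum(
        math.exp(eta - (delta + 0.5 * vol ** 2) * k + vol * math.sqrt(k) * z)
        for eta, k in cells
    )


def cdf_G_lognormal(x, sigma):
    """F_G for the lognormal linear model: Phi(ln(x) / sigma)."""
    return _N.cdf(math.log(x) / sigma) if x > 0 else 0.0


def upper_bound_cdf(y, cells, delta, vol, sigma, n=2000):
    """Midpoint rule for the integral over u in (0, 1) of F_G(y / quantile(u))."""
    total = 0.0
    for m in range(n):
        u = (m + 0.5) / n
        total += cdf_G_lognormal(y / comonotonic_discount_sum(u, cells, delta, vol), sigma)
    return total / n


cells = [(math.log(100.0), 1), (math.log(50.0), 2)]  # hypothetical (eta_ij, lag) pairs
p_low = upper_bound_cdf(120.0, cells, delta=0.03, vol=0.1, sigma=0.2)
p_high = upper_bound_cdf(180.0, cells, delta=0.03, vol=0.1, sigma=0.2)
```

The midpoint rule avoids evaluating the normal quantile at 0 and 1; a finer grid or a Gauss-type rule could be substituted if more accuracy in the tails is required.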
Remark 7. Since we have no equality of the first moments in the GLIM
framework, the convex order relationship between the two approximations
and S is not valid. This does not impose any restrictions on the use of
the approximations. In fact, we can say that the convex order only holds
asymptotically in this case.
Remark 8. The estimator $W_D$ (4.12), for the mean of the IBNR reserve,
constitutes a close upper bound for the UMVUE of the mean of the IBNR
reserve if $\frac{t(t+1)}{2} - p$ is large and the residual sum of squares is small. It
should be noted that $e^{(R\hat{\vec{\beta}})_{ij} + \hat{\sigma}^2/2}$ is the estimator of the mean of a lognormal
distribution $\log N((R\vec{\beta})_{ij}, \sigma^2)$ obtained by replacing the parameters
$\vec{\beta}$ and $\sigma^2$ by their unbiased estimates. Adding now a discount process to
$W_D$ gives
$$W_{DD} = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} e^{(R\hat{\vec{\beta}})_{ij} - Y(i+j-t-1) + \frac{1}{2}\hat{\sigma}^2}. \tag{4.47}$$
Now, we can apply the same methodology as explained before. The results
for the lognormal linear model are still applicable. The only difference is
that $\varepsilon_{ij}$ is replaced by $\frac{1}{2}\hat{\sigma}^2$, with
$$\frac{1}{2}\hat{\sigma}^2 \sim \mathrm{Gamma}\left(\frac{n-p}{2}, \frac{\sigma^2}{n-p}\right). \tag{4.48}$$
4.5 The bootstrap methodology in claims reserving
4.5.1 Introduction
The bootstrap technique as an inferential statistical computer intensive
device was introduced by Efron (1979) as a quite intuitive and simple way
of making approximations to distributions which are very hard or even
impossible to compute analytically. This technique has proved to be a
very useful tool in many statistical applications and can be particularly
interesting to assess the variability of the claim reserving predictions and
to construct upper limits at an adequate confidence level. Its popularity
is due to a combination of available computing power and theoretical de-
velopment. One advantage of the bootstrap technique is that it can be
applied to any data set without having to assume an underlying distribu-
tion. Moreover most computer packages can handle very large numbers of
repeated samplings.
Our goal is to obtain quantiles of the loss reserve for which the predic-
tive distribution is not known. If we do not know the distribution, then
our best guess at the distribution is provided by the data. The main idea
in bootstrapping is that we (a) pretend that the data constitute the popu-
lation and (b) take samples from this pretended population (which we call
“resamples”). Substituting the sample for the population means that we
are interested in the frequency with which the observed values occurred.
This is done by sampling with replacement. From the re-sample, we
calculate the statistic we are interested in. This is called a “bootstrap
statistic”. After storing this value, one repeats the above steps collect-
ing a large number (B) of bootstrap statistics. The general idea is that
the relationship of the bootstrap statistics to the observed statistic is the
same as the relationship of the observed statistic to the true value. Under
mild regularity conditions, the bootstrap yields an approximation to the
distribution of an estimator or test statistic that is at least as accurate
as the approximation obtained from first-order asymptotic theory. For an
introduction explaining the bootstrap technique, see Efron & Tibshirani
(1993).
4.5.2 Central idea
The concept of bootstrap relies on the consideration of the discrete empir-
ical distribution generated by a random sample of size n from an unknown
distribution F . This empirical distribution assigns equal probability to
each sample item. In the discussion which follows, we will write Fn for
that distribution. By generating an independent, identically distributed
random sequence (resample) from the distribution Fn or its appropriately
smoothed version, we can arrive at new estimates of various parameters
and nonparametric characteristics of the original distribution F .
As we have already mentioned, the central idea of bootstrap lies in
sampling the empirical cdf Fn. This idea is closely related to the following,
well-known statistical principle, henceforth referred to as the “plug-in”
principle. Given a parameter of interest $\theta(F)$ depending upon an unknown
population cdf $F$, we estimate this parameter by $\hat{\theta} = \theta(F_n)$. That is, we
simply replace $F$ in the formula for $\theta$ by its empirical counterpart $F_n$
obtained from the observed data. The plug-in principle will not provide
good results if $F_n$ poorly approximates $F$, or if there is information about
$F$ other than that provided by the sample. For instance, in some cases we
might know (or be willing to assume) that $F$ belongs to some parametric
family of distributions. However, the plug-in principle and the bootstrap
may be adapted to this latter situation as well. To illustrate the idea,
let us consider a parametric family of cdf's $\{F_{\mu}\}$ indexed by a parameter
$\mu$ (possibly a vector), and for some given $\mu_0$, let $\hat{\mu}_0$ denote its estimate
calculated from the sample. The plug-in principle in this case states that
we should estimate $\theta(F_{\mu_0})$ by $\theta(F_{\hat{\mu}_0})$. In this case, bootstrap is often called
parametric, since a resample is now collected from $F_{\hat{\mu}_0}$. Here, we refer to
any replica of $\hat{\theta}$ calculated from a resample as "a bootstrap estimate of
$\theta(F)$" and denote it by $\hat{\theta}^*$.
4.5.3 Bootstrap confidence intervals
Let us now turn to the problem of using the bootstrap methodology to
construct confidence intervals. This area has been a major focus of theo-
retical work on the bootstrap, and several different methods of approaching
the problem have been suggested. The “naive” procedure described below
is not the most efficient one and can be significantly improved in both
rate of convergence and accuracy. It is, however, intuitively obvious and
easy to justify, and seems to be working well enough for the cases considered
here. For a complete review of available approaches to bootstrap
confidence intervals, see Efron & Tibshirani (1993).
Let us consider $\hat{\theta}^*$, a bootstrap estimate of $\theta$ based on a resample of size $n$ from the
original sample $X_1, \ldots, X_n$, and let $G_*$ be its distribution function given the
observed sample values
$$G_*(x) = \Pr[\hat{\theta}^* \le x \,|\, X_1 = x_1, \ldots, X_n = x_n].$$
The bootstrap percentiles method gives $G_*^{-1}(\alpha)$ and $G_*^{-1}(1-\alpha)$ as, respectively,
lower and upper bounds for the $(1 - 2\alpha)$ confidence interval for $\theta$.
Let us note that for most statistics $\hat{\theta}$, the distribution function of the bootstrap
estimator $\hat{\theta}^*$ is not available. In practice, $G_*^{-1}(\alpha)$ and $G_*^{-1}(1-\alpha)$
are approximated by taking multiple resamples and then calculating the
empirical percentiles. In most cases $B \ge 1000$ is recommended.
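The percentile method just described can be sketched in a few lines. The code below resamples with replacement, recomputes the statistic $B$ times and reads off the empirical $\alpha$ and $1-\alpha$ percentiles; the data, the seed, the simple index rule for the percentiles and the function name are illustrative assumptions.

```python
import random
from statistics import mean


def bootstrap_percentile_interval(sample, statistic, alpha=0.05, B=1000, rng=None):
    """Bounds of the (1 - 2*alpha) percentile interval from B nonparametric resamples."""
    rng = rng or random.Random()
    n = len(sample)
    replicas = sorted(
        statistic([rng.choice(sample) for _ in range(n)])  # sampling with replacement
        for _ in range(B)
    )
    lower = replicas[int(alpha * B)]
    upper = replicas[min(B - 1, int((1.0 - alpha) * B))]
    return lower, upper


rng = random.Random(1)
data = list(range(1, 51))  # hypothetical observations
lower, upper = bootstrap_percentile_interval(data, mean, alpha=0.05, B=1000, rng=rng)
```

Passing the random generator explicitly keeps the resampling reproducible, which is convenient when results for different reserving models have to be compared.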
4.5.4 Bootstrap in claims reserving
As already mentioned above, with bootstrapping, we treat the obtained
data as if they are an accurate reflection of the parent population, and
then draw many bootstrapped samples by sampling, with replacement,
from a pseudo-population consisting of the obtained data. Technically,
this is called “non-parametric bootstrapping”, because we are sampling
from the actual data and we have made no assumptions about the distri-
bution of the parent population, other than that the raw data adequately
reflect the population’s shape. If we were willing to make more assump-
tions, such as an assumption that the parent population follows a normal
distribution, then we could do our sampling, with replacement, from a
normal distribution. This is called “parametric bootstrapping”.
For a description of the bootstrap methodology in claims reserving we refer
to England & Verrall (1999) and Pinheiro et al. (2003). In these papers the
bootstrap technique is used to obtain prediction errors for different claims
reserving methods, namely methods based on the chain-ladder technique
and on generalized linear models. Applications of the bootstrap technique
to claims reserving can also be found in Lowe (1994), in Taylor (2000) and
in England & Verrall (2002).
Starting from the original run-off triangle one can create a large number
of bootstrap run-off triangles by repeatedly resampling, with replacement,
from the appropriate residuals. For each bootstrap sample the regression
model is refitted and the bootstrap statistic is calculated.
In England & Verrall (1999) the bootstrap technique is used to compute
the bootstrap root mean squared error of prediction ($RMSEP_{bs}$), also
known as the bootstrap standard error of prediction. The authors define it
as the square root of the sum of the squares of the parameter variability
and the data variability. For the parameter variability they suggest a
correction to the bootstrap standard error that takes account of the
number of parameters used in fitting the model, so that the bootstrap
standard error can be compared with the analytic one. The bootstrap
standard error is the standard deviation of the bootstrap reserve
estimates. Parameter variability is thus defined as the bootstrap standard
error multiplied by $\sqrt{n/(n-p)}$ ($n$: sample size, $p$: number of
parameters). Data variability is the square root of the uniformly minimum
variance unbiased estimator of the variance of the IBNR reserve. This
estimator was already calculated by Doray (1996). Note that if the full
predictive distribution can be found, the RMSEP can be obtained directly
by calculating its standard deviation. Using a normal approximation, a
$100(1-\alpha)\%$ bootstrap prediction interval for the total reserve is
calculated as $[R \pm \Phi^{-1}(1-\alpha/2) \cdot RMSEP_{bs}(R)]$, with
$R$ the initial forecast of the IBNR reserve.
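The calculation described above can be sketched in a few lines of Python.
This is only an illustration under stated assumptions: the function name,
its arguments and all numerical inputs are hypothetical, the data
variability is taken as a given number, and the sample standard deviation
is used for the bootstrap standard error.

```python
import math
from statistics import NormalDist, stdev


def rmsep_interval(initial_reserve, bootstrap_reserves, data_variability,
                   n_obs, n_params, alpha=0.05):
    """Normal-approximation prediction interval based on the bootstrap RMSEP."""
    # Bootstrap standard error: standard deviation of the bootstrap reserves.
    boot_se = stdev(bootstrap_reserves)
    # Parameter variability: bootstrap standard error times sqrt(n / (n - p)).
    param_variability = boot_se * math.sqrt(n_obs / (n_obs - n_params))
    # RMSEP: square root of the sum of squared parameter and data variability.
    rmsep = math.sqrt(param_variability ** 2 + data_variability ** 2)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return rmsep, (initial_reserve - z * rmsep, initial_reserve + z * rmsep)


# Hypothetical inputs, for illustration only.
reserves = [98.0, 101.0, 103.0, 97.0, 105.0, 99.0]
rmsep, (lower, upper) = rmsep_interval(100.0, reserves, 4.0,
                                       n_obs=66, n_params=21)
```

In practice the list of bootstrap reserves would contain one refitted
reserve estimate per bootstrap triangle.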
The second approach is more robust against deviations from the
hypotheses of the model. For a detailed presentation of this method see
Davison & Hinkley (1997). A new bootstrap statistic is defined here as
a function of the bootstrap estimate and a bootstrap simulation of the
future reality. This statistic is called the prediction error. (This is very
confusing because in the literature the term prediction error is also used
for the RMSEP or the standard error of prediction.) For each bootstrap
loop the prediction error is then kept in a vector and the percentile method
is used to obtain the desired percentile of this prediction error (PPE). In
a last stage an upper limit of the prediction interval for the total reserve
is calculated as [R+ PPE].
The reader can find a complete list of the required steps for those two
procedures in the paper of Pinheiro et al. (2003). These authors have also
compared and discussed the two bootstrap procedures and the main con-
clusion is that the differences amongst the results obtained with the two
procedures, RMSEP and PPE, are not very important. The PPE procedure
generally generates smaller values. Further, the authors suggest
eliminating the residuals with value 0 and working with standardized
residuals, since only the latter can be considered identically distributed.
The third approach is explained in England & Verrall (2002). Like
in the previous methods, first of all a stochastic model is fitted to the
bootstrap sample and a run-off triangle is bootstrapped. For this pseudo
triangle the parameters are estimated in order to calculate future incre-
mental claim payments Y ∗ij . The second stage of the procedure replicates
the process variance. This is achieved by simulating an observed claim
payment for each future cell in the run-off triangle, using the bootstrap
value Y ∗ij as the mean, and using the process distribution assumed in the
underlying model. For each iteration the reserves are calculated by adding
up the simulated forecast payments. The set of reserves obtained in this
way forms the predictive distribution. The percentile method is then used
to obtain the required prediction interval.
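The second stage of this third approach can be sketched as follows. The
sketch assumes that the bootstrap means of the future cells are already
available (here they are hypothetical numbers) and it assumes a gamma
process distribution with mean $\mu$ and variance $\phi\mu$; the process
distribution of the actual underlying model would have to be substituted.

```python
import math
import random


def simulate_reserves(bootstrap_means, phi, rng):
    """Draw one payment per future cell for each iteration and add them up."""
    reserves = []
    for means in bootstrap_means:      # one list of future-cell means per iteration
        total = 0.0
        for mu in means:
            # Gamma draw with mean mu and variance phi * mu (assumed process law).
            total += rng.gammavariate(mu / phi, phi)
        reserves.append(total)
    return reserves


def percentile(values, q):
    """Empirical percentile by nearest rank, with q in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


# Hypothetical bootstrap means for two future cells, 2000 iterations.
rng = random.Random(1)
bootstrap_means = [[100 * rng.uniform(0.9, 1.1), 50 * rng.uniform(0.9, 1.1)]
                   for _ in range(2000)]
reserves = simulate_reserves(bootstrap_means, phi=2.0, rng=rng)
upper_95 = percentile(reserves, 95)    # percentile method on the predictive sample
```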
In a practical case study one can bootstrap a high percentile of the dis-
tribution of the lower bound in order to describe the estimation error
involved. Taylor & Ashe (1983) used the terminology estimation error for
$\mathrm{Var}[(R\vec{\beta})_{ij}]$ and statistical or random error for
$\mathrm{Var}[\varepsilon_{ij}]$. The estimation error arises from the
estimation of the vector $\vec{\beta}$ from the data, and the
statistical error stems from the stochastic nature of the regression model.
We bootstrap an upper triangle using the non-parametric procedure. This
involves resampling, with replacement, from the original residuals and then
creating a new triangle of past claim payments using the resampled resid-
uals together with the fitted values.
With regression type problems the resampling procedure is applied to the
residuals of the model. Residuals are approximately independent and iden-
tically distributed. In a statistical analysis they are commonly used in
order to explore the adequacy of the fit of the model, with respect to
the choice of the variance function, link function and terms in the linear
predictor. Residuals may also indicate the presence of anomalous values
requiring further investigation.
For generalized linear models an extended definition of residuals is re-
quired, applicable to all the distributions that may replace the normal
distribution. It is convenient if these residuals can be used for the same
purposes as standard normal residuals. Three well-known forms of
generalized residuals are the Pearson, Anscombe and deviance residuals.
Pearson residuals are easy to interpret: they are just the raw residuals
scaled by the
estimated standard deviation of the response variable. A disadvantage of
the Pearson residual is that the distribution of this residual form for non-
normal distributions is often markedly skewed, and so it may fail to have
properties similar to those of a normal theory residual. Anscombe and de-
viance residuals are more appropriate to check the approximate normality.
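As a small illustration of the Pearson residual, the following sketch
scales the raw residual by $\sqrt{\phi V(\mu)}$. The power variance
function $V(\mu) = \mu^{\kappa}$ and the numerical values are assumptions
made for the example only.

```python
import math


def pearson_residual(y, mu, phi=1.0, kappa=1.0):
    """Raw residual scaled by the estimated standard deviation sqrt(phi * V(mu))."""
    variance_function = mu ** kappa    # assumed power variance function V(mu)
    return (y - mu) / math.sqrt(phi * variance_function)


# Illustrative values: Poisson-type variance (kappa = 1) with phi = 1.
r = pearson_residual(y=110.0, mu=100.0)
```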
In general the lower bound $S^l$ turns out to perform very well. A final
method to obtain a confidence bound for the predictive distribution is a
combination of the power of this lower bound and bootstrapping. We will
bootstrap a high percentile of the distribution of the lower bound. This is
done as follows:
1. The preliminaries:
• Estimate the model parameters $\vec{\beta}$ and $\sigma^2$ (LL),
$\sigma^2$ (LLS) or $\phi$ (GLIM).
• Calculate the fitted values:
$$\mu_{ij} = \begin{cases} e^{(R\vec{\beta})_{ij}} & \text{(LL)} \\ \text{see Table 4.2} & \text{(LLS)} \\ g^{-1}\big((R\vec{\beta})_{ij}\big) & \text{(GLIM)} \end{cases}$$
$(i = 1, \ldots, t;\ j = 1, \ldots, t+1-i)$.
• Calculate the residuals:
$$r_{ij} = \begin{cases} z_{ij} - \ln \mu_{ij} & \text{(LL)} \\ z_{ij} - \ln \mu_{ij} & \text{(LLS)} \\ \dfrac{y_{ij} - \mu_{ij}}{\sqrt{\phi V(\mu_{ij})}} & \text{(GLIM)} \end{cases}$$
$(i = 1, \ldots, t;\ j = 1, \ldots, t+1-i)$.
2. Bootstrap loop (to be repeated B times):
• Generate a set of residuals $r^*_{ij}$ by sampling with replacement
from the original residuals $(r_{ij})$
$(i = 1, \ldots, t;\ j = 1, \ldots, t+1-i)$.
• Create a new upper triangle $y^*_{ij}$:
– non-parametric bootstrap (NPB)
$$y^*_{ij} = \begin{cases} e^{\ln(\mu_{ij}) + r^*_{ij}} & \text{(LL)} \\ e^{\ln(\mu_{ij}) + r^*_{ij}} & \text{(LLS)} \\ \sqrt{\phi V(\mu_{ij})}\, r^*_{ij} + \mu_{ij} & \text{(GLIM)} \end{cases}$$
$(i = 1, \ldots, t;\ j = 1, \ldots, t+1-i)$.
– parametric bootstrap (PB)
$$y^*_{ij} \begin{cases} = e^{(R\vec{\beta})_{ij} + \sigma N(0,1)} & \text{(LL)} \\ \text{see Table 4.2} & \text{(LLS)} \\ \approx \mu_{ij} + B(\vec{\mu})_{ij} + \sqrt{\Sigma(\vec{\mu})_{ij}}\, N(0,1) & \text{(GLIM)} \end{cases}$$
$(i = 1, \ldots, t;\ j = 1, \ldots, t+1-i)$.
Now we have bootstrapped a run-off triangle.
• Calculate for this bootstrapped triangle the parameters $\vec{\beta}^*$
and $(\sigma^2)^*$ (LL), $(\check{\sigma}^2)^*$ (LLS) or $\phi^*$ (GLIM).
• Calculate the percentile $k$ of the distribution of $S^l$, $S^{l*}(k)$,
using these parameters.
• Return to the beginning of step 2 until the B repetitions are
completed.
3. Analysis of the bootstrap data:
• Apply the percentile method to the bootstrap observations to
obtain the required prediction interval.
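The three steps can be sketched in Python for the non-parametric
bootstrap on the log scale. This is a deliberately simplified sketch, not
the procedure used for the results in this chapter: the regression fit is
replaced by a hypothetical one-parameter model (a common mean on the log
scale), and the percentile of $S^l$ is replaced by the quantile of a
lognormal distribution. Both stand-ins, and the data, are assumptions made
only to keep the example self-contained.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def fit(log_values):
    """Stand-in for the regression fit: common mean and st. dev. on the log scale."""
    return mean(log_values), stdev(log_values)


def statistic(mu, sigma, k=0.95):
    """Stand-in for the percentile k of S^l: the k-quantile of a lognormal law."""
    return math.exp(mu + sigma * NormalDist().inv_cdf(k))


def bootstrap_percentiles(y, n_boot, rng):
    z = [math.log(v) for v in y]
    mu, _ = fit(z)                              # step 1: estimate and fitted value
    residuals = [zi - mu for zi in z]           # step 1: residuals z - ln(mu)
    results = []
    for _ in range(n_boot):                     # step 2: bootstrap loop
        r_star = rng.choices(residuals, k=len(residuals))  # with replacement
        z_star = [mu + r for r in r_star]       # pseudo-data ln(mu) + r*, log scale
        mu_star, sigma_star = fit(z_star)       # refit on the pseudo-data
        results.append(statistic(mu_star, sigma_star))
    return sorted(results)                      # step 3: input for percentile method


# Hypothetical claim amounts.
y = [102.0, 95.0, 110.0, 99.0, 105.0, 91.0, 108.0, 100.0]
results = bootstrap_percentiles(y, 500, random.Random(42))
upper_bound = results[int(0.95 * len(results)) - 1]   # 95th bootstrap percentile
```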
4.6 Three applications
In this section we illustrate the effectiveness of the bounds derived for the
discounted IBNR reserve S, under the model studied. We investigate the
accuracy of the proposed bounds, by comparing their cumulative distri-
bution function to the empirical distribution obtained with Monte Carlo
simulation (MC), which serves as a close approximation to the exact dis-
tribution of S. The simulation results are based on generating 100 000
random paths. The estimates obtained from this time-consuming simula-
tion will serve as benchmark. The random paths are based on antithetic
variables in order to reduce the variance of the Monte Carlo estimates.
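The idea of antithetic variables can be illustrated on a single discount
factor. The sketch below estimates $E[e^{-Y}]$ for a normally distributed
$Y$ by averaging each simulated path with its mirrored counterpart. The
function name and the parameter values are illustrative assumptions; the
closed form is only used to check the estimate.

```python
import math
import random


def discount_factor_antithetic(mean, sd, n_pairs, rng):
    """Estimate E[exp(-Y)], Y ~ N(mean, sd^2), from antithetic pairs (Z, -Z)."""
    total = 0.0
    for _ in range(n_pairs):
        z = rng.gauss(0.0, 1.0)
        # Average the discount factor over the path and its mirror image.
        total += 0.5 * (math.exp(-(mean + sd * z)) + math.exp(-(mean - sd * z)))
    return total / n_pairs


rng = random.Random(2005)
estimate = discount_factor_antithetic(0.08, 0.11, 50_000, rng)
exact = math.exp(-0.08 + 0.5 * 0.11 ** 2)   # closed form for a normal Y
```

Because the two members of a pair are negatively correlated, the average
of a pair has a smaller variance than the average of two independent draws.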
In order to illustrate the power of the bounds, namely by inspecting the
deviation of the cdf's of the convex bounds $S^l$ and $S^c$ from the true
distribution of the total IBNR reserve $S$, we simulate a triangle from a
particular model. We created a non-cumulative run-off triangle based on the
chain-ladder predictor (4.31) with parameters given in Table 4.4. So, the
run-off triangle has only trends in the two main directions, namely in the
year of origin and in the development year. The parameter β1 is set equal
to zero in order to have a non-singular regression matrix.
α1 α2 α3 α4 α5 α6 α7 α8 α9 α10 α11
12.8 12.9 13.6 13.5 13.4 13.2 13.8 13.7 13.1 13.0 13.9
β1 β2 β3 β4 β5 β6 β7 β8 β9 β10 β11
0 0.31 −0.11 −0.42 −0.37 −0.87 −0.96 −1.33 −1.63 −1.92 −2.31
Table 4.4: Model parameters.
We also specify the multivariate distribution function of the random
vector $(Y_1, Y_2, \ldots, Y_{t-1})$. In particular, we will assume that
the random variables $Y_i$ are i.i.d. and
$N(\delta + \tfrac{1}{2}\varsigma^2, \varsigma^2)$ distributed with
$\delta = 0.08$ and $\varsigma = 0.11$. This enables us to simulate the
cdf's, for which no analytical expression is available.
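To get a feeling for the size of the cells implied by Table 4.4, the
following sketch evaluates $e^{\alpha_i + \beta_j}$ for every cell. It
assumes that the chain-ladder predictor has the additive form
$\alpha_i + \beta_j$ on the log scale; the simulated cells scatter around
these values.

```python
import math

# Parameters of Table 4.4.
alpha = [12.8, 12.9, 13.6, 13.5, 13.4, 13.2, 13.8, 13.7, 13.1, 13.0, 13.9]
beta = [0.0, 0.31, -0.11, -0.42, -0.37, -0.87, -0.96, -1.33, -1.63, -1.92, -2.31]
t = len(alpha)


def chain_ladder_value(i, j):
    """exp(alpha_i + beta_j) for year of origin i and development year j (1-based)."""
    return math.exp(alpha[i - 1] + beta[j - 1])


# Upper (observed) triangle: cells with j <= t + 1 - i.
upper = {(i, j): chain_ladder_value(i, j)
         for i in range(1, t + 1) for j in range(1, t + 2 - i)}
# Lower (future) triangle: cells with j >= t + 2 - i.
lower = {(i, j): chain_ladder_value(i, j)
         for i in range(2, t + 1) for j in range(t + 2 - i, t + 1)}
```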
4.6.1 Lognormal linear models
The simulated run-off triangle for this model is displayed in Table 4.5.
Fitting the lognormal linear model with a chain-ladder type predictor gives
the parameter estimates and standard errors shown in Table 4.6.
      1         2          3        4        5        6        7        8        9        10      11
1   363,346   492,947   322,511  236,555  249,319  151,228  138,373   95,703   71,742   53,788  35,997
2   397,798   543,864   358,855  263,325  276,817  167,045  153,095  106,272   78,515   58,790
3   806,154  1,096,841  727,977  530,683  557,870  336,716  310,022  213,706  157,504
4   727,102   995,988   654,059  476,665  502,405  303,132  278,280  192,436
5   659,846   900,386   591,633  433,425  457,482  276,056  253,301
6   541,187   736,205   487,730  353,255  373,921  226,091
7   979,636  1,342,832  882,924  651,920  682,307
8   890,641  1,219,406  798,007  582,415
9   486,340   666,405   442,457
10  445,174   604,206
11 1,084,253
Table 4.5: Simulated run-off triangle with non-cumulative claim figures for
the lognormal linear regression model.
Parameter Value Estimate Standard error
α1 12.8 12.7976 0.0018
α2 12.9 12.8968 0.0018
α3 13.6 13.5994 0.0018
α4 13.5 13.4957 0.0019
α5 13.4 13.3996 0.0019
α6 13.2 13.1997 0.0020
α7 13.8 13.7999 0.0021
α8 13.7 13.6983 0.0023
α9 13.1 13.0999 0.0025
α10 13.0 13.0035 0.0029
α11 13.9 13.8964 0.0039
β2 0.31 0.3109 0.0018
β3 −0.11 −0.1060 0.0018
β4 −0.42 −0.4198 0.0019
β5 −0.37 −0.3677 0.0020
β6 −0.87 −0.8717 0.0021
β7 −0.96 −0.9579 0.0022
β8 −1.33 −1.3267 0.0024
β9 −1.63 −1.6249 0.0027
β10 −1.92 −1.9100 0.0032
β11 −2.31 −2.3064 0.0043
σ 0.0004 0.0037
Table 4.6: Model specification, maximum likelihood estimates and stan-
dard errors for the run-off triangle in Table 4.5.
Figure 4.2 shows the cdf's of the upper and lower bounds, compared to
the empirical distribution based on 100 000 randomly generated, normally
distributed vectors $(Y_1, Y_2, \ldots, Y_{t-1})$ and $\vec{\varepsilon}$.
Since $S^l_{LL} \le_{cx} S_{LL} \le_{cx} S^c_{LL}$, the same ordering holds
for the tails of their respective distribution functions, which can be
observed to cross only once. We see that the cdf of $S^l_{LL}$ is very
close to the distribution of $S_{LL}$. The “real” standard deviation equals
1,617,912 whereas the standard deviation of the lower bound equals
1,590,233. A lower bound for the 95th percentile is given by 13,638,620.
The comonotonic upper bound $S^c_{LL}$ performs badly in this case. This
comes from the fact that in order to determine $S^l_{LL}$, we make use of
the (estimated values of the) correlations between the cells of the lower
triangle, whereas in the case of $S^c_{LL}$, the distribution is an upper
bound (in the sense of convex order) for any possible dependence structure
between the components of the vector $\vec{V}$. The standard deviation of
the upper bound is given by 1,890,298. The 95th percentile of the upper
bound now equals 14,207,619, which is of course much higher than the 95th
percentile of $S^l_{LL}$.
Table 4.7 summarizes the numerical values of the 95th percentiles of
the two bounds $S^l_{LL}$ and $S^c_{LL}$, together with their means and
standard deviations. This is also provided for the row totals
$$S_{LL,i} = \sum_{j=t+2-i}^{t} e^{(R\vec{\beta})_{ij} - Y(i+j-t-1) + \varepsilon_{ij}}, \quad i = 2, \ldots, t. \qquad (4.49)$$
We can conclude that the lower bound approximates the “real discounted
reserve” very well.
In order to have a better view on the behavior of the upper bound
$S^c_{LL}$ and of the lower bound $S^l_{LL}$ in the tails, we consider a
QQ-plot where the quantiles of $S^c_{LL}$ and of the lower bound $S^l_{LL}$
are plotted against the quantiles of $S_{LL}$. The upper bound $S^c_{LL}$
and the lower bound $S^l_{LL}$ will be a good approximation for $S_{LL}$ if
the plotted points $(F^{-1}_{S_{LL}}(p), F^{-1}_{S^c_{LL}}(p))$,
respectively $(F^{-1}_{S_{LL}}(p), F^{-1}_{S^l_{LL}}(p))$, for all values
of $p$ in $(0, 1)$ do not deviate too much from the line $y = x$. From the
QQ-plot in Figure 4.3, we can conclude that the upper bound (slightly)
overestimates the tails of $S$, whereas the accuracy of the lower bound is
extremely high for the chosen set of parameter values. Table 4.8 confirms
these observations.
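The size of these deviations can be quantified directly from the
quantiles reported in Table 4.8. The following sketch computes the
relative deviation of each bound from the Monte Carlo quantile; the helper
name is hypothetical, the figures are those of the table.

```python
# Quantiles from Table 4.8: (p, lower bound, Monte Carlo, comonotonic upper bound).
table_4_8 = [
    (0.95, 13_638_620, 13_718_215, 14_207_619),
    (0.975, 14_303_311, 14_411_869, 15_035_380),
    (0.99, 15_122_153, 15_166_753, 16_066_305),
    (0.995, 15_709_687, 15_710_588, 16_813_432),
    (0.999, 17_003_250, 17_003_255, 18_479_550),
]


def relative_deviation(approximation, reference):
    """Signed relative deviation of an approximate quantile from the reference."""
    return (approximation - reference) / reference


deviations = [(p, relative_deviation(low, mc), relative_deviation(up, mc))
              for p, low, mc, up in table_4_8]

for p, dev_lower, dev_upper in deviations:
    print(f"p = {p}: lower {dev_lower:+.2%}, upper {dev_upper:+.2%}")
```

For every probability level in the table the lower bound stays within one
percent of the simulated quantile, while the deviation of the comonotonic
upper bound grows with $p$.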
We remark that the improved upper bound $S^u_{LL}$ is very close to the
comonotonic upper bound $S^c_{LL}$. This could be expected because
$\rho_{ij}$ is close to $\rho_{kl}$ for any pair $(ij, kl)$ with $ij$ and
$kl$ sufficiently close. This implies that for any such pair $(ij, kl)$
$$\left(F^{-1}_{e^{(R\vec{\beta})_{ij} - Y(i+j-t-1)} \mid \Lambda}(U),\ F^{-1}_{e^{(R\vec{\beta})_{kl} - Y(k+l-t-1)} \mid \Lambda}(U)\right)$$
is close to
$$\left(F^{-1}_{e^{(R\vec{\beta})_{ij} - Y(i+j-t-1)}}(U),\ F^{-1}_{e^{(R\vec{\beta})_{kl} - Y(k+l-t-1)}}(U)\right).$$
Since the improved upper bound requires more computational time, the
results for the improved upper bound are not displayed in this thesis.
Figure 4.2: The cdf's of ‘$S_{LL}$’ (MC) (solid line), $S^l_{LL}$ (dotted
line) and $S^c_{LL}$ (dashed line) for the run-off triangle in Table 4.5.
(Horizontal axis: discounted IBNR reserve; vertical axis: cum. distr.)
Figure 4.3: QQ-plot of the quantiles of $S^l_{LL}$ (◦) and $S^c_{LL}$
versus those of ‘$S_{LL}$’ (MC).
           $S^l_{LL}$                         $S_{LL}$                           $S^c_{LL}$
year   95%         mean        st. dev.   95%         mean        st. dev.   95%         mean        st. dev.
2      41,913      36,694      3,043      43,742      36,690      4,072      43,796      36,694      4,096
3      210,781     178,522     18,580     215,958     178,510     21,334     218,463     178,522     22,805
4      339,371     280,596     33,568     344,231     280,570     36,069     350,678     280,596     39,738
5      487,782     396,861     51,644     492,575     396,817     53,804     503,873     396,861     60,357
6      609,034     491,311     66,663     614,094     491,252     68,525     630,052     491,311     77,971
7      1,515,794   1,206,735   174,414    1,526,990   1,206,571   177,891    1,570,251   1,206,735   203,422
8      1,976,955   1,574,772   226,804    1,986,766   1,574,556   230,635    2,053,898   1,574,772   267,668
9      1,392,268   1,095,585   166,894    1,403,295   1,095,420   169,890    1,449,017   1,095,585   196,744
10     1,641,355   1,287,052   199,051    1,657,107   1,286,851   203,005    1,713,161   1,287,052   236,658
11     5,423,367   4,267,416   649,616    5,473,462   4,266,762   662,975    5,674,518   4,267,416   781,003
total  13,638,620  10,815,543  1,590,233  13,718,215  10,814,002  1,617,912  14,207,619  10,815,543  1,890,298
Table 4.7: 95th percentiles, means and standard deviations of the
distributions of $S^l_{LL}$ and $S^c_{LL}$ vs. ‘$S_{LL}$’ (MC).
p      $S^l_{LL}$   $S_{LL}$     $S^c_{LL}$
0.95   13,638,620   13,718,215   14,207,619
0.975  14,303,311   14,411,869   15,035,380
0.99   15,122,153   15,166,753   16,066,305
0.995  15,709,687   15,710,588   16,813,432
0.999  17,003,250   17,003,255   18,479,550
Table 4.8: Approximations for some selected quantiles with probability
level p of $S_{LL}$.
                    Distribution of bootstrapped       Simulated distribution
                    95th percentiles of $S^l_{LL}$     of $F^{-1}_{S_{LL}}(0.95)$
1st percentile      13,587,825                         13,578,331
2.5th percentile    13,589,852                         13,579,131
5th percentile      13,597,445                         13,585,813
10th percentile     13,616,522                         13,598,723
25th percentile     13,627,692                         13,619,389
50th percentile     13,637,841                         13,634,543
75th percentile     13,647,654                         13,651,195
90th percentile     13,661,140                         13,669,104
95th percentile     13,671,003                         13,678,393
97.5th percentile   13,678,085                         13,685,378
99th percentile     13,680,785                         13,688,379
Table 4.9: Percentiles of the bootstrapped 95th percentile of the
distribution of the lower bound $S^B_l(95)$ vs. the simulation.
Finally, for each bootstrap sample, we calculate the desired percentile of
the distribution of $S^l_{LL}$. This two-step procedure is repeated a large
number of times. The first column of Table 4.9 shows the results,
concerning the 95th percentile, for 5000 bootstrap samples applied to the
run-off triangle in Table 4.5. When compared with the simulated
distribution of $F^{-1}_{S_{LL}}(0.95)$ (obtained through 5000 simulated
triangles), we can conclude that the bootstrap distribution yields
appropriate confidence bounds.
Parameter Value Estimate Standard error
α1 12.8 12.805 0.0073
α2 12.9 12.909 0.0074
α3 13.6 13.599 0.0077
α4 13.5 13.506 0.0076
α5 13.4 13.411 0.0082
α6 13.2 13.203 0.0076
α7 13.8 13.788 0.0091
α8 13.7 13.708 0.0081
α9 13.1 13.103 0.0101
α10 13.0 13.982 0.0102
α11 13.9 13.905 0.0131
β2 0.31 0.310 0.0068
β3 −0.11 −0.118 0.0080
β4 −0.42 −0.424 0.0079
β5 −0.37 −0.370 0.0088
β6 −0.87 −0.883 0.0079
β7 −0.96 −0.967 0.0093
β8 −1.33 −1.325 0.0108
β9 −1.63 −1.643 0.0097
β10 −1.92 −1.956 0.0225
β11 −2.31 −2.311 0.0150
σ 0.01 0.0093 0.0001
Table 4.10: Model specification, maximum likelihood estimates and
standard errors for the run-off triangle in Table 4.11.
4.6.2 Loglinear location-scale models
Table 4.11 displays the simulated run-off triangle for the logistic regression
model with given parameters displayed in Table 4.4.
Fitting the logistic regression model with a chain-ladder type predictor
gives the parameter estimates and standard errors shown in Table 4.10.
      1         2          3        4        5        6        7        8        9        10      11
1   362,573   487,703   327,399  247,297  248,321  151,494  137,722   98,983   70,587   50,118  36,110
2   400,144   548,504   366,684  255,014  283,467  166,318  154,915  105,641   77,890   58,763
3   819,562  1,109,572  665,960  520,160  566,065  330,429  302,985  216,361  156,159
4   724,419   999,135   668,363  478,629  512,920  307,563  275,629  192,212
5   675,791   893,821   597,618  434,052  442,722  276,007  262,520
6   544,870   736,215   471,965  359,236  377,939  222,590
7   990,881  1,341,576  850,040  639,613  658,638
8   896,565  1,230,011  790,872  589,761
9   482,297   674,219   437,692
10  432,302   595,206
11 1,093,549
Table 4.11: Simulated run-off triangle with non-cumulative claim figures
for the logistic regression model.
We will compare the derived bounds with a time-consuming Monte Carlo
simulation based on 100 000 randomly generated, normally distributed
vectors $(Y_1, Y_2, \ldots, Y_{t-1})$ and $e^{\sigma\vec{\varepsilon}}$.
Using the following properties, the simulation of these last terms can be
done in any statistical software package.
• If $\varepsilon_{ij}$ is Gumbel distributed, then we have that
$e^{\sigma\varepsilon_{ij}}$ is Weibull distributed with location
parameter $1/\sigma$ and scale parameter equal to 1.
• If $\varepsilon_{ij}$ is generalized loggamma distributed with parameter
$k$, then we have that $e^{\sigma\varepsilon_{ij}}$ is generalized gamma
distributed with parameters $\gamma = 1/(\sigma\sqrt{k})$ and
$\alpha = k^{-\sigma\sqrt{k}}$. One can generate a random number from a
generalized gamma distribution as follows:
1. Generate $G_k$ from the gamma distribution with location parameter $k$
and scale parameter 1.
2. Retain $\alpha (G_k)^{1/\gamma}$.
• If $\varepsilon_{ij}$ is log inverse Gaussian distributed, then we have
that $e^{\sigma\varepsilon_{ij}}$ is inverse Gaussian distributed with
location parameter and scale parameter equal to $1/\sigma$. Michael et al.
(1976) describe an algorithm to generate a random number from an inverse
Gaussian distribution with parameters $\alpha$ and $\beta$ as follows:
1. Generate $C$ from the $\chi^2(1)$ distribution.
2. Calculate
$x_1 = \frac{\alpha}{\beta} + \frac{C}{2\beta} - \frac{1}{2\beta}\sqrt{4\alpha C + C^2}$,
$x_2 = \frac{\alpha^2}{\beta^2 x_1}$ and
$p_1 = \left(1 + \frac{\beta}{\alpha} x_1\right)^{-1}$.
3. Generate $U \sim \text{Uniform}(0, 1)$.
4. Retain $x_1$ if $U \le p_1$, else $x_2$.
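The inverse Gaussian generator can be sketched as follows. The sketch
follows the acceptance rule of Michael et al. (1976), in which the smaller
root $x_1$ is kept with probability $p_1$ and the larger root $x_2$
otherwise. The function name is hypothetical; under the formulas of step 2
the mean of the generated variable works out as $\alpha/\beta$, which is
what the final lines check empirically.

```python
import math
import random


def inverse_gaussian(alpha, beta, rng):
    """One draw by the transformation method with multiple roots."""
    c = rng.gauss(0.0, 1.0) ** 2                  # chi-square(1) variate
    x1 = (alpha / beta + c / (2 * beta)
          - math.sqrt(4 * alpha * c + c * c) / (2 * beta))   # smaller root
    x2 = alpha ** 2 / (beta ** 2 * x1)            # larger root
    p1 = 1.0 / (1.0 + beta * x1 / alpha)          # probability of keeping x1
    return x1 if rng.random() <= p1 else x2


# Empirical check with illustrative parameters (mean should be near alpha / beta).
rng = random.Random(1976)
draws = [inverse_gaussian(2.0, 1.0, rng) for _ in range(20_000)]
sample_mean = sum(draws) / len(draws)
```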
In Figures 4.4 and 4.5 we compare the approximations (the convex upper
and lower bounds) for the distribution of the discounted loss reserve
$S_{LLS}$ to the empirical distribution function obtained by a Monte Carlo
(MC) simulation study. One can see that the upper bound $S^c_{LLS}$ gives a
poor approximation. We observe that this upper bound has heavier tails than
the original distribution: the deviation for upper quantiles reaches 25%.
The main reason for that is a relatively weak dependence between claims,
for which the comonotonic approximation significantly overestimates the
tails, which is very clear both from the plot of cdf's and from the
QQ-plot. On the other hand the lower bound gives a much better fit to the
original distribution. These findings are confirmed in Table 4.12 for some
chosen quantiles.
p      $S^l_{LLS}$  $S_{LLS}$    $S^c_{LLS}$
0.95   13,517,204   13,524,010   14,125,203
0.975  14,175,492   14,165,083   14,950,838
0.99   14,988,558   15,009,978   15,979,224
0.995  15,573,369   15,483,938   16,724,584
0.999  16,865,068   16,623,928   18,386,959
Table 4.12: Approximations for some selected quantiles with probability
level p of $S_{LLS}$.
Similar conclusions can be drawn from the study of the reserves for the
row totals given by
$$S_{LLS,i} = \sum_{j=t+2-i}^{t} e^{(R\vec{\beta})_{ij} + \sigma\varepsilon_{ij} - Y(i+j-t-1)}, \quad i = 2, \ldots, t. \qquad (4.50)$$
Table 4.13 summarizes the numerical values of the 95th percentiles of the
two bounds $S^l_{LLS}$ and $S^c_{LLS}$, together with their means and
standard deviations.
We end this illustration with a bootstrap study in order to incorporate
the estimation error involved. Starting from the run-off triangle in Table
4.11 we bootstrap 5000 pseudo run-off triangles and calculate for each
bootstrap sample the 95th percentile of the distribution of $S^l_{LLS}$.
Table 4.14 displays the results of this study. One can observe that,
compared to the simulated distribution of $F^{-1}_{S_{LLS}}(0.95)$, the
bootstrap distribution performs very well.
Figure 4.4: The cdf's of ‘$S_{LLS}$’ (MC) (solid line), $S^l_{LLS}$ (dotted
line) and $S^c_{LLS}$ (dashed line) for the run-off triangle in Table 4.11.
(Horizontal axis: discounted IBNR reserve; vertical axis: cum. distr.)
Figure 4.5: QQ-plot of the quantiles of $S^l_{LLS}$ (◦) and $S^c_{LLS}$
versus those of ‘$S_{LLS}$’ (MC).
           $S^l_{LLS}$                        $S_{LLS}$                          $S^c_{LLS}$
year   95%         mean        st. dev.   95%         mean        st. dev.   95%         mean        st. dev.
2      41,609      36,990      2,705      44,057      36,990      4,124      44,146      36,990      4,133
3      201,904     173,309     16,524     209,182     173,309     20,367     212,567     173,309     21,756
4      330,812     276,778     30,930     339,482     276,778     35,288     346,424     276,778     38,786
5      481,854     395,969     48,841     487,849     395,969     53,075     503,109     395,969     60,567
6      599,443     487,188     63,572     605,245     487,188     67,227     625,306     487,188     77,321
7      1,473,927   1,178,908   166,351    1,484,295   1,178,908   171,499    1,535,494   1,178,908   201,482
8      1,971,886   1,576,489   222,631    1,979,501   1,576,489   226,440    2,057,661   1,576,489   263,879
9      1,384,144   1,090,821   164,670    1,386,684   1,090,821   165,568    1,443,768   1,090,821   192,212
10     1,593,022   1,248,692   192,941    1,589,451   1,248,692   193,174    1,663,714   1,248,692   224,982
11     5,438,603   4,278,076   650,272    5,443,520   4,278,076   653,437    5,693,117   4,278,076   763,183
total  13,517,204  10,743,220  1,559,369  13,524,010  10,743,220  1,583,892  14,125,203  10,743,220  1,884,508
Table 4.13: 95th percentiles, means and standard deviations of the
distributions of $S^l_{LLS}$ and $S^c_{LLS}$ vs. ‘$S_{LLS}$’ (MC).
                    Distribution of bootstrapped       Simulated distribution
                    95th percentiles of $S^l_{LLS}$    of $F^{-1}_{S_{LLS}}(0.95)$
1st percentile      13,123,442                         13,111,532
2.5th percentile    13,227,201                         13,216,739
5th percentile      13,314,139                         13,301,539
10th percentile     13,363,615                         13,340,200
25th percentile     13,434,055                         13,421,181
50th percentile     13,510,262                         13,501,808
75th percentile     13,583,175                         13,585,042
90th percentile     13,646,792                         13,654,483
95th percentile     13,691,476                         13,698,756
97.5th percentile   13,716,004                         13,731,551
99th percentile     13,730,976                         13,740,991
Table 4.14: Percentiles of the bootstrapped 95th percentile of the
distribution of the lower bound $S^B_l(95)$ vs. the simulation.
4.6.3 Generalized linear models
In this last illustration we model the incremental claims Yij with a loga-
rithmic link function to obtain a multiplicative parametric structure and
we link the expected value of the response to the chain-ladder type linear
predictor. Formally, this means that
$$E[Y_{ij}] = \mu_{ij}, \quad \mathrm{Var}[Y_{ij}] = \phi\,\mu_{ij}^{\kappa}, \quad \log(\mu_{ij}) = \eta_{ij}, \quad \eta_{ij} = \alpha_i + \beta_j. \qquad (4.51)$$
The choice of the error distribution is determined by κ.
More specifically, we consider model (4.51) with the Poisson error
distribution ($\kappa = 1$ and $\phi = 1$). The simulated triangle for this
model is depicted in Table 4.15. Parameter estimates and standard errors
for this fit are shown in Table 4.16.
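As a small numerical illustration of model (4.51), the following sketch
evaluates the mean and the standard deviation of a single cell. The
function name is hypothetical; the parameter values $\alpha_1 = 12.8$ and
$\beta_1 = 0$ are those of Table 4.4.

```python
import math


def cell_moments(alpha_i, beta_j, phi=1.0, kappa=1.0):
    """Mean and standard deviation of Y_ij under model (4.51)."""
    mu = math.exp(alpha_i + beta_j)      # log link: log(mu_ij) = alpha_i + beta_j
    variance = phi * mu ** kappa         # Var[Y_ij] = phi * mu_ij^kappa
    return mu, math.sqrt(variance)


# Poisson case (kappa = 1, phi = 1) for the first cell of the triangle.
mu_11, sd_11 = cell_moments(12.8, 0.0)
```

In the Poisson case the standard deviation of a cell is small relative to
its mean, which is consistent with the small standard errors of the
parameter estimates for this model.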
Since this model is a generalized linear model, standard statistical
software can be used to obtain maximum (quasi) likelihood parameter
estimates, fitted and predicted values. Standard statistical theory also
suggests goodness-of-fit measures and appropriate residual definitions for
diagnostic checks of the fitted model.
      1         2          3        4        5        6        7        8        9        10      11
1   362,505   493,876   323,065  237,574  249,850  152,221  139,293   95,961   70,812   53,395  35,902
2   399,642   545,274   357,788  263,414  276,500  168,064  153,603  105,760   78,736   58,612
3   805,843  1,100,020  722,110  531,220  557,195  337,606  309,306  213,416  158,611
4   728,762   994,975   653,231  478,728  502,797  306,071  278,436  193,201
5   661,713   899,778   591,647  434,626  456,763  276,588  253,297
6   539,789   737,394   484,415  355,175  372,800  226,865
7   983,897  1,341,585  881,786  647,431  679,264
8   889,268  1,217,248  798,387  585,099
9   487,823   666,590   437,987
10  442,982   601,706
11 1,087,672
Table 4.15: Simulated run-off triangle with non-cumulative claim figures
for the Poisson regression model.
Parameter Value Estimate Standard error
α1 12.8 12.7990566 0.0007918770
α2 12.9 12.8989406 0.0007631003
α3 13.6 13.6001742 0.0006060520
α4 13.5 13.4989356 0.0006283423
α5 13.4 13.4007436 0.0006556928
α6 13.2 13.1997559 0.0007180990
α7 13.8 13.7991616 0.0005991796
α8 13.7 13.6998329 0.0006464691
α9 13.1 13.0989431 0.0008707837
α10 13.0 12.9987252 0.0010370987
α11 13.9 13.8995502 0.0009710197
β2 0.31 0.3106789 0.0005310346
β3 −0.11 −0.1099061 0.0006026958
β4 −0.42 −0.4189677 0.0006804776
β5 −0.37 −0.3700452 0.0007168115
β6 −0.87 −0.8685181 0.0009462170
β7 −0.96 −0.9585385 0.0010542829
β8 −1.33 −1.3284870 0.0013825136
β9 −1.63 −1.6269622 0.0018947413
β10 −1.92 −1.9170757 0.0030880359
β11 −2.31 −2.3105083 0.0054029754
φ 1 1.025663
Table 4.16: Model specification, maximum likelihood estimates and
standard errors for the run-off triangle in Table 4.15.
Figure 4.6 shows the distribution functions of the different bounds
compared to the empirical distribution obtained by Monte Carlo simulation
(MC). The distribution functions are remarkably close to each other and
enclose the simulated cdf nicely. This is confirmed by the QQ-plot in
Figure 4.7 where we also see that the comonotonic upper bound has somewhat
heavier tails. Numerical values of some high quantiles of $S_{GLIM}$,
$S^l_{GLIM}$ and $S^c_{GLIM}$ are given in Table 4.18.
Table 4.17 summarizes the numerical values of the 95th percentiles of
the two bounds $S^l_{GLIM}$ and $S^c_{GLIM}$ vs. $S_{GLIM}$, together with
their means and standard deviations. This is also provided for the row
totals
$$S_{GLIM,i} = \sum_{j=t+2-i}^{t} \mu_{ij}\, e^{-Y(i+j-t-1)}, \quad i = 2, \ldots, t. \qquad (4.52)$$
Figure 4.6: The cdf's of ‘$S_{GLIM}$’ (MC) (solid line), $S^l_{GLIM}$
(dotted line) and $S^c_{GLIM}$ (dashed line) for the run-off triangle in
Table 4.15. (Horizontal axis: discounted IBNR reserve; vertical axis: cum.
distr.)
Figure 4.7: QQ-plot of the quantiles of $S^l_{GLIM}$ (◦) and $S^c_{GLIM}$
versus those of ‘$S_{GLIM}$’ (MC).
           $S^l_{GLIM}$                       $S_{GLIM}$                         $S^c_{GLIM}$
year   95%         mean        st. dev.   95%         mean        st. dev.   95%         mean        st. dev.
2      43,622      36,623      4,041      43,624      36,623      4,042      43,631      36,623      4,046
3      214,142     177,600     21,002     214,428     177,600     21,040     217,352     177,600     22,751
4      342,589     280,318     35,595     343,011     280,318     35,691     350,360     280,318     39,805
5      489,087     396,089     52,976     489,689     396,089     53,194     502,853     396,089     60,398
6      608,891     490,289     67,401     609,535     490,289     67,565     628,672     490,289     78,021
7      1,514,480   1,205,224   175,099    1,516,799   1,205,224   175,658    1,567,945   1,205,224   203,692
8      1,977,737   1,575,313   227,703    1,980,868   1,575,313   228,343    2,054,475   1,575,313   268,661
9      1,390,601   1,093,992   167,320    1,392,957   1,093,992   167,862    1,444,660   1,093,992   197,121
10     1,632,675   1,278,947   199,110    1,634,653   1,278,947   199,693    1,702,375   1,278,947   236,121
11     5,439,986   4,276,121   655,280    5,446,107   4,276,121   656,472    5,685,932   4,276,121   785,741
total  13,631,905  10,810,476  1,594,152  13,648,695  10,810,476  1,597,507  14,200,226  10,810,476  1,896,219
Table 4.17: 95th percentiles, means and standard deviations of the
distributions of $S^l_{GLIM}$ and $S^c_{GLIM}$ vs. ‘$S_{GLIM}$’ (MC).
p      $S^l_{GLIM}$ $S_{GLIM}$   $S^c_{GLIM}$
0.95   13,631,905   13,648,695   14,200,226
0.975  14,296,448   14,305,657   15,027,414
0.99   15,115,189   15,122,840   16,057,613
0.995  15,702,702   15,709,497   16,804,206
0.999  16,996,374   17,018,860   18,469,110
Table 4.18: Approximations for some selected quantiles with probability
level p of $S_{GLIM}$.
                    Distribution of bootstrapped       Simulated distribution
                    95th percentiles of $S^l_{GLIM}$   of $F^{-1}_{S_{GLIM}}(0.95)$
1st percentile      13,614,404                         13,604,314
2.5th percentile    13,617,028                         13,609,425
5th percentile      13,619,474                         13,613,048
10th percentile     13,622,664                         13,618,053
25th percentile     13,626,759                         13,624,369
50th percentile     13,631,651                         13,631,622
75th percentile     13,636,506                         13,638,997
90th percentile     13,641,168                         13,645,812
95th percentile     13,643,882                         13,649,574
97.5th percentile   13,646,720                         13,652,995
99th percentile     13,648,833                         13,656,178
Table 4.19: Percentiles of the bootstrapped 95th percentile of the
distribution of the lower bound $S^B_l(95)$ vs. the simulation.
The bootstrap results in Table 4.19 are in line with the results of the
previous applications. We can conclude that in the discussed applications
the lower bound approximates the “real discounted reserve” very well. The
precision of the bounds only depends on the underlying variance of the sta-
tistical and financial part. As long as the yearly volatility does not exceed
ς = 35%, the financial part of the comonotonic approximation provides a
very accurate fit. These parameters are consistent with historical capital
market values as reported by Ibbotson Associates (2002). The underlying
variance of the statistical part depends on the estimated dispersion para-
meter and error distribution or mean-variance relationship. For example,
in case of the gamma distribution one obtains excellent results as long
as the dispersion parameter is smaller than 1. This is again in line with
the volatility structure in practical IBNR data sets. Since the parameters
in the paper for the statistical part of the bounds, obtained through the
quasi-likelihood approach, have small standard errors, it follows that re-
sults would be similar when simulating from a GLIM with the same linear
predictor, but for instance with another distribution type. In that sense
our findings are robust.
4.7 Conclusion
In this chapter, we considered the problem of deriving the distribution
function of the present value of a triangle of claim payments that are
discounted using some given stochastic return process. We started to model
the claim payments by means of a lognormal linear model which is also
included in the larger class of loglinear location-scale models. The use of
generalized linear models offers a great gain in modelling flexibility over the
simple lognormal model. The incremental claim amounts can for instance
be modelled as independent normal, Poisson, gamma or inverse Gaussian
response variables together with a logarithmic link function and a specified
linear predictor.
Because an explicit expression for the distribution function is hard to
obtain, we presented some approximations for this distribution function, in
the sense that these approximations are larger or smaller in convex order
sense than the exact distribution. When lower and upper bounds are close
to each other, together they can provide reliable information about the
original and more complex variable. An essential point in the derivation
of the presented convex lower bound approximations is the choice of the
conditioning random variable Λ.
When dealing with very large variances in the statistical and financial
part of our model, an adaptation of the random variable Λ will be necessary
or one can use other approximation techniques. This will be the topic of
the next chapter.
Chapter 5
Other approximation
techniques for sums of
dependent random variables
Summary In this chapter we derive some asymptotic results for the tail
distribution of sums of heavy tailed dependent random variables. We show
how to apply the obtained results to approximate certain functionals of
(the d.f. of) sums of dependent random variables. Our numerical results
demonstrate that the asymptotic approximations are typically close to the
Monte Carlo value. We will further briefly recall the mathematical tech-
niques behind the moment matching approximations and the Bayesian ap-
proach. Finally, we compare these approximations with the comonotonic
approximations of the previous chapter in the context of claims reserving.
5.1 Introduction
Many quantities of relevance in actuarial science concern functionals of
(the d.f. of) sums of dependent random variables. For example, one can
think of the Value-at-Risk of a stochastically discounted life annuity, or
the stop-loss premium for the aggregate claim amount of a number of in-
terrelated policies. Therefore, distribution functions of sums of dependent
random variables are of particular interest. Typically these distribution
functions are of a complex form. Consequently, in order to compute func-
tionals of sums of dependent random variables, approximation methods
186 Chapter 5 - Approximation techniques for sums of r.v.’s
are generally indispensable. Obviously, in many cases we could use Monte
Carlo simulation to obtain empirical distribution functions. However, this
is typically a time-consuming approach, in particular if we want to ap-
proximate tail probabilities, which would require an excessive number of
simulations. Therefore, alternative methods need to be explored.
Practitioners often use moment matching techniques to approximate
(the d.f. of) a sum of dependent lognormal random variables. In Section 2
we recall the lognormal and reciprocal gamma moment matching approach.
Both approximations are chosen such that their first two moments are equal
to the corresponding moments of the random variable of interest.
In Chapter 2 we discussed the concept of comonotonicity to obtain
bounds in convex order for sums of dependent random variables. Al-
though these bounds in convex order have proven to be good approxi-
mations in case the variance of the random sum is sufficiently small, they
perform much worse when the variance gets large. Section 3 establishes
some asymptotic results for the tail probability of a sum of dependent
random variables, in the presence of heavy-tailedness conditions.
Section 4 sketches, in very broad terms, basic elements of Bayesian
computation. We discuss two major obstacles to its popularity. The first
is how to specify prior distributions, and the second is how to evaluate
the integrals required for inference, given that for most models, these are
analytically intractable.
In the last section we compare the discussed approximations with the
comonotonic approximations of the previous chapter in the context of
claims reserving. In case the underlying variance of the statistical and
financial part of the discounted IBNR reserve gets large, the comono-
tonic approximations perform worse. We will illustrate this observation
by means of a simple example and propose to solve this problem using the
derived asymptotic results for the tail probability of a sum of dependent
random variables, in the presence of heavy-tailedness conditions. These
approximations are compared with the lognormal moment matching ap-
proximations. We finally consider the distribution of the discounted loss
reserve when the data in the run-off triangle is modelled by a generalized
linear model and compare the outcomes of the Bayesian approach with the
comonotonic approximations.
This chapter is based on Laeven, Goovaerts & Hoedemakers (2005),
Vanduffel, Hoedemakers & Dhaene (2004) and Antonio, Beirlant & Hoede-
makers (2005).
5.2 Moment matching approximations
Consider a sum $S$ given by
$$ S = \sum_{i=1}^{n} \alpha_i e^{Z_i}. \tag{5.1} $$
Here, the $\alpha_i$ are non-negative real numbers and $(Z_1, Z_2, \ldots, Z_n)$ is a
multivariate normally distributed random vector.
The accumulated value at time n of a series of future deterministic
saving amounts αi can be written in the form (5.1), where Zi denotes the
random accumulation factor over the period [i, n]. Also the present value
of a series of future deterministic payments αi can be written in the form
(5.1), where now Zi denotes the random discount factor over the period
[0, i]. The valuation of Asian or basket options in a Black & Scholes model
and the setting of provisions and required capitals in an insurance context
boil down to the evaluation of risk measures related to the distribution
function of a random variable S as defined in (5.1).
The r.v. S defined in (5.1) will in general be a sum of non-independent
lognormal r.v.’s. Its distribution function cannot be determined analyti-
cally and is too cumbersome to work with. In the literature, a variety of
approximation techniques for this distribution function has been proposed.
Practitioners often use a moment matching lognormal approximation
for the distribution of S. The lognormal approximation is chosen such that
its first two moments are equal to the corresponding moments of S.
The present value of a continuous perpetuity with lognormal return
process has a reciprocal gamma distribution, see for instance Milevsky
(1997) and Dufresne (1990). This present value can be considered as the
limiting case of a random variable S as defined above. Motivated by this
observation, Milevsky & Posner (1998) and Milevsky & Robinson (2000)
propose a moment matching reciprocal gamma approximation for the d.f.
of S such that the first two moments match. They use this technique
for deriving closed form approximations for the price of Asian and basket
options.
5.2.1 Two well-known moment matching approximations
It belongs to the toolkit of any actuary to approximate the distribution
function of an unknown r.v. by a known distribution function in such a
way that the first moments are preserved. In this section we will briefly
describe the reciprocal gamma and the lognormal moment matching ap-
proximations. These two methods are frequently used to approximate the
distribution function of the r.v. S defined by (5.1).
The reciprocal gamma approximation
A r.v. $X$ is said to be gamma distributed when its probability density
function is given by
$$ f_X(x;\alpha,\beta) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, x^{\alpha-1} e^{-\beta x}, \qquad x > 0, \tag{5.2} $$
where $\alpha > 0$, $\beta > 0$ and $\Gamma(\cdot)$ denotes the gamma function.
Consider now the r.v. $Y = 1/X$. This r.v. is said to be reciprocal gamma
distributed. Its p.d.f. is given by
$$ f_Y(y;\alpha,\beta) = f_X(1/y;\alpha,\beta)/y^{2}, \qquad y > 0. \tag{5.3} $$
It is straightforward to prove that the quantiles of $Y$ are given by
$$ F_Y^{-1}(p) = \frac{1}{F_X^{-1}(1-p;\alpha,\beta)}, \qquad p \in (0,1), \tag{5.4} $$
where $F_X(\cdot;\alpha,\beta)$ is the cdf of the gamma distribution with parameters
$\alpha$ and $\beta$. Since the inverse of the gamma distribution function is readily
available in many statistical software packages, quantiles can easily be
determined.
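The first two moments of the reciprocal gamma distributed r.v. $Y$ are
given by
$$ E[Y] = \frac{1}{\beta(\alpha-1)}, \qquad \alpha > 1 \tag{5.5} $$
and
$$ E[Y^2] = \frac{1}{\beta^{2}(\alpha-1)(\alpha-2)}, \qquad \alpha > 2. \tag{5.6} $$
Expressing the parameters $\alpha$ and $\beta$ in terms of $E[Y]$ and $E[Y^2]$ gives
$$ \alpha = \frac{2E[Y^2] - E[Y]^2}{E[Y^2] - E[Y]^2} \tag{5.7} $$
and
$$ \beta = \frac{E[Y^2] - E[Y]^2}{E[Y]\,E[Y^2]}. \tag{5.8} $$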
The d.f. of the r.v. defined in (5.1) is now approximated by a reciprocal
gamma distribution with first two moments (2.46) and (2.47). The coef-
ficients α and β of the reciprocal gamma approximation follow from (5.7)
and (5.8). The reciprocal gamma approximation for the quantile function
is then given by (5.4).
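As a minimal sketch of the fitting step, the snippet below solves (5.7) and (5.8) for given first two moments and checks the result by plugging the parameters back into (5.5) and (5.6). The function names and the target moments are hypothetical, chosen only for illustration; the moments (2.46) and (2.47) of the actual sum are not reproduced here. Evaluating the quantile (5.4) would additionally require a gamma inverse d.f. from a statistical package, which is left out of this standard-library sketch.

```python
def reciprocal_gamma_params(m1, m2):
    """Solve (5.7) and (5.8) for (alpha, beta), given m1 = E[Y] and m2 = E[Y^2]."""
    variance = m2 - m1 ** 2
    if m1 <= 0.0 or variance <= 0.0:
        raise ValueError("need a positive mean and a positive variance")
    alpha = (2.0 * m2 - m1 ** 2) / variance   # (5.7)
    beta = variance / (m1 * m2)               # (5.8)
    return alpha, beta


def reciprocal_gamma_moments(alpha, beta):
    """First two moments from (5.5) and (5.6); both exist only if alpha > 2."""
    if alpha <= 2.0:
        raise ValueError("alpha must exceed 2 for the second moment to exist")
    m1 = 1.0 / (beta * (alpha - 1.0))
    m2 = 1.0 / (beta ** 2 * (alpha - 1.0) * (alpha - 2.0))
    return m1, m2


# Hypothetical target moments, chosen only for illustration.
target_m1, target_m2 = 10.0, 104.0
alpha, beta = reciprocal_gamma_params(target_m1, target_m2)
fitted_m1, fitted_m2 = reciprocal_gamma_moments(alpha, beta)
print(alpha, beta, fitted_m1, fitted_m2)
```

Note that (5.7) can be rewritten as $\alpha = 2 + E[Y]^2/\mathrm{Var}[Y]$, so the fitted $\alpha$ always exceeds 2 and the matched second moment always exists.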
The reciprocal gamma moment matching method appears naturally in
case one wants to approximate the d.f. of stochastic present values. Indeed,
for the limiting case of the constant continuous perpetuity
$$ S_{\infty} = \int_{0}^{\infty} \exp\left[-\left(\mu - \frac{\sigma^{2}}{2}\right)\tau - \sigma B(\tau)\right] d\tau, \tag{5.9} $$
where $B(\tau)$ represents a standard Brownian motion and $\mu > \frac{\sigma^{2}}{2}$, the risk
measures can be calculated very easily since Dufresne (1990) proved that
$S_{\infty}^{-1}$ is gamma distributed with parameters $\frac{2\mu}{\sigma^{2}} - 1$ and $\frac{\sigma^{2}}{2}$. An elegant
proof for this result can be found in Milevsky (1997).
Expression (5.9) can be seen as a continuous counterpart of a discounted
sum such as in (5.1). One expects that the present value of a finite discrete
annuity with a normal logreturn process with independent periodic
returns can be approximated by a reciprocal gamma distribution, provided
the time period involved is long enough. This idea was put forward
and explored in Milevsky & Posner (1998), Milevsky & Robinson (2000)
and Huang et al. (2004).
The lognormal approximation
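A r.v. $X$ is said to be lognormally distributed if its p.d.f. is given by
$$ f_X(x;\mu,\sigma^{2}) = \frac{1}{x\sigma\sqrt{2\pi}}\, e^{-\frac{(\log x - \mu)^{2}}{2\sigma^{2}}}, \qquad x > 0, \tag{5.10} $$
where $\sigma > 0$.
The quantiles of $X$ are given by
$$ F_X^{-1}(p) = e^{\mu + \sigma\Phi^{-1}(p)}, \qquad p \in (0,1). \tag{5.11} $$
The first two moments of $X$ are given by
$$ E[X] = e^{\mu + \frac{1}{2}\sigma^{2}} \tag{5.12} $$
and
$$ E[X^2] = e^{2\mu + 2\sigma^{2}}. \tag{5.13} $$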
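Expressing the parameters $\mu$ and $\sigma^{2}$ of the lognormal distribution in terms
of $E[X]$ and $E[X^2]$ leads to
$$ \mu = \log\left(\frac{E[X]^{2}}{\sqrt{E[X^2]}}\right) \tag{5.14} $$
and
$$ \sigma^{2} = \log\left(\frac{E[X^2]}{E[X]^{2}}\right). \tag{5.15} $$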
The same procedure as the one explained in the previous subsection can
be followed in order to obtain a lognormal approximation for S, with the
first two moments matched. Dufresne (2002) obtains a lognormal limit dis-
tribution for S as volatility σ tends to zero and this provides a theoretical
justification for the use of the lognormal approximation.
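The lognormal fit can be sketched in a few lines. The snippet below is an illustration under stated assumptions, not the computation used for the tables in this chapter: the helper names are hypothetical, the moments of $S$ are obtained from the standard identity $E[e^{Z_i+Z_j}] = \exp(\mu_i + \mu_j + \frac{1}{2}(\sigma_{ii} + \sigma_{jj}) + \sigma_{ij})$ for a multivariate normal vector, and the weights, means and covariances are made-up inputs.

```python
import math
from statistics import NormalDist


def moments_of_sum(weights, mean, cov):
    """First two moments of S = sum_i w_i * exp(Z_i) with Z ~ N(mean, cov)."""
    n = len(weights)
    m1 = sum(weights[i] * math.exp(mean[i] + 0.5 * cov[i][i]) for i in range(n))
    m2 = sum(
        weights[i] * weights[j]
        * math.exp(mean[i] + mean[j] + 0.5 * (cov[i][i] + cov[j][j]) + cov[i][j])
        for i in range(n)
        for j in range(n)
    )
    return m1, m2


def lognormal_params(m1, m2):
    """Moment matching via (5.14) and (5.15); returns (mu, sigma^2)."""
    mu = math.log(m1 ** 2 / math.sqrt(m2))
    sigma2 = math.log(m2 / m1 ** 2)
    return mu, sigma2


def lognormal_quantile(p, mu, sigma2):
    """Quantile function (5.11) of the fitted lognormal distribution."""
    return math.exp(mu + math.sqrt(sigma2) * NormalDist().inv_cdf(p))


# Hypothetical inputs: three unit payments, cumulative log-discount factors
# with Brownian-type covariance 0.04 * min(i, j).
weights = [1.0, 1.0, 1.0]
mean = [-0.05, -0.10, -0.15]
cov = [[0.04 * min(i, j) for j in (1, 2, 3)] for i in (1, 2, 3)]

m1, m2 = moments_of_sum(weights, mean, cov)
mu, sigma2 = lognormal_params(m1, m2)
print(lognormal_quantile(0.95, mu, sigma2))
```

For a single term the fit is exact, since $\alpha_1 e^{Z_1}$ is itself lognormal; this gives a convenient sanity check of the implementation.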
5.2.2 Application: discounted loss reserves
We calculate the lognormal moment matching approximations for the application
considered in Section 2.4 and compare the results with the convex
lower bound. The results are given below.
We use the notation $SM_p[V^{l}]$ and $SM_p[V^{LN}]$ to denote the security
margin for confidence level $p$ approximated by the lower bound and by the
lognormal moment matching technique respectively. The different tables
display the Monte Carlo simulation result (MC) for the security margin, as
well as the percentage deviations of the different approximation methods
relative to the Monte Carlo result. These percentage deviations are defined
as follows:
$$ LB := \frac{SM_p[V^{l}] - SM_p[V^{MC}]}{SM_p[V^{MC}]} \times 100\%, $$
$$ LN := \frac{SM_p[V^{LN}] - SM_p[V^{MC}]}{SM_p[V^{MC}]} \times 100\%, $$
where $V^{l}$ and $V^{LN}$ correspond to the lower bound approach and the lognormal
moment matching approach, and $V^{MC}$ denotes the Monte Carlo
simulation result. The figures displayed in bold in the tables correspond to
the best approximations, that is, the ones with the smallest percentage
deviation from the Monte Carlo results.
Overall the comonotonic lower bound approach provides a very accurate
fit under different parameter assumptions. These assumptions are
in line with realistic market values. Moreover the comonotonic approximations
have the advantage that they are easy to compute for any
risk measure that is additive for comonotonic risks, such as Value-at-Risk
and Tail Value-at-Risk. We believe the comonotonic approach is preferable
to any moment matching approximation, because it is more stable and
accurate across all levels of volatility.

$\sigma_M$ :      0.05      0.15      0.25      0.35
LB              −0.25%    −0.09%    −0.12%    −0.00%
LN              −1.66%    +1.28%    +4.09%    +7.52%
MC              0.0853    0.1090    0.1309    0.1370
(s.e. × 10^7)   (1.11)    (2.47)    (6.15)    (8.18)
Table 5.1: (ex. 1) Approximations for the security margin $SM_{0.70}[V]$ for
different market volatilities and $\omega_L = 0.1$ and $\omega_A = 0.05$.

p :             0.995     0.975     0.95      0.90      0.80      0.70
LB              −0.38%    −0.21%    −0.16%    −0.08%    −0.00%    −0.00%
LN              −4.30%    −2.96%    −2.29%    −1.43%    −0.11%    +1.74%
MC              1.0348    0.6927    0.5421    0.3859    0.2192    0.1124
(s.e. × 10^5)   (2.49)    (0.46)    (0.26)    (0.10)    (0.06)    (0.04)
Table 5.2: (ex. 1) Approximations for some selected confidence levels
of $SM_p[V]$. The market volatility is set equal to 20%. ($\omega_L = 0.05$ and
$\omega_A = 0$)

$\sigma_M$ :      0.05      0.10      0.15      0.20      0.25      0.30      0.35
LB              −0.19%    −0.15%    −0.23%    −0.16%    −0.11%    −0.17%    −0.38%
LN              −4.94%    −3.92%    −3.17%    −2.49%    −1.95%    −1.56%    −1.30%
MC              0.4390    0.5250    0.6528    0.8103    0.9924    1.1970    1.4232
(s.e. × 10^5)   (0.15)    (0.29)    (0.41)    (0.69)    (1.22)    (3.78)    (4.16)
Table 5.3: (ex. 1) Approximations for the security margin $SM_{0.975}[V]$ for
different market volatilities.

p :             0.995     0.975     0.95      0.90      0.80      0.70
LB              −0.93%    −0.04%    −0.02%    −0.18%    −0.03%    −0.6%
LN              −3.94%    +3.78%    +7.22%    +11.29%   +19.68%   +53.46%
MC              4.4521    2.2264    1.4998    0.8814    0.3508    0.0761
(s.e. × 10^5)   (37.63)   (2.99)    (7.44)    (2.79)    (0.78)    (0.27)
Table 5.4: (ex. 2) Approximations for some selected confidence levels of
$SM_p[V]$. The market volatility is set equal to 25%.
5.3 Asymptotic approximations
In actuarial applications it is often merely the tail of the distribution func-
tion that is of interest. Indeed, one may think of Value-at-Risk, Conditional
Tail Expectation or Expected Shortfall estimations. Therefore, approxi-
mations for functionals of sums of (the d.f. of) dependent random variables
may alternatively be obtained through the use of asymptotic relations. Al-
though asymptotic results are valid at infinity, they may as well serve as
approximations near infinity.
This section establishes some asymptotic results for the tail proba-
bilities related with a sum of heavy tailed dependent random variables.
In particular, we establish an asymptotic result for the randomly weighted
sum of a sequence of non-negative numbers. Furthermore, we establish un-
der two different sets of conditions, an asymptotic result for the randomly
weighted sum of a sequence of independent random variables that consist
of a random and a deterministic component. Throughout, the random
weights are products of i.i.d. random variables and thus exhibit an explicit
dependence structure. Next, we present an application that demonstrates
how the derived asymptotic results can be employed to approximate cer-
tain functionals of sums of (the d.f. of) dependent random variables. To
explore the quality of the asymptotic approximations, we also provide a
numerical illustration that compares the asymptotic approximation values
to Monte Carlo simulated values.
5.3.1 Preliminaries for heavy-tailed distributions
First we introduce some notational conventions. For a random variable $X$
with a distribution function $F$, we denote its tail probability by $\bar{F}(x) =
1 - F(x) = \Pr[X > x]$. For two independent r.v.’s $X$ and $Y$ with d.f.’s $F$
and $G$ supported on $(-\infty,+\infty)$, we denote by $F * G(x) = \int_{-\infty}^{+\infty} F(x-t)\, G(dt)$,
$-\infty < x < +\infty$, the convolution of $F$ and $G$. We denote by $F^{*n} =
F * \cdots * F$ the $n$-fold convolution of $F$, and by $F \otimes G$ the d.f. of
$XY$.
Throughout, unless otherwise stated, all limit relations are for $x \to +\infty$.
Let $a(x) \ge 0$ and $b(x) > 0$ be two functions satisfying
$$ l^{-} \le \liminf_{x\to+\infty} \frac{a(x)}{b(x)} \le \limsup_{x\to+\infty} \frac{a(x)}{b(x)} \le l^{+}. $$
We write $a(x) = O(b(x))$ if $l^{+} < +\infty$, $a(x) = o(b(x))$ if $l^{+} = 0$ and
$a(x) \asymp b(x)$ if both $l^{+} < +\infty$ and $l^{-} > 0$. We write $a(x) \lesssim b(x)$ if $l^{+} = 1$,
$a(x) \gtrsim b(x)$ if $l^{-} = 1$ and $a(x) \sim b(x)$ if both $l^{+} = 1$ and $l^{-} = 1$. We say
that $a(x)$ and $b(x)$ are weakly equivalent if $a(x) \asymp b(x)$, and say that $a(x)$
and $b(x)$ are (strongly) equivalent if $a(x) \sim b(x)$.
A r.v. $X$ or its d.f. $F$ is said to be heavy-tailed if $E[e^{\gamma X}] = +\infty$ for
any $\gamma > 0$. Below we introduce some important classes of heavy-tailed
distributions. A d.f. $F$ supported on $(0,+\infty)$ belongs to the subexponential
class $\mathcal{S}$ if
$$ \lim_{x\to+\infty} \overline{F^{*n}}(x) / \bar{F}(x) = n \tag{5.16} $$
for any (or equivalently, for some) $n \ge 2$. More generally, a d.f. $F$ supported
on $(-\infty,+\infty)$ belongs to the class $\mathcal{S}$ if $F^{+}(x) = F(x) I_{(x>0)}$ does. A
d.f. $F$ supported on $(-\infty,+\infty)$ belongs to the long-tailed class $\mathcal{L}$ if for any
real number $y$ (or equivalently, for $y = 1$) we have that
$$ \lim_{x\to+\infty} \bar{F}(x+y) / \bar{F}(x) = 1. \tag{5.17} $$
A class of heavy-tailed distributions that is closely related to the classes
$\mathcal{S}$ and $\mathcal{L}$ is the class $\mathcal{D}$ of d.f.’s with dominatedly varying tails. A d.f. $F$
supported on $(-\infty,+\infty)$ belongs to the class $\mathcal{D}$ if its tail $\bar{F}$ is of dominated
variation in the sense that
$$ \limsup_{x\to+\infty} \frac{\bar{F}(xy)}{\bar{F}(x)} < +\infty \tag{5.18} $$
for any $0 < y < 1$ (or equivalently for some $0 < y < 1$). It is well-known
that
$$ \mathcal{D} \cap \mathcal{L} \subset \mathcal{S} \subset \mathcal{L}. $$
See e.g. Embrechts et al. (1997). We remark that the intersection $\mathcal{D} \cap \mathcal{L}$
contains many useful heavy-tailed distributions. In particular, the intersection
$\mathcal{D} \cap \mathcal{L}$ covers the class $\mathcal{R}$, which consists of all d.f.’s with regularly
varying tails. A d.f. $F$ supported on $(-\infty,+\infty)$ has a regularly varying
tail if there is some $\alpha > 0$ such that the relation
$$ \lim_{x\to+\infty} \frac{\bar{F}(xy)}{\bar{F}(x)} = y^{-\alpha} $$
holds true for any $y > 0$. We denote $F \in \mathcal{R}_{-\alpha}$.
In addition to the classes of heavy-tailed distributions introduced above,
we introduce the class $\mathcal{R}_{-\infty}$ of d.f.’s with rapidly varying tails, containing
both heavy-tailed and light-tailed distributions. For a d.f. $F$ supported on
$(-\infty,+\infty)$ satisfying $\bar{F}(x) > 0$ for any $x > 0$, $F$ belongs to the class $\mathcal{R}_{-\infty}$
if
$$ \lim_{x\to+\infty} \frac{\bar{F}(xy)}{\bar{F}(x)} = \begin{cases} 0, & \text{for any } y > 1; \\ +\infty, & \text{for any } 0 < y < 1. \end{cases} \tag{5.19} $$
We remark that the intersection S∩R−∞ contains e.g. lognormal distribu-
tions and certain Weibull distributions, which are prominent distributions
in actuarial applications.
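The difference between regularly and rapidly varying tails can be made concrete numerically. The sketch below, with hypothetical function names and illustrative parameter values, evaluates the tail ratio $\bar{F}(xy)/\bar{F}(x)$ for $y = 2$ along an increasing sequence of $x$: for a Pareto tail the ratio stays at $y^{-\alpha}$, while for a lognormal tail it keeps shrinking towards 0, as the definition of rapid variation requires for $y > 1$.

```python
import math


def lognormal_tail(x, mu=0.0, sigma=1.0):
    """Tail probability Pr[X > x] of a lognormal r.v., via the complementary error function."""
    return 0.5 * math.erfc((math.log(x) - mu) / (sigma * math.sqrt(2.0)))


def pareto_tail(x, alpha=1.5, beta=1.0):
    """Tail probability (x / beta)^(-alpha) of a Pareto r.v., valid for x >= beta."""
    return (x / beta) ** (-alpha)


y = 2.0
xs = [1e1, 1e2, 1e3, 1e4, 1e6]  # illustrative evaluation points

lognormal_ratios = [lognormal_tail(x * y) / lognormal_tail(x) for x in xs]
pareto_ratios = [pareto_tail(x * y) / pareto_tail(x) for x in xs]

for x, r_ln, r_par in zip(xs, lognormal_ratios, pareto_ratios):
    print(f"x = {x:9.0f}   lognormal ratio = {r_ln:.3e}   Pareto ratio = {r_par:.4f}")
```

The finite-$x$ ratios only illustrate the limit behaviour; they are not a proof of class membership.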
For an elaboration on the classes of heavy-tailed distributions and the
class of rapidly varying tailed distributions, and their applications in in-
surance and finance, the interested reader is referred to Bingham et al.
(1987), Embrechts et al. (1997) and Beirlant et al. (2004).
In Table 5.5 we list some well-known distributions and their corres-
ponding distribution class.
5.3.2 Asymptotic results
In this subsection, we derive some asymptotic results for the tail proba-
bility of sums of dependent r.v.’s, in the presence of heavy-tailedness. In
the following, we let {Xn, n = 1, 2, . . .} and {Yn, n = 1, 2, . . .} denote two
sequences of i.i.d. r.v.’s that are mutually independent. We write by FX
the d.f. of a r.v. X of which Xn, n = 1, 2, . . ., are considered to be inde-
pendent replicates, and assume it is supported on (−∞,+∞). Similarly,
we write by FY the d.f. of a r.v. Y of which Yn, n = 1, 2, . . ., are considered
to be independent replicates, and assume it is supported on (0,+∞). For
notational convenience, we will use the device of independent replicates
throughout.
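Name                 d.f. $F$ or density $f$                                                                            Parameters                          Class
Lognormal            $f(x) = \frac{1}{\sqrt{2\pi}\,\sigma x}\, e^{-\frac{1}{2}\left(\frac{\log x - \mu}{\sigma}\right)^{2}}$   ($\mu \in \mathbb{R}$, $\sigma > 0$)        $\mathcal{R}_{-\infty} \cap \mathcal{S}$
Weibull              $F(x) = 1 - e^{-c x^{\beta}}$                                                                      ($c > 0$, $0 < \beta < 1$)            $\mathcal{R}_{-\infty} \cap \mathcal{S}$
Benktander-I         $F(x) = 1 - c x^{-\alpha-1} e^{-\beta(\log x)^{2}} (\alpha + 2\beta \log x)$                        ($c, \alpha, \beta > 0$)              $\mathcal{R}_{-\infty} \cap \mathcal{S}$
Benktander-II        $F(x) = 1 - c\alpha x^{-(1-\beta)} \exp\{-(\alpha/\beta) x^{\beta}\}$                               ($c, \alpha > 0$, $0 < \beta < 1$)    $\mathcal{R}_{-\infty} \cap \mathcal{S}$
Pareto               $F(x) = 1 - \left(\frac{x}{\beta}\right)^{-\alpha}$                                                 ($\alpha, \beta > 0$)                 $\mathcal{R}$
Burr                 $F(x) = 1 - \left(1 + \frac{x^{\tau}}{\beta}\right)^{-\alpha}$                                      ($\alpha, \beta, \tau > 0$)           $\mathcal{R}$
Loggamma             $f(x) = \frac{\beta^{\alpha}}{\Gamma(\alpha)\, x} (\log x)^{\alpha-1} x^{-\beta}$                   ($\alpha, \beta > 0$)                 $\mathcal{R}$
Transformed $\beta$  $f(x) = \frac{|a|}{B(p,q)}\, x^{ap-1} (1 + x^{a})^{-(p+q)}$                                         ($a \in \mathbb{R}$, $p, q > 0$)      $\mathcal{R}$
Truncated α-stable   $F(x) = \Pr[|X| \le x]$, $X \sim$ α-stable                                                         ($1 < \alpha < 2$)                    $\mathcal{R}$
Table 5.5: Some well-known distributions and their distribution class.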
We state the following theorem:
Theorem 12.
Let Zi = Y1Y2 · · ·Yi and 0 < ai < +∞, i = 1, 2, . . .. If FY ∈ S ∩ R−∞,
then it holds for each n = 1, 2, . . . and x→ +∞ that
$$ \Pr\left[\sum_{i=1}^{n} a_i Z_i > x\right] \sim \sum_{i=1}^{n} \Pr\left[a_i Z_i > x\right]. \tag{5.20} $$
Proof. See section 5.6.
In an actuarial context the sequence {ai, i = 1, 2, . . .} can be regarded as a
sequence of deterministic payments. The following theorem applies to the
case in which the payments consist of both a deterministic and a random
component, and the deterministic component is either an additive or a
multiplicative constant. The theorem is an extension of Theorems 5.1 and
5.2 of Tang & Tsitsiashvili (2003):
Theorem 13.
Let Zi = Y1Y2 · · ·Yi and 0 < ai < +∞, i = 1, 2, . . .. If the following
conditions are valid:
1. FX ∈ D ∩ L,
2. FY ∈ R−∞,
then it holds for each n = 1, 2, . . . and x→ +∞ that
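$$ \Pr\left[\sum_{i=1}^{n} (a_i + X_i) Z_i > x\right] \sim \sum_{i=1}^{n} \Pr\left[(a_i + X) Z_i > x\right]. \tag{5.21} $$
Furthermore, it holds for each n = 1, 2, . . . and x→ +∞ that
$$ \Pr\left[\sum_{i=1}^{n} (a_i X_i) Z_i > x\right] \sim \sum_{i=1}^{n} \Pr\left[(a_i X) Z_i > x\right]. \tag{5.22} $$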
Proof. See section 5.6.
Corollary 3.
Under the conditions stated in Theorem 13, we have for each n = 1, 2, . . .
and x→ +∞ that
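$$ \Pr\left[\sum_{i=1}^{n} (a_i + X_i) Z_i > x\right] - \Pr\left[\sum_{i=1}^{n-1} (a_i + X_i) Z_i > x\right] \sim \Pr\left[(a_n + X) Z_n > x\right]. \tag{5.23} $$
Furthermore, it holds for each n = 1, 2, . . . and x→ +∞ that
$$ \Pr\left[\sum_{i=1}^{n} (a_i X_i) Z_i > x\right] - \Pr\left[\sum_{i=1}^{n-1} (a_i X_i) Z_i > x\right] \sim \Pr\left[a_n X Z_n > x\right]. \tag{5.24} $$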
Proof. See section 5.6.
Corollary 4.
If condition 1. stated in Theorem 13 is replaced by “FX ∈ R−α”, while the
other conditions remain the same, then it holds for each n = 1, 2, . . . and
x→ +∞ that
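$$ \Pr\left[\sum_{i=1}^{n} (a_i + X_i) Z_i > x\right] \sim \sum_{i=1}^{n} \bar{F}_X(x - a_i) \left(E[Y^{\alpha}]\right)^{i} \tag{5.25} $$
and
$$ \Pr\left[\sum_{i=1}^{n} (a_i X_i) Z_i > x\right] \sim \bar{F}_X(x) \sum_{i=1}^{n} a_i^{\alpha} \left(E[Y^{\alpha}]\right)^{i}. \tag{5.26} $$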
Proof. See section 5.6.
We remark that the particular case of lognormally distributed payments is
not covered by Theorem 13, since the lognormal distribution does not belong
to the intersection D ∩ L. The lognormal distribution has a moderately
heavy tail and has been a popular model for loss severity distributions.
Hence, we state the following theorem:
Theorem 14.
Relations (5.21), (5.22), (5.23) and (5.24) remain valid if conditions 1.
and 2. stated in Theorem 13 are replaced by
1’ $X \sim \log N(\mu_X, \sigma_X^{2})$, $-\infty < \mu_X < +\infty$ and $\sigma_X > 0$,
2’ $Y \sim \log N(\mu_Y, \sigma_Y^{2})$, $-\infty < \mu_Y < +\infty$ and $\sigma_Y > 0$,
3’ $\sigma_X > \sigma_Y$.
Proof. See section 5.6.
5.3.3 Application: discounted loss reserves
In this subsection, we consider the problem of determining stop-loss premiums
and quantiles for discounted loss reserves. We denote by the r.v.
$X_i$ from the i.i.d. sequence $\{X_i, i = 1, \ldots, n\}$ the net loss in year $i$. Furthermore,
the positive r.v. $Y_i$ from the i.i.d. sequence $\{Y_i, i = 1, \ldots, n\}$
represents the present value discounting factor from year $i$ to year $i - 1$.
The two sequences $\{X_i, i = 1, \ldots, n\}$ and $\{Y_i, i = 1, \ldots, n\}$ are considered
to be mutually independent. Then the discounted loss reserve $S$ is given
by
$$ S = \sum_{i=1}^{n} X_i \prod_{j=1}^{i} Y_j. \tag{5.27} $$
Henceforth, we impose that $E[S I_{(S>0)}] < +\infty$, which is implied by the
condition that $E[X I_{(X>0)}] < +\infty$ and $E[Y] < +\infty$.
Approximate values for the stop-loss premium and quantiles of the discounted
loss reserve $S$ may be obtained by using the previously obtained
asymptotic results. In particular, if $X$ and $Y$ satisfy the corresponding conditions
under which Theorem 13 or Theorem 14 holds, then for sufficiently
large values of the retention $d$, the stop-loss premium can be approximated
by
$$ \pi(S, d) \approx \sum_{i=1}^{n} \int_{d}^{+\infty} \bar{F}_{X \prod_{j=1}^{i} Y_j}(s)\, ds = \sum_{i=1}^{n} \pi\Big(X \prod_{j=1}^{i} Y_j,\, d\Big). \tag{5.28} $$
Since the d.f. of $X \prod_{j=1}^{i} Y_j$ will generally not be analytically tractable,
Monte Carlo simulation may still be required. However, the number of
simulations has been reduced considerably.
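In case $F_X \in \mathcal{R}_{-\alpha}$, $0 < \alpha < +\infty$, and $F_Y \in \mathcal{R}_{-\infty}$, the asymptotic
approximations for the stop-loss premium of $S$ reduce to
$$ \pi(S, d) \approx \int_{d}^{+\infty} \sum_{i=1}^{n} \left(E[Y^{\alpha}]\right)^{i} \bar{F}_X(s)\, ds = \sum_{i=1}^{n} \left(E[Y^{\alpha}]\right)^{i} \pi(X, d). \tag{5.29} $$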
Furthermore, in this case we have for sufficiently large values of $p$, that the
asymptotic approximation for the $p$-quantile is given by
$$ F_S^{-1}(p) \approx \inf\left\{ s : \sum_{i=1}^{n} \left(E[Y^{\alpha}]\right)^{i} \bar{F}_X(s) \le 1 - p \right\}. \tag{5.30} $$
Under the conditions of Theorem 14, we have for sufficiently large values
of $p$, that the asymptotic approximation for the $p$-quantile is given by
$$ F_S^{-1}(p) \approx \inf\left\{ s : \sum_{i=1}^{n} \bar{F}_{X \prod_{j=1}^{i} Y_j}(s) \le 1 - p \right\}. \tag{5.31} $$
We emphasize that the approximation (5.31) is not in general valid under
the conditions of Theorem 13; it requires the additional condition that
$F_X \in \mathcal{R}_{-\alpha}$, $0 < \alpha < \infty$.
As an example, we consider $X_i \sim GPD(\alpha, \beta)$ and $Y_i \sim \log N(\mu, \sigma^{2})$, $i =
1, \ldots, n$, in which $GPD(\alpha, \beta)$ denotes the generalized Pareto distribution
with d.f.
$$ F_X(x) = 1 - \left(1 + \frac{x}{\beta}\right)^{-\alpha}, \qquad x > 0, $$
where $\alpha > 0$ and $\beta > 0$. Then, clearly we have that $F_X \in \mathcal{R}_{-\alpha}$ and $F_Y \in
\mathcal{R}_{-\infty}$. Hence, the asymptotic approximations (5.29) and (5.30) are valid.
Notice that for the example considered, the asymptotic approximations can
even be computed analytically. We performed 5 000 000 Monte Carlo (MC)
simulations for quantiles and stop-loss premiums to assess the quality of the
asymptotic approximations (5.29) and (5.30), under various specifications
of the parameter $n$. We fix the parameter values: $\alpha = 1.5$, $\beta = 1$, $\mu =
-0.04$ and $\sigma = 0.10$. The results are presented in Table 5.6. Ndiff. refers
to the normalized difference defined as $\frac{MC - Appr.}{MC} \times 100\%$. Our numerical
results demonstrate that the asymptotic approximations are typically close
to the Monte Carlo value.
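For this example the analytic evaluation of (5.29) and (5.30) can be sketched as follows. The function names are hypothetical; the closed forms used are the standard lognormal moment $E[Y^{\alpha}] = \exp(\alpha\mu + \frac{1}{2}\alpha^{2}\sigma^{2})$ and the stop-loss premium $\pi(X, d) = \frac{\beta}{\alpha-1}(1 + d/\beta)^{-(\alpha-1)}$ of the generalized Pareto distribution above, which requires $\alpha > 1$. The quantile follows by solving the equality in (5.30) for $s$.

```python
import math

# Parameter values of the example in the text.
ALPHA, BETA = 1.5, 1.0      # generalized Pareto claim sizes
MU, SIGMA = -0.04, 0.10     # lognormal discount factors


def discount_moment(alpha=ALPHA, mu=MU, sigma=SIGMA):
    """E[Y^alpha] for Y ~ logN(mu, sigma^2)."""
    return math.exp(alpha * mu + 0.5 * (alpha * sigma) ** 2)


def weight_sum(n):
    """sum_{i=1}^{n} (E[Y^alpha])^i, the common factor in (5.29) and (5.30)."""
    c = discount_moment()
    return sum(c ** i for i in range(1, n + 1))


def gpd_stop_loss(d, alpha=ALPHA, beta=BETA):
    """pi(X, d): integral of the GPD tail (1 + s/beta)^(-alpha) over (d, +inf), alpha > 1."""
    return beta / (alpha - 1.0) * (1.0 + d / beta) ** (-(alpha - 1.0))


def approx_stop_loss(n, d):
    """Asymptotic stop-loss premium of S according to (5.29)."""
    return weight_sum(n) * gpd_stop_loss(d)


def approx_quantile(n, p, alpha=ALPHA, beta=BETA):
    """Asymptotic p-quantile of S according to (5.30), solved in closed form."""
    return beta * ((weight_sum(n) / (1.0 - p)) ** (1.0 / alpha) - 1.0)


print(round(approx_stop_loss(3, 15), 2))   # retention d = 15, n = 3
print(round(approx_quantile(3, 0.99), 1))  # level p = 0.99, n = 3
```

With $n = 3$ this sketch gives a stop-loss premium of about 1.36 at $d = 15$ and a quantile of about 41 at $p = 0.99$, in line with the corresponding "Appr." entries of Table 5.6.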
n = 3
d      MC     Appr.   Ndiff.      p       MC    Appr.   Ndiff.
15     1.50   1.36    9%          0.95    16    14      15%
20     1.28   1.19    7%          0.975   25    22      11%
25     1.14   1.07    6%          0.99    44    41      7%
30     1.03   0.98    5%          0.995   69    66      4%
35     0.95   0.91    4%          0.999   198   194     2%
40     0.88   0.85    4%
50     0.78   0.76    3%
60     0.71   0.70    2%
80     0.61   0.61    1%
100    0.55   0.54    1%
150    0.44   0.44    0%
200    0.38   0.38    0%

n = 5
d      MC     Appr.   Ndiff.      p       MC    Appr.   Ndiff.
20     2.22   1.89    15%         0.95    24    19      22%
30     1.75   1.56    11%         0.975   36    30      17%
40     1.48   1.35    9%          0.99    63    57      10%
60     1.18   1.11    6%          0.995   96    90      6%
80     1.01   0.96    5%          0.999   274   265     3%
100    0.90   0.86    4%
150    0.72   0.70    3%
200    0.62   0.61    2%
250    0.56   0.55    2%
300    0.51   0.50    2%

n = 10
d      MC     Appr.   Ndiff.      p       MC    Appr.   Ndiff.
40     2.91   2.41    17%         0.95    40    28      30%
60     2.22   1.98    11%         0.975   58    45      23%
80     1.86   1.72    7%          0.99    98    84      14%
100    1.62   1.54    5%          0.995   148   133     10%
150    1.28   1.26    2%          0.999   402   390     3%
200    1.09   1.09    0%
300    0.87   0.88    -1%
400    0.74   0.75    -1%

Table 5.6: Approximations for stop-loss premiums and quantiles of S for
Pareto claim sizes and lognormal present value discounting factors.
5.4 The Bayesian approach
Some comments on notation are needed at this point. First p(.|.) denotes
a conditional probability density with the arguments determined by the
context, and similarly for p(·), which denotes a marginal distribution. The
same notation is used for continuous density functions and discrete prob-
ability mass functions.
5.4.1 Introduction
Bayesian theory is a powerful branch of statistics not yet fully explored by
practising actuaries. One of its main benefits, which is the core of its
philosophy, is the ability to include subjective information in a formal
framework. Apart from this, the wide range of models offered by this
branch of statistics is also one of the main reasons why it has been studied
so much recently.
Since the early 1990s, statistics (and to a lesser extent econometrics)
has seen an explosion in applied Bayesian research. This explosion has
had little to do with a renewed interest of the statistics and econometrics
communities in the theoretical foundations of Bayesianism, or with a sudden
awakening to the merits of the Bayesian approach over frequentist methods,
but instead can be primarily explained on pragmatic grounds. The
recent developments are mainly due, firstly, to advances in computing
that have made it easier to perform calculations by simulation
and, secondly, to the failure of classical statistical methods to give solutions
to many problems. Indeed, the use of such tools often enables researchers
to estimate complicated statistical models that would be quite difficult, if
not virtually impossible, to estimate using standard frequentist techniques.
But, although so many developments have been occurring in Bayesian statistics,
very few actuaries are aware of them and even fewer make use of them.
The purpose of this section is to sketch, in very broad terms, basic elements
of Bayesian computation.
Classical statistics provides methods to analyze data, from simple de-
scriptive measures to complex and sophisticated models. The available
data are processed and then conclusions about a hypothetical population,
of which the data available is supposed to be a representative sample, are
drawn. It is not hard to imagine situations, however, in which data are not
the only available source of information about the population. Bayesian
methods provide a principled way to incorporate this external information
into the data analysis process. To do so, however, Bayesian methods have
to change entirely the vision of the data analysis process with respect to
the classical approach. In a Bayesian approach, the data analysis process
starts already with a given probability distribution. As this distribution is
given before any data is considered, it is called the prior distribution.
Bayesian methods allow us to assign prior distributions to the param-
eters in the model which capture known qualitative and quantitative fea-
tures, and then to update these priors in the light of the data, yielding a
posterior distribution via Bayes’ theorem
Posterior ∝ Likelihood × Prior,
where ∝ denotes that two quantities are proportional to each other. Hence
the posterior distribution is found by combining the prior distribution for
the parameters with the probability of observing the data given the param-
eters (the likelihood). The ability to include prior information in the model
is not only an attractive pragmatic feature of the Bayesian approach, it is
theoretically vital for guaranteeing coherent inferences.
More formally Bayes’ theorem is defined as follows. Consider a process
in which observations ($\vec{Y}$ is the vector of observations) are to be taken
from a distribution for which the probability density function is $p(\vec{Y}|\vec{\theta})$,
where $\vec{\theta}$ is a set of unknown parameters. Before any observation is made,
the analyst would include all his previous information and judgements of $\vec{\theta}$
in a prior distribution $p(\vec{\theta})$, that would be combined with the observations
to give a posterior distribution $p(\vec{\theta}|\vec{Y})$ in the following way:
$$ p(\vec{\theta}|\vec{Y}) \propto p(\vec{Y}|\vec{\theta})\, p(\vec{\theta}). $$
Bayesian modelling involves integrals over the parameters, whereas non-
Bayesian methods often rely on optimization of the parameters. The main
difference between these methods is that optimization fails to take into
account the inherent uncertainty in the parameters. There is no true value
for each of the parameters which can be found by optimization. Instead,
there is a range of plausible values, each with some associated density.
The mechanism of the Bayesian approach to model fitting and inference
consists of three basic steps:
1. Assign priors to all the unknown parameters;
2. Write down the likelihood of the data given the parameters;
3. Determine the posterior distribution of the parameters given the data
using Bayes’ theorem.
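The three steps can be sketched on a toy problem. The example below is an illustration only, not a model used in this thesis: hypothetical claim counts are assumed to be i.i.d. Poisson with unknown mean $\theta$, the prior on $\theta$ is a gamma distribution with assumed shape 2 and rate 1, and the posterior is obtained numerically on a grid by normalising likelihood times prior. Because the gamma prior is conjugate to the Poisson likelihood, the grid result can be checked against the known gamma posterior with shape $a + \sum_i y_i$ and rate $b + n$.

```python
import math


def gamma_prior_pdf(theta, shape, rate):
    """Step 1: prior density of the unknown Poisson mean theta."""
    return rate ** shape / math.gamma(shape) * theta ** (shape - 1.0) * math.exp(-rate * theta)


def poisson_log_likelihood(counts, theta):
    """Step 2: log-likelihood of i.i.d. Poisson counts given theta."""
    return sum(y * math.log(theta) - theta - math.lgamma(y + 1) for y in counts)


def grid_posterior(counts, shape, rate, grid):
    """Step 3: posterior weights on the grid, proportional to likelihood x prior."""
    log_post = [
        math.log(gamma_prior_pdf(t, shape, rate)) + poisson_log_likelihood(counts, t)
        for t in grid
    ]
    shift = max(log_post)                      # avoid underflow before exponentiating
    weights = [math.exp(v - shift) for v in log_post]
    total = sum(weights)
    return [w / total for w in weights]


counts = [3, 5, 4, 6, 2]                       # hypothetical observations
prior_shape, prior_rate = 2.0, 1.0             # assumed prior parameters
step = 0.01
grid = [step * (k + 0.5) for k in range(2000)]  # midpoints covering (0, 20)

posterior = grid_posterior(counts, prior_shape, prior_rate, grid)
grid_mean = sum(t * w for t, w in zip(grid, posterior))
conjugate_mean = (prior_shape + sum(counts)) / (prior_rate + len(counts))
print(grid_mean, conjugate_mean)
```

The grid approach only works for very low-dimensional parameters; it is used here because it makes the proportionality in Bayes’ theorem explicit.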
Bayesian inference is quite simple to describe probabilistically, but there
have been two major obstacles to its popularity. The first is how to specify
prior distributions, and the second is how to evaluate the integrals required
for inference, given that for most models, these are analytically intractable.
This will be discussed in short in the next two subsections.
5.4.2 Prior choice
The prior distribution can arise from data previously observed, or it can be
the subjective assessment of some domain expert and, as such, it represents
the information we have about the problem at hand, that is not conveyed
by the sample data.
Several methods for eliciting prior densities from experts exist. See,
e.g. O’Hagan (1994) for a comprehensive review. A common approach is
to choose a prior distribution with density function similar to the likelihood
function. In doing so, the posterior distribution of $\vec{\theta}$ will be in the same
class and the prior is said to be conjugate to the likelihood. The conjugate
family is mathematically convenient in that the posterior distribution fol-
lows a known parametric form. Of course, if information is available that
contradicts the conjugate parametric family, it may be necessary to use a
more realistic, if inconvenient, prior distribution. The basic justification
for the use of conjugate prior distributions is similar to that for using stan-
dard models for the likelihood: it is easy to understand the corresponding
results, which can often be put in analytic form. Next, they are often a
good approximation, and they simplify computations. Although they can
make interpretations of posterior inferences less transparent and compu-
tation more difficult, non-conjugate prior distributions do not pose any
new conceptual problem. In practice, for complicated models, conjugate
prior distributions may not even be possible. In general, the exponential
families are the only classes of distributions that have natural conjugate
distributions, since, apart from certain irregular cases, the only distribu-
tions having a fixed number of sufficient statistics are of the exponential
type.
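As a small illustration of conjugacy, the sketch below uses a Poisson likelihood with a gamma prior on the Poisson mean. This particular pair is chosen here for illustration and is not taken from the text; the function name `gamma_poisson_update` and all numbers are hypothetical.

```python
def gamma_poisson_update(shape, rate, counts):
    """Posterior gamma parameters for a Poisson mean.

    Assumes a Gamma(shape, rate) prior and i.i.d. Poisson counts; the
    posterior is again a gamma distribution, so the prior is conjugate.
    """
    return shape + sum(counts), rate + len(counts)


# Hypothetical prior and data, chosen only for illustration.
prior_shape, prior_rate = 2.0, 1.0
counts = [3, 5, 4]
post_shape, post_rate = gamma_poisson_update(prior_shape, prior_rate, counts)
posterior_mean = post_shape / post_rate
```

The posterior stays in the gamma family, so no numerical integration is needed to obtain its mean or any other summary.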
Kass and Wasserman (1996) survey formal rules that have been suggested
for choosing a prior. Many of these rules reflect the desire to let the
“data speak for themselves”, so that inferences are unaffected by information
external to the current data. This has led to a variety of priors with
names like conventional, default, diffuse, flat, formal, generic, indifference,
neutral, non-informative, objective, reference, and vague priors. Prior dis-
tributions playing a minimal role in the posterior distribution are called
‘reference prior distributions’. One interpretation of letting the data speak
for themselves is to use classical techniques. Maximum likelihood estimates
are rationalizable in a Bayesian framework by appropriate choice of prior
distribution, specifically a uniform prior.
There are many ways of defining a non-informative prior. The main
objective is to give as little subjective information as possible. So, usually
a prior distribution with a large value for the variance is used. Another
way of including the minimal prior information is to find estimates of the
parameters of the prior distribution, using the data. This last approach
is called the empirical Bayes method, but often there is a relationship
between those two approaches — non-informative and empirical Bayes.
A commonly used reference prior in Bayesian analysis is Jeffreys’ prior
(see Jeffreys (1946)). This choice is based on considering one-to-one
transformations h(~θ) of the parameter. Jeffreys’ general principle is that any
rule for determining the prior density p(~θ) should yield an equivalent re-
sult if applied to the transformed parameter. This non-informative prior
is obtained by applying Jeffreys’ rule, which is to take the prior density
to be proportional to the square root of the determinant of the Fisher
information matrix. This prior exhibits many nice features that make it
an attractive reference prior. One such property is parametrization invari-
ance. Although Jeffreys’ rule has many desirable properties, it should be
used with caution.
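In symbols, Jeffreys' rule and its invariance property can be sketched as follows. The scalar change-of-variables step and the Poisson example are added here for illustration and assume the usual regularity conditions under which the Fisher information equals the negative expected Hessian of the log-likelihood.

```latex
\[
  p(\vec{\theta}) \propto \sqrt{\det I(\vec{\theta})}, \qquad
  I(\vec{\theta})_{rs} = -E\!\left[
    \frac{\partial^2 \log f(\vec{Y}\mid\vec{\theta})}
         {\partial\theta_r\,\partial\theta_s}
    \,\Big|\, \vec{\theta}\right].
\]
% Scalar case, one-to-one transformation \phi = h(\theta):
\[
  I(\phi) = I(\theta)\left(\frac{d\theta}{d\phi}\right)^{2}
  \quad\Longrightarrow\quad
  \sqrt{I(\phi)} = \sqrt{I(\theta)}\,\left|\frac{d\theta}{d\phi}\right| ,
\]
% which is exactly the change-of-variables rule for a density.
% Illustrative example: Y \sim \mathrm{Poisson}(\lambda)
\[
  I(\lambda) = \frac{1}{\lambda}, \qquad
  p(\lambda) \propto \lambda^{-1/2}, \quad \lambda > 0 .
\]
```

In the Poisson example the density $\lambda^{-1/2}$ has no finite integral over $(0, +\infty)$, so the resulting prior is improper.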
In most cases, Jeffreys’ prior is technically not a probability distribu-
tion, since the density function does not have a finite integral over the
parameter space. It is then termed an improper prior. It is often the case
that Bayesian inference based on improper priors returns proper posterior
distributions which then turn out to be numerically equivalent to the re-
sults of classical inference. Problems related to the use of improper prior
distributions can be overcome by assigning prior distributions that are as
uniform as possible but still remain probability distributions. The use of
uniform prior distributions to represent uncertainty clearly assumes that
“equally probable” is an adequate representation of “lack of information”.
Theoretically, a prior distribution could be included for all the parame-
ters that are unknown in a model, so that any model could be represented in
a Bayesian way. However, this often leads to intractable problems (mainly
integrals without solution). So the main limitation of Bayesian theory is
the difficulty, and in many cases the impossibility, of solving the required
equations analytically.
In the last decade many simulation techniques have been developed
in order to solve this problem and to obtain estimates of the posterior
distribution. These techniques were turning points for the Bayesian theory,
making it possible to apply many of its models. On one hand, the use
of a final and closed formula for a solution is, generally speaking, more
satisfactory than the use of an approximation through simulation. On the
other hand, simulation gives a larger range of models for which solutions
(or at least good approximations) can be obtained.
5.4.3 Iterative simulation methods
In order to illustrate the simulation philosophy, suppose that the posterior
of a specific parameter ~θ is needed. If an analytical solution was available,
a formula would be derived, where the observed data and known param-
eters would be included, defining a final result. But, depending on the
model, this solution will not be possible. In such cases an approximation
for the posterior distribution of ~θ is needed. One way of finding this
approximation is by simulation, which replaces the posterior distribution
by a large sample of ~θ based on the characteristics of the model. With this
large sample of ~θ many summary statistics could be calculated, like the
mean, variance or histogram, extracting all the information needed from
this sample of the posterior distribution.
There are a number of ways of simulating and in all of them some
checking should be carried out to guarantee that the simulation set is
really representative for the required distribution. For instance, it must
be checked whether the simulation is mixing well or, in other words, if the
simulation procedure is visiting all the possible values for ~θ. It should
also be considered how large the sample should be, and it should be checked
that the initial point where the simulation starts does not play a big role. Among many
other issues, the moment when convergence to the true distribution of ~θ is
achieved should also be monitored.
The most popular simulation methods in Bayesian theory are the Markov
Chain Monte Carlo (MCMC) methods. This class of simulation models
has been used in a large number and wide range of applications, and has
been found to be very powerful. The essence of the MCMC method is that
by sampling from specific simple distributions (derived from the combina-
tion of the likelihood and prior distributions), a sample from the posterior
distribution will be obtained in an asymptotic way.
Iterative simulation methods, particularly the Gibbs sampler and the
Metropolis-Hastings algorithm, are powerful statistical tools that facilitate
computation in a variety of complex models. Though these two algorithms
are commonly presented as useful yet distinct instruments for simulating
joint posteriors, this distinction is rather artificial - indeed, one can regard
the Gibbs sampler as a special case of the Metropolis-Hastings algorithm
where jumps along the complete conditional distributions are accepted with
probability one. In conditionally conjugate models, the Gibbs sampler is
typically the algorithm of choice (since the complete posterior conditionals
are easily sampled).
The general strategy with iterative methods is to follow the steps of the
algorithms to generate a series of draws (sometimes called a parameter
chain), say θ0, θ1, θ2, . . . that converge in distribution to some target density
- in our case, the posterior f(θ|~Y ). The algorithms are constructed so that
the posterior f(θ|~Y ) is the unique stationary distribution of the parameter
chain. Once convergence to the target density is “achieved” we can use
these draws in the same way as with direct Monte Carlo integration to
calculate posterior means, posterior standard deviations, and so on. In
practice, we take care to diagnose that the parameter chain has approached
convergence to the target density, to discard the initial set of the pre-
convergence draws (often called the burn-in period), and then to use the
post-convergence sample to calculate the desired quantities. Unlike the
non-iterative methods discussed previously, the post-convergence draws
we obtain using these iterative methods will prove to be correlated, as
the distribution of, say, θt depends on the last parameter sampled in the
chain, θt−1. If the correlation among the draws is severe, it may prove to be
difficult to traverse the entire parameter space, and the numerical standard
errors associated with the point estimates can be quite large. When the
simulations are highly correlated, and our chain makes only small local
movements from iteration to iteration, we refer to this as slow mixing of
the parameter chain.
One can find an excellent overview and a detailed discussion of exam-
ples of MCMC algorithms in, for example, Gilks et al. (1996). Here we
will describe Gibbs Sampling (GS), a special case of Metropolis-Hastings
algorithms, which is becoming increasingly popular in the statistical
community. GS is an iterative method that produces a Markov chain, that is,
a sequence of values $\{\vec{\theta}^{(0)}, \vec{\theta}^{(1)}, \vec{\theta}^{(2)}, \ldots\}$ such that $\vec{\theta}^{(i+1)}$ is sampled from a
distribution that depends on the current state $\vec{\theta}^{(i)}$ of the chain. The algorithm
works as follows.
Let $\vec{\theta}^{(0)} = \{\theta_1^{(0)}, \ldots, \theta_k^{(0)}\}$ be a vector of initial values of $\vec{\theta}$ and suppose that
the conditional distributions of $\theta_i \mid (\theta_1, \ldots, \theta_{i-1}, \theta_{i+1}, \ldots, \theta_k, \vec{Y})$ are known
for each $i$. The first value in the chain is simulated as follows:
$\theta_1^{(1)}$ is sampled from the conditional distribution of $\theta_1 \mid (\theta_2^{(0)}, \ldots, \theta_k^{(0)}, \vec{Y})$;
$\theta_2^{(1)}$ is sampled from the conditional distribution of $\theta_2 \mid (\theta_1^{(1)}, \theta_3^{(0)}, \ldots, \theta_k^{(0)}, \vec{Y})$;
...
$\theta_k^{(1)}$ is sampled from the conditional distribution of $\theta_k \mid (\theta_1^{(1)}, \theta_2^{(1)}, \ldots, \theta_{k-1}^{(1)}, \vec{Y})$.
Then $\vec{\theta}^{(0)}$ is replaced by $\vec{\theta}^{(1)}$ and the simulation is repeated to generate
$\vec{\theta}^{(2)}$, and so forth. In general, the $i$-th value in the chain is generated by
simulating from the distribution of $\vec{\theta}$ conditional on the value previously
generated, $\vec{\theta}^{(i-1)}$. After an initial long chain, called burn-in, of say $b$ iterations,
the values $\{\vec{\theta}^{(b+1)}, \vec{\theta}^{(b+2)}, \vec{\theta}^{(b+3)}, \ldots\}$ will be approximately a sample
from the posterior distribution of $\vec{\theta}$, from which empirical estimates of the
posterior means and any other function of the parameters can be computed.
Critical issues for this method are the choice of the starting value
$\vec{\theta}^{(0)}$, the length of the burn-in and the selection of a stopping rule. The
program “WinBugs” provides an implementation of GS suitable for problems
in which the likelihood function satisfies certain factorization properties.
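The Gibbs sampling steps can be sketched for a very small model. The example below is an illustration only, not the WinBugs implementation: it assumes i.i.d. normal data with unknown mean and variance and the non-informative prior proportional to $1/\sigma^2$, for which both full conditionals are standard distributions. The function name `gibbs_normal`, the simulated data and the chain lengths are hypothetical choices.

```python
import math
import random
import statistics


def gibbs_normal(y, n_iter, burn_in, rng, mu=0.0, sigma2=1.0):
    """Gibbs sampler for y_i ~ N(mu, sigma2), prior proportional to 1/sigma2."""
    n = len(y)
    ybar = statistics.fmean(y)
    kept = []
    for it in range(n_iter):
        # Full conditional of mu given sigma2: N(ybar, sigma2 / n).
        mu = rng.gauss(ybar, math.sqrt(sigma2 / n))
        # Full conditional of sigma2 given mu: sum of squares divided by
        # a chi-square draw with n degrees of freedom (gamma, shape n/2, scale 2).
        ss = sum((yi - mu) ** 2 for yi in y)
        sigma2 = ss / rng.gammavariate(n / 2.0, 2.0)
        if it >= burn_in:  # discard the burn-in draws
            kept.append((mu, sigma2))
    return kept


rng = random.Random(0)
data = [rng.gauss(5.0, 2.0) for _ in range(50)]  # simulated, hypothetical data
chain = gibbs_normal(data, n_iter=5000, burn_in=1000, rng=rng)
post_mean_mu = statistics.fmean(m for m, _ in chain)
post_mean_sigma2 = statistics.fmean(s for _, s in chain)
```

Each sweep updates one component conditional on the latest value of the other, and only the draws after the burn-in are kept for the posterior summaries.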
5.4.4 Bayesian model set-up
In this subsection we explain how to set up the relevant Bayesian models
and draw samples from posterior distributions for parameters ~θ and future
observables Y .
We show how simple simulation methods can be used to draw samples
from posterior and predictive distributions, automatically incorporating
uncertainty in the model parameters, and draw samples for posterior pre-
dictive checks.
The simplest and most widely used version of this model is the normal
linear model, in which the distribution of the response variable ~Y given
the regression matrix X is normal with mean a linear function of X:
\[ E[Y_i \mid \vec{\beta}, X] = \beta_1 x_{i1} + \cdots + \beta_k x_{ik}, \]
for i = 1, . . . , n. We further restrict to the case of ordinary linear regres-
sion, in which the conditional variances are equal, Var[Yi|~θ,X] = σ2 for
all i, and the observations are conditionally independent given ~θ,X. The
parameter vector is then ~θ = (β1, . . . , βk, σ2).
Under a standard non-informative prior distribution, the Bayesian es-
timates and standard errors coincide with the classical results. In the
simplest case, called ordinary linear regression, the observation errors are
independent and have equal variance. In vector notation, the model is given by
\[ \vec{Y} \mid \vec{\beta}, \sigma^2, X \sim N_n(X\vec{\beta}, \sigma^2 I), \]
where I is the n × n identity matrix. In the normal regression model, a
convenient non-informative prior distribution is uniform on (~β, log σ) or,
equivalently,
\[ p(\vec{\beta}, \sigma^2 \mid X) \propto \sigma^{-2}. \]
When there are many data points and only a few parameters, the non-
informative prior distribution is useful — it gives acceptable results and
takes less effort than specifying prior knowledge in probabilistic form. For
a small sample size or a large number of parameters, the likelihood is less
sharply peaked, and so prior distributions are more important.
We determine first the posterior distribution for ~β, conditional on σ2,
and then the marginal posterior distribution for σ2. That is, we factor the
joint posterior distribution for ~β and σ2 as p(~β, σ2|~Y ) = p(~β|σ2, ~Y )p(σ2|~Y ).
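1. Conditional posterior distribution of $\vec{\beta}$ given $\sigma^2$
\[ \vec{\beta} \mid \sigma^2, \vec{Y} \sim N(\hat{\vec{\beta}}, V_{\vec{\beta}}\,\sigma^2), \]
with
\[ \hat{\vec{\beta}} = (X'X)^{-1}X'\vec{Y} \]
and
\[ V_{\vec{\beta}} = (X'X)^{-1}. \]
2. Marginal posterior distribution of $\sigma^2$
\[ \sigma^2 \mid \vec{Y} \sim \text{Inv-}\chi^2(n-k, s^2), \]
where
\[ s^2 = \frac{1}{n-k}\,(\vec{Y} - X\hat{\vec{\beta}})'(\vec{Y} - X\hat{\vec{\beta}}). \]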
The marginal posterior distribution of $\vec{\beta} \mid \vec{Y}$, averaging over $\sigma^2$, is multivariate
$t$ with $n-k$ degrees of freedom, but we rarely use this fact in practice
when drawing inferences by simulation, since to characterize the joint
posterior distribution we can draw simulations of $\sigma^2$ and then $\vec{\beta} \mid \sigma^2$. The
standard non-Bayesian estimates of $\vec{\beta}$ and $\sigma^2$ are $\hat{\vec{\beta}}$ and $s^2$, respectively,
as just defined. The classical standard error estimate for $\vec{\beta}$ is obtained by
setting $\sigma^2 = s^2$.
It is easy to draw samples from the posterior distribution: first compute
$\hat{\vec{\beta}}$, $V_{\vec{\beta}}$ and $s^2$, then draw $\sigma^2$ from the scaled inverse-$\chi^2$ distribution
and $\vec{\beta}$ from the multivariate normal distribution.
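The sampling recipe can be sketched with the standard library only. This is a minimal illustration under the non-informative prior above, not the code used in the thesis; the helper names (`cholesky`, `forward_sub`, `back_sub_t`, `posterior_draws`) and the small data set are hypothetical. The draw of $\vec{\beta}$ uses the Cholesky factor of $X'X$ instead of an explicit matrix inverse.

```python
import math
import random


def cholesky(a):
    """Lower-triangular low with low low' = a (a symmetric positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][m] * low[j][m] for m in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def forward_sub(low, b):
    """Solve low z = b for a lower-triangular matrix low."""
    z = []
    for i, row in enumerate(low):
        z.append((b[i] - sum(row[m] * z[m] for m in range(i))) / row[i])
    return z


def back_sub_t(low, b):
    """Solve low' u = b, with low' the transpose of lower-triangular low."""
    n = len(low)
    u = [0.0] * n
    for i in reversed(range(n)):
        s = sum(low[m][i] * u[m] for m in range(i + 1, n))
        u[i] = (b[i] - s) / low[i][i]
    return u


def posterior_draws(x, y, n_draws, rng):
    """Draw (beta, sigma2) pairs under the prior proportional to sigma^-2."""
    n, k = len(x), len(x[0])
    xtx = [[sum(r[a] * r[b] for r in x) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * yi for r, yi in zip(x, y)) for a in range(k)]
    low = cholesky(xtx)                                # X'X = low low'
    beta_hat = back_sub_t(low, forward_sub(low, xty))  # (X'X)^-1 X'y
    resid = [yi - sum(b * v for b, v in zip(beta_hat, r)) for r, yi in zip(x, y)]
    s2 = sum(e * e for e in resid) / (n - k)
    draws = []
    for _ in range(n_draws):
        # sigma2 ~ Inv-chi2(n-k, s2), drawn as (n-k) s2 / chi2_{n-k}.
        sigma2 = (n - k) * s2 / rng.gammavariate((n - k) / 2.0, 2.0)
        # beta | sigma2 ~ N(beta_hat, sigma2 (X'X)^-1), via low'^-1 z.
        z = [rng.gauss(0.0, 1.0) for _ in range(k)]
        u = back_sub_t(low, z)
        beta = [bh + math.sqrt(sigma2) * ui for bh, ui in zip(beta_hat, u)]
        draws.append((beta, sigma2))
    return beta_hat, s2, draws


# Hypothetical data: intercept plus one covariate, small deterministic noise.
x = [[1.0, float(i)] for i in range(10)]
y = [1.0 + 2.0 * i + 0.1 * (-1) ** i for i in range(10)]
beta_hat, s2, draws = posterior_draws(x, y, 2000, random.Random(1))
slope_mean = sum(b[1] for b, _ in draws) / len(draws)
```

Because the draws are independent, no burn-in or convergence check is needed here, in contrast with the iterative methods of the previous subsection.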
The posterior predictive distribution of unobserved data, $p(\tilde{Y} \mid \vec{Y})$, has two
components of uncertainty:
1. The fundamental variability of the model, represented by the vari-
ance σ2 in ~Y , and
2. The posterior uncertainty in ~β and σ2 due to the finite sample size
of ~Y . As the sample size n → ∞, the variance due to posterior un-
certainty in (~β, σ2) decreases to zero, but the predictive uncertainty
remains.
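The two components can be made explicit in the normal linear model. As a sketch, write $\tilde{X}$ for the regression matrix of the unobserved data $\tilde{Y}$ (this symbol is introduced here and is not used in the text). Conditional on $\sigma^2$, integrating out $\vec{\beta}$ gives:

```latex
\[
  \tilde{Y} \mid \vec{\beta}, \sigma^2, \vec{Y}
    \sim N\big(\tilde{X}\vec{\beta},\; \sigma^2 I\big),
\]
\[
  \tilde{Y} \mid \sigma^2, \vec{Y}
    \sim N\big(\tilde{X}\hat{\vec{\beta}},\;
      (I + \tilde{X} V_{\vec{\beta}} \tilde{X}')\,\sigma^2\big).
\]
```

The term $\sigma^2 I$ is the fundamental variability of the model, and $\tilde{X} V_{\vec{\beta}} \tilde{X}'\sigma^2$ is the contribution of the posterior uncertainty in $\vec{\beta}$. In a simulation, one draws $(\vec{\beta}, \sigma^2)$ from the posterior and then $\tilde{Y}$ from the first distribution.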
5.5 Applications in claims reserving
5.5.1 The comonotonicity approach versus the Bayesian approximations
In this subsection we apply a Bayesian model in the context of discounted
loss reserves. The outcomes of this approach are compared with the
comonotonic approximations for the distribution of the discounted loss re-
serve when the run-off triangle is modelled by a generalized linear model.
We realize that the Bayesian posterior predictive distribution is a very
general workhorse, which takes into account all sources of uncertainty in
the model formulation and is applicable to different statistical domains,
whereas the comonotonic approximations originate from a specific actuarial
context. We want to illustrate however that the predictive distribution
based on the comonotonic bounds provides results that are close to the
results obtained via MCMC. The main advantage of the bounds is that
several risk measures such as percentiles (VaRs), expected shortfalls (stop-
loss premiums) and TailVaRs can be calculated easily from it.
As illustrated by Verrall (2004) (for GLIMs) and in earlier work by, for
instance, de Alba (2002) (for lognormal models), Bayesian techniques are
useful in this area as they provide the posterior predictive distribution of
the reserve.
Bayesian methods for the analysis of GLIMs
We consider Bayesian methods for the analysis of generalized linear models,
which provide a general framework for cases in which normality and linear-
ity are not viable assumptions. These cases point out the major computa-
tional bottleneck of Bayesian methods: when the assumptions of normality
and/or linearity are removed, usually the posterior distribution cannot be
computed in closed form. We will discuss some computational methods to
approximate this distribution.
Generalized linear models provide a unified framework to encompass
several situations which are not adequately described by the assumptions
of normality of the data and linearity in the parameters. As described in
Chapter 4 (Section 4.3.3), the features of a GLIM are the fact that the
distribution of ~Y |~θ (~θ is used to denote the parameter vector) belongs to
the exponential family, and that a transformation of the expectation of the
data, g(~µ), equals the linear predictor R~β, which is linear in the parameters.
The parameter vector is made up of ~β and of the dispersion parameter φ.
Classical analyses of generalized linear models allow for the possibil-
ity of variation beyond that of the assumed sampling distribution, called
overdispersion. A prior distribution can be placed on the dispersion
parameter, and any prior information about ~β can be described conditional
on the dispersion parameter; that is, p(~β, φ) = p(φ)p(~β|φ).
The classical analysis of generalized linear models is obtained if a non-
informative or flat prior distribution is assumed for ~β. The posterior mode
corresponding to a noninformative uniform prior density is the maximum
likelihood estimate for the parameter ~β, which can be obtained using iter-
ative weighted linear regression.
The problem with a Bayesian analysis of GLIMs is that, in general, the
posterior distribution of ~β cannot be calculated exactly, since the marginal
density of the data
\[ p(\vec{Y}) = \int p(\vec{Y} \mid \vec{\theta})\, p(\vec{\theta})\, d\vec{\theta} \tag{5.32} \]
cannot be evaluated in closed form.
Numerical integration techniques can be exploited to approximate (5.32),
from which a numerical approximation of the posterior density of ~β can
be found. When numerical integration techniques become infeasible, we
are left with two main ways to perform approximate posterior analysis:
(i) to provide an asymptotic approximation of the posterior distribution
or (ii) to use stochastic methods to generate a sample from the posterior
distribution.
When the sample size is large enough, posterior analysis can be based
on an asymptotic approximation of the posterior distribution by using a
normal distribution with some mean and variance. This idea generalizes
the asymptotic normal distribution of the maximum likelihood estimates
when their exact sampling distribution cannot be derived or it is too dif-
ficult to be used. Asymptotic normality of the posterior distribution pro-
vides notable computational advantages, since marginal and conditional
distributions are still normal, and hence inference on parameters of inter-
est can be easily carried out. However, for relatively small samples, the
assumption of asymptotic normality can be inaccurate.
For relatively small samples, stochastic methods (or Monte Carlo methods)
provide an approximate posterior analysis based on a sample of values gen-
erated from the posterior distribution of the parameters. The task reduces
to generating a sample from the posterior distribution of the parameters.
A numerical illustration
Consider now the run-off triangle in Table 5.7, taken from Taylor & Ashe
(1983) and used in various other publications on claims reserving.
These data are modelled using a gamma GLIM (see expression (4.51)
with κ = 2) with logarithmic link function.
            1          2          3          4          5          6          7          8          9         10
 1    357,848    766,940    610,542    482,940    527,326    574,398    146,342    139,950    227,299     67,948
 2    352,118    884,021    933,894  1,183,289    445,745    320,996    527,804    266,172    425,046
 3    290,507  1,001,799    926,219  1,016,654    750,816    146,923    495,992    280,405
 4    310,608  1,108,250    776,189  1,562,400    272,482    352,053    206,286
 5    443,160    693,190    991,983    769,488    504,851    470,639
 6    396,132    937,085    847,498    805,037    705,960
 7    440,832    847,631  1,131,398  1,063,269
 8    359,480  1,061,648  1,443,370
 9    376,686    986,608
10    344,014
Table 5.7: Run-off triangle with non-cumulative claim figures.
[Figure 5.1: three panels of weighted residuals, plotted against year of origin, development year and calendar year]
Figure 5.1: Weighted residuals for linear predictor in (5.33), together
with average lines, which represent the average of the weighted standardized
residuals in each period of interest. Note that the average is zero when no
observations occur.
Although the linear predictor in the Probabilistic Trend Family of
models (4.32) is over-parameterized, it provides a flexible modelling struc-
ture. For example, one might begin with three parameters (α, β, γ), with
one accident period level parameter, one development period trend para-
meter and one calendar period trend parameter, which equates to a linear
predictor with the following form,
\[ \eta_{ij} = \alpha + (j-1)\beta + (i+j-2)\gamma. \tag{5.33} \]
Adding more accident, development and calendar period parameters where
necessary allows the structure to be extremely flexible.
The weighted residuals for this model (Figure 5.1) indicate that there
are major trends in the development period that are not being captured.
There also appears to be a level change between accident periods one,
two-three, four and five. To capture these trends, extra development period
trend parameters and extra accident period level parameters are required.
The new form of the linear predictor is given by
\[ \eta_{ij} = \alpha_1 I_{(i=1)} + \alpha_2 I_{(i=2,3)} + \alpha_3 I_{(i=4)} + \alpha_4 I_{(i>4)} + \beta_1 I_{(j>1)}
   + \beta_2 I_{(j>4)} + (j-5)\beta_3 I_{(5<j<9)} + 3\beta_3 I_{(j>8)}. \tag{5.34} \]
[Figure 5.2: three panels of weighted residuals, plotted against year of origin, development year and calendar year]
Figure 5.2: Weighted residuals for linear predictor in (5.34), together
with average lines, which represent the average of the weighted standardized
residuals in each period of interest. Note that the average is zero when no
observations occur.
The weighted residuals for this updated model (Figure 5.2) indicate that
(5.34) appears to capture the significant levels and trends in the data.
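The two linear predictors (5.33) and (5.34) can be written as small functions of the accident year i and the development year j. The sketch below only encodes the indicator structure; the function names and the parameter values are hypothetical and are not estimates from the Taylor & Ashe data.

```python
def eta_533(i, j, alpha, beta, gamma):
    """Linear predictor (5.33): one level, one development trend, one calendar trend."""
    return alpha + (j - 1) * beta + (i + j - 2) * gamma


def eta_534(i, j, alpha, beta):
    """Linear predictor (5.34): four accident-period levels, three trend parameters."""
    a1, a2, a3, a4 = alpha
    b1, b2, b3 = beta
    # Accident period levels for i = 1, i in {2, 3}, i = 4 and i > 4.
    level = a1 * (i == 1) + a2 * (i in (2, 3)) + a3 * (i == 4) + a4 * (i > 4)
    # Development period effects; the beta3 trend is capped at 3 * beta3 for j > 8.
    trend = (b1 * (j > 1) + b2 * (j > 4)
             + (j - 5) * b3 * (5 < j < 9) + 3 * b3 * (j > 8))
    return level + trend


# Hypothetical parameter values, chosen to be exactly representable in floating point.
alpha = (1.0, 2.0, 3.0, 4.0)
beta = (0.5, 0.25, 0.125)
eta_first_cell = eta_534(1, 1, alpha, beta)
eta_mid_cell = eta_534(5, 7, alpha, beta)
eta_last_column = eta_534(3, 10, alpha, beta)
```

Note that the last two terms of (5.34) join continuously: at j = 8 the trend contributes 3 times beta3, which is also its value for j > 8.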
We recall from the previous chapter the definition of the discounted
IBNR reserve under a generalized linear model and normal logreturn process:
\[ S_{GLIM} = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} g^{-1}\big((R\vec{\beta})_{ij}\big)\, e^{-Y(i+j-t-1)}, \]
where the returns are modelled by means of a Brownian motion described
by the following equation
\[ Y(i) = \Big(\delta + \frac{\varsigma^2}{2}\Big)\, i + \varsigma B(i), \]
where $B(i)$ is the standard Brownian motion, $\varsigma$ is the volatility and $\delta$ is a
constant force of interest.

year       S^l_GLIM      S^c_GLIM      Bayesian
2           360,725       387,404       436,151
3           700,465       765,451       760,177
4           945,845       982,425       970,535
5         1,441,016     1,513,186     1,448,056
6         1,913,383     1,977,934     1,919,300
7         2,519,292     2,614,564     2,558,208
8         3,557,014     3,702,302     3,641,890
9         4,573,767     4,770,944     4,727,262
10        5,577,925     5,821,804     5,638,301
total    20,949,190    21,988,048    20,360,196
Table 5.8: 95th percentile of the predictive distribution of $S_{GLIM}$
The discounting process (with δ = 0.08 and ς = 0.11) is incorporated in
the WinBugs code for the gamma GLIM. To enable comparisons with the
results from the comonotonic bounds, flat priors were used both for the row
and column parameters of the linear predictor and for the scale parameter
in the gamma model. Table 5.8 contains the results obtained via MCMC
simulations with the WinBugs program. A burn-in of 10 000 iterations was
allowed, after which another 10 000 iterations were performed.
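A short derivation, added here as a check on the parametrization, shows why $\delta$ can be read as the force of interest of the expected discount factor. Since $-Y(i)$ is normally distributed with mean $-(\delta + \varsigma^2/2)\,i$ and variance $\varsigma^2 i$, the lognormal mean formula gives:

```latex
\[
  E\big[e^{-Y(i)}\big]
  = \exp\!\Big(-\big(\delta + \tfrac{1}{2}\varsigma^{2}\big)\, i
               + \tfrac{1}{2}\varsigma^{2} i\Big)
  = e^{-\delta i}.
\]
```

With $\delta = 0.08$ the expected discount factor over $i$ years is therefore $e^{-0.08 i}$, whatever the value of $\varsigma$; the volatility only affects the spread of the discount factor around this mean.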
The bounds for the discounted loss reserve use the maximum likeli-
hood estimates of the parameters in the linear predictor. To incorporate
the error arising from the estimation of these parameters we apply the
bootstrap algorithm as explained in Section 4.5. We bootstrapped 1000
times, each time computing (analytically) the 95th percentile of the upper
and lower bounds. Table 5.8 compares the Bayesian 95th percentile and the
bootstrapped 95th percentile of the lower and upper bound for the differ-
ent reserves.
The results for the upper and lower bounds in convex order are given in
the same table. One can see that the results from the comonotonic bounds
are close to the results obtained via MCMC simulation. Thus, at least for
this example, these bounds provide actuaries with accurate information
concerning the predictive distribution of discounted loss reserves.
5.5.2 The comonotonicity approach versus the asymptotic and moment matching approximations
In case the underlying variance of the statistical and financial part of the
discounted IBNR reserve gets large, the comonotonic approximations per-
form worse. We will illustrate this by means of a simple example in the
context of loss reserving and propose to solve this problem using the asymp-
totic approximations introduced in Section 5.3.
In the following, we assume that the r.v.’s Yij , i, j = 1, . . . , t can be
expressed as products of a deterministic component and an i.i.d. random
component. In particular, we consider the following model
\[ Y_{ij} = a_{ij}\bar{Y}_{ij}, \qquad i, j = 1, \ldots, t, \tag{5.35} \]
in which $\bar{Y}_{ij}$, $i, j = 1, \ldots, t$ are i.i.d. r.v.’s and $a_{ij} > 0$, $i, j = 1, \ldots, t$ are
positive numbers.
We will consider in this part the simple lognormal linear model (4.1)
\[ \ln \vec{Y} = R\vec{\beta} + \vec{\varepsilon}, \qquad \vec{\varepsilon} \sim N(0, \sigma^2 I), \]
with ~Y as before the vector of historical claim figures.
The accumulated IBNR reserve is given by
\[ \text{IBNR reserve} = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} a_{ij}\bar{Y}_{ij}. \tag{5.36} \]
We will again incorporate stochastic discounting factors. We let the positive
r.v. $V_k$ from the i.i.d. sequence $\{V_k, k = 1, \ldots, t-1\}$ denote the present
value discounting factor from year $k$ to year $k-1$ and consider the two
sequences $\{\bar{Y}_{ij}, i = 2, \ldots, t;\ j = t+2-i, \ldots, t\}$ and $\{V_k, k = 1, \ldots, t-1\}$
to be mutually independent. Furthermore, for notational convenience, we
introduce the positive r.v. $Z_k = V_1 V_2 \cdots V_k$, $k = 1, \ldots, t-1$. Then the
discounted IBNR reserve $S$ is given by
\[ S = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} a_{ij}\bar{Y}_{ij} Z_{i+j-t-1}. \tag{5.37} \]
Henceforth, we impose that $E[S] < +\infty$. Approximate values for stop-loss
premiums and quantiles for $S$ may be obtained by using asymptotic
results. In particular, if $\{\bar{Y}_{ij}, i = 2, \ldots, t;\ j = t+2-i, \ldots, t\}$ and $\{V_k, k = 1, \ldots, t-1\}$
satisfy the corresponding conditions under which Theorem 13
or Theorem 14 is valid, then for sufficiently large values of $d$, we have that
\[ \pi(S, d) \approx \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} a_{ij}\, \pi\big(\bar{Y} Z_{i+j-t-1},\, d/a_{ij}\big). \tag{5.38} \]
Furthermore, if either $F_{\bar{Y}} \in \mathcal{R}_{-\alpha}$ for some $0 < \alpha < +\infty$, and $F_V \in \mathcal{R}_{-\infty}$,
or the conditions of Theorem 14 apply, then for sufficiently large values of
$p$, we have that
\[ F_S^{-1}(p) \approx \inf\Big\{ s : \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} \overline{F}_{\bar{Y} Z_{i+j-t-1}}(s/a_{ij}) \le 1 - p \Big\}. \tag{5.39} \]
As an example, we consider a lognormal linear regression model with
chain-ladder linear predictor to describe the random claims and we use a
geometric Brownian motion with drift to represent the stochastic discount factors.
We remark that for this specification Theorem 14 applies. Furthermore, for
this specification the products $\bar{Y}_{ij} Z_{i+j-t-1}$, $i = 2, \ldots, t$; $j = t+2-i, \ldots, t$
are lognormal and therefore the present value of the IBNR reserve becomes
a linear combination of dependent lognormal r.v.’s, given by
\[ S = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} a_{ij}\bar{Y}_{ij} Z_{i+j-t-1} = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} e^{\eta_{ij}} e^{\varepsilon_{ij}} e^{-Y(i+j-t-1)}. \tag{5.40} \]
Notice that this definition is the same as (4.35) for the special case of the
lognormal linear model (with σ = σ) and chain-ladder type linear predictor
ηij = (R~β)ij = αi + βj .
In this illustration, we start with a given set of parameters and define
the reserve as expressed in (5.40). In a real reserving exercise, one has to
build an appropriate statistical model based on the incremental claims in
the run-off triangle and to estimate the parameters from this model.
Using the same notation as in the previous chapter we have for $W_{ij} := -Y(i+j-t-1)$ that
\[ E[W_{ij}] = -\Big(\delta + \frac{1}{2}\varsigma^2\Big)(i+j-t-1), \]
\[ \mathrm{Var}[W_{ij}] = \sigma^2_{W_{ij}} = (i+j-t-1)\,\varsigma^2. \]
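The asymptotic approximations (5.38) and (5.39) become
\[ \pi(S, d) \approx \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} \Bigg[ e^{\eta_{ij} + E[W_{ij}] + \frac{1}{2}(\sigma^2_{W_{ij}} + \sigma^2)}\,
   \Phi\Bigg(\frac{\eta_{ij} + E[W_{ij}] + \sigma^2_{W_{ij}} + \sigma^2 - \log(d)}{\sqrt{\sigma^2_{W_{ij}} + \sigma^2}}\Bigg)
   - d\,\Phi\Bigg(\frac{\eta_{ij} + E[W_{ij}] - \log(d)}{\sqrt{\sigma^2_{W_{ij}} + \sigma^2}}\Bigg) \Bigg], \qquad d \in \mathbb{R}^+, \]
\[ F_S^{-1}(p) \approx \inf\Big\{ s : \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} \overline{F}_{LN}(s) \le 1 - p \Big\}, \qquad p \in (0, 1), \]
in which $F_{LN}$ is the cdf of $\log N(\eta_{ij} + E[W_{ij}],\ \sigma^2_{W_{ij}} + \sigma^2)$.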
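To compute the lognormal moment matching approximations as described
in Section 5.2 we need expressions for the mean and variance of $S$. These
are given by
\[ E[S] = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} e^{\eta_{ij} + E[W_{ij}] + \frac{1}{2}(\sigma^2_{W_{ij}} + \sigma^2)}, \]
\[ \mathrm{Var}[S] = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} \sum_{k=2}^{t} \sum_{l=t+2-k}^{t}
   e^{\sigma^2 + (\eta_{ij} + \eta_{kl} + E[W_{ij}] + E[W_{kl}]) + \frac{1}{2}(\sigma^2_{W_{ij}} + \sigma^2_{W_{kl}})}
   \times \Big( e^{\varsigma^2 \min(i+j-t-1,\ k+l-t-1) + \sigma^2_*} - 1 \Big), \]
where
\[ \sigma^2_* = \begin{cases} \sigma^2 & \text{if } (i, j) = (k, l); \\ 0 & \text{if } (i, j) \neq (k, l). \end{cases} \]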
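We arbitrarily set $\sigma = 3$, $\delta = -0.07$, $\varsigma = 0.2$ and $t = 5$ and use the following
chain-ladder parameters:
\[ (\alpha_1, \alpha_2, \alpha_3, \alpha_4, \alpha_5)' = (1.1,\ 1.6,\ 1.9,\ 2.1,\ 2.2)', \qquad
   (\beta_1, \beta_2, \beta_3, \beta_4, \beta_5)' = (0,\ -0.42,\ -0.38,\ -0.87,\ -0.96)'. \]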
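The closed-form expressions for the mean, the variance and the asymptotic stop-loss approximation can be evaluated directly for this parameter set. The sketch below is an illustration under the assumption that the parameters are read correctly from the text; the function names (`log_params`, `mean_s`, `var_s`, `stop_loss_asymptotic`) are hypothetical, and no attempt is made here to reproduce the figures reported for the numerical study.

```python
import math

T = 5
SIGMA, DELTA, VOL = 3.0, -0.07, 0.2   # sigma, delta and varsigma from the text
ALPHA = [1.1, 1.6, 1.9, 2.1, 2.2]
BETA = [0.0, -0.42, -0.38, -0.87, -0.96]
# Future cells of the triangle: i = 2..t, j = t+2-i..t.
CELLS = [(i, j) for i in range(2, T + 1) for j in range(T + 2 - i, T + 1)]


def log_params(i, j, sigma=SIGMA, delta=DELTA, vol=VOL):
    """Mean and variance of eta_ij + eps_ij + W_ij (the log of one term of S)."""
    m = i + j - T - 1
    eta = ALPHA[i - 1] + BETA[j - 1]
    return eta - (delta + 0.5 * vol**2) * m, m * vol**2 + sigma**2


def std_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def mean_s(**kw):
    """E[S] as a sum of lognormal means."""
    return sum(math.exp(mu + 0.5 * v)
               for mu, v in (log_params(i, j, **kw) for i, j in CELLS))


def var_s(sigma=SIGMA, delta=DELTA, vol=VOL):
    """Var[S] as a double sum of lognormal covariances."""
    total = 0.0
    for i, j in CELLS:
        mu1, v1 = log_params(i, j, sigma, delta, vol)
        for p, q in CELLS:
            mu2, v2 = log_params(p, q, sigma, delta, vol)
            cov = vol**2 * min(i + j - T - 1, p + q - T - 1)
            if (i, j) == (p, q):
                cov += sigma**2   # the sigma_*^2 term on the diagonal
            total += math.exp(mu1 + mu2 + 0.5 * (v1 + v2)) * (math.exp(cov) - 1.0)
    return total


def stop_loss_asymptotic(d, **kw):
    """Sum of the lognormal stop-loss premiums of the individual terms."""
    total = 0.0
    for i, j in CELLS:
        mu, v = log_params(i, j, **kw)
        sd = math.sqrt(v)
        total += (math.exp(mu + 0.5 * v) * std_normal_cdf((mu + v - math.log(d)) / sd)
                  - d * std_normal_cdf((mu - math.log(d)) / sd))
    return total


expected_s = mean_s()
variance_s = var_s()
premium_7500 = stop_loss_asymptotic(7500.0)
```

The mean and variance computed this way are the inputs of the lognormal moment matching approximation, while `stop_loss_asymptotic` evaluates the asymptotic stop-loss formula term by term.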
d MC Appr. 1 Appr. 2 Appr. 3 Ndiff. 1 Ndiff. 2 Ndiff. 3
7500 1868.0 1771.6 2541.1 2277.6 5.2% -36.0% -21.9%
10000 1743.5 1658.1 2459.2 2165.8 4.9% -41.0% -24.2%
15000 1568.7 1496.9 2333.6 1998.4 4.6% -48.8% -27.4%
20000 1446.7 1383.1 2237.8 1874.1 4.4% -54.7% -29.5%
25000 1354.0 1295.8 2160.0 1775.2 4.3% -59.5% -31.1%
30000 1279.7 1225.4 2094.4 1693.4 4.2% -63.7% -32.3%
40000 1165.7 1116.7 1987.6 1563.3 4.2% -70.5% -34.1%
50000 1080.4 1034.8 1902.4 1462.2 4.2% -76.1% -35.3%
75000 933.3 892.5 1743.6 1280.5 4.4% -86.8% -37.2%
100000 835.6 797.4 1600.2 1154.9 4.6% -91.5% -38.2%
150000 708.5 673.0 1437.8 985.3 5.0% -102.9% -39.1%
200000 626.2 592.0 1323.5 871.7 5.5% -111.4% -39.2%
250000 566.5 533.4 1260.8 788.2 5.8% -122.6% -39.1%
300000 520.7 488.4 1190.4 723.1 6.2% -128.6% -38.9%
400000 453.9 422.6 1081.8 626.8 6.9% -138.3% -38.1%
500000 406.5 375.9 1000.2 557.7 7.5% -146.1% -37.2%
p MC Appr. 1 Appr. 2 Appr. 3 Ndiff. 1 Ndiff. 2 Ndiff. 3
0.95 8650 7863 4814 7555 9.1% 44.3% 12.7%
0.975 17000 15868 12436 17296 6.7% 26.8% -1.7%
0.99 38957 37496 37490 45306 3.8% 3.8% -16.3%
0.995 70795 68885 79477 87283 2.7% -12.3% -23.3%
0.999 257090 253021 374188 337364 1.6% -45.5% -31.2%
Table 5.9: Monte Carlo (MC) versus approximate values of stop-loss
premiums and quantiles for chain-ladder claim sizes and lognormal present
value discounting factors.
In Table 5.9 we numerically compare the asymptotic approximations with
a Monte Carlo (MC) study based on 5 000 0000 simulations. Numerical
results of the comonotonic and moment matching approximations have
also been included. “Appr. 1” refers to the asymptotic approximation,
“Appr. 2” to the convex upper bound and “Appr. 3” to the lognormal
moment matching approach. “Ndiff.” refers to the normalized difference,
defined as (MC − Appr.)/MC × 100%. The numerical results demonstrate that the
asymptotic approximation values generally outperform the comonotonic
upper bound and the lognormal moment matching technique. Because the
comonotonic lower bound performed remarkably badly, its numerical values
were left out of the table.
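The normalized difference is easy to recompute from the reported values. As a small check, the sketch below applies the definition to the first stop-loss row of Table 5.9 (d = 7500); the function name `ndiff` is a hypothetical helper.

```python
def ndiff(mc, approx):
    """Normalized difference (MC - Appr.) / MC, in percent."""
    return (mc - approx) / mc * 100.0


# First stop-loss row of Table 5.9 (d = 7500): MC, Appr. 1, Appr. 2, Appr. 3.
mc, appr1, appr2, appr3 = 1868.0, 1771.6, 2541.1, 2277.6
row = [round(ndiff(mc, a), 1) for a in (appr1, appr2, appr3)]
# row is [5.2, -36.0, -21.9], matching the Ndiff. columns of the table
```

A positive value means that the approximation lies below the Monte Carlo value, and a negative value that it lies above.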
5.6 Proofs
Theorem 12
In order to prove the theorem, we first establish the following result from
Tang & Tsitsiashvili (2004):
Lemma 11.
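Let $F_1$, $F_2$ and $G$ be three d.f.’s. Suppose that $\overline{F}_i(x) > 0$ for any real
number $x$, $F_i(0)G(0) = 0$, $i = 1, 2$, and $G \in \mathcal{R}_{-\infty}$. If $\overline{F}_1(x) \sim \overline{F}_2(x)$, then
\[ \overline{F_1 \otimes G}(x) \sim \overline{F_2 \otimes G}(x). \tag{5.41} \]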
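Proof. From the condition $\overline{F}_1(x) \sim \overline{F}_2(x)$ we know that, for any $0 < \varepsilon < 1$
and all large $x$, say $x \ge y_0$ for some $y_0 > 0$,
\[ (1 - \varepsilon)\overline{F}_2(x) \le \overline{F}_1(x) \le (1 + \varepsilon)\overline{F}_2(x). \tag{5.42} \]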
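It is not difficult to verify that since $\overline{F}_i(y_0) > 0$ for all $y_0 > 0$, $i = 1, 2$, we
have by the definition of the class $\mathcal{R}_{-\infty}$ for $i = 1, 2$, that
\[ \limsup_{x \to +\infty} \frac{\int_0^{y_0} \overline{G}(x/y)\,F_i(dy)}{\int_{y_0}^{+\infty} \overline{G}(x/y)\,F_i(dy)}
   \le \limsup_{x \to +\infty} \frac{\int_0^{y_0} \overline{G}(x/y)\,F_i(dy)}{\int_{2y_0}^{+\infty} \overline{G}(x/y)\,F_i(dy)}
   \le \limsup_{x \to +\infty} \frac{\overline{G}(x/y_0)\,\big(F_i(y_0) - F_i(0)\big)}{\overline{G}(x/(2y_0))\,\overline{F}_i(2y_0)}
   = 0 \]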
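and hence that for $i = 1, 2$
\[ \overline{F_i \otimes G}(x) = \int_0^{y_0} \overline{G}(x/y)\,F_i(dy) + \int_{y_0}^{+\infty} \overline{G}(x/y)\,F_i(dy)
   \sim \int_{y_0}^{+\infty} \overline{G}(x/y)\,F_i(dy)
   = \overline{G}(x/y_0)\,\overline{F}_i(y_0) + \int_0^{x/y_0} \overline{F}_i(x/y)\,G(dy). \]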
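Substituting (5.42) into the above leads to
\[ (1 - \varepsilon)\,\overline{F_2 \otimes G}(x) \lesssim \overline{F_1 \otimes G}(x) \lesssim (1 + \varepsilon)\,\overline{F_2 \otimes G}(x). \]
Hence, relation (5.41) follows from the arbitrariness of $0 < \varepsilon < 1$.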
Then, we proceed with the proof of Theorem 12.
Proof. Clearly, it holds that
\[ \Pr\Bigl[\sum_{i=1}^{n} a_i Z_i > x\Bigr] = \Pr\bigl[Y_1\bigl(a_1 + Y_2\bigl(a_2 + \dots Y_{n-1}\bigl(a_{n-1} + a_n Y_n\bigr)\bigr)\bigr) > x\bigr]. \]
Since $F_Y \in \mathcal{L}$ and $a_n > 0$, we have that
\[ \Pr[a_{n-1} + a_n Y_n > x] \sim \Pr[a_n Y_n > x]. \]
Hence, applying Lemma 11 we obtain that
\[ \Pr[Y_{n-1}(a_{n-1} + a_n Y_n) > x] \sim \Pr[a_n Y_{n-1} Y_n > x]. \]
Repeatedly applying Lemma 11, we finally obtain that
\[ \Pr\bigl[Y_1\bigl(a_1 + Y_2\bigl(a_2 + \dots Y_{n-1}\bigl(a_{n-1} + a_n Y_n\bigr)\bigr)\bigr) > x\bigr] \sim \Pr[a_n Y_1 Y_2 \cdots Y_{n-1} Y_n > x]. \]
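For the remainder of the proof it suffices to verify that the probabilities
$\Pr[a_i Z_i > x]$, $i = 1, 2, \dots, n-1$, on the right-hand side of (5.20) can be
neglected when compared with the probability $\Pr[a_n Z_n > x]$. Since the
class $\mathcal{R}_{-\infty}$ is closed under product convolution, we have that the d.f. of
the product $\prod_{j=1}^{i} Y_j$ belongs to the class $\mathcal{R}_{-\infty}$ for each $i = 1, 2, \dots$. Hence,
we verify that for each $i = 1, 2, \dots, n-1$, and some $0 < v < 1$,
\[
\begin{aligned}
\limsup_{x\to+\infty} \frac{\Pr\bigl[a_i \prod_{j=1}^{i} Y_j > x\bigr]}{\Pr\bigl[a_n \prod_{j=1}^{n} Y_j > x\bigr]}
&\leq \limsup_{x\to+\infty} \frac{\Pr\bigl[a_i \prod_{j=1}^{i} Y_j > x\bigr]}{\Pr\bigl[a_i \prod_{j=1}^{i} Y_j > vx,\ \frac{a_n}{a_i} \prod_{j=i+1}^{n} Y_j > 1/v\bigr]} \\
&= \frac{1}{\Pr\bigl[\frac{a_n}{a_i} \prod_{j=i+1}^{n} Y_j > 1/v\bigr]}\ \limsup_{x\to+\infty} \frac{\Pr\bigl[a_i \prod_{j=1}^{i} Y_j > x\bigr]}{\Pr\bigl[a_i \prod_{j=1}^{i} Y_j > vx\bigr]} \\
&= 0.
\end{aligned}
\]
This proves that (5.20) holds.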
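The dominance of the last term can be made concrete in a special case where
all tails are available in closed form. The following Python sketch assumes,
purely for illustration, that the $Y_j$ are i.i.d. lognormal with parameters
`mu` and `sigma` (a long-tailed, rapidly varying law), so that
$a_i \prod_{j \le i} Y_j$ is again lognormal. The function names and
parameter values are hypothetical. It evaluates the ratio
$\Pr[a_i Z_i > x] / \Pr[a_n Z_n > x]$ for increasing $x$.

```python
import math


def lognormal_product_tail(x, a, i, mu, sigma):
    """Pr[a * Y_1 * ... * Y_i > x] for i.i.d. lognormal(mu, sigma) factors Y_j.

    log(Y_1 * ... * Y_i) is normal with mean i * mu and variance i * sigma**2.
    """
    z = (math.log(x / a) - i * mu) / (sigma * math.sqrt(i))
    return 0.5 * math.erfc(z / math.sqrt(2.0))  # standard normal survival function


def tail_ratio(x, i, n, a_i=1.0, a_n=1.0, mu=0.0, sigma=0.5):
    """Ratio Pr[a_i Z_i > x] / Pr[a_n Z_n > x] under the lognormal assumption."""
    return (lognormal_product_tail(x, a_i, i, mu, sigma)
            / lognormal_product_tail(x, a_n, n, mu, sigma))


# Hypothetical setting: compare the term i = 2 with the last term n = 3.
thresholds = [10.0, 100.0, 1000.0]
ratios = [tail_ratio(x, i=2, n=3) for x in thresholds]
```

Under these assumptions the ratios shrink quickly as $x$ grows, which is the
behaviour the limit above expresses. The sketch is a numerical illustration
only and does not replace the proof.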
Theorem 13
To prove the theorem, we first state three lemmas.
Lemma 12.
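Let $X$ and $Y$ be two independent r.v.’s, where $X$ is supported on $(-\infty, +\infty)$
with a d.f. $F$, and $Y$ is strictly positive with a d.f. $G$. Let $V = XY$ and
denote by $H$ the d.f. of $V$. If $F \in \mathcal{D} \cap \mathcal{L}$ and $G \in \mathcal{R}_{-\infty}$, then $H \in \mathcal{D} \cap \mathcal{L} \subset \mathcal{S}$ and
\[ \overline{H}(x) \asymp \overline{F}(x). \]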
Proof. This lemma can easily be proved by Lemma 3.8 and Lemma 3.10
of Tang & Tsitsiashvili (2003).
Lemma 13.
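If $F \in \mathcal{D}$ and $G \in \mathcal{R}_{-\infty}$, then there exists some $\varepsilon > 0$ such that
\[ \overline{G}\bigl(x^{1-\varepsilon}\bigr) = o\bigl(\overline{F}(x)\bigr). \]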
Proof. This lemma can be proved by Lemma 3.7 of Tang & Tsitsiashvili
(2003).
Lemma 14.
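Let $F = F_1 * F_2$, where $F_1$ and $F_2$ are two d.f.’s supported on $(-\infty, +\infty)$.
If $F_1 \in \mathcal{S}$, $F_2 \in \mathcal{L}$, and $\overline{F}_2(x) = O\bigl(\overline{F}_1(x)\bigr)$, then $F \in \mathcal{S}$ and
\[ \overline{F}(x) \sim \overline{F}_1(x) + \overline{F}_2(x). \]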
Proof. This result can be obtained by fixing γ = 0 in Lemma 3.2 of Tang
& Tsitsiashvili (2003).
We are now ready to prove Theorem 13.
Proof. First we prove (5.21), which says that
\[
\begin{aligned}
&\Pr[(a_1 + X_1)Y_1 + \dots + (a_{n-1} + X_{n-1})Y_{n-1} \cdots Y_1 + (a_n + X_n)Y_n Y_{n-1} \cdots Y_1 > x] \\
&\quad \sim \Pr[(a_1 + X_1)Y_1 > x] + \dots + \Pr[(a_{n-1} + X_{n-1})Y_{n-1} \cdots Y_1 > x] \\
&\qquad + \Pr[(a_n + X_n)Y_n Y_{n-1} \cdots Y_1 > x].
\end{aligned}
\]
Applying Lemma 12, we have that the product $(a_n + X_n)Y_n$ is
subexponentially distributed and
\[ \Pr[(a_n + X_n)Y_n > x] \asymp \overline{F}(x). \tag{5.43} \]
Applying Lemma 14, we have that
\[
\Pr[(a_{n-1} + X_{n-1}) + (a_n + X_n)Y_n > x]
\sim \Pr[(a_{n-1} + X_{n-1}) > x] + \Pr[(a_n + X_n)Y_n > x].
\]
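Since, by Lemma 13, there exists some $\varepsilon > 0$ such that
$\overline{G}\bigl(x^{1-\varepsilon}\bigr) = o\bigl(\overline{F}(x)\bigr)$, we have that
\[
\begin{aligned}
&\Pr[(a_{n-1} + X_{n-1})Y_{n-1} + (a_n + X_n)Y_n Y_{n-1} > x] \\
&\quad = \Bigl(\int_0^{x^{1-\varepsilon}} + \int_{x^{1-\varepsilon}}^{+\infty}\Bigr) \Pr[(a_{n-1} + X_{n-1})y + (a_n + X_n)Y_n y > x]\, dG(y) \\
&\quad = \int_0^{x^{1-\varepsilon}} \Pr\Bigl[(a_{n-1} + X_{n-1}) + (a_n + X_n)Y_n > \frac{x}{y}\Bigr] dG(y) + o\bigl(\overline{F}(x)\bigr) \\
&\quad \sim \int_0^{x^{1-\varepsilon}} \Bigl(\Pr\Bigl[(a_{n-1} + X_{n-1}) > \frac{x}{y}\Bigr] + \Pr\Bigl[(a_n + X_n)Y_n > \frac{x}{y}\Bigr]\Bigr) dG(y) + o\bigl(\overline{F}(x)\bigr) \\
&\quad = \Bigl(\int_0^{+\infty} - \int_{x^{1-\varepsilon}}^{+\infty}\Bigr) \Bigl(\Pr\Bigl[(a_{n-1} + X_{n-1}) > \frac{x}{y}\Bigr] + \Pr\Bigl[(a_n + X_n)Y_n > \frac{x}{y}\Bigr]\Bigr) dG(y) + o\bigl(\overline{F}(x)\bigr) \\
&\quad = \Pr[(a_{n-1} + X_{n-1})Y_{n-1} > x] + \Pr[(a_n + X_n)Y_n Y_{n-1} > x] + o\bigl(\overline{F}(x)\bigr) \\
&\quad \sim \Pr[(a_{n-1} + X_{n-1})Y_{n-1} > x] + \Pr[(a_n + X_n)Y_n Y_{n-1} > x].
\end{aligned}
\]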
Furthermore, by application of Lemmas 12 and 14, it follows that
$(a_{n-1} + X_{n-1})Y_{n-1} + (a_n + X_n)Y_n Y_{n-1}$ is subexponentially distributed and that
\[ \Pr[(a_{n-1} + X_{n-1})Y_{n-1} + (a_n + X_n)Y_n Y_{n-1} > x] \asymp \overline{F}(x). \]
Simply repeating the procedure above and observing that
\[
\begin{aligned}
&(a_{n-2} + X_{n-2})Y_{n-2} + (a_{n-1} + X_{n-1})Y_{n-1}Y_{n-2} + (a_n + X_n)Y_n Y_{n-1} Y_{n-2} \\
&\quad = \bigl[(a_{n-2} + X_{n-2}) + (a_{n-1} + X_{n-1})Y_{n-1} + (a_n + X_n)Y_n Y_{n-1}\bigr] Y_{n-2},
\end{aligned}
\]
we obtain that
\[
\begin{aligned}
&\Pr[(a_{n-2} + X_{n-2})Y_{n-2} + (a_{n-1} + X_{n-1})Y_{n-1}Y_{n-2} + (a_n + X_n)Y_n Y_{n-1} Y_{n-2} > x] \\
&\quad \sim \Pr[(a_{n-2} + X_{n-2})Y_{n-2} > x] + \Pr[(a_{n-1} + X_{n-1})Y_{n-1}Y_{n-2} > x] \\
&\qquad + \Pr[(a_n + X_n)Y_n Y_{n-1} Y_{n-2} > x].
\end{aligned}
\]
Hence, repeating the procedure above $n - 1$ times yields the announced
result (5.21). The proof of (5.22) can be given completely analogously to
the above, since the distribution of $a_i X_i$ satisfies
\[ \Pr[a_i X_i > x] = \overline{F}(x/a_i) \asymp \overline{F}(x) \]
and is subexponential.
Corollary 3
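Proof. Using (5.43), one can easily verify that
\[ \liminf_{x\to+\infty} \frac{\Pr\bigl[\sum_{i=1}^{n} (a_i + X_i)Z_i > x\bigr]}{\Pr\bigl[\sum_{i=1}^{n-1} (a_i + X_i)Z_i > x\bigr]} > 1, \]
and that
\[ \liminf_{x\to+\infty} \frac{\Pr\bigl[\sum_{i=1}^{n} (a_i X_i)Z_i > x\bigr]}{\Pr\bigl[\sum_{i=1}^{n-1} (a_i X_i)Z_i > x\bigr]} > 1. \]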
Hence, we can prove (5.23) and (5.24) by substituting (5.21) and (5.22)
into the left-hand-side of (5.23) and (5.24), respectively.
Corollary 4
Proof. Given the asymptotic results (5.21) and (5.22), the proof of this
corollary follows immediately from a well-known result, which Cline (1986)
attributes to Proposition 3 of Breiman (1965).
Theorem 14
In case the conditions 1 and 2 of Theorem 13 are replaced by the conditions
1’, 2’ and 3’ of Theorem 14, the proof of (5.21) can be established
completely analogously to the proof of Theorem 13 using the following three
lemmas, which are the analogs of Lemma 12, Lemma 13 and Lemma 14,
respectively:
Lemma 15.
Let $X$ and $Y$ be two independent lognormally distributed r.v.’s with $\sigma_Y < \sigma_X$.
Furthermore, let $V = XY$ and denote by $H$ the d.f. of $V$. Then $V$
follows a lognormal law and $\overline{F}(x) = o\bigl(\overline{H}(x)\bigr)$.
Lemma 16.
If both $F$ and $G$ are lognormal laws with $\sigma_G < \sigma_F$, then there exists some
$\varepsilon > 0$ such that
\[ \overline{G}\bigl(x^{1-\varepsilon}\bigr) = o\bigl(\overline{F}(x)\bigr). \]
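The statement of Lemma 16 can be made plausible by a short calculation, given
here only as a sketch and not as the argument used in the thesis. It uses the
standard tail estimate $\overline{\Phi}(z) \sim \varphi(z)/z$ for the standard normal
law; the location parameters $\mu_F$ and $\mu_G$ are notation introduced for this
sketch and do not affect the leading term.

```latex
% Leading-order behaviour of the lognormal tails as x -> +infinity:
\ln \overline{F}(x) \sim -\frac{(\ln x)^2}{2\sigma_F^2},
\qquad
\ln \overline{G}\bigl(x^{1-\varepsilon}\bigr) \sim -\frac{(1-\varepsilon)^2 (\ln x)^2}{2\sigma_G^2}.

% Hence
\ln \frac{\overline{G}\bigl(x^{1-\varepsilon}\bigr)}{\overline{F}(x)}
  = -\frac{(\ln x)^2}{2}
    \left( \frac{(1-\varepsilon)^2}{\sigma_G^2} - \frac{1}{\sigma_F^2} \right)
    \bigl(1 + o(1)\bigr)
  \longrightarrow -\infty
\quad \text{whenever} \quad
0 < \varepsilon < 1 - \frac{\sigma_G}{\sigma_F}.
```

Because $\sigma_G < \sigma_F$, the interval of admissible $\varepsilon$ is non-empty, which
is consistent with the existence claim of the lemma.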
Lemma 17.
Let $F = F_1 * F_2$, where $F_1$ and $F_2$ are two lognormal laws. Then $F \in \mathcal{S}$ and
\[ \overline{F}(x) \sim \overline{F}_1(x) + \overline{F}_2(x). \]
Proof. This is a special case of Corollary 1 of Cline (1986) and moreover
is a special case of Lemma 14.
We are now ready to prove Theorem 14.
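Proof. The proof of (5.22) can be given analogously, since the distribution
of $a_i X_i$ is again lognormal with $\mathrm{Var}[\log(a_i X_i)] = \mathrm{Var}[\log(X_i)] = \sigma_X^2$.
Finally, we prove (5.23) and (5.24). By application of Lemma 11 and the
same reasoning as in the proof of Theorem 12, we have for each $n = 1, 2, \dots$,
and some $0 < v < 1$ that
\[
\begin{aligned}
&\liminf_{x\to+\infty} \frac{\sum_{i=1}^{n} \Pr\bigl[(a_i + X) \prod_{j=1}^{i} Y_j > x\bigr]}{\sum_{i=1}^{n-1} \Pr\bigl[(a_i + X) \prod_{j=1}^{i} Y_j > x\bigr]} \\
&\quad \geq \liminf_{x\to+\infty} \frac{\Pr\bigl[(a_n + X) \prod_{j=1}^{n} Y_j > x\bigr]}{\sum_{i=1}^{n-1} \Pr\bigl[(a_i + X) \prod_{j=1}^{i} Y_j > x\bigr]} \\
&\quad = \frac{1}{\sum_{i=1}^{n-1} \limsup_{x\to+\infty} \dfrac{\Pr\bigl[(a_i + X) \prod_{j=1}^{i} Y_j > x\bigr]}{\Pr\bigl[(a_n + X) \prod_{j=1}^{n} Y_j > x\bigr]}} \\
&\quad \geq \frac{1}{\sum_{i=1}^{n-1} \limsup_{x\to+\infty} \dfrac{\Pr\bigl[(a_i + X) \prod_{j=1}^{i} Y_j > x\bigr]}{\Pr\bigl[(a_n + X) \prod_{j=1}^{i} Y_j > vx\bigr] \Pr\bigl[\prod_{j=i+1}^{n} Y_j > 1/v\bigr]}} \\
&\quad = \frac{1}{\sum_{i=1}^{n-1} \limsup_{x\to+\infty} \dfrac{\Pr\bigl[X \prod_{j=1}^{i} Y_j > x\bigr]}{\Pr\bigl[X \prod_{j=1}^{i} Y_j > vx\bigr] \Pr\bigl[\prod_{j=i+1}^{n} Y_j > 1/v\bigr]}} \\
&\quad = +\infty > 1,
\end{aligned}
\]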
and
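\[
\begin{aligned}
&\liminf_{x\to+\infty} \frac{\sum_{i=1}^{n} \Pr\bigl[(a_i X) \prod_{j=1}^{i} Y_j > x\bigr]}{\sum_{i=1}^{n-1} \Pr\bigl[(a_i X) \prod_{j=1}^{i} Y_j > x\bigr]} \\
&\quad \geq \liminf_{x\to+\infty} \frac{\Pr\bigl[(a_n X) \prod_{j=1}^{n} Y_j > x\bigr]}{\sum_{i=1}^{n-1} \Pr\bigl[(a_i X) \prod_{j=1}^{i} Y_j > x\bigr]} \\
&\quad = \frac{1}{\sum_{i=1}^{n-1} \limsup_{x\to+\infty} \dfrac{\Pr\bigl[(a_i X) \prod_{j=1}^{i} Y_j > x\bigr]}{\Pr\bigl[(a_n X) \prod_{j=1}^{n} Y_j > x\bigr]}} \\
&\quad = +\infty > 1.
\end{aligned}
\]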
Hence, we can prove (5.23) and (5.24) by substituting (5.21) and (5.22)
into the left-hand-side of (5.23) and (5.24), respectively.
Summary (translated from the Dutch)
Introduction
In this thesis we take a closer look at the reserving problem in the
insurance world. A reserving study essentially comes down to determining
the present value of future claim payments. The expertise and accuracy
with which this uncertain amount is established is therefore crucial for a
company and its policyholders. Moreover, the intrinsic uncertainties
involved must not be an excuse for forgoing a scientifically well-founded
analysis. Interests and priorities may differ among all those who deal
with reserve estimates. For management, the estimate must provide
reliable information to maximize the viability and profitability of the
company. For the supervisory authority, which is concerned with solvency,
the reserves must be set conservatively to reduce the probability of
insolvency. For the tax authorities, the reserves must reflect the actual
payments as closely as possible. Finally, the policyholder wants the
reserves to be sufficient to pay insured claims, but does not want to be
penalized with an excessive premium for that guarantee.
The main purpose of the reserving process can simply be described as
follows. From a certain date, agreed in advance, an insurer is liable for
all claims incurred. The costs that such a claim entails are divided into
two categories: those that have already been paid and those that have not
yet been (fully) paid. The main purpose of the reserving process is now to
estimate the costs that the company has not yet paid. The distribution
of possible aggregate unpaid claims can be represented as a probability
density function. Much has already been written about the statistical
distributions that are suited to the study of risks and insurance. In
practice one does not have full information on the underlying
distributions. One therefore often has to rely on limited information,
such as estimates of the first moments of the distribution. Not only the
basic risk measures but also more sophisticated measures (such as
skewness measures, extreme percentiles of the distribution, ...) that
require deeper insight into the underlying distribution are of great
importance. Computing the first moments can be seen as a first attempt to
learn more about the properties of a distribution. Moreover, the variance
is not the most suitable risk measure for determining the solvency
requirements of an insurance portfolio. As a two-sided risk measure it
takes both positive and negative deviations into account, which leads to
an underestimation of the reserve in the case of a skewed distribution.
In addition, this measure does not emphasize the tail properties of the
distribution. In this case it seems more appropriate to use the VaR (the
p-th quantile) or even the TVaR (which essentially amounts to an average
of all quantiles above a predefined level p). Risk measures based on
stop-loss premiums (e.g. the expected shortfall) can also be used in this
context. Obtaining the distribution, from which all kinds of measures can
then be computed, is the ultimate goal. These trends are also reflected in
the current banking and insurance regulations (Basel 2 and Solvency 2),
which emphasize the risk-based approach in ALM. This requires a new
methodological approach that makes it possible to obtain more
sophisticated information about the underlying risks.
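The risk measures named in this paragraph can be illustrated on a sample of
simulated or observed outcomes. The following Python sketch is a minimal
empirical version under stated assumptions: the function name, the sample
and the retention level are hypothetical, VaR is taken as the smallest
observation whose empirical distribution function reaches p, and TVaR is
obtained from the identity TVaR_p = VaR_p + E[(S - VaR_p)_+] / (1 - p).

```python
import math


def empirical_risk_measures(sample, p, retention):
    """Return (VaR_p, TVaR_p, stop-loss premium at `retention`) of a sample."""
    xs = sorted(sample)
    n = len(xs)
    # VaR_p: smallest observation x with empirical d.f. F_n(x) >= p.
    # The tiny offset guards against floating-point noise in p * n.
    k = max(math.ceil(p * n - 1e-12) - 1, 0)
    var_p = xs[k]
    # Stop-loss premium E[(S - d)_+] with retention d.
    stop_loss = sum(max(x - retention, 0.0) for x in xs) / n
    # TVaR_p = VaR_p + E[(S - VaR_p)_+] / (1 - p).
    tvar_p = var_p + sum(max(x - var_p, 0.0) for x in xs) / (n * (1.0 - p))
    return var_p, tvar_p, stop_loss


# Hypothetical sample: the outcomes 1, 2, ..., 100, each equally likely.
sample = [float(i) for i in range(1, 101)]
var_p, tvar_p, stop_loss = empirical_risk_measures(sample, p=0.95, retention=90.0)
```

For this artificial sample the 95% VaR is the 95th observation and the TVaR
is the average of the five observations above it, which shows how TVaR looks
further into the tail than VaR does.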
In the current actuarial literature we find little about an appropriate
method for computing the distribution of reserve outcomes. Several methods
exist to approximate efficiently the distribution of sums of independent
risks (such as Panjer's recursion, convolution, ...). Moreover, if the
number of risks in a portfolio is large enough, one can use the Central
Limit Theorem to approximate the aggregate claims by the normal
distribution. Even when this independence assumption does not hold (e.g.
when the assumption of independence is rejected on the basis of
statistical tests), this approximation is widely used in practice because
of its mathematical simplicity. In a number of practical applications,
however, the independence assumption is violated, which can lead to a
significant underestimation of the risk of the portfolio. This is the
case, among others, when the actuarial technical risk is combined with the
financial investment risk.
In contrast to banking, the concept of stochastic interest rates has
surfaced only recently in insurance. Traditionally, actuaries rely on
deterministic interest rates. Such a simplification makes it possible to
determine risk measures (such as the mean, the standard deviation, upper
quantiles, ...) of financial contracts efficiently. Because of the high
uncertainty about future investment results, however, actuaries are forced
to make conservative assumptions in order to calculate insurance premiums
and mathematical reserves. As a consequence, the diversification effects
of returns in different investment periods cannot be taken into account.
By this we mean that poor investment results in certain periods are
usually compensated by very good results in other periods. These
additional costs are passed on either to the insureds, who have to pay
higher premiums, or to the shareholders, who have to provide more economic
capital. The importance of introducing models with stochastic interest
rates is therefore well understood in the actuarial world. The latest
banking and insurance regulations (Basel 2, Solvency 2) also underline
this importance. These regulations emphasize the risk-based approach to
determining economic capital. Projecting cash flows with stochastic
returns is also important in the pricing of insurance applications such as
the ‘embedded value’ (the present value of cash flows generated by the
policies in force) and the ‘appraisal value’ (the present value of cash
flows generated by the policies in force and by policies that will be
written in the future).
A mathematical description of the problem outlined above can be summarized
as follows. Let $X_i$ ($i = 1, \dots, n$) be a random amount that has to be
paid at time $t_i$ and let $V_i$ be the discount factor over the period
$[0, t_i]$. We then consider the present value of future payments, which
can be written as a scalar product of the form
\[ S = \sum_{i=1}^{n} X_i V_i. \tag{N.1} \]
The random vector $\vec{X} = (X_1, X_2, \dots, X_n)$ may for instance represent the
insurance or credit risk, while the vector $\vec{V} = (V_1, V_2, \dots, V_n)$ represents
the financial/investment risk. In general we assume that these vectors are
mutually independent. In practical applications this independence
assumption may sometimes be violated, e.g. by an inflation factor with a
strong influence on payment and investment results. One can, however, deal
with this problem by considering sums of the following form
\[ S = \sum_{i=1}^{n} \tilde{X}_i \tilde{V}_i, \]
where $\tilde{X}_i = X_i / Z_i$ and $\tilde{V}_i = V_i Z_i$ are the adjusted values expressed
in real terms ($Z_i$ is an inflation factor over the period $[0, t_i]$).
Therefore the independence assumption between the insurance risk and the
financial risk is realistic in many cases, and it can be used efficiently
to obtain various quantities that describe the risk in financial
institutions (e.g. discounted claims or the ‘embedded/appraisal’ value of
a company).
These distribution functions are typically complex and not readily
available, for two important reasons. First, the distribution of a sum of
random variables whose marginal distributions belong to the same
distribution class does in general not belong to that class. Second, the
stochastic dependence between the terms of the sum prevents the use of
convolution and makes matters considerably more complicated. Consequently,
approximation methods for computing functions of sums of dependent
variables become necessary. In many cases one can of course use Monte
Carlo simulation to obtain empirical distribution functions. This is,
however, typically a time-consuming approach, in particular when one
wishes to approximate tail probabilities, which requires a large number of
simulations. One therefore has to look for new alternative methods. In
this thesis we study and evaluate the most frequently used approximation
techniques for insurance applications.
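The Monte Carlo route mentioned above can be sketched in a few lines of
Python for a sum of the form (N.1). Everything specific in the sketch is an
assumption made for illustration: yearly log-returns are taken to be i.i.d.
normal, so that the discount factors $V_i = \exp(-(Y_1 + \dots + Y_i))$ are
lognormal, the payments $X_i$ are i.i.d. lognormal and independent of the
returns, and the function name and parameter values are hypothetical.

```python
import math
import random


def simulate_present_values(n_sims, n_periods, mu, sigma,
                            claim_mu, claim_sigma, seed=0):
    """Simulate S = sum_i X_i * V_i with V_i = exp(-(Y_1 + ... + Y_i)).

    Y_j ~ N(mu, sigma^2) are i.i.d. yearly log-returns (assumption) and
    X_i ~ lognormal(claim_mu, claim_sigma) are i.i.d. payments (assumption).
    """
    rng = random.Random(seed)
    results = []
    for _ in range(n_sims):
        log_discount = 0.0
        total = 0.0
        for _ in range(n_periods):
            log_discount -= rng.gauss(mu, sigma)            # accumulate -Y_j
            payment = rng.lognormvariate(claim_mu, claim_sigma)
            total += payment * math.exp(log_discount)        # X_i * V_i
        results.append(total)
    return results


# Hypothetical parameters, for illustration only.
sims = simulate_present_values(20000, 10, mu=0.05, sigma=0.1,
                               claim_mu=0.0, claim_sigma=0.25)
mean_estimate = sum(sims) / len(sims)
```

Under these assumptions the mean of S is available in closed form, which
gives a simple check of the simulation. Estimating far tail probabilities in
the same way would need many more draws, which is the drawback noted in the
text.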
The central idea in this work is the concept of comonotonicity. We
propose to solve the problem set out above by computing lower and upper
bounds for the sum of dependent variables using the available
information. These bounds are based on a general technique for computing
lower and upper bounds for stop-loss premiums of a sum of dependent
variables, as explained in Kaas et al. (2000).
The first approximation for the distribution function of the discounted
reserve is derived by approximating the dependence structure between the
random variables involved by a comonotonic dependence structure. In this
way the multidimensional problem is reduced to a two-dimensional problem,
which can be solved by conditioning and by using simple numerical
techniques. This approximation is plausible in actuarial applications
since it leads to prudent and conservative values of the reserves and the
solvency margins. If the underlying dependence structure is strong enough,
this upper bound in convex order gives satisfactory results.
The second approximation, which is derived by considering conditional
expectations, takes part of the dependence structure into account. This
lower bound in convex order is very useful for evaluating the quality of
the upper bound as an approximation and can also be used as an
approximation of the underlying distribution. Although this choice is not
(actuarially) prudent, the relative error of this approximation is
significantly smaller than the relative error of the upper bound. The
lower bound will therefore be preferred in applications that require a
high accuracy of the approximations used (such as the pricing of exotic
options or strategic portfolio selection problems).
This thesis is organized as follows.
The first chapter reviews the basics of actuarial risk theory. We define
some frequently used dependence measures and the most important orderings
of risks for actuarial applications. We further introduce several
well-known risk measures and the relations that hold between them. The
first chapter also provides a theoretical background for the concept of
comonotonicity and reviews the most important properties of comonotonic
risks.
In Chapter 2 we recall how the convex bounds can be derived and we
illustrate the theoretical results by means of an application to
discounted reserves. The advantage of working with a sum of comonotonic
variables lies in the simple calculation of its distribution. In
particular, this technique is very useful for obtaining reliable estimates
of upper quantiles and stop-loss premiums.
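The computational convenience of a comonotonic sum comes from the additivity
of quantiles: the p-quantile of the comonotonic sum is the sum of the
p-quantiles of its terms. The following Python sketch shows this for
lognormal terms, where each quantile is exp(mu_i + sigma_i * Phi^{-1}(p)).
The function name and the parameter values are hypothetical and serve only
as an illustration.

```python
import math
from statistics import NormalDist


def comonotonic_sum_quantile(p, mus, sigmas):
    """p-quantile of the comonotonic sum of lognormal(mu_i, sigma_i) terms.

    Quantiles are additive for comonotonic sums, so the result is the sum of
    the marginal lognormal quantiles exp(mu_i + sigma_i * z_p).
    """
    z_p = NormalDist().inv_cdf(p)  # standard normal quantile
    return sum(math.exp(m + s * z_p) for m, s in zip(mus, sigmas))


# Hypothetical parameters for three lognormal terms.
mus = [0.0, -0.05, -0.10]
sigmas = [0.10, 0.15, 0.20]
median = comonotonic_sum_quantile(0.5, mus, sigmas)
q995 = comonotonic_sum_quantile(0.995, mus, sigmas)
```

No simulation or convolution is needed, which is why upper quantiles of the
comonotonic upper bound are cheap to evaluate.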
In practical applications the upper bound is only useful if the
dependence between successive terms of the sum is strong enough. But even
then these approximations for stop-loss premiums are not satisfactory. In
this chapter we propose a number of techniques to determine more efficient
upper bounds for stop-loss premiums. For this we use, on the one hand, the
conditioning method as in Curran (1994) and in Rogers & Shi (1995) and, on
the other hand, the traditional lower and upper bounds for stop-loss
premiums of sums of dependent random variables. We also show how these
results can be applied in the special case of lognormal random variables.
Such sums are often encountered in practice, both in the actuarial and in
the financial world.
We derive comonotonic approximations for the scalar product of random
vectors of the form (N.1). A general procedure for computing accurate
estimates of quantiles and stop-loss premiums is set out. We study the
distribution function of the present value of a series of random payments
in a stochastic financial environment described by a lognormal discounting
process. Such distributions occur frequently in a wide range of insurance
and financial applications. We obtain accurate approximations by
developing lower and upper bounds in convex order for such present value
functions. We consider several applications to discounted claim processes
in the Black & Scholes setting. In particular, we analyze in detail the
cases where the random variables $X_i$ represent insurance losses modelled
by lognormal, normal (more generally elliptical) and gamma or inverse
Gaussian (more generally tempered stable) distributions. By means of a
series of numerical illustrations we show that the method provides very
accurate and easily obtained approximations for distribution functions of
random variables of the form (N.1).
In Chapters 3 and 4 we apply the results obtained to two important
reserving problems in insurance and we illustrate the approximations both
numerically and graphically.
In Chapter 3 we consider an important application in the field of life
insurance. We seek conservative estimates for quantiles and stop-loss
premiums of an annuity and of an entire portfolio of annuities. Similar
techniques can be used to obtain estimates for more general life insurance
products. Our technique makes it possible to solve ‘personal finance’
problems very accurately.
The case of a portfolio of annuities has already been studied extensively
in the literature, but only in the limiting case: for homogeneous
portfolios, when the mortality risk is fully diversified. The
applicability of these results in insurance practice can, however, be
questioned, in particular here, since a typical portfolio does not contain
enough policies to speak of full diversification. We therefore propose to
approximate the number of active policies in successive years using a
‘normal power’ distribution and to model the present value of the future
benefits as a scalar product of mutually independent vectors.
Chapter 4 focuses on the claims reserving problem. Correctly estimating
the amount that a company has to set aside to meet the obligations
(claims) that arise in the future is an important task for insurance
companies in order to obtain a correct picture of their liabilities. The
historical data needed to obtain estimates of future payments are usually
presented as incremental payments in triangular form. The aim is to
complete this run-off triangle to a square, and possibly to a rectangle
if estimates are needed for development years for which no data are
included in the triangle. For this the actuary can use a number of
techniques. The intrinsic uncertainty is described by the distribution of
possible outcomes, and one always looks for the best estimate of the
reserve. Claims reserving deals with the determination of the uncertain
present value of an unknown amount of future payments. Since this amount
is very important for an insurance company and its policyholders, the
intrinsic uncertainties are no excuse for neglecting a scientific
analysis. For the reserve estimate to truly represent the actuary's best
estimate, both the determination of the expected value of unpaid claims
and the appropriate discount rate must reflect the actuary's best estimate
(by this we mean that they should not be imposed by others or by
legislation). Since the reserve is a provision for future payments of
unsettled claims, we believe that the estimated claims reserve should
reflect the time value of money. In many situations the discounted reserve
is useful, e.g. in dynamic financial analysis, profit determination and
pricing, risk capital, loss portfolio transfers, .... Ideally the
discounted reserve should also be acceptable for reporting. Current
legislation, however, usually does not allow this. Undiscounted reserves
in fact contain a certain risk margin that depends on the level of the
interest rate. In this chapter we consider the discounted IBNR reserve and
impose an implicit margin based on a risk measure of the distribution of
the total discounted reserve. We model the claim payments using lognormal
linear models, loglinear location-scale models and generalized linear
models, and we derive accurate comonotonic approximations for the
discounted reserve.
The bootstrap technique has proved to be very useful in many statistical
applications and can be particularly interesting for assessing the
variability of the claim predictions and, moreover, for constructing upper
bounds at an appropriate confidence level. Its popularity is due to a
combination of computing power and theoretical development. An advantage
of the bootstrap approach is that the technique can be applied to any data
set without assuming an underlying distribution. Moreover, most software
can handle very large numbers of bootstrap iterations.
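The idea of resampling without a distributional assumption can be sketched as
follows in Python. This is a plain nonparametric bootstrap of a statistic
with a percentile upper bound, given only as an illustration: the function
name, the data and the settings are hypothetical, and bootstrap schemes used
for run-off triangles usually resample residuals of a fitted model instead
of the raw observations.

```python
import random


def bootstrap_upper_bound(data, statistic, level=0.95, n_boot=2000, seed=0):
    """Percentile bootstrap upper bound for `statistic` at the given level."""
    rng = random.Random(seed)
    n = len(data)
    replicates = sorted(
        statistic(rng.choices(data, k=n))  # resample with replacement
        for _ in range(n_boot)
    )
    index = min(int(level * n_boot), n_boot - 1)
    return replicates[index]


def mean(values):
    return sum(values) / len(values)


# Hypothetical data: twenty observed amounts.
data = [float(i) for i in range(1, 21)]
upper = bootstrap_upper_bound(data, mean)
```

The bound is read directly from the ordered bootstrap replicates, so no
underlying distribution has to be specified.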
In Chapter 5 we derive other methods for obtaining approximations for S.
We also briefly review and evaluate some existing techniques. In the first
section of this chapter we recall two well-known moment-based
approximations: the lognormal and the inverse gamma approximation.
Practitioners often use a moment-based lognormal approximation for the
distribution of S. These approximations are chosen such that their first
two moments coincide with the corresponding moments of S.
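Matching the first two moments with a lognormal law has a closed-form
solution, sketched below in Python. Given a mean m and a variance v of S,
the lognormal parameters are sigma^2 = ln(1 + v / m^2) and
mu = ln(m) - sigma^2 / 2. The function names and the numerical inputs are
hypothetical and are not taken from the thesis.

```python
import math
from statistics import NormalDist


def lognormal_moment_match(mean, variance):
    """Return (mu, sigma) of the lognormal law with the given mean and variance."""
    sigma2 = math.log(1.0 + variance / mean ** 2)
    mu = math.log(mean) - 0.5 * sigma2
    return mu, math.sqrt(sigma2)


def lognormal_quantile(p, mu, sigma):
    """p-quantile of the lognormal(mu, sigma) law."""
    return math.exp(mu + sigma * NormalDist().inv_cdf(p))


# Hypothetical moments of S, for illustration only.
mu, sigma = lognormal_moment_match(mean=8.0, variance=4.0)
q99 = lognormal_quantile(0.99, mu, sigma)
```

The matched law reproduces the mean and variance exactly, but nothing
guarantees that its tail agrees with the tail of S, which is one reason why
such approximations are compared with other methods.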
Although the comonotonic approximations in convex order have proved to be
good approximations when the underlying variability is small, they perform
considerably worse when the variance increases. We therefore look here at
approximations for functions of sums of dependent variables that are based
on asymptotic results. Although asymptotic results are valid at infinity,
they can also be useful as approximations near infinity. We derive some
asymptotic results for the tail probability of a sum of heavy-tailed
dependent variables.
Since 1990 applied Bayesian research has seen enormous growth among
statisticians. This explosion has had little to do with a growing interest
of statisticians and econometricians in the theoretical foundations of
Bayesian analysis, or with a sudden awareness of the advantages of the
Bayesian approach over frequentist methods; it has above all a pragmatic
basis. The development of powerful computational tools (and the
realization that existing statistical tools can be useful for fitting
Bayesian models) has attracted a large number of researchers to use the
Bayesian approach in practice. The use of such methods enables researchers
to estimate complicated statistical models that would be rather difficult,
if not impossible, to estimate with standard frequentist techniques. In
this section we sketch in fairly general terms the basic elements of
Bayesian computation. Bayesian inference amounts to fitting a probability
model to a data set and summarizing the result by a probability
distribution on the model parameters and on unobserved quantities such as
predictions for new observations. Simple simulation methods exist for
drawing a sample from the posterior and predictive distributions, in which
the uncertainty in the model parameters is automatically taken into
account. An advantage of the Bayesian approach is that, using simulation,
we can always compute the posterior predictive distribution, so that we do
not have to put much effort into estimating the sampling distribution of
test statistics.
Finally, we compare these approximations with the comonotonic
approximations from the previous chapter in the context of the claims
reserving problem. When the underlying variance of the statistical and
financial part of the discounted IBNR reserve becomes larger, the
comonotonic approximations perform poorly. We illustrate this by means of
a simple example and propose the asymptotic results from the previous
chapter as an alternative. We also compare all these results with the
lognormal moment-based approximations. Finally, we also look at the
distribution of the discounted reserve when the data in the run-off
triangle are modelled by a generalized linear model, and we compare the
results of the comonotonic approximations with the Bayesian
approximations.
Bibliography
[1] Ahcan A., Darkiewicz G., Goovaerts M.J. & Hoedemakers T. (2005).
“Computation of convex bounds for present value functions of ran-
dom payments”, Journal of Computational and Applied Mathema-
tics, to appear.
[2] Albrecher H., Dhaene J., Goovaerts M.J. & Schoutens W. (2005).
“Static hedging of Asian options under Lévy models: The
comonotonicity approach”, The Journal of Derivatives, 12(3), 63-72.
[3] Antonio K., Beirlant J. & Hoedemakers T. (2005). Discussion of
“A Bayesian generalized linear model for the Bornhuetter-Ferguson
method of claims reserving” by Richard Verrall, North American
Actuarial Journal, to be published.
[4] Arnold L. (1974). Stochastic Differential Equations: Theory and Ap-
plications, Wiley, New York.
[5] Artzner P. (1999). “Application of coherent risk measures to capital
requirements in insurance”, North American Actuarial Journal, 3(2),
11–25.
[6] Artzner P., Delbaen F., Eber J.M. & Heath D. (1999). “Coherent
measures of risk”, Mathematical Finance, 9, 203–228.
[7] Barnett G. & Zehnwirth B. (2000). “Best estimates for reserves”,
Proceedings of the Casualty Actuarial Society, 87(2), 245–321.
[8] Beekman J.A. & Fuelling C.P. (1990). “Interest and mortality ran-
domness in some annuities”, Insurance: Mathematics & Economics,
9(2-3), 185–196.
[9] Beekman J.A. & Fuelling C.P. (1991). “Extra randomness in certain
annuity models”, Insurance: Mathematics & Economics, 10(4), 275–
287.
[10] Beekman J.A. & Fuelling C.P. (1993). “One approach to dual ran-
domness in life insurance”, Scandinavian Actuarial Journal, 76(2),
173–82.
[11] Bellhouse D.R. & Panjer H.H. (1981). “Stochastic modeling of inter-
est rates with applications to life contingencies - Part II”, Journal of
Risk and Insurance, 48(4), 628–637.
[12] Beirlant J., Goegebeur Y., Segers J. & Teugels J. (2004). Statistics
of Extremes: Theory and Applications, Wiley, New York.
[13] Bingham N.H., Goldie C.M. & Teugels J.L. (1987). Regular Varia-
tion, Cambridge University Press, Cambridge.
[14] Black F. & Scholes M. (1973). “The pricing of options and corporate
liabilities”, Journal of Political Economy, 81, 637–659.
[15] Blum K.A. & Otto D.J. (1998). “Best estimate loss reserving: an
actuarial perspective”, Casualty Actuarial Society Forum Fall 1998,
55-101.
[16] Boyle P.P. (1976). “Rates of return as random variables”, Journal of
Risk and Insurance, 43(4), 693–711.
[17] Bowers N.L., Gerber H.U., Hickman J.C., Jones D.A. & Nesbitt C.J.
(1986). Actuarial Mathematics, Schaumburg, Ill.: Society of Actuar-
ies.
[18] Breiman L. (1965). “On some limit theorems similar to the arc-sin
law”, Theory of Probability and Its Applications, 10(2), 323–331.
[19] Bühlmann H., Gagliardi B., Gerber H.U. & Straub E. (1977). “Some
inequalities for stop-loss premiums”, ASTIN Bulletin, 9, 75-83.
[20] Cesari R. & Cremonini D. (2003). “Benchmarking, portfolio insur-
ance and technical analysis: a Monte Carlo comparison of dynamic
strategies of asset allocation”, Journal of Economic Dynamics and
Control, 27, 987–1011.
[21] Christofides S. (1990). “Regression models based on log-incremental
payments”, Claims Reserving Manual, 2, Institute of Actuaries, Lon-
don.
[22] Cline D.B.H. (1986). “Convolution tails, product tails and domains
of attraction”, Probability Theory and Related Fields, 72, 529–557.
[23] Cohen A.C. & Whitten B.J. (1988). Parameter estimation in relia-
bility and life span models, Marcel Dekker, Inc., New York.
[24] Cordeiro G.M. & McCullagh P. (1991). “Bias correction in generalized
linear models”, Journal of the Royal Statistical Society B, 53(3),
629–643.
[25] Curran M. (1994). “Valuing Asian and portfolio options by condi-
tioning on the geometric mean price”, Management Science, 40(12),
1705–1711.
[26] Darkiewicz G., Dhaene J. & Goovaerts M.J. (2005a). “Risk mea-
sures and dependencies of risks”, Brazilian Journal of Probability
and Statistics, to appear.
[27] Darkiewicz G. (2005b). Value-at-Risk in Insurance and Finance: the
Comonotonicity Approach, PhD Thesis, K.U. Leuven, Faculty of
Economics and Applied Economics, Leuven.
[28] Davison A.C. & Hinkley D.V. (1997). Bootstrap Methods and their
Application, Cambridge Series in Statistical and Probabilistic Ma-
thematics, Cambridge University Press.
[29] De Alba E. (2002). “Bayesian estimation of outstanding claims re-
serves”, North American Actuarial Journal, 6(4), 1–20.
[30] Debicka J. (2003). “Moments of the cash value of future payment
streams arising from life insurance contracts”, Insurance: Mathema-
tics & Economics, 33(3), 533–550.
[31] Decamps M., De Schepper A. & Goovaerts M.J. (2004). “Pricing
exotic options under local volatility”, Proceedings of the second In-
ternational Workshop on Applied Probability (IWAP), Athens.
[32] Deelstra G., Liinev J. & Vanmaele M. (2004). “Pricing of arithmetic
basket options by conditioning”, Insurance: Mathematics &
Economics, 34(1), 55–77.
[33] Denuit M. & Dhaene J. (2003). “Simple characterizations of comonotonicity
and countermonotonicity by extremal correlations”, Belgian
Actuarial Bulletin, 3, 22–27.
[34] Denuit M. & Dhaene J. (2004). “Dependent risks”, Encyclopedia of
Actuarial Science, Wiley, Vol. I, 464–471.
[35] Devroye L. (1986). Non-Uniform random variate generation,
Springer-Verlag, New York.
[36] De Vylder F. & Goovaerts M.J. (1979). Proceedings of the first meeting
of the contact group “Actuarial Sciences”, K.U.Leuven, nr 7904B,
wettelijk Depot: D/1979/23761/5.
[37] De Vylder F. & Goovaerts M.J. (1982). “Upper and lower bounds on
stop-loss premiums in case of known expectation and variance of the
risk variable”, Mitt. Verein. Schweiz. Versicherungsmath., 149–164.
[38] Dhaene J. (1989). “Stochastic interest rates and autoregressive inte-
grated moving average processes”, ASTIN Bulletin, 19(2), 131–138.
[39] Dhaene J. (1990). “Distributions in life insurance”, ASTIN Bulletin,
20(1), 81–92.
[40] Dhaene J., Wang S., Young V. & Goovaerts M.J. (2000).
“Comonotonicity and maximal stop-loss premiums”, Mitteilungen
der Schweiz. Aktuarvereinigung, 2000(2), 99–113.
[41] Dhaene J., Denuit M., Goovaerts M.J., Kaas R. & Vyncke D.
(2002a). “The concept of comonotonicity in actuarial science and
finance: Theory”, Insurance: Mathematics & Economics, 31(1), 3–
33.
[42] Dhaene J., Denuit M., Goovaerts M.J., Kaas R. & Vyncke D.
(2002b). “The concept of comonotonicity in actuarial science and fi-
nance: Applications”, Insurance: Mathematics & Economics, 31(2),
133–161.
[43] Dhaene J., Goovaerts M.J. & Kaas R. (2003). “Economical capital
allocation derived from risk measures”, North American Actuarial
Journal, 7(2), 44–59.
[44] Dhaene J., Vanduffel S., Tang Q., Goovaerts M.J., Kaas R. & Vyncke
D. (2004). “Solvency capital, risk measures and comonotonicity: a re-
view”, Research Report OR 0416, Department of Applied Economics,
K.U.Leuven.
[45] Dhaene J., Vanduffel S., Goovaerts M.J., Kaas R. & Vyncke D.
(2005). “Comonotonic approximations for optimal portfolio selection
problems”, Journal of Risk and Insurance, 72(2), 253–301.
[46] Doray L.G. (1994). “IBNR reserve under a loglinear location-scale
regression model”, Casualty Actuarial Society Forum 1994, 2,
607–652.
[47] Doray L.G. (1996). “UMVUE of the IBNR reserve in a lognormal lin-
ear regression model”, Insurance: Mathematics & Economics, 18(1),
43–58.
[48] Dufresne D. (1990). “The distribution of a perpetuity with applica-
tions to risk theory and pension funding”, Scandinavian Actuarial
Journal, 9, 39–79.
[49] Dufresne D. (2002). “Asian and basket asymptotics”, Research Paper
No. 100, Centre for Actuarial Studies, University of Melbourne.
[50] Dufresne D. (2004). “Stochastic life annuities”, Research Paper,
Centre for Actuarial Studies, University of Melbourne.
[51] Efron B. (1979). “Bootstrap methods: another look at the jackknife”,
Ann. Statist., 7, 1–26.
[52] Efron B. & Tibshirani R.J. (1993). An Introduction to the Bootstrap,
Chapman and Hall, New York.
[53] Embrechts P., Klüppelberg C. & Mikosch T. (1997). Modelling
Extremal Events for Insurance and Finance, Springer, Berlin.
[54] England P.D. & Verrall R.J. (1999). “Analytic and bootstrap estimates
of prediction errors in claim reserving”, Insurance: Mathematics &
Economics, 25(3), 281–293.
[55] England P.D. & Verrall R.J. (2001). “A flexible framework for
stochastic claims reserving”, Proceedings of the Casualty Actuarial
Society, 88(1), 1–38.
[56] England P.D. & Verrall R.J. (2002). “Stochastic claims reserving in
general insurance”, British Actuarial Journal, 8(3), 443–518.
[57] Fang K.T., Kotz S. & Ng K.W. (1990). Symmetric Multivariate and
Related Distributions, Chapman & Hall, London.
[58] Feller W. (1971). An Introduction to Probability Theory and Its Ap-
plications, Wiley, New York.
[59] Frees E. (1990). “Stochastic life contingencies with solvency consid-
erations”, Transactions of the Society of Actuaries, 42, 91–129.
[60] Gilks W.R., Richardson S. & Spiegelhalter D.J. (1996). Markov
Chain Monte Carlo in Practice, Chapman and Hall, London.
[61] Goovaerts M.J., Kaas R., Van Heerwaarden A.E. & Bauwelinckx T.
(1990). Effective Actuarial Methods, North-Holland, Amsterdam.
[62] Goovaerts M.J. & Redant H. (1999). “On the distribution of IBNR
reserves”, Insurance: Mathematics & Economics, 25(1), 1–9.
[63] Goovaerts M.J., Dhaene J. & De Schepper A. (2000). “Stochastic
upper bounds for present value functions”, Journal of Risk and
Insurance, 67(1), 1–14.
[64] Goovaerts M.J., Kaas R., Dhaene J. & Tang Q. (2003). “A unified
approach to generate risk measures”, ASTIN Bulletin, 33(2), 173–
192.
[65] Goovaerts M.J., Kaas R., Dhaene J. & Tang Q. (2004). “Some new
classes of consistent risk measures”, Insurance: Mathematics & Eco-
nomics, 34(3), 505–516.
[66] Heerwaarden A.E. van (1991). Ordering of Risks: Theory and Actu-
arial Applications, Thesis Publishers, Amsterdam.
[67] Hoedemakers T., Beirlant J., Goovaerts M.J. & Dhaene J. (2003).
“Confidence bounds for discounted loss reserves”, Insurance: Mathe-
matics & Economics, 33(2), 297–316.
[68] Hoedemakers T. & Goovaerts M.J. (2004). Discussion of “Risk and
discounted loss reserves” by Greg Taylor, North American Actuarial
Journal, 8(4), 146–150.
[69] Hoedemakers T., Beirlant J., Goovaerts M.J. & Dhaene J. (2005).
“On the distribution of discounted loss reserves using generalized
linear models”, Scandinavian Actuarial Journal, 2005(1), 25–45.
[70] Hoedemakers T., Darkiewicz G. & Goovaerts M.J. (2005). “Approx-
imations for life annuity contracts in a stochastic financial environ-
ment”, Insurance: Mathematics & Economics, to be published.
[71] Hoedemakers T., Darkiewicz G., Deelstra G., Dhaene J. & Vanmaele
M. (2005). “Bounds for stop-loss premiums of stochastic sums (with
applications to life contingencies)”, Research Report OR 0523, De-
partment of Applied Economics, K.U.Leuven.
[72] Huang H., Milevsky M.A. & Wang J. (2004). “Ruined moments in
your Life: how good are the approximations?”, Insurance: Mathe-
matics & Economics, 34(3), 421–447.
[73] Hürlimann W. (1996). “Improved analytical bounds for some risk
quantities”, ASTIN Bulletin, 26(2), 185–199.
[74] Hürlimann W. (1998). “On best stop-loss bounds for bivariate sums
by known marginal means, variances and correlation”, Mitt. Verein.
Schweiz. Versicherungsmath., 111–134.
[75] Ibbotson Associates (2002). Stocks, Bonds, Bills and Inflation: 1926-
2001, Chicago, IL.
[76] Jansen K., Haezendonck J. & Goovaerts M.J. (1986). “Upper bounds
on stop-loss premiums in case of known moments up to the fourth
order”, Insurance: Mathematics & Economics, 5(4), 315–334.
[77] Jeffreys H. (1946). “An invariant form for the prior probability in
estimation problems”, Proc. Roy. Soc. London Ser. A, 196, 453–461.
[78] Kaas R., Van Heerwaarden A.E. & Goovaerts M.J. (1998). Ordering
of Actuarial Risks, Caire Education Series 1, Caire, Brussels.
[79] Kaas R., Dhaene J. & Goovaerts M.J. (2000). “Upper and lower
bounds for sums of random variables”, Insurance: Mathematics &
Economics, 27(2), 151–168.
[80] Kaas R., Goovaerts M.J., Dhaene J. & Denuit M. (2001). Modern
Actuarial Risk Theory, Kluwer Academic Publishers.
[81] Kalbfleisch J.D. & Prentice R.L. (1980). The Statistical Analysis of
Failure Time Data, Wiley, New York.
[82] Karatzas I. & Shreve S.E. (1991). Brownian Motion and Stochastic
Calculus, Springer-Verlag, New York.
[83] Kass R.E. & Wasserman L. (1996). “The selection of prior distributions
by formal rules”, Journal of the American Statistical Association,
91, 1343–1370.
[84] Kremer E. (1982). “IBNR-claims and the two-way model of
ANOVA”, Scandinavian Actuarial Journal, 47–55.
[85] Laeven R.J.A., Goovaerts M.J. & Hoedemakers T. (2005). “Some
asymptotic results for sums of dependent random variables with ac-
tuarial applications”, Insurance: Mathematics & Economics, to be
published.
[86] Landsman Z. & Valdez E.A. (2003). “Tail conditional expectations
for elliptical distributions”, North American Actuarial Journal, 7,
55–71.
[87] Lawless J.F. (1982). Statistical Models and Methods for Lifetime
Data, Wiley, New York.
[88] Lehmann E. (1955). “Ordered families of distributions”, Ann. Math.
Statist., 26, 399–419.
[89] Lowe J. (1994). “A practical guide to measuring reserve variabil-
ity using: Bootstrapping, operational time and a distribution free
approach”, Proceedings of the 1994 General Insurance Convention,
Institute of Actuaries and Faculty of Actuaries.
[90] Mack T. (1991). “A simple parametric model for rating automo-
bile insurance or estimating IBNR claims reserves”, ASTIN Bulletin,
21(1), 93–109.
[91] Mack T. (1993). “Distribution free calculation of the standard error
of chain ladder reserve estimates”, ASTIN Bulletin, 23(2), 213–225.
[92] Mack T. (1994). “Measuring the variability of chain-ladder reserve
estimates”, Casualty Actuarial Society Forum Spring 1994, 1,
101–182.
[93] McCullagh P. & Nelder J.A. (1992). Generalized Linear Models, 2nd
edition, Chapman and Hall, New York.
[94] Merton R. (1971). “Optimum consumption and portfolio rules in a
continuous-time model”, Journal of Economic Theory, 3, 373–413.
[95] Merton R. (1990). Continuous Time Finance, Blackwell, Cambridge.
[96] Michael J.R., Schucany W.R. & Haas R.W. (1976). “Generating ran-
dom variates using transformations with multiple roots”, The Amer-
ican Statistician, 30, 88–90.
[97] Milevsky M.A., Ho K. & Robinson C. (1997). “Asset allocation via
the conditional first exit time or how to avoid outliving your money”,
Review of Quantitative Finance and Accounting, 9(1), 53–70.
[98] Milevsky M.A. (1997). “The present value of a stochastic perpetu-
ity and the Gamma distribution”, Insurance: Mathematics & Eco-
nomics, 20(3), 243–250.
[99] Milevsky M.A. & Posner S.E. (1998). “Asian options, the sum of
lognormals, and the reciprocal gamma distribution”, Journal of Fi-
nancial and Quantitative Analysis, 33(3), 409–422.
[100] Milevsky M.A. & Robinson C. (2000). “Self-annuitization and ruin
in retirement”, North American Actuarial Journal, 4(4), 112–124.
[101] Milevsky M.A. & Wang J. (2004). “Stochastic annuities under expo-
nential mortality”, Research paper, York University and The IFID
Centre.
[102] Nielsen J.A. & Sandmann K. (2003). “Pricing bounds on Asian op-
tions”, Journal of Financial and Quantitative Analysis, 38(2), 449–
474.
[103] Norberg R. (1990). “Payment measures, interest and discounting. An
axiomatic approach with applications to insurance”, Scandinavian
Actuarial Journal, 73, 14–33.
[104] Norberg R. (1993). “A solvency study in life insurance”, Proceedings
of the Third AFIR International Colloquium, Rome, 822–830.
[105] O’Hagan A. (1994). Bayesian Inference, Kendall’s Advanced Theory
of Statistics, Arnold, London.
[106] Panjer H.H. & Bellhouse D.R. (1980). “Stochastic modeling of in-
terest rates with applications to life contingencies”, Journal of Risk
and Insurance, 47, 91–110.
[107] Panjer H.H. (1998). Financial economics: With applications to in-
vestments, insurance and pensions, Schaumburg, Ill.: Society of Ac-
tuaries.
[108] Parker G. (1994a). “Moments of the present value of a portfolio of
policies”, Scandinavian Actuarial Journal, 77(1), 53–67.
[109] Parker G. (1994b). “Stochastic analysis of portfolio of endowment
insurance policies”, Scandinavian Actuarial Journal, 77(2), 119–130.
[110] Parker G. (1994c). “Limiting distribution of the present value of a
portfolio”, ASTIN Bulletin, 24(1), 47–60.
[111] Parker G. (1994d). “Two stochastic approaches for discounting ac-
tuarial functions”, ASTIN Bulletin, 24(2), 167–181.
[112] Parker G. (1996). “A portfolio of endowment policies and its limiting
distribution”, ASTIN Bulletin, 26(1), 25–33.
[113] Parker G. (1997). “Stochastic analysis of the interaction between
investment and insurance risks”, North American Actuarial Journal,
1(2), 55–71.
[114] Pinheiro P.J.R., Andrade e Silva J.M. & de Lourdes Centeno M.
(2003). “Bootstrap methodology in claim reserving”, Journal of Risk
and Insurance, 70(4), 701–715.
[115] Renshaw A.E. (1989). “Chain ladder and interactive modelling
(claims reserving and GLIM)”, Journal of the Institute of Actuar-
ies, 116(III), 559–587.
[116] Renshaw A.E. (1994). “On the second moment properties and the
implementation of certain GLIM based stochastic claims reserving
models”, Actuarial Research Paper No. 65, Department of Actuarial
Science and Statistics, City University, London.
[117] Renshaw A.E. (1994b). “Claims reserving by joint modelling”, Actu-
arial Research Paper No. 72, Department of Actuarial Science and
Statistics, City University, London.
[118] Renshaw A.E. & Verrall R.J. (1994). “A stochastic model underlying
the chain-ladder technique”, Proceedings XXV ASTIN Colloquium,
Cannes.
[119] Rogers L.C.G. & Shi Z. (1995). “The value of an Asian option”,
Journal of Applied Probability, 32, 1077–1088.
[120] Schoutens W. (2003). Lévy Processes in Finance: Pricing Financial
Derivatives, Wiley, New York.
[121] Shaked M. & Shanthikumar J.G. (1994). Stochastic orders and their
applications, Academic Press.
[122] Simon S., Goovaerts M.J. & Dhaene J. (2000). “An easy computable
upper bound for the price of an arithmetic Asian option”, Insurance:
Mathematics & Economics, 26(2-3), 175–184.
[123] Tang Q. & Tsitsiashvili G. (2003). “Precise estimates for the ruin
probability in finite horizon in a discrete-time model with heavy-
tailed insurance and financial risks”, Stochastic Processes and their
Applications, 108, 299–325.
[124] Tang Q. & Tsitsiashvili G. (2004). “Finite and infinite time ruin
probabilities in the presence of stochastic return on investments”,
Advances in Applied Probability, 36, 1278–1299.
[125] Taylor G.C. & Ashe F.R. (1983). “Second moments of estimates of
outstanding claims”, Journal of Econometrics, 23, 37–61.
[126] Taylor G.C. (1996). “Risk, capital and profit in insurance”, SCOR
International Prize in Actuarial Science.
[127] Taylor G.C. (2000). Loss Reserving: An Actuarial Perspective,
Kluwer Academic Publishers.
[128] Taylor G.C. (2004). “Risk and discounted loss reserves”, North
American Actuarial Journal, 8(1), 37–44.
[129] Valdez E. & Dhaene J. (2004). “Bounds for sums of dependent log-
elliptical risks”, Working Paper, University of New South Wales.
[130] Vanduffel S., Hoedemakers T. & Dhaene J. (2004). “Comparing ap-
proximations for sums of non-independent lognormal random vari-
ables”, Research Report OR 0418, Department of Applied Eco-
nomics, K.U.Leuven.
[131] Vanduffel S. (2005). Comonotonicity: From Risk Measurement to
Risk Management, PhD Thesis, University of Amsterdam, Faculty
of Economics and Econometrics, Amsterdam.
[132] Vanmaele M., Deelstra G. & Liinev J. (2004a). “Approximation of
stop-loss premiums involving sums of lognormals by conditioning on
two random variables”, Insurance: Mathematics & Economics, 35(2),
343–367.
[133] Vanmaele M., Deelstra G., Liinev J., Dhaene J. & Goovaerts M.J.
(2004b). “Bounds for the price of discrete arithmetic Asian options”,
Journal of Computational and Applied Mathematics, to appear.
[134] Verrall R.J. (1989). “A state space representation of the chain-ladder
linear model”, Journal of the Institute of Actuaries, 116, 589–610.
[135] Verrall R.J. (1991). “On the unbiased estimation of reserves from
loglinear models”, Insurance: Mathematics & Economics, 10, 75–80.
[136] Verrall R.J. (2004). “A Bayesian generalized linear model for the
Bornhuetter-Ferguson method of claims reserving”, North American
Actuarial Journal, 8(3), 67–89.
[137] Vyncke D. (2003). Comonotonicity: the Perfect Dependence, PhD
Thesis, K.U. Leuven, Faculty of Sciences, Leuven.
[138] Vyncke D., Goovaerts M.J. & Dhaene J. (2004). “An accurate analytical
approximation for the price of a European-style arithmetic
Asian option”, Finance (AFFI), 25, 121–139.
[139] Wang S. & Young V.R. (1998). “Ordering risks: Expected utility
theory versus Yaari’s dual theory of risk”, Insurance: Mathematics
& Economics, 22, 235–242.
[140] Waters H.R. (1978). “The moments and distributions of actuarial
functions”, Journal of the Institute of Actuaries, 105, 61–75.
[141] Wedderburn R.W.M. (1974). “Quasi-likelihood functions, general-
ized linear models, and the Gauss-Newton method”, Biometrika, 61,
439–447.
[142] Wilkie A.D. (1976). “The rate of interest as a stochastic process:
Theory and applications”, Proceedings of the 20th International
Congress of Actuaries, Tokyo, 1, 325–337.
[143] Wolthuis H. & Van Hoek I. (1986). “Stochastic models for life con-
tingencies”, Insurance: Mathematics & Economics, 5(3), 217–254.
[144] Wright T.S. (1990). “A stochastic method for claims reserving in
general insurance”, Journal of the Institute of Actuaries, 117, 677–
731.
[145] Yaari M.E. (1987). “The dual theory of choice under risk”, Econo-
metrica, 55, 95–115.
[146] Zehnwirth B. (1989). “The chain-ladder technique - A stochastic
model”, Claims Reserving Manual, 2, Institute of Actuaries, Lon-
don.