Katholieke Universiteit Leuven

FACULTY OF SCIENCE

DEPARTMENT OF MATHEMATICS

MODERN RESERVING TECHNIQUES

FOR THE INSURANCE BUSINESS

by

Tom HOEDEMAKERS

Supervisors:

Prof. Dr. J. Beirlant

Prof. Dr. J. Dhaene

Dissertation submitted in fulfilment of

the requirements for the degree of

Doctor of Science

Leuven 2005

Acknowledgments

Four years ago I became part of the stimulating and renowned academic

environment at K. U. Leuven, the Department of Applied Economics, and

the AFI Leuven Research Center in particular. As a researcher, I had the

opportunity to interact, work with and learn from many interesting people.

I consider myself extremely fortunate to have had the support of the

following people in the realization of this thesis.

I feel very privileged to have worked with my two supervisors, Jan

Beirlant and Jan Dhaene. To each of them I owe a great debt of gratitude

for their continuous encouragement, patience, inspiration and friendship.

I especially want to thank them for the freedom they allowed me to seek

satisfaction in research, for supporting me in my choices and for believing

in me. They carefully answered the many (sometimes not well-defined)

questions that I had and they always found a way to make themselves

available for yet another meeting. Each chapter of this thesis has benefitted

from their critical comments, which often inspired me to do further research

and to improve the vital points of the argument. It has been a privilege

to study under Jan and Jan, and to them goes my highest personal and

professional respect.

I am also grateful to Marc Goovaerts for giving me the opportunity to

start my thesis in one of the world-leading actuarial research centers. Marc

Goovaerts and Jan Dhaene have taught me a great deal about the field of

actuarial science by sharing with me the joy of discovery and investigation

that is the heart of research. They brought me in contact with a lot of

interesting people in the actuarial world and gave me the possibility to

present my work at different congresses all over the world.

I would also like to thank the other members of the doctoral committee,

Michel Denuit, Rob Kaas, Wim Schoutens and Jef Teugels, for their valuable

contributions. Their detailed comments as

well as their broader reactions definitely helped me to improve the quality

of my research and its write-up.

Many thanks go also to my (ex-)colleagues Ales, Bjorn, David, Grzegorz,

Katrien, Marc, Piotr, Steven and Yuri for their enthusiasm and stimulat-

ing cooperation. A lot of sympathy goes to Emiliano Valdez for the serious

discussions, and even more important, for the fun we had during his stay

at K. U. Leuven at the beginning of this year.

After the professionals, a word of thanks is addressed to all my friends

and fellow students for their friendship and support.

Last but not least, I would like to thank my parents and my sister Leen

for their love, guidance and support. They constantly reminded me of their

confidence and encouraged me to pursue my scientific vocation, especially

in moments of doubt. You have always believed in me and that was a great

moral support.

Tom

Leuven, 2005

Table of Contents

Acknowledgments
Preface
Publications
List of abbreviations and symbols

1 Risk and comonotonicity in the actuarial world
  1.1 Fundamental concepts in actuarial risk theory
    1.1.1 Dependent risks
    1.1.2 Risk measures
    1.1.3 Actuarial ordering of risks
  1.2 Comonotonicity

2 Convex bounds
  2.1 Introduction
  2.2 Convex bounds for sums of dependent random variables
    2.2.1 The comonotonic upper bound
    2.2.2 The improved comonotonic upper bound
    2.2.3 The lower bound
    2.2.4 Moments based approximations
  2.3 Upper bounds for stop-loss premiums
    2.3.1 Upper bounds based on lower bound plus error term
    2.3.2 Bounds by conditioning through decomposition of the stop-loss premium
    2.3.3 Partially exact/comonotonic upper bound
    2.3.4 The case of a sum of lognormal random variables
  2.4 Application: discounted loss reserves
    2.4.1 Framework and notation
    2.4.2 Calculation of convex lower and upper bounds
  2.5 Convex bounds for scalar products of random vectors
    2.5.1 Theoretical results
    2.5.2 Stop-loss premiums
    2.5.3 The case of log-normal discount factors
  2.6 Application: the present value of stochastic cash flows
    2.6.1 Stochastic returns
    2.6.2 Lognormally distributed payments
    2.6.3 Elliptically distributed payments
    2.6.4 Independent and identically distributed payments
  2.7 Proofs

3 Reserving in life insurance business
  3.1 Introduction
  3.2 Modelling stochastic decrements
  3.3 The distribution of life annuities
    3.3.1 A single life annuity
    3.3.2 A homogeneous portfolio of life annuities
    3.3.3 An ‘average’ portfolio of life annuities
    3.3.4 A numerical illustration
  3.4 Conclusion

4 Reserving in non-life insurance business
  4.1 Introduction
  4.2 The claims reserving problem
  4.3 Model set-up: regression models
    4.3.1 Lognormal linear models
    4.3.2 Loglinear location-scale models
    4.3.3 Generalized linear models
    4.3.4 Linear predictors and the discounted IBNR reserve
  4.4 Convex bounds for the discounted IBNR reserve
    4.4.1 Asymptotic results in generalized linear models
    4.4.2 Lower and upper bounds
  4.5 The bootstrap methodology in claims reserving
    4.5.1 Introduction
    4.5.2 Central idea
    4.5.3 Bootstrap confidence intervals
    4.5.4 Bootstrap in claims reserving
  4.6 Three applications
    4.6.1 Lognormal linear models
    4.6.2 Loglinear location-scale models
    4.6.3 Generalized linear models
  4.7 Conclusion

5 Other approximation techniques for sums of dependent random variables
  5.1 Introduction
  5.2 Moment matching approximations
    5.2.1 Two well-known moment matching approximations
    5.2.2 Application: discounted loss reserves
  5.3 Asymptotic approximations
    5.3.1 Preliminaries for heavy-tailed distributions
    5.3.2 Asymptotic results
    5.3.3 Application: discounted loss reserves
  5.4 The Bayesian approach
    5.4.1 Introduction
    5.4.2 Prior choice
    5.4.3 Iterative simulation methods
    5.4.4 Bayesian model set-up
  5.5 Applications in claims reserving
    5.5.1 The comonotonicity approach versus the Bayesian approximations
    5.5.2 The comonotonicity approach versus the asymptotic and moment matching approximations
  5.6 Proofs

Samenvatting in het Nederlands (Summary in Dutch)
Bibliography

Preface

Uncertainty is very much a part of the world in which we live. Indeed, one

often hears the well-known cliché that the only certainties in life are death

and taxes. However, even these supposed certainties are far from being

completely certain, as any actuary or accountant can attest. For although

one’s eventual death and the requirement that one pay taxes may be facts

of life, the timing of one’s death and the amount of taxes to pay are far from

certain and are generally beyond one’s control. Uncertainty can make life

interesting. Indeed, the world would likely be a very dull place if everything

were perfectly predictable. However, uncertainty can also cause grief and

suffering.

Actuarial science is the subject whose primary focus is analyzing the

financial consequences of future uncertain events. In particular, it is con-

cerned with analyzing the adverse financial consequences of large, unpre-

dictable losses and with designing mechanisms to cushion the harmful fi-

nancial effects of such losses.

Insurance is based on the premise that individuals faced with large and

unpredictable losses can reduce the variability of such losses by forming a

group and sharing the losses incurred by the group as a whole. This im-

portant principle of loss sharing, known as the insurance principle, forms

the foundation of actuarial science. It can be justified mathematically

using the Central Limit Theorem from probability theory. For the insur-

ance principle to be valid, essentially four conditions should hold (or very

nearly hold). The losses should be unpredictable. The risks should be

independent in the sense that a loss incurred by one member of the group

makes additional losses by other members of the group no more or less

likely. The risks should be homogeneous in the sense that a loss incurred

by one member of the group is not expected to be any different in size or

likelihood from losses incurred by other members of the group. Finally,

the group should be sufficiently large so that the portion of the total loss

that each individual is required to pay becomes relatively certain. In prac-

tice, risks are not truly independent or homogeneous. Moreover, there will

always be situations where the condition of unpredictability is violated.

Actuarial science seeks to address the following three problems associated

with any such insurance arrangement:

1. Given the nature of the risk being assumed, what price (i.e. premium)

should the insurance company charge?

2. Given the nature of the overall risks being assumed, how much of

the aggregate premium income should the insurance company set

aside in a reserve to meet contractual obligations (i.e. pay insurance

claims) as they arise?

3. Given the importance to society and the general economy of having

sound financial institutions able to meet all their obligations, how

much capital should an insurance company have above and beyond

its reserves to absorb losses that are larger than expected? Given the

actual level of an insurance company’s capital, what is the probability

of the company remaining solvent?

These are generally referred to as the problems of pricing, reserving, and

solvency.

This thesis focuses on the problem of reserving and total balance sheet

requirements. A reserving analysis involves the determination of the ran-

dom present value of an unknown amount of future loss payments. For a

property/casualty insurance company this uncertain amount is usually the

most important number on its financial statement. The care and expertise

with which that number is developed are crucial to the company and to its

policyholders. It is important not to let the inherent uncertainties serve as

an excuse for providing anything less than a rigorous scientific analysis.

Among those who rely on reserve estimates, interests and priorities may

vary. To company management the reserve estimate should provide reliable

information in order to maximize the company’s viability and profitabil-

ity. To the insurance regulator, concerned with company solvency, reserves

should be set conservatively to reduce the probability of failure of the in-

surance company. To the tax agent charged with ensuring timely reporting

of earned income, the reserves should reflect actual payments as “nearly

as it is possible to ascertain them”. The policyholder is most concerned

that reserves are adequate to pay insured claims, but does not want to be

overcharged.

Setting the techniques aside, the primary goal of the reserving process

can be stated quite simply. As of a given date, an insurer is liable for

all claims incurred from that date on: both claims that arise from events

that have already occurred and claims that arise from risks covered by

the insurer but for which the uncertain event has not yet occurred. Costs

associated with these claims fall into two categories: those which have been

paid and those which have not. The primary goal of the reserving process

is to estimate those which have not yet been paid (i.e. unpaid losses). As

of a given reserve date, the distribution of possible aggregate unpaid loss

amounts may be represented as a probability density function. Much has

been written about the statistical distributions that have proven to be

most useful in the study of risk and insurance. In practice full information

about the underlying distributions is hardly ever available. For this reason

one often has to rely on partial information, for example estimates of the

first couple of moments. Not only the basic summary measures, but also

more sophisticated risk measures (such as measures of skewness or extreme

percentiles of the distribution) which require much deeper knowledge about

the underlying distributions are of interest. The computation of the first

few moments may be seen as just a first attempt to explore the properties of

a random distribution. Moreover, the variance does not in general appear to

be the most suitable risk measure to determine the solvency requirements

for an insurance portfolio. As a two-sided risk measure it takes into account

both positive and negative discrepancies, which leads to underestimation

of the reserve in the case of a skewed distribution. It also does not

emphasize the tail properties of the distribution. In this case it seems

much more appropriate to use the Value-at-Risk (the p-th quantile) or

also the Tail Value-at-Risk (which is essentially the same as an average of

all quantiles above a predefined level p). Risk measures based on stop-loss

premiums (for example the Expected Shortfall) can also be used in this

context. These trends are also reflected in the recent regulatory changes

in banking and insurance (Basel 2 and Solvency 2) which stress the role

of the risk-based approach in asset-liability management. This creates a

need for new methodological tools that make it possible to obtain more sophisticated

information about the underlying risks, such as upper quantiles, stop-loss

premiums and others.
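
As a rough illustration of the risk measures named above, the following Python sketch estimates the Value-at-Risk, a stop-loss premium and the Tail Value-at-Risk from a sample of losses. The function names and the empirical-quantile convention are assumptions made for this illustration, not notation from the thesis. The Tail Value-at-Risk is obtained from the identity TVaR_p = Q_p + E[(X - Q_p)_+] / (1 - p), which links it to the stop-loss premium with retention Q_p.

```python
import math


def value_at_risk(sample, p):
    """Empirical p-quantile: smallest observation whose empirical cdf is >= p."""
    ordered = sorted(sample)
    # Index of the first order statistic at which the empirical cdf reaches p.
    k = max(0, math.ceil(p * len(ordered)) - 1)
    return ordered[k]


def stop_loss_premium(sample, retention):
    """Empirical estimate of E[(X - retention)_+]."""
    return sum(max(x - retention, 0.0) for x in sample) / len(sample)


def tail_value_at_risk(sample, p):
    """Average of the quantiles above level p, via the stop-loss identity."""
    q = value_at_risk(sample, p)
    return q + stop_loss_premium(sample, q) / (1.0 - p)


if __name__ == "__main__":
    losses = [1, 2, 3, 4, 5, 6, 7, 8]  # purely illustrative data
    print(value_at_risk(losses, 0.75))
    print(stop_loss_premium(losses, 6))
    print(tail_value_at_risk(losses, 0.75))
```

Unlike the variance, all three quantities look only at the upper tail of the loss distribution, which is why they are better suited to solvency questions.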

There is little in the actuarial literature which considers the adequate

computation of the distribution of reserve outcomes. Several methods exist

that allow efficient approximation of the distribution functions for sums

of independent risks (e.g. Panjer’s recursion, convolution, ...). Moreover, if

the number of risks in an insurance portfolio is large enough, the Central

Limit Theorem yields a normal approximation for aggregate

claims. Therefore, even if the independence assumption is not justified (e.g.

when it is rejected by formal statistical tests), it is often used in practice

because of its mathematical convenience. In many practical applications

the independence assumption is violated, which can lead to

significant underestimation of the riskiness of the portfolio. This is the

case for example when the actuarial technical risk is combined with the

financial investment risk.

Unlike in finance, in insurance the concept of stochastic interest rates

emerged quite recently. Traditionally actuaries rely on deterministic

interest rates. Such a simplification allows summary measures of financial

contracts, such as the mean, the standard deviation or the upper quantiles,

to be treated efficiently. However, due to high uncertainty about future

investment results, actuaries are forced to adopt very conservative assumptions

in order to calculate insurance premiums or mathematical reserves. As

a result the diversification effects between returns in different investment

periods cannot be taken into account (i.e. the fact that poor investment

results in some periods are usually compensated by very good results in

others). This additional cost is transferred either to the insureds who

have to pay higher insurance premiums or to the shareholders who have

to provide more economic capital.

For these reasons the need to introduce models with stochastic interest

rates is now well understood in the actuarial world as well. The move

toward stochastic modelling of interest rates is further encouraged by

the latest regulatory changes in banking and insurance (Basel 2, Solvency

2), which promote the risk-based approach to determining economic capital:

traditional actuarial conservatism should be replaced

by fair value reserving, with the regulatory capital determined solely

on the basis of unexpected losses, which can be estimated e.g. by taking the

Value-at-Risk measure at an appropriate probability level p. Projecting cash

flows with stochastic rates of return is also crucial in pricing applications

in insurance, like the embedded value (the present value of cash flows gen-

erated only by policies-in-force) and the appraisal value (the present value

of cash flows generated both by policies-in-force and by new business, i.e.

the policies which will be written in the future).

A mathematical description of the discussed problem can be summarized

as follows.

Let $X_i$ denote a random amount to be paid at time $t_i$, $i = 1, \ldots, n$, and let

$V_i$ denote the discount factor over the period $[0, t_i]$. We will consider the

present value of future payments, which is a scalar product of the form

$$S = \sum_{i=1}^{n} X_i V_i. \tag{1}$$

The random vector $\vec{X} = (X_1, X_2, \ldots, X_n)$ may reflect e.g. the insurance

or credit risk, while the vector $\vec{V} = (V_1, V_2, \ldots, V_n)$ represents the

financial/investment risk. In general we assume that these vectors are mutually

independent. In practical applications the independence assumption is of-

ten violated, e.g. due to an inflation factor which strongly influences both

payments and investment results. One can however tackle this problem by

considering sums of the form

$$S = \sum_{i=1}^{n} \tilde{X}_i \tilde{V}_i,$$

where $\tilde{X}_i = X_i / Z_i$ and $\tilde{V}_i = V_i Z_i$ are the adjusted values expressed in

real terms ($Z_i$ denotes here an inflation factor over the period $[0, t_i]$). For this

reason the assumption of independence between the insurance risk and the

financial risk is in most cases realistic and can be efficiently applied to

obtain various quantities describing risk within financial institutions, e.g.

discounted insurance claims or the embedded/appraisal value of a com-

pany.

Typically the distribution function of such a sum is rather involved, which is

mainly due to two important reasons. First of all, the distribution of

a sum of random variables whose marginal distributions belong to the same

distribution class does not in general belong to that

class. Secondly, the stochastic dependence between the elements in the

sum precludes convolution and complicates matters considerably.

Consequently, in order to compute functionals of sums of dependent

random variables, approximation methods are generally indispensable. Pro-

vided that the whole dependency structure is known, one can use Monte

Carlo simulation to obtain empirical distribution functions. However, this

is typically a time consuming approach, in particular if we want to ap-

proximate tail probabilities, which would require an excessive number of

simulations. Therefore, alternative methods need to be explored. In this

thesis we discuss the most frequently used approximation techniques for

reserving applications.
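
The Monte Carlo approach mentioned above can be sketched in a few lines of Python. This is a minimal illustration under assumptions that are not taken from the thesis: payments are independent lognormal variables, yearly returns are independent normal variables (so the discount factors are lognormal and dependent through the accumulated return), and all parameter names and values are hypothetical.

```python
import math
import random


def simulate_present_values(n_sims, log_means, log_sd, drift, volatility, seed=0):
    """Draw n_sims outcomes of S = sum_i X_i * V_i.

    Assumed model (illustrative only):
      X_i = exp(log_means[i] + log_sd * N(0, 1))   payment at time i + 1
      V_i = exp(-(Y_1 + ... + Y_i)),  Y_j ~ N(drift, volatility^2) i.i.d.
    """
    rng = random.Random(seed)
    outcomes = []
    for _ in range(n_sims):
        accumulated_return = 0.0  # return accumulated over [0, t_i]
        total = 0.0
        for log_mean in log_means:
            accumulated_return += drift + volatility * rng.gauss(0.0, 1.0)
            payment = math.exp(log_mean + log_sd * rng.gauss(0.0, 1.0))
            total += payment * math.exp(-accumulated_return)
        outcomes.append(total)
    return outcomes


def empirical_quantile(outcomes, p):
    """Smallest simulated value whose empirical cdf is >= p."""
    ordered = sorted(outcomes)
    return ordered[max(0, math.ceil(p * len(ordered)) - 1)]


if __name__ == "__main__":
    sims = simulate_present_values(20000, [0.0] * 10, 0.2, 0.05, 0.1, seed=42)
    print(empirical_quantile(sims, 0.5), empirical_quantile(sims, 0.995))
```

Only a small fraction of the simulated values falls beyond a high quantile, so tail estimates obtained this way stabilise slowly as the number of simulations grows. That is the cost the text refers to.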

The central idea in this work is the concept of comonotonicity. We

propose to solve the problem described above by calculating upper and

lower bounds for the sum of dependent random variables making efficient

use of the available information. These bounds are based on a general

technique for deriving lower and upper bounds for stop-loss premiums of

sums of dependent random variables, as explained in Kaas et al. (2000),

Dhaene et al. (2002a,b), among others.

The first approximation we will consider for the distribution function of

the discounted reserve is derived by approximating the dependence struc-

ture between the random variables involved by a comonotonic dependence

structure. In this way the multi-dimensional problem is reduced to a two-

dimensional one which can easily be solved by conditioning and using some

numerical techniques. It is argued that this approach is plausible in ac-

tuarial applications because it leads to prudent and conservative values of

the reserves and solvency margin. If the dependency structure between

the summands of S is strong enough, this upper bound in convex order

performs reasonably well.
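
To see why the comonotonic upper bound is easy to evaluate, consider the simpler case of a sum of lognormal terms with positive weights (rather than the full scalar product with random payments). Quantiles of a comonotonic sum are the sums of the marginal quantiles, so no multi-dimensional integration is needed. The Python sketch below relies on that property; the function name, the argument names and the example parameters are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def comonotonic_quantile(weights, mus, sigmas, p):
    """p-quantile of the comonotonic sum  sum_i w_i * exp(mu_i + sigma_i * Z).

    Every term is an increasing function of the same standard normal Z, so the
    quantile of the sum is the sum of the marginal lognormal quantiles.
    Assumes all weights are positive.
    """
    z = NormalDist().inv_cdf(p)  # standard normal quantile
    return sum(w * math.exp(mu + sigma * z)
               for w, mu, sigma in zip(weights, mus, sigmas))


if __name__ == "__main__":
    weights = [100.0, 100.0, 100.0]   # hypothetical payments
    mus = [-0.05, -0.10, -0.15]       # hypothetical log-means of discount factors
    sigmas = [0.10, 0.14, 0.17]       # hypothetical log-standard deviations
    print(comonotonic_quantile(weights, mus, sigmas, 0.95))
```

Because the comonotonic sum dominates the true sum in convex order, stop-loss premiums and Tail Value-at-Risk computed from it are conservative, which is the prudence argument made in the text.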

The second approximation, which is derived by considering conditional

expectations, takes part of the dependence structure into account. This

lower bound in convex order turns out to be extremely useful to evaluate

the quality of approximation provided by the upper bound. The lower

bound can also be applied as an approximation of the underlying distribution.

This choice is not actuarially prudent; however, the relative error of

this approximation is significantly smaller than that of the upper

bound. For this reason, the lower bound will always be preferable in the

applications which require high precision of approximations, like pricing of

exotic derivatives (e.g. Decamps et al. (2004), Deelstra et al. (2004) and

Vyncke et al. (2004)) or optimal portfolio selection problems (e.g. Dhaene

et al. (2005)).

This thesis is set out as follows.

The first chapter recalls the basics of actuarial risk theory. We define

some frequently used measures of dependence and the most important or-

derings of risks for actuarial applications. We further introduce several

well-known risk measures and the relations that hold between them. We

summarize properties of these risk measures that can be used to facilitate

decision-making. Finally, we provide theoretical background for the concept

of comonotonicity and we review the most important properties of

comonotonic risks.

In Chapter 2 we recall how the comonotonic bounds can be derived and

illustrate the theoretical results by means of an application in the context

of discounted loss reserves. The advantage of working with a sum of

comonotonic variables is that the calculation of the distribution of

such a sum is quite easy. In particular this technique is very useful to find

reliable estimates of upper quantiles and stop-loss premiums.

In practical applications the comonotonic upper bound seems to be

useful only in the case of a very strong dependency between successive

summands. Even then the bounds for stop-loss premiums provided by

the comonotonic approximation are often not satisfactory. In this chapter

we present a number of techniques that make it possible to determine much more

efficient upper bounds for stop-loss premiums. To this end, we use on the

one hand the method of conditioning as in Curran (1994) and in Rogers

and Shi (1995), and on the other hand the upper and lower bounds for

stop-loss premiums of sums of dependent random variables. We show also

how to apply the results to the case of sums of lognormally distributed

random variables. Such sums are widely encountered in practice, both in

actuarial science and in finance.

We derive comonotonic approximations for the scalar product of ran-

dom vectors of the form (1) and explain a general procedure to obtain

accurate estimates for quantiles and stop-loss premiums. We study the

distribution of the present value function of a series of random payments

in a stochastic financial environment described by a lognormal discounting

process. Such distributions occur naturally in a wide range of applications

within fields of insurance and finance. Accurate approximations are ob-

tained by developing upper and lower bounds in the convex order sense for

such present value functions. Finally, we consider several applications for

discounted claim processes under the Black & Scholes setting. In particular

we analyze in detail the cases when the random variables Xi denote insur-

ance losses modelled by lognormal, normal (more general: elliptical) and

gamma or inverse Gaussian (more general: tempered stable) distributions.

As we demonstrate by means of a series of numerical illustrations, the

methodology provides an excellent framework to get accurate and easily

obtainable approximations of distribution functions for random variables

of the form (1).

Chapters 3 and 4 apply the obtained results to two important reserving

problems in insurance business and illustrate them numerically.

In Chapter 3 we consider an important application in the life insurance

business. We aim to provide some conservative estimates both for high

quantiles and stop-loss premiums for a single life annuity and for a whole

portfolio. We focus here only on life annuities; however, similar techniques

may be used to get analogous estimates for more general life contingencies.

Our solution makes it possible to solve with great accuracy personal finance

problems such as: How much does one need to invest now to ensure, given

a periodical (e.g. yearly) consumption pattern, that the probability of

outliving one's money is very small (e.g. less than 1%)?

The case of a portfolio of life annuity policies has been studied exten-

sively in the literature, but only in the limiting case — for homogeneous

portfolios, when the mortality risk is fully diversified. However the applica-

bility of these results in insurance practice may be questioned: especially

in the case of the life annuity business a typical portfolio does not con-

tain enough policies to speak about full diversification. For this reason we

propose to approximate the number of active policies in subsequent years

using a normal power distribution (by fitting the first three moments of

the corresponding binomial distributions) and to model the present value

of future benefits as a scalar product of mutually independent random

vectors.

Chapter 4 focuses on the claims reserving problem. To get the correct

picture of its liabilities, a company should set aside the correctly estimated

amount to meet claims arising in the future on the written policies. The

past data used to construct estimates for the future payments consist of a

triangle of incremental claims.

The purpose is to complete this run-off triangle to a square, and even

to a rectangle if estimates are required for development years for

which no data are recorded in the run-off triangle at hand. To this end, the

actuary can make use of a variety of techniques. The inherent uncertainty

is described by the distribution of possible outcomes, and one needs to

arrive at the best estimate of the reserve. In this chapter we look at the

discounted reserve and impose an explicit margin based on a risk measure

from the distribution of the total discounted reserve. We will model the

claim payments using lognormal linear, loglinear location-scale and gener-

alized linear models, and derive accurate comonotonic approximations for

the discounted loss reserve.

The bootstrap technique has proved to be a very useful tool in many

statistical applications and can be particularly interesting to assess the

variability of the claim reserving predictions and to construct upper limits

at an adequate confidence level. Its popularity is due to a combination of

available computing power and theoretical development. One advantage of

the bootstrap is that the technique can be applied to any data set without

having to assume an underlying distribution. Moreover, most computer

packages can handle very large numbers of repeated samplings, and this

should not limit the accuracy of the bootstrap estimates.
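
The resampling idea can be illustrated with a generic percentile bootstrap. The Python sketch below is deliberately simplified: it resamples raw observations that are assumed to be independent and identically distributed, whereas reserving applications typically resample the residuals of a fitted model. The function name and its parameters are hypothetical.

```python
import random


def bootstrap_upper_limit(data, statistic, level=0.95, n_resamples=1000, seed=0):
    """Percentile-bootstrap upper limit for statistic(data).

    Each replicate recomputes the statistic on a sample of the same size,
    drawn with replacement from the observed data.
    """
    rng = random.Random(seed)
    replicates = sorted(
        statistic([rng.choice(data) for _ in data])
        for _ in range(n_resamples)
    )
    # Position of the requested level in the sorted bootstrap replicates.
    index = min(n_resamples - 1, int(level * n_resamples))
    return replicates[index]


def mean(values):
    return sum(values) / len(values)


if __name__ == "__main__":
    claims = [120.0, 95.0, 210.0, 130.0, 160.0, 88.0, 175.0]  # illustrative data
    print(bootstrap_upper_limit(claims, mean, level=0.95))
```

No distributional assumption enters the calculation. The only inputs are the observed data and the number of resamples.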

In the last chapter we derive, review and discuss some other methods

to obtain approximations for S. In the first section we recall two well-

known moment matching approximations: the lognormal and the recip-

rocal gamma approximation. Practitioners often use a moment matching

lognormal approximation for the distribution of S. The lognormal and

reciprocal gamma approximations are chosen such that their first two mo-

ments are equal to the corresponding moments of S.
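
For the lognormal case, matching the first two moments has a closed form. If S has mean m and variance v, the matching lognormal parameters are sigma^2 = ln(1 + v / m^2) and mu = ln(m) - sigma^2 / 2. The Python sketch below applies these formulas; the function names and the example figures are assumptions for illustration, and the reciprocal gamma approximation is not shown.

```python
import math
from statistics import NormalDist


def lognormal_moment_match(mean, variance):
    """Return (mu, sigma) of the lognormal law with the given mean and variance."""
    sigma2 = math.log(1.0 + variance / mean ** 2)
    mu = math.log(mean) - 0.5 * sigma2
    return mu, math.sqrt(sigma2)


def lognormal_quantile(mu, sigma, p):
    """p-quantile of the lognormal distribution with parameters mu and sigma."""
    return math.exp(mu + sigma * NormalDist().inv_cdf(p))


if __name__ == "__main__":
    mu, sigma = lognormal_moment_match(mean=1000.0, variance=150.0 ** 2)
    print(mu, sigma)
    print(lognormal_quantile(mu, sigma, 0.995))
```

The approximation reproduces the mean and variance of S exactly by construction, but nothing guarantees that its tail resembles the tail of S.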

Although the comonotonic bounds in convex order have proven to be

good approximations in case the variance of the random sum is sufficiently

small, they perform much worse when the variance gets large. In actuarial

applications it is often merely the tail of the distribution function that is

of interest. Indeed, one may think of Value-at-Risk, Conditional Tail Ex-

pectation or Expected Shortfall estimations. Therefore, approximations

for functionals of sums of dependent random variables may alternatively

be obtained through the use of asymptotic relations. Although asymptotic

results are valid at infinity, they may as well serve as approximations near

infinity. We establish some asymptotic results for the tail probability of

a sum of heavy tailed dependent random variables. In particular, we de-

rive an asymptotic result for the randomly weighted sum of a sequence of

non-negative numbers. Furthermore, we establish under two different sets

of conditions, an asymptotic result for the randomly weighted sum of a

sequence of independent random variables that consist of a random and a

deterministic component. Throughout, the random weights are products of

i.i.d. random variables and thus exhibit an explicit dependence structure.

Since the early 1990s, statistics has seen an explosion in applied Bayesian

research. This explosion has had little to do with a warming of the statistics

and econometrics communities to the theoretical foundation of Bayesian-

ism, or to a sudden awakening to the merits of the Bayesian approach

over frequentist methods, but instead can be primarily explained on prag-

matic grounds. Bayesian inference is the process of fitting a probability

model to a set of data and summarizing the result by a probability distri-

bution on the parameters of the model and on unobserved quantities such

as predictions for new observations. Simple simulation methods exist to

draw samples from posterior and predictive distributions, automatically

incorporating uncertainty in the model parameters. An advantage of the

Bayesian approach is that we can compute, using simulation, the posterior

predictive distribution for any data summary, so we do not need to put a

lot of effort into estimating the sampling distribution of test statistics. The

development of powerful computational tools (and the realization that ex-

isting statistical tools could prove quite useful for fitting Bayesian models)

has drawn a number of researchers to use the Bayesian approach in prac-

tice. Indeed, the use of such tools often enables researchers to estimate

complicated statistical models that would be quite difficult, if not virtu-

ally impossible, using standard frequentist techniques. The purpose of this

third section is to sketch, in very broad terms, basic elements of Bayesian

computation.

Finally, we compare these approximations with the comonotonic ap-

proximations of the previous chapter in the context of claims reserving. In

case the underlying variance of the statistical and financial part of the dis-

counted IBNR reserve gets large, the comonotonic approximations perform

worse. We will illustrate this observation by means of a simple example

and propose to solve this problem using the derived asymptotic results for

the tail probability of a sum of dependent random variables, in the presence

of heavy-tailedness conditions. These approximations are compared with


the lognormal moment matching approximations. We finally consider the

distribution of the discounted loss reserve when the data in the run-off tri-

angle is modelled by a generalized linear model and compare the outcomes

of the Bayesian approach with the comonotonic approximations.

Publications

• Ahcan A., Darkiewicz G., Hoedemakers T., Dhaene J. and Goovaerts

M.J. (2004), “Optimal portfolio selection: Applications in insurance

business”, Proceedings of the 8th International Congress on Insur-

ance: Mathematics & Economics, June 14-16, Rome, pp. 40.

• Ahcan A., Darkiewicz G., Goovaerts M.J. and Hoedemakers T. (2005),

“Computation of convex bounds for present value functions of ran-

dom payments”, Journal of Computational and Applied Mathema-

tics, to be published.

• Antonio K., Goovaerts M.J. and Hoedemakers T. (2004), “On the

distribution of discounted loss reserves”, Medium Econometrische

Toepassingen, vol. 12, no. 2, pp. 14-18.

• Antonio K., Beirlant J. and Hoedemakers T. (2005), Discussion of

“A Bayesian generalized linear model for the Bornhuetter-Ferguson

method of claims reserving” by Richard Verrall, North American

Actuarial Journal, to be published.

• Antonio K., Beirlant J., Hoedemakers T. and Verlaak R. (2005),

“On the use of general linear mixed models in loss reserving”, North

American Actuarial Journal, submitted.

• Darkiewicz G. and Hoedemakers T. (2005), “How the co-integration

analysis can help in mortality forecasting”, British Actuarial Journal,

submitted.

• Hoedemakers T., Beirlant J., Goovaerts M.J. and Dhaene J. (2003),

“Confidence bounds for discounted loss reserves”, Insurance: Mathe-

matics & Economics, vol. 33, no. 2, pp. 297-316.


• Hoedemakers T. and Goovaerts M.J. (2004), Discussion of “Risk and

discounted loss reserves” by Greg Taylor, North American Actuarial

Journal, vol. 8, no. 4, pp. 146-150.

• Hoedemakers T., Beirlant J., Goovaerts M.J. and Dhaene J. (2005),

“On the distribution of discounted loss reserves using generalized

linear models”, Scandinavian Actuarial Journal, vol. 2005, no. 1, pp.

25-45.

• Hoedemakers T., Darkiewicz G. and Goovaerts M.J. (2005), “Ap-

proximations for life annuity contracts in a stochastic financial envi-

ronment”, Insurance: Mathematics & Economics, to be published.

• Hoedemakers T., Darkiewicz G., Deelstra G., Dhaene J. and Van-

maele M. (2005), “Bounds for stop-loss premiums of stochastic sums

(with applications to life contingencies)”, Scandinavian Actuarial

Journal, submitted.

• Hoedemakers T., Goovaerts M.J. and Dhaene J. (2003), “IBNR pro-

blematiek in historisch perspectief”, De Actuaris, vol. 11, no. 2, pp.

27-29.

• Hoedemakers T., Goovaerts M.J. and Dhaene J. (2004), “De IBNR-

discussie”, De Actuaris, vol. 11, no. 4, pp. 26-29.

• Laeven R.J.A., Goovaerts M.J. and Hoedemakers T. (2005), “Some

asymptotic results for sums of dependent random variables with ac-

tuarial applications”, Insurance: Mathematics & Economics, to be

published.

• Vanduffel S., Hoedemakers T. and Dhaene J. (2005), “Comparing

approximations for risk measures of sums of non-independent log-

normal random variables”, North American Actuarial Journal, to be

published.

List of abbreviations and

symbols

Abbreviation Explanation

or symbol

ARMA(p, q) AutoRegressive-Moving Average Process of

order (p, q)

cdf cumulative distribution function

c.f. characteristic function

CLT Central Limit Theorem

Corr(X,Y ) = r(X,Y ) Pearson’s correlation coefficient between

the r.v.’s X and Y

Cov[X,Y ] covariance between the r.v.’s X and Y

D class of dominatedly varying functions

d.f. distribution function

E exponential r.v.

$E_n(\vec{\mu}, \Sigma, \phi)$ n-dimensional elliptical distribution with

parameters $\vec{\mu}$, $\Sigma$ and $\phi$

F d.f. and distribution of a r.v.

$\overline{F}$ tail of the d.f. $F$: $\overline{F} = 1 - F$

$F^{*n}$ n-fold convolution of the d.f. or distribution $F$

$\Gamma(x)$ gamma function: $\Gamma(x) = \int_0^\infty t^{x-1} e^{-t}\,dt$, $x > 0$

Gamma(a, b) gamma distribution with parameters a and b:

$f(x) = b^a (\Gamma(a))^{-1} x^{a-1} e^{-bx}$, $x \geq 0$

I(a, x) incomplete gamma function:

$\Gamma_a(x) = (\Gamma(a))^{-1} \int_x^\infty e^{-t} t^{a-1}\,dt$, $x \geq 0$

GPD Generalized Pareto Distribution


I(.) indicator function: I(c) = 1 if the condition c is true

and I(c) = 0 if it is not

i.i.d. independent, identically distributed

L class of long-tailed distributions

LLN Law of Large Numbers

logN($\mu$, $\sigma^2$) lognormal distribution with parameters $\mu$ and $\sigma^2$:

$f(x) = \frac{1}{x\sigma\sqrt{2\pi}}\, e^{-\frac{(\log x - \mu)^2}{2\sigma^2}}$, $x > 0$

MLE Maximum Likelihood Estimator

$N(\mu, \sigma^2)$, $N(\mu, \Sigma)$ Gaussian (normal) distribution with mean $\mu$ and

variance $\sigma^2$ or covariance matrix $\Sigma$

N(0, 1) standard normal distribution

o(1) $a(x) = o(b(x))$ as $x \to x_0$ means that

$\lim_{x \to x_0} a(x)/b(x) = 0$

O(1) $a(x) = O(b(x))$ as $x \to x_0$ means that

$\lim_{x \to x_0} |a(x)/b(x)| < \infty$

$\varphi_X(t)$ c.f. of the r.v. $X$: $\varphi_X(t) = E[e^{itX}]$

$\Phi(.)$ the cdf of the standard normal r.v.

$\limsup_{n\to\infty}(x_n)$ limit superior of the bounded sequence $\{x_n\}$: $\lim(s_n)$, where $s_n = \sup_{k \geq n} x_k = \sup\{x_n, x_{n+1}, \ldots\}$

$\liminf_{n\to\infty}(x_n)$ limit inferior of the bounded sequence $\{x_n\}$: $\lim(t_n)$, where $t_n = \inf_{k \geq n} x_k = \inf\{x_n, x_{n+1}, \ldots\}$

p.d.f probability density function

Pr[.] probability measure

p(.|.) conditional probability density

p(.) marginal distribution

R class of the d.f.’s with regularly varying right tail

Rα class of the regularly varying functions with index α

R−∞ class of the rapidly varying functions

r.v. random variable

S class of the subexponential distributions

$\sigma_X^2$ variance of the r.v. $X$

$\sigma_{X_i X_j}$ $\mathrm{Cov}[X_i, X_j]$

sign(a) sign of the real number a

T S(δ, a, b) tempered stable law with parameters δ, a and b

U(a, b) uniform random variable on (a, b)

UMVUE Uniformly Minimum Variance Unbiased Estimator

Var[X] variance of the r.v. X


$\sim$ $a(x) \sim b(x)$ as $x \to x_0$ means that $\lim_{x \to x_0} a(x)/b(x) = 1$

$a(x) \sim 0$ means $a(x) = o(1)$

$\approx$ $a(x) \approx b(x)$ as $x \to x_0$ means that $a(x)$ is approximately

(roughly) of the same order as $b(x)$ as $x \to x_0$.

It is only used in a heuristic sense.

$\asymp$ $a(x) \asymp b(x)$ as $x \to x_0$ means that $0 < \liminf_{x \to x_0} a(x)/b(x)$

$\leq \limsup_{x \to x_0} a(x)/b(x) < \infty$

$\xrightarrow{d}$ convergence in distribution

$\stackrel{d}{=}$ equal in distribution

$\lfloor . \rfloor$ floor function: $\lfloor x \rfloor$ is the largest integer less than or

equal to $x$

$\lceil . \rceil$ ceiling function: $\lceil x \rceil$ is the smallest integer greater than or

equal to $x$

$(x - d)_+$ $\max(x - d, 0)$

=: or := notation

Chapter 1

Risk and comonotonicity in

the actuarial world

Summary In order to make decisions one has to evaluate the (distri-

bution function of the) multivariate risk (or random variable) one faces.

In this chapter we recall the basics of actuarial risk theory. We define

some frequently used measures of dependence and the most important or-

derings of risks for actuarial applications. We further introduce several

well-known risk measures and the relations that hold between them. We

summarize properties of these risk measures that can be used to facilitate

decision-taking. Finally, we provide theoretical background for the con-

cept of comonotonicity and we review the most important properties of

comonotonic risks.

1.1 Fundamental concepts in actuarial risk theory

In this section we briefly recall the most important concepts in actuar-

ial risk theory. The study of dependence has become of major concern

in actuarial research. We start by defining three important measures of

dependence: Pearson’s correlation coefficient, Kendall’s τ and Spearman’s

ρ. Once dependence measures are defined, one could use them to compare

the strength of dependence between random variables.

The determination of capital requirements for an insurance company

is a complex and non-trivial task. From their nature, capital requirements

are numeric values expressed in monetary units and based on quantifiable


measures of risks. Formally a risk measure is defined as a mapping from

the set of risks at hand to the real numbers. In other words, with any

potential loss X one associates a real number ρ[X]. Thus a risk measure

summarizes the riskiness of the underlying distribution in one single num-

ber. Usually such quantification serves as a risk management tool (e.g. an

insurance premium or an economic capital), but it can be also helpful in

overall decision making. We review and place the four popular risk mea-

sures (Value-at-Risk, Tail Value-at-Risk, Conditional Tail Expectation and

Expected Shortfall) in their context.

In the actuarial literature, orderings of risks are an important tool for

comparing the attractiveness of different risks. The essential tool for the

comparison of different concepts of orderings of risks will be the stop-loss

transform/premium and its properties. In the actuarial literature it is a

common feature to replace a risk by a “less favorable” risk that has a

simpler structure, making it easier to determine the distribution function.

We clarify what we mean with a “less favorable” risk and define the three

most important orderings of risks for actuarial applications: stochastic

dominance, stop-loss order and convex order.

This chapter is essentially based on Dhaene, Denuit, Goovaerts, Kaas &

Vyncke (2002a) and Dhaene, Vanduffel, Tang, Goovaerts, Kaas & Vyncke

(2004).

1.1.1 Dependent risks

In risk theory, all the random variables are traditionally assumed to be

mutually independent. It is clear that this assumption is made for mathe-

matical convenience. In some situations however, insured risks tend to act

similarly. The independence assumption is then violated and is not an ade-

quate way to describe the relations between the different random variables

involved. The individual risks of an earthquake or flooding risk portfolio

which are located in the same geographic area are correlated, since indi-

vidual claims are contingent on the occurrence and severity of the same

earthquake or flood. On a foggy day all cars of a region have higher prob-

ability to be involved in an accident. During dry hot summers, all wooden

cottages are more exposed to fire. More generally, one can say that if the

density of insured risks in a certain area or organization is high enough,

then catastrophes such as storms, explosions, earthquakes, epidemics and


so on can cause an accumulation of claims for the insurer. In life insurance,

there is ample evidence that the lifetimes of husbands and their wives are

positively associated. There may be certain selection mechanisms in the

matching of couples (“birds of a feather flock together”): both partners

often belong to the same social class and have the same life style. Fur-

ther, it is known that the mortality rate increases after the passing away

of one’s spouse (the “broken heart syndrome”). These phenomena have

implications on the valuation of aggregate claims in life insurance portfo-

lios. Another example in a life insurance context is a pension fund that

covers the pensions of persons working for the same company. These per-

sons work at the same location, they take the same flights. It is evident

that the mortality of these persons will be dependent, at least to a certain

extent.

The study of dependence has become of major concern in actuarial

research. There are a variety of ways to measure dependence.

First, Pearson’s product-moment correlation coefficient captures the

linear dependence between couples of random variables. For a random

couple $(X_1, X_2)$ having marginals with finite variances, Pearson’s product-moment

correlation coefficient is defined by

$$\mathrm{Corr}(X_1, X_2) = \frac{\mathrm{Cov}[X_1, X_2]}{\sqrt{\mathrm{Var}[X_1]\,\mathrm{Var}[X_2]}}.$$

Pearson’s correlation coefficient contains information on both the strength

and direction of a linear relationship between two random variables. If one

variable is an exact linear function of the other variable, a positive relation-

ship leads to correlation coefficient 1, while a negative relationship leads

to correlation coefficient −1. If there is no linear predictability between

the two variables, the correlation is 0.

Kendall’s τ is a nonparametric measure of association based on the

probabilities of concordances and discordances in paired observations. Con-

cordance occurs when paired observations vary together, and discordance

occurs when paired observations vary differently. Specifically, Kendall’s τ

for a random couple $(X_1, X_2)$ of random variables with continuous cdf’s is

defined as

$$\begin{aligned}
\tau(X_1, X_2) &= \Pr[(X_1 - X_1')(X_2 - X_2') > 0] - \Pr[(X_1 - X_1')(X_2 - X_2') < 0] \\
&= 2\Pr[(X_1 - X_1')(X_2 - X_2') > 0] - 1,
\end{aligned}$$

where $(X_1', X_2')$ is an independent copy of $(X_1, X_2)$.

Contrary to Pearson’s r, Kendall’s τ is invariant under strictly monotone

transformations, that is, if $\phi_1$ and $\phi_2$ are strictly increasing (or decreasing)

functions on the supports of $X_1$ and $X_2$, respectively, then

$\tau(\phi_1(X_1), \phi_2(X_2)) = \tau(X_1, X_2)$, provided the cdf’s of $X_1$ and $X_2$ are

continuous. Further, $(X_1, X_2)$ are perfectly dependent if and only if

$|\tau(X_1, X_2)| = 1$.

Another very useful dependence measure is Spearman’s ρ. The idea

behind this dependence measure is very simple. Given random variables $X_1$

and $X_2$ with continuous cdf’s $F_{X_1}$ and $F_{X_2}$, we first create $U_1 = F_{X_1}(X_1)$

and $U_2 = F_{X_2}(X_2)$, which are uniformly distributed over [0, 1] and then

use Pearson’s r. Spearman’s ρ is thus defined as $\rho(X_1, X_2) = r(U_1, U_2)$.

Dependence measures can be used to compare the strength of depen-

dence between random variables.
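The three dependence measures can be compared on a small data set. The Python sketch below uses sample analogues of the definitions above (empirical moments for Pearson's r, pair counting for Kendall's τ, and ranks in place of the cdf transforms for Spearman's ρ). The data are made up for illustration, the helper names are hypothetical, and ties are assumed to be absent.

```python
import math

def pearson(x, y):
    """Sample version of Pearson's correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def kendall_tau(x, y):
    """(concordant pairs - discordant pairs) / (number of pairs); no ties assumed."""
    n, s = len(x), 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)
    return s / (n * (n - 1) / 2)

def ranks(v):
    """Rank 1 for the smallest observation; no ties assumed."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0] * len(v)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def spearman_rho(x, y):
    """Pearson's r applied to the ranks (empirical counterpart of r(U1, U2))."""
    return pearson(ranks(x), ranks(y))

# Illustrative data only.
x = [0.1, 0.5, 0.9, 1.4, 2.0, 2.7]
y = [1.0, 0.8, 2.1, 1.9, 3.5, 3.0]
ex = [math.exp(3 * a) for a in x]  # strictly increasing transform of x

for label, u in [("x", x), ("exp(3x)", ex)]:
    print(label, pearson(u, y), kendall_tau(u, y), spearman_rho(u, y))
```

In this sketch the strictly increasing transform changes Pearson's r but leaves Kendall's τ and Spearman's ρ unchanged, in line with the invariance property stated above.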

1.1.2 Risk measures

Measuring risk and measuring preferences are not the same. When ordering

preferences, activities (for example, alternatives A and B with financial

consequences $X_A$ and $X_B$) are compared in order of preference

under conditions of risk. A preference order $A \succ B$ means that A is

preferred to B. This order is represented by a preference function $\Psi$

with $A \succ B \Leftrightarrow \Psi[X_A] > \Psi[X_B]$. In contrast, a risk order $A \succ_R B$

means that A is riskier than B and is represented by a function $\rho$ with

$A \succ_R B \Leftrightarrow \rho[X_A] > \rho[X_B]$. Every such function $\rho$ is called a risk measure.

Models in actuarial science are used both for quantifying risks and

for pricing risks. Quantifying risk requires a risk measure to convert a

random future gain or loss into a certainty equivalent that can then be

used to order different risks and for decision-making purposes. In order

to quantify risk, it is necessary to specify the probability distributions of

the risks involved and to apply a preference function to these probability

distributions. Thus, this process involves both statistical assumptions and

economic assumptions. Individuals are assumed to be risk averse and to

have a preference to diversify risks.

Banks and regulatory agencies use monetary measures of risk to assess

the risk taken by financial investors; important examples are given by the

so-called Value-at-Risk and Tail Value-at-Risk.

Two-sided risk measures measure the magnitude of the distance (in


both directions) from X to E[X]. Different functions of distance lead to

different risk measures. Looking, for instance, at quadratic deviations,

this leads to the risk measure variance or to the risk measure standard

deviation. These risk measures have been the traditional measures in eco-

nomics and finance since the pioneering work of Markowitz. They exhibit

a number of nice technical properties. For instance, the variance of a port-

folio return is the sum of the variances and covariances of the individual

returns. Furthermore, the variance is used as a standard optimization

function (quadratic optimization).

On the other hand, a two-sided risk measure contradicts the intuitive

notion of risk that only negative deviations are dangerous. In addition

variance does not account for fat tails of the underlying distribution and

for the corresponding tail risk. For this reason, people include higher

(normalized) central moments, as for example, skewness and kurtosis, into

the analysis to assess risk more properly.

Perhaps the most popular risk measure is the Value-at-Risk (VaR). Let

$L$ be the potential loss of a financial position. The VaR at confidence level

$p$ ($0 < p < 1$) is then defined by the requirement

$$\Pr[L > \mathrm{VaR}_p[L]] = 1 - p. \tag{1.1}$$

An intuitive interpretation of the VaR is that of a probable maximum loss

or, more concretely, a $100 \times p\%$ maximal loss, because $\Pr[L \leq \mathrm{VaR}_p[L]] = p$,

which means that in $100 \times p\%$ of the cases, the loss is smaller than or equal to

$\mathrm{VaR}_p[L]$. Interpreting the VaR as necessary underlying capital to bear risk,

relation (1.1) implies that this capital will, on average, not be exhausted in

$100 \times p\%$ of the cases. Obviously, the VaR is identical to the p-quantile of

the loss distribution, that is $\mathrm{VaR}_p[L] = F_L^{-1}(p)$. It is important to remark

that the VaR does not take into account the severity of potential losses

in the 100 × (1 − p)% worst cases. A regulator for instance is not only

concerned with the frequency of default, but also about the severity of

default. Also shareholders and management should be concerned with the

question “how bad is bad?” when they want to evaluate the risks at hand

in a consistent way. Therefore, one often uses another risk measure which

is called the Tail Value-at-Risk (TVaR) and defined by

$$\mathrm{TVaR}_p[L] = \frac{1}{1 - p}\int_p^1 \mathrm{VaR}_q[L]\,dq, \qquad p \in (0, 1).$$

It is the arithmetic average of the quantiles of L, from p on. Note that the

TVaR is always at least as large as the corresponding VaR.

We will define the other popular risk measures in terms of L for a

better comparison to the VaR. The Conditional Tail Expectation (CTE)

at confidence level p is defined by

$$\mathrm{CTE}_p[L] = E[L \mid L > \mathrm{VaR}_p[L]], \qquad p \in (0, 1).$$

On the basis of the interpretation of the VaR as a 100 × p%-maximum

loss, the CTE can be interpreted as the average maximal loss in the worst

100 × (1 − p)% cases. Notice that in case of continuous distributions the

CTE and TVaR coincide.

Measures of shortfall risk are one-sided risk measures that measure

the shortfall risk relative to a target variable. This may be the expected

value, but in general, it is an arbitrary deterministic target or a stochastic

benchmark. The Expected Shortfall (ESF) at confidence level p is defined

as

$$\mathrm{ESF}_p[L] = E[\max(L - \mathrm{VaR}_p[L], 0)], \qquad p \in (0, 1).$$

The following relations hold between the four risk measures defined above.

Theorem 1 (Relation between VaR, TVaR, CTE and ESF).

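For $p \in (0, 1)$, we have that

$$\begin{aligned}
\mathrm{TVaR}_p[X] &= \mathrm{VaR}_p[X] + \frac{1}{1 - p}\,\mathrm{ESF}_p[X], \\
\mathrm{CTE}_p[X] &= \mathrm{VaR}_p[X] + \frac{1}{1 - F_X(\mathrm{VaR}_p[X])}\,\mathrm{ESF}_p[X], \\
\mathrm{CTE}_p[X] &= \mathrm{TVaR}_{F_X(\mathrm{VaR}_p[X])}[X].
\end{aligned}$$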

Proof. See Dhaene et al. (2004).
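The relations of Theorem 1 can be checked numerically on a small discrete loss distribution, for which the CTE and the TVaR differ in general. The following Python sketch is only an illustration: the distribution is hypothetical, the function names are our own, and exact rational arithmetic is used so that the three identities can be compared without rounding error.

```python
from fractions import Fraction as Fr

# Hypothetical discrete loss: atoms in increasing order with their probabilities.
values = [0, 10, 50, 100]
probs = [Fr(1, 2), Fr(3, 10), Fr(3, 20), Fr(1, 20)]

def cdf(x):
    return sum(q for v, q in zip(values, probs) if v <= x)

def var(p):
    """VaR_p = inf{x : F(x) >= p}."""
    return min(v for v in values if cdf(v) >= p)

def esf(p):
    """ESF_p = E[(X - VaR_p)_+]."""
    k = var(p)
    return sum(q * max(v - k, 0) for v, q in zip(values, probs))

def cte(p):
    """CTE_p = E[X | X > VaR_p]; requires Pr[X > VaR_p] > 0."""
    k = var(p)
    tail = [(v, q) for v, q in zip(values, probs) if v > k]
    return sum(v * q for v, q in tail) / sum(q for _, q in tail)

def tvar(p):
    """TVaR_p: integral of the step quantile function over (p, 1), divided by 1 - p."""
    total, lower = Fr(0), Fr(0)
    for v, q in zip(values, probs):
        upper = lower + q
        # The quantile equals v on (lower, upper]; keep the part above p.
        total += v * max(Fr(0), upper - max(lower, p))
        lower = upper
    return total / (1 - p)

p = Fr(9, 10)
print(var(p), esf(p), tvar(p), cte(p))
print(tvar(p) == var(p) + esf(p) / (1 - p))
print(cte(p) == var(p) + esf(p) / (1 - cdf(var(p))))
print(cte(p) == tvar(cdf(var(p))))
```

Here F_X(VaR_p[X]) exceeds p, which is exactly the situation in which the CTE and the TVaR at level p do not coincide.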

Researchers always aimed to find a set of properties (axioms) that any

risk measure should satisfy. Recently the class of coherent risk measures,

introduced in Artzner (1999) and Artzner et al. (1999), has drawn a lot

of attention in the actuarial literature. The authors postulated that every

‘coherent’ risk measure should satisfy the following four properties:


1. monotonicity, i.e. X ≤ Y ⇒ ρ[X] ≤ ρ[Y ];

2. subadditivity, i.e. ρ[X + Y ] ≤ ρ[X] + ρ[Y ];

3. translation invariance, i.e. ρ[X + c] = ρ[X] + c ∀ c ∈ R;

4. positive homogeneity, i.e. ρ[aX] = aρ[X] ∀ a ≥ 0.

It can be demonstrated that the Value-at-Risk and the Expected Short-

fall are in general not subadditive. On the other hand, the TVaR is subad-

ditive. The desirability of the subadditivity property of risk measures has

been a major topic for research and discussion. Some researchers believe

that the axiom of subadditivity of risk measures used to determine the

solvency capital, reflects the risk diversification. However other authors

argue that the diversification benefits should be considered rather in terms

of subadditivity of the corresponding shortfalls.
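A standard textbook-style counterexample shows why the VaR may fail subadditivity. The Python sketch below uses two independent, identically distributed losses with hypothetical numbers (a loss of 100 with probability 4%) and the confidence level 95%; it is meant as an illustration, not as a statement about any particular portfolio.

```python
from fractions import Fraction as Fr
from itertools import product

def var(dist, p):
    """p-quantile inf{x : F(x) >= p} of a discrete law given as {value: probability}."""
    cum = Fr(0)
    for v in sorted(dist):
        cum += dist[v]
        if cum >= p:
            return v
    raise ValueError("probabilities do not sum to at least p")

def sum_independent(d1, d2):
    """Distribution of X + Y for independent discrete X and Y."""
    out = {}
    for (v1, q1), (v2, q2) in product(d1.items(), d2.items()):
        out[v1 + v2] = out.get(v1 + v2, Fr(0)) + q1 * q2
    return out

# Hypothetical loss: 100 with probability 4%, otherwise 0.
loss = {0: Fr(96, 100), 100: Fr(4, 100)}
p = Fr(95, 100)

total = sum_independent(loss, loss)
print(var(loss, p) + var(loss, p), var(total, p))
```

Each individual VaR equals 0 because a single loss stays at 0 with probability 96%, while the sum is positive with probability above 5%, so the VaR of the sum exceeds the sum of the VaRs.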

It is an open question whether the coherent set of axioms is indeed the

‘best one’. For a relevant discussion we refer to e.g. Dhaene et al. (2003),

Goovaerts et al. (2003, 2004) and Darkiewicz et al. (2005a). It should

be noted that in spite of the disagreement in the scientific community

about the axioms of coherency, a lot of well-known risk measures satisfy

conditions (1)-(4) (e.g. the TVaR).

The expressions for the discussed risk measures of normal and lognor-

mal losses are given in the next two examples, which will be used in the

remainder of this thesis. For a proof of these examples, we refer to Dhaene

et al. (2004).

Example 1 (normal losses).

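Consider a random variable $X \sim N(\mu, \sigma^2)$. The VaR, ESF and CTE at

confidence level p ($p \in (0, 1)$) of X are given by

$$\mathrm{VaR}_p[X] = \mu + \sigma\Phi^{-1}(p), \tag{1.2}$$

$$\mathrm{ESF}_p[X] = \sigma\phi\left(\Phi^{-1}(p)\right) - \sigma\Phi^{-1}(p)(1 - p), \tag{1.3}$$

$$\mathrm{CTE}_p[X] = \mu + \sigma\,\frac{\phi\left(\Phi^{-1}(p)\right)}{1 - p}, \tag{1.4}$$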

where φ(x) = Φ′(x) denotes the density function of the standard normal

distribution.


Example 2 (lognormal losses).

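Consider a random variable $X \sim \log N(\mu, \sigma^2)$. The VaR, ESF and CTE at

confidence level p ($p \in (0, 1)$) of X are given by

$$\mathrm{VaR}_p[X] = e^{\mu + \sigma\Phi^{-1}(p)}, \tag{1.5}$$

$$\mathrm{ESF}_p[X] = e^{\mu + \sigma^2/2}\,\Phi\left(\sigma - \Phi^{-1}(p)\right) - e^{\mu + \sigma\Phi^{-1}(p)}(1 - p), \tag{1.6}$$

$$\mathrm{CTE}_p[X] = e^{\mu + \sigma^2/2}\,\frac{\Phi\left(\sigma - \Phi^{-1}(p)\right)}{1 - p}. \tag{1.7}$$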
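The closed-form expressions (1.2)-(1.7) are straightforward to evaluate. The Python sketch below implements them with the standard library class `statistics.NormalDist`; the function names and the parameter values are hypothetical. As a plausibility check it verifies the relation CTE = VaR + ESF/(1 − p), which follows from Theorem 1 for continuous distributions.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal: cdf Phi, density phi, quantile Phi^{-1}

def normal_measures(mu, sigma, p):
    """VaR, ESF and CTE of N(mu, sigma^2) following (1.2)-(1.4)."""
    z = STD.inv_cdf(p)
    var = mu + sigma * z
    esf = sigma * STD.pdf(z) - sigma * z * (1 - p)
    cte = mu + sigma * STD.pdf(z) / (1 - p)
    return var, esf, cte

def lognormal_measures(mu, sigma, p):
    """VaR, ESF and CTE of logN(mu, sigma^2) following (1.5)-(1.7)."""
    z = STD.inv_cdf(p)
    var = math.exp(mu + sigma * z)
    mean_tail = math.exp(mu + sigma**2 / 2) * STD.cdf(sigma - z)
    esf = mean_tail - var * (1 - p)
    cte = mean_tail / (1 - p)
    return var, esf, cte

p = 0.995  # illustrative confidence level
for name, f in [("normal", normal_measures), ("lognormal", lognormal_measures)]:
    var, esf, cte = f(0.0, 0.4, p)
    consistent = math.isclose(cte, var + esf / (1 - p))
    print(name, var, esf, cte, consistent)
```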

We end this section with a note about inverse distribution functions.

Inverse distribution functions

The cdf $F_X(x) = \Pr[X \leq x]$ of a random variable X is a right-continuous

non-decreasing function with

$$F_X(-\infty) = \lim_{x \to -\infty} F_X(x) = 0, \qquad F_X(+\infty) = \lim_{x \to +\infty} F_X(x) = 1.$$

The classical definition of the inverse of a distribution function is the non-decreasing

and left-continuous function $F_X^{-1}(p)$ defined by

$$F_X^{-1}(p) = \inf\{x \in \mathbb{R} \mid F_X(x) \geq p\}, \qquad p \in [0, 1],$$

with $\inf \emptyset = +\infty$ by convention. For all $x \in \mathbb{R}$ and $p \in [0, 1]$, we have

$$F_X^{-1}(p) \leq x \Leftrightarrow p \leq F_X(x). \tag{1.8}$$

In this thesis we will use a more sophisticated definition for inverses of

distribution functions. For any real p ∈ [0, 1], a possible choice for the

inverse of $F_X$ in p is any point in the closed interval

$$\left[\inf\{x \in \mathbb{R} \mid F_X(x) \geq p\},\ \sup\{x \in \mathbb{R} \mid F_X(x) \leq p\}\right],$$

where, as before, $\inf \emptyset = +\infty$, and also $\sup \emptyset = -\infty$. Taking the left hand

border of this interval to be the value of the inverse cdf at p, we get $F_X^{-1}(p)$.

Similarly, we define $F_X^{-1+}(p)$ as the right hand border of the interval:

$$F_X^{-1+}(p) = \sup\{x \in \mathbb{R} \mid F_X(x) \leq p\}, \qquad p \in [0, 1],$$

which is a non-decreasing and right-continuous function. Note that $F_X^{-1}(0) = -\infty$,

$F_X^{-1+}(1) = +\infty$ and that all the probability mass of X is contained

in the interval $[F_X^{-1+}(0), F_X^{-1}(1)]$. Also note that $F_X^{-1}(p)$ and $F_X^{-1+}(p)$ are

finite for all $p \in (0, 1)$. In the sequel we will always use p as a value ranging

over the open interval (0, 1), unless stated otherwise.

In the following lemma, we state the relation between the inverse dis-

tribution functions of the random variables X and g(X) for a monotone

function g.

Lemma 1 (Inverse distribution function of g(X)).

Let X and g(X) be real-valued random variables and 0 < p < 1.

(a) If g is non-decreasing and left-continuous, then

$$F_{g(X)}^{-1}(p) = g\left(F_X^{-1}(p)\right).$$

(b) If g is non-decreasing and right-continuous, then

$$F_{g(X)}^{-1+}(p) = g\left(F_X^{-1+}(p)\right).$$

(c) If g is non-increasing and left-continuous, then

$$F_{g(X)}^{-1+}(p) = g\left(F_X^{-1}(1 - p)\right).$$

(d) If g is non-increasing and right-continuous, then

$$F_{g(X)}^{-1}(p) = g\left(F_X^{-1+}(1 - p)\right).$$

Proof. See Dhaene et al. (2002a).
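The distinction between the two inverses in Lemma 1 matters at the jump levels of a discrete cdf. The Python sketch below checks the four statements for a hypothetical three-point distribution and two continuous monotone functions, using exact rational arithmetic; all names and numbers are illustrative assumptions.

```python
from fractions import Fraction as Fr

def cdf(dist, x):
    return sum(q for v, q in dist.items() if v <= x)

def inv(dist, p):
    """F^{-1}(p) = inf{x : F(x) >= p}, for 0 < p < 1."""
    return min(v for v in dist if cdf(dist, v) >= p)

def inv_plus(dist, p):
    """F^{-1+}(p) = sup{x : F(x) <= p}; for a discrete law and 0 < p < 1
    this is the smallest atom v with F(v) > p."""
    return min(v for v in dist if cdf(dist, v) > p)

def transform(dist, g):
    """Distribution of g(X) for a discrete X."""
    out = {}
    for v, q in dist.items():
        out[g(v)] = out.get(g(v), Fr(0)) + q
    return out

# Hypothetical law of X; p is chosen so that 1 - p is a jump level of F_X.
X = {1: Fr(1, 5), 2: Fr(1, 2), 3: Fr(3, 10)}
p = Fr(3, 10)

def g_up(x):    # non-decreasing and continuous
    return 2 * x + 1

def g_down(x):  # non-increasing and continuous
    return -x

up, down = transform(X, g_up), transform(X, g_down)
print(inv(up, p) == g_up(inv(X, p)))                     # statement (a)
print(inv_plus(up, p) == g_up(inv_plus(X, p)))           # statement (b)
print(inv_plus(down, p) == g_down(inv(X, 1 - p)))        # statement (c)
print(inv(down, p) == g_down(inv_plus(X, 1 - p)))        # statement (d)
print(inv(down, p) == g_down(inv(X, 1 - p)))             # fails without the "+"
```

The last line illustrates that replacing $F_X^{-1+}$ by $F_X^{-1}$ in statement (d) gives a different value when 1 − p coincides with a level attained by the cdf.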

Hereafter, we will reserve the notation U and V for U(0, 1) random variables,

i.e. $F_U(p) = p$ and $F_U^{-1}(p) = p$ for all $0 < p < 1$, and the same for

V. One can prove that

$$X \stackrel{d}{=} F_X^{-1}(U) \stackrel{d}{=} F_X^{-1+}(U). \tag{1.9}$$

The first distributional equality is known as the quantile transform theo-

rem and follows immediately from (1.8). It states that a sample of random

numbers from a general cumulative distribution function FX can be gen-

erated from a sample of uniform random numbers. Note that FX has at

most a countable number of horizontal segments, implying that the last

two random variables in (1.9) only differ in a null-set of values of U . This

means that these random variables are equal with probability one.
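The quantile transform can be sketched in a few lines of Python: uniform random numbers are pushed through the generalized inverse $F_X^{-1}$ to obtain a sample from $F_X$. The discrete distribution, the seed and the sample size below are arbitrary illustrative choices.

```python
import random

# Hypothetical discrete law: atoms in increasing order with their probabilities.
values = [0, 10, 50]
probs = [0.6, 0.3, 0.1]

def inverse_cdf(u):
    """F^{-1}(u) = inf{x : F(x) >= u} for the discrete law above."""
    cum = 0.0
    for v, q in zip(values, probs):
        cum += q
        if cum >= u:
            return v
    return values[-1]  # guards against rounding in the cumulative sum

rng = random.Random(2005)  # fixed seed so that the run is reproducible
n = 20000
sample = [inverse_cdf(rng.random()) for _ in range(n)]

# Empirical frequencies should be close to the target probabilities.
freq = {v: sample.count(v) / n for v in values}
print(freq)
```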


1.1.3 Actuarial ordering of risks

In the actuarial literature, orderings of risks are an important tool for

comparing the attractiveness of different risks. Many examples and results

can be found in the work of Goovaerts et al. (1990), Van Heerwaarden

(1991) and Kaas et al. (1998).

The essential tool for the comparison of different concepts of order-

ings of risks will be the stop-loss transform/premium and its properties.

Throughout this section a risk X will be a random variable with finite

mean. The distribution function of X is denoted by $F_X$, and $\overline{F}_X = 1 - F_X$

is the corresponding survival function.

In the actuarial literature it is a common feature to replace a risk by

a “less favorable” risk that has a simpler structure, making it easier to

determine the distribution function. Of course, we have to clarify what we

mean with a “less favorable” risk. Therefore, we first introduce the notion

of “stop-loss premium” of a distribution function.

Definition 1 (Stop-loss premium).

The stop-loss premium with retention d of a risk X is defined by

$$\pi(X, d) := E[(X - d)_+] = \int_d^\infty \overline{F}_X(x)\,dx, \qquad -\infty < d < +\infty, \tag{1.10}$$

with the notation $(x - d)_+ = \max(x - d, 0)$.

From this formula it is clear that the stop-loss premium with retention

d can be considered as the weight of an upper tail of (the distribution

function of) X. Indeed, it is the surface between the cdf FX of X and

the constant function 1, from d on. For these reasons stop-loss premiums

contain a lot of information about riskiness of underlying distributions.
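Both expressions in (1.10) can be evaluated exactly for a discrete risk, since the survival function is then piecewise constant. The Python sketch below compares the expectation with the tail integral for a hypothetical distribution; the helper names are our own.

```python
from fractions import Fraction as Fr

# Hypothetical discrete risk: atoms and probabilities.
values = [0, 10, 50, 100]
probs = [Fr(1, 2), Fr(3, 10), Fr(3, 20), Fr(1, 20)]

def stop_loss(d):
    """pi(X, d) = E[(X - d)_+]."""
    return sum(q * max(v - d, 0) for v, q in zip(values, probs))

def survival(x):
    """Pr[X > x]."""
    return sum(q for v, q in zip(values, probs) if v > x)

def tail_integral(d):
    """Integral of the survival function from d to infinity.
    The survival function is constant between consecutive atoms,
    so the integral is a finite sum of rectangle areas."""
    knots = sorted({d, *[v for v in values if v > d]})
    return sum(survival(a) * (b - a) for a, b in zip(knots, knots[1:]))

for d in (-5, 0, 20, 75, 120):
    print(d, stop_loss(d), tail_integral(d))
```

For retentions below the smallest atom the stop-loss premium equals E[X] − d, and it vanishes for retentions above the largest atom.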

The following properties of the stop-loss premium can easily be deduced

from the definition.

Theorem 2 (Stop-loss properties).

The stop-loss premium π(X, .) has the following properties:

(i) π(X, .) is decreasing and convex;

(ii) The right-hand derivative $\pi'_+(X, .)$ exists and $-1 \leq \pi'_+(X, .) \leq 0$;

(iii) $\lim_{d \to +\infty} \pi(X, d) = 0$.

To every function $\pi : \mathbb{R}^+ \to \mathbb{R}$ that fulfils (i)-(iii) there is a risk X such

that π is the stop-loss premium of X. The distribution function of X is

given by $F_X(d) = \pi'_+(X, d) + 1$.

There are many concepts for comparing random variables. The most fa-

miliar one is the usual stochastic order introduced by Lehmann (1955).

In the actuarial and economic literature this ordering is sometimes called

stochastic dominance, see e.g. Goovaerts et al. (1990) and Van Heerwaar-

den (1991).

Definition 2 (Stochastic order).

We say that risk Y stochastically dominates risk X, written X ≤st Y , if

and only if FX(t) ≥ FY (t) for all t ∈ R.

In other words, X ≤st Y if their corresponding quantiles are ordered.

Note that the condition for stochastic dominance is very strong: it can

be easily seen that $X \leq_{st} Y$ if and only if there exists a bivariate vector

$(X', Y')$ with the same marginal distributions as X and Y and such that

$X' \leq Y'$ almost surely.

Several results for this ordering can be found in Shaked and Shanthikumar

(1994). In the following lemma, some equivalent characterizations

are given for stochastic dominance.

Lemma 2 (Characterizations for stochastic dominance).

X ≤st Y holds if and only if any of the following equivalent conditions is

satisfied:

1. Pr[X ≥ t] ≤ Pr[Y ≥ t], for all t ∈ R;

2. Pr[X > t] ≤ Pr[Y > t], for all t ∈ R;

3. E[φ(X)] ≤ E[φ(Y )], for all non-decreasing functions φ(.);

4. E[ψ(−X)] ≥ E[ψ(−Y )], for all non-decreasing functions ψ(.);

5. The function t→ π(Y, t) − π(X, t) is non-increasing.

A consequence of stochastic order X ≤st Y , i.e. a necessary condition for

it, is obviously that $E[X] \leq E[Y]$, and even $E[X] < E[Y]$ unless $X \stackrel{d}{=} Y$.

The stochastic dominance has a natural interpretation in terms of utility

theory. We have that X ≤st Y holds if and only if E[u(−X)] ≥ E[u(−Y )]

for every non-decreasing utility function u. So the pairs of risks X and


Y with X ≤st Y are exactly those pairs of losses about which all decision

makers with an increasing utility function agree.
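For discrete risks, stochastic dominance can be checked directly from Definition 2 by comparing the two cdfs at all atoms. The Python sketch below does this for three hypothetical distributions and also shows that stochastic order is only a partial order: some pairs of risks are not comparable.

```python
from fractions import Fraction as Fr

def cdf(dist, t):
    return sum(q for v, q in dist.items() if v <= t)

def mean(dist):
    return sum(v * q for v, q in dist.items())

def stochastically_dominated(X, Y):
    """True if X <=_st Y, i.e. F_X(t) >= F_Y(t) for all t.
    Both cdfs are step functions that only jump at atoms,
    so it suffices to compare them at the atoms of X and Y."""
    atoms = set(X) | set(Y)
    return all(cdf(X, t) >= cdf(Y, t) for t in atoms)

# Hypothetical discrete risks given as {value: probability}.
X = {0: Fr(1, 2), 10: Fr(3, 10), 50: Fr(1, 5)}
Y = {0: Fr(3, 10), 10: Fr(3, 10), 50: Fr(2, 5)}
Z = {5: Fr(1)}  # degenerate risk

print(stochastically_dominated(X, Y), mean(X), mean(Y))
print(stochastically_dominated(X, Z), stochastically_dominated(Z, X))
```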

For actuarial applications the stop-loss order is much more interesting.

This ordering was investigated by Bühlmann et al. (1977), Goovaerts et al.

(1990) and Van Heerwaarden (1991). It is equivalent to increasing convex

order, which is well known in operations research and statistics.

Definition 3 (Stop-loss order).

If X and Y are two risks, then X precedes Y in stop-loss order, written

X ≤sl Y , if and only if

π(X, d) ≤ π(Y, d) for all −∞ < d < +∞. (1.11)

In other words two risks are ordered in the stop-loss sense if their corres-

ponding stop-loss premiums are ordered. It is clear that stochastic order

induces stop-loss order.

Like stochastic order, stop-loss order between two risks X and Y implies

a corresponding ordering of their means. To prove this, assume that d < 0.

From the expression (1.10) in Definition 1 of stop-loss premiums as upper

tails, we immediately find the following equality:

$$d + \pi(X, d) = -\int_d^0 F_X(x)\,dx + \int_0^\infty (1 - F_X(x))\,dx \tag{1.12}$$

and also, letting $d \to -\infty$,

$$\lim_{d \to -\infty} \left(d + \pi(X, d)\right) = E[X].$$

Hence, adding d to both sides of the inequality (1.11) in Definition 3 and

taking the limit for $d \to -\infty$, we get $E[X] \leq E[Y]$.

A sufficient condition for $X \leq_{sl} Y$ to hold is that $E[X] \leq E[Y]$, together

with the condition that their cumulative distribution functions only cross

once. This means that there exists a real number c such that $F_X(x) \geq F_Y(x)$

for $x \geq c$, but $F_X(x) \leq F_Y(x)$ for $x < c$. Indeed, considering the

function $f(d) = \pi(Y, d) - \pi(X, d)$, we have that

$$\lim_{d \to -\infty} f(d) = E[Y] - E[X] \geq 0, \quad \text{and} \quad \lim_{d \to +\infty} f(d) = 0.$$

Further, f(d) first increases, and then decreases (from c on) but remains

non-negative.
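The once-crossing argument can be visualised numerically. In the Python sketch below, X is a hypothetical degenerate risk at 10 and Y a hypothetical three-point risk whose cdf crosses that of X once, at c = 10, with E[X] ≤ E[Y]. The difference f(d) = π(Y, d) − π(X, d) is evaluated on a grid of retentions.

```python
from fractions import Fraction as Fr

def stop_loss(dist, d):
    """pi(X, d) = E[(X - d)_+] for a discrete law {value: probability}."""
    return sum(q * max(v - d, 0) for v, q in dist.items())

def mean(dist):
    return sum(v * q for v, q in dist.items())

# Hypothetical risks: F_X <= F_Y below c = 10 and F_X >= F_Y from c on.
X = {10: Fr(1)}
Y = {0: Fr(1, 3), 10: Fr(1, 3), 30: Fr(1, 3)}

grid = [Fr(k, 2) for k in range(-20, 81)]  # retentions -10, -9.5, ..., 40
f = [stop_loss(Y, d) - stop_loss(X, d) for d in grid]

peak = f.index(max(f))
print(mean(X), mean(Y))
print(min(f) >= 0, grid[peak], f[peak])
```

On this grid f is non-negative, non-decreasing up to the crossing point and non-increasing afterwards, so the stop-loss premiums of X never exceed those of Y, although X and Y are not stochastically ordered.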


If two risks X and Y are ordered in the stop-loss sense, X ≤sl Y , this

means that X has uniformly smaller upper tails than Y , which in turns

means that a risk X is more attractive than a risk Y for an insurance

company. Moreover stop-loss order has a natural economic interpretation

in terms of expected utility. Indeed, it can be shown that X ≤sl Y if and

only if E[u(−X)

]≥ E

[u(−Y )

]holds for all non-decreasing concave real

functions u. This means that any risk-averse decision maker will prefer

to pay X instead of Y , which implies that acting as if the obligations X

are replaced by Y indeed leads to conservative or prudent decisions. This

characterization of stop-loss order in terms of utility functions is equivalent

to E[v(X)

]≤ E

[v(Y )

]holding for all non-decreasing convex functions v.

For this reason stop-loss order is alternatively called an increasing convex

order and denoted by ≤icx.
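For risks with finitely many outcomes, the defining inequality π(X, d) ≤ π(Y, d) can be checked directly. The following is a minimal sketch, not part of the original text: the helper names and the two example distributions are assumptions chosen for illustration. It uses the fact that the stop-loss premium of a discrete risk is piecewise linear in d with kinks only at the atoms, so comparing the premiums at the atoms of both risks is sufficient.

```python
def stop_loss(dist, d):
    """Stop-loss premium E[(X - d)_+] of a discrete risk {outcome: probability}."""
    return sum(p * max(x - d, 0.0) for x, p in dist.items())


def precedes_in_stop_loss_order(dist_x, dist_y, tol=1e-12):
    """Check pi(X, d) <= pi(Y, d) for all d, for two discrete risks."""
    # The difference of the two premium functions is piecewise linear with
    # kinks at the atoms, constant below the smallest atom and zero above the
    # largest one, so it suffices to compare the premiums at the atoms.
    knots = sorted(set(dist_x) | set(dist_y))
    return all(stop_loss(dist_x, d) <= stop_loss(dist_y, d) + tol for d in knots)


# Hypothetical example risks (assumed for illustration only).
X = {0.0: 0.5, 2.0: 0.5}  # mean 1
Y = {0.0: 0.5, 4.0: 0.5}  # mean 2

print(precedes_in_stop_loss_order(X, Y))  # is X <=_sl Y ?
print(precedes_in_stop_loss_order(Y, X))  # is Y <=_sl X ?
```

In this example the means are ordered as well, in line with the implication E[X] ≤ E[Y ] derived above.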

Recall that our original problem was to replace a risk X by a less

favorable risk Y , for which the distribution function is easier to obtain. If

X ≤sl Y , then also E[X] ≤ E[Y ], and it is intuitively clear that the best

approximations arise in the borderline case where E[X] = E[Y ]. This leads

to the so-called convex order.

Definition 4 (Convex order).

If X and Y are two risks, then X precedes Y in convex order, written

X ≤cx Y , if and only if

E[X] = E[Y ] and π(X, d) ≤ π(Y, d) for all −∞ < d < +∞. (1.13)

A sufficient condition for X ≤cx Y to hold is that E[X] = E[Y ], together

with the condition that their cumulative distribution functions only cross

once. This once-crossing condition can be observed to hold in most natural

examples, but it is of course easy to construct examples with X ≤cx Y and

distribution functions that cross more than once.

It can also be proven that X ≤cx Y if and only if E[v(X)] ≤ E[v(Y )] for

all convex functions v. This explains the name “convex order”. Note that

when characterizing stop-loss order, the convex functions v are additionally

required to be non-decreasing. Hence, stop-loss order is weaker: more pairs

of random variables are ordered.

In the utility context one will reformulate this condition to E[X] = E[Y ]
and E[u(−X)] ≥ E[u(−Y )] for all non-decreasing concave functions u.

These conditions represent the common preferences of all risk-averse deci-

sion makers between risks with equal mean. We summarize the properties

of convex order in the following lemma.


Lemma 3 (Characterizations for convex order).

X ≤cx Y if and only if any of the following equivalent conditions is satis-

fied:

1. E[X] = E[Y ] and π(X, d) ≤ π(Y, d) for all d ∈ R;

2. E[X] = E[Y ] and E[(d−X)+] ≤ E[(d− Y )+] for all d ∈ R;

3. π(X, d) ≤ π(Y, d) and E[(d−X)+] ≤ E[(d− Y )+] for all d ∈ R;

4. E[X] = E[Y ] and E[u(−X)] ≥ E[u(−Y )] for all concave functions u(.);

5. E[v(X)] ≤ E[v(Y )] for all convex functions v(.).

In case X ≤cx Y , the upper tails as well as the lower tails of Y eclipse the

corresponding tails of X, which means that extreme values are more likely

to occur for Y than for X. This observation also implies that X ≤cx Y is

equivalent to −X ≤cx −Y . Hence, the interpretation of risks as payments

or as incomes is irrelevant for the convex order.

Note that with stop-loss order, we are concerned with large values of

a random loss, and call the risk Y less attractive than X if the expected

values of all top parts (Y − d)+ are larger than those of X. Negative

values for these random variables are actually gains. With stability in

mind, excessive gains might also be unattractive for the decision maker,

for instance for tax reasons. In this situation, X could be considered to

be more attractive than Y if both the top parts (X − d)+ and the bottom

parts (d −X)+ have a lower expected value than for Y . Both conditions

just define the convex order introduced above.

Corollary 1 (Convex order and variance).

If X ≤cx Y then Var[X] ≤ Var[Y ].

Proof. It suffices to take the convex function v(x) = x^2 and to recall that E[X] = E[Y ].

Notice that the reverse implication does not hold in general. Compar-

ing variances is meaningful when comparing stop-loss premiums of convex

ordered risks. The following corollary links variances and stop-loss premi-

ums.


Corollary 2 (Variance and stop-loss premiums).

For any random variable X we can write

$$\frac{1}{2}\,\mathrm{Var}[X] = \int_{-\infty}^{+\infty} \Bigl(\pi(X, t) - \bigl(E[X] - t\bigr)_+\Bigr)\,dt. \tag{1.14}$$

Proof. See e.g. Kaas et al. (1998).

From relation (1.14) in Corollary 2 we deduce that if X ≤cx Y ,

$$\int_{-\infty}^{+\infty} \bigl|\pi(Y, t) - \pi(X, t)\bigr|\,dt = \frac{1}{2}\bigl(\mathrm{Var}[Y] - \mathrm{Var}[X]\bigr). \tag{1.15}$$

Thus, if X ≤cx Y , their stop-loss distance, i.e. the integrated absolute

difference of their respective stop-loss premiums, equals half the variance

difference between these two random variables.

As the integrand in (1.15) is non-negative, we find that if X ≤cx Y and
in addition Var[X] = Var[Y ], then X and Y must necessarily have equal
stop-loss premiums and hence the same distribution. We also find that
if X ≤cx Y , and X and Y are not equal in distribution, then Var[X] <
Var[Y ] must hold. Note that (1.14) and (1.15) have been derived under
the additional condition that X and Y have finite second moments, hence
both lim x→∞ x^2 (1 − FX(x)) and lim x→−∞ x^2 FX(x) are equal to 0 (and
similarly for Y ).
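Relation (1.15) can be verified by hand on a small example. The sketch below is an illustration under assumptions: the two discrete risks and all function names are hypothetical. X is degenerate at the mean of Y, so X ≤cx Y holds, and the integral is computed with the trapezoidal rule, which is exact here because the premium difference is piecewise linear between atoms and keeps one sign.

```python
def stop_loss(dist, d):
    """Stop-loss premium E[(X - d)_+] of a discrete risk {outcome: probability}."""
    return sum(p * max(x - d, 0.0) for x, p in dist.items())


def mean(dist):
    return sum(x * p for x, p in dist.items())


def variance(dist):
    m = mean(dist)
    return sum(p * (x - m) ** 2 for x, p in dist.items())


def stop_loss_distance(dist_x, dist_y):
    """Integral of |pi(Y, t) - pi(X, t)| dt for convex-ordered discrete risks."""
    knots = sorted(set(dist_x) | set(dist_y))
    gaps = [abs(stop_loss(dist_y, t) - stop_loss(dist_x, t)) for t in knots]
    # With equal means the difference vanishes outside [min atom, max atom];
    # inside, the trapezoidal rule on the atoms is exact.
    return sum(0.5 * (gaps[i] + gaps[i + 1]) * (knots[i + 1] - knots[i])
               for i in range(len(knots) - 1))


# Hypothetical example: X is the constant E[Y], hence X <=_cx Y.
X = {2.0: 1.0}
Y = {0.0: 0.25, 2.0: 0.5, 4.0: 0.25}

lhs = stop_loss_distance(X, Y)
rhs = 0.5 * (variance(Y) - variance(X))
print(lhs, rhs)
```

Both sides evaluate to the same number, half the variance of Y, since Var[X] = 0 in this example.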

In the following theorem we recall the characterization of stochastic dom-

inance in terms of Value-at-Risk, and a similar result characterizing stop-

loss order by Tail Value-at-Risk.

Theorem 3. For any random pair (X,Y ) we have that

1. X ≤st Y ⇔ VaRp[X] ≤ VaRp[Y ] for all p ∈ (0, 1);

2. X ≤sl Y ⇔ TVaRp[X] ≤ TVaRp[Y ] for all p ∈ (0, 1).
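The difference between the two characterizations can be seen on a pair of risks that is ordered in the stop-loss sense but not in the stochastic dominance sense. The sketch below is illustrative only: it assumes the usual definitions VaRp[X] = inf{x | FX(x) ≥ p} and TVaRp[X] = (1/(1 − p)) ∫ VaRq[X] dq over q in (p, 1), and the example risks and function names are hypothetical.

```python
def value_at_risk(dist, p, tol=1e-12):
    """VaR_p = inf{x : F(x) >= p} for a discrete risk {outcome: probability}."""
    cum = 0.0
    for x in sorted(dist):
        cum += dist[x]
        if cum >= p - tol:
            return x
    return max(dist)


def tail_value_at_risk(dist, p):
    """TVaR_p = (1 / (1 - p)) * integral of VaR_q over q in (p, 1)."""
    total, cum = 0.0, 0.0
    for x in sorted(dist):
        lower, cum = cum, cum + dist[x]
        # VaR_q equals x for q in (lower, cum]; keep the part lying above p.
        total += x * max(0.0, min(cum, 1.0) - max(lower, p))
    return total / (1.0 - p)


# Hypothetical example: X is degenerate at E[Y], so X <=_cx Y and X <=_sl Y.
X = {2.0: 1.0}
Y = {0.0: 0.25, 2.0: 0.5, 4.0: 0.25}

levels = [k / 20 for k in range(1, 20)]
tvar_ordered = all(tail_value_at_risk(X, p) <= tail_value_at_risk(Y, p) + 1e-12
                   for p in levels)
var_ordered = all(value_at_risk(X, p) <= value_at_risk(Y, p) for p in levels)
print(tvar_ordered, var_ordered)
```

On the chosen grid of levels the TVaR ordering holds while the VaR ordering fails for small p, which is consistent with X ≤sl Y holding without X ≤st Y.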


1.2 Comonotonicity

In an insurance context, one is often interested in the distribution function

of a sum of random variables. Such a sum appears for instance when con-

sidering the aggregate claims of an insurance portfolio over a certain refer-

ence period. In traditional risk theory, the individual risks of a portfolio are


usually assumed to be mutually independent. This is very convenient from

a mathematical point of view as the standard techniques for determining

the distribution function of aggregate claims, such as Panjer’s recursion

and convolution, are based on the independence assumption. Moreover, in

general the statistics gathered by the insurer only give information about

the marginal distributions of the risks, not about their joint distribution,

i.e. the way these risks are interrelated. The assumption of mutual independence,
however, does not always comply with reality, which may result
in an underestimation of the total risk. On the other hand, the mathematics
for dependent variables is less tractable, except when the variables are

comonotonic.

This section provides theoretical background for the concept of comono-

tonicity.

We start by defining comonotonicity of a set A of n-vectors in Rn. We
will denote an n-vector (x1, x2, . . . , xn) by ~x. For two n-vectors ~x and ~y, the
notation ~x ≤ ~y will be used for the componentwise order which is defined
by xi ≤ yi for all i = 1, 2, . . . , n. We will denote the (i, j)-projection of a
set A in Rn by Ai,j . It is formally defined by Ai,j = {(xi, xj) | ~x ∈ A}.

Definition 5 (Comonotonic set).

The set A ⊆ Rn is said to be comonotonic if for any ~x and ~y in A, either

~x ≤ ~y or ~y ≤ ~x holds.

Equivalently, a set A ⊆ Rn is comonotonic if, for any ~x and ~y in A, xi < yi for some
i implies ~x ≤ ~y. Hence, a comonotonic set is simultaneously
non-decreasing in each component. Notice that a comonotonic set is a ‘thin’

set: it cannot contain any subset of dimension larger than 1. Any subset of

a comonotonic set is also comonotonic. The proof of the following lemma

is straightforward.

Lemma 4. A ⊆ Rn is comonotonic if and only if the set Ai,j is comonotonic
for all i ≠ j in {1, 2, . . . , n}.

For a general set A, comonotonicity of the (i, i + 1)-projections Ai,i+1,

(i = 1, 2, . . . , n− 1), will not necessarily imply that A is comonotonic. As

a counterexample, consider the set A = {(x1, 1, x3) | 0 < x1, x3 < 1}. This

set is not comonotonic, although A1,2 and A2,3 are comonotonic.
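Definition 5, Lemma 4 and the counterexample can be checked mechanically on finite sets. The sketch below is illustrative: the function names are assumptions, and the two points are a hypothetical finite sample taken from the set A of the counterexample (indices in the code are zero-based).

```python
from itertools import combinations


def leq(x, y):
    """Componentwise order x <= y."""
    return all(a <= b for a, b in zip(x, y))


def is_comonotonic(points):
    """A finite set is comonotonic if any two of its points are ordered."""
    return all(leq(x, y) or leq(y, x) for x, y in combinations(list(points), 2))


def projection(points, i, j):
    """(i, j)-projection of a set of n-vectors (zero-based indices)."""
    return {(x[i], x[j]) for x in points}


# Two points of the form (x1, 1, x3) with 0 < x1, x3 < 1.
A = [(0.2, 1.0, 0.8), (0.6, 1.0, 0.3)]

print(is_comonotonic(projection(A, 0, 1)))  # A_{1,2}
print(is_comonotonic(projection(A, 1, 2)))  # A_{2,3}
print(is_comonotonic(projection(A, 0, 2)))  # A_{1,3}
print(is_comonotonic(A))
```

The adjacent projections are comonotonic, but the (1, 3)-projection is not, so by Lemma 4 the set itself cannot be comonotonic.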


Next, we define the notion of support of an n-dimensional random
vector ~X = (X1, . . . , Xn). Any subset A ⊆ Rn will be called a support
of ~X if Pr[~X ∈ A] = 1 and Pr[~X ∉ A] = 0. In general we will be
interested in supports which are “as small as possible”. Informally, the

smallest support of a random vector ~X is the subset of Rn that is obtained

by deleting from Rn all points which have a zero-probability neighborhood

(with respect to ~X). This support can be interpreted as the set of all

possible outcomes of ~X.

Definition 6 (Comonotonic random vector).

A random vector ~X = (X1, X2, . . . , Xn) is said to be comonotonic if it has

a comonotonic support.

From Definition 6 we can conclude that comonotonicity is a very strong

positive dependency structure. Indeed, if ~x and ~y are elements of the

comonotonic support of ~X, i.e. ~x and ~y are possible outcomes of ~X, then

they must be ordered component by component. This explains the term

comonotonic (common monotonic).

Comonotonicity of a random vector ~X implies that the higher the value

of one component Xj , the higher the value of any other component Xk.

This means that comonotonicity entails that no Xj is in any way a ‘hedge’
for another component Xk.

In the following theorem, some equivalent characterizations are given

for comonotonicity of a random vector.

Theorem 4 (Characterizations for comonotonicity).

A random vector ~X = (X1, X2, . . . , Xn) is comonotonic if and only if one
of the following equivalent conditions is satisfied:

1. ~X has a comonotonic support;

2. For all ~x = (x1, x2, . . . , xn), we have

$$F_{\vec X}(\vec x) = \min\bigl\{F_{X_1}(x_1), F_{X_2}(x_2), \ldots, F_{X_n}(x_n)\bigr\}; \tag{1.16}$$

3. For U ∼ U(0, 1), we have

$$\vec X \overset{d}{=} \bigl(F^{-1}_{X_1}(U), F^{-1}_{X_2}(U), \ldots, F^{-1}_{X_n}(U)\bigr); \tag{1.17}$$

4. There exist a random variable Z and non-decreasing functions fi
(i = 1, 2, . . . , n), such that

$$\vec X \overset{d}{=} \bigl(f_1(Z), f_2(Z), \ldots, f_n(Z)\bigr).$$

Proof. See Dhaene et al. (2002a).
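Characterizations (1.16) and (1.17) can be illustrated numerically. In the sketch below, which is an assumption-laden illustration rather than part of the cited proof, a comonotonic pair is built from a common grid of values u in (0, 1) through the quantile functions of an exponential and a standard normal marginal (both marginals, the grid and the function names are chosen only for the example). The empirical joint cdf is then compared with the minimum of the empirical marginal cdfs.

```python
import math
from statistics import NormalDist


def exp_quantile(u, rate=1.0):
    """Quantile function of an exponential distribution."""
    return -math.log(1.0 - u) / rate


norm_quantile = NormalDist(mu=0.0, sigma=1.0).inv_cdf

# Deterministic stand-in for a U(0, 1) sample: midpoints of a regular grid.
n = 1000
grid = [(k + 0.5) / n for k in range(n)]

# Construction (1.17): every component is driven by the same u.
sample = [(exp_quantile(u), norm_quantile(u)) for u in grid]


def joint_ecdf(points, x1, x2):
    return sum(a <= x1 and b <= x2 for a, b in points) / len(points)


def marginal_ecdf(points, i, x):
    return sum(pt[i] <= x for pt in points) / len(points)


for x1, x2 in [(1.0, 0.5), (0.3, 1.2), (2.0, -1.0)]:
    joint = joint_ecdf(sample, x1, x2)
    bound = min(marginal_ecdf(sample, 0, x1), marginal_ecdf(sample, 1, x2))
    print(x1, x2, joint, bound)
```

Because both components increase with u, the events {X1 ≤ x1} and {X2 ≤ x2} are nested in the sample, which is why the joint frequency coincides with the smaller marginal frequency.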

From (1.16) we see that, in order to find the probability of all the outcomes

of n comonotonic risks Xi being less than xi (i = 1, . . . , n) one simply

takes the probability of the least likely of these n events. It is obvious

that for any random vector (X1, . . . , Xn), not necessarily comonotonic,

the following inequality holds:

$$\Pr[X_1 \le x_1, \ldots, X_n \le x_n] \le \min\bigl\{F_{X_1}(x_1), \ldots, F_{X_n}(x_n)\bigr\}, \tag{1.18}$$

and it is well-known that the function min{FX1(x1), . . . , FXn(xn)} is indeed
the multivariate cdf of a random vector $\bigl(F^{-1}_{X_1}(U), \ldots, F^{-1}_{X_n}(U)\bigr)$, which
has the same marginal distributions as (X1, . . . , Xn). Inequality (1.18)

states that in the class of all random vectors (X1, . . . , Xn) with the same

marginal distributions, the probability that all Xi simultaneously realize

large values is maximized if the vector is comonotonic, suggesting that

comonotonicity is indeed a very strong positive dependency structure. In

the special case that all marginal distribution functions FXi are identical,

we find from (1.17) that comonotonicity of ~X is equivalent to saying that

X1 = X2 = · · · = Xn holds almost surely.

A standard way of modelling situations where individual random vari-

ables X1, . . . , Xn are subject to the same external mechanism is to use a

secondary mixing distribution. The uncertainty about the external mech-

anism is then described by a structure variable z, which is a realization of

a random variable Z and acts as a (random) parameter of the distribution

of ~X. The aggregate claims can then be seen as a two-stage process: first,

the external parameter Z = z is drawn from the distribution function FZ
of Z. The claim amount of each individual risk Xi is then obtained as a

realization from the conditional distribution function of Xi given Z = z.

A special type of such a mixing model is the case where given Z = z, the
claim amounts Xi are degenerate on xi, where the xi = xi(z) are non-decreasing
in z. This means that $(X_1, \ldots, X_n) \overset{d}{=} (f_1(Z), \ldots, f_n(Z))$ where
all functions fi are non-decreasing. Hence, (X1, . . . , Xn) is comonotonic.

Such a model is in a sense an extreme form of a mixing model, as in this


case the external parameter Z = z completely determines the aggregate

claims.

If U ∼ U(0, 1), then also 1 − U ∼ U(0, 1). This implies that comonotonicity
of ~X can also be characterized by

$$\vec X \overset{d}{=} \bigl(F^{-1}_{X_1}(1 - U), F^{-1}_{X_2}(1 - U), \ldots, F^{-1}_{X_n}(1 - U)\bigr).$$

Similarly, one can prove that ~X is comonotonic if and only if there exist a
random variable Z and non-increasing functions fi, (i = 1, 2, . . . , n), such
that

$$\vec X \overset{d}{=} \bigl(f_1(Z), f_2(Z), \ldots, f_n(Z)\bigr).$$

In the sequel, for any random vector (X1, . . . , Xn), the notation $(X^c_1, \ldots, X^c_n)$
will be used to indicate a comonotonic random vector with the same
marginals as (X1, . . . , Xn). From (1.17) we find that for any random vector
~X the outcome of its comonotonic counterpart $\vec X^c = (X^c_1, \ldots, X^c_n)$ lies
with probability one in the following set

$$\Bigl\{\bigl(F^{-1}_{X_1}(p), F^{-1}_{X_2}(p), \ldots, F^{-1}_{X_n}(p)\bigr) \Bigm| 0 < p < 1\Bigr\}.$$

The following theorem states essentially that the comonotonicity of a random
vector is equivalent to pairwise comonotonicity.

Theorem 5 (Pairwise comonotonicity).

A random vector ~X is comonotonic if and only if the couples (Xi, Xj) are
comonotonic for all i and j in {1, 2, . . . , n}.

The next theorem characterizes a comonotonic random couple by means
of Pearson’s correlation coefficient r.

Theorem 6 (Comonotonicity and maximum correlation).

For any random vector (X1, X2) the following inequality holds:

$$r(X_1, X_2) \le r\bigl(F^{-1}_{X_1}(U), F^{-1}_{X_2}(U)\bigr), \tag{1.19}$$

with strict inequality when (X1, X2) is not comonotonic.

As a special case of (1.19), we find that $r\bigl(F^{-1}_{X_1}(U), F^{-1}_{X_2}(U)\bigr) \ge 0$ always
holds. In Denuit & Dhaene (2003) it is shown that other dependence

measures such as Kendall’s τ and Spearman’s ρ equal 1 (and thus are also

maximal) if and only if the variables are comonotonic.
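Inequality (1.19) does not say that the maximal Pearson correlation equals 1. A minimal sketch, restricted to the bivariate lognormal family as an assumed example: for (X1, X2) = (exp(Y1), exp(Y2)) with (Y1, Y2) bivariate normal with zero means, standard deviations sigma1, sigma2 and correlation rho, the correlation of (X1, X2) has a closed form that is increasing in rho, and rho = 1 gives the comonotonic pair. The function name and parameter values are hypothetical.

```python
import math


def lognormal_correlation(sigma1, sigma2, rho):
    """Pearson correlation of (exp(Y1), exp(Y2)) for bivariate normal (Y1, Y2)."""
    numerator = math.exp(rho * sigma1 * sigma2) - 1.0
    denominator = math.sqrt((math.exp(sigma1 ** 2) - 1.0)
                            * (math.exp(sigma2 ** 2) - 1.0))
    return numerator / denominator


sigma1, sigma2 = 1.0, 2.0
r_comonotonic = lognormal_correlation(sigma1, sigma2, 1.0)

rhos = [k / 10 for k in range(-10, 11)]
correlations = [lognormal_correlation(sigma1, sigma2, rho) for rho in rhos]

print(r_comonotonic)      # maximal correlation within this family
print(max(correlations))  # attained at rho = 1
```

With these parameters the comonotonic pair has a correlation of roughly two thirds, so a Pearson correlation well below 1 is compatible with the strongest possible positive dependence between the given marginals.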

In the following theorem we recall that the Value-at-Risk (VaRp), the

Tail Value-at-Risk (TVaRp) and the Expected Shortfall (ESFp) are addi-

tive for comonotonic risks.


Theorem 7 (Comonotonicity and risk measures).

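Consider a comonotonic random vector $(X^c_1, X^c_2, \ldots, X^c_n)$, and let
$S^c = X^c_1 + X^c_2 + \cdots + X^c_n$. Then for all p ∈ (0, 1) one has that

$$\mathrm{VaR}_p[S^c] = \sum_{i=1}^{n} \mathrm{VaR}_p[X_i]; \tag{1.20}$$

$$\mathrm{TVaR}_p[S^c] = \sum_{i=1}^{n} \mathrm{TVaR}_p[X_i]; \tag{1.21}$$

$$\mathrm{ESF}_p[S^c] = \sum_{i=1}^{n} \mathrm{ESF}_p[X_i]. \tag{1.22}$$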


The computation of the most important risk measures is very easy for sums

of comonotonic random variables, since it suffices to perform calculations

for marginal distributions and add up the resulting values. Throughout

the rest of this thesis we will use the property of additivity of a quantile

function for comonotonic risks.
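The additivity in (1.20) and (1.21) can be observed on a comonotonic sample. The sketch below is illustrative and rests on assumptions: the exponential and lognormal marginals, the grid standing in for a uniform sample, and the crude order-statistic estimators of VaR and TVaR are all choices made for the example.

```python
import math
import random
from statistics import NormalDist

std_normal = NormalDist()
n = 1000
grid = [(k + 0.5) / n for k in range(n)]

# Comonotonic sample: both components are increasing functions of the same u.
pairs = [(-math.log(1.0 - u), math.exp(0.5 * std_normal.inv_cdf(u))) for u in grid]
random.Random(0).shuffle(pairs)  # the order of the observations is irrelevant

x1 = [a for a, _ in pairs]
x2 = [b for _, b in pairs]
sums = [a + b for a, b in pairs]


def empirical_var(values, p):
    """Order-statistic estimate of the p-quantile."""
    ordered = sorted(values)
    return ordered[max(0, math.ceil(len(ordered) * p) - 1)]


def empirical_tvar(values, p):
    """Average of the observations from the p-quantile upwards."""
    ordered = sorted(values)
    tail = ordered[max(0, math.ceil(len(ordered) * p) - 1):]
    return sum(tail) / len(tail)


for p in (0.5, 0.9, 0.99):
    print(p,
          empirical_var(sums, p), empirical_var(x1, p) + empirical_var(x2, p),
          empirical_tvar(sums, p), empirical_tvar(x1, p) + empirical_tvar(x2, p))
```

For each level the estimate for the sum agrees with the sum of the marginal estimates, because sorting the sums sorts both components at the same time.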

Chapter 2

Convex bounds

Summary In many actuarial and financial problems the distribution of

a sum of dependent random variables is of interest. In general, however,

this distribution function cannot be obtained analytically because of the

complex underlying dependency structure. Kaas et al. (2000) and Dhaene

et al. (2002a) propose a possible way out by considering upper and lower

bounds for (the distribution function of) such a sum that allow explicit cal-

culations of various actuarial quantities. When lower and upper bounds are

close to each other, together they can provide reliable information about

the original and more complex variable. In particular this technique is very

useful to find reliable estimations of upper quantiles and stop-loss premi-

ums. We summarize the main results for deriving lower and upper bounds

and we construct sharper upper bounds for stop-loss premiums, based upon

the traditional comonotonic bounds. The idea of convex upper and lower

bounds is generalized to the case of scalar products of non-negative random

variables. We apply the derived results to the case of general discounted

cash flows, with stochastic payments. Numerous numerical illustrations

are provided, demonstrating that the derived methodology gives very ac-

curate approximations for the underlying distribution functions and the

corresponding risk measures, like quantiles and stop-loss premiums.

2.1 Introduction

In many financial and actuarial applications where a sum of stochastic

terms is involved, the distribution of the quantity under investigation is too

difficult to obtain. It is well-known that in general the distribution function


of a sum of dependent random variables cannot be determined analytically.

Therefore, instead of aiming to calculate the exact distribution, we will look

for approximations (bounds), in the convex order sense, with a simpler

structure.

The first approximation we will consider for the distribution function

of a sum of dependent random variables is derived by approximating the

dependence structure between the random variables involved by a comono-

tonic dependence structure. If the dependency structure between the sum-

mands of such a sum is strong enough, this upper bound in convex order

performs reasonably well.

The second approximation, which is derived by considering conditional
expectations, takes part of the dependence structure into account. This

lower bound in convex order turns out to be extremely useful to evaluate

the quality of approximation provided by the upper bound. The lower

bound can also be applied as an approximation of the underlying distribu-

tion. This choice is not actuarially prudent, but the relative error of this

approximation significantly outperforms the relative error of the upper

bound.

When lower and upper bounds are close to each other, together they

can provide reliable information about the original and more complex vari-

able. We emphasize that the bounds are in convex order, which does not

mean that the real value always lies between these two approximations. In

particular this technique is very useful to find reliable estimations of upper

quantiles and stop-loss premiums.

Section 2 recalls these theoretical results of Dhaene et al. (2002a).

The lower bound approximates the real stop-loss premium very accurately,
but the comonotonic upper bounds perform rather poorly. Therefore, in

Section 3 we construct sharper upper bounds based upon the traditional

comonotonic bounds. Making use of the ideas of Rogers and Shi (1995),

the first upper bound is obtained as the comonotonic lower bound plus

an error term. Next, this bound is refined by making the error term de-

pendent on the retention in the stop-loss premium. Further, we study the

case that the stop-loss premium can be decomposed into two parts. One
part can be evaluated exactly, while comonotonic bounds are applied to
the other part. The application to the lognormal case is presented at the end of

Section 3.

In Section 4 we illustrate the accuracy of the comonotonic approxima-

tions by means of an application in the context of discounted reserves.


Section 5 extends the methodology of Dhaene et al. (2002a,b) for deriv-

ing lower and upper bounds of a sum of dependent variables to the case of

scalar products of independent random vectors. We derive a procedure for

calculating the lower and upper bounds in case one of the vectors follows

the multivariate lognormal law.

In Section 6 we apply these results to the case of general discounted

cash flows, with stochastic payments. Numerous numerical illustrations

are provided, demonstrating that the derived methodology gives very ac-

curate approximations for the underlying distribution functions and the

corresponding risk measures.

Sections 2 and 3 in this chapter are mainly based on Hoedemakers, Darkiewicz,
Deelstra, Dhaene & Vanmaele (2005). The results in Section 4

come from Hoedemakers & Goovaerts (2004). The generalization to the

scalar product of two random vectors in Section 5 is based on Hoedemak-

ers, Darkiewicz & Goovaerts (2005) and Section 6 is taken from Ahcan,

Darkiewicz, Goovaerts & Hoedemakers (2005).

2.2 Convex bounds for sums of dependent random variables

In the actuarial context one encounters quite often random variables of the

type

S = X1 +X2 + · · · +Xn,

where the terms Xi are not mutually independent, and the multivariate
distribution function of the random vector ~X = (X1, X2, . . . , Xn) is not
completely specified: one only knows the marginal distribution functions
of the random variables Xi. In such cases, to be able to make decisions
it may be helpful to find the dependence structure for the random vector
(X1, . . . , Xn) producing the least favorable aggregate claims S with given
marginals. Therefore, given the marginal distributions of the terms in a
random variable $S = \sum_{i=1}^{n} X_i$, we shall look for a joint distribution with
a smaller resp. larger sum, in the convex order sense.

If S consists of a sum of random variables (X1, . . . , Xn), replacing the

joint distribution of (X1, . . . , Xn) by the comonotonic joint distribution

yields an upper bound for S in the convex order. On the other hand, applying
conditioning to S provides a lower bound. Finally, if we combine


both ideas, then we end up with an improved upper bound. This is formal-

ized in the following theorem, which is taken from Dhaene et al. (2002a)

and Kaas et al. (2000).

Theorem 8 (Bounds for a sum of random variables).

Consider a sum of random variables S = X1 +X2 + . . . +Xn and define

the following related random variables:

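$$S^l = E[X_1 \mid \Lambda] + E[X_2 \mid \Lambda] + \cdots + E[X_n \mid \Lambda], \tag{2.1}$$

$$S^c = F^{-1}_{X_1}(U) + F^{-1}_{X_2}(U) + \cdots + F^{-1}_{X_n}(U), \tag{2.2}$$

$$S^u = F^{-1}_{X_1 \mid \Lambda}(U) + F^{-1}_{X_2 \mid \Lambda}(U) + \cdots + F^{-1}_{X_n \mid \Lambda}(U), \tag{2.3}$$

with U a U(0,1) random variable and Λ an arbitrary random variable.
Here $F^{-1}_{X_i \mid \Lambda}(U)$ is the notation for the random variable fi(U,Λ), with the
function fi defined by $f_i(u, \lambda) = F^{-1}_{X_i \mid \Lambda = \lambda}(u)$.

The following relations then hold:

$$S^l \le_{cx} S \le_{cx} S^u \le_{cx} S^c.$$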

Proof. See e.g. Dhaene et al. (2002a).
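By Corollary 1, the convex ordering in Theorem 8 implies Var[Sl] ≤ Var[S] ≤ Var[Sc]. The sketch below checks this in a lognormal setting that is assumed purely for illustration: Xi = exp(Yi) with (Y1, . . . , Yn) multivariate normal, and the conditioning variable is taken as Λ = Y1 + · · · + Yn. Under these assumptions E[Xi|Λ] is again lognormal and all three variances have closed forms; the function names and parameter values are hypothetical.

```python
import math


def variance_of_sum(mu, sigma, c):
    """Variance of sum_i X_i when Cov[X_i, X_j] = E[X_i] E[X_j] (exp(c[i][j]) - 1)."""
    means = [math.exp(m + 0.5 * s * s) for m, s in zip(mu, sigma)]
    n = len(mu)
    return sum(means[i] * means[j] * (math.exp(c[i][j]) - 1.0)
               for i in range(n) for j in range(n))


def bound_variances(mu, sigma, corr):
    """Return (Var[S^l], Var[S], Var[S^c]) for X_i = exp(Y_i), Lambda = sum of Y_i."""
    n = len(mu)
    cov = [[corr[i][j] * sigma[i] * sigma[j] for j in range(n)] for i in range(n)]
    var_lambda = sum(cov[i][j] for i in range(n) for j in range(n))
    # r[i] is the correlation between Y_i and Lambda.
    r = [sum(cov[i]) / (sigma[i] * math.sqrt(var_lambda)) for i in range(n)]
    c_lower = [[r[i] * r[j] * sigma[i] * sigma[j] for j in range(n)] for i in range(n)]
    c_upper = [[sigma[i] * sigma[j] for j in range(n)] for i in range(n)]
    return (variance_of_sum(mu, sigma, c_lower),
            variance_of_sum(mu, sigma, cov),
            variance_of_sum(mu, sigma, c_upper))


# Hypothetical parameters: two standard lognormal terms with correlation 0.5.
var_l, var_s, var_c = bound_variances(
    mu=[0.0, 0.0], sigma=[1.0, 1.0], corr=[[1.0, 0.5], [0.5, 1.0]])
print(var_l, var_s, var_c)
```

In this example the variance of the lower bound is much closer to Var[S] than the variance of the comonotonic upper bound, which matches the remark that conditioning retains part of the dependence structure.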

The comonotonic upper bound changes the original copula, but keeps the

marginal distributions unchanged. The comonotonic lower bound on the

other hand, changes both the copula and the marginals involved. Intu-

itively, one can expect that an appropriate choice of the conditioning vari-

able Λ will lead to much better approximations compared to the upper

bound.

The upper bound Sc is the most dangerous sum of random variables

with the same marginal distributions as the original terms Xj in S. Indeed,

the upper bound Sc now consists of a sum of comonotonic variables all

depending on the same random variable U . If one can find a conditioning

random variable Λ with the property that all random variables E[Xj |Λ] are

non-increasing functions of Λ (or all are non-decreasing functions of Λ),

then the lower bound $S^l = \sum_{j=1}^{n} E[X_j \mid \Lambda]$ is also a sum of n comonotonic

random variables.

We recall from Dhaene et al. (2002a) and the references therein the

procedures for obtaining the lower and upper bounds for stop-loss pre-

miums of sums S of dependent random variables by using the notion of

comonotonicity.


2.2.1 The comonotonic upper bound

As proven in Dhaene et al. (2002a), the convex-largest sum of the components
of a random vector with given marginals is obtained by the comonotonic
sum $S^c = X^c_1 + X^c_2 + \cdots + X^c_n$ with

$$S^c \overset{d}{=} \sum_{i=1}^{n} F^{-1}_{X_i}(U), \tag{2.4}$$

where U denotes in the following a U(0, 1) random variable.

Kaas et al. (2000) have proved that the inverse distribution function of

a sum of comonotonic random variables is simply the sum of the inverse

distribution functions of the marginal distributions. See also Theorem 7.

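Therefore, given the inverse functions $F^{-1}_{X_i}$, the cumulative distribution
function of $S^c = X^c_1 + X^c_2 + \cdots + X^c_n$ can be determined as follows:

$$\begin{aligned}
F_{S^c}(x) &= \sup\bigl\{p \in (0, 1) \mid F_{S^c}(x) \ge p\bigr\} = \sup\bigl\{p \in (0, 1) \mid F^{-1}_{S^c}(p) \le x\bigr\} \\
&= \sup\Bigl\{p \in (0, 1) \Bigm| \sum_{i=1}^{n} F^{-1}_{X_i}(p) \le x\Bigr\}.
\end{aligned} \tag{2.5}$$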

Moreover, in case of strictly increasing and continuous marginals, the cdf

FSc(x) is uniquely determined by

$$F^{-1}_{S^c}\bigl(F_{S^c}(x)\bigr) = \sum_{i=1}^{n} F^{-1}_{X_i}\bigl(F_{S^c}(x)\bigr) = x, \qquad F^{-1+}_{S^c}(0) < x < F^{-1}_{S^c}(1). \tag{2.6}$$

Hereafter we restrict ourselves to this case of strictly increasing and con-

tinuous marginals.
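Equation (2.6) determines FSc(x) as the root of a monotone function of p, so a simple bisection is enough. The sketch below assumes lognormal marginals with hypothetical parameters (mu_i, sigma_i); the function names, the bracketing interval and the tolerance are choices made for the example, not prescriptions from the text.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def lognormal_quantile(p, mu, sigma):
    """Inverse cdf of a lognormal(mu, sigma) distribution."""
    return math.exp(mu + sigma * STD_NORMAL.inv_cdf(p))


def comonotonic_sum_quantile(p, params):
    """Inverse cdf of S^c: the sum of the marginal inverse cdfs."""
    return sum(lognormal_quantile(p, mu, sigma) for mu, sigma in params)


def comonotonic_sum_cdf(x, params, tol=1e-12):
    """Solve sum_i F_i^{-1}(p) = x for p by bisection, cf. (2.5) and (2.6)."""
    lo, hi = 1e-12, 1.0 - 1e-12
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if comonotonic_sum_quantile(mid, params) <= x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical marginals (mu_i, sigma_i).
params = [(0.0, 0.2), (0.1, 0.5), (-0.2, 1.0)]

p = comonotonic_sum_cdf(3.0, params)
print(p, comonotonic_sum_quantile(p, params))  # the second value is close to 3.0
```

Only one-dimensional quantile evaluations are needed, whatever the number of terms, which reflects the remark that Sc essentially has a one-dimensional distribution.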

Dhaene et al. (2000) have proved the following theorem, which states that the
stop-loss premiums of a sum of comonotonic random variables can easily
be obtained from the stop-loss premiums of the terms.

Theorem 9 (Stop-loss premium of comonotonic sum).

The stop-loss premium, denoted by πcub(S, d), of the sum Sc of the components
of the comonotonic random vector $(X^c_1, X^c_2, \ldots, X^c_n)$ at retention
d is given by

$$\pi^{cub}(S, d) = \sum_{i=1}^{n} \pi\Bigl(X_i, F^{-1}_{X_i}\bigl(F_{S^c}(d)\bigr)\Bigr), \qquad \bigl(F^{-1+}_{S^c}(0) < d < F^{-1}_{S^c}(1)\bigr). \tag{2.7}$$
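The decomposition (2.7) can be compared with a direct numerical evaluation of E[(Sc − d)+]. The sketch below is an illustration under assumptions: the marginals are taken to be lognormal with hypothetical parameters, the closed-form lognormal stop-loss premium is the standard one, and the midpoint rule over u in (0, 1) is a crude check rather than a recommended method.

```python
import math
from statistics import NormalDist

PHI = NormalDist()


def quantile(p, mu, sigma):
    """Inverse cdf of a lognormal(mu, sigma) distribution."""
    return math.exp(mu + sigma * PHI.inv_cdf(p))


def lognormal_stop_loss(d, mu, sigma):
    """E[(X - d)_+] for X lognormal(mu, sigma) and d > 0."""
    d1 = (mu + sigma * sigma - math.log(d)) / sigma
    return math.exp(mu + 0.5 * sigma * sigma) * PHI.cdf(d1) - d * PHI.cdf(d1 - sigma)


def comonotonic_cdf(x, params):
    """F_{S^c}(x), found by bisection on sum_i F_i^{-1}(p) = x."""
    lo, hi = 1e-12, 1.0 - 1e-12
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if sum(quantile(mid, m, s) for m, s in params) <= x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def comonotonic_stop_loss(d, params):
    """Right-hand side of (2.7): marginal premiums at retentions F_i^{-1}(F_{S^c}(d))."""
    p = comonotonic_cdf(d, params)
    return sum(lognormal_stop_loss(quantile(p, m, s), m, s) for m, s in params)


def stop_loss_by_integration(d, params, n=20000):
    """Midpoint-rule approximation of E[(S^c - d)_+] using S^c = sum_i F_i^{-1}(U)."""
    total = 0.0
    for k in range(n):
        u = (k + 0.5) / n
        total += max(sum(quantile(u, m, s) for m, s in params) - d, 0.0)
    return total / n


params = [(0.0, 0.2), (0.1, 0.5)]  # hypothetical (mu_i, sigma_i)
d = 2.5
print(comonotonic_stop_loss(d, params), stop_loss_by_integration(d, params))
```

The individual retentions F−1Xi(FSc(d)) add up to d, so the formula splits the retention of the sum over the terms.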


If the only information available concerning the multivariate distribution

function of the random vector (X1, . . . , Xn) consists of the marginal distribution
functions of the Xi, then the distribution function of
$S^c = F^{-1}_{X_1}(U) + F^{-1}_{X_2}(U) + \cdots + F^{-1}_{X_n}(U)$ is a prudent choice for approximating
the unknown distribution function of S = X1 + X2 + · · · + Xn. It is a

supremum in terms of convex order. It is the best upper bound that can

be derived under the given conditions.

We end this part about the comonotonic upper bound by summarizing

the main advantages of using $S^c = X^c_1 + X^c_2 + \cdots + X^c_n$ instead of
S = X1 + X2 + · · · + Xn:

• Replacing the distribution function of S by the distribution function

of Sc is a prudent strategy in the framework of utility theory: the

real distribution function is replaced by a less attractive one.

• The random variables S and Sc have the same expected value. As

these random variables are ordered in the convex order sense, we

have that every moment of order 2k (k = 1, 2, . . .) of S is smaller

than the corresponding moment of Sc. Many actuarially relevant

quantities reflect convex order, for instance both the ruin probability

and the Lundberg upper bound for it increase when the claim size

distribution is replaced by a convex larger one. Other examples are

zero-utility premiums such as the exponential premium, and of course

stop-loss premiums for any retention d.

• The cdf of Sc can easily be obtained; essentially, Sc has a one-

dimensional distribution, depending only on the random variable U .

The distribution function of S can only be obtained if the dependency

structure is known. Even if this dependency structure is known, it

can be hard to determine the distribution function of S from it.

• The stop-loss premiums of Sc follow from stop-loss premiums of the

marginal random variables involved. Computing the stop-loss pre-

miums of S can only be carried out when the dependency structure

is known, and in general requires n integrations to be performed.

2.2.2 The improved comonotonic upper bound

Let us now assume that we have some additional information available

concerning the stochastic nature of (X1, . . . , Xn). More precisely, we as-


sume that there exists some random variable Λ with a given distribution

function, such that we know the conditional cumulative distribution func-

tions, given Λ = λ, of the random variables Xi, for all possible values of λ.

In fact, Kaas et al. (2000) define the improved comonotonic upper bound

Su as

$$S^u = F^{-1}_{X_1 \mid \Lambda}(U) + F^{-1}_{X_2 \mid \Lambda}(U) + \cdots + F^{-1}_{X_n \mid \Lambda}(U). \tag{2.8}$$

In order to obtain the distribution function of Su, observe that given the

event Λ = λ, the random variable Su is a sum of comonotonic random

variables.

Hence,

$$F^{-1}_{S^u \mid \Lambda = \lambda}(p) = \sum_{i=1}^{n} F^{-1}_{X_i \mid \Lambda = \lambda}(p), \qquad p \in (0, 1).$$

Given Λ = λ, the cdf of Su is defined by

$$F_{S^u \mid \Lambda = \lambda}(x) = \sup\Bigl\{p \in (0, 1) \Bigm| \sum_{i=1}^{n} F^{-1}_{X_i \mid \Lambda = \lambda}(p) \le x\Bigr\}.$$

The cdf of Su then follows from

$$F_{S^u}(x) = \int_{-\infty}^{+\infty} F_{S^u \mid \Lambda = \lambda}(x)\,dF_\Lambda(\lambda).$$

If the marginal cdf’s FXi|Λ=λ are strictly increasing and continuous, then

FSu|Λ=λ(x) is a solution to

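$$\sum_{i=1}^{n} F^{-1}_{X_i \mid \Lambda = \lambda}\bigl(F_{S^u \mid \Lambda = \lambda}(x)\bigr) = x, \qquad x \in \bigl(F^{-1+}_{S^u \mid \Lambda = \lambda}(0),\, F^{-1}_{S^u \mid \Lambda = \lambda}(1)\bigr). \tag{2.9}$$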

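In this case, we also find that for any $d \in \bigl(F^{-1+}_{S^u \mid \Lambda = \lambda}(0),\, F^{-1}_{S^u \mid \Lambda = \lambda}(1)\bigr)$:

$$E\bigl[(S^u - d)_+ \mid \Lambda = \lambda\bigr] = \sum_{i=1}^{n} E\Bigl[\Bigl(X_i - F^{-1}_{X_i \mid \Lambda = \lambda}\bigl(F_{S^u \mid \Lambda = \lambda}(d)\bigr)\Bigr)_+ \Bigm| \Lambda = \lambda\Bigr],$$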

from which the stop-loss premium at retention d of Su, which we will

denote by πicub(S, d,Λ), can be determined by weighted integration with

respect to λ over the real line.


2.2.3 The lower bound

Let ~X = (X1, . . . , Xn) be a random vector with given marginal cumula-

tive distribution functions FX1 , FX2 , . . . , FXn . Let us now assume that we

have some additional information available concerning the stochastic na-

ture of (X1, . . . , Xn). More precisely, we assume that there exists some

random variable Λ with a given distribution function, such that we know

the conditional distribution, given Λ = λ, of the random variables Xi, for

all possible values of λ. We recall from Kaas et al. (2000) that a lower

bound, in the sense of convex order, for S = X1 +X2 + · · · +Xn is

Sl = E [S|Λ] . (2.10)

This idea can also be found in Rogers and Shi (1995) for the continuous

and lognormal case. Let us further assume that the random variable Λ is

such that all E [Xi|Λ] are non-decreasing and continuous functions of Λ,

then Sl is a comonotonic sum.

The quantiles of the lower bound Sl then follow from

$$F^{-1}_{S^l}(p) = \sum_{i=1}^{n} F^{-1}_{E[X_i \mid \Lambda]}(p) = \sum_{i=1}^{n} E\bigl[X_i \mid \Lambda = F^{-1}_\Lambda(p)\bigr], \qquad p \in (0, 1), \tag{2.11}$$

and the cdf of Sl is according to (2.5) given by

$$F_{S^l}(x) = \sup\Bigl\{p \in (0, 1) \Bigm| \sum_{i=1}^{n} E\bigl[X_i \mid \Lambda = F^{-1}_\Lambda(p)\bigr] \le x\Bigr\}. \tag{2.12}$$

Using Theorem 9, the stop-loss premiums with retention d read, for
$F^{-1+}_{S^l}(0) < d < F^{-1}_{S^l}(1)$,

$$\pi^{lb}(S, d, \Lambda) = \sum_{i=1}^{n} \pi\Bigl(E[X_i \mid \Lambda],\, F^{-1}_{E[X_i \mid \Lambda]}\bigl(F_{S^l}(d)\bigr)\Bigr).$$

When in addition the cdf’s of the random variables E [Xi|Λ] are strictly
increasing and continuous, then the cdf of Sl is also strictly increasing and
continuous, and we get analogously to (2.6) for all $x \in \bigl(F^{-1+}_{S^l}(0),\, F^{-1}_{S^l}(1)\bigr)$,

$$\sum_{i=1}^{n} F^{-1}_{E[X_i \mid \Lambda]}\bigl(F_{S^l}(x)\bigr) = x \iff \sum_{i=1}^{n} E\Bigl[X_i \Bigm| \Lambda = F^{-1}_\Lambda\bigl(F_{S^l}(x)\bigr)\Bigr] = x, \tag{2.13}$$


which unambiguously determines the cdf of the convex order lower bound

Sl for S. In order to derive the above equivalence, we used the results of

Lemma 1.

Invoking Theorem 9, the stop-loss premium πlb(S, d,Λ) of Sl can be

computed as:

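$$\pi^{lb}(S, d, \Lambda) = \sum_{i=1}^{n} \pi\Bigl(E[X_i \mid \Lambda],\, E\bigl[X_i \mid \Lambda = F^{-1}_\Lambda\bigl(F_{S^l}(d)\bigr)\bigr]\Bigr), \tag{2.14}$$

which holds for all retentions $d \in \bigl(F^{-1+}_{S^l}(0),\, F^{-1}_{S^l}(1)\bigr)$.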

So far, we considered the case that all E [Xi|Λ] are non-decreasing functions
of Λ. The case where all E [Xi|Λ] are non-increasing and continuous
functions of Λ also leads to a comonotonic vector (E [X1|Λ] , . . . , E [Xn|Λ]),
and can be treated in a similar way.

In case the cumulative distribution functions of the random variables

E [Xi|Λ] are not continuous nor strictly increasing or decreasing functions

of Λ, then the stop-loss premiums of Sl, which is not comonotonic anymore,
can be determined as follows:

$$\pi^{lb}(S, d, \Lambda) = \int_{-\infty}^{+\infty} \Bigl(\sum_{i=1}^{n} E[X_i \mid \Lambda = \lambda] - d\Bigr)_+\,dF_\Lambda(\lambda).$$

2.2.4 Moments based approximations

The lower and upper bounds can be considered as approximations for the

distribution of a sum S of random variables. On the other hand, any

convex combination of the stop-loss premiums of the lower bound Sl and
the upper bounds Sc or Su also could serve as an approximation for the
stop-loss premium of S. Since the bounds Sl and Sc have the same mean

as S, any random variable Sm defined by its stop-loss premiums

πm(S, d,Λ) = zπlb(S, d,Λ) + (1 − z)πcub(S, d), 0 ≤ z ≤ 1,

will also have the same mean as S. By taking the (right-hand) derivative

we find

FSm(x) = zFSl(x) + (1 − z)FSc(x), 0 ≤ z ≤ 1,

so the distribution function of the approximation can be calculated fairly

easily. By choosing the optimal weight z, we want Sm to be as close as


possible to S. In Vyncke et al. (2004) z is chosen as

$$z = \frac{\mathrm{Var}[S^c] - \mathrm{Var}[S]}{\mathrm{Var}[S^c] - \mathrm{Var}[S^l]}. \tag{2.15}$$

This choice does not depend on the retention and it leads to equal variances

Var[Sm] = Var[S].

As an alternative one could consider the improved upper bound Su and

define a second approximation as follows

πm2(S, d,Λ) = zπlb(S, d,Λ) + (1 − z)πicub(S, d,Λ),

now with

$$z = \frac{\mathrm{Var}[S^u] - \mathrm{Var}[S]}{\mathrm{Var}[S^u] - \mathrm{Var}[S^l]}.$$
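The weight z is a one-line computation once the three variances are known. The sketch below is illustrative: the function names are assumptions and the numerical inputs are hypothetical placeholders, not results from the thesis. By (1.14), for a fixed mean the variance is linear in the stop-loss premium function, so the variance of the mixed approximation is the same convex combination of the variances of the bounds; the sketch checks that this combination reproduces Var[S].

```python
def moment_matching_weight(var_lower, var_exact, var_upper):
    """Weight z on the lower bound, cf. (2.15)."""
    return (var_upper - var_exact) / (var_upper - var_lower)


def mixed_stop_loss(z, premium_lower, premium_upper):
    """Convex combination of the stop-loss premiums of the two bounds."""
    return z * premium_lower + (1.0 - z) * premium_upper


# Hypothetical inputs (placeholders for Var[S^l], Var[S], Var[S^c]).
var_l, var_s, var_c = 12.1, 12.9, 18.7

z = moment_matching_weight(var_l, var_s, var_c)
var_m = z * var_l + (1.0 - z) * var_c  # variance implied by the mixture

# Hypothetical stop-loss premiums of the two bounds at one retention.
premium_m = mixed_stop_loss(z, premium_lower=0.40, premium_upper=0.55)
print(z, var_m, premium_m)
```

Since Var[Sl] ≤ Var[S] ≤ Var[Sc], the weight always lies between 0 and 1, and the same value of z is used for every retention d.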

2.3 Upper bounds for stop-loss premiums

One of the most important tasks of actuaries is to assess the degree of dan-

gerousness of a risk X — either by finding the (approximate) distribution

or at least by summarizing its properties quantitatively by means of risk

measures to determine an insurance premium or a sufficient reserve with

solvency margin.

A stop-loss premium π(X, d) = E[(X−d)+] = E[max(0, X−d)] is one of

the most important risk measures. The retention d is usually interpreted as

an amount retained by an insured (or an insurer) while an amount X − d

is ceded to an insurer (or a reinsurer). In this case π(X, d) has a clear

interpretation as a pure insurance (reinsurance) premium.

Another practical application of stop-loss premiums is the following:

Suppose that a financial institution faces a risk X to which a capital K is

allocated. Then the residual risk R = (X−K)+ is a quantity of concern to

the society and regulators. Indeed, it represents the pessimistic case when

the random loss X exceeds the available capital. The value E[R] is often

referred to as the “expected shortfall” as explained in Subsection 1.1.2,

with K a VaR at some level.

It is not always straightforward to compute stop-loss premiums. In

the actuarial literature a lot of attention has been devoted to determine

bounds for stop-loss premiums in case only partial information about the

claim size distribution is available (e.g. De Vylder & Goovaerts (1982),

Jansen et al. (1986), Hurlimann (1996, 1998), among others).

Other types of problems appear in the case of sums of random vari-

ables S = X1+· · ·+Xn when full information about marginal distributions

is available but the dependency structure is not known. In the previous

section it is explained how the upper bound Sc of the sum S in so called

convex order sense can be calculated by replacing the unknown joint dis-

tribution of the random vector (X1, X2, . . . , Xn) by the most dangerous

comonotonic joint distribution. One can also obtain a lower bound Sl
through conditioning. Such an approach makes it possible to determine analytical
bounds for stop-loss premiums: πlb(S, d,Λ) ≤ π(S, d) ≤ πcub(S, d).

In practical applications the comonotonic upper bound seems to be

useful only in the case of a very strong dependency between successive

summands. Even then the bounds for stop-loss premiums provided by

the comonotonic approximation are often not satisfactory. In this section
we present a number of techniques that yield much more
efficient upper bounds for stop-loss premiums. To this end, we use on
the one hand the method of conditioning as in Curran (1994) and in Rogers &

Shi (1995), and on the other hand the upper and lower bounds for stop-

loss premiums of sums of dependent random variables as explained in the

previous subsection.

2.3.1 Upper bounds based on lower bound plus error term

Following the ideas of Rogers and Shi (1995), we derive an upper bound

based on the lower bound Sl.

Lemma 5.

For any random variable X we have the following inequality

\[
E[X_+] \le E[X]_+ + \tfrac{1}{2}\,\mathrm{Var}^{1/2}(X). \tag{2.16}
\]

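Proof. Define $X_-$ as follows

\[
X_- := \max(-X, 0) = (-X)_+ = -\min(X, 0).
\]

Using Jensen’s inequality twice we have

\[
\begin{aligned}
0 \le E[X_+] - E[X]_+
&= \tfrac{1}{2}\bigl\{ (E[X_+] - E[X]_+) + (E[X_-] - E[X]_-) \bigr\} \\
&= \tfrac{1}{2}\bigl\{ E[X_+ + X_-] - E[X]_+ - E[X]_- \bigr\} \\
&= \tfrac{1}{2}\bigl\{ E[|X|] - |E[X]| \bigr\} \\
&\le \tfrac{1}{2}\, E\bigl[|X - E[X]|\bigr] \\
&\le \tfrac{1}{2}\,\mathrm{Var}^{1/2}(X).
\end{aligned}
\]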
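As a quick sanity check of inequality (2.16), the following minimal sketch evaluates both sides for a discrete random variable. The function name `positive_part_bound` and the two-point example are illustrative assumptions, not part of the original text.

```python
from math import sqrt


def positive_part_bound(values, probs):
    """Return (E[X_+], E[X]_+ + 0.5 * sd(X)) for a discrete random variable X."""
    mean = sum(p * x for x, p in zip(values, probs))
    var = sum(p * (x - mean) ** 2 for x, p in zip(values, probs))
    lhs = sum(p * max(x, 0.0) for x, p in zip(values, probs))   # E[X_+]
    rhs = max(mean, 0.0) + 0.5 * sqrt(var)                      # right-hand side of (2.16)
    return lhs, rhs


# Hypothetical example: X takes the values -1 and 2 with probability 1/2 each.
lhs, rhs = positive_part_bound([-1.0, 2.0], [0.5, 0.5])
print(lhs, rhs)
```

For this two-point example the left-hand side is strictly smaller than the right-hand side, as Lemma 5 requires.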

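Applying now Lemma 5 conditionally, in the form

\[
0 \le E\bigl[ E[Y_+ \mid Z] - E[Y \mid Z]_+ \bigr] \le \tfrac{1}{2}\, E\bigl[\sqrt{\mathrm{Var}[Y \mid Z]}\bigr], \tag{2.17}
\]

which holds for any random variables Y and Z, to the case of Y being S − d and Z being our conditioning variable Λ, we
obtain an error bound

\[
0 \le E\bigl[ E[(S-d)_+ \mid \Lambda] - (S^l - d)_+ \bigr] \le \tfrac{1}{2}\, E\bigl[\sqrt{\mathrm{Var}[S \mid \Lambda]}\bigr], \tag{2.18}
\]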

which is only useful if the retention d is strictly positive.

Consequently, we find as upper bound for the stop-loss premium of S

π(S, d) ≤ πeub(S, d,Λ), (2.19)

with πeub(S, d,Λ) given by

\[
\pi^{eub}(S,d,\Lambda) = \pi^{lb}(S,d,\Lambda) + \tfrac{1}{2}\, E\bigl[\sqrt{\mathrm{Var}[S \mid \Lambda]}\bigr]. \tag{2.20}
\]

The second term on the right hand side takes the form

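\[
\begin{aligned}
E\bigl[\sqrt{\mathrm{Var}[S \mid \Lambda]}\bigr]
&= E\Bigl[ \bigl( E[S^2 \mid \Lambda] - (E[S \mid \Lambda])^2 \bigr)^{1/2} \Bigr] \\
&= E\Bigl[ \Bigl( \sum_{i=1}^{n}\sum_{j=1}^{n} E[X_i X_j \mid \Lambda] - (S^l)^2 \Bigr)^{1/2} \Bigr],
\end{aligned}
\tag{2.21}
\]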

and once the distributions of Xi and Λ are specified and known, it can be

written out more explicitly.


2.3.2 Bounds by conditioning through decomposition of the stop-loss premium

Decomposition of the stop-loss premium

In this part we show how to improve the bounds introduced in Section

2.2 and Subsection 2.3.1. By conditioning S on some random variable Λ,

the stop-loss premium can be decomposed in two parts, one of which can

either be computed exactly or by using numerical integration, depending

on the distribution of the underlying random variable. For the remaining

part we first derive a lower and an upper bound based on comonotonic

risks, and another upper bound equal to that lower bound plus an error

term. This idea of decomposition goes back at least to Curran (1994).

By the tower property for conditional expectations the stop-loss premium
π(S, d) with $S = \sum_{i=1}^{n} X_i$ equals

\[
E\bigl[ E[(S-d)_+ \mid \Lambda] \bigr],
\]

for every conditioning variable Λ, say with cdf FΛ.

If in addition there exists a dΛ such that Λ ≥ dΛ implies that S ≥ d,

we can decompose the stop-loss premium of S as follows

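\[
\pi(S,d) = \int_{-\infty}^{d_\Lambda} E[(S-d)_+ \mid \Lambda=\lambda]\, dF_\Lambda(\lambda) + \int_{d_\Lambda}^{+\infty} E[S-d \mid \Lambda=\lambda]\, dF_\Lambda(\lambda) =: I_1 + I_2. \tag{2.22}
\]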

Notice that the other case (Λ ≤ dΛ implies that S ≥ d) can be treated

in a similar way with the appropriate integration bounds. In practical

applications the existence of such a dΛ depends on the actual form of S

and Λ = λ.

The second integral can further be simplified to

\[
I_2 = \int_{d_\Lambda}^{+\infty} \sum_{i=1}^{n} E[X_i \mid \Lambda=\lambda]\, dF_\Lambda(\lambda) - d\,\bigl(1 - F_\Lambda(d_\Lambda)\bigr), \tag{2.23}
\]

and can be written out explicitly if the bivariate distribution of (Xi,Λ) is

known for all i.

Deriving bounds for the first part I1 in decomposition (2.22) and adding

up to the exact part (2.23) gives us the bounds for the stop-loss premium.


Lower bound

By means of Jensen’s inequality, the first integral I1 of (2.22) can be

bounded below:

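\[
I_1 \ge \int_{-\infty}^{d_\Lambda} \bigl( E[S \mid \Lambda=\lambda] - d \bigr)_+ dF_\Lambda(\lambda) = \int_{-\infty}^{d_\Lambda} \Bigl( \sum_{i=1}^{n} E[X_i \mid \Lambda=\lambda] - d \Bigr)_+ dF_\Lambda(\lambda). \tag{2.24}
\]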

By adding the exact part (2.23) and introducing notation (2.10), we end

up with the inequality of Section 2.2.3:

π(S, d) ≥ πlb(S, d,Λ).

When Sl is a sum of n comonotonic risks we can apply (2.14) which holds

even when we do not know or find a dΛ.

When Sl is not comonotonic we use the decomposition

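\[
\begin{aligned}
\pi^{lb}(S,d,\Lambda) ={}& \int_{-\infty}^{d_\Lambda} \Bigl( \sum_{i=1}^{n} E[X_i \mid \Lambda=\lambda] - d \Bigr)_+ dF_\Lambda(\lambda) \\
&+ \int_{d_\Lambda}^{+\infty} \sum_{i=1}^{n} E[X_i \mid \Lambda=\lambda]\, dF_\Lambda(\lambda) - d\,\bigl(1 - F_\Lambda(d_\Lambda)\bigr).
\end{aligned}
\]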

Upper bound based on lower bound

In this part we improve the bound (2.19) by applying (2.17) to (2.24):

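\begin{align}
0 &\le E\bigl[ E[(S-d)_+ \mid \Lambda] - (S^l - d)_+ \bigr] \notag \\
&= \int_{-\infty}^{d_\Lambda} \Bigl( E[(S-d)_+ \mid \Lambda=\lambda] - \bigl( E[S \mid \Lambda=\lambda] - d \bigr)_+ \Bigr)\, dF_\Lambda(\lambda) \notag \\
&\le \tfrac{1}{2} \int_{-\infty}^{d_\Lambda} \bigl( \mathrm{Var}[S \mid \Lambda=\lambda] \bigr)^{1/2} dF_\Lambda(\lambda) \tag{2.25} \\
&\le \tfrac{1}{2} \bigl( E[\mathrm{Var}[S \mid \Lambda]\, I_{(\Lambda<d_\Lambda)}] \bigr)^{1/2} \bigl( E[I_{(\Lambda<d_\Lambda)}] \bigr)^{1/2} =: \varepsilon(d_\Lambda), \tag{2.26}
\end{align}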

where Hölder’s inequality has been applied in the last step. We will

denote this upper bound by πdeub(S, d,Λ). So we have that

πdeub(S, d,Λ) = πlb(S, d,Λ) + ε(dΛ). (2.27)


We remark that the error bound (2.18), and hence also the upper bound
πeub(S, d,Λ), is independent of dΛ and corresponds to the limiting case
of (2.25) where dΛ equals infinity. Obviously, the error bound (2.25) improves
the error bound (2.18). In practical applications, the additional
error introduced by Hölder’s inequality turns out to be much smaller than
the difference $\tfrac{1}{2} E\bigl[\sqrt{\mathrm{Var}[S \mid \Lambda]}\bigr] - \varepsilon(d_\Lambda)$.

2.3.3 Partially exact/comonotonic upper bound

We bound the first term I1 of (2.22) above by replacing S|Λ = λ by its

comonotonic upper bound Su (in convex order sense):

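\[
\int_{-\infty}^{d_\Lambda} E[(S-d)_+ \mid \Lambda=\lambda]\, dF_\Lambda(\lambda) \le \int_{-\infty}^{d_\Lambda} E[(S^u-d)_+ \mid \Lambda=\lambda]\, dF_\Lambda(\lambda). \tag{2.28}
\]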

Adding (2.28) to the exact part (2.23) of the decomposition (2.22) results

in the so-called partially exact/comonotonic upper bound for a stop-loss

premium. We will use the notation πpecub(S, d,Λ) to indicate this upper

bound.

It is easily seen that

πpecub(S, d,Λ) ≤ πicub(S, d,Λ),

while for two distinct conditioning variables Λ1 and Λ2 it does not necessarily
hold that

πpecub(S, d,Λ1) ≤ πicub(S, d,Λ2).

2.3.4 The case of a sum of lognormal random variables

We show how to apply our results to the case of sums of lognormal dis-

tributed random variables. Such sums are widely encountered in practice,

both in actuarial science and in finance. Typical examples are present val-

ues of future cash flows with stochastic (Gaussian) returns (see Dhaene et

al. (2002b)), Asian options (see e.g. Simon et al. (2000), Vanmaele et al.

(2004b) and Albrecher et al. (2005)) and basket options (see Deelstra et

al. (2004) and Vanmaele et al. (2004a)).


We assume that $X_i = \alpha_i e^{Z_i}$ with $Z_i \sim N(E[Z_i], \sigma_{Z_i}^2)$ and $\alpha_i \in \mathbb{R}$. We
develop the expressions for the lower and upper bounds for the following
sum S:

\[
S = \sum_{i=1}^{n} X_i = \sum_{i=1}^{n} \alpha_i e^{Z_i}. \tag{2.29}
\]

In this case the stop-loss premium π(Xi, di) with some retention di is well-

known from the following lemma.

Lemma 6 (Stop-loss premium of lognormal random variable).

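Let X be a lognormal random variable of the form $\alpha e^{Z}$ with $Z \sim N(E[Z], \sigma_Z^2)$
and $\alpha \in \mathbb{R}$. Then the stop-loss premium with retention d equals for αd > 0

\[
\pi(X,d) = \mathrm{sign}(\alpha)\, e^{\mu + \frac{\sigma^2}{2}}\, \Phi\bigl(\mathrm{sign}(\alpha)\, b_1\bigr) - d\, \Phi\bigl(\mathrm{sign}(\alpha)\, b_2\bigr), \tag{2.30}
\]

where

\[
\mu = \ln|\alpha| + E[Z], \qquad \sigma = \sigma_Z, \qquad b_1 = \frac{\mu + \sigma^2 - \ln|d|}{\sigma}, \qquad b_2 = b_1 - \sigma. \tag{2.31}
\]

The case αd < 0 is trivial.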
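Formula (2.30)-(2.31) translates almost line by line into code. The sketch below is one possible transcription using only the Python standard library; the names `std_normal_cdf` and `lognormal_stop_loss` are illustrative choices, and the example parameter values are hypothetical.

```python
from math import copysign, erf, exp, log, sqrt


def std_normal_cdf(x):
    """Standard normal cdf Phi(x)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def lognormal_stop_loss(alpha, mean_z, sigma_z, d):
    """E[(alpha * exp(Z) - d)_+] for Z ~ N(mean_z, sigma_z**2), following (2.30)-(2.31)."""
    if alpha * d <= 0:
        raise ValueError("formula (2.30) covers only the case alpha * d > 0")
    s = copysign(1.0, alpha)                      # sign(alpha)
    mu = log(abs(alpha)) + mean_z                 # mu = ln|alpha| + E[Z]
    b1 = (mu + sigma_z ** 2 - log(abs(d))) / sigma_z
    b2 = b1 - sigma_z
    return (s * exp(mu + 0.5 * sigma_z ** 2) * std_normal_cdf(s * b1)
            - d * std_normal_cdf(s * b2))


# Hypothetical example: X = 2 * exp(Z), Z ~ N(0.1, 0.3^2), retention d = 1.5.
print(lognormal_stop_loss(2.0, 0.1, 0.3, 1.5))
```

For α > 0 the expression reduces to the familiar call-type formula, and for α < 0 (with d < 0) to the corresponding put-type formula.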

We now consider a normally distributed random variable Λ. The following

results are analogous to Theorem 1 in Dhaene et al. (2002b).

Theorem 10 (Bounds for a sum of lognormal random variables).

Let S be given by (2.29) and consider a normally distributed random vari-

able Λ which is such that (Zi,Λ) is bivariate normally distributed for all

i. Then the distributions of the lower bound S l, the improved comonotonic

upper bound Su and the comonotonic upper bound Sc are given by

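\begin{align}
S^l &= \sum_{i=1}^{n} \alpha_i \exp\Bigl( E[Z_i] + r_i \sigma_{Z_i} \Phi^{-1}(V) + \tfrac{1}{2}(1-r_i^2)\sigma_{Z_i}^2 \Bigr), \tag{2.32} \\
S^u &= \sum_{i=1}^{n} \alpha_i \exp\Bigl( E[Z_i] + r_i \sigma_{Z_i} \Phi^{-1}(V) + \mathrm{sign}(\alpha_i)\sqrt{1-r_i^2}\,\sigma_{Z_i} \Phi^{-1}(U) \Bigr), \tag{2.33} \\
S^c &= \sum_{i=1}^{n} \alpha_i \exp\Bigl( E[Z_i] + \mathrm{sign}(\alpha_i)\,\sigma_{Z_i} \Phi^{-1}(U) \Bigr), \tag{2.34}
\end{align}

where U and $V = \Phi\bigl( (\Lambda - E[\Lambda])/\sigma_\Lambda \bigr)$ are mutually independent U(0,1) random
variables, and ri, i = 1, . . . , n, are correlations defined by

\[
r_i = \mathrm{Corr}(Z_i, \Lambda) = \frac{\mathrm{Cov}[Z_i, \Lambda]}{\sigma_{Z_i}\sigma_\Lambda}.
\]

If, for all i, sign(αi) = sign(ri), or, for all i, sign(αi) = −sign(ri) with
ri ≠ 0, then Sl is comonotonic.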

Proof. See Dhaene et al. (2002b)

Comonotonic upper bound

The quantile function of Sc results from (1.20) in Theorem 7 and is given

by

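\[
F^{-1}_{S^c}(p) = \sum_{i=1}^{n} \alpha_i \exp\Bigl( E[Z_i] + \mathrm{sign}(\alpha_i)\,\sigma_{Z_i} \Phi^{-1}(p) \Bigr), \qquad p \in (0,1). \tag{2.35}
\]

Since the cdf’s FXi are strictly increasing and continuous, it follows from
(2.6) and (2.34) that for $x \in \bigl( F^{-1+}_{S^c}(0), F^{-1}_{S^c}(1) \bigr)$, the cdf of the comonotonic
sum FSc(x) can be found by solving

\[
\sum_{i=1}^{n} \alpha_i \exp\Bigl( E[Z_i] + \mathrm{sign}(\alpha_i)\,\sigma_{Z_i} \Phi^{-1}\bigl(F_{S^c}(x)\bigr) \Bigr) = x.
\]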

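Combination of Theorem 9 and Lemma 6 yields the following expression
for the stop-loss premium of Sc at retention d with $F^{-1+}_{S^c}(0) < d < F^{-1}_{S^c}(1)$:

\[
\pi^{cub}(S,d) = \sum_{i=1}^{n} \alpha_i e^{E[Z_i] + \frac{\sigma_{Z_i}^2}{2}}\, \Phi\Bigl[ \mathrm{sign}(\alpha_i)\,\sigma_{Z_i} - \Phi^{-1}\bigl(F_{S^c}(d)\bigr) \Bigr] - d\,\bigl(1 - F_{S^c}(d)\bigr).
\]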
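The two steps above (solve the implicit equation for FSc(d), then evaluate the closed-form premium) can be sketched as follows. This is a minimal illustration under the simplifying assumption that all αi are strictly positive, so that sign(αi) = 1; the function name, the bisection bracket [−40, 40] for Φ−1(FSc(d)) and the example inputs are illustrative choices.

```python
from math import exp
from statistics import NormalDist

PHI = NormalDist()  # standard normal distribution


def comonotonic_upper_bound(alphas, means, sigmas, d):
    """Return (pi_cub(S, d), F_Sc(d)) for S = sum_i alpha_i * exp(Z_i), all alpha_i > 0."""
    if min(alphas) <= 0:
        raise ValueError("this sketch assumes all alpha_i > 0")

    def quantile_at(z):
        # F^{-1}_{S^c}(Phi(z)) from (2.35) with sign(alpha_i) = +1
        return sum(a * exp(m + s * z) for a, m, s in zip(alphas, means, sigmas))

    lo, hi = -40.0, 40.0                    # bracket for z = Phi^{-1}(F_Sc(d))
    if not quantile_at(lo) < d < quantile_at(hi):
        raise ValueError("retention d lies outside the bracketed range")
    for _ in range(200):                    # bisection: quantile_at is increasing in z
        mid = 0.5 * (lo + hi)
        if quantile_at(mid) < d:
            lo = mid
        else:
            hi = mid
    z = 0.5 * (lo + hi)
    cdf_at_d = PHI.cdf(z)
    premium = sum(a * exp(m + 0.5 * s * s) * PHI.cdf(s - z)
                  for a, m, s in zip(alphas, means, sigmas)) - d * (1.0 - cdf_at_d)
    return premium, cdf_at_d


# Hypothetical two-term example.
print(comonotonic_upper_bound([1.0, 2.0], [0.0, 0.1], [0.2, 0.3], 3.5))
```

Working in the variable z = Φ−1(FSc(d)) rather than in the probability itself avoids evaluating the inverse normal cdf near 0 or 1.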

Improved comonotonic upper bound

We now determine the cdf of Su and the stop-loss premium πicub(S, d,Λ),

where we condition on a normally distributed random variable Λ or equiv-

alently on the U(0, 1) random variable introduced in Theorem 10:

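\[
V = \Phi\Bigl( \frac{\Lambda - E[\Lambda]}{\sigma_\Lambda} \Bigr).
\]

The conditional probability $F_{S^u|V=v}(x)$, also denoted by $F_{S^u}(x \mid V=v)$,
is the cdf of a sum of n comonotonic random variables and follows for
$F^{-1+}_{S^u|V=v}(0) < x < F^{-1}_{S^u|V=v}(1)$, according to (2.9) and (2.33), implicitly
from:

\[
\sum_{i=1}^{n} \alpha_i \exp\Bigl( E[Z_i] + r_i\sigma_{Z_i}\Phi^{-1}(v) + \mathrm{sign}(\alpha_i)\sqrt{1-r_i^2}\,\sigma_{Z_i}\, \Phi^{-1}\bigl(F_{S^u}(x \mid V=v)\bigr) \Bigr) = x. \tag{2.36}
\]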

The cdf of Su is then given by

\[
F_{S^u}(x) = \int_0^1 F_{S^u|V=v}(x)\, dv.
\]

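We now look for an expression for the stop-loss premium at retention d
with $F^{-1+}_{S^u|V=v}(0) < d < F^{-1}_{S^u|V=v}(1)$ for Su:

\[
\pi^{icub}(S,d,\Lambda) = \int_0^1 E[(S^u - d)_+ \mid V=v]\, dv = \sum_{i=1}^{n} \int_0^1 E\Bigl[ \bigl( F^{-1}_{X_i|\Lambda}(U \mid V=v) - d_i \bigr)_+ \Bigr]\, dv
\]

with $d_i = F^{-1}_{X_i|\Lambda}\bigl( F_{S^u}(d \mid V=v) \mid V=v \bigr)$ and with U a random variable
which is uniformly distributed on (0, 1). Since $\mathrm{sign}(\alpha_i) F^{-1}_{X_i|\Lambda}(U \mid V=v)$
follows a lognormal distribution whose underlying normal distribution has mean and standard deviation

\[
\mu_v(i) = \ln|\alpha_i| + E[Z_i] + r_i\sigma_{Z_i}\Phi^{-1}(v), \qquad \sigma_v(i) = \sqrt{1-r_i^2}\,\sigma_{Z_i},
\]

one obtains that

\[
d_i = \alpha_i \exp\Bigl[ E[Z_i] + r_i\sigma_{Z_i}\Phi^{-1}(v) + \mathrm{sign}(\alpha_i)\sqrt{1-r_i^2}\,\sigma_{Z_i}\, \Phi^{-1}\bigl(F_{S^u|V=v}(d)\bigr) \Bigr].
\]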

Formula (2.30) then yields

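\[
E[(S^u - d)_+ \mid V=v] = \sum_{i=1}^{n} \Bigl[ \mathrm{sign}(\alpha_i)\, e^{\mu_v(i) + \frac{\sigma_v^2(i)}{2}}\, \Phi\bigl(\mathrm{sign}(\alpha_i)\, b_{i,1}\bigr) - d_i\, \Phi\bigl(\mathrm{sign}(\alpha_i)\, b_{i,2}\bigr) \Bigr],
\]

with, according to (2.31),

\[
b_{i,1} = \frac{\mu_v(i) + \sigma_v^2(i) - \ln|d_i|}{\sigma_v(i)}, \qquad b_{i,2} = b_{i,1} - \sigma_v(i).
\]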


Substitution of the corresponding expressions and integration over the in-

terval [0, 1] leads to the following result

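\[
\begin{aligned}
\pi^{icub}(S,d,\Lambda) ={}& \sum_{i=1}^{n} \alpha_i e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2(1-r_i^2)} \int_0^1 e^{r_i\sigma_{Z_i}\Phi^{-1}(v)}\, \Phi\Bigl( \mathrm{sign}(\alpha_i)\sqrt{1-r_i^2}\,\sigma_{Z_i} - \Phi^{-1}\bigl(F_{S^u|V=v}(d)\bigr) \Bigr)\, dv \\
&- d\,\bigl(1 - F_{S^u}(d)\bigr).
\end{aligned}
\tag{2.37}
\]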

Lower bound

In this subsection, we study the case that, for all i, sign(αi) = sign(ri)
when ri ≠ 0. For simplicity we take all αi ≥ 0 and assume that the
conditioning variable Λ is normally distributed and has the right sign such
that the correlation coefficients ri are all positive. These conditions ensure
that Sl is the sum of n comonotonic random variables. The case that, for
all i, sign(αi) = −sign(ri) when ri ≠ 0 can be dealt with in an analogous
way.

The quantile function of Sl results from (1.20) in Theorem 7 and is given

by

\[
F^{-1}_{S^l}(p) = \sum_{i=1}^{n} \alpha_i \exp\Bigl( E[Z_i] + r_i\sigma_{Z_i}\Phi^{-1}(p) + \tfrac{1}{2}(1-r_i^2)\sigma_{Z_i}^2 \Bigr), \qquad p \in (0,1). \tag{2.38}
\]

Since by our assumptions E[Xi|Λ] is increasing, we can obtain FSl(x) ac-

cording to (2.13) and (2.32) from

\[
\sum_{i=1}^{n} \alpha_i \exp\Bigl( E[Z_i] + r_i\sigma_{Z_i}\Phi^{-1}\bigl(F_{S^l}(x)\bigr) + \tfrac{1}{2}(1-r_i^2)\sigma_{Z_i}^2 \Bigr) = x. \tag{2.39}
\]

Moreover as Sl is the sum of n lognormally distributed random variables,

the stop-loss premium at retention d (> 0) can be expressed explicitly by

invoking Theorem 9 and Lemma 6:

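\[
\pi^{lb}(S,d,\Lambda) = \sum_{i=1}^{n} \alpha_i e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2}\, \Phi\Bigl[ r_i\sigma_{Z_i} - \Phi^{-1}\bigl(F_{S^l}(d)\bigr) \Bigr] - d\,\bigl(1 - F_{S^l}(d)\bigr). \tag{2.40}
\]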
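A possible numerical transcription of (2.39) and (2.40) is sketched below. It assumes, as in this subsection, that all αi are positive, and additionally that all correlations ri are strictly positive so that the left-hand side of (2.39) is strictly increasing. The function name, the bisection bracket and the example values are illustrative assumptions.

```python
from math import exp
from statistics import NormalDist

PHI = NormalDist()  # standard normal distribution


def comonotonic_lower_bound(alphas, means, sigmas, corrs, d):
    """Return (pi_lb(S, d, Lambda), F_Sl(d)) via (2.39)-(2.40); needs alpha_i > 0, r_i > 0."""
    if min(alphas) <= 0 or min(corrs) <= 0:
        raise ValueError("this sketch assumes alpha_i > 0 and r_i > 0 for all i")
    terms = list(zip(alphas, means, sigmas, corrs))

    def quantile_at(z):
        # left-hand side of (2.39) with Phi^{-1}(F_Sl(x)) replaced by z
        return sum(a * exp(m + r * s * z + 0.5 * (1.0 - r * r) * s * s)
                   for a, m, s, r in terms)

    lo, hi = -40.0, 40.0                    # bracket for z = Phi^{-1}(F_Sl(d))
    if not quantile_at(lo) < d < quantile_at(hi):
        raise ValueError("retention d lies outside the bracketed range")
    for _ in range(200):                    # bisection on an increasing function of z
        mid = 0.5 * (lo + hi)
        if quantile_at(mid) < d:
            lo = mid
        else:
            hi = mid
    z = 0.5 * (lo + hi)
    cdf_at_d = PHI.cdf(z)
    premium = sum(a * exp(m + 0.5 * s * s) * PHI.cdf(r * s - z)
                  for a, m, s, r in terms) - d * (1.0 - cdf_at_d)
    return premium, cdf_at_d


# Hypothetical two-term example with correlations 0.9 and 0.95.
print(comonotonic_lower_bound([1.0, 2.0], [0.0, 0.1], [0.2, 0.3], [0.9, 0.95], 3.5))
```

With a single term and r1 = 1 the routine reproduces the exact lognormal stop-loss premium of Lemma 6, which is a convenient consistency check.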



From (2.21) we obtain that

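\[
E\bigl[\sqrt{\mathrm{Var}[S \mid \Lambda]}\bigr] = \int_{-\infty}^{+\infty} \Bigl\{ \sum_{i=1}^{n}\sum_{j=1}^{n} E[X_i X_j \mid \Lambda=\lambda] - \bigl( E[S \mid \Lambda=\lambda] \bigr)^2 \Bigr\}^{1/2} dF_\Lambda(\lambda). \tag{2.41}
\]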

Now consider the first term in the right hand side of (2.41). Because of

the properties of lognormally distributed random variables, the product of

lognormals is again lognormal if the underlying vector is multivariate nor-

mal distributed, and conditioning a lognormal variate on a normal variate

yields a lognormally distributed variable.

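We can proceed by denoting $Z_{ij} = Z_i + Z_j$ with $E[Z_{ij}] = E[Z_i] + E[Z_j]$
and

\[
\sigma_{Z_{ij}}^2 = \sigma_{Z_i}^2 + \sigma_{Z_j}^2 + 2\sigma_{Z_iZ_j},
\]

where $\sigma_{Z_iZ_j} := \mathrm{Cov}[Z_i, Z_j]$. Note that

\[
r_{ij} = \frac{\mathrm{Cov}[Z_{ij}, \Lambda]}{\sigma_{Z_{ij}}\sigma_\Lambda} = \frac{\mathrm{Cov}[Z_i, \Lambda]}{\sigma_{Z_{ij}}\sigma_\Lambda} + \frac{\mathrm{Cov}[Z_j, \Lambda]}{\sigma_{Z_{ij}}\sigma_\Lambda} = \frac{\sigma_{Z_i}}{\sigma_{Z_{ij}}}\, r_i + \frac{\sigma_{Z_j}}{\sigma_{Z_{ij}}}\, r_j.
\]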

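Conditionally, given Λ = λ, the random variable $Z_{ij}$ is normally distributed
with parameters $\mu(i,j) = E[Z_{ij}] + r_{ij}\frac{\sigma_{Z_{ij}}}{\sigma_\Lambda}(\lambda - E[\Lambda])$ and
$\sigma^2(i,j) = (1 - r_{ij}^2)\sigma_{Z_{ij}}^2$. Hence, conditionally, given Λ = λ, the random variable
$e^{Z_{ij}}$ is lognormally distributed with parameters $\mu(i,j)$ and $\sigma^2(i,j)$. As
$E[e^{Z_{ij}} \mid \Lambda=\lambda] = e^{\mu(i,j) + \frac{1}{2}\sigma^2(i,j)}$, we find

\[
E[e^{Z_{ij}} \mid \Lambda] = \exp\Bigl( E[Z_{ij}] + r_{ij}\sigma_{Z_{ij}}\Phi^{-1}(V) + \tfrac{1}{2}(1 - r_{ij}^2)\sigma_{Z_{ij}}^2 \Bigr),
\]

where the random variable $V = \Phi\bigl( (\Lambda - E[\Lambda])/\sigma_\Lambda \bigr)$ is uniformly distributed on
the interval (0, 1).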


Thus, the first term in (2.41) equals

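\[
\sum_{i=1}^{n}\sum_{j=1}^{n} E[X_i X_j \mid \Lambda] = \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i\alpha_j \exp\Bigl( E[Z_{ij}] + r_{ij}\sigma_{Z_{ij}}\Phi^{-1}(V) + \tfrac{1}{2}(1 - r_{ij}^2)\sigma_{Z_{ij}}^2 \Bigr), \tag{2.42}
\]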

while the second term consists of (2.32). Hence (2.41) can be written out

explicitly and by using (2.20), we have that the upper bound (2.19) is given

by

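\[
\begin{aligned}
\pi^{eub}(S,d,\Lambda) ={}& \sum_{i=1}^{n} \alpha_i e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2}\, \Phi\Bigl[ r_i\sigma_{Z_i} - \Phi^{-1}\bigl(F_{S^l}(d)\bigr) \Bigr] - d\,\bigl(1 - F_{S^l}(d)\bigr) \\
&+ \frac{1}{2} \int_0^1 \Bigl\{ \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i\alpha_j e^{E[Z_{ij}] + r_{ij}\sigma_{Z_{ij}}\Phi^{-1}(v) + \frac{1}{2}(1-r_{ij}^2)\sigma_{Z_{ij}}^2} \\
&\qquad\qquad - \Bigl( \sum_{i=1}^{n} \alpha_i e^{E[Z_i] + r_i\sigma_{Z_i}\Phi^{-1}(v) + \frac{1}{2}(1-r_i^2)\sigma_{Z_i}^2} \Bigr)^2 \Bigr\}^{1/2} dv.
\end{aligned}
\]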

Bounds by conditioning through decomposition of stop-loss premium

In this part we apply the theory of Subsection 2.3.2 to the sum of lognor-

mal random variables (2.29). We give here the analytical expressions for

the two upper bounds πdeub(S, d,Λ) and πpecub(S, d,Λ). For more details

concerning the calculation of the bounds the reader is referred to the last

section of this chapter.

The following auxiliary result is needed in order to write out the bounds

explicitly.

Lemma 7.

For any constant a ∈ R and any normally distributed random variable Λ

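\[
\int_{-\infty}^{d_\Lambda} e^{a\Phi^{-1}(v)}\, dF_\Lambda(\lambda) = e^{\frac{a^2}{2}}\, \Phi(d^*_\Lambda - a), \tag{2.43}
\]

where $d^*_\Lambda = \dfrac{d_\Lambda - E[\Lambda]}{\sigma_\Lambda}$ and $\Phi^{-1}(v) = \dfrac{\lambda - E[\Lambda]}{\sigma_\Lambda}$.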
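Identity (2.43) is easy to verify numerically. After the substitution z = (λ − E[Λ])/σΛ the left-hand side is the integral of e^{az}φ(z) over (−∞, d∗Λ], with φ the standard normal density. The sketch below compares a midpoint-rule approximation with the closed form; the function names, the truncation point −12 and the grid size are illustrative assumptions.

```python
from math import exp, pi, sqrt
from statistics import NormalDist


def lemma7_closed_form(a, d_star):
    """Right-hand side of (2.43): exp(a^2 / 2) * Phi(d_star - a)."""
    return exp(0.5 * a * a) * NormalDist().cdf(d_star - a)


def lemma7_numeric(a, d_star, n=200_000, lower=-12.0):
    """Midpoint rule for the integral of exp(a z) * phi(z) over [lower, d_star]."""
    h = (d_star - lower) / n
    total = 0.0
    for k in range(n):
        z = lower + (k + 0.5) * h
        total += exp(a * z - 0.5 * z * z)    # exp(a z) times the unnormalised density
    return total * h / sqrt(2.0 * pi)


# Hypothetical values a = 0.7, d* = 0.3: the two numbers should agree closely.
print(lemma7_closed_form(0.7, 0.3), lemma7_numeric(0.7, 0.3))
```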


Lower bound

Note that the lower bound via the decomposition equals the lower bound

without the decomposition. So the lower bound in the lognormal and

comonotonic case is given by expression (2.40).


The upper bound (2.27) can be written out explicitly as follows

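\[
\begin{aligned}
\pi^{deub}(S,d,\Lambda) ={}& \sum_{i=1}^{n} \alpha_i e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2}\, \Phi\Bigl[ r_i\sigma_{Z_i} - \Phi^{-1}\bigl(F_{S^l}(d)\bigr) \Bigr] - d\,\bigl(1 - F_{S^l}(d)\bigr) \\
&+ \frac{1}{2}\, \Phi(d^*_\Lambda)^{1/2} \Bigl\{ \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i\alpha_j e^{E[Z_{ij}] + \frac{1}{2}(\sigma_{Z_i}^2 + \sigma_{Z_j}^2)} \\
&\qquad \times \Phi\bigl( d^*_\Lambda - (r_i\sigma_{Z_i} + r_j\sigma_{Z_j}) \bigr) \bigl( e^{\sigma_{Z_iZ_j}} - e^{\sigma_{Z_i}\sigma_{Z_j} r_i r_j} \bigr) \Bigr\}^{1/2}.
\end{aligned}
\tag{2.44}
\]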

Proof. See Section 2.7.

Partially exact/comonotonic upper bound

The partially exact/comonotonic upper bound of Subsection 2.3.3 is given

by

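\[
\begin{aligned}
\pi^{pecub}(S,d,\Lambda) ={}& \sum_{i=1}^{n} \alpha_i e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2(1-r_i^2)} \Bigl\{ e^{\frac{r_i^2\sigma_{Z_i}^2}{2}}\, \Phi(r_i\sigma_{Z_i} - d^*_\Lambda) \\
&\qquad + \int_0^{\Phi(d^*_\Lambda)} e^{r_i\sigma_{Z_i}\Phi^{-1}(v)}\, \Phi\Bigl( \mathrm{sign}(\alpha_i)\sqrt{1-r_i^2}\,\sigma_{Z_i} - \Phi^{-1}\bigl(F_{S^u|V=v}(d)\bigr) \Bigr)\, dv \Bigr\} \\
&- d \Bigl( 1 - \int_0^{\Phi(d^*_\Lambda)} F_{S^u|V=v}(d)\, dv \Bigr).
\end{aligned}
\tag{2.45}
\]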

Proof. See Section 2.7.


Choice of the conditioning variable

If X ≤cx Y, and X and Y are not equal in distribution, then Var[X] <
Var[Y] must hold. An equality in variance would imply that $X \stackrel{d}{=} Y$. This
shows that if we want to replace S by the convex smaller Sl, the best
approximations will occur when the variance of Sl is ‘as close as possible’
to the variance of S. Hence we should choose Λ such that the goodness-of-fit
expressed by the ratio z = Var[Sl]/Var[S] is as close as possible to 1. Of course
one can always use numerical procedures to optimize z but this would
undermine one of the main features of the convex bounds, namely that the
different relevant actuarial quantities (quantiles, stop-loss premiums) can
be easily obtained. Having a ready-to-use approximation that can be easily
implemented and used by all kinds of end-users is important from a business
point of view.

Notice that the expected values of the random variables S, Sc and Sl

are all equal:

\[
E[S] = E[S^l] = E[S^c] = \sum_{i=1}^{n} \alpha_i e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2}, \tag{2.46}
\]

while their variances are given by

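\[
\mathrm{Var}[S] = \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i\alpha_j e^{E[Z_i] + E[Z_j] + \frac{1}{2}(\sigma_{Z_i}^2 + \sigma_{Z_j}^2)} \bigl( e^{\sigma_{Z_iZ_j}} - 1 \bigr), \tag{2.47}
\]

\[
\mathrm{Var}[S^l] = \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i\alpha_j e^{E[Z_i] + E[Z_j] + \frac{1}{2}(\sigma_{Z_i}^2 + \sigma_{Z_j}^2)} \bigl( e^{r_i r_j \sigma_{Z_i}\sigma_{Z_j}} - 1 \bigr) \tag{2.48}
\]

and

\[
\mathrm{Var}[S^c] = \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i\alpha_j e^{E[Z_i] + E[Z_j] + \frac{1}{2}(\sigma_{Z_i}^2 + \sigma_{Z_j}^2)} \bigl( e^{\sigma_{Z_i}\sigma_{Z_j}} - 1 \bigr), \tag{2.49}
\]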

respectively.
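The variance formulas (2.47)-(2.49), together with the weight z of (2.15), can be evaluated directly once the covariance matrix of the Zi and a linear conditioning variable Λ = Σ γi Zi are given. The sketch below is one way to organise that computation; the function name, the argument layout (covariance matrix as a list of lists) and the example are illustrative assumptions.

```python
from math import exp, sqrt


def convex_bound_variances(alphas, means, cov, gammas):
    """Return (Var[S], Var[S^l], Var[S^c], z) for Lambda = sum_k gammas[k] * Z_k.

    cov[i][j] is Cov[Z_i, Z_j]; z is the weight (2.15) and is undefined
    (division by zero) when Var[S^c] equals Var[S^l].
    """
    n = len(alphas)
    sig = [sqrt(cov[i][i]) for i in range(n)]
    var_lambda = sum(gammas[k] * gammas[l] * cov[k][l]
                     for k in range(n) for l in range(n))
    # r_i = Cov[Z_i, Lambda] / (sigma_{Z_i} * sigma_Lambda)
    r = [sum(gammas[k] * cov[i][k] for k in range(n)) / (sig[i] * sqrt(var_lambda))
         for i in range(n)]
    var_s = var_l = var_c = 0.0
    for i in range(n):
        for j in range(n):
            w = alphas[i] * alphas[j] * exp(means[i] + means[j]
                                            + 0.5 * (cov[i][i] + cov[j][j]))
            var_s += w * (exp(cov[i][j]) - 1.0)                      # (2.47)
            var_l += w * (exp(r[i] * r[j] * sig[i] * sig[j]) - 1.0)  # (2.48)
            var_c += w * (exp(sig[i] * sig[j]) - 1.0)                # (2.49)
    z = (var_c - var_s) / (var_c - var_l)                            # (2.15)
    return var_s, var_l, var_c, z


# Hypothetical example: Z_i = partial sums of i.i.d. N(0, 0.04) increments,
# with gammas equal to exp(E[Z_i] + sigma_{Z_i}^2 / 2).
cov = [[0.04 * min(i, j) for j in (1, 2, 3)] for i in (1, 2, 3)]
gammas = [exp(0.5 * cov[i][i]) for i in range(3)]
print(convex_bound_variances([1.0, 1.0, 1.0], [0.0, 0.0, 0.0], cov, gammas))
```

For positive αi one expects Var[Sl] ≤ Var[S] ≤ Var[Sc], so that the resulting weight z lies between 0 and 1.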

We propose here three conditioning random variables. The first two are

linear combinations of the random variables Zi:

\[
\Lambda = \sum_{i=1}^{n} \gamma_i Z_i, \tag{2.50}
\]


for particular choices of the coefficients γi.

Kaas et al. (2000) propose the following choice for the parameters γi

when computing the lower bound Sl:

\[
\gamma_i = \alpha_i e^{E[Z_i]}, \qquad i = 1, \dots, n. \tag{2.51}
\]

This choice makes Λ a linear transformation of a first order approximation

to S. This can be seen from the following derivation:

\[
S = \sum_{i=1}^{n} \alpha_i e^{E[Z_i] + (Z_i - E[Z_i])} \approx \sum_{i=1}^{n} \alpha_i e^{E[Z_i]} \bigl( 1 + Z_i - E[Z_i] \bigr) = C + \sum_{i=1}^{n} \alpha_i e^{E[Z_i]} Z_i, \tag{2.52}
\]

where C is constant. Hence Sl will be “close” to S, provided (Zi − E[Zi])
is sufficiently small, or equivalently, $\sigma_{Z_i}^2$ is sufficiently small. One intuitively
expects that for this choice for Λ, E[Var[S|Λ]] is “small” and, since
Var[S] = E[Var[S|Λ]] + Var[Sl], this exactly means that one expects the
ratio z = Var[Sl]/Var[S] to be close to one.

A possible decomposition variable is in this case given by

\[
d_\Lambda = d - C = d - \sum_{i=1}^{n} \alpha_i e^{E[Z_i]} \bigl( 1 - E[Z_i] \bigr).
\]

Using the property that ex ≥ 1+x and (2.52), we have that Λ ≥ dΛ implies

that S ≥ d.

A second conditioning variable is proposed by Vanduffel et al. (2004).

They propose the following choice for the parameters γi when computing

the lower bound Sl:

\[
\gamma_i = \alpha_i e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2}, \qquad i = 1, \dots, n. \tag{2.53}
\]

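In this case the first order approximation of the variance of Sl will be
maximized. Indeed, from (2.48) we find that

\[
\begin{aligned}
\mathrm{Var}[S^l] &\approx \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i\alpha_j e^{E[Z_i] + E[Z_j] + \frac{1}{2}(\sigma_{Z_i}^2 + \sigma_{Z_j}^2)} \bigl( r_i r_j \sigma_{Z_i}\sigma_{Z_j} \bigr) \\
&= \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i\alpha_j e^{E[Z_i] + E[Z_j] + \frac{1}{2}(\sigma_{Z_i}^2 + \sigma_{Z_j}^2)} \Bigl( \frac{\mathrm{Cov}[Z_i,\Lambda]\,\mathrm{Cov}[Z_j,\Lambda]}{\mathrm{Var}[\Lambda]} \Bigr) \\
&= \frac{\Bigl( \mathrm{Cov}\bigl[ \sum_{i=1}^{n} \alpha_i e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2} Z_i,\ \Lambda \bigr] \Bigr)^2}{\mathrm{Var}[\Lambda]} \\
&= \Bigl( \mathrm{Corr}\Bigl( \sum_{i=1}^{n} \alpha_i e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2} Z_i,\ \Lambda \Bigr) \Bigr)^2\, \mathrm{Var}\Bigl[ \sum_{i=1}^{n} \alpha_i e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2} Z_i \Bigr].
\end{aligned}
\]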

Hence, the first order approximation of Var[Sl] is maximized when Λ is
given by

\[
\Lambda = \sum_{i=1}^{n} \alpha_i e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2} Z_i. \tag{2.54}
\]

One can easily prove that the first order approximation for Var[Sl] with Λ

given by (2.54) is equal to the first order approximation of Var[S]. This

observation gives an additional indication that this particular choice for Λ

will provide a good fit.

For this ‘maximal variance’ conditioning variable a possible choice for

dΛ is given by

\[
d_\Lambda = d - \sum_{i=1}^{n} \alpha_i e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2} \Bigl( 1 - E[Z_i] - \tfrac{1}{2}\sigma_{Z_i}^2 \Bigr). \tag{2.55}
\]

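A third conditioning variable is based on the standardized logarithm of the
geometric average $G = \bigl( \prod_{i=1}^{n} X_i \bigr)^{1/n}$ as in Nielsen and Sandmann (2003):

\[
\Lambda = \frac{\ln G - E[\ln G]}{\sqrt{\mathrm{Var}[\ln G]}} = \frac{\sum_{i=1}^{n} (Z_i - E[Z_i])}{\sqrt{\mathrm{Var}\bigl[ \sum_{i=1}^{n} Z_i \bigr]}}.
\]

Using the fact that the geometric average is not greater than the arithmetic
average, a possible decomposition variable is here given by

\[
d_\Lambda = \frac{n \ln\bigl( \frac{d}{n} \bigr) - \sum_{i=1}^{n} E[Z_i]}{\sqrt{\mathrm{Var}\bigl[ \sum_{i=1}^{n} Z_i \bigr]}},
\]

so that Λ ≥ dΛ implies that S ≥ d.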


Generalization to sums of lognormals with a stochastic time horizon

Suppose that S is a sum of lognormal variables with a stochastic time

horizon T

\[
S = \sum_{i=1}^{T} \alpha_i e^{Z_i},
\]

with $\alpha_i \in \mathbb{R}$, T a random variable with life time probability distribution
$F_T(t)$ and $Z_i \sim N(E[Z_i], \sigma_{Z_i}^2)$ independent of T. Using the tower property

for conditional expectations, we can calculate the stop-loss premium of S

as follows

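\begin{align}
\pi(S,d) &= \pi\Bigl( \sum_{i=1}^{T} \alpha_i e^{Z_i},\, d \Bigr) \notag \\
&= E_T\Bigl[ E\Bigl[ \Bigl( \sum_{i=1}^{T} \alpha_i e^{Z_i} - d \Bigr)_+ \Bigm| T \Bigr] \Bigr] \notag \\
&= \sum_{j=1}^{\infty} \Pr[T=j]\, \pi\Bigl( \sum_{i=1}^{j} \alpha_i e^{Z_i},\, d \Bigr) \notag \\
&= \sum_{j=1}^{\infty} \Pr[T=j]\, \pi(S_j, d), \tag{2.56}
\end{align}

with

\[
S_j := \sum_{i=1}^{j} \alpha_i e^{Z_i}.
\]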

Notice that in practical applications the infinite time horizon is often re-

placed by a finite number. In this part of the thesis, the choice of Λ

will be dependent on the time horizon n. To indicate this dependence,

we introduce the notation Λn for the used conditioning variable Λ. It is

straightforward to obtain a lower bound, denoted as πlb(S, d,Λ), by looking

at the combination

\[
\pi^{lb}(S,d,\Lambda) = \sum_{j=1}^{\infty} \Pr[T=j]\, \pi^{lb}(S_j, d, \Lambda_j),
\]

with Λ = (Λ1, Λ2, . . .) and πlb(Sj , d,Λj) given by (2.40) for n = j. The

same reasoning can be followed for obtaining the comonotonic upper bound

πcub(S, d), the improved comonotonic upper bound πicub(S, d,Λ) and the

partially exact/comonotonic upper bound πpecub(S, d,Λ).

For each term π(Sj , d) in the sum (2.56) we can take the minimum of

two or more of the above defined upper bounds. We propose two upper

bounds based on this simple idea.

The first bound takes each time the minimum of the error term (2.18)

independent of the retention and the error term (2.26) dependent on the

retention. Combining this with the stop-loss premium of the lower bound

Sl results in the following upper bound

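\[
\pi^{emub}(S,d,\Lambda) = \sum_{j=1}^{\infty} \Pr[T=j]\, \min\Bigl( \tfrac{1}{2} E\bigl[\sqrt{\mathrm{Var}[S_j \mid \Lambda_j]}\bigr],\ \varepsilon(d_{\Lambda_j}) \Bigr) + \pi^{lb}(S,d,\Lambda).
\]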

Calculating for each term the minimum of all the presented upper bounds

\[
\pi^{min}(S,d,\Lambda) = \sum_{j=1}^{\infty} \Pr[T=j] \times \min\bigl( \pi^{cub}(S_j,d),\ \pi^{icub}(S_j,d,\Lambda_j),\ \pi^{pecub}(S_j,d,\Lambda_j),\ \pi^{emub}(S_j,d,\Lambda_j) \bigr),
\]

will of course provide the best possible upper bound.

Remark that

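\[
\pi^{emub}(S_j,d,\Lambda_j) = \pi^{lb}(S_j,d,\Lambda_j) + \min\Bigl( \tfrac{1}{2} E\bigl[\sqrt{\mathrm{Var}[S_j \mid \Lambda_j]}\bigr],\ \varepsilon(d_{\Lambda_j}) \Bigr).
\]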
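The mixture structure of (2.56) and of the bound πmin is simple to implement once the per-horizon bounds are available. The sketch below assumes that the horizon has been truncated to finitely many values and that the upper bounds for each π(Sj, d) have already been computed by some other routine; the function name and the dictionary-based inputs are illustrative assumptions.

```python
def mixture_upper_bound(horizon_probs, bounds_by_horizon):
    """Sum over j of Pr[T = j] * min(available upper bounds for pi(S_j, d)).

    horizon_probs:      dict mapping horizon j -> Pr[T = j] (truncated to finitely many j)
    bounds_by_horizon:  dict mapping horizon j -> list of upper bounds for pi(S_j, d)
    """
    total = 0.0
    for j, prob in horizon_probs.items():
        total += prob * min(bounds_by_horizon[j])   # best available bound for this horizon
    return total


# Hypothetical numbers, for illustration only.
probs = {1: 0.25, 2: 0.75}
bounds = {1: [2.0, 1.0], 2: [4.0, 3.0, 5.0]}
print(mixture_upper_bound(probs, bounds))
```

Passing a single bound per horizon (a one-element list) gives the plain mixture of (2.56) for that bound.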

2.4 Application: discounted loss reserves

Loss reserving deals with the determination of the random present value

of future payments. Since this amount is very important for an insurance

company and its policyholders, these inherent uncertainties are no excuse

for providing anything less than a rigorous scientific analysis. Since the

reserve is a provision for the future payments, the estimated loss reserve

should reflect the time value of money. At the same time, it may be

necessary or desirable for those reserves to contain a security margin that

produces p×100% confidence in their adequacy, where p is a suitably high

number.


In many situations knowledge of the d.f. of this discounted reserve is
useful, for example in dynamic financial analysis, assessing profitability and
pricing, identifying risk-based capital needs, loss portfolio transfers, etc.

This application is concerned with the evaluation of loss reserves of this

type according to financial economics (see Panjer (1998)).

2.4.1 Framework and notation

Consider an insurance portfolio subject to liability payments L(i) ≥ 0 at

times i = 1, 2, . . ., where i = 0 denotes the present. Let L(i) be a random

variable and suppose that it is modified by certain forces that influence

the liability over time.

For example, suppose that $L^{(i)}_t$ denotes the amount of liability expressed
in money values of time t. Then $L^{(i)}_t$ evolves in the sense that

\[
L^{(i)}_t = L^{(i)}_{t-1} R_{Lt}, \qquad t = 1, \dots, i,
\]

where the $R_{Lt}$ are strictly positive random variables of the form

\[
R_{Lt} = 1 + r_{Lt},
\]

with $r_{Lt}$ the inflation of claims costs over interval (t − 1, t]. The liability
finally paid is

\[
L^{(i)} = L^{(i)}_i.
\]

As an example, $L^{(i)}_{t-1}$ and $R_{Lt}$ might be independently distributed as follows:

\[
L^{(i)}_{t-1} \sim \log N(\nu, \tau^2) \quad \text{and} \quad R_{Lt} \sim \log N(\mu, \sigma^2).
\]

It is emphasized that, in this example, rLt denotes claims inflation. This

might include influences other than simple community inflation, such as

the particular pressures of the legal and health care environments on claim

costs.

Similarly, a holding of assets of value At−1 at time t − 1 accumulates

at time t to

At = At−1RAt,

with

RAt = 1 + rAt.


Assume that $R_{Xt}$, where X is either A or L, follows the Capital Asset
Pricing Model (CAPM):

\[
r_{Xt} = r_{Ft} + \beta_X \Delta_t + \varepsilon_{Xt}, \tag{2.57}
\]

where $r_{Ft}$ is the risk-free rate in period t, $\beta_X$ is the CAPM beta associated
with X, $\varepsilon_{Xt}$ is the idiosyncratic risk associated with X, and

\[
\Delta_t = r_{Mt} - r_{Ft},
\]

with $r_{Mt}$ denoting the period increase in value of the economy-wide portfolio
of assets. The distribution of $\Delta_t$ is assumed independent of t. The

assumption of CAPM returns is consistent with an assumption that assets

and liabilities here are marked to market.

Henceforth, it will be assumed that rFt = rF , independent of t. This

simplifies the following algebraic development considerably. It should be

emphasized, however, that the whole development generalizes to the case

in which rFt varies with t. The generalization is theoretically straight-

forward, but adds considerable notational baggage without yielding any

deeper insight.

Assume that the εAt are i.i.d. and similarly the εLt. Assume that all

variables εAt, εLt and ∆t are stochastically independent, and that E[εXt] =

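0. Let us further denote the variance of $\varepsilon_{Xt}$ by $\omega_X^2$.

It follows that the $R_{At}$ and $R_{Lt}$ are independent and identically distributed.
Suppose now the following distribution assumptions:

\[
L^{(i)}_0 \sim \log N\bigl( \nu^{(i)}_{L0},\ \tau^{2(i)}_{L0} \bigr) \quad \text{and} \quad R_{Xt} \sim \log N\bigl( \mu_X,\ \sigma_X^2 \bigr), \tag{2.58}
\]

with stochastic independence between $L^{(i)}_0$ and $R_{Xt}$ for all i, t, and X = A, L.

Denote

\[
\rho = \mathrm{Corr}(\log R_{At}, \log R_{Lt})
\]

and

\[
\kappa^{(rs)} = \mathrm{Corr}\bigl( \log L^{(r)}_0,\ \log L^{(s)}_0 \bigr).
\]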

Define the accumulation factor

\[
R_{Xt:u} = R_{X,t+1} R_{X,t+2} \cdots R_{Xu}, \qquad \text{for } u = t+1, t+2, \dots
\]

Note that $R_{Xt:t+1} = R_{X,t+1}$.

By relation (2.58) and the independence between distinct time intervals,

\[
R_{Xt:u} \sim \log N\bigl( (u-t)\mu_X,\ (u-t)\sigma_X^2 \bigr).
\]

The implicit asset allocation is any that is consistent with relation (2.58).

One might assume, for example, a constant allocation by asset sector, with

continuous rebalancing and sector-specific returns that are constant over

time. As remarked earlier in this section, the last of these assumptions

could be weakened. Indeed, if the assumptions of constant returns over

time were weakened, no assumption would be required with respect to

asset allocation. Define the discounted liability payment

\[
V^{(i)} = L^{(i)}_i R^{-1}_{A0:i} = L^{(i)}_0 R_{L0:i} R^{-1}_{A0:i} = L^{(i)}_0 \prod_{j=1}^{i} \bigl( R_{Lj} R^{-1}_{Aj} \bigr) \sim \log N\bigl( \alpha^{(i)},\ \delta^{2(i)} \bigr),
\]

with $\alpha^{(i)} = \nu^{(i)}_{L0} + i(\mu_L - \mu_A)$ and $\delta^{2(i)} = \tau^{2(i)}_{L0} + i(\sigma_L^2 + \sigma_A^2 - 2\rho\sigma_L\sigma_A)$. The
present value S is given by

\[
S = \sum_{i=1}^{n} V^{(i)} := \sum_{i=1}^{n} e^{Z_i}, \tag{2.59}
\]

with n the number of cash-flow liabilities in the discounted value of the
total outstanding losses of the portfolio.
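The lognormal parameters of the discounted payment V(i) follow directly from the expressions for α(i) and δ2(i) above. The sketch below simply evaluates them; the function name and all numerical inputs in the example are hypothetical and are not taken from the tables of this chapter.

```python
from math import exp


def discounted_payment_params(nu, tau, i, mu_L, mu_A, sigma_L, sigma_A, rho):
    """Return (alpha_i, delta2_i), the log-mean and log-variance of V^(i).

    nu, tau: parameters of log L_0^(i); mu_X, sigma_X: per-period parameters of log R_Xt;
    rho: correlation between log R_At and log R_Lt.
    """
    alpha_i = nu + i * (mu_L - mu_A)
    delta2_i = tau ** 2 + i * (sigma_L ** 2 + sigma_A ** 2 - 2.0 * rho * sigma_L * sigma_A)
    return alpha_i, delta2_i


# Hypothetical inputs, for illustration only.
alpha_1, delta2_1 = discounted_payment_params(
    nu=-3.0, tau=0.10, i=1, mu_L=0.05, mu_A=0.10, sigma_L=0.05, sigma_A=0.20, rho=0.0)
expected_v1 = exp(alpha_1 + 0.5 * delta2_1)   # E[V^(1)] for a lognormal variable
print(alpha_1, delta2_1, expected_v1)
```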

In Taylor (2004), the mean and variance of S are calculated and given

by

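\[
E[S] = \sum_{s=1}^{n} E[V^{(s)}] = \sum_{s=1}^{n} E[L^{(s)}_0] \Bigl[ \frac{R_L}{R_A} \Bigl( \frac{1 + (\beta_A^2\sigma_M^2 + \omega_A^2)/R_A^2}{1 + \beta_A\beta_L\sigma_M^2/(R_A R_L)} \Bigr) \Bigr]^{s},
\]

\[
\begin{aligned}
\mathrm{Var}[S] &= \sum_{r,s=1}^{n} \mathrm{Cov}[V^{(r)}, V^{(s)}] \\
&= \sum_{r,s=1}^{n} E[V^{(r)}]\, E[V^{(s)}] \Bigl( \exp\bigl[ \kappa^{(rs)} \tau^{(r)}_{L0} \tau^{(s)}_{L0} + \min(r,s)\,[\sigma_L^2 + \sigma_A^2 - 2\rho\sigma_A\sigma_L] \bigr] - 1 \Bigr),
\end{aligned}
\]

with $R_X = E[R_{Xt}]$ and $\sigma_M^2 = \mathrm{Var}[r_{Mt}]$. We will denote the variance of S
by $\sigma_S^2$.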

There are now three relevant values of loss reserve:

• $\sum_{s=1}^{n} E[L^{(s)}_0]$, which is the CAPM-based economic value of the liability.

• E[S], which is the expected value of the discounted liability cash
flows, the discount rate taking into account the insurer’s asset holdings.

• $A_p = F_S^{-1}(p) = E[S] \exp\bigl( \sigma_S \Phi^{-1}(p) - \tfrac{1}{2}\sigma_S^2 \bigr)$, which is the p × 100%-confidence
loss reserve.

It may be convenient to write the last of these values in the form

\[
A_p = [1 + \eta(p, \sigma_S)]\, E[S],
\]

where η(p, σS) may be regarded as a security loading. Note, however,

that the security loading in this formulation is applied to E[S] and not

to the economic value of the liability. The first two of the above three

possibilities for loss reserve are the ones involved in the current debate

over the appropriate rate(s) at which to discount liabilities. The quantity

E[S] is obtained using the expectations of discount factors that reflect

the insurer’s expected returns. In broad (though not quite precise) terms,

it may be thought of as the amount of assets which, accumulating with

expected investment return, will be sufficient to meet liabilities as they

are required to be paid. This value depends on the insurer-specific asset

holdings, and so cannot be market or fair value of the liabilities. This is

given by the first of the above three candidates for loss reserves.

Taylor (1996) pointed out that for high security margins (Φ−1(p) > σS), the
size of the security margin increases with increasing asset beta. However,
for low security margins (Φ−1(p) < σS), the size of the security margin
decreases with increasing asset beta. In this latter case the additional
yield expected from an increased asset risk outweighs the additional risk.
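The dependence of the security loading on σS can be illustrated numerically. The sketch below evaluates the loading implied by the expression Ap = E[S] exp(σSΦ−1(p) − σS²/2) quoted above, that is η(p, σS) = exp(σSΦ−1(p) − σS²/2) − 1; the function name and the chosen values of p and σS are illustrative assumptions.

```python
from math import exp
from statistics import NormalDist


def security_loading(p, sigma_s):
    """eta(p, sigma_S) = exp(sigma_S * Phi^{-1}(p) - sigma_S^2 / 2) - 1."""
    return exp(sigma_s * NormalDist().inv_cdf(p) - 0.5 * sigma_s ** 2) - 1.0


# High confidence level: Phi^{-1}(0.95) exceeds both sigma values, so the loading grows.
print(security_loading(0.95, 0.10), security_loading(0.95, 0.20))
# Low confidence level: Phi^{-1}(0.55) is below both sigma values, so the loading shrinks.
print(security_loading(0.55, 0.20), security_loading(0.55, 0.40))
```

The derivative of the exponent with respect to σS is Φ−1(p) − σS, which explains why the sign of the effect switches at Φ−1(p) = σS.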

Taylor (2004) defines the security margin for confidence level p as
SMp[S] := η(p, σS) = (VaRp[S]/E[S]) − 1, which is based on the quantile
risk measure from the distribution of the discounted reserve S. In
general, it is hard or even impossible to determine the quantiles of the
discounted reserve analytically, because in any realistic model for the return
process the random variable S will be a sum of strongly dependent random
variables. Here, S is a finite sum of correlated lognormal random
variables. This implies that its cumulative distribution function cannot be
determined exactly and is even too cumbersome to work with. An interesting
solution to this difficulty consists of determining the lower bound Sl
and the upper bound Sc as explained earlier in this chapter.

2.4.2 Calculation of convex lower and upper bounds

To calculate the security margin η(p, σS) expressions for the quantiles and

the expected value of Sl and Sc are needed. The expressions for the quan-

tile function of the lower and upper bound of a sum of lognormal random

variables are given by (2.35) and (2.38) in the case of αi = 1 for all i.

The expression for the expected value is given by (2.46). To calculate the

lower bound we choose the ‘maximal variance’ conditioning variable given

by (2.50) and (2.53):

\[
\Lambda = \sum_{i=1}^{n} e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2} Z_i.
\]

We find that

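\[
E[Z_i] = \nu^{(i)}_{L0} + i \log\Bigl( \frac{R_L}{R_A} \Bigl( \frac{1 + (\beta_A^2\sigma_M^2 + \omega_A^2)/R_A^2}{1 + (\beta_L^2\sigma_M^2 + \omega_L^2)/R_L^2} \Bigr)^{1/2} \Bigr),
\]

\[
\mathrm{Var}[Z_i] = \sigma_{Z_i}^2 = \tau^{2(i)}_{L0} + i\sigma^2,
\]

where the variability of the discounting structure $\sigma^2 := \sigma_L^2 + \sigma_A^2 - 2\rho\sigma_L\sigma_A$
is given by

\[
\sigma^2 = \log\Bigl\{ \frac{[1 + (\beta_A^2\sigma_M^2 + \omega_A^2)/R_A^2]\,[1 + (\beta_L^2\sigma_M^2 + \omega_L^2)/R_L^2]}{[1 + \beta_A\beta_L\sigma_M^2/(R_A R_L)]^2} \Bigr\}.
\]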


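The correlation between Zi and Λ is given by

\[
r_i = \frac{\mathrm{Cov}[Z_i, \Lambda]}{\sigma_{Z_i}\sigma_\Lambda} = \frac{\sum_{k=1}^{n} \beta_k \bigl( \sigma^2 \min(i,k) + \eta^{(i,k)} \bigr)}{\sigma_{Z_i} \sqrt{\sum_{k=1}^{n}\sum_{l=1}^{n} \beta_k\beta_l \bigl( \sigma^2 \min(k,l) + \eta^{(k,l)} \bigr)}},
\]

with

\[
\eta^{(k,l)} = \mathrm{Cov}\bigl[ \log L^{(k)}_0,\ \log L^{(l)}_0 \bigr] = \kappa^{(kl)} \tau^{(k)}_{L0} \tau^{(l)}_{L0}.
\]

Here βk presumably denotes the coefficient $e^{E[Z_k] + \frac{1}{2}\sigma_{Z_k}^2}$ of Zk in Λ, not a CAPM beta.
Notice that if the liability cash flows are independent, $\eta^{(k,s)} = \tau^{2(k)}_{L0} I_{(k=s)}$.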

We will compare the performance of the lower and upper bound approach with Monte Carlo simulation results, obtained by generating 1 000 000 random paths, which serve as a benchmark. Note that the random paths are based on antithetic variables in order to reduce the variance of the Monte Carlo estimate.
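The antithetic construction mentioned above can be sketched as follows. This is only an illustration under stated assumptions, not the simulation code used for the tables: the drifts `mu`, the volatility `s`, the Gaussian random-walk form of the exponents and the sample size are all hypothetical. Each vector of normal shocks is used twice, once with flipped sign, and the pooled sample is used for the mean and for an empirical quantile.

```python
import math
import random

def antithetic_samples(mu, s, n_pairs, seed=1):
    """Draw 2 * n_pairs values of S = sum_i exp(mu[i] + s * (e_1 + ... + e_i)).

    Each vector of standard normal shocks e is reused with flipped sign,
    which gives the antithetic partner of the path.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_pairs):
        shocks = [rng.gauss(0.0, 1.0) for _ in mu]
        for sign in (1.0, -1.0):
            cumulative, total = 0.0, 0.0
            for m, e in zip(mu, shocks):
                cumulative += sign * s * e       # Gaussian random walk
                total += math.exp(m + cumulative)
            samples.append(total)
    return samples

def empirical_quantile(samples, p):
    """Smallest order statistic x whose empirical CDF value is >= p."""
    ordered = sorted(samples)
    k = max(0, math.ceil(p * len(ordered)) - 1)
    return ordered[k]

mu = [-0.05 * i for i in range(1, 6)]    # hypothetical drifts
s = 0.10                                 # hypothetical yearly volatility
samples = antithetic_samples(mu, s, n_pairs=20000)
mc_mean = sum(samples) / len(samples)
# Exact mean for comparison: the i-th exponent is N(mu_i, i * s^2).
exact_mean = sum(math.exp(m + 0.5 * (i + 1) * s**2) for i, m in enumerate(mu))
q95 = empirical_quantile(samples, 0.95)
```

Comparing `mc_mean` with `exact_mean` gives a quick sanity check of the sampler before any quantile is read off.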

We use the notation $SM_p[S^l]$ and $SM_p[S^c]$ to denote the security margin for confidence level p approximated by the lower bound and the upper bound respectively. The different tables display the Monte Carlo simulation result (MC) for the security margin, as well as the percentage deviations of the different approximation methods, relative to the Monte Carlo result. These percentage deviations are defined as follows:

\[ LB := \frac{SM_p[S^l] - SM_p[S^{MC}]}{SM_p[S^{MC}]} \times 100\%, \]

\[ UB := \frac{SM_p[S^c] - SM_p[S^{MC}]}{SM_p[S^{MC}]} \times 100\%, \]

where $S^l$ and $S^c$ correspond to the lower bound approach and the upper bound approach, and $S^{MC}$ denotes the Monte Carlo simulation result. The figures displayed in bold in the tables correspond to the best approximations, that is, the ones with the smallest percentage deviation from the Monte Carlo results.

We set βL equal to zero and choose as financial parameters rF = 6%,

E[∆] = 6% and βA = 0.9. The tables list the results for different values of

the parameters ωL, ωA, σM and n.

We construct two different cash flow structures. Table 2.1 displays the

first structure of the liability cash flows (ex. 1), each of which is assumed

lognormally distributed, and all of which are stochastically independent.


Time i    E[L^(i)]    E[L_0^(i)]    ν_0^(i)    τ_0^(i)
1           5%          4.7%        −3.059      10%
2          15%         13.3%        −2.019      10%
3          25%         21.0%        −1.566      10%
4          20%         15.8%        −1.854      15%
5          15%         11.2%        −2.120      15%
6          10%          7.0%        −2.663      15%
7           5%          3.3%        −3.424      20%
8           5%          3.1%        −3.493      25%
Total     100%         79.6%

Table 2.1: Structure of stochastic liability cash flow (ex. 1).

The profile of the cash flows is intended to resemble a medium-term

casualty payment pattern. It is assumed that ωL = 5% and as financial

parameters σM = 20% and ωA = 0. It follows from equation (2.57) that

RL = 1.06. Further, we have for this example µL = 0.0570 and σL =

0.0471.
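The quoted values of µL and σL can be checked with a short sketch. It assumes that each return X in {A, L} has $\sigma_X^2 = \log(1 + (\beta_X^2\sigma_M^2 + \omega_X^2)/R_X^2)$ and $\mu_X = \log R_X - \frac{1}{2}\sigma_X^2$; this split is read off from the expression for σ² given earlier and from the relation RX = exp(µX + ½σ²X), and is an assumption here. The function names are illustrative, and the value `R_A = 1.114` is a hypothetical input, since equation (2.57) is not reproduced in this section.

```python
import math

def lognormal_return_params(R, beta, sigma_M, omega):
    """Return (mu, sigma) such that R = exp(mu + sigma^2 / 2).

    Assumes sigma^2 = log(1 + (beta^2 * sigma_M^2 + omega^2) / R^2).
    """
    var = math.log(1.0 + (beta**2 * sigma_M**2 + omega**2) / R**2)
    return math.log(R) - 0.5 * var, math.sqrt(var)

def discounting_variance(R_A, beta_A, omega_A, R_L, beta_L, omega_L, sigma_M):
    """sigma^2 = sigma_L^2 + sigma_A^2 - 2 * rho * sigma_L * sigma_A."""
    term_A = 1.0 + (beta_A**2 * sigma_M**2 + omega_A**2) / R_A**2
    term_L = 1.0 + (beta_L**2 * sigma_M**2 + omega_L**2) / R_L**2
    cross = 1.0 + beta_A * beta_L * sigma_M**2 / (R_A * R_L)
    return math.log(term_A * term_L / cross**2)

# Example 1: beta_L = 0, omega_L = 5%, sigma_M = 20%, R_L = 1.06.
mu_L, sigma_L = lognormal_return_params(1.06, 0.0, 0.20, 0.05)
# Asset side: beta_A = 0.9, omega_A = 0; R_A = 1.114 is a hypothetical value.
mu_A, sigma_A = lognormal_return_params(1.114, 0.9, 0.20, 0.0)
sigma2 = discounting_variance(1.114, 0.9, 0.0, 1.06, 0.0, 0.05, 0.20)
```

With these inputs `sigma_L` is about 0.0471 and `mu_L` about 0.057, and because βL = 0 the cross term vanishes, so `sigma2` reduces to `sigma_A**2 + sigma_L**2`.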

Table 2.2 summarizes the results for the 70% security margin for differ-

ent market volatilities σM . The lower bound turns out to fit the security

margins the best for all values of the parameters. Notice that between

brackets the standard error of the Monte Carlo estimate is displayed.

Table 2.3 compares the approximations for some selected confidence

levels p. For this example we have that σA = 16.1%, σL = 4.7%, µA = 9.5%

and µL = 5.7%, with µX and σ2X such that RX = exp(µX + 1

2σ2X). The

results are in line with the previous ones. The lower bound approach gives

excellent results for high as well as for low values of p.

Table 2.4 displays the approximated and simulated 97.5% margins for

some selected market volatilities. These parameters are consistent with

historical capital market values as reported by Ibbotson Associates (2002).

The presented figures again indicate that the lower bound is the most

precise method.


σM :             0.05       0.15       0.25       0.35
LB              −0.25%     −0.09%     −0.12%     −0.00%
UB             +19.86%    +12.12%     +5.37%     −1.62%
MC              0.0853     0.1090     0.1309     0.1370
(s.e. × 10^7)   (1.11)     (2.47)     (6.15)     (8.18)

Table 2.2: (ex. 1) Approximations for the security margin SM0.70[V ] for different market volatilities and ωL = 0.1 and ωA = 0.05.

p :              0.995      0.975      0.95       0.90       0.80       0.70
LB              −0.38%     −0.21%     −0.16%     −0.08%     −0.00%     −0.00%
UB             +26.26%    +23.44%    +21.80%    +19.76%    +16.38%    +11.25%
MC              1.0348     0.6927     0.5421     0.3859     0.2192     0.1124
(s.e. × 10^5)   (2.49)     (0.46)     (0.26)     (0.10)     (0.06)     (0.04)

Table 2.3: (ex. 1) Approximations for some selected confidence levels of SMp[V ]. The market volatility is set equal to 20%. (ωL = 0.05 and ωA = 0)

σM :             0.05       0.10       0.15       0.20       0.25       0.30       0.35
LB              −0.19%     −0.15%     −0.23%     −0.16%     −0.11%     −0.17%     −0.38%
UB             +31.74%    +27.72%    +24.12%    +21.81%    +20.31%    +19.18%    +18.13%
MC              0.4390     0.5250     0.6528     0.8103     0.9924     1.1970     1.4232
(s.e. × 10^5)   (0.15)     (0.29)     (0.41)     (0.69)     (1.22)     (3.78)     (4.16)

Table 2.4: (ex. 1) Approximations for the 97.5% security margin for different market volatilities.

We include an additional example (ex. 2) with a different stochastic liability

cash-flow structure. We fix the number of liabilities at n = 30. Further,

we choose $\nu^{(i)}_0 = -4.46$ for i = 1, . . . , 30 and

\[ \tau^{(i)}_0 = \begin{cases} 5\% & i \le 5; \\ 10\% & 5 < i \le 15; \\ 15\% & 15 < i \le 25; \\ 20\% & 25 < i \le 28; \\ 25\% & 28 < i \le 30. \end{cases} \]


p :              0.995      0.975      0.95       0.90       0.80       0.70
LB              −0.93%     −0.04%     −0.02%     −0.18%     −0.03%     −0.6%
UB             +24.59%    +19.86%    +16.94%    +12.95%     +5.16%    −30.40%
MC              4.4521     2.2264     1.4998     0.8814     0.3508     0.0761
(s.e. × 10^5)   (37.63)    (2.99)     (7.44)     (2.79)     (0.78)     (0.27)

Table 2.5: (ex. 2) Approximations for some selected confidence levels of SMp[V ]. The market volatility is set equal to 25%.

This means that the expected cash flows $E[L^{(i)}]$ sum to 100% and the $E[L_0^{(i)}]$ sum to 35.51%. In this example we fix the parameters ωL and ωA equal to 10% and 5% respectively.

The same conclusions as for ex. 1 can be drawn from the results in

Table 2.5. This table reports the discussed approximations for SMp[V ] for

different probability levels and a fixed market volatility σM = 0.25. Note

that for the parameters in Table 2.5 σA = 20.5%, σL = 9.4%, µA = 8.7%

and µL = 5.4%.

Overall, the comonotonic lower bound approach provides a very accurate

fit under different parameter assumptions. These assumptions are in line

with realistic market values. Moreover, the comonotonic approximations

have the advantage that they are easily computable for any risk measure

that is additive for comonotonic risks, such as Value-at-Risk and the wider

class of distortion risk measures (see e.g. Dhaene et al. (2004)).

2.5 Convex bounds for scalar products of random

vectors

Within the fields of finance and actuarial science one is often confronted

with the problem of determining the distribution function of a scalar prod-

uct of two random vectors of the form

\[ S = \sum_{i=1}^{n} X_i Y_{t_i}, \tag{2.60} \]

where the nominal random payments Xi are due at fixed and known times

ti, i = 1, . . . , n and Yt denotes the nominal discount factor over the interval

[0, t], t ≥ 0. This means that the amount one needs to invest at time 0


to get an amount 1 at time t is the random variable $Y_t$. By nominal we mean that there is no correction for inflation. Notice that here the random vector $\vec{X} = (X_1, X_2, \ldots, X_n)$ may reflect e.g. the insurance or credit risk, while the vector $\vec{Y} = (Y_{t_1}, Y_{t_2}, \ldots, Y_{t_n})$ represents the financial/investment risk. If the payments $X_i$ at time $t_i$ are independent of inflation, then the vectors $\vec{X}$ and $\vec{Y}$ can be assumed to be mutually independent. On the other hand, if the payments are adjusted for inflation, the vectors $\vec{X}$ and $\vec{Y}$ are no longer mutually independent. Denoting the inflation factor over the period [0, t] by $Z_t$, the random variable S can be rewritten as

\[ S = \sum_{i=1}^{n} \tilde{X}_i \tilde{Y}_{t_i}, \]

where the real payments $\tilde{X}_i$ and the real discount factors $\tilde{Y}_{t_i}$ are given by $\tilde{X}_i = X_i/Z_{t_i}$ and $\tilde{Y}_{t_i} = Y_{t_i} Z_{t_i}$. Hence, in this case S is the scalar product of two mutually independent random vectors $(\tilde{X}_1, \tilde{X}_2, \ldots, \tilde{X}_n)$ and $(\tilde{Y}_{t_1}, \tilde{Y}_{t_2}, \ldots, \tilde{Y}_{t_n})$. For this reason the assumption of independence

between the insurance risk and the financial risk is in most cases realis-

tic and can be efficiently deployed to obtain various quantities describing

risk within financial institutions, e.g. discounted insurance claims or the

embedded/appraisal value of a company.

Distributions of sums of the form (2.60) are often encountered in prac-

tice and need to be analyzed thoroughly by actuaries and other practition-

ers involved in the risk management process. Not only the basic summary

measures (like the first few moments) have to be computed, but also more

sophisticated risk measures which require much deeper knowledge about

the underlying distributions (e.g. the Value-at-Risk).

Unfortunately there are no analytical methods to compute distribution

functions for random variables of this form. That is why usually one has

to rely on volatile and time consuming Monte Carlo simulations. In spite

of the enormous increase in computational power observed within the last

few decades, computing time remains a serious drawback of Monte Carlo

simulations, especially when one is interested in estimating very high values

of quantiles (note that a solvency capital of an insurance company may

be determined e.g. as the 99.95%-quantile, which is extremely difficult to

estimate within reasonable time by simulation methods).

In this section we propose an alternative solution. By extending the

methodology of Section 2.2 to the case of scalar products of independent


random vectors, we obtain convex upper and lower bounds for sums of the

form (2.60). As we demonstrate by means of a series of numerical illus-

trations, the methodology provides an excellent framework to get accurate

and easily obtainable approximations of distribution functions for random

variables of the form (2.60).

We first give the theoretical foundations for convex lower and upper

bounds in the case of scalar products of independent random vectors. Next,

we demonstrate how to obtain the bounds for (2.60) in the convex order sense in the case where $\vec{Y}$ follows a lognormal law. Finally, we present several

applications for discounted claim processes in a Black & Scholes setting.

2.5.1 Theoretical results

Consider sums of the form:

S = X1Y1 +X2Y2 + . . .+XnYn, (2.61)

where the random vectors $\vec{X} = (X_1, X_2, \ldots, X_n)$ and $\vec{Y} = (Y_1, Y_2, \ldots, Y_n)$ are assumed to be mutually independent. Theoretically, the techniques developed in Section 2.2 can also be applied in this case (one can take $V_j = X_j Y_j$). Such an approach is however not very practical. First of all, it is not always easy to find the marginal distributions of $V_j$. Secondly, it is usually very difficult to find a suitable conditioning random variable Λ which is a good approximation to the whole scalar product, taking into account the riskiness of the random vectors $\vec{X}$ and $\vec{Y}$ simultaneously.

The following theorem provides a more suitable approach to deal with

scalar products. Before we prove the theorem we recall a helpful lemma.

Lemma 8 (Scalar products and convex order).

Assume that $\vec{X} = (X_1, \ldots, X_n)$, $\vec{Y} = (Y_1, \ldots, Y_n)$ and $\vec{Z} = (Z_1, \ldots, Z_n)$ are non-negative random vectors and that $\vec{X}$ is mutually independent of the vectors $\vec{Y}$ and $\vec{Z}$. If for all possible outcomes $x_1, \ldots, x_n$ of $\vec{X}$

\[ \sum_{i=1}^{n} x_i Y_i \le_{cx} \sum_{i=1}^{n} x_i Z_i, \]

then the corresponding scalar products are ordered in the convex order sense, i.e.

\[ \sum_{i=1}^{n} X_i Y_i \le_{cx} \sum_{i=1}^{n} X_i Z_i. \]


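Proof. Let φ be a convex function. By conditioning on $\vec{X}$ and taking the assumptions into account, we find that

\[ E\left[\phi\left(\sum_{i=1}^{n} X_i Y_i\right)\right] = E_{\vec{X}}\left[E\left[\phi\left(\sum_{i=1}^{n} X_i Y_i\right)\,\Big|\,\vec{X}\right]\right] \le E_{\vec{X}}\left[E\left[\phi\left(\sum_{i=1}^{n} X_i Z_i\right)\,\Big|\,\vec{X}\right]\right] = E\left[\phi\left(\sum_{i=1}^{n} X_i Z_i\right)\right] \]

holds for any convex function φ.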

Theorem 11 (Bounds for scalar products of random vectors).

Consider the following sum of random variables

\[ S = \sum_{i=1}^{n} X_i Y_i. \tag{2.62} \]

Assume that the vectors $\vec{X} = (X_1, X_2, \ldots, X_n)$ and $\vec{Y} = (Y_1, Y_2, \ldots, Y_n)$ are mutually independent. Define the following quantities:

\[ S^c = \sum_{i=1}^{n} F^{-1}_{X_i}(U)\, F^{-1}_{Y_i}(V), \tag{2.63} \]

\[ S^l = \sum_{i=1}^{n} E[X_i \mid \Gamma]\, E[Y_i \mid \Lambda], \tag{2.64} \]

where U and V are independent standard uniform random variables, Γ is a random variable independent of $\vec{Y}$ and Λ, and the second conditioning random variable Λ is independent of $\vec{X}$ and Γ. Then, the following relation holds:

\[ S^l \le_{cx} S \le_{cx} S^c. \]

Proof. The proof is based on a multiple application of Lemma 8.

1. First, we prove that $\sum_{i=1}^{n} X_i Y_i \le_{cx} \sum_{i=1}^{n} F^{-1}_{X_i}(U) F^{-1}_{Y_i}(V)$. From Theorem 8 it follows that for all possible outcomes $(x_1, \ldots, x_n)$ of $\vec{X}$ the following inequality holds:

\[ \sum_{i=1}^{n} x_i Y_i \le_{cx} \sum_{i=1}^{n} F^{-1}_{x_i Y_i}(V) = \sum_{i=1}^{n} x_i F^{-1}_{Y_i}(V). \]

Thus from Lemma 8 it follows immediately that $\sum_{i=1}^{n} X_i Y_i \le_{cx} \sum_{i=1}^{n} X_i F^{-1}_{Y_i}(V)$. The same reasoning can be applied to show that

\[ \sum_{i=1}^{n} X_i F^{-1}_{Y_i}(V) \le_{cx} \sum_{i=1}^{n} F^{-1}_{X_i}(U) F^{-1}_{Y_i}(V). \]

2. In a similar way, one can show that

\[ \sum_{i=1}^{n} E[X_i \mid \Gamma]\, E[Y_i \mid \Lambda] \le_{cx} \sum_{i=1}^{n} X_i\, E[Y_i \mid \Lambda] \le_{cx} \sum_{i=1}^{n} X_i Y_i. \]

Remark 1. Notice that $\sum_{i=1}^{n} F^{-1}_{X_i}(U) F^{-1}_{Y_i}(V) \le_{cx} \sum_{i=1}^{n} F^{-1}_{X_i Y_i}(U)$. Thus the upper bound (2.63) is improved compared to the comonotonic upper bound. It takes into account the information that the vectors $\vec{X}$ and $\vec{Y}$ are independent.

Remark 2. One can also calculate the improved upper bound

\[ S^u = \sum_{i=1}^{n} F^{-1}_{X_i \mid \Gamma}(U)\, F^{-1}_{Y_i \mid \Lambda}(V), \]

but since the improved upper bound $S^u$ is very close to the comonotonic upper bound $S^c$ and requires much more computational time, we concentrate in this thesis only on the lower bound $S^l$ and the comonotonic upper bound $S^c$ as approximations for S.

Remark 3. Having obtained the convex upper and lower bounds, one can also get the moments based approximation $S^m$ as described in Subsection 2.2.4, i.e. by determining the distribution function as follows:

\[ F_{S^m}(t) = z F_{S^l}(t) + (1 - z) F_{S^c}(t), \tag{2.65} \]

where

\[ z = \frac{\mathrm{Var}[S^c] - \mathrm{Var}[S]}{\mathrm{Var}[S^c] - \mathrm{Var}[S^l]}. \tag{2.66} \]
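A small sketch may help to see why the weight (2.66) matches the variance of S. Since $S^l$, S and $S^c$ are ordered in convex order they share the same mean, so the variance of the mixture (2.65) is $z\,\mathrm{Var}[S^l] + (1-z)\,\mathrm{Var}[S^c]$, which equals Var[S] for the z above. The three variances, the common mean and the normal stand-ins for the two bound distributions below are hypothetical illustration values.

```python
from statistics import NormalDist

def moment_matching_weight(var_s, var_lower, var_upper):
    """Weight z of (2.66); requires var_lower < var_upper."""
    return (var_upper - var_s) / (var_upper - var_lower)

def mixture_cdf(z, cdf_lower, cdf_upper):
    """Distribution function (2.65) as a callable."""
    return lambda t: z * cdf_lower(t) + (1.0 - z) * cdf_upper(t)

# Hypothetical inputs: a common mean of 10 and three variances.
mean, var_l, var_s, var_c = 10.0, 0.9, 1.0, 1.5
z = moment_matching_weight(var_s, var_l, var_c)
lower = NormalDist(mean, var_l ** 0.5)    # stand-in for F_{S^l}
upper = NormalDist(mean, var_c ** 0.5)    # stand-in for F_{S^c}
cdf_m = mixture_cdf(z, lower.cdf, upper.cdf)
# Equal means make the mixture variance a weighted average of the variances.
var_m = z * var_l + (1.0 - z) * var_c
```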


2.5.2 Stop-loss premiums

The stop-loss premiums of $S^c$ and $S^l$ provide natural bounds for the stop-loss premiums of the underlying scalar product of random vectors. More precisely, one has the following relationship:

\[ \pi^{lb}(S, d, \Gamma, \Lambda) \le \pi(S, d) \le \pi^{cub}(S, d). \]

The values $\pi^{cub}(S, d)$ and $\pi^{lb}(S, d, \Gamma, \Lambda)$ can be easily computed. Below we give the computational procedure in detail.

First, consider a sum of the form

\[ (S^c \mid U = u) = \sum_{i=1}^{n} F^{-1}_{X_i}(u)\, F^{-1}_{Y_i}(V). \]

It can be easily seen that it is a sum of the components of a comonotonic vector, and hence the conditional stop-loss premiums of $S^c$ (given U = u) can be found by applying Theorem 9, provided the distribution functions of the $Y_i$ are continuous and strictly increasing. Then, the overall stop-loss premium of $S^c$ can be computed by conditioning:

\[ \pi^{cub}(S, d) = E\left[E\left[(S^c - d)_+ \mid U\right]\right] = \int_0^1 \sum_{i=1}^{n} F^{-1}_{X_i}(u)\, \pi\left(Y_i, F^{-1}_{Y_i}\left(F_{S^c \mid U=u}(d)\right)\right) du. \tag{2.67} \]

In general it is more difficult to calculate stop-loss premiums for the lower bound. However, it can be done similarly to the case of the upper bound if one additionally assumes that the conditioning variables Γ and Λ can be chosen in such a way that for any fixed γ ∈ supp(Γ) all components $E[X_i \mid \Gamma = \gamma]\, E[Y_i \mid \Lambda = \lambda]$ are non-decreasing (or equivalently non-increasing) in λ. Then the vector

\[ \left(E[X_1 \mid \Gamma = \gamma] E[Y_1 \mid \Lambda],\; E[X_2 \mid \Gamma = \gamma] E[Y_2 \mid \Lambda],\; \ldots,\; E[X_n \mid \Gamma = \gamma] E[Y_n \mid \Lambda]\right) \]

is comonotonic and Theorem 9 can be applied. Thus, one gets

\[ \pi^{lb}(S, d, \Gamma, \Lambda) = E\left[E\left[(S^l - d)_+ \mid \Gamma\right]\right] = \int_0^1 \sum_{i=1}^{n} E\left[X_i \mid \Gamma = F^{-1}_\Gamma(u)\right] \pi\left(E[Y_i \mid \Lambda],\; F^{-1}_{E[Y_i \mid \Lambda]}\left(F_{S^l \mid \Gamma = F^{-1}_\Gamma(u)}(d)\right)\right) du. \tag{2.68} \]

Hence, as soon as one can compute stop-loss premiums of $Y_i$ and $E[Y_i \mid \Lambda]$, one can also compute stop-loss premiums of $S^c$ and $S^l$.

Note that stop-loss premiums of the moments based approximation $S^m$ can be easily calculated as

\[ \pi^{m}(S, d, \Gamma, \Lambda) = z\,\pi^{lb}(S, d, \Gamma, \Lambda) + (1 - z)\,\pi^{cub}(S, d). \]

2.5.3 The case of log-normal discount factors

In the sequel we develop a framework for computing convex bounds for random variables of the form:

\[ S = \sum_{i=1}^{n} \alpha_i X_i e^{Z_i}, \tag{2.69} \]

where the vectors $\vec{X}$ and $\vec{Z}$ satisfy the usual conditions (see Section 2.5.1). We assume $\alpha_i > 0$ and $Z_i \sim N(E[Z_i], \sigma_{Z_i}^2)$. In this section we consider the problem in general, without imposing any conditions on the random variables $X_i$. In particular we don’t discuss the choice of the conditioning variable Γ.

The upper bound

From Theorem 11 it follows that

\[ S^c = \sum_{i=1}^{n} F^{-1}_{X_i}(U)\, F^{-1}_{\alpha_i e^{Z_i}}(V) = \sum_{i=1}^{n} F^{-1}_{X_i}(U)\, \alpha_i e^{E[Z_i] + \mathrm{sign}(\alpha_i)\sigma_{Z_i}\Phi^{-1}(V)}, \tag{2.70} \]

where U and V are independent standard uniform random variables.

The cumulative distribution function of $S^c$ can be calculated in three steps:

1. Suppose that U = u is fixed. Then from (2.70) it follows that conditional quantiles can be computed as

\[ F^{-1}_{S^c \mid U=u}(p) = \sum_{i=1}^{n} F^{-1}_{X_i}(u)\, \alpha_i e^{E[Z_i] + \mathrm{sign}(\alpha_i)\sigma_{Z_i}\Phi^{-1}(p)}; \tag{2.71} \]

2. Obviously for any u the function given by (2.71) is continuous and strictly increasing. Thus for any y ≥ 0 one can compute the value of the conditional distribution function using one of the well-known numerical methods (e.g. Newton-Raphson) as a solution of

\[ \sum_{i=1}^{n} F^{-1}_{X_i}(u)\, \alpha_i e^{E[Z_i] + \mathrm{sign}(\alpha_i)\sigma_{Z_i}\Phi^{-1}\left(F_{S^c \mid U=u}(y)\right)} = y; \tag{2.72} \]

3. The cumulative distribution function of $S^c$ can now be derived as

\[ F_{S^c}(y) = \int_0^1 F_{S^c \mid U=u}(y)\, du. \]
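The three steps above can be sketched numerically as follows. The model is hypothetical: all $\alpha_i = 1$, the payments $X_i$ are taken lognormal so that $F^{-1}_{X_i}$ is explicit, and the means and standard deviations are illustration values. Bisection is swapped in for the Newton-Raphson method in step 2, and the integral in step 3 is approximated by a midpoint rule.

```python
import math
from statistics import NormalDist

PHI = NormalDist()

# Hypothetical model: alpha_i = 1, lognormal payments X_i, normal Z_i.
ALPHA = [1.0, 1.0, 1.0]
X_PARAMS = [(0.0, 0.10)] * 3          # (mean, sd) of ln X_i
Z_MEAN = [-0.05, -0.10, -0.15]        # E[Z_i]
Z_SD = [0.10, 0.14, 0.17]             # sigma_{Z_i}

def x_quantile(i, u):
    """Quantile function of the lognormal payment X_i."""
    m, s = X_PARAMS[i]
    return math.exp(m + s * PHI.inv_cdf(u))

def cond_quantile(p, u):
    """Step 1, equation (2.71), with sign(alpha_i) = +1."""
    z = PHI.inv_cdf(p)
    return sum(a * x_quantile(i, u) * math.exp(Z_MEAN[i] + Z_SD[i] * z)
               for i, a in enumerate(ALPHA))

def cond_cdf(y, u, tol=1e-12):
    """Step 2: solve (2.72) for F_{S^c|U=u}(y) by bisection in p."""
    lo, hi = 1e-12, 1.0 - 1e-12
    if cond_quantile(lo, u) >= y:
        return 0.0
    if cond_quantile(hi, u) <= y:
        return 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cond_quantile(mid, u) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def cdf_upper(y, n_grid=200):
    """Step 3: midpoint rule for the integral over u in (0, 1)."""
    return sum(cond_cdf(y, (k + 0.5) / n_grid) for k in range(n_grid)) / n_grid
```

Because (2.71) is strictly increasing in p, bisection is guaranteed to converge; a Newton-Raphson iteration would be faster but needs a safeguard near p = 0 and p = 1.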

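The stop-loss premiums of the upper bound can be computed as follows. For simplicity of notation let us denote

\[ d_{u,i} = F^{-1}_{\alpha_i e^{Z_i}}\left(F_{S^c \mid U=u}(d)\right) = \alpha_i e^{E[Z_i] + \mathrm{sign}(\alpha_i)\sigma_{Z_i}\Phi^{-1}\left(F_{S^c \mid U=u}(d)\right)}. \tag{2.73} \]

Then one has

\[ \pi\left(\alpha_i e^{Z_i}, d_{u,i}\right) = \alpha_i e^{E[Z_i] + \frac{\sigma_{Z_i}^2}{2}}\, \Phi\left(\mathrm{sign}(\alpha_i)\, b^{(1)}_{u,i}\right) - d_{u,i}\, \Phi\left(\mathrm{sign}(\alpha_i)\, b^{(2)}_{u,i}\right), \tag{2.74} \]

where, using Lemma 6,

\[ b^{(1)}_{u,i} = \frac{E[Z_i] + \sigma_{Z_i}^2 - \ln(d_{u,i})}{\sigma_{Z_i}}, \qquad b^{(2)}_{u,i} = b^{(1)}_{u,i} - \sigma_{Z_i}. \]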
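As a sanity check on the shape of (2.74), the following sketch evaluates the stop-loss premium of a single lognormal variable $e^Z$ (the case $\alpha_i = 1$) in closed form and compares it with a direct numerical integration. For a general scale factor α > 0 one can use $E[(\alpha e^Z - d)_+] = \alpha\, E[(e^Z - d/\alpha)_+]$. The function names and the parameter values are illustrative assumptions.

```python
import math
from statistics import NormalDist

PHI = NormalDist()

def stop_loss_lognormal(mean, sd, d):
    """E[(e^Z - d)_+] for Z ~ N(mean, sd^2) and retention d > 0."""
    b1 = (mean + sd**2 - math.log(d)) / sd
    b2 = b1 - sd
    return math.exp(mean + 0.5 * sd**2) * PHI.cdf(b1) - d * PHI.cdf(b2)

def stop_loss_numeric(mean, sd, d, step=1e-3, width=10.0):
    """Riemann sum of (e^{mean + sd*z} - d)_+ * phi(z) over z in [-width, width]."""
    n = int(2 * width / step)
    total = 0.0
    for k in range(n + 1):
        z = -width + k * step
        total += max(math.exp(mean + sd * z) - d, 0.0) * PHI.pdf(z)
    return total * step

closed = stop_loss_lognormal(-0.10, 0.15, 0.95)    # hypothetical inputs
numeric = stop_loss_numeric(-0.10, 0.15, 0.95)
```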

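Then the stop-loss premium of $S^c$ with retention d can be computed by plugging (2.74) into (2.67) and is given by

\[ \pi^{cub}(S, d) = \int_0^1 \sum_{i=1}^{n} F^{-1}_{X_i}(u)\, \pi\left(\alpha_i e^{Z_i}, d_{u,i}\right) du \]
\[ \qquad = \sum_{i=1}^{n} \alpha_i e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2} \int_0^1 F^{-1}_{X_i}(u)\, \Phi\left(\mathrm{sign}(\alpha_i)\sigma_{Z_i} - \Phi^{-1}\left(F_{S^c \mid U=u}(d)\right)\right) du \; - \; d\left(1 - F_{S^c}(d)\right). \tag{2.75} \]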

The lower bound

The computations for the lower bound are performed similarly; however, the quality of the bound heavily depends on the choice of the conditioning random variables. Recall that from Theorem 11 it follows that

\[ S^l = \sum_{i=1}^{n} E[X_i \mid \Gamma]\, E\left[\alpha_i e^{Z_i} \mid \Lambda\right], \tag{2.76} \]

where the first conditioning variable Γ is independent of Λ and $\vec{Y}$ and where the second conditioning variable Λ is independent of Γ and $\vec{X}$. In this section the choice of Γ will not be discussed and the random variable Λ will be assumed to be of the ‘maximal variance’ form (2.54)

\[ \Lambda = \sum_{i=1}^{n} \beta_i Z_i = \sum_{i=1}^{n} \alpha_i E[X_i]\, e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2}\, Z_i. \tag{2.77} \]

Under these assumptions the vectors of the form $(Z_i, \Lambda)$ have a bivariate normal distribution. Thus, $Z_i \mid \Lambda = \lambda$ will be normally distributed with mean $\mu_{i,\lambda}$ and variance $\sigma^2_{i,\lambda}$ given by

\[ \mu_{i,\lambda} = E[Z_i] + \frac{\mathrm{Cov}[Z_i, \Lambda]}{\mathrm{Var}[\Lambda]}\left(\lambda - E[\Lambda]\right) \]

and

\[ \sigma^2_{i,\lambda} = \sigma_{Z_i}^2 - \frac{\mathrm{Cov}[Z_i, \Lambda]^2}{\mathrm{Var}[\Lambda]}. \]

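The lower bound (2.76) can be written out as

\[ S^l = \sum_{i=1}^{n} E[X_i \mid \Gamma]\, E\left[\alpha_i e^{Z_i} \mid \Lambda\right] = \sum_{i=1}^{n} E[X_i \mid \Gamma]\, \alpha_i e^{\mu_{i,\Lambda} + \frac{\sigma^2_{i,\Lambda}}{2}} = \sum_{i=1}^{n} E[X_i \mid \Gamma]\, \alpha_i e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2(1 - r_i^2) + \sigma_{Z_i} r_i \Phi^{-1}(U)}, \tag{2.78} \]

with U a standard uniform random variable and correlations given by

\[ r_i = \mathrm{Corr}(Z_i, \Lambda) = \frac{\mathrm{Cov}[Z_i, \Lambda]}{\sigma_{Z_i}\sigma_\Lambda} = \frac{\sum_{j=1}^{n} E[X_j]\, e^{E[Z_j] + \frac{1}{2}\sigma_{Z_j}^2}\, \sigma_{Z_i Z_j}}{\sigma_{Z_i}\sqrt{\sum_{1 \le k,l \le n} E[X_k] E[X_l]\, e^{E[Z_k] + E[Z_l] + \frac{1}{2}(\sigma_{Z_k}^2 + \sigma_{Z_l}^2)}\, \sigma_{Z_k Z_l}}}. \tag{2.79} \]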

Note that the $r_i$’s are non-negative and the random variable $S^l$ is (given a value Γ = γ) the sum of the components of a comonotonic vector. Thus the cumulative distribution function of the lower bound $S^l$ can be computed, similar to the case of the upper bound $S^c$, in three steps:

1. From (2.78) it follows that the conditional quantiles (given Γ = γ) can be computed as

\[ F^{-1}_{S^l \mid \Gamma=\gamma}(p) = \sum_{i=1}^{n} E[X_i \mid \Gamma = \gamma]\, \alpha_i e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2(1 - r_i^2) + \sigma_{Z_i} r_i \Phi^{-1}(p)}; \tag{2.80} \]

2. The conditional distribution function is computed as the solution of

\[ \sum_{i=1}^{n} E[X_i \mid \Gamma = \gamma]\, \alpha_i e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2(1 - r_i^2) + \sigma_{Z_i} r_i \Phi^{-1}\left(F_{S^l \mid \Gamma=\gamma}(y)\right)} = y; \tag{2.81} \]

3. Finally, the cumulative distribution function of $S^l$ can be derived as

\[ F_{S^l}(y) = \int_0^1 F_{S^l \mid \Gamma = F^{-1}_\Gamma(u)}(y)\, du. \]


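The stop-loss premiums are computed as follows. Let us denote

\[ d_{\gamma,i} = F^{-1}_{E[\alpha_i e^{Z_i} \mid \Lambda]}\left(F_{S^l \mid \Gamma=\gamma}(d)\right) = \alpha_i e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2(1 - r_i^2) + \sigma_{Z_i} r_i \Phi^{-1}\left(F_{S^l \mid \Gamma=\gamma}(d)\right)}. \]

Then one has

\[ \pi\left(E[\alpha_i e^{Z_i} \mid \Lambda], d_{\gamma,i}\right) = \alpha_i e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2}\, \Phi\left(\mathrm{sign}(\alpha_i)\, b^{(1)}_{\gamma,i}\right) - d_{\gamma,i}\, \Phi\left(\mathrm{sign}(\alpha_i)\, b^{(2)}_{\gamma,i}\right), \tag{2.82} \]

with

\[ b^{(1)}_{\gamma,i} = \frac{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2(1 - r_i^2) + \sigma_{Z_i}^2 r_i^2 - \ln(d_{\gamma,i})}{\sigma_{Z_i} r_i}, \qquad b^{(2)}_{\gamma,i} = b^{(1)}_{\gamma,i} - \sigma_{Z_i} r_i. \]

Then the stop-loss premium of $S^l$ with retention d can be computed by plugging (2.82) into (2.68) and is given by

\[ \pi^{lb}(S, d, \Gamma, \Lambda) = \int_0^1 \sum_{i=1}^{n} E\left[X_i \mid \Gamma = F^{-1}_\Gamma(u)\right] \pi\left(E[\alpha_i e^{Z_i} \mid \Lambda], d_{\gamma,i}\right) du \]
\[ \qquad = \sum_{i=1}^{n} \alpha_i e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2} \int_0^1 E\left[X_i \mid \Gamma = F^{-1}_\Gamma(u)\right] \Phi\left(r_i\sigma_{Z_i} - \Phi^{-1}\left(F_{S^l \mid \Gamma=\gamma}(d)\right)\right) du \; - \; d\left(1 - F_{S^l}(d)\right), \tag{2.83} \]

where $\gamma = F^{-1}_\Gamma(u)$ inside the integrals.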

Moments based approximations

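For computing the moments based approximation as defined in (2.65), one has to calculate the variance of S, $S^l$ and $S^c$. In general the problem is easily solvable for the upper and the lower bound. For the exact distribution it is more difficult to find a universal solution and the problem needs to be considered case by case. In the general case one would face the problem of computing multiple integrals, which usually requires too much computational time.

Note that the upper and the lower bound of S, as described in the two preceding parts of Subsection 2.5.3, can be seen as special cases of the following random variable X with general form given by

\[ X = \sum_{i=1}^{n} \alpha_i f_i(U)\, g_i(V), \tag{2.84} \]

where $(\alpha_1, \alpha_2, \ldots, \alpha_n)$ is a vector of non-negative numbers, $f_i(\cdot)$ and $g_i(\cdot)$ are non-negative functions and U and V are two independent standard uniform random variables. Indeed, in the case of the upper bound one takes

\[ f_i(U) = F^{-1}_{X_i}(U) \quad \text{and} \quad g_i(V) = F^{-1}_{e^{Z_i}}(V), \]

and in the case of the lower bound

\[ f_i(U) = E[X_i \mid \Gamma] \quad \text{and} \quad g_i(V) = E\left[e^{Z_i} \mid \Lambda\right]. \]

The variance of X in expression (2.84) can be computed as follows:

\[ \mathrm{Var}[X] = E\left[\mathrm{Var}[X \mid U]\right] + \mathrm{Var}\left[E[X \mid U]\right] \]
\[ \qquad = \int_0^1 \mathrm{Var}\left[\sum_{i=1}^{n} \alpha_i f_i(u) g_i(V)\right] du + \int_0^1 \left(E\left[\sum_{i=1}^{n} \alpha_i f_i(u) g_i(V)\right]\right)^2 du - \left(\int_0^1 E\left[\sum_{i=1}^{n} \alpha_i f_i(u) g_i(V)\right] du\right)^2. \]

Thus the problem of computing the variance of X is always solvable if one is able to compute the expectation and the variance of random variables $\tilde{X}$ of the form

\[ \tilde{X} = \sum_{i=1}^{n} \tilde{\alpha}_i\, g_i(V), \]

for any vector of non-negative numbers $(\tilde{\alpha}_1, \tilde{\alpha}_2, \ldots, \tilde{\alpha}_n)$ (here $\tilde{\alpha}_i = \alpha_i f_i(u)$). For the comonotonic upper bound (2.70), i.e. $g_i(V) = e^{E[Z_i] + \sigma_{Z_i}\Phi^{-1}(V)}$, the variance of $\tilde{X}$ is given by

\[ \mathrm{Var}[\tilde{X}] = \sum_{i=1}^{n}\sum_{j=1}^{n} \tilde{\alpha}_i\tilde{\alpha}_j\, e^{E[Z_i] + E[Z_j] + \frac{\sigma_{Z_i}^2 + \sigma_{Z_j}^2}{2}}\left(e^{\sigma_{Z_i}\sigma_{Z_j}} - 1\right) \]

and for the lower bound (2.76), i.e. $g_i(V) = e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2(1 - r_i^2) - \sigma_{Z_i} r_i \Phi^{-1}(V)}$, by

\[ \mathrm{Var}[\tilde{X}] = \sum_{i=1}^{n}\sum_{j=1}^{n} \tilde{\alpha}_i\tilde{\alpha}_j\, e^{E[Z_i] + E[Z_j] + \frac{\sigma_{Z_i}^2 + \sigma_{Z_j}^2}{2}}\left(e^{r_i r_j \sigma_{Z_i}\sigma_{Z_j}} - 1\right). \]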

2.6 Application: the present value of stochastic

cash flows

In this section we derive convex upper and lower bounds for general discounted cash flows of the form

\[ S = \sum_{i=1}^{n} X_i e^{-Y(i)}, \tag{2.85} \]

where the random variables $X_i$ denote future (non-negative) payments due at time i and Y(t) is a stochastic process describing returns on investment in the period (0, t).

We give explicit results for convex upper and lower bounds in three specific cases:

(i) The vector $\ln\vec{X} = (\ln(X_1), \ln(X_2), \ldots, \ln(X_n))$ has a multivariate normal distribution and hence the losses are log-normally distributed.

(ii) The vector $\vec{X} = (X_1, X_2, \ldots, X_n)$ has a multivariate elliptical distribution. Formally the described methodology is valid only in the case when $X_i > 0$.

(iii) The yearly payments $X_i$ are independent and identically distributed.

2.6.1 Stochastic returns

We start with a general definition of a Gaussian process.

Definition 7 (Gaussian process).

A stochastic process $\{Y(t) \mid t \ge 0\}$ is called Gaussian if for any $0 < t_1 < t_2 < \ldots < t_n$ the vector $(Y(t_1), Y(t_2), \ldots, Y(t_n))$ has a multivariate normal distribution.

Gaussian processes have a lot of desirable properties. They are very easy to

handle since they are completely determined by their mean and covariance

functions

m(t) = E[Y (t)] and c(s, t) = Cov[Y (s), Y (t)]. (2.86)

For an introduction to Gaussian processes, see e.g. Karatzas & Shreve

(1991). The normality assumption for modelling returns on investment


has been questioned in the financial literature for the short term setting

(e.g. daily returns — see Schoutens (2003)). In the long term however

Gaussian models provide a satisfactory approximation since the Central

Limit Theorem is applicable under the reasonable assumptions of indepen-

dent returns with finite variance (some empirical evidence is provided e.g.

in Cesari & Cremonini (2003)). Therefore in the framework of this thesis

we restrict ourselves to two simple Gaussian models for future returns Y (t).

More precisely, we will focus on modelling returns by means of a Brownian

motion with drift (the Black & Scholes model) and an Ornstein-Uhlenbeck

process. This limitation is very convenient because it leads to closed-form

formulas for convex upper and lower bounds of future cash flows.

The Black & Scholes setting (B-SM)

We assume that a process X(t) satisfies the following stochastic differential equation:

\[ dX(t) = X(t)\left(\mu + \tfrac{1}{2}\sigma^2\right)dt + X(t)\,\sigma\, dW_1(t), \tag{2.87} \]

where $W_1(t)$ denotes a standard Brownian motion. It is well-known that (2.87) has a unique solution of the form

\[ X(t) = X(0)\, e^{\mu t + \sigma W_1(t)}, \]

and thus the return on investment process $Y(t) = \log\left(\frac{X(t)}{X(0)}\right)$ is Gaussian with mean and covariance functions given by

\[ m(t) = \mu t \quad \text{and} \quad c(s, t) = \min(s, t)\,\sigma^2. \]

One of the most important features of the return process Y(t) is the property of independent increments. Indeed, it is straightforward to verify that for every 0 < s < t < u one has that

\[ \mathrm{Cov}\left[Y(u) - Y(t),\; Y(t) - Y(s)\right] = 0. \]

For this reason we often consider yearly rates of return

\[ Y_i = Y(i) - Y(i-1) \quad \text{for } i = 1, 2, \ldots \tag{2.88} \]

which are independent and normally distributed with mean equal to µ and variance equal to σ².


The Ornstein-Uhlenbeck model (O-UM)

In the Ornstein-Uhlenbeck model the return process is described as

\[ Y(t) = \mu t + Z(t), \]

where Z(t) is the solution of the following stochastic differential equation:

\[ dZ(t) = -aZ(t)\,dt + \sigma\, dW_1(t), \]

with a and σ being positive constants. Then Y(t) is again Gaussian with mean and covariance functions given by

\[ m(t) = \mu t \quad \text{and} \quad c(s, t) = \frac{\sigma^2}{2a}\left(e^{-a|t-s|} - e^{-a(t+s)}\right). \tag{2.89} \]

We refer to e.g. Arnold (1974) for more details about the derivation.

Note that for a = 0 the Ornstein-Uhlenbeck process degenerates to

an ordinary Brownian motion with drift and is equivalent to the Black &

Scholes setting. When a > 0, process Y (t) has no independent increments

any more. Moreover, it becomes mean reverting. Intuitively the property

of mean reversion means that process Y (t) cannot deviate too far from its

mean function m(t). In fact the parameter a measures how strongly paths

of Y (t) are attracted by the mean function. The value a = 0 corresponds to

the case when there is no attraction and as a consequence the increments

become independent. In Figure 2.1 we illustrate typical sample paths of the Ornstein-Uhlenbeck model for different values of the parameter a.
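The limiting behaviour for a → 0 and the effect of mean reversion can be checked directly from the two covariance functions. The sketch below uses hypothetical function names and the volatility σ = 0.07 only as an illustration value; it evaluates (2.89) for a very small a, where it should be close to min(s, t)σ², and for a = 0.5, where the variance stays bounded by σ²/(2a).

```python
import math

def cov_bs(s, t, sigma):
    """Black & Scholes covariance function c(s, t) = min(s, t) * sigma^2."""
    return sigma**2 * min(s, t)

def cov_ou(s, t, sigma, a):
    """Ornstein-Uhlenbeck covariance function (2.89), for a > 0."""
    return sigma**2 / (2.0 * a) * (
        math.exp(-a * abs(t - s)) - math.exp(-a * (t + s)))

sigma = 0.07
near_bs = cov_ou(2.0, 5.0, sigma, 1e-6)       # almost no mean reversion
var_ou_10 = cov_ou(10.0, 10.0, sigma, 0.5)    # variance of Y(10), a = 0.5
var_bs_10 = cov_bs(10.0, 10.0, sigma)         # variance of Y(10), a -> 0
```

With a = 0.5 the ten-year variance is roughly a tenth of the Black & Scholes value, which is the mean-reversion effect described above.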

In particular we will concentrate on the case when Y (i) is defined by one

of these models. Then the sum S in (2.85) has a clear interpretation: it is

the discounted value of future benefits Xi with returns described by one of

the well-known Gaussian models. The input variables of the two discussed

return models are displayed in Table 2.6.


[Figure 2.1 is not reproduced in this text version. Its four panels plot Y(t) against t on [0, 10] for a) a = 0, b) a = 0.02, c) a = 0.1 and d) a = 0.5.]

Figure 2.1: Typical paths for the Ornstein-Uhlenbeck process with mean µ = 0.05, volatility σ = 0.07 and different values of parameter a.

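Model   Variable        Formula

B-SM    E[Y(i)]         $i\mu$
        Var[Y(i)]       $i\sigma^2$
        Var[Λ]          $\sum_{j=1}^{n} j\beta_j^2\sigma^2 + \sum_{1 \le j < k \le n} 2j\beta_j\beta_k\sigma^2$
        Cov[Y(i), Λ]    $\sum_{j=1}^{n} \min(i, j)\beta_j\sigma^2$

O-UM    E[Y(i)]         $i\mu$
        Var[Y(i)]       $\frac{\sigma^2}{2a}\left(1 - e^{-2ia}\right)$
        Var[Λ]          $\frac{\sigma^2}{2a}\left(\sum_{j=1}^{n} \beta_j^2\left(1 - e^{-2ja}\right) + \sum_{1 \le j < k \le n} 2\beta_j\beta_k\left(e^{-(k-j)a} - e^{-(j+k)a}\right)\right)$
        Cov[Y(i), Λ]    $\frac{\sigma^2}{2a}\sum_{j=1}^{n} \beta_j\left(e^{-|i-j|a} - e^{-(i+j)a}\right)$

Table 2.6: Input variables for returns. We take $\Lambda = \sum_{i=1}^{n} \beta_i Y(i)$.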
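The Black & Scholes entries of Table 2.6 follow from c(j, k) = min(j, k)σ². A short sketch, with hypothetical weights `beta` and illustrative function names, compares the tabulated expression for Var[Λ] with the direct double sum $\sum_j\sum_k \beta_j\beta_k\, c(j,k)$ and checks that $\mathrm{Var}[\Lambda] = \sum_i \beta_i\, \mathrm{Cov}[Y(i), \Lambda]$.

```python
def var_lambda_table(beta, sigma):
    """Var[Lambda] as written in Table 2.6 (B-SM); beta[j] is beta_{j+1}."""
    n = len(beta)
    diag = sum((j + 1) * beta[j]**2 for j in range(n))
    cross = sum(2 * (j + 1) * beta[j] * beta[k]
                for j in range(n) for k in range(j + 1, n))
    return sigma**2 * (diag + cross)

def var_lambda_direct(beta, sigma):
    """Double sum of beta_j * beta_k * min(j, k) * sigma^2."""
    n = len(beta)
    return sigma**2 * sum(beta[j] * beta[k] * min(j + 1, k + 1)
                          for j in range(n) for k in range(n))

def cov_y_lambda(i, beta, sigma):
    """Cov[Y(i), Lambda] for i = 1, ..., n in the B-SM setting."""
    return sigma**2 * sum(min(i, j + 1) * beta[j] for j in range(len(beta)))

beta = [1.0, 0.9, 0.8, 0.7]    # hypothetical weights
sigma = 0.10                   # hypothetical volatility
v_table = var_lambda_table(beta, sigma)
v_direct = var_lambda_direct(beta, sigma)
v_from_cov = sum(b * cov_y_lambda(i + 1, beta, sigma) for i, b in enumerate(beta))
```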


2.6.2 Lognormally distributed payments

Consider a sum of the form

\[ S_{LN} = \sum_{i=1}^{n} e^{N_i} e^{-Y(i)}, \tag{2.90} \]

where $\vec{N} = (N_1, N_2, \ldots, N_n) = (\ln(X_1), \ln(X_2), \ldots, \ln(X_n))$ is a normally distributed random vector with mean $\vec{\mu}_{\vec{N}} = (\mu_{N_1}, \mu_{N_2}, \ldots, \mu_{N_n})$ and covariance matrix $\Sigma_{\vec{N}} = \left[\sigma^{\vec{N}}_{ij}\right]_{1 \le i,j \le n}$. The corresponding variances are denoted by $\sigma^2_{N_i} := \sigma^{\vec{N}}_{ii}$.

There are two different approaches to derive convex upper and lower bounds for $S_{LN}$ as defined in (2.90). In the first approach independent parts of the scalar product are treated separately (this approach is consistent with the methodology described in Subsections 2.5.1 and 2.5.3). In the second approach we treat $S_{LN}$ unidimensionally, by noticing that it can be rewritten as

\[ S_{LN} = \sum_{i=1}^{n} \tilde{X}_i = \sum_{i=1}^{n} e^{\tilde{N}_i}, \tag{2.91} \]

where $\vec{\tilde{N}} = (\tilde{N}_1, \tilde{N}_2, \ldots, \tilde{N}_n) = (N_1 - Y(1), N_2 - Y(2), \ldots, N_n - Y(n))$ has a multivariate normal distribution with parameters

\[ \vec{\mu}_{\vec{\tilde{N}}} = (\mu_{\tilde{N}_1}, \mu_{\tilde{N}_2}, \ldots, \mu_{\tilde{N}_n}) \quad \text{and} \quad \Sigma_{\vec{\tilde{N}}} = \left[\sigma^{\vec{\tilde{N}}}_{ij}\right]_{1 \le i,j \le n}, \tag{2.92} \]

with

\[ \mu_{\tilde{N}_i} = \mu_{N_i} - m(i) \quad \text{and} \quad \sigma^{\vec{\tilde{N}}}_{ij} = \sigma^{\vec{N}}_{ij} + c(i, j), \]

where m(.) and c(., .) denote mean and covariance functions of the process Y(.), as defined in (2.86). We further use the following notations: $\sigma^2_{\tilde{N}_i} := \sigma^{\vec{\tilde{N}}}_{ii}$, $\mu_i := -m(i)$ and $\sigma_i^2 := c(i, i)$. Thus one can derive convex upper and lower bounds of (2.91) just by adapting the methodology described in Section 2.3.4.

Below we work out both approaches explicitly. The main advantage of

the first method is a better recognition of the dependency structure and

this results in more precise estimates (especially the upper bound). On

the other hand the second method is much less time-consuming because

the problem is reduced to only one dimension.


The upper bound

The upper bound can be written as

\[ S^c_{LN} = \sum_{i=1}^{n} e^{\mu_{N_i} + \mu_i + \sigma_{N_i}\Phi^{-1}(U) + \sigma_i\Phi^{-1}(V)} \]

and its distribution function can be computed as described in Subsection 2.5.3.

The lower bound

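To compute the lower bound we propose to define a conditioning random variable Γ symmetrically to the conditioning variable Λ, i.e.

\[ \Gamma = \sum_{i=1}^{n} E\left[e^{-Y(i)}\right] e^{\mu_{N_i} + \frac{1}{2}\sigma^2_{N_i}}\, N_i = \sum_{i=1}^{n} e^{\mu_{N_i} + \mu_i + \frac{1}{2}\left(\sigma^2_{N_i} + \sigma_i^2\right)}\, N_i. \]

The conditioning variable Λ is chosen as in (2.77), which gives after the obvious substitution

\[ \Lambda = -\sum_{i=1}^{n} e^{\mu_{N_i} + \mu_i + \frac{1}{2}\left(\sigma^2_{N_i} + \sigma_i^2\right)}\, Y(i). \tag{2.93} \]

Now the corresponding lower bound can be written as

\[ S^{l_1}_{LN} = \sum_{i=1}^{n} e^{\mu_{N_i} + \mu_i + \frac{1}{2}\sigma^2_{N_i}\left(1 - r^2_{N_i}\right) + \frac{1}{2}\sigma_i^2\left(1 - r_i^2\right) + \sigma_{N_i} r_{N_i}\Phi^{-1}(U) + \sigma_i r_i \Phi^{-1}(V)}, \]

where the correlations $r_i = r(-Y(i), \Lambda)$ are defined as in (2.79) and

\[ r_{N_i} = r(N_i, \Gamma) = \frac{\sum_{j=1}^{n} e^{\mu_{N_j} + \mu_j + \frac{1}{2}\left(\sigma^2_{N_j} + \sigma_j^2\right)}\, \sigma^{\vec{N}}_{ij}}{\sigma_{N_i}\sqrt{\sum_{k,l=1}^{n} e^{\mu_{N_k} + \mu_{N_l} + \mu_k + \mu_l + \frac{1}{2}\left(\sigma^2_{N_k} + \sigma^2_{N_l} + \sigma_k^2 + \sigma_l^2\right)}\, \sigma^{\vec{N}}_{kl}}}. \]

Its distribution function can be computed by conditioning on U, as described in Section 2.5.3.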

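From Remark 1 it follows that

\[ S^c_{LN} \le_{cx} \sum_{i=1}^{n} F^{-1}_{e^{\tilde{N}_i}}(U), \]

and thus we don’t consider the comonotonic upper bound for (2.91). To compute the lower bound we apply directly the results of Section 2.3.4. Therefore, we take as conditioning random variable

\[ \tilde{\Lambda} = \sum_{i=1}^{n} e^{\mu_{\tilde{N}_i} + \frac{1}{2}\sigma^2_{\tilde{N}_i}}\, \tilde{N}_i. \tag{2.94} \]

Then the lower bound is given explicitly as

\[ S^{l_2}_{LN} = \sum_{i=1}^{n} e^{\mu_{\tilde{N}_i} + \frac{1}{2}\sigma^2_{\tilde{N}_i}\left(1 - r^2_{\tilde{N}_i}\right) + \sigma_{\tilde{N}_i} r_{\tilde{N}_i}\Phi^{-1}(U)}, \]

where

\[ r_{\tilde{N}_i} = r(\tilde{N}_i, \tilde{\Lambda}) = \frac{\sum_{j=1}^{n} e^{\mu_{\tilde{N}_j} + \frac{1}{2}\sigma^2_{\tilde{N}_j}}\, \sigma^{\vec{\tilde{N}}}_{ij}}{\sigma_{\tilde{N}_i}\sqrt{\sum_{k,l=1}^{n} e^{\mu_{\tilde{N}_k} + \mu_{\tilde{N}_l} + \frac{1}{2}\left(\sigma^2_{\tilde{N}_k} + \sigma^2_{\tilde{N}_l}\right)}\, \sigma^{\vec{\tilde{N}}}_{kl}}}. \]

Note that in order to obtain a comonotonic lower bound one additionally has to ensure that $r_{\tilde{N}_i} > 0$ for all i.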

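Suppose that this lower bound is comonotonic. Then its quantiles are given by a closed-form expression:

\[ F^{-1}_{S^{l_2}_{LN}}(p) = \sum_{i=1}^{n} e^{\mu_{\tilde{N}_i} + \frac{1}{2}\sigma^2_{\tilde{N}_i}\left(1 - r^2_{\tilde{N}_i}\right) + \sigma_{\tilde{N}_i} r_{\tilde{N}_i}\Phi^{-1}(p)}, \]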

from which one can easily find values of the corresponding distribution

function e.g. by means of the Newton-Raphson method.
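The second approach can be sketched in a few lines. The example below assumes a Black & Scholes return process with µ = 0.05 and σ = 0.1, lognormal payments with $\mu_{N_i} = -\ln(1.01)/2$, $\sigma^2_{N_i} = \ln(1.01)$ and correlations 0.5 and 0.2 at lags one and two, as in the numerical illustration of this subsection; the horizon `n = 5` is a hypothetical choice. The conditioning variable uses the weights $e^{\mu_{\tilde{N}_i} + \frac{1}{2}\sigma^2_{\tilde{N}_i}}$, and bisection is swapped in for the Newton-Raphson method when inverting the quantile function.

```python
import math
from statistics import NormalDist

PHI = NormalDist()
n = 5                                    # hypothetical horizon
mu, sigma = 0.05, 0.10                   # Black & Scholes drift and volatility
mu_N, var_N = -math.log(1.01) / 2, math.log(1.01)
corr = {0: 1.0, 1: 0.5, 2: 0.2}          # r(N_i, N_j) by |i - j|, else 0

# Parameters (2.92) of N~_i = N_i - Y(i), i = 1, ..., n.
mean = [mu_N - mu * i for i in range(1, n + 1)]
cov = [[var_N * corr.get(abs(i - j), 0.0) + sigma**2 * min(i, j)
        for j in range(1, n + 1)] for i in range(1, n + 1)]
sd = [math.sqrt(cov[i][i]) for i in range(n)]

# Conditioning variable: weights e^{mean_i + var_i / 2} on N~_i.
w = [math.exp(mean[i] + 0.5 * cov[i][i]) for i in range(n)]
sd_lam = math.sqrt(sum(w[k] * w[l] * cov[k][l]
                       for k in range(n) for l in range(n)))
r = [sum(w[j] * cov[i][j] for j in range(n)) / (sd[i] * sd_lam)
     for i in range(n)]

def quantile_l2(p):
    """Closed-form quantile of the comonotonic lower bound."""
    z = PHI.inv_cdf(p)
    return sum(math.exp(mean[i] + 0.5 * cov[i][i] * (1 - r[i]**2)
                        + sd[i] * r[i] * z) for i in range(n))

def cdf_l2(y, tol=1e-12):
    """Invert the increasing quantile function by bisection in p."""
    lo, hi = 1e-12, 1.0 - 1e-12
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if quantile_l2(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

All correlations in `r` are positive here because every covariance entry is non-negative, so the lower bound is indeed comonotonic for this parameter choice.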

The moments based approximation

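It is also possible to derive the moments based approximations $S^{m_1}$ and $S^{m_2}$ as described in (2.65) since there are explicit solutions for the variances:

\[ \mathrm{Var}[S_{LN}] = \sum_{i=1}^{n}\sum_{j=1}^{n} e^{\mu_{\tilde{N}_i} + \mu_{\tilde{N}_j} + \frac{1}{2}\left(\sigma^2_{\tilde{N}_i} + \sigma^2_{\tilde{N}_j}\right)}\left(e^{\sigma^{\vec{\tilde{N}}}_{ij}} - 1\right), \]

\[ \mathrm{Var}[S^c_{LN}] = \sum_{i=1}^{n}\sum_{j=1}^{n} e^{\mu_{\tilde{N}_i} + \mu_{\tilde{N}_j} + \frac{1}{2}\left(\sigma^2_{\tilde{N}_i} + \sigma^2_{\tilde{N}_j}\right)}\left(e^{\sigma_{N_i}\sigma_{N_j} + \sigma_i\sigma_j} - 1\right), \]

\[ \mathrm{Var}[S^{l_1}_{LN}] = \sum_{i=1}^{n}\sum_{j=1}^{n} e^{\mu_{\tilde{N}_i} + \mu_{\tilde{N}_j} + \frac{1}{2}\left(\sigma^2_{\tilde{N}_i} + \sigma^2_{\tilde{N}_j}\right)}\left(e^{r_{N_i} r_{N_j}\sigma_{N_i}\sigma_{N_j} + r_i r_j \sigma_i\sigma_j} - 1\right), \]

\[ \mathrm{Var}[S^{l_2}_{LN}] = \sum_{i=1}^{n}\sum_{j=1}^{n} e^{\mu_{\tilde{N}_i} + \mu_{\tilde{N}_j} + \frac{1}{2}\left(\sigma^2_{\tilde{N}_i} + \sigma^2_{\tilde{N}_j}\right)}\left(e^{r_{\tilde{N}_i} r_{\tilde{N}_j}\sigma_{\tilde{N}_i}\sigma_{\tilde{N}_j}} - 1\right). \]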

After obvious substitutions in formulas (2.75) and (2.83) one gets the fol-

lowing expressions for stop-loss premiums in the first approach:

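$$\pi^{cub}(S_{LN}, d) = \sum_{i=1}^{n} e^{\mu_i + \frac{1}{2}\sigma_i^2} \int_0^1 e^{\mu_{N_i} + \sigma_{N_i}\Phi^{-1}(u)}\, \Phi\!\left(\sigma_i - \Phi^{-1}\!\left(F_{S^u_{LN}|U=u}(d)\right)\right) du \; - \; d\left(1 - F_{S^u_{LN}}(d)\right),$$

$$\pi^{lb1}(S_{LN}, d, \Gamma, \Lambda) = \sum_{i=1}^{n} e^{\mu_i + \frac{1}{2}\sigma_i^2} \int_0^1 e^{\mu_{N_i} + \frac{1}{2}(1 - r^2_{N_i})\sigma^2_{N_i} + r_{N_i}\sigma_{N_i}\Phi^{-1}(u)}\, \Phi\!\left(r_i\sigma_i - \Phi^{-1}\!\left(F_{S^{l1}_{LN}|\Gamma = F^{-1}_{\Gamma}(u)}(d)\right)\right) du \; - \; d\left(1 - F_{S^{l1}_{LN}}(d)\right).$$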

In the second approach the expression for stop-loss premiums of the lower

bound follows straightforward from (2.40):

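$$\pi^{lb2}(S_{LN}, d, \Lambda) = \sum_{i=1}^{n} e^{\mu_{N_i} + \frac{1}{2}\sigma^2_{N_i}}\, \Phi\!\left(r_{N_i}\sigma_{N_i} - \Phi^{-1}\!\left(F_{S^{l2}_{LN}}(d)\right)\right) - d\left(1 - F_{S^{l2}_{LN}}(d)\right).$$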

Finally, the corresponding stop-loss premiums for the moments based ap-

proximations are given by

$$\pi^{m1}(S_{LN}, d) = z_1\, \pi^{lb1}(S_{LN}, d) + (1 - z_1)\, \pi^{cub}(S_{LN}, d),$$

$$\pi^{m2}(S_{LN}, d) = z_2\, \pi^{lb2}(S_{LN}, d) + (1 - z_2)\, \pi^{cub}(S_{LN}, d),$$


p       S^{l1}_LN   S^{l2}_LN   S^{m1}_LN   S^{m2}_LN   S^c_LN    MC (s.e.×10^4)
0.75    14.6818     14.6822     14.6847     14.6839     15.0295   14.6795 (0.71)
0.90    17.0976     17.1024     17.1067     17.1078     18.0976   17.1019 (1.06)
0.95    18.7642     18.7723     18.7788     18.7815     20.2580   18.7769 (1.45)
0.975   20.3631     20.3753     20.3843     20.3882     22.3610   20.3881 (2.08)
0.995   23.9603     23.9823     24.0032     24.0082     27.1914   24.0237 (4.59)

Table 2.7: Approximations for some selected quantiles with probability level p of $S_{LN}$.

where

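$$z_1 = \frac{\mathrm{Var}[S^c_{LN}] - \mathrm{Var}[S_{LN}]}{\mathrm{Var}[S^c_{LN}] - \mathrm{Var}[S^{l1}_{LN}]} \quad \text{and} \quad z_2 = \frac{\mathrm{Var}[S^c_{LN}] - \mathrm{Var}[S_{LN}]}{\mathrm{Var}[S^c_{LN}] - \mathrm{Var}[S^{l2}_{LN}]}.$$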

A numerical illustration

We examine the accuracy and efficiency of the derived approximations for

the present values of a cash flow with lognormally distributed payments.

For the purpose of this numerical illustration we choose parameters $\mu_{N_i} = -\frac{1}{2}\ln(1.01)$ and $\sigma^2_{N_i} = \ln(1.01)$ (note that these values correspond to E[X] = 1 and Var[X] = 0.01). Moreover, we allow for some dependencies between

the payments by imposing correlations between the normal exponents:

$$r(N_i, N_j) = \begin{cases} 1 & \text{if } i = j, \\ 0.5 & \text{if } |i - j| = 1, \\ 0.2 & \text{if } |i - j| = 2, \\ 0 & \text{if } |i - j| > 2. \end{cases}$$
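As a quick check of the parameter choice above, the standard lognormal moment formulas $E[X] = e^{\mu + \sigma^2/2}$ and $\mathrm{Var}[X] = (e^{\sigma^2} - 1)e^{2\mu + \sigma^2}$ can be evaluated directly. The sketch below also encodes the banded correlation pattern; the helper names are illustrative only.

```python
import math

mu_N = -0.5 * math.log(1.01)      # mean of the normal exponent N_i
sigma2_N = math.log(1.01)         # variance of the normal exponent N_i

# Moments of X = exp(N) for N ~ Normal(mu_N, sigma2_N).
mean_X = math.exp(mu_N + 0.5 * sigma2_N)
var_X = (math.exp(sigma2_N) - 1.0) * math.exp(2.0 * mu_N + sigma2_N)

def corr_N(i, j):
    """Correlation r(N_i, N_j) between the normal exponents, as specified in the text."""
    return {0: 1.0, 1: 0.5, 2: 0.2}.get(abs(i - j), 0.0)

def cov_N(i, j):
    """Covariance of the exponents; all exponents share the variance sigma2_N."""
    return corr_N(i, j) * sigma2_N

print(mean_X, var_X)
```

Both printed values should agree with E[X] = 1 and Var[X] = 0.01 up to floating-point rounding.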

We restrict ourselves to the case of a Black & Scholes setting with drift

µ = 0.05 and volatility σ = 0.1. We compare the distribution functions of

the upper bound $S^c_{LN}$ and the lower bounds $S^{l1}_{LN}$ (obtained by taking two

conditioning random variables) and $S^{l2}_{LN}$ (with one conditioning variable)

with the original distribution function of SLN obtained by means of a

Monte Carlo (MC) simulation based on generating 500 × 100 000 sample

paths.

Table 2.7 illustrates the performance of the different approximations.

One can see that the upper bound $S^c_{LN}$ gives a poor approximation. The

main reason for that is a relatively weak dependence between payments,


d     S^{l1}_LN   S^{l2}_LN   S^{m1}_LN   S^{m2}_LN   S^c_LN    MC (s.e.×10^4)
0     12.8928     12.8928     12.8928     12.8928     12.8928   12.8931 (4.37)
5      7.8928      7.8928      7.8928      7.8928      7.8931    7.8931 (4.37)
10     3.0854      3.0856      3.0871      3.0866      3.2521    3.0870 (4.11)
15     0.5589      0.5602      0.5615      0.5618      0.8216    0.5613 (2.14)
20     0.0658      0.0663      0.0668      0.0669      0.1647    0.0672 (0.72)
25     0.0070      0.0071      0.0072      0.0072      0.0315    0.0074 (0.25)
30     0.0008      0.0008      0.0008      0.0008      0.0062    0.0008 (0.08)

Table 2.8: Approximations for some selected stop-loss premiums with retention d of $S_{LN}$.

for which the comonotonic approximation significantly overestimates the

tails. On the other hand, both lower bounds $S^{l1}_{LN}$ and $S^{l2}_{LN}$ give excellent approximations. The performance of the second lower bound is perhaps surprising: the results obtained with one conditioning random variable are no less accurate than those obtained with two conditioning random variables. The table also includes the two moments based approximations $S^{m1}_{LN}$ and $S^{m2}_{LN}$, which perform excellently as well.

Finally, the stop-loss premiums for the different approximations are

compared in Table 2.8. This study confirms the high accuracy of the

lower bounds and moments based approximations, which are very close to

the Monte Carlo estimates. The overestimation of the stop-loss premiums

provided by the convex upper bound is considerable.

2.6.3 Elliptically distributed payments

The class of elliptical distributions is a natural extension of the normal

law. We say that a random vector $\vec{X} = (X_1, X_2, \ldots, X_n)$ has an n-dimensional elliptical distribution with parameters $\vec{\mu} = (\mu_1, \mu_2, \ldots, \mu_n)$, $\Sigma = [\sigma_{ij}]_{1 \le i,j \le n}$ (a symmetric and positive definite matrix) and characteristic generator $\phi(\cdot)$, if the characteristic function of $\vec{X}$ is given by

$$\varphi_{\vec{X}}(\vec{t}\,) = e^{i\vec{t}\,'\vec{\mu}}\, \phi\!\left(\vec{t}\,'\Sigma\,\vec{t}\,\right).$$

We write $\vec{X} \sim E_n(\vec{\mu}, \Sigma, \phi)$. Obviously the normal distribution satisfies this definition, with $\phi(y) = e^{-\frac{1}{2}y}$. Elliptical distributions are very useful for

several reasons. First of all they are very easy to manipulate because they


inherit surprisingly many properties from the normal law. On the other

hand the normal distribution is not very flexible in modelling tails (in prac-

tice we often encounter much heavier tails than the Gaussian ones). The

class of elliptical laws offers a full variety of random distributions, from very

heavy-tailed ones (like Cauchy or stable distributions), distributions with

tails of the polynomial-type (t-Student), through the exponentially-tailed

Laplace and logistic distributions to the light-tailed Gaussian distribution.

Below we give a brief overview of the properties of elliptical distribu-

tions. For more information about elliptical distributions we refer to Fang

et al. (1990). The generalization of some of the results on comonotonic bounds for $\sum_{i=1}^{n} X_i$ to the multivariate elliptical case can be found in Valdez & Dhaene (2004).

1. $E[X_i] = \mu_i$, $\mathrm{Var}[X_i] = -2\phi'(0)\sigma_{ii}$ and $\mathrm{Cov}[X_i, X_j] = -2\phi'(0)\sigma_{ij}$, provided the corresponding moments exist. Here, $\phi'(\cdot)$ is the first derivative of the characteristic generator $\phi(\cdot)$.

2. Let $\vec{Y} = A\vec{X} + \vec{b}$, where A denotes an m×n matrix and $\vec{b}$ is a vector in $\mathbb{R}^m$. Then $\vec{Y} \sim E_m(A\vec{\mu} + \vec{b}, A\Sigma A', \phi)$.

3. If the density function $f_{\vec{X}}(\cdot)$ exists, it is given by the formula

$$f_{\vec{X}}(\vec{x}) = \frac{c}{\sqrt{\det[\Sigma]}}\, g\!\left((\vec{x} - \vec{\mu})'\,\Sigma^{-1}(\vec{x} - \vec{\mu})\right)$$

for any non-negative function g satisfying

$$0 < \int_0^{\infty} z^{\frac{n}{2} - 1} g(z)\, dz < \infty$$

and c being a normalizing constant. The function $g(\cdot)$ is called the density generator of the distribution $E_n(\vec{\mu}, \Sigma, \phi)$. A detailed proof of these results, using spherical transformations of rectangular coordinates, can be found in Landsman & Valdez (2002).

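4. Let $\vec{X} = (\vec{X}_1, \vec{X}_2)$ denote an $E_{n+m}(\vec{\mu}, \Sigma, \phi)$-random vector, where $\vec{\mu} = (\vec{\mu}_1, \vec{\mu}_2)$ and

$$\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}.$$

Then, conditionally on $\vec{X}_2 = \vec{x}_2$, the vector $\vec{X}_1$ has the $E_n(\vec{\mu}_{1|2}, \Sigma_{11|2}, \phi_{\vec{x}_2})$-distribution with parameters given by

$$\vec{\mu}_{1|2} = \vec{\mu}_1 + \Sigma_{12}\Sigma_{22}^{-1}(\vec{x}_2 - \vec{\mu}_2) \quad \text{and} \quad \Sigma_{11|2} = \Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}.$$

Notice that in general (unlike in the normal case) the characteristic generator of the conditional distribution is not known explicitly and depends on the value of $\vec{x}_2$.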

Consider now sums of the form

$$S_{el} = \sum_{i=1}^{n} X_i e^{-Y(i)},$$

where the return process Y(t) is, like in the previous example, described by the Black & Scholes model and $\vec{X} = (X_1, X_2, \ldots, X_n)$ is elliptically distributed with parameters $\vec{\mu}_{\vec{X}} = (\mu_{X_1}, \mu_{X_2}, \ldots, \mu_{X_n})$, $\Sigma_{\vec{X}} = [\sigma_{\vec{X}_{ij}}]_{1 \le i,j \le n}$ and characteristic generator $\phi(\cdot)$. Here we note only that for $\phi(u) = e^{-\frac{u}{2}}$ one gets a multivariate normal distribution with mean parameter $\vec{\mu}_{\vec{X}}$ and covariance matrix $\Sigma_{\vec{X}}$.

Note that elliptical random variables take both positive and negative

values and therefore one cannot apply immediately Theorem 11. We

propose to consider pragmatically only the cases where the probability $\Pr[X_i < 0]$ is very small. This can be achieved by choosing the parameters in such a way that the ratio $\mu_{X_i}/\sigma_{X_i}$ is much larger than 0, where we use the conventional notation $\sigma^2_{X_i} := \sigma_{\vec{X}_{ii}}$.

The upper bound

The computation of the upper bound is straightforward if the inverse dis-

tribution function for the specific elliptical distribution is available in the

software package. In other words, the comonotonic upper bound is given

by

$$S^c_{el} = \sum_{i=1}^{n} F^{-1}_{E_n(\mu_{X_i}, \sigma^2_{X_i}, \phi)}(U)\, e^{\mu_i + \sigma_i \Phi^{-1}(V)}, \qquad (2.95)$$

where by convention $\mu_i = -m(i)$ and $\sigma_i^2 = c(i, i)$ for $m(\cdot)$ and $c(\cdot, \cdot)$ denoting the mean and covariance functions of the process Y(i) described previously in this subsection.

Note that for the most interesting case of a multivariate normal distri-

bution one gets

$$S^c_N = \sum_{i=1}^{n} \left(\mu_{X_i} + \sigma_{X_i}\Phi^{-1}(U)\right) e^{\mu_i + \sigma_i \Phi^{-1}(V)}.$$

The corresponding expressions for stop-loss premiums are given by

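$$\pi^{cub}(S_{el}, d) = \sum_{i=1}^{n} e^{\mu_i + \frac{1}{2}\sigma_i^2} \int_0^1 \left\{ F^{-1}_{E_n(\mu_{X_i}, \sigma^2_{X_i}, \phi)}(u)\, \Phi\!\left(\sigma_i - \Phi^{-1}\!\left(F_{S^u_{el}|U=u}(d)\right)\right) \right\} du \; - \; d\left(1 - F_{S^u_{el}}(d)\right) \qquad (2.96)$$

and

$$\pi^{cub}(S_N, d) = \sum_{i=1}^{n} e^{\mu_i + \frac{1}{2}\sigma_i^2} \int_0^1 \left\{ \left(\mu_{X_i} + \sigma_{X_i}\Phi^{-1}(u)\right) \Phi\!\left(\sigma_i - \Phi^{-1}\!\left(F_{S^u_N|U=u}(d)\right)\right) \right\} du \; - \; d\left(1 - F_{S^u_N}(d)\right).$$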

The lower bound

To compute the lower bound, we define the conditioning random variable

Γ as follows

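$$\Gamma = \sum_{j=1}^{n} E\!\left[e^{-Y(j)}\right] X_j = \sum_{j=1}^{n} e^{\mu_j + \frac{1}{2}\sigma_j^2} X_j.$$

Then the random vector $(X_i, \Gamma)$ has a bivariate elliptical distribution, with parameters $\vec{\mu}_{\Gamma,i} = (\mu_{X_i}, \mu_{\Gamma})$ and $\Sigma_{\Gamma,i} = [\sigma^{\Gamma,i}_{kl}]_{1 \le k,l \le 2}$, where

$$\mu_{\Gamma} = \sum_{j=1}^{n} e^{\mu_j + \frac{1}{2}\sigma_j^2} \mu_{X_j},$$

$$\sigma^2_{X_i} := \sigma^{\Gamma,i}_{11}, \qquad \sigma^{\Gamma,i}_{12} = \sigma^{\Gamma,i}_{21} = \sum_{j=1}^{n} e^{\mu_j + \frac{1}{2}\sigma_j^2}\, \sigma_{\vec{X}_{ij}} \qquad \text{and}$$

$$\sigma^2_{\Gamma} := \sigma^{\Gamma,i}_{22} = \sum_{j=1}^{n}\sum_{k=1}^{n} e^{\mu_j + \mu_k + \frac{1}{2}(\sigma_j^2 + \sigma_k^2)}\, \sigma_{\vec{X}_{jk}}.$$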

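From property (4) of the elliptical distributions, it follows that, given Γ = γ, the r.v. $X_i$ is elliptically distributed with parameters

$$\mu_{X_i,\Gamma} = \mu_{X_i} + \frac{\sigma^{\Gamma,i}_{12}}{\sigma^2_{\Gamma}}\left(\Gamma - \mu_{\Gamma}\right), \qquad \sigma^2_{X_i,\Gamma} = \sigma^2_{X_i} - \frac{\left(\sigma^{\Gamma,i}_{12}\right)^2}{\sigma^2_{\Gamma}} \qquad (2.97)$$

and an unknown characteristic generator $\phi_a(\cdot)$ that depends on $a = \frac{(\Gamma - \mu_{\Gamma})^2}{\sigma^2_{\Gamma}}$ (recall that for the multivariate normal case the conditional distribution remains normal). Note that in our application it does not really matter that the characteristic generator $\phi_a(\cdot)$ is not known: it suffices to notice that

$$E[X_i \mid \Gamma] = \mu_{X_i,\Gamma} = \mu_{X_i} + \frac{\sigma^{\Gamma,i}_{12}}{\sigma^2_{\Gamma}}\left(\Gamma - \mu_{\Gamma}\right).$$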

The second conditioning random variable is chosen analogously as in (2.93):

$$\Lambda = -\sum_{i=1}^{n} E[X_i]\, e^{\mu_i + \frac{1}{2}\sigma_i^2}\, Y(i) = -\sum_{i=1}^{n} \mu_{X_i}\, e^{\mu_i + \frac{1}{2}\sigma_i^2}\, Y(i).$$

From Section 2.5.1 it follows that the lower bound is given by the following

expression:

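$$S^l_{el} = \sum_{i=1}^{n} \left(\mu_{X_i} + \frac{\sigma^{\Gamma,i}_{12}}{\sigma^2_{\Gamma}}\left(F^{-1}_{\Gamma}(U) - \mu_{\Gamma}\right)\right) e^{\mu_i + \frac{1}{2}\sigma_i^2(1 - r_i^2) + r_i\sigma_i\Phi^{-1}(V)}, \qquad (2.98)$$

where the correlations $r_i = r(-Y(i), \Lambda)$ are defined as in (2.79) (with $E[X_i]$ substituted by $\mu_{X_i}$). Note that expression (2.98) simplifies in the normal case to

$$S^l_N = \sum_{i=1}^{n} \left(\mu_{X_i} + r_{X_i}\sigma_{X_i}\Phi^{-1}(U)\right) e^{\mu_i + \frac{1}{2}\sigma_i^2(1 - r_i^2) + r_i\sigma_i\Phi^{-1}(V)}$$

with, in agreement with the definitions of $\sigma^{\Gamma,i}_{12}$ and $\sigma^2_{\Gamma}$ given above,

$$r_{X_i} = r(X_i, \Gamma) = \frac{\sigma^{\Gamma,i}_{12}}{\sigma_{X_i}\sigma_{\Gamma}} = \frac{\sum_{j=1}^{n} e^{\mu_j + \frac{1}{2}\sigma_j^2}\, \sigma_{\vec{X}_{ij}}}{\sigma_{X_i}\sqrt{\sum_{k,l=1}^{n} e^{\mu_k + \mu_l + \frac{1}{2}(\sigma_k^2 + \sigma_l^2)}\, \sigma_{\vec{X}_{kl}}}}.$$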
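The correlations $r(X_i, \Gamma)$ follow mechanically from the covariance entries of $(X_i, \Gamma)$, and a small numerical sketch may make this concrete. It rests on assumptions that are not stated in this section: n = 20 yearly payments, and a return process $Y(i) = \mu i + \sigma B(i)$ with B a standard Brownian motion, so that $E[e^{-Y(j)}] = e^{-\mu j + \frac{1}{2}\sigma^2 j}$. The payment variance 0.01 and the banded correlation pattern are those of the numerical illustration in this subsection.

```python
import math

n = 20                       # assumed number of payments (not stated in this section)
drift, vol = 0.05, 0.1       # Black & Scholes drift and volatility
var_X = 0.01                 # payment variance sigma_X^2

def corr_X(i, j):
    """Banded correlation pattern r(X_i, X_j) of the payments."""
    return {0: 1.0, 1: 0.5, 2: 0.2}.get(abs(i - j), 0.0)

idx = range(1, n + 1)
# Assumed discount factors E[exp(-Y(j))] = exp(mu_j + sigma_j^2 / 2).
disc = {j: math.exp(-drift * j + 0.5 * vol**2 * j) for j in idx}

# Cov(X_i, Gamma): discount-weighted sum of payment covariances.
sigma12 = {i: sum(disc[j] * corr_X(i, j) * var_X for j in idx) for i in idx}
# Var(Gamma): discount-weighted double sum of payment covariances.
var_Gamma = sum(disc[j] * disc[k] * corr_X(j, k) * var_X for j in idx for k in idx)

r_X = {i: sigma12[i] / (math.sqrt(var_X) * math.sqrt(var_Gamma)) for i in idx}
```

Each `r_X[i]` should lie in (0, 1], and the discount-weighted sum of the `sigma12[i]` should reproduce `var_Gamma`, since both are $\mathrm{Cov}(\Gamma, \Gamma)$.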

Finally, the corresponding stop-loss premiums are computed according to

the following expressions:

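$$\pi^{lb}(S_{el}, d, \Gamma, \Lambda) = \sum_{i=1}^{n} e^{\mu_i + \frac{1}{2}\sigma_i^2} \int_0^1 \left\{ \left(\mu_{X_i} + \frac{\sigma^{\Gamma,i}_{12}}{\sigma^2_{\Gamma}}\left(F^{-1}_{\Gamma}(u) - \mu_{\Gamma}\right)\right) \Phi\!\left(r_i\sigma_i - \Phi^{-1}\!\left(F_{S^l_{el}|\Gamma = F^{-1}_{\Gamma}(u)}(d)\right)\right) \right\} du \; - \; d\left(1 - F_{S^l_{el}}(d)\right),$$

$$\pi^{lb}(S_N, d, \Gamma, \Lambda) = \sum_{i=1}^{n} e^{\mu_i + \frac{1}{2}\sigma_i^2} \int_0^1 \left\{ \left(\mu_{X_i} + r_{X_i}\sigma_{X_i}\Phi^{-1}(u)\right) \Phi\!\left(r_i\sigma_i - \Phi^{-1}\!\left(F_{S^l_N|\Gamma = F^{-1}_{\Gamma}(u)}(d)\right)\right) \right\} du \; - \; d\left(1 - F_{S^l_N}(d)\right).$$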


It is also possible to find the moments based approximation SmN from for-

mula (2.65), since one can compute the variance of SN as

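$$\begin{aligned}
\mathrm{Var}[S_N] &= E_{\vec{X}}\!\left[\mathrm{Var}[S_N \mid \vec{X}]\right] + \mathrm{Var}_{\vec{X}}\!\left[E[S_N \mid \vec{X}]\right] \\
&= E_{\vec{X}}\!\left[\sum_{i=1}^{n}\sum_{j=1}^{n} X_i X_j\, e^{\mu_i + \mu_j + \frac{1}{2}(\sigma_i^2 + \sigma_j^2)} \left(e^{\sigma_{ij}} - 1\right)\right] + \mathrm{Var}_{\vec{X}}\!\left[\sum_{i=1}^{n} X_i\, e^{\mu_i + \frac{1}{2}\sigma_i^2}\right] \\
&= \sum_{i=1}^{n}\sum_{j=1}^{n} \left(\sigma_{\vec{X}_{ij}} + \mu_{X_i}\mu_{X_j}\right) e^{\mu_i + \mu_j + \frac{1}{2}(\sigma_i^2 + \sigma_j^2) + \sigma_{ij}} - \sum_{i=1}^{n}\sum_{j=1}^{n} \mu_{X_i}\mu_{X_j}\, e^{\mu_i + \mu_j + \frac{1}{2}(\sigma_i^2 + \sigma_j^2)}.
\end{aligned}$$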


Here, the variances of the upper and the lower bound are computed as

explained in Section 2.5.3.
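The variance formula for $S_N$ can be evaluated numerically as a sanity check. The sketch below relies on assumptions not stated in this section: n = 20 yearly payments and $Y(i) = \mu i + \sigma B(i)$ with B a standard Brownian motion, so that $\mu_i = -\mu i$, $\sigma_i^2 = \sigma^2 i$ and $\sigma_{ij} = \sigma^2 \min(i, j)$. Under these assumptions the expected present value comes out at about 12.89, in line with the d = 0 rows of the stop-loss tables, which is why they were chosen. The payment parameters are those of the numerical illustration in this subsection.

```python
import math

n = 20                      # assumed number of yearly payments
drift, vol = 0.05, 0.1      # Black & Scholes parameters from the text
mu_X, var_X = 1.0, 0.01     # payment mean and variance from the text

def corr_X(i, j):
    """Correlation pattern r(X_i, X_j) of the payments."""
    return {0: 1.0, 1: 0.5, 2: 0.2}.get(abs(i - j), 0.0)

def cov_X(i, j):
    return corr_X(i, j) * var_X

def cov_Y(i, j):
    """Assumed covariance sigma_ij of Y(i) and Y(j): vol^2 * min(i, j)."""
    return vol**2 * min(i, j)

idx = range(1, n + 1)
disc = {i: math.exp(-drift * i + 0.5 * vol**2 * i) for i in idx}  # exp(mu_i + sigma_i^2/2)

expected_pv = sum(mu_X * disc[i] for i in idx)

# Closed form: last line of the derivation.
var_closed = sum(
    (cov_X(i, j) + mu_X**2) * disc[i] * disc[j] * math.exp(cov_Y(i, j))
    - mu_X**2 * disc[i] * disc[j]
    for i in idx for j in idx)

# Same quantity as E[Var[S|X]] + Var[E[S|X]]: second line of the derivation.
var_decomp = sum(
    (cov_X(i, j) + mu_X**2) * disc[i] * disc[j] * (math.exp(cov_Y(i, j)) - 1.0)
    + cov_X(i, j) * disc[i] * disc[j]
    for i in idx for j in idx)

print(expected_pv, var_closed, var_decomp)
```

The two variance expressions should agree to rounding error, which confirms that the last two lines of the derivation are consistent with each other.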

We remark that for ~X having a multivariate elliptical distribution the

computations are almost identical, with the only difference in the formula

for covariances

$$\mathrm{Cov}[X_i, X_j] = -2\phi'(0)\, \sigma_{\vec{X}_{ij}}.$$

Then the stop-loss premium of the moments based approximation is

obtained as a convex combination

$$\pi^{m}(S_{el}, d, \Gamma, \Lambda) = z\, \pi^{lb}(S_{el}, d, \Gamma, \Lambda) + (1 - z)\, \pi^{cub}(S_{el}, d),$$

where z is defined as in (2.66).


We study the case of normally distributed payments with mean $\mu_{X_i} = 1$ and variance $\sigma^2_{X_i} = 0.01$. Note that the mean and the variance are the same as in the lognormal case. Moreover we assume the following correlation pattern for the payments:

$$r(X_i, X_j) = \begin{cases} 1 & \text{if } i = j, \\ 0.5 & \text{if } |i - j| = 1, \\ 0.2 & \text{if } |i - j| = 2, \\ 0 & \text{if } |i - j| > 2. \end{cases}$$

As in the previous example, we work in the Black & Scholes setting with

drift parameter µ = 0.05 and volatility σ = 0.1. We compare the performance of the lower bound $S^l_N$, the upper bound $S^c_N$ and the moments based approximation $S^m_N$ with the real distribution of the present value function $S_N$, obtained by a Monte Carlo simulation (MC) based on 500 × 100 000 simulated paths.

The performance of the approximations is illustrated by the numerical values of some upper quantiles displayed in Table 2.9. The same conclusions can be drawn as in the lognormal case: the upper bound $S^c_N$ gives a quite poor approximation, while the lower bound $S^l_N$ and the moments based approximation perform excellently.

The study of stop-loss premiums in Table 2.10 confirms this observa-

tion.


p       S^l_N     S^m_N     S^c_N     MC (s.e.×10^3)
0.75    14.6820   14.6849   15.0368   14.6820 (0.70)
0.90    17.0978   17.1068   18.0992   17.1025 (1.02)
0.95    18.7642   18.7787   20.2522   18.7789 (1.46)
0.975   20.3630   20.3840   22.3456   20.3895 (2.11)
0.995   23.9599   24.0020   27.1468   24.0354 (4.61)

Table 2.9: Approximations for some selected quantiles with probability level p of $S_N$.

d     S^l_N     S^m_N     S^c_N     MC (s.e.×10^4)
0     12.8928   12.8928   12.8928   12.8923 (4.50)
5      7.8928    7.8929    7.8931    7.8923 (4.50)
10     3.0855    3.0872    3.2544    3.0863 (4.16)
15     0.5589    0.5615    0.8213    0.5610 (2.11)
20     0.0658    0.0668    0.1636    0.0671 (0.74)
25     0.0070    0.0072    0.0309    0.0073 (0.25)
30     0.0008    0.0008    0.0060    0.0008 (0.08)

Table 2.10: Approximations for some selected stop-loss premiums with retention d of $S_N$.

2.6.4 Independent and identically distributed payments

Finally, we consider the case where the payments Xi are independent and

identically distributed. The independence assumption accounts for more

flexibility in modelling the underlying marginal distributions, however —

unlike in the lognormal and elliptical cases — it imposes a rigid condition

on the dependence structure. We start by defining the class of tempered stable distributions, for which the methodology works particularly efficiently.

Tempered stable distributions

The Tempered Stable law $TS(\delta, a, b)$ for a, b > 0 and 0 < δ < 1 is a one-dimensional distribution given by the characteristic function:

$$\varphi_{TS}(t; \delta, a, b) = e^{ab - a\left(b^{1/\delta} - 2it\right)^{\delta}}. \qquad (2.99)$$


For more details we refer to e.g. Schoutens (2003). This class of distribu-

tions has the special property that the sum of independent and identically

distributed tempered stable random variables is again tempered stable.

This is formalized in the following lemma:

Lemma 9 (Sum of tempered stable random variables).

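If $X_i$ are i.i.d. random variables $TS(\kappa, a, b)$-distributed for i = 1, 2, . . . , n, then their sum $X_1 + X_2 + \cdots + X_n$ is $TS(\kappa, na, b)$-distributed.

Proof. Consider the corresponding characteristic functions. We get

$$\varphi_{X_1 + X_2 + \cdots + X_n}(t) = \left(\varphi_{TS}(t; \kappa, a, b)\right)^n = e^{(na)b - (na)\left(b^{1/\kappa} - 2it\right)^{\kappa}} = \varphi_{TS}(t; \kappa, na, b).$$

The first two moments of a random variable $X \sim TS(\delta, a, b)$ are given by

$$E[X] = 2a\delta\, b^{\frac{\delta - 1}{\delta}} \quad \text{and} \quad \mathrm{Var}[X] = 4a\delta(1 - \delta)\, b^{\frac{\delta - 2}{\delta}}.$$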

In the sequel we provide more details about two well-known special

cases: the gamma distribution and the inverse Gaussian distribution.

The gamma distribution Gamma(a, b) corresponds to the limiting case when δ → 0. The characteristic function of the gamma distribution is given by

$$\varphi(t; a, b) = \left(1 - \frac{it}{b}\right)^{-a}.$$

Notice that for $X \sim \mathrm{Gamma}(a, b)$ one has $E[X] = \frac{a}{b}$ and $\mathrm{Var}[X] = \frac{a}{b^2}$.

The inverse Gaussian distribution is a member of the class of Tempered Stable distributions with $\delta = \frac{1}{2}$. Thus, the characteristic function is given by

$$\varphi(t; a, b) = e^{-a\left(\sqrt{-2it + b^2} - b\right)}.$$

Moreover the mean and variance of $X \sim IG(a, b)$ are given by $E[X] = \frac{a}{b}$ and $\mathrm{Var}[X] = \frac{a}{b^3}$.
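The relations between these characteristic functions and moments are easy to check numerically. The sketch below uses the tempered stable characteristic function $e^{ab - a(b^{1/\delta} - 2it)^{\delta}}$ and verifies three statements from this subsection: that δ = 1/2 gives the inverse Gaussian characteristic function, that the n-th power of the $TS(\kappa, a, b)$ characteristic function is that of $TS(\kappa, na, b)$ (Lemma 9), and that the moment formulas reduce to a/b and a/b³ for δ = 1/2. Function names and test values are illustrative only.

```python
import cmath

def ts_cf(t, delta, a, b):
    """Tempered stable characteristic function (2.99)."""
    return cmath.exp(a * b - a * (b ** (1.0 / delta) - 2j * t) ** delta)

def ig_cf(t, a, b):
    """Inverse Gaussian characteristic function."""
    return cmath.exp(-a * (cmath.sqrt(-2j * t + b * b) - b))

def ts_mean(delta, a, b):
    return 2.0 * a * delta * b ** ((delta - 1.0) / delta)

def ts_var(delta, a, b):
    return 4.0 * a * delta * (1.0 - delta) * b ** ((delta - 2.0) / delta)

a, b, t, n = 2.0, 3.0, 0.7, 5          # arbitrary illustrative values
gap_ig = abs(ts_cf(t, 0.5, a, b) - ig_cf(t, a, b))                     # delta = 1/2 versus IG
gap_sum = abs(ts_cf(t, 0.3, a, b) ** n - ts_cf(t, 0.3, n * a, b))      # Lemma 9
print(gap_ig, gap_sum, ts_mean(0.5, a, b), ts_var(0.5, a, b))
```

Both gaps should be at rounding level. The complex power uses the principal branch, which is unambiguous here because $b^{1/\delta} - 2it$ always has a positive real part.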

We consider now sums of the form

$$S_{ind} = \sum_{i=1}^{n} X_i e^{-Y(i)}, \qquad (2.100)$$

where the process Y (i) is defined like in the previous examples and the

payments Xi are independent and follow the law defined by the cdf FX(·).


The upper bound

The computation of the upper bound is straightforward (as described in

Section 2.5.3):

$$S^c_{ind} = F^{-1}_X(U) \sum_{i=1}^{n} e^{\mu_i + \sigma_i\Phi^{-1}(V)}. \qquad (2.101)$$

The stop-loss premiums for the upper bound are given by an expression analogous to (2.75), with $S^c$ replaced by $S^c_{ind}$.

The lower bound

To compute the lower bound, we start with defining the conditioning ran-

dom variables Γ and Λ. Let

Γ = X1 +X2 + · · · +Xn.

If we know the distributions of Xi, the distribution of the sum Γ is also

known. In particular, for Xi gamma distributed the sum Γ remains gamma

distributed and the same for Xi inverse Gaussian distributed.

Like in the previous examples, the conditioning random variable Λ is chosen as

$$\Lambda = -\sum_{i=1}^{n} E[X_i]\, e^{\mu_i + \frac{1}{2}\sigma_i^2}\, Y(i). \qquad (2.102)$$

Now, the lower bound can be written as

$$S^l_{ind} = \frac{1}{n} F^{-1}_{\Gamma}(U) \sum_{i=1}^{n} e^{\mu_i + \frac{1}{2}(1 - r_i^2)\sigma_i^2 + r_i\sigma_i\Phi^{-1}(V)},$$

where the correlations $r_i = r(-Y(i), \Lambda)$ are defined as in (2.79).

Note that the computation of stop-loss premiums of the lower bound is straightforward, by applying (2.83) and replacing $S^l$ by $S^l_{ind}$.

Cumulative distribution functions

In this case there is a more efficient method to compute the distribution functions than the one described in Section 2.5.3.


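Remark 4. The cumulative distribution function of the product W of two non-negative independent variables X and Y can be written as

$$F_W(z) = \int_{-\infty}^{\infty} F_Y\!\left(\frac{z}{x}\right) dF_X(x) = \int_0^1 F_Y\!\left(\frac{z}{F^{-1}_X(u)}\right) du. \qquad (2.103)$$

Using this result one can compute the cumulative distribution functions of the upper and the lower bound as

$$F_{S^c_{ind}}(y) = \int_0^1 F_X\!\left(\frac{y}{F^{-1}_{S^c}(v)}\right) dv, \qquad F_{S^l_{ind}}(y) = \int_0^1 F_{\frac{1}{n}\Gamma}\!\left(\frac{y}{F^{-1}_{S^l}(v)}\right) dv,$$

where

$$S^c = \sum_{i=1}^{n} e^{\mu_i + \sigma_i\Phi^{-1}(V)}, \qquad S^l = \sum_{i=1}^{n} e^{\mu_i + \frac{1}{2}(1 - r_i^2)\sigma_i^2 + r_i\sigma_i\Phi^{-1}(V)},$$

$$F^{-1}_{S^c}(v) = \sum_{i=1}^{n} e^{\mu_i + \sigma_i\Phi^{-1}(v)}, \qquad F^{-1}_{S^l}(v) = \sum_{i=1}^{n} e^{\mu_i + \frac{1}{2}(1 - r_i^2)\sigma_i^2 + r_i\sigma_i\Phi^{-1}(v)}.$$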
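The product formula (2.103) is straightforward to implement with a one-dimensional quadrature. The sketch below is only an illustration of the technique: it uses a plain midpoint rule and, instead of the gamma payments of this subsection, two independent lognormal factors, chosen purely because their product is again lognormal so the result can be compared with an exact value. All names and parameter values are hypothetical.

```python
import math
from statistics import NormalDist

PHI = NormalDist()

def lognormal_cdf(x, m, s):
    """CDF of exp(N) with N ~ Normal(m, s^2)."""
    return PHI.cdf((math.log(x) - m) / s) if x > 0 else 0.0

def lognormal_quantile(u, m, s):
    return math.exp(m + s * PHI.inv_cdf(u))

def product_cdf(z, cdf_Y, quantile_X, steps=20000):
    """F_W(z) = int_0^1 F_Y(z / F_X^{-1}(u)) du, approximated by the midpoint rule."""
    h = 1.0 / steps
    return h * sum(cdf_Y(z / quantile_X((k + 0.5) * h)) for k in range(steps))

m1, s1 = 0.1, 0.3     # parameters of X (illustrative)
m2, s2 = -0.2, 0.2    # parameters of Y (illustrative)
z = 1.2

approx = product_cdf(z,
                     lambda y: lognormal_cdf(y, m2, s2),
                     lambda u: lognormal_quantile(u, m1, s1))
exact = PHI.cdf((math.log(z) - m1 - m2) / math.sqrt(s1**2 + s2**2))
print(approx, exact)
```

The same `product_cdf` routine applies to the bounds above by passing the distribution function of X (or of Γ/n) as `cdf_Y` and the closed-form quantile of the discount-factor sum as `quantile_X`; in practice a more refined quadrature than the midpoint rule may be preferable.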


The moments based approximation of Sind can be found in a similar way

to the moments based approximation for elliptical distributions. The key

step is to compute the variance of Sind:

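$$\begin{aligned}
\mathrm{Var}[S_{ind}] &= E_{\vec{X}}\!\left[\mathrm{Var}[S_{ind} \mid \vec{X}]\right] + \mathrm{Var}_{\vec{X}}\!\left[E[S_{ind} \mid \vec{X}]\right] \\
&= E_{\vec{X}}\!\left[\sum_{i=1}^{n}\sum_{j=1}^{n} X_i X_j\, e^{\mu_i + \mu_j + \frac{1}{2}(\sigma_i^2 + \sigma_j^2)} \left(e^{\sigma_{ij}} - 1\right)\right] + \mathrm{Var}_{\vec{X}}\!\left[\sum_{i=1}^{n} X_i\, e^{\mu_i + \frac{1}{2}\sigma_i^2}\right] \\
&= \sum_{i=1}^{n}\sum_{j=1}^{n} \left(E[X_i]E[X_j]\right) e^{\mu_i + \mu_j + \frac{1}{2}(\sigma_i^2 + \sigma_j^2)} \left(e^{\sigma_{ij}} - 1\right) + \sum_{i=1}^{n} \mathrm{Var}[X_i]\, e^{2\mu_i + \sigma_i^2}. \qquad (2.104)
\end{aligned}$$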


p       S^l_ind   S^m_ind   S^c_ind   MC (s.e.×10^3)
0.75    14.6709   14.6723   15.0320   14.6820 (0.70)
0.90    17.0767   17.0810   18.0984   17.1025 (1.02)
0.95    18.7372   18.7443   20.2563   18.7789 (1.46)
0.975   20.3309   20.3412   22.3560   20.3895 (2.11)
0.995   23.9183   23.9390   27.1762   24.0354 (4.61)

Table 2.11: Approximations for some selected quantiles with probability level p of $S_{ind}$ for gamma i.i.d. liabilities.

The variances of the upper and the lower bound are computed as explained

in Subsection 2.5.3.

Consequently, the stop-loss premium of the moments based approxi-

mation is obtained as a convex combination

$$\pi^{m}(S_{ind}, d, \Gamma, \Lambda) = z\, \pi^{lb}(S_{ind}, d, \Gamma, \Lambda) + (1 - z)\, \pi^{cub}(S_{ind}, d),$$

where z is defined as in (2.66).


We consider in this application independent Gamma(100, 100) distributed

future payments. Note that this choice of parameters implies that E[X] = 1

and Var[X] = 0.01 — i.e. we take the same mean and variance of liabilities

as in the lognormal and normal cases. As before we work in a Black &

Scholes setting with drift µ = 0.05 and volatility σ = 0.1. We compare

the performance of the lower bound $S^l_{ind}$, the upper bound $S^c_{ind}$ and the

moments based approximation $S^m_{ind}$ with the real value $S_{ind}$ obtained by a

Monte Carlo simulation (MC) based on 500 × 100 000 simulated paths.

The results are very similar to the normal and lognormal case. It is

worth noticing that the variance of Sind (10.1489) is a bit lower than in the

lognormal case (10.2789) and in the normal case (10.2792). This is due

to independence of gamma-payments while we imposed a slight positive

dependence in the previous cases.

The quality of the approximations is illustrated by some upper quantiles displayed in Table 2.11. The lower bound $S^l_{ind}$ and the moments based approximation $S^m_{ind}$ perform well, but not as well as in the lognormal and normal cases (probably because the conditioning random variable

d     S^l_ind   S^m_ind   S^c_ind   MC (s.e.×10^4)
0     12.8928   12.8928   12.8928   12.8921 (4.44)
5      7.8928    7.8928    7.8931    7.8921 (4.44)
10     3.0813    3.0821    3.2528    3.0821 (4.06)
15     0.5540    0.5553    0.8215    0.5549 (2.08)
20     0.0647    0.0652    0.1644    0.0655 (0.77)
25     0.0068    0.0069    0.0313    0.0071 (0.27)
30     0.0007    0.0008    0.0061    0.0008 (0.09)

Table 2.12: Approximations for some selected stop-loss premiums with retention d of $S_{ind}$ for gamma i.i.d. liabilities.

Γ does not take discounting factors into account). The study of stop-loss

premiums in Table 2.12 goes in line with these findings.

2.7 Proofs

Upper bound based on lower bound (2.44)

In the following we shall derive an easily computable expression for (2.26).

The second expectation term in the product (2.26) equals, when denoting

by FΛ(·) the normal cumulative distribution function of Λ,

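$$E\!\left[I_{(\Lambda < d_\Lambda)}\right] = 0 \cdot \Pr[\Lambda \ge d_\Lambda] + 1 \cdot \Pr[\Lambda < d_\Lambda] = F_\Lambda(d_\Lambda) = \Phi(d^*_\Lambda). \qquad (2.105)$$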

The first expectation term in the product (2.26) can be expressed as

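$$E\!\left[\mathrm{Var}[S|\Lambda]\, I_{(\Lambda < d_\Lambda)}\right] = E\!\left[E[S^2|\Lambda]\, I_{(\Lambda < d_\Lambda)}\right] - E\!\left[(E[S|\Lambda])^2\, I_{(\Lambda < d_\Lambda)}\right]. \qquad (2.106)$$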

Now consider the second term of the right-hand side of (2.106)

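$$E\!\left[(E[S|\Lambda])^2\, I_{(\Lambda < d_\Lambda)}\right] = \int_{-\infty}^{d_\Lambda} \left(E[S|\Lambda = \lambda]\right)^2 dF_\Lambda(\lambda). \qquad (2.107)$$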

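According to (2.32) and using the notation $Z_{ij}$ introduced before, we can express (2.107) as

$$\begin{aligned}
E\!\left[(E[S|\Lambda])^2\, I_{(\Lambda < d_\Lambda)}\right]
&= \int_{-\infty}^{d_\Lambda} \left(\sum_{i=1}^{n} E[X_i|\Lambda = \lambda]\right)^2 dF_\Lambda(\lambda) \\
&= \int_{-\infty}^{d_\Lambda} \left(\sum_{i=1}^{n} \alpha_i\, e^{E[Z_i] + r_i\sigma_{Z_i}\Phi^{-1}(v) + \frac{1}{2}(1 - r_i^2)\sigma^2_{Z_i}}\right)^2 dF_\Lambda(\lambda) \\
&= \int_{-\infty}^{d_\Lambda} \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i\alpha_j\, e^{E[Z_{ij}] + (r_i\sigma_{Z_i} + r_j\sigma_{Z_j})\Phi^{-1}(v)}\, e^{\frac{1}{2}\left((1 - r_i^2)\sigma^2_{Z_i} + (1 - r_j^2)\sigma^2_{Z_j}\right)} dF_\Lambda(\lambda) \\
&= \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i\alpha_j\, e^{E[Z_{ij}] + \frac{1}{2}\left((1 - r_i^2)\sigma^2_{Z_i} + (1 - r_j^2)\sigma^2_{Z_j}\right)} \int_{-\infty}^{d_\Lambda} e^{(r_i\sigma_{Z_i} + r_j\sigma_{Z_j})\Phi^{-1}(v)}\, dF_\Lambda(\lambda). \qquad (2.108)
\end{aligned}$$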

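Next, applying Lemma 7 to (2.108) with $a = r_i\sigma_{Z_i} + r_j\sigma_{Z_j}$ yields

$$E\!\left[(E[S|\Lambda])^2\, I_{(\Lambda < d_\Lambda)}\right] = \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i\alpha_j\, e^{E[Z_{ij}] + \frac{1}{2}\left(\sigma^2_{Z_i} + \sigma^2_{Z_j} + 2 r_i r_j \sigma_{Z_i}\sigma_{Z_j}\right)}\, \Phi\!\left(d^*_\Lambda - (r_i\sigma_{Z_i} + r_j\sigma_{Z_j})\right). \qquad (2.109)$$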
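Lemma 7 is not reproduced in this section, but the step from (2.108) to (2.109) is consistent with the standard Gaussian identity $\int_{-\infty}^{d} e^{az}\varphi(z)\,dz = e^{a^2/2}\,\Phi(d - a)$, where φ is the standard normal density. Reading the lemma this way is an inference from the two displayed formulas, not a quotation. A minimal numerical check of the identity, with illustrative values of a and d:

```python
import math
from statistics import NormalDist

PHI = NormalDist()

def truncated_exp_moment_closed(a, d):
    """Closed form exp(a^2/2) * Phi(d - a) for E[exp(a*Z) * 1{Z < d}], Z standard normal."""
    return math.exp(0.5 * a * a) * PHI.cdf(d - a)

def truncated_exp_moment_numeric(a, d, lower=-12.0, steps=100000):
    """Midpoint-rule approximation of the integral of exp(a*z)*phi(z) over (lower, d)."""
    h = (d - lower) / steps
    total = 0.0
    for k in range(steps):
        z = lower + (k + 0.5) * h
        total += math.exp(a * z) * PHI.pdf(z)
    return h * total

a, d = 0.35, 0.8
print(truncated_exp_moment_closed(a, d), truncated_exp_moment_numeric(a, d))
```

With $a = r_i\sigma_{Z_i} + r_j\sigma_{Z_j}$ the factor $e^{a^2/2}$ combines with the prefactor of (2.108) to give exactly the exponent $\frac{1}{2}(\sigma^2_{Z_i} + \sigma^2_{Z_j} + 2r_ir_j\sigma_{Z_i}\sigma_{Z_j})$ appearing in (2.109).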

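Now consider the first term of the right-hand side of expression (2.106), $E\!\left[E[S^2|\Lambda]\, I_{(\Lambda < d_\Lambda)}\right]$. The term $E[S^2|\Lambda]$ is given by (2.42). By applying (2.43) with $a = r_{ij}\sigma_{Z_{ij}} = r_i\sigma_{Z_i} + r_j\sigma_{Z_j}$, and simplifying, we obtain

$$\begin{aligned}
E\!\left[E[S^2|\Lambda]\, I_{(\Lambda < d_\Lambda)}\right]
&= \sum_{i=1}^{n}\sum_{j=1}^{n} \int_{-\infty}^{d_\Lambda} \alpha_i\alpha_j\, e^{E[Z_{ij}] + r_{ij}\sigma_{Z_{ij}}\Phi^{-1}(v) + \frac{1}{2}(1 - r_{ij}^2)\sigma^2_{Z_{ij}}}\, dF_\Lambda(\lambda) \\
&= \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i\alpha_j\, e^{E[Z_{ij}] + \frac{1}{2}(1 - r_{ij}^2)\sigma^2_{Z_{ij}}} \int_{-\infty}^{d_\Lambda} e^{r_{ij}\sigma_{Z_{ij}}\Phi^{-1}(v)}\, dF_\Lambda(\lambda) \\
&= \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i\alpha_j\, e^{E[Z_{ij}] + \frac{1}{2}(1 - r_{ij}^2)\sigma^2_{Z_{ij}} + \frac{r_{ij}^2\sigma^2_{Z_{ij}}}{2}}\, \Phi\!\left(d^*_\Lambda - r_{ij}\sigma_{Z_{ij}}\right) \\
&= \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i\alpha_j\, e^{E[Z_{ij}] + \frac{\sigma^2_{Z_{ij}}}{2}}\, \Phi\!\left(d^*_\Lambda - (r_i\sigma_{Z_i} + r_j\sigma_{Z_j})\right). \qquad (2.110)
\end{aligned}$$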

Combining (2.110) and (2.109) into (2.106), and then substituting (2.105)

and (2.106) into (2.26) we get the following expression for the error bound

ε(dΛ) (2.26):

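$$\begin{aligned}
\varepsilon(d_\Lambda)
&= \frac{1}{2}\left(\Phi(d^*_\Lambda)\right)^{\frac{1}{2}} \Bigg\{ \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i\alpha_j \Big[ e^{E[Z_{ij}] + \frac{\sigma^2_{Z_{ij}}}{2}}\, \Phi\!\left(d^*_\Lambda - (r_i\sigma_{Z_i} + r_j\sigma_{Z_j})\right) \\
&\qquad\qquad - e^{E[Z_{ij}] + \frac{1}{2}\left(\sigma^2_{Z_i} + \sigma^2_{Z_j} + 2 r_i r_j \sigma_{Z_i}\sigma_{Z_j}\right)}\, \Phi\!\left(d^*_\Lambda - (r_i\sigma_{Z_i} + r_j\sigma_{Z_j})\right) \Big] \Bigg\}^{\frac{1}{2}} \\
&= \frac{1}{2}\left(\Phi(d^*_\Lambda)\right)^{\frac{1}{2}} \Bigg\{ \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i\alpha_j\, e^{E[Z_{ij}]}\, \Phi\!\left(d^*_\Lambda - (r_i\sigma_{Z_i} + r_j\sigma_{Z_j})\right) \\
&\qquad\qquad \times \left( e^{\frac{1}{2}\left(\sigma^2_{Z_i} + \sigma^2_{Z_j} + 2\sigma_{Z_iZ_j}\right)} - e^{\frac{1}{2}\left(\sigma^2_{Z_i} + \sigma^2_{Z_j} + 2 r_i r_j \sigma_{Z_i}\sigma_{Z_j}\right)} \right) \Bigg\}^{\frac{1}{2}} \\
&= \frac{1}{2}\left(\Phi(d^*_\Lambda)\right)^{\frac{1}{2}} \Bigg\{ \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i\alpha_j\, e^{E[Z_{ij}] + \frac{1}{2}\left(\sigma^2_{Z_i} + \sigma^2_{Z_j}\right)}\, \Phi\!\left(d^*_\Lambda - (r_i\sigma_{Z_i} + r_j\sigma_{Z_j})\right) \left( e^{\sigma_{Z_iZ_j}} - e^{\sigma_{Z_i}\sigma_{Z_j} r_i r_j} \right) \Bigg\}^{\frac{1}{2}}.
\end{aligned}$$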


Partially exact/comonotonic upper bound (2.45)

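Applying Lemma 7 with $a = r_i\sigma_{Z_i}$, and using (2.32), we can express the second term $I_2$ in (2.22) in closed form:

$$\begin{aligned}
\int_{d_\Lambda}^{+\infty} E[S - d \mid \Lambda = \lambda]\, dF_\Lambda(\lambda)
&= \int_{d_\Lambda}^{+\infty} E[S \mid \Lambda = \lambda]\, dF_\Lambda(\lambda) - d\left(1 - F_\Lambda(d_\Lambda)\right) \\
&= \sum_{i=1}^{n} \alpha_i\, e^{E[Z_i] + \frac{1}{2}(1 - r_i^2)\sigma^2_{Z_i}} \int_{d_\Lambda}^{+\infty} e^{r_i\sigma_{Z_i}\Phi^{-1}(v)}\, dF_\Lambda(\lambda) - d\left(1 - \Phi(d^*_\Lambda)\right) \\
&= \sum_{i=1}^{n} \alpha_i\, e^{E[Z_i] + \frac{\sigma^2_{Z_i}}{2}}\, \Phi\!\left(r_i\sigma_{Z_i} - d^*_\Lambda\right) - d\,\Phi(-d^*_\Lambda). \qquad (2.111)
\end{aligned}$$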

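Substituting (2.33) in (2.28) we end up with the following upper bound of $I_1$, similar to (2.37) but now with an integral from zero to $\Phi(d^*_\Lambda)$:

$$\begin{aligned}
\int_{-\infty}^{d_\Lambda} E[(S - d)_+ \mid \Lambda = \lambda]\, dF_\Lambda(\lambda)
&\le \int_{-\infty}^{d_\Lambda} E[(S^u - d)_+ \mid \Lambda = \lambda]\, dF_\Lambda(\lambda) \\
&= \int_0^{\Phi(d^*_\Lambda)} E[(S^u - d)_+ \mid V = v]\, dv \\
&= \sum_{i=1}^{n} \alpha_i\, e^{E[Z_i] + \frac{1}{2}\sigma^2_{Z_i}(1 - r_i^2)} \int_0^{\Phi(d^*_\Lambda)} e^{r_i\sigma_{Z_i}\Phi^{-1}(v)}\, \Phi\!\left(\mathrm{sign}(\alpha_i)\sqrt{1 - r_i^2}\,\sigma_{Z_i} - \Phi^{-1}\!\left(F_{S^u|V=v}(d)\right)\right) dv \\
&\quad - d\left(\Phi(d^*_\Lambda) - \int_0^{\Phi(d^*_\Lambda)} F_{S^u|V=v}(d)\, dv\right), \qquad (2.112)
\end{aligned}$$

where we recall that $d^*_\Lambda$ is defined as in (2.43), and the cumulative distribution $F_{S^u|V=v}(d)$ is, according to (2.36), determined by

$$\sum_{i=1}^{n} \alpha_i\, e^{E[Z_i] + r_i\sigma_{Z_i}\Phi^{-1}(v) + \mathrm{sign}(\alpha_i)\sqrt{1 - r_i^2}\,\sigma_{Z_i}\Phi^{-1}\left(F_{S^u|V=v}(d)\right)} = d.$$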

Finally, adding (2.112) to the exact part (2.111) of the decomposition (2.22)

results in the partially exact/comonotonic upper bound.

Chapter 3

Reserving in life insurance business

Summary In the traditional approach to life contingencies only decre-

ments are assumed to be stochastic. In this contribution we consider the

distribution of a life annuity (and a portfolio of life annuities) when also

the stochastic nature of interest rates is taken into account. Although the

literature concerning this topic is already quite rich, the authors usually

restrict themselves to the computation of the first two or three moments.

However, if one wants to determine e.g. capital requirements using more sophisticated risk measures like Value-at-Risk or Tail Value-at-Risk, more detailed knowledge about underlying distributions is required. For this purpose, we propose to use the theory of comonotonic risks introduced in Chapter 2. This methodology allows one to obtain reliable approximations of

the underlying distribution functions, in particular very accurate estimates

of upper quantiles and stop-loss premiums. Several numerical illustrations

confirm the very high accuracy of the methodology.

3.1 Introduction

Unlike in finance, in insurance the concept of stochastic interest rates

emerged quite recently. In the traditional approach to life contingencies

only decrements are assumed to be stochastic — see e.g. Bowers et al.

(1986), Wolthuis & Van Hoek (1986). Such a simplification allows one to treat

effectively summary measures of financial contracts such as the mean, the


standard deviation or the upper quantiles. For a more detailed discussion

about the distributions in life insurance under deterministic interest rates,

see e.g. Dhaene (1990).

In non-life insurance the use of deterministic interest rates may be

justified by short terms of insurance commitments. In the case of the life

insurance and the life annuity business, durations of contracts are typically

very long (often 30 or even more years). Then uncertainty about future

rates of return becomes very high. Moreover the financial and investment

risk — unlike the mortality risk — cannot be diversified with an increase in

the number of policies. Therefore in order to calculate insurance premiums

or mathematical reserves, actuaries are forced to adopt very conservative

assumptions. As a result the diversification effects between interest rates

in different investment periods may not be taken into account (i.e. that

poor investment results in some periods are usually compensated by very

good ones in others) and the life insurance business becomes too expensive,

both for the insureds who have to pay higher insurance premiums and for

the shareholders who have to provide more capital than necessary. Profit-

sharing can partially solve this problem. For these reasons the necessity of introducing models with stochastic interest rates has been well understood in the actuarial world.

In the actuarial literature numerous papers have treated random interest rates. In Boyle (1976) autoregressive models of order one are intro-

duced to model interest rates. Bellhouse & Panjer (1980, 1981) use similar

models to compute moments of insurance and annuity functions. In Wilkie

(1976) the force of interest is assumed to follow a Gaussian random walk.

Waters (1978) computes the moments of actuarial functions when the in-

terest rates are independent and identically Gaussian distributed. He com-

putes also moments of portfolios of policies and approximates the limiting

distribution by Pearson’s curves. In Dhaene (1989) the force of interest

is modelled as an ARIMA(p, d, q) process. He uses this model to compute

the moments of present value functions. Norberg (1990) provides an ax-

iomatic approach to stochastic interest rates and the valuation of payment

streams. Parker (1994d) compares two approaches to the randomness of in-

terest rates: by modelling only the accumulated interest and by modelling

the force of interest. Both methodologies are illustrated by calculating the

mean, the standard deviation and the skewness of the annuity-immediate.

An overview of stochastic life contingencies with solvency valuation is

presented in Frees (1990). In the papers of Beekman & Fuelling (1990,


1991) the mean and the standard deviation of continuous-time life annu-

ities are calculated with the force of mortality modelled as an Ornstein-

Uhlenbeck and a Wiener process respectively. In Beekman & Fuelling

(1993) expressions are given for the mean and the standard deviation of

the future life insurance payments. Norberg (1993) derives the first two

moments of the present value of stochastic payment streams. The first

three moments of homogeneous portfolios of life insurance and endowment

policies are calculated in Parker (1994a,b) and the results are generalized

to heterogeneous portfolios in Parker (1997). The same author (1994c,

1996) provides a recursive formula to calculate an approximate distribu-

tion function of the limiting homogeneous portfolio of term life insurance

and endowment policies. In Debicka (2003) the mean and the variance are

calculated for the present value of discrete-time payment streams in life

insurance.

Although the literature on stochastic interest rates in life insurance is

already quite rich, for most of the problems no satisfactory solutions have

been found as yet. In almost all papers the authors restrict themselves

to calculating the first two or three moments of the present value function

(except Waters (1978), Parker (1994d, 1996)). The computation of the first

few moments may be seen as just a first attempt to explore the properties of

a random distribution. Moreover in general the variance does not appear to

be the most suitable risk measure to determine the solvency requirements

for an insurance portfolio. As a two-sided risk measure it takes into account

both positive and negative discrepancies which leads to underestimation of

the reserve in the case of a skewed distribution. It does not emphasize the

tail properties of the distribution and does not give any reliable estimates of

the Value-at-Risk or other tail-related risk measures, for which simulation

methods have to be deployed. The same applies to risk measures based on

stop-loss premiums, like Expected Shortfall.

In this chapter we aim to provide some conservative estimates both

for high quantiles and stop-loss premiums for an individual policy and for

a whole portfolio. We focus here only on life annuities, however similar

techniques may be used to get analogous estimates for more general life

contingencies. Using the results of Chapter 2 we will approximate the

quantiles of the present value of a life annuity and a portfolio of life annu-

ities.

We perform our analysis separately for a single life annuity and a whole portfolio of policies. Our solution makes it possible to solve, with great accuracy, personal finance problems such as: how much does one need to invest now to ensure, given a periodical (e.g. yearly) consumption pattern, that the probability of outliving one's money is very small (e.g. less than 1%)?

Similar problems were studied by Dufresne (2004) and Milevsky & Wang

(2004).

The case of a portfolio of life annuity policies has been studied exten-

sively in the literature, but only in the limiting case — for homogeneous

portfolios, when the mortality risk is fully diversified. However the applica-

bility of these results in insurance practice may be questioned: especially

in the case of the life annuity business a typical portfolio does not con-

tain enough policies to speak about full diversification. For this reason we

propose to approximate the number of active policies in subsequent years

using a normal power distribution and to model the present value of future

benefits as a scalar product of mutually independent random vectors.

This chapter is mainly based on Hoedemakers, Darkiewicz & Goovaerts

(2005) and is organized as follows. In Section 2 we give a summary of

the model assumptions and properties for the mortality process that are

needed to reach our goal. In the first part of Section 3 we apply the

results of Chapter 2 to the present value of a single life annuity policy.

In the second part of this section we present the convex bounds for a

homogeneous portfolio of policies. A numerical illustration is provided at

the end of each part. We also illustrate the obtained results graphically.

3.2 Modelling stochastic decrements

A life annuity may be defined as a series of periodic payments where each

payment will actually be made only if a designated life is alive at the time

the payment is due. Let us consider a person aged x years, also called a

life aged x and denoted by (x). We denote his or her future lifetime by Tx.

Thus x+ Tx will be the age of death of the person. The future lifetime Tx

is a random variable with a probability distribution function

Gx(t) = Pr[Tx ≤ t] = tqx, t ≥ 0.

The function Gx represents the probability that the person will die within t

years, for any fixed t. We assume that Gx is known. We define $K_x = \lfloor T_x \rfloor$, the number of completed future years lived by (x), or the curtate future lifetime of (x). The probability distribution of the integer valued random

variable Kx is given by

$$\Pr[K_x = k] = \Pr[k \le T_x < k+1] = {}_{k+1}q_x - {}_{k}q_x = {}_{k|}q_x, \qquad k = 0, 1, \ldots$$

Let us denote the lifetime from birth by the random variable T . We assume

Pr[Tx ≤ t] = Pr[T ≤ x+ t|T ≥ x].

With this notation, $T \stackrel{d}{=} T_0$. Further, the ultimate age of the life table is denoted by ω. This means that ω − x is the first remaining lifetime of (x) for which ${}_{\omega-x}q_x = 1$, or equivalently, $G_x^{-1}(1) = \omega - x$.

In the remainder of this chapter we will always use the standard actu-

arial notation:

$$\Pr[T_x > t] = {}_tp_x, \quad \Pr[T_x > 1] = p_x, \quad \Pr[T_x \le t] = {}_tq_x, \quad \Pr[T_x \le 1] = q_x.$$

In this chapter we consider three types of annuities. The present value of

a single life annuity for a person aged x paying periodically (e.g. yearly) a

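fixed amount of $\alpha_i$ ($i = 1, \ldots, \lfloor \omega - x \rfloor$) can be expressed as
$$S_{sp,x} = \sum_{i=1}^{K_x} \alpha_i e^{-Y(i)} = \sum_{i=1}^{\lfloor \omega - x \rfloor} I_{(T_x > i)}\, \alpha_i e^{-Y(i)}. \tag{3.1}$$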

We consider also the present value of a homogeneous portfolio of life

annuities — this random variable is particularly interesting for an in-

surer who has to determine a sufficient level of the reserve and the sol-

vency margin. Assuming that every beneficiary gets a fixed amount of αi

($i = 1, \ldots, \lfloor \omega - x \rfloor$) per year, the present value can be expressed as follows
$$S_{pp,x} = \sum_{i=1}^{\lfloor \omega - x \rfloor} \alpha_i N_i e^{-Y(i)}, \tag{3.2}$$

where Ni denotes the remaining number of policies-in-force in year i.

Finally, consider a portfolio of $N_0$ homogeneous life annuity contracts for which the future lifetimes of the insureds $T_x^{(1)}, T_x^{(2)}, \ldots, T_x^{(N_0)}$ are assumed to be independent. Then the insurer faces two risks: mortality

risk and investment risk. Note that from the Law of Large Numbers the


mortality risk decreases with the number of policies N0 while the invest-

ment risk remains the same (each of the policies is exposed to the same

investment risk). Thus, for sufficiently large N0 we have that

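$$\sum_{i=1}^{\lfloor \omega - x \rfloor} \alpha_i N_i e^{-Y(i)} = N_0 \sum_{i=1}^{\lfloor \omega - x \rfloor} \alpha_i \frac{N_i}{N_0} e^{-Y(i)} \approx N_0 \sum_{i=1}^{\lfloor \omega - x \rfloor} \alpha_i\, {}_ip_x\, e^{-Y(i)}.$$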

Hence in the case of large portfolios of life annuities it suffices to compute

risk measures of an ‘average’ portfolio Sapp,x given by

$$S_{app,x} = \sum_{i=1}^{\lfloor \omega - x \rfloor} \alpha_i\, {}_ip_x\, e^{-Y(i)} = E\big[S_{sp,x} \mid Y(1), \ldots, Y(\lfloor \omega - x \rfloor)\big]. \tag{3.3}$$

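Remark 5. For the random variables $S_{app,x}$ and $S_{sp,x}$ one has that $S_{app,x} \le_{cx} S_{sp,x}$ and consequently $\mathrm{Var}[S_{app,x}] \le \mathrm{Var}[S_{sp,x}]$. Indeed, let Γ denote a random variable independent of $T_x$. Then, it follows immediately from Theorem 8 that
$$\begin{aligned}
S_{sp,x} &= \sum_{i=1}^{\lfloor \omega - x \rfloor} I_{(T_x > i)}\, \alpha_i e^{-Y(i)} \\
&\ge_{cx} \sum_{i=1}^{\lfloor \omega - x \rfloor} E\big[I_{(T_x > i)} \mid \Gamma\big]\, \alpha_i e^{-Y(i)} \\
&= \sum_{i=1}^{\lfloor \omega - x \rfloor} {}_ip_x\, \alpha_i e^{-Y(i)} \\
&= S_{app,x}.
\end{aligned}$$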

Obviously Ssp,x, Spp,x and Sapp,x depend on the distribution of the total

lifetime T . We assume that T follows the Gompertz-Makeham law, i.e.

the force of mortality at age ξ is given by the formula

$$\mu_\xi = \alpha + \beta c^{\xi},$$
where α > 0 is a constant component, interpreted as capturing accident hazard, and $\beta c^{\xi}$ is a variable component capturing the hazard of aging with β > 0 and c > 1. This leads to the survival probability
$${}_tp_x = \Pr[T_x > t] = e^{-\int_x^{x+t} \mu_\xi \, d\xi} = s^t g^{c^{x+t} - c^x},$$
where
$$s = e^{-\alpha} \quad \text{and} \quad g = e^{-\beta / \log c}. \tag{3.4}$$

In numerical illustrations we use the Belgian analytic life tables MR and

FR for life annuity valuation, with corresponding constants for males: s =

0.999441703848, g = 0.999733441115 and c = 1.101077536030 and for fe-

males: s = 0.999669730966, g = 0.999951440171 and c = 1.116792453830.
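As an illustration of the formulas above, the following minimal Python sketch evaluates the survival probability ${}_tp_x = s^t g^{c^{x+t}-c^x}$ and the curtate lifetime probabilities ${}_{k|}q_x = {}_kp_x - {}_{k+1}p_x$ with the MR constants quoted in the text. The function names `survival` and `curtate_pmf` are illustrative choices, not part of the original work.

```python
import math

# Constants of the Belgian analytic life table MR (males), as quoted in the text.
S_MR = 0.999441703848
G_MR = 0.999733441115
C_MR = 1.101077536030


def survival(x, t, s=S_MR, g=G_MR, c=C_MR):
    """t_p_x = s^t * g^(c^(x+t) - c^x) under the Makeham-Gompertz law."""
    return s ** t * g ** (c ** (x + t) - c ** x)


def curtate_pmf(x, k, s=S_MR, g=G_MR, c=C_MR):
    """k|q_x = Pr[K_x = k] = k_p_x - (k+1)_p_x."""
    return survival(x, k, s, g, c) - survival(x, k + 1, s, g, c)


if __name__ == "__main__":
    # Probability that a 65-year-old male survives 10 more years.
    print(survival(65, 10))
    # The probabilities k|q_65 telescope to 1 - 60_p_65.
    print(sum(curtate_pmf(65, k) for k in range(60)))
```

The probabilities returned by `curtate_pmf` are the weights ${}_{k|}q_x$ that appear in the mixture formulas of Section 3.3.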

Denote by T ′ and T ′x the corresponding random variables from the

Gompertz family — the subclass of the Makeham-Gompertz family with

the force of mortality given by

$$\mu'_\xi = \beta c^{\xi}.$$

It is straightforward to show that

$$T_x \stackrel{d}{=} \min(T'_x, E/\alpha), \tag{3.5}$$

where E denotes a random variable from the standard exponential distri-

bution, independent of T ′. Indeed, one has that

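$$\begin{aligned}
\Pr[\min(T'_x, E/\alpha) > t] &= \Pr[T'_x > t]\, \Pr[E > \alpha t] \\
&= e^{-\int_x^{x+t} \mu'_\xi \, d\xi}\, e^{-\alpha t} \\
&= e^{-\int_x^{x+t} \mu_\xi \, d\xi} \\
&= \Pr[T_x > t].
\end{aligned}$$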

The cumulative distribution function for the Gompertz law, unlike for the

Makeham-Gompertz law in general, has an analytical expression for the

inverse function and therefore (3.5) can be used for simulations.

For generating one random variate from Makeham’s law, we use the

composition method (Devroye, 1986) and perform the following steps

1. Generate G from Gompertz's law by the well-known inversion

method

2. Generate E from the exponential(1) distribution

3. Retain T = min(G,E/α),

where α = − log s, see (3.4).
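The three steps above can be sketched in Python as follows. This is a minimal illustration, not the implementation used for the numerical results: the function names are hypothetical, the lifetime is generated for a life aged x (so the Gompertz variate is conditional on survival to age x), and the inversion formula is obtained by solving $g^{c^{x+t}-c^x} = u$ for t.

```python
import math
import random

# Constants of the Belgian analytic life table MR (males), as quoted in the text.
S_MR = 0.999441703848
G_MR = 0.999733441115
C_MR = 1.101077536030


def makeham_survival(x, t, s=S_MR, g=G_MR, c=C_MR):
    """t_p_x = s^t * g^(c^(x+t) - c^x)."""
    return s ** t * g ** (c ** (x + t) - c ** x)


def sample_future_lifetime(x, rng, s=S_MR, g=G_MR, c=C_MR):
    """Draw one future lifetime T_x as min(Gompertz variate, E / alpha)."""
    alpha = -math.log(s)  # see (3.4)
    # Step 1: Gompertz variate by inversion; u is uniform on (0, 1].
    u = 1.0 - rng.random()
    gompertz = math.log(c ** x + math.log(u) / math.log(g)) / math.log(c) - x
    # Step 2: standard exponential variate.
    e = rng.expovariate(1.0)
    # Step 3: retain the minimum.
    return min(gompertz, e / alpha)


if __name__ == "__main__":
    rng = random.Random(2005)
    draws = [sample_future_lifetime(65, rng) for _ in range(20000)]
    empirical = sum(t > 10 for t in draws) / len(draws)
    print(empirical, makeham_survival(65, 10))
```

Comparing the empirical survival frequency with `makeham_survival` gives a quick sanity check of the sampler.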


3.3 The distribution of life annuities

This section is organized into 2 subsections. In the first subsection we

derive upper and lower bounds in convex order for the distribution of the

present value of a single life annuity given a mortality law T and a model

for the returns. This distribution is very important in the context of so-called personal finance problems. Suppose that (x) has a lump sum L at his or her disposal. What amount can (x) consume yearly while being sure, with a sufficiently high probability (e.g. p = 99%), that the money will not run out before death? Obviously, to answer this question one has to compute the Value-at-Risk measure of the distribution at an appropriately chosen level.

In the second part of this section we will consider the distribution of

a homogeneous and ‘average’ portfolio of life annuities. An insurer has

to derive this distribution to determine its future liabilities and solvency

margin. Notice that the presented methodology is appropriate not only in

the case of large portfolios when the limiting distribution can be used on

the basis of the law of large numbers but also for portfolios of average size

(e.g. 1 000 - 5 000) which are typical for the life annuity business.

The vector $\vec{Y} = (Y(1), Y(2), \ldots, Y(n))$ is assumed to have an n-dimensional normal distribution with given mean vector
$$\vec{\mu} = (\mu_1, \ldots, \mu_n) = \big(E[Y(1)], E[Y(2)], \ldots, E[Y(n)]\big)$$
and covariance matrix
$$\Sigma = [\sigma_{ij}]_{1 \le i,j \le n} = \big[\mathrm{Cov}(Y(i), Y(j))\big]_{1 \le i,j \le n}.$$
In the above notation we will denote $\sigma_{ii}$ by $\sigma_i^2$.

3.3.1 A single life annuity

In this subsection we consider a whole life annuity of αi (> 0) payable at

the end of each year i while (x) survives, described by the formula

$$S_{sp,x} = \sum_{i=1}^{K_x} \alpha_i e^{-Y(i)} = \sum_{i=1}^{\lfloor \omega - x \rfloor} I_{(T_x > i)}\, \alpha_i e^{-Y(i)}.$$

The upper bound

The random variable $X_i = I_{(T_x > i)}$ is Bernoulli(${}_ip_x$) distributed and thus the inverse distribution function is given by
$$F^{-1}_{X_i}(p) = \begin{cases} 1 & \text{for } p > {}_iq_x \\ 0 & \text{for } p \le {}_iq_x. \end{cases}$$

This leads to the following formula for the upper bound

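$$S^{c}_{sp,x} = \sum_{i=1}^{\lfloor \omega - x \rfloor} F^{-1}_{X_i}(U)\, F^{-1}_{\alpha_i e^{-Y(i)}}(V) = \sum_{i=1}^{\lfloor F^{-1}_{T_x}(U) \rfloor} F^{-1}_{\alpha_i e^{-Y(i)}}(V),$$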

where U and V are independent standard uniformly distributed random

variables. Thus the conditional quantiles are given by

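$$F^{-1}_{S^{c}_{sp,x} \mid T_x = t}(p) = \sum_{i=1}^{\lfloor t \rfloor} F^{-1}_{\alpha_i e^{-Y(i)}}(p)$$
and the conditional distribution function can be computed numerically from the identity
$$\sum_{i=1}^{\lfloor t \rfloor} \alpha_i e^{-\mu_i + \mathrm{sign}(\alpha_i)\sigma_i \Phi^{-1}\left(F_{S^{c}_{sp,x} \mid T_x = t}(y)\right)} = \sum_{i=1}^{k} \alpha_i e^{-\mu_i + \mathrm{sign}(\alpha_i)\sigma_i \Phi^{-1}\left(F_{S^{c}_{sp,x} \mid K_x = k}(y)\right)} = y,$$
where $k = \lfloor t \rfloor$.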

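Define $S_k$ as follows:
$$S_k = \sum_{i=1}^{k} \alpha_i e^{-Y(i)}, \tag{3.6}$$
then $S_k \stackrel{d}{=} S_{sp,x} \mid K_x = k$. Hence, the distribution function of $S^{c}_{sp,x}$ can be computed as
$$\begin{aligned}
F_{S^{c}_{sp,x}}(y) &= \sum_{k=1}^{\lfloor \omega - x \rfloor} \Pr[K_x = k]\, F_{S^{c}_{sp,x} \mid K_x = k}(y) \\
&= \sum_{k=1}^{\lfloor \omega - x \rfloor} {}_{k|}q_x\, F_{S^{c}_{k}}(y) \\
&= \sum_{k=1}^{\lfloor \omega - x \rfloor} {}_{k|}q_x \Pr\left[\sum_{i=1}^{k} \alpha_i e^{-\mu_i + \mathrm{sign}(\alpha_i)\sigma_i \Phi^{-1}(U)} \le y\right],
\end{aligned}$$
with $S^{c}_{k} = \sum_{i=1}^{k} F^{-1}_{\alpha_i e^{-Y(i)}}(U)$ and U a standard uniform random variable.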

The computation of the corresponding stop-loss premiums is also straight-

forward:

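$$\begin{aligned}
\pi^{cub}(S_{sp,x}, d) &= E_{K_x}\Big[E\big[(S^{c}_{sp,x} - d)_+ \mid K_x\big]\Big] \\
&= \sum_{k=1}^{\lfloor \omega - x \rfloor} {}_{k|}q_x\, \pi^{cub}(S_k, d) \\
&= \sum_{k=1}^{\lfloor \omega - x \rfloor} {}_{k|}q_x \left(\sum_{i=1}^{k} \pi\big(\alpha_i e^{-Y(i)}, d^{c}_{k,i}\big)\right),
\end{aligned}$$
where $d^{c}_{k,i}$ is defined analogously to (2.73) as
$$d^{c}_{k,i} = \alpha_i e^{-\mu_i + \mathrm{sign}(\alpha_i)\sigma_i \Phi^{-1}\left(F_{S^{c}_{k}}(d)\right)}$$
and the values of $\pi(\alpha_i e^{-Y(i)}, d^{c}_{k,i})$ are computed as in (2.74). The stop-loss premium of $S^{c}_{sp,x}$ at retention d can be written out explicitly as follows
$$\pi^{cub}(S_{sp,x}, d) = \sum_{k=1}^{\lfloor \omega - x \rfloor} {}_{k|}q_x \left\{\sum_{i=1}^{k} \alpha_i e^{-\mu_i + \frac{\sigma_i^2}{2}}\, \Phi\Big(\mathrm{sign}(\alpha_i)\sigma_i - \Phi^{-1}\big(F_{S^{c}_{k}}(d)\big)\Big) - d\big(1 - F_{S^{c}_{k}}(d)\big)\right\}.$$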
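The computation of the upper bound can be sketched numerically as follows. This is a hedged illustration under stated assumptions: all payments $\alpha_i$ and all $\sigma_i$ are strictly positive, the identity for the conditional distribution function is solved by bisection on $z = \Phi^{-1}(F_{S^c_k}(y))$, and the function names and the list `kq` (holding ${}_{k|}q_x$ for k = 1, 2, ...) are hypothetical.

```python
import math
from statistics import NormalDist

_STD = NormalDist()


def _solve_z(y, alpha, mu, sigma):
    """Return z with sum_i alpha_i * exp(-mu_i + sigma_i * z) = y, for y > 0."""
    def total(z):
        return sum(a * math.exp(-m + s * z) for a, m, s in zip(alpha, mu, sigma))

    lo, hi = -1.0, 1.0
    while total(lo) > y:   # widen the bracket downwards
        lo *= 2.0
    while total(hi) < y:   # widen the bracket upwards
        hi *= 2.0
    for _ in range(200):   # plain bisection; total() is increasing in z
        mid = 0.5 * (lo + hi)
        if total(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def cdf_sck(y, alpha, mu, sigma):
    """Distribution function of the comonotonic sum S^c_k at y."""
    if y <= 0.0:
        return 0.0
    return _STD.cdf(_solve_z(y, alpha, mu, sigma))


def stop_loss_sck(d, alpha, mu, sigma):
    """Stop-loss premium of S^c_k at retention d (explicit lognormal formula)."""
    means = [a * math.exp(-m + 0.5 * s * s) for a, m, s in zip(alpha, mu, sigma)]
    if d <= 0.0:
        return sum(means) - d
    z = _solve_z(d, alpha, mu, sigma)
    return sum(e * _STD.cdf(s - z) for e, s in zip(means, sigma)) - d * (1.0 - _STD.cdf(z))


def upper_bound_stop_loss(d, kq, alpha, mu, sigma):
    """Mixture over K_x: kq[k-1] holds k|q_x for k = 1, ..., len(kq)."""
    return sum(q * stop_loss_sck(d, alpha[:k], mu[:k], sigma[:k])
               for k, q in enumerate(kq, start=1))
```

Working with z rather than with the probability $F_{S^c_k}(y)$ itself avoids evaluating $\Phi^{-1}$ near 0 or 1, which is why the bisection is set up on that scale.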


The lower bound

For the lower bound one faces the problem of choosing appropriate condi-

tioning random variables Γ and Λ. The random variables Xi are in fact

comonotonic and depend only on the future lifetime Tx, thus Γ = Tx is the

most natural choice. As a result one simply gets

$$E\big[I_{(T_x > i)} \mid T_x\big] = I_{(T_x > i)}.$$

The choice of the second conditioning random variable Λ is less obvious.

We propose two different approaches:

1. $\Lambda^{(a)} = \sum_{i=1}^{\lfloor \omega - x \rfloor} {}_ip_x\, \alpha_i e^{-\mu_i + \frac{1}{2}\sigma_i^2}\, Y(i)$. Intuitively it means that the conditioning random variable is chosen as a first order approximation to the present value of the limiting portfolio $S_{app,x}$ in (3.3).

2. Consider the 'maximal variance' conditioning random variables of the form $\Lambda_j = \sum_{i=1}^{j} \alpha_i e^{-\mu_i + \frac{1}{2}\sigma_i^2}\, Y(i)$ ($j = 1, \ldots, \lfloor \omega - x \rfloor$) and the corresponding lower bounds
$$S^{l,j}_{sp,x} = \sum_{i=1}^{K_x} E\big[\alpha_i e^{-Y(i)} \mid \Lambda_j\big], \qquad j = 1, \ldots, \lfloor \omega - x \rfloor,$$
from which one chooses the lower bound with the largest variance. The corresponding conditioning random variable will be denoted as $\Lambda^{(m)}$. This choice can be motivated as follows. For two random variables X and Y with $X \le_{cx} Y$ one has that $\mathrm{Var}[X] \le \mathrm{Var}[Y]$. As discussed in Chapter 2 we should choose Λ such that the goodness-of-fit expressed by the ratio $z = \mathrm{Var}[S^{l}_{sp,x}] / \mathrm{Var}[S_{sp,x}]$ is as close as possible to 1. Hence one can expect that a lower bound with a larger variance will provide a better fit to the original random variable.

Having chosen the conditioning random variable Λ one proceeds as in the

case of the upper bound: the first step requires the computation of the

conditional distribution of the lower bound from the formula

$$\sum_{i=1}^{k} \alpha_i e^{-\mu_i + \frac{1}{2}\sigma_i^2(1 - r_i^2) + \sigma_i r_i \Phi^{-1}\left(F_{S^{l}_{sp,x} \mid K_x = k}(y)\right)} = y.$$


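The cumulative distribution function of $S^{l}_{sp,x}$ can then be computed as
$$\begin{aligned}
F_{S^{l}_{sp,x}}(y) &= \sum_{k=1}^{\lfloor \omega - x \rfloor} {}_{k|}q_x\, F_{S^{l}_{sp,x} \mid K_x = k}(y) \\
&= \sum_{k=1}^{\lfloor \omega - x \rfloor} {}_{k|}q_x\, F_{S^{l}_{k}}(y) \\
&= \sum_{k=1}^{\lfloor \omega - x \rfloor} {}_{k|}q_x \Pr\left[\sum_{i=1}^{k} \alpha_i e^{-\mu_i - r_i\sigma_i \Phi^{-1}(U) + \frac{1}{2}(1 - r_i^2)\sigma_i^2} \le y\right],
\end{aligned}$$
with $S^{l}_{k} = E[S_k \mid \Lambda]$ and U a standard uniform random variable.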

The computation of the corresponding stop-loss premium is similar to

the one of the upper bound and as a result one gets the following explicit

solution

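$$\begin{aligned}
\pi^{lb}(S_{sp,x}, d, \Gamma, \Lambda) &= E_{K_x}\Big[E\big[(S^{l}_{sp,x} - d)_+ \mid K_x\big]\Big] \\
&= \sum_{k=1}^{\lfloor \omega - x \rfloor} {}_{k|}q_x\, \pi^{lb}(S_k, d, \Lambda) \\
&= \sum_{k=1}^{\lfloor \omega - x \rfloor} {}_{k|}q_x \left(\sum_{i=1}^{k} \pi\big(E[\alpha_i e^{-Y(i)} \mid \Lambda], d^{l}_{k,i}\big)\right),
\end{aligned}$$
with $d^{l}_{k,i}$ given by
$$d^{l}_{k,i} = \alpha_i e^{-\mu_i + \frac{1}{2}\sigma_i^2(1 - r_i^2) + \sigma_i r_i \Phi^{-1}\left(F_{S^{l}_{sp,x} \mid K_x = k}(d)\right)}.$$
Note that the values of $\pi\big(E[\alpha_i e^{-Y(i)} \mid \Lambda], d^{l}_{k,i}\big)$ can be computed as in (2.82). The stop-loss premium of $S^{l}_{sp,x}$ at retention d can be written out explicitly as follows
$$\pi^{lb}(S_{sp,x}, d, \Gamma, \Lambda) = \sum_{k=1}^{\lfloor \omega - x \rfloor} {}_{k|}q_x \left\{\sum_{i=1}^{k} \alpha_i e^{-\mu_i + \frac{\sigma_i^2}{2}}\, \Phi\Big(r_i\sigma_i - \Phi^{-1}\big(F_{S^{l}_{k}}(d)\big)\Big) - d\big(1 - F_{S^{l}_{k}}(d)\big)\right\}.$$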


The lower bound based on a lifetime dependent conditioning random variable

In this subsection we show how it is possible to improve the lower bound of

a scalar product if one of the vectors is comonotonic. We state this result

in the following lemma.

Lemma 10.

Consider a scalar product of random variables $S = \sum_{i=1}^{n} X_i Y_i$, where the random vectors $\vec{X}$ and $\vec{Y}$ are independent and $\vec{X}$ is additionally assumed to be comonotonic, i.e. $\vec{X} = \big(F^{-1}_{X_1}(U), F^{-1}_{X_2}(U), \ldots, F^{-1}_{X_n}(U)\big)$. Let Λ(u) be a random variable which is defined for each u ∈ (0, 1) separately. Define $S^{cl}(u)$ as follows:
$$S^{cl}(u) = \sum_{i=1}^{n} F^{-1}_{X_i}(u)\, E\big[Y_i \mid \Lambda(u)\big],$$
then $S^{cl}(u) \stackrel{d}{=} (S^{cl} \mid U = u)$. Define the random variable $S^{cl}$ through its distribution function
$$F_{S^{cl}}(y) = \int_0^1 F_{S^{cl} \mid U = u}(y)\, du.$$
Then $S^{cl} \le_{cx} S$.

Remark 6. Obviously the conditioning random variable U can be replaced

by any other random variable which determines the comonotonic vector $\vec{X}$ by a functional relationship. We consider here the case when $X_i = I_{(T_x > i)} = I_{(K_x \ge i)}$ and therefore it is convenient to condition on the curtate future lifetime $K_x$.

Proof. Let S(u) denote a random variable distributed as S given that

U = u. From Definition 1b of convex order, it follows immediately that

Scl(u) ≤cx S(u).

Indeed, let v(.) be an arbitrary convex function. Then we get

$$E\big[v(S^{cl})\big] = \int_0^1 E\big[v(S^{cl}(u))\big]\, du \le \int_0^1 E\big[v(S(u))\big]\, du = E\big[v(S)\big],$$

which completes the proof.


Because of Lemma 10, one can determine a lower bound of a single life

annuity using the following conditioning random variable:

$$\Lambda_{K_x} = \sum_{i=1}^{K_x} \alpha_i e^{-\mu_i + \frac{1}{2}\sigma_i^2}\, Y(i).$$

Intuitively it is clear that the lower bound defined by the random variable

ΛKx should approximate the underlying distribution better than those de-

fined by the conditioning random variables Λ(a) and Λ(m). As before, one

starts with computing the conditional distributions for the lower bound

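$S^{cl}_{sp,x}$ numerically by considering the equation
$$\sum_{i=1}^{k} \alpha_i e^{-\mu_i + \frac{1}{2}(1 - r_{i,k}^2)\sigma_i^2 + r_{i,k}\sigma_i \Phi^{-1}\left(F_{S^{cl}_{sp,x} \mid K_x = k}(y)\right)} = y,$$
with correlations $r_{i,k}$ given by
$$r_{i,k} = \frac{\mathrm{Cov}\big[Y(i), \Lambda_k\big]}{\sqrt{\mathrm{Var}[Y(i)]}\, \sqrt{\mathrm{Var}[\Lambda_k]}}.$$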
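The correlations $r_{i,k}$ follow directly from the covariance matrix of the returns, since $\Lambda_k$ is a linear combination of $Y(1), \ldots, Y(k)$. The sketch below is only an illustration: the function name is hypothetical, indices are zero-based, and the covariance matrix `cov` is supplied by the caller.

```python
import math


def correlations_with_lambda_k(k, alpha, mu, cov):
    """Return [r_{1,k}, ..., r_{k,k}] with r_{i,k} = Corr(Y(i), Lambda_k).

    Lambda_k = sum_{j<=k} alpha_j * exp(-mu_j + sigma_j^2 / 2) * Y(j),
    and cov[i][j] = Cov(Y(i+1), Y(j+1)) (zero-based lists).
    """
    # Coefficients of Y(j) in Lambda_k.
    beta = [alpha[j] * math.exp(-mu[j] + 0.5 * cov[j][j]) for j in range(k)]
    # Cov(Y(i), Lambda_k) = sum_j beta_j * sigma_ij.
    cov_y_lam = [sum(beta[j] * cov[i][j] for j in range(k)) for i in range(k)]
    # Var(Lambda_k) = sum_i sum_j beta_i * beta_j * sigma_ij.
    var_lam = sum(beta[i] * cov_y_lam[i] for i in range(k))
    return [cov_y_lam[i] / math.sqrt(cov[i][i] * var_lam) for i in range(k)]


if __name__ == "__main__":
    # Hypothetical inputs: unit payments and a Brownian-type covariance.
    n = 5
    cov = [[0.01 * min(i, j) for j in range(1, n + 1)] for i in range(1, n + 1)]
    mu = [0.045 * i for i in range(1, n + 1)]
    print(correlations_with_lambda_k(n, [1.0] * n, mu, cov))
```

For k = 1 the conditioning variable is a multiple of Y(1), so the single correlation equals 1, which is a convenient check.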

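Consequently, the distribution function of $S^{cl}_{sp,x}$ can be obtained as
$$F_{S^{cl}_{sp,x}}(y) = \sum_{k=1}^{\lfloor \omega - x \rfloor} \Pr[K_x = k]\, F_{S^{cl}_{sp,x} \mid K_x = k}(y) = \sum_{k=1}^{\lfloor \omega - x \rfloor} {}_{k|}q_x\, F_{S^{cl}_{k}}(y),$$
with
$$S^{cl}_{k} = E\big[S_k \mid \Lambda_k\big]. \tag{3.7}$$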

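The stop-loss premiums of $S^{cl}_{sp,x}$ can be computed as follows
$$\begin{aligned}
\pi^{clb}(S_{sp,x}, d, \Gamma, \Lambda) &= E_{K_x}\Big[E\big[(S^{cl}_{sp,x} - d)_+ \mid K_x\big]\Big] \\
&= \sum_{k=1}^{\lfloor \omega - x \rfloor} {}_{k|}q_x\, \pi^{lb}(S_k, d, \Lambda_k) \\
&= \sum_{k=1}^{\lfloor \omega - x \rfloor} {}_{k|}q_x \left(\sum_{i=1}^{k} \pi\big(E[\alpha_i e^{-Y(i)} \mid \Lambda_k], d^{cl}_{k,i}\big)\right),
\end{aligned}$$
with $d^{cl}_{k,i}$ given by
$$d^{cl}_{k,i} = \alpha_i e^{-\mu_i + \frac{1}{2}\sigma_i^2(1 - r_{i,k}^2) + \sigma_i r_{i,k} \Phi^{-1}\left(F_{S^{cl}_{k}}(d)\right)}.$$
The stop-loss premium of $S^{cl}_{sp,x}$ at retention d can be written out explicitly as follows
$$\pi^{clb}(S_{sp,x}, d, \Gamma, \Lambda) = \sum_{k=1}^{\lfloor \omega - x \rfloor} {}_{k|}q_x \left\{\sum_{i=1}^{k} \alpha_i e^{-\mu_i + \frac{\sigma_i^2}{2}}\, \Phi\Big(r_{i,k}\sigma_i - \Phi^{-1}\big(F_{S^{cl}_{k}}(d)\big)\Big) - d\big(1 - F_{S^{cl}_{k}}(d)\big)\right\}.$$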


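Having computed the upper bound $S^{c}_{sp,x}$ and the lower bounds $S^{l}_{sp,x}$ and $S^{cl}_{sp,x}$, one can compute two moments based approximations as described in Subsection 2.2.4. To find the coefficient z given by (2.15) one needs to calculate the variances of $S^{c}_{sp,x}$, $S^{l}_{sp,x}$, $S^{cl}_{sp,x}$ and $S_{sp,x}$. The variance of $S^{c}_{sp,x}$ and $S^{l}_{sp,x}$ can be computed as explained in Subsection 2.5.3. The variance of $S_{sp,x}$ and $S^{cl}_{sp,x}$ can be treated very similarly. Indeed, after some simple calculations one gets
$$\begin{aligned}
\mathrm{Var}\big[S^{cl}_{sp,x}\big] &= E_{K_x}\Big[E\big[(S^{cl}_{sp,x})^2 \mid K_x\big]\Big] - \big(E[S^{cl}_{sp,x}]\big)^2 \\
&= \sum_{k=1}^{\lfloor \omega - x \rfloor} {}_{k|}q_x\, E\big[(S^{cl}_{k})^2\big] - \big(E[S^{cl}_{sp,x}]\big)^2, \\
\mathrm{Var}\big[S_{sp,x}\big] &= E_{K_x}\Big[E\big[(S_{sp,x})^2 \mid K_x\big]\Big] - \big(E[S_{sp,x}]\big)^2 \\
&= \sum_{k=1}^{\lfloor \omega - x \rfloor} {}_{k|}q_x\, E\big[(S_{k})^2\big] - \big(E[S_{sp,x}]\big)^2,
\end{aligned}$$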


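where $S^{cl}_{k}$ and $S_k$ are defined as in (3.7) and (3.6) respectively. Thus it suffices to plug in
$$\begin{aligned}
E\big[S^{cl}_{k}\big] &= E\big[S_k\big] = \sum_{i=1}^{k} \alpha_i e^{-\mu_i + \frac{\sigma_i^2}{2}}, \\
E\big[(S^{cl}_{k})^2\big] &= \sum_{i=1}^{k} \sum_{j=1}^{k} \alpha_i \alpha_j e^{-\mu_i - \mu_j + \frac{1}{2}(\sigma_i^2 + \sigma_j^2) + r_{i,k} r_{j,k} \sigma_i \sigma_j}, \\
E\big[(S_{k})^2\big] &= \sum_{i=1}^{k} \sum_{j=1}^{k} \alpha_i \alpha_j e^{-\mu_i - \mu_j + \frac{1}{2}(\sigma_i^2 + \sigma_j^2) + \sigma_{ij}},
\end{aligned}$$
and
$$E\big[S_{sp,x}\big] = E\big[S^{cl}_{sp,x}\big] = \sum_{k=1}^{\lfloor \omega - x \rfloor} {}_{k|}q_x\, E\big[S_k\big] = \sum_{k=1}^{\lfloor \omega - x \rfloor} {}_{k|}q_x\, E\big[S^{cl}_{k}\big].$$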

Now one can compute distributions of the moment based approximations

from the formulas

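$$\begin{aligned}
F_{S^{m}_{sp,x}}(y) &= z_1 F_{S^{l}_{sp,x}}(y) + (1 - z_1) F_{S^{c}_{sp,x}}(y), \\
F_{S^{cm}_{sp,x}}(y) &= z_2 F_{S^{cl}_{sp,x}}(y) + (1 - z_2) F_{S^{c}_{sp,x}}(y)
\end{aligned}$$
and their corresponding stop-loss premiums as
$$\begin{aligned}
\pi^{m}(S_{sp,x}, d, \Gamma, \Lambda) &= z_1 \pi^{lb}(S_{sp,x}, d, \Gamma, \Lambda) + (1 - z_1) \pi^{cub}(S_{sp,x}, d), \\
\pi^{cm}(S_{sp,x}, d, \Gamma, \Lambda) &= z_2 \pi^{clb}(S_{sp,x}, d, \Gamma, \Lambda) + (1 - z_2) \pi^{cub}(S_{sp,x}, d),
\end{aligned}$$
where
$$z_1 = \frac{\mathrm{Var}[S^{c}_{sp,x}] - \mathrm{Var}[S_{sp,x}]}{\mathrm{Var}[S^{c}_{sp,x}] - \mathrm{Var}[S^{l}_{sp,x}]} \quad \text{and} \quad z_2 = \frac{\mathrm{Var}[S^{c}_{sp,x}] - \mathrm{Var}[S_{sp,x}]}{\mathrm{Var}[S^{c}_{sp,x}] - \mathrm{Var}[S^{cl}_{sp,x}]}.$$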


We examine the accuracy and efficiency of the derived approximations

for a single life annuity of a 65-year-old male person with yearly unit

payments. We restrict ourselves to the case of a Black & Scholes setting

(model BS) with drift µ = 0.05 and volatility σ = 0.1. We assume further

that the future lifetime T65 follows the Makeham-Gompertz law with the

corresponding coefficients of the Belgian analytic life table MR (see Section


3.2). We compare the distribution functions of the upper bound $S^{c}_{sp,65}$ and the lower bounds $S^{l}_{sp,65}$ and $S^{cl}_{sp,65}$, as described in the previous sections,

with the original distribution function of Ssp,65 based on extensive Monte

Carlo (MC) simulation. We generated 500 × 100 000 paths and for each

estimate we computed the standard error (s.e.). As is well-known, the

(asymptotic) 95% confidence interval is given by the estimate plus or minus

1.96 times the standard error. Note also that the random paths are based

on antithetic variables in order to reduce the variance. Notice that to

compute the lower bound we use as conditioning random variable Λ(m) =

Λ24 (the value j = 24 was found to be the one that maximizes the variance

as described in Section 3.3.1).

Figure 3.1 shows the cumulative distribution functions of the approx-

imations, compared to the empirical distribution. One can see that the

lower bound $S^{cl}_{sp,65}$ is almost indistinguishable from the original distribution. In order to have a better view on the behavior of the approximations in the tail, we consider a QQ-plot where the quantiles of $S^{l}_{sp,65}$, $S^{cl}_{sp,65}$ and $S^{c}_{sp,65}$ are plotted against the quantiles of $S_{sp,65}$ obtained by simulation. The different bounds will be good approximations if the plotted points $\big(F^{-1}_{S_{sp,65}}(p), F^{-1}_{S^{l}_{sp,65}}(p)\big)$, $\big(F^{-1}_{S_{sp,65}}(p), F^{-1}_{S^{cl}_{sp,65}}(p)\big)$ and $\big(F^{-1}_{S_{sp,65}}(p), F^{-1}_{S^{c}_{sp,65}}(p)\big)$ for all values of p in (0, 1) do not deviate too much from the line y = x. From the QQ-plot in Figure 3.2, we can conclude that the comonotonic upper bound slightly overestimates the tails of $S_{sp,65}$, whereas the accuracy of the lower bounds $S^{l}_{sp,65}$ and $S^{cl}_{sp,65}$ is extremely high; the corresponding QQ-plot is indistinguishable from a perfect straight line. These visual observations are confirmed by the numerical values of some upper quantiles displayed in Table 3.1, which also reports the moments based approximations $S^{m}_{sp,65}$ and $S^{cm}_{sp,65}$.

Stop-loss premiums for the different approximations are compared in

Figure 3.3 and Table 3.2. This study confirms the high accuracy of the

derived bounds. Note that for very high values of d the differences become larger, but these cases are of no practical importance. All Monte Carlo estimates are very close to $\pi^{clb}(S_{sp,65}, d, \Gamma, \Lambda)$ and some of them even turn out to be smaller than this lower bound. This not only

demonstrates the difficulty of estimating stop-loss premiums by simulation,

but it also indicates the accuracy of the lower bound πclb(Ssp,65, d,Γ,Λ).

Indeed, since the Monte Carlo estimate is based on random paths, it can be smaller than $\pi^{clb}(S_{sp,65}, d, \Gamma, \Lambda)$ and this is very likely to happen if the lower bound is close to the real stop-loss premium. Table 3.3 compares the stop-loss premium of the comonotonic upper bound with the partially exact/comonotonic upper bound $\pi^{pecub}(S_{sp,65}, d, \Lambda, \Gamma)$ (PECUB) and the two combination bounds $\pi^{eub}(S_{sp,65}, d, \Lambda, \Gamma)$ (EMUB) (upper bounds based on the lower bound $S^{l}_{sp,65}$) and $\pi^{min}(S_{sp,65}, d, \Lambda, \Gamma)$ (MIN). For the partial exact/comonotonic upper bound we use the same conditioning variable as for the lower bound $S^{cl}_{sp,65}$. Remark that the decomposition variable is of the form (2.55) with $\Lambda \equiv \Lambda_n$.

For the important retentions d = 5, 10, 15 and 20 the upper bound $\pi^{min}(S_{sp,65}, d, \Lambda, \Gamma)$ really improves the comonotonic upper bound. Notice that for the extreme cases the values are more or less the same.

p       S^l_{sp,65}   S^cl_{sp,65}   S^m_{sp,65}   S^cm_{sp,65}   S^c_{sp,65}   MC (s.e. × 10^3)
0.75    14.1741       14.1887        14.1750       14.1887        14.1867       14.1887 (0.978)
0.90    17.5905       17.5972        17.6250       17.6008        18.0797       17.5969 (1.420)
0.95    19.9565       19.9713        20.0232       19.9783        20.8754       19.9731 (1.896)
0.975   22.2495       22.2875        22.3559       22.2986        23.6574       22.2839 (2.816)
0.995   27.5124       27.6700        27.7498       27.6943        30.2983       27.6933 (6.324)

Table 3.1: [...] level p of $S_{sp,65}$.

d     S^l_{sp,65}   S^cl_{sp,65}   S^m_{sp,65}   S^cm_{sp,65}   S^c_{sp,65}   MC (s.e. × 10^4)
0     11.0944       11.0944        11.0944       11.0944        11.0944       11.0937 (9.43)
5     6.3715        6.3756         6.3721        6.3756         6.3792        6.3748 (8.67)
10    2.5956        2.6071         2.6029        2.6078         2.6900        2.6068 (5.89)
15    0.7151        0.7201         0.7265        0.7213         0.8629        0.7201 (0.34)
20    0.1628        0.1664         0.1698        0.1671         0.2536        0.1668 (0.21)
25    0.0357        0.0379         0.0388        0.0382         0.0758        0.0382 (0.10)
30    0.0080        0.0091         0.0092        0.0092         0.0239        0.0093 (0.02)
35    0.0019        0.0023         0.0024        0.0023         0.0081        0.0024 (0.004)

Table 3.2: [...] retention d of $S_{sp,65}$.


Figure 3.1: The cdf's of '$S_{sp,65}$' (MC) (solid grey line), $S^{l}_{sp,65}$ (•-line), $S^{cl}_{sp,65}$ (▲-line) and $S^{c}_{sp,65}$ (dashed line). Horizontal axis: outcome; vertical axis: cdf.

Figure 3.2: QQ-plot of the quantiles of $S^{l}_{sp,65}$ (◦), $S^{cl}_{sp,65}$ (△) and $S^{c}_{sp,65}$ versus those of '$S_{sp,65}$' (MC).


Figure 3.3: Stop-loss premiums for '$S_{sp,65}$' (MC) (solid grey line), $S^{l}_{sp,65}$ (•-line), $S^{cl}_{sp,65}$ (▲-line) and $S^{c}_{sp,65}$ (dashed line). Horizontal axis: outcome; vertical axis: stop-loss premium.

d     MIN       EMUB      PECUB     CUB       MC (s.e. × 10^4)
0     11.0944   11.0944   11.0944   11.0944   11.0937 (9.43)
5     6.3759    6.3761    6.3775    6.3792    6.3748 (8.67)
10    2.6153    2.6164    2.6523    2.6900    2.6068 (5.89)
15    0.7484    0.7532    0.8025    0.8629    0.7201 (0.34)
20    0.2066    0.2207    0.2331    0.2536    0.1668 (0.21)
25    0.0684    0.1009    0.0711    0.0758    0.0382 (0.10)
30    0.0223    0.0738    0.0223    0.0239    0.0093 (0.02)
35    0.0074    0.0672    0.0074    0.0081    0.0024 (0.004)

Table 3.3: Upper bounds for some selected stop-loss premiums with retention d of $S_{sp,65}$.


3.3.2 A homogeneous portfolio of life annuities

We consider now the distribution of the present value of a homogeneous

portfolio of N0 life annuities paying a fixed amount of αi (> 0) at the end

of each year i. This present value can be expressed by the formula

$$S_{pp,x} = \sum_{i=1}^{\lfloor \omega - x \rfloor} N_i\, \alpha_i e^{-Y(i)},$$
where $N_i$ denotes the number of survivors in year i and can be written as
$$N_i = I_{(T_x^{(1)} > i)} + I_{(T_x^{(2)} > i)} + \ldots + I_{(T_x^{(N_0)} > i)},$$
where $T_x^{(j)}$ denotes the future lifetime of the j-th insured. We assume that these random variables are mutually independent. So the random variables $N_i$ are binomially distributed with parameters $n = N_0$ and success parameter ${}_ip_x$.

Note that

$$S_{pp,x} = \sum_{j=1}^{N_0} S^{(j)}_{sp,x}, \tag{3.8}$$
with $S^{(j)}_{sp,x}$ given by
$$S^{(j)}_{sp,x} = \sum_{i=1}^{\lfloor \omega - x \rfloor} I_{(T_x^{(j)} > i)}\, \alpha_i e^{-Y(i)}.$$

The computation of the convex upper and lower bound for the case of a

portfolio of life annuities is more complicated than in the case of a single

life annuity. The binomially distributed random variables $N_i$ are not very useful in practical computations, because there exist no closed-form expressions for the cumulative and the inverse distribution functions. This problem can be dealt with by replacing the random variables $N_i$ by more convenient continuous approximations $\tilde{N}_i$. We propose to approximate the distribution of $N_i$ by the Normal Power Approximation (NPA). In contrast with a Normal approximation, this makes it possible to incorporate the skewness, which matters because the binomial distribution is very skewed (unless either the parameter n is very high or the success parameter p is close to 1/2). The distribution function of the NPA $\tilde{N}_i$ is given by the formula
$$F_{\tilde{N}_i}(x) = \Phi\left(-\frac{3}{\gamma_{N_i}} + \sqrt{\frac{9}{\gamma_{N_i}^2} + \frac{6(x - \mu_{N_i})}{\gamma_{N_i}\sigma_{N_i}} + 1}\,\right),$$
where
$$\begin{aligned}
\mu_{N_i} &= E[N_i] = N_0\, {}_ip_x, \\
\sigma^2_{N_i} &= \mathrm{Var}[N_i] = N_0\, {}_ip_x\, {}_iq_x, \\
\gamma_{N_i} &= \frac{E\big[(N_i - \mu_{N_i})^3\big]}{\sigma^3_{N_i}} = \frac{1 - 2\, {}_ip_x}{\sqrt{N_0\, {}_ip_x\, {}_iq_x}}.
\end{aligned}$$
Then the p-th quantile of $\tilde{N}_i$ is given by
$$F^{-1}_{\tilde{N}_i}(p) = \mu_{N_i} + \sigma_{N_i}\Phi^{-1}(p) + \frac{\gamma_{N_i}\sigma_{N_i}}{6}\Big(\big(\Phi^{-1}(p)\big)^2 - 1\Big). \tag{3.9}$$
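The NPA distribution function and the quantile formula (3.9) can be checked against each other numerically. The following Python sketch is an illustration only; the function names are hypothetical. One point goes beyond the formula as printed: for negative skewness (${}_ip_x > 1/2$) the square root is given the sign of $\gamma_{N_i}$, which is an assumption made here so that the distribution function remains the inverse of (3.9).

```python
import math
from statistics import NormalDist

_STD = NormalDist()


def npa_parameters(n0, p):
    """Mean, standard deviation and skewness of a Binomial(n0, p) count."""
    q = 1.0 - p
    sd = math.sqrt(n0 * p * q)
    return n0 * p, sd, (1.0 - 2.0 * p) / sd


def npa_quantile(u, n0, p):
    """Quantile of the NPA at level u, formula (3.9)."""
    mean, sd, gamma = npa_parameters(n0, p)
    z = _STD.inv_cdf(u)
    return mean + sd * z + gamma * sd / 6.0 * (z * z - 1.0)


def npa_cdf(x, n0, p):
    """Distribution function of the NPA at x."""
    mean, sd, gamma = npa_parameters(n0, p)
    if gamma == 0.0:  # symmetric case: plain Normal approximation
        return _STD.cdf((x - mean) / sd)
    inner = 9.0 / gamma ** 2 + 6.0 * (x - mean) / (gamma * sd) + 1.0
    root = math.sqrt(max(inner, 0.0))
    # Assumption: the root carries the sign of gamma (see lead-in).
    return _STD.cdf(-3.0 / gamma + math.copysign(root, gamma))


if __name__ == "__main__":
    # Round trip for a hypothetical portfolio of 1000 policies.
    for level in (0.05, 0.5, 0.95):
        x = npa_quantile(level, 1000, 0.3)
        print(level, x, npa_cdf(x, 1000, 0.3))
```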

The upper bound

The upper bound $S^{c}_{pp,x}$ is computed as described in Section 2.5.3. The only difference is that in the formulas (2.71), (2.72) and (2.75) $F^{-1}_{X_i}(u)$ has to be replaced by the approximation given in (3.9).

The lower bound

To compute the lower bound one has to choose two conditioning variables:

Γ and Λ. For the first conditioning random variable Γ we propose to take

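$N_{i_0}$, the number of policies-in-force in the year $i_0$. Note that
$$E\big[N_i \mid N_{i_0} = n_0\big] = {}_{i-i_0}p_{x+i_0}\, n_0 \quad \text{for } i \ge i_0.$$
For $i < i_0$, $\Pr[N_i = n \mid N_{i_0} = n_0]$ can be computed from Bayes' theorem. As a result one gets the following formula for the conditional expectation:
$$\begin{aligned}
E\big[N_i \mid N_{i_0} = n_0\big] &= \sum_{k=n_0}^{N_0} k\, \frac{\Pr[N_{i_0} = n_0 \mid N_i = k]\, \Pr[N_i = k]}{\Pr[N_{i_0} = n_0]} \\
&= \sum_{k=n_0}^{N_0} k\, \frac{\binom{k}{n_0}\binom{N_0}{k}}{\binom{N_0}{n_0}}\, \frac{{}_{i_0-i}p_{x+i}^{\,n_0}\; {}_{i_0-i}q_{x+i}^{\,k-n_0}\; {}_ip_x^{\,k}\; {}_iq_x^{\,N_0-k}}{{}_{i_0}p_x^{\,n_0}\; {}_{i_0}q_x^{\,N_0-n_0}} \\
&= \sum_{k=n_0}^{N_0} k \binom{N_0 - n_0}{k - n_0} \frac{{}_ip_x^{\,k-n_0}\; {}_{i_0-i}q_{x+i}^{\,k-n_0}\; {}_iq_x^{\,N_0-k}}{{}_{i_0}q_x^{\,N_0-n_0}}.
\end{aligned}$$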


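For mathematical convenience we rewrite this formula for non-integer values of $N_{i_0}$ as follows
$$E\big[N_i \mid N_{i_0} = y\big] = \sum_{k=\lceil y \rceil}^{N_0} k \binom{N_0 - \lceil y \rceil}{k - \lceil y \rceil} \frac{{}_ip_x^{\,k-\lceil y \rceil}\; {}_{i_0-i}q_{x+i}^{\,k-\lceil y \rceil}\; {}_iq_x^{\,N_0-k}}{{}_{i_0}q_x^{\,N_0-\lceil y \rceil}}. \tag{3.10}$$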
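Formula (3.10) can be evaluated as in the following Python sketch, offered as an illustration under stated assumptions. The function name and the argument `surv` (a list with `surv[t]` equal to ${}_tp_x$) are hypothetical. The terms are computed on a log scale via `math.lgamma` to avoid overflow of the binomial coefficients for large portfolios, and the case $i \ge i_0$ uses y directly, which is an assumption since the text only rewrites the case $i < i_0$ for non-integer values. The sketch requires $1 \le i$ so that ${}_iq_x > 0$.

```python
import math


def expected_inforce(i, i0, y, n_total, surv):
    """E[N_i | N_{i0} = y], with surv[t] = t_p_x and n_total = N_0."""
    if i >= i0:
        # (i - i0)_p_{x+i0} * y; using y itself here is an assumption.
        return y * surv[i] / surv[i0]
    n0 = math.ceil(y)
    m = n_total - n0               # policies no longer in force in year i0
    a = surv[i] - surv[i0]         # i_p_x * (i0-i)_q_{x+i}
    b = 1.0 - surv[i]              # i_q_x
    c = 1.0 - surv[i0]             # i0_q_x
    total = 0.0
    for k in range(n0, n_total + 1):
        j = k - n0
        log_term = (math.lgamma(m + 1) - math.lgamma(j + 1) - math.lgamma(m - j + 1)
                    + j * math.log(a) + (m - j) * math.log(b) - m * math.log(c))
        total += k * math.exp(log_term)
    return total


if __name__ == "__main__":
    surv = [1.0, 0.9, 0.8, 0.7]    # hypothetical survival probabilities
    print(expected_inforce(1, 3, 30, 50, surv))
```

Since $a + b = c$, the summands are k times a binomial probability with $N_0 - \lceil y \rceil$ trials and success probability $a/c$, so the sum equals $\lceil y \rceil + (N_0 - \lceil y \rceil)\,a/c$. This closed form is a convenient check on the implementation.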

We propose to take $\Lambda^{(a)}$, as defined in Section 3.3.1, for the second conditioning random variable Λ. Now one can perform step by step the computations described in Subsection 2.5.3 with the only exception that $E[X_i \mid \Gamma = \gamma]$ has to be replaced in the formulas (2.80) and (2.81) by $E[N_i \mid N_{i_0} = y]$ in (3.10).

Also the stop-loss premiums are calculated according to the methodology presented in Section 5.3 and 2.5.3, the only difference being the replacement of $E[X_i \mid \Gamma = F^{-1}_{\Gamma}(u)]$ in formula (2.83) by the approximation given in (3.10).


As in the case of a single life annuity, the only problem in the computation of the weight z given by (2.66) is to find expressions for the variances of $S^{c}_{pp,x}$, $S^{l}_{pp,x}$ and $S_{pp,x}$. For the upper and the lower bound we have deployed a procedure, described in Section 2.5.3, with $f_i(u)$ replaced by
$$f_i(u) = F^{-1}_{N_i}(u) \quad \text{for the upper bound}$$
and
$$f_i(u) = E\big[N_i \mid N_{i_0} = F^{-1}_{N_{i_0}}(u)\big] \quad \text{for the lower bound.}$$

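The variance of $S_{pp,x}$ can be computed from (3.8) and by noticing that, given the returns $\vec{Y} = (Y(1), \ldots, Y(\lfloor \omega - x \rfloor))$, the random variables $S^{(1)}_{sp,x}, S^{(2)}_{sp,x}, \ldots, S^{(N_0)}_{sp,x}$ are conditionally independent. Hence, we have that
$$\begin{aligned}
\mathrm{Var}\big[S_{pp,x}\big] &= E_{\vec{Y}}\Big[\mathrm{Var}\big[S_{pp,x} \mid \vec{Y}\big]\Big] + \mathrm{Var}_{\vec{Y}}\Big[E\big[S_{pp,x} \mid \vec{Y}\big]\Big] \\
&= N_0\, E_{\vec{Y}}\Big[\mathrm{Var}\big[S_{sp,x} \mid \vec{Y}\big]\Big] + N_0^2\, \mathrm{Var}_{\vec{Y}}\Big[E\big[S_{sp,x} \mid \vec{Y}\big]\Big] \\
&= N_0\, \mathrm{Var}\big[S_{sp,x}\big] + (N_0^2 - N_0)\, \mathrm{Var}_{\vec{Y}}\Big[E\big[S_{sp,x} \mid \vec{Y}\big]\Big],
\end{aligned}$$
where $\mathrm{Var}[S_{sp,x}]$ is calculated in Subsection 3.3.1 and
$$\mathrm{Var}_{\vec{Y}}\Big[E\big[S_{sp,x} \mid \vec{Y}\big]\Big] = \sum_{i=1}^{\lfloor \omega - x \rfloor} \sum_{j=1}^{\lfloor \omega - x \rfloor} {}_ip_x\, {}_jp_x\, \alpha_i \alpha_j e^{-\mu_i - \mu_j + \frac{\sigma_i^2 + \sigma_j^2}{2}} \big(e^{\sigma_{ij}} - 1\big).$$

p       S^l_{pp,65}   S^m_{pp,65}   S^c_{pp,65}   MC (s.e.)
0.75    12 574        12 577        12 821        12 577 (3.90)
0.90    14 565        14 574        15 290        14 568 (5.08)
0.95    15 937        15 951        17 029        15 947 (8.15)
0.975   17 252        17 272        18 722        17 276 (8.80)
0.995   20 209        20 250        22 620        20 242 (22.09)

Table 3.4: [...] level p of $S_{pp,65}$.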


To test the quality of the derived approximations we present a numerical illustration similar to that of Subsection 3.3.1. As before we work in a Black & Scholes setting with drift µ = 0.05 and volatility σ = 0.1 and we use the Makeham-Gompertz law to describe the mortality process of 65-year-old male persons. We compare the performance of the lower bound $S^{l}_{pp,65}$, the upper bound $S^{c}_{pp,65}$ and the moments based approximation $S^{m}_{pp,65}$ with the real value $S_{pp,65}$, obtained by extensive simulation, for a portfolio of 1 000 policies. The number of policies-in-force after the first year $N_1$ is taken as the conditioning random variable Γ for the lower bound. This choice seems reasonable to us; other choices improve the performance of the lower bound only slightly, at the cost of a significant increase in computational time. The Monte Carlo (MC) study of $S_{pp,65}$ is based on 30 × 50 000 simulated paths. Antithetic variables are used in order to reduce the variance of the Monte Carlo estimates.

The quality of the approximations is illustrated in Figures 3.4 and 3.5. One can see that the lower bound $S^{l}_{pp,65}$ indeed performs very well. The fit of the upper bound is a bit poorer but still reasonable. The moments based approximation $S^{m}_{pp,65}$ performs extremely well. These visual observations are confirmed by the numerical values of some upper quantiles displayed in Table 3.4 and by the study of stop-loss premiums in Figure 3.6 and in Table 3.5.


Figure 3.4: The cdf's of '$S_{pp,65}$' (MC) (solid grey line), $S^{l}_{pp,65}$ (•-line), $S^{m}_{pp,65}$ (▲-line) and $S^{c}_{pp,65}$ (dashed line). Horizontal axis: outcome; vertical axis: cdf.

Figure 3.5: QQ-plot of the quantiles of $S^{l}_{pp,65}$ (◦), $S^{m}_{pp,65}$ (△) and $S^{c}_{pp,65}$ versus those of '$S_{pp,65}$' (MC).


Figure 3.6: Stop-loss premiums for '$S_{pp,65}$' (MC) (solid grey line), $S^{l}_{pp,65}$ (•-line), $S^{m}_{pp,65}$ (▲-line) and $S^{c}_{pp,65}$ (dashed line). Horizontal axis: outcome; vertical axis: stop-loss premium.

d        S^l_{pp,65}   S^m_{pp,65}   S^c_{pp,65}   MC (s.e.)
0        11 094        11 094        11 094        11 098 (2.11)
5 000    6 094         6 094         6 095         6 098 (2.10)
10 000   1 608         1 610         1 793         1 611 (1.95)
15 000   153.7         155.3         278.4         155.3 (1.78)
20 000   10.23         10.57         36.02         10.67 (1.26)
25 000   0.680         0.734         4.816         0.743 (0.09)
30 000   0.051         0.059         0.711         0.036 (0.02)

Table 3.5: [...] retention d of $S_{pp,65}$.


3.3.3 An ‘average’ portfolio of life annuities

As explained in Section 3.2 in the case of large portfolios of life annuities

it suffices to compute risk measures of an ‘average’ portfolio given by

$$S_{app,x} = \sum_{i=1}^{\lfloor \omega - x \rfloor} {}_ip_x\, \alpha_i e^{-Y(i)},$$
where we assume that the payments $\alpha_i$ are positive and due at times $i = 1, \ldots, \lfloor \omega - x \rfloor$ (payable at the end of each year). Notice that $S_{app,x}$ is of the form (2.29) and that $S_{app,x} = E[S_{sp,x} \mid Y(1), \ldots, Y(\lfloor \omega - x \rfloor)]$. Comonotonic approximations for this type of sum have been studied extensively by Kaas et al. (2000), Dhaene et al. (2002a,b), Vyncke (2003), Darkiewicz (2005b) and Vanduffel (2005), among others.

It turns out that for this application the conditioning variable of the

‘maximal variance’ form gives very accurate results. This means that we

define Λ here as
$$\Lambda = \sum_{i=1}^{\lfloor \omega - x \rfloor} {}_ip_x\, \alpha_i e^{-\mu_i + \frac{1}{2}\sigma_i^2}\, Y(i).$$
Notice that this conditioning variable could also be used in order to derive the lower bound for a single life annuity.

To compute the comonotonic approximations for the quantiles and stop-loss premiums, notice that the correlations $r_i$ are given by
$$r_i = \mathrm{Corr}(Y(i), \Lambda) = \frac{\mathrm{Cov}[Y(i), \Lambda]}{\sigma_i \sigma_\Lambda}.$$

Because all correlation coefficients ri are positive, we have seen that the

lower bound is a comonotonic sum (all the terms in the sum are non-

decreasing functions of the same standard uniform random variable U).

This implies that the quantiles related to the lower and upper bound can

be computed by summing the corresponding quantiles for the marginals

involved. We find the following expressions for the quantiles and stop-loss

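premiums of $S^{l}_{app,x}$ and $S^{c}_{app,x}$:
$$\begin{aligned}
F^{-1}_{S^{l}_{app,x}}(p) &= \sum_{i=1}^{\lfloor \omega - x \rfloor} {}_ip_x\, \alpha_i e^{-\mu_i + r_i\sigma_i\Phi^{-1}(p) + \frac{1}{2}(1 - r_i^2)\sigma_i^2}, \\
F^{-1}_{S^{c}_{app,x}}(p) &= \sum_{i=1}^{\lfloor \omega - x \rfloor} {}_ip_x\, \alpha_i e^{-\mu_i + \mathrm{sign}({}_ip_x\alpha_i)\sigma_i\Phi^{-1}(p)}, \\
\pi^{lb}(S_{app,x}, d, \Lambda) &= \sum_{i=1}^{\lfloor \omega - x \rfloor} {}_ip_x\, \alpha_i e^{-\mu_i + \frac{\sigma_i^2}{2}}\, \Phi\Big[r_i\sigma_i - \Phi^{-1}\big(F_{S^{l}_{app,x}}(d)\big)\Big] - d\big(1 - F_{S^{l}_{app,x}}(d)\big), \\
\pi^{cub}(S_{app,x}, d) &= \sum_{i=1}^{\lfloor \omega - x \rfloor} {}_ip_x\, \alpha_i e^{-\mu_i + \frac{\sigma_i^2}{2}}\, \Phi\Big[\mathrm{sign}({}_ip_x\alpha_i)\sigma_i - \Phi^{-1}\big(F_{S^{c}_{app,x}}(d)\big)\Big] - d\big(1 - F_{S^{c}_{app,x}}(d)\big).
\end{aligned}$$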

To calculate the moments based approximation we need the expressions

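for the variances of $S_{app,x}$, $S^{l}_{app,x}$ and $S^{c}_{app,x}$. These are given by
$$\begin{aligned}
\mathrm{Var}[S_{app,x}] &= \sum_{i=1}^{\lfloor \omega - x \rfloor} \sum_{j=1}^{\lfloor \omega - x \rfloor} {}_ip_x\, {}_jp_x\, \alpha_i \alpha_j e^{-\mu_i - \mu_j + \frac{\sigma_i^2 + \sigma_j^2}{2}} \big(e^{\sigma_{ij}} - 1\big), \\
\mathrm{Var}[S^{l}_{app,x}] &= \sum_{i=1}^{\lfloor \omega - x \rfloor} \sum_{j=1}^{\lfloor \omega - x \rfloor} {}_ip_x\, {}_jp_x\, \alpha_i \alpha_j e^{-\mu_i - \mu_j + \frac{\sigma_i^2 + \sigma_j^2}{2}} \big(e^{r_i r_j \sigma_i \sigma_j} - 1\big), \\
\mathrm{Var}[S^{c}_{app,x}] &= \sum_{i=1}^{\lfloor \omega - x \rfloor} \sum_{j=1}^{\lfloor \omega - x \rfloor} {}_ip_x\, {}_jp_x\, \alpha_i \alpha_j e^{-\mu_i - \mu_j + \frac{\sigma_i^2 + \sigma_j^2}{2}} \big(e^{\sigma_i \sigma_j} - 1\big).
\end{aligned}$$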
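The quantities of this subsection can be assembled as in the following Python sketch. It is a minimal illustration with hypothetical function names: `weights[i]` stands for ${}_ip_x\alpha_i$ (assumed positive), `cov` is the covariance matrix of the returns supplied by the caller, and the moment weight is computed as $z = (\mathrm{Var}[S^{c}] - \mathrm{Var}[S]) / (\mathrm{Var}[S^{c}] - \mathrm{Var}[S^{l}])$, mirroring the definition of $z_1$ in Subsection 3.3.1.

```python
import math
from statistics import NormalDist

_STD = NormalDist()


def average_portfolio_summary(weights, mu, cov):
    """Return (r, var_lower, var_exact, var_upper) for S_app,x."""
    n = len(weights)
    sigma = [math.sqrt(cov[i][i]) for i in range(n)]
    # beta_i = i_p_x * alpha_i * exp(-mu_i + sigma_i^2 / 2): mean of term i
    # and coefficient of Y(i) in the conditioning variable Lambda.
    beta = [weights[i] * math.exp(-mu[i] + 0.5 * cov[i][i]) for i in range(n)]
    cov_y_lam = [sum(beta[j] * cov[i][j] for j in range(n)) for i in range(n)]
    sd_lam = math.sqrt(sum(beta[i] * cov_y_lam[i] for i in range(n)))
    r = [cov_y_lam[i] / (sigma[i] * sd_lam) for i in range(n)]

    def variance(exponent):
        return sum(beta[i] * beta[j] * math.expm1(exponent(i, j))
                   for i in range(n) for j in range(n))

    var_exact = variance(lambda i, j: cov[i][j])
    var_lower = variance(lambda i, j: r[i] * r[j] * sigma[i] * sigma[j])
    var_upper = variance(lambda i, j: sigma[i] * sigma[j])
    return r, var_lower, var_exact, var_upper


def quantile_lower(p, weights, mu, cov, r):
    """p-quantile of the lower bound S^l_app,x (all r_i positive)."""
    z = _STD.inv_cdf(p)
    return sum(w * math.exp(-m + ri * math.sqrt(c[i]) * z + 0.5 * (1.0 - ri * ri) * c[i])
               for i, (w, m, c, ri) in enumerate(zip(weights, mu, cov, r)))


def quantile_upper(p, weights, mu, cov):
    """p-quantile of the comonotonic upper bound S^c_app,x."""
    z = _STD.inv_cdf(p)
    return sum(w * math.exp(-m + math.sqrt(c[i]) * z)
               for i, (w, m, c) in enumerate(zip(weights, mu, cov)))


if __name__ == "__main__":
    # Hypothetical inputs for illustration only.
    n = 10
    weights = [0.98 ** i for i in range(1, n + 1)]
    mu = [0.045 * i for i in range(1, n + 1)]
    cov = [[0.01 * min(i, j) for j in range(1, n + 1)] for i in range(1, n + 1)]
    r, vl, ve, vu = average_portfolio_summary(weights, mu, cov)
    print((vu - ve) / (vu - vl))   # weight z of the moments based approximation
```

The inputs in the usage block are not the parameters behind the tables in this chapter; they only demonstrate the call pattern.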

3.3.4 A numerical illustration

In this subsection we illustrate our findings numerically and graphically.

We use the same parameters for the financial and mortality process as in

the two previous illustrations, namely a Black & Scholes model for the

returns with µ = 0.05, σ = 0.1 and the Makeham-Gompertz law with

corresponding coefficients of the Belgian analytic life table MR. We will

compare the different approximations for quantiles and stop-loss premiums

with the values obtained by Monte Carlo simulation (MC). The simulation


results are based on generating 500 × 100000 random paths. The estimates

obtained from this time-consuming simulation will serve as benchmark.

The random paths are based on antithetic variables in order to reduce the

variance of the Monte Carlo estimates.
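The variance reduction idea can be sketched on a deliberately simplified problem: the stop-loss premium of a single lognormal variable rather than the full annuity sum. The Python sketch below is not the simulation set-up used for this illustration; the function names, parameter values and seed are assumptions. Each standard normal draw z is paired with its antithetic counterpart −z, and the closed-form lognormal stop-loss premium serves as a check.

```python
import random
from math import exp, log
from statistics import NormalDist


def stop_loss_antithetic(mu, sigma, d, n_pairs, seed=0):
    """Monte Carlo estimate of E[(X - d)+] for X = exp(mu + sigma * Z), using antithetic pairs."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_pairs):
        z = rng.gauss(0.0, 1.0)
        pay_plus = max(exp(mu + sigma * z) - d, 0.0)   # payoff for the draw z
        pay_minus = max(exp(mu - sigma * z) - d, 0.0)  # payoff for the antithetic draw -z
        total += 0.5 * (pay_plus + pay_minus)
    return total / n_pairs


def stop_loss_exact(mu, sigma, d):
    """Closed-form stop-loss premium of a lognormal variable (d > 0)."""
    phi = NormalDist().cdf
    d1 = (mu + sigma ** 2 - log(d)) / sigma
    return exp(mu + 0.5 * sigma ** 2) * phi(d1) - d * phi(d1 - sigma)


mc_estimate = stop_loss_antithetic(0.0, 0.25, 1.0, 20000)
exact_value = stop_loss_exact(0.0, 0.25, 1.0)
```

Because the two payoffs in a pair are negatively correlated, their average has a smaller variance than the average of two independent draws.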

Figure 3.7 shows the distribution functions of the lower bound $S^l_{app,65}$,
the upper bound $S^c_{app,65}$, the moments based approximation $S^m_{app,65}$ and the
simulated one $S_{app,65}$. Again the lower bound and the moments based
approximation prove to be very good approximations for the real cumulative
distribution function of $S_{app,65}$. To assess the accuracy of the bounds in
the tails, we plot their quantiles against those of $S_{app,65}$ in Figure 3.8. The
largest quantile (p = 0.995) of $S^m_{app,65}$ in this QQ-plot underestimates the
exact quantile by only 0.06%. Table 3.6 shows the numerical values for
some high quantiles. The stop-loss premiums for different choices of d are
shown in Figure 3.9 and in Table 3.7. The lower bound and the moments
based approximation give very accurate results compared to the real value
of the stop-loss premium. The comonotonic upper bound performs rather
badly for some retentions. However, using the results of Chapter 2, we can
construct sharper upper bounds than the traditional comonotonic upper
bounds.

In Table 3.8 we compare the stop-loss premium of the comonotonic upper
bound with the partially exact/comonotonic upper bound
$\pi^{pecub}(S_{app,65}, d, \Lambda)$ (PECUB) and the two upper bounds based on the
lower bound $S^l_{app,65}$ plus an error term that is either dependent on the
retention, $\pi^{deub}(S_{app,65}, d, \Lambda)$ (DEUB), or independent of the
retention, $\pi^{eub}(S_{app,65}, d, \Lambda)$ (EUB). For the partially
exact/comonotonic upper bound we use the same conditioning variable
as for the lower bound $S^l_{app,65}$. The decomposition variable used in this
illustration is given by

\[
d_\Lambda = d - \sum_{i=1}^{\lfloor \omega - x \rfloor} {}_{i}p_{x}\, \alpha_i\, e^{-\mu_i + \frac{\sigma_i^2}{2}} \left( 1 + \mu_i - \frac{1}{2}\sigma_i^2 \right).
\]

The results for the different upper bounds are in line with the previous ones
for a single life annuity. Note that for very high values of d the differences
become larger, but these cases are of no practical importance.
We can conclude that in both cases the upper bound based on the lower
bound plus an error term dependent on the retention, $\pi^{deub}(\cdot, d, \Lambda)$, performs
very well.


p        S^l_{app,65}   S^m_{app,65}   S^c_{app,65}   MC (s.e. × 10^4)
0.75     12.5745        12.5760        12.8192        12.574  (0.03)
0.90     14.5649        14.5698        15.2819        14.5699 (0.07)
0.95     15.9364        15.9444        17.0152        15.9448 (0.14)
0.975    17.2513        17.2628        18.703         17.2683 (0.24)
0.995    20.2073        20.2303        22.5847        20.2425 (1.58)

Table 3.6: Quantiles with probability level p of $S_{app,65}$.

d     S^l_{app,65}   S^m_{app,65}   S^c_{app,65}   MC (s.e. × 10^4)
0     11.0944        11.0944        11.0944        11.0948 (8.22)
5     6.0945         6.0945         6.0951         6.0948  (7.67)
10    1.6081         1.6094         1.7910         1.6097  (4.45)
15    0.1536         0.1545         0.2766         0.1549  (1.01)
20    0.0102         0.0104         0.0355         0.0105  (0.31)
25    0.0007         0.0007         0.0047         0.0007  (0.01)

Table 3.7: Stop-loss premiums with retention d of $S_{app,65}$.

Notice that for the retention d = 0 all values (except the value for EUB,
because there the error term is independent of the retention) in both tables
are identical and equal to 11.0944. This follows from the fact that in this
case the expected value of $S_{sp,65}$ equals the expected value of $S_{app,65}$. Note
also that the values in Tables 3.2 and 3.3 are typically larger than the
corresponding values in Tables 3.7 and 3.8. This is not surprising. From
Remark 5 it immediately follows that $S_{app,65} \le_{cx} S_{sp,65}$ and hence for any
retention d > 0 one has

\[
\pi(S_{app,65}, d) \le \pi(S_{sp,65}, d).
\]


Figure 3.7: The cdf’s of ‘$S_{app,65}$’ (MC) (solid grey line), $S^l_{app,65}$ (•-line),
$S^m_{app,65}$ (▲-line) and $S^c_{app,65}$ (dashed line). Horizontal axis: outcome; vertical axis: cdf.

Figure 3.8: QQ-plot of the quantiles of $S^l_{app,65}$ (◦), $S^m_{app,65}$ (△) and $S^c_{app,65}$
versus those of ‘$S_{app,65}$’ (MC).


Figure 3.9: Stop-loss premiums for ‘$S_{app,65}$’ (MC) (solid grey line),
$S^l_{app,65}$ (•-line), $S^m_{app,65}$ (▲-line) and $S^c_{app,65}$ (dashed line). Horizontal axis: outcome; vertical axis: stop-loss premium.

d     EUB       DEUB      PECUB     CUB       MC (s.e. × 10^4)
0     11.1652   11.0944   11.0944   11.0944   11.0948 (8.22)
5     6.1653    6.0948    6.0948    6.0951    6.0948  (7.67)
10    1.6789    1.6240    1.6980    1.7910    1.6097  (4.45)
15    0.2244    0.2144    0.2559    0.2766    0.1549  (1.01)
20    0.0810    0.0809    0.0328    0.0355    0.0105  (0.31)
25    0.0715    0.0715    0.0041    0.0047    0.0007  (0.01)

Table 3.8: Upper bounds for some selected stop-loss premiums with
retention d of $S_{app,65}$.


3.4 Conclusion

In this chapter we studied the case of life annuities. The aggregate distri-

bution function of such stochastic sums of dependent random variables is

very difficult to calculate. Usually it is only possible to get formulae for the

first couple of moments. To compute more cumbersome risk measures, like

stop-loss premiums or upper quantiles, one has to rely on time consuming

simulations.

We derived comonotonicity based approximations both for the case of
a single life annuity and for a homogeneous portfolio of life annuities. The
numerical illustrations confirm the very high accuracy of the bounds
(especially the lower bound). These observations are confirmed by the results
for the stop-loss premiums. One may get the impression that the upper
bound, which performs worse than the lower bound in all cases, is
not worth studying. In actuarial applications, however, the upper
bound deserves a lot of attention because one is usually interested in
conservative estimates of the quantities of interest. Indeed, when an actuary
calculates reserves, he has to take into account some additional sources of
uncertainty, such as the choice of the interest rate model, the estimation
of parameters, the assumptions about mortality, the longevity risk and
many others. For these reasons the estimates provided by the upper bound in
convex order can in many cases be more appropriate than the more accurate
approximations obtained from the lower bound in convex order.

Chapter 4

Reserving in non-life

insurance business

Summary In this chapter we present some methods to set up confidence

bounds for the discounted IBNR reserve. We first model the claim pay-

ments by means of a lognormal and a loglinear location-scale regression

model. We further extend this to the class of generalized linear models.

The knowledge of the distribution function of the discounted IBNR re-

serve will help us to determine the initial reserve, e.g. through the quantile

risk measure. The results are based on the comonotonic approximations

explained in Chapter 2.

4.1 Introduction

To get the correct picture of its liabilities, a company should set aside the

correctly estimated amount of money to meet claims arising in the future

on the written policies. The past data used to construct estimates for the

future payments consist of a triangle of incremental claims Yij , as depicted

in Figure 4.1. This is the simplest shape of data that can be obtained and

it avoids having to introduce complicated notation to cope with all possible

situations. We use the standard notation, with the random variables Yij for

i = 1, 2, . . . , t; j = 1, 2, . . . , s denoting the claim figures for year of origin (or

accident year) i and development year j, meaning that the claim amounts

were paid in calendar year i+j−1. Year of origin, year of development and

calendar year act as possible explanatory variables for the observation Yij .


Year of    Development year
origin     1      2      ···    j      ···    t−1       t
1          Y11    Y12    ···    Y1j    ···    Y1,t−1    Y1t
2          Y21    Y22    ···    Y2j    ···    Y2,t−1
⋮          ···    ···    ···    ···    ···
i          Yi1    ···    ···    Yij
⋮          ···    ···    ···
t          Yt1

Figure 4.1: Random variables in a run-off triangle

Most claims reserving methods assume that t = s. For (i, j) combinations

with i + j ≤ t+ 1, Yij has already been observed, otherwise it is a future

observation. Next to claims actually paid, these figures can also be used

to denote quantities such as loss ratios. To a large extent, it is irrelevant

whether incremental or cumulative data are used when considering claims

reserving in a stochastic context.
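The index convention can be made concrete with a short Python sketch that splits the cells of a t × t square into observed and future ones. The function name is hypothetical; the rule i + j ≤ t + 1 and the calendar year i + j − 1 are taken from the text above.

```python
def split_cells(t):
    """Return the observed cells (i + j <= t + 1) and the future cells of a t x t square."""
    cells = [(i, j) for i in range(1, t + 1) for j in range(1, t + 1)]
    observed = [(i, j) for i, j in cells if i + j <= t + 1]
    future = [(i, j) for i, j in cells if i + j > t + 1]
    return observed, future


def calendar_year(i, j):
    """Calendar year in which the claims of origin year i and development year j are paid."""
    return i + j - 1


observed, future = split_cells(4)
latest_diagonal = [cell for cell in observed if calendar_year(*cell) == 4]
```

For t = 4 this gives t(t + 1)/2 = 10 observed cells and 6 future cells, and the latest diagonal consists of the four cells paid in calendar year 4.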

We consider annual development (the methods can be extended easily

to semi-annual, quarterly or monthly development) and we assume that

the time it takes for the claims to be completely paid is fixed and known.

The triangle is augmented each year by the addition of a new diagonal.

The purpose is to complete this run-off triangle to a square, or to

a rectangle if estimates are required pertaining to development years of

which no data are recorded in the run-off triangle at hand. To this end, the

actuary can make use of a variety of techniques. The inherent uncertainty

is described by the distribution of possible outcomes, and one needs to

arrive at the best estimate of the reserve.

The choice of an appropriate statistical model is an important matter.

Furthermore within a stochastic framework, there is considerable flexibil-

ity in the choice of predictor structures. In England & Verrall (2002) the

reader finds an excellent review of possible stochastic models. An appro-

priate model will enable the calculation of the distribution of the reserve

that reflects the process variability producing the future payments, and

accounts for the estimation error and statistical uncertainty (in the sense

given in Taylor & Ashe (1983)). It is necessary to be able to estimate the

variability of claims reserves, and ideally to be able to estimate the full
distribution of possible outcomes so that percentiles (or other risk measures

of this distribution) can be obtained. Next, recognizing the estimation

error involved with the parameter estimates, confidence intervals for these

measures constitute another desirable part of the output.

Here, putting the emphasis on the discounting aspect of the reserve,

we first consider simple lognormal linear models. Doray (1996) studied

the loglinear models extensively, taking into account the estimation error

on the parameters and the statistical prediction error in the model. Such

models have some significant disadvantages. Predictions from this model

can yield unusable results, and we need to impose that each incremental

value should be greater than zero. So, it is not possible to model negative

or zero claims. From the nature of the claims reserving problem, it is

expected that a higher proportion of zeros would be observed in the later

stages of the incremental loss data triangle. In reinsurance, zero claims

are also frequently observed in incremental loss data triangles for excess

layers. Negative incremental values can result from salvage recoveries,
payments from third parties, total or partial cancellation of outstanding
claims (due to initial overestimation of the loss or to a jury
decision in favor of the insurer), rejection by the insurer, or just plain errors.

In Goovaerts & Redant (1999) a lognormal linear regression model is used

to model the random fluctuations in the direction of the calendar years,

taking into account the apparatus of financial mathematics. The results are

based on supermodularity order, such that, in the framework of stop-loss

ordering one obtains the distribution of the IBNR reserve corresponding

to an extremal element in this ordering, when some marginals are fixed.

The lognormal linear model is a member of the broader class of loglinear
location-scale regression models. In Doray (1994) the reader can find
an overview of many characteristics of the different distributions in
this class. The logarithm of the error is assumed to follow certain known

distributions (normal, extreme value, generalized loggamma, logistic and

log inverse Gaussian). Doray studied these models extensively. He has de-

rived certain theoretical properties of these distributions and proved that

the MLE’s of the regression and scale parameters exist and are unique,

when the error has a log-concave density.

Claim sizes can often be described by distributions with a subexpo-

nential right tail. Furthermore, the phenomena to be modelled are rarely

additive in the collateral data. A multiplicative model is much more plausible.
These problems cannot be solved by working with ordinary linear
models, but they can be handled with generalized linear models. The generalization is twofold.

First, it is allowed that the random deviations from the mean follow an-

other distribution than the normal. In fact, one can take any distribution

from the exponential dispersion family, including for instance the Poisson,

the binomial, the gamma and the inverse Gaussian distributions. Second,

it is no longer necessary that the mean of the random variable is a linear

function of the explanatory variables, but it only has to be linear on a

certain scale. If this scale for instance is logarithmic, we have in fact a

multiplicative model instead of an additive model.
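The last point can be illustrated with a small Python sketch. It assumes, purely for illustration, a linear predictor of the form α_i + β_j on the logarithmic scale; the function name and the parameter values are hypothetical.

```python
from math import exp, isclose


def mean_with_log_link(alpha_i, beta_j):
    """Mean implied by a log link: log(mean) = alpha_i + beta_j."""
    return exp(alpha_i + beta_j)


alpha = [4.6, 4.7]   # hypothetical row (year of origin) effects
beta = [0.0, -0.5]   # hypothetical column (development year) effects

# On the original scale the mean factorizes into a row factor times a column factor.
factorized = exp(alpha[1]) * exp(beta[1])
direct = mean_with_log_link(alpha[1], beta[1])

# The ratio between two development years is the same for every year of origin.
ratio_row_0 = mean_with_log_link(alpha[0], beta[1]) / mean_with_log_link(alpha[0], beta[0])
ratio_row_1 = mean_with_log_link(alpha[1], beta[1]) / mean_with_log_link(alpha[1], beta[0])
```

An additive predictor on the log scale therefore corresponds to a multiplicative structure for the expected claims.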

Loss reserving deals with the determination of the (characteristics of

the) d.f. of the random present value of an unknown amount of future

payments. Since this d.f. is very important for an insurance company and

its policyholders, these inherent uncertainties are no excuse for providing

anything less than a rigorous scientific analysis. In order for the reserve

estimate truly to represent the actuary’s “best estimate” of the needed

reserve, both the determination of the expected value of unpaid losses and

the appropriate discount should reflect the actuary’s best estimates (i.e.

should not be dictated by others or by regulatory requirements). Since

the reserve is a provision for the future payment of unpaid losses, we be-

lieve the estimated loss reserve should reflect the time value of money. In

many situations this discounted reserve is useful, for example dynamic fi-

nancial analysis, assessing profitability and pricing, identifying risk based

capital needs, loss portfolio transfers, profit testing, and so on. Ideally the

discounted loss reserve would also be acceptable for regulatory reporting.

However, many current regulations do not permit it. It could be argued

that reserves set on an undiscounted basis include an implicit margin for

prudence, although, in the current climate of low interest rates, that mar-

gin is very much reduced. If reserves are set on a discounted basis, there is

a strong case for including an explicit prudential margin. As such, a risk

margin based on a risk measure from a predictive distribution of claims

reserves is a strong contender.

One of the sub-problems in this respect consists of the discounting of

the future estimates in the run-off triangle, where returns (and inflation)

are not known for certain. We will model the stochastic discount factor

using a Brownian motion with drift. When determining the discounted

loss reserve, we impose an explicit margin based on a risk measure (for

example Value-at-Risk) from the total distribution of the discounted re-

serve. Considering the discounted IBNR reserve, we have to incorporate a


certain dependence structure. In general, it is hard or even impossible to

determine the quantiles of the discounted loss reserve analytically, because

in any realistic model for the return process this random variable will be

a sum of strongly dependent random variables. The “true” multivariate

distribution function of the lower triangle cannot be determined analyti-

cally in most cases, because the mutual dependencies are not known, or

are difficult to cope with. We propose to solve this problem by calculating

upper and lower bounds making efficient use of the available information.
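To make the discounting assumption concrete, the following Python sketch simulates discount factors of the form exp(−Y(k)) with Y(k) = δk + σB(k), where B is a standard Brownian motion observed at yearly time points. The exact parametrization of the drift, the function names and the parameter values are assumptions made for this sketch and may differ from the specification used later in the chapter.

```python
import random
from math import exp


def simulate_discount_factors(delta, sigma, horizon, rng):
    """One path of discount factors exp(-Y(k)), k = 1, ..., horizon."""
    y = 0.0
    factors = []
    for _ in range(horizon):
        y += delta + sigma * rng.gauss(0.0, 1.0)  # yearly increment of Y
        factors.append(exp(-y))
    return factors


def expected_discount_factor(delta, sigma, k):
    """E[exp(-Y(k))] for Y(k) normal with mean delta * k and variance sigma^2 * k."""
    return exp(-delta * k + 0.5 * sigma ** 2 * k)


rng = random.Random(1)
n_paths = 40000
mean_factor_5 = sum(
    simulate_discount_factors(0.05, 0.1, 5, rng)[-1] for _ in range(n_paths)
) / n_paths
```

The discount factors of one path share the same Brownian increments, which is the source of the strong dependence between the terms of a discounted reserve.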

This chapter is set out as follows. Section 2 places the claims reserving

problem in a broader context. Section 3 gives a brief review of loglinear

and generalized linear models and their applications to claims reserving.

To be able to use the results of Chapter 2 we need some asymptotic results

for model parameter estimates in generalized linear models. Section 4 de-

scribes how convex lower and upper bounds can be obtained for discounted

IBNR evaluations. Some numerical illustrations for a simulated data set

are provided in Section 5, together with a discussion of the estimation error

using a bootstrap approach. We also graphically illustrate the obtained

bounds.

The results of this chapter come from Hoedemakers, Beirlant, Goovaerts

& Dhaene (2003, 2005).

4.2 The claims reserving problem

As a rule not all claims on a general insurance portfolio will have been paid

by the end of the calendar year of an insurance company. There can be

several reasons for the delay in payment, e.g. delays in reporting the claim,

long legal procedures, difficulties in determining the size of the claim, and

so on. It is also possible that the claim still has to occur, but that the
cause of the claim occurred in the past (e.g. exposure to asbestos). This of

course depends on what is insured in the policy. The delay in payment can

vary from a couple of days up to some years depending on the complexity

and the severity of the damage. To be able to pay these claims the insurer

has to keep reserves which should enable him to pay all future outstanding

claims.

Claims reserving is a vital area of insurance company management,


which is receiving close attention from shareholders, auditors, tax author-

ities and regulators. For insurance companies, the claims reserve is a very

substantial balance sheet item, which can be large in relation to
shareholders' funds. Actuaries are now well established in the area of claims
reserving for non-life insurance business. In many countries there is already

a statutory requirement for actuarial certification of reserves. Even in

jurisdictions where there is no such requirement, the substantial contribu-

tion actuaries can make to estimating future liabilities has been recognized

across the market.

Failure to reserve accurately for outstanding and IBNR claims will

adversely affect a company’s future financial development. Any current

reserve inadequacy will give rise to losses in subsequent years. Conversely,

premium calculations based on a too pessimistic evaluation of current lia-

bilities will damage the company’s competitive position.

The reserves held by a general insurance company can be divided into the

following categories:

• Claims reserves representing the estimated outstanding claims pay-

ments that are to be covered by premiums already earned by the

company. These reserves are sometimes called IBNS reserves (In-

curred But Not Settled). These can in turn be divided into

1. IBNYR reserves, representing the estimated claims payments

for claims which have already Incurred, But which are Not Yet

Reported to the company.

2. RBNS reserves, being the reserves required in respect of claims

which have been Reported to the company, But are Not yet

fully Settled. A special case of RBNS reserves are case reserves,

which are the individual reserves set by the claim handlers in

the claims handling process.

• Unearned premium reserves (UPR). Because the insurance premiums

are paid up-front, the company will, at any given accounting date,

need to hold a reserve representing the liability that a part of the

paid premium should be paid back to the policyholder in the event

that insurance policies were to be cancelled at that date. Unearned

premium reserves are pure accounting reserves, calculated on a pro

rata basis.


• Unexpired risk reserves (URR). While the policyholder only in special

cases has the option to cancel a policy before the agreed insurance

term has expired, he certainly always has the option to continue the

policy for the rest of the term. The insurance company, therefore,

runs the risk that the unearned premium will prove insufficient to

cover the corresponding unexpired risk, and hence the unexpired risk

reserve is set up to cover the probable losses resulting from insufficient

written but yet unearned premiums.

• CBNI reserves. Essentially the same as unearned premium reserves,

but to take into account possible seasonal variations in the risk pat-

tern, they are not necessarily calculated pro rata, so that they also

incorporate the function of the unexpired risk reserves. Their pur-

pose is to provide for Covered But Not Incurred (CBNI) claims.

• The sum of the CBNI and IBNS reserves is sometimes called the

Covered But Not Settled (CBNS) reserve.

• Fluctuation reserves (equalization reserves) do not represent a future

obligation, but are used as a buffer capital to safeguard against ran-

dom fluctuations in future business results. The use of fluctuation

reserves varies from country to country.

The loss reserves considered here only refer to the claims that result from

already occurred events; the so-called IBNS reserves. Notice that often the

terminology is not used uniformly: the abbreviation IBNR is used when

speaking of loss reserving problems as a whole.

4.3 Model set-up: regression models

The problem of estimating IBNR claims consists in predicting, for each

accident year, the ultimate amount of claims incurred. The amount paid

by the insurance company for those claims, when it comes due, is then sub-

tracted, leaving the reserve the insurer should hold for future payments. To

calculate the reserve, all methods or models usually assume that the pat-

tern of cumulative or incremental claims incurred or paid is stable across

the development years, for each accident year. Since for the last accident

year, only one amount will be available, the reserve will be highly sensitive

to this amount. Moreover, because of growth experienced by the company,


it will be larger than any other amount in the data set, hence the im-

portance of verifying that the development pattern of the claims has not

changed over the years. One of the earliest methods, and now the most

commonly used in the actuarial profession, is the chain-ladder method.

Assuming that for each accident year, the development pattern remains

stable, development factors are calculated by dividing cumulative paid or

incurred claims after j periods by the cumulative amount after j − 1 pe-

riods. The year-to-year development factors are then applied to the most

recent amount for each accident year, i.e. the amounts on the right-most

diagonal. Many variations have been presented for the basic chain-ladder

method just introduced; a linear trend or an exponential growth may be

assumed to be present among the development factors. Instead of taking

their weighted average, they could be extrapolated into the future. The

chain-ladder method can also be adjusted for inflation. However, the chain-

ladder method suffers from the following deficiencies:

1. It explicitly assumes too many parameters (one for each column).

2. It does not give any idea of the variability of the reserve estimate, or

a confidence interval for the reserve.

3. It is negatively biased, which could lead to serious underreserving, a

threat to the insurer’s solvency.
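The basic chain-ladder calculation described above can be sketched in a few lines of Python. The sketch uses volume-weighted development factors on a cumulative triangle; the function name and the data are hypothetical, and none of the variations mentioned above (trends, extrapolation, inflation adjustment) are included.

```python
def chain_ladder(cumulative):
    """Chain-ladder reserves from a cumulative run-off triangle.

    cumulative[k] holds the observed cumulative claims of accident year k + 1,
    so the first row has t entries and the last row has one entry.
    """
    t = len(cumulative)
    factors = []
    for j in range(1, t):
        rows = [row for row in cumulative if len(row) > j]
        # Volume-weighted factor from development year j to development year j + 1.
        factors.append(sum(row[j] for row in rows) / sum(row[j - 1] for row in rows))
    reserves = []
    for row in cumulative:
        ultimate = row[-1]  # most recent amount, on the latest diagonal
        for factor in factors[len(row) - 1:]:
            ultimate *= factor
        reserves.append(ultimate - row[-1])
    return factors, reserves


triangle = [
    [100.0, 150.0, 165.0],
    [110.0, 165.0],
    [120.0],
]
factors, reserves = chain_ladder(triangle)
```

The output is a single reserve figure per accident year, which illustrates the second deficiency in the list: the method by itself gives no indication of the variability of the estimate.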

Therefore stochastic models have been developed which make it possible to calculate
an amount such that there is a high probability that the reserve will be
sufficient to cover the liabilities generated by the current block of business.

In claims reserving, we are interested in the aggregated value

\[
\sum_{i=2}^{t} \sum_{j=t+2-i}^{t} Y_{ij}.
\]

In this section we give an overview of the different regression models used
in claims reserving.

We use the following notation throughout this section: $\vec{Y} = (Y_{11}, \ldots, Y_{t1}, Y_{21}, \ldots, Y_{t1})$ is the vector of claims, $\vec{\beta} = (\beta_1, \ldots, \beta_p)$
are model parameters, $U$ is the regression matrix corresponding to the
upper triangle of dimension $\frac{t(t+1)}{2} \times p$ and $R$ is the regression matrix
corresponding to the complete square of dimension $t^2 \times p$.


4.3.1 Lognormal linear models

We consider the following loglinear regression model in matrix notation

\[
\vec{Z} = \ln \vec{Y} = R\vec{\beta} + \vec{\varepsilon}, \qquad \vec{\varepsilon} \sim N(0, \sigma^2 I), \tag{4.1}
\]

where $\vec{\varepsilon}$ is the vector of independent normal random errors with mean 0
and variance $\sigma^2$.

So, the normal responses $Z_{ij}$ are assumed to decompose (additively)
into a deterministic component $(R\vec{\beta})_{ij}$ and a
homoscedastic normally distributed random error component with zero mean.

The parameters are estimated by the maximum likelihood method,

which in the case of the normal error structure is equivalent to minimizing

the residual sum of squares. The unknown variance σ2 is estimated by the

residual sum of squares divided by the degrees of freedom (the number of

observations minus the numbers of regression parameters estimated):

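\[
\tilde{\sigma}^2 = \frac{1}{n-p} (\vec{Z} - U\hat{\vec{\beta}})'(\vec{Z} - U\hat{\vec{\beta}}). \tag{4.2}
\]

This is an unbiased estimator of $\sigma^2$. The maximum likelihood estimator
of $\sigma^2$ is given by

\[
\hat{\sigma}^2 = \frac{1}{n} (\vec{Z} - U\hat{\vec{\beta}})'(\vec{Z} - U\hat{\vec{\beta}}), \tag{4.3}
\]

while the maximum likelihood estimator of $\vec{\beta}$ is

\[
\hat{\vec{\beta}} = (U'U)^{-1}U'\vec{Z}. \tag{4.4}
\]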

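Now we can forecast the total IBNR reserve with

\[
\text{IBNR reserve} = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} e^{(R\hat{\vec{\beta}})_{ij} + \varepsilon_{ij}}. \tag{4.5}
\]

This definition of the IBNR reserve can, among others, be found in Doray
(1996). Here $(R\hat{\vec{\beta}})_{ij}$ and $\varepsilon_{ij}$ are independent.

We have that

\[
\varepsilon_{ij} \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2), \tag{4.6}
\]

\[
(R\hat{\vec{\beta}})_{ij} \sim N\!\left( (R\vec{\beta})_{ij},\ \sigma^2 \left( R(U'U)^{-1}R' \right)_{ij} \right). \tag{4.7}
\]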


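Starting from model (4.1), we now summarize some properties of the IBNR
reserve (4.5), which can be found in Doray (1996).

1. The mean of the IBNR reserve equals

\[
W = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} e^{(R\vec{\beta})_{ij} + \frac{1}{2}\sigma^2 \left( 1 + (R(U'U)^{-1}R')_{ij} \right)}. \tag{4.8}
\]

2. The unique UMVUE of the mean of the IBNR reserve is given by

\[
W_U = {}_0F_1\!\left( \frac{n-p}{2};\ \frac{SS_z}{4} \right) \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} e^{(R\hat{\vec{\beta}})_{ij}}, \tag{4.9}
\]

where ${}_0F_1(\alpha; z)$ denotes the hypergeometric function.

3. The MLE of the mean of the IBNR reserve:

\[
W_M = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} e^{(R\hat{\vec{\beta}})_{ij} + \frac{1}{2}\hat{\sigma}^2 \left( 1 + (R(U'U)^{-1}R')_{ij} \right)}. \tag{4.10}
\]

Verrall (1991) has considered an estimator similar to $W_M$, but with the maximum
likelihood estimator $\hat{\sigma}^2$ of (4.3) replaced with the unbiased estimator $\tilde{\sigma}^2$ of (4.2):

\[
W_V = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} e^{(R\hat{\vec{\beta}})_{ij} + \frac{1}{2}\tilde{\sigma}^2 \left( 1 + (R(U'U)^{-1}R')_{ij} \right)}. \tag{4.11}
\]

The simple estimator

\[
W_D = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} e^{(R\hat{\vec{\beta}})_{ij} + \frac{1}{2}\tilde{\sigma}^2}, \tag{4.12}
\]

was considered in Doray (1996).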
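A minimal Python sketch of the estimators (4.2) to (4.4), (4.10) and (4.11) for a small triangle is given below. It assumes a chain-ladder type linear predictor α_i + β_j with β_1 = 0, which is only one possible choice and is not prescribed by the text above; the helper names and the data are hypothetical.

```python
from math import exp, log


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(m[k][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for k in range(n):
            if k != col:
                f = m[k][col] / m[col][col]
                m[k] = [x - f * y for x, y in zip(m[k], m[col])]
    return [m[k][n] / m[k][k] for k in range(n)]


def design_row(i, j, t):
    """Regression row for cell (i, j): parameters alpha_1..alpha_t, beta_2..beta_t."""
    row = [0.0] * (2 * t - 1)
    row[i - 1] = 1.0
    if j > 1:
        row[t + j - 2] = 1.0
    return row


def fit_ibnr(triangle):
    """triangle[(i, j)] is the observed positive incremental claim for i + j <= t + 1."""
    t = max(i for i, _ in triangle)
    cells = sorted(triangle)
    u = [design_row(i, j, t) for i, j in cells]
    z = [log(triangle[c]) for c in cells]
    n, p = len(cells), 2 * t - 1
    utu = [[sum(row[a] * row[b] for row in u) for b in range(p)] for a in range(p)]
    utz = [sum(row[a] * zk for row, zk in zip(u, z)) for a in range(p)]
    beta = solve(utu, utz)                                   # estimator (4.4)
    rss = sum((zk - sum(x * b for x, b in zip(row, beta))) ** 2 for row, zk in zip(u, z))
    s2_unbiased, s2_mle = rss / (n - p), rss / n             # estimators (4.2) and (4.3)
    w_m = w_v = 0.0
    for i in range(2, t + 1):
        for j in range(t + 2 - i, t + 1):
            r = design_row(i, j, t)
            eta = sum(x * b for x, b in zip(r, beta))
            leverage = sum(x * y for x, y in zip(r, solve(utu, r)))  # r (U'U)^{-1} r'
            w_m += exp(eta + 0.5 * s2_mle * (1.0 + leverage))        # as in (4.10)
            w_v += exp(eta + 0.5 * s2_unbiased * (1.0 + leverage))   # as in (4.11)
    return beta, s2_unbiased, s2_mle, w_m, w_v


data = {(1, 1): 100.0, (1, 2): 60.0, (1, 3): 20.0, (2, 1): 110.0, (2, 2): 70.0, (3, 1): 120.0}
beta, s2_unbiased, s2_mle, w_m, w_v = fit_ibnr(data)
```

With only one residual degree of freedom this example is far too small for practical use; it merely shows how the two variance estimators enter the point estimates of the reserve.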

Also, we have the order relation

WU < WD < WV , (4.13)

which implies that

W = E[WU ] < E[WD] < E[WV ]. (4.14)


Hence both the estimators WD and WV exhibit a positive bias.

This Lognormal Linear (LL) model with normal random error is a special

case of the class of loglinear location-scale models. Other choices possible

for the distribution of the random error are the extreme value distribution,

leading to the Weibull-extreme value regression model, the generalized

loggamma, the logistic, and the log inverse Gaussian distribution. In what
follows we briefly recall this class of regression models.

4.3.2 Loglinear location-scale models

For a general introduction to survival analysis we refer to Kalbfleisch &

Prentice (1980), Lawless (1982), Cohen & Whitten (1985), among others.

In this section we recall the structure of this model and the main charac-

teristics of the distributions for the error component.

A location-scale model has a cumulative distribution function of the

form

\[
F_X(x) = G\!\left( \frac{x - \mu}{\sigma} \right), \tag{4.15}
\]

where µ is the location parameter, σ is the scale parameter, and G is the

standardized form (µ = 0, σ = 1) of the cumulative distribution function.

The parameter vector is ~θ = (µ, σ).

We consider the following Loglinear Location-Scale (LLS) regression

model in matrix notation

\[
\vec{Z} = \ln \vec{Y} = R\vec{\beta} + \sigma\vec{\varepsilon}, \tag{4.16}
\]

where $(R\vec{\beta})_{ij}$ is the linear predictor or location parameter for $Z_{ij}$, $\sigma$ is the
scale parameter and $\vec{\varepsilon}$ is a random error with known density $f_{\vec{\varepsilon}}(\cdot)$.

It should also be noticed that in general the scale parameter estimator
is not independent of the location parameter estimator, whereas the two
are independent in normal regression.

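It is clear that the random variable $Z_{ij}$ has the following density

\[
\frac{1}{\sigma} f_{\vec{\varepsilon}}\!\left( \frac{z_{ij} - (R\vec{\beta})_{ij}}{\sigma} \right),
\]

with $-\infty < z_{ij} < \infty$. This model can only be applied if all data points are
strictly positive, since the logarithm of the claims is modelled. The parameters are
estimated by maximum likelihood.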


Doray (1994) showed that the maximum likelihood estimators of the

regression and scale parameters exist and are unique when the error ~ε in the

loglinear location-scale regression model has a log-concave density. This is

the case for the five distributions we consider in Table 4.1. Note that the

exponential distribution is a special case of the Weibull distribution when

the shape parameter is equal to 1. The generalized gamma distribution is a

flexible family of distributions containing as special cases the exponential,

the Weibull and the gamma distribution.

The IBNR reserve under this class of regression models is given by

\[
\text{IBNR reserve} = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} e^{(R\hat{\vec{\beta}})_{ij} + \sigma\varepsilon_{ij}}.
\]

Table 4.2 displays the mean, cumulative distribution function and inverse
distribution function of $X_{ij} = e^{(R\hat{\vec{\beta}})_{ij} + \sigma\varepsilon_{ij}}$ for the different regression
models in the LLS family.
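For the three models with a closed-form inverse distribution function, the entries of Table 4.2 can be coded directly. In the Python sketch below, `eta` is a shorthand introduced here for the linear predictor of a single cell; the function names and the parameter values are hypothetical.

```python
from math import exp, log
from statistics import NormalDist

STD_NORMAL = NormalDist()


def lognormal_cdf(x, eta, sigma):
    return STD_NORMAL.cdf((log(x) - eta) / sigma)


def lognormal_quantile(p, eta, sigma):
    return exp(eta + sigma * STD_NORMAL.inv_cdf(p))


def weibull_cdf(x, eta, sigma):
    return 1.0 - exp(-((x / exp(eta)) ** (1.0 / sigma)))


def weibull_quantile(p, eta, sigma):
    return exp(eta) * (-log(1.0 - p)) ** sigma


def logistic_cdf(x, eta, sigma):
    return 1.0 - 1.0 / (1.0 + (x * exp(-eta)) ** (1.0 / sigma))


def logistic_quantile(p, eta, sigma):
    return exp(eta) * (p / (1.0 - p)) ** sigma


# Round trip: the cdf evaluated at the p-quantile should return p.
eta, sigma, p = 4.0, 0.5, 0.95
round_trips = [
    lognormal_cdf(lognormal_quantile(p, eta, sigma), eta, sigma),
    weibull_cdf(weibull_quantile(p, eta, sigma), eta, sigma),
    logistic_cdf(logistic_quantile(p, eta, sigma), eta, sigma),
]
```

Such quantile functions are what a comonotonic approximation needs, because the quantile of a comonotonic sum is the sum of the marginal quantiles.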

Notice that the definition of the IBNR reserve here differs from
definition (4.3.2) under the lognormal linear model. We use here $e^{(R\hat{\vec{\beta}})_{ij} + \sigma\varepsilon_{ij}}$
instead of $e^{(R\hat{\vec{\beta}})_{ij} + \hat{\sigma}\varepsilon_{ij}}$, where $\hat{\vec{\beta}}$ and $\hat{\sigma}$ represent the MLE's of $\vec{\beta}$ and $\sigma$
respectively. This definition of the IBNR reserve can also be found, among others,
in Doray (1996). This approach partly uses the information
contained in the upper triangle (through $\hat{\vec{\beta}}$), and acknowledges the underlying
stochastic structure (through $\varepsilon_{ij}$).


Regression model         Error distribution                 Density
Lognormal linear         ε_ij ∼ i.i.d. N(0, 1)              $\frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}x^2}$   (−∞ < x < ∞)
Weibull-extreme value    ε_ij ∼ Gumbel                      $e^{x - e^{x}}$   (−∞ < x < ∞)
Logistic                 ε_ij ∼ standard logistic           $\frac{e^{x}}{(1 + e^{x})^2}$   (−∞ < x < ∞)
Generalized loggamma     ε_ij ∼ loggamma                    $\frac{k^{k - \frac{1}{2}}}{\Gamma(k)}\, e^{\sqrt{k}\,x - k e^{x/\sqrt{k}}}$   (−∞ < x < ∞), (0 < k < +∞)
Log inverse Gaussian     ε_ij ∼ log inverse Gaussian        $(2\pi\lambda)^{-\frac{1}{2}}\, e^{-\frac{x}{2}}\, e^{\frac{1}{\lambda}}\, e^{-\frac{1}{\lambda}\cosh(x)}$   (−∞ < x < ∞), (λ > 0)

Table 4.1: Characteristics of the random error ε_ij in the regression models
of the LLS family.


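Regression model: Lognormal linear
    E[X_ij]         $e^{(R\hat{\vec{\beta}})_{ij} + \frac{\sigma^2}{2}}$
    F_{X_ij}(x_ij)  $\Phi\!\left( \frac{\ln(x_{ij}) - (R\hat{\vec{\beta}})_{ij}}{\sigma} \right)$
    F^{-1}_{X_ij}(p)  $e^{(R\hat{\vec{\beta}})_{ij} + \sigma\Phi^{-1}(p)}$

Regression model: Weibull-extreme value
    E[X_ij]         $e^{(R\hat{\vec{\beta}})_{ij}}\, \Gamma(1 + \sigma)$
    F_{X_ij}(x_ij)  $1 - \exp\!\left[ -\left( \frac{x_{ij}}{e^{(R\hat{\vec{\beta}})_{ij}}} \right)^{\frac{1}{\sigma}} \right]$
    F^{-1}_{X_ij}(p)  $e^{(R\hat{\vec{\beta}})_{ij}} \left( -\ln(1 - p) \right)^{\sigma}$

Regression model: Logistic
    E[X_ij]         $\left( e^{(R\hat{\vec{\beta}})_{ij}} \right)^{2 - \frac{1}{\sigma}} (1 - 2\sigma)\, \pi\, \mathrm{cosec}(2\pi\sigma)$   (1 < σ < 2)
                    $e^{(R\hat{\vec{\beta}})_{ij}}\, \Gamma(1 + \sigma)\Gamma(1 - \sigma)$   (σ < 1)
    F_{X_ij}(x_ij)  $1 - \left( 1 + \left( x_{ij}\, e^{-(R\hat{\vec{\beta}})_{ij}} \right)^{\frac{1}{\sigma}} \right)^{-1}$
    F^{-1}_{X_ij}(p)  $e^{(R\hat{\vec{\beta}})_{ij}} \left( \frac{p}{1 - p} \right)^{\sigma}$

Regression model: Generalized loggamma
    E[X_ij]         $\frac{k^{-\sigma\sqrt{k}}\, e^{(R\hat{\vec{\beta}})_{ij}}}{\Gamma(k)}\, \Gamma(k + \sigma\sqrt{k})$
    F_{X_ij}(x_ij)  $I\!\left( k,\ \left( \frac{x_{ij}}{k^{-\sigma\sqrt{k}}\, e^{(R\hat{\vec{\beta}})_{ij}}} \right)^{\frac{1}{\sigma\sqrt{k}}} \right)$
    F^{-1}_{X_ij}(p)  /

Regression model: Log inverse Gaussian
    E[X_ij]         $e^{(R\hat{\vec{\beta}})_{ij}}$
    F_{X_ij}(x_ij)  $\Phi\!\left[ \sqrt{\frac{x_{ij}}{\lambda e^{(R\hat{\vec{\beta}})_{ij}}}} - \sqrt{\frac{e^{(R\hat{\vec{\beta}})_{ij}}}{\lambda x_{ij}}} \right] + e^{\frac{2}{\lambda}}\, \Phi\!\left[ -\sqrt{\frac{x_{ij}}{\lambda e^{(R\hat{\vec{\beta}})_{ij}}}} - \sqrt{\frac{e^{(R\hat{\vec{\beta}})_{ij}}}{\lambda x_{ij}}} \right]$
    F^{-1}_{X_ij}(p)  /

Table 4.2: Characteristics of $X_{ij} = e^{(R\hat{\vec{\beta}})_{ij} + \sigma\varepsilon_{ij}}$ in the regression models of the LLS family.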


4.3.3 Generalized linear models

For a general introduction to Generalized Linear Models (GLIMs) we refer

to McCullagh & Nelder (1992). This family encompasses normal error

linear regression models and the nonlinear exponential, logistic and Poisson

regression models, as well as many other models, such as loglinear models

for categorical data. In this subsection we recall the structure of GLIMs

in the framework of claims reserving.

The first component of a GLIM, the random component, assumes that

the response variables Yij are independent and that the density function

of Yij belongs to the exponential family with densities of the form

f(yij ; θij , φ) = exp {[yijθij − b(θij)] /a(φ) + c(yij , φ)} , (4.17)

where a(·), b(·) and c(·, ·) are given functions. The function a(φ) often has

the form a(φ) = φ, where φ is called the dispersion parameter.

When φ is a known constant, (4.17) simplifies to the natural exponen-

tial family

f(yij ; θij) = a(θij)b(yij)exp {yijQ(θij)} . (4.18)

We identify Q(θ) with θ/a(φ), a(θ) with exp{−b(θ)/a(φ)}, and b(y) with

exp{c(y, φ)}. The more general formula (4.17) is useful for two-parameter

families, such as the normal or gamma, in which φ is a nuisance parameter.

Denoting the mean of Yij by µij , it is known that

µij = E[Yij ] = b′(θij) and Var[Yij ] = b′′(θij)a(φ), (4.19)

where the primes denote derivatives with respect to θ. The variance can

be expressed as a function of the mean by

Var[Yij ] = a(φ)V (µij) = φV (µij),

where V (·) is called the variance function. The variance function V cap-

tures the relationship, if any, between the mean and variance of Yij .

The possible distributions to work with in claims reserving include for

instance the normal, Poisson, gamma and inverse Gaussian distributions.

Table 4.3 shows some of their characteristics. For a given distribution,

link functions other than the canonical link function can also be used. For

example, the log-link is often used with the gamma distribution.

The systematic component of a GLIM is based on a linear predictor

ηij = (R~β)ij = β1Rij,1 + · · · + βpRij,p, i, j = 1, . . . , t. (4.20)


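$$
\begin{array}{llllll}
\text{Distribution} & \text{Density} & \phi & \text{Canonical link } \theta(\mu) & \mu(\theta)=b'(\theta) & V(\mu)=b''(\theta) \\
\hline
N(\mu,\sigma^2) & \frac{1}{\sigma\sqrt{2\pi}}\exp\!\left(-\frac{(y-\mu)^2}{2\sigma^2}\right) & \sigma^2 & \mu & \theta & 1 \\
\text{Poisson}(\mu) & e^{-\mu}\,\frac{\mu^{y}}{y!} & 1 & \log(\mu) & e^{\theta} & \mu \\
\text{Gamma}(\mu,\nu) & \frac{1}{\Gamma(\nu)}\left(\frac{\nu y}{\mu}\right)^{\nu}\exp\!\left(-\frac{\nu y}{\mu}\right)\frac{1}{y} & \frac{1}{\nu} & 1/\mu & -1/\theta & \mu^2 \\
IG(\mu,\sigma^2) & \frac{y^{-3/2}}{\sqrt{2\pi\sigma^2}}\exp\!\left(\frac{-(y-\mu)^2}{2y\sigma^2\mu^2}\right) & \sigma^2 & 1/\mu^2 & (-2\theta)^{-1/2} & \mu^3
\end{array}
$$

Table 4.3: Characteristics of some frequently used distributions in loss reserving.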

Various choices are possible for this linear predictor. In Subsection 4.3.4

we give a short overview of frequently used parametric structures in claims

reserving applications.

The link function, the third component of a GLIM, connects the ex-

pectation µij of Yij to the linear predictor by

ηij = g(µij), (4.21)

where g is a monotone, differentiable function. Thus, a GLIM links the

expected value of the response to the explanatory variables through the

equation

g(µij) = (R~β)ij i, j = 1, . . . , t. (4.22)

For the canonical link g for which g(µij) = θij in (4.17), there is the direct

relationship between the natural parameter and the linear predictor. Since

µij = b′(θij), the canonical link is the inverse function of b′.

Generalized linear models may have nonconstant variances $\sigma_{ij}^2$ for the

responses Yij. Then the variance $\sigma_{ij}^2$ can be taken as a function of the

predictor variables through the mean response µij , or the variance can

be modelled using a parameterized structure (see Renshaw (1994)). Any

regression model that belongs to the family of generalized linear models

can be analyzed in a unified fashion. The maximum likelihood estimates of

the regression parameters can be obtained by iteratively reweighted least


squares (naturally extending ordinary least squares for normal error linear

regression models).

Supposing that the claim amounts follow a lognormal distribution,

taking the logarithm of all Yij ’s implies that they have a normal distri-

bution. So, the link function is given by ηij = µij and the scale parameter

is the variance of the normal distribution, i.e. φ = σ2. We remark that

each incremental claim must be greater than zero, and predictions from

this model can yield unusable results.

The predicted value under a generalized linear model will be given by

$$\text{IBNR reserve} = \sum_{i=2}^{t}\sum_{j=t+2-i}^{t}\hat{\mu}_{ij}, \qquad (4.23)$$

with $\hat{\mu}_{ij} = g^{-1}\big((R\hat{\vec{\beta}})_{ij}\big)$ for a given link function g.

We end this section with some extra comments concerning GLIMs.

The need for more general GLIM models for modelling claims reserves be-

comes clear in the column of variance functions in Table 4.3. If the variance

of the claims is proportional to the square of the mean, the gamma family

of distributions can accommodate this characteristic. The Poisson and in-

verse Gaussian provide alternative variance functions. However, it may be

that the relationship between the mean and the variance falls somewhere

between the inverse Gaussian and the gamma models. Quasi-likelihood is

designed to handle this broader class of mean-variance relationships. This

is a very simple and robust alternative, introduced in Wedderburn (1974),

which uses only the most elementary information about the response vari-

able, namely the mean-variance relationship. This information alone is

often sufficient to stay close to the full efficiency of maximum likelihood

estimators. Suppose that we know that the response is always positive,

the data are invariably skew to the right, and the variance increases with

the mean. This does not enable us to specify a particular distribution

(for example it does not discriminate between Poisson or negative bino-

mial errors), hence one cannot use techniques like maximum likelihood or

likelihood ratio tests. However, quasi-likelihood estimation allows one to

model the response variable in a regression context without specifying its

distribution. We need only to specify the link and variance functions to


estimate regression coefficients. Although the link and variance functions

determine a theoretical likelihood, the likelihood itself is not specified so

fewer assumptions are required for estimation and inference. This is analo-

gous to the connection between normal-theory regression models and least-

squares estimates. Least-squares estimation provides identical parameter

estimates to those obtained from normal-theory models, but least-squares

estimation assumes far less. Only second moment assumptions are made by

least-squares compared to full distribution assumptions of normal-theory

models. For quasi-likelihood, specification of a variance function deter-

mines a corresponding quasi-likelihood element for each observation:

$$Q(\mu_{ij}; y_{ij}) = \int_{y_{ij}}^{\mu_{ij}} \frac{y_{ij}-t}{\phi V(t)}\,dt, \qquad (4.24)$$

where Q(µij ; yij) satisfies a number of properties in common with the log-

likelihood. Specifically, if K = k(µij ;Yij) = (Yij − µij)/(φV (µij)), then

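$$
\begin{aligned}
E(K) &= 0, \\
\mathrm{Var}(K) &= \frac{1}{\phi V(\mu_{ij})}, \\
-E\!\left(\frac{\partial K}{\partial \mu_{ij}}\right) &= \frac{1}{\phi V(\mu_{ij})}.
\end{aligned}
\qquad (4.25)
$$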

According to McCullagh & Nelder (1992), since most first-order asymp-

totic theory regarding likelihood functions is based on the three proper-

ties (4.25), we can expect Q(µij ; yij) to behave like a log-likelihood under

certain broad conditions. Summing (4.24) over all yij-values yields the

quasi-likelihood for the complete data. The quasi-deviance D(yij ;µij) is

similarly defined to be the sum over all yij-values of

$$-2\phi\,Q(\mu_{ij}; y_{ij}) = 2\int_{\mu_{ij}}^{y_{ij}} \frac{y_{ij}-t}{V(t)}\,dt. \qquad (4.26)$$

Parameter estimation proceeds by maximizing the quasi-likelihood. Since

the quasi-likelihood behaves like an ordinary likelihood, it inherits all the

large sample properties of likelihoods: approximate unbiasedness and nor-

mality of the parameter estimates. For example, through the use of the

quasi-likelihood

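$$Q(\mu_{ij}; y_{ij}) = \int_{y_{ij}}^{\mu_{ij}} \frac{y_{ij}-t}{\phi\,t^{2.5}}\,dt = \frac{1}{\phi\,\mu_{ij}^{2.5}}\left(\frac{\mu_{ij}\,y_{ij}}{(-1.5)} - \frac{\mu_{ij}^{2}}{(-0.5)}\right) - \frac{4}{3\phi\sqrt{y_{ij}}} \qquad (4.27)$$

(the last term comes from the lower integration limit and does not depend on $\mu_{ij}$, so it plays no role in the estimation)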


we could model a variance function between those of the gamma and inverse

Gaussian families: V (µij) = µ2.5ij .
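As a numerical illustration of (4.24) and (4.27), the following minimal sketch evaluates the quasi-likelihood element for the variance function V(µ) = µ^2.5 by Simpson's rule and compares it with the antiderivative of the integrand. The function names, the choice of Simpson's rule and the number of steps are assumptions made for this illustration only.

```python
def quasi_likelihood(mu, y, phi=1.0, power=2.5, steps=2000):
    """Q(mu; y): integral from y to mu of (y - t) / (phi * t**power) dt (Simpson's rule)."""
    if steps % 2:
        steps += 1                      # Simpson's rule needs an even number of steps
    h = (mu - y) / steps

    def integrand(t):
        return (y - t) / (phi * t ** power)

    total = integrand(y) + integrand(mu)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(y + k * h)
    return total * h / 3.0


def quasi_likelihood_closed_form(mu, y, phi=1.0):
    """Same integral for power = 2.5, using the antiderivative of (y - t) * t**-2.5."""
    def antiderivative(t):
        return (y * t ** -1.5 / -1.5 - t ** -0.5 / -0.5) / phi

    return antiderivative(mu) - antiderivative(y)
```

The quasi-likelihood element is zero at µ = y and negative elsewhere, which is why maximizing it pulls the fitted mean towards the observation.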

When using the canonical link function, the quasi-likelihood equations

are given by

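$$
\begin{aligned}
\sum_{j=1}^{t+1-i}\mu_{ij} &= \sum_{j=1}^{t+1-i} Y_{ij}, \qquad 1 \le i \le t; \\
\sum_{i=1}^{t+1-j}\mu_{ij} &= \sum_{i=1}^{t+1-j} Y_{ij}, \qquad 1 \le j \le t.
\end{aligned}
\qquad (4.28)
$$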

As can easily be seen from these equations in case of the Poisson model

with logarithmic link function, it is necessary to impose the constraint

that the sum of the incremental claims in every row and column has to

be non-negative. For example, this assumption makes the model unsuit-

able for incurred triangles, which may contain many negatives in the later

development periods due to overestimates of case reserves in the earlier

development periods.

We recall that the only distributional assumptions used in GLIMs are

the functional relationship between variance and mean and the fact that

the distribution belongs to the exponential family. When we consider the

Poisson case, this relationship can be expressed as

Var[Yij ] = E[Yij ]. (4.29)

One can allow for more or less dispersion in the data by generalizing (4.29)

to Var[Yij ]=φE[Yij ] (φ ∈ (0,∞)) without any change in the form and

solution of the likelihood equations. For example, it is well known that an

over-dispersed Poisson model with the chain-ladder type linear predictor

gives the same predictions as those obtained by the deterministic chain-

ladder method (see Renshaw & Verrall, 1994).
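The equivalence between the (over-dispersed) Poisson model and the chain-ladder method can be checked numerically. The sketch below solves the marginal-total equations (4.28) for a multiplicative mean µij = xi·yj by alternating row and column updates, and compares the resulting reserve with the deterministic chain-ladder reserve. The triangle is made-up illustrative data, and the alternating update scheme is one possible solver chosen here for brevity, not the fitting routine used in the thesis.

```python
def chain_ladder_reserve(inc):
    """Deterministic chain-ladder reserve; inc[i] holds the observed increments of origin year i."""
    t = len(inc)
    cum = []
    for row in inc:
        acc, c = 0.0, []
        for v in row:
            acc += v
            c.append(acc)
        cum.append(c)
    # Development factor from development year j to j + 1 (0-based)
    factors = []
    for j in range(t - 1):
        rows = [i for i in range(t) if len(cum[i]) > j + 1]
        factors.append(sum(cum[i][j + 1] for i in rows) / sum(cum[i][j] for i in rows))
    reserve = 0.0
    for i in range(t):
        ultimate = cum[i][-1]
        for j in range(len(cum[i]) - 1, t - 1):
            ultimate *= factors[j]
        reserve += ultimate - cum[i][-1]
    return reserve


def marginal_totals_reserve(inc, iters=5000):
    """Solve (4.28) for mu_ij = x_i * y_j by alternating updates, then sum the unobserved cells."""
    t = len(inc)
    row_tot = [sum(r) for r in inc]
    col_tot = [sum(inc[i][j] for i in range(t - j)) for j in range(t)]
    x, y = [1.0] * t, [1.0] * t
    for _ in range(iters):
        x = [row_tot[i] / sum(y[: t - i]) for i in range(t)]
        y = [col_tot[j] / sum(x[: t - j]) for j in range(t)]
    return sum(x[i] * y[j] for i in range(t) for j in range(t - i, t))


# Illustrative incremental run-off triangle (hypothetical numbers)
triangle = [[100.0, 60.0, 30.0, 10.0], [110.0, 70.0, 35.0], [120.0, 65.0], [130.0]]
```

Both functions should return the same reserve up to numerical tolerance, provided all row and column totals are positive.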

Modelling the incremental claim amounts as independent gamma re-

sponse variables, with a logarithmic link function and the chain-ladder

type linear predictor produces exactly the same results as obtained by

Mack (1991). The relationship between this generalized linear model and

the model proposed by Mack was first pointed out by Renshaw & Verrall

(1994). The mean-variance relationship for the gamma model is given by

Var[Yij ] = φ (E[Yij ])2 . (4.30)


Using this model gives predictions close to those from the deterministic

chain-ladder technique, but not exactly the same. Notice that we need to

impose that each incremental value should be positive (non-negative) if

we work with gamma (Poisson) models. This restriction can be overcome

using a quasi-likelihood approach.

As in normal regression, the search for a suitable model may encompass

a wide range of possibilities. The Bayesian information criterion (BIC)

and the Akaike Information Criterion (AIC) are model selection devices

that emphasize parsimony by penalizing models for having large numbers

of parameters. Tests for model development to determine whether some

predictor variables may be dropped from the model can be conducted

using partial deviances. Two measures for the goodness-of-fit of a given

generalized linear model are the scaled deviance and Pearson’s chi-square

statistic.

In cases where the dispersion parameter is not known, an estimate can

be used to obtain an approximation to the scaled deviance and Pearson’s

chi-square statistic. One strategy is to fit a model that contains a sufficient

number of parameters so that all systematic variation is removed, estimate

φ from this model, and then use this estimate in computing the scaled

deviance of sub-models. The deviance or Pearson’s chi-square divided by

its degrees of freedom is sometimes used as an estimate of the dispersion

parameter φ.

4.3.4 Linear predictors and the discounted IBNR reserve

Various choices are possible for the linear predictor in claims reserving

applications. We give here a short overview of frequently used parametric

structures.

A well-known and widely used predictor is the chain-ladder type

ηij = αi + βj , (4.31)

(αi is the parameter for each year of origin i and βj for each development

year j). It should be noted that this representation implies the same

development pattern for all years of origin, where that pattern is defined

by the parameters βj . Notice that a parameter, for example β1, must be set

equal to zero, in order to have a non-singular regression matrix. Another

natural and frequently used restriction on the parameters is to impose that


β1 + · · ·+ βt = 1, since this allows the βj to be interpreted as the fraction

of claims settled in development year j.

The separation predictor takes into account the calendar years and

replaces in (4.31) αi with γk (k = i + j − 1). It combines the effects of

monetary inflation and changing jurisprudence.

For a general model with parameters in the three directions, we refer to

De Vylder & Goovaerts (1979). We give here some frequently used special

cases:

• The probabilistic trend family (PTF) of models as suggested in Barnett

& Zehnwirth (1998)

$$\eta_{ij} = \alpha_i + \sum_{k=1}^{j-1}\beta_k + \sum_{t=1}^{i+j-2}\gamma_t, \qquad (4.32)$$

where γ denotes the calendar year effect; it combines the effects of

monetary inflation and changing jurisprudence.

• The Hoerl curve as in Zehnwirth (1985)

$$\eta_{ij} = \alpha_i + \beta_i\log(j) + \gamma_i\,j \qquad (j > 0). \qquad (4.33)$$

This model has the advantage that one can predict payments by

extrapolation for j > t, because development year j is considered

as a continuous covariate. This is useful in estimating tail factors.

Wright (1990) extends this Hoerl curve further to model possible

claim inflation.

• A mixture of models (4.31) and (4.33) as in England & Verrall (2001)

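$$
\eta_{ij} =
\begin{cases}
\alpha_i + \beta_j & \text{if } j \le q; \\
\alpha_i + \beta_i\log(j) + \gamma_i\,j & \text{if } j > q
\end{cases}
\qquad (4.34)
$$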

for some integer q specified by the modeller.
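The predictors (4.31), (4.33) and (4.34) can be written down directly as functions of the origin year i and development year j. The following is a minimal sketch with 1-based indices. The function and argument names are hypothetical, and the two β parameter sets in (4.34) are kept apart as `beta_dev` (per development year) and `beta_hoerl` (per origin year) because they share a symbol in the text but index different dimensions.

```python
import math


def eta_chain_ladder(alpha, beta, i, j):
    """(4.31): eta_ij = alpha_i + beta_j."""
    return alpha[i - 1] + beta[j - 1]


def eta_hoerl(alpha, beta, gamma, i, j):
    """(4.33): eta_ij = alpha_i + beta_i * log(j) + gamma_i * j, for j > 0."""
    return alpha[i - 1] + beta[i - 1] * math.log(j) + gamma[i - 1] * j


def eta_mixture(alpha, beta_dev, beta_hoerl, gamma, q, i, j):
    """(4.34): chain-ladder form up to development year q, Hoerl curve beyond it."""
    if j <= q:
        return eta_chain_ladder(alpha, beta_dev, i, j)
    return eta_hoerl(alpha, beta_hoerl, gamma, i, j)
```

Because j enters the Hoerl curve as a continuous covariate, `eta_hoerl` and `eta_mixture` can be evaluated for j > t, which is what makes tail extrapolation possible.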

In the case that the type of business allows for discounting we add a dis-

counting process. Of course, the level of the required reserve will strongly


depend on how we will invest this reserve. We define the discounted IBNR

reserve S under one of the discussed regression models as follows

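$$
\begin{aligned}
\text{lognormal linear model:} \quad & S_{LL} = \sum_{i=2}^{t}\sum_{j=t+2-i}^{t} e^{(R\hat{\vec{\beta}})_{ij}+\varepsilon_{ij}-Y(i+j-t-1)}, \\
\text{loglinear location-scale model:} \quad & S_{LLS} = \sum_{i=2}^{t}\sum_{j=t+2-i}^{t} e^{(R\vec{\beta})_{ij}+\sigma\varepsilon_{ij}-Y(i+j-t-1)}, \\
\text{generalized linear model:} \quad & S_{GLIM} = \sum_{i=2}^{t}\sum_{j=t+2-i}^{t} g^{-1}\big((R\hat{\vec{\beta}})_{ij}\big)\,e^{-Y(i+j-t-1)},
\end{aligned}
$$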

where the returns are modelled by means of a Brownian motion described

by the following equation

$$Y(i) = \left(\delta + \frac{\varsigma^2}{2}\right) i + \varsigma B(i), \qquad (4.35)$$

where B(i) is the standard Brownian motion, ς is the volatility and δ is a

constant force of interest.
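The drift δ + ς²/2 in (4.35) implies E[e^{−Y(k)}] = e^{−δk}, so δ acts as the force of interest for expected discount factors. The sketch below simulates the return process at yearly time points and discounts a set of fitted future payments, as in S_GLIM with the fitted means held fixed. Grouping the payments by the lag k = i + j − t − 1 and all names used are assumptions made for this illustration.

```python
import math
import random


def simulate_returns(n_years, delta, vol, rng):
    """One path Y(1), ..., Y(n_years) of Y(i) = (delta + vol**2 / 2) * i + vol * B(i)."""
    b, path = 0.0, []
    for i in range(1, n_years + 1):
        b += rng.gauss(0.0, 1.0)        # yearly increment of the standard Brownian motion
        path.append((delta + 0.5 * vol ** 2) * i + vol * b)
    return path


def discounted_reserve(payments_by_lag, path):
    """Sum of payment * exp(-Y(k)); payments_by_lag maps lag k = i + j - t - 1 to an amount."""
    return sum(m * math.exp(-path[k - 1]) for k, m in payments_by_lag.items())


def expected_discounted_reserve(payments_by_lag, delta):
    """Expectation of the discounted reserve, using E[exp(-Y(k))] = exp(-delta * k)."""
    return sum(m * math.exp(-delta * k) for k, m in payments_by_lag.items())
```

Averaging `discounted_reserve` over many simulated paths should approach `expected_discounted_reserve`, which gives a quick sanity check on any simulation of the discounting process.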

4.4 Convex bounds for the discounted IBNR re-

serve

Before we can apply the results of Chapter 2 in order to derive the comono-

tonic approximations for S, we have to specify further the distribution of

$\hat{\vec{\mu}} = g^{-1}(R\hat{\vec{\beta}})$. This is done in what follows.

4.4.1 Asymptotic results in generalized linear models

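Let $\hat{\phi}$, $\hat{\vec{\beta}}$, $\hat{\vec{\eta}} = R\hat{\vec{\beta}}$ and $\hat{\vec{\mu}} = g^{-1}(\hat{\vec{\eta}})$ be the maximum likelihood estimates of $\phi$, $\vec{\beta}$, $\vec{\eta}$ and $\vec{\mu}$ respectively. The estimation equation for $\hat{\vec{\beta}}$ is then given by

$$U'\hat{W}U\hat{\vec{\beta}} = U'\hat{W}\hat{\vec{y}}^{*}, \qquad (4.36)$$

where $W = \mathrm{diag}\{w_{11},\dots,w_{t1}\}$, with $w_{ij} = \mathrm{Var}[Y_{ij}]^{-1}(d\mu_{ij}/d\eta_{ij})^2$, $\vec{y}^{*} = (y^{*}_{11},\dots,y^{*}_{t1})'$ and $y^{*}_{ij} = \eta_{ij} + (y_{ij}-\mu_{ij})\,d\eta_{ij}/d\mu_{ij}$, where the $y_{ij}$ denote the sample values. Note that $\hat{W}$ is $W$ evaluated at $\hat{\vec{\beta}}$.

It is well known that for asymptotically normal statistics, many functions of such statistics are also asymptotically normal. Because $R\hat{\vec{\beta}} = \big((R\hat{\vec{\beta}})_{11},\dots,(R\hat{\vec{\beta}})_{tt}\big)$ is asymptotically multivariate normal with mean $R\vec{\beta} = \big((R\vec{\beta})_{11},\dots,(R\vec{\beta})_{tt}\big)$ and variance-covariance matrix $\Sigma(R\hat{\vec{\beta}}) = \Sigma^{a} = \{\sigma^{a}_{ij}\} = R(U'WU)^{-1}R'$, and $g^{-1}(\eta_{11},\dots,\eta_{tt})$ has a nonzero differential $\vec{\psi} = (\psi_{11},\dots,\psi_{tt})$ at $(R\vec{\beta})$, where $\psi_{ij} = d\mu_{ij}/d\eta_{ij}$, it follows from the delta method that

$$\big[\hat{\vec{\mu}} - \vec{\mu}\big] \xrightarrow{d} N\big(0, \Sigma(\hat{\vec{\mu}})\big), \qquad (4.37)$$

where $\Sigma(\hat{\vec{\mu}}) = \vec{\psi}'\,\Sigma^{a}\,\vec{\psi}$. Hence, for large samples the distribution of $\hat{\vec{\mu}} = g^{-1}(R\hat{\vec{\beta}})$ can be approximated by a normal distribution with mean $\vec{\mu}$ and variance-covariance matrix $\Sigma(\hat{\vec{\mu}})$.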

Maximum likelihood estimates may be biased when the sample size or

the total Fisher information is small. The bias is usually ignored in prac-

tice, because it is negligible compared with the standard errors. In small

or moderate-sized samples, however, a bias correction can be necessary,

and it is helpful to have a rough estimate of its size.

In deriving the convex bounds, one needs the expected values. Since

there is no exact expression for the expectation of $\hat{\vec{\mu}}$, we approximate it

using a general formula for the first-order bias of the estimate of $\vec{\mu}$.

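Cordeiro & McCullagh (1991) derived the first-order bias of $\hat{\vec{\beta}}$. In matrix notation this bias reduces to the simple form

$$B(\hat{\vec{\beta}}) = -\frac{1}{2}\,\Sigma^{b}U'\Sigma^{c}_{d}F_{d}\mathbf{1}, \qquad (4.38)$$

with $\Sigma^{b} = \Sigma(\hat{\vec{\beta}}) = \{\sigma^{b}_{ij}\} = (U'WU)^{-1}$, $\Sigma^{c} = \Sigma(U\hat{\vec{\beta}}) = \{\sigma^{c}_{ij}\} = U\Sigma^{b}U'$, $\Sigma^{a}_{d} = \mathrm{diag}\{\sigma^{a}_{11},\dots,\sigma^{a}_{tt}\}$, $\Sigma^{c}_{d} = \mathrm{diag}\{\sigma^{c}_{11},\dots,\sigma^{c}_{t1}\}$, $\mathbf{1}$ a $\frac{t(t+1)}{2}\times 1$ vector of ones, and $F_{d} = \mathrm{diag}\{f_{11},\dots,f_{t1}\}$ with

$$f_{ij} = \mathrm{Var}[Y_{ij}]^{-1}\left(\frac{d\mu_{ij}}{d\eta_{ij}}\right)\left(\frac{d^{2}\mu_{ij}}{d\eta_{ij}^{2}}\right).$$

It follows that the $n^{-1}$ bias of $\hat{\vec{\eta}}$ also has a simple expression:

$$B(\hat{\vec{\eta}}) = -\frac{1}{2}\,R\,\Sigma^{b}U'\Sigma^{c}_{d}F_{d}\mathbf{1}. \qquad (4.39)$$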

To evaluate the $n^{-1}$ biases of $\hat{\vec{\beta}}$ and $\hat{\vec{\eta}}$ we need only the variance and the link functions with their first and second derivatives. In the right-hand sides of equations (4.38) and (4.39), which are of order $n^{-1}$, consistent estimates of the parameters $\vec{\mu}$ can be inserted to define the corrected maximum likelihood estimates $\hat{\vec{\eta}}_{c} = \hat{\vec{\eta}} - \hat{B}(\hat{\vec{\eta}})$ and $\hat{\vec{\beta}}_{c} = \hat{\vec{\beta}} - \hat{B}(\hat{\vec{\beta}})$, which should have smaller biases than the corresponding $\hat{\vec{\eta}}$ and $\hat{\vec{\beta}}$. From now on $\hat{B}(\cdot)$ means the value of $B(\cdot)$ at the point $\hat{\vec{\mu}}$. Expressions (4.38) and (4.39) are applicable even if the link is not the same for each observation. For the linear model with any distribution in the exponential family, $B(\hat{\vec{\beta}})$ and $B(\hat{\vec{\eta}})$ are zero. This is to be expected for the normal linear model or for the inverse Gaussian non-intercept linear regression model. However, it is not obvious that this happens for any distribution in the exponential family (4.17) with identity link, since $\hat{\vec{\beta}}$ is obtained, apart from these cases, from the non-linear equation (4.36), because of the dependence of $\hat{\vec{\beta}}$ on $\hat{W}$ and $\hat{\vec{y}}^{*}$.

We now give the $n^{-1}$ bias of $\hat{\vec{\mu}}$. Because $\hat{\mu}_{ij} = g^{-1}(\hat{\eta}_{ij}) = g^{-1}\big((R\hat{\vec{\beta}})_{ij}\big)$ and the link function is monotone and twice differentiable, we can apply a Taylor series expansion of $\hat{\mu}_{ij}$ around $\eta_{ij}$:

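$$
\begin{aligned}
\hat{\mu}_{ij} &\cong \mu_{ij} + \frac{d\mu_{ij}}{d\eta_{ij}}(\hat{\eta}_{ij}-\eta_{ij}) + \frac{1}{2}\frac{d^{2}\mu_{ij}}{d\eta_{ij}^{2}}(\hat{\eta}_{ij}-\eta_{ij})^{2}, \\
\hat{\mu}_{ij} - \mu_{ij} &\cong \frac{d\mu_{ij}}{d\eta_{ij}}(\hat{\eta}_{ij}-\eta_{ij}) + \frac{1}{2}\frac{d^{2}\mu_{ij}}{d\eta_{ij}^{2}}(\hat{\eta}_{ij}-\eta_{ij})^{2}, \\
E[\hat{\mu}_{ij} - \mu_{ij}] &\cong \frac{d\mu_{ij}}{d\eta_{ij}}E[(\hat{\eta}_{ij}-\eta_{ij})] + \frac{1}{2}\frac{d^{2}\mu_{ij}}{d\eta_{ij}^{2}}\mathrm{Var}[\hat{\eta}_{ij}].
\end{aligned}
$$

In matrix notation

$$
\begin{aligned}
E[\hat{\vec{\mu}} - \vec{\mu}] &\cong G_{1}E[(\hat{\vec{\eta}} - \vec{\eta})] + \frac{1}{2}G_{2}[\mathrm{Var}(\hat{\vec{\eta}})] \\
&\cong -\frac{1}{2}G_{1}R\,\Sigma^{b}U'\Sigma^{c}_{d}F_{d}\mathbf{1} + \frac{1}{2}G_{2}\Sigma^{a}_{d}\mathbf{1} \\
&= \frac{1}{2}\left\{G_{2}\Sigma^{a}_{d}\mathbf{1} - G_{1}R\,\Sigma^{b}U'\Sigma^{c}_{d}F_{d}\mathbf{1}\right\}.
\end{aligned}
$$

So, the first-order bias of $\hat{\vec{\mu}}$ in matrix notation is given by the following equation:

$$B(\hat{\vec{\mu}}) = \frac{1}{2}\left\{G_{2}\Sigma^{a}_{d}\mathbf{1} - G_{1}R\,\Sigma^{b}U'\Sigma^{c}_{d}F_{d}\mathbf{1}\right\}, \qquad (4.40)$$

where $\mathbf{1}$ is a $t^{2}\times 1$ vector of ones in the first term and the $\frac{t(t+1)}{2}\times 1$ vector of ones of (4.38) in the second, $G_{1} = \mathrm{diag}\{\psi_{11},\dots,\psi_{tt}\}$ and $G_{2} = \mathrm{diag}\{\varphi_{11},\dots,\varphi_{tt}\}$, with $\psi_{ij} = \frac{d\mu_{ij}}{d\eta_{ij}}$ and $\varphi_{ij} = \frac{d^{2}\mu_{ij}}{d\eta_{ij}^{2}}$.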


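So, we can define adjusted values as $\hat{\vec{\mu}}_{c} = \hat{\vec{\mu}} - \hat{B}(\hat{\vec{\mu}})$, which should have

smaller biases than the corresponding $\hat{\vec{\mu}}$. Note that $\hat{B}(\cdot)$ means here the

value of $B(\cdot)$ taken at $(\hat{\phi}, \hat{\vec{\mu}})$.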
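For a single cell and the log link, where ψ = ϕ = µ = e^η, the bias formula (4.40) reduces to µ·(B(η̂) + Var[η̂]/2). The sketch below compares this first-order value with the exact bias in the special case where η̂ is normally distributed. The normality of η̂ is an assumption of this illustration (it holds only asymptotically in the text), and the function names are hypothetical.

```python
import math


def first_order_bias_mu(eta, bias_eta, var_eta):
    """Scalar version of (4.40) for the log link: psi * B(eta_hat) + 0.5 * phi2 * Var[eta_hat]."""
    psi = math.exp(eta)     # d mu / d eta for mu = exp(eta)
    phi2 = math.exp(eta)    # d^2 mu / d eta^2 for mu = exp(eta)
    return psi * bias_eta + 0.5 * phi2 * var_eta


def exact_bias_mu_normal(eta, bias_eta, var_eta):
    """Exact E[exp(eta_hat)] - exp(eta) when eta_hat ~ N(eta + bias_eta, var_eta)."""
    return math.exp(eta + bias_eta + 0.5 * var_eta) - math.exp(eta)
```

For small bias and variance of η̂ the two values agree closely; the gap grows with Var[η̂], which is where the second-order Taylor expansion starts to lose accuracy.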

4.4.2 Lower and upper bounds

In this subsection we will derive the upper and lower bounds in convex or-

der, as described in Chapter 2, for the discounted IBNR reserve SLL, SLLS

and SGLIM under the different regression models.

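Using the results of Chapter 2, we derive a convex lower and upper bound for $S = \sum_{i}\sum_{j} X_{ij}Z_{ij}$ given by

$$
\underbrace{\sum_{i}\sum_{j} E[X_{ij}]\,E[Z_{ij}\mid\Lambda]}_{S^{l}}
\;\le_{cx}\;
\underbrace{\sum_{i}\sum_{j} X_{ij}Z_{ij}}_{S}
\;\le_{cx}\;
\underbrace{\sum_{i}\sum_{j} F^{-1}_{X_{ij}}(U)\,F^{-1}_{Z_{ij}}(V)}_{S^{c}},
$$

with

$$
X_{ij} =
\begin{cases}
e^{\varepsilon_{ij}} & (S_{LL}); \\
e^{(R\vec{\beta})_{ij}+\sigma\varepsilon_{ij}} & (S_{LLS}); \\
\hat{\mu}_{ij} & (S_{GLIM}),
\end{cases}
\qquad
Z_{ij} =
\begin{cases}
e^{(R\hat{\vec{\beta}})_{ij}-Y(i+j-t-1)} & (S_{LL}); \\
e^{-Y(i+j-t-1)} & (S_{LLS}); \\
e^{-Y(i+j-t-1)} & (S_{GLIM}).
\end{cases}
$$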

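We introduce the random variables $W_{ij}$ and $\widetilde{W}_{ij}$ defined by

$$W_{ij} = (R\hat{\vec{\beta}})_{ij} - Y(i+j-t-1) \quad\text{and}\quad \widetilde{W}_{ij} = -Y(i+j-t-1), \qquad (4.41)$$

with

$$
\begin{aligned}
E[W_{ij}] &= (R\vec{\beta})_{ij} - \left(\delta + \tfrac{1}{2}\varsigma^{2}\right)(i+j-t-1), \\
E[\widetilde{W}_{ij}] &= -\left(\delta + \tfrac{1}{2}\varsigma^{2}\right)(i+j-t-1), \\
\mathrm{Var}[W_{ij}] &= \sigma^{2}_{W_{ij}} = \sigma^{2}\big(R(U'U)^{-1}R'\big)_{ij} + (i+j-t-1)\varsigma^{2}, \\
\mathrm{Var}[\widetilde{W}_{ij}] &= \sigma^{2}_{\widetilde{W}_{ij}} = (i+j-t-1)\varsigma^{2}.
\end{aligned}
$$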


The lower bound

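To compute the lower bound we consider the following conditioning normal random variable of the form (2.53)

$$\Lambda = \sum_{i=2}^{t}\sum_{j=t+2-i}^{t} \nu_{ij}\,Y(i+j-t-1), \qquad (4.42)$$

with

$$
\nu_{ij} =
\begin{cases}
e^{(R\vec{\beta})_{ij}}\,e^{-(i+j-t-1)\delta} & (S^{l}_{LL}); \\
E\big[e^{(R\vec{\beta})_{ij}+\sigma\varepsilon_{ij}}\big]\,e^{-(i+j-t-1)\delta} & (S^{l}_{LLS}); \\
\big(\mu_{ij} + B(\hat{\vec{\mu}})_{ij}\big)\,e^{-(i+j-t-1)\delta} & (S^{l}_{GLIM}).
\end{cases}
\qquad (4.43)
$$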

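Notice that $(W_{ij},\Lambda)$ has a bivariate normal distribution. Conditionally given $\Lambda = \lambda$, $W_{ij}$ has a univariate normal distribution with mean and variance given by

$$E[W_{ij}\mid\Lambda=\lambda] = E[W_{ij}] + \rho_{ij}\frac{\sigma_{W_{ij}}}{\sigma_{\Lambda}}(\lambda - E[\Lambda]) \qquad (4.44)$$

and

$$\mathrm{Var}[W_{ij}\mid\Lambda=\lambda] = \sigma^{2}_{W_{ij}}\left(1-\rho^{2}_{ij}\right), \qquad (4.45)$$

where $\rho_{ij}$ denotes the correlation between $\Lambda$ and $W_{ij}$. The same is true for $(\widetilde{W}_{ij},\Lambda)$, where we denote the correlation between $\Lambda$ and $\widetilde{W}_{ij}$ by $\tilde{\rho}_{ij}$.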

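The lower bound can be written as

$$
\begin{aligned}
S^{l}_{LL} &= \sum_{i=2}^{t}\sum_{j=t+2-i}^{t} E[X_{ij}]\,e^{E[W_{ij}]+\rho_{ij}\sigma_{W_{ij}}\Phi^{-1}(V)+\frac{1}{2}(1-\rho^{2}_{ij})\sigma^{2}_{W_{ij}}}, \\
S^{l}_{LLS} &= \sum_{i=2}^{t}\sum_{j=t+2-i}^{t} E[X_{ij}]\,e^{E[\widetilde{W}_{ij}]+\tilde{\rho}_{ij}\sigma_{\widetilde{W}_{ij}}\Phi^{-1}(V)+\frac{1}{2}(1-\tilde{\rho}^{2}_{ij})\sigma^{2}_{\widetilde{W}_{ij}}}, \\
S^{l}_{GLIM} &= \sum_{i=2}^{t}\sum_{j=t+2-i}^{t} E[X_{ij}]\,e^{E[\widetilde{W}_{ij}]+\tilde{\rho}_{ij}\sigma_{\widetilde{W}_{ij}}\Phi^{-1}(V)+\frac{1}{2}(1-\tilde{\rho}^{2}_{ij})\sigma^{2}_{\widetilde{W}_{ij}}},
\end{aligned}
$$

with

$$
E[X_{ij}] =
\begin{cases}
E\big[e^{\varepsilon_{ij}}\big] = e^{\frac{1}{2}\sigma^{2}} & (S_{LL}); \\
E\big[e^{(R\vec{\beta})_{ij}+\sigma\varepsilon_{ij}}\big] = \text{see Table 4.2} & (S_{LLS}); \\
E\big[g^{-1}\big((R\hat{\vec{\beta}})_{ij}\big)\big] = \mu_{ij} + B(\hat{\vec{\mu}})_{ij} & (S_{GLIM}).
\end{cases}
$$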


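The correlations $\rho_{ij}$ and $\tilde{\rho}_{ij}$ are given by

$$\rho_{ij} = \frac{\mathrm{Cov}[\Lambda, W_{ij}]}{\sigma_{\Lambda}\,\sigma_{W_{ij}}}, \qquad \tilde{\rho}_{ij} = \frac{\mathrm{Cov}[\Lambda, \widetilde{W}_{ij}]}{\sigma_{\Lambda}\,\sigma_{\widetilde{W}_{ij}}},$$

with

$$\mathrm{Cov}[\Lambda, W_{ij}] = \mathrm{Cov}[\Lambda, \widetilde{W}_{ij}] = -\varsigma^{2}\sum_{k=2}^{t}\sum_{l=t+2-k}^{t} \nu_{kl}\,\min(i+j-t-1,\; k+l-t-1)$$

and

$$\mathrm{Var}[\Lambda] = \sigma^{2}_{\Lambda} = \varsigma^{2}\sum_{r=2}^{t}\sum_{s=t+2-r}^{t}\sum_{v=2}^{t}\sum_{w=t+2-v}^{t} \nu_{rs}\,\nu_{vw}\,\min(r+s-t-1,\; v+w-t-1).$$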

By conditioning on one of the standard uniform random variables one can

compute the distribution function of the lower bound. See Subsection 2.5.3

for more details.
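The covariance and variance formulas above translate into a few lines of code. The following minimal sketch computes σ_Λ and the correlations between Λ and −Y(i + j − t − 1) (the variables carrying only the discounting risk) for a given set of positive weights ν_ij. The dictionary layout keyed by the cell (i, j) and the function name are assumptions of this illustration.

```python
import math


def lambda_correlations(nu, t, vol):
    """Return (rho, sd_lambda) for Lambda = sum of nu_ij * Y(i + j - t - 1).

    nu  : dict mapping a future cell (i, j) to its weight nu_ij
    t   : dimension of the run-off triangle
    vol : volatility of the return process
    rho : dict mapping (i, j) to the correlation between Lambda and -Y(i + j - t - 1)
    """
    lag = {cell: cell[0] + cell[1] - t - 1 for cell in nu}
    var_lambda = vol ** 2 * sum(
        nu[a] * nu[b] * min(lag[a], lag[b]) for a in nu for b in nu
    )
    sd_lambda = math.sqrt(var_lambda)
    rho = {}
    for cell in nu:
        cov = -vol ** 2 * sum(nu[b] * min(lag[cell], lag[b]) for b in nu)
        sd_w = vol * math.sqrt(lag[cell])       # standard deviation of -Y(lag)
        rho[cell] = cov / (sd_lambda * sd_w)
    return rho, sd_lambda
```

With positive weights every correlation is negative, and with a single future cell it equals −1, since Λ is then a multiple of the one return variable involved.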

For the lognormal linear and loglinear location-scale models there exists a closed-form expression for the quantile function of $S^{l}$.

Taking into account that $\Lambda = \sum_{i=2}^{t}\sum_{j=t+2-i}^{t} \nu_{ij}\,Y(i+j-t-1)$ is normally distributed, we find that

$$F^{-1}_{\Lambda}(1-p) = E[\Lambda] - \sigma_{\Lambda}\Phi^{-1}(p),$$

and hence

$$
\begin{aligned}
F^{-1}_{S^{l}}(p) &= F^{-1}_{\sum_{i=2}^{t}\sum_{j=t+2-i}^{t} E[X_{ij}]E[Z_{ij}\mid\Lambda]}(p), \qquad p \in (0,1) \\
&= \sum_{i=2}^{t}\sum_{j=t+2-i}^{t} F^{-1}_{E[X_{ij}]E[Z_{ij}\mid\Lambda]}(p) \\
&= \sum_{i=2}^{t}\sum_{j=t+2-i}^{t} E[X_{ij}]\,E\big[Z_{ij}\mid\Lambda = F^{-1}_{\Lambda}(1-p)\big].
\end{aligned}
$$

In order to derive the above result, we used the fact that for a non-increasing continuous function g, we have

$$F^{-1}_{g(X)}(p) = g\big(F^{-1}_{X}(1-p)\big), \qquad p \in (0,1). \qquad (4.46)$$


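Here, $g = E[Z_{ij}\mid\Lambda]$ is a non-increasing function of $\Lambda$ since $\rho_{ij}$ ($\tilde{\rho}_{ij}$) is always negative. So, we have that

$$
F^{-1}_{S^{l}}(p) =
\begin{cases}
\displaystyle\sum_{i=2}^{t}\sum_{j=t+2-i}^{t} E[X_{ij}]\,e^{E[W_{ij}]-\rho_{ij}\sigma_{W_{ij}}\Phi^{-1}(p)+\frac{1}{2}(1-\rho^{2}_{ij})\sigma^{2}_{W_{ij}}} & (LL); \\[2ex]
\displaystyle\sum_{i=2}^{t}\sum_{j=t+2-i}^{t} E[X_{ij}]\,e^{E[\widetilde{W}_{ij}]-\tilde{\rho}_{ij}\sigma_{\widetilde{W}_{ij}}\Phi^{-1}(p)+\frac{1}{2}(1-\tilde{\rho}^{2}_{ij})\sigma^{2}_{\widetilde{W}_{ij}}} & (LLS),
\end{cases}
$$

and $F_{S^{l}}(x)$ can be obtained by solving the equation

$$
\begin{aligned}
\sum_{i=2}^{t}\sum_{j=t+2-i}^{t} E[X_{ij}]\,e^{E[W_{ij}]-\rho_{ij}\sigma_{W_{ij}}\Phi^{-1}\left(F_{S^{l}_{LL}}(x)\right)+\frac{1}{2}(1-\rho^{2}_{ij})\sigma^{2}_{W_{ij}}} &= x, \qquad (LL) \\
\sum_{i=2}^{t}\sum_{j=t+2-i}^{t} E[X_{ij}]\,e^{E[\widetilde{W}_{ij}]-\tilde{\rho}_{ij}\sigma_{\widetilde{W}_{ij}}\Phi^{-1}\left(F_{S^{l}_{LLS}}(x)\right)+\frac{1}{2}(1-\tilde{\rho}^{2}_{ij})\sigma^{2}_{\widetilde{W}_{ij}}} &= x. \qquad (LLS)
\end{aligned}
$$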
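Since the quantile function of the lower bound is a sum of terms that increase in p (all correlations being negative), the distribution function can be recovered by a one-dimensional root search. The sketch below evaluates the quantile formula and inverts it by bisection. The tuple layout of `terms`, the bisection tolerance and the function names are assumptions of this illustration; `statistics.NormalDist` from the Python standard library supplies Φ⁻¹.

```python
import math
from statistics import NormalDist


def lower_bound_quantile(p, terms):
    """Quantile of the lower bound at level p.

    terms: list of tuples (EX, EW, rho, sdW), one per future cell, holding
    E[X_ij], the mean of W_ij, its correlation with Lambda (rho <= 0) and its
    standard deviation.
    """
    z = NormalDist().inv_cdf(p)
    return sum(
        ex * math.exp(ew - rho * sd * z + 0.5 * (1.0 - rho ** 2) * sd ** 2)
        for ex, ew, rho, sd in terms
    )


def lower_bound_cdf(x, terms, tol=1e-10):
    """Solve lower_bound_quantile(p) = x for p by bisection."""
    lo, hi = 1e-12, 1.0 - 1e-12
    if x <= lower_bound_quantile(lo, terms):
        return 0.0
    if x >= lower_bound_quantile(hi, terms):
        return 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if lower_bound_quantile(mid, terms) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Evaluating `lower_bound_quantile(0.95, terms)` gives the 95% quantile of the lower bound directly, while `lower_bound_cdf` is useful when the probability of exceeding a given reserve level is required.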

The upper bound

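The upper bound can be written as

$$
\begin{aligned}
S^{c}_{LL} &= \sum_{i=2}^{t}\sum_{j=t+2-i}^{t} F^{-1}_{X_{ij}}(U)\,e^{E[W_{ij}]+\sigma_{W_{ij}}\Phi^{-1}(V)}, \\
S^{c}_{LLS} &= \sum_{i=2}^{t}\sum_{j=t+2-i}^{t} F^{-1}_{X_{ij}}(U)\,e^{E[\widetilde{W}_{ij}]+\sigma_{\widetilde{W}_{ij}}\Phi^{-1}(V)}, \\
S^{c}_{GLIM} &= \sum_{i=2}^{t}\sum_{j=t+2-i}^{t} F^{-1}_{X_{ij}}(U)\,e^{E[\widetilde{W}_{ij}]+\sigma_{\widetilde{W}_{ij}}\Phi^{-1}(V)},
\end{aligned}
$$

with

$$
F^{-1}_{X_{ij}}(U) =
\begin{cases}
F^{-1}_{e^{\varepsilon_{ij}}}(U) = e^{\sigma\Phi^{-1}(U)} & (S_{LL}); \\
F^{-1}_{e^{(R\vec{\beta})_{ij}+\sigma\varepsilon_{ij}}}(U) = \text{see Table 4.2} & (S_{LLS}); \\
F^{-1}_{g^{-1}\left((R\hat{\vec{\beta}})_{ij}\right)}(U) = \mu_{ij} + B(\hat{\vec{\mu}})_{ij} + \sqrt{\Sigma(\hat{\vec{\mu}})_{ij}}\;\Phi^{-1}(U) & (S_{GLIM}).
\end{cases}
$$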


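The cdf of the upper bound can be computed as described in Subsection 2.5.3. Using Remark 4 one can calculate the distribution function of $S^{c}_{LL}$ and $S^{c}_{LLS}$ more efficiently. We start with the cdf of $S^{c}_{LL}$.

From previous results

$$F_{S^{c}_{LL}}(y) = \int_{0}^{1} F_{N}\Big(\ln(y) - \ln\big(F^{-1}_{S^{c\prime}_{LL}}(u)\big)\Big)\,du,$$

with $F_{N}(x)$ the cdf of $N(0,\sigma^{2})$ and

$$
\begin{aligned}
S^{c\prime}_{LL} &= \sum_{i=2}^{t}\sum_{j=t+2-i}^{t} \exp\Big(F^{-1}_{(R\hat{\vec{\beta}})_{ij}-Y(i+j-t-1)}(U)\Big) \\
&= \sum_{i=2}^{t}\sum_{j=t+2-i}^{t} e^{(R\vec{\beta})_{ij}-\left(\delta+\frac{1}{2}\varsigma^{2}\right)(i+j-t-1)} \times e^{\sqrt{\sigma^{2}\left(R(U'U)^{-1}R'\right)_{ij}+\varsigma^{2}(i+j-t-1)}\;\Phi^{-1}(U)}
\end{aligned}
$$

and

$$F^{-1}_{S^{c\prime}_{LL}}(u) = \sum_{i=2}^{t}\sum_{j=t+2-i}^{t} e^{(R\vec{\beta})_{ij}-\left(\delta+\frac{1}{2}\varsigma^{2}\right)(i+j-t-1)} \times e^{\sqrt{\sigma^{2}\left(R(U'U)^{-1}R'\right)_{ij}+\varsigma^{2}(i+j-t-1)}\;\Phi^{-1}(u)}.$$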

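We can write the upper bound of $S_{LLS}$ as

$$S^{c}_{LLS} = G\sum_{i=2}^{t}\sum_{j=t+2-i}^{t} e^{E[\widetilde{W}_{ij}]+\sigma_{\widetilde{W}_{ij}}\Phi^{-1}(V)}\,e^{(R\vec{\beta})_{ij}},$$

with

$$
G =
\begin{cases}
e^{\sigma\Phi^{-1}(U)} & \text{(Lognormal linear)}; \\
\left(-\log(1-U)\right)^{\sigma} & \text{(Weibull-extreme value)}; \\
\left(\frac{U}{1-U}\right)^{\sigma} & \text{(Logistic)}.
\end{cases}
$$

The distribution function of G is given by

$$
F_{G}(x) =
\begin{cases}
\Phi\!\left(\frac{\ln x}{\sigma}\right) & \text{(Lognormal linear)}; \\
1 - e^{-x^{\frac{1}{\sigma}}} & \text{(Weibull-extreme value)}; \\
1 - \left(1 + x^{\frac{1}{\sigma}}\right)^{-1} & \text{(Logistic)}.
\end{cases}
$$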


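Using Remark 4 we can write the cdf of $S^{c}_{LLS}$ for the lognormal linear, the Weibull-extreme value and the logistic regression model as follows

$$F_{S^{c}_{LLS}}(y) = \int_{0}^{1} F_{G}\!\left(\frac{y}{F^{-1}_{S^{c\prime}_{LLS}}(u)}\right) du,$$

with

$$
\begin{aligned}
S^{c\prime}_{LLS} &= \sum_{i=2}^{t}\sum_{j=t+2-i}^{t} \exp\Big(F^{-1}_{(R\vec{\beta})_{ij}-Y(i+j-t-1)}(U)\Big) \\
&= \sum_{i=2}^{t}\sum_{j=t+2-i}^{t} e^{(R\vec{\beta})_{ij}-\left(\delta+\frac{1}{2}\varsigma^{2}\right)(i+j-t-1)+\varsigma\sqrt{i+j-t-1}\;\Phi^{-1}(U)}
\end{aligned}
$$

and

$$F^{-1}_{S^{c\prime}_{LLS}}(u) = \sum_{i=2}^{t}\sum_{j=t+2-i}^{t} e^{(R\vec{\beta})_{ij}-\left(\delta+\frac{1}{2}\varsigma^{2}\right)(i+j-t-1)+\varsigma\sqrt{i+j-t-1}\;\Phi^{-1}(u)}.$$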

Remark 7. Since we have no equality of the first moments in the GLIM

framework, the convex order relationship between the two approximations

and S is not valid. This does not impose any restrictions on the use of

the approximations. In fact, we can say that the convex order only holds

asymptotically in this case.

Remark 8. The estimator $W_{D}$ (4.12), for the mean of the IBNR reserve, constitutes a close upper bound for the UMVUE of the mean of the IBNR reserve if $\frac{t(t+1)}{2} - p$ is large and the residual sum of squares is small. It should be noted that $e^{(R\hat{\vec{\beta}})_{ij}+\hat{\sigma}^{2}/2}$ is the estimator of the mean of a lognormal distribution $\log N\big((R\vec{\beta})_{ij}, \sigma^{2}\big)$ obtained by replacing the parameters $\vec{\beta}$ and $\sigma^{2}$ by their unbiased estimates. Adding now a discount process to $W_{D}$ gives

$$W_{DD} = \sum_{i=2}^{t}\sum_{j=t+2-i}^{t} e^{(R\hat{\vec{\beta}})_{ij}-Y(i+j-t-1)+\frac{1}{2}\hat{\sigma}^{2}}. \qquad (4.47)$$

Now, we can apply the same methodology as explained before. The results for the lognormal linear model are still applicable. The only difference is that $\varepsilon_{ij}$ is replaced by $\frac{1}{2}\hat{\sigma}^{2}$, with

$$\frac{1}{2}\hat{\sigma}^{2} \sim \mathrm{Gamma}\left(\frac{n-p}{2}, \frac{\sigma^{2}}{n-p}\right). \qquad (4.48)$$

4.5 The bootstrap methodology in claims reserving

4.5.1 Introduction

The bootstrap technique as an inferential statistical computer intensive

device was introduced by Efron (1979) as a quite intuitive and simple way

of making approximations to distributions which are very hard or even

impossible to compute analytically. This technique has proved to be a

very useful tool in many statistical applications and can be particularly

interesting to assess the variability of the claim reserving predictions and

to construct upper limits at an adequate confidence level. Its popularity

is due to a combination of available computing power and theoretical de-

velopment. One advantage of the bootstrap technique is that it can be

applied to any data set without having to assume an underlying distribu-

tion. Moreover most computer packages can handle very large numbers of

repeated samplings.

Our goal is to obtain quantiles of the loss reserve for which the predic-

tive distribution is not known. If we do not know the distribution, then

our best guess at the distribution is provided by the data. The main idea

in bootstrapping is that we (a) pretend that the data constitute the popu-

lation and (b) take samples from this pretended population (which we call

“resamples”). Substituting the sample for the population means that we

are interested in the frequency with which the observed values occurred.

This is done by sampling with replacement. From the re-sample, we

calculate the statistic we are interested in. This is called a “bootstrap

statistic”. After storing this value, one repeats the above steps collect-

ing a large number (B) of bootstrap statistics. The general idea is that

the relationship of the bootstrap statistics to the observed statistic is the

same as the relationship of the observed statistic to the true value. Under

mild regularity conditions, the bootstrap yields an approximation to the

distribution of an estimator or test statistic that is at least as accurate

as the approximation obtained from first-order asymptotic theory. For an

introduction explaining the bootstrap technique, see Efron & Tibshirani

(1993).


4.5.2 Central idea

The concept of bootstrap relies on the consideration of the discrete empir-

ical distribution generated by a random sample of size n from an unknown

distribution F . This empirical distribution assigns equal probability to

each sample item. In the discussion which follows, we will write Fn for

that distribution. By generating an independent, identically distributed

random sequence (resample) from the distribution Fn or its appropriately

smoothed version, we can arrive at new estimates of various parameters

and nonparametric characteristics of the original distribution F .

As we have already mentioned, the central idea of bootstrap lies in

sampling the empirical cdf Fn. This idea is closely related to the following,

well-known statistical principle, henceforth referred to as the “plug-in”

principle. Given a parameter of interest θ(F ) depending upon an unknown

population cdf F, we estimate this parameter by $\hat{\theta} = \theta(F_n)$. That is, we

simply replace F in the formula for θ by its empirical counterpart Fn

obtained from the observed data. The plug-in principle will not provide

good results if Fn poorly approximates F , or if there is information about

F other than that provided by the sample. For instance, in some cases we

might know (or be willing to assume) that F belongs to some parametric

family of distributions. However, the plug-in principle and the bootstrap

may be adapted to this latter situation as well. To illustrate the idea,

let us consider a parametric family of cdf’s $\{F_{\mu}\}$ indexed by a parameter µ (possibly a vector), and for some given $\mu_{0}$, let $\hat{\mu}_{0}$ denote its estimate calculated from the sample. The plug-in principle in this case states that we should estimate $\theta(F_{\mu_{0}})$ by $\theta(F_{\hat{\mu}_{0}})$. In this case, the bootstrap is often called parametric, since a resample is now collected from $F_{\hat{\mu}_{0}}$. Here, we refer to any replica of $\hat{\theta}$ calculated from a resample as “a bootstrap estimate of θ(F)” and denote it by $\hat{\theta}^{*}$.

4.5.3 Bootstrap confidence intervals

Let us now turn to the problem of using the bootstrap methodology to

construct confidence intervals. This area has been a major focus of theo-

retical work on the bootstrap, and several different methods of approaching

the problem have been suggested. The “naive” procedure described below

is not the most efficient one and can be significantly improved in both

rate of convergence and accuracy. It is, however, intuitively obvious and


easy to justify, and seems to be working well enough for the cases con-

sidered here. For a complete review of available approaches to bootstrap

confidence intervals, see Efron & Tibshirani (1993). Let us consider $\hat{\theta}^{*}$, a bootstrap estimate of θ based on a resample of size n from the original sample X1, . . . , Xn, and let $G_{*}$ be its distribution function given the observed sample values

$$G_{*}(x) = \Pr[\hat{\theta}^{*} \le x \mid X_{1} = x_{1}, \dots, X_{n} = x_{n}].$$

The bootstrap percentile method gives $G_{*}^{-1}(\alpha)$ and $G_{*}^{-1}(1-\alpha)$ as, respectively, lower and upper bounds for the (1 − 2α) confidence interval for θ. Let us note that for most statistics $\hat{\theta}$, the distribution function of the bootstrap estimator $\hat{\theta}^{*}$ is not available. In practice, $G_{*}^{-1}(\alpha)$ and $G_{*}^{-1}(1-\alpha)$ are approximated by taking multiple resamples and then calculating the empirical percentiles. In most cases B ≥ 1000 is recommended.
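The percentile method described above can be sketched in a few lines: resample with replacement, recompute the statistic B times, and read off the empirical α and 1 − α percentiles. The function name, the fixed seed and the simple rounding rule used to pick the percentiles are assumptions of this illustration; other percentile conventions are equally defensible.

```python
import random


def bootstrap_percentile_interval(sample, statistic, alpha=0.05, B=1000, seed=0):
    """Approximate (1 - 2*alpha) percentile interval for the given statistic."""
    rng = random.Random(seed)
    n = len(sample)
    replicates = sorted(
        statistic([rng.choice(sample) for _ in range(n)])   # one resample of size n
        for _ in range(B)
    )
    lower = replicates[int(round(alpha * B))]
    upper = replicates[int(round((1.0 - alpha) * B)) - 1]
    return lower, upper
```

For instance, calling the function with `statistics.mean` as the statistic gives an interval for the mean of the sampled population without any distributional assumption.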

4.5.4 Bootstrap in claims reserving

As already mentioned above, with bootstrapping, we treat the obtained

data as if they are an accurate reflection of the parent population, and

then draw many bootstrapped samples by sampling, with replacement,

from a pseudo-population consisting of the obtained data. Technically,

this is called “non-parametric bootstrapping”, because we are sampling

from the actual data and we have made no assumptions about the distri-

bution of the parent population, other than that the raw data adequately

reflect the population’s shape. If we were willing to make more assump-

tions, such as an assumption that the parent population follows a normal

distribution, then we could do our sampling, with replacement, from a

normal distribution. This is called “parametric bootstrapping”.

For a description of the bootstrap methodology in claims reserving we refer

to England & Verrall (1999) and Pinheiro et al. (2003). In these papers the

bootstrap technique is used to obtain prediction errors for different claims

reserving methods, namely methods based on the chain-ladder technique

and on generalized linear models. Applications of the bootstrap technique

to claims reserving can also be found in Lowe (1994), in Taylor (2000) and

in England & Verrall (2002).

Starting from the original run-off triangle one can create a large number

of bootstrap run-off triangles by repeatedly resampling, with replacement,


from the appropriate residuals. For each bootstrap sample the regression

model is refitted and the bootstrap statistic is calculated.

In England & Verrall (1999) the bootstrap technique is used to com-

pute the bootstrap root mean squared error of prediction (RMSEPbs), also

known as the bootstrap standard error of prediction. It equals the square root of the sum of the squares of what they call parameter variability and data variability. For the parameter variability they suggest a correction of the bootstrap standard error to enable a comparison between the

analytic standard error and the bootstrap one by taking account of the

number of parameters used in fitting the model. The bootstrap standard

error is the standard deviation of the bootstrap reserve estimates. So, pa-

rameter variability is defined as the bootstrap standard error multiplied

by the square root of n divided by n− p (n: sample size, p: number of pa-

rameters). Data variability is the square root of the uniformly minimum

variance unbiased estimator of the variance of the IBNR reserve. This

estimator was already calculated by Doray (1996). Note that if the full

predictive distribution can be found, the RMSEP can be obtained directly

by calculating its standard deviation. Using a normal approximation, a

100(1 − α)% bootstrap prediction interval for the total reserve is calcu-

lated as [R ± Φ−1(1 − α/2) ∗ RMSEPbs(R)], with R the initial forecast of

the IBNR reserve.
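The calculation just described can be sketched as follows. This is only an illustration under stated assumptions: the bootstrap reserve estimates, the data variability and the values of n and p are hypothetical inputs, and the function names are ours.

```python
import math
from statistics import NormalDist

import numpy as np

def bootstrap_rmsep(boot_reserves, n_obs, n_params, data_variability):
    """Square root of (parameter variability)^2 + (data variability)^2."""
    boot_se = np.std(boot_reserves, ddof=1)            # bootstrap standard error
    param_variability = boot_se * math.sqrt(n_obs / (n_obs - n_params))
    return math.sqrt(param_variability**2 + data_variability**2)

def normal_prediction_interval(reserve, rmsep, alpha=0.05):
    """[R - z * RMSEP, R + z * RMSEP] with z = Phi^{-1}(1 - alpha / 2)."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return reserve - z * rmsep, reserve + z * rmsep

# Hypothetical inputs: five bootstrap reserve estimates, 66 observed cells,
# 21 regression parameters and a data variability of 5.
boot_reserves = np.array([100.0, 110.0, 90.0, 105.0, 95.0])
rmsep = bootstrap_rmsep(boot_reserves, n_obs=66, n_params=21, data_variability=5.0)
low, high = normal_prediction_interval(reserve=100.0, rmsep=rmsep, alpha=0.05)
```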

The second approach is more robust against deviations from the hypothesis
of the model. For a detailed presentation of this method see
Davison & Hinkley (1997). A new bootstrap statistic is defined here as

a function of the bootstrap estimate and a bootstrap simulation of the

future reality. This statistic is called the prediction error. (This is very

confusing because in the literature the term prediction error is also used

for the RMSEP or the standard error of prediction.) For each bootstrap

loop the prediction error is then kept in a vector and the percentile method

is used to obtain the desired percentile of this prediction error (PPE). In

a last stage an upper limit of the prediction interval for the total reserve

is calculated as [R+ PPE].

The reader can find a complete list of the required steps for those two

procedures in the paper of Pinheiro et al. (2003). These authors have also

compared and discussed the two bootstrap procedures and the main con-

clusion is that the differences amongst the results obtained with the two

procedures, RMSEP and PPE, are not very important. The PPE procedure
generally generates smaller values. Furthermore, the authors suggest
eliminating the residuals with value 0 and working with standardized
residuals, since only the latter can be considered identically
distributed.

The third approach is explained in England & Verrall (2002). Like

in the previous methods, first of all a stochastic model is fitted to the

bootstrap sample and a run-off triangle is bootstrapped. For this pseudo

triangle the parameters are estimated in order to calculate future
incremental claim payments $Y^*_{ij}$. The second stage of the procedure
replicates the process variance. This is achieved by simulating an
observed claim payment for each future cell in the run-off triangle,
using the bootstrap value $Y^*_{ij}$ as the mean, and using the process
distribution assumed in the

underlying model. For each iteration the reserves are calculated by adding

up the simulated forecast payments. The set of reserves obtained in this

way forms the predictive distribution. The percentile method is then used

to obtain the required prediction interval.
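The second stage of this procedure can be sketched as follows. We assume, for illustration only, a Poisson process distribution; the array of bootstrap means is hypothetical and stands in for the values $Y^*_{ij}$ that a refitted model would deliver in each iteration.

```python
import numpy as np

def predictive_reserves(boot_future_means, rng):
    """boot_future_means has shape (B, m): B bootstrap iterations, m future cells.

    For every iteration one payment per future cell is simulated with the
    bootstrap value as mean (Poisson process distribution assumed), and the
    simulated payments are added up to a reserve.
    """
    simulated_payments = rng.poisson(boot_future_means)
    return simulated_payments.sum(axis=1)

rng = np.random.default_rng(1)
# Hypothetical bootstrap means for 10 future cells in 5000 iterations.
boot_means = rng.gamma(shape=50.0, scale=20.0, size=(5000, 10))
reserves = predictive_reserves(boot_means, rng)

# Percentile method: a 95% prediction interval from the predictive distribution.
lower, upper = np.percentile(reserves, [2.5, 97.5])
```

Because the simulation noise is added on top of the bootstrap means, the resulting set of reserves reflects both the estimation error and the process variance.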

In a practical case study one can bootstrap a high percentile of the dis-

tribution of the lower bound in order to describe the estimation error

involved. Taylor & Ashe (1983) used the terminology estimation error for

$\mathrm{Var}[(R\vec{\beta})_{ij}]$ and statistical or random error for
$\mathrm{Var}[\varepsilon_{ij}]$. The estimation error arises from the
estimation of the vector $\vec{\beta}$ from the data, and the

statistical error stems from the stochastic nature of the regression model.

We bootstrap an upper triangle using the non-parametric procedure. This

involves resampling, with replacement, from the original residuals and then

creating a new triangle of past claim payments using the resampled resid-

uals together with the fitted values.

With regression type problems the resampling procedure is applied to the

residuals of the model. Residuals are approximately independent and iden-

tically distributed. In a statistical analysis they are commonly used in

order to explore the adequacy of the fit of the model, with respect to

the choice of the variance function, link function and terms in the linear

predictor. Residuals may also indicate the presence of anomalous values

requiring further investigation.

For generalized linear models an extended definition of residuals is re-

quired, applicable to all the distributions that may replace the normal

distribution. It is convenient if these residuals can be used for the same

purposes as standard normal residuals. Three well-known forms of
generalized residuals are the Pearson, Anscombe and deviance residuals.
Pearson residuals are easy to interpret: they are just the raw residuals
scaled by the estimated standard deviation of the response variable. A
disadvantage of the Pearson residual is that its distribution for
non-normal distributions is often markedly skewed, and so it may fail to
have properties similar to those of a normal theory residual. Anscombe
and deviance residuals are more appropriate for checking approximate
normality.
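As a small illustration, Pearson residuals for a model with variance $\phi V(\mu)$ can be computed as below. The power variance function $V(\mu) = \mu^{\kappa}$ is the one used for the generalized linear models in this chapter; the function name and the numerical inputs are hypothetical.

```python
import numpy as np

def pearson_residuals(y, mu, phi=1.0, kappa=1.0):
    """Raw residuals scaled by the estimated standard deviation sqrt(phi * mu**kappa)."""
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    return (y - mu) / np.sqrt(phi * mu**kappa)

# Poisson-type variance (kappa = 1, phi = 1) with hypothetical data.
r = pearson_residuals(y=[12.0, 8.0], mu=[9.0, 16.0])
print(r)  # [ 1. -2.]
```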

In general the lower bound $S^l$ turns out to perform very well. A final

method to obtain a confidence bound for the predictive distribution is a

combination of the power of this lower bound and bootstrapping. We will

bootstrap a high percentile of the distribution of the lower bound. This is

done as follows:

1. The preliminaries:

• Estimate the model parameters $\vec{\beta}$ and $\sigma^2$ (LL), $\sigma^2$ (LLS), $\phi$ (GLIM).

• Calculate the fitted values ($i = 1, \ldots, t$; $j = 1, \ldots, t+1-i$):

$$\mu_{ij} = \begin{cases} e^{(R\vec{\beta})_{ij}} & \text{(LL)} \\ \text{see Table 4.2} & \text{(LLS)} \\ g^{-1}\big((R\vec{\beta})_{ij}\big) & \text{(GLIM)} \end{cases}$$

• Calculate the residuals ($i = 1, \ldots, t$; $j = 1, \ldots, t+1-i$):

$$r_{ij} = \begin{cases} z_{ij} - \ln \mu_{ij} & \text{(LL)} \\ z_{ij} - \ln \mu_{ij} & \text{(LLS)} \\ \dfrac{y_{ij} - \mu_{ij}}{\sqrt{\phi V(\mu_{ij})}} & \text{(GLIM)} \end{cases}$$

2. Bootstrap loop (to be repeated B times):

• Generate a set of residuals $r^*_{ij}$ by sampling with replacement from the original residuals $(r_{ij})$ ($i = 1, \ldots, t$; $j = 1, \ldots, t+1-i$).

• Create a new upper triangle $y^*_{ij}$ ($i = 1, \ldots, t$; $j = 1, \ldots, t+1-i$):

– non-parametric bootstrap (NPB)

$$y^*_{ij} = \begin{cases} e^{\ln(\mu_{ij}) + r^*_{ij}} & \text{(LL)} \\ e^{\ln(\mu_{ij}) + r^*_{ij}} & \text{(LLS)} \\ \sqrt{\phi V(\mu_{ij})}\, r^*_{ij} + \mu_{ij} & \text{(GLIM)} \end{cases}$$

– parametric bootstrap (PB)

$$y^*_{ij} \begin{cases} = e^{(R\vec{\beta})_{ij} + \sigma N(0,1)} & \text{(LL)} \\ \text{see Table 4.2} & \text{(LLS)} \\ \approx \mu_{ij} + B(\vec{\mu})_{ij} + \sqrt{\Sigma(\vec{\mu})_{ij}}\, N(0,1) & \text{(GLIM)} \end{cases}$$

Now we have bootstrapped a run-off triangle.

• Calculate for this bootstrapped triangle the parameters $\vec{\beta}^*$ and $(\sigma^2)^*$ (LL), $(\check{\sigma}^2)^*$ (LLS), $\phi^*$ (GLIM).

• Calculate the percentile k of the distribution of $S^l$, $S^{l*}(k)$, using these parameters.

• Return to the beginning of step 2 until the B repetitions are completed.

3. Analysis of the bootstrap data:

• Apply the percentile method to the bootstrap observations to obtain the required prediction interval.
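The structure of the non-parametric loop for the GLIM case can be sketched as follows. This is a simplified illustration, not the full procedure: refitting the model and computing the percentile of $S^l$ are replaced by a generic callable `statistic` (here simply the sum of the pseudo data), the variance function is assumed to be $V(\mu) = \mu^{\kappa}$, and the fitted values and residuals are hypothetical.

```python
import numpy as np

def bootstrap_statistics(mu, resid, phi, kappa, statistic, n_boot, rng):
    """Non-parametric residual bootstrap for a GLIM with V(mu) = mu**kappa.

    mu, resid : 1-D arrays over the observed cells of the upper triangle.
    statistic : callable mapping a pseudo triangle y* to a single number; in
                the procedure above it would refit the model and return the
                percentile k of S^l (not implemented in this sketch).
    """
    mu = np.asarray(mu, dtype=float)
    resid = np.asarray(resid, dtype=float)
    scale = np.sqrt(phi * mu**kappa)
    out = np.empty(n_boot)
    for b in range(n_boot):
        r_star = rng.choice(resid, size=resid.size, replace=True)  # resample residuals
        y_star = scale * r_star + mu                                # new upper triangle
        out[b] = statistic(y_star)
    return out

rng = np.random.default_rng(42)
mu = np.array([100.0, 80.0, 60.0, 120.0, 90.0, 150.0])   # hypothetical fitted values
resid = np.array([0.5, -1.2, 0.3, 0.9, -0.7, 0.2])       # hypothetical Pearson residuals
boot = bootstrap_statistics(mu, resid, phi=1.0, kappa=1.0,
                            statistic=np.sum, n_boot=1000, rng=rng)

# Step 3: percentile method applied to the bootstrap observations.
lower, upper = np.percentile(boot, [2.5, 97.5])
```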

4.6 Three applications

In this section we illustrate the effectiveness of the bounds derived for the

discounted IBNR reserve S, under the model studied. We investigate the

accuracy of the proposed bounds, by comparing their cumulative distri-

bution function to the empirical distribution obtained with Monte Carlo

simulation (MC), which serves as a close approximation to the exact dis-

tribution of S. The simulation results are based on generating 100 000

random paths. The estimates obtained from this time-consuming simula-

tion will serve as benchmark. The random paths are based on antithetic

variables in order to reduce the variance of the Monte Carlo estimates.

In order to illustrate the power of the bounds, namely inspecting the
deviation of the cdf of the convex bounds $S^l$ and $S^c$ from the true
distribution of the total IBNR reserve S, we simulate a triangle from a
particular model. We created a non-cumulative run-off triangle based on
the chain-ladder predictor (4.31) with parameters given in Table 4.4. So,
the run-off triangle has only trends in the two main directions, namely
in the year of origin and in the development year. The parameter β1 is
set equal to zero in order to have a non-singular regression matrix.

α1    α2    α3     α4     α5     α6     α7     α8     α9     α10    α11
12.8  12.9  13.6   13.5   13.4   13.2   13.8   13.7   13.1   13.0   13.9

β1    β2    β3     β4     β5     β6     β7     β8     β9     β10    β11
0     0.31  −0.11  −0.42  −0.37  −0.87  −0.96  −1.33  −1.63  −1.92  −2.31

Table 4.4: Model parameters.

We also specify the multivariate distribution function of the random
vector $(Y_1, Y_2, \ldots, Y_{t-1})$. In particular, we will assume that
the random variables $Y_i$ are i.i.d. and
$N(\delta + \tfrac{1}{2}\varsigma^2, \varsigma^2)$ distributed with
δ = 0.08 and ς = 0.11. This enables us to simulate the cdf's, which
cannot be computed analytically.
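A minimal sketch of how the financial part can be simulated with antithetic variables is given below. It assumes that $Y(k)$ denotes the cumulative return $Y_1 + \cdots + Y_k$ (this notation is defined outside the present section, so we label it as an assumption); the function name and the number of paths are ours.

```python
import numpy as np

def simulate_discount_factors(t, delta, varsigma, n_pairs, rng):
    """Return an array of shape (2 * n_pairs, t - 1) with exp(-Y(k)), k = 1..t-1.

    Each standard normal draw z is used together with its antithetic
    counterpart -z in order to reduce the variance of the Monte Carlo estimates.
    """
    z = rng.standard_normal((n_pairs, t - 1))
    z = np.vstack([z, -z])                                # antithetic pairs
    y = delta + 0.5 * varsigma**2 + varsigma * z          # yearly returns Y_i
    return np.exp(-np.cumsum(y, axis=1))                  # assumed Y(k) = Y_1 + ... + Y_k

rng = np.random.default_rng(2005)
factors = simulate_discount_factors(t=11, delta=0.08, varsigma=0.11,
                                    n_pairs=50_000, rng=rng)
```

With this parameterization $E[e^{-Y_i}] = e^{-\delta}$, so the simulated mean of $e^{-Y(k)}$ can be checked against $e^{-k\delta}$.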

4.6.1 Lognormal linear models

The simulated run-off triangle for this model is displayed in Table 4.5.

Fitting the lognormal linear model with a chain-ladder type predictor gives

the parameter estimates and standard errors shown in Table 4.6.

      1          2          3        4        5        6        7        8        9       10      11
 1    363,346    492,947    322,511  236,555  249,319  151,228  138,373  95,703   71,742  53,788  35,997
 2    397,798    543,864    358,855  263,325  276,817  167,045  153,095  106,272  78,515  58,790
 3    806,154    1,096,841  727,977  530,683  557,870  336,716  310,022  213,706  157,504
 4    727,102    995,988    654,059  476,665  502,405  303,132  278,280  192,436
 5    659,846    900,386    591,633  433,425  457,482  276,056  253,301
 6    541,187    736,205    487,730  353,255  373,921  226,091
 7    979,636    1,342,832  882,924  651,920  682,307
 8    890,641    1,219,406  798,007  582,415
 9    486,340    666,405    442,457
10    445,174    604,206
11    1,084,253

Table 4.5: Simulated run-off triangle with non-cumulative claim figures for the lognormal linear regression model.


Parameter Value Estimate Standard error

α1 12.8 12.7976 0.0018

α2 12.9 12.8968 0.0018

α3 13.6 13.5994 0.0018

α4 13.5 13.4957 0.0019

α5 13.4 13.3996 0.0019

α6 13.2 13.1997 0.0020

α7 13.8 13.7999 0.0021

α8 13.7 13.6983 0.0023

α9 13.1 13.0999 0.0025

α10 13.0 13.0035 0.0029

α11 13.9 13.8964 0.0039

β2 0.31 0.3109 0.0018

β3 −0.11 −0.1060 0.0018

β4 −0.42 −0.4198 0.0019

β5 −0.37 −0.3677 0.0020

β6 −0.87 −0.8717 0.0021

β7 −0.96 −0.9579 0.0022

β8 −1.33 −1.3267 0.0024

β9 −1.63 −1.6249 0.0027

β10 −1.92 −1.9100 0.0032

β11 −2.31 −2.3064 0.0043

σ 0.0004 0.0037

Table 4.6: Model specification, maximum likelihood estimates and stan-

dard errors for the run-off triangle in Table 4.5.

Figure 4.2 shows the cdf's of the upper and lower bounds, compared to
the empirical distribution based on 100 000 randomly generated, normally
distributed vectors $(Y_1, Y_2, \ldots, Y_{t-1})$ and $\vec{\varepsilon}$.
Since $S^l_{LL} \le_{cx} S_{LL} \le_{cx} S^c_{LL}$, the same ordering
holds for the tails of their respective distribution functions which can
be observed to cross only once. We see that the cdf of $S^l_{LL}$ is very
close to the distribution of $S_{LL}$. The “real” standard deviation
equals 1,617,912 whereas the standard deviation of the lower bound equals
1,590,233. A lower bound for the 95th percentile is given by 13,638,620.
The comonotonic upper bound $S^c_{LL}$ performs badly in this case. This
comes from the fact that in order to determine $S^l_{LL}$, we make use of
the (estimated values of the) correlations between the cells of the lower
triangle, whereas in the case of $S^c_{LL}$, the distribution is an upper
bound (in the sense of convex order) for any possible dependence
structure between the components of the vector $\vec{V}$. The standard
deviation of the upper bound is given by 1,890,298. The 95th percentile
of the upper bound now equals 14,207,619, which is of course much higher
than the 95th percentile of $S^l_{LL}$.

Table 4.7 summarizes the numerical values of the 95th percentiles of
the two bounds $S^l_{LL}$ and $S^c_{LL}$, together with their means and
standard deviations. This is also provided for the row totals

$$S_{LL,i} = \sum_{j=t+2-i}^{t} e^{(R\vec{\beta})_{ij} - Y(i+j-t-1) + \varepsilon_{ij}}, \qquad i = 2, \ldots, t. \tag{4.49}$$

We can conclude that the lower bound approximates the “real discounted
reserve” very well.

In order to have a better view on the behavior of the upper bound
$S^c_{LL}$ and of the lower bound $S^l_{LL}$ in the tails, we consider a
QQ-plot where the quantiles of $S^c_{LL}$ and of $S^l_{LL}$ are plotted
against the quantiles of $S_{LL}$. The upper bound $S^c_{LL}$ and the
lower bound $S^l_{LL}$ will be a good approximation for $S_{LL}$ if the
plotted points $(F^{-1}_{S_{LL}}(p), F^{-1}_{S^c_{LL}}(p))$, respectively
$(F^{-1}_{S_{LL}}(p), F^{-1}_{S^l_{LL}}(p))$, for all values of p in
(0, 1) do not deviate too much from the line y = x. From the QQ-plot in
Figure 4.3, we can conclude that the upper bound (slightly) overestimates
the tails of S, whereas the accuracy of the lower bound is extremely high
for the chosen set of parameter values. Table 4.8 confirms these
observations.

We remark that the improved upper bound $S^u_{LL}$ is very close to the
comonotonic upper bound $S^c_{LL}$. This could be expected because
$\rho_{ij}$ is close to $\rho_{kl}$ for any pair (ij, kl) with ij and kl
sufficiently close. This implies that for any such pair (ij, kl)

$$\left(F^{-1}_{e^{(R\vec{\beta})_{ij} - Y(i+j-t-1)} \mid \Lambda}(U),\; F^{-1}_{e^{(R\vec{\beta})_{kl} - Y(k+l-t-1)} \mid \Lambda}(U)\right)$$

is close to

$$\left(F^{-1}_{e^{(R\vec{\beta})_{ij} - Y(i+j-t-1)}}(U),\; F^{-1}_{e^{(R\vec{\beta})_{kl} - Y(k+l-t-1)}}(U)\right).$$

Since the improved upper bound requires more computational time, the
results for the improved upper bound are not displayed in this thesis.


[Plot: cumulative distribution (0.0 to 1.0) against the discounted IBNR reserve (6*10^6 to 1.8*10^7).]

Figure 4.2: The cdf’s of ‘$S_{LL}$’ (MC) (solid line), $S^l_{LL}$ (dotted line) and $S^c_{LL}$ (dashed line) for the run-off triangle in Table 4.5.

[Plot: QQ-plot, both axes from 8*10^6 to 1.6*10^7.]

Figure 4.3: QQ-plot of the quantiles of $S^l_{LL}$ (◦) and $S^c_{LL}$ versus those of ‘$S_{LL}$’ (MC).

        $S^l_{LL}$                          $S_{LL}$                            $S^c_{LL}$
year    95%         mean        st. dev.    95%         mean        st. dev.    95%         mean        st. dev.
2       41,913      36,694      3,043       43,742      36,690      4,072       43,796      36,694      4,096
3       210,781     178,522     18,580      215,958     178,510     21,334      218,463     178,522     22,805
4       339,371     280,596     33,568      344,231     280,570     36,069      350,678     280,596     39,738
5       487,782     396,861     51,644      492,575     396,817     53,804      503,873     396,861     60,357
6       609,034     491,311     66,663      614,094     491,252     68,525      630,052     491,311     77,971
7       1,515,794   1,206,735   174,414     1,526,990   1,206,571   177,891     1,570,251   1,206,735   203,422
8       1,976,955   1,574,772   226,804     1,986,766   1,574,556   230,635     2,053,898   1,574,772   267,668
9       1,392,268   1,095,585   166,894     1,403,295   1,095,420   169,890     1,449,017   1,095,585   196,744
10      1,641,355   1,287,052   199,051     1,657,107   1,286,851   203,005     1,713,161   1,287,052   236,658
11      5,423,367   4,267,416   649,616     5,473,462   4,266,762   662,975     5,674,518   4,267,416   781,003
total   13,638,620  10,815,543  1,590,233   13,718,215  10,814,002  1,617,912   14,207,619  10,815,543  1,890,298

Table 4.7: 95th percentiles, means and standard deviations of the distributions of $S^l_{LL}$ and $S^c_{LL}$ vs. ‘$S_{LL}$’ (MC).


p       $S^l_{LL}$   $S_{LL}$     $S^c_{LL}$
0.95    13,638,620   13,718,215   14,207,619
0.975   14,303,311   14,411,869   15,035,380
0.99    15,122,153   15,166,753   16,066,305
0.995   15,709,687   15,710,588   16,813,432
0.999   17,003,250   17,003,255   18,479,550


level p of SLL.

                     Distribution of bootstrapped          Simulated distribution
                     95th percentiles of $S^l_{LL}$        of $F^{-1}_{S_{LL}}(0.95)$
1st percentile       13,587,825                            13,578,331
2.5th percentile     13,589,852                            13,579,131
5th percentile       13,597,445                            13,585,813
10th percentile      13,616,522                            13,598,723
25th percentile      13,627,692                            13,619,389
50th percentile      13,637,841                            13,634,543
75th percentile      13,647,654                            13,651,195
90th percentile      13,661,140                            13,669,104
95th percentile      13,671,003                            13,678,393
97.5th percentile    13,678,085                            13,685,378
99th percentile      13,680,785                            13,688,379

Table 4.9: Percentiles of the bootstrapped 95th percentile of the distribu-

tion of the lower bound SBl(95) vs. the simulation.

Finally, for each bootstrap sample, we calculate the desired percentile
of the distribution of $S^l_{LL}$. This two-step procedure is repeated a
large number of times. The first column of Table 4.9 shows the results,
concerning the 95th percentile, for 5000 bootstrap samples applied to the
run-off triangle in Table 4.5. When compared with the simulated
distribution of $F^{-1}_{S_{LL}}(0.95)$ (obtained through 5000 simulated
triangles), we can conclude that the bootstrap distribution yields
appropriate confidence bounds.



Parameter  Value   Estimate  Standard error
α1         12.8    12.805    0.0073
α2         12.9    12.909    0.0074
α3         13.6    13.599    0.0077
α4         13.5    13.506    0.0076
α5         13.4    13.411    0.0082
α6         13.2    13.203    0.0076
α7         13.8    13.788    0.0091
α8         13.7    13.708    0.0081
α9         13.1    13.103    0.0101
α10        13.0    13.982    0.0102
α11        13.9    13.905    0.0131
β2         0.31    0.310     0.0068
β3         −0.11   −0.118    0.0080
β4         −0.42   −0.424    0.0079
β5         −0.37   −0.370    0.0088
β6         −0.87   −0.883    0.0079
β7         −0.96   −0.967    0.0093
β8         −1.33   −1.325    0.0108
β9         −1.63   −1.643    0.0097
β10        −1.92   −1.956    0.0225
β11        −2.31   −2.311    0.0150
σ          0.01    0.0093    0.0001



4.6.2 Loglinear location-scale models

Table 4.11 displays the simulated run-off triangle for the logistic regression

model with given parameters displayed in Table 4.4.

Fitting the logistic regression model with a chain-ladder type predictor

gives the parameter estimates and standard errors shown in Table 4.10.

      1          2          3        4        5        6        7        8        9       10      11
 1    362,573    487,703    327,399  247,297  248,321  151,494  137,722  98,983   70,587  50,118  36,110
 2    400,144    548,504    366,684  255,014  283,467  166,318  154,915  105,641  77,890  58,763
 3    819,562    1,109,572  665,960  520,160  566,065  330,429  302,985  216,361  156,159
 4    724,419    999,135    668,363  478,629  512,920  307,563  275,629  192,212
 5    675,791    893,821    597,618  434,052  442,722  276,007  262,520
 6    544,870    736,215    471,965  359,236  377,939  222,590
 7    990,881    1,341,576  850,040  639,613  658,638
 8    896,565    1,230,011  790,872  589,761
 9    482,297    674,219    437,692
10    432,302    595,206
11    1,093,549

Table 4.11: Simulated run-off triangle with non-cumulative claim figures for the logistic regression model.


We will compare the derived bounds with a time-consuming Monte Carlo
simulation based on 100 000 randomly generated, normally distributed
vectors $(Y_1, Y_2, \ldots, Y_{t-1})$ and $e^{\sigma\vec{\varepsilon}}$.
Using the following properties, the simulation of these last terms can be
done in any statistical software package.

• If $\varepsilon_{ij}$ is Gumbel distributed, then $e^{\sigma\varepsilon_{ij}}$ is Weibull distributed with shape parameter 1/σ and scale parameter equal to 1.

• If $\varepsilon_{ij}$ is generalized loggamma distributed with parameter k, then $e^{\sigma\varepsilon_{ij}}$ is generalized gamma distributed with parameters $\gamma = 1/(\sigma\sqrt{k})$ and $\alpha = k^{-\sigma\sqrt{k}}$. One can generate a random number from a generalized gamma distribution as follows:

1. Generate $G_k$ from the gamma distribution with shape parameter k and scale parameter 1.

2. Retain $\alpha (G_k)^{1/\gamma}$.

• If $\varepsilon_{ij}$ is log inverse Gaussian distributed, then $e^{\sigma\varepsilon_{ij}}$ is inverse Gaussian distributed with location parameter and scale parameter equal to 1/σ. Michael et al. (1976) describe an algorithm to generate a random number from an inverse Gaussian distribution with parameters α and β as follows:

1. Generate C from the $\chi^2(1)$ distribution.

2. Calculate $x_1 = \dfrac{\alpha}{\beta} + \dfrac{C}{2\beta} - \dfrac{1}{2\beta}\sqrt{4\alpha C + C^2}$, $x_2 = \dfrac{\alpha^2}{\beta^2 x_1}$ and $p_1 = \left(1 + \dfrac{\beta}{\alpha} x_1\right)^{-1}$.

3. Generate U ∼ Uniform(0, 1).

4. Retain $x_1$ if $U \le p_1$, else $x_2$.
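The inverse Gaussian generator of Michael et al. (1976) can be sketched as follows. We assume the parameterization in which the distribution has mean α/β and variance α/β², which is the one consistent with the formulas for $x_1$, $x_2$ and $p_1$; the smaller root $x_1$ is retained with probability $p_1$. The function name and the parameter values are hypothetical.

```python
import numpy as np

def rinvgauss(alpha, beta, size, rng):
    """Inverse Gaussian random numbers, assumed mean alpha/beta, variance alpha/beta**2."""
    c = rng.chisquare(1, size=size)                         # step 1
    x1 = (alpha / beta + c / (2.0 * beta)
          - np.sqrt(4.0 * alpha * c + c**2) / (2.0 * beta))  # smaller root
    x2 = alpha**2 / (beta**2 * x1)                          # larger root
    p1 = 1.0 / (1.0 + beta * x1 / alpha)
    u = rng.uniform(size=size)                              # step 3
    return np.where(u <= p1, x1, x2)                        # step 4

rng = np.random.default_rng(1976)
draws = rinvgauss(alpha=2.0, beta=1.0, size=200_000, rng=rng)
print(draws.mean(), draws.var())  # should be close to alpha/beta and alpha/beta**2
```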

In Figures 4.4 and 4.5 we compare the approximations (the convex upper
and lower bounds) for the distribution of the discounted loss reserve
$S_{LLS}$ to the empirical distribution function obtained by a Monte
Carlo (MC) simulation study. One can see that the upper bound $S^c_{LLS}$
gives a poor approximation. We observe that this upper bound has heavier
tails than the original distribution; the deviation for upper quantiles
reaches 25%. The main reason for that is a relatively weak dependence
between claims, for which the comonotonic approximation significantly
overestimates the tails, which is very clear both from the plot of cdf's
and from the QQ-plot. On the other hand the lower bound gives a much
better fit to the original distribution. These findings are confirmed in
Table 4.12 for some chosen quantiles.

p       $S^l_{LLS}$  $S_{LLS}$    $S^c_{LLS}$
0.95    13,517,204   13,524,010   14,125,203
0.975   14,175,492   14,165,083   14,950,838
0.99    14,988,558   15,009,978   15,979,224
0.995   15,573,369   15,483,938   16,724,584
0.999   16,865,068   16,623,928   18,386,959

level p of $S_{LLS}$.

Similar conclusions can be drawn from the study of the reserves for the
row totals given by

$$S_{LLS,i} = \sum_{j=t+2-i}^{t} e^{(R\vec{\beta})_{ij} + \sigma\varepsilon_{ij} - Y(i+j-t-1)}, \qquad i = 2, \ldots, t. \tag{4.50}$$

Table 4.13 summarizes the numerical values of the 95th percentiles of the
two bounds $S^l_{LLS}$ and $S^c_{LLS}$, together with their means and
standard deviations.

We end this illustration with a bootstrap study in order to incorporate
the estimation error involved. Starting from the run-off triangle in
Table 4.11 we bootstrap 5000 pseudo run-off triangles and calculate for
each bootstrap sample the 95th percentile of the distribution of
$S^l_{LLS}$. Table 4.14 displays the results of this study. One can
observe that, compared to the simulated distribution of
$F^{-1}_{S_{LLS}}(0.95)$, the bootstrap distribution performs very well.



[Plot: cumulative distribution (0.0 to 1.0), horizontal axis from 6*10^6 to 1.8*10^7.]

Figure 4.4: The cdf’s of ‘$S_{LLS}$’ (MC) (solid line), $S^l_{LLS}$ (dotted line) and $S^c_{LLS}$ (dashed line) for the run-off triangle in Table 4.11.

[Plot: QQ-plot, axes from 8*10^6 to 1.6*10^7.]

Figure 4.5: QQ-plot of the quantiles of $S^l_{LLS}$ (◦) and $S^c_{LLS}$ versus those of ‘$S_{LLS}$’ (MC).

        $S^l_{LLS}$                         $S_{LLS}$                           $S^c_{LLS}$
year    95%         mean        st. dev.    95%         mean        st. dev.    95%         mean        st. dev.
2       41,609      36,990      2,705       44,057      36,990      4,124       44,146      36,990      4,133
3       201,904     173,309     16,524      209,182     173,309     20,367      212,567     173,309     21,756
4       330,812     276,778     30,930      339,482     276,778     35,288      346,424     276,778     38,786
5       481,854     395,969     48,841      487,849     395,969     53,075      503,109     395,969     60,567
6       599,443     487,188     63,572      605,245     487,188     67,227      625,306     487,188     77,321
7       1,473,927   1,178,908   166,351     1,484,295   1,178,908   171,499     1,535,494   1,178,908   201,482
8       1,971,886   1,576,489   222,631     1,979,501   1,576,489   226,440     2,057,661   1,576,489   263,879
9       1,384,144   1,090,821   164,670     1,386,684   1,090,821   165,568     1,443,768   1,090,821   192,212
10      1,593,022   1,248,692   192,941     1,589,451   1,248,692   193,174     1,663,714   1,248,692   224,982
11      5,438,603   4,278,076   650,272     5,443,520   4,278,076   653,437     5,693,117   4,278,076   763,183
total   13,517,204  10,743,220  1,559,369   13,524,010  10,743,220  1,583,892   14,125,203  10,743,220  1,884,508

Table 4.13: 95th percentiles, means and standard deviations of the distributions of $S^l_{LLS}$ and $S^c_{LLS}$ vs. ‘$S_{LLS}$’ (MC).



95th percentiles of SlLLS of F−1

SLLS(0.95)






4.6.3 Generalized linear models

In this last illustration we model the incremental claims Yij with a loga-

rithmic link function to obtain a multiplicative parametric structure and

we link the expected value of the response to the chain-ladder type linear

predictor. Formally, this means that

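$$\begin{aligned}
E[Y_{ij}] &= \mu_{ij}, \\
\mathrm{Var}[Y_{ij}] &= \phi\,\mu_{ij}^{\kappa}, \\
\log(\mu_{ij}) &= \eta_{ij}, \\
\eta_{ij} &= \alpha_i + \beta_j.
\end{aligned} \tag{4.51}$$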

The choice of the error distribution is determined by κ.

More specifically, we consider model (4.51) with the Poisson error
distribution (κ = 1 and φ = 1). The simulated triangle for this model is depicted in

Table 4.15. Parameter estimates and standard errors for this fit are shown

in Table 4.16.

Since this model is a generalized linear model, standard statistical

software can be used to obtain maximum (quasi) likelihood parameter

estimates, fitted and predicted values. Standard statistical theory also

suggests goodness-of-fit measures and appropriate residual definitions for

diagnostic checks of the fitted model.
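To make the fitting step concrete, the following sketch estimates the parameters of the log-link Poisson model with a chain-ladder type predictor by iteratively reweighted least squares. It is a hand-written illustration rather than the software used for the results in this section; the small triangle size, the parameter values and the function names are hypothetical, and the Pearson-based dispersion estimate is one common choice that we assume here.

```python
import numpy as np

def chain_ladder_design(t):
    """Design matrix for eta_ij = alpha_i + beta_j with beta_1 = 0,
    one row per observed cell of the upper triangle."""
    rows = []
    for i in range(t):                  # year of origin (0-based)
        for j in range(t - i):          # development year (0-based)
            x = np.zeros(2 * t - 1)
            x[i] = 1.0                  # alpha_{i+1}
            if j > 0:
                x[t + j - 1] = 1.0      # beta_{j+1}
            rows.append(x)
    return np.array(rows)

def fit_poisson_loglink(X, y, n_iter=25):
    """Iteratively reweighted least squares for a log-link Poisson GLM."""
    coef = np.linalg.lstsq(X, np.log(np.maximum(y, 0.5)), rcond=None)[0]
    for _ in range(n_iter):
        eta = X @ coef
        mu = np.exp(eta)
        z = eta + (y - mu) / mu         # working response
        XtW = X.T * mu                  # working weights equal mu for Poisson
        coef = np.linalg.solve(XtW @ X, XtW @ z)
    return coef

t = 4
X = chain_ladder_design(t)
true_coef = np.array([5.0, 5.2, 4.9, 5.1, 0.3, -0.1, -0.4])  # hypothetical alphas, betas
rng = np.random.default_rng(7)
y = rng.poisson(np.exp(X @ true_coef)).astype(float)

est = fit_poisson_loglink(X, y)
mu_hat = np.exp(X @ est)
# Pearson-based estimate of the dispersion parameter (assumed estimator).
phi_hat = np.sum((y - mu_hat) ** 2 / mu_hat) / (len(y) - X.shape[1])
```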

      1          2          3        4        5        6        7        8        9       10      11
 1    362,505    493,876    323,065  237,574  249,850  152,221  139,293  95,961   70,812  53,395  35,902
 2    399,642    545,274    357,788  263,414  276,500  168,064  153,603  105,760  78,736  58,612
 3    805,843    1,100,020  722,110  531,220  557,195  337,606  309,306  213,416  158,611
 4    728,762    994,975    653,231  478,728  502,797  306,071  278,436  193,201
 5    661,713    899,778    591,647  434,626  456,763  276,588  253,297
 6    539,789    737,394    484,415  355,175  372,800  226,865
 7    983,897    1,341,585  881,786  647,431  679,264
 8    889,268    1,217,248  798,387  585,099
 9    487,823    666,590    437,987
10    442,982    601,706
11    1,087,672

Table 4.15: Simulated run-off triangle with non-cumulative claim figures for the Poisson regression model.



Parameter  Value   Estimate     Standard error
α1         12.8    12.7990566   0.0007918770
α2         12.9    12.8989406   0.0007631003
α3         13.6    13.6001742   0.0006060520
α4         13.5    13.4989356   0.0006283423
α5         13.4    13.4007436   0.0006556928
α6         13.2    13.1997559   0.0007180990
α7         13.8    13.7991616   0.0005991796
α8         13.7    13.6998329   0.0006464691
α9         13.1    13.0989431   0.0008707837
α10        13.0    12.9987252   0.0010370987
α11        13.9    13.8995502   0.0009710197
β2         0.31    0.3106789    0.0005310346
β3         −0.11   −0.1099061   0.0006026958
β4         −0.42   −0.4189677   0.0006804776
β5         −0.37   −0.3700452   0.0007168115
β6         −0.87   −0.8685181   0.0009462170
β7         −0.96   −0.9585385   0.0010542829
β8         −1.33   −1.3284870   0.0013825136
β9         −1.63   −1.6269622   0.0018947413
β10        −1.92   −1.9170757   0.0030880359
β11        −2.31   −2.3105083   0.0054029754
φ          1       1.025663



Figure 4.6 shows the distribution functions of the different bounds
compared to the empirical distribution obtained by Monte Carlo simulation
(MC). The distribution functions are remarkably close to each other and
enclose the simulated cdf nicely. This is confirmed by the QQ-plot in
Figure 4.7 where we also see that the comonotonic upper bound has
somewhat heavier tails. Numerical values of some high quantiles of
$S_{GLIM}$, $S^l_{GLIM}$ and $S^c_{GLIM}$ are given in Table 4.18.

Table 4.17 summarizes the numerical values of the 95th percentiles of
the two bounds $S^l_{GLIM}$ and $S^c_{GLIM}$ vs. $S_{GLIM}$, together
with their means and standard deviations. This is also provided for the
row totals

$$S_{GLIM,i} = \sum_{j=t+2-i}^{t} \mu_{ij}\, e^{-Y(i+j-t-1)}, \qquad i = 2, \ldots, t. \tag{4.52}$$



[Plot: cumulative distribution (0.0 to 1.0), horizontal axis from 10^7 to 2*10^7.]

Figure 4.6: The cdf’s of ‘$S_{GLIM}$’ (MC) (solid line), $S^l_{GLIM}$ (dotted line) and $S^c_{GLIM}$ (dashed line) for the run-off triangle in Table 4.15.

[Plot: QQ-plot, axes from 8*10^6 to 1.6*10^7.]

Figure 4.7: QQ-plot of the quantiles of $S^l_{GLIM}$ (◦) and $S^c_{GLIM}$ versus those of ‘$S_{GLIM}$’ (MC).

        $S^l_{GLIM}$                        $S_{GLIM}$                          $S^c_{GLIM}$
year    95%         mean        st. dev.    95%         mean        st. dev.    95%         mean        st. dev.
2       43,622      36,623      4,041       43,624      36,623      4,042       43,631      36,623      4,046
3       214,142     177,600     21,002      214,428     177,600     21,040      217,352     177,600     22,751
4       342,589     280,318     35,595      343,011     280,318     35,691      350,360     280,318     39,805
5       489,087     396,089     52,976      489,689     396,089     53,194      502,853     396,089     60,398
6       608,891     490,289     67,401      609,535     490,289     67,565      628,672     490,289     78,021
7       1,514,480   1,205,224   175,099     1,516,799   1,205,224   175,658     1,567,945   1,205,224   203,692
8       1,977,737   1,575,313   227,703     1,980,868   1,575,313   228,343     2,054,475   1,575,313   268,661
9       1,390,601   1,093,992   167,320     1,392,957   1,093,992   167,862     1,444,660   1,093,992   197,121
10      1,632,675   1,278,947   199,110     1,634,653   1,278,947   199,693     1,702,375   1,278,947   236,121
11      5,439,986   4,276,121   655,280     5,446,107   4,276,121   656,472     5,685,932   4,276,121   785,741
total   13,631,905  10,810,476  1,594,152   13,648,695  10,810,476  1,597,507   14,200,226  10,810,476  1,896,219

Table 4.17: 95th percentiles, means and standard deviations of the distributions of $S^l_{GLIM}$ and $S^c_{GLIM}$ vs. ‘$S_{GLIM}$’ (MC).


p       $S^l_{GLIM}$  $S_{GLIM}$   $S^c_{GLIM}$
0.95    13,631,905    13,648,695   14,200,226
0.975   14,296,448    14,305,657   15,027,414
0.99    15,115,189    15,122,840   16,057,613
0.995   15,702,702    15,709,497   16,804,206
0.999   16,996,374    17,018,860   18,469,110


level p of SGLIM .


95th percentiles of SlGLIM of F−1

SGLIM(0.95)






The bootstrap results in Table 4.19 are in line with the results of the

previous applications. We can conclude that in the discussed applications

the lower bound approximates the “real discounted reserve” very well. The

precision of the bounds only depends on the underlying variance of the sta-

tistical and financial part. As long as the yearly volatility does not exceed

ς = 35%, the financial part of the comonotonic approximation provides a

very accurate fit. These parameters are consistent with historical capital

market values as reported by Ibbotson Associates (2002). The underlying

variance of the statistical part depends on the estimated dispersion para-

meter and error distribution or mean-variance relationship. For example,

in case of the gamma distribution one obtains excellent results as long

as the dispersion parameter is smaller than 1. This is again in line with

the volatility structure in practical IBNR data sets. Since the parameters

in the paper for the statistical part of the bounds, obtained through the

quasi-likelihood approach, have small standard errors, it follows that re-

sults would be similar when simulating from a GLIM with the same linear

predictor, but for instance with another distribution type. In that sense

our findings are robust.

4.7 Conclusion

In this chapter, we considered the problem of deriving the distribution

function of the present value of a triangle of claim payments that are

discounted using some given stochastic return process. We started to model

the claim payments by means of a lognormal linear model which is also

included in the larger class of loglinear location-scale models. The use of

generalized linear models offers a great gain in modelling flexibility over the

simple lognormal model. The incremental claim amounts can for instance

be modelled as independent normal, Poisson, gamma or inverse Gaussian

response variables together with a logarithmic link function and a specified

linear predictor.

Because an explicit expression for the distribution function is hard to

obtain, we presented some approximations for this distribution function, in

the sense that these approximations are larger or smaller in convex order

sense than the exact distribution. When lower and upper bounds are close

to each other, together they can provide reliable information about the

original and more complex variable. An essential point in the derivation

of the presented convex lower bound approximations is the choice of the

conditioning random variable Λ.

When dealing with very large variances in the statistical and financial

part of our model, an adaptation of the random variable Λ will be necessary

or one can use other approximation techniques. This will be the topic of

the next chapter.

Chapter 5

Other approximation

techniques for sums of

dependent random variables

Summary In this chapter we derive some asymptotic results for the tail

distribution of sums of heavy tailed dependent random variables. We show

how to apply the obtained results to approximate certain functionals of

(the d.f. of) sums of dependent random variables. Our numerical results

demonstrate that the asymptotic approximations are typically close to the

Monte Carlo value. We will further briefly recall the mathematical tech-

niques behind the moment matching approximations and the Bayesian ap-

proach. Finally, we compare these approximations with the comonotonic

approximations of the previous chapter in the context of claims reserving.

5.1 Introduction

Many quantities of relevance in actuarial science concern functionals of

(the d.f. of) sums of dependent random variables. For example, one can

think of the Value-at-Risk of a stochastically discounted life annuity, or

the stop-loss premium for the aggregate claim amount of a number of in-

terrelated policies. Therefore, distribution functions of sums of dependent

random variables are of particular interest. Typically these distribution

functions are of a complex form. Consequently, in order to compute func-

tionals of sums of dependent random variables, approximation methods

are generally indispensable. Obviously, in many cases we could use Monte

Carlo simulation to obtain empirical distribution functions. However, this

is typically a time-consuming approach, in particular if we want to ap-

proximate tail probabilities, which would require an excessive number of

simulations. Therefore, alternative methods need to be explored.

Practitioners often use moment matching techniques to approximate

(the d.f. of) a sum of dependent lognormal random variables. In Section 2

we recall the lognormal and reciprocal gamma moment matching approach.

Both approximations are chosen such that their first two moments are equal

to the corresponding moments of the random variable of interest.

In Chapter 2 we discussed the concept of comonotonicity to obtain

bounds in convex order for sums of dependent random variables. Al-

though these bounds in convex order have proven to be good approxi-

mations in case the variance of the random sum is sufficiently small, they

perform much worse when the variance gets large. Section 3 establishes

some asymptotic results for the tail probability of a sum of dependent

random variables, in the presence of heavy-tailedness conditions.

Section 4 sketches, in very broad terms, basic elements of Bayesian

computation. We discuss two major obstacles to its popularity. The first

is how to specify prior distributions, and the second is how to evaluate

the integrals required for inference, given that for most models, these are

analytically intractable.

In the last section we compare the discussed approximations with the

comonotonic approximations of the previous chapter in the context of

claims reserving. In case the underlying variance of the statistical and

financial part of the discounted IBNR reserve gets large, the comono-

tonic approximations perform worse. We will illustrate this observation

by means of a simple example and propose to solve this problem using the

derived asymptotic results for the tail probability of a sum of dependent

random variables, in the presence of heavy-tailedness conditions. These

approximations are compared with the lognormal moment matching ap-

proximations. We finally consider the distribution of the discounted loss

reserve when the data in the run-off triangle is modelled by a generalized

linear model and compare the outcomes of the Bayesian approach with the

comonotonic approximations.

This chapter is based on Laeven, Goovaerts & Hoedemakers (2005),

Vanduffel, Hoedemakers & Dhaene (2004) and Antonio, Beirlant & Hoede-

makers (2005).


5.2 Moment matching approximations

Consider a sum S given by

$$ S = \sum_{i=1}^{n} \alpha_i e^{Z_i}. \tag{5.1} $$

Here, the αi are non-negative real numbers and (Z1, Z2, ..., Zn) is a multi-

variate normal distributed random vector.

The accumulated value at time n of a series of future deterministic

saving amounts αi can be written in the form (5.1), where Zi denotes the

random accumulation factor over the period [i, n]. Also the present value

of a series of future deterministic payments αi can be written in the form

(5.1), where now Zi denotes the random discount factor over the period

[0, i]. The valuation of Asian or basket options in a Black & Scholes model

and the setting of provisions and required capitals in an insurance context

boil down to the evaluation of risk measures related to the distribution

function of a random variable S as defined in (5.1).

The r.v. S defined in (5.1) will in general be a sum of non-independent

lognormal r.v.’s. Its distribution function cannot be determined analyti-

cally and is too cumbersome to work with. In the literature, a variety of

approximation techniques for this distribution function has been proposed.

Practitioners often use a moment matching lognormal approximation

for the distribution of S. The lognormal approximation is chosen such that

its first two moments are equal to the corresponding moments of S.

The present value of a continuous perpetuity with lognormal return

process has a reciprocal gamma distribution, see for instance Milevsky

(1997) and Dufresne (1990). This present value can be considered as the

limiting case of a random variable S as defined above. Motivated by this

observation, Milevsky & Posner (1998) and Milevsky & Robinson (2000)

propose a moment matching reciprocal gamma approximation for the d.f.

of S such that the first two moments match. They use this technique

for deriving closed form approximations for the price of Asian and basket

options.

5.2.1 Two well-known moment matching approximations

It belongs to the toolkit of any actuary to approximate the distribution

function of an unknown r.v. by a known distribution function in such a


way that the first moments are preserved. In this section we will briefly

describe the reciprocal gamma and the lognormal moment matching ap-

proximations. These two methods are frequently used to approximate the

distribution function of the r.v. S defined by (5.1).

The reciprocal gamma approximation

A r.v. X is said to be gamma distributed when its probability density

function is given by

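$$ f_X(x;\alpha,\beta) = \frac{1}{\beta^{\alpha}\,\Gamma(\alpha)}\, x^{\alpha-1} e^{-x/\beta}, \qquad x > 0, \tag{5.2} $$

where α > 0 is the shape parameter, β > 0 is the scale parameter and Γ(.) denotes the gamma function.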

Consider now the r.v. Y = 1/X. This r.v. is said to be reciprocal gamma

distributed. Its p.d.f. is given by

$$ f_Y(y;\alpha,\beta) = \frac{f_X(1/y;\alpha,\beta)}{y^{2}}, \qquad y > 0. \tag{5.3} $$

It is straightforward to prove that the quantiles of Y are given by

$$ F_Y^{-1}(p) = \frac{1}{F_X^{-1}(1-p;\alpha,\beta)}, \qquad p \in (0,1), \tag{5.4} $$

where FX(.;α, β) is the cdf of the gamma distribution with parameters

α and β. Since the inverse of the gamma distribution function is readily

available in many statistical software packages, quantiles can easily be

determined.

The first two moments of the reciprocal gamma distributed r.v. Y are

given by

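$$ E[Y] = \frac{1}{\beta(\alpha-1)}, \qquad \alpha > 1, \tag{5.5} $$

and

$$ E[Y^2] = \frac{1}{\beta^{2}(\alpha-1)(\alpha-2)}, \qquad \alpha > 2. \tag{5.6} $$

Expressing the parameters α and β in terms of $E[Y]$ and $E[Y^2]$ gives

$$ \alpha = \frac{2E[Y^2] - E[Y]^2}{E[Y^2] - E[Y]^2} \tag{5.7} $$

and

$$ \beta = \frac{E[Y^2] - E[Y]^2}{E[Y]\,E[Y^2]}. \tag{5.8} $$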


The d.f. of the r.v. defined in (5.1) is now approximated by a reciprocal

gamma distribution with first two moments (2.46) and (2.47). The coef-

ficients α and β of the reciprocal gamma approximation follow from (5.7)

and (5.8). The reciprocal gamma approximation for the quantile function

is then given by (5.4).
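The fitting step can be sketched in a few lines of Python. This is only an illustration, assuming that β acts as a scale parameter of the gamma law (which is what (5.5) and (5.6) imply); the function names are hypothetical and the two input moments stand in for the moments of S.

```python
def reciprocal_gamma_params(m1, m2):
    """Solve (5.7)-(5.8) for (alpha, beta), given m1 = E[Y] and m2 = E[Y^2]."""
    var = m2 - m1 ** 2
    if m1 <= 0.0 or var <= 0.0:
        raise ValueError("need m1 > 0 and m2 > m1**2")
    alpha = (2.0 * m2 - m1 ** 2) / var   # (5.7); always larger than 2
    beta = var / (m1 * m2)               # (5.8)
    return alpha, beta


def reciprocal_gamma_moments(alpha, beta):
    """First two moments (5.5)-(5.6) of Y = 1/X; requires alpha > 2."""
    m1 = 1.0 / (beta * (alpha - 1.0))
    m2 = 1.0 / (beta ** 2 * (alpha - 1.0) * (alpha - 2.0))
    return m1, m2


# Illustrative moments only (not taken from the thesis).
alpha, beta = reciprocal_gamma_params(1.0, 1.5)
print(alpha, beta, reciprocal_gamma_moments(alpha, beta))
```

Feeding the fitted pair back into (5.5) and (5.6) returns the input moments, which is a cheap check on an implementation. The quantile (5.4) then only needs an inverse gamma d.f. from a statistical package, which is not shown here.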

The reciprocal gamma moment matching method appears naturally in

case one wants to approximate the d.f. of stochastic present values. Indeed,

for the limiting case of the constant continuous perpetuity

$$ S_\infty = \int_0^{\infty} \exp\left[-\left(\mu - \frac{\sigma^2}{2}\right)\tau - \sigma B(\tau)\right] d\tau, \tag{5.9} $$

where $B(\tau)$ represents a standard Brownian motion and $\mu > \sigma^2/2$, the risk measures can be calculated very easily since Dufresne (1990) proved that $S_\infty^{-1}$ is gamma distributed with parameters $\frac{2\mu}{\sigma^2} - 1$ and $\frac{\sigma^2}{2}$. An elegant proof for this result can be found in Milevsky (1997).
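As a quick consistency check (a sketch, assuming the stronger condition µ > σ² so that the mean is finite, and reading the two gamma parameters as shape and scale), the mean implied by (5.5) agrees with the mean obtained by integrating the expected discount factor directly:

$$ E[S_\infty] = \frac{1}{\beta(\alpha-1)} = \frac{1}{\frac{\sigma^2}{2}\left(\frac{2\mu}{\sigma^2} - 2\right)} = \frac{1}{\mu - \sigma^2}, \qquad \alpha = \frac{2\mu}{\sigma^2} - 1, \quad \beta = \frac{\sigma^2}{2}, $$

$$ E[S_\infty] = \int_0^{\infty} e^{-\left(\mu - \frac{\sigma^2}{2}\right)\tau}\, E\left[e^{-\sigma B(\tau)}\right] d\tau = \int_0^{\infty} e^{-(\mu - \sigma^2)\tau}\, d\tau = \frac{1}{\mu - \sigma^2}, $$

where the second line uses $E[e^{-\sigma B(\tau)}] = e^{\sigma^2\tau/2}$.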

Expression (5.9) can be seen as a continuous counterpart of a discounted sum such as in (5.1). One expects that the present value of a finite discrete annuity with a normal logreturn process with independent periodic returns can be approximated by a reciprocal gamma distribution, provided the time period involved is long enough. This idea was put forward and explored in Milevsky & Posner (1998), Milevsky & Robinson (2000) and Huang et al. (2004).

The lognormal approximation

A r.v. X is said to be lognormally distributed if its p.d.f. is given by

$$ f_X(x;\mu,\sigma^2) = \frac{1}{x\sigma\sqrt{2\pi}} \exp\left(-\frac{(\log x - \mu)^2}{2\sigma^2}\right), \qquad x > 0, \tag{5.10} $$

where σ > 0.

The quantiles of X are given by

$$ F_X^{-1}(p) = e^{\mu + \sigma\Phi^{-1}(p)}, \qquad p \in (0,1). \tag{5.11} $$

The first two moments of X are given by

$$ E[X] = e^{\mu + \frac{1}{2}\sigma^2} \tag{5.12} $$

and

$$ E[X^2] = e^{2\mu + 2\sigma^2}. \tag{5.13} $$


Expressing the parameters µ and σ2 of the lognormal distribution in terms

of E[X] and E[X2] leads to

$$ \mu = \log\left(\frac{E[X]^2}{\sqrt{E[X^2]}}\right) \tag{5.14} $$

and

$$ \sigma^2 = \log\left(\frac{E[X^2]}{E[X]^2}\right). \tag{5.15} $$

The same procedure as the one explained in the previous subsection can

be followed in order to obtain a lognormal approximation for S, with the

first two moments matched. Dufresne (2002) obtains a lognormal limit dis-

tribution for S as volatility σ tends to zero and this provides a theoretical

justification for the use of the lognormal approximation.
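The lognormal moment matching of S can be sketched as follows. The sketch assumes the standard expressions for the first two moments of a sum of lognormals with multivariate normal exponents (means and a covariance matrix for the Z_i); the function names and the numerical inputs are illustrative only.

```python
import math
from statistics import NormalDist


def moments_of_sum(alphas, means, cov):
    """E[S] and E[S^2] for S = sum_i alpha_i * exp(Z_i), Z multivariate normal."""
    n = len(alphas)
    m1 = sum(alphas[i] * math.exp(means[i] + 0.5 * cov[i][i]) for i in range(n))
    m2 = 0.0
    for i in range(n):
        for j in range(n):
            var_sum = cov[i][i] + cov[j][j] + 2.0 * cov[i][j]  # Var[Z_i + Z_j]
            m2 += alphas[i] * alphas[j] * math.exp(means[i] + means[j] + 0.5 * var_sum)
    return m1, m2


def lognormal_params(m1, m2):
    """Parameters (mu, sigma^2) matching the first two moments, (5.14)-(5.15)."""
    mu = math.log(m1 ** 2 / math.sqrt(m2))
    sigma2 = math.log(m2 / m1 ** 2)
    return mu, sigma2


def lognormal_quantile(mu, sigma2, p):
    """Quantile (5.11) of the matched lognormal distribution."""
    return math.exp(mu + math.sqrt(sigma2) * NormalDist().inv_cdf(p))


# Illustrative inputs: two unit payments with correlated normal exponents.
m1, m2 = moments_of_sum([1.0, 1.0], [-0.05, -0.10], [[0.01, 0.01], [0.01, 0.02]])
mu, sigma2 = lognormal_params(m1, m2)
print(lognormal_quantile(mu, sigma2, 0.95))
```

For a single term the sum is exactly lognormal, so the matched parameters coincide with the mean and variance of Z_1; this makes a convenient sanity check.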

5.2.2 Application: discounted loss reserves

We calculate the lognormal moment matching approximations for the ap-

plication considered in Section 2.4 and compare the results with the convex

lower bound. The results are given below.

We use the notation $SM_p[V^{l}]$ and $SM_p[V^{LN}]$ to denote the security margin for confidence level p approximated by the lower bound and by the lognormal moment matching technique respectively. The different tables display the Monte Carlo simulation result (MC) for the security margin, as well as the percentage deviations of the different approximation methods, relative to the Monte Carlo result. These percentage deviations are defined as follows:

$$ LB := \frac{SM_p[V^{l}] - SM_p[V^{MC}]}{SM_p[V^{MC}]} \times 100\%, $$

$$ LN := \frac{SM_p[V^{LN}] - SM_p[V^{MC}]}{SM_p[V^{MC}]} \times 100\%, $$

where $V^{l}$ and $V^{LN}$ correspond to the lower bound approach and the lognormal moment matching approach, and $V^{MC}$ denotes the Monte Carlo simulation result. The figures displayed in bold in the tables correspond to the best approximations, that is, the ones with the smallest percentage deviation compared to the Monte Carlo results.

σM :               0.05      0.15      0.25      0.35
LB               −0.25%    −0.09%    −0.12%    −0.00%
LN               −1.66%    +1.28%    +4.09%    +7.52%
MC               0.0853    0.1090    0.1309    0.1370
(s.e. × 10^7)    (1.11)    (2.47)    (6.15)    (8.18)

different market volatilities and ωL = 0.1 and ωA = 0.05.

p :               0.995     0.975      0.95      0.90      0.80      0.70
LB               −0.38%    −0.21%    −0.16%    −0.08%    −0.00%    −0.00%
LN               −4.30%    −2.96%    −2.29%    −1.43%    −0.11%    +1.74%
MC               1.0348    0.6927    0.5421    0.3859    0.2192    0.1124
(s.e. × 10^5)    (2.49)    (0.46)    (0.26)    (0.10)    (0.06)    (0.04)

Table 5.2: (ex. 1) Approximations for some selected confidence levels of SMp[V ]. The market volatility is set equal to 20%. (ωL = 0.05 and ωA = 0)

σM :               0.05      0.10      0.15      0.20      0.25      0.30      0.35
LB               −0.19%    −0.15%    −0.23%    −0.16%    −0.11%    −0.17%    −0.38%
LN               −4.94%    −3.92%    −3.17%    −2.49%    −1.95%    −1.56%    −1.30%
MC               0.4390    0.5250    0.6528    0.8103    0.9924    1.1970    1.4232
s.e. (× 10^5)    (0.15)    (0.29)    (0.41)    (0.69)    (1.22)    (3.78)    (4.16)

different market volatilities.

p :               0.995     0.975      0.95      0.90      0.80      0.70
LB               −0.93%    −0.04%    −0.02%    −0.18%    −0.03%     −0.6%
LN               −3.94%    +3.78%    +7.22%   +11.29%   +19.68%   +53.46%
MC               4.4521    2.2264    1.4998    0.8814    0.3508    0.0761
s.e. (× 10^5)   (37.63)    (2.99)    (7.44)    (2.79)    (0.78)    (0.27)

Table 5.4: (ex. 2) Approximations for some selected confidence levels of SMp[V ]. The market volatility is set equal to 25%.

Overall the comonotonic lower bound approach provides a very accurate fit under different parameter assumptions. These assumptions are in line with the realistic market values. Moreover the comonotonic approximations have the advantage that they are easily computable for any risk measure that is additive for comonotonic risks, such as Value-at-Risk and Tail Value-at-Risk. We believe the comonotonic approach is preferable to any moment matching approximation, because it is more stable and accurate across all levels of volatility.

5.3 Asymptotic approximations

In actuarial applications it is often merely the tail of the distribution func-

tion that is of interest. Indeed, one may think of Value-at-Risk, Conditional

Tail Expectation or Expected Shortfall estimations. Therefore, approximations for functionals of (the d.f. of) sums of dependent random variables

may alternatively be obtained through the use of asymptotic relations. Al-

though asymptotic results are valid at infinity, they may as well serve as

approximations near infinity.

This section establishes some asymptotic results for the tail probabilities related to a sum of heavy-tailed dependent random variables.

In particular, we establish an asymptotic result for the randomly weighted

sum of a sequence of non-negative numbers. Furthermore, we establish un-

der two different sets of conditions, an asymptotic result for the randomly

weighted sum of a sequence of independent random variables that consist

of a random and a deterministic component. Throughout, the random

weights are products of i.i.d. random variables and thus exhibit an explicit

dependence structure. Next, we present an application that demonstrates

how the derived asymptotic results can be employed to approximate certain functionals of (the d.f. of) sums of dependent random variables. To

explore the quality of the asymptotic approximations, we also provide a

numerical illustration that compares the asymptotic approximation values

to Monte Carlo simulated values.

5.3.1 Preliminaries for heavy-tailed distributions

First we introduce some notational conventions. For a random variable X with a distribution function F, we denote its tail probability by $\overline{F}(x) = 1 - F(x) = \Pr[X > x]$. For two independent r.v.’s X and Y with d.f.’s F and G supported on (−∞,+∞), we denote by $F * G(x) = \int_{-\infty}^{+\infty} F(x - t)\, G(dt)$, −∞ < x < +∞, the convolution of F and G. We denote by $F^{*n} = F * \cdots * F$ the n-fold convolution of F, and by $F \otimes G$ the d.f. of XY.


Throughout, unless otherwise stated, all limit relations are for $x \to +\infty$. Let $a(x) \ge 0$ and $b(x) > 0$ be two functions satisfying

$$ l^{-} \le \liminf_{x\to+\infty} \frac{a(x)}{b(x)} \le \limsup_{x\to+\infty} \frac{a(x)}{b(x)} \le l^{+}. $$

We write $a(x) = O(b(x))$ if $l^{+} < +\infty$, $a(x) = o(b(x))$ if $l^{+} = 0$ and $a(x) \asymp b(x)$ if both $l^{+} < +\infty$ and $l^{-} > 0$. We write $a(x) \lesssim b(x)$ if $l^{+} = 1$, $a(x) \gtrsim b(x)$ if $l^{-} = 1$ and $a(x) \sim b(x)$ if both $l^{+} = 1$ and $l^{-} = 1$. We say that a(x) and b(x) are weakly equivalent if $a(x) \asymp b(x)$, and say that a(x) and b(x) are (strongly) equivalent if $a(x) \sim b(x)$.

A r.v. X or its d.f. F is said to be heavy-tailed if $E[e^{\gamma X}] = +\infty$ for any γ > 0. Below we introduce some important classes of heavy-tailed distributions. A d.f. F supported on (0,+∞) belongs to the subexponential class $\mathcal{S}$ if

$$ \lim_{x\to+\infty} \overline{F^{*n}}(x) / \overline{F}(x) = n \tag{5.16} $$

for any (or equivalently, for some) n ≥ 2. More generally, a d.f. F supported on (−∞,+∞) belongs to the class $\mathcal{S}$ if $F^{+}(x) = F(x) I_{(x>0)}$ does. A d.f. F supported on (−∞,+∞) belongs to the long-tailed class $\mathcal{L}$ if for any real number y (or equivalently, for y = 1) we have that

$$ \lim_{x\to+\infty} \overline{F}(x+y) / \overline{F}(x) = 1. \tag{5.17} $$

A class of heavy-tailed distributions that is closely related to the classes $\mathcal{S}$ and $\mathcal{L}$ is the class $\mathcal{D}$ of d.f.’s with dominatedly varying tails. A d.f. F supported on (−∞,+∞) belongs to the class $\mathcal{D}$ if its tail $\overline{F}$ is of dominated variation in the sense that

$$ \limsup_{x\to+\infty} \frac{\overline{F}(xy)}{\overline{F}(x)} < +\infty \tag{5.18} $$

for any 0 < y < 1 (or equivalently for some 0 < y < 1). It is well-known that

$$ \mathcal{D} \cap \mathcal{L} \subset \mathcal{S} \subset \mathcal{L}. $$

See e.g. Embrechts et al. (1997). We remark that the intersection $\mathcal{D} \cap \mathcal{L}$ contains many useful heavy-tailed distributions. In particular, the intersection $\mathcal{D} \cap \mathcal{L}$ covers the class $\mathcal{R}$, which consists of all d.f.’s with regularly varying tails. A d.f. F supported on (−∞,+∞) has a regularly varying tail if there is some α > 0 such that the relation

$$ \lim_{x\to+\infty} \frac{\overline{F}(xy)}{\overline{F}(x)} = y^{-\alpha} $$

holds true for any y > 0. We denote $F \in \mathcal{R}_{-\alpha}$.

In addition to the classes of heavy-tailed distributions introduced above, we introduce the class $\mathcal{R}_{-\infty}$ of d.f.’s with rapidly varying tails, containing both heavy-tailed and light-tailed distributions. For a d.f. F supported on (−∞,+∞) satisfying $\overline{F}(x) > 0$ for any x > 0, F belongs to the class $\mathcal{R}_{-\infty}$ if

$$ \lim_{x\to+\infty} \frac{\overline{F}(xy)}{\overline{F}(x)} = \begin{cases} 0, & \text{for any } y > 1; \\ +\infty, & \text{for any } 0 < y < 1. \end{cases} \tag{5.19} $$

We remark that the intersection $\mathcal{S} \cap \mathcal{R}_{-\infty}$ contains e.g. lognormal distributions and certain Weibull distributions, which are prominent distributions in actuarial applications.
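The difference between a regularly varying and a rapidly varying tail can be made concrete numerically. The sketch below, with illustrative parameter values and hypothetical function names, evaluates the tail ratio at y = 2 for a Pareto tail (where it stays at y^(−α) for every x) and for a lognormal tail (where it drifts towards 0 as x grows).

```python
import math


def pareto_tail(x, alpha=1.5, beta=1.0):
    """Pareto tail (x / beta)^(-alpha) for x >= beta."""
    return (x / beta) ** (-alpha) if x >= beta else 1.0


def lognormal_tail(x, mu=0.0, sigma=1.0):
    """Lognormal tail, written with erfc so that it stays accurate far out."""
    z = (math.log(x) - mu) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def tail_ratio(tail, x, y):
    """Ratio tail(x * y) / tail(x) appearing in the class definitions."""
    return tail(x * y) / tail(x)


for x in (1e2, 1e4, 1e6):
    print(x, tail_ratio(pareto_tail, x, 2.0), tail_ratio(lognormal_tail, x, 2.0))
```

A finite grid of x values can of course only suggest the limiting behaviour; it does not prove class membership.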

For an elaboration on the classes of heavy-tailed distributions and the

class of rapidly varying tailed distributions, and their applications in in-

surance and finance, the interested reader is referred to Bingham et al.

(1987), Embrechts et al. (1997) and Beirlant et al. (2004).

In Table 5.5 we list some well-known distributions and their corres-

ponding distribution class.

5.3.2 Asymptotic results

In this subsection, we derive some asymptotic results for the tail proba-

bility of sums of dependent r.v.’s, in the presence of heavy-tailedness. In

the following, we let {Xn, n = 1, 2, . . .} and {Yn, n = 1, 2, . . .} denote two

sequences of i.i.d. r.v.’s that are mutually independent. We write by FX

the d.f. of a r.v. X of which Xn, n = 1, 2, . . ., are considered to be inde-

pendent replicates, and assume it is supported on (−∞,+∞). Similarly,

we write by FY the d.f. of a r.v. Y of which Yn, n = 1, 2, . . ., are considered

to be independent replicates, and assume it is supported on (0,+∞). For

notational convenience, we will use the device of independent replicates

throughout.


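Name                 d.f. F or density f                                                                              Parameters                Class

Lognormal            $f(x) = \frac{1}{\sqrt{2\pi}\,\sigma x}\, e^{-\frac{1}{2}\left(\frac{\log x - \mu}{\sigma}\right)^2}$      (µ ∈ R, σ > 0)            $\mathcal{R}_{-\infty} \cap \mathcal{S}$
Weibull              $F(x) = 1 - e^{-c x^{\beta}}$                                                                    (c > 0, 0 < β < 1)        $\mathcal{R}_{-\infty} \cap \mathcal{S}$
Benktander-I         $F(x) = 1 - c\, x^{-\alpha-1} e^{-\beta(\log x)^2} (\alpha + 2\beta \log x)$                      (c, α, β > 0)             $\mathcal{R}_{-\infty} \cap \mathcal{S}$
Benktander-II        $F(x) = 1 - c\,\alpha\, x^{-(1-\beta)} \exp\{-(\alpha/\beta) x^{\beta}\}$                         (c, α > 0, 0 < β < 1)     $\mathcal{R}_{-\infty} \cap \mathcal{S}$
Pareto               $F(x) = 1 - \left(\frac{x}{\beta}\right)^{-\alpha}$                                              (α, β > 0)                $\mathcal{R}$
Burr                 $F(x) = 1 - \left(1 + \frac{x^{\tau}}{\beta}\right)^{-\alpha}$                                   (α, β, τ > 0)             $\mathcal{R}$
Loggamma             $f(x) = \frac{\beta^{\alpha}}{\Gamma(\alpha)\, x} (\log x)^{\alpha-1} x^{-\beta}$                 (α, β > 0)                $\mathcal{R}$
Transformed β        $f(x) = \frac{|a|}{B(p,q)}\, x^{ap-1} (1 + x^{a})^{-(p+q)}$                                       (a ∈ R, p, q > 0)         $\mathcal{R}$
Truncated α-stable   $F(x) = \Pr[|X| \le x]$, X ∼ α-stable                                                            (1 < α < 2)               $\mathcal{R}$

Table 5.5: Some well-known distributions and their distribution class.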


We state the following theorem:

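Theorem 12.

Let $Z_i = Y_1 Y_2 \cdots Y_i$ and $0 < a_i < +\infty$, i = 1, 2, . . .. If $F_Y \in \mathcal{S} \cap \mathcal{R}_{-\infty}$, then it holds for each n = 1, 2, . . . and x → +∞ that

$$ \Pr\left[\sum_{i=1}^{n} a_i Z_i > x\right] \sim \sum_{i=1}^{n} \Pr\left[a_i Z_i > x\right]. \tag{5.20} $$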

Proof. See section 5.6.
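For lognormal Y the right-hand side of (5.20) is available in closed form, since the product of i independent logN(µ, σ²) factors is logN(iµ, iσ²). The sketch below evaluates it; the function names and the parameter values are illustrative assumptions, and the result is only meaningful as an approximation for large x.

```python
import math


def normal_tail(z):
    """Standard normal tail probability Pr[N > z], via erfc for accuracy."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def asymptotic_tail(x, a, mu, sigma):
    """Right-hand side of (5.20) when Y ~ logN(mu, sigma^2).

    Z_i = Y_1 * ... * Y_i is logN(i * mu, i * sigma^2), hence
    Pr[a_i Z_i > x] = Pr[N > (log(x / a_i) - i * mu) / (sigma * sqrt(i))].
    """
    total = 0.0
    for i, a_i in enumerate(a, start=1):
        z = (math.log(x / a_i) - i * mu) / (sigma * math.sqrt(i))
        total += normal_tail(z)
    return total


# Illustrative payments and discounting parameters.
print(asymptotic_tail(3.5, [1.0, 1.0, 1.0], mu=-0.04, sigma=0.10))
```

For n = 1 the expression is exact rather than asymptotic, which gives a simple way to validate the helper.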

In an actuarial context the sequence {ai, i = 1, 2, . . .} can be regarded as a

sequence of deterministic payments. The following theorem applies to the

case in which the payments consist of both a deterministic and a random

component, and the deterministic component is either an additive or a

multiplicative constant. The theorem is an extension of Theorems 5.1 and

5.2 of Tang & Tsitsiashvili (2003):

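Theorem 13.

Let $Z_i = Y_1 Y_2 \cdots Y_i$ and $0 < a_i < +\infty$, i = 1, 2, . . .. If the following conditions are valid:

1. $F_X \in \mathcal{D} \cap \mathcal{L}$,

2. $F_Y \in \mathcal{R}_{-\infty}$,

then it holds for each n = 1, 2, . . . and x → +∞ that

$$ \Pr\left[\sum_{i=1}^{n} (a_i + X_i) Z_i > x\right] \sim \sum_{i=1}^{n} \Pr\left[(a_i + X) Z_i > x\right]. \tag{5.21} $$

Furthermore, it holds for each n = 1, 2, . . . and x → +∞ that

$$ \Pr\left[\sum_{i=1}^{n} (a_i X_i) Z_i > x\right] \sim \sum_{i=1}^{n} \Pr\left[(a_i X) Z_i > x\right]. \tag{5.22} $$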



Corollary 3.

Under the conditions stated in Theorem 13, we have for each n = 1, 2, . . . and x → +∞ that

$$ \Pr\left[\sum_{i=1}^{n} (a_i + X_i) Z_i > x\right] - \Pr\left[\sum_{i=1}^{n-1} (a_i + X_i) Z_i > x\right] \sim \Pr\left[(a_n + X) Z_n > x\right]. \tag{5.23} $$

Furthermore, it holds for each n = 1, 2, . . . and x → +∞ that

$$ \Pr\left[\sum_{i=1}^{n} (a_i X_i) Z_i > x\right] - \Pr\left[\sum_{i=1}^{n-1} (a_i X_i) Z_i > x\right] \sim \Pr\left[a_n X Z_n > x\right]. \tag{5.24} $$


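Corollary 4.

If condition 1. stated in Theorem 13 is replaced by “$F_X \in \mathcal{R}_{-\alpha}$”, while the other conditions remain the same, then it holds for each n = 1, 2, . . . and x → +∞ that

$$ \Pr\left[\sum_{i=1}^{n} (a_i + X_i) Z_i > x\right] \sim \sum_{i=1}^{n} \overline{F}_X(x - a_i) \left(E[Y^{\alpha}]\right)^{i} \tag{5.25} $$

and

$$ \Pr\left[\sum_{i=1}^{n} (a_i X_i) Z_i > x\right] \sim \overline{F}_X(x) \sum_{i=1}^{n} a_i^{\alpha} \left(E[Y^{\alpha}]\right)^{i}. \tag{5.26} $$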


We remark that the particular case of lognormally distributed payments is

not covered by Theorem 13, since the lognormal distribution does not be-

long to the intersection D∩L. The lognormal distribution has a moderately

heavy tail and has been a popular model for loss severity distributions.

Hence, we state the following theorem:

Theorem 14.

Relations (5.21), (5.22), (5.23) and (5.24) remain valid if conditions 1. and 2. stated in Theorem 13 are replaced by

1’ $X \sim \log N(\mu_X, \sigma_X^2)$, $-\infty < \mu_X < +\infty$ and $\sigma_X > 0$,

2’ $Y \sim \log N(\mu_Y, \sigma_Y^2)$, $-\infty < \mu_Y < +\infty$ and $\sigma_Y > 0$,

3’ $\sigma_X > \sigma_Y$.


5.3.3 Application: discounted loss reserves

In this subsection, we consider the problem of determining stop-loss premiums and quantiles for discounted loss reserves. We denote by the r.v. $X_i$ from the i.i.d. sequence $\{X_i, i = 1, \ldots, n\}$ the net loss in year i. Furthermore, the positive r.v. $Y_i$ from the i.i.d. sequence $\{Y_i, i = 1, \ldots, n\}$ represents the present value discounting factor from year i to year i − 1. The two sequences $\{X_i, i = 1, \ldots, n\}$ and $\{Y_i, i = 1, \ldots, n\}$ are considered to be mutually independent. Then the discounted loss reserve S is given by

$$ S = \sum_{i=1}^{n} X_i \prod_{j=1}^{i} Y_j. \tag{5.27} $$

Henceforth, we impose that $E[S\, I_{(S>0)}] < +\infty$, which is implied by the condition that $E[X\, I_{(X>0)}] < +\infty$ and $E[Y] < +\infty$.

Approximate values for the stop-loss premium and quantiles of the discounted loss reserve S may be obtained by using the previously obtained asymptotic results. In particular, if X and Y satisfy the corresponding conditions under which Theorem 13 or Theorem 14 holds, then for sufficiently large values of the retention d, the stop-loss premium can be approximated by

$$ \pi(S, d) \approx \sum_{i=1}^{n} \int_{d}^{+\infty} \overline{F}_{X \prod_{j=1}^{i} Y_j}(s)\, ds = \sum_{i=1}^{n} \pi\Big(X \prod_{j=1}^{i} Y_j,\, d\Big). \tag{5.28} $$

Since the d.f. of $X \prod_{j=1}^{i} Y_j$ will generally not be analytically tractable, Monte Carlo simulation may still be required. However, the number of simulations has been reduced considerably.
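One case in which no simulation is needed is the lognormal setting of Theorem 14: a product of independent lognormal r.v.’s is again lognormal, so each term in (5.28) follows from the standard stop-loss formula for a lognormal risk. The sketch below illustrates this under those assumptions; the function names and parameter values are hypothetical.

```python
import math


def std_normal_cdf(z):
    """Standard normal d.f., written with erfc to keep tail accuracy."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def lognormal_stop_loss(m, s, d):
    """E[(L - d)_+] for L ~ logN(m, s^2) and retention d > 0."""
    d1 = (m + s * s - math.log(d)) / s
    d2 = d1 - s
    return math.exp(m + 0.5 * s * s) * std_normal_cdf(d1) - d * std_normal_cdf(d2)


def asymptotic_stop_loss(n, mu_x, sigma_x, mu_y, sigma_y, d):
    """Approximation (5.28) when X and Y are lognormal (Theorem 14)."""
    if sigma_x <= sigma_y:
        raise ValueError("condition 3' requires sigma_x > sigma_y")
    total = 0.0
    for i in range(1, n + 1):
        # X * Y_1 * ... * Y_i ~ logN(mu_x + i*mu_y, sigma_x^2 + i*sigma_y^2)
        m = mu_x + i * mu_y
        s = math.sqrt(sigma_x ** 2 + i * sigma_y ** 2)
        total += lognormal_stop_loss(m, s, d)
    return total


# Illustrative parameters only.
print(asymptotic_stop_loss(3, 0.0, 1.0, -0.04, 0.10, d=20.0))
```

As with every relation in this section, the sum is an approximation for large retentions d and can be poor for moderate ones.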

In case $F_X \in \mathcal{R}_{-\alpha}$, 0 < α < +∞, and $F_Y \in \mathcal{R}_{-\infty}$, the asymptotic approximations for the stop-loss premium of S reduce to

$$ \pi(S, d) \approx \int_{d}^{+\infty} \sum_{i=1}^{n} \left(E[Y^{\alpha}]\right)^{i} \overline{F}_X(s)\, ds = \sum_{i=1}^{n} \left(E[Y^{\alpha}]\right)^{i} \pi(X, d). \tag{5.29} $$

Furthermore, in this case we have for sufficiently large values of p, that the asymptotic approximation for the p-quantile is given by

$$ F_S^{-1}(p) \approx \inf\left\{ s : \sum_{i=1}^{n} \left(E[Y^{\alpha}]\right)^{i} \overline{F}_X(s) \le 1 - p \right\}. \tag{5.30} $$

Under the conditions of Theorem 14, we have for sufficiently large values of p, that the asymptotic approximation for the p-quantile is given by

$$ F_S^{-1}(p) \approx \inf\left\{ s : \sum_{i=1}^{n} \overline{F}_{X \prod_{j=1}^{i} Y_j}(s) \le 1 - p \right\}. \tag{5.31} $$

We emphasize that the approximation (5.31) is not in general valid under the conditions of Theorem 13; it requires the additional condition that $F_X \in \mathcal{R}_{-\alpha}$, 0 < α < ∞.

As an example, we consider $X_i \sim GPD(\alpha, \beta)$ and $Y_i \sim \log N(\mu, \sigma^2)$, i = 1, . . . , n, in which GPD(α, β) denotes the generalized Pareto distribution with d.f.

$$ F_X(x) = 1 - \left(1 + \frac{x}{\beta}\right)^{-\alpha}, \qquad x > 0, $$

where α > 0 and β > 0. Then, clearly we have that $F_X \in \mathcal{R}_{-\alpha}$ and $F_Y \in \mathcal{R}_{-\infty}$. Hence, the asymptotic approximations (5.29) and (5.30) are valid. Notice that for the example considered, the asymptotic approximations can even be computed analytically. We performed 5 000 000 Monte Carlo (MC) simulations for quantiles and stop-loss premiums to assess the quality of the asymptotic approximations (5.29) and (5.30), under various specifications of the parameter n. We fix the parameter values: α = 1.5, β = 1, µ = −0.04 and σ = 0.10. The results are presented in Table 5.6. Ndiff. refers to the normalized difference defined as $\frac{MC - Appr.}{MC} \times 100\%$. Our numerical results demonstrate that the asymptotic approximations are typically close to the Monte Carlo value.
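The analytic evaluation mentioned above can be sketched as follows. The sketch uses two standard closed forms as assumptions, the lognormal moment E[Y^α] = exp(αµ + α²σ²/2) and the generalized Pareto stop-loss premium for α > 1; the function names are hypothetical, and the default parameter values are the ones fixed in the text.

```python
import math


def discount_moment_sum(n, alpha, mu, sigma):
    """sum_{i=1}^{n} (E[Y^alpha])^i for Y ~ logN(mu, sigma^2)."""
    r = math.exp(alpha * mu + 0.5 * (alpha * sigma) ** 2)  # E[Y^alpha]
    return sum(r ** i for i in range(1, n + 1))


def gpd_stop_loss(d, alpha, beta):
    """pi(X, d): integral of (1 + s/beta)^(-alpha) over s > d, for alpha > 1."""
    return beta / (alpha - 1.0) * (1.0 + d / beta) ** (1.0 - alpha)


def stop_loss_approx(n, d, alpha=1.5, beta=1.0, mu=-0.04, sigma=0.10):
    """Asymptotic stop-loss premium (5.29)."""
    return discount_moment_sum(n, alpha, mu, sigma) * gpd_stop_loss(d, alpha, beta)


def quantile_approx(n, p, alpha=1.5, beta=1.0, mu=-0.04, sigma=0.10):
    """Asymptotic p-quantile (5.30): solve c * (1 + s/beta)^(-alpha) = 1 - p."""
    c = discount_moment_sum(n, alpha, mu, sigma)
    return max(0.0, beta * ((c / (1.0 - p)) ** (1.0 / alpha) - 1.0))


print(round(stop_loss_approx(3, 15.0), 2))   # compare with Appr. for n = 3, d = 15
print(round(quantile_approx(3, 0.99)))       # compare with Appr. for n = 3, p = 0.99
```

For n = 3 the sketch gives about 1.36 at d = 15 and about 41 at p = 0.99, in line with the corresponding Appr. entries of Table 5.6.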


n = 3
        Stop-loss premiums                    Quantiles
     d      MC    Appr.   Ndiff.          p       MC    Appr.   Ndiff.
    15    1.50    1.36      9%         0.95      16      14     15%
    20    1.28    1.19      7%         0.975     25      22     11%
    25    1.14    1.07      6%         0.99      44      41      7%
    30    1.03    0.98      5%         0.995     69      66      4%
    35    0.95    0.91      4%         0.999    198     194      2%
    40    0.88    0.85      4%
    50    0.78    0.76      3%
    60    0.71    0.70      2%
    80    0.61    0.61      1%
   100    0.55    0.54      1%
   150    0.44    0.44      0%
   200    0.38    0.38      0%

n = 5
        Stop-loss premiums                    Quantiles
     d      MC    Appr.   Ndiff.          p       MC    Appr.   Ndiff.
    20    2.22    1.89     15%         0.95      24      19     22%
    30    1.75    1.56     11%         0.975     36      30     17%
    40    1.48    1.35      9%         0.99      63      57     10%
    60    1.18    1.11      6%         0.995     96      90      6%
    80    1.01    0.96      5%         0.999    274     265      3%
   100    0.90    0.86      4%
   150    0.72    0.70      3%
   200    0.62    0.61      2%
   250    0.56    0.55      2%
   300    0.51    0.50      2%

n = 10
        Stop-loss premiums                    Quantiles
     d      MC    Appr.   Ndiff.          p       MC    Appr.   Ndiff.
    40    2.91    2.41     17%         0.95      40      28     30%
    60    2.22    1.98     11%         0.975     58      45     23%
    80    1.86    1.72      7%         0.99      98      84     14%
   100    1.62    1.54      5%         0.995    148     133     10%
   150    1.28    1.26      2%         0.999    402     390      3%
   200    1.09    1.09      0%
   300    0.87    0.88     -1%
   400    0.74    0.75     -1%

Table 5.6: Approximations for stop-loss premiums and quantiles of S for Pareto claim sizes and lognormal present value discounting factors.


5.4 The Bayesian approach

Some comments on notation are needed at this point. First p(.|.) denotes

a conditional probability density with the arguments determined by the

context, and similarly for p(·), which denotes a marginal distribution. The

same notation is used for continuous density functions and discrete prob-

ability mass functions.

5.4.1 Introduction

Bayesian theory is a powerful branch of statistics not yet fully explored by practising actuaries. One of its main benefits, and the core of its philosophy, is the ability to include subjective information in a formal framework. The wide range of models it offers is a further reason why it has received so much attention recently.

Since the early 1990s, statistics (and to a lesser extent econometrics) has seen an explosion in applied Bayesian research. This explosion has had little to do with a renewed interest of the statistics and econometrics communities in the theoretical foundations of Bayesianism, or with a sudden awakening to the merits of the Bayesian approach over frequentist methods; it can instead be explained primarily on pragmatic grounds. The recent developments are mainly due, firstly, to advances in computing that have made simulation-based calculation easier and, secondly, to the failure of classical statistical methods to give solutions to many problems. Indeed, simulation tools often enable researchers to estimate complicated statistical models that would be quite difficult, if not virtually impossible, to estimate using standard frequentist techniques. Yet although so many developments have been occurring in Bayesian statistics, very few actuaries are aware of them and even fewer make use of them. The purpose of this section is to sketch, in very broad terms, the basic elements of Bayesian computation.

Classical statistics provides methods to analyze data, from simple de-

scriptive measures to complex and sophisticated models. The available

data are processed and then conclusions about a hypothetical population,

of which the data available is supposed to be a representative sample, are

drawn. It is not hard to imagine situations, however, in which data are not

the only available source of information about the population. Bayesian


methods provide a principled way to incorporate this external information

into the data analysis process. To do so, however, Bayesian methods have

to change entirely the vision of the data analysis process with respect to

the classical approach. In a Bayesian approach, the data analysis process

starts already with a given probability distribution. As this distribution is

given before any data is considered, it is called the prior distribution.

Bayesian methods allow us to assign prior distributions to the param-

eters in the model which capture known qualitative and quantitative fea-

tures, and then to update these priors in the light of the data, yielding a

posterior distribution via Bayes’ theorem

Posterior ∝ Likelihood × Prior,

where ∝ denotes that two quantities are proportional to each other. Hence

the posterior distribution is found by combining the prior distribution for

the parameters with the probability of observing the data given the param-

eters (the likelihood). The ability to include prior information in the model

is not only an attractive pragmatic feature of the Bayesian approach, it is

theoretically vital for guaranteeing coherent inferences.

More formally Bayes’ theorem is defined as follows. Consider a process in which observations ($\vec{Y}$ is the vector of observations) are to be taken from a distribution for which the probability density function is $p(\vec{Y}|\vec{\theta})$, where $\vec{\theta}$ is a set of unknown parameters. Before any observation is made, the analyst would include all his previous information and judgements of $\vec{\theta}$ in a prior distribution $p(\vec{\theta})$, that would be combined with the observations to give a posterior distribution $p(\vec{\theta}|\vec{Y})$ in the following way:

$$ p(\vec{\theta}|\vec{Y}) \propto p(\vec{Y}|\vec{\theta})\, p(\vec{\theta}). $$

Bayesian modelling involves integrals over the parameters, whereas non-

Bayesian methods often rely on optimization of the parameters. The main

difference between these methods is that optimization fails to take into

account the inherent uncertainty in the parameters. There is no true value

for each of the parameters which can be found by optimization. Instead,

there is a range of plausible values, each with some associated density.

The Bayesian approach to model fitting and inference consists of three basic steps:

1. Assign priors to all the unknown parameters;


2. Write down the likelihood of the data given the parameters;

3. Determine the posterior distribution of the parameters given the data

using Bayes’ theorem.
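The three steps above can be sketched for a one-parameter model by brute force on a grid. The example is an assumption made for illustration only: Poisson claim counts with unknown mean λ, a flat prior on a finite grid, and made-up data; the function names are hypothetical.

```python
import math


def grid_posterior(grid, prior, log_likelihood):
    """Posterior weights on a grid: posterior is proportional to likelihood * prior."""
    log_post = [math.log(p) + log_likelihood(t) for t, p in zip(grid, prior)]
    shift = max(log_post)                      # guard against underflow
    weights = [math.exp(lp - shift) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]


def poisson_log_likelihood(counts):
    """Log-likelihood of i.i.d. Poisson counts as a function of the mean."""
    def log_lik(lam):
        return sum(y * math.log(lam) - lam - math.lgamma(y + 1) for y in counts)
    return log_lik


# Step 1: prior (flat on the grid). Step 2: likelihood. Step 3: Bayes' theorem.
grid = [0.05 * k for k in range(1, 201)]
prior = [1.0 / len(grid)] * len(grid)
counts = [2, 3, 1, 4, 2]                       # made-up observations
posterior = grid_posterior(grid, prior, poisson_log_likelihood(counts))
print(sum(t * w for t, w in zip(grid, posterior)))   # posterior mean of lambda
```

A grid only works for very low-dimensional parameters; for realistic models the normalising integral is exactly the intractable quantity that motivates simulation methods.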

Bayesian inference is quite simple to describe probabilistically, but there

have been two major obstacles to its popularity. The first is how to specify

prior distributions, and the second is how to evaluate the integrals required

for inference, given that for most models, these are analytically intractable.

This will be discussed in short in the next two subsections.

5.4.2 Prior choice

The prior distribution can arise from data previously observed, or it can be

the subjective assessment of some domain expert and, as such, it represents

the information we have about the problem at hand, that is not conveyed

by the sample data.

Several methods for eliciting prior densities from experts exist. See,

e.g. O’Hagan (1994) for a comprehensive review. A common approach is

to choose a prior distribution with density function similar to the likelihood

function. In doing so, the posterior distribution of $\vec{\theta}$ will be in the same

class and the prior is said to be conjugate to the likelihood. The conjugate

family is mathematically convenient in that the posterior distribution fol-

lows a known parametric form. Of course, if information is available that

contradicts the conjugate parametric family, it may be necessary to use a

more realistic, if inconvenient, prior distribution. The basic justification

for the use of conjugate prior distributions is similar to that for using stan-

dard models for the likelihood: it is easy to understand the corresponding

results, which can often be put in analytic form. Next, they are often a

good approximation, and they simplify computations. Although they can

make interpretations of posterior inferences less transparent and compu-

tation more difficult, non-conjugate prior distributions do not pose any

new conceptual problem. In practice, for complicated models, conjugate

prior distributions may not even be possible. In general, the exponential

families are the only classes of distributions that have natural conjugate

distributions, since, apart from certain irregular cases, the only distribu-

tions having a fixed number of sufficient statistics are of the exponential

type.
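A textbook instance of conjugacy is the gamma prior for a Poisson mean. The sketch below is an assumption-laden illustration rather than a model from this thesis: the prior is written in the rate form, with density proportional to λ^(shape−1) e^(−rate·λ), and the hyperparameters and counts are made up.

```python
def gamma_poisson_update(shape, rate, counts):
    """Posterior (shape, rate) of a gamma prior after observing Poisson counts."""
    return shape + sum(counts), rate + len(counts)


def credibility_mean(shape, rate, counts):
    """Posterior mean written as a weighted average of prior and sample mean."""
    n = len(counts)
    z = n / (rate + n)                     # weight given to the data
    return z * (sum(counts) / n) + (1.0 - z) * (shape / rate)


counts = [2, 3, 1, 4, 2]                   # made-up observations
post_shape, post_rate = gamma_poisson_update(2.0, 1.0, counts)
print(post_shape / post_rate, credibility_mean(2.0, 1.0, counts))
```

The posterior stays in the gamma family, and its mean is a weighted average of the prior mean and the sample mean, so no numerical integration is needed.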


Kass and Wasserman (1996) survey formal rules that have been suggested

for choosing a prior. Many of these rules reflect the desire to let the

“data speak for themselves”, so that inferences are unaffected by information external to the current data. This has led to a variety of priors with

names like conventional, default, diffuse, flat, formal, generic, indifference,

neutral, non-informative, objective, reference, and vague priors. Prior dis-

tributions playing a minimal role in the posterior distribution are called

‘reference prior distributions’. One interpretation of letting the data speak

for themselves is to use classical techniques. Maximum likelihood estimates

are rationalizable in a Bayesian framework by appropriate choice of prior

distribution, specifically a uniform prior.

There are many ways of defining a non-informative prior. The main

objective is to give as little subjective information as possible. So, usually

a prior distribution with a large value for the variance is used. Another

way of including the minimal prior information is to find estimates of the

parameters of the prior distribution, using the data. This last approach

is called the empirical Bayes method, but often there is a relationship

between those two approaches — non-informative and empirical Bayes.

A commonly used reference prior in Bayesian analysis is Jeffreys’ prior

(see Jeffreys (1946)). This choice is based on considering one-to-one

transformations h(~θ) of the parameter. Jeffreys’ general principle is that any

rule for determining the prior density p(~θ) should yield an equivalent re-

sult if applied to the transformed parameter. This non-informative prior

is obtained by applying Jeffreys’ rule, which is to take the prior density

to be proportional to the square root of the determinant of the Fisher

information matrix. This prior exhibits many nice features that make it

an attractive reference prior. One such property is parametrization invari-

ance. Although Jeffreys’ rule has many desirable properties, it should be

used with caution.
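
As a small worked illustration of Jeffreys' rule (added here as an example; the Poisson model and the symbols $\lambda$, $\phi$ and $I(\cdot)$ are assumptions of this sketch, not part of the models used later), consider a single Poisson observation $y$ with mean $\lambda$. The rule gives a prior proportional to $\lambda^{-1/2}$, and applying the rule directly to the transformed parameter $\phi = \sqrt{\lambda}$ gives the same answer as transforming that prior, which is the invariance property mentioned above.

```latex
\begin{align*}
\log f(y \mid \lambda) &= -\lambda + y \log \lambda - \log y!, \\
I(\lambda) &= -E\!\left[\frac{\partial^2}{\partial \lambda^2} \log f(y \mid \lambda)\right]
            = \frac{E[y]}{\lambda^2} = \frac{1}{\lambda},
  \qquad\text{so}\qquad p(\lambda) \propto \sqrt{I(\lambda)} = \lambda^{-1/2}. \\[4pt]
\text{With } \phi = \sqrt{\lambda}:\quad
I(\phi) &= I(\lambda)\left(\frac{d\lambda}{d\phi}\right)^{2} = \frac{1}{\phi^{2}}\,(2\phi)^{2} = 4,
  \qquad\text{so}\qquad p(\phi) \propto 1, \\
p(\lambda)\left|\frac{d\lambda}{d\phi}\right| &= \phi^{-1}\cdot 2\phi = 2 \;\propto\; 1 .
\end{align*}
```

Note that $\int_0^{\infty} \lambda^{-1/2}\,d\lambda$ diverges, so in this example the Jeffreys prior does not integrate to a finite value over the parameter space.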

In most cases, Jeffreys’ prior is technically not a probability distribu-

tion, since the density function does not have a finite integral over the

parameter space. It is then termed an improper prior. It is often the case

that Bayesian inference based on improper priors returns proper posterior

distributions which then turn out to be numerically equivalent to the re-

sults of classical inference. Problems related to the use of improper prior

distributions can be overcome by assigning prior distributions that are as

uniform as possible but still remain probability distributions. The use of

uniform prior distributions to represent uncertainty clearly assumes that


“equally probable” is an adequate representation of “lack of information”.

Theoretically, a prior distribution could be included for all the parame-

ters that are unknown in a model, so that any model could be represented in

a Bayesian way. However, this often leads to intractable problems (mainly

integrals without solution). So the main limitation of Bayesian theory is

the difficulty, and in many cases the impossibility, of solving the required

equations analytically.

In the last decade many simulation techniques have been developed

in order to solve this problem and to obtain estimates of the posterior

distribution. These techniques were turning points for the Bayesian theory,

making it possible to apply many of its models. On the one hand, the use

of a final and closed formula for a solution is, generally speaking, more

satisfactory than the use of an approximation through simulation. On the

other hand, simulation gives a larger range of models for which solutions

(or at least good approximations) can be obtained.

5.4.3 Iterative simulation methods

In order to illustrate the simulation philosophy, suppose that the posterior

of a specific parameter ~θ is needed. If an analytical solution were available,

a formula would be derived, in which the observed data and known parameters

would be included, defining a final result. But, depending on the

model, such a solution may not be available. In such cases an approximation

for the posterior distribution of ~θ is needed. One way of finding this

approximation is by simulation, which replaces the posterior distribution

by a large sample of ~θ based on the characteristics of the model. With this

large sample of ~θ many summary statistics could be calculated, like the

mean, variance or histogram, extracting all the information needed from

this sample of the posterior distribution.

There are a number of ways of simulating and in all of them some

checking should be carried out to guarantee that the simulation set is

really representative for the required distribution. For instance, it must

be checked whether the simulation is mixing well or, in other words, if the

simulation procedure is visiting all the possible values for ~θ. It should

also be considered how large the sample should be, and whether the results

depend strongly on the initial point where the simulation starts. Among many

other issues, the moment when convergence to the true distribution of ~θ is

achieved should also be monitored.


The most popular simulation methods in Bayesian theory are the Markov

Chain Monte Carlo (MCMC) methods. This class of simulation models

has been used in a large number and wide range of applications, and has

been found to be very powerful. The essence of the MCMC method is that

by sampling from specific simple distributions (derived from the combina-

tion of the likelihood and prior distributions), a sample from the posterior

distribution will be obtained in an asymptotic way.

Iterative simulation methods, particularly the Gibbs sampler and the

Metropolis-Hastings algorithm, are powerful statistical tools that facilitate

computation in a variety of complex models. Though these two algorithms

are commonly presented as useful yet distinct instruments for simulating

joint posteriors, this distinction is rather artificial - indeed, one can regard

the Gibbs sampler as a special case of the Metropolis-Hastings algorithm

where jumps along the complete conditional distributions are accepted with

probability one. In conditionally conjugate models, the Gibbs sampler is

typically the algorithm of choice (since the complete posterior conditionals

are easily sampled).

The general strategy with iterative methods is to follow the steps of the

algorithms to generate a series of draws (sometimes called a parameter

chain), say θ0, θ1, θ2, . . . that converge in distribution to some target density

- in our case, the posterior f(θ|~Y ). The algorithms are constructed so that

the posterior f(θ|~Y ) is the unique stationary distribution of the parameter

chain. Once convergence to the target density is “achieved” we can use

these draws in the same way as with direct Monte Carlo integration to

calculate posterior means, posterior standard deviations, and so on. In

practice, we take care to diagnose that the parameter chain has approached

convergence to the target density, to discard the initial set of the pre-

convergence draws (often called the burn-in period), and then to use the

post-convergence sample to calculate the desired quantities. Unlike the

non-iterative methods discussed previously, the post-convergence draws

we obtain using these iterative methods will prove to be correlated, as

the distribution of, say, θt depends on the last parameter sampled in the

chain, θt−1. If the correlation among the draws is severe, it may prove to be

difficult to traverse the entire parameter space, and the numerical standard

errors associated with the point estimates can be quite large. When the

simulations are highly correlated, and our chain makes only small local

movements from iteration to iteration, we refer to this as slow mixing of


the parameter chain.

One can find an excellent overview and a detailed discussion of exam-

ples of MCMC algorithms in, for example, Gilks et al. (1996). Here we

will describe Gibbs Sampling (GS), a special case of Metropolis-Hastings

algorithms, which is becoming increasingly popular in the statistical com-

munity. GS is an iterative method that produces a Markov Chain, that is

a sequence of values {~θ(0), ~θ(1), ~θ(2), . . .} such that ~θ(i+1) is sampled from a

distribution that depends on the current state i of the chain. The algorithm

works as follows.

Let $\vec{\theta}^{(0)} = \{\theta_1^{(0)}, \ldots, \theta_k^{(0)}\}$ be a vector of initial values of $\vec{\theta}$ and suppose that the conditional distributions of $\theta_i \mid (\theta_1, \ldots, \theta_{i-1}, \theta_{i+1}, \ldots, \theta_k, \vec{Y})$ are known for each $i$. The first value in the chain is simulated as follows:

$\theta_1^{(1)}$ is sampled from the conditional distribution of $\theta_1 \mid (\theta_2^{(0)}, \ldots, \theta_k^{(0)}, \vec{Y})$;

$\theta_2^{(1)}$ is sampled from the conditional distribution of $\theta_2 \mid (\theta_1^{(1)}, \theta_3^{(0)}, \ldots, \theta_k^{(0)}, \vec{Y})$;

$\vdots$

$\theta_k^{(1)}$ is sampled from the conditional distribution of $\theta_k \mid (\theta_1^{(1)}, \theta_2^{(1)}, \ldots, \theta_{k-1}^{(1)}, \vec{Y})$.

Then $\vec{\theta}^{(0)}$ is replaced by $\vec{\theta}^{(1)}$ and the simulation is repeated to generate $\vec{\theta}^{(2)}$, and so forth. In general, the $i$-th value in the chain is generated by simulating from the distribution of $\vec{\theta}$ conditional on the value previously generated, $\vec{\theta}^{(i-1)}$. After an initial long chain, called burn-in, of say $b$ iterations, the values $\{\vec{\theta}^{(b+1)}, \vec{\theta}^{(b+2)}, \vec{\theta}^{(b+3)}, \ldots\}$ will be approximately a sample from the posterior distribution of $\vec{\theta}$, from which empirical estimates of the posterior means and any other function of the parameters can be computed. Critical issues for this method are the choice of the starting value $\vec{\theta}^{(0)}$, the length of the burn-in and the selection of a stopping rule. The program “WinBugs” provides an implementation of GS suitable for problems

in which the likelihood function satisfies certain factorization properties.
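
The Gibbs updating scheme can be sketched in a few lines of code. The example below is only an illustration under stated assumptions: the target is a standard bivariate normal distribution with correlation `rho` (chosen because both full conditionals are known normals), and the function name `gibbs_bivariate_normal`, the starting point and the burn-in length are hypothetical choices, not taken from the text or from WinBugs.

```python
import numpy as np


def gibbs_bivariate_normal(rho, n_iter=20000, burn_in=2000, start=(5.0, -5.0), seed=0):
    """Gibbs sampler for a standard bivariate normal with correlation rho.

    The full conditionals are theta1 | theta2 ~ N(rho * theta2, 1 - rho^2)
    and theta2 | theta1 ~ N(rho * theta1, 1 - rho^2).
    """
    rng = np.random.default_rng(seed)
    theta1, theta2 = start                  # deliberately poor starting value
    cond_sd = np.sqrt(1.0 - rho ** 2)
    draws = np.empty((n_iter, 2))
    for t in range(n_iter):
        # Update the first component given the current second component.
        theta1 = rng.normal(rho * theta2, cond_sd)
        # Update the second component given the freshly updated first one.
        theta2 = rng.normal(rho * theta1, cond_sd)
        draws[t] = theta1, theta2
    # Discard the burn-in draws and keep the rest as the posterior sample.
    return draws[burn_in:]


sample = gibbs_bivariate_normal(rho=0.8)
print("sample means:", sample.mean(axis=0))
print("sample correlation:", np.corrcoef(sample.T)[0, 1])
```

Successive draws are correlated, so the retained sample carries less information than the same number of independent draws; a larger `rho` makes this effect stronger and the mixing slower.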

5.4.4 Bayesian model set-up

In this subsection we explain how to set up the relevant Bayesian models

and draw samples from posterior distributions for parameters ~θ and future

observables Y .

We show how simple simulation methods can be used to draw samples

from posterior and predictive distributions, automatically incorporating


uncertainty in the model parameters, and draw samples for posterior pre-

dictive checks.

The simplest and most widely used version of this model is the normal

linear model, in which the distribution of the response variable ~Y given

the regression matrix X is normal with mean a linear function of X:

E[Yi|~β,X] = β1xi1 + · · · + βkxik,

for i = 1, . . . , n. We further restrict to the case of ordinary linear regres-

sion, in which the conditional variances are equal, Var[Yi|~θ,X] = σ2 for

all i, and the observations are conditionally independent given ~θ,X. The

parameter vector is then ~θ = (β1, . . . , βk, σ2).

Under a standard non-informative prior distribution, the Bayesian es-

timates and standard errors coincide with the classical results. In the

simplest case, called ordinary linear regression, the observation errors are

independent and have equal variance. In vector notation, the model is given by

~Y |~β, σ2,X ∼ Nn(X~β, σ2I),

where I is the n × n identity matrix. In the normal regression model, a

convenient non-informative prior distribution is uniform on (~β, log σ) or,

equivalently,

$$p(\vec{\beta}, \sigma^2 \mid X) \propto \sigma^{-2}.$$

When there are many data points and only a few parameters, the non-

informative prior distribution is useful — it gives acceptable results and

takes less effort than specifying prior knowledge in probabilistic form. For

a small sample size or a large number of parameters, the likelihood is less

sharply peaked, and so prior distributions are more important.

We determine first the posterior distribution for ~β, conditional on σ2,

and then the marginal posterior distribution for σ2. That is, we factor the

joint posterior distribution for ~β and σ2 as p(~β, σ2|~Y ) = p(~β|σ2, ~Y )p(σ2|~Y ).

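1. Conditional posterior distribution of $\vec{\beta}$ given $\sigma^2$:

$$\vec{\beta} \mid \sigma^2, \vec{Y} \sim N(\hat{\beta}, V_{\beta}\,\sigma^2),$$

with

$$\hat{\beta} = (X'X)^{-1}X'\vec{Y} \qquad\text{and}\qquad V_{\beta} = (X'X)^{-1}.$$

2. Marginal posterior distribution of $\sigma^2$:

$$\sigma^2 \mid \vec{Y} \sim \text{Inv-}\chi^2(n-k, s^2),$$

where

$$s^2 = \frac{1}{n-k}\,(\vec{Y} - X\hat{\beta})'(\vec{Y} - X\hat{\beta}).$$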

The marginal posterior distribution of $\vec{\beta} \mid \vec{Y}$, averaging over $\sigma^2$, is multivariate $t$ with $n-k$ degrees of freedom, but we rarely use this fact in practice when drawing inferences by simulation, since to characterize the joint posterior distribution we can draw simulations of $\sigma^2$ and then of $\vec{\beta} \mid \sigma^2$. The standard non-Bayesian estimates of $\vec{\beta}$ and $\sigma^2$ are $\hat{\beta}$ and $s^2$, respectively, as just defined. The classical standard error estimate for $\hat{\beta}$ is obtained by setting $\sigma^2 = s^2$.

It is easy to draw samples from the posterior distribution: first compute $\hat{\beta}$, $V_{\beta}$ and $s^2$, then draw $\sigma^2$ from the scaled inverse-$\chi^2$ distribution and $\vec{\beta}$ from the multivariate normal distribution.

The posterior predictive distribution of unobserved data, $p(\tilde{Y} \mid \vec{Y})$, has two

components of uncertainty:

1. The fundamental variability of the model, represented by the variance $\sigma^2$ in $\tilde{Y}$, and

2. The posterior uncertainty in ~β and σ2 due to the finite sample size

of ~Y . As the sample size n → ∞, the variance due to posterior un-

certainty in (~β, σ2) decreases to zero, but the predictive uncertainty

remains.
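
The two-step sampling scheme and the two components of predictive uncertainty can be sketched as follows. This is a minimal illustration, assuming the non-informative prior of this subsection; the function names `posterior_draws` and `predictive_draws` and the simulated data set are hypothetical and serve only to make the example runnable.

```python
import numpy as np


def posterior_draws(X, y, n_draws=5000, seed=0):
    """Draw (beta, sigma^2) from the posterior of the normal linear model
    under the prior p(beta, sigma^2) proportional to sigma^(-2)."""
    rng = np.random.default_rng(seed)
    n, k = X.shape
    V_beta = np.linalg.inv(X.T @ X)
    beta_hat = V_beta @ X.T @ y
    resid = y - X @ beta_hat
    s2 = resid @ resid / (n - k)
    # sigma^2 | Y ~ scaled inverse chi^2(n - k, s^2), i.e. (n - k) s^2 / chi^2_{n-k}.
    sigma2 = (n - k) * s2 / rng.chisquare(n - k, size=n_draws)
    # beta | sigma^2, Y ~ N(beta_hat, V_beta * sigma^2), via a Cholesky factor of V_beta.
    L = np.linalg.cholesky(V_beta)
    z = rng.standard_normal((n_draws, k))
    beta = beta_hat + np.sqrt(sigma2)[:, None] * (z @ L.T)
    return beta, sigma2, beta_hat, s2


def predictive_draws(X_new, beta, sigma2, seed=1):
    """One predictive draw per posterior draw: parameter uncertainty enters
    through beta and sigma2, model variability through the added noise."""
    rng = np.random.default_rng(seed)
    mean = beta @ X_new.T
    noise = rng.standard_normal(mean.shape) * np.sqrt(sigma2)[:, None]
    return mean + noise


# Hypothetical data: intercept 1, slope 2, error standard deviation 0.5.
data_rng = np.random.default_rng(42)
X = np.column_stack([np.ones(50), np.linspace(0.0, 1.0, 50)])
y = X @ np.array([1.0, 2.0]) + 0.5 * data_rng.standard_normal(50)

beta, sigma2, beta_hat, s2 = posterior_draws(X, y)
y_new = predictive_draws(np.array([[1.0, 1.5]]), beta, sigma2)
print("posterior mean of beta:", beta.mean(axis=0), "classical estimate:", beta_hat)
print("95% predictive interval:", np.percentile(y_new[:, 0], [2.5, 97.5]))
```

The posterior means of the regression coefficients should be close to the classical estimates, in line with the remark that Bayesian and classical results coincide under this prior.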

5.5 Applications in claims reserving

5.5.1 The comonotonicity approach versus the Bayesian approximations

In this subsection we apply a Bayesian model in the context of discounted

loss reserves. The outcomes of this approach are compared with the

comonotonic approximations for the distribution of the discounted loss re-

serve when the run-off triangle is modelled by a generalized linear model.


We realize that the Bayesian posterior predictive distribution is a very

general workhorse, which takes into account all sources of uncertainty in

the model formulation and is applicable to different statistical domains,

whereas the comonotonic approximations originate from a specific actuarial

context. We want to illustrate however that the predictive distribution

based on the comonotonic bounds provides results that are close to the

results obtained via MCMC. The main advantage of the bounds is that

several risk measures such as percentiles (VaRs), expected shortfalls (stop-loss

premiums) and TailVaRs can be calculated easily from them.

As illustrated by Verrall (2004) (for GLIMs) and in earlier work by, for

instance, de Alba (2002) (for lognormal models), Bayesian techniques are

useful in this area as they provide the posterior predictive distribution of

the reserve.

Bayesian methods for the analysis of GLIMs

We consider Bayesian methods for the analysis of generalized linear models,

which provide a general framework for cases in which normality and linear-

ity are not viable assumptions. These cases point out the major computa-

tional bottleneck of Bayesian methods: when the assumptions of normality

and/or linearity are removed, usually the posterior distribution cannot be

computed in closed form. We will discuss some computational methods to

approximate this distribution.

Generalized linear models provide a unified framework to encompass

several situations which are not adequately described by the assumptions

of normality of the data and linearity in the parameters. As described in

Chapter 4 (Section 4.3.3), the features of a GLIM are the fact that the

distribution of ~Y |~θ (~θ is used to denote the parameter vector) belongs to

the exponential family, and that a transformation of the expectation of the

data, g(~µ), equals the linear predictor R~β. The parameter

vector is made up of ~β and of the dispersion parameter φ.

Classical analyses of generalized linear models allow for the possibil-

ity of variation beyond that of the assumed sampling distribution, called

overdispersion. A prior distribution can be placed on the dispersion parameter,

and any prior information about ~β can be described conditional

on the dispersion parameter; that is, p(~β, φ) = p(φ)p(~β|φ).

The classical analysis of generalized linear models is obtained if a non-

informative or flat prior distribution is assumed for ~β. The posterior mode


corresponding to a noninformative uniform prior density is the maximum

likelihood estimate for the parameter ~β, which can be obtained using iter-

ative weighted linear regression.

The problem with a Bayesian analysis of GLIMs is that, in general, the

posterior distribution of ~β cannot be calculated exactly, since the marginal

density of the data

$$p(\vec{Y}) = \int p(\vec{Y} \mid \vec{\theta})\, p(\vec{\theta})\, d\vec{\theta} \qquad (5.32)$$

cannot be evaluated in closed form.

Numerical integration techniques can be exploited to approximate (5.32),

from which a numerical approximation of the posterior density of ~β can

be found. When numerical integration techniques become infeasible, we

are left with two main ways to perform approximate posterior analysis:

(i) to provide an asymptotic approximation of the posterior distribution

or (ii) to use stochastic methods to generate a sample from the posterior

distribution.

When the sample size is large enough, posterior analysis can be based

on an asymptotic approximation of the posterior distribution by using a

normal distribution with some mean and variance. This idea generalizes

the asymptotic normal distribution of the maximum likelihood estimates

when their exact sampling distribution cannot be derived or it is too dif-

ficult to be used. Asymptotic normality of the posterior distribution pro-

vides notable computational advantages, since marginal and conditional

distributions are still normal, and hence inference on parameters of inter-

est can be easily carried out. However, for relatively small samples, the

assumption of asymptotic normality can be inaccurate.
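
In symbols, one common form of this normal approximation can be written as below. This is a sketch using standard notation that is an assumption here: $\hat{\beta}$ denotes the posterior mode (the maximum likelihood estimate under a flat prior) and $I(\hat{\beta})$ the observed information evaluated at that mode.

```latex
\[
p(\vec{\beta} \mid \vec{Y}) \;\approx\; N\!\left(\hat{\beta},\, \left[I(\hat{\beta})\right]^{-1}\right),
\qquad
I(\hat{\beta}) \;=\; -\left.\frac{\partial^{2}}{\partial \vec{\beta}\,\partial \vec{\beta}'}
\log p(\vec{\beta} \mid \vec{Y})\right|_{\vec{\beta} = \hat{\beta}} .
\]
```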

For relatively small samples, stochastic methods (or Monte Carlo methods)

provide an approximate posterior analysis based on a sample of values gen-

erated from the posterior distribution of the parameters. The task reduces

to generating a sample from the posterior distribution of the parameters.


Consider now the run-off triangle in Table 5.7, taken from Taylor & Ashe

(1983) and used in various other publications on claims reserving.

These data are modelled using a gamma GLIM (see expression (4.51)

with κ = 2) with logarithmic link function.


            1          2          3          4          5          6          7          8          9         10
 1    357,848    766,940    610,542    482,940    527,326    574,398    146,342    139,950    227,299     67,948
 2    352,118    884,021    933,894  1,183,289    445,745    320,996    527,804    266,172    425,046
 3    290,507  1,001,799    926,219  1,016,654    750,816    146,923    495,992    280,405
 4    310,608  1,108,250    776,189  1,562,400    272,482    352,053    206,286
 5    443,160    693,190    991,983    769,488    504,851    470,639
 6    396,132    937,085    847,498    805,037    705,960
 7    440,832    847,631  1,131,398  1,063,269
 8    359,480  1,061,648  1,443,370
 9    376,686    986,608
10    344,014

Table 5.7: Run-off triangle with non-cumulative claim figures.


[Figure: three panels of weighted residuals plotted against year of origin, development year and calendar year (periods 1 to 10).]

Figure 5.1: Weighted residuals for linear predictor in (5.33), together

with average lines, which represent the average of the weighted standardized

residuals in each period of interest. Note that the average is zero when no

observations occur.

Although the linear predictor in the Probabilistic Trend Family of

models (4.32) is over-parameterized, it provides a flexible modelling struc-

ture. For example, one might begin with three parameters (α, β, γ), with

one accident period level parameter, one development period trend para-

meter and one calendar period trend parameter, which equates to a linear

predictor with the following form,

ηij = α+ (j − 1)β + (i+ j − 2)γ. (5.33)

Adding more accident, development and calendar period parameters where

necessary, allows the structure to be extremely flexible.
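
To make the structure of (5.33) concrete, the sketch below builds the regression matrix for the three-parameter predictor on the observed part of a $t \times t$ run-off triangle (cells with $i + j \le t + 1$). The function name `trend_design_matrix` and the parameter values used to evaluate the predictor are hypothetical and are not estimates from Table 5.7.

```python
import numpy as np


def trend_design_matrix(t):
    """Regression matrix for eta_ij = alpha + (j - 1) * beta + (i + j - 2) * gamma
    on the observed cells of a t x t run-off triangle (i + j <= t + 1)."""
    rows, cells = [], []
    for i in range(1, t + 1):            # accident (origin) period
        for j in range(1, t + 2 - i):    # development period
            rows.append([1.0, j - 1.0, i + j - 2.0])
            cells.append((i, j))
    return np.array(rows), cells


R, cells = trend_design_matrix(10)
# Hypothetical parameter values (alpha, beta, gamma), for illustration only.
eta = R @ np.array([12.0, -0.2, 0.05])
print(R.shape)            # one row per observed cell, one column per parameter
print(cells[:3], eta[:3])
```

Extra accident, development or calendar period parameters correspond to extra columns of this matrix, for example indicator columns for groups of accident periods.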

The weighted residuals for this model (Figure 5.1) indicate that there are major trends in the development period that are not being captured. There also appears to be a level change between accident periods one, two-three, four and five. To capture these trends extra development period trend parameters and extra accident period level parameters are required.

[Figure: three panels of weighted residuals plotted against year of origin, development year and calendar year (periods 1 to 10).]

Figure 5.2: Weighted residuals for linear predictor in (5.34), together with average lines, which represent the average of the weighted standardized residuals in each period of interest. Note that the average is zero when no observations occur.

The new form of the linear predictor is given by

ηij = α1I(i=1) + α2I(i=2,3) + α3I(i=4) + α4I(i>4) + β1I(j>1)

+ β2I(j>4) + (j − 5)β3I(5<j<9) + 3β3I(j>8). (5.34)

The weighted residuals for this updated model (Figure 5.2) indicate that

(5.34) appears to capture the significant levels and trends in the data.

We recall from the previous chapter the definition of the discounted

IBNR reserve under a generalized linear model and normal logreturn process.

$$S_{GLIM} = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} g^{-1}\big((R\vec{\beta})_{ij}\big)\, e^{-Y(i+j-t-1)},$$


year      S^l_GLIM      S^c_GLIM      Bayesian
2          360,725       387,404       436,151
3          700,465       765,451       760,177
4          945,845       982,425       970,535
5        1,441,016     1,513,186     1,448,056
6        1,913,383     1,977,934     1,919,300
7        2,519,292     2,614,564     2,558,208
8        3,557,014     3,702,302     3,641,890
9        4,573,767     4,770,944     4,727,262
10       5,577,925     5,821,804     5,638,301
total   20,949,190    21,988,048    20,360,196

Table 5.8: 95th percentile of the predictive distribution of SGLIM

where the returns are modelled by means of a Brownian motion described

by the following equation

$$Y(i) = \left(\delta + \frac{\varsigma^2}{2}\right) i + \varsigma B(i),$$

where B(i) is the standard Brownian motion, ς is the volatility and δ is a

constant force of interest.
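
As a quick numerical check of this return process, the sketch below simulates the yearly values $Y(i)$ and the corresponding discount factors $e^{-Y(i)}$. With the drift $\delta + \varsigma^2/2$, the expected discount factor is $E[e^{-Y(i)}] = e^{-\delta i}$, which the simulation should reproduce approximately. The function name `simulate_discount_factors`, the number of paths and the horizon are assumptions of this illustration; it is not the WinBugs code referred to in the text.

```python
import numpy as np


def simulate_discount_factors(delta, vol, horizon, n_paths=100_000, seed=0):
    """Simulate exp(-Y(i)), i = 1..horizon, with
    Y(i) = (delta + vol^2 / 2) * i + vol * B(i) and B a standard Brownian motion."""
    rng = np.random.default_rng(seed)
    years = np.arange(1, horizon + 1)
    # B(i) observed at integer times: cumulative sums of independent N(0, 1) increments.
    B = np.cumsum(rng.standard_normal((n_paths, horizon)), axis=1)
    Y = (delta + 0.5 * vol ** 2) * years + vol * B
    return np.exp(-Y)


Z = simulate_discount_factors(delta=0.08, vol=0.11, horizon=9)
print("simulated mean discount factors:", Z.mean(axis=0))
print("exp(-delta * i):                ", np.exp(-0.08 * np.arange(1, 10)))
```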

The discounting process (with δ = 0.08 and ς = 0.11) is incorporated in

the WinBugs code for the gamma GLIM. To enable comparisons with the

results from the comonotonic bounds, flat priors were used both for the row

and column parameters of the linear predictor and for the scale parameter

in the gamma model. Table 5.8 contains the results obtained via MCMC

simulations with the WinBugs program. A burn-in of 10 000 iterations was

allowed, after which another 10 000 iterations were performed.

The bounds for the discounted loss reserve use the maximum likeli-

hood estimates of the parameters in the linear predictor. To incorporate

the error arising from the estimation of these parameters we apply the

bootstrap algorithm as explained in Section 4.5. We bootstrapped 1000

times, each time computing (analytically) the 95th percentile of the upper and

lower bounds. Table 5.8 compares the Bayesian 95th percentile and the

bootstrapped 95th percentile of the lower and upper bound for the differ-

ent reserves.

The results for the upper and lower bounds in convex order are given in

the same table. One can see that the results from the comonotonic bounds

are close to the results obtained via MCMC simulation. Thus, at least for


this example, these bounds provide actuaries with accurate information

concerning the predictive distribution of discounted loss reserves.

5.5.2 The comonotonicity approach versus the asymptotic and moment matching approximations

In case the underlying variance of the statistical and financial part of the

discounted IBNR reserve gets large, the comonotonic approximations per-

form worse. We will illustrate this by means of a simple example in the

context of loss reserving and propose to solve this problem using the asymp-

totic approximations introduced in Section 5.3.

In the following, we assume that the r.v.’s Yij , i, j = 1, . . . , t can be

expressed as products of a deterministic component and an i.i.d. random

component. In particular, we consider the following model

$$Y_{ij} = a_{ij}\,\bar{Y}_{ij}, \qquad i, j = 1, \ldots, t, \qquad (5.35)$$

in which $\bar{Y}_{ij}$, $i, j = 1, \ldots, t$, are i.i.d. r.v.’s and $a_{ij} > 0$, $i, j = 1, \ldots, t$, are

positive numbers.

We will consider in this part the simple lognormal linear model (4.1)

$$\ln \vec{Y} = R\vec{\beta} + \vec{\varepsilon}, \qquad \vec{\varepsilon} \sim N(0, \sigma^2 I),$$

with ~Y as before the vector of historical claim figures.

The accumulated IBNR reserve is given by

$$\text{IBNR reserve} = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} a_{ij}\,\bar{Y}_{ij}. \qquad (5.36)$$

We will again incorporate stochastic discounting factors. We let the positive r.v. $V_k$ from the i.i.d. sequence $\{V_k, k = 1, \ldots, t-1\}$ denote the present value discounting factor from year $k$ to year $k-1$ and consider the two sequences $\{\bar{Y}_{ij}, i = 2, \ldots, t;\ j = t+2-i, \ldots, t\}$ and $\{V_k, k = 1, \ldots, t-1\}$ to be mutually independent. Furthermore, for notational convenience, we introduce the positive r.v. $Z_k = V_1 V_2 \cdots V_k$, $k = 1, \ldots, t-1$. Then the discounted IBNR reserve $S$ is given by

$$S = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} a_{ij}\,\bar{Y}_{ij}\,Z_{i+j-t-1}. \qquad (5.37)$$


Henceforth, we impose that E[S] < +∞. Approximate values for stop-

loss premiums and quantiles for S may be obtained by using asymptotic

results. In particular, if $\{\bar{Y}_{ij}, i = 2, \ldots, t;\ j = t+2-i, \ldots, t\}$ and $\{V_k, k = 1, \ldots, t-1\}$ satisfy the corresponding conditions under which Theorem 13 or Theorem 14 is valid, then for sufficiently large values of $d$, we have that

$$\pi(S, d) \approx \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} a_{ij}\,\pi\big(\bar{Y} Z_{i+j-t-1},\, d/a_{ij}\big). \qquad (5.38)$$

Furthermore, if either $F_Y \in \mathcal{R}_{-\alpha}$ for some $0 < \alpha < +\infty$, and $F_V \in \mathcal{R}_{-\infty}$, or the conditions of Theorem 14 apply, then for sufficiently large values of $p$, we have that

$$F_S^{-1}(p) \approx \inf\left\{ s : \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} \bar{F}_{\bar{Y} Z_{i+j-t-1}}(s/a_{ij}) \le 1 - p \right\}. \qquad (5.39)$$

As an example, we consider a lognormal linear regression model with chain-

ladder linear predictor to describe the random claims and we use a geomet-

ric Brownian motion with drift to represent the stochastic discount factors.

We remark that for this specification Theorem 14 applies. Furthermore, for

this specification the products $\bar{Y}_{ij} Z_{i+j-t-1}$, $i = 2, \ldots, t$; $j = t+2-i, \ldots, t$, are lognormal and therefore the present value of the IBNR reserve becomes a linear combination of dependent lognormal r.v.’s, given by

$$S = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} a_{ij}\,\bar{Y}_{ij}\,Z_{i+j-t-1} = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} e^{\eta_{ij}}\, e^{\varepsilon_{ij}}\, e^{-Y(i+j-t-1)}. \qquad (5.40)$$

Notice that this definition is the same as (4.35) for the special case of the

lognormal linear model (with σ = σ) and chain-ladder type linear predictor

ηij = (R~β)ij = αi + βj .

In this illustration, we start with a given set of parameters and define

the reserve as expressed in (5.40). In a real reserving exercise, one has to

build an appropriate statistical model based on the incremental claims in

the run-off triangle and to estimate the parameters from this model.

Using the same notation as in the previous chapter we have for $W_{ij} := -Y(i+j-t-1)$ that

$$E[W_{ij}] = -\left(\delta + \tfrac{1}{2}\varsigma^2\right)(i+j-t-1),$$

$$\mathrm{Var}[W_{ij}] = \sigma^2_{W_{ij}} = (i+j-t-1)\,\varsigma^2.$$


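The asymptotic approximations (5.38) and (5.39) become

$$\pi(S, d) \approx \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} \left[ e^{\eta_{ij} + E[W_{ij}] + \frac{1}{2}(\sigma^2_{W_{ij}} + \sigma^2)}\; \Phi\!\left( \frac{\eta_{ij} + E[W_{ij}] + \sigma^2_{W_{ij}} + \sigma^2 - \log(d)}{\sqrt{\sigma^2_{W_{ij}} + \sigma^2}} \right) - d\, \Phi\!\left( \frac{\eta_{ij} + E[W_{ij}] - \log(d)}{\sqrt{\sigma^2_{W_{ij}} + \sigma^2}} \right) \right], \qquad d \in \mathbb{R}^{+},$$

$$F_S^{-1}(p) \approx \inf\left\{ s : \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} \bar{F}_{LN}(s) \le 1 - p \right\}, \qquad p \in (0, 1),$$

in which $F_{LN}$ is the cdf of $\log N(\eta_{ij} + E[W_{ij}],\ \sigma^2_{W_{ij}} + \sigma^2)$ and $\bar{F}_{LN} = 1 - F_{LN}$.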

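To compute the lognormal moment matching approximations as described in Section 5.2 we need expressions for the mean and variance of $S$. These are given by

$$E[S] = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} e^{\eta_{ij} + E[W_{ij}] + \frac{1}{2}(\sigma^2_{W_{ij}} + \sigma^2)},$$

$$\mathrm{Var}[S] = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} \sum_{k=2}^{t} \sum_{l=t+2-k}^{t} e^{\sigma^2 + (\eta_{ij} + \eta_{kl} + E[W_{ij}] + E[W_{kl}]) + \frac{1}{2}(\sigma^2_{W_{ij}} + \sigma^2_{W_{kl}})} \times \left( e^{\varsigma^2 \min(i+j-t-1,\ k+l-t-1) + \sigma^2_{*}} - 1 \right),$$

where

$$\sigma^2_{*} = \begin{cases} \sigma^2 & \text{if } (i, j) = (k, l); \\ 0 & \text{if } (i, j) \neq (k, l). \end{cases}$$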

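We arbitrarily set $\sigma = 3$, $\delta = -0.07$, $\varsigma = 0.2$ and $t = 5$ and use the following chain-ladder parameters:

$$(\alpha_1, \alpha_2, \alpha_3, \alpha_4, \alpha_5)' = (1.1,\ 1.6,\ 1.9,\ 2.1,\ 2.2)', \qquad (\beta_1, \beta_2, \beta_3, \beta_4, \beta_5)' = (0,\ -0.42,\ -0.38,\ -0.87,\ -0.96)'.$$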
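
The closed-form expressions for $E[S]$ and for the asymptotic stop-loss approximation are straightforward to evaluate. The sketch below codes them for the parameter values as stated above; the function names (`future_cells`, `lognormal_params`, `expected_reserve`, `stop_loss_asymptotic`) are hypothetical, and the output has not been checked against the figures reported in Table 5.9.

```python
import math

# Parameter values as stated in the text.
ALPHA = [1.1, 1.6, 1.9, 2.1, 2.2]
BETA = [0.0, -0.42, -0.38, -0.87, -0.96]
SIGMA, DELTA, VOL, T = 3.0, -0.07, 0.2, 5


def future_cells(t=T):
    """Cells (i, j) of the lower triangle: i = 2..t, j = t + 2 - i..t."""
    return [(i, j) for i in range(2, t + 1) for j in range(t + 2 - i, t + 1)]


def lognormal_params(i, j):
    """Mean and variance on the log scale of exp(eta_ij + eps_ij + W_ij)."""
    k = i + j - T - 1                                   # years of discounting
    mu = ALPHA[i - 1] + BETA[j - 1] - (DELTA + 0.5 * VOL ** 2) * k
    var = k * VOL ** 2 + SIGMA ** 2                     # sigma^2_W + sigma^2
    return mu, var


def Phi(x):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def expected_reserve():
    """E[S] as a sum of lognormal means."""
    return sum(math.exp(mu + 0.5 * var)
               for mu, var in (lognormal_params(i, j) for i, j in future_cells()))


def stop_loss_asymptotic(d):
    """Asymptotic approximation: sum of the lognormal stop-loss premiums at retention d."""
    total = 0.0
    for i, j in future_cells():
        mu, var = lognormal_params(i, j)
        sd = math.sqrt(var)
        total += (math.exp(mu + 0.5 * var) * Phi((mu + var - math.log(d)) / sd)
                  - d * Phi((mu - math.log(d)) / sd))
    return total


print("E[S] =", expected_reserve())
for d in (7500, 10000, 50000):
    print(d, stop_loss_asymptotic(d))
```

As the retention $d$ tends to zero, the approximation tends to $E[S]$, which gives a simple consistency check on the implementation.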


d MC Appr. 1 Appr. 2 Appr. 3 Ndiff. 1 Ndiff. 2 Ndiff. 3

7500 1868.0 1771.6 2541.1 2277.6 5.2% -36.0% -21.9%

10000 1743.5 1658.1 2459.2 2165.8 4.9% -41.0% -24.2%

15000 1568.7 1496.9 2333.6 1998.4 4.6% -48.8% -27.4%

20000 1446.7 1383.1 2237.8 1874.1 4.4% -54.7% -29.5%

25000 1354.0 1295.8 2160.0 1775.2 4.3% -59.5% -31.1%

30000 1279.7 1225.4 2094.4 1693.4 4.2% -63.7% -32.3%

40000 1165.7 1116.7 1987.6 1563.3 4.2% -70.5% -34.1%

50000 1080.4 1034.8 1902.4 1462.2 4.2% -76.1% -35.3%

75000 933.3 892.5 1743.6 1280.5 4.4% -86.8% -37.2%

100000 835.6 797.4 1600.2 1154.9 4.6% -91.5% -38.2%

150000 708.5 673.0 1437.8 985.3 5.0% -102.9% -39.1%

200000 626.2 592.0 1323.5 871.7 5.5% -111.4% -39.2%

250000 566.5 533.4 1260.8 788.2 5.8% -122.6% -39.1%

300000 520.7 488.4 1190.4 723.1 6.2% -128.6% -38.9%

400000 453.9 422.6 1081.8 626.8 6.9% -138.3% -38.1%

500000 406.5 375.9 1000.2 557.7 7.5% -146.1% -37.2%

p MC Appr. 1 Appr. 2 Appr. 3 Ndiff. 1 Ndiff. 2 Ndiff. 3

0.95 8650 7863 4814 7555 9.1% 44.3% 12.7%

0.975 17000 15868 12436 17296 6.7% 26.8% -1.7%

0.99 38957 37496 37490 45306 3.8% 3.8% -16.3%

0.995 70795 68885 79477 87283 2.7% -12.3% -23.3%

0.999 257090 253021 374188 337364 1.6% -45.5% -31.2%

Table 5.9: Monte Carlo (MC) versus approximate values of stop-loss

premiums and quantiles for chain-ladder claim sizes and lognormal present

value discounting factors.

In Table 5.9 we numerically compare the asymptotic approximations with

a Monte Carlo (MC) study based on 5 000 0000 simulations. Numerical

results of the comonotonic and moment matching approximations have

also been included. “Appr. 1” refers to the asymptotic approximation,

“Appr. 2” to the convex upper bound and “Appr. 3” to the lognormal

moment matching approach. “Ndiff.” refers to the normalized difference

defined as (MC − Appr.)/MC × 100%. The numerical results demonstrate that the

asymptotic approximation values generally outperform the comonotonic

upper bound and the lognormal moment matching technique. Because the

comonotonic lower bound performed remarkably badly, its numerical values

were left out of the table.


5.6 Proofs

Theorem 12

In order to prove the theorem, we first establish the following result from

Tang & Tsitsiashvili (2004):

Lemma 11.

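Let $F_1$, $F_2$ and $G$ be three d.f.’s. Suppose that $\bar{F}_i(x) > 0$ for any real number $x$, $F_i(0)G(0) = 0$, $i = 1, 2$, and $G \in \mathcal{R}_{-\infty}$. If $\bar{F}_1(x) \sim \bar{F}_2(x)$, then

$$\overline{F_1 \otimes G}(x) \sim \overline{F_2 \otimes G}(x). \qquad (5.41)$$

Proof. From the condition $\bar{F}_1(x) \sim \bar{F}_2(x)$ we know that, for any $0 < \varepsilon < 1$ and all large $x$, say $x \ge y_0$ for some $y_0 > 0$,

$$(1 - \varepsilon)\bar{F}_2(x) \le \bar{F}_1(x) \le (1 + \varepsilon)\bar{F}_2(x). \qquad (5.42)$$

It is not difficult to verify that since $\bar{F}_i(y_0) > 0$ for all $y_0 > 0$, $i = 1, 2$, we have by the definition of the class $\mathcal{R}_{-\infty}$ for $i = 1, 2$, that

$$\limsup_{x \to +\infty} \frac{\int_0^{y_0} \bar{G}(x/y)\,F_i(dy)}{\int_{y_0}^{+\infty} \bar{G}(x/y)\,F_i(dy)} \le \limsup_{x \to +\infty} \frac{\int_0^{y_0} \bar{G}(x/y)\,F_i(dy)}{\int_{2y_0}^{+\infty} \bar{G}(x/y)\,F_i(dy)} \le \limsup_{x \to +\infty} \frac{\bar{G}(x/y_0)\,(F_i(y_0) - F_i(0))}{\bar{G}(x/2y_0)\,\bar{F}_i(2y_0)} = 0$$

and hence that for $i = 1, 2$

$$\overline{F_i \otimes G}(x) = \int_0^{y_0} \bar{G}(x/y)\,F_i(dy) + \int_{y_0}^{+\infty} \bar{G}(x/y)\,F_i(dy) \sim \int_{y_0}^{+\infty} \bar{G}(x/y)\,F_i(dy) = \bar{G}(x/y_0)\,\bar{F}_i(y_0) + \int_0^{x/y_0} \bar{F}_i(x/y)\,G(dy).$$

Substituting (5.42) into the above leads to

$$(1 - \varepsilon)\,\overline{F_2 \otimes G}(x) \lesssim \overline{F_1 \otimes G}(x) \lesssim (1 + \varepsilon)\,\overline{F_2 \otimes G}(x).$$

Hence, relation (5.41) follows from the arbitrariness of $0 < \varepsilon < 1$.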


Then, we proceed with the proof of Theorem 12.

Proof. Clearly, it holds that

$$\Pr\left[\sum_{i=1}^{n} a_i Z_i > x\right] = \Pr\Big[Y_1\big(a_1 + Y_2\big(a_2 + \ldots Y_{n-1}(a_{n-1} + a_n Y_n)\big)\big) > x\Big].$$

Since $F_Y \in \mathcal{L}$ and $a_n > 0$, we have that

$$\Pr[a_{n-1} + a_n Y_n > x] \sim \Pr[a_n Y_n > x].$$

Hence, applying Lemma 11 we obtain that

$$\Pr[Y_{n-1}(a_{n-1} + a_n Y_n) > x] \sim \Pr[a_n Y_{n-1} Y_n > x].$$

Repeatedly applying Lemma 11, we finally obtain that

$$\Pr\Big[Y_1\big(a_1 + Y_2\big(a_2 + \ldots Y_{n-1}(a_{n-1} + a_n Y_n)\big)\big) > x\Big] \sim \Pr[a_n Y_1 Y_2 \cdots Y_{n-1} Y_n > x].$$

For the remainder of the proof it suffices to verify that the probabilities

Pr [aiZi > x], i = 1, 2, · · · , n − 1, on the right-hand side of (5.20) can be

neglected when compared with the probability Pr [anZn > x]. Since the

class R−∞ is closed under product convolution, we have that the d.f. of

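the product $\prod_{j=1}^{i} Y_j$ belongs to the class $\mathcal{R}_{-\infty}$ for each $i = 1, 2, \ldots$. Hence, we verify that for each $i = 1, 2, \ldots, n-1$, and some $0 < v < 1$,

$$\limsup_{x \to +\infty} \frac{\Pr\left[a_i \prod_{j=1}^{i} Y_j > x\right]}{\Pr\left[a_n \prod_{j=1}^{n} Y_j > x\right]} \le \limsup_{x \to +\infty} \frac{\Pr\left[a_i \prod_{j=1}^{i} Y_j > x\right]}{\Pr\left[a_i \prod_{j=1}^{i} Y_j > vx,\ \frac{a_n}{a_i} \prod_{j=i+1}^{n} Y_j > 1/v\right]}$$

$$= \frac{1}{\Pr\left[\frac{a_n}{a_i} \prod_{j=i+1}^{n} Y_j > 1/v\right]}\ \limsup_{x \to +\infty} \frac{\Pr\left[a_i \prod_{j=1}^{i} Y_j > x\right]}{\Pr\left[a_i \prod_{j=1}^{i} Y_j > vx\right]} = 0.$$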

This proves that (5.20) holds.


Theorem 13

To prove the theorem, we first state three lemmas.

Lemma 12.

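Let $X$ and $Y$ be two independent r.v.’s, where $X$ is supported on $(-\infty, +\infty)$ with a d.f. $F$, and $Y$ is strictly positive with a d.f. $G$. Let $V = XY$ and denote by $H$ the d.f. of $V$. If $F \in \mathcal{D} \cap \mathcal{L}$ and $G \in \mathcal{R}_{-\infty}$, then $H \in \mathcal{D} \cap \mathcal{L} \subset \mathcal{S}$ and

$$\bar{H}(x) \asymp \bar{F}(x).$$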

Proof. This lemma can easily be proved by Lemma 3.8 and Lemma 3.10

of Tang & Tsitsiashvili (2003).

Lemma 13.

If $F \in \mathcal{D}$ and $G \in \mathcal{R}_{-\infty}$, then there exists some $\varepsilon > 0$ such that

$$\bar{G}(x^{1-\varepsilon}) = o\big(\bar{F}(x)\big).$$

Proof. This lemma can be proved by Lemma 3.7 of Tang & Tsitsiashvili

(2003).

Lemma 14.

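Let $F = F_1 * F_2$, where $F_1$ and $F_2$ are two d.f.’s supported on $(-\infty, +\infty)$. If $F_1 \in \mathcal{S}$, $F_2 \in \mathcal{L}$, and $\bar{F}_2(x) = O\big(\bar{F}_1(x)\big)$, then $F \in \mathcal{S}$ and

$$\bar{F}(x) \sim \bar{F}_1(x) + \bar{F}_2(x).$$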

Proof. This result can be obtained by fixing γ = 0 in Lemma 3.2 of Tang

& Tsitsiashvili (2003).

We are now ready to prove Theorem 13.

Proof. First we prove (5.21), which says that

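\[
\begin{aligned}
&\Pr\left[(a_1 + X_1)Y_1 + \ldots + (a_{n-1} + X_{n-1})Y_{n-1} \cdots Y_1 + (a_n + X_n)Y_n Y_{n-1} \cdots Y_1 > x\right] \\
&\quad\sim \Pr\left[(a_1 + X_1)Y_1 > x\right] + \ldots + \Pr\left[(a_{n-1} + X_{n-1})Y_{n-1} \cdots Y_1 > x\right] \\
&\qquad + \Pr\left[(a_n + X_n)Y_n Y_{n-1} \cdots Y_1 > x\right].
\end{aligned}
\]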


Applying Lemma 12, we have that the product (an + Xn)Yn is subexponentially distributed and

\[
\Pr\left[(a_n + X_n)Y_n > x\right] \asymp \overline{F}(x). \tag{5.43}
\]

Applying Lemma 14, we have that

Pr [(an−1 +Xn−1) + (an +Xn)Yn > x]

∼ Pr [(an−1 +Xn−1) > x] + Pr [(an +Xn)Yn > x] .

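Since, by Lemma 13, there exists some ε > 0 such that $\overline{G}\left(x^{1-\varepsilon}\right) = o\left(\overline{F}(x)\right)$, we have that

\[
\begin{aligned}
&\Pr\left[(a_{n-1} + X_{n-1})Y_{n-1} + (a_n + X_n)Y_n Y_{n-1} > x\right] \\
&\quad= \left(\int_{0}^{x^{1-\varepsilon}} + \int_{x^{1-\varepsilon}}^{+\infty}\right) \Pr\left[(a_{n-1} + X_{n-1})y + (a_n + X_n)Y_n y > x\right] dG(y) \\
&\quad= \int_{0}^{x^{1-\varepsilon}} \Pr\left[(a_{n-1} + X_{n-1}) + (a_n + X_n)Y_n > \frac{x}{y}\right] dG(y) + o\left(\overline{F}(x)\right) \\
&\quad\sim \int_{0}^{x^{1-\varepsilon}} \left(\Pr\left[(a_{n-1} + X_{n-1}) > \frac{x}{y}\right] + \Pr\left[(a_n + X_n)Y_n > \frac{x}{y}\right]\right) dG(y) + o\left(\overline{F}(x)\right) \\
&\quad= \left(\int_{0}^{+\infty} - \int_{x^{1-\varepsilon}}^{+\infty}\right) \left(\Pr\left[(a_{n-1} + X_{n-1}) > \frac{x}{y}\right] + \Pr\left[(a_n + X_n)Y_n > \frac{x}{y}\right]\right) dG(y) + o\left(\overline{F}(x)\right) \\
&\quad= \Pr\left[(a_{n-1} + X_{n-1})Y_{n-1} > x\right] + \Pr\left[(a_n + X_n)Y_n Y_{n-1} > x\right] + o\left(\overline{F}(x)\right) \\
&\quad\sim \Pr\left[(a_{n-1} + X_{n-1})Y_{n-1} > x\right] + \Pr\left[(a_n + X_n)Y_n Y_{n-1} > x\right].
\end{aligned}
\]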

Furthermore, by application of Lemmas 12 and 14, it follows that (an−1 + Xn−1)Yn−1 + (an + Xn)YnYn−1 is subexponentially distributed and that

\[
\Pr\left[(a_{n-1} + X_{n-1})Y_{n-1} + (a_n + X_n)Y_n Y_{n-1} > x\right] \asymp \overline{F}(x).
\]

Simply repeating the procedure above and observing that

(an−2 +Xn−2)Yn−2 + (an−1 +Xn−1)Yn−1Yn−2 + (an +Xn)YnYn−1Yn−2

= [(an−2 +Xn−2) + (an−1 +Xn−1)Yn−1 + (an +Xn)YnYn−1]Yn−2,


we obtain that

Pr [(an−2 +Xn−2)Yn−2 + (an−1 +Xn−1)Yn−1Yn−2+

+(an +Xn)YnYn−1Yn−2 > x]

∼ Pr [(an−2 +Xn−2)Yn−2 > x] + Pr [(an−1 +Xn−1)Yn−1Yn−2 > x]

+Pr [(an +Xn)YnYn−1Yn−2 > x] .

Hence, repeating the procedure above n − 1 times yields the announced result (5.21). The proof of (5.22) can be given completely analogously to the above, since the distribution of aiXi satisfies

\[
\Pr\left[a_i X_i > x\right] = \overline{F}(x/a_i) \asymp \overline{F}(x)
\]

and is subexponential.
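A crude Monte Carlo experiment can illustrate the type of statement in (5.21), namely that the tail of the sum behaves like the sum of the tails of its terms. The sketch below is not part of the proof. It assumes, for illustration only, that the Xi are i.i.d. Pareto with index alpha, that the Yi are i.i.d. lognormal with log-standard deviation sigma, and that all ai equal a common constant a; these choices, the threshold and the function name are hypothetical.

```python
import random


def simulate_tail_ratio(x, n=3, a=1.0, alpha=1.5, sigma=0.2,
                        n_sims=100_000, seed=12345):
    """Estimate Pr[sum_i T_i > x] / sum_i Pr[T_i > x] by simulation,

    where T_i = (a + X_i) * Y_1 * ... * Y_i, X_i ~ Pareto(alpha) on [1, inf)
    and Y_i ~ lognormal(0, sigma^2), all independent (illustrative choices).
    """
    rng = random.Random(seed)
    hits_sum = 0    # number of simulations with T_1 + ... + T_n > x
    hits_terms = 0  # number of (simulation, i) pairs with T_i > x
    for _ in range(n_sims):
        z = 1.0
        total = 0.0
        for _i in range(n):
            z *= rng.lognormvariate(0.0, sigma)       # Z_i = Y_1 * ... * Y_i
            term = (a + rng.paretovariate(alpha)) * z
            total += term
            if term > x:
                hits_terms += 1
        if total > x:
            hits_sum += 1
    return hits_sum / hits_terms if hits_terms else float("nan")


if __name__ == "__main__":
    print(simulate_tail_ratio(100.0))
```

At a finite threshold the estimated ratio typically lies somewhat above one; the asymptotic equivalence only states that it tends to one as x grows, and plain simulation becomes unreliable far in the tail because exceedances become rare.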

Corollary 3

Proof. Using (5.43), one can easily verify that

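\[
\liminf_{x\to+\infty} \frac{\Pr\left[\sum_{i=1}^{n} (a_i + X_i)Z_i > x\right]}{\Pr\left[\sum_{i=1}^{n-1} (a_i + X_i)Z_i > x\right]} > 1,
\]

and that

\[
\liminf_{x\to+\infty} \frac{\Pr\left[\sum_{i=1}^{n} (a_i X_i)Z_i > x\right]}{\Pr\left[\sum_{i=1}^{n-1} (a_i X_i)Z_i > x\right]} > 1.
\]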

Hence, we can prove (5.23) and (5.24) by substituting (5.21) and (5.22)

into the left-hand-side of (5.23) and (5.24), respectively.

Corollary 4

Proof. Given the asymptotic results (5.21) and (5.22), the proof of this corollary follows immediately from a well-known result, which Cline (1986) attributes to Proposition 3 of Breiman (1965).

Theorem 14

In case the conditions 1 and 2 of Theorem 13 are replaced by the conditions 1’, 2’ and 3’ of Theorem 14, the proof of (5.21) can be established completely analogously to the proof of Theorem 13 using the following three lemmas, which are the analogs of Lemma 12, Lemma 13 and Lemma 14, respectively:

Lemma 15.

Let X and Y be two independent lognormally distributed r.v.’s with $\sigma_Y < \sigma_X$. Furthermore, let V = XY and denote by H the d.f. of V. Then V follows a lognormal law and $\overline{F}(x) = o\left(\overline{H}(x)\right)$.

Lemma 16.

If both F and G are lognormal laws with $\sigma_G < \sigma_F$, then there exists some ε > 0 such that

\[
\overline{G}\left(x^{1-\varepsilon}\right) = o\left(\overline{F}(x)\right).
\]

Lemma 17.

Let F = F1 ∗ F2, where F1 and F2 are two lognormal laws. Then F ∈ S and

\[
\overline{F}(x) \sim \overline{F_1}(x) + \overline{F_2}(x).
\]

Proof. This is a special case of Corollary 1 of Cline (1986) and moreover

is a special case of Lemma 14.

We are now ready to prove Theorem 14.

Proof. The proof of (5.22) can be given analogously, since the distribution

of aiXi is again lognormal with $\mathrm{Var}[\log(a_i X_i)] = \mathrm{Var}[\log(X_i)] = \sigma_X^2$.

Finally, we prove (5.23) and (5.24). By application of Lemma 11 and the

same reasoning as in the proof of Theorem 12, we have for each n = 1, 2, . . .,

and some 0 < v < 1 that


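\[
\begin{aligned}
&\liminf_{x\to+\infty} \frac{\sum_{i=1}^{n} \Pr\left[(a_i + X)\prod_{j=1}^{i} Y_j > x\right]}{\sum_{i=1}^{n-1} \Pr\left[(a_i + X)\prod_{j=1}^{i} Y_j > x\right]} \\
&\quad\geq \liminf_{x\to+\infty} \frac{\Pr\left[(a_n + X)\prod_{j=1}^{n} Y_j > x\right]}{\sum_{i=1}^{n-1} \Pr\left[(a_i + X)\prod_{j=1}^{i} Y_j > x\right]} \\
&\quad= \frac{1}{\sum_{i=1}^{n-1} \limsup_{x\to+\infty} \dfrac{\Pr\left[(a_i + X)\prod_{j=1}^{i} Y_j > x\right]}{\Pr\left[(a_n + X)\prod_{j=1}^{n} Y_j > x\right]}} \\
&\quad\geq \frac{1}{\sum_{i=1}^{n-1} \limsup_{x\to+\infty} \dfrac{\Pr\left[(a_i + X)\prod_{j=1}^{i} Y_j > x\right]}{\Pr\left[(a_n + X)\prod_{j=1}^{i} Y_j > vx\right] \Pr\left[\prod_{j=i+1}^{n} Y_j > 1/v\right]}} \\
&\quad= \frac{1}{\sum_{i=1}^{n-1} \limsup_{x\to+\infty} \dfrac{\Pr\left[X\prod_{j=1}^{i} Y_j > x\right]}{\Pr\left[X\prod_{j=1}^{i} Y_j > vx\right] \Pr\left[\prod_{j=i+1}^{n} Y_j > 1/v\right]}} \\
&\quad= +\infty > 1.
\end{aligned}
\]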

and

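\[
\begin{aligned}
&\liminf_{x\to+\infty} \frac{\sum_{i=1}^{n} \Pr\left[(a_i X)\prod_{j=1}^{i} Y_j > x\right]}{\sum_{i=1}^{n-1} \Pr\left[(a_i X)\prod_{j=1}^{i} Y_j > x\right]} \\
&\quad\geq \liminf_{x\to+\infty} \frac{\Pr\left[(a_n X)\prod_{j=1}^{n} Y_j > x\right]}{\sum_{i=1}^{n-1} \Pr\left[(a_i X)\prod_{j=1}^{i} Y_j > x\right]} \\
&\quad= \frac{1}{\sum_{i=1}^{n-1} \limsup_{x\to+\infty} \dfrac{\Pr\left[(a_i X)\prod_{j=1}^{i} Y_j > x\right]}{\Pr\left[(a_n X)\prod_{j=1}^{n} Y_j > x\right]}} \\
&\quad= +\infty > 1.
\end{aligned}
\]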

Hence, we can prove (5.23) and (5.24) by substituting (5.21) and (5.22)

into the left-hand-side of (5.23) and (5.24), respectively.

Summary (Samenvatting in het Nederlands)

Introduction

In this thesis we take a closer look at the reserving problem in the insurance industry. Broadly speaking, a reserving study comes down to determining the present value of future claim payments. The expertise and accuracy with which this uncertain amount is established are therefore crucial for a company and its policyholders. Moreover, the intrinsic uncertainties involved must not be an excuse for forgoing a scientifically well-founded analysis. Interests and priorities may differ among all those who deal with reserve estimates. For management, the estimate must provide reliable information in order to maximize the viability and profitability of the company. For the supervisory authority, which is concerned with solvency, the reserves must be set conservatively in order to reduce the probability of bankruptcy. For the tax authorities, the reserves must reflect the actual payments as closely as possible. Finally, the policyholder wants the reserves to be sufficient to pay insured claims, but does not want to be penalized with an excessive premium for that guarantee.

The main goal of the reserving process can be described simply as follows. From a certain date, agreed in advance, an insurer is responsible for all claims incurred. The costs associated with a claim are divided into two categories: those that have already been paid and those that have not yet been (fully) paid. The main goal of the reserving process is to estimate the costs that the company has not yet paid. The distribution of possible aggregate unpaid claims can be represented as a probability density function. Much has already been written about the statistical distributions that are suitable for the study of risks and insurance. In practice, full information about the underlying distributions is not available. One therefore often has to rely on limited information, such as estimates of the first moments of the distribution. Not only the basic risk measures but also more sophisticated measures (such as skewness measures, extreme percentiles of the distribution, ...), which require deeper insight into the underlying distribution, are very important. Computing the first moments can be seen as a first attempt to learn more about the properties of a distribution. Moreover, the variance is not the most suitable risk measure for determining the solvency requirements of an insurance portfolio. As a two-sided risk measure it takes both positive and negative deviations into account, which leads to an underestimation of the reserve in the case of a skewed distribution. In addition, this measure does not emphasize the tail properties of the distribution. In this case it seems more appropriate to use the VaR (the p-th quantile) or even the TVaR (which in essence amounts to an average of all quantiles above a predefined level p). Risk measures based on stop-loss premiums (e.g. the expected shortfall) can also be used in this context. The ultimate goal is to obtain the distribution, from which all kinds of measures can then be computed. These trends are also reflected in the current banking and insurance regulations (Basel 2 and Solvency 2), which emphasize the risk-based approach in ALM. This calls for a new methodological approach that makes it possible to obtain more sophisticated information about the underlying risks.
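The two tail risk measures mentioned above can be illustrated on an empirical distribution. The sketch below is a minimal illustration and not the thesis's implementation: it takes the VaR at level p to be the smallest observation x whose empirical d.f. is at least p, and the TVaR to be the average of the empirical quantiles over the levels between p and 1. The function names are hypothetical.

```python
import math


def value_at_risk(losses, p):
    """Empirical p-quantile: the smallest x with empirical d.f. >= p."""
    xs = sorted(losses)
    k = max(1, math.ceil(p * len(xs) - 1e-12))  # small guard against rounding
    return xs[k - 1]


def tail_value_at_risk(losses, p):
    """Average of the empirical quantiles VaR_q over q in (p, 1), for p < 1."""
    xs = sorted(losses)
    n = len(xs)
    k = max(1, math.ceil(p * n - 1e-12))
    # The empirical quantile function equals xs[j - 1] on ((j - 1)/n, j/n].
    integral = (k / n - p) * xs[k - 1] + sum(xs[k:]) / n
    return integral / (1.0 - p)


if __name__ == "__main__":
    sample = list(range(1, 101))  # toy losses 1, 2, ..., 100
    print(value_at_risk(sample, 0.95), tail_value_at_risk(sample, 0.95))
```

By construction the TVaR is never smaller than the VaR at the same level, which is why it puts more emphasis on the tail of the distribution.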

In the current actuarial literature little can be found about appropriate methods for computing the distribution of reserve outcomes. Several methods exist for efficiently approximating the distribution of sums of independent risks (such as Panjer's recursion, convolution, ...). If, moreover, the number of risks in a portfolio is large enough, the Central Limit Theorem can be used to approximate the aggregate claims by the normal distribution. Even when the independence assumption is not satisfied (e.g. when it is rejected on the basis of statistical tests), this approximation is widely used in practice because of its mathematical simplicity. In a number of practical applications, however, the independence assumption is violated, which can lead to a significant underestimation of the risk of the portfolio. This is the case, among others, when the actuarial technical risk is combined with the financial investment risk.

In contrast to banking, the concept of stochastic interest rates has surfaced only recently in insurance. Traditionally, actuaries rely on deterministic interest rates. Such a simplification makes it possible to handle risk measures (such as the mean, the standard deviation, upper quantiles, ...) of financial contracts efficiently. Because of the high uncertainty about future investment results, however, actuaries are forced to make conservative assumptions when computing insurance premiums and mathematical reserves. As a consequence, the diversification effects of returns over different investment periods cannot be taken into account. By this we mean that poor investment results in some periods are usually compensated by very good results in other periods. These additional costs are passed on either to the insureds, who have to pay higher premiums, or to the shareholders, who have to provide more economic capital. The importance of introducing models with stochastic interest rates is therefore well understood in the actuarial world. The latest banking and insurance regulations (Basel 2, Solvency 2) also underline this importance. These regulations emphasize the risk-based approach for determining economic capital. Projecting cash flows with stochastic returns is also important in insurance pricing applications such as the ‘embedded value’ (the present value of cash flows generated by the policies in force) and the ‘appraisal value’ (the present value of cash flows generated by the policies in force and by policies that will be written in the future).

A mathematical description of the problem can be summarized as follows. Let Xi (i = 1, . . . , n) be a random amount that has to be paid at time ti and let Vi be the discount factor over the period [0, ti]. We then consider the present value of future payments, which can be written as a scalar product of the form

\[
S = \sum_{i=1}^{n} X_i V_i. \tag{N.1}
\]

The random vector $\vec{X} = (X_1, X_2, \ldots, X_n)$ may, for example, represent the insurance or credit risk, while the vector $\vec{V} = (V_1, V_2, \ldots, V_n)$ represents the financial/investment risk. In general we assume that these vectors are mutually independent. In practical applications this independence assumption may be violated, e.g. by an inflation factor that strongly influences both payments and investment results. This problem can be tackled, however, by considering sums of the form

\[
S = \sum_{i=1}^{n} \tilde{X}_i \tilde{V}_i,
\]

where $\tilde{X}_i = X_i/Z_i$ and $\tilde{V}_i = V_i Z_i$ are the adjusted values expressed in real terms ($Z_i$ is an inflation factor over the period $[0, t_i]$). The independence assumption between the insurance risk and the financial risk is therefore realistic in many cases, and it can be used efficiently to obtain various quantities describing the risk within financial institutions (e.g. discounted claims or the ‘embedded/appraisal’ value of a company).
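A present value of the form (N.1) can be simulated directly once a model for the payments and for the discount factors is chosen. The sketch below is a minimal illustration under stated assumptions that are not taken from the text: yearly payment dates, i.i.d. normal yearly log-returns with mean mu and standard deviation sigma (a lognormal discounting process), and i.i.d. lognormal payments Xi that are independent of the returns. All names and parameter values are hypothetical.

```python
import math
import random


def simulate_present_values(n=10, mu=0.05, sigma=0.1, claim_sigma=0.3,
                            n_sims=20_000, seed=2005):
    """Simulate S = sum_i X_i * V_i with V_i = exp(-(Y_1 + ... + Y_i))."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_sims):
        cum_log_return = 0.0  # Y_1 + ... + Y_i, the log-return over [0, t_i]
        s = 0.0
        for _i in range(n):
            cum_log_return += rng.gauss(mu, sigma)
            v = math.exp(-cum_log_return)             # discount factor V_i
            x = rng.lognormvariate(0.0, claim_sigma)  # payment X_i
            s += x * v
        samples.append(s)
    return samples


def expected_present_value(n=10, mu=0.05, sigma=0.1, claim_sigma=0.3):
    """E[S] = sum_i E[X_i] * E[V_i], using independence of payments and returns."""
    mean_x = math.exp(0.5 * claim_sigma ** 2)
    return sum(mean_x * math.exp(-i * mu + 0.5 * i * sigma ** 2)
               for i in range(1, n + 1))


if __name__ == "__main__":
    draws = simulate_present_values()
    print(sum(draws) / len(draws), expected_present_value())
```

The simulated mean can be compared with the closed-form expectation as a sanity check; quantiles far in the tail require many more simulations to estimate reliably.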

These distribution functions are typically complex and far from straightforward, for two important reasons. First, the distribution of a sum of random variables whose marginal distributions belong to the same distribution class does not, in general, belong to that class. Second, the stochastic dependence between the terms of the sum prevents the use of convolution and complicates matters considerably. Consequently, approximation methods for computing functions of sums of dependent variables become necessary. In many cases one can of course use Monte Carlo simulation to obtain empirical distribution functions. This is typically a time-consuming approach, however, particularly when tail probabilities are to be approximated, which requires a large number of simulations. One therefore has to look for new alternative methods. In this thesis we study and evaluate the most frequently used approximation techniques for insurance applications.

The central idea in this work is the concept of comonotonicity. We propose to solve the problem outlined above by computing lower and upper bounds for the sum of dependent variables, making use of the available information. These bounds are based on a general technique for computing lower and upper bounds for stop-loss premiums of a sum of dependent variables, as set out in Kaas et al. (2000).

The first approximation for the distribution function of the discounted reserve is derived by approximating the dependence structure between the random variables involved by a comonotonic dependence structure. In this way the multidimensional problem is reduced to a two-dimensional one, which can be solved by conditioning and by using simple numerical techniques. This approximation is plausible in actuarial applications since it leads to prudent and conservative values of the reserves and solvency margins. If the underlying dependence structure is strong enough, this upper bound in convex order gives satisfactory results.
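The computational appeal of a comonotonic sum is that its quantiles are simply the sums of the marginal quantiles. The sketch below illustrates this for a sum of lognormal terms alpha_i * exp(Z_i) with Z_i normal with mean mu_i and standard deviation sigma_i and with nonnegative weights alpha_i. This lognormal setting and the function name are assumptions made for illustration; the sketch is not the thesis's own implementation.

```python
import math
from statistics import NormalDist


def comonotonic_upper_quantile(p, alphas, mus, sigmas):
    """p-quantile of S^c = sum_i alpha_i * exp(mu_i + sigma_i * Phi^{-1}(U)).

    All terms are nondecreasing functions of the same uniform U (weights are
    assumed nonnegative), so the quantile of the sum is the sum of the
    marginal quantiles.
    """
    z = NormalDist().inv_cdf(p)
    return sum(a * math.exp(m + s * z)
               for a, m, s in zip(alphas, mus, sigmas))


if __name__ == "__main__":
    alphas, mus, sigmas = [1.0, 2.0], [0.0, 0.1], [0.2, 0.3]
    for p in (0.5, 0.95, 0.995):
        print(p, comonotonic_upper_quantile(p, alphas, mus, sigmas))
```

No simulation or convolution is needed: each quantile costs one evaluation of the inverse standard normal d.f. and one sum.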

The second approximation, which is derived by considering conditional expectations, takes part of the dependence structure into account. This lower bound in convex order is very useful for evaluating the quality of the upper bound as an approximation, and it can also be used as an approximation of the underlying distribution. Although this choice is not (actuarially) prudent, the relative error of this approximation is significantly smaller than that of the upper bound. The lower bound will therefore be preferred in applications where high accuracy of the approximations is required (such as the pricing of exotic options or strategic portfolio selection problems).
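For sums of lognormal terms the conditional-expectation bound also has closed-form quantiles. The sketch below assumes a sum of terms alpha_i * exp(Z_i) with nonnegative weights, Z_i normal with mean mu_i and standard deviation sigma_i, and a normal conditioning variable whose correlation r_i with each Z_i is nonnegative, so that all conditional expectations are nondecreasing functions of one uniform variable. The setting, the symbols r_i and the function name are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def lower_bound_quantile(p, alphas, mus, sigmas, rs):
    """p-quantile of S^l = sum_i alpha_i * E[exp(Z_i) | Lambda].

    For jointly normal (Z_i, Lambda) with correlation r_i >= 0 each term equals
    alpha_i * exp(mu_i + r_i * sigma_i * Phi^{-1}(U) + (1 - r_i^2) * sigma_i^2 / 2).
    """
    z = NormalDist().inv_cdf(p)
    return sum(a * math.exp(m + r * s * z + 0.5 * (1.0 - r * r) * s * s)
               for a, m, s, r in zip(alphas, mus, sigmas, rs))


if __name__ == "__main__":
    alphas, mus, sigmas = [1.0, 2.0], [0.0, 0.1], [0.2, 0.3]
    print(lower_bound_quantile(0.95, alphas, mus, sigmas, [0.9, 0.8]))
```

Two limiting cases help to read the formula: with all r_i equal to zero the conditioning variable carries no information and the bound collapses to the mean of the sum, while with all r_i equal to one it coincides with the sum of the marginal quantiles.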

This thesis is organized as follows.

The first chapter recalls the basics of actuarial risk theory. We define some frequently used dependence measures and the most important orderings of risks for actuarial applications. We further introduce several well-known risk measures and the relations that hold between them. The first chapter also provides theoretical background on the concept of comonotonicity and reviews the most important properties of comonotonic risks.

In Chapter 2 we recall how the convex bounds can be derived and illustrate the theoretical results by means of an application to discounted reserves. The advantage of working with a sum of comonotonic variables lies in the easy computation of the distribution involved. In particular, the technique is very useful for obtaining reliable estimates of upper quantiles and stop-loss premiums.

In practical applications the upper bound is only useful if the dependence between successive terms of the sum is strong enough. But even then the approximations for stop-loss premiums are not satisfactory. In this chapter we propose a number of techniques for determining more efficient upper bounds for stop-loss premiums. To this end we use, on the one hand, the conditioning method as in Curran (1994) and in Rogers & Shi (1995) and, on the other hand, the traditional lower and upper bounds for stop-loss premiums of sums of dependent random variables. We also show how these results can be applied in the special case of lognormal random variables. Such sums are frequently encountered in practice, in both the actuarial and the financial world.

We derive comonotonic approximations for the scalar product of random vectors of the form (N.1). A general procedure for computing accurate estimates of quantiles and stop-loss premiums is set out. We study the distribution function of the present value of a series of random payments in a stochastic financial environment described by a lognormal discounting process. Such distributions occur frequently in a wide range of insurance and financial applications. We obtain accurate approximations by developing lower and upper bounds in convex order for such present-value functions. We consider several applications for discounted claim processes in the Black & Scholes setting. In particular, we analyze in detail the cases where the random variables Xi represent insurance claims modelled by lognormal, normal (more generally elliptical) and gamma or inverse Gaussian (more generally tempered stable) distributions. By means of a series of numerical illustrations we show that the method provides very accurate and easily obtained approximations for distribution functions of random variables of the form (N.1).

In Chapters 3 and 4 we apply the results obtained to two important reserving problems in insurance and illustrate the approximations both numerically and graphically.

In Chapter 3 we consider an important application in the field of life insurance. We seek conservative estimates for quantiles and stop-loss premiums of an annuity and of a whole portfolio of annuities. Similar techniques can be used to obtain estimates for more general life insurance products. Our technique makes it possible to solve ‘personal finance’ problems very accurately.

The case of a portfolio of annuities has already been studied extensively in the literature, but only in the limiting case, for homogeneous portfolios, when the mortality risk is fully diversified. The applicability of these results in insurance practice can be questioned, however, in particular here, since a typical portfolio does not contain enough policies to speak of full diversification. We therefore propose to approximate the number of active policies in successive years by means of a ‘normal power’ distribution and to model the present value of the future benefits as a scalar product of mutually independent vectors.

Chapter 4 focuses on the claims reserving problem. Correctly estimating the amount that a company has to set aside to meet the obligations (claims) arising in the future is an important task for insurance companies in order to obtain a correct picture of their liabilities. The historical data needed to obtain estimates of future payments are usually presented as incremental payments in triangular form. The aim is to complete this run-off triangle to a square, and possibly to a rectangle if estimates are needed for development years for which no data are included in the triangle. To this end the actuary can use a number of techniques. The intrinsic uncertainty is described by the distribution of possible outcomes, and one always looks for the best estimate of the reserve. Claims reserving deals with the determination of the uncertain present value of an unknown amount of future payments. Since this amount is very important for an insurance company and its policyholders, the intrinsic uncertainties are no excuse for neglecting a scientific analysis. For the reserve estimate truly to represent the actuary's best estimate, both the determination of the expected value of unpaid claims and the appropriate discount rate must reflect the actuary's best estimate (by this we mean that they should not be imposed by others or by legislation). Since the reserve is a provision for future payments of unsettled claims, we believe that the estimated claims reserve should reflect the time value of money. In many situations the discounted reserve is useful, e.g. in dynamic financial analysis, profit determination and pricing, risk capital, loss portfolio transfers, .... Ideally the discounted reserve should also be acceptable for reporting. Current legislation, however, usually does not allow this. Undiscounted reserves in fact contain a certain risk margin that depends on the level of the interest rate. In this chapter we consider the discounted IBNR reserve and impose an implicit margin based on a risk measure of the distribution of the total discounted reserve. We model the claim payments using lognormal linear models, loglinear location-scale models and generalized linear models, and derive accurate comonotonic approximations for the discounted reserve.

The bootstrap technique has proved to be very useful in many statistical applications and can be particularly interesting for assessing the variability of the claim predictions and, moreover, for constructing upper bounds at an appropriate confidence level. Its popularity is due to a combination of computing power and theoretical development. An advantage of the bootstrap approach is that the technique can be applied to any data set without assuming an underlying distribution. Moreover, most software can handle very large numbers of bootstrap iterations.
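The resampling idea can be sketched in a few lines. The example below is a generic nonparametric bootstrap with a percentile-type upper bound for an arbitrary statistic of i.i.d. data; it is not the residual-based bootstrap of a run-off triangle, and the function name, the toy data and the default settings are hypothetical.

```python
import random
import statistics


def bootstrap_upper_bound(data, statistic, level=0.95, n_boot=2000, seed=1):
    """Percentile-type upper bound for `statistic`, by resampling `data`.

    Returns the bound together with the sorted bootstrap replications.
    """
    rng = random.Random(seed)
    n = len(data)
    # Each replication recomputes the statistic on a resample drawn with
    # replacement from the original observations.
    reps = sorted(statistic(rng.choices(data, k=n)) for _ in range(n_boot))
    idx = min(n_boot - 1, int(level * n_boot))
    return reps[idx], reps


if __name__ == "__main__":
    observations = list(range(1, 21))  # toy data
    bound, reps = bootstrap_upper_bound(observations, statistics.mean)
    print(bound, statistics.stdev(reps))
```

The spread of the replications gives an estimate of the variability of the statistic, and no distributional assumption on the data is needed.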

In Chapter 5 we derive other methods for obtaining approximations for S. We also briefly recall and evaluate some existing techniques. In the first section of this chapter we recall two well-known moment-based approximations: the lognormal and the inverse gamma approximation. Practitioners often use a moment-based lognormal approximation for the distribution of S. These approximations are chosen such that their first two moments coincide with the corresponding moments of S.
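Matching the first two moments with a lognormal law has a closed-form solution. The sketch below shows the standard computation for a given mean and variance of S; the function names are hypothetical and the numerical values are purely illustrative.

```python
import math


def lognormal_moment_match(mean, variance):
    """Parameters (mu, sigma) of the lognormal law with the given mean and variance."""
    sigma2 = math.log(1.0 + variance / mean ** 2)
    mu = math.log(mean) - 0.5 * sigma2
    return mu, math.sqrt(sigma2)


def lognormal_moments(mu, sigma):
    """Mean and variance of a lognormal(mu, sigma^2) law."""
    mean = math.exp(mu + 0.5 * sigma ** 2)
    variance = (math.exp(sigma ** 2) - 1.0) * math.exp(2.0 * mu + sigma ** 2)
    return mean, variance


if __name__ == "__main__":
    mu, sigma = lognormal_moment_match(100.0, 400.0)
    print(mu, sigma, lognormal_moments(mu, sigma))
```

The fitted law reproduces the mean and the variance exactly, but nothing guarantees that its tail is close to the tail of S.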

Although the comonotonic approximations in convex order have proved to be good approximations when the underlying variability is small, they perform considerably worse when the variance increases. We therefore look here at approximations for functions of sums of dependent variables based on asymptotic results. Although asymptotic results are valid at infinity, they can also be useful as approximations near infinity. We derive some asymptotic results for the tail probability of a sum of heavy-tailed dependent variables.

Since 1990, applied Bayesian research has grown enormously among statisticians. This explosion has had little to do with a growing interest of statisticians and econometricians in the theoretical foundations of Bayesian analysis, or with a sudden awareness of the advantages of the Bayesian approach over frequentist methods; its basis is mainly pragmatic. The development of powerful computational tools (and the realization that existing statistical tools can be useful for fitting Bayesian models) has attracted a large number of researchers to use the Bayesian approach in practice. The use of such methods allows researchers to estimate complicated statistical models that are rather difficult, if not impossible, to estimate with standard frequentist techniques. In this section we sketch the basic elements of Bayesian computation in fairly general terms. Bayesian inference amounts to fitting a probability model to a data set and summarizing the result by a probability distribution on the model parameters and on unobserved quantities such as predictions for new observations. Simple simulation methods exist for drawing a sample from the posterior and predictive distributions, automatically taking uncertainty in the model parameters into account. An advantage of the Bayesian approach is that, using simulation, we can always compute the posterior predictive distribution, so that we do not have to put much effort into estimating the sampling distribution of test statistics.
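The two-stage simulation of a posterior predictive distribution can be illustrated with a toy conjugate model. The sketch below assumes Poisson counts with a gamma prior on the Poisson mean, so that the posterior is again a gamma law; it is unrelated to the generalized linear models used for the run-off triangle, and the model, the prior parameters and the function names are hypothetical.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw a Poisson(lam) variate by Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def posterior_predictive(counts, prior_shape=1.0, prior_rate=1.0,
                         n_sims=20_000, seed=7):
    """Simulate new observations from the gamma-Poisson posterior predictive law."""
    rng = random.Random(seed)
    shape = prior_shape + sum(counts)  # posterior gamma shape
    rate = prior_rate + len(counts)    # posterior gamma rate
    draws = []
    for _ in range(n_sims):
        # Step 1: draw the parameter from its posterior distribution
        # (random.gammavariate takes a shape and a scale, hence 1 / rate).
        lam = rng.gammavariate(shape, 1.0 / rate)
        # Step 2: draw a new observation given that parameter value.
        draws.append(sample_poisson(lam, rng))
    return draws


if __name__ == "__main__":
    new_obs = posterior_predictive([3, 5, 4, 6, 2])
    print(sum(new_obs) / len(new_obs))
```

Because a fresh parameter value is drawn for every simulated observation, the spread of the simulated values automatically includes the uncertainty about the parameter.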

Finally, we compare these approximations with the comonotonic approximations of the previous chapter in the context of the claims reserving problem. When the underlying variance of the statistical and financial parts of the discounted IBNR reserve becomes larger, the comonotonic approximations perform poorly. We illustrate this by means of a simple example and propose the asymptotic results from the previous chapter as an alternative. We also compare all these results with the lognormal moment-based approximations. Lastly, we consider the distribution of the discounted reserve when the data in the run-off triangle are modelled by means of a generalized linear model, and compare the results of the comonotonic approximations with the Bayesian approximations.

Bibliography

[1] Ahcan A., Darkiewicz G., Goovaerts M.J. & Hoedemakers T. (2005). “Computation of convex bounds for present value functions of random payments”, Journal of Computational and Applied Mathematics, to appear.

[2] Albrecher H., Dhaene J., Goovaerts M.J. & Schoutens W. (2005). “Static hedging of Asian options under Lévy models: The comonotonicity approach”, The Journal of Derivatives, 12(3), 63–72.

[3] Antonio K., Beirlant J. & Hoedemakers T. (2005). Discussion of “A Bayesian generalized linear model for the Bornhuetter-Ferguson method of claims reserving” by Richard Verrall, North American Actuarial Journal, to be published.

[4] Arnold L. (1974). Stochastic Differential Equations: Theory and Applications, Wiley, New York.

[5] Artzner P. (1999). “Application of coherent risk measures to capital requirements in insurance”, North American Actuarial Journal, 3(2), 11–25.

[6] Artzner P., Delbaen F., Eber J.M. & Heath D. (1999). “Coherent measures of risk”, Mathematical Finance, 9, 203–228.

[7] Barnett G. & Zehnwirth B. (2000). “Best estimates for reserves”, Proceedings of the Casualty Actuarial Society, 87(2), 245–321.

[8] Beekman J.A. & Fuelling C.P. (1990). “Interest and mortality randomness in some annuities”, Insurance: Mathematics & Economics, 9(2-3), 185–196.


[9] Beekman J.A. & Fuelling C.P. (1991). “Extra randomness in certain annuity models”, Insurance: Mathematics & Economics, 10(4), 275–287.

[10] Beekman J.A. & Fuelling C.P. (1993). “One approach to dual randomness in life insurance”, Scandinavian Actuarial Journal, 76(2), 173–82.

[11] Bellhouse D.R. & Panjer H.H. (1981). “Stochastic modeling of interest rates with applications to life contingencies - Part II”, Journal of Risk and Insurance, 48(4), 628–637.

[12] Beirlant J., Goegebeur Y., Segers J. & Teugels J. (2004). Statistics of Extremes: Theory and Applications, Wiley, New York.

[13] Bingham N.H., Goldie C.M. & Teugels J.L. (1987). Regular Variation, Cambridge University Press, Cambridge.

[14] Black F. & Scholes M. (1973). “The pricing of options and corporate liabilities”, Journal of Political Economy, 81, 637–659.

[15] Blum K.A. & Otto D.J. (1998). “Best estimate loss reserving: an actuarial perspective”, Casualty Actuarial Society Forum Fall 1998, 55–101.

[16] Boyle P.P. (1976). “Rates of return as random variables”, Journal of Risk and Insurance, 43(4), 693–711.

[17] Bowers N.L., Gerber H.U., Hickman J.C., Jones D.A. & Nesbitt C.J. (1986). Actuarial Mathematics, Schaumburg, Ill.: Society of Actuaries.

[18] Breiman L. (1965). “On some limit theorems similar to the arc-sin law”, Theory of Probability and Its Applications, 10(2), 323–331.

[19] Bühlmann H., Gagliardi B., Gerber H.U. & Straub E. (1977). “Some inequalities for stop-loss premiums”, ASTIN Bulletin, 9, 75–83.

[20] Cesari R. & Cremonini D. (2003). “Benchmarking, portfolio insurance and technical analysis: a Monte Carlo comparison of dynamic strategies of asset allocation”, Journal of Economic Dynamics and Control, 27, 987–1011.


[21] Christofides S. (1990). “Regression models based on log-incremental payments”, Claims Reserving Manual, 2, Institute of Actuaries, London.

[22] Cline D.B.H. (1986). “Convolution tails, product tails and domains of attraction”, Probability Theory and Related Fields, 72, 529–557.

[23] Cohen A.C. & Whitten B.J. (1988). Parameter estimation in reliability and life span models, Marcel Dekker, Inc., New York.

[24] Cordeiro G.M. & McCullagh P. (1991). “Bias correction in generalized linear models”, Journal of the Royal Statistical Society B, 53(3), 629–643.

[25] Curran M. (1994). “Valuing Asian and portfolio options by conditioning on the geometric mean price”, Management Science, 40(12), 1705–1711.

[26] Darkiewicz G., Dhaene J. & Goovaerts M.J. (2005a). “Risk measures and dependencies of risks”, Brazilian Journal of Probability and Statistics, to appear.

[27] Darkiewicz G. (2005b). Value-at-Risk in Insurance and Finance: the Comonotonicity Approach, PhD Thesis, K.U. Leuven, Faculty of Economics and Applied Economics, Leuven.

[28] Davison A.C. & Hinkley D.V. (1997). Bootstrap Methods and their Application, Cambridge Series in Statistical and Probabilistic Mathematics, Cambridge University Press.

[29] De Alba E. (2002). “Bayesian estimation of outstanding claims reserves”, North American Actuarial Journal, 6(4), 1–20.

[30] Debicka J. (2003). “Moments of the cash value of future payment streams arising from life insurance contracts”, Insurance: Mathematics & Economics, 33(3), 533–550.

[31] Decamps M., De Schepper A. & Goovaerts M.J. (2004). “Pricing exotic options under local volatility”, Proceedings of the second International Workshop on Applied Probability (IWAP), Athens.

[32] Deelstra G., Liinev J. & Vanmaele M. (2004). “Pricing of arithmetic

basket options by conditioning”, Insurance: Mathematics &

Economics, 34(1), 55–77.

[33] Denuit M. & Dhaene J. (2003). “Simple characterizations of comonotonicity

and countermonotonicity by extremal correlations”, Belgian

Actuarial Bulletin, 3, 22–27.

[34] Denuit M. & Dhaene J. (2004). “Dependent risks”, Encyclopedia of

Actuarial Science, Wiley, Vol. I, 464–471.

[35] Devroye L. (1986). Non-Uniform random variate generation,

Springer-Verlag, New York.

[36] De Vylder F. & Goovaerts M.J. (1979). Proceedings of the first meeting

of the contact group “Actuarial Sciences”, K.U.Leuven, nr 7904B,

wettelijk Depot: D/1979/23761/5.

[37] De Vylder F. & Goovaerts M.J. (1982). “Upper and lower bounds on

stop-loss premiums in case of known expectation and variance of the

risk variable”, Mitt. Verein. Schweiz. Versicherungsmath., 149–164.

[38] Dhaene J. (1989). “Stochastic interest rates and autoregressive inte-

grated moving average processes”, ASTIN Bulletin, 19(2), 131–138.

[39] Dhaene J. (1990). “Distributions in life insurance”, ASTIN Bulletin,

20(1), 81–92.

[40] Dhaene J., Wang S., Young V. & Goovaerts M.J. (2000).

“Comonotonicity and maximal stop-loss premiums”, Mitteilungen

der Schweiz. Aktuarvereinigung, 2000(2), 99–113.

[41] Dhaene J., Denuit M., Goovaerts M.J., Kaas R. & Vyncke D.

(2002a). “The concept of comonotonicity in actuarial science and

finance: Theory”, Insurance: Mathematics & Economics, 31(1), 3–

33.

[42] Dhaene J., Denuit M., Goovaerts M.J., Kaas R. & Vyncke D.

(2002b). “The concept of comonotonicity in actuarial science and fi-

nance: Applications”, Insurance: Mathematics & Economics, 31(2),

133–161.

[43] Dhaene J., Goovaerts M.J. & Kaas R. (2003). “Economical capital

allocation derived from risk measures”, North American Actuarial

Journal, 7(2), 44–59.

[44] Dhaene J., Vanduffel S., Tang Q., Goovaerts M.J., Kaas R. & Vyncke

D. (2004). “Solvency capital, risk measures and comonotonicity: a re-

view”, Research Report OR 0416, Department of Applied Economics,

K.U.Leuven.

[45] Dhaene J., Vanduffel S., Goovaerts M.J., Kaas R. & Vyncke D.

(2005). “Comonotonic approximations for optimal portfolio selection

problems”, Journal of Risk and Insurance, 72(2), 253–301.

[46] Doray L.G. (1994). “IBNR reserve under a loglinear location-scale

regression model”, Casualty Actuarial Society Forum 1994, 2, 607-

652.

[47] Doray L.G. (1996). “UMVUE of the IBNR reserve in a lognormal lin-

ear regression model”, Insurance: Mathematics & Economics, 18(1),

43–58.

[48] Dufresne D. (1990). “The distribution of a perpetuity with applica-

tions to risk theory and pension funding”, Scandinavian Actuarial

Journal, 9, 39–79.

[49] Dufresne D. (2002). “Asian and basket asymptotics”, Research Paper

No. 100, Centre for Actuarial Studies, University of Melbourne.

[50] Dufresne D. (2004). “Stochastic life annuities”, Research Paper, Centre

for Actuarial Studies, University of Melbourne.

[51] Efron B. (1979). “Bootstrap methods: another look at the jackknife”,

Ann. Statist., 7, 1–26.

[52] Efron B. & Tibshirani R.J. (1993). An Introduction to the Bootstrap.

Chapman and Hall, New York.

[53] Embrechts P., Klüppelberg C. & Mikosch T. (1997). Modelling Extremal

Events for Insurance and Finance, Springer, Berlin.

[54] England P.D. & Verrall R.J. (1999). “Analytic and bootstrap esti-

mates of prediction errors in claim reserving”, Insurance: Mathema-

tics & Economics, 25(3), 281-293.

[55] England P.D. & Verrall R.J. (2001). “A flexible framework for

stochastic claims reserving”, Proceedings of the Casualty Actuarial

Society, 88(1), 1–38.

[56] England P.D. & Verrall R.J. (2002). “Stochastic claims reserving in

general insurance”, British Actuarial Journal, 8(3), 443–518.

[57] Fang K.T., Kotz S. & Ng K.W. (1990). Symmetric Multivariate and

Related Distributions, Chapman & Hall, London.

[58] Feller W. (1971). An Introduction to Probability Theory and Its Ap-

plications, Wiley, New York.

[59] Frees E. (1990). “Stochastic life contingencies with solvency consid-

erations”, Transactions of the Society of Actuaries, 42, 91–129.

[60] Gilks W.R., Richardson S. & Spiegelhalter D.J. (1996). Practical

Markov Chain Monte Carlo, Chapman and Hall, London.

[61] Goovaerts M.J., Kaas R., Van Heerwaarden A.E. & Bauwelinckx T.

(1990). Effective Actuarial Methods, North-Holland, Amsterdam.

[62] Goovaerts M.J. & Redant H. (1999). “On the distribution of IBNR

reserves”, Insurance: Mathematics & Economics, 25(1), 1–9.

[63] Goovaerts M.J., Dhaene J. & De Schepper A. (2000). “Stochastic

upper bounds for present value functions”, Journal of Risk and Insurance,

67(1), 1–14.

[64] Goovaerts M.J., Kaas R., Dhaene J. & Tang Q. (2003). “A unified

approach to generate risk measures”, ASTIN Bulletin, 33(2), 173–

192.

[65] Goovaerts M.J., Kaas R., Dhaene J. & Tang Q. (2004). “Some new

classes of consistent risk measures”, Insurance: Mathematics & Eco-

nomics, 34(3), 505–516.

[66] Heerwaarden A.E. van (1991). Ordering of Risks: Theory and Actu-

arial Applications, Thesis Publishers, Amsterdam.

[67] Hoedemakers T., Beirlant J., Goovaerts M.J. & Dhaene J. (2003).

“Confidence bounds for discounted loss reserves”, Insurance: Mathe-

matics & Economics, 33(2), 297–316.

[68] Hoedemakers T. & Goovaerts M.J. (2004). Discussion of “Risk and

discounted loss reserves” by Greg Taylor, North American Actuarial

Journal, 8(4), 146–150.

[69] Hoedemakers T., Beirlant J., Goovaerts M.J. & Dhaene J. (2005).

“On the distribution of discounted loss reserves using generalized

linear models”, Scandinavian Actuarial Journal, 2005(1), 25–45.

[70] Hoedemakers T., Darkiewicz G. & Goovaerts M.J. (2005). “Approx-

imations for life annuity contracts in a stochastic financial environ-

ment”, Insurance: Mathematics & Economics, to be published.

[71] Hoedemakers T., Darkiewicz G., Deelstra G., Dhaene J. & Vanmaele

M. (2005). “Bounds for stop-loss premiums of stochastic sums (with

applications to life contingencies)”, Research Report OR 0523, De-

partment of Applied Economics, K.U.Leuven.

[72] Huang H., Milevsky M.A. & Wang J. (2004). “Ruined moments in

your Life: how good are the approximations?”, Insurance: Mathe-

matics & Economics, 34(3), 421–447.

[73] Hürlimann W. (1996). “Improved analytical bounds for some risk

quantities”, ASTIN Bulletin, 26(2), 185–199.

[74] Hürlimann W. (1998). “On best stop-loss bounds for bivariate sums

by known marginal means, variances and correlation”, Mitt. Verein.

Schweiz. Versicherungsmath., 111–134.

[75] Ibbotson Associates (2002). Stocks, Bonds, Bills and Inflation: 1926-

2001, Chicago, IL.

[76] Jansen K., Haezendonck J. & Goovaerts M.J. (1986). “Upper bounds

on stop-loss premiums in case of known moments up to the fourth

order”, Insurance: Mathematics & Economics, 5(4), 315–334.

[77] Jeffreys H. (1946). “An invariant form for the prior probability in

estimation problems”, Proc. Roy. Soc. London Ser. A, 196, 453–461.

[78] Kaas R., Van Heerwaarden A.E. & Goovaerts M.J. (1998). Ordering

of Actuarial Risks, Caire Education Series 1, Caire, Brussels.

[79] Kaas R., Dhaene J. & Goovaerts M.J. (2000). “Upper and lower

bounds for sums of random variables”, Insurance: Mathematics &

Economics, 27(2), 151–168.

[80] Kaas R., Goovaerts M.J., Dhaene J. & Denuit M. (2001). Modern

Actuarial Risk Theory, Kluwer Academic Publishers.

[81] Kalbfleisch J.D. & Prentice R.L. (1980). The Statistical Analysis of

Failure Time Data, Wiley, New York.

[82] Karatzas I. & Shreve S.E. (1991). Brownian Motion and Stochastic

Calculus, Springer-Verlag, New York.

[83] Kass R.E. & Wasserman L. (1996). “The selection of prior distributions

by formal rules”, Journal of the American Statistical Association,

91, 1343–1370.

[84] Kremer E. (1982). “IBNR-claims and the two-way model of

ANOVA”, Scandinavian Actuarial Journal, 47–55.

[85] Laeven R.J.A., Goovaerts M.J. & Hoedemakers T. (2005). “Some

asymptotic results for sums of dependent random variables with ac-

tuarial applications”, Insurance: Mathematics & Economics, to be

published.

[86] Landsman Z. & Valdez E.A. (2003). “Tail conditional expectations

for elliptical distributions”, North American Actuarial Journal, 7,

55–71.

[87] Lawless J.F. (1982). Statistical Models and Methods for Lifetime

Data, Wiley, New York.

[88] Lehmann E. (1955). “Ordered families of distributions”, Ann. Math.

Statist., 26, 399–419.

[89] Lowe J. (1994). “A practical guide to measuring reserve variabil-

ity using: Bootstrapping, operational time and a distribution free

approach”, Proceedings of the 1994 General Insurance Convention,

Institute of Actuaries and Faculty of Actuaries.

[90] Mack T. (1991). “A simple parametric model for rating automo-

bile insurance or estimating IBNR claims reserves”, ASTIN Bulletin,

21(1), 93–109.

[91] Mack T. (1993). “Distribution free calculation of the standard error

of chain ladder reserve estimates”, ASTIN Bulletin, 23(2), 213–225.

[92] Mack T. (1994). “Measuring the variability of chain-ladder reserve

estimates”, Casualty Actuarial Society Forum Spring 1994, 1, 101–182.

[93] McCullagh P. & Nelder J.A. (1992). Generalized Linear Models, 2nd

edition, Chapman and Hall, New York.

[94] Merton R. (1971). “Optimum consumption and portfolio rules in a

continuous-time model”, Journal of Economic Theory, 3, 373–413.

[95] Merton R. (1990). Continuous Time Finance, Cambridge, Blackwell.

[96] Michael J.R., Schucany W.R. & Haas R.W. (1976). “Generating ran-

dom variates using transformations with multiple roots”, The Amer-

ican Statistician, 30, 88–90.

[97] Milevsky M.A., Ho K. & Robinson C. (1997). “Asset allocation via

the conditional first exit time or how to avoid outliving your money”,

Review of Quantitative Finance and Accounting, 9(1), 53–70.

[98] Milevsky M.A. (1997). “The present value of a stochastic perpetu-

ity and the Gamma distribution”, Insurance: Mathematics & Eco-

nomics, 20(3), 243–250.

[99] Milevsky M.A. & Posner S.E. (1998). “Asian options, the sum of

lognormals, and the reciprocal gamma distribution”, Journal of Fi-

nancial and Quantitative Analysis, 33(3), 409–422.

[100] Milevsky M.A. & Robinson C. (2000). “Self-annuitization and ruin

in retirement”, North American Actuarial Journal, 4(4), 112–124.

[101] Milevsky M.A. & Wang J. (2004). “Stochastic annuities under expo-

nential mortality”, Research paper, York University and The IFID

Centre.

[102] Nielsen J.A. & Sandmann K. (2003). “Pricing bounds on Asian op-

tions”, Journal of Financial and Quantitative Analysis, 38(2), 449–

474.

[103] Norberg R. (1990). “Payment measures, interest and discounting. An

axiomatic approach with applications to insurance”, Scandinavian

Actuarial Journal, 73, 14–33.

[104] Norberg R. (1993). “A solvency study in life insurance”, Proceedings

of the Third AFIR International Colloquium, Rome, 822–830.

[105] O’Hagan A. (1994). Bayesian Inference, Kendall’s Advanced Theory

of Statistics, Arnold, London.

[106] Panjer H.H. & Bellhouse D.R. (1980). “Stochastic modeling of in-

terest rates with applications to life contingencies”, Journal of Risk

and Insurance, 47, 91–110.

[107] Panjer H.H. (1998). Financial economics: With applications to in-

vestments, insurance and pensions, Schaumburg, Ill.: Society of Ac-

tuaries.

[108] Parker G. (1994a). “Moments of the present value of a portfolio of

policies”, Scandinavian Actuarial Journal, 77(1), 53–67.

[109] Parker G. (1994b). “Stochastic analysis of portfolio of endowment

insurance policies”, Scandinavian Actuarial Journal, 77(2), 119–130.

[110] Parker G. (1994c). “Limiting distribution of the present value of a

portfolio”, ASTIN Bulletin, 24(1), 47–60.

[111] Parker G. (1994d). “Two stochastic approaches for discounting ac-

tuarial functions”, ASTIN Bulletin, 24(2), 167–181.

[112] Parker G. (1996). “A portfolio of endowment policies and its limiting

distribution”, ASTIN Bulletin, 26(1), 25–33.

[113] Parker G. (1997). “Stochastic analysis of the interaction between

investment and insurance risks”, North American Actuarial Journal,

1(2), 55–71.

[114] Pinheiro P.J.R., Andrade e Silva J.M. & de Lourdes Centeno M.

(2003). “Bootstrap methodology in claim reserving”, Journal of Risk

and Insurance, 70(4), 701–715.

[115] Renshaw A.E. (1989). “Chain ladder and interactive modelling

(claims reserving and GLIM)”, Journal of the Institute of Actuar-

ies, 116(III), 559–587.

[116] Renshaw A.E. (1994). “On the second moment properties and the

implementation of certain GLIM based stochastic claims reserving

models”, Actuarial Research Paper No. 65, Department of Actuarial

Science and Statistics, City University, London.

[117] Renshaw A.E. (1994b). “Claims reserving by joint modelling”, Actu-

arial Research Paper No. 72, Department of Actuarial Science and

Statistics, City University, London.

[118] Renshaw A.E. & Verrall R.J. (1994). “A stochastic model underlying

the chain-ladder technique”, Proceedings XXV ASTIN Colloquium,

Cannes.

[119] Rogers L.C.G. & Shi Z. (1995). “The Value of an Asian option”,

Journal of Applied Probability, 32, 1077–1088.

[120] Schoutens W. (2003). Lévy Processes in Finance: Pricing Financial

Derivatives, Wiley, New York.

[121] Shaked M. & Shanthikumar J.G. (1994). Stochastic orders and their

applications, Academic Press.

[122] Simon S., Goovaerts M.J. & Dhaene J. (2000). “An easy computable

upper bound for the price of an arithmetic Asian option”, Insurance:

Mathematics & Economics, 26(2-3), 175–184.

[123] Tang Q. & Tsitsiashvili G. (2003). “Precise estimates for the ruin

probability in finite horizon in a discrete-time model with heavy-

tailed insurance and financial risks”, Stochastic Processes and their

Applications, 108, 299–325.

[124] Tang Q. & Tsitsiashvili G. (2004). “Finite and infinite time ruin

probabilities in the presence of stochastic return on investments”,

Advances in Applied Probability, 36, 1278–1299.

[125] Taylor G.C. & Ashe F.R. (1983). “Second moments of estimates of

outstanding claims”, Journal of Econometrics, 23, 37–61.

[126] Taylor G.C. (1996). “Risk, capital and profit in insurance”, SCOR

International Prize in Actuarial Science.

[127] Taylor G.C. (2000). Loss Reserving: An Actuarial Perspective,

Kluwer Academic Publishers.

[128] Taylor G.C. (2004). “Risk and discounted loss reserves”, North

American Actuarial Journal, 8(1), 37–44.

[129] Valdez E. & Dhaene J. (2004). “Bounds for sums of dependent log-

elliptical risks”, Working Paper, University of New South Wales.

[130] Vanduffel S., Hoedemakers T. & Dhaene J. (2004). “Comparing ap-

proximations for sums of non-independent lognormal random vari-

ables”, Research Report OR 0418, Department of Applied Eco-

nomics, K.U.Leuven.

[131] Vanduffel S. (2005). Comonotonicity: From Risk Measurement to

Risk Management, PhD Thesis, University of Amsterdam, Faculty

of Economics and Econometrics, Amsterdam.

[132] Vanmaele M., Deelstra G. & Liinev J. (2004a). “Approximation of

stop-loss premiums involving sums of lognormals by conditioning on

two random variables”, Insurance: Mathematics & Economics, 35(2),

343–367.

[133] Vanmaele M., Deelstra G., Liinev J., Dhaene J. & Goovaerts M.J.

(2004b). “Bounds for the price of discrete arithmetic Asian options”,

Journal of Computational and Applied Mathematics, to appear.

[134] Verrall R.J. (1989). “A state space representation of the chain-ladder

linear model”, Journal of the Institute of Actuaries, 116, 589–610.

[135] Verrall R.J. (1991). “On the unbiased estimation of reserves from

loglinear models”, Insurance: Mathematics & Economics, 10, 75–80.

[136] Verrall R.J. (2004). “A Bayesian generalized linear model for the

Bornhuetter-Ferguson method of claims reserving”, North American

Actuarial Journal, 8(3), 67–89.

[137] Vyncke D. (2003). Comonotonicity: the Perfect Dependence, PhD

Thesis, K.U. Leuven, Faculty of Sciences, Leuven.

[138] Vyncke D., Goovaerts M.J. & Dhaene J. (2004). “An accurate analytical

approximation for the price of a European-style arithmetic

Asian option”, Finance (AFFI), 25, 121–139.

[139] Wang S. & Young V.R. (1998). “Ordering risks: Expected utility

theory versus Yaari’s dual theory of risk”, Insurance: Mathematics

& Economics, 22, 235–242.

[140] Waters H.R. (1978). “The moments and distributions of actuarial

functions”, Journal of the Institute of Actuaries, 105, 61–75.

[141] Wedderburn R.W.M. (1974). “Quasi-likelihood functions, general-

ized linear models, and the Gauss-Newton method”, Biometrika, 61,

439–447.

[142] Wilkie A.D. (1976). “The rate of interest as a stochastic process:

Theory and applications”, Proceedings of the 20th International

Congress of Actuaries, Tokyo 1, 325–337.

[143] Wolthuis H. & Van Hoek I. (1986). “Stochastic models for life con-

tingencies”, Insurance: Mathematics & Economics, 5(3), 217–254.

[144] Wright T.S. (1990). “A stochastic method for claims reserving in

general insurance”, Journal of the Institute of Actuaries, 117, 677–

731.

[145] Yaari M.E. (1987). “The dual theory of choice under risk”, Econo-

metrica, 55, 95–115.

[146] Zehnwirth B. (1989). “The chain-ladder technique - A stochastic

model”, Claims Reserving Manual, 2, Institute of Actuaries, Lon-

don.
