This article was downloaded by: [Duke University Libraries]
On: 09 October 2014, At: 07:38
Publisher: Routledge
Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

Quantitative Finance
Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/rquf20

Sampling from Archimedean copulas
Niall Whelan (Trade Floor Risk Management, Bank of Nova Scotia, 40 King Street West, Toronto, ON, M5H 1H1, Canada)
Published online: 19 Aug 2006.

To cite this article: Niall Whelan (2004) Sampling from Archimedean copulas, Quantitative Finance, 4:3, 339-352, DOI: 10.1088/1469-7688/4/3/009
To link to this article: http://dx.doi.org/10.1088/1469-7688/4/3/009

PLEASE SCROLL DOWN FOR ARTICLE

Taylor & Francis makes every effort to ensure the accuracy of all the information (the "Content") contained in the publications on our platform. However, Taylor & Francis, our agents, and our licensors make no representations or warranties whatsoever as to the accuracy, completeness, or suitability for any purpose of the Content. Any opinions and views expressed in this publication are the opinions and views of the authors, and are not the views of or endorsed by Taylor & Francis. The accuracy of the Content should not be relied upon and should be independently verified with primary sources of information. Taylor and Francis shall not be liable for any losses, actions, claims, proceedings, demands, costs, expenses, damages, and other liabilities whatsoever or howsoever caused arising directly or indirectly in connection with, in relation to or arising out of the use of the Content.

This article may be used for research, teaching, and private study purposes. Any substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any form to anyone is expressly forbidden.

Terms & Conditions of access and use can be found at http://www.tandfonline.com/page/terms-and-conditions



QUANTITATIVE FINANCE Volume 4 (2004) 339–352 · RESEARCH PAPER · INSTITUTE OF PHYSICS PUBLISHING · quant.iop.org

Sampling from Archimedean copulas

Niall Whelan

Trade Floor Risk Management, Bank of Nova Scotia, 40 King Street West, Toronto, ON M5H 1H1, Canada

E-mail: niall [email protected]

Received 27 May 2003, in final form 18 November 2003. Published 11 March 2004. Online at stacks.iop.org/Quant/4/339 (DOI: 10.1088/1469-7688/4/3/009)

Abstract

We develop sampling algorithms for multivariate Archimedean copulas. For exchangeable copulas, where there is only one generating function, we first analyse the distribution of the copula itself, deriving a number of integral representations and a generating function representation. One of the integral representations is related, by a form of convolution, to the distribution whose Laplace transform yields the copula generating function. In the infinite-dimensional limit there is a direct connection between the distribution of the copula value and the inverse Laplace transform. Armed with these results, we present three sampling algorithms, all of which entail drawing from a one-dimensional distribution and then scaling the result to create random deviates distributed according to the copula. We implement and compare the various methods. For more general cases, in which an N-dimensional Archimedean copula is given by N − 1 nested generating functions, we present algorithms in which each new variate is drawn conditional only on the value of the copula of the previously drawn variates. We also discuss the use of composite nested and exchangeable copulas for modelling random variates with a natural hierarchical structure, such as ratings and sectors for obligors in credit baskets.

1. Introduction

Use of copulas for multidimensional distributions is a powerful method of analysing the dependence structure of random variables. The term 'copula' was first introduced by Sklar (1959), although some of the ideas go back to Hoffding (1940). Copulas are useful because they permit us to focus on the dependence structure of the distribution independently of the marginal distributions of the random variables. Recently they have been considered for basket credit derivatives in finance (Li 2000, Embrechts et al 2001b, Schonbucher and Schubert 2001, Bouye et al 2000, Razak 2003) and for portfolio credit risk (Frey and McNeil 2003, and references therein), where we assume that the marginal distributions can be obtained from market conditions, such as bond spreads or credit default spreads, but where we need to impose some information on the mutual dependences. By marrying the market observable marginal distributions with a model choice for the copula, we can construct the full multidimensional distribution. This is used for modelling both Nth to default baskets and collateralized debt obligations (CDOs) whose pay-offs depend on the probabilities of joint default. However, modelling credit derivatives is just one application of copulas and, while it originally motivated this paper, the conclusions are not limited to finance applications.

Archimedean copulas (see Nelsen 1999, Joe 1997 for reviews) are one class which has attracted particular interest since they have a number of properties which make them simple to analyse. Commonly we want to sample randomly from the multivariate distribution in order to perform a numerical simulation; an algorithm for two-dimensional Archimedean copulas is presented in Embrechts et al (2001a). In the current paper, we present sampling algorithms for Archimedean copulas in arbitrary dimension. Similar results can also be found in Frey and McNeil (2003), which is similar in spirit to one of the algorithms we present in this paper.

1469-7688/04/030339+14$30.00 © 2004 IOP Publishing Ltd PII: S1469-7688(04)64101-7 339


Our goal is to avoid, as much as possible, direct determination of high-order derivatives of many variables. This consideration rules out three generic multivariate sampling methods. If one invokes direct sampling by drawing a random variate x_1, then drawing x_2 conditional on x_1 and so on, one ends up having to determine high-order derivatives of multiple variables. Another generic method is the rejection algorithm, which is based on finding a covering distribution (Devroye 1986), but this again requires the multivariate probability distribution function (pdf), which is difficult to determine. Finally, there is the Metropolis algorithm, in which one chooses whether to accept consecutive random draws using an acceptance criterion based on the density of the multivariate distribution, the determination of which again requires high-order derivatives of many variables. Furthermore, the last method can be very slow since its acceptance criteria can result in exponentially many rejected draws for every accepted draw. Because of these difficulties, we seek algorithms which are specific to Archimedean copulas.

We begin with a brief review of copulas, and for this we first need to introduce some notation. We will use the symbol P to refer to the pdf of its argument. If we have N random variables y_1, ..., y_N, the joint pdf is P(y) (where we use the bold notation to denote all of the variables y_i considered together). We use I(y) to denote the cumulative distribution function (cdf) of y and I*(y) to denote the complement of I(y):

\[
I(\mathbf{y}) = \int_{-\infty}^{y_1} dy'_1 \cdots \int_{-\infty}^{y_N} dy'_N \, P(\mathbf{y}'), \qquad
I^*(\mathbf{y}) = \int_{y_1}^{\infty} dy'_1 \cdots \int_{y_N}^{\infty} dy'_N \, P(\mathbf{y}'). \tag{1}
\]

For one-dimensional normalized distributions we have I(y) + I*(y) = 1, a relation we will use often. I(y) represents the probability that every component of the random variable is less than the corresponding value of y_i, while I*(y) represents the probability that every component of the random variable is greater than the corresponding value of y_i. For each dimension we also introduce the marginal pdf P(y_i), which can be obtained from P(y) by integrating over all components but y_i. There is also the corresponding marginal cdf I(y_i), which equals I(y) with all components but y_i set to infinity (or the largest values of the support of the respective pdfs). For notational simplicity, we will use the symbols P and I generically to denote distributions of different variables; the arguments of the functions will in almost all cases resolve any ambiguity. Where there is ambiguity, we introduce subscripts.

It is convenient to introduce new variables x_i = I_i(y_i), which have a domain from zero to one. (We have introduced a subscript on the functions I_i for notational clarity.) These new variables have their own multivariate distribution as well as marginal pdfs and cdfs. We recall that under any monotonically increasing change of variables w = f(z) with distribution functions I_z(z) and I_w(w), those distribution functions are related by

\[
I_w(w) = I_z(z = f^{-1}(w)). \tag{2}
\]

This follows from the fact that Pr(z < Z) equals Pr(w < W), where W = f(Z) or, equivalently, Z = f^{-1}(W). Applying this identity to our situation in which x_i = I_i(y_i), we find

\[
C_i(x_i) = I_i(y_i = I_i^{-1}(x_i)) = x_i \tag{3}
\]

(where, to lessen notational confusion, we reserve the symbol C_i for the cdf of x_i). This is a convenient relation since it means that the marginal pdf of x_i is uniform. In fact, this is often taken as the definition of a copula: a multivariate distribution defined on the unit cube with uniform marginal densities. By analogous logic, the joint cdf of x is

\[
C(x_1, \ldots, x_N) = I(y_1 = I_1^{-1}(x_1), \ldots, y_N = I_N^{-1}(x_N)). \tag{4}
\]

We use the symbol C to indicate that in addition to being a cdf it is also a copula. C has a number of very specific properties (Nelsen 1999, Joe 1997) and we will only ever express it as a function of the random variables x, since under a change of variables it loses the property of being a copula. In addition to being the cdf of x, C can also be understood as a random variable in its own right (since it is a function of the random variables x). Specifically, it has its own pdf and cdf, which will be derived below in one situation.

As mentioned, a common application of copulas is to assume a specific copula functional form and combine this with information about the marginal distributions of the variables y_i to construct a full multivariate distribution for the y_i. All information about the dependence amongst the variables is embedded within the choice of copula. This procedure can be thought of as extending information about the marginal distributions to knowledge about the complete distribution. It has the added benefit of cleanly separating the information about the marginal distributions from the assumptions about their dependence structure. For example, it is relatively simple to change the copula while keeping the marginal distributions constant and thereby explore the effect of different choices of dependence. Different copulas may also have properties not present for standard multivariate distributions such as the normal. For instance, the normal distribution has asymptotic independence unless correlations are unity. Other copulas may not be asymptotically independent, meaning that events in which two or more deviates are in the tails of their respective marginal distributions are relatively more common. This can be important when trying to model joint defaults, for example.

All copulas are bounded between the functions M_N = min(x_1, ..., x_N) and W_N = max(x_1 + ... + x_N − N + 1, 0), known as the Fréchet–Hoeffding bounds. M_N is the maximum and is itself a copula (corresponding to perfect dependence amongst the x_i), while W_N is the minimum but is only a copula for N = 2. Standard copulas include the product copula Π = ∏_i x_i, which corresponds to statistical independence. There is also the Gaussian copula, which depends parametrically on a correlation matrix; assuming the marginal distributions are all Gaussian, it corresponds to a multivariate Gaussian distribution. Another class are the extreme value copulas, which have attracted interest in finance applications (Bouye et al 2000); in particular they lead to first-to-default intensities which are free of term structure (Razak 2003). There are also the Archimedean copulas, which have a number of attractive properties and which will be explored in the sections to come.
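The Fréchet–Hoeffding bounds can be checked numerically. The following sketch (our own illustration, not from the paper) verifies on random points of the unit cube that the product copula lies between W_N and M_N:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 5
x = rng.uniform(size=(1000, N))  # random points in the unit cube

M = x.min(axis=1)                          # upper Frechet-Hoeffding bound M_N
W = np.maximum(x.sum(axis=1) - N + 1, 0)   # lower bound W_N (a copula only for N = 2)
Pi = x.prod(axis=1)                        # product copula (independence)

# Every copula value must lie between the two bounds.
assert np.all(W <= Pi) and np.all(Pi <= M)
```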


In the following section we analyse various features of the distribution of copula values for one simple but commonly considered class of Archimedean copulas, the 'exchangeable' copulas. In the section following that we use these results to provide a number of sampling algorithms. In the two subsequent sections we discuss two classes of copulas which generalize the exchangeable copulas in different ways and show how to sample from them. In the penultimate section we introduce a class of inexchangeable Archimedean copulas which have properties of both of the generalizations mentioned above and which can reflect hierarchical structures which may exist among the random variables; an example is a set of obligors organized by credit rating, industrial sector and home country. We end with a conclusion.

2. Analysis of exchangeable Archimedean copulas

As mentioned in the introduction, one commonly considered class is that of the Archimedean copulas (Nelsen 1999, Joe 1997), for which the copula function has the form

\[
C(x_1, x_2, \ldots, x_N) = \phi^{-1}(\phi(x_1) + \phi(x_2) + \cdots + \phi(x_N)), \tag{5}
\]

where φ(x) is a generating function which is defined on the range zero to one, is monotonically decreasing and satisfies φ(1) = 0. Furthermore, if φ(0) = ∞ the copula is said to be 'strict Archimedean'. Most useful copulas are strict and we assume this condition in what follows. Examples include (Marshall and Olkin 1988): the independent copula, for which φ(x) = −log(x); the Gumbel copula, for which φ(x) = (−log(x))^α with 1 ≤ α < ∞; and the Clayton copula, for which φ(x) = (x^{−α} − 1)/α with 0 < α < ∞. Note that these are all strict. The Gumbel copula is privileged in being both an Archimedean and an extreme value copula. There is a permutation symmetry among the x_i in (5), and the term 'exchangeable' is used to describe this situation and to distinguish it from more general classes of Archimedean copulas which will be presented below.
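To make (5) concrete, the generators above can be coded directly. The sketch below (our own, with illustrative α values) evaluates the exchangeable copula for each family; for the independent generator the result reduces, as it must, to the product of the arguments.

```python
import numpy as np

# Generator phi and inverse f = phi^{-1} for the three families in the text.
def phi_indep(x):          return -np.log(x)
def f_indep(t):            return np.exp(-t)

def phi_gumbel(x, a=2.0):  return (-np.log(x)) ** a        # 1 <= a < infinity
def f_gumbel(t, a=2.0):    return np.exp(-t ** (1.0 / a))

def phi_clayton(x, a=1.0): return (x ** (-a) - 1.0) / a    # 0 < a < infinity
def f_clayton(t, a=1.0):   return (1.0 + a * t) ** (-1.0 / a)

def copula(xs, phi, f):
    """Equation (5): C(x_1,...,x_N) = f(phi(x_1) + ... + phi(x_N))."""
    return f(sum(phi(x) for x in xs))

xs = [0.3, 0.6, 0.9]
# Independence: C reduces to the plain product of the arguments.
assert abs(copula(xs, phi_indep, f_indep) - 0.3 * 0.6 * 0.9) < 1e-12
```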

Because it arises so often in the subsequent discussion, it is useful to define the function f(x) = φ^{-1}(x). (Note that some authors define φ and f oppositely to what we have done here; while arguably a superior convention, we have elected to follow the convention in the finance literature.) Archimedean copulas arise naturally in the context of Laplace transforms of distribution functions (Joe 1997). If f(x) is the Laplace transform

\[
f(x) = \int_0^{\infty} ds \, e^{-sx} F(s) \tag{6}
\]

of some univariate density function F(s), then the function (5) is guaranteed to be a proper distribution, meaning that its density function as well as all marginal density functions are positive. If, on the other hand, we do not obtain φ from a Laplace transform, then in general we are not guaranteed that (5) is a proper distribution. A necessary condition is that f(x) have negative first derivative everywhere and that subsequent derivatives alternate in sign (Kimberling 1974). This property is referred to as being 'completely monotone' (Schoenberg 1938); it implies that f is analytic in the right complex half-plane {z = a + ib, a > 0} and typically, although not necessarily, has a singularity at the origin. This condition is well known and can be seen to arise quite naturally in the subsequent discussion. The need for this additional condition arises because we are specifying the cdf (i.e. the copula). Traditional routes to copulas start with some pdf (e.g. the multivariate normal distribution) and then derive from it the corresponding copula. One is then left with a copula whose density is necessarily positive but which is algebraically intractable. With Archimedean copulas, on the other hand, the copula is quite straightforward to analyse, but unless f(x) is defined from a Laplace transform as in (6) one must be manifestly concerned with guaranteeing positivity of the densities.
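As a worked instance of (6) (our own check, using the standard pairing of the Clayton family with the gamma distribution): the Clayton inverse generator f(x) = (1 + αx)^{−1/α} is the Laplace transform of a gamma density with shape 1/α and scale α, which can be confirmed by direct quadrature:

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import gamma

alpha = 0.7
F = gamma(a=1.0 / alpha, scale=alpha)   # density whose Laplace transform is f

def f_laplace(x):
    """f(x) = int_0^inf e^{-s x} F(s) ds, equation (6)."""
    val, _ = quad(lambda s: np.exp(-s * x) * F.pdf(s), 0.0, np.inf)
    return val

x = 1.3
clayton = (1.0 + alpha * x) ** (-1.0 / alpha)   # Clayton f, evaluated directly
assert abs(f_laplace(x) - clayton) < 1e-6
```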

2.1. The distribution function of the copula

In this subsection we develop expressions for the distribution of the copula value. It is actually simplest to work in terms of the variable t = φ(C), as we now show. We start by defining a new set of variables ξ where

\[
\xi_i = \phi(x_i), \tag{7}
\]

in terms of which C(x) = f(∑_i ξ_i). Recall that C(x) can be interpreted as the probability that each component of x is smaller than the corresponding argument of C. Due to the fact that ξ_i is a decreasing function of x_i, this equals the probability that each component of ξ is greater than the corresponding argument. Symbolically, I*(ξ) = C(x) = f(∑_i ξ_i). We can then differentiate to find the pdf of ξ:

\[
P(\boldsymbol{\xi}) = (-1)^N \frac{\partial^N}{\prod_i \partial \xi_i} f\Big(\sum_i \xi_i\Big). \tag{8}
\]

The factor (−1)^N arises from the definition of I* in terms of P; there is a negative sign per component.

We now introduce a new change of variables, analogous to hyperspherical polar coordinates: t = ∑_i ξ_i and a set of N − 1 variables u_i which span the constant-t surface. (These are known as 'Jacobi coordinates' in the context of dynamical systems in which we use centre of mass coordinates.) Since P only depends on t, we can focus most of our attention on it and we do not need to be very specific about the parametrization of the u_i. In particular,

\[
C(\mathbf{x}) = f(t). \tag{9}
\]

As mentioned, most of our results are for the distribution of t, but this is trivially mapped to the distribution of the copula value C.

The pdf of the set (t, u) is related to that of the ξ_i in (8) by the Jacobian of the transformation. We can always define the u_i such that this Jacobian is unity, so that

\[
P(t, \mathbf{u}) = (-1)^N f^{(N)}(t), \tag{10}
\]

where the superscript on f indicates the number of derivatives to take. This is the joint pdf of both t and u. Although there is no explicit dependence on the latter, they still play a role when


we integrate over their measure. More specifically, if we want the marginal pdf of just t, we need to integrate over all the u:

\[
P_N(t) = \int \prod_i du_i \, P(t, \mathbf{u}) = \frac{(-1)^N t^{N-1}}{(N-1)!} f^{(N)}(t), \tag{11}
\]

where the factor t^{N−1}/(N − 1)! comes from the integral over the u_i. That the u integral leads to this factor is trivially true for N = 1 (for which there is no integral), and larger values of N follow by induction; the proof is trivial and we do not present it. Also, we have introduced a subscript N on P to specify for which dimension it is relevant. Equation (11) has an interpretation as a generalization of a Poisson distribution. In fact, for the independent copula, for which f(x) = exp(−x), the formula is exactly the pdf of the Nth arrival time of a Poisson process.

A further comment on (11) concerns the alternating signs as a function of N. In order for the left-hand side to be positive, we clearly need that successive derivatives of the function f(t) alternate in sign. As mentioned earlier, this is the property of being completely monotone and is a condition for the function φ to generate a proper distribution.

One nice property of P_N(t) is that it is a perfect derivative, although this is not necessarily obvious. It can be expressed as

\[
P_N(t) = -\frac{d}{dt} I^*_N(t), \tag{12}
\]

where

\[
I^*_N(t) = \sum_{n=0}^{N-1} \frac{(-t)^n}{n!} f^{(n)}(t). \tag{13}
\]

This is easy to see by differentiating the series (13) to find a telescoping series of which (11) is the only surviving term. Recall that I*_N(t) is the integral from t to ∞ of the pdf of t and is the complement of the usual cdf. In particular, I*_N(∞) = 0 while I*_N(0) = f(0) = 1. The latter in particular indicates that the pdf is correctly normalized¹. The expressions (11) and (13) are the main results of this section and are what we shall build on in what follows.
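As an illustration of (11)–(13) (our own check, not from the paper): for the independent copula f(t) = e^{−t}, so f^{(n)}(t) = (−1)^n e^{−t}, the sum (13) becomes the Poisson tail and (11) the Erlang density, and the perfect-derivative relation (12) can be confirmed by finite differences:

```python
import numpy as np
from math import factorial

def istar_N(t, N):
    """Equation (13) with f(t) = exp(-t); it reduces to sum_{n<N} t^n e^{-t}/n!."""
    return sum((-t) ** n / factorial(n) * (-1) ** n * np.exp(-t) for n in range(N))

def p_N(t, N):
    """Equation (11): (-1)^N t^{N-1}/(N-1)! f^{(N)}(t), here the Erlang density."""
    return t ** (N - 1) / factorial(N - 1) * np.exp(-t)

# Equation (12): P_N(t) = -d/dt I*_N(t), checked with a central difference.
t, N, h = 2.5, 4, 1e-6
numeric = -(istar_N(t + h, N) - istar_N(t - h, N)) / (2 * h)
assert abs(numeric - p_N(t, N)) < 1e-6
```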

We can also derive a generating function expansion for the P_N(t) by defining²

\[
g(t, z) = \sum_{N=1}^{\infty} z^{N-1} P_N(t). \tag{14}
\]

We use the expression (11) and identify the sum as the Taylor expansion of f′(x) expanded around x = t and evaluated at x = (1 − z)t to conclude

\[
g(t, z) = -f'((1 - z)t), \tag{15}
\]

¹ It is interesting to note that (13) looks much like a Taylor series expansion. In fact, if we introduce the complementary series I_N(t) = ∑_{n=N}^{∞} ((−t)^n/n!) f^{(n)}(t), then I_N(t) + I*_N(t) together represent f(0) Taylor expanded around the point t. The series can be shown to converge using the Laplace relations (6). However, by definition f(0) = 1 and so I_N(t) = 1 − I*_N(t), which is therefore an equally valid representation of the cdf of t. This is reminiscent of the phenomenon of 'resurgence' in asymptotic analysis, where there is a complementarity between the so-called 'early terms' and 'late terms' in a properly resummed asymptotic expansion.

² I wish to thank Tom Hurd for this idea.

in terms of which we can find any P_N(t) as

\[
P_N(t) = \frac{1}{(N-1)!} \frac{d^{N-1}}{dz^{N-1}} g(t, z) \Big|_{z=0}. \tag{16}
\]

We can integrate this with respect to t to find

\[
I^*_N(t) = \frac{1}{(N-1)!} \frac{d^{N-1}}{dz^{N-1}} h(t, z) \Big|_{z=0}, \tag{17}
\]

where h(t, z) = f((1 − z)t)/(1 − z). The functions g(t, z) and h(t, z) have analyticity properties governed by those of the function f(z). In particular, there is generally a singularity at t = 0 and at z = 1. Although the expression (14) is typically only convergent for |z| < 1, (15) is the unique analytic continuation and is valid for all values of z, although care must be taken since the functions g and h may be multivalued in the complex plane.

The expression (13) is directly related to an expression due to Barbe et al (1996) for the cdf of the copula itself. We can think of the expression (9) as defining a change of variables from t to C, which can be made explicit by writing t = φ(C). Since f(t) is a decreasing function, the probability that t is greater than some value T is the same as the probability that C is less than f(T). Therefore we can immediately write the cdf of C as

\[
I_N(C) = \sum_{n=0}^{N-1} \frac{(-1)^n \phi^n(C)}{n!} f^{(n)}(\phi(C))
       = C + \sum_{n=1}^{N-1} \frac{(-1)^n \phi^n(C)}{n!} f^{(n)}(\phi(C)). \tag{18}
\]

This formula can be found in Barbe et al (1996), where it is referred to as that for K(t). The exposition above is an alternative and arguably simpler route to the same result. In two dimensions this simplifies to

\[
I_2(C) = C - \frac{\phi(C)}{\phi'(C)}. \tag{19}
\]

We conclude that the expressions (11) and (13) can also be understood as determining the cdf of the copula itself. This fact is potentially useful in fitting empirical data to an Archimedean copula. In Genest and Rivest (1993) and Bouye et al (2000) the authors discuss parametric fitting of empirical data. In particular, one can estimate the distribution function of the copula using non-parametric estimation and then do a parametric estimation of the best-fitting Archimedean copula. The distribution of t might potentially aid in this endeavour, as well as being useful for sampling. However, to proceed with either of these applications, it is important to have a computationally tractable representation of the distribution. Most generating functions are sufficiently complicated that the higher derivatives get very complicated quickly and it is not feasible to determine them directly. Therefore in the next two subsections we present two integral representations of the distribution which are simpler to work with than (13).
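As a concrete instance of the two-dimensional formula (19) (our own sketch): with the Clayton generator at α = 1 one finds I_2(C) = 2C − C² in closed form, and one can verify that it behaves as a proper cdf dominating C itself:

```python
import numpy as np

alpha = 1.0

def phi(c):       return (c ** (-alpha) - 1.0) / alpha   # Clayton generator
def phi_prime(c): return -c ** (-alpha - 1.0)

def I2(c):
    """Equation (19): the cdf of the copula value C in two dimensions."""
    return c - phi(c) / phi_prime(c)

c = np.linspace(0.01, 0.99, 99)
vals = I2(c)
assert np.all(np.diff(vals) > 0)             # increasing, as a cdf must be
assert np.all(vals >= c)                     # the copula cdf dominates C
assert np.allclose(vals, 2 * c - c ** 2)     # closed form for alpha = 1
```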


2.2. Cauchy integral representation

To obtain the first integral representation, we make use of the Cauchy formula to determine the derivatives in (13). Recall that if we have a function f(t), we can express any derivative as

\[
f^{(n)}(t) = \frac{n!}{2\pi i} \oint dz \, \frac{f(z)}{(z - t)^{n+1}}, \tag{20}
\]

where the integration contour encloses the pole at z = t once and otherwise crosses no poles or branch-cuts of f. We can insert this expression into (13) and sum the geometric series which is generated to conclude

\[
I^*_N(t) = -\frac{1}{2\pi i} \oint dz \, \frac{f(z)}{z} \left( \frac{t}{t - z} \right)^N. \tag{21}
\]

This is a usable representation of the cdf since we can readily perform the contour integral numerically; this is discussed in the appendix. As mentioned, f(z) being completely monotone means that it is only guaranteed to be analytic in the positive real part of the complex plane, so the contour is limited to that domain. In particular, it should avoid the origin, where there is typically a singularity. Applying (20) to the equation in the earlier footnote leads to the following alternative form:

\[
I_N(t) = 1 + \frac{1}{2\pi i} \oint dz \, \frac{f(z)}{z} \left( \frac{t}{t - z} \right)^N. \tag{22}
\]
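The contour integral (21) is straightforward to evaluate numerically. The following sketch (our own implementation; the circular contour choice and the independent-copula test case are our assumptions) parametrizes a circle about z = t lying entirely in the right half-plane and compares the result with the finite sum (13), which for f(t) = e^{−t} is the Poisson tail:

```python
import numpy as np
from math import factorial

def f(z):
    return np.exp(-z)   # inverse generator of the independent copula

def istar_contour(t, N, n_pts=400):
    """Equation (21) on the circle z = t + r e^{i theta}, r < t (right half-plane)."""
    r = 0.5 * t
    theta = 2 * np.pi * np.arange(n_pts) / n_pts
    z = t + r * np.exp(1j * theta)
    integrand = f(z) / z * (t / (t - z)) ** N
    # dz = i r e^{i theta} d(theta); trapezoid rule on a periodic integrand
    val = -np.mean(integrand * 1j * r * np.exp(1j * theta)) / 1j
    return val.real

def istar_series(t, N):
    """Equation (13) for f(t) = exp(-t): the Poisson tail sum_{n<N} t^n e^{-t}/n!."""
    return sum(t ** n / factorial(n) for n in range(N)) * np.exp(-t)

assert abs(istar_contour(3.0, 4) - istar_series(3.0, 4)) < 1e-8
```

The trapezoid rule converges spectrally on a closed contour, so a few hundred points suffice; for generators with a singularity at the origin the radius must keep the circle in the right half-plane, which is exactly the constraint noted in the text.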

The integral representations above are particularly effective when t is large. When t is small they are not numerically efficient, but this is overcome by changing variables from z to 1/z, so that alternative representations of the same functions are

\[
I^*_N(t) = \frac{1}{2\pi i} \oint dz \, \frac{f(1/z)}{z} \left( \frac{z}{z - 1/t} \right)^N, \qquad
I_N(t) = 1 - \frac{1}{2\pi i} \oint dz \, \frac{f(1/z)}{z} \left( \frac{z}{z - 1/t} \right)^N. \tag{23}
\]

Again we must avoid the origin and limit the contour to the positive real half-plane when encircling the pole at z = 1/t.

2.3. Laplace integral representation and N → ∞

Another path to an integral representation is to appeal to the definition of f(t) as a Laplace transform. In particular, we will see that this leads to an interesting relation in the limit of large dimension. Combining (6) and (11) we find

\[
P_N(t) = \frac{1}{\Gamma(N)} \int_0^{\infty} ds \, s \, e^{-st} (st)^{N-1} F(s). \tag{24}
\]

We can integrate this with respect to t to find

\[
I_N(t) = \int_0^{\infty} ds \, F(s) \, \gamma(N, st), \qquad
I^*_N(t) = \frac{t}{\Gamma(N)} \int_0^{\infty} ds \, e^{-st} (st)^{N-1} I(s), \tag{25}
\]

where γ(N, x) is the incomplete gamma function and is itself a cdf, while I(s) is the cdf corresponding to F(s). One can see by inspection that I_N(∞) = 1, assuming F(s) is normalized. This is a central relation which we will refer to often in the subsequent discussion. We can think of this as a form of convolution in which the cdf γ(N, x) is spread out by integration against a broadening function³. In fact, we can immediately understand the infinite-dimensional limit. We earlier remarked that (11) resembles the distribution of the Nth arrival time for some process. We know what happens in the infinite-dimensional limit for the Poisson generating function (i.e. the independent copula) and it is natural to ask the same question for non-independent copulas. To motivate this, it is useful first to look at the moments of the distribution of t.
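To see (24) in action (our own check): for the Clayton family with α = 1 we have f(t) = (1 + t)^{−1} and F(s) = e^{−s}, so (11) and (24) must both give P_N(t) = N t^{N−1}(1 + t)^{−(N+1)}:

```python
import numpy as np
from math import gamma as Gamma
from scipy.integrate import quad

def p_N_deriv(t, N):
    """Equation (11) with f(t) = (1+t)^{-1}, using f^{(N)}(t) = (-1)^N N! (1+t)^{-N-1}."""
    return N * t ** (N - 1) * (1.0 + t) ** (-(N + 1))

def p_N_laplace(t, N):
    """Equation (24) with F(s) = e^{-s}, the density whose Laplace transform is f."""
    integrand = lambda s: s * np.exp(-s * t) * (s * t) ** (N - 1) * np.exp(-s)
    val, _ = quad(integrand, 0.0, np.inf)
    return val / Gamma(N)

assert abs(p_N_deriv(1.7, 5) - p_N_laplace(1.7, 5)) < 1e-8
```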

Using (11), we find that the mth moment of t is given by the following recursion relation:

\[
\langle t^m \rangle_N = \frac{(-1)^N}{(N-1)!} \int_0^{\infty} dt \, t^{m+N-1} f^{(N)}(t)
                     = \frac{N + m - 1}{N - 1} \langle t^m \rangle_{N-1}. \tag{26}
\]

To obtain this, we have integrated by parts and identified the resulting integral as being proportional to the mth moment for dimension N − 1. We can obviously continue this, and we find

\[
\langle t^m \rangle_N = \binom{m + N - 1}{N - 1} \langle t^m \rangle, \tag{27}
\]

where we have defined ⟨t^m⟩ without a subscript with reference to N = 1, that is

\[
\langle t^m \rangle \equiv \langle t^m \rangle_{N=1} = -\int dt \, f'(t) \, t^m. \tag{28}
\]
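A quick numerical check of (26)–(28) (ours, not from the paper): for the independent copula the density P_N is Erlang, so ⟨t^m⟩_N = Γ(N + m)/Γ(N), which is exactly the binomial factor in (27) times ⟨t^m⟩ = m!:

```python
from math import comb, exp, factorial
from scipy.integrate import quad

def moment_N(m, N):
    """<t^m>_N integrated against the Erlang density (11) for f(t) = e^{-t}."""
    pdf = lambda t: t ** (N - 1) / factorial(N - 1) * exp(-t)
    val, _ = quad(lambda t: t ** m * pdf(t), 0.0, 200.0)
    return val

m, N = 3, 6
predicted = comb(m + N - 1, N - 1) * factorial(m)   # equation (27) with <t^m> = m!
assert abs(moment_N(m, N) - predicted) < 1e-4
```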

In particular, the mean value of t is proportional to N, so it is useful to scale this out by defining a new random variable τ = t/N. The first two moments of τ are

\[
\langle \tau \rangle_N = \langle t \rangle, \qquad
\langle \tau^2 \rangle_N = \frac{1}{2}\left(1 + \frac{1}{N}\right) \langle t^2 \rangle, \tag{29}
\]

from which we can determine the variance

\[
\sigma_\tau^2 = \left( \frac{1}{2} \langle t^2 \rangle - \langle t \rangle^2 \right) + \frac{1}{2N} \langle t^2 \rangle. \tag{30}
\]

There are two interesting limits. One is where the copula generating function is such that the first term in (30) is zero. In that case, only the second term is relevant and we find that the standard deviation decreases as 1/√N for large N. This case is already very familiar, being the distribution of the Nth arrival time of a Poisson process; this approaches a Gaussian centred on τ = 1, which in the large-N limit becomes a delta function. Therefore we will not dwell on this and rather focus on the other interesting limit suggested by (30), which is taking N → ∞ so that the second term vanishes. However, we would like to be more ambitious for this limit: not just to examine the first few moments, but rather to derive an expression for

³ More precisely, (25) is in the form of a correlation between the functions F and γ, but where we multiply, rather than add, the arguments s and t in γ. In terms of log t and log s, (25) is a standard correlation integral.


the entire distribution. We first derive an expression for all moments by revisiting (27), but focusing on the large-N limit and expressing the results in terms of τ, so we can write

\[
\langle \tau^m \rangle = \frac{1}{N^m} \binom{m + N - 1}{N - 1} \langle t^m \rangle \approx \frac{1}{m!} \langle t^m \rangle. \tag{31}
\]

In the second line, the approximation is that N ≫ m. The previous expression tells us that there is a deep connection between the infinite-dimensional limit of the τ distribution and the one-dimensional t distribution. Of course, the precise nature of that connection is embedded in the expressions (24) and (25).

In terms of τ, we can re-express (25) as

I_N(\tau) = \int_0^{\infty} ds\, F(s)\, \gamma(N, Ns\tau). \qquad (32)

In the infinite-dimensional limit, a saddle-point analysis indicates that

\lim_{N \to \infty} \gamma(N, Nx) = \Theta(x - 1), \qquad (33)

where Θ(x) is the Heaviside step function. (More precisely, for large N, the left-hand side can be shown to approach a normal cdf which becomes infinitely sharp as N → ∞.) Using this identity and defining G(s) as the cdf of s, corresponding to the pdf F(s), we conclude that

I^*_{\infty}(\tau) = G(1/\tau), \qquad P_{\infty}(\tau) = \frac{1}{\tau^2} F(1/\tau). \qquad (34)

Equivalently, if we define ω = 1/τ, then the pdf of ω is simply F(ω). We have hereby proved that in the infinite-dimensional limit, the distribution of τ is given by the distribution whose Laplace transform generates the Archimedean copula. As far as we are aware, this is a new result. The expressions (34) are the main result of this section and will prove of use in developing sampling algorithms.
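The step-function limit (33) is easy to check numerically. Below is a minimal sketch (our own illustration, not part of the original text), assuming γ(N, ·) denotes the regularized lower incomplete gamma function; for integer N it can be computed from a finite sum, done here in log space to avoid underflow at large arguments:

```python
import math

def reg_lower_gamma(n, x):
    """Regularized lower incomplete gamma P(n, x) for integer n > 0,
    via P(n, x) = 1 - exp(-x) * sum_{k=0}^{n-1} x^k / k!,
    with the sum evaluated in log space to avoid underflow."""
    logs = [-x + k * math.log(x) - math.lgamma(k + 1) for k in range(n)]
    m = max(logs)
    log_q = m + math.log(sum(math.exp(v - m) for v in logs))
    return 1.0 - math.exp(min(log_q, 0.0))

# gamma(N, N*x) approaches a step at x = 1 as N grows:
p_below, p_above = reg_lower_gamma(400, 0.9 * 400), reg_lower_gamma(400, 1.1 * 400)
```

For moderate N the transition is still a smeared normal cdf, as the text notes; increasing N visibly sharpens it toward Θ(x − 1).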

We can also express this directly in terms of the copula itself through the change of variables τ = φ(C)/N in the limit that N → ∞, so that the distribution of C is

I_{\infty}(C) = \lim_{N \to \infty} G\!\left(\frac{N}{\phi(C)}\right), \qquad
P_{\infty}(C) = -\lim_{N \to \infty} \frac{N \phi'(C)}{\phi^2(C)} F\!\left(\frac{N}{\phi(C)}\right), \qquad (35)

although these are probably less useful since they require care in understanding the limit.

We can explicitly verify (34) for the independent copula, for which f(t) = exp(−t) and hence F(s) = δ(s − 1). Using this relation in (34) we find P_∞(τ) = δ(τ − 1), which is the correct limiting distribution of the average arrival time of a Poisson process in the limit that N → ∞. The effect of using something other than the independent copula is to broaden the distribution (as we observe from the variance being non-zero in (30)). The expression (34) is then useful for understanding the asymptotic behaviour of P(τ) from the characteristics of the Laplace transform, for example. We hope to explore this in a later publication, as well as the case that N is large but not infinite in (25).

As a final piece of analysis, we can determine a relatively simple expression for the characteristic function of (34) using the moments of the distribution. In general, the characteristic function of a pdf P(x) can be expressed in series form as

Q(k) = \sum_{m=0}^{\infty} \langle x^m \rangle \frac{(2\pi i k)^m}{m!}. \qquad (36)

Applying this identity to our case together with the results in (28) and (31), we find

Q_{\infty}(k) = -\int_0^{\infty} dt\, f'(t) \sum_{m=0}^{\infty} \frac{(2\pi i k t)^m}{m!\, m!}
= -\int_0^{\infty} dt\, f'(t)\, J_0\!\left(2 e^{-i\pi/4} \sqrt{2\pi k t}\right), \qquad (37)

in terms of which

P_{\infty}(\tau) = \int_{-\infty}^{\infty} dk\, e^{-2\pi i k \tau}\, Q_{\infty}(k). \qquad (38)

The function J_0(e^{-i\pi/4} z) can also be expressed in terms of Kelvin functions (Abramowitz and Stegun 1965). In fact, performing the above Fourier transform leads to the expressions (34), and this constitutes an alternative route to the same result.4

3. Sampling from exchangeable copulas

In this section we present two strategies for sampling from the copula using the results of the previous sections. One strategy is to find a random value of t and then partition it among the ξ (and hence among the x). The first two subsections present two algorithms for randomly sampling a value t, while the third subsection presents an algorithm for partitioning a sampled value of t randomly among the x. In the fourth subsection we present a slightly different strategy which makes use of the infinite-dimensional limit of the τ distribution. In most cases this is probably the most efficient algorithm, but to explain it we find it helpful to start by explaining the first strategy. In all cases we make explicit use of the exchangeability property.

3.1. First algorithm for drawing t

The first algorithm is to integrate (25) in closed form. This only works for relatively simple copulas. One example is the Clayton copula, for which (Marshall and Olkin 1988)

F(s) = \frac{(1/\alpha)^{1/\alpha}}{\Gamma(1/\alpha)}\, e^{-s/\alpha}\, s^{1/\alpha - 1}. \qquad (39)

4 Analysis of the Fourier integral requires some care since, if we interchange the t and k integrals, the integral over k is pure imaginary and divergent. Rather, we break the k integral into positive and negative k domains and introduce a small imaginary component ±iε to τ to render the integrals convergent, the sign of ε depending on the sign of k. The two k integrals can then be found from tables (Oberhettinger 1990, for example). We then sum them and take the limit ε → 0, and the rest follows.


Using this in (25) we find the complement of I_N(t) as

I^*_N(t) = \frac{B\!\left(\frac{1}{1+\alpha t}; N, 1/\alpha\right)}{B(N, 1/\alpha)}, \qquad (40)

where B(z; a, b) is the incomplete beta function and B(a, b) is the complete beta function. This is manifestly normalized, and there are standard techniques to sample from it, as we discuss below.
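In fact (40) can be sampled without rejection or inversion: if y is drawn from a Beta(N, 1/α) distribution, then t = (1/y − 1)/α has exactly the complementary cdf (40). A minimal sketch (this reduction to a Beta draw is our own observation, offered as an illustration):

```python
import random

def draw_t_clayton(n_dim, alpha, rng=random):
    """Sample t whose survival function is (40): with y ~ Beta(N, 1/alpha),
    t = (1/y - 1)/alpha satisfies P(T > t) = B(1/(1+alpha*t); N, 1/alpha) / B(N, 1/alpha),
    since T > t exactly when y < 1/(1 + alpha*t)."""
    y = rng.betavariate(n_dim, 1.0 / alpha)
    return (1.0 / y - 1.0) / alpha

random.seed(1)
ts = [draw_t_clayton(3, 2.0) for _ in range(20000)]
```

The map t = (1/y − 1)/α is decreasing in y, so the Beta cdf evaluated at 1/(1 + αt) is precisely the survival probability of t.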

3.2. Second algorithm for drawing t

For this algorithm we use the complex integral representations of section 2.2. We can, for instance, sample a random uniform deviate r and then find the value of t consistent with it. This technique is suitable for situations where we do not have numerically efficient means of directly determining F(s). As discussed in the appendix, one virtue of having a closed contour representation is that the integral may be relatively inexpensive to calculate, since it converges very rapidly with the number of points used.

We forego presenting the analogous method using the Laplace representation (25), since it explicitly requires F(s). If we know F(s), then there is a superior method, which we present below. If we do not know F(s), then using the Laplace representation would be difficult to implement.

3.3. Partitioning t among the ξ

For purposes of this subsection we imagine that, using one of the previous algorithms, we have chosen a random value of t from its distribution. Given such a random value, we would like to randomly select a set ξ with the constraint that their sum equals t. We can think of this geometrically as picking points randomly on a hyperplane in N-dimensional space, also known as a simplex. Methods for sampling from it can be found in Devroye (1986). One algorithm follows by analogy with sampling from a unit sphere. We make N uncorrelated draws from the simple Poisson pdf exp(−y), calling the draws r_i. We rescale each of them by the ratio of t and their sum. This works because the Poisson distribution has the property ∏_i P(x_i) = P(∑_i x_i), so that the independent multivariate density is a function of the sum of the random variables.

Symbolically, we find random deviates r_i drawn from the Poisson distribution and then define

\xi_i = \frac{r_i}{\sum_{i=1}^{N} r_i}\, t. \qquad (41)

As a final step, we define x_i = f(ξ_i) and we are done. As a result of this algorithm, we have a random draw of values from (5). For many copulas, most of the difficult numerical work is in finding a random value of t, whereas finding the N random values r_i is very fast even for large N.
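The partitioning step can be sketched as follows (the Clayton inverse generator f(z) = (1 + αz)^(−1/α) is our illustrative choice here; the algorithm itself applies to any valid generator):

```python
import random

def partition_t(t, n_dim, alpha, rng=random):
    """Partition a sampled t uniformly on the simplex sum(xi_i) = t,
    equation (41), then map back with x_i = f(xi_i). For illustration,
    f is the Clayton inverse generator f(z) = (1 + alpha*z)**(-1/alpha)."""
    r = [rng.expovariate(1.0) for _ in range(n_dim)]  # draws with pdf exp(-y)
    total = sum(r)
    xi = [t * ri / total for ri in r]                 # rescale so sum(xi) = t
    return [(1.0 + alpha * z) ** (-1.0 / alpha) for z in xi]

random.seed(3)
xs = partition_t(1.5, 4, 2.0)
```

Applying the generator φ(x) = (x^(−α) − 1)/α to the outputs recovers the ξ_i, whose sum is the original t by construction.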

3.4. Third algorithm using the distribution of τ

The expression (34) has an immediate application as the basis for an alternate sampling strategy. The idea is to conceptually imagine that we make an infinite-dimensional draw, but that we only record a number d of the variates. The set we record will perforce be distributed as the d-dimensional copula.

We first make a random draw ω from the pdf F(ω), which we invert as τ = 1/ω, from which we can define t = Nτ. Recall that we have defined t = ∑_{i=1}^{N} ξ_i. In the previous subsection we described how to sample from the constant-t simplex in the space of ξ. It amounted to making one draw from the Poisson distribution for each dimension, that draw being called r_i, and then using (41). We now recall that t = Nτ and also note that the denominator of (41) approaches N in the limit, so we are left with

\xi_i = r_i \tau. \qquad (42)

To recap, we first draw a value ω from F(ω), which we invert to get τ. For each dimension we draw a random variate r_i from the Poisson distribution and multiply it by τ. These will then be random deviates ξ_i. We then use x_i = f(ξ_i) to determine the random deviates in the original defining space of the copula (see note added in proof). In most cases this is probably more efficient than the previous algorithms, since one is saved the necessity of using any integral representations. One may still need to precalculate F(s), supplemented with interpolation, if it is not known in closed form. However, this is a relatively small cost and need only be borne once, independently of the dimensionality or the number of samples required. For copulas where F(s) is known, this method is very fast.

This algorithm is very similar in spirit to one proposed in Frey and McNeil (2003, lemma 4.15). The only difference is that we are working in terms of the variables t and τ, which makes the sampling particularly simple; however, the logic is the same. We also remark that this algorithm is quite similar to sampling from a multivariate Student's t distribution, where one first draws from a χ² distribution, then takes the inverse of the square root of the draw and finally scales that number by draws from a normal distribution.

Another interesting observation is that we can also pursue the conceptual idea encapsulated in (42) to proceed with the formal development. That is, we can define variables ξ_i = r_i τ and, given the distributions of r_i and τ, determine the distribution of the ξ_i. From that we can then determine the distribution of the variables x_i = f(ξ_i). We do not present the details, but it is not difficult to show that one ends up with a Laplace transform of a product which in the end leads back to (5). In a sense, then, we have come full circle, but along the way we have derived some useful alternative expressions for the distribution.

3.5. Numerical results

For all algorithms we require sampling either from the univariate distribution of t or of τ (the latter being directly related to the distribution F(s)). Two basic approaches are using the rejection algorithm (Devroye 1986) and direct inversion of the cdf. The rejection algorithm is the method of choice for many standard univariate distributions such as the gamma and beta distributions (Ahrens and Dieter 1974). In order to use this, one needs to find a covering distribution whose pdf exceeds the pdf of the desired distribution for all values.


The covering distribution should also be simple to sample from. For our purposes we in fact need to find parametric families of covering distributions which work for all choices of parameter and of copula dimensionality (the latter constraint is only relevant for P(t)). For example, for the Clayton copula we have I_N(t) in terms of the beta distribution (40) and F(s) in terms of the gamma distribution (39). For both of these the rejection algorithm is the fastest, and this is what is implemented in almost all statistical numerical packages.

Lacking a covering distribution, one can always proceed by numerical inversion. Namely, one samples from the univariate uniform distribution, calling the deviate r. If we are using algorithms 1 or 2, we invert the cdf by finding t such that

I_N(t) = r. \qquad (43)

For algorithm 3, we invert the cdf so that

I(\tau) = r. \qquad (44)

In the second case, we can work in terms of the distribution G(s) (i.e. the cdf corresponding to the pdf F(s)) by finding

G(s) = r \qquad (45)

and then using τ = 1/s, as is evident from (34). We also remark that since r is uniformly distributed, we can equally well substitute I for I* (or G for G*) in the three expressions above. In order to invert the expressions above we can proceed by Newton iteration, since the derivatives are given by the respective pdf functions. This can be efficient since we can determine the cdf and pdf at the same time for a given choice of variable, and much of the computational effort is common and need not be duplicated. As an example, if we are using (24) and (25) to determine the pdf and cdf respectively, then we need only determine the function value F(s) and the product st once. Similar logic also applies to the Cauchy representation.

To make the Newton method work efficiently, one also requires a scheme to estimate a good first guess for a given choice of r (as well as checks to make sure that we are converging to a solution and, if not, to proceed with some number of bisections). Again, the best guess is always adapted to the specific copula at hand. Understanding the asymptotic behaviour of the distributions in the limits of large and small arguments often allows for reasonable first guesses which can be effectively extended over the entire range of r values. For the Clayton copula, algorithms 1 and 3 are clearly superior to any numerical inversion of the cdf, so it is not worth investing effort in the analysis of best first guesses. For the Gumbel copula, we have determined a number of properties of the function F(s), including efficiently computed representations and asymptotic limits in both the argument s and the parameter α. We do not have space here to present all of these results and plan to present them in a forthcoming publication.
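The inversion scheme just described, Newton iteration safeguarded by bisection, can be sketched generically. The Gamma(2, 1) cdf/pdf pair below is only a stand-in for I_N or G (our choice for the demonstration, because it is available in closed form):

```python
import math

def invert_cdf(r, cdf, pdf, lo=0.0, hi=1e6, x0=1.0, tol=1e-12):
    """Solve cdf(x) = r by Newton iteration, using the pdf as the
    derivative, with a bisection fallback whenever the Newton step
    would leave the current bracket (the safeguard described above)."""
    x = x0
    for _ in range(200):
        f = cdf(x) - r
        if abs(f) < tol:
            return x
        if f > 0.0:          # maintain a bracket [lo, hi] around the root
            hi = x
        else:
            lo = x
        d = pdf(x)
        if d > 0.0:
            xn = x - f / d   # Newton step
            x = xn if lo < xn < hi else 0.5 * (lo + hi)
        else:
            x = 0.5 * (lo + hi)
    return x

# illustrative closed-form cdf/pdf pair: Gamma(2, 1)
def G(s): return 1.0 - (1.0 + s) * math.exp(-s)
def Fpdf(s): return s * math.exp(-s)

median = invert_cdf(0.5, G, Fpdf)
```

Because the pdf is evaluated at the same point as the cdf, any shared intermediate quantities (F(s), the product st, and so on) can be computed once per iteration, as noted in the text.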

For all algorithms, we verified that the procedure worked by testing that the marginal distributions generated from our samples of x are indeed uniform. We also tested dependence among the deviates by use of Kendall's tau statistic (Kendall and Stuart 1979). For the Clayton and Gumbel copulas we have

\tau = \frac{\alpha}{\alpha + 2} \;\; \text{(Clayton)}, \qquad \tau = \frac{\alpha - 1}{\alpha} \;\; \text{(Gumbel)}. \qquad (46)

This is a superior statistic to the standard correlation, since it is zero if and only if the deviates are independent. For the exchangeable copulas, every choice of pairs of indices should have the same value of tau, within statistical error. Parenthetically, we remark that from this statistic we see that the two copula families span the range from complete independence (τ = 0) to complete dependence (τ = 1).

Table 1. Timing results (in seconds) for the Clayton copula with α = 2 and N = 3.

Algorithm   No of samples   Total cpu time (s)   cpu time per sample (s)
1           1 000 000       14.8                 1.48 × 10⁻⁵
2           1 000           15.0                 1.50 × 10⁻²
3           1 000 000       13.5                 1.58 × 10⁻⁵

Table 2. Timing results (in seconds) for the Clayton copula with α = 2 and N = 100.

Algorithm   No of samples   Total cpu time (s)   cpu time per sample (s)
1           1 000 000       208.2                20.8 × 10⁻⁵
2           1 000           665.4                66.5 × 10⁻²
3           1 000 000       207.9                20.8 × 10⁻⁵
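The Kendall tau check of (46) can be reproduced with a small empirical estimate; here we generate Clayton pairs via algorithm 3 of section 3.4 and compare against α/(α + 2) (a sketch with our own helper names; the O(n²) estimator is fine for modest n):

```python
import random

def kendall_tau(xs, ys):
    """Empirical Kendall tau: (concordant - discordant) pair count
    divided by the total number of pairs."""
    n, s = len(xs), 0
    for i in range(n):
        for j in range(i + 1, n):
            a = (xs[i] - xs[j]) * (ys[i] - ys[j])
            s += (a > 0) - (a < 0)
    return 2.0 * s / (n * (n - 1))

# Clayton pairs via algorithm 3; for alpha = 2, (46) gives tau = 1/2
random.seed(11)
alpha, us, vs = 2.0, [], []
for _ in range(2000):
    omega = random.gammavariate(1.0 / alpha, alpha)   # shared frailty
    us.append((1.0 + alpha * random.expovariate(1.0) / omega) ** (-1.0 / alpha))
    vs.append((1.0 + alpha * random.expovariate(1.0) / omega) ** (-1.0 / alpha))
tau_hat = kendall_tau(us, vs)
```

The standard error of the empirical tau at this sample size is roughly 0.015, so agreement to a few hundredths is the most one should expect.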

We now present some timing statistics, namely the amount of time taken per sample using the algorithms presented above on a standard Pentium II 450 MHz processor. We first present timing results for the Clayton copula in table 1. We observe that algorithms 1 and 3 are virtually identical in terms of performance, while algorithm 2 is three orders of magnitude slower. This is both because it uses inversion of the cdf, which is slow, and because of the integral representation of the distribution, which is also slow.

We also want to explore how well these algorithms work when we go to very large dimension. Below we present the timing results for a copula of dimension 100, which would be typical of the number of names in a CDO. The results are presented in table 2, where we see that algorithms 1 and 3 are essentially identical in terms of performance. There is about a factor of 10 loss of speed due to the necessity of making so many Poisson draws for each sample, but that is tolerable. By contrast, algorithm 2 is about forty times slower than with N = 3, which we ascribe to the expressions (21) and (23) becoming harder to work with as N becomes large, since the cost of the Poisson draws is negligible relative to the cost of the draw from I_N(t). This points against its use for still larger N.

We anticipate no problems going to arbitrarily high dimension with either algorithm 1 or 3, except for a linearly increasing time per draw. Since method 1 requires special properties while method 3 is generally applicable for any copula in which we can numerically determine F(s), we would lean towards implementing 3 simply due to its wider applicability. Additionally, it has the conceptual advantage that to obtain sampled values of τ, we sample from F(s) without any regard to the copula dimension N. The information about N is simply encoded in the number of draws we make from a univariate Poisson distribution. By contrast, the first two algorithms encode N directly in the distribution I_N(t). This means that we need to explicitly worry about what happens as N gets large, for example, a consideration not necessary for algorithm 3, for which robust scaling with dimension is manifest.

Table 3. Timing results (in seconds) for the Gumbel copula with α = 2 and N = 3.

Algorithm   No of samples   Total cpu time (s)   cpu time per sample (s)
1           N/A             N/A                  N/A
2           1 000           22.09                2.21 × 10⁻²
3           1 000 000       74.67                7.47 × 10⁻⁵

We next present analogous results for the Gumbel copula in table 3. In this case we do not have recourse to algorithm 1. We note that algorithm 3 uses a numerical inversion of the numerically determined distribution F(s) and is only a factor of four slower than for the Clayton copula, which is sampled using the rejection algorithm. As mentioned, we plan to publish our results on the Gumbel copula and hope to explore whether an even more efficient scheme using rejection can be determined. In the meantime, this is certainly competitive with the Clayton copula. We note that algorithm 2 is slightly slower than for the Clayton copula, but is similar.

Algorithm 2 is the only possibility when we do not know F(s). However, as we have discovered with the Gumbel copula, even when F(s) is not available in closed form, it can still be better to work with a numerical representation of it using algorithm 3 than to use the much slower algorithm 2. Determination of a suitable representation of F(s) obviously requires a specific analysis of the copula in question, but at least in the case of the Gumbel copula it was clearly worth the effort. When we tried to increase the dimensionality, we found that algorithm 2 had problems beyond about 25 dimensions. While this could probably be amended by optimizing the integration contour for large N, it does underline the fact that this algorithm does not scale robustly. By contrast, with algorithm 3 we found a million samples with N = 100 in 277.08 s, which again compares favourably with the Clayton copula. (The fact that the ratio of times is much closer to unity than for N = 3 is evidence that most of the time is spent drawing from the Poisson distribution and not drawing from the F(s) distribution.)

As a final word, then, it would appear that algorithm 3 is the best in terms of conceptual simplicity and speed. It cleanly separates out the sampling of τ from any concerns about the dimensionality of the copula, meaning that it scales well with dimension. It is probably always worth finding an appropriate representation of F(s) from the inverse Laplace transform of f(x), even if it is numerical and not in closed form. When this is not possible, algorithm 2 does work, but is very slow and may fail for large dimension. Algorithm 1 may be competitive with 3 where it is available, but it really only applies to special cases for which (25) can be integrated in closed form.

4. Fully nested copula

One generalization of the multivariate Archimedean copula is given in Joe (1997) and also discussed in Embrechts et al (2001a). It is given by N − 1 distinct generating functions as

C(x_N, \ldots, x_1) = \phi_N^{-1}(\phi_N(x_N) + \phi_N(\phi_{N-1}^{-1}(\phi_{N-1}(x_{N-1}) + \cdots + \phi_2^{-1}(\phi_2(x_2) + \phi_2(x_1)) \cdots))). \qquad (47)

The structure is simple, if awkward to express in equations. We first couple x_1 and x_2. We then couple the copula of x_1 and x_2 with x_3. We then couple that copula with x_4, and so on. We shall refer to this situation as fully nested. In addition to the fact that the successive derivatives of each function φ_i^{−1}(x) need to alternate in sign (as in the exchangeable case), we have further constraints on the functions φ_{i+1}^{−1} ∘ φ_i(x), which we describe below. Also note that the N − 1 copulas will be given by N − 1 parameters. However, in general there are N(N − 1)/2 pairings of variables, and so there is still not enough structure to model all possible mutual dependences amongst the variates. Nevertheless, (47) is more general than (5).

In order to draw from this we could proceed by way of taking repeated conditional draws. The probability function for x_1 is unity, so this is a trivial draw. We then have

I(x_2|x_1) = \frac{\partial C_2(x_2, x_1)}{\partial x_1}, \quad \ldots, \quad
I(x_N|x_{N-1}, \ldots, x_1) = \frac{\partial^{N-1} C_N(x_N, \ldots, x_1)}{\partial x_{N-1} \cdots \partial x_1}. \qquad (48)

The problem with this approach is that the partial derivatives get very complicated very quickly. As an example, for a 3-copula we find

\frac{\partial^2 C_3(x_3, x_2, x_1)}{\partial x_1 \partial x_2}
= f_3''(\phi_3)\, \phi_3'^2(f_2)\, f_2'^2(\phi_2)\, \phi_2'(x_2)\, \phi_2'(x_1)
+ f_3'(\phi_3)\, \phi_3''(f_2)\, f_2'^2(\phi_2)\, \phi_2'(x_2)\, \phi_2'(x_1)
+ f_3'(\phi_3)\, \phi_3'(f_2)\, f_2''(\phi_2)\, \phi_2'(x_2)\, \phi_2'(x_1), \qquad (49)

where we have defined f_i(z) = φ_i^{−1}(z) and, for notational compactness, we use the notation φ_2 = φ_2(x_2) + φ_2(x_1), f_2 = f_2(φ_2) and φ_3 = φ_3(x_3) + φ_3(f_2) for function arguments. For a Gumbel generating function, φ_3(z) = (−log z)^α. As an example of a higher derivative, we would find

\phi_3''(z) = \frac{\alpha(-\log z)^{\alpha - 2}}{z^2}\,(\alpha - 1 - \log z). \qquad (50)

There are expressions of similar complexity for φ_3'^2(z), f_3''(z) and so on. Putting together (49) and (50), it is not hard to see that this approach will become unwieldy for dimensions beyond two or three.

Instead, inspired by the exchangeable case (7), we first define

\xi_i = \phi_i(x_i), \qquad (51)

where for notational consistency we can define φ_1(z) = φ_2(z). These are the fundamental variables we will work with and sample from. The probability distribution of the ξ_i is given by

P(\xi_N, \ldots, \xi_1) = (-1)^N \frac{\partial^N C_N(x_N, \ldots, x_1)}{\partial x_N \cdots \partial x_1}. \qquad (52)

As for (8), we have a factor of −1 per dimension. In terms of the ξ_i, we find

I^*_N(\xi_N, \ldots, \xi_1) = f_N(\xi_N + g_N(\xi_{N-1} + g_{N-1}(\xi_{N-2} + \cdots + g_2(\xi_2 + \xi_1) \cdots))), \qquad (53)

where we have further introduced the 'coupling functions'

g_N(z) = \phi_N \circ f_{N-1}(z). \qquad (54)

The exchangeable copula can be understood as a special case in which all coupling functions are the identity function g_i(x) = x. C_N is a copula only if all the coupling functions g_i belong to the space of functions L^*_∞, where (Joe 1997)

L^*_{\infty} = \{\omega : [0, \infty) \to [0, \infty) \mid \omega(0) = 0,\ \omega(\infty) = \infty,\ (-1)^j \omega^{(j)} \le 0,\ j = 1, \ldots, \infty\}, \qquad (55)

effectively meaning that they map the positive real axis to the positive real axis and have successive derivatives which alternate in sign. The properties of mapping the real axis to the real axis, mapping zero to zero and mapping infinity to infinity follow from the definition of the generating functions. However, the constraint of having alternately signed derivatives is non-trivial, and the generating functions must be chosen carefully so as to satisfy this property. If it is violated, the copula will produce negative density functions. Embrechts et al (2001a) discuss this for the Gumbel copula and show that this additional constraint is satisfied if the successive generating functions φ_i(x) have parameters α_i which satisfy α_1 ≥ α_2 ≥ ⋯ ≥ α_N, meaning the degree of dependence is greatest for the most deeply nested variates in (47).

We now set out to determine the distribution function of ξ_N, given all of the other variables. In order to do so, we introduce one more change of variables. We invoke the trivial map from ξ_i to ξ_i for all i but N − 1; to replace ξ_{N−1} we define

t = \xi_{N-1} + g_{N-1}(\xi_{N-2} + \cdots + g_2(\xi_2 + \xi_1) \cdots). \qquad (56)

The merit of this change of variables is that I^*_N only depends on ξ_N and t, so that the derivatives become simpler to compute:

\frac{\partial I^*_N}{\partial \xi_N} = \frac{\partial I^*_N}{\partial \xi_N}, \qquad
\frac{\partial I^*_N}{\partial \xi_{N-1}} = \frac{\partial I^*_N}{\partial t}, \qquad
\frac{\partial I^*_N}{\partial \xi_{N-2}} = \frac{\partial t}{\partial \xi_{N-2}} \frac{\partial I^*_N}{\partial t}, \quad \ldots, \quad
\frac{\partial I^*_N}{\partial \xi_1} = \frac{\partial t}{\partial \xi_1} \frac{\partial I^*_N}{\partial t}. \qquad (57)

The various partial derivatives h_i(ξ) ≡ ∂t/∂ξ_i have the useful property that they depend only on ξ_1 to ξ_{N−2}, and on neither t nor ξ_N. Therefore successive differentiations with respect to t do not affect them. We can then write

P(\xi_N, \ldots, \xi_1) = (-1)^N \left(\prod_i h_i(\xi)\right) \frac{\partial}{\partial \xi_N} \frac{\partial^{N-1}}{\partial t^{N-1}} f_N(\xi_N + g_N(t)). \qquad (58)

First, we integrate over \int_0^{\infty} d\xi_N, which is trivial since (58) is a perfect derivative in terms of ξ_N, so that

P(\xi_{N-1}, \ldots, \xi_1) = (-1)^{N-1} \left(\prod_i h_i(\xi)\right) \frac{\partial^{N-1}}{\partial t^{N-1}} f_N(g_N(t)). \qquad (59)

The ratio of the previous two expressions is the conditional density

P(\xi_N | \xi_{N-1}, \ldots, \xi_1) = -\frac{\dfrac{\partial}{\partial \xi_N} \dfrac{\partial^{N-1}}{\partial t^{N-1}} f_N(\xi_N + g_N(t))}{\dfrac{\partial^{N-1}}{\partial t^{N-1}} f_N(g_N(t))}. \qquad (60)

We can then integrate this expression with respect to ξ_N to determine the conditional cdf. Again, this is a perfect derivative and is trivially integrated:

I^*(\xi_N | \xi_{N-1}, \ldots, \xi_1) = \frac{\dfrac{\partial^{N-1}}{\partial t^{N-1}} f_N(\xi_N + g_N(t))}{\dfrac{\partial^{N-1}}{\partial t^{N-1}} f_N(g_N(t))}. \qquad (61)

This is manifestly normalized since it equals unity for ξ_N = 0. Equation (61) is the main result of this section. However, it does merit some discussion. Firstly, we can see that the right-hand side is only a function of ξ_N and t. However, we can also interpret t in terms of the copula C_{N−1} as t = φ_{N−1}(C_{N−1}). In other words, (61) gives the distribution of ξ_N conditional on the copula function of the previous draws, which is an appealing recursive property. Second, we can simplify the denominator through the relation f_N(g_N(t)) = f_N ∘ g_N(t) = f_{N−1}(t). In fact, applying this same identity to (59), it is not difficult to see that that equation is consistent with the variables ξ_1 to ξ_{N−1} having the copula C_{N−1}, as they must.

The algorithm, then, is that we first draw a value x_1 randomly from zero to unity. We then determine ξ_1 and from that determine a random draw ξ_2 using (61). We then calculate x_2 = f_2(ξ_2) and the value t = φ_2(C_2(x_1, x_2)). We can then make a conditional draw for ξ_3 conditional on the value of t, using (61). We continue until we have N drawn values.
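For two dimensions, (61) collapses to I*(ξ_2|ξ_1) = f′(ξ_1 + ξ_2)/f′(ξ_1), which can be inverted by simple bisection. A sketch for the Gumbel generator (our illustration only; for higher derivatives the text instead invokes the Cauchy representation below):

```python
import math
import random

def gumbel_pair(alpha, rng=random):
    """Two-dimensional draw using the N = 2 case of (61):
    I*(xi2 | xi1) = f'(xi1 + xi2) / f'(xi1), with the Gumbel
    inverse generator f(t) = exp(-t**(1/alpha))."""
    f = lambda t: math.exp(-t ** (1.0 / alpha))
    df = lambda t: -(1.0 / alpha) * t ** (1.0 / alpha - 1.0) * f(t)

    x1 = rng.random() or 0.5           # guard the measure-zero draw 0.0
    xi1 = (-math.log(x1)) ** alpha     # xi1 = phi(x1)
    r = rng.random() or 0.5
    ratio = lambda xi2: df(xi1 + xi2) / df(xi1)  # decreases from 1 to 0
    lo, hi = 0.0, 1.0
    while ratio(hi) > r:               # bracket the root of ratio = r
        hi *= 2.0
    for _ in range(100):               # bisect
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if ratio(mid) > r else (lo, mid)
    return x1, f(0.5 * (lo + hi))

random.seed(5)
pairs = [gumbel_pair(2.0) for _ in range(3000)]
```

Since r and 1 − r are equally uniform, inverting the complementary cdf I* directly is legitimate, exactly as remarked in section 3.5.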

As mentioned, an alternative algorithm for two dimensions is presented in Embrechts et al (2001a). It is distinct from what we have derived here, but does share the property that one first makes a draw for x_1 and then makes a draw for x_2 conditional on x_1. Note that in two dimensions there is no distinction between exchangeable and nested Archimedean copulas, so the need for distinct algorithms only arises in more than two dimensions.

Unfortunately, this algorithm still requires determining high-order derivatives in (61). There is nevertheless an advantage over directly working with the distribution in terms of the original variables x. Namely, we only need to concern ourselves with derivatives with respect to one variable, t, and not with respect to all of the x_1, …, x_{N−1}. We believe that (61) represents the most parsimonious sampling algorithm, and that the complication of having to determine the high-order derivatives as a function of t is an unavoidably intrinsic property of this class of copulas. Possible methods for calculating the required derivatives include using a symbolic manipulation package with automatic code generation and using 'automatic differentiation' to generate code for derivatives (for reviews see Rall 1981 and Corliss 2002).

Table 4. Timing results (in seconds) for the fully nested algorithm using the Gumbel copula with α = 2.

Dimension   No of samples   Total cpu time (s)   cpu time per sample (s)
2           1 000           10.5                 1.05 × 10⁻²
3           1 000           27.7                 2.77 × 10⁻²
4           1 000           41.9                 4.19 × 10⁻²
5           1 000           58.1                 5.81 × 10⁻²

One can also use the Cauchy formula, supplemented with asymptotic relations for the derivatives in the limits of large and small arguments. Specifically, we can write

I^*(\xi_N | t) = \frac{(N-1)!}{2\pi i\, \dfrac{\partial^{N-1}}{\partial t^{N-1}} f_{N-1}(t)} \oint dz\, \frac{f_N(\xi_N + g_N(z))}{(z - t)^N} \qquad (62)

together with a corresponding expression for P(t), its derivative. We invoked the last method to confirm that this algorithm works for the Gumbel copula. In table 4 we present timing statistics for a selection of dimensions. We observe that for two or three dimensions it is competitive with algorithm 2 of the exchangeable copula case. The time is roughly linear in dimension up to dimension 5. Beyond that we found it to be very slow, presumably due to the increasing complexity of (62) with dimension. With effort, this could be improved upon, for example by better adapted integration contours. Nevertheless, this is indicative that this method would probably be difficult to scale beyond ten or twenty dimensions. An interesting direction to study, if one wanted to use these nested copulas for large dimension, is whether one can find appropriate covering distributions parametrized by N and t and the relevant copula parameters, in order to make use of the rejection algorithm.

5. Partially nested copula

There are alternative multivariate extensions to the fully nested copula of the previous section. They can be understood as composites of the exchangeable copula and the fully nested copula. The lowest dimension in which there is a distinct copula of this class is four, for which the copula function is

C(x_4, x_3, x_2, x_1) = \phi^{-1}(\phi(\phi_{12}^{-1}(\phi_{12}(x_1) + \phi_{12}(x_2))) + \phi(\phi_{34}^{-1}(\phi_{34}(x_3) + \phi_{34}(x_4)))). \qquad (63)

Again, the equation looks complicated although the logic is straightforward. We first couple the two pairs x_1, x_2 and x_3, x_4 with distinct copulas generated by φ_12 and φ_34 respectively. We then couple the two copula functions using a third generating function φ. Joe (1997) discusses how this structure and that of the fully nested copula can be understood from distinct multivariate Laplace transforms. This distribution is exchangeable between x_1 and x_2 and also between x_3 and x_4, and for that reason can be understood as intermediate between the fully exchangeable copula and the fully nested copula. Like (47) for N = 4, (63) is generated by three distinct generating functions.

It is not difficult to imagine other patterns of nesting in higher dimensions. It can be notationally overwhelming to attempt to express symbolically the most general case. Nevertheless, in the following section we do present one particular choice of nesting to reflect hierarchical structure among the random deviates. Rather than writing an algorithm for the most general case of partial nesting, we shall focus on the particular multivariate distribution (63). The pattern of how to handle more complicated nestings should then be clear.

We first define ξ_i = φ_12(x_i) for i = 1, 2 and ξ_i = φ_34(x_i) for i = 3, 4, and then t = ξ_1 + ξ_2 and s = ξ_3 + ξ_4, as well as u_t and u_s, which span the constant-t and constant-s surfaces respectively. Appropriately enough, our approach for this problem is a composite of the approaches for the exchangeable copula and the fully nested copula. As for the exchangeable copula, we seek a strategy to make random draws of the variables s and t, from which we can then trivially determine random deviates of ξ_i and hence of x_i. As for the fully nested copula, we do this using conditional arguments; we first draw a value of t from the two-dimensional copula of x_1 and x_2 and then draw a value of s conditional on it. We now present the details.

First we define the functions f(x) = φ^{−1}(x) (and similarly for f_12(x) and f_34(x)) and coupling functions

g_{12}(x) = \phi \circ f_{12}(x), \qquad g_{34}(x) = \phi \circ f_{34}(x), \qquad (64)

in terms of which

C(x_4, x_3, x_2, x_1) = f(g_{12}(t) + g_{34}(s)). \qquad (65)

There are constraints on the coupling functions similar to those in the fully nested copula (Joe 1997).

The joint pdf of the ξ variables is the fourth derivative of the copula function with respect to the ξ variables (as for both the exchangeable and fully nested copulas). Under the change of variables to t, s, u_t and u_s, this becomes

P(t, s, u_t, u_s) = \frac{\partial^2}{\partial t^2} \frac{\partial^2}{\partial s^2} f(g_{12}(t) + g_{34}(s)). \qquad (66)

As for the exchangeable copula, even though this does not depend on the u variables, we still get non-trivial factors of t and s when integrating over their measures to determine the marginal distribution of just t and s. That is,

P(t, s) = t s\, \frac{\partial^2}{\partial t^2} \frac{\partial^2}{\partial s^2} f(g_{12}(t) + g_{34}(s)). \qquad (67)


From the discussion in the exchangeable section, we know that the marginal distribution for t alone is P(t) = t ∂²f12(t)/∂t². Therefore the conditional density is given by the ratio P(t, s)/P(t), which can then be integrated with respect to s to determine the complement of I(s|t) as

\[
I^{*}(s|t) = \frac{1}{f_{12}''(t)} \left( \frac{\partial^2}{\partial t^2} f(g_{12}(t) + g_{34}(s)) - s\, \frac{\partial^3}{\partial s\, \partial t^2} f(g_{12}(t) + g_{34}(s)) \right). \tag{68}
\]

Using (64), we observe that this equals unity for s = 0 and hence is normalized. In practice, we can expand all of the derivatives using the chain and product rules, or evaluate them in some numerical manner.
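As an illustration of the numerical route, the sketch below evaluates I*(s|t) of (68) by finite differences for a nested Gumbel choice of generators. The generators, parameter values and step sizes are our assumptions for illustration, not choices made in the paper; for an outer generator φ(x) = (−ln x)^θ and an inner generator φ12(x) = (−ln x)^θ12 the coupling function takes the simple closed form g12(t) = t^(θ/θ12).

```python
import math

# Illustrative nested-Gumbel parameters (our choice, with theta <= theta12, theta34
# so that dependence is stronger within pairs than between them).
theta, theta12, theta34 = 1.5, 2.0, 3.0

def f(x):                       # f = phi^{-1} for the outer Gumbel generator
    return math.exp(-x ** (1.0 / theta))

def f12(t):                     # f12 = phi12^{-1}
    return math.exp(-t ** (1.0 / theta12))

def g12(t):                     # g12 = phi o f12 = t**(theta/theta12)
    return t ** (theta / theta12)

def g34(s):
    return s ** (theta / theta34)

def d2_dt2(t, s, h=1e-4):
    """Central second difference in t of f(g12(t) + g34(s))."""
    F = lambda u: f(g12(u) + g34(s))
    return (F(t + h) - 2.0 * F(t) + F(t - h)) / h ** 2

def d3_dsdt2(t, s, h=1e-3):
    """Central first difference in s of the second t-derivative."""
    return (d2_dt2(t, s + h) - d2_dt2(t, s - h)) / (2.0 * h)

def f12_pp(t, h=1e-4):
    return (f12(t + h) - 2.0 * f12(t) + f12(t - h)) / h ** 2

def I_star(s, t):
    """Complement of the conditional cdf I(s|t), equation (68)."""
    return (d2_dt2(t, s) - s * d3_dsdt2(t, s)) / f12_pp(t)
```

As a complementary cdf should, I_star(s, t) starts near unity for small s and decays monotonically towards zero; drawing s given t then amounts to solving I_star(s, t) = 1 − u for a uniform deviate u, for example by bisection.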

The algorithm is then to draw a value of t as in the exchangeable section. Next, draw a value of s conditional on the drawn value of t using the conditional cdf above. Next, find values of ξi consistent with the sums equalling t and s (as for the exchangeable copula discussed in section 3.3). Finally, map the values ξi to the corresponding xi.

6. Hierarchical copulas

We end this paper on a more speculative note. The different nesting combinations discussed in the previous two sections suggest that a hierarchical structure among the random variates can be naturally implemented in the copula function itself. To make this explicit we use an example from finance, but the same ideas apply to any application with a hierarchical structure.

It is common to categorize obligors by their credit rating and industrial sector. For example, there is a growing market in tranches of collateralized loan obligations, in which different counterparties assume (for a fee) the credit risk associated with default of some fraction of a large basket of names. The pay-off function for this derivative is strongly dependent on the assumed dependence structure of the large basket. Rather than treating every obligor in a name-specific way, we can categorize them by credit rating and industrial sector. Then we can treat all obligors with the same rating and sector using an exchangeable copula function. This can be expressed symbolically as

\[
C_{ij} = \phi_{ij}^{-1}\left( \sum_k \phi_{ij}(x_{ijk}) \right), \tag{69}
\]

where i labels the sector, j labels the rating and k is a dummy index labelling the obligors within a given rating/sector combination. We have defined a rating/sector-specific generator φij(x).

We would then want to combine across ratings for a given sector (alternately we could combine across sectors for a given rating; the logic is the same and this amounts to a modelling choice). We can do this with a new set of generators ψi(x), one per sector, so that the copula for all obligors within a given sector would be

\[
C_i = \psi_i^{-1}\left( \sum_j \psi_i(C_{ij}) \right). \tag{70}
\]

The dependence on the variates x is implicitly contained in the copula functions Cij.

The copula for the entire basket is then obtained by combining across all sectors. We assume that the generating function for this is χ(x), so that we have

\[
C = \chi^{-1}\left( \sum_i \chi(C_i) \right). \tag{71}
\]

This choice of multivariate copula has both nesting and exchangeable elements. There are constraints on the various generating functions analogous to (55), but presumably they satisfy the reasonable property of requiring less dependence as we consider higher orders of nesting, as we remarked upon for the fully nested Gumbel copula. Obviously these questions need study, but this would go beyond the scope of the present paper. Whether this hierarchical copula has enough structure to accurately model the behaviour of a basket is an interesting question. For example, it assumes that within a rating/sector combination all obligors are identically correlated (more generally, are exchangeable). It also assumes that across different sectors and ratings the dependences are the same: all industrial obligors are correlated the same with all financial obligors and with all energy obligors, for example. The ideas of this paper can be readily generalized to allow for sampling from these distributions.
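The three-level construction (69)–(71) is straightforward to evaluate numerically. The sketch below uses Gumbel generators at every level; the basket layout (2 sectors × 2 ratings × 3 obligors) and all parameter values are hypothetical choices of ours for illustration, with the nesting condition expressed as less dependence higher in the tree, κ ≤ ηi ≤ θij.

```python
import math

def gumbel(theta):
    """Gumbel generator phi and its inverse; theta = 1 is independence."""
    phi = lambda u: (-math.log(u)) ** theta
    phi_inv = lambda t: math.exp(-t ** (1.0 / theta))
    return phi, phi_inv

# Hypothetical basket: 2 sectors (i) x 2 ratings (j) x 3 obligors (k),
# with dependence strongest deepest in the tree: kappa <= eta_i <= theta_ij.
kappa = 1.2                       # across sectors, equation (71)
eta = [1.5, 1.8]                  # across ratings within a sector, equation (70)
theta = [[2.0, 2.5], [2.2, 3.0]]  # within a rating/sector cell, equation (69)

def hierarchical_copula(x):
    """Evaluate C at uniforms x[i][j][k] via equations (69)-(71)."""
    chi, chi_inv = gumbel(kappa)
    outer = 0.0
    for i, sector in enumerate(x):
        psi, psi_inv = gumbel(eta[i])
        inner = 0.0
        for j, cell in enumerate(sector):
            phi, phi_inv = gumbel(theta[i][j])
            C_ij = phi_inv(sum(phi(u) for u in cell))   # equation (69)
            inner += psi(C_ij)
        C_i = psi_inv(inner)                            # equation (70)
        outer += chi(C_i)
    return chi_inv(outer)                               # equation (71)
```

Since every generator here is Gumbel, the resulting C lies between the independence value (the product of all arguments) and the Fréchet upper bound (the minimum of the arguments).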

This structure generalizes in an obvious manner. For example, if we wished to add a new attribute, such as the country where the obligor is based, we would just add one more index to the obligors and introduce one more level of hierarchy in the nesting of the copulas. We can also impose a hierarchical structure directly on the correlation matrix of a Gaussian copula, and it would be interesting to compare the relative performance of the two approaches. In a related vein, an alternative method for lessening the number of parameters and the effective dimensionality is to use a factor copula (Laurent and Gregory 2002), and it would be interesting to explore the degree of overlap between that approach and the one outlined here. We plan to explore these ideas more fully in a later publication.

7. Conclusion

In this paper we have introduced three algorithms for sampling from exchangeable Archimedean copulas. In the course of doing this we have found several integral representations for the distribution of the copula function, as well as a generating function expansion. We also derived an expression for the copula distribution function in the infinite-dimensional limit which directly relates it to the distribution whose Laplace transform yields the generating function. We hope to expand on this in a subsequent paper, including developing various asymptotic arguments for large and small arguments, as well as for large but not infinite dimension. Already we see the utility of the infinite-dimensional limit in providing a particularly elegant sampling algorithm in section 3.4. This was the algorithm we decided upon as the best technique, due to its separation of the sampling of τ from any consideration of the dimensionality of the problem.


We also provided analysis and algorithms for sampling from inexchangeable Archimedean copulas by working in terms of new variables. These algorithms still require direct determination of high-order derivatives of the generating functions, but only in terms of one variable. Therefore they are simpler than direct use of consecutive conditional draws of the original xi variables, and make use of what we believe is the simplest possible structure for purposes of sampling.

We have also suggested a class of Archimedean copulas which reflects the hierarchical structure among the random variates, and discussed a possible application of this idea in the area of credit derivatives.

Acknowledgments

I wish to thank Jeff Boland, Tom Hurd, Harry Joe and Shaloub Razak for many useful comments and discussions.

Note added in proof. This algorithm is equivalent to one which can be found in Marshall and Olkin (1988); see also the discussion in Schonbucher (2003, equation (10.44)).
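For context, the Marshall–Olkin construction referred to above can be sketched for the exchangeable Gumbel case: draw a positive stable variate V whose Laplace transform is the inverse generator, then set xi = f(Ei/V) for independent standard exponentials Ei. The Chambers–Mallows–Stuck formula used for the stable draw, and the parameter value, are illustrative choices of ours, not details given in the paper.

```python
import math
import random

random.seed(20040339)   # fixed seed so the sketch is reproducible
theta = 2.0             # Gumbel parameter; alpha = 1/theta lies in (0, 1)
alpha = 1.0 / theta

def positive_stable():
    """Chambers-Mallows-Stuck draw of S with E[exp(-t*S)] = exp(-t**alpha)."""
    u = random.uniform(0.0, math.pi)
    w = random.expovariate(1.0)
    return (math.sin(alpha * u) / math.sin(u) ** (1.0 / alpha)) * \
           (math.sin((1.0 - alpha) * u) / w) ** ((1.0 - alpha) / alpha)

def gumbel_sample(n):
    """One draw (x_1, ..., x_n) from the n-dim exchangeable Gumbel copula."""
    v = positive_stable()
    # x_i = f(E_i / v) with f(t) = exp(-t**alpha) the inverse generator
    return [math.exp(-(random.expovariate(1.0) / v) ** alpha) for _ in range(n)]

pairs = [gumbel_sample(2) for _ in range(20000)]
```

Each marginal of the draws is uniform on (0, 1), and for θ > 1 the pairs exhibit positive dependence: the fraction falling jointly below the median exceeds the independent value of 1/4.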

Appendix. Numerical determination of contour integrals

We have expressed a number of our results in terms of contour integrals. Despite the fact that determination of contour integrals is a relatively straightforward numerical task, we are not aware of its being extensively discussed in standard texts on numerical methods; a discussion of a specific application to a non-closed contour can be found in Press et al (1992). Therefore, for completeness, we briefly discuss how to do this. Imagine we wish to perform the following contour integral:

\[
I = \oint_{\mathcal{C}} dz\, f(z) \tag{A.1}
\]

over some closed contour C. The first point is that we can always parametrize the contour as z(t), a complex function of a real parameter t. Without loss of generality, t can be defined between 0 and 1. We then find

\[
I = \int_0^1 dt\, \frac{dz}{dt}\, f(z(t)) = \int_0^1 dt\, F(t), \tag{A.2}
\]

where we have defined the function F(t) = (dz/dt) f(z(t)). We can then apply standard methods of numerical integration, with the proviso that the integrand is complex; this is a relatively minor complication.

It is worth noting, however, that provided the contour crosses no singularities (which is almost certain to be true in any practical application) and is an analytic function of t, then F(t) is an analytic and periodic function of t. In that event the best way to integrate over t is to approximate the integral as a discrete sum at equally spaced points t which, by an extension of the Euler–Maclaurin summation formula, converges exponentially with the number of sampled points. In particular,

\[
I_N \equiv \frac{1}{N} \sum_{i=0}^{N-1} F\!\left(\frac{i}{N}\right) \tag{A.3}
\]

approaches I faster than any reciprocal power of N as N → ∞ (Hirayama 2001). This is a particularly nice approach due to the recursion formula

\[
I_{2N} = \frac{1}{2}\left( I_N + \frac{1}{N} \sum_{i=0}^{N-1} F\!\left(\frac{i + 1/2}{N}\right) \right). \tag{A.4}
\]

(This just interleaves the set of points used in determining I_N with a new set spaced midway between.) We can start at some reasonably small value of N, such as 4, and progressively double N. We stop when the sum has converged to within our desired tolerance. In practice a relatively small handful of function evaluations should be sufficient to determine the integral to typical desired accuracies. This approach can be further supplemented with 'Richardson's deferred approach to the limit', where at each choice of 1/N we use all previously determined values to extrapolate the result to 1/N → 0; this is the idea behind Romberg integration and the Bulirsch–Stoer method for integrating ordinary differential equations (Press et al 1992).

What remains is to choose the contour which can be integrated most efficiently and to choose the parametrization z(t). There are no clear-cut choices and the best choice is presumably specific to the problem at hand. We therefore outline some general principles which can be considered. We want to avoid any kinks in the contour, or else we lose the property that the integrand of (A.2) is analytic in t. It is then not too hard to see that, even if analytic, to the extent that the contour has large curvature the convergence will be slower. This suggests that a circular contour will typically be the best choice, if this is possible given the topology of the contour. Convergence will typically be fastest if F(t) has as little structure as possible, meaning no sharp changes in amplitude over narrow ranges of t. For this reason, we will also typically want any pole to be at the centre of the contour, assuming there is just one pole. If there are multiple poles we choose a point somewhere in the middle of them.

We must then choose the radius of the contour. If we make it too small, we risk having large magnitudes in the sum (A.3), thereby requiring precise cancellation amongst the various terms and leading to a possible loss of numerical precision. On the other hand, we do not want to allow it to be so large that the contour approaches other singularities in the complex plane. This is best explored numerically, as there is little of any generality that can be said beyond this. One reassuring result is that the method is very robust, provided one keeps the contour away from any singularities and does not let it get too large. As an example, for the Cauchy integral (21), a good choice is to centre the contour at z = t and to select a radius of Nt/(N + 1). That choice of radius assures us that the integrand is close to its minimum


value when the contour crosses the real axis for z < t. This minimizes the possibility of large fluctuations in magnitude as we traverse the contour, thereby minimizing the computational effort.

Once we have selected the contour, we are left with the choice of determining z(t). Assuming that it is a circular contour, in general a robust choice will be for the angular position in the complex plane to be linear in t, so that z(t) = z0 + R exp(2πit), where z0 is the centre of the contour and R is the radius; this is the parametrization we have used in the work in this paper. However, even for a given contour, other choices of parametrization could possibly be more efficient; this is a detailed question which would be specific to the problem at hand.

Integrals which are not closed but extend to infinity, such as those arising from inverse Laplace transforms, can be mapped to either a finite or an infinite integral over a real parameter t. This is then a standard problem in integration. Questions about the choice of contour and of the parametrization z(t) still remain. For example, it is often best to select a contour on which the integrand, or some chosen part of the integrand, has constant phase.

References

Abramowitz M and Stegun I (ed) 1965 Handbook of Mathematical Functions (New York: Dover)
Ahrens J and Dieter U 1974 Computer methods for sampling from gamma, beta, Poisson and binomial distributions Computing 12 223–46
Barbe P, Genest C, Ghoudi K and Remillard B 1996 On Kendall's process J. Multivariate Anal. 58 197–229
Bouye E, Durrleman V, Nikeghbali A, Riboulet G and Roncalli T 2000 Copulas for finance: a reading guide and some applications Groupe de Recherche Operationnelle, Credit Lyonnais Preprint
Corliss G, Faure C, Griewank A and Hascoet L (ed) 2002 Automatic Differentiation of Algorithms (Berlin: Springer)
Devroye L 1986 Non-Uniform Random Variate Generation (New York: Springer)
Embrechts P, Lindskog F and McNeil A 2001a Modelling dependence with copulas and applications to risk management Preprint ETH
Embrechts P, McNeil A and Straumann D 2001b Correlation and dependence in risk management: properties and pitfalls Risk Management: Value at Risk and Beyond ed M Dempster and H Moffat (Cambridge: Cambridge University Press)
Frey R and McNeil A J 2003 Dependent defaults in models of portfolio credit risk, in preparation
Genest C and Rivest L 1993 Statistical inference procedures for bivariate Archimedean copulas J. Am. Stat. Assoc. 88 1034–43
Hirayama H 2001 Numerical integration on the complex plane IPSJ SIG Notes Numerical Analysis Abstract No 022-003
Hoffding W 1940 Masstabinvariante Korrelationstheorie Schriften des Mathematischen Seminars und des Instituts fur Angewandte Mathematik der Universitat Berlin 5 181–233
Joe H 1997 Multivariate Models and Dependence Concepts Monographs on Statistics and Applied Probability No 37 (London: Chapman and Hall)
Kendall M and Stuart A 1979 Handbook of Statistics (London: Griffin and Company)
Kimberling C 1974 A probabilistic interpretation of complete monotonicity Aequationes Mathematicae 10 152–64
Laurent J-P and Gregory J 2002 Basket default swaps, CDOs and factor copulas (available at http://laurent.jeanpaul.free.fr)
Li D X 2000 On default correlation: a copula function approach J. Fixed Income 9 43–54
Marshall A W and Olkin I 1988 Families of multivariate distributions J. Am. Stat. Assoc. 83 834–41
Nelsen R 1999 An Introduction to Copulas Lecture Notes in Statistics No 139 (New York: Springer)
Oberhettinger F 1990 Tables of Fourier Transforms and Fourier Transforms of Distributions (Berlin: Springer)
Press W, Teukolsky S A, Vetterling W T and Flannery B P 1992 Numerical Recipes in C 2nd edn (Cambridge: Cambridge University Press)
Rall L B 1981 Automatic Differentiation: Techniques and Applications (Lecture Notes in Computer Science vol 120) (Berlin: Springer)
Razak S 2003 Default intensities from copulas, in preparation
Schoenberg I J 1938 Metric spaces and completely monotone functions Ann. Math. 39 811–41
Schonbucher P J 2003 Credit Derivatives Pricing Models (Chichester: Wiley)
Schonbucher P J and Schubert D 2001 Copula-dependent default risk in intensity models Preprint, Department of Statistics, Bonn University
Sklar A 1959 Fonctions de repartition a n dimensions et leurs marges Publications de l'Institut de Statistique de l'Universite de Paris 8 229–31
