
Copulas and their Alternatives

In a sense it is not possible to avoid copulas when dealing with multiple variables, because the joint distribution F(x1,…,xn) can always be expressed as a copula applied to the marginal distributions, that is, F(x1,…,xn) = C[F1(x1),…,Fn(xn)]. This statement is known as Sklar's Theorem, and in the continuous case it is easily shown by defining C(u1,…,un) = F[F1^{-1}(u1),…,Fn^{-1}(un)]. However, that does not mean that choosing one of the standard copulas is the only way to generate a joint distribution.

An alternative approach is to specify marginal and conditional distributions. Switching to the bivariate case for notational simplicity, if F(x,y) = C[FX(x),FY(y)], then the conditional distribution of Y|X=x is given by FY|X(y) = C1[FX(x),FY(y)], where C1 is the derivative of C with respect to its first argument. This shows that conditional distributions come directly out of copulas.

But the conditional and marginal densities also directly define the joint density: f(x,y) = fY|X(y)fX(x). If you define the joint and marginal distributions to be well-known distributions, you might get an implied copula that is not on the usual list. However, there are some exceptions: the normal and t-copulas are defined based on the normal and t distributions.

As an example, for independent distributions, F(x,y) = FX(x)FY(y), so the independence copula is C(u,v) = uv. Then C1(u,v) = v, so FY|X(y) = FY(y) is the conditional distribution of y given x for any x.

Another example is the bivariate Burr distribution. Suppose X and Y are Burr with survival functions S(x) = 1 – F(x) given by S(x) = (1 + (x/b)^p)^{-a} and S(y) = (1 + (y/d)^q)^{-a}. These have the same outer shape parameter a, but each has its own tail strength, given by ap or aq. The heavy right tail copula with parameter a > 0 is:

C(u,v) = u + v – 1 + [(1 – u)^{-1/a} + (1 – v)^{-1/a} – 1]^{-a}.

This copula has fairly high upper tail dependence.

With the same parameter a as for the two Burrs, the joint Burr distribution is:

F(x,y) = 1 – (1 + (x/b)^p)^{-a} – (1 + (y/d)^q)^{-a} + [1 + (x/b)^p + (y/d)^q]^{-a}.

The conditional distribution of Y|X=x is also Burr in this case and is given by:

FY|X(y|x) = 1 – [1 + (y/dx)^q]^{-(a+1)}, where dx = d[1 + (x/b)^p]^{1/q}.

By varying p, q, and a, any desired degree of tail strength can be specified for x, y, and the conditional distribution.
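Because both the marginal of X and the conditional of Y|X=x invert in closed form, the joint Burr can be simulated by the conditional method: draw X by inverting S(x), then draw Y by inverting FY|X. A minimal sketch in Python (the parameter values are illustrative assumptions, not from the text):

import numpy as np

# Illustrative parameters (assumed for this sketch)
a, p, q, b, d = 2.0, 1.5, 1.2, 1000.0, 500.0

rng = np.random.default_rng(42)
n = 10_000

# Draw X from its Burr marginal by inverting S(x) = (1 + (x/b)^p)^(-a);
# u is used as a survival probability, which is also uniform(0,1)
u = rng.uniform(size=n)
x = b * (u ** (-1.0 / a) - 1.0) ** (1.0 / p)

# Draw Y from the conditional Burr F(y|x) = 1 - [1 + (y/dx)^q]^(-(a+1)),
# with dx = d * [1 + (x/b)^p]^(1/q)
dx = d * (1.0 + (x / b) ** p) ** (1.0 / q)
v = rng.uniform(size=n)
y = dx * ((1.0 - v) ** (-1.0 / (a + 1.0)) - 1.0) ** (1.0 / q)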

However, such a coincidence, where a joint distribution built from standard conditional distributions also matches a well-known copula, is unusual. Thus specifying marginal and conditional distributions is an alternative way to build up multivariate distributions.

For instance, suppose loss is Pareto distributed with S(x) = (1 + x/150)^{-2.5}, and given a loss of size x, loss expense is distributed with S(y|x) = (1 + 2y/(x+200))^{-4}. The product of the densities of these functions describes a joint distribution, but the implied copula might not be widely studied.
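This pair also simulates directly by inverting the two survival functions. A sketch (again treating uniform draws as survival probabilities, since U and 1 – U have the same distribution):

import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Loss X: invert S(x) = (1 + x/150)^(-2.5) to get x = 150*(u^(-1/2.5) - 1)
u = rng.uniform(size=n)
x = 150.0 * (u ** (-1.0 / 2.5) - 1.0)

# Expense Y given X = x: invert S(y|x) = (1 + 2y/(x+200))^(-4)
# to get y = 0.5*(x + 200)*(v^(-1/4) - 1)
v = rng.uniform(size=n)
y = 0.5 * (x + 200.0) * (v ** (-1.0 / 4.0) - 1.0)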

Another way to generate multivariate distributions is direct modeling of the process that is believed to be generating the correlations. For instance, in modeling the cost of hurricanes, a model of the wind strengths and properties exposed might be used to generate multiple storm scenarios. This could result in correlations among losses in different classes of business, even though the multivariate distribution is not specified in advance. A similar situation might arise for corporate bond defaults, where a copula could be chosen that will sometimes produce a large number of defaults in a specified time period, but a model of the economy might produce the same result if default probabilities are tied to economic activity.

Thus incorporating associations among variates is not limited to using copulas. The choice of approach is a key consideration in the model-building process.

Descriptive Functions

Measures of dependency like linear correlation, rank correlation, etc. describe the strength of the relationship with a single number, but this does not describe what parts of the distribution are most strongly related. The copula itself gives full information about the relationship of percentiles of the distributions, but it is a multi-dimensional function that can be difficult to get a handle on. Intermediate between the scalars and the copula are descriptive functions of the copula. These are univariate functions on the unit interval that describe how some property of the copula varies across that interval.

These include the right and left tail concentration functions R and L:

R(z) = Pr(U > z | V > z)
L(z) = Pr(U < z | V < z)

These are symmetric in U and V, as R(z) = Pr(U > z & V > z)/(1 – z) and L(z) = Pr(U < z & V < z)/z. These formulas can be used to define multivariate versions of these functions. Also, L(z) = C(z,z)/z and R(z) = [1 – 2z + C(z,z)]/(1 – z). The former generalizes readily to any dimension, but the latter gets more complicated.
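The empirical versions of these functions follow immediately from paired percentile ranks. A minimal sketch (the function and variable names here are my own):

import numpy as np

def to_ranks(x):
    # Convert a sample to percentile ranks in (0,1)
    n = len(x)
    return (np.argsort(np.argsort(x)) + 1) / (n + 1.0)

def tail_concentration(u, v, z):
    # Empirical L(z) = Pr(U<z & V<z)/z and R(z) = Pr(U>z & V>z)/(1-z)
    u, v = np.asarray(u), np.asarray(v)
    left = np.mean((u < z) & (v < z)) / z
    right = np.mean((u > z) & (v > z)) / (1.0 - z)
    return left, right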

The limit of R(z) as z goes to 1 and of L(z) as z goes to 0 are called tail concentration coefficients, which we will label R and L. For instance, if R = 0, as it is for the normal copula, then having U and V both large gets less and less likely, and finally very rare, as the loss size increases. On the other hand, if R is positive, then the large-large pairs will occur more often.

Both L(1) and R(0) are 1, but L(0) = L and R(1) = R can be anything from 0 to 1. Thus L(z) is more interesting for z below ½ and R(z) for z above ½. Graphing on these regions allows both functions to be shown in the same graph. This is done for several copulas with Kendall's τ = 35% but quite different R's in Figure 1.


Two other tail-related descriptive functions can be defined as:

χ(z) = 2 – log C(z,z)/log z = 1 – log L(z)/log z, and χ̄(z) = 2 log(1 – z)/log[(1 – z)R(z)] – 1.

Even though χ is defined with the L function, it agrees with the R function in the limit z → 1, i.e., at the far right. It is a constant for some copulas, like the Gumbel. These functions look fairly different for different copulas, so they are useful in distinguishing among them. They are graphed in Figures 2 and 3.
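A quick numerical check of the constancy claim for the Gumbel: its diagonal is C(z,z) = z^{2^{1/θ}}, so χ(z) = 2 – 2^{1/θ} at every z. A sketch (θ = 1.5 is an arbitrary illustrative value):

import numpy as np

def gumbel_C(u, v, theta):
    # Gumbel copula: C(u,v) = exp(-[(-ln u)^theta + (-ln v)^theta]^(1/theta))
    s = (-np.log(u)) ** theta + (-np.log(v)) ** theta
    return np.exp(-s ** (1.0 / theta))

def chi(z, theta):
    # chi(z) = 2 - log C(z,z)/log z
    return 2.0 - np.log(gumbel_C(z, z, theta)) / np.log(z)

z = np.linspace(0.05, 0.95, 10)
print(chi(z, theta=1.5))  # every entry equals 2 - 2^(1/1.5), about 0.413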

Figure 1 – L and R functions for several copulas (Gum, HRT, Frank, Max, Power, Clay, Norm), Kendall's τ = 35%


Figure 2 – Chi and R functions for HRT and Frank copulas, τ = 35%

Figure 3 – Chi-bar functions for HRT, Frank, and Gumbel copulas, τ = 35%


Other descriptive functions can be defined using correlation coefficients restricted to some regions. Kendall's τ is a function of the copula alone, and can define the cumulative function J(z):

J(z) = –1 + 4∫0^z ∫0^z C(u,v)c(u,v) dv du / C(z,z)^2

Taking z = 1 gives the formula for Kendall's τ. These functions all readily generalize to multiple dimensions.
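As a worked check (an added illustration), take the independence copula C(u,v) = uv, with density c(u,v) = 1. Then ∫0^z ∫0^z uv dv du = z^4/4, so J(z) = –1 + 4(z^4/4)/(z^2)^2 = –1 + 1 = 0 for all z; in particular J(1) = 0, matching Kendall's τ = 0 under independence.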

Since descriptive functions look different for different copulas, they can be used to compare the empirical copulas defined by multivariate data to various copulas fit to the data. As an example, French insurance losses from windstorm events for auto and property losses, from Belguise & Levi¹, are graphed in kilo-francs on a log scale in Figure 4.

Figure 4 – Auto and property loss pairs from French windstorms

The data looks largely independent for smaller losses but quite strongly related for large loss events. This is fairly typical for insurance data. The heavy right tail, or HRT, copula has strong right-tail dependency like this and is only weakly correlated in the left tail. The Gumbel copula also has strong right-tail dependency, but is stronger than the HRT in the left tail. For comparison, the Frank copula is also fitted, although it is not strong in either tail. These were fit by maximum likelihood by Belguise and Levi. The loglikelihoods at the maximum were 84, 77, and 50 for the HRT, Gumbel, and Frank copulas, respectively. This indicates a much better fit for the HRT, and a much worse one for the Frank, compared to the Gumbel. The graphs of descriptive functions can be evaluated as to how well they reflect this.

¹ Belguise & Levi (2002), "Tempêtes : Etude des dépendances entre les branches Automobile et Incendie à l'aide de la théorie des copulas," Bulletin Français d'Actuariat, Vol. 5, 135–174.

The L and R tail functions show how the tail dependency varies by probability level. This is an important effect when the distribution of the sum of the variates is of interest, as it is in insurance. However, the greatest interest is in the extreme tails, where the empirical functions are least stable. Figure 5 illustrates this.

Figure 5 – Auto-property windstorm R and L functions empirical and for three copulas

For the auto-property data shown in Figure 5, the HRT clearly fits best in the left tail. In the right tail the Gumbel is reasonable but not as good as the HRT. Figure 6 shows the same comparison for the chi-bar function; the conclusions are similar. The same thing can be done for the J function, which can sometimes more clearly distinguish between good and almost-good fits.

Once a preferred copula is selected, further testing can be done by simulating the descriptive functions, which can give confidence intervals around the fitted function. This is shown for the windstorm losses in Figure 7, which do seem consistent with the fitted HRT copula.

AP R & L Functions

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1

L AP

L AP Gumbel

L AP - Frank

L AP - HRT

R AP

R AP Gumbel

R AP - Frank

R AP - HRT


Figure 6 – Auto-property windstorm chi-bar function, empirical and for three copulas (HRT, Gumbel, Frank)

Figure 7 – Confidence intervals around auto-property R and L from fitted HRT copula

AP R & L Functions

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1

L AP

L AP 90%

L AP 75%

L AP 25%

L AP 10%

R AP

R AP 90%

R AP 75%

R AP 25%

R AP 10%


Simulating Normal and t-Copulas

While there are numerous bivariate copulas, in multivariate applications the t and normal copulas are the most prevalent. The normal or Gaussian copula takes a correlation matrix as its parameter. Its main drawback is that it has zero tail dependency, so it is problematic to apply when there is tail dependency. The t-copula also takes a correlation matrix as input, but it has one more parameter, called degrees of freedom, to adjust the tail dependency.

The input correlation matrix is the usual linear correlation, but the resulting multivariate distribution will not have this correlation matrix unless the canonical normal or t marginals are used, as linear correlation is not preserved by copulas. However, both Kendall's τ and the Spearman rank correlation are preserved by copulas, and so will be maintained no matter what marginal distributions are used. Kendall's τ is related to the linear correlation ρ by τ = (2/π)arcsin(ρ). The rank correlation is given by (6/π)arcsin(ρ/2).

The tail dependency measures R and L are equal for the t-copula. For a pair of variates they are given in terms of the correlation ρ and degrees of freedom ν by L = R = 2 – 2Tν+1{[(ν+1)(1–ρ)/(1+ρ)]^0.5}, where Tν+1 is the t-distribution function with ν+1 degrees of freedom. As ν grows, the t-copula approaches the normal copula, where R = L = 0.
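The formula is direct to evaluate; a sketch using scipy (the example ρ and ν values are arbitrary):

import math
from scipy.stats import t

def t_copula_tail_dependence(rho, nu):
    # L = R = 2 - 2*T_{nu+1}(sqrt((nu+1)*(1-rho)/(1+rho)))
    arg = math.sqrt((nu + 1.0) * (1.0 - rho) / (1.0 + rho))
    return 2.0 - 2.0 * t.cdf(arg, df=nu + 1.0)

print(t_copula_tail_dependence(0.5, 4))  # about 0.25; grows as nu falls or rho rises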

Their simulation is fairly easy once the Cholesky decomposition of the correlation matrix is available. The Cholesky decomposition is a square matrix of the same dimensions as the correlation matrix with all zeros above the diagonal, and it converts the correlation matrix into the coefficients of a series of multiple regressions for each variable in terms of the previous ones, plus the standard errors. When multiplied on the right by a column vector of independent standard normal random draws, it gives a vector of correlated standard normal random draws.

Many programming packages have functions for the Cholesky decomposition, but beware: some of them put the zeros below the diagonal, so the transpose must be taken before use. An algorithm is given at the end in case it is not available. Once you have the Cholesky decomposition L of the correlation matrix, the simulation of a multivariate distribution with the normal copula is:

1. Generate independent uniform(0,1) random numbers u1, …, uk. In Excel this is rand().
2. Calculate zj = Φ^{-1}(uj) for each uj from step 1. This creates independent standard normal variables. Arrange the values into a column vector z = (z1, …, zk)^T. In Excel this is normsinv(u).
3. Calculate the vector x = Lz. This creates correlated standard normal variables.
4. Calculate vj = Φ(xj), which transforms the variables to correlated uniform variables.
5. Calculate yj = Fj^{-1}(vj), which creates correlated variables with the desired marginal distributions.
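These five steps in Python, with numpy and scipy standing in for the Excel functions (the marginals in the usage lines are illustrative assumptions):

import numpy as np
from scipy.stats import norm

def normal_copula_sample(corr, marginal_ppfs, n, seed=None):
    # Simulate n draws with a normal copula and the given marginal inverse CDFs
    rng = np.random.default_rng(seed)
    L = np.linalg.cholesky(corr)                     # zeros above the diagonal
    z = norm.ppf(rng.uniform(size=(len(corr), n)))   # steps 1-2
    x = L @ z                                        # step 3: correlated normals
    v = norm.cdf(x)                                  # step 4: correlated uniforms
    return np.column_stack([ppf(v[j]) for j, ppf in enumerate(marginal_ppfs)])  # step 5

# Usage with assumed lognormal and Pareto marginals:
from scipy.stats import lognorm, pareto
corr = np.array([[1.0, 0.6], [0.6, 1.0]])
draws = normal_copula_sample(corr, [lognorm(0.8).ppf, pareto(2.5).ppf], 10_000)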

The t-copula is similar, but after step 3 the correlated standard normal variables are transformed to correlated t-distributed variables by dividing each one of them by the same constant, which comes from a chi-squared distribution with ν degrees of freedom. The chi-squared is a gamma distribution with shape parameter α = ν/2 and scale parameter 2, and so has mean ν. Here ν does not have to be an integer. The steps are:

1. Generate independent uniform(0,1) random numbers u1, …, uk. In Excel this is rand().
2. Calculate zj = Φ^{-1}(uj) for each uj from step 1. This creates independent standard normal variables. Arrange the values into a column vector z. In Excel this is normsinv(u).
3. Calculate the vector x = Lz. This creates correlated standard normal variables.
4. Simulate g from the chi-squared distribution with ν degrees of freedom. In Excel this is g = 2*gammainv(rand(), ν/2, 1).
5. Calculate wj = xj/√(g/ν). This creates correlated t-distributed variables.
6. Calculate vj = Tν(wj), which transforms the variables to correlated uniform variables. Here Tν is the cumulative distribution function for the t distribution with ν degrees of freedom, where again ν does not have to be an integer. In Excel this is Tν(w) = ½ + ½·sign(w)·betadist[w^2/(ν + w^2), ½, ν/2].
7. Calculate yj = Fj^{-1}(vj), which creates correlated variables with the desired marginal distributions.
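The same sketch extended to the t-copula (again with scipy in place of the Excel functions; note that a single chi-squared draw g is shared by all components of each simulated vector):

import numpy as np
from scipy.stats import norm, t, chi2

def t_copula_sample(corr, nu, marginal_ppfs, n, seed=None):
    # Simulate n draws with a t-copula with nu degrees of freedom
    rng = np.random.default_rng(seed)
    L = np.linalg.cholesky(corr)
    z = norm.ppf(rng.uniform(size=(len(corr), n)))   # steps 1-2
    x = L @ z                                        # step 3
    g = chi2.rvs(df=nu, size=n, random_state=rng)    # step 4: one g per vector
    w = x / np.sqrt(g / nu)                          # step 5: correlated t variates
    v = t.cdf(w, df=nu)                              # step 6: correlated uniforms
    return np.column_stack([ppf(v[j]) for j, ppf in enumerate(marginal_ppfs)])  # step 7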

An algorithm for the Cholesky decomposition for a matrix with elements Aij is to calculate Lij with i > j for the lower triangle, and Di, where the diagonal will be Di^½, by the recursive steps:

D1 = A11
L21 = A21
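The recursion is truncated here; as a stopgap, a standard textbook Cholesky routine (my own sketch, not necessarily the exact recursion the text goes on to give):

import numpy as np

def cholesky_lower(A):
    # Returns lower-triangular L with A = L @ L.T (zeros above the diagonal)
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    L = np.zeros_like(A)
    for i in range(n):
        for j in range(i + 1):
            s = A[i, j] - L[i, :j] @ L[j, :j]
            L[i, j] = np.sqrt(s) if i == j else s / L[j, j]
    return L

# Check against numpy on a sample correlation matrix
A = np.array([[1.0, 0.6], [0.6, 1.0]])
assert np.allclose(cholesky_lower(A), np.linalg.cholesky(A))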