TRANSCRIPT

Nonparametric Measures of Multivariate Association
Author: Roger B. Nelsen
Source: Lecture Notes-Monograph Series, Vol. 28, Distributions with Fixed Marginals and Related Topics (1996), pp. 223-232
Published by: Institute of Mathematical Statistics
Stable URL: http://www.jstor.org/stable/4355895
Accessed: 14/06/2014 00:10
Distributions with Fixed Marginals and Related Topics IMS Lecture Notes - Monograph Series Vol. 28, 1996
Nonparametric Measures
of Multivariate Association
By Roger B. Nelsen
Lewis and Clark College
We study measures of multivariate association which are generalizations of the measures of bivariate association known as Spearman's rho and Kendall's
tau. Since the population versions of Spearman's rho and Kendall's tau can be
interpreted as measures of average positive (and negative) quadrant dependence and average total positivity of order two, respectively (Nelsen, 1992), we extend
these ideas to the multivariate setting and derive measures of multivariate as-
sociation from averages of orthant dependence and multivariate total positivity of order two. We examine several properties of these measures, and present
examples in three and four dimensions.
1. Introduction. The purpose of this paper is to present three measures of multivariate association which are derived from multivariate dependence concepts. We begin in Section 2 by reviewing the main results for the bivariate case: Spearman's rho is a measure of average quadrant dependence, while Kendall's tau is a measure of average total positivity of order two. Relationships between measures of multivariate association and concordance are discussed in Section 3. In Section 4 we construct two measures of multivariate association from two generalizations of quadrant dependence (upper orthant dependence and lower orthant dependence), and examine properties of these measures in Section 5. In Section 6 we construct a measure from multivariate total positivity of order two, and in Section 7 we examine properties of this measure.
2. The Bivariate Case. Before proceeding, we adopt some notation and review some definitions. Let X and Y denote continuous random variables (r.v.'s) with joint distribution function (d.f.) H and marginal d.f.'s F and G. Let C denote the copula of X and Y, the function C: I^2 → I (where I = [0,1]) given by H(x,y) = C(F(x),G(y)). We will also denote the joint and marginal densities of X and Y by h, f, and g, respectively.
AMS 1991 Subject Classification: Primary 62H20
Key words and phrases: copula, Kendall's tau, multivariate total positivity of order
two, orthant dependence, Spearman's rho.
224 NONPARAMETRIC MEASURES
The r.v.'s X and Y are said to be positively quadrant dependent, written PQD (Lehmann, 1966), iff the probability that they are simultaneously large is at least as great as it would be were X and Y independent, i.e., iff

P(X > x, Y > y) ≥ P(X > x)P(Y > y) for all x and y.    (2.1)

In terms of d.f.'s, X and Y are PQD iff 1 - F(x) - G(y) + H(x,y) ≥ [1 - F(x)][1 - G(y)], or equivalently, iff H(x,y) ≥ F(x)G(y) (i.e., iff P(X ≤ x, Y ≤ y) ≥ P(X ≤ x)P(Y ≤ y)). So, in a sense, the expression H(x,y) - F(x)G(y) measures "local" quadrant dependence at each point (x,y) in R^2.
The population version of Spearman's rho is given by

ρ_s = 12 ∫∫_{R^2} [H(x,y) - F(x)G(y)] dF(x) dG(y),

and hence (1/12)ρ_s represents a measure of average quadrant dependence, where the average is with respect to the marginal distributions of X and Y (Nelsen, 1992). Invoking the probability transforms u = F(x) and v = G(y) and the copula C, the above expressions (as well as ones to follow) simplify. The measure of local quadrant dependence becomes C(u,v) - uv (for (u,v) in I^2), and

ρ_s = 12 ∫∫_{I^2} [C(u,v) - uv] du dv.
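As a numerical illustration (not part of the original paper), the identity ρ_s = 12 ∫∫_{I^2} C(u,v) du dv - 3 = 12 E[UV] - 3 can be checked by Monte Carlo against the bivariate Farlie-Gumbel-Morgenstern copula C(u,v) = uv[1 + θ(1-u)(1-v)], for which Spearman's rho is known to be θ/3; the sampler below uses conditional inversion:

```python
import numpy as np

def sample_fgm(theta, n, rng):
    """Sample (U, V) from the FGM copula C(u,v) = uv[1 + theta(1-u)(1-v)]
    by inverting the conditional d.f. dC/du = v + a*v*(1-v), a = theta(1-2u)."""
    u = rng.random(n)
    t = rng.random(n)
    a = theta * (1.0 - 2.0 * u)
    # Solve a*v^2 - (1+a)*v + t = 0 for the root in [0,1]; v = t when a ≈ 0.
    with np.errstate(divide="ignore", invalid="ignore"):
        disc = (1.0 + a) ** 2 - 4.0 * a * t
        v = np.where(np.abs(a) < 1e-9, t,
                     ((1.0 + a) - np.sqrt(disc)) / (2.0 * a))
    return u, v

rng = np.random.default_rng(0)
theta = 0.8
u, v = sample_fgm(theta, 200_000, rng)
rho_s = 12.0 * np.mean(u * v) - 3.0   # population identity rho_s = 12 E[UV] - 3
print(rho_s)                           # close to theta/3
```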
In a similar fashion, the population analog of Kendall's tau is related to the dependence property known as total positivity of order two. X and Y are totally positive of order two (written TP2) iff their joint density h satisfies h(x1,y1)h(x2,y2) ≥ h(x1,y2)h(x2,y1) for all x1, x2, y1, y2 in R such that x1 < x2 and y1 < y2. Thus h(x1,y1)h(x2,y2) - h(x1,y2)h(x2,y1) measures "local" TP2 for X and Y. Let t denote its average for -∞ < x1 < x2 < ∞ and -∞ < y1 < y2 < ∞, i.e., let

t = ∫_{-∞}^{∞} ∫_{-∞}^{∞} ∫_{-∞}^{x2} ∫_{-∞}^{y2} [h(x1,y1)h(x2,y2) - h(x1,y2)h(x2,y1)] dy1 dx1 dy2 dx2.

It can be shown that τ = 2t, where τ is the population version of Kendall's tau,

τ = 4 ∫∫_{R^2} H(x,y) dH(x,y) - 1 = 4 ∫∫_{I^2} C(u,v) dC(u,v) - 1.    (2.2)

See Nelsen (1992) for details.
3. Concordance. In the bivariate setting, two pairs of r.v.'s (X1,Y1) and (X2,Y2) are concordant if X1 < X2 and Y1 < Y2 or if X1 > X2 and Y1 > Y2; and discordant if X1 < X2 and Y1 > Y2 or if X1 > X2 and Y1 < Y2. As is well known (Kruskal, 1958), the population version of Kendall's τ is the probability that two independent observations of the r.v.'s X and Y (with d.f. H) are concordant, scaled to be 0 when X and Y are independent and 1 when the d.f. of X and Y is the Fréchet upper bound. This can be seen by noting that

P(X1 < X2, Y1 < Y2 or X1 > X2, Y1 > Y2) = 2P(X1 < X2, Y1 < Y2)
  = 2 ∫∫_{R^2} P(X < x, Y < y) dH(x,y) = 2 ∫∫_{R^2} H(x,y) dH(x,y),

and observing that this integral appears in (2.2).
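The representation τ = 2P(concordant) - 1 (for continuous r.v.'s, concordance and discordance are complementary) can likewise be checked numerically; the sketch below (an illustration, not from the paper) uses the bivariate FGM copula, for which Kendall's τ is known to be 2θ/9:

```python
import numpy as np

def sample_fgm(theta, n, rng):
    """Sample (U, V) from the FGM copula C(u,v) = uv[1 + theta(1-u)(1-v)]."""
    u = rng.random(n)
    t = rng.random(n)
    a = theta * (1.0 - 2.0 * u)
    with np.errstate(divide="ignore", invalid="ignore"):
        v = np.where(np.abs(a) < 1e-9, t,
                     ((1.0 + a) - np.sqrt((1.0 + a) ** 2 - 4.0 * a * t)) / (2.0 * a))
    return u, v

rng = np.random.default_rng(7)
theta = 0.9
# Two independent observations of (X, Y), on the copula scale.
u1, v1 = sample_fgm(theta, 200_000, rng)
u2, v2 = sample_fgm(theta, 200_000, rng)
p_c = np.mean((u1 - u2) * (v1 - v2) > 0)   # estimated concordance probability
tau = 2.0 * p_c - 1.0
print(tau)                                  # close to 2*theta/9
```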
For notation in the multivariate case, let X1, X2, ..., Xn be continuous r.v.'s with marginal d.f.'s F1, F2, ..., Fn respectively, joint d.f. H, and copula C: I^n → I given by H(x1,x2,...,xn) = C(F1(x1),F2(x2),...,Fn(xn)). Also let X = (X1,X2,...,Xn), x = (x1,x2,...,xn), and let X > x denote the component-wise inequality Xi > xi, i = 1,2,...,n. In a recent paper Joe (1990) studied a family of measures of multivariate concordance given by

τ* = Σ_{k=n*}^{n} w_k P(D ∈ B_{k,n-k}),    (3.1)

where D = X - Y for X and Y independent, each with d.f. H; B_{k,n-k} is the set of x in R^n with k positive and n-k negative or k negative and n-k positive components; n* = ⌈n/2⌉; and the coefficients w_k satisfy some technical restrictions. At one extreme, w_k = 1 - 2k(n-k)/(n choose 2), and τ* is the average of the (n choose 2) pairwise Kendall's τ's for the components of X. At the other extreme, w_n = 1 and w_k = -1/(2^{n-1} - 1) for k < n, in which case τ* = (1/(2^{n-1} - 1))(2^{n-1} P(X < Y or X > Y) - 1). In Section 6 we will see that this measure of concordance is also a measure of average total positivity of order two.

We further note that ρ_n^+ and ρ_n^-, which will appear in Section 4, are also discussed in Joe (1990), where they appear as the scaled expected values E[F1(X1)F2(X2)···Fn(Xn)] and E[(1-F1(X1))(1-F2(X2))···(1-Fn(Xn))], respectively.
4. Two Measures of Average Orthant Dependence. There are two standard ways to generalize positive quadrant dependence to the multivariate situation (Shaked, 1982). The first is a generalization of (2.1): X is positively upper orthant dependent (PUOD) iff

P(X > x) ≥ ∏_{i=1}^{n} P(Xi > xi),

and negatively upper orthant dependent (NUOD) when ≥ is replaced by ≤. The second is similar: X is positively lower orthant dependent (PLOD) iff

P(X ≤ x) ≥ ∏_{i=1}^{n} P(Xi ≤ xi),

and negatively lower orthant dependent (NLOD) when ≥ is replaced by ≤. When n = 2, positive upper orthant dependence and positive lower orthant dependence are the same, both reducing to positive quadrant dependence (Lehmann, 1966); but for n ≥ 3 they are distinct concepts. For example (when n = 3), if X assumes the four values (1,1,1), (1,0,0), (0,1,0) and (0,0,1) each with probability 1/4, then it is easy to verify that X is PUOD but not PLOD. (Note that P(X ≤ 0) = 0 while P(X1 ≤ 0)P(X2 ≤ 0)P(X3 ≤ 0) = 1/8.)
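The four-point example can be verified by direct enumeration. The script below (an illustrative check, not from the paper) tests both orthant inequalities at every threshold that matters for components taking values in {0,1}:

```python
import itertools
import numpy as np

# The four equally likely values of X from the example.
atoms = np.array([(1, 1, 1), (1, 0, 0), (0, 1, 0), (0, 0, 1)], dtype=float)

def p_upper(x):
    """P(X > x), componentwise strict inequality."""
    return np.mean(np.all(atoms > x, axis=1))

def p_lower(x):
    """P(X <= x), componentwise."""
    return np.mean(np.all(atoms <= x, axis=1))

# Components take values in {0, 1}, so thresholds -0.5, 0.5, 1.5 cover
# every distinct orthant event.
grid = [-0.5, 0.5, 1.5]
puod = all(p_upper(np.array(x)) >=
           np.prod([np.mean(atoms[:, i] > x[i]) for i in range(3)]) - 1e-12
           for x in itertools.product(grid, repeat=3))
plod = all(p_lower(np.array(x)) >=
           np.prod([np.mean(atoms[:, i] <= x[i]) for i in range(3)]) - 1e-12
           for x in itertools.product(grid, repeat=3))
print(puod, plod)   # PUOD holds, PLOD fails
```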
As with quadrant dependence, we can view the expression P(X > x) - ∏_{i=1}^{n} P(Xi > xi) as a measure of "local" upper orthant dependence at each point x in R^n. If we set Ui = Fi(Xi) for i = 1,2,...,n and U = (U1,U2,...,Un), then each Ui is uniform on I and the d.f. of U is the copula C. Furthermore, X is PUOD (NUOD) iff U is PUOD (NUOD), or equivalently, iff P(U > u) ≥ (≤) ∏_{i=1}^{n} (1 - ui). Thus P(U > u) - ∏_{i=1}^{n} (1 - ui) also measures local upper orthant dependence for X. Let p denote its average:

p = ∫_{I^n} [P(U > u) - ∏_{i=1}^{n} (1 - ui)] du1 du2 ··· dun.

Now let V be a random vector with n independent components each uniformly distributed on I. Then

∫_{I^n} P(U > u) du1 du2 ··· dun = P(U > V) = P(V < U) = ∫_{I^n} P(V < u) dC(u) = ∫_{I^n} u1 u2 ··· un dC(u),

and since ∫_{I^n} ∏_{i=1}^{n} (1 - ui) du1 du2 ··· dun = (1/2)^n, we have

p = ∫_{I^n} u1 u2 ··· un dC(u) - (1/2)^n.
If X is a vector of independent r.v.'s with marginal d.f.'s F1, F2, ..., Fn, then H(x1,x2,...,xn) = F1(x1)F2(x2)···Fn(xn), and hence the copula for such an X, which we will denote by Π(u), is given by Π(u1,u2,...,un) = u1 u2 ··· un. In this case p = 0. When H(x1,x2,...,xn) = min(F1(x1),F2(x2),...,Fn(xn)), the Fréchet upper bound for distributions with margins F1, F2, ..., Fn, the copula is M(u) = min(u1,u2,...,un). In this case P(U > u) = min(1-u1, 1-u2, ..., 1-un), so that ∫_{I^n} P(U > u) du1 du2 ··· dun = 1/(n+1). Since C(u) ≤ M(u) for every C, it follows that p ≤ 1/(n+1) - (1/2)^n. So if we divide p by this constant, we will have a measure which is 1 when the d.f. of X is the Fréchet upper bound (and still 0 when the components of X are independent). Hence we make the following
Definition 4.1. A measure of multivariate association ρ_n^+ derived from average upper orthant dependence for the random vector X with copula C(u) is given by:

ρ_n^+ = ((n+1)/(2^n - (n+1))) (2^n ∫_{I^n} u1 u2 ··· un dC(u) - 1);

or, equivalently,

ρ_n^+ = ((n+1)/(2^n - (n+1))) (2^n ∫_{I^n} Π(u) dC(u) - 1).
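Definition 4.1 lends itself to a quick Monte Carlo sanity check (an illustration, not from the paper): since ∫_{I^n} Π(u) dC(u) = E[U1···Un], ρ_n^+ should be near 0 for independent components and near 1 for comonotone ones (the Fréchet upper bound):

```python
import numpy as np

def rho_plus(u):
    """rho_n^+ from a copula sample u of shape (N, n)."""
    n = u.shape[1]
    return (n + 1) / (2 ** n - (n + 1)) * (2 ** n * np.mean(np.prod(u, axis=1)) - 1.0)

rng = np.random.default_rng(1)
n, N = 3, 200_000
indep = rng.random((N, n))                       # independence copula Pi
comono = np.repeat(rng.random(N)[:, None], n, axis=1)   # Frechet upper bound M
print(rho_plus(indep), rho_plus(comono))         # near 0 and near 1
```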
In a similar fashion, we can define another measure ρ_n^- from average lower orthant dependence. Recall that X is PLOD (NLOD) iff P(X ≤ x) ≥ (≤) ∏_{i=1}^{n} P(Xi ≤ xi), that is, iff P(U ≤ u) ≥ (≤) ∏_{i=1}^{n} ui. So P(U ≤ u) - ∏_{i=1}^{n} ui measures "local" positive and negative lower orthant dependence; let q denote its average:

q = ∫_{I^n} [P(U ≤ u) - ∏_{i=1}^{n} ui] du1 du2 ··· dun;

from which it readily follows that

q = ∫_{I^n} C(u) du1 du2 ··· dun - (1/2)^n.

As with p, q = 0 when the components of X are independent, and q = 1/(n+1) - (1/2)^n when the d.f. of X is the Fréchet upper bound. So we make the following

Definition 4.2. A measure of multivariate association ρ_n^- derived from average lower orthant dependence for the random vector X with copula C(u) is given by:

ρ_n^- = ((n+1)/(2^n - (n+1))) (2^n ∫_{I^n} C(u) du1 du2 ··· dun - 1);

or, equivalently,

ρ_n^- = ((n+1)/(2^n - (n+1))) (2^n ∫_{I^n} C(u) dΠ(u) - 1).
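A matching check for Definition 4.2 (again an illustration, not from the paper) uses the identity ∫_{I^n} C(u) du1 ··· dun = E[(1-U1)···(1-Un)], which follows from the same conditioning argument used for p in Section 4:

```python
import numpy as np

def rho_minus(u):
    """rho_n^- from a copula sample u of shape (N, n), via E[(1-U1)...(1-Un)]."""
    n = u.shape[1]
    return (n + 1) / (2 ** n - (n + 1)) * (2 ** n * np.mean(np.prod(1.0 - u, axis=1)) - 1.0)

rng = np.random.default_rng(2)
n, N = 3, 200_000
indep = rng.random((N, n))                       # independent components
comono = np.repeat(rng.random(N)[:, None], n, axis=1)   # comonotone components
print(rho_minus(indep), rho_minus(comono))       # near 0 and near 1
```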
5. Properties of ρ_n^+ and ρ_n^- and Examples.

1. A lower bound for ρ_n^+ and ρ_n^- is ((n+1)/(2^n - (n+1)))(2^n/(n+1)! - 1). Since the Fréchet lower bound for distributions with marginal d.f.'s F1, F2, ..., Fn is max(F1(x1) + F2(x2) + ··· + Fn(xn) - n + 1, 0), it follows that P(U > u) ≥ max(1 - u1 - u2 - ··· - un, 0). Thus ∫_{I^n} P(U > u) du1 du2 ··· dun ≥ ∫_{I^n} max(1 - u1 - u2 - ··· - un, 0) du1 du2 ··· dun = 1/(n+1)!, so that ρ_n^+ ≥ ((n+1)/(2^n - (n+1)))(2^n/(n+1)! - 1). Similarly, since C(u) ≥ max(u1 + u2 + ··· + un - n + 1, 0), we have ∫_{I^n} C(u) du1 du2 ··· dun ≥ ∫_{I^n} max(u1 + u2 + ··· + un - n + 1, 0) du1 du2 ··· dun = 1/(n+1)!, and we obtain the same lower bound for ρ_n^-. Since the Fréchet lower bound is not a d.f. for n ≥ 3, this bound may well fail to be best possible.
2. For n = 2, both ρ_2^+ and ρ_2^- reduce to Spearman's ρ_s discussed in Section 2.
3. For n = 3, the inclusion-exclusion principle yields

P(U > u) + C(u) = 1 - u1 - u2 - u3 + C(u1,u2,1) + C(u1,1,u3) + C(1,u2,u3).

Integrating over I^3 gives

(ρ_3^+ + 1)/8 + (ρ_3^- + 1)/8 = -1/2 + (ρ_XY + 3)/12 + (ρ_XZ + 3)/12 + (ρ_YZ + 3)/12,

where ρ_XY, ρ_XZ and ρ_YZ denote Spearman's ρ_s for the two r.v.'s displayed as the subscript. It follows that

(1/2)(ρ_3^+ + ρ_3^-) = (1/3)(ρ_XY + ρ_XZ + ρ_YZ).

Similar expressions can be obtained for n ≥ 4.
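This n = 3 identity can be tested on any trivariate copula sample. The sketch below (not from the paper) uses an equal mixture of the independence and comonotone copulas, together with the sample versions ρ_3^+ = 8E[UVW] - 1, ρ_3^- = 8E[(1-U)(1-V)(1-W)] - 1, and ρ_s = 12E[UiUj] - 3:

```python
import numpy as np

rng = np.random.default_rng(3)
N = 300_000
# Equal mixture: each observation is either independent or comonotone.
mix = rng.random(N) < 0.5
u = rng.random((N, 3))
u[mix] = np.repeat(rng.random(mix.sum())[:, None], 3, axis=1)

rho3_plus = 8 * np.mean(np.prod(u, axis=1)) - 1
rho3_minus = 8 * np.mean(np.prod(1 - u, axis=1)) - 1
pair = [12 * np.mean(u[:, i] * u[:, j]) - 3 for i, j in [(0, 1), (0, 2), (1, 2)]]
lhs = 0.5 * (rho3_plus + rho3_minus)
rhs = sum(pair) / 3
print(lhs, rhs)   # the two sides agree (both near 1/2 for this mixture)
```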
4. Suppose that the distribution of X is radially symmetric (Nelsen, 1993), that is, there is a point a in R^n such that P(X ≤ a - x) = P(X ≥ a + x) for all x in R^n. It follows that P(U ≤ u) = P(U ≥ 1 - u), so that ρ_n^+ = ρ_n^-.
Example 1. Let X, Y, and Z be r.v.'s, each uniform on I, such that the probability mass in I^3 is uniformly distributed on the faces of the tetrahedron with vertices (1,0,0), (0,1,0), (0,0,1), and (1,1,1). For this distribution, ρ_3^+ = 2/15 and ρ_3^- = -2/15. Furthermore, since X, Y, and Z are pairwise independent, ρ_XY = ρ_XZ = ρ_YZ = 0. Thus a common measure of multivariate dependence, the average of the pairwise Spearman's ρ_s's, is 0 for (X,Y,Z). But the fact that ρ_3^+ is positive and ρ_3^- is negative indicates some degree of positive upper orthant dependence and negative lower orthant dependence: indeed, P(X > 1/2, Y > 1/2, Z > 1/2) = 3/16 and P(X < 1/2, Y < 1/2, Z < 1/2) = 1/16, but both of these probabilities are 1/8 when X, Y, and Z are mutually independent.
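Example 1 can be reproduced by simulation (an illustrative check, not from the paper): the four faces of the tetrahedron have equal area, so one samples a face uniformly and then a uniform barycentric point on it:

```python
import numpy as np

rng = np.random.default_rng(4)
N = 400_000
verts = np.array([(1., 0., 0.), (0., 1., 0.), (0., 0., 1.), (1., 1., 1.)])
faces = np.array([(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)])

# Pick a face (all four have equal area), then a uniform barycentric point.
f = faces[rng.integers(0, 4, N)]
r1, r2 = rng.random(N), rng.random(N)
s = np.sqrt(r1)
lam = np.column_stack([1 - s, s * (1 - r2), s * r2])   # uniform on the 2-simplex
pts = np.einsum('ni,nij->nj', lam, verts[f])

rho3_plus = 8 * np.mean(np.prod(pts, axis=1)) - 1      # near 2/15
rho3_minus = 8 * np.mean(np.prod(1 - pts, axis=1)) - 1 # near -2/15
p_up = np.mean(np.all(pts > 0.5, axis=1))              # near 3/16
p_lo = np.mean(np.all(pts < 0.5, axis=1))              # near 1/16
```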
Example 2. Suppose X, Y, and Z have the trivariate Farlie-Gumbel-Morgenstern (FGM) distribution on I^3 with d.f.

H(x,y,z) = xyz[1 + α(1-y)(1-z) + β(1-x)(1-z) + γ(1-x)(1-y) + δ(1-x)(1-y)(1-z)],

where the four parameters α, β, γ, and δ are each in [-1,1] and satisfy the inequalities 1 + ε1 α + ε2 β + ε3 γ ≥ |δ| for εi = ±1, ε1 ε2 ε3 = 1. The univariate margins are uniform on I while the bivariate margins are FGM, and the pairwise Spearman's ρ_s's are ρ_XY = γ/3, ρ_XZ = β/3, and ρ_YZ = α/3. Thus the average of the pairwise ρ_s's is (1/9)(α + β + γ); however,

ρ_3^+ = (1/9)(α + β + γ) - δ/27 and ρ_3^- = (1/9)(α + β + γ) + δ/27.
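The formula for ρ_3^- in Example 2 can be verified by deterministic numerical integration of 8 ∫_{I^3} C(u) du - 1 (an illustrative check, not from the paper; the parameter values below are chosen to satisfy the stated constraint):

```python
import numpy as np

def rho3_minus_fgm(a, b, g, d, m=60):
    """8 * (midpoint-rule integral of the trivariate FGM copula over I^3) - 1."""
    t = (np.arange(m) + 0.5) / m
    x, y, z = np.meshgrid(t, t, t, indexing="ij")
    C = x * y * z * (1 + a * (1 - y) * (1 - z) + b * (1 - x) * (1 - z)
                     + g * (1 - x) * (1 - y) + d * (1 - x) * (1 - y) * (1 - z))
    return 8 * C.mean() - 1

a, b, g, d = 0.3, 0.2, 0.1, 0.3       # admissible FGM parameters
approx = rho3_minus_fgm(a, b, g, d)
exact = (a + b + g) / 9 + d / 27
print(approx, exact)                   # the two agree to quadrature accuracy
```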
Example 3. Copulas for trivariate Cuadras-Augé distributions are weighted geometric means of the copula for independent r.v.'s and the copula for the Fréchet upper bound: C_θ(x,y,z) = [min(x,y,z)]^θ (xyz)^{1-θ}, θ ∈ I. The univariate margins are uniform on I and the bivariate margins are Cuadras-Augé distributions with parameter θ. This distribution is both PUOD and PLOD, and ρ_XY = ρ_XZ = ρ_YZ = 3θ/(4-θ); however,

ρ_3^+ = θ(11-5θ)/((3-θ)(4-θ)) and ρ_3^- = θ(7-θ)/((3-θ)(4-θ)).
Example 4. Let X, Y, and Z be r.v.'s, each uniform on I, such that the probability mass in I^3 is uniformly distributed on two triangles, one with vertices (1,0,0), (0,1,0), and (0,0,1); and one with vertices (1,1,0), (1,0,1), and (0,1,1). Here ρ_3^+ = ρ_3^- = 0. Note that X, Y, and Z are again pairwise independent and that (X,Y,Z) is radially symmetric about (1/2, 1/2, 1/2).
6. A Measure of Average Multivariate Total Positivity of Order Two. The multivariate version of the TP2 property is called multivariate total positivity of order two (Karlin and Rinott, 1980): A distribution with joint density h is multivariate totally positive of order two (MTP2) iff for all x and y in R^n,

h(x ∨ y) h(x ∧ y) ≥ h(x) h(y),

where

x ∨ y = (max(x1,y1), max(x2,y2), ..., max(xn,yn))

and

x ∧ y = (min(x1,y1), min(x2,y2), ..., min(xn,yn)).

Thus h(x ∨ y) h(x ∧ y) - h(x) h(y) measures "local" MTP2 for a distribution with density h, and we let T denote its average:

T = ∫_{R^n} ∫_{R^n} [h(x ∨ y) h(x ∧ y) - h(x) h(y)] dx1 dx2 ··· dxn dy1 dy2 ··· dyn.

Let si = Fi(xi), ti = Fi(yi), and h(x) = c(F1(x1), F2(x2), ..., Fn(xn)) f1(x1) f2(x2) ··· fn(xn), where fi is the (marginal) density of Xi and c(u) = ∂^n C(u)/∂u1 ∂u2 ··· ∂un. Then

T = ∫_{I^n} ∫_{I^n} [c(s ∨ t) c(s ∧ t) - c(s) c(t)] ds1 ds2 ··· dsn dt1 dt2 ··· dtn.

Now let u = s ∨ t and v = s ∧ t. Then v ≤ u and dvi dui = dsi dti for i = 1,2,...,n; hence
T = ∫∫_{0 ≤ v ≤ u ≤ 1} [2^n c(u) c(v) - Σ_{k=0}^{n} Σ_{S⊆N, |S|=k} c(y_S) c(z_S)] dv1 dv2 ··· dvn du1 du2 ··· dun,

where N = {1,2,...,n}, and y_S and z_S have components

(y_S)_i = ui if i ∈ S, vi if i ∉ S,   and   (z_S)_i = vi if i ∈ S, ui if i ∉ S.

Evaluation of the n inner integrals yields

T = ∫_{I^n} [2^n C(u) c(u) - Σ_{k=0}^{n} Σ_{S⊆N, |S|=k} ∂_S C(u) ∂_{N\S} C(u)] du1 du2 ··· dun,

where ∂_S C(u) denotes the kth order mixed partial derivative of C(u) with respect to the k variables whose subscripts are in S. But since the double sum in the above expression is simply ∂^n C^2(u)/∂u1 ∂u2 ··· ∂un, we have

T = 2^n ∫_{I^n} C(u) c(u) du1 du2 ··· dun - ∫_{I^n} [∂^n C^2(u)/∂u1 ∂u2 ··· ∂un] du1 du2 ··· dun.

The second multiple integral above is C^2(1) = 1, and thus

T = 2^n ∫_{I^n} C(u) dC(u) - 1.

As with p and q in Section 4, T = 0 when the components of X are independent, and when the d.f. of X is the Fréchet upper bound, T = 2^{n-1} - 1. So we make the following
Definition 6.1. A measure of multivariate association τ_n derived from average multivariate total positivity of order two for the random vector X with copula C(u) is given by:

τ_n = (1/(2^{n-1} - 1)) (2^n ∫_{I^n} C(u) dC(u) - 1).
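Definition 6.1 can be exercised whenever C itself is evaluable, since τ_n = (2^n E[C(U)] - 1)/(2^{n-1} - 1) with U distributed as C. The sketch below (not from the paper) checks the two benchmark cases: τ_3 ≈ 0 under independence and τ_3 ≈ 1 at the Fréchet upper bound:

```python
import numpy as np

def tau_n(u, C):
    """tau_n from a copula sample u (shape (N, n)) and copula function C."""
    n = u.shape[1]
    return (2 ** n * np.mean(C(u)) - 1.0) / (2 ** (n - 1) - 1.0)

rng = np.random.default_rng(5)
N = 200_000
indep = rng.random((N, 3))
comono = np.repeat(rng.random(N)[:, None], 3, axis=1)

Pi = lambda u: np.prod(u, axis=1)   # independence copula
M = lambda u: np.min(u, axis=1)     # Frechet upper bound
print(tau_n(indep, Pi), tau_n(comono, M))   # near 0 and near 1
```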
7. Properties of τ_n and Examples.

1. A lower bound for τ_n is -1/(2^{n-1} - 1) (since ∫_{I^n} C(u) dC(u) ≥ 0).

2. For n = 2, τ_2 reduces to Kendall's τ as presented in Section 2.

3. For n = 3, τ_3 = (1/3)(τ_XY + τ_XZ + τ_YZ), where τ_XY, τ_XZ and τ_YZ denote Kendall's τ for the two r.v.'s displayed as the subscript. This follows from the fact that the family of measures of multivariate concordance studied by Joe (1990) includes both τ_n and the average of the pairwise Kendall's τ's, but has only one member when n = 3.

4. As noted in Section 3, τ_n is a scaled probability of concordance. This follows from the observation that 2 ∫_{I^n} C(u) dC(u) = P(X < Y or X > Y), where X and Y are independent, each with d.f. H.
Example 5. Let X, Y, and Z be r.v.'s on I^3 with a density h(x,y,z) = 4 in the two cubes [0,1/2]^3 and [1/2,1]^3, and 0 elsewhere. Then ρ_3^+ = ρ_3^- = 3/4, while τ_3 = 1/2.
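Example 5 has a closed-form d.f., H(x) = (1/2)∏min(2xi,1) + (1/2)∏max(2xi-1,0), so all three measures can be estimated by simulation (an illustrative check, not from the paper):

```python
import numpy as np

def H(x):
    """d.f. of the two-cube distribution of Example 5 (uniform margins, so H = C)."""
    return 0.5 * np.prod(np.minimum(2 * x, 1), axis=1) \
         + 0.5 * np.prod(np.clip(2 * x - 1, 0, 1), axis=1)

rng = np.random.default_rng(6)
N = 300_000
cube = rng.integers(0, 2, N)[:, None]        # which of the two cubes
x = 0.5 * rng.random((N, 3)) + 0.5 * cube    # uniform point in that cube

rho3_plus = 8 * np.mean(np.prod(x, axis=1)) - 1       # near 3/4
rho3_minus = 8 * np.mean(np.prod(1 - x, axis=1)) - 1  # near 3/4
tau3 = (8 * np.mean(H(x)) - 1) / 3                    # near 1/2
```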
Example 6. Now let X, Y, and Z be r.v.'s on I^3 with a density h(x,y,z) = 2 in the four cubes [1/2,1] × [0,1/2] × [0,1/2], [0,1/2] × [1/2,1] × [0,1/2], [0,1/2] × [0,1/2] × [1/2,1], and [1/2,1]^3, and 0 elsewhere. Now ρ_3^+ = 1/8 and ρ_3^- = -1/8, while τ_3 = 0. Note that in this example X, Y, and Z are pairwise independent.
Example 7. Let X, Y, Z, and W be r.v.'s on I^4 with d.f.

H(x,y,z,w) = xyzw[1 + θ(1-x)(1-y)(1-z)(1-w)]

and density

h(x,y,z,w) = 1 + θ(1-2x)(1-2y)(1-2z)(1-2w),

where |θ| ≤ 1. Any three of these four r.v.'s are mutually independent; however, all four are not unless θ = 0, and

ρ_4^+ = ρ_4^- = 5θ/891 and τ_4 = 2θ/567.
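Example 7's values can be confirmed by midpoint-rule quadrature of the stated density and d.f. (an illustrative check, not from the paper):

```python
import numpy as np

theta = 0.7
m = 24
t = (np.arange(m) + 0.5) / m
X = np.meshgrid(t, t, t, t, indexing="ij")
h = 1 + theta * np.prod([1 - 2 * xi for xi in X], axis=0)    # density on I^4
P = np.prod(X, axis=0)                                        # x*y*z*w
H = P * (1 + theta * np.prod([1 - xi for xi in X], axis=0))   # d.f. (= copula)

rho4 = (5 / 11) * (16 * np.mean(P * h) - 1)   # rho_4^+ = rho_4^-
tau4 = (16 * np.mean(H * h) - 1) / 7          # tau_4
print(rho4, tau4)
```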
References

Joe, H. (1990). Multivariate concordance. J. Multivariate Anal. 35, 12-30.

Karlin, S. and Rinott, Y. (1980). Classes of orderings of measures and related correlation inequalities - I. J. Multivariate Anal. 10, 467-498.

Kruskal, W. H. (1958). Ordinal measures of association. J. Amer. Statist. Assoc. 53, 814-861.

Lehmann, E. L. (1966). Some concepts of dependence. Ann. Math. Statist. 37, 1137-1153.

Nelsen, R. B. (1992). On measures of association as measures of positive dependence. Statist. Probab. Lett. 14, 269-274.

Nelsen, R. B. (1993). Some concepts of bivariate symmetry. Nonparametric Statist. 3, 95-101.

Shaked, M. (1982). A general theory of some positive dependence notions. J. Multivariate Anal. 12, 199-218.

Department of Mathematical Sciences
Lewis and Clark College
Portland, OR 97219-7899
nelsen@lclark.edu