kaplan-meier estimate on the plane

Kaplan-Meier Estimate on the Plane

Dorota M. DabrowskaCarnegie-Mellon University

and University of California, Berkeley

Technical Report No. 92April 1987

(revised September 1987)

Department of StatisticsUniversity of CalifomiaBerkeley, California

KAPLAN-MEIER ESTIMATEON THE PLANE*

Dorota M. DabrowskaCarnegie-Mellon University

and University of California, Berkeley

Abstract

Estimation of the bivariate survival function from censored data is considered. Theproduct integral representation of univariate survival functions (Aalen and Johansen(1978)) is generalized to the bivariate case and used to determine identifiability of thesurvival function of the partially observed data. A bivariate analogue of the Kaplan -Meier (1958) estimate is introduced and its almost sure consistency is studied. Exten-sions to the general multivariate case are sketched.

Key words and phrases: product integrals, identifiability, censored data.

* This research was supported by the University of California PresidentialFellowship and the National Institute of General Medical Sciences Grant SSS-YlROlGM35416-02.

1. Introduction. Survival and reliability studies often involve observations onpaired individuals subject to censoring. Let T = (T1, T2) be a pair of nonnegative ran-dom variables (rv). The variables T1 and T2 are thought of as survival or failure timesand may represent lifetimes of manried couples, times from initiation of a treatmentuntil first response in two successive courses of a treatment in the same patient, etc.Under bivariate right censoring, the observable variables are given by Y = (Y1, Y2) and8 = (81' 82), where Yi = min(Ti, Z) and Bi = I(Ti = Yi). Here Z = (Z1, Z2) is a pair offixed or random censoring times thought to represent times to withdrawals from thestudy. We refer to Clayton (1978), Hanley and Parnes (1983), Campbell (1981) andClayton and Cuzick (1985) for examples of this type of censoring mechanism.

Two problems are addressed in this paper. Firstly, we discuss conditions whichensure identifiability of the underlying joint survival function of the partially observ-able failure times. In the univariate case Aalen and Johansen (1978) and Gill andJohansen (1987) show that the survival function can be expressed as a product integralof the cumulative hazard function. A similar representation is available in the bivari-ate case for a suitably defined bivariate cumulative hazard function. The latter is avector function representing cumulative hazards corresponding to "single" and "dou-ble" failures. Under the assumption of independence of the failure and censoringtimes, the bivariate cumulative hazard and the associated bivariate survival functioncan be easily expressed in terms of the joint distribution function of the observablevariables.

Further, we consider estimation of the survival function of the censored failuretimes. Our estimation procedure rests on the natural "substitution principle", i.e. theestimate is based on the sample counterpart of the product integral. We refer to it as abivariate Kaplan-Meier estimate. The name seems to be justified since apart from itsproduct integral form, the marginals are given by the univariate Kaplan-Meier (1958)estimates and in the absence of censoring, the estimator reduces to the usual empiricalsurvival function. The almost sure consistency of the bivariate Kaplan-Meier estimateis established.

The relevance of identifiability questions in inference problems related to compet-ing risks models has been raised by many authors, see Tsiatis (1975) and Peterson(1975) for instance. In the context of estimation of the bivariate survival functionfrom censored data, this problem was considered by Langberg and Shaked (1982).The authors suggested looking at P(T1 > s, T2 > t) = A(s,t) B(t) whereA(s,t) = P(T1 > s I T2> t) and B(t) = P(T2> t), and applying the product integralrepresentation of univariate survival functions to the terms A and B separately.

- 2 -

Properties of the corresponding estimator of the survival function were developed byCampbell and Foldes (1982), Campbell (1982), Horvath (1983), Burke (1984), Lo andWang (1986) and Horvith and Yandell (1986), among others. The estimator suffersfrom various drawbacks, in particular it is not a proper survival function since it is notmonotone in each of its coordinates, it does not reduce to the usual empirical survivalfunction in the case of uncensored data and is dependent on the selected path and ord-ering of the components. Ruymgaart (1987) considered estimation of the relatedcumulative hazard function.

Tsai et al. (1986) suggested an estimation procedure which involves estimation ofconditional survival functions using Beran's (1981) nonparametric regression methodsfor censored data. The estimator rests on smoothing techniques appropriate for non-

parametric density and regression estimation and, although it is consistent, its rate ofalmost sure consistency is very slow, as compared to the Campbell - F6ldes estimateor ours.

Campbell (1981), Hanley and Parnes (1983) and Mufnoz (1980) studied non-parametric MLE estimation using Efron's (1967) self-consistency algorithm and EMalgorithm of Dempster, Laird and Rubin (1977). The nonparametric MLE in thismodel does not have closed form expression and is not unique.

In connection with testing for independence, Pons (1986) derived a weak conver-gence result for the estimate of the bivariate cumulative hazard function correspondingto double failures (estimator All of Section 3). P.J. Bickel (personal communication)suggested the use of this estimator to construct an estimate of the bivariate survivalfunction. His approach boils down to solving the integral equation

S t

F(s,t) = F(s,0) + F(0,t) - 1 +JJF(u-,v-)A11 (du,dv)00

subject to the initial condition

F(s,0) = II (1 - Alo (du, 0))u!S

F (0, t) = rI (1 - AO, (0, dv))v t

where Alo, AO, and All are cumulative hazard functions corresponding to single anddouble failures (Section 2). This is an inhomogeneous Volterra equation and has asolution in terms of the Peano series

co

1 + I J... J All(du1,dvl) ... Al1(dun,dvn)n=1 0su1<...s

Osv< *... <V.st

The corresponding estimator obtained by plugging in estimates Aol, AO1 and All

- 3 -

(Section 3) has several of the same properties as the estimate considered in this paper:the definition is symmetric, the marginals are given by the Kaplan-Meier estimators, inthe absence of censoring the estirate reduces to the empirical survival function. It hasan extra nice property, namely it is always a survival function. However an undesir-able property is that it throws an important part of the data away.

2. Survival and cumulative hazard functions. In this section we show thecorrespondence between the bivariate survival and cumulative hazard functions. Forthe sake of completeness, we briefly consider the univariate case first. Next we definea bivariate hazard function and show that in analogy to the univariate case, it deter-mines the bivariate survival function. Extensions to higher dimensions are outlined.

2.1. Univariate survival times. Let T be a univariate nonnegative rv defined onsome probability space (Q2, F, P) and let F(t) = P(T> t) be its survival function. Furth-ermore, let A (t), A (dt) = -F(dt) / F(t-), A (0) = 0 be the associated cumulative hazardfunction. Then for t e [0, t] such that F(C) > 0, we have

F(t) = exp(-AC (t)) II (1 -A (Au)) (2.1)Ust

where Ac is the continuous component of A, the product is taken over the discontinuitypoints of F and A (Au) = A (u) -A (u-) denotes the size of the jump at time u. This isthe well known representation of univariate survival functions, we refer to Peterson(1977), Gill (1980), Beran (1981) and Wellner (1985) for its derivations. Aalen andJohansen (1978) and Gill and Johansen (1987) show that (2.1) can be written as a pro-duct integral

F (t) = 1l (1 - A (ds)) = lim [I (1 - A ((si_ , sisSTt maxl sr-Si-, I -- 0 i

where 0=so<si< .. <sn=t is a partition of (0,t] andA ((si-1=,si ]) A (si) - A (si_ ).

While various proofs of (2.1) seem to be available, the following simple argumentwill be useful in the sequel. For t e [0, t] such that F(C) >0, we have F(t) = exp (A(t)),where A(t) = log F(t). By the Jordan decomposition of functions of bounded variation,

t t t

A(t) = JA(du) = fAC(du)+ Ad (du), where AC and Ad are the continuous and discrete0 0 0

components of A. Specifically, Ad (t)=- XAd (Au) = X log( F(u) / F(u-))u5t Ust

- l; log (1 -A (Au)) where the sum is taken over the discontinuity points of F, andUSt

t

AC (t) = A(t) - Ad (t) = JF(u-)-1 FC (du) = -AC (t).0

- 4 -

2.2. Bivariate survival times. Let T = (T1, T2) be a pair of nonnegative rv'sdefined on a probability space (Q, F, P) and let F(s,t) = P(T1 > s, T2> t) be thecorresponding joint survival function. By a bivariate cumulative hazard function, wemean a vector function A (s,t) = (Alo (s,t), AO, (s,t), All (s,t)), where

A ~~~~P(T1 e ds, T2e:dt)_F(sdtAll(ds,Pdt)= ((T dsT2 Et) = F(ds,dt)

Ao (ds, ) P(T1Eds, T2> t) F(s-, t)

P(T1 > s, T2 e dt) -F(s, dt)A01(s,dt) = P(T1 > s, T2 . t) F(s, t-)

and A10(0,t)=AO1(s,0)=A11(0,0)=0. If F has a density f(s,t), we have A11(ds,dt)- X11 (s,t) dsdt, Alo (ds, t) = X10 (s,t) dt and AO, (s, dt) = kol (s,t) dt, where

1xi (s,t) = lim hh P(T1 [s, s+hl], T2 e [t, t+h2]I T1 s T2 . t)(hl,h2)-+o h1h2

= f(s,t) / F(s-,t-)

X10 (s,t) = hlim P(T1 E [s, s+h] I T1 . s, T2 > t) = f f(s,u) dv / F(s-, t)h-*0O h

00

X0o (s,t) = lim - P(T2 E [t, t+h] I T1 > s, T2 . t) = J f(u,t) du / F(s, t-).h-+o h 2

Thus kll (s,t) represents the instantaneous rate of a "double failure" at point (s,t),given that. the individuals were alive at times T1 = s- and T2 = t-. Further Xlo (s,t)represents the rate of a "single failure" at time s given that the first individual wasalive at time T1 = s- and the second survived beyond time T2 = t. The meaning of01 (s,t) is analogous.We next give the representation of the bivariate survival function F(s,t) in terms of

the bivariate hazard function A (s,t). Set A(s,t) = log F(s,t). Then for(s,t) E [0,t1] x [0,t2], F(r1, 2) > 0, we have

S t

F(s,t) = exp { A(s,t)) = expf I A(du, dv) + A(s,O) + A(O,t) }00

St

= F(s,0) F(0,t) exp iJA(du,dv)}.00

We exploit the Jordan decomposition of A(s,t). For (s,t) e [0,t1] x[0,t2], F(11,t2)>0,the function A(s,t) = logF(s,t) is a function of bounded variation in the sense of Vitali

- 5 -

and Hardy and Krause. We refer to Hildebrandt (1963, Ch. 3) and Clarkson andAdams (1933) for a survey of basic results on functions of bounded variation on theplane. By Theorem 5.4 in Hildebrandt (1963, p. 110) A(s,t) has a finite or countablenumber of discontinuities and they lie on a denumerable set of lines orthogonal to thecoordinate axes.

In what follows, for any bivariate function4 (As, t) = 4 (s,t) -4 (s-, t), 4 (s, At) = 4 (s,t) -4 (s, t-)= 4 (s,t) -4 (s,t-) -4 (s-, t) + 4 (s-, t-). Introduce sets

4 (s,t)and

we write4 (As, At)

El = ((s,t): A(s,t)<0, A(As, t) = A(s, At) = 0)

E2 = I (s,t): A(s,t) < 0, A(As, t) <0O, A(As, At) = 0)

E3 = {(s,t): A(s,t) <0, A(s, At) <0, A(As, At) = 0)

E4 = { (s,t) : A(s,t) < 0, A(As, At) > 0).

By the right-continuity and monotonicity of F, the set E1 corresponds to the support ofthe purely continuous component of A, while E4 is the support of the purely discretecomponent. Further, E2 and E3 are supports of components of A that have discontinui-ties lying along lines orthogonal to the coordinate axes. By the Jordan decompositionof functions of bounded variation on the plane, we have

S t 4J|A(du, dv) = I Ai (s,t)00 i=1

where

A1 (s,t) =

A2 (s,t) =

A3 (S,t) =

S t

J I[(U,V) EE1] A(du, dv)00

t

J I[(U,V) E E2] [A(u, dv) - A(u-, dv)]u!gs o

s

£ fI[(U,V) E E3] [A(du, v) - A(du, v-)]vsto

A4 (s,t) = Y; z I[(u,v) E E4] A(Au, Av).USS v St

We compute now thewill be useful:

explicit form of this decomposition. The following identities

F (u,v) / F (u-, v-) = 1 - A10 (Au, v-) - AO1 (u-,Av) + A1 1 (Au, Av)

F (u-, v) / F (u-, v-) = 1 - AO0 (u-, Av)

F(u,v-)/F(u-,v-) = 1 - Alo(Au,v-)

(2.2)

- 6 -

Consider the purely discrete part first. By (2.2), we have

CF(u,v) C logv CF(u,v-)

A4(s,t) = U!s Vt [ g F(u-,v-) ] g [F(u-,v-) J [F(u-,v-) ](u,v)eE4

A10 (Au,v-) AO, (u-,Av) -All (Au, Av) 1I I109 (1 A(A.- (2(1-A0

USS Vst (1 - Alo (U,v-)) (1 -AO (u-, Av))(u,v)eE4

Further, using (2.2) again

f F(du,v) _F(du,v-)A3 (s,t) = I I[(u,v) E3] [F(u-v) F(u-v-)

r F(du, Av) _ F(du,v-) F(u-, Av)I[(uv)t E L F(u-,v) F(u-,v) F(u-,v-)J

= £;fI[ (u,v) E E3] F (u-,9v-) F (du,Av) _ F (du, v-) F (u-,Av) ]vst0 F(u-,v) [F(u-,v-) F(u-,v-) F(u-,v-)

s 1[(u,v) E E31=sto I[1Al(u,v)e E3] [A11 (du, Av) -A10 (du,v-)Ao1 (u-, Av)]. (2.4)

Similarly,

A2(S't) = II[ (u E-] [Al (Au, dv) -A10 (Au, v-) AO (u-, Av)]. (2.5)

Finally,

StJ( F (du, dv) _F (du,v) F (u,dv)A1 (s,t) = u,v E1] [F(u,v) F(u,v) F(u,v)

S t

= ffI[(u,v) E1] [A1 (du,dv) -Alo (du,v) AO, (u,dv)]. (2.6)00

Note that formulas (2.4)-(2.6) follow heuristically from (2.3) through a Taylor expan-sion by noting that the quotient in (2.3) is infinitesimal on Ei, i = 1, 2, 3. Define func-tion L(s,t) by

L(du, dv) = A10 (du, v-) AO1 (u-, dv) - A11 (du, dv){1 - Alo (Au, v-)) { 1 - AO, (u-, Av))

By the definition of the sets Ei, we have Alo (Au, v-) =0 for (u,v) F E3 u E1, and

AO1 (u-, Av) = 0 for (u,v) EE2u E1. Therefore, combining (2.3) - (2.6) with the productintegral representation of the univariate survival functions, we arrive at the following

- 7 -

Proposition 2.1. For (s,t) such that F(s,t) >O, we have F(s,t) = F(s,O)4

F(O,t) II Bi (s,t), wherei=1d

s t

Bi(s,t) = exp(Aj(s,t)) = exp(-J I[(u,v)e E] L(du, dv)) i = 1,2,300

B4 (s,t) = II H [1 -L (Au, Av)],uss v5t(u,v)eE4

and F(s,O) = exp (-Afc (s,0)) HI { 1 - Alo (Au, 0)), F(0,t) =uss

exp(-A1 (0,t)) rI {1-A- (O,Av)}.VSt

Richard Gill pointed out to me that the representation of Proposition 2.1 can berewritten as a product integral

F(s,t) = II (1 - Alo(du,0)) H (1 - AO, (0,dv)) II (1 - L(du,dv)) (2.7)USS v5t u5s

VSt

where the last factor on the right-hand side is defined by

HI (1 - L (du, dv)) = lim rI (1 - L ((u._., ui ] x (vj, v]))u5s maxlu,-u 1-+0 ijVSt max vjvFl I -*

where 0 = uo < < um = s, 0 = vo < ... < vn = t is a partition andL ((ui-lg ui ] x (vj, vj]) = L (ui, vj) - L (ui_1, vj) - L (ui, v.i) + L (ui-,vF1). While arigorous argument requires extension of Gill and Johansen's (1987) results on productintegration, heuristically note that

1 - Alo(du,v-) = P(T1 > u T, . u,T2 . v) = F(u,v-)/F(u-,v-)

1 - AO, (u-,dv) = P(T2 > vIT1 . u,T2 . v) = F(u-,v)/F(u-,v-)

1 - All (du, dv) = (1 - Alo (du, v-)) + (1 - AO, (u-, dv))

- P(T1 > u,T2 > VIT 2 u,T2 2 v)= -F(u,v-) + F(u-,v) - F(u,v)) /F(u-,v-)

and after some algebra

1 - L(du,dv) = P(T1 > u,T2 > vIT1 . u,T2 2 v)P(T1 > ujT1 . u,T2 . v)P(T2 > vIT1 . u,T2 . v)

F (u, v) F (u-, v-)F (u, v-) F (u-, v)

Substitution of these expressions into the right-hand side of (2.7) and a little algebragives an alternative proof of Proposition 2.1.

- 8 -

2.3. Extensions to the multivariate case. We consider now the general multivari-ate case. Since the notation is cumbersome, we merely sketch the main points.

Let T = (T1,-- , TO) be a nonnegative rv defined on some probability space(Q,F, P) and let F(tl, * * . , tk) = P(T1> ,t .* * ,Tk> tk) be the corresponding survivalfunction. We define first the k-variate cumulative hazard function. Roughly speaking,this is the collection of 2k-1 functions representing the instantaneous risk of all possi-ble "q-tuple failures", q = l...k of components ml, ... ,mq at times tM * ,tlqgiven that they were alive at times tml -, * * * , t, - and that the remaining componentssurvived beyond times tm, me (l,...,k)-(ml,,.. , mqn . More precisely, let J be thecollection of all zero-one sequences (j) = (i,,* *Jik) such that Iji*O. By a k-variatecumulative hazard function we mean a triangular arTay A (t1l, * I tk)=- Aq(j)(t1,... , tk): q=l,...,k, (j)eJ, Yjm=q) consisting of k rows with k ele-

ments in the q-th row, q= l,...,k. For (j) e J such that ;jm=q and ji, =...= jiq = 1, thefunction Aqoj) is defined by

AqO() *-stJl dtji- tjl+1s t** *,j-1,) dtji, tji +1,)..tk)1 1, ~~~~qq q

F(tls9--)tjii - dtjil tjii+l.)..-.tj, -19 dtji, tji +l 9----tk)(_l)q I~~ 1 1 q q q

F(t19 -tj. -19 tj.l ,1 tji+ **tj -11, tji tji+19-A*k)'q q q

Thus for instance if k=3 then the first row of A (tl, t2, t3) consists of 3 cumulativehazard functions corresponding to "single failures", the second row consists of 3cumulative hazard functions corresponding to "double failures" and the third row con-sists of one function representing the cumulative hazard function of a "triple failure".

The derivation of the product integral representation of F(tl,...,tk) is inductive andfollows along the same lines as in the bivariate case. Set A(tl,...,tk) = logF(tl,...,tk).Then

tj tk k-l

F(tl,...,tk) = exp(f ... JA(dul,...,duk)) H [HI F(mltl,_.mktk)]( 1)ki (2.8)o o i=1 (in)

where the inner product in (2.8) is taken over all zero-one sequences (m) = (ml,...,mk)such that Inj=i. The double product in (2.8) represents, the ratio of products of i-dimensional marginal survival functions, i = 1,...,k-l. For instance if k=3 then thisfactor reduces to F(tl, t2, O) F(tl, O, t3) F(O, t2, t3) / {F(tl, 0,0) F(O, t2, O) F(0, 0, t3)). Toobtain the product integral representation of F(tl,...,tk) it remains to use the productintegral representation of i-dimensional marginal survival functions, i=l,...,k-l andnext apply the Jordan decomposition to the integral appearing in the exponent of (2.6).In the last step, note that the discontinuities lie on hyperplanes orthogonal to the axes

-9-

of the coordinate system The form of the purely continuous component can bededuced from the derivative of A (t, .. . , tk) with respect to tl, . . . , tk. The form ofthe purely discrete component follows from a little algebra applied to£ ..* * A(Aul, . ,AUk).

3. Estimation of the bivariate survival function from censored data. Weassume now that the data are censored and consider the identifiability question first.The failure times T= (T1, T2) and censoring times Z= (Z1, Z2) are defined on a com-mon probability space (Q, F, P) and the respective joint survival functions are denotedby F(s,t) and G(s,t). The observable rv's are given by Y = (Y1, Y2) and 6= (81) 62),where Yi = min(Ti, Zi) and 8i = I(Ti = Yi), i = 1,2.

By Proposition 2.1, the identifiability of F(s,t) will follow if we can show that thebivariate hazard function A (s,t) can be expressed in terms of the joint distributionfunction of Y and 6.

Set H(s,t) = P(Y1 > s, Y2 > t), K1 (s,t) = P(Y1 > s, Y2 > t,61 = 1, 82 = 1), K2 (s,t)=P(Y1 > s, Y2 > t,61 = 1) and K3 (s,t) = P(Y1 > s,Y2 > t,62 = 1). Assume

A. T = (T1, T2) and Z = (Z1, Z2) are independent.

Assumption A is sufficient to ensure identifiability of F on the support of H. Indeed,for (s,t) such that H(s,t) > 0, we have H(s,t) = G(s,t) F(s,t),K1 (ds, dt) = G(s-, t-) F(ds, dt), K2 (ds, t) = G(s-, t) F(ds, t) and K3 (s, dt) = G(s, t-) F(s, dt)so that

s t

A11 (s, t) = JJK (du, dv) / H(u-, v-)00

Alo(s,t) = -fK2(du,t)/H(u-,t)0

t

AO, (s, t) = -JK3 (s, dv) / H(s, v-).0

Suppose now that Yi = (Yli, Y2i), 8i= (81i, 62i), i= l,...,n is an iid sample, each(Yi,g i) having the same distribution as (Y, 6). To estimate the survival function F ofthe partially observable survival times, define

H(s,t) = n-1 £I(Yli> s, Y2i>t)K1 (s,t) = n-1 I(Y1H>s,Y2i>t,61i= l62i= 1)

- 10-

K2(s,t) = n711I(Y1i>s,Y2i>t,81i=1)K3 (s,t) = n 1XI(Y1i> s,Y2i>t, 82i= 1).

Further, let A (s,t) = (Alo (s,t), AO0 (s,t), All (s,t)) be an estimator of the bivariate cumu-lative hazard function given by

s t

All (s,t) = JK1 (du, dv)/H (u-, v-)00

s

Alo(s,t) = -JK2(du,t)/H(u-,t)0

t

AO, (s,t) = -JK3 (s, dv)/H (s, v-).0

A natural candidate for an estimator of F(s,t) is provided by

F(s,t) = F(s,0)F(O,t) Fl [ 1 - L(Au,Av)],0<u~sO<V.t

where

Alo (Au, v-) AO, (u-, Av) - Al 1 (Au, Av)

{1 - Alo (Au, v-)} { 1 - AO, (u-, Av)}and F (s,0) and F (O,t) are the usual Kaplan-Meier estimates, i.e.

F(s,O) = II[1-Ajo(Au,0)],u5s

F(O,t) = I[1-AO,(0OAv)].v<t

The marginals of F (s, t) are given by the univariate Kaplan-Meier estimates. In theabsence of censoring F (s, t) reduces to the -usual empirical survival function. This canbe verified by noting that the empirical survival function is purely discrete and by car-rying the same calculation as in (2.3). A referee pointed out that in the presence ofcensoring F (s, t) may fail to be monotone. As an example consider points(Yli,Y2i,lig,52i), i = 1, * * * , 4 given by (.51, .02, 1, 1), (.68, .68, 1, 1) (.11, .62, 1, 0)(.24, .24, 0, 0). Then F(s,0) = 1,.75,5,375,0 for s =0,.11,.24,.51,.68,F (0, t) = 1, .75, .75, .75,0 for t = 0, .02, .24, .62, .68, F (s, t) = 0.5 for(s,t) E [.11,.68) x [.02,68) and F(s,t) = 0 if s 2 .68 or t > .68. Note thatF(.51,,.02) = .5 > .375 = F(.51,0). Roughly speaking this anomaly, can be explainedby the fact that three different portions of the data set are involved in estimation ofA10, AO1 and All. It can be easily verified that Alo, AO0 and All satisfy

A1O (ds, dt) = (1 - Alo (As, t-)) L (ds, dt)

- 11 -

AO1 (ds, dt) = (1 - ASo (s-, At)) L (ds, dt).However, unless the data are uncensored, Alo, AO0 and All no longer satisfy this con-straint.

Extension to the general multivariate case is inductive. In Section 2.3, we haveoutlined the extension of the product integral representation of survival functions to thecase of k-dimensional failure times. Consider an iid sample Yi = (Yli, . . . , Yki) andSi = (li, .... &k9), where Yji = min(Tji, Zji) and 8ji = I(Tii = Yji), j = 1,...,k andi = 1,...,n. If the censoring variables Zi = (Zli . . . , Zki) are independent of the failuretimes Ti = (Tli,. . . , Tki) then the multivariate hazard function A (tl, .. . , tk) of Sec-tion 2.3 can be expressed in terms of the joint distribution function of Y's and 8's.The sample counterpart of A (tl, . . . , tk) coupled with the product integral representa-tion of the survival function F(t1, . . . , tk) yields the multivariate Kaplan - Meier esti-mate.

4. Consistency of the bivariate Kaplan-Meier estimate. In this section we con-sider consistency of the estimator F (s, t). For X = (r1,x'r2) let 11 1k denote thesupremum norm on [ 0,t1 ] x [0,t2 ]IProposition 4.1. Suppose that the condition A holds and X = (X1,l2) satisfiesH (XC1 2) > 0. Then 11 F - F --*O almost surely.

We have F (s,t) = F (s,O) F (O,t) fl Ci (s,t), where

Ci (s,t) = I {1 -L (Au, Av))O<vst(u,v)eEi

By Proposition 2.1 and the uniform consistency of the univariate Kaplan-Meier esti-mates (Foldes and Rejtb (1981), Csorgb and Horvnth (1983), Shorack and Wellner(1986, p. 305)), it is enough to show IIBi-CiJk--O a.s. for i=1,...,4. This will beestablished in a sequence of lemmas.

Lemma 4.1. Under assumptions of Proposition 3.1, JA11 -A11Ir-*-03, hIA1o-A1oJIr-+Oand IAo1 - Aol0l -*0 almost surely.

Proof. This follows from the Glivenko-Cantelli theorem and a simple algebra. Weomit the details.

Lemma 4.2. Under assumptions of Proposition 3.1, hIBi-Cill-*0 almost surely, fori= 1,2,3.

Proof. We consider the case of i =3, the proof in the remaining two cases is analo-gous. By the inequality ix-yyl < logx - log yl for 0< x,y< 1, we have

sup I B3 (s,t) - C3 (s,t) I < sup I F; l; [log (1 -L (Au, Av)) -L (Au, Av)]Iuss v t(u,v)E E3

- 12 -

s t

+ sup Iff I[(u,v) e E3] (L-L) (du,dv)1, (4.1)00

where the sup is taken over (s, t) e [0, ti,] x [0, 2] Consider the first sum on theright-hand side of (4.1). By the elementary inequality-log [1 - (1 + x)-1] - (1 + x)-1 < [x(1 + x)]-l for x >Oand x < -1 we have

x£ ilog (1 -L (Au, Av)) - L (Au, Av)I(u,v>EE3

.x x L (Au,Av)1-L(Au,Av))-(u,v) E3

< sup (L (Au, Av) : (u,v) E E3 r [O,tl] x [0,-2])}

xxxiy | AAO0(Au,v-)AO (u-,Av)-Aj1 (Au,Av)U5'rl V5r2 1 -A 1 (Au, v-) - AO (u-, Av) + All (Au, Av)(u,v)eE3

For (u,v) E E3, we have L(Au, Av) = 0. Therefore

sup{L (Au, Av): (u,v) EE3 n [O,tl] x [O,2])

< sup(IL(Au,Av)-L(Au, Av)I: O<u.<c1, 0<v.<2)

which converges almost surely to 0 by Lemma 4.1. Furthermore, the sum on theright-hand side of (4.2) stays bounded by the consistency of Alo, AO0 and Al and alittle algebra. Finally, the second term on the right-hand side of (4.1) convergesalmost surely to 0 by Lemma 4.1 and a simple calculation.

Lemma 4.3. Under assumptions of Proposition 3.1

x x IL (Au,Av) - L(Au, Av)I-OUSTi vS'T(u,v)eE4

almost surely.

Proof. We have

xx IL (Au, Av) - L (Au, Av)I < xx IL1 (Au, Av) - L1 (Au, Av)I+

xx jL2 (Au, Av) - L2 (Au, Av)I. (4.3)

Here the sums extend over u <tl, v -c2 such that (u,v) E E4. Furthermore,

L1(Au, Av) = All(Au,Av)/{(1-Alo(Au,v-))(1-Aol(u-,Av))}L2(Au,Av) = Alo(Au,v-)Aol (u-,Av)/ ((1 -Alo (Au, v-)) (1-Aol (u-,Av))I}.

The terms L1 and L2 are defined by replacing All, Al0 and AO0 by their sample coun-terparts in L1 and L2.

- 13 -

Lemma 4.1 and a litde algebra imply sup iLl (Au, Av) - L1 (Au, Av)I -+0 a.s. withsup taken over O.su s cl and Os v.sx, and iILi-Llk-0 a.s. Since almost sure con-vergence implies weak convergence and the set E4 is closed, we have

lim : LIL(Au,Av):5 X I L,(Au.Av) < co

USt1 V5¶2 UV¶1 VV¶2

(u,v)eE4 (u,v)eE4

Scheffe's theorem (see e.g. Shorack and Wellner, (1986, p. 862)) implies

X IL, (Au, Av) - L1 (Au, Av)I -4 0U:S1 VSt2

(u,v hE4

almost surely. The second sum in (4.3) can be treated in a similar way.

Lemma 4.4. Under assumption of Proposition 3.1, iiB4 - C4iio-+0 a.s.

Proof. Since H(t1,j2) > 0, we can find r1 < 1 such thatsup(L (Au, Av): 0 < u.1, 0 <v. 2)} <1T. Fix e>0. By Lemma 4.1 and a little alge-bra, sup(IL(Au,Av)-L(Au,Av)l/(l-L(Au,Av)): O<u<t,O<v<t2)<e for n

sufficiently large. Further, by the mean value theorem, we haveIlog (1 - x) < lxl(1 - e)-1 for xlx<e. Therefore, for n sufficiently large

sup lB4(s,t)-C4(s,t)I<I 1 Iog(1 -L(Au,Av))-log(1 -L(Au,Av)) I

= Iilog 1 L(Au,Av)-L(Au,Av)%. 1 -L(Au,Av) .

(1 -e)-'I;L(Au,Av)-L(Au,Av)I/(1 -L(Au,Av))

< (1 - £,)-' (1 - 11)1 IzIIL (Au, Av) - L (Au, Av)j.Here all the sums are taken over 0 < u < 1, 0 < v < C2 such that (u,v) e E4. Lemma 4.3implies that this bound converges almost surely to zero.

Acknowledgement. I thank Richard Gill and the referees for their comments.

References

Aalen, 0.0. and Johansen, S. (1978). An empirical transition matrix for non-homogeneous Markov chains based on censored observations. Scand. J. Statist. 5,141-150.

Beran, R. (1981). Nonparametric regression with randomly censored survival data.Tech. Report, University of California, Berkeley.

Burke, M.D. (1984). An almost-sure approximation of a multivariate product-limitestimator under ranodm censorship. Preprint.

- 14 -

Campbell, G. (1981). Nonparametric bivariate estimation with randomly censoreddata. Biometrika 68, 417-422.

Campbell, G. (1982). Asymptotic properties of several nonparametric multivariate dis-tribution function estimators under random censorship. In Survival Analysis, IMSLecture Notes 2, 243-256.

Campbell, G. and Foldes, A. (1982). Large sample properties of nonparametric bivari-ate estimators with censored data. In Nonparametric Statistical Inference, Collo-quia Mathernatica-Societatis Janos Bolyai (B.V. Gnedenko, M.L. Puri and I.Vincze, eds.). North Holland, Amsterdam.

Clarkson, J.A. and Adams, C.R. (1933). On the definitions of bounded variation forfunctions of two varibles. Trans. Amer. Math. Soc. 35, 824-854.

Clayton, D.G. and Cuzick, J. (1985). Multivariate generalizations of the proportionalhazards model. J. Roy. Statist. Soc. Ser. A 148, 82-117.

Clayton, D.G. (1978). A model of association in bivariate life tables and its applica-tion in epidemiological studies of familiar tendency in chronic disease incidence.Biometrika 65, 141-151.

Csorg6, S. and Horvath, L. (1983). The rate of strong uniform consistency for theproduct-limit estimator. Z. Wahrscheinlichkeitstheorie verw. Gebiete 62, 411-426.

Dempster, A.P., Laird, N.M. and Rubin, D.B. (1977). Maximum likelihood estimationfor incomplete data via the EM algorithm. J. Roy Statist. Soc. Ser. B 39, 1-38.

Efron, B. (1967). The two-sample problem with censored data. Proc. Fifth BerkeleySymp. Math. Statist. Probab. 4, 831-853.

Foldes, A. and Rejt8, L. (1981). Strong uniform consistency for nonparametric sur-vival arrive estimators from randomly censored data. Ann. Statist. 9, 122-129.

Gill, R.D. (1980). Censoring and Stochastic Integrals. Mathematical Centre Tracts124, Amsterdam.

Gill, R.D. and Johansen, S. (1987). Product integrals and counting processes. TechReport MS-R8707, Centre for Mathematics and Computer Science, Amsterdam.

Hanley, J.A. and Parnes, M.N. (1983). Nonparametric estimation of a multivariate dis-tribution in the presence of censoring. Biometrics 39, 129-139.

Hildebrandt, T.K. (1963). Introduction to the Theory of Integration. Academic Press,New York.

Horvath, L. (1983). The rate of strong uniform consistency for the multivariateproduct-limit estimator. J. Mult. Analysis 13, 202-209.

- 15 -

Horvith, L. and Yandell, B.S. (1986). Bootstrapped multidimensional product limitprocess. Tech. Report, University of Madison, Wisconsin.

Kaplan, E.L. and Meier, P. (1958). Nonparametric estimation from incomplete obser-vations. J. Amer. Statist. Assoc. 53, 457-481.

Langberg, N.A. and Shaked, M. (1982). On the identifiability of multivariate life dis-tribution functions. Ann. Probab. 10, 773-779.

Lo, S.H. and Wang, J.L. (1986). Iid representations for the bivariate product limitestimators and the bootstrap versions. Tech. Report, University of California,Davis.

Munoz, A. (1980). Nonparametric estimation from censored bivariate observations.Tech. Report, Stanford University.

Peterson, A.V. (1975). Nonparametric estimation in the competing risk problem.Tech. Report, Stanford University.

Peterson, A.V. (1977). Expressing the Kaplan-Meier estimator as a function of empin-cal subsurvival functions. J. Amer. Statist. Assoc. 72, 854-858.

Pons, 0. (1986). A test of independence between two censored survival times. Scand.J. Statist. 13, 173-185.

Ruymgaart, F. (1987). Some properties of bivariate empirical hazard processes underrandom censoring. Tech Rep 8718. Catholic University of Nijmegen, Netherlands.

Shorack, G.R. and Wellner, J.A. (1986). Empirical Processes with Applications toStatistics. Wiley, New York.

Tsiatis, A. (1975). A non-identifiability aspect of the problem of competing risks.Proc. Nat. Acad. Sciences USA 72, 20-22.

Tsai, W.Y., Leurgans, S. and Crowley, J. (1986). Nonparametric estimation of abivariate survival function in the presence of censoring. Ann. Statist. 14, 1351-1365.

Wellner, J.A. (1985). A heavy censoring limit theorem for the product limit estimator.Ann. Statist. 13, 150-162.

kaplan-meier estimate on the plane

Documents