Using martingale residuals to assess goodness of fit for sampled risk set data

Ørnulf Borgan
Department of Mathematics
University of Oslo

Based on joint work with Bryan Langholz


Outline:

• Example: Uranium miners cohort

• Cohort model, data and martingale residuals

• Risk set sampling

• Martingale residuals and goodness-of-fit tests for sampled risk set data

• Concluding remarks


Uranium miners cohort:

• 3347 uranium miners from Colorado Plateau included in study cohort 1950-60

• Followed up until the end of 1982

• 258 lung cancer deaths

• Interested in effect of radon and smoking exposure on the risk of lung cancer death

• Have exposure information for the full cohort. Will sample from the risk sets for illustration

(e.g. Langholz & Goldstein, 1996)


Relative risk regression models

Hazard rate for individual i (relative risk times baseline hazard):

$$\alpha_i(t) = \psi_i \, \alpha_0(t)$$

Relative risk $\psi_i$ for individual i depends on covariates $x_{i1}, x_{i2}, \ldots, x_{ip}$ (possibly time-dependent)

Cox:

$$\psi_i = \exp(\beta_1 x_{i1} + \cdots + \beta_p x_{ip})$$

Excess relative risk:

$$\psi_i = (1 + \beta_1 x_{i1}) \cdots (1 + \beta_p x_{ip})$$
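As a rough illustration of how the two relative risk forms differ, the sketch below evaluates both for one individual. The function names, coefficients and covariate values are hypothetical and chosen only for this example; they are not taken from the talk.

```python
import math

def cox_relative_risk(beta, x):
    """Cox form: exp(beta_1*x_1 + ... + beta_p*x_p)."""
    return math.exp(sum(b * xi for b, xi in zip(beta, x)))

def excess_relative_risk(beta, x):
    """Excess relative risk form: (1 + beta_1*x_1) * ... * (1 + beta_p*x_p)."""
    return math.prod(1.0 + b * xi for b, xi in zip(beta, x))

# Hypothetical coefficients and covariate values, for illustration only.
beta = [0.5, 0.25]
x = [2.0, 4.0]

print(cox_relative_risk(beta, x))     # exponential in the covariates
print(excess_relative_risk(beta, x))  # product of linear factors
```

Both forms equal 1 when all covariates are zero, so in either case the baseline hazard is the hazard of an individual with zero covariate values.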


Cohort data:

[Figure: individuals at risk plotted against study time; arrows are censored observations]


$t_1 < t_2 < t_3 < \cdots$ times of failures

$i_j$ individual failing at $t_j$ ("case")

Counting process for individual i:

$$N_i(t) = \sum_{j} I\{t_j \le t,\; i_j = i\}$$

Intensity process $\lambda_i(t)$ is given by

$$\lambda_i(t)\,dt = P(dN_i(t) = 1 \mid \text{"past"})$$


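Intensity processes:

$$\lambda_i(t) = Y_i(t)\,\alpha_i(t) = Y_i(t)\,\psi_i\,\alpha_0(t)$$

($Y_i(t)$: at risk indicator; $\alpha_i(t) = \psi_i\,\alpha_0(t)$: hazard rate)

Cumulative intensity processes:

$$\Lambda_i(t) = \int_0^t \lambda_i(u)\,du = \int_0^t Y_i(u)\,\psi_i\,dA_0(u)$$

Martingales:

$$M_i(t) = N_i(t) - \Lambda_i(t)$$

Martingale residual processes:

$$\hat M_i(t) = N_i(t) - \hat\Lambda_i(t)$$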
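The slide does not say how the cumulative intensity is estimated. The sketch below is one common choice, stated here as an assumption: the fitted relative risks are taken as given, and the cumulative baseline hazard is estimated by the Breslow estimator, which jumps by one over the sum of the relative risks at risk at each failure time. The function name and the toy data are hypothetical.

```python
def martingale_residuals(exit_time, event, rel_risk):
    """Cohort martingale residuals evaluated at the end of follow-up.

    exit_time[i]: time at which individual i fails or is censored
    event[i]:     1 if individual i fails at exit_time[i], else 0
    rel_risk[i]:  fitted relative risk of individual i (time-fixed here)
    Assumes everyone is at risk from time 0 (no delayed entry).
    """
    n = len(exit_time)
    failure_times = sorted(exit_time[i] for i in range(n) if event[i])
    cum_intensity = [0.0] * n
    for t in failure_times:
        at_risk = [l for l in range(n) if exit_time[l] >= t]
        # Breslow increment of the cumulative baseline hazard at t
        d_a0 = 1.0 / sum(rel_risk[l] for l in at_risk)
        for l in at_risk:
            cum_intensity[l] += rel_risk[l] * d_a0
    # Residual = observed count minus estimated cumulative intensity
    return [event[i] - cum_intensity[i] for i in range(n)]

# Hypothetical toy cohort, for illustration only.
exit_time = [2.0, 3.0, 5.0, 7.0, 8.0]
event = [1, 0, 1, 1, 0]
rel_risk = [1.0, 2.0, 1.5, 0.5, 1.0]

residuals = martingale_residuals(exit_time, event, rel_risk)
print(residuals)
print(sum(residuals))  # zero up to rounding error
```

With the Breslow estimator the residuals sum to zero over the cohort whatever relative risks are plugged in, which is why individual values say little on their own.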


Martingale residual processes may be used to assess goodness of fit:

• Plot individual martingale residuals $\hat M_i = \hat M_i(\tau)$ versus covariates (Therneau, Grambsch & Fleming, 1990)

• Plot grouped martingale residual processes $M_g^*(t) = \sum_{i \in g} \hat M_i(t)$ versus time (Aalen, 1993; Grønnesby & Borgan, 1996)

The latter may be extended to sampled risk set data


Risk set sampling

• Cohort studies need information on covariates for all individuals at risk

• Expensive to collect and check (!) this information for all individuals in large cohorts

• With risk set sampling designs, covariate information is needed only for the cases and a few controls sampled at the failure times


Select $m - 1$ controls among the $n(t) - 1$ non-failures at risk if a case occurs at time t, i.e. match on study time

[Figure: illustration for m = 2, with cases and controls marked]


A sampled risk set $R_j$ consists of the case $i_j$ and its controls

A sampling design for the controls is described by its sampling distribution

The classical nested case-control design: If individual i fails at time t, the probability of selecting the set r as the sampled risk set is

$$\pi_t(r \mid i) = \binom{n(t) - 1}{m - 1}^{-1}$$

(we assume that r is a subset of the risk set, that r is of size m and that i is in r)

A number of sampling designs are available
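The classical design can be sketched in a few lines: draw m - 1 controls without replacement from the non-failures at risk, so that every set of size m containing the case is equally likely. The function names, the seed and the toy risk set below are hypothetical, chosen only for this illustration.

```python
import math
import random

def sample_risk_set(case, at_risk, m, rng):
    """Return the sampled risk set: the case plus m - 1 controls drawn
    without replacement from the other individuals at risk."""
    candidates = [l for l in at_risk if l != case]
    controls = rng.sample(candidates, m - 1)
    return {case, *controls}

def selection_probability(n_at_risk, m):
    """Probability of any particular set of size m that contains the case:
    one over the number of ways to choose m - 1 from n(t) - 1."""
    return 1.0 / math.comb(n_at_risk - 1, m - 1)

rng = random.Random(1)      # fixed seed so the sketch is repeatable
at_risk = list(range(10))   # hypothetical risk set with n(t) = 10
r = sample_risk_set(case=3, at_risk=at_risk, m=4, rng=rng)

print(sorted(r))
print(selection_probability(len(at_risk), 4))  # 1 / C(9, 3)
```

The selection probability depends only on n(t) and m, not on which member of the set is the case.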


Inference on the regression coefficients can be based on the partial likelihood

$$L(\beta) = \prod_{t_j} \frac{\psi_{i_j}\,\pi_{t_j}(R_j \mid i_j)}{\sum_{l \in R_j} \psi_l\,\pi_{t_j}(R_j \mid l)}$$

with $\psi_l$ the relative risk of individual $l$

The partial likelihood enjoys the usual likelihood properties (Borgan, Goldstein & Langholz 1995)

For the classical nested case-control design, the partial likelihood simplifies
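To spell out the simplification, under the stated form of the classical design: every set of size m that contains the case is equally likely, so the sampling probability $\pi_{t_j}(R_j \mid l)$ takes the same value for each member $l$ of the sampled risk set $R_j$ and cancels from the ratio. Writing $\psi_l$ for the relative risk of individual $l$, the partial likelihood then has the same form as the usual cohort partial likelihood, with sums over sampled risk sets instead of full risk sets:

$$L(\beta) = \prod_{t_j} \frac{\psi_{i_j}}{\sum_{l \in R_j} \psi_l}$$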


Martingale residuals and goodness-of-fit tests for sampled risk set data

Introduce the counting processes

$$N_{(i,r)}(t) = \sum_{j} I\{t_j \le t,\; (i_j, R_j) = (i, r)\}$$

Intensity processes take the form:

$$\lambda_{(i,r)}(t) = \lambda_i(t)\,\pi_t(r \mid i) = Y_i(t)\,\psi_i\,\alpha_0(t)\,\pi_t(r \mid i)$$


Corresponding martingales:

$$M_{(i,r)}(t) = N_{(i,r)}(t) - \int_0^t \lambda_{(i,r)}(u)\,du$$

Martingale residual processes:

$$\hat M_{(i,r)}(t) = N_{(i,r)}(t) - \int_0^t \hat\lambda_{(i,r)}(u)\,du$$

The $\hat M_{(i,r)}(t)$ are of little practical use on their own, but they may be aggregated over groups of individuals to produce useful plots


For group g

$$M_g^*(t) = \sum_{i \in g} \sum_{r} \hat M_{(i,r)}(t) = \sum_{i \in g} N_i(t) - \sum_{t_j \le t} \frac{\sum_{l \in R_j \cap g} \hat\psi_l\,\pi_{t_j}(R_j \mid l)}{\sum_{l \in R_j} \hat\psi_l\,\pi_{t_j}(R_j \mid l)}$$

May be interpreted as "observed − expected" number of failures in group g

Asymptotic distribution may be derived using counting process methods

Simplifies for classical nested case-control
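A minimal sketch of the grouped residual process, assuming the classical nested case-control design so that the sampling probabilities cancel from the expected fractions. At each failure time the case adds one observed failure to its group, and each member of the sampled risk set subtracts its share of the fitted relative risks from its own group. The function name, data layout and toy values are hypothetical.

```python
def grouped_residuals(sampled_risk_sets, rel_risk, group, groups):
    """Observed minus expected failures per group, accumulated over time.

    sampled_risk_sets: list of (failure_time, case, members), ordered by
                       time; members includes the case
    rel_risk[l]:       fitted relative risk of individual l
    group[l]:          group label of individual l
    Returns a list of (failure_time, {g: residual up to that time}).
    """
    running = {g: 0.0 for g in groups}
    path = []
    for t, case, members in sampled_risk_sets:
        total = sum(rel_risk[l] for l in members)
        running[group[case]] += 1.0                   # observed failure
        for l in members:
            running[group[l]] -= rel_risk[l] / total  # expected share
        path.append((t, dict(running)))
    return path

# Hypothetical sampled risk sets (m = 3), for illustration only.
sets = [(1.0, 0, [0, 3, 5]), (2.5, 4, [4, 1, 2]), (4.0, 2, [2, 5, 6])]
rel_risk = {0: 1.0, 1: 1.2, 2: 2.0, 3: 0.8, 4: 1.5, 5: 1.1, 6: 0.9}
group = {0: "I", 1: "I", 2: "II", 3: "II", 4: "III", 5: "III", 6: "I"}

path = grouped_residuals(sets, rel_risk, group, ["I", "II", "III"])
for t, res in path:
    print(t, res)
```

At every failure time the residuals sum to zero across groups, since the expected shares within each sampled risk set add up to the one observed failure. Plotting each group's values against time gives the grouped martingale residual plot.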


Illustration: uranium miners cohort

Fit excess relative risk model:

$$\psi_i = (1 + \beta_1 x_{i1})(1 + \beta_2 x_{i2})$$

$x_{i1}$ = cumulative radon (100 WLMs)

$x_{i2}$ = cumulative smoking (1000 packs)

For classical nested case-control with three controls per case:

$\hat\beta_1 = 0.556$ (0.215) per 100 WLMs

$\hat\beta_2 = 0.276$ (0.093) per 1000 packs


Aggregate martingale residual processes in three groups according to cumulative radon exposure:

Groups: I: < 500 WLMs; II: 500-1500 WLMs; III: > 1500 WLMs

There are indications of an interaction between cumulative radon exposure and age


Observed and expected number of failures in the groups for ages below and above 60 years:

Age and group                   Observed   Expected
Below 60 years & group I           30        30.7
Below 60 years & group II          39        45.9
Below 60 years & group III         81        73.4
Above 60 years & group I           27        27.7
Above 60 years & group II          45        36.1
Above 60 years & group III         36        44.2

Chi-squared statistic with 2(3 – 1) = 4 df takes the value 10.5 (P-value 3.2%)
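The reported P-value can be checked from the statistic, and the table can be checked for internal consistency. The sketch below uses the closed-form upper tail of the chi-squared distribution with 4 df. It does not recompute the statistic 10.5: a plain Pearson sum over the six cells gives a clearly smaller value, so the reported statistic evidently rests on more than the table shows (presumably the estimated covariance of the grouped residuals, which is an assumption here). Variable and function names are hypothetical.

```python
import math

observed = [30, 39, 81, 27, 45, 36]
expected = [30.7, 45.9, 73.4, 27.7, 36.1, 44.2]

def chi2_sf_4df(x):
    """Upper tail probability of a chi-squared variable with 4 df.
    For 4 df the survival function is exp(-x/2) * (1 + x/2)."""
    return math.exp(-x / 2.0) * (1.0 + x / 2.0)

# Observed minus expected failures per cell, as in the table
differences = [o - e for o, e in zip(observed, expected)]

# Plain Pearson sum, shown only for comparison with the reported 10.5
pearson = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

p_value = chi2_sf_4df(10.5)

print(sum(observed), round(sum(expected), 1))
print([round(d, 1) for d in differences])
print(round(pearson, 2), round(p_value, 3))
```

Observed and expected counts both total the 258 lung cancer deaths in the cohort. The tail probability at the rounded value 10.5 comes out slightly above 3%, in line with the reported 3.2% given that 10.5 is itself rounded.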


Concluding remarks

The counting process formulation of nested case-control studies:

• Introduces a time aspect that is usually disregarded for sampled risk set data

• Gives a model formulation similar to that for cohort data and thereby opens up for methodological developments similar to those for cohort studies

• Grouped martingale residual processes are one example of this. They make it possible to check for time-dependent effects and other deviations from the model


Questions and further developments of grouped martingale residual plots and related goodness-of-fit methods:

• How should the grouping be performed?

• How do specific deviations from the model turn up in the plots?

• Kolmogorov-Smirnov and Cramér-von Mises type tests? (Durbin's approximation, Lin et al.'s simulation trick)