Using martingale residuals to assess goodness of fit for sampled
risk set data
Ørnulf Borgan
Department of Mathematics
University of Oslo
Based on joint work with Bryan Langholz
Outline:
• Example: Uranium miners cohort
• Cohort model, data and martingale residuals
• Risk set sampling
• Martingale residuals and goodness-of-fit tests for sampled risk set data
• Concluding remarks
Uranium miners cohort:
• 3347 uranium miners from Colorado Plateau included in study cohort 1950-60
• Followed up until the end of 1982
• 258 lung cancer deaths
• Interested in effect of radon and smoking exposure on the risk of lung cancer death
• Have exposure information for the full cohort. Will sample from the risk sets for illustration
(e.g. Langholz & Goldstein, 1996)
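Relative risk regression models
Hazard rate for individual i:
$$\alpha_i(t) = \psi_i \, \alpha_0(t)$$
where $\psi_i$ is the relative risk and $\alpha_0(t)$ is the baseline hazard
Relative risk for individual i depends on covariates $x_{i1}, x_{i2}, \ldots, x_{ip}$ (possibly time-dependent)
Cox:
$$\psi_i = \exp(\beta_1 x_{i1} + \cdots + \beta_p x_{ip})$$
Excess relative risk:
$$\psi_i = (1 + \beta_1 x_{i1}) \cdots (1 + \beta_p x_{ip})$$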
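The two relative risk forms can be compared numerically. The sketch below is only an illustration: the function names, coefficients and covariate values are hypothetical and not taken from the talk.

```python
import math

def cox_relative_risk(beta, x):
    """Cox form: exp(beta_1 * x_1 + ... + beta_p * x_p)."""
    return math.exp(sum(b * xi for b, xi in zip(beta, x)))

def excess_relative_risk(beta, x):
    """Excess relative risk form: (1 + beta_1 * x_1) ... (1 + beta_p * x_p)."""
    return math.prod(1.0 + b * xi for b, xi in zip(beta, x))

# Hypothetical coefficients and covariate values, for illustration only.
beta = [0.5, 0.25]
x = [2.0, 4.0]
print(cox_relative_risk(beta, x))     # exp(0.5 * 2 + 0.25 * 4) = exp(2)
print(excess_relative_risk(beta, x))  # (1 + 1) * (1 + 1) = 4
```

Both forms equal 1 when all covariates are zero, so the baseline hazard is the hazard of an individual with zero covariate values.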
Cohort data:
[Figure: individuals at risk plotted against study time; arrows are censored observations]
$t_1 < t_2 < t_3 < \cdots$ times of failures
$i_j$ individual failing at $t_j$ ("case")
Counting process for individual i:
$$N_i(t) = \sum_{j} I\{t_j \le t,\; i_j = i\}$$
Intensity process $\lambda_i(t)$ is given by
$$\lambda_i(t)\,dt = P\big(dN_i(t) = 1 \mid \text{"past"}\big)$$
$$\lambda_i(t) = Y_i(t)\,\alpha_i(t) = Y_i(t)\,\psi_i\,\alpha_0(t)$$
($Y_i(t)$: at risk indicator, $\alpha_i(t)$: hazard rate)
Cumulative intensity processes:
$$\Lambda_i(t) = \int_0^t \lambda_i(u)\,du = \int_0^t Y_i(u)\,\psi_i\,dA_0(u)$$
Martingales:
$$M_i(t) = N_i(t) - \Lambda_i(t)$$
Martingale residual processes:
$$\hat M_i(t) = N_i(t) - \hat\Lambda_i(t)$$
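The cohort martingale residuals can be computed once an estimator of the cumulative baseline hazard is chosen. The sketch below assumes a Breslow-type estimator, which the slide does not state. It also assumes no tied failure times, everyone under observation from time zero, and time-fixed fitted relative risks. The function and argument names are hypothetical.

```python
def cohort_martingale_residuals(exit_time, failed, psi_hat):
    """Return the martingale residual at end of follow-up for each individual.

    exit_time[i]: time at which individual i fails or is censored
    failed[i]:    True if individual i fails at exit_time[i]
    psi_hat[i]:   fitted relative risk for individual i
    """
    n = len(exit_time)
    failure_times = sorted(exit_time[i] for i in range(n) if failed[i])
    cum_intensity = [0.0] * n
    for t in failure_times:
        # Individuals with at risk indicator Y_i(t) = 1.
        at_risk = [i for i in range(n) if exit_time[i] >= t]
        denom = sum(psi_hat[i] for i in at_risk)
        for i in at_risk:
            # Increment Y_i(t) * psi_i * dA0_hat(t), with dA0_hat(t) = 1 / denom.
            cum_intensity[i] += psi_hat[i] / denom
    return [float(failed[i]) - cum_intensity[i] for i in range(n)]

# Hypothetical data: four individuals, two failures.
residuals = cohort_martingale_residuals(
    exit_time=[2.0, 3.0, 5.0, 7.0],
    failed=[True, False, True, False],
    psi_hat=[1.0, 2.0, 1.0, 0.5],
)
print(residuals)
```

Under these assumptions the residuals sum to zero over the cohort, because each failure adds one observed event and exactly one unit of estimated cumulative intensity spread over the risk set.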
Martingale residual processes may be used to assess goodness of fit:
• Plot individual martingale residuals $\hat M_i = \hat M_i(\tau)$ versus covariates (Therneau, Grambsch & Fleming, 1990)
• Plot grouped martingale residual processes $M_g^*(t) = \sum_{i \in g} \hat M_i(t)$ versus time (Aalen, 1993; Grønnesby & Borgan, 1996)
The latter may be extended to sampled risk set data
Risk set sampling
• Cohort studies need information on covariates for all individuals at risk
• Expensive to collect and check (!) this information for all individuals in large cohorts
• For risk set sampling designs one only needs to collect covariate information for the cases and a few controls sampled at the failure times
Select m – 1 controls among the n(t) – 1 non-failures at risk if a case occurs at time t, i.e. match on study time
[Figure: illustration for m = 2, with cases and controls marked]
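The control selection step can be sketched in a few lines. This is a minimal illustration of simple random sampling without replacement from the non-failures at risk. The function name, the individual labels and the seed are hypothetical.

```python
import random

def sample_risk_set(case, at_risk, m, rng=random):
    """Return the sampled risk set: the case plus m - 1 randomly chosen controls."""
    non_failures = [i for i in at_risk if i != case]
    controls = rng.sample(non_failures, m - 1)  # sampling without replacement
    return [case] + controls

rng = random.Random(1)
at_risk = [3, 7, 8, 12, 15]  # hypothetical individuals at risk at time t
print(sample_risk_set(case=7, at_risk=at_risk, m=2, rng=rng))
```

Because sampling is repeated independently at each failure time, an individual may serve as a control for several cases and may later become a case.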
A sampled risk set $\tilde R_j$ consists of the case $i_j$ and its controls
A number of sampling designs are available
A sampling design for the controls is described by its sampling distribution
If individual i fails at time t, the probability of selecting the set r as the sampled risk set is $\pi_t(r \mid i)$
(we assume that r is a subset of the risk set, that r is of size m and that i is in r)
The classical nested case-control design:
$$\pi_t(r \mid i) = \binom{n(t)-1}{m-1}^{-1}$$
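For the classical nested case-control design the sampling probability is the reciprocal of the number of ways to choose m – 1 controls from the n(t) – 1 non-failures at risk. A small sketch, with a hypothetical function name and made-up numbers:

```python
from math import comb

def ncc_sampling_probability(n_at_risk, m):
    """Probability of any valid sampled risk set r given the case: 1 / C(n(t) - 1, m - 1)."""
    return 1.0 / comb(n_at_risk - 1, m - 1)

# Five individuals at risk, one control per case (m = 2): 1 / C(4, 1).
print(ncc_sampling_probability(n_at_risk=5, m=2))  # 0.25
```

The probability does not depend on which individual in r is the case, which is what makes the partial likelihood simplify for this design.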
Inference on the regression coefficients can be based on the partial likelihood
$$L(\beta) = \prod_{t_j} \frac{\psi_{i_j}\,\pi_{t_j}(\tilde R_j \mid i_j)}{\sum_{l \in \tilde R_j} \psi_l\,\pi_{t_j}(\tilde R_j \mid l)}$$
The partial likelihood enjoys usual likelihood properties (Borgan, Goldstein & Langholz 1995)
For the classical nested case-control design, the partial likelihood simplifies
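The simplification can be spelled out as follows. In the classical design the sampling probability takes the same value for every member of the sampled risk set, so it cancels between numerator and denominator. Here $\psi_l$ denotes the relative risk of individual $l$ and $\tilde R_j$ the sampled risk set at $t_j$; the notation is a reconstruction of the slide's symbols.

```latex
\pi_{t_j}(\tilde R_j \mid l) = \binom{n(t_j)-1}{m-1}^{-1}
\quad \text{for every } l \in \tilde R_j ,
\qquad \text{so} \qquad
L(\beta)
= \prod_{t_j} \frac{\psi_{i_j}\,\pi_{t_j}(\tilde R_j \mid i_j)}
                   {\sum_{l \in \tilde R_j} \psi_l\,\pi_{t_j}(\tilde R_j \mid l)}
= \prod_{t_j} \frac{\psi_{i_j}}{\sum_{l \in \tilde R_j} \psi_l} .
```

The result has the same form as Cox's partial likelihood for cohort data, with the sampled risk sets in place of the full risk sets.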
Martingale residuals and goodness-of-fit tests for sampled risk set data
Introduce the counting processes
$$N_{(i,r)}(t) = \sum_{j} I\{t_j \le t,\; (i_j, \tilde R_j) = (i, r)\}$$
Intensity processes take the form:
$$\lambda_{(i,r)}(t) = \lambda_i(t)\,\pi_t(r \mid i) = Y_i(t)\,\psi_i\,\alpha_0(t)\,\pi_t(r \mid i)$$
Corresponding martingales:
$$M_{(i,r)}(t) = N_{(i,r)}(t) - \int_0^t \lambda_{(i,r)}(u)\,du$$
Martingale residual processes:
$$\hat M_{(i,r)}(t) = N_{(i,r)}(t) - \int_0^t \hat\lambda_{(i,r)}(u)\,du$$
The $\hat M_{(i,r)}(t)$ are of little practical use on their own, but they may be aggregated over groups of individuals to produce useful plots
For group g
$$M_g^*(t) = \sum_{i \in g} \sum_r \hat M_{(i,r)}(t) = \sum_{i \in g} N_i(t) - \sum_{t_j \le t} \frac{\sum_{l \in \tilde R_j \cap g} \hat\psi_l\,\pi_{t_j}(\tilde R_j \mid l)}{\sum_{l \in \tilde R_j} \hat\psi_l\,\pi_{t_j}(\tilde R_j \mid l)}$$
May be interpreted as "observed − expected" number of failures in group g
Asymptotic distribution may be derived using counting process methods
Simplifies for classical nested case-control
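For the classical nested case-control design the sampling probabilities cancel, so the expected increment at each failure time is the share of the fitted relative risk in the sampled risk set that belongs to group g. The sketch below computes the resulting "observed − expected" path. It assumes time-fixed fitted relative risks and a time-fixed grouping, and the function name and toy data are hypothetical.

```python
def grouped_residual_process(failures, psi_hat, group):
    """Return [(t_j, observed - expected up to t_j)] for group g.

    failures: list of (t_j, case, sampled_risk_set); the set includes the case
    psi_hat:  dict mapping individual -> fitted relative risk
    group:    set of individuals belonging to group g
    """
    observed = 0.0
    expected = 0.0
    path = []
    for t, case, risk_set in sorted(failures, key=lambda f: f[0]):
        if case in group:
            observed += 1.0
        total = sum(psi_hat[l] for l in risk_set)
        in_group = sum(psi_hat[l] for l in risk_set if l in group)
        expected += in_group / total  # expected failures in g at time t
        path.append((t, observed - expected))
    return path

# Hypothetical toy data: two failures, one control per case.
failures = [(1.0, "a", ["a", "b"]), (2.0, "c", ["c", "d"])]
psi_hat = {"a": 2.0, "b": 1.0, "c": 1.0, "d": 1.0}
print(grouped_residual_process(failures, psi_hat, group={"a", "d"}))
```

When the groups partition the individuals, the grouped processes sum to zero at every failure time, since the expected increments across groups add up to one per failure.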
Illustration: uranium miners cohort
Fit excess relative risk model:
$$\psi_i = (1 + \beta_1 x_{i1})(1 + \beta_2 x_{i2})$$
$x_{i1}$ = cumulative radon (100 WLMs)
$x_{i2}$ = cumulative smoking (1000 packs)
For classical nested case-control with three controls per case:
$\hat\beta_1 = 0.556$ (0.215) per 100 WLMs
$\hat\beta_2 = 0.276$ (0.093) per 1000 packs
Aggregate martingale residual processes in three groups according to cumulative radon exposure:
Groups:
• I: < 500 WLMs
• II: 500-1500 WLMs
• III: > 1500 WLMs
There are indications of an interaction between cumulative radon exposure and age
Observed and expected number of failures in the groups for ages below and above 60 years:

| Age and group | Observed | Expected |
|---|---|---|
| Below 60 years & group I | 30 | 30.7 |
| Below 60 years & group II | 39 | 45.9 |
| Below 60 years & group III | 81 | 73.4 |
| Above 60 years & group I | 27 | 27.7 |
| Above 60 years & group II | 45 | 36.1 |
| Above 60 years & group III | 36 | 44.2 |

Chi-squared statistic with 2(3 – 1) = 4 df takes the value 10.5 (P-value 3.2%)
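The numbers in the table can be checked with a few lines of arithmetic. The sketch below computes observed minus expected for each cell and converts the reported statistic into a P-value using the closed-form tail of the chi-squared distribution with 4 df. The statistic itself is taken from the slide, not recomputed, and the variable and function names are hypothetical.

```python
import math

observed = {"below 60": [30, 39, 81], "above 60": [27, 45, 36]}
expected = {"below 60": [30.7, 45.9, 73.4], "above 60": [27.7, 36.1, 44.2]}

# Observed minus expected for groups I, II, III within each age band.
differences = {
    band: [o - e for o, e in zip(observed[band], expected[band])]
    for band in observed
}
for band, diffs in differences.items():
    print(band, [round(d, 1) for d in diffs], "sum:", round(abs(sum(diffs)), 1))

def chi2_sf_4df(x):
    """Upper tail probability of a chi-squared variable with 4 degrees of freedom."""
    return math.exp(-x / 2.0) * (1.0 + x / 2.0)

p_value = chi2_sf_4df(10.5)
print(f"P-value: {100 * p_value:.1f}%")
```

Within each age band the differences sum to zero, which is consistent with only 3 – 1 free contrasts per band and hence 2(3 – 1) = 4 df. The tail probability at 10.5 comes out at about 3.3%, in line with the reported 3.2% if the statistic was rounded to 10.5 for display.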
Concluding remarks
The counting process formulation of nested case-control studies:
• Introduces a time aspect that is usually disregarded for sampled risk set data
• Gives a similar model formulation as for cohort data and thereby opens the way to similar methodological developments as for cohort studies
• Grouped martingale residual processes are one example of this. They make it possible to check for time-dependent effects and other deviations from the model
Questions and further developments of grouped martingale residual plots and related goodness-of-fit methods
• How should the grouping be performed?
• How do specific deviations from the model turn up in the plots?
• Kolmogorov-Smirnov and Cramér-von Mises type tests? (Durbin's approximation, Lin et al.'s simulation trick)