Page 1: An introduction to frailty models - Weill Cornell Medicinehpr.weill.cornell.edu/divisions/biostatistics/pdf/Zhou-Research... · An introduction to frailty models ... Statistical programs

An introduction to frailty models

Methodology considerations for building prognostic models for esophageal cancer using SEER-Medicare database

Xi Kathy Zhou, Ph.D.
Division of statistics and epidemiology
Department of public health
Weill Cornell Medical College

Outline

• Background about esophageal cancer
  • Need for a prognostic model
  • Need for SEER-Medicare data
• Motivation through SEER-Medicare data
  • Need to deal with clustered survival data
• A brief review of survival data analysis
• Methods for clustered survival data, with a focus on frailty models
• Statistical simulation to illustrate the usefulness of the frailty models

Motivation and Background

Esophageal Cancer

• 8th most common cancer and 6th leading cause of cancer death worldwide.
• Incidence of esophageal adenocarcinoma is increasing at a 2% annual rate in western countries.
• Poor prognosis in general: ~10% 5-year survival.
• Prognostic factors: stage, age, gender, number of positive lymph nodes

Controversies in treating esophageal cancer

• Surgery is the only curative approach
  • high risk
  • great variation (esp. in extensiveness)
• Regionalization of treatment, based on the strong relationship between volume and short-term outcome

Issues of interest

• Long-term implication of the surgical treatment: the prognostic role of the number of dissected lymph nodes
• Long-term implication of the volume-outcome relationship
• The extent of heterogeneity in long-term outcome that may be attributed to providers
• These questions require multi-center clinical trial data or large observational databases.

SEER-MEDICARE Database

• Has all information available in SEER: demographics, diagnosis, primary treatment, survival
• Adds information from Medicare: comorbidity, treatment sequence, provider information
• Has been used to build a short-term surgical mortality score for patients with esophageal cancer.

Methodological challenge

• Primary outcome: censored survival times.
• Data structure: naturally clustered.
• Covariates: individual level & cluster level
• Consequences of failing to take the clustered data structure into account:
  • Regression parameters may be biased (Hougaard, 2000)
  • Wrong effects assessment (Wintrebert, et al. 2004)
• A model for clustered survival data that allows evaluation of cluster-level covariates is needed

Survival Data Analysis Review

Cornell Esophageal Cancer Data

• Consecutive patients between 1987-2006
• N=264 treated with surgery only
• Median follow-up: 4.1 years
• Overall 5-year survival: 37.9% (95% CI=(31.8%, 45.1%))
• Main objective: comparison of surgical procedures

Notation

• $i = 1, \ldots, N$ subjects
• $T_i$: survival time for subject $i$
• $C_i$: time to censoring for subject $i$
• $X_i = \min(T_i, C_i)$: observed survival time for subject $i$
• $Z_i$: vector of risk factors for subject $i$
  • stage, age, gender, tumor size, tumor differentiation, surgical procedure, number of positive lymph nodes
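The notation above can be illustrated with a minimal sketch that turns a latent survival time and a censoring time into the observed pair. The event indicator `delta` and the function name `observe` are assumptions introduced here for illustration; they do not appear in the slides.

```python
def observe(event_time, censor_time):
    """Return (X, delta) for one subject.

    X = min(T, C) is the observed time; delta = 1 if the event was
    observed (T <= C) and 0 if the subject was censored.
    """
    x = min(event_time, censor_time)
    delta = 1 if event_time <= censor_time else 0
    return x, delta


# Hypothetical values, for illustration only
print(observe(3.2, 5.0))  # event observed before censoring
print(observe(6.1, 4.0))  # censored before the event
```

Only `X` and `delta` (together with the covariates) are available to the analyst; `T` itself is unknown for censored subjects.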

Describe Univariate Survival Time

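• Cumulative distribution function: $F(t) = P(T \le t)$
• Density function: $f(t) = F'(t)$
• Survivor function: $S(t) = 1 - F(t) = P(T > t)$
• Hazard function: $\lambda(t) = f(t)/S(t)$
• These descriptions are equivalent: any one of them determines the others
• Due to censoring, it is common to work with the hazard function $\lambda(t)$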
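The equivalence of these descriptions can be made explicit through the cumulative hazard. The symbol $\Lambda(t)$ is notation introduced here, not taken from the slides:

$$\Lambda(t) = \int_0^t \lambda(u)\,du, \qquad S(t) = \exp\{-\Lambda(t)\}, \qquad f(t) = \lambda(t)\,S(t).$$

For example, a constant hazard $\lambda(t) = \lambda$ gives $S(t) = e^{-\lambda t}$, the exponential distribution.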

Modeling covariate effect

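Proportional hazard model

$$\lambda(t \mid Z) = \lambda_0(t)\exp(\beta^T Z)$$

• $\lambda_0(t)$: baseline hazard
  • $\lambda_0(t) = 1$: exponential PH model
  • $\lambda_0(t) = \alpha t^{\alpha-1}$: Weibull PH model
  • $\lambda_0(t) = \sum_j \alpha_j I(a_{j-1} < t \le a_j)$: piecewise exponential PH model
  • $\lambda_0(t)$ unspecified: Cox PH model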
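As a rough sketch of how the parametric baseline choices plug into the proportional hazard form, the functions below evaluate the hazard for a given covariate vector. All function and argument names are hypothetical, and the cut points and levels in the usage lines are made-up values.

```python
import math


def exponential_baseline(t):
    # Constant baseline hazard, lambda_0(t) = 1
    return 1.0


def weibull_baseline(t, alpha):
    # lambda_0(t) = alpha * t^(alpha - 1)
    return alpha * t ** (alpha - 1)


def piecewise_baseline(t, cuts, levels):
    # cuts = [a_0, a_1, ..., a_J]; levels = [alpha_1, ..., alpha_J]
    # Returns alpha_j for the interval a_{j-1} < t <= a_j.
    for j in range(1, len(cuts)):
        if cuts[j - 1] < t <= cuts[j]:
            return levels[j - 1]
    raise ValueError("t lies outside the supplied cut points")


def ph_hazard(t, z, beta, baseline):
    # lambda(t | Z) = lambda_0(t) * exp(beta^T Z)
    linear_predictor = sum(b * x for b, x in zip(beta, z))
    return baseline(t) * math.exp(linear_predictor)


# Hypothetical usage: the hazard ratio between Z = 1 and Z = 0 is exp(beta)
ratio = (ph_hazard(2.0, [1.0], [0.5], exponential_baseline)
         / ph_hazard(2.0, [0.0], [0.5], exponential_baseline))
print(ratio)
```

Whatever baseline is chosen, the ratio of hazards for two covariate values does not depend on `t`, which is the proportionality assumption.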

Parameter Estimation

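• Partial likelihood:

$$L(\beta) = \prod_{i \in \text{Cases}} \frac{\exp(\beta^T Z_i)}{\sum_{j \in R_i} \exp(\beta^T Z_j)}$$

  where $R_i$ is the set of subjects still at risk at the event time of case $i$.
• Widely available in statistical packages: SAS, R/Splus, Stata, etc.
• Maximum likelihood approach: generalized linear regression for Poisson data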
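The partial likelihood can be sketched directly from its definition. This is a minimal, unoptimized illustration that assumes no tied event times; the function name and argument layout are assumptions made here, not part of any package mentioned in the slides.

```python
import math


def cox_log_partial_likelihood(beta, times, events, covariates):
    """Log partial likelihood of a Cox PH model (no tied event times assumed).

    times[i]      observed time X_i
    events[i]     1 if subject i is a case (event observed), 0 if censored
    covariates[i] covariate vector Z_i
    """
    n = len(times)
    eta = [sum(b * z for b, z in zip(beta, covariates[i])) for i in range(n)]
    loglik = 0.0
    for i in range(n):
        if events[i] == 1:
            # Risk set: everyone whose observed time is at least X_i
            denom = sum(math.exp(eta[j]) for j in range(n) if times[j] >= times[i])
            loglik += eta[i] - math.log(denom)
    return loglik


# Hypothetical toy data: three subjects, one binary covariate
print(cox_log_partial_likelihood([0.0], [1.0, 2.0, 3.0], [1, 1, 1],
                                 [[0.0], [1.0], [0.0]]))
```

In practice the estimate of $\beta$ is obtained by maximizing this function numerically, which is what the packages listed above do internally with more efficient algorithms.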

Estimating Model Parameters: Bayesian Approach

• Utilizes the full likelihood
• Needs prior specification for model parameters
• Semiparametric models require specifying a prior random process to generate a distribution function for survival times
• MCMC-based methods for generating samples from the joint posterior distribution
• BUGS can be used

Cornell Data: Predictors

• Stage (invasiveness): 1 (28.4%); 2 (18.2%); 3 (50.8%); 4 (2.6%)
• Age: median 64 (range 29-88)
• Gender: male (78.4%)
• Procedure: non-invasive (34.8%); 2-field (30.7%); 3-field (34.5%)
• Tumor size (cm): median 3.5 (range 0-12.5)
• No. of positive lymph nodes: 0 (40.9%), 1-5 (36.0%), 6-34 (23.1%)
• Tumor differentiation: well (10.2%), moderate (42.4%), poor (44.3%)

Cornell Data Analysis Result

Variable                   Hazard Ratio          p-value
Tumor stage
  I                        1
  II                       2.65 (1.42, 4.96)     0.002
  III                      3.49 (1.97, 6.17)     <0.001
  IV                       5.22 (1.89, 14.43)    0.002
Tumor differentiation
  Poor                     1
  Moderate                 1.04 (0.73, 1.46)     0.84
  Good                     0.53 (0.27, 1.05)     0.07
Gender
  F                        1
  M                        1.64 (1.07, 2.51)     0.02
Surgical procedure
  Non-radical              1
  Radical 2-field          0.53 (0.34, 0.81)     0.004
  Radical 3-field          0.44 (0.29, 0.67)     <0.001
Age at surgery             1.02 (1.01, 1.04)     0.01
Tumor size                 1.04 (0.96, 1.13)     0.35
No. of positive LN         1.04 (1.01, 1.08)     0.005
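To connect the table to the model, a hazard ratio is $\exp(\hat\beta)$ for the corresponding coefficient. The sketch below assumes the parenthesised intervals are 95% confidence intervals built with a normal approximation on the log scale (the slide does not state this); the helper names are hypothetical.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    # Back out the standard error of log(HR) from a CI that is
    # assumed symmetric on the log scale.
    return (math.log(upper) - math.log(lower)) / (2 * z)


def hr_with_ci(log_hr, se, z=1.96):
    # Hazard ratio and its interval from a coefficient and standard error
    return (math.exp(log_hr),
            math.exp(log_hr - z * se),
            math.exp(log_hr + z * se))


# Stage II vs stage I row of the table: HR 2.65 (1.42, 4.96)
se = se_from_ci(1.42, 4.96)
hr, lo, hi = hr_with_ci(math.log(2.65), se)
print(round(hr, 2), round(lo, 2), round(hi, 2))
```

Under that assumption the reconstructed interval should closely reproduce the reported one, which is a quick consistency check on a published table.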

Summary

• Extensiveness of surgery has strong implications for survival
  • More extensive surgery, better survival
• Difficult to generalize the results (single institution study, 4 surgeons)

Methods for Clustered Survival Data

Types of Clustered Survival Data

• Multiple event time data
  • Each subject experiences multiple failures, e.g. recurrent events in bladder cancer
  • Dependence structure is induced serially
• Individuals under study are naturally clustered
  • Paired data, family data
  • Dependence structure is induced in a parallel manner

SEER-Medicare Data on Esophageal Cancer

• Right-censored time-to-event data
• Differences in patient selection, treatment, and post-treatment care may induce heterogeneity in outcome across providers
• Cluster size varies greatly across providers
• Includes both individual-level and cluster-level covariates
• Observational database: nonrandom treatment assignment; the treatment effect may differ across providers

Notation

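• $k = 1, \ldots, K$ clusters
• $i = 1, \ldots, n_k$ subjects in cluster $k$
• $T_{ik}$: survival time for subject $i$ in cluster $k$
• $Z_{ik}$: individual-level covariates
• $V_k$: cluster-level covariates
• Key feature in the data: $(T_{1k}, \ldots, T_{n_k k})$ are dependent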

Problems to be dealt with

• How to model the association structure
  • For accurate covariate estimation
  • To assess the amount of heterogeneity
• How to make inference
  • Possibly complicated semiparametric estimation
• How to assess model fit
  • Model selection and diagnosis

Modeling the within-cluster dependence

Under proportional hazard framework

Within-cluster dependence

• Not directly observed
• Shared by subjects within the cluster
  • Could be described by a latent cluster-level covariate $\xi_k$
• Different across clusters
  • Hazard function should differ between clusters

Fixed Effects Approach

$$\lambda(t \mid Z_{ik}, \xi_k) = \lambda_0(t)\exp(\beta^T Z_{ik} + \xi_k)$$

• Subject-specific regression
• The latent covariates $\xi_k$ are treated as fixed effects
• Affect the hazard proportionally
• Conceptually easy calculation, with available methods
• Requires large cluster sizes for good estimation
• Large number of parameters; may encounter computational problems

Stratified PH Model

$$\lambda(t \mid Z_{ik}) = \lambda_{0k}(t)\exp(\beta^T Z_{ik})$$

• Subject-specific regression
• Allows a different baseline hazard for each cluster (no proportionality assumed)
• No distribution assumed for the difference in baseline hazard
• Easy calculation
• Parameter estimation is based on intra-cluster comparisons
• Cannot estimate effects of cluster-level covariates (no variation within strata)
• Cannot assess the amount of heterogeneity across clusters

Frailty PH Model

$$\lambda(t \mid Z_{ik}, \xi_k) = \xi_k\,\lambda_0(t)\exp(\beta^T Z_{ik})$$

• Subject-specific regression
• The latent covariate $\xi_k$ is treated as a random effect
• Could be generated from various distributions
• Affects the hazard proportionally
• Much smaller number of parameters
• Allows for modeling cluster-level covariates
• Allows for estimating between-cluster heterogeneity
• Can be very flexible
• Computationally can be challenging
• Still under active development

Conditional Models

• Fixed effects approach: using indicator variables for each cluster
• Stratified approach, where each center is treated as a stratum
• Frailty models (random effects models for survival outcome)

Marginal Model (Wei, Lin and Weissfeld 1989, Cai 1992)

• Population average model
• Correlation structure completely unspecified
• Within-cluster correlation is adjusted for through the robust sandwich-type variance estimator
• No estimate for heterogeneity
• Covariate estimate interpretation: average difference between any two individuals across clusters
  • Useful for assessing the significance of a particular covariate
  • Not very useful for prediction or for assessing cluster effects
• Statistical programs available in R/Splus, SAS, Stata
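For orientation, the robust sandwich-type estimator mentioned above is usually written in the following general form. The symbols $\hat I$, $\hat U_{ik}$ and $\hat V_R$ are notation introduced here, and this is a sketch of the standard construction rather than a formula taken from the slides:

$$\hat V_R = \hat I(\hat\beta)^{-1}\left(\sum_{k=1}^{K} \hat U_k \hat U_k^T\right)\hat I(\hat\beta)^{-1}, \qquad \hat U_k = \sum_{i=1}^{n_k} \hat U_{ik},$$

where $\hat I(\hat\beta)$ is the observed information from the partial likelihood fitted as if all subjects were independent, and $\hat U_{ik}$ is the score residual of subject $i$ in cluster $k$. Summing the residuals within each cluster before taking outer products is what lets the variance reflect within-cluster correlation without modeling it.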

Frailty PH Models

Early developments of frailty models

• Clayton (1978)
  • Bivariate data without covariates
  • Gamma frailty model
• Vaupel et al. (1979)
  • First used the term “frailty”
  • Gamma frailty
  • Accounts for unobserved population heterogeneity
• Hougaard (1986)
  • Positive stable frailty model
• Oakes (1989)
  • Other choices of frailty distributions

Understanding the frailties

$$\lambda(t \mid Z_{ik}, \xi_k) = \xi_k\,\lambda_0(t)\exp(\beta^T Z_{ik})$$

Frailties ($\xi_k$):
• Unobserved, latent
• Independent of the measured covariates
• Affect the average baseline risk multiplicatively
• Specify the heterogeneity of the population
  • Independent across clusters
  • Shared by members within a cluster
• Distributional choices were mainly based on computational convenience
• Frailties are scaled to have mean 1, for identifiability

Heterogeneity & Dependence

• Frailty variation characterizes the heterogeneity across clusters
• Frailty induces dependence within a cluster
• Measures for quantifying the unconditional dependence in survival times:
  • Kendall’s tau (a global measure of dependence)
  • Cross ratio function by Clayton (1978) & Oakes (1989) (a local measure of dependence)
• Dependence differs under different frailty distribution specifications
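For reference, the two dependence measures are commonly defined as follows for a pair $(T_1, T_2)$ from the same cluster with joint survivor function $S(t_1, t_2)$. The notation $(T_1', T_2')$ for an independent copy of the pair and $\vartheta(t_1, t_2)$ for the cross ratio is introduced here, not taken from the slides:

$$\tau = E\left[\operatorname{sign}\{(T_1 - T_1')(T_2 - T_2')\}\right],$$

$$\vartheta(t_1, t_2) = \frac{S(t_1, t_2)\,\dfrac{\partial^2 S(t_1, t_2)}{\partial t_1\,\partial t_2}}{\dfrac{\partial S(t_1, t_2)}{\partial t_1}\,\dfrac{\partial S(t_1, t_2)}{\partial t_2}} = \frac{\lambda(t_1 \mid T_2 = t_2)}{\lambda(t_1 \mid T_2 > t_2)}.$$

A cross ratio of 1 at $(t_1, t_2)$ corresponds to local independence; values above 1 mean that a failure of one member at $t_2$ raises the hazard of the other.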

Commonly used frailty models

Gamma frailty model

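$$\xi_k \sim G(\theta^{-1}, \theta^{-1})$$

• Mean($\xi_k$) = 1, Var($\xi_k$) = $\theta$
• Intraclass dependence (Kendall’s $\tau$): $\tau = \theta/(2+\theta)$
• Cross ratio: $1 + \theta$
• Larger $\theta$, stronger dependence
• The marginal hazard ratio due to a covariate becomes time-dependent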
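The time dependence of the marginal hazard ratio can be seen by integrating the frailty out. Writing $\Lambda_0(t) = \int_0^t \lambda_0(u)\,du$ (notation introduced here) and using the Laplace transform of the gamma distribution, a short derivation gives

$$S(t \mid Z) = E\left[\exp\{-\xi_k \Lambda_0(t) e^{\beta^T Z}\}\right] = \left\{1 + \theta \Lambda_0(t) e^{\beta^T Z}\right\}^{-1/\theta},$$

$$\lambda(t \mid Z) = \frac{\lambda_0(t)\, e^{\beta^T Z}}{1 + \theta \Lambda_0(t)\, e^{\beta^T Z}}.$$

For a single binary covariate, the marginal hazard ratio is therefore $e^{\beta}\{1 + \theta\Lambda_0(t)\}/\{1 + \theta\Lambda_0(t)e^{\beta}\}$, which equals $e^{\beta}$ at $t = 0$ and shrinks toward 1 as $\Lambda_0(t)$ grows.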

Positive Stable Frailty (Hougaard, 1986)

$$\xi_k \sim S_\alpha(\sigma = 1, \beta = 1, \delta = 0)$$

• Stable distribution:
  • $\alpha$: stability parameter in (0, 2]; degree of peakedness and heaviness of tail
  • $\sigma$: scale parameter
  • $\beta$: skewness parameter in [-1, 1] (not the regression coefficient)
  • $\delta$: location parameter
• Positive stable: $\alpha$ in (0, 1), $\beta = 1$, $\delta = 0$
  • $\sigma = 1$ for identifiability
• No closed-form density function
• Characterized by its Laplace transform

Positive Stable Frailty Model

• Mean($\xi_k$) and Var($\xi_k$) are both infinite for $\alpha$ in (0, 1)
• Kendall’s $\tau = 1 - \alpha$
• Cross ratio:
  • complex, time-varying
  • converges to 1 as time increases
• The only frailty model that maintains the marginal proportionality of the covariate effects
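The marginal proportionality follows from the Laplace transform of the positive stable distribution, $E[e^{-s\xi_k}] = \exp(-s^{\alpha})$. With $\Lambda_0(t)$ denoting the cumulative baseline hazard (notation introduced here), a brief sketch of the argument is

$$S(t \mid Z) = E\left[\exp\{-\xi_k \Lambda_0(t) e^{\beta^T Z}\}\right] = \exp\left\{-\Lambda_0(t)^{\alpha}\, e^{\alpha \beta^T Z}\right\},$$

$$\lambda(t \mid Z) = \alpha\,\Lambda_0(t)^{\alpha - 1}\lambda_0(t)\; e^{\alpha \beta^T Z}.$$

The marginal hazards are again proportional, but with coefficients $\alpha\beta$ attenuated toward zero relative to the conditional coefficients $\beta$.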

Log Normal Frailty Model

$$\varepsilon_k = \log(\xi_k) \sim N(0, \sigma^2)$$

• Mean($\varepsilon_k$) = 0, Var($\varepsilon_k$) = $\sigma^2$
• No simple explicit form for dependence measures

Extensions of simple frailty models

• Multivariate frailty models
  • Could arise if there is subject-cluster interaction for some covariates
• Time-dependent frailty models
  • Paik: piecewise gamma frailty model
  • Dunson: Bayesian semiparametric dynamic frailty model

Estimation & Inference under frailty models

• More complicated likelihood
  • Partial likelihood approach no longer workable
• Need to deal with the baseline hazard
• Computational difficulty differs across frailty models

EM Algorithm (Nielsen)

• Conceptually simple: view the frailties as missing data
• E-step: estimate the expected frailty for each cluster (simple expression for gamma frailty models)
• M-step: obtain the maximum likelihood estimate for the covariate effects and baseline hazard by treating the frailties as an offset
• Slow convergence, 30-50 steps
• SAS macros for gamma & positive stable frailty models
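The "simple expression" in the E-step for gamma frailties comes from conjugacy: given the data, each frailty again follows a gamma distribution. The sketch below computes the resulting posterior means; the function and argument names are hypothetical, and the per-cluster cumulative hazards are assumed to have been computed already from the current estimates of $\beta$ and the baseline hazard.

```python
def gamma_frailty_e_step(theta, events_by_cluster, cumhaz_by_cluster):
    """Posterior mean of each frailty under a Gamma(1/theta, 1/theta) prior.

    events_by_cluster[k]  D_k: number of observed events in cluster k
    cumhaz_by_cluster[k]  H_k: sum over subjects in cluster k of
                          Lambda_0(X_ik) * exp(beta^T Z_ik)
    Given the data, xi_k ~ Gamma(1/theta + D_k, 1/theta + H_k), so
    E[xi_k | data] = (1/theta + D_k) / (1/theta + H_k).
    """
    prior = 1.0 / theta
    return [(prior + d) / (prior + h)
            for d, h in zip(events_by_cluster, cumhaz_by_cluster)]


# Hypothetical example: two clusters with theta = 2
print(gamma_frailty_e_step(2.0, [3, 0], [1.5, 2.0]))
```

A cluster with more events than its cumulative hazard predicts gets an expected frailty above 1, and one with fewer gets a value below 1. A complete EM implementation would also need the expected log-frailty to update $\theta$, which is omitted here.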

Penalized Partial Likelihood (Therneau & Grambsch, 2003)

• Rewrote the conditional likelihood
• Maximizes: log likelihood - η p(parameters)
  • p() is a penalty function, η is a smoothing parameter
• Fast calculation
• Yields the MLE for gamma frailties
• Available in R/Splus for gamma & log-normal frailty models
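As a sketch of what the penalized criterion looks like, write the frailty as $\xi_k = e^{\omega_k}$ and let $PL(\beta, \omega)$ be the usual log partial likelihood with the $\omega_k$ entering as cluster-specific coefficients. The symbols $\omega_k$ and $g$ are notation introduced here, and the penalty forms below are the ones commonly given for this approach rather than formulas quoted from the slides:

$$PPL(\beta, \omega; \theta) = PL(\beta, \omega) - g(\omega; \theta),$$

$$g(\omega; \theta) = -\frac{1}{\theta}\sum_{k=1}^{K}\left(\omega_k - e^{\omega_k}\right) \ \text{(gamma frailty)}, \qquad g(\omega; \sigma^2) = \frac{1}{2\sigma^2}\sum_{k=1}^{K}\omega_k^2 \ \text{(log-normal frailty)}.$$

In both cases the inverse of the frailty variance plays the role of the smoothing parameter η: a small variance penalizes the $\omega_k$ heavily and shrinks the cluster effects toward zero.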

Bayesian Methods

• Works with the joint posterior density of the parameters given the data
• Need to specify a prior process for the baseline in the semiparametric setting (gamma process, beta process, Dirichlet process)
• Need to specify prior distributions for parameters or hyperparameters
• MCMC/Gibbs sampling approach used to update the sample
• Can be readily implemented for gamma, log-normal, inverse-Gaussian frailty models in BUGS
• Allows easy extension to some more complicated frailty models

Software availability

• R/Splus: gamma & log-normal frailty models (PPL)
• SAS: gamma & positive stable frailty models (EM algorithm, slower)
  • Score tests of independence of failures
• Stata: gamma, inverse-Gaussian frailty models
• BUGS: gamma, log-normal, inverse-Gaussian, some time-dependent, multivariate, additive frailty models

Illustration / Comparison of methods using simulated data

• Data generated using parameters estimated from the WCMC database
• Number of subjects N=2000
• Exponential PH model
• Censoring rate: 30%
• Gamma frailty:
  • strong dependence: Gamma(0.5, 0.5), Kendall’s τ=0.5
  • weak dependence: Gamma(20, 20), Kendall’s τ=0.02
• Number of clusters K=200
• Cluster size:
  • fixed: 10/cluster
  • varying: small volume providers 130
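The two Kendall’s τ values quoted for the simulation follow from the gamma frailty relationship τ = θ/(2+θ), where θ is the frailty variance. A small check, assuming Gamma(a, b) is parameterized by shape a and rate b (so the variance is a/b²); the function names are hypothetical:

```python
def gamma_variance(shape, rate):
    # Variance of a Gamma(shape, rate) distribution; equals theta when shape == rate
    return shape / rate ** 2


def kendall_tau_gamma(theta):
    # Kendall's tau under a shared gamma frailty with variance theta
    return theta / (2.0 + theta)


strong = kendall_tau_gamma(gamma_variance(0.5, 0.5))
weak = kendall_tau_gamma(gamma_variance(20, 20))
print(strong, round(weak, 2))
```

Gamma(0.5, 0.5) has variance 2 and Gamma(20, 20) has variance 0.05, which is why the first setting represents strong and the second weak within-cluster dependence.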

Simulated data: covariates

• Individual-level covariates:
  • Tumor stage: 1 (25%), 2 (20%), 3 (45%), 4 (10%); HR 1 : 2.5 : 3.5 : 5
  • Gender: F (25%), M (75%); HR 1 : 1.2
  • Number of positive lymph nodes: rounded gamma variable; HR 1.04
• Cluster-level covariate:
  • Fixed cluster size: quality (low vs high: HR 1.10)
  • Varying cluster size: volume (HR 0.98)
• Methods used: no adjustment, marginal, stratified, gamma frailty, log-normal frailty
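A data set of the kind described above could be generated roughly as follows for the fixed-cluster-size design. This is a sketch, not the simulation code behind the slides: the baseline rate, the censoring mechanism and its rate, the gamma parameters for the lymph node count, and the alternating assignment of low-quality clusters are all assumptions made for illustration, and the censoring rate is not tuned to 30%.

```python
import math
import random


def simulate_clustered(K=200, n_per_cluster=10, theta=2.0,
                       base_rate=0.1, censor_rate=0.05, seed=1):
    """Simulate clustered survival data from an exponential PH gamma frailty model.

    Returns rows of (cluster, time, event, stage, male, pos_ln, low_quality).
    base_rate, censor_rate and the lymph-node gamma parameters are hypothetical.
    """
    rng = random.Random(seed)
    log_hr_stage = {1: 0.0, 2: math.log(2.5), 3: math.log(3.5), 4: math.log(5.0)}
    rows = []
    for k in range(K):
        # Shared frailty: shape 1/theta, scale theta gives mean 1, variance theta
        frailty = rng.gammavariate(1.0 / theta, theta)
        low_quality = k % 2  # assumption: half of the clusters are low quality
        for _ in range(n_per_cluster):
            stage = rng.choices([1, 2, 3, 4], weights=[25, 20, 45, 10])[0]
            male = 1 if rng.random() < 0.75 else 0
            pos_ln = round(rng.gammavariate(1.0, 3.0))  # hypothetical parameters
            eta = (log_hr_stage[stage] + math.log(1.2) * male
                   + math.log(1.04) * pos_ln + math.log(1.10) * low_quality)
            # Exponential survival time with hazard frailty * base_rate * exp(eta)
            t = rng.expovariate(frailty * base_rate * math.exp(eta))
            c = rng.expovariate(censor_rate)  # assumed independent censoring
            rows.append((k, min(t, c), int(t <= c), stage, male, pos_ln, low_quality))
    return rows


data = simulate_clustered()
print(len(data), sum(r[2] for r in data) / len(data))
```

The second printed number is the observed event fraction; in a real study `censor_rate` would be adjusted until roughly 70% of subjects have an observed event.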

Simulation result: Individual level covariate estimates (Strong dependence, fixed cluster size)

Bias
  No adjustment      -0.440  -0.859  -1.514  -0.048  -0.010
  Marginal           -0.440  -0.859  -1.514  -0.048  -0.010
  Stratified          0.045   0.050   0.108   0.001   0.000
  Gamma frailty       0.019  -0.001   0.027  -0.001  -0.001
  LogNorm frailty     0.013  -0.014  -0.002  -0.002  -0.001

RMSE
  No adjustment       0.472   0.884   1.564   0.089   0.014
  Marginal            0.472   0.884   1.564   0.089   0.014
  Stratified          0.251   0.324   0.633   0.086   0.011
  Gamma frailty       0.223   0.287   0.585   0.078   0.011
  LogNorm frailty     0.223   0.287   0.580   0.079   0.010

95% CI coverage
  No adjustment       0.35    0.04    0.07    0.86    0.80
  Marginal            0.34    0.05    0.08    0.86    0.81
  Stratified          0.95    0.94    0.96    0.95    0.95
  Gamma frailty       0.95    0.94    0.95    0.95    0.95
  LogNorm frailty     0.96    0.94    0.94    0.95    0.96
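The three performance measures reported in these tables can be computed from replicated simulation fits along the following lines. This is a generic sketch; the function name and inputs are hypothetical, and Wald-type intervals (estimate ± 1.96 standard errors) are assumed for the coverage calculation.

```python
import math


def summarize(estimates, std_errors, truth, z=1.96):
    """Bias, RMSE and CI coverage of one coefficient across simulation replicates."""
    n = len(estimates)
    bias = sum(e - truth for e in estimates) / n
    rmse = math.sqrt(sum((e - truth) ** 2 for e in estimates) / n)
    # Fraction of replicates whose Wald interval contains the true value
    covered = sum(1 for e, s in zip(estimates, std_errors)
                  if e - z * s <= truth <= e + z * s)
    return bias, rmse, covered / n


# Hypothetical replicate results for a coefficient whose true value is 2.0
print(summarize([1.0, 3.0], [1.0, 0.1], 2.0))
```

Coverage well below the nominal 0.95 indicates that the method's intervals are too narrow or centered in the wrong place.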

Simulation result: Individual level covariate estimates (Weak dependence, fixed cluster size)

Bias
  No adjustment      -0.050  -0.117  -0.219   0.002  -0.001
  Marginal           -0.050  -0.117  -0.219   0.002  -0.001
  Stratified          0.022   0.024   0.024   0.010   0.000
  Gamma frailty      -0.001  -0.018  -0.040   0.007   0.000
  LogNorm frailty     0.007  -0.002  -0.010   0.008   0.000

RMSE
  No adjustment       0.223   0.282   0.545   0.077   0.010
  Marginal            0.223   0.282   0.545   0.077   0.010
  Stratified          0.253   0.311   0.609   0.086   0.011
  Gamma frailty       0.227   0.279   0.534   0.079   0.010
  LogNorm frailty     0.228   0.277   0.534   0.079   0.010

95% CI coverage
  No adjustment       0.96    0.93    0.92    0.94    0.95
  Marginal            0.95    0.92    0.91    0.93    0.93
  Stratified          0.96    0.95    0.95    0.96    0.95
  Gamma frailty       0.96    0.95    0.94    0.94    0.95
  LogNorm frailty     0.95    0.95    0.95    0.95    0.95

Simulation result: Individual level covariate estimates (Strong dependence, varying cluster size)

Bias
  No adjustment      -0.910  -1.651  -2.807  -0.107  -0.021
  Marginal           -0.910  -1.651  -2.870  -0.107  -0.021
  Stratified          0.036   0.019   0.087  -0.001   0.000
  Gamma frailty       0.017  -0.015   0.011  -0.003   0.000
  LogNorm frailty     0.006  -0.036  -0.033  -0.004   0.000

RMSE
  No adjustment       0.921   1.656   2.819   0.130   0.023
  Marginal            0.921   1.656   2.819   0.130   0.023
  Stratified          0.294   0.339   0.716   0.087   0.012
  Gamma frailty       0.263   0.307   0.642   0.079   0.012
  LogNorm frailty     0.267   0.323   0.657   0.080   0.012

95% CI coverage
  No adjustment       0.004   0.000   0.000   0.682   0.468
  Marginal            0.004   0.000   0.000   0.680   0.482
  Stratified          0.940   0.948   0.926   0.956   0.936
  Gamma frailty       0.928   0.932   0.906   0.962   0.940
  LogNorm frailty     0.930   0.928   0.906   0.962   0.934

Simulation result: Individual level covariate estimates (Weak dependence, varying cluster size)

Bias
  No adjustment      -0.393  -0.752  -1.373  -0.046  -0.008
  Marginal           -0.393  -0.752  -1.373  -0.046  -0.008
  Stratified          0.022   0.032   0.078  -0.001   0.001
  Gamma frailty      -0.001  -0.008  -0.009  -0.002   0.001
  LogNorm frailty    -0.007  -0.022  -0.039  -0.003   0.000

RMSE
  No adjustment       0.442   0.785   1.434   0.092   0.013
  Marginal            0.442   0.785   1.434   0.092   0.013
  Stratified          0.267   0.325   0.669   0.092   0.012
  Gamma frailty       0.246   0.292   0.594   0.086   0.011
  LogNorm frailty     0.246   0.292   0.591   0.086   0.011

95% CI coverage
  No adjustment       0.562   0.168   0.140   0.890   0.880
  Marginal            0.560   0.160   0.144   0.892   0.874
  Stratified          0.966   0.960   0.940   0.956   0.934
  Gamma frailty       0.950   0.952   0.938   0.942   0.934
  LogNorm frailty     0.948   0.950   0.940   0.946   0.934

Simulation result: cluster level covariate estimates

                     No adjustment   Marginal   Stratified   Gamma frailty   LogNorm frailty
Bias
  Strong.fixed       -0.047          -0.047     NA            0.013           0.026
  Weak.fixed          0.002           0.002     NA            0.005           0.005
  Strong.vary         0.010           0.010     NA            0.000          -0.007
  Weak.vary           0.000           0.000     NA            0.000           0.000
RMSE
  Strong.fixed        0.153           0.153     NA            0.228           0.287
  Weak.fixed          0.070           0.070     NA            0.072           0.072
  Strong.vary         0.014           0.014     NA            0.011           0.015
  Weak.vary           0.003           0.003     NA            0.003           0.003
95% CI coverage
  Strong.fixed        0.52            0.93      NA            0.97            0.97
  Weak.fixed          0.90            0.94      NA            0.93            0.94
  Strong.vary         0.18            0.70      NA            0.92            0.88
  Weak.vary           0.90            0.93      NA            0.96            0.97

Simulation summary: individual level covariates

• The marginal model provides almost the same estimates as the model without any adjustment
  • Estimated effects are attenuated in marginal models (frailty selection)
  • Attenuation is weaker when dependence is weaker
• The stratified approach performs well when the cluster size is large
• Frailty models offer the best estimates
  • Misspecifying the frailty distribution has a small effect on estimates
• Overall, balanced cluster size retains more information and gives better estimates
• Frailty models perform well in general, even when there is no intra-class correlation or the frailty distribution is misspecified

Simulation summary: cluster level covariates

• Models without adjustment for clustering have poorer 95% CI coverage (prone to spurious findings)
• When there is strong dependence, marginal models perform slightly better than frailty models in terms of bias and RMSE, with comparable coverage
• Frailty models perform better when the cluster size varies
• When there is weak dependence, frailty models perform comparably to the marginal model

Future research

• Application to SEER-Medicare data
  • Data could be more complicated
  • Standard PH frailty models, additive frailty models, time-dependent frailty models, multivariate frailty models
  • Model diagnosis, model selection, model validation

Conclusions

• Frailty models would be a useful tool for modeling survival data from the SEER-Medicare database
• The gamma frailty model is well developed, tested and widely available
• Development is ongoing for other frailty models, their extensions, model diagnosis and model selection
  • Involves complex computation
• BUGS offers the greatest flexibility without the need to write one's own algorithm

Thank you!
