Cervical Cancer Case Study Presented by: University of Guelph Baktiar Hasan Mark Kane Melanie Laframboise Michael Maschio Andy Quigley

Upload: james-codd

Post on 31-Mar-2015



Cervical Cancer Case Study

Presented by:

University of Guelph

Baktiar Hasan

Mark Kane

Melanie Laframboise

Michael Maschio

Andy Quigley


Objectives

• To determine an appropriate model for the prediction of recurrence of cervical cancer

• To classify future patients on their risk of recurrence of cervical cancer


Cervical Cancer Data Set

The original data set included 905 cases

Patients were removed from the data set if ANY of the following applied:

• They were NOT free of the disease after surgery

• They had NO follow-up date

• They had ZERO survival time

845 cases remain
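The exclusion rules above amount to a simple filter. The following is a minimal sketch, assuming each patient is stored as a dictionary; the field names `disease_free`, `followup_date`, and `survival_days` are hypothetical and are not taken from the original data set, and the records are toy values.

```python
def keep_patient(patient):
    """Return True if the patient passes all three exclusion rules."""
    if not patient["disease_free"]:        # not free of disease after surgery
        return False
    if patient["followup_date"] is None:   # no follow-up date recorded
        return False
    if patient["survival_days"] <= 0:      # zero survival time
        return False
    return True


# Toy records for illustration only (not the study data).
patients = [
    {"disease_free": True, "followup_date": "1995-06-01", "survival_days": 820},
    {"disease_free": False, "followup_date": "1994-02-11", "survival_days": 300},
    {"disease_free": True, "followup_date": None, "survival_days": 150},
    {"disease_free": True, "followup_date": "1996-09-30", "survival_days": 0},
]

retained = [p for p in patients if keep_patient(p)]
print(len(retained), "of", len(patients), "toy records retained")
```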


Modeling Methods

• Mixture Model with Accelerated Failure Time
  – Peng, Dear, and Denham (1998)

• Cox Proportional Hazards Model

• Latent Variable Model

• Bayesian Survival Analysis
  – Seltman, Greenhouse, and Wasserman (2001)
  – Chen, Ibrahim, and Sinha (1999)


Mixture model

• The model we chose for modeling time to recurrence is a mixture model of the form:

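$S(t) = p\,S_u(t) + (1 - p)$

$F(t) = p\,F_u(t)$

Here $p$ is the probability that a patient is not cured, $S_u(t)$ and $F_u(t) = 1 - S_u(t)$ are the survival and distribution functions of the time to recurrence for uncured patients, and $1 - p$ is the cure rate.

Benefits:

• Allows for a cure rate

• Covariates can be incorporated into the survival time [$S_u(t)$] AND/OR the cure rate [$1 - p$]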
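The mixture form can be sketched numerically. The example below is only an illustration: it assumes a Weibull distribution for the uncured patients' time to recurrence, and the values of `p`, `shape`, and `scale` are made up rather than estimated from the data.

```python
import math


def weibull_survival(t, shape, scale):
    """S_u(t) for uncured patients under a Weibull distribution."""
    return math.exp(-((t / scale) ** shape))


def mixture_survival(t, p, shape, scale):
    """S(t) = p * S_u(t) + (1 - p), where 1 - p is the cure rate."""
    return p * weibull_survival(t, shape, scale) + (1.0 - p)


def mixture_cdf(t, p, shape, scale):
    """F(t) = p * F_u(t), the probability of recurrence by time t."""
    return p * (1.0 - weibull_survival(t, shape, scale))


# Illustrative parameter values only (not fitted values).
p, shape, scale = 0.2, 1.5, 1000.0
for days in (0, 600, 1200, 3600):
    print(days, round(mixture_survival(days, p, shape, scale), 4))
```

As t grows, S(t) levels off at the cure rate 1 - p instead of falling to zero, which is the feature that separates this model from an ordinary survival model.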


Mixture Model (Con’t)

• The model can be fit using an S-PLUS library (GFCURE) written by Peng.

• Further details about the library and the model can be found in Peng et al. (1998) and Maller and Zhou (1996).

• We found an error in the library's pred.gfcure function: in some situations it can cause the program to crash or produce incorrect predicted values.


“Immunes” and Sufficient Follow up

• Maller and Zhou (1996) suggest tests to examine the hypotheses of:
  – Presence of “immunes” in the data set
  – Sufficient follow-up time

• In this data set, it was found that immunes were present, and there was no strong evidence that follow-up time was insufficient


Missing Covariates

• A large proportion of the cases (≈40%) had at least one covariate with a missing value

• Various methods to handle this situation include:
  – Ignoring cases with missing covariate data
  – Maximum likelihood methods, Chen and Ibrahim (2001)


Missing Covariates (Con’t)

• We chose to perform variable selection on only the cases that contain no missing covariates (n=534).

• Does this introduce BIAS?

• CHECK: compare the distributions of the covariates in the “full” and “reduced” data sets

• NO significant bias was found


Distribution

• A variety of distributions were considered for modeling recurrence time, including the Weibull, gamma, lognormal, log-logistic, extended generalized gamma, and generalized F.

• Comparing these models by AIC showed little improvement from fitting a 3- or 4-parameter distribution rather than a 2-parameter distribution.

• Of the 2-parameter distributions considered, the Weibull emerged as the best in terms of likelihood and prediction of the cure rate.
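The AIC comparison can be sketched as follows, using the standard definition AIC = 2k - 2 log L, where k is the number of estimated parameters. The log-likelihood values below are invented purely to show the mechanics; the slides do not report the actual values.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: smaller values indicate a better trade-off."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical (log-likelihood, number of parameters) pairs, for illustration only.
candidate_fits = {
    "Weibull": (-1000.0, 2),
    "extended generalized gamma": (-999.8, 3),
    "generalized F": (-999.5, 4),
}

scores = {name: aic(ll, k) for name, (ll, k) in candidate_fits.items()}
best = min(scores, key=scores.get)
print(scores, "->", best)
```

With these made-up numbers, the small gain in log-likelihood from the extra parameters does not offset the penalty of 2 per parameter, so the 2-parameter model has the lowest AIC.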


Variable Selection

• Stepwise variable selection was performed using the 534 patients previously mentioned; AIC was used as the entry criterion.

• Variables were allowed to enter both the cure rate portion of the model and the survival time portion of the model.

• The final model uses pelvic lymph node involvement (PELLYMPH) and size of tumor (SIZE) to model the survival time of uncured patients, and Capillary Lymphatic Spaces (CLS) and depth of tumor (MAXDEPTH) to predict the cure rate.
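One way to picture the selection procedure is a forward-only search in which each candidate term is a (model part, variable) pair and a term enters only if it lowers the AIC. This is a simplified sketch, not the procedure actually run in S-PLUS: it omits any removal step, the `aic_of` callable stands in for refitting the mixture model, and the gains and the AGE variable in the toy example are hypothetical.

```python
def forward_select(candidates, aic_of):
    """Greedy forward selection: add the term that lowers AIC most, until none does.

    candidates: list of (part, variable) pairs, part being "cure" or "time".
    aic_of: callable returning the AIC of a model containing the given terms.
    """
    selected = []
    best_aic = aic_of(selected)
    while True:
        trials = [(aic_of(selected + [c]), c) for c in candidates if c not in selected]
        if not trials:
            break
        trial_aic, term = min(trials)
        if trial_aic >= best_aic:
            break  # no remaining term improves the criterion
        selected.append(term)
        best_aic = trial_aic
    return selected, best_aic


# Toy stand-in for model fitting: made-up reductions in -2 log L per term.
toy_gain = {("cure", "CLS"): 10.0, ("time", "SIZE"): 6.0, ("time", "AGE"): 1.0}


def toy_aic(terms):
    return 100.0 - sum(toy_gain[t] for t in terms) + 2 * len(terms)


chosen, final_aic = forward_select(list(toy_gain), toy_aic)
print(chosen, final_aic)
```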


Variable Selection (Con’t)

• CLS was modeled as a continuous variable rather than a discrete one because twice the difference in log-likelihoods between the two codings is only 0.017, so the discrete coding offers essentially no improvement in fit for its extra parameters.

• Interactions among the significant covariates in the chosen model were also considered, but were found to be non-significant.
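To see how small the statistic of 0.017 for the CLS coding is, it can be referred to a chi-square distribution. The sketch below assumes 1 degree of freedom, which is an assumption made here (the slides do not state how many levels CLS has); with more degrees of freedom the p-value would be larger still.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom.

    Uses P(chi2_1 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2.0))


lr_statistic = 0.017  # twice the difference in log-likelihoods, from the slide
p_value = chi2_sf_1df(lr_statistic)  # assumes 1 degree of freedom
print(round(p_value, 3))
```

A p-value of roughly 0.9 gives no reason to prefer the discrete coding over the simpler continuous one.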


Chosen Model

Variable     Coefficient   S.E.     p-value

Terms in the accelerated failure time model

PELLYMPH     -1.0727       0.3676   0.0035

SIZE         -0.0578       0.0111   <0.0001

Terms in the logistic model

CLS           0.9203       0.2988   0.0021

MAXDEPTH      0.0561       0.0206   0.0081


Interpretation of the Model

• The negative coefficient of PELLYMPH indicates that uncured patients who are positive for pelvic lymph node involvement will have a shorter time to recurrence than uncured patients who are negative for it.

• The coefficient of SIZE is also negative, which means that, for uncured patients, larger tumor size corresponds to quicker recurrence of cancer.

• The positive coefficient of CLS in the cure rate portion of the model indicates that patients with higher CLS values have a higher probability of recurrence.

• The coefficient of MAXDEPTH is also positive, indicating that patients with a greater tumor depth have a higher probability of recurrence.
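The size of these effects can be read off by exponentiating the reported coefficients. This sketch rests on two assumptions that the slides imply but do not state outright: that the accelerated failure time part is linear in log time, so exp(coefficient) is a multiplicative time ratio, and that the logistic part models the log-odds of being uncured, so exp(coefficient) is an odds ratio. The units of SIZE and MAXDEPTH are not given, so the ratios are per unit of each covariate.

```python
import math

# Coefficients as reported in the "Chosen Model" table.
aft_coefficients = {"PELLYMPH": -1.0727, "SIZE": -0.0578}
logistic_coefficients = {"CLS": 0.9203, "MAXDEPTH": 0.0561}

# Assumed interpretation: multiplicative effect on time to recurrence (uncured patients).
time_ratios = {name: math.exp(b) for name, b in aft_coefficients.items()}

# Assumed interpretation: multiplicative effect on the odds of being uncured.
odds_ratios = {name: math.exp(b) for name, b in logistic_coefficients.items()}

for name, ratio in {**time_ratios, **odds_ratios}.items():
    print(f"{name}: {ratio:.3f}")
```

Under these assumptions, positive pelvic lymph node involvement corresponds to a time to recurrence about one third as long (ratio near 0.34) among uncured patients, and each unit increase in CLS multiplies the odds of being uncured by about 2.5.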


Model Validation

• To assess how well the chosen model will predict for future patients, the data were randomly split into two subsets.

• Since it is not known whether a patient who did not relapse was cured or merely censored, it is not possible to compare the predicted probability of recurrence directly with the actual probability of recurrence.

• A graphical method was therefore used to assess how well the predicted probabilities performed.


Model Validation (Con’t)

• The graphical method involved predicting the probability of recurrence before time $t_i$, $F(t_i)$, for a number of chosen times.

• A recurrence indicator, which is 1 if recurrence occurred before time $t_i$ and 0 otherwise, is then plotted against this prediction and smoothed.

• A criticism of this graphical method is that a patient who was censored without recurrence before $t_i$ may still have had a recurrence between the censoring time and $t_i$; such a patient is coded as 0 on the graph but should have been coded as 1.
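The coding used for the graph can be sketched as below. The slides do not say which smoother was used, so this example substitutes a crude one (the average indicator within equal-width bins of the predicted probability), and the `drop_early_censored` option is an addition of this sketch that sets aside patients censored before t_i, the group the criticism concerns. The records are toy values.

```python
def recurrence_indicator(time, recurred, t_i):
    """1 if a recurrence was observed by t_i, otherwise 0 (the coding on the slide)."""
    return 1 if (recurred and time <= t_i) else 0


def binned_calibration(records, t_i, n_bins=5, drop_early_censored=False):
    """Average the indicator within equal-width bins of the predicted F(t_i).

    records: iterable of (time, recurred, predicted_F) tuples.
    Returns (bin_midpoint, observed_fraction, count) for each non-empty bin.
    """
    sums = [0] * n_bins
    counts = [0] * n_bins
    for time, recurred, predicted in records:
        if drop_early_censored and not recurred and time < t_i:
            continue  # recurrence status at t_i is unknown for this patient
        b = min(int(predicted * n_bins), n_bins - 1)
        sums[b] += recurrence_indicator(time, recurred, t_i)
        counts[b] += 1
    return [((b + 0.5) / n_bins, sums[b] / counts[b], counts[b])
            for b in range(n_bins) if counts[b] > 0]


# Toy records: (follow-up time in days, recurrence observed?, predicted F(600)).
records = [(300, True, 0.75), (900, False, 0.05), (400, False, 0.60), (700, False, 0.10)]
print(binned_calibration(records, t_i=600, n_bins=2))
print(binned_calibration(records, t_i=600, n_bins=2, drop_early_censored=True))
```

If the predictions are well calibrated, the observed fraction in each bin should lie close to the bin's predicted probability.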


[Figure: four panels plotting the smoothed recurrence indicator (vertical axis, 0.0 to 1.0) against the predicted F(t) (horizontal axis), at Time = 600, 1200, 2400, and 3600 days.]


Classification

• The second objective is to classify patients into 3 groups: Low relapse, Moderate relapse, and High relapse.

• We classified patients based on their estimated cure rate from the final model previously mentioned.

• Low relapse: estimated cure rate ≥ 94%

• Moderate relapse: 84% < estimated cure rate < 94%

• High relapse: estimated cure rate ≤ 84%
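The cut-offs above translate directly into a small classification rule. This is a sketch with a hypothetical function name; it takes the estimated cure rate as a proportion between 0 and 1.

```python
def classify_relapse_risk(cure_rate):
    """Map an estimated cure rate (as a proportion) to a relapse-risk group."""
    if cure_rate >= 0.94:
        return "Low"
    if cure_rate > 0.84:
        return "Moderate"
    return "High"


for rate in (0.97, 0.94, 0.90, 0.84, 0.60):
    print(rate, classify_relapse_risk(rate))
```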


[Figure: Predicted Cure Rate vs. Event. Recurrence (vertical axis, 0.0 to 1.0) plotted against predicted cure rate (horizontal axis, 0.4 to 0.9), with the High, Moderate, and Low relapse groups marked.]


Conclusions

• We found that Capillary Lymphatic Spaces and depth of tumor are important for predicting the probability of relapse, while pelvic lymph node involvement and size of tumor are important for predicting the survival time of uncured patients.

• We used these attributes in a Weibull mixture model to classify patients according to their risk of recurrence.


References

• Chen, M., and Ibrahim, J. (2001), “Maximum likelihood methods for cure rate models with missing covariates,” Biometrics, 57, 43-52.

• Chen, M., Ibrahim, J., and Sinha, D. (1999), “A new Bayesian model for survival data with a surviving fraction,” JASA, 94, 909-919.

• Maller, R., and Zhou, X. (1996), Survival Analysis with Long-Term Survivors. Toronto: John Wiley & Sons.

• Peng, Y., Dear, K., and Denham, J. (1998), “A generalized F mixture model for cure rate estimation,” Statistics in Medicine, 17, 813-830.

• Seltman, H., Greenhouse, J., and Wasserman, L. (2001), “Bayesian model selection: analysis of a survival model with a surviving fraction,” Statistics in Medicine, 20, 1681-1691.