
• A Cox model is a statistical technique for exploring the relationship between the survival of a patient and several explanatory variables.

• Survival analysis is concerned with studying the time between entry to a study and a subsequent event (such as death).

• A Cox model provides an estimate of the treatment effect on survival after adjustment for other explanatory variables. In addition, it allows us to estimate the hazard (or risk) of death for an individual, given their prognostic variables.

• A Cox model must be fitted using an appropriate computer program (such as SAS, STATA or SPSS). The final model from a Cox regression analysis will yield an equation for the hazard as a function of several explanatory variables.

• Interpreting the Cox model involves examining the coefficients for each explanatory variable. A positive regression coefficient for an explanatory variable means that the hazard is higher, and thus the prognosis worse. Conversely, a negative regression coefficient implies a better prognosis for patients with higher values of that variable.


What is...? series Second edition Statistics

For further titles in the series, visit: www.whatisseries.co.uk

Stephen J Walters BSc MSc PhD CStat, Reader in Medical Statistics, School of Health and Related Research (ScHARR), University of Sheffield

What is a Cox model?

Supported by sanofi-aventis

Date of preparation: May 2009 NPR09/1005

What is the purpose of the Cox model?

The Cox model is based on a modelling approach to the analysis of survival data. The purpose of the model is to simultaneously explore the effects of several variables on survival.

The Cox model is a well-recognised statistical technique for analysing survival data. When it is used to analyse the survival of patients in a clinical trial, the model allows us to isolate the effects of treatment from the effects of other variables. The model can also be used, a priori, if it is known that there are other variables besides treatment that influence patient survival and these variables cannot be easily controlled in a clinical trial. Using the model may improve the estimate of treatment effect by narrowing the confidence interval. Survival times now often refer to the development of a particular symptom or to relapse after remission of a disease, as well as to the time to death.

Why are survival times censored?

A significant feature of survival times is that the event of interest is very rarely observed in all subjects. For example, in a study to compare the survival of patients having different types of treatment for malignant melanoma of the skin, although the patients may be followed up for several years, there will be some patients who are still alive at the end of the study. We do not know when these patients will die, only that they are still alive at the end of the study; therefore, we do not know their survival time from the start of treatment, only that it will be longer than their time in the study. Such survival times are termed censored, to indicate that the period of observation was cut off before the event of interest occurred.

From a set of observed survival times (including censored times) in a sample of individuals, we can estimate the proportion of the population of such people who would survive a given length of time under the same circumstances. This method is called the product limit or Kaplan–Meier method. The method allows a table and a graph to be produced; these are referred to as the life table and survival curve respectively.

Kaplan–Meier estimate of the survivor function

The data on ten patients presented in Table 1 refer to the survival time in years following treatment for malignant melanoma of the skin.


A = survival time (years); B = number at risk at start of study; C = number of deaths; D = number censored; E = proportion surviving until end of interval; F = cumulative proportion surviving

A         B     C    D    E                    F
0.909     10    1    0    1 – 1/10 = 0.900     0.900
1.112     9     1    0    1 – 1/9 = 0.889      0.800
1.322*    8     0    1    1 – 0/8 = 1.000      0.800
1.328     7     1    0    1 – 1/7 = 0.857      0.686
1.536     6     1    0    1 – 1/6 = 0.833      0.571
2.713     5     1    0    1 – 1/5 = 0.800      0.457
2.741*    4     0    1    1 – 0/4 = 1.000      0.457
2.743     3     1    0    1 – 1/3 = 0.667      0.305
3.524*    2     0    1    1 – 0/2 = 1.000      0.305
4.079*    1     0    1    1 – 0/1 = 1.000      0.305

* Indicates a censored survival time

Table 1. Calculation of Kaplan–Meier estimate of the survivor function


To determine the Kaplan–Meier estimate of the survivor function for the above example, a series of time intervals is formed. Each of these intervals is constructed to be such that one observed death is contained in the interval, and the time of this death is taken to occur at the start of the interval.

Table 1 shows the survival times arranged in ascending order (column A). Some survival times are censored (that is, the patient did not die during the follow-up period) and these are labelled with an asterisk. The number of patients who are alive just before 0.909 years is ten (column B). Since one patient dies at 0.909 years (column C), the probability of dying by 0.909 years is 1/10 = 0.10. So the corresponding probability of surviving up to 0.909 years is 1 minus the probability of dying (column E), or 0.900.

The cumulative probability of surviving up to 1.112 years, then, is the probability of surviving at 1.112 years, and surviving throughout the preceding time interval – that is, 0.900 x 0.889 = 0.800 (column F). The third time interval (1.322 years) contains censored data, so the probability of surviving in this time interval is 1 or unity, and the cumulative probability of surviving is unchanged from the previous interval. This is the Kaplan–Meier estimate of the survivor function.
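The calculation in Table 1 can be sketched in a few lines of Python. This is a minimal illustration, not the output of any statistics package: the function name `kaplan_meier` and the tuple layout are assumptions made here, and each subject is processed as its own interval, as Table 1 does.

```python
# Survival data from Table 1: (time in years, event); event = 1 for a death,
# 0 for a censored observation.
TABLE_1 = [
    (0.909, 1), (1.112, 1), (1.322, 0), (1.328, 1), (1.536, 1),
    (2.713, 1), (2.741, 0), (2.743, 1), (3.524, 0), (4.079, 0),
]


def kaplan_meier(data):
    """Return rows of (time, at_risk, deaths, censored, p_interval, cumulative).

    Hypothetical helper for illustration: one row per subject, matching the
    columns A to F of Table 1.
    """
    rows = []
    cumulative = 1.0
    at_risk = len(data)
    # Sort by time; at tied times a death is placed before a censored time.
    for time, event in sorted(data, key=lambda r: (r[0], -r[1])):
        deaths = 1 if event == 1 else 0
        censored = 1 - deaths
        p_interval = 1.0 - deaths / at_risk   # column E
        cumulative *= p_interval              # column F
        rows.append((time, at_risk, deaths, censored, p_interval, cumulative))
        at_risk -= 1                          # one fewer subject at risk
    return rows


km_rows = kaplan_meier(TABLE_1)
for row in km_rows:
    print("%.3f  %2d  %d  %d  %.3f  %.3f" % row)
```

The cumulative column only changes on rows where a death occurs, which is why the plotted estimate is a step function.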

Sometimes the censored survival times occur at the same time as deaths. The censored survival time is then taken to occur immediately after the death time when calculating the survivor function.

A plot of the Kaplan–Meier estimate of the survivor function (Figure 1) is a step function, in which the estimated survival probabilities are constant between adjacent death times and only decrease at each death.

An important part of survival analysis is to produce a plot of the survival curves for each group of interest.[1] However, the comparison of the survival curves of two groups should be based on a formal non-parametric statistical test called the logrank test, and not upon visual impressions.[2] Figure 2 shows the survival of patients treated for malignant melanoma: the survival of 338 patients on interferon treatment was compared with that of 336 patients in the control group.[3] The two groups of patients appear to have similar survival and the logrank test supports this conclusion.
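As a rough sketch of how a two-group logrank statistic is assembled, the Python below compares observed and expected deaths at each distinct death time and refers the result to a chi-square distribution with one degree of freedom. The function name `logrank_test` and the two small data sets are hypothetical and are not taken from the melanoma trial; the sketch also assumes the variance is non-zero, which holds whenever both groups are at risk at some death time.

```python
import math


def logrank_test(group_a, group_b):
    """Two-group logrank test (illustrative sketch).

    Each group is a list of (time, event) pairs, event = 1 for a death and
    0 for a censored time. Returns (observed_a, expected_a, chi_square, p).
    """
    combined = [(t, e, 0) for t, e in group_a] + [(t, e, 1) for t, e in group_b]
    death_times = sorted({t for t, e, _ in combined if e == 1})
    observed_a = expected_a = variance = 0.0
    for t in death_times:
        # Numbers still at risk just before time t in each group
        n_a = sum(1 for u, _, g in combined if g == 0 and u >= t)
        n_b = sum(1 for u, _, g in combined if g == 1 and u >= t)
        # Deaths occurring exactly at time t
        d_a = sum(1 for u, e, g in combined if g == 0 and u == t and e == 1)
        d_b = sum(1 for u, e, g in combined if g == 1 and u == t and e == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n          # expected deaths if groups are alike
        if n > 1:
            variance += n_a * n_b * d * (n - d) / (n * n * (n - 1))
    chi_square = (observed_a - expected_a) ** 2 / variance
    p_value = math.erfc(math.sqrt(chi_square / 2.0))  # chi-square, 1 df
    return observed_a, expected_a, chi_square, p_value


# Hypothetical survival times (years) for two small groups
GROUP_A = [(1.0, 1), (2.5, 1), (3.0, 0), (4.2, 1), (5.0, 0)]
GROUP_B = [(1.5, 1), (2.0, 1), (3.5, 1), (4.0, 0), (6.0, 1)]

obs_a, exp_a, chi2, p = logrank_test(GROUP_A, GROUP_B)
print("observed = %.1f, expected = %.2f, chi-square = %.3f, p = %.3f"
      % (obs_a, exp_a, chi2, p))
```

In practice a statistics package would be used; the point of the sketch is only that the test statistic compares observed with expected deaths across all death times rather than at a single time point.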

Modelling survival – the Cox regression model

The logrank test cannot be used to explore (and adjust for) the effects of several variables, such as age and disease duration, known to affect survival. Adjustment for variables that are known to affect survival may improve the precision with which we can estimate the treatment effect.

The regression method introduced by Cox is used to investigate several variables at a time.[4] It is also known as proportional hazards regression analysis.

Briefly, the procedure models or regresses the survival times (or more specifically, the so-called hazard function) on the explanatory variables. The actual method is much too complex for detailed discussion here. This publication is intended to give an introduction to the method, and should be of use in the understanding and interpretation of the results of such analyses. A more detailed discussion is given by Machin et al[5] and Collett.[6]

Figure 1. Kaplan–Meier estimate of the survival function. (Plot: cumulative proportion surviving, 0.0 to 1.0, against overall survival in years from surgery, 0 to 5; censored observations are marked.)

What is a hazard function?

The hazard function is the probability that an individual will experience an event (for example, death) within a small time interval, given that the individual has survived up to the beginning of the interval. It can therefore be interpreted as the risk of dying at time t.

The hazard function – denoted by h(t) – can be estimated using the following equation:

$$h(t) = \frac{\text{number of individuals experiencing an event in interval beginning at } t}{(\text{number of individuals surviving at time } t) \times (\text{interval width})}$$
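A small worked example may help fix the units of this estimate. The helper below simply evaluates the ratio above; the function name and the numbers (5 deaths among 100 survivors over half a year) are hypothetical and chosen only for illustration.

```python
def hazard_estimate(events_in_interval, survivors_at_start, interval_width):
    """Estimated hazard for the interval beginning at t.

    Events per survivor per unit of time, following the equation for h(t).
    """
    return events_in_interval / (survivors_at_start * interval_width)


# Hypothetical illustration: 5 deaths among 100 people alive at time t,
# observed over an interval of 0.5 years.
h = hazard_estimate(events_in_interval=5, survivors_at_start=100,
                    interval_width=0.5)
print("Estimated hazard: %.2f deaths per person-year" % h)
```

Because the count is divided by the interval width, the estimate is expressed per unit of time, so halving the interval width for the same number of deaths doubles the estimated hazard.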

What is regression?

If we want to describe the relationship between the values of two or more variables we can use a statistical technique called regression.[7] If we have observed the values of two variables, X (for example, age of children) and Y (for example, height of children), we can perform a regression of Y on X. We are investigating the relationship between a dependent variable (the height of children) and the explanatory variable (the age of children).

When more than one explanatory (X) variable needs to be taken into account (for example, height of the father), the method is known as multiple regression. Cox’s method is similar to multiple regression analysis, except that the dependent (Y) variable is the hazard function at a given time. If we have several explanatory (X) variables of interest (for example, age, sex and treatment group), then we can express the hazard or risk of dying at time t as:

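$$h(t) = h_0(t) \times \exp(b_{\mathrm{age}} \cdot \mathrm{age} + b_{\mathrm{sex}} \cdot \mathrm{sex} + \dots + b_{\mathrm{group}} \cdot \mathrm{group})$$

or, taking natural logarithms of both sides:

$$\ln h(t) = \ln h_0(t) + b_{\mathrm{age}} \cdot \mathrm{age} + b_{\mathrm{sex}} \cdot \mathrm{sex} + \dots + b_{\mathrm{group}} \cdot \mathrm{group}$$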

The quantity h0(t) is the baseline or underlying hazard function and corresponds to the probability of dying (or reaching an event) when all the explanatory variables are zero. The baseline hazard function is analogous to the intercept in ordinary regression (since exp(0) = 1).

The regression coefficients b_age to b_group give the proportional change that can be expected in the hazard, related to changes in the explanatory variables. They are estimated by a complex statistical method called maximum likelihood,[6] using an appropriate computer program (for example, SAS, SPSS or STATA).
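To give a feel for what such a program does, the sketch below fits a Cox model with a single 0/1 covariate by Newton-Raphson iteration on the Cox partial likelihood (the form of likelihood normally used for this model, which does not involve the baseline hazard). Everything here is an assumption for illustration: the data are hypothetical, there are no tied death times, and real software handles ties, several covariates and convergence problems far more carefully.

```python
import math

# Hypothetical data: (time, event, x); event = 1 death, 0 censored;
# x is a single 0/1 explanatory variable.
DATA = [(1, 1, 1), (2, 1, 0), (3, 1, 1), (4, 1, 1),
        (5, 0, 0), (6, 1, 0), (7, 1, 1), (8, 0, 0)]


def score_and_information(b, data):
    """First derivative and negative second derivative of the log partial likelihood."""
    score = information = 0.0
    for t_i, event_i, x_i in data:
        if event_i != 1:
            continue  # censored subjects contribute only through risk sets
        # Risk set: everyone still under observation just before t_i
        risk = [x for t, _, x in data if t >= t_i]
        s0 = sum(math.exp(b * x) for x in risk)
        s1 = sum(x * math.exp(b * x) for x in risk)
        s2 = sum(x * x * math.exp(b * x) for x in risk)
        score += x_i - s1 / s0
        information += s2 / s0 - (s1 / s0) ** 2
    return score, information


def fit_cox_one_covariate(data, tol=1e-10, max_iter=50):
    """Newton-Raphson for one coefficient; returns (b, SE(b))."""
    b = 0.0
    for _ in range(max_iter):
        score, information = score_and_information(b, data)
        step = score / information
        b += step
        if abs(step) < tol:
            break
    _, information = score_and_information(b, data)
    return b, 1.0 / math.sqrt(information)


b_hat, se_b = fit_cox_one_covariate(DATA)
print("b = %.3f, SE(b) = %.3f, hazard ratio = %.3f"
      % (b_hat, se_b, math.exp(b_hat)))
```

The fitted coefficient is the value at which the score (the slope of the log partial likelihood) is zero, and the standard error comes from the curvature at that point.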

The assumption of a constant relationship between the dependent variable and the explanatory variables is called proportional hazards. This means that the hazard functions for any two individuals at any point in time are proportional. In other words, if an individual has a risk of death at some initial time point that is twice as high as that of another individual, then at all later times the risk of death remains twice as high. This assumption of proportional hazards should be tested.[6]

Figure 2. Kaplan–Meier survival curves in patients receiving treatment for malignant melanoma.[3] (Plot: cumulative proportion surviving, 0.0 to 1.00, against time from randomisation to death in years, 0 to 8, for the control and interferon groups. Number at risk at 0, 2, 4, 6 and 8 years: control 336, 203, 97, 22, 0; interferon 338, 215, 84, 23, 0. Hazard ratio 0.92 (95% CI: 0.74–1.13); p=0.411 (logrank).)
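The reason the Cox model implies proportional hazards can be seen by dividing the hazard for one individual by the hazard for another. The labels 1 and 2 for the two individuals are introduced here purely for illustration; the other symbols are those of the model equation.

```latex
\frac{h_1(t)}{h_2(t)}
  = \frac{h_0(t) \times \exp(b_{\mathrm{age}} \cdot \mathrm{age}_1 + b_{\mathrm{sex}} \cdot \mathrm{sex}_1 + \dots + b_{\mathrm{group}} \cdot \mathrm{group}_1)}
         {h_0(t) \times \exp(b_{\mathrm{age}} \cdot \mathrm{age}_2 + b_{\mathrm{sex}} \cdot \mathrm{sex}_2 + \dots + b_{\mathrm{group}} \cdot \mathrm{group}_2)}
  = \exp\bigl(b_{\mathrm{age}} (\mathrm{age}_1 - \mathrm{age}_2) + b_{\mathrm{sex}} (\mathrm{sex}_1 - \mathrm{sex}_2) + \dots + b_{\mathrm{group}} (\mathrm{group}_1 - \mathrm{group}_2)\bigr)
```

The baseline hazard cancels, so the ratio depends only on the differences in the explanatory variables and not on time t.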

The testing of the proportional hazards assumption is most straightforward when we compare two groups with no covariates. The simplest check is to plot the Kaplan–Meier survival curves together (Figure 2).[3] If they cross, then the proportional hazards assumption may be violated. For small data sets, where there may be a great deal of error attached to the survival curve, it is possible for curves to cross, even under the proportional hazards assumption. A more sophisticated check is based on what is known as the complementary log-log plot. With this method, a plot of the logarithm of the negative logarithm of the estimated survivor function against the logarithm of survival time will yield parallel curves if the hazards are proportional across the groups (Figure 3).[3]
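The transformation behind the complementary log-log plot is easy to sketch. The Python below uses the death times and cumulative proportions from Table 1 as one curve, and then constructs a second, purely hypothetical curve whose hazard is exactly twice as large at all times (so its survivor function is the square of the first). The function name `complementary_log_log` is an assumption made for this illustration.

```python
import math


def complementary_log_log(times, survival_probs):
    """Return (ln t, ln(-ln S(t))) points.

    Points with S(t) equal to 0 or 1 are skipped because the transform
    is undefined there.
    """
    points = []
    for t, s in zip(times, survival_probs):
        if t > 0 and 0.0 < s < 1.0:
            points.append((math.log(t), math.log(-math.log(s))))
    return points


# Death times and cumulative proportions surviving, from Table 1
times = [0.909, 1.112, 1.328, 1.536, 2.713, 2.743]
surv = [0.900, 0.800, 0.686, 0.571, 0.457, 0.305]

# Hypothetical comparison group with twice the hazard: S(t) squared
surv_doubled = [s ** 2 for s in surv]

curve_a = complementary_log_log(times, surv)
curve_b = complementary_log_log(times, surv_doubled)

# Vertical distance between the two curves at each time point
offsets = [yb - ya for (_, ya), (_, yb) in zip(curve_a, curve_b)]
print(["%.3f" % d for d in offsets])
```

Under proportional hazards the vertical gap between the curves is the same at every time point (here the natural logarithm of the hazard ratio, ln 2), which is what "parallel curves" means in practice.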

Interpretation of the model

As mentioned above, the Cox model must be fitted using an appropriate computer program. The final model from a Cox regression analysis will yield an equation for the hazard as a function of several explanatory variables (including treatment). So how do we interpret the results? This is illustrated by the following example.

Cox regression analysis was carried out on the data from a randomised trial comparing the effect of low-dose adjuvant interferon alfa-2a therapy with that of no further treatment in patients with malignant melanoma at high risk of recurrence.[3,8] Malignant melanoma is a serious type of skin cancer, characterised by uncontrolled growth of pigment cells called melanocytes. Treatments include surgical removal of the tumour; adjuvant treatment; chemo- and immunotherapy, and radiation therapy. In this trial, 674 patients with a radically resected malignant melanoma (who were at high risk of disease recurrence) were randomly assigned to one of two treatment groups: interferon (3 megaunits of interferon alfa-2a three times a week until recurrence of cancer, or for two years – whichever occurred first) or no further treatment. The primary aim of this multicentre study was to determine the effects of interferon on overall survival. Patients were followed for up to eight years from randomisation.[8]

Figure 3. Complementary log-log plot.[3] (Plot: ln{–ln[survival probability]} against ln(time), by randomised group: control and interferon.)

The final Cox model included two demographic variables (age and gender) and one baseline clinical variable (histology) as independent prognostic factors, plus a treatment variable (Table 2). An approximate test of significance for each variable is obtained by dividing the regression estimate b by its standard error SE(b), and comparing the result with the standard normal distribution. Values of this ratio greater than 1.96 in absolute value will be statistically significant at the 5% level. The Cox model is shown in Table 2.
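The approximate significance test described above can be reproduced from the coefficients and standard errors printed in Table 2. This is a sketch only: the dictionary keys and the function name `wald_test` are labels chosen here, and because Table 2 reports values rounded to three decimal places, the recomputed p-values can differ a little from the published ones (noticeably so for age, where both b and SE(b) round to 0.004).

```python
import math

# (regression coefficient b, standard error SE(b)) as printed in Table 2
TABLE_2 = {
    "age": (0.004, 0.004),
    "sex": (-0.312, 0.110),
    "histology_1": (-0.033, 0.234),
    "histology_2": (0.446, 0.204),
    "histology_3": (0.569, 0.154),
    "group": (-0.090, 0.108),
}


def wald_test(b, se):
    """Return (z, two-sided p-value) using the standard normal distribution."""
    z = b / se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


for name, (b, se) in TABLE_2.items():
    z, p = wald_test(b, se)
    print("%-12s z = %6.2f  p = %.3f  significant at 5%%: %s"
          % (name, z, p, abs(z) > 1.96))
```

The comparison uses the absolute value of the ratio, so a large negative ratio (as for sex) is just as significant as a large positive one.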

The first feature to note in such a table is the sign of the regression coefficients. A positive sign means that the hazard (risk of death) is higher, and thus the prognosis worse, for subjects with higher values of that variable. Thus, from Table 2, older age and regionally metastatic cancer histology are associated with poorer survival, whereas being male is associated with better survival.

An individual regression coefficient is interpreted quite easily. Note that patients are either given interferon (coded as 1) or not (coded as 0). From Table 2, the estimated hazard in the interferon group is exp(–0.090) = 0.914 times that of the control group; that is, a 9% decrease in the risk of death after adjustment for the other explanatory variables in the model. However, the p-value of 0.404 is not statistically significant and the 95% confidence interval for the hazard ratio includes 1, suggesting no difference in survival. In this study the authors concluded that there was no significant difference in overall survival between interferon-treated patients and those in the control group, even after adjustment for prognostic factors.[8]
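The hazard ratio and its confidence interval for the treatment group can be recovered from b and SE(b) in Table 2. The sketch below assumes the usual normal-approximation interval, exp(b ± 1.96 × SE(b)); the function name is hypothetical.

```python
import math


def hazard_ratio_with_ci(b, se, z_crit=1.96):
    """Return (hazard ratio, lower limit, upper limit) for a 95% interval."""
    hazard_ratio = math.exp(b)
    lower = math.exp(b - z_crit * se)
    upper = math.exp(b + z_crit * se)
    return hazard_ratio, lower, upper


# Treatment group row of Table 2: b = -0.090, SE(b) = 0.108
hr, lower, upper = hazard_ratio_with_ci(-0.090, 0.108)
print("Hazard ratio = %.3f (95%% CI %.3f to %.3f)" % (hr, lower, upper))
```

To three decimal places this reproduces the hazard ratio and confidence limits shown for the Group row of Table 2, and the interval spans 1, in line with the non-significant p-value.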

For explanatory variables that are continuous (for example, age) the regression coefficient refers to the increase in log hazard for an increase of 1 in the value of the covariate. Thus, the estimated hazard or risk of death increases by exp(0.004) = 1.004 times if a patient is a year older, after adjustment for the effects of the other variables in the model (Table 2). The overall effect on survival for an individual patient, however, cannot be described simply, as it depends on the patient’s values of the other variables in the model.

Variable                                       b        SE(b)    p-value   e^b (hazard ratio*)   95% CI lower   95% CI upper
Age                                            0.004    0.004    0.359     1.004                 0.996          1.012
Sex (0 = female, 1 = male)                     –0.312   0.110    0.005     0.732                 0.590          0.909
Histology                                                        0.001
Histology (1) (0 = localised, 1 = LM)          –0.033   0.234    0.887     0.967                 0.612          1.530
Histology (2) (0 = localised, 1 = RMD)         0.446    0.204    0.029     1.562                 1.048          2.330
Histology (3) (0 = localised, 1 = RMR)         0.569    0.154    0.001     1.766                 1.306          2.387
Group (0 = control, 1 = interferon)            –0.090   0.108    0.404     0.914                 0.740          1.129

b: regression coefficient; SE(b): standard error of b

* Risk of death according to treatment assignment and prognostic variables

CI: confidence interval; LM: locally metastatic; RMD: regionally metastatic at diagnosis; RMR: regionally metastatic at recurrence

Table 2. Cox regression model fitted to the data from the AIM HIGH trial of interferon versus no further treatment (control) in malignant melanoma (n=674)
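One way to see how the variables combine for an individual is to compute exp(b_age × age + b_sex × sex + ...) for two patients and take the ratio. The sketch below uses the coefficients from Table 2, but the two patients are hypothetical, the dictionary keys are labels chosen here, and age is assumed to be entered in years; because only the ratio is used, the unspecified baseline hazard cancels.

```python
import math

# Regression coefficients from Table 2 (keys are labels chosen for this sketch)
COEF = {
    "age": 0.004,          # per year of age (assumed coding)
    "male": -0.312,        # 0 = female, 1 = male
    "lm": -0.033,          # locally metastatic vs localised
    "rmd": 0.446,          # regionally metastatic at diagnosis vs localised
    "rmr": 0.569,          # regionally metastatic at recurrence vs localised
    "interferon": -0.090,  # 0 = control, 1 = interferon
}


def relative_hazard(patient):
    """exp(linear predictor): the patient's hazard relative to baseline h0(t)."""
    linear_predictor = sum(COEF[name] * patient.get(name, 0) for name in COEF)
    return math.exp(linear_predictor)


# Two hypothetical patients
patient_a = {"age": 60, "male": 1, "rmr": 1, "interferon": 1}
patient_b = {"age": 50, "male": 0, "interferon": 0}

ratio = relative_hazard(patient_a) / relative_hazard(patient_b)
print("Hazard of patient A relative to patient B: %.3f" % ratio)
```

Here the higher risk from older age and regionally metastatic histology outweighs the lower risk from being male and receiving interferon, which illustrates why no single coefficient describes an individual's overall prognosis.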

Other models

Cox regression is considered a ‘semi-parametric’ procedure because the baseline hazard function, h0(t), (and the probability distribution of the survival times) does not have to be specified. Since the baseline hazard is not specified, a different parameter is used for each unique survival time. Because the hazard function is not restricted to a specific form, the semi-parametric model has considerable flexibility and is widely used. However, if the assumption of a particular probability distribution for the data is valid, inferences based on such an assumption are more precise. That is, estimates of the hazard ratio will have smaller standard errors and hence narrower confidence limits.

A fully parametric proportional hazards model makes the same assumptions as the Cox regression model but, in addition, also assumes that the baseline hazard function, h0(t), can be parameterised according to a specific model for the distribution of the survival times. Survival time distributions that can be used for this purpose (those that have the proportional hazards property) are mainly the exponential, Weibull and Gompertz distributions.

Figure 4 shows examples of the hazard functions for the exponential, Weibull and Gompertz distributions. The simplest model for the hazard function is to assume that it is constant over time. The hazard of death at any time after the start of the study is then the same, irrespective of the time elapsed, and the hazard function follows an exponential distribution (Figure 4A). In practice, the assumption of a constant hazard function (or equivalently exponentially distributed survival times) is rarely tenable. A more general form of hazard function is called the Weibull distribution. The shape of the Weibull hazard function depends critically on the value of something called the shape parameter, typically denoted by the Greek letter gamma, γ. Figure 4B shows the general form of this hazard function for different values of gamma. Since the Weibull hazard function can take a variety of forms depending on the value of the shape parameter gamma, this distribution is widely used in the parametric analysis of survival data. When the hazard of death is expected to increase or decrease with time in the short term and then to become constant, a hazard function that follows a Gompertz distribution may be appropriate (Figure 4C).
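The three hazard shapes can be written down directly. The parameterisations below are common textbook forms and are an assumption here, since the article does not state which forms were used: a constant hazard lambda for the exponential, lambda × gamma × t^(gamma − 1) for the Weibull, and lambda × exp(theta × t) for the Gompertz. The parameter names `lam`, `gamma` and `theta` are labels chosen for this sketch.

```python
import math


def exponential_hazard(t, lam):
    """Constant hazard: the same at every time t."""
    return lam


def weibull_hazard(t, lam, gamma):
    """Weibull hazard for t > 0; gamma is the shape parameter."""
    return lam * gamma * t ** (gamma - 1)


def gompertz_hazard(t, lam, theta):
    """Gompertz hazard; rises over time if theta > 0, falls if theta < 0."""
    return lam * math.exp(theta * t)


# Evaluate the Weibull hazard at a few times for different shape parameters
for gamma in (0.5, 1.0, 2.0, 3.0):
    values = [weibull_hazard(t, lam=0.1, gamma=gamma) for t in (1.0, 2.0, 4.0)]
    print("gamma = %.1f:" % gamma, ["%.3f" % v for v in values])
```

With gamma = 1 the Weibull hazard reduces to the constant exponential hazard; with gamma below 1 it decreases over time and with gamma above 1 it increases, which is the flexibility referred to in the text.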

Figure 4. Examples of hazard functions over time for exponential (a), Weibull (b) and Gompertz (c) distributions. (Plots: hazard function against time in each panel; panel (b) shows curves for gamma > 2, gamma = 2, gamma = 1 and 0 < gamma < 1.)

Different distributions imply different shapes of the hazard function, and in practice the distribution that best describes the functional form of the observed hazard function is chosen.[6] Fitting three parametric proportional hazard models, assuming exponential, Weibull and Gompertz baseline hazards, to the malignant melanoma trial data produced similar regression coefficients to the standard Cox model in Table 2.

A family of fully parametric models that accommodate, directly, the multiplicative effects of explanatory variables on survival times, and hence do not have to rely on proportional hazards, are called accelerated failure time models. These models are too complex for a discussion here, and a more detailed discussion is given by Collett.[6]

Published by Hayward Medical Communications, a division of Hayward Group Ltd. Copyright © 2009 Hayward Group Ltd. All rights reserved.

First edition published 2001. Author: Stephen J Walters

This publication, along with the others in the series, is available on the internet at www.whatisseries.co.uk

The data, opinions and statements appearing in the article(s) herein are those of the contributor(s) concerned. Accordingly, the sponsor and publisher, and their respective employees, officers and agents, accept no liability for the consequences of any such inaccurate or misleading data, opinion or statement.

References

1. Freeman JV, Walters SJ, Campbell MJ. How to display data. Oxford: Blackwell BMJ Books, 2008.
2. Altman DG. Practical Statistics for Medical Research. London: Chapman & Hall/CRC, 1991: 365–396.
3. Dixon S, Walters SJ, Turner L, Hancock BW. Quality of life and cost-effectiveness of interferon-alpha in malignant melanoma: results from randomised trial. Br J Cancer 2006; 94: 492–498.
4. Cox DR. Regression models and life tables. J Roy Statist Soc B 1972; 34: 187–220.
5. Machin D, Cheung YB, Parmar M. Survival Analysis: A Practical Approach, 2nd edn. Chichester: Wiley, 2006.
6. Collett D. Modelling Survival Data in Medical Research, 2nd edn. London: Chapman & Hall/CRC, 2003.
7. Campbell MJ, Machin D, Walters SJ. Medical Statistics: A textbook for the health sciences, 4th edn. Chichester: Wiley, 2007.
8. Hancock BW, Wheatley K, Harris S et al. Adjuvant interferon in high-risk melanoma: the AIM HIGH Study – United Kingdom Coordinating Committee on Cancer Research randomized study of adjuvant low-dose extended-duration interferon Alfa-2a in high-risk resected malignant melanoma. J Clin Oncol 2004; 22: 53–61.

Further reading

Chapter 13 of Altman[2] provides a good introduction to survival analysis, the logrank test and the Cox regression model. A more detailed technical discussion of survival analysis and Cox regression is given by Machin et al and Collett.[5,6]

Box 1. Glossary of terms

Confidence interval (CI). A range of values, calculated from the sample of observations, that is believed, with a particular probability, to contain the true parameter value. A 95% confidence interval implies that if the estimation process were repeated again and again, then 95% of the calculated intervals would be expected to contain the true parameter value. Note that the stated probability level refers to the properties of the interval and not to the parameter itself.

e^x or exp(x). The exponential function, denoting the inverse procedure to that of taking logarithms.

Logrank test. A method for comparing the survival times of two or more groups of subjects. It involves the calculation of observed and expected frequencies of failures in separate time intervals. The relevant test statistic is a comparison of the observed number of deaths occurring at each particular point with the number to be expected if the survival experience of the two groups is the same.

Logarithms. Logarithms are mainly used in statistics to transform a set of observations to values with a more convenient distribution. The natural logarithm (log_e x or ln x) of a quantity x is the value y such that x = e^y. Here e is the constant 2.718281… The log of 1 is 0 and the log of 0 is minus infinity. Log transformation can only be used for data where all x values are positive.

SE or se. The standard error of a sample mean or some other estimated statistic (for example, a regression coefficient). It is the measure of the uncertainty of such an estimate and it is used to derive a confidence interval for the population value. The notation SE(b) means the ‘standard error of b’.

p. The probability value, or significance level, from a hypothesis test. p is the probability of the data (or some other more extreme data) arising by chance when the null hypothesis is true.
