
September 23, 2019 ASA Biopharmaceutical Section Regulatory-Industry Statistics Workshop

Marriott Wardman Park, Washington, DC

SHORT COURSE 5

Statistical Analysis of Composite Endpoints in Clinical Trials

Instructor(s): Lu Mao, University of Wisconsin-Madison


Table of Contents

1. Introduction
   a) Definition and rationale
   b) Clinical trial examples
   c) Regulatory guidelines and challenges

2. Conventional Methods
   a) Binary composite endpoints
   b) Time-to-first-event (TFE) analysis
   c) Beyond the first event

3. The Two-Sample Win Ratio
   a) Rationale and approach
   b) Hypothesis testing and estimand
   c) Weighted win ratio and the WWR R-package
   d) Generalizations: stratification and recurrent-event extension

4. Regression Methods for the Win Ratio
   a) Model assumption and estimand
   b) Relationship with two-sample win ratio and Cox model
   c) Estimation and model-checking
   d) The WR R-package
   e) Open problems

5. Case Study and Discussion

Appendix

References

Chapter 1.

Introduction

Introduction - Definition

Definition of composite endpoints (CE): an outcome combining multiple distinct types of events into a single variable (Rauch et al., 2018a)

Binary composite endpoint

Time-to-event composite endpoint

Examples:

• Behavioral interventions: initiation of any of several behaviors

• Cardiovascular (CV) trials: death and CV hospitalization

• Oncology trials: death and tumor progression

Introduction - Rationale

Advantages (Freemantle et al., 2003, JAMA)

Increasing power: a larger number of events

Avoiding multiplicity adjustment

Overall treatment effect

ICH-E9 “Statistical Principles for Clinical Trials” (ICH, 1998):

“There should generally be only one primary variable”

“If a single primary variable cannot be selected from multiple measurements associated with the primary objective, another useful strategy is to integrate or combine the multiple measurements into a single or composite variable, using a predefined algorithm.”

Introduction – Clinical Trial Examples

The Osteoporosis Trial (Yuksel et al., 2010):

Aim: Assess the effect of a community pharmacist-initiated education program for high-risk patients on subsequent testing and treatment of osteoporosis (n = 262)

Endpoint: Composite binary event of performing a bone mineral test (BMT) or initiating osteoporosis medication within 4 months

Results: composite event rate 0.22 vs 0.11 (p-value < .001), mainly driven by BMT


The CAPRICORN Trial (Dargie et al., 2001, The Lancet):

Aim: Assess long-term efficacy of carvedilol on morbidity and mortality in patients with left ventricular dysfunction after acute myocardial infarction (n = 1959)

Endpoint: All-cause mortality or CV hospitalization (time to first event)

Results (carvedilol vs control):

hazard ratio (HR) for composite event 0.92 (p-value .296)

HR for all-cause death 0.77 (p-value .031)

effect on CE driven by death


The CHARM-Preserved Trial (NCT00634712; Yusuf et al., 2003, The Lancet):

Aim: Evaluate the effect of Atacand (candesartan) on patients with heart failure with preserved left ventricular function (n = 3,025)

Endpoint: Cardiovascular mortality or hospitalization due to congestive heart failure (CHF) (time to first event)

Results (Atacand vs control):

HR for composite event 0.89 (p-value .118)

no difference in CV death

significantly fewer repeated CHF hospitalizations in treatment (p-value .017)


The EMPA-REG Trial (NCT01131676; Zinman et al., 2015, New England Journal of Medicine):

• Aim: Assess the effects of empagliflozin on CV morbidity and mortality in patients with type 2 diabetes (n = 7020)

• Endpoint: CV death, myocardial infarction (MI), or stroke (time to first event)

• Results:

Introduction – Regulatory Guidelines

ICH-E9 Guidelines, Section 2.2.3 (ICH, 1998):

“(composite endpoint) addresses the multiplicity problem without adjustment to the type I error.”

“The method of combining the multiple measurements should be specified in the protocol.”

“…an interpretation of the resulting scale should be provided in terms of the size of a clinically relevant benefit.”


The European Network for Health Technology Assessment (EUnetHTA) guideline: “Endpoints used for Relative Effectiveness Assessment – Composite Endpoints” (EUnetHTA, 2015)

“All components of a composite endpoint should be separately defined as secondary endpoints and reported with the results of the primary analysis.”

“Components of similar clinical importance and sensitivity to intervention should preferably be combined.”

“If adequate, mortality should however be included if it is likely to have a censoring effect on the observation of other components.”



FDA Guidance for Industry: “Multiple Endpoints in Clinical Trials” (FDA, 2017)

“Composite endpoints are often assessed as the time to first occurrence of any one of the components, …, it also may be possible to analyze total endpoint events.”

“The treatment effect on the composite rate can be interpreted as characterizing the overall clinical effect when the individual events all have reasonably similar clinical importance.”

“…analyses of the components of the composite endpoint are important and can influence interpretation of the overall study results.”


ICH-E9 (R1): “Estimands and Sensitivity Analysis in Clinical Trials” (ICH, 2017)

“A central question for drug development and licensing is to quantify treatment effects.”

“…an estimand defines in detail what needs to be estimated to address a specific scientific question of interest.”

“Intercurrent events: Events that occur after treatment initiation and either preclude observation of the variable or affect its interpretation” (e.g., death).



ICH-E9 (R1): “Estimands and Sensitivity Analysis in Clinical Trials”

“Intercurrent events need to be considered in the description of a treatment effect on a variable of interest”

“Composite strategy: The occurrence of the intercurrent event is taken to be a component of the variable.”

On the other hand, “…missing data and loss-to-follow-up are irrelevant to the construction of estimands.” (training material)


To summarize, a composite endpoint should

• be pre-specified in the trial protocol

• (ideally) consist of components of similar clinical importance

• include mortality whenever appropriate

• provide a meaningful scale for overall treatment effect

• be followed up with component-wise secondary analysis

Introduction – Challenges

Components are usually of unequal importance

• Effect can be driven by less important components (EUnetHTA, 2015), e.g., CHARM-Preserved trial

• The composite effect measure is not meaningful if “effect is chiefly on the least important event” (FDA, 2017)

Differential treatment effects on components

• Unresponsive components reduce overall effect

• Particularly problematic if those are the less important ones, e.g., CAPRICORN and EMPA-REG trials

Need for prioritization

Introduction – Course Objectives

After taking this short course, the audience will

Understand the issues associated with composite endpoints and the conventional approaches to addressing them

Learn the basics of the newly developed Win Ratio (WR) methodology (Pocock et al., 2012, European Heart Journal)

Be able to analyze real data and interpret results using statistical software (mostly R)

Topics not covered:

X Multiple testing

X Joint analysis

X Secondary, component-wise analysis

Chapter 2.

Conventional Methods

Binary CE

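Outcomes $Y = (Y_1, \dots, Y_K)'$, where $Y_k \in \{1, 0\}$

• The Osteoporosis trial: $Y_1 =$ BMT, $Y_2 =$ Medication, $K = 2$

• The CE is $Y^* = I(\text{BMT or Med}) = I(Y_1 + Y_2 > 0)$ (the OR combination)

For notational convenience, define the composite using the AND combination $\tilde{Y} = Y_1 Y_2 \cdots Y_K$

• The Osteoporosis trial: $Y_1 =$ no BMT, $Y_2 =$ no Medication, $K = 2$

• The CE is $\tilde{Y} = I(\text{no BMT, no Med})$ (the AND combination)

• Note that $\tilde{Y} = 1 - Y^*$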

Binary CE

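Treatment arm $a = 1$ (active treatment), $0$ (control)

Use subscript $a$ to denote the variable from arm $a$: $Y_a = (Y_{a1}, \dots, Y_{aK})'$ and $\tilde{Y}_a = Y_{a1} \cdots Y_{aK}$

Hypotheses: $H_0: p_1 = p_0$ vs. $H_A: p_1 \neq p_0$, where $p_a = \mathrm{pr}(\tilde{Y}_a = 1)$

Suppose we have a random sample from each arm: $Y_{ai}$, $i = 1, \dots, n_a$, $a = 1, 0$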

Binary CE

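If there are no missing data in the $Y_{ai}$ ($i = 1, \dots, n_a$), the composite indicator $\tilde{Y}_{ai} = Y_{ai1} \cdots Y_{aiK}$ is observable on each subject

The composite event rate $p_a$ can be estimated by $\hat{p}_a = n_a^{-1} \sum_{i=1}^{n_a} \tilde{Y}_{ai}$, leading to the test statistic

$$\frac{\hat{p}_1 - \hat{p}_0}{\sqrt{\hat{p}_1 (1 - \hat{p}_1)/n_1 + \hat{p}_0 (1 - \hat{p}_0)/n_0}} \sim N(0, 1) \quad \text{under } H_0$$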
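The complete-data test can be sketched in a few lines of Python. This is a minimal illustration, assuming an unpooled variance estimate; the function name `composite_z_test` and the counts in the usage line are hypothetical and are not trial data.

```python
from math import sqrt
from statistics import NormalDist


def composite_z_test(events1, n1, events0, n0):
    """Two-sample z-test comparing composite event rates (unpooled variance)."""
    p1, p0 = events1 / n1, events0 / n0
    se = sqrt(p1 * (1 - p1) / n1 + p0 * (1 - p0) / n0)
    z = (p1 - p0) / se
    # Two-sided p-value from the standard normal reference distribution
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts, for illustration only
z, p = composite_z_test(events1=30, n1=100, events0=15, n0=100)
```

A pooled variance under the null is an equally common choice and would change only the `se` line.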


Binary CE – Missing Components

If the $Y_{ai}$ contain missing components, the composite $\tilde{Y}_{ai}$ may not be computable

• $(0, \cdot) \to \tilde{Y}_{ai} = 0$

• $(1, \cdot) \to \tilde{Y}_{ai} = {?}$

If components are missing at random (MAR), i.e., missingness depends on treatment and non-missing components only, one can use the EM algorithm for inference on the $p_a$ (Quan et al., 2007, Stats in Med)


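Denote $\pi_{a,y} = \mathrm{pr}(Y_a = y)$

The Osteoporosis example (AND coding):

• $\pi_{a,(1,0)}$: probability of medication without BMT in arm $a$

• $\pi_{a,(0,1)}$: probability of BMT without medication in arm $a$

So $p_a = \pi_{a,(1,\dots,1)}$

Question: how to estimate $\pi_{a,(1,\dots,1)}$ when some components of $Y_{ai}$ are missing on some subjects?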


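Let $\mathcal{S}_{ai}$ = the set of $y$ compatible with the observation on the $i$th subject

• $(0, \cdot) \to \{(0,0), (0,1)\}$

• $(1, \cdot) \to \{(1,0), (1,1)\}$

EM for the unknown parameters $\pi_{a,y}$: at the $(m+1)$th iteration,

E-step (for $y \in \mathcal{S}_{ai}$):

$$\omega_{ai,y}^{(m)} = \frac{\pi_{a,y}^{(m)}}{\sum_{y' \in \mathcal{S}_{ai}} \pi_{a,y'}^{(m)}}$$

M-step:

$$\pi_{a,y}^{(m+1)} = n_a^{-1} \sum_{i:\, y \in \mathcal{S}_{ai}} \omega_{ai,y}^{(m)}$$

Iterate until convergence to obtain $\hat{p}_a = \hat{\pi}_{a,(1,\dots,1)}$, with variance estimated by the Louis formula

Likelihood ratio test

$$2 \log(\hat{L} / \hat{L}_0) \sim \chi^2_1$$

$\hat{L}_0$: maximized likelihood by EM under $H_0: \pi_{1,(1,\dots,1)} = \pi_{0,(1,\dots,1)}$

$\hat{L}$: maximized likelihood by EM under no constraint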
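The E- and M-steps for a single arm can be sketched as follows. This is an illustrative implementation under stated assumptions, not the author's software: each subject is a tuple of 0/1 values with `None` marking a missing component, and the names `compatible` and `em_pattern_probs` are hypothetical. Variance estimation and the likelihood ratio test are not included.

```python
from itertools import product


def compatible(obs):
    """All complete 0/1 patterns that agree with obs (None = missing)."""
    options = [(0, 1) if v is None else (v,) for v in obs]
    return list(product(*options))


def em_pattern_probs(data, n_iter=500, tol=1e-10):
    """EM estimate of pi_y = pr(Y = y) for one arm under MAR."""
    K = len(data[0])
    patterns = list(product((0, 1), repeat=K))
    pi = {y: 1.0 / len(patterns) for y in patterns}  # uniform start
    sets = [compatible(obs) for obs in data]
    for _ in range(n_iter):
        new = dict.fromkeys(patterns, 0.0)
        for S in sets:
            denom = sum(pi[y] for y in S)
            for y in S:
                # E-step: posterior weight of pattern y for this subject
                new[y] += pi[y] / denom
        # M-step: average the weights over subjects
        new = {y: v / len(data) for y, v in new.items()}
        converged = max(abs(new[y] - pi[y]) for y in patterns) < tol
        pi = new
        if converged:
            break
    return pi


# Hypothetical data with K = 2 components; the composite rate is pi[(1, 1)]
pi_hat = em_pattern_probs([(1, 1), (1, None), (0, None), (0, 0)])
p_hat = pi_hat[(1, 1)]
```

With complete data every compatible set is a singleton, so the loop returns the empirical pattern proportions after one pass.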

Software and details: check back at https://sites.google.com/view/lmaowisc/publications

Binary CE – Sample Size

Sample size calculation based on component-wise marginal rates (Bofill-Roig & Gomez-Melis, 2019, Stats in Med)

An online tool

http://cinna.upc.edu:3838/compare/CompAREBinary/

Time-to-Event CE

Composite time-to-event endpoints are common in cardiovascular and oncological trials

• Major adverse cardiac events (MACE): MI, heart failure (HF), stroke, death

• Progression-free survival (PFS): tumor progression, death

Time-to-Event CE

Notation:

• $D$: survival time

• $N(t)$: number of (recurrent) non-fatal events, e.g., CV hospitalization, tumor progression, by time $t$

• Note that $N(\cdot)$ doesn’t jump after $D$ (cf. ICH-E9 (R1) about the distinction between missing data and intercurrent event)

• $\mathcal{Y}(t) = \{I(D \le u), N(u): 0 \le u \le t\}$: composite outcome data accumulated up to $t$

• $\mathcal{Y} \equiv \mathcal{Y}(\infty)$: full, uncensored outcome data

Time-to-First-Event

Observed data $\mathcal{Y}(X)$, $X = \min(D, C) = D \wedge C$

• $C$: censoring time

The prevailing approach to analyzing the composite time-to-event outcome $\mathcal{Y}(\cdot)$ is to focus on the time to the first event (TFE)

• If $T_1, T_2, \dots$ are the non-fatal event times, the TFE is $\min(D, T_1)$

• Kaplan-Meier curve; log-rank test; Cox proportional hazards (PH) model on the univariate (censored) TFE

Time-to-First-Event – Limitation

Limitation:

• Unequal importance between components ignored

• Information beyond the first event discarded

Solution:

• Component-wise (cause-specific) weighting (Rauch et al., 2018b, Stats in Med)

• A recurrent-event perspective (Mao and Lin, 2016, Biostatistics)

Time-to-First-Event – Weighting

Cause-specific weighting

• Give fatal and non-fatal events different (pre-specified) weights to reflect their unequal importance

• Say, 1 if the first event is death and 0.5 if the first event is hospitalization

Time-to-First-Event – Sample size

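Sample size formula

$$n = \frac{4 (z_{1-\alpha/2} + z_{\gamma})^2}{p_E (\log \theta)^2}$$

• $n$: total sample size required

• $\alpha$: type I error

• $\gamma$: desired power

• $p_E$: expected (first) event rate per patient; can be estimated under parametric models for $D$, $T_1$, and $C$

• $\theta$: hypothesized hazard ratio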
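The calculation can be sketched as below, assuming the formula takes the familiar Schoenfeld form for 1:1 allocation (required number of first events divided by the per-patient event rate). The function name `tfe_sample_size` and the inputs in the usage line are hypothetical.

```python
from math import ceil, log
from statistics import NormalDist


def tfe_sample_size(hazard_ratio, event_rate, alpha=0.05, power=0.9):
    """Total sample size for a two-arm TFE comparison with 1:1 allocation."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    # Required number of first events, inflated by the per-patient event rate
    n = 4 * (z_alpha + z_power) ** 2 / (event_rate * log(hazard_ratio) ** 2)
    return ceil(n)


# Hypothetical design inputs, for illustration only
n_total = tfe_sample_size(hazard_ratio=0.8, event_rate=0.5)
```

Because $\log \theta$ enters squared, a hazard ratio closer to 1 inflates the required sample size quickly.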

Time-to-First-Event

Heart Failure: A Controlled Trial Investigating Outcomes of Exercise Training (HF-ACTION)

• A randomized controlled clinical trial to evaluate the efficacy and safety of exercise training among HF patients (O’Connor et al., 2009, JAMA).

• Composite endpoint
  • All-cause mortality
  • All-cause hospitalization

Time-to-First-Event

Consider a mock dataset in HF-ACTION.txt with 461 patients in exercise training and 502 in usual care (control)

• id: unique patient identifier

• time: event time in months

• status: 0 = censoring, 1 = death, 2 = hospitalization

• Training: 0 = usual care; 1 = exercise training

• HF.etiology: 0 = non-ischemic, 1 = ischemic
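To run a TFE analysis on data in this long format, each patient's records must first be collapsed to the earliest one. The sketch below assumes rows of `(id, time, status)` with the status coding listed above; the function name `first_events` and the example rows are hypothetical.

```python
def first_events(rows):
    """Collapse long-format rows (id, time, status) to time-to-first-event.

    Returns {id: (time, event)}, where event = 1 if the earliest record is a
    death or hospitalization and 0 if it is a censoring record.
    """
    first = {}
    for pid, time, status in rows:
        event = int(status != 0)
        # Keep the earliest record; at equal times prefer an event to censoring
        if pid not in first or (time, -event) < (first[pid][0], -first[pid][1]):
            first[pid] = (time, event)
    return first


# Hypothetical records: patient 1 is hospitalized twice and then dies
rows = [(1, 3.0, 2), (1, 7.5, 2), (1, 12.0, 1), (2, 10.0, 0), (3, 4.0, 1)]
tfe = first_events(rows)
```

The resulting `(time, event)` pairs are what a Kaplan-Meier, log-rank, or Cox routine would take as input; note that patient 1's later hospitalization and death are discarded.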

Time-to-First-Event

TFE analysis

TFE: HR = 0.504 ($p < 0.001$); Death: HR = 0.791 ($p = 0.303$)

Beyond the First Event

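To make use of the outcome data beyond the first event, consider the weighted counting process

$$N^{w}(t) = \sum_{k=1}^{K} w_k N_k(t),$$

where $N_k(t)$ counts the type-$k$ events (death or a type of non-fatal event) by time $t$ and $w_k$ is the corresponding pre-specified weight

Proportional weighted mean model (Mao and Lin, 2016, Biostatistics)

$$E\{N^{w}(t) \mid Z\} = \mu_0(t) \exp(\beta' Z)$$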


Software: R function

CompoML(id,time,status,Z,w)

Input:

• id: unique patient identifier

• time: event time

• status: 0 = censoring, 1 = death, 2, 3, ... = different types of recurrent event

• Z: covariate matrix

• w: weight vector for event types 1, 2, …, $K$

Output:

• beta: estimated parameters (log mean ratios)

• var: estimated covariance matrix of beta

(Details in https://sites.google.com/view/lmaowisc/software/compoml)


Consider the HF-ACTION data


Adjusting for etiology, the training arm has 40.8% fewer weighted composite events than usual care ( 1.4 10 )



Plot the estimated CE frequency functions $\hat{\mu}(t \mid z)$, by treatment arm and etiology

• Use the generic function plot on obj, the object returned by CompoML, and the desired covariate value z

plot(obj,z,…)

Chapter 3.

The Two-Sample Win Ratio

The Win Ratio – Rationale

Rationale: to prioritize certain components (e.g., death) of the CE without arbitrary weighting

Approach (Pocock et al., 2012, European Heart Journal):

Pairwise comparison

Hierarchical (Sequential) comparison (e.g., first death, then non-fatal events)


The Win Ratio – Approach

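Observed data

• Treatment: $\mathcal{Y}_{1i}(X_{1i})$, $i = 1, \dots, n_1$

• Control: $\mathcal{Y}_{0j}(X_{0j})$, $j = 1, \dots, n_0$

• Recall $\mathcal{Y}(t) = \{I(D \le u), N(u): 0 \le u \le t\}$ and $X = \min(D, C)$

Cartesian product of pairs

$$\{(\mathcal{Y}_{1i}, \mathcal{Y}_{0j}): i = 1, \dots, n_1;\ j = 1, \dots, n_0\}$$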


Sequential comparison

• Within each pair, determine a winner, a loser, or a tie

• First compare on death

• If the order of death cannot be determined due to censoring, then compare on the non-fatal event

• Death is thus prioritized (without quantitative weighting)
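The sequential rule can be sketched for a single pair as follows. This is an illustrative reading of the hierarchy, not code from any package: each patient is summarized by follow-up time, a death flag, and the first non-fatal event time, the non-fatal comparison is restricted to the pair's shared follow-up, and the names `Patient`, `compare`, and `win_loss_counts` are hypothetical.

```python
from typing import NamedTuple, Optional


class Patient(NamedTuple):
    x: float            # follow-up time: min(death time, censoring time)
    died: bool          # True if follow-up ended with death
    t: Optional[float]  # first non-fatal event time, None if none observed


def compare(a: Patient, b: Patient) -> int:
    """Return +1 if a wins over b, -1 if a loses, 0 if the pair is tied."""
    # Step 1: death. The patient known to have died first loses.
    if b.died and b.x < a.x:
        return 1
    if a.died and a.x < b.x:
        return -1
    # Step 2: first non-fatal event within the shared follow-up window
    tau = min(a.x, b.x)
    ta = a.t if a.t is not None and a.t <= tau else float("inf")
    tb = b.t if b.t is not None and b.t <= tau else float("inf")
    if tb < ta:
        return 1
    if ta < tb:
        return -1
    return 0


def win_loss_counts(treatment, control):
    """Numbers of wins and losses for treatment over all cross-arm pairs."""
    results = [compare(a, b) for a in treatment for b in control]
    return results.count(1), results.count(-1)
```

For example, a patient hospitalized late who dies at month 5 loses to a patient hospitalized early who is still alive at month 8, whereas a TFE analysis would rank them the other way.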



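Unlike the TFE analysis, the win ratio does not treat A and B as if they were the same (B wins on death)

[Figure: event timelines of patients A and B]

Write

$W_{ij} = I$(patient $i$ in treatment wins over patient $j$ in control)

$L_{ij} = I$(patient $i$ in treatment loses to patient $j$ in control)

The fractions of “wins” and “losses”

$$\hat{w} = (n_1 n_0)^{-1} \sum_{i=1}^{n_1} \sum_{j=1}^{n_0} W_{ij}, \qquad \hat{l} = (n_1 n_0)^{-1} \sum_{i=1}^{n_1} \sum_{j=1}^{n_0} L_{ij}$$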


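The win ratio (WR) statistic

$$\mathrm{WR}_{n_1,n_0} = \frac{\sum_{i=1}^{n_1} \sum_{j=1}^{n_0} W_{ij}}{\sum_{i=1}^{n_1} \sum_{j=1}^{n_0} L_{ij}}$$

Estimand: pr(treatment better than control) / pr(control better than treatment)

Interpretation: how many times as likely a patient in the treatment arm is to fare “better” than a patient in the control arm as to fare worse

Alternatively, one calculates the proportion in favor (PIF) of treatment, or net benefit (NB)

$$\mathrm{PIF} = (n_1 n_0)^{-1} \sum_{i=1}^{n_1} \sum_{j=1}^{n_0} (W_{ij} - L_{ij})$$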

The Win Ratio – General concept

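For a general outcome $\mathcal{Y}$, if $\mathcal{Y}_i \succ \mathcal{Y}_j$ means that subject $i$ wins over subject $j$, then one can always calculate the WR and PIF based on the win-loss fractions (Bebu and Lachin, 2015, Biostatistics):

$$w = \mathrm{pr}(\mathcal{Y}_1 \succ \mathcal{Y}_0), \qquad l = \mathrm{pr}(\mathcal{Y}_0 \succ \mathcal{Y}_1)$$

Then, $\mathrm{WR} = w / l$; $\mathrm{PIF} = w - l$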

The Win Ratio – General concept

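In case of binary $Y \in \{1, 0\}$, and with $\succ$ replaced by $>$,

$$w = \mathrm{pr}(Y_1 > Y_0) = \mathrm{pr}(Y_1 = 1)\{1 - \mathrm{pr}(Y_0 = 1)\}$$

$$l = \mathrm{pr}(Y_0 > Y_1) = \mathrm{pr}(Y_0 = 1)\{1 - \mathrm{pr}(Y_1 = 1)\}$$

So,

$$\mathrm{WR} = \frac{\mathrm{pr}(Y_1 = 1) / \{1 - \mathrm{pr}(Y_1 = 1)\}}{\mathrm{pr}(Y_0 = 1) / \{1 - \mathrm{pr}(Y_0 = 1)\}}$$

$$\mathrm{PIF} = \mathrm{pr}(Y_1 = 1)\{1 - \mathrm{pr}(Y_0 = 1)\} - \mathrm{pr}(Y_0 = 1)\{1 - \mathrm{pr}(Y_1 = 1)\} = \mathrm{pr}(Y_1 = 1) - \mathrm{pr}(Y_0 = 1)$$

General concept        Binary outcome
Win ratio              Odds ratio
Proportion in favor    Risk difference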
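As a quick numerical check of the binary case, take the composite event rates reported for the Osteoporosis trial in Chapter 1 (0.22 vs 0.11, rounded as on the slide) and assume independent arms. The win ratio then reproduces the odds ratio and the PIF reproduces the risk difference:

```latex
\begin{align*}
w &= 0.22 \times (1 - 0.11) = 0.1958, \\
l &= 0.11 \times (1 - 0.22) = 0.0858, \\
\mathrm{WR} &= \frac{w}{l} = \frac{0.1958}{0.0858} \approx 2.28
  = \frac{0.22 / 0.78}{0.11 / 0.89}, \\
\mathrm{PIF} &= w - l = 0.1958 - 0.0858 = 0.11 = 0.22 - 0.11 .
\end{align*}
```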

The Win Ratio – Reception

The win ratio is gaining traction among clinicians…


The Win Ratio – Hypothesis testing

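To test the null hypothesis that $\mathcal{Y}_1$ and $\mathcal{Y}_0$ are identically distributed, use the test statistic

$$S_{n_1,n_0} = \frac{\log \mathrm{WR}_{n_1,n_0}}{\hat{\sigma}_{n_1,n_0}} \sim N(0, 1) \quad \text{under the null}$$

$\hat{\sigma}^2_{n_1,n_0}$ is a variance estimator for $\log \mathrm{WR}_{n_1,n_0}$, using $U$-statistic theory combined with the delta method (Appendix A)

Difference with TFE analysis

• The WR test is on the joint distribution of $(D, T_1)$, rather than the distribution of $\min(D, T_1)$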

The Win Ratio – Hypothesis testing

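What specifically is the WR testing?

Let $H_a(s, t) = \mathrm{pr}(D_a > s, T_a > t)$, $a = 1, 0$, where $T_a$ denotes the first non-fatal event time in arm $a$

The WR tests whether the bivariate event times are jointly stochastically greater in the treatment than in the control (Mao, 2019, Biometrics), i.e., whether the treatment tends to delay death and the (first) non-fatal event jointly

That is, if

$$H_1(s, t) \ge H_0(s, t) \quad \text{for all } (s, t),$$

with strict inequality for some $(s, t)$, then the WR test statistic $S_{n_1,n_0}$ satisfies $\mathrm{pr}(S_{n_1,n_0} > z_{1-\alpha/2}) \to 1$, as $n_1, n_0 \to \infty$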

The Win Ratio – Estimand

Testing aside, consider the estimand of the WR statistic $\mathrm{WR}_{n_1,n_0}$

In general, it depends on the censoring distribution as well as that of the $\mathcal{Y}_a$ (Luo et al., 2015, Biometrics; Bebu and Lachin, 2015, Biostatistics; Dong et al., 2019, Statistics in Biopharmaceutical Research)

An undesirable property according to ICH-E9 (R1)

“…missing data and loss-to-follow-up are irrelevant to the construction of estimands.”

Two solutions: a local one and a global one


Local estimand: the curtailed WR, $\mathrm{WR}(\cdot)$ (Oakes, 2016, Biometrika)

• The WR calculated on two populations in which all subjects are followed up to the same time, say $\tau$

• e.g., $\tau = 5$ years → $\mathrm{WR}(5)$: a five-year WR

• Oakes (2016) expresses $\mathrm{WR}(\cdot)$ as an explicit functional of the arm-specific joint survival functions of death and the first non-fatal event

• Finkelstein & Schoenfeld (2019) estimate $\mathrm{WR}(\tau)$ as a function of the follow-up time $\tau$ using model-based estimates for these joint survival functions

• Oakes (2019) presented a nonparametric approach to estimation of $\mathrm{WR}(\tau)$ at JSM 2019


Global estimand

• Impose a certain global constraint on the WRs curtailed at different follow-up times

• A unified estimand for overall treatment effect without reference to the specific follow-up time

• A previous example: proportional hazards assumption → a unified hazard ratio

• This is the approach taken in the WR regression models (Ch. 4)

Weighted WR

Weighting (Luo et al., 2017, Stats in Med; Qiu et al., 2017, J Med Stats & Info)

$W_{D,ij} = I$($i$ in treatment wins over $j$ in control on $D$)

$W_{T,ij} = I$($i$ in treatment wins over $j$ in control on $T$, when indeterminate on death)

The component-specific loss indicators $L_{D,ij}$ and $L_{T,ij}$ are defined similarly

By previous notation,

$$W_{ij} = W_{D,ij} + W_{T,ij}, \qquad L_{ij} = L_{D,ij} + L_{T,ij}$$

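Weighted WR

Weighted fractions of wins and losses with component-specific weight functions $\omega_D(\cdot)$ and $\omega_T(\cdot, \cdot)$

$$\hat{w}^{\omega} = (n_1 n_0)^{-1} \sum_{i=1}^{n_1} \sum_{j=1}^{n_0} \left\{ \omega_D(\cdot)\, W_{D,ij} + \omega_T(\cdot, \cdot)\, W_{T,ij} \right\},$$

with the weight functions evaluated at the observed event times of the pair $(i, j)$

The weighted loss fraction $\hat{l}^{\omega}$ is similarly defined by replacing $W_{D,ij}$ and $W_{T,ij}$ with $L_{D,ij}$ and $L_{T,ij}$, respectively

Weighted WR

Choice of weight (Luo et al., 2017):

For $\omega_D(\cdot)$:

1. Gehan weight: constant

2. Log-rank: reciprocal of the at-risk probability, $\omega_D(t) = \mathrm{pr}(X \ge t)^{-1}$

For $\omega_T(\cdot, \cdot)$:

1. Gehan weight: constant

2. Log-rank: reciprocal of the joint at-risk probability, $\omega_T(s, t) = \mathrm{pr}(X \ge s,\, T \wedge X \ge t)^{-1}$
⋮

The log-rank type weights may be more efficient against proportional-hazards alternatives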

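The WWR R-Package

Software: R package WWR

Unweighted WR

winratio(y1,y2,d1,d2,z)

Weighted WR

wwratio(y1,y2,d1,d2,z,wty1,wty2)

• y1 $= T \wedge X$ (time to the first event or censoring); d1: event indicator for y1

• y2 $= X$ (time to death or censoring); d2: event indicator for y2

• z: 1 = treatment; 0 = control

• wty1: options for $\omega_T(\cdot, \cdot)$: 1 = Gehan; 2 = log-rank; ⋯

• wty2: options for $\omega_D(\cdot)$: 1 = Gehan; 2 = log-rank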

The WWR R-Package

Consider the HF-ACTION data analyzed in Chapter 2

First, re-organize the data into the (wide) format suitable for the WWR functions

The WWR R-Package

Recall: Exercise training ($n_1 = 461$) vs Usual care ($n_0 = 502$)

Unweighted WR analysis

• Component-specific wins among determinate (untied) pairs: 10,316 on death and 38,206 on hospitalization, i.e., 48,522 wins in total, against 25,829 losses

The WWR R-Package

Unweighted WR analysis

• Patients in exercise training are 1.88 times as likely to have a better outcome than those in usual care

• Normalize by $461 \times 502$ to obtain the PIF (NB) of treatment: 0.098
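The two summary numbers can be reproduced directly from the pair counts. The sketch below assumes the figures on the slide are read as 48,522 total wins and 25,829 total losses for the training arm; the variable names are illustrative.

```python
# Pair counts as read from the slide (an assumption about the garbled output)
wins, losses = 48_522, 25_829
n_treat, n_control = 461, 502

# Win ratio: wins over losses among determinate pairs
win_ratio = wins / losses

# Proportion in favor (net benefit): net wins over all n1 * n0 pairs
pif = (wins - losses) / (n_treat * n_control)

print(f"WR = {win_ratio:.2f}, PIF = {pif:.3f}")
```

Ties enter the PIF through its denominator but drop out of the win ratio, which is why the two measures can tell different stories when many pairs are indeterminate.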


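The WWR R-Package

Weighted WR analysis

Stratified WR

Stratification (Dong et al., 2018, J Biopharm Stats)

• Rationale: Gain efficiency by comparing patients that are similar except for treatment assignment

• Treatment ($n_{1k}$) vs Control ($n_{0k}$) in the $k$th stratum, $k = 1, \dots, K$

• Example: HF etiology in the HF-ACTION study

• $W_{ij}^{(k)}, L_{ij}^{(k)}$: win and loss indicators comparing the $i$th subject in treatment to the $j$th subject in control in the $k$th stratum

$$\hat{w}_k = (n_{1k} n_{0k})^{-1} \sum_{i=1}^{n_{1k}} \sum_{j=1}^{n_{0k}} W_{ij}^{(k)} \quad \text{(win fraction in the $k$th stratum)}$$

$$\hat{l}_k = (n_{1k} n_{0k})^{-1} \sum_{i=1}^{n_{1k}} \sum_{j=1}^{n_{0k}} L_{ij}^{(k)} \quad \text{(loss fraction in the $k$th stratum)}$$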

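Stratified WR

Stratified WR statistic

$$\mathrm{SWR} = \frac{\sum_{k=1}^{K} c_k \hat{w}_k}{\sum_{k=1}^{K} c_k \hat{l}_k}$$

where $c_k$ is a stratum-specific weight, e.g., $c_k = n_k / \sum_{l=1}^{K} n_l$ with $n_k = n_{1k} + n_{0k}$

• Variance of SWR can be estimated based on those for stratum-specific win/loss fractions similarly to Appendix A

• Software for stratified WR will be built into the WR R-package (Ch. 4)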

Recurrent-Event WR

Recurrent event WR (Finkelstein and Schoenfeld, 1999, Stats in Med)

• Sequential comparison: first on death, then on the frequency of non-fatal events accumulated up to the earlier of the follow-up times of the two patients

• Make use of the patient’s full experience

• May gain statistical efficiency

• Software for recurrent-event WR will be built into the WR R-package (Ch. 4)

Chapter 4.

Regression Methods for the Win Ratio

From Testing to Regression

One-sample
  Univariate or TFE: Kaplan-Meier curve
  Win ratio for (prioritized) CE: Curtailed win-ratio process (Oakes, 2016; Finkelstein & Schoenfeld, 2019; Oakes, 2019)

Two-/Multi-sample
  Univariate or TFE: Weighted/stratified log-rank tests
  Win ratio for (prioritized) CE: Weighted/stratified WR tests (Luo et al., 2017; Qiu et al., 2017; Dong et al., 2018)

Regression
  Univariate or TFE: Cox proportional hazards model
  Win ratio for (prioritized) CE: ?

From Testing to Regression

Example: factors that could be associated with CV endpoints

• Treatment

• Demographic (age, race, gender, etc.)

• Medical history (diabetes, prior CV disease, etc.)

• Current medication (β-blocker, ACE inhibitor, etc.)

Regression models vs. two-sample comparison

• Adjustment for confounding

• Efficiency gain

• Screening of important prognostic factors

From Testing to Regression

Goals:

• Formulation of a model whose regression parameter is not influenced by censoring
  • Estimand derived from the scientific question
  • ICH-E9 (R1): “...missing data and loss-to-follow-up are irrelevant to the construction of the estimand”

• Inference procedures

• Checking model assumptions

General Regression Framework

Recall notation:

• $D$: survival time

• $N(t)$: number of (recurrent) non-fatal events, e.g., CV hospitalization, tumor progression, by time $t$

• $\mathcal{Y}(t) = \{I(D \le u), N(u): 0 \le u \le t\}$: composite outcome data accumulated up to $t$

• $\mathcal{Y} \equiv \mathcal{Y}(\infty)$: full, uncensored outcome data (this will be the target of the regression model)

General Regression Framework

New notation:

• $Z$: a $p$-vector of baseline covariates (e.g., treatment, demographic and clinical variables)

Introduce notation for a generalized win indicator function comparing two patients followed up to the same time

• Given two patients $i$ and $j$, write

$$\mathcal{W}(\mathcal{Y}_i, \mathcal{Y}_j)(t) = I(\text{subject } i \text{ wins over subject } j \text{ by time } t)$$

• $\mathcal{W}(\mathcal{Y}_i, \mathcal{Y}_j)(t)$ is a function only of the outcome data accumulated up to $t$

General Regression Framework

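Examples of win indicator function $\mathcal{W}(\mathcal{Y}_i, \mathcal{Y}_j)(t)$ (with $T_i$ denoting subject $i$’s first non-fatal event time)

• The win indicator used in Pocock’s win ratio (Pocock et al., 2012), as if both patients were followed up to time $t$

$$\mathcal{W}_P(\mathcal{Y}_i, \mathcal{Y}_j)(t) = I(D_j < D_i \wedge t) + I(D_i \wedge D_j > t,\ T_j < T_i \wedge t)$$

• A win indicator based on the order of the TFE, $\tilde{T} = D \wedge T$

$$\mathcal{W}_{TFE}(\mathcal{Y}_i, \mathcal{Y}_j)(t) = I(\tilde{T}_j < \tilde{T}_i \wedge t)$$

• A recurrent-event (Finkelstein-Schoenfeld type) win indicator

$$\mathcal{W}_R(\mathcal{Y}_i, \mathcal{Y}_j)(t) = I(D_j < D_i \wedge t) + I(D_i \wedge D_j > t,\ N_j(t) > N_i(t))$$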

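General Regression Framework

The analyst is allowed to choose from a wide variety of win indicator functions, provided the following basic conditions are satisfied

(W1) $\mathcal{W}(\mathcal{Y}_i, \mathcal{Y}_j)(t)$ is only a function of $\mathcal{Y}_i(t)$ and $\mathcal{Y}_j(t)$

(W2) $\mathcal{W}(\mathcal{Y}_i, \mathcal{Y}_j)(t) + \mathcal{W}(\mathcal{Y}_j, \mathcal{Y}_i)(t) \le 1$

(W3) $\mathcal{W}(\mathcal{Y}_i, \mathcal{Y}_j)(t) = \mathcal{W}(\mathcal{Y}_i, \mathcal{Y}_j)(D_i \wedge D_j \wedge t)$

(W1): progressive comparison; (W2): no “win-win”; (W3): terminal event does not pose a (semi-)competing risks issue

The Pocock, TFE, and recurrent-event win indicators all satisfy (W1)-(W3)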

General Regression Framework

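Under $\mathcal{W}$, compare patients with covariate $z_i$ to those with covariate $z_j$ by

Win fraction: $\mathrm{pr}\{\mathcal{W}(\mathcal{Y}_i, \mathcal{Y}_j)(t) = 1 \mid Z_i = z_i, Z_j = z_j\}$

Loss fraction: $\mathrm{pr}\{\mathcal{W}(\mathcal{Y}_j, \mathcal{Y}_i)(t) = 1 \mid Z_i = z_i, Z_j = z_j\}$

The covariate-specific curtailed win-ratio (process)

$$\mathrm{WR}(t \mid z_i, z_j; \mathcal{W}) := \frac{\mathrm{pr}\{\mathcal{W}(\mathcal{Y}_i, \mathcal{Y}_j)(t) = 1 \mid Z_i = z_i, Z_j = z_j\}}{\mathrm{pr}\{\mathcal{W}(\mathcal{Y}_j, \mathcal{Y}_i)(t) = 1 \mid Z_i = z_i, Z_j = z_j\}}$$

By time $t$, patients with covariate $z_i$ are $\mathrm{WR}(t \mid z_i, z_j; \mathcal{W})$ times as likely to have a better composite outcome than those with covariate $z_j$ as vice versa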

The PW Models

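The Proportional Win-fractions (PW) model (Mao and Wang, 2019+, under review in JASA)

$$\mathrm{WR}(t \mid z_i, z_j; \mathcal{W}) = \exp\{\beta'(z_i - z_j)\}$$

• So named because the win fractions are proportional over time (the right-hand side is not a function of the follow-up time $t$)

• Interpretation of $\beta$: log win ratios associated with unit increases in the corresponding components of $Z$ (regardless of follow-up time)

• Note that the model is specific to the win function $\mathcal{W}$ chosen
  • Denote the PW model under $\mathcal{W}$ by $\mathrm{PW}_{\mathcal{W}}$
  • Models $\mathrm{PW}_{P}$, $\mathrm{PW}_{TFE}$, and $\mathrm{PW}_{R}$ (Pocock, TFE, and recurrent-event win functions)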

Relationship with Other Methods

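Results:

• $\mathrm{PW}_{TFE}$ (the PW model under the TFE-based win function) is equivalent to the Cox PH model on TFE, with win ratios equal to the inverse of hazard ratios (see Appendix B for a simple proof)

• $\exp(\beta)$ in $\mathrm{PW}_{P}$ (the PW model under Pocock’s win function) with a binary covariate is the estimand of Pocock’s WR (so $\mathrm{PW}_{P}$ is indeed a regression extension of the two-sample WR in Ch. 3)

• The following joint model for $(D, T)$ implies $\mathrm{PW}_{P}$

$$\mathrm{pr}(D > s, T > t \mid Z) = H_0(s, t)^{\exp(-\beta' Z)}$$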

Relationship with Other Methods

To distinguish between $\mathrm{PW}_{TFE}$ and $\mathrm{PW}_{P}$, call

• $\mathrm{PW}_{TFE}$: priority-unadjusted PW model (Cox model on TFE)

• $\mathrm{PW}_{P}$: priority-adjusted PW model (regression model for Pocock’s two-sample WR)

The regression parameters from the two models can be called priority-unadjusted and priority-adjusted (log) win ratios, respectively

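Estimation Procedure

Recall notation for observed (censored) data $\mathcal{Y}(X)$, $X = D \wedge C$

• $C$: censoring time

• Here we assume $C \perp\!\!\!\perp \mathcal{Y} \mid Z$

The observed win indicators

$$\delta_{ij}(t) := \mathcal{W}(\mathcal{Y}_i, \mathcal{Y}_j)(X_i \wedge X_j \wedge t)$$

and the “determinacy” indicators

$$R_{ij}(t) := \delta_{ij}(t) + \delta_{ji}(t)$$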

Estimation Procedure

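Under (W1)-(W3) for $\mathcal{W}$ and the model assumption for the PW model, it can be shown that

$$E\{M_{ij}(t \mid Z_i, Z_j; \beta) \mid Z_i, Z_j\} = 0 \quad \text{for all } t$$

• $M_{ij}(t \mid Z_i, Z_j; \beta) = \delta_{ij}(t) - R_{ij}(t)\, \mu(Z_i, Z_j; \beta)$ is a “residual process”, with $\delta_{ij}(t)$ and $R_{ij}(t)$ the observed win and determinacy indicators

• Reminiscent of the usual counting process martingale

• $\mu(Z_i, Z_j; \beta) = \dfrac{\exp\{\beta'(Z_i - Z_j)\}}{1 + \exp\{\beta'(Z_i - Z_j)\}} = \mathrm{pr}(i \text{ wins} \mid i \text{ and } j \text{ are not tied}; Z_i, Z_j)$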

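Estimation Procedure

Given a random sample of size $n$, use the following weighted $U$-estimating equations for $\beta$

$$\binom{n}{2}^{-1} \sum_{i<j} \int_0^\infty h(t; Z_i, Z_j; \beta)\,(Z_i - Z_j)\, \mathrm{d}M_{ij}(t \mid Z_i, Z_j; \beta) = 0.$$

• $h(t; Z_i, Z_j; \beta)$: an arbitrary symmetric weight function
  • The choice affects the efficiency of $\hat{\beta}$, but not the interpretation of its estimand
  • Common choice: $h(t; Z_i, Z_j; \beta) \equiv 1$

• $\hat{\beta}$ obtained by the Newton-Raphson algorithm; variance estimated using $U$-statistic theory

• $\exp(\hat{\beta})$ with a binary $Z$ corresponds to Pocock’s two-sample WR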
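The last point can be checked numerically. The sketch below is a simplified illustration rather than the pwreg implementation: it assumes unit weight, a single scalar covariate, and win/loss indicators already evaluated at the end of follow-up, so the estimating equation reduces to a logistic-type score over pairs. The function name `pw_fit_scalar` and the pair data are hypothetical.

```python
from math import exp, log


def pw_fit_scalar(pairs, n_iter=50, tol=1e-12):
    """Newton-Raphson for a scalar PW coefficient with unit weight.

    pairs: iterable of (z_i - z_j, delta_ij, delta_ji), with the win
    indicators taken at the end of follow-up.
    """
    beta = 0.0
    for _ in range(n_iter):
        score, info = 0.0, 0.0
        for dz, win, loss in pairs:
            r = win + loss                      # determinacy indicator R_ij
            mu = 1.0 / (1.0 + exp(-beta * dz))  # pr(i wins | pair not tied)
            score += dz * (win - r * mu)
            info += dz * dz * r * mu * (1.0 - mu)
        if info == 0.0:
            break  # no determinate pairs with differing covariates
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    return beta


# Hypothetical treatment-vs-control pairs: 30 wins, 10 losses, 5 ties
pairs = [(1, 1, 0)] * 30 + [(1, 0, 1)] * 10 + [(1, 0, 0)] * 5
beta_hat = pw_fit_scalar(pairs)
two_sample_log_wr = log(30 / 10)
```

With a binary covariate the solution is the log of wins over losses, so `exp(beta_hat)` coincides with the two-sample win ratio; tied pairs contribute nothing to either the score or the information.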

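Checking Proportionality

Under the proportionality assumption, the following score process has mean zero for all $t$

$$\int_0^t h(s; Z_i, Z_j; \beta)\,(Z_i - Z_j)\, \mathrm{d}M_{ij}(s \mid Z_i, Z_j; \beta)$$

Plot a standardized version of

$$\hat{U}_n(t) = \binom{n}{2}^{-1} \sum_{i<j} \int_0^t h(s; Z_i, Z_j; \hat{\beta})\,(Z_i - Z_j)\, \mathrm{d}M_{ij}(s \mid Z_i, Z_j; \hat{\beta})$$

Under the model assumption, $\hat{U}_n(t)$ fluctuates around zero and exhibits no pattern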

Checking Proportionality

Rule of thumb for standardized score processes under proportionality

Patternless over time with supremum bounded by 2

The WR R-Package

A tutorial can be found in https://biostat.wisc.edu/%7Elmao/software/WR/vignettes/WR.html


The WR R-Package

The main function for the priority-adjusted PW model

pwreg(time,status,Z,ID)

Input

• time: event time

• status: 0 = censoring, 1 = death, 2 = non-fatal event

• Z: covariate matrix

• ID: unique patient identifier

Output: an object of class “pwreg” containing

• beta: $\hat{\beta}$

• Var: estimated variance matrix for $\hat{\beta}$

The WR R-Package

Calculate the score processes for the fitted PW model

score.proc(obj)

• obj: an object returned by the pwreg function

Output: an object of class pwreg.score containing

• t: an $m$-vector of times

• score: a $p$-by-$m$ matrix whose $k$th row is the standardized score process for the $k$th covariate as a function of $t$

Plot the standardized score process for the $k$th covariate

plot.pwreg.score(obj,k,…)

The WR R-Package

Consider the HF-ACTION data analyzed in Chapters 2 & 3

Data already in the desired (long) format for pwreg

The WR R-Package

Run the pwreg function

$n(n - 1)/2$: all possible pairs within the pooled sample of size $n = 963$

The WR R-Package

Adjusting for etiology, patients in exercise training are $\exp(\hat{\beta}) = 1.87$ times as likely to have a better outcome than those in usual care

The WR R-Package


Win Ratio Methodology

One-sample
  Univariate or TFE: Kaplan-Meier curve
  Win ratio for (prioritized) CE: Curtailed win-ratio process (Oakes, 2016; Finkelstein & Schoenfeld, 2019; Oakes, 2019+)

Two-/Multi-sample
  Univariate or TFE: Weighted/stratified log-rank tests
  Win ratio for (prioritized) CE: Weighted/stratified WR tests (Luo et al., 2017; Qiu et al., 2017; Dong et al., 2018)

Regression
  Univariate or TFE: Cox proportional hazards model (a special case of the PW model, depending on the choice of win function)
  Win ratio for (prioritized) CE: Proportional win-fractions model (Mao & Wang, 2019+)

Open Problems for WR

• Formal goodness-of-fit tests
  • Supremum (Kolmogorov-Smirnov) tests based on the standardized score processes (Lin et al., 2003)

• In case of non-proportionality
  • Time-dependent covariate
  • Stratification

• Choice of the weight function $h(t; Z_i, Z_j; \beta)$ for statistical efficiency

• Sample size calculation
  • Difficulty in obtaining analytic variance formula

• Group sequential methods

Chapter 5.

Case Study & Discussion

HF-ACTION Study

HF-ACTION: a randomized controlled clinical trial among heart failure (HF) patients.

• A total of 2331 medically stable outpatients with HF and reduced ejection fraction were recruited from April 2003 to February 2007 at 82 centers in the USA, Canada, and France.

• Randomized to usual care alone or usual care plus aerobic exercise training consisting of 36 supervised sessions.

• Primary (composite) endpoint: all-cause death and all-cause hospitalization.

HF-ACTION Study

Consider a subset of the study data consisting of 451 non-ischemic patients

HF-ACTION Study – Two sample

Training vs Usual:

• Cox model on TFE: HR = 0.804 (95% CI: 0.643, 1.006)

• Win ratio by WWR: WR = 1.235 (95% CI: 0.974, 1.565)

• Win ratio by WR in PW regression with treatment as the only covariate: WR = 1.233 (95% CI: 0.972, 1.565)

The two WR methods are theoretically equivalent and produced nearly identical results (up to negligible numerical differences)
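The reported interval can be turned back into a Wald test on the log scale, which is the form of the WR test statistic from Chapter 3. The sketch below assumes the confidence interval is symmetric on the log scale with a normal reference distribution; the function name `wald_from_ci` is hypothetical.

```python
from math import log
from statistics import NormalDist


def wald_from_ci(estimate, lower, upper, level=0.95):
    """Back out the log-scale standard error, z-statistic, and p-value."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    se = (log(upper) - log(lower)) / (2 * z_crit)
    z = log(estimate) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return se, z, p_value


# Win ratio by WWR as reported above: 1.235 (95% CI: 0.974, 1.565)
se, z, p = wald_from_ci(1.235, 0.974, 1.565)
```

The resulting two-sided p-value exceeds 0.05, consistent with the interval covering 1.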


HF-ACTION Study – Regression

Cox PH on TFE: HR on TFE

Priority-unadjusted WR

HF-ACTION Study – Regression

(Unweighted) Proportional mean model for recurrent CE

HF-ACTION Study – PW Regression

PW

Priority-adjusted WR


A 2-df test on the effect of race (“Black vs White”, “Other vs White”)



Plot the standardized score processes to check the proportionality assumptions

Summary

Three classes of methodology for time-to-event CE

TFE
  Pros: Simple and intuitive
  Cons: Indiscriminate treatment of death and non-fatal events
  Software: R package survival

(Weighted) Proportional mean model
  Pros: Makes use of patient’s full experience
  Cons: Arbitrary choice of weight
  Software: R function CompoML

Win ratio
  Pros: Prioritization of death; no arbitrary weighting; easy to interpret
  Cons: Methodology not as fully developed as in univariate survival analysis
  Software: R packages WWR, WR

Appendix A.

Variance calculation for win ratio statistics

Variance formula for the win ratio

The formula applies to arbitrarily defined win and loss indicators $W_{ij}$ and $L_{ij}$

Appendix B.

Equivalence of PW and Cox PH model on TFE
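One direction of the equivalence can be sketched in a few lines. The notation here is assumed for illustration rather than taken from the slides: $\tilde{T}_i$ is the TFE of subject $i$, the Cox model has hazard $\lambda_0(t)\exp(\gamma' Z_i)$ with baseline survival function $S_0$, $r_i = \exp(\gamma' Z_i)$, and the two subjects are independent with continuous event times.

```latex
\begin{align*}
\mathrm{pr}(\tilde T_j < \tilde T_i \wedge t \mid Z_i, Z_j)
  &= \int_0^t S_0(u)^{r_i}\, r_j \lambda_0(u)\, S_0(u)^{r_j}\, \mathrm{d}u
   = \frac{r_j}{r_i + r_j}\,\bigl\{1 - S_0(t)^{r_i + r_j}\bigr\}, \\
\mathrm{pr}(\tilde T_i < \tilde T_j \wedge t \mid Z_i, Z_j)
  &= \frac{r_i}{r_i + r_j}\,\bigl\{1 - S_0(t)^{r_i + r_j}\bigr\}, \\
\frac{\mathrm{pr}(\tilde T_j < \tilde T_i \wedge t \mid Z_i, Z_j)}
     {\mathrm{pr}(\tilde T_i < \tilde T_j \wedge t \mid Z_i, Z_j)}
  &= \frac{r_j}{r_i} = \exp\{-\gamma'(Z_i - Z_j)\}.
\end{align*}
```

The first line is the win fraction of $i$ over $j$ by time $t$ and the second is the loss fraction. Their ratio does not depend on $t$, so the PW model with the TFE-based win indicator holds with $\beta = -\gamma$, i.e., win ratios are the inverses of hazard ratios.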


References
