Download - Generalized pairwise comparisons of prioritized outcomes Marc Buyse, ScD [email protected]
• The Wilcoxon test, and generalizations• Generalized pairwise comparisons• Universal measure of treatment effect• An example• Conclusions
Outline
General Setup
Let Xi be the continuous outcome of the i th subject in T (i = 1, … , n )
REligible subjects
Control (C )Treatment (T )Let Yj be the continuous outcome ofthe j th subject in C (j = 1, … , m )
The Wilcoxon test statistic can be derived from all possible pairs of subjects, one from T and one from C.Let
Wilcoxon-Mann-Whitney test statisticW
The Mann-Whitney form of the Wilcoxon test
The Wilcoxon test can be generalized to the case of censored outcomes. Letting and denote censored observations, the pairwise comparison indicator is now
Gehan generalized the Wilcoxon test
Now let Xi and Yj be observed outcomes for ANY outcome measure (continuous, time to event, binary, categorical, …)
First, generalize the test further for a single outcome measure
Yj Xi
favors T (favorable) favors C(unfavorable) neutral uninformative
pairwisecomparison
Binary outcome measure
Pairwise comparison Pair isXi = 1, Yj = 0 favorableXi = 1, Yj = 1 or Xi = 0, Yj = 0 neutralXi = 0, Yj = 1 unfavorableXi orYj missing uninformative
Pairwise comparison Pair isXi Yj > * favorable Xi Yj ≤ * neutralXi Yj < * unfavorableXi orYj missing uninformative
* chosen to reflect clinical relevance; = 0 is Wilcoxon test
Continuous outcome measure
Time to event outcome measure
Pairwise comparison Pair isXi Yj > * or Yj > * favorable Xi Yj ≤ * neutralXi Yj < * or Xi < * unfavorableotherwise uninformative* chosen to reflect clinical relevance; = 0 is Gehan test
Let Xi and Yj be VECTORS of observed outcomes for any number of occasions of a single outcome measure, or any number of outcome measures.We assume that the occasions and/or the outcome measures can be prioritized.
Generalized pairwise comparisons
Next, generalize the test to prioritized repeated observations of a single outcome measure…
Occasion with higher priority Occasion with lower priority Pair isfavorable ignored favorableunfavorable ignored unfavorableneutral ignored neutraluninformative favorable favorableuninformative unfavorable unfavorableuninformative neutral neutraluninformative uninformative uninformative
Last, generalize the test to severalprioritized outcome measures…
Outcome with higher priority Outcome with lower priority Pair isfavorable ignored favorableunfavorable ignored unfavorableneutral ignored neutraluninformative favorable favorableuninformative unfavorable unfavorableuninformative neutral neutraluninformative uninformative uninformative
Extend the previous definition of Uij
U is the difference between the proportion of favorable pairs and the proportion of unfavorable pairs. We call this general measure of treatment effect the « proportion in favor of treatment » ().
A general measure of treatment effect
is a linear transformation of the probabilistic index, P (X > Y ) :The proportion in favor of treatment ()
Situation P (X > Y )
T uniformly worse than C 0 1T no different from C 0.5 0T uniformly better than C 1 +1
For a binary variable, is equal to the difference in proportionsFor a continuous variable , is related to the effect size dFor a time-to-event variable, is related to the hazard ratio and the proportion of informative pairs f
The proportion in favor of treatment ()
A re-randomization test for
The test statistic U (or ) no longer has known expectation and variance. An empirical distribution of can be obtained through re-randomization.Tests of significance and confidence intervals follow suit.
The proportion in favor of treatment for the l th prioritized outcome (l = 1, . . . , L ) is given by
and the cumulative proportion is
Cumulative proportions for prioritized outcomes
Early breast cancer
two combination chemotherapies plus herceptin
R3,222 patients after curative resection of HER2+ breast cancer
AdriamycinCyclophosphamideTaxotere (ACT)TaxotereCarboplatinHerceptin (TCH)
standard chemotherapy
1,0731,075
main efficacy endpoints disease recurrence or death main safety endpoint congestive heart failure
AdriamycinCyclophosphamideTaxotereHerceptin (ACTH)
1,074
87%
81%78%
75%
92%
87%84%
81%
93%
88%86%
84%
Disease-free survival
Prioritized outcomes
Priority Outcomes1 Time to death from any cause2 Time to second malignancy3 Time to distant metastases4 Time to locoregional relapse5 Time to congestive heart failure
Difference in ACTH better ACTbetter Cumulative
P-value *Time to death 4.97% 2.87% 2.09% 0.006Time to second tumor 1.20% 1.21% 2.08% 0.022Time to distant mets 7.03% 3.46% 5.66% < 0.001Time to relapse 1.82% 1.01% 6.47% < 0.001
Time to CHF 0.62% 1.83% 5.25% < 0.001
Prioritized outcomes
GENERALIZED PAIRWISE COMPARISONSACTH vs. ACT
* Unadjusted for multiplicity
Difference in TCH better ACTbetter Cumulative
P-value *Time to death 5.05% 3.49% 1.56% 0.059Time to second tumor 1.22% 0.72% 2.05% 0.029Time to distant mets 7.18% 3.96% 5.26% < 0.001Time to relapse 1.75% 1.47% 5.55% < 0.001
Time to CHF 0.63% 0.71% 5.47% < 0.001
Prioritized outcomes
GENERALIZED PAIRWISE COMPARISONSTCH vs. ACT
* Unadjusted for multiplicity
Difference in TCH better ACTHbetter Cumulative
P-value *Time to death 3.04% 3.57% -0.53% 0.46Time to second tumor 1.29% 0.74% 0.02% 0.98Time to distant mets 3.84% 4.36% -0.50% 0.68Time to relapse 1.04% 1.63% -1.09% 0.40
Time to CHF 1.97% 0.74% 0.14% 0.93
Prioritized outcomes
GENERALIZED PAIRWISE COMPARISONSTCH vs. ACTH
* Unadjusted for multiplicity
1. are equivalent to well-known non-parametric tests in simple cases2. allow testing for differences thought to be clinically relevant3. allow any number of prioritized outcomes of any type to be analyzed simultaneously4. naturally lead to a universal measure of treatment effect, , which is directly related to classical measures of treatment effect (difference in proportions, effect size or hazard ratio)
Generalized Pairwise Comparisons
References
Buyse M. Generalized pairwise comparisons for prioritized outcomes in the two-sample problem. Statistics in Medicine 29:3245-57, 2010.Buyse M. Reformulating the hazard ratio to enhance communication with clinical investigators. Clinical Trials 5:641-2, 2008.