
STATISTICS IN MEDICINE
Statist. Med. 2005; 24:1537–1546
Published online 21 February 2005 in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/sim.2058

Adjusting for observable selection bias in block randomized trials

Anastasia Ivanova1,∗,†, Robert C. Barrier Jr2 and Vance W. Berger3,4

1Department of Biostatistics, The University of North Carolina at Chapel Hill, CB 7420, Chapel Hill, NC 27599, U.S.A.

2Duke Comprehensive Cancer Center Biostatistics, Duke University Medical Center, Box 3958, Durham, NC 27710, U.S.A.

3Biometry Research Group, DCP, National Cancer Institute, 6130 Executive Boulevard, MSC 7354, Bethesda, MD 20892-7354, U.S.A.

4Department of Mathematics and Statistics, University of Maryland at Baltimore County, 1000 Hilltop Circle, Baltimore, MD 21250, U.S.A.

SUMMARY

In this paper, we propose a model-based approach to detect and adjust for observable selection bias in a randomized clinical trial with two treatments and binary outcomes. The proposed method was evaluated using simulations of a randomized block design in which the investigator favoured the experimental treatment by attempting to enrol stronger patients (with a greater probability of treatment success) if the probability of the next treatment being experimental was high, and weaker patients (with a lower probability of treatment success) if that probability was low. The method allows not only testing for the presence of observable selection bias, but also testing for a difference in treatment effects while adjusting for possible selection bias. Copyright © 2005 John Wiley & Sons, Ltd.

KEY WORDS: randomized block design; randomized clinical trials; selection bias

1. INTRODUCTION

In randomized clinical trials, selection bias occurs when the investigator, either consciously or otherwise, uses knowledge of the upcoming treatment assignment as the basis for deciding whom to enrol [1]. Selection bias may have a large impact on the outcome of randomized clinical trials, and can be difficult to prevent; moreover, little research has been conducted into methods to detect or adjust for such bias. Randomization is an effective means for reducing bias in treatment selection because it guarantees that treatment assignment is not based on the

∗Correspondence to: Anastasia Ivanova, Department of Biostatistics, The University of North Carolina at Chapel Hill, CB 7420, Chapel Hill, NC 27599-7420, U.S.A.

†E-mail: [email protected]

Received September 2002; accepted August 2004.

Copyright © 2005 John Wiley & Sons, Ltd.


patient's prognostic factors [2]. Randomization in combination with perfect compliance is the ultimate means to avoid selection bias [1].

A significant drawback to completely randomized allocation, with each allocation chosen by tossing a fair coin, is the risk of undesirable imbalance in the number of patients allocated to each arm. When the total number of subjects is small, this can negatively affect the power of the treatment comparison. Patients' prognostic factors can have a time trend during the trial; for example, 'better' patients may be treated in the beginning of the trial [3]. The influence of such time trends can be minimized by maintaining balance in the assignments throughout the course of the trial. At the other extreme is a completely non-random (systematic) design with assignment alternating between experimental (E) and control (C) treatments. The resulting assignment sequence, ECECEC... or its reverse, guarantees that imbalance will never exceed one in any category, but since each subsequent assignment is known, there is a possibility of selection bias. There are compromises between a perfectly balanced experiment and complete randomization. A complete randomized block design, for example, provides perfect balance in the final distribution, and is frequently used. Smaller blocks are used to control the imbalance over time. However, if all the previous assignments in the block are known to the investigator, then one or more assignments at the end of the block can be predicted with certainty, thereby enabling selection bias. The smaller the block size, the more this becomes a problem. Masking and concealment are intended to prevent knowledge of the allocation, but these methods are often inadequate [4].

Treatments can be unmasked by detection of differences in appearance (of the packaging or of the treatments themselves), in taste, or in the occurrence of specific events or side effects associated with their use [5]. For example, in one particular possibly unmasked trial, patients with poorer prognosis tended to be assigned to the experimental treatment (personal communication with Dr Gary G. Koch, 2003). This imbalance did not seem due to chance. Patients were assigned in blocks of size 3, 2 to the experimental treatment and 1 to control. If such a trial is unmasked, every third assignment is always known, and roughly a third of the second assignments are known, too. The failure to demonstrate the superiority of the experimental treatment may have been caused, at least in part, by selection bias.

Since one cannot safely assume that efforts to prevent selection bias have been effective, methods are needed for detection of, and possibly adjustment for, observable selection bias. The negative effect of selection bias and steps to reduce this effect were discussed, for example, in References [6–9]. Berger and Exner [5] suggested a method for detecting observable selection bias in two-arm trials that are unmasked or may have had the masking compromised. We focus not only on detecting observable selection bias but also on adjusting treatment comparisons for the presence and magnitude of any selection bias that may be found.
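The predictability counts for 2:1 blocks of size 3 can be verified by direct enumeration. The sketch below is our own illustration, not taken from the paper: an assignment is predictable with certainty exactly when every treatment remaining in the block is the same.

```python
from itertools import permutations
from fractions import Fraction

# All distinct orderings of a block containing 2 E's and 1 C, each
# equally likely under blocked randomization with an unmasked history.
blocks = sorted(set(permutations("EEC")))

def certain(block, k):
    """Assignment k (0-based) is predictable with certainty iff all
    treatments remaining in the block at that point are identical."""
    return len(set(block[k:])) == 1

# Fraction of blocks in which each position is known in advance.
known = [Fraction(sum(certain(b, k) for b in blocks), len(blocks)) for k in range(3)]
print(known)  # position 1: never; position 2: 1/3 of the time; position 3: always
```

This reproduces the counts quoted above: the third assignment is always known, and the second is known whenever the single C comes first, i.e. in a third of the blocks.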

2. ASSESSING OBSERVABLE SELECTION BIAS

The method proposed is designed to detect and adjust for observable selection bias in randomized block designs where experimental (E) and control (C) treatments are compared with respect to a binary outcome. Let the treatment indicator variable Xi, i = 1, 2, ..., n, equal 1 if the ith patient is assigned to treatment E, or 0 if the patient is assigned to C. It is assumed that the investigator knows the block size and is able to ascertain which treatments within the current block have been allocated already, thereby enabling him to determine the probability φi


Table I. Probability of favourable outcome Pr(Yi = 1 | Xi, Si).

            Strong patient       Medium patient        Weak patient
            assigned (Si = 1)    assigned (Si = 0.5)   assigned (Si = 0)
Xi = 1      p1 + η               p1                    p1 − η
Xi = 0      p2 + η               p2                    p2 − η

of the experimental treatment being assigned to the next participant enrolled. For completely randomized allocation, with each allocation chosen by tossing a fair coin, φi = 0.5; for fixed alternating assignments ECEC..., φi = 1 or 0, i = 1, 2, ..., n.

Each patient has two potential treatment outcomes, one if the patient receives E and another if he receives C. We consider a very simple model where we assume that the treatment effect is constant over levels of φ and that selection bias can be summarized by a single parameter η. Note that these assumptions cannot be tested. The last assumption implies that candidate patients can be classified on the basis of prognosis as strong, with probability of success on treatment E or C of p1 + η or p2 + η, respectively; medium, with probability of success p1 or p2; or weak, with probability of success p1 − η or p2 − η, where 0 ≤ η ≤ min(p1, p2, 1 − p1, 1 − p2). This model, though simplistic, reflects the situation where a physician is able to select patients with baseline covariates that are positively or negatively correlated with better treatment outcome. Suppose that the investigator is in favour of the experimental treatment (η > 0) and will try to enrol a stronger patient if the treatment is more likely to be experimental. More precisely, the investigator will enrol a strong patient if the probability φi of being assigned to treatment E exceeds a fixed cut-off, and a weak patient if the probability (1 − φi) of being assigned to treatment C exceeds the same cut-off. Otherwise, a medium patient is enrolled. We assume that there are enough strong, medium, and weak patients in the population so that a patient of each kind is always available if needed. The patient type indicator Si of the ith patient is defined according to the cut-off and assignment probability:

Si =  1,    if φi > cut-off
      0,    if φi < 1 − cut-off                                  (1)
      0.5,  otherwise

That is, a strong patient is assigned to receive the ith treatment according to the allocation sequence if Si = 1. A weak patient is assigned if Si = 0, and a medium patient is assigned if Si = 0.5. A cut-off of 0.5 implies that the investigator biases the assignment as often as possible. A cut-off of 0.99 implies the assignment of strong or weak patients only when the investigator is certain of the next assignment. This strategy appears to have been used by the investigators in the example described in Section 1.

Under model (1), the binomial outcome Yi takes the value 1 or 0 for success or failure; its conditional probabilities are shown in Table I. The following model can be used to estimate the parameters p1, p2, and η:

E(Yi | Xi, Si) = Xi p1 + (1 − Xi) p2 + Si η − (1 − Si) η
               = Xi p1 + (1 − Xi) p2 + η (2Si − 1)
               = (p2 − η) + (p1 − p2) Xi + 2η Si                 (2)
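The enrolment mechanism of equations (1) and (2) can be sketched in a short simulation. This is our own illustration, not the authors' code: the function name, the blocks of size 6 (3 E and 3 C), and the parameter values are our choices.

```python
import random

def simulate_biased_trial(n_blocks, p1, p2, eta, cutoff, rng):
    """Simulate a blocked trial (blocks of 6, 3 E and 3 C) in which the
    investigator enrols strong/medium/weak patients per equation (1) and
    outcomes follow the success probabilities of Table I / equation (2)."""
    X, S, Y = [], [], []
    for _ in range(n_blocks):
        block = ["E"] * 3 + ["C"] * 3
        rng.shuffle(block)
        for k in range(6):
            remaining = block[k:]
            phi = remaining.count("E") / len(remaining)  # prob. next assignment is E
            # Equation (1): patient type chosen from phi and the cut-off.
            if phi > cutoff:
                s = 1.0        # strong patient
            elif phi < 1 - cutoff:
                s = 0.0        # weak patient
            else:
                s = 0.5        # medium patient
            x = 1 if block[k] == "E" else 0
            # Equation (2): success probability for this patient.
            m = (p2 - eta) + (p1 - p2) * x + 2 * eta * s
            X.append(x)
            S.append(s)
            Y.append(1 if rng.random() < m else 0)
    return X, S, Y

rng = random.Random(1)
X, S, Y = simulate_biased_trial(32, p1=0.7, p2=0.5, eta=0.2, cutoff=0.5, rng=rng)
print(len(Y), sum(X))  # 192 patients in total, exactly 96 assigned to E by design
```

With a cut-off of 0.5, a strong patient is enrolled whenever the next assignment is more likely than not to be E, matching the "as often as possible" biasing strategy described above.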


The likelihood function is L = ∏(i=1 to n) mi^yi (1 − mi)^(1−yi), where n is the total number of patients in the trial, and mi = E(Yi | Xi, Si) is defined in equation (2). The likelihood is maximized subject to the following constraints: 0 ≤ p1 ≤ 1, 0 ≤ p2 ≤ 1, and 0 ≤ |η| < min(p1, p2, 1 − p1, 1 − p2). The covariate Si is a function of φi and the cut-off. Generally the cut-off value is not known and is estimated in the following way. For small block sizes, there are very few cut-off values that need to be considered. For example, with blocks of size 6 (3 experimental, 3 control), φi can assume only nine values: 0, 1/4, 1/3, 2/5, 1/2, 3/5, 2/3, 3/4, and 1. By symmetry, the only four possibilities for the cut-off in this case would be any value between 1/2 and 3/5, between 3/5 and 2/3, between 2/3 and 3/4, or between 3/4 and 1. Model (2) is considered for each of the four possible values of the cut-off. The model that yields the maximum value of the likelihood is chosen.

One can test for the presence of selection bias by considering the hypotheses H0: η = 0 versus HA: η ≠ 0. The likelihood ratio test statistic is Q = −2(log L0 − log L1), where L0 is the likelihood computed under the null hypothesis (with the constraint η = 0) and L1 is the likelihood under the alternative hypothesis (without constraints on the parameters). A difference in treatment effects, adjusting for selection bias, can be tested in a similar way by considering the hypotheses H0: p1 − p2 = 0 versus HA: p1 − p2 ≠ 0. Under H0: η = 0 there is no cut-off point to estimate and the model loses one parameter. The null distribution of the test statistic when a nuisance parameter is present only under the alternative can be difficult to obtain theoretically [10]. In the simulation study we used an approximation of the reference distribution by the chi-square distribution with one degree of freedom.

One can also consider a logistic model similar to model (2):

logit E(Yi | Xi, φi) = α + β Xi + γ Si                           (3)

Berger and Exner [5] proposed the model

logit E(Yi | Xi, φi) = α + β Xi + γ φi                           (4)

The difference between the two logistic models is that in the first model selection bias is introduced through a cut-off (as in the example from Section 1), whereas in the second model the 'strength' of the patient is proportional to the probability of being assigned to E. Berger and Exner [5] proposed model (4) to detect observable selection bias; however, they did not explore the model's performance in adjusting for selection bias. Yet another approach (suggested to us by a referee) is to stratify subjects according to φi. This approach requires only the assumption that the treatment effect is constant over φi. For example, with blocks of size 6 there will be seven groups, since there are only nine values φi can assume but the extreme values 0 and 1 do not contribute. The data for each φi are presented in a 2 × 2 table and the equality of the odds ratios in all the tables is tested using the Mantel–Haenszel test for homogeneity. Finally, the common odds ratio is compared to 1.
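The estimation procedure for model (2), maximizing the likelihood over (p1, p2, η) separately for each of the four candidate cut-offs and keeping the best fit, can be sketched as a crude grid search. The routine below is our own illustration, not the authors' implementation; for simplicity it restricts η to be non-negative (the investigator favours E), and the representative cut-off values are arbitrary points inside the four admissible intervals.

```python
import math
from itertools import product

def loglik(p1, p2, eta, X, S, Y):
    """Log-likelihood of model (2): m_i = (p2 - eta) + (p1 - p2)*X_i + 2*eta*S_i."""
    ll = 0.0
    for x, s, y in zip(X, S, Y):
        m = (p2 - eta) + (p1 - p2) * x + 2 * eta * s
        if not 0.0 < m < 1.0:
            return float("-inf")
        ll += y * math.log(m) + (1 - y) * math.log(1 - m)
    return ll

def fit_model2(X, phis, Y, grid_step=0.05):
    """Maximize the likelihood over (p1, p2, eta >= 0) and the four candidate
    cut-offs for blocks of size 6; any value strictly between two adjacent
    attainable phi values yields the same S_i, so one representative each."""
    cutoffs = [0.55, 0.62, 0.70, 0.80]  # in (1/2,3/5), (3/5,2/3), (2/3,3/4), (3/4,1)
    grid = [round(k * grid_step, 2) for k in range(1, int(1 / grid_step))]
    etas = [round(k * grid_step, 2) for k in range(int(0.35 / grid_step) + 1)]
    best = (float("-inf"), None)
    for c in cutoffs:
        # Equation (1): patient type implied by this candidate cut-off.
        S = [1.0 if p > c else 0.0 if p < 1 - c else 0.5 for p in phis]
        for p1, p2, eta in product(grid, grid, etas):
            if eta >= min(p1, p2, 1 - p1, 1 - p2):  # constraint on eta
                continue
            ll = loglik(p1, p2, eta, X, S, Y)
            if ll > best[0]:
                best = (ll, (p1, p2, eta, c))
    return best

# The likelihood-ratio statistic for H0: eta = 0 follows the same pattern:
# refit with eta fixed at 0 (no cut-off to estimate) and form Q = -2*(ll0 - ll1).
```

On data whose cell frequencies match model (2) exactly, the search recovers the generating parameters and cut-off interval; in practice a proper constrained optimizer would replace the grid.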

3. SIMULATION RESULTS

3.1. Main simulation study

Clinical trials were simulated in which patients were assigned to experimental or controltreatments by using a randomized complete block design with blocks of size 6. Simulations


Table II. Power and type I error rates for the test of the presence of observable selection bias (H0: η = 0) using the logistic regression model and the linear model.

                       Cut-off = 0.50           Cut-off = 0.99
                       Linear     Logistic      Linear     Logistic
p1     p2     η        model      model         model      model
0.50   0.50   0.00     0.05       0.05          0.06       0.06
0.50   0.50   0.20     0.95       0.59          0.05       0.59
0.50   0.70   0.00     0.05       0.05          0.65       0.05
0.50   0.70   0.10     0.42       0.18          0.68       0.20
0.50   0.70   0.20     0.96       0.59          0.70       0.60
0.60   0.40   0.00     0.05       0.05          0.63       0.05
0.60   0.40   0.30     >0.99      0.98          0.67       0.96
0.70   0.50   0.00     0.05       0.05          0.65       0.05
0.70   0.50   0.10     0.44       0.20          0.66       0.20
0.70   0.50   0.20     0.98       0.70          0.68       0.66
0.70   0.70   0.00     0.05       0.05          0.06       0.05
0.70   0.70   0.20     0.98       0.70          0.04       0.67

The data are generated according to model (2). The sample size of 192 is used in each trial.

are based on 5000 runs. A sample size of 192 was used to achieve at least 80 per cent power for a two-sided level 0.05 test of the hypotheses H0: p1 − p2 = 0 versus HA: p1 − p2 ≠ 0 if the true difference p1 − p2 is equal to 0.2 and no selection bias is present. We studied the performance of models (2), (3), and (4). Results are shown for models (2) and (4) only, since model (3) performed very similarly to model (2). In the first set of simulations the data were generated by model (2) with cut-offs of 0.5 and 0.99.

First, we assessed the ability of the two models to detect observable selection bias. Power and type I error rates for the test of η = 0 are presented in Table II. We used the Wald chi-square test to test for parameters in the logistic regression models, and used the chi-square with one degree of freedom as the reference distribution for the likelihood ratio test statistic. Simulation showed that the chi-square with one degree of freedom is a good approximation of the reference distribution. On average the two methods are comparable, with the linear model yielding slightly better results. These results are in accord with the simulation study of Berger and Exner [5], which demonstrated that the logistic model (4) can detect observable selection bias well. The cut-off was estimated correctly about 75 per cent of the time, with the cut-off of 0.5 estimated correctly more frequently than the cut-off of 0.99.

The next question is whether the two models can be used to adjust the treatment comparison for selection bias. Table III presents power and type I error rates for the test of p1 = p2 for the linear (2) and logistic (4) models adjusting for observable selection bias, and for the unadjusted comparison. When p1 and p2 were both 0.5, we expected the null hypothesis p1 = p2 to be rejected in approximately 5 per cent of the trials. Simulation without selection bias (a cut-off of 0.99) resulted in a type I error rate of 6 per cent. With η = 0.2, because of the selection bias the null hypothesis was rejected 54 per cent of the time when a cut-off of 0.50 was used, and 29 per cent of the time when the cut-off was 0.99. Adjusting for selection bias, the error rates were reduced to 4 and 5 per cent, respectively. When p1 = 0.7, p2 = 0.5, and η = 0, the null hypothesis was rejected 81 per cent of the time. However, when η = 0.2, the probability


Table III. Power and type I error rates for the test of H0: p1 = p2 using linear model (2) and logistic regression model (4).

                       Cut-off = 0.50                          Cut-off = 0.99
                       Linear   Logistic                       Linear   Logistic
p1     p2     η        model    model     Unadjusted           model    model     Unadjusted
0.50   0.50   0.00     0.06     0.05      0.05                 0.06     0.05      0.05
0.50   0.50   0.20     0.04     0.05      0.54                 0.05     0.04      0.29
0.50   0.70   0.00     0.65     0.64      0.80                 0.65     0.63      0.80
0.50   0.70   0.10     0.73     0.64      0.43                 0.68     0.67      0.56
0.50   0.70   0.20     0.78     0.66      0.11                 0.70     0.67      0.29
0.60   0.40   0.00     0.63     0.67      0.79                 0.63     0.66      0.79
0.60   0.40   0.30     0.81     0.67      >0.99                0.67     0.64      >0.99
0.70   0.50   0.00     0.65     0.69      0.81                 0.65     0.67      0.81
0.70   0.50   0.10     0.68     0.69      0.98                 0.66     0.67      0.95
0.70   0.50   0.20     0.76     0.66      >0.99                0.68     0.65      0.99
0.70   0.70   0.00     0.06     0.04      0.05                 0.06     0.04      0.05
0.70   0.70   0.20     0.04     0.04      0.62                 0.04     0.05      0.34

The data are generated according to model (2).

to reject the null hypothesis increased to 0.99 for both cut-offs as a result of selection bias. After adjusting for selection bias, the power was 76 per cent for a cut-off of 0.50, and 68 per cent for a cut-off of 0.99. The method did not perform as well with a cut-off of 0.99, due to the smaller number of strong or weak patients enrolled in the trial.

The parameter estimates presented in Table IV show that the proposed method was successful with respect to estimation of the parameters p1 and p2 in the presence of selection bias. In all 22 scenarios considered, the proposed method yielded unbiased estimates. The standard deviations of these estimates were also very close to the theoretical standard errors for binomial proportions. The means of the estimated standard errors of p̂1 and p̂2 were equal to 0.05 in all the scenarios. Since they were close to the errors in Table IV, the method produces good estimates of the standard errors. The method also produced good estimates of the parameter η.

3.2. Simulation study of enrolment with limited availability

In order to test the performance of our method in a more realistic setting, we repeated the simulations with the modification that the investigator was not able to enrol the desired patient type in every instance. Specifically, when the investigator attempts to enrol a strong patient, he is able to do so only 50 per cent of the time. Similarly, every time the investigator attempts to enrol a weak patient, a weak patient is available only 50 per cent of the time. Otherwise, a medium strength patient is enrolled. This set-up is similar to the one of Section 3.1, hence model (2) is expected to produce good results in detecting selection bias and correcting for it. Both models were able to detect the observable selection bias (Table V) and to adjust the treatment comparison for observable selection bias (Table VI). Estimation of the parameters p1, p2, and p1 − p2 was unaffected by the change to the simulation setting (data are available from the authors). As expected, estimates of η were approximately halved; that is, partial unavailability of strong and weak patients diminishes the effect of selection bias.


Table IV. Maximum likelihood estimates of parameters and standard deviations (SD) obtained from model (2).

                    p̂1                 p̂2                p̂1 − p̂2                   η̂
Cut-off   p1    Mean   SD    p2    Mean   SD    p1 − p2   Mean    SD     η     Mean   SD

  —      0.50   0.50   0.06  0.50  0.50   0.06    0.00     0.00   0.09   0.00  0.00   0.10
 0.50    0.50   0.50   0.05  0.50  0.50   0.05    0.00     0.00   0.08   0.20  0.21   0.05
 0.99    0.50   0.50   0.06  0.50  0.50   0.05    0.00     0.00   0.08   0.20  0.20   0.08

  —      0.50   0.50   0.06  0.70  0.70   0.05   −0.20    −0.20   0.09   0.00  0.00   0.09
 0.50    0.50   0.50   0.05  0.70  0.70   0.05   −0.20    −0.21   0.08   0.10  0.12   0.07
 0.99    0.50   0.50   0.06  0.70  0.70   0.05   −0.20    −0.20   0.08   0.10  0.10   0.09
 0.50    0.50   0.50   0.05  0.70  0.70   0.04   −0.20    −0.20   0.07   0.20  0.21   0.05
 0.99    0.50   0.50   0.06  0.70  0.70   0.05   −0.20    −0.20   0.08   0.20  0.19   0.07

  —      0.60   0.60   0.06  0.40  0.40   0.06    0.20     0.20   0.09   0.00  0.00   0.09
 0.50    0.60   0.60   0.05  0.40  0.40   0.05    0.20     0.20   0.08   0.30  0.30   0.04
 0.99    0.60   0.60   0.05  0.40  0.40   0.05    0.20     0.20   0.08   0.30  0.30   0.06

  —      0.70   0.70   0.05  0.50  0.50   0.06    0.20     0.20   0.09   0.00  0.00   0.09
 0.50    0.70   0.70   0.05  0.50  0.50   0.05    0.20     0.19   0.08   0.10  0.12   0.06
 0.99    0.70   0.70   0.05  0.50  0.50   0.06    0.20     0.20   0.08   0.10  0.10   0.08
 0.50    0.70   0.70   0.05  0.50  0.50   0.05    0.20     0.20   0.08   0.20  0.21   0.05
 0.99    0.70   0.70   0.05  0.50  0.50   0.05    0.20     0.19   0.08   0.20  0.20   0.07

  —      0.70   0.70   0.05  0.70  0.70   0.05    0.00     0.00   0.08   0.00  0.00   0.09
 0.50    0.70   0.70   0.05  0.70  0.70   0.04    0.00     0.00   0.07   0.20  0.21   0.05
 0.99    0.70   0.70   0.05  0.70  0.70   0.05    0.00     0.00   0.07   0.20  0.20   0.07

The data are generated according to model (2).

Table V. Power and type I error rates for the test of the presence of observable selection bias (H0: η = 0) using linear model (2) and logistic regression model (4).

                       Cut-off = 0.50           Cut-off = 0.99
                       Linear     Logistic      Linear     Logistic
p1     p2     η        model      model         model      model
0.50   0.50   0.20     0.41       0.37          0.17       0.18
0.50   0.70   0.20     0.43       0.38          0.17       0.19
0.60   0.40   0.30     0.78       0.70          0.39       0.40
0.70   0.50   0.20     0.43       0.38          0.19       0.20

The data are generated according to the model of enrolment with limited availability. A sample size of 192 is used in each trial.

This reflects the fact that now E(Yi | Si = 1) − E(Yi | Si = 0) = η, whereas before this difference was 2η.
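The halving can be seen directly (our derivation, with η the selection-bias parameter, a the availability probability, and p the success probability of a medium patient on the assigned arm): conditioning on the attempted type Si rather than the realized type gives

```latex
\begin{align*}
E(Y_i \mid S_i = 1) &= a\,(p + \eta) + (1-a)\,p = p + a\eta,\\
E(Y_i \mid S_i = 0) &= a\,(p - \eta) + (1-a)\,p = p - a\eta,\\
E(Y_i \mid S_i = 1) - E(Y_i \mid S_i = 0) &= 2a\eta = \eta \quad \text{when } a = \tfrac{1}{2}.
\end{align*}
```

so a 50 per cent availability rate halves the apparent bias parameter, as observed in the simulations.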

3.3. Simulation study where probability of success is modelled using Beta distribution

In Section 2, we described the model in which patients are classified on the basis of prognosis as strong, with probability of success p + η; medium, with probability of success p; or weak, with probability of success p − η. In this section we consider a model where the patient's


Table VI. Power and type I error rates for the test of H0: p1 = p2 using linear model (2) and logistic regression model (4).

                       Cut-off = 0.50                          Cut-off = 0.99
                       Linear   Logistic                       Linear   Logistic
p1     p2     η        model    model     Unadjusted           model    model     Unadjusted
0.50   0.50   0.20     0.05     0.05      0.17                 0.06     0.05      0.11
0.50   0.70   0.20     0.73     0.67      0.43                 0.68     0.65      0.56
0.60   0.40   0.30     0.69     0.63      0.99                 0.69     0.63      0.97
0.70   0.50   0.20     0.69     0.67      0.98                 0.66     0.65      0.94

The data are generated according to the model of enrolment with limited availability. A sample size of 192 is used in each trial.

individual probability of success follows a Beta distribution. Let the probability of success on C follow the Beta density with parameters (2, 2). This model yields an average probability of success of 0.5. The mean of the Beta truncated at 0.47 is equal to 0.3, the mean of the distribution truncated between 0.47 and 0.53 is 0.5, and the mean of the Beta above 0.53 is 0.7. If we only use the means of the corresponding pieces of the Beta density, then we obtain the model from Section 2 with p = 0.5 and η = 0.2. The probability of success on E follows the Beta density with parameters (3, 2). The mean of the Beta truncated at 0.42 is equal to 0.37, the mean of the distribution truncated between 0.42 and 0.68 is 0.6, and the mean of the distribution above 0.68 is 0.8. In the simulations we used the corresponding pieces of the two Beta densities to model weak, medium, and strong patients on C and E. The total sample size was 192 and a cut-off of 0.5 was used. We stratified the subjects according to the probability, φi, of being assigned to E and computed estimates of the success probabilities in each group. With blocks of size 6, there are nine possible values of φi. Because of possibly small sample sizes in some of the groups, we considered five groups: φi = 0, 0 < φi < 0.5, φi = 0.5, 0.5 < φi < 1, and φi = 1. Since no patients are assigned to E when φi = 0, nor to C when φi = 1, we put 'NA' in place of the estimate. The five estimates of success probabilities on E are NA, 0.30, 0.57, 0.80, and 0.82. The five estimates of success probabilities on C are 0.28, 0.29, 0.50, 0.73, and NA. Both the linear (2) and logistic (4) models were able to detect the presence of observable selection bias well, yielding power of about 0.9. The adjusted power of the treatment comparison according to the linear model was 0.23, and according to the logistic model it was 0.11. Note that the comparison of two populations with success probabilities of 0.5 and 0.6 and a sample size of 192 yields power of 0.27.
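The means of the truncated Beta pieces can be checked numerically. The sketch below is our own (the helper names are ours, and Simpson's rule stands in for exact integration); for the Beta(2, 2) control distribution it reproduces the quoted values of roughly 0.3 for the piece below 0.47 and 0.7 for the piece above 0.53.

```python
def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    s = f(a) + f(b) + sum((4 if k % 2 else 2) * f(a + k * h) for k in range(1, n))
    return s * h / 3

def truncated_beta_mean(alpha, beta, lo, hi):
    """Mean of a Beta(alpha, beta) variable conditioned on lo < X < hi.
    The unnormalized density x^(alpha-1) * (1-x)^(beta-1) suffices,
    since the normalizing constant cancels in the ratio."""
    dens = lambda x: x ** (alpha - 1) * (1 - x) ** (beta - 1)
    return simpson(lambda x: x * dens(x), lo, hi) / simpson(dens, lo, hi)

print(round(truncated_beta_mean(2, 2, 0.0, 0.47), 2))  # lower piece of Beta(2,2)
print(round(truncated_beta_mean(2, 2, 0.53, 1.0), 2))  # upper piece of Beta(2,2)
```

The same helper applied with parameters (3, 2) gives the truncated means for the experimental arm.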

4. DISCUSSION

In this paper we studied a new method of testing for observable selection bias in unmasked trials in which the randomized block design is used. Both the linear model and the logistic model were able to detect observable selection bias relatively well. Our recommendation is to use one of these models if the likelihood of the presence of selection bias is high. Either model offers simultaneous testing for selection bias and testing for the adjusted treatment effect. According to our simulation study, the size of the test of the treatment effect is well preserved.


Obviously, the simulations presented cannot reflect the full reality of clinical trials. First, the patient strength definition employed here is over-simplified, since strength is likely to vary on a patient-by-patient basis. The investigator who wants to bias the study might not always have strong and weak patients available to put on the trial, and the probability of successfully enrolling the desired patient type might not be fixed. Second, we assumed that all prior treatment assignments were known; in reality, however, this applies to fully unmasked studies only, and partial concealment is likely to be maintained in masked studies. These issues can be addressed through more realistic simulation studies. Incorporating drop-outs would also make the simulation study more realistic. Note that there are some types of biases that the method cannot control for, for example, observation bias. The randomized block design was considered in our simulation study. The use of blocks of random size will not prevent the occurrence of selection bias [11]; our method can be used with allocation by blocks of random size as well.

A final note concerns the scope of applicability of these methods. It may be tempting to regard these methods as being of limited use, based on the fact that there is such scant evidence of selection bias in actual randomized trials. The lack of evidence is rooted in the fact that no major effort appears to exist to detect selection bias when it is present. This means that there is no evidence that selection bias does not occur on a regular basis. Uncertainty regarding the presence of selection bias in any given trial should probably not trigger the corrective measures that we have illustrated. However, it should give one pause prior to accepting the uncorrected results at face value, and should trigger efforts aimed at the detection of selection bias [12, 13], including the detection method described herein and/or the Berger–Exner test of selection bias [5].

ACKNOWLEDGEMENTS

The authors thank the referees for their constructive suggestions. The authors are also grateful to Dr Gary G. Koch for helpful discussions.

REFERENCES

1. Blackwell D, Hodges Jr J. Design for the control of selection bias. Annals of Mathematical Statistics 1957; 28:449–460.

2. Piantadosi S. Clinical Trials: A Methodologic Perspective. Wiley: New York, 1997.
3. Friedman L, Furberg C, DeMets D. Fundamentals of Clinical Trials (3rd edn). Springer: New York, 1998.
4. Schulz KF. Subverting randomization in controlled trials. Journal of the American Medical Association 1995; 274(18):1456–1458.

5. Berger VW, Exner DV. Detecting selection bias in randomized clinical trials. Controlled Clinical Trials 1999; 20:319–327.

6. Proschan M. Influence of selection bias on type I error rate under random permuted block designs. Statistica Sinica 1994; 4:219–231.

7. Follmann D, Proschan M. The effect of estimation and biasing strategies on selection bias in clinical trials. Journal of Statistical Planning and Inference 1994; 39:1–17.

8. Zelen M. Discussion of biostatistical collaboration in medical research by Jonas H. Ellenberg. Biometrics 1990; 46:28–29.

9. Meinert CL. Clinical Trials: Design, Conduct, and Analysis. Oxford University Press: New York, 1986.
10. Davies RB. Hypothesis testing when a nuisance parameter is present only under the alternative: linear model case. Biometrika 2002; 89:484–489.
11. Berger VW, Ivanova A, Knoll M. Minimizing predictability while retaining balance through the use of less restrictive randomization procedures. Statistics in Medicine 2003; 22:3017–3028.


12. Berger VW, Christophi CA. Randomization technique, allocation concealment, masking, and susceptibility of trials to selection bias. Journal of Modern Applied Statistical Methods 2003; 2:80–86.

13. Berger VW. Selection bias and baseline imbalances in randomized trials. Drug Information Journal 2004; 38:1–2.
