School of Mathematical Sciences

MTHM038: MSc Mathematics Dissertation

Response-Adaptive Randomisation in Clinical Trials with Binary Responses

2014-15

Mateusz Matyjaszczyk

140293481

Supervisor: Dr. D.S. Coad

Abstract

Randomisation is a fundamental concept in experimental design as it is

the best known way of removing unwanted bias. The classical approach to

randomisation is to balance the number of patients receiving each treatment.

However, in a clinical trial this has an ethical disadvantage as it could lead

to a high number of treatment failures. We explore various response-adaptive

randomisation schemes which aim to assign more patients to the superior

treatment in order to reduce the number of treatment failures. In this disser-

tation we only consider clinical trials with binary responses.

We start by introducing the randomised play-the-winner (RPW) rule. The
RPW rule has many statistical disadvantages and a previous application in
a clinical trial led to disastrous results. We therefore introduce three different
randomisation rules: the drop-the-loser (DL) rule, the odds ratio based design
(ORBD) and the doubly adaptive biased coin design (DBCD). For these rules to
be applicable under a realistic setting, each one is extended to (i) allow any
number of treatments, (ii) allow delayed responses and (iii) incorporate covariates.
We then analyse the efficient randomised adaptive design (ERADE) which
attains the Cramér-Rao lower bound on the asymptotic variance.

The final section compares the randomisation rules mentioned. For designs with K = 2
and K = 3 treatments, we compare the allocation proportion and define
a hypothesis test which we then use to simulate power and significance level.

Then the same methods are used to compare the randomisation rules under

delayed responses and incorporating covariates.

We find that in general there is an inverse relationship between more ethi-

cal allocation and power. A suitable response-adaptive randomisation scheme

needs to have a good balance between these two criteria and thus such a

randomisation procedure should be tailored to a specific clinical trial.

Contents

1 Introduction

2 Response-adaptive randomisation designs
  2.1 Randomised play-the-winner (RPW) rule
    2.1.1 K = 2 treatments design
    2.1.2 Statistical properties and criticisms
  2.2 Drop-the-loser (DL) rule
    2.2.1 K ≥ 2 treatments design
    2.2.2 Allowing delayed responses
    2.2.3 Targeting an alternative allocation proportion
    2.2.4 Incorporating covariates
  2.3 Odds-ratio based designs (ORBD)
    2.3.1 Incorporating covariates
    2.3.2 K > 2 treatments design
  2.4 Doubly adaptive biased coin design (DBCD)
    2.4.1 K = 2 treatments design
    2.4.2 K > 2 treatments design
    2.4.3 Incorporating covariates
  2.5 Efficient randomised adaptive design (ERADE)
    2.5.1 Rule definition

3 Comparing designs
  3.1 Introduction
  3.2 K = 2 treatments
    3.2.1 Allocation proportion
    3.2.2 Inference, significance level and power
  3.3 K = 3 treatments
    3.3.1 Allocation proportion
    3.3.2 Inference, significance level and power
  3.4 Delayed responses
    3.4.1 K = 2 treatments
    3.4.2 K = 3 treatments
  3.5 Covariates
    3.5.1 K = 2 treatments
    3.5.2 K = 3 treatments

4 Conclusion

A R code used in simulations
  A.1 RPW
  A.2 DL
  A.3 GDL
  A.4 DLC
  A.5 ORBD
  A.6 DBCD
  A.7 RDBCD
  A.8 ERADE

1 Introduction

Suppose that a new treatment has been developed and we wish to compare it to

an existing treatment through a clinical trial. In this trial, the patients arrive

sequentially and each patient is assigned to either of the treatments. If the treatment

assignment is systematic, then the physician or medical examiner may be able to

predict the next assignment and choose a patient that they would prefer to receive

the corresponding treatment. This is called selection bias and is highly undesirable

as it can invalidate the trial. The best known way of removing such bias is through

randomisation.

Complete randomisation is the most basic form of randomisation where we as-

sign a treatment to a patient with equal probability. In an experiment with two

treatments this can be compared to a toss of a fair coin. Although such a scheme

minimises selection bias and is quite easy to implement, it has many undesirable
properties, e.g. it does not take the medical histories of the patients into account.
Suppose that we have a covariate under which the treatment can have various effects.
Then, it is highly desirable for this covariate to be equally represented in each
treatment, as imbalance between the treatments could affect any statistical inference
performed later on. Randomisation that aims for such balance is called covariate-adaptive
randomisation and historically is the most widely used type of randomisation in clinical
trials.

Next, consider the responses to this experiment to be binary (i.e. treatment was

successful or unsuccessful) and instantaneous (i.e. before the next patient is ran-

domised). After some patients have been randomised and their responses obtained,

one of the treatments has been shown to have a lower proportion of treatment fail-

ures than the other treatments. Due to ethics, we wish to assign more patients to

this superior treatment in order to minimise the number of treatment failures. This

type of randomisation is called response-adaptive randomisation (RAR). Although

Thompson (1933) proposed such adaptive designs as early as the 1930s, they have had a

very limited use in practice with the randomised play-the-winner (RPW) rule being

the design most often applied. In Section 2.1 we will briefly investigate this design

and its limitations. One of the most severe limitations is that when the success

probability is high on both treatments then the variance is unbounded. This in turn

means that the allocation proportion of patients to each treatment heavily depends

on the initial settings of the scheme. We mention the ECMO trial in which a bad

choice of the initial settings of the RPW scheme led to disastrous results.

For a given randomisation scheme to be applicable in a practical setting, we need

to consider the following limitations:

• So far we assumed the responses to be instantaneous. In practice, this is rarely

the case as new patients may be assigned to a treatment before a response is

available for all the previous patients. Hence for an adaptive design to be

applicable to a wide range of clinical trials, the design should allow delayed

responses.

• Covariate imbalance may be an issue, as it is in complete randomisation.

Thus, for an adaptive design to be practical, the design should take into con-

sideration the response history so far as well as the covariate balance of the

treatments.

• We also assumed that there are only two treatments. However, this may not be

the case in many clinical trials. For example, when comparing a new treatment

to an existing one, we may wish to include a placebo group. Similarly, if the

newly proposed treatment is a drug, patients can be assigned to treatments

with different dosages in order to find the optimum dose.

In Sections 2.2-2.4 three well studied RAR designs are introduced: drop-the-

loser rule (DL), odds-ratio based design (ORBD) and doubly adaptive biased-coin

design (DBCD). Each of these designs is extended to address the three limitations
above: delayed responses, covariate adaptiveness and K > 2 treatments. Of note

is the extension of the ORBD to K = 3 treatments as this has not been previously

explored in the literature. With the exception of the DL rule, none of the rules

mentioned are known to obtain the lower bound on the asymptotic variance and

thus in Section 2.5 we introduce the efficient randomised adaptive design (ERADE)

that obtains this lower bound.

Finally, in Chapter 3 we compare the statistical properties of the randomisation

schemes. Section 3.2 compares the allocation and failure proportions for all the

rules with K = 2 treatments. We find that in general the DL, DBCD and ERADE

rules have the least variable allocation proportions while the ORBD assigns the most

patients to the superior treatment, resulting in the lowest failure proportion. In fact,

we notice the inverse relationship between these two criteria. We then define the

Wald test that can be used to test the hypothesis of no treatment difference. We use

this test to simulate the power and significance level of different randomisation rules

and confirm the well-known inverse relationship between power and variability. That

is, a more variable rule in general leads to reduced power. Thus, the DL, DBCD

and ERADE are the most powerful.

In Section 3.3 we explore the DL, ORBD and DBCD when extended to K = 3

treatments. The allocation and failure proportion are investigated for each rule and

the results reflect the findings of the previous section. That is, the DL and DBCD

are found to be the least variable but assign fewer patients to the superior treatment

than the ORBD. We then define the contrast test of homogeneity which allows us to

compare one treatment (usually the placebo) to the other treatments. We use this

test to simulate the power and significance level for the DL, ORBD and DBCD rules

and it is shown that the DL and DBCD maintain the highest power, when compared

to complete randomisation. Note that a simulation of the ORBD and DBCD when

extended to K = 3 treatments is not reported in the literature.

We then consider the DL, ORBD and DBCD rules under delayed responses in

Section 3.4. Note that out of these rules, the literature only investigates the DL un-

der delayed responses and there is no investigation of power under delayed responses

for any of the rules mentioned. The investigation into ORBD and DBCD with de-

layed responses is the first known investigation of this type that has been reported.

We note that moderate delay does not have an effect on allocation proportion. We

also simulate power and significance level and find that delayed responses may lower

the power of some designs. We perform such an investigation for K = 2 and K = 3

treatments. Finally, we briefly discuss the performance of RAR designs under severe

delay.

In Section 3.5 we consider extensions of the DL, ORBD and DBCD rules to incorporate

covariates. We see that incorporating covariates can significantly reduce the ethical

allocation, which results in higher failure rates. We also perform an investigation

into the significance level and power of these designs and find that the power can

be severely reduced for these designs. Note that this is the first investigation of the

ORBD and DBCD incorporating covariates that has been reported in the literature.

The DL incorporating covariates with K = 3 has also been studied for the first time.

We conclude that RAR designs can be statistically and ethically desirable. However,
we also find that under some realistic assumptions, i.e. delayed responses and
covariate-adaptiveness, these RAR designs do not perform as well. Thus, a given

RAR design should be chosen in such a way that we obtain satisfactory statistical

properties whilst maximising ethical advantages. We then consider extensions to

the work presented here.

2 Response-adaptive randomisation designs

2.1 Randomised play-the-winner (RPW) rule

2.1.1 K = 2 treatments design

Assuming K = 2 treatments with binary and instantaneous responses, Zelen (1969)

proposed the following randomisation procedure. Assign the first patient to a treat-

ment with equal probability. Given the response of the first patient, the next pa-

tient is assigned the same treatment if the response was a success. Otherwise, if

the treatment response was a failure, then the patient is assigned the other treat-

ment. The procedure continues until all patients have been randomised or a suitable

stopping rule has been reached. Such a design is known as the play-the-winner (PW)

rule. Clearly, this design only allows for 2 treatments and there is no immediate

way of incorporating delayed responses. More importantly, the design is completely

predictable as the physician is able to guess the next treatment assignment if the

response and treatment assignment of the previous patient are known.

Wei and Durham (1978) extended the above idea by proposing the randomised

play-the-winner (RPW) rule. In this design sequentially arriving patients are as-

signed to a treatment by a ball being drawn from an urn. We assume that there are

i = 1, 2 treatments. There are αi balls of the colour corresponding to each treatment.

We would usually choose α1 = α2 so that the urn is balanced in the beginning. Here,

this will always be the case so we let α = α1 = α2. When a patient is ready to be

randomised, a ball is drawn from the urn. The patient is assigned the corresponding

treatment, the ball is replaced and a response is observed. If the response of the

treatment for this patient was a success, we add β balls of the corresponding colour

to the urn. However, if the response was a failure, then β balls of the opposite kind

are added to the urn. This way we skew the probability of assignment towards the

more successful treatment so far. This process continues until a suitable stopping

criterion has been reached, e.g. a sufficient number of patients have been randomised.

We denote this design by RPW(α,β).
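The urn scheme just described can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the R program reported in Section A.1: the function name `simulate_rpw` and its arguments are hypothetical, responses are assumed instantaneous, and the success probabilities are supplied only so that responses can be simulated.

```python
import random


def simulate_rpw(p, n, alpha=1, beta=1, rng=None):
    """Run one RPW(alpha, beta) trial with two treatments.

    p -- (p1, p2), success probabilities used to simulate the responses
    n -- number of patients to randomise
    Returns the assignments (0 = treatment 1, 1 = treatment 2) and the
    responses (True = success).
    """
    rng = rng or random.Random()
    urn = [alpha, alpha]  # balanced initial urn: alpha balls per treatment
    assignments, responses = [], []
    for _ in range(n):
        # Draw a ball with replacement; index 0 stands for treatment 1.
        arm = 0 if rng.random() < urn[0] / (urn[0] + urn[1]) else 1
        success = rng.random() < p[arm]
        # Success: add beta balls of the same colour.
        # Failure: add beta balls of the opposite colour.
        urn[arm if success else 1 - arm] += beta
        assignments.append(arm)
        responses.append(success)
    return assignments, responses


# Average allocation proportion to treatment 1 over repeated trials.
rng = random.Random(2015)
props = [simulate_rpw((0.8, 0.4), n=100, rng=rng)[0].count(0) / 100
         for _ in range(1000)]
print("mean allocation to treatment 1:", round(sum(props) / len(props), 2))
```

Passing an explicit `random.Random` instance keeps the replications reproducible without touching the global random state.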

The above design deals with many of the limitations of the PW rule. Firstly,

the design is less predictable than PW rule since the allocation probability depends

on the whole response history, rather than just the last response. RPW also allows

the responses to be delayed as the urn can be updated once a response is available,

which was not the case with the PW rule.

2.1.2 Statistical properties and criticisms

We now list some interesting statistical properties of the RPW rule. Consider a

trial with treatments i = 1, 2, binary responses and n patients to be assigned.

The treatments have success probabilities 0 < pi < 1. Then, the probabilities of

treatment failure are given by qi = 1 − pi and Ni(n) is the number of patients

assigned to treatment i. Wei and Durham (1978) have shown that

\[
\frac{N_i(n)}{n} \to \frac{1/q_i}{1/q_1 + 1/q_2} \tag{1}
\]

almost surely as n→∞. This is known as the limiting allocation proportion and is

an important feature of any randomisation design. The RPW rule can only target

this specific allocation proportion, which we will refer to as urn allocation. In prac-

tice, it is often desired that a given randomisation rule can target different allocation

proportions. For example, Rosenberger et al. (2001) proposed an alternative allocation
proportion under which the number of treatment failures would be minimised.

Thus, a randomisation rule that can target any given allocation proportion is highly

desired and such rules will be introduced in Sections 2.2.3, 2.4 and 2.5.
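As a worked illustration of urn allocation, with values chosen here purely for the example, take p1 = 0.8 and p2 = 0.6, so that q1 = 0.2 and q2 = 0.4. The limiting allocation to treatment 1 is then

```latex
\[
\frac{N_1(n)}{n} \to \frac{1/q_1}{1/q_1 + 1/q_2}
  = \frac{1/0.2}{1/0.2 + 1/0.4}
  = \frac{5}{7.5}
  = \frac{2}{3}.
\]
```

In the limit, roughly two thirds of the patients would therefore receive the treatment with the higher success probability.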

Next, let δj be the treatment assignment of the jth patient where j = 1, . . . , n.

Atkinson and Biswas (2013) give the probability that the (j+1)th patient is assigned

to treatment i = 1 as

\[
P(\delta_{j+1} = 1) = \frac{1}{2} + d_{j+1},
\]

where

\[
d_{j+1} = \frac{j\rho\,(p_1 - p_2)}{2(2 + j\rho)}
        + \frac{\rho\,(p_1 + p_2 - 1)}{2 + j\rho} \sum_{k=1}^{j} d_k
\]

and ρ = β/α. Then clearly the assignment depends on ρ rather than α or β alone.

This means that the rules RPW (3, 3) and RPW (1, 1) are equivalent and only the

parameter ρ needs to be chosen.
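The dependence on ρ alone can be checked numerically. The sketch below assumes the recursion reads d_{j+1} = jρ(p1 − p2)/(2(2 + jρ)) + ρ(p1 + p2 − 1)/(2 + jρ) · (d_1 + · · · + d_j), with d_1 = 0 because the urn starts balanced. The function name `rpw_assignment_probs` is hypothetical.

```python
def rpw_assignment_probs(p1, p2, rho, n):
    """Return [P(delta_1 = 1), ..., P(delta_n = 1)] for an RPW rule with rho = beta / alpha."""
    d = [0.0]          # d_1 = 0: the first patient is assigned with probability 1/2
    running_sum = 0.0  # holds d_1 + ... + d_j
    for j in range(1, n):
        running_sum += d[-1]
        d_next = (j * rho * (p1 - p2) / (2 * (2 + j * rho))
                  + rho * (p1 + p2 - 1) / (2 + j * rho) * running_sum)
        d.append(d_next)
    return [0.5 + dk for dk in d]


# A larger rho moves the assignment probability away from 1/2 more quickly.
for rho in (1.0, 1 / 3, 1 / 10):
    probs = rpw_assignment_probs(0.8, 0.4, rho, n=100)
    print("rho =", round(rho, 3), "P(delta_100 = 1) =", round(probs[-1], 3))
```

Because only `rho` enters the function, RPW(1, 1) and RPW(3, 3) give identical assignment probabilities in this sketch.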

For fixed pi and a small value of ρ (i.e. α > β) we have a small dj+1 and thus
P (δj+1 = 1) is close to 1/2. As we make ρ even smaller, P (δj+1 = 1) → 1/2.
For the same values of pi and a large ρ, dj+1 is further away from zero, with its sign
depending on which treatment is more successful, and so P (δj+1 = 1) moves away from a half.
This means that the rule tends to assign more patients to the superior treatment
when ρ is large, but it is then much more predictable.

The ECMO trial conducted by Bartlett et al. (1985) which used the RPW(1,1)

rule serves as a good example of why a proper choice of ρ is important. When

this trial was concluded, 11 of the 12 patients had received the ECMO treatment, while only

one patient was assigned to the control group. The researchers concluded that the

reason for this imbalance was the bad choice of the initial urn composition and that

an RPW(3,1) or RPW(2,1) would be a more suitable choice. Such designs have a

smaller ρ and so would increase the probability of assigning patients to the inferior

treatment earlier on in the trial, resulting in more balanced treatments.

Matthews and Rosenberger (1997) derived the exact variance of the allocation

proportion of the RPW rule. The form is quite complicated, requiring at least half

a page and thus it is not given here. Their result shows that if λ = p1 − q2 > 1/2

then the variance of the allocation proportion is unbounded and thus the allocation

depends on the initial urn composition. However, in practice such trials with high

success probabilities on both treatments are very rare.

When the asymptotic variance is bounded, that is p1 − q2 < 1/2, Smythe and

Rosenberger (1995) demonstrated that

\[
n^{1/2}\left(\frac{N_1(n)}{n} - \frac{q_2}{q_1 + q_2}\right) \to N(0, v),
\]

where

\[
v = \frac{q_1 q_2\,\bigl(5 - 2(q_1 + q_2)\bigr)}{\bigl(2(q_1 + q_2) - 1\bigr)(q_1 + q_2)^2}.
\]

Hu et al. (2006) then showed that the lower bound on this asymptotic variance is
not attained by the RPW rule. This is an undesirable result as a reduced variability
of a randomisation rule is directly correlated to a gain in statistical power of the
design, as has been shown by Melfi and Page (1998) and Hu and Rosenberger (2003).
Thus, designs that attain the lower bound on the asymptotic variance are highly
regarded.

                     Allocation Proportion (Standard Deviation)
(p1, p2)     n     RPW(1,1)      RPW(3,1)      RPW(5,1)      RPW(10,1)
                   ρ = 1         ρ = 1/3       ρ = 1/5       ρ = 1/10
(0.8, 0.8)   100   0.50 (0.16)   0.50 (0.13)   0.50 (0.11)   0.50 (0.09)
(0.8, 0.6)   100   0.63 (0.12)   0.61 (0.10)   0.60 (0.09)   0.58 (0.08)
(0.8, 0.4)   100   0.72 (0.09)   0.69 (0.08)   0.67 (0.07)   0.64 (0.07)
(0.8, 0.2)   100   0.77 (0.06)   0.75 (0.06)   0.73 (0.06)   0.69 (0.06)
(0.6, 0.6)   100   0.50 (0.10)   0.50 (0.09)   0.50 (0.08)   0.50 (0.07)
(0.6, 0.4)   100   0.59 (0.08)   0.58 (0.07)   0.58 (0.07)   0.56 (0.06)
(0.6, 0.2)   100   0.66 (0.06)   0.64 (0.06)   0.63 (0.06)   0.62 (0.05)
(0.4, 0.4)   100   0.50 (0.06)   0.50 (0.06)   0.50 (0.06)   0.50 (0.06)
(0.4, 0.2)   100   0.57 (0.05)   0.56 (0.05)   0.56 (0.05)   0.55 (0.05)
(0.2, 0.2)   100   0.50 (0.05)   0.50 (0.04)   0.50 (0.04)   0.50 (0.04)

Table 1: Allocation proportion of the RPW rule for different choices of the initial
urn composition. This simulation used 5,000 replications.

We now investigate some choices of the initial urn composition under different pi

through a simulation. For each choice of pi, the n = 100 patients were assigned using

the corresponding RPW rule. The results are given in Table 1. We investigated four

choices of parameters, with the corresponding ρ values reported. The table shows

the average allocation to treatment i = 1 over 5, 000 replications with the standard

deviation of this allocation proportion given in the brackets. The allocation to i = 2

is not given as it can be obtained by subtracting the allocation to i = 1 from one. It

can be seen that when p1 = p2 the allocation proportion is equal for all choices of the

parameters. However, the corresponding standard deviation is smaller for models

with smaller ρ (i.e. higher α). When p1 ≠ p2 the RPW rule assigns more patients

to the superior treatment which is always the first treatment in the table. The

proportion of patients assigned to the superior treatment increases as the difference

between the two treatments grows, leading to an ethical advantage. It can also be

noticed that the choice of ρ can have an effect on the allocation proportion when

p1 ≠ p2. For higher values of ρ we allocate more patients to the superior treatment
but this has a trade-off as the urn models with lower ρ have a smaller standard
deviation. Due to the earlier mentioned relation between variance and power, the
models with smaller ρ may prove to be more powerful. The code written in the

programming language R used for the RPW rule can be seen in Section A.1.

Overall, the RPW rule exhibits many undesirable properties meaning that a

practical application is usually troublesome. Despite this, much work has been done

on the topic. Bandyopadhyay and Biswas (1999) extend the rule to incorporate

covariates, while Biswas (1999) studies the rule under delayed responses. Such

extensions are not studied here and we focus on other designs with more desirable

properties.

2.2 Drop-the-loser (DL) rule

2.2.1 K ≥ 2 treatments design

Ivanova (2003) proposed the following urn model. Consider a clinical trial of K

treatments with instantaneous, binary responses. Then, we start with an urn containing
K + 1 types of balls. The ball types i = 1, . . . , K correspond to the K
treatments while the balls of type 0 are the so-called immigration balls. The initial
urn composition is given by Z0 = {Z0,0, . . . , ZK,0} while after ω draws it is given by

Zω = {Z0,ω, . . . , ZK,ω}. When a patient is ready to be randomised, a ball is drawn

from the urn. If the ball is an immigration ball (of type 0) then no treatment is

assigned and the ball is replaced together with one ball for each of the K treatments.

This process is repeated until a treatment ball (of type i) is drawn and then the

patient is assigned the corresponding treatment. The response of the treatment is

observed. If it is a success, the ball is replaced and the urn composition is unchanged

and so we let Zω+1 = Zω. However, if the response is a failure, then the ball is not

replaced. The urn composition becomes Zi,ω+1 = Zi,ω − 1 and Zj,ω+1 = Zj,ω, j ≠ i.

The process continues until a suitable stopping rule has been triggered, such as all

patients available have been assigned. The inclusion of the immigration balls is an

important feature of the DL rule as it allows a treatment to not "die out" even if it

has a very small success probability.
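The draw-and-replace steps of the DL rule can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the R program of Section A.2: the function name `simulate_dl`, its arguments and the default initial urn of one ball of each type are hypothetical choices, and responses are assumed instantaneous.

```python
import random


def simulate_dl(p, n, z0=None, rng=None):
    """Run one trial under the DL rule and return the patients per treatment.

    p  -- success probabilities of the K treatments (used to simulate responses)
    z0 -- initial urn [immigration, treatment 1, ..., treatment K];
          assumed default: one ball of each type
    """
    rng = rng or random.Random()
    k = len(p)
    urn = list(z0) if z0 is not None else [1] * (k + 1)
    counts = [0] * k
    for _ in range(n):
        while True:
            ball = rng.choices(range(k + 1), weights=urn)[0]
            if ball != 0:
                break
            # Immigration ball: it is replaced and one ball per treatment is added.
            for i in range(1, k + 1):
                urn[i] += 1
        counts[ball - 1] += 1
        success = rng.random() < p[ball - 1]
        if not success:
            urn[ball] -= 1  # failure: the drawn treatment ball is not replaced
    return counts


rng = random.Random(2015)
print(simulate_dl((0.8, 0.4, 0.2), n=100, rng=rng))
```

The immigration ball is never removed, so the total weight passed to `rng.choices` stays positive even when every treatment ball has been dropped.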

The design proposed above is a discrete time process. The technique of embedding
an urn model in a continuous time birth and death process, proposed by Ivanova
and Flournoy (2001), was used by Ivanova (2003) to obtain some useful statistical
properties of this design. The limiting allocation to treatment i = 1, . . . , K is given
by

\[
\frac{N_i(n)}{n} \to \frac{1/q_i}{1/q_1 + \cdots + 1/q_K} \tag{2}
\]

almost surely as n → ∞. Note that when K = 2 this limiting proportion is equal

to the allocation of the RPW rule given in (1). Thus, so far we have two RAR

procedures that both target urn allocation.

Ivanova (2003) showed that

\[
n^{1/2}\left(\frac{N_1(n)}{n} - \frac{q_2}{q_1 + q_2}\right) \to N(0, v),
\]

where

\[
v = \frac{q_1 q_2\,(p_1 + p_2)}{(q_1 + q_2)^3}.
\]

Hu et al. (2006) then demonstrated that the DL attains the lower bound on this

asymptotic variance, unlike the RPW rule. Thus, it could be said that although

both the rules so far target the same allocation proportion, the DL has a theoretical

advantage as it is able to obtain the minimum variance. In fact the overall variability

of this rule is known to be lower than that of the RPW rule, as shown by Ivanova

(2003) and Hu and Rosenberger (2003).
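The two asymptotic variances can be compared numerically. The sketch below evaluates v = q1 q2 (5 − 2(q1 + q2)) / ((2(q1 + q2) − 1)(q1 + q2)²) for the RPW rule and v = q1 q2 (p1 + p2) / (q1 + q2)³ for the DL rule. The function names are hypothetical, and the RPW formula is applied only where it is stated to hold, i.e. for q1 + q2 > 1/2.

```python
def rpw_asymptotic_variance(p1, p2):
    """Asymptotic variance of the RPW allocation proportion (requires q1 + q2 > 1/2)."""
    q1, q2 = 1 - p1, 1 - p2
    s = q1 + q2
    if s <= 0.5:
        raise ValueError("formula applies only when q1 + q2 > 1/2")
    return q1 * q2 * (5 - 2 * s) / ((2 * s - 1) * s ** 2)


def dl_asymptotic_variance(p1, p2):
    """Asymptotic variance of the DL allocation proportion."""
    q1, q2 = 1 - p1, 1 - p2
    return q1 * q2 * (p1 + p2) / (q1 + q2) ** 3


for p1, p2 in [(0.6, 0.4), (0.6, 0.2), (0.4, 0.2)]:
    print((p1, p2),
          "RPW:", round(rpw_asymptotic_variance(p1, p2), 3),
          "DL:", round(dl_asymptotic_variance(p1, p2), 3))
```

For (p1, p2) = (0.6, 0.4) these formulas give v = 0.72 for the RPW rule and v = 0.24 for the DL rule, which correspond to approximate standard deviations (v/n)^{1/2} of 0.085 and 0.049 at n = 100.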

Section A.2 reports an R program that was used to simulate the DL rule.

2.2.2 Allowing delayed responses

Zhang et al. (2007) extended the DL rule in order to study delayed responses. Such

a design is called the generalised drop-the-loser (GDL) rule. Sun et al. (2007) then

extended this rule to K > 2 treatments and here we will deal with this extension.

Similarly to the DL rule, we start with K + 1 types of balls. Balls of type 0 are

immigration balls and balls of type i = 1, . . . , K are balls corresponding to the

treatments. The initial urn composition is given by $Z_0 = \{Z_{0,0}, \ldots, Z_{0,K}\}$ while the
urn composition after ω draws have been made is given by $Z_\omega = \{Z_{\omega,0}, \ldots, Z_{\omega,K}\}$.
We then let $Z^+_{\omega,k} = \max(0, Z_{\omega,k})$, $k = 0, \ldots, K$, and $Z^+_\omega = \{Z^+_{\omega,0}, \ldots, Z^+_{\omega,K}\}$. This step
is required as we now allow the urn to have a fractional or negative number of balls.
Then, the probability of selecting a ball of type i is

\[
\frac{Z^+_{\omega,i}}{\sum_{c=0}^{K} Z^+_{\omega,c}}.
\]

If the ball selected is of type 0 (i.e. an immigration ball) then no treatment is as-

signed and the ball is returned to the urn together with ai, i = 1, . . . , K treatment

type balls. If a treatment type ball is drawn then the subject is assigned the cor-

responding treatment i. Since we wish to allow delayed responses, the ball is not

replaced immediately. Instead we continue to allocate treatments until a response

is available. Once we obtain the response, we alter the urn by adding Dω,i > 0 balls

to the urn if the treatment was a success, leaving the urn unchanged otherwise. We

continue until a suitable stopping criterion has been reached. This design reduces to

the DL urn when ai = 1 and Dω,i = 1.
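A minimal Python sketch of the GDL urn with delayed responses is given below. Several modelling choices are assumptions made for the illustration and are not taken from the source or from the R program of Section A.3: the delay is a fixed number of subsequently randomised patients, one ball is returned after each success (so that with no delay and a_i = 1 the urn updates match the DL rule of Section 2.2.1), and the name `simulate_gdl` and its arguments are hypothetical.

```python
import random
from collections import deque


def simulate_gdl(p, n, delay=0, a=None, z0=None, rng=None):
    """Run one GDL trial with a fixed response delay; return patients per treatment."""
    rng = rng or random.Random()
    k = len(p)
    a = list(a) if a is not None else [1.0] * k
    urn = [float(z) for z in z0] if z0 is not None else [1.0] * (k + 1)
    counts = [0] * k
    pending = deque()  # entries: (patient index at which response is known, ball type, success)
    for j in range(n):
        # Update the urn with every response that has become available.
        while pending and pending[0][0] <= j:
            _, ball_type, was_success = pending.popleft()
            if was_success:
                urn[ball_type] += 1.0  # assumed D = 1: one ball added after a success
        while True:
            # Negative or zero counts receive zero selection probability.
            weights = [max(0.0, z) for z in urn]
            ball = rng.choices(range(k + 1), weights=weights)[0]
            if ball != 0:
                break
            # Immigration ball: add a_i balls of each treatment type i.
            for i in range(1, k + 1):
                urn[i] += a[i - 1]
        urn[ball] -= 1.0  # the treatment ball is withheld until the response arrives
        counts[ball - 1] += 1
        success = rng.random() < p[ball - 1]
        pending.append((j + 1 + delay, ball, success))
    return counts


rng = random.Random(2015)
print("no delay:      ", simulate_gdl((0.8, 0.4), n=100, delay=0, rng=rng))
print("delay of 10:   ", simulate_gdl((0.8, 0.4), n=100, delay=10, rng=rng))
```

Because the delay is fixed, responses become available in the order in which patients were assigned, so a simple queue is enough. Random delays would need a priority queue instead.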

2.2.3 Targeting an alternative allocation proportion

By embedding the GDL rule in a continuous time process, Zhang et al. (2007) have

shown that the asymptotic allocation proportion of the GDL rule is given by

\[
\frac{N_i(n)}{n} \to \frac{a_i/q_i}{a_1/q_1 + \cdots + a_K/q_K} \tag{3}
\]

almost surely as n → ∞. It can be seen that if ai = 1, i = 1, . . . , K then the

asymptotic allocation is equivalent to urn allocation, seen in (2).

By carefully choosing the value of ai, we can alter (3) so that the rule targets a

different allocation proportion. For example, for K treatments with success
probabilities pi the allocation proportion

\[
\frac{N_i(n)}{n} \to \frac{\sqrt{p_i}}{\sqrt{p_1} + \cdots + \sqrt{p_K}}, \quad i = 1, \ldots, K \tag{4}
\]

is of a particular interest. Rosenberger et al. (2001) have shown that this allocation

proportion minimises the expected number of treatment failures, assuming a fixed

variance of the estimator for the treatment difference. We will refer to this allocation

proportion as RSIHR allocation, named after the initials of the authors. We can

target this allocation by letting

\[
a_i = C\,\frac{\sqrt{\hat{p}_i}}{\sqrt{\hat{p}_1} + \cdots + \sqrt{\hat{p}_K}}, \quad i = 1, \ldots, K \tag{5}
\]

where C is a constant and p̂i is the current estimate of pi based on the responses so

far. Also let Dω,i = 0 so that the balls will only be added when an immigration ball

is selected. This urn skews the allocation by adding more balls corresponding to

the superior treatment when an immigration ball is chosen. Simulations performed

by Zhang et al. (2007) have shown that there is no significant difference among the

different choices of C they investigated so we usually let C = 2.

In practice, we can obtain the estimate p̂i in (5) by

\[
\hat{p}_i = \frac{(\text{number of observed successes on treatment } i) + 1}{(\text{total number of observed outcomes on treatment } i) + 2}. \tag{6}
\]
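The estimate of the success probabilities and the resulting numbers of balls added with each immigration ball can be computed as in the sketch below. The function names are hypothetical, and the default C = 2 follows the choice mentioned above.

```python
from math import sqrt


def estimate_success_probs(successes, totals):
    """Estimate p_i as (successes + 1) / (outcomes + 2) for each treatment."""
    return [(s + 1) / (t + 2) for s, t in zip(successes, totals)]


def rsihr_immigration_balls(p_hat, c=2.0):
    """Balls a_i added per immigration draw when targeting RSIHR allocation."""
    roots = [sqrt(ph) for ph in p_hat]
    total = sum(roots)
    return [c * r / total for r in roots]


# Example: 8 successes out of 10 on treatment 1, 3 out of 10 on treatment 2.
p_hat = estimate_success_probs([8, 3], [10, 10])
print("estimates:", [round(x, 3) for x in p_hat])
print("balls added:", [round(x, 3) for x in rsihr_immigration_balls(p_hat)])
```

In this example the estimates are 0.75 and 1/3, giving a_1 = 1.2 and a_2 = 0.8. Before any responses are observed, every estimate equals 1/2 and the a_i are all equal.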

We now compare the GDL rule targeting urn allocation proportion (equivalent

to DL rule) to the GDL rule targeting RSIHR allocation. We also investigate two

choices of initial urn composition. For each combination of target allocation, initial

urn composition and pi we run the rule 5, 000 times. The mean allocation proportion

to treatment i = 1 was obtained as well as the standard deviation of this allocation.

The results of this simulation are given in Table 2 with the initial urn composition

given in the brackets in the column name. Section A.3 reports an R program that

was used to simulate the GDL rule.

When p1 = p2 all the rules seem to have a similar mean allocation proportion

but the standard deviation is lower for the GDL rules targeting RSIHR allocation.
When p1 ≠ p2 the GDL rule targeting RSIHR on average assigns fewer patients to the

superior treatment than the standard DL rule. However, the rule targeting RSIHR

is much less variable which in turn leads to an increase in power. There is little

difference between initial urn compositions in terms of the variability.

                       Urn allocation                RSIHR allocation
(p1, p2)     n     GDL(2,2,2)    GDL(5,5,5)    GDL(2,2,2)    GDL(5,5,5)
(0.8, 0.8)   100   0.50 (0.06)   0.50 (0.05)   0.50 (0.02)   0.50 (0.02)
(0.8, 0.6)   100   0.60 (0.05)   0.57 (0.05)   0.53 (0.02)   0.53 (0.02)
(0.8, 0.4)   100   0.68 (0.04)   0.63 (0.04)   0.58 (0.03)   0.57 (0.03)
(0.8, 0.2)   100   0.73 (0.03)   0.68 (0.03)   0.64 (0.04)   0.63 (0.04)
(0.6, 0.6)   100   0.50 (0.05)   0.50 (0.04)   0.50 (0.03)   0.50 (0.03)
(0.6, 0.4)   100   0.58 (0.04)   0.56 (0.04)   0.54 (0.03)   0.54 (0.03)
(0.6, 0.2)   100   0.64 (0.04)   0.62 (0.03)   0.61 (0.04)   0.60 (0.04)
(0.4, 0.4)   100   0.50 (0.04)   0.50 (0.04)   0.50 (0.04)   0.50 (0.04)
(0.4, 0.2)   100   0.56 (0.03)   0.56 (0.03)   0.57 (0.04)   0.56 (0.04)
(0.2, 0.2)   100   0.50 (0.03)   0.50 (0.03)   0.50 (0.05)   0.50 (0.05)

Table 2: Allocation proportion of the GDL rules for different choices of the initial
urn composition and target allocation. This simulation used 5,000 replications.

2.2.4 Incorporating covariates

Bandyopadhyay et al. (2009) proposed an extension of the DL rule to allow in-

corporating covariates within treatments. For each sequentially entering patient

j = 1, . . . , n the level of the covariate Uj ∈ {0, . . . , G} is obtained. We use 0 for

the most favourable condition and G for the least favourable one. For example,

if Uj is the initial size of a tumour, then a lower category represents a favourable

condition to treat i.e. a smaller tumour. Let π0 < π1 < · · · < πG with πG = 1 be a

set of probabilities representing the probability of success under the corresponding

grade Uj. In practice, the probabilities πk, k = 0, . . . , G may be unknown and so

we may use (k + 1)/(G + 1)

or another suitable function to obtain an estimate of the success

probabilities.

As before, we start the urn with the composition Z0 = {Z0,0, . . . , Z0,K}.
We draw a ball from the urn. If the ball drawn is of type 0 then no treatment is

assigned and we return the ball together with K balls, one ball for each treatment. If

a treatment ball is drawn, then the patient is assigned the corresponding treatment.

We note the grade k of this patient and the response is observed. If the response is a

success, then we replace the ball with the probability πk. Otherwise, if the response

is a failure then we replace the ball with the probability 1 − πG−k.

However, treatments are likely to have different success probabilities under different
covariate grades. That is, a given treatment is more likely to be successful when

the corresponding patient has the grade 0 than when the grade is G. Therefore, we

determine the success probability of a treatment by

\[ P(Z_j = 1 \mid i, k) = a^{U_j} p_{i,j} \]

where Zj indicates success or failure for the jth patient, pi,j is the success probability for the jth patient when receiving treatment i and a ∈ (0, 1) is the so-called prognostic factor index, which can be estimated. Note that this definition extends the one originally proposed by Bandyopadhyay et al. (2009) to any number of treatments, as the authors only considered a trial with K = 2 treatments.
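The urn scheme above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the R program of Section A.4: grades are drawn uniformly from {0, . . . , G}, responses are instantaneous, πk = (k + 1)/(G + 1), and the function name `simulate_dlc` and its default arguments are our own.

```python
import random

def simulate_dlc(p, n, G=2, a=0.9, z0=None, seed=0):
    """Simulate one trial under the DLC rule as described in the text.

    p  : success probabilities p_i of the K treatments at grade 0
    n  : number of patients to assign
    G  : largest covariate grade (grades are 0, ..., G)
    a  : prognostic factor index in (0, 1)
    z0 : initial urn [type-0 balls, treatment 1 balls, ..., treatment K balls]
    Returns the number of patients assigned to each treatment.
    """
    rng = random.Random(seed)
    K = len(p)
    urn = list(z0) if z0 is not None else [1] * (K + 1)
    # Assumption: pi_k = (k + 1) / (G + 1), so that pi_G = 1.
    pi = [(k + 1) / (G + 1) for k in range(G + 1)]
    counts = [0] * K
    for _ in range(n):
        # Draw until a treatment ball appears; a type-0 ball is returned
        # together with one extra ball for each treatment.
        while True:
            ball = rng.choices(range(K + 1), weights=urn)[0]
            if ball != 0:
                break
            for i in range(1, K + 1):
                urn[i] += 1
        treatment = ball - 1
        counts[treatment] += 1
        grade = rng.randint(0, G)  # assumption: grades are uniform
        success = rng.random() < (a ** grade) * p[treatment]
        # The drawn ball is replaced with probability pi_k after a success
        # and with probability 1 - pi_{G-k} after a failure.
        replace_prob = pi[grade] if success else 1 - pi[G - grade]
        if rng.random() >= replace_prob:
            urn[ball] -= 1
    return counts
```

For example, `simulate_dlc([0.8, 0.4], n=100)` returns the two allocation counts for a single simulated trial; repeating the call over many seeds gives an estimate of the allocation proportion.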

The DLC design has some strong disadvantages. Defining the number of grades G can be troublesome. Using the tumour size example above, we may be able to define G grades that tumours fall into, depending on their size. However, it might also be possible to make the grade boundaries narrower, increasing the number of grades. Grades of equal width might also not always be ideal. The number of grades is likely to have an effect on the allocation proportion and so needs to be chosen carefully.

Similarly, not all covariates can be split into grades. Clinical trials are often balanced by institution. In such a case, we might be unable to rank institutions in terms of grades. Even if such a grading were possible, the grades would be likely to have very similar probabilities πk. Finally, the DLC is not able to incorporate multiple covariates.

Section A.4 reports an R program that can be used to simulate the DLC rule.

In the next section we define a randomisation procedure that allows a much more

flexible incorporation of covariates.

2.3 Odds-ratio based designs (ORBD)

2.3.1 K = 2 treatments design


Rosenberger et al. (2001) proposed the following way to allow covariate balance

within treatments. Considering two treatments i = 1, 2, let Tj be the treatment

indicator (Tj = 1 if treatment is 1 and Tj = 0 if treatment is 2) for the jth patient

with j = 1, . . . , n and let zj be the covariate information for the given patient. We

then define the standard logistic regression model

\[ \operatorname{logit}(p_j) = \alpha + \beta T_j + z_j'\gamma + T_j z_j'\delta \tag{7} \]

where pj is the probability of success for the jth patient, α is the global mean, β

is the treatment main effect, γ is the vector of covariate main effects and δ is the

vector of treatment-covariate interactions. Throughout this dissertation we used a

generalised linear model (GLM) to fit this regression model with a logit link function.

It is also possible to consider fitting (7) using a GLM with a probit link function or

another suitable choice.

The design works in the following way. Patients are assigned using another randomisation scheme (e.g. block randomisation) until the regression equation in (7) is obtainable using the data for all patients so far, i.e. until all maximum likelihood estimates are available. Then, the covariate-adjusted odds ratio is given by θ = exp(β̂ + z′j+1δ̂), where β̂ is the current estimate of β, δ̂ is the current estimate of δ and zj+1 is the covariate information for the (j + 1)th patient. Since θ ∈ (0,∞), we need to transform it to (0, 1) in order to represent a probability. We use the transformation f(θ) = 1/(1 + θ). We thus assign patients to treatment i = 1 with the probability

\[ \frac{1}{1 + \exp(\hat{\beta} + z_{j+1}'\hat{\delta})}. \]

This design has many advantages over the DLC design seen in Section 2.2.4. Firstly,

we no longer need to order the covariates into grades as the logistic model allows us

to have covariates that are continuous. We can also have categorical or binary covari-

ates. The logistic model in (7) can be extended to incorporate multiple covariates

as well as interactions between them.

Rosenberger et al. (2001) obtained the limiting allocation to treatment 1 to be

\[ \frac{N_1(n)}{n} \to \frac{1}{1 + \exp(\beta + z_0'\delta)}, \tag{8} \]

where z0 is a fixed vector of covariates. Unfortunately, the ORBD procedure is only able to target this allocation proportion.

It is worth noting that the logistic model in (7) can be modified to not take the

covariates into account. Then, we instead build the regression model

logit(pj) = α + βTj

and assign patients to treatment i = 1 with the probability

\[ \frac{1}{1 + \exp(\hat{\beta})}. \]

2.3.2 K > 2 treatments design

Atkinson and Biswas (2013) consider the extension of the two treatment ORBD

model without covariates to three treatments. Let i = 1, 2, 3 be the treatments

with the respective unknown success and failure probabilities pi and qi = 1 − pi.

As before, we assign patients using another randomisation scheme to one of the i = 1, 2, 3 treatments until the logistic model can be estimated. Once the logistic model

\[ \operatorname{logit}(p_j) = \alpha + \beta T_j \]

can be built on the data for all patients so far, we assign patients to treatments i = 1, 2, 3 with the respective probabilities

\[ \frac{1}{1 + \exp(\hat{\beta}_2 + \hat{\beta}_3)}, \qquad \frac{\exp(\hat{\beta}_2)}{1 + \exp(\hat{\beta}_2 + \hat{\beta}_3)}, \qquad \frac{\exp(\hat{\beta}_3)}{1 + \exp(\hat{\beta}_2 + \hat{\beta}_3)}, \]

where β̂2 and β̂3 are the estimates of β from the logistic model. We observe the response of the patient to the treatment and update the logistic model accordingly.

We may also extend this K = 3 design to incorporate covariates. Such an extension is not reported in the literature. That is, we now use the same logistic regression model as in (7) with all terms defined as previously. We assign patients using complete randomisation until the regression model is estimable. We then assign patients to treatments i = 1, 2, 3 with the respective probabilities

\[ \frac{1}{1 + \exp(\hat{\beta}_2 + \hat{\beta}_3 + z_{j+1}'\hat{\delta})}, \qquad \frac{\exp(\hat{\beta}_2)}{1 + \exp(\hat{\beta}_2 + \hat{\beta}_3 + z_{j+1}'\hat{\delta})}, \qquad \frac{\exp(\hat{\beta}_3)}{1 + \exp(\hat{\beta}_2 + \hat{\beta}_3 + z_{j+1}'\hat{\delta})}. \]

We can further extend the model above to clinical trials with K > 3 treatments in a similar way.

Thus, the ORBD gives a randomisation scheme with much flexibility. We are able to incorporate multiple covariates of different types (e.g. continuous, categorical, binary) and can even include interactions between covariates. Section A.2 reports an R program that was used to simulate the ORBD rule incorporating covariates.

The ORBD is also able to incorporate delayed responses. In such a case, the

logistic model uses the data for all patients so far and the model is updated whenever

a response is obtained.

However, there are also some drawbacks of the ORBD. Firstly, we are only able to

target one allocation proportion, namely (8). Also, the design is relatively variable

when compared to the RPW and DL rules. This is mainly caused by the use of

the logistic model and the variability associated with each estimate in the model.

Because a new model is built for each patient, the variances for all these estimates

add up and this results in a highly variable procedure overall.

2.4 Doubly adaptive biased coin design (DBCD)

2.4.1 K = 2 treatments design

With the exception of the GDL rule, all designs so far have been able to target

only one allocation proportion. Eisele (1994) and Eisele and Woodroofe (1995)

introduced the doubly adaptive biased-coin design (DBCD) which overcomes this

problem by allowing the target allocation proportion y(p1, p2) to be specified.

The design relies heavily on the function g(x, y) from [0, 1]² to [0, 1]. This function takes the current allocation proportion x and the target allocation proportion y and returns the probability of assigning the next patient to treatment 1. Selection of g is often problematic due to the very restrictive rules it must follow, as defined by Eisele (1994). In fact, Melfi et al. (2001) pointed out that the original choice of g violates one of these rules. Hu and Zhang (2004) propose a more relaxed set of conditions:

• g is jointly continuous


• g(x, x) = x

• g(x, y) is strictly decreasing in x and strictly increasing in y

• g has bounded derivatives in x and y.

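Hu and Zhang (2004) chose g(x, y) to be

\[ g(x, y) = \begin{cases} 1 & \text{if } x = 0, \\[4pt] \dfrac{y (y/x)^{\alpha}}{y (y/x)^{\alpha} + (1-y)\bigl((1-y)/(1-x)\bigr)^{\alpha}} & \text{if } 0 < x < 1, \\[4pt] 0 & \text{if } x = 1, \end{cases} \]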

where α ≥ 0 is a parameter to be chosen which controls the randomness of the procedure. When α = 0, g(x, y) = y and the design becomes the adaptive random design. This design, proposed by Rosenberger et al. (2001), assigns a patient to treatment 1 with probability equal to the current estimate of the target allocation and has high variability. As we increase α, we obtain a design that is less variable but more deterministic, meaning selection bias might be a problem. As α → ∞, we obtain a design whose variance is minimised but which is completely predictable. Thus, in practice α needs to be chosen carefully between these two extremes.
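The allocation function is short enough to write out directly. The following Python sketch implements the piecewise definition of g(x, y) given above; the function name `dbcd_g` is our own and y is assumed to lie strictly between 0 and 1.

```python
def dbcd_g(x, y, alpha):
    """Hu and Zhang (2004) allocation function for K = 2 treatments.

    x     : current allocation proportion to treatment 1
    y     : target allocation proportion to treatment 1, assumed in (0, 1)
    alpha : non-negative parameter controlling the randomness
    """
    if x <= 0.0:
        return 1.0
    if x >= 1.0:
        return 0.0
    num = y * (y / x) ** alpha
    den = num + (1.0 - y) * ((1.0 - y) / (1.0 - x)) ** alpha
    return num / den
```

With α = 0 the function returns y whatever the value of x, and for x below the target it returns a value above y that moves closer to 1 as α grows, which is the behaviour described in the paragraph above.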

Once g(x, y) has been chosen, the design is as follows. We start by assigning n0 patients to each treatment. This can be done using any type of randomisation, but throughout this dissertation we will use complete randomisation. When m = 2n0 patients have been randomised, for each j = m, . . . , n − 1 we obtain the current estimate of pi, i = 1, 2. We may for example use (6), as was used for the GDL. We call this estimate p̂i and we also obtain the estimate of qi as q̂i = 1 − p̂i.

We then assign the (j + 1)th patient to treatment i = 1 with the probability

\[ g\left(\frac{N_1(j)}{j},\; y(\hat{p}_1, \hat{p}_2)\right), \]

where y(p̂1, p̂2) is the current estimate of the target allocation proportion using p̂i, Ni(j) is the current number of patients assigned to treatment i and j is the number of patients assigned so far. For example, if we wish to target the urn allocation given in (1), we let

\[ y(p_1, p_2) = \frac{1/(1 - p_1)}{1/(1 - p_1) + 1/(1 - p_2)}. \]

The rule works by skewing the probability of assignment towards the treatment that is furthest below its target allocation. We continue to assign patients until all patients have been assigned or until a suitable stopping rule has been triggered.
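The steps of the procedure can be put together as a small simulation. The sketch below is a Python illustration and not the R program of Section A.6. It makes several assumptions: the first 2n0 patients are split exactly n0 to each treatment rather than by complete randomisation, responses are instantaneous, and the estimator (successes + 0.5)/(Ni + 1) stands in for (6), whose exact form is not restated here. All function names are our own.

```python
import random

def urn_target(p1, p2):
    """Urn allocation target for treatment 1."""
    return (1 / (1 - p1)) / (1 / (1 - p1) + 1 / (1 - p2))

def dbcd_g(x, y, alpha):
    """Hu and Zhang (2004) allocation function for K = 2."""
    if x <= 0.0:
        return 1.0
    if x >= 1.0:
        return 0.0
    num = y * (y / x) ** alpha
    return num / (num + (1 - y) * ((1 - y) / (1 - x)) ** alpha)

def simulate_dbcd(p1, p2, n, n0=2, alpha=2.0, seed=0):
    """Return the proportion of n patients allocated to treatment 1."""
    rng = random.Random(seed)
    p = (p1, p2)
    assigned = [n0, n0]  # assumption: exactly n0 initial patients per arm
    successes = [sum(rng.random() < p[i] for _ in range(n0)) for i in range(2)]
    for j in range(2 * n0, n):
        # Assumed estimator, kept strictly inside (0, 1).
        est = [(successes[i] + 0.5) / (assigned[i] + 1) for i in range(2)]
        y = urn_target(est[0], est[1])
        prob1 = dbcd_g(assigned[0] / j, y, alpha)
        i = 0 if rng.random() < prob1 else 1
        assigned[i] += 1
        successes[i] += int(rng.random() < p[i])
    return assigned[0] / n
```

Averaging `simulate_dbcd(0.8, 0.4, 100, seed=s)` over many seeds gives a simulated allocation proportion for that choice of (p1, p2); because of the assumptions listed, the values need not match a simulation based on (6) exactly.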


We now compare the DBCD rule under different target allocations and different choices of α. We considered α = 2 and α = 4 and the choice of this parameter is given as DBCD(α). We compare urn allocation in (1) to RSIHR allocation in (4). For all of the rules we set n0 = 2. We obtained the allocation proportion to i = 1 and its standard deviation using a similar simulation as was used for the RPW and GDL rules and the results are shown in Table 3. In general, there is no significant difference between the mean allocation proportions when p1 = p2. However, the allocation proportion is much more variable for the DBCD rule targeting urn allocation when the success probabilities are high. When p1 ≠ p2, DBCD targeting urn allocation assigns more patients to the superior treatment than DBCD targeting RSIHR allocation in all cases except (0.4, 0.2). This has a trade-off, as the rules targeting urn allocation generally have a higher standard deviation, which may translate to a loss in power. This trade-off between a more skewed allocation proportion and its variance has been seen in all simulations performed so far and will be fully investigated in Chapter 3. This simulation has also shown that there is no significant difference between rules with α = 2 and α = 4 in terms of mean allocation proportion, but the standard deviation of the allocation proportion is slightly lower when α = 4.

Section A.6 reports an R program that was used to simulate the DBCD rule.

Hu and Rosenberger (2003) demonstrated that for the DBCD rule

\[ n^{1/2}\left(\frac{N_1(n)}{n} - y(p_1, p_2)\right) \to N(0, v) \]

in distribution, where

\[ v = \frac{q_1 q_2 \bigl((1 + 2\alpha)(p_1 + p_2) + 2\bigr)}{(1 + 2\alpha)(q_1 + q_2)^3}. \]

Hu et al. (2006) then showed that the DBCD rule does not obtain the lower bound on

this asymptotic variance. Thus, we could say that the DL has a theoretical advantage

over the DBCD targeting urn allocation as it is able to obtain its minimum variance.
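The expression for v is easy to evaluate numerically, which gives a rough check on the simulated standard deviations. The Python sketch below computes v and the implied large-sample standard deviation (v/n)^{1/2} of the allocation proportion; the function names are our own and the approximation is only asymptotic.

```python
import math

def dbcd_asymptotic_variance(p1, p2, alpha):
    """Asymptotic variance v of the DBCD rule from the formula above."""
    q1, q2 = 1 - p1, 1 - p2
    numerator = q1 * q2 * ((1 + 2 * alpha) * (p1 + p2) + 2)
    denominator = (1 + 2 * alpha) * (q1 + q2) ** 3
    return numerator / denominator

def approx_sd(p1, p2, alpha, n):
    """Large-sample approximation to the SD of N1(n)/n."""
    return math.sqrt(dbcd_asymptotic_variance(p1, p2, alpha) / n)
```

For p1 = p2 = 0.8 and α = 2 this gives v = 1.25, so with n = 100 the approximate standard deviation is about 0.11, close to the simulated value of 0.10 reported for DBCD(2) targeting urn allocation in Table 3. Increasing α to 4 reduces v slightly, in line with the simulation.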

2.4.2 K > 2 treatments design

The DBCD design can be generalised in order to allow K > 2 treatments i = 1, . . . , K, as demonstrated by Hu and Zhang (2004). Let v = {v1, . . . , vK} be the vector of current allocation proportions for each treatment out of the j patients randomised so far. We define g(x, y) = {g1(x, y), . . . , gK(x, y)}, with sum{x} = 1 and sum{y} = 1, to be a vector of functions from [0, 1]^K × [0, 1]^K to [0, 1]^K with the conditions:

• g(v, v) = v and g(x, y) − g(x, v) → 0 as y → v.

• For every i,

\[ \frac{g_i(x, v) - g_i(v, v)}{x_i - v_i} \le \lambda_0 \quad \text{for all } x_i > v_i, \]

where 0 ≤ λ0 < 1 is a constant.

• g(x, y) is strictly decreasing in x and strictly increasing in y.

• g(x, y) has bounded derivatives in x and y.

                 Target: Urn             Target: RSIHR
(p1, p2)   n     DBCD(2)     DBCD(4)     DBCD(2)     DBCD(4)
(0.8,0.8)  100   0.50(0.10)  0.50(0.10)  0.50(0.03)  0.50(0.02)
(0.8,0.6)  100   0.66(0.08)  0.66(0.07)  0.54(0.03)  0.54(0.02)
(0.8,0.4)  100   0.74(0.06)  0.74(0.05)  0.59(0.04)  0.59(0.03)
(0.8,0.2)  100   0.79(0.05)  0.79(0.04)  0.67(0.04)  0.67(0.04)
(0.6,0.6)  100   0.50(0.07)  0.50(0.06)  0.50(0.03)  0.50(0.03)
(0.6,0.4)  100   0.60(0.05)  0.60(0.05)  0.56(0.04)  0.56(0.03)
(0.6,0.2)  100   0.66(0.04)  0.66(0.04)  0.64(0.05)  0.64(0.04)
(0.4,0.4)  100   0.50(0.05)  0.51(0.04)  0.50(0.04)  0.50(0.04)
(0.4,0.2)  100   0.57(0.04)  0.57(0.04)  0.59(0.05)  0.59(0.05)
(0.2,0.2)  100   0.50(0.03)  0.50(0.03)  0.50(0.06)  0.51(0.06)

Table 3: Allocation proportion of the DBCD rule for different choices of the target allocation and α with n0 = 5. This simulation used 5,000 replications.

Hu and Zhang (2004) propose gi(x, y) to be

\[ g_i(x, y) = \frac{\bigl(y_i (y_i/x_i)^{\alpha}\bigr)^{L}}{\sum_{c=1}^{K} \bigl(y_c (y_c/x_c)^{\alpha}\bigr)^{L}}, \qquad i = 1, \ldots, K, \]

where α ≥ 0 and L > 1 are parameters to be chosen. The purpose of α is similar to that in the K = 2 case, while L is a constant that has a reduced influence on g(x, y) for large values.

The allocation of sequential patients is also similar to the K = 2 rule. We start by allocating n0 patients to each treatment using another form of randomisation. For example, when using complete randomisation we would assign patients randomly to treatment i with the probability 1/K. Once m = Kn0 patients have been randomised, we obtain the current estimate p̂i using (6) and we assign the (m + 1)th patient to treatment i with the probability gi(Ni(m)/m, y(p̂1, . . . , p̂K)), where y(p̂1, . . . , p̂K) is the target allocation proportion using the current estimates p̂i for treatment i. That is, if we wish to target the urn allocation we can use, for treatment i,

\[ y_i(p_1, \ldots, p_K) = \frac{1/(1 - p_i)}{\sum_{c=1}^{K} 1/(1 - p_c)}. \]

We update the estimates p̂i and the procedure continues as above until all patients have been assigned or a suitable stopping rule has been triggered.
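The K-treatment urn target is a simple normalisation of the weights 1/(1 − pi). A minimal Python sketch, with a function name of our own choosing and assuming every pi is strictly less than 1, is:

```python
def urn_target_k(p):
    """Urn allocation targets for K treatments.

    p : list of success probabilities, each assumed to be below 1
    Returns a list of target proportions that sums to 1.
    """
    weights = [1.0 / (1.0 - p_i) for p_i in p]
    total = sum(weights)
    return [w / total for w in weights]
```

For p = (0.8, 0.6, 0.4) the weights are 5, 2.5 and 5/3, so the targets are 6/11, 3/11 and 2/11; the better a treatment, the larger its share.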



2.4.3 Incorporating covariates

Baldi Antognini and Zagoraiou (2012) suggest an extension of the DBCD to include covariate information, called the reinforced doubly adaptive biased coin design (RDBCD). We start by defining the function g(x, y, z) with the properties:

• g is decreasing in x and increasing in y for any z ∈ (0, 1)

• g(x, x, z) = x for any z ∈ (0, 1)

• g is decreasing in z if x < y and increasing in z if x > y

• g(x, y, z) = 1− g(1− x, 1− y, z) for any z ∈ (0, 1)

Baldi Antognini and Zagoraiou (2012) suggest

\[ g(x, y, z) = \frac{y (y/x)^{z}}{y (y/x)^{z} + (1 - y)\bigl[(1 - y)/(1 - x)\bigr]^{z}} \]

as a suitable choice of g(x, y, z), with the exponent z having a role similar to that of α before. Note that z in the above function corresponds to the covariate information for the patient we wish to randomise. Since the properties of this function require z ∈ (0, 1), we also need to transform the covariates so that they lie in this range. A transformation might also be needed such that high values of z correspond to a higher value of g than small values of z. We denote such a transformation by H(z).
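As an illustration, the function g(x, y, z) can be coded together with one possible transformation of a raw covariate to (0, 1). The sketch below is in Python; `rdbcd_g` follows the formula above, while `scale_covariate` is a purely hypothetical min-max choice of H and is not taken from Baldi Antognini and Zagoraiou (2012).

```python
def rdbcd_g(x, y, z):
    """Allocation function g(x, y, z) for 0 < x < 1, 0 < y < 1, 0 < z < 1."""
    num = y * (y / x) ** z
    den = num + (1 - y) * ((1 - y) / (1 - x)) ** z
    return num / den

def scale_covariate(value, low, high, eps=1e-6):
    """Hypothetical transformation H: min-max scaling into (0, 1).

    low and high are assumed bounds of the raw covariate; eps keeps the
    result strictly inside the open interval.
    """
    scaled = (value - low) / (high - low)
    return min(max(scaled, eps), 1 - eps)
```

A call such as `rdbcd_g(0.45, 0.6, scale_covariate(30, 0, 100))` then gives the probability of assigning the next patient to treatment 1. Two of the listed properties are easy to confirm numerically: g(x, x, z) = x and g(x, y, z) = 1 − g(1 − x, 1 − y, z).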

The workings of this rule are similar to the K = 2 version of the DBCD rule. We start by assigning n0 patients to each treatment using another randomisation method. Since we wish to balance covariates, a covariate-adaptive rule might be suitable, but throughout this dissertation we use complete randomisation. Once m = 2n0 patients have been assigned, we obtain the estimates p̂i for each treatment using (6). Given the (m + 1)th patient with covariate information zm+1, we assign this patient to treatment i with the probability

\[ g\left(\frac{N_i(m)}{m},\; y(\hat{p}_1, \hat{p}_2),\; z_{m+1}\right). \]

We then update p̂i and assign the next patient using the same method. The rule continues until all patients have been assigned or a suitable stopping rule has been triggered.

The RDBCD has many disadvantages. Firstly, it is only able to deal with a

single covariate. The covariates also need to be defined in such a way that low z is

favourable to treat, as this produces a larger g(x, y, z) value.

The RDBCD procedure suffers from a similar problem to the DLC rule. That is, it requires the covariate to be defined in such a way that a certain value is favourable to treat when compared to another one. This may often not be the case in practice. For example, in many trials involving a number of institutions, we wish to balance the number of patients treated in each institution. This is done in order to reduce the effect of institution on the trial. However, the RDBCD is not able to balance such a covariate as in practice there is often no way of favouring one institution over another.

Finally, the RDBCD only allows K = 2 treatments. Although an extension to

K > 2 might be possible, it is not discussed here. It is also worth noting that

Zhang and Hu (2009) obtained an alternative method of extending the DBCD to

incorporate covariates with K > 2 treatments. Section A.7 reports an R program

that can be used to simulate the RDBCD.

2.5 Efficient randomised adaptive design (ERADE)

2.5.1 Rule definition

Recall that the DL rule has been the only rule that is able to obtain the lower bound

on its asymptotic variance. The RPW and DBCD rules are not able to obtain it,

whilst there is no literature on whether the ORBD and GDL obtain their respective

lower bounds. Although the DL attains this lower bound, it has some disadvantages

such as only being able to target urn allocation. We now define a randomisation

procedure that is able to obtain the lower bound on its asymptotic variance and is

able to target any given allocation proportion.

Hu et al. (2009) proposed the following randomisation procedure. We start by assigning n0 patients to each treatment, as for the DBCD. Once m = 2n0 patients have been assigned and their responses observed, we obtain the estimates p̂i of pi for each treatment i = 1, 2 using (6). We then obtain the value of the target allocation proportion using these estimates, that is y(p̂1, p̂2). For example, if we wish to target urn allocation we use the function

\[ y(p_1, p_2) = \frac{1/(1 - p_1)}{1/(1 - p_1) + 1/(1 - p_2)}, \]

as for the DBCD rule. We then assign the (m + 1)th patient to treatment 1 with the probability

\[ \begin{cases} \alpha\, y(\hat{p}_1, \hat{p}_2) & \text{if } N_1(m)/m > y(\hat{p}_1, \hat{p}_2), \\ y(\hat{p}_1, \hat{p}_2) & \text{if } N_1(m)/m = y(\hat{p}_1, \hat{p}_2), \\ 1 - \alpha + \alpha\, y(\hat{p}_1, \hat{p}_2) & \text{if } N_1(m)/m < y(\hat{p}_1, \hat{p}_2), \end{cases} \]

where 0 ≤ α < 1 is a constant that reflects the degree of randomisation. We continue until a suitable stopping criterion is met.
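The three-case allocation probability and the surrounding procedure can be sketched in Python as follows. This is an illustration under stated assumptions rather than the R program of Section A.8: the first 2n0 patients are split exactly n0 to each treatment, responses are instantaneous, and (successes + 0.5)/(Ni + 1) is used in place of (6). The function names are our own.

```python
import random

def erade_prob(n1, m, y, alpha):
    """Probability of assigning the next patient to treatment 1.

    n1 : patients on treatment 1 so far, m : patients assigned so far,
    y  : current estimate of the target, alpha : constant in [0, 1)
    """
    x = n1 / m
    if x > y:
        return alpha * y
    if x < y:
        return 1 - alpha + alpha * y
    return y

def simulate_erade(p1, p2, n, n0=2, alpha=0.7, seed=0):
    """Return the proportion of n patients allocated to treatment 1."""
    rng = random.Random(seed)
    p = (p1, p2)
    assigned = [n0, n0]  # assumption: exactly n0 initial patients per arm
    successes = [sum(rng.random() < p[i] for _ in range(n0)) for i in range(2)]
    for m in range(2 * n0, n):
        # Assumed estimator, kept strictly inside (0, 1).
        est = [(successes[i] + 0.5) / (assigned[i] + 1) for i in range(2)]
        # Urn allocation target evaluated at the current estimates.
        y = (1 / (1 - est[0])) / (1 / (1 - est[0]) + 1 / (1 - est[1]))
        i = 0 if rng.random() < erade_prob(assigned[0], m, y, alpha) else 1
        assigned[i] += 1
        successes[i] += int(rng.random() < p[i])
    return assigned[0] / n
```

For instance, with a target of 0.5 and α = 0.7, a treatment that is ahead of its target receives the next patient with probability 0.35 and one that is behind with probability 0.65.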

Zhang and Hu (2009) consider an extension of the ERADE to incorporate covariate information, whilst Zhang et al. (2014) extend it to K > 3 treatments. Such extensions will not be considered here and so we will only consider the ERADE for K = 2 trials.

Section A.8 reports an R program that can be used to execute the ERADE rule.

3 Comparing designs

3.1 Introduction

We start by introducing ways in which different designs can be compared. A RAR design aims to assign more patients to the superior treatment and therefore a design with a higher allocation proportion to this treatment will be favourable due to an ethical advantage. We mentioned previously that Melfi and Page (1998) showed that the power of a design is a decreasing function of the variance of the allocation proportion. We can therefore expect a design with a less variable allocation proportion to be more powerful. Thus, to compare RAR designs we will consider (i) allocation proportion (ii) variability of the allocation proportion (iii) failure proportion (iv) power and significance level. In addition, we mentioned that rules that are able to target different allocation proportions are also favourable as they give us more flexibility. These criteria are standard in the literature for comparing RAR designs and were first proposed by Hu and Rosenberger (2003).

Given i = 1, 2 treatments with probabilities of success pi and probabilities of failure qi = 1 − pi, the power of a design is maximised by assigning patients to treatment i = 1 with the proportion

\[ \frac{N_1(n)}{n} = \frac{\sqrt{p_1 q_1}}{\sqrt{p_1 q_1} + \sqrt{p_2 q_2}}. \tag{9} \]

This allocation proportion is known as Neyman allocation and the closer a design is to this allocation, the higher the power in general. Unfortunately, using the Neyman allocation has an ethical disadvantage as it can assign more patients to the inferior treatment; this happens whenever p1 + p2 > 1, since the treatment with the larger success probability then has the smaller variance piqi. The Neyman allocation is the reason for the trade-off between allocation proportion and its variance we have seen in the simulations so far; more patients assigned to the superior treatment resulted in higher variance which in turn meant lower power. Thus, we can say that a suitable RAR design should strike a balance between maintaining suitable power and having an ethically advantageous allocation proportion.
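To see how the targets differ, the three allocation proportions discussed so far can be evaluated side by side. The Python sketch below uses function names of our own; the RSIHR target is taken to be √p1/(√p1 + √p2), which we assume is the form given in (4).

```python
import math

def neyman_target(p1, p2):
    """Neyman allocation (9): proportion assigned to treatment 1."""
    s1 = math.sqrt(p1 * (1 - p1))
    s2 = math.sqrt(p2 * (1 - p2))
    return s1 / (s1 + s2)

def urn_target(p1, p2):
    """Urn allocation: proportion assigned to treatment 1."""
    q1, q2 = 1 - p1, 1 - p2
    return q2 / (q1 + q2)

def rsihr_target(p1, p2):
    """RSIHR allocation (assumed form of (4))."""
    return math.sqrt(p1) / (math.sqrt(p1) + math.sqrt(p2))
```

For (p1, p2) = (0.8, 0.4) the Neyman target is about 0.45, so the superior treatment receives fewer than half of the patients, whereas the urn target is 0.75 and the RSIHR target is about 0.59.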

The main reason why we wish to assign more patients to the superior treatment is to lower the number of treatment failures. Thus, for a design that assigns more patients to the superior treatment, we expect a lower proportion of treatment failures. Due to (9) we also expect a design that has a lower failure proportion to have lower power. Once again, we wish to balance the ethical advantages of a design with its statistical properties.


Finally, by defining a suitable test, we will be able to approximate the power of

a design. In general, we will fix n such that the power of complete randomisation

under given pi is roughly 0.90. We can then study power of different RAR designs

under various assumptions such as delayed responses and covariates.

3.2 K = 2 treatments

3.2.1 Allocation proportion

We start by investigating the simplest clinical trial: K = 2 treatments design with

instantaneous responses and no covariate information. We will consider the designs:

Complete Randomisation (CR) We include complete randomisation for com-

parison purposes.

RPW Results in Table 1 show that the urn is highly dependent on the choice of ρ. High ρ is highly variable but assigns more patients to the superior treatment. On the other hand, low ρ assigns fewer patients to the superior treatment but has a lower variance. We choose RPW(5, 1) as a sensible value between these two extremes and this is also the choice suggested by Rosenberger (1999).

DL Table 2 has shown us that DL rules with a lower initial number of balls are more variable but assign more patients to the better treatment. We choose the initial urn composition Z0 = {3, 3, 3} as suggested by Ivanova (2003).

ORBD We use the K = 2 logistic regression model mentioned in 2.3.1.

DBCD We consider the DBCD targeting the urn allocation. In Table 3 we saw

that for this allocation increasing α results in lower variability of the allocation

and lower allocation to the superior treatment. We choose α = 2, as suggested

by Hu and Rosenberger (2003). We also let n0 = 2.

ERADE We use ERADE targeting urn allocation, with α = 0.7, as suggested by

Hu et al. (2009)

It is worth noting that the suggested RPW, DL, DBCD and ERADE designs target

urn allocation, while the ORBD targets the allocation (8). Throughout all simu-

lations performed from now on, we will use the above parameters, unless specified

otherwise.

We now perform a simulation to compare the allocation proportions (AP) and failure proportions (FP), and their respective standard deviations (SD), of the rules mentioned above. Each rule was simulated 5,000 times under each choice of pi until n = 100 patients had been assigned to a treatment. We then work out the average allocation to treatment i and its standard deviation across all 5,000 simulations of the rule. We also note the number of treatment failures for each run and obtain the average and standard deviation. The results of this simulation are shown in Table 4. As before, the table only reports the allocation proportion to treatment i = 1, as the allocation to treatment i = 2 can be obtained by subtraction.

(p1, p2)   n     CR                      RPW                     DL
                 AP(SD)      FP(SD)      AP(SD)      FP(SD)      AP(SD)      FP(SD)
(0.8,0.8)  100   0.50(0.05)  0.20(0.04)  0.50(0.11)  0.20(0.04)  0.50(0.05)  0.20(0.04)
(0.8,0.6)  100   0.50(0.05)  0.30(0.05)  0.60(0.09)  0.28(0.05)  0.59(0.05)  0.28(0.04)
(0.8,0.4)  100   0.50(0.05)  0.40(0.05)  0.67(0.07)  0.33(0.05)  0.66(0.04)  0.34(0.05)
(0.8,0.2)  100   0.50(0.05)  0.50(0.05)  0.73(0.06)  0.36(0.06)  0.71(0.03)  0.37(0.05)
(0.6,0.6)  100   0.50(0.05)  0.40(0.05)  0.50(0.08)  0.40(0.05)  0.50(0.05)  0.40(0.05)
(0.6,0.4)  100   0.50(0.05)  0.50(0.05)  0.58(0.07)  0.48(0.05)  0.57(0.04)  0.48(0.05)
(0.6,0.2)  100   0.50(0.05)  0.60(0.05)  0.63(0.06)  0.55(0.06)  0.63(0.03)  0.55(0.05)
(0.4,0.4)  100   0.50(0.05)  0.60(0.05)  0.50(0.06)  0.60(0.05)  0.50(0.04)  0.60(0.05)
(0.4,0.2)  100   0.50(0.05)  0.70(0.05)  0.56(0.05)  0.69(0.05)  0.56(0.03)  0.69(0.05)
(0.2,0.2)  100   0.50(0.05)  0.80(0.04)  0.50(0.04)  0.80(0.04)  0.50(0.03)  0.80(0.04)

(p1, p2)   n     ORBD                    DBCD                    ERADE
                 AP(SD)      FP(SD)      AP(SD)      FP(SD)      AP(SD)      FP(SD)
(0.8,0.8)  100   0.50(0.15)  0.20(0.04)  0.50(0.11)  0.20(0.04)  0.51(0.10)  0.20(0.04)
(0.8,0.6)  100   0.67(0.13)  0.27(0.05)  0.66(0.08)  0.27(0.05)  0.65(0.07)  0.27(0.05)
(0.8,0.4)  100   0.78(0.10)  0.29(0.06)  0.75(0.06)  0.30(0.05)  0.73(0.05)  0.31(0.05)
(0.8,0.2)  100   0.84(0.06)  0.29(0.05)  0.80(0.05)  0.32(0.06)  0.78(0.04)  0.33(0.06)
(0.6,0.6)  100   0.50(0.16)  0.40(0.05)  0.51(0.07)  0.40(0.05)  0.51(0.06)  0.40(0.05)
(0.6,0.4)  100   0.66(0.13)  0.47(0.06)  0.60(0.06)  0.48(0.05)  0.60(0.05)  0.48(0.05)
(0.6,0.2)  100   0.78(0.09)  0.49(0.06)  0.67(0.05)  0.53(0.06)  0.66(0.04)  0.53(0.05)
(0.4,0.4)  100   0.50(0.15)  0.60(0.05)  0.50(0.05)  0.60(0.05)  0.50(0.04)  0.60(0.05)
(0.4,0.2)  100   0.67(0.12)  0.67(0.05)  0.57(0.04)  0.68(0.05)  0.57(0.03)  0.69(0.05)
(0.2,0.2)  100   0.50(0.14)  0.80(0.04)  0.50(0.03)  0.80(0.04)  0.50(0.03)  0.80(0.04)

Table 4: Comparison of allocation proportion (AP) and failure proportion (FP) for some response-adaptive designs targeting urn allocation. The simulation used 5,000 replications.
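The CR columns of Table 4 can be checked without simulation, which is a useful sanity test for any simulation code. Under complete randomisation each patient goes to treatment 1 with probability 1/2 independently, so N1(n) is binomial; each patient also fails independently with probability (q1 + q2)/2, so the number of failures is binomial as well. The Python sketch below, with a function name of our own, evaluates the resulting means and standard deviations.

```python
import math

def cr_reference(p1, p2, n):
    """Exact AP and FP summaries under complete randomisation.

    Returns (AP mean, AP SD, FP mean, FP SD) for a trial of n patients.
    """
    ap_mean = 0.5
    ap_sd = math.sqrt(0.5 * 0.5 / n)
    fp_mean = 0.5 * (1 - p1) + 0.5 * (1 - p2)
    fp_sd = math.sqrt(fp_mean * (1 - fp_mean) / n)
    return ap_mean, ap_sd, fp_mean, fp_sd
```

With n = 100 the allocation proportion has standard deviation 0.05 for every choice of (p1, p2), and for (0.8, 0.8) the failure proportion has mean 0.20 and standard deviation 0.04, in agreement with the CR entries of Table 4.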

We start by considering the case p1 = p2. For all rules, the mean allocation proportion is roughly 0.5, which results in very similar failure proportions. The only significant difference between designs is the standard deviation of the allocation proportion. When the success probability is high, the DL displays the lowest variability, very similar to CR. As we decrease the success probability, the DL rule is actually less variable than complete randomisation. The ORBD is the most variable with a high standard deviation for all choices of pi. The DBCD and ERADE procedures perform similarly. Their behaviour is interesting as they are highly variable for high pi and their variability reduces for small pi. In fact, for small pi both rules have a lower variability than CR and one comparable to that of the DL. We do not see a significant difference in the variability of the failure proportion between designs.

We now consider the case p1 ≠ p2. With the exception of CR, all rules assign more patients to the better treatment. The bigger the difference between p1 and p2, the more patients are assigned to the better treatment. The ORBD design performs the best in this respect, with the highest proportion assigned to the better treatment for all choices of pi. The DBCD and ERADE follow, with similar APs to each other. RPW and DL show very similar allocation proportions to each other, with a maximum difference of 0.02, and have the least skewed allocation among the adaptive rules. We now compare the variability of the allocation proportion for these procedures. The highly skewed allocation proportion of ORBD translates to a very high variability for all choices of pi. The RPW also shows high variability and it is worth noting that although the allocation proportions of RPW and DL were similar, the DL is much less variable. In fact the DL is no more variable than CR for all choices of pi. The DBCD and ERADE designs show a similar pattern to that mentioned above, i.e. they are highly variable for high pi and their variance decreases for lower pi, becoming very similar to the DL rule.

Finally, we compare the failure proportion between the designs. The ORBD displays the lowest failure proportion out of all the rules and this is mostly caused by the highly skewed allocation proportion. We notice that the failure proportion for the RPW and DL is similar, which is most likely caused by the similar allocation proportion. The DL still seems superior due to its less variable allocation proportion. The DBCD and ERADE have a similar failure proportion to each other, which is slightly smaller than the one for the RPW and DL rules. We can say that all the designs succeed in a more ethical allocation as the failure proportion is lower than under CR whenever p1 ≠ p2. The standard deviation of the failure proportion differs insignificantly between the designs.

In Table 2 and Table 3 we compared designs that targeted RSIHR allocation, namely the GDL, DBCD and ERADE rules. We now perform an investigation comparing these rules when targeting RSIHR allocation. Recall that this allocation proportion is of particular importance as it minimises the expected number of treatment failures. The rules chosen are:

GDL In Table 2 we saw that the initial urn composition has little effect on the allocation proportion or its variance. Thus, we choose Z0 = {3, 3, 3}. We also let ai = C√pi/(√p1 + √p2), D = 0 and C = 2 as before.

DBCD We saw that when targeting RSIHR allocation, an increase in α results in

higher allocation to the superior treatment but also a higher variance. We

thus choose α = 2 as before.

ERADE We choose ERADE targeting RSIHR allocation with α = 0.7.

We use a similar method to simulate allocation and failure proportions as was used for the rules targeting urn allocation and the results of this simulation are shown in Table 5. When p1 = p2, all the rules have similar allocation and failure proportions to complete randomisation. However, these rules have a much lower standard deviation than the other designs investigated in Table 4. Interestingly, the variance of all the rules targeting RSIHR allocation seems to increase as we decrease pi, which is the opposite of what happened for the same rules targeting urn allocation. The failure proportion seems to be the same for all the rules.

(p1, p2)   n     GDL                     DBCD                    ERADE
                 AP(SD)      FP(SD)      AP(SD)      FP(SD)      AP(SD)      FP(SD)
(0.8,0.8)  100   0.50(0.02)  0.20(0.04)  0.50(0.03)  0.20(0.04)  0.50(0.02)  0.20(0.04)
(0.8,0.6)  100   0.53(0.02)  0.29(0.04)  0.54(0.03)  0.29(0.04)  0.54(0.02)  0.29(0.04)
(0.8,0.4)  100   0.57(0.03)  0.37(0.04)  0.59(0.04)  0.36(0.04)  0.59(0.03)  0.36(0.04)
(0.8,0.2)  100   0.64(0.04)  0.42(0.04)  0.68(0.05)  0.39(0.04)  0.66(0.04)  0.40(0.04)
(0.6,0.6)  100   0.50(0.03)  0.40(0.05)  0.50(0.03)  0.40(0.05)  0.50(0.02)  0.40(0.05)
(0.6,0.4)  100   0.54(0.03)  0.49(0.05)  0.56(0.04)  0.49(0.05)  0.55(0.03)  0.49(0.05)
(0.6,0.2)  100   0.61(0.04)  0.56(0.05)  0.65(0.06)  0.54(0.05)  0.63(0.04)  0.55(0.05)
(0.4,0.4)  100   0.50(0.04)  0.60(0.05)  0.50(0.04)  0.60(0.05)  0.50(0.03)  0.60(0.05)
(0.4,0.2)  100   0.57(0.04)  0.69(0.05)  0.60(0.06)  0.68(0.05)  0.58(0.05)  0.68(0.05)
(0.2,0.2)  100   0.50(0.05)  0.80(0.04)  0.50(0.07)  0.80(0.04)  0.51(0.05)  0.80(0.04)

Table 5: Comparison of allocation proportion (AP) and failure proportion (FP) for some response-adaptive designs targeting RSIHR allocation. The simulation used 5,000 replications.

We now consider the cases when p1 ≠ p2. We can see that all the rules assign

more patients to the superior treatment, with all three rules having a very similar

allocation proportion. However, this allocation proportion is smaller than for all

the rules considered in Table 4. On the other hand, the rules targeting RSIHR

allocation have a smaller standard deviation. Amongst the three rules, ERADE and

GDL seem to have a very similar variability, with DBCD only slightly more variable.

Overall, we can say that on average rules targeting urn allocation seem to have a

more ethical allocation, whilst the rules targeting RSIHR allocation are less variable.

3.2.2 Inference, significance level and power

Consider a trial with i = 1, 2 treatments with binary responses and corresponding

probabilities of success pi and failure qi = 1 − pi, as defined previously. We may

wish to test the difference between two treatments using the hypothesis

H0 : p1 = p2,

against the two-sided alternative

H1 : p1 ≠ p2.


(p1, p2)      n    CR     RPW    DL     GDL    ORBD   DBCD   DBCD   ERADE  ERADE
                                 Urn    RSIHR         Urn    RSIHR  Urn    RSIHR
Significance Level
(0.8,0.8)    100   0.04   0.04   0.04   0.04   0.04   0.04   0.04   0.04   0.05
(0.6,0.6)    100   0.04   0.05   0.05   0.05   0.05   0.05   0.05   0.05   0.04
(0.4,0.4)    100   0.05   0.04   0.05   0.05   0.05   0.04   0.05   0.04   0.04
(0.2,0.2)    100   0.04   0.04   0.04   0.05   0.04   0.04   0.05   0.04   0.05
Power
(0.8,0.6)    206   0.88   0.85   0.87   0.88   0.83   0.88   0.88   0.88   0.88
(0.8,0.4)     62   0.91   0.88   0.90   0.88   0.82   0.89   0.91   0.90   0.91
(0.8,0.2)     27   0.89   0.85   0.88   0.86   0.79   0.79   0.89   0.84   0.89
(0.6,0.4)    256   0.90   0.89   0.89   0.88   0.87   0.90   0.90   0.90   0.90
(0.6,0.2)     57   0.87   0.86   0.87   0.84   0.80   0.81   0.87   0.83   0.87
(0.4,0.2)    217   0.90   0.89   0.90   0.88   0.86   0.88   0.90   0.88   0.90

Table 6: Simulated power and significance level of various RAR designs for a clinical trial with K = 2 treatments. The results were obtained using a simulation with 5,000 replications.

To test these hypotheses we may use the Wald test used by Hu and Rosenberger (2003), with the test statistic given by

\[
Z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\dfrac{\hat{p}_1 \hat{q}_1}{n_1} + \dfrac{\hat{p}_2 \hat{q}_2}{n_2}}},
\]

where ni is the number of patients assigned to treatment i. Then, Z² is asymptotically chi-squared distributed with 1 degree of freedom.
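The Wald statistic above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `wald_z` and `reject_h0` are hypothetical, and the handling of the degenerate case where the estimated variance is zero is an assumption made here, since the text does not specify it.

```python
import math

# Upper 5% point of the chi-squared distribution with 1 degree of freedom.
CHI2_1DF_CRIT_05 = 3.841


def wald_z(successes1, n1, successes2, n2):
    """Wald statistic Z for H0: p1 = p2, from success counts and arm sizes."""
    p1_hat = successes1 / n1
    p2_hat = successes2 / n2
    var = p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2
    if var == 0:
        # Assumption (not from the text): when both arms show only successes
        # or only failures the variance estimate is zero, so return 0 if the
        # estimates agree and a signed infinity otherwise.
        if p1_hat == p2_hat:
            return 0.0
        return math.copysign(math.inf, p1_hat - p2_hat)
    return (p1_hat - p2_hat) / math.sqrt(var)


def reject_h0(successes1, n1, successes2, n2, crit=CHI2_1DF_CRIT_05):
    """Two-sided test: reject H0 when Z squared exceeds the critical value."""
    z = wald_z(successes1, n1, successes2, n2)
    return z * z > crit
```

For instance, 40 successes out of 50 on treatment 1 and 30 out of 50 on treatment 2 give Z² = 5, which exceeds 3.841, so H0 would be rejected at the 0.05 level.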

Recall that the significance level is the probability of incorrectly rejecting H0 when it is true (type I error), while statistical power is the probability that the test correctly rejects H0 when it is false, that is, one minus the probability of a type II error. We now simulate the significance level and power for all the designs considered in Table 4 and Table 5. For each value of pi, we start by assigning n patients to treatments i = 1, 2 using each of the rules. We repeat this 5,000 times and for each repetition we obtain the value of the test statistic Z. Then, we calculate the proportion of values of Z² that exceed 3.841, which is the critical value of the chi-squared distribution with 1 degree of freedom at the α = 0.05 level of significance. When p1 = p2 this proportion estimates the significance level, while when p1 ≠ p2 it estimates the power. The results of such a simulation can be seen in Table 6.

To ease the analysis we choose n such that the power of complete randomisation is

roughly 0.90 whilst we keep n the same when simulating significance level. For rules

that can target multiple allocation proportions, the line below the rule name gives

the allocation proportion that a given rule targets. We also kept all the parameters

the same as previously.
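The simulation procedure described above can be sketched for the simplest design, complete randomisation. The sketch below is not the code used to produce Table 6: the function name, the seed and the treatment of replications with an empty arm or a zero variance estimate (counted as non-rejections) are assumptions made for illustration, and the adaptive rules themselves are not implemented.

```python
import random


def simulate_cr_rejection_rate(p1, p2, n, reps=5000, crit=3.841, seed=1):
    """Estimate the rejection rate of the Wald test under complete randomisation.

    With p1 == p2 the result estimates the significance level;
    with p1 != p2 it estimates the power.
    """
    rng = random.Random(seed)
    p = (p1, p2)
    rejections = 0
    for _ in range(reps):
        n_arm = [0, 0]
        s_arm = [0, 0]
        for _ in range(n):
            arm = rng.randrange(2)          # fair coin for each patient
            n_arm[arm] += 1
            if rng.random() < p[arm]:       # binary response, immediate
                s_arm[arm] += 1
        if min(n_arm) == 0:
            continue                        # assumption: count as non-rejection
        p_hat = [s_arm[i] / n_arm[i] for i in range(2)]
        var = sum(p_hat[i] * (1 - p_hat[i]) / n_arm[i] for i in range(2))
        if var > 0 and (p_hat[0] - p_hat[1]) ** 2 / var > crit:
            rejections += 1
    return rejections / reps
```

Calling `simulate_cr_rejection_rate(0.6, 0.6, 100)` should give a value near 0.05, and `simulate_cr_rejection_rate(0.8, 0.6, 206)` a value near 0.9, in line with the CR column of Table 6, although the exact figures depend on the seed.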

We start by comparing the significance level of the designs. We see that the

significance level simulated is very close to α = 0.05 that we have used as the

significance level for the test. Thus, we can say that the significance level for all

these procedures is very similar.


It can be seen that CR maintains the highest power of all the randomisation schemes. This is most likely because its allocation proportion is the closest to Neyman allocation, for which power is maximised. However, various designs maintain a very high level of power. We notice that for DBCD and ERADE targeting RSIHR allocation the power matches that of CR. The GDL targeting RSIHR allocation also maintains a very high level of power. The rules targeting urn allocation perform slightly worse, with the RPW resulting in a considerable drop in power for some pi. The ORBD has the lowest power out of all the designs.
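To make the comparison with Neyman allocation concrete, the limiting allocation proportions to treatment 1 can be computed directly. The sketch below assumes the standard two-treatment forms from the literature (Neyman: proportional to √(pi qi); RSIHR: proportional to √pi; urn: proportional to the failure probability of the other treatment); these are assumed to agree with the definitions given earlier in this dissertation, and the function names are hypothetical.

```python
import math


def neyman_allocation(p1, p2):
    """Proportion on treatment 1 that maximises power for fixed n."""
    a = math.sqrt(p1 * (1 - p1))
    b = math.sqrt(p2 * (1 - p2))
    return a / (a + b)


def rsihr_allocation(p1, p2):
    """RSIHR target: proportional to the square root of the success probability."""
    return math.sqrt(p1) / (math.sqrt(p1) + math.sqrt(p2))


def urn_allocation(p1, p2):
    """Urn target: q2 / (q1 + q2), with qi = 1 - pi."""
    q1, q2 = 1 - p1, 1 - p2
    return q2 / (q1 + q2)


for p1, p2 in [(0.8, 0.6), (0.8, 0.2)]:
    print((p1, p2),
          round(neyman_allocation(p1, p2), 3),
          round(rsihr_allocation(p1, p2), 3),
          round(urn_allocation(p1, p2), 3))
```

For (p1, p2) = (0.8, 0.2) the three targets are 0.5, 2/3 and 0.8 respectively, so under these assumed forms the equal allocation of CR coincides with Neyman allocation, the RSIHR target is moderately skewed and the urn target is the most skewed.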

We can see that the rules that were the least variable (GDL, DBCD and ERADE)

also seem to maintain the highest power. We have also previously noticed that the

less variable rules assign fewer patients to the superior treatment. Thus we can say

that there seems to be an inverse relationship between a more ethical allocation

and power. This confirms the simulations previously carried out by Melfi and Page

(1998) and Hu and Rosenberger (2003).

3.3 K = 3 treatments

3.3.1 Allocation proportion

In this section we consider designs with K = 3 treatments. We no longer consider the

RPW and ERADE designs as these have not been extended to K > 2 treatments. A

simulation was performed to investigate the allocation and failure proportions and

the results are shown in Table 7. For now we only consider the rules that target

urn allocation and the ORBD, which targets the allocation given in (8). The approach to this simulation was similar to that in the K = 2 case and we set the rule parameters as previously, with the exception of the DBCD rule, which now also requires the parameter L, set here to L = 1. We start

by considering the DL and DBCD rules that target the urn allocation. For each

rule, the first and second columns give the allocation proportion to the treatment

i = 1 and i = 2 respectively. The allocation proportion for i = 3 is not given as it

is simply obtained by subtracting the other two columns from one. As before, the

simulated standard deviations are given in brackets.

When p1 = p2 = p3, all the rules seem to allocate a similar proportion to all

treatments. This also results in a very comparable failure rate for all rules. However,

the standard deviation of the allocations can differ significantly. When pi is high,

CR and DL rules perform very similarly. As pi decreases, the DL rule is less variable

than CR. The DBCD exhibits an interesting behaviour as it is more variable than

CR for high pi and then it becomes less variable when pi is low. Finally, the ORBD

shows a higher variability than all the other rules.

Now consider the case when the pi probabilities are not equal. Out of all the

rules, the ORBD has the most ethical allocation as it assigns most patients to the

superior treatments. The DBCD rule targeting urn allocation also assigns a very


(p1, p2, p3)     n    CR                                  DL
                      AP(SD) to i = 1, 2      FP(SD)      AP(SD) to i = 1, 2      FP(SD)
(0.8,0.8,0.8)   100   0.33(0.05)  0.33(0.05)  0.20(0.04)  0.33(0.05)  0.33(0.05)  0.20(0.04)
(0.8,0.6,0.4)   100   0.33(0.05)  0.33(0.05)  0.27(0.04)  0.36(0.05)  0.36(0.05)  0.25(0.04)
(0.8,0.6,0.2)   100   0.33(0.05)  0.33(0.05)  0.33(0.05)  0.39(0.05)  0.39(0.05)  0.29(0.04)
(0.8,0.4,0.2)   100   0.33(0.05)  0.33(0.05)  0.40(0.05)  0.40(0.05)  0.40(0.05)  0.31(0.04)
(0.6,0.6,0.6)   100   0.33(0.05)  0.33(0.05)  0.40(0.05)  0.33(0.04)  0.33(0.04)  0.40(0.05)
(0.6,0.6,0.4)   100   0.33(0.05)  0.33(0.05)  0.47(0.05)  0.36(0.04)  0.36(0.04)  0.46(0.05)
(0.6,0.4,0.2)   100   0.33(0.05)  0.33(0.05)  0.53(0.05)  0.38(0.04)  0.38(0.04)  0.50(0.05)
(0.4,0.4,0.4)   100   0.33(0.05)  0.33(0.05)  0.60(0.05)  0.33(0.03)  0.33(0.03)  0.60(0.05)
(0.4,0.4,0.2)   100   0.33(0.05)  0.33(0.05)  0.67(0.05)  0.36(0.03)  0.36(0.03)  0.66(0.05)
(0.2,0.2,0.2)   100   0.33(0.05)  0.33(0.05)  0.80(0.04)  0.33(0.02)  0.33(0.02)  0.80(0.04)

(p1, p2, p3)     n    ORBD                                DBCD


Table 7: Comparison of allocation proportion (AP) and failure proportion (FP) for some response-adaptive designs targeting urn allocation. The exception is the ORBD targeting the allocation given in (8). The simulation used 5,000 replications.

favourable allocation, only slightly lower than the ORBD. Finally, the DL assigns

a slightly smaller proportion of patients to the best treatment, but this allocation

is still more desirable than that of the CR rule. The standard deviation of the

allocation proportion indicates that the DL is the least variable, always performing

either as well as or better than CR. For high pi the DBCD seems to be much more

variable than DL but when pi is low, the two rules seem to perform very similarly

in terms of variability. The ORBD is the most variable rule for all choices of pi.

The ethical allocation has a direct translation to reduction in failures with the

ORBD and DBCD having the lowest proportion of failures out of all the designs.

The DL has a slightly higher failure proportion but this is still less than that of CR.

There does not seem to be a significant difference between the rules in terms of the

standard deviation of failure proportion.

We now consider rules that can target the RSIHR allocation, namely the GDL

and DBCD rules. The simulation used the same parameters as before and the results

are shown in Table 8. When p1 = p2 = p3, all treatments have a similar allocation

and failure proportions to CR and the results in Table 7. In terms of the variability,

both the rules perform better than all the rules in Table 7, being even less variable


(p1, p2, p3)     n    GDL                                 DBCD


Table 8: Comparison of allocation proportion (AP) and failure proportion (FP) for some response-adaptive designs targeting RSIHR allocation. The simulation used 5,000 replications.

than the CR rule.

When treatment success probabilities are no longer equal, both rules assign most

patients to the better treatment, which results in reduced treatment failures. How-

ever, this proportion is much lower than that of the rules mentioned in Table 7.

The variability of the allocation proportion for both the rules is lower than for CR. Interestingly, the variability of these procedures increases as pi decreases, which is opposite to what was happening for the rules targeting urn allocation. This means that when pi is high, rules targeting RSIHR allocation are less variable, while when pi is low, urn allocation seems to be slightly less variable out of the two. We saw a similar behaviour for these two allocation proportions in the K = 2 case. Once again we do not notice a significant difference in the variability of the failure proportion.

To conclude, we have seen that allocation and failure proportions for K = 3

case seem to behave similarly to those of the rules with K = 2 treatments. In

general, it can be noticed that a more ethical allocation usually results in a higher

variability of the allocation proportion. We have also noticed that the rules targeting

RSIHR allocation are less variable, but have a less ethical allocation than

rules targeting urn allocation. In addition, the RSIHR allocation seems to be more

suitable when pi is high, while the urn allocation seems to perform slightly better

for small pi.

3.3.2 Inference, significance level and power

We now define a suitable statistical test for the K = 3 case. Often in a clinical trial

we wish to compare K − 1 treatments to a control. We wish to test the hypothesis

of no difference between the control and the other treatments against the hypothesis

that there is a significant difference between the K − 1 treatments and the control.

From now on, we will consider p3 to be the control. Formally, we can formulate the


hypotheses as

H0 : p1 − p3 = 0, p2 − p3 = 0

and

H1 : p1 − p3 ≠ 0, p2 − p3 ≠ 0,

respectively. Hu and Rosenberger (2003) consider the contrast test of homogeneity to test the above hypotheses. The contrast of interest is defined as

pc = {p1 − p3, p2 − p3}′

with the respective estimator

p̂c = {p̂1 − p̂3, p̂2 − p̂3}′.

We let

\[
\hat{\Sigma} =
\begin{pmatrix}
\hat{p}_1 \hat{q}_1 / N_1 + \hat{p}_3 \hat{q}_3 / N_3 & \hat{p}_3 \hat{q}_3 / N_3 \\
\hat{p}_3 \hat{q}_3 / N_3 & \hat{p}_2 \hat{q}_2 / N_2 + \hat{p}_3 \hat{q}_3 / N_3
\end{pmatrix}
\]

be the estimator of the variance of p̂c, where Ni is the number of patients assigned to treatment i. Then, the test statistic is given by

\[
H = \hat{p}_c' \, \hat{\Sigma}^{-1} \hat{p}_c,
\]

which under H0 asymptotically follows the chi-square distribution with 2 degrees of freedom.
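As an illustration, the statistic H for K = 3 can be computed from the success counts and the numbers assigned to each treatment, with treatment 3 as the control. The sketch below writes out the inverse of the 2 × 2 matrix explicitly; the function name is hypothetical and raising an error for a singular variance estimate is an assumption made here, not something specified in the text.

```python
# Upper 5% point of the chi-squared distribution with 2 degrees of freedom.
CHI2_2DF_CRIT_05 = 5.991


def homogeneity_statistic(successes, n_assigned):
    """Contrast test statistic H for three treatments; the third is the control."""
    p = [s / n for s, n in zip(successes, n_assigned)]
    v = [pi * (1 - pi) / n for pi, n in zip(p, n_assigned)]   # p_i q_i / N_i
    a, b = p[0] - p[2], p[1] - p[2]                           # estimated contrast
    s11, s22, s12 = v[0] + v[2], v[1] + v[2], v[2]            # entries of Sigma-hat
    det = s11 * s22 - s12 * s12
    if det <= 0:
        # Assumption: treat a singular variance estimate as an error.
        raise ValueError("estimated covariance matrix is singular")
    # Quadratic form pc' Sigma^{-1} pc using the closed-form 2x2 inverse.
    return (s22 * a * a - 2 * s12 * a * b + s11 * b * b) / det
```

With 50 patients per treatment and 40, 40 and 30 successes respectively, the statistic is H = 6.25, which exceeds 5.991, so H0 would be rejected at the 0.05 level.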

We may use the test of homogeneity to simulate power and significance level.

We use a similar approach as for the K = 2 case, but we now perform the test

of homogeneity instead of the Wald test. We use 0.05 level of significance to test

the H0 hypothesis. The results of such a simulation are shown in Table 9, with

all parameters as given previously. We have set n such that the test has a power

of roughly 0.90 when using complete randomisation. As previously, for GDL and

DBCD the line below the name of the rule indicates the target allocation proportion.

There is no clear difference between the designs when comparing the simulated

significance level. For all designs, the simulated significance is close to the 0.05 sig-

nificance level we have used for the hypothesis test. This means that the probability

of incorrectly rejecting H0 (type I error) is close to the 0.05 level we have allowed

for the test.

For all pi, CR produces the highest level of power. However, we can also see

that the GDL and DBCD overall maintain a very high level of power. For these two

rules, the RSIHR target seems more appropriate as it has a higher power level than

urn allocation. Finally, the ORBD seems to have the lowest power of all designs.

To conclude, we have seen that the ORBD on average produces the most ethical

allocation by allocating the most patients to the best treatment. However, this

design also has a highly variable allocation and this leads to relatively large loss

in power. On the other hand, the DL, GDL and DBCD rules have a less ethical


(p1, p2, p3)     n    CR     DL     GDL    ORBD   DBCD   DBCD
                             Urn    RSIHR         Urn    RSIHR
Significance Level
(0.8,0.8,0.8)   100   0.04   0.04   0.04   0.04   0.04   0.04
(0.6,0.6,0.6)   100   0.05   0.05   0.04   0.04   0.05   0.05
(0.4,0.4,0.4)   100   0.04   0.04   0.05   0.05   0.05   0.05
(0.2,0.2,0.2)   100   0.04   0.04   0.05   0.04   0.04   0.05
Power
(0.8,0.8,0.6)   290   0.89   0.83   0.88   0.74   0.79   0.87
(0.8,0.8,0.4)    84   0.88   0.84   0.87   0.71   0.83   0.87
(0.8,0.8,0.2)    42   0.93   0.88   0.91   0.83   0.83   0.89
(0.6,0.6,0.4)   338   0.88   0.85   0.87   0.82   0.84   0.87
(0.6,0.6,0.2)    75   0.86   0.81   0.84   0.67   0.81   0.82
(0.4,0.4,0.2)   285   0.90   0.87   0.87   0.79   0.86   0.88

Table 9: Simulated power and significance level of various RAR designs for a clinical trial with K = 3 treatments. The results were obtained using a simulation with 5,000 replications.

allocation than ORBD but the allocations are less variable and the rules maintain a high level of power. It can be said that the rules targeting RSIHR allocation are less variable for high pi while rules targeting urn allocation exhibit lower variability for low pi. In terms of power, the RSIHR allocation is much more suitable, maintaining a very high level of power. Overall, so far we have seen RAR designs that (i) assign more patients to the superior treatment, (ii) are less variable than CR, (iii) have a lower failure proportion than CR and (iv) maintain a high level of power. Which RAR design is able to meet all these criteria often depends on pi.

We end the investigation of multi-treatment RAR designs here. Since DL, GDL,

ORBD and DBCD have been extended to any number of treatments, it is possible

to investigate how the rules behave for K > 3. The contrast test of homogeneity

can also be extended to K > 3 treatments, as shown by Hu and Rosenberger (2003).

However, such an investigation is not performed here and we now focus on delayed

responses.

3.4 Delayed responses

3.4.1 K = 2 treatments

We start by introducing a way of incorporating delayed responses for all the rules

being investigated. We allow patients to arrive sequentially, one patient at each

time unit. That is, the first patient arrives at time unit 1, the second patient

arrives at time unit 2 and so on. We also define the vector d = {d1, . . . , dK}, where di is the mean delay in response for treatment i, measured in the time units defined above. For example, d = {5, 1} means that the mean delay for treatment


i = 1 is 5, so we expect on average to assign 5 new patients before the response of a patient on this treatment becomes available.

The exponential distribution is often used to model queues and thus we use it to

generate our delayed responses. Given a patient assigned to treatment i, we define

the response time to be the time unit of randomisation plus a random number

generated from the exponential distribution with the rate 1/di. This is because the

mean of the exponential distribution is given as the inverse of the rate.
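A minimal sketch of this delay model in Python follows. The helper name `response_time` is hypothetical; `random.expovariate` takes the rate as its argument, so a mean delay of di corresponds to the rate 1/di.

```python
import random


def response_time(randomisation_time, mean_delay, rng):
    """Time at which a patient's response becomes available.

    The delay is exponential with rate 1 / mean_delay, so its mean is mean_delay.
    """
    return randomisation_time + rng.expovariate(1.0 / mean_delay)


rng = random.Random(0)
# Patient randomised at time unit 3 on a treatment with mean delay 5.
print(response_time(3, 5, rng))
```

Averaging many simulated delays for mean_delay = 5 should give a value close to 5, which is a quick check that the rate has been specified correctly.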

For each rule we are also required to alter the stopping rule. So far, each rule

stopped once n patients were assigned since the responses were immediate. We now

let the rule continue until n patients have been randomised and all n responses have

been collected.

The response is now not immediate and thus after each new randomised patient

we check if any responses are ready. That is, if the current time unit is 7, we look

for any patients that have a response time > 7 and < 8. For each such patient, we obtain a response and then the 8th patient is randomised.
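The bookkeeping described above (sequential arrivals, checking for newly available responses before each randomisation, and waiting for all outstanding responses at the end) can be sketched as follows. This is an illustrative skeleton only: `run_delayed_trial` and the `choose_arm` callback are hypothetical names, the adaptive rules themselves are not implemented, and each outcome is drawn at randomisation and merely revealed later, which is assumed to be equivalent to drawing it at the response time.

```python
import random


def run_delayed_trial(n, p, d, choose_arm, seed=0):
    """Simulate n sequential patients with exponentially delayed responses.

    p[i] is the success probability and d[i] the mean delay on treatment i.
    choose_arm(observed, rng) returns the next treatment index and only sees
    the responses that have already become available.
    """
    rng = random.Random(seed)
    pending = []                  # (response_time, arm, outcome), not yet observed
    observed = []                 # (arm, outcome) pairs available to the rule
    n_assigned = [0] * len(p)
    for t in range(1, n + 1):     # patient t arrives at time unit t
        # Collect every response that has become available by time t.
        ready = [r for r in pending if r[0] <= t]
        pending = [r for r in pending if r[0] > t]
        observed.extend((arm, y) for _, arm, y in ready)
        arm = choose_arm(observed, rng)
        n_assigned[arm] += 1
        outcome = 1 if rng.random() < p[arm] else 0
        pending.append((t + rng.expovariate(1.0 / d[arm]), arm, outcome))
    # Stopping rule: continue until all n responses have been collected.
    observed.extend((arm, y) for _, arm, y in pending)
    return n_assigned, observed


def complete_randomisation(observed, rng):
    """Ignores the observed responses and tosses a fair coin."""
    return rng.randrange(2)


n_assigned, observed = run_delayed_trial(100, (0.8, 0.6), (5, 1), complete_randomisation)
```

A response-adaptive rule would replace `complete_randomisation` with a function that uses the contents of `observed` to skew the allocation, which makes explicit that the rule can only react to responses received so far.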

We start by considering the K = 2 case for the GDL, ORBD and DBCD rules

with delayed responses. A simulation was performed to investigate the allocation

proportion of each of these rules, which is similar to the one performed in Section 3.2

and uses the same parameters, but now using the model above incorporating delayed

responses. Three different delay values of d = {1, 1}, d = {5, 1} and d = {5, 5} were

investigated. The first choice represents a low delay on both treatments, second one

represents an unequal delay on each treatment whilst the third choice represents

a moderate delay on both treatments. The results are given in Table 10 with the

row below the rule name indicating the allocation proportion that is being targeted.

There was no significant difference in the variability of failure proportion from the

results in Section 3.2 so we only report the mean failure proportion to improve

readability.

We start the analysis by considering the p1 = p2 cases. Under all three d choices,

all the rules assign patients equally, as would be expected. However, the standard

deviations differ significantly. The ORBD is the most variable, as has been seen

previously for instantaneous responses. The DBCD targeting urn allocation is also

quite variable, especially for high pi. The GDL targeting the same allocation is less

variable. Finally, the GDL and DBCD targeting RSIHR allocation are the least

variable out of all the rules for high pi with the GDL performing slightly better for

low pi.

Consider the case when p1 6= p2. We see that all the rules assign more patients

to the better treatment, as has been seen for the rules with instantaneous responses.

The ORBD has the most ethically advantageous allocation by far, assigning most

patients to treatment i = 1 i.e. the superior treatment. We notice that the DBCD

seems to be more variable than GDL, as seen previously in the p1 = p2 case. The


(p1, p2)      n    GDL               GDL               ORBD              DBCD              DBCD
                   Urn               RSIHR                               Urn               RSIHR
                   AP(SD)      FP    AP(SD)      FP    AP(SD)      FP    AP(SD)      FP    AP(SD)      FP
d = {1, 1}
(0.8,0.8)    100   0.50(0.05)  0.20  0.50(0.02)  0.20  0.50(0.15)  0.20  0.51(0.10)  0.20  0.51(0.03)  0.20
(0.8,0.6)    100   0.57(0.05)  0.29  0.53(0.02)  0.29  0.68(0.13)  0.27  0.66(0.08)  0.27  0.54(0.03)  0.29
(0.8,0.4)    100   0.63(0.04)  0.35  0.57(0.03)  0.37  0.78(0.10)  0.29  0.74(0.06)  0.31  0.59(0.03)  0.36
(0.8,0.2)    100   0.68(0.03)  0.39  0.63(0.04)  0.42  0.84(0.06)  0.29  0.79(0.05)  0.33  0.67(0.04)  0.40
(0.6,0.6)    100   0.50(0.04)  0.40  0.50(0.03)  0.40  0.50(0.16)  0.40  0.51(0.07)  0.40  0.51(0.03)  0.40
(0.6,0.4)    100   0.56(0.04)  0.49  0.54(0.03)  0.49  0.66(0.14)  0.47  0.60(0.05)  0.48  0.55(0.04)  0.49
(0.6,0.2)    100   0.62(0.03)  0.55  0.60(0.04)  0.56  0.78(0.09)  0.49  0.66(0.05)  0.53  0.63(0.05)  0.55
(0.4,0.4)    100   0.50(0.03)  0.60  0.50(0.04)  0.60  0.49(0.15)  0.60  0.51(0.05)  0.60  0.51(0.04)  0.60
(0.4,0.2)    100   0.56(0.03)  0.69  0.56(0.04)  0.68  0.67(0.12)  0.66  0.57(0.04)  0.69  0.59(0.05)  0.68
(0.2,0.2)    100   0.50(0.03)  0.80  0.50(0.05)  0.80  0.50(0.14)  0.80  0.51(0.04)  0.80  0.50(0.06)  0.80
d = {5, 1}
(0.8,0.8)    100   0.48(0.05)  0.20  0.50(0.02)  0.20  0.48(0.15)  0.20  0.50(0.10)  0.20  0.50(0.03)  0.20
(0.8,0.6)    100   0.55(0.05)  0.29  0.53(0.02)  0.29  0.66(0.13)  0.27  0.66(0.08)  0.27  0.54(0.03)  0.29
(0.8,0.4)    100   0.61(0.04)  0.35  0.57(0.03)  0.37  0.77(0.09)  0.29  0.74(0.06)  0.30  0.59(0.03)  0.36
(0.8,0.2)    100   0.66(0.03)  0.40  0.63(0.04)  0.42  0.83(0.06)  0.30  0.79(0.05)  0.33  0.67(0.04)  0.40
(0.6,0.6)    100   0.49(0.04)  0.40  0.50(0.03)  0.40  0.50(0.15)  0.40  0.51(0.07)  0.40  0.51(0.03)  0.40
(0.6,0.4)    100   0.55(0.04)  0.49  0.54(0.03)  0.49  0.65(0.14)  0.47  0.60(0.06)  0.48  0.56(0.04)  0.49
(0.6,0.2)    100   0.60(0.03)  0.56  0.60(0.04)  0.56  0.77(0.09)  0.49  0.66(0.05)  0.53  0.63(0.05)  0.54
(0.4,0.4)    100   0.49(0.04)  0.60  0.50(0.04)  0.60  0.52(0.15)  0.60  0.51(0.05)  0.60  0.51(0.04)  0.60
(0.4,0.2)    100   0.55(0.03)  0.69  0.57(0.04)  0.69  0.67(0.11)  0.67  0.57(0.04)  0.69  0.59(0.05)  0.68
(0.2,0.2)    100   0.50(0.02)  0.80  0.51(0.05)  0.80  0.51(0.14)  0.80  0.51(0.03)  0.80  0.51(0.06)  0.80
d = {5, 5}
(0.8,0.8)    100   0.50(0.04)  0.20  0.50(0.02)  0.20  0.50(0.15)  0.20  0.51(0.11)  0.20  0.51(0.03)  0.20
(0.8,0.6)    100   0.56(0.04)  0.29  0.53(0.02)  0.29  0.66(0.13)  0.27  0.66(0.08)  0.27  0.54(0.03)  0.29
(0.8,0.4)    100   0.61(0.04)  0.36  0.57(0.03)  0.37  0.77(0.10)  0.29  0.74(0.06)  0.30  0.59(0.04)  0.36
(0.8,0.2)    100   0.66(0.03)  0.40  0.62(0.04)  0.42  0.83(0.06)  0.30  0.79(0.05)  0.33  0.67(0.05)  0.40
(0.6,0.6)    100   0.50(0.04)  0.40  0.50(0.03)  0.40  0.50(0.15)  0.40  0.51(0.07)  0.40  0.51(0.03)  0.40
(0.6,0.4)    100   0.56(0.04)  0.49  0.54(0.03)  0.49  0.65(0.13)  0.47  0.60(0.05)  0.48  0.56(0.04)  0.49
(0.6,0.2)    100   0.61(0.03)  0.56  0.60(0.04)  0.56  0.75(0.09)  0.50  0.66(0.05)  0.54  0.63(0.05)  0.54
(0.4,0.4)    100   0.50(0.03)  0.60  0.50(0.04)  0.60  0.50(0.14)  0.60  0.51(0.05)  0.60  0.50(0.04)  0.60
(0.4,0.2)    100   0.55(0.03)  0.69  0.56(0.04)  0.69  0.66(0.12)  0.67  0.57(0.04)  0.69  0.58(0.05)  0.69
(0.2,0.2)    100   0.50(0.02)  0.80  0.50(0.05)  0.80  0.50(0.14)  0.80  0.51(0.04)  0.80  0.51(0.06)  0.80

Table 10: Comparison of allocation proportion (AP) and its standard deviation for some response-adaptive designs with K = 2 treatments and delayed responses. Three different values for the response delay were investigated. The simulation used 5,000 replications.

difference can be seen between these two rules especially for the rules targeting

urn allocation. Once again we see that the rules targeting urn allocation are more

appropriate for low pi whilst rules targeting RSIHR allocation are more appropriate

for high pi. There does not seem to be any significant difference between the three

settings of d or between this table and the results found for the same rules with

instantaneous responses, as seen in Table 4 and Table 5.

We now perform a simulation of significance level and power, much like the one

performed in Table 3. We still use the Wald test at 0.05 level of significance and all

parameters are as before. The results can be seen in Table 11.

We notice that the significance level does not differ significantly between the

procedures and d values. In fact, it is very similar to the significance level of the

procedures with instantaneous responses reported in Table 6.

When compared to CR, GDL rule maintains the highest power. The power is

high for this design for both target allocations considered, often maintaining a similar


power to CR. The DBCD targeting RSIHR allocation is also highly powerful. The behaviour of DBCD targeting urn allocation is particularly interesting, with the rule maintaining high power when the difference between the pi is small. When the difference between the pi is large, i.e. (p1, p2) = (0.8, 0.2), the design has the lowest power out of all the designs. The ORBD has the lowest power out of all the designs for all other pi values. We notice only a slight difference between the different choices of d, with no clear pattern.

(p1,p2)       n    CR     GDL    GDL    ORBD   DBCD   DBCD
                          Urn    RSIHR         Urn    RSIHR
d = {1, 1}
(0.8,0.8)    100   0.04   0.04   0.04   0.04   0.04   0.05
(0.6,0.6)    100   0.04   0.05   0.04   0.05   0.05   0.05
(0.4,0.4)    100   0.05   0.05   0.04   0.05   0.06   0.06
(0.2,0.2)    100   0.04   0.04   0.04   0.04   0.05   0.05
(0.8,0.6)    206   0.88   0.88   0.87   0.83   0.88   0.88
(0.8,0.4)     62   0.91   0.91   0.91   0.81   0.84   0.90
(0.8,0.2)     27   0.89   0.89   0.89   0.80   0.76   0.88
(0.6,0.4)    256   0.90   0.89   0.88   0.86   0.89   0.90
(0.6,0.2)     57   0.87   0.87   0.86   0.79   0.85   0.87
(0.4,0.2)    217   0.90   0.90   0.89   0.86   0.90   0.90
d = {5, 1}
(0.8,0.8)    100   0.04   0.04   0.05   0.04   0.04   0.05
(0.6,0.6)    100   0.04   0.04   0.05   0.05   0.04   0.04
(0.4,0.4)    100   0.05   0.05   0.05   0.05   0.04   0.05
(0.2,0.2)    100   0.04   0.04   0.04   0.04   0.04   0.05
(0.8,0.6)    206   0.88   0.88   0.89   0.82   0.86   0.87
(0.8,0.4)     62   0.91   0.90   0.90   0.82   0.85   0.90
(0.8,0.2)     27   0.89   0.89   0.89   0.84   0.76   0.87
(0.6,0.4)    256   0.90   0.89   0.87   0.87   0.88   0.90
(0.6,0.2)     57   0.87   0.87   0.87   0.80   0.86   0.87
(0.4,0.2)    217   0.90   0.87   0.90   0.87   0.89   0.90
d = {5, 5}
(0.8,0.8)    100   0.04   0.04   0.04   0.04   0.03   0.03
(0.6,0.6)    100   0.04   0.05   0.06   0.06   0.04   0.04
(0.4,0.4)    100   0.05   0.05   0.04   0.06   0.05   0.05
(0.2,0.2)    100   0.04   0.05   0.04   0.05   0.04   0.06
(0.8,0.6)    206   0.88   0.87   0.87   0.84   0.85   0.87
(0.8,0.4)     62   0.91   0.89   0.88   0.83   0.85   0.91
(0.8,0.2)     27   0.89   0.89   0.92   0.84   0.76   0.89
(0.6,0.4)    256   0.90   0.88   0.90   0.85   0.89   0.90
(0.6,0.2)     57   0.87   0.87   0.87   0.81   0.85   0.87
(0.4,0.2)    217   0.90   0.90   0.88   0.88   0.89   0.88

Table 11: Simulated power of various RAR designs for a clinical trial with K = 2 treatments and delayed responses. Three different values for the response delay were investigated. The results were obtained using a simulation with 5,000 replications.



3.4.2 K = 3 treatments

We now consider RAR procedures with delayed responses and K = 3 treatments.

We start by investigating the allocation proportions of the GDL, ORBD and DBCD

rules. We run a similar simulation as in the previous section, but we adapt the rules

to K = 3 treatments. The results of such a simulation are reported in Table 12. We

do not report the failure proportions here, both because they do not differ significantly from those under the instantaneous model and to improve the readability of the table. For each rule, the

table reports the allocation proportion to treatments i = 1, 2.

Consider the case when the pi values are equal. We can then see that the allocation proportions are roughly equal for all designs. The variability of the allocation proportion follows a similar pattern to that seen for the same rules with instantaneous responses, investigated in Table 4 and Table 5. That is, ORBD is the most variable whilst the

performance of the GDL and DBCD highly depends on the allocation proportion

being targeted. The urn allocation is less variable for small pi, whilst RSIHR alloca-

tion is more suitable when pi is higher. Out of the DBCD and GDL rules, the GDL

is slightly less variable. There does not seem to be a significant difference between

the different values of d or between the procedures with delayed and instantaneous

responses.

When pi are unequal, all designs allocate more patients to the superior treat-

ment. As before, the ORBD has the most ethically desirable allocation, followed by

the DBCD targeting urn allocation. The DBCD targeting RSIHR allocation and

both GDL designs have a very similar allocation, only slightly skewed from equal

allocation. The variability of the allocation proportion also follows similar patterns

as before, with ORBD being the most variable. The variability of the GDL and

DBCD rules highly depends on the allocation proportion being targeted and the

value of pi. When pi is high, RSIHR allocation is less variable but when pi is low,

urn allocation is less variable. Out of GDL and DBCD, GDL seems to be slightly

less variable for the same target allocation. We also see that no significant difference

is observed between the different levels of delay. Finally, there is also no significant

difference between the allocation of the rules with delayed responses and that of the rules with instantaneous responses explored in Table 4 and Table 5.

We now consider the power of the procedures. We use the contrast test of ho-

mogeneity, much like in the case of the model with instantaneous responses. All

parameters were kept the same and the results are given in Table 13. There does

not seem to be a noticeable difference in significance level between the designs, other

than the ORBD. The power also seems to follow a similar pattern as the K = 2 case.

That is, the GDL and DBCD targeting RSIHR allocation maintain a very high level

of power for all pi, when compared to CR. The GDL rule targeting urn allocation

also shows a high level of power. The DBCD targeting urn allocation exhibits inter-

esting behaviour as was seen in the K = 2 case. That is, the power is significantly


reduced when one pi is smaller than the others. When pi = (0.8, 0.8, 0.2), the DBCD

targeting urn allocation is actually the least powerful. For all other settings, the

ORBD has the lowest power. It is worth noting that the true power of the ORBD may be higher due to the slightly smaller significance level that was simulated, when compared to the 0.05 significance level at which we performed the contrast test of homogeneity.

To conclude, we have seen that a moderate delay in responses does not have a

significant effect on the allocation proportion and its variability. However, we have

noticed that the power can be affected by delayed responses for DBCD targeting

urn allocation. We have seen that for this particular design, when the difference

between success probabilities pi is the greatest, the design has the lowest power

when compared to the other designs. For all other designs there is no significant

drop in power due to delayed responses.

We now consider the case of a clinical trial with a large delay. Since all RAR

designs considered require some responses in order to skew the allocation proportion,

it can be said that an RAR design would not be as effective. For example, in the case

when all patients are assigned before any responses are received, all RAR designs

investigated would allocate patients in the same manner as complete randomisation.

Thus, none of the RAR designs would be suitable for trials where the delay is very

large e.g. survival trials.

However, in practice there are often ways of overcoming the problem of a large

delay. For example, Tamura et al. (1994) explored an application of a RAR design

in a study investigating the effect of fluoxetine in patients with a depressive disorder.

In this study, the time between the first and final measurement was approximately

8 weeks. However, the researchers decided that this delay was too large and

used a surrogate response instead. The surrogate response was defined

to be a success if the patient exhibited at least a 50% reduction in HAMD (a scale

measuring severity of depression) in two consecutive visits after 3 weeks of therapy. A

similar surrogate response might be possible for other clinical trials where otherwise

an RAR design would be ruled out on the basis of a large delay.

Note that Zhang et al. (2007) compared the allocation proportion of the GDL

and DBCD under delayed responses. They only considered K = 3 treatments and did not investigate the power of those designs. However, their comparison also investigated non-uniform patient entry times, which were not explored here. The literature does not report any investigation of the ORBD under delayed responses. This is also the first known instance of the power of the GDL, ORBD and DBCD being investigated under delayed responses.


(p1, p2, p3)    n     GDL Urn                 GDL RSIHR               ORBD                    DBCD Urn                DBCD RSIHR
                      AP(SD) to i = 1, 2      AP(SD) to i = 1, 2      AP(SD) to i = 1, 2      AP(SD) to i = 1, 2      AP(SD) to i = 1, 2

d = {1, 1, 1}
(0.8,0.8,0.8)   100   0.33(0.04) 0.33(0.04)   0.33(0.02) 0.33(0.02)   0.33(0.11) 0.34(0.11)   0.33(0.09) 0.34(0.09)   0.33(0.03) 0.33(0.02)
(0.8,0.8,0.6)   100   0.36(0.05) 0.36(0.04)   0.35(0.02) 0.35(0.02)   0.39(0.12) 0.39(0.13)   0.39(0.09) 0.39(0.09)   0.35(0.03) 0.35(0.03)
(0.8,0.8,0.4)   100   0.38(0.05) 0.39(0.05)   0.36(0.02) 0.36(0.02)   0.42(0.13) 0.42(0.13)   0.42(0.10) 0.42(0.10)   0.37(0.03) 0.37(0.03)
(0.8,0.8,0.2)   100   0.40(0.05) 0.40(0.05)   0.38(0.02) 0.38(0.02)   0.43(0.13) 0.43(0.13)   0.44(0.10) 0.43(0.10)   0.39(0.03) 0.39(0.03)
(0.6,0.6,0.6)   100   0.33(0.04) 0.33(0.04)   0.33(0.02) 0.33(0.02)   0.33(0.12) 0.33(0.12)   0.33(0.06) 0.33(0.06)   0.33(0.03) 0.33(0.03)
(0.6,0.6,0.4)   100   0.36(0.04) 0.36(0.04)   0.35(0.03) 0.35(0.03)   0.39(0.13) 0.38(0.13)   0.37(0.06) 0.37(0.06)   0.35(0.03) 0.35(0.03)
(0.6,0.6,0.2)   100   0.38(0.04) 0.38(0.04)   0.37(0.03) 0.37(0.03)   0.42(0.13) 0.42(0.13)   0.39(0.06) 0.39(0.06)   0.38(0.03) 0.38(0.03)
(0.4,0.4,0.4)   100   0.33(0.03) 0.33(0.03)   0.33(0.03) 0.33(0.03)   0.33(0.12) 0.33(0.12)   0.33(0.04) 0.33(0.04)   0.33(0.04) 0.33(0.04)
(0.4,0.4,0.2)   100   0.36(0.03) 0.36(0.03)   0.36(0.03) 0.36(0.03)   0.39(0.12) 0.39(0.12)   0.36(0.04) 0.36(0.04)   0.37(0.04) 0.37(0.04)
(0.2,0.2,0.2)   100   0.33(0.02) 0.33(0.02)   0.33(0.04) 0.33(0.04)   0.33(0.10) 0.34(0.10)   0.33(0.03) 0.33(0.03)   0.33(0.05) 0.33(0.05)

d = {5, 1, 1}
(0.8,0.8,0.8)   100   0.33(0.04) 0.34(0.04)   0.33(0.02) 0.33(0.02)   0.33(0.11) 0.34(0.11)   0.33(0.10) 0.33(0.10)   0.33(0.02) 0.33(0.02)
(0.8,0.8,0.6)   100   0.34(0.04) 0.37(0.05)   0.34(0.02) 0.35(0.02)   0.38(0.12) 0.39(0.12)   0.39(0.09) 0.39(0.09)   0.35(0.03) 0.35(0.03)
(0.8,0.8,0.4)   100   0.36(0.04) 0.40(0.04)   0.36(0.02) 0.36(0.02)   0.41(0.13) 0.42(0.13)   0.42(0.10) 0.42(0.10)   0.37(0.03) 0.37(0.03)
(0.8,0.8,0.2)   100   0.38(0.04) 0.42(0.04)   0.38(0.02) 0.38(0.02)   0.43(0.13) 0.43(0.13)   0.43(0.10) 0.43(0.10)   0.39(0.03) 0.40(0.03)
(0.6,0.6,0.6)   100   0.32(0.04) 0.34(0.04)   0.33(0.02) 0.33(0.02)   0.33(0.12) 0.33(0.12)   0.33(0.06) 0.33(0.06)   0.33(0.03) 0.33(0.03)
(0.6,0.6,0.4)   100   0.35(0.04) 0.37(0.04)   0.35(0.03) 0.35(0.02)   0.38(0.13) 0.39(0.12)   0.37(0.06) 0.37(0.06)   0.35(0.03) 0.35(0.03)
(0.6,0.6,0.2)   100   0.37(0.04) 0.39(0.04)   0.37(0.03) 0.37(0.03)   0.42(0.12) 0.42(0.12)   0.39(0.06) 0.39(0.06)   0.38(0.03) 0.38(0.03)
(0.4,0.4,0.4)   100   0.33(0.03) 0.34(0.03)   0.33(0.03) 0.33(0.03)   0.34(0.12) 0.33(0.12)   0.33(0.04) 0.33(0.04)   0.33(0.04) 0.33(0.04)
(0.4,0.4,0.2)   100   0.35(0.03) 0.36(0.03)   0.36(0.03) 0.36(0.03)   0.39(0.12) 0.39(0.12)   0.36(0.04) 0.36(0.04)   0.36(0.04) 0.37(0.04)
(0.2,0.2,0.2)   100   0.33(0.02) 0.33(0.02)   0.34(0.04) 0.33(0.04)   0.34(0.10) 0.33(0.10)   0.33(0.03) 0.33(0.03)   0.33(0.05) 0.33(0.05)

d = {5, 5, 5}
(0.8,0.8,0.8)   100   0.33(0.04) 0.33(0.04)   0.33(0.02) 0.33(0.02)   0.33(0.11) 0.33(0.11)   0.33(0.09) 0.33(0.09)   0.33(0.03) 0.33(0.02)
(0.8,0.8,0.6)   100   0.36(0.04) 0.36(0.04)   0.35(0.02) 0.35(0.02)   0.38(0.12) 0.38(0.12)   0.39(0.10) 0.39(0.10)   0.35(0.03) 0.35(0.03)
(0.8,0.8,0.4)   100   0.38(0.04) 0.38(0.04)   0.36(0.02) 0.36(0.02)   0.41(0.13) 0.41(0.12)   0.41(0.10) 0.42(0.10)   0.37(0.03) 0.37(0.03)
(0.8,0.8,0.2)   100   0.40(0.04) 0.40(0.04)   0.38(0.02) 0.38(0.02)   0.42(0.12) 0.43(0.12)   0.43(0.10) 0.44(0.10)   0.39(0.03) 0.39(0.03)
(0.6,0.6,0.6)   100   0.33(0.04) 0.33(0.04)   0.33(0.02) 0.33(0.02)   0.33(0.12) 0.33(0.12)   0.33(0.06) 0.33(0.06)   0.33(0.03) 0.33(0.03)
(0.6,0.6,0.4)   100   0.36(0.04) 0.36(0.04)   0.35(0.03) 0.35(0.03)   0.39(0.12) 0.38(0.12)   0.37(0.06) 0.37(0.06)   0.35(0.03) 0.35(0.03)
(0.6,0.6,0.2)   100   0.38(0.04) 0.38(0.04)   0.37(0.03) 0.37(0.03)   0.41(0.12) 0.41(0.12)   0.40(0.06) 0.39(0.06)   0.38(0.03) 0.38(0.03)
(0.4,0.4,0.4)   100   0.33(0.03) 0.33(0.03)   0.33(0.03) 0.33(0.03)   0.33(0.11) 0.33(0.11)   0.33(0.04) 0.33(0.04)   0.33(0.04) 0.33(0.04)
(0.4,0.4,0.2)   100   0.36(0.03) 0.36(0.03)   0.36(0.03) 0.36(0.03)   0.38(0.11) 0.39(0.11)   0.36(0.04) 0.36(0.04)   0.37(0.04) 0.36(0.04)
(0.2,0.2,0.2)   100   0.33(0.02) 0.33(0.02)   0.33(0.04) 0.33(0.04)   0.33(0.10) 0.33(0.10)   0.33(0.03) 0.33(0.03)   0.33(0.05) 0.33(0.05)

Table 12: Comparison of allocation proportion (AP) and its standard deviation for some response-adaptive designs with K = 3 treatments and delayed responses. Three different values for the response delay were investigated. The simulation used 5,000 replications.


(p1, p2, p3)    n     CR     GDL    GDL     ORBD   DBCD   DBCD
                             Urn    RSIHR          Urn    RSIHR

d = {1, 1, 1}
(0.8,0.8,0.8)   100   0.04   0.04   0.03    0.03   0.04   0.03
(0.6,0.6,0.6)   100   0.04   0.04   0.05    0.05   0.03   0.05
(0.4,0.4,0.4)   100   0.05   0.05   0.05    0.05   0.06   0.05
(0.2,0.2,0.2)   100   0.04   0.04   0.04    0.03   0.04   0.06
(0.8,0.8,0.6)   290   0.88   0.85   0.87    0.75   0.79   0.88
(0.8,0.8,0.4)   84    0.89   0.85   0.89    0.71   0.69   0.89
(0.8,0.8,0.2)   42    0.93   0.87   0.91    0.84   0.74   0.88
(0.6,0.6,0.4)   338   0.90   0.85   0.88    0.79   0.86   0.87
(0.6,0.6,0.2)   75    0.87   0.80   0.84    0.64   0.79   0.81
(0.4,0.4,0.2)   285   0.91   0.85   0.87    0.79   0.88   0.88

d = {5, 1, 1}
(0.8,0.8,0.8)   100   0.04   0.04   0.04    0.03   0.04   0.04
(0.6,0.6,0.6)   100   0.04   0.04   0.04    0.04   0.04   0.05
(0.4,0.4,0.4)   100   0.05   0.04   0.06    0.04   0.05   0.05
(0.2,0.2,0.2)   100   0.04   0.04   0.05    0.03   0.04   0.06
(0.8,0.8,0.6)   290   0.88   0.85   0.88    0.74   0.80   0.88
(0.8,0.8,0.4)   84    0.89   0.84   0.89    0.74   0.73   0.88
(0.8,0.8,0.2)   42    0.93   0.88   0.90    0.83   0.74   0.90
(0.6,0.6,0.4)   338   0.90   0.85   0.86    0.76   0.84   0.86
(0.6,0.6,0.2)   75    0.87   0.83   0.86    0.69   0.75   0.85
(0.4,0.4,0.2)   285   0.91   0.88   0.88    0.76   0.90   0.88

d = {5, 5, 5}
(0.8,0.8,0.8)   100   0.04   0.05   0.05    0.03   0.04   0.04
(0.6,0.6,0.6)   100   0.04   0.04   0.05    0.04   0.04   0.05
(0.4,0.4,0.4)   100   0.05   0.04   0.04    0.04   0.04   0.05
(0.2,0.2,0.2)   100   0.04   0.05   0.04    0.03   0.04   0.04
(0.8,0.8,0.6)   290   0.88   0.84   0.86    0.74   0.79   0.89
(0.8,0.8,0.4)   84    0.89   0.86   0.87    0.76   0.74   0.87
(0.8,0.8,0.2)   42    0.93   0.90   0.93    0.86   0.77   0.91
(0.6,0.6,0.4)   338   0.90   0.86   0.87    0.79   0.85   0.86
(0.6,0.6,0.2)   75    0.87   0.84   0.84    0.68   0.79   0.83
(0.4,0.4,0.2)   285   0.91   0.88   0.87    0.79   0.86   0.87

Table 13: Simulated power of various RAR designs for a clinical trial with K = 3 treatments and delayed responses. Three different values for the response delay were investigated. The results were obtained using a simulation with 5,000 replications.

3.5 Covariates


So far, we have extended the DL, ORBD and DBCD to allow the incorporation of

covariates. The extended version of the DL rule is known as the DLC, while the DBCD

version is known as the RDBCD. It is worth noting that the way the ORBD allows

covariate balance is different to that of the DLC and RDBCD. The ORBD balances


covariates in a traditional sense, meaning that it aims to assign in such a way that

the covariate is equally represented in each treatment. It does this by skewing the

assignment probability towards the treatment in which the current covariate value

is under-represented. However, it must also meet the ethical objective, so it

also skews the probability of assignment towards the treatment performing best so

far. On the other hand, the DLC and RDBCD incorporate the covariates on a more

ethical basis. These rules skew the probability in such a way that the best treatment

is assigned to the patient with the most favourable condition.

In this section we aim to compare these two ways of incorporating covariates. The ORBD will be compared to the DLC and RDBCD. Recall that the DLC allows the probability of success to vary with the covariate level: a patient assigned to treatment i responds successfully with probability $a^{U_j} p_i$, where $a \in (0, 1)$ is the so-called prognostic factor index and $U_j \in \{0, \ldots, G\}$ is the covariate level of the jth patient. We extended the ORBD and RDBCD to use the same probability of success. Although this change does not alter the inner workings of the rules, it allows us to compare them fairly in a more realistic setting. This is because in practice a treatment might have a different probability of success depending on the value of the covariate, and we always want to balance the treatments on a covariate that is likely to have an impact on the treatment outcome. Here, we set G = 1. We then generate n random numbers from the standard uniform distribution, denoted by $z_j$, with each value corresponding to the jth patient. The $z_j$ values were then categorised into G + 1 levels $U_j$. Since the $z_j$ values come from the uniform distribution, we expect that on average every level will contain the same number of patients. Note that the lower the level $U_j$, the higher the probability of a success.
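The covariate set-up described above can be sketched in R as follows. This is a minimal illustration rather than the code used for the simulations: equal-width bins on [0, 1] are assumed for the categorisation, and the helper `success_prob` and the values of `p` are hypothetical.

```r
set.seed(1)                    # only so that this sketch is reproducible
n <- 100; G <- 1; a <- 0.7     # a = 0.7 as chosen in the simulations
p <- c(0.8, 0.6)               # illustrative success probabilities p_i

z <- runif(n)                  # continuous covariate z_j for each patient

# Cut [0, 1] into G + 1 equal-width bins and label them 0, ..., G
# (assumed binning; the lowest level has the highest success probability)
U <- findInterval(z, seq(0, 1, length.out = G + 2),
                  rightmost.closed = TRUE) - 1

# Success probability of patient j on treatment i: a^(U_j) * p_i
success_prob <- function(i, j) a^U[j] * p[i]

table(U)                       # roughly n / (G + 1) patients per level
```

With G = 1 a patient at level 0 keeps the baseline probability p_i, while a patient at level 1 has it scaled down by a.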

Recall that the ORBD and RDBCD were able to incorporate continuous covariates, but the DLC was not. We thus use the $z_j$ values rather than $U_j$ for the two former rules. That is, for the ORBD the regression uses the actual covariate values, and $U_j$ is used only to obtain the response outcome. Similarly, the RDBCD uses $z_j$ with the transformation H(z) = 1 − z. This is because the g function gives a higher probability of assignment when z is high, so we need this transformation to reflect correctly that the probability of success is high for low $U_j$.

A simulation was performed using the above adjustments to the rules and the results are given in Table 14. We chose a = 0.7, as suggested by Bandyopadhyay et al. (2009). We also chose the RDBCD to target RSIHR allocation. This is because the DLC can only target urn allocation, so this choice also gives us a comparison between the two target allocations. In addition, we have seen that in general the DL is less variable than the DBCD when targeting urn allocation with no covariate information taken into account. The table reports the allocation proportion (AP) to treatment i = 1 and its standard deviation, the failure proportion (FP) and the covariate information (CI). The latter represents the average value of $z_j$ among patients assigned to treatment i = 1. We do not report the standard deviation of the failure proportion or covariate information, as there was no significant difference between the rules and $p_i$ values.

(p1, p2)    n     DLC                       ORBD                      RDBCD
                  AP(SD)       CI     FP    AP(SD)       CI     FP    AP(SD)       CI     FP
(0.8,0.8)   100   0.50(0.05)   0.50   0.32  0.50(0.15)   0.50   0.32  0.50(0.09)   0.49   0.32
(0.8,0.6)   100   0.53(0.04)   0.50   0.40  0.64(0.14)   0.50   0.38  0.52(0.09)   0.49   0.40
(0.8,0.4)   100   0.56(0.04)   0.50   0.47  0.74(0.11)   0.50   0.40  0.55(0.09)   0.48   0.47
(0.8,0.2)   100   0.59(0.04)   0.50   0.53  0.81(0.06)   0.50   0.42  0.59(0.09)   0.48   0.52
(0.6,0.6)   100   0.50(0.04)   0.50   0.49  0.50(0.15)   0.50   0.49  0.50(0.09)   0.49   0.49
(0.6,0.4)   100   0.53(0.04)   0.50   0.57  0.63(0.14)   0.50   0.55  0.52(0.09)   0.48   0.57
(0.6,0.2)   100   0.56(0.04)   0.50   0.64  0.74(0.09)   0.50   0.58  0.57(0.09)   0.48   0.63
(0.4,0.4)   100   0.50(0.04)   0.50   0.66  0.50(0.14)   0.50   0.66  0.50(0.09)   0.49   0.66
(0.4,0.2)   100   0.53(0.04)   0.50   0.74  0.65(0.12)   0.50   0.72  0.54(0.09)   0.48   0.74
(0.2,0.2)   100   0.50(0.03)   0.50   0.83  0.50(0.13)   0.50   0.83  0.50(0.10)   0.49   0.83

Table 14: The allocation proportion and its standard deviation, covariate information and failure proportion for the DLC, ORBD and RDBCD with K = 2 treatments. We chose a = 0.7 and RDBCD to target RSIHR allocation. This simulation used 5,000 replications.
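For a single simulated trial, the three reported quantities could be computed as in the following sketch. The function `summarise_trial` and its argument names are hypothetical and are not taken from the simulation code. The tabulated AP(SD), CI and FP would then be the mean (and, for AP, the standard deviation) of these values over the replications.

```r
# Hypothetical helper: per-trial summary for K = 2 treatments
#   tmnt     - treatment (1 or 2) received by each patient
#   z        - continuous covariate value of each patient
#   response - 1 for a success, 0 for a failure
summarise_trial <- function(tmnt, z, response) {
  c(AP = mean(tmnt == 1),       # allocation proportion to treatment i = 1
    CI = mean(z[tmnt == 1]),    # covariate information: mean z on i = 1
    FP = mean(response == 0))   # failure proportion over all patients
}

# Small worked example with four patients
summarise_trial(tmnt = c(1, 1, 2, 2),
                z = c(0.2, 0.4, 0.6, 0.8),
                response = c(1, 0, 0, 1))
```

A CI value close to 0.5 indicates that treatment i = 1 received a balanced mix of covariate values, since the $z_j$ are uniform on (0, 1).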

When p1 = p2 the allocation proportion is roughly the same for all choices of pi.

The standard deviation of this allocation proportion is the highest for the ORBD

rule, as was seen for rules not incorporating covariates. The RDBCD is also highly

variable, while the DLC exhibits the lowest variability out of all the rules. We

also notice that the failure proportion is similar for all the rules. The covariate

information is the same for the DLC and ORBD but the RDBCD seems to have

slightly unbalanced treatments in terms of covariates.

In the case p1 ≠ p2 we see that all the rules assign more patients to the better treatment. The ORBD has the most ethically desirable allocation. The DLC and RDBCD skew the allocation only slightly, even for a large difference in treatment success probability. This has the usual translation into the failure proportion. When the difference between the pi is large, the failure proportion for the ORBD is much lower than for the other rules. When the difference is small, e.g. p = (0.4, 0.2), the gap in failure proportion is also small. We notice that the covariate information for the DLC and ORBD is equal for all pi, whilst for the RDBCD it is slightly skewed towards the worse treatment.

We now perform an investigation into the power of these designs. We used the

Wald test as before and the results are shown in Table 15. We use the same n

values as before. We can see that the significance level for all the rules is near the

expected 0.05 level we used to perform the test. We see that in general the ORBD

is the least powerful. The RDBCD and DLC maintain slightly higher power than

the ORBD.
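As an illustration of how power and significance level are estimated by simulation, the sketch below uses the standard two-sample Wald statistic for K = 2 treatments. The exact form of the test defined earlier is not reproduced here, so this form is an assumption, as is the function name `wald_reject`. Equal allocation stands in for an RAR rule, and each arm is assumed to contain at least one patient.

```r
# Two-sided Wald test of H0: p1 = p2 (assumed standard form)
#   s - number of successes on each treatment
#   m - number of patients on each treatment
wald_reject <- function(s, m, alpha = 0.05) {
  p_hat <- s / m
  se <- sqrt(sum(p_hat * (1 - p_hat) / m))
  if (se == 0) return(p_hat[1] != p_hat[2])   # degenerate estimates
  abs(p_hat[1] - p_hat[2]) / se > qnorm(1 - alpha / 2)
}

# Simulated power: proportion of replications in which H0 is rejected.
# Equal allocation of n = 62 patients is used here purely as a stand-in.
set.seed(3)
power_est <- mean(replicate(2000, {
  s <- rbinom(2, size = c(31, 31), prob = c(0.8, 0.4))
  wald_reject(s, c(31, 31))
}))
power_est
```

Running the same loop with p1 = p2 gives the simulated significance level instead of the power.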


(p1, p2)    n     DLC    ORBD   RDBCD
(0.8,0.8)   100   0.05   0.06   0.07
(0.6,0.6)   100   0.05   0.06   0.05
(0.4,0.4)   100   0.04   0.06   0.05
(0.2,0.2)   100   0.04   0.04   0.03
(0.8,0.6)   206   0.70   0.65   0.73
(0.8,0.4)   62    0.77   0.68   0.78
(0.8,0.2)   27    0.76   0.68   0.76
(0.6,0.4)   256   0.79   0.75   0.79
(0.6,0.2)   57    0.77   0.72   0.78
(0.4,0.2)   217   0.82   0.77   0.82

Table 15: Simulated power of various RAR designs for a clinical trial with K = 2 treatments and incorporating covariates. The results were obtained using a simulation with 5,000 replications.


We now consider covariates for rules with K = 3 treatments. Thus, we will compare

the DLC and ORBD. Recall that when we introduced the RDBCD rule in Section

2.4.3, we did not define the g function for K > 2 treatments so we cannot use this

rule. We used a similar approach as before to obtain the allocation proportion and

its standard deviation as well as the failure proportion. The results can be seen in

Table 16. We no longer include the column for covariate information as this was the

same for both rules. As before, the table reports only the allocation to

the first two treatments as the allocation proportion of the i = 3 treatment can be

obtained by subtraction.

We start the analysis by considering the rules when pi are equal. We notice that

the allocation is the same for both the rules, as expected. The standard deviation

of this allocation is lower for the DLC on both treatments for all pi. The failure

proportion is also very similar.

When pi are unequal, the ORBD assigns more patients than the DLC to the

best treatment. For the DLC the amount the allocation is skewed by is very small.

We notice a similar pattern as in the K = 2 case, that is the DLC maintains a

comparable level of treatment failures as the ORBD, despite a less ethical allocation.

Once again, the only difference of note is when the difference between pi is the largest,

that is pi = (0.8, 0.8, 0.2). We also notice that the DLC is much less variable than

the ORBD.

We now consider the power and significance level of the ORBD and DLC incorporating covariates with K = 3 treatments. A simulation was run using the same parameters as before, and the results are shown in Table 17. As before, for the power simulation we used the same n values as for the equivalent rules that are not covariate-adaptive.


(p1, p2, p3)    n     DLC                                  ORBD
                      AP(SD) to i = 1, 2        FP         AP(SD) to i = 1, 2        FP
(0.8,0.8,0.8)   100   0.33(0.04) 0.33(0.04)     0.32       0.33(0.13) 0.30(0.09)     0.32
(0.8,0.8,0.6)   100   0.35(0.04) 0.34(0.04)     0.37       0.37(0.14) 0.33(0.10)     0.37
(0.8,0.8,0.4)   100   0.36(0.04) 0.36(0.04)     0.42       0.41(0.14) 0.36(0.09)     0.40
(0.8,0.8,0.2)   100   0.37(0.04) 0.37(0.04)     0.45       0.42(0.14) 0.37(0.09)     0.42
(0.6,0.6,0.6)   100   0.34(0.04) 0.33(0.04)     0.49       0.33(0.14) 0.29(0.10)     0.49
(0.6,0.6,0.4)   100   0.35(0.04) 0.34(0.04)     0.54       0.38(0.14) 0.33(0.09)     0.54
(0.6,0.6,0.2)   100   0.35(0.04) 0.36(0.04)     0.59       0.40(0.14) 0.36(0.09)     0.57
(0.4,0.4,0.4)   100   0.33(0.04) 0.33(0.04)     0.66       0.33(0.13) 0.29(0.09)     0.66
(0.4,0.4,0.2)   100   0.34(0.03) 0.34(0.04)     0.71       0.38(0.13) 0.34(0.09)     0.71
(0.2,0.2,0.2)   100   0.33(0.03) 0.33(0.03)     0.83       0.33(0.10) 0.31(0.07)     0.83

Table 16: The allocation proportion and its standard deviation, and the failure proportion, for the DLC and ORBD with K = 3 treatments. We chose a = 0.7. This simulation used 5,000 replications.

(p1, p2, p3)    n     DLC    ORBD
(0.8,0.8,0.8)   100   0.05   0.06
(0.6,0.6,0.6)   100   0.03   0.06
(0.4,0.4,0.4)   100   0.04   0.05
(0.2,0.2,0.2)   100   0.04   0.03
(0.8,0.8,0.6)   206   0.66   0.60
(0.8,0.8,0.4)   62    0.70   0.59
(0.8,0.8,0.2)   27    0.65   0.61
(0.6,0.6,0.4)   256   0.75   0.67
(0.6,0.6,0.2)   57    0.74   0.63
(0.4,0.4,0.2)   217   0.81   0.71

Table 17: Simulated power of various RAR designs for a clinical trial with K = 3 treatments and incorporating covariates. The results were obtained using a simulation with 5,000 replications.

We start by noticing that there is no meaningful difference in the significance

level between the designs. When comparing the power, we can say that the DLC

maintains higher power than the ORBD.

We conclude that under covariate-adaptiveness, the ORBD has the most ethical

allocation proportion which results in the smallest failure proportion. However, the

trade-off is that this rule is the most variable and has the lowest power. The DLC

and RDBCD have less ethical allocations, but are less variable and have higher

power. Out of these two rules, the DLC is much less variable although the two rules

have similar power.


4 Conclusion

The initial intention of this dissertation has been to define some well studied RAR

designs, explore the extensions of these RAR designs to practical settings and then

to investigate the behaviour of these designs under those practical settings. We

have achieved this by considering the extensions of a number of rules to multiple

treatments, delayed responses and covariate-adaptiveness.

Throughout this investigation, it has been seen that the performance of a RAR

scheme significantly depends on a number of factors such as (i) target allocation

(ii) value of pi (iii) delay in response (iv) covariates. This means that there is no

universal design that is superior to the others in all aspects. We have seen that for

the K = 2 treatments design with no delay and not incorporating covariates, the

rules that are able to target any given allocation proportion i.e. GDL, DBCD and

ERADE perform the best in terms of maintaining high power and having a less

variable allocation proportion. This can be seen particularly in the case when these

rules target RSIHR allocation. However, the ORBD performs the best in terms of

the most ethical allocation. We also found that rules targeting urn allocation are

less variable when pi is low whilst RSIHR is less variable for high pi, meaning that a

suitable allocation proportion should also be chosen depending on the approximate

pi levels expected. Similar results have been shown for the same designs with K = 3

treatments.

We then investigated the behaviour of the rules under delayed responses. We

saw that in general the performance of the procedures is not significantly affected

by delayed responses, as long as the delay is moderate.

When RAR procedures incorporate covariates, the ORBD has the most favourable allocation and failure proportion, but is highly variable. On the other hand, the DLC is much less variable but skews the allocation only slightly. One limitation of the RAR designs studied here is the extension under which the rules have a success probability that varies across covariate levels. Although this is a more realistic setting in practice, it meant that we were unable to compare the covariate-adaptive procedures to the same procedures without covariates. Thus, if this investigation were to be done again, a setting that permits this comparison could be a suitable alternative.

The subject of randomisation in clinical trials is a rapidly growing field of research and therefore many extensions to the work presented here could have been considered. One rule that could have been covered here in more detail is the ERADE. We saw that for K = 2 treatments it performed very well, so investigating the rule under all criteria considered for the other rules (e.g. delayed responses, covariate-adaptiveness) would be a suitable extension. It would also be interesting to consider all the rules incorporating both covariates and delayed responses, as this could be a common scenario in practice. If more time were available, responses that are not binary could also have been considered.


References

Atkinson, A. C. and A. Biswas (2013). Randomised Response-Adaptive Designs in Clinical Trials. Chapman & Hall/CRC Monographs on Statistics & Applied Probability. Boca Raton: Taylor & Francis Group.

Baldi Antognini, A. and M. Zagoraiou (2012). Multi-objective optimal designs in comparative clinical trials with covariates: The reinforced doubly adaptive biased coin design. Ann. Statist. 40 (3), 1315–1345.

Bandyopadhyay, U. and A. Biswas (1999). Allocation by randomized play-the-winner rule in the presence of prognostic factors. Sankhyā: The Indian Journal of Statistics, Series B (1960-2002) 61 (3), 397–412.

Bandyopadhyay, U., A. Biswas, and R. Bhattacharya (2009). Drop-the-loser design in the presence of covariates. Metrika 69 (1), 1–15.

Bartlett, R. H., D. W. Roloff, R. G. Cornell, A. F. Andrews, P. W. Dillon, and J. B. Zwischenberger (1985). Extracorporeal circulation in neonatal respiratory failure: A prospective randomized study. Pediatrics 76 (4), 479–487.

Biswas, A. (1999). Delayed response in randomized play-the-winner rule revisited. Communications in Statistics - Simulation and Computation 28 (3), 715–731.

Eisele, J. R. (1994). The doubly adaptive biased coin design for sequential clinical trials. Journal of Statistical Planning and Inference 38 (2), 249–261.

Eisele, J. R. and M. B. Woodroofe (1995). Central limit theorems for doubly adaptive biased coin designs. Ann. Statist. 23 (1), 234–254.

Hu, F. and W. F. Rosenberger (2003). Optimality, variability, power. Journal of the American Statistical Association 98, 671–678.

Hu, F., W. F. Rosenberger, and L.-X. Zhang (2006). Asymptotically best response-adaptive randomization procedures. Journal of Statistical Planning and Inference 136 (6), 1911–1922.

Hu, F. and L.-X. Zhang (2004). Asymptotic properties of doubly adaptive biased coin designs for multitreatment clinical trials. Ann. Statist. 32 (1), 268–301.

Hu, F., L.-X. Zhang, and X. He (2009). Efficient randomized-adaptive designs. Ann. Statist. 37 (5A), 2543–2560.

Ivanova, A. (2003). A play-the-winner-type urn design with reduced variability. Metrika 58 (1), 1–13.

Ivanova, A. and C. Flournoy (2001). A birth and death urn for ternary outcomes: Stochastic processes applied to urn models. In C. A. Charalambides, M. V. Koutras, and N. Balakrishnan (Eds.), Probability and Statistical Models with Applications, pp. 583–600. Boca Raton: Chapman and Hall/CRC Press.

Matthews, P. C. and W. F. Rosenberger (1997). Variance in randomized play-the-winner clinical trials. Statistics & Probability Letters 35 (3), 233–240.

Melfi, V. and C. Page (1998). Variability in Adaptive Designs for Estimation of Success Probabilities, Volume 34 of Lecture Notes–Monograph Series, pp. 106–114. Hayward, CA: Institute of Mathematical Statistics.

Melfi, V. F., C. Page, and M. Geraldes (2001). An adaptive randomized design with application to estimation. Canadian Journal of Statistics 29 (1), 107–116.

Rosenberger, W. F. (1999). Randomized play-the-winner clinical trials: review and recommendations. Controlled Clinical Trials 20 (4), 328–342.

Rosenberger, W. F., N. Stallard, A. Ivanova, C. N. Harper, and M. L. Ricks (2001). Optimal adaptive designs for binary response trials. Biometrics 57 (3), 909–913.

Rosenberger, W. F., A. N. Vidyashankar, and D. K. Agarwal (2001). Covariate-adjusted response-adaptive designs for binary response. Journal of Biopharmaceutical Statistics 11 (4), 227–236.

Smythe, R. T. and W. F. Rosenberger (1995). Play-the-winner designs, generalized Polya urns, and Markov branching processes, Volume 25 of Lecture Notes–Monograph Series, pp. 13–22. Hayward, CA: Institute of Mathematical Statistics.

Sun, R., S. H. Cheung, and L.-X. Zhang (2007). A generalized drop-the-loser rule for multi-treatment clinical trials. Journal of Statistical Planning and Inference 137 (6), 2011–2023.

Tamura, R. N., D. E. Faries, J. S. Andersen, and J. H. Heiligenstein (1994). A case study of an adaptive clinical trial in the treatment of out-patients with depressive disorder. Journal of the American Statistical Association 89, 768–776.

Thompson, W. R. (1933). On the likelihood that one unknown probability exceeds another in view of the evidence of two samples. Biometrika 25 (3), 285–294.

Wei, L. J. and S. Durham (1978). The randomized play-the-winner rule in medical trials. Journal of the American Statistical Association 73, 840–843.

Zelen, M. (1969). Play the winner rule and the controlled clinical trial. Journal of the American Statistical Association 64, 131–146.

Zhang, L.-X., W. Chan, S. Cheung, and F. Hu (2007). A generalized drop-the-loser urn for clinical trials with delayed responses. Statist. Sinica 17 (1), 387–409.

Zhang, L.-X. and F. Hu (2009). A new family of covariate-adjusted response adaptive designs and their properties. Applied Mathematics-A Journal of Chinese Universities 24 (1), 1–13.

Zhang, L.-X., F. Hu, S. H. Cheung, and W. S. Chan (2014). Multiple-treatment efficient randomized adaptive design with minimum selection bias. Manuscript.


A R code used in simulations

Note

The code given in the sections below performs a simple execution of all rules considered

throughout this dissertation. Each rule was defined in such a way that it allows

any number of treatments, apart from RPW, RDBCD and ERADE which were not

extended to K > 2 treatments. The designs given here also do not allow delayed

responses, but a description of how to extend the procedures is given in Section 3.4.

The ORBD given here allows covariate adaptiveness but can be altered to disregard

this, as was outlined in Section 2.3.1.
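Each function below simulates a single trial. The following sketch shows one way in which repeated executions might be combined into the mean and standard deviation of the allocation proportion. The names `cr_rule` and `summarise_rule` are hypothetical, and complete randomisation is used as a stand-in rule that returns the allocation proportion and the failure proportion, in the same order as rpw.

```r
# Stand-in rule (complete randomisation with K = 2 treatments),
# returning c(AP to tmnt 1, FP)
cr_rule <- function(p, n) {
  tmnt <- sample(1:2, n, replace = TRUE)   # equal allocation probabilities
  fail <- runif(n) >= p[tmnt]              # TRUE where the outcome is a failure
  c(mean(tmnt == 1), mean(fail))
}

# Run `rule` reps times and summarise the results.
# vapply is used rather than replicate so that ... is passed on correctly.
summarise_rule <- function(rule, reps, ...) {
  out <- vapply(seq_len(reps), function(r) rule(...), numeric(2))
  c(AP = mean(out[1, ]), SD = sd(out[1, ]), FP = mean(out[2, ]))
}

set.seed(4)
res <- summarise_rule(cr_rule, reps = 1000, p = c(0.8, 0.6), n = 100)
res
```

Any of the rules in this appendix that return the allocation proportion followed by the failure proportion could be passed in place of `cr_rule`.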

A.1 RPW

rpw <- function(p,n,alpha,beta){
  #declarations
  balls<-rep(alpha,2) #initial urn composition
  allocated<-c() #treatment allocated to each patient
  s<-rep(0,2) #holds number of successes
  f<-rep(0,2) #holds number of failures
  for(i in 1:n){ #main loop
    decision<-runif(1) #random number for deciding treatment
    if (decision<(balls[1]/sum(balls))){ #if ball drawn corresponds to tmnt 1
      allocated[i]<-1 #assign tmnt 1
    }else{
      allocated[i]<-2 #assign tmnt 2
    }
    t_outcome<-runif(1) #decides outcome
    if(t_outcome<p[allocated[i]]){ #outcome is a success
      s[allocated[i]]<-s[allocated[i]]+1 #update number of successes
      balls[allocated[i]]<-balls[allocated[i]]+beta #add beta balls of corresponding tmnt to urn
    }else{ #outcome is a failure
      f[allocated[i]]<-f[allocated[i]]+1 #update number of failures
      if(allocated[i]==1){ #if tmnt i=1
        balls[2]<-balls[2]+beta #add beta balls of opposite kind
      }else{
        balls[1]<-balls[1]+beta #add beta balls of opposite kind
      }
    }
  }
  return(append(sum(allocated==1)/n,sum(f)/n)) #return AP and FP
}

A.2 DL

dl <- function(p,n,z){ #this rule works for K treatments
  #declarations
  bounds<-c(0)
  allocated<-rep(1,n) #ball type drawn for each patient (1 = immigration ball)
  s<-rep(0,length(p)) #holds number of successes
  f<-rep(0,length(p)) #holds number of failures
  for (i in 1:n){ #main loop
    while (allocated[i]==1){ #while immigration ball is being drawn
      alloc_rnd<-runif(1) #random number to decide tmnt
      for (j in 1:length(z)){ #loop to obtain probabilities of each tmnt
        bounds[j+1]<-sum(z[1:j])/sum(z) #probability of each tmnt based on urn composition
        if (alloc_rnd > bounds[j] && alloc_rnd < bounds[j+1]){
          allocated[i]<-j #allocate based on rnd number generated
        }
      }
      if(allocated[i]==1){ #if immigration ball chosen
        for (k in 2:length(z)){ #add 1 ball for each tmnt
          z[k]<-z[k]+1
        }
      }else{ #if not immigration ball obtain response
        t_choice<-runif(1) #decides outcome
        if (t_choice<p[allocated[i]-1]){ #success
          s[allocated[i]-1]<-s[allocated[i]-1]+1 #update number of successes
        }else{ #failure
          z[allocated[i]]<-z[allocated[i]]-1 #update urn composition
          f[allocated[i]-1]<-f[allocated[i]-1]+1 #update number of failures
        }
      }
    }
  }
  return(append(sum(allocated==2)/n,sum(f)/n)) #assumed return value, by analogy with rpw: AP to tmnt 1 and FP
}
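The loop over cumulative bounds used to draw a ball type could alternatively be written with the base R function `sample`, as in the sketch below. The helper `draw_ball` is hypothetical and is not part of the simulation code. It assumes that all urn counts are non-negative, which holds for the DL rule but may not hold for rules that add fractional numbers of balls.

```r
# Hypothetical helper: draw one ball type from the urn z, where
# z[1] is the number of immigration balls and z[k + 1] the number of
# balls of treatment k. Assumes all entries of z are non-negative.
draw_ball <- function(z) {
  sample(seq_along(z), size = 1, prob = z)   # prob is normalised by sample
}

set.seed(5)
z <- c(1, 3, 1)                              # illustrative urn composition
draws <- replicate(10000, draw_ball(z))
prop.table(table(draws))                     # close to z / sum(z)
```

The returned index plays the same role as allocated[i] in the functions of this appendix: a value of 1 corresponds to an immigration ball and a value of k + 1 to treatment k.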

A.3 GDL

rsihr_alloc <- function(p,tmnt){ #choice of a for rsihr allocation
  return( 2*(sqrt(p[tmnt])/(sum(sqrt(p)))) ) #return a
}

urn_alloc<- function(p,tmnt){ #choice of a for urn allocation e.g DL rule
  return(1)
}

gdl<- function(p,n,z,D=0,a_function){ #this rule works for K treatments
  #declarations
  bounds<-c(0)
  allocated<-rep(1,n) #ball type drawn for each patient (1 = immigration ball)
  s<-rep(0,length(p)) #holds number of successes
  f<-rep(0,length(p)) #holds number of failures
  p_est<-c(rep(0,length(p))) #estimate of p based on current responses
  for (i in 1:n){ #main loop
    p_est<- (s+1)/(s+f+2) #update estimates of p
    while (allocated[i]==1){ #while immigration ball is being drawn
      alloc_rnd<-runif(1) #random number to allocate tmnt
      for(j in 1:length(z)){ #decision prob based on urn composition
        bounds[j+1]<-max(sum(z[1:j])/sum(z),0)
        if ((alloc_rnd > bounds[j]) && (alloc_rnd < bounds[j+1])){
          allocated[i]<-j #allocate tmnt based on rnd number generated
        }
      }
      if(allocated[i]==1){ #if immigration ball is drawn
        for (k in 2:length(z)){ #update urn
          z[k]<-z[k]+a_function(p_est,k-1) #add a balls, depends on target allocation
        }
      }else{
        z[allocated[i]]<-z[allocated[i]]-1 #draw a ball
      }
    }
    #Response
    t_choice<-runif(1)
    if (t_choice < p[allocated[i]-1]){ #success
      s[allocated[i]-1]<-s[allocated[i]-1]+1 #update successes
      z[allocated[i]] <- z[allocated[i]] + D #update urn composition
      #if target allocation is rsihr D=0, if urn allocation D=1
    }else{ #failure
      f[allocated[i]-1]<-f[allocated[i]-1]+1 #update failures
    }
  }
  return(append(sum(allocated==2)/n,sum(f)/n)) #assumed return value, by analogy with rpw: AP to tmnt 1 and FP
}

A.4 DLC

dlc <- function(p,n,z,u,G,a){ #this rule works for K treatments

bounds<-c(0)

allocated<-rep(1,n) #tmnt allocation of each patient

pi_est<-c() #estimates of p

s<-rep(0,length(p)) #holds number of succcesses


cov_bound<-c() #bounds to work out covariate levels

u_lvl<-c() #covariate levels

#bounds to split the uniformly distributed covariates into grades

for(m in 1:(G+1)){

cov_bound[m]<-abs(min(u))+ (m/(G+1))

}

cov_bound<-append(0,cov_bound) #lower bound for covariates

for (q in 0:G){

pi_est[q+1]<- (q+1)/(G+1) #success probabilites for each covariate level

}


for(b in 1:(G+1)){ #place continuous covariate into right covariate level

if(u[i]>=cov_bound[b] && u[i]<=cov_bound[b+1]){

u_lvl[i]<-b

}

}

while (allocated[i]==1){ #while immigration ball is drawn

alloc_rnd<-runif(1) #random number to assign tmnt

for (j in 1:length(z)){ #decision prob based on urn composition

bounds[j+1]<-sum(z[1:j])/sum(z)


if (alloc_rnd > bounds[j] && alloc_rnd < bounds[j+1]){

allocated[i]<-j

#allocate tmnt based on rnd number generated

}

}

if(allocated[i]==1){ #if immigration ball is drawn

for (k in 2:length(z)){ #update urn

z[k]<-z[k]+1 #add a ball for each tmnt

}

}else{

t_choice<-runif(1) #decides response

replace_dec<-runif(1) #decides whether to replace ball

z[allocated[i]] <- z[allocated[i]] - 1 #draw ball

46

140293481

if (t_choice<((a^(u_lvl[i]-1))*(p[allocated[i]-1]))){

#success, using a parameter

s[allocated[i]-1]<-s[allocated[i]-1] + 1 #update success

if (replace_dec < pi_est[u_lvl[i]]){

#replace ball with corresponding probability pi

z[allocated[i]] <- z[allocated[i]] +1

}

}else{ #failure

f[allocated[i]-1]<-f[allocated[i]-1] + 1 #update failure

#replace ball with corresponding probability 1-pi(G-j)

if (replace_dec < (1-pi_est[(G+2)-u_lvl[i]])){

z[allocated[i]] <- z[allocated[i]] +1

}

}

}

}

}


}
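The grading step inside dlc places each continuous covariate into one of G+1 levels by looping over the bounds. A minimal sketch of the same idea is given below, assuming the covariates lie in [0, 1] and the grades have equal width; the name grade_covariate is hypothetical and is not part of the listing above.

```r
# Sketch only: grade a continuous covariate on [0, 1] into G+1 equal-width levels.
# grade_covariate is a hypothetical helper, not a function from the listing.
grade_covariate <- function(u, G) {
  breaks <- seq(0, 1, length.out = G + 2)  # G+2 bounds define G+1 grades
  # rightmost.closed = TRUE places u = 1 in the top grade rather than beyond it
  findInterval(u, breaks, rightmost.closed = TRUE)
}

u <- c(0.10, 0.50, 0.90, 1.00)
grade_covariate(u, G = 2)  # grades 1, 2, 3, 3
```

Using findInterval grades all patients in one call, so the grading could be done once before the patient loop rather than inside it.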

A.5 ORBD

orbd_cov <- function(p,n,u){ #this rule works for K treatments

n_t<-length(p) #number of tmnts

s<-rep(0,n_t) #holds number of successes for each tmnt

f<-rep(0,n_t) #holds number of failures for each tmnt

tmnt<-c() #holds tmnts for each consecutive patient

response<-c() #holds response for each consecutive patient

allocated<-c() #tmnt allocated

cov_tbl<-c() #holds covariate information for each patient

beta_est<-0 #beta estimates

for(i in 1:n){ #loop over patients

dec_prob<-c(0) #reset decision probability after each patient

if (any(s==0)|| any(f==0)) {

#if there are any tmnts with no outcomes yet

dec_prob<-append(dec_prob,rep(1/n_t,n_t-1)) #equal allocation

}else{ #if all tmnts have at least one response

model<-glm(response~tmnt+cov_tbl+cov_tbl*tmnt,

family=quasibinomial(link="logit")) #logistic model using glm

beta_est<-model$coefficients #current model estimates

beta_est[1]<-0 #since exp(0)=1,

# allows generalisation since we assign using 1/(B+..),B[2]/(B+..) and so on

if(any(is.na(beta_est)) || model$converged==0){

#if beta are missing or algorithm did not converge

dec_prob<-append(dec_prob,rep(1/n_t,n_t-1)) #equal allocation

}else{

for (j in 2:(n_t)){

#probability of each treatment based on logistic model

dec_prob[j]<- exp(beta_est[j-1])/(1 +

sum(exp(beta_est[2:n_t]+u[i]*beta_est[j-1])))

#using function defined in orbd section

}

}

}

dec_prob<-append(dec_prob,1) #upper decision probability, used to generalise next loop

fair_coin<-runif(1) #random number deciding tmnt

for (k in 1:n_t){

if(fair_coin > sum(dec_prob[1:k]) && fair_coin <

sum(dec_prob[1:(k+1)])){

allocated[i]<-k

}

}

t_outcome<-runif(1) #random number to decide outcome

if(t_outcome<p[allocated[i]]){ #success

s[allocated[i]]<-s[allocated[i]]+1 #update successes

}else{ #failure

f[allocated[i]]<-f[allocated[i]]+1 #update failures

}

#update variables for the logistic model

tmnt<-append(tmnt,toString(allocated[i]))

response<-append(response, (s[allocated[i]]/

(s[allocated[i]]+f[allocated[i]])))

cov_tbl<-append(cov_tbl, u[i])

}


}
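Each rule in this appendix assigns a treatment by comparing a uniform random number with cumulative decision probabilities. A minimal sketch of that step on its own follows; draw_treatment and its argument names are hypothetical, and findInterval is used here in place of the explicit loop over bounds.

```r
# Sketch only: pick a treatment from a vector of assignment probabilities.
# draw_treatment is a hypothetical helper, not a function from the listing.
draw_treatment <- function(prob, rnd = runif(1)) {
  upper <- cumsum(prob)  # upper bound of each treatment's interval
  # count the bounds at or below rnd; the next interval contains rnd
  k <- findInterval(rnd, upper) + 1
  min(k, length(prob))   # guard against rounding error in the final bound
}

prob <- c(0.2, 0.3, 0.5)
draw_treatment(prob, rnd = 0.10)  # treatment 1
draw_treatment(prob, rnd = 0.25)  # treatment 2
draw_treatment(prob, rnd = 0.90)  # treatment 3
```

Passing rnd explicitly makes the step easy to check by hand; leaving it at its default gives a random assignment.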

A.6 DBCD

g_func <- function(x,y,alpha,L){ #g function. This is the multi-tmnt version

vec<-rep(0,length(y))

ans<-vec

for(i in 1:length(vec)){

vec[i]<-((y[i]*(y[i]/x[i])^alpha)^L)

}

for(j in 1:length(vec)){

ans[j]<-vec[j]/sum(vec)

}

return(ans)

}
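The multi-treatment g function above can also be written without explicit loops, since R arithmetic is vectorised. The sketch below is an equivalent form under that assumption; the name g_vec and the example values are hypothetical.

```r
# Sketch only: vectorised form of the multi-treatment g function.
# x: current allocation proportions, y: target proportions (both sum to 1).
g_vec <- function(x, y, alpha, L) {
  w <- (y * (y / x)^alpha)^L  # unnormalised weights
  w / sum(w)                  # normalise to assignment probabilities
}

x <- c(0.5, 0.3, 0.2)  # treatment 1 is over-represented relative to y
y <- c(0.4, 0.4, 0.2)
g_vec(x, y, alpha = 2, L = 1)
```

When the current proportions already equal the target (x equal to y) and L = 1, the weights reduce to y itself, so the design assigns according to the target.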

urn_allocation <- function(p){ #target urn allocation

ans<-rep(0,length(p))

for(i in 1:length(ans)){

ans[i]<- (1/(1-p[i]))/(sum(1/(1-p)))

}

return(ans)

}

rsihr_allocation <- function(p){ #target rsihr allocation

ans<-rep(0,length(p))

for(i in 1:length(ans)){

ans[i]<- (sqrt(p[i]))/sum(sqrt(p))

}

return(ans)

}
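As a worked illustration of the two targets, the sketch below evaluates both for a pair of success probabilities. The vectorised definitions and the names urn_target and rsihr_target are hypothetical restatements of the functions above, included so the example runs on its own.

```r
# Sketch only: vectorised restatements of the two target allocations.
urn_target   <- function(p) (1 / (1 - p)) / sum(1 / (1 - p))
rsihr_target <- function(p) sqrt(p) / sum(sqrt(p))

p <- c(0.8, 0.6)
urn_target(p)    # 2/3 and 1/3
rsihr_target(p)  # closer to equal allocation than the urn target
```

For these values the urn target assigns two thirds of patients to the better treatment, while the RSIHR target is less skewed.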

dbcd<- function(p,n,n0,alpha=0,L=1,alloc_prop){

#declarations

allocated<-c() # tmnt assigned to each patient

t<- length(p) #number of tmnts

s<-rep(0,t) #number of successes

f<-rep(0,t) #number of failures

bounds<-c(0) #bounds on each tmnt probability, used for tmnt assignment

p_est<-c() #estimates of p

for(i in 1:n){ #loop over patients

if(any((s+f) < n0)){ #if any tmnt has fewer than n0 patients

bounds<-append(0,rep(1/t,t)) #equal allocation

}else{

for(j in 1:t){

p_est[j]<-(s[j]+1)/(s[j] +f[j] + 2) #p estimates

}

bounds[2:(t+1)]<-g_func(s+f,alloc_prop(p_est),alpha,L) #assignment probabilities for each tmnt

}

fair_coin <- runif(1) #random number to decide tmnt

for(k in 1:t){

if(fair_coin>sum(bounds[1:k]) && fair_coin<sum(bounds[1:(k+1)])){

#assign tmnt based on probabilities above

allocated[i]<-k

}

}

response_dec <- runif(1) #random number to obtain response

if(response_dec < p[allocated[i]]){ #success

s[allocated[i]]<-s[allocated[i]] + 1 #update number of successes

}else{ #failure

f[allocated[i]]<-f[allocated[i]] + 1 #update number of failures

}

}


}
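For readers who want a version that runs on its own, the following is a compact sketch of a K-treatment DBCD simulation with immediate responses. It is not the listing above: the names dbcd_sketch, g_alloc and target_rsihr, the default parameter values and the returned list are all assumptions made for illustration, and sample.int replaces the cumulative-bounds loop.

```r
# Sketch only: compact DBCD simulation; all names and defaults are hypothetical.
target_rsihr <- function(p) sqrt(p) / sum(sqrt(p))

g_alloc <- function(x, y, alpha, L) {
  w <- (y * (y / x)^alpha)^L  # same form as the multi-treatment g function
  w / sum(w)
}

dbcd_sketch <- function(p, n, n0 = 2, alpha = 2, L = 1, target = target_rsihr) {
  K <- length(p)
  s <- rep(0, K)  # successes per treatment
  f <- rep(0, K)  # failures per treatment
  allocated <- integer(n)
  for (i in seq_len(n)) {
    n_k <- s + f
    if (any(n_k < n0)) {
      prob <- rep(1 / K, K)              # equal allocation during start-up
    } else {
      p_est <- (s + 1) / (n_k + 2)       # estimates kept away from 0 and 1
      prob <- g_alloc(n_k / sum(n_k), target(p_est), alpha, L)
    }
    k <- sample.int(K, 1, prob = prob)   # assign treatment
    allocated[i] <- k
    if (runif(1) < p[k]) s[k] <- s[k] + 1 else f[k] <- f[k] + 1
  }
  list(allocated = allocated, successes = s, failures = f,
       proportions = (s + f) / n)
}

set.seed(1)
out <- dbcd_sketch(p = c(0.8, 0.4), n = 200)
out$proportions
```

Returning a list keeps the allocation sequence and the success and failure counts together, which is convenient when repeating the simulation to estimate allocation proportions or power.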

A.7 RDBCD

g_func <- function(x,y,z){ #g function for the RDBCD, as defined previously

ans<-0

if(x==0 || x==1){

ans<- 1-x

}else{

ans<- (y*((y/x)^z))/((y*((y/x)^z))+(((1-y)*((1-y)/(1-x))^z)))

}

return(ans)

}

rsihr_alloc<-function(p_a,p_b){ #rsihr allocation proportion

ans<- (sqrt(p_a)/(sqrt(p_a)+sqrt(p_b)))

return(ans)

}

rdbcd<- function(p,n,n0,allocation,u){ #only allows K=2 tmnts

allocated<-rep(0,n) #allocation for each consecutive patient

s<-rep(0,2) #number of successes for each tmnt

f<-rep(0,2) #number of failures for each tmnt

cov_bound<-c()

u_lvl<-c()

for(i in 1:n){

if(sum(allocated==1)<n0 || sum(allocated==2)<n0){ #if either tmnt has fewer than n0 patients

dec_prob<-0.5 #equal allocation

}else{

p_a <- (s[1]+1)/(s[1] +f[1] + 2) #estimate p_1

p_b <- (s[2]+1)/(s[2] +f[2] + 2) #estimate p_2

dec_prob<-g_func(sum(allocated==1)/i,allocation(p_a,p_b),u[i])

#probability using g function and current estimates of p

}

fair_coin <- runif(1) #random number to decide tmnt assignment

if (fair_coin < dec_prob){

allocated[i]=1 #assign tmnt 1

}else{

allocated[i]=2 #assign tmnt 2

}

response_dec <- runif(1) #random number to decide response

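if(response_dec<p[allocated[i]]){ #success

s[allocated[i]]<-s[allocated[i]] + 1 #update number of successes

}else{ #failure

f[allocated[i]]<-f[allocated[i]] + 1 #update number of failures

}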

}


}

A.8 ERADE

#allocation proportions not defined here to save space

#these are the same as for the DBCD

erade <- function(p,n,n0,alloc_prop,alpha){ #only allows K=2 tmnts

allocated<-c() #tmnt allocated to each consecutive patient

s<-rep(0,length(p)) #number of successes for each tmnt

f<-rep(0,length(p)) #number of failures for each tmnt

p_est<-rep(0,length(p)) #estimate of p

for(i in 1:n){ #loop over patients

if(any( (s+f) < n0)){ #if any tmnt has fewer than n0 patients

sel_prob<-0.5 #equal allocation

}else{

rho_est<-alloc_prop(p_est[1],p_est[2])

#estimate of allocation proportion using current estimates of p

#piecewise function that defines assignment probability, defined in ERADE section

if(((s[1]+f[1])/i)>rho_est){ #condition 1

sel_prob<-alpha*rho_est

}else if(((s[1]+f[1])/i)==rho_est){ #condition 2

sel_prob <-rho_est

}else{ #condition 3

sel_prob <- 1 - alpha*(1-rho_est)

}

}

fair_coin <- runif(1) #random number to decide tmnt

if(fair_coin<sel_prob){

allocated[i]<-1 #allocate tmnt 1

}else{

allocated[i]<-2 #allocate tmnt 2

}

response_dec <- runif(1) #random number to decide response

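if(response_dec < p[allocated[i]]){ #success

s[allocated[i]]<-s[allocated[i]] + 1 #update number of successes

}else{ #failure

f[allocated[i]]<-f[allocated[i]] + 1 #update number of failures

}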

p_est<-c((s[1]+1)/(s[1]+f[1]+2),(s[2]+1)/(s[2]+f[2]+2)) #update current p estimates

}


}
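The piecewise assignment probability used inside erade can be isolated as a small function. The sketch below assumes the usual three-case form of the ERADE rule for two treatments; the name erade_prob and its arguments are hypothetical.

```r
# Sketch only: ERADE probability of assigning treatment 1.
# n1: patients on treatment 1 so far, m: patients so far,
# rho_est: estimated target proportion, alpha: constant in [0, 1).
erade_prob <- function(n1, m, rho_est, alpha) {
  prop1 <- n1 / m
  if (prop1 > rho_est) {
    alpha * rho_est            # treatment 1 over-represented
  } else if (prop1 == rho_est) {
    rho_est                    # on target
  } else {
    1 - alpha * (1 - rho_est)  # treatment 1 under-represented
  }
}

erade_prob(6, 10, rho_est = 0.5, alpha = 0.5)  # 0.25
erade_prob(5, 10, rho_est = 0.5, alpha = 0.5)  # 0.5
erade_prob(4, 10, rho_est = 0.5, alpha = 0.5)  # 0.75
```

Smaller values of alpha push the allocation back towards the target more strongly whenever the current proportion departs from it.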
