sequential methods for pharmacogenetic studies



Computational Statistics and Data Analysis 56 (2012) 1221–1231

Contents lists available at SciVerse ScienceDirect

Computational Statistics and Data Analysis

journal homepage: www.elsevier.com/locate/csda

Sequential methods for pharmacogenetic studies

Susan Todd a,∗, M. Fazil Baksh a, John Whitehead b

a Department of Mathematics and Statistics, University of Reading, Philip Lyle Building, Reading, RG6 6BX, UK
b Medical and Pharmaceutical Statistics Research Unit, Department of Mathematics and Statistics, Fylde College, Lancaster University, LA1 4YF, UK

Article info

Article history: Available online 2 March 2011

Keywords: Association studies; Candidate gene; Genome-wide; Group-sequential; Interim analyses

Abstract

A study or experiment can be described as sequential if its design includes one or more interim analyses at which it is possible to stop the study, having reached a definitive conclusion concerning the primary question of interest. The potential of the sequential study to terminate earlier than the equivalent fixed sample size study means that, typically, there are ethical and economic advantages to be gained from using a sequential design. These advantages have secured a place for the methodology in the conduct of many clinical trials of novel therapies. Recently, there has been increasing interest in pharmacogenetics: the study of how DNA variation in the human genome affects the safety and efficacy of drugs. The potential for using sequential methodology in pharmacogenetic studies is considered and the conduct of candidate gene association studies, family-based designs and genome-wide association studies within the sequential setting is explored. The objective is to provide a unified framework for the conduct of these types of studies as sequential designs and hence allow experimenters to consider using sequential methodology in their future pharmacogenetic studies.

© 2011 Elsevier B.V. All rights reserved.

1. Introduction

A sequential study is an experiment in which the design includes one or more interim analyses that could lead to a definitive answer concerning the primary question of interest. Such a study is different from its fixed sample size counterpart in that the sample size is not calculated in advance. Instead, a stopping rule is defined which determines when the study is completed. Such studies offer practical, economic and ethical advantages through avoiding continuation of the study in the face of mounting evidence favouring a particular hypothesis. Sequential methods have been successfully implemented and have demonstrated benefits for both patients and trial sponsors in the various phases of traditional clinical development (Jennison and Turnbull, 2000; Whitehead, 1997).

Pharmacogenetics is the study of how genetic variation determines response and side-effects of therapeutic agents and the identification and design of novel drug targets. It is a rapidly developing field (see, for example, Brockmöller and Tzvetkov, 2008), as the search continues to identify successful therapies and the best target patient populations to receive them. Recent interest in pharmacogenetics in the fields of cancer (Huang and Ratain, 2009), antiretroviral medicines (Hughes et al., 2008) and anti-epileptic drugs (Baksh and Kelly, 2008; Löscher et al., 2009) illustrates the important impact that the results of such work are likely to have on new therapeutic strategies and future policy-making decisions.

Many traditionally designed pharmacogenetic studies are not sufficiently powerful for reliable conclusions to be drawn or, as in studies of rare genotype groups and small effect sizes, require prohibitively large samples (Kirchheiner et al., 2005). Under such circumstances, it has been shown (Baksh et al., 2006; Cui et al., 2009; Dreyfus et al., 2001; Shuster et al., 2002; van der Tweel and van Noord, 2000) that use of a sequential design can be beneficial, as sequential designs are likely to use fewer observations than fixed sample designs without compromising the reliability of the conclusions. The objective of this paper is to describe and present a unified framework for the conduct of pharmacogenetic studies in clinical drug development and in post-marketing studies, within a sequential setting. The methods discussed in this paper have largely been motivated by sequential methods for genetic and epidemiological studies described elsewhere and cover candidate gene, genome-wide and family-based studies. In Section 2 an introduction to sequential methodology is given and the idea of a unified framework is introduced. Sections 3–5 then build upon this, describing methodology within the stated framework, for candidate gene association studies, family-based studies and genome-wide studies respectively. Where relevant, literature by other authors is cited and the links between their various approaches and that presented in this paper are established. Two worked examples are presented. Our aim is to provide the knowledge to allow future experimenters to consider planning their pharmacogenetic studies sequentially, in the interests of promoting more efficient research.

∗ Corresponding author. Tel.: +44 118 378 8917; fax: +44 118 378 8032. E-mail address: [email protected] (S. Todd).

0167-9473/$ – see front matter © 2011 Elsevier B.V. All rights reserved. doi:10.1016/j.csda.2011.02.019

2. An introduction to sequential methodology

In sequential analysis, individual observations or groups of observations are analysed at interim looks to determine whether a study should be stopped. Traditionally, the term ‘sequential design’ has referred to looking at data after every observation while ‘group-sequential design’ specifies the situation where new data on a group of patients are available for each interim analysis. Modern sequential designs, as described by, for example, Jennison and Turnbull (2000) or Whitehead (1997), can be easily implemented in both of the above cases and in this paper, we will use the term ‘sequential study’ or ‘sequential design’ to apply to either. In order to set up a sequential study, consider a trial where the hypothesis of interest is expressed in terms of a single parameter θ and take the null hypothesis to be H0 : θ = 0. A pair of test statistics is calculated from the data available at each interim analysis of the sequential test procedure. These statistics are used to assess progress of the study and are compared with an appropriate stopping rule, used to determine whether the trial should stop or not, which is specified in advance of the trial. The procedure can be formulated so that one of these statistics, which we will denote by Z, measures the cumulative strength of evidence against the null hypothesis, H0. The second statistic, which we will denote by V, represents the information about θ that is currently available and is closely related to sample size. A plot of Z against V forms a sample path which represents the progress of the study. This characterisation and notation, which underlies the basis of our unified approach, follows Whitehead (1997): note that the variance of Z is V rather than 1 as it would be in an alternative standardised representation. The route to implementation of a sequential procedure is: first, definition of the parameter of interest, θ, then determination of the appropriate forms for the test statistics Z and V, and finally specification of a suitable stopping rule depending on the objectives of the study. Many other representations of sequential tests involve plotting or following statistics which are exact or approximate functions of Z and V, so that they could be transformed to the structure studied here.

Once basic definitions of the parameter of interest and the test statistics have been established, attention can turn to the selection of an appropriate stopping rule. The choice is usually made on the basis of the circumstances under which small samples are desirable and will be influenced by economic and ethical considerations. Although authors have suggested slightly different approaches (for example, the boundaries approach described by Whitehead (1997) or the α-spending function approach of Lan and DeMets (1983)) and indeed, the use of particular designs for particular types of study, it is important to realise that it is possible to ‘‘mix and match’’. The statistics Z and V relating to a particular study type can be used in conjunction with any of the designs available in the literature according to the specific requirements of the study.

The sequential framework discussed in this paper assumes normality of the test statistic Z, together with independence of the increments (Zi − Zi−1) between the (i − 1)th and ith interim analyses. Any sequential procedure devised utilising these assumptions can therefore be used to conduct a genetic study. For such procedures, it is possible to evaluate properties such as the mean, median and 90th percentile of the amount of information, V, needed to complete the study. Translation is then made from V to sample size in order to explore the properties of a design under different values of θ.
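The two assumptions can be illustrated with a short simulation. The sketch below is not part of the methodology described here: it assumes the usual large-sample approximation in which each increment of Z is normal with mean θ times the increment of V and variance equal to the increment of V (so that Z has variance V, as noted above). The function name, the information schedule and the seed are illustrative choices.

```python
import random


def simulate_sample_path(theta, v_increments, rng):
    """Simulate one sample path (Z_i, V_i) under independent normal increments.

    Assumption (not taken from the paper's text): each increment of Z is
    N(theta * dV, dV), where dV is the corresponding increment of V.
    """
    z, v = 0.0, 0.0
    path = []
    for dv in v_increments:
        z += rng.gauss(theta * dv, dv ** 0.5)  # mean theta*dV, sd sqrt(dV)
        v += dv
        path.append((z, v))
    return path


rng = random.Random(2011)  # arbitrary seed, for reproducibility only
# Hypothetical schedule: six interim analyses, each adding 0.75 units of V.
for z, v in simulate_sample_path(theta=1.0, v_increments=[0.75] * 6, rng=rng):
    print(f"V = {v:.2f}  Z = {z:.3f}")
```

Plotting the printed pairs with V on the horizontal axis gives a sample path of the kind that is compared with a stopping rule at each interim analysis.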

In each of the following three sections we focus on the three pharmacogenetic study designs, as highlighted above. We briefly review the literature to consider where sequential approaches have been used in the past. This is followed by descriptions of the implementation of the pharmacogenetic designs in the sequential context, bringing methodology together under the common framework, considering parameterisation of the question of interest, derivation of appropriate test statistics and choice of stopping boundaries.

3. Candidate gene association studies

3.1. Case-control designs

3.1.1. Introduction and literature

The case-control study design is one of the most common designs used to assess the pharmacogenetic effects of candidate genes. The two groups, cases and controls, are distinguished by two different clinical outcomes, for example whether or not a particular type of adverse event, or a non-response, has occurred. Each group is then genotyped for a candidate gene suspected to be related to outcome. In a nested study, cases and controls are identified from well-characterised, existing controlled clinical cohorts.


Fig. 1. Examples of sequential designs within the boundaries approach framework: (a) the triangular test; (b) the double triangular test; (c) the truncated sequential probability ratio test (SPRT); (d) the double truncated SPRT; (e) the restricted procedure.

As examples, Schork et al. (2001) suggest a sequential study on responders (cases) and non-responders (controls) to a compound where reduction in the likely study size is desirable. Their approach is discussed in the context of a study with a biallelic locus and with the rare allele suspected of influencing responsiveness to the compound. Schork et al. quote expected sample sizes for a triangular test (see Fig. 1(a)) updated after the inclusion of every new subject (although they do not explicitly name the test). Shuster et al. (2002) were concerned with the use of samples in a biobank and devised a two-stage sequential case-control design in order to reduce the amount of destructive testing of samples, rather than for speeding the outcome of the study. Schork et al. (2001) and Shuster et al. (2002) derived their test procedures from a retrospective view of the data in that the likelihood used is a function of the observed genetic data conditional on the case-control status of individuals in the study.

3.1.2. A generic sequential approach using Z and V

In contrast to the literature above, we adopt a prospective view of the data, where the likelihood is a function of case-control status given the genetic data. This prospective likelihood is more straightforward to generalise (see below) and is more natural in that genes pre-date and potentially underlie the outcome of interest. Whether the prospective or retrospective view is adopted does not affect parameterisation of the difference in responses between case and control groups in terms of the log-odds ratio as it can be shown (by application of Bayes' theorem) that this parameter is the same in either situation.
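The Bayes' theorem step can be written out briefly. In the sketch below, D denotes case (1) or control (0) status and G denotes carrier (1) or non-carrier (0) status for the allele of interest; this notation is introduced only for the derivation and is not used elsewhere.

```latex
\begin{align*}
\frac{P(D=1\mid G=g)}{P(D=0\mid G=g)}
  &= \frac{P(G=g\mid D=1)\,P(D=1)}{P(G=g\mid D=0)\,P(D=0)}, \qquad g = 0, 1, \\[4pt]
\frac{P(D=1\mid G=1)/P(D=0\mid G=1)}{P(D=1\mid G=0)/P(D=0\mid G=0)}
  &= \frac{P(G=1\mid D=1)/P(G=1\mid D=0)}{P(G=0\mid D=1)/P(G=0\mid D=0)} \\[4pt]
  &= \frac{P(G=1\mid D=1)/P(G=0\mid D=1)}{P(G=1\mid D=0)/P(G=0\mid D=0)}.
\end{align*}
```

The prior odds P(D=1)/P(D=0) cancel in the ratio, so the prospective odds ratio on the left equals the retrospective odds ratio in the last line, and the log-odds ratio is the same under either view.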

Consider the pharmacogenetic case-control, or nested case-control study, of a biallelic Aa locus with a dominant allele A and suppose that at some stage nj individuals have been observed with outcome j (j = 1 for case, 2 for control). Let xjk denote the binary observation of whether or not the kth individual of group j carries the A allele. Denoting the probability P(xjk = 1) that an individual carries the A allele by pj, j = 1, 2; k = 1, 2, . . . , a primary goal of such an investigation is to test the null hypothesis that there is no systematic difference between the two groups, that is H0 : p1 = p2. The parameter of interest can be taken to be the log-odds ratio θ, defined by θ = log[p1(1 − p2)/{p2(1 − p1)}]. Other choices are available for θ, the most popular of which is the difference in success probabilities θ = p1 − p2. However, the approximation to normality, upon which our framework relies, is more accurate for the log-odds ratio parameterisation than for the probability difference (Whitehead, 1997). Under H0 : p1 = p2 it is the case that θ = 0, whilst θ is positive if A is more common amongst cases (j = 1). Let Sj = xj1 + · · · + xjnj, j = 1, 2 be the number of individuals carrying the A allele. Let n = n1 + n2 and S = S1 + S2. In this case, the formulation for the efficient score Z for θ follows from direct application of methodology in the clinical trials setting and is given by Z = (n2S1 − n1S2)/n while the corresponding measure of information, V, may be written as V = {n1n2S(n − S)}/n^3 (see Section 3.2 of Whitehead, 1997). As described above, the statistics Z and V are compared with the stopping rule to determine whether or not sufficient evidence to terminate the study is obtained.
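As a minimal sketch of how these two statistics might be computed at an interim analysis, the function below takes the cumulative counts in each group. The function name and the example counts are hypothetical and chosen only for illustration.

```python
def case_control_z_v(n1, s1, n2, s2):
    """Efficient score Z and information V for the log-odds ratio.

    n1, n2: numbers of cases and controls observed so far.
    s1, s2: numbers of cases and controls carrying the A allele.
    """
    n = n1 + n2
    s = s1 + s2
    z = (n2 * s1 - n1 * s2) / n
    v = n1 * n2 * s * (n - s) / n ** 3
    return z, v


# Hypothetical interim data: 12 of 20 cases and 9 of 30 controls carry A.
z, v = case_control_z_v(n1=20, s1=12, n2=30, s2=9)
print(f"Z = {z:.3f}, V = {v:.4f}")
```

Because the counts are cumulative, the same function can be called again at each later inspection to extend the sample path.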

The statistic Z above can also be expressed as Z = (n1n2/n){(S1/n1) − (S2/n2)} so that it can be seen as a multiple of the difference between the observed proportions carrying the A allele within the two groups. Alternatively Z is also given by Z = S1 − n1S/n, which is the observed number of cases with A, minus the expected number under the null hypothesis. In large samples V is approximately equal to (n1n2/n)p(1 − p), where p is the overall proportion of subjects with A in the study as a whole. When investigators are able to choose the ratio n1 : n2 of outcomes in the study, say 1 : r, then Z is {rn1/(r + 1)}(p1 − p2), and V is approximately equal to {rn1/(r + 1)}p(1 − p), where p = (p1 + rp2)/(1 + r). In this setting, 1:1 will be the optimal ratio, in the sense of giving maximum information V for a given total number of subjects, in which case V will be approximately equal to (n/4)p(1 − p). Finally, it should be noted that the familiar Pearson chi-squared test statistic χ² = Σ(O − E)²/E is also given by χ² = Z²/V for a comparison of two streams of binary observations.

To conduct the pharmacogenetic case-control study of a biallelic Aa locus with a recessive gene effect we define the sum Sj, j = 1, 2 to be the number of individuals with two copies of allele A and proceed as for a dominant effect. In the case of the study of an additive gene effect, we let Sj be the observed number of copies of A in the jth group, j = 1, 2. The efficient score Z for the log-odds ratio θ can be shown to be

Z = (n2S1 − n1S2)/n, (1)

as before, while the corresponding measure of information is

V = n1n2{S(n − S) + 2m2n}/n^3, (2)

where S = S1 + S2 is the total number of copies of A and m2 is the observed number of homozygous carriers of A in the sample. In this parameterisation for the additive genetic model, the odds ratio measures the effect size for heterozygous carriers of allele A relative to homozygous carriers of the a allele while the effect for homozygous carriers of A is double (on the logit scale) that for heterozygous carriers. It is also of interest to note that the statistic Z²/V is asymptotically χ² on one degree of freedom and is identical to that used in the familiar Armitage Trend Test (Armitage, 1955).
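Eqs. (1) and (2) can be sketched in code as follows. The function name, the genotype-count layout (aa, Aa, AA) and the example counts are assumptions made for illustration; the counts are not taken from any study.

```python
def additive_z_v(case_counts, control_counts):
    """Z and V of Eqs. (1) and (2) under an additive gene effect.

    Each argument is a tuple (aa, Aa, AA) of genotype counts in that group.
    """
    n1, n2 = sum(case_counts), sum(control_counts)
    n = n1 + n2
    s1 = case_counts[1] + 2 * case_counts[2]        # copies of A among cases
    s2 = control_counts[1] + 2 * control_counts[2]  # copies of A among controls
    s = s1 + s2
    m2 = case_counts[2] + control_counts[2]         # homozygous carriers of A
    z = (n2 * s1 - n1 * s2) / n
    v = n1 * n2 * (s * (n - s) + 2 * m2 * n) / n ** 3
    return z, v


# Hypothetical genotype counts (aa, Aa, AA); not the asthma trial data.
z, v = additive_z_v(case_counts=(10, 8, 2), control_counts=(20, 9, 1))
print(f"Z = {z:.3f}, V = {v:.4f}, Z^2/V = {z * z / v:.3f}")
```

The last printed quantity is the one-degree-of-freedom trend statistic Z²/V for these hypothetical counts.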

Adopting a prospective likelihood for the pharmacogenetic case-control study allows us to account for the effect of other prognostic factors (which are non-matched in this case) via stratification or covariate adjustment (see Whitehead, 1997, Chapter 7). Other response types can also be accommodated in this framework. For example, a study with outcomes classified as either response, partial response or non-response to treatment may be analysed using methods for ordinal responses (see Whitehead, 1997, Chapter 3.6) while a quantitative outcome, such as amount of weight gain or percentage reduction in tumour size, could be investigated as a sequential comparison of quantitative responses (see Whitehead, 1997, Chapter 3.7). If outcomes of the type described by Shuster et al. (2002), namely times from diagnosis to death, were observed prospectively rather than retrospectively in a biobank study, then methods of sequential survival analysis could be adopted (see Whitehead, 1997, Chapter 3.4). It is interesting to note that the prospective likelihood can also be used in the sequential analysis of a cohort study, as was done by Sills et al. (2005) in a study of pharmacoresistance to anti-epileptic drug treatment, where patients exposed to treatment were retrospectively assessed for response. In a study with binary outcomes, the statistics Z and V will be the same as given above for the equivalent case-control study.

3.1.3. Worked example

To illustrate the above methodology consider a pharmacogenetic investigation of the binary adverse event, asthma exacerbation, during a six month trial of montelukast treatment. The full details of the investigation are described by Lima et al. (2006) and the data from a subset of the patients are available from the Pharmacogenomics Knowledge Base at https://www.pharmgkb.org/. Briefly, asthmatic patients taking montelukast treatment for six months as part of a clinical trial were genotyped for a number of candidate genes thought to influence patient response. Data used in this illustrative example are whether a patient experienced an exacerbation or not, and information on their genotype at a particular biallelic locus with alleles A and G.

Suppose that a sequential study is proposed and let us assume that it is desirable to stop the study early when there is sufficient evidence that the null hypothesis of no association between the gene and exacerbation can be rejected, and to stop the study for futility when it is clear that the null hypothesis is not going to be rejected. The triangular test, as implemented, though not named, by Schork et al. (2001) would be a suitable stopping rule to fulfil these objectives. It is outside the scope of this paper to explore all the different stopping rules and their properties; instead the reader is referred to Whitehead (1997) or Jennison and Turnbull (2000) for detailed discussion. In this manuscript we will draw upon a small group of designs based on straight line boundaries, as illustrated in Fig. 1. In each of the diagrams, crossing different boundaries of the stopping rules is associated with drawing different conclusions from the trial.

For this worked example, consider Fig. 1(a). Crossing the upper boundary leads to rejection of the null hypothesis and the conclusion that there is positive association between the allele of interest and exacerbation. Crossing the dashed line of the lower boundary leads to the conclusion that the null hypothesis cannot be rejected. In the rare instance that the solid portion of the lower boundary is crossed, this also leads to rejection of the null hypothesis, but with a conclusion of evidence for a negative (protective) effect.

Table 1. Expected V for a 1:1 case-control study analysed using a triangular test with 80% power and type I error of 0.05 to detect a log-odds ratio of 1.0 (odds ratio of 2.7).

θtrue    E(V∗)    Med(V∗)   P90(V∗)
−1.5     1.669    1.596     2.340
−1.0     2.075    1.964     3.005
−0.5     2.740    2.552     4.149
 0.0     3.936    3.624     6.313
 0.5     5.457    5.316     8.376
 1.0     5.230    5.033     8.157
 1.5     3.656    3.363     5.811

Table 2. Values of Z and V calculated using Eqs. (1) and (2) for the asthma trial.

Inspection number    Z       V
1                    1.20    0.504
2                    2.45    1.035
3                    2.40    1.672
4                    3.30    2.252
5                    3.84    3.185
6                    5.17    4.442

As stated above, there is no single sample size for a sequential study. Instead a distribution of sample sizes can be obtained at the design stage of the investigation. If the ratio of cases to controls is known, say 1 : r, then, as stated earlier, V is approximately equal to

{rn1/(r + 1)}p(1 − p), (3)

and so the distribution of the final sample size n1(1 + r) can be predicted. If the case-control ratio is not under the control of the investigators, then, although predictions of design properties will be inaccurate, the sequential design will ensure that stopping occurs when the appropriate amount of information has been collected, thereby preserving the intended error rates. This is because the value of Z is plotted against V, not n, and V adjusts automatically to the actual ratio between the case and control groups. In this respect a sequential design is superior to a fixed sample study, for which knowledge of the ratio is necessary in order to calculate the sample size necessary to achieve given power. However, as noted, prediction of the distribution of final sample size becomes less certain in a sequential test where the ratio r is unknown.

Suppose in our example that it is desired to detect a log-odds ratio of 1.0 (odds ratio of 2.7) with 80% power and a type I error rate of 0.05. Computed in PEST (MPS Research Unit, 2000) and reported in Table 1, for selected true values of θ, are expected values of V when the sequential case-control procedure terminates, denoted V∗. The corresponding V for a fixed sample size study is Vfix = 7.85. From this table, looking at the column for E(V∗), we see that the expected size of the sequential study is roughly between one-quarter and three-quarters that of the fixed sample study: this is consistent with general results for the triangular test (Whitehead, 1997). To translate V into sample sizes, different options are available. One method is to replace m2 and S in Eq. (2) with their expected values or, alternatively, to use an appropriate value of p in Eq. (3).
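The arithmetic behind these comparisons can be sketched as follows. The fixed-sample formula used here, Vfix = {(z for α/2 + z for power)/θ}², is the standard normal-theory expression and is an assumption of this sketch rather than something stated above; it reproduces the quoted 7.85 when the type I error of 0.05 is treated as two-sided. The carrier proportion p = 0.3 and the function names are hypothetical, and the translation to sample size uses the 1:1 approximation V ≈ (n/4)p(1 − p) given earlier.

```python
from statistics import NormalDist


def fixed_sample_information(theta_r, alpha=0.05, power=0.80):
    """Information needed by a fixed sample test (two-sided alpha assumed)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return ((z_alpha + z_beta) / theta_r) ** 2


def total_sample_size_1to1(v, p):
    """Invert V ~ (n/4) p (1 - p) for a 1:1 case-control study."""
    return 4 * v / (p * (1 - p))


v_fix = fixed_sample_information(theta_r=1.0)
expected_v = [1.669, 2.075, 2.740, 3.936, 5.457, 5.230, 3.656]  # E(V*), Table 1
ratios = [ev / v_fix for ev in expected_v]
p = 0.3  # hypothetical overall carrier proportion, for illustration only
print(f"V_fix = {v_fix:.2f}")
print(f"E(V*)/V_fix ranges from {min(ratios):.2f} to {max(ratios):.2f}")
print(f"fixed-sample n at p = {p}: about {total_sample_size_1to1(v_fix, p):.0f}")
```

The same inversion can be applied to the median or 90th percentile columns of Table 1 to express the whole distribution of V∗ on the sample size scale.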

Extracting data from the website and assuming an additive gene effect, Eqs. (1) and (2) can be used to calculate appropriate values of Z and V at a series of interim analyses. With an inspection of the data after every 10 patients, Table 2 gives the values of Z and V calculated up to and including the inspection at which the sample path crosses the boundary of the stopping rule. The resultant plot of the sequential procedure is given in Fig. 2.

On completion of the trial it is possible to analyse these data in a way that allows for the sequential design adopted. The p-value for the hypothesis test is 0.0182. A median unbiased estimate of θ is 1.144, with 95% confidence interval (0.197, 2.08).

3.2. Matched case-control designs

3.2.1. Introduction and literature

In association testing, it is sometimes advantageous to pair each responder, or case, with a non-responder, or control, matching them on important characteristics such as drug exposure, ethnicity, age, gender and stage or duration of disease. The purpose of matching is to address any possible confounding effect. Several authors consider methodology for sequential matched case-control studies with genetic explanatory factors. In this setting, Schork et al. (2001) consider simple testing scenarios and investigate the utility of a sequential design in the search for a candidate gene or polymorphism to distinguish responders and non-responders to a drug of interest. Reference to methodology for calculating average sample numbers in the context of a sequential investigation is given, but it is advised that these can become undesirably large. Van der Tweel and van Noord (2000) construct sequential tests that make use of methodology detailed in Whitehead (1997), expressing the data using the conditional logistic regression model. The methodology is further extended by Baksh et al. (2005), who present detailed results concerning the anticipated study size and study error rates. Van der Tweel and Schipper (2004) investigate the use of sequential case-control designs to study gene by environment interactions. Their simple non-hierarchical model fits into the framework discussed here. Exposure is possession of both the genotype in question and the environmental factor. Matching is on factors such as age or ethnicity. The authors investigate the case of matched pairs, and are motivated to use sequential methods in order to reduce the depletion of biobank resources.

Fig. 2. Sample path of the sequential test of whether the A allele is associated with exacerbations in asthmatic patients.

3.2.2. A generic sequential approach using Z and V

For a matched case-control study let p1 be the probability that a responder or case carries the allele A of interest and let p2 be the probability that a non-responder or control carries A. Interest lies in the odds ratio ψ. Letting θ = log ψ, we test H0 : θ = 0 against H1 : θ ≠ 0. Following van der Tweel and van Noord (2000), the efficient score and Fisher's information statistics can be derived from the conditional logistic likelihood. The logarithm of this likelihood for N matched case-control sets with 1 : r matching is given by

$$\sum_{t=1}^{N} \ell_t(\theta),$$

where $\ell_t$ is the contribution from the tth set parameterised in terms of θ. Let $x_{jkt}$ be the indicator of A being present for each of the case and controls, with $x_{1kt}$ being that of the case in the tth matched set. This gives $Z = \sum_{t=1}^{N} Z_t$ and $V = \sum_{t=1}^{N} V_t$ where

$$Z_t = x_{1kt} - \frac{S_t}{r+1} \quad \text{and} \quad V_t = \frac{S_t}{r+1}\left(1 - \frac{S_t}{r+1}\right), \qquad (4)$$

and $S_t$ is the number of carriers of A in the tth set. Note that both $Z_t$ and $V_t$ are zero for ‘‘concordant sets’’, where A is either present or absent for all individuals in a particular set. Hence, only ‘‘discordant sets’’ contribute to the test of association. Van der Tweel and van Noord use two designs: the double triangular test and the double sequential probability ratio test (Fig. 1(b) and (d)). These designs are known as symmetric tests, for obvious reasons, and treat the alternatives θ > 0 and θ < 0 in an identical way. The two symmetric designs illustrated in Fig. 1(b) and (d) stop early in the presence of evidence that θ = 0. Hédelin (1992) derived similar test statistics to those presented in van der Tweel and van Noord (2000), and illustrated the method through retrospective application of the triangular test and sequential probability ratio test (Fig. 1(a) and (c)). Both these designs are termed asymmetric designs and stop more readily in the presence of evidence that θ < 0. Dreyfus et al. (2001) used this approach in applying the triangular test prospectively to a matched case-control study of antiphospholipid antibodies and preeclampsia.
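The per-set contributions in Eq. (4) are simple to accumulate. The sketch below assumes each matched set is supplied as a carrier indicator for the case together with a list of carrier indicators for its r controls; this data layout, the function name and the example pairs are hypothetical.

```python
def matched_set_z_v(matched_sets):
    """Sum the contributions of Eq. (4) over 1:r matched sets.

    Each set is (case_carrier, control_carriers): a 0/1 indicator for the
    case and a list of r 0/1 indicators for its matched controls.
    """
    z_total, v_total = 0.0, 0.0
    for case_carrier, control_carriers in matched_sets:
        r = len(control_carriers)
        s_t = case_carrier + sum(control_carriers)  # carriers of A in the set
        frac = s_t / (r + 1)
        z_total += case_carrier - frac
        v_total += frac * (1 - frac)
    return z_total, v_total


# Hypothetical 1:1 matched pairs; the third and fourth are concordant.
pairs = [(1, [0]), (0, [1]), (1, [1]), (0, [0]), (1, [0])]
z, v = matched_set_z_v(pairs)
print(f"Z = {z}, V = {v}")
```

In this example the two concordant pairs add nothing to either statistic, so Z and V are determined by the three discordant pairs alone.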

Although matching is used to remove the effect of the matching factors from the analysis, conditional logistic regression can be used to allow for the effects of further prognostic factors. So, for example, suppose that in a pharmacogenetic study cases and controls are matched on age and ethnicity, but at the time of the analysis it is desired to allow for the confounding effect of weight as well. In this case, weight can be introduced in a linear model, as explained for example by Breslow and Day (1980). The structure of the data is identical to the structure encountered in proportional hazards regression models for survival data. In the survival models, a "risk set" of all patients who could have failed at any given time after randomisation is considered, with the features of the patient who died being compared to those of all patients in the risk set. In conditional logistic regression, the "risk set" is replaced by the matched set of cases and controls, and the features of the case are compared with those in the matched set as a whole. Analytically, the two approaches are identical. In SAS, the survival analysis procedure PROC PHREG is therefore used for the conditional logistic regression analysis of matched case-control data. Sequential conditional logistic regression can be conducted using the statistics Z and V derived for sequential proportional hazards regression, as presented by Whitehead (1997, Chapter 7.6.4), provided that the correct interpretation of "risk set" and identification of "case" are made.
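The comparison of the case with its matched set can be made concrete with a small sketch of the conditional log-likelihood. This is a hedged illustration rather than the SAS implementation: the function names, the covariate layout (carrier indicator followed by weight, case listed first) and the data are hypothetical. The numerical derivative with respect to the genetic coefficient, evaluated with all coefficients at zero, is the case's indicator minus the set average, summed over sets.

```python
import math


def conditional_loglik(beta, matched_sets):
    """Conditional logistic log-likelihood for 1:r matched sets.

    Each set is a list of covariate tuples with the case first
    (an illustrative layout). The contribution of a set is the case's
    linear predictor minus the log of the sum of exp(linear predictor)
    over everyone in the set, mirroring the partial likelihood of
    proportional hazards regression with the set acting as risk set.
    """
    total = 0.0
    for rows in matched_sets:
        predictors = [sum(b * x for b, x in zip(beta, row)) for row in rows]
        total += predictors[0] - math.log(sum(math.exp(p) for p in predictors))
    return total


def numeric_score(beta, matched_sets, index, step=1e-6):
    """Central-difference derivative of the log-likelihood in one coefficient."""
    upper = list(beta)
    lower = list(beta)
    upper[index] += step
    lower[index] -= step
    return (conditional_loglik(upper, matched_sets)
            - conditional_loglik(lower, matched_sets)) / (2 * step)


# Hypothetical sets: (carrier indicator, weight in kg), case first.
example_sets = [
    [(1, 70.0), (0, 82.0), (0, 65.0)],
    [(1, 90.0), (1, 75.0), (0, 88.0)],
]
genetic_score = numeric_score([0.0, 0.0], example_sets, index=0)
print(genetic_score)
```

For these two sets the score for the genetic term is (1 - 1/3) + (1 - 2/3) = 1, up to the error of the finite-difference approximation.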

Van der Tweel and Schipper (2004) also discuss the sequential analysis of hierarchical models for detecting gene by environment interactions. The non-hierarchical approach given by the same authors has been discussed earlier in this section. In that setting a single term is fitted for those individuals with both the allele of interest and the environmental factor and omitted for all other subjects. In their hierarchical model, three terms are fitted in a conditional logistic regression: a genetic factor, an environmental factor and an interaction term. The interaction term is taken to be the focus of interest, while the two main effects are factors to be allowed for in its analysis. Thus the analysis fits into the structure of sequential conditional logistic regression described in the previous paragraph, and the statistics Z and V presented are of the form described above.

4. Family-based studies

4.1. Introduction and literature

In this section we consider common familial aggregation study methods as well as the Transmission-Disequilibrium Test (TDT), a well-known family-based method for association and linkage analysis. A familial aggregation study is usually the first stage in the investigation of a potential genetic trait. Common designs belonging to this class of studies are the standard case-control and cohort designs, used in the study of dichotomous disease traits, and twin and adoption studies, used in the study of both dichotomous and continuous traits. Familial aggregation studies based on a case-control design, such as in the review by Mamdani et al. (2003) on responsiveness to lithium in the treatment of bipolar disorder, can be analysed sequentially using the approach described in Section 3. This section covers the sequential twin study with continuous traits and the related sequential comparison of twins raised together with those raised apart (adopted twins study).

4.2. Twin study

4.2.1. A generic sequential approach using Z and V

Pharmacogenetic studies with twins are particularly well suited to a sequential design as many such studies (see Swan et al., 2004) have to rely on very small samples. The twin study can be viewed as a comparison between a stream of $n_{MZ}$ pairs of observations from monozygotes and a second stream of $n_{DZ}$ pairs from dizygotes. In the sequential twin procedure for continuous traits proposed in Baksh et al. (2006), the correlation coefficient for the pair of responses for a monozygotic pair is denoted by $\rho_{MZ}$ while that for a dizygotic pair is $\rho_{DZ}$. The difference between monozygotes and dizygotes is measured using the parameter $\theta = \eta_{MZ} - \eta_{DZ}$, where $\eta_{MZ} = \tfrac{1}{2}\log\{(1 + \rho_{MZ})/(1 - \rho_{MZ})\}$ and $\eta_{DZ} = \tfrac{1}{2}\log\{(1 + \rho_{DZ})/(1 - \rho_{DZ})\}$ are Fisher's variance stabilising (z) transformations of the intraclass correlations $\rho_{MZ}$ and $\rho_{DZ}$.

The sequential test for $H_0: \theta = 0$ is based on the Wald statistic (Cox and Hinkley, 1974) and uses as test statistics

\[
Z = \left(n_{DZ}^{-1} + n_{MZ}^{-1}\right)^{-1} \hat{\theta} \quad \text{and} \quad V^{-1} = (n_{DZ} - 3)^{-1} + (n_{MZ} - 3)^{-1}, \tag{5}
\]

where $\hat{\theta}$ is the maximum likelihood estimate of $\theta$. This test is very closely related to standard fixed sample twin methodology for comparing intraclass correlations, as discussed, for instance, by Armitage (2002) and Sham (1998). Measured environmental and other non-genetic risk factors may be included in the sequential test by using an adjusted maximum likelihood estimate for $\theta$ in Z and V (see Baksh et al., 2006, for details). In particular, in the adopted twin study we can allow for unknown environmental influences by adjusting the mean responses of twins raised together by a common factor.
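A minimal Python sketch of Eq. (5) is given below. It plugs sample correlations directly into Fisher's z transformation, which is an illustrative simplification: the adjusted maximum likelihood estimate described by Baksh et al. (2006) is not reproduced here, and the function names are hypothetical. Note that V in Eq. (5) is only defined once both groups contain more than three pairs.

```python
import math


def fisher_z(rho):
    """Fisher's variance stabilising transformation of a correlation."""
    return 0.5 * math.log((1 + rho) / (1 - rho))


def twin_test_statistics(r_mz, r_dz, n_mz, n_dz):
    """Return (theta_hat, Z, V) following Eq. (5).

    r_mz and r_dz are sample correlations used as plug-in estimates
    (an assumption of this sketch); n_mz and n_dz are numbers of pairs.
    """
    if n_mz <= 3 or n_dz <= 3:
        raise ValueError("Eq. (5) needs more than 3 pairs in each group")
    theta_hat = fisher_z(r_mz) - fisher_z(r_dz)
    z = theta_hat / (1 / n_dz + 1 / n_mz)
    v = 1 / (1 / (n_dz - 3) + 1 / (n_mz - 3))
    return theta_hat, z, v


# Hypothetical inputs: correlations 0.5 and 0.2 with 110 and 29 pairs.
theta_hat, z, v = twin_test_statistics(0.5, 0.2, n_mz=110, n_dz=29)
print(theta_hat, z, v)
```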

Considering twin studies with a dichotomous trait, fixed sample studies typically either compare concordance rates of monozygotes and dizygotes or use a liability model and the assumption of an underlying bivariate normal distribution to estimate the tetrachoric correlation for twin pairs (Thomas, 2004). The tetrachoric correlation is the correlation coefficient computed for two normally distributed variables that are both expressed as a dichotomy. The former approach fits naturally into the framework described in this paper for sequential comparison of two independent proportions. In the latter, analysis of the difference between the tetrachoric correlations of monozygotes and dizygotes (i.e. the analysis of heritability) is, in principle, similar to the sequential twin analysis for continuous traits. However, sequential procedures in these contexts are yet to be explored.

4.2.2. Worked example

Swan et al. (2004) describe a study to investigate the pharmacogenetics of nicotine metabolism in twins. We have used this as the basis of a worked example, using simulated data in place of the real observations and basing the data generation on study results reported in the manuscript. One of the metabolic measures of interest in the study was dideuteronicotine clearance. Summary statistics report a mean of 18.0 and a standard deviation of 6.2 for this variable amongst monozygote twins in the sample, and a mean of 17.8 with the same standard deviation of 6.2 for dizygote twins.

Fig. 3. Sample path of the sequential test of a difference in dideuteronicotine clearance between monozygote and dizygote twin pairs.

These estimates are based on data from 110 monozygote twin pairs and 29 dizygote twin pairs. Assuming observations of dideuteronicotine clearance are from a bivariate normal distribution for the ith twin pair, with means and variances coming from the data above, together with correlations $\rho_{MZ} = 0.5$ for monozygotes and $\rho_{DZ} = 0.2$ for dizygotes, data were randomly generated for twin pairs in the ratio 4:1, monozygotes:dizygotes, to preserve the characteristics of the real study.

Suppose that a sequential study is proposed to identify a difference in the continuous trait, dideuteronicotine clearance, between the pairs of monozygote and dizygote twins and that it is desired to detect a value of $\theta = 0.5$ with 90% power and a type I error rate of 0.05. To illustrate a different stopping rule, we will use a truncated SPRT, as illustrated in Fig. 1(c), as the boundary shape. Assuming inspections take place after every 10 pairs of twins, which is 8 pairs of monozygote twins and 2 pairs of dizygote twins, the formulae given in Eq. (5) above can be used to calculate the test statistics Z and V at each inspection. The resulting plot of the procedure based on the simulated data is shown in Fig. 3. The trial stopped at the 14th look with the conclusion that there is a significant difference between the correlations of monozygote and dizygote pairs.

The p-value for the hypothesis test is 0.0210. A median unbiased estimate of $\theta$ is 0.560, with 95% confidence interval (0.089, 1.012).
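The data generation and the interim calculations of this worked example can be sketched as follows. This is a hedged illustration only: the seed, the function names and the use of Pearson correlations as plug-in estimates are assumptions, so the sketch does not reproduce the sample path of Fig. 3, and the truncated SPRT boundaries are not included because their numerical values are not given here. Looks at which the dizygote group has three pairs or fewer are skipped, since V in Eq. (5) is undefined there.

```python
import math
import random


def simulate_pairs(rng, n_pairs, mean, sd, rho):
    """Draw twin pairs from a bivariate normal with common mean and sd."""
    pairs = []
    for _ in range(n_pairs):
        u = rng.gauss(0.0, 1.0)
        w = rng.gauss(0.0, 1.0)
        first = mean + sd * u
        second = mean + sd * (rho * u + math.sqrt(1 - rho ** 2) * w)
        pairs.append((first, second))
    return pairs


def pearson(pairs):
    """Sample Pearson correlation of the two members of each pair."""
    n = len(pairs)
    mean_x = sum(p[0] for p in pairs) / n
    mean_y = sum(p[1] for p in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)


def sample_path(mz_pairs, dz_pairs, mz_per_look=8, dz_per_look=2):
    """Return (look, Z, V) at each inspection where Eq. (5) is defined."""
    n_looks = min(len(mz_pairs) // mz_per_look, len(dz_pairs) // dz_per_look)
    path = []
    for look in range(1, n_looks + 1):
        n_mz, n_dz = look * mz_per_look, look * dz_per_look
        if n_dz <= 3:
            continue  # V of Eq. (5) is undefined at this look
        theta_hat = (math.atanh(pearson(mz_pairs[:n_mz]))
                     - math.atanh(pearson(dz_pairs[:n_dz])))
        z = theta_hat / (1 / n_dz + 1 / n_mz)
        v = 1 / (1 / (n_dz - 3) + 1 / (n_mz - 3))
        path.append((look, z, v))
    return path


rng = random.Random(2011)  # arbitrary seed chosen for this sketch
mz = simulate_pairs(rng, 112, mean=18.0, sd=6.2, rho=0.5)
dz = simulate_pairs(rng, 28, mean=17.8, sd=6.2, rho=0.2)
path = sample_path(mz, dz)
for look, z, v in path:
    print(look, round(z, 3), round(v, 3))
```

Plotting Z against V from `path` and overlaying the chosen stopping boundaries would give a display of the same kind as Fig. 3.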

4.3. Transmission-disequilibrium test

4.3.1. A generic sequential approach using Z and V

The TDT uses parents–offspring trios and compares the transmitted and non-transmitted alleles in a test for matched samples. It is the family-based test most familiar to geneticists and is known to statisticians as McNemar's test for matched pairs case-control data. In the pharmacogenetic setting, the TDT can be used to test for genetic linkage and association with a specific treatment outcome. It should be noted that parents are not required to be on the treatment, as only genetic data are needed. An example is Choudhry et al. (2005), who use parents–offspring trios to study responsiveness to treatment for asthma.

Consider the test of a biallelic Aa marker allele with A being the marker of interest. In each parents–offspring trio, the offspring's two copies of the allele are independently inherited from either parent and it is of interest to determine whether allele A is preferentially transmitted. Each parents–offspring trio contributes the equivalent of two matched pairs to the test statistic $(b - c)^2/(b + c)$, where b is the number of times A is transmitted by heterozygous parents and c is the number of times A is not transmitted by heterozygous parents. A sequential version of the TDT using the sequential matched case-control test in Section 3 with paired data will have contributions of the tth parents–offspring trio, $Z_t$ and $V_t$, given by $Z_t = Z_{tM} + Z_{tF}$ and $V_t = V_{tM} + V_{tF}$, where the subscripts M and F denote the maternal and paternal contributions, respectively. From Eq. (4) of Section 3.2.2,

\[
Z_{tM} = x_{tM} - \frac{S_{tM}}{2} \quad \text{and} \quad V_{tM} = \frac{S_{tM}}{2}\left(1 - \frac{S_{tM}}{2}\right),
\]

where $x_{tM}$ is an indicator of whether the allele of interest is transmitted and $S_{tM}$ counts the total number of transmissions of the allele by the mother. Similarly, we can write out the contributions of the father, $Z_{tF}$ and $V_{tF}$. It can be shown that $Z^2/V$, where $Z = \sum_t Z_t$ and $V = \sum_t V_t$, is the familiar TDT test statistic and, as in the fixed sample TDT, the sequential test is a joint test of the null hypothesis of no linkage or no association (linkage disequilibrium).

König et al. (2001) present an alternative sequential TDT based on Camp's (1999) approximation of the TDT statistic. The test statistic may be written as $Z/\sqrt{4Nf(1 - f)}$, where Z is as defined in the paragraph above, N is the number of parent-affected offspring trios and f (which needs to be estimated) is the population frequency of the disease allele. König et al. find stopping limits of the sequential procedure, for a chosen error rate and maximum number of analyses, by a search through a set of discrete α-spending functions for that which minimises the expected sample size under the alternative. The decision rule for their test procedure is such that the study can only be terminated before the maximum sample size is observed if a significant result is evident. The restricted procedure shown in Fig. 1(e) is designed for an identical purpose. In clinical trials it is used as a safety net rather than in an attempt to reduce sample size. Both the score test version of the sequential TDT and König et al.'s version, which can be shown to be an approximate Wald test, can be used with any sequential procedure devised for Brownian motion. The score test approach is, however, likely to be more accurate as it avoids estimation of the frequency of the disease allele in the population.
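The equivalence between $Z^2/V$ from the score test version and the TDT statistic $(b - c)^2/(b + c)$ can be checked numerically with a short sketch. The reading adopted here is an assumption made for illustration: each parent is treated as a matched pair of one transmitted and one non-transmitted allele, so that the S term equals the number of copies of A the parent carries (0, 1 or 2). The function names and the example data are hypothetical.

```python
def parent_contribution(transmitted_a, copies_of_a):
    """Score and information from one parent, using Eq. (4) with r = 1.

    transmitted_a: 1 if the parent transmitted allele A, else 0.
    copies_of_a: number of A alleles the parent carries (0, 1 or 2);
    treating this as the S term is an assumption of this sketch.
    """
    half = copies_of_a / 2
    return transmitted_a - half, half * (1 - half)


def sequential_tdt(parents):
    """Sum parental contributions to obtain Z and V."""
    z = 0.0
    v = 0.0
    for transmitted_a, copies_of_a in parents:
        z_t, v_t = parent_contribution(transmitted_a, copies_of_a)
        z += z_t
        v += v_t
    return z, v


# Hypothetical data: 7 heterozygous parents transmit A (b = 7),
# 3 do not (c = 3), plus two homozygous parents who add nothing.
parents = [(1, 1)] * 7 + [(0, 1)] * 3 + [(1, 2), (0, 0)]
z, v = sequential_tdt(parents)
b, c = 7, 3
print(z * z / v, (b - c) ** 2 / (b + c))
```

Under this reading each heterozygous parent contributes plus or minus one half to Z and one quarter to V, so that Z = (b - c)/2 and V = (b + c)/4, while homozygous parents contribute zero to both.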

5. Sequential genome-wide studies

5.1. Introduction and literature

Recent technological advances and reduction in genotyping costs have heralded the emergence and increasingly common use of studies based on testing a large number of genetic markers, for example more than 1,000,000 SNPs from a wide region of the human genome. The primary appeal of a genome-wide approach in studies of complex conditions is that no prior information is required on genes that may be involved in the biological pathways and it permits the detection of genes or combinations of genes not previously suspected to be involved in the aetiology of the condition. Because of the computational challenges posed by the large datasets obtained in genome-wide studies, fixed sample analyses of the data are often conducted by testing each SNP separately (Balding, 2006), with an adjustment of the type I error for multiple testing. As illustrated by Province (2000), by simultaneously controlling both type I and type II error, a sequential test procedure can lead to an increase in efficiency to detect true genetic effects relative to the equivalent fixed sample test. While in principle the single SNP sequential association methods in the preceding section may be directly applied to the analysis of pharmacogenetic genome-wide studies with an appropriate adjustment for multiple testing, this may not be feasible in practice.

5.2. Genome-wide linkage studies

The multiple markers of the genome-wide single SNP analyses are partitioned into "signal" and "noise" subsets in a sequential analysis of linkage proposed by Province (2000) for sib pairs and a quantitative trait. The design used is the SPRT (Fig. 1(c)), with the parameter of interest being the slope from the simple linear regression of the square of the trait differences for sib pairs on the proportion of alleles shared IBD, as in the well-known method of Haseman and Elston (1972). Assuming the regression fit with linked markers will have smaller error variances than that for unlinked markers, at each interim analysis the error variances for individual markers are used to construct the test statistics for a multiple decision test procedure. As evidence accumulates the linked and unlinked markers should then separate into two distinct groups (see Province, 2000, for details). The author also outlines a simplified procedure, based on a modified statistic, in recognition of the fact that the method is computationally intensive and difficult to implement in studies with a large number of markers.
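The regression underlying this approach can be sketched for a single marker as follows. This is only the basic least squares fit of squared sib-pair trait differences on the proportion of alleles shared IBD, together with the residual (error) variance; the multiple decision procedure of Province (2000) is not reproduced, and the function name and data are hypothetical.

```python
def haseman_elston_fit(trait_pairs, ibd_proportions):
    """Least squares fit of squared trait differences on IBD sharing.

    Returns (slope, residual_variance) for one marker. trait_pairs holds
    the trait values of the two sibs in each pair; ibd_proportions holds
    the proportion of alleles each pair shares IBD at the marker.
    """
    y = [(a - b) ** 2 for a, b in trait_pairs]
    x = list(ibd_proportions)
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((yi - intercept - slope * xi) ** 2
                      for xi, yi in zip(x, y))
    return slope, residual_ss / (n - 2)


# Hypothetical sib pairs and IBD sharing at one marker.
trait_pairs = [(10, 14), (12, 9), (8, 10), (11, 11), (9, 12), (13, 12)]
ibd = [0.0, 0.0, 0.5, 1.0, 0.5, 1.0]
slope, residual_variance = haseman_elston_fit(trait_pairs, ibd)
print(slope, residual_variance)
```

In the Haseman and Elston setting a negative slope, as obtained for these illustrative data, is the direction expected under linkage, because sib pairs sharing more alleles IBD tend to have more similar trait values.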

5.3. Genome-wide association studies

An alternative to the single SNP analysis in association studies, proposed by Kelly et al. (2006) and Kelly et al. (2008), uses a global test statistic formed by summing the likelihood ratio statistics from individual SNPs in the sequential test for association between drug response and genotype. The first of these papers investigates whether the genotype has an effect on a binary response and the second whether there is an interaction between treatment and the genotype in governing a quantitative response. In each case, a null distribution of the global likelihood ratio statistic is found by permuting disease status of a random selection of responders and non-responders and simulations are used to find the nominal significance level at each interim analysis. Permutations avoid relying on theoretical asymptotic results while the simulations are used to correct for the repeated testing in the sequential analysis. The stopping boundaries are determined according to an α-spending rule, the straight line boundaries of Fig. 1 being less appropriate in the context of the permutation approach used (although the α-spending function could be derived from some ideal straight line stopping boundary). The authors point out that this method is very computationally intensive and suggest some strategies to reduce the computational burden.
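The idea of a global statistic with a permutation null distribution can be sketched for a binary response at a single analysis. This is a simplified illustration under stated assumptions and is not the procedure of Kelly et al. (2006): each SNP is reduced to a 0/1 carrier indicator, the per-SNP statistic is the likelihood ratio statistic of a 2 × 2 table, all response labels are permuted, and neither the simulation-based correction for repeated looks nor the α-spending rule is included. All names and data are hypothetical.

```python
import math
import random


def lr_statistic(carrier, response):
    """Likelihood ratio statistic for a 2 x 2 carrier-by-response table."""
    n = len(carrier)
    counts = [[0, 0], [0, 0]]
    for g, y in zip(carrier, response):
        counts[g][y] += 1
    row = [sum(counts[0]), sum(counts[1])]
    col = [counts[0][0] + counts[1][0], counts[0][1] + counts[1][1]]
    stat = 0.0
    for i in (0, 1):
        for j in (0, 1):
            observed = counts[i][j]
            if observed > 0:  # empty cells contribute zero
                stat += 2 * observed * math.log(observed * n / (row[i] * col[j]))
    return stat


def global_statistic(genotypes, response):
    """Sum of the per-SNP likelihood ratio statistics."""
    return sum(lr_statistic(snp, response) for snp in genotypes)


def permutation_p_value(genotypes, response, n_permutations, rng):
    """Permutation p-value of the global statistic at one analysis."""
    observed = global_statistic(genotypes, response)
    shuffled = list(response)
    exceedances = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if global_statistic(genotypes, shuffled) >= observed:
            exceedances += 1
    return observed, (exceedances + 1) / (n_permutations + 1)


rng = random.Random(0)  # arbitrary seed for this sketch
response = [1] * 10 + [0] * 10          # 10 responders, 10 non-responders
genotypes = [[rng.randint(0, 1) for _ in range(20)] for _ in range(5)]
observed, p_value = permutation_p_value(genotypes, response, 200, rng)
print(observed, p_value)
```

Even in this reduced form, the cost grows with the product of the number of SNPs and the number of permutations at every look, which gives some sense of the computational burden the authors describe.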

5.4. Other approaches

For completeness, it is noted that staged genome-wide studies used to screen out promising regions of the genome (see, for example, Karvanen et al., 2009; Pahl et al., 2009; Satagopan et al., 2002; Scherag et al., 2009; Thomas et al., 2005) are not considered in this paper. This is because methodology in this area is developing, with issues such as the number of stages and choice of stopping boundaries still to be worked through. In addition, it is generally accepted that these designs may be less relevant now, as high-volume genotyping costs continue to plummet (Thomas et al., 2004).

6. Discussion

Properly designed, efficient studies with reliable findings are essential for the realisation of the pharmacogenetic promise of new more effective treatments and the minimisation of unwanted responses. In this regard a sequential design can prove to be a useful alternative to the fixed sample study. This is particularly true in investigations of rare outcomes, low allele frequencies and small effect sizes and with economic and ethical constraints. The advantages of the approach are many. The downside is that in circumstances where an effect exists, but is not as large as anticipated, then a sequential trial may require a larger sample size than its fixed sample counterpart.

In this paper we consider aspects of designing and conducting sequential pharmacogenetic studies. Once a study is under way, monitoring of the data and periodic calculation of test statistics proceeds until a boundary is crossed. In practice, monitoring is most likely to be after a group of observations (the so-called group-sequential approach) rather than after every observation. These interim analyses determine only whether the study should be stopped; they do not provide a complete interpretation of the data. On completion of the sequential procedure, a final analysis should be performed. The use of a stopping rule means that standard analysis methods are no longer valid. After a sequential test, the meaning and interpretation of data summaries such as significance levels, point estimates and confidence intervals remain as for fixed sample size studies, but methodology for their calculation needs to be redefined (Jennison and Turnbull, 2000).

Most of the sequential methods presented in this paper have equivalent fixed sample counterpart designs and, as is usually possible for fixed sample designs, they can be extended to more complex scenarios. For example, the methods presented in Section 3 for the conduct of candidate gene association studies of a biallelic gene can be extended to a multi-allelic gene by comparing the allele of interest with the others. A related question arises when considering studies where more than one gene is of interest. Analogous to this in the clinical trials setting is the case where more than one endpoint is of interest. Some work on developing bivariate sequential procedures (see, for example, Cook and Farewell, 1994; Jennison and Turnbull, 1993; Todd, 2003) could potentially be translated into the setting of pharmacogenetic investigations.

However, there are also many pharmacogenetic study designs for which the sequential methodology is yet to be explored. One such design is the prospective cohort design where patients fulfilling the inclusion criteria are randomly assigned to either a test drug or placebo (or standard treatment) and the differences in outcomes between genotype groups are compared. These studies typically require high recruitment rates but may be feasible as add-on studies in Phase II and Phase III clinical trials (Kirchheiner et al., 2005). Another design that potentially fits into a sequential framework is that proposed by Hughes et al. (2008) for comparing genotype-adjusted and conventionally treated patients in a prospective randomised study on HIV. Also to be investigated are pharmacogenetic dose-finding studies and extensions of the basic retrospective case-control design such as where responders and non-responders to treatment are compared with healthy individuals; an example is the study on the role of the phospholipase C-γ1 gene and response to lithium treatment for individuals with bipolar disorder (Løvlie et al., 2001).

In recent years there has been considerable interest in the emerging area of adaptive designs. Such trials define flexible multi-stage designs, in a variety of different ways, with adaptive interim analyses that allow mid-trial design modifications based on information collected both during the trial and outside of the study. See, for example, Bauer et al. (2001) and Posch et al. (2003) for further discussion. In order to control the type I error rate, all designs use separate standardised test statistics calculated from the samples at different stages and combine them to test hypotheses of interest. Some work in this arena in the setting of microarray experiments has been undertaken (Zehetmayer et al., 2008) but the more widespread application of adaptive designs in genetic and pharmacogenetic studies is yet to be explored.

In conclusion, sequential methodology, established in clinical trials, is extended in this paper to give a unified framework for the emerging area of pharmacogenetics. Numerous examples which are currently available in the literature are highlighted. The reader is referred to these if more detail is required concerning a specific method or its implementation. The intention of this manuscript is to inform and expand the experimenter's repertoire of available statistical design and analysis methodology in the effort to enhance pharmacogenetic study efficiency, improve the reliability of study findings and advance pharmacogenetic research.

References

Armitage, P., 1955. Tests for linear trends in proportions and frequencies. Biometrics 11, 375–386.
Armitage, P., 2002. Correlation. In: Elston, R., Olston, J., Palmer, L. (Eds.), Biostatistical Genetics and Genetic Epidemiology. John Wiley & Sons, New York, pp. 171–176.
Baksh, M.F., Haars, G., Todd, S., van Noord, P.A.H., Whitehead, J., Lucini, M.M., 2006. Comparing correlations of continuous observations from two independent populations using a sequential approach. Statistics in Medicine 25, 4293–4310.
Baksh, M.F., Kelly, P.J., 2008. Statistical methods for examining genetic influences of resistance to anti-epileptic drugs. Expert Review of Clinical Pharmacology 1, 137–144.
Baksh, M.F., Todd, S., Whitehead, J., Lucini, M.M., 2005. Design considerations in the sequential analysis of matched case-control data. Statistics in Medicine 24, 853–867.
Balding, D.J., 2006. A tutorial on statistical methods for population association studies. Nature Reviews Genetics 7, 781–791.
Bauer, P., Brannath, W., Posch, M., 2001. Flexible two stage designs: an overview. Methods of Information in Medicine 40, 117–121.
Breslow, N.E., Day, N.E., 1980. Statistical Methods in Cancer Research. Volume I: The Analysis of Case-Control Studies. IARC Scientific Publications, Lyon.
Brockmöller, J., Tzvetkov, M.V., 2008. Pharmacogenetics: data, concepts and tools to improve drug discovery and drug treatment. European Journal of Clinical Pharmacology 64, 133–157.
Camp, N.J., 1999. Genomewide transmission/disequilibrium testing: a correction. American Journal of Human Genetics 64, 1485–1487.
Choudhry, S., Ung, N., Avila, P.C., Ziv, E., Nazario, S., Casal, J., Torres, A., Gorman, J.D., Salari, K., Rodriguez-Santana, J.R., et al., 2005. Pharmacogenetic differences in response to Albuterol between Puerto Ricans and Mexicans with Asthma. American Journal of Respiratory and Critical Care Medicine 171, 563–570.
Cook, R.J., Farewell, V.T., 1994. Guidelines for monitoring efficacy and toxicity responses in clinical trials. Biometrics 50, 1146–1152.
Cox, D.R., Hinkley, D.V., 1974. Theoretical Statistics. Chapman & Hall, CRC, London.
Cui, Y., Fu, Y., Hussein, A., 2009. Group sequential testing of homogeneity in genetic linkage analysis. Computational Statistics and Data Analysis 53, 3630–3639.
Dreyfus, M., Hedelin, G., Kutnahorsky, R., Lehmann, M., Viville, B., Langer, B., Fleury, A., M'Barek, M., Treisser, A., Wiesel, M.-L., Pasquali, J.-L., 2001. Antiphospholipid antibodies and preeclampsia: a case-control study. Obstetrics & Gynecology 97, 29–34.
Haseman, J., Elston, R., 1972. The investigation of linkage between a quantitative trait and a Marker locus. Behavioral Genetics 2, 3–19.
Hédelin, G., 1992. Sequential tests for conditional logistic regression and combination of 2 × 2 tables in case-control studies. Université Louis Pasteur: Laboratoire d'épidémiologie et de santé publique, Strasbourg, France. p. 7.
Huang, R.S., Ratain, M.J., 2009. Pharmacogenetics and pharmacogenomics of anticancer agents. CA: A Cancer Journal for Clinicians 59, 42–55.
Hughes, S., Hughes, A., Brothers, C., Spreen, W., Thorborn, D., On behalf of the CNA 106030 study Team, 2008. PREDICT-1 (CNA106030): the first powered, prospective trial of pharmacogenetic screening to reduce drug adverse events. Pharmaceutical Statistics 7, 121–129.
Jennison, C., Turnbull, B.W., 1993. Group sequential tests for bivariate response: interim analyses of clinical trials with both efficacy and safety endpoints. Biometrics 49, 741–752.
Jennison, C., Turnbull, B.W., 2000. Group Sequential Methods with Applications to Clinical Trials. Chapman & Hall, CRC, Florida.
Karvanen, J., Kulathinal, S., Gasbarra, D., 2009. Optimal designs to select individuals for genotyping conditional on observed binary or survival outcomes and non-genetic covariates. Computational Statistics & Data Analysis 53, 1782–1793.
Kelly, P., Stallard, N., Zhou, Y., Whitehead, J., Bowman, C., 2006. Sequential genomewide association studies for monitoring adverse events in clinical evaluation of new drugs. Statistics in Medicine 25, 3081–3092.
Kelly, P., Zhou, Y., Whitehead, J., Bowman, C., 2008. Sequential testing for a gene-drug interaction in a genomewide analysis. Statistics in Medicine 27, 2022–2034.
Kirchheiner, J., Fuhr, U., Brockmöller, J., 2005. Pharmacogenetics-based therapeutic recommendations—ready for clinical practice? Nature Reviews Drug Discovery 4, 639–647.
König, I.R., Schäfer, H., Müller, H.-H., Ziegler, A., 2001. Optimized group sequential study designs for tests of genetic linkage and association in complex diseases. American Journal of Human Genetics 69, 590–600.
Lan, K.K.G., DeMets, D.L., 1983. Discrete sequential boundaries for clinical trials. Biometrika 70, 659–663.
Lima, J.J., Zhang, S., Grant, A., Shao, L., Tantisira, K.G., Allayee, H., Wang, J., Sylvester, J., Holbrook, J., Wise, R., Weiss, S.T., Barnes, K., 2006. Influence of leukotriene pathway polymorphisms on response to montelukast in Asthma. American Journal of Respiratory and Critical Care Medicine 173, 379–385.
Löscher, W., Klotz, U., Zimprich, F., Schmidt, D., 2009. The clinical impact of pharmacogenetics on the treatment of epilepsy. Epilepsia 50, 1–23.
Løvlie, R., Berle, J.Ø., Stordal, E., Steen, V.M., 2001. The phospholipase C-γ gene (PLCG1) and lithium-responsive bipolar disorder: re-examination of an intronic dinucleotide repeat polymorphism. Psychiatric Genetics 11, 41–43.
Mamdani, F., Groisman, I.J., Alda, M., Turecki, G., 2003. Long-term responsiveness to lithium as a pharmacogenetic outcome variable: treatment and etiologic implications. Current Psychiatry 5, 484–492.
MPS Research Unit, 2000. PEST 4: operating manual. The University of Reading, UK.
Pahl, R., Schäfer, H., Müller, H.-H., 2009. Optimal multistage designs-a general framework for efficient genome-wide association studies. Biostatistics 10, 297–309.
Posch, M., Bauer, P., Brannath, W., 2003. Issues in designing flexible trials. Statistics in Medicine 22, 953–969.
Province, M.A., 2000. A single, sequential genome-wide test to identify simultaneously all promising areas in a linkage scan. Genetic Epidemiology 19, 301–322.
Satagopan, J.M., Verbel, D.A., Venkatraman, E.S., Offit, K.E., Begg, C.B., 2002. Two-stage designs for gene-disease association studies. Biometrics 58, 163–170.
Scherag, A., Hebebrand, J., Schäfer, H., Müller, H.-H., 2009. Flexible designs for genomewide association studies. Biometrics 65, 815–821.
Schork, N.J., Fallin, D., Tiwari, H.K., Schork, M.A., 2001. Pharmacogenetics. In: Balding, D.J., Bishop, M., Cannings, C. (Eds.), Handbook of Statistical Genetics. John Wiley & Sons, New York, pp. 741–764.
Sham, P., 1998. Statistics in Human Genetics. Arnold, London, UK.
Shuster, J., Link, M., Camitta, B., Pullen, J., Behm, F., 2002. Minimax two-stage designs with applications to tissue banking case-control studies. Statistics in Medicine 21, 2479–2493.
Sills, G.J., Mohanraj, R., Butler, E., McCrindle, S., Collier, L., Wilson, E.A., Brodie, M.J., 2005. Lack of association between the C3435T polymorphism in the human multidrug resistance (MDR1) gene and response to antiepileptic drug treatment. Epilepsia 46, 643–647.
Swan, G.E., Benowitz, N.L., Jacob III, P., Lessov, C.N., Tyndale, R.F., Wilhelmsen, K., Krasnow, R.E., McElroy, M.R., Moore, S.E., Wambach, M., 2004. Pharmacogenetics of nicotine metabolism in twins: methods and procedures. Twin Research 7, 435–448.
Thomas, D.C., 2004. Statistical Methods in Genetic Epidemiology. Oxford University Press, New York.
Thomas, D.C., Haile, R.W., Duggan, D., 2005. Recent developments in genomewide association scans: a workshop summary and review. American Journal of Human Genetics 77, 337–345.
Thomas, D., Xie, R., Gebregziabher, M., 2004. Two-stage sampling designs for gene association studies. Genetic Epidemiology 27, 401–414.
Todd, S., 2003. An adaptive approach to implementing bivariate group sequential clinical trial designs. Biopharmaceutical Statistics 13, 605–619.
van der Tweel, I., Schipper, M., 2004. Sequential tests for gene-environment interactions in matched case-control studies. Statistics in Medicine 23, 3755–3771.
van der Tweel, I., van Noord, P.A.H., 2000. Sequential analysis of matched dichotomous data from prospective case-control studies. Statistics in Medicine 19, 3449–3464.
Whitehead, J., 1997. The Design and Analysis of Sequential Clinical Trials, revised second ed. John Wiley & Sons, Chichester.
Zehetmayer, S., Bauer, P., Posch, M., 2008. Optimized multi-stage designs controlling the false discovery or family-wise error rate. Statistics in Medicine 27, 4145–4160.