
Sample Size Calculation Based on the Semiparametric Analysis of Short-term

and Long-term Hazard Ratios

Yi Wang

Submitted in partial fulfillment of the

requirements for the degree of

Doctor of Philosophy

under the Executive Committee

of the Graduate School of Arts and Sciences

COLUMBIA UNIVERSITY

2013

© 2013

Yi Wang

All Rights Reserved

ABSTRACT

Sample Size Calculation Based on the Semiparametric Analysis of Short-term

and Long-term Hazard Ratios

Yi Wang

We derive sample size formulae for survival data with non-proportional hazard functions under both fixed and contiguous alternatives. Sample size determination has been widely discussed in the literature for studies with failure-time endpoints. Many researchers have developed methods under the assumption of proportional hazards and contiguous alternatives. Without covariate adjustment, the logrank test statistic is often used for the sample size and power calculation. With covariate adjustment, the approaches are often based on the score test statistic for the Cox proportional hazards model. Such methods, however, are inappropriate when the proportional hazards assumption is violated. We develop methods to calculate the sample size based on the semiparametric analysis of short-term and long-term hazard ratios. The methods are built on a semiparametric model by Yang and Prentice (2005). The model accommodates a wide range of patterns of hazard ratios, and includes the Cox proportional hazards model and the proportional odds model as its special cases. Therefore, the proposed methods can be used for survival data with proportional or non-proportional hazard functions. In particular, the sample size formula by Schoenfeld (1983) and Hsieh and Lavori (2000) can be obtained as a special case of our methods under contiguous alternatives.

KEY WORDS: Accrual and follow-up; Contiguous alternatives; Cox model; Crossing

hazards; Fixed alternative; Non-proportional hazard functions; Sample size; Short-

term and long-term hazard ratios; Survival analysis.

Table of Contents

1 Introduction
2 General procedure of sample size calculation
    2.1 Example
    2.2 General procedure of sample size and power calculations
    2.3 The fixed alternative and the contiguous alternative hypotheses
3 Sample size calculation for the Cox proportional hazards model
    3.1 Notations and assumptions
    3.2 Model specification
    3.3 Sample size formula under fixed alternative
    3.4 Sample size formula under contiguous alternatives
4 Sample size calculation with Yang and Prentice’s semiparametric model
    4.1 Notations and Model specification
    4.2 Parameter estimation
    4.3 Sample size formula under fixed alternative
    4.4 Sample size formula under contiguous alternatives
    4.5 Simulation studies
        4.5.1 Simulation studies to evaluate sample size formula derived under contiguous alternatives
        4.5.2 Simulation studies to evaluate sample size formula derived under fixed alternative
5 Accrual and follow-up times in sample size calculation
    5.1 Accrual and follow-up in sample size calculation
    5.2 Example
6 Discussion
    6.1 Concluding remarks
    6.2 Future work
7 Proofs
    7.1 Proofs of the theorems and corollaries in Chapter 3
        7.1.1 Proof of Theorem 3.3.1
        7.1.2 Proof of Corollary 3.3.2
        7.1.3 Proof of Theorem 3.4.1
        7.1.4 Proof of Corollary 3.4.2
    7.2 Proofs of the theorems in Chapter 4
        7.2.1 Lemma 7.2.2 and proof
        7.2.3 Proof of Theorem 4.3.1 (ii)
        7.2.4 Proof of Theorem 4.4.1
Bibliography

List of Figures

1.1 Kaplan-Meier estimates for the VA lung cancer data.
4.1 Proportional hazards: Cox model (γ1 = γ2 = γ).
4.2 Non-proportional hazards: proportional odds model (γ2 = 1).
4.3 Non-proportional hazards: long-term effect model (γ1 = 1).
4.4 Non-proportional hazards: crossing hazards (γ1 < 1 and γ2 > 1, or γ1 > 1 and γ2 < 1).
5.1 Sample size for accrual and follow-ups up to 30 months.
5.2 Sample size for 3 months accrual (a = 3)
5.3 Sample size for 20 months follow-up (f = 20)

List of Tables

2.1 Comparison of η1 and (zα/2 + zβ)² for commonly assumed α’s and β’s.
4.1 Values of η for different α’s and β’s.
4.2 Empirical power of calculated sample size for α = 0.05, 1 − β = 0.9 and θ0 = (0, 0)′.
[...] and θ0 = (0, 0)′.
[...] and θ0 = (−0.4, −0.4)′.
4.5 Empirical power of calculated sample size for the Cox model. α = 0.05, 1 − β = 0.9 and θ0 = 0.
[...] 1 − β = 0.9 and θ0 = 0.
[...] 1 − β = 0.9 and θ0 = 0.
[...] 1 − β = 0.9 and θ0 = 0.
5.1 Different accrual distributions as in Maki (2006) and Wang et al. (2012).

Acknowledgments

I would like to express my deepest gratitude to my advisor, Dr. Zhezhen Jin,

for his supervision and support. I would also like to thank Dr. Bin Cheng, Dr.

Robert Taub, Dr. Wei-Yann Tsai and Dr. Antai Wang for serving on my dissertation

committee. Last but not least, many thanks to my friends Huaihou Chen, Wei Xiong,

Wenfei Zhang and Ziqiang Zhao, who have encouraged and supported me throughout

the entire process.


To my family


CHAPTER 1. INTRODUCTION 1

Chapter 1

Introduction

Sample size determination is important in clinical studies, especially at the design

stage of a trial when researchers want to address some scientific hypotheses. The sample

size calculation procedure consists of two important elements: the hypotheses of interest

and the test statistic. For the sample size calculation in a study, the null and alternative

hypotheses need to be determined first. Then, a proper test statistic must be iden-

tified along with the distributions of the test statistic under the null and alternative

hypotheses. Sample size can then be evaluated with the pre-specified type I error,

power, effect size, and design effect(s). When the exact distributions of the test

statistic are not available, the asymptotic distributions are often used instead. Type

I error, power, sample size, effect size, and design effect(s) are related to one another

for the pre-specified hypotheses of interest and test statistic. Therefore, power can

be calculated if sample size, type I error, effect size and design effect(s) are known.

Sample size and power calculations follow essentially the same procedure. Reviews on

the general procedure for sample size determination and power analysis in clinical tri-

als are given by many investigators (e.g., Lachin 1981 [25]; Donner 1984 [12]; Dupont

and Plummer 1990 [11]). The general procedure is reviewed and discussed in detail

in Chapter 2. Although sample size determination is the focus in this dissertation,

power analysis for the same type of studies can be conducted in a similar manner.


Sample size formulae can differ depending on the choice of either the hypotheses

of interest or the test statistic. In sample size calculation, two types of alternative

hypotheses are usually considered for the hypotheses of interest: fixed alternative

and contiguous alternatives. An introduction to these alternatives can be found in

Chapter 2. For any specific hypotheses of interest, the choice of the test statistics also

makes a difference when sample size is calculated for a study. There are model-based

and non model-based test statistics. The test statistic should be model-based when

some non-binary effect(s) is (are) of interest, and/or there is a need for covariate

adjustment. When only a categorical effect (often binary), for example, treatment

effect, is of interest, one can calculate the sample size with a non model-based test

statistic.

The methods for sample size calculation in survival analysis have been discussed

extensively in the literature. Many authors have focused on developing methods for

survival data with treatment as the effect of interest (e.g., Halperin et al. 1968 [20];

Pasternack and Gilbert 1971 [38]; Pasternack 1972 [37]; George and Desu 1974 [17];

Palta and McHugh 1979 [35]; Palta and McHugh 1980 [36]; Wu et al. 1980 [50];

Lachin 1981 [25]; Rubinstein et al. 1981 [40]; Schoenfeld 1981 [41]; Makuch and

Simon 1982 [32]; Schoenfeld and Richter 1982 [43]; Freedman 1982 [15]; Schoenfeld

1983 [42]; Gail 1985 [16]; Palta and Amini 1985 [34]; Lachin and Foulkes 1986 [26];

Lakatos 1988 [28]; Gu and Lai 1999 [19]; Chen et al. 2011 [5]; Wang et al. 2012

[49]). Collett (2003) [9] has discussed and reviewed sample size calculation methods

in survival analysis. Most of these methods focused on the simple two-sample problem

and assumed proportional hazards. In particular, early works assumed exponential

distribution in survival times (e.g., Pasternack and Gilbert 1971 [38]; Pasternack

1972 [37]; George and Desu 1974 [17]; Palta and McHugh 1980 [36]; Lachin 1981 [25];

Rubinstein et al. 1981 [40]; Makuch and Simon 1982 [32]; Schoenfeld and Richter

1982 [43]; Lachin and Foulkes 1986 [26]). Among these methods, there are some that

are not limited to two-sample problem or to the assumption of proportional hazards.


For example, Makuch and Simon (1982) [32] provided the sample size requirement

for more than two treatment groups; Lakatos (1988) [28] and Lakatos and Lan (1992)

[29] derived methods that can be used for non-proportional hazard functions (a SAS

macro for these approaches is given by Shih 1995 [45]); Wu et al. (1980) [50] also

employed an approach that allows for time-dependent dropout and event rates as an

extension of the approach by Halperin et al. (1968) [20]; Gu and Lai (1999) [19]

derived a formula for clinical studies with interim analyses; Chen et al. (2011) [5]

discussed sample size determination in joint modeling of longitudinal and survival

data; Wang et al. (2012) [49] derived a formula based on the cure rate proportional

hazards model, which includes the Cox proportional hazards model as a special case.

Some researchers developed approaches that can be used for non-binary covariate(s).

For example, Zhen and Murphy (1994) [53] presented their approach based on the

exponential model; Hsieh and Lavori (2000) [21] derived a method based on the Cox

proportional hazards model.

The sample size formula by Schoenfeld (1983) [42] is commonly used in practice.

The treatment effect is handled as a binary covariate in the Cox proportional hazards

model. The formula by Schoenfeld (1983) [42] (based on the score test statistic for

the Cox proportional hazards model) is the same as that of Schoenfeld (1981) [41]

(based on the logrank test statistic) when there are no ties considered. This is because

the logrank test statistic is actually the same as the score test statistic based on the

partial likelihood for the Cox proportional hazards model when the only covariate

is the treatment indicator and there are no ties. Many authors derived the same formula

under different assumptions, including Schoenfeld and Richter (1982) [43]

and Collett (2003) [9]. Hsieh and Lavori (2000) [21] extended Schoenfeld’s result

(1983) [42] to the case of non-binary covariate. All of these methods are derived

with the proportional hazards assumption under local alternatives. Therefore, the

methods are inappropriate when the proportional hazards assumption is violated.

Even if the proportional hazards assumption holds, the validity of the formula may


be questionable when results under the fixed alternative are preferred.

Figure 1.1 presents the Kaplan-Meier estimates for the VA lung cancer trial

(Kalbfleisch and Prentice 2002 [23]). This is a classic example of possible violation

of the proportional hazards assumption. If the survival functions cross, the hazard

functions must cross as well, although the converse need not hold. In practice, people often check the

Kaplan-Meier curves to see whether the survival functions cross. When we observe

Kaplan-Meier curves instead of the true survival functions, questions arise as to whether

the underlying survival functions cross and whether the underlying hazards cross.

Appropriate methods for crossing hazards can help diagnose the problem. Reasons

behind crossing can be complicated. Conventional methods such as the logrank test

and the score test for the Cox proportional hazards model would perform poorly in

the presence of crossing hazards. Consequently, the sample size calculation based on

these test statistics will be inappropriate.

Figure 1.1: Kaplan-Meier estimates for the VA lung cancer data.

[Plot: survival function (0.0 to 1.0) against time in days (0 to 1000); legend: Treatment, Control.]

We develop methods to calculate the sample size for survival data with non-


proportional hazard functions. Our methods are based on the pseudo score test for a

semiparametric model by Yang and Prentice (2005) [51]. This semiparametric model

accommodates a wide range of hazard ratio patterns. It models the hazard ratio that

changes monotonically over time. For a two-sample problem, this model can be used

for survival data with proportional or non-proportional (even crossed) hazard func-

tions. There are three submodels of specific interest: the Cox proportional hazards

model, the proportional odds model and the long-term effect model. These submodels

are discussed in detail as special cases of our methods. We obtain the sample size

formulae based on Yang and Prentice’s method under different alternatives (fixed and

contiguous alternatives). Sample size calculation for the Cox model is reviewed in this

dissertation, with generalization to the result under the fixed alternative. It can be

shown that our sample size formula for the semiparametric short-term and long-term

hazard ratios model reduces to the formula developed by Schoenfeld (1983) [42] and

Hsieh and Lavori (2000) [21] under contiguous alternatives for the Cox proportional

hazards model.

Accrual and follow-up times can be incorporated in sample size calculation. Some

authors included accrual and follow-up times in their sample size formulae (e.g.,

Pasternack and Gilbert 1971 [38]; Pasternack 1972 [37]; George and Desu 1974 [17];

Lachin 1981 [25]; Rubinstein et al. 1981 [40]; Schoenfeld and Richter 1982 [43];

Schoenfeld 1983 [42]; Donner 1984 [12]; Gail 1985 [16]; Lachin and Foulkes 1986 [26];

Lakatos 1986 [27]; Lakatos 1988 [28]; Wang et al. 2012 [49]). We will adopt the

assumptions in Wang et al. (2012) [49] for accrual and follow-up times. The sample

size formulae with accrual and follow-up times are derived in a similar way. Details will be discussed

in Chapter 5.

Two important techniques to prove the results of this work are counting processes

and empirical processes. These are useful tools to evaluate the asymptotic properties.

Some books discussed the application of counting processes in survival analysis (e.g.,

Fleming and Harrington 1991 [14]; Andersen et al. 1993 [1]; Kalbfleisch and Prentice


2002 [23]). The large sample properties of empirical processes were presented in Van

der Vaart (1998) [47] and Van der Vaart and Wellner (1996) [48]. Although differ-

ent techniques may be used to derive the results, counting processes and empirical

processes are powerful tools in survival analysis. We will use these techniques in our

derivations.

This dissertation is structured as follows. In Chapter 2, the general procedure

of sample size estimation is illustrated with examples. The sample size formulae are

derived for the Cox proportional hazards model under both fixed and contiguous al-

ternatives in Chapter 3. The formula by Schoenfeld (1983) [42] and Hsieh and Lavori

(2000) [21] is shown to be a special case of our formula under contiguous alternatives.

In Chapter 4, the sample size formulae are developed based on the semiparametric

model by Yang and Prentice (2005) [51] for non-proportional hazard functions. Under

certain conditions, the sample size formula by Schoenfeld (1983) [42] and Hsieh and

Lavori (2000) [21] again turns out to be a special case of our formula. Simulation

studies are summarized in the chapter. In Chapter 5, the proposed methods are fur-

ther generalized when accrual and follow-up times are incorporated. Chapter 6 gives

a summary of the findings and discussions, along with possible alternative approaches

for sample size determination with failure-time endpoint. All the proofs are presented

in Chapter 7.

CHAPTER 2. GENERAL PROCEDURE OF SAMPLE SIZE CALCULATION 7

Chapter 2

General procedure of sample size

calculation

This chapter begins with a simple example to illustrate how the sample size can be

calculated. Similar discussions appeared in many works, including Lachin (1981) [25],

Donner (1984) [12] and Dupont and Plummer (1990) [11]. The general procedures

for sample size calculation and power analysis are summarized in Section 2.2. It is

important to specify the hypotheses of interest and identify a proper test statistic

in sample size and power calculation. In this dissertation, sample size formulae are

derived under both fixed and contiguous alternatives. We give a brief introduction to

the two types of alternative hypotheses (fixed and contiguous alternatives).

2.1 Example

Let {W1, ..., Wn} be a random sample of size n, where the Wi’s are independent and

identically distributed (i.i.d.) random variables with unknown mean µ and known

finite variance σ², for i = 1, ..., n. We consider the null hypothesis H0 : µ = µ0 versus

the alternative hypothesis H1 : µ = µ∗, where µ0, µ∗ ∈ R are known and µ0 ≠ µ∗. In

this example, the Wald test statistic is used to calculate the sample size. We assume


that the type I error is α and the desired power is 1− β in the study. We present the

sample size formula for the following two cases: random normal sample and random

non-normal sample. The major difference between the two is that the asymptotic

distributions, instead of the exact distributions of the test statistic, are used for the

calculation in the case of non-normal random sample.

Case 1: Random normal sample

If the Wi’s are i.i.d. random variables from a normal distribution with unknown mean µ and known finite variance σ², for i = 1, ..., n, the Wald test statistic can be used to test the hypotheses of interest. The test statistic is

\[ T^w_n = \frac{\sqrt{n}\,(\bar{W}_n - \mu_0)}{\sigma}, \tag{2.1} \]

where $\bar{W}_n = \sum_{i=1}^{n} W_i / n$ is the sample mean of {W1, ..., Wn}.

The distributions of the test statistic $T^w_n$ in (2.1) can be derived under the null and alternative hypotheses.

Under H0 : µ = µ0,

\[ T^w_n \sim N(0, 1). \tag{2.2} \]

Under H1 : µ = µ∗,

\[ T^w_n - \frac{\sqrt{n}\,(\mu^* - \mu_0)}{\sigma} \sim N(0, 1). \tag{2.3} \]

The results can be used to calculate sample size with pre-specified type I error α,

power 1 − β, effect size µ∗ − µ0, and design effect σ². By (2.2) and the definition of

type I error,

\[ \Pr_{H_0}\left( |T^w_n| > c \right) = \alpha, \]

where the critical value c is equal to zα/2 for a given type I error α, with zα/2 being the

upper 100(α/2)th percentile of the standard normal distribution.


By the definition of power,

\[ \Pr_{H_1}\left( |T^w_n| > c \right) = 1 - \beta, \]

which implies

\[ \Pr_{H_1}\left( T^w_n > c \right) + \Pr_{H_1}\left( T^w_n < -c \right) = 1 - \beta. \tag{2.4} \]

For small α and β, one of the two terms on the left-hand side of equation (2.4) is small

and can thus be omitted. Then, sample size can be calculated based on the previous

results.

If µ0 > µ∗, it follows from (2.4) that

\[ \Pr_{H_1}\left( T^w_n < -c \right) = 1 - \beta. \]

With some algebra, the following equation can be obtained:

\[ \Pr_{H_1}\left( T^w_n - \frac{\sqrt{n}\,(\mu^* - \mu_0)}{\sigma} < -c - \frac{\sqrt{n}\,(\mu^* - \mu_0)}{\sigma} \right) = 1 - \beta. \]

By (2.3), zβ can be approximated:

\[ -c - \frac{\sqrt{n}\,(\mu^* - \mu_0)}{\sigma} = z_{\beta}. \]

The sample size is obtained by substituting zα/2 for c in the above equation:

\[ n = \frac{(z_{\alpha/2} + z_{\beta})^2}{(\mu^* - \mu_0)^2/\sigma^2}, \tag{2.5} \]

where zβ is the upper 100βth percentile of the standard normal distribution.

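If µ0 < µ∗, it follows from (2.4) that

\[ \Pr_{H_1}\left( T^w_n > c \right) = 1 - \beta. \]

With some algebra, the following equation can be obtained:

\[ \Pr_{H_1}\left( T^w_n - \frac{\sqrt{n}\,(\mu^* - \mu_0)}{\sigma} > c - \frac{\sqrt{n}\,(\mu^* - \mu_0)}{\sigma} \right) = 1 - \beta. \]

By (2.3), z1−β can be approximated:

\[ c - \frac{\sqrt{n}\,(\mu^* - \mu_0)}{\sigma} = z_{1-\beta}. \]

The sample size is

\[ n = \frac{(z_{\alpha/2} - z_{1-\beta})^2}{(\mu^* - \mu_0)^2/\sigma^2}, \tag{2.6} \]

where z1−β is the upper 100(1 − β)th percentile of the standard normal distribution.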

Remark 2.1.1. In fact, (2.5) and (2.6) are equivalent because zβ = −z1−β.

The sample size formula is

\[ n = \frac{(z_{\alpha/2} + z_{\beta})^2}{(\mu^* - \mu_0)^2/\sigma^2} \tag{2.7} \]

regardless of the relationship between µ0 and µ∗.
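Formula (2.7) can be evaluated numerically. The following is a minimal sketch using only the Python standard library; the function names `upper_quantile` and `sample_size_mean` are illustrative and do not come from the dissertation, and rounding up to the next integer is an assumption added here for practical use.

```python
import math
from statistics import NormalDist


def upper_quantile(p: float) -> float:
    """Upper 100p-th percentile z_p of N(0, 1), i.e. Pr(Z > z_p) = p."""
    return NormalDist().inv_cdf(1.0 - p)


def sample_size_mean(mu0: float, mu_star: float, sigma: float,
                     alpha: float = 0.05, power: float = 0.9) -> int:
    """Sample size from formula (2.7), rounded up (rounding is an assumption)."""
    beta = 1.0 - power
    numerator = (upper_quantile(alpha / 2) + upper_quantile(beta)) ** 2
    effect = (mu_star - mu0) ** 2 / sigma ** 2
    return math.ceil(numerator / effect)


# Hypothetical inputs: effect size 0.5, sigma 1, alpha 0.05, power 0.9.
n_example = sample_size_mean(mu0=0.0, mu_star=0.5, sigma=1.0)
```

Because (2.7) depends on the effect size only through its square, swapping the roles of µ0 and µ∗ returns the same value, in line with Remark 2.1.1.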

Remark 2.1.2. An alternative way to calculate the sample size is to consider the distributions of $(T^w_n)^2$. The advantage of this method is that it uses all the terms involved, without omitting one of the terms on the left-hand side of equation (2.4).

Under H0 : µ = µ0,

\[ (T^w_n)^2 \sim \chi^2_1, \tag{2.8} \]

where $\chi^2_1$ is the Chi-squared distribution with 1 degree of freedom.

Under H1 : µ = µ∗,

\[ (T^w_n)^2 \sim \chi^2_1(\eta_1), \tag{2.9} \]

where $\chi^2_1(\eta_1)$ is the non-central Chi-squared distribution with 1 degree of freedom and noncentrality parameter $\eta_1$. The noncentrality parameter $\eta_1$ can be approximated by $\eta_{1n} = n(\mu^* - \mu_0)^2/\sigma^2$. The sample size formula can be derived in a similar manner without making the approximation in (2.4). Following the procedure shown in this section, the sample size is given by

\[ n = \frac{\eta_1}{(\mu^* - \mu_0)^2/\sigma^2}, \tag{2.10} \]

where $\eta_1$ is derived such that $\chi^2_{1,\alpha} = \chi^2_{1,1-\beta}(\eta_1)$. Here $\chi^2_{1,\alpha}$ is the upper 100αth percentile of the Chi-squared distribution with 1 degree of freedom, and $\chi^2_{1,1-\beta}(\eta_1)$ is the upper 100(1 − β)th percentile of the non-central Chi-squared distribution with 1 degree of freedom and noncentrality parameter $\eta_1$.

Note that (2.7) and (2.10) differ only in the numerator. If the approximation in

(2.4) is appropriate, (zα/2 + zβ)2 should be close to η1. We compare these values for

small α’s and β’s in Table 2.1. The two values are almost identical for different small

values of α and β. Thus, we conclude that the sample size formula (2.7) (with normal

approximation) can be used instead of (2.10).

Table 2.1: Comparison of η1 and (zα/2 + zβ)² for commonly assumed α’s and β’s.

   α      1 − β      η1       (zα/2 + zβ)²
  0.05     0.9     10.507       10.507
  0.10     0.9      8.564        8.564
  0.05     0.8      7.849        7.849
  0.10     0.8      6.182        6.183
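The comparison in Table 2.1 can be checked numerically. The sketch below is not the author's code. It relies on the fact that a non-central Chi-squared variable with 1 degree of freedom and noncentrality η has the same distribution as (Z + √η)² with Z standard normal, so η1 can be found by bisection with the standard library alone. The names `power_chi2_1df`, `eta1` and `normal_approx` are hypothetical.

```python
from statistics import NormalDist

_STD = NormalDist()


def power_chi2_1df(eta: float, alpha: float) -> float:
    """Pr{chi^2_1(eta) > chi^2_{1,alpha}}, using chi^2_1(eta) = (Z + sqrt(eta))^2."""
    c = _STD.inv_cdf(1 - alpha / 2)   # square root of chi^2_{1,alpha}
    d = eta ** 0.5
    return (1 - _STD.cdf(c - d)) + _STD.cdf(-c - d)


def eta1(alpha: float, power: float, lo: float = 0.0, hi: float = 100.0,
         tol: float = 1e-10) -> float:
    """Solve power_chi2_1df(eta, alpha) = power by bisection (power increases in eta)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if power_chi2_1df(mid, alpha) < power:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def normal_approx(alpha: float, power: float) -> float:
    """(z_{alpha/2} + z_beta)^2, the numerator of (2.7)."""
    return (_STD.inv_cdf(1 - alpha / 2) + _STD.inv_cdf(power)) ** 2


# One row per (alpha, power) pair used in Table 2.1.
rows = [(a, p, eta1(a, p), normal_approx(a, p))
        for a, p in [(0.05, 0.9), (0.10, 0.9), (0.05, 0.8), (0.10, 0.8)]]
```

Since the normal approximation drops a non-negative tail probability, η1 can never exceed (zα/2 + zβ)², which agrees with the direction of the small discrepancy in the last row of the table.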

Remark 2.1.3. When the variance σ² is unknown, the sample variance $\hat{\sigma}^2$ can be used instead in the sample size calculation. Consequently, σ is replaced by $\hat{\sigma}$ in the test statistic $T^w_n$ (2.1). The sample size formula has the form of (2.7) with σ replaced by $\hat{\sigma}$, and zα/2 and zβ replaced by tα/2(n − 1) and tβ(n − 1), the upper 100(α/2)th and 100βth percentiles of the t-distribution with n − 1 degrees of freedom.

Case 2: Random non-normal sample

Suppose that Wi’s are i.i.d. random variables from a non-normal distribution

with unknown mean µ and known finite variance σ2, for i = 1, ..., n. The sample size


calculation procedure remains largely the same as in Case 1. The difference is that

the asymptotic distributions should be used instead of the exact distributions under

the null and alternative hypotheses. The test statistic in (2.1) can still be used in

this case. The asymptotic distributions should differ from (2.2) and (2.3) only in the

“asymptotic” context; i.e., the results are valid when n → ∞. Because the formula

obtained is based on large-sample theory, the method may be inaccurate for small

sample sizes.

We present the asymptotic distributions of the test statistic Twn (2.1) under the

null and alternative hypotheses:

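By the classic central limit theorem, under H0 : µ = µ0,

\[ T^w_n \xrightarrow{d} N(0, 1) \quad \text{as } n \to \infty. \tag{2.11} \]

Under H1 : µ = µ∗,

\[ T^w_n - \frac{\sqrt{n}\,(\mu^* - \mu_0)}{\sigma} \xrightarrow{d} N(0, 1) \quad \text{as } n \to \infty. \tag{2.12} \]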

For a given type I error α, by definition we have

\[ \Pr_{H_0}\left( |T^w_n| > c \right) = \alpha, \]

where the critical value c is equal to zα/2.

By the definition of power, we derive the following:

\[ \Pr_{H_1}\left( |T^w_n| > c \right) = 1 - \beta, \]

which implies

\[ \Pr_{H_1}\left( T^w_n > c \right) + \Pr_{H_1}\left( T^w_n < -c \right) = 1 - \beta. \tag{2.13} \]

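We adopt the same normal approximation method as in the random normal case.

Without loss of generality, we assume µ0 > µ∗. Equation (2.13) can then be simplified as

\[ \Pr_{H_1}\left( T^w_n < -c \right) = 1 - \beta. \]

It follows that

\[ \Pr_{H_1}\left( T^w_n - \frac{\sqrt{n}\,(\mu^* - \mu_0)}{\sigma} < -c - \frac{\sqrt{n}\,(\mu^* - \mu_0)}{\sigma} \right) = 1 - \beta. \]

By (2.12),

\[ -c - \frac{\sqrt{n}\,(\mu^* - \mu_0)}{\sigma} = z_{\beta}. \]

The sample size is obtained by replacing c with zα/2 in the above equation:

\[ n = \frac{(z_{\alpha/2} + z_{\beta})^2}{(\mu^* - \mu_0)^2/\sigma^2}. \tag{2.14} \]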

Remark 2.1.4. The sample size formula (2.14) is derived based on the asymptotic

distributions of the test statistic (2.1). As a result, it may not be valid for small

samples. Although (2.7) and (2.14) have the same expressions, the method needs to

be used with caution when n is small.
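The caution in Remark 2.1.4 can be explored by simulation. The sketch below is an illustration only, under assumptions that are not part of the dissertation: the data follow a shifted exponential distribution with mean µ∗ and standard deviation σ, the hypothetical design values are µ0 = 0, µ∗ = 0.5, σ = 1, α = 0.05 and 1 − β = 0.9, and `empirical_power` is a made-up helper name. It computes n from (2.14) and estimates the rejection rate of the test based on (2.1).

```python
import math
import random
from statistics import NormalDist


def empirical_power(mu0: float, mu_star: float, sigma: float, alpha: float,
                    n: int, reps: int = 4000, seed: int = 2013) -> float:
    """Monte Carlo rejection rate of |T_n^w| > z_{alpha/2} under a skewed alternative."""
    rng = random.Random(seed)
    c = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(reps):
        # Shifted exponential draws: mean mu_star, standard deviation sigma.
        total = sum(mu_star + sigma * (rng.expovariate(1.0) - 1.0) for _ in range(n))
        t = math.sqrt(n) * (total / n - mu0) / sigma   # test statistic (2.1)
        rejections += abs(t) > c
    return rejections / reps


z = NormalDist().inv_cdf
# Formula (2.14) with alpha = 0.05 and power 0.9, rounded up.
n = math.ceil((z(0.975) + z(0.9)) ** 2 / (0.5 ** 2 / 1.0 ** 2))
power = empirical_power(mu0=0.0, mu_star=0.5, sigma=1.0, alpha=0.05, n=n)
```

Rerunning the sketch with a larger effect size, and hence a smaller n, gives a rough sense of how far the empirical power can drift from the nominal level when the normal approximation is less accurate.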

2.2 General procedure of sample size and power

calculations

The example in the previous section illustrates the general procedure in sample

size calculation. There are several important steps in sample size calculation. The

key is to choose the appropriate hypotheses of interest and test statistic. Sample

size is related to a number of factors such as type I error, power, effect size, and

design effect(s). To make a connection among these factors, we need to investigate

the distributions or asymptotic distributions of the test statistic under the null and

alternative hypotheses. Specifically, the sample size calculation procedure can be

summarized as follows.

Sample size calculation procedure:

Step 1. Specify the hypotheses of interest (null and alternative hypotheses).


Step 2. Choose a test statistic.

Step 3. Derive the (asymptotic) distributions of the test statistic under the null

and alternative hypotheses.

Step 4. Link the sample size to type I error, power, effect size, and design

effect(s) by Step 3.

Type I error, power, sample size, effect size, and design effect(s) are related. If only

one of the elements is unknown, then it can be determined by the other quantities.

Power can be calculated in a similar way as sample size. In general, power can be

calculated by the following steps.

Power calculation procedure:

Step 1. Specify the hypotheses of interest (null and alternative hypotheses).

Step 2. Choose a test statistic for the hypotheses of interest.

Step 3. Derive the (asymptotic) distributions of the test statistic under the null

and alternative hypotheses.

Step 4. The power can be linked to type I error, sample size, effect size, and

design effect(s) by Step 3.
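For the one-sample mean example of Section 2.1, Step 4 of the power calculation procedure can be sketched as follows. This is a minimal illustration, not the author's implementation; `power_mean_test` is a hypothetical name, and the function keeps both tail probabilities of (2.4) instead of dropping the smaller one.

```python
import math
from statistics import NormalDist


def power_mean_test(n: int, mu0: float, mu_star: float, sigma: float,
                    alpha: float = 0.05) -> float:
    """Power of the two-sided test based on (2.1) at sample size n.

    Uses the normal distribution in (2.3), which is exact for normal data
    and asymptotic otherwise.
    """
    std = NormalDist()
    c = std.inv_cdf(1 - alpha / 2)                     # critical value z_{alpha/2}
    shift = math.sqrt(n) * (mu_star - mu0) / sigma     # mean of T_n^w under H1
    return (1 - std.cdf(c - shift)) + std.cdf(-c - shift)


# Hypothetical design: effect size 0.5, sigma 1, alpha 0.05.
power_at_43 = power_mean_test(43, 0.0, 0.5, 1.0)
power_at_42 = power_mean_test(42, 0.0, 0.5, 1.0)
```

Evaluating the function over a grid of n values and picking the smallest n whose power reaches the target is one way to recover the sample size, which shows how the two procedures mirror each other.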

Again, the two key components here are the hypotheses of interest and the test

statistic. Different formulae may be derived if we choose different alternative hypoth-

esis or test statistic. In this dissertation, we derive methods under different alter-

native hypotheses. The following section introduces two commonly used alternative

hypotheses in sample size calculation and power analysis.


2.3 The fixed alternative and the contiguous alter-

native hypotheses

We adopt the notations in the example of Section 2.1. Although the example

shows the derivation of sample size formula under the fixed alternative hypothesis,

other alternative hypotheses can also be considered, especially when the random

sample is non-normal and the asymptotic distribution is difficult to obtain under

the fixed alternative. For the null hypothesis of interest H0 : µ = µ0, the following

alternative hypotheses are often considered in sample size and power calculation:

1. Fixed alternative: H1 : µ = µ∗.

2. Contiguous alternatives: H1n : µ = µ1n = µ0 + h/√n, where h ∈ R.

When the class of contiguous alternatives is considered in sample size determination,

the asymptotic distribution of the test statistic needs to be derived under the contigu-

ous alternatives. Then, the sample size can be determined by letting µ∗ = µ1n. This

actually assumes a small effect size (i.e., µ0 and µ∗ are close). By setting µ∗ = µ1n,

the sample size n is a function of a real-valued h and the effect size µ∗ − µ0. This is

useful when the sample size formula is derived based on the contiguous alternatives.

Although sample size derivation based on the fixed alternative is often more desir-

able, many researchers use the methods under the contiguous alternatives when the

alternative distribution of the test statistic is difficult to obtain. If the assumption

of the closeness of µ0 and µ∗ is made, one should be cautious when using the corre-

sponding formula. It is difficult to actually quantify the adequacy of the closeness

assumption in practice. Some would use the formulae obtained under the contiguous

alternatives without checking the appropriateness of the assumption. In Chapters 3

and 4, we show in detail the derivations of the sample size formulae under these two

types of alternatives for the Cox model and for the semiparametric model by Yang

and Prentice (2005) [51].
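As a worked illustration in the setting of Section 2.1, the contiguous-alternative argument can be written out for the test statistic (2.1). This is a sketch under the assumption that the central limit theorem applies along the sequence of alternatives with the variance σ² held fixed. It is not taken from the dissertation, although it leads to the same expression as (2.7) and (2.14).

```latex
% Under H_{1n}: \mu = \mu_{1n} = \mu_0 + h/\sqrt{n}
\begin{align*}
T^w_n &= \frac{\sqrt{n}\,(\bar{W}_n - \mu_{1n})}{\sigma} + \frac{h}{\sigma}
        \xrightarrow{d} N\!\left(\frac{h}{\sigma},\, 1\right)
        \quad \text{as } n \to \infty, \\
\left|\frac{h}{\sigma}\right| &= z_{\alpha/2} + z_{\beta}
        \quad \text{(power } 1-\beta\text{, neglecting the smaller tail as in (2.4))}, \\
h &= \sqrt{n}\,(\mu^* - \mu_0)
        \quad \text{(setting } \mu^* = \mu_{1n}\text{)}, \\
n &= \frac{(z_{\alpha/2} + z_{\beta})^2}{(\mu^* - \mu_0)^2/\sigma^2}.
\end{align*}
```

In this simple example the fixed and contiguous alternatives give the same formula because the variance of the test statistic does not depend on µ. In the survival models of the later chapters the two approaches generally differ.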

CHAPTER 3. SAMPLE SIZE CALCULATION FOR THE COX PROPORTIONAL HAZARDS MODEL 16

Chapter 3

Sample size calculation for the Cox

proportional hazards model

3.1 Notations and assumptions

In this chapter, we generalize the sample size calculation methods for the Cox

proportional hazards model under the fixed and the contiguous alternative hypothe-

ses. The test statistic used for the Cox model is the score statistic based on the

partial likelihood. The formula for the fixed alternative is complicated. We propose

a method that simplifies the calculation under the fixed alternative. This method

is evaluated through simulation studies in Chapter 4. It is shown that our formula

for the contiguous alternatives reduces to that derived by Schoenfeld (1983) [42] and

Hsieh and Lavori (2000) [21] under certain conditions.

We define notations for right-censored failure time data in the following. Let T be

the survival time, C be the censoring variable, X = min(T,C) be the observed time

and ∆ = I(T ≤ C) be the event indicator, where I(·) is the indicator function taking

value 1 if the condition is satisfied, 0 otherwise. For right-censored survival data, the

observed time is either the survival time or the censoring time, whichever occurs first.

For simplicity, only one covariate Z is considered. Suppose that the observed data are


(Xi, Zi,∆i), for i = 1, ..., n. We denote fZ(·) as the probability density function (pdf)

or probability mass function (pmf), FZ(·) as the cumulative distribution function

(cdf) and SZ(·) as the survival function, λZ(·) as the hazard function and ΛZ(·) as

the cumulative hazard function of survival time T . In a two-sample problem, with

Z being binary which takes value 0 for the control group and 1 for the treatment

group, S1(·) denotes the survival function of the treatment group and S0(·) denotes

that of the control group. Accordingly, λ1(·) and λ0(·) are the hazard functions of

the treatment group and control group, respectively. Let $\int_a^b$ denote $\int_{(a,b]}$.

Let Ni(t) = ∆iI(Xi ≤ t) be the counting process of the number of observed events

on (0, t], and Yi(t) = I(Xi ≥ t) be the at-risk process at t. If the ith individual is still at risk and has not yet failed at time t, then Yi(t) = 1, for i = 1, ..., n and t ∈ (0, τ], where τ > 0 is a finite time point at which the study ends. Let t− denote the left limit at t, i.e., the time point just prior to t.
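The notation above can be made concrete with a small sketch. The function names and the latent times below are hypothetical and serve only to illustrate how $(X_i, \Delta_i)$, $N_i(t)$ and $Y_i(t)$ are obtained from a survival time and a censoring time.

```python
def observe(t_event, t_censor):
    """Return (X, Delta) with X = min(T, C) and Delta = I(T <= C)."""
    x = min(t_event, t_censor)
    delta = 1 if t_event <= t_censor else 0
    return x, delta


def counting_process(x, delta, t):
    """N_i(t) = Delta_i * I(X_i <= t): 1 once an event has been observed by t."""
    return delta if x <= t else 0


def at_risk(x, t):
    """Y_i(t) = I(X_i >= t): 1 while the subject is still under observation."""
    return 1 if x >= t else 0


# Hypothetical latent survival and censoring times for three subjects.
T = [2.0, 5.0, 3.5]
C = [4.0, 3.0, 3.5]
data = [observe(t, c) for t, c in zip(T, C)]
```

In this toy data set the second subject is censored at time 3.0, so its counting process never jumps, while its at-risk indicator equals 1 up to and including time 3.0.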

Denote by $\chi^2_k$ the chi-squared distribution with $k$ degrees of freedom, and by $\chi^2_{k,\omega}$ the upper $100\omega$th percentile of $\chi^2_k$. Also let $\chi^2_k(\eta)$ be the non-central chi-squared distribution with $k$ degrees of freedom and noncentrality parameter $\eta$, and $\chi^2_{k,\omega}(\eta)$ be the upper $100\omega$th percentile of the non-central chi-squared distribution $\chi^2_k(\eta)$.

We consider the following regularity conditions:

(A1) Conditioning on Z, T is independent of C.

(A2) θ ∈ Θ, where Θ is compact and Θ ⊂ Rp.

(A3) Z has bounded support.

(A4) ST |Z(t) and SC|Z(t) are continuously differentiable in t ∈ (0, τ ].

(A5) fT |Z(t) and fC|Z(t) are uniformly bounded in t ∈ (0, τ ].

(A6) Pr(C ≥ τ) = Pr(C = τ) > 0.


(A7) Pr(T > τ) > 0.

(A8) Pr(T ≤ C |Z) > 0 almost surely under FZ .

Non-informative censoring is assumed (condition (A1)). Note that θ = (θ1, ..., θp)′, the parameter of interest, is a p×1 vector, and Θ is its parameter space. In this chapter, p = 1; that is, there is only one parameter of interest θ in the Cox proportional hazards model, based on which we conduct sample size calculation. Denote SC|Z(·) and fC|Z(·) as the survival function and the pdf, respectively, of the censoring time C conditional on Z. Denote ST|Z(·) and fT|Z(·) as the survival function and the pdf, respectively, of the survival time T conditional on Z. The conditions (A2)–(A5) are technical

assumptions, which are needed in the derivation of the asymptotic properties. The

condition (A6) implies that any patients alive by the end of the study at time τ

are considered to be censored. The condition (A7) indicates that the probability of

an individual surviving after τ is positive. The condition (A8) implies that there

is a positive probability of observing an event for any possible value of Z. All the

conditions discussed in this chapter are adopted throughout this dissertation unless

otherwise specified.

The following additional conditions are considered in some of the results in this

dissertation.

(C1) C is independent of Z.

(C2) Z is binary and Pr(Z = 1) = ρ, where ρ ∈ (0, 1).

Conditions (C1)–(C2) are used wherever specified. The condition (C2) applies to the

case when Z is the treatment group indicator (e.g., Z = 1 if in the treatment arm and

Z = 0 if in the control arm), and ρ is the proportion of individuals in the treatment

arm.


3.2 Model specification

For simplicity, we consider the case with one covariate Z in the Cox proportional

hazards model

λT |Z(t|Z, θ) = exp{Zθ}λ0(t).

For all θ ∈ Θ ⊂ R, the partial likelihood for the Cox model is

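$$L_p(\theta) = \prod_{i=1}^{n} \left( \frac{\exp\{Z_i\theta\}}{\sum_{l \in R(X_i)} \exp\{Z_l\theta\}} \right)^{\Delta_i} = \prod_{i=1}^{n} \left( \frac{\exp\{Z_i\theta\}}{\sum_{j=1}^{n} Y_j(X_i)\exp\{Z_j\theta\}} \right)^{\Delta_i},$$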

where R(Xi) is the risk set at time Xi, for i = 1, ..., n.

The corresponding log partial likelihood is

$$\log L_p(\theta) = \sum_{i=1}^{n} \int_0^\tau \left[ Z_i\theta - \log\left( \sum_{j=1}^{n} Y_j(t)\exp\{Z_j\theta\} \right) \right] dN_i(t).$$

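The score function $U_n$ with respect to $\theta$ and the corresponding observed information matrix $V^c_n$ based on the partial likelihood are derived as follows:

$$U_n(\theta) = \frac{\partial \log L_p(\theta)}{\partial\theta} = \sum_{i=1}^{n} \int_0^\tau \left[ Z_i - \frac{\sum_{j=1}^{n} Z_j Y_j(t)\exp\{Z_j\theta\}}{\sum_{j=1}^{n} Y_j(t)\exp\{Z_j\theta\}} \right] dN_i(t), \tag{3.1}$$

$$V^c_n(\theta) = -\frac{\partial^2 \log L_p(\theta)}{\partial\theta^2} = \sum_{i=1}^{n} \int_0^\tau \left[ \frac{\sum_{j=1}^{n} Z_j^2 Y_j(t)\exp\{Z_j\theta\}}{\sum_{j=1}^{n} Y_j(t)\exp\{Z_j\theta\}} - \left( \frac{\sum_{j=1}^{n} Z_j Y_j(t)\exp\{Z_j\theta\}}{\sum_{j=1}^{n} Y_j(t)\exp\{Z_j\theta\}} \right)^2 \right] dN_i(t). \tag{3.2}$$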
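As an illustration of (3.1) and (3.2), the following minimal sketch evaluates the score and the observed information for a scalar covariate. The function name and the toy data are assumptions made for this example; the integrals reduce to sums over observed event times because $dN_i(t)$ jumps only at $X_i$ when $\Delta_i = 1$.

```python
import math


def cox_score_and_information(X, Z, Delta, theta):
    """Evaluate U_n(theta) of (3.1) and V^c_n(theta) of (3.2) for scalar Z."""
    score, info = 0.0, 0.0
    for x_i, z_i, d_i in zip(X, Z, Delta):
        if d_i == 0:
            continue  # censored subjects contribute no jump of N_i
        # Risk-set sums at time X_i, using Y_j(X_i) = I(X_j >= X_i).
        s0 = s1 = s2 = 0.0
        for x_j, z_j in zip(X, Z):
            if x_j >= x_i:
                w = math.exp(z_j * theta)
                s0 += w
                s1 += z_j * w
                s2 += z_j * z_j * w
        z_bar = s1 / s0
        score += z_i - z_bar
        info += s2 / s0 - z_bar ** 2
    return score, info


# Hypothetical data: four subjects, the third one censored.
X = [1.0, 2.0, 3.0, 4.0]
Z = [1, 0, 1, 0]
Delta = [1, 1, 0, 1]
U, V = cox_score_and_information(X, Z, Delta, theta=0.0)
T_stat = U / math.sqrt(V)  # score test statistic U_n(theta_0) / sqrt(V^c_n(theta_0))
```

For these data and $\theta = 0$, the score equals $1/6$ and the observed information equals $17/36$, which can be checked by hand from the three risk sets.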

3.3 Sample size formula under fixed alternative

In this section, we derive the sample size formula based on the score test statistic

under the fixed alternative. We consider the following hypotheses of interest: the null


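hypothesis $H_0 : \theta = \theta_0$ and the alternative hypothesis $H_1 : \theta = \theta^*$, where $\theta_0, \theta^* \in \Theta$ and $\theta_0 \neq \theta^*$. The score test statistic based on the partial likelihood is

$$T^c_n = \frac{U_n(\theta_0)}{\sqrt{V^c_n(\theta_0)}}. \tag{3.3}$$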

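To determine the asymptotic properties of the test statistic $T^c_n$, the asymptotic distributions of $U_n(\theta_0)$, the score function evaluated at $\theta_0$, should be derived first. Theorem 3.3.1 gives the asymptotic distributions of $U_n(\theta_0)$ under the null and alternative hypotheses. We also define the following:

$$v^c_{\theta_0}(\theta_0) = \lim_{n\to\infty} E_{\theta_0}\left[ \frac{1}{n} V^c_n(\theta_0) \right],$$

$$v^c_{\theta^*}(\theta_0) = \lim_{n\to\infty} E_{\theta^*}\left[ \frac{1}{n} V^c_n(\theta_0) \right],$$

$$e^{c*}_{\theta^*}(\theta_0) = \lim_{n\to\infty} E_{\theta^*}\left[ \frac{1}{n} U_n(\theta_0) \right],$$

$$v^{c*}_{\theta^*}(\theta_0) = \lim_{n\to\infty} \mathrm{Var}_{\theta^*}\left[ \frac{1}{\sqrt{n}} U_n(\theta_0) \right].$$

The expressions of $v^c_{\theta_0}(\theta_0)$, $e^{c*}_{\theta^*}(\theta_0)$ and $v^{c*}_{\theta^*}(\theta_0)$ can be found in Section 7.1.1. That of $v^c_{\theta^*}(\theta_0)$ can be found in Section 7.1.2.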

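Theorem 3.3.1. Under conditions (A1)–(A8), the following results hold:

(i) Under $H_0 : \theta = \theta_0$, $\frac{1}{\sqrt{n}} U_n(\theta_0) \xrightarrow{d} N\left(0, v^c_{\theta_0}(\theta_0)\right)$ as $n \to \infty$.

(ii) Under $H_1 : \theta = \theta^*$, $\frac{1}{\sqrt{n}} U_n(\theta_0) - \sqrt{n}\, e^{c*}_{\theta^*}(\theta_0) \xrightarrow{d} N\left(0, v^{c*}_{\theta^*}(\theta_0)\right)$ as $n \to \infty$.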

Now we need to derive the asymptotic properties of the test statistic to calculate the sample size. This is not difficult since we have already derived the asymptotic properties of $U_n(\theta_0)$ in Theorem 3.3.1. We only need to know the limit of $\frac{1}{n} V^c_n(\theta_0)$ under the alternative hypothesis. This is shown in the proof of Corollary 3.3.2 in Section 7.1.2. Corollary 3.3.2 gives the corresponding results for the test statistic $T^c_n$.

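Corollary 3.3.2. Under conditions (A1)–(A8), the results follow:

(i) Under $H_0 : \theta = \theta_0$, $T^c_n \xrightarrow{d} N(0, 1)$ as $n \to \infty$.

(ii) Under $H_1 : \theta = \theta^*$, $T^c_n - \sqrt{n}\, \dfrac{e^{c*}_{\theta^*}(\theta_0)}{\sqrt{v^c_{\theta^*}(\theta_0)}} \xrightarrow{d} N\left(0, \dfrac{v^{c*}_{\theta^*}(\theta_0)}{v^c_{\theta^*}(\theta_0)}\right)$ as $n \to \infty$.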


Now we have obtained the asymptotic properties of the test statistic $T^c_n$ under both

the null and alternative hypotheses. The sample size can be linked to type I error,

power, effect size, and design effect based on the above results. Result 3.3.3 gives the

sample size formula for the Cox model under the fixed alternative hypothesis.

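Result 3.3.3. The hypotheses of interest to test are $H_0 : \theta = \theta_0$ versus $H_1 : \theta = \theta^*$, with type I error $\alpha$ and power $1 - \beta$.

Under conditions (A1)–(A8), by Corollary 3.3.2 (i), the following relationship can be established:

$$\Pr_{H_0}\left( |T^c_n| > c \right) = \alpha,$$

where $c = z_{\alpha/2}$ is the critical value.

The derivation of the sample size is similar to that shown in the example of Chapter 2. By the definition of power, we derive the following:

$$\Pr_{H_1}\left( |T^c_n| > c \right) = 1 - \beta,$$

which implies

$$\Pr_{H_1}\left( T^c_n > c \right) + \Pr_{H_1}\left( T^c_n < -c \right) = 1 - \beta.$$

Without loss of generality, we assume that the mean of the test statistic $T^c_n$ is negative under the alternative hypothesis $H_1$, so that $\Pr_{H_1}(T^c_n > c)$ is negligible. The following result follows with asymptotic normal approximation:

$$\Pr_{H_1}\left( T^c_n < -c \right) = 1 - \beta.$$

Then,

$$\Pr_{H_1}\left\{ \frac{\sqrt{v^c_{\theta^*}(\theta_0)}}{\sqrt{v^{c*}_{\theta^*}(\theta_0)}} \left( T^c_n - \frac{\sqrt{n}\, e^{c*}_{\theta^*}(\theta_0)}{\sqrt{v^c_{\theta^*}(\theta_0)}} \right) < \frac{\sqrt{v^c_{\theta^*}(\theta_0)}}{\sqrt{v^{c*}_{\theta^*}(\theta_0)}} \left( -c - \frac{\sqrt{n}\, e^{c*}_{\theta^*}(\theta_0)}{\sqrt{v^c_{\theta^*}(\theta_0)}} \right) \right\} = 1 - \beta.$$

Based on Corollary 3.3.2 (ii), $z_\beta$ can be approximated:

$$\frac{\sqrt{v^c_{\theta^*}(\theta_0)}}{\sqrt{v^{c*}_{\theta^*}(\theta_0)}} \left( -c - \frac{\sqrt{n}\, e^{c*}_{\theta^*}(\theta_0)}{\sqrt{v^c_{\theta^*}(\theta_0)}} \right) = z_\beta.$$

The sample size is obtained by replacing $c$ with $z_{\alpha/2}$ in the above equation and solving for $n$:

$$n = \frac{\left( z_{\alpha/2} + \dfrac{\sqrt{v^{c*}_{\theta^*}(\theta_0)}}{\sqrt{v^c_{\theta^*}(\theta_0)}}\, z_\beta \right)^2}{\left( e^{c*}_{\theta^*}(\theta_0) \right)^2 / v^c_{\theta^*}(\theta_0)}. \tag{3.4}$$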

The formula (3.4) can be used for the sample size calculation for the Cox proportional hazards model under the fixed alternative. It is a function of type I error, power, and design effects $e^{c*}_{\theta^*}(\theta_0)$, $v^{c*}_{\theta^*}(\theta_0)$ and $v^c_{\theta^*}(\theta_0)$. The formula derived here is general as it is based on the score statistic of the Cox model under the fixed alternative. However, the form of $v^{c*}_{\theta^*}(\theta_0)$ is usually complicated and may be difficult to derive. In the next section, we consider a method that simplifies the expression of $v^{c*}_{\theta^*}(\theta_0)$ under contiguous alternatives.
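Once the three design quantities are available, formula (3.4) is a direct computation. The sketch below is one possible way to code it; the function name and the numerical inputs are hypothetical and are not taken from the dissertation.

```python
import math
from statistics import NormalDist


def n_fixed_alternative(alpha, beta, e_star, v_star, v_alt):
    """Sample size from (3.4).

    e_star : e^{c*}_{theta*}(theta_0), limiting mean of U_n(theta_0)/n under H1
    v_star : v^{c*}_{theta*}(theta_0), limiting variance of U_n(theta_0)/sqrt(n) under H1
    v_alt  : v^{c}_{theta*}(theta_0), limit of V^c_n(theta_0)/n under H1
    """
    z = NormalDist().inv_cdf
    z_alpha = z(1.0 - alpha / 2.0)  # upper alpha/2 point of N(0, 1)
    z_beta = z(1.0 - beta)          # upper beta point of N(0, 1)
    numerator = (z_alpha + math.sqrt(v_star / v_alt) * z_beta) ** 2
    return numerator / (e_star ** 2 / v_alt)


# Hypothetical design quantities, chosen only to exercise the formula.
n = n_fixed_alternative(alpha=0.05, beta=0.20, e_star=-0.1, v_star=0.25, v_alt=0.25)
n_required = math.ceil(n)
```

Because $e^{c*}_{\theta^*}(\theta_0)$ enters only through its square, the sign convention for the mean of $T^c_n$ does not affect the result.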

3.4 Sample size formula under contiguous alternatives

The sample size formula in Section 3.3 is derived for the null hypothesis $H_0 : \theta = \theta_0$ versus the fixed alternative hypothesis $H_1 : \theta = \theta^*$. The limiting variance of the test statistic $T^c_n$ has a rather complex form. In this section, we discuss the formula derived based on a sequence of alternatives, the contiguous alternatives. The asymptotic distribution of the test statistic will be evaluated under the contiguous alternatives $H_{1n} : \theta = \theta_{1n} = \theta_0 + \frac{h}{\sqrt{n}}$, where $h \in \mathbb{R}$ is some fixed real value. The power under the fixed alternative is approximated by that under the contiguous alternatives. Let $\theta^* = \theta_{1n} = \theta_0 + \frac{h}{\sqrt{n}}$; then the sample size $n$ is related to $h$, $\theta_0$ and $\theta^*$, since $h = \sqrt{n}(\theta^* - \theta_0)$. By doing so, it is implicitly assumed that $\theta_0$ is close to $\theta^*$; i.e., the effect size is small. When a sequence of contiguous alternatives is considered, the limiting distribution of $T^c_n$ has a simpler form. We show that the sample size formula by Schoenfeld (1983) [42] and Hsieh and Lavori (2000) [21] can be obtained as a special case of our method under certain conditions when $\theta_0 = 0$.

Theorem 3.4.1 gives the asymptotic distribution of $U_n(\theta_0)$ under the contiguous alternatives $H_{1n} : \theta = \theta_{1n} = \theta_0 + \frac{h}{\sqrt{n}}$.

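Theorem 3.4.1. Under conditions (A1)–(A8), the following result holds.

Under $H_{1n} : \theta = \theta_{1n} = \theta_0 + \frac{h}{\sqrt{n}}$,

$$\frac{1}{\sqrt{n}} U_n(\theta_0) \xrightarrow{d} N\left( h\, v^c_{\theta_0}(\theta_0),\; v^c_{\theta_0}(\theta_0) \right), \quad \text{as } n \to \infty.$$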

The proof of Theorem 3.4.1 can be found in Section 7.1.3. The following Corollary 3.4.2 gives the asymptotic distribution of the test statistic $T^c_n$ under the contiguous alternatives. The proof is straightforward by applying Slutsky's theorem.

Corollary 3.4.2. Under conditions (A1)–(A8), the following result holds.

Under $H_{1n} : \theta = \theta_{1n} = \theta_0 + \frac{h}{\sqrt{n}}$,

$$T^c_n \xrightarrow{d} N\left( h \sqrt{v^c_{\theta_0}(\theta_0)},\; 1 \right) \quad \text{as } n \to \infty.$$

Corollary 3.4.2 is proved in Section 7.1.4. Now we have obtained the asymptotic

distributions of the test statistic under both the null and alternative hypotheses. The

sample size can be derived by known type I error, power, effect size, and design effect

based on the above results. Result 3.4.3 gives the sample size formula for the Cox

model based on the contiguous alternatives.

Result 3.4.3. The hypotheses of interest to test are $H_0 : \theta = \theta_0$ versus $H_1 : \theta = \theta^*$, with type I error $\alpha$ and power $1 - \beta$.

Under conditions (A1)–(A8), by Corollary 3.3.2 (i), the following relationship can be established:

$$\Pr_{H_0}\left( |T^c_n| > c \right) = \alpha,$$

where $c = z_{\alpha/2}$ is the critical value.

Without loss of generality, we assume that the mean of the test statistic $T^c_n$ is negative under $H_1 : \theta = \theta^*$. By Corollary 3.4.2 and the definition of power, we derive the following:

$$\Pr_{H_1}\left( |T^c_n| > c \right) = 1 - \beta,$$

which is equivalent to

$$\Pr_{H_1}\left( T^c_n > c \right) + \Pr_{H_1}\left( T^c_n < -c \right) = 1 - \beta.$$

With asymptotic normal approximation,

$$\Pr_{H_1}\left( T^c_n < -c \right) = 1 - \beta.$$

Based on Corollary 3.4.2, we assume that $T^c_n$ follows an asymptotic normal distribution with mean approximated by $h\sqrt{v^c_{\theta_0}(\theta_0)}$, where $h = \sqrt{n}(\theta^* - \theta_0)$, and asymptotic variance equal to 1. Thus, we approximate the distribution of $T^c_n$ under $H_1$ by that under $H_{1n}$. It follows that

$$\Pr_{H_1}\left( T^c_n - h\sqrt{v^c_{\theta_0}(\theta_0)} < -c - h\sqrt{v^c_{\theta_0}(\theta_0)} \right) = 1 - \beta,$$

and

$$-c - h\sqrt{v^c_{\theta_0}(\theta_0)} = z_\beta.$$

Based on the fact that $h = \sqrt{n}(\theta^* - \theta_0)$, the sample size formula can be obtained by substituting $c$ with $z_{\alpha/2}$ in the above equation:

$$n = \frac{\left( z_{\alpha/2} + z_\beta \right)^2}{v^c_{\theta_0}(\theta_0)\, (\theta^* - \theta_0)^2}. \tag{3.5}$$

The formula (3.5) can be used for sample size calculation for the hypotheses of

interest H0 : θ = θ0 versus H1 : θ = θ∗, with type I error α and power 1 − β. It


is calculated based on the contiguous alternatives $H_{1n} : \theta = \theta_{1n} = \theta_0 + \frac{h}{\sqrt{n}}$. The formula is a function of $\alpha$, $\beta$, $\theta_0$, $\theta^*$, and the design effect $v^c_{\theta_0}(\theta_0)$. We notice that the limiting variances of the score statistic under the null and alternative hypotheses are both $v^c_{\theta_0}(\theta_0)$. Therefore, the sample size formula under the fixed alternative can be

simplified. We will discuss the method in the following remark.
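Formula (3.5) and the normal approximation behind it can be sketched as follows. The function names and the value of the design effect are assumptions for illustration; the power function keeps only the dominant tail, as in the derivation of Result 3.4.3.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def n_contiguous(alpha, beta, theta0, theta_star, v0):
    """Sample size from (3.5); v0 stands for v^c_{theta_0}(theta_0)."""
    z_alpha = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    z_beta = _STD_NORMAL.inv_cdf(1.0 - beta)
    return (z_alpha + z_beta) ** 2 / (v0 * (theta_star - theta0) ** 2)


def approx_power(n, alpha, theta0, theta_star, v0):
    """One-tailed normal approximation to the power at sample size n."""
    z_alpha = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    shift = math.sqrt(n * v0) * abs(theta_star - theta0)  # |h| * sqrt(v0)
    return _STD_NORMAL.cdf(shift - z_alpha)


# Hypothetical inputs: hazard ratio 0.5 under H1 and an assumed design effect of 0.2.
theta_star = math.log(0.5)
n = n_contiguous(0.05, 0.20, 0.0, theta_star, v0=0.2)
power_at_n = approx_power(n, 0.05, 0.0, theta_star, v0=0.2)
```

Evaluating the approximate power at the computed $n$ returns the target $1 - \beta$, which is a quick consistency check on the algebra leading to (3.5).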

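Remark 3.4.4. It is shown that the asymptotic variance of $\frac{1}{\sqrt{n}} U_n(\theta_0)$ is $v^c_{\theta_0}(\theta_0)$ under the contiguous alternatives (see Theorem 3.4.1 and its proof in Section 7.1.3). This result indicates that the limiting variance of $\frac{1}{\sqrt{n}} U_n(\theta_0)$ under the alternative hypothesis is close to $v^c_{\theta_0}(\theta_0)$ when the alternative is close to the null hypothesis. Let us assume a simple linear relationship between the two limiting variances $v^{c*}_{\theta^*}(\theta_0)$ and $v^c_{\theta_0}(\theta_0)$; that is, $v^{c*}_{\theta^*}(\theta_0) = \phi\, v^c_{\theta_0}(\theta_0)$, where $\phi > 0$. When the effect size is small, that is, when $\theta^*$ is close to $\theta_0$, $v^c_{\theta_0}(\theta_0) \approx v^{c*}_{\theta^*}(\theta_0)$. Thus, it is expected that $\phi$ is close to 1 for small effect size. Different values of $\phi$ can be explored to obtain the most appropriate sample size. If we replace $v^{c*}_{\theta^*}(\theta_0)$ with $\phi\, v^c_{\theta_0}(\theta_0)$ in (3.4), the sample size formula becomes:

$$n = \frac{\left( z_{\alpha/2} + \dfrac{\sqrt{\phi\, v^c_{\theta_0}(\theta_0)}}{\sqrt{v^c_{\theta^*}(\theta_0)}}\, z_\beta \right)^2}{\left( e^{c*}_{\theta^*}(\theta_0) \right)^2 / v^c_{\theta^*}(\theta_0)}. \tag{3.6}$$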

Different values of φ can be investigated if we use (3.6) instead of (3.4). The value of φ

can be determined through a series of simulation studies. In this way, the calculation

is greatly simplified without deriving the explicit form of the limiting variance of the

score statistic under the fixed alternative.
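A sensitivity check over $\phi$ in (3.6) can be sketched as below. The function name, the grid of $\phi$ values and the design quantities are all hypothetical; in practice the candidate values would be compared through simulation as described in the remark.

```python
import math
from statistics import NormalDist


def n_with_phi(alpha, beta, e_star, v_null, v_alt, phi):
    """Sample size from (3.6), with v^{c*} replaced by phi * v^c_{theta_0}(theta_0).

    v_null : v^c_{theta_0}(theta_0)
    v_alt  : v^c_{theta*}(theta_0)
    """
    z = NormalDist().inv_cdf
    ratio = math.sqrt(phi * v_null) / math.sqrt(v_alt)
    return (z(1.0 - alpha / 2.0) + ratio * z(1.0 - beta)) ** 2 / (e_star ** 2 / v_alt)


# Hypothetical design quantities; phi = 1 corresponds to equal limiting variances.
sizes = {
    phi: n_with_phi(0.05, 0.20, e_star=-0.1, v_null=0.25, v_alt=0.24, phi=phi)
    for phi in (0.8, 1.0, 1.2)
}
```

For power above 50 percent ($z_\beta > 0$), the computed sample size increases with $\phi$, so the choice of $\phi$ acts as a tuning constant between a smaller and a more conservative design.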

In practice, it is often of interest to derive the sample size for the null hypothesis

of no effect (θ0 = 0) versus some small effect under the alternative hypothesis. The

following result gives the sample size formula for this kind of special case when θ0 = 0.

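Result 3.4.5. The hypotheses of interest to test are $H_0 : \theta = 0$ versus $H_1 : \theta = \theta^*$, with type I error $\alpha$ and power $1 - \beta$.

Under conditions (A1)–(A8) and (C1), the sample size formula based on the contiguous alternatives $H_{1n} : \theta = \theta_{1n} = \frac{h}{\sqrt{n}}$ is

$$n = \frac{\left( z_{\alpha/2} + z_\beta \right)^2}{\mathrm{Var}(Z)\, \Pr_{\theta=0}(\{\Delta : \Delta = 1\})\, \theta^{*2}}. \tag{3.7}$$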

We denote by $\Pr_{\theta=0}(\{\Delta : \Delta = 1\})$ the event probability under the null hypothesis of $\theta = 0$. Result 3.4.5 is derived as a direct application of Result 3.4.3. Note that with the additional condition (C1), $v^c_{\theta_0}(\theta_0)$ can be simplified when $\theta_0 = 0$. Formula (3.7) is thus the same as the sample size formula by Hsieh and Lavori (2000) [21]. The following equation can be obtained from (3.7):

$$n \Pr_{\theta=0}(\{\Delta : \Delta = 1\}) = \frac{\left( z_{\alpha/2} + z_\beta \right)^2}{\mathrm{Var}(Z)\, \theta^{*2}}.$$

Let $D_{\theta=0}$ denote the expected number of events under the null hypothesis $H_0 : \theta = 0$. Therefore, $D_{\theta=0} = n \Pr_{\theta=0}(\{\Delta : \Delta = 1\}) = \dfrac{(z_{\alpha/2} + z_\beta)^2}{\mathrm{Var}(Z)\, \theta^{*2}}$. In particular, when the covariate is binary (condition (C2)), $\mathrm{Var}(Z) = \rho(1 - \rho)$. Then the sample size formula (3.7) becomes

$$n = \frac{\left( z_{\alpha/2} + z_\beta \right)^2}{\rho(1 - \rho)\, \Pr_{\theta=0}(\{\Delta : \Delta = 1\})\, \theta^{*2}}. \tag{3.8}$$

Then $D_{\theta=0}$, the expected number of events under the null hypothesis $H_0 : \theta = 0$, can be calculated by $\dfrac{(z_{\alpha/2} + z_\beta)^2}{\rho(1 - \rho)\, \theta^{*2}}$. These results are the same as what Schoenfeld (1983) [42] has derived. This representation of $D_{\theta=0}$ has appeared in many works, including Schoenfeld (1981) [41], Schoenfeld and Richter (1982) [43] and Collett (2003) [9].
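As a worked instance of (3.8), the sketch below computes the required number of events and the corresponding sample size for a two-arm trial. The function names and the assumed event probability of 0.6 are hypothetical inputs chosen for illustration.

```python
import math
from statistics import NormalDist


def required_events(alpha, beta, theta_star, rho):
    """Expected number of events D under H0, from the events form of (3.8)."""
    z = NormalDist().inv_cdf
    return (z(1.0 - alpha / 2.0) + z(1.0 - beta)) ** 2 / (
        rho * (1.0 - rho) * theta_star ** 2
    )


def required_sample_size(alpha, beta, theta_star, rho, p_event):
    """Sample size n from (3.8); p_event is Pr(Delta = 1) under theta = 0."""
    return required_events(alpha, beta, theta_star, rho) / p_event


# Two-sided alpha = 0.05, power 0.80, hazard ratio 0.5, equal allocation.
theta_star = math.log(0.5)
D = required_events(0.05, 0.20, theta_star, rho=0.5)
n = required_sample_size(0.05, 0.20, theta_star, rho=0.5, p_event=0.6)
```

With these inputs roughly 66 events are needed after rounding up, and dividing by the assumed event probability converts the event count into a number of enrolled subjects.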


Chapter 4

Sample size calculation with Yang

and Prentice’s semiparametric

model

The methods discussed in Chapter 3 rely on the proportional hazards assumption.

When the hazard functions are non-proportional, an appropriate

model should be considered for sample size calculation. We derive the sample size

formulae based on the semiparametric model by Yang and Prentice (2005) [51]. This

model can be used for non-proportional hazard functions, with the Cox proportional

hazards model and the proportional odds model being two special cases. Thus, the

sample size calculation based on this model can be used for survival data when the

proportional hazards assumption does not hold. The first section of this chapter in-

troduces the semiparametric model for short-term and long-term hazard ratios (Yang

and Prentice 2005 [51]). The sample size formulae based on this model under fixed

and contiguous alternatives will follow. The special cases for testing the null hypoth-

esis of no treatment effect are discussed for the three submodels: Cox proportional

hazards model, proportional odds model, and long-term effect model. The last section

provides simulation results for the proposed methods under various scenarios.


4.1 Notations and Model specification

Yang and Prentice (2005) [51] developed a semiparametric model that can accom-

modate different short-term and long-term hazard ratios. In the simple two-sample

case, the model has the following form:

$$\frac{\lambda_1(t)}{\lambda_0(t)} = \frac{\gamma_1 \gamma_2}{\gamma_1 + (\gamma_2 - \gamma_1) S_0(t)}. \tag{4.1}$$

The hazard ratio in (4.1) is a function of two positive parameters γ1 and γ2 and the baseline survival function S0. It is monotone in t for fixed γ1 and γ2, since S0(t) is nonincreasing in t. Notice that $\gamma_1 = \lim_{t \to 0} \frac{\lambda_1(t)}{\lambda_0(t)}$ and $\gamma_2 = \lim_{t \to \infty} \frac{\lambda_1(t)}{\lambda_0(t)}$. This is because S0(t) goes to 1 when t goes to 0, and S0(t) goes to 0 when t goes to ∞.

Therefore, γ1 can be interpreted as the short-term hazard ratio and γ2 as the long-

term hazard ratio. The model reduces to the Cox proportional hazards model when

γ1 = γ2 = γ. The hazard ratio for the Cox model is constant over time, and the

corresponding hazard functions are proportional. When γ2 = 1, the model becomes

the proportional odds model. The proportional odds model has non-constant hazard

ratio, and the corresponding hazard functions are not proportional. The treatment

effect would fade away over time for the proportional odds model. Another special

case of interest is what we call the long-term effect model, in which γ1 = 1. This

means that there is no treatment effect at the beginning, but the effect shows up later

on. The long-term effect model has non-constant hazard ratio, and the corresponding

hazard functions are non-proportional.
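The limiting behaviour and the submodels described above can be checked numerically from (4.1). The function name below is an assumption for this sketch; the hazard ratio is written as a function of the baseline survival probability $S_0(t)$ rather than of $t$ itself.

```python
def hazard_ratio(s0, gamma1, gamma2):
    """Hazard ratio (4.1) as a function of the baseline survival probability s0."""
    return gamma1 * gamma2 / (gamma1 + (gamma2 - gamma1) * s0)


# S0 = 1 corresponds to t -> 0 and S0 = 0 to t -> infinity.
short_term = hazard_ratio(1.0, 0.6, 2.5)   # tends to gamma1
long_term = hazard_ratio(0.0, 0.6, 2.5)    # tends to gamma2

# Cox model: gamma1 = gamma2 gives a constant ratio for every s0.
cox_values = [hazard_ratio(s, 0.5, 0.5) for s in (1.0, 0.7, 0.3, 0.0)]

# Proportional odds model: gamma2 = 1, so the ratio drifts towards 1 over time.
po_values = [hazard_ratio(s, 2.0, 1.0) for s in (1.0, 0.5, 0.0)]
```

Expressing the ratio through $S_0(t)$ keeps the sketch free of any particular baseline distribution, which mirrors the semiparametric nature of the model.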

The model (4.1) can be used for a wide range of hazards patterns, including

crossing hazards. When there is crossing in the hazard functions, γ1 < 1 and γ2 > 1,

or, γ1 > 1 and γ2 < 1. Examples with baseline hazard function λ0(t) = 1/(1 + t) (corresponding baseline survival S0(t) = 1/(1 + t) and odds of baseline survival $\frac{1 - S_0(t)}{S_0(t)} = t$) are shown in the following figures. The hazard ratio, corresponding

hazard functions and survival functions are plotted under different scenarios. Figure

4.1 gives two examples of constant hazard ratio. These are examples of the Cox


proportional hazards model. When the common hazard ratio is less than 1, the

treatment is more favorable. When the common hazard ratio is greater than 1, the

treatment is not effective. Figure 4.2 shows the examples of the proportional odds

model, in which γ2, the long-term effect, is 1. Figure 4.3 shows two examples of

the long-term effect model, where γ1, the short-term effect, is 1. The corresponding

survival functions of the treatment and the control are close to each other in each

scenario of the long-term effect model, but start to deviate from each other after a

while. Figure 4.4 gives examples of the crossing hazard functions. These are the cases

when the treatment effect is different at different stages of a study. The treatment may

be beneficial at the beginning, but turns out to do harm later on, or, the treatment

may not be favorable at the beginning, but appears to be more effective as the study

continues. Crossing in hazard functions occurs when γ1 < 1 and γ2 > 1, or, γ1 > 1

and γ2 < 1. Notice that in each of the crossing hazards examples, both the hazard functions and the survival functions cross, but the crossing points are different.
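The difference between the two crossing points can be illustrated for the baseline $S_0(t) = 1/(1+t)$ used in the figures. The sketch below relies on two facts that follow from (4.1) but are not stated explicitly in the text, so they should be read as assumptions of this example: the hazards cross where $S_0(t) = \gamma_1(\gamma_2 - 1)/(\gamma_2 - \gamma_1)$, and integrating the treatment hazard gives $S_1(t) = \{1 + (\gamma_1/\gamma_2) R_0(t)\}^{-\gamma_2}$ with $R_0(t) = t$ here. All function names are hypothetical.

```python
def baseline_survival(t):
    """S0(t) = 1 / (1 + t), the baseline used in the figures."""
    return 1.0 / (1.0 + t)


def treatment_survival(t, gamma1, gamma2):
    """S1(t) implied by (4.1) for this baseline, where the baseline odds are R0(t) = t."""
    return (1.0 + gamma1 * t / gamma2) ** (-gamma2)


def hazard_crossing_time(gamma1, gamma2):
    """Time at which the hazard ratio (4.1) equals 1 for this baseline."""
    s0 = gamma1 * (gamma2 - 1.0) / (gamma2 - gamma1)
    return 1.0 / s0 - 1.0


def survival_crossing_time(gamma1, gamma2, lo=0.5, hi=10.0, tol=1e-10):
    """Bisection root of S1(t) - S0(t) on [lo, hi]; requires a sign change."""
    def diff(t):
        return treatment_survival(t, gamma1, gamma2) - baseline_survival(t)

    if diff(lo) * diff(hi) > 0:
        raise ValueError("no sign change of S1 - S0 on the search interval")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if diff(lo) * diff(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


# Parameters of the first panel of the crossing hazards figure.
t_hazard = hazard_crossing_time(0.6, 2.5)
t_survival = survival_crossing_time(0.6, 2.5)
```

For $\gamma_1 = 0.6$ and $\gamma_2 = 2.5$ the hazards cross shortly after $t = 1$, while the survival curves cross considerably later, since the early benefit has to be offset by the later harm before the curves meet.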

We will use the same notations and regularity conditions as in Section 3.1. The

regularity conditions are (A1)–(A8). Let θ = log γ be the parameter of interest. We

suppose that θ ∈ Θ, where Θ is a compact subset of Rp. In the model by Yang and

Prentice (2005) [51], p = 2. The parameter of interest is θ = (θ1, θ2)′, which is a

real-valued vector.


Figure 4.1: Proportional hazards: Cox model (γ1 = γ2 = γ). [Figure: two rows of panels, for γ1 = 0.5, γ2 = 0.5 and for γ1 = 2, γ2 = 2. Each row shows the hazard ratio, the hazard functions, and the survival functions of the treatment and control groups against time from 0 to 10.]


Figure 4.2: Non-proportional hazards: proportional odds model (γ2 = 1). [Figure: two rows of panels, for γ1 = 0.8, γ2 = 1 and for γ1 = 2, γ2 = 1. Each row shows the hazard ratio, the hazard functions, and the survival functions of the treatment and control groups against time from 0 to 10.]


Figure 4.3: Non-proportional hazards: long-term effect model (γ1 = 1). [Figure: two rows of panels, for γ1 = 1, γ2 = 0.5 and for γ1 = 1, γ2 = 1.8. Each row shows the hazard ratio, the hazard functions, and the survival functions of the treatment and control groups against time from 0 to 10.]


Figure 4.4: Non-proportional hazards: crossing hazards (γ1 < 1 and γ2 > 1, or, γ1 > 1 and γ2 < 1). [Figure: two rows of panels, for γ1 = 0.6, γ2 = 2.5 and for γ1 = 2.5, γ2 = 0.5. Each row shows the hazard ratio, the hazard functions, and the survival functions of the treatment and control groups against time from 0 to 10.]


4.2 Parameter estimation

The model has the following form after re-parameterization and the incorporation of a covariate Z:

$$\lambda_{T|Z}(t \mid Z, \theta) = \frac{1}{e^{-Z\theta_1} + e^{-Z\theta_2} R_0(t)}\, \frac{dR_0(t)}{dt}, \tag{4.2}$$

where $\theta_j = \log(\gamma_j)$, for $j = 1, 2$. Note that $\theta = (\theta_1, \theta_2)' \in \Theta \subset \mathbb{R}^2$ is a real-valued vector of two dimensions, and the compact parameter space $\Theta$ is a subset of $\mathbb{R}^2$. Now $\theta_1 = \log(\gamma_1)$ can be interpreted as the logarithm of the short-term hazard ratio and $\theta_2 = \log(\gamma_2)$ as the logarithm of the long-term hazard ratio. We also define $R_0$ as the odds of the baseline survival function, i.e., $R_0(t) = \frac{1 - S_0(t)}{S_0(t)}$, for $t \in (0, \tau]$.

The hazard ratio is then a function of θ1, θ2 and R0. The model reduces to the

Cox proportional hazards model when θ1 = θ2, to the proportional odds model when

θ2 = 0, and to the long-term effect model when θ1 = 0. If θ1 and θ2 have different

signs (i.e., one is positive and the other is negative), then there is crossing in hazard

functions.

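If $R_0(t)$ in the model (4.2) is known, the likelihood and log likelihood of $\theta = (\theta_1, \theta_2)'$ can be written as follows under non-informative censoring:

$$L_n(\theta, R_0) = \prod_{i=1}^{n} \lambda_{T|Z}(X_i \mid Z_i, \theta)^{\Delta_i}\, S_{T|Z}(X_i \mid Z_i, \theta),$$

$$\log L_n(\theta, R_0) = \sum_{i=1}^{n} \int_0^\tau \log \lambda_{T|Z}(t \mid Z_i, \theta)\, dN_i(t) - \sum_{i=1}^{n} \int_0^\tau \lambda_{T|Z}(t \mid Z_i, \theta)\, Y_i(t)\, dt.$$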

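The score function with respect to $\theta$ is $\frac{\partial \log L_n(\theta, R_0)}{\partial\theta}$. If $R_0$ is unknown, it can be consistently estimated by $R_n$, where $R_n(t, \theta)$ has the following closed-form expression (Yang and Prentice 2005 [51]):

$$R_n(t, \theta) = \frac{1}{P_n(t, \theta)} \int_0^t P_n(s-, \theta)\, d\Lambda_{n,1}(s, \theta), \tag{4.3}$$

where

$$P_n(t, \theta) = \prod_{u \in (0,t]} \left[ 1 - d\Lambda_{n,2}(u, \theta) \right],$$

$$\Lambda_{n,j}(t, \theta) = \int_0^t \frac{1}{K_n(s)}\, dH_{n,j}(s, \theta),$$

$$H_{n,j}(t, \theta) = \sum_{i=1}^{n} \Delta_i e^{-Z_i\theta_j} I(X_i \le t),$$

$$K_n(t) = \sum_{i=1}^{n} Y_i(t),$$

and $j = 1, 2$.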

Also define

$$R_{0,\theta_0}(t, \theta) = \lim_{n\to\infty} E_{\theta_0}\left[ R_n(t, \theta) \right].$$

Yang and Prentice (2005) [51] showed that with probability 1, Rn(t, θ) converges

to R0,θ0(t, θ) uniformly in t ∈ (0, τ ] and θ ∈ Θ under true parameter θ0 ∈ Θ. In

particular, R0,θ0(t, θ0) is equal to R0,θ0(t), the true odds function of the baseline

survival when the true parameter of interest is θ0. In the rest of the chapter, R0,θ0(t)

is also denoted by R0(t).
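The estimator (4.3) is a finite sum and product over observed event times, which the following sketch makes explicit. It assumes there are no tied observation times, and the function name and toy data are hypothetical. At $\theta = (0, 0)'$ the increments of $\Lambda_{n,1}$ and $\Lambda_{n,2}$ are both Nelson-Aalen increments, so $P_n$ is the Kaplan-Meier estimator and $R_n$ reduces to the Kaplan-Meier odds $(1 - \hat S)/\hat S$, which gives a simple check.

```python
import math


def odds_estimator(X, Z, Delta, theta1, theta2):
    """Evaluate R_n(t, theta) of (4.3) at each observed event time.

    Assumes no tied times. Returns a list of (event time, R_n) pairs.
    The value is undefined if P_n reaches zero (for example when the last
    observation is an event and theta = 0), which raises ZeroDivisionError.
    """
    n = len(X)
    order = sorted(range(n), key=lambda i: X[i])
    p_prev = 1.0     # P_n(s-, theta) just before the current event time
    integral = 0.0   # running integral of P_n(s-) dLambda_{n,1}(s)
    values = []
    for rank, i in enumerate(order):
        if Delta[i] == 0:
            continue  # H_{n,j} jumps only at observed events
        k = n - rank  # K_n(X_i): number of subjects with X_j >= X_i
        d_lambda1 = math.exp(-Z[i] * theta1) / k
        d_lambda2 = math.exp(-Z[i] * theta2) / k
        integral += p_prev * d_lambda1
        p_now = p_prev * (1.0 - d_lambda2)
        values.append((X[i], integral / p_now))
        p_prev = p_now
    return values


# Hypothetical data: the largest observation is censored, so P_n stays positive.
X = [1.0, 2.0, 3.0, 4.0]
Z = [1, 0, 1, 0]
Delta = [1, 1, 1, 0]
R_null = odds_estimator(X, Z, Delta, 0.0, 0.0)
R_other = odds_estimator(X, Z, Delta, 0.3, -0.2)
```

For these data the Kaplan-Meier survival estimates are 0.75, 0.5 and 0.25 at the three event times, so `R_null` takes the values 1/3, 1 and 3.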

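The corresponding pseudo score function is:

$$Q_n(\theta) = \left. \frac{\partial \log L_n(\theta, R_0)}{\partial\theta} \right|_{R_0(t) = R_n(t,\theta)} = \sum_{i=1}^{n} \int_0^\tau \left[ g_{n,i}(t, \theta) - \frac{S_n(t, \theta)}{K_n(t)} \left( e^{-Z_i\theta_1} + e^{-Z_i\theta_2} R_n(t, \theta) \right) \right] dN_i(t), \tag{4.4}$$

where

$$g_{n,i}(t, \theta) = \left. \frac{\partial \log \lambda_{T|Z}(t \mid Z_i, \theta)}{\partial\theta} \right|_{R_0(t) = R_n(t,\theta)} = \begin{pmatrix} \dfrac{Z_i e^{-Z_i\theta_1}}{e^{-Z_i\theta_1} + e^{-Z_i\theta_2} R_n(t,\theta)} \\[2ex] \dfrac{Z_i e^{-Z_i\theta_2} R_n(t,\theta)}{e^{-Z_i\theta_1} + e^{-Z_i\theta_2} R_n(t,\theta)} \end{pmatrix},$$

$$S_n(t, \theta) = \sum_{k} \frac{g_{n,k}(t, \theta)\, Y_k(t)}{e^{-Z_k\theta_1} + e^{-Z_k\theta_2} R_n(t, \theta)}.$$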


The pseudo maximum likelihood estimator $\hat\theta$, which is the root of $Q_n(\theta) = 0$, is a consistent estimator of the true parameter θ (Yang and Prentice (2005) [51]). We will

use Qn(θ0), the pseudo score function evaluated at θ0, to construct the test statistic

for the null hypothesis H0 : θ = θ0.

Remark 4.2.1. In the special cases when only one parameter is of interest, the forms

of gn,i(t, θ) can be derived for the Cox proportional hazards model, the proportional

odds model, and the long-term effect model.

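Specifically,

Cox proportional hazards model:

$$g_{n,i}(t, \theta) = Z_i. \tag{4.5}$$

Proportional odds model:

$$g_{n,i}(t, \theta) = \frac{Z_i e^{-Z_i\theta}}{e^{-Z_i\theta} + R_n(t, \theta)}. \tag{4.6}$$

Long-term effect model:

$$g_{n,i}(t, \theta) = \frac{Z_i e^{-Z_i\theta} R_n(t, \theta)}{1 + e^{-Z_i\theta} R_n(t, \theta)}. \tag{4.7}$$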
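The three special cases can be recovered from the general two-component $g_{n,i}(t, \theta)$ defined with (4.4). The sketch below uses a hypothetical function name and treats the value of $R_n(t, \theta)$ as a given number `r`. For the Cox model the single parameter enters both components, so by the chain rule the derivative is the sum of the two components.

```python
import math


def g_vector(z, theta1, theta2, r):
    """General g_{n,i}(t, theta) for covariate value z and odds value r = R_n(t, theta)."""
    a = math.exp(-z * theta1)
    b = math.exp(-z * theta2) * r
    return (z * a / (a + b), z * b / (a + b))


z, theta, r = 1, 0.4, 2.0

# Cox model (theta1 = theta2 = theta): the components sum to Z, as in (4.5).
g_cox = sum(g_vector(z, theta, theta, r))

# Proportional odds model (theta2 = 0): first component, as in (4.6).
g_po = g_vector(z, theta, 0.0, r)[0]

# Long-term effect model (theta1 = 0): second component, as in (4.7).
g_lt = g_vector(z, 0.0, theta, r)[1]
```

For a subject in the control arm ($Z = 0$) both components vanish, which reflects that the baseline hazard carries no information about $\theta$.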

4.3 Sample size formula under fixed alternative

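For some $\theta_0, \theta^* \in \Theta$ and $\theta_0 \neq \theta^*$, consider the hypotheses of interest:

$$H_0 : \theta = \theta_0 = \begin{pmatrix} \theta_{0,1} \\ \theta_{0,2} \end{pmatrix} \quad \text{vs.} \quad H_1 : \theta = \theta^* = \begin{pmatrix} \theta^*_1 \\ \theta^*_2 \end{pmatrix}.$$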


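Let $T_n = V_n(\theta_0)^{-\frac{1}{2}} Q_n(\theta_0)$, where

$$V_n(\theta_0) = \sum_{i} \int_0^\tau \frac{B_{n,i}(t, \theta_0)\, B_{n,i}(t, \theta_0)'\, Y_i(t)}{e^{-Z_i\theta_{0,1}} + e^{-Z_i\theta_{0,2}} R_n(t, \theta_0)}\, dR_n(t, \theta_0).$$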

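Here,

$$\begin{aligned}
B_{n,i}(t, \theta_0) ={}& g_{n,i}(t, \theta_0) - \frac{S_n(t, \theta_0)}{K_n(t)} \left( e^{-Z_i\theta_{0,1}} + e^{-Z_i\theta_{0,2}} R_n(t, \theta_0) \right) \\
&+ \frac{P_n(t-, \theta_0) \left( e^{-Z_i\theta_{0,1}} + e^{-Z_i\theta_{0,2}} R_n(t, \theta_0) \right)}{K_n(t)} \\
&\times \int_t^\tau \left\{ \sum_{k} \frac{g_{n,k}(s, \theta_0)\, e^{-Z_k\theta_{0,2}}\, Y_k(s)}{\left( e^{-Z_k\theta_{0,1}} + e^{-Z_k\theta_{0,2}} R_n(s, \theta_0) \right)^2} - \frac{\sum_{k} \dfrac{g_{n,k}(s, \theta_0)\, Y_k(s)}{e^{-Z_k\theta_{0,1}} + e^{-Z_k\theta_{0,2}} R_n(s, \theta_0)}\; \sum_{k} \dfrac{e^{-Z_k\theta_{0,2}}\, Y_k(s)}{e^{-Z_k\theta_{0,1}} + e^{-Z_k\theta_{0,2}} R_n(s, \theta_0)}}{K_n(s)} \right\} \frac{dR_n(s, \theta_0)}{P_n(s, \theta_0)}.
\end{aligned} \tag{4.8}$$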

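Let

$$v_{\theta_0}(\theta_0) = \lim_{n\to\infty} E_{\theta_0}\left[ \frac{1}{n} V_n(\theta_0) \right],$$

$$v_{\theta^*}(\theta_0) = \lim_{n\to\infty} E_{\theta^*}\left[ \frac{1}{n} V_n(\theta_0) \right],$$

$$e^*_{\theta^*}(\theta_0) = \lim_{n\to\infty} E_{\theta^*}\left[ \frac{1}{n} Q_n(\theta_0) \right],$$

$$v^*_{\theta^*}(\theta_0) = \lim_{n\to\infty} \mathrm{Var}_{\theta^*}\left[ \frac{1}{\sqrt{n}} Q_n(\theta_0) \right].$$

The derivations of $e^*_{\theta^*}(\theta_0)$ and $v^*_{\theta^*}(\theta_0)$ are shown in Section 7.2.3.

We consider the following pseudo score test statistic in the general cases:

$$T_n^2 = Q_n(\theta_0)'\, V_n(\theta_0)^{-1}\, Q_n(\theta_0).$$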

The following theorem gives the asymptotic distributions of Qn(θ0) under the hy-

potheses of interest with the fixed alternative.


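Theorem 4.3.1. Under conditions (A1)–(A8), the following results hold:

(i) Under $H_0 : \theta = \theta_0 = (\theta_{0,1}, \theta_{0,2})'$,

$$\frac{1}{\sqrt{n}} Q_n(\theta_0) \xrightarrow{d} N\left(0, v_{\theta_0}(\theta_0)\right) \quad \text{as } n \to \infty.$$

(ii) Under $H_1 : \theta = \theta^* = (\theta^*_1, \theta^*_2)'$,

$$\frac{1}{\sqrt{n}} Q_n(\theta_0) - \sqrt{n}\, e^*_{\theta^*}(\theta_0) \xrightarrow{d} N\left(0, v^*_{\theta^*}(\theta_0)\right) \quad \text{as } n \to \infty.$$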

The proof of Theorem 4.3.1 (i) is presented in Yang and Prentice (2005) [51]. The

proof of Theorem 4.3.1 (ii) can be found in Section 7.2.3. By Slutsky’s theorem and

Theorem 4.3.1, the asymptotic properties of the test statistic $T_n^2$ under the null and

fixed alternative hypotheses can be derived. The sample size derivation based on the

fixed alternative is summarized in Result 4.3.2.

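Result 4.3.2. The hypotheses of interest to test are:

$$H_0 : \theta = \theta_0 = \begin{pmatrix} \theta_{0,1} \\ \theta_{0,2} \end{pmatrix} \quad \text{vs.} \quad H_1 : \theta = \theta^* = \begin{pmatrix} \theta^*_1 \\ \theta^*_2 \end{pmatrix},$$

with type I error $\alpha$ and power $1 - \beta$.

If the conditions (A1)–(A8) are satisfied, under $H_0 : \theta = \theta_0 = (\theta_{0,1}, \theta_{0,2})'$, by Theorem 4.3.1 (i) and Slutsky's theorem, we have

$$T_n \xrightarrow{d} N(0, I_2) \quad \text{as } n \to \infty,$$

where $I_2$ is the $2 \times 2$ identity matrix. Hence,

$$T_n^2 \xrightarrow{d} \chi^2_2 \quad \text{as } n \to \infty.$$

With type I error $\alpha$,

$$\Pr_{H_0}\left( T_n^2 > c \right) = \alpha,$$

where the critical value $c = \chi^2_{2,\alpha}$.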

Under $H_1 : \theta = \theta^* = (\theta^*_1, \theta^*_2)'$, by Theorem 4.3.1 (ii) and Slutsky's theorem, the asymptotic distribution of the test statistic $T_n^2$ can be approximated by a quadratic form $X_Q' A_Q X_Q$ in a normal random vector with nonzero mean, where $X_Q$ is a normal random vector with mean $\mu_Q$ and variance $\Sigma_Q$, and

$$\mu_Q = \sqrt{n}\, e^*_{\theta^*}(\theta_0),$$

$$\Sigma_Q = v^*_{\theta^*}(\theta_0),$$

$$A_Q = v^{-1}_{\theta^*}(\theta_0).$$

In fact, $X_Q' A_Q X_Q$ can be expressed as a weighted sum of non-central chi-squared random variables. The tail probability of the quadratic form $X_Q' A_Q X_Q$ can be assessed using either an approximation or an exact method. The exact method was derived by Kotz et al. (1967) [24]. Some authors also proposed approximation methods (e.g., Solomon and Michael 1977 [46]; Liu et al. 2009 [30]). For the $1 - \beta$ power,

$$\Pr_{H_1}\left( T_n^2 > c \right) = 1 - \beta, \tag{4.9}$$

i.e.,

$$\Pr_{H_1}\left( X_Q' A_Q X_Q > \chi^2_{2,\alpha} \right) = 1 - \beta. \tag{4.10}$$

Since $\mu_Q$ is a function of $n$ and both $\Sigma_Q$ and $A_Q$ are known, the sample size $n$ can be obtained by solving the equation (4.10).
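Equation (4.10) has no closed-form solution, but it can be solved numerically. The sketch below swaps in plain Monte Carlo simulation for the exact and approximate tail-probability methods cited above, so it is an illustrative substitute rather than the approach of those references. The function names, the number of draws and the design quantities are assumptions. It uses the fact that the upper $\alpha$ point of $\chi^2_2$ is $-2\log\alpha$, and it loops forever if $e^*_{\theta^*}(\theta_0) = 0$, since the target power is then never reached.

```python
import math
import random


def cholesky_2x2(S):
    """Lower-triangular factor (l11, l21, l22) of a 2x2 positive-definite matrix."""
    l11 = math.sqrt(S[0][0])
    l21 = S[1][0] / l11
    l22 = math.sqrt(S[1][1] - l21 * l21)
    return l11, l21, l22


def power_quadratic_form(n, e, Sigma, A, alpha, draws=20000, seed=0):
    """Monte Carlo estimate of Pr(X'AX > chi2_{2,alpha}) with X ~ N(sqrt(n) e, Sigma)."""
    rng = random.Random(seed)
    crit = -2.0 * math.log(alpha)  # upper alpha point of chi-squared with 2 df
    l11, l21, l22 = cholesky_2x2(Sigma)
    m1, m2 = math.sqrt(n) * e[0], math.sqrt(n) * e[1]
    hits = 0
    for _ in range(draws):
        u1, u2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x1 = m1 + l11 * u1
        x2 = m2 + l21 * u1 + l22 * u2
        q = A[0][0] * x1 * x1 + (A[0][1] + A[1][0]) * x1 * x2 + A[1][1] * x2 * x2
        if q > crit:
            hits += 1
    return hits / draws


def solve_sample_size(e, Sigma, A, alpha, beta, **kw):
    """Find n by doubling and bisection so that the estimated power reaches 1 - beta.

    Treats the estimated power as nondecreasing in n, which Monte Carlo noise
    can violate slightly.
    """
    target = 1.0 - beta
    hi = 1
    while power_quadratic_form(hi, e, Sigma, A, alpha, **kw) < target:
        hi *= 2
    lo = hi // 2
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if power_quadratic_form(mid, e, Sigma, A, alpha, **kw) >= target:
            hi = mid
        else:
            lo = mid
    return hi


# Hypothetical design quantities: Sigma_Q = A_Q^{-1} = I and e* = (0.1, 0.1)'.
I2 = [[1.0, 0.0], [0.0, 1.0]]
size_check = power_quadratic_form(10, (0.0, 0.0), I2, I2, alpha=0.05)
n = solve_sample_size((0.1, 0.1), I2, I2, alpha=0.05, beta=0.20)
```

Fixing the seed reuses the same random draws for every candidate $n$, which keeps the estimated power curve smooth enough for bisection; the resulting $n$ still carries Monte Carlo error and can be refined by increasing `draws`.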

Result 4.3.2 shows the derivation of the sample size formula under the fixed al-

ternative hypothesis. However, there is no closed form expression of the sample size

since the derivation is not straightforward for the tail probability of the quadratic

form. The limiting variances of the pseudo score function are different under the null

and alternative hypotheses. Thus the distribution of the test statistic can only be

evaluated by the quadratic form under the alternative hypothesis. Similar to the approach in Chapter 3, we consider a simpler version of the alternative limiting variance $v^*_{\theta^*}(\theta_0)$.


Remark 4.3.3. Result 4.3.2 gives the sample size formula based on the pseudo score test statistic under the fixed alternative. It is a function of type I error $\alpha$, power $1 - \beta$, and design effects $v_{\theta^*}(\theta_0)$, $e^*_{\theta^*}(\theta_0)$ and $v^*_{\theta^*}(\theta_0)$. In the special cases (Cox proportional hazards model, proportional odds model, and long-term effect model) when only one parameter is of interest, a normal approximation can be used to calculate the sample size as in Result 3.3.3. The equivalent form of equation (4.9) is

$$\Pr_{H_1}\left( T_n > c \right) + \Pr_{H_1}\left( T_n < -c \right) = 1 - \beta.$$

Assuming that the mean of $T_n$ under $H_1$ is either positive or negative, we have

$$\Pr_{H_1}\left( T_n > c \right) = 1 - \beta,$$

or

$$\Pr_{H_1}\left( T_n < -c \right) = 1 - \beta.$$

The omission of either $\Pr_{H_1}(T_n > c)$ or $\Pr_{H_1}(T_n < -c)$ on the left-hand side of the equivalent form of equation (4.9) has been discussed in the example of Chapter 2 and in the sample size calculation for the Cox model in Section 3.3.

Hence, the sample size formula for testing $H_0 : \theta = \theta_0$ vs. $H_1 : \theta = \theta^*$ in the univariate case can be determined as

$$n = \frac{\left( z_{\alpha/2} + \dfrac{\sqrt{v^*_{\theta^*}(\theta_0)}}{\sqrt{v_{\theta^*}(\theta_0)}}\, z_\beta \right)^2}{\left( e^*_{\theta^*}(\theta_0) \right)^2 / v_{\theta^*}(\theta_0)}. \tag{4.11}$$

The derivation of $v^*_{\theta^*}(\theta_0)$ is usually complicated. Utilizing the idea proposed in Remark 3.4.4, we may consider approximating $v^*_{\theta^*}(\theta_0)$ with $v_{\theta_0}(\theta_0)$. We assume $v^*_{\theta^*}(\theta_0) = \phi\, v_{\theta_0}(\theta_0)$, where $\phi > 0$. When $\theta^*$ is close to $\theta_0$, $v_{\theta_0}(\theta_0) \approx v^*_{\theta^*}(\theta_0)$, and $\phi$ is expected to be close to 1. However, if the parameter of interest $\theta$ is a two-dimensional vector, then this approximation may not be as good as in the special cases when the parameter of interest is one-dimensional. The linear relationship between $v^*_{\theta^*}(\theta_0)$ and $v_{\theta_0}(\theta_0)$ may not hold for the $2 \times 2$ variance matrices. It is better to use such an approximation when there is only one parameter of interest as in the special cases.

If we use $\phi\, v_{\theta_0}(\theta_0)$ to approximate $v^*_{\theta^*}(\theta_0)$, different values of $\phi > 0$ should be explored in the simulation studies for stability.

4.4 Sample size formula under contiguous alternatives

As already shown in Section 3.4, the calculation of sample size can be greatly simplified under the contiguous alternatives. If we consider the null hypothesis $H_0 : \theta = \theta_0$ versus the fixed alternative hypothesis $H_1 : \theta = \theta^*$, the limiting variance of the test statistic $T_n^2$ has a complex form. We will present the asymptotic distribution of the pseudo score $Q_n(\theta_0)$ and then that of the test statistic $T_n^2$ under the contiguous alternatives. The contiguous alternatives are

$$H_{1n} : \theta = \theta_{1n} = \theta_0 + \frac{h}{\sqrt{n}},$$

where $h = (h_1, h_2)' \in \mathbb{R}^2$ is a real-valued vector.

The asymptotic distribution of T 2n under the null hypothesis has been discussed

in the previous section. We only need to derive the distribution under the contiguous

alternatives. The following theorem gives the asymptotic distribution of Qn(θ0) under

the contiguous alternatives.

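Theorem 4.4.1. Under conditions (A1)–(A8), the following result holds.

Under $H_{1n} : \theta = \theta_{1n} = \theta_0 + \frac{h}{\sqrt{n}}$,

$$\frac{1}{\sqrt{n}} Q_n(\theta_0) \xrightarrow{d} N\left( \xi_0, v_{\theta_0}(\theta_0) \right) \quad \text{as } n \to \infty,$$

where

$$\xi_0 = A_0 \cdot h,$$

and

$$A_0 = \begin{pmatrix} A_{0,11} & A_{0,12} \\ A_{0,21} & A_{0,22} \end{pmatrix}.$$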


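The elements of the matrix $A_0$ are

$$A_{0,11} = \int_0^\tau E_{\theta_0}\left\{ \left[ C_6 + \left( e^{-Z\theta_{0,1}} + e^{-Z\theta_{0,2}} R_{0,\theta_0}(t) \right) C_1 - \left( Z e^{-Z\theta_{0,1}} \right) C_3 \right] C_0 \right\} dR_{0,\theta_0}(t),$$

$$A_{0,12} = \int_0^\tau E_{\theta_0}\left\{ \left[ -C_6 + \left( e^{-Z\theta_{0,1}} + e^{-Z\theta_{0,2}} R_{0,\theta_0}(t) \right) R_{0,\theta_0}(t)\, C_5 - \left( Z e^{-Z\theta_{0,2}} R_{0,\theta_0}(t) \right) C_3 \right] C_0 \right\} dR_{0,\theta_0}(t),$$

$$A_{0,21} = \int_0^\tau E_{\theta_0}\left\{ \left[ -C_6 + \left( e^{-Z\theta_{0,1}} + e^{-Z\theta_{0,2}} R_{0,\theta_0}(t) \right) R_{0,\theta_0}(t)\, C_5 - \left( Z e^{-Z\theta_{0,1}} R_{0,\theta_0}(t) \right) C_4 \right] C_0 \right\} dR_{0,\theta_0}(t),$$

$$A_{0,22} = \int_0^\tau E_{\theta_0}\left\{ \left[ C_6 - \left( e^{-Z\theta_{0,1}} + e^{-Z\theta_{0,2}} R_{0,\theta_0}(t) \right) R_{0,\theta_0}(t)\, C_2 - \left( Z e^{-Z\theta_{0,2}} R^2_{0,\theta_0}(t) \right) C_4 \right] C_0 \right\} dR_{0,\theta_0}(t),$$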

where

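\[
C_0 = \frac{S_{T|Z}(t)\, S_{C|Z}(t)}{e^{-Z\theta_{0,1}} + e^{-Z\theta_{0,2}} R_{0,\theta_0}(t)},
\]

\[
C_1 = \frac{E_{\theta_0}\left[ \dfrac{Z^2 e^{-Z\theta_{0,1}} \big(e^{-Z\theta_{0,1}} - e^{-Z\theta_{0,2}} R_{0,\theta_0}(t)\big)\, S_{T|Z}(t)\, S_{C|Z}(t)}{\big(e^{-Z\theta_{0,1}} + e^{-Z\theta_{0,2}} R_{0,\theta_0}(t)\big)^{3}} \right]}{E_{\theta_0}\left[ S_{T|Z}(t)\, S_{C|Z}(t) \right]},
\]

\[
C_2 = \frac{E_{\theta_0}\left[ \dfrac{Z^2 e^{-Z\theta_{0,2}} \big(e^{-Z\theta_{0,1}} - e^{-Z\theta_{0,2}} R_{0,\theta_0}(t)\big)\, S_{T|Z}(t)\, S_{C|Z}(t)}{\big(e^{-Z\theta_{0,1}} + e^{-Z\theta_{0,2}} R_{0,\theta_0}(t)\big)^{3}} \right]}{E_{\theta_0}\left[ S_{T|Z}(t)\, S_{C|Z}(t) \right]},
\]

\[
C_3 = \frac{E_{\theta_0}\left[ \dfrac{Z e^{-Z\theta_{0,1}}\, S_{T|Z}(t)\, S_{C|Z}(t)}{\big(e^{-Z\theta_{0,1}} + e^{-Z\theta_{0,2}} R_{0,\theta_0}(t)\big)^{2}} \right]}{E_{\theta_0}\left[ S_{T|Z}(t)\, S_{C|Z}(t) \right]},
\]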

C4 =Eθ0


(e−Zθ0,1+e−Zθ0,2R0,θ0(t))2

]Eθ0

[ST |Z(t)SC|Z(t)

] ,


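\[
C_5 = \frac{E_{\theta_0}\left[ \dfrac{2 Z^2 e^{-Z\theta_{0,1}} e^{-Z\theta_{0,2}}\, S_{T|Z}(t)\, S_{C|Z}(t)}{\big(e^{-Z\theta_{0,1}} + e^{-Z\theta_{0,2}} R_{0,\theta_0}(t)\big)^{3}} \right]}{E_{\theta_0}\left[ S_{T|Z}(t)\, S_{C|Z}(t) \right]},
\]

and

\[
C_6 = \frac{Z^2 e^{-Z\theta_{0,1}} e^{-Z\theta_{0,2}} R_{0,\theta_0}(t)}{\big(e^{-Z\theta_{0,1}} + e^{-Z\theta_{0,2}} R_{0,\theta_0}(t)\big)^{2}}.
\]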

The proof of Theorem 4.4.1 is shown in Section 7.2.4. Note that the limiting

variance of the pseudo score function evaluated at θ0 is the same as that under the null

hypothesis. The following result gives the sample size formula based on the contiguous

alternatives. The power function under the fixed alternative is approximated by that

under the contiguous alternatives.

Result 4.4.2. The hypotheses of interest to test are

\[
H_0 : \theta = \theta_0 = \begin{pmatrix} \theta_{0,1} \\ \theta_{0,2} \end{pmatrix}
\quad \text{vs.} \quad
H_1 : \theta = \theta^{*} = \begin{pmatrix} \theta^{*}_1 \\ \theta^{*}_2 \end{pmatrix}.
\]

If the conditions (A1)–(A8) are satisfied, then under $H_0 : \theta = \theta_0$, by Theorem 4.3.1 (i) and the definition of type I error, we have

\[
\Pr\nolimits_{H_0}\left( T_n^2 > c \right) = \alpha,
\]

where the critical value is $c = \chi^2_{2,\alpha}$.

By the definition of power, we have

\[
\Pr\nolimits_{H_1}\left( T_n^2 > c \right) = 1 - \beta.
\]

Instead of approximating the asymptotic distribution of $T_n^2$ by the quadratic form $X_Q' A_Q X_Q$, we consider a non-central Chi-squared distribution. This is based on the asymptotic property of the test statistic under the contiguous alternatives as shown in Theorem 4.4.1. Assume that $T_n^2$ follows an asymptotic Chi-squared distribution with 2 degrees of freedom and a non-centrality parameter η. Therefore,

\[
\chi^2_{2,1-\beta}(\eta) = \chi^2_{2,\alpha}.
\]

By Theorem 4.4.1, η can be approximated by $\eta_n = \xi_0' v_{\theta_0}(\theta_0)^{-1} \xi_0 = h' A_0' v_{\theta_0}(\theta_0)^{-1} A_0 h$. Let $h = \sqrt{n}(\theta^{*} - \theta_0)$; then $\eta_n = n(\theta^{*} - \theta_0)' A_0' v_{\theta_0}(\theta_0)^{-1} A_0 (\theta^{*} - \theta_0)$. Hence, the sample size can be calculated by

\[
n = \frac{\eta}{(\theta^{*} - \theta_0)' A_0' v_{\theta_0}(\theta_0)^{-1} A_0 (\theta^{*} - \theta_0)}. \tag{4.12}
\]

It is often of interest to evaluate sample size calculation under the null hypothesis

of no effect (i.e., θ0 = 0) versus some small effect under the alternative hypothesis. The

sample size for such problem can be calculated based on the contiguous alternatives

for θ0 = 0 in the following result.


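Result 4.4.3. The hypotheses of interest to test are

\[
H_0 : \theta = 0 = \begin{pmatrix} 0 \\ 0 \end{pmatrix}
\quad \text{vs.} \quad
H_1 : \theta = \theta^{*} = \begin{pmatrix} \theta^{*}_1 \\ \theta^{*}_2 \end{pmatrix}.
\]

If the conditions (A1)–(A8) and (C1) are satisfied, the sample size based on the contiguous alternatives is

\[
n = \frac{\eta}{\theta^{*\prime} \nu_0 \theta^{*}}, \tag{4.13}
\]

where η can be determined by solving the equation $\chi^2_{2,1-\beta}(\eta) = \chi^2_{2,\alpha}$, and

\[
\nu_0 = \mathrm{Var}(Z) \int_0^{\tau} \frac{S_C(t)}{(1 + R_0(t))^{4}}
\begin{pmatrix} 1 & R_0(t) \\ R_0(t) & R_0(t)^2 \end{pmatrix} dR_0(t).
\]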


Result 4.4.3 is a direct application of Result 4.4.2 under certain conditions when

θ0 = 0. The term A′0vθ0(θ0)−1A0 can be simplified to ν0 with these conditions. Note

that the simplification is obtained based on the fact that we have the additional

assumption of (C1) (censoring is independent of covariate) and that the pseudo score

function Qn can be simplified when θ0 = 0. In this case, the sample size formula

has a simple form, which is a function of variance of the covariate Z, R0 and the

censoring survival function SC . Recall that R0 is the odds of the baseline survival

function under the null hypothesis of θ0 = 0. Table 4.1 gives the values of η for some

commonly assumed α’s and β’s in sample size calculation.

Table 4.1: Values of η for different α's and β's.

α       1 − β    η
0.05    0.9      12.655
0.10    0.9      10.457
0.05    0.8      9.635
0.10    0.8      7.710
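The values of η in Table 4.1 can be checked numerically. The following is a minimal sketch, not the procedure used to produce the table: it assumes the standard representation of a non-central Chi-squared distribution with 2 degrees of freedom as a Poisson mixture of central Chi-squared distributions, and solves $\chi^2_{2,1-\beta}(\eta) = \chi^2_{2,\alpha}$ by bisection. The function names `ncx2_sf_2df` and `solve_eta` are illustrative only.

```python
import math

def ncx2_sf_2df(x, eta, terms=200):
    """P(X > x) for a non-central chi-squared variable with 2 df and
    non-centrality eta, as a Poisson(eta/2) mixture of central
    chi-squared tails with 2, 4, 6, ... degrees of freedom."""
    half = x / 2.0
    lam = eta / 2.0
    pois = math.exp(-lam)   # Poisson mass at j = 0
    term = 1.0              # (x/2)^j / j! at j = 0
    partial = 1.0           # sum_{i <= j} (x/2)^i / i!
    total = 0.0
    for j in range(terms):
        total += pois * partial       # tail of chi-squared with 2(j+1) df
        pois *= lam / (j + 1)
        term *= half / (j + 1)
        partial += term
    return math.exp(-half) * total

def solve_eta(alpha, power, lo=0.0, hi=100.0):
    """Find eta such that the non-central tail beyond chi2_{2,alpha} equals power."""
    crit = -2.0 * math.log(alpha)     # chi2_{2,alpha}: central tail is exp(-x/2)
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if ncx2_sf_2df(crit, mid) < power:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

if __name__ == "__main__":
    for alpha, power in [(0.05, 0.9), (0.10, 0.9), (0.05, 0.8), (0.10, 0.8)]:
        print(alpha, power, round(solve_eta(alpha, power), 3))
```

The truncation at 200 Poisson terms and the search interval [0, 100] are assumptions that are adequate for the α's and β's considered here.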

Remark 4.4.4. Note that the denominator in (4.13) has a quadratic form of θ∗.

Thus, the sample size calculated by (4.13) should be the same for θ∗ = θ∗0 and θ∗ =

−θ∗0.

The sample size formula based on the contiguous alternatives can be obtained

for the three submodels: Cox model, proportional odds model, and long-term effect

model.


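Result 4.4.5. The hypotheses of interest to test are

\[
H_0 : \theta = 0 \quad \text{vs.} \quad H_1 : \theta = \theta^{*}.
\]

If the conditions (A1)–(A8) and (C1) are satisfied, then the sample size based on the contiguous alternatives is

\[
n = \frac{(z_{\alpha/2} + z_{\beta})^{2}}{\nu_0\, \theta^{*2}}, \tag{4.14}
\]

where

Cox proportional hazards model:

\begin{align}
\nu_0 &= \mathrm{Var}(Z) \int_0^{\tau} \frac{S_C(t)}{(1 + R_0(t))^{2}}\, dR_0(t) \tag{4.15} \\
      &= \mathrm{Var}(Z)\, \Pr\nolimits_{\theta_0 = 0}(\{\Delta : \Delta = 1\}) \tag{4.16}
\end{align}

Proportional odds model:

\[
\nu_0 = \mathrm{Var}(Z) \int_0^{\tau} \frac{S_C(t)}{(1 + R_0(t))^{4}}\, dR_0(t)
\]

Long-term effect model:

\[
\nu_0 = \mathrm{Var}(Z) \int_0^{\tau} \frac{S_C(t)\, R_0(t)^{2}}{(1 + R_0(t))^{4}}\, dR_0(t)
\]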

The above result is derived as a direct application of Result 4.4.3 with normal

approximation. The utilization of normal approximation in sample size calculation

has been discussed in both Chapter 2 and Chapter 3.
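For the Cox submodel, (4.14) with (4.16) only needs the variance of the covariate and the probability of observing an event under the null hypothesis. The sketch below is an illustration under stated assumptions: the function name `n_cox` is hypothetical, and rounding up to the next integer is our assumption (it agrees with the nCox values reported in the simulation tables of this chapter for Var(Z) = 0.25).

```python
import math
from statistics import NormalDist

def n_cox(theta_star, var_z, event_prob, alpha=0.05, power=0.9):
    """Sample size from (4.14) with nu_0 = Var(Z) * Pr(Delta = 1), see (4.16).

    theta_star : log hazard ratio under the alternative
    var_z      : variance of the covariate Z
    event_prob : probability of an observed event under theta_0 = 0
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1.0 - alpha / 2.0)   # z_{alpha/2}
    z_beta = std_normal.inv_cdf(power)                # z_{beta}
    nu0 = var_z * event_prob
    # Rounding up is an assumption made for this sketch.
    return math.ceil((z_alpha + z_beta) ** 2 / (nu0 * theta_star ** 2))

if __name__ == "__main__":
    # Two-sample problem with 1:1 allocation, so Var(Z) = 0.25.
    for censoring in (0.0, 0.1, 0.2, 0.3):
        print(censoring, n_cox(-0.4, 0.25, 1.0 - censoring))
```

Because θ∗ enters only through its square, the result is the same for θ∗ and −θ∗.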

Remark 4.4.6. The sample size formula for the Cox proportional hazards model in Result 4.4.5 is the same as that by Schoenfeld (1983) [42] and Hsieh and Lavori (2000) [21]. We have obtained the same formula using the score test statistic for the Cox proportional hazards model in Chapter 3 based on the contiguous alternatives. This is because $Q_n(0)$ and $U_n(0)$ are the same, as are $V_n(0)$ and $V^c_n(0)$, under the conditions (A1)–(A8). As a result, the test statistics $T^c_n$ and $T_n$ are equal for the Cox proportional hazards model when θ0 = 0.


Example 4.4.7 (Sample size based on the formula derived under the contiguous alternatives (4.13)). If α = 0.05, 1 − β = 0.9, θ0 = (0, 0)′, θ∗ = (−0.5, −0.7)′ and

\[
\nu_0 = \begin{pmatrix} 0.0832 & 0.0414 \\ 0.0414 & 0.0793 \end{pmatrix}
\]

(e.g. R0(t) = t, SC(t) = 1 − t/300.85 and Var(Z) = 0.25), then the sample size based on the formula derived under the contiguous alternatives is

\[
n = \frac{\eta}{\theta^{*\prime} \nu_0 \theta^{*}}
  = \frac{12.655}{(-0.5, -0.7) \begin{pmatrix} 0.0832 & 0.0414 \\ 0.0414 & 0.0793 \end{pmatrix} (-0.5, -0.7)'}
  = 143.
\]
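The numbers in Example 4.4.7 can be reproduced by numerical integration. The sketch below is only an illustration: it assumes R0(t) = t, SC(t) = 1 − t/300.85 on [0, 300.85] and Var(Z) = 0.25 as stated in the example, takes η = 12.655 from Table 4.1, and uses a composite Simpson rule (our choice of quadrature, with hypothetical helper names `simpson` and `design_matrix`).

```python
import math

def simpson(f, a, b, n=200_000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def design_matrix(var_z, tau, surv_c):
    """nu_0 for R_0(t) = t:
    Var(Z) * int_0^tau S_C(t) / (1 + t)^4 * [[1, t], [t, t^2]] dt."""
    def entry(k):
        return var_z * simpson(lambda t: surv_c(t) * t ** k / (1.0 + t) ** 4, 0.0, tau)
    n11, n12, n22 = entry(0), entry(1), entry(2)
    return [[n11, n12], [n12, n22]]

def sample_size(eta, theta_star, nu0):
    """n = eta / (theta*' nu_0 theta*), rounded up (rounding is an assumption)."""
    t1, t2 = theta_star
    quad = t1 * t1 * nu0[0][0] + 2.0 * t1 * t2 * nu0[0][1] + t2 * t2 * nu0[1][1]
    return math.ceil(eta / quad)

if __name__ == "__main__":
    tau = 300.85
    nu0 = design_matrix(0.25, tau, lambda t: 1.0 - t / tau)
    print([[round(v, 4) for v in row] for row in nu0])
    print(sample_size(12.655, (-0.5, -0.7), nu0))
```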

4.5 Simulation studies

4.5.1 Simulation studies to evaluate sample size formula derived under contiguous alternatives

Simulation studies are conducted in this section to evaluate the sample size formula

we derived based on the contiguous alternatives. Because the result is derived based

on large sample theory, it is necessary to investigate the validity of the result through

simulation studies. We will evaluate the empirical power and the type I error with

the sample size calculated by our formula, which should be close to the pre-specified

nominal power and type I error. We examine our method by the following steps.

First, sample size n is calculated using the formula (4.12) for given type I error

α = 0.05, desired power 1− β = 0.9, θ0 = (θ0,1, θ0,2)′, θ∗ = (θ∗1, θ∗2)′ and design effect

ν0. Next, 5000 replicated samples of size n are generated under the null hypothesis

H0 : θ = θ0 and the alternative hypothesis H1 : θ = θ∗. The empirical type I


error is obtained by comparing the test statistic $T_n^2$ under the null hypothesis to

the critical value. The empirical power is obtained by comparing the test statistic $T_n^2$

under the alternative hypothesis to the critical value. We consider the scenarios when

the odds of the baseline survival function R0(t) = t, the censoring random variable

follows a uniform distribution, i.e., C ∼ Uniform(0, τ0), where τ0 is determined such

that the censoring rate under the null hypothesis can be 0%, 10%, 20%, and 30%,

respectively. We consider the simple two-sample problem, where the covariate Z

is the treatment indicator with 1 : 1 allocation in the treatment and control arms,

i.e., Z ∼ Bernoulli(0.5). The validity of our method is examined by comparing the

empirical power to the nominal power, which should be close to each other.
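The data-generating step of this simulation design can be sketched as follows. This is a hedged illustration, not the code used for the reported results. It assumes that the conditional hazard is dR0(t) / (e^{−Zθ1} + e^{−Zθ2} R0(t)), which is how the factor (e^{−Zθ1} + e^{−Zθ2} R0(t)) enters the expressions above; integrating this hazard gives S(t | Z) = (1 + e^{Z(θ1 − θ2)} R0(t))^{−exp(Zθ2)}, which is inverted to draw failure times. The closed form ln(1 + τ0)/τ0 for the censoring rate under θ = (0, 0)′ follows from R0(t) = t and C ∼ Uniform(0, τ0). All function names are hypothetical, and the test statistic itself is not implemented.

```python
import math
import random

def inverse_survival(u, z, theta1, theta2):
    """Return t with S(t | z) = u when R_0(t) = t, where (assumed form)
    S(t | z) = (1 + exp(z (theta1 - theta2)) t) ** (-exp(z theta2))."""
    return math.exp(-(theta1 - theta2) * z) * (u ** (-math.exp(-theta2 * z)) - 1.0)

def censoring_rate_null(tau0):
    """P(C < T) under theta = (0, 0): (1 / tau0) * int_0^tau0 dt / (1 + t)."""
    return math.log1p(tau0) / tau0

def tau0_for_rate(rate, lo=1e-9, hi=1e7):
    """Bisection for tau0 giving a target censoring rate in (0, 1) under the null.
    A rate of exactly 0 would need tau0 = infinity and is not handled here."""
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if censoring_rate_null(mid) > rate:   # rate decreases as tau0 grows
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def simulate(n, theta1, theta2, tau0, seed=0):
    """Draw n triples (observed time, event indicator, z) with Z ~ Bernoulli(0.5)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        z = 1 if rng.random() < 0.5 else 0
        u = 1.0 - rng.random()                # uniform on (0, 1]
        t = inverse_survival(u, z, theta1, theta2)
        c = rng.uniform(0.0, tau0)
        data.append((min(t, c), int(t <= c), z))
    return data

if __name__ == "__main__":
    tau0 = tau0_for_rate(0.10)
    sample = simulate(5000, -0.5, -0.7, tau0, seed=1)
    print(round(tau0, 2), sum(d for _, d, _ in sample) / len(sample))
```

When θ1 = θ2 the assumed survival function reduces to (1 + R0(t))^{−exp(Zθ1)}, that is, to proportional hazards, which is a quick sanity check on the inversion.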

Table 4.2 gives the sample sizes calculated under different scenarios using the sample size formula (4.13) for θ0 = (0, 0)′ and small θ∗, together with the empirical powers. nCox(θ∗ = θ∗1) and nCox(θ∗ = θ∗2) are the sample sizes calculated by ignoring the change in the hazard ratio and assuming a constant hazard ratio under the alternative hypothesis. The sample size nCox(θ∗ = θ∗1) is calculated by assuming the true parameter under H1 is θ∗1, and nCox(θ∗ = θ∗2) is calculated by assuming the true parameter under H1 is θ∗2. These two naive approaches use the formula for the Cox proportional hazards model (4.15). The empirical powers of the test statistic $T_n^2$ are derived under different scenarios. Overall, we see empirical powers close to the nominal power for our method. The sample sizes calculated using our method are quite different from those using nCox(θ∗ = θ∗1) or nCox(θ∗ = θ∗2) in most settings. The first two sets of scenarios (−0.2,−0.4) and (−0.4,−0.2) are the cases when the treatment arm has more

favorable effect. The second two sets of scenarios (0.2, 0.4) and (0.4, 0.2) are the cases

when the treatment is not effective. The third two sets of scenarios (0.4,−0.4) and

(−0.4, 0.4) are the cases of crossing hazards when the treatment has different effects

during different stages of a disease. We can see that for each setting, the required

sample size increases with the increase of censoring rate under the null hypothesis.

The empirical powers of scenarios with more censoring tend to deviate from those with

no censoring. In addition, we notice that the number of subjects needed for crossing

hazards is relatively large compared to other settings.

Table 4.3 shows simulation results for the settings similar to those in Table 4.2,

but with larger treatment effects. The first two sets of scenarios (−0.5,−0.7)

and (−0.7,−0.5) are the cases when the treatment arm is more effective. The second

two sets of scenarios (0.5, 0.7) and (0.7, 0.5) are the cases when the treatment is

less favorable. The third two sets of scenarios (0.7,−0.7) and (−0.7, 0.7) are the

cases when the treatment has different effects during different stages of a disease.

We have similar observations as in the scenarios in Table 4.2. The required sample

size increases with the increase of censoring rate under the null hypothesis. The

empirical powers for the scenarios with more censoring deviate more from those with

no censoring. Our method results in empirical powers that are close to the nominal

power. The empirical powers for the sample sizes calculated using the formula for

the Cox model deviate substantially from the nominal power in most of the settings.

Table 4.4 gives the empirical powers for sample sizes calculated using the general

formula (4.12). The true parameter is θ0 = (−0.4,−0.4)′ under the null hypothesis.

Under the alternative hypothesis, we assume that the short-term hazard ratio does

not change, but the logarithm of the long-term hazard ratio changes in two directions

from −0.4. The empirical powers are close to the nominal power as θ∗2, the log of the

long-term hazard ratio under H1, gets close to −0.4. Also, the closer θ∗2 is to −0.4,

the more subjects are needed.


Table 4.2: Empirical power of calculated sample size for α = 0.05, 1 − β = 0.9 and θ0 = (0, 0)′.

(θ∗1, θ∗2)     (e^θ∗1, e^θ∗2)   Censoring      n                   nCox(θ∗ = θ∗1)      nCox(θ∗ = θ∗2)
log(HR)        HR               rate % (H0)    (empirical power)   (empirical power)   (empirical power)
(-0.2, -0.4)   (0.8, 0.7)        0             559 (0.9056)        1051 (0.9985)       263 (0.5832)
                                10             639 (0.9030)        1168 (0.9936)       292 (0.5658)
                                20             763 (0.8912)        1314 (0.9892)       329 (0.5376)
                                30             928 (0.8896)        1502 (0.9854)       376 (0.5028)
(-0.4, -0.2)   (0.7, 0.8)        0             548 (0.8834)        263 (0.5652)        1051 (0.9938)
                                10             575 (0.8872)        292 (0.5990)        1168 (0.9950)
                                20             618 (0.8788)        329 (0.6060)        1314 (0.9948)
                                30             672 (0.8784)        376 (0.6226)        1502 (0.9966)
(0.4, 0.2)     (1.5, 1.2)        0             548 (0.9052)        263 (0.6052)        1051 (0.9966)
                                10             575 (0.9124)        292 (0.6312)        1168 (0.9986)
                                20             618 (0.9170)        329 (0.6678)        1314 (0.9980)
                                30             672 (0.9168)        376 (0.6830)        1502 (0.9994)
(0.2, 0.4)     (1.2, 1.5)        0             559 (0.8774)        1051 (0.9918)       263 (0.5524)
                                10             639 (0.8904)        1168 (0.9910)       292 (0.5484)
                                20             763 (0.8994)        1314 (0.9896)       329 (0.5368)
                                30             928 (0.9052)        1502 (0.9874)       376 (0.5152)
(0.4, -0.4)    (1.5, 0.7)        0             993 (0.9138)        263 (0.3692)        263 (0.3692)
                                10             1179 (0.9050)       292 (0.3338)        292 (0.3338)
                                20             1415 (0.9002)       329 (0.3208)        329 (0.3208)
                                30             1652 (0.8888)       376 (0.3016)        376 (0.3016)
(-0.4, 0.4)    (0.7, 1.5)        0             993 (0.8832)        263 (0.3364)        263 (0.3364)
                                10             1179 (0.8928)       292 (0.3372)        292 (0.3372)
                                20             1415 (0.8982)       329 (0.3122)        329 (0.3122)
                                30             1652 (0.9184)       376 (0.3204)        376 (0.3204)

HR is Hazard Ratio.
n is calculated by the formula (4.13).
nCox is calculated by the formula (4.15).


Table 4.3: Empirical power of calculated sample size for α = 0.05, 1 − β = 0.9 and θ0 = (0, 0)′.

(θ∗1, θ∗2)     (e^θ∗1, e^θ∗2)   Censoring      n                   nCox(θ∗ = θ∗1)      nCox(θ∗ = θ∗2)
log(HR)        HR               rate % (H0)    (empirical power)   (empirical power)   (empirical power)
(-0.5, -0.7)   (0.6, 0.5)        0             143 (0.9000)        169 (0.9364)        86 (0.6970)
                                10             160 (0.8758)        187 (0.9246)        96 (0.6620)
                                20             185 (0.8720)        211 (0.9086)        108 (0.6292)
                                30             217 (0.8452)        241 (0.8916)        123 (0.6082)
(-0.7, -0.5)   (0.5, 0.6)        0             142 (0.8736)        86 (0.6730)         169 (0.9242)
                                10             151 (0.8600)        96 (0.6676)         187 (0.9274)
                                20             166 (0.8434)        108 (0.6608)        211 (0.9238)
                                30             185 (0.8490)        123 (0.6640)        241 (0.9216)
(0.7, 0.5)     (2.0, 1.7)        0             142 (0.9032)        86 (0.7006)         169 (0.9466)
                                10             151 (0.9102)        96 (0.7346)         187 (0.9596)
                                20             166 (0.9214)        108 (0.7746)        211 (0.9714)
                                30             185 (0.9296)        123 (0.7792)        241 (0.9744)
(0.5, 0.7)     (1.7, 2.0)        0             143 (0.8596)        169 (0.9196)        86 (0.6388)
                                10             160 (0.8938)        187 (0.9316)        96 (0.6634)
                                20             185 (0.9010)        211 (0.9400)        108 (0.6906)
                                30             217 (0.9210)        241 (0.9406)        123 (0.6830)
(0.7, -0.7)    (2.0, 0.5)        0             325 (0.9208)        86 (0.3688)         86 (0.3688)
                                10             385 (0.9036)        96 (0.3452)         96 (0.3452)
                                20             463 (0.8992)        108 (0.3070)        108 (0.3070)
                                30             540 (0.8808)        123 (0.2778)        123 (0.2778)
(-0.7, 0.7)    (0.5, 2.0)        0             325 (0.8638)        86 (0.3214)         86 (0.3214)
                                10             385 (0.8874)        96 (0.3068)         96 (0.3068)
                                20             463 (0.9018)        108 (0.3068)        108 (0.3068)
                                30             540 (0.9272)        123 (0.3318)        123 (0.3318)

HR is Hazard Ratio.
nCox is calculated by the formula (4.15).


Table 4.4: Empirical power of calculated sample size for α = 0.05, 1 − β = 0.9 and θ0 = (−0.4, −0.4)′.

(θ∗1, θ∗2)     (e^θ∗1, e^θ∗2)   Censoring rate %   n       Empirical
log(HR)        HR               under H0                   power
(-0.4, 0)      (0.67, 1.00)      0                 926     0.8500
                                10                 1198    0.8586
(-0.4, -0.1)   (0.67, 0.90)      0                 1646    0.8784
                                10                 2129    0.8628
(-0.4, -0.2)   (0.67, 0.82)      0                 3703    0.8774
                                10                 4789    0.8814
(-0.4, -0.6)   (0.67, 0.55)      0                 3703    0.9068
                                10                 4789    0.9234
(-0.4, -0.7)   (0.67, 0.50)      0                 1646    0.9160
                                10                 2129    0.9154
(-0.4, -0.8)   (0.67, 0.45)      0                 926     0.9298
                                10                 1198    0.9306

HR is Hazard Ratio.


4.5.2 Simulation studies to evaluate sample size formula derived under fixed alternative

In this section, we evaluate via simulation studies our proposed sample size approach under the fixed alternative. We analyze the results for a special case, the Cox proportional hazards model. The sample size is calculated based on the fixed alternative. The general procedure to investigate the validity of the sample size methods is the same as discussed in the simulation studies for sample size calculations under contiguous alternatives. The value of θ∗ ranges from 0.3 to 0.9 and from −0.9 to −0.3. Samples of size n are generated 5000 times under the null hypothesis H0 : θ = 0 to obtain the empirical type I error and under the alternative hypothesis H1 : θ = θ∗ to obtain the empirical power. The empirical power is obtained by comparing the test statistic Tn to the critical value under the null hypothesis. We adopt the same underlying distributions as in the previous section. We take R0(t) = t. The censoring random variable

follows a uniform distribution, i.e., C ∼ Uniform(0, τ0), where τ0 is determined such

that the censoring rate under the null hypothesis can be 0%, 10%, 20%, and 30%,

respectively. We consider the simple two-sample problem, where the covariate Z is

the treatment indicator with 1 : 1 allocation in the treatment and control arms, i.e.,

Z ∼ Bernoulli(0.5). We show the sample sizes calculated using the formula based on

the fixed alternative. For the fixed alternative, we consider φ that takes different val-

ues close to 1. This avoids calculating the limiting distribution of the score function

under the fixed alternative hypothesis. Various values of φ are explored to examine

the empirical powers.

Table 4.5, Table 4.6, Table 4.7 and Table 4.8 give the empirical powers for calcu-

lated sample sizes using the formulae based on the fixed alternative for censoring rate

of 0%, 10%, 20%, and 30%, respectively, under the null hypothesis H0 : θ = 0. We

compare the sample size calculated under the fixed alternative to that under the con-

tiguous alternatives. The sample size nCox uses the formula for the Cox proportional

hazards model under the contiguous alternatives (4.15). The sample size nfix(φ) uses

the formula under the fixed alternative for the Cox model with normal approxima-

tion for specified value of φ (3.6). In general, more subjects are needed with higher

censoring rate. When the effect size is smaller, the sample size formula based on the

contiguous alternatives would have a better performance. This is consistent with the

fact that the sample size derived in this way is restricted to small effect size. Also,


better empirical powers are observed when φ gets close to 1 with small effect size. A

smaller sample size is required in the study when the effect size is larger. We highlight

in bold the smallest sample size in each setting in which the empirical power is close

to the nominal power 1 − β. In practice, the sample size can be chosen in this way

through extensive simulation studies.

Table 4.5: Empirical power of calculated sample size for the Cox model. α = 0.05, 1 − β = 0.9 and θ0 = 0.

θ∗        e^θ∗   ncox                nfixed(φ = 0.7)     nfixed(φ = 0.8)     nfixed(φ = 0.9)     nfixed(φ = 1)
log(HR)   HR     (empirical power)   (empirical power)   (empirical power)   (empirical power)   (empirical power)
-0.3      0.74   467 (0.8816)        436 (0.8642)        458 (0.8828)        479 (0.9004)        499 (0.9082)
-0.4      0.67   263 (0.8756)        251 (0.8576)        264 (0.8892)        276 (0.9010)        288 (0.9098)
-0.5      0.61   169 (0.8758)        165 (0.8666)        174 (0.8892)        182 (0.9074)        190 (0.9244)
-0.6      0.55   117 (0.8754)        118 (0.8808)        124 (0.8944)        130 (0.9084)        136 (0.9180)
-0.7      0.50   86 (0.8674)         90 (0.8812)         95 (0.8994)         99 (0.9146)         103 (0.9236)
-0.8      0.45   66 (0.8562)         71 (0.8946)         75 (0.9060)         79 (0.9174)         82 (0.9274)
-0.9      0.41   52 (0.8484)         59 (0.9018)         62 (0.9054)         65 (0.9190)         68 (0.9338)
0.3       1.35   467 (0.8902)        427 (0.8646)        449 (0.8778)        469 (0.8920)        489 (0.9114)
0.4       1.49   263 (0.8954)        245 (0.8688)        257 (0.8718)        269 (0.8850)        281 (0.9120)
0.5       1.65   169 (0.8820)        161 (0.8800)        169 (0.8820)        177 (0.9006)        184 (0.915)
0.6       1.82   117 (0.8844)        115 (0.8774)        121 (0.8978)        126 (0.9106)        132 (0.9226)
0.7       2.01   86 (0.8748)         87 (0.8894)         91 (0.8852)         96 (0.9096)         100 (0.9242)
0.8       2.23   66 (0.8748)         69 (0.8864)         73 (0.9062)         76 (0.9228)         79 (0.9242)
0.9       2.46   52 (0.8686)         57 (0.8940)         60 (0.9150)         62 (0.9202)         65 (0.9344)

Censoring rate under H0 : θ = 0 is 0%.
ncox is sample size based on the contiguous alternatives.
nfixed(φ) is sample size based on the fixed alternative, assuming v∗θ∗(0) = φv0(0).
Sample size in bold is the smallest one such that the empirical power is reasonably close to 1 − β.


Table 4.6: Empirical power of calculated sample size for the Cox model. α = 0.05, 1 − β = 0.9 and θ0 = 0.

θ∗        e^θ∗   nc                  nfixed(φ = 0.7)     nfixed(φ = 0.8)     nfixed(φ = 0.9)     nfixed(φ = 1)
log(HR)   HR     (empirical power)   (empirical power)   (empirical power)   (empirical power)   (empirical power)
-0.3      0.74   519 (0.8866)        482 (0.8636)        507 (0.8832)        531 (0.8954)        553 (0.9066)
-0.4      0.67   292 (0.8840)        280 (0.8672)        294 (0.8832)        307 (0.8980)        321 (0.9122)
-0.5      0.61   187 (0.8780)        185 (0.8776)        194 (0.8898)        203 (0.9088)        212 (0.9118)
-0.6      0.55   130 (0.8716)        133 (0.8748)        140 (0.8866)        146 (0.8962)        153 (0.9190)
-0.7      0.50   96 (0.8640)         101 (0.878)         107 (0.8926)        112 (0.8998)        117 (0.9162)
-0.8      0.45   73 (0.8498)         81 (0.8868)         85 (0.8972)         89 (0.9096)         93 (0.9222)
-0.9      0.41   58 (0.8374)         66 (0.8788)         70 (0.8982)         73 (0.9126)         77 (0.9264)
0.3       1.35   519 (0.9060)        447 (0.8638)        469 (0.8706)        490 (0.8920)        511 (0.9070)
0.4       1.49   292 (0.9022)        253 (0.8608)        266 (0.8766)        278 (0.8948)        290 (0.9024)
0.5       1.65   187 (0.9036)        164 (0.8660)        172 (0.8750)        180 (0.8958)        188 (0.9084)
0.6       1.82   130 (0.9010)        116 (0.8738)        122 (0.8852)        128 (0.8930)        133 (0.9200)
0.7       2.01   96 (0.9002)         87 (0.8746)         92 (0.8928)         96 (0.9002)         100 (0.9116)
0.8       2.23   73 (0.8960)         69 (0.8822)         72 (0.8956)         76 (0.9142)         79 (0.9158)
0.9       2.46   58 (0.8948)         56 (0.8836)         59 (0.8974)         62 (0.9120)         64 (0.9168)

Censoring rate under H0 : θ = 0 is 10%.
nc is sample size based on the contiguous alternatives.
nfixed(φ) is sample size based on the fixed alternative, assuming v∗θ∗(0) = φv0(0).
Sample size in bold is the smallest one such that the empirical power is reasonably close to 1 − β.


Table 4.7: Empirical power of calculated sample size for the Cox model. α = 0.05, 1 − β = 0.9 and θ0 = 0.

θ∗        e^θ∗   nc                  nfixed(φ = 0.7)     nfixed(φ = 0.8)     nfixed(φ = 0.9)     nfixed(φ = 1)
log(HR)   HR     (empirical power)   (empirical power)   (empirical power)   (empirical power)   (empirical power)
-0.3      0.74   584 (0.8904)        556 (0.8676)        584 (0.8904)        611 (0.9204)        637 (0.9058)
-0.4      0.67   329 (0.8750)        323 (0.8622)        340 (0.8912)        356 (0.8990)        371 (0.9084)
-0.5      0.61   211 (0.8626)        215 (0.8736)        226 (0.8888)        236 (0.8984)        247 (0.9162)
-0.6      0.55   146 (0.8492)        155 (0.8708)        163 (0.8866)        171 (0.9084)        178 (0.9196)
-0.7      0.50   108 (0.8436)        119 (0.8798)        125 (0.8954)        131 (0.9070)        137 (0.9202)
-0.8      0.45   83 (0.8402)         95 (0.8834)         100 (0.9064)        105 (0.9136)        109 (0.9214)
-0.9      0.41   65 (0.8168)         78 (0.8836)         82 (0.9018)         87 (0.9196)         90 (0.9252)
0.3       1.35   584 (0.9104)        487 (0.8524)        511 (0.8760)        534 (0.8858)        556 (0.9012)
0.4       1.49   329 (0.9154)        273 (0.8524)        286 (0.8764)        299 (0.8894)        311 (0.8924)
0.5       1.65   211 (0.9156)        174 (0.8584)        183 (0.8688)        191 (0.8968)        199 (0.9040)
0.6       1.82   146 (0.9122)        122 (0.8560)        128 (0.8706)        134 (0.8968)        139 (0.9086)
0.7       2.01   108 (0.9166)        91 (0.8590)         95 (0.8828)         99 (0.8896)         104 (0.9086)
0.8       2.23   83 (0.9128)         71 (0.8656)         74 (0.8784)         77 (0.8990)         81 (0.9106)
0.9       2.46   65 (0.9154)         57 (0.8680)         60 (0.8938)         63 (0.9022)         65 (0.9154)

Censoring rate under H0 : θ = 0 is 20%.
nc is sample size based on the contiguous alternatives.
nfixed(φ) is sample size based on the fixed alternative, assuming v∗θ∗(0) = φv0(0).
Sample size in bold is the smallest one such that the empirical power is reasonably close to 1 − β.


Table 4.8: Empirical power of calculated sample size for the Cox model. α = 0.05, 1 − β = 0.9 and θ0 = 0.

θ∗        e^θ∗   nc                  nfixed(φ = 0.7)     nfixed(φ = 0.8)     nfixed(φ = 0.9)     nfixed(φ = 1)
log(HR)   HR     (empirical power)   (empirical power)   (empirical power)   (empirical power)   (empirical power)
-0.3      0.74   668 (0.8796)        647 (0.8748)        680 (0.8822)        711 (0.8986)        742 (0.9094)
-0.4      0.67   376 (0.8652)        378 (0.8768)        398 (0.8900)        416 (0.9042)        434 (0.9172)
-0.5      0.61   241 (0.8588)        252 (0.8638)        265 (0.8910)        278 (0.9052)        290 (0.9212)
-0.6      0.55   167 (0.8468)        183 (0.8752)        193 (0.8966)        202 (0.9126)        211 (0.9158)
-0.7      0.50   123 (0.8364)        141 (0.8686)        148 (0.8976)        155 (0.9058)        162 (0.9270)
-0.8      0.45   94 (0.8222)         113 (0.8856)        119 (0.9046)        124 (0.9086)        130 (0.9276)
-0.9      0.41   75 (0.8054)         93 (0.8874)         98 (0.9094)         103 (0.9186)        108 (0.9290)
0.3       1.35   668 (0.9122)        542 (0.8452)        569 (0.8674)        594 (0.8756)        619 (0.8966)
0.4       1.49   376 (0.9288)        300 (0.8472)        315 (0.8676)        329 (0.887)         342 (0.8906)
0.5       1.65   241 (0.9232)        190 (0.8436)        199 (0.8766)        208 (0.8876)        217 (0.8960)
0.6       1.82   167 (0.9180)        131 (0.8416)        137 (0.8662)        144 (0.8880)        149 (0.8952)
0.7       2.01   123 (0.9220)        96 (0.8496)         101 (0.8722)        105 (0.8862)        110 (0.9028)
0.8       2.23   94 (0.9248)         74 (0.8504)         78 (0.8722)         81 (0.8862)         84 (0.9006)
0.9       2.46   75 (0.9350)         59 (0.8538)         62 (0.8790)         65 (0.8916)         67 (0.8956)

Censoring rate under H0 : θ = 0 is 30%.
nc is sample size based on the contiguous alternatives.
nfixed(φ) is sample size based on the fixed alternative, assuming v∗θ∗(0) = φv0(0).
Sample size in bold is the smallest one such that the empirical power is reasonably close to 1 − β.


Chapter 5

Accrual and follow-up times in

sample size calculation

5.1 Accrual and follow-up in sample size calculation

The sample size formula we derived is a function of the design effect, which con-

tains the information of the censoring distribution and the baseline survival functions.

Now we want to incorporate the accrual and follow-up times in sample size calcula-

tions. This can be achieved by linking the censoring variable to the accrual and

follow-up times. We will show by examples that different plans in accrual and follow-

up times do affect the sample size calculation. The following assumptions are made

for the accrual and follow-up times:

1. Patient enters the study at a random time A with pdf fA(·) defined on (0, a].

2. The study length is τ = a+ f , where f is the follow-up time.

3. The censoring variable C = τ − A.

4. There are no competing risks. The only censoring is caused by administrative censoring.

The censoring distribution is known if the distribution of the accrual random variable

A is known. There are many accrual schemes that can be considered in practice.

In this work, we discuss three types of accrual patterns: constant, increasing, and

decreasing accruals. There are quite a few distributions we may use for each type

of accrual. We consider some simple distributions discussed by Maki (2006) [31] and

Wang et al. (2012) [49]. The probability density functions of the accrual variable

and corresponding censoring variable are listed in Table 5.1. These are simple cases

of uniform (constant rate), increasing, and decreasing accrual rates. More flexible

accrual distributions can be found in Maki (2006) [31].

Table 5.1: Different accrual distributions as in Maki (2006) and Wang et al. (2012).

Accrual rate   fA(·)
Uniform        1/a
Increasing     2t/a^2
Decreasing     2(a − t)/a^2

fA(·) is defined on (0, τa].

The corresponding survival function of the censoring variable SC(t) can be derived

if the pdf of the accrual random variable is known.
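As a worked illustration (our derivation from assumptions 1 to 4 above, with $F_A$ denoting the distribution function of A and every density taken on (0, a]), the relation C = τ − A gives

\[
S_C(t) = \Pr(\tau - A > t) = F_A(\tau - t) =
\begin{cases}
1, & 0 \le t \le f, \\
F_A(\tau - t), & f < t \le \tau,
\end{cases}
\]

so that, for f < t ≤ τ,

\[
S_C(t) =
\begin{cases}
\dfrac{\tau - t}{a}, & \text{uniform accrual, } f_A(s) = 1/a, \\[2ex]
\dfrac{(\tau - t)^2}{a^2}, & \text{increasing accrual, } f_A(s) = 2s/a^2, \\[2ex]
1 - \dfrac{(t - f)^2}{a^2}, & \text{decreasing accrual, } f_A(s) = 2(a - s)/a^2.
\end{cases}
\]

The last line uses $F_A(x) = 1 - (a - x)^2/a^2$ together with $a - (\tau - t) = t - f$.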

5.2 Example

The sample size formula (4.13) based on the contiguous alternatives can be modi-

fied to incorporate the accrual and follow-up information. Recall that the sample size


formula is

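\[
n = \frac{\eta}{\theta^{*\prime} \nu_0 \theta^{*}}, \tag{5.1}
\]

where

\[
\nu_0 = \mathrm{Var}(Z) \int_0^{\tau} \frac{S_C(t)}{(1 + R_0(t))^{4}}
\begin{pmatrix} 1 & R_0(t) \\ R_0(t) & R_0(t)^2 \end{pmatrix} dR_0(t).
\]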

The censoring distribution SC(t) in the expression of ν0 can be replaced by a distribu-

tion function based on a specific accrual pattern. For example, if we consider uniform

accrual on the time interval (0, a], then the design effect ν0 has the following form

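\[
\nu_0 = \mathrm{Var}(Z) \left\{ \int_0^{f} \frac{S_C(t)}{(1 + R_0(t))^{4}}
\begin{pmatrix} 1 & R_0(t) \\ R_0(t) & R_0(t)^2 \end{pmatrix} dR_0(t)
+ \int_f^{\tau} \frac{\tau - t}{a} \cdot \frac{S_C(t)}{(1 + R_0(t))^{4}}
\begin{pmatrix} 1 & R_0(t) \\ R_0(t) & R_0(t)^2 \end{pmatrix} dR_0(t) \right\}. \tag{5.2}
\]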

The sample size can be calculated for the uniform accrual rate with the above ν0.
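A minimal numerical sketch of this calculation follows. It rests on several assumptions that should be read as ours: the censoring survival function is taken to be determined entirely by the accrual distribution through C = τ − A (so it equals 1 on [0, f] and F_A(τ − t) on (f, τ]), R0(t) = t, Var(Z) = 0.25, θ∗ = (−0.5, −0.7)′ and η = 12.655 as in the scenario of this section, and a composite Simpson rule is used for the integral. The quadratic form θ∗′ν0θ∗ is written as the single integral of SC(t)(θ∗1 + θ∗2 R0(t))² / (1 + R0(t))⁴. The function names are hypothetical.

```python
def simpson(f, a, b, n=20_000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def surv_c(t, a, f, pattern):
    """S_C(t) = P(tau - A > t) for accrual on (0, a] and follow-up f."""
    tau = a + f
    if t <= f:
        return 1.0
    if t >= tau:
        return 0.0
    x = (tau - t) / a                      # fraction of the accrual period
    if pattern == "uniform":
        return x
    if pattern == "increasing":
        return x * x
    if pattern == "decreasing":
        return 1.0 - (1.0 - x) ** 2
    raise ValueError(f"unknown accrual pattern: {pattern}")

def sample_size(a, f, pattern, theta_star=(-0.5, -0.7), var_z=0.25, eta=12.655):
    """Unrounded n = eta / (theta*' nu_0 theta*) with R_0(t) = t."""
    t1, t2 = theta_star

    def integrand(t):
        return surv_c(t, a, f, pattern) * (t1 + t2 * t) ** 2 / (1.0 + t) ** 4

    # Integrate the two pieces separately because S_C has a kink at t = f.
    quad = var_z * (simpson(integrand, 0.0, f) + simpson(integrand, f, a + f))
    return eta / quad

if __name__ == "__main__":
    for pattern in ("uniform", "increasing", "decreasing"):
        print(pattern, round(sample_size(3.0, 20.0, pattern), 1))
```

Under these assumptions the decreasing accrual pattern gives the largest SC(t) on (f, τ] and hence the smallest n, and the increasing pattern gives the largest n.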

We illustrate the impact of the accrual and follow-up times on sample size cal-

culation with pre-specified uniform, increasing, and decreasing accruals. The pdfs

of accrual random variable A for different accrual rates are shown in Table 5.1. We

consider the uniform accrual (fA(t) = 1/a), increasing accrual (fA(t) = 2t/a2), and

decreasing accrual (fA(t) = 2(a − t)/a2). The sample size is calculated under the

following scenario: α = 0.05, 1 − β = 0.9, θ0 = (0, 0)′, θ∗ = (−0.5,−0.7)′, R0(t) = t,

V ar(Z) = 0.25 and SC(t) calculated based on different accrual rates. Figure 5.1 gives

the three-dimensional plots of sample size with pre-specified accrual and follow-up

patterns. From left to right are the plots of uniform, increasing, and decreasing ac-

cruals. Short accrual and follow-up times both contribute to the large required sample

sizes. However, the surfaces in the three-dimensional plots are relatively flat for all

three accrual rates when the accrual and follow-up times are larger. Although the

sample size curves are similar in shape for all three cases, increasing accrual requires


the largest number of individuals when both accrual and follow-up times are short.

In comparison, the required sample size is the smallest with decreasing accrual rate.

Figure 5.2 is the sample size plot for different lengths of follow-up when the accrual length is 3 months. Figure 5.2 shows that there is a sharp drop in sample size at some time before month 5 of follow-up. Thus, researchers may consider having at least 5 months of follow-up under such circumstances. The accrual pattern also has an impact on the sample size calculation under this scenario; decreasing accrual would be better, as increasing accrual requires more subjects. Figure 5.3 is the sample size plot for different lengths of accrual when the follow-up length is 20 months. From Fig. 5.3, we can see that the decreasing accrual is still the best. The length of the accrual period affects the sample size, but only by a single-digit number of subjects. Therefore we conclude that, under such a scenario, the length of follow-up has a much more important impact on the sample size calculation. Figures 5.2 and 5.3 indicate that decreasing accrual is the best in this case, and increasing accrual is the worst, which is consistent with the finding from the three-dimensional plots.

Figure 5.1: Sample size for accrual and follow-ups up to 30 months.

[Three three-dimensional surface plots, from left to right: Uniform accrual, Increasing accrual, Decreasing accrual. Axes: Accrual, Follow-up, Sample size.]


Figure 5.2: Sample size for 3 months accrual (a = 3)

[Line plot of sample size n against follow-up length f, with curves for uniform accrual, increasing accrual, and decreasing accrual.]

Figure 5.3: Sample size for 20 months follow-up (f = 20)

[Line plot of sample size n against accrual length a, with curves for uniform accrual, increasing accrual, and decreasing accrual.]


Chapter 6

Discussion

6.1 Concluding remarks

We have presented the sample size calculation procedure for the Cox proportional hazards model in Chapter 3 and have developed the sample size calculation procedure based on a semiparametric analysis of short-term and long-term hazard ratios in Chapter 4. The methods for sample size calculation are developed under both fixed and contiguous alternatives. The general formulae as well as some special cases are discussed. In the development, three types of submodels are of specific interest: the Cox proportional hazards model, the proportional odds model, and the long-term effect model. We have shown the formulae for sample size calculation for these three types of models as special cases. We note that, under the null hypothesis of no effect of interest, θ0 = 0, and conditions (A1)–(A8) and (C1), the sample size formula derived under the contiguous alternatives for the Cox model in Chapter 4 is the same as that derived by Schoenfeld (1983) and Hsieh and Lavori (2000).

Sample size calculation is complex under the fixed alternative hypothesis. With the contiguous alternatives, the formula for sample size calculation can be simplified. However, the sample size calculation based on the fixed alternative hypothesis is more desirable when the effect size is large. We have obtained the general formulae under the fixed alternative, which are more complex to compute than the formulae under the contiguous alternatives. The difficulty under the fixed alternative is that the limiting variance of the score function evaluated under the alternative is usually hard to obtain. One way to overcome the difficulty is to treat the limiting variance under the fixed alternative as a linear function of that under the contiguous alternatives. In particular, when the effect size is small, the two limiting variances should be approximately equal. Thus, the computation of the sample size based on the fixed alternative can be greatly simplified. In the simulation studies, we evaluated the general sample size formula based on the contiguous alternatives and the method for the three types of submodels based on the fixed alternative. The validity of our method is confirmed by comparing the empirical power to the nominal power for different scenarios.

The planning of accrual and follow-up schemes is very important in clinical trials. We incorporate accrual and follow-up times into the sample size calculation in Chapter 5. The random time at which an individual enters the study is linked to the censoring time. We have assumed that there are no competing risks and that administrative censoring is the only cause of censoring. Therefore, the censoring distribution is known as long as the accrual distribution is known. Different accrual schemes are discussed in Maki (2006) [31]. We have used the sample size formula derived under the contiguous alternatives to show how the accrual and follow-up times can be taken into account in sample size calculation. This is done by replacing the censoring distribution function in the formula with that derived from the accrual scheme. As an example, the sample size is calculated using our formula with different accrual schemes. In addition, we have shown that the lengths of accrual and follow-up affect the sample size. As a result, investigators should carefully consider the accrual and follow-up schemes in planning a well-designed and cost-efficient clinical trial.
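The link between the accrual scheme and the censoring distribution can be illustrated numerically. The following is a minimal sketch, not the computation used in Chapter 5: it assumes exponential event times, an accrual period of length a followed by f further months of follow-up, and three hypothetical entry-time densities on [0, a] (uniform, linearly increasing, linearly decreasing) that need not coincide with the schemes in Maki (2006) [31]. All function names and parameter values are illustrative.

```python
import math

def event_probability(entry_density, a, f, rate, steps=10_000):
    """P(event observed) when a subject entering at time e is followed
    for a + f - e months (administrative censoring only) and the event
    time is exponential with the given rate. Midpoint-rule integration."""
    h = a / steps
    total = 0.0
    for k in range(steps):
        e = (k + 0.5) * h                  # entry time
        follow = a + f - e                 # administrative censoring time
        total += entry_density(e, a) * (1.0 - math.exp(-rate * follow)) * h
    return total

# Hypothetical entry-time densities on [0, a]; each integrates to 1.
def uniform(e, a):
    return 1.0 / a

def increasing(e, a):
    return 2.0 * e / a**2

def decreasing(e, a):
    return 2.0 * (a - e) / a**2

a, f, rate = 10.0, 5.0, 0.05               # illustrative values
probs = {
    "uniform": event_probability(uniform, a, f, rate),
    "increasing": event_probability(increasing, a, f, rate),
    "decreasing": event_probability(decreasing, a, f, rate),
}
for name, p in probs.items():
    print(f"{name:>10}: event probability = {p:.4f}")
```

Under these assumptions the decreasing scheme gives the largest event probability, because more subjects enter early and are followed longer; for a fixed required number of events this corresponds to the smallest number of subjects.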


6.2 Future work

In sample size calculation, the hypotheses of interest and the test statistic are the two key components to be considered. If different hypotheses of interest are considered or a different test statistic is chosen, the sample size formula could be different. The fixed alternative hypothesis and the contiguous alternative hypotheses are often considered in sample size calculation. The test statistic can be model-based if a non-binary effect is of interest along with other covariate adjustment. In such a case, an appropriate model should be used for the underlying population, and the test statistics from the model can be used in sample size calculation. For different types of failure-time data, different models should be considered. For example, the cure rate proportional hazards model can be used when there is a group of cured individuals.

We have developed sample size calculation procedures based on the pseudo score test statistic by Yang and Prentice (2005) [51] for survival data with non-proportional hazard functions. Methods based on other survival models can also be considered for sample size determination with non-proportional hazard functions. Examples include the proportional odds model and the linear transformation model. The linear transformation model, like the semiparametric model by Yang and Prentice (2005) [51], includes the Cox proportional hazards model and the proportional odds model as special cases. As mentioned earlier, one of the two key components in sample size determination is the test statistic. If a model-based test statistic is considered, the derived formulae can differ when the test statistics are based on different estimation methods for the parameter(s). In this dissertation, we only considered the pseudo score test statistic for the semiparametric model of short-term and long-term hazard ratios (Yang and Prentice 2005 [51]); test statistics based on other types of estimation procedures can be considered as well. In addition, methods can be developed by constructing the test statistic for the proportional odds model using different parameter estimation procedures (e.g., Bennett 1983 [2]; Pettitt 1984 [39]; Dabrowska and Doksum 1988 [10]; Murphy et al. 1997 [33]). Different parameter estimation approaches for the linear transformation model (e.g., Cheng et al. 1995 [7]; Cheng et al. 1997 [8]; Fine et al. 1998 [13]; Cai et al. 2000 [4]; Chen et al. 2002 [6]; Zeng and Lin 2007 [52]) can also result in different test statistics in sample size calculations. All these will be investigated in our future research.


Chapter 7

Proofs

In this chapter, we present the proofs of the theorems and corollaries in this

dissertation. The first part is on the proofs of the theorems and corollaries in Chapter

3. The second part is on the proofs of the theorems in Chapter 4.

7.1 Proofs of the theorems and corollaries in Chapter 3

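Assume that the true parameter is $\theta \in \Theta$. For any $\theta_c \in \Theta$, let
\[
\bar{Z}_n(t,\theta_c) = \frac{\sum_{j=1}^n Z_j Y_j(t)\exp\{Z_j\theta_c\}}{\sum_{j=1}^n Y_j(t)\exp\{Z_j\theta_c\}}.
\]
Under regularity conditions (A1)–(A8), it is easy to see that $\lim_{n\to\infty}\bar{Z}_n(t,\theta_c)$ exists by the law of large numbers, and
\[
\lim_{n\to\infty} E_\theta\left[\bar{Z}_n(t,\theta_c)\right] = \frac{E_\theta\left[ZY(t)\exp\{Z\theta_c\}\right]}{E_\theta\left[Y(t)\exp\{Z\theta_c\}\right]}.
\]
By the uniformly bounded convergence theorem,
\[
\sup_{t\in(0,\tau],\,\theta_c\in\Theta} \left|\bar{Z}_n(t,\theta_c) - \bar{Z}_{0,\theta}(t,\theta_c)\right| \to 0 \quad \text{as } n\to\infty,
\]
where
\[
\bar{Z}_{0,\theta}(t,\theta_c) = \frac{E_\theta\left[ZY(t)\exp\{Z\theta_c\}\right]}{E_\theta\left[Y(t)\exp\{Z\theta_c\}\right]}.
\]
Also let
\[
\tilde{Z}_{0,\theta}(t,\theta_c) = \frac{E_\theta\left[Y(t)\exp\{Z\theta\}\right]}{E_\theta\left[Y(t)\exp\{Z\theta_c\}\right]}.
\]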
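The weighted covariate average defined above, and the score statistic it enters in the proofs that follow, can be computed directly for a small data set. The sketch below is illustrative only: it assumes a scalar covariate, no tied event times, and the at-risk indicator Y_j(t) = 1 when the observed time X_j is at least t; the function names and the toy data are not from the dissertation.

```python
import math

def weighted_covariate_mean(t, times, z, theta):
    """Risk-set average of Z at time t, weighted by exp(Z * theta)."""
    num = 0.0
    den = 0.0
    for x, zj in zip(times, z):
        if x >= t:                       # subject j is still at risk at t
            w = math.exp(zj * theta)
            num += zj * w
            den += w
    return num / den

def cox_score(times, events, z, theta):
    """U_n(theta): sum over observed events of Z_i minus the
    weighted risk-set average evaluated at the event time X_i."""
    return sum(
        zi - weighted_covariate_mean(x, times, z, theta)
        for x, d, zi in zip(times, events, z)
        if d == 1
    )

# Toy data: observed times, event indicators, binary covariate.
times = [2.0, 3.0, 5.0, 7.0]
events = [1, 1, 0, 1]
z = [1, 0, 1, 0]
score_at_null = cox_score(times, events, z, theta=0.0)
print(score_at_null)
```

At theta = 0 the weights are all equal, so the weighted average reduces to the proportion of at-risk subjects with Z = 1, and the score is the familiar observed-minus-expected sum of the logrank statistic.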

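7.1.1 Proof of Theorem 3.3.1

Proof. (i) Under $H_0: \theta = \theta_0$,
\[
\begin{aligned}
\frac{1}{n}U_n(\theta_0)
&= \frac{1}{n}\sum_i \int_0^\tau \left(Z_i - \bar{Z}_n(t,\theta_0)\right) dN_i(t) \\
&= \frac{1}{n}\sum_i \int_0^\tau \left(Z_i - \bar{Z}_n(t,\theta_0)\right) dM^c_{i,\theta_0}(t)
 + \frac{1}{n}\sum_i \int_0^\tau \left(Z_i - \bar{Z}_n(t,\theta_0)\right) Y_i(t)\exp\{Z_i\theta_0\}\lambda_0(t)\,dt \\
&= \frac{1}{n}\sum_i \int_0^\tau \left(Z_i - \bar{Z}_n(t,\theta_0)\right) dM^c_{i,\theta_0}(t),
\end{aligned}
\]
where $M^c_{i,\theta_0}(t) = N_i(t) - \int_0^t Y_i(s)\exp\{Z_i\theta_0\}\lambda_0(s)\,ds$ is the martingale associated with the $i$th individual. The second term vanishes because $\sum_i \left(Z_i - \bar{Z}_n(t,\theta_0)\right) Y_i(t)\exp\{Z_i\theta_0\} = 0$ by the definition of $\bar{Z}_n(t,\theta_0)$.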

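It follows from the martingale central limit theorem that
\[
\sqrt{n}\left(\frac{1}{n}U_n(\theta_0)\right) \xrightarrow{d} N\left(0, v^c_{\theta_0}(\theta_0)\right) \quad \text{as } n\to\infty,
\]
where
\[
\begin{aligned}
v^c_{\theta_0}(\theta_0)
&= \lim_{n\to\infty} \frac{1}{n}\sum_i \int_0^\tau \left(Z_i - \bar{Z}_n(t,\theta_0)\right)^2 Y_i(t)\exp\{Z_i\theta_0\}\lambda_0(t)\,dt \\
&= \lim_{n\to\infty} E_{\theta_0}\left[\frac{1}{n}V^c_n(\theta_0)\right],
\end{aligned}
\]
which can be expressed as
\[
\begin{aligned}
v^c_{\theta_0}(\theta_0)
&= \int_0^\tau \left\{ \frac{E_{\theta_0}\left[Z^2Y(t)\exp\{Z\theta_0\}\right]}{E_{\theta_0}\left[Y(t)\exp\{Z\theta_0\}\right]} - \left(\frac{E_{\theta_0}\left[ZY(t)\exp\{Z\theta_0\}\right]}{E_{\theta_0}\left[Y(t)\exp\{Z\theta_0\}\right]}\right)^2 \right\} E_{\theta_0}\left[Y(t)\exp\{Z\theta_0\}\right]\lambda_0(t)\,dt \\
&= \int_0^\tau \left\{ E_{\theta_0}\left[Z^2Y(t)\exp\{Z\theta_0\}\right] - \frac{\left(E_{\theta_0}\left[ZY(t)\exp\{Z\theta_0\}\right]\right)^2}{E_{\theta_0}\left[Y(t)\exp\{Z\theta_0\}\right]} \right\} \lambda_0(t)\,dt \\
&= \int_0^\tau \left\{ E_{\theta_0}\left[Z^2\exp\{Z\theta_0\}S_{T|Z}(t-)S_{C|Z}(t-)\right] - \frac{\left(E_{\theta_0}\left[Z\exp\{Z\theta_0\}S_{T|Z}(t-)S_{C|Z}(t-)\right]\right)^2}{E_{\theta_0}\left[\exp\{Z\theta_0\}S_{T|Z}(t-)S_{C|Z}(t-)\right]} \right\} \lambda_0(t)\,dt.
\end{aligned}
\]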

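If $\theta_0 = 0$ and condition (C1) is satisfied, $v^c_{\theta_0}(\theta_0)$ can be simplified as
\[
v^c_{\theta_0}(\theta_0) = v^c_0(0) = \mathrm{Var}(Z)\int_0^\tau S_0(t)S_C(t)\lambda_0(t)\,dt = \mathrm{Var}(Z)\,\Pr\nolimits_{\theta=0}(\{\Delta : \Delta = 1\}), \tag{7.1}
\]
where $\Pr_{\theta=0}(\{\Delta : \Delta = 1\})$ is the event probability under the null hypothesis of $\theta = 0$.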
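The identity behind (7.1), that the integral of S0(t) SC(t) λ0(t) over (0, τ] equals the probability of observing an event, can be checked numerically in a simple case. The sketch below assumes exponential event and censoring times that are independent of each other and of Z, and a Bernoulli(p) covariate; the rates, τ, p and function names are illustrative assumptions.

```python
import math
import random

def event_prob_integral(lam, mu, tau, steps=100_000):
    """Midpoint-rule value of the integral of S0(t) * SC(t) * lambda0(t)
    over (0, tau], with S0(t) = exp(-lam t) and SC(t) = exp(-mu t)."""
    h = tau / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        total += math.exp(-lam * t) * math.exp(-mu * t) * lam * h
    return total

def event_prob_simulated(lam, mu, tau, n=200_000, seed=1):
    """Monte Carlo estimate of P(T <= C and T <= tau)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        t = rng.expovariate(lam)
        c = rng.expovariate(mu)
        if t <= c and t <= tau:
            hits += 1
    return hits / n

lam, mu, tau, p = 0.1, 0.05, 30.0, 0.5     # illustrative values
integral = event_prob_integral(lam, mu, tau)
simulated = event_prob_simulated(lam, mu, tau)
v0 = p * (1 - p) * integral                # Var(Z) times the event probability
print(integral, simulated, v0)
```

In this setting the integral has the closed form lam / (lam + mu) * (1 - exp(-(lam + mu) * tau)), which both the numerical integral and the simulated event fraction should approach.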

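(ii) Under $H_1: \theta = \theta^*$,
\[
\begin{aligned}
\frac{1}{n}U_n(\theta_0)
&= \frac{1}{n}\sum_i \int_0^\tau \left(Z_i - \bar{Z}_n(t,\theta_0)\right) dN_i(t) \\
&= \frac{1}{n}\sum_i \int_0^\tau \left(Z_i - \bar{Z}_{0,\theta^*}(t,\theta_0)\right) dN_i(t)
 - \frac{1}{n}\sum_i \int_0^\tau \left(\bar{Z}_n(t,\theta_0) - \bar{Z}_{0,\theta^*}(t,\theta_0)\right) dN_i(t) \\
&= \frac{1}{n}\sum_i \int_0^\tau \left(Z_i - \bar{Z}_{0,\theta^*}(t,\theta_0)\right) dN_i(t)
 - \int_0^\tau \left(\bar{Z}_n(t,\theta_0) - \bar{Z}_{0,\theta^*}(t,\theta_0)\right) \frac{1}{n}\sum_i dN_i(t) \\
&= \frac{1}{n}\sum_i \int_0^\tau \left(Z_i - \bar{Z}_{0,\theta^*}(t,\theta_0)\right) dN_i(t)
 - \int_0^\tau \left(\bar{Z}_n(t,\theta_0) - \bar{Z}_{0,\theta^*}(t,\theta_0)\right) E_{\theta^*}\left[Y(t)\exp\{Z\theta^*\}\right]\lambda_0(t)\,dt + o_P(n^{-1/2}),
\end{aligned}
\]
which can be further derived as
\[
\begin{aligned}
\frac{1}{n}U_n(\theta_0)
={}& \frac{1}{n}\sum_i \int_0^\tau \left(Z_i - \bar{Z}_{0,\theta^*}(t,\theta_0)\right) dN_i(t) \\
&- \int_0^\tau \left( \frac{1}{n}\sum_i Z_iY_i(t)\exp\{Z_i\theta_0\} - E_{\theta^*}\left[ZY(t)\exp\{Z\theta_0\}\right] \right) \tilde{Z}_{0,\theta^*}(t,\theta_0)\lambda_0(t)\,dt \\
&+ \int_0^\tau \left( \frac{1}{n}\sum_i Y_i(t)\exp\{Z_i\theta_0\} - E_{\theta^*}\left[Y(t)\exp\{Z\theta_0\}\right] \right) \bar{Z}_{0,\theta^*}(t,\theta_0)\tilde{Z}_{0,\theta^*}(t,\theta_0)\lambda_0(t)\,dt \\
&+ o_P(n^{-1/2}).
\end{aligned}
\]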

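It follows that
\[
\begin{aligned}
\frac{1}{n}U_n(\theta_0) - e^{c*}_{\theta^*}(\theta_0)
={}& \frac{1}{n}\sum_i \int_0^\tau \left(Z_i - \bar{Z}_{0,\theta^*}(t,\theta_0)\right) dN_i(t) - e^{c*}_{\theta^*}(\theta_0) \\
&- \int_0^\tau \left( \frac{1}{n}\sum_i Z_iY_i(t)\exp\{Z_i\theta_0\} - E_{\theta^*}\left[ZY(t)\exp\{Z\theta_0\}\right] \right) \tilde{Z}_{0,\theta^*}(t,\theta_0)\lambda_0(t)\,dt \\
&+ \int_0^\tau \left( \frac{1}{n}\sum_i Y_i(t)\exp\{Z_i\theta_0\} - E_{\theta^*}\left[Y(t)\exp\{Z\theta_0\}\right] \right) \bar{Z}_{0,\theta^*}(t,\theta_0)\tilde{Z}_{0,\theta^*}(t,\theta_0)\lambda_0(t)\,dt \\
&+ o_P(n^{-1/2}),
\end{aligned}
\tag{7.2}
\]
where
\[
e^{c*}_{\theta^*}(\theta_0) = \lim_{n\to\infty} E_{\theta^*}\left[\frac{1}{n}U_n(\theta_0)\right]
= \lim_{n\to\infty} E_{\theta^*}\left[\frac{1}{n}\sum_i \int_0^\tau \left(Z_i - \bar{Z}_{0,\theta^*}(t,\theta_0)\right) dN_i(t)\right].
\]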

Note that
\[
\frac{1}{n}\sum_i \int_0^\tau \left(Z_i - \bar{Z}_{0,\theta^*}(t,\theta_0)\right) dN_i(t) = \frac{1}{n}\sum_i \Delta_i\left(Z_i - \bar{Z}_{0,\theta^*}(X_i,\theta_0)\right),
\]
which is a sum of independent random variables.

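In the following proofs, we will need the result from Van der Vaart and Wellner (1996, p. 159) [48]: the class of all uniformly bounded, monotone functions on the real line is Donsker. Under regularity conditions (A1)–(A8), the functions $ZY(t)\exp\{Z\theta_0\}$ and $Y(t)\exp\{Z\theta_0\}$ are both uniformly bounded and monotone in $t$.

By Donsker's theorem,
\[
\sqrt{n}\left( \frac{1}{n}\sum_i Z_iY_i(t)\exp\{Z_i\theta_0\} - E_{\theta^*}\left[ZY(t)\exp\{Z\theta_0\}\right] \right)
\]
converges weakly to a zero-mean Gaussian process as $n\to\infty$. Similarly,
\[
\sqrt{n}\left( \frac{1}{n}\sum_i Y_i(t)\exp\{Z_i\theta_0\} - E_{\theta^*}\left[Y(t)\exp\{Z\theta_0\}\right] \right)
\]
converges weakly to a zero-mean Gaussian process as $n\to\infty$. The right-hand side of (7.2) has four terms, the first three of which are sums of independent random variables by the functional delta method and have zero means. Hence $\sqrt{n}\left(\frac{1}{n}U_n(\theta_0) - e^{c*}_{\theta^*}(\theta_0)\right)$ converges weakly to a zero-mean normal random variable, that is,
\[
\sqrt{n}\left(\frac{1}{n}U_n(\theta_0) - e^{c*}_{\theta^*}(\theta_0)\right) \xrightarrow{d} N\left(0, v^{c*}_{\theta^*}(\theta_0)\right) \quad \text{as } n\to\infty,
\]
where $v^{c*}_{\theta^*}(\theta_0) = \lim_{n\to\infty} \mathrm{Var}_{\theta^*}\left[\frac{1}{\sqrt{n}}U_n(\theta_0)\right]$.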

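The expressions of $e^{c*}_{\theta^*}(\theta_0)$ and $v^{c*}_{\theta^*}(\theta_0)$ can be derived as follows:
\[
\begin{aligned}
e^{c*}_{\theta^*}(\theta_0)
&= \lim_{n\to\infty} E_{\theta^*}\left[\frac{1}{n}U_n(\theta_0)\right]
 = \lim_{n\to\infty} E_{\theta^*}\left[\frac{1}{n}\sum_i \int_0^\tau \left(Z_i - \bar{Z}_{0,\theta^*}(t,\theta_0)\right) dN_i(t)\right] \\
&= \int_0^\tau \left\{ E_{\theta^*}\left[ZY(t)\exp\{Z\theta^*\}\right] - \frac{E_{\theta^*}\left[ZY(t)\exp\{Z\theta_0\}\right] E_{\theta^*}\left[Y(t)\exp\{Z\theta^*\}\right]}{E_{\theta^*}\left[Y(t)\exp\{Z\theta_0\}\right]} \right\} \lambda_0(t)\,dt \\
&= \int_0^\tau \left\{ E_{\theta^*}\left[Z\exp\{Z\theta^*\}S_{T|Z}(t-)S_{C|Z}(t-)\right] - \frac{E_{\theta^*}\left[Z\exp\{Z\theta_0\}S_{T|Z}(t-)S_{C|Z}(t-)\right] E_{\theta^*}\left[\exp\{Z\theta^*\}S_{T|Z}(t-)S_{C|Z}(t-)\right]}{E_{\theta^*}\left[\exp\{Z\theta_0\}S_{T|Z}(t-)S_{C|Z}(t-)\right]} \right\} \lambda_0(t)\,dt.
\end{aligned}
\]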


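If $\theta_0 = 0$ and condition (C1) is satisfied, then $e^{c*}_{\theta^*}(\theta_0)$ can be simplified as
\[
e^{c*}_{\theta^*}(\theta_0) = e^{c*}_{\theta^*}(0)
= \int_0^\tau \left\{ E\left[Z\exp\{Z\theta^*\}S_0(t)^{\exp\{Z\theta^*\}}\right] - \frac{E\left[ZS_0(t)^{\exp\{Z\theta^*\}}\right] E\left[\exp\{Z\theta^*\}S_0(t)^{\exp\{Z\theta^*\}}\right]}{E\left[S_0(t)^{\exp\{Z\theta^*\}}\right]} \right\} S_C(t)\lambda_0(t)\,dt. \tag{7.3}
\]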

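We now derive the expression of $v^{c*}_{\theta^*}(\theta_0)$ as follows:
\[
\begin{aligned}
v^{c*}_{\theta^*}(\theta_0)
={}& \lim_{n\to\infty} \mathrm{Var}_{\theta^*}\left[\frac{1}{\sqrt{n}}U_n(\theta_0)\right] \\
={}& \lim_{n\to\infty} \mathrm{Var}_{\theta^*}\Big\{ \frac{1}{\sqrt{n}}\sum_i \Delta_i\left(Z_i - \bar{Z}_{0,\theta^*}(X_i,\theta_0)\right) \\
&\qquad + \frac{1}{\sqrt{n}}\sum_i \int_0^\tau \left[Y_i(t)\exp\{Z_i\theta_0\}\bar{Z}_{0,\theta^*}(t,\theta_0) - Z_iY_i(t)\exp\{Z_i\theta_0\}\right] \tilde{Z}_{0,\theta^*}(t,\theta_0)\lambda_0(t)\,dt \Big\} \\
={}& \mathrm{Var}_{\theta^*}\left[\Delta\left(Z - \bar{Z}_{0,\theta^*}(X,\theta_0)\right)\right] \\
&+ \int_0^\tau\!\!\int_0^\tau \mathrm{Cov}_{\theta^*}\big[Y(s)\exp\{Z\theta_0\}\bar{Z}_{0,\theta^*}(s,\theta_0) - ZY(s)\exp\{Z\theta_0\},\; Y(t)\exp\{Z\theta_0\}\bar{Z}_{0,\theta^*}(t,\theta_0) - ZY(t)\exp\{Z\theta_0\}\big] \\
&\qquad \times \tilde{Z}_{0,\theta^*}(s,\theta_0)\lambda_0(s)\,\tilde{Z}_{0,\theta^*}(t,\theta_0)\lambda_0(t)\,ds\,dt \\
&+ 2\int_0^\tau \mathrm{Cov}_{\theta^*}\big[\Delta\left(Z - \bar{Z}_{0,\theta^*}(X,\theta_0)\right),\; Y(t)\exp\{Z\theta_0\}\bar{Z}_{0,\theta^*}(t,\theta_0) - ZY(t)\exp\{Z\theta_0\}\big]\, \tilde{Z}_{0,\theta^*}(t,\theta_0)\lambda_0(t)\,dt.
\end{aligned}
\]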

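7.1.2 Proof of Corollary 3.3.2

Proof. (i) Under $H_0: \theta = \theta_0$, by Theorem 3.3.1 (i), we have
\[
\frac{1}{\sqrt{n}}U_n(\theta_0) \xrightarrow{d} N\left(0, v^c_{\theta_0}(\theta_0)\right) \quad \text{as } n\to\infty,
\]
where $v^c_{\theta_0}(\theta_0) = \lim_{n\to\infty}\left[\frac{1}{n}V^c_n(\theta_0)\right]$. By Slutsky's theorem,
\[
T^c_n \xrightarrow{d} N(0, 1) \quad \text{as } n\to\infty.
\]
(ii) Under $H_1: \theta = \theta^*$, by Theorem 3.3.1 (ii), we have
\[
\frac{1}{\sqrt{n}}U_n(\theta_0) - \sqrt{n}\,e^{c*}_{\theta^*}(\theta_0) \xrightarrow{d} N\left(0, v^{c*}_{\theta^*}(\theta_0)\right) \quad \text{as } n\to\infty.
\]
Then, by Slutsky's theorem, the result follows:
\[
T^c_n - \frac{\sqrt{n}\,e^{c*}_{\theta^*}(\theta_0)}{\sqrt{v^c_{\theta^*}(\theta_0)}} \xrightarrow{d} N\left(0, \frac{v^{c*}_{\theta^*}(\theta_0)}{v^c_{\theta^*}(\theta_0)}\right) \quad \text{as } n\to\infty,
\]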

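where
\[
\begin{aligned}
v^c_{\theta^*}(\theta_0)
&= \lim_{n\to\infty} E_{\theta^*}\left[\frac{1}{n}V^c_n(\theta_0)\right] \\
&= \int_0^\tau \left\{ \frac{E_{\theta^*}\left[Z^2Y(t)\exp\{Z\theta_0\}\right]}{E_{\theta^*}\left[Y(t)\exp\{Z\theta_0\}\right]} - \left(\frac{E_{\theta^*}\left[ZY(t)\exp\{Z\theta_0\}\right]}{E_{\theta^*}\left[Y(t)\exp\{Z\theta_0\}\right]}\right)^2 \right\} E_{\theta^*}\left[Y(t)\exp\{Z\theta^*\}\right]\lambda_0(t)\,dt \\
&= \int_0^\tau \left\{ \frac{E_{\theta^*}\left[Z^2\exp\{Z\theta_0\}S_{T|Z}(t-)S_{C|Z}(t-)\right]}{E_{\theta^*}\left[\exp\{Z\theta_0\}S_{T|Z}(t-)S_{C|Z}(t-)\right]} - \left(\frac{E_{\theta^*}\left[Z\exp\{Z\theta_0\}S_{T|Z}(t-)S_{C|Z}(t-)\right]}{E_{\theta^*}\left[\exp\{Z\theta_0\}S_{T|Z}(t-)S_{C|Z}(t-)\right]}\right)^2 \right\} \\
&\qquad \times E_{\theta^*}\left[\exp\{Z\theta^*\}S_{T|Z}(t-)S_{C|Z}(t-)\right]\lambda_0(t)\,dt.
\end{aligned}
\]
If $\theta_0 = 0$ and condition (C1) is satisfied, then $v^c_{\theta^*}(\theta_0)$ can be simplified as
\[
v^c_{\theta^*}(\theta_0) = v^c_{\theta^*}(0)
= \int_0^\tau \left\{ \frac{E\left[Z^2S_0(t)^{\exp\{Z\theta^*\}}\right]}{E\left[S_0(t)^{\exp\{Z\theta^*\}}\right]} - \left(\frac{E\left[ZS_0(t)^{\exp\{Z\theta^*\}}\right]}{E\left[S_0(t)^{\exp\{Z\theta^*\}}\right]}\right)^2 \right\} E\left[\exp\{Z\theta^*\}S_0(t)^{\exp\{Z\theta^*\}}\right] S_C(t)\lambda_0(t)\,dt. \tag{7.4}
\]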
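As a worked illustration of how the limiting distribution in part (ii) leads to a sample size, the following sketch solves the power equation for $n$. It is a heuristic reading of the corollary, not a restatement of the formula in Chapter 3: it assumes a two-sided level-$\alpha$ test that rejects when $|T^c_n| > z_{1-\alpha/2}$, takes $e^{c*}_{\theta^*}(\theta_0) > 0$, and ignores the rejection probability in the opposite tail. The symbols $z_q$ (the standard normal $q$-quantile), $\Phi$, and $\beta$ (the type II error) are notation introduced here.
\[
\begin{aligned}
1 - \beta
&\approx \Pr\nolimits_{\theta^*}\left(T^c_n > z_{1-\alpha/2}\right)
 \approx \Phi\left( \frac{\sqrt{n}\,e^{c*}_{\theta^*}(\theta_0) - z_{1-\alpha/2}\sqrt{v^c_{\theta^*}(\theta_0)}}{\sqrt{v^{c*}_{\theta^*}(\theta_0)}} \right), \\
\text{so that}\quad
n &\approx \frac{\left( z_{1-\alpha/2}\sqrt{v^c_{\theta^*}(\theta_0)} + z_{1-\beta}\sqrt{v^{c*}_{\theta^*}(\theta_0)} \right)^2}{\left(e^{c*}_{\theta^*}(\theta_0)\right)^2}.
\end{aligned}
\]
The second line follows by setting the argument of $\Phi$ equal to $z_{1-\beta}$ and solving for $\sqrt{n}$. When the two variances are close, as is expected for small effect sizes, the numerator reduces to $(z_{1-\alpha/2} + z_{1-\beta})^2\, v^c_{\theta^*}(\theta_0)$.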


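7.1.3 Proof of Theorem 3.4.1

Proof. Under $H_{1n}: \theta = \theta_{1n} = \theta_0 + h/\sqrt{n}$,
\[
\begin{aligned}
\frac{1}{n}U_n(\theta_0)
&= \frac{1}{n}\sum_i \int_0^\tau \left(Z_i - \bar{Z}_n(t,\theta_0)\right) dN_i(t) \\
&= \frac{1}{n}\sum_i \int_0^\tau \left(Z_i - \bar{Z}_n(t,\theta_0)\right) dM^c_{i,\theta_{1n}}(t)
 + \frac{1}{n}\sum_i \int_0^\tau \left(Z_i - \bar{Z}_n(t,\theta_0)\right) Y_i(t)\exp\{Z_i\theta_{1n}\}\lambda_0(t)\,dt \\
&= T^a_{n,1} + T^a_{n,2},
\end{aligned}
\]
where $M^c_{i,\theta_{1n}}(t) = N_i(t) - \int_0^t Y_i(s)\exp\{Z_i\theta_{1n}\}\lambda_0(s)\,ds$ is the martingale associated with the $i$th individual, and
\[
\begin{aligned}
T^a_{n,1} &= \frac{1}{n}\sum_i \int_0^\tau \left(Z_i - \bar{Z}_n(t,\theta_0)\right) dM^c_{i,\theta_{1n}}(t), \\
T^a_{n,2} &= \frac{1}{n}\sum_i \int_0^\tau \left(Z_i - \bar{Z}_n(t,\theta_0)\right) Y_i(t)\exp\{Z_i\theta_{1n}\}\lambda_0(t)\,dt.
\end{aligned}
\]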

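By the martingale central limit theorem, the asymptotic distribution of $T^a_{n,1}$ can be derived: $\sqrt{n}\,T^a_{n,1}$ converges weakly to a zero-mean normal random variable. By the dominated convergence theorem, the asymptotic variance of $\sqrt{n}\,T^a_{n,1}$ is
\[
\begin{aligned}
&\lim_{n\to\infty} \frac{1}{n}\sum_i \int_0^\tau \left(Z_i - \bar{Z}_n(t,\theta_0)\right)^2 Y_i(t)\exp\{Z_i\theta_{1n}\}\lambda_0(t)\,dt \\
&\quad= \int_0^\tau \Big\{ E_{\theta_0}\left[Z^2Y(t)\exp\{Z\theta_0\}\right] - 2E_{\theta_0}\left[ZY(t)\exp\{Z\theta_0\}\right]\bar{Z}_{0,\theta_0}(t,\theta_0)
 + E_{\theta_0}\left[Y(t)\exp\{Z\theta_0\}\right]\bar{Z}^2_{0,\theta_0}(t,\theta_0) \Big\} \lambda_0(t)\,dt,
\end{aligned}
\tag{7.5}
\]
which can be further simplified as
\[
\begin{aligned}
&\int_0^\tau \left\{ E_{\theta_0}\left[Z^2Y(t)\exp\{Z\theta_0\}\right] - \frac{\left(E_{\theta_0}\left[ZY(t)\exp\{Z\theta_0\}\right]\right)^2}{E_{\theta_0}\left[Y(t)\exp\{Z\theta_0\}\right]} \right\} \lambda_0(t)\,dt \\
&\quad= \int_0^\tau \left\{ \frac{E_{\theta_0}\left[Z^2Y(t)\exp\{Z\theta_0\}\right]}{E_{\theta_0}\left[Y(t)\exp\{Z\theta_0\}\right]} - \left(\frac{E_{\theta_0}\left[ZY(t)\exp\{Z\theta_0\}\right]}{E_{\theta_0}\left[Y(t)\exp\{Z\theta_0\}\right]}\right)^2 \right\} E_{\theta_0}\left[Y(t)\exp\{Z\theta_0\}\right]\lambda_0(t)\,dt \\
&\quad= \lim_{n\to\infty} E_{\theta_0}\left[\frac{1}{n}V^c_n(\theta_0)\right] = v^c_{\theta_0}(\theta_0).
\end{aligned}
\]
Hence
\[
\sqrt{n}\,T^a_{n,1} \xrightarrow{d} N\left(0, v^c_{\theta_0}(\theta_0)\right) \quad \text{as } n\to\infty.
\]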

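Next we investigate the property of $T^a_{n,2}$:
\[
T^a_{n,2} = \frac{1}{n}\sum_i \int_0^\tau \left(Z_i - \bar{Z}_n(t,\theta_0)\right) Y_i(t)\exp\{Z_i\theta_{1n}\}\lambda_0(t)\,dt.
\]
By Taylor expansion at $\theta_0$, we have
\[
\begin{aligned}
T^a_{n,2}
={}& \frac{1}{n}\sum_i \int_0^\tau \left(Z_i - \bar{Z}_n(t,\theta_0)\right) Y_i(t)\exp\{Z_i\theta_0\}\lambda_0(t)\,dt \\
&+ \frac{1}{n}\frac{h}{\sqrt{n}}\sum_i \int_0^\tau \left(Z_i - \bar{Z}_n(t,\theta_0)\right) Z_iY_i(t)\exp\{Z_i\theta_0\}\lambda_0(t)\,dt + o_P(n^{-1/2}) \\
={}& 0 + \frac{1}{n}\frac{h}{\sqrt{n}}\sum_i \int_0^\tau \left(Z_i - \bar{Z}_n(t,\theta_0)\right) Z_iY_i(t)\exp\{Z_i\theta_0\}\lambda_0(t)\,dt + o_P(n^{-1/2}).
\end{aligned}
\]
It follows that
\[
\sqrt{n}\,T^a_{n,2} = \frac{h}{n}\sum_i \int_0^\tau \left(Z_i - \bar{Z}_n(t,\theta_0)\right) Z_iY_i(t)\exp\{Z_i\theta_0\}\lambda_0(t)\,dt + o_P(1)
\xrightarrow{P} h\,v^c_{\theta_0}(\theta_0) \quad \text{as } n\to\infty.
\]
By Slutsky's theorem,
\[
\frac{1}{\sqrt{n}}U_n(\theta_0) \xrightarrow{d} N\left(h\,v^c_{\theta_0}(\theta_0),\; v^c_{\theta_0}(\theta_0)\right) \quad \text{as } n\to\infty.
\]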

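7.1.4 Proof of Corollary 3.4.2

Proof. Under $H_{1n}: \theta = \theta_{1n} = \theta_0 + h/\sqrt{n}$, by Theorem 3.4.1, we have
\[
\frac{1}{\sqrt{n}}U_n(\theta_0) \xrightarrow{d} N\left(h\,v^c_{\theta_0}(\theta_0),\; v^c_{\theta_0}(\theta_0)\right) \quad \text{as } n\to\infty.
\]
By the dominated convergence theorem,
\[
\lim_{n\to\infty} E_{\theta_{1n}}\left[\frac{1}{n}V^c_n(\theta_0)\right] = \lim_{n\to\infty} E_{\theta_0}\left[\frac{1}{n}V^c_n(\theta_0)\right] = v^c_{\theta_0}(\theta_0).
\]
Then, by Slutsky's theorem, the result follows:
\[
T^c_n \xrightarrow{d} N\left(h\sqrt{v^c_{\theta_0}(\theta_0)},\; 1\right) \quad \text{as } n\to\infty.
\]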
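The limiting distribution above can be turned into a sample size in the simplest setting. The sketch below is illustrative and rests on stated assumptions: a two-sided level-alpha test, theta0 = 0 so that h equals the square root of n times the target log hazard ratio, and the simplification (7.1) of the limiting variance to Var(Z) times the event probability. Equating the mean shift to the sum of the two normal quantiles then gives a required number of events of Schoenfeld (1983) type. The function names and the numerical example are not from the dissertation.

```python
import math
from statistics import NormalDist

def required_events(theta, var_z, alpha=0.05, power=0.8):
    """Number of events solving
    sqrt(events * var_z) * |theta| = z_{1-alpha/2} + z_{power}."""
    std_normal = NormalDist()
    z_sum = std_normal.inv_cdf(1 - alpha / 2) + std_normal.inv_cdf(power)
    return z_sum**2 / (theta**2 * var_z)

def required_sample_size(theta, var_z, event_prob, alpha=0.05, power=0.8):
    """Total subjects: required events divided by the event probability."""
    return required_events(theta, var_z, alpha, power) / event_prob

# Illustrative design: hazard ratio 2, 1:1 allocation (Var(Z) = 0.25),
# and half of all subjects expected to have an observed event.
theta = math.log(2.0)
events = required_events(theta, var_z=0.25)
n_total = required_sample_size(theta, var_z=0.25, event_prob=0.5)
print(math.ceil(events), math.ceil(n_total))
```

The event probability here is a plain input; in a design with accrual and follow-up it would be computed from the censoring distribution implied by the accrual scheme.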

7.2 Proofs of the theorems in Chapter 4

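Assume that the true parameter of interest is $\theta \in \Theta$. For any $\theta_c \in \Theta$, let $K_{0,\theta}(t) = \lim_{n\to\infty} E_\theta\left[\frac{1}{n}K_n(t)\right]$. Also define the functions
\[
\begin{aligned}
H_{0,j,\theta}(t,\theta_c) &= \lim_{n\to\infty} E_\theta\left[\frac{1}{n}H_{n,j}(t,\theta_c)\right], \\
\Lambda_{0,j,\theta}(t,\theta_c) &= \lim_{n\to\infty} E_\theta\left[\Lambda_{n,j}(t,\theta_c)\right],
\end{aligned}
\]
for $j = 1, 2$, and
\[
P_{0,\theta}(t,\theta_c) = \lim_{n\to\infty} E_\theta\left[P_n(t,\theta_c)\right].
\]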


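Under conditions (A4)–(A5), $\Lambda_{0,j,\theta}(t,\theta_c)$ is continuous in $t$ for $j = 1, 2$. Therefore, $P_{0,\theta}(t,\theta_c) = \exp\{-\Lambda_{0,2,\theta}(t,\theta_c)\}$. Replacing $R_n(t,\theta_c)$ with $R_{0,\theta}(t,\theta_c)$ in the function $g_{n,i}(t,\theta_c)$, we define the function $g^0_{i,\theta}(t,\theta_c)$ as
\[
g^0_{i,\theta}(t,\theta_c) =
\begin{pmatrix} g^0_{i,1,\theta}(t,\theta_c) \\ g^0_{i,2,\theta}(t,\theta_c) \end{pmatrix}
=
\begin{pmatrix}
\dfrac{Z_i e^{-Z_i\theta_{c,1}}}{e^{-Z_i\theta_{c,1}} + e^{-Z_i\theta_{c,2}}R_{0,\theta}(t,\theta_c)} \\[2ex]
\dfrac{Z_i e^{-Z_i\theta_{c,2}}R_{0,\theta}(t,\theta_c)}{e^{-Z_i\theta_{c,1}} + e^{-Z_i\theta_{c,2}}R_{0,\theta}(t,\theta_c)}
\end{pmatrix}.
\]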

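Let $Dg^0_{i,\theta}(t,\theta_c)$ be the derivative of $g^0_{i,\theta}$ with respect to $R_{0,\theta}$ evaluated at $R_{0,\theta}(t,\theta_c)$. Then $Dg^0_{i,\theta}(t,\theta_c)$ has the expression
\[
Dg^0_{i,\theta}(t,\theta_c) =
\begin{pmatrix} Dg^0_{i,1,\theta}(t,\theta_c) \\ Dg^0_{i,2,\theta}(t,\theta_c) \end{pmatrix}
=
\begin{pmatrix}
\left.\dfrac{\partial g^0_{i,1,\theta}(\theta_c, R_{0,\theta})}{\partial R_{0,\theta}}\right|_{R_{0,\theta}=R_{0,\theta}(t,\theta_c)} \\[2ex]
\left.\dfrac{\partial g^0_{i,2,\theta}(\theta_c, R_{0,\theta})}{\partial R_{0,\theta}}\right|_{R_{0,\theta}=R_{0,\theta}(t,\theta_c)}
\end{pmatrix}
=
\begin{pmatrix}
-\dfrac{Z_i e^{-Z_i\theta_{c,1}}e^{-Z_i\theta_{c,2}}}{\left(e^{-Z_i\theta_{c,1}} + e^{-Z_i\theta_{c,2}}R_{0,\theta}(t,\theta_c)\right)^2} \\[2ex]
\dfrac{Z_i e^{-Z_i\theta_{c,1}}e^{-Z_i\theta_{c,2}}}{\left(e^{-Z_i\theta_{c,1}} + e^{-Z_i\theta_{c,2}}R_{0,\theta}(t,\theta_c)\right)^2}
\end{pmatrix}.
\]
Let
\[
Dg^0_{\theta}(t,\theta_c) =
\begin{pmatrix}
-\dfrac{Z e^{-Z\theta_{c,1}}e^{-Z\theta_{c,2}}}{\left(e^{-Z\theta_{c,1}} + e^{-Z\theta_{c,2}}R_{0,\theta}(t,\theta_c)\right)^2} \\[2ex]
\dfrac{Z e^{-Z\theta_{c,1}}e^{-Z\theta_{c,2}}}{\left(e^{-Z\theta_{c,1}} + e^{-Z\theta_{c,2}}R_{0,\theta}(t,\theta_c)\right)^2}
\end{pmatrix}.
\]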
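The closed-form derivative above can be sanity-checked against a central finite difference. The sketch below treats the two components of g as ordinary functions of a scalar R for one subject with covariate value z; the function names, the step size, and the parameter values are illustrative assumptions.

```python
import math

def g(z, theta1, theta2, r):
    """Two components of g for one subject, as functions of R = r."""
    denom = math.exp(-z * theta1) + math.exp(-z * theta2) * r
    g1 = z * math.exp(-z * theta1) / denom
    g2 = z * math.exp(-z * theta2) * r / denom
    return g1, g2

def dg_dr(z, theta1, theta2, r):
    """Closed-form derivative of both components with respect to R."""
    denom = math.exp(-z * theta1) + math.exp(-z * theta2) * r
    c = z * math.exp(-z * theta1) * math.exp(-z * theta2) / denom**2
    return -c, c

def dg_dr_numeric(z, theta1, theta2, r, eps=1e-6):
    """Central finite-difference approximation of the same derivative."""
    hi = g(z, theta1, theta2, r + eps)
    lo = g(z, theta1, theta2, r - eps)
    return tuple((a - b) / (2 * eps) for a, b in zip(hi, lo))

z, theta1, theta2, r = 1.0, 0.5, -0.3, 0.8   # illustrative values
analytic = dg_dr(z, theta1, theta2, r)
numeric = dg_dr_numeric(z, theta1, theta2, r)
print(analytic, numeric)
```

The two components sum to z for every R, which is why their derivatives are equal in magnitude and opposite in sign.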

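Substituting $R_{0,\theta}(t,\theta_c)$ for $R_n(t,\theta_c)$ in the function $S_n(t,\theta_c)$, we define
\[
S^0_{n,\theta}(t,\theta_c) = \sum_k \frac{g^0_{k,\theta}(t,\theta_c)\,Y_k(t)}{e^{-Z_k\theta_{c,1}} + e^{-Z_k\theta_{c,2}}R_{0,\theta}(t,\theta_c)}.
\]
Similar to the definition of $Dg^0_{i,\theta}(t,\theta_c)$, $DS^0_{n,\theta}(t,\theta_c)$ is defined as the derivative of $S^0_{n,\theta}$ with respect to $R_{0,\theta}$ evaluated at $R_{0,\theta}(t,\theta_c)$. Then $DS^0_{n,\theta}(t,\theta_c)$ has the expression
\[
DS^0_{n,\theta}(t,\theta_c) =
\begin{pmatrix} DS^0_{n,1,\theta}(t,\theta_c) \\ DS^0_{n,2,\theta}(t,\theta_c) \end{pmatrix}
=
\begin{pmatrix}
\left.\dfrac{\partial S^0_{n,1,\theta}(\theta_c, R_{0,\theta})}{\partial R_{0,\theta}}\right|_{R_{0,\theta}=R_{0,\theta}(t,\theta_c)} \\[2ex]
\left.\dfrac{\partial S^0_{n,2,\theta}(\theta_c, R_{0,\theta})}{\partial R_{0,\theta}}\right|_{R_{0,\theta}=R_{0,\theta}(t,\theta_c)}
\end{pmatrix}
=
\begin{pmatrix}
-2\displaystyle\sum_k \dfrac{Z_kY_k(t)\,e^{-Z_k\theta_{c,1}}e^{-Z_k\theta_{c,2}}}{\left(e^{-Z_k\theta_{c,1}} + e^{-Z_k\theta_{c,2}}R_{0,\theta}(t,\theta_c)\right)^3} \\[2ex]
\displaystyle\sum_k \dfrac{Z_kY_k(t)\,e^{-Z_k\theta_{c,2}}\left(e^{-Z_k\theta_{c,1}} - e^{-Z_k\theta_{c,2}}R_{0,\theta}(t,\theta_c)\right)}{\left(e^{-Z_k\theta_{c,1}} + e^{-Z_k\theta_{c,2}}R_{0,\theta}(t,\theta_c)\right)^3}
\end{pmatrix}.
\]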

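Also let
\[
\begin{aligned}
S^0_{0,\theta}(t,\theta_c) &= \lim_{n\to\infty} E_\theta\left[\frac{1}{n}S^0_{n,\theta}(t,\theta_c)\right], \\
DS^0_{0,\theta}(t,\theta_c) &= \lim_{n\to\infty} E_\theta\left[\frac{1}{n}DS^0_{n,\theta}(t,\theta_c)\right].
\end{aligned}
\]
Define
\[
A_{n,i}(t,\theta_c,R) = g_{n,i}(t,\theta_c,R) - \frac{S_n(t,\theta_c,R)}{K_n(t)}\left(e^{-Z_i\theta_{c,1}} + e^{-Z_i\theta_{c,2}}R\right),
\]
where
\[
g_{n,i}(t,\theta_c,R) =
\begin{pmatrix}
\dfrac{Z_i e^{-Z_i\theta_{c,1}}}{e^{-Z_i\theta_{c,1}} + e^{-Z_i\theta_{c,2}}R} \\[2ex]
\dfrac{Z_i e^{-Z_i\theta_{c,2}}R}{e^{-Z_i\theta_{c,1}} + e^{-Z_i\theta_{c,2}}R}
\end{pmatrix},
\qquad
S_n(t,\theta_c,R) = \sum_k \frac{g_{n,k}(t,\theta_c,R)\,Y_k(t)}{e^{-Z_k\theta_{c,1}} + e^{-Z_k\theta_{c,2}}R}.
\]
Let $A_{n,i,R}(t,\theta_c,r)$ be the derivative of $A_{n,i}(t,\theta_c,R)$ with respect to $R$ evaluated at $R = r$. In particular, the notations $A_i(t,\theta_c,R)$ and $g_i(t,\theta_c,R)$ are used instead of $A_{n,i}(t,\theta_c,R)$ and $g_{n,i}(t,\theta_c,R)$ when $R$ is not a function of $n$, and $A_{i,R}(t,\theta_c,r)$ is used instead of $A_{n,i,R}(t,\theta_c,r)$ when $r$ is not related to $n$.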


7.2.1 Lemma 7.2.2 and proof

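Lemma 7.2.2. Under conditions (A1)–(A8), for any $\theta, \theta_c \in \Theta$ and $t \in (0,\tau]$, $\sqrt{n}\left(R_n(t,\theta_c) - R_{0,\theta}(t,\theta_c)\right)$ converges weakly to a zero-mean Gaussian process as $n\to\infty$. Here, $\theta$ is the true parameter.

Proof. For any $\theta, \theta_c \in \Theta$ and $t \in (0,\tau]$, if $\theta$ is the true parameter, by the definition of $R_n(t,\theta_c)$, it can be shown that
\[
\begin{aligned}
R_n(t,\theta_c)
&= \frac{1}{P_n(t,\theta_c)}\int_0^t P_n(s-,\theta_c)\,d\Lambda_{n,1}(s,\theta_c) \\
&= \frac{1}{\prod_{u\in(0,t]}\left(1 - d\Lambda_{n,2}(u,\theta_c)\right)} \int_0^t \prod_{u\in(0,s-]}\left(1 - d\Lambda_{n,2}(u,\theta_c)\right) d\Lambda_{n,1}(s,\theta_c) \\
&= \int_0^t \frac{1}{\prod_{u\in[s,t]}\left(1 - d\Lambda_{n,2}(u,\theta_c)\right)}\,d\Lambda_{n,1}(s,\theta_c) \\
&= \int_0^t \frac{1}{\prod_{u\in[s,t]}\left(1 - d\Lambda_{n,2}(u,\theta_c)\right)}\,d\left(\Lambda_{n,1}(s,\theta_c) - \Lambda_{0,1,\theta}(s,\theta_c)\right)
 + \int_0^t \frac{1}{\prod_{u\in[s,t]}\left(1 - d\Lambda_{n,2}(u,\theta_c)\right)}\,d\Lambda_{0,1,\theta}(s,\theta_c).
\end{aligned}
\]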
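By Taylor expansion,
\[
\begin{aligned}
R_n(t,\theta_c)
={}& \int_0^t \left[ \frac{1}{\prod_{u\in[s,t]}\left(1 - d\Lambda_{0,2,\theta}(u,\theta_c)\right)}
 - \frac{\prod_{u\in[s,t]}\left(1 - d\Lambda_{n,2}(u,\theta_c)\right) - \prod_{u\in[s,t]}\left(1 - d\Lambda_{0,2,\theta}(u,\theta_c)\right)}{\left(\prod_{u\in[s,t]}\left(1 - d\Lambda_{0,2,\theta}(u,\theta_c)\right)\right)^2} \right]
 d\left(\Lambda_{n,1}(s,\theta_c) - \Lambda_{0,1,\theta}(s,\theta_c)\right) \\
&+ \int_0^t \left[ \frac{1}{\prod_{u\in[s,t]}\left(1 - d\Lambda_{0,2,\theta}(u,\theta_c)\right)}
 - \frac{\prod_{u\in[s,t]}\left(1 - d\Lambda_{n,2}(u,\theta_c)\right) - \prod_{u\in[s,t]}\left(1 - d\Lambda_{0,2,\theta}(u,\theta_c)\right)}{\left(\prod_{u\in[s,t]}\left(1 - d\Lambda_{0,2,\theta}(u,\theta_c)\right)\right)^2} \right]
 d\Lambda_{0,1,\theta}(s,\theta_c) \\
&+ o_P(n^{-1/2}),
\end{aligned}
\]
which is equal to
\[
\begin{aligned}
&\int_0^t \frac{1}{\prod_{u\in[s,t]}\left(1 - d\Lambda_{0,2,\theta}(u,\theta_c)\right)}\,d\left(\Lambda_{n,1}(s,\theta_c) - \Lambda_{0,1,\theta}(s,\theta_c)\right)
 + \int_0^t \frac{1}{\prod_{u\in[s,t]}\left(1 - d\Lambda_{0,2,\theta}(u,\theta_c)\right)}\,d\Lambda_{0,1,\theta}(s,\theta_c) \\
&\quad - \int_0^t \frac{\prod_{u\in[s,t]}\left(1 - d\Lambda_{n,2}(u,\theta_c)\right) - \prod_{u\in[s,t]}\left(1 - d\Lambda_{0,2,\theta}(u,\theta_c)\right)}{\left(\prod_{u\in[s,t]}\left(1 - d\Lambda_{0,2,\theta}(u,\theta_c)\right)\right)^2}\,d\Lambda_{0,1,\theta}(s,\theta_c)
 + o_P(n^{-1/2}).
\end{aligned}
\]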

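Note that
\[
R_{0,\theta}(t,\theta_c) = \int_0^t \frac{1}{\prod_{u\in[s,t]}\left(1 - d\Lambda_{0,2,\theta}(u,\theta_c)\right)}\,d\Lambda_{0,1,\theta}(s,\theta_c).
\]
Thus,
\[
\begin{aligned}
R_n(t,\theta_c) - R_{0,\theta}(t,\theta_c)
={}& \int_0^t \frac{1}{\prod_{u\in[s,t]}\left(1 - d\Lambda_{0,2,\theta}(u,\theta_c)\right)}\,d\left(\Lambda_{n,1}(s,\theta_c) - \Lambda_{0,1,\theta}(s,\theta_c)\right) \\
&- \int_0^t \frac{\prod_{u\in[s,t]}\left(1 - d\Lambda_{n,2}(u,\theta_c)\right) - \prod_{u\in[s,t]}\left(1 - d\Lambda_{0,2,\theta}(u,\theta_c)\right)}{\left(\prod_{u\in[s,t]}\left(1 - d\Lambda_{0,2,\theta}(u,\theta_c)\right)\right)^2}\,d\Lambda_{0,1,\theta}(s,\theta_c) \\
&+ o_P(n^{-1/2}).
\end{aligned}
\tag{7.6}
\]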

Since
\[
\Lambda_{n,j}(t,\theta_c) = \int_0^t \frac{1}{\frac{1}{n}K_n(s)}\,d\,\frac{1}{n}H_{n,j}(s,\theta_c),
\]
by the chain rule for Hadamard differentiable maps, $\Lambda_{n,j}(t,\theta_c)$ is Hadamard differentiable. By the functional delta method, the first term on the right-hand side of (7.6) has zero mean and can be written as a sum of independent random variables.

Since $\Lambda_{n,2}(t,\theta_c)$ and the product limit $\prod_{u\in(0,t]}\left(1 - d\Lambda_{n,2}(u,\theta_c)\right)$ are in one-to-one correspondence, $\prod_{u\in(0,t]}\left(1 - d\Lambda_{n,2}(u,\theta_c)\right)$ is also Hadamard differentiable. Similarly, it can be shown that $\prod_{u\in(0,s-]}\left(1 - d\Lambda_{n,2}(u,\theta_c)\right)$ is Hadamard differentiable, for $s \le t \in (0,\tau]$. It follows that
\[
\prod_{u\in[s,t]}\left(1 - d\Lambda_{n,2}(u,\theta_c)\right) = \frac{\prod_{u\in(0,t]}\left(1 - d\Lambda_{n,2}(u,\theta_c)\right)}{\prod_{u\in(0,s-]}\left(1 - d\Lambda_{n,2}(u,\theta_c)\right)}
\]
is Hadamard differentiable.

Then $\sqrt{n}\left(\prod_{u\in[s,t]}\left(1 - d\Lambda_{n,2}(u,\theta_c)\right) - \prod_{u\in[s,t]}\left(1 - d\Lambda_{0,2,\theta}(u,\theta_c)\right)\right)$ converges weakly to a zero-mean Gaussian process, for $s \le t \in (0,\tau]$. It follows that the second term on the right-hand side of (7.6) can be expressed as a sum of independent random variables. Hence $\sqrt{n}\left(R_n(t,\theta_c) - R_{0,\theta}(t,\theta_c)\right)$ converges weakly to a zero-mean Gaussian process as $n\to\infty$.

In particular, if $\theta_c = \theta$, $\sqrt{n}\left(R_n(t,\theta) - R_{0,\theta}(t,\theta)\right) = \sqrt{n}\left(R_n(t,\theta) - R_{0,\theta}(t)\right)$ can be written as a multiple of a martingale (see Yang and Prentice 2005 [51]).

7.2.3 Proof of Theorem 4.3.1 (ii)

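Proof. Under $H_1: \theta = \theta^*$,
\[
\begin{aligned}
\frac{1}{n}Q_n(\theta_0)
&= \frac{1}{n}\sum_{i=1}^n \int_0^\tau \left[ g_{n,i}(t,\theta_0) - \frac{S_n(t,\theta_0)}{K_n(t)}\left(e^{-Z_i\theta_{0,1}} + e^{-Z_i\theta_{0,2}}R_n(t,\theta_0)\right) \right] dN_i(t) \\
&= T^b_{n,1} - T^b_{n,2},
\end{aligned}
\]
where
\[
T^b_{n,1} = \frac{1}{n}\sum_i \int_0^\tau g_{n,i}(t,\theta_0)\,dN_i(t)
\]
and
\[
T^b_{n,2} = \frac{1}{n}\sum_i \int_0^\tau \frac{S_n(t,\theta_0)}{K_n(t)}\left(e^{-Z_i\theta_{0,1}} + e^{-Z_i\theta_{0,2}}R_n(t,\theta_0)\right) dN_i(t).
\]
We will evaluate the asymptotic properties of $T^b_{n,1}$ and $T^b_{n,2}$.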


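By Taylor expansion, $T^b_{n,1}$ can be expressed as
\[
\begin{aligned}
T^b_{n,1}
={}& \frac{1}{n}\sum_i \int_0^\tau g_{n,i}(t,\theta_0)\,dN_i(t) \\
={}& \frac{1}{n}\sum_i \int_0^\tau \left[ g^0_{i,\theta^*}(t,\theta_0) + \left(R_n(t,\theta_0) - R_{0,\theta^*}(t,\theta_0)\right) Dg^0_{i,\theta^*}(t,\theta_0) \right] dN_i(t) + o_P(n^{-1/2}) \\
={}& \frac{1}{n}\sum_i \int_0^\tau g^0_{i,\theta^*}(t,\theta_0)\,dN_i(t)
 + \int_0^\tau \left(R_n(t,\theta_0) - R_{0,\theta^*}(t,\theta_0)\right) \frac{1}{n}\sum_i Dg^0_{i,\theta^*}(t,\theta_0)\,dN_i(t) + o_P(n^{-1/2}) \\
={}& \frac{1}{n}\sum_i \int_0^\tau g^0_{i,\theta^*}(t,\theta_0)\,dN_i(t) \\
&+ \int_0^\tau \left(R_n(t,\theta_0) - R_{0,\theta^*}(t,\theta_0)\right)
 E_{\theta^*}\left[ \frac{1}{n}\sum_i \frac{Dg^0_{i,\theta^*}(t,\theta_0)\,Y_i(t)}{e^{-Z_i\theta^*_1} + e^{-Z_i\theta^*_2}R_{0,\theta^*}(t)} \right] dR_{0,\theta^*}(t) + o_P(n^{-1/2}) \\
={}& \frac{1}{n}\sum_i \int_0^\tau g^0_{i,\theta^*}(t,\theta_0)\,dN_i(t) \\
&+ \int_0^\tau \left(R_n(t,\theta_0) - R_{0,\theta^*}(t,\theta_0)\right)
 E_{\theta^*}\left[ \frac{Dg^0_{\theta^*}(t,\theta_0)\,Y(t)}{e^{-Z\theta^*_1} + e^{-Z\theta^*_2}R_{0,\theta^*}(t)} \right] dR_{0,\theta^*}(t) + o_P(n^{-1/2}).
\end{aligned}
\]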

It can be shown that

T bn,2 =1

n

∑i

∫ τ

0

Sn(t, θ0)

Kn(t)


)dNi(t)

=

∫ τ

0

1nSn(t, θ0)1nKn(t)

1

n

∑i

[e−Ziθ0,1 + e−Ziθ0,2R0,θ∗(t, θ0)

+e−Ziθ0,2(Rn(t, θ0)−R0,θ∗(t, θ0)

)]dNi(t),



T bn,2 =

∫ τ

0

1nSn(t, θ0)1nKn(t)

1

n

∑i

(e−Ziθ0,1 + e−Ziθ0,2R0,θ∗(t, θ0)

)dNi(t)

+

∫ τ

0

1nSn(t, θ0)1nKn(t)

(Rn(t, θ0)−R0,θ∗(t, θ0)

) 1

n

∑i

e−Ziθ0,2 dNi(t)

= T bn,21 + T bn,22,

where

T bn,21 =

∫ τ

0

1nSn(t, θ0)1nKn(t)

1

n

∑i

(e−Ziθ0,1 + e−Ziθ0,2R0,θ∗(t, θ0)

)dNi(t)

and

T bn,22 =

∫ τ

0

1nSn(t, θ0)1nKn(t)

(Rn(t, θ0)−R0,θ∗(t, θ0)

) 1

n

∑i

e−Ziθ0,2 dNi(t).

Next, the expressions of T bn,21 and T bn,22 will be derived. By Taylor expansion, it can

be shown that

T bn,21

=

∫ τ

0

1nSn(t, θ0)1nKn(t)

1

n

∑i

(e−Ziθ0,1 + e−Ziθ0,2R0,θ∗(t, θ0)

)dNi(t)

=

∫ τ

0

1nS0n,θ∗(t, θ0)1nKn(t)

1

n

∑i

(e−Ziθ0,1 + e−Ziθ0,2R0,θ∗(t, θ0)

)dNi(t)

+

∫ τ

0

1nDS0

n,θ∗(t, θ0)1nKn(t)

(Rn(t, θ0)−R0,θ∗(t, θ0)

)× 1

n

∑i

(e−Ziθ0,1 + e−Ziθ0,2R0,θ∗(t, θ0)

)dNi(t)

+oP (n−12 ),



T bn,21

=

∫ τ

0

[S0

0,θ∗(t, θ0)

K0,θ∗(t)+

1

K0,θ∗(t)

(1

nS0n,θ∗(t, θ0)− S0

0,θ∗(t, θ0)

)−S0

0,θ∗(t, θ0)

(K0,θ∗(t))2

(1

nKn(t)−K0,θ∗(t)

)][1

n

∑i

(e−Ziθ0,1 + e−Ziθ0,2R0,θ∗(t, θ0)

)dNi(t)

]

+

∫ τ

0

[DS0

0,θ∗(t, θ0)

K0,θ∗(t)+

1

K0,θ(t)

(1

nDS0

n,θ∗(t, θ0)−DS00,θ∗(t, θ0)

)−DS0

0,θ∗(t, θ0)

(K0,θ∗(t))2

(1


)]×(Rn(t, θ0)−R0,θ∗(t, θ0)

)[ 1

n

∑i

(e−Ziθ0,1 + e−Ziθ0,2R0,θ∗(t, θ0)

)dNi(t)

]+oP (n−

12 )

=

∫ τ

0

S00,θ∗(t, θ0)

K0,θ∗(t)

1

n

∑i

(e−Ziθ0,1 + e−Ziθ0,2R0,θ∗(t, θ0)

)dNi(t)

+

∫ τ

0

[1

K0,θ∗(t)

(1

nS0n,θ∗(t, θ0)− S0

0,θ∗(t, θ0)

)−S0

0,θ∗(t, θ0)

(K0,θ∗(t))2

(1


)+DS0

0,θ∗(t, θ0)

K0,θ∗(t)

(Rn(t, θ0)−R0,θ∗(t, θ0)

)]×Eθ∗

[e−Zθ0,1 + e−Zθ0,2R0,θ∗(t, θ0)


∗2R0,θ∗(t)

Y (t)

]dR0,θ∗(t)

+oP (n−12 ).

It can also be shown that

T bn,22

=

∫ τ

0

1nSn(t, θ0)1nKn(t)

(Rn(t, θ0)−R0,θ∗(t, θ0)

) 1

n

∑i

e−Ziθ0,2 dNi(t)

=

∫ τ

0

1nS0n,θ∗(t, θ0)1nKn(t)

(Rn(t, θ0)−R0,θ∗(t, θ0)

) 1

n

∑i

e−Ziθ0,2 dNi(t)

+

∫ τ

0

1nDS0

n,θ∗(t, θ0)1nKn(t)

(Rn(t, θ0)−R0,θ∗(t, θ0)

)2 1

n

∑i

e−Ziθ0,2 dNi(t) + oP (n−12 ),



T bn,22 =

∫ τ

0

[S0

0,θ∗(t, θ0)

K0,θ∗(t)+

1

K0,θ∗(t)

(1

nS0n,θ∗(t, θ0)− S0

0,θ∗(t, θ0)

)−S0

0,θ∗(t, θ0)

(K0,θ∗(t))2

(1


)]×(Rn(t, θ0)−R0,θ∗(t, θ0)

)[ 1

n

∑i

e−Ziθ0,2 dNi(t)

]+ oP (n−

12 )

=

∫ τ

0

S00,θ∗(t, θ0)

K0,θ∗(t)

(Rn(t, θ0)−R0,θ∗(t, θ0)

)[ 1

n

∑i

e−Ziθ0,2 dNi(t)

]+oP (n−

12 )

=

∫ τ

0

S00,θ∗(t, θ0)

K0,θ∗(t)

(Rn(t, θ0)−R0,θ∗(t, θ0)

)×Eθ∗

[e−Zθ0,2


∗2R0,θ∗(t)

Y (t)

]dR0,θ∗(t) + oP (n−

12 ).

It follows that

1

nQn(θ0) = T bn,1 − T bn,2 = T bn,1 − T b21 − T b22

=1

n

∑i

∫ τ

0

[g0i,θ∗(t, θ0)−

S00,θ∗(t, θ0)

K0,θ∗(t)

(e−Ziθ0,1 + e−Ziθ0,2R0,θ∗(t, θ0)

)]dNi(t)

+

∫ τ

0

(Rn(t, θ0)−R0,θ∗(t, θ0)

)Eθ∗

[Dg0

θ∗(t, θ0)Y (t)


∗2R0,θ∗(t)

]dR0,θ∗(t)

−∫ τ

0

[1

K0,θ∗(t)

(1

nS0n,θ∗(t, θ0)− S0

0,θ∗(t, θ0)

)−S0

0,θ∗(t, θ0)

(K0,θ∗(t))2

(1


)+DS0

0,θ∗(t, θ0)

K0,θ∗(t)

(Rn(t, θ0)−R0,θ∗(t, θ0)

)]×Eθ∗

[e−Zθ0,1 + e−Zθ0,2R0,θ∗(t, θ0)


∗2R0,θ∗(t)

Y (t)

]dR0,θ∗(t)

−∫ τ

0

S00,θ∗(t, θ0)

K0,θ∗(t)

(Rn(t, θ0)−R0,θ∗(t, θ0)

)×Eθ∗

[e−Zθ0,2


∗2R0,θ∗(t)

Y (t)

]dR0,θ∗(t)

+oP (n−12 ).


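Notice that
\[
\begin{aligned}
e^*_{\theta^*}(\theta_0)
&= \lim_{n\to\infty} E_{\theta^*}\left[ \frac{1}{n}\sum_i \int_0^\tau \left( g^0_{i,\theta^*}(t,\theta_0) - \frac{S^0_{0,\theta^*}(t,\theta_0)}{K_{0,\theta^*}(t)}\left(e^{-Z_i\theta_{0,1}} + e^{-Z_i\theta_{0,2}}R_{0,\theta^*}(t,\theta_0)\right) \right) dN_i(t) \right] \\
&= \int_0^\tau E_{\theta^*}\left[ \left( g^0_{i,\theta^*}(t,\theta_0) - \frac{S^0_{0,\theta^*}(t,\theta_0)}{K_{0,\theta^*}(t)}\left(e^{-Z\theta_{0,1}} + e^{-Z\theta_{0,2}}R_{0,\theta^*}(t,\theta_0)\right) \right) \frac{Y(t)}{e^{-Z\theta^*_1} + e^{-Z\theta^*_2}R_{0,\theta^*}(t)} \right] dR_{0,\theta^*}(t).
\end{aligned}
\]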

Therefore,

√n

(1

nQn(θ0)− e∗θ∗(θ0)

)=√n

[1

n

∑i

∫ τ

0

(g0i,θ∗(t, θ0)−

S00,θ∗(t, θ0)

K0,θ∗(t)

(e−Ziθ0,1 + e−Ziθ0,2R0,θ∗(t, θ0)

))dNi(t)

−e∗θ∗(θ0)]

+√n

∫ τ

0

(Rn(t, θ0)−R0,θ∗(t, θ0)

)Eθ∗

[Dg0

θ∗(t, θ0)Y (t)


∗2R0,θ∗(t)

]dR0,θ∗(t)

−√n

∫ τ

0

[1

K0,θ∗(t)

(1

nS0n,θ∗(t, θ0)− S0

0,θ∗(t, θ0)

)−S0

0,θ∗(t, θ0)

(K0,θ∗(t))2

(1


)+DS0

0,θ∗(t, θ0)

K0,θ∗(t)

(Rn(t, θ0)−R0,θ∗(t, θ0)

)]×Eθ∗

[e−Zθ0,1 + e−Zθ0,2R0,θ∗(t, θ0)


∗2R0,θ∗(t)

Y (t)

]dR0,θ∗(t)

−√n

∫ τ

0

S00,θ∗(t, θ0)

K0,θ∗(t)

(Rn(t, θ0)−R0,θ∗(t, θ0)

)×Eθ∗

[e−Zθ0,2


∗2R0,θ∗(t)

Y (t)

]dR0,θ∗(t) + oP (1).

The first term in the above equation is a sum of independent random variables. By the central limit theorem, the first term converges weakly to a normal random variable as $n\to\infty$. By Donsker's theorem, $\sqrt{n}\left(\frac{1}{n}S^0_{n,\theta^*}(t,\theta_0) - S^0_{0,\theta^*}(t,\theta_0)\right)$ converges to a zero-mean Gaussian process. Similarly, $\sqrt{n}\left(\frac{1}{n}K_n(t) - K_{0,\theta^*}(t)\right)$ also converges to a zero-mean Gaussian process. By Lemma 7.2.2 and the functional delta method, each of the next three terms on the right-hand side of (7.7) can be written as a sum of independent random variables that converges weakly to a zero-mean normal random variable.

Therefore, $\sqrt{n}\left(\frac{1}{n}Q_n(\theta_0) - e^*_{\theta^*}(\theta_0)\right)$ converges weakly to a zero-mean normal random variable as $n\to\infty$. The derivation of the asymptotic variance $v^*_{\theta^*}(\theta_0)$ is given in the following.

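Let
\[
\begin{aligned}
G_{\theta^*}(t,\theta_0)
={}& E_{\theta^*}\left[ \frac{Dg^0_{\theta^*}(t,\theta_0)\,Y(t)}{e^{-Z\theta^*_1} + e^{-Z\theta^*_2}R_{0,\theta^*}(t)} \right]
 - \frac{DS^0_{0,\theta^*}(t,\theta_0)}{K_{0,\theta^*}(t)}\, E_{\theta^*}\left[ \frac{e^{-Z\theta_{0,1}} + e^{-Z\theta_{0,2}}R_{0,\theta^*}(t,\theta_0)}{e^{-Z\theta^*_1} + e^{-Z\theta^*_2}R_{0,\theta^*}(t)}\,Y(t) \right] \\
&- \frac{S^0_{0,\theta^*}(t,\theta_0)}{K_{0,\theta^*}(t)}\, E_{\theta^*}\left[ \frac{e^{-Z\theta_{0,2}}}{e^{-Z\theta^*_1} + e^{-Z\theta^*_2}R_{0,\theta^*}(t)}\,Y(t) \right].
\end{aligned}
\]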

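Then,
\[
\begin{aligned}
\sqrt{n}\left(\frac{1}{n}Q_n(\theta_0) - e^*_{\theta^*}(\theta_0)\right)
={}& \sqrt{n}\left[ \frac{1}{n}\sum_i \int_0^\tau \left( g^0_{i,\theta^*}(t,\theta_0) - \frac{S^0_{0,\theta^*}(t,\theta_0)}{K_{0,\theta^*}(t)}\left(e^{-Z_i\theta_{0,1}} + e^{-Z_i\theta_{0,2}}R_{0,\theta^*}(t,\theta_0)\right) \right) dN_i(t) - e^*_{\theta^*}(\theta_0) \right] \\
&- \sqrt{n}\int_0^\tau \frac{1}{K_{0,\theta^*}(t)}\left( \frac{1}{n}S^0_{n,\theta^*}(t,\theta_0) - S^0_{0,\theta^*}(t,\theta_0) \right)
 E_{\theta^*}\left[ \frac{e^{-Z\theta_{0,1}} + e^{-Z\theta_{0,2}}R_{0,\theta^*}(t,\theta_0)}{e^{-Z\theta^*_1} + e^{-Z\theta^*_2}R_{0,\theta^*}(t)}\,Y(t) \right] dR_{0,\theta^*}(t) \\
&+ \sqrt{n}\int_0^\tau \frac{S^0_{0,\theta^*}(t,\theta_0)}{\left(K_{0,\theta^*}(t)\right)^2}\left( \frac{1}{n}K_n(t) - K_{0,\theta^*}(t) \right)
 E_{\theta^*}\left[ \frac{e^{-Z\theta_{0,1}} + e^{-Z\theta_{0,2}}R_{0,\theta^*}(t,\theta_0)}{e^{-Z\theta^*_1} + e^{-Z\theta^*_2}R_{0,\theta^*}(t)}\,Y(t) \right] dR_{0,\theta^*}(t) \\
&+ \sqrt{n}\int_0^\tau \left(R_n(t,\theta_0) - R_{0,\theta^*}(t,\theta_0)\right) G_{\theta^*}(t,\theta_0)\,dR_{0,\theta^*}(t) \\
&+ o_P(1).
\end{aligned}
\tag{7.7}
\]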


In the expression (7.7), the first term is a sum of independent random variables. Note that $\mathrm{Cov}\left(\int F_{n,1}\,dG_1, \int F_{n,2}\,dG_2\right) = \int\!\!\int \mathrm{Cov}(F_{n,1}, F_{n,2})\,dG_1\,dG_2$, for some empirical distribution functions $F_{n,1}, F_{n,2}$ and random functions $G_1, G_2$. Thus, if we write terms two to four in the expression (7.7) in the form $\int F_n\,dG$, the variance matrix $v^*_{\theta^*}(\theta_0)$ can be calculated as the variance of the sum of the four terms. Next, we will represent the fourth term, $\sqrt{n}\int_0^\tau \left(R_n(t,\theta_0) - R_{0,\theta^*}(t,\theta_0)\right) G_{\theta^*}(t,\theta_0)\,dR_{0,\theta^*}(t)$, in such a form. We will use the expression of $R_n(t,\theta_0) - R_{0,\theta^*}(t,\theta_0)$ in (7.6).

By Proposition II.8.7 of Anderson et al. (1993) [1], the functional derivative

d∏

[s,t] (1− dΛ0,2,θ∗)× (Λn,2 − Λ0,2,θ∗) can be calculated by∫u∈[s,t]

∏[s,u)(1− dΛ0,2,θ∗)

∏(u,t](1− dΛ0,2,θ∗)d

(Λn,2(u)− Λ0,2,θ∗(u)

).

Then, (7.6) can be represented as

Rn(t, θ0)−R0,θ∗(t, θ0)

=

∫ t

0

1

K0,θ∗(s)∏

u∈[s,t] (1− dΛ0,2,θ∗(u, θ0))d

(1

nHn,1(s, θ0)−H0,1,θ∗(s, θ0)

)−∫ t

0

1

(K0,θ∗(s))2∏

u∈[s,t] (1− dΛ0,2,θ∗(u, θ0))

(1

nKn(s)−K0,θ∗(s)

)×dH0,1,θ∗(s, θ0)

−∫ t

0

1∏u∈[s,t] (1− dΛ0,2,θ∗(u, θ0))

×(∫

[s,t]

1

K0,θ∗(u) (1− dΛ0,2,θ∗(u, θ0))d

(1

nHn,2(u, θ0)−H0,2,θ∗(u, θ0)

))×dΛ0,1,θ∗(s, θ0)

+

∫ t

0

1∏u∈[s,t] (1− dΛ0,2,θ∗(u, θ0))

×(∫

[s,t]

1

(K0,θ∗(u))2 (1− dΛ0,2,θ∗(u, θ0))

(1

nKn(u)−K0,θ∗(u)

)dH0,2,θ∗(u, θ0)

)×dΛ0,1,θ∗(s, θ0) + oP (n−

12 ).


Under conditions (A4)–(A5), $\Lambda_{0,2,\theta^*}(t,\theta_0)$ is continuous for all $t \in (0,\tau]$. Thus, the product limit $\prod_{u\in(0,t]}\left(1 - d\Lambda_{0,2,\theta^*}(u,\theta_0)\right)$ can be written as $e^{-\Lambda_{0,2,\theta^*}(t,\theta_0)}$.
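The fact that the product limit of a continuous cumulative hazard equals the exponential of minus that hazard can be seen numerically by refining a partition of (0, t]. The sketch below uses an arbitrary smooth cumulative hazard chosen only for illustration; the function names and values are assumptions, not quantities from the dissertation.

```python
import math

def product_limit(cum_hazard, t, steps):
    """Finite-partition approximation of the product integral:
    the product over the grid of (1 - increment of the cumulative hazard)."""
    prod = 1.0
    prev = cum_hazard(0.0)
    for k in range(1, steps + 1):
        cur = cum_hazard(t * k / steps)
        prod *= 1.0 - (cur - prev)
        prev = cur
    return prod

def cum_hazard(t):
    """Illustrative continuous cumulative hazard (hypothetical choice)."""
    return 0.3 * t**2

t = 2.0
exact = math.exp(-cum_hazard(t))
coarse = product_limit(cum_hazard, t, steps=10)
fine = product_limit(cum_hazard, t, steps=100_000)
print(exact, coarse, fine)
```

As the partition is refined the finite product approaches the exponential; with jumps in the cumulative hazard the two would differ, which is why continuity is needed here.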

By integration by parts, $R_n(t,\theta_0) - R_{0,\theta^*}(t,\theta_0)$ has the representation

Rn(t, θ0)−R0,θ∗(t, θ0)

= eΛ0,2,θ∗ (t,θ0)

∫ t

0

1

eΛ0,2,θ∗ (s,θ0)

∫u∈[s,t]

1

(K0,θ∗(u))2

(1


)×dH0,2,θ∗(u, θ0)dΛ0,1,θ∗(s, θ0)

−eΛ0,2,θ∗ (t,θ0)

∫ t

0

1

(K0,θ∗(s))2eΛ0,2,θ∗ (s,θ0)

(1

nKn(s)−K0,θ∗(s)

)dH0,1,θ∗(s, θ0)

+1

K0,θ∗(t)

(1

nHn,1(t, θ0)−H0,1,θ∗(t, θ0)

)+eΛ0,2,θ∗ (t,θ0)

∫ t

0

(1

nHn,1(s, θ0)−H0,1,θ∗(s, θ0)

)× 1

(K0,θ∗(s))2eΛ0,2,θ∗ (s,θ0)[dK0,θ∗(s) +K0,θ∗(s)dΛ0,2,θ∗(s, θ0)]

−eΛ0,2,θ∗ (t,θ0) 1

K0,θ∗(t)

(1

nHn,2(t, θ0)−H0,2,θ∗(t, θ0)

)∫ t

0

1

eΛ0,2,θ∗ (s,θ0)dΛ0,1,θ∗(s, θ0)

+eΛ0,2,θ∗ (t,θ0)

∫ t

0

1

K0,θ∗(s)eΛ0,2,θ∗ (s,θ0)

(1

nHn,2(s, θ0)−H0,2(s, θ0)

)dΛ0,1(s, θ0)

−eΛ0,2,θ∗ (t,θ0)

∫ t

0

1

eΛ0,2,θ∗ (s,θ0)

∫u∈[s,t]

1

(K0,θ∗(u))2

(1

nHn,2(u, θ0)−H0,2,θ∗(u, θ0)

)×dK0(u)dΛ0,1(s, θ0)

+oP (n−12 ).

In order to derive the limiting variance of the pseudo score function under the alternative hypothesis, we rewrite $R_n(t,\theta_0) - R_{0,\theta^*}(t,\theta_0)$ by changing the order of integration. It follows that

Rn(t, θ0)−R0,θ∗(t, θ0)

= eΛ0,2,θ∗ (t,θ0)

[∫u∈(0,t]

(∫v∈(0,u]

e−Λ0,2,θ∗ (v,θ0)dΛ0,1,θ∗(v, θ0)

)× 1

(K0,θ∗(u))2

(1


)dH0,2,θ∗(u, θ0)

]−eΛ0,2,θ∗ (t,θ0)

∫u∈(0,t]

e−Λ0,2,θ∗ (u,θ0) 1

(K0,θ∗(u))2

(1


)dH0,1,θ∗(u, θ0)

+1

K0,θ∗(t)

(1

nHn,1(t, θ0)−H0,1,θ∗(t, θ0)

)+eΛ0,2,θ∗ (t,θ0)

[∫u∈(0,t]

(1

nHn,1(u, θ0)−H0,1,θ∗(u, θ0)

)e−Λ0,2,θ∗ (u,θ0)

× 1

(K0,θ∗(u))2

(1


)(dK0,θ∗(u) + dH0,2,θ∗(u, θ0))

]−eΛ0,2,θ∗ (t,θ0)

(∫u∈(0,t]

e−Λ0,2,θ∗ (u,θ0)dΛ0,1,θ∗(u, θ0)

)× 1

K0,θ∗(t)

(1

nHn,2(t, θ0)−H0,2,θ∗(t, θ0)

)+eΛ0,2,θ∗ (t,θ0)

∫u∈(0,t]

e−Λ0,2,θ∗ (u,θ0) 1

K0,θ∗(u)

(1

nHn,2(u, θ0)−H0,2,θ∗(u, θ0)

)×dΛ0,1,θ∗(u, θ0)

−eΛ0,2,θ∗ (t,θ0)

[∫u∈(0,t]

(∫v∈(0,u]

e−Λ0,2,θ∗ (v,θ0)dΛ0,1,θ∗(v, θ0)

)× 1

(K0,θ∗(u))2

(1

nHn,2(u, θ0)−H0,2,θ∗(u, θ0)

)dK0,θ∗(u)

]+oP (n−

12 ).

If we change the order of the integrals, the fourth term in the expression (7.7) can be written in the form $\int F_n\,dG$. $v^*_{\theta^*}(\theta_0)$ can thus be calculated based on the variances and covariances of $\sqrt{n}\left(\frac{1}{n}H_{n,j}(u,\theta_0) - H_{0,j,\theta^*}(u,\theta_0)\right)$ and $\sqrt{n}\left(\frac{1}{n}K_n(u) - K_{0,\theta^*}(u)\right)$, for $j = 1, 2$ and $u \in (0,\tau]$.


The fourth item on the right-hand side of equation (7.7) can be written as

√n

∫ τ

0

(Rn(t, θ0)−R0,θ∗(t, θ0)

)Gθ∗(t, θ0)dR0,θ∗(t)

=7∑i=1

T cn,i + oP (1),

where

T cn,1 =√n

∫u∈(0,τ ]

(∫w∈(0,u]

e−Λ0,2,θ∗ (w,θ0)dΛ0,1,θ∗(w, θ0)

)×(∫

v∈(u,τ ]

eΛ0,2,θ∗ (v,θ0)Gθ∗(v, θ0)dR0,θ∗(v)

)× 1

(K0,θ∗(u))2

(1


)dH0,2,θ∗(u, θ0),

T cn,2 = −√n

∫u∈(0,τ ]

(∫v∈[u,τ ]

eΛ0,2,θ∗ (v,θ0)Gθ∗(v, θ0)dR0,θ∗(v)

)×e−Λ0,2,θ∗ (u,θ0) 1

(K0,θ∗(u))2

(1


)dH0,1,θ∗(u, θ0),

T cn,3 =√n

∫ τ

0

1

K0,θ∗(u)

(1

nHn,1(u, θ0)−H0,1,θ∗(u, θ0)

)Gθ∗(u, θ0)dR0,θ∗(u),

T cn,4 =√n

∫u∈(0,τ ]

(∫v∈[u,τ ]

eΛ0,2,θ∗ (v,θ0)Gθ∗(v, θ0)dR0,θ∗(v)

)×e−Λ0,2,θ∗ (u,θ0) 1

(K0,θ∗(u))2

(1

nHn,1(u, θ0)−H0,1,θ∗(u, θ0)

)× (dK0,θ∗(u) + dH0,2,θ∗(u, θ0)) ,

T cn,5 = −√n

∫u∈(0,τ ]

(∫v∈(0,u]

e−Λ0,2,θ∗ (v,θ0)dΛ0,1,θ∗(v, θ0)

)×eΛ0,2,θ∗ (u,θ0) 1

K0,θ∗(u)

(1

nHn,2(u, θ0)−H0,2,θ∗(u, θ0)

)×Gθ∗(u, θ0)dR0,θ∗(u),

T cn,6 =√n

∫u∈(0,τ ]

(∫v∈[u,τ ]

eΛ0,2,θ∗ (v,θ0)Gθ∗(v, θ0)dR0,θ∗(v)

)×e−Λ0,2,θ∗ (u,θ0) 1

K0,θ∗(u)

(1

nHn,2(u, θ0)−H0,2,θ∗(u, θ0)

)dΛ0,1,θ∗(u, θ0),

and


T cn,7 = −√n

[∫u∈(0,τ ]

(∫w∈(0,u]

e−Λ0,2,θ∗ (w,θ0)dΛ0,1,θ∗(w, θ0)

)×(∫

v∈[u,τ ]

eΛ0,2,θ∗ (v,θ0)Gθ∗(v, θ0)dR0,θ∗(v)

)× 1

(K0,θ∗(u))2

(1

nHn,2(u, θ0)−H0,2,θ∗(u, θ0)

)dK0,θ∗(u)

].

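7.2.4 Proof of Theorem 4.4.1

Proof. Under $H_{1n}: \theta = \theta_{1n} = \theta_0 + h/\sqrt{n}$,
\[
\begin{aligned}
\frac{1}{n}Q_n(\theta_0)
&= \frac{1}{n}\sum_{i=1}^n \int_0^\tau \left[ g_{n,i}(t,\theta_0) - \frac{S_n(t,\theta_0)}{K_n(t)}\left(e^{-Z_i\theta_{0,1}} + e^{-Z_i\theta_{0,2}}R_n(t,\theta_0)\right) \right] dN_i(t) \\
&= \frac{1}{n}\sum_{i=1}^n \int_0^\tau A_{n,i}(t,\theta_0)\,dN_i(t) \\
&= \frac{1}{n}\sum_{i=1}^n \int_0^\tau A_{n,i}(t,\theta_{1n})\,dN_i(t) + \frac{1}{n}\sum_{i=1}^n \int_0^\tau \left[A_{n,i}(t,\theta_0) - A_{n,i}(t,\theta_{1n})\right] dN_i(t) \\
&= T^d_{n,1} + T^d_{n,2},
\end{aligned}
\]
where
\[
\begin{aligned}
T^d_{n,1} &= \frac{1}{n}\sum_{i=1}^n \int_0^\tau A_{n,i}(t,\theta_{1n})\,dN_i(t), \\
T^d_{n,2} &= \frac{1}{n}\sum_{i=1}^n \int_0^\tau \left[A_{n,i}(t,\theta_0) - A_{n,i}(t,\theta_{1n})\right] dN_i(t).
\end{aligned}
\]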

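By Theorem A2 of Yang and Prentice (2005) [51],

$$T^d_{n,1} = \frac{1}{n}\sum_{i}\int_0^\tau A_{n,i}(t,\theta_{1n})\,dN_i(t) = \frac{1}{n}\sum_{i=1}^{n}\int_0^\tau B_{n,i}(t,\theta_{1n})\,dM_{i,\theta_{1n}}(t),$$

where

$$M_{i,\theta_{1n}}(t) = N_i(t) - \int_0^t \frac{Y_i(s)}{e^{-Z_i\theta_{1n,1}} + e^{-Z_i\theta_{1n,2}}R_{0,\theta_0}(s)}\,dR_{0,\theta_0}(s)$$

is the martingale associated with the $i$th individual.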

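It can be shown that

$$\lim_{n\to\infty} E_{\theta_{1n}}\Big[\frac{V_n(\theta_0)}{n}\Big] = \lim_{n\to\infty} E_{\theta_{1n}}\Big[\frac{1}{n}\sum_{i}\int_0^\tau \frac{B_{n,i}(t,\theta_0)\,B_{n,i}(t,\theta_0)'\,Y_i(t)}{e^{-Z_i\theta_{1n,1}} + e^{-Z_i\theta_{1n,2}}R_n(t,\theta_0)}\,dR_n(t,\theta_0)\Big] = v_{\theta_0}(\theta_0).$$

Hence,

$$\sqrt{n}\,T^d_{n,1} \xrightarrow{d} N(0, v_{\theta_0}(\theta_0)),$$

where the limiting variance is the same as that under the null hypothesis.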

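Next, we obtain the limiting behavior of $T^d_{n,2}$. By the uniform strong law of large numbers and the definitions of $R_n(t,\theta_c)$ and $R_{0,\theta}(t,\theta_c)$, we have

$$R_0(t) = R_{0,\theta_0}(t,\theta_0) = \lim_{n\to\infty} E_{\theta_{1n}}[R_n(t,\theta_{1n})] = \lim_{n\to\infty} E_{\theta_{1n}}[R_n(t,\theta_0)] = \lim_{n\to\infty} E_{\theta_0}[R_n(t,\theta_{1n})].$$

Write $R_{\theta_{1n}}(t,\theta_{1n})$ for $E_{\theta_{1n}}[R_n(t,\theta_{1n})]$, $R_{\theta_{1n}}(t,\theta_0)$ for $E_{\theta_{1n}}[R_n(t,\theta_0)]$, and $R_{\theta_0}(t,\theta_{1n})$ for $E_{\theta_0}[R_n(t,\theta_{1n})]$.

In addition, $\sqrt{n}\big(R_n(t,\theta_0) - R_{0,\theta_0}(t,\theta_0)\big) \overset{d}{=} \sqrt{n}\big(R_n(t,\theta_{1n}) - R_{\theta_{1n}}(t,\theta_{1n})\big)$, which also has the same asymptotic distribution as $\sqrt{n}\big(R_n(t,\theta_0) - R_{0,\theta_0}(t,\theta_0)\big)$ under the null hypothesis, when the true parameter is $\theta = \theta_0$. A Taylor expansion then shows that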


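$$\begin{aligned}
T^d_{n,2}
&= \frac{1}{n}\sum_{i}\int_0^\tau \big[A_{n,i}(t,\theta_0) - A_{n,i}(t,\theta_{1n})\big]\,dN_i(t) \\
&= \frac{1}{n}\sum_{i}\int_0^\tau \big[A_i(t,\theta_0,R_{0,\theta_0}(t,\theta_0)) + A_{i,R}(t,\theta_0,R_{0,\theta_0}(t,\theta_0))\big(R_n(t,\theta_0) - R_{0,\theta_0}(t,\theta_0)\big) \\
&\qquad - A_{n,i}(t,\theta_{1n},R_{\theta_{1n}}(t,\theta_{1n})) - A_{n,i,R}(t,\theta_{1n},R_{\theta_{1n}}(t,\theta_{1n}))\big(R_n(t,\theta_{1n}) - R_{\theta_{1n}}(t,\theta_{1n})\big)\big]\,dN_i(t) + o_P(n^{-1/2}) \\
&= \frac{1}{n}\sum_{i}\int_0^\tau \big[A_i(t,\theta_0,R_{0,\theta_0}(t)) + A_{i,R}(t,\theta_0,R_{0,\theta_0}(t))\big(R_n(t,\theta_0) - R_{0,\theta_0}(t,\theta_0)\big) \\
&\qquad - A_{n,i}(t,\theta_{1n},R_{0,\theta_0}(t)) - A_{n,i,R}(t,\theta_{1n},R_{0,\theta_0}(t))\big(R_n(t,\theta_0) - R_{0,\theta_0}(t,\theta_0)\big)\big]\,dN_i(t) + o_P(n^{-1/2}) \\
&= \frac{1}{n}\sum_{i}\int_0^\tau \big[A_i(t,\theta_0,R_{0,\theta_0}(t)) - A_{n,i}(t,\theta_{1n},R_{0,\theta_0}(t))\big]\,dN_i(t) \\
&\qquad + \int_0^\tau \big(R_n(t,\theta_0) - R_{0,\theta_0}(t,\theta_0)\big) \times E_{\theta_{1n}}\Big[\frac{1}{n}\sum_{i}\big(A_{i,R}(t,\theta_0,R_{0,\theta_0}(t)) - A_{n,i,R}(t,\theta_{1n},R_{0,\theta_0}(t))\big)\,dN_i(t)\Big] + o_P(n^{-1/2}) \\
&= \frac{1}{n}\sum_{i}\int_0^\tau \big[A_i(t,\theta_0,R_{0,\theta_0}(t)) - A_{n,i}(t,\theta_{1n},R_{0,\theta_0}(t))\big]\,dN_i(t) + o_P(n^{-1/2}).
\end{aligned}$$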

Let $A_{i,\theta_c}(t,\theta_0,R_{0,\theta_0}(t))$ denote the derivative of $A_i(t,\theta_c,R_{0,\theta_0}(t))$ with respect to $\theta_c$, evaluated at $\theta_0$. Let $A_{i,1}$ and $A_{i,2}$ denote the first and second rows of $A_i$, respectively. It can be derived that $A_{i,\theta_c}(t,\theta_0,R_{0,\theta_0}(t))$ has the following expression:


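$$A_{i,\theta_c}(t,\theta_0,R_{0,\theta_0}(t)) = \begin{pmatrix} A_{i,11,\theta_c}(t,\theta_0,R_{0,\theta_0}(t)) & A_{i,12,\theta_c}(t,\theta_0,R_{0,\theta_0}(t)) \\ A_{i,21,\theta_c}(t,\theta_0,R_{0,\theta_0}(t)) & A_{i,22,\theta_c}(t,\theta_0,R_{0,\theta_0}(t)) \end{pmatrix},$$

where

$$\begin{aligned}
A_{i,11,\theta_c}(t,\theta_0,R_{0,\theta_0}(t))
&= \frac{\partial A_{i,1}(t,\theta_c,R_{0,\theta_0}(t))}{\partial\theta_{c,1}}\bigg|_{\theta_c=\theta_0} \\
&= -\frac{Z_i^2 e^{-Z_i\theta_{0,1}} e^{-Z_i\theta_{0,2}} R_{0,\theta_0}(t)}{\big(e^{-Z_i\theta_{0,1}} + e^{-Z_i\theta_{0,2}}R_{0,\theta_0}(t)\big)^2} \\
&\quad - \big(e^{-Z_i\theta_{0,1}} + e^{-Z_i\theta_{0,2}}R_{0,\theta_0}(t)\big)\,
\frac{\sum_k \dfrac{Z_k^2 e^{-Z_k\theta_{0,1}}\big(e^{-Z_k\theta_{0,1}} - e^{-Z_k\theta_{0,2}}R_{0,\theta_0}(t)\big)Y_k(t)}{\big(e^{-Z_k\theta_{0,1}} + e^{-Z_k\theta_{0,2}}R_{0,\theta_0}(t)\big)^3}}{\sum_k Y_k(t)} \\
&\quad + \big(Z_i e^{-Z_i\theta_{0,1}}\big)\,
\frac{\sum_k \dfrac{Z_k e^{-Z_k\theta_{0,1}}Y_k(t)}{\big(e^{-Z_k\theta_{0,1}} + e^{-Z_k\theta_{0,2}}R_{0,\theta_0}(t)\big)^2}}{\sum_k Y_k(t)},
\end{aligned}$$

$$\begin{aligned}
A_{i,12,\theta_c}(t,\theta_0,R_{0,\theta_0}(t))
&= \frac{\partial A_{i,1}(t,\theta_c,R_{0,\theta_0}(t))}{\partial\theta_{c,2}}\bigg|_{\theta_c=\theta_0} \\
&= \frac{Z_i^2 e^{-Z_i\theta_{0,1}} e^{-Z_i\theta_{0,2}} R_{0,\theta_0}(t)}{\big(e^{-Z_i\theta_{0,1}} + e^{-Z_i\theta_{0,2}}R_{0,\theta_0}(t)\big)^2} \\
&\quad - \big(e^{-Z_i\theta_{0,1}} + e^{-Z_i\theta_{0,2}}R_{0,\theta_0}(t)\big)R_{0,\theta_0}(t)\,
\frac{\sum_k \dfrac{2Z_k^2 e^{-Z_k\theta_{0,1}} e^{-Z_k\theta_{0,2}} Y_k(t)}{\big(e^{-Z_k\theta_{0,1}} + e^{-Z_k\theta_{0,2}}R_{0,\theta_0}(t)\big)^3}}{\sum_k Y_k(t)} \\
&\quad + \big(Z_i e^{-Z_i\theta_{0,2}}R_{0,\theta_0}(t)\big)\,
\frac{\sum_k \dfrac{Z_k e^{-Z_k\theta_{0,1}}Y_k(t)}{\big(e^{-Z_k\theta_{0,1}} + e^{-Z_k\theta_{0,2}}R_{0,\theta_0}(t)\big)^2}}{\sum_k Y_k(t)},
\end{aligned}$$

$$\begin{aligned}
A_{i,21,\theta_c}(t,\theta_0,R_{0,\theta_0}(t))
&= \frac{\partial A_{i,2}(t,\theta_c,R_{0,\theta_0}(t))}{\partial\theta_{c,1}}\bigg|_{\theta_c=\theta_0} \\
&= \frac{Z_i^2 e^{-Z_i\theta_{0,1}} e^{-Z_i\theta_{0,2}} R_{0,\theta_0}(t)}{\big(e^{-Z_i\theta_{0,1}} + e^{-Z_i\theta_{0,2}}R_{0,\theta_0}(t)\big)^2} \\
&\quad - \big(e^{-Z_i\theta_{0,1}} + e^{-Z_i\theta_{0,2}}R_{0,\theta_0}(t)\big)R_{0,\theta_0}(t)\,
\frac{\sum_k \dfrac{2Z_k^2 e^{-Z_k\theta_{0,1}} e^{-Z_k\theta_{0,2}} Y_k(t)}{\big(e^{-Z_k\theta_{0,1}} + e^{-Z_k\theta_{0,2}}R_{0,\theta_0}(t)\big)^3}}{\sum_k Y_k(t)} \\
&\quad + \big(Z_i e^{-Z_i\theta_{0,1}}R_{0,\theta_0}(t)\big)\,
\frac{\sum_k \dfrac{Z_k e^{-Z_k\theta_{0,2}}Y_k(t)}{\big(e^{-Z_k\theta_{0,1}} + e^{-Z_k\theta_{0,2}}R_{0,\theta_0}(t)\big)^2}}{\sum_k Y_k(t)},
\end{aligned}$$

$$\begin{aligned}
A_{i,22,\theta_c}(t,\theta_0,R_{0,\theta_0}(t))
&= \frac{\partial A_{i,2}(t,\theta_c,R_{0,\theta_0}(t))}{\partial\theta_{c,2}}\bigg|_{\theta_c=\theta_0} \\
&= -\frac{Z_i^2 e^{-Z_i\theta_{0,1}} e^{-Z_i\theta_{0,2}} R_{0,\theta_0}(t)}{\big(e^{-Z_i\theta_{0,1}} + e^{-Z_i\theta_{0,2}}R_{0,\theta_0}(t)\big)^2} \\
&\quad + \big(e^{-Z_i\theta_{0,1}} + e^{-Z_i\theta_{0,2}}R_{0,\theta_0}(t)\big)R_{0,\theta_0}(t)\,
\frac{\sum_k \dfrac{Z_k^2 e^{-Z_k\theta_{0,2}}\big(e^{-Z_k\theta_{0,1}} - e^{-Z_k\theta_{0,2}}R_{0,\theta_0}(t)\big)Y_k(t)}{\big(e^{-Z_k\theta_{0,1}} + e^{-Z_k\theta_{0,2}}R_{0,\theta_0}(t)\big)^3}}{\sum_k Y_k(t)} \\
&\quad + \big(Z_i e^{-Z_i\theta_{0,2}}R^2_{0,\theta_0}(t)\big)\,
\frac{\sum_k \dfrac{Z_k e^{-Z_k\theta_{0,2}}Y_k(t)}{\big(e^{-Z_k\theta_{0,1}} + e^{-Z_k\theta_{0,2}}R_{0,\theta_0}(t)\big)^2}}{\sum_k Y_k(t)}.
\end{aligned}$$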
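As a sanity check on the signs in these four expressions, the sketch below compares the first term of each entry (the term that carries no sum over $k$) with a finite-difference derivative. It assumes, purely for illustration, that the individual-specific part of $A_i$ is the pair $\big(Z e^{-Z\theta_1}/D,\; Z e^{-Z\theta_2}R/D\big)$ with $D = e^{-Z\theta_1} + e^{-Z\theta_2}R$; this form is not stated in this section, and the function names `f`, `first_terms`, `numeric_jacobian` and `max_abs_gap` are hypothetical.

```python
import math


def f(theta1, theta2, z, r):
    """Assumed individual-specific part of A_i (illustrative form only)."""
    a = math.exp(-z * theta1)
    b = math.exp(-z * theta2)
    d = a + b * r
    return (z * a / d, z * b * r / d)


def first_terms(theta1, theta2, z, r):
    """First term of each derivative entry: +/- Z^2 e^{-Z th1} e^{-Z th2} R / D^2."""
    a = math.exp(-z * theta1)
    b = math.exp(-z * theta2)
    d = a + b * r
    c = z * z * a * b * r / (d * d)
    # Rows: component of f; columns: derivative w.r.t. theta1, theta2.
    return ((-c, c), (c, -c))


def numeric_jacobian(theta1, theta2, z, r, eps=1e-6):
    """Central finite differences of f with respect to (theta1, theta2)."""
    up1 = f(theta1 + eps, theta2, z, r)
    dn1 = f(theta1 - eps, theta2, z, r)
    up2 = f(theta1, theta2 + eps, z, r)
    dn2 = f(theta1, theta2 - eps, z, r)
    return tuple(
        ((up1[row] - dn1[row]) / (2 * eps), (up2[row] - dn2[row]) / (2 * eps))
        for row in range(2)
    )


def max_abs_gap(theta1=0.3, theta2=-0.5, z=1.0, r=0.7):
    """Largest discrepancy between the closed form and the numeric derivative."""
    exact = first_terms(theta1, theta2, z, r)
    approx = numeric_jacobian(theta1, theta2, z, r)
    return max(abs(exact[i][j] - approx[i][j]) for i in range(2) for j in range(2))


if __name__ == "__main__":
    print(max_abs_gap())
```

Under the assumed form, the sign pattern $(-, +; +, -)$ of the first terms agrees with the entries $A_{i,11,\theta_c}$, $A_{i,12,\theta_c}$, $A_{i,21,\theta_c}$ and $A_{i,22,\theta_c}$ above. The remaining terms, which involve the risk-set sums, are not covered by this check.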

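By Taylor expansion, $T^d_{n,2}$ can be expressed as

$$\begin{aligned}
T^d_{n,2}
&= \frac{1}{n}\sum_{i}\int_0^\tau \big[A_i(t,\theta_0,R_{0,\theta_0}(t)) - A_i(t,\theta_0,R_{0,\theta_0}(t)) - A_{i,\theta_c}(t,\theta_0,R_{0,\theta_0}(t))(\theta_{1n} - \theta_0)\big]\,dN_i(t) \\
&= -\frac{1}{n}\sum_{i}\int_0^\tau \Big(A_{i,\theta_c}(t,\theta_0,R_{0,\theta_0}(t))\,\frac{h}{\sqrt{n}}\Big)\,dN_i(t) + o_P(n^{-1/2}).
\end{aligned}$$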
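The order of the remainder in this expansion can be illustrated numerically. The sketch below uses a hypothetical smooth scalar function `a1` (an assumed stand-in for one component of $A_i$, not the thesis's definition) and checks that $\sqrt{n}$ times the difference between $a_1(\theta_0) - a_1(\theta_0 + h/\sqrt{n})$ and its linear approximation $-\nabla a_1(\theta_0)\cdot h/\sqrt{n}$ shrinks as $n$ grows, which is what an $o(n^{-1/2})$ remainder means.

```python
import math


def a1(theta, z=1.0, r=0.7):
    """Hypothetical smooth function of theta = (theta1, theta2); assumed form."""
    a = math.exp(-z * theta[0])
    b = math.exp(-z * theta[1])
    return z * a / (a + b * r)


def grad_a1(theta, z=1.0, r=0.7):
    """Exact gradient of a1 with respect to (theta1, theta2)."""
    a = math.exp(-z * theta[0])
    b = math.exp(-z * theta[1])
    c = z * z * a * b * r / (a + b * r) ** 2
    return (-c, c)


def scaled_remainder(n, theta0=(0.3, -0.5), h=(1.0, -1.0)):
    """sqrt(n) * [ a1(theta0) - a1(theta0 + h/sqrt(n)) + grad . h / sqrt(n) ]."""
    root_n = math.sqrt(n)
    theta1n = (theta0[0] + h[0] / root_n, theta0[1] + h[1] / root_n)
    g = grad_a1(theta0)
    diff = a1(theta0) - a1(theta1n)
    linear = -(g[0] * h[0] + g[1] * h[1]) / root_n
    return root_n * (diff - linear)


if __name__ == "__main__":
    for n in (10**2, 10**4, 10**6):
        print(n, scaled_remainder(n))
```

For a twice-differentiable function the remainder is of order $1/n$, so the scaled value printed here decays like $n^{-1/2}$.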

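It suffices to show that

$$\sqrt{n}\,T^d_{n,2} \xrightarrow{P} \xi_0,$$

where

$$\xi_0 = A_0 \cdot h,$$

and

$$\begin{aligned}
A_0 &= \lim_{n\to\infty} E_{\theta_{1n}}\Big[-\frac{1}{n}\sum_{i}\int_0^\tau A_{i,\theta_c}(t,\theta_0,R_{0,\theta_0}(t))\,dN_i(t)\Big] \\
&= \lim_{n\to\infty} E_{\theta_0}\Big[-\frac{1}{n}\sum_{i}\int_0^\tau A_{i,\theta_c}(t,\theta_0,R_{0,\theta_0}(t))\,dN_i(t)\Big] \\
&= \begin{pmatrix} A_{0,11} & A_{0,12} \\ A_{0,21} & A_{0,22} \end{pmatrix}.
\end{aligned}$$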


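The entries of $A_0$ are given by

$$\begin{aligned}
A_{0,11} = \int_0^\tau E_{\theta_0}\Bigg[\Bigg(
&\frac{Z^2 e^{-Z\theta_{0,1}} e^{-Z\theta_{0,2}} R_{0,\theta_0}(t)}{\big(e^{-Z\theta_{0,1}} + e^{-Z\theta_{0,2}}R_{0,\theta_0}(t)\big)^2} \\
&+ \big(e^{-Z\theta_{0,1}} + e^{-Z\theta_{0,2}}R_{0,\theta_0}(t)\big)\,
\frac{E_{\theta_0}\Big[\dfrac{Z^2 e^{-Z\theta_{0,1}}\big(e^{-Z\theta_{0,1}} - e^{-Z\theta_{0,2}}R_{0,\theta_0}(t)\big)S_{T|Z}(t)S_{C|Z}(t)}{\big(e^{-Z\theta_{0,1}} + e^{-Z\theta_{0,2}}R_{0,\theta_0}(t)\big)^3}\Big]}{E_{\theta_0}\big[S_{T|Z}(t)S_{C|Z}(t)\big]} \\
&- \big(Z e^{-Z\theta_{0,1}}\big)\,
\frac{E_{\theta_0}\Big[\dfrac{Z e^{-Z\theta_{0,1}}S_{T|Z}(t)S_{C|Z}(t)}{\big(e^{-Z\theta_{0,1}} + e^{-Z\theta_{0,2}}R_{0,\theta_0}(t)\big)^2}\Big]}{E_{\theta_0}\big[S_{T|Z}(t)S_{C|Z}(t)\big]}
\Bigg) \times \frac{S_{T|Z}(t)S_{C|Z}(t)}{e^{-Z\theta_{0,1}} + e^{-Z\theta_{0,2}}R_{0,\theta_0}(t)}\Bigg]\,dR_{0,\theta_0}(t),
\end{aligned}$$

$$\begin{aligned}
A_{0,12} = \int_0^\tau E_{\theta_0}\Bigg[\Bigg(
&-\frac{Z^2 e^{-Z\theta_{0,1}} e^{-Z\theta_{0,2}} R_{0,\theta_0}(t)}{\big(e^{-Z\theta_{0,1}} + e^{-Z\theta_{0,2}}R_{0,\theta_0}(t)\big)^2} \\
&+ \big(e^{-Z\theta_{0,1}} + e^{-Z\theta_{0,2}}R_{0,\theta_0}(t)\big)R_{0,\theta_0}(t)\,
\frac{E_{\theta_0}\Big[\dfrac{2Z^2 e^{-Z\theta_{0,1}} e^{-Z\theta_{0,2}} S_{T|Z}(t)S_{C|Z}(t)}{\big(e^{-Z\theta_{0,1}} + e^{-Z\theta_{0,2}}R_{0,\theta_0}(t)\big)^3}\Big]}{E_{\theta_0}\big[S_{T|Z}(t)S_{C|Z}(t)\big]} \\
&- \big(Z e^{-Z\theta_{0,2}}R_{0,\theta_0}(t)\big)\,
\frac{E_{\theta_0}\Big[\dfrac{Z e^{-Z\theta_{0,1}}S_{T|Z}(t)S_{C|Z}(t)}{\big(e^{-Z\theta_{0,1}} + e^{-Z\theta_{0,2}}R_{0,\theta_0}(t)\big)^2}\Big]}{E_{\theta_0}\big[S_{T|Z}(t)S_{C|Z}(t)\big]}
\Bigg) \times \frac{S_{T|Z}(t)S_{C|Z}(t)}{e^{-Z\theta_{0,1}} + e^{-Z\theta_{0,2}}R_{0,\theta_0}(t)}\Bigg]\,dR_{0,\theta_0}(t),
\end{aligned}$$

$$\begin{aligned}
A_{0,21} = \int_0^\tau E_{\theta_0}\Bigg[\Bigg(
&-\frac{Z^2 e^{-Z\theta_{0,1}} e^{-Z\theta_{0,2}} R_{0,\theta_0}(t)}{\big(e^{-Z\theta_{0,1}} + e^{-Z\theta_{0,2}}R_{0,\theta_0}(t)\big)^2} \\
&+ \big(e^{-Z\theta_{0,1}} + e^{-Z\theta_{0,2}}R_{0,\theta_0}(t)\big)R_{0,\theta_0}(t)\,
\frac{E_{\theta_0}\Big[\dfrac{2Z^2 e^{-Z\theta_{0,1}} e^{-Z\theta_{0,2}} S_{T|Z}(t)S_{C|Z}(t)}{\big(e^{-Z\theta_{0,1}} + e^{-Z\theta_{0,2}}R_{0,\theta_0}(t)\big)^3}\Big]}{E_{\theta_0}\big[S_{T|Z}(t)S_{C|Z}(t)\big]} \\
&- \big(Z e^{-Z\theta_{0,1}}R_{0,\theta_0}(t)\big)\,
\frac{E_{\theta_0}\Big[\dfrac{Z e^{-Z\theta_{0,2}}S_{T|Z}(t)S_{C|Z}(t)}{\big(e^{-Z\theta_{0,1}} + e^{-Z\theta_{0,2}}R_{0,\theta_0}(t)\big)^2}\Big]}{E_{\theta_0}\big[S_{T|Z}(t)S_{C|Z}(t)\big]}
\Bigg) \times \frac{S_{T|Z}(t)S_{C|Z}(t)}{e^{-Z\theta_{0,1}} + e^{-Z\theta_{0,2}}R_{0,\theta_0}(t)}\Bigg]\,dR_{0,\theta_0}(t),
\end{aligned}$$

$$\begin{aligned}
A_{0,22} = \int_0^\tau E_{\theta_0}\Bigg[\Bigg(
&\frac{Z^2 e^{-Z\theta_{0,1}} e^{-Z\theta_{0,2}} R_{0,\theta_0}(t)}{\big(e^{-Z\theta_{0,1}} + e^{-Z\theta_{0,2}}R_{0,\theta_0}(t)\big)^2} \\
&- \big(e^{-Z\theta_{0,1}} + e^{-Z\theta_{0,2}}R_{0,\theta_0}(t)\big)R_{0,\theta_0}(t)\,
\frac{E_{\theta_0}\Big[\dfrac{Z^2 e^{-Z\theta_{0,2}}\big(e^{-Z\theta_{0,1}} - e^{-Z\theta_{0,2}}R_{0,\theta_0}(t)\big)S_{T|Z}(t)S_{C|Z}(t)}{\big(e^{-Z\theta_{0,1}} + e^{-Z\theta_{0,2}}R_{0,\theta_0}(t)\big)^3}\Big]}{E_{\theta_0}\big[S_{T|Z}(t)S_{C|Z}(t)\big]} \\
&- \big(Z e^{-Z\theta_{0,2}}R^2_{0,\theta_0}(t)\big)\,
\frac{E_{\theta_0}\Big[\dfrac{Z e^{-Z\theta_{0,2}}S_{T|Z}(t)S_{C|Z}(t)}{\big(e^{-Z\theta_{0,1}} + e^{-Z\theta_{0,2}}R_{0,\theta_0}(t)\big)^2}\Big]}{E_{\theta_0}\big[S_{T|Z}(t)S_{C|Z}(t)\big]}
\Bigg) \times \frac{S_{T|Z}(t)S_{C|Z}(t)}{e^{-Z\theta_{0,1}} + e^{-Z\theta_{0,2}}R_{0,\theta_0}(t)}\Bigg]\,dR_{0,\theta_0}(t).
\end{aligned}$$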

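We have obtained the limits of $\sqrt{n}\,T^d_{n,1}$ and $\sqrt{n}\,T^d_{n,2}$: the first converges in distribution to $N(0, v_{\theta_0}(\theta_0))$ and the second converges in probability to $\xi_0$. Since $\frac{1}{\sqrt{n}}Q_n(\theta_0) = \sqrt{n}\,T^d_{n,1} + \sqrt{n}\,T^d_{n,2}$, Slutsky's theorem gives

$$\frac{1}{\sqrt{n}}Q_n(\theta_0) \xrightarrow{d} N(\xi_0, v_{\theta_0}(\theta_0)), \quad \text{as } n\to\infty.$$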
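To show how a limit of this kind feeds into a power calculation, the following sketch evaluates the local power of a two-degree-of-freedom quadratic-form test. It assumes, as an illustration rather than as the thesis's exact procedure, that the test statistic is $W_n = n^{-1}Q_n(\theta_0)'\,v^{-1}\,Q_n(\theta_0)$ with $v = v_{\theta_0}(\theta_0)$, so that $W_n$ is approximately noncentral chi-square with 2 degrees of freedom and noncentrality $\xi_0' v^{-1}\xi_0$ under $H_{1n}$. The function names and the numerical inputs are hypothetical.

```python
import math


def ncx2_sf_2df(x, lam, terms=200):
    """Survival function of a noncentral chi-square with 2 df.

    Uses the Poisson mixture of central chi-squares with 2(k+1) df, whose
    survival functions are exp(-x/2) * sum_{j<=k} (x/2)^j / j!.
    """
    half_x = x / 2.0
    half_lam = lam / 2.0
    pois = math.exp(-half_lam)  # Poisson(lam/2) weight at k = 0
    term = 1.0                  # (x/2)^j / j! at j = 0
    partial = 1.0               # running sum over j = 0..k
    total = 0.0
    for k in range(terms):
        total += pois * math.exp(-half_x) * partial
        pois *= half_lam / (k + 1)
        term *= half_x / (k + 1)
        partial += term
    return total


def quad_form(xi, v):
    """xi' v^{-1} xi for a 2x2 positive definite matrix v."""
    (a, b), (c, d) = v
    det = a * d - b * c
    inv = ((d / det, -b / det), (-c / det, a / det))
    return sum(xi[i] * inv[i][j] * xi[j] for i in range(2) for j in range(2))


def local_power(h, a0, v, alpha=0.05):
    """Approximate power at theta0 + h/sqrt(n), with xi0 = A0 h."""
    xi = (
        a0[0][0] * h[0] + a0[0][1] * h[1],
        a0[1][0] * h[0] + a0[1][1] * h[1],
    )
    crit = -2.0 * math.log(alpha)  # upper-alpha quantile of chi-square(2)
    return ncx2_sf_2df(crit, quad_form(xi, v))


if __name__ == "__main__":
    identity = ((1.0, 0.0), (0.0, 1.0))  # placeholder values, not from the thesis
    for scale in (0.0, 1.0, 2.0, 3.0):
        print(scale, local_power((scale, 0.0), identity, identity))
```

With $h = 0$ the function returns the nominal level $\alpha$, and the power increases with the size of $\xi_0 = A_0 h$ relative to $v$. A sample size can then be searched for by writing $h = \sqrt{n}(\theta_1 - \theta_0)$ for a fixed target alternative $\theta_1$ and increasing $n$ until the desired power is reached.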


Bibliography

[1] Andersen, P. K., Borgan, O., Gill, R. D. and Keiding, N. (1993). Statistical

models based on counting processes, USA: Springer-Verlag New York, Inc.

[2] Bennett, S. (1983). Analysis of survival data by the proportional odds model.

Statistics in Medicine, 2 (2), 273–277.

[3] Burton, A., Altman, D. G., Royston, P. and Holder, R. L. (2006). The design of

simulation studies in medical statistics. Statistics in Medicine, 25, 4279–4292.

[4] Cai, T., Wei, L. J. and Wilcox, M. (2000). Semiparametric regression analysis

for clustered failure time data. Biometrika, 87 (4), 867–878.

[5] Chen, L. M., Ibrahim, J. G. and Chu, H. (2011). Sample size and power determination in joint modeling of longitudinal and survival data. Statistics in Medicine, 30, 2295–2309.

[6] Chen, K., Jin, Z. and Ying, Z. (2002). Semiparametric analysis of transformation models with censored data. Biometrika, 89 (3), 659–668.

[7] Cheng, S. C., Wei, L. J. and Ying, Z. (1995). Analysis of transformation models

with censored data. Biometrika, 82 (4), 835–845.

[8] Cheng, S. C., Wei, L. J. and Ying, Z. (1997). Predicting survival probabilities

with semiparametric transformation models. Journal of the American Statistical

Association, 92 (437), 227–235.


[9] Collett, D. (2003). Modelling survival data in medical research (second edition),

USA: Chapman & Hall/CRC.

[10] Dabrowska, D. M. and Doksum, K. A. (1988). Estimation and testing in a

two-sample generalized odds-rate model. Journal of the American Statistical

Association, 83 (403), 744–749.

[11] Dupont, W. D. and Plummer, W. D. (1990). Power and sample size calculations:

a review and computer program. Controlled Clinical Trials, 11, 116–128.

[12] Donner, A. (1984). Approaches to sample size estimation in the design of clinical

trials-a review. Statistics in Medicine, 3, 199–214.

[13] Fine, J. P., Ying, Z. and Wei, L. J. (1998). On the linear transformation model

for censored data. Biometrika, 85 (4), 980–986.

[14] Fleming, T. R. and Harrington, D. P. (1991). Counting processes and survival

analysis, USA: John Wiley & Sons, Inc.

[15] Freedman, L. S. (1982). Tables of the number of patients required in clinical

trials using the logrank test. Statistics in Medicine, 1, 121–129.

[16] Gail, M. H. (1985). Applicability of sample size calculations based on a comparison of proportions for use with the logrank test. Controlled Clinical Trials, 6, 112–119.

[17] George, S. L. and Desu, M. M. (1974). Planning the size and duration of a clinical trial studying the time to some critical event. Journal of Chronic Diseases, 27, 15–24.

[18] Gill, R. D. and Johansen, S. (1990). A survey of product-integration with a view

toward application in survival analysis. The Annals of Statistics, 18, 1501–1555.


[19] Gu, M. and Lai, T. L. (1999). Determination of power and sample size in

the design of clinical trials with failure-time endpoints and interim analyses.

Controlled Clinical Trials, 20, 423–438.

[20] Halperin, M., Rogot, E., Gurian, J. and Ederer, F. (1968). Sample sizes for

medical trials with special reference to long-term therapy. Journal of Chronic

Diseases, 21, 13–24.

[21] Hsieh, F. Y. and Lavori, P. W. (2000). Sample-size calculations for the Cox proportional hazards regression model with nonbinary covariates. Controlled Clinical Trials, 21, 552–560.

[22] Hsieh, F. Y., Lavori, P. W., Cohen, H. J. and Feussner, J. R. (2003). An overview of variance inflation factors for sample-size calculation. Evaluation & the Health Professions, 26, 239–257.

[23] Kalbfleisch, J. D. and Prentice, R. L. (2002). The statistical analysis of failure

time data, Hoboken, New Jersey: John Wiley & Sons, Inc.

[24] Kotz, S., Johnson, N. L. and Boyd, D. W. (1967). Series representations of distributions of quadratic forms in normal variables II. Non-central case. The Annals of Mathematical Statistics, 38 (3), 838–848.

[25] Lachin, J. M. (1981). Introduction to sample size determination and power

analysis for clinical trials. Controlled Clinical Trials, 2, 93–113.

[26] Lachin, J. M. and Foulkes, M. A. (1986). Evaluation of sample size and power

for analyses of survival with allowance for nonuniform patient entry, losses to

follow-up, noncompliance, and stratification. Biometrics, 42 (3), 507–519.

[27] Lakatos, E. (1986). Sample size determination in clinical trials with time-

dependent rates of losses and noncompliance. Controlled Clinical Trials, 7 (3),

189–199.


[28] Lakatos, E. (1988). Sample sizes based on the log-rank statistic in complex

clinical trials. Biometrics, 44, 229–241.

[29] Lakatos, E. and Lan, K. K. G. (1992). A comparison of sample size methods

for the logrank statistic. Statistics in Medicine, 11, 179–191.

[30] Liu, H., Tang, Y. and Zhang, H. H. (2009). A new chi-square approximation to the distribution of non-negative definite quadratic forms in non-central normal variables. Computational Statistics and Data Analysis, 53, 853–856.

[31] Maki, E. (2006). Power and sample size considerations in clinical trials with competing risk endpoints. Pharmaceutical Statistics, 5, 159–171.

[32] Makuch, R. W. and Simon, R. M. (1982). Sample size requirements for com-

paring time-to-failure among k treatment groups. Journal of Chronic Diseases,

35, 861–867.

[33] Murphy, S. A., Rossini, A. J. and van der Vaart, A. W. (1997). Maximum likelihood estimation in the proportional odds model. Journal of the American Statistical Association, 92 (439), 968–976.

[34] Palta, M. and Amini, S. B. (1985). Consideration of covariates and stratification in sample size determination for survival time studies. Journal of Chronic Diseases, 38 (9), 801–809.

[35] Palta, M. and McHugh, R. (1979). Adjusting for losses to follow-up in sample

size determination for cohort studies. Journal of Chronic Diseases, 32, 315–326.

[36] Palta, M. and McHugh, R. (1980). Planning the size of a cohort study in the

presence of both losses to follow-up and non-compliance. Journal of Chronic

Diseases, 33, 501–512.

[37] Pasternack, B. S. (1972). Sample sizes for clinical trials designed for patient

accrual by cohorts. Journal of Chronic Diseases, 25, 673–681.


[38] Pasternack, B. S. and Gilbert, H. S. (1971). Planning the duration of long-

term survival time studies designed for accrual by cohorts. Journal of Chronic

Diseases, 24, 681–700.

[39] Pettitt, A. N. (1984). Proportional odds models for survival data and estimates using ranks. Journal of the Royal Statistical Society. Series C (Applied Statistics), 33 (2), 169–175.

[40] Rubinstein, L. V., Gail, M. H. and Santner, T. J. (1981). Planning the duration

of a comparative clinical trial with loss to follow-up and a period of continued

observation. Journal of Chronic Diseases, 34, 469–479.

[41] Schoenfeld, D. A. (1981). The asymptotic properties of nonparametric tests for

comparing survival distributions. Biometrika, 68, 316–319.

[42] Schoenfeld, D. A. (1983). Sample-size formula for the proportional-hazards regression model. Biometrics, 39, 499–503.

[43] Schoenfeld, D. A. and Richter, J. R. (1982). Nomograms for calculating the

number of patients needed for a clinical trial with survival as an endpoint.

Biometrics, 38, 163–170.

[44] Sellke, T. and Siegmund, D. (1983). Sequential analysis of proportional hazards

model. Biometrika, 70, 315–326.

[45] Shih, J. H. (1995). Sample size calculation for complex clinical trials with survival endpoints. Controlled Clinical Trials, 16, 395–407.

[46] Solomon, H. and Stephens, M. A. (1977). Distribution of a sum of weighted chi-square variables. Journal of the American Statistical Association, 72 (360), 881–885.

[47] Van der Vaart, A. W. (1998). Asymptotic statistics, New York, NY: Cambridge

University Press.


[48] Van der Vaart, A. W. and Wellner, J. A. (1996). Weak convergence and empirical processes: with applications to statistics, New York: Springer.

[49] Wang, S., Zhang, J. and Lu, W. (2012). Sample size calculation for the proportional hazards cure model. Statistics in Medicine, 31, 3959–3971.

[50] Wu, M., Fisher, M. and DeMets, D. (1980). Sample sizes for long-term medical

trial with time-dependent dropout and event rates. Controlled Clinical Trials,

1, 109–121.

[51] Yang, S. and Prentice, R. (2005). Semiparametric analysis of short-term and long-term hazard ratios with two-sample survival data. Biometrika, 92 (1), 1–17.

[52] Zeng, D. and Lin, D. Y. (2007). Maximum likelihood estimation in semiparametric regression models with censored data. Journal of the Royal Statistical Society. Series B: Statistical Methodology, 69 (4), 507–564.

[53] Zhen, B. and Murphy, J. R. (1994). Sample size determination for an exponential survival model with an unrestricted covariate. Statistics in Medicine, 13, 391–397.
