
Mediana manual

September 15, 2014

1 Installation

2 Clinical scenario evaluation framework

3 Main components of Mediana package

4 Case studies

Prepared by the Mediana development team

Expert team: Keaven Anderson (Merck), Frank Harrell (Vanderbilt University), Mani Lakshminarayanan (Pfizer), Jose Pinheiro (Johnson and Johnson), Thomas Schmelter (Bayer).

Package design: Alex Dmitrienko (Quintiles).

Core development team: Alex Dmitrienko (Quintiles), Gautier Paux (Servier), Ilya Lipkovich (Quintiles).

Extended development team: Thomas Brechenmacher (Novartis), Ming-Dauh Wang (Lilly).

1 Installation

To access the latest version of the Mediana package, enter the following command at the R prompt:

source("http://biopharmnet.com/doc/mediana_code.txt")

The current version of Mediana requires the following R packages: mvtnorm, survival, MASS and ReporteRs. Note that the ReporteRs package is used for generating Microsoft Word-based reports and, in turn, uses the rJava package.

The Mediana package is under active development and new features are added on a regular basis. The latest version of this manual can be downloaded from the package’s web site:

http://biopharmnet.com/wiki/Mediana_package

2 Clinical scenario evaluation framework

Benda et al. (2010) introduced the Clinical Scenario Evaluation (CSE) approach which aims at a systematic simulation-based assessment of study designs and analysis methods in individual clinical trials or, more generally, in clinical development programs. The key elements of this approach include

• assumptions (data models) that define the process of generating study data;

• options (analysis models) that describe the analysis methods applied to the study data;

• metrics (evaluation models) that specify the measures for evaluating the performance of the analysis methods.

The data and analysis models specify clinical scenarios. The evaluation models are applied to clinical scenarios to perform clinical scenario evaluation. The CSE framework enables clinical trial researchers to carry out systematic quantitative assessment of the operating characteristics of candidate designs and statistical methods to characterize their performance in multiple settings. Ultimately, this assessment supports the selection of a robust approach to clinical trial design and analysis which demonstrates optimal performance.

It is important to note that, within this general framework, power or other operating characteristics of candidate designs and statistical methods are generally difficult to compute analytically. Power calculations are therefore typically performed using a simulation-based approach.
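As a minimal illustration of the simulation-based approach, the following base R sketch estimates the power of a one-sided two-sample t-test by repeated sampling. It does not use the Mediana package; the function name simulate.power and all parameter values are assumptions chosen for illustration only.

```r
# Hypothetical helper (not part of Mediana): Monte Carlo power of a
# one-sided two-sample t-test with normally distributed outcomes.
simulate.power = function(n, mean.control, mean.treatment, sd,
                          alpha = 0.025, n.sims = 2000) {
  p.values = replicate(n.sims, {
    control = rnorm(n, mean = mean.control, sd = sd)
    treatment = rnorm(n, mean = mean.treatment, sd = sd)
    # One-sided alternative: larger mean in the treatment arm
    t.test(treatment, control, alternative = "greater",
           var.equal = TRUE)$p.value
  })
  # Power estimate = proportion of simulated trials with a significant test
  mean(p.values <= alpha)
}

set.seed(2014)
# Illustrative values only: 100 patients per arm, effect size of 0.4 SD
power.estimate = simulate.power(n = 100, mean.control = 0,
                                mean.treatment = 0.4, sd = 1)
print(power.estimate)
```

Increasing n.sims reduces the Monte Carlo error of the estimate at the cost of a longer run time.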

For additional information on the CSE framework, see Friede et al. (2010). Conceptually similar approaches to a quantitative evaluation of multiple analysis strategies were also proposed in Dmitrienko et al. (2011) and Millen and Dmitrienko (2011).

3 Main components of Mediana package

Mediana is an R package which provides a general framework for clinical trial simulations based on the CSE approach. The package supports a broad class of data models (including clinical trials with continuous, binary, survival-type and count-type endpoints as well as multivariate outcomes that are based on combinations of different endpoints), analysis strategies and commonly used evaluation criteria.

This section provides a detailed description of the main components of the CSE framework implemented in the Mediana package. Key options for data, analysis and evaluation models are defined below. Multiple case studies are provided in Section 4 to illustrate the process of setting up data, analysis and evaluation models in clinical trials.

It is important to point out that the Mediana package has been designed from the ground up to be easily extensible. All options defined in this manual actually represent calls to R functions with the same name and thus users can add their own options by developing functions that support the desired options.

The current version of the Mediana package supports the following commonly used types of study designs:

• Fixed designs with a fixed follow-up period (period from a patient’s enrollment to discontinuation).

• Event-driven designs with a variable follow-up period. This setting is common in trials with time-to-event outcomes and is characterized by the fact that a patient may reach the primary endpoint before the end of the follow-up period.

Support for adaptive designs with an option to modify the trial’s objectives or design at one or more interim analyses will be provided in a future version of the package.

3.1 Data models

Data models define the process of generating patient data in clinical trials. To specify a data model, the user needs to specify the general set of parameters as well as parameters of each sample (samples are defined as mutually exclusive groups of patients, for example, treatment arms).

The general set of data model parameters includes sample size, outcome and design parameters.

General set: Sample size

The sample.size parameter defines the number of patients enrolled in each sample. Note that several sets of sample sizes can be specified, e.g., sample.size = list(100, 110, 120), and a convenient way to define multiple sample sizes is by using the seq function, e.g., sample.size = as.list(seq(100, 120, 10)). The resulting sets are known as sample size sets.

If the user defines the sample size or several sample sizes in the general set of data model parameters, these sample sizes will be applied across all samples. Alternatively, as explained below, sample sizes may also be defined within each sample.
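The two ways of writing sample size sets can be compared directly in base R. The variable names below are illustrative and are not part of the package.

```r
# Explicit list of three sample sizes
explicit.sizes = list(100, 110, 120)
# The same values generated with seq and converted to a list
generated.sizes = as.list(seq(100, 120, 10))

# Each list element corresponds to one sample size set
print(length(generated.sizes))
print(unlist(generated.sizes))
print(all(unlist(explicit.sizes) == unlist(generated.sizes)))
```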

General set: Outcome parameters

The following outcome parameters are defined in the general set:

• outcome.dist defines the outcome distribution, i.e., the distribution of the trial endpoint, or, in a more general setting, the distribution of multiple endpoints such as the primary endpoint and key secondary endpoints.

• outcome.type defines the primary endpoint’s type. If the primary endpoint is a time-to-event endpoint, e.g., time to progression in oncology trials, outcome.type must be set to "event". By default, outcome.type = "fixed", which defines a fixed design setting.

As stated in Section 3, the Mediana package supports a wide variety of trial endpoints, including continuous, binary, survival-type (time-to-event) and count-type outcomes, as well as more complex settings involving simultaneous evaluation of multiple endpoints with different marginal distributions. Table 1 defines the outcome distributions implemented in the package as well as the required distribution parameters. Note that parameters of multivariate distributions, such as parameters of marginal distributions and correlation matrix, need to be defined using nested lists.

It is important to note that other outcome distributions can be enabled in a data model by writing an R function which implements random number generation for that particular distribution.
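A sketch of the nested-list layout for a bivariate normal outcome is given below. It follows the pattern list(list(list(mean1, sd1), list(mean2, sd2)), corr) listed for "mvnormal" in Table 1. The numeric values and variable names are assumptions chosen for illustration, and the exact format expected for the correlation matrix should be checked against the package code.

```r
# Marginal parameters of the two endpoints (illustrative values)
endpoint1.par = list(mean = 0, sd = 70)
endpoint2.par = list(mean = 0, sd = 1)

# Correlation matrix of the two endpoints (assumed to be a plain R matrix)
corr.matrix = matrix(c(1.0, 0.3,
                       0.3, 1.0), nrow = 2, byrow = TRUE)

# Nested list: marginal parameters first, correlation matrix second
outcome.par.mvnormal = list(list(endpoint1.par, endpoint2.par), corr.matrix)

str(outcome.par.mvnormal)
```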

TABLE 1 Outcome distributions supported by the Mediana package

Outcome distribution       outcome.dist      outcome.par
Univariate normal          "normal"          list(mean, sd)
Exponential                "exponential"     list(rate)
Binomial                   "binomial"        list(prop)
Poisson                    "poisson.dist"    list(mean)
Negative binomial          "nbinomial"       list(dispersion, mean)
Uniform                    "uniform"         list(max)
Multivariate normal        "mvnormal"        list(list(list(mean1, sd1), list(mean2, sd2), ...), corr)
Multivariate binomial      "mvbinom"         list(list(prop1, prop2, ...), corr)
Multivariate exponential   "mvexponential"   list(list(rate1, rate2, ...), corr)
Multivariate mixed         "mvmixed"         list(list(list(mean, sd), prop, rate, ...), corr)

General set: Design parameters

The design parameters are optional and can be defined in event-driven designs if the user is interested in modeling the enrollment (or accrual) and dropout (or loss to follow-up) processes:

• enroll.period defines the length of the enrollment period.

• enroll.dist defines the enrollment distribution. Any univariate distribution listed in Table 1 can be selected; however, the most appropriate choices include the uniform and exponential distributions.

• enroll.dist.par defines the parameters of the enrollment distribution (required distribution parameters are defined in Table 1). A single parameter or a list of parameters can be specified. No parameters are needed if a uniform enrollment distribution is requested.

• study.duration defines the total study duration in study designs with a variable follow-up period (designs of this kind are very common in oncology trials). The total study duration is defined as the length of time from the enrollment of the first patient to the discontinuation of the last patient.

• followup.period defines the length of the follow-up period for each patient in study designs with a fixed follow-up period, i.e., the length of time from the enrollment to planned discontinuation is constant across patients. The user must specify either followup.period or study.duration.

• dropout.dist defines the dropout distribution. As with the enrollment distribution, any univariate distribution defined in Table 1 can be chosen and the most appropriate choices include the uniform and exponential distributions.

• dropout.dist.par defines the parameters of the dropout distribution. A single parameter or a list of parameters can be provided.

The user can define several sets of enrollment and dropout distribution parameters. These sets are known as design parameter sets and enable the user to compactly define multiple scenarios that can be evaluated simultaneously. Also, note that the length of the enrollment period, total study duration and follow-up periods are measured using the same time units.
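To make the roles of the design parameters concrete, the base R sketch below generates enrollment, event and dropout times for a single arm in an event-driven design and applies administrative censoring at the end of the study. It is a stand-alone illustration under assumed values (uniform enrollment, exponential event and dropout times) and is not the package's internal algorithm. The variable names mirror the design parameters but are ordinary R variables here.

```r
set.seed(101)
n.patients = 200
enroll.period = 12     # months (illustrative)
study.duration = 36    # months (illustrative)

# Uniform enrollment over the enrollment period
enroll.time = runif(n.patients, min = 0, max = enroll.period)
# Exponential time to event and time to dropout, measured from enrollment
event.time = rexp(n.patients, rate = 0.05)
dropout.time = rexp(n.patients, rate = 0.01)

# Maximum follow-up available to each patient before the study ends
max.followup = study.duration - enroll.time

# Observed time is the earliest of event, dropout and end of study
observed.time = pmin(event.time, dropout.time, max.followup)
# Event indicator: 1 if the event was observed, 0 if censored
event.observed = as.integer(event.time <= pmin(dropout.time, max.followup))

print(table(event.observed))
print(summary(observed.time))
```

All times are expressed in the same unit (months in this sketch), in line with the requirement stated above.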

Sample-specific sets of parameters

Further, the user needs to set up parameters of each sample in the clinical trial (samples are defined as mutually exclusive groups of patients). The following parameters are available:

• sample.id defines the unique ID for each sample. This ID is used in the specification of analysis models.

• outcome.par defines the sample-specific distributional parameters for the single trial endpoint or multiple endpoints, i.e., parameters of outcome.dist. The number of parameters depends on the endpoint’s distribution, e.g., if the primary endpoint is normally distributed (outcome.dist = "normal"), two parameters (mean and standard deviation) must be specified. The parameters required for each distribution are listed in Table 1. Multiple scenarios, known as outcome parameter sets, can be specified.

• sample.size defines the sample-specific sample size or set of sample sizes. This option is helpful for modeling unbalanced trial designs with unequal allocation of patients to the treatment arms. The sample sizes must be specified in the general set or in each individual sample.
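For example, an unbalanced design with 2:1 allocation could be sketched as follows. The sample IDs, sample sizes and outcome parameters are illustrative assumptions; only the parameter names (sample.id, sample.size, outcome.par) come from the description above. The sketch also assumes that the k-th sample size in each sample belongs to the k-th sample size set.

```r
# Sample-specific sets for a 2:1 (treatment:placebo) allocation,
# with two sample size sets per sample (illustrative values)
samples.unbalanced = list(
  list(sample.id = "Placebo",
       sample.size = list(50, 60),
       outcome.par = list(list(mean = 0, sd = 70))),
  list(sample.id = "Treatment",
       sample.size = list(100, 120),
       outcome.par = list(list(mean = 40, sd = 70)))
)

# Total number of patients in each sample size set
total.size = unlist(samples.unbalanced[[1]]$sample.size) +
  unlist(samples.unbalanced[[2]]$sample.size)
print(total.size)
```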

3.2 Analysis models

Analysis models define statistical tests that are applied to the study data in a clinical trial. The specification of analysis models is conceptually similar to the specification of data models. An analysis model generally includes a general set of parameters (it will be shown below that the general set may sometimes be omitted). In addition, the user needs to define parameters for each individual test which will be carried out as part of the overall analysis.

TABLE 2 Statistical tests supported by the Mediana package (all tests produce one-sided p-values)

Test                                                                  test.type        test.par
Two-sample t-test                                                     "ttest"
Non-inferiority two-sample t-test                                     "ttest.noninf"   delta
Two-sample test for proportions (with Yates’ continuity correction)   "proptest"
Wilcoxon-Mann-Whitney test                                            "wilcoxtest"
Chi-square test                                                       "chisqtest"
Fisher exact test                                                     "fishertest"
Log-rank test                                                         "logrank"
Poisson regression test                                               "glmpoisson"
Negative-binomial regression test                                     "glmnb"

General set

The general set in an analysis model may include the following parameters:

• test.type defines the method for testing the treatment effect across the samples of interest, e.g., the standard two-sample t-test is requested by test.type = "ttest". Multiple statistical methods are available within the package (available options for test.type are listed in Table 2) and the user can implement additional tests by writing custom R functions. If test.type is specified in the general set of parameters, the specified method is used in all of the tests. Alternatively, test.type can be defined individually for each test. Lastly, it is important to note that the statistical tests implemented in the package produce one-sided p-values.

• test.par defines the parameters of the selected statistical method. For example, if a non-inferiority assessment based on the t-test is requested (test.type = "ttest.noninf"), the non-inferiority margin is passed to the analysis model using test.par. The number of parameters is test-specific, e.g., no parameters need to be provided when the standard t-test is requested. Required parameters for each statistical method are defined in Table 2.

• mult.adj.proc and mult.adj.par are optional parameters that enable the user to specify the multiplicity adjustment procedure and parameters of this procedure, respectively. The procedure is applied to all of the tests specified within the analysis model. The available options are listed in Table 3.

TABLE 3 Multiplicity adjustment procedures supported by the Mediana package

Multiplicity adjustment procedure                    mult.adj.proc                     mult.adj.par
Bonferroni procedure                                 "bonferroni"                      weight (optional)
Holm procedure                                       "holm"                            weight (optional)
Hochberg procedure                                   "hochberg"                        weight (optional)
Hommel procedure                                     "hommel"                          weight (optional)
Family of chain procedures                           "chain"                           weight, matrix.transition
Normal parametric multiple testing procedure         "normal.parametric"               weight, corr
Family of parallel gatekeeping procedures            "parallel.gatekeeping"            family, gamma, component.procedure
Family of multiple-sequence gatekeeping procedures   "multiple.sequence.gatekeeping"   family, gamma, component.procedure

If the user requests a multiplicity adjustment and no hypothesis weights are defined in the procedure parameter (mult.adj.par), the hypotheses are assumed to be equally weighted. Moreover, the component procedures defined for each family of hypotheses within parallel and multiple-sequence gatekeeping procedures must be chosen from the following list:

• component.procedure = "bonferroni": Bonferroni component procedure.

• component.procedure = "holm": Holm component procedure.

• component.procedure = "hochberg": Hochberg component procedure.

• component.procedure = "hommel": Hommel component procedure.

More information about the gatekeeping procedures can be found in Dmitrienko et al. (2011, 2014).

To summarize, it follows from this description that the general set of analysis model parameters does not need to be specified if the statistical methods for assessing the treatment effect are specified within each test and no multiplicity adjustment is requested.
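The effect of the simpler p-value-based procedures can be previewed in base R with the p.adjust function from the stats package. This sketch covers only the equally weighted Bonferroni, Holm, Hochberg and Hommel procedures and is not the Mediana implementation; the raw p-values are made up for illustration.

```r
# Illustrative one-sided raw p-values for three tests
raw.p = c(0.010, 0.020, 0.040)

# Adjusted p-values under equal hypothesis weights (one column per procedure)
adjusted.p = sapply(c("bonferroni", "holm", "hochberg", "hommel"),
                    function(method) p.adjust(raw.p, method = method))
print(round(adjusted.p, 3))

# Number of tests rejected at the one-sided 0.025 level under each procedure
print(colSums(adjusted.p <= 0.025))
```

Weighted procedures, chain procedures and gatekeeping procedures are not covered by p.adjust and would need the options listed in Table 3.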

Test-specific sets of parameters

Each analysis model includes a collection of tests applied to the samples defined within the corresponding data model and the user needs to specify the following parameters for each test:

• test.id defines the unique ID for each test, which is used later on in the specification of evaluation models.

• test.type and test.par define the test-specific statistical method and its parameters. The user must define test.type and test.par in the general set of analysis model parameters or in each individual test.

• test.samples defines a list of samples from the data model to be used within the selected statistical method. For example, the two-sample t-test requires the specification of two samples. In general, any test-specific set of samples can be defined.

To illustrate the use and interpretation of test.samples, consider a treatment comparison based on the two-sample t-test (test.type = "ttest"). The test.samples parameter defines the samples to which the t-test is applied, e.g., test.samples = list("Sample 1", "Sample 2"). The t-test, as well as all other tests listed in Table 2, are set up as one-sided tests and thus the sample order is important. In particular, the Mediana package assumes that a numerically larger value of the endpoint is expected in Sample 2 compared to Sample 1. Suppose, for example, that a higher treatment response indicates a beneficial effect (e.g., higher improvement rate). In this case Sample 1 should include control patients whereas Sample 2 should include patients allocated to the experimental treatment arm. The sample order needs to be reversed if a beneficial treatment effect is associated with a lower value of the endpoint (e.g., lower blood pressure).
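The importance of the sample order can be illustrated with the base R t.test function. In this sketch the one-sided alternative states that the mean is larger in the second sample, which mimics the convention described above. The data are simulated with arbitrary illustrative parameters and the helper name one.sided.p is hypothetical.

```r
set.seed(42)
sample1 = rnorm(60, mean = 0, sd = 70)    # e.g., control arm
sample2 = rnorm(60, mean = 40, sd = 70)   # e.g., experimental arm

# One-sided p-value for "mean of second sample > mean of first sample"
one.sided.p = function(first, second) {
  t.test(second, first, alternative = "greater", var.equal = TRUE)$p.value
}

p.correct.order = one.sided.p(sample1, sample2)
p.reversed.order = one.sided.p(sample2, sample1)

# Reversing the samples flips the sign of the test statistic, so the two
# one-sided p-values add up to one
print(c(p.correct.order, p.reversed.order))
```

With the samples in the wrong order, a truly beneficial treatment produces p-values close to one, and the estimated power collapses toward zero.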

3.3 Evaluation models

Evaluation models are used within the Mediana package to specify the measures (metrics) for evaluating the performance of the selected analysis methods. The available metrics include standard marginal power for each individual test as well as more complex measures of overall success.

The evaluation criteria are defined using an evaluation model, which includes a general set of parameters as well as parameters of each metric.

General set

The general set of parameters currently includes a single parameter:

• alpha defines the overall one-sided Type I error rate in the clinical trial.

Metric-specific sets of parameters

The following parameters need to be defined for each metric:

• metric.id defines the unique ID for each metric.

• metric.criterion defines the evaluation criterion (see below). Each criterion is applied at the significance level given by alpha.

• metric.tests defines the set of tests to which the evaluation criterion is applied.

The following evaluation criteria are currently included in the Mediana package:

• metric.criterion = "marginal": Marginal power of each test.

• metric.criterion = "disjunctive": Probability of detecting significance for at least one test.

• metric.criterion = "conjunctive": Probability of detecting significance for all tests.

The user can create custom evaluation criteria, including more general criteria presented in Dmitrienko and D’Agostino (2013), by writing their own R functions.
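The three built-in criteria can be expressed in a few lines of base R once the p-values from the simulation runs are available. The sketch below assumes a matrix with one row per simulation run and one column per test. The matrix is filled with made-up p-values purely to make the code runnable and does not come from Mediana.

```r
set.seed(7)
alpha = 0.025
n.sims = 1000

# Made-up p-values: rows = simulation runs, columns = tests
p.values = cbind(test1 = runif(n.sims)^3, test2 = runif(n.sims)^2)
significant = p.values <= alpha

# Marginal power: proportion of significant runs for each test
marginal = colMeans(significant)
# Disjunctive power: at least one test significant in a run
disjunctive = mean(rowSums(significant) >= 1)
# Conjunctive power: all tests significant in a run
conjunctive = mean(rowSums(significant) == ncol(significant))

print(list(marginal = marginal, disjunctive = disjunctive,
           conjunctive = conjunctive))
```

By construction, disjunctive power is at least as large as the largest marginal power, and conjunctive power is no larger than the smallest marginal power.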

Environmental variables

The Mediana package supports a general meta-information approach, i.e., each object created within the package stores all relevant meta data, including the user’s name, time stamp, etc. General environmental variables can be specified within an R session to define project-specific meta information which is saved in all objects created during the session:

• env.username: User’s name.

• env.project: Project description.

• env.notes: General notes.

• env.randomseed: Seed for generating pseudo-random numbers.

4 Case studies

This section presents several case studies to illustrate the process of setting up data, analysis and evaluation models in clinical trials and performing clinical scenario evaluation using the Mediana package. The code used in these case studies can be downloaded from the package’s web site:

http://biopharmnet.com/wiki/Mediana_package

Case study 1 deals with a simple setting (a clinical trial with two treatment arms and a single endpoint) where power calculations can be performed analytically. Closed-form expressions for the sample size can be derived using the central limit theorem or other approximations and operating characteristics of candidate designs are easily evaluated. However, analytical approaches to power calculation are no longer available if more complex study designs or analysis strategies are considered. Examples of more complex settings are given in Case studies 2 through 5:

• Case study 2: Clinical trials with three or more treatment arms.

• Case study 3: Clinical trials with several patient populations.

• Case study 4: Clinical trials with several trial endpoints.

• Case study 5: Clinical trials with several trial endpoints and multiple treatment arms.

A simulation-based approach based on the Mediana package needs to be applied to perform clinical scenario evaluation in these case studies.

4.1 Case study 1: Trial with two treatment arms and single endpoint (normally distributed endpoint)

Consider a clinical trial with two treatment arms (experimental treatment versus placebo) and a single primary endpoint. This section focuses on a continuous case and the primary endpoint is assumed to follow a normal distribution. Other important cases will be considered in the following sections:

• Section 4.2: Binary endpoint.

• Section 4.3: Survival-type (time-to-event) endpoint.

• Section 4.4: Count-type endpoint.

Data model

Suppose that a sponsor is designing a Phase III clinical trial in patients with pulmonary arterial hypertension (PAH). The efficacy of experimental treatments for PAH is commonly evaluated using a six-minute walk test and the primary endpoint is defined as the change from baseline to the end of the 16-week treatment period in the six-minute walk distance.

To define a data model for this clinical trial, a general set of parameters and two samples corresponding to the two treatment arms need to be specified. The change from baseline in the six-minute walk distance is assumed to follow a normal distribution and the distribution of the primary endpoint is defined in the general set of data model parameters:

# General set
data.model.general = list(outcome.dist = "normal")

Sample sizes and parameters of the outcome distribution in the two treatment arms are specified as follows. As indicated in Section 3, the treatment arms define two samples in this clinical trial example in the sense that the arms are mutually exclusive groups of patients. The sponsor is interested in performing power calculations under two treatment effect scenarios (standard and optimistic scenarios). Under these scenarios, the experimental treatment is expected to improve the six-minute walk distance by 40 or 50 meters compared to placebo, respectively, with the common standard deviation of 70 meters.

The two treatment effect scenarios are presented in Table 4. The table lists the parameters of the normal distribution in the treatment arms. The mean change in the placebo arm is set to µ = 0 and the mean changes in the six-minute walk distance in the experimental arm are set to µ = 40 (standard scenario) or µ = 50 (optimistic scenario). The common standard deviation is σ = 70.

TABLE 4 Treatment effect assumptions in the PAH clinical trial

             Mean (SD)
Scenario     Placebo    Treatment
Standard     0 (70)     40 (70)
Optimistic   0 (70)     50 (70)

Two sets of outcome parameters are introduced to define the distribution of the trial endpoint under the standard and optimistic scenarios:

# Outcome parameters

# Outcome parameter set 1
outcome1.placebo = list(mean = 0, sd = 70)
outcome1.treatment = list(mean = 40, sd = 70)

# Outcome parameter set 2
outcome2.placebo = list(mean = 0, sd = 70)
outcome2.treatment = list(mean = 50, sd = 70)

Note that the mean and standard deviation are explicitly identified in each list. This is done mainly for the user’s convenience. The use of named items such as mean and sd is optional. Also, if appropriate, the user can specify multiple sets of outcome parameters.

The sponsor would like to perform power evaluation over a broad range of sample sizes in each treatment arm:

# Sample size parameters
common.sample.size = list(50, 55, 60, 65, 70)

As indicated in Section 3, the seq function can be used to compactly define sample sizes in this data model:

# Sample size parameters
common.sample.size = as.list(seq(50, 70, 5))

Design parameters such as the length of the enrollment and follow-up periods do not need to be specified since the trial will employ a fixed design (all patients will be followed for 16 weeks).

The sample sizes and outcome parameters defined above are included in the sample-specific set of data model parameters (samples list) as follows:

# Sample-specific data model parameters
samples = list(
  list(sample.id = "Placebo",
       sample.size = common.sample.size,
       outcome.par = list(outcome1.placebo, outcome2.placebo)),
  list(sample.id = "Treatment",
       sample.size = common.sample.size,
       outcome.par = list(outcome1.treatment, outcome2.treatment))
)

Finally, the data model for this clinical trial is set up by creating a list of the general and sample-specific parameters:

# Data model
case.study1.data.model = list(general = data.model.general,
                              samples = samples)

As shown above, the general and sample-specific data model parameters must be identified using the general and samples arguments, respectively.

It is worth noting that the data model can be defined more compactly in this clinical trial example by taking advantage of the fact that a balanced design is considered in this clinical trial (common sample size is assumed in the two arms). In this case, the sample size parameters can be included in the general set and omitted from the samples list specification:

# Compact data model

# General data model parameters
data.model.general = list(outcome.dist = "normal",
                          sample.size = common.sample.size)

# Sample-specific data model parameters
samples = list(
  list(sample.id = "Placebo",
       outcome.par = list(outcome1.placebo, outcome2.placebo)),
  list(sample.id = "Treatment",
       outcome.par = list(outcome1.treatment, outcome2.treatment))
)

# Data model
case.study1.data.model = list(general = data.model.general,
                              samples = samples)
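The data model above implies a grid of scenarios: every outcome parameter set is combined with every sample size set. A base R sketch of this enumeration, using expand.grid and variable names chosen here for illustration, is shown below. It only counts the combinations and does not call the package.

```r
# Sample size sets and outcome parameter sets from the case study
sample.size.sets = seq(50, 70, 5)
outcome.sets = c("Standard", "Optimistic")

# One row per combination of outcome parameter set and sample size set
scenarios = expand.grid(sample.size = sample.size.sets,
                        outcome.set = outcome.sets,
                        stringsAsFactors = FALSE)
print(nrow(scenarios))
print(head(scenarios, 3))
```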


Analysis model

Analysis models also include a general set of parameters as well as additional parameter sets. The latter define parameters of the individual tests to be carried out in a clinical trial.

Only one test is planned to be performed in the PAH clinical trial (treatment versus placebo) and the treatment comparison will be carried out using the two-sample t-test. The test type can be specified in the general set of analysis model parameters:

# General set
analysis.model.general = list(test.type = "ttest")

No multiplicity adjustment is specified in the general set since only one null hypothesis of no treatment effect is tested in this clinical trial.

The following statement is used to define the primary test in the trial (treatment versus placebo):

# Test-specific analysis model parameters
tests = list(
  list(test.id = "Placebo vs treatment",
       test.samples = list("Placebo", "Treatment"))
)

According to the specifications, the two-sample t-test will be applied to Sample 1 (Placebo) and Sample 2 (Treatment). These sample IDs come from the data model defined earlier in this section. As explained in Section 3, the sample order is determined by the expected direction of the treatment effect. In this case, an increase in the six-minute walk distance indicates a beneficial effect and a numerically larger value of the primary endpoint is expected in Sample 2 compared to Sample 1. This implies that the list of samples to be passed to the t-test should include Sample 1 followed by Sample 2, i.e., test.samples = list("Placebo", "Treatment").

The analysis model is defined by creating a list of general and test-specific parameters:

# Analysis model
case.study1.analysis.model = list(general = analysis.model.general,
                                  tests = tests)

It is important to make sure that the general and test-specific sets of parameters are identified using the general and tests arguments, respectively.

An equivalent analysis model can be constructed by defining the test type (test.type = "ttest") within the only test included in the test-specific set of analysis model parameters (tests list). In this case, the general set can be omitted:

# Equivalent analysis model

# No general set of analysis model parameters

# Test-specific analysis model parameters
tests = list(
  list(test.id = "Placebo vs treatment",
       test.samples = list("Placebo", "Treatment"),
       test.type = "ttest")
)

# Analysis model
case.study1.analysis.model = list(tests = tests)

Evaluation model

The data and analysis models specified above collectively define the clinical scenarios to be examined in the PAH clinical trial. The scenarios are evaluated using metrics that are aligned with the clinical objectives of the trial, e.g., in this case it is most appropriate to use regular power or, more formally, marginal power. This metric is specified using an evaluation model.

The general set of evaluation model parameters includes the Type I error rate to be used in the analysis (one-sided α = 0.025):

# General set
evaluation.model.general = list(alpha = 0.025)

The metric of interest (marginal power) is defined in the metric-specific set of evaluation model parameters (metrics list):

# Metric-specific evaluation model parameters
metrics = list(
  list(metric.id = "Marginal power",
       metric.criterion = "marginal",
       metric.tests = list("Placebo vs treatment"))
)

The metric.tests parameter lists the IDs of the tests to which each metric is applied (more than one test can be specified). The test IDs link the evaluation with the corresponding analysis model. In this particular case, marginal power will be computed for the t-test which compares the mean change in the six-minute walk distance in the placebo and treatment arms (Placebo vs treatment).

The general and metric-specific parameter sets are combined to define the overall evaluation model:

# Evaluation model
case.study1.evaluation.model = list(general = evaluation.model.general,
                                    metrics = metrics)

Clinical scenario evaluation

After the clinical scenarios (data and analysis models) and evaluation model have been defined, the user is ready to evaluate the operating characteristics by calling the cse function (CSE stands for clinical scenario evaluation). As shown below, the function call specifies the individual components of the clinical scenario evaluation in this case study as well as the number of simulation runs:

# Perform simulation-based evaluation
result = cse(data.model = case.study1.data.model,
             analysis.model = case.study1.analysis.model,
             evaluation.model = case.study1.evaluation.model,
             n.sims = 10000)

The results are saved in an evaluation object (result). The object contains complete information about this particular evaluation, including the data, analysis and evaluation models. The most important component of this object is the list named result$power, which includes the power values based on the metrics in the evaluation model. To facilitate the review of the results, the user can invoke the summary function, which produces clinical scenario evaluation reports in plain text or Microsoft Word formats (other formats will be added in the future). The following simple function call generates a text-based report in the R command window:

# Produce evaluation summary
summary(evaluation = result)

If the user specifies the filename argument, the summary function writes the clinical scenario evaluation report to an external file. For example, the summary function generates a detailed report in a Microsoft Word format if format = "word":

# Microsoft Word-based report
summary(evaluation = result,
        filename = "Case study 1 (normally distributed endpoint).docx",
        format = "word")

A plain text report is generated and saved if format = "text".

By default, evaluation summaries use generic labels for sample size sets and other parameter sets defined by the user, e.g., Sample size 1 or Sample size 2. More descriptive labels are easy to create using the sample.size.label, design.parameter.label and outcome.parameter.label arguments. For example, the user can assign the following custom labels to the sample size and outcome parameter sets that will be used in the clinical scenario evaluation report:

# Produce evaluation summary with custom labels
summary(evaluation = result,
        sample.size.label = list("N = 50", "N = 55", "N = 60",
                                 "N = 65", "N = 70"),
        outcome.parameter.label = list("Standard", "Optimistic")
)

Selected sections of the output produced by the summary function are presentedbelow (the full report in the plain text and Microsoft Word formats is availableon the package’s web site). The output includes a summary of all sample sizeand outcome parameters sets specified in the data model. A summary of designparameters would be provided as well if those were specified in the data model:

Summary of sample size sets

Sample size

***********

Number of samples: 2

Number of sample size sets: 5

Sample size set Sample Size

N = 50 Placebo 50

N = 50 Treatment 50

N = 55 Placebo 55

N = 55 Treatment 55

N = 60 Placebo 60

N = 60 Treatment 60

N = 65 Placebo 65

N = 65 Treatment 65

N = 70 Placebo 70

N = 70 Treatment 70

Summary of outcome parameter sets

Outcome distribution

********************

Number of outcome parameter sets: 2

Outcome distribution : Normal

Outcome parameter set Sample Parameter

Standard Placebo mean=0, SD=70

Standard Treatment mean=40, SD=70

Optimistic Placebo mean=0, SD=70

Optimistic Treatment mean=50, SD=70

The evaluation results based on marginal power are displayed below for all combinations of the parameter sets (i.e., two outcome parameter sets and five sample size sets):

Evaluation summary Marginal power

**************

Outcome parameter set Sample size set Evaluation results

Standard N = 50 Placebo vs treatment = 0.806





Optimistic N = 50 Placebo vs treatment = 0.942





As expected, marginal power increases monotonically with the sample size and higher power is achieved under the optimistic scenario. Power approaches 90% with 65 patients per treatment arm under the standard scenario. The estimated power is 0.903 and, with 10,000 simulation runs, a 95% confidence interval for the estimated power is (0.897, 0.908). The resulting power values are consistent with results derived from well-known sample size formulas, see, for example, Julious (2009, Chapter 3).
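As a rough cross-check of these figures, the power and the confidence interval can be approximated with base R alone. The sketch below does not call the Mediana package. It assumes a one-sided two-sample t-test at the 0.025 level (the significance level is an assumption here) and a normal-approximation interval for a proportion estimated from 10,000 runs, so small differences in the last digit are to be expected.

```r
# Analytic power of a two-sample t-test with 65 patients per arm,
# a mean difference of 40 and a common SD of 70 (standard scenario).
# Assumption: one-sided test at alpha = 0.025.
analytic <- power.t.test(n = 65, delta = 40, sd = 70,
                         sig.level = 0.025,
                         type = "two.sample",
                         alternative = "one.sided")
analytic.power <- analytic$power

# Normal-approximation 95% confidence interval for a power estimate
# of 0.903 obtained from 10,000 simulation runs.
power.hat <- 0.903
n.sims <- 10000
std.err <- sqrt(power.hat * (1 - power.hat) / n.sims)
ci <- power.hat + c(-1, 1) * qnorm(0.975) * std.err

print(round(analytic.power, 3))
print(round(ci, 3))
```

The half-width of the interval shrinks with the square root of the number of simulation runs, which gives a simple way to judge how many runs are needed for a given precision.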

4.2 Case study 1: Trial with two treatment arms and single endpoint (binary endpoint)

Section 4.1 assumed that the primary endpoint follows a normal distribution. A similar approach can be used to set up data and analysis models for binary, survival-type (time-to-event) and count-type endpoints. The binary case is considered in this section and the other cases are discussed in Sections 4.3 and 4.4.

Consider a Phase III clinical trial for the treatment of rheumatoid arthritis (RA). The primary endpoint is a response rate based on the American College of Rheumatology (ACR) definition of improvement. The trial’s sponsor is interested in performing power calculations using the treatment effect assumptions listed in Table 5.

TABLE 5 Treatment effect assumptions in the RA clinical trial

Outcome parameter set       Response rate
                            Placebo    Treatment
Outcome parameter set 1     30%        50%
Outcome parameter set 2     30%        55%
Outcome parameter set 3     30%        60%

Data model

The three outcome parameter sets displayed in Table 5 are combined with four sample size sets (common.sample.size = list(80, 90, 100, 110)) and the distribution of the primary endpoint (outcome.dist = "binomial") is specified in the general set of data model parameters:

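Data model

# Outcome parameter set 1
outcome1.placebo = list(prop = 0.30)
outcome1.treatment = list(prop = 0.50)

# Outcome parameter set 2
outcome2.placebo = list(prop = 0.30)
outcome2.treatment = list(prop = 0.55)

# Outcome parameter set 3
outcome3.placebo = list(prop = 0.30)
outcome3.treatment = list(prop = 0.60)

# Sample size parameters
common.sample.size = list(80, 90, 100, 110)

# General set of data model parameters
data.model.general = list(outcome.dist = "binomial",
                          sample.size = common.sample.size)

# Sample-specific parameters
samples = list(
  list(sample.id = "Placebo",
       outcome.par = list(outcome1.placebo,
                          outcome2.placebo,
                          outcome3.placebo)),
  list(sample.id = "Treatment",
       outcome.par = list(outcome1.treatment,
                          outcome2.treatment,
                          outcome3.treatment))
)

# Data model
case.study1.data.model = list(general = data.model.general,
                              samples = samples)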

Analysis model

The following compact analysis model uses a standard two-sample test for comparing proportions (test.type = "proptest") to assess the treatment effect in this clinical trial example. The general set of analysis model parameters is not set up because the test type is defined in the test-specific set (tests list):

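Analysis model

# No general set of analysis model parameters

# Test-specific parameters
tests = list(
  list(test.id = "Placebo vs treatment",
       test.samples = list("Placebo", "Treatment"),
       test.type = "proptest")
)

# Analysis model
case.study1.analysis.model = list(tests = tests)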

Evaluation model and clinical scenario evaluation

Power evaluations are easily performed in this clinical trial example using the evaluation model utilized in the case of a normally distributed endpoint (see Section 4.1), i.e., evaluations rely on marginal power. The full clinical scenario evaluation report is provided on the package’s web site.

An extension of this clinical trial example is provided in Case study 5. The extension deals with a more complex setting involving several trial endpoints and multiple treatment arms.
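For orientation, the simulation-based power values in the RA trial can be compared with a standard analytic approximation. The following sketch uses the base R function power.prop.test rather than the Mediana package, and it assumes a one-sided test at the 0.025 level. It simply tabulates approximate power for every combination of the three outcome parameter sets in Table 5 and the four sample sizes in the data model.

```r
# Treatment response rates from Table 5 (the placebo rate is 30% throughout)
# and the four per-arm sample sizes used in the data model.
treatment.rates <- c(0.50, 0.55, 0.60)
sample.sizes <- c(80, 90, 100, 110)

# Assumption: one-sided test at alpha = 0.025.
analytic.power <- sapply(treatment.rates, function(p1) {
  sapply(sample.sizes, function(n) {
    power.prop.test(n = n, p1 = 0.30, p2 = p1,
                    sig.level = 0.025,
                    alternative = "one.sided")$power
  })
})

# Rows correspond to sample size sets, columns to outcome parameter sets
dimnames(analytic.power) <- list(paste("N =", sample.sizes),
                                 paste("Treatment rate =", treatment.rates))
print(round(analytic.power, 3))
```

The simulated values need not match this table exactly, since the approximation used by power.prop.test and the test applied in the simulation may differ in detail.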

4.3 Case study 1: Trial with two treatment arms and single endpoint (survival-type endpoint)

If the trial’s primary objective is formulated in terms of analyzing the time to a clinically important event (progression or death in an oncology setting), data and analysis models can be set up based on an exponential distribution and the log-rank test. As an illustration, consider a Phase III trial which will be conducted to study a new treatment for metastatic colorectal cancer (MCC). Patients will be randomized in a 2:1 ratio to an experimental treatment or placebo (in addition to best supportive care). The trial’s primary objective is to assess the effect of the experimental treatment on progression-free survival.

A single treatment effect scenario is considered in this clinical trial example. Specifically, the median time to progression is assumed to be t0 = 6 months and t1 = 9 months in the placebo and experimental treatment arms, respectively. Under an exponential distribution assumption, the median times correspond to the following hazard rates

λ0 = log 2/t0 = 0.116, λ1 = log 2/t1 = 0.077,

and the resulting hazard ratio is λ1/λ0 = t0/t1 = 6/9 = 0.67.

It is important to note that, if no censoring mechanisms are specified in a data model with a time-to-event endpoint, all patients will reach the endpoint of interest (e.g., progression) and thus the number of patients specified in the sample.size parameter will be equal to the number of events. Using this property, power calculations based on the number of events are easily carried out for a broad class of clinical trials with time-to-event endpoints.
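The hazard rates above and an approximate event-based power calculation can be reproduced in a few lines of base R. The sketch below is not part of the Mediana package. The function schoenfeld.power is a hypothetical helper that applies the Schoenfeld approximation for the log-rank test, and the one-sided 0.025 significance level is an assumption.

```r
# Median times to progression (months) in the placebo and treatment arms
median.time <- c(placebo = 6, treatment = 9)

# Hazard rates under the exponential assumption: lambda = log(2) / median
hazard.rate <- log(2) / median.time
hazard.ratio <- unname(hazard.rate["treatment"] / hazard.rate["placebo"])

# Schoenfeld approximation to the power of the log-rank test
# (hypothetical helper, not a Mediana function).
# events: total number of events
# alloc:  proportion allocated to the treatment arm (2:1 ratio -> 2/3)
# alpha:  one-sided significance level (assumed to be 0.025)
schoenfeld.power <- function(events, hazard.ratio, alloc = 2/3, alpha = 0.025) {
  pnorm(sqrt(events * alloc * (1 - alloc)) * abs(log(hazard.ratio)) -
          qnorm(1 - alpha))
}

print(round(hazard.rate, 3))
print(round(hazard.ratio, 2))
print(round(schoenfeld.power(c(270, 300), hazard.ratio), 3))
```

Because the approximation depends only on the number of events, it pairs naturally with the property described above, where the sample size equals the event count in the absence of censoring.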

Data model

To define a data model in the MCC clinical trial, the total event count in the trial is assumed to range between 270 and 300. The event count is split in a 2:1 ratio between the two arms. Since the sample size (event count) varies across the arms in this example, the sample.size parameter needs to be specified within each sample rather than in the general set of data model parameters.

Data model

# Sample size parameters
event.count.total = c(270, 300)
event.count.placebo = as.list((1/3) * event.count.total)
event.count.treatment = as.list((2/3) * event.count.total)

# Outcome parameters
outcome.placebo = list(rate = 0.116)
outcome.treatment = list(rate = 0.077)

# General set of data model parameters
data.model.general = list(outcome.dist = "exponential")

# Sample-specific parameters
samples = list(
  list(sample.id = "Placebo",
       sample.size = event.count.placebo,
       outcome.par = list(outcome.placebo)),
  list(sample.id = "Treatment",
       sample.size = event.count.treatment,
       outcome.par = list(outcome.treatment))
)

# Data model
case.study1.data.model = list(general = data.model.general,
                              samples = samples)

It is worth noting that the primary endpoint’s type (outcome.type) is not specified in the general set of data model parameters. By default, the outcome type is set to "fixed", which means that a design with a fixed follow-up is assumed even though the primary endpoint in this clinical trial is clearly a time-to-event endpoint. This is due to the fact that, as was explained earlier in this section, there is no censoring in this design and all patients are followed until the event of interest is observed. In fact, it is easy to verify that the same results will be obtained if an event-driven study design is specified by setting the outcome type to "event" in the general set of data model parameters:

Data model

# General set of data model parameters

data.model.general = list(outcome.dist = "exponential",

outcome.type = "event")

Analysis model

The analysis model in this clinical trial is very similar to the analysis models defined in Sections 4.1 and 4.2. The only difference is the choice of the statistical method utilized in the primary analysis (test.type = "logrank"):



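tests = list(
  list(test.id = "Placebo vs treatment",
       test.samples = list("Placebo", "Treatment"),
       test.type = "logrank")
)

# Analysis model
case.study1.analysis.model = list(tests = tests)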



An evaluation model identical to that used earlier in Sections 4.1 and 4.2 can be applied to examine the power function at the selected event counts. The full clinical scenario evaluation report generated in this clinical trial example is provided on the package’s web site.

It is worth noting that an extension of this clinical trial example is presented in Case study 4; see Section 4.7 for more information on power calculations in clinical trials with multiple time-to-event endpoints.

Power calculations in event-driven trials with censoring

The power calculations presented earlier in this section assumed an idealized setting where each patient is followed until the event of interest is observed. In this case, the sample size (number of patients) in each treatment arm is equal to the number of events. In reality, events are often censored and a sponsor is generally interested in determining the number of patients to be recruited in order to ensure a target number of events, which translates into desirable power.

The Mediana package can be used to perform power calculations in event-driven trials in the presence of censoring. This is accomplished by setting up design parameters such as the length of the enrollment and follow-up periods. Suppose, for example, that a standard design with a variable follow-up will be used in the MCC trial introduced earlier in this section. The total study duration will be 21 months, which includes a 9-month enrollment (accrual) period and a minimum follow-up of 12 months. This means that enroll.period = 9 and study.duration = 21. The patients are assumed to be recruited at a uniform rate (enroll.dist = "uniform"). The set of design parameters also includes the dropout distribution and its parameters. In this clinical trial, the dropout distribution is exponential (dropout.dist = "exponential") and the distribution’s parameter (λ) is determined from historical data (dropout.dist.par = 0.0115). The design parameters are included in the design.set list and passed to the general set of data model parameters. Finally, the primary endpoint’s type is set to "event" in the general set to indicate that a variable follow-up will be utilized in this clinical trial (outcome.type = "event").

The resulting data model with two sample size sets, a single set of outcome parameters and a single set of design parameters is displayed below:

Data model with design parameters

# Sample size parameters
sample.size.total = c(390, 420)
sample.size.placebo = as.list((1/3) * sample.size.total)
sample.size.treatment = as.list((2/3) * sample.size.total)

# Outcome parameters
median.time.placebo = 6
rate.placebo = log(2)/median.time.placebo
outcome.placebo = list(rate = rate.placebo)
median.time.treatment = 9
rate.treatment = log(2)/median.time.treatment
outcome.treatment = list(rate = rate.treatment)

# Design parameters
design.set = list(enroll.period = 9,
                  study.duration = 21,
                  enroll.dist = "uniform",
                  dropout.dist = "exponential",
                  dropout.dist.par = 0.0115)

# General set of data model parameters
data.model.general = list(outcome.dist = "exponential",
                          outcome.type = "event",
                          design = list(design.set))

# Sample-specific parameters
samples = list(
  list(sample.id = "Placebo",
       sample.size = sample.size.placebo,
       outcome.par = list(outcome.placebo)),
  list(sample.id = "Treatment",
       sample.size = sample.size.treatment,
       outcome.par = list(outcome.treatment))
)

# Data model
case.study1.data.model = list(general = data.model.general,
                              samples = samples)

It is important to point out that the sample.size parameter in this data model defines the number of patients to be enrolled in each arm rather than the number of events. Also, as with outcome and sample size parameters, multiple sets of design parameters can be specified. To accomplish this, a list of several design parameter sets can be created and passed to the design parameter, e.g., design = list(design.set1, design.set2, design.set3). All possible combinations of the resulting parameter sets will be examined and presented in the clinical scenario evaluation report.

The analysis and evaluation models used in this setting are identical to those used earlier in this section and power calculations are performed by calling the cse function. Further, the summary function with custom sample size labels is invoked to generate a clinical scenario evaluation report.

Perform simulation-based evaluation and produce evaluation summary



n.sims = 10000)

summary(evaluation = result,
sample.size.label = list("Total N = 390", "Total N = 420")

)

The resulting clinical scenario evaluation report is very similar to the report produced for the setting where no censoring was assumed and it also includes a section entitled “Descriptive statistics: Number of events”. This section of the report is displayed in Table 6 and, as before, the full clinical scenario evaluation report is available on the package’s web site.

TABLE 6 Evaluation summary in the MCC clinical trial with censoring

Descriptive statistics: Number of events

Sample size set   Outcome parameter set   Design parameter set   Sample ID   Mean number of events
Total N = 390     Outcome 1               Design 1               Placebo     102.87
Total N = 420     Outcome 1               Design 1               Placebo     110.72
Total N = 390     Outcome 1               Design 1               Treatment   172.25
Total N = 420     Outcome 1               Design 1               Treatment   185.66

Table 6 lists the mean number of events (progressions) in this clinical trial. For example, with the total sample size of 390 patients, the average number of patients who reach the PFS endpoint during the 21-month period is 102.87 + 172.25 = 275.12.
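The mean event counts in Table 6 can be checked against a closed-form expectation. The sketch below is an independent calculation, not Mediana code. It assumes that event and dropout times are independent exponential variables, that enrollment times are uniform over the enrollment period, and that every patient still at risk is censored at the end of the study. The helper event.prob is a hypothetical name introduced for this illustration.

```r
# Probability that a patient's event is observed before dropout and before
# the end of the study, averaged over uniform enrollment times.
event.prob <- function(event.rate, dropout.rate,
                       enroll.period, study.duration) {
  total.rate <- event.rate + dropout.rate
  # Follow-up ranges from (study.duration - enroll.period) to study.duration
  follow.up <- c(study.duration - enroll.period, study.duration)
  # Mean of exp(-total.rate * f) over uniformly distributed follow-up f
  surv <- diff(-exp(-total.rate * follow.up)) / (total.rate * enroll.period)
  (event.rate / total.rate) * (1 - surv)
}

# Expected number of events per arm for the two sample size sets
sample.size.total <- c(390, 420)
expected.events <- rbind(
  Placebo = (1/3) * sample.size.total *
    event.prob(log(2) / 6, 0.0115, enroll.period = 9, study.duration = 21),
  Treatment = (2/3) * sample.size.total *
    event.prob(log(2) / 9, 0.0115, enroll.period = 9, study.duration = 21)
)
colnames(expected.events) <- paste("Total N =", sample.size.total)
print(round(expected.events, 2))
```

Under these assumptions the expected counts fall close to the simulated means in Table 6, which suggests that the design parameters are being interpreted in the way described above.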

In general, even though closed-form solutions have been derived for sample size calculations in event-driven designs, the available approaches force clinical trial researchers to make a variety of simplifying assumptions, e.g., assumptions on the enrollment distribution are commonly made, see, for example, Julious (2009, Chapter 15). A general simulation-based approach to power and sample size calculations implemented in the Mediana package enables clinical trial sponsors to remove these artificial restrictions and examine a very broad set of plausible design parameters.

4.4 Case study 1: Trial with two treatment arms and single endpoint (count-type endpoint)

The last clinical trial example within Case study 1 deals with a Phase III clinical trial in patients with relapsing-remitting multiple sclerosis (RRMS). The trial aims at assessing the safety and efficacy of a single dose of a novel treatment compared to placebo. The primary endpoint is the number of new gadolinium enhancing lesions seen during a 6-month period on monthly MRIs of the brain and a smaller number indicates treatment benefit. The distribution of such endpoints has been widely studied in the literature and Sormani et al. (1999a, 1999b) showed that a negative binomial distribution provides a relatively good fit to the data.

Table 7 gives the expected treatment effect in the experimental treatment and placebo arms (the negative binomial distribution is parameterized using the mean rather than the probability of success in each trial). The corresponding treatment effect, i.e., the relative reduction in the mean number of new lesion counts, is 100(13 − 7.8)/13 = 40%. The assumptions in Table 7 define a single outcome parameter set.

TABLE 7 Treatment effect assumptions in the RRMS clinical trial

Treatment arm   Mean number of new lesions   Dispersion parameter
Placebo         13                           0.5
Treatment       7.8                          0.5

Data model

The general set of data model parameters defines the distribution of the trial endpoint (outcome.dist = "nbinomial"). Further, a balanced design will be utilized in this clinical trial and the range of sample sizes is also defined in the general set (it is convenient to do this using the seq function). The sample-specific set includes the parameters required by the negative binomial distribution (dispersion and mean).

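Data model

# General set of data model parameters
data.model.general = list(outcome.dist = "nbinomial",
                          sample.size = seq(50, 100, 10))

# Outcome parameters
outcome.placebo = list(dispersion = 0.5, mean = 13)
outcome.treatment = list(dispersion = 0.5, mean = 7.8)

# Sample-specific parameters
samples = list(
  list(sample.id = "Placebo",
       outcome.par = list(outcome.placebo)),
  list(sample.id = "Treatment",
       outcome.par = list(outcome.treatment))
)

# Data model
case.study1.data.model = list(general = data.model.general,
                              samples = samples)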
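To get a feel for the mean-based parameterization, lesion counts with the means in Table 7 can be generated directly with the base R function rnbinom. This is only an illustration. In particular, mapping the dispersion parameter to the size argument of rnbinom is an assumption of this sketch and may not match how the package interprets dispersion.

```r
set.seed(1234)
n <- 100000   # large sample, used only to check the parameterization

# Assumption: the dispersion parameter plays the role of `size` in rnbinom
dispersion <- 0.5
lesions.placebo <- rnbinom(n, size = dispersion, mu = 13)
lesions.treatment <- rnbinom(n, size = dispersion, mu = 7.8)

# Sample means and the relative reduction in the mean lesion count
mean.placebo <- mean(lesions.placebo)
mean.treatment <- mean(lesions.treatment)
relative.reduction <- 100 * (mean.placebo - mean.treatment) / mean.placebo
print(round(c(mean.placebo, mean.treatment, relative.reduction), 1))
```

With this parameterization the variance equals mean + mean^2/size, so a small size value corresponds to heavily overdispersed counts.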

Analysis model

The treatment effect will be assessed in this clinical trial example using a negative binomial generalized linear model (NBGLM) and, as shown in Table 2, the test.type parameter needs to be set to "glmnb". For compactness, the test.type parameter can be included in the test-specific set of analysis model parameters and, as a result, there will be no need to define the general set.



tests = list(



test.type = "glmnb")

)

# Analysis model



From the evaluation perspective, the objective of this clinical trial is identical to that of the clinical trials presented in Sections 4.1, 4.2 and 4.3, i.e., evaluation will be based on marginal power of the primary endpoint test. As a consequence, the same evaluation model can be applied. The full clinical scenario evaluation report is available on the package’s web site.

4.5 Case study 2: Four treatment arms

This clinical trial example and clinical trial examples presented in Case studies 3 through 5 deal with settings where no analytical methods are available to support power calculations. However, as demonstrated below, simulation-based approaches are easily applied to perform comprehensive assessment of the relevant operating characteristics within the clinical scenario evaluation framework.

Case study 2 is based on a clinical trial example introduced in Dmitrienko and D’Agostino (2013, Section 10). This example deals with a Phase III clinical trial in a schizophrenia population. Three doses of a new treatment, labelled Dose L, Dose M and Dose H, will be tested versus placebo. The trial will be declared successful if a beneficial treatment effect is demonstrated in any of the three dosing groups compared to the placebo group.

The primary endpoint is defined as the reduction in the Positive and Negative Syndrome Scale (PANSS) total score compared to baseline and a larger reduction in the PANSS total score indicates treatment benefit. This endpoint is normally distributed and the treatment effect assumptions in the four treatment arms are displayed in Table 8.

TABLE 8 Treatment effect assumptions in the schizophrenia clinical trial

Treatment arm   Mean   Standard deviation
Placebo         16     18
Dose L          19.5   18
Dose M          21     18
Dose H          21     18

Data model

The treatment effect assumptions presented in Table 8 define a single outcome parameter set and the common sample size is set to 260 patients. These parameters are specified in the following data model:


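# General set of data model parameters
data.model.general = list(outcome.dist = "normal")

# Outcome parameters
outcome.pl = list(mean = 16, sd = 18)
outcome.dosel = list(mean = 19.5, sd = 18)
outcome.dosem = list(mean = 21, sd = 18)
outcome.doseh = list(mean = 21, sd = 18)

# Sample size parameters
common.sample.size = list(260)

# Sample-specific parameters
samples = list(
  list(sample.id = "Placebo",
       sample.size = common.sample.size,
       outcome.par = list(outcome.pl)),
  list(sample.id = "Dose L",
       sample.size = common.sample.size,
       outcome.par = list(outcome.dosel)),
  list(sample.id = "Dose M",
       sample.size = common.sample.size,
       outcome.par = list(outcome.dosem)),
  list(sample.id = "Dose H",
       sample.size = common.sample.size,
       outcome.par = list(outcome.doseh))
)

# Data model
case.study2.data.model = list(general = data.model.general,
                              samples = samples)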

Analysis model

The analysis model, shown below, defines the three individual tests that will be carried out in the schizophrenia clinical trial. Each test corresponds to a dose-placebo comparison. The model states that each comparison will be carried out based on the two-sample t-test (test.type = "ttest"):

Analysis model

# General set of analysis model parameters

analysis.model.general = list(test.type = "ttest",

mult.adj.proc = "hochberg")


tests = list(

list(test.id = "Pl vs Dose L",

test.samples = list("Placebo", "Dose L")),

list(test.id = "Pl vs Dose M",

test.samples = list("Placebo", "Dose M")),

list(test.id = "Pl vs Dose H",

test.samples = list("Placebo", "Dose H"))

)

# Analysis model

case.study2.analysis.model = list(general = analysis.model.general,

tests = tests)

To reiterate a point made in Section 4.1, it is critical to put samples in the right order within test.samples. Consider, for example, the first test (test.id = "Pl vs Dose L"). A numerically greater response is expected at Dose L compared to placebo and thus the Dose L sample is included after the placebo sample.

The general set of analysis model parameters also specifies the multiplicity adjustment based on the Hochberg procedure. As indicated earlier in this section, the overall success criterion in the trial is formulated in terms of demonstrating a beneficial effect at any of the three doses. Due to multiple opportunities to claim success, the overall Type I error rate will be inflated and the Hochberg procedure is introduced to protect the error rate at the nominal level. The Hochberg procedure will be applied to the following null hypotheses:

• H1: Null hypothesis of no difference between Dose L and placebo.

• H2: Null hypothesis of no difference between Dose M and placebo.

• H3: Null hypothesis of no difference between Dose H and placebo.
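For readers who want to see the adjustment in isolation, the equally weighted Hochberg procedure is available in base R through p.adjust. The sketch below uses hypothetical p-values for H1, H2 and H3 and an assumed one-sided significance level of 0.025. It illustrates the procedure itself and is not the package's internal implementation.

```r
# Hypothetical one-sided p-values for H1 (Dose L), H2 (Dose M), H3 (Dose H)
p.raw <- c(H1 = 0.040, H2 = 0.012, H3 = 0.010)

# Hochberg-adjusted p-values (equally weighted hypotheses)
p.adj <- p.adjust(p.raw, method = "hochberg")

# A hypothesis is rejected if its adjusted p-value does not exceed alpha
alpha <- 0.025
rejected <- p.adj <= alpha

print(p.adj)
print(rejected)
```

In this example the largest p-value exceeds 0.025, so H1 is retained, while H2 and H3 are rejected because the second largest p-value does not exceed 0.025/2.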

Since no procedure parameters are defined, the three tests (or, equivalently, three null hypotheses of no effect) are assumed to be equally weighted. To request the Hochberg procedure with unequally weighted hypotheses, the user needs to assign a list of hypothesis weights to the mult.adj.par parameter. The weights typically reflect the relative importance of the individual null hypotheses. Assume, for example, that 60% of the overall weight is assigned to H3 and the remainder is split between H1 and H2. In this case, the general set of analysis model parameters needs to be set up as follows:

Parameters of multiplicity adjustment procedure

analysis.model.general = list(test.type = "ttest",
                              mult.adj.proc = "hochberg",
                              mult.adj.par = list(0.2, 0.2, 0.6))

Evaluation model

As was illustrated in Section 4.1, an evaluation model specifies clinically relevant criteria for assessing the performance of the individual tests defined in the corresponding analysis model or composite measures of success. In virtually any setting, it is of interest to compute the probability of achieving a significant outcome in each individual test, e.g., the probability of a significant difference between placebo and each dose. This is accomplished by requesting marginal power (metric.criterion = "marginal").

Since the trial will be declared successful if at least one dose-placebo comparison is significant, it is natural to compute the overall success probability, which is defined as the probability of demonstrating treatment benefit in one or more dosing groups. This is equivalent to evaluating disjunctive power in the trial (metric.criterion = "disjunctive").

In addition, the user can easily define a custom evaluation criterion. Suppose that, based on the results of the previously conducted trials, the sponsor expects a much larger treatment difference at Dose H compared to Doses L and M. Given this, the sponsor may be interested in evaluating the probability of observing a significant treatment effect at Dose H and at least one other dose. The associated evaluation criterion is implemented in the following function:

Evaluation criterion

case.study2.criterion = function(p, alpha) {
  significant = ((p[,3] <= alpha) & ((p[,1] <= alpha) | (p[,2] <= alpha)))
  power = mean(significant)
  return(power)
}

The function’s first argument (p) is a matrix of p-values produced by the tests associated with the three dose-placebo comparisons (one column per test, as implied by the column indexing in the function) and the second argument (alpha) is the overall Type I error rate. The case.study2.criterion function computes the probability of a significant treatment effect at Dose H (p[,3] <= alpha) and a significant treatment difference at Dose L or Dose M ((p[,1] <= alpha) | (p[,2] <= alpha)). Since this criterion assumes that the third test is based on the comparison of Dose H versus Placebo, the order in which the tests are included in the evaluation model is important.
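The behavior of the custom criterion is easy to verify on a small input. The sketch below repeats the function from the text and applies it to a hypothetical matrix of p-values with one row per simulation run and one column per test. The matrix layout and the numerical values are assumptions made for illustration.

```r
# Criterion from the text: significance at Dose H and at Dose L or Dose M
case.study2.criterion = function(p, alpha) {
  significant = ((p[,3] <= alpha) & ((p[,1] <= alpha) | (p[,2] <= alpha)))
  power = mean(significant)
  return(power)
}

# Hypothetical p-values from four simulation runs
# (columns: Pl vs Dose L, Pl vs Dose M, Pl vs Dose H)
p <- rbind(c(0.010, 0.500, 0.020),   # Dose H and Dose L significant
           c(0.500, 0.500, 0.010),   # Dose H only
           c(0.010, 0.010, 0.500),   # Doses L and M but not Dose H
           c(0.500, 0.020, 0.020))   # Dose H and Dose M significant

criterion.value <- case.study2.criterion(p, alpha = 0.025)
print(criterion.value)
```

Only the first and last runs satisfy the criterion, so the function returns 0.5. Swapping the first and third columns would change the result, which illustrates why the order of the tests matters.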

The following evaluation model specifies marginal and disjunctive power as well as the custom evaluation criterion defined above:

Evaluation model

# General set of evaluation model parameters
evaluation.model.general = list(alpha = 0.025)

# Metric-specific evaluation model parameters
metrics = list(
  list(metric.id = "Marginal power",
       metric.criterion = "marginal",
       metric.tests = list("Pl vs Dose L",
                           "Pl vs Dose M",
                           "Pl vs Dose H")),
  list(metric.id = "Disjunctive power",
       metric.criterion = "disjunctive",
       metric.tests = list("Pl vs Dose L",
                           "Pl vs Dose M",
                           "Pl vs Dose H")),
  list(metric.id = "Dose H and at least one dose",
       metric.criterion = "case.study2.criterion",
       metric.tests = list("Pl vs Dose L",
                           "Pl vs Dose M",
                           "Pl vs Dose H"))
)

# Evaluation model
case.study2.evaluation.model = list(general = evaluation.model.general,
                                    metrics = metrics)

Another potential option is to apply the conjunctive criterion which is met if a significant treatment difference is detected simultaneously in all three dosing groups (metric.criterion = "conjunctive"). This criterion helps characterize the likelihood of a consistent treatment effect across the doses.
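The relationship between the marginal, disjunctive and conjunctive criteria can be sketched in base R. The code below starts from a hypothetical matrix of p-values (rows are simulation runs, columns are the three dose-placebo tests) and computes each quantity directly. It illustrates the definitions only and makes no claim about how the package computes them internally.

```r
# Hypothetical p-values from five simulation runs
# (columns: Pl vs Dose L, Pl vs Dose M, Pl vs Dose H)
p <- rbind(c(0.010, 0.020, 0.005),
           c(0.500, 0.500, 0.010),
           c(0.010, 0.010, 0.500),
           c(0.500, 0.020, 0.020),
           c(0.300, 0.400, 0.200))
colnames(p) <- c("Pl vs Dose L", "Pl vs Dose M", "Pl vs Dose H")
alpha <- 0.025
significant <- p <= alpha

# Marginal power: proportion of runs in which each individual test is significant
marginal.power <- colMeans(significant)
# Disjunctive power: at least one test is significant
disjunctive.power <- mean(apply(significant, 1, any))
# Conjunctive power: all tests are significant simultaneously
conjunctive.power <- mean(apply(significant, 1, all))
# Subset disjunctive power: Dose M or Dose H only
subset.disjunctive.power <- mean(apply(significant[, 2:3], 1, any))

print(marginal.power)
print(c(disjunctive.power, conjunctive.power, subset.disjunctive.power))
```

By construction, conjunctive power can never exceed the smallest marginal power and disjunctive power can never fall below the largest one.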

The user can also use the metric.tests parameter to choose the specific tests to which the disjunctive and conjunctive criteria are applied (the resulting criteria are known as subset disjunctive and conjunctive criteria). To illustrate, the following statement computes the probability of a significant treatment effect at Dose M or Dose H (Dose L is excluded from this calculation):

Evaluation model with subset disjunctive criterion

metrics = list(
  list(metric.id = "Subset disjunctive power",
       metric.criterion = "disjunctive",
       metric.tests = list("Pl vs Dose M",
                           "Pl vs Dose H"))
)


As in Case study 1 (see, for example, Section 4.1), clinical scenario evaluation in the schizophrenia clinical trial is performed by passing the data, analysis and evaluation models as well as the number of simulation runs to the cse function. After cse creates the evaluation object (result), the summary function can be called to produce a helpful summary of the clinical scenarios, evaluation criteria and power calculations:

Perform simulation-based evaluation and produce evaluation summary




n.sims = 10000)

summary(evaluation = result)

The most relevant section of the output produced by the summary function is presented in Table 9. The full clinical scenario evaluation report is available on the package’s web site.

TABLE 9 Evaluation summary in the schizophrenia clinical trial

Outcome parameter set   Sample size set   Evaluation results

Marginal power
Outcome 1               Sample size 1     Pl vs Dose L = 0.580
                                          Pl vs Dose M = 0.833
                                          Pl vs Dose H = 0.836

Disjunctive power
Outcome 1               Sample size 1     Criterion = 0.915

Dose H and at least one dose
Outcome 1               Sample size 1     Criterion = 0.774

It is worth noting that the success probabilities presented in Table 9 reflect the Hochberg multiplicity adjustment specified in the analysis model. Table 9 shows that the multiplicity-adjusted marginal probabilities of establishing a significant treatment effect are equal to 0.580 at Dose L, 0.833 at Dose M and 0.836 at Dose H. In addition, the overall success probability (disjunctive power) is computed (0.915). The obtained disjunctive power matches the results of power calculations presented in Figure 14 of Dmitrienko and D’Agostino (2013).
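The logic behind Table 9 can be imitated with a short stand-alone simulation in base R. The sketch below does not use the Mediana package and is not its implementation. It assumes one-sided two-sample t-tests with a pooled variance, the equally weighted Hochberg adjustment from p.adjust, a one-sided significance level of 0.025 and a reduced number of simulation runs, so the estimates should only be expected to fall near the values in the table.

```r
set.seed(20140915)
n.sims <- 2000      # fewer runs than in the text, to keep the sketch fast
n.per.arm <- 260
alpha <- 0.025
means <- c("Placebo" = 16, "Dose L" = 19.5, "Dose M" = 21, "Dose H" = 21)
sd.common <- 18

# One simulated trial: three one-sided t-tests against a shared placebo arm,
# followed by the Hochberg adjustment
simulate.trial <- function() {
  placebo <- rnorm(n.per.arm, means["Placebo"], sd.common)
  p.raw <- sapply(c("Dose L", "Dose M", "Dose H"), function(dose) {
    active <- rnorm(n.per.arm, means[dose], sd.common)
    t.test(active, placebo, alternative = "greater",
           var.equal = TRUE)$p.value
  })
  p.adjust(p.raw, method = "hochberg")
}

# Matrix of adjusted p-values: one row per run, one column per test
p.adj <- t(replicate(n.sims, simulate.trial()))
significant <- p.adj <= alpha

marginal.power <- colMeans(significant)
disjunctive.power <- mean(apply(significant, 1, any))
custom.power <- mean(significant[, 3] & (significant[, 1] | significant[, 2]))

print(round(marginal.power, 3))
print(round(c(disjunctive.power, custom.power), 3))
```

Increasing n.sims narrows the Monte Carlo error at the cost of a longer run time.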

4.6 Case study 3: Two patient populations

This case study deals with a Phase III clinical trial in patients with mild or moderate asthma (it is based on a clinical trial example from Millen et al., 2014, Section 2.2). The trial is intended to support a tailoring strategy. In particular, the treatment effect of a single dose of a new treatment will be compared to that of placebo in the overall population of patients as well as a pre-specified subpopulation of patients with a marker-positive status at baseline (for compactness, the overall population is denoted by OP, marker-positive subpopulation is denoted by M+ and marker-negative subpopulation is denoted by M−).

Marker-positive patients are more likely to receive benefit from the experimental treatment. The overall objective of the clinical trial accounts for the fact that the treatment’s effect may, in fact, be limited to the marker-positive subpopulation. The trial will be declared successful if the treatment’s beneficial effect is established in the overall population of patients or, alternatively, the effect is established only in the subpopulation.

The primary endpoint in the clinical trial is defined as an increase from baseline in the forced expiratory volume in one second (FEV1). This endpoint is normally distributed and improvement is associated with a larger change in FEV1.

To set up a data model for this clinical trial, it is natural to define samples (mutually exclusive groups of patients) as follows:

• Sample 1: Marker-negative patients in the placebo arm.

• Sample 2: Marker-positive patients in the placebo arm.

• Sample 3: Marker-negative patients in the treatment arm.

• Sample 4: Marker-positive patients in the treatment arm.

Using this definition of samples, the trial’s sponsor can model the fact that the treatment’s effect is most pronounced in patients with a marker-positive status.

The treatment effect assumptions in the four samples are summarized in Table 10 (expiratory volume in FEV1 is measured in liters). As shown in the table, the mean change in FEV1 is constant across the marker-negative and marker-positive subpopulations in the placebo arm (Samples 1 and 2). A positive treatment effect is expected in both subpopulations in the treatment arm but marker-positive patients will experience greater beneficial effect (Sample 4).

TABLE 10 Treatment effect assumptions in the asthma clinical trial

Sample     Mean   Standard deviation
Sample 1   0.12   0.45
Sample 2   0.12   0.45
Sample 3   0.24   0.45
Sample 4   0.30   0.45

Data model

The following data model incorporates the assumptions displayed in Table 10 by defining a single set of outcome parameters. The data model includes three sample size sets (total sample size is set to 330, 340 and 350 patients). The sizes of the individual samples are computed based on historic information (40% of patients in the population of interest are expected to have a marker-positive status).


# General set of data model parameters
data.model.general = list(outcome.dist = "normal")

# Outcome parameters
outcome.plac.minus = list(mean = 0.12, sd = 0.45)
outcome.plac.plus = list(mean = 0.12, sd = 0.45)
outcome.treat.minus = list(mean = 0.24, sd = 0.45)
outcome.treat.plus = list(mean = 0.30, sd = 0.45)

# Sample size parameters
sample.size.total = c(330, 340, 350)
sample.size.plac.minus = as.list(0.3 * sample.size.total)
sample.size.plac.plus = as.list(0.2 * sample.size.total)
sample.size.treat.minus = as.list(0.3 * sample.size.total)
sample.size.treat.plus = as.list(0.2 * sample.size.total)

# Sample-specific parameters
samples = list(
  list(sample.id = "Plac M-",
       sample.size = sample.size.plac.minus,
       outcome.par = list(outcome.plac.minus)),
  list(sample.id = "Plac M+",
       sample.size = sample.size.plac.plus,
       outcome.par = list(outcome.plac.plus)),
  list(sample.id = "Treat M-",
       sample.size = sample.size.treat.minus,
       outcome.par = list(outcome.treat.minus)),
  list(sample.id = "Treat M+",
       sample.size = sample.size.treat.plus,
       outcome.par = list(outcome.treat.plus))
)

# Data model
case.study3.data.model = list(general = data.model.general,
                              samples = samples)

Analysis model

The analysis model in this clinical trial example is generally similar to that used in Case study 2 but there is an important difference which is described below. As in Case study 2, the primary endpoint follows a normal distribution and thus the treatment effect will be assessed using the two-sample t-test. Since two null hypotheses are tested in this trial (null hypotheses of no effect in the overall population of patients and subpopulation of marker-positive patients), a multiplicity adjustment needs to be applied. The Hochberg procedure with equally weighted null hypotheses will be used for this purpose.

A key feature of the analysis strategy in this case study is that the samples defined in the data model are different from the samples used in the analysis of the primary endpoint. As shown in Table 10, four samples are included in the data model. However, from the analysis perspective, the sponsor is interested in examining the treatment effect in two samples, namely, the overall population and marker-positive subpopulation. As shown below, to perform a comparison in the overall population, the t-test is applied to the following analysis samples:

• Placebo arm: Samples 1 and 2 ("Plac M-" and "Plac M+") are merged.

• Treatment arm: Samples 3 and 4 ("Treat M-" and "Treat M+") are merged.

Further, the treatment effect test in the subpopulation of marker-positive patients is carried out based on these analysis samples:

• Placebo arm: Sample 2 ("Plac M+").

• Treatment arm: Sample 4 ("Treat M+").

These analysis samples are specified in the analysis model below. The samples defined in the data model are merged using lists, e.g., list("Plac M-", "Plac M+") defines the placebo arm and list("Treat M-", "Treat M+") defines the experimental treatment arm in the overall population test.
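The merging step can be pictured with a small base R sketch. It does not call Mediana: the helper combine.samples is hypothetical, and the simulated means are placeholders chosen for illustration (only the "Treat M+" parameters, mean 0.30 and SD 0.45, come from the data model above). The sample sizes correspond to the first sample size set (330 patients in total).

```r
set.seed(2014)

# Per-sample sizes for a total of 330 patients
n.minus = round(0.3 * 330)   # marker-negative patients per arm
n.plus  = round(0.2 * 330)   # marker-positive patients per arm

# Simulated outcomes; all means except "Treat M+" are hypothetical placeholders
data = list(
    "Plac M-"  = rnorm(n.minus, mean = 0.12, sd = 0.45),
    "Plac M+"  = rnorm(n.plus,  mean = 0.12, sd = 0.45),
    "Treat M-" = rnorm(n.minus, mean = 0.24, sd = 0.45),
    "Treat M+" = rnorm(n.plus,  mean = 0.30, sd = 0.45)
)

# Hypothetical helper: pool the outcomes of all samples named in ids
combine.samples = function(data, ids) {
    unlist(data[unlist(ids)], use.names = FALSE)
}

# Overall population test: each arm merges two data model samples
op.plac  = combine.samples(data, list("Plac M-", "Plac M+"))
op.treat = combine.samples(data, list("Treat M-", "Treat M+"))
op.test  = t.test(op.treat, op.plac, alternative = "greater", var.equal = TRUE)

# Marker-positive test: each arm is a single data model sample
mp.plac  = combine.samples(data, list("Plac M+"))
mp.treat = combine.samples(data, list("Treat M+"))
mp.test  = t.test(mp.treat, mp.plac, alternative = "greater", var.equal = TRUE)

c("OP test" = op.test$p.value, "M+ test" = mp.test$p.value)
```

The two raw p-values produced here are the quantities to which a multiplicity adjustment such as the Hochberg procedure would then be applied.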

Analysis model

# General set of analysis model parameters
analysis.model.general = list(test.type = "ttest",
                              mult.adj.proc = "hochberg")

tests = list(
    list(test.id = "OP test",
         test.samples = list(list("Plac M-", "Plac M+"),
                             list("Treat M-", "Treat M+"))),
    list(test.id = "M+ test",
         test.samples = list("Plac M+", "Treat M+"))
)

# Analysis model


tests = tests)

Evaluation model

It is reasonable to consider the following success criteria in this case study:

• Marginal power: Probability of a significant outcome in each patient population.

• Disjunctive power: Probability of a significant treatment effect in the overall population (OP) or marker-positive subpopulation (M+). This metric defines the overall probability of success in this clinical trial.

• Conjunctive power: Probability of simultaneously achieving significance in the overall population and marker-positive subpopulation. This criterion will be useful if the trial’s sponsor is interested in pursuing an enhanced efficacy claim (Millen et al., 2012).
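The three criteria can be illustrated outside of Mediana with base R. In this sketch the raw p-values come from hypothetical correlated test statistics (the effect sizes, the correlation and the one-sided level of 0.025 are assumptions, not values from this case study), the Hochberg adjustment is applied with p.adjust, and each criterion is computed from the resulting significance indicators.

```r
set.seed(42)
n.sims = 10000
alpha = 0.025   # one-sided level, assumed for illustration

# Hypothetical correlated z-statistics for the OP and M+ tests
rho = 0.6
z.common = rnorm(n.sims)
z.op = 2.7 + sqrt(rho) * z.common + sqrt(1 - rho) * rnorm(n.sims)
z.mp = 2.3 + sqrt(rho) * z.common + sqrt(1 - rho) * rnorm(n.sims)
p.raw = cbind("OP test" = 1 - pnorm(z.op), "M+ test" = 1 - pnorm(z.mp))

# Hochberg adjustment with equal weights, applied within each simulation run
p.adj = t(apply(p.raw, 1, p.adjust, method = "hochberg"))
colnames(p.adj) = colnames(p.raw)
signif = p.adj <= alpha

marginal.power    = colMeans(signif)             # each test separately
disjunctive.power = mean(rowSums(signif) >= 1)   # at least one test significant
conjunctive.power = mean(rowSums(signif) == 2)   # both tests significant
```

By construction, disjunctive power is at least as large as either marginal power and conjunctive power is no larger than either.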

The following evaluation model applies the three criteria to the two tests listed in the analysis model:




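metrics = list(
    list(metric.id = "Marginal power",
         metric.criterion = "marginal",
         metric.tests = list("OP test",
                             "M+ test")),
    list(metric.id = "Disjunctive power",
         metric.criterion = "disjunctive",
         metric.tests = list("OP test",
                             "M+ test")),
    list(metric.id = "Conjunctive power",
         metric.criterion = "conjunctive",
         metric.tests = list("OP test",
                             "M+ test"))
)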

# Evaluation model


metrics = metrics)


The final step in the process is to perform power calculations based on the three models defined above and summarize the results:


evaluation




n.sims = 10000)


sample.size.label = list("N = 165 (OP), N = 66 (M+)",
                         "N = 170 (OP), N = 68 (M+)",
                         "N = 175 (OP), N = 70 (M+)")
)

The custom labels assigned to the sample size sets indicate the number of patients per treatment arm in each patient population (overall and marker-positive populations). With a total sample size of 330, 340 or 350 patients, each arm includes 165, 170 or 175 patients, of whom 66, 68 or 70 are marker-positive.

The most important section of the output produced by the summary function, i.e., the section which lists the success probabilities based on the three criteria specified in the evaluation model, is displayed in Table 11. The full report is provided on the package’s web site.

TABLE 11 Evaluation summary in the asthma clinical trial

Marginal power
Outcome 1    N = 165 (OP), N = 66 (M+)    OP test = 0.769
                                          M+ test = 0.602
Outcome 1    N = 170 (OP), N = 68 (M+)    OP test = 0.793
                                          M+ test = 0.625
Outcome 1    N = 175 (OP), N = 70 (M+)    OP test = 0.799
                                          M+ test = 0.634

Disjunctive power
Outcome 1    N = 165 (OP), N = 66 (M+)    Criterion = 0.794
Outcome 1    N = 170 (OP), N = 68 (M+)    Criterion = 0.817
Outcome 1    N = 175 (OP), N = 70 (M+)    Criterion = 0.820

Conjunctive power
Outcome 1    N = 165 (OP), N = 66 (M+)    Criterion = 0.576
Outcome 1    N = 170 (OP), N = 68 (M+)    Criterion = 0.601
Outcome 1    N = 175 (OP), N = 70 (M+)    Criterion = 0.613

Beginning with marginal power, Table 11 shows that the probability of a significant treatment effect in the overall population is very close to 80% with the total sample size of 340 and 350 patients. Lower power values are observed in the analysis of the marker-positive subpopulation, which is mainly due to the fact that only 40% of the patients are marker-positive. The overall probability of success based on disjunctive power is close to or exceeds 80%. In general, it may be difficult to establish a significant treatment effect simultaneously in two populations but conjunctive power is reasonably high in this example (it ranges between 58% and 61%).

4.7 Case study 4: Two endpoints

Case study 4 serves as an extension of the oncology clinical trial example presented in Section 4.3. Consider again a Phase III trial in patients with metastatic colorectal cancer (MCC). The same general design will be assumed in this section; however, an additional endpoint (overall survival) will be introduced. The case of two endpoints helps showcase the package’s ability to model complex design and analysis strategies in trials with multivariate outcomes.

Progression-free survival (PFS) is the primary endpoint in this clinical trial and overall survival (OS) serves as the key secondary endpoint, which provides supportive evidence of treatment efficacy. A hierarchical testing approach will be utilized in the analysis of the two endpoints. The PFS analysis will be performed first at α = 0.025 (one-sided), followed by the OS analysis at the same level if a significant effect on PFS is established. The resulting testing procedure is equivalent to the fixed-sequence procedure and controls the overall Type I error rate (Dmitrienko and D’Agostino, 2013).

The treatment effect assumptions that will be used in clinical scenario evaluation are listed in Table 12. The table shows the hypothesized median times along with the corresponding hazard rates for the primary and secondary endpoints. It follows from the table that the expected effect size is much larger for PFS compared to OS (PFS hazard ratio is lower than OS hazard ratio).

TABLE 12 Treatment effect assumptions in the MCC clinical trial

Endpoint                      Placebo    Treatment

Progression-free survival
  Median time (months)        6          9
  Hazard rate                 0.116      0.077
  Hazard ratio                0.077/0.116 = 0.67

Overall survival
  Median time (months)        15         19
  Hazard rate                 0.046      0.036
  Hazard ratio                0.036/0.046 = 0.79
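The hazard rates in Table 12 follow from the median times under the exponential assumption, since the median of an exponential distribution with rate λ equals log(2)/λ. A short base R check (the variable names are illustrative):

```r
# Median times (months) from Table 12
median.time = rbind(PFS = c(plac = 6, treat = 9),
                    OS  = c(plac = 15, treat = 19))

# Exponential distribution: hazard rate = log(2) / median
hazard.rate = log(2) / median.time
round(hazard.rate, 3)

# Hazard ratio (treatment vs placebo), computed from the unrounded rates
hazard.ratio = hazard.rate[, "treat"] / hazard.rate[, "plac"]
round(hazard.ratio, 2)
```

The hazard ratios 0.67 and 0.79 are obtained from the unrounded rates (equivalently, from the ratios of the medians, 6/9 and 15/19). Dividing the rounded rates shown in the table gives slightly smaller values.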

Data model

The general set of data model parameters specifies the distribution of the trial endpoint or endpoints. In this clinical trial two endpoints are evaluated for each patient (PFS and OS) and thus their joint distribution needs to be listed in the general set. A bivariate exponential distribution will be used in this example and samples from this bivariate distribution will be generated by the mvexponential function which implements multivariate exponential distributions. The function utilizes the copula method, i.e., random variables that follow a bivariate normal distribution will be generated and then converted into exponential random variables.
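The copula method described above can be sketched in a few lines of base R. The helper rmvexp.sketch is hypothetical and is not the package's mvexponential function; it only illustrates the idea of generating correlated normal variables and transforming them to exponential marginals. The rates and the correlation of 0.3 are the placebo assumptions used in this case study.

```r
set.seed(123)

# Hypothetical helper: Gaussian copula with exponential marginals
rmvexp.sketch = function(n, rate, corr) {
    k = length(rate)
    # Correlated standard normal variables via the Cholesky factor of corr
    z = matrix(rnorm(n * k), n, k) %*% chol(corr)
    # Normal -> uniform -> exponential (probability integral transform)
    u = pnorm(z)
    sapply(seq_len(k), function(j) qexp(u[, j], rate = rate[j]))
}

corr.matrix = matrix(c(1.0, 0.3,
                       0.3, 1.0), 2, 2)
plac = rmvexp.sketch(10000, rate = c(0.116, 0.046), corr = corr.matrix)
colnames(plac) = c("PFS", "OS")

colMeans(plac)    # close to 1 / rate for each endpoint
cor(plac)[1, 2]   # positive, but not exactly 0.3
```

Note that the correlation matrix refers to the underlying normal variables. The correlation between the resulting exponential variables is generally somewhat smaller in absolute value.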

The next several statements specify the parameters of the bivariate exponential distribution:

• Parameters of the marginal exponential distributions, i.e., the hazard rates.

• Correlation matrix of the underlying multivariate normal distribution used in the copula method.

The hazard rates for PFS and OS in each treatment arm are defined based on the information presented in Table 12 (plac.par and treat.par) and the correlation matrix is specified based on historical information (corr.matrix). These parameters are combined to define the outcome parameter sets (outcome.plac and outcome.treat) that will be included in the sample-specific set of data model parameters (samples list).

General set

# General set of data model parameters
data.model.general = list(outcome.dist = "mvexponential")

# Parameter lists
plac.par = list(rate = 0.116, rate = 0.046)
treat.par = list(rate = 0.077, rate = 0.036)

# Correlation between two endpoints
corr.matrix = matrix(c(1.0, 0.3,
                       0.3, 1.0), 2, 2)

outcome.plac = list(plac.par, corr.matrix)
outcome.treat = list(treat.par, corr.matrix)

To define the sample-specific data model parameters, recall from Section 4.3 that a 2:1 randomization ratio will be used in this clinical trial and thus the sample size (number of events) in each treatment arm is computed from the total event count specified by the user. Secondly, a separate sample ID needs to be assigned to each endpoint within the two samples corresponding to the two treatment arms. This will enable the user to construct analysis models for examining the treatment effect on each endpoint.

Sample-specific data model parameters and data model

event.count.total = c(270, 300)
event.count.plac = as.list((1/3) * event.count.total)
event.count.treat = as.list((2/3) * event.count.total)

samples = list(
    list(sample.id = list("Plac PFS", "Plac OS"),
         sample.size = event.count.plac,
         outcome.par = list(outcome.plac)),
    list(sample.id = list("Treat PFS", "Treat OS"),
         sample.size = event.count.treat,
         outcome.par = list(outcome.treat))
)

# Data model


samples = samples)


Analysis model

The treatment comparisons for both endpoints will be carried out based on the log-rank test (test.type = "logrank"). Further, as was stated in the beginning of this section, the two endpoints will be tested hierarchically using a multiplicity adjustment procedure known as the fixed-sequence procedure. This procedure belongs to the class of chain procedures and Figure 1 provides a visual summary of the decision rules used in this procedure. The circles in this figure denote the two null hypotheses of interest:

• H1: Null hypothesis of no difference between the two arms with respect to PFS.

• H2: Null hypothesis of no difference between the two arms with respect to OS.

The value displayed above a circle defines the initial weight of each null hypothesis. All of the overall α is allocated to H1 to ensure that the OS test will be carried out only after the PFS test is significant and the arrow indicates that H2 will be tested after H1 is rejected.

Figure 1 Visual summary of the decision rules used in the chain procedure. [Diagram: H1 (initial weight 1) is connected to H2 (initial weight 0) by an arrow labeled 1.]

More formally, a chain procedure is uniquely defined by specifying a vector of hypothesis weights (W) and matrix of transition parameters (G). Based on Figure 1, these parameters are given by

$$W = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad G = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}.$$

Two variables (chain.weight and chain.transition) are defined below to pass the hypothesis weights and transition parameters to the general set of analysis model parameters.

General set

# Vector of hypothesis weights
chain.weight = c(1, 0)

# Matrix of transition parameters
chain.transition = matrix(c(0, 1,
                            0, 0), 2, 2, byrow = TRUE)

# General set of analysis model parameters
analysis.model.general = list(test.type = "logrank",
                              mult.adj.proc = "chain",
                              mult.adj.par = list(chain.weight,
                                                  chain.transition))

When defining the matrix of transition parameters for a chain procedure, it is critical to indicate that the matrix will be constructed from the vector using a row-wise algorithm, i.e., byrow = TRUE. Otherwise, the matrix will be transposed, which will result in a different chain procedure.
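The effect of omitting byrow = TRUE can be seen in a simplified sketch of the decision rules for two hypotheses. The function chain.two is hypothetical and is not the package's implementation: it rejects a hypothesis when its p-value does not exceed α times its current weight and then passes the weight on according to the corresponding row of G. A general chain procedure also updates the transition parameters after each rejection, which is not needed with only two hypotheses. The p-values are made up for illustration.

```r
# Simplified chain procedure for two hypotheses (illustration only)
chain.two = function(p, alpha, W, G) {
    reject = c(FALSE, FALSE)
    w = W
    repeat {
        candidate = which(!reject & p <= alpha * w)
        if (length(candidate) == 0) break
        i = candidate[1]
        reject[i] = TRUE
        w = w + w[i] * G[i, ]   # pass the weight of hypothesis i along row i of G
        w[i] = 0
    }
    reject
}

W = c(1, 0)
G.byrow = matrix(c(0, 1,
                   0, 0), 2, 2, byrow = TRUE)   # H1 -> H2 (fixed-sequence)
G.bycol = matrix(c(0, 1,
                   0, 0), 2, 2)                 # transposed: H2 -> H1

p = c(0.010, 0.020)   # hypothetical raw p-values for the PFS and OS tests
chain.two(p, alpha = 0.025, W, G.byrow)   # TRUE TRUE: OS is tested after PFS
chain.two(p, alpha = 0.025, W, G.bycol)   # TRUE FALSE: OS never receives weight
```

With the transposed matrix the weight of H1 is not transferred to H2, so the OS hypothesis can never be rejected even when both p-values are small.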

As shown below, the two tests included in the analysis model reflect the two-fold objective of this trial. The first test focuses on a PFS comparison between the two treatment arms (test.id = "PFS test") whereas the other test is carried out to assess the treatment effect on OS (test.id = "OS test"):

Analysis model

# Test-specific analysis model parameters
tests = list(
    list(test.id = "PFS test",
         test.samples = list("Plac PFS", "Treat PFS")),
    list(test.id = "OS test",
         test.samples = list("Plac OS", "Treat OS"))
)

# Analysis model


tests = tests)

Evaluation model

The evaluation model specifies the most basic criterion for assessing the probability of success in the PFS and OS analyses (marginal power). A criterion based on disjunctive power could be considered but it would not provide additional information. Due to the hierarchical testing approach, the probability of detecting a significant treatment effect on at least one endpoint (disjunctive power) is simply equal to the probability of establishing a significant PFS effect.




metrics = list(
    list(metric.id = "Marginal power",
         metric.criterion = "marginal",
         metric.tests = list("PFS test", "OS test"))
)

# Evaluation model


metrics = metrics)


The data, analysis and evaluation models are ready to be combined and passed to the cse function to run simulation-based power calculations. A clinical scenario evaluation report is produced by calling the summary function.


evaluation




n.sims = 10000)


sample.size.label = list("Total event count = 270",

"Total event count = 300")

)

The most relevant section of the clinical scenario evaluation report produced by the summary function displays the probabilities of establishing treatment benefit with respect to PFS and OS (see Table 13). The full report is provided on the package’s web site.

TABLE 13 Evaluation summary in the metastatic colorectal cancer clinical trial

Marginal power
Outcome 1    Total event count = 270    PFS test = 0.883
                                        OS test = 0.449
Outcome 1    Total event count = 300    PFS test = 0.917
                                        OS test = 0.488

Table 13 demonstrates that this clinical trial is well-powered to analyze the treatment’s effect on time to progression in patients with metastatic colorectal cancer. However, it will be challenging to perform reliable inferences with respect to survival benefit. First, the anticipated treatment difference for OS is smaller than the difference assumed for PFS (see Table 12). In addition to that, power of the OS test is reduced due to the penalty induced by the hierarchical testing strategy.

4.8 Case study 5: Two endpoints and three treatment arms

This case study extends the straightforward setting presented in Case study 1 (see Section 4.2) to a more complex setting involving two trial endpoints and three treatment arms. Case study 5 illustrates the process of performing power calculations in clinical trials with multiple, hierarchically structured objectives and “multivariate” multiplicity adjustment strategies (gatekeeping procedures).

Consider a three-arm Phase III clinical trial for the treatment of rheumatoid arthritis (RA). Two co-primary endpoints will be used to evaluate the effect of a novel treatment on clinical response and on physical function. The endpoints are defined as follows:

• Endpoint 1: Response rate based on the American College of Rheumatology definition of improvement (ACR20).

• Endpoint 2: Change from baseline in the Health Assessment Questionnaire-Disability Index (HAQ-DI).

The two endpoints have different marginal distributions. The first endpoint is binary whereas the second one is continuous and follows a normal distribution.

The efficacy profile of two doses of a new treatment (Dose L and Dose H) will be compared to that of a placebo and a successful outcome will be defined as a significant treatment effect at either or both doses. A hierarchical structure has been established within each dose so that Endpoint 2 will be tested if and only if there is evidence of a significant effect on Endpoint 1.

Three treatment effect scenarios for each endpoint are displayed in Table 14. The scenarios define three outcome parameter sets. The first set represents a rather conservative treatment effect scenario, the second set is a standard (most plausible) scenario and the third set represents an optimistic scenario.

Note that a reduction in the HAQ-DI score indicates a beneficial effect and thus the mean changes are assumed to be negative for Endpoint 2 in Table 14.

TABLE 14 Treatment effect assumptions in the RA clinical trial

Outcome parameter set      Endpoint         Placebo         Dose L          Dose H

Endpoint 1
Outcome parameter set 1    Response rate    30%             40%             50%
Outcome parameter set 2    Response rate    30%             45%             55%
Outcome parameter set 3    Response rate    30%             50%             60%

Endpoint 2
Outcome parameter set 1    Mean (SD)        −0.10 (0.50)    −0.20 (0.50)    −0.30 (0.50)
Outcome parameter set 2    Mean (SD)        −0.10 (0.50)    −0.25 (0.50)    −0.35 (0.50)
Outcome parameter set 3    Mean (SD)        −0.10 (0.50)    −0.30 (0.50)    −0.40 (0.50)

Data model

As in Case study 4, two endpoints are evaluated for each patient in this clinical trial example, which means that their joint distribution needs to be defined in the general set of data model parameters. The mvmixed function will be utilized for specifying a bivariate distribution with binomial and normal marginals. In general, this function is used for modeling correlated normal, binomial and exponential endpoints and relies on the copula method, i.e., random variables are generated from a multivariate normal distribution and converted into variables with pre-specified marginal distributions.
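The same copula idea with mixed marginals can be sketched in base R. The helper rmixed.sketch is hypothetical and is not the package's mvmixed function; it generates one binary and one normal endpoint per patient from correlated normal variables, using the placebo assumptions of Table 14 and the correlation of 0.5 specified in the data model of this case study.

```r
set.seed(321)

# Hypothetical helper: Gaussian copula with one binary and one normal marginal
rmixed.sketch = function(n, prop, mean, sd, corr) {
    z = matrix(rnorm(n * 2), n, 2) %*% chol(corr)
    u = pnorm(z)
    cbind(binary = qbinom(u[, 1], size = 1, prob = prop),
          normal = qnorm(u[, 2], mean = mean, sd = sd))
}

corr.matrix = matrix(c(1.0, 0.5,
                       0.5, 1.0), 2, 2)

# Placebo arm: ACR20 response rate of 30%, HAQ-DI mean change of -0.10 (SD 0.5)
plac = rmixed.sketch(10000, prop = 0.3, mean = -0.10, sd = 0.5,
                     corr = corr.matrix)
colMeans(plac)   # close to 0.3 and -0.10
```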

Three parameters must be defined to specify the joint distribution of Endpoints 1 and 2 in this clinical trial example:

• Variable types (binomial and normal).

• Outcome distribution parameters (proportion for Endpoint 1, mean and SD for Endpoint 2) based on the assumptions listed in Table 14.

• Correlation matrix of the multivariate normal distribution used in the copula method.

These parameters are combined to define three outcome parameter sets (e.g., outcome1.plac, outcome1.dosel and outcome1.doseh are combined to specify the first outcome parameter set) that will be included in the samples list in the data model. The sample size per treatment arm ranges between 100 and 200 and is defined in the general set.

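Data model

# General set of data model parameters
data.model.general = list(outcome.dist = "mvmixed",
                          sample.size = seq(100, 200, 25))

# Variable types
var.type = list("binomial", "normal")

# Outcome distribution parameters
plac.par = list(prop = 0.3, list(mean = -0.10, sd = 0.5))

dosel.par1 = list(prop = 0.40, list(mean = -0.20, sd = 0.5))
dosel.par2 = list(prop = 0.45, list(mean = -0.25, sd = 0.5))
dosel.par3 = list(prop = 0.50, list(mean = -0.30, sd = 0.5))

doseh.par1 = list(prop = 0.50, list(mean = -0.30, sd = 0.5))
doseh.par2 = list(prop = 0.55, list(mean = -0.35, sd = 0.5))
doseh.par3 = list(prop = 0.60, list(mean = -0.40, sd = 0.5))

# Correlation between two endpoints
corr.matrix = matrix(c(1.0, 0.5,
                       0.5, 1.0), 2, 2)

outcome1.plac = list(var.type, plac.par, corr.matrix)
outcome1.dosel = list(var.type, dosel.par1, corr.matrix)
outcome1.doseh = list(var.type, doseh.par1, corr.matrix)

outcome2.plac = list(var.type, plac.par, corr.matrix)
outcome2.dosel = list(var.type, dosel.par2, corr.matrix)
outcome2.doseh = list(var.type, doseh.par2, corr.matrix)

outcome3.plac = list(var.type, plac.par, corr.matrix)
outcome3.dosel = list(var.type, dosel.par3, corr.matrix)
outcome3.doseh = list(var.type, doseh.par3, corr.matrix)

samples = list(
    list(sample.id = list("Plac ACR20", "Plac HAQ-DI"),
         outcome.par = list(outcome1.plac, outcome2.plac, outcome3.plac)),
    list(sample.id = list("DoseL ACR20", "DoseL HAQ-DI"),
         outcome.par = list(outcome1.dosel, outcome2.dosel, outcome3.dosel)),
    list(sample.id = list("DoseH ACR20", "DoseH HAQ-DI"),
         outcome.par = list(outcome1.doseh, outcome2.doseh, outcome3.doseh))
)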

# Data model


samples = samples)

Analysis model

To set up the analysis model in this clinical trial example, note that the treatment comparisons for Endpoints 1 and 2 will be carried out based on two different statistical tests:

• Endpoint 1: Two-sample test for comparing proportions (test.type = "proptest").

• Endpoint 2: Two-sample t-test (test.type = "ttest").

This implies that the test.type parameter needs to be defined in the test-specific set of analysis model parameters (tests list) rather than the general set.

It was pointed out earlier in this section that the two endpoints will be tested hierarchically within each dose. Figure 2 provides a visual summary of the testing strategy used in this clinical trial. The circles in this figure denote the four null hypotheses of interest:

• H1: Null hypothesis of no difference between Dose L and placebo with respect to Endpoint 1.

• H2: Null hypothesis of no difference between Dose H and placebo with respect to Endpoint 1.

• H3: Null hypothesis of no difference between Dose L and placebo with respect to Endpoint 2.

• H4: Null hypothesis of no difference between Dose H and placebo with respect to Endpoint 2.

Figure 2 Visual summary of the testing strategy used in the multiple-sequence gatekeeping procedure. [Diagram: H1 and H2 in the first row, H3 and H4 in the second row.]

A multiple testing procedure known as the multiple-sequence gatekeeping procedure will be applied to account for the hierarchical structure of this multiplicity problem. This procedure belongs to the class of mixture-based gatekeeping procedures introduced in Dmitrienko et al. (2014). This gatekeeping procedure is specified by defining the following three parameters:

• Families of null hypotheses (family).

• Component procedures used in the families (component.procedure).

• Truncation parameters used in the families (gamma).

These parameters are included in the general set shown below. The first parameter (family = c(1, 1, 2, 2)) states that the null hypotheses will be grouped into two families:

• Family 1: H1 and H2.

• Family 2: H3 and H4.

The families will be tested sequentially and a truncated Holm procedure will be applied within each family (component.procedure = c("holm", "holm")). Lastly, the truncation parameter will be set to 0.8 in Family 1 and to 1 in Family 2 (gamma = c(0.8, 1)). The resulting parameters are included in the mult.adj.par list and, as before, the mult.adj.proc parameter is used to specify the multiple testing procedure (multiple-sequence gatekeeping procedure).
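The role of the truncation parameter can be illustrated by computing the critical values of a truncated Holm procedure, in which the critical value for the i-th ordered p-value among m hypotheses is (γ/(m − i + 1) + (1 − γ)/m)α. The sketch below is not taken from the package: the function name is hypothetical, a one-sided α of 0.025 is assumed since no level is stated for this case study, and the full α is used in both families for illustration only (in the gatekeeping procedure, the level available to Family 2 depends on the outcome in Family 1).

```r
# Critical values of the truncated Holm procedure for m ordered p-values:
# step i uses (gamma / (m - i + 1) + (1 - gamma) / m) * alpha
truncated.holm.crit = function(m, gamma, alpha) {
    i = seq_len(m)
    (gamma / (m - i + 1) + (1 - gamma) / m) * alpha
}

alpha = 0.025   # assumed one-sided level

truncated.holm.crit(2, gamma = 0.8, alpha)   # Family 1: 0.0125 0.0225
truncated.holm.crit(2, gamma = 1.0, alpha)   # Family 2: 0.0125 0.0250 (regular Holm)
truncated.holm.crit(2, gamma = 0.0, alpha)   # 0.0125 0.0125 (Bonferroni)
```

Setting γ below 1 in Family 1 makes the second step slightly more conservative than the regular Holm procedure. This is what allows a fraction of α to be carried over to Family 2 when only one of the Family 1 hypotheses is rejected.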

General set

# Parameters of multiple-sequence gatekeeping procedure
family = c(1, 1, 2, 2)
component.procedure = c("holm", "holm")
gamma = c(0.8, 1)

# General set of analysis model parameters
analysis.model.general = list(mult.adj.proc = "multiple.sequence.gatekeeping",
                              mult.adj.par = list(family,
                                                  component.procedure,
                                                  gamma))

The test-specific analysis model parameters and the overall analysis model are defined as follows:

Test-specific analysis model parameters and analysis model

tests = list(
    list(test.id = "Pl vs DoseL - ACR20",
         test.type = "proptest",
         test.samples = list("Plac ACR20", "DoseL ACR20")),
    list(test.id = "Pl vs DoseH - ACR20",
         test.type = "proptest",
         test.samples = list("Plac ACR20", "DoseH ACR20")),
    list(test.id = "Pl vs DoseL - HAQ-DI",
         test.type = "ttest",
         test.samples = list("DoseL HAQ-DI", "Plac HAQ-DI")),
    list(test.id = "Pl vs DoseH - HAQ-DI",
         test.type = "ttest",
         test.samples = list("DoseH HAQ-DI", "Plac HAQ-DI"))
)

# Analysis model


tests = tests)

Recall that a numerically lower value indicates a beneficial effect for the HAQ-DI score and, as a result, the experimental treatment arm must be defined prior to the placebo arm in the test.samples parameters corresponding to the HAQ-DI tests, e.g., test.samples = list("DoseL HAQ-DI", "Plac HAQ-DI").
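The ordering rule can be illustrated with a base R t-test. The sketch assumes, as the text implies, that the test favors rejection when the mean of the second sample exceeds the mean of the first; this convention is inferred from the manual rather than taken from the package code, and the helper name and simulated data are hypothetical.

```r
set.seed(7)

# HAQ-DI changes under the conservative scenario (Table 14), 100 patients per arm
plac  = rnorm(100, mean = -0.10, sd = 0.5)
doseh = rnorm(100, mean = -0.30, sd = 0.5)

# Hypothetical helper: one-sided p-value for "second sample minus first sample > 0"
one.sided.p = function(first, second) {
    t.test(second, first, alternative = "greater", var.equal = TRUE)$p.value
}

p.treat.first = one.sided.p(first = doseh, second = plac)   # treatment listed first
p.plac.first  = one.sided.p(first = plac, second = doseh)   # placebo listed first

c(p.treat.first, p.plac.first)
```

The two p-values add up to one. When lower scores are better, listing the treatment arm first is what turns a beneficial effect into a small p-value.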

Evaluation model

In order to assess the probability of success in this clinical trial, a hybrid criterion based on the conjunctive criterion (both trial endpoints must be significant) and disjunctive criterion (at least one dose-placebo comparison must be significant) can be considered. This criterion will be met if a significant effect is established at one or two doses on Endpoint 1 (ACR20) and also at one or two doses on Endpoint 2 (HAQ-DI). However, due to the hierarchical structure of the testing strategy (see Figure 2), this is equivalent to demonstrating a significant difference between Placebo and at least one dose with respect to Endpoint 2. The corresponding criterion is a subset disjunctive criterion based on the two Endpoint 2 tests (subset disjunctive power was briefly mentioned in Case study 2). In addition, the sponsor may also be interested in evaluating marginal power as well as subset disjunctive power based on the Endpoint 1 tests. The latter criterion will be met if a significant difference between Placebo and at least one dose is established with respect to Endpoint 1. Additionally, as in Case study 2, the user could consider defining custom evaluation criteria.

The three resulting evaluation criteria (marginal power, subset disjunctive criterion based on the Endpoint 1 tests and subset disjunctive criterion based on the Endpoint 2 tests) are included in the following evaluation model.

Evaluation model # General set of evaluation model parameter



metrics = list(
    list(metric.id = "Marginal power",
         metric.criterion = "marginal",
         metric.tests = list("Pl vs DoseL - ACR20",
                             "Pl vs DoseH - ACR20",
                             "Pl vs DoseL - HAQ-DI",
                             "Pl vs DoseH - HAQ-DI")),
    list(metric.id = "Disjunctive power - ACR20",
         metric.criterion = "disjunctive",
         metric.tests = list("Pl vs DoseL - ACR20",
                             "Pl vs DoseH - ACR20")),
    list(metric.id = "Disjunctive power - HAQ-DI",
         metric.criterion = "disjunctive",
         metric.tests = list("Pl vs DoseL - HAQ-DI",
                             "Pl vs DoseH - HAQ-DI"))
)

# Evaluation model


metrics = metrics)

Note that subset disjunctive power is requested in the second and third evaluation metrics by specifying the list of tests to which the disjunctive criterion is applied. For example, to compute subset disjunctive power based on the two Endpoint 2 tests, i.e., "Pl vs DoseL - HAQ-DI" and "Pl vs DoseH - HAQ-DI", the metric.tests parameter is set to list("Pl vs DoseL - HAQ-DI", "Pl vs DoseH - HAQ-DI"). As a consequence, the Endpoint 1 tests, i.e., "Pl vs DoseL - ACR20" and "Pl vs DoseH - ACR20", are excluded from consideration.


Power calculations are performed based on the data, analysis and evaluation models defined earlier in this section by calling the cse function and summarizing the results:


evaluation




n.sims = 10000)


sample.size.label = list("N = 100", "N = 125", "N = 150",

"N = 175", "N = 200"),

outcome.parameter.label = list("Conservative", "Standard",

"Optimistic")

)

Custom labels are assigned to the sample size and outcome parameter sets to facilitate the interpretation of the evaluation results.

Table 15 provides a summary of the subset disjunctive criterion based on the two Endpoint 2 tests (HAQ-DI tests) and the full report is available on the package’s web site.

It follows from Table 15 that, under the standard (most plausible) treatment effect scenario, the success probability based on subset disjunctive power derived from the two HAQ-DI tests is close to 90% with 125 patients per treatment arm. The other two success criteria (not shown in Table 15) provide additional information which may help facilitate the selection of the sample size in this clinical trial.

TABLE 15 Evaluation summary in the RA clinical trial

Disjunctive power - HAQ-DI
Conservative    N = 100    Criterion = 0.452
Conservative    N = 125    Criterion = 0.603
Conservative    N = 150    Criterion = 0.726
Conservative    N = 175    Criterion = 0.831
Conservative    N = 200    Criterion = 0.894

Standard        N = 100    Criterion = 0.772
Standard        N = 125    Criterion = 0.890
Standard        N = 150    Criterion = 0.954
Standard        N = 175    Criterion = 0.979
Standard        N = 200    Criterion = 0.992

Optimistic      N = 100    Criterion = 0.947
Optimistic      N = 125    Criterion = 0.985
Optimistic      N = 150    Criterion = 0.997
Optimistic      N = 175    Criterion = 0.999
Optimistic      N = 200    Criterion = 1.000

References

[1] Benda, N., Branson, M., Maurer, W., Friede, T. (2010). Aspects of modernizing drug development using clinical scenario planning and evaluation. Drug Information Journal. 44, 299-315.

[2] Bretz, F., Maurer, W., Brannath, W., Posch, M. (2009). A graphical approach to sequentially rejective multiple test procedures. Statistics in Medicine. 28, 586-604.

[3] Burman, C.F., Sonesson, C., Guilbaud, O. (2009). A recycling framework for the construction of Bonferroni-based multiple tests. Statistics in Medicine. 28, 739-761.

[4] Dmitrienko, A., Millen, B., Brechenmacher, T., Paux, G. (2011). Development of gatekeeping strategies in confirmatory clinical trials. Biometrical Journal. 53, 875-893.

[5] Dmitrienko, A., Kordzakhia, G., Tamhane, A.C. (2011). Multistage and mixture parallel gatekeeping procedures in clinical trials. Journal of Biopharmaceutical Statistics. 21, 726-747.

[6] Dmitrienko, A., D’Agostino, R.B. (2013). Tutorial in Biostatistics: Traditional Multiplicity Adjustment Methods in Clinical Trials. Statistics in Medicine. 32, 5172-5218.

[7] Dmitrienko, A., Kordzakhia, G., Brechenmacher, T. (2014). Mixture-based gatekeeping procedures for multiplicity problems with multiple sequences of hypotheses. In press.

[8] Friede, T., Nicholas, R., Stallard, N., Todd, S., Parsons, N.R., Valdes-Marquez, E., Chataway, J. (2010). Refinement of the clinical scenario evaluation framework for assessment of competing development strategies with an application to multiple sclerosis. Drug Information Journal. 44, 713-718.

[9] Julious, S.A. (2009). Sample sizes for clinical trials. Chapman and Hall.

[10] Millen, B., Dmitrienko, A. (2011). Chain procedures: A class of flexible closed testing procedures with clinical trial applications. Statistics in Biopharmaceutical Research. 3, 14-30.

[11] Millen, B., Dmitrienko, A., Ruberg, S., Shen, L. (2012). A statistical framework for decision making in confirmatory multipopulation tailoring clinical trials. Drug Information Journal. 46, 647-656.

[12] Millen, B., Dmitrienko, A., Song, G. (2014). Bayesian assessment of the influence and interaction conditions in multi-population tailoring clinical trials. Journal of Biopharmaceutical Statistics. 24, 94-109.

[13] Sormani, M.P., Bruzzi, P., Miller, D.H., Gasperini, C., Barkhof, F., Filippi, M. (1999a). Modelling MRI enhancing lesion counts in multiple sclerosis using a negative binomial model: implications for clinical trials. Journal of the Neurological Sciences. 163, 74-80.

[14] Sormani, M.P., Molyneux, P.D., Gasperini, C., Barkhof, F., Yousry, T.A., Miller, D.H., Filippi, M. (1999b). Statistical power of MRI monitored trials in multiple sclerosis: new data and comparison with previous results. Journal of Neurology, Neurosurgery, and Psychiatry. 66, 465-469.
