Download - 542-02-#1 STATISTICS 542 Introduction to Clinical Trials “What’s the Question?” Quote by Dr. Max Halperin NHLBI, National Institutes of Health

542-02-#1

STATISTICS 542
Introduction to Clinical Trials

“What’s the Question?”

Quote by Dr. Max Halperin

NHLBI, National Institutes of Health

542-02-#2

Primary vs. Secondary Question

• Primary
– most important, central question
– ideally, only one
– stated in advance
– basis for design and sample size

• Secondary
– related to primary
– stated in advance
– limited in number

542-02-#3

Examples (1)

• Physicians' Health Study (PHS)
– aspirin vs. placebo
– primary: total mortality
– secondary: fatal + nonfatal myocardial infarction (MI)

542-02-#4


• Eastern Cooperative Oncology Group (ECOG 1178)
– tamoxifen vs. placebo
– primary: tumor recurrence/relapse, disease-free survival
– secondary: total mortality

542-02-#5


• Multicenter Investigation of Limitation of Infarction Size (MILIS)
– propranolol vs. placebo
– primary: ultimate size of an acute myocardial infarction
– secondary: left ventricular ejection fraction

542-02-#6


• Chronic Study of Intermittent Positive Pressure Breathing (IPPB)
– long-term intermittent positive pressure breathing vs. nebulizer
– primary: forced expiratory volume (FEV1)
– secondary: quality of life

542-02-#7

A-HEFT

• Ref: NEJM, Nov 11, 2004
• 1050 African Americans with Class III-IV CHF
• Isosorbide dinitrate + hydralazine vs. placebo
• Composite outcome (death, HF hospitalizations, change in QoL)
• DMC terminated trial early

542-02-#8

A-HEFT

542-02-#9

A-HEFT

542-02-#10

2. Subgroup Questions

• Questions about effect of therapy in a sub-population of subjects entered into the trial
• Assess internal consistency of results
• Confirm previous hypothesis
• Generate new hypotheses

542-02-#11

Subgroup Analyses

Examples:

Breast Cancer: Does the benefit of treatment depend on menopausal status, stage of disease, age, etc.?

AIDS: Does the benefit of treatment depend on gender, age, initial CD4 counts, race, etc.?

Analyzing a trial by subgroup results in a separate statistical test for each subgroup. As a result, the probability of false positive conclusions arising in the analysis of a trial will increase.

542-02-#12

False Positive Rates

The greater the number of subgroups analyzed separately, the larger the probability of making false positive conclusions.

No. of Subgroups    Probability of At Least One False Positive
1                   .05
2                   .097
3                   .143
4                   .186
5                   .226
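The tabulated probabilities are consistent with running each subgroup test at the 0.05 level and treating the tests as independent, so that the chance of at least one false positive among k tests is 1 − (1 − 0.05)^k. A minimal Python sketch under that independence assumption (the function name is illustrative, not from the slides):

```python
def prob_at_least_one_false_positive(k: int, alpha: float = 0.05) -> float:
    """P(at least one false positive) across k independent tests, each at level alpha."""
    return 1.0 - (1.0 - alpha) ** k


# Probability for 1 to 5 subgroups, each tested separately at alpha = 0.05.
for k in range(1, 6):
    print(f"{k} subgroup(s): {prob_at_least_one_false_positive(k):.4f}")
```

Subgroup tests within one trial are rarely fully independent, so these figures are best read as a rough guide to how quickly the error rate grows.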

542-02-#13

Example - Subgroup Concern

• Second International Study of Infarct Survival (ISIS-2)
– 2 x 2 factorial design (aspirin vs. placebo and streptokinase vs. placebo)
– vascular and total mortality in patients with an acute myocardial infarction (MI)
– patients born under the astrological signs Gemini or Libra did somewhat worse on aspirin, while all other signs and the overall results showed an impressive and highly significant benefit from aspirin

542-02-#14

Subgroup Considerations

• Rules for Subgroups

1. Stated in advance (in protocol)
2. Limited in number
3. Interpreted cautiously, qualitatively
4. Look for consistency of results

• May be used to

1. Confirm or answer specific questions generated in a previous trial (e.g. metoprolol, age <65 vs. >65, total mortality)
2. Generate new hypotheses to be tested in some future trial
3. Assess consistency of primary outcomes

542-02-#15

MERIT-HF Study Design

• Chronic heart failure patients

• Randomized placebo controlled

• Metoprolol vs. placebo

• Two-week placebo run in (compliance)

• Entered 3991 patients

• Terminated early

• Mean follow-up approximately one year

The International Steering Committee on Behalf of the MERIT-HF Study Group,

Am J Cardiol 1997; 80(9B):54J-58J. The MERIT-HF Study Group, ACC, March 1999.

542-02-#16

MERIT Total Mortality

542-02-#17

MERIT

542-02-#18

MERIT (AHJ, 2001)

542-02-#19

Interaction Tests Not Unique

• Model Choice
– Cox
– Logistic

• Test Statistic
– Wald (regression coefficient)
– Score (likelihood)

• Definition of Subgroups
– US vs. World
– All Countries Separately

542-02-#20

Subgroup x Treatment Interaction

• Qualitative Interaction: treatment effect is different in direction in two subgroups

• Quantitative Interaction: treatment effect is of same direction but of different magnitude

• Statistical tests for interaction not very powerful

• Even if statistically significant, must be cautious in interpretation (PRAISE)
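One simple way to test a subgroup-by-treatment interaction is to compare the log relative risks estimated in two subgroups with a z-test. The sketch below is one of several possible tests (the slides note that interaction tests are not unique); the event counts are hypothetical and the function names are illustrative.

```python
import math


def log_rr_and_se(events_trt, n_trt, events_ctl, n_ctl):
    """Log relative risk (treatment vs. control) and its large-sample standard error."""
    log_rr = math.log((events_trt / n_trt) / (events_ctl / n_ctl))
    se = math.sqrt(1 / events_trt - 1 / n_trt + 1 / events_ctl - 1 / n_ctl)
    return log_rr, se


def interaction_z_test(sub_a, sub_b):
    """Two-sided z-test that the log relative risk is the same in two subgroups."""
    (b_a, se_a), (b_b, se_b) = sub_a, sub_b
    z = (b_a - b_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_value


# Hypothetical counts: events on treatment, n on treatment, events on control, n on control.
subgroup_a = log_rr_and_se(30, 200, 45, 200)
subgroup_b = log_rr_and_se(40, 200, 42, 200)

z, p_value = interaction_z_test(subgroup_a, subgroup_b)

# Opposite signs of the point estimates suggest a qualitative interaction,
# same signs a quantitative one (a descriptive label only, not a formal test).
direction = "qualitative" if subgroup_a[0] * subgroup_b[0] < 0 else "quantitative"
print(f"z = {z:.2f}, p = {p_value:.3f}, apparent interaction type: {direction}")
```

With subgroups of this size the test has little power, which is the point of the third bullet above: a non-significant interaction test does not show that the effects are equal.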

542-02-#21

PRAISE I

Ref: NEJM, 1996

• Amlodipine vs. placebo
• NYHA class II-III
• Randomized double-blind
• Mortality/hospitalization outcomes
• Stratified by etiology (ischemic/non-ischemic)
• 1153 patients

542-02-#22

PRAISE I

542-02-#23

PRAISE I - Interaction

• Overall P = 0.07

• Etiology by Trt Interaction P = 0.004

• Ischemic P = NS

• Non-Ischemic P < 0.001

542-02-#24

PRAISE I - Ischemic

542-02-#25

PRAISE I - Non-Ischemic

542-02-#26

PRAISE II

• Repeated non-ischemic strata
• Amlodipine vs. placebo
• Randomized double-blind
• 1653 patients
• Mortality outcome
• RR 1.0

542-02-#27

Three Views:

• Ignore subgroups and analyze only by treatment groups.

• Plan for subgroup analyses in advance. Do not “mine” data.

• Do subgroup analyses; however, view all results with caution.

542-02-#28

3. Adverse Effects

• Any intervention should do more good than harm

• Not always easy to specify in advance - many variables will be measured (clinical, laboratory)

• Investigators are usually not willing or interested in demonstrating an intervention to be harmful

• There may be known adverse effects from earlier trials

542-02-#29

Serious Adverse Events (SAEs)

• Death

• Irreversible event

• Requires hospitalization

542-02-#30

Serious Adverse Events (SAEs)

Must be reported to regulatory agencies and IRBs

542-02-#31

Adverse Events

• Challenges
– Short term vs. longer term
– Longer term follow-up in face of early benefit
– Rare AEs may be seen only with very large numbers of exposed patients and long term follow-up

• Recent Example – COX-2 inhibitors
– Immediate pain reduction vs. longer term increase in cardiovascular risk
– Vioxx & Celebrex

542-02-#32

What’s the Question?

4. “Natural History”

• Question not related to intervention
• Control group, often a “placebo,” may be used to describe how prognostic factors relate to eventual subject outcome (predictive, not causative)

e.g. Coronary Drug Project: Aided greatly in defining natural history of patients following a heart attack

5. Ancillary

• Questions not related at all but still of scientific interest
• Usually piggy-backed onto trial
• Must not interfere with trial!

542-02-#33

What’s the Question?

6. Exploratory

• Most studies conducted to test some hypothesis
• Most studies can generate new hypotheses
• Multiple analyses often conducted ⇒ increased false positive (Type I) error rate
• Could demand reduced significance level (or p-value) for each test
– e.g. α/K (assuming independent variables)
– α = .05, K = 10 ⇒ α/K = .005
– But can’t afford this usually
• Could be selective in number of primary hypotheses
• Should state key comparisons in advance
• Relegate other comparisons to either
– Confirmatory or Exploratory
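The per-test adjustment described above (dividing the overall significance level by the number of tests K, commonly called a Bonferroni correction) can be sketched in a few lines of Python. The independence of the tests is an assumption carried over from the slide; the function names are illustrative.

```python
def per_test_level(alpha: float, k: int) -> float:
    """Significance level for each of k tests so the overall level is about alpha."""
    return alpha / k


def overall_error_rate(level: float, k: int) -> float:
    """P(at least one false positive) across k independent tests run at `level`."""
    return 1.0 - (1.0 - level) ** k


alpha, k = 0.05, 10
adjusted = per_test_level(alpha, k)

print(f"per-test level: {adjusted:.4f}")
print(f"overall error rate without adjustment: {overall_error_rate(alpha, k):.3f}")
print(f"overall error rate with adjustment:    {overall_error_rate(adjusted, k):.3f}")
```

The cost is visible in the per-test level: each comparison must clear a much stricter threshold, which is why the slide notes that trials usually cannot afford this for every comparison.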

542-02-#34

Outcome Assessment

542-02-#35

What’s the Response Variable?

• Used to answer primary/secondary questions

• Characteristics for primary/secondary outcomes

1. Well defined & stable

2. Ascertained in all subjects

3. Unbiased

4. Reproducible

5. Specificity to question

542-02-#36

Response Variable

• Examples

1. MILIS
Infarct size measurement?
- Enzymes (area under curve or peaks)
- Radionuclide imaging
- EKG
Issues of definition, ascertainment, reproducibility

2. NOTT
Quality of Life?
- POMS (Profile of Mood States)
- SIP (Sickness Impact Profile)
- Pulmonary Function
- Survival

542-02-#37

Response Variable

3. Cardiovascular Disease Trials
- Total mortality
- CHD mortality
- Non-fatal MI
- PVCs

4. Diabetes
- Mortality
- Blindness
- Visual impairment
- Retinopathy
- Microaneurysms

542-02-#38

Surrogate Response Variables

• Used as alternative to desired or ideal clinical response

• Examples
– Suppression of arrhythmia (sudden death)
– T4 cell counts (AIDS or ARC)

• Used often - therapeutic exploratory (Phase I, Phase II)

• Use with caution - therapeutic confirmatory (Phase III)

542-02-#39

Surrogate Response Variables (2)

• Frequent Criticism of Clinical Trials
– Too long
– Too large
– Too expensive

• Advantages
– Perhaps smaller sample size
– Detect earlier effect ⇒ shorter trial
– Easier

542-02-#40

Examples of FDA Approval of Drugs Using Surrogates (1)

• Lower cholesterol without evidence of survival benefit

• Lower blood pressure without evidence of benefit for stroke, MI, congestive heart failure, or survival

• Increase bone density without evidence of decreased fractures in osteoporosis

542-02-#41

Examples of FDA Approval of Drugs Using Surrogates (2)

• Increase cardiac function in congestive heart failure without evidence of survival benefit

• Decrease rate of arrhythmias (VPBs) without evidence of survival benefit

• Lower blood glucose and glycosylated hemoglobin without evidence about diabetic complications or survival benefit

542-02-#42

Surrogate Response Variables

• Requirements (Prentice, 1989)

T = True clinical endpoint
S = Surrogate
Z = Treatment

• H0: P(T|Z) = P(T) ⇔ P(S|Z) = P(S)

• Sufficient Conditions

1. S is informative about T (predictive)

P(T|S) ≠ P(T)

2. S fully captures effect of Z on T

P(T|S,Z) = P(T|S)
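Condition 2 can be checked numerically on a small made-up example. The Python sketch below builds the joint distribution of binary Z, S, and T from hypothetical probabilities, once with treatment acting on T only through S and once with an additional direct effect of Z on T, and tests whether P(T|S,Z) = P(T|S) holds. All numbers and function names are illustrative assumptions, not values from any trial.

```python
from itertools import product


def joint(p_s_given_z, p_t_given_sz, p_z=0.5):
    """Joint P(Z=z, S=s, T=t) for binary Z, S, T from the conditional pieces."""
    table = {}
    for z, s, t in product((0, 1), repeat=3):
        pz = p_z if z == 1 else 1 - p_z
        ps = p_s_given_z[z] if s == 1 else 1 - p_s_given_z[z]
        pt = p_t_given_sz[(s, z)] if t == 1 else 1 - p_t_given_sz[(s, z)]
        table[(z, s, t)] = pz * ps * pt
    return table


def p_t_given(table, s, z=None):
    """P(T=1 | S=s), or P(T=1 | S=s, Z=z) when z is given."""
    cells = [(key, p) for key, p in table.items()
             if key[1] == s and (z is None or key[0] == z)]
    total = sum(p for _, p in cells)
    return sum(p for key, p in cells if key[2] == 1) / total


def fully_captures(table, tol=1e-12):
    """Condition 2: P(T|S,Z) equals P(T|S) for every value of s and z."""
    return all(abs(p_t_given(table, s, z) - p_t_given(table, s)) < tol
               for s in (0, 1) for z in (0, 1))


# Hypothetical: treatment (Z=1) raises the chance of a favorable surrogate (S=1).
p_s_given_z = {0: 0.3, 1: 0.6}

# Case 1: event risk depends on S only, so Z acts on T entirely through S.
through_s_only = {(s, z): (0.1 if s == 1 else 0.4) for s in (0, 1) for z in (0, 1)}

# Case 2: treatment also raises event risk directly, independent of S.
direct_effect = dict(through_s_only)
direct_effect[(1, 1)] = 0.25
direct_effect[(0, 1)] = 0.55

print("through S only :", fully_captures(joint(p_s_given_z, through_s_only)))
print("direct effect  :", fully_captures(joint(p_s_given_z, direct_effect)))
```

In the second case the surrogate still improves under treatment, yet it no longer tells the whole story about T, which is the situation the later "reasons for failure" diagrams describe.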

542-02-#43

Concerns About Surrogates

1. Relationship between surrogate and true endpoint may not be causal, but coincidental to a third factor

2. Other unfavorable effects of the drug

3. Effect on surrogate may correlate with one clinical endpoint, but not others

542-02-#44

[Figure: Disease, Intervention, Surrogate End Point, and True Clinical Outcome arranged along a Time axis.]

The setting that provides the greatest potential for the surrogate endpoint to be valid. Reprinted from Ann Intern Med 1996; 125:605-13.

542-02-#45

[Figure: four panels (A-D), each relating Disease, Intervention, Surrogate End Point, and True Clinical Outcome along a Time axis.]

Reasons for failure of surrogate end points.
A. The surrogate is not in the causal pathway of the disease process.
B. Of several causal pathways of disease, the intervention affects only the pathway mediated through the surrogate.
C. The surrogate is not in the pathway of the intervention’s effect or is insensitive to its effect.
D. The intervention has mechanisms for action independent of the disease process.
Dotted lines = mechanisms of action that might exist.

542-02-#46

Examples Using “Surrogates”

• Chronic Obstructive Pulmonary Disease

• Cardiac Arrhythmias

• Heart Failure

• AIDS

• Osteoporosis

542-02-#47

Nocturnal Oxygen Therapy Trial (NOTT)

• Hypothesis
– Is continuous oxygen therapy better than nocturnal oxygen therapy in chronic obstructive lung disease patients?
  • Surrogates
  • Survival

• Design
– 203 patients
– Two-sided 0.05 Type I error
– Randomized
– Multicenter
– Sequential data monitoring

542-02-#48

Possible NOTT Surrogates

• PaO2
• Hematocrit
• FEV1 % Predicted
• FVC % Predicted
• Maximum Workload
• Heart Rate
• Mean Pulmonary Artery Pressure
• Cardiac Index
• Pulmonary Vascular Resistance
• Neuropsychiatric Impairment
• Quality of Life

542-02-#49

The Nocturnal Oxygen Therapy Trial

NOTT Survival Experience for 102 Patients on Nocturnal Oxygen (NOT) and 101 Patients on Continuous Oxygen Therapy (COT)

542-02-#50

Cardiac Arrhythmias

• Cardiac arrhythmias associated with sudden death

• Class of drugs developed to suppress arrhythmias

• FDA approved for high risk patients

• “Off-label” use increased

542-02-#51

Cardiac Arrhythmia Suppression Trial

Hypothesis

Does suppression of arrhythmia following an MI reduce incidence of:

1. Sudden death

2. Total mortality

542-02-#52


Design

• Randomized Double Blind

• Three Drug Arms vs. Placebo

• Multicenter Study

• Group Sequential Data Monitoring

• One Sided (0.025 Type I Error) for Benefit

• Advisory One Sided (0.025) for Harm

• Run-in Period (Arrhythmia Suppression)

542-02-#53


Early Termination in Two Drug Arms

                   Drugs    Placebo
Sudden Death       33       9
Total Mortality    56       22

542-02-#54

CAST Sequential Boundaries

Early Termination in Two Drug Arms

                   Drugs    Placebo
Sudden Death       33       9
Total Mortality    56       22

542-02-#55

Chronic Heart Failure (CHF)

• CHF is a serious problem
• Patients have reduced cardiac function & reduced ability to conduct daily activities
• Severity stages: NYHA Class I-IV
• Mortality rates increase with severity class
• Improving cardiac function, exercise capacity & quality of life desirable
• Drugs developed/approved on that basis

542-02-#56

PROMISE (Packer et al. NEJM 1991)

• Problem

– Patients with advanced (Class IV) congestive heart failure have 40% one year mortality

– Milrinone (a phosphodiesterase inhibitor) enhances cardiac contractility

– Milrinone improved cardiac output, exercise tolerance, and symptoms

• Hypothesis

Does milrinone increase survival in severe (Class III or IV) congestive heart failure patients?

542-02-#57

PROMISE Design

• Randomized multicenter double-blind, placebo-control trial

• Patients with Class III or IV congestive heart failure for 3 months

• Two-sided 5% significance level, 90% power for 25% reduction in mortality

• 1088 patients entered

• Milrinone (10 mg/4 times per day) vs. matched placebo

• Standard therapy of digoxin, diuretics, and a converting enzyme inhibitor
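Design targets like these translate into a sample size. The Python sketch below uses the standard normal-approximation formula for comparing two proportions; it is only an illustration, not the calculation the PROMISE investigators performed. It assumes a 40% control-group mortality (the one-year figure quoted earlier for Class IV patients) and a simple fixed-follow-up comparison of proportions rather than a survival analysis. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(p_control, relative_reduction, alpha=0.05, power=0.90):
    """Approximate patients per arm to detect a relative reduction in an event rate.

    Uses the two-sided normal-approximation formula for two independent proportions.
    """
    p_treated = p_control * (1 - relative_reduction)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p_control - p_treated) ** 2)


# Assumed inputs: 40% control mortality, 25% relative reduction, 5% two-sided, 90% power.
n = n_per_group(p_control=0.40, relative_reduction=0.25)
print(f"about {n} patients per arm, {2 * n} in total")
```

Lower control-group mortality, dropout, or non-compliance would all push the required number upward, which is one reason real trials plan for more patients than the bare formula gives.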

542-02-#58

PROMISE Mortality Results

542-02-#59

AIDS Clinical Trials

• Clinical Outcomes
– Death
– Progression to AIDS
– Progression to ARC

• Surrogate Outcome
– CD4 Cell Count

542-02-#60

State-of-the-Art Conference

• Results

– AIDS/Death
  * 8 trials positive: 7/8 had positive CD4 cell changes
  * 8 trials negative: 6/8 had positive CD4 cell changes

– Death
  * 4 trials positive: 2/4 CD4 positive
  * 7 trials negative: 6/7 CD4 positive

542-02-#61

Osteoporosis (Riggs et al. NEJM, 1990)

• Bone loss in postmenopausal women leads to increased risk of fracture

• Sodium fluoride stimulates bone formation and increases bone mass (double)

• Hypothesis
– Will fluoride treatment decrease the rate of vertebral fractures?

• Design
– Randomized, double-blind, placebo-controlled
– 202 postmenopausal women randomized
– All received calcium supplementation

542-02-#62

Osteoporosis Fluoride Trial Results

• Fluoride increased bone density
– 35% (p = 0.0001) in spine
– 12% (p = 0.0001) in femoral neck

• Fluoride decreased bone density by 4% in wrist (p = 0.02)

• Vertebral fractures higher on fluoride (F 163, P 136, p < 0.05)

• Non-vertebral fractures higher on fluoride (72 vs. 24; p = 0.01)

• Fluoride concluded not effective as a treatment for post-menopausal osteoporosis

542-02-#63

Concluding Remarks on Surrogates

• Surrogates play an important role in the development of Phase I, II, and pilot Phase III studies

• Treatments may affect more than one mechanism

• “Surrogates” do not reliably predict treatment effect on clinical outcome

• Continued success in a given field is not even guaranteed

• Reliance on “surrogates” should be minimized

542-02-#64

Study Population

542-02-#65

What Is The Study Population? (1)

• Subset of the general population determined by the eligibility criteria

GENERAL POPULATION
  ↓ eligibility criteria
STUDY POPULATION
  ↓ enrollment
STUDY SAMPLE (observed)

542-02-#66

The General Flow of Statistical Inference

Patient Population

Sample* Protocol

Patients On Study

Observed Results

Inference about Population

*Sample of Opportunity: random or non-random?

542-02-#67

What Is The Study Population? (2)

Defined by Eligibility Criteria

– Define in advance

– Characterize population
  • Impact of results
  • Replication of study

– Biased sample does not imply biased trial!

542-02-#68

Who Should Be Studied?

   Homogeneous                       Heterogeneous
1. Well defined                      Can’t specify easily
2. Mechanism of action known         Don’t know if one group will respond differently
3. Don’t dilute results              Easier subject recruitment
4. Infer results specifically        Easier to generalize

542-02-#69

Eligibility Criteria

• Need to describe who we intend to study
– State in advance
– Precision related to importance

• Consider
– Potential for benefit
  • Homogeneous population
  • Heterogeneous population
– Ability to detect benefit: high risk but not too high
– No contraindications
– No competing risk
– Compliance likely

• Impact
– Generalization
– Ease of recruitment
– Risk or event rates

542-02-#70

Recruitment

• More difficult than anticipated
• Yield not 100%
– Eligibility criteria (age, prior history, prior treatment, etc.)
– Exclusion criteria
– Physician refusal
– Patient refusal

• Many trials randomize only 10-15% of those screened

• Must be a team effort
– Physicians
– Nurses
– Data Manager or Coordinator

• Health Screening Effect ⇒ lower risk than expected!
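The 10-15% yield has a direct planning consequence: the number of patients to screen is the randomization target divided by the yield. A small Python sketch, using a hypothetical target of 1000 randomized patients (the function name is illustrative):

```python
def patients_to_screen(target_randomized: int, yield_percent: int) -> int:
    """Patients that must be screened to randomize the target, rounded up."""
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-target_randomized * 100 // yield_percent)


target = 1000  # hypothetical randomization target
for yield_percent in (10, 15):
    needed = patients_to_screen(target, yield_percent)
    print(f"yield {yield_percent}%: screen about {needed} patients")
```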

542-02-#71

Accrual Tracking

542-02-#72

Measures of Efficacy from Clinical Trials

542-02-#73

Characteristics of a Good Summary Measure

• Easy to compute

• Easily understood by all (non-technical)

• Minimal variance across baseline characteristics

• Statistically sound

542-02-#74

Purpose and Limitations of Clinical Trials

• Clinical trials are designed to detect differences between treatment groups
– relative risk (or relative risk reduction)
– mean absolute risk reduction (relative to placebo)

• In clinical trials, the method of assessing the primary endpoints is usually pre-specified and stated in terms of RRR or RHR.

• Clinical trials are not designed to directly estimate the incidence in the population at risk.

• The population in a clinical trial may not completely represent the population to be treated

542-02-#75

Measures Currently Used

• Relative Risk (RR) and Relative Risk Reduction (RRR)

• Odds ratio (OR)

• Relative Hazard (RH) and Relative Hazard Reduction (RHR)

• Absolute Risk Reduction (ARR)

542-02-#76

Outcome Measures

Relative Risk (RR)

RR = P1/P2

Relative Risk Reduction

RRR = 1 - RR

Odds Ratio (OR)

OR = P1(1 - P2) / [P2(1 - P1)]

Absolute Risk Reduction (ARR)

ARR = P1 - P2
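The four measures can be computed directly from the two event proportions. The Python sketch below follows the definitions on this slide, with OR taken as the ratio of the odds P1/(1 − P1) to P2/(1 − P2). It assumes P1 is the event proportion in the treated group and P2 the proportion in the control group, which the slide does not state; the proportions used are hypothetical.

```python
def outcome_measures(p1: float, p2: float) -> dict:
    """RR, RRR, OR, and ARR from two event proportions, as defined on the slide."""
    rr = p1 / p2
    return {
        "RR": rr,
        "RRR": 1 - rr,
        "OR": (p1 * (1 - p2)) / (p2 * (1 - p1)),
        "ARR": p1 - p2,
    }


# Hypothetical event proportions: 10% on treatment (P1), 20% on control (P2).
measures = outcome_measures(0.10, 0.20)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Under this labeling, ARR = P1 − P2 comes out negative when treatment lowers the event rate; many reports quote the magnitude, P2 − P1, instead.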

542-02-#77

[Figure: annual incidence rate (%), on a 0-50 scale, by study (first author of paper); legend: Low, Medium, High, ARR.]

Placebo incidence rates of vertebral fracture from several studies

Efficacy as measured by relative risk reduction was reasonably stable over studies

Absolute risk reduction varied across studies

542-02-#78

Study      Incidence in placebo (%)    Incidence in risedronate (%)    RR     ARR
VERT NA    16.3                        11.3                            59%    5.0%
VERT MN    29.0                        18.1                            51%    10.9%

There is a danger in using ARR to compare efficacy

The drug used is the same in both studies

542-02-#79

We Need Both Measures

• Effectiveness– Related to RR

• Benefits– Related to absolute risk

542-02-#80

RRR

• Usually
– Constant over baseline characteristics
– Constant over study time
– Easy test of interaction
– When not constant, it is usually piecewise constant
– Differences seen among different studies can be viewed as random
– Good statistical models are available

542-02-#81

Absolute Risk

• Unlikely
– To be constant over time
– To be constant over baseline characteristics
– To be able to describe with simple models

• Consequences
– Patient characteristics can change with study time
– Differences among studies cannot be ignored

542-02-#82

Conclusion

• If the RRR is constant and detailed information about the AR is provided, both summary measures provide useful information about the effectiveness and benefits of treatment

• ARR is not a simple index of therapeutic effectiveness. It is a function of the incidence rate for the event of interest in the population studied and may not be reflective of the true ARR for the patient sitting before you.

• There are concerns about using the rate in the placebo group from a clinical trial as a surrogate for the true baseline risk for an individual patient.

• Before making a recommendation, one needs to know the risk profile of the patients to be treated
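The dependence of ARR on baseline risk follows from simple arithmetic: if the RRR is constant, the absolute reduction (control risk minus treated risk) equals the baseline risk times the RRR. The Python sketch below uses a hypothetical constant RRR of 40% and three hypothetical baseline risks; the function name is illustrative.

```python
def absolute_risk_reduction(baseline_risk: float, rrr: float) -> float:
    """Control risk minus treated risk when treatment lowers risk by a constant RRR."""
    treated_risk = baseline_risk * (1 - rrr)
    return baseline_risk - treated_risk


rrr = 0.40  # hypothetical constant relative risk reduction
for label, baseline in (("low", 0.05), ("medium", 0.15), ("high", 0.30)):
    arr = absolute_risk_reduction(baseline, rrr)
    print(f"{label:>6} baseline risk {baseline:.0%}: ARR = {arr:.1%}")
```

The same treatment, with the same relative effect, yields a sixfold difference in absolute benefit across these risk profiles, which is why the patient's own baseline risk matters for a recommendation.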

542-02-#83

Summary: Defining the Question

• Defined carefully in advance
• Must be clinically relevant
• Prioritize into primary, secondary, …
• Design built around primary question(s)
• Eligibility criteria define population studied and inferences to be made
• Surrogates desirable but risky
• Need the relevant measure of efficacy
