Confidence Intervals

Elizabeth S. Garrett

esg@jhu.edu

Oncology Biostatistics

March 27, 2002

Clinical Trials in 20 Hours

What is a “confidence interval”?

• It is an interval that conveys the precision with which a sample statistic estimates a parameter of interest.

• Examples:

– Parameter of interest: progression-free survival time

“The 95% confidence interval on progression-free survival is 13 to 26 weeks.”

– Parameter of interest: response rate

“The 95% confidence interval on response rate is 0.20 to 0.40.”

– Parameter of interest: change in %CD34+ cells

“The 95% confidence interval for the change in %CD34+ cells is 0.2 to 0.4.”

Different Interpretations of the 95% confidence interval

• “We are 95% sure that the TRUE parameter value is in the 95% confidence interval”

• “If we repeated the experiment many many times, 95% of the time the TRUE parameter value would be in the interval”

• “Before performing the experiment, the probability that the interval would contain the true parameter value was 0.95.”
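The repeated-experiment interpretation can be checked by simulation. The sketch below is illustrative only: it assumes a normally distributed variable with a hypothetical true mean of 0.40 and standard deviation of 0.27, draws many samples of n = 50, builds the interval "mean ± 1.96 standard errors" from each sample (the formula developed later in this talk), and counts how often the interval contains the true mean. The function name `coverage` and all of its settings are assumptions made for this example.

```python
import math
import random


def coverage(n=50, true_mean=0.40, true_sd=0.27, trials=2000, seed=0):
    """Fraction of simulated 95% intervals that contain the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # One simulated "experiment": a sample of n normal observations.
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        mean = sum(sample) / n
        var = sum((x - mean) ** 2 for x in sample) / (n - 1)
        sem = math.sqrt(var / n)
        # Does this experiment's interval cover the true mean?
        if mean - 1.96 * sem <= true_mean <= mean + 1.96 * sem:
            hits += 1
    return hits / trials


print(coverage())
```

The printed fraction should land near 0.95: about 95% of the simulated experiments produce an interval that covers the true mean.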

Example

Leisha Emens, M.J. Kennedy, John H. Fetting, Nancy E. Davidson, Elizabeth Garrett, Deborah A. Armstrong

“A phase 1 toxicity and feasibility trial of sequential dose dense induction chemotherapy with doxorubicin, paclitaxel, and 5-fluorouracil followed by high dose consolidation for high risk primary breast cancer”

83 patients underwent leukapheresis for peripheral blood stem cell collection after conventional dose adjuvant therapy, and 14 patients underwent the procedure on the dose dense adjuvant protocol 9626.

Results: Compared to the standard dose doxorubicin-containing adjuvant therapy, the dose dense regimen decreased CD34+ peripheral blood stem cell (PBSC) yields, requiring that 50% of patients have a supplemental bone marrow harvest.

Question: What can we say about the %CD34+ peripheral blood stem cell yields in each of the two groups?

Example

• %CD34+ PBSC in trial 9601 and 9626.

• We can estimate the mean %CD34+ PBSC in each trial:

– 0.40 in the standard group

– 0.30 in the dose-dense group.

• We can conclude:

– “We estimate that %CD34+ PBSC in the standard group is 0.40 and in the dose dense group is 0.30.”

• But, how “sure” are we about those estimates?

[Figure: cd34 (%CD34+ PBSC, axis from 0 to 1.5) plotted by trial, 9601 and 9626]

Quantifying Uncertainty

• Standard deviation: measures the variation of a variable in the population.

– The standard deviation of %CD34+ PBSC in the standard group is 0.27 and is 0.20 in the dose dense group.

– Technically,

$$ s = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}(x_i - \bar{x})^2} $$

$$ s_{9601} = \sqrt{\frac{1}{82}\sum_{i=1}^{83}(\%\text{CD34}_i - 0.40)^2} $$

For normally distributed variables….

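[Figure: normal curve centered at the mean x, with the central 68% of the area spanning one standard deviation s on either side]

68% of individuals’ values fall within 1 standard deviation of the mean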

For normally distributed variables….

95% of individuals’ values fall within 1.96 standard deviations of the mean

[Figure: normal curve centered at the mean x, with the central 95% of the area spanning 1.96s on either side]

Standard deviation versus standard error

• The standard deviation (s) describes variability between individuals in a population.

• The standard error describes variation of a sample statistic.

• Example: We are interested in the mean %CD34+ PBSC. (We denote the mean by $\bar{x}$.)

– The standard deviation (0.27 in standard and 0.20 in dose dense) describes how individuals differ.

– The standard error of the mean describes the precision with which we can make inference about the true mean.

Standard error of the mean

• Standard error of the mean (sem):

$$ s_{\bar{x}} = \text{sem} = \frac{s}{\sqrt{n}} $$

• Comments:

– n = sample size

– even for large s, if n is large, we can get good precision (a small sem)

– always smaller than the standard deviation (s) when n > 1

Example

• In standard group, s = 0.27 and n = 83:

$$ s_{\bar{x}} = \text{sem} = \frac{0.27}{\sqrt{83}} = 0.03 $$

• In dose dense group, s = 0.20 and n = 14:

$$ s_{\bar{x}} = \text{sem} = \frac{0.20}{\sqrt{14}} = 0.05 $$
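The two standard errors can be reproduced with a few lines of Python. This is a minimal sketch; the helper name `sem` is chosen here for illustration, and the inputs are the s and n values quoted on this slide.

```python
import math


def sem(s, n):
    """Standard error of the mean: s divided by the square root of n."""
    return s / math.sqrt(n)


standard = sem(0.27, 83)    # standard group
dose_dense = sem(0.20, 14)  # dose dense group
print(round(standard, 2), round(dose_dense, 2))
```

The smaller group has the larger standard error even though its standard deviation is smaller, because n enters through the square root in the denominator.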

Sampling Distribution

The sampling distribution of a sample statistic refers to what the distribution of the statistic would look like if we chose a large number of samples from the same population.

[Figure: density of the population variable x, plotted over 0 to 12; Mean = 3, s = 2.45]

The sample statistic of interest to us is the mean.

Sampling Distribution of the Mean

By the Central Limit Theorem, it is true that even if a variable is NOT normally distributed, for large sample size, the sampling distribution of the mean is approximately normally distributed.

[Figure: density of the population variable x (axis 0 to 12; Mean = 3, s = 2.45), with histograms of sample means (samps, axis 2.0 to 4.0) for n = 25, n = 50, n = 100, and n = 500]

Sampling Distributions

[Figure: histograms of sample means (samps, axis 2.0 to 4.0) for n = 25, n = 50, n = 100, and n = 500, annotated with sem = 0.47, sem = 0.10, sem = 0.23, sem = 0.47]
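The idea behind these histograms can be sketched in code. The slides do not say which population was used, so the example below assumes, purely for illustration, a gamma distribution with shape 1.5 and scale 2, which has mean 3 and standard deviation of about 2.45. It draws many samples of each size, records the sample means, and compares the spread of those means with s divided by the square root of n. The function name and the number of repetitions are hypothetical choices.

```python
import math
import random
import statistics


def sd_of_sample_means(n, samples=1000, seed=1):
    """Empirical standard deviation of the means of many samples of size n."""
    rng = random.Random(seed)
    means = []
    for _ in range(samples):
        # Assumed skewed population: gamma(shape=1.5, scale=2), mean 3, sd sqrt(6).
        draw = [rng.gammavariate(1.5, 2.0) for _ in range(n)]
        means.append(sum(draw) / n)
    return statistics.stdev(means)


for n in (25, 50, 100, 500):
    theoretical = math.sqrt(6) / math.sqrt(n)  # population sd / sqrt(n)
    print(n, round(sd_of_sample_means(n), 3), round(theoretical, 3))
```

Under these assumptions the empirical spread of the sample means tracks the theoretical standard error closely and shrinks as n grows, even though the population itself is skewed.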

Central Limit Theorem Main Ideas

• The sampling distribution of a sample statistic is often normally distributed

• The mathematical result comes from the Central Limit Theorem. For the theorem to work, n should be large.

• Statisticians have derived formulas to calculate the standard deviation of the sampling distribution; this quantity is called the standard error of the statistic.

Sampling Distribution of the Mean

• In general for large n, means have a normal distribution.

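• It is true that 95% of sample means will be within 1.96 sems of the true mean, $\mu$.

$$ \mu - 1.96\,\text{sem} \le \bar{x} \le \mu + 1.96\,\text{sem} $$

$$ -\bar{x} - 1.96\,\text{sem} \le -\mu \le -\bar{x} + 1.96\,\text{sem} $$

$$ \bar{x} + 1.96\,\text{sem} \ge \mu \ge \bar{x} - 1.96\,\text{sem} $$

$$ \bar{x} - 1.96\,\text{sem} \le \mu \le \bar{x} + 1.96\,\text{sem} $$

The last line is the 95% confidence interval for the mean.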

General formula for 95% confidence interval

$$ \bar{x} \pm 1.96\,\text{sem} $$

$$ \bar{x} \pm 1.96\,\frac{s}{\sqrt{n}} $$

• Notes:

– sample size must be sufficiently large for non-normal variables.

– how large is large? depends on skewness of variable

– VERY often people use 2 instead of 1.96.

Example

• In the standard group, the mean was 0.40, s = 0.27, and n = 83:

$$ 0.40 \pm 1.96\,\frac{0.27}{\sqrt{83}} = (0.34, 0.46) $$

• In the dose dense group, the mean was 0.30, s = 0.20, and n = 14:

$$ 0.30 \pm 1.96\,\frac{0.20}{\sqrt{14}} = (0.20, 0.40) $$
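The two intervals can be checked numerically. The sketch below applies the general formula with the multiplier 1.96; the function name `mean_ci` is an illustrative choice, and the inputs are the means, standard deviations, and sample sizes quoted above.

```python
import math


def mean_ci(mean, s, n, z=1.96):
    """Large-sample confidence interval for a mean: mean +/- z * s / sqrt(n)."""
    half_width = z * s / math.sqrt(n)
    return mean - half_width, mean + half_width


for label, (m, s, n) in {"standard": (0.40, 0.27, 83),
                         "dose dense": (0.30, 0.20, 14)}.items():
    lo, hi = mean_ci(m, s, n)
    print(f"{label}: ({lo:.2f}, {hi:.2f})")
```

The dose dense interval is wider mainly because n = 14 is small. With so few patients, a t-based multiplier in place of 1.96 would give a somewhat wider interval still.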

Not only 95%….

• 90% confidence interval: NARROWER than 95%

$$ \bar{x} \pm 1.65\,\text{sem} $$

• 99% confidence interval: WIDER than 95%

$$ \bar{x} \pm 2.58\,\text{sem} $$
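The multipliers come from the standard normal distribution. As a small sketch, Python's standard library can recover them for any confidence level; the helper name `z_multiplier` is an assumption made for this example.

```python
from statistics import NormalDist


def z_multiplier(level):
    """Two-sided normal multiplier for a given confidence level (e.g. 0.95)."""
    return NormalDist().inv_cdf(0.5 + level / 2)


for level in (0.90, 0.95, 0.99):
    print(level, round(z_multiplier(level), 3))
```

To three decimals the multipliers are 1.645, 1.960, and 2.576, which are commonly rounded to the 1.65, 1.96, and 2.58 used on this slide.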

But why do we always see 95% CI’s?

• “Duality” between confidence intervals and pvalues

• Example: Assume that we are testing for a significant change in QOL due to an intervention, where QOL is measured on a scale from 0 to 50.

– 95% confidence interval: (-2, 13)

– pvalue = 0.07

• It is true that if the 95% confidence interval includes 0, then a t-test testing that the treatment effect is 0 will not be significant at the alpha = 0.05 level.

• It is true that if the 95% confidence interval does not include 0, then a t-test testing that the treatment effect is 0 will be significant at the alpha = 0.05 level.
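The duality can be illustrated with a short sketch. Note the substitutions: it uses a normal (z) approximation in place of the t-test, and the estimates and standard errors are hypothetical values chosen for illustration, not the QOL data above.

```python
from statistics import NormalDist


def z_test_and_ci(estimate, se, z=1.96):
    """Two-sided p-value for 'effect = 0' and the matching 95% interval."""
    p_value = 2 * (1 - NormalDist().cdf(abs(estimate) / se))
    return p_value, (estimate - z * se, estimate + z * se)


# Hypothetical treatment effects with two different standard errors.
for estimate, se in [(5.5, 3.0), (5.5, 2.5)]:
    p, (lo, hi) = z_test_and_ci(estimate, se)
    includes_zero = lo <= 0 <= hi
    print(f"p = {p:.3f}, CI = ({lo:.1f}, {hi:.1f}), includes 0: {includes_zero}")
```

In both cases the interval includes 0 exactly when the p-value is at or above 0.05, which is the duality described on this slide.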

Other Confidence Intervals

• Differences in means

• Response rates

• Differences in response rates

• Hazard ratios

• median survival

• difference in median survival

• ……..

Difference in Means

• Example: What is the 95% confidence interval for the difference in %CD34+ PBSCs in the two trials?

$$ (\bar{x}_1 - \bar{x}_2) \pm 1.96\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} $$

$$ 95\%\ \text{CI}: (0.40 - 0.30) \pm 1.96\sqrt{\frac{0.27^2}{83} + \frac{0.20^2}{14}} = (-0.02, 0.22) $$
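As a check on the arithmetic, the sketch below evaluates the difference-in-means formula with the values from the two trials. The function name `diff_means_ci` is an illustrative choice.

```python
import math


def diff_means_ci(m1, s1, n1, m2, s2, n2, z=1.96):
    """Large-sample confidence interval for the difference of two means."""
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    diff = m1 - m2
    return diff - z * se, diff + z * se


lo, hi = diff_means_ci(0.40, 0.27, 83, 0.30, 0.20, 14)
print(f"({lo:.2f}, {hi:.2f})")
```

The interval includes 0, so at this level of precision the data do not rule out equal mean yields in the two groups.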

95% Confidence Intervals for Proportions

• Socinski et al., Phase III Trial Comparing a Defined Duration of Therapy versus Continuous Therapy Followed by Second-Line Therapy in Advanced-Stage IIIB/IV Non-Small-Cell Lung Cancer, JCO, March 1, 2002.

• Patients and Methods: Arm A (4 cycles of carboplatin at an AUC of 6 and paclitaxel), Arm B (continuous treatment with carboplatin/paclitaxel until progression). At progression, patients from each arm receive second-line weekly paclitaxel at 80 mg/m²/week.

• Results: 230 Patients were randomized (114 in arm A and 116 in Arm B). Overall response rates were 22% and 24% for arms A and B. Grade 2 to 4 neuropathy was seen in 14% and 27% of Arm A and B patients, respectively.

95% Confidence Intervals for Proportions

• What are 95% confidence intervals for the response rates in the two arms?

• standard error of a sample proportion is

$$ \sqrt{\frac{p(1-p)}{n}} $$

• An equation for confidence interval for a proportion:

$$ p \pm 1.96\sqrt{\frac{p(1-p)}{n}} $$

• Note: this is an approximation based on the central limit theorem! Using statistical programs, you can get “exact” confidence intervals.

• Assumptions:

– n is reasonably large

– p is not “too” close to 0 or 1

– rule of thumb: pn > 5

Example: Response Rate to Treatment

$$ p \pm 1.96\sqrt{\frac{p(1-p)}{n}} $$

• Arm A:

$$ 0.22 \pm 1.96\sqrt{\frac{0.22(0.78)}{114}} = (0.14, 0.30) $$

• Arm B:

$$ 0.24 \pm 1.96\sqrt{\frac{0.24(0.76)}{116}} = (0.16, 0.32) $$

Example: Grade 2 to 4 Neuropathy

$$ p \pm 1.96\sqrt{\frac{p(1-p)}{n}} $$

• Arm A:

$$ 0.14 \pm 1.96\sqrt{\frac{0.14(0.86)}{114}} = (0.08, 0.20) $$

• Arm B:

$$ 0.27 \pm 1.96\sqrt{\frac{0.27(0.73)}{116}} = (0.19, 0.35) $$
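The response-rate and neuropathy intervals all use the same normal-approximation formula, so one small helper reproduces them. This is a sketch of that approximation only, not of the “exact” intervals that statistical programs provide; the name `proportion_ci` is an illustrative choice.

```python
import math


def proportion_ci(p, n, z=1.96):
    """Normal-approximation confidence interval for a proportion."""
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


cases = {
    "response, Arm A": (0.22, 114),
    "response, Arm B": (0.24, 116),
    "neuropathy, Arm A": (0.14, 114),
    "neuropathy, Arm B": (0.27, 116),
}
for label, (p, n) in cases.items():
    lo, hi = proportion_ci(p, n)
    print(f"{label}: ({lo:.2f}, {hi:.2f})")
```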

95% Confidence Interval for Difference in Proportions

What is the 95% confidence interval for the difference in rates of neuropathy in arms A and B?

$$ (p_1 - p_2) \pm 1.96\sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}} $$

$$ (0.27 - 0.14) \pm 1.96\sqrt{\frac{0.27(0.73)}{116} + \frac{0.14(0.86)}{114}} = (0.03, 0.23) $$
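The same calculation in code, as a minimal sketch using the neuropathy rates and arm sizes from the trial. The function name `diff_proportions_ci` is an illustrative choice.

```python
import math


def diff_proportions_ci(p1, n1, p2, n2, z=1.96):
    """Normal-approximation interval for the difference of two proportions."""
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se


# Arm B (27% of 116) minus Arm A (14% of 114).
lo, hi = diff_proportions_ci(0.27, 116, 0.14, 114)
print(f"({lo:.2f}, {hi:.2f})")
```

Here the interval excludes 0, which points to a higher neuropathy rate in Arm B.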

Recap

• 95% confidence intervals are used to quantify certainty about parameters of interest.

• Confidence intervals can be constructed for any parameter of interest (we have just looked at some common ones).

• The general formulas shown here rely on the central limit theorem

• You can choose level of confidence (does not have to be 95%).

• Confidence intervals are often preferable to pvalues because they give a “reasonable range” of values for a parameter.

Some Confidence Intervals in Survival Analysis

                   Hazard Ratio   95% CI
Chemo v. surgery   0.69           0.46-1.06

                   Arm I            Arm II
                   %    95% CI      %    95% CI
1 year survival    58   46-73       72   58-84
3 year survival    16   8-30        30   20-46

What about the confidence interval for the 1 year and 3 year difference?

Example: Urba et al. Randomized Trial of Preoperative Chemoradiation Versus Surgery Alone in Patients with Locoregional Esophageal Carcinoma, JCO, Jan 15, 2001.

• Why not provide confidence intervals for...

– Difference in median survival

– Difference in 1 year survival

– Difference in 3 year survival

• Would give readers a “reasonable range” of values to consider for treatment effect that are intuitive.

• What is remembered?

– P = 0.09, which means an insignificant result

– But, can anyone remember the treatment effect?

Confidence Intervals for Reporting Results of Clinical Trials, Simon

• “[Hypothesis tests] are sometimes overused and their results misinterpreted.”

• “Confidence intervals are of more than philosophical interest, because their broader use would help eliminate misinterpretations of published results.”

• “Frequently, a significance level or pvalue is reduced to a ‘significance test’ by saying that if the level is greater than 0.05, then the difference is ‘not significant’ and the null hypothesis is ‘not rejected’….The distinction between statistical significance and clinical significance should not be confused.”

Caveats

“They should not be interpreted as reflecting the absence of a clinically important difference in true response probabilities.”

             Experiment 1    Experiment 2
Treatment    Response        Response
A            13/25 (52%)     500/1000 (50%)
B            8/25 (32%)      450/1000 (45%)
Trt effect   20%             5%
95% CI       -7% to 47%      0.6% to 9%
Pvalue       0.25            0.03
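The contrast between the two experiments can be reproduced from the response counts. The sketch below uses the normal-approximation interval for a difference in proportions and computes only the treatment effects and intervals, not the p-values, since the test behind those is not stated. The function name `diff_ci_from_counts` is an illustrative choice.

```python
import math


def diff_ci_from_counts(x1, n1, x2, n2, z=1.96):
    """Treatment effect (difference in response rates) with its 95% interval."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff, diff - z * se, diff + z * se


for label, counts in [("Experiment 1", (13, 25, 8, 25)),
                      ("Experiment 2", (500, 1000, 450, 1000))]:
    diff, lo, hi = diff_ci_from_counts(*counts)
    print(f"{label}: effect {diff:.0%}, 95% CI {lo:.1%} to {hi:.1%}")
```

The small experiment has the larger estimated effect but an interval so wide that it includes both no effect and a very large one. The large experiment pins down a small effect precisely.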

Excellent References on Use of Confidence Intervals in Clinical Trials

• Richard Simon, “Confidence Intervals for Reporting Results of Clinical Trials”, Annals of Internal Medicine, v.105, 1986, 429-435.

• Leonard Braitman, “Confidence Intervals Extract Clinically Useful Information from the Data”, Annals of Internal Medicine, v. 108, 1988, 296-298.

• Leonard Braitman, “Confidence Intervals Assess Both Clinical and Statistical Significance”, Annals of Internal Medicine, v. 114, 1991, 515-517.