Vascular Surgery Biostatistics Seminar
• We have a website: http://www.phs.wfubmc.edu/public/edu_vascSurg.cfm
• Course is “experimental”
– Ask questions during lectures
– Let me know of specific statistical issues that you want covered
• Assignment: for last 2 sessions (review of student-selected publications)
– Pick 2 articles for class review
– Email PDFs of them to me by October 20th
Texts
1. Gehlbach: Interpreting the Medical Literature (ISBN 0-07-143789-4)
2. Dawson & Trapp: Basic and Clinical Biostatistics (ISBN 0-07-141017-1)
3. Good & Hardin: Common Errors in Statistics (ISBN 0-471-79431-7)
4. Huck: Reading Statistics and Research (ISBN 0-205-51067-1)
5. van Belle: Statistical Rules of Thumb (ISBN 0-471-40227-3)
Schedule
Study Design
Gehlbach: Chapters 1-6
Hypothetical example: factors affecting (causing) renovascular disease (RVD)
• Outcomes
– Renal function (GFR, serum creatinine)
– RVD by diagnostic test (ultrasound, angiogram)
– End-stage renal disease (dialysis dependence)
– Renal-related mortality
• Exposures
– Hypertension
– RVD repair
  • Open revascularization
  • Percutaneous repair
– Risk factors: age, race, smoking, diabetes,…
Q: How can we examine a specific hypothesis as it relates to RVD?
A: Formulate a hypothesis and design a study!
Design Dilemma
Ideal question one would pose
Data one can collect or access
From Good & Hardin, Common Errors in Statistics and How to Avoid Them:
Before conducting the experiment, trial, survey, data analysis:
1. Write down the objectives
2. Translate those objectives into testable hypotheses
3. List potential findings and resulting conclusions
Research Question vs. Hypothesis
• Research Question: “How does diabetes affect renal function after renal revascularization?”
• Hypothesis: “In patients treated for RVD with endovascular repair, those with diabetes have poorer early renal function response than those without diabetes.”
Good & Hardin: Formulate hypotheses to be quantifiable, testable, and statistical in nature.
Classification of Study Designs
Observational studies
1. Descriptive or case-series
2. Retrospective (case-control)
3. Cross-sectional (prevalence), surveys
4. Prospective (cohort)
5. Retrospective cohort
Experimental studies
1. Controlled trials
a) Parallel designs
b) Sequential designs
c) External controls
2. Studies with no controls
Adapted from Dawson & Trapp, Basic & Clinical Biostatistics (4th ed)
Meta-analyses
Observational Studies
Retrospective Designs
• Begin with disease/condition/outcome and look back for features (“exposure”) of those with and without outcome
• Useful for:
– Hypothesizing causes of disease
– Identifying risk factors
• Weaknesses:
– Biased case and/or control selection
– Biased exposure ascertainment
– Temporal sequence of exposure/outcome may be unclear
Retrospective Designs (cont.)
• Advantages:
– Data availability (design of choice for chart reviews)
– Usually inexpensive
– Can be performed quickly
• Matching cases and controls:
– Prevents imbalance of known risk factors and potential confounding
– Can reduce variability (increase efficiency)
– Requires special analysis techniques
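One example of the special analysis that matching calls for is the matched-pair odds ratio with McNemar's test, which uses only the pairs in which case and control differ on exposure. The sketch below assumes simple 1:1 matching and entirely hypothetical counts; the function name is illustrative, and other matching ratios (such as 2:1) would need methods like conditional logistic regression instead.

```python
def matched_pair_odds_ratio(pairs):
    """Estimate the odds ratio from 1:1 matched case-control pairs.

    pairs: sequence of (case_exposed, control_exposed) booleans, one per pair.
    Only discordant pairs (exposure differs within the pair) are informative.
    """
    pairs = list(pairs)
    case_only = sum(1 for case, control in pairs if case and not control)
    control_only = sum(1 for case, control in pairs if control and not case)
    if control_only == 0:
        raise ValueError("no pairs with only the control exposed; odds ratio undefined")
    odds_ratio = case_only / control_only
    # McNemar chi-square statistic (1 degree of freedom, no continuity correction)
    mcnemar = (case_only - control_only) ** 2 / (case_only + control_only)
    return odds_ratio, mcnemar


# Hypothetical data: 50 matched pairs (not taken from any study cited here)
pairs = ([(True, False)] * 15 + [(False, True)] * 5
         + [(True, True)] * 10 + [(False, False)] * 20)
odds_ratio, mcnemar = matched_pair_odds_ratio(pairs)
print(odds_ratio, mcnemar)
```

The 30 concordant pairs do not enter the estimate at all, which is why an unmatched analysis of matched data can give a misleading answer.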
Retrospective Design (example)
Lei et al., “Familial aggregation of renal disease…” J Am Soc Neph (1998) 9:1270-1276
– Recruited 689 patients with new onset ESRD
– Used random-digit dialing to recruit 361 controls from geographic community
– Matched cases to controls (2:1) using 5-year age groups
– Obtained information on familial history of ESRD and other risk factors (age, race, sex, socioeconomic,…)
– Found patients with ≥ 2 relatives with ESRD at increased risk for ESRD
Retrospective Cohort Design
• Uses previously collected data on a well-defined cohort
• Common approach for disease or treatment registries since meticulous record-keeping is required
• All follow-up took place in the past
• Subject to many of the same biases of other retrospective designs
• Allows estimation of “prospective-like” measures
Retrospective Cohort (example)
Holland and Lam, “Predictors of hospitalization and death among pre-dialysis patients…” Nephrol Dial Transplant (2000) 15:650-658
– Identified predictors of first hospitalization in a cohort of 362 seen in “pre-dialysis” clinic
– Dialysis initiation and loss to follow-up were censored events
– Hospitalization (for any cause) was outcome
– Risk factors examined using survival analysis
– Took advantage of records kept in “pre-dialysis” clinic
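To make the idea of censored events concrete, here is a minimal Kaplan-Meier sketch: censored participants count in the at-risk denominator until they leave, but never count as events. The follow-up times and event flags are hypothetical, and this is not a reconstruction of the survival analysis in the cited paper.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate of event-free survival.

    times:  follow-up time for each participant
    events: 1 if the outcome was observed at that time, 0 if censored
    Returns a list of (time, survival_probability) at each observed event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group everyone whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        # Events and censored participants both leave the risk set
        at_risk -= n_leaving
    return curve


# Hypothetical follow-up (months); 0 marks a censored participant
curve = kaplan_meier(times=[2, 3, 3, 5, 8], events=[1, 0, 1, 1, 0])
for t, s in curve:
    print(t, round(s, 3))
```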
Cross-sectional Designs
• Classify a population or group with respect to both outcome and exposure at a single point in time
• Useful for:
– Disease description
– Diagnosis and staging
– Describing disease processes, mechanisms
• Weaknesses:
– Subject to sampling and recall biases
– Temporal order problem
– Can’t estimate disease incidence, only prevalence
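Prevalence from a cross-sectional sample is simply the proportion with the condition at the time of examination. The sketch below adds a normal-approximation (Wald) 95% confidence interval; the counts are hypothetical, and the Wald interval is one common choice that can behave poorly for small samples or proportions near 0 or 1.

```python
import math


def prevalence_with_ci(n_with_condition, n_examined, z=1.96):
    """Point prevalence with a Wald confidence interval (z=1.96 for ~95%)."""
    p = n_with_condition / n_examined
    se = math.sqrt(p * (1 - p) / n_examined)
    lower = max(0.0, p - z * se)
    upper = min(1.0, p + z * se)
    return p, lower, upper


# Hypothetical survey: 50 of 1000 participants have the condition
p, lower, upper = prevalence_with_ci(50, 1000)
print(round(p, 3), round(lower, 3), round(upper, 3))
```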
Cross-sectional Design (example)
Hansen et al., “Prevalence of renovascular disease in the elderly…” J Vasc Surg (2002) 36:443-451.
– 834 participants in the CHS Study were examined with RDS at a single point in time
– RVD status determined and prevalence in CHS cohort estimated
– Increased age, lower HDL-c, and increased SBP associated with RVD
Surveys
• Single point-in-time studies; many utilize sampling techniques to assure generalizability
• Complex survey designs (e.g., NHANES, NIS H-CUP) use probability sampling
– Target population is divided into clusters; subsets of clusters are sampled randomly
– Certain clusters may be “oversampled” to assure representation
– Statistical analyses require special methods that correct variance for study design
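Oversampling means each respondent stands in for a different number of people in the target population, so point estimates should be weighted. The sketch below uses hypothetical sampling weights and shows only the weighted point estimate; it does not compute the design-corrected variance, which needs dedicated survey methods.

```python
def weighted_prevalence(records):
    """records: list of (has_condition, sampling_weight) tuples."""
    total_weight = sum(weight for _, weight in records)
    weight_with_condition = sum(weight for has, weight in records if has)
    return weight_with_condition / total_weight


# Hypothetical sample: the first group was oversampled, so each of its
# respondents represents 50 people; each of the others represents 200.
records = [
    (True, 50), (True, 50), (False, 50), (False, 50),
    (True, 200), (False, 200), (False, 200), (False, 200),
]
unweighted = sum(1 for has, _ in records if has) / len(records)
weighted = weighted_prevalence(records)
print(unweighted, weighted)
```

Ignoring the weights here would overstate prevalence, because the oversampled group has the higher rate.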
Complex Survey (example)
Mondrall et al., “Operative mortality for renal artery bypass in the United States” J Vasc Surg (2008) 48:317-322
– Examined RABG from NIS/H-CUP survey, 2000-2004
– Observed 10% in-hospital post-op mortality
– Risk factors for increased mortality included: age, female gender, Hx renal failure, CHF, lung disease
– In-hospital mortality higher than previously reported
– Used methods that accounted for survey design
Ecologic Studies
• Use data from large groups to compare rates of exposure and disease
• Data are on group-level (e.g., data on air pollution levels in specific cities could be compared to rates of lung cancer)
• Can lead to “ecologic fallacy”, because one doesn’t know whether the actual individuals with disease are subject to the exposure of interest
• Subject to “crackpot” biases
Ecologic Study (example)
Reynolds et al., “Childhood cancer and agricultural pesticide use…” Environ Health Perspect (2002) 110:319-324
– Examined incidence of childhood cancers in California in relation to pesticide use, 1988-1994
– Data sources: California Cancer Registry; U.S. Census; California Dept. of Pesticide Regulations
– Looked at cancer of all types, and by specific types
– Found a significant association between childhood leukemia rates and residence in communities with the highest use of propargite
– No other associations were observed
Prospective Designs
• Start with well-defined cohort and follow up for occurrence of disease/outcome
• Considered the optimal design for observational studies
• Useful for:
– Finding causes and estimating incidence of disease
– Identification of risk factors
– Following natural history, determining prognosis
Prospective Designs (cont.)
• Weaknesses:
– Subject to selection bias (all studies are) and surveillance bias
– Losses to follow-up or dropouts
– Temporal changes in health habits (e.g., MRFIT)
• Can be expensive and always take time
• Advantages:
– Correct temporal relationship between exposures and disease/outcome
– Allows estimation of disease incidence and relative risks
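Because a cohort is followed forward from exposure, cumulative incidence can be computed in each exposure group and compared as a relative risk. The sketch below uses hypothetical counts and the usual large-sample confidence interval on the log scale; it assumes complete follow-up, with no losses or censoring.

```python
import math


def relative_risk(events_exposed, n_exposed, events_unexposed, n_unexposed, z=1.96):
    """Relative risk (ratio of cumulative incidences) with a log-scale CI."""
    risk_exposed = events_exposed / n_exposed
    risk_unexposed = events_unexposed / n_unexposed
    rr = risk_exposed / risk_unexposed
    # Standard error of log(RR) from the delta method
    se_log_rr = math.sqrt(1 / events_exposed - 1 / n_exposed
                          + 1 / events_unexposed - 1 / n_unexposed)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical cohort: 30/200 events among exposed, 15/200 among unexposed
rr, lower, upper = relative_risk(30, 200, 15, 200)
print(round(rr, 2), round(lower, 2), round(upper, 2))
```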
Prospective Design (example)
Edwards et al., “Renovascular disease and the risk of adverse coronary events…” Arch Intern Med (2005) 165:207-213
– 840 CHS participants with RDS exams from Hansen et al.
– Followed for CVD events for an average of 14 months post-RDS
– Participants with RVD found to have nearly twice the rate of adverse CVD events during the observation period as those without RVD
Observational Designs
[Figure: “Time” axis with “Today” marked, showing the retrospective cohort, retrospective (case-control), prospective (cohort), and cross-sectional designs]
Experimental Studies
Clinical Trials
Participants are assigned to an experimental treatment and followed for event of interest
– Clinical trials may…
a) …be randomized or non-randomized
b) …include a control group or have no control group
c) …compare current treatment to an historical control
d) …employ parallel or cross-over design
e) …employ blinding of investigator and/or participant
– The randomized, double-blind, placebo-controlled, parallel design is considered to be the best to determine efficacy
Clinical Trials (cont.)
Randomization
– Purpose: to balance groups on both observed and unobserved factors
– No guarantees: balance occurs in expectation (i.e., there is a chance that some factors will not be balanced)
– In cross-over design, it’s best to randomize treatment order (if possible)
– Blocking used to assure treatment arm balance at fixed points
– Stratification used to assure balance on a factor of interest
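Blocking can be sketched as permuted-block randomization: within each block every arm appears equally often, so the arms are balanced each time a block fills. The function name, arm labels, block size, and seed below are illustrative assumptions, not a production allocation system.

```python
import random


def blocked_randomization(n_participants, block_size=4, arms=("A", "B"), seed=None):
    """Return a treatment schedule built from randomly permuted blocks."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)
    schedule = []
    while len(schedule) < n_participants:
        # Each block holds every arm the same number of times, in random order
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)
        schedule.extend(block)
    return schedule[:n_participants]


schedule = blocked_randomization(12, block_size=4, seed=1)
print(schedule)
```

For stratified randomization, one option is to generate a separate schedule like this for each stratum (for example, one for participants with diabetes and one for those without).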
Clinical Trial: Parallel Group Design
[Figure: parallel group design on a “Time” axis. Participants are screened for entry criteria; labeled periods are Screening, Baseline, and Treatment]
Clinical Trial (example 1)
Kay et al., “Acetylcysteine for prevention of acute deterioration of renal function…” JAMA (2003) 289:553-558.
– Experiment to test efficacy of antioxidant acetylcysteine to prevent acute nephrotoxicity
– 200 patients with moderate renal insufficiency undergoing elective coronary angiography
– Randomized, double-blind, placebo-controlled
– 12% with increase in SCr in placebo group vs. 4% in acetylcysteine group (P=0.03)
Clinical Trial: Crossover Design
[Figure: crossover design timeline. Participants are screened for entry criteria; labels shown are Screening, Treatment (Phase 1), {Washout}, B/L, and Treatment (Phase 2)]
Clinical Trial (example 2)
Whelton et al., “Effects of celecoxib and naproxen on renal function…” Arch Intern Med (2000) 160:1465-1470
– Experiment to compare effect of celecoxib vs. naproxen on renal function in elderly cohort
– 29 healthy elderly subjects took either celecoxib or naproxen for 10 days, had 7-day washout, then took other med for 10 days
– Randomized treatment order, single-blind design
– At day 6, GFR change on naproxen -7.5 mL/min/1.73 m² vs. -1.1 on celecoxib (P=0.004)
Clinical Trials (other types)
• Non-randomized trials: patients not assigned to treatment (or treatment order) via randomization; interpret with caution
• External or historical controls: compare current experiment to an external control group (e.g., from prior study or literature); interpret with caution
• Uncontrolled trial: experimental group only (no comparison); interpret with caution
Clinical Trial (example 3)
Gomes et al., “Acute renal dysfunction in high-risk patients after angiography…” Radiology (1989) 170:65-68
– 145 patients at “high-risk” for renal failure undergoing angiography after administration of iohexol (non-ionic contrast)
– Compared to 202 historical controls previously studied with ionic contrast
– Acute renal dysfunction observed in 5.5% of iohexol group vs. 10% of historical control group (P=NS)
– Authors use result to argue for new, randomized trial of two contrast agents
Clinical Trials (issues)
• Blinding: double-blind is optimal but not always feasible
– Surgical trials usually impossible to blind both investigator and participant
– Some trials are “open-label” and treat participants to a goal; others test a behavioral intervention
– Group interventions are typically not blinded; must also account for “clustering” in intervention
• If possible, always blind staff performing measurements
• Avoid surveillance and/or ascertainment bias
Clinical Trials (issues)
• Look out for loss to follow-up, differential attrition, and poor adherence to treatments
• Intention-to-treat: when analyzing outcomes, participants are included in analyses based on treatment group assignment regardless of treatments received or adherence
– Necessary to avoid potential bias due to self-selection
– Preserves randomization
– Drug and device companies love to do analyses based on treatments received
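The difference between intention-to-treat and an analysis based on treatments received can be sketched with a toy data set. The records below are entirely hypothetical and constructed so that non-adherent participants have worse outcomes; the field names are illustrative.

```python
def event_rate_by(records, key):
    """Event rate in each group, grouping participants on the given field."""
    totals, events = {}, {}
    for record in records:
        group = record[key]
        totals[group] = totals.get(group, 0) + 1
        events[group] = events.get(group, 0) + record["event"]
    return {group: events[group] / totals[group] for group in totals}


# Hypothetical trial: 10 assigned to drug (2 never took it), 10 to placebo
records = (
    [{"assigned": "drug", "received": "drug", "event": 0}] * 6
    + [{"assigned": "drug", "received": "drug", "event": 1}] * 2
    + [{"assigned": "drug", "received": "none", "event": 1}] * 2
    + [{"assigned": "placebo", "received": "none", "event": 0}] * 6
    + [{"assigned": "placebo", "received": "none", "event": 1}] * 4
)

itt = event_rate_by(records, "assigned")         # intention-to-treat
as_treated = event_rate_by(records, "received")  # by treatment received
print(itt)
print(as_treated)
```

In this toy data the intention-to-treat rates are identical in both arms, while the as-treated comparison makes the drug look protective only because the two non-adherent participants were moved out of the drug group.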
Meta-analysis
• Pools results across multiple studies
• A review article with quantitative summary
• Typically combines results of several experimental studies
– Useful for combining small studies
– Studies should have same or similar treatments
– Pools results to get single measure of effect
• Beware: meta-analyses combining experimental and observational designs
• Dependent upon articles reporting sufficient data (N, effect measure, variance)
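One common way to pool results into a single measure of effect is fixed-effect inverse-variance weighting, which is why each article must report an effect measure and its variance. The sketch below uses hypothetical study estimates and assumes all studies estimate the same underlying effect; it does not model between-study heterogeneity.

```python
import math


def fixed_effect_pool(estimates, variances):
    """Inverse-variance weighted pooled estimate and its standard error."""
    weights = [1 / v for v in variances]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total_weight
    se = math.sqrt(1 / total_weight)
    return pooled, se


# Hypothetical effect estimates (e.g., log odds ratios) and their variances
pooled, se = fixed_effect_pool([0.2, 0.4, 0.3], [0.04, 0.04, 0.02])
print(round(pooled, 3), round(se, 3))
```

More precise studies (smaller variance) receive more weight, and the pooled standard error is smaller than that of any single study.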
Meta-analysis (example)
Leertouwer et al., “Stent placement for renal arterial stenosis…” Radiology (2000) 80:78-85
– Compared studies of RVD repair with stent placement vs. PTA alone
– Combined data on technical success rate, BP response, renal function response, anatomic F/U from 14 studies of stent placement and 10 studies of PTA
– Conclusion: “Renal artery stent placement is technically superior and clinically comparable to renal PTA alone.”
Data Collection for Statistical Analyses
Data Collection for Statistical Analyses
1. Enter all or most of the data as numbers. Avoid entering letters, words, string variables (e.g., NA, 22%, <3.6), or anything that resembles a cartoon curse word (@#&*%). In Excel, all columns, with the exception of names and text comments, should be formatted as numbers or dates (not as general or text).
2. Give each column a unique, simple, 1-word name, 8 characters or fewer with no spaces, beginning with a letter, and place this name in the first row.
3. Put only one variable in a column. Do not combine variables in the same column.
4. Enter each patient (or unit of analysis) on a separate line, beginning on the second line.
5. Give each research participant or patient a unique case number (1, 2, 3, etc.) in the first column. Delete patient name, SS#, MR#, and any identifying information before sending it to a statistician. Always save the spreadsheet with a password.
http://biostat.mc.vanderbilt.edu/twiki/bin/view/Main/DataTransmissionProcedures?CGISESSID=9fe1d0d63a71d176ca460de518acf2cf
Data Collection for Statistical Analyses
6. Enter cases and controls in the same spreadsheet. Use one variable to define the control group (TREATED 0=no, 1=yes or GROUP 1=Drug A, 2=Drug B).
7. Quantify. Enter continuous measurements when possible.
8. Create a simple guide (or key) using a word processor to explain variable abbreviations, value coding, and how missing values were entered. Be consistent.
9. Think through the analysis before collecting any data.
10. Have a biostatistician review the coding before data entry and again after the first 10 patients have been entered.
http://biostat.mc.vanderbilt.edu/twiki/bin/view/Main/DataTransmissionProcedures?CGISESSID=9fe1d0d63a71d176ca460de518acf2cf
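Several of these rules can be checked automatically before a spreadsheet goes to the statistician. The sketch below assumes the sheet has been exported as CSV text; the function name is hypothetical, underscores and digits are assumed acceptable in column names, empty cells are treated as missing values, and date columns are not handled.

```python
import csv
import io
import re

# Begins with a letter, at most 8 characters, no spaces (underscore allowed by assumption)
NAME_RULE = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,7}")


def check_sheet(csv_text, text_columns=()):
    """Return a list of problems found in a CSV export of the spreadsheet."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, body = rows[0], rows[1:]
    problems = []
    for name in header:
        if not NAME_RULE.fullmatch(name):
            problems.append(f"bad column name: {name!r}")
    if len(set(header)) != len(header):
        problems.append("duplicate column names")
    # Data start on the second line of the sheet
    for line_no, row in enumerate(body, start=2):
        for name, cell in zip(header, row):
            if name in text_columns or cell == "":
                continue
            try:
                float(cell)
            except ValueError:
                problems.append(f"line {line_no}, {name}: non-numeric value {cell!r}")
    return problems


# Hypothetical export with one bad column name and three non-numeric cells
sheet = "ID,AGE,Serum Creatinine,GFR\n1,67,1.4,55\n2,NA,<3.6,22%\n"
problems = check_sheet(sheet)
for problem in problems:
    print(problem)
```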
Spreadsheet from Hell
Spreadsheet from Heaven