TRANSCRIPT
Methods to analyze real world databases and registries
Hilal Maradit Kremers, MD MSc
Mayo Clinic, Rochester, MN
Clinical Research Methodology Course
NYU-Hospital for Joint Diseases
December 11, 2008
Disclosure
Research funding from
• National Institutes of Health (RA)
• Amgen (psoriasis)
• Pfizer (pulmonary arterial hypertension)
Outline
• Terminology
• Clinical trials versus observational studies and registries
• Types of observational studies in rheumatic diseases
– Descriptive epidemiology (incidence, prevalence)
– Disease definitions (i.e. classification criteria)
– Examining outcomes (including effectiveness of therapy) and risk factors (environmental, genetic)
• Tips when interpreting results
Terminology
“Real-world databases” & registries = Observational studies
Terminology of related observational research disciplines
• Pharmaco-epidemiology
• Epidemiology
• Economics
• Health Services Research
• Clinical Epidemiology
• Health Economics
• Outcomes research
Terminology: Clinical medicine versus epidemiology
CLINICAL MEDICINE
• Natural history of the disease
• Signs and symptoms
• Diagnosis (how and when)
• Current clinical practice
• Clinical literature
• Drug-induced illnesses
EPIDEMIOLOGY
• Distribution and determinants of diseases in populations
– Study design
– Data collection
– Measurement
– Analyses
– Interpretation
– Critical review
Clinical trials versus observational studies and registries
[Figure: schematic of three study designs (clinical trial, cohort/registry, case-control) relating exposure status (Exposure +, Exposure -) to disease status (Disease +, Disease -).]
Why do we need registries
• Limitations of pre-marketing trials
• Unresolved issues from pre-marketing studies
• New signals or inconsistent signals from post-marketing surveillance
• Evolving concerns about safety
• Establishing risk-benefit margins
• Learn about use, Rx decisions, compliance and other physician/patient behaviors
• To evaluate a risk management program
Clinical trial vs observational studies/registries – four “toos”
• Too few
• Too brief
• Too simple
• Too median-aged
Implications of four “toos”
• Relative effectiveness unknown
– Effectiveness in comparison to alternative therapies
• Surrogate vs. clinical endpoints
– Bone mineral density, blood pressure, lipid levels, tumor size, joint counts vs radiographic damage
• Infrequent adverse events
• Long latency adverse events
– DES & adenocarcinoma of vagina
• Special populations
– Women, children, elderly, multiple comorbidities
• Drug use in clinical practice
What is a registry?
• Definition of a registry
– An organized system that uses observational study methods to collect uniform data (clinical and other) to evaluate specified outcomes for a population defined by particular disease, condition or exposure, and that serves a predetermined scientific, clinical, or policy purpose(s).
• Different types of registries
– Disease registry
– Product registry
– Health services registry
• Pregnancy registries
Registries for Evaluating Patient Outcomes. AHRQ Publication No. 07-EHC001. May 2007.
Purpose of a registry
• Describe the natural history of disease
• Determine clinical effectiveness or cost effectiveness of health care products, drugs and services
• Measure or monitor safety and harm
• Measure quality of care
Registry types
• Disease registry
– Patients who have the same diagnosis
– e.g. all RA or SLE patients or rheumatic diseases
• Product registry
– Patients who have been exposed to biopharmaceutical products or medical devices
• Health services registry
– Patients who have had a common procedure, clinical encounter or hospitalization (TKA-THA registries)
Registries useful when:
• Outcome is relatively common, well-defined and ascertainable & serious
• Extensive drug exposure
• Appropriate reference group
• Data on relevant covariates ascertainable
• Minimal channeling (preferential prescribing of a new drug to patients at a higher risk)
• Minimal confounding by indication
• Onset latency <2-3 years
• Required drug exposure <2-3 years
• Not an urgent drug safety crisis
Registries may not be useful when:
• Outcome: poorly-defined, difficult to validate outcomes (depression, psychosis)
• Exposure
– Rare drug exposure
– Intermittent exposure
– OTC drugs, herbals
• Significant confounding by indication
– Antidepressants and suicides
– Inhaled beta-agonists and asthma death
• Certain settings
– Specialty clinics, in-hospital drug use
Consequences of not doing registries or observational studies
• Arguing over case reports
• Lack of data on real benefit-risk balance
• Less effective and usually biased decision-making
• Possibly false conclusions
• Lawsuits
Types of observational studies in rheumatology
Observational study designs
• Drug exposed patients
– Case reports
– Case series
– Registries
• Other
– Ecological studies
• Exposed vs. unexposed
– Cross-sectional
– Prospective cohort
– Case-control
Ecological studies – time series
• When drug is predominant cause of the disease
• Changes in outcomes following an abrupt change in drug exposure, as a result of a policy or regulatory change, publications, media coverage
• Reported Cases of Reye's Syndrome in Relation to the Timing of Public Announcements
Belay et al. NEJM 1999; 340:1377
Ecological studies – time series
Secular trends in NSAID use and colorectal cancer incidence
Lamont: Cancer J 2008:14(4):276-277
Ecological studies – time series
Rofecoxib-celecoxib and myocardial infarction
Brownstein et al. PLoS ONE. 2007:2(9):e840.
Summary: ecological studies
Limitations
• Complexity of disease causation
• Confounding by the “ecological fallacy”
Advantages
• Cost ↓, time ↓, using routinely collected data
• New hypotheses about the causes of a disease and new potential risk factors (e.g. air pollution)
• Provides estimates of causal effects that are not attenuated by measurement error
• Some risk factors for disease operate at the population level (e.g. socioeconomic status)
Studies on descriptive epidemiology of rheumatic diseases
Incidence
Prevalence
Mortality
Prevalence: Proportion of individuals in a defined population who have a particular disease at a given point in time
Population on 1/1/2005: N=100
Diseased individuals (RA): N=9
Prevalence = Incidence of disease x Duration of disease
Prevalence = 9/100
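A minimal sketch of the two prevalence relations on this slide. The function names are illustrative and not from the slides, the incidence and duration values in the second call are hypothetical, and the product rule is only an approximation that holds for a rare disease in a stable population.

```python
def point_prevalence(existing_cases: int, population: int) -> float:
    """Existing cases divided by everyone in the population at one point in time."""
    if population <= 0:
        raise ValueError("population must be positive")
    return existing_cases / population


def steady_state_prevalence(incidence_rate: float, mean_duration_years: float) -> float:
    """Approximate prevalence as incidence x duration (rare disease, stable population)."""
    return incidence_rate * mean_duration_years


# Slide example: 9 people with RA among 100 residents on 1/1/2005.
print(point_prevalence(9, 100))  # 0.09

# Hypothetical values: 5 new cases per 10,000 person-years, 20 years mean duration.
print(round(steady_state_prevalence(0.0005, 20), 4))  # 0.01
```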
Incidence: Proportion of new cases of a disease or health-related condition in a population-at-risk over a specified period of time
Population on 1/1/2005: N=100
Exclude prevalent cases (diseased individuals on 1/1/2005), leaving N=91 at risk
1 year follow-up
[Diagram legend: diseased individuals on 1/1/2005; new-onset disease during 1 year follow-up; deceased]
Incidence = 2 cases/91 person-years
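The incidence calculation can be sketched as below. The function name is illustrative, and the code keeps the slide's simplification that each of the 91 at-risk residents contributes one full year. In a real cohort, people who die, move away or develop the disease contribute person-time only up to that event, so the denominator would be somewhat smaller.

```python
def incidence_rate(new_cases: int, person_years: float) -> float:
    """New cases divided by the person-time at risk."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return new_cases / person_years


population = 100
prevalent_cases = 9
at_risk = population - prevalent_cases  # 91 people free of disease at baseline

# Simplification taken from the slide: one full year of follow-up per person.
person_years = at_risk * 1.0
rate = incidence_rate(2, person_years)
print(f"{rate * 100_000:.0f} per 100,000 person-years")  # 2198 per 100,000 person-years
```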
Incidence of RA in Olmsted County, MN (1955-2005)
Gabriel et al. A&R 2008: 58(9):S453
[Figure: age-adjusted incidence per 100,000 population (y-axis, 0-100) by year (x-axis, 1955-2005), plotted separately for females and males.]
Incidence of PsA by age and sex (1970-2000)
[Figure: incidence rate per 100,000 (y-axis, 0-20) by age (x-axis, 20-80), plotted separately for males and females.]
Wilson et al. AC&R 2009: in press.
Incidence study requires keeping track of both the numerator & denominator!
Population on 1/1/2005: N=100, followed in 1-year intervals
• Residents who die or move out of the city
• New residents (i.e. new folks who move into the city)
• All new-onset disease while living in the city
• Possible in few locations in the world
Mortality analyses
• RA: 124 studies in 84 unique cohorts (1)
• 15 key points in interpretation (1)
– Incident vs prevalent cases
– Population-based vs clinic-based
– SMR
• Cause-specific mortality (2)
• 3 time dimensions in mortality analyses (3)
– Duration of RA
– Timing of onset of RA relative to death
– Calendar time
1 Sokka et al. Clinical Exp Rheum 2008; 26(Suppl. 51): S35-S61
2 Aviña-Zubieta et al. A&R 2008; 59:1690-1697
3 Ward. A&R 2008; 59:1687-1689
Mortality in incidence cohorts < prevalence cohorts
Cohort type | Overall mortality in RA | CV mortality in RA
Incidence cohort | 1.31 | 1.19
Prevalence cohort | 1.63 | 1.56
Community-based | 1.31/1.63 | 1.35
Clinic-based | 1.28/1.65 | 1.53
1 Sokka et al. Clinical Exp Rheum 2008; 26(Suppl. 51): S35-S61
2 Aviña-Zubieta et al. A&R 2008; 59:1690-1697
Referral bias: Population-based vs clinic-based cohorts
Reality in the population: N=100
What the GP sees: N=92
What the rheumatologist sees! N=40
[Diagram label: mild disease]
SMR
• Observed deaths ÷ expected deaths
• Strongly influenced by choice of data to calculate expected deaths
– Age and gender specific
– Time period
– Complete follow-up
• Example:
– RA cohort assembled between 1970-1990 with follow-up until 2000
– Expected mortality derived from US mortality rates between 1970-1990
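A minimal sketch of how the choice of reference rates moves the SMR. All person-years, death counts and rates below are hypothetical, and the function names are illustrative. The second set of rates assumes, for illustration only, that general-population mortality was higher in the earlier period, so applying it to later follow-up inflates the expected deaths and pulls the SMR down.

```python
from typing import Dict, Tuple

Stratum = Tuple[str, str]  # (age group, sex)


def expected_deaths(person_years: Dict[Stratum, float],
                    reference_rates: Dict[Stratum, float]) -> float:
    """Sum over strata of cohort person-years times the reference mortality rate."""
    return sum(py * reference_rates[stratum] for stratum, py in person_years.items())


def smr(observed: int, expected: float) -> float:
    """Standardized mortality ratio: observed deaths divided by expected deaths."""
    return observed / expected


# Hypothetical cohort follow-up and observed deaths.
cohort_person_years = {("50-64", "F"): 1000.0, ("65+", "F"): 500.0}
observed = 45

# Hypothetical reference rates (deaths per person-year) for two periods.
rates_matching_follow_up = {("50-64", "F"): 0.01, ("65+", "F"): 0.04}
rates_earlier_period = {("50-64", "F"): 0.0125, ("65+", "F"): 0.05}

smr_matched = smr(observed, expected_deaths(cohort_person_years, rates_matching_follow_up))
smr_mismatched = smr(observed, expected_deaths(cohort_person_years, rates_earlier_period))
print(round(smr_matched, 2), round(smr_mismatched, 2))  # 1.5 1.2
```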
Trends in RA Mortality vs. Expected*
[Figure: two panels (females, males) showing mortality rate per 100 person-years (y-axis, 0-5) by calendar year (x-axis, 1970-2000), RA versus expected.]
Gonzalez A, et al. Arthritis Rheum 2007;56(11):3583-3587
Observed: expected mortality in RA
[Figure: two panels showing survival (%) by years after RA incidence (x-axis, 0-15), observed (RA) versus expected (non-RA); P<0.001 in each panel.]
Gabriel et al. A&R 2003; 48:54-58
Time: disease duration and CV mortality in RA
Duration of RA | RR | 95% CI
<1 year | 1.00 | (reference)
1 to <4 years | 0.80 | 0.32-2.02
4 to <8 years | 1.00 | 0.42-2.37
≥ 8 years | 0.84 | 0.37-1.90
Maradit Kremers A&R 2005; 52: 722-732
Summary: incidence, prevalence and mortality
Consider
• Underlying data source
– Population-based or not
– Incident vs prevalent cases
• Methodology
– Case ascertainment
– Completeness of follow-up
• Comparison data!
Disease definitions and classification criteria in rheumatic diseases
Developed using observational study methodologies
Dynamic nature of rheumatic diseases
• Only 25% of patients who initially met RA criteria still had evidence of RA 3-5 years later
O’Sullivan et al. Ann Intern Med 1972; 76: 573-7.
Mikkelsen et al. A&R 1969; 12: 87-91.
Lichtenstein et al. J Rheumatol 1991; 18: 989-93.
Icen et al. J Rheumatol 2008.
[Figure: cumulative incidence (%, y-axis 0-100) by years since RA incidence (x-axis, 0-25), with separate curves labeled 2 or more, 3 or more, 4 or more and 5 or more.]
Typical vs desired methodology for classification criteria
TYPICAL
• Patients with established disease
• Patients with other established rheumatic diseases
• Compare characteristics
DESIRED
• Patients with new-onset disease
• Patients with other new-onset rheumatic diseases
• Observe disease evolution in both groups
• Compare characteristics
Examining outcomes and risk factors in rheumatic diseases
Cohort Studies (outcomes)
Registries (outcomes)
Case-control studies (risk factors)
Types of Cohort Studies
• Designated by the timing of data collection in the investigator’s time:
– Prospective
– Retrospective (historical)
– Mixed
• Mayo studies: retrospective
• Registries: prospective
Types of Cohort Studies
[Figure: timelines for three cohort designs, showing when the investigator begins the study relative to selection of the cohort: prospective (concurrent) study, retrospective (non-concurrent) study, and mixed (prospective + retrospective) study.]
All designs feasible either as ad hoc registries or in automated database studies.
Cohort study: design options
• Prospective vs. retrospective
• Entry into cohort: closed or open
• Timing of exposure: new users or not
• Source of un-exposed cohort
– Internal
– External
• Drug exposed subjects only, registries
Cohort Study: Steps
1. Cohort identification
• Define subjects & follow-up period
2. Risk factor/drug exposure measurement throughout follow-up
3. Outcome (disease) ascertainment
4. Confounder measurements (throughout follow-up)
5. Analysis
Step 1 - Cohort identification
• Trade-off between external and internal validity
• Retrospective vs. prospective
– Consider feasibility and costs
• Follow-up
– Tracking of drug changes over time
– Losses to follow-up, esp. if likely to be differential (different for drug users and non-users)
Step 2 – Risk factor/Drug exposure measurement
• New versus old users
– Ability to account for confounders before drug started
– Ability to quantify outcomes early after starting the drug (compliance, early drop-offs due to intolerance)
• Incomplete drug exposure
– E.g. one-time measurement of DMARD use and mortality
• Drug exposure metric
– Ever/never, dose (average, cumulative), duration
• Reference group
– Non-users, past users, users of other drugs
• Misclassification of episodic use
Step 2 - Timing: patterns of drug use
Antibiotic
NSAIDs
DMARDs
Step 2 - Drug exposure measurement methods
• Interviews
– Face-to-face, phone or self-administered
– Excellent to capture current use but not for past use or changing drug use over time
– Loss of memory: recall is best in cognitively intact subjects and for regularly used drugs
• Biological testing
– Blood or urine
– Excellent to capture current use but not for past use
– Non-differential (unless disease affects the assay)
• Pharmacy or claims records
• Medical records
Step 2 - Pharmacy or claims records for drug exposure
• Drugs obtained by prescription
• Drug details available
• Accurate & complete for both past and current drug exposure
• Temporal tracking possible
• Limitation: compliance
– Prescription filled versus drug actually taken
• Validation studies are necessary
Step 2 - Misclassification of drug exposure
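Truth
– Free sample: 15 days
– Rx fill for 30 days; patient used for 40 days
– Refill for 30 days; used 20 days
– Discontinued
Data sources compared
– MD prescription
– Pharmacy claims
– Prescription database
– Claims data + 15 days rule
[Diagram: timelines comparing true use with the exposure recorded by each data source; each fill appears as 30 days.]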
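The misclassification in this scenario can be sketched by building exposure windows from fills plus a grace period and comparing them with true use. The day numbering is an assumption for illustration: the free sample covers days 0-14, the first fill happens on day 15 and the refill on day 55, so true use runs from day 0 to day 74. The function names are hypothetical.

```python
from typing import List, Set, Tuple

Interval = Tuple[int, int]  # half-open [start_day, end_day)


def claims_exposure(fills: List[Tuple[int, int]], grace_days: int) -> List[Interval]:
    """Turn (fill_day, days_supply) records into merged exposure windows."""
    windows = sorted((day, day + supply + grace_days) for day, supply in fills)
    merged: List[Interval] = []
    for start, end in windows:
        if merged and start <= merged[-1][1]:
            # Overlapping or adjacent windows become one continuous episode.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def days_covered(intervals: List[Interval]) -> Set[int]:
    """Expand intervals into the set of individual days they cover."""
    return {day for start, end in intervals for day in range(start, end)}


true_use = [(0, 75)]              # 15 sample days + 40 days + 20 days, then discontinued
fills = [(15, 30), (55, 30)]      # two pharmacy claims, 30 days supply each
recorded = claims_exposure(fills, grace_days=15)

missed_days = days_covered(true_use) - days_covered(recorded)   # exposed, recorded unexposed
excess_days = days_covered(recorded) - days_covered(true_use)   # unexposed, recorded exposed
print(recorded, len(missed_days), len(excess_days))  # [(15, 100)] 15 25
```

Under these assumptions the claims data miss the 15 free-sample days and add 25 days of exposure after the patient has stopped, which is one way a grace-period rule can misclassify exposure in both directions.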
Step 2 summary: Aspects of drug exposure measurement
• Completeness & accuracy
• Response rate
• Change over time
• Special populations
• Details of the drug
• Details of utilization
• Availability & cost (reimbursement)
• Differential or non-differential
Step 3 – Outcome ascertainment
• Low specificity – methods used to find outcomes incorrectly include subjects without the outcome
– Validation of outcomes in database studies
• Low sensitivity – incomplete (and potentially differential) identification of outcomes
– Increased diagnostic surveillance (e.g. NSAIDs and GI events)
– Under-diagnosed & un-treated conditions
• Timing of disease onset (protopathic bias)
Step 3 challenges: Protopathic bias
Truth: nsNSAIDs → stomach pain → coxib → UGIB
Database: nsNSAIDs → coxib → UGIB
[Diagram marks the study start on the timelines.]
nsNSAID = non-specific NSAID
Step 3 – Outcome of interest in rheumatology
• Beneficial effects/effectiveness
– Disease progression
• Adverse effects
– Mortality
– Cardiovascular morbidity
– Infections
– Lymphomas & solid malignancies
– Autoimmunity
– GI events (NSAIDs)
– Pregnancy outcomes
Step 3 - Consistency in outcome definitions – infections in RA
Askling: Curr Opin Rheumatol 2008; 20(2): 138–144
Step 3 challenges: Differential misclassification of outcome
• Cohort study: may result from misclassifying outcome-free subjects as having the outcome (specificity) or from incomplete identification of persons with the outcome (sensitivity), to a different degree in exposed and unexposed subjects
– Under-diagnosed conditions
– Example: Patients with RA, especially those on biologics, are likely to see their doctors more often and more likely to have laboratory tests or be examined for CVD
• Using medication-taking as a surrogate of outcome can be problematic
Step 3 – Outcome ascertainment
Competing risk of death
Melton et al. Osteoporos Int. 2008 Sep 17.
Step 4 – Confounder measurements
What is a confounder?
• The clinical condition which determines drug selection (channeling) and is linked to the adverse event – Indication
– Severity
– Contraindication
Drug Exposure Adverse event
Confounder: INDICATION
Step 4 - Confounding by indication
• The indications for drug use, because of their natural association with prognosis, may confound the comparison so that it looks as if the treatment causes the disease
“You’d better avoid antihypertensive treatment because treated patients have higher stroke rates”
Step 4 - Confounding by disease severity
• The severity of RA is a confounder because:
– Associated with use of biologics
– Independent risk factor for CVD
– Not in causal pathway
Biologics CVD
Rheumatoid arthritis (RA) severity(confounder)
Step 4 - Confounding by contraindication
• MD’s perception of the patient’s tendency to develop peptic ulcer & bleeding is a confounder because:
– Associated with NSAID choice
– Independent risk factor for GI bleeding
– Not in causal pathway
Celebrex vs Naproxen
GI bleeding
MD perception of risk
Step 4 - Confounding by indication
• Prescription Channeling
– New versus older products
– Example: Comparison of the risk of upper GI bleeding among coxibs versus traditional NSAIDs
• Coxibs preferentially prescribed to patients at high risk for upper GI bleeding
Moride et al. Arthritis Res Ther. 2005;7:R333-342.
Step 4 – Extent of confounding by indication
Schneeweiss. Clin Pharmacol Ther 2007: 82:143–156
[Figure: potential for confounding by indication plotted against the intentionality of the treatment effect by the prescriber, with two marked examples: coxibs and CV events, coxibs and GI events.]
Step 5 - Analysis
• Conventional methods to control for confounding
– Randomization (clinical trials)
– Restriction - homogeneous study population
– Matching - select controls comparable to cases with respect to confounders
– Stratified analysis
– Statistical modeling
• Sensitivity analyses
• Active-competing comparator designs
• Propensity scores
• Marginal structural models
• Instrumental variable analysis
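As a minimal sketch of the stratified analysis listed above, the code below compares a crude risk ratio with a Mantel-Haenszel risk ratio pooled over strata of disease severity. The counts are invented to mimic confounding by indication (the drug is channeled to severe patients, and severity drives the outcome); they are not from any study cited in these slides, and the class and function names are illustrative.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Stratum:
    name: str
    exposed_events: int
    exposed_total: int
    unexposed_events: int
    unexposed_total: int


def crude_risk_ratio(strata: List[Stratum]) -> float:
    """Risk ratio after collapsing all strata into one 2x2 table."""
    exposed_risk = sum(s.exposed_events for s in strata) / sum(s.exposed_total for s in strata)
    unexposed_risk = sum(s.unexposed_events for s in strata) / sum(s.unexposed_total for s in strata)
    return exposed_risk / unexposed_risk


def mantel_haenszel_risk_ratio(strata: List[Stratum]) -> float:
    """Mantel-Haenszel pooled risk ratio across strata."""
    numerator = 0.0
    denominator = 0.0
    for s in strata:
        n = s.exposed_total + s.unexposed_total
        numerator += s.exposed_events * s.unexposed_total / n
        denominator += s.unexposed_events * s.exposed_total / n
    return numerator / denominator


# Hypothetical cohort: within each severity stratum the drug has no effect.
strata = [
    Stratum("severe disease", 160, 800, 40, 200),  # 20% risk in both groups
    Stratum("mild disease", 10, 200, 40, 800),     # 5% risk in both groups
]
print(round(crude_risk_ratio(strata), 3))            # 2.125
print(round(mantel_haenszel_risk_ratio(strata), 3))  # 1.0
```

In this constructed example the crude comparison suggests a doubling of risk, while the stratified estimate shows no association once severity is held fixed.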
Example: Sensitivity analysis
Setoguchi Am Heart J 2008;156:336-41
Example: Propensity score
Wiles Arthritis Rheum 2001;44:1033-42
Example: Marginal structural models to examine MTX and CV Death
Outcome | Deaths | Hazard ratio (95% CI)
All-cause mortality | 191 | 0.8 (0.6-1.0) unadjusted; 0.4 (0.2-0.8)*
Cardiovascular mortality | 84 | 0.3 (0.2-0.7)*
Non-CV mortality | 107 | 0.6 (0.2-1.2)*
* Adjusted for: age, sex, RF, calendar year, duration of disease, smoking, education, HAQ score, patient global assessment, joint counts, ESR, prednisone status and number of other DMARDs used
Choi HK, et al. Lancet 2002;359:1173-7
Cohort studies example
Exposed cohort only
• Usually prospective
• Biologics registries by Pharma
– All patients getting one or more biologics
– Typically one-armed cohort: No comparator
• 9882 patients on anti-TNF observed for ~2-3 years
• 25 new onset psoriasis (what does this mean?)*
– Total denominator known; ?total # effects
• Comparison data
– External and typically not the same sampling frame as patients on biologics
* Harrison et al. Ann Rheum Dis April 2008.
Cohort studies example
Exposed & comparison cohort
• NSAIDs and GI bleeding
– Cohort of patients taking NSAID of interest compared with one or more other NSAIDs
– Rate of GI bleeding during follow-up period compared
• Glucocorticoids and risk of CVD in RA patients
– Cohort of RA patients taking glucocorticoids – comparison of users vs non-users
– Rate of CVD during follow-up compared
Solomon et al. Arthritis Rheum 2006;54:1378-89
Davis et al. Arthritis Rheum 2007;56:820-830
Cohort studies example
Analysis within existing cohort
• Identify general population cohort study where extensive longitudinal data available
– Nurses Health Study, Framingham Study, Physician’s Health Study, National Databank of Rheum Diseases
• REP - Rochester Epidemiology Project: Cohort is the Olmsted County population
• Advantages: If data collected, analysis only
• Disadvantages: Biases, confounding relative to nature of population + lack of key covariates
Cohort studies example
Database cohort study
• Most common form in pharmacoepidemiology
• Usually retrospective, but can be mixed
• Many large multi-purpose databases are used
– HMO, Managed Care (Puget Sound, United Health Care)
– Electronic medical records (GPRD, MediPlus)
– Provincial health plans (Saskatchewan)
• Advantages: large, data already exists, complete for billable services
• Disadvantages: Claims ≠ diagnoses
Summary: Cohort studies
There is a difference between relative versus absolute risk
[Figure: two panels of incidence rate (y-axis) by age (x-axis, 30-80) for unexposed (e.g. non-RA) and exposed (e.g. RA) groups. Panel 1: rate difference is constant but rate ratio is decreasing. Panel 2: rate difference is increasing but rate ratio is constant.]
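The two patterns can be reproduced with a few lines of arithmetic. The rates below are hypothetical (per 1,000 person-years) and chosen only so that the background rate rises with age, as in the figure; the function names are illustrative.

```python
from typing import List


def rate_differences(exposed: List[float], unexposed: List[float]) -> List[float]:
    """Absolute risk measure: exposed rate minus unexposed rate, per age group."""
    return [e - u for e, u in zip(exposed, unexposed)]


def rate_ratios(exposed: List[float], unexposed: List[float]) -> List[float]:
    """Relative risk measure: exposed rate divided by unexposed rate, per age group."""
    return [e / u for e, u in zip(exposed, unexposed)]


ages = [30, 40, 50, 60]
unexposed = [1.0, 2.0, 4.0, 8.0]  # hypothetical background rates, rising with age

# Scenario 1: exposure adds a fixed 2 events per 1,000 person-years at every age.
additive = [u + 2.0 for u in unexposed]
print(rate_differences(additive, unexposed))  # [2.0, 2.0, 2.0, 2.0]
print(rate_ratios(additive, unexposed))       # [3.0, 2.0, 1.5, 1.25]

# Scenario 2: exposure doubles the rate at every age.
multiplicative = [u * 2.0 for u in unexposed]
print(rate_differences(multiplicative, unexposed))  # [1.0, 2.0, 4.0, 8.0]
print(rate_ratios(multiplicative, unexposed))       # [2.0, 2.0, 2.0, 2.0]
```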
Summary: Keep in mind the major differences among registries!
Name | Year since | Source of data | Inclusion criteria | Comparison group | Current size | Follow-up intervals
NDB (US) | 1998 | Selected centers | Rheum diseases | DMARD users | ~20,000 RA patients | Semi-annual
CORRONA (US) | 2002 | Selected centers | New starts | Other DMARDs | ~10,000 RA patients | Regular?
VARA (US) | ?? | Veterans | RA patients | Not explicit | unknown | unknown
BSRBR (UK) | 2001 | Selected centers | New users; 4,000 per drug | Collected at defined sites | Biologics: >14,000; controls: >3,000 | Baseline & regular up to 60 months
RABBIT (Germany) | 2001 | Selected centers | New users; 1,000 per drug | Internal: DMARD failures | Biologics: >3,500; controls: 1,800 | Baseline & regular up to 120 months
ARTIS (Sweden) | | Regional registers | New users | National register data | 15,000 treatments | Baseline & regular
BIOBADASER (Spain) | 2000 | Selected centers | New users | EMECAR cohort | >8,000 patients | Registration at inception of adverse event
DANBIO (Denmark) | 2000 | Selected centers | New users | None | >3,500 RA patients | No defined follow-up
NOR-DMARD (Norway) | 2000 | Selected centers | New users of DMARD or biologic | DMARDs | >2000 RA/AS/PsA; >3000 DMARDs | Baseline & regular
DREAM (45) | 2003 | Selected centers | New users | Early RA cohort | >1,000 patients | Baseline and regular
LORHEN (Italy) | 1999 | Regional | New users | None | >1,000 RA patients | Baseline and irregular intervals
Swiss SCOM | 1996 | Not a biologic register | New users | DMARD patients | >2,000 patients | Annually
Tips when interpreting studies
Consider these before you believe the results!
• If negative study
– Power
– Outcome & exposure definition
– Comparison group
– Non-differential misclassification
– Replication
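To make the power point concrete, here is a rough sketch using the usual normal approximation for comparing two proportions. The risks and cohort sizes are hypothetical, the function name is illustrative, and the formula ignores the small probability of rejecting in the wrong direction; it is meant as a back-of-the-envelope check, not a substitute for a formal power calculation.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(risk_exposed: float, risk_unexposed: float,
                          n_exposed: int, n_unexposed: int,
                          alpha: float = 0.05) -> float:
    """Approximate power of a two-sided test comparing two risks."""
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)
    standard_error = sqrt(risk_exposed * (1 - risk_exposed) / n_exposed
                          + risk_unexposed * (1 - risk_unexposed) / n_unexposed)
    z_effect = abs(risk_exposed - risk_unexposed) / standard_error
    return NormalDist().cdf(z_effect - z_critical)


# Hypothetical adverse event: 2% risk in users versus 1% in non-users.
small_study = power_two_proportions(0.02, 0.01, n_exposed=500, n_unexposed=500)
large_study = power_two_proportions(0.02, 0.01, n_exposed=5000, n_unexposed=5000)
print(f"power with 500 per group:  {small_study:.2f}")
print(f"power with 5000 per group: {large_study:.2f}")
```

With numbers like these, a "negative" result from the smaller study says little, because a true doubling of risk would usually go undetected.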
Consider these before you believe the results!
• If positive study
– Confounding
– Channeling
– Differential misclassification
– Generalizability
– Implications
– Replication