
Methods to analyze real-world databases and registries

Hilal Maradit Kremers, MD MSc

Mayo Clinic, Rochester, MN

Clinical Research Methodology Course
NYU-Hospital for Joint Diseases

December 11, 2008

Disclosure

Research funding from
• National Institutes of Health (RA)
• Amgen (psoriasis)
• Pfizer (pulmonary arterial hypertension)

Outline

• Terminology

• Clinical trials versus observational studies and registries

• Types of observational studies in rheumatic diseases

– Descriptive epidemiology (incidence, prevalence)

– Disease definitions (i.e. classification criteria)

– Examining outcomes (including effectiveness of therapy) and risk factors (environmental, genetic)

• Tips when interpreting results

Terminology

“Real-world databases” & registries = Observational studies

Terminology of related observational research disciplines

[Diagram: related disciplines]
• Epidemiology
• Economics
• Health services research
• Pharmacoepidemiology
• Clinical epidemiology
• Health economics
• Outcomes research

Terminology: Clinical medicine versus epidemiology

CLINICAL MEDICINE

• Natural history of the disease

• Signs and symptoms

• Diagnosis (how and when)

• Current clinical practice

• Clinical literature

• Drug-induced illnesses

EPIDEMIOLOGY

• Distribution and determinants of diseases in populations

– Study design

– Data collection

– Measurement

– Analyses

– Interpretation

– Critical review

Clinical trials versus observational studies and registries

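[Diagram: direction of inquiry in each study design]
CLINICAL TRIAL: start from exposure (Exposure +, Exposure -) and follow forward to disease (Disease +, Disease -)
COHORT / REGISTRY: start from exposure (Exposure +, Exposure -) and follow forward to disease
CASE-CONTROL: start from disease status and look back at exposure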

Why do we need registries?

• Limitations of pre-marketing trials

• Unresolved issues from pre-marketing studies

• New signals or inconsistent signals from post-marketing surveillance

• Evolving concerns about safety

• Establishing risk-benefit margins

• Learn about use, Rx decisions, compliance and other physician/patient behaviors

• To evaluate a risk management program

Clinical trial vs observational studies/registries – four “toos”

• Too few

• Too brief

• Too simple

• Too median-aged

Implications of four “toos”

• Relative effectiveness unknown
– Effectiveness in comparison to alternative therapies
• Surrogate vs. clinical endpoints
– Bone mineral density, blood pressure, lipid levels, tumor size, joint counts vs radiographic damage
• Infrequent adverse events
• Long latency adverse events
– DES & adenocarcinoma of vagina
• Special populations
– Women, children, elderly, multiple comorbidities

• Drug use in clinical practice

What is a registry?

• Definition of a registry
– An organized system that uses observational study methods to collect uniform data (clinical and other) to evaluate specified outcomes for a population defined by particular disease, condition or exposure, and that serves a predetermined scientific, clinical, or policy purpose(s).
• Different types of registries
– Disease registry
– Product registry
– Health services registry

• Pregnancy registries

Registries for Evaluating Patient Outcomes. AHRQ Publication No. 07-EHC001. May 2007.

Purpose of a registry

• Describe the natural history of disease
• Determine clinical effectiveness or cost effectiveness of health care products, drugs and services
• Measure or monitor safety and harm
• Measure quality of care

Registry types

• Disease registry
– Patients who have the same diagnosis
– e.g. all RA or SLE patients or rheumatic diseases
• Product registry
– Patients who have been exposed to biopharmaceutical products or medical devices
• Health services registry
– Patients who have had a common procedure, clinical encounter or hospitalization (TKA-THA registries)

Registries useful when:

• Outcome is relatively common, well-defined and ascertainable & serious

• Extensive drug exposure
• Appropriate reference group
• Data on relevant covariates ascertainable
• Minimal channeling (preferential prescribing of a new drug to patients at a higher risk)
• Minimal confounding by indication
• Onset latency <2-3 years
• Required drug exposure <2-3 years
• Not an urgent drug safety crisis

Registries may not be useful when:

• Outcome: poorly-defined, difficult to validate outcomes (depression, psychosis)

• Exposure
– Rare drug exposure
– Intermittent exposure
– OTC drugs, herbals
• Significant confounding by indication
– Antidepressants and suicides
– Inhaled beta-agonists and asthma death
• Certain settings
– Specialty clinics, in-hospital drug use

Consequences of not doing registries or observational studies
• Arguing over case reports
• Lack of data on real benefit-risk balance
• Less effective and usually biased decision-making
• Possibly false conclusions
• Lawsuits

Types of observational studies in rheumatology

Observational study designs

• Drug exposed patients

– Case reports

– Case series

– Registries

• Other

– Ecological studies

• Exposed vs. unexposed

– Cross-sectional

– Prospective cohort

– Case-control

Ecological studies – time series
• When drug is predominant cause of the disease
• Changes in outcomes following an abrupt change in drug exposure, as a result of a policy or regulatory change, publications, or media coverage

• Reported Cases of Reye's Syndrome in Relation to the Timing of Public Announcements

Belay et al. NEJM 1999; 340:1377

http://content.nejm.org/content/vol340/issue18/images/large/01f1.jpeg?ck=nck

Ecological studies – time series
Secular trends in NSAID use and colorectal cancer incidence

Lamont: Cancer J 2008:14(4):276-277

Ecological studies – time series
Rofecoxib-celecoxib and myocardial infarction

Brownstein et al. PLoS ONE. 2007:2(9):e840.

Summary: ecological studies

Limitations
• Complexity of disease causation
• Confounding by the “ecological fallacy”

Advantages
• Lower cost and less time, using routinely collected data
• New hypotheses about the causes of a disease and new potential risk factors (e.g. air pollution)
• Provides estimates of causal effects that are not attenuated by measurement error
• Some risk factors for disease operate at the population level (e.g. socioeconomic status)

Studies on descriptive epidemiology of rheumatic diseases
• Incidence
• Prevalence
• Mortality

Prevalence: Proportion of individuals in a defined population who have a particular disease at a given point in time

Population on 1/1/2005: N=100
Diseased individuals (RA): N=9
Prevalence = 9/100 = 9%

In a steady state, prevalence ≈ incidence of disease × average duration of disease
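The prevalence arithmetic on this slide can be sketched in a few lines of Python. The function names are illustrative, and the incidence and duration values in the second call are hypothetical inputs chosen only to show the incidence × duration approximation, which holds roughly for a stable, uncommon disease.

```python
def point_prevalence(cases: int, population: int) -> float:
    """Proportion of a defined population with the disease at one point in time."""
    if population <= 0 or not 0 <= cases <= population:
        raise ValueError("need population > 0 and 0 <= cases <= population")
    return cases / population


def steady_state_prevalence(incidence_per_year: float, mean_duration_years: float) -> float:
    """Approximation: prevalence ~= incidence x average duration (steady state)."""
    return incidence_per_year * mean_duration_years


# Slide numbers: 9 people with RA among 100 residents on 1/1/2005.
slide_prevalence = point_prevalence(cases=9, population=100)
print(f"Point prevalence: {slide_prevalence:.1%}")

# Hypothetical inputs (not from the slides), used only to illustrate the formula.
approx_prevalence = steady_state_prevalence(incidence_per_year=0.005, mean_duration_years=18)
print(f"Steady-state approximation: {approx_prevalence:.1%}")
```

The approximation breaks down when incidence or survival is changing over time, which is one reason prevalence and incidence are usually measured separately.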

Incidence: Occurrence of new cases of a disease or health-related condition in a population at risk over a specified period of time

• Identify diseased individuals on 1/1/2005 and exclude these prevalent cases, leaving N=91 at risk
• Follow for 1 year, recording new-onset disease and deaths during follow-up

Incidence = 2 cases / 91 person-years ≈ 2.2 per 100 person-years
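A minimal sketch of the person-time calculation behind this slide follows. The `FollowUp` record and the half-year contributions in the second scenario are assumptions made for illustration; the slide itself credits every at-risk person with a full year.

```python
from dataclasses import dataclass


@dataclass
class FollowUp:
    years_at_risk: float      # time contributed to the denominator
    developed_disease: bool   # True if new-onset disease occurred during follow-up


def incidence_rate(records: list[FollowUp]) -> float:
    """New cases divided by total person-years at risk."""
    person_years = sum(r.years_at_risk for r in records)
    cases = sum(r.developed_disease for r in records)
    return cases / person_years


# Slide simplification: 91 at-risk people, each credited with 1 full year, 2 new cases.
slide_cohort = [FollowUp(1.0, False)] * 89 + [FollowUp(1.0, True)] * 2
slide_rate = incidence_rate(slide_cohort)
print(f"Slide rate: {100 * slide_rate:.1f} per 100 person-years")

# Hypothetical refinement: the 2 cases and 1 death each stop contributing after 6 months.
refined_cohort = (
    [FollowUp(1.0, False)] * 88   # disease-free and alive for the whole year
    + [FollowUp(0.5, False)]      # died mid-year without the disease
    + [FollowUp(0.5, True)] * 2   # disease onset mid-year
)
refined_rate = incidence_rate(refined_cohort)
print(f"Refined rate: {100 * refined_rate:.1f} per 100 person-years")
```

Censoring people at death, move-out, or disease onset shrinks the denominator slightly, so the refined rate is a little higher than the slide's simplified figure.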

Incidence of RA in Olmsted County, MN (1955-2005)

Gabriel et al. A&R 2008: 58(9):S453

[Figure: age-adjusted incidence per 100,000 population (y-axis, 0 to 100) by year (x-axis, 1955 to 2005), with separate curves for females and males]

Incidence of PsA by age and sex (1970-2000)

[Figure: incidence rate per 100,000 (y-axis, 0 to 20) by age (x-axis, 20 to 80), with separate curves for males and females]

Wilson et al. AC&R 2009: in press.

Incidence study requires keeping track of both the numerator & denominator!


[Diagram: population tracked over successive 1-year intervals]
• Residents who die or move out of the city
• New residents (i.e. new folks who move into the city)
• All new-onset disease while living in the city
• Possible in few locations in the world

Mortality analyses

• RA: 124 studies in 84 unique cohorts¹
• 15 key points in interpretation¹
– Incident vs prevalent cases
– Population-based vs clinic-based
– SMR
• Cause-specific mortality²
• 3 time dimensions in mortality analyses³
– Duration of RA
– Timing of onset of RA relative to death
– Calendar time

¹ Sokka et al. Clinical Exp Rheum 2008; 26 (Suppl. 51): S35-S61
² Aviña-Zubieta et al. A&R 2008; 59:1690-1697
³ Ward. A&R 2008; 59: 1687-1689

Mortality in incidence cohorts < prevalence cohorts

                     Overall mortality in RA    CV mortality in RA
Incidence cohort     1.31                       1.19
Prevalence cohort    1.63                       1.56
Community-based      1.31/1.63                  1.35
Clinic-based         1.28/1.65                  1.53

¹ Sokka et al. Clinical Exp Rheum 2008; 26 (Suppl. 51): S35-S61
² Aviña-Zubieta et al. A&R 2008; 59:1690-1697

Referral bias: Population-based vs clinic-based cohorts

[Diagram: nested populations, with a label marking mild disease]
Reality in the population: N=100
What the GP sees: N=92
What the rheumatologist sees! N=40

SMR

• Observed deaths ÷ expected deaths
• Strongly influenced by choice of data to calculate expected deaths
– Age and gender specific
– Time period
– Complete follow-up
• Example:
– RA cohort assembled between 1970-1990 with follow-up until 2000
– Expected mortality derived from US mortality rates between 1970-1990
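The SMR calculation can be sketched as indirect standardization: cohort person-years in each sex, age and calendar-period stratum are multiplied by the matching reference mortality rate to get expected deaths. All strata, rates and counts below are hypothetical, and the function names are illustrative rather than taken from any package.

```python
Stratum = tuple[str, str, str]  # (sex, age band, calendar period)


def expected_deaths(person_years: dict[Stratum, float],
                    reference_rates: dict[Stratum, float]) -> float:
    """Sum of cohort person-years times the matching reference mortality rate.

    A stratum missing from reference_rates raises KeyError, so rates from a
    different age band or calendar period are never borrowed silently.
    """
    return sum(py * reference_rates[stratum] for stratum, py in person_years.items())


def standardized_mortality_ratio(observed: int, expected: float) -> float:
    """Observed deaths divided by expected deaths."""
    return observed / expected


# Hypothetical cohort person-years by stratum.
cohort_person_years = {
    ("F", "50-59", "1970-79"): 400.0,
    ("F", "60-69", "1980-89"): 350.0,
    ("M", "60-69", "1980-89"): 250.0,
}
# Hypothetical reference mortality rates (deaths per person-year).
population_rates = {
    ("F", "50-59", "1970-79"): 0.005,
    ("F", "60-69", "1980-89"): 0.012,
    ("M", "60-69", "1980-89"): 0.020,
}

expected = expected_deaths(cohort_person_years, population_rates)
smr = standardized_mortality_ratio(observed=14, expected=expected)
print(f"Expected deaths: {expected:.1f}, SMR: {smr:.2f}")
```

If follow-up ran to 2000 but the reference table stopped at 1990, as in the slide's example, the later person-years would have no matching rate. Substituting older rates for them would distort the expected count whenever population mortality changed after 1990.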

Trends in RA Mortality vs. Expected*

[Figure: two panels (females, males) showing mortality rate per 100 person-years (y-axis, 0 to 5) by calendar year (x-axis, 1970 to 2000), each with an RA curve and an expected curve]

Gonzalez A, et al. Arthritis Rheum 2007;56(11):3583-587

Observed: expected mortality in RA

[Figure: survival (%) by years after RA incidence (x-axis, 0 to 15), observed (RA) vs expected (non-RA); P<0.001]

Gabriel et al. A&R 2003; 48:54-58

Time: disease duration and CV mortality in RA

Duration of RA     RR      95% CI
<1 year            1.00    (reference)
1 to <4 years      0.80    0.32-2.02
4 to <8 years      1.00    0.42-2.37
≥8 years           0.84    0.37-1.90

Maradit Kremers A&R 2005; 52: 722-732

Summary: incidence, prevalence and mortality

Consider

• Underlying data source
– Population-based or not
– Incident vs prevalent cases
• Methodology
– Case ascertainment
– Completeness of follow-up

• Comparison data!

Disease definitions and classification criteria in rheumatic diseases

Developed using observational study methodologies

Dynamic nature of rheumatic diseases

• 25% who initially met RA criteria still had evidence of RA 3-5 years later

O’Sullivan et al. Ann Intern Med 1972; 76: 573-7.
Mikkelsen et al. A&R 1969; 12: 87-91.
Lichtenstein et al. J Rheumatol 1991; 18: 989-93.
Icen et al. J Rheumatol 2008.

[Figure: cumulative incidence, % (y-axis, 0 to 100) by years since RA incidence (x-axis, 0 to 25), with curves labeled "2 or more", "3 or more", "4 or more" and "5 or more"]

Typical vs desired methodology for classification criteria

TYPICAL
• Patients with established disease vs. patients with other established rheumatic diseases
• Compare characteristics

DESIRED
• Patients with new-onset disease vs. patients with other new-onset rheumatic diseases
• Observe disease evolution in both groups
• Compare characteristics

Examining outcomes and risk factors in rheumatic diseases

Cohort Studies (outcomes)

Registries (outcomes)

Case-control studies (risk factors)

Types of Cohort Studies

• Designated by the timing of data collection in the investigator’s time:
– Prospective
– Retrospective (historical)
– Mixed
• Mayo studies: retrospective
• Registries: prospective

Types of Cohort Studies

[Diagram: timelines showing when the investigator begins the study relative to selection of the cohort]
• Prospective (concurrent) study
• Retrospective (non-concurrent) study
• Mixed (prospective + retrospective) study

All designs feasible either as ad hoc registries or in automated database studies.



Cohort study: design options

• Prospective vs. retrospective
• Entry into cohort: closed or open
• Timing of exposure: new users or not
• Source of unexposed cohort
– Internal
– External
• Drug-exposed subjects only, registries

Cohort Study: Steps

1. Cohort identification
• Define subjects & follow-up period

2. Risk factor/drug exposure measurement throughout follow-up

3. Outcome (disease) ascertainment

4. Confounder measurements (throughout follow-up)

5. Analysis

Step 1 - Cohort identification

• Trade-off between external and internal validity
• Retrospective vs. prospective
– Consider feasibility and costs
• Follow-up
– Tracking of drug changes over time
– Losses to follow-up, esp. if likely to be differential (different for drug users and non-users)

Step 2 – Risk factor/Drug exposure measurement

• New versus old users
– Ability to account for confounders before the drug was started
– Ability to quantify outcomes early after starting the drug (compliance, early drop-offs due to intolerance)
• Incomplete drug exposure
– E.g. one-time measurement of DMARD use and mortality
• Drug exposure metric
– Ever/never, dose (average, cumulative), duration
• Reference group
– Non-users, past users, users of other drugs

• Misclassification of episodic use

Step 2 - Timing: patterns of drug use

Antibiotic

NSAIDs

DMARDs

Step 2 - Drug exposure measurement methods

• Interviews
– Face-to-face, phone or self-administered
– Excellent to capture current use but not for past use or changing drug use over time
– Limited by recall: works best for cognitively intact subjects & regularly used drugs
• Biological testing
– Blood or urine
– Excellent to capture current use but not for past use
– Non-differential (unless disease affects the assay)
• Pharmacy or claims records
• Medical records

Step 2 - Pharmacy or claims records for drug exposure

• Drugs obtained by prescription

• Drug details available

• Accurate & complete for both past and current drug exposure

• Temporal tracking possible

• Limitation: compliance
– A filled prescription does not mean the drug was taken

• Validation studies are necessary

Step 2 - Misclassification of drug exposure

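[Diagram: one patient's drug use as seen by different data sources]
Events on the timeline:
• Free sample: 15 days
• Rx fill for 30 days; patient used for 40 days
• Refill for 30 days; used 20 days
• Discontinued
Views compared:
• MD prescription
• Pharmacy claims
• Prescription database (30 days, 30 days)
• Claims data + 15 days rule
• Truth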
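One way to see how a claims-based exposure window drifts from the truth is to build exposure episodes from pharmacy fills with a grace period. This is a sketch under stated assumptions: the calendar dates are hypothetical, the function name is illustrative, and the rule used here (bridge two fills when the gap between them is at most `grace_days`) is only one of several possible readings of a "+15 days rule".

```python
from datetime import date, timedelta


def exposure_episodes(fills: list[tuple[date, int]],
                      grace_days: int = 15) -> list[tuple[date, date]]:
    """Merge (fill_date, days_supply) records into continuous exposure episodes.

    A fill extends the current episode if it occurs no later than grace_days
    after the previous supply ran out; otherwise it starts a new episode.
    """
    episodes: list[tuple[date, date]] = []
    for fill_date, days_supply in sorted(fills):
        supply_end = fill_date + timedelta(days=days_supply)
        if episodes and fill_date <= episodes[-1][1] + timedelta(days=grace_days):
            start, previous_end = episodes[-1]
            episodes[-1] = (start, max(previous_end, supply_end))
        else:
            episodes.append((fill_date, supply_end))
    return episodes


# Hypothetical dates for the slide's scenario. The 15-day free sample starts on
# 1 Jan 2008 and never appears in claims.
fills = [
    (date(2008, 1, 16), 30),  # first fill: 30-day supply, actually used over 40 days
    (date(2008, 2, 25), 30),  # refill: 30-day supply, actually used for only 20 days
]
true_use = (date(2008, 1, 1), date(2008, 3, 16))

claims_view = exposure_episodes(fills, grace_days=15)
print("Claims + 15-day rule:", claims_view)
print("Truth:               ", true_use)
```

With these dates the claims view starts 15 days late (the free sample is invisible) and ends 10 days after the patient actually stopped. With `grace_days=0` the same fills would split into two episodes separated by an apparent gap, even though the patient was still taking the drug.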

Step 2 summary: Aspects of drug exposure measurement

• Completeness & accuracy

• Response rate

• Change over time

• Special populations

• Details of the drug

• Details of utilization

• Availability & cost (reimbursement)

• Differential or non-differential

Step 3 – Outcome ascertainment

• Low specificity – methods used to find outcomes incorrectly include subjects without the outcome
– Validation of outcomes in database studies
• Low sensitivity – incomplete (and potentially differential) identification of outcomes
– Increased diagnostic surveillance (e.g. NSAIDs and GI events)
– Under-diagnosed & untreated conditions

• Timing of disease onset (protopathic bias)
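The different consequences of low sensitivity and low specificity can be illustrated with a short calculation. It assumes non-differential misclassification (the same sensitivity and specificity in exposed and unexposed subjects), and the true risks are hypothetical.

```python
def observed_risk(true_risk: float, sensitivity: float, specificity: float) -> float:
    """Risk apparent in the data: true positives found plus false positives added."""
    return sensitivity * true_risk + (1 - specificity) * (1 - true_risk)


def observed_risk_ratio(risk_exposed: float, risk_unexposed: float,
                        sensitivity: float, specificity: float) -> float:
    """Risk ratio computed from misclassified outcomes."""
    return (observed_risk(risk_exposed, sensitivity, specificity)
            / observed_risk(risk_unexposed, sensitivity, specificity))


# Hypothetical true risks: 2% in exposed, 1% in unexposed (true risk ratio = 2).
TRUE_EXPOSED, TRUE_UNEXPOSED = 0.02, 0.01

rr_low_sensitivity = observed_risk_ratio(TRUE_EXPOSED, TRUE_UNEXPOSED,
                                         sensitivity=0.80, specificity=1.00)
rr_low_specificity = observed_risk_ratio(TRUE_EXPOSED, TRUE_UNEXPOSED,
                                         sensitivity=0.80, specificity=0.95)

print(f"Perfect specificity, 80% sensitivity: RR = {rr_low_sensitivity:.2f}")
print(f"95% specificity, 80% sensitivity:     RR = {rr_low_specificity:.2f}")
```

Under these assumptions, missing 20% of true outcomes leaves the risk ratio unchanged, while a 5% false-positive rate pulls it most of the way to 1. This is one reason validating outcomes in database studies matters, particularly for rare events.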

Step 3 challenges: Protopathic bias

[Diagram: timelines relative to study start]
Truth: nsNSAIDs → stomach pain → coxib → UGIB
Database: nsNSAIDs → coxib → UGIB

nsNSAID = non-specific NSAID

Step 3 – Outcome of interest in rheumatology

• Beneficial effects/effectiveness
– Disease progression
• Adverse effects
– Mortality
– Cardiovascular morbidity
– Infections
– Lymphomas & solid malignancies
– Autoimmunity
– GI events (NSAIDs)
– Pregnancy outcomes

Step 3 - Consistency in outcome definitions – infections in RA

Askling: Curr Opin Rheumatol 2008; 20(2): 138–144

Step 3 challenges: Differential misclassification of outcome

• Cohort study: may result from misclassifying outcome-free subjects as having the outcome (specificity) or from incomplete identification of persons with the outcome (sensitivity), when these errors differ between exposed and unexposed subjects
– Under-diagnosed conditions
– Example: patients with RA, especially those on biologics, see their doctors more often and are more likely to undergo laboratory tests or evaluation for CVD

• Using medication-taking as a surrogate of outcome can be problematic

Step 3 – Outcome ascertainment
Competing risk of death

Melton et al. Osteoporos Int. 2008 Sep 17.

Step 4 – Confounder measurements

What is a confounder?

• The clinical condition which determines drug selection (channeling) and is linked to the adverse event

– Indication

– Severity

– Contraindication

[Diagram: the confounder (INDICATION) is linked to both drug exposure and the adverse event]

Step 4 - Confounding by indication

• The indications for drug use, because of their natural association with prognosis, may confound the comparison so that it looks as if the treatment causes the disease

“You’d better avoid antihypertensive treatment because treated patients have higher stroke rates”

Step 4 - Confounding by disease severity

• The severity of RA is a confounder because:
– Associated with use of biologics
– Independent risk factor for CVD
– Not in causal pathway

[Diagram: rheumatoid arthritis (RA) severity (confounder) is linked to both biologics and CVD]

Step 4 - Confounding by contraindication

• MD’s perception of the patient’s tendency to develop peptic ulcer & bleeding is a confounder because:

– Associated with NSAID choice

– Independent risk factor for GI bleeding

– Not in causal pathway

[Diagram: MD perception of risk is linked to both the choice of Celebrex vs naproxen and GI bleeding]

Step 4 - Confounding by indication

• Prescription channeling

– New versus older products

– Example: Comparison of the risk of upper GI bleeding among coxibs versus traditional NSAIDs

• Coxibs preferentially prescribed to patients at high risk for upper GI bleeding

Moride et al. Arthritis Res Ther. 2005;7:R333-342.

Step 4 – Extent of confounding by indication

Schneeweiss. Clin Pharmacol Ther 2007: 82:143–156

[Figure: potential for confounding by indication plotted against intentionality of the treatment effect by the prescriber; examples shown: coxibs and CV events, coxibs and GI events]

Step 5 - Analysis

• Conventional methods to control for confounding
– Randomization (clinical trials)
– Restriction: homogeneous study population
– Matching: select controls comparable to cases with respect to confounders
– Stratified analysis
– Statistical modeling

• Sensitivity analyses

• Active-competing comparator designs

• Propensity scores

• Marginal structural models

• Instrumental variable analysis
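Of the conventional methods listed above, stratified analysis is the easiest to show in a few lines. The sketch below uses the Mantel-Haenszel summary risk ratio on hypothetical cohort counts, constructed so that disease severity drives both treatment choice and outcome while the drug itself has no effect within either stratum.

```python
from dataclasses import dataclass


@dataclass
class Stratum:
    exposed_events: int
    exposed_total: int
    unexposed_events: int
    unexposed_total: int


def crude_risk_ratio(strata: list[Stratum]) -> float:
    """Risk ratio after pooling all strata, ignoring the confounder."""
    risk_exposed = sum(s.exposed_events for s in strata) / sum(s.exposed_total for s in strata)
    risk_unexposed = sum(s.unexposed_events for s in strata) / sum(s.unexposed_total for s in strata)
    return risk_exposed / risk_unexposed


def mantel_haenszel_risk_ratio(strata: list[Stratum]) -> float:
    """Mantel-Haenszel summary risk ratio for stratified cohort data."""
    numerator = sum(s.exposed_events * s.unexposed_total / (s.exposed_total + s.unexposed_total)
                    for s in strata)
    denominator = sum(s.unexposed_events * s.exposed_total / (s.exposed_total + s.unexposed_total)
                      for s in strata)
    return numerator / denominator


# Hypothetical data: severe patients are channeled to the drug and have higher
# event risk regardless of treatment (10% vs 2%).
strata = [
    Stratum(exposed_events=80, exposed_total=800, unexposed_events=20, unexposed_total=200),  # severe
    Stratum(exposed_events=4, exposed_total=200, unexposed_events=16, unexposed_total=800),   # mild
]

crude_rr = crude_risk_ratio(strata)
adjusted_rr = mantel_haenszel_risk_ratio(strata)
print(f"Crude RR: {crude_rr:.2f}")
print(f"Severity-adjusted (Mantel-Haenszel) RR: {adjusted_rr:.2f}")
```

In this constructed example the crude comparison suggests the drug more than doubles the risk, while the stratified estimate shows no association. Stratification only helps when the confounder was measured, which is the motivation for the other approaches on the slide.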

Example: Sensitivity analysis

Setoguchi Am Heart J 2008;156:336-41

Example: Propensity score

Wiles Arthritis Rheum 2001;44:1033-42

Example: Marginal structural models to examine MTX and CV Death

Hazard ratio (95% CI)

                            Deaths    Unadjusted       Adjusted*
All-cause mortality         191       0.8 (0.6-1.0)    0.4 (0.2-0.8)
Cardiovascular mortality    84                         0.3 (0.2-0.7)
Non-CV mortality            107                        0.6 (0.2-1.2)

* Adjusted for: age, sex, RF, calendar year, duration of disease, smoking, education, HAQ score, patient global assessment, joint counts, ESR, prednisone status and number of other DMARDs used

Choi HK, et al. Lancet 2002;359:1173-7

Cohort studies example: Exposed cohort only

• Usually prospective
• Biologics registries by Pharma
– All patients getting one or more biologics
– Typically one-armed cohort: no comparator
• 9882 patients on anti-TNF observed for ~2-3 years
• 25 new-onset psoriasis (what does this mean?)*
– Total denominator known; ?total # effects
• Comparison data
– External and typically not the same sampling frame as patients on biologics

* Harrison et al. Ann Rheum Dis April 2008.

Cohort studies example: Exposed & comparison cohort

• NSAIDs and GI bleeding
– Cohort of patients taking NSAID of interest compared with one or more other NSAIDs
– Rate of GI bleeding during follow-up period compared
• Glucocorticoids and risk of CVD in RA patients
– Cohort of RA patients taking glucocorticoids: comparison of users vs non-users
– Rate of CVD during follow-up compared

Solomon et al. Arthritis Rheum 2006;54:1378-89
Davis et al. Arthritis Rheum 2007;56:820-830

Cohort studies example: Analysis within existing cohort

• Identify general population cohort study where extensive longitudinal data are available
– Nurses' Health Study, Framingham Study, Physicians' Health Study, National Databank of Rheum Diseases
• REP (Rochester Epidemiology Project): cohort is the Olmsted County population
• Advantages: if the data are already collected, only analysis is needed
• Disadvantages: biases, confounding relative to nature of population + lack of key covariates

Cohort studies example: Database cohort study

• Most common form in pharmacoepidemiology
• Usually retrospective, but can be mixed
• Many large multi-purpose databases are used
– HMO, Managed Care (Puget Sound, United Health Care)
– Electronic medical records (GPRD, MediPlus)
– Provincial health plans (Saskatchewan)
• Advantages: large, data already exist, complete for billable services
• Disadvantages: claims ≠ diagnoses

Summary: Cohort studies
There is a difference between relative and absolute risk

[Figure: two panels of incidence rate (y-axis) by age (x-axis, 30 to 80), each comparing exposed (e.g. RA) with unexposed (e.g. non-RA)]
• Rate difference is constant but rate ratio decreasing
• Rate difference increasing but rate ratio constant
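The two patterns in this figure can be reproduced with made-up numbers. The rates below are hypothetical (events per 1,000 person-years in four successive age bands) and serve only to show how the absolute and relative measures can move in different directions.

```python
def difference_and_ratio(exposed: list[int], unexposed: list[int]) -> list[tuple[int, float]]:
    """Rate difference and rate ratio for each age band."""
    return [(e - u, e / u) for e, u in zip(exposed, unexposed)]


# Hypothetical background rates per 1,000 person-years, rising with age.
unexposed_rates = [1, 2, 4, 8]

# Pattern 1: exposure adds a fixed 2 events per 1,000 person-years at every age.
constant_difference = difference_and_ratio([3, 4, 6, 10], unexposed_rates)

# Pattern 2: exposure doubles the rate at every age.
constant_ratio = difference_and_ratio([2, 4, 8, 16], unexposed_rates)

for label, rows in [("Constant difference", constant_difference),
                    ("Constant ratio", constant_ratio)]:
    print(label)
    for diff, ratio in rows:
        print(f"  rate difference = {diff} per 1,000 py, rate ratio = {ratio:.2f}")
```

In the first pattern the rate ratio falls from 3.0 to 1.25 although the excess burden is identical at every age. In the second, a "constant" doubling corresponds to an eightfold larger absolute excess in the oldest band than in the youngest.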

Summary: Keep in mind the major differences among registries!

Name | Year since | Source of data | Inclusion criteria | Comparison group | Current size | Follow-up intervals
NDB (US) | 1998 | Selected centers | Rheum diseases | DMARD users | ~20,000 RA patients | Semi-annual
CORRONA (US) | 2002 | Selected centers | New starts | Other DMARDs | ~10,000 RA patients | Regular?
VARA (US) | ?? | Veterans | RA patients | Not explicit | unknown | unknown
BSRBR (UK) | 2001 | Selected centers | New users; 4,000 per drug | Collected at defined sites | Biologics: >14,000; controls: >3,000 | Baseline & regular up to 60 months
RABBIT (Germany) | 2001 | Selected centers | New users; 1,000 per drug | Internal: DMARD failures | Biologics: >3,500; controls: 1,800 | Baseline & regular up to 120 months
ARTIS (Sweden) | | Regional registers | New users | National register data | 15,000 treatments | Baseline & regular
BIOBADASER (Spain) | 2000 | Selected centers | New users | EMECAR cohort | >8,000 patients | Registration at inception of adverse event
DANBIO (Denmark) | 2000 | Selected centers | New users | None | >3,500 RA patients | No defined follow-up
NOR-DMARD (Norway) | 2000 | Selected centers | New users of DMARD or biologic | DMARDs | >2000 RA/AS/PsA; >3000 DMARDs | Baseline & regular
DREAM (45) | 2003 | Selected centers | New users | Early RA cohort | >1,000 patients | Baseline and regular
LORHEN (Italy) | 1999 | Regional | New users | None | >1,000 RA patients | Baseline and irregular intervals
Swiss SCQM | 1996 | Not a biologic register | New users | DMARD patients | >2,000 patients | Annually

Tips when interpreting studies

Consider these before you believe the results!

• If negative study

– Power

– Outcome & exposure definition

– Comparison group

– Non-differential misclassification

– Replication

Consider these before you believe the results!

• If positive study

– Confounding

– Channeling

– Differential misclassification

– Generalizability

– Implications

– Replication
