2005-4143s1_06_gould-merck
8/7/2019 2005-4143S1_06_Gould-Merck
http://slidepdf.com/reader/full/2005-4143s106gould-merck 1/16
Issues in the Practical Application of Data Mining Techniques to Pharmacovigilance
A. Lawrence Gould
Merck Research Laboratories
May 18, 2005
18 May 2005 1
Spontaneous AE Reports
Clinical trial safety information is incomplete
° Few patients -- rare events likely to be missed
° Not necessarily real world
Need info from post-marketing surveillance & spontaneous reports: Pharmacovigilance
Carried out by skilled clinicians & epidemiologists
Long history of research on issue, e.g.
° Finney (1974, 1982), Royall (1971)
° Inman (1970), Napke (1970)
Signal Generation: The Traditional Method
[Flowchart; box labels in extraction order: Single suspicious case or cluster; Potential Signals; Identify Potential Signals; Statistical Output; Consult Programmer; Consult Marketing; Patient Exposure; Integrate Information; Refined Signal(s); Background Incidence; Consult Literature; Consult Database; Comparative Data; Consultation; Action]
Some Limitations of Traditional Approach
Incomplete reports of events, not reactions
How to compute effect magnitude
Many events reported, many drugs reported
Bias & noise in system
Difficult to estimate incidence because no. of patients at risk, patient-years of exposure seldom reliable
Inappropriate to consider incidence using only spontaneous reports
The Pharmacovigilance Process
[Flowchart; box labels in extraction order: Detect Signals (Traditional Methods, Data Mining); Generate Hypotheses; Refute/Verify; Type A (Mechanism-based); Type B (Idiosyncratic); Insight from Outliers; Estimate Incidence; Public Health Impact, Benefit/Risk; Act; Inform; Change Label; Restrict use/withdraw]
Major Uses of Data Mining
Identify subtle associations that might exist in large databases
Early identification of potential toxicities
Identify complex relationships not apparent by simple summarization
Screening tool to identify potential associations to undergo clinical/epidemiological follow-up
More to Pharmacovigilance than Data Mining
Data mining is a refinement to discover subtleties
Still need initial case review
Respond to reports involving severe, potentially life-threatening events, e.g., Stevens-Johnson syndrome, agranulocytosis, anaphylactic shock
Clinical/biological/epidemiological verification of apparent associations is essential
Need to think about most effective use of data mining in routine pharmacovigilance practice
Statistical Methodology (1)
Not the key issue
Most use variations of 2-way table statistics

No. Reports    Target AE    Other AE    Total
Target Drug    a            b           nTD
Other Drug     c            d           nOD
Total          nTA          nOA         n

Some possibilities
Reporting Ratio: E(a) = nTD × nTA / n
Proportional Reporting Ratio: E(a) = nTD × c / nOD
Odds Ratio: E(a) = b × c / d

Basic idea: Flag when R = a/E(a) is "large"
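The three choices of E(a) above can be sketched in a few lines of Python. This is a minimal illustration, not a validated pharmacovigilance tool: the class and function names are made up for this example, the counts are hypothetical, and the code assumes the cells c and d are nonzero.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReportCounts:
    """2-way table of report counts for one drug/event pair (slide notation)."""
    a: int  # target drug, target AE
    b: int  # target drug, other AE
    c: int  # other drugs, target AE
    d: int  # other drugs, other AE


def expected_counts(t: ReportCounts) -> dict[str, float]:
    """Return E(a) under each of the three conventions listed on the slide."""
    n_td = t.a + t.b   # all reports for the target drug
    n_od = t.c + t.d   # all reports for other drugs
    n_ta = t.a + t.c   # all reports of the target AE
    n = n_td + n_od    # all reports
    return {
        "reporting_ratio": n_td * n_ta / n,
        "prr": n_td * t.c / n_od,
        "odds_ratio": t.b * t.c / t.d,
    }


def ratios(t: ReportCounts) -> dict[str, float]:
    """Return R = a / E(a) for each convention (assumes c > 0 and d > 0)."""
    return {name: t.a / e for name, e in expected_counts(t).items()}


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
counts = ReportCounts(a=20, b=80, c=100, d=800)
for name, r in ratios(counts).items():
    print(f"{name}: R = {r:.2f}")
# reporting_ratio: R = 1.67
# prr: R = 1.80
# odds_ratio: R = 2.00
```

What counts as a "large" R is left open here, as it is on the slide; any flagging threshold would be a further assumption.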
Statistical Methodology (2)
Estimate variability in various ways (e.g., usual chi-square statistic, Bayesian & Empirical Bayesian models)
Similar results for all methods if more than a few drug/event combinations reported (e.g., 10)
No non-clinical gold standard -- can't assess diagnostic utility of any method in usual sense
OR > PRR > RR when a > E(a); doesn't mean OR identifies real associations better than RR
RR probably most stable
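A small numerical check of the ordering OR > PRR > RR, together with one way of attaching a chi-square statistic to the same table. The counts are hypothetical, the function names are invented for this sketch, and the use of the Pearson statistic without continuity correction is an assumption; the slides do not say which form of the "usual" chi-square is meant.

```python
def r_values(a: int, b: int, c: int, d: int) -> tuple[float, float, float]:
    """Return R = a / E(a) for the reporting ratio, PRR and odds ratio."""
    n = a + b + c + d
    rr = a * n / ((a + b) * (a + c))    # E(a) = nTD * nTA / n
    prr = a * (c + d) / ((a + b) * c)   # E(a) = nTD * c / nOD
    odds = a * d / (b * c)              # E(a) = b * c / d
    return rr, prr, odds


def pearson_chi_square(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square for a 2x2 table (1 df, no continuity correction)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Hypothetical counts in which the target AE is over-reported for the target drug.
a, b, c, d = 20, 80, 100, 800
rr, prr, odds = r_values(a, b, c, d)

print(odds > prr > rr > 1)                        # True
print(round(pearson_chi_square(a, b, c, d), 2))   # 6.73
```

The ordering reflects only how each E(a) is defined; as the slide notes, a larger R from the odds ratio is not evidence that it detects real associations better.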
Spontaneous Report Database Limitations
Significant under-reporting (esp. OTC) -- depending on seriousness or novelty of event, newness of drug, intensity of monitoring
Different regulatory reporting requirements
Reflects only reporting practice, not incidence
Synonyms for drugs & events -- sensitivity loss
Much duplication of reports
Exposure rate unknown
For any given report, there is no certainty that a suspected drug caused the reaction
A Major Limitation (Often Ignored)
Accumulated reports cannot be used to calculate incidence or to estimate drug risk. Comparisons between drugs cannot be made from these data
Unfortunately, this still is done -- disclaimers do not balance the effect of the misrepresentation
Easy to show differences with data mining techniques, but impossible to make valid inferences about causality, and the differences may mislead
Implementation Issues
Portfolio bias in company databases can lead to inaccurate estimates of relative reporting rates
Does public health benefit justify cost of following up signals detected by routine data mining methods?
Variation in tools and databases among regulators could lead to significant cost without public health benefit
Do frequency-based signal detection methods useful to regulators have business value in industry settings?
Need examples of situations where computerized approach failed to identify important issues and where signals were created by publicity or reporting artifacts
Mining is Easy, Refining Low-grade Ore is Hard
What is the data mining activity intended to accomplish -- what are the clinical/epidemiological/regulatory questions that need to be answered?
Need to address the impact of various factors, e.g., evolution of apparent association over time, association with key demographic factors such as age, sex, disease classification
More Issues
Composition of database may be important: important associations of a new drug could be cloaked by events associated with an old drug with similar mechanism of action
Individual company databases tend to have comprehensive information about company products, but not general spectrum of drugs/vaccines
Databases contain reports mentioning drugs, not demonstrations of causality
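The cloaking effect can be illustrated with the reporting ratio R = a/E(a), where E(a) = nTD × nTA / n. The sketch below compares R for a new drug when an old drug with many reports of the same event is, or is not, part of the comparison background. All counts and names are hypothetical and chosen only to make the effect visible; they are not drawn from any real database.

```python
# Hypothetical report counts as (target AE, other AE) pairs.
NEW_DRUG = (20, 80)
OLD_DRUG = (2000, 3000)      # same mechanism, many reports of the target AE
ALL_OTHER_DRUGS = (100, 4900)


def reporting_ratio(target: tuple[int, int],
                    background: list[tuple[int, int]]) -> float:
    """R = a / E(a), with E(a) = nTD * nTA / n, for the given background."""
    a, b = target
    c = sum(ae for ae, _ in background)        # target AE, other drugs
    d = sum(other for _, other in background)  # other AE, other drugs
    n_td, n_ta, n = a + b, a + c, a + b + c + d
    return a / (n_td * n_ta / n)


with_old = reporting_ratio(NEW_DRUG, [OLD_DRUG, ALL_OTHER_DRUGS])
without_old = reporting_ratio(NEW_DRUG, [ALL_OTHER_DRUGS])

print(f"old drug in background:   R = {with_old:.2f}")     # R = 0.95
print(f"old drug excluded:        R = {without_old:.2f}")  # R = 8.50
```

With these made-up counts the same 20 reports look unremarkable against one background and strongly disproportionate against the other, which is why the composition of the database matters when interpreting a ratio.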
Discussion
Most apparent associations represent known problems
Some reflect disease or patient population
~25% may represent signals about previously unknown associations
Statistical involvement in implementation & interpretation is important
The actual false positive rate is unknown, as are the legal and resource implications
What Next?
PhRMA/FDA working group is considering ways to address issues -- white paper will be published
May be worthwhile to construct & maintain a cleaned-up canonical database from AERS to provide a common resource for checking data mining findings based on individual company proprietary databases