Receiver Operating Characteristic Curve (ROC) Analysis for Prediction Studies
Ruth O’Hara, Helena Kraemer, Jerome Yesavage, Jean Thompson, Art Noda, Joy Taylor, Jared Tinklenberg
Stanford University, Department of Psychiatry and Behavioral Sciences
Stanford University School of Medicine
Sierra Pacific MIRECC
Veterans Affairs Palo Alto Health Care System
The Clinical Need for Signal Detection Procedures
• Clinical practice is often “hit or miss” therapy
• Try one thing; if that does not work, try another
• This is frustrating for the patient and expensive
• The goal: find the best treatment for the patient with specific characteristics
• New news in psychiatry; old hat in internal medicine
Receiver Operating Characteristic Curve (ROC) Analysis
• Signal detection technique
• Traditionally used to evaluate diagnostic tests
• Now employed to identify subgroups of a population at differential risk for a specific outcome (clinical decline, treatment response)
• Identifies moderators
Receiver Operating Characteristic Curve (ROC) Analysis
Historical Development
ROC Analysis: Historical Development (1)
• Derived from early radar in the WW2 Battle of Britain to address the question: how can signals on the radar scan be accurately identified to predict the outcome of interest (enemy planes) when there are many extraneous signals (e.g. geese)?
ROC Analysis: Historical Development (2)
• True Positives = Radar operator interpreted signal as enemy planes and there were enemy planes (Good result: no wasted resources)
• True Negatives = Radar operator said no planes and there were none (Good result: no wasted resources)
• False Positives = Radar operator said planes, but there were none (Geese: wasted resources)
• False Negatives = Radar operator said no planes, but there were planes (Bombs dropped: very bad outcome)
ROC Analysis: Historical Development
• Sensitivity = Probability of correctly interpreting the radar signal as enemy planes among those times when enemy planes were actually coming
  • SE = True Positives / (True Positives + False Negatives)
• Specificity = Probability of correctly interpreting the radar signal as no enemy planes among those times when no enemy planes were actually coming
  • SP = True Negatives / (True Negatives + False Positives)
ROC: Prediction of Enemy Planes by RAF Radar Operators
                   Operator said Planes   Operator said No Planes   Total
Enemy Planes       473                    81                        554
No Enemy Planes    22                     44                        66
Total              495                    125                       N = 620
SE = 473/554 = .854
SP = 44/66 = .667
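As a minimal sketch of the arithmetic in the radar table above, the Python snippet below computes sensitivity and specificity from the four cell counts. The function name `se_sp` and its argument names are illustrative and do not come from the slides.

```python
def se_sp(tp, fn, fp, tn):
    """Return (sensitivity, specificity) from the four cell counts of a 2x2 table."""
    se = tp / (tp + fn)   # correct "planes" calls among times planes were coming
    sp = tn / (tn + fp)   # correct "no planes" calls among times no planes were coming
    return se, sp


# Counts from the radar operator table: 473 TP, 81 FN, 22 FP, 44 TN.
se, sp = se_sp(tp=473, fn=81, fp=22, tn=44)
print(f"SE = {se:.3f}, SP = {sp:.3f}")  # SE = 0.854, SP = 0.667
```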
Receiver Operating Characteristic Curve (ROC) Analysis Applications:
Evaluating Medical Tests
ROC Analysis: Evaluating Medical Tests
The evaluation of the ability of a diagnostic test to identify a disease involves considering:
• P = Prevalence = occurrence in the population of the outcome of interest (e.g. disease)
• True Positives
• True Negatives
• False Positives
• False Negatives
• P = Prevalence = True Positives + False Negatives
ROC Analysis: Medical Test Evaluation
• True Positives = Test states you have the disease when you do have the disease
• True Negatives = Test states you do not have the disease when you do not have the disease
• False Positives = Test states you have the disease when you do not have the disease
• False Negatives = Test states you do not have the disease when you do
ROC Analysis: Evaluating Medical Tests
• Sensitivity = The probability of having a positive test result among those with a positive diagnosis for the disease
  • SE = True Positives / (True Positives + False Negatives)
• Specificity = The probability of having a negative test result among those with a negative diagnosis for the disease
  • SP = True Negatives / (True Negatives + False Positives)
The Basic Tool: 2X2
       Test+       Test-
O+     TP (a)      FN (b)       P = a + b
O-     FP (c)      TN (d)       P' = 1 - P
       Q = a + c   Q' = 1 - Q
Sensitivity (SE) = a/P
Specificity (SP) = d/P'
ROC: GDS (Test) for Diagnosis of Clinically Confirmed Depression
                                     Depression on GDS   Not Depressed on GDS   Total
Clinically Confirmed Depressed       473                 81                     554
Clinically Confirmed Not Depressed   22                  44                     66
Total                                495                 125                    N = 620
SE = 473/554 = .854
SP = 44/66 = .667
Which Test Do You Use: Medical Tests Evaluation
• GDS: SE = .80; SP = .85
• Beck Depression Inventory: SE = .85; SP = .75
• Major Depression Inventory: SE = .66; SP = .63
ROC Analysis
• ROC first calculates sensitivity and specificity
• Quality indices measure the quality of the sensitivity and specificity
• ROC computes the quality indices for each predictor to find the ones with optimal sensitivity and specificity
To Detect the Optimal Sensitivity and Specificity
The optimum depends on the relative CLINICAL importance of false negatives versus false positives.
• w = 1 means only false negatives matter.
• w = 0 means only false positives matter.
• w = 1/2 means both matter equally.
Analytically: Use weighted kappa.
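The slides name weighted kappa without giving its general form. The sketch below assumes Kraemer's formulation, in which κ(1,0) = (SE - Q)/Q' and κ(0,0) = (SP - Q')/Q are the quality indices of sensitivity and specificity, and κ(w,0) is their average weighted by w·P·Q' and (1 - w)·P'·Q. These formulas and the function name `weighted_kappa` are assumptions brought in for illustration, not taken from the slides. The counts are those of the GDS table shown earlier.

```python
def weighted_kappa(tp, fn, fp, tn, w):
    """Weighted kappa kappa(w, 0) for a 2x2 table of counts (assumed Kraemer form).

    w = 1: only false negatives matter; w = 0: only false positives matter.
    """
    n = tp + fn + fp + tn
    p = (tp + fn) / n          # P: prevalence of the outcome
    q = (tp + fp) / n          # Q: proportion testing positive
    se = tp / (tp + fn)
    sp = tn / (tn + fp)
    k1 = (se - q) / (1 - q)    # kappa(1, 0): quality of the sensitivity
    k0 = (sp - (1 - q)) / q    # kappa(0, 0): quality of the specificity
    a = w * p * (1 - q)
    b = (1 - w) * (1 - p) * q
    return (a * k1 + b * k0) / (a + b)


for w in (0.0, 0.5, 1.0):
    print(w, round(weighted_kappa(473, 81, 22, 44, w), 3))
```

With these counts the index is higher at w = 0 than at w = 1, so under the assumed formulas this test looks better when false positives are the main concern.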
ROC Analysis
P = TP + FN
P' = 1 - (TP + FN)
Q = TP + FP
Q' = 1 - (TP + FP)
EFF = TP + TN
$$\kappa(0.5, 0) = \frac{(TP + TN) - (TP + FN)(TP + FP) - (1 - (TP + FN))(1 - (TP + FP))}{1 - (TP + FN)(TP + FP) - (1 - (TP + FN))(1 - (TP + FP))}$$
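A minimal sketch of the κ(0.5, 0) formula above, assuming TP, FN, FP and TN are entered as proportions of the total sample (so they sum to 1), which is what P' = 1 - P implies. The function name `kappa_half` is illustrative.

```python
def kappa_half(tp, fn, fp, tn):
    """kappa(0.5, 0) from cell proportions, following the formula on the slide."""
    p = tp + fn                           # P
    q = tp + fp                           # Q
    eff = tp + tn                         # EFF: proportion classified correctly
    chance = p * q + (1 - p) * (1 - q)    # PQ + P'Q': agreement expected by chance
    return (eff - chance) / (1 - chance)


# Cell counts from the GDS table, converted to proportions of N = 620.
n = 620
print(round(kappa_half(473 / n, 81 / n, 22 / n, 44 / n), 3))  # 0.373
```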
ROC Plane and “Curve”
[Figure: ROC plane with Se on the vertical axis and 1-Sp on the horizontal axis, both from 0 to 1. Labeled elements: (P,P), (Q,Q), Random ROC, Ideal Point, ROC “curve”.]
Receiver Operating Characteristic Curve (ROC) Analysis Applications
Identifying Predictors of Clinical Outcome
ROC Analysis: Prediction Studies (Dr. Kraemer)
• ROC can identify predictors/characteristics of patients who are at differential risk for a specific outcome of interest, e.g.:
  • What are the characteristics of AD patients who are at risk for rapid decline and are high priority for treatment?
  • What are the clinical predictors of Alzheimer Disease patients who are “good responders” (or “poor responders”) to cholinesterase inhibitor treatments?
• Useful in “real world” clinical medicine where multiple variables affect the clinical outcome and patients seldom have one pure diagnosis
ROC: Identifying Predictors of an Outcome
1. ROC relates a predictor (test) to the clinical outcome of interest (Diagnosis/Gold Standard)
2. ROC searches all predictors and their associated cut-points
3. ROC determines which predictor and associated cut-point yields the optimal sensitivity and specificity for identifying the outcome of interest, yielding two groups at differential risk for the outcome
ROC: Identifying Predictors of an Outcome
4. ROC is an iterative process that is then rerun automatically for each group yielded in Step 3, in order to examine which predictor and associated cut-point may further divide the groups
5. ROC will keep searching within each group yielded until one of three stopping rules applies (see Stopping rule slide)
6. ROC thus identifies subgroups of individuals that are at increased risk for the outcome of interest
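Steps 1 to 6 can be sketched as a recursive search. The code below is a simplified illustration, not the authors' ROC program: it assumes outcomes coded 0/1, numeric predictors split as `value <= cut` versus `value > cut`, κ(0.5, 0) as the quality index, and a single minimum-group-size rule (`min_n`, a hypothetical parameter) standing in for the three stopping rules. All function names and the toy data are made up for illustration.

```python
def kappa_half(tp, fn, fp, tn):
    """kappa(0.5, 0) from 2x2 counts, computed in integers; 0.0 if undefined."""
    n = tp + fn + fp + tn
    chance = (tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)  # n^2 * (PQ + P'Q')
    den = n * n - chance
    return (n * (tp + tn) - chance) / den if den else 0.0


def best_split(rows, outcome, predictors, min_n):
    """Search every predictor and cut-point; return (kappa, predictor, cut)."""
    best = (0.0, None, None)
    for name in predictors:
        for cut in sorted({r[name] for r in rows})[:-1]:
            hi = [r[outcome] for r in rows if r[name] > cut]
            lo = [r[outcome] for r in rows if r[name] <= cut]
            if min(len(hi), len(lo)) < min_n:
                continue  # simplified stand-in for the sample-size stopping rule
            tp, fn = sum(hi), sum(lo)
            fp, tn = len(hi) - tp, len(lo) - fn
            # Score both directions, since low values may mark the higher risk.
            k = max(kappa_half(tp, fn, fp, tn), kappa_half(fn, tp, tn, fp))
            if k > best[0]:
                best = (k, name, cut)
    return best


def grow(rows, outcome, predictors, min_n=20, depth=0):
    """Recursively split rows into subgroups; return indented report lines."""
    risk = sum(r[outcome] for r in rows) / len(rows)
    lines = [f"{'  ' * depth}n={len(rows)}, risk={risk:.2f}"]
    _, name, cut = best_split(rows, outcome, predictors, min_n)
    if name is None:  # no admissible split improves on chance: stop here
        return lines
    for label, keep in ((f"{name} <= {cut}", lambda v: v <= cut),
                        (f"{name} > {cut}", lambda v: v > cut)):
        group = [r for r in rows if keep(r[name])]
        lines.append(f"{'  ' * (depth + 1)}{label}")
        lines += grow(group, outcome, predictors, min_n, depth + 2)
    return lines


# Toy data (made up): the outcome occurs exactly when age > 70.
rows = [{"age": a, "sex": s, "y": int(a > 70)}
        for a in range(60, 86) for s in (0, 1)]
print("\n".join(grow(rows, "y", ["age", "sex"], min_n=5)))
```

Because this sketch has no significance test, it will keep splitting on chance differences in real data; a stricter stopping rule would be needed in practice.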
ROC Analysis: Advantages and Disadvantages
• No assumptions of normal distribution
• Multiple predictors can be evaluated simultaneously
• Indicates interactions among predictors
• Indicates cut-points on these predictors
• Yields clinically relevant information
• Non-hypothesis testing
• Requires large samples
• Capitalizes on chance: needs stringent stopping rule
ROC Analysis: Procedure
• Start with large sample size
• Define the outcome of interest (always binary)
• Choose Success/Failure criteria
• Select predictor variables of interest (as many as you like)
• Run ROC program that systematically finds best predictors for Success/Failure
The Basic Tool: 2X2
       RF+         RF-
O+     TP (a)      FN (b)       P = a + b
O-     FP (c)      TN (d)       P' = 1 - P
       Q = a + c   Q' = 1 - Q
Sensitivity (SE) = a/P
Specificity (SP) = d/P'
ROC: Identifying Predictors & Their Cut-points
Dichotomous variables such as Gender:
• ROC calculates the Se and Sp for Female vs. Male
For continuous variables such as Age:
• ROC would calculate Se and Sp for the cut-point of 60 vs. 61+62+63 ….85; then could calculate for cut-point of 60+61 vs. 62+63+64 ….85, and so forth.
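The cut-point search for a continuous variable can be sketched as follows. This is a minimal illustration with made-up ages and outcomes; it treats "older than the cut-point" as the positive test, which is an assumption, and the function name `cutpoint_table` is hypothetical.

```python
def cutpoint_table(values, outcomes):
    """Return (cut, Se, Sp) for every cut-point, with value > cut as test-positive."""
    n_pos = sum(outcomes)
    n_neg = len(outcomes) - n_pos
    table = []
    for cut in sorted(set(values))[:-1]:  # the maximum would leave nobody test-positive
        tp = sum(1 for v, o in zip(values, outcomes) if v > cut and o == 1)
        tn = sum(1 for v, o in zip(values, outcomes) if v <= cut and o == 0)
        table.append((cut, tp / n_pos, tn / n_neg))
    return table


# Made-up example: ages and a binary outcome (1 = outcome present).
ages = [60, 61, 62, 63, 64, 65]
outcomes = [0, 0, 1, 0, 1, 1]
for cut, se, sp in cutpoint_table(ages, outcomes):
    print(f"age <= {cut} vs. > {cut}: Se = {se:.2f}, Sp = {sp:.2f}")
```

Each (Se, Sp) pair produced this way corresponds to one point in the ROC plane.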
ROC: Gender as Predictor of Clinically Confirmed Depression
                                     Male   Female   Total
Clinically Confirmed Depressed       473    81       554
Clinically Confirmed Not Depressed   22     44       66
Total                                495    125      N = 620
SE = 473/554 = .854
SP = 44/66 = .667
ROC: Identifying Predictors & Their Cut-points
Dichotomous variables:
• ROC calculates the Se and Sp for Female vs. Male, Aphasia vs. No Aphasia, etc.
For continuous variables such as Age:
• ROC would calculate Se and Sp for the cut-point of 60 vs. 61+62+63 ….85; then could calculate for cut-point of 60+61 vs. 62+63+64 ….85, and so forth.
ROC: Age as Predictor of Clinically Confirmed Depression
                                     Age > 72   Age < 72   Total
Clinically Confirmed Depressed       473        81         554
Clinically Confirmed Not Depressed   22         44         66
Total                                495        125        N = 620
SE = 473/554 = .854
SP = 44/66 = .667
ROC: Age as Predictor of Clinically Confirmed Depression
                                     Age > 73   Age < 73   Total
Clinically Confirmed Depressed       473        81         554
Clinically Confirmed Not Depressed   22         44         66
Total                                495        125        N = 620
SE = 473/554 = .854
SP = 44/66 = .667
Receiver Operating Characteristic Curve (ROC) Analysis
Conducting the ROC: An Example
ROC Analysis: Procedure
• Start with large sample size
• Define the outcome of interest
• Choose Success/Failure criteria
• Identify predictor variables of interest
• Run ROC program that systematically finds best predictors for Success/Failure
ROC Analysis: Example
• Population under investigation: 1,472 AD patients from 10 Centers with a 12 month follow-up
• Clinically significant outcome: More rapid decline as defined by a loss of 3 or more MMSE points per year, post-visit
O'Hara R et al. (2002). Which Alzheimer patients are at risk for rapid cognitive decline? J Geriatr Psychiatry Neurol;15(4):233-8.
Predictor Variables
• Age at patient visit
• Reported age of symptom onset
• Gender
• Years of education
• Ethnicity
• MMSE score
• Living arrangement
• Presence of aphasia
• Presence of hallucinations
• Presence of extrapyramidal signs
Figure 1: ROC derivation of subgroups at differential risk for rapid decline
N = 1472 (Patient Visits), 46.1% Rapid Decliners (RD)
Split on Aphasia:
  Aphasia: None, Questionable, Mild: n = 943, 40.3% (380) RD
  Aphasia: Moderate, Severe: n = 529, 56.5% (299) RD
    Split on Visit Age:
      Visit Age > 75.00: n = 221, 46.6% (103) RD
      Visit Age <= 75.00: n = 308, 63.6% (196) RD
        Split on MMSE:
          MMSE <= 7: n = 49, 40.8% (20) RD
          MMSE > 7: n = 259, 68.0% (176) RD
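Read as a decision rule, Figure 1 assigns each patient visit to one of four subgroups. The sketch below encodes that rule; the function name, argument names, and the string labels for aphasia severity are illustrative, while the cut-points and rapid-decline percentages are those shown in the figure.

```python
def rapid_decline_subgroup(aphasia, visit_age, mmse):
    """Return (subgroup description, percent rapid decliners) as read from Figure 1."""
    if aphasia in ("none", "questionable", "mild"):
        return "Aphasia none/questionable/mild", 40.3
    if aphasia not in ("moderate", "severe"):
        raise ValueError(f"unknown aphasia level: {aphasia!r}")
    if visit_age > 75:
        return "Aphasia moderate/severe, visit age > 75", 46.6
    if mmse <= 7:
        return "Aphasia moderate/severe, visit age <= 75, MMSE <= 7", 40.8
    return "Aphasia moderate/severe, visit age <= 75, MMSE > 7", 68.0


print(rapid_decline_subgroup("severe", 70, 15))
```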
Stopping Rules
• No more possibilities (rare!)
• Inadequate sample size
• Optimal test (if ‘a priori’) would not have been statistically significant (p<.001)
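The second and third stopping rules can be sketched as a check on a candidate split's 2x2 table. The slides do not say which significance test the ROC program uses or what counts as an inadequate sample; the code below assumes a Pearson chi-square test with 1 degree of freedom and a hypothetical minimum group size `min_n`.

```python
import math


def should_stop(tp, fn, fp, tn, alpha=0.001, min_n=10):
    """True if a candidate split fails the sample-size or significance rule."""
    rows = (tp + fn, fp + tn)   # outcome present / absent
    cols = (tp + fp, fn + tn)   # predictor positive / negative
    if min(cols) < min_n or 0 in rows:
        return True             # inadequate sample size (assumed threshold)
    n = tp + fn + fp + tn
    chi2 = n * (tp * tn - fn * fp) ** 2 / (rows[0] * rows[1] * cols[0] * cols[1])
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square upper tail, 1 df
    return p_value >= alpha


# Aphasia split from Figure 1: 299 of 529 vs. 380 of 943 rapid decliners.
print(should_stop(tp=299, fn=380, fp=230, tn=563))  # False: the split is kept
```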
Figure 10.3: ROC Decision Tree for IHDP Control group with outcome of low IQ at age 3 (w = 0.5)
[Figure: decision tree. Node sizes and outcome proportions, nested so that child N sums to parent N:]
N=512 (100%), P=.53
  N=321 (63%), P=.70
    N=211 (41%), P=.81
      N=131 (26%), P=.91
      N=80 (16%), P=.65
    N=110 (21%), P=.48
  N=191 (37%), P=.25
    N=87 (17%), P=.45
      N=57 (11%), P=.30
      N=30 (6%), P=.73
    N=104 (20%), P=.09
      N=43 (8%), P=.19
      N=61 (12%), P=.02
Split labels shown in the figure: Minority vs. Non-minority; Bayley Mental Dev. Index < 115 vs. ≥ 115; Mother never attended college vs. Mother attended college; Bayley Mental Dev. Index < 106 vs. ≥ 106 (twice); Graduated from college vs. Attended, did not graduate.
ROC Plane and “Swarm” of Points
[Figure: ROC plane with SE on the vertical axis and 1-SP on the horizontal axis, both from 0 to 1, showing a “swarm” of points for the predictors NHI, MOMED, BW, BLACK, HISP, MINOR, MDI12 and PDI12, with the ROC “curve” marked.]
To Detect the Optimal Sensitivity and Specificity
The optimum depends on the relative CLINICAL importance of false negatives versus false positives.
• w = 1 means only false negatives matter.
• w = 0 means only false positives matter.
• w = 1/2 means both matter equally.
Analytically: Use weighted kappa.
Geometrically: Draw a line through the Ideal Point with slope determined by P and w. Push this line down until it just touches the ROC “curve”. That point is optimal.
ROC Analysis: Conclusion
• Yields clinically relevant information
• Identifies complex interactions
• Identifies individuals with different characteristics but at the same risk for the clinically relevant outcome
• Identifies individuals at the least risk
• Can take differential clinical costs of false positives and false negatives into account
Conclusion
• It is not sufficient to identify risk factors, or even to identify moderators and mediators or a structural model.
• It is necessary to present and interpret the results so that clinicians, policy makers, consumers, and other researchers can apply them.
• ROC trees are one method to accomplish this purpose.