Hani Tamim, PhD Associate Professor Department of Internal Medicine American University of Beirut


A receiver operating characteristic (ROC) curve is a graphical plot that illustrates the performance of a binary classifier system as its discrimination threshold is varied (3).

It is an effective method of evaluating the performance of diagnostic tests (3).

ROC analysis is related in a direct and natural way to cost/benefit analysis of diagnostic decision making (3).

ROC analysis was first developed by electrical engineers and radar engineers during World War II for detecting enemy objects in battlefields (3).

Since then it has been used in medicine, radiology, biometrics, and other areas for many decades, and it is increasingly used in machine learning and data mining research (3).

Suppose we have a test statistic for predicting the presence or absence of disease.

                     True disease status
Test criterion       Pos        Neg
Pos                  TP         FP
Neg                  FN         TN
Total                P          N          P+N

TP (true positive) and TN (true negative) are correct results; FP (false positive) and FN (false negative) are errors.

Accuracy = Probability that the test yields a correct result (2).
         = (TP+TN) / (P+N)

Sensitivity = Probability that a true case will test positive.
            = TP / P

Also referred to as True Positive Rate (TPR) or True Positive Fraction (TPF).

Specificity = Probability that a true negative will test negative.
            = TN / N

Also referred to as True Negative Rate (TNR) or True Negative Fraction (TNF).

1 − Specificity = Probability that a true negative will test positive.
                = FP / N

Also referred to as False Positive Rate (FPR) or False Positive Fraction (FPF).

Positive Predictive Value (PPV) = Probability that a person with a positive test truly has the disease.
                                = TP / (TP+FP)

Negative Predictive Value (NPV) = Probability that a person with a negative test is truly disease free.
                                = TN / (TN+FN)

                     True disease status
Test criterion       Pos        Neg        Total
Pos                  27         173        200
Neg                  73         727        800
Total                100        900        1000

Se  = 27/100 = .27
Sp  = 727/900 = .81
FPF = 1 − Sp = .19
Acc = (27+727)/1000 = .75
PPV = 27/200 = .14
NPV = 727/800 = .91
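The calculations for this example table can be reproduced with a short Python sketch. The function name `diagnostic_metrics` and its argument names are illustrative choices, not something taken from the slides; only the four cell counts come from the table.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return the basic test properties for a 2x2 table of counts."""
    p = tp + fn  # all true cases (column total P)
    n = fp + tn  # all true non-cases (column total N)
    return {
        "Se": tp / p,                 # sensitivity = TP / P
        "Sp": tn / n,                 # specificity = TN / N
        "FPF": fp / n,                # 1 - specificity = FP / N
        "Acc": (tp + tn) / (p + n),   # accuracy
        "PPV": tp / (tp + fp),        # positive predictive value
        "NPV": tn / (tn + fn),        # negative predictive value
    }


# Cell counts from the example table
metrics = diagnostic_metrics(tp=27, fp=173, fn=73, tn=727)
for name, value in metrics.items():
    print(f"{name} = {value:.3f}")
```

Printing three decimals makes the rounding in the slide values visible (for example, PPV is 27/200 = .135, shown as .14 above).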

Of these properties, only Se and Sp (and hence FPR) are considered invariant test characteristics.

Accuracy, PPV, and NPV will vary according to the underlying prevalence of disease.

Se and Sp are thus “fundamental” test properties and hence are the most useful measures for comparing different test criteria, even though PPV and NPV are probably the most clinically relevant properties.
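To see the dependence on prevalence, the following Python sketch holds Se and Sp fixed at the values from the example table and recomputes PPV and NPV with Bayes' theorem. The function name `predictive_values` is hypothetical, and the prevalences 0.01 and 0.50 are chosen purely for illustration; only 0.10 (100 cases in 1000 people) corresponds to the table.

```python
def predictive_values(se, sp, prevalence):
    """PPV and NPV from sensitivity, specificity and prevalence (Bayes' theorem)."""
    tp = se * prevalence                # population fraction that is true positive
    fp = (1 - sp) * (1 - prevalence)    # false positive
    fn = (1 - se) * prevalence          # false negative
    tn = sp * (1 - prevalence)          # true negative
    return tp / (tp + fp), tn / (tn + fn)


se, sp = 27 / 100, 727 / 900  # fixed test characteristics from the example table
for prevalence in (0.01, 0.10, 0.50):  # 0.01 and 0.50 are illustrative assumptions
    ppv, npv = predictive_values(se, sp, prevalence)
    print(f"prevalence = {prevalence:.2f}  PPV = {ppv:.3f}  NPV = {npv:.3f}")
```

With the same Se and Sp, PPV rises and NPV falls as the disease becomes more common, which is why these two measures cannot be compared across populations with different prevalence.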

Now assume that our test statistic is no longer binary, but takes on a series of values (for instance, how many of five distinct risk factors a person exhibits).

Clinically we make a rule that says the test is positive if the number of risk factors meets or exceeds some threshold (#RF ≥ x).

Suppose our previous table resulted from using x = 4. Let’s see what happens as we vary x.

With x = 3:

                     True disease status
Test criterion       Pos        Neg        Total
Pos                  45         200        245
Neg                  55         700        755
Total                100        900        1000

Se  = 45/100 = .45
Sp  = 700/900 = .78
FPF = 1 − Sp = .22
Acc = (45+700)/1000 = .745 ≈ .75
PPV = 45/245 = .18
NPV = 700/755 = .93

Se increases, Sp decreases, and interestingly both PPV and NPV increase.

[Figure: test results for people without the disease and people with the disease, divided into test-negative (“-”) and test-positive (“+”) regions.]

Threshold   TPR    FPR
6           0.00   0.00
5           0.10   0.11
4           0.27   0.19
3           0.45   0.22
2           0.73   0.27
1           0.98   0.80
0           1.00   1.00

As we relax our threshold for defining “disease,” our true positive rate (sensitivity) increases, but so does the false positive rate (FPR).

The ROC curve is a way to visually display this information.

The diagonal line shows what we would expect from simple guessing (i.e., pure chance).

What might an even better ROC curve look like?

Note the immediate sharp rise in sensitivity. Perfect accuracy is represented by the upper left corner.

The ROC curve allows us to see, in a simple visual display, how sensitivity and specificity vary as our threshold varies (2).

The shape of the curve also gives us some visual clues about the overall strength of association between the underlying test statistic (in this case the number of RFs that are present) and disease status (2).

The ROC methodology easily generalizes to test statistics that are continuous (such as lung function or a blood gas). We simply fit a smoothed ROC curve through all observed data points (2).

The total area of the grid represented by an ROC curve is 1, since both TPR and FPR range from 0 to 1.

The portion of this total area that falls below the ROC curve is known as the area under the curve, or AUC.
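As a rough illustration, the AUC for the risk-factor example can be approximated by joining the (FPR, TPR) points of the threshold table with straight lines and summing the trapezoids beneath them. This is a sketch only: the trapezoidal rule is one simple choice (not the smoothed curve mentioned earlier), and the names `roc_table` and `trapezoid_auc` are made up for this example.

```python
# (threshold, TPR, FPR) rows copied from the threshold table
roc_table = [
    (6, 0.00, 0.00),
    (5, 0.10, 0.11),
    (4, 0.27, 0.19),
    (3, 0.45, 0.22),
    (2, 0.73, 0.27),
    (1, 0.98, 0.80),
    (0, 1.00, 1.00),
]


def trapezoid_auc(rows):
    """Area under the straight-line segments joining the (FPR, TPR) points."""
    points = sorted((fpr, tpr) for _, tpr, fpr in rows)  # order by FPR
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2  # width times mean height
    return area


auc = trapezoid_auc(roc_table)
print(f"AUC = {auc:.3f}")
```

For the table above this gives an AUC of roughly 0.71.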

The AUC serves as a quantitative summary of the strength of association between the underlying test statistic and disease status (2).

An AUC of 1.0 would mean that the test statistic could be used to perfectly discriminate between cases and controls (2).

An AUC of 0.5 (reflected by the diagonal 45° line) is equivalent to simply guessing (2).

The AUC can be shown to equal the Mann-Whitney U statistic (divided by the number of case-control pairs), or equivalently the Wilcoxon rank-sum statistic, for testing whether the test measure differs for individuals with and without disease (2).

It also equals the probability that the value of our test measure would be higher for a randomly chosen case than for a randomly chosen control (2).
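The probability interpretation can be sketched directly by comparing every case with every control. The data below are hypothetical, the function name `auc_by_pairs` is illustrative, and counting a tie as half a pair is an assumed (though commonly used) convention that the slides do not spell out.

```python
from itertools import product


def auc_by_pairs(case_values, control_values):
    """Fraction of case-control pairs in which the case has the higher value.

    Ties are counted as half a pair (assumed convention).
    """
    wins = 0.0
    for case, control in product(case_values, control_values):
        if case > control:
            wins += 1.0
        elif case == control:
            wins += 0.5
    return wins / (len(case_values) * len(control_values))


# Hypothetical numbers of risk factors, for illustration only
cases = [3, 4, 5]
controls = [1, 2, 3]
print(auc_by_pairs(cases, controls))
```

Here 8.5 of the 9 possible pairs favour the case, so the estimate is 8.5/9. The numerator is the Mann-Whitney U count for the cases.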

[Figure: ROC curve plotted against FPR, AUC ≈ 0.54]

[Figure: ROC curve plotted against FPR, AUC ≈ 0.95]

What defines a “good” AUC?

Opinions vary, and the answer is probably context specific. What may be a good AUC for predicting COPD may be very different from what is a good AUC for predicting prostate cancer.

One suggested scale for interpreting the AUC:

.90-1.0 = excellent
.80-.90 = good
.70-.80 = fair
.60-.70 = poor
.50-.60 = fail

Remember that <.50 is worse than guessing (4).

An alternative scale (5):

.97-1.0 = excellent
.92-.97 = very good
.75-.92 = good
.50-.75 = fair

Suppose we have two candidate test statistics to use to create a binary decision rule. Can we use ROC curves to choose an optimal one?

We can formally compare AUCs for two competing test statistics, but does this answer our question?

AUC speaks to which measure, as a continuous variable, best discriminates between cases and controls.

It does not tell us which specific cutpoint to use, or even which test statistic will ultimately provide the “best” cutpoint.

Two methods can be used to determine the optimal cut-off point.

The first method assumes that the best cut-off point for balancing the sensitivity and specificity of a test is the point on the curve closest to the (0, 1) point. In this method, optimal sensitivity and specificity are defined as those yielding the minimal value for (1 − sensitivity)² + (1 − specificity)². The cut-off point corresponding to these sensitivity and specificity values is the one closest to the (0, 1) point and is taken to be the cut-off point that best differentiates between people with disease and those without disease (1).
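Applied to the risk-factor threshold table, the first method can be sketched in Python as follows. The names `roc_table` and `closest_to_corner` are illustrative; the sketch uses the identities 1 − sensitivity = 1 − TPR and 1 − specificity = FPR.

```python
# (threshold, TPR, FPR) rows copied from the threshold table
roc_table = [
    (6, 0.00, 0.00),
    (5, 0.10, 0.11),
    (4, 0.27, 0.19),
    (3, 0.45, 0.22),
    (2, 0.73, 0.27),
    (1, 0.98, 0.80),
    (0, 1.00, 1.00),
]


def closest_to_corner(rows):
    """Return (threshold, squared distance) for the ROC point nearest (0, 1)."""
    best = None
    for threshold, tpr, fpr in rows:
        # (1 - sensitivity)^2 + (1 - specificity)^2
        d2 = (1 - tpr) ** 2 + fpr ** 2
        if best is None or d2 < best[1]:
            best = (threshold, d2)
    return best


threshold, d2 = closest_to_corner(roc_table)
print(f"closest point: threshold = {threshold}, squared distance = {d2:.4f}")
```

For this table the smallest squared distance occurs at threshold 2, where (1 − .73)² + .27² ≈ .146.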

The second method that may be used to determine the optimal cut-off point for a test is the Youden index (J). J is defined as the maximum vertical distance between the ROC curve and the diagonal or chance line and is calculated as J = maximum (sensitivity + specificity − 1). Using this measure, the cut-off point on the ROC curve which corresponds to J, that is, at which (sensitivity + specificity − 1) is maximized, is taken to be the optimal cut-off point. An intuitive interpretation of J is that it corresponds to the point on the curve farthest from chance (1).
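A minimal Python sketch of the Youden index for the risk-factor threshold table is given below. The names `roc_table` and `youden_optimum` are illustrative, and the sketch relies on sensitivity + specificity − 1 being the same quantity as TPR − FPR.

```python
# (threshold, TPR, FPR) rows copied from the threshold table
roc_table = [
    (6, 0.00, 0.00),
    (5, 0.10, 0.11),
    (4, 0.27, 0.19),
    (3, 0.45, 0.22),
    (2, 0.73, 0.27),
    (1, 0.98, 0.80),
    (0, 1.00, 1.00),
]


def youden_optimum(rows):
    """Return (threshold, J) where sensitivity + specificity - 1 is largest."""
    best_threshold, best_j = None, float("-inf")
    for threshold, tpr, fpr in rows:
        j = tpr - fpr  # sensitivity + specificity - 1
        if j > best_j:
            best_threshold, best_j = threshold, j
    return best_threshold, best_j


best_threshold, best_j = youden_optimum(roc_table)
print(f"Youden index J = {best_j:.2f} at threshold {best_threshold}")
```

For this table J = .73 − .27 = .46, reached at threshold 2, so in this example both methods select the same cut-off point. That agreement is not guaranteed in general.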

Open the Excel file.

The first column is the patient coding number, which you do not really need. The second column gives the age as a continuous variable. The third column is a categorical variable indicating thrombosis (1) or no thrombosis (0).

Exercise:

1. Calculate the sensitivity and 1 − specificity, considering that age is the test variable and thrombosis is the state variable. Calculate the ROC curve and plot it. Comment on the obtained results.
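For readers who prefer to script the calculation, one possible approach is sketched below in Python. It is not the exercise solution: the ages and thrombosis values are invented, the function name `roc_points` is hypothetical, and the rule "test positive if age ≥ cut-off" assumes that older age points toward thrombosis.

```python
def roc_points(test_values, states):
    """Sensitivity and 1 - specificity for every rule 'value >= cut-off'."""
    n_pos = sum(states)                # number with the condition (state = 1)
    n_neg = len(states) - n_pos        # number without it (state = 0)
    points = []
    for cutoff in sorted(set(test_values), reverse=True):
        tp = sum(1 for v, s in zip(test_values, states) if v >= cutoff and s == 1)
        fp = sum(1 for v, s in zip(test_values, states) if v >= cutoff and s == 0)
        points.append((cutoff, tp / n_pos, fp / n_neg))
    return points


# Hypothetical data, NOT the exercise file: age and thrombosis (1 = yes, 0 = no)
ages = [34, 45, 51, 58, 62, 67, 70, 75]
thrombosis = [0, 0, 1, 0, 1, 0, 1, 1]

for cutoff, sensitivity, fpf in roc_points(ages, thrombosis):
    print(f"age >= {cutoff}: Se = {sensitivity:.2f}, 1 - Sp = {fpf:.2f}")
```

Plotting the second value of each tuple (sensitivity) against the third (1 − specificity) gives the empirical ROC curve.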

Open the Excel file.

The first column is the patient age, the second column gives the sensitivity, and the third column gives 1 − specificity.

Exercise:

1. Calculate the Youden index for each age and use it to choose the optimal cut-off point.

1- Akobeng AK. Understanding diagnostic tests 3: Receiver operating characteristic curves. Acta Paediatrica 2007, vol 96, issue 5.

2- Kaiser Permanente Center for Health Research. Receiver operating characteristic (ROC) curves: assessing the predictive properties of a test statistic – Decision theory 2009.

3- http://en.wikipedia.org/wiki/Receiver_operating_characteristic

4- http://gim.unmc.edu/dxtests/roc3.htm

5- www.childrens-mercy.org/stats/ask/roc.asp