
ROC & AUC, LIFT

Dr. Avi Rosenfeld

Introduction to ROC curves

• ROC = Receiver Operating Characteristic

• Started in electronic signal detection theory (1940s - 1950s)

• Has become very popular in biomedical applications, particularly radiology and imaging

• Also used in data mining

False Positives / Negatives

Confusion matrix 1:

              Predicted P   Predicted N
  Actual P        20            10 (FN)
  Actual N        30 (FP)       90

Confusion matrix 2:

              Predicted P   Predicted N
  Actual P        10            20 (FN)
  Actual N        15 (FP)      105

Precision (P) = 20 / 50 = 0.4
Recall (P) = 20 / 30 = 0.666
F-measure = 2 * 0.4 * 0.666 / 1.0666 = 0.5
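A minimal Python sketch (variable names are ours, not the slides') that reproduces these three numbers from confusion matrix 1:

```python
# Confusion matrix 1, with P as the positive class.
tp, fn = 20, 10   # actual P row
fp, tn = 30, 90   # actual N row

precision = tp / (tp + fp)                                 # 20 / 50 = 0.4
recall = tp / (tp + fn)                                    # 20 / 30 = 0.666...
f_measure = 2 * precision * recall / (precision + recall)  # harmonic mean
print(precision, recall, f_measure)                        # 0.4 0.666... 0.5
```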


Different Cost Measures

• The confusion matrix (easily generalizes to multi-class)
• Machine Learning methods usually minimize FP + FN
• TPR (True Positive Rate): TP / (TP + FN) = Recall
• FPR (False Positive Rate): FP / (TN + FP) = 1 - specificity

              Predicted class
               Yes                   No
  Actual Yes   TP: True positive     FN: False negative
  Actual No    FP: False positive    TN: True negative
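The same counts give the two ROC-axis rates; a small sketch, applied to both confusion matrices above:

```python
def rates(tp, fn, fp, tn):
    """Return (TPR, FPR) for one confusion matrix."""
    tpr = tp / (tp + fn)   # True Positive Rate = Recall
    fpr = fp / (fp + tn)   # False Positive Rate = 1 - specificity
    return tpr, fpr

print(rates(20, 10, 30, 90))    # matrix 1: (0.666..., 0.25)
print(rates(10, 20, 15, 105))   # matrix 2: (0.333..., 0.125)
```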

Specific Example

[Figure: two overlapping distributions of the test result, one for people with the disease and one for people without. A threshold splits the axis: patients below it are called "negative", patients above it are called "positive".]

Some definitions ...

[Figure panels: the same two distributions and threshold, shaded in turn to show the True Positives, False Positives, True Negatives, and False Negatives.]

Moving the Threshold: left

[Figure: the same distributions with two candidate threshold lines; moving the threshold left shifts the regions called "-" and "+".]

Which line has the higher recall of "-"? Which line has the higher precision of "-"?

ROC curve

[Figure 5.2: A sample ROC curve. Y axis: True Positive Rate (Recall), 0% to 100%; X axis: False Positive Rate (1 - specificity), 0% to 100%.]
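A minimal sketch of how such a curve is produced, assuming NumPy and made-up scores/labels: sweep the threshold across the test result and record one (FPR, TPR) point per setting.

```python
import numpy as np

# Made-up test results: a higher score looks more "diseased".
scores = np.array([0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.4, 0.3, 0.2, 0.1])
labels = np.array([1,   1,   0,   1,   1,    0,   0,   1,   0,   0])

pos, neg = labels.sum(), (labels == 0).sum()
points = []
for t in np.unique(scores)[::-1]:        # sweep the threshold from right to left
    called_pos = scores >= t             # patients called "positive"
    tpr = (called_pos & (labels == 1)).sum() / pos
    fpr = (called_pos & (labels == 0)).sum() / neg
    points.append((fpr, tpr))
print(points)                            # one (FPR, TPR) point per threshold
```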

Different kinds of ROC curves

Area under ROC curve (AUC)

• A general measure
• The area under the ROC graph
• 0.50 is random; 1.0 is perfect

AUC for ROC curves

[Figure: four ROC plots (True Positive Rate vs. False Positive Rate, each axis 0% to 100%) with AUC = 50%, AUC = 65%, AUC = 90%, and AUC = 100%.]
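AUC can be read off (FPR, TPR) points like those above with the trapezoidal rule; a sketch, with the corners (0, 0) and (1, 1) padded in:

```python
import numpy as np

def auc(points):
    """Trapezoidal area under (FPR, TPR) points, sorted and padded with corners."""
    pts = [(0.0, 0.0)] + sorted(points) + [(1.0, 1.0)]
    fpr, tpr = np.array(pts).T
    return np.trapz(tpr, fpr)

print(auc([(0.5, 0.5)]))   # the random diagonal: 0.5
print(auc([(0.0, 1.0)]))   # a perfect classifier: 1.0
```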

Lift Charts

• X axis is sample size: (TP + FP) / N
• Y axis is TP
• Formal definition: lift = model accuracy / random accuracy

40% of responses for 10% of cost: lift factor = 4
80% of responses for 40% of cost: lift factor = 2

[Figure: lift chart. X axis: Sample Size (5% to 95%); Y axis: Lift Value (0 to 4.5). The Model curve is plotted against the Random baseline; the lift factor is the ratio between them.]
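A tiny sketch of the lift arithmetic behind those two numbers (hypothetical helper name):

```python
def lift_factor(responses_captured, sample_fraction):
    """Lift = share of responses captured / share of the population contacted."""
    return responses_captured / sample_fraction

print(lift_factor(0.40, 0.10))   # 40% of responses for 10% of cost -> 4.0
print(lift_factor(0.80, 0.40))   # 80% of responses for 40% of cost -> 2.0
```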

The relationship between the measures

The OVERFITTING problem

10-fold cross-validation (one example of K-fold cross-validation)

• 1. Randomly divide your data into 10 pieces, 1 through 10 (see the sketch after this list).
• 2. Treat the 1st tenth of the data as the test dataset. Fit the model to the other nine-tenths of the data (which are now the training data).
• 3. Apply the model to the test data (e.g., for logistic regression, calculate predicted probabilities of the test observations).
• 4. Repeat this procedure for all 10 tenths of the data.
• 5. Calculate statistics of model accuracy and fit (e.g., ROC curves) from the test data only.
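A minimal sketch of the index bookkeeping for steps 1-4, assuming NumPy; the model fitting and prediction calls are left as comments, not a specific library API:

```python
import numpy as np

def k_fold_indices(n, k=10, seed=0):
    """Step 1: randomly divide n example indices into k pieces."""
    idx = np.random.default_rng(seed).permutation(n)
    return np.array_split(idx, k)

folds = k_fold_indices(n=200, k=10)
for i, test_idx in enumerate(folds):     # steps 2-4: each piece is the test set once
    train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
    # Fit the model on train_idx, predict on test_idx (step 3), and pool
    # those test-set predictions for step 5 (ROC curves, accuracy, ...).
```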


Analyzing the results

The Kappa Statistic

• Kappa measures relative improvement over random prediction
• Dreal / Dperfect = A (accuracy of the real model)
• Drandom / Dperfect = C (accuracy of a random model)
• Kappa Statistic = (A - C) / (1 - C) = (Dreal/Dperfect - Drandom/Dperfect) / (1 - Drandom/Dperfect)

Cancel Dperfect everywhere:

• Kappa = (Dreal - Drandom) / (Dperfect - Drandom)
• Kappa = 1 when A = 1
• Kappa = 0 if prediction is no better than random guessing

Aside: the Kappa statistic

• Two confusion matrices for a 3-class problem: real model (left) vs. random model (right)
• Number of successes: sum of the values on the diagonal (D)
• Kappa = (Dreal - Drandom) / (Dperfect - Drandom) = (140 - 82) / (200 - 82) = 0.492
• Accuracy = 140 / 200 = 0.70

Real model:

              Predicted
               a     b     c   total
  Actual  a    88    10     2    100
          b    14    40     6     60
          c    18    10    12     40
  total       120    60    20    200

Random model:

              Predicted
               a     b     c   total
  Actual  a    60    30    10    100
          b    36    18     6     60
          c    24    12     4     40
  total       120    60    20    200

The kappa statistic – how to calculate Drandom?

Actual confusion matrix, C:

              Predicted
               a     b     c   total
  Actual  a    88    10     2    100
          b    14    40     6     60
          c    18    10    12     40
  total       120    60    20    200

Expected confusion matrix, E, for a random model:

              Predicted
               a     b     c   total
  Actual  a     ?                100
          b                       60
          c                       40
  total       120    60    20    200

? = 100 * 120 / 200 = 60. Rationale: there are 100 actual values of class a, and 120/200 of all predictions fall in predicted class a, so the expected random count is 100 * 120 / 200.
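A sketch that reproduces the whole calculation, building the expected matrix E from the row and column totals exactly as in the rationale above (assuming NumPy):

```python
import numpy as np

# Real model's confusion matrix (rows = actual, columns = predicted).
C = np.array([[88, 10,  2],
              [14, 40,  6],
              [18, 10, 12]])

n = C.sum()                                      # 200 = Dperfect
E = np.outer(C.sum(axis=1), C.sum(axis=0)) / n   # e.g. E[0, 0] = 100 * 120 / 200 = 60

d_real = np.trace(C)     # 88 + 40 + 12 = 140
d_random = np.trace(E)   # 60 + 18 + 4 = 82
kappa = (d_real - d_random) / (n - d_random)
print(kappa)             # (140 - 82) / (200 - 82) = 0.4915... ~ 0.492
```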

Toward the exercise ...
