Cost-Sensitive Classifier Selection

Ross Bettinger, Analytical Consultant, SAS Services

Copyright © 2003, SAS Institute Inc. All rights reserved.


Rule-Based Knowledge Extraction

A typical goal in extracting knowledge from data is the production of a classification rule that will assign a class membership to a future event with a specified probability

A binary classifier assigns an object to one of two classes

• The decision regarding the class assignment will be either correct or incorrect, so there are four possible outcomes:

{Predicted Event, Actual Event} (True Positive)

{Predicted Event, Actual Nonevent} (False Positive)

{Predicted Nonevent, Actual Event} (False Negative)

{Predicted Nonevent, Actual Nonevent} (True Negative)
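The four outcomes above can be tallied with a few lines of Python. This is only an illustrative sketch: the function names and the sample labels are hypothetical, and `True` is assumed to mark an event.

```python
from collections import Counter

def outcome(predicted_event: bool, actual_event: bool) -> str:
    """Name the outcome of one binary decision."""
    if predicted_event:
        return "TP" if actual_event else "FP"
    return "FN" if actual_event else "TN"

def count_outcomes(predicted, actual):
    """Tally the four outcomes over paired predicted/actual labels."""
    return Counter(outcome(p, a) for p, a in zip(predicted, actual))

# Hypothetical labels: True marks an event, False a nonevent.
predicted = [True, True, False, False, True]
actual = [True, False, True, False, True]
counts = count_outcomes(predicted, actual)
print(counts)
```

Every decision falls into exactly one of the four cells, so the counts always sum to the number of instances.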


Evaluating Classifier Performance

Use 2x2 classification table of predicted vs actual class membership

A critical concept in the discussion of decisions is the definition of an event

• An observation or instance, I, has been classified into class e with probability $p(e|I)$ if the classifier assigns a probability $p(I) \ge p_{cutoff}$

| | Predicted Event, E | Predicted Nonevent, N |
|---|---|---|
| Actual Event, e | (E, e) True Positive (TP) | (N, e) False Negative (FN) |
| Actual Nonevent, n | (E, n) False Positive (FP) | (N, n) True Negative (TN) |
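As a rough sketch of how a cutoff turns classifier probabilities into the 2x2 classification table, the Python below thresholds each probability and counts the (predicted, actual) cells. The scores, the class codes `'e'`/`'n'`, and the function names are assumptions made for illustration.

```python
def classify(p_event: float, p_cutoff: float) -> str:
    """Return 'E' (predicted event) when the probability reaches p_cutoff, else 'N'."""
    return "E" if p_event >= p_cutoff else "N"

def classification_table(probs, actuals, p_cutoff):
    """Build the 2x2 table keyed by (predicted, actual), e.g. ('E', 'n')."""
    table = {("E", "e"): 0, ("N", "e"): 0, ("E", "n"): 0, ("N", "n"): 0}
    for p, a in zip(probs, actuals):
        table[(classify(p, p_cutoff), a)] += 1
    return table

# Hypothetical probabilities and actual classes ('e' = event, 'n' = nonevent).
probs = [0.9, 0.7, 0.4, 0.2]
actuals = ["e", "n", "e", "n"]
table = classification_table(probs, actuals, p_cutoff=0.5)
print(table)
```

Lowering the cutoff moves instances from the N column to the E column, which raises both the true positive and the false positive counts.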


The Cost of a Decision

Correct Decision: TP = $p(E|e) = \frac{\#\{\text{events predicted as events}\}}{\#\{\text{total events}\}}$

False Positive: FP = $p(E|n) = \frac{\#\{\text{nonevents predicted as events}\}}{\#\{\text{total nonevents}\}}$

Assume that correct decisions incur no cost

The theoretical expected cost of misclassifying an instance I is

$C = p(E|n)\,c(E,n) + p(N|e)\,c(N,e)$

$C = FP \cdot c(E,n) + (1 - TP) \cdot c(N,e)$
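The expected-cost expression can be evaluated directly once the two rates are known. The sketch below assumes hypothetical counts and costs; `cost_fp` stands for c(E,n) and `cost_fn` for c(N,e).

```python
def expected_cost(tp_rate, fp_rate, cost_fp, cost_fn):
    """C = FP * c(E,n) + (1 - TP) * c(N,e); correct decisions cost nothing."""
    return fp_rate * cost_fp + (1.0 - tp_rate) * cost_fn

# Hypothetical counts from a 2x2 classification table.
tp, fn, fp, tn = 70, 30, 29, 71
tp_rate = tp / (tp + fn)   # p(E|e)
fp_rate = fp / (fp + tn)   # p(E|n)

# Hypothetical costs: a missed event is five times as costly as a false alarm.
cost = expected_cost(tp_rate, fp_rate, cost_fp=1.0, cost_fn=5.0)
print(round(cost, 4))
```

With these numbers the false negatives dominate the cost even though the two error rates are nearly equal, because c(N,e) is the larger cost.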


Receiver Operating Characteristic

Compute a 2x2 classification table for values of $p_{cutoff}$ and plot the curve traced by (FP, TP) as $p_{cutoff}$ ranges from 0 to 1

This curve is called the “receiver operating characteristic” (ROC) and was developed during World War II to assess the performance of radar receivers in detecting targets accurately

The area under the ROC curve (AUC) is defined to be the performance index of interest
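A minimal sketch of the sweep, assuming a small hypothetical data set: each cutoff yields one (FP, TP) point, and the AUC is approximated here with the trapezoidal rule, which is one common choice rather than anything prescribed by the slides.

```python
def roc_points(probs, actuals, cutoffs):
    """Return the distinct (FP, TP) rate pairs, sorted by FP and then TP."""
    n_events = sum(1 for a in actuals if a)
    n_nonevents = len(actuals) - n_events
    points = set()
    for c in cutoffs:
        tp = sum(1 for p, a in zip(probs, actuals) if a and p >= c)
        fp = sum(1 for p, a in zip(probs, actuals) if not a and p >= c)
        points.add((fp / n_nonevents, tp / n_events))
    return sorted(points)

def auc(points):
    """Trapezoidal area under a curve given as sorted (FP, TP) points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

# Hypothetical probabilities and actual classes (True = event).
probs = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
actuals = [True, True, False, True, False, False]
cutoffs = [i / 10 for i in range(11)]   # 0.0, 0.1, ..., 1.0
curve = roc_points(probs, actuals, cutoffs)
area = auc(curve)
print(curve, area)
```

A cutoff of 0 classifies everything as an event, giving the point (1, 1); a cutoff of 1 gives (0, 0) here because no probability reaches 1.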


ROC Plot

[Figure: ROC plot with the points (.29, .70) and (.29, .67) marked]


ROC Curve and Decision Costs

ROC curve does not include any class distribution or misclassification cost information in its construction

• Does not give much guidance in the choice among competing classifiers unless one of them clearly dominates all of the others over all values of $p_{cutoff}$

Overlay class distribution and misclassification cost on ROC curve using average cost of decision

$C = C_0 + \sum_i \sum_j \text{MisclassificationCost}_{ij}\, P(\text{Predicted}_i, \text{Actual}_j)$

$C = C_0 + \sum_i \sum_j c(\text{Predicted}_i, \text{Actual}_j)\, P(\text{Predicted}_i | \text{Actual}_j)\, P(\text{Actual}_j)$
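The double sum can be written as a short function over (predicted, actual) pairs. This is a sketch under assumed inputs: the dictionaries, class codes, and numeric values below are hypothetical.

```python
def average_cost(cost, p_pred_given_actual, p_actual, c0=0.0):
    """C = C0 + sum_i sum_j c(i, j) * P(i | j) * P(j).

    Keys of `cost` and `p_pred_given_actual` are (predicted, actual) pairs;
    `p_actual` maps each actual class to its prior probability.
    """
    return c0 + sum(
        cost[(i, j)] * p_pred_given_actual[(i, j)] * p_actual[j]
        for (i, j) in cost
    )

# Hypothetical costs, conditional probabilities, and class distribution.
cost = {("E", "e"): 0.0, ("N", "e"): 5.0, ("E", "n"): 1.0, ("N", "n"): 0.0}
cond = {("E", "e"): 0.70, ("N", "e"): 0.30, ("E", "n"): 0.29, ("N", "n"): 0.71}
prior = {"e": 0.3, "n": 0.7}
c_avg = average_cost(cost, cond, prior)
print(round(c_avg, 4))
```

Unlike the ROC curve itself, this quantity changes when either the class distribution or the cost entries change.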


ROC Curve and Decision Costs (cont’d)

For the 2x2 classification table, the equation becomes

$C = C_0 + c(E,e)\,P(E|e)\,P(e) + c(N,e)\,P(N|e)\,P(e) + c(E,n)\,P(E|n)\,P(n) + c(N,n)\,P(N|n)\,P(n)$

At minimum average cost point, slope of ROC curve is (correct decisions assumed to incur no cost)

$\frac{dP(E|e)}{dP(E|n)} = \frac{P(n)\,c(E,n)}{P(e)\,c(N,e)}$

ROC operating point is sensitive to class distribution, misclassification costs
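One way to see why this slope matters: moving along a line with that slope in (FP, TP) space leaves the average cost unchanged. The sketch below checks this numerically, assuming zero cost for correct decisions and hypothetical values for the class distribution and costs.

```python
def two_class_cost(tp_rate, fp_rate, p_event, cost_fp, cost_fn, c0=0.0):
    """Average cost for the 2x2 table, taking c(E,e) = c(N,n) = 0."""
    p_nonevent = 1.0 - p_event
    return (c0
            + cost_fn * (1.0 - tp_rate) * p_event   # c(N,e) P(N|e) P(e)
            + cost_fp * fp_rate * p_nonevent)       # c(E,n) P(E|n) P(n)

def min_cost_slope(p_event, cost_fp, cost_fn):
    """Slope dP(E|e)/dP(E|n) = P(n) c(E,n) / (P(e) c(N,e))."""
    return ((1.0 - p_event) * cost_fp) / (p_event * cost_fn)

# Hypothetical operating conditions.
p_event, cost_fp, cost_fn = 0.3, 1.0, 5.0
m = min_cost_slope(p_event, cost_fp, cost_fn)

# Two points on the same line of slope m have the same average cost.
fp_a, tp_a = 0.29, 0.70
fp_b, tp_b = fp_a + 0.10, tp_a + m * 0.10
cost_a = two_class_cost(tp_a, fp_a, p_event, cost_fp, cost_fn)
cost_b = two_class_cost(tp_b, fp_b, p_event, cost_fp, cost_fn)
print(m, cost_a, cost_b)
```

Rarer events or cheaper false negatives steepen the slope, which pushes the preferred operating point toward the lower left of the ROC plot.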


Determine ROC Operating Point

Represent slope of ROC curve using adjacent points to form the isoperformance line

$\frac{TP_2 - TP_1}{FP_2 - FP_1} = \frac{P(n)\,c(E,n)}{P(e)\,c(N,e)}$

Compute slopes at adjacent points, determine interval containing slope, match with classifier point, find $p_{cutoff}$
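The search over adjacent points might be sketched as follows. The curve is assumed to be concave (segment slopes decrease as FP grows), and the (FP, TP, p_cutoff) triples are hypothetical.

```python
import math

def segment_slope(a, b):
    """Slope (TP2 - TP1) / (FP2 - FP1); vertical segments count as +infinity."""
    (fp1, tp1), (fp2, tp2) = a, b
    return math.inf if fp2 == fp1 else (tp2 - tp1) / (fp2 - fp1)

def operating_point(points, target_slope):
    """Walk up the curve while the next segment is steeper than the target.

    `points` are (FP, TP, p_cutoff) triples sorted by increasing FP and
    assumed to lie on a concave curve, so segment slopes decrease.
    """
    i = 0
    while (i + 1 < len(points)
           and segment_slope(points[i][:2], points[i + 1][:2]) > target_slope):
        i += 1
    return points[i]

# Hypothetical ROC points with the cutoff that produced each one.
points = [(0.0, 0.0, 1.0), (0.1, 0.5, 0.7), (0.3, 0.8, 0.5),
          (0.6, 0.95, 0.3), (1.0, 1.0, 0.0)]

# Target slope P(n) c(E,n) / (P(e) c(N,e)) for hypothetical conditions.
target = (0.7 * 1.0) / (0.3 * 5.0)
fp, tp, p_cutoff = operating_point(points, target)
print(fp, tp, p_cutoff)
```

The returned point is the one whose incoming segment is at least as steep as the target and whose outgoing segment is no steeper, which is the interval described above.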


ROC Convex Hull (Provost and Fawcett, 1997)

Overlay multiple ROC curves on same (FP, TP) axes


ROC Convex Hull (cont’d)

Add convex hull to ROC curves
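A convex hull over pooled (FP, TP) points can be computed with a standard upper-hull scan (the monotone chain technique, used here as an illustration and not necessarily the method behind the slides). The points below are hypothetical, and the trivial classifiers (0, 0) and (1, 1) are always included.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); negative means a right (clockwise) turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def roc_convex_hull(points):
    """Upper convex hull of (FP, TP) points, including (0, 0) and (1, 1)."""
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    hull = []
    for p in pts:
        # Drop the last vertex while it lies on or below the line hull[-2] -> p.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull

# Hypothetical ROC points pooled from several classifiers.
pooled = [(0.1, 0.5), (0.3, 0.7), (0.2, 0.75), (0.5, 0.9), (0.4, 0.7)]
hull = roc_convex_hull(pooled)
print(hull)
```

Points that fall below the hull, such as (0.3, 0.7) and (0.4, 0.7) here, cannot be least-cost for any slope, so they drop out of the comparison.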


ROC Convex Hull (cont’d)

Add isoperformance line


Selecting Classifiers Using ROC Method

The isoperformance line, which is tangent to the ROCCH at the point of minimum expected cost, indicates which classifier to use for a specified combination of class distribution and misclassification costs

Furthermore, the ROCCH method indicates the range of slopes over which a particular classifier is optimal with respect to class distribution and misclassification costs
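The range of slopes for each hull vertex follows from the slopes of the two hull segments that meet there. The sketch below assumes a hypothetical hull and hypothetical classifier labels "A" and "B"; the end points (0, 0) and (1, 1) are labeled "trivial".

```python
import math

def slope(a, b):
    """Slope between two (FP, TP) points; vertical segments count as +infinity."""
    return math.inf if b[0] == a[0] else (b[1] - a[1]) / (b[0] - a[0])

def slope_ranges(hull, labels):
    """For each hull vertex, return (vertex, label, low, high) slope bounds."""
    rows = []
    for k, vertex in enumerate(hull):
        high = math.inf if k == 0 else slope(hull[k - 1], vertex)
        low = 0.0 if k == len(hull) - 1 else slope(vertex, hull[k + 1])
        rows.append((vertex, labels.get(vertex, "trivial"), low, high))
    return rows

def best_for_slope(rows, m):
    """Return the first (vertex, label) whose slope interval contains m."""
    for vertex, label, low, high in rows:
        if low <= m <= high:
            return vertex, label
    return None

# Hypothetical hull vertices (sorted by FP) and the classifier behind each one.
hull = [(0.0, 0.0), (0.1, 0.5), (0.2, 0.75), (0.5, 0.9), (1.0, 1.0)]
labels = {(0.1, 0.5): "A", (0.2, 0.75): "B", (0.5, 0.9): "B"}
rows = slope_ranges(hull, labels)

# Isoperformance slope for hypothetical conditions: P(e) = 0.3, costs 1 and 5.
choice = best_for_slope(rows, (0.7 * 1.0) / (0.3 * 5.0))
print(choice)
```

A slope that lands exactly on a boundary between two intervals is matched to the first vertex in FP order; both adjacent vertices have equal cost in that case.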


Selecting Classifiers Using ROCCH (cont’d)

Convex hull points + associated classifier


Selecting Classifiers Using ROCCH (cont’d)

Range of slopes, points of tangency, classifier


Selecting Classifiers Using ROCCH (cont’d)

Classifier and AUC for German credit ensemble classifiers


Selecting Classifiers Using ROCCH (cont’d)

Ensemble classifiers for Catalog Direct Mail


Selecting Classifiers Using ROCCH (cont’d)

Classifier and AUC for Catalog Direct Mail


Selecting Classifiers Using ROCCH (cont’d)

Ensemble classifiers for KDD-98 Cup


Selecting Classifiers Using ROCCH (cont’d)

Classifier and AUC for KDD-98 Cup


Summary

The ROCCH methodology for selecting binary classifiers explicitly includes class distribution and misclassification costs in its formulation.

It is a robust alternative to whole-curve metrics like AUC, which report global classifier performance but may not indicate the best classifier (in the least-cost sense) for the range of operating conditions under which the classifier will assign class memberships.
