Classification Performance Evaluation

Upload: barnaby-lyons

Post on 02-Jan-2016


TRANSCRIPT

Page 1: Classification Performance Evaluation. How do you know that you have a good classifier? Is a feature contributing to overall performance? Is classifier A better than classifier B?

Classification Performance Evaluation

Page 2:

How do you know that you have a good classifier?

• Is a feature contributing to overall performance?

• Is classifier A better than classifier B?

• Internal Evaluation:
– Measure the performance of the classifier yourself.

• External Evaluation:
– Evaluation performed by an agency or individuals not directly involved in or responsible for the program.

Page 3:

Accuracy

• Easily the most common and intuitive measure of classification performance: the proportion of all predictions that are correct.
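Computed from raw counts, accuracy is a one-liner. A minimal sketch (the counts below are made up for illustration):

```python
def accuracy(tp, fp, fn, tn):
    """Fraction of all predictions that were correct."""
    return (tp + tn) / (tp + fp + fn + tn)

# Hypothetical counts: 85 of 100 predictions correct.
print(accuracy(tp=50, fp=5, fn=10, tn=35))  # 0.85
```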

Page 4:

Significance testing

• Say I have two classifiers.

• A = 50% accuracy
• B = 75% accuracy

• B is better, right?

Page 5:

Significance Testing

• Say I have another two classifiers

• A = 50% accuracy
• B = 50.5% accuracy

• Is B better?
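One common way to answer this is a paired bootstrap test: resample the test set many times and see how often B's accuracy advantage survives. A sketch on toy data (the correctness lists and function name are invented for illustration):

```python
import random

def bootstrap_p_better(correct_a, correct_b, n_boot=10000, seed=0):
    """Paired bootstrap: resample the test set with replacement and
    count how often classifier B beats classifier A on accuracy.
    correct_a / correct_b are per-example 0/1 correctness lists
    over the SAME test examples."""
    rng = random.Random(seed)
    n = len(correct_a)
    wins = 0
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        if sum(correct_b[i] for i in idx) > sum(correct_a[i] for i in idx):
            wins += 1
    return wins / n_boot

# Toy data: B is correct wherever A is, plus on 5 extra examples.
a = [1] * 50 + [0] * 50
b = [1] * 55 + [0] * 45
print(bootstrap_p_better(a, b))  # close to 1.0: the gap is consistent
```

A value near 1.0 suggests the difference is not just noise; near 0.5 it could easily be chance.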

Page 6:

Basic Evaluation

• Training data – used to identify model parameters

• Testing data – used for evaluation

• Optionally: Development / tuning data – used to identify model hyperparameters.

• Difficult to get significance or confidence values
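The three-way partition above can be sketched in a few lines (the function name and 80/10/10 proportions are illustrative choices, not prescribed by the slides):

```python
import random

def split_data(examples, train=0.8, dev=0.1, seed=0):
    """Shuffle and split into train / dev / test portions.
    Whatever fraction remains (here 0.1) becomes the test set."""
    items = list(examples)
    random.Random(seed).shuffle(items)
    n = len(items)
    n_train = int(n * train)
    n_dev = int(n * dev)
    return (items[:n_train],
            items[n_train:n_train + n_dev],
            items[n_train + n_dev:])

train_set, dev_set, test_set = split_data(range(100))
print(len(train_set), len(dev_set), len(test_set))  # 80 10 10
```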

Page 7:

Cross validation

• Identify n “folds” of the available data.
• Train on n-1 folds.
• Test on the remaining fold.

• In the extreme (n=N) this is known as “leave-one-out” cross validation

• n-fold cross validation (xval) gives n samples of the performance of the classifier.
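The fold bookkeeping can be sketched as follows (a minimal index-based version; real pipelines usually shuffle first):

```python
def kfold_indices(n_examples, n_folds):
    """Yield (train_idx, test_idx) index pairs for n-fold cross
    validation. With n_folds == n_examples this is leave-one-out."""
    folds = [list(range(i, n_examples, n_folds)) for i in range(n_folds)]
    for k in range(n_folds):
        train_idx = [i for j, fold in enumerate(folds) if j != k
                     for i in fold]
        yield train_idx, folds[k]

# 10 examples, 5 folds: each pass trains on 8 and tests on 2.
for train_idx, test_idx in kfold_indices(10, 5):
    print(len(train_idx), len(test_idx))  # 8 2
```

Each example lands in exactly one test fold, so the n performance samples cover the whole data set.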

Page 8:

Types of Errors

• False Positives
– The system predicted TRUE but the value was FALSE
– aka “False Alarms” or Type I errors

• False Negatives
– The system predicted FALSE but the value was TRUE
– aka “Misses” or Type II errors

Page 9:

Problems with accuracy

• Contingency Table

                             True Values
                      Positive           Negative
  Hyp      Positive   True Positive      False Positive
  Values   Negative   False Negative     True Negative

Page 10:

Problems with accuracy

• Information Retrieval Example
– Find the 10 documents related to a query in a set of 110 documents

                             True Values
                      Positive    Negative
  Hyp      Positive   0           0
  Values   Negative   10          100
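Plugging the counts from this table into the accuracy formula shows the problem: a system that returns nothing at all still scores over 90%.

```python
# Counts from the table above: 10 relevant documents, all missed.
tp, fp, fn, tn = 0, 0, 10, 100
accuracy = (tp + tn) / (tp + fp + fn + tn)
recall = tp / (tp + fn)
print(round(accuracy, 3))  # 0.909 -- looks strong
print(recall)              # 0.0 -- yet no relevant document was found
```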

Page 11:

Problems with accuracy

• Precision: how many hypothesized events were true events

• Recall: how many of the true events were identified

• F1-Measure: harmonic mean of precision and recall

                             True Values
                      Positive    Negative
  Hyp      Positive   0           0
  Values   Negative   10          100
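The three measures above can be sketched directly from the contingency counts (the zero-denominator convention of returning 0 is a common choice, not mandated by the slides):

```python
def precision_recall_f1(tp, fp, fn):
    """Precision = TP/(TP+FP): how many hypothesized events were true.
    Recall    = TP/(TP+FN): how many true events were identified.
    F1 = harmonic mean of the two. Zero denominators yield 0."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# The retrieval table above: everything is zero, unlike its 91% accuracy.
print(precision_recall_f1(tp=0, fp=0, fn=10))  # (0.0, 0.0, 0.0)

p, r, f = precision_recall_f1(tp=8, fp=2, fn=2)
print(round(f, 3))  # 0.8
```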

Page 12:

F-Measure

• F-measure can be weighted to favor Precision or Recall
– beta > 1 favors recall
– beta < 1 favors precision

                             True Values
                      Positive    Negative
  Hyp      Positive   0           0
  Values   Negative   10          100
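The weighted F-measure is F_beta = (1 + beta^2) * P * R / (beta^2 * P + R). A sketch with toy precision/recall numbers chosen to make the weighting visible:

```python
def f_beta(precision, recall, beta):
    """Weighted F-measure: beta > 1 weights recall more heavily,
    beta < 1 weights precision more heavily. beta = 1 gives F1."""
    if precision + recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

p, r = 0.9, 0.5  # a precision-heavy classifier (toy numbers)
print(round(f_beta(p, r, 1.0), 3))  # 0.643 -- plain F1
print(round(f_beta(p, r, 2.0), 3))  # 0.549 -- pulled toward recall
print(round(f_beta(p, r, 0.5), 3))  # 0.776 -- pulled toward precision
```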

Page 13:

F-Measure

• Accuracy is weighted towards majority class performance.

• F-measure is useful for measuring the performance on both classes.

Page 14:

ROC curves

• It is common to plot classifier performance at a variety of settings or thresholds

• Receiver Operating Characteristic (ROC) curves plot the true positive rate against the false positive rate.

• The overall performance is summarized by the Area Under the Curve (AUC).
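Sweeping the threshold and integrating the curve can be sketched as follows (the scores and labels are an invented toy example; the trapezoid rule is one standard way to compute the AUC):

```python
def roc_points(scores, labels):
    """Sweep a decision threshold over the classifier's scores and
    collect (false positive rate, true positive rate) points."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for thresh in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= thresh and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= thresh and y == 0)
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoid rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

scores = [0.9, 0.8, 0.7, 0.4, 0.3]  # hypothetical classifier scores
labels = [1, 1, 0, 1, 0]            # true classes for those examples
print(round(auc(roc_points(scores, labels)), 3))  # 0.833
```

An AUC of 1.0 means the classifier ranks every positive above every negative; 0.5 is chance-level ranking.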

Page 15:

What about more than 2 classes

Convert to a set of binary-class classification problems: treat each class in turn as positive and the rest as negative.

Page 16:

What about more than 2 classes

a as positive:

   a(pos)   b&c(neg)   <-- classified as
   TP=50    FN=0       a(pos)
   FP=0     TN=100     b&c(neg)

b as positive:

   b(pos)   a&c(neg)   <-- classified as
   TP=46    FN=4       b(pos)
   FP=0     TN=100     a&c(neg)

Try c as positive yourself.
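The per-class tables above can be combined into a single score by averaging per-class F1 (macro-averaging, one common choice among several). A sketch using the two tables given; class c is omitted since the slide leaves it as an exercise:

```python
def f1(tp, fp, fn):
    """Per-class F1 from one-vs-rest counts."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0

# One-vs-rest counts from the two tables above.
per_class = {
    "a": dict(tp=50, fp=0, fn=0),
    "b": dict(tp=46, fp=0, fn=4),
}
macro_f1 = sum(f1(**m) for m in per_class.values()) / len(per_class)
print(round(macro_f1, 3))  # averages a perfect class and an imperfect one
```

Macro-averaging weights every class equally, so small classes count as much as large ones; micro-averaging (pooling the counts first) would instead weight by class size.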