Introduction to Statistical Pattern Recognition
Part II
Outline
• Bayes Detection Rule Revisited
• Probability of Error
• Evaluating the Classifier
• MATLAB illustrations
Bayes Decision Rule
• Two-class case
Decide ω1 if P(ω1|x) > P(ω2|x); otherwise decide ω2.
• N-class case
Given a feature vector x, assign it to class ωj if:
P(ωj|x) > P(ωi|x) for all i ≠ j
Expanding P(ωj|x) and P(ωi|x) using Bayes' theorem, P(ω|x) = p(x|ω)P(ω)/p(x), and cancelling the common evidence term p(x) leads to the equivalent form on the next slide.
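In code, the posterior-based N-class rule is simply an argmax over the posteriors. A minimal MATLAB sketch (the posterior values here are illustrative, not from a real classifier):
% Posteriors P(w_i | x) for one observation x (illustrative values)
post = [0.2 0.5 0.3];
% Assign x to the class with the largest posterior
[~, j] = max(post);   % here j = 2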
Bayes Decision Rule
• N-class case
Given a feature vector x, assign it to class ωj if:
p(x|ωj)P(ωj) > p(x|ωi)P(ωi) for all i ≠ j
• Likelihood Ratio: 2-class case
Decide ω1 if
Λ(x) = p(x|ω1)/p(x|ω2) > P(ω2)/P(ω1)
where the left-hand side Λ(x) is the likelihood ratio and the right-hand side is the threshold.
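A minimal MATLAB sketch of the likelihood-ratio test, using the univariate Gaussian class-conditionals and priors from the worked example later in these slides (assumes the Statistics Toolbox for normpdf):
% Class-conditionals: p(x|w1) = N(-1,1), p(x|w2) = N(1,1)
% Priors: P(w1) = 0.6, P(w2) = 0.4
x = 0.2;                                % observation to classify
lr = normpdf(x,-1,1) / normpdf(x,1,1);  % likelihood ratio
thresh = 0.4/0.6;                       % threshold P(w2)/P(w1)
if lr > thresh
    disp('decide w1')
else
    disp('decide w2')
end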
Bayes Decision Rule: Probability of Error (N-class)
• An error is made when we classify an observation as class ωi when it really belongs to the j-th class. Denoting the decision region for class ωi as Ωi and its complement as Ωi^c, the probability of error is
P(error) = Σ_{i=1}^{N} ∫_{Ωi^c} p(x|ωi) P(ωi) dx
Bayes Decision Rule: Probability of Error (2-class)
For two classes the error splits into two terms, corresponding to regions I and II in the figures below:
P(error) = ∫_{Ω1} p(x|ω2)P(ω2) dx + ∫_{Ω2} p(x|ω1)P(ω1) dx
• We can set the amount of error we will tolerate for misclassifying one of the classes
Case I: Fish Sorting Example (Salmon vs. Sea Bass)
[Figure: posteriors P(ω1|x) (Salmon) and P(ω2|x) (Sea Bass) versus feature x, with decision boundary x* separating error regions I and II.]
Salmon: $20/lb   Sea Bass: $10/lb
To satisfy customers, which error should be minimized: Error I or Error II?
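The two error regions can be evaluated in closed form with normcdf. A minimal sketch, assuming the Gaussian class-conditionals and priors of the later worked example, with class ω1 on the left and decision region Ω1 = {x < x*} (all values illustrative):
p1 = 0.6;  p2 = 0.4;                      % priors P(w1), P(w2)
xstar = -0.15;                            % decision boundary
% Region I: classified as w1 (x < xstar) but really from w2
errI = p2 * normcdf(xstar, 1, 1);
% Region II: classified as w2 (x > xstar) but really from w1
errII = p1 * (1 - normcdf(xstar, -1, 1));
perr = errI + errII                       % total probability of error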
Bayes Decision Rule: Probability of Error (2-class)
Case II: Cancerous vs. Healthy Tissue
[Figure: posteriors P(ω1|x) (Healthy) and P(ω2|x) (Cancerous) versus feature x, with decision boundary x* separating error regions I and II.]
Taking into account the patient's well-being, which error should be minimized: Error I or Error II?
Bayes Decision Rule: Probability of Error (2-class)
[Figure: posteriors P(ω1|x) (target class) and P(ω2|x) (non-target class) versus feature x, with decision boundary x* and false-alarm region I.]
Region I shows the probability of false alarm, i.e., the probability of wrongly classifying an observation as the target (class ω1) when it really belongs to class ω2.
Example
We will look at a univariate classification problem with two classes. The class-conditionals are given by the normal distributions as follows:
p(x|ω1) = N(-1, 1)   p(x|ω2) = N(1, 1)
The priors are
P(ω1) = 0.6   P(ω2) = 0.4
Adjust the decision boundary to achieve a desired probability of false alarm, P(FA) = 0.05, e.g., (a) the probability that cancerous tissue is classified as healthy, or (b) the probability that sea bass is classified as salmon.
Example
We need to find the value of x* such that
P(FA) = ∫_{-∞}^{x*} P(ω2) p(x|ω2) dx = 0.05
Dividing through by P(ω2) = 0.4 shows that x* is a quantile of the N(1, 1) distribution, i.e.,
P(x ≤ x* | ω2) = 0.05/0.4 = 0.125
In MATLAB:
xstar = norminv(0.05/0.4, 1, 1);
which gives x* ≈ -0.15.
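As a sanity check (a minimal sketch assuming the Statistics Toolbox), the false-alarm integral can be evaluated in closed form to confirm the threshold:
% P(FA) = P(w2) * P(x <= xstar | w2); should return 0.0500
pfa = 0.4 * normcdf(xstar, 1, 1)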
Evaluating the Classifier
• We need to evaluate the classifier's usefulness by measuring the percentage of observations it correctly classifies
• It is also important to report the probability of false alarms
Evaluating the Classifier
Independent Test Sample
• If the collected sample is large, divide it into a training set and a testing set
• Training set – used to build the classifier
• Testing set – observations in the test set are classified using our classification rule
• Estimated classification rate – the proportion of correctly classified test observations
• A common mistake that novice researchers make is to build a classifier using their sample and then use the same sample for testing
Evaluating the Classifier: Independent Test Sample
Database
• Iris flower data set – introduced by Sir Ronald Aylmer Fisher (1936)
• The dataset consists of 50 samples from each of three species of Iris flowers
• Four features were measured from each sample, i.e., the length and width of the sepal and petal, in centimeters
[Images: Iris setosa, Iris versicolor, Iris virginica]
Evaluating the Classifier: Independent Test Sample
Probability of Correct Classification – Independent Test Sample (Formal Procedure)
• Randomly separate 𝑛 samples into two sets of size 𝑛𝑡𝑟𝑎𝑖𝑛 and 𝑛𝑡𝑒𝑠𝑡, where 𝑛𝑡𝑟𝑎𝑖𝑛 + 𝑛𝑡𝑒𝑠𝑡 = 𝑛
• Build the classifier (e.g., Bayes Decision Rule) using the training set
• Present each pattern from the test set to the classifier and obtain a class label for it. Since we know the correct class label for these observations beforehand, we can count the number of patterns (𝑁𝑐𝑐) correctly classified
• Probability of correct classification is P(CC) = Ncc / ntest
Evaluating the Classifier: Independent Test Sample
MATLAB illustration (consider only the two species that are hard to separate, i.e., Iris versicolor and Iris virginica)
• Randomly separate n samples into two sets of size ntrain and ntest, where ntrain + ntest = n
% Load data
load iris
% Get data for the training and testing sets.
% Use only the first two features.
% (For simplicity the split here is deterministic: odd-numbered rows
% for training, even-numbered rows for testing, rather than random.)
indtrain = 1:2:50;
indtest = 2:2:50;
versitrain = versicolor(indtrain,1:2);
versitest = versicolor(indtest,1:2);
virgitrain = virginica(indtrain,1:2);
virgitest = virginica(indtest,1:2);
Evaluating the Classifier: Independent Test Sample
• Build the classifier (e.g., Bayes Decision Rule) using the training set; assume a multivariate normal model for these data
% Estimate the mean and covariance of each class from the training set
muver = mean(versitrain);
covver = cov(versitrain);
muvir = mean(virgitrain);
covvir = cov(virgitrain);
Evaluating the Classifier: Independent Test Sample
• Present each pattern from the test set to the classifier and obtain a class label for it. Since we know the correct class label for these observations beforehand, we can count the number of patterns (Ncc) correctly classified
• Use equal priors
% Put all of the test data into one matrix;
% rows 1-25 are versicolor, rows 26-50 are virginica.
X = [versitest; virgitest];
% Class-conditional densities p(x | versicolor) at the test points.
pxgver = csevalnorm(X, muver, covver);
% Class-conditional densities p(x | virginica) at the test points.
pxgvir = csevalnorm(X, muvir, covvir);
% With equal priors, the Bayes rule reduces to comparing the
% class-conditional densities. Count the correctly classified patterns.
ind = find(pxgver(1:25) > pxgvir(1:25));
ncc = length(ind);
ind = find(pxgvir(26:50) > pxgver(26:50));
ncc = ncc + length(ind);
pcc = ncc/50
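Note: csevalnorm comes from the Computational Statistics Toolbox that accompanies Martinez and Martinez. If that toolbox is not on the path, the Statistics Toolbox function mvnpdf evaluates the same multivariate normal density (an assumed equivalent, not the slides' original code):
% Equivalent density evaluation with the Statistics Toolbox
pxgver = mvnpdf(X, muver, covver);
pxgvir = mvnpdf(X, muvir, covvir);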
Evaluating the Classifier
Cross-validation
• Systematically partition the data into training and testing sets
• n − k observations are used to build the classifier, and the remaining k patterns are used to test it
Evaluating the Classifier: Cross-validation
Cross-validation (Formal Procedure) at k = 1 (also known as the leave-one-out method)
1. Set the number of correctly classified to 0, i.e., NCC = 0
2. Keep out one observation; call it xi
3. Build the classifier using the remaining n − 1 observations
4. Present the observation xi to the classifier and obtain a class label using the classifier from the previous step
5. If the class label is correct, increment NCC, i.e., NCC = NCC + 1
6. Repeat steps 2-5 for each pattern in the sample
• Probability of correct classification is P(CC) = NCC / n
Evaluating the Classifier: Cross-validation
MATLAB Illustration
• Use iris data and estimate the probability of correct classification
• Use cross-validation with k = 1
• Use versicolor and virginica only
• Equal priors
• Use first two features only
• Build the classifier (e.g., Bayes Decision Rule) using the training set; assume a multivariate normal model for these data
Evaluating the Classifier: Cross-validation
% Load data
load iris
% Initialize the count of correctly classified patterns
ncc = 0;
% Use only the first two (sepal) features
virginica(:,3:4) = [];
versicolor(:,3:4) = [];
% Sample sizes
[nver,d] = size(versicolor);
[nvir,d] = size(virginica);
n = nvir + nver;
Evaluating the Classifier: Cross-validation
% Loop first through all of the patterns corresponding to versicolor.
% The virginica parameters do not change during this loop.
muvir = mean(virginica);
covvir = cov(virginica);
for i = 1:nver
    % Take out the i-th observation as the testing point.
    versitrain = versicolor;
    x = versitrain(i,:);
    % Delete it from the training set;
    % the remaining n-1 observations form the training set.
    versitrain(i,:) = [];
    muver = mean(versitrain);
    covver = cov(versitrain);
    pxgver = csevalnorm(x,muver,covver);
    pxgvir = csevalnorm(x,muvir,covvir);
    if pxgver > pxgvir
        % then we correctly classified it
        ncc = ncc + 1;
    end
end
Evaluating the Classifier: Cross-validation
% Loop through all of the patterns of virginica.
% The versicolor parameters do not change during this loop.
muver = mean(versicolor);
covver = cov(versicolor);
for i = 1:nvir
    % Take out the i-th observation as the testing point.
    virtrain = virginica;
    x = virtrain(i,:);
    % Delete the test point from the training set.
    virtrain(i,:) = [];
    muvir = mean(virtrain);
    covvir = cov(virtrain);
    pxgver = csevalnorm(x,muver,covver);
    pxgvir = csevalnorm(x,muvir,covvir);
    if pxgvir > pxgver
        % then we correctly classified it
        ncc = ncc + 1;
    end
end
pcc = ncc/n
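For k > 1 (as in Homework #2) the same pattern applies: hold out k observations at a time, train on the remaining n − k, and classify the held-out patterns. A minimal sketch of the partition loop (a hedged skeleton only; the parameter estimation and counting steps are as above):
k = 2;
for i = 1:k:n
    % Indices of the k held-out test patterns
    testidx = i:min(i+k-1, n);
    % The remaining n-k observations form the training set
    trainidx = setdiff(1:n, testidx);
    % ... estimate means/covariances from the trainidx rows,
    % classify the testidx rows, and update ncc as before ...
end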
Homework #2
(A)
• Use iris data and estimate the probability of correct classification
• Use cross-validation with k = 2
• Use versicolor and virginica only
• Equal priors
• Use first two features only
• Build the classifier (e.g., Bayes Decision Rule) using the training set; assume a multivariate normal model for these data
(B)
• Use iris data and estimate the probability of correct classification
• Use cross-validation with k = 2
• Use versicolor and virginica only
• Equal priors
• Use all four features
• Build the classifier (e.g., Bayes Decision Rule) using the training set; assume a multivariate normal model for these data
Future topics
• Receiver Operating Characteristics (ROCs)
• Face Detection in Color Images using Skin Models
References
R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, 2nd edition, John Wiley & Sons, Inc., 2000.
S. Aksoy, CS 551 (Pattern Recognition) Course Website, http://www.cs.bilkent.edu.tr/~saksoy/courses/cs551-Spring2010/index.html
W. Martinez and A. Martinez, Computational Statistics Handbook with MATLAB, 2nd edition, Chapman and Hall/CRC, 2007.