-
Machine Learning using Matlab
Lecture 4 Multiclass logistic classification
-
About projects
● If you are not sure about your project, you can still talk with me.
● If it is too difficult to implement, you may reduce the difficulty.
● For a group project, the ideal approach is to collect data together, apply different ML models to the data, and then compare their performance.
-
Outline
● One-vs-all
● Softmax regression
● Regularized softmax regression
● How to design a pedestrian detection system
● Introduction of neural network
-
Multiclass Classification
● In our training set, the label can now take one of $K$ values: $y^{(i)} \in \{1, 2, \ldots, K\}$
● Examples
○ Email classification: spam (1), personal email (2), work-related email (3), etc.
○ Object classification: car (1), motorcycle (2), truck (3), …
○ Handwritten digit recognition: 0 (1), 1 (2), …, 9 (10)
-
One-vs-all (one-vs-rest)
[Figure: examples of Class 1, Class 2, and Class 3 plotted in the x1-x2 plane]
-
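One-vs-all (one-vs-rest)
● Train a logistic regression classifier $h_\theta^{(k)}(x)$ for each class $k$ to predict the probability that $y = k$, i.e., $P(y = k \mid x; \theta)$.
● To make a prediction on a new input $x$, pick the class $k$ whose classifier gives the maximum output $h_\theta^{(k)}(x)$.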
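
The prediction rule above can be sketched in Python as follows. This is a minimal illustration, not the lecture's code: the function names and the three parameter vectors are hypothetical, and each classifier is assumed to be already trained, with the bias stored as the first entry.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_one_vs_all(thetas, x):
    """thetas: list of K parameter vectors, each of length n+1 (bias first).
    x: list of n feature values. Returns (predicted label in 1..K, all K outputs)."""
    x_aug = [1.0] + list(x)  # prepend the constant feature for the bias
    outputs = [sigmoid(sum(t * xi for t, xi in zip(theta, x_aug)))
               for theta in thetas]
    best = max(range(len(outputs)), key=lambda k: outputs[k])
    return best + 1, outputs  # labels are 1-based, as on the slides

# Hypothetical parameters for K = 3 classes and n = 2 features
thetas = [[0.0, 2.0, -1.0],
          [0.0, -1.0, 2.0],
          [1.0, -2.0, -2.0]]
label, outputs = predict_one_vs_all(thetas, [3.0, 0.5])
print(label, outputs)
```

Note that the K outputs come from independent classifiers, so they are not required to sum to 1.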
-
Softmax regression
● Also called “multinomial logistic regression”
● Instead of learning a separate binary classifier for each class, softmax regression learns a single classifier for all classes jointly
-
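Hypothesis of softmax regression
$$h_\theta(x) = \begin{bmatrix} P(y = 1 \mid x; \theta) \\ \vdots \\ P(y = K \mid x; \theta) \end{bmatrix} = \frac{1}{\sum_{j=1}^{K} \exp(\theta_j^T x)} \begin{bmatrix} \exp(\theta_1^T x) \\ \vdots \\ \exp(\theta_K^T x) \end{bmatrix}$$
Normalization term: $\frac{1}{\sum_{j=1}^{K} \exp(\theta_j^T x)}$, which makes the $K$ outputs sum to 1
Question: what is the size of $\theta$?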
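
A minimal Python sketch of the softmax hypothesis, assuming the parameters are stored as one row per class with the bias first. The function name and the example parameters are hypothetical. Subtracting the largest score before exponentiating is a common numerical safeguard that does not change the result, because the same factor cancels in the numerator and the normalization term.

```python
import math

def softmax_hypothesis(theta, x):
    """theta: list of K rows, one per class (bias first). x: list of feature values.
    Returns the K class probabilities."""
    x_aug = [1.0] + list(x)  # prepend the constant feature for the bias
    scores = [sum(t * xi for t, xi in zip(row, x_aug)) for row in theta]
    shift = max(scores)  # cancels out after normalization; avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    norm = sum(exps)  # the normalization term
    return [e / norm for e in exps]

# Hypothetical parameters for K = 3 classes and 2 features
theta = [[0.0, 1.0, 0.0],
         [0.0, 0.0, 1.0],
         [0.5, -1.0, -1.0]]
probs = softmax_hypothesis(theta, [2.0, 1.0])
print(probs, sum(probs))
```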
-
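Cost function
● Indicator function: $1\{\text{a true statement}\} = 1$ and $1\{\text{a false statement}\} = 0$
● The cost function of binary logistic regression can be rewritten as:
$$J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \sum_{k=0}^{1} 1\{y^{(i)} = k\} \log P(y^{(i)} = k \mid x^{(i)}; \theta)$$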
-
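Cost function of softmax regression
$$J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \sum_{k=1}^{K} 1\{y^{(i)} = k\} \log \frac{\exp(\theta_k^T x^{(i)})}{\sum_{j=1}^{K} \exp(\theta_j^T x^{(i)})}$$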
-
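Gradient of softmax regression
$$\nabla_{\theta_k} J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} x^{(i)} \left( 1\{y^{(i)} = k\} - P(y^{(i)} = k \mid x^{(i)}; \theta) \right)$$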
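
The softmax cost and its gradient can be sketched in Python as follows. This is a sketch under stated assumptions: the cost is taken to be the standard negative log-likelihood averaged over the m training examples, labels run from 1 to K, and every input row already contains the leading constant 1. All names are hypothetical.

```python
import math

def softmax_probs(theta, x_aug):
    scores = [sum(t * xi for t, xi in zip(row, x_aug)) for row in theta]
    shift = max(scores)  # numerical safeguard; cancels after normalization
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_cost_and_grad(theta, X, y):
    """theta: K rows of length n+1. X: m rows of length n+1 (leading 1.0 included).
    y: m labels in 1..K. Returns (cost, gradient with the same shape as theta)."""
    m, K = len(X), len(theta)
    cost = 0.0
    grad = [[0.0] * len(theta[0]) for _ in range(K)]
    for x_aug, label in zip(X, y):
        p = softmax_probs(theta, x_aug)
        cost -= math.log(p[label - 1])  # only the true class contributes
        for k in range(K):
            indicator = 1.0 if label == k + 1 else 0.0
            for j, xj in enumerate(x_aug):
                grad[k][j] -= xj * (indicator - p[k])
    return cost / m, [[g / m for g in row] for row in grad]

# Hypothetical toy data: m = 4 examples, n = 2 features, K = 3 classes
X = [[1.0, 0.5, 1.5], [1.0, -1.0, 2.0], [1.0, 3.0, 0.0], [1.0, 2.0, -1.0]]
y = [1, 2, 3, 3]
theta0 = [[0.0] * 3 for _ in range(3)]
cost0, grad0 = softmax_cost_and_grad(theta0, X, y)
print(cost0)  # with all-zero parameters every class has probability 1/K
```

A useful sanity check when implementing this yourself is to compare the analytic gradient with a finite-difference estimate of the cost.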
-
Softmax regression properties
● Softmax regression has a “redundant” set of parameters: subtracting the same vector from every $\theta_k$ leaves the predicted probabilities unchanged.
● Conclusion: instead of training (n+1)×K parameters, you may only need to train (n+1)×(K-1) parameters.
● When K = 2, softmax regression reduces to logistic regression.
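
Both properties can be checked numerically. The Python sketch below uses hypothetical parameters for K = 2: it subtracts the second parameter vector from both rows, so that one row becomes all zeros, and compares the result with the logistic sigmoid. It is an illustration, not a proof.

```python
import math

def softmax(scores):
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def class_scores(theta, x_aug):
    return [sum(t * xi for t, xi in zip(row, x_aug)) for row in theta]

def subtract_from_all(theta, psi):
    """Subtract the same vector psi from every row of theta."""
    return [[t - p for t, p in zip(row, psi)] for row in theta]

theta = [[0.2, -1.0, 0.5], [1.0, 0.3, -0.7]]  # hypothetical parameters, K = 2
x_aug = [1.0, 2.0, -1.0]                      # input with leading constant 1

original = softmax(class_scores(theta, x_aug))

# Choose psi = theta_2, so the second row becomes all zeros
reduced_theta = subtract_from_all(theta, theta[1])
reduced = softmax(class_scores(reduced_theta, x_aug))

# With K = 2, P(y = 1 | x) is the sigmoid of (theta_1 - theta_2)^T x
z = sum(t * xi for t, xi in zip(reduced_theta[0], x_aug))
logistic = 1.0 / (1.0 + math.exp(-z))

print(original, reduced, logistic)
```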
-
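Regularized softmax regression
● Cost function: the softmax cost plus a penalty on the size of the parameters
● Gradient: the softmax gradient plus the derivative of the penalty term
Note: the index over the components of theta in the penalty should start from 1, not 0, so that the bias term is not regularized!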
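
A minimal Python sketch of how the penalty could be added to an already computed cost and gradient. The exact form of the penalty is an assumption here: a squared (L2) penalty scaled by lambda/2, summed over every parameter except the bias at index 0. Other scalings, such as lambda/(2m), are also common. The function name is hypothetical.

```python
def add_l2_penalty(cost, grad, theta, lam):
    """cost, grad: unregularized softmax cost and gradient.
    theta: K rows with the bias at index 0. lam: regularization strength.
    Assumed penalty: (lam / 2) * sum of theta[k][j]**2 for j >= 1."""
    penalty = 0.5 * lam * sum(t * t for row in theta for t in row[1:])
    new_grad = [
        [g if j == 0 else g + lam * t  # index 0 (bias) is left unregularized
         for j, (g, t) in enumerate(zip(g_row, t_row))]
        for g_row, t_row in zip(grad, theta)
    ]
    return cost + penalty, new_grad

# Hypothetical values: K = 2 classes, bias plus 2 features
theta = [[1.0, 2.0, -1.0], [0.5, 0.0, 3.0]]
grad = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
reg_cost, reg_grad = add_l2_penalty(0.0, grad, theta, lam=0.1)
print(reg_cost, reg_grad)
```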
-
One-vs-all vs. Softmax regression
● Check whether the classes you want to separate are mutually exclusive.
○ If they are mutually exclusive, choose softmax regression.
○ If they are not mutually exclusive, choose one-vs-all.
● Example: Suppose you are working on a music classification application, and there are K types of music that you are trying to recognize.
○ If your four classes are classical, country, rock, and jazz, you should use softmax regression.
○ If your categories are vocals, dance, soundtrack, and pop, it is more appropriate to use one-vs-all.
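
The difference shows up in the outputs. In the Python sketch below, the raw scores are hypothetical and the 0.5 threshold is an assumption for illustration: softmax spreads a total probability of 1 across the classes, whereas independent logistic classifiers can each exceed the threshold, so one example can carry several labels.

```python
import math

def softmax(scores):
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def independent_sigmoids(scores):
    return [1.0 / (1.0 + math.exp(-s)) for s in scores]

scores = [2.0, 1.5, -3.0]  # hypothetical raw scores for three labels

exclusive = softmax(scores)             # probabilities compete and sum to 1
multi = independent_sigmoids(scores)    # each label is judged on its own

# Assumed decision rule for the non-exclusive case: keep every label above 0.5
active_labels = [k + 1 for k, p in enumerate(multi) if p > 0.5]
print(exclusive, multi, active_labels)
```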
-
Question: one-vs-all or softmax regression
● Would you use softmax regression or three logistic regression classifiers in the following examples?
○ Suppose that your classes are indoor_scene, outdoor_urban_scene, and outdoor_wilderness_scene. Answer: Softmax
○ Suppose your classes are indoor_scene, black_and_white_image, and image_has_people. Answer: One-vs-all
-
How to design a pedestrian detection system
● Framework
● Data collection and annotation
● Machine Learning model
● Test
-
Pedestrian detection - Challenges
● Wide variation among pedestrians
● Different scales of pedestrians in different images
-
Pedestrian detection system
Positive examples
Negative examples
-
Framework
Training: Training set → HOG feature [1] → SVM → Hypothesis
Testing: Test set → HOG feature → Hypothesis → Prediction
[1] Dalal, Navneet, and Bill Triggs. "Histograms of oriented gradients for human detection." IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Vol. 1. IEEE, 2005. (18,000+ citations)
-
Data collection
● Available dataset from the HOG paper (http://pascal.inrialpes.fr/data/human/)
● There are more than 2,000 positive examples and more than 400 pedestrian-free images.
● Each example is cropped to a 128×64 patch.
● HOG features are extracted with the MATLAB Computer Vision Toolbox.
● For each example, the HOG feature has 3780 dimensions.
● Normalization or not?
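
The 3780 figure can be reproduced with a short calculation. The Python sketch below assumes commonly used HOG settings that the slides do not state: 8×8-pixel cells, 2×2-cell blocks, a block stride of one cell, and 9 orientation bins. The function name is hypothetical.

```python
def hog_feature_length(height, width, cell=8, block=2, stride_cells=1, bins=9):
    """Length of a HOG descriptor for a height x width patch.
    cell: cell side in pixels. block: block side in cells.
    stride_cells: block step in cells. bins: orientation bins per cell."""
    cells_y, cells_x = height // cell, width // cell
    blocks_y = (cells_y - block) // stride_cells + 1
    blocks_x = (cells_x - block) // stride_cells + 1
    # each block contributes block*block cell histograms of `bins` values
    return blocks_y * blocks_x * block * block * bins

print(hog_feature_length(128, 64))  # 15 * 7 blocks * 4 cells * 9 bins = 3780
```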
-
Suggestions on data collection
● Variety
● Normalized to the same scale
● As many examples as possible
-
Support Vector Machine (SVM)
● Here I use LIBSVM (https://www.csie.ntu.edu.tw/~cjlin/libsvm/), which provides a number of SVM models (such as regression, binary classification, and multiclass classification).
● A linear SVM is applied with the default parameter settings.
● You will learn SVM later in the course.
● Other models? Why not!
-
Detect a pedestrian in an image
● Now that you have trained a pedestrian detection classifier, how do you detect pedestrians in an image?
● Remaining challenge: different scales!
● Any idea?
-
Pyramid
[Figure: image pyramid with Level 1, Level 2, and Level 3]
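
An image pyramid handles scale by shrinking the image repeatedly and running the same fixed-size detector on every level. The Python sketch below only computes the level sizes. The scale factor of 1.25 is a hypothetical choice, and the stopping rule assumes the detector window is 128×64, as in the training patches.

```python
def pyramid_sizes(height, width, scale=1.25, min_height=128, min_width=64):
    """Return the (height, width) of each pyramid level, largest first.
    Levels are generated until the image is smaller than the detector window."""
    sizes = []
    h, w = float(height), float(width)
    while int(h) >= min_height and int(w) >= min_width:
        sizes.append((int(h), int(w)))
        h, w = h / scale, w / scale  # next level is smaller by the scale factor
    return sizes

for level, size in enumerate(pyramid_sizes(480, 640), start=1):
    print("Level", level, size)
```

A smaller scale factor gives more levels and finer scale coverage at a higher computational cost.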
-
Sliding window
https://docs.google.com/file/d/0B_QS8iEGmlDRZ0RjeENyTDdLZFk/preview
https://docs.google.com/file/d/0B_QS8iEGmlDRRFk4cU1vUDNDd3M/preview
https://docs.google.com/file/d/0B_QS8iEGmlDRak9zV2pDM3BwaDQ/preview
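
A sliding window enumerates every position of a fixed-size window over the image, and each window is then passed to the classifier. A minimal Python sketch, assuming a 128×64 window to match the training patches and a hypothetical stride of 8 pixels:

```python
def sliding_windows(height, width, win_h=128, win_w=64, stride=8):
    """Yield (top, left, win_h, win_w) for every window that fits in the image."""
    for top in range(0, height - win_h + 1, stride):
        for left in range(0, width - win_w + 1, stride):
            yield top, left, win_h, win_w

windows = list(sliding_windows(144, 80))
print(len(windows), windows[0], windows[-1])
```

The stride controls the overlap between neighbouring windows: a smaller stride means more overlap and more windows to classify.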
-
Pyramid and sliding window - questions
● How many pyramid levels?
● How much overlap between sliding windows?
-
Non-maximum suppression
-
Non-maximum suppression
● Step 1: Sort the bounding boxes in descending order of predicted output.
● Step 2: Greedily select the highest-scoring boxes while skipping detections whose bounding boxes are at least 50% covered by the bounding box of a previously selected detection.
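
The two steps can be sketched in Python as follows. Assumptions in this sketch: boxes are (x1, y1, x2, y2) tuples with positive area, and "covered" is measured as the intersection area divided by the area of the candidate box. The function names and example boxes are hypothetical.

```python
def intersection_area(a, b):
    """Overlap area of two boxes given as (x1, y1, x2, y2)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, w) * max(0.0, h)

def box_area(b):
    return (b[2] - b[0]) * (b[3] - b[1])

def non_maximum_suppression(boxes, scores, max_cover=0.5):
    """Return indices of the kept boxes, highest score first."""
    # Step 1: sort by predicted output, descending
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    # Step 2: greedily keep a box unless a kept box covers at least max_cover of it
    for i in order:
        covered = any(
            intersection_area(boxes[i], boxes[k]) / box_area(boxes[i]) >= max_cover
            for k in kept
        )
        if not covered:
            kept.append(i)
    return kept

boxes = [(0, 0, 64, 128), (8, 8, 72, 136), (200, 0, 264, 128)]
scores = [0.9, 0.8, 0.7]
print(non_maximum_suppression(boxes, scores))  # the second box is suppressed
```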
-
Pedestrian detection - a more robust classifier
[Figure: positive examples and negative examples in the x1-x2 feature space]
-
Pedestrian detection - a more robust classifier
[Figure: positive examples, negative examples, and false positive examples in the x1-x2 feature space; the false positives are the hard examples]
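
The figure labels suggest the usual way of using hard examples: run the trained classifier on windows taken from pedestrian-free images, collect the false positives, add them to the negative training set, and retrain. The Python sketch below shows only the collection step. The linear scoring function, the threshold of 0, and the feature values are all hypothetical stand-ins for a trained classifier and real HOG features.

```python
def mine_hard_negatives(score_fn, candidate_features, threshold=0.0):
    """candidate_features: feature vectors from pedestrian-free images.
    Any candidate scored above the threshold is a false positive (hard example)."""
    return [f for f in candidate_features if score_fn(f) > threshold]

def toy_score(feature):
    # Hypothetical linear classifier standing in for the trained model
    return 1.0 * feature[0] - 0.5 * feature[1]

training_negatives = [[-2.0, 1.0], [-1.5, 0.5]]
candidates = [[0.2, 1.0], [2.0, 0.5], [-1.0, 0.0], [1.0, 1.0]]

hard = mine_hard_negatives(toy_score, candidates)
augmented_negatives = training_negatives + hard  # retrain on this enlarged set
print(hard)
```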
-
Pedestrian detection - a more robust classifier
[Figure: positive examples and negative examples in the x1-x2 feature space]
-
Experimental results
-
Limitations
● How to compare the performance of your model with others?
● How to improve performance?
○ More data?
○ Parameter tuning?
○ Model selection?
● Will be discussed later in the course.