deep learning for computer vision: image classification (upc 2016)
TRANSCRIPT
![Page 1: Deep Learning for Computer Vision: Image Classification (UPC 2016)](https://reader031.vdocuments.us/reader031/viewer/2022021815/586f89911a28ab54768b606b/html5/thumbnails/1.jpg)
Day 1 Lecture 2
Classification
[course site]
Eva Mohedano
![Page 2: Deep Learning for Computer Vision: Image Classification (UPC 2016)](https://reader031.vdocuments.us/reader031/viewer/2022021815/586f89911a28ab54768b606b/html5/thumbnails/2.jpg)
Outline
Image Classification
Train/Test Splits
Metrics
Models
![Page 3: Deep Learning for Computer Vision: Image Classification (UPC 2016)](https://reader031.vdocuments.us/reader031/viewer/2022021815/586f89911a28ab54768b606b/html5/thumbnails/3.jpg)
Outline
Image Classification
Train/Test Splits
Metrics
Models
![Page 4: Deep Learning for Computer Vision: Image Classification (UPC 2016)](https://reader031.vdocuments.us/reader031/viewer/2022021815/586f89911a28ab54768b606b/html5/thumbnails/4.jpg)
4
Image ClassificationSet of predefined categories [eg: table, apple, dog, giraffe]Binary classification [1, 0]
DOG
![Page 5: Deep Learning for Computer Vision: Image Classification (UPC 2016)](https://reader031.vdocuments.us/reader031/viewer/2022021815/586f89911a28ab54768b606b/html5/thumbnails/5.jpg)
5
Image Classification
![Page 6: Deep Learning for Computer Vision: Image Classification (UPC 2016)](https://reader031.vdocuments.us/reader031/viewer/2022021815/586f89911a28ab54768b606b/html5/thumbnails/6.jpg)
6
DogImage Classification pipeline
Slide credit: Jose M Àlvarez
![Page 7: Deep Learning for Computer Vision: Image Classification (UPC 2016)](https://reader031.vdocuments.us/reader031/viewer/2022021815/586f89911a28ab54768b606b/html5/thumbnails/7.jpg)
7Slide credit: Jose M Àlvarez
Dog
Learned Representation
Image Classification pipeline
![Page 8: Deep Learning for Computer Vision: Image Classification (UPC 2016)](https://reader031.vdocuments.us/reader031/viewer/2022021815/586f89911a28ab54768b606b/html5/thumbnails/8.jpg)
8
Dog
Learned Representation
Part I: End-to-end learning (E2E)
Image Classification pipeline
Slide credit: Jose M Àlvarez
![Page 9: Deep Learning for Computer Vision: Image Classification (UPC 2016)](https://reader031.vdocuments.us/reader031/viewer/2022021815/586f89911a28ab54768b606b/html5/thumbnails/9.jpg)
9
Image Classification: Example Datasets
training set of 60,000 examples
test set of 10,000 examples
![Page 10: Deep Learning for Computer Vision: Image Classification (UPC 2016)](https://reader031.vdocuments.us/reader031/viewer/2022021815/586f89911a28ab54768b606b/html5/thumbnails/10.jpg)
10
Image Classification: Example Datasets
![Page 11: Deep Learning for Computer Vision: Image Classification (UPC 2016)](https://reader031.vdocuments.us/reader031/viewer/2022021815/586f89911a28ab54768b606b/html5/thumbnails/11.jpg)
Outline
Image Classification
Train/Test Splits
Metrics
Models
![Page 12: Deep Learning for Computer Vision: Image Classification (UPC 2016)](https://reader031.vdocuments.us/reader031/viewer/2022021815/586f89911a28ab54768b606b/html5/thumbnails/12.jpg)
2.1 3.2 4.8 0.1 0.0 2.63.1 1.4 2.5 0.2 1.0 2.01.0 2.3 3.2 9.3 6.4 0.32.0 5.0 3.2 1.0 6.9 9.19.0 3.5 5.4 5.5 3.2 1.0
N training examples (rows)
D features (columns)
01100
N
Training set
![Page 13: Deep Learning for Computer Vision: Image Classification (UPC 2016)](https://reader031.vdocuments.us/reader031/viewer/2022021815/586f89911a28ab54768b606b/html5/thumbnails/13.jpg)
DatasetShuffled
data
shuffleTraining
data(70%)
Test data(30%)
splitLearning algorithm
fit(X, y)
Model
Prediction algorithmpredict(X)
PredictionsCompute error/accuracy
score(X, y)Out-of-sample error estimate
NO!
Train/Test Splits
![Page 14: Deep Learning for Computer Vision: Image Classification (UPC 2016)](https://reader031.vdocuments.us/reader031/viewer/2022021815/586f89911a28ab54768b606b/html5/thumbnails/14.jpg)
Split your dataset into train and test at the very start● Usually good practice to shuffle data (exception: time series)
Do not look at test data (data snooping)!● Lock it away at the start to prevent contamination
NB: Never ever train on the test data!● You have no way to estimate error if you do● Your model could easily overfit the test data and have poor generalization, you have no way of
knowing without test data● Model may fail in production
Data hygiene
![Page 15: Deep Learning for Computer Vision: Image Classification (UPC 2016)](https://reader031.vdocuments.us/reader031/viewer/2022021815/586f89911a28ab54768b606b/html5/thumbnails/15.jpg)
Outline
Image Classification
Train/Test Splits
Metrics
Models
![Page 16: Deep Learning for Computer Vision: Image Classification (UPC 2016)](https://reader031.vdocuments.us/reader031/viewer/2022021815/586f89911a28ab54768b606b/html5/thumbnails/16.jpg)
Confusion matrices provide a by-class comparison between the results of the automatic classifications with ground truth annotations.
Metrics
![Page 17: Deep Learning for Computer Vision: Image Classification (UPC 2016)](https://reader031.vdocuments.us/reader031/viewer/2022021815/586f89911a28ab54768b606b/html5/thumbnails/17.jpg)
MetricsCorrect classifications appear in the diagonal, while the rest of cells correspond to errors.
Prediction
Class 1 Class 2 Class 3
Ground Truth
Class 1 x(1,1) x(1,2) x(1,3)
Class 2 x(2,1) x(2,2) x(2,3)
Class 3 x(3,1) x(3,2) x(3,3)
![Page 18: Deep Learning for Computer Vision: Image Classification (UPC 2016)](https://reader031.vdocuments.us/reader031/viewer/2022021815/586f89911a28ab54768b606b/html5/thumbnails/18.jpg)
Special case: Binary classifiers in terms of “Positive” vs “Negative”.
Prediction
Positives negative
GroundTruth
Positives True positive (TP)
Falsenegative
(FN)
negative False positives (FP)
True negative(TN)
Metrics
![Page 19: Deep Learning for Computer Vision: Image Classification (UPC 2016)](https://reader031.vdocuments.us/reader031/viewer/2022021815/586f89911a28ab54768b606b/html5/thumbnails/19.jpg)
The “accuracy” measures the proportion of correct classifications, not distinguishing between classes.
Binary
Prediction
Class 1 Class 2 Class 3
GroundTruth
Class 1 x(1,1) x(1,2) x(1,3)
Class 2 x(2,1) x(2,2) x(2,3)
Class 3 x(3,1) x(3,2) x(3,3)
Prediction
Positives negative
GroundTruth
Positives True positive (TP)
Falsenegative
(FN)
Negative False positives (FP)
True negative(TN)
Metrics
![Page 20: Deep Learning for Computer Vision: Image Classification (UPC 2016)](https://reader031.vdocuments.us/reader031/viewer/2022021815/586f89911a28ab54768b606b/html5/thumbnails/20.jpg)
Given a reference class, its Precision (P) and Recall (R) are complementary measures of relevance.
Prediction
Positives Negatives
GroundTruth
PositivesTrue
positive (TP)
Falsenegative
(FN)
NegativesFalse
positives (FP)
"Precisionrecall" by Walber - Own work. Licensed under Creative Commons Attribution-Share Alike 4.0 via Wikimedia Commons - http://commons.wikimedia.org/wiki/File:Precisionrecall.svg#mediaviewer/File:Precisionrecall.svg
Example: Relevant class is “Positive” in a binary classifier.
Metrics
![Page 21: Deep Learning for Computer Vision: Image Classification (UPC 2016)](https://reader031.vdocuments.us/reader031/viewer/2022021815/586f89911a28ab54768b606b/html5/thumbnails/21.jpg)
Binary classification results often depend from a parameter (eg. decision threshold) whose value directly impacts precision and recall.
For this reason, in many cases a Receiver Operating Curve (ROC curve) is provided as a result.
True
Pos
itive
Rat
e
Metrics
![Page 22: Deep Learning for Computer Vision: Image Classification (UPC 2016)](https://reader031.vdocuments.us/reader031/viewer/2022021815/586f89911a28ab54768b606b/html5/thumbnails/22.jpg)
Outline
Image Classification
Train/Test Splits
Metrics
Models
![Page 23: Deep Learning for Computer Vision: Image Classification (UPC 2016)](https://reader031.vdocuments.us/reader031/viewer/2022021815/586f89911a28ab54768b606b/html5/thumbnails/23.jpg)
23
DogImage Classification pipeline
Slide credit: Jose M Àlvarez
![Page 24: Deep Learning for Computer Vision: Image Classification (UPC 2016)](https://reader031.vdocuments.us/reader031/viewer/2022021815/586f89911a28ab54768b606b/html5/thumbnails/24.jpg)
Mapping function to predict a score for the class label
Linear Modelsf(x, w) = (wTx + b)
CS231n: Convolutional Neural Networks for Visual Recognition
![Page 25: Deep Learning for Computer Vision: Image Classification (UPC 2016)](https://reader031.vdocuments.us/reader031/viewer/2022021815/586f89911a28ab54768b606b/html5/thumbnails/25.jpg)
Sigmoid
f(x, w) = g(wTx + b)
Activation function: Turn score into probabilitiesLogistic Regression
![Page 26: Deep Learning for Computer Vision: Image Classification (UPC 2016)](https://reader031.vdocuments.us/reader031/viewer/2022021815/586f89911a28ab54768b606b/html5/thumbnails/26.jpg)
Neuron
![Page 27: Deep Learning for Computer Vision: Image Classification (UPC 2016)](https://reader031.vdocuments.us/reader031/viewer/2022021815/586f89911a28ab54768b606b/html5/thumbnails/27.jpg)
Summary
Image classification aims to build a model to automatically predict the content of an image.
When presented as supervised problem, data should be structured in Train/Test splits to learn the model.
Important to have a way to asses the performance of the models
Easiest models classification are linear models. Linear models with a non-linearity on top represent the the “neuron”, the basic element of Neural Networks.