recognition: a machine learning approach
DESCRIPTION
Recognition: A machine learning approach. Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba , Kristen Grauman , and Derek Hoiem. The machine learning framework. Apply a prediction function to a feature representation of the image to get the desired output: f( ) = “apple” - PowerPoint PPT PresentationTRANSCRIPT
![Page 1: Recognition: A machine learning approach](https://reader036.vdocuments.us/reader036/viewer/2022062407/56812a44550346895d8d71da/html5/thumbnails/1.jpg)
Recognition: A machine learning approach
Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, Kristen Grauman, and Derek Hoiem
![Page 2: Recognition: A machine learning approach](https://reader036.vdocuments.us/reader036/viewer/2022062407/56812a44550346895d8d71da/html5/thumbnails/2.jpg)
The machine learning framework
• Apply a prediction function to a feature representation of the image to get the desired output:
f( ) = “apple”
f( ) = “tomato”
f( ) = “cow”
![Page 3: Recognition: A machine learning approach](https://reader036.vdocuments.us/reader036/viewer/2022062407/56812a44550346895d8d71da/html5/thumbnails/3.jpg)
The machine learning framework
y = f(x)
• Training: given a training set of labeled examples {(x1,y1), …, (xN,yN)}, estimate the prediction function f by minimizing the prediction error on the training set
• Testing: apply f to a never before seen test example x and output the predicted value y = f(x)
output prediction function
Image feature
![Page 4: Recognition: A machine learning approach](https://reader036.vdocuments.us/reader036/viewer/2022062407/56812a44550346895d8d71da/html5/thumbnails/4.jpg)
Prediction
StepsTraining Labels
Training Images
Training
Training
Image Features
Image Features
Testing
Test Image
Learned model
Learned model
Slide credit: D. Hoiem
![Page 5: Recognition: A machine learning approach](https://reader036.vdocuments.us/reader036/viewer/2022062407/56812a44550346895d8d71da/html5/thumbnails/5.jpg)
Features
• Raw pixels
• Histograms
• GIST descriptors
• …
![Page 6: Recognition: A machine learning approach](https://reader036.vdocuments.us/reader036/viewer/2022062407/56812a44550346895d8d71da/html5/thumbnails/6.jpg)
Classifiers: Nearest neighbor
f(x) = label of the training example nearest to x
• All we need is a distance function for our inputs• No training required!
Test example
Training examples
from class 1
Training examples
from class 2
![Page 7: Recognition: A machine learning approach](https://reader036.vdocuments.us/reader036/viewer/2022062407/56812a44550346895d8d71da/html5/thumbnails/7.jpg)
Classifiers: Linear
• Find a linear function to separate the classes:
f(x) = sgn(w x + b)
![Page 8: Recognition: A machine learning approach](https://reader036.vdocuments.us/reader036/viewer/2022062407/56812a44550346895d8d71da/html5/thumbnails/8.jpg)
• Images in the training set must be annotated with the “correct answer” that the model is expected to produce
Contains a motorbike
Recognition task and supervision
![Page 9: Recognition: A machine learning approach](https://reader036.vdocuments.us/reader036/viewer/2022062407/56812a44550346895d8d71da/html5/thumbnails/9.jpg)
Unsupervised “Weakly” supervised Fully supervised
Definition depends on task
![Page 10: Recognition: A machine learning approach](https://reader036.vdocuments.us/reader036/viewer/2022062407/56812a44550346895d8d71da/html5/thumbnails/10.jpg)
Generalization
• How well does a learned model generalize from the data it was trained on to a new test set?
Training set (labels known) Test set (labels unknown)
![Page 11: Recognition: A machine learning approach](https://reader036.vdocuments.us/reader036/viewer/2022062407/56812a44550346895d8d71da/html5/thumbnails/11.jpg)
Generalization• Components of generalization error
– Bias: how much the average model over all training sets differ from the true model?• Error due to inaccurate assumptions/simplifications made by
the model– Variance: how much models estimated from different training
sets differ from each other
• Underfitting: model is too “simple” to represent all the relevant class characteristics– High bias and low variance– High training error and high test error
• Overfitting: model is too “complex” and fits irrelevant characteristics (noise) in the data– Low bias and high variance– Low training error and high test error
![Page 12: Recognition: A machine learning approach](https://reader036.vdocuments.us/reader036/viewer/2022062407/56812a44550346895d8d71da/html5/thumbnails/12.jpg)
Bias-variance tradeoff
Training error
Test error
Slide credit: D. Hoiem
Underfitting Overfitting
Complexity Low BiasHigh Variance
High BiasLow Variance
Err
or
![Page 13: Recognition: A machine learning approach](https://reader036.vdocuments.us/reader036/viewer/2022062407/56812a44550346895d8d71da/html5/thumbnails/13.jpg)
Bias-variance tradeoff
Many training examples
Few training examples
Complexity Low BiasHigh Variance
High BiasLow Variance
Test
Err
or
Slide credit: D. Hoiem
![Page 14: Recognition: A machine learning approach](https://reader036.vdocuments.us/reader036/viewer/2022062407/56812a44550346895d8d71da/html5/thumbnails/14.jpg)
Effect of Training Size
Testing
Training
Generalization Error
Slide credit: D. Hoiem
Number of Training Examples
Err
or
Fixed prediction model
![Page 15: Recognition: A machine learning approach](https://reader036.vdocuments.us/reader036/viewer/2022062407/56812a44550346895d8d71da/html5/thumbnails/15.jpg)
Datasets
• Circa 2001: 5 categories, 100s of images per category
• Circa 2004: 101 categories• Today: up to thousands of categories, millions of
images
![Page 16: Recognition: A machine learning approach](https://reader036.vdocuments.us/reader036/viewer/2022062407/56812a44550346895d8d71da/html5/thumbnails/16.jpg)
Caltech 101 & 256
Griffin, Holub, Perona, 2007
Fei-Fei, Fergus, Perona, 2004
http://www.vision.caltech.edu/Image_Datasets/Caltech101/http://www.vision.caltech.edu/Image_Datasets/Caltech256/
![Page 17: Recognition: A machine learning approach](https://reader036.vdocuments.us/reader036/viewer/2022062407/56812a44550346895d8d71da/html5/thumbnails/17.jpg)
Caltech-101: Intraclass variability
![Page 18: Recognition: A machine learning approach](https://reader036.vdocuments.us/reader036/viewer/2022062407/56812a44550346895d8d71da/html5/thumbnails/18.jpg)
The PASCAL Visual Object Classes Challenge (2005-present)
Challenge classes:Person: person Animal: bird, cat, cow, dog, horse, sheep Vehicle: aeroplane, bicycle, boat, bus, car, motorbike, train Indoor: bottle, chair, dining table, potted plant, sofa, tv/monitor
http://pascallin.ecs.soton.ac.uk/challenges/VOC/
![Page 19: Recognition: A machine learning approach](https://reader036.vdocuments.us/reader036/viewer/2022062407/56812a44550346895d8d71da/html5/thumbnails/19.jpg)
The PASCAL Visual Object Classes Challenge (2005-present)
• Main competitions– Classification: For each of the twenty classes,
predicting presence/absence of an example of that class in the test image
– Detection: Predicting the bounding box and label of each object from the twenty target classes in the test image
http://pascallin.ecs.soton.ac.uk/challenges/VOC/
![Page 20: Recognition: A machine learning approach](https://reader036.vdocuments.us/reader036/viewer/2022062407/56812a44550346895d8d71da/html5/thumbnails/20.jpg)
The PASCAL Visual Object Classes Challenge (2005-present)
• “Taster” challenges– Segmentation:
Generating pixel-wise segmentations giving the class of the object visible at each pixel, or "background" otherwise
– Person layout: Predicting the bounding box and label of each part of a person (head, hands, feet)
http://pascallin.ecs.soton.ac.uk/challenges/VOC/
![Page 21: Recognition: A machine learning approach](https://reader036.vdocuments.us/reader036/viewer/2022062407/56812a44550346895d8d71da/html5/thumbnails/21.jpg)
The PASCAL Visual Object Classes Challenge (2005-present)
• “Taster” challenges– Action classification
http://pascallin.ecs.soton.ac.uk/challenges/VOC/
![Page 22: Recognition: A machine learning approach](https://reader036.vdocuments.us/reader036/viewer/2022062407/56812a44550346895d8d71da/html5/thumbnails/22.jpg)
Russell, Torralba, Murphy, Freeman, 2008
LabelMehttp://labelme.csail.mit.edu/
![Page 23: Recognition: A machine learning approach](https://reader036.vdocuments.us/reader036/viewer/2022062407/56812a44550346895d8d71da/html5/thumbnails/23.jpg)
80 Million Tiny Imageshttp://people.csail.mit.edu/torralba/tinyimages/