
Prof. Feng Liu

Winter 2020

http://www.cs.pdx.edu/~fliu/courses/cs410/

02/27/2020

Last Time

Introduction to object recognition


The slides for this topic are adapted from Prof. S. Lazebnik.

Today

Machine learning approach to object recognition

◼ Classifiers

◼ Bag-of-features models


Recognition: A machine learning approach

Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, Kristen Grauman, and Derek Hoiem

The machine learning framework

Apply a prediction function to a feature representation of the image to get the desired output:

f(apple image) = “apple”
f(tomato image) = “tomato”
f(cow image) = “cow”

The machine learning framework

y = f(x), where y is the output, f is the prediction function, and x is the image feature

Training: given a training set of labeled examples {(x1,y1), …, (xN,yN)}, estimate the prediction function f by minimizing the prediction error on the training set

Testing: apply f to a never-before-seen test example x and output the predicted value y = f(x)
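As a toy sketch of this train/test loop (hypothetical 1-D data and a simple threshold rule, not the course's code): training estimates f by minimizing training error, and testing applies the learned f to unseen x.

```python
# Toy 1-D illustration of y = f(x): "train" a threshold classifier
# by minimizing training error, then apply it to unseen examples.

def train_threshold(examples):
    """examples: list of (x, y) with y in {0, 1}. Returns the threshold t
    minimizing training error for the rule f(x) = 1 if x >= t else 0."""
    best_t, best_err = None, float("inf")
    for t in sorted(x for x, _ in examples):
        err = sum(1 for x, y in examples if (1 if x >= t else 0) != y)
        if err < best_err:
            best_t, best_err = t, err
    return best_t

def predict(t, x):
    return 1 if x >= t else 0

# Training set: small x -> class 0, large x -> class 1
train = [(0.1, 0), (0.4, 0), (0.6, 1), (0.9, 1)]
t = train_threshold(train)
print(predict(t, 0.2), predict(t, 0.8))  # prints: 0 1
```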

Steps

Training: training images + labels → image features → training → learned model
Testing: test image → image features → learned model → prediction

Slide credit: D. Hoiem

Features

Raw pixels

Histograms

GIST descriptors
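One of the simplest features listed above, a histogram, can be sketched in a few lines (the 2x2 "image" of gray values in 0..255 below is a made-up toy example):

```python
# Normalized intensity histogram as an image feature.
# `image` is a hypothetical 2-D grid of pixel values in 0..255.

def intensity_histogram(image, bins=8):
    counts = [0] * bins
    n = 0
    for row in image:
        for v in row:
            counts[min(v * bins // 256, bins - 1)] += 1  # bin index
            n += 1
    return [c / n for c in counts]  # normalize so bins sum to 1

img = [[0, 32], [128, 255]]
hist = intensity_histogram(img)
print(hist)  # prints: [0.25, 0.25, 0.0, 0.0, 0.25, 0.0, 0.0, 0.25]
```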

Classifiers: Nearest neighbor

f(x) = label of the training example nearest to x

All we need is a distance function for our inputs

No training required!

(Illustration: a test example classified among training examples from class 1 and class 2.)
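The nearest-neighbor rule above in a few lines (toy 2-D features and made-up class labels):

```python
# Minimal nearest-neighbor classifier: f(x) = label of the closest
# training example. Only a distance function is needed; no training step.

def euclidean(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5

def nn_classify(train, x):
    """train: list of (feature_vector, label)."""
    return min(train, key=lambda ex: euclidean(ex[0], x))[1]

train = [((0.0, 0.0), "class 1"), ((1.0, 1.0), "class 2")]
print(nn_classify(train, (0.2, 0.1)))  # nearest to (0, 0) -> prints: class 1
```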

Classifiers: Linear

Find a linear function to separate the classes:

f(x) = sgn(w · x + b)
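Evaluating this decision rule directly (the weight vector w and bias b below are made-up illustration values, not a trained model):

```python
# The linear decision rule f(x) = sgn(w . x + b).

def sgn(v):
    return 1 if v >= 0 else -1  # sgn(0) taken as +1 here

def linear_classify(w, b, x):
    return sgn(sum(wi * xi for wi, xi in zip(w, x)) + b)

w, b = (1.0, -1.0), 0.5            # hypothetical separating hyperplane
print(linear_classify(w, b, (2.0, 1.0)))  # 2 - 1 + 0.5 = 1.5 -> prints: 1
print(linear_classify(w, b, (0.0, 2.0)))  # 0 - 2 + 0.5 = -1.5 -> prints: -1
```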

Recognition task and supervision

Images in the training set must be annotated with the “correct answer” that the model is expected to produce, e.g. “contains a motorbike”

Spectrum of supervision: unsupervised → “weakly” supervised → fully supervised

Definition depends on task

Generalization

How well does a learned model generalize from

the data it was trained on to a new test set?

Training set (labels known) Test set (labels unknown)

Generalization

Components of generalization error

◼ Bias: how much the average model, over all training sets, differs from the true model (error due to inaccurate assumptions or simplifications made by the model)

◼ Variance: how much models estimated from different training sets differ from each other

Underfitting: the model is too “simple” to represent all the relevant class characteristics

◼ High bias and low variance
◼ High training error and high test error

Overfitting: the model is too “complex” and fits irrelevant characteristics (noise) in the data

◼ Low bias and high variance
◼ Low training error and high test error

Bias-variance tradeoff

(Plot: training and test error vs. model complexity. Low complexity: high bias, low variance, underfitting; high complexity: low bias, high variance, overfitting.)

Slide credit: D. Hoiem

Bias-variance tradeoff

(Plot: test error vs. model complexity, with one curve for many training examples and one for few training examples. Low complexity: high bias, low variance; high complexity: low bias, high variance.)

Slide credit: D. Hoiem

Effect of Training Size

(Plot, for a fixed prediction model: training, testing, and generalization error vs. number of training examples.)

Slide credit: D. Hoiem

Datasets

Circa 2001: 5 categories, 100s of images per

category

Circa 2004: 101 categories

Today: up to thousands of categories, millions

of images

Caltech 101 & 256

Griffin, Holub, Perona, 2007

Fei-Fei, Fergus, Perona, 2004

http://www.vision.caltech.edu/Image_Datasets/Caltech101/
http://www.vision.caltech.edu/Image_Datasets/Caltech256/

Caltech-101: Intraclass variability

The PASCAL Visual Object Classes

Challenge (2005-present)

Challenge classes:
Person: person
Animal: bird, cat, cow, dog, horse, sheep
Vehicle: aeroplane, bicycle, boat, bus, car, motorbike, train
Indoor: bottle, chair, dining table, potted plant, sofa, tv/monitor

http://host.robots.ox.ac.uk/pascal/VOC/

Main competitions

◼ Classification: For each of the twenty classes,

predicting presence/absence of an example of that

class in the test image

◼ Detection: Predicting the bounding box and label of

each object from the twenty target classes in the test

image

The PASCAL Visual Object Classes

Challenge (2005-present)

http://pascallin.ecs.soton.ac.uk/challenges/VOC/

“Taster” challenges

◼ Segmentation:

Generating pixel-wise

segmentations giving

the class of the object

visible at each pixel, or

"background"

otherwise

◼ Person layout:

Predicting the

bounding box and label

of each part of a

person (head, hands,

feet)


“Taster” challenges

◼ Action classification


LabelMe: http://labelme.csail.mit.edu/ (Russell, Torralba, Murphy, Freeman, 2008)

80 Million Tiny Images: http://people.csail.mit.edu/torralba/tinyimages/

ImageNet: http://www.image-net.org/

Today

Machine learning approach to object recognition

◼ Classifiers

◼ Bag-of-features models


Bag-of-features models

Origin 1: Texture recognition

Texture is characterized by the repetition of basic

elements or textons

For stochastic textures, it is the identity of the

textons, not their spatial arrangement, that matters

Julesz, 1981; Cula & Dana, 2001; Leung & Malik 2001; Mori, Belongie & Malik, 2001; Schmid 2001; Varma & Zisserman, 2002, 2003; Lazebnik, Schmid & Ponce, 2003

Origin 1: Texture recognition

(Illustration: a universal texton dictionary, and a texton histogram computed for each texture image.)

Origin 2: Bag-of-words models

Orderless document representation: frequencies of words from a dictionary (Salton & McGill, 1983)

US Presidential Speeches Tag Cloud: http://chir.ag/phernalia/preztags/


Bag-of-features steps

1. Extract features
2. Learn “visual vocabulary”
3. Quantize features using visual vocabulary
4. Represent images by frequencies of “visual words”
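Steps 3 and 4 can be sketched as code: quantize each local descriptor to its nearest visual word, then build a normalized word-frequency histogram. The vocabulary and descriptors below are toy 2-D values; steps 1 and 2 are assumed already done.

```python
# Steps 3-4 of the bag-of-features pipeline. Assumes feature extraction
# and vocabulary learning have already produced `descriptors` and
# `vocabulary`; both are hypothetical toy values here.

def nearest_word(vocabulary, d):
    """Index of the visual word (codevector) closest to descriptor d."""
    return min(range(len(vocabulary)),
               key=lambda k: sum((vi - di) ** 2
                                 for vi, di in zip(vocabulary[k], d)))

def bag_of_words(vocabulary, descriptors):
    hist = [0] * len(vocabulary)
    for d in descriptors:
        hist[nearest_word(vocabulary, d)] += 1  # quantize, then count
    total = sum(hist)
    return [h / total for h in hist]  # frequencies of visual words

vocabulary = [(0.0, 0.0), (1.0, 1.0)]          # 2 visual words
descriptors = [(0.1, 0.0), (0.9, 1.1), (1.0, 0.8), (0.2, 0.1)]
print(bag_of_words(vocabulary, descriptors))   # prints: [0.5, 0.5]
```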

1. Feature extraction

Regular grid or interest regions: detect patches, normalize each patch, and compute a descriptor

Slide credit: Josef Sivic

2. Learning the visual vocabulary

Clustering the extracted descriptors produces the visual vocabulary.

Slide credit: Josef Sivic

K-means clustering

• Want to minimize the sum of squared Euclidean distances between points xi and their nearest cluster centers mk:

D(X, M) = Σ_k Σ_{point i in cluster k} (x_i − m_k)²

Algorithm:
• Randomly initialize K cluster centers
• Iterate until convergence:
◼ Assign each data point to the nearest center
◼ Re-compute each cluster center as the mean of all points assigned to it
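The algorithm above, directly as code (toy 2-D points; `seed` is only for reproducible random initialization):

```python
import random

# K-means: random initialization, then alternate assignment and
# mean update until the assignments stop changing.

def nearest(pt, centers):
    return min(range(len(centers)),
               key=lambda j: sum((p - c) ** 2
                                 for p, c in zip(pt, centers[j])))

def kmeans(points, k, seed=0):
    rng = random.Random(seed)
    centers = rng.sample(points, k)       # randomly initialize K centers
    assignment = None
    while True:
        new = [nearest(pt, centers) for pt in points]
        if new == assignment:             # converged: assignments stable
            return centers, assignment
        assignment = new
        for j in range(k):                # re-compute center as the mean
            members = [pt for pt, a in zip(points, assignment) if a == j]
            if members:                   # keep old center if cluster empties
                centers[j] = tuple(sum(v) / len(members)
                                   for v in zip(*members))

points = [(0.0, 0.0), (0.2, 0.0), (5.0, 5.0), (5.2, 5.0)]
centers, assignment = kmeans(points, 2)
print(sorted(centers))  # one center near (0.1, 0), the other near (5.1, 5)
```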

Clustering and vector quantization

• Clustering is a common method for learning a visual

vocabulary or codebook

◼ Unsupervised learning process

◼ Each cluster center produced by k-means becomes a codevector

◼ Codebook can be learned on separate training set

◼ Provided the training set is sufficiently representative, the

codebook will be “universal”

• The codebook is used for quantizing features

◼ A vector quantizer takes a feature vector and maps it to the index

of the nearest codevector in a codebook

◼ Codebook = visual vocabulary

◼ Codevector = visual word

Example codebooks

(Appearance codebook images. Source: B. Leibe)

(Another example codebook, from Fei-Fei et al. 2005)

Visual vocabularies: Issues

• How to choose vocabulary size?

◼ Too small: visual words not representative of all

patches

◼ Too large: quantization artifacts, overfitting

• Computational efficiency

◼ Vocabulary trees (Nister & Stewenius, 2006)

Spatial pyramid representation

Extension of a bag of features: a locally orderless representation at several levels of resolution (level 0, level 1, level 2)

Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories. Lazebnik, Schmid & Ponce (CVPR 2006)
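A sketch of the idea under simplifying assumptions: each feature is a quantized visual word with a normalized (x, y) position, and the representation concatenates per-cell word histograms at each level (unweighted here; the paper weights levels when matching). All values below are toy examples.

```python
# Two-level spatial pyramid (levels 0 and 1) over quantized features.
# Each feature is ((x, y), word) with x, y in [0, 1).

def cell_histogram(features, vocab_size, x0, y0, x1, y1):
    hist = [0] * vocab_size
    for (x, y), w in features:
        if x0 <= x < x1 and y0 <= y < y1:  # count words inside this cell
            hist[w] += 1
    return hist

def spatial_pyramid(features, vocab_size, levels=2):
    rep = []
    for level in range(levels):
        cells = 2 ** level                 # level 0: 1x1, level 1: 2x2, ...
        for i in range(cells):
            for j in range(cells):
                rep += cell_histogram(features, vocab_size,
                                      i / cells, j / cells,
                                      (i + 1) / cells, (j + 1) / cells)
    return rep

features = [((0.1, 0.1), 0), ((0.9, 0.9), 1)]
rep = spatial_pyramid(features, vocab_size=2)
# level-0 histogram followed by the four level-1 cell histograms:
print(rep)  # prints: [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
```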

Scene category dataset

Multi-class classification results (100 training images per class)

Caltech101 dataset

Multi-class classification results (30 training images per class)

Next Time

More classification

Visual saliency
