Andrew Ng Image classification by sparse coding

Upload: dorcas-jenkins

Post on 15-Jan-2016


TRANSCRIPT

Page 1: Andrew Ng Image classification by sparse coding. Andrew Ng Feature learning problem Given a 14x14 image patch x, can represent it using 196 real numbers

Andrew Ng

Image classification by sparse coding

Page 2:

Feature learning problem

• Given a 14x14 image patch x, can represent it using 196 real numbers.

• Problem: Can we learn a better representation for this?

Page 3:

Unsupervised feature learning

Given a set of images, learn a better way to represent images than raw pixels.

Page 4:

First stage of visual processing in brain: V1

[Figure panels: “Schematic of simple cell” and “Actual simple cell”. Images from DeAngelis, Ohzawa & Freeman, 1995.]

“Gabor functions.”

The first stage of visual processing in the brain (V1) does “edge detection.”

Page 5:

Learning an image representation

Sparse coding (Olshausen & Field, 1996)

Input: Images $x^{(1)}, x^{(2)}, \dots, x^{(m)}$ (each in $\mathbb{R}^{n \times n}$)

Learn: Dictionary of bases $\phi_1, \dots, \phi_k$ (also in $\mathbb{R}^{n \times n}$), so that each input $x$ can be approximately decomposed as:

$x \approx \sum_{j=1}^{k} a_j \phi_j$

s.t. the $a_j$'s are mostly zero (“sparse”)

Use this to represent a 14x14 image patch succinctly, as $[a_7 = 0.8, a_{36} = 0.3, a_{41} = 0.5]$. I.e., this indicates which “basic edges” make up the image.
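As a rough illustration of the decomposition on this slide, the sketch below builds a patch from three nonzero coefficients. The dictionary here is random and purely hypothetical (real bases would be learned); the names `Phi`, `a`, and `k = 64` are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 14, 64

# Hypothetical dictionary: k random unit-norm "bases", each a flattened n x n patch.
# A learned dictionary would look like edges; random columns are used only for shape.
Phi = rng.standard_normal((n * n, k))
Phi /= np.linalg.norm(Phi, axis=0)

# Sparse coefficient vector with the slide's three nonzero entries (1-based indices).
a = np.zeros(k)
a[7 - 1], a[36 - 1], a[41 - 1] = 0.8, 0.3, 0.5

# x = sum_j a_j * phi_j, reshaped back into a 14 x 14 patch.
x = (Phi @ a).reshape(n, n)

print(np.count_nonzero(a), "nonzero coefficients describe a patch of", x.size, "pixels")
```

The point of the sketch is only the bookkeeping: the patch has 196 pixel values, but three (index, value) pairs are enough to describe it once the bases are fixed.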

Page 6:

Sparse coding illustration

[Figure: “Natural Images” and “Learned bases ($\phi_1, \dots, \phi_{64}$): ‘Edges’”; pixel axes run from 50 to 500.]

Test example:

$x \approx 0.8\,\phi_{36} + 0.3\,\phi_{42} + 0.5\,\phi_{63}$

$[a_1, \dots, a_{64}] = [0, 0, \dots, 0, 0.8, 0, \dots, 0, 0.3, 0, \dots, 0, 0.5, \dots]$ (feature representation)

Compact & easily interpretable

Page 7:

More examples

$x \approx 0.6\,\phi_{15} + 0.8\,\phi_{28} + 0.4\,\phi_{37}$

Represent as: [0, 0, …, 0, 0.6, 0, …, 0, 0.8, 0, …, 0, 0.4, …]

$x \approx 1.3\,\phi_{5} + 0.9\,\phi_{18} + 0.3\,\phi_{29}$

Represent as: [0, 0, …, 0, 1.3, 0, …, 0, 0.9, 0, …, 0, 0.3, …]

• Method hypothesizes that edge-like patches are the most “basic” elements of a scene, and represents an image in terms of the edges that appear in it.

• Use to obtain a more compact, higher-level representation of the scene than pixels.

Page 8:

Digression: Sparse coding applied to audio

[Evan Smith & Mike Lewicki, 2006]

Page 9:

Digression: Sparse coding applied to audio

[Evan Smith & Mike Lewicki, 2006]

Page 10:

Sparse coding details

Input: Images $x^{(1)}, x^{(2)}, \dots, x^{(m)}$ (each in $\mathbb{R}^{n \times n}$)

L1 sparsity term (causes most $a$'s to be 0)

Alternating minimization: Alternately minimize with respect to the $\phi$'s (easy) and the $a$'s (harder).
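The objective itself did not survive extraction on this slide. A plausible reconstruction, assuming the standard L1-penalized sparse coding formulation (the weight $\lambda$, the per-image coefficients $a_j^{(i)}$, and the norm constraint on the bases are assumptions, not taken from the slide), is:

```latex
\min_{\phi,\, a} \;
  \sum_{i=1}^{m} \Bigl\| x^{(i)} - \sum_{j=1}^{k} a_j^{(i)} \phi_j \Bigr\|_2^2
  \; + \; \lambda \sum_{i=1}^{m} \sum_{j=1}^{k} \bigl| a_j^{(i)} \bigr|
  \qquad \text{s.t. } \|\phi_j\|_2 \le 1 \text{ for all } j
```

The first term is the reconstruction error and the second is the L1 sparsity term. With the $a$'s held fixed, the problem is a constrained least-squares problem in the $\phi$'s; with the $\phi$'s held fixed, it is an L1-regularized least-squares problem in the $a$'s.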

Page 11:

Solving for bases

Early versions of sparse coding were used to learn about this many bases:

32 learned bases

How to scale this algorithm up?

Page 12:

Sparse coding details

Input: Images $x^{(1)}, x^{(2)}, \dots, x^{(m)}$ (each in $\mathbb{R}^{n \times n}$)

L1 sparsity term

Alternating minimization: Alternately minimize with respect to the $\phi$'s (easy) and the $a$'s (harder).
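The alternating scheme can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions, not the method used in the talk: the coefficient step uses ISTA (a generic L1 solver swapped in for simplicity), the basis step is a ridge-stabilized least-squares solve followed by a heuristic rescaling of columns longer than unit norm, and all function names and parameters (`solve_codes`, `solve_bases`, `learn_bases`, `lam`, `ridge`) are hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    """Shrink every entry of v toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def solve_codes(Phi, X, lam, n_steps=200):
    """a-step (harder): ISTA on ||X - Phi A||_F^2 + lam * sum|A| with Phi fixed."""
    L = 2.0 * np.linalg.norm(Phi, 2) ** 2 + 1e-12  # Lipschitz constant of the gradient
    A = np.zeros((Phi.shape[1], X.shape[1]))
    for _ in range(n_steps):
        grad = 2.0 * Phi.T @ (Phi @ A - X)
        A = soft_threshold(A - grad / L, lam / L)
    return A

def solve_bases(X, A, ridge=1e-6):
    """phi-step (easy): least squares in Phi with A fixed, then rescale long columns."""
    k = A.shape[0]
    Phi = np.linalg.solve(A @ A.T + ridge * np.eye(k), A @ X.T).T
    # Heuristic: columns with norm > 1 are rescaled to unit norm (not an exact projection step).
    return Phi / np.maximum(np.linalg.norm(Phi, axis=0), 1.0)

def learn_bases(X, k, lam=0.1, n_rounds=10, seed=0):
    """Alternate between the two steps, starting from random unit-norm bases."""
    rng = np.random.default_rng(seed)
    Phi = rng.standard_normal((X.shape[0], k))
    Phi /= np.linalg.norm(Phi, axis=0)
    for _ in range(n_rounds):
        A = solve_codes(Phi, X, lam)
        Phi = solve_bases(X, A)
    return Phi, solve_codes(Phi, X, lam)

# Usage on random stand-in data: 50 flattened 14x14 "patches", 16 bases.
X = np.random.default_rng(1).standard_normal((196, 50))
Phi, A = learn_bases(X, k=16, n_rounds=3)
```

Each column of `X` is one flattened patch, each column of `Phi` is one basis, and each column of `A` holds the coefficients for the matching patch.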

Page 13:

Feature sign search (solve for $a_i$'s)

Goal: Minimize objective with respect to $a_i$'s.

• Simplified example:

• Suppose I tell you:

• Problem simplifies to:

• This is a quadratic function of the $a_i$'s. Can be solved efficiently in closed form.

• Algorithm:
  • Repeatedly guess sign (+, - or 0) of each of the $a_i$'s.
  • Solve for $a_i$'s in closed form. Refine guess for signs.
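The formulas for the first three bullets were lost in extraction. As a hedged sketch of the reasoning, assume a single input $x$, the bases stacked as columns of a matrix $\Phi$, a penalty weight $\lambda$, and a guessed sign vector $\theta$ with $\theta_i \in \{-1, 0, +1\}$ (all of these symbols are assumptions introduced here, as is the scaling of the objective):

```latex
\begin{align*}
\text{Simplified problem:}\quad
  & \min_{a} \; \| x - \Phi a \|_2^2 + \lambda \sum_i |a_i| \\
\text{Given the signs } \theta_i = \operatorname{sign}(a_i):\quad
  & |a_i| = \theta_i a_i \\
\text{Problem simplifies to:}\quad
  & \min_{a} \; \| x - \Phi a \|_2^2 + \lambda\, \theta^{\top} a
    \quad (\text{with } a_i = 0 \text{ wherever } \theta_i = 0) \\
\text{Closed form on the active set:}\quad
  & \hat{a} = \bigl( \hat{\Phi}^{\top} \hat{\Phi} \bigr)^{-1}
    \bigl( \hat{\Phi}^{\top} x - \tfrac{\lambda}{2}\, \hat{\theta} \bigr)
\end{align*}
```

Here $\hat{\Phi}$, $\hat{a}$ and $\hat{\theta}$ keep only the entries with $\theta_i \neq 0$. The last line follows from setting the gradient $-2\hat{\Phi}^{\top}(x - \hat{\Phi}\hat{a}) + \lambda\hat{\theta}$ to zero. If the solution disagrees with the guessed signs, the guess is refined and the solve is repeated.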

Page 14:

The feature-sign search algorithm: Visualization

[Figure: plot with axes $a_1$ and $a_2$]

Starting from zero (default)

Current guess: $a_1 = 0$, $a_2 = 0$

Page 15:

The feature-sign search algorithm: Visualization

[Figure: plot with axes $a_1$ and $a_2$]

Starting from zero (default)

1: Activate $a_2$ with “+” sign

Active set = {$a_2$}

Current guess: $a_1 = 0$, $a_2 = 0$

Page 16:

The feature-sign search algorithm: Visualization

[Figure: plot with axes $a_1$ and $a_2$]

Starting from zero (default)

1: Activate $a_2$ with “+” sign

Active set = {$a_2$}

Current guess: $a_1 = 0$, $a_2 = 0$

Page 17:

The feature-sign search algorithm: Visualization

[Figure: plot with axes $a_1$ and $a_2$]

Starting from zero (default)

1: Activate $a_2$ with “+” sign

Active set = {$a_2$}

2: Update $a_2$ (closed form)

Current guess: $a_1 = 0$, $a_2 = 0$

Page 18:

The feature-sign search algorithm: Visualization

[Figure: plot with axes $a_1$ and $a_2$]

Starting from zero (default)

3: Activate $a_1$ with “+” sign

Active set = {$a_1$, $a_2$}

Current guess: $a_1 = 0$, $a_2 = 0$

Page 19:

The feature-sign search algorithm: Visualization

[Figure: plot with axes $a_1$ and $a_2$]

Starting from zero (default)

3: Activate $a_1$ with “+” sign

Active set = {$a_1$, $a_2$}

4: Update $a_1$ & $a_2$ (closed form)

Current guess: $a_1 = 0$, $a_2 = 0$
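The activate/update steps walked through on these slides can be sketched as an active-set loop. This is a minimal NumPy sketch, not a reference implementation: it assumes the objective $\|x - \Phi a\|_2^2 + \lambda \sum_i |a_i|$ (the scaling is an assumption), assumes the active bases are linearly independent, and uses hypothetical names (`feature_sign_search`, `Phi`, `lam`, `theta`).

```python
import numpy as np

def objective(Phi, x, a, lam):
    """||x - Phi a||^2 + lam * ||a||_1 (scaling convention is an assumption)."""
    r = x - Phi @ a
    return r @ r + lam * np.abs(a).sum()

def feature_sign_search(Phi, x, lam, max_iter=100, tol=1e-8):
    k = Phi.shape[1]
    G, b = Phi.T @ Phi, Phi.T @ x
    a = np.zeros(k)        # start from zero (default)
    theta = np.zeros(k)    # guessed signs; 0 means inactive
    for _ in range(max_iter):
        # Activate: among zero coefficients, pick the largest gradient magnitude.
        grad = 2.0 * (G @ a - b)
        viol = np.where(a == 0, np.abs(grad), 0.0)
        i = int(np.argmax(viol))
        if viol[i] <= lam + tol:
            break                      # no zero coefficient wants to move
        theta[i] = -np.sign(grad[i])   # guessed sign points against the gradient
        for _ in range(max_iter):
            act = np.flatnonzero(theta)
            if act.size == 0:
                break
            # Update: closed-form minimizer for the currently guessed signs.
            a_new = np.zeros(k)
            a_new[act] = np.linalg.solve(G[np.ix_(act, act)],
                                         b[act] - 0.5 * lam * theta[act])
            # Refine: also try the points where a coefficient crosses zero.
            step = a_new - a
            candidates = [a_new]
            for j in act:
                if a[j] * a_new[j] < 0:
                    p = a + (a[j] / (a[j] - a_new[j])) * step
                    p[j] = 0.0
                    candidates.append(p)
            a = min(candidates, key=lambda p: objective(Phi, x, p, lam))
            a[np.abs(a) < tol] = 0.0
            theta = np.sign(a)
            grad = 2.0 * (G @ a - b)
            nz = a != 0
            if np.all(np.abs(grad[nz] + lam * theta[nz]) < 1e-6):
                break                  # active coefficients are stationary
    return a

# Usage: a signal built from two of eight random unit-norm bases.
rng = np.random.default_rng(0)
Phi = rng.standard_normal((30, 8))
Phi /= np.linalg.norm(Phi, axis=0)
x = 0.8 * Phi[:, 1] + 0.3 * Phi[:, 4]
a = feature_sign_search(Phi, x, lam=0.1)
```

The outer loop corresponds to the "Activate" slides and the inner loop to the "Update (closed form)" slides; the zero-crossing candidates are what "refine guess for signs" amounts to in this sketch.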

Page 20:

Before feature sign search

32 learned bases

Page 21:

With feature-sign search

Page 22:

Recap of sparse coding for feature learning

Training time

Input: Images $x^{(1)}, x^{(2)}, \dots, x^{(m)}$ (each in $\mathbb{R}^{n \times n}$)
Learn: Dictionary of bases $\phi_1, \dots, \phi_k$ (also in $\mathbb{R}^{n \times n}$).

Test time

Input: Novel image $x$ (in $\mathbb{R}^{n \times n}$) and previously learned $\phi_i$'s.
Output: Representation $[a_1, \dots, a_k]$ of image $x$.

$x \approx 0.8\,\phi_{36} + 0.3\,\phi_{42} + 0.5\,\phi_{63}$

Represent as: [0, 0, …, 0, 0.8, 0, …, 0, 0.3, 0, …, 0, 0.5, …]

Page 23:

Sparse coding recap

$x \approx 0.8\,\phi_{36} + 0.3\,\phi_{42} + 0.5\,\phi_{63}$

[0, 0, …, 0, 0.8, 0, …, 0, 0.3, 0, …, 0, 0.5, …]

Much better than pixel representation. But still not competitive with SIFT, etc.

Three ways to make it competitive:
• Combine this with SIFT.
• Advanced versions of sparse coding (LCC).
• Deep learning.

Page 24:

Combining sparse coding with SIFT

Input: SIFT descriptors $x^{(1)}, x^{(2)}, \dots, x^{(m)}$ (each in $\mathbb{R}^{128}$), in place of images in $\mathbb{R}^{n \times n}$

Learn: Dictionary of bases $\phi_1, \dots, \phi_k$ (also in $\mathbb{R}^{128}$).

Test time: Given a novel SIFT descriptor $x$ (in $\mathbb{R}^{128}$), represent it as $[a_1, \dots, a_k]$.

Page 25:

Putting it together

Suppose you’ve already learned bases $\phi_1, \dots, \phi_k$. Here’s how you represent an image.

[Diagram: $x^{(1)}, x^{(2)}, x^{(3)}$ are encoded as $a^{(1)}, a^{(2)}, a^{(3)}$; the resulting feature representation is fed to a learning algorithm.]

• Relate this to the histograms view, and so to sparse coding on top of SIFT features.

E.g., 73-75% on Caltech 101 (Yang et al., 2009; Boureau et al., 2009)

Page 26:

K-means vs. sparse coding

Centroid 1

Centroid 2

Centroid 3

K-means

Represent as:

Page 27:

K-means vs. sparse coding

Centroid 1

Centroid 2

Centroid 3

K-means

Represent as:

Basis

Sparse coding

Represent as:

Basis

Basis

Intuition: “Soft” version of k-means (membership in multiple clusters).
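The "soft k-means" intuition can be illustrated with a toy example. In this sketch the three 2-D atoms, the point `x`, and the penalty weight `lam` are all made up for illustration, and the sparse code is computed with ISTA, a generic L1 solver used here as a stand-in.

```python
import numpy as np

# Three hypothetical unit-norm atoms in 2-D, used both as k-means centroids
# and as sparse coding bases (columns of D).
D = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
D /= np.linalg.norm(D, axis=0)
x = np.array([1.0, 0.6])

# K-means: hard assignment to the single nearest centroid (one-hot code).
nearest = int(np.argmin(np.linalg.norm(D - x[:, None], axis=0)))
a_kmeans = np.zeros(3)
a_kmeans[nearest] = 1.0

# Sparse coding: a few nonzero coefficients, so x can "belong" to several atoms.
lam = 0.02                                # assumed penalty weight
L = 2.0 * np.linalg.norm(D, 2) ** 2       # Lipschitz constant of the gradient
a_sparse = np.zeros(3)
for _ in range(2000):
    grad = 2.0 * D.T @ (D @ a_sparse - x)
    v = a_sparse - grad / L
    a_sparse = np.sign(v) * np.maximum(np.abs(v) - lam / L, 0.0)

print("k-means code:      ", a_kmeans)
print("sparse coding code:", np.round(a_sparse, 3))
```

The k-means code can only say which single centroid is closest, whereas the sparse code spreads `x` over more than one basis and reconstructs it more closely.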

Page 28:

K-means vs. sparse coding

Rule of thumb: Whenever using k-means to get a dictionary, if you replace it with sparse coding it’ll often work better.