Image Similarity and the Earth Mover’s Distance

Empirical Evaluation of Dissimilarity Measures for Color and Texture
Y. Rubner, J. Puzicha, C. Tomasi and J.M. Buhmann

The Earth Mover’s Distance as a Metric for Image Retrieval
Y. Rubner, C. Tomasi and L.J. Guibas

The Earth Mover’s Distance is the Mallows Distance: Some Insights from Statistics
E. Levina and P.J. Bickel

Learning-Based Methods in Vision - Spring 2007
Frederik Heger

(with graphics from last year’s slides)

1 February 2007

2 LBMV Spring 2007 - Frederik Heger fwh@cs.cmu.edu

How Similar Are They?

Images from Caltech 256

Similarity is Important for …
• Image classification
  • Is there a penguin in this picture?
  • This is a picture of a penguin.
• Image retrieval
  • Find pictures with a penguin in them.
• Image as search query
  • Find more images like this one.
• Image segmentation
  • Something that looked like this was called penguin before.

Space Shuttle Cargo Bay

Image Representations: Histograms

[Figure panels: Normal histogram, Cumulative histogram]
• Generalize to arbitrary dimensions
• Represent distribution of features
  • Color, texture, depth, …

Images from Dave Kauchak

Image Representations: Histograms

Joint histogram
• Requires lots of data
• Loss of resolution to avoid empty bins

Marginal histogram
• Requires independent features
• More data/bin than joint histogram

Images from Dave Kauchak
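To make the joint/marginal trade-off concrete, here is a minimal Python sketch (not from the slides). It bins a handful of hypothetical two-feature samples both ways and counts the empty bins. The sample values, the bin count of 4, and all function names are illustrative assumptions.

```python
from collections import Counter

def bin_index(value, lo, hi, bins):
    """Map a value in [lo, hi] to a bin index in 0..bins-1."""
    i = int((value - lo) / (hi - lo) * bins)
    return min(i, bins - 1)  # a value equal to hi goes into the last bin

def joint_histogram(samples, lo, hi, bins):
    """Count samples per (bin_x, bin_y) cell: bins**2 cells in total."""
    return Counter((bin_index(x, lo, hi, bins), bin_index(y, lo, hi, bins))
                   for x, y in samples)

def marginal_histograms(samples, lo, hi, bins):
    """Count samples per bin separately for each feature: 2*bins cells."""
    hx = Counter(bin_index(x, lo, hi, bins) for x, _ in samples)
    hy = Counter(bin_index(y, lo, hi, bins) for _, y in samples)
    return hx, hy

# Hypothetical data: five samples with two features in [0, 1].
samples = [(0.1, 0.9), (0.2, 0.8), (0.8, 0.1), (0.9, 0.3), (0.5, 0.5)]
joint = joint_histogram(samples, 0.0, 1.0, 4)
hx, hy = marginal_histograms(samples, 0.0, 1.0, 4)

empty_joint = 4 * 4 - len(joint)                # empty cells in the 4x4 joint histogram
empty_marginal = (4 - len(hx)) + (4 - len(hy))  # empty bins across both marginals
print(empty_joint, empty_marginal)
```

With the same five samples, most joint cells stay empty while the marginals are nearly full, which is the "more data/bin" point. The price is that the marginals no longer record which x values occurred together with which y values.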

Space Shuttle Cargo Bay

Image Representations: Histograms

Adaptive binning
• Better data/bin distribution, fewer empty bins
• Can adapt available resolution to relative feature importance

Images from Dave Kauchak

EASE Truss Assembly

Space Shuttle Cargo Bay

Image Representations: Histograms

Clusters / Signatures
• “super-adaptive” binning
• Does not require discretization along any fixed axis
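A signature can be sketched as a list of (cluster center, weight) pairs. The Python below is an illustrative assumption, not the presenter's code: it takes cluster labels as given (the clustering step itself is not shown) and uses the fraction of points in each cluster as its weight.

```python
from collections import defaultdict

def signature(points, labels):
    """Return a list of (centroid, weight) pairs, one per cluster label."""
    groups = defaultdict(list)
    for point, label in zip(points, labels):
        groups[label].append(point)
    n = len(points)
    sig = []
    for label in sorted(groups):
        members = groups[label]
        dim = len(members[0])
        centroid = tuple(sum(m[d] for m in members) / len(members)
                         for d in range(dim))
        sig.append((centroid, len(members) / n))  # weight = share of all points
    return sig

# Hypothetical 2-D feature vectors and hypothetical cluster labels.
points = [(0.0, 0.0), (2.0, 0.0), (10.0, 10.0), (10.0, 12.0), (12.0, 11.0), (14.0, 11.0)]
labels = [0, 0, 1, 1, 1, 1]
sig = signature(points, labels)
print(sig)
```

Unlike a fixed-grid histogram, the number and position of the "bins" here follow the data, so no axis is discretized in advance.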

Images from Dave Kauchak

Distance Metrics

[Figure: three pairs of items being subtracted; panels with x and y axes]
= Euclidean distance of 5 units
= Gray-value distance of 50 values
= ?

Issue: How to Compare Histograms?

Bin-by-bin comparison
Sensitive to bin size. Could use wider bins … but at a loss of resolution

Cross-bin comparison
How much cross-bin influence is necessary/sufficient?
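The difference between the two kinds of comparison shows up in a small example. The Python sketch below is an illustration under assumed data, not material from the slides. It compares a one-peak histogram with two shifted copies, using a bin-by-bin L1 distance and, as one simple cross-bin measure, the L1 distance between cumulative histograms.

```python
from itertools import accumulate

def l1_distance(h, k):
    """Bin-by-bin comparison: only corresponding bins are compared."""
    return sum(abs(a - b) for a, b in zip(h, k))

def cumulative_l1_distance(h, k):
    """Cross-bin comparison: L1 distance between cumulative histograms."""
    return sum(abs(a - b) for a, b in zip(accumulate(h), accumulate(k)))

# Hypothetical histograms: the same single peak, shifted by 1 bin and by 4 bins.
base       = [0, 1, 0, 0, 0, 0]
near_shift = [0, 0, 1, 0, 0, 0]
far_shift  = [0, 0, 0, 0, 0, 1]

print(l1_distance(base, near_shift), l1_distance(base, far_shift))
print(cumulative_l1_distance(base, near_shift), cumulative_l1_distance(base, far_shift))
```

The bin-by-bin distance cannot tell the small shift from the large one, while the cumulative version grows with the size of the shift. For one-dimensional histograms of equal total mass, this cumulative L1 distance coincides with the Earth Mover's Distance under ground distance |i - j|.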

Overview: Similarity Measures

Heuristic Histogram Distance:

Minkowski-form distance (Lp)

Special Cases:
• L1: Manhattan distance
• L2: Euclidean distance
• L∞: Maximum value distance
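The slide's formula did not survive extraction. As a hedged sketch, the standard Minkowski form d(H, K) = (Σ_i |h_i − k_i|^p)^(1/p) can be written in Python as follows. The function name and the example histograms are assumptions for illustration.

```python
def minkowski_distance(h, k, p):
    """Minkowski-form distance between two histograms with matching bins."""
    if p == float("inf"):
        # L-infinity: the largest single-bin difference.
        return max(abs(a - b) for a, b in zip(h, k))
    return sum(abs(a - b) ** p for a, b in zip(h, k)) ** (1.0 / p)

# Hypothetical normalized histograms.
h = [0.5, 0.3, 0.2]
k = [0.2, 0.3, 0.5]

d_l1 = minkowski_distance(h, k, 1)               # Manhattan
d_l2 = minkowski_distance(h, k, 2)               # Euclidean
d_linf = minkowski_distance(h, k, float("inf"))  # maximum value
print(d_l1, d_l2, d_linf)
```

All three special cases compare bins index by index, so they are bin-by-bin measures.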

Overview: Similarity Measures

Heuristic Histogram Distance:

Weighted-Mean-Variance (WMV)

Info:
• Per-feature similarity measure
• Based on Gabor filter image representation
• Shown to outperform several parametric models for texture-based image retrieval
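The WMV formula is missing from the transcript. The sketch below assumes the form usually given for this measure in the Rubner et al. evaluation paper: for one feature, the difference of means and the difference of standard deviations, each divided by an estimate of how much that quantity varies across the image collection. All names and numbers are hypothetical.

```python
def wmv_distance(mean_i, std_i, mean_j, std_j, sigma_of_means, sigma_of_stds):
    """Assumed WMV form for a single feature (e.g. one Gabor filter response).

    sigma_of_means / sigma_of_stds: standard deviation of the per-image means /
    per-image standard deviations over the whole collection (normalizers).
    """
    return (abs(mean_i - mean_j) / abs(sigma_of_means)
            + abs(std_i - std_j) / abs(sigma_of_stds))

# Hypothetical statistics of one filter response for two images.
d_wmv = wmv_distance(mean_i=10.0, std_i=2.0, mean_j=14.0, std_j=3.0,
                     sigma_of_means=4.0, sigma_of_stds=0.5)
print(d_wmv)
```

Because only two numbers per feature are compared, this measure summarizes each distribution by its mean and variance instead of using a full histogram.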

Overview: Similarity Measures

Nonparametric Test Statistic:

Kolmogorov-Smirnov distance (KS)

Info:
• Defined for only one dimension
• Maximum discrepancy between cumulative distributions
• Invariant to arbitrary monotonic feature transformations

Overview: Similarity Measures

Nonparametric Test Statistic:

Cramer/von Mises type statistic (CvM)

Info:
• Squared Euclidean distance between distributions
• Defined for single dimension
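Both nonparametric statistics, KS on the previous slide and CvM here, operate on cumulative distributions. The following Python is a minimal sketch for discrete one-dimensional histograms: KS as the maximum difference and CvM as the sum of squared differences. The example histograms and function names are assumptions.

```python
from itertools import accumulate

def cumulative(h):
    """Cumulative histogram (running sum of the bins)."""
    return list(accumulate(h))

def ks_distance(h, k):
    """Kolmogorov-Smirnov: maximum discrepancy between cumulative histograms."""
    return max(abs(a - b) for a, b in zip(cumulative(h), cumulative(k)))

def cvm_distance(h, k):
    """Cramer/von Mises type: squared Euclidean distance between cumulative histograms."""
    return sum((a - b) ** 2 for a, b in zip(cumulative(h), cumulative(k)))

# Hypothetical normalized 1-D histograms.
h = [0.1, 0.4, 0.4, 0.1]
k = [0.4, 0.4, 0.1, 0.1]

d_ks = ks_distance(h, k)
d_cvm = cvm_distance(h, k)
print(d_ks, d_cvm)
```

Cumulative sums only make sense along an ordered axis, which is why both measures are limited to a single dimension.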

Overview: Similarity Measures

Nonparametric Test Statistic:

χ² statistic

Info:
• Very commonly used
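The formula on this slide was lost. A common form for comparing two histograms, and the one assumed in this sketch, is d(H, K) = Σ_i (h_i − m_i)² / m_i with m_i = (h_i + k_i) / 2. Bins where both histograms are empty are skipped. The example data are hypothetical.

```python
def chi_square_distance(h, k):
    """Chi-square statistic of h against the mean histogram m = (h + k) / 2."""
    total = 0.0
    for a, b in zip(h, k):
        m = (a + b) / 2.0
        if m > 0:  # skip bins that are empty in both histograms
            total += (a - m) ** 2 / m
    return total

# Hypothetical normalized histograms.
h = [0.5, 0.5, 0.0]
k = [0.25, 0.25, 0.5]

d_chi2 = chi_square_distance(h, k)
print(d_chi2)
```

Since h_i − m_i = (h_i − k_i) / 2, swapping the two histograms gives the same value.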

Overview: Similarity Measures

Information-theory Divergence:

Kullback-Leibler divergence (KL)

Info:
• Code one histogram using the other as true distribution
• How inefficient would it be?
• Also widely used.

Overview: Similarity Measures

Information-theory Divergence:

Jeffrey-divergence (JD)

Info:
• Similar to KL divergence
• But symmetric and numerically stable
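The contrast between KL and JD can be checked numerically. The sketch below assumes the usual definitions, since the slide formulas are missing: KL(H, K) = Σ_i h_i log(h_i / k_i), and JD(H, K) = Σ_i [h_i log(h_i / m_i) + k_i log(k_i / m_i)] with m_i = (h_i + k_i) / 2. Returning infinity for KL when a bin is empty in K only is a choice made for this example.

```python
import math

def kl_divergence(h, k):
    """Kullback-Leibler divergence; infinite if k has an empty bin where h does not."""
    total = 0.0
    for a, b in zip(h, k):
        if a > 0:
            if b == 0:
                return math.inf
            total += a * math.log(a / b)
    return total

def jeffrey_divergence(h, k):
    """Jeffrey divergence: both histograms are compared against their mean."""
    total = 0.0
    for a, b in zip(h, k):
        m = (a + b) / 2.0
        if a > 0:
            total += a * math.log(a / m)
        if b > 0:
            total += b * math.log(b / m)
    return total

# Hypothetical normalized histograms; `spike` has an empty bin.
h = [0.5, 0.5]
k = [0.9, 0.1]
spike = [1.0, 0.0]

print(kl_divergence(h, k), kl_divergence(k, h))  # the two directions differ
print(kl_divergence(h, spike))                   # inf: empty bin in the second argument
print(jeffrey_divergence(h, spike), jeffrey_divergence(spike, h))
```

JD stays finite with empty bins because m_i is zero only when both bins are empty, and those terms are skipped.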

Overview: Similarity Measures

Ground Distance Measure:

Quadratic Form (QF)

Info:
• Heuristic approach
• Matrix A incorporates cross-bin information
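As a hedged illustration of how matrix A brings in cross-bin information, the sketch below assumes the common form d(H, K) = sqrt((h − k)ᵀ A (h − k)) with similarity weights a_ij = 1 − d_ij / d_max. It uses one-dimensional bins with ground distance d_ij = |i − j|. The slide's own formula and choice of A are not recoverable from the transcript.

```python
import math

def similarity_matrix(n):
    """a_ij = 1 - |i - j| / d_max for n ordered bins (assumed ground distance)."""
    d_max = n - 1
    return [[1.0 - abs(i - j) / d_max for j in range(n)] for i in range(n)]

def quadratic_form_distance(h, k, A):
    """sqrt((h - k)^T A (h - k))."""
    diff = [a - b for a, b in zip(h, k)]
    n = len(diff)
    q = sum(diff[i] * A[i][j] * diff[j] for i in range(n) for j in range(n))
    return math.sqrt(max(q, 0.0))  # guard against tiny negative rounding errors

# Hypothetical histograms: one peak, shifted by 1 bin and by 4 bins.
base       = [0, 1, 0, 0, 0, 0]
near_shift = [0, 0, 1, 0, 0, 0]
far_shift  = [0, 0, 0, 0, 0, 1]

A = similarity_matrix(6)
d_near = quadratic_form_distance(base, near_shift, A)
d_far = quadratic_form_distance(base, far_shift, A)
print(d_near, d_far)
```

With A set to the identity matrix this reduces to the Euclidean distance, which would rate both shifts as equally far.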

Overview: Similarity Measures

Ground Distance Measure:

Earth Mover’s Distance (EMD)

Info:
• Based on solution of linear optimization problem (transportation problem)
• Minimal cost to transform one distribution to the other
• Total cost = sum of costs for individual features

Summary: Similarity Measures

Earth Mover’s Distance

= (amount moved) * (distance moved)

How EMD Works

[Figure: signature P with m clusters and signature Q with n clusters; P’ and Q’ marked]

Cost = sum over all movements of (distance moved) * (amount moved)

Constraints:
• Move earth only from P to Q
• P cannot send more earth than there is
• Q cannot receive more earth than it can hold
• As much earth as possible must be moved
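The four constraints from these slides can be written as checks on a flow matrix, where flow[i][j] is the amount of earth moved from cluster i of P to cluster j of Q. The Python below is a sketch with assumed names (check_flow, flow_cost, w_p, w_q) and a hypothetical two-cluster example. It tests whether a given flow is feasible and computes its cost; it does not search for the cheapest flow.

```python
def check_flow(flow, w_p, w_q, eps=1e-9):
    """Return True if `flow` satisfies the four constraints from the slides."""
    m, n = len(w_p), len(w_q)
    total = sum(sum(row) for row in flow)
    # 1. Move earth only from P to Q: no negative flow.
    nonnegative = all(f >= -eps for row in flow for f in row)
    # 2. P cannot send more earth than there is.
    rows_ok = all(sum(flow[i]) <= w_p[i] + eps for i in range(m))
    # 3. Q cannot receive more earth than it can hold.
    cols_ok = all(sum(flow[i][j] for i in range(m)) <= w_q[j] + eps
                  for j in range(n))
    # 4. As much earth as possible must be moved.
    maximal = abs(total - min(sum(w_p), sum(w_q))) <= eps
    return nonnegative and rows_ok and cols_ok and maximal

def flow_cost(flow, dist):
    """Sum over all movements of (amount moved) * (distance moved)."""
    return sum(f * d for frow, drow in zip(flow, dist)
               for f, d in zip(frow, drow))

# Hypothetical signatures: both have clusters at positions 0 and 10.
w_p = [0.4, 0.6]                    # weights of P's clusters
w_q = [0.5, 0.5]                    # weights of Q's clusters
dist = [[0.0, 10.0], [10.0, 0.0]]   # ground distances between clusters
flow = [[0.4, 0.0], [0.1, 0.5]]     # move 0.1 from P's second cluster to Q's first

feasible = check_flow(flow, w_p, w_q)
cost = flow_cost(flow, dist)
print(feasible, cost)
```

In this tiny example the flow is also the cheapest one, since 0.1 units must cross the distance of 10 in any feasible solution. In general the minimizing flow comes from solving the transportation problem.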

Color-based Image Retrieval

[Plot legend: Jeffrey divergence, Quadratic form distance, Earth Mover’s Distance, χ² statistics, L1 distance]

Red Car Retrievals (Color-based)

Zebra Retrieval (Texture-based)

EMD with Position Encoding

without position

with position

Issues with EMD
• High computational complexity
  • Prohibitive for texture segmentation
• Feature ordering needs to be known
  • Open eyes / closed eyes example
• Distance can be set by very few features.
  • E.g. with partial match of uneven distribution weight, EMD = 0, no matter how many features follow
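The partial-match issue can be seen with a hand-built flow. The sketch below assumes that the EMD is the flow cost divided by the total amount moved (the normalization used in the Rubner et al. papers) and uses hypothetical one-dimensional signatures. The flow shown moves all of the lighter signature, so no cheaper flow exists.

```python
def emd_from_flow(flow, dist):
    """Cost of a flow divided by the total amount of earth moved."""
    cost = sum(f * d for frow, drow in zip(flow, dist)
               for f, d in zip(frow, drow))
    moved = sum(sum(row) for row in flow)
    return cost / moved

# P: one cluster of weight 0.5 at position 0.
# Q: a cluster of weight 0.5 at position 0 plus a heavy cluster
#    of weight 5.0 at position 100.
dist = [[0.0, 100.0]]   # ground distances from P's cluster to Q's two clusters
flow = [[0.5, 0.0]]     # all of P is absorbed by Q's matching cluster

partial_match_emd = emd_from_flow(flow, dist)
print(partial_match_emd)
```

Q's heavy cluster never enters the cost, so the distance is zero even though most of Q's mass has no counterpart in P.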

Help From Statisticians
• For even-mass distributions, EMD is equivalent to Mallows distance
  • (for uneven mass distributions, the two distances behave differently)
• Trick to compute Mallows distance
  • 1-D marginals give better classification results than joint distributions (experimental results)
  • Get marginals from empirical distribution by sorting feature vectors
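The sorting trick can be sketched for one feature at a time. For two one-dimensional samples of equal size, the best pairing matches the i-th smallest value of one sample with the i-th smallest of the other, so the Mallows distance needs only two sorts. The function name, the choice of p, and the sample values below are assumptions for illustration.

```python
def mallows_distance_1d(xs, ys, p=2):
    """Mallows distance between two equal-size 1-D samples via sorting."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes samples of equal size")
    n = len(xs)
    # Pair the i-th smallest value of xs with the i-th smallest value of ys.
    total = sum(abs(a - b) ** p for a, b in zip(sorted(xs), sorted(ys)))
    return (total / n) ** (1.0 / p)

# Hypothetical values of a single feature in two images.
xs = [3.0, 1.0, 2.0]
ys = [2.0, 4.0, 3.0]

d_p1 = mallows_distance_1d(xs, ys, p=1)
d_p2 = mallows_distance_1d(xs, ys, p=2)
print(d_p1, d_p2)
```

With p = 1 and equal weights this matches the one-dimensional EMD of the two samples, which is the equivalence the slide refers to.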

EMD Summary / Conclusions
• Ground distance metric for image similarity
• Uses signatures for best adaptive binning and to lessen impact of prohibitive complexity
• Can deal with partial matches
• Good performance for color/texture classification
• Statistical grounding

Last Slide

Comments? Questions?
