Image Segmentation and Representation

Cosimo Distante

KMeans

Let $X = \{x^t\}_{t=1}^{N}$ be the data set.

Suppose we want to cluster the data into K classes. Let $m_j$, $j = 1, \dots, K$, be the prototype of each class.

Suppose we already have the K prototypes and want to represent each input $x^t$ by its closest prototype $m_i$:

$$\|x^t - m_i\| = \min_j \|x^t - m_j\|$$

How do we find the K prototypes $m_j$?

When $x^t$ is represented by $m_i$, there is an error proportional to the distance $\|x^t - m_i\|$.

Our objective is to reduce this distance as much as possible.

Let us introduce a (reconstruction) error function:

$$E(\{m_i\}_{i=1}^{K} \mid X) = \sum_{t=1}^{N} \sum_{i=1}^{K} b_i^t \, \|x^t - m_i\|^2$$

$$b_i^t = \begin{cases} 1 & \text{if } \|x^t - m_i\| = \min_j \|x^t - m_j\| \\ 0 & \text{otherwise} \end{cases}$$
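The nearest-prototype rule and the reconstruction error can be sketched in a few lines of Python. This is only an illustration: the function names and the toy data are assumptions, not part of the slides.

```python
import math

def nearest_prototype(x, prototypes):
    """Index i of the prototype m_i closest to x (Euclidean distance)."""
    return min(range(len(prototypes)), key=lambda i: math.dist(x, prototypes[i]))

def reconstruction_error(data, prototypes):
    """E = sum_t sum_i b_i^t ||x^t - m_i||^2, where b_i^t is 1 only for the nearest prototype."""
    total = 0.0
    for x in data:
        i = nearest_prototype(x, prototypes)
        total += math.dist(x, prototypes[i]) ** 2
    return total

# Hypothetical toy data: three 2-D points and two prototypes.
data = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0)]
prototypes = [(0.0, 1.0), (10.0, 0.0)]
labels = [nearest_prototype(x, prototypes) for x in data]
error = reconstruction_error(data, prototypes)
print(labels, error)
```

The first two points are one unit away from the first prototype and the third coincides with the second prototype, so only the first two contribute to E.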

KMeans

The best performance is reached at the minimum of E.

We use an iterative procedure to find the K prototypes.

Start by randomly initializing the K prototypes $m_i$.

Then compute the membership values $b_i^t$ and minimize the error E by setting its derivative with respect to $m_i$ to zero, which gives

$$m_i = \frac{\sum_t b_i^t \, x^t}{\sum_t b_i^t}$$

Modifying the $m_i$ values changes the $b_i^t$ values in turn, so the process is repeated until the $m_i$ no longer change.
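The iterative procedure can be sketched as follows. This is a minimal sketch under stated assumptions: the prototypes are initialized from randomly chosen data points (the slides only say "randomly"), an empty cluster keeps its old prototype, and the optional `init` argument is a hypothetical convenience for reproducible runs.

```python
import math
import random

def kmeans(data, k, init=None, max_iter=100, seed=0):
    """Alternate between membership assignment and prototype update until labels stop changing."""
    rng = random.Random(seed)
    # Assumption: random initialization picks k distinct data points.
    prototypes = [list(p) for p in (init if init is not None else rng.sample(data, k))]
    labels = None
    for _ in range(max_iter):
        # Membership step: b_i^t = 1 for the closest prototype, 0 otherwise.
        new_labels = [
            min(range(k), key=lambda i: math.dist(x, prototypes[i])) for x in data
        ]
        if new_labels == labels:
            break  # memberships unchanged, so the prototypes would not move either
        labels = new_labels
        # Update step: m_i = sum_t b_i^t x^t / sum_t b_i^t (mean of the members).
        for i in range(k):
            members = [x for x, lab in zip(data, labels) if lab == i]
            if members:  # assumption: an empty cluster keeps its previous prototype
                prototypes[i] = [sum(c) / len(members) for c in zip(*members)]
    return prototypes, labels

# Hypothetical toy data: two well-separated groups of 2-D points.
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
prototypes, labels = kmeans(data, k=2)
print(prototypes, labels)
```

The result depends on the initialization, which is why practical implementations usually run several random restarts and keep the solution with the lowest E.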

K-Means Clustering

RGB vector

Clustering

Example

D. Comaniciu and P. Meer, Robust Analysis of Feature Spaces: Color Image Segmentation, 1997.

K-Means Clustering

Example

[Figure panels: Original, K=5, K=11]

Bahadir K. Gunturk, EE 7730 - Image Analysis I

K-means, only color is used in segmentation; four clusters (out of 20) are shown here.

K-means, color and position are used in segmentation; four clusters (out of 20) are shown here.

Each vector is (R,G,B,x,y).


K-Means Clustering: Axis Scaling

Features of different types may have different scales. For example, pixel coordinates on a 100x100 image vs. RGB color values in the range [0,1].

Problem: Features with larger scales dominate clustering.

Solution: Scale the features.
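One common way to scale the features is min-max normalization of each component to [0, 1]. The slides do not say which scaling they use, so the choice here is an assumption, as are the function name and the sample (R,G,B,x,y) vectors.

```python
def min_max_scale(vectors):
    """Rescale every feature (column) to the range [0, 1]."""
    columns = list(zip(*vectors))
    lows = [min(c) for c in columns]
    # A constant feature has zero span; use 1.0 to avoid dividing by zero.
    spans = [(max(c) - min(c)) or 1.0 for c in columns]
    return [
        [(v - lo) / span for v, lo, span in zip(vec, lows, spans)]
        for vec in vectors
    ]

# Hypothetical (R, G, B, x, y) vectors: colors in [0, 1], coordinates on a 100x100 image.
pixels = [
    (0.2, 0.4, 0.6, 0, 0),
    (0.8, 0.4, 0.1, 99, 50),
    (0.5, 0.4, 0.3, 50, 99),
]
scaled = min_max_scale(pixels)
print(scaled)
```

After scaling, a difference of 99 pixels and a difference of 0.6 in the red channel both map to a difference of 1, so neither feature dominates the Euclidean distance used by K-means.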

KMeans

K-means results on an RGB image: (a) the original image, (b) K=3, (c) K=5, (d) K=6, (e) K=7, and (f) K=8.

Fuzzy C-means

1. Initialize the membership matrix b with values between 0 and 1 such that the following is satisfied:

$$\sum_{i=1}^{K} b_i^t = 1, \quad \forall t = 1, \dots, N$$

2. Compute the centers $m_i$, $i = 1, \dots, K$:

$$m_i = \frac{\sum_{t=1}^{N} (b_i^t)^{q} \, x^t}{\sum_{t=1}^{N} (b_i^t)^{q}}$$

3. Compute the cost function:

$$E(b, m_1, \dots, m_K) = \sum_{i=1}^{K} \sum_{t=1}^{N} (b_i^t)^{q} \, \|x^t - m_i\|^2$$

Stopping criterion: stop if E < threshold or if there is no significant variation between the current and the previous iteration.

4. Compute the new membership matrix and go to step 2:

$$b_i^t = \frac{1}{\sum_{j=1}^{K} \left( \dfrac{\|x^t - m_i\|}{\|x^t - m_j\|} \right)^{2/(q-1)}}$$

Here $q \in [1, \infty)$ is the fuzziness exponent.
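The four steps can be sketched as below. This is a hedged illustration rather than a reference implementation: the name `q` for the fuzziness exponent, the requirement q > 1 (the membership update divides by q - 1), the tolerance, and the toy data are all assumptions.

```python
import math
import random

def fuzzy_c_means(data, k, q=2.0, tol=1e-6, max_iter=200, seed=0):
    """Fuzzy C-means: soft memberships b[t][i] in [0, 1] that sum to 1 for every point."""
    if q <= 1.0:
        raise ValueError("this sketch needs q > 1")
    rng = random.Random(seed)
    n, dim = len(data), len(data[0])
    # Step 1: random memberships, normalized so that each row sums to 1.
    b = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(k)]
        s = sum(row)
        b.append([v / s for v in row])
    centers, prev_cost = [], math.inf
    for _ in range(max_iter):
        # Step 2: centers as membership-weighted means.
        centers = []
        for i in range(k):
            w = [b[t][i] ** q for t in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[t] * data[t][d] for t in range(n)) / total for d in range(dim)]
            )
        # Step 3: cost function and stopping criterion (no significant variation).
        cost = sum(
            b[t][i] ** q * math.dist(data[t], centers[i]) ** 2
            for t in range(n) for i in range(k)
        )
        if abs(prev_cost - cost) < tol:
            break
        prev_cost = cost
        # Step 4: new memberships from the relative distances to all centers.
        for t in range(n):
            d = [max(math.dist(data[t], c), 1e-12) for c in centers]  # guard against 0
            b[t] = [
                1.0 / sum((d[i] / d[j]) ** (2.0 / (q - 1.0)) for j in range(k))
                for i in range(k)
            ]
    return centers, b

# Hypothetical toy data: two groups of 2-D points.
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, memberships = fuzzy_c_means(data, k=2)
print(centers)
```

Unlike K-means, every point contributes to every center, weighted by its membership raised to the power q.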

04/21/23

HOUGH Transform

Contours: Lines and Curves

• Edge detectors find “edgels” (pixel level)

• To perform image analysis:
– edges must be grouped into entities such as contours (higher level).
– Canny does this to a certain extent: the detector finds chains of edgels.

Line detection

• Mathematical model of a line: $y = mx + n$

[Figure: a line in the (x, y) plane through the points $P(x_1, y_1)$ and $P(x_2, y_2)$]

$y_1 = m x_1 + n$
$y_2 = m x_2 + n$
…
$y_N = m x_N + n$

Image and Parameter Spaces

Image space (axes x, y): the line $y = mx + n$, with $y_1 = m x_1 + n$, $y_2 = m x_2 + n$, …, $y_N = m x_N + n$; a particular line is $y = m'x + n'$.

Parameter space (axes: slope m, intercept n): the point $(m', n')$.

Line in image space ~ point in parameter space.

Looking at it backwards …

Image space: $y_1 = m x_1 + n$

• Fix (m, n), vary (x, y): a line $y = mx + n$
• Fix $(x_1, y_1)$, vary (m, n): all lines through the point $P(x_1, y_1)$

Looking at it backwards …

$y_1 = m x_1 + n$ can be rewritten as $n = -x_1 m + y_1$.

Parameter space (axes: slope m, intercept n):

• Fix $(-x_1, y_1)$, vary (m, n): the line $n = -x_1 m + y_1$

Img-Param Spaces

• Image Space
– Lines
– Points
– Collinear points

• Parameter Space
– Points
– Lines
– Intersecting lines

Hough Transform Technique

• H.T. is a method for detecting straight lines (and curves) in images.

• Main idea:
– Map a difficult pattern problem into a simple peak detection problem

Hough Transform Technique

• Given an edge point, there is an infinite number of lines passing through it (vary m and n).
– These lines can be represented as a line in parameter space.

[Figure: an edge point $P(x, y)$ in image space maps to the line $n = (-x)\,m + y$ in parameter space (axes: slope m, intercept n)]

• Given a set of collinear edge points, each of them has an associated line in parameter space.
– These lines intersect at the point (m, n) corresponding to the parameters of the line in image space.

[Figure: the points $P(x_1, y_1)$ and $Q(x_2, y_2)$ map to the parameter-space lines p: $n = (-x_1)\,m + y_1$ and q: $n = (-x_2)\,m + y_2$]
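As a small worked example of this intersection, the two parameter-space lines $n = -x_1 m + y_1$ and $n = -x_2 m + y_2$ can be solved directly for (m, n). The function name and the sample points are assumptions made for illustration.

```python
def parameter_space_intersection(p, q):
    """Intersect n = -x1*m + y1 with n = -x2*m + y2 and return (m, n)."""
    (x1, y1), (x2, y2) = p, q
    if x1 == x2:
        # Both parameter-space lines have the same slope -x1: they never meet,
        # which is the x = k case that y = mx + n cannot express.
        raise ValueError("vertical image line: no finite (m, n)")
    m = (y2 - y1) / (x2 - x1)
    n = y1 - m * x1
    return m, n

# Hypothetical points on the image line y = 2x + 1.
m, n = parameter_space_intersection((1.0, 3.0), (4.0, 9.0))
print(m, n)
```

The recovered (m, n) is the slope and intercept of the image line through both points.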

• At each point of the (discrete) parameter space, count how many lines pass through it.
– Use an array of counters
– Can be thought of as a “parameter image”

• The higher the count, the more edge points are collinear in image space.
– Find a peak in the counter array
– This is a “bright” point in the parameter image
– It can be found by thresholding

Practical Issues

• The slope of the line is $-\infty < m < \infty$
– The parameter space is INFINITE

• The representation y = mx + n does not express lines of the form x = k

Solution:

• Use the “Normal” equation of a line, $\rho = x\cos\theta + y\sin\theta$, in place of $y = mx + n$
– $\theta$ is the line orientation
– $\rho$ is the distance between the origin and the line

Octavia I. Camps

New Parameter Space

• Use the parameter space $(\rho, \theta)$
• The new space is FINITE
– $0 < \rho < D$, where D is the image diagonal.
– $0 < \theta < \pi$

• The new space can represent all lines
– y = k is represented with $\rho = k$, $\theta = 90°$
– x = k is represented with $\rho = k$, $\theta = 0°$

Consequence:

• A point in image space is now represented as a SINUSOID: $\rho = x\cos\theta + y\sin\theta$

Hough Transform Algorithm

Input is an edge image (E(i,j) = 1 for edgels).

1. Discretize $\rho$ and $\theta$ in increments of $d\rho$ and $d\theta$. Let A(R,T) be an array of integer accumulators, initialized to 0.

2. For each pixel with E(i,j) = 1 and for h = 1, 2, …, T do
   1. $\rho = i\cos(h \cdot d\theta) + j\sin(h \cdot d\theta)$
   2. Find the closest integer k corresponding to $\rho$
   3. Increment counter A(k,h) by one

3. Find local maxima in A(R,T)
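The accumulator loop can be sketched in plain Python as follows. Two choices here are assumptions rather than something the slides state: $\theta$ is sampled over [0, π) and $\rho$ is allowed to be negative (range [-D, D]) so that every line crossing the image gets a cell; the function name and the test points are also illustrative.

```python
import math

def hough_lines(edge_points, width, height, n_theta=180, rho_step=1.0):
    """Accumulate votes in a (rho, theta) array for a list of edge pixels (i, j)."""
    diag = math.hypot(width, height)               # D, the image diagonal
    n_rho = int(math.ceil(2.0 * diag / rho_step)) + 1
    acc = [[0] * n_theta for _ in range(n_rho)]    # A(R, T), initialized to 0
    d_theta = math.pi / n_theta
    for i, j in edge_points:
        for h in range(n_theta):
            theta = h * d_theta
            rho = i * math.cos(theta) + j * math.sin(theta)
            k = int(round((rho + diag) / rho_step))  # closest rho cell (shifted by D)
            acc[k][h] += 1
    return acc, diag, d_theta

# Hypothetical edge pixels on the vertical line i = 5 of a 10x10 image.
points = [(5, j) for j in range(10)]
acc, diag, d_theta = hough_lines(points, 10, 10)
best = max(
    ((acc[k][h], k, h) for k in range(len(acc)) for h in range(len(acc[0]))),
    key=lambda item: item[0],
)
print("votes:", best[0], "rho:", best[1] - diag, "theta (deg):", math.degrees(best[2] * d_theta))
```

With a coarse accumulator, neighbouring $\theta$ cells can collect the same number of votes, which is one reason the last step looks for local maxima rather than a single global maximum.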

Hough Transform Speed Up

• If we know the orientation of the edge (usually available from the edge detection step):
– We fix theta in the parameter space and increment only one counter!
– We can allow for orientation uncertainty by incrementing a few counters around the “nominal” counter.

Hough Transform for Curves

• The H.T. can be generalized to detect any curve that can be expressed in parametric form:
– y = f(x, a1, a2, …, ap)
– a1, a2, …, ap are the parameters
– The parameter space is p-dimensional
– The accumulating array is LARGE!

Hough Transform for Circles

A circle with centre (a, b) and radius r satisfies $(x - a)^2 + (y - b)^2 = r^2$. This equation can be expressed in parametric form: $x = a + r\cos\theta$, $y = b + r\sin\theta$.
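For circles, a minimal voting sketch is shown below. It assumes the radius is already known, so the parameter space reduces to the centre (a, b); with an unknown radius the accumulator would need a third dimension. The parametric form $x = a + r\cos\theta$, $y = b + r\sin\theta$, the function name, and the synthetic points are assumptions for illustration.

```python
import math
from collections import Counter

def hough_circle_centers(edge_points, radius, n_angles=36):
    """Each edge point votes for every centre (a, b) lying at distance `radius` from it."""
    angles = [2.0 * math.pi * s / n_angles for s in range(n_angles)]
    votes = Counter()
    for x, y in edge_points:
        for phi in angles:
            a = round(x - radius * math.cos(phi))
            b = round(y - radius * math.sin(phi))
            votes[(a, b)] += 1
    return votes

# Synthetic edge points on a circle centred at (10, 10) with radius 5.
angles = [2.0 * math.pi * s / 36 for s in range(36)]
points = [(10 + 5 * math.cos(p), 10 + 5 * math.sin(p)) for p in angles]
votes = hough_circle_centers(points, radius=5)
center, count = votes.most_common(1)[0]
print(center, count)
```

Every point on the circle casts one of its votes for the true centre, so that cell accumulates one vote per edge point.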

H.T. Summary

• H.T. is a “voting” scheme
– points vote for a set of parameters describing a line or curve.

• The more votes for a particular set
– the more evidence that the corresponding curve is present in the image.

• Can detect MULTIPLE curves in one shot.

• Computational cost increases with the number of parameters describing the curve.

Image matching

by Diva Sian

by swashford

http://www.flickr.com/photos/diaphanus/136915456/

http://www.flickr.com/photos/swashford/428567562/

Harder case

by Diva Sian by scgbt

http://www.flickr.com/photos/diaphanus/136915456/

http://www.flickr.com/photos/scpgt/328570837/

Harder still?

NASA Mars Rover images

NASA Mars Rover images with SIFT feature matches

Figure by Noah Snavely

Answer below (look for tiny colored squares…)

Local features and alignment

[Darya Frolova and Denis Simakov]

• We need to match (align) images
• Global methods are sensitive to occlusion, lighting, and parallax effects, so look for local features that match well.
• How would you do it by eye?

Local features and alignment

• Detect feature points in both images

• Find corresponding pairs

• Use these pairs to align images

• Problem 1:
– Detect the same point independently in both images; otherwise there is no chance to match!

We need a repeatable detector

• Problem 2:
– For each point, correctly recognize the corresponding one

We need a reliable and distinctive descriptor
Geometric transformations

Photometric transformations

Figure from T. Tuytelaars ECCV 2006 tutorial

And other nuisances…

• Noise

• Blur

• Compression artifacts

• …

Invariant local features

Subset of local feature types designed to be invariant to common geometric and photometric transformations.

Basic steps:
1) Detect distinctive interest points
2) Extract invariant descriptors

Figure: David Lowe

Main questions

• Where will the interest points come from?
– What are salient features that we’ll detect in multiple views?

• How to describe a local region?

• How to establish correspondences, i.e., compute matches?

Finding Corners

Key property: in the region around a corner, image gradient has two or more dominant directions

Corners are repeatable and distinctive

C. Harris and M. Stephens. “A Combined Corner and Edge Detector.” Proceedings of the 4th Alvey Vision Conference, pages 147-151.

Source: Lana Lazebnik

http://www.csse.uwa.edu.au/~pk/research/matlabfns/Spatial/Docs/Harris/A_Combined_Corner_and_Edge_Detector.pdf

Corners as distinctive interest points

We should easily recognize the point by looking through a small window

Shifting a window in any direction should give a large change in intensity

“edge”: no change along the edge direction

“corner”: significant change in all directions

“flat” region: no change in all directions

Source: A. Efros

Harris Detector formulation

Change of intensity for the shift [u,v]:

$$E(u,v) = \sum_{x,y} w(x,y)\,\bigl[I(x+u, y+v) - I(x,y)\bigr]^2$$

where $w(x,y)$ is the window function, $I(x+u, y+v)$ is the shifted intensity, and $I(x,y)$ is the intensity.

Window function w(x,y) = 1 in window, 0 outside, or a Gaussian.

Source: R. Szeliski

Feature detection: the math

Consider shifting the window W by (u,v)
• how do the pixels in W change?
• compare each pixel before and after by summing up the squared differences (SSD)
• this defines an SSD “error” of E(u,v):

$$E(u,v) = \sum_{(x,y) \in W} \bigl[I(x+u, y+v) - I(x,y)\bigr]^2$$

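Small motion assumption

Taylor series expansion of I:

$$I(x+u, y+v) = I(x,y) + I_x u + I_y v + \text{higher-order terms}$$

If the motion (u,v) is small, then the first-order approximation is good:

$$I(x+u, y+v) \approx I(x,y) + I_x u + I_y v$$

Plugging this into the formula on the previous slide…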
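The substitution can be written out step by step. The intermediate lines below are filled in as a sketch and are not taken from the original slides; $w(x,y)$ is the window function and $I_x$, $I_y$ denote the partial derivatives of the image, as in the Harris formulation.

```latex
\begin{aligned}
E(u,v) &= \sum_{x,y} w(x,y)\,\bigl[I(x+u, y+v) - I(x,y)\bigr]^2 \\
       &\approx \sum_{x,y} w(x,y)\,\bigl[I_x u + I_y v\bigr]^2 \\
       &= \sum_{x,y} w(x,y)\,\bigl(I_x^2 u^2 + 2 I_x I_y\, u v + I_y^2 v^2\bigr) \\
       &= \begin{bmatrix} u & v \end{bmatrix}
          \left( \sum_{x,y} w(x,y)
          \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix} \right)
          \begin{bmatrix} u \\ v \end{bmatrix}
\end{aligned}
```

The matrix in parentheses depends only on the image gradients inside the window, not on the shift, which is why the behaviour of E for all small shifts can be read off its eigenvalues.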


Feature detection: the math

This can be rewritten:

$$E(u,v) \approx \begin{bmatrix} u & v \end{bmatrix} M \begin{bmatrix} u \\ v \end{bmatrix}$$

For the example above
• You can move the center of the green window to anywhere on the blue unit circle
• Which directions will result in the largest and smallest E values?
• We can find these directions by looking at the eigenvectors of M

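Harris Detector formulation

This measure of change can be approximated by:

$$E(u,v) \cong \begin{bmatrix} u & v \end{bmatrix} M \begin{bmatrix} u \\ v \end{bmatrix}$$

where M is a 2×2 matrix computed from image derivatives:

$$M = \sum_{x,y} w(x,y) \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}$$

The sum runs over the image region, i.e. the area we are checking for a corner; $I_x I_y$ is the gradient with respect to x times the gradient with respect to y.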


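Quick eigenvalue/eigenvector review

The eigenvectors of a matrix A are the vectors x that satisfy:

$$A x = \lambda x$$

The scalar $\lambda$ is the eigenvalue corresponding to x.
• The eigenvalues are found by solving:

$$\det(A - \lambda I) = 0$$

• In our case, A = M is a 2x2 matrix, so we have

$$\det \begin{bmatrix} m_{11} - \lambda & m_{12} \\ m_{21} & m_{22} - \lambda \end{bmatrix} = 0$$

• The solution:

$$\lambda_{\pm} = \frac{1}{2} \left[ (m_{11} + m_{22}) \pm \sqrt{4 m_{12} m_{21} + (m_{11} - m_{22})^2} \right]$$

Once you know $\lambda$, you find x by solving

$$(A - \lambda I)\,x = 0$$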
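For a symmetric 2x2 matrix such as M, the closed-form solution is easy to check numerically. The sketch below assumes a matrix of the form [[a, b], [b, c]]; the function name and the example values are illustrative only.

```python
import math

def eigen_2x2_symmetric(a, b, c):
    """Eigenvalues and unit eigenvectors of the symmetric matrix [[a, b], [b, c]]."""
    mean = 0.5 * (a + c)
    root = 0.5 * math.sqrt(4.0 * b * b + (a - c) ** 2)
    lam_plus, lam_minus = mean + root, mean - root

    def eigenvector(lam):
        # Solve (A - lam*I) x = 0. The first row gives (a - lam)*vx + b*vy = 0.
        if abs(b) > 1e-12:
            vx, vy = b, lam - a
        elif abs(lam - a) <= abs(lam - c):
            vx, vy = 1.0, 0.0   # diagonal matrix: eigenvalue a belongs to the x axis
        else:
            vx, vy = 0.0, 1.0   # diagonal matrix: eigenvalue c belongs to the y axis
        norm = math.hypot(vx, vy)
        return (vx / norm, vy / norm)

    return (lam_plus, eigenvector(lam_plus)), (lam_minus, eigenvector(lam_minus))

# Hypothetical example matrix [[2, 1], [1, 2]].
(lam_plus, x_plus), (lam_minus, x_minus) = eigen_2x2_symmetric(2.0, 1.0, 2.0)
print(lam_plus, x_plus, lam_minus, x_minus)
```

If a equals c and b is zero, every direction is an eigenvector, so the sketch simply returns the x axis for both eigenvalues.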


Eigenvalues and eigenvectors of M

• Define shifts with the smallest and largest change (E value)
• $x_+$ = direction of largest increase in E
• $\lambda_+$ = amount of increase in direction $x_+$
• $x_-$ = direction of smallest increase in E
• $\lambda_-$ = amount of increase in direction $x_-$

What does this matrix reveal?

First, consider an axis-aligned corner:

$$M = \sum \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix} = \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix}$$

This means the dominant gradient directions align with the x or y axis.

If either λ is close to 0, then this is not a corner, so look for locations where both are large.

Slide credit: David Jacobs

What if we have a corner that is not aligned with the image axes?

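General Case

Since M is symmetric, we have

$$M = R^{-1} \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix} R$$

We can visualize M as an ellipse with axis lengths determined by the eigenvalues and orientation determined by R.

[Figure: ellipse with semi-axis $(\lambda_{\max})^{-1/2}$ along the direction of the fastest change and semi-axis $(\lambda_{\min})^{-1/2}$ along the direction of the slowest change]

Slide adapted from Darya Frolova, Denis Simakov.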

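Interpreting the eigenvalues

Classification of image points using eigenvalues of M (axes: $\lambda_1$, $\lambda_2$):

• “Corner”: $\lambda_1$ and $\lambda_2$ are large, $\lambda_1 \sim \lambda_2$; E increases in all directions
• “Edge”: $\lambda_1 \gg \lambda_2$
• “Edge”: $\lambda_2 \gg \lambda_1$
• “Flat” region: $\lambda_1$ and $\lambda_2$ are small; E is almost constant in all directions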

Corner response function

$$R = \det(M) - \alpha\,\operatorname{trace}(M)^2 = \lambda_1 \lambda_2 - \alpha(\lambda_1 + \lambda_2)^2$$

α: constant (0.04 to 0.06)

• “Corner”: R > 0
• “Edge”: R < 0
• “Flat” region: |R| small
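The classification by the sign and size of R can be sketched directly from the eigenvalues. The value α = 0.05 lies inside the range quoted above; the tolerance `flat_tol` that decides when |R| counts as "small" is an assumption, as are the function names.

```python
def corner_response(lam1, lam2, alpha=0.05):
    """R = det(M) - alpha * trace(M)^2, written in terms of the eigenvalues of M."""
    return lam1 * lam2 - alpha * (lam1 + lam2) ** 2

def classify(lam1, lam2, alpha=0.05, flat_tol=1e-3):
    """Label a point as 'flat', 'corner' or 'edge' from its corner response."""
    r = corner_response(lam1, lam2, alpha)
    if abs(r) < flat_tol:   # assumption: threshold for "|R| small"
        return "flat"
    return "corner" if r > 0 else "edge"

# Hypothetical eigenvalue pairs.
for lam1, lam2 in [(10.0, 10.0), (10.0, 0.01), (0.01, 0.01)]:
    print(lam1, lam2, corner_response(lam1, lam2), classify(lam1, lam2))
```

Writing R through det and trace means the eigenvalues never have to be computed explicitly in practice; the eigenvalue form is used here only to make the three regimes visible.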

Harris Corner Detector

• Algorithm steps:
– Compute the M matrix within all image windows to get their R scores
– Find points with large corner response (R > threshold)
– Take the points of local maxima of R
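The steps above can be sketched in plain Python on a small synthetic image. Several choices are assumptions made only for this illustration: central-difference gradients, a 3x3 box window (w = 1 in window, 0 outside), α = 0.05, a 3x3 neighbourhood for the local-maximum test, and the threshold value.

```python
def harris_response(img, alpha=0.05, half=1):
    """Corner response R for every pixel of a 2-D list of intensities."""
    h, w = len(img), len(img[0])
    ix = [[0.0] * w for _ in range(h)]
    iy = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix[y][x] = (img[y][x + 1] - img[y][x - 1]) / 2.0  # gradient along x
            iy[y][x] = (img[y + 1][x] - img[y - 1][x]) / 2.0  # gradient along y
    resp = [[0.0] * w for _ in range(h)]
    for y in range(half, h - half):
        for x in range(half, w - half):
            sxx = sxy = syy = 0.0
            for dy in range(-half, half + 1):        # sum over the window
                for dx in range(-half, half + 1):
                    gx, gy = ix[y + dy][x + dx], iy[y + dy][x + dx]
                    sxx += gx * gx
                    sxy += gx * gy
                    syy += gy * gy
            # R = det(M) - alpha * trace(M)^2
            resp[y][x] = sxx * syy - sxy * sxy - alpha * (sxx + syy) ** 2
    return resp

def harris_corners(img, threshold, alpha=0.05):
    """Points with R > threshold that are also local maxima of R."""
    resp = harris_response(img, alpha)
    h, w = len(img), len(img[0])
    corners = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            r = resp[y][x]
            if r > threshold and all(
                r >= resp[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)
            ):
                corners.append((x, y))
    return corners

# Synthetic 16x16 image: a bright 8x8 square on a dark background.
img = [[1.0 if 4 <= x <= 11 and 4 <= y <= 11 else 0.0 for x in range(16)] for y in range(16)]
resp = harris_response(img)
corners = harris_corners(img, threshold=0.5)
print(corners)
```

On this image the response is zero in flat areas, negative along the sides of the square, and positive near its four corners, matching the classification by R.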

Harris Detector: Workflow

Slide adapted from Darya Frolova, Denis Simakov, Weizmann Institute.

• Compute corner response R
• Find points with large corner response: R > threshold
• Take only the points of local maxima of R

Harris Detector: Properties

• Rotation invariance
– The ellipse rotates but its shape (i.e. eigenvalues) remains the same
– Corner response R is invariant to image rotation

• Not invariant to image scale

[Figure labels: “All points will be classified as edges” vs. “Corner!”]
