Image Segmentation
And Representation
Cosimo Distante
KMeans
Let $X = \{x^t\}_{t=1}^{N}$ be the data set.
Suppose we want to cluster the data into K classes. Let $m_j$, $j = 1, \dots, K$, be the prototype of each class.
Suppose we already have the K prototypes and want to represent each input $x^t$ by the closest prototype $m_i$, as follows:
$\|x^t - m_i\| = \min_j \|x^t - m_j\|$
KMeans
$\|x^t - m_i\| = \min_j \|x^t - m_j\|$
How do we find the K prototypes $m_j$?
When $x^t$ is represented by $m_i$, there is an error proportional to the distance $\|x^t - m_i\|$.
Our objective is to reduce this distance as much as possible.
Let us introduce an error function (the reconstruction error):
$E(\{m_i\}_{i=1}^{K} \mid X) = \sum_{t=1}^{N} \sum_{i=1}^{K} b_i^t \, \|x^t - m_i\|^2$
where
$b_i^t = 1$ if $\|x^t - m_i\| = \min_j \|x^t - m_j\|$, and $b_i^t = 0$ otherwise.
KMeans
The best performance is reached at the minimum of E.
Let us use an iterative procedure to find the K prototypes.
We start by randomly initializing the K prototypes $m_i$.
Then we compute the b values (memberships) and minimize the error E by setting its derivative with respect to $m_i$ to zero, which gives:
$m_i = \frac{\sum_t b_i^t \, x^t}{\sum_t b_i^t}$
Modifying the m values implies that the b values change as well, so the two steps are repeated until the m values no longer change.
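The iterative procedure above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function names `dist2` and `kmeans`, the choice of initializing the prototypes with K randomly chosen samples, and the toy data are all assumptions made for the example.

```python
import random

def dist2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def kmeans(X, k, iters=100, seed=0):
    rng = random.Random(seed)
    # Random initialization: k distinct samples serve as prototypes (an assumption).
    m = [list(x) for x in rng.sample(X, k)]
    labels = []
    for _ in range(iters):
        # Membership step: b_i^t = 1 for the closest prototype, 0 otherwise.
        labels = [min(range(k), key=lambda i: dist2(x, m[i])) for x in X]
        # Update step: m_i = sum_t b_i^t x^t / sum_t b_i^t.
        new_m = []
        for i in range(k):
            members = [x for x, lab in zip(X, labels) if lab == i]
            if members:
                new_m.append([sum(c) / len(members) for c in zip(*members)])
            else:
                new_m.append(m[i])  # an empty cluster keeps its prototype
        if new_m == m:  # stop when the prototypes no longer change
            break
        m = new_m
    return m, labels

# Toy data: two well-separated groups of 2-D points.
X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
     (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
prototypes, labels = kmeans(X, k=2)
```

For image segmentation, `X` would hold one feature vector per pixel (for example its RGB values), and each pixel would be recolored with its prototype.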
K-Means Clustering
Example: clustering of RGB vectors.
D. Comaniciu and P. Meer, Robust Analysis of Feature Spaces: Color Image Segmentation, 1997.
K-Means Clustering
Example: original image and K-means results with K=5 and K=11.
Bahadir K. Gunturk, EE 7730 - Image Analysis I
K-means, only color is used in segmentation, four clusters (out of 20) are shown here.
K-means, color and position are used in segmentation, four clusters (out of 20) are shown here.
Each vector is (R,G,B,x,y).
K-Means Clustering: Axis Scaling
Features of different types may have different scales. For example, pixel coordinates on a 100x100 image vs. RGB
color values in the range [0,1].
Problem: Features with larger scales dominate clustering.
Solution: Scale the features.
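One possible way to scale the features is min-max normalization, which maps every feature to [0, 1] before clustering. The slide does not say which scaling is intended, so this is only one reasonable choice; the function name `scale_features` and the sample pixels are made up for the illustration.

```python
def scale_features(vectors):
    """Rescale every feature (column) to [0, 1] with min-max normalization."""
    columns = list(zip(*vectors))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    scaled = []
    for v in vectors:
        scaled.append([
            (x - lo) / (hi - lo) if hi > lo else 0.0  # constant feature -> 0
            for x, lo, hi in zip(v, lows, highs)
        ])
    return scaled

# (R, G, B, x, y) vectors: colors in [0, 1], pixel coordinates in [0, 99].
pixels = [
    (0.9, 0.1, 0.1, 0, 0),
    (0.8, 0.2, 0.1, 99, 0),
    (0.1, 0.1, 0.9, 0, 99),
    (0.2, 0.1, 0.8, 99, 99),
]
scaled = scale_features(pixels)
```

After scaling, the pixel coordinates no longer dominate the Euclidean distance. Multiplying the position columns by a weight afterwards would be a simple way to trade off color against position.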
KMeans
K-means results on an RGB image: (a) the original image, (b) K=3, (c) K=5, (d) K=6, (e) K=7 and (f) K=8
Fuzzy C-means
1. Initialize the membership matrix b with values between 0 and 1, such that the following is satisfied:
$\sum_{i=1}^{k} b_i^t = 1, \quad \forall t = 1, \dots, N$
2. Compute the centers $m_i$, $i = 1, \dots, k$:
$m_i = \frac{\sum_{t=1}^{N} (b_i^t)^q \, x^t}{\sum_{t=1}^{N} (b_i^t)^q}$
3. Compute the cost function:
$E(b, m_1, \dots, m_k) = \sum_{i=1}^{k} \sum_{t=1}^{N} (b_i^t)^q \, \|x^t - m_i\|^2$
Stopping criterion: stop if E < threshold or if there is no significant variation between the current and the previous iteration.
4. Compute the new membership matrix:
$b_i^t = \frac{1}{\sum_{j=1}^{k} \left( \frac{\|x^t - m_i\|}{\|x^t - m_j\|} \right)^{2/(q-1)}}$
and go to step 2.
The weighting exponent, written $q$ here, takes values in $[1, \infty)$.
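The four steps can be sketched as follows. This is an illustrative sketch under stated assumptions: the name `fuzzy_c_means`, the exponent name `q` (with default 2), the optional initial membership argument `b`, and the tolerance are choices made for the example. The update in step 4 needs q > 1, and distances are clamped to avoid a division by zero when a point coincides with a center.

```python
import random

def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5

def fuzzy_c_means(X, k, q=2.0, b=None, iters=100, tol=1e-9, seed=0):
    n, dim = len(X), len(X[0])
    if b is None:
        rng = random.Random(seed)
        b = [[rng.random() + 1e-9 for _ in range(k)] for _ in range(n)]
    # Step 1: normalize each row so that sum_i b[t][i] = 1.
    b = [[v / sum(row) for v in row] for row in b]
    m, prev_cost = [], float("inf")
    for _ in range(iters):
        # Step 2: centers are means weighted by (b_i^t)^q.
        m = []
        for i in range(k):
            w = [b[t][i] ** q for t in range(n)]
            total = sum(w)
            m.append([sum(w[t] * X[t][d] for t in range(n)) / total
                      for d in range(dim)])
        # Step 3: cost function and stopping criterion.
        cost = sum(b[t][i] ** q * distance(X[t], m[i]) ** 2
                   for t in range(n) for i in range(k))
        if abs(prev_cost - cost) < tol:
            break
        prev_cost = cost
        # Step 4: new membership matrix (requires q > 1).
        for t in range(n):
            d = [max(distance(X[t], m[i]), 1e-12) for i in range(k)]
            b[t] = [1.0 / sum((d[i] / d[j]) ** (2.0 / (q - 1.0))
                              for j in range(k))
                    for i in range(k)]
    return m, b

X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
     (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
# A deterministic initial membership matrix, used only to make the demo repeatable.
b0 = [[0.6, 0.4]] * 3 + [[0.4, 0.6]] * 3
centers, memberships = fuzzy_c_means(X, k=2, b=b0)
```

Unlike K-means, every point keeps a graded membership in every cluster. A hard segmentation can be obtained afterwards by taking the cluster with the largest membership for each point.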
04/21/23
HOUGH Transform
Contours: Lines and Curves
• Edge detectors find “edgels” (pixel level)
• To perform image analysis:
  – edges must be grouped into entities such as contours (higher level).
  – Canny does this to a certain extent: the detector finds chains of edgels.
Line detection
• Mathematical model of a line: $y = m x + n$
The points $P(x_1, y_1), P(x_2, y_2), \dots$ on the line satisfy:
$y_1 = m x_1 + n$
$y_2 = m x_2 + n$
…
$y_N = m x_N + n$
Image and Parameter Spaces
Image Space: the line $y = m x + n$ passes through the points $(x_1, y_1), \dots, (x_N, y_N)$:
$y_1 = m x_1 + n$
$y_2 = m x_2 + n$
…
$y_N = m x_N + n$
A specific line is $y = m' x + n'$.
Parameter Space (axes: slope m, intercept n): the point $(m', n')$.
Line in Image Space ~ Point in Parameter Space
Looking at it backwards …
Image space: $y_1 = m x_1 + n$
Fix (m,n), vary (x,y): a line, $y = m x + n$
Fix $(x_1, y_1)$, vary (m,n): the lines through the point $P(x_1, y_1)$
Looking at it backwards …
$y_1 = m x_1 + n$ can be rewritten as: $n = -x_1 m + y_1$
Parameter space (axes: slope m, intercept n):
Fix $(-x_1, y_1)$, vary (m,n): a line, $n = -x_1 m + y_1$
Img-Param Spaces
• Image Space
  – Lines
  – Points
  – Collinear points
• Parameter Space
  – Points
  – Lines
  – Intersecting lines
Hough Transform Technique
• H.T. is a method for detecting straight lines (and curves) in images.
• Main idea:
  – Map a difficult pattern problem into a simple peak detection problem
Hough Transform Technique
• Given an edge point, there is an infinite number of lines passing through it (vary m and n).
  – These lines can be represented as a line in parameter space.
Figure: the image point P(x,y) maps to the line $n = (-x) m + y$ in parameter space (axes: slope m, intercept n).
Hough Transform Technique
• Given a set of collinear edge points, each of them has an associated line in parameter space.
  – These lines intersect at the point (m,n) corresponding to the parameters of the line in the image space.
Figure: the image points P(x1,y1) and Q(x2,y2) map to the parameter-space lines p: n = (-x1) m + y1 and q: n = (-x2) m + y2 (axes: slope m, intercept n).
Hough Transform Technique
• At each point of the (discrete) parameter space, count how many lines pass through it.
  – Use an array of counters
  – Can be thought of as a “parameter image”
• The higher the count, the more edges are collinear in the image space.
  – Find a peak in the counter array
  – This is a “bright” point in the parameter image
  – It can be found by thresholding
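The counting scheme described above can be illustrated with a tiny slope-intercept accumulator. This is only a sketch: the function name `hough_slope_intercept`, the chosen discretization of m and n, and the sample points are assumptions for the example.

```python
def hough_slope_intercept(points, slopes, intercepts):
    """Vote in a discrete (m, n) parameter space for lines y = m x + n."""
    # acc[a][b] counts the points whose parameter-space line passes
    # through the cell (slopes[a], intercepts[b]).
    acc = [[0] * len(intercepts) for _ in slopes]
    step = intercepts[1] - intercepts[0]
    for x, y in points:
        for a, m in enumerate(slopes):
            n = -x * m + y  # the point's line in parameter space
            b = round((n - intercepts[0]) / step)
            if 0 <= b < len(intercepts):
                acc[a][b] += 1
    return acc

# Three collinear edge points on y = 2x + 1 plus one outlier.
points = [(0, 1), (1, 3), (2, 5), (4, 0)]
slopes = [-2, -1, 0, 1, 2]          # hypothetical discretization of m
intercepts = list(range(-10, 11))   # hypothetical discretization of n
acc = hough_slope_intercept(points, slopes, intercepts)

# Peak detection: the cell with the most votes.
votes, m_best, n_best = max(
    (acc[a][b], slopes[a], intercepts[b])
    for a in range(len(slopes)) for b in range(len(intercepts))
)
```

The three collinear points all vote for the cell (m, n) = (2, 1), while the outlier spreads its votes over other cells.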
Practical Issues
• The slope of the line is −∞ < m < ∞
  – The parameter space is INFINITE
• The representation y = mx + n does not express lines of the form x = k
Solution:
• Use the “Normal” equation of a line:
$\rho = x \cos\theta + y \sin\theta$ (in place of $y = m x + n$), where P(x,y) is a point on the line
• θ is the line orientation
• ρ is the distance between the origin and the line
Octavia I. Camps
New Parameter Space
• Use the parameter space (ρ, θ)
• The new space is FINITE
  – 0 < ρ < D, where D is the image diagonal.
  – 0 < θ < π
• The new space can represent all lines
  – Y = k is represented with ρ = k, θ = 90°
  – X = k is represented with ρ = k, θ = 0°
Consequence:
• A Point in Image Space is now represented as a SINUSOID
  – $\rho = x \cos\theta + y \sin\theta$
Hough Transform Algorithm
Input is an edge image (E(i,j)=1 for edgels)
1. Discretize θ and ρ in increments of dθ and dρ. Let A(R,T) be an array of integer accumulators, initialized to 0.
2. For each pixel E(i,j)=1 and h=1,2,…,T do
   1. ρ = i cos(h * dθ) + j sin(h * dθ)
   2. Find closest integer k corresponding to ρ
   3. Increment counter A(k,h) by one
3. Find local maxima in A(R,T)
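A minimal sketch of this algorithm in plain Python is shown below. Several details are assumptions made for the example: the name `hough_lines`, the use of x as the column index and y as the row index, θ indexed from 0 to T-1 rather than 1 to T, and an accumulator whose ρ axis is shifted by `max_k` because ρ = x cos θ + y sin θ can become negative for θ above 90° under this convention. Step 3 is simplified to finding the single global maximum.

```python
import math

def hough_lines(edges, d_rho=1.0, n_theta=180):
    """Accumulate votes for rho = x cos(theta) + y sin(theta)."""
    rows, cols = len(edges), len(edges[0])
    max_k = int(math.ceil(math.hypot(rows, cols) / d_rho))
    d_theta = math.pi / n_theta
    # A[k + max_k][h]: rows index rho in [-max_k, max_k], columns index theta.
    A = [[0] * n_theta for _ in range(2 * max_k + 1)]
    for y in range(rows):
        for x in range(cols):
            if edges[y][x] != 1:
                continue
            for h in range(n_theta):
                theta = h * d_theta
                rho = x * math.cos(theta) + y * math.sin(theta)
                k = int(round(rho / d_rho))  # closest discrete rho
                A[k + max_k][h] += 1
    return A, max_k, d_theta

# Edge image (20 rows, 100 columns) containing the horizontal line y = 5.
edges = [[1 if y == 5 else 0 for x in range(100)] for y in range(20)]
A, max_k, d_theta = hough_lines(edges)

# Simplified step 3: take the global maximum of the accumulator.
votes, row, h = max(
    (A[r][c], r, c) for r in range(len(A)) for c in range(len(A[0]))
)
rho_peak = float(row - max_k)            # d_rho = 1.0 in this example
theta_peak_deg = math.degrees(h * d_theta)
```

The peak lies at ρ = 5, θ = 90°, which matches the representation of a horizontal line Y = k given earlier.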
Hough Transform Speed Up
• If we know the orientation of the edge (usually available from the edge detection step)
  – We fix theta in the parameter space and increment only one counter!
  – We can allow for orientation uncertainty by incrementing a few counters around the “nominal” counter.
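The speed-up can be sketched by giving each edgel its own orientation and voting only in a few θ bins around it. The names `hough_lines_oriented` and `spread`, the (x, y, theta) edgel format with theta as the direction of the line normal in radians, and the dictionary used as a sparse accumulator are all assumptions made for this illustration.

```python
import math

def hough_lines_oriented(edgels, n_theta=180, spread=1, d_rho=1.0):
    """Vote only near each edgel's known orientation.

    edgels: list of (x, y, theta), theta = normal direction in [0, pi).
    Returns a sparse accumulator {(k, h): votes}.
    """
    d_theta = math.pi / n_theta
    votes = {}
    for x, y, theta in edgels:
        h0 = int(round(theta / d_theta))  # the "nominal" counter
        for h in range(h0 - spread, h0 + spread + 1):
            hh = h % n_theta              # wrap around the theta axis
            t = hh * d_theta
            rho = x * math.cos(t) + y * math.sin(t)
            key = (int(round(rho / d_rho)), hh)
            votes[key] = votes.get(key, 0) + 1
    return votes

# 50 edgels on the horizontal line y = 5, each with normal direction 90 degrees.
edgels = [(x, 5, math.pi / 2) for x in range(50)]
votes = hough_lines_oriented(edgels)
peak = max(votes, key=votes.get)
```

Each edgel now increments 3 counters instead of 180, so the voting loop does 150 increments here rather than 9000.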
Hough Transform for Curves
• The H.T. can be generalized to detect any curve that can be expressed in parametric form:
  – Y = f(x, a1, a2, …, ap)
  – a1, a2, …, ap are the parameters
  – The parameter space is p-dimensional
  – The accumulating array is LARGE!
Hough Transform for Circles
The above equation can be expressed in parametric form
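The circle equation itself is not reproduced in these notes. Assuming it is the usual $(x-a)^2 + (y-b)^2 = r^2$, with parametric form $x = a + r\cos\theta$, $y = b + r\sin\theta$, a sketch of the voting for a known radius could look like this. The name `hough_circle_centers`, the angular sampling, and the sparse dictionary accumulator are choices made for the example.

```python
import math

def hough_circle_centers(points, radius, n_angles=360):
    """Vote for circle centers (a, b), given edge points and a known radius."""
    votes = {}
    for x, y in points:
        seen = set()  # each point votes at most once per cell
        for s in range(n_angles):
            t = 2 * math.pi * s / n_angles
            # Invert the parametric form: a = x - r cos(t), b = y - r sin(t).
            a = int(round(x - radius * math.cos(t)))
            b = int(round(y - radius * math.sin(t)))
            if (a, b) not in seen:
                seen.add((a, b))
                votes[(a, b)] = votes.get((a, b), 0) + 1
    return votes

# Twelve edge points sampled on a circle with center (20, 15) and radius 10.
points = [
    (20 + 10 * math.cos(2 * math.pi * s / 12),
     15 + 10 * math.sin(2 * math.pi * s / 12))
    for s in range(12)
]
votes = hough_circle_centers(points, radius=10)
center = max(votes, key=votes.get)
```

With an unknown radius the accumulator gains a third dimension (a, b, r), which is the growth in cost that comes with each additional curve parameter.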
H.T. Summary
• H.T. is a “voting” scheme
  – points vote for a set of parameters describing a line or curve.
• The more votes for a particular set
  – the more evidence that the corresponding curve is present in the image.
• Can detect MULTIPLE curves in one shot.
• Computational cost increases with the number of parameters describing the curve.
Image matching
by Diva Sian
by swashford
Harder case
by Diva Sian by scgbt
Harder still?
NASA Mars Rover images
Answer below (look for tiny colored squares…)
NASA Mars Rover images with SIFT feature matches
Figure by Noah Snavely
Local features and alignment
[Darya Frolova and Denis Simakov]
• We need to match (align) images
• Global methods are sensitive to occlusion, lighting, and parallax effects. So look for local features that match well.
• How would you do it by eye?
Local features and alignment
• Detect feature points in both images
• Find corresponding pairs
• Use these pairs to align images
[Darya Frolova and Denis Simakov]
Local features and alignment
• Problem 1:
  – Detect the same point independently in both images
no chance to match!
We need a repeatable detector
[Darya Frolova and Denis Simakov]
Local features and alignment
• Problem 2:
  – For each point correctly recognize the corresponding one
We need a reliable and distinctive descriptor
[Darya Frolova and Denis Simakov]
Geometric transformations
Photometric transformations
Figure from T. Tuytelaars ECCV 2006 tutorial
And other nuisances…
• Noise
• Blur
• Compression artifacts
• …
Invariant local features
Subset of local feature types designed to be invariant to common geometric and photometric transformations.
Basic steps:
1) Detect distinctive interest points
2) Extract invariant descriptors
Figure: David Lowe
Main questions
• Where will the interest points come from?
  – What are salient features that we’ll detect in multiple views?
• How to describe a local region?
• How to establish correspondences, i.e., compute matches?
Finding Corners
Key property: in the region around a corner, image gradient has two or more dominant directions
Corners are repeatable and distinctive
C. Harris and M. Stephens. "A Combined Corner and Edge Detector." Proceedings of the 4th Alvey Vision Conference: pages 147-151.
Source: Lana Lazebnik
Corners as distinctive interest points
We should easily recognize the point by looking through a small window
Shifting a window in any direction should give a large change in intensity
“edge”: no change along the edge direction
“corner”: significant change in all directions
“flat” region: no change in all directions
Source: A. Efros
Harris Detector formulation
Change of intensity for the shift [u,v]:
$E(u,v) = \sum_{x,y} w(x,y) \, [I(x+u, y+v) - I(x,y)]^2$
where $w(x,y)$ is the window function, $I(x+u, y+v)$ is the shifted intensity, and $I(x,y)$ is the intensity.
The window function $w(x,y)$ is either 1 in the window and 0 outside, or a Gaussian.
Source: R. Szeliski
Feature detection: the math
Consider shifting the window W by (u,v)
• how do the pixels in W change?
• compare each pixel before and after by summing up the squared differences (SSD)
• this defines an SSD “error” of E(u,v):
$E(u,v) = \sum_{(x,y) \in W} [I(x+u, y+v) - I(x,y)]^2$
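The SSD error can be computed directly on a small synthetic image to see how flat regions, edges, and corners behave under a shift. This is a sketch with a box window (w = 1 inside W); the name `ssd_error`, the inclusive window format `(x0, y0, x1, y1)`, and the test image are assumptions for the example.

```python
def ssd_error(image, window, u, v):
    """E(u,v) = sum over (x, y) in W of [I(x+u, y+v) - I(x, y)]^2."""
    x0, y0, x1, y1 = window  # inclusive pixel bounds of W
    total = 0.0
    for y in range(y0, y1 + 1):
        for x in range(x0, x1 + 1):
            diff = image[y + v][x + u] - image[y][x]
            total += diff * diff
    return total

# 8x8 image with a bright square occupying x >= 4, y >= 4 (image[y][x]).
image = [[1.0 if (x >= 4 and y >= 4) else 0.0 for x in range(8)]
         for y in range(8)]
shifts = [(1, 0), (0, 1), (-1, 0), (0, -1)]

flat = [ssd_error(image, (1, 1, 2, 2), u, v) for u, v in shifts]
edge = [ssd_error(image, (5, 3, 6, 5), u, v) for u, v in shifts]
corner = [ssd_error(image, (3, 3, 5, 5), u, v) for u, v in shifts]
```

In the flat window E is zero for every shift, in the window on the horizontal edge E is zero for shifts along the edge, and in the window around the corner E is positive for all four shifts.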
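Small motion assumption
Taylor Series expansion of I:
$I(x+u, y+v) = I(x,y) + I_x u + I_y v + \text{higher order terms}$
If the motion (u,v) is small, then the first order approximation is good:
$I(x+u, y+v) \approx I(x,y) + I_x u + I_y v$
Plugging this into the formula on the previous slide…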
Feature detection: the math
This can be rewritten:
$E(u,v) \approx \sum_{(x,y) \in W} [I_x u + I_y v]^2 = [u \; v] \, M \, [u \; v]^T$
For the example above
• You can move the center of the green window to anywhere on the blue unit circle
• Which directions will result in the largest and smallest E values?
• We can find these directions by looking at the eigenvectors of M
Harris Detector formulation
This measure of change can be approximated by:
$E(u,v) \approx [u \; v] \, M \begin{bmatrix} u \\ v \end{bmatrix}$
where M is a 2×2 matrix computed from image derivatives:
$M = \sum_{x,y} w(x,y) \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}$
The sum runs over the image region, the area we are checking for a corner; $I_x I_y$ is the gradient with respect to x times the gradient with respect to y.
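Quick eigenvalue/eigenvector review
The eigenvectors of a matrix A are the vectors x that satisfy:
$A x = \lambda x$
The scalar $\lambda$ is the eigenvalue corresponding to x
• The eigenvalues are found by solving:
$\det(A - \lambda I) = 0$
• In our case, A = M is a 2x2 matrix with entries $m_{ij}$, so we have
$\det \begin{bmatrix} m_{11} - \lambda & m_{12} \\ m_{21} & m_{22} - \lambda \end{bmatrix} = 0$
• The solution:
$\lambda_{\pm} = \frac{1}{2} \left[ (m_{11} + m_{22}) \pm \sqrt{4 m_{12} m_{21} + (m_{11} - m_{22})^2} \right]$
Once you know $\lambda$, you find x by solving
$\begin{bmatrix} m_{11} - \lambda & m_{12} \\ m_{21} & m_{22} - \lambda \end{bmatrix} x = 0$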
Feature detection: the math
Eigenvalues and eigenvectors of M
• Define shifts with the smallest and largest change (E value)
• $x_+$ = direction of largest increase in E.
• $\lambda_+$ = amount of increase in direction $x_+$
• $x_-$ = direction of smallest increase in E.
• $\lambda_-$ = amount of increase in direction $x_-$
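What does this matrix reveal?
First, consider an axis-aligned corner:
$M = \sum \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix} = \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix}$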
This means the dominant gradient directions align with the x or y axis.
If either λ is close to 0, then this is not a corner, so look for locations where both are large.
Slide credit: David Jacobs
What if we have a corner that is not aligned with the image axes?
General Case
Since M is symmetric, we have
$M = R^{-1} \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix} R$
We can visualize M as an ellipse with axis lengths determined by the eigenvalues and orientation determined by R:
• direction of the slowest change: axis length $(\lambda_{\min})^{-1/2}$
• direction of the fastest change: axis length $(\lambda_{\max})^{-1/2}$
Slide adapted from Darya Frolova, Denis Simakov.
Interpreting the eigenvalues
Classification of image points using eigenvalues of M:
• “Corner”: λ1 and λ2 are large, λ1 ~ λ2; E increases in all directions
• “Edge”: λ1 >> λ2
• “Edge”: λ2 >> λ1
• “Flat” region: λ1 and λ2 are small; E is almost constant in all directions
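The classification can be checked numerically by building M over a window and computing its two eigenvalues in closed form. This is a sketch under assumptions: a box window (w = 1), central differences for the image derivatives, and the illustrative names `structure_matrix` and `eigenvalues`.

```python
import math

def structure_matrix(image, window):
    """Entries (a, b, c) of M = sum [[Ix^2, Ix*Iy], [Ix*Iy, Iy^2]] over W."""
    x0, y0, x1, y1 = window  # inclusive bounds, away from the image border
    a = b = c = 0.0
    for y in range(y0, y1 + 1):
        for x in range(x0, x1 + 1):
            ix = (image[y][x + 1] - image[y][x - 1]) / 2.0  # central difference
            iy = (image[y + 1][x] - image[y - 1][x]) / 2.0
            a += ix * ix
            b += ix * iy
            c += iy * iy
    return a, b, c

def eigenvalues(a, b, c):
    """Eigenvalues of the symmetric matrix [[a, b], [b, c]], largest first."""
    mean = (a + c) / 2.0
    root = math.sqrt(((a - c) / 2.0) ** 2 + b * b)
    return mean + root, mean - root

# 10x10 image with a bright square occupying x >= 5, y >= 5 (image[y][x]).
image = [[1.0 if (x >= 5 and y >= 5) else 0.0 for x in range(10)]
         for y in range(10)]

flat = eigenvalues(*structure_matrix(image, (1, 1, 3, 3)))
edge = eigenvalues(*structure_matrix(image, (7, 3, 8, 6)))
corner = eigenvalues(*structure_matrix(image, (3, 3, 6, 6)))
```

The flat window gives two zero eigenvalues, the edge window gives one large and one zero eigenvalue, and the corner window gives two eigenvalues of similar size (1.25 and 0.75).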
Corner response function
$R = \det(M) - \alpha \, \mathrm{trace}(M)^2 = \lambda_1 \lambda_2 - \alpha (\lambda_1 + \lambda_2)^2$
α: constant (0.04 to 0.06)
• “Corner”: R > 0
• “Edge”: R < 0
• “Flat” region: |R| small
Harris Corner Detector
• Algorithm steps:
  – Compute M matrix within all image windows to get their R scores
  – Find points with large corner response (R > threshold)
  – Take the points of local maxima of R
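The three steps can be sketched end to end in plain Python. Assumptions made for this illustration: a 3×3 box window, central-difference derivatives that are left at zero on the image border, α = 0.05, a 3×3 neighborhood for the local-maximum test, and the names `harris_response` and `harris_corners`.

```python
def harris_response(image, alpha=0.05, half=1):
    """R = det(M) - alpha * trace(M)^2 at every pixel (box window)."""
    rows, cols = len(image), len(image[0])
    Ix = [[0.0] * cols for _ in range(rows)]
    Iy = [[0.0] * cols for _ in range(rows)]
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            Ix[y][x] = (image[y][x + 1] - image[y][x - 1]) / 2.0
            Iy[y][x] = (image[y + 1][x] - image[y - 1][x]) / 2.0
    R = [[0.0] * cols for _ in range(rows)]
    for y in range(half, rows - half):
        for x in range(half, cols - half):
            a = b = c = 0.0  # entries of M summed over the window
            for dy in range(-half, half + 1):
                for dx in range(-half, half + 1):
                    ix, iy = Ix[y + dy][x + dx], Iy[y + dy][x + dx]
                    a += ix * ix
                    b += ix * iy
                    c += iy * iy
            R[y][x] = (a * c - b * b) - alpha * (a + c) ** 2
    return R

def harris_corners(R, threshold):
    """Keep pixels with R > threshold that are local maxima of R."""
    corners = []
    for y in range(1, len(R) - 1):
        for x in range(1, len(R[0]) - 1):
            if R[y][x] <= threshold:
                continue
            neighbors = [R[y + dy][x + dx]
                         for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                         if (dy, dx) != (0, 0)]
            if R[y][x] >= max(neighbors):
                corners.append((x, y))
    return corners

# 12x12 image with a bright square occupying x >= 6, y >= 6 (image[y][x]).
image = [[1.0 if (x >= 6 and y >= 6) else 0.0 for x in range(12)]
         for y in range(12)]
R = harris_response(image)
corners = harris_corners(R, threshold=0.1)
```

On this image R is positive only around the corner of the square, negative along its two edges, and zero in the flat regions, so the single detection is the pixel (6, 6).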
Harris Detector: Workflow
Slide adapted from Darya Frolova, Denis Simakov, Weizmann Institute.
• Compute corner response R
• Find points with large corner response: R > threshold
• Take only the points of local maxima of R
Harris Detector: Properties
• Rotation invariance
Ellipse rotates but its shape (i.e. eigenvalues) remains the same
Corner response R is invariant to image rotation
Harris Detector: Properties
• Not invariant to image scale
When the corner is magnified, all points along it will be classified as edges; at the original scale it is detected as a corner.