Lecture 2 - Feature Extraction
DD2427
March 21, 2013
Motivation for Today’s Lecture
Recap: Approach to recognition
• Assume: An object class is represented by a set of exemplar images.
• For recognition: Compare a novel image to this set of labelled images via an intermediary feature vector representation.
• Thus: No explicit 3D model of the object or the physics of imaging is required.
Focus of Today’s Lecture
Consider the type of features that have been used for this task.
Properties of an ideal feature
• Ideally, our feature descriptor should be invariant to:
• Viewpoint changes
− translation
− scale changes
− out-of-plane rotations
• Illumination changes
• Clutter (a computer vision term for the other content in the image that does not correspond to what you are looking for, corrupts your measurements, and confuses your detector)
• Partial occlusions
• Intra-class variation
while also being....
• Distinctive - features extracted from car images differ from those extracted from chair images.
• Fast to compute
The two extremes
Ideal features / Far from ideal
Back in the real world
• Must forget about the nirvana of ideal feature descriptors.
• Must strike a balance between:
building invariances into the descriptor
while
incorporating the modes of variation not accounted for by the descriptor into the training data and the search
In object recognition this is achieved via a smart trade-off of
feature descriptor
AND
classifier/recognition algorithm
AND
adequate training data
Simple Image Patch Description
Global image patch descriptions
Template (Array of pixel intensities)
Fixed spatial grid ⇒ not invariant to geometric transforms.
Grayscale/Colour Histogram
Invariant to geometric transforms.
Definition of 1D histogram
• Given
- N data points with scalar values f1, . . . , fN with each fi ∈ R.
- m intervals/bins defined by the points b0, b1, . . . , bm where
bi < bi+1.
• Histogram definition
Histogram h = (h1, . . . , hm) records the number of points fj falling into each bin.
• Calculation of histogram
Set h = (0, . . . , 0), then
for i = 1 to N
find the j such that fi ∈ [bj−1, bj) and set hj = hj + 1
end
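The loop above can be sketched in NumPy. This is a minimal illustration; the function name and the bin edges are my own, not from the slides:

```python
import numpy as np

def histogram_1d(f, bins):
    """Count how many data points fall into each half-open bin [b_{j-1}, b_j)."""
    h = np.zeros(len(bins) - 1, dtype=int)
    for fi in f:
        # find j such that bins[j] <= fi < bins[j+1]
        j = np.searchsorted(bins, fi, side='right') - 1
        if 0 <= j < len(h):          # ignore points outside the bin range
            h[j] += 1
    return h

# four bins over [0, 1): edges 0, 0.25, 0.5, 0.75, 1.0
data = [0.1, 0.2, 0.6, 0.8, 0.3]
h = histogram_1d(data, [0.0, 0.25, 0.5, 0.75, 1.0])
```

`np.searchsorted` with `side='right'` implements the half-open interval convention fi ∈ [bj−1, bj) from the definition above.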
Example: intensity histogram
Image histogram of intensity values.
Which geometric transformations of the mug wouldn’t change this histogram?
Histogram of hue
Colour histograms
The hue value of each pixel is plotted against its frequency.
Weighted histogram
• Given
Sometimes the N data points f1, . . . , fN also have non-negative weights w1, . . . , wN associated with them.
• Weighted histogram definition
The weighted histogram h = (h1, . . . , hm) then records the sum of the weights of the points fj that fall into each bin.
• Calculate as follows
Set h = (0, . . . , 0), then
for i = 1 to N
find the j such that bj−1 ≤ fi < bj and set hj = hj + wi
end
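The only change from the unweighted version is accumulating the point's weight wi instead of a count of 1 (an illustrative sketch, with my own example data):

```python
import numpy as np

def weighted_histogram(f, w, bins):
    """Accumulate the weight w_i of each point f_i into its bin."""
    h = np.zeros(len(bins) - 1)
    for fi, wi in zip(f, w):
        j = np.searchsorted(bins, fi, side='right') - 1
        if 0 <= j < len(h):
            h[j] += wi   # add the point's weight w_i, not a count of 1
    return h

data    = [0.1, 0.6, 0.7]
weights = [2.0, 0.5, 0.5]
h = weighted_histogram(data, weights, [0.0, 0.5, 1.0])
```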
Multi-dimensional histogram
• Can also have multi-dimensional histograms.
• Example: histogramming the RGB values in an image.
Figure: image → 3D histogram.
• Compute 3D histogram as
h(r, g, b) = #(pixels with color (r, g, b))
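A direct (slow but clear) sketch of this 3D colour histogram, with my own choice of 4 bins per channel and a tiny made-up test image:

```python
import numpy as np

def rgb_histogram(img, bins_per_channel=4):
    """h(r, g, b) = number of pixels whose quantised colour is (r, g, b)."""
    # img: H x W x 3 array of uint8 values in [0, 255]
    q = (img.astype(int) * bins_per_channel) // 256   # quantise each channel
    h = np.zeros((bins_per_channel,) * 3, dtype=int)
    for r, g, b in q.reshape(-1, 3):
        h[r, g, b] += 1
    return h

# a tiny 2x2 image: two red pixels, one green, one blue
img = np.array([[[255, 0, 0], [255, 0, 0]],
                [[0, 255, 0], [0, 0, 255]]], dtype=np.uint8)
h = rgb_histogram(img)
```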
Colour histograms
Pros: Robust to geometric transforms and partial occlusions.
Figure: images → 3D histograms.
Colour histograms
Cons
• May be sensitive to illumination changes (can sometimes be fixed)
• Many different images will have very similar histograms, and many objects from the same class will have very different histograms. (perhaps fatal)
Do we have a problem?
Other potential clashes?
Colour normalisation
One component of the 3D color space is intensity.
• If a color vector is multiplied by a scalar, the intensity changes, but not the color itself.
• This means colors can be normalized by the intensity, defined by I = R + G + B.
• Chromatic representation:
r = R/(R+G+B), g = G/(R+G+B), b = B/(R+G+B)
• Since r + g + b = 1, this normalization makes the data effectively 2-dimensional, so only r and g are needed for the description task.
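A sketch of the chromatic normalisation; the guard against division by zero on black pixels is my own addition, not part of the slide:

```python
import numpy as np

def chromaticity(img):
    """Normalise each pixel by its intensity I = R + G + B."""
    rgb = img.astype(float)
    intensity = rgb.sum(axis=2, keepdims=True)
    intensity[intensity == 0] = 1.0          # avoid division by zero on black pixels
    rg = rgb / intensity
    return rg[..., 0], rg[..., 1]            # r + g + b = 1, so b is redundant

img = np.array([[[100, 100, 100], [200, 0, 0]]], dtype=np.uint8)
r, g = chromaticity(img)
```

Multiplying any pixel's colour vector by a scalar leaves (r, g) unchanged, which is exactly the invariance the slide describes.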
Normalization
• Remove contrast and constant additive luminance variations.
• By fixing the first and second moments to standard values:
In(x, y) = (I(x, y) − µ)/σ
where
µ = (1/n_p) ∑_{x,y} I(x, y),  σ² = (1/(n_p − 1)) ∑_{x,y} (I(x, y) − µ)²
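A sketch of this normalisation in NumPy (illustrative; `ddof=1` gives the 1/(n_p − 1) variance estimator used on the slide):

```python
import numpy as np

def normalise(I):
    """Zero-mean, unit-variance normalisation of a grayscale image."""
    mu = I.mean()
    sigma = I.std(ddof=1)   # unbiased estimator, matching 1/(n_p - 1)
    return (I - mu) / sigma

I = np.array([[10., 20.], [30., 40.]])
In = normalise(I)
```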
Figure: a) before normalization, b)–c) after normalization.
Slide Source: Computer vision: models, learning and inference. 2011 Simon J.D. Prince
Histogram Equalization
• Make all of the moments the same by ...
• Forcing the histogram of intensities to be the same.
Figure: a) before, b) normalized, c) histogram equalized.
Slide Source: Computer vision: models, learning and inference. 2011 Simon J.D. Prince
Histogram Equalization
• Compute the histogram of the pixel intensities:
h_k = ∑_x ∑_y δ(I(x, y) − k),  for k = 0, . . . , 255
• Compute the normalized cumulative histogram:
c_k = (1/n_p) ∑_{l=0}^{k} h_l
• Use the cumulative histogram as a look-up table:
I_e(x, y) = 255 × c_{I(x,y)}
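The three steps can be sketched in NumPy (an illustrative implementation; the tiny test image is my own):

```python
import numpy as np

def equalise(I):
    """Map intensities through the normalised cumulative histogram."""
    h = np.bincount(I.ravel(), minlength=256)     # h_k = #pixels with value k
    c = np.cumsum(h) / I.size                     # normalised cumulative histogram
    return np.round(255 * c[I]).astype(np.uint8)  # look-up: I_e = 255 * c_{I(x,y)}

I = np.array([[0, 0], [128, 255]], dtype=np.uint8)
Ie = equalise(I)
```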
Slide Source: Computer vision: models, learning and inference. 2011 Simon J.D. Prince
Can histogram other quantities:
Per-pixel transformations
Remember: Spatial filtering
Spatial domain process denoted by
g(x, y) = T [f(x, y)]
where
• f(x, y) is the input image,
• g(x, y) is the output image and
• T is an operator on f defined over a neighbourhood of (x, y).
Example: A 3 × 3 neighbourhood about a point (x, y) in an image in the spatial domain. The neighbourhood is moved from pixel to pixel in the image to generate an output image.
Form a new image whose pixels are a function of the original pixel values.
Spatial filtering
A spatial filter consists of a
1. neighbourhood
2. predefined operation performed on the image pixels within the neighbourhood.
If the operation performed is linear, then the filter is called a linear spatial filter and has the form
g(x, y) = ∑_{s=−a}^{a} ∑_{t=−b}^{b} w(s, t) f(x+s, y+t)
for a mask of size m × n where m = 2a + 1 and n = 2b + 1.
Generally, filters have odd size so that the filter centre falls on integer values.
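The formula can be sketched directly in NumPy. This is illustrative; border handling by edge replication is one choice among several, and is my own here:

```python
import numpy as np

def apply_filter(f, w):
    """Linear spatial filter: g(x, y) = sum_{s,t} w(s, t) f(x+s, y+t)."""
    a, b = w.shape[0] // 2, w.shape[1] // 2
    fp = np.pad(f, ((a, a), (b, b)), mode='edge')   # replicate border pixels
    g = np.zeros(f.shape)
    for x in range(f.shape[0]):
        for y in range(f.shape[1]):
            # sum of products over the neighbourhood under the mask
            g[x, y] = np.sum(w * fp[x:x + w.shape[0], y:y + w.shape[1]])
    return g

f = np.ones((4, 4))
box = np.ones((3, 3)) / 9.0       # 3x3 box filter (see the next slides)
g = apply_filter(f, box)
```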
Spatial filtering
Figure: the 3 × 3 filter coefficients w(−1,−1), . . . , w(1,1) overlaid on the image pixels f(x−1, y−1), . . . , f(x+1, y+1) under the filter.
g(x, y) = ∑_{s=−a}^{a} ∑_{t=−b}^{b} w(s, t) f(x+s, y+t)
Spatial correlation and convolution
Correlation Move a filter over the image and compute the sum of products at each location as explained:
w(x, y) ⋆ f(x, y) = ∑_{s=−a}^{a} ∑_{t=−b}^{b} w(s, t) f(x+s, y+t)
Convolution Same as correlation except the filter is first rotated by 180°:
w(x, y) ∗ f(x, y) = ∑_{s=−a}^{a} ∑_{t=−b}^{b} w(s, t) f(x−s, y−t)
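A sketch of both operations, assuming zero padding at the border (my choice, not the slides'). For a symmetric filter the two coincide; for an antisymmetric one they differ only in sign:

```python
import numpy as np

def correlate(f, w):
    """Correlation: slide w over f and sum products at each location."""
    a, b = w.shape[0] // 2, w.shape[1] // 2
    fp = np.pad(f, ((a, a), (b, b)))   # zero-pad the border
    return np.array([[np.sum(w * fp[x:x + w.shape[0], y:y + w.shape[1]])
                      for y in range(f.shape[1])]
                     for x in range(f.shape[0])])

def convolve(f, w):
    """Convolution: correlation with the filter rotated by 180 degrees."""
    return correlate(f, np.rot90(w, 2))

f = np.arange(16, dtype=float).reshape(4, 4)
w_sym = np.ones((3, 3)) / 9.0                  # symmetric: conv == corr
w_asym = np.array([[-1., 0., 1.]] * 3)         # antisymmetric: conv == -corr
g_conv, g_corr = convolve(f, w_asym), correlate(f, w_asym)
```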
Spatial filter masks
The coefficients wj define the effect of applying the filter.
Example 1:
Let wj = 1 for j = 1, . . . , nm (a mask of all ones):
⇒ g(x, y) = ∑_{s=−a}^{a} ∑_{t=−b}^{b} f(x+s, y+t)
⇒ g(x, y) = sum of pixel intensities in the n × m neighbourhood.
What values of wj make g(x, y) the average intensity of a pixel in the n × m neighbourhood?
Spatial filter masks
Example 2: Box filter (a smoothing filter)
Let wj = 1/nm for j = 1, . . . , nm (for 3 × 3, a mask of ones scaled by 1/9):
⇒ g(x, y) = (1/nm) ∑_{s=−a}^{a} ∑_{t=−b}^{b} f(x+s, y+t)
⇒ g(x, y) = average intensity of a pixel in the n × m neighbourhood.
Smoothing linear filters
• The output of a smoothing linear spatial filter is simply the weighted average of the pixels in the neighbourhood of the filter mask.
• These filters are often referred to as weighted averaging filters and act as lowpass filters.
g(x, y) = (∑_{s=−a}^{a} ∑_{t=−b}^{b} w(s, t) f(x+s, y+t)) / (∑_{s=−a}^{a} ∑_{t=−b}^{b} w(s, t))
• Each w(s, t) ≥ 0.
• Smoothing blurs an image.
• Hopefully it reduces the amount of irrelevant detail in an image.
• But it may destroy boundary edge information.
Smoothing linear filters
Example: Gaussian filter
w(s, t) ∝ exp(−(s² + t²)/(2σ²))
Smoothing with a Gaussian
• Smoothing with a box actually doesn’t compare at all well with a defocussed lens.
• The most obvious difference is that a single point of light viewed through a defocussed lens looks like a fuzzy blob, but the box filter would give a little square.
• A Gaussian gives a good model of a fuzzy blob.
• It closely models many physical processes (the sum of many small effects).
In theory, the Gaussian function is non-zero everywhere, which would require an infinitely large convolution kernel, but in practice it is effectively zero more than about three standard deviations from the mean, so we can truncate the kernel at this point.
As a crude rule of thumb:
• if you have a square n × n mask with n = 2a + 1, then set σ = a/2;
• if you want to smooth the image by a Gaussian with a given σ, use a filter mask of radius a = 2σ.
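Building a truncated Gaussian mask with the rule of thumb a = 2σ can be sketched as follows (illustrative; normalising the weights to sum to 1 is my own choice):

```python
import numpy as np

def gaussian_kernel(sigma):
    """Truncate the Gaussian at the rule-of-thumb radius a = 2*sigma."""
    a = int(np.ceil(2 * sigma))
    s, t = np.mgrid[-a:a + 1, -a:a + 1]          # offsets s, t in [-a, a]
    w = np.exp(-(s**2 + t**2) / (2 * sigma**2))
    return w / w.sum()                           # weights sum to 1

w = gaussian_kernel(sigma=1.0)                   # a = 2, so a 5x5 mask
```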
Example
Figure: original image, smoothed image, very smoothed image.
Edges
Edges are significant local changes of intensity in an image.
Causes of these intensity changes:
Geometric events
• object boundary (discontinuity in depth and/or surface color and texture)
• surface boundary (discontinuity in surface orientation and/or surface color)
Non-geometric events
• specularity (direct reflection of light, such as a mirror)
• shadows (from other objects or from the same object)
• inter-reflections
Want to construct linear filters that respond to such edges.
Edge description
Edge normal: unit vector in the direction of maximum intensity change.
Edge direction: unit vector perpendicular to the edge normal.
Edge position or center: the image position at which the edge is located.
Edge strength: related to the local image contrast along the normal.
Derivatives and edges
An edge is a place of rapid change in the image intensity function.
• Calculus describes changes of continuous functions using derivatives.
• An image is a 2D function, so operators describing edges are expressed using partial derivatives.
• Can approximate the derivative of a discrete signal by finite differences:
∂f(x, y)/∂x = lim_{h→0} [f(x+h, y) − f(x, y)]/h ≈ f(x+1, y) − f(x, y)  (h = 1)
• Therefore, the linear filter with mask [−1, 1] approximates the first derivative.
• Normally the mask [−1, 0, 1] is used, as its length is odd.
Edge detection using the gradient
Definition of the gradient
The gradient vector
∇f = (∂f/∂x, ∂f/∂y)ᵀ = (Mx, My)ᵀ
has an associated magnitude and direction:
‖∇f‖ = √(Mx² + My²),  dir(∇f) = tan⁻¹(My/Mx)
Properties of the gradient
• The magnitude of the gradient indicates the strength of the edge.
• The gradient direction is perpendicular to the direction of the edge (the edge direction is rotated with respect to the gradient direction by −90 degrees).
Edge detection using the gradient
Estimating the gradient with finite differences
∂f/∂x = lim_{h→0} [f(x+h, y) − f(x, y)]/h,  ∂f/∂y = lim_{h→0} [f(x, y+h) − f(x, y)]/h
The gradient can be approximated by finite differences:
∂f/∂x ≈ f(x+1, y) − f(x, y),  ∂f/∂y ≈ f(x, y+1) − f(x, y)
Edge detection using the gradient
Linear filter masks used to calculate ∂f/∂x:
[−1 0 1]
Prewitt: [−1 0 1; −1 0 1; −1 0 1]   Sobel: [−1 0 1; −2 0 2; −1 0 1]
Linear filter masks used to approximate ∂f/∂y:
[−1; 0; 1]
Prewitt: [−1 −1 −1; 0 0 0; 1 1 1]   Sobel: [−1 −2 −1; 0 0 0; 1 2 1]
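The Sobel masks can be applied to estimate the gradient magnitude and direction (an illustrative NumPy sketch; edge-replication padding and the tiny step-edge test image are my own):

```python
import numpy as np

# Sobel masks: SOBEL_X differences along the second (column) index
SOBEL_X = np.array([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]])
SOBEL_Y = SOBEL_X.T

def gradient(f):
    """Estimate Mx, My with Sobel masks, then gradient magnitude/direction."""
    fp = np.pad(f, 1, mode='edge')               # replicate the border
    Mx = np.zeros(f.shape)
    My = np.zeros(f.shape)
    for x in range(f.shape[0]):
        for y in range(f.shape[1]):
            win = fp[x:x + 3, y:y + 3]
            Mx[x, y] = np.sum(SOBEL_X * win)
            My[x, y] = np.sum(SOBEL_Y * win)
    mag = np.sqrt(Mx**2 + My**2)
    direction = np.arctan2(My, Mx)
    return mag, direction

f = np.zeros((5, 5))
f[:, 3:] = 1.0                                   # a vertical step edge
mag, direction = gradient(f)
```

On the step edge the magnitude is large at the boundary and zero in the flat regions, and the gradient direction points across the edge, as the slide states.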
Example image gradients
Figure: original image, x-derivative, y-derivative, gradient magnitude.
Practical issues
Differential masks act as high-pass filters, which tend to amplify noise.
Practical issues
To reduce the effects of noise, the image needs to be smoothed first with a lowpass filter.
The latter is equivalent to filtering with the derivative of the smoothing filter, because of the differentiation property of convolution:
∂/∂x (h ⋆ f) = (∂h/∂x) ⋆ f
Can also detect edges using the second derivative: the edge is at the zero-crossing of the bottom graph.
Example image
Figure: original image, the Laplacian, absolute Laplacian.
Summary: Edge detection in 2D
Gaussian Derivative of Gaussian
Gabor filters
Figure: a)–d) Gabor filters.
f_{nm} = (1/(2πσ²)) exp{−(m² + n²)/(2σ²)} sin{2π(cos(ω)m + sin(ω)n)/λ + φ}
Slide Source: Computer vision: models, learning and inference. 2011 Simon J.D. Prince
Local Binary Patterns
• The LBP operator returns a discrete value at each pixel characterizing local texture.
• Compare the 8 neighbour pixel intensities to the centre pixel intensity and set
B(x+i, y+j) = Ind{I(x+i, y+j) ≥ I(x, y)}
• Concatenate these binary values in a pre-determined order to obtain a decimal number. The neighbours are numbered:
0 1 2
7   3
6 5 4
Example: LBP = 10010111 = 151
Slide Source: Computer vision: models, learning and inference. 2011 Simon J.D. Prince
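An illustrative sketch of the basic 8-neighbour LBP. The bit ordering is one fixed convention, and the tiny test image is my own, so the resulting code differs from the slide's 151 example:

```python
import numpy as np

# neighbour offsets in the slide's numbering:  0 1 2
#                                              7 . 3
#                                              6 5 4
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
           (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp(I, x, y):
    """8-neighbour LBP code of the (interior) pixel (x, y)."""
    bits = [int(I[x + dx, y + dy] >= I[x, y]) for dx, dy in OFFSETS]
    return int(''.join(map(str, bits)), 2)   # concatenate bits -> decimal

I = np.array([[9, 1, 9],
              [1, 5, 9],
              [1, 1, 9]])
code = lbp(I, 1, 1)   # binary pattern 10111000
```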
Local Binary Patterns
• Can compute LBP over larger areas...
• Compare the current pixel to the image at positions on a circle.
• This LBP is defined by the number of samples P and the radius of the circle R.
Slide Source: Computer vision: models, learning and inference. 2011 Simon J.D. Prince
Textons
• An attempt to characterize texture
• Replace each pixel with an integer representing its texture type.
Slide Source: Computer vision: models, learning and inference. 2011 Simon J.D. Prince
Textons - Step 1: Choose a Filter Bank
Figure (left): a filter bank of
• Gaussians
• derivatives of Gaussians
• Laplacians of Gaussians
applied to each color channel.
Figure (right): a rotationally invariant filter bank.
Slide Source: Computer vision: models, learning and inference. 2011 Simon J.D. Prince
Textons - Step 1: Choose a Filter Bank
Maximum response (MR8) filter bank
• Bar filter at 3 scales, replicated at 6 orientations
• Edge filter at 3 scales, replicated at 6 orientations
• Gaussian filter
• Laplacian of Gaussian filter
Rotational invariance is induced by only using the maximum filter response over orientation for the bar and edge filters.
Slide Source: Computer vision: models, learning and inference. 2011 Simon J.D. Prince
Textons - Step 2: Learn Texture Types
• Convolve the N filters (from the filter bank) with a set of training images.
• For each pixel, obtain an N × 1 vector of filter responses.
• Cluster these vectors into K clusters.
• The cluster means are the prototypes for the texture types.
Slide Source: Computer vision: models, learning and inference. 2011 Simon J.D. Prince
Textons - Step 3: Assign Texton to new pixel
For new pixel
• Filter the surrounding region with the same filter bank
• Obtain an N × 1 vector of filter responses
• Assign the pixel to the nearest cluster center
Slide Source: Computer vision: models, learning and inference. 2011 Simon J.D. Prince
Image Patch Descriptors
Global image patch descriptors
Have seen descriptors
Template (Vector of pixel intensities)
Grayscale/Colour Histogram
Would like a descriptor which has the advantages of both:
• Invariance to small translational shifts and rotations
• Encoding of the relative spatial relationships between different parts of the image
Two recent descriptors
Such image patch descriptors are:
• Distinctive image features from scale-invariant keypoints, D.G. Lowe, International Journal of Computer Vision, 2004.
- In recent years this has been one of the most highly cited papers in the field of computer science.
• Histograms of Oriented Gradients for Human Detection, Navneet Dalal and Bill Triggs, Computer Vision and Pattern Recognition, 2005.
- The descriptor used is the basis for one of the best person detectors in images in the research community.
There are great similarities between the two, and both rely on histogramming image gradient orientations.
Histogram of gradient orientations
Figure: image and its image gradients.
Histogram the gradient orientations (weighted according to their gradient magnitudes) to get:
Histogram of gradient orientations
This histogram can be interpreted as a one-dimensional vector h with n entries, where each entry is the frequency of a bin centre.
What happens to h if
• the four translates slightly within the image frame?
• the four is very slightly rotated?
• the four is rotated by 90° clockwise?
Histogram of gradient orientations
Answers
• h is invariant to translation shifts as long as the same pixels are used for the gradient computations.
• There could be a large change in h. Why? (Aliasing)
• h will probably be very different. Each orientation will differ by 90°, resulting in the histogramming of a very different set of numbers.
Note
• The last point =⇒ the descriptor is not rotationally invariant.
• However, the middle condition is most worrisome.
• Want small changes in the appearance of a patch =⇒ small changes in the feature description.
• There is a solution to avoid this...
Avoiding aliasing
• Problem
- Each entry voting only for its nearest orientation bin results in possible aliasing effects.
- This can cause sudden changes in the computed feature.
• Solution
- In the histogram computation, distribute the weight of the gradient magnitude for every pixel into neighbouring orientation bins.
- Let
∗ b be the inter-bin distance of our histogram h
∗ h(x) be the value of the histogram for the bin centred at x.
- Assume that we want to interpolate a weight w at point x into the histogram.
Avoiding aliasing
Solution
- Let
  – b be the inter-bin distance of our histogram h
  – h(x) be the value of the histogram for the bin centred at x.
- Assume that we want to interpolate a weight w at point x into the histogram.
- Let x1 and x2 be the two nearest neighbouring bins of the point x such that x1 ≤ x < x2.
- Linear interpolation distributes the weight w into the two nearest neighbours as follows:

  h(x1) ← h(x1) + w (1 − (x − x1)/b),    h(x2) ← h(x2) + w ((x − x1)/b)

Note: when histogramming orientations/angles, the bins have to be wrapped around, since 360◦ = 0◦.
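The soft-binning update above can be sketched as follows. This is an illustrative Python version (the course itself uses Matlab); the function name and the choice of bin centres at multiples of b are assumptions for the sketch, with wrap-around handled by the modulo:

```python
import numpy as np

def soft_bin_orientations(angles, weights, n_bins=8):
    """Build an orientation histogram with linear (soft) binning.

    Each gradient's weight w is split between its two nearest bin
    centres, proportionally to its distance from each, so a small
    rotation of the patch changes the histogram only slightly.
    Bins wrap around, since 360 degrees = 0 degrees.
    """
    b = 2 * np.pi / n_bins                # inter-bin distance
    h = np.zeros(n_bins)
    for theta, w in zip(angles, weights):
        pos = (theta % (2 * np.pi)) / b   # position measured in bin units
        i1 = int(np.floor(pos)) % n_bins  # nearest bin centre below, x1
        i2 = (i1 + 1) % n_bins            # nearest bin centre above, wrapped
        frac = pos - np.floor(pos)        # (x - x1) / b
        h[i1] += w * (1 - frac)           # h(x1) <- h(x1) + w(1 - (x - x1)/b)
        h[i2] += w * frac                 # h(x2) <- h(x2) + w((x - x1)/b)
    return h
```

An angle exactly on a bin centre contributes only to that bin; an angle halfway between two centres contributes half its weight to each, and the total weight is always conserved.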
![Page 71: Lecture 2 - Feature Extraction](https://reader030.vdocuments.us/reader030/viewer/2022021617/620ab634a8f29c77d06fccc6/html5/thumbnails/71.jpg)
SIFT patch descriptor
• Compute and threshold image gradients.
• Create an array of orientation histograms.
• 4 × 4 × 8 orientation histogram array = 128 dimensions.
(Embedded slides: David Lowe, Object Recognition 2, 3/18/2007.
• Scale space is processed one octave at a time; keypoints are detected as maxima and minima of the difference-of-Gaussian in scale space.
• Sampling frequency for scale: more points are found as the sampling frequency increases, but matching accuracy decreases beyond 3 scales/octave.
• Select canonical orientation: create a histogram of local gradient directions computed at the selected scale and assign the canonical orientation at the peak of the smoothed histogram; each key specifies stable 2D coordinates (x, y, scale, orientation).
• Example of keypoint detection, thresholding on the value at the DoG peak and on the ratio of principal curvatures (Harris approach): (a) 233×189 image, (b) 832 DoG extrema, (c) 729 left after the peak-value threshold, (d) 536 left after testing the ratio of principal curvatures.
• SIFT vector formation: thresholded image gradients are sampled over a 16×16 array of locations in scale space; 8 orientations × 4×4 histogram array = 128 dimensions.)
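The 4 × 4 × 8 = 128-dimensional vector formation can be sketched in Python as follows. This is a deliberately simplified illustration, not Lowe's full pipeline: it uses hard orientation binning and omits the Gaussian spatial weighting and trilinear interpolation of the real SIFT descriptor; the function name is a placeholder:

```python
import numpy as np

def sift_like_descriptor(patch):
    """Simplified SIFT-style vector formation: a 16x16 patch is
    divided into a 4x4 grid of cells, each cell accumulates an
    8-bin orientation histogram weighted by gradient magnitude,
    and the result is flattened to 128-D and normalised."""
    assert patch.shape == (16, 16)
    gy, gx = np.gradient(patch.astype(float))   # per-axis derivatives
    mag = np.hypot(gx, gy)                      # gradient magnitude
    ori = np.arctan2(gy, gx) % (2 * np.pi)      # orientation in [0, 2*pi)
    desc = np.zeros((4, 4, 8))
    for i in range(16):
        for j in range(16):
            b = int(ori[i, j] / (2 * np.pi / 8)) % 8  # hard binning for brevity
            desc[i // 4, j // 4, b] += mag[i, j]      # weight by magnitude
    v = desc.ravel()                            # 4 x 4 x 8 = 128 dimensions
    return v / (np.linalg.norm(v) + 1e-12)      # normalise for illumination invariance
```

The final normalisation makes the descriptor invariant to uniform gradient-magnitude scaling, one of the illumination invariances discussed earlier.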
![Page 72: Lecture 2 - Feature Extraction](https://reader030.vdocuments.us/reader030/viewer/2022021617/620ab634a8f29c77d06fccc6/html5/thumbnails/72.jpg)
Invariance to rotation
• Create a histogram of local gradient directions.
• Assign the canonical orientation at the peak of the smoothed histogram.
• Rotate the patch so that the dominant direction is vertical.
![Page 73: Lecture 2 - Feature Extraction](https://reader030.vdocuments.us/reader030/viewer/2022021617/620ab634a8f29c77d06fccc6/html5/thumbnails/73.jpg)
Histogram of Oriented Gradients descriptor
State-of-the-art (so far): Histogram of Oriented Gradients for Human Detection – N. Dalal & B. Triggs, CVPR 2005: 90% detection rate at 10−4 FPPW.
Pictures from Dalal's talk.
Constructing HOG feature
Structure:
• Have a 2D grid of cells. Each cell is of size η × η pixels.
• Have m blocks where each block is a grid of ζ × ζ cells.
• Different blocks may contain some of the same cells.
![Page 74: Lecture 2 - Feature Extraction](https://reader030.vdocuments.us/reader030/viewer/2022021617/620ab634a8f29c77d06fccc6/html5/thumbnails/74.jpg)
Histogram of Oriented Gradients descriptor
Constructing HOG feature
The descriptor
• Histogram the gradient orientations within a cell into 9 bins.
• Contribution of each gradient to the histogram ∝ its magnitude.
• Concatenate the cell histograms in a block.
• Normalize this HOG feature to normalize contrast within the block.
• Concatenate all block HOG features into one long vector.
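The construction above can be sketched in Python. This is an illustrative sketch using the slide's notation (η × η pixel cells, ζ × ζ cell blocks, 9 bins); the one-cell block stride, unsigned orientations, and plain L2 normalisation are assumptions, since the slides do not pin those details down:

```python
import numpy as np

def hog_feature(mag, ori, eta=8, zeta=2, n_bins=9):
    """Sketch of HOG construction: cells of eta x eta pixels, blocks of
    zeta x zeta cells (overlapping, stride one cell), 9 orientation
    bins per cell, L2 block normalisation, all blocks concatenated."""
    H, W = mag.shape
    cy, cx = H // eta, W // eta
    cells = np.zeros((cy, cx, n_bins))
    bin_w = np.pi / n_bins                  # unsigned orientations in [0, pi)
    for i in range(cy * eta):
        for j in range(cx * eta):
            b = int((ori[i, j] % np.pi) / bin_w) % n_bins
            cells[i // eta, j // eta, b] += mag[i, j]  # contribution ∝ magnitude
    blocks = []
    for i in range(cy - zeta + 1):          # different blocks share cells (overlap)
        for j in range(cx - zeta + 1):
            v = cells[i:i + zeta, j:j + zeta].ravel()
            blocks.append(v / (np.linalg.norm(v) + 1e-12))  # contrast-normalise block
    return np.concatenate(blocks)
```

Because neighbouring blocks overlap, each cell histogram typically appears several times in the final vector, each copy normalised against a different local contrast.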
![Page 75: Lecture 2 - Feature Extraction](https://reader030.vdocuments.us/reader030/viewer/2022021617/620ab634a8f29c77d06fccc6/html5/thumbnails/75.jpg)
HOG descriptor
Slide Source: Computer vision: models, learning and inference. 2011 Simon J.D. Prince
![Page 76: Lecture 2 - Feature Extraction](https://reader030.vdocuments.us/reader030/viewer/2022021617/620ab634a8f29c77d06fccc6/html5/thumbnails/76.jpg)
Shape Context Descriptor
Slide Source: Computer vision: models, learning and inference. 2011 Simon J.D. Prince
![Page 77: Lecture 2 - Feature Extraction](https://reader030.vdocuments.us/reader030/viewer/2022021617/620ab634a8f29c77d06fccc6/html5/thumbnails/77.jpg)
The Search Problem
![Page 78: Lecture 2 - Feature Extraction](https://reader030.vdocuments.us/reader030/viewer/2022021617/620ab634a8f29c77d06fccc6/html5/thumbnails/78.jpg)
Story so far
• Have introduced some methods to describe the appearance of an image patch via a feature vector (SIFT, HOG, etc.).
• For patches of similar appearance, their computed feature vectors should be similar.
• For patches of dissimilar appearance, their feature vectors should differ.
• Feature vectors are designed to be invariant to
- common geometric transformations
- common illumination changes
that superficially change the pixel appearance of the patch.
![Page 79: Lecture 2 - Feature Extraction](https://reader030.vdocuments.us/reader030/viewer/2022021617/620ab634a8f29c77d06fccc6/html5/thumbnails/79.jpg)
Next problem
• Have a reference image patch and its feature vector fr.
(Embedded slide, "Face Finder: Training" – positive examples: preprocess ~1,000 example face images into 20 × 20 inputs, generate 15 "clones" of each with small random rotations, scalings, translations, reflections; negative examples: test the net on 120 known "no-face" images.)
⇒ fr
• Given a novel image, identify the patches in this image that correspond to the reference patch.
• One part of the problem we have explored:
A patch from the novel image generates a feature vector fn. If ‖fr − fn‖ is small then this patch can be considered an instance of the texture pattern represented by the reference patch.
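The matching rule above amounts to a single distance comparison; a minimal Python sketch, where the function name and the threshold τ are hypothetical (the slides do not specify how τ is chosen):

```python
import numpy as np

def matches_reference(f_r, f_n, tau):
    """A novel patch with feature vector f_n is taken as an instance of
    the pattern represented by the reference patch (feature vector f_r)
    when the Euclidean distance ||f_r - f_n|| falls below a threshold
    tau. tau is a tuning parameter, not prescribed by the slides."""
    return np.linalg.norm(np.asarray(f_r) - np.asarray(f_n)) < tau
```

In practice τ trades off false positives against missed detections, which is exactly why the feature design aims for small descriptor changes under small appearance changes.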
![Page 80: Lecture 2 - Feature Extraction](https://reader030.vdocuments.us/reader030/viewer/2022021617/620ab634a8f29c77d06fccc6/html5/thumbnails/80.jpg)
Next problem
However, which and how many different image patches do we extract from the novel image?
![Page 81: Lecture 2 - Feature Extraction](https://reader030.vdocuments.us/reader030/viewer/2022021617/620ab634a8f29c77d06fccc6/html5/thumbnails/81.jpg)
Remember..
The sought-after image patch can appear at:
• any spatial location in the image
• any size (the size of an imaged object depends on its distance from the camera)
• multiple locations
Variation in position and size =⇒ multiple detection windows.
![Page 82: Lecture 2 - Feature Extraction](https://reader030.vdocuments.us/reader030/viewer/2022021617/620ab634a8f29c77d06fccc6/html5/thumbnails/82.jpg)
Sliding window technique
• Therefore we must examine patches centred at many different pixel locations and at many different sizes.
• Naive option: exhaustive search using the original image

  for j = 1:n_s
      n = n_min + j*n_step
      for x = 0:x_max
          for y = 0:y_max
              Extract patch centred at (x, y) of size n×n.
              Rescale it to the size of the reference patch.
              Compute feature vector f.

• Computationally intensive, especially if f is expensive to compute, as f could be calculated up to n_s × x_max × y_max times.
• Also, if n is large, it is frequently very costly to compute f.
The next features lecture will review how to do this efficiently...
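The exhaustive loop above can be sketched in Python (the course uses Matlab; this is an illustrative translation). A generator keeps the enumeration separate from the expensive feature computation; the `stride` parameter is an added assumption for stepping more than one pixel at a time:

```python
import numpy as np

def sliding_window_search(image, n_min, n_step, n_s, stride=1):
    """Naive exhaustive search: for each of n_s window sizes, visit
    every valid location and yield (x, y, patch). Rescaling each patch
    to the reference size and computing its feature vector f is the
    expensive part, done up to n_s * x_max * y_max times."""
    y_max, x_max = image.shape
    for j in range(1, n_s + 1):
        n = n_min + j * n_step                     # current window size n x n
        for y in range(0, y_max - n + 1, stride):
            for x in range(0, x_max - n + 1, stride):
                yield x, y, image[y:y + n, x:x + n]
```

Even on a small image the candidate count grows quickly with the number of scales, which motivates the efficient search techniques of the next lecture.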
![Page 83: Lecture 2 - Feature Extraction](https://reader030.vdocuments.us/reader030/viewer/2022021617/620ab634a8f29c77d06fccc6/html5/thumbnails/83.jpg)
Today’s Programming Assignment
![Page 84: Lecture 2 - Feature Extraction](https://reader030.vdocuments.us/reader030/viewer/2022021617/620ab634a8f29c77d06fccc6/html5/thumbnails/84.jpg)
Programming assignment
• Details available on the course website.
• You will write Matlab functions to extract different feature descriptors from an image. You will then compare these features when extracted from images of eyes and noses.
• Important: Due to the short turn-around time until the next lecture, this assignment is not due until the lecture on Monday 25th of March.
• Mail me about any errors you spot in the Exercise notes.
• I will notify the class about errors spotted and corrections via the course website and mailing list.