Introduction to Computer Vision (UAPyCon 2017)
TRANSCRIPT
About me
• Long time Python enthusiast
• Graduated from RWTH Aachen, Germany
• Defended Computer Vision Bachelor and Master theses
• Work at Ecoisme as a Backend and Algorithms Engineer
Visual Odometry (later)
Why CV?
Outline
• Binary and grayscale processing
• Edge and line detection
• Global and local descriptors
• Further topics
Binary images
• A matrix of 0’s and 1’s
• Simplest case
• Distinguish back- and foreground
Greyscale images
• More complicated
• Every pixel can be a float in [0, 1]
• Higher variance
• Often better than color images
Thresholding
• Grayscale to binary
• Use domain knowledge to pick T
• Or calculate it via intra-class variance minimisation (Otsu method)
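The intra-class variance minimisation can be sketched in a few lines of NumPy. The `otsu_threshold` name below is illustrative; in practice OpenCV's `cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)` does the same job. It uses the standard equivalence that minimising within-class variance maximises between-class variance.

```python
import numpy as np

def otsu_threshold(img):
    """Pick T by intra-class variance minimisation (Otsu's method).

    Minimising the within-class variance is equivalent to maximising
    the between-class variance, which is cheaper to compute per T.
    """
    hist = np.bincount(img.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    best_t, best_var = 0, -1.0
    for t in range(1, 256):
        w0, w1 = prob[:t].sum(), prob[t:].sum()   # class probabilities
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (np.arange(t) * prob[:t]).sum() / w0        # class means
        mu1 = (np.arange(t, 256) * prob[t:]).sum() / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t
```

On a bimodal image the chosen T falls between the two modes, so `img > T` separates foreground from background.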
Thresholding
• Grayscale image → Binary mask
• Different variants:
  - One-sided: F_T[i, j] = 1 if F[i, j] ≥ T, 0 otherwise
  - Two-sided: F_T[i, j] = 1 if T1 ≤ F[i, j] ≤ T2, 0 otherwise
  - Set membership: F_T[i, j] = 1 if F[i, j] ∈ Z, 0 otherwise
B. Leibe
Image Source: http://homepages.inf.ed.ac.uk/rbf/HIPR2/
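The three thresholding variants are one-liners in NumPy (function names here are illustrative):

```python
import numpy as np

def threshold_one_sided(F, T):
    # F_T[i, j] = 1 if F[i, j] >= T, 0 otherwise
    return (F >= T).astype(np.uint8)

def threshold_two_sided(F, T1, T2):
    # F_T[i, j] = 1 if T1 <= F[i, j] <= T2, 0 otherwise
    return ((F >= T1) & (F <= T2)).astype(np.uint8)

def threshold_set(F, Z):
    # F_T[i, j] = 1 if F[i, j] is in the set Z, 0 otherwise
    return np.isin(F, list(Z)).astype(np.uint8)
```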
Thresholding
Niblack Threshold
Solution: use a local window W with independent thresholds
Local Binarization [Niblack'86]
• Estimate a local threshold within a small neighborhood window W:
  T_W = μ_W + k · σ_W, where k ∈ [-1, 0] is a user-defined parameter
B. Leibe
What is the hidden assumption here?
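Per-window thresholding T_W = μ_W + k·σ_W can be sketched directly in NumPy (a naive loop for clarity; `skimage.filters.threshold_niblack` offers an efficient version, and the `niblack_threshold` name below is illustrative):

```python
import numpy as np

def niblack_threshold(img, w=15, k=-0.2):
    """Binarise with a per-pixel threshold T_W = mean_W + k * std_W."""
    r = w // 2
    padded = np.pad(img.astype(float), r, mode="edge")  # full window everywhere
    out = np.zeros(img.shape, dtype=bool)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            win = padded[i:i + w, j:j + w]       # local window W around (i, j)
            t = win.mean() + k * win.std()       # T_W = mu_W + k * sigma_W
            out[i, j] = img[i, j] > t
    return out
```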
Niblack Threshold
Canny edge detector
Intensity gradient
• Filter image with Gaussian derivative approximations
• Gradient magnitude: G = √(Gx² + Gy²)
Canny edge detector
• Filter with Gaussian derivative
• Use gradient magnitude
• Threshold
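The gradient-magnitude step can be sketched with simple central differences (a stand-in for the Gaussian-derivative filters Canny actually uses; the function name is illustrative):

```python
import numpy as np

def gradient_magnitude(img):
    # Central-difference approximation of the image gradient; real Canny
    # convolves with Gaussian derivatives to suppress noise first.
    gx = np.zeros_like(img, dtype=float)
    gy = np.zeros_like(img, dtype=float)
    gx[:, 1:-1] = (img[:, 2:] - img[:, :-2]) / 2.0
    gy[1:-1, :] = (img[2:, :] - img[:-2, :]) / 2.0
    return np.sqrt(gx ** 2 + gy ** 2)            # G = sqrt(Gx^2 + Gy^2)
```

Thresholding `gradient_magnitude(img) > T` then keeps only the strong-gradient pixels as candidate edges.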
Canny edge detector
• Lines have different thickness
• Parts may not survive thresholding
• Use 2 thresholds: a stronger one to start edges, a weaker one to continue them
Line fitting
• Many objects are characterised by straight lines
• Wait, why aren't we done just by running edge detection?
B. Leibe; slide credit: Kristen Grauman
Why not to use edges?
• Noise
• Extra points
• Some parts can be missing
Hough Transform
• Voting technique
• Which lines can be fitted to a given point?
• Extract the lines that get the most votes
Hough space
To each line in image space corresponds a point in Hough space:
• the image-space line y = a0·x + b0 maps to the Hough-space point (a0, b0)
• the line y = a1·x + b1 maps to (a1, b1)
Hough space
To each point in image space corresponds a line in Hough space:
• the image-space point (x0, y0) maps to the Hough-space line b = -x0·a + y0
• the point (x1, y1) maps to b = -x1·a + y1
Hough transform
• For numerical reasons, it is better to use the polar representation

Polar Representation for Lines
• Issues with the usual (m, b) parameter space: the parameters can take infinite values, and vertical lines are undefined
• Point in image space → sinusoid segment in Hough space:
  x · cos θ + y · sin θ = d
• d: perpendicular distance from the line to the origin
• θ: angle the perpendicular makes with the x-axis
Slide adapted from Steve Seitz
Image Edges
Hough Lines
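A toy voting accumulator over the polar parametrisation x·cos θ + y·sin θ = d might look like this (names and the discretisation are illustrative; `cv2.HoughLines` is the production version):

```python
import numpy as np

def hough_lines(edge_points, max_d, n_theta=180):
    """Vote in (theta, d) space using the polar line form x*cos(t) + y*sin(t) = d.

    edge_points: iterable of (x, y) edge-pixel coordinates.
    max_d: largest possible |d| (e.g. the image diagonal).
    """
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    acc = np.zeros((n_theta, 2 * max_d + 1), dtype=int)  # d in [-max_d, max_d]
    for x, y in edge_points:
        # Each image point traces a sinusoid in Hough space and votes along it.
        d = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        acc[np.arange(n_theta), d + max_d] += 1
    return acc, thetas

# Collinear points on the vertical line x = 5 all vote for (theta = 0, d = 5).
acc, thetas = hough_lines([(5, y) for y in range(20)], max_d=30)
```

Peaks in the accumulator (cells with the most votes) correspond to the extracted lines.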
How to tell if images are similar?
• Compare pixels
• Not robust to noise, lighting, or position changes
• Idea: capture color statistics via 3D color histograms
Color Histograms
• Color statistics; here: RGB as an example
• Given: tristimulus values R, G, B for each pixel
• Compute a 3D histogram: H(R, G, B) = #(pixels with color (R, G, B))
[Swain & Ballard, 1991] B. Leibe
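Building H(R, G, B) is one call with NumPy (the `color_histogram` name and the 8-bins-per-channel choice are illustrative):

```python
import numpy as np

def color_histogram(img, bins=8):
    """3D RGB histogram: H[r, g, b] = #(pixels falling in that color bin).

    img: H x W x 3 array with channel values in [0, 256).
    8 bins per channel keeps the histogram small (8^3 = 512 cells).
    """
    pixels = img.reshape(-1, 3)                  # one row per pixel
    hist, _ = np.histogramdd(pixels, bins=(bins, bins, bins),
                             range=((0, 256), (0, 256), (0, 256)))
    return hist
```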
Color histograms
• Robust representation
[Swain & Ballard, 1991] B. Leibe
Comparison
• Euclidean distance
• Mahalanobis distance
• Chi-square
• Earth Mover's Distance
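As one example from the list, the chi-square measure between two (normalised) histograms can be sketched as follows; the function name and the `eps` guard against empty bins are illustrative:

```python
import numpy as np

def chi_square_distance(h1, h2, eps=1e-10):
    # chi^2(h1, h2) = 0.5 * sum_i (h1_i - h2_i)^2 / (h1_i + h2_i)
    h1 = h1.ravel() / h1.sum()   # normalise so image size does not matter
    h2 = h2.ravel() / h2.sum()
    return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))
```

Identical histograms score 0; completely disjoint ones score 1.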
Comparison Measures: Earth Mover's Distance
• Motivation: moving earth between two distributions
Slide adapted from Pete Barnum; B. Leibe
Problems
Challenges: Robustness
• Illumination
• Object pose
• Clutter
• Viewpoint
• Intra-class appearance
• Occlusions
Slide and image credit: Kristen Grauman
Application: Image Matching
Photos by Diva Sian and swashford
B. Leibe; slide credit: Steve Seitz; image credit: Flickr
Hard?
Harder Case
Photos by Diva Sian and scgbt
Slide credit: Steve Seitz; image credit: Flickr
Do you see the match?
Harder Still?
NASA Mars Rover images
Slide credit: Steve Seitz
Do you see the match?
Answer Below (Look for tiny colored squares)
NASA Mars Rover images with SIFT feature matches (Figure by Noah Snavely)
Slide credit: Steve Seitz
Panorama stitching
Application: Image Stitching
• Procedure: detect feature points in both images, then find corresponding pairs
B. Leibe; slide credit: Darya Frolova, Denis Simakov
1. Repeatable
Common Requirements
• Problem 1: detect the same point independently in both images
Different points: no chance to match! We need a repeatable detector!
B. Leibe; slide credit: Darya Frolova, Denis Simakov
2. Distinctive
Common Requirements
• Problem 1: detect the same point independently in both images
• Problem 2: for each point, correctly recognise the corresponding one
It should be easy to find the correspondence: we need a reliable and distinctive descriptor!
B. Leibe; slide credit: Darya Frolova, Denis Simakov
Corners as keypoints
• Repeatable
• Distinctive
• Easy to find via image gradient filtering

Corners as Distinctive Interest Points
• Design criteria:
  - We should easily recognise the point by looking through a small window (locality)
  - Shifting the window in any direction should give a large change in intensity (good localization)
• "Flat" region: no change in any direction
• "Edge": no change along the edge direction
• "Corner": significant change in all directions
B. Leibe; slide credit: Alexej Efros
Hessian Detector [Beaudet78]
• Hessian matrix: Hessian(I) = [Ixx, Ixy; Ixy, Iyy]
• Hessian determinant: det(Hessian(I)) = Ixx·Iyy − Ixy²
• In Matlab: Ixx .* Iyy - Ixy.^2
Slide credit: Krystian Mikolajczyk
• Use 2nd derivatives over image
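The det(Hessian) = Ixx·Iyy − Ixy² response can be sketched with finite-difference second derivatives (real detectors use Gaussian derivatives at multiple scales; the function name is illustrative):

```python
import numpy as np

def hessian_response(img):
    """det(Hessian(I)) = Ixx * Iyy - Ixy^2, via central finite differences."""
    I = img.astype(float)
    Ixx = np.zeros_like(I); Iyy = np.zeros_like(I)
    Ix = np.zeros_like(I);  Ixy = np.zeros_like(I)
    # Second derivatives along x and y.
    Ixx[:, 1:-1] = I[:, 2:] - 2 * I[:, 1:-1] + I[:, :-2]
    Iyy[1:-1, :] = I[2:, :] - 2 * I[1:-1, :] + I[:-2, :]
    # Mixed derivative: d/dy of the first x-derivative.
    Ix[:, 1:-1] = (I[:, 2:] - I[:, :-2]) / 2.0
    Ixy[1:-1, :] = (Ix[2:, :] - Ix[:-2, :]) / 2.0
    return Ixx * Iyy - Ixy ** 2
```

Blob-like structures (strong curvature in both directions) give the largest responses.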
Hessian Detector – Responses [Beaudet78]
Slide credit: Krystian Mikolajczyk
Descriptors
• OK, now we know how to localise keypoints
• But how do we actually compare them?

Local Descriptors
• We know how to detect points
• Next question: how to describe them for matching?
• A point descriptor should be: 1. Invariant 2. Distinctive
B. Leibe; slide credit: Kristen Grauman
SIFT
• Scale Invariant Feature Transform
• A 16×16 neighbourhood around the keypoint is taken and divided into 16 sub-blocks of 4×4 size. For each sub-block, an 8-bin orientation histogram is created, giving 16 × 8 = 128 bin values in total. This vector forms the keypoint descriptor.
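That 16×16 → 4×4 cells → 8-bin-histograms layout can be sketched as below. This is a simplified, hypothetical `sift_like_descriptor`: real SIFT also applies Gaussian weighting, rotates the patch to its dominant orientation, and clips/renormalises the final vector; use OpenCV's SIFT for actual matching.

```python
import numpy as np

def sift_like_descriptor(patch):
    """128-dim descriptor from a 16x16 patch: 4x4 cells x 8 orientation bins."""
    assert patch.shape == (16, 16)
    gx = np.gradient(patch.astype(float), axis=1)
    gy = np.gradient(patch.astype(float), axis=0)
    mag = np.sqrt(gx ** 2 + gy ** 2)
    ang = np.arctan2(gy, gx) % (2 * np.pi)           # orientation in [0, 2*pi)
    bins = (ang / (2 * np.pi) * 8).astype(int) % 8   # 8 orientation bins
    desc = []
    for ci in range(4):                              # 4x4 grid of 4x4 cells
        for cj in range(4):
            cell = slice(ci * 4, ci * 4 + 4), slice(cj * 4, cj * 4 + 4)
            # Magnitude-weighted orientation histogram for this cell.
            hist = np.bincount(bins[cell].ravel(),
                               weights=mag[cell].ravel(), minlength=8)
            desc.append(hist)
    desc = np.concatenate(desc)                      # 16 cells * 8 bins = 128
    norm = np.linalg.norm(desc)
    return desc / norm if norm > 0 else desc
```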
SIFT
• Very robust to illumination changes
• Fast: real-time capable
• Became very popular and widespread
Overview: SIFT
• Extraordinarily robust matching technique
  - Can handle changes in viewpoint up to ~60° of out-of-plane rotation
  - Can handle significant changes in illumination, sometimes even day vs. night
  - Fast and efficient: can run in real time
  - Lots of code available: http://people.csail.mit.edu/albert/ladypack/wiki/index.php/Known_implementations_of_SIFT
B. Leibe; slide and image credit: Steve Seitz
AdaBoost
• Machine learning classification method
• Have lots (hundreds) of weak classifiers
• Every weak classifier is correct in > 50% of cases
• Calculate weights for all of them
• Combine them into one strong classifier
AdaBoost – Formalization
• 2-class classification problem
• Given: a training set X = {x1, …, xN} with target values T = {t1, …, tN}, tn ∈ {-1, 1}
• Associated weights W = {w1, …, wN} for each training point
• Basic steps:
  - In each iteration, AdaBoost trains a new weak classifier hm(x) based on the current weighting coefficients W(m)
  - Then adapt the weighting coefficients for each point: increase wn if xn was misclassified by hm(x), decrease wn if it was classified correctly
  - Make predictions using the final combined model:
    H(x) = sign(Σ_{m=1}^{M} αm · hm(x))
B. Leibe
Image credit: Kristen Grauman
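The steps above can be sketched for 1-D data with threshold "stumps" as the weak classifiers (all names here are illustrative, and a real implementation would use e.g. scikit-learn's AdaBoostClassifier):

```python
import numpy as np

def train_adaboost(X, t, n_rounds=10):
    """AdaBoost on 1-D data; weak classifiers are threshold stumps.

    X: (N,) feature values; t: (N,) labels in {-1, +1}.
    Returns a list of (alpha_m, threshold, polarity) weak classifiers.
    """
    N = len(X)
    w = np.full(N, 1.0 / N)                      # uniform initial weights
    model = []
    for _ in range(n_rounds):
        # Train a weak classifier: the stump with the lowest weighted error.
        best = None
        for thr in X:
            for s in (1, -1):
                err = w[s * np.sign(X - thr + 1e-12) != t].sum()
                if best is None or err < best[0]:
                    best = (err, thr, s)
        err, thr, s = best
        err = min(max(err, 1e-10), 1.0 - 1e-10)  # avoid log(0)
        alpha = 0.5 * np.log((1.0 - err) / err)  # classifier weight alpha_m
        model.append((alpha, thr, s))
        # Re-weight: increase w_n for misclassified points, decrease the rest.
        pred = s * np.sign(X - thr + 1e-12)
        w *= np.exp(-alpha * t * pred)
        w /= w.sum()
    return model

def adaboost_predict(model, X):
    # H(x) = sign(sum_m alpha_m * h_m(x))
    return np.sign(sum(a * s * np.sign(X - thr + 1e-12) for a, thr, s in model))
```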
Viola–Jones
• Use AdaBoost to detect objects (faces)
• Every weak classifier is just a sum of binary filters
Image credit: Adam Harvey
Further topics
• Classification via Deep Neural Nets
• Visual-inertial odometry
• Stereo vision
• Object tracking
Visual-inertial odometry
• Track position using 2 cameras and an IMU
• Useful when no GPS or similar is available
Keyframe-Based Visual-Inertial Online SLAM with Relocalization
https://arxiv.org/pdf/1702.02175.pdf
Python libraries
• scipy/numpy
• scikit-image
• OpenCV
OpenCV snippets
edges = cv2.Canny(img, 100, 200)
lines = cv2.HoughLines(edges, 1, np.pi/180, 200)
sift = cv2.xfeatures2d.SIFT_create()
kp, des = sift.detectAndCompute(img, None)
Questions?
Thank you!
slides: antonkasyanov.com