TRANSCRIPT
9/7/2014
1
Iterative Reweighted Least Squares
ECCV, Sept 7, 2014
[Figure: a triangle, with angles of 120° marked at the point sought below.]
What point minimizes the distance to the three points of a triangle?
Exercise: find this point using ruler and compass construction.
Torricelli
Ruler and Compass Construction to find the Fermat point
Alfred Weber
Weber’s Solution (1906)
Gustav Weler
Gustav Weler was a political decoy (doppelgänger, or body double) of Adolf Hitler. At the end of the Second World War, he was executed by a gunshot to the forehead in an attempt to confuse the Allied troops when Berlin was taken. He was also used "as a decoy for security reasons".[2] When his corpse was discovered in the Reich Chancellery garden by Soviet troops, it was mistakenly believed to be that of Hitler because of his identical moustache and haircut. The corpse was also photographed and filmed by the Soviets.
One servant from the bunker declared that the dead man was one of Hitler's cooks. He also believed this man "had been assassinated because of his startling likeness to Hitler, while the latter had escaped from the ruins of Berlin".[3]
Weler's body was brought to Moscow for investigations and buried in the yard at Lefortovo prison.[4]
Andrew Vázsonyi (1916–2003), also known as Endre Weiszfeld and Zepartzatt Gozinto, was a mathematician and operations researcher. He is known for Weiszfeld's algorithm for minimizing the sum of distances to a set of points, and for founding The Institute of Management Sciences.[1][2][3]
Weiszfeld
E. Weiszfeld, Sur le point pour lequel la somme des distances de n points donnés est minimum [On the point for which the sum of the distances to n given points is minimum], Tôhoku Mathematics Journal 43 (1937), 355–386.
Weighted L2 cost function
Robust (L1) cost function
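The two cost functions named on this slide did not survive transcription; in standard notation (a reconstruction, with $\mathbf{y}_i$ the given points) they are:

```latex
% Weighted L2 cost function
C(\mathbf{x}, w) = \sum_i w_i \,\|\mathbf{x} - \mathbf{y}_i\|^2
% Robust (L1) cost function
C(\mathbf{x}) = \sum_i \|\mathbf{x} - \mathbf{y}_i\|
```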
Gradient of L2 distance
Gradient of L1 distance
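The gradient formulas themselves were lost in extraction; for a single point $\mathbf{y}$, the standard results are:

```latex
% Gradient of the squared (L2) distance:
\nabla_{\mathbf{x}} \|\mathbf{x} - \mathbf{y}\|^2 = 2(\mathbf{x} - \mathbf{y})
% Gradient of the (L1) distance, undefined at x = y:
\nabla_{\mathbf{x}} \|\mathbf{x} - \mathbf{y}\|
  = \frac{\mathbf{x} - \mathbf{y}}{\|\mathbf{x} - \mathbf{y}\|}
```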
Generalizing IRLS
Functions for which IRLS works
Resistance to Outliers
Robustness to systematic outliers (e.g. ghost edges)
Written without the squares
Define weights
Minimize weighted cost

IRLS Algorithm
1. Start with initial value x(0); set t = 0.
2. Define weights w(t) from x(t).
3. Solve the WLS problem: x(t+1) = argmin_x C(x, w(t)).
4. Set t = t + 1; return to step 2.
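Specialized to the sum-of-distances (Fermat point / geometric median) problem, the loop above is Weiszfeld's algorithm: each WLS subproblem has the weighted centroid as its closed-form solution. A minimal sketch in pure Python (my own implementation, not taken from the slides):

```python
import math

def weiszfeld(points, n_iter=100, eps=1e-12):
    """Geometric median of 2-D points via IRLS (Weiszfeld's algorithm).

    Each pass solves the weighted least-squares problem with weights
    w_i = 1 / ||x - y_i|| taken from the current estimate; its
    closed-form solution is the weighted centroid.
    """
    n = len(points)
    # Initialize at the centroid.
    x = [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]
    for _ in range(n_iter):
        wsum, nx, ny = 0.0, 0.0, 0.0
        for px, py in points:
            d = math.hypot(x[0] - px, x[1] - py)
            w = 1.0 / max(d, eps)  # weight is undefined at d = 0: clamp
            wsum += w
            nx += w * px
            ny += w * py
        x = [nx / wsum, ny / wsum]
    return x
```

For an equilateral triangle the Fermat point coincides with the centroid, which gives an easy sanity check.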
No square!
Robust cost function
Required weights
Sum of distances
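The equations this slide refers to were lost in transcription; a reconstruction of the standard choice:

```latex
% Robust cost: sum of distances (no square)
C(\mathbf{x}) = \sum_i \|\mathbf{x} - \mathbf{y}_i\|
% Required weights, taken from the current estimate x^{(t)}:
w_i^{(t)} = \frac{1}{\|\mathbf{x}^{(t)} - \mathbf{y}_i\|}
% so that the weighted L2 cost agrees with the sum of distances there:
\sum_i w_i^{(t)} \|\mathbf{x}^{(t)} - \mathbf{y}_i\|^2
  = \sum_i \|\mathbf{x}^{(t)} - \mathbf{y}_i\|
```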
Weighted residual sum decreases
Robust residual sum decreases
Supergradient: a concave function always has a supergradient.
Definition of supergradient
Weighted squared residual decreases
Robust cost decreases
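The descent argument sketched on these slides can be written out (a reconstruction in standard notation, with $r_i^{(t)} = \|\mathbf{x}^{(t)} - \mathbf{y}_i\|$):

```latex
% sqrt is concave, so for a > 0 it lies below its tangent:
\sqrt{b} \le \sqrt{a} + \frac{b - a}{2\sqrt{a}}
% Take a = (r_i^{(t)})^2, b = (r_i^{(t+1)})^2 and sum over i:
\sum_i r_i^{(t+1)}
  \le \sum_i r_i^{(t)}
    + \frac{1}{2}\sum_i \frac{(r_i^{(t+1)})^2 - (r_i^{(t)})^2}{r_i^{(t)}}
  \le \sum_i r_i^{(t)}
% The middle sum is non-positive because x^{(t+1)} minimizes the
% weighted squared cost with weights w_i = 1 / r_i^{(t)}.
```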
Where gradient descent does not converge to a minimum
L1
[Plot residue removed: cost function, weight function, and example for the L1 kernel.]
Advantage:
• Robust
Disadvantage:
• Function not differentiable
• Weights not defined at 0
• Can stop at a non-minimum.
Lq
[Plot residue removed: cost function, weight function, and example for the Lq kernel.]
Advantage:
• Robust
• Cost function differentiable everywhere
Disadvantage:
• Weights not defined at 0
• Can stop at a non-minimum.
Huber
[Plot residue removed: cost function, weight function, and example for the Huber kernel.]
Advantage:
• Robust
• Cost function differentiable
• Weights defined at zero
• Convex
• Guaranteed to converge (at least to a local minimum)
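The Huber formulas themselves do not appear in the transcript; one commonly used parametrization (my reconstruction, with threshold b) is:

```latex
% Huber kernel with threshold b:
\rho_b(r) = \begin{cases}
  r^2 / 2          & |r| \le b \\
  b\,|r| - b^2/2   & |r| > b
\end{cases}
% Corresponding IRLS weight (up to an overall scale):
w(r) = \min\!\left(1,\; b / |r|\right)
```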
Pseudo Huber
[Plot residue removed: cost function, weight function, and example for the pseudo-Huber kernel.]
Advantage:
• Robust
• Cost function differentiable
• Weights defined at zero
• Convex
• Guaranteed to converge (at least to a local minimum)
Cauchy
[Plot residue removed: cost function, weight function, and example for the Cauchy kernel.]
Advantage:
• Robust
• Cost function differentiable
• Weights defined at zero
• Guaranteed to converge (at least to a local minimum)
Disadvantage:
• Non-convex
• Increased number of local minima
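The Cauchy formulas are likewise missing from the transcript; one common parametrization (my reconstruction, with scale b) is:

```latex
% Cauchy kernel with scale b:
\rho_b(r) = \frac{b^2}{2}\,\log\!\left(1 + r^2 / b^2\right)
% IRLS weight w(r) = \rho'(r) / r:
w(r) = \frac{1}{1 + r^2 / b^2}
```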
Blake-Zisserman
[Plot residue removed: cost function, weight function, and example for the Blake-Zisserman kernel.]
Advantage:
• Very robust to outliers
• Cost function differentiable
• Weights defined at zero
• Guaranteed to converge (at least to a local minimum)
Disadvantage:
• Non-convex
• Increased number of local minima
Corrupted Gaussian
[Plot residue removed: cost function, weight function, and example for the corrupted-Gaussian kernel.]
Advantage:
• Robust
• Cost function differentiable
• Weights defined at zero
• Guaranteed to converge (at least to a local minimum)
Disadvantage:
• Non-convex
• Increased number of local minima
Example. L1 Regression
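A minimal sketch of L1 line regression via IRLS, assuming the standard setup (the function name and test data are mine, not from the slides): each pass solves a 2×2 weighted least-squares problem with weights 1/|residual| from the previous fit.

```python
import math

def irls_l1_line(xs, ys, n_iter=200, eps=1e-8):
    """Fit y ~ a*x + b minimizing the sum of absolute residuals.

    IRLS: each pass solves the weighted normal equations with
    weights w_i = 1 / |residual_i| from the previous fit.
    """
    a, b = 0.0, 0.0
    for _ in range(n_iter):
        # Weights undefined at zero residual: clamp with eps.
        w = [1.0 / max(abs(y - (a * x + b)), eps) for x, y in zip(xs, ys)]
        sw = sum(w)
        swx = sum(wi * x for wi, x in zip(w, xs))
        swxx = sum(wi * x * x for wi, x in zip(w, xs))
        swy = sum(wi * y for wi, y in zip(w, ys))
        swxy = sum(wi * x * y for wi, x, y in zip(w, xs, ys))
        # Solve the 2x2 weighted normal equations for [a, b].
        det = sw * swxx - swx * swx
        a = (sw * swxy - swx * swy) / det
        b = (swxx * swy - swx * swxy) / det
    return a, b
```

With four points exactly on y = 2x + 1 and one gross outlier, the L1 fit passes through the inliers, where ordinary least squares would be dragged toward the outlier.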
Possible application
Example. Averaging Rotations
The angle between two quaternions is half the angle between the corresponding rotations, defined by cos θ = |⟨q1, q2⟩|.
All rotations within a delta‐neighbourhood of a reference rotation form a circle on the quaternion sphere.
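A small numerical check of the half-angle relation (pure Python; the helper names are my own, not from the slides):

```python
import math

def quat_about_z(angle):
    """Unit quaternion (w, x, y, z) for a rotation by `angle` about the z-axis."""
    return (math.cos(angle / 2), 0.0, 0.0, math.sin(angle / 2))

def quat_angle(q1, q2):
    """Angle between two unit quaternions on the quaternion sphere."""
    dot = abs(sum(a * b for a, b in zip(q1, q2)))  # |.| handles q ~ -q
    return math.acos(min(1.0, dot))

def rotation_distance(q1, q2):
    """Geodesic angle between the rotations: twice the quaternion angle."""
    return 2.0 * quat_angle(q1, q2)
```

For rotations about a common axis the distance reduces to the difference of rotation angles, which makes the relation easy to verify.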
Isometry of Rotations and Quaternions
Flatten out the meridians (longitude lines)
Azimuthal Equidistant Projection
Angle‐axis representation of Rotations
Rotations are represented by a ball of radius π in 3-dimensional space.
Relative rotations are computed between some nodes in the graph.
Initialization: Propagate rotations estimates across a tree.
The orientation of each node in turn is recomputed, given the known orientations of its neighbours and the relative orientations.
This involves “single‐rotation averaging”.
• 569 images
• 280,000 points
• 42,000 pairs of overlapping images (more than 30 points in common)
Task: Find the orientations of all cameras.
Extension: Optimization on Riemannian Manifolds
Image from http://en.wikipedia.org/wiki/File:Tangentialvektor.svg
Hyperbolic (negative curvature) manifold