Lecture 4: The L2 Norm and Simple Least Squares
DESCRIPTION
Transcript of a PowerPoint presentation: Lecture 4, The L2 Norm and Simple Least Squares.
Lecture 4
The L2 Norm and Simple Least Squares
Syllabus
Lecture 01 Describing Inverse Problems
Lecture 02 Probability and Measurement Error, Part 1
Lecture 03 Probability and Measurement Error, Part 2
Lecture 04 The L2 Norm and Simple Least Squares
Lecture 05 A Priori Information and Weighted Least Squares
Lecture 06 Resolution and Generalized Inverses
Lecture 07 Backus-Gilbert Inverse and the Trade Off of Resolution and Variance
Lecture 08 The Principle of Maximum Likelihood
Lecture 09 Inexact Theories
Lecture 10 Nonuniqueness and Localized Averages
Lecture 11 Vector Spaces and Singular Value Decomposition
Lecture 12 Equality and Inequality Constraints
Lecture 13 L1, L∞ Norm Problems and Linear Programming
Lecture 14 Nonlinear Problems: Grid and Monte Carlo Searches
Lecture 15 Nonlinear Problems: Newton's Method
Lecture 16 Nonlinear Problems: Simulated Annealing and Bootstrap Confidence Intervals
Lecture 17 Factor Analysis
Lecture 18 Varimax Factors, Empirical Orthogonal Functions
Lecture 19 Backus-Gilbert Theory for Continuous Problems; Radon's Problem
Lecture 20 Linear Operators and Their Adjoints
Lecture 21 Fréchet Derivatives
Lecture 22 Exemplary Inverse Problems, incl. Filter Design
Lecture 23 Exemplary Inverse Problems, incl. Earthquake Location
Lecture 24 Exemplary Inverse Problems, incl. Vibrational Problems
Purpose of the Lecture
Introduce the concept of prediction error and the norms that quantify it
Develop the Least Squares Solution
Develop the Minimum Length Solution
Determine the covariance of these solutions
Part 1
prediction error and norms
The Linear Inverse Problem: Gm = d
The Linear Inverse Problem: Gm = d, where d is the data, m the model parameters, and G the data kernel.
Gm^est = d^pre: an estimate of the model parameters can be used to predict the data, but the prediction may not match the observed data (e.g. due to observational error): d^pre ≠ d^obs.
This mismatch leads us to define the prediction error, e = d^obs − d^pre. When the model parameters exactly predict the data, e = 0.
(Figure: example of prediction error for a line fit to data. Panels A and B plot d against z; at each z_i, the error e_i is the difference between the observed d_i^obs and the predicted d_i^pre.)
A “norm” is a rule for quantifying the overall size of the error vector e. There are lots of possible ways to do it.
Ln family of norms
Ln family of norms: the L2 norm is the Euclidean length.
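The norm formulas themselves did not survive the transcript; the standard definitions of the Ln family (a reconstruction, consistent with the discussion that follows) are:

```latex
\|\mathbf{e}\|_1 = \sum_i |e_i|, \qquad
\|\mathbf{e}\|_2 = \Big[\sum_i e_i^2\Big]^{1/2}, \qquad
\|\mathbf{e}\|_n = \Big[\sum_i |e_i|^n\Big]^{1/n}
```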
(Figure: an error series e plotted against z, together with |e|, |e|^2, and |e|^10.) Higher norms give increasing weight to the largest elements of e.
limiting case: the L∞ norm, ||e||_∞ = max_i |e_i|, which selects the single largest element of e
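The weighting behavior of the norms can be seen numerically. This Python/NumPy sketch (a stand-in for the lecture's MATLAB; the error values are invented for illustration) compares several norms of a vector with one outlier:

```python
import numpy as np

# Error vector with one large outlier (illustrative values, not from the lecture)
e = np.array([0.1, -0.2, 0.1, 3.0])

# L1, L2, and L10 norms: the higher the n, the more the largest element dominates
for n in (1, 2, 10):
    print(n, np.sum(np.abs(e) ** n) ** (1.0 / n))

# Limiting case n -> infinity: the norm is just the largest absolute element
print("inf", np.max(np.abs(e)))
```

The L1 norm (3.4) still reflects the small elements, while the L10 norm is already almost exactly the outlier's size, 3.0.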
guiding principle for solving an inverse problem:
find the m^est that minimizes E = ||e||, with e = d^obs − d^pre and d^pre = Gm^est
but which norm to use?
it makes a difference!
(Figure: straight-line fits to data containing an outlier; the L1, L2, and L∞ fits differ, with the higher norms pulled more strongly toward the outlier.)
(Figure: two probability density functions p(d); panel A has long tails, panel B short tails.)
The answer is related to the distribution of the error: are outliers common or rare?
Long tails: outliers are common and unimportant, so use a low norm, which gives low weight to outliers.
Short tails: outliers are uncommon and important, so use a high norm, which gives high weight to outliers.
as we will show later in the class, use the L2 norm when the data have Gaussian-distributed error
Part 2
Least Squares Solution to Gm=d
The L2 norm of the error is its Euclidean length, so E = e^T e is the square of that length. The Principle of Least Squares: minimize E.
Least Squares Solution to Gm=d
minimize E with respect to m_q: ∂E/∂m_q = 0
so, multiply out
first term: ∂m_j/∂m_q = δ_jq, since m_j and m_q are independent variables
Kronecker delta (elements of the identity matrix): [I]_ij = δ_ij, so a = Ib means a_i = Σ_j δ_ij b_j = b_i.
second term
third term
putting it all together: ∂E/∂m_q = 0 yields G^TG m − G^T d = 0, or [G^TG] m^est = G^T d
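The intermediate equations on these slides were lost in the transcript; a reconstruction of the derivation, consistent with the solution quoted later in the lecture, runs:

```latex
E = \mathbf{e}^T\mathbf{e} = (\mathbf{d}-G\mathbf{m})^T(\mathbf{d}-G\mathbf{m})
  = \mathbf{m}^T[G^TG]\mathbf{m} - 2\,\mathbf{d}^TG\mathbf{m} + \mathbf{d}^T\mathbf{d}

\frac{\partial E}{\partial m_q} = 0
\;\Rightarrow\;
2[G^TG]\mathbf{m} - 2\,G^T\mathbf{d} = \mathbf{0}
\;\Rightarrow\;
[G^TG]\,\mathbf{m}^{est} = G^T\mathbf{d}
```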
presuming [G^TG] has an inverse, the Least Squares Solution: m^est = [G^TG]^{-1} G^T d
presuming [G^TG] has an inverse, the Least Squares Solution: m^est = [G^TG]^{-1} G^T d (memorize)
example: the straight-line problem, d_i = m_1 + m_2 z_i, written as Gm = d with rows of G equal to [1, z_i]
in practice, there is no need to multiply out the matrices analytically; just use MATLAB:
mest = (G'*G)\(G'*d);
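For readers without MATLAB, here is an equivalent sketch in Python/NumPy (the data values below are invented for illustration; the lecture gives no specific numbers):

```python
import numpy as np

# Synthetic, noiseless straight-line data d = m1 + m2*z (illustrative values)
z = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
d = 1.0 + 2.0 * z

# Data kernel for the straight-line problem: each row is [1, z_i]
G = np.column_stack([np.ones_like(z), z])

# Least squares solution m_est = [G^T G]^(-1) G^T d, computed by solving
# the normal equations rather than forming the inverse explicitly
m_est = np.linalg.solve(G.T @ G, G.T @ d)
print(m_est)  # recovers intercept 1 and slope 2 for this noiseless example
```

Like MATLAB's backslash, np.linalg.solve factors G^TG instead of inverting it, which is both faster and numerically safer.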
another example: fitting a plane surface, d_i = m_1 + m_2 x_i + m_3 y_i, i.e. Gm = d with rows of G equal to [1, x_i, y_i]
(Figure: a plane surface fit to data; axes x, y, and z in km.)
Part 3
Minimum Length Solution
but Least Squares will fail
when [GTG] has no inverse
example: fitting a line to a single point. (Figure: a single datum in the d-z plane; several candidate lines, each marked “?”, fit it equally well.)
zero determinant, hence no inverse: for a single data point, G = [1, z_1], so G^TG = [[1, z_1], [z_1, z_1^2]] has determinant z_1^2 − z_1^2 = 0
Least Squares will fail
when more than one solution minimizes the error
the inverse problem is “underdetermined”
(Figure: a simple example of an underdetermined problem, with elements labeled R, S, 1, and 2.)
What to do?
use another guiding principle
“a priori” information about the solution
in this case, choose a solution that is small: minimize ||m||_2
simplest case: “purely underdetermined”, where more than one solution has zero error
minimize L = ||m||_2^2 with the constraint that e = 0
Method of Lagrange Multipliers: minimizing L subject to constraints C_1 = 0, C_2 = 0, … is equivalent to minimizing Φ = L + λ_1 C_1 + λ_2 C_2 + … with no constraints; the λs are called “Lagrange multipliers”.
(Figure: contours of L(x, y) in the x-y plane, with the constraint curve e(x, y) = 0; the constrained minimum lies at (x_0, y_0).)
2m = G^T λ and Gm = d, so ½ GG^T λ = d;
hence λ = 2[GG^T]^{-1} d and m = G^T [GG^T]^{-1} d
presuming [GGT] has an inverse
Minimum Length Solution
m^est = G^T [GG^T]^{-1} d
presuming [GGT] has an inverse
Minimum Length Solution
m^est = G^T [GG^T]^{-1} d (memorize)
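A Python/NumPy sketch of the minimum length formula (the underdetermined system below is invented for illustration):

```python
import numpy as np

# Purely underdetermined problem: 1 equation, 2 unknowns (illustrative)
G = np.array([[1.0, 1.0]])   # the single datum measures only the sum m1 + m2
d = np.array([2.0])

# Minimum length solution m_est = G^T [G G^T]^(-1) d
m_est = G.T @ np.linalg.solve(G @ G.T, d)
print(m_est)  # [1, 1]: the smallest model with zero prediction error

# It predicts the data exactly...
assert np.allclose(G @ m_est, d)
# ...and agrees with NumPy's minimum-norm least squares solution
assert np.allclose(m_est, np.linalg.lstsq(G, d, rcond=None)[0])
```

Of all models with m1 + m2 = 2, the one with the smallest ||m||_2 splits the sum evenly, which is what the formula returns.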
Part 4
Covariance
Least Squares Solution: m^est = [G^TG]^{-1} G^T d. Minimum Length Solution: m^est = G^T [GG^T]^{-1} d. Both have the linear form m = M d.
but if m = Md, then [cov m] = M [cov d] M^T. When the data are uncorrelated with uniform variance σ_d^2, [cov d] = σ_d^2 I, so:
Least Squares Solution: [cov m] = [G^TG]^{-1} G^T σ_d^2 G [G^TG]^{-1} = σ_d^2 [G^TG]^{-1}
Minimum Length Solution: [cov m] = G^T [GG^T]^{-1} σ_d^2 [GG^T]^{-1} G = σ_d^2 G^T [GG^T]^{-2} G
Least Squares Solution: [cov m] = σ_d^2 [G^TG]^{-1}
Minimum Length Solution: [cov m] = σ_d^2 G^T [GG^T]^{-2} G
(memorize)
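The reduction from the general rule to the memorized least squares form can be checked numerically; a Python/NumPy sketch with an invented G and σ_d:

```python
import numpy as np

# Straight-line data kernel (z values are illustrative)
z = np.array([1.0, 2.0, 3.0, 4.0])
G = np.column_stack([np.ones_like(z), z])
sigma_d = 0.5                     # assumed uniform data standard deviation

# Least squares is linear in the data: m = M d with M = [G^T G]^(-1) G^T
M = np.linalg.inv(G.T @ G) @ G.T

# General rule [cov m] = M [cov d] M^T with [cov d] = sigma_d^2 I ...
cov_m_general = M @ (sigma_d ** 2 * np.eye(len(z))) @ M.T

# ... reduces to the memorized form sigma_d^2 [G^T G]^(-1)
cov_m_short = sigma_d ** 2 * np.linalg.inv(G.T @ G)

print(np.allclose(cov_m_general, cov_m_short))  # True
```

The diagonal of [cov m] gives the variances of the intercept and slope estimates; the off-diagonal element is their covariance.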
Where to obtain the value of σ_d^2:
an a priori value, based on knowledge of the accuracy of the measurement technique (my ruler has 1 mm divisions, so σ_d ≈ ½ mm);
or an a posteriori value, based on the prediction error.
variance is critically dependent on experiment design (the structure of G)
(Figure: two schemes for weighing a set of boxes: cumulatively, weighing box 1, then boxes 1+2, then 1+2+3, and so on; or in overlapping pairs, weighing box 1, then 1+2, 2+3, 3+4, and so on.)
which is the better way to weigh a set of boxes?
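The question can be answered by building G for each design and comparing the model variances; a sketch, assuming 4 boxes and unit data variance (the specific numbers are mine, not the lecture's):

```python
import numpy as np

N = 4  # number of boxes (illustrative)

# Design A: cumulative weighings -- box 1, boxes 1+2, boxes 1+2+3, ...
GA = np.tril(np.ones((N, N)))

# Design B: box 1, then overlapping pairs 1+2, 2+3, 3+4, ...
GB = np.zeros((N, N))
GB[0, 0] = 1.0
for i in range(1, N):
    GB[i, i - 1] = 1.0
    GB[i, i] = 1.0

# With unit data variance, [cov m] = [G^T G]^(-1); compare the variances
var_A = np.diag(np.linalg.inv(GA.T @ GA))
var_B = np.diag(np.linalg.inv(GB.T @ GB))
print(var_A)
print(var_B)
```

For these choices the variances come out [1, 2, 2, 2] for the cumulative design but [1, 2, 3, 4] for the pairwise one: in the second scheme the uncertainty accumulates from box to box, which matches the growing σ_mi shown in the lecture's figure.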
(Figure: panels A and B show, for each design, the estimates m_i^est and their standard deviations σ_mi plotted against z.)
Relationship between [cov m] and the Error Surface
(Figure: panels A-D show error surfaces E(m_1, m_2) and the corresponding line fits in the d-z plane.)
Taylor series expansion of the error about its minimum: E(m) ≈ E(m^est) + ½ Δm^T D Δm, where the first-derivative term vanishes at the minimum and D is the curvature matrix, with elements ∂²E/∂m_i∂m_j.
for a linear problem the curvature is related to G^TG:
E = (Gm − d)^T (Gm − d) = m^T[G^TG]m − d^TGm − m^TG^Td + d^Td
so ∂²E/∂m_i∂m_j = 2[G^TG]_ij
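This curvature identity, including the factor of 2, can be verified by finite differences; a Python/NumPy sketch with an invented G and d:

```python
import numpy as np

# Small invented linear problem
G = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
d = np.array([1.0, 0.0, 2.0])

def E(m):
    """Least squares error E = (Gm - d)^T (Gm - d)."""
    r = G @ m - d
    return r @ r

# Finite-difference estimate of the curvature matrix d^2E/dm_i dm_j
m0 = np.array([0.3, -0.7])  # expansion point (arbitrary: E is quadratic)
h = 1e-4
H = np.zeros((2, 2))
for i in range(2):
    for j in range(2):
        ei = np.eye(2)[i] * h
        ej = np.eye(2)[j] * h
        H[i, j] = (E(m0 + ei + ej) - E(m0 + ei - ej)
                   - E(m0 - ei + ej) + E(m0 - ei - ej)) / (4 * h ** 2)

print(np.allclose(H, 2 * G.T @ G, rtol=1e-4))  # True: curvature is 2 G^T G
```

Because E is exactly quadratic in m, the central-difference estimate matches 2 G^TG at any expansion point, up to rounding.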
and since [cov m] = σ_d^2 [G^TG]^{-1} and D = 2[G^TG], we have [cov m] = 2 σ_d^2 D^{-1}: the covariance is proportional to the inverse of the curvature matrix
the sharper the minimum, the higher the curvature, and the smaller the covariance