Geology 5670/6670: Inverse Theory
21 Jan 2015
© A.R. Lowry 2015
Read for Fri 23 Jan: Menke Ch 3 (39-68)
Last time: Ordinary Least Squares Inversion

• Ordinary Least Squares (OLS) solves for parameters m that minimize the L2-norm of the misfit residuals: for the forward problem
$$\mathbf{d} = \mathbf{G}\mathbf{m}$$
the misfit residual is
$$\mathbf{e} = \mathbf{d} - \mathbf{G}\mathbf{m}$$
• The solution for an overdetermined (N > M) linear problem is given by (see the numerical sketch after these bullets):
$$\mathbf{m} = \left(\mathbf{G}^T\mathbf{G}\right)^{-1}\mathbf{G}^T\mathbf{d}$$
• The matrix
$$\left(\mathbf{G}^T\mathbf{G}\right)^{-1}\mathbf{G}^T \equiv \mathbf{G}^+$$
(of dimension M × N) is the pseudoinverse of G
• The misfit residual provides some intuitive insight into measurement error…
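As a concrete illustration (my own sketch, not from the slides; numpy assumed, and the straight-line problem and data values are hypothetical), the OLS solution and residual can be computed directly:

```python
import numpy as np

# Hypothetical example: fit a line d = m1 + m2*x to N = 6 observations (M = 2).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
d = np.array([1.1, 2.9, 5.2, 7.1, 8.8, 11.2])

# Design matrix G (N x M): column of ones for the intercept, column of x.
G = np.column_stack([np.ones_like(x), x])

# OLS solution m = (G^T G)^{-1} G^T d (solve avoids forming the inverse explicitly)
m = np.linalg.solve(G.T @ G, G.T @ d)

# Misfit residual e = d - G m
e = d - G @ m
print("m =", m)
print("residual L2-norm =", np.linalg.norm(e))
```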
What does this tell us about uncertainty? First, if the number of observations N is "large enough" relative to M, the misfit tells us something about errors in the measurements! We use that to estimate parameter uncertainties.
Statistical Properties:

First denote the pseudoinverse:
$$\left(\mathbf{G}^T\mathbf{G}\right)^{-1}\mathbf{G}^T \equiv \mathbf{G}^+$$
and recall
$$\mathbf{d} = \mathbf{G}\mathbf{m}_t + \boldsymbol{\varepsilon}$$
Thus, since $\left(\mathbf{G}^T\mathbf{G}\right)^{-1}\mathbf{G}^T\mathbf{G} = \mathbf{I}$,
$$\tilde{\mathbf{m}} = \left(\mathbf{G}^T\mathbf{G}\right)^{-1}\mathbf{G}^T\left(\mathbf{G}\mathbf{m}_t + \boldsymbol{\varepsilon}\right) = \mathbf{m}_t + \mathbf{G}^+\boldsymbol{\varepsilon}$$
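A quick numerical sanity check (my own sketch; numpy assumed, with an arbitrary random full-column-rank G) of the identity G⁺G = I, which is what lets the true model pass through untouched:

```python
import numpy as np

rng = np.random.default_rng(0)
G = rng.normal(size=(6, 2))              # any full-column-rank N x M matrix

Gplus = np.linalg.inv(G.T @ G) @ G.T     # pseudoinverse (G^T G)^{-1} G^T

# G+ G = I_M, so m_tilde = G+ (G m_t + eps) = m_t + G+ eps
print(np.allclose(Gplus @ G, np.eye(2)))  # True
```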
Inverse Theory: Goals include (1) solving for parameters from observational data; (2) determining the range of models that fit the data within uncertainties
Quick review of Gaussian distributions & statistics:
Central Limit Theorem: The sum of N independent random variables approaches a Gaussian distribution for N sufficiently large.
Univariate case:
$$f\left(x\right) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left\{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right\}$$
where we define:
Mean: $\mu \equiv E\left\{x\right\} \equiv \bar{x} = \int_{-\infty}^{\infty} x\,f\left(x\right)dx$, the expected value of x
Variance: $\sigma^2 \equiv V\left\{x\right\} = E\left\{\left(x-\mu\right)^2\right\} = \int_{-\infty}^{\infty}\left(x-\mu\right)^2 f\left(x\right)dx$
$f\left(x\right)$ is the Probability Density Function.
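To make these definitions concrete, a small numerical sketch (my own, assuming numpy; the values of μ and σ are arbitrary) verifying that the PDF integrates to 1 and reproduces the mean and variance integrals above:

```python
import numpy as np

mu, sigma = 2.0, 1.5
x, dx = np.linspace(mu - 10*sigma, mu + 10*sigma, 20001, retstep=True)
f = np.exp(-0.5*((x - mu)/sigma)**2) / (sigma*np.sqrt(2.0*np.pi))

# Riemann-sum approximations of the defining integrals:
print(np.sum(f)*dx)                  # ~1.0   (PDF normalizes to 1)
print(np.sum(x*f)*dx)                # ~2.0   (mean mu = E{x})
print(np.sum((x - mu)**2 * f)*dx)    # ~2.25  (variance sigma^2)
```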
For the multivariate case, the probability density function is
$$f\left(\boldsymbol{\varepsilon}\right) = \frac{1}{\left(2\pi\right)^{N/2}\left|\mathbf{C}\right|^{1/2}}\exp\left\{-\frac{1}{2}\left(\boldsymbol{\varepsilon}-\bar{\boldsymbol{\varepsilon}}\right)^T\mathbf{C}^{-1}\left(\boldsymbol{\varepsilon}-\bar{\boldsymbol{\varepsilon}}\right)\right\}$$
Mean:
$$E\left\{\boldsymbol{\varepsilon}\right\} = \bar{\boldsymbol{\varepsilon}} = \left[\bar{\varepsilon}_1\ \bar{\varepsilon}_2\ \ldots\ \bar{\varepsilon}_N\right]^T$$
Data Covariance Matrix:
$$\mathbf{C}_\varepsilon = \mathrm{cov}\left\{\boldsymbol{\varepsilon}\right\} = E\left\{\left(\boldsymbol{\varepsilon}-\bar{\boldsymbol{\varepsilon}}\right)\left(\boldsymbol{\varepsilon}-\bar{\boldsymbol{\varepsilon}}\right)^T\right\}$$
(Note the outer product yields a matrix!)
$$C_{ij} = \mathrm{cov}\left\{\varepsilon_i,\varepsilon_j\right\} = E\left\{\left(\varepsilon_i-\bar{\varepsilon}_i\right)\left(\varepsilon_j-\bar{\varepsilon}_j\right)\right\}$$
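As an illustration (my own sketch; numpy assumed, and the example covariance is chosen arbitrarily), the covariance matrix can be estimated as the sample average of outer products, which indeed yields a matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
C_true = np.array([[2.0, 0.8],
                   [0.8, 1.0]])   # chosen 2x2 example covariance
eps = rng.multivariate_normal([0.0, 0.0], C_true, size=200000)

# cov{eps} = E{(eps - mean)(eps - mean)^T}: averaging outer products -> 2x2 matrix
dev = eps - eps.mean(axis=0)
C_est = (dev.T @ dev) / len(eps)
print(C_est)   # approaches C_true as the sample size grows
```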
Independent random variables (the strong condition):
$$f\left(\varepsilon_i,\varepsilon_j\right) = f\left(\varepsilon_i\right)f\left(\varepsilon_j\right)$$
Uncorrelated random variables (the weaker condition):
$$E\left\{\varepsilon_i\varepsilon_j\right\} = E\left\{\varepsilon_i\right\}E\left\{\varepsilon_j\right\}\,,\quad\text{i.e.,}\quad \mathrm{cov}\left\{\varepsilon_i,\varepsilon_j\right\} = 0$$
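The distinction matters: uncorrelated does not imply independent. A standard counterexample (my own sketch, not from the slides): for a symmetric zero-mean variable x, the pair x and x² is uncorrelated yet fully dependent:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=1000000)   # symmetric, zero-mean
y = x**2                       # completely determined by x (dependent!)

# cov{x, y} = E{xy} - E{x}E{y} = E{x^3} = 0 for a symmetric distribution
print(np.mean(x*y) - np.mean(x)*np.mean(y))   # ~0: uncorrelated despite dependence
```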
Based on these definitions,

(1)
$$\tilde{\mathbf{m}} = \mathbf{m}_t + \mathbf{G}^+\boldsymbol{\varepsilon}$$
If we assume errors are zero-mean, $\bar{\boldsymbol{\varepsilon}} = \mathbf{0}$, then
$$E\left\{\tilde{\mathbf{m}}\right\} = \mathbf{m}_t$$
(i.e., if the measurements are unbiased, the model parameter estimates are unbiased)
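A Monte Carlo sketch of statement (1) (my own illustration; the geometry, true model, and noise level are hypothetical): averaging OLS estimates over many zero-mean noise realizations recovers the true model:

```python
import numpy as np

rng = np.random.default_rng(3)
x = np.linspace(0.0, 5.0, 6)
G = np.column_stack([np.ones_like(x), x])   # same line-fit geometry as before
m_true = np.array([1.0, 2.0])               # hypothetical true model
Gplus = np.linalg.inv(G.T @ G) @ G.T

# Each trial: synthesize d = G m_true + eps with zero-mean noise, then invert.
m_tilde = np.array([Gplus @ (G @ m_true + rng.normal(0.0, 0.5, size=6))
                    for _ in range(20000)])
print(m_tilde.mean(axis=0))   # -> approximately [1.0, 2.0]: unbiased
```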
(2) The model covariance matrix is
$$\mathbf{C}_m = E\left\{\left(\tilde{\mathbf{m}}-\bar{\tilde{\mathbf{m}}}\right)\left(\tilde{\mathbf{m}}-\bar{\tilde{\mathbf{m}}}\right)^T\right\}$$
where
$$\tilde{\mathbf{m}}-\bar{\tilde{\mathbf{m}}} = \mathbf{m}_t + \mathbf{G}^+\boldsymbol{\varepsilon} - \mathbf{m}_t = \mathbf{G}^+\boldsymbol{\varepsilon}$$
so
$$\mathbf{C}_m = E\left\{\left(\mathbf{G}^+\boldsymbol{\varepsilon}\right)\left(\mathbf{G}^+\boldsymbol{\varepsilon}\right)^T\right\} = \mathbf{G}^+E\left\{\boldsymbol{\varepsilon}\boldsymbol{\varepsilon}^T\right\}\mathbf{G}^{+T} = \mathbf{G}^+\mathbf{C}_\varepsilon\mathbf{G}^{+T}$$
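This error-propagation formula can be checked by Monte Carlo (my own sketch; the data covariance below is an arbitrary example):

```python
import numpy as np

rng = np.random.default_rng(4)
x = np.linspace(0.0, 5.0, 6)
G = np.column_stack([np.ones_like(x), x])
m_true = np.array([1.0, 2.0])
Gplus = np.linalg.inv(G.T @ G) @ G.T
C_eps = np.diag([0.2, 0.2, 0.3, 0.3, 0.4, 0.4])   # example data covariance

C_m = Gplus @ C_eps @ Gplus.T    # propagated model covariance C_m = G+ C_eps G+^T

# Monte Carlo: many noisy realizations of d, each inverted to m_tilde
eps = rng.multivariate_normal(np.zeros(6), C_eps, size=100000)
m_tilde = (G @ m_true + eps) @ Gplus.T
print(C_m)
print(np.cov(m_tilde.T))         # matches C_m to within sampling error
```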
For the ordinary least squares case,
$$\mathbf{C}_m = \left(\mathbf{G}^T\mathbf{G}\right)^{-1}\mathbf{G}^T\,\mathbf{C}_\varepsilon\left[\left(\mathbf{G}^T\mathbf{G}\right)^{-1}\mathbf{G}^T\right]^T$$
(Note that $\left(\mathbf{G}^T\mathbf{G}\right)^T = \mathbf{G}^T\mathbf{G}$ because it is symmetric, and $\left(\mathbf{A}^{-1}\right)^T = \left(\mathbf{A}^T\right)^{-1} = \mathbf{A}^{-1}$ for symmetric $\mathbf{A}$.)
If the measurement errors are uncorrelated with constant variance,
$$\mathbf{C}_\varepsilon = \sigma^2\mathbf{I}$$
and
$$\mathbf{C}_m = \sigma^2\left(\mathbf{G}^T\mathbf{G}\right)^{-1}\mathbf{G}^T\mathbf{G}\left(\mathbf{G}^T\mathbf{G}\right)^{-1} = \sigma^2\left(\mathbf{G}^T\mathbf{G}\right)^{-1}$$
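A one-line check of that simplification (my own sketch with an arbitrary random G):

```python
import numpy as np

rng = np.random.default_rng(5)
G = rng.normal(size=(8, 3))
sigma2 = 0.25
GtG_inv = np.linalg.inv(G.T @ G)

# Full propagation with C_eps = sigma^2 I versus the simplified form
full = GtG_inv @ G.T @ (sigma2*np.eye(8)) @ (GtG_inv @ G.T).T
simplified = sigma2 * GtG_inv
print(np.allclose(full, simplified))   # True
```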
So we can estimate a parameter variance for each model parameter:
$$\sigma_{m_i}^2 = V\left\{\tilde{m}_i\right\} = \sigma^2\left[\left(\mathbf{G}^T\mathbf{G}\right)^{-1}\right]_{ii}$$
and we write
$$m_i = \tilde{m}_i \pm \sigma_{m_i}$$
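In practice the ±σ error bars come from the square roots of the diagonal of Cm. A minimal sketch (reusing the hypothetical line-fit data from the first example; σ² is taken as known here, though as noted above it can be estimated from the misfit):

```python
import numpy as np

x = np.linspace(0.0, 5.0, 6)
G = np.column_stack([np.ones_like(x), x])
d = np.array([1.1, 2.9, 5.2, 7.1, 8.8, 11.2])   # same hypothetical data as before
sigma2 = 0.04                                   # assumed known measurement variance

m = np.linalg.solve(G.T @ G, G.T @ d)
C_m = sigma2 * np.linalg.inv(G.T @ G)
sigma_m = np.sqrt(np.diag(C_m))                 # parameter standard deviations

for mi, si in zip(m, sigma_m):
    print(f"{mi:.3f} +/- {si:.3f}")             # m_i = m_tilde_i ± sigma_m_i
```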