geophysical inversion: which model do you want?steve/segdl/inversion_reduced.pdf · e−field...

Geophysical Inversion: which model do you want?

Steven Constable

Scripps Institution of Oceanography

Acknowledgements

Scripps Institution of Oceanography, Seafloor Electromagnetic Methods Consortium

SEG and SEG Foundation

Sponsored by Statoil

Sponsored by Paradigm

SEG Membership Benefits

SEG Digital Library – full text articles Technical Journals in Print and Online Networking Opportunities Membership Discounts on ! Continuing Education Training Courses ! Publications (35% off list price) ! Workshops and Meetings

Join Today!

Membership materials are available today!

Join Online at seg.org/About-SEG/Membership

http://www.seg.org/join

Student Opportunities

❑ Sponsored Membership ❑ Student Chapter Programs ❑ SEG/Chevron Student Leadership Symposium ❑ SEG/ExxonMobil Student Education Program ❑ Challenge Bowl ❑ Scholarships ❑ Field Camp Grants ❑ Geoscientists Without Borders® ❑ Student Expos & IGSCs ❑ Honorary and Distinguished Lecturers ❑ SEG Online

Learn more at seg.org/Education/Students-Early-Career

http://www.seg.org/students

Section/Associated Society Opportunities

❑ Host DL, HL, and DISC Programs ❑ Council Representation ❑ Annual Meeting Booth Discount ❑ Best Papers presented at SEG

Annual Meetings ❑ Joint Conferences, Workshops, and

Forums in partnership with SEG

For more information, including a list of benefits, please visit: www.seg.org/resources/sections-societies

http://www.seg.org/resources/sections-societies

Geophysical Inversion: which model do you want?

Steven Constable

Scripps Institution of Oceanography

There is an old joke…

The production manager asked a geologist, engineer, and geophysicist what 2 + 2 was.


The geologist thought for a bit and then said “somewhere between 3 and 5”.



The engineer fiddled with a calculator and said “3.9999999”.



The engineer fiddled with a calculator and said “3.9999999”.

The geophysicist looked her in the eye and asked “what answer do you want”

It is notable that when once searching the web for this joke, I not only got the joke page of an oil price blog, but also the Wikipedia entry for “Inverse problem”.

This says it all…

Today we will explore how the inverse problem can give you whatever model you want.

It is notable that when once searching the web for this joke, I not only got the joke page of an oil price blog, but also the Wikipedia entry for “Inverse problem”.

This says it all…

Today we will explore how the inverse problem can give you whatever model you want.

Well, almost.

With seismic reflection images you can often see the geology in the data.

For EM and potential field methods you need inversion to recover something that can be interpreted as geology.

Same is true for seismic tomography and full waveform inversion.

Moore et al., 2007, Science.

−13

−12

−11

−10

0.25 Hz

−13

−12

−11

−10

0.75 Hz

−13

−12

−11

−10

E−fie

ld a

mpl

itude

, log

(V/A

)m

2E−

field

am

plitu

de, l

og(V

/A

)

m2

E−fie

ld a

mpl

itude

, log

(V/A

)m

2

1.75 Hz

10 12 14 16 18 20 22 24 26Transmitter position, km

These are some marine controlled source electromagnetic CSEM data. You can’t say much about geology just by looking at them.

Constable, Orange, and Key, 2015

-0.5

0

0.5Lo

g rh

osite 1

30

40

50

Phas

e

-0.5

0

0.5

Log

rho

30

40

50

Phas

e

site 2 site 3 site 4 site 5 site 6 site 7 site 8

site 9 site 10 site 11 site 12 site 13 site 14 site 15 site 16

0 1 2log period, s

0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2

0 1 2log period, s

0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2

The same is true of magnetotelluric (MT) data…


With many modern inversion algorithms available, it is all-so-easy to input data and turn the crank to get a model.

Data

model

Data errorsand misfit

Regularization

Priors/constraints

geology

Geophysicalinversionalgorithm

ModelParameterization


One of the main messages of my talk is that models from geophysical inversion depend on much more than the data inputs:

Data

model


Regularization

Priors/constraints

geology




One of the main messages of my talk is that models from geophysical inversion depend on much more than the data inputs:

Data

model


Regularization

Priors/constraints

geology



By the way… this still isn’t geology - it is density, conductivity, seismic velocity, etc.

Forward modeling:

model spacedata

space

m = (m1, m2, .....,mN )x = (x1, x2, x3, ......, xkM )

d̂ = (d̂1, d̂2, d̂3, ....., d̂M )

d̂ = f(x,m)Model parameters (layers, blocks, ...)

Independent variables (freqs., locations, ...)

Predicted data (gravity, magnetic, electric, ...)

Some forward functional f

Inverse modeling:

model space data space

d = (d1, d2, d3, ....., dM )� = (�1, �2, ...,�M )m

Given real (observed) data with errors

find an

There are several approaches to inversion:

Stochastic Monte Carlo, Markov Chains Genetic Algorithms Simulated annealing, etc. (Bayesian Searches)

Deterministic Newton Algorithms Steepest descent Conjugate Gradients Quadratic (and Linear) Programming, etc.

Analytical D+ (1D MT) Bilayer (1D resistivity) Ideal body theory in gravity and magnetism

Stochastic methods:

model space

data space

“acceptable” data space

A useful approach, largely restricted to simple problems (because millions of models required), with most of the subtlety in model generation methods.

The advantages are that (i) only forward calculations are made and (ii) some statistics can be obtained on model parameters. Best for sparsely parameterized models. One needs to be careful that bounds on explored model space don’t unduly influence the outcome.

Deterministic Newton Algorithms Steepest descent Conjugate Gradients

model space

data space

“acceptable” data space

starting model

The direction of the search is determined by how changing parts of the model affects the fit to the data.

Analytical e.g. D+ (1D MT) and Bilayer (DC resistivity)

model space

my data

bestfitting model

(guaranteed!)

These solutions are guaranteed best fitting but pathological.

MT

D+

Resistivity

Bilayer

MT: Delta functions of Resistivity: Arbitrarily thin surface layers�

We don’t know for sure, but least squares (LS) fits to higher dimensional models are probably also pathological.

But we are pretty sure that true LS solutions are maximally “rough”.

What we have talked about so far is model construction. For a great many geophysicists this is what they think of when inversion is mentioned. More rigorous approaches try to obtain bounds on model properties - something that is true of all models. The classic example is total mass from gravity:

Data

σ model

Data errorsand misfit choice

Regularization

Priors/constraints

geology



So what constitutes an “adequate” fit to the data?

For noisy data (read: all data), we need a measure of how well a given model fits. Sum of squares is the venerable way:

⇥2 =M⇤

i=1

1�2

i

�di � f(xi,m)

⇥2

�2 = ||Wd�Wd̂||2

W = diag(1/�1, 1/�2, ...., 1/�M ) .

or

where W is a diagonal of reciprocal data errors


⇥2 =M⇤

i=1

1�2

i

�di � f(xi,m)

⇥2

�2 = ||Wd�Wd̂||2

W = diag(1/�1, 1/�2, ...., 1/�M ) .

or


I like to remove the dependence on data number and use RMS:

RMS =p

�2/M .

The instinctive approach as this point is to minimize .

This is least squares.

�2


⇥2 =M⇤

i=1

1�2

i

�di � f(xi,m)

⇥2

�2 = ||Wd�Wd̂||2

W = diag(1/�1, 1/�2, ...., 1/�M ) .

or


I like to remove the dependence on data number and use RMS:

RMS =p

�2/M .

The instinctive approach as this point is to minimize .

This is least squares. In geophysics, this is dangerous!

�2

Why is this dangerous? Because as you try to approach the LS solution, your model tries to approach the maximally rough, pathological LS solutions, even if your model space does not contain delta functions, etc.

Existence and Uniqueness: Is there a solution to the inverse problem? Is there only one solution?

Finite noisy data for a linear problem (say, gravity) An infinite number of solutions fit the data

Finite noisy data for a nonlinear problem Either zero or an infinite number of solutions fit the data

Infinite exact data A unique solution has been shown to exist for a few cases. Probably true in general but … who cares?

Some people think that we can approach infinite exact data with LOTS of VERY GOOD data. This is wrong. As Sven Treitel puts it, there is no such thing as being a little bit non-unique.

Geophysical inversion is non-unique:

A single misfit will map into an infinite number of models (or none at all!).

�2model space

misfit space

It is also usually poorly constrained:

model space

A small distance in corresponds to a large distance in m

�2

�2

misfit space

(And don’t forget: the minimum is likely outside your model parameterization).

�2

So what constitutes an adequate misfit?

For zero-mean, Gaussian, independent errors, is chi-squared distributed with M degrees of freedom. The expectation value is just M, which corresponds to RMS=1, and so this could be a reasonable target misfit. Or, one could look up the 95% (or other) confidence interval for chi-squared M.

RMS = 1 RMS = 1.36

We could use other measures of fit, but the quadratic measure works with the mathematics of minimization, and for Gaussian errors has nice statistical properties (unbiased, maximum likelihood, minimum variance). But...

for 14 data. For large data sets, RMS=1 and RMS95% are very much the same.

�2

�2

... sum-squared misfit measures are unforgiving of outliers:

With Gaussian noise, the probability of a data point being misfit by 6 error bars is about one in a billion.

All through any inversion process you should monitor weighted residualsto ensure that there are no bad guys out there.

With 5% error bars this data point has the sameweight as 40,000 other data

-10000 -5000Range, m

Ampl

itude

, V/A

m^2

0 5000 10000

10 -16

10 -14

10 -12

10 -10

10-8 (marine CSEM data)

10 −1 10 0 10 1 10 2 10 3 10 40

0.5

1

1.5

2

Local

RM

SMisfit by Period

Period (s)

00.20.40.60.8

11.21.41.6

Local

RM

S

Data Type

Misfit by Data Type

TE log10 (app. res.)

TE phase

Re(Tipper)

Im(Tipper)

TM log10 (app. res.)

TM phase

M40M36

M35M34

M33M31

M27M25

M23M19

M18M17

M12M11

M10M06

M03M02

M01630

631632

633665

634666

667635

668669

636670

671637

672638

639640

00.5

11.5

22.5

33.5

Local

RM

S

Misfit by Site

A B

C

It is also a good idea to look at how the misfit is partitioned across the data:

Ideally it should be random, but in practice very rarely is.

Example is from MT data.

PhD thesis, Brent Wheelock, 2012.

Errors come from

• statistical processing errors (spectral estimation for MT; stacking for CSEM and lots of other methods)

• systematic errors such as navigation errors and instrument calibrations, and

• “geological noise” (our inability to parameterize fine details of geology).

In practice, we only have a good handle on processing errors - everything else is lumped into a noise floor, which can be pretty arbitrary at times.

(modified from Myer et al., 2012)

−20 −10 0 10 200

50

100

150

200

250 2−4 km N = 1646m = 0.3s = 2.8

Amplitude % Difference

Cou

nt

−20 −10 0 10 20

−10 −5 0 5 10

0

50

100

150

200

250

300 2−4 km N = 1646m = 1.3s = 2.4

Phase % Difference

Phase Degrees

−20 −10 0 10 200

20

40

60

80

100

120

4−6 km N = 1491m = 0.6s = 6.3

Amplitude % Difference

Cou

nt

−20 −10 0 10 200

50

100

1504−6 km N = 1491

m = 1.4s = 4.1

Phase % Difference

−10 −5 0 5 10Phase Degrees

“errors” computed from actual repeat tows for marine CSEM

0 100 200 300 400 500 600 700 800

1

2

3

4

5

6

7

8

9

10

8.75

2.4 2.0

1.7 1.5 1.3 1.2 1.1

Roughness measure

RM

S m

isfit

0 100 200 300 400 500 600 700 8001

1.5

2

2.5 2.4

2.0

1.7

1.5

1.3

1.2

1.1

Roughness measure

RM

S m

isfit

A B

So we are often left without statistical guidance and have to use judgement in determining an adequate fit. Some people like trade-off, or “L”-curves...

RMS 1.4

0 100 200 300 400 500 600 700 800

1

2

3

4

5

6

7

8

9

10

8.75

2.4 2.0

1.7 1.5 1.3 1.2 1.1

Roughness measure

RM

S m

isfit

0 100 200 300 400 500 600 700 8001

1.5

2

2.5 2.4

2.0

1.7

1.5

1.3

1.2

1.1

Roughness measure

RM

S m

isfit

A B

RMS 1.4 RMS 1.7

starting half-space

… but I am not one of them.

There really isn’t an objective way to choose misfit level except through a good understanding of the data errors.

0 50 100 150 200 250 300 350 400 4501

1.5

2

2.5 2.4

1.7

1.5

1.3 1.2

1.11

Roughness measure

RM

S m

isfit

0 50 100 150 200 2501

1.5

2

2.5

3

3.5

4

4.5

5

5.5

5.0

2.4

1.7 1.51.3 1.2

Roughness measure

RM

S m

isfit

0 50 100 150 200 250 300 350 400 450 500 5501

1.1

1.2

1.3

1.4

1.5

1.6

1.7

1.8

1.7

1.5

1.3

1.2

1.111.06

Roughness measure

RM

S m

isfit

In fact you can get pretty well what you want simply by changing the range of the plot and the scaling of the axes.


Data

model


Regularization

Priors/constraints

geology



Model parameterization:

modelspace

data space

Best fit

real world

Even with your best efforts, the real world is unlikely to be captured by your model parameterization, and the best fitting model almost certainly won’t be either. Understanding this can be important.

(And the best fitting model won’t have the same misfit as the real world does, because of noise.)

model space

data space

1D

3,4D

2D

Where in model space you are is determined by your parameterization - this also determines where in data space you can be.

In non-linear geophysical problems, even forward modeling can involve a challenging computational effort.

Sven Treitel once asked the question: “Can our mathematics ever completely describe nature?”.

The trite answer, of course, is “No”. However, it is more useful to understand the nature of the limitations:

Are the physics sufficient (e.g. scalar properties versus anisotropy)?

Is the forward computational machinery accurate? (e.g. finite difference calculations don’t handle bathymetry well)

Is the dimensionality of model space large enough? (1D, 2D, 3D, 4D)

Is the discretization fine enough and the model size big enough?

One can rarely afford to blindly ensure these are all achieved, so intelligence and understanding must be applied, perhaps by trial and error.

d̂ = f(x,m) Some forward functional f

m = (m1, m2, .....,mN ) Model parameters

Model space parameterization:

Most geophysical properties cannot go negative, but your inversion scheme might well generate negative values in m. The easiest way to handle this is by parameterizing as log(m), but there are other ways, such as NNLS.

In the real world, N (model size) is infinite (even in 1D). How we proceed from here depends on whether N is small, moderately large, or infinite.

Small (sparse) parameterizations can be handled with parameterized inversions (e.g. Marquardt) or stochastic inversions. The concept of least squares fitting works because sparse models don’t have the freedom to mimic the pathological true least squares solutions.






Infinite N requires a real inverse theory mathematician. I am not one of them.






Infinite N requires a real inverse theory mathematician. I am not one of them.

Most of the time geophysicists are working with moderately large N. Also, many geophysical problems are non-linear, so we will concentrate on that approach.




To invert non-linear forward problems we often linearize around a starting model:

d̂ = f(m1) = f(m0 + �m) � f(m0) + J�m

Jij =⇥f(xi,m0)

⇥mj

�m = m1 �m0 = (�m1, �m2, ...., �mN )

�2 ⇥ ||Wd�Wf(m0) + WJ�m||2

using a matrix of derivatives

and a model perturbation

Now our expression for is�2

For a least squares solution we solve in the usual way by differentiating and setting to zero to get a linear system:

where

So, given a starting model we can find an update :

⇥ = ��m

� = (WJ)T W(d� f(m0))

� = (WJ)T WJ .

m0 �m

�m = ��1⇥

and iterate until we converge. (This is Gauss-Newton.)

Global versus local minima:

For nonlinear problems, there are no guarantees that Gauss-Newton will converge.

There are no guarantees that if it does converge the solution is a global one.

The solution might well depend on the starting model.

local minimum global minimum(maybe)

�2

Global versus local minima:

For nonlinear problems, there are no guarantees that Gauss-Newton will converge.

There are no guarantees that if it does converge the solution is a global one.

The solution might well depend on the starting model.

local minimum global minimum(maybe)

�2

Gauss-Newton only works for small N (it isn’t even defined for N > M). If N gets too large then the solutions become unstable, oscillatory, and generally useless (they are probably trying to converge to D+ type solutions).

where is some measure of the model and is a trade-off parameter or Lagrange multiplier. In 1D a typical might be:R

Almost all inversion today incorporates some type of regularization, which minimizes some aspect of the model as well as fit to data:

Rm µ

R1 =

�

⇧⇧⇧⇧⇧⇧⇧⇤

�1 1 0 0 0 . . . 00 �1 1 0 0 . . . 00 0 �1 1 0 . . . 0

. . . . . .

�1 1

⇥

⌃⌃⌃⌃⌃⌃⌃⌅

which extracts a measure of slope. This stabilizes the inversion, creates a single solution, allows N > M, and manufactures models with useful properties.

This is easily extended to 2D and 3D modeling.

m1

m2

m3

m4

m5

m6

m7

m8

-1

+1 -1

+1 -1

+1 -1

+1 -1

+1 -1

+1 -1

+1

U =�||Wd�Wf(m)||2

�+ µ||Rm||2

When is small, model roughness is ignored and we try to fit the data. When is large, we smooth the model at the expense of data fit.

µ µ

U =�||Wd�Wf(m)||2

�+ µ||Rm||2

The trade-off between roughness and misfit:


One approach is to choose and minimize by least squares, but picking a priori is simply choosing how rough your model is.

µ U

µ µ

µ

U =�||Wd�Wf(m)||2

�+ µ||Rm||2




µ U

µ µ

We ought to have a decent idea of how well our data can be fit. This forms the basis of the “Occam” approach, where a target data misfit is chosen:

µ

U =�||Wd�Wf(m)||2

�+ µ||Rm||2

�2�

U =�||Wd�Wf(m)||2 � �2

⇤�

+ µ||Rm||2




µ U

µ µ

We ought to have a decent idea of how well our data can be fit. This forms the basis of the “Occam” approach, where a target data misfit is chosen:

µ

U =�||Wd�Wf(m)||2

�+ µ||Rm||2

�2�

U =�||Wd�Wf(m)||2 � �2

⇤�

+ µ||Rm||2

U = ||Rm1||2 + µ�1�||Wd�W

�f(m0) + J(m1 �m0)

⇥||2 � ⇥2

⇥⇥

m1 =�µRT R + (WJ)T WJ

⇥�1(WJ)T W(d� f(m0) + Jm0) .

For linearized, iterative inversion we use

After differentiation and setting to zero we get an expression for a new model:


one parameter

Sum-square misfit

starting model

surface of acceptablemisfit

least squares model

rough

ness =

0 when

p 1 = p 2

smoothest acceptable model

two

para

met

er

If the Occam algorithm does not get hung up in a local minimum, it will converge to the smoothest model for a given misfit.

It is important to run the inversion to convergence, and not stop as soon as the target misfit is achieved.

I will step through a joint 2D Occam inversion of marine CSEM (3 frequencies, no phase) and marine MT (Gemini salt prospect, Gulf of Mexico):

Dep

th (k

m)

Rho y, Gemini_joint_inv_2pt4_a.0.resistivity Folder: 40

8 3178157N 337657E

10 3179571N 339071E

12 3180985N 340485E

14 3182399N 341899E

16 3183814N 343314E

18 3185228N 344728E

20 3186642N 346142E

22 3188056N 347556E

24 3189471N 348971E

26 3190885N 350385E

0

2

4

6

8

10

12

log1

0(oh

m−m

)

−1

−0.5

0

0.5

1

1.5

2

0 2 4 6 8 10 120

10

20

30

40

50

60

70

Iteration number

Roug

hnes

s

0 2 4 6 8 10 12

2

4

6

8

10

Iteration number

RMS

misf

it

CSEMMT

All

Dep

th (k

m)

Rho y, RMS: 5.2768 Gemini_joint_inv_2pt4_a.1.resistivity Folder: 40

8 3178157N 337657E

10 3179571N 339071E

12 3180985N 340485E

14 3182399N 341899E

16 3183814N 343314E

18 3185228N 344728E

20 3186642N 346142E

22 3188056N 347556E

24 3189471N 348971E

26 3190885N 350385E

0

2

4

6

8

10

12

log1

0(oh

m−m

)

−1

−0.5

0

0.5

1

1.5

2

0 2 4 6 8 10 120

10

20

30

40

50

60

70

Iteration number

Roug

hnes

s

0 2 4 6 8 10 12

2

4

6

8

10

Iteration number

RMS

misf

it

CSEMMT

All

Dep

th (k

m)


8 3178157N 337657E

10 3179571N 339071E

12 3180985N 340485E

14 3182399N 341899E

16 3183814N 343314E

18 3185228N 344728E

20 3186642N 346142E

22 3188056N 347556E

24 3189471N 348971E

26 3190885N 350385E

0

2

4

6

8

10

12

log1

0(oh

m−m

)

−1

−0.5

0

0.5

1

1.5

2

0 2 4 6 8 10 120

10

20

30

40

50

60

70

Iteration number

Roug

hnes

s

0 2 4 6 8 10 12

2

4

6

8

10

Iteration number

RMS

misf

it

CSEMMT

All

Why do we start from a half-space? Because J depends on m.

MT: misaligned starting resistor - no harm done

Courtesy David Myer.

misaligned starting conductor - forever trapped by J


Depth

(km)

8 10 12 14 16 18 20 22 24 26

0

2

4

6

8

10

12

log10

(ohm

−m)

−1

−0.5

0

0.5

1

1.5

2

Dept

h (k

m)

8 10 12 14 16 18 20 22 24 26

0

2

4

6

8

10

12

log10

(ohm

−m)

−1

−0.5

0

0.5

1

1.5

2

Dept

h (k

m)

8 10 12 14 16 18 20 22 24 26

0

2

4

6

8

10

12

log10

(ohm

−m)

−1

−0.5

0

0.5

1

1.5

2

Dept

h (k

m)

8 10 12 14 16 18 20 22 24 26

0

2

4

6

8

10

12

log10

(ohm

−m)

−1

−0.5

0

0.5

1

1.5

2

Dept

h (k

m)

8 10 12 14 16 18 20 22 24 26

0

2

4

6

8

10

12

log10

(ohm

−m)

−1

−0.5

0

0.5

1

1.5

2

Dept

h (k

m)

8 10 12 14 16 18 20 22 24 26

0

2

4

6

8

10

12

log10

(ohm

−m)

−1

−0.5

0

0.5

1

1.5

2

RMS 2.40

RMS 1.70

RMS 1.30

RMS 1.20

RMS 1.11RMS 1.06

Even with well-estimated errors, choice of misfit can still be somewhat subjective.


m1

m2

m3

m4

m5

m6

m7

m8

-1

+1 -1

+1 -1

+1 -1

+1 -1

+1 -1

+1 -1

+1

m1

m2

m3

m4

m5

m6

m7

m8

-1

+1 -1

+1 -1

+1 -1

+1 -1

+1 -1

+1 -1

+1

R1 =

�

⇧⇧⇧⇧⇧⇧⇧⇤

�1 1 0 0 0 . . . 00 �1 1 0 0 . . . 00 0 �1 1 0 . . . 0

. . . . . .

�1 1

⇥

⌃⌃⌃⌃⌃⌃⌃⌅

R2 =

0

BBBBBBB@

�1 1 0 0 0 . . . 00 �1 1 0 0 . . . 00 0 �1 1 0 . . . 0

. . . . . .

�1 1

1

CCCCCCCA

⇢z ⇢h

?

What about anisotropy? It is quite common for physical properties of sediments to be different in the vertical and horizontal directions. For example, horizontal resistivity is often smaller than vertical resistivity .

The problem is how to weight the penalty between the two models.

⇢z⇢h

Dept

h (k

m)

8 10 12 14 16 18 20 22 24 26

0

2

4

6

8

10

12

log10

(ohm

−m)

−1

−0.5

0

0.5

1

1.5

2

Dept

h (k

m)

8 10 12 14 16 18 20 22 24 26

0

2

4

6

8

10

12

log10

(ohm

−m)

−1

−0.5

0

0.5

1

1.5

2

Depth

(km)

8 10 12 14 16 18 20 22 24 26

0

2

4

6

8

10

12

log10

(ohm

−m)

−1

−0.5

0

0.5

1

1.5

2

Dept

h (k

m)

8 10 12 14 16 18 20 22 24 26

0

2

4

6

8

10

12

log10

(ohm

−m)

−1

−0.5

0

0.5

1

1.5

2De

pth

(km

)

8 10 12 14 16 18 20 22 24 26

0

2

4

6

8

10

12

log10

(ohm

−m)

−1

−0.5

0

0.5

1

1.5

2

Dept

h (k

m)

8 10 12 14 16 18 20 22 24 26

0

2

4

6

8

10

12

log10

(ohm

−m)

−1

−0.5

0

0.5

1

1.5

2

Joint Anisotropic Inversions versus penalty between rho-y and rho-z (all fitting to RMS 1.2):

Low weight (0.1), models are

independent.

⇢h ⇢h ⇢h

⇢z ⇢z ⇢z

Medium weight (1), models look sensible.

High weight (10), models are identical.


Data

model


Regularization

Priors/constraints

geology



There are many ways to choose how to regularize the problem, and this matters too.

For a given misfit, the model depends on R

10-1

100

101

102

-2500

-2000

-1500

-1000

-500

0

Minimum norm model

Resistivity, Ohm.m

Depth

, m

10-1

100

101

102

-2500

-2000

-1500

-1000

-500

0

First deriv smoothing -1 1

Resistivity, Ohm.m

Depth

, m

10-1

100

101

102

-2500

-2000

-1500

-1000

-500

0

Second deriv smoothing 1 -2 1

Resistivity, Ohm.m

Depth

, m

10-1

100

101

102

-2500

-2000

-1500

-1000

-500

0

Depth weighted smoothing

Resistivity, Ohm.m

Depth

, m

10-1

100

101

102

-2500

-2000

-1500

-1000

-500

0

Thickness weighted smoothing

Resistivity, Ohm.m

Depth

, m

(1 0 )

(-1 1) x thick/depth (-1 1) x thickness

(-1 1 ) (1 -2 1 )

You can have fun with cuts (removing a row of R):

10-1

100

101

102

-2500

-2000

-1500

-1000

-500

0

Cuts in smoothing: Top and bottom

Resistivity, Ohm.m

De

pth

, m

10-1

100

101

102

-2500

-2000

-1500

-1000

-500

0

Cuts in smoothing: Top only

Resistivity, Ohm.m

De

pth

, m

10-1

100

101

102

-2500

-2000

-1500

-1000

-500

0

Cuts in smoothing: Bottom only

Resistivity, Ohm.m

De

pth

, m

10-1

100

101

102

-2500

-2000

-1500

-1000

-500

0

Cuts in smoothing: Too deep

Resistivity, Ohm.m

De

pth

, m

10-1

100

101

102

-2500

-2000

-1500

-1000

-500

0

Cuts in smoothing: Too shallow

Resistivity, Ohm.m

De

pth

, m

10-1

100

101

102

-2500

-2000

-1500

-1000

-500

0

Marquardt 3-layer model RMS 1.018

Resistivity, Ohm.m

De

pth

, m

You can have fun with cuts (removing a row of R):

10-1

100

101

102

-2500

-2000

-1500

-1000

-500

0

Cuts in smoothing: Top and bottom

Resistivity, Ohm.m

De

pth

, m

10-1

100

101

102

-2500

-2000

-1500

-1000

-500

0

Cuts in smoothing: Top only

Resistivity, Ohm.m

De

pth

, m

10-1

100

101

102

-2500

-2000

-1500

-1000

-500

0

Cuts in smoothing: Bottom only

Resistivity, Ohm.m

De

pth

, m

10-1

100

101

102

-2500

-2000

-1500

-1000

-500

0

Cuts in smoothing: Too deep

Resistivity, Ohm.m

De

pth

, m

10-1

100

101

102

-2500

-2000

-1500

-1000

-500

0

Cuts in smoothing: Too shallow

Resistivity, Ohm.m

De

pth

, m

10-1

100

101

102

-2500

-2000

-1500

-1000

-500

0

Marquardt 3-layer model RMS 1.018

Resistivity, Ohm.m

De

pth

, m

Sparsely parameterized model does well! (Why?)

In 2D:

Inverted with isotropic smoothing

Inverted with 1st-difference

Synthetic Model

default regularization generatesa conductive artifact Courtesy Brent Wheelock.

Always remember that regularization has an input into the model solution. Both of these MT models fit the data equally well.

R1 =

�

⇧⇧⇧⇧⇧⇧⇧⇤

�1 1 0 0 0 . . . 00 �1 1 0 0 . . . 00 0 �1 1 0 . . . 0

. . . . . .

�1 1

⇥

⌃⌃⌃⌃⌃⌃⌃⌅

A nonlinear and adaptive R, providing little penalty for big contrasts


Default R:

Can we place errors or uncertainties on regularized models?

No!

For sparse parameterizations, data errors are often projected onto model parameters through the Jacobian. This is a dangerous practice because

• it depends on the parameterization

• the Jacobian depends on the solution

Can we place errors or uncertainties on regularized models?

No!

For sparse parameterizations, data errors are often projected onto model parameters through the Jacobian. This is a dangerous practice because

• it depends on the parameterization

• the Jacobian depends on the solution

For regularized models, however, we compound this by generating huge amounts of covariance between the model parameters as part of the smoothness constraint. It is much more useful to think of regularized models as extremal solutions, and vary the regularization to ask the questions you may have.

Stochastic methods provide a useful way to assess model uncertainty, but they are still restricted to simple, mostly 1D, models.

More on errors and regularization:Regularized Fits to f(x) = 1.0x + 1.0

When creating synthetic data for inversion tests, always perturb the data with noise, don’t just add error bars.

This is because regularized inversion will use its misfit budget to make the model smaller.

You actually get better models by adding noise.

Constable, 1991

Data

model


Regularization

Priors/constraints

geology



It can even matter how you scale the data.

In EM, both MT apparent resistivities and CSEM amplitudes can vary by many orders of magnitude. This suggests that one should use error floors that are percentages.

One might also parameterize the data as logs. For small :

where

d0 ± 0.434✏ = log10(d± ✏d)

0.434 = 1/ ln(10)

✏

lineardata

fractionalerror

log datadata



where

It ought not to matter how you parameterize the data (so long as the errors are properly scaled and the appropriate chain rule is applied to the Jacobian):

d0 ± 0.434✏ = log10(d± ✏d)

0.434 = 1/ ln(10)

✏


⇥�1(WJ)T W(d� f(m0) + Jm0) .

lineardata

fractionalerror

log datadata

make datalog

take the logof the forward

model

use the chainrule to convert

to @d/@m @ log(d)/@m



where

It ought not to matter how you parameterize the data (so long as the errors are properly scaled and the appropriate chain rule is applied to the Jacobian):

But ...

d0 ± 0.434✏ = log10(d± ✏d)

0.434 = 1/ ln(10)

✏


⇥�1(WJ)T W(d� f(m0) + Jm0) .

lineardata

fractionalerror

log datadata

... it does. Consider misfits in marine CSEM data:

.9 x 10-10 (linear) or 1 (log)

.9 x 10-11 (linear) or 1 (log)

in the linear domain, this misfit is 10 times bigger than the other one

.99 x 10-11 (linear) or 2 (log).9 x 10-11 (linear) or 1 (log)

in the linear domain, this misfit is only 10% bigger than the other one

... it does. Consider misfits in marine CSEM data:

Horizontal Position (km)

Depth

(km)

♦

♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦

1234567891011121314151617181920

−180 −160 −140 −120 −100 −80 −60 −40 −20 0

−6

−4

−2

0

2

4

6

8

10Seafloor

Air

10 −1 10 0 10 1 10 2 10 3 10 4

10 0

10 1

10 2

10 3

Halfspace ResistivityRM

S Misfi

t

True Model90 Ωm

linear rho, phaseimpedance, phaselog rho, phase

Here we consider MT data over a half-space, varying only half-space resistivity R.

⇤ =1

2⇥fµ|Z|2

Ex

= ZHy

Recall that MT impedance (Z ) is:

For small R, misfit flattens for linear �

(modified from Wheelock et al., 2015)

(d� f(m))! const.

f(m)! 0

( log d� log f(m))

This is because

as but

does not.Horizontal Position (km)

Depth

(km)

♦

♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦

1234567891011121314151617181920

−180 −160 −140 −120 −100 −80 −60 −40 −20 0

−6

−4

−2

0

2

4

6

8

10Seafloor

Air

10 −1 10 0 10 1 10 2 10 3 10 4

10 0

10 1

10 2

10 3

Halfspace ResistivityRM

S Misfi

t

True Model90 Ωm

linear rho, phaseimpedance, phaselog rho, phase

Here we consider MT data over a half-space, varying only half-space resistivity R.

⇤ =1

2⇥fµ|Z|2

Ex

= ZHy

Recall that MT impedance (Z ) is:

For small R, misfit flattens for linear �


This effect can be even worse for marine MT data affected by bathymetry.

Local minima develop, and misfit flatlines at low R.

10 −1 10 0 10 1 10 2 10 3 10 4

10 0

10 1

10 2

10 3

Seafloor Resistivity

RMS M

isfit

True Model90 Ωm

linear rho, phaseimpedancelog rho, phase


Truth

log10(ρa), ΦZ , 5%, 10 Ωm, 11 iter.

log10(ρa), ΦZ , 10%, 10 Ωm, 14 iter. ρa, ΦZ , 10%, 1000 Ωm, 24 iter.

100 Ωm

5 Ωm

ρa, ΦZ , 5%, 10 Ωm, 21 iter.

Z, 5%, 10 Ωm, d.n.c.


Here is a 2D example. Linear apparent resistivity and phase converged, but log(resistivity) converges to the same model in half the iterations. MT impedance, Z, did not converge (d.n.c) at all.

So I hope I have convinced you that models from geophysical inversion depend on much more than the data alone:

Data

model


Regularization

Priors/constraints

geology



So I hope I have convinced you that models from geophysical inversion depend on much more than the data alone:

Data

model


Regularization

Priors/constraints

geology



Does this mean that geophysical inversion is useless?

Dep

th (k

m)

Folder: 11AI_Anis_Joint_1pt0_2

8 0N

10 0N

12 0N

14 0N

16 0N

18 0N

20 0N

0

1

2

3

4

5

6

7

8

log10(ohm−m

)

−1

−0.5

0

0.5

1

1.5

2

Rho-z

2000

1500

0.2 1 10 Ωm

2500

3000

3500

met

ers b

ls

Not at all! There is plenty of evidence that geophysical inversion works very well. You just have to know what you are doing, how your code and algorithm work, and pay attention to the factors besides the data that determine the result.

And, you will probably need to run more than one inversion… many more.

Connecting the world of applied geophysics

geophysical inversion: which model do you want?steve/segdl/inversion_reduced.pdf · e−field...

Documents