TRANSCRIPT
Spatial GMRFs SPDE Data Generalisation
Gaussian Markov random fields:
Efficient modelling of spatially
dependent data
Johan Lindström¹  Finn Lindgren²  Håvard Rue²
¹Centre for Mathematical Sciences, Lund University
²Department of Mathematical Sciences, Norwegian University of Science and Technology, Trondheim
Lund, April 28th, 2011
Johan Lindstrom - [email protected] Gaussian Markov random fields
Spatial interpolation — Kriging
The standard problem
Given observations at some locations, Y(s_i), i = 1, …, n, we want to make statements about the value at unobserved location(s), Y(s_0).
[Figure: observed depth data; scale bar 200 m, depth values from −14 to −4.]
Data from Peter Jonsson (Engineering geology, Lund University).
Gaussian fields
A Gaussian model
We often assume that the observations come from a Gaussian field,

    [Y_0; Y] ∼ N( [μ_0; μ], [Σ_00, Σ_0n; Σ_0n^T, Σ_nn] ).

If the mean and covariance matrix are known, the optimal predictor is the conditional expectation

    E(Y_0 | Y_1, …, Y_n) = μ_0 + Σ_0n Σ_nn^{-1} (Y − μ).

However, the mean and covariance matrix are typically not known.
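As a concrete illustration (not from the slides), the conditional mean and covariance can be computed directly with NumPy; the function name `krige` and the toy matrices in the usage below are my own:

```python
import numpy as np

def krige(mu0, mu, S00, S0n, Snn, y):
    """Conditional mean and covariance of Y0 given Y = y in a joint Gaussian:
    E(Y0|Y) = mu0 + S0n Snn^{-1} (y - mu),  V(Y0|Y) = S00 - S0n Snn^{-1} S0n^T."""
    w = np.linalg.solve(Snn, y - mu)                  # Snn^{-1} (y - mu)
    cond_mean = mu0 + S0n @ w
    cond_cov = S00 - S0n @ np.linalg.solve(Snn, S0n.T)
    return cond_mean, cond_cov
```

In practice one solves linear systems rather than inverting Σ_nn explicitly, which is both cheaper and numerically safer.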
Parameter estimation
Assume a parametric form for the covariance and mean,

    Y ∼ N( μ(θ), Σ(θ) ).

The log-likelihood becomes

    l(θ|Y) = −(1/2) log|Σ(θ)| − (1/2) (Y − μ(θ))^T Σ(θ)^{-1} (Y − μ(θ)).

Problems:
1. The covariance matrix has O(n²) unique elements.
2. Calculating l(θ|Y) takes O(n³) time.

For the depth data:
◮ 11 705 observations.
◮ Storing the covariance matrix takes ∼ 1 GB.
◮ Evaluating l(θ|Y) once(!) takes 2.5 minutes.
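A minimal sketch of evaluating this log-likelihood via a Cholesky factorisation (function name mine; the −(n/2) log 2π constant is dropped, as on the slide). The factorisation is exactly the O(n³) step that makes dense covariance matrices the bottleneck:

```python
import numpy as np

def loglik(y, mu, Sigma):
    """Gaussian log-likelihood, dropping the constant -(n/2) log(2 pi).
    The Cholesky factorisation is the O(n^3) step that dominates the cost."""
    L = np.linalg.cholesky(Sigma)                 # Sigma = L L^T, O(n^3)
    r = np.linalg.solve(L, y - mu)                # triangular solve, O(n^2)
    logdet = 2.0 * np.log(np.diag(L)).sum()       # log |Sigma|
    return -0.5 * logdet - 0.5 * (r @ r)
```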
Gaussian Markov random fields
Let the neighbours N_i of a point s_i be the points {s_j : j ∈ N_i} that are "close" to s_i.

Gaussian Markov random field (GMRF)
A Gaussian random field x ∼ N(μ, Σ) that satisfies

    p(x_i | {x_j : j ≠ i}) = p(x_i | {x_j : j ∈ N_i})

is a Gaussian Markov random field.

Using the precision matrix, Q = Σ^{-1}, the conditional expectation becomes

    E(x_i | {x_j : j ≠ i}) = μ_i − (1/Q_ii) Σ_{j≠i} Q_ij (x_j − μ_j).

A GMRF has a sparse precision matrix: j ∈ N_i ⟺ Q_ij ≠ 0.
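The conditional-expectation formula can be read straight off one row of a sparse precision matrix; a hedged sketch (function name mine). Only the non-zeros in row i, i.e. the neighbours, contribute to the sum:

```python
import numpy as np
from scipy import sparse

def gmrf_cond_mean(Q, mu, x, i):
    """E(x_i | x_{-i}) = mu_i - (1/Q_ii) sum_{j != i} Q_ij (x_j - mu_j).
    Only the non-zeros in row i of the sparse Q (the neighbours) matter."""
    row = Q.getrow(i).toarray().ravel()
    s = row @ (x - mu) - row[i] * (x[i] - mu[i])   # sum over j != i
    return mu[i] - s / row[i]
```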
The Markov property
The Markov property does not imply that events far apart are independent, only that if we know what happens close by we can ignore things further away.

Example
We want to predict the temperature in Lund.
1. If we know the temperature in Copenhagen, the temperature in London contributes very little information.
2. However, if we don't know the temperature in Copenhagen, the temperature in London would help.

The sparse precision matrix makes GMRFs computationally effective, but it is hard to construct reasonable precision matrices.
A common covariance function
The Matérn covariance family
The covariance between two points at distance ‖u‖ is

    C(‖u‖) = σ² · (2^{1−ν} / Γ(ν)) · (κ‖u‖)^ν K_ν(κ‖u‖).

The Matérn family includes:
◮ Exponential, if ν = 1/2.
◮ Gaussian, when ν → ∞.
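The Matérn covariance is straightforward to evaluate with `scipy.special.kv` (the modified Bessel function K_ν); a sketch with a function name of my own choosing. Setting ν = 1/2 should reproduce the exponential covariance σ² exp(−κd):

```python
import numpy as np
from scipy.special import gamma, kv

def matern(d, sigma2, kappa, nu):
    """Matern covariance C(d) = sigma2 * 2^(1-nu)/Gamma(nu) * (kappa d)^nu * K_nu(kappa d),
    with C(0) = sigma2 taken by continuity."""
    d = np.atleast_1d(np.asarray(d, dtype=float))
    c = np.full(d.shape, float(sigma2))            # C(0) = sigma2
    pos = d > 0
    kd = kappa * d[pos]
    c[pos] = sigma2 * 2.0 ** (1.0 - nu) / gamma(nu) * kd**nu * kv(nu, kd)
    return c
```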
Fields with Matérn covariances are solutions to an SPDE (Whittle, 1954, 1963):

    (κ² − Δ)^{α/2} x(s) = W(s).

Here W(s) is white noise, Δ = Σ_i ∂²/∂s_i² is the Laplacian, and α = ν + d/2.
Constructing a GMRF from the SPDE
We want to construct a (GMRF) solution to the SPDE.
The basic idea
Construct the solution as a finite basis expansion
    x(s) = Σ_k ψ_k(s) x_k,

with a suitable distribution for the weights {x_k}.

A stochastic weak solution is given by weights {x_i} such that the joint distribution fulfills

    Σ_i ⟨ψ_j, (κ² − Δ)^{α/2} ψ_i⟩ x_i  =_D  ⟨ψ_j, W⟩   for all j,

where =_D denotes equality in distribution.
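A 1-D toy version of such a basis expansion, using piecewise linear "hat" functions (all names and numbers below are mine; the slides work in 2-D on triangulations):

```python
import numpy as np

def hat_basis(s, nodes):
    """Evaluate piecewise linear 'hat' functions psi_k at locations s
    on a uniform 1-D grid of nodes."""
    s = np.atleast_1d(np.asarray(s, dtype=float))
    h = nodes[1] - nodes[0]                       # uniform grid spacing
    psi = np.zeros((len(s), len(nodes)))
    for k, c in enumerate(nodes):
        psi[:, k] = np.clip(1.0 - np.abs(s - c) / h, 0.0, None)
    return psi

nodes = np.linspace(0.0, 1.0, 5)
weights = np.array([0.0, 1.0, 0.5, -0.5, 0.0])    # the x_k
x_at_s = hat_basis(0.375, nodes) @ weights        # x(s) = sum_k psi_k(s) x_k
```

At any s, at most two hat functions are non-zero, which is what later makes the matrices in the construction sparse.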
Possible basis functions
Several possible basis functions exist:
◮ Harmonic functions give a spectral representation.
◮ Eigenfunctions of the covariance give the Karhunen–Loève expansion.
◮ A piecewise linear basis gives (almost) a GMRF.
Construction of the precision matrix
We have the SPDE (weak formulation)

    Σ_i ⟨ψ_j, (κ² − Δ)^{α/2} ψ_i⟩ x_i  =_D  ⟨ψ_j, W⟩.

SPDE solution
The distribution of the weights is x ∼ N(0, Q^{-1}), with

    Q_{1,κ} = K,   Q_{2,κ} = K C^{-1} K,
    Q_{α,κ} = K C^{-1} Q_{α−2,κ} C^{-1} K,   α = 3, 4, …,

where

    C_{i,j} = ⟨ψ_i, ψ_j⟩,
    K_{i,j} = ⟨ψ_i, (κ² − Δ) ψ_j⟩ = κ² ⟨ψ_i, ψ_j⟩ − ⟨ψ_i, Δ ψ_j⟩ = κ² C_{i,j} + G_{i,j}.
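A sketch of this construction for integer α, assuming the mass matrix C and stiffness matrix G are given as SciPy sparse matrices (the function name and the parity handling of the recursion are mine):

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve

def spde_precision(C, G, kappa, alpha):
    """Q_{alpha,kappa} for integer alpha >= 1, with K = kappa^2 C + G:
    Q_1 = K, Q_2 = K C^{-1} K, Q_alpha = K C^{-1} Q_{alpha-2} C^{-1} K.
    With the full mass matrix C, C^{-1} is dense, so Q fills in for alpha >= 2."""
    K = (kappa**2) * C + G
    if alpha == 1:
        return K
    Ci = spsolve(C.tocsc(), sparse.identity(C.shape[0], format="csc"))
    Q = K @ Ci @ K if alpha % 2 == 0 else K       # start at Q_2 or Q_1
    start = 2 if alpha % 2 == 0 else 1
    for _ in range((alpha - start) // 2):
        Q = K @ Ci @ Q @ Ci @ K                   # the recursion step
    return Q
```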
Markov approximation
Since our basis functions have compact support, the C and K matrices are sparse. However, C^{-1} is dense, making Q_{2,κ} = K C^{-1} K dense.

GMRF approximation
To obtain sparse precision matrices we replace the C matrix with a diagonal matrix C̃ with elements

    C̃_{i,i} = ⟨ψ_i, 1⟩.

The resulting approximation error is small (Bolin & Lindgren, 2009, and Simpson, Lindgren & Rue, 2010, tech. reps.).
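Because the hat functions sum to one inside the domain, ⟨ψ_i, 1⟩ equals the i-th row sum of C, so the lumped matrix is a one-liner (function name mine; this is a sketch of the mass-lumping idea, not the authors' code):

```python
import numpy as np
from scipy import sparse

def lump(C):
    """Diagonal (lumped) mass matrix with entries C~_ii = <psi_i, 1>,
    computed as the row sums of C.  Its inverse is diagonal, so
    Q_2 = K C~^{-1} K stays sparse."""
    return sparse.diags(np.asarray(C.sum(axis=1)).ravel())
```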
Lattice on ℝ², step size h, regular triangulation

◮ Order α = 1: κ²h²·[1] plus the five-point stencil

         −1
     −1   4  −1        (≈ −D, a discrete negative Laplacian)
         −1

◮ Order α = 2: κ⁴h²·[1] + 2κ² times the stencil above, plus

              1
          2  −8   2
      1  −8  20  −8   1    (× 1/h², ≈ D²)
          2  −8   2
              1

◮ Not restricted to regular lattices; easy on triangulations.
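The α = 1 lattice precision can be assembled from Kronecker products of 1-D second-difference matrices, a standard construction (function and variable names are mine):

```python
import numpy as np
from scipy import sparse

def lattice_precision_a1(nx, ny, kappa, h=1.0):
    """alpha = 1 precision on an nx-by-ny lattice with step h:
    (kappa h)^2 I plus the five-point stencil (-1, -1, 4, -1, -1)."""
    D1 = sparse.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(nx, nx))
    D2 = sparse.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(ny, ny))
    # Kronecker sum of the two 1-D operators gives the 2-D five-point stencil
    lap = sparse.kron(sparse.identity(ny), D1) + sparse.kron(D2, sparse.identity(nx))
    return (kappa * h) ** 2 * sparse.identity(nx * ny) + lap
```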
Example of neighbours for α = 2
Observations
Point observations

    y(u) = x(u) + ε = Σ_k ψ_k(u) x_k + ε

Area observations

    y(M) = ∫_M x(u) du + ε = Σ_k x_k ∫_M ψ_k(u) du + ε
Conditional expectation
Introducing a matrix A with rows

    A_i· = [ψ_1(u_i) ⋯ ψ_N(u_i)]   or   A_i· = [∫_{M_i} ψ_1(u) du ⋯],

we can write the observation equation in matrix form as

    y = A x + ε,   x ∼ N(μ, Q^{-1}),   ε ∼ N(0, Q_ε^{-1}).

Kriging with GMRFs

    E(x|y) = μ + Q_{x|y}^{-1} A^T Q_ε (y − Aμ)
    V(x|y) = Q_{x|y}^{-1} = (Q + A^T Q_ε A)^{-1}
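These kriging formulas translate directly into sparse linear algebra; a sketch assuming SciPy sparse inputs (the function name and toy example are mine):

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve

def gmrf_krige(mu, Q, A, Qe, y):
    """Posterior mean of x given y = A x + eps, eps ~ N(0, Qe^{-1}):
    Q_{x|y} = Q + A^T Qe A,  E(x|y) = mu + Q_{x|y}^{-1} A^T Qe (y - A mu).
    All matrices stay sparse; no dense covariance matrix is ever formed."""
    Qpost = (Q + A.T @ Qe @ A).tocsc()
    return mu + spsolve(Qpost, A.T @ (Qe @ (y - A @ mu)))
```

The sparse solve is where the computational gain over dense kriging comes from: the cost is governed by the sparsity pattern of Q_{x|y}, not by n³.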
Example of an oscillating field
    (κ² exp(iπθ) − Δ)(x_1(u) + i x_2(u)) = W_1(u) + i W_2(u)

[Figure: two panels illustrating the oscillating field.]
Example of a non-stationary field
    (κ²(u) − Δ)(τ(u) x(u)) = W(u)
Features and extensions, past, current and future work
◮ Just as easy for general manifolds, e.g. the globe, as for ℝ^d.
◮ Oscillating fields
◮ Non-stationary versions, such as

    (κ²(u) − Δ)^{α/2} (τ(u) x(u)) = W(u)
    (κ²(u) + ∇·m(u) − ∇·M(u)∇)^{α/2} x(u) = τ(u) W(u)

  (Includes the Sampson & Guttorp deformation method.)
◮ Non-stationarity that depends on covariates.
◮ Multivariate versions
◮ Spatio-temporal versions
◮ Works well with INLA (current & next generation) but is also generally applicable.