TRANSCRIPT
Spatial GMRFs SPDE Data Generalisation
Gaussian Markov random fields:
Efficient modelling of spatially
dependent data
Johan Lindström¹  Finn Lindgren²  Håvard Rue²
¹Centre for Mathematical Sciences, Lund University
²Department of Mathematical Sciences, Norwegian University of Science and Technology, Trondheim
Lund, April 28th, 2011
Johan Lindstrom - [email protected] Gaussian Markov random fields
Spatial interpolation — Kriging
The standard problem
Given observations at some locations, Y(s_i), i = 1, …, n, we want to make statements about the value at unobserved location(s), Y(s_0).
[Figure: observed depth data; scale bar 200 m, depth values from −14 to −4.]
Data from Peter Jonsson (Engineering geology, Lund University).
Gaussian fields
A Gaussian model
We often assume that the observations come from a Gaussian field,

    [Y_0; Y] ∼ N( [μ_0; μ], [Σ_00, Σ_0n; Σ_0n^T, Σ_nn] ).

If the mean and covariance matrix are known, the optimal predictor is the conditional expectation

    E(Y_0 | Y_1, …, Y_n) = μ_0 + Σ_0n Σ_nn^{-1} (Y − μ).

However, the mean and covariance matrix are typically not known.
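As a concrete illustration (not from the slides), the conditional mean and covariance can be computed directly with NumPy; the function name `krige` and the toy matrices in the usage below are my own:

```python
import numpy as np

def krige(mu0, mu, S00, S0n, Snn, y):
    """Conditional mean and covariance of Y0 given Y = y in a joint Gaussian:
    E(Y0|Y) = mu0 + S0n Snn^{-1} (y - mu),  V(Y0|Y) = S00 - S0n Snn^{-1} S0n^T."""
    w = np.linalg.solve(Snn, y - mu)                  # Snn^{-1} (y - mu)
    cond_mean = mu0 + S0n @ w
    cond_cov = S00 - S0n @ np.linalg.solve(Snn, S0n.T)
    return cond_mean, cond_cov
```

In practice one solves linear systems rather than inverting Σ_nn explicitly, which is both cheaper and numerically safer.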
Parameter estimation
Assume a parametric form for the covariance and mean,

    Y ∼ N( μ(θ), Σ(θ) ).

The log-likelihood becomes

    l(θ|Y) = −(1/2) log|Σ(θ)| − (1/2) (Y − μ(θ))^T Σ(θ)^{-1} (Y − μ(θ)).

Problems:
1. The covariance matrix has O(n²) unique elements.
2. Calculating l(θ|Y) takes O(n³) time.

For the depth data:
◮ 11 705 observations.
◮ Storing the covariance matrix takes ∼ 1 GB.
◮ Evaluating l(θ|Y) once(!) takes 2.5 minutes.
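A minimal sketch of evaluating this log-likelihood via a Cholesky factorisation (function name mine; the −(n/2) log 2π constant is dropped, as on the slide). The factorisation is exactly the O(n³) step that makes dense covariance matrices the bottleneck:

```python
import numpy as np

def loglik(y, mu, Sigma):
    """Gaussian log-likelihood, dropping the constant -(n/2) log(2 pi).
    The Cholesky factorisation is the O(n^3) step that dominates the cost."""
    L = np.linalg.cholesky(Sigma)                 # Sigma = L L^T, O(n^3)
    r = np.linalg.solve(L, y - mu)                # triangular solve, O(n^2)
    logdet = 2.0 * np.log(np.diag(L)).sum()       # log |Sigma|
    return -0.5 * logdet - 0.5 * (r @ r)
```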
Gaussian Markov random fields
Let the neighbours N_i of a point s_i be the points {s_j : j ∈ N_i} that are "close" to s_i.

Gaussian Markov random field (GMRF)
A Gaussian random field x ∼ N(μ, Σ) that satisfies

    p(x_i | {x_j : j ≠ i}) = p(x_i | {x_j : j ∈ N_i})

is a Gaussian Markov random field.

Using the precision matrix, Q = Σ^{-1}, the conditional expectation becomes

    E(x_i | {x_j : j ≠ i}) = μ_i − (1/Q_ii) Σ_{j≠i} Q_ij (x_j − μ_j).

A GMRF has a sparse precision matrix: j ∈ N_i ⟺ Q_ij ≠ 0.
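The conditional-expectation formula can be read straight off one row of a sparse precision matrix; a hedged sketch (function name mine). Only the non-zeros in row i, i.e. the neighbours, contribute to the sum:

```python
import numpy as np
from scipy import sparse

def gmrf_cond_mean(Q, mu, x, i):
    """E(x_i | x_{-i}) = mu_i - (1/Q_ii) sum_{j != i} Q_ij (x_j - mu_j).
    Only the non-zeros in row i of the sparse Q (the neighbours) matter."""
    row = Q.getrow(i).toarray().ravel()
    s = row @ (x - mu) - row[i] * (x[i] - mu[i])   # sum over j != i
    return mu[i] - s / row[i]
```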
The Markov property
The Markov property does not imply that events far apart are independent, only that if we know what happens close by we can ignore things further away.

Example
We want to predict the temperature in Lund.
1. If we know the temperature in Copenhagen, the temperature in London contributes very little information.
2. However, if we don't know the temperature in Copenhagen, the temperature in London would help.

The sparse precision matrix makes GMRFs computationally effective, but it is hard to construct reasonable precision matrices.
A common covariance function
The Matérn covariance family
The covariance between two points at distance ‖u‖ is

    C(‖u‖) = σ² · (2^{1−ν} / Γ(ν)) · (κ‖u‖)^ν K_ν(κ‖u‖).

The Matérn family includes:
◮ Exponential, if ν = 1/2.
◮ Gaussian, when ν → ∞.
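The Matérn covariance is straightforward to evaluate with `scipy.special.kv` (the modified Bessel function K_ν); a sketch with a function name of my own choosing. Setting ν = 1/2 should reproduce the exponential covariance σ² exp(−κd):

```python
import numpy as np
from scipy.special import gamma, kv

def matern(d, sigma2, kappa, nu):
    """Matern covariance C(d) = sigma2 * 2^(1-nu)/Gamma(nu) * (kappa d)^nu * K_nu(kappa d),
    with C(0) = sigma2 taken by continuity."""
    d = np.atleast_1d(np.asarray(d, dtype=float))
    c = np.full(d.shape, float(sigma2))            # C(0) = sigma2
    pos = d > 0
    kd = kappa * d[pos]
    c[pos] = sigma2 * 2.0 ** (1.0 - nu) / gamma(nu) * kd**nu * kv(nu, kd)
    return c
```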
Fields with Matérn covariances are solutions to an SPDE (Whittle, 1954, 1963):

    (κ² − Δ)^{α/2} x(s) = W(s).

Here W(s) is white noise, Δ = Σ_i ∂²/∂s_i² is the Laplacian, and α = ν + d/2.
Constructing a GMRF from the SPDE
We want to construct a (GMRF) solution to the SPDE.
The basic idea
Construct the solution as a finite basis expansion
    x(s) = Σ_k ψ_k(s) x_k,

with a suitable distribution for the weights {x_k}.

A stochastic weak solution is given by weights {x_i} such that the joint distribution fulfills

    Σ_i ⟨ψ_j, (κ² − Δ)^{α/2} ψ_i⟩ x_i  =_D  ⟨ψ_j, W⟩   for all j,

where =_D denotes equality in distribution.
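A 1-D toy version of such a basis expansion, using piecewise linear "hat" functions (all names and numbers below are mine; the slides work in 2-D on triangulations):

```python
import numpy as np

def hat_basis(s, nodes):
    """Evaluate piecewise linear 'hat' functions psi_k at locations s
    on a uniform 1-D grid of nodes."""
    s = np.atleast_1d(np.asarray(s, dtype=float))
    h = nodes[1] - nodes[0]                       # uniform grid spacing
    psi = np.zeros((len(s), len(nodes)))
    for k, c in enumerate(nodes):
        psi[:, k] = np.clip(1.0 - np.abs(s - c) / h, 0.0, None)
    return psi

nodes = np.linspace(0.0, 1.0, 5)
weights = np.array([0.0, 1.0, 0.5, -0.5, 0.0])    # the x_k
x_at_s = hat_basis(0.375, nodes) @ weights        # x(s) = sum_k psi_k(s) x_k
```

At any s, at most two hat functions are non-zero, which is what later makes the matrices in the construction sparse.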
Possible basis functions
Several possible basis functions exist:
◮ Harmonic functions give a spectral representation.
◮ Eigenfunctions of the covariance give the Karhunen–Loève expansion.
◮ A piecewise linear basis gives (almost) a GMRF.
Construction of the precision matrix
We have the SPDE (weak formulation)

    Σ_i ⟨ψ_j, (κ² − Δ)^{α/2} ψ_i⟩ x_i  =_D  ⟨ψ_j, W⟩.

SPDE solution
The distribution of the weights is x ∼ N(0, Q^{-1}), with

    Q_{1,κ} = K,   Q_{2,κ} = K C^{-1} K,
    Q_{α,κ} = K C^{-1} Q_{α−2,κ} C^{-1} K,   α = 3, 4, …,

where

    C_{i,j} = ⟨ψ_i, ψ_j⟩,
    K_{i,j} = ⟨ψ_i, (κ² − Δ) ψ_j⟩ = κ² ⟨ψ_i, ψ_j⟩ − ⟨ψ_i, Δ ψ_j⟩ = κ² C_{i,j} + G_{i,j}.
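A sketch of this construction for integer α, assuming the mass matrix C and stiffness matrix G are given as SciPy sparse matrices (the function name and the parity handling of the recursion are mine):

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve

def spde_precision(C, G, kappa, alpha):
    """Q_{alpha,kappa} for integer alpha >= 1, with K = kappa^2 C + G:
    Q_1 = K, Q_2 = K C^{-1} K, Q_alpha = K C^{-1} Q_{alpha-2} C^{-1} K.
    With the full mass matrix C, C^{-1} is dense, so Q fills in for alpha >= 2."""
    K = (kappa**2) * C + G
    if alpha == 1:
        return K
    Ci = spsolve(C.tocsc(), sparse.identity(C.shape[0], format="csc"))
    Q = K @ Ci @ K if alpha % 2 == 0 else K       # start at Q_2 or Q_1
    start = 2 if alpha % 2 == 0 else 1
    for _ in range((alpha - start) // 2):
        Q = K @ Ci @ Q @ Ci @ K                   # the recursion step
    return Q
```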
Markov approximation
Since our basis functions have compact support, the C and K matrices are sparse. However, C^{-1} is dense, making Q_{2,κ} = K C^{-1} K dense.

GMRF approximation
To obtain sparse precision matrices we replace the C matrix with a diagonal matrix C̃ with elements

    C̃_{i,i} = ⟨ψ_i, 1⟩.

The resulting approximation error is small (Bolin & Lindgren, 2009, and Simpson, Lindgren & Rue, 2010, tech. reps.).
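Because the hat functions sum to one inside the domain, ⟨ψ_i, 1⟩ equals the i-th row sum of C, so the lumped matrix is a one-liner (function name mine; this is a sketch of the mass-lumping idea, not the authors' code):

```python
import numpy as np
from scipy import sparse

def lump(C):
    """Diagonal (lumped) mass matrix with entries C~_ii = <psi_i, 1>,
    computed as the row sums of C.  Its inverse is diagonal, so
    Q_2 = K C~^{-1} K stays sparse."""
    return sparse.diags(np.asarray(C.sum(axis=1)).ravel())
```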
Lattice on ℝ², step size h, regular triangulation

◮ Order α = 1: κ²h²·[1] plus the five-point stencil

         −1
     −1   4  −1        (≈ −D, a discrete negative Laplacian)
         −1

◮ Order α = 2: κ⁴h²·[1] + 2κ² times the stencil above, plus

              1
          2  −8   2
      1  −8  20  −8   1    (× 1/h², ≈ D²)
          2  −8   2
              1

◮ Not restricted to regular lattices; easy on triangulations.
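The α = 1 lattice precision can be assembled from Kronecker products of 1-D second-difference matrices, a standard construction (function and variable names are mine):

```python
import numpy as np
from scipy import sparse

def lattice_precision_a1(nx, ny, kappa, h=1.0):
    """alpha = 1 precision on an nx-by-ny lattice with step h:
    (kappa h)^2 I plus the five-point stencil (-1, -1, 4, -1, -1)."""
    D1 = sparse.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(nx, nx))
    D2 = sparse.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(ny, ny))
    # Kronecker sum of the two 1-D operators gives the 2-D five-point stencil
    lap = sparse.kron(sparse.identity(ny), D1) + sparse.kron(D2, sparse.identity(nx))
    return (kappa * h) ** 2 * sparse.identity(nx * ny) + lap
```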
Example of neighbours for α = 2
Observations
Point observations

    y(u) = x(u) + ε = Σ_k ψ_k(u) x_k + ε

Area observations

    y(M) = ∫_M x(u) du + ε = Σ_k x_k ∫_M ψ_k(u) du + ε
Conditional expectation
Introducing a matrix A with rows

    A_i· = [ψ_1(u_i) ⋯ ψ_N(u_i)]   or   A_i· = [∫_{M_i} ψ_1(u) du ⋯],

we can write the observation equation in matrix form as

    y = A x + ε,   x ∼ N(μ, Q^{-1}),   ε ∼ N(0, Q_ε^{-1}).

Kriging with GMRFs

    E(x|y) = μ + Q_{x|y}^{-1} A^T Q_ε (y − Aμ)
    V(x|y) = Q_{x|y}^{-1} = (Q + A^T Q_ε A)^{-1}
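These kriging formulas translate directly into sparse linear algebra; a sketch assuming SciPy sparse inputs (the function name and toy example are mine):

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve

def gmrf_krige(mu, Q, A, Qe, y):
    """Posterior mean of x given y = A x + eps, eps ~ N(0, Qe^{-1}):
    Q_{x|y} = Q + A^T Qe A,  E(x|y) = mu + Q_{x|y}^{-1} A^T Qe (y - A mu).
    All matrices stay sparse; no dense covariance matrix is ever formed."""
    Qpost = (Q + A.T @ Qe @ A).tocsc()
    return mu + spsolve(Qpost, A.T @ (Qe @ (y - A @ mu)))
```

The sparse solve is where the computational gain over dense kriging comes from: the cost is governed by the sparsity pattern of Q_{x|y}, not by n³.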
Example of an oscillating field
    (κ² exp(iπθ) − Δ)(x_1(u) + i x_2(u)) = W_1(u) + i W_2(u)

[Figure: two panels illustrating the oscillating field.]
Example of a non-stationary field
    (κ²(u) − Δ)(τ(u) x(u)) = W(u)
Features and extensions, past, current and future work
◮ Just as easy for general manifolds, e.g. the globe, as for ℝ^d.
◮ Oscillating fields
◮ Non-stationary versions, such as

    (κ²(u) − Δ)^{α/2} (τ(u) x(u)) = W(u)
    (κ²(u) + ∇·m(u) − ∇·M(u)∇)^{α/2} x(u) = τ(u) W(u)

  (Includes the Sampson & Guttorp deformation method.)
◮ Non-stationarity that depends on covariates.
◮ Multivariate versions
◮ Spatio-temporal versions
◮ Works well with INLA (current & next generation) but is also generally applicable.