spatial processes and statistical modelling

69
1 Spatial processes and statistical modelling Peter Green University of Bristol, UK IMS/ISBA, San Juan, 24 July 2003

Upload: quasim

Post on 10-Jan-2016

39 views

Category:

Documents


2 download

DESCRIPTION

Spatial processes and statistical modelling. Peter Green University of Bristol, UK IMS/ISBA, San Juan, 24 July 2003. . . . . . . . . Spatial indexing. Continuous space Discrete space lattice irregular - general graphs areally aggregated Point processes - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Spatial processes and statistical modelling

1

Spatial processes and statistical modelling

Peter GreenUniversity of Bristol, UK

IMS/ISBA, San Juan, 24 July 2003

Page 2: Spatial processes and statistical modelling

2

• Continuous space

• Discrete space– lattice– irregular - general graphs– areally aggregated

• Point processes– other object processes

Spatial indexing

Page 3: Spatial processes and statistical modelling

3

Purpose of overview

• setting the scene for 8 invited talks on spatial statistics

• particularly for specialists in the other 2 areas

Page 4: Spatial processes and statistical modelling

4

Perspective of overview

• someone interested in the development of methodology– for the analysis of spatially-indexed

data– probably Bayesian

• models and frameworks, not applications

• personal, selective, eclectic

Page 5: Spatial processes and statistical modelling

5

Genesis of spatial statistics• adaptation of time series ideas• ‘applied probability’ modelling• geostatistics• application-led

Page 6: Spatial processes and statistical modelling

6

Space vs. time

• apparently slight difference• profound implications for

mathematical formulation and computational tractability

Page 7: Spatial processes and statistical modelling

7

Requirements of particular application domains• agriculture (design)

• ecology (sparse point pattern, poor data?)

• environmetrics (space/time)

• climatology (huge physical models)

• epidemiology (multiple indexing)

• image analysis (huge size)

Page 8: Spatial processes and statistical modelling

8

Key themes

• conditional independence– graphical/hierarchical modelling

• aggregation– analysing dependence between differently

indexed data– opportunities and obstacles

• literal credibility of models• Bayes/non-Bayes distinction blurred

Page 9: Spatial processes and statistical modelling

9

A big subject….

Noel Cressie: “This may be the last time spatial

statistics is squeezed between two covers”

(Preface to Statistics for Spatial Data, 900pp., Wiley, 1991)

Page 10: Spatial processes and statistical modelling

10

Why build spatial dependence into a model?• No more reason to suppose

independence in spatially-indexed data than in a time-series

• However, substantive basis for form of spatial dependent sometimes slight - very often space is a surrogate for missing covariates that are correlated with location

Page 11: Spatial processes and statistical modelling

11

Discretely indexed data

Page 12: Spatial processes and statistical modelling

12

Modelling spatial dependence in discretely-indexed fields• Direct• Indirect

– Hidden Markov models– Hierarchical models

Page 13: Spatial processes and statistical modelling

13

Hierarchical models, using DAGs

Variables at several levels - allows modelling of complex systems, borrowing strength, etc.

Page 14: Spatial processes and statistical modelling

14

Modelling with undirected graphsDirected acyclic graphs are a natural

representation of the way we usually specify a statistical model - directionally:

• disease symptom• past future• parameters data ……whether or not causality is understood.But sometimes (e.g. spatial models) there is

no natural direction

Page 15: Spatial processes and statistical modelling

15

Conditional independence

In model specification, spatial context often rules out directional dependence (that would have been acceptable in time series context)

X0 X1 X2 X3 X4

Page 16: Spatial processes and statistical modelling

16

Conditional independence

In model specification, spatial context often rules out directional dependence

X20 X21 X22 X23 X24

X00 X01 X02 X03 X04

X10 X11 X12 X13 X14

Page 17: Spatial processes and statistical modelling

17

Conditional independence

In model specification, spatial context often rules out directional dependence

X20 X21 X22 X23 X24

X00 X01 X02 X03 X04

X10 X11 X12 X13 X14

Page 18: Spatial processes and statistical modelling

18

Directed acyclic graph

)|()( )(pa vVv

v xxpxp

in general:

for example:

a b

c

dp(a,b,c,d)=p(a)p(b)p(c|a,b)p(d|c)In the RHS, any distributions are legal, and uniquely define joint distribution

Page 19: Spatial processes and statistical modelling

19

Undirected (CI) graph

X20 X21 X22

X00 X01 X02

X10 X11 X12Absence of edge denotes conditional independence given all other variables

But now there are non-trivial constraints on conditional distributions

Regular lattice, irregular graph, areal data...

Page 20: Spatial processes and statistical modelling

20

Undirected (CI) graph

X20 X21 X22

X00 X01 X02

X10 X11 X12

C

CC XVXp ))(exp)(

iC

CCii XVXXp ))(exp)|(

)|()|( iiii XXpXXp

The Hammersley-Clifford theorem says essentially that the converse is also true - the only sure way to get a valid joint distribution is to use ()

()

clique

Page 21: Spatial processes and statistical modelling

21

Hammersley-Clifford

X20 X21 X22

X00 X01 X02

X10 X11 X12

C

CC XVXp )(exp)(

)|()|( iiii XXpXXp

A positive distribution p(X) is a Markov random field

if and only if it is a Gibbs distribution

- Sum over cliques C (complete subgraphs)

Page 22: Spatial processes and statistical modelling

22

Partition function

X20 X21 X22

X00 X01 X02

X10 X11 X12

C

CC XVXp )(exp)(

Almost always, the constant of proportionality in

is not available in tractable form: an obstacle to likelihood or Bayesian inference about parameters in the potential functions

Physicists call

the partition function

))( CC XV

X C

CC XVZ )(exp

Page 23: Spatial processes and statistical modelling

23

Gaussian Markov random fields: spatial autoregression

C

CC XVXp )(exp)(

)|()|( iiii XXpXXp is a multivariate Gaussian distribution, and

If VC(XC) is -ij(xi-xj)2/2 for C={i,j} and 0 otherwise, then

is the univariate Gaussian distribution

),(~| 22i

ijjijiii XNXX

ij

iji /12where

Page 24: Spatial processes and statistical modelling

24

Inverse of (co)variance matrix:

dependent case3210

2410

1121

0012

A B C D

non-zero

non-zero),|,cov( CADB

A B C D

A

B

C

D

Gaussian random fields

Page 25: Spatial processes and statistical modelling

25

Gaussian Markov random fields: spatial autoregressionDistinguish these conditional autoregression (CAR) models from the corresponding simultaneous autoregression (SAR) models

j

ijiji XX ~

(cf time series case).

The latter are less compatible with hierarchical model structures.

i i.i.d. normal

Page 26: Spatial processes and statistical modelling

26

Non-Gaussian Markov random fields

ji

ji xx

~

)cosh(log)1(exp

C

CC XVXp )(exp)(

Pairwise interaction random fields with less smooth realisations obtained by replacing squared differences by a term with smaller tails, e.g.

Page 27: Spatial processes and statistical modelling

27

Agricultural field trials

• strong cultural constraints• design, randomisation, cultivation effects• 1-D analysis in 2-d fields• relationships between IB designs, splines,

covariance models, spatial autoregression…

Page 28: Spatial processes and statistical modelling

28

Discrete Markov random fields

C

CC XVXp )(exp)(

Besag (1974) introduced various cases of

for discrete variables, e.g. auto-logistic (binary variables), auto-Poisson (local conditionals are Poisson), auto-binomial, etc.

Page 29: Spatial processes and statistical modelling

29

Auto-logistic model

C

CC XVXp )(exp)(

ji

jiiji

ii xxx~

exp

)|()|( iiii XXpXXp is Bernoulli(pi) with

ij

jijiii xpp )())1/(log(

- a very useful model for dependent binary variables (NB various parameterisations)

(Xi = 0 or 1)

Page 30: Spatial processes and statistical modelling

30

Statistical mechanics models

C

CC XVXp )(exp)(

The classic Ising model (for ferromagnetism) is the symmetric autologistic model on a square lattice in 2-D or 3-D. The Potts model is the generalisation to more than 2 ‘colours’

ji

jii

x xxIXpi

~

][exp)(

and of course you can usefully un-symmetrise this.

Page 31: Spatial processes and statistical modelling

31

Auto-Poisson model

C

CC XVXp )(exp)(

ji

jiiji

iii xxxx~

)!log(exp

)|()|( iiii XXpXXp is Poisson ))(exp(

ij

jiji x

For integrability, ij must be 0, so this onlymodels negative dependence: very limited use.

Page 32: Spatial processes and statistical modelling

32

Hierarchical models and

hidden Markov processes

Page 33: Spatial processes and statistical modelling

33

Chain graphs

• If both directed and undirected edges, but no directed loops:

• can rearrange to form global DAG with undirected edges within blocks

Page 34: Spatial processes and statistical modelling

34

Chain graphs

• If both directed and undirected edges, but no directed loops:

• can rearrange to form global DAG with undirected edges within blocks

• Hammersley-Clifford within blocks

Page 35: Spatial processes and statistical modelling

35

Hidden Markov random fields• We have a lot of freedom modelling

spatially-dependent continuously-distributed random fields on regular or irregular graphs

• But very little freedom with discretely distributed variables

use hidden random fields, continuous or discrete

• compatible with introducing covariates, etc.

Page 36: Spatial processes and statistical modelling

36

Hidden Markov models

z0 z1 z2 z3 z4

y1 y2 y3 y4

e.g. Hidden Markov chain

observed

hidden

Page 37: Spatial processes and statistical modelling

37

Hidden Markov random fields

Unobserved dependent field

Observed conditionally-independent discrete field

(a chain graph)

Page 38: Spatial processes and statistical modelling

38

Spatial epidemiology applications

independently, for each region i. Options:• CAR, CAR+white noise (BYM, 1989)• Direct modelling of ,e.g. SAR• Mixture/allocation/partition models:

• Covariates, e.g.:

)(~ iii ePoissonY casesexpected

cases

relative risk

)log( i))cov(log( i

izi eYEi

)(

)exp()( Tiizi xeYE

i

Page 39: Spatial processes and statistical modelling

39

Spatial epidemiology applications

Spatial contiguity is usually somewhat idealised

Page 40: Spatial processes and statistical modelling

40

CAR model for lip cancer data

observed counts

random spatial effects

covariate

regressioncoefficient

expected counts

(WinBUGS example)

Page 41: Spatial processes and statistical modelling

41

relativerisk

parameters

Example of an allocation modelRichardson & Green (JASA, 2002) used a

hidden Markov random field model for disease mapping

)(Poisson~ izi eyi

observedincidence

expectedincidencehidden

MRF

Page 42: Spatial processes and statistical modelling

42)(Poisson~ ii eY

Chain graph for disease mapping

e

Y

zk

)(Poisson~ izi eYi

based on Potts model

Page 43: Spatial processes and statistical modelling

43

Larynx cancer in females in France

SMRs

)|1( ypiz

ii Ey /

Page 44: Spatial processes and statistical modelling

44

Continuously indexed data

Page 45: Spatial processes and statistical modelling

45

Continuously indexed fields

The basic model is the Gaussian random field

with and

Translation-invariant or fully stationary (isotropic) cases have

and or ,resp.

2),( EssZ)())(( ssZE

),())(),(cov( tsctZsZ

))(( sZE )())(),(cov( tsctZsZ ||)(||))(),(cov( tsctZsZ

Page 46: Spatial processes and statistical modelling

46

Geostatistics and kriging

• There is a huge literature on a group of methodologies originally developed for geographical and geological data

• The main theme is prediction of (functionals of) a random field based on observations at a finite set of locations

Page 47: Spatial processes and statistical modelling

47

Ordinary kriging

• is a random process, we have observations and we wish to predict , e.g. a “block average”

• The usual basis is least-squares prediction, using a model for the mean and covariance of estimated from the data

2),( EssZ )(),...,(),( 21 nsZsZsZ

(.))(Zg

B

BdssZ ||/)(

)(sZ

Page 48: Spatial processes and statistical modelling

48

Ordinary kriging

The usual assumption is that is intrinsically stationary, i.e. has 2nd order structure

for all s is called the semi-variogramThis is somewhat weaker than full 2nd-

order stationarity

)(sZ

)(2))()(var( hsZhsZ

)(h

0))()(( sZhsZE

Page 49: Spatial processes and statistical modelling

49

Ordinary kriging

The optimal solution to the prediction problem in terms of the semivariogram follows from standard linear algebra arguments; an empirical estimate of the semivariogram is then plugged in.

Page 50: Spatial processes and statistical modelling

50

Variants of kriging

Kriging without intrinsic stationarity (& a model instead of empirical estimates)

Co-kriging (multivariate)Robust krigingUniversal kriging (kriging with regression)Disjunctive (nonlinear) krigingIndicator krigingConnections with splines

Page 51: Spatial processes and statistical modelling

51

Given data (si,xi,Yi), build model starting

with a Gaussian random fieldwith andSet whereand

Bayesian geostatistics (Diggle, Moyeed and Tawn, Appl Stat, 1998)

2),( EssZ0))(( sZE )())(),(cov( tsctZsZ

)|(~(.)| iii MyfZY (.))|( ZYEM ii T

iii xsZMh )()(

Z*

Z

Y

Y*

inference

prediction

X

X*

Z

Page 52: Spatial processes and statistical modelling

52

Point data

Page 53: Spatial processes and statistical modelling

53

Point processes

• (inhomogeneous) Poisson process• Neyman-Scott process• (log Gaussian) Cox process• Gibbs point process• Markov point process • Area-interaction process

Page 54: Spatial processes and statistical modelling

54

Analysis of spatial point pattern• Very strong early emphasis on

modelling clustering and repelling alternatives to homogeneous Poisson process (complete spatial randomness)

• May be different effects at different scales

• Interpretations in terms of mechanisms, e.g. in ecology, forestry

Page 55: Spatial processes and statistical modelling

55

Point process as parametrisation of spaceVoronoi tessellation of random point

process

Flexible modelling of surfaces: step functions, polynomials, …

Page 56: Spatial processes and statistical modelling

56

Rare disease point data• Regard locations of cases as Poisson

process with highly structured intensity process– Covariates– Spatial dependence

)()())(( dsesdsYE

As

dsesAYE )()())((

)(dsY number of cases in ds

Page 57: Spatial processes and statistical modelling

57

Models without covariates 1

Cox processwhere is a random field, e.g.

is Gaussian – ‘log Gaussian Cox process’ (Moller, Syversveen and Waagepetersen, 1998)

)(s)())(log( sZs

)()())(( dsesdsYE

Page 58: Spatial processes and statistical modelling

58

Models without covariates 2Smoothed Gamma random field

(Wolpert and Ickstadt, 1998)

where is a kernel function

and

is a sum of smoothed gamma-distributed impulses

-- example of shot-noise Cox process

)(),()( duusks

))(),((~)( ududu ),( usk

)(s

Page 59: Spatial processes and statistical modelling

59

DAG for Gamma RF model with covariates

k

Y

eX

measure

pointprocess

function

vector

key

Page 60: Spatial processes and statistical modelling

60

Models without covariates 3

Voronoi tessellation models(PJG,1995; Heikkinen and Arjas, 1998)

where are cells of Voronoi tessellation of an unobserved point process and

might be independent or dependent (e.g. CAR model for logs)

k

kk AsIs ][)( kA

k

Page 61: Spatial processes and statistical modelling

61

Introducing covariates

With covariates {Xj(s)} measured at case locations s, usual formulation is multiplicative

but occasionally additive

+ data-dependent constraints on parameters

j

jj sXss )(exp)()(

j

jj sXss )()()(

Page 62: Spatial processes and statistical modelling

62

Markov point processesRich families of non-Poisson point processes can be defined by

specifying their densities (Radon-Nikodym derivatives) w.r.t. unit-rate Poisson process, e.g. pairwise interaction models

(e.g. g(si,sj)=<1 if d(si,sj)<, 1 otherwise), and

area-interaction models

Note formal similarity to Gibbs lattice modelsMarginal distribution of #points usually not explicit

),(

)( ),()(ji

jisn ssgsp

||)()( Tssnsp +++

++

+

Page 63: Spatial processes and statistical modelling

63

Object processes

• Poisson processes of objects (lines, planes, flats, ….)

• Coloured triangulations….

Page 64: Spatial processes and statistical modelling

64

Aggregation

Page 65: Spatial processes and statistical modelling

65

Aggregation coherence and ecological bias• Commonly, covariates and responses are

spatially indexed differently, and for most models this poses coherence problems (linear Gaussian case the main exception)

• E.g. areally-aggregated response Yi=Y(Ai), and continuously indexed covariate X(s)

iAs

i dsesXsAYE )())(exp()())((

Page 66: Spatial processes and statistical modelling

66

Aggregation coherence and ecological bias

Even with uniform , this is not of form

where

(mis-specification) bias in estimation of .Need to know spatial variation in covariate

iAs

i dsesXsAYE )())(exp()())((

)()( dses

))(exp())(( iii AXAYE

iA

ii AdssXAX ||/)()(

Page 67: Spatial processes and statistical modelling

67

Aggregation coherence and ecological bias

Additive formulation

avoids this problem, as does the Ickstadt and Wolpert approach, to some extent

iAs

i dsesXsAYE )())()(())((

Page 68: Spatial processes and statistical modelling

68

Invited talks on spatial statistics• Brad Carlin: space & space-time CDF models, air

pollutant data• Jon Wakefield: ecological fallacy• Montserrat Fuentes: spatial design, air pollution• Doug Nychka: filtering for weather forecasting• Susie Bayarri: validating computer models• Arnoldo Frigessi: localisation of GSM phones• Rasmus Waagepetersen: Poisson-log Gaussian

processes• Adrian Baddeley: point process diagnostics

Fri 09:00

Sat 10:45

Fri 17:30

Fri 10:45

Page 69: Spatial processes and statistical modelling

69

Spatial processes and statistical modelling

[email protected]://www.stats.bris.ac.uk/~peter/PR

Peter GreenUniversity of Bristol, UKIMS/ISBA, San Juan, 24 July 2003