1
By
Jay M. Ver Hoef
National Marine Mammal Lab
7600 Sand Point Way, NE
Seattle, WA 98115
jay.verhoef@noaa.gov
An Introduction to Statistical
Models for Spatial Data in Ecology
2
What do Statisticians Do?
3
What is a Model?
Representational: What does it look like?
Functional: How does it work?
4
What are Spatial Statistics
Reality
“All models are wrong. We make tentative assumptions about the real world which we know are false but which we believe may be useful.”
- George Box 1976
Data Model (Representational): What does it look like? Points, Lines, Polygons
Probability Model (Lack of Fit): What don’t we understand? $\theta_n, \theta_s, \theta_r$
Scientific Model (Functional): How does it work?
Diagram labels: Modeling, Inference
Figure: scatter plot of y against x
5
Notation
Figure labels: Z(s), D
•D is the spatial domain or area of interest
•s contains the spatial coordinates
•Z is a value located at the spatial coordinates
6
Types of Spatial Data
{Z(s): s ∈ D}
•Geostatistical Data: Z random; D fixed, infinite, continuous
•Lattice Data: Z random; D fixed, finite, (ir)regular grid
•Point Pattern Data: Z ≡ 1; D random, finite
Figure labels: Z(s), D
7
Examples of Geostatistical Data
Figure panels: Ozone Predictions; Average Snow Depth
8
Examples of Lattice Data
Figure panels: Transformed SIDS rates; Plots in a Designed Experiment
Treatment numbers on the plots of the designed experiment:
2 3 1 2 4
5 5 1 4 4
5 1 3 4 4
3 1 5 2 3
5 2 3 1 2
9
Examples of Point Patterns
Figure panels (axes x, y): Arctic Caribou Calving Locations; Lansing Woods Hickory Locations
10
Statistical Models
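Linear Model
$\begin{pmatrix} z_{observed} \\ z_{unobserved} \end{pmatrix} = X\beta + \varepsilon, \quad \mathrm{var}(\varepsilon) = \Sigma(\theta)$
Labels: Prediction, Estimation
Nonlinear Model
$\begin{pmatrix} z_{observed} \\ z_{unobserved} \end{pmatrix} = g(X, \theta, \varepsilon)$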
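As a minimal sketch of the linear model on this slide, the Python code below builds a covariance matrix Σ(θ) from an exponential autocorrelation model and draws one realization of z = Xβ + ε. The exponential form, the parameter names `sill` and `range_`, and every numeric value are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def exponential_sigma(coords, sill, range_):
    """Covariance matrix Sigma(theta) with theta = (sill, range_) (assumed model)."""
    # Pairwise Euclidean distances between all locations
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return sill * np.exp(-d / range_)

def simulate_linear_model(X, beta, sigma, rng):
    """Draw z = X beta + eps with var(eps) = sigma."""
    # The Cholesky factor turns independent normals into correlated errors
    chol = np.linalg.cholesky(sigma)
    eps = chol @ rng.standard_normal(sigma.shape[0])
    return X @ beta + eps

rng = np.random.default_rng(0)
coords = rng.uniform(size=(6, 2))     # hypothetical locations
X = np.ones((6, 1))                   # intercept-only design matrix
sigma = exponential_sigma(coords, sill=2.0, range_=0.5)
z = simulate_linear_model(X, np.array([3.0]), sigma, rng)
print(z)
```

In practice only part of z is observed; the remaining part is what prediction targets.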
11
Five Meanings of Autocorrelation
•Description of data
•Property of a stochastic process
•Model for a stochastic process
•Statistic
•Function in Fourier analysis
12
Four Meanings of Autocorrelation
Independent Errors: $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$, $\mathrm{var}(\varepsilon) = \sigma^2 I$ (plot of y against x)
Autocorrelated Errors: $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$, $\mathrm{var}(\varepsilon) = \Sigma$ (plot of y against x)
Autocorrelation Statistic: Empirical Covariance against Distance
Autocorrelation Model: Autocovariance against Distance
Autocorrelated Data
Panel numbers: 1, 2, 3, 4
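To make the difference between an autocorrelation statistic and an autocorrelation model concrete, here is a small sketch for data on a regular one-dimensional transect. The function names, the exponential model, and the example numbers are assumptions chosen for illustration.

```python
import numpy as np

def empirical_autocovariance(z, max_lag):
    """Autocorrelation as a statistic: computed from the data at integer lags."""
    z = np.asarray(z, dtype=float)
    zc = z - z.mean()
    n = len(z)
    # Average product of centred values that are h steps apart
    return np.array([np.sum(zc[:n - h] * zc[h:]) / n for h in range(max_lag + 1)])

def exponential_autocovariance(h, sill, range_):
    """Autocorrelation as a model: a function of distance with parameters."""
    return sill * np.exp(-np.asarray(h, dtype=float) / range_)

z = [1.0, 2.0, 3.0, 4.0]                       # hypothetical transect data
print(empirical_autocovariance(z, max_lag=2))
print(exponential_autocovariance([0, 1, 2], sill=1.25, range_=1.0))
```

The statistic changes with every data set, while the model is a smooth curve whose parameters are estimated from the data.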
13
Autocorrelation Models
14
Autocorrelation
15
Try it! F9
16
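Why Spatial Statistics?
Perfectly dependent data (axis 0 to 5): $Z(1) \sim N(\mu, 1)$, $Z(i > 1) = Z(1)$
$\Sigma = \begin{pmatrix} 1 & 1 & \cdots & 1 \\ 1 & 1 & \cdots & 1 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & \cdots & 1 \end{pmatrix}$
Independent data (axis 0 to 5): $Z(i) \sim N(\mu, 1)$, i.i.d.
$\Sigma = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}$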
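The two extreme cases on this slide can be compared numerically. The sketch below computes the variance of the sample mean, w'Σw with equal weights 1/n, for a covariance matrix of all ones (every value equals Z(1)) and for the identity matrix (i.i.d. values). The choice n = 5 is only for illustration.

```python
import numpy as np

def variance_of_mean(sigma):
    """Variance of the sample mean for data with covariance matrix sigma."""
    n = sigma.shape[0]
    w = np.full(n, 1.0 / n)          # equal weights of the sample mean
    return float(w @ sigma @ w)

n = 5
sigma_dependent = np.ones((n, n))    # Z(i > 1) = Z(1): all correlations equal 1
sigma_independent = np.eye(n)        # i.i.d. case

print(variance_of_mean(sigma_dependent))    # stays near 1 however large n is
print(variance_of_mean(sigma_independent))  # shrinks like 1/n
```

With perfect dependence, five observations carry no more information about µ than one; with independence the variance drops to 1/n. Real spatial data usually fall between the two.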
17
Fits vs. Prediction
Panel titles (plots of y against x): Independent Errors; Autocorrelated Errors
Panel labels: Fit, Prediction; Fit = Prediction
(Variances different in both cases)
$\begin{pmatrix} z_{observed} \\ z_{unobserved} \end{pmatrix} = X\beta + \varepsilon, \quad \mathrm{var}(\varepsilon) = \Sigma(\theta)$
18
Estimation and Prediction
$\begin{pmatrix} z_{observed} \\ z_{unobserved} \end{pmatrix} = X\beta + \varepsilon, \quad \mathrm{var}(\varepsilon) = \Sigma(\theta)$
Labels: Prediction, Estimation
•Mapping
•Sampling
•Regression
•Designed Experiments
19
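Why Do We Need Autocorrelation Models?
$\Sigma = \begin{pmatrix} \sigma_{11} & \sigma_{12} & \cdots & \sigma_{1n} \\ \sigma_{21} & \sigma_{22} & \cdots & \sigma_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{n1} & \sigma_{n2} & \cdots & \sigma_{nn} \end{pmatrix}$
$\begin{pmatrix} z_{observed} \\ z_{unobserved} \end{pmatrix} = X\beta + \varepsilon, \quad \mathrm{var}(\varepsilon) = \Sigma(\theta)$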
20
Estimation?!
•Weighted Least Squares
•Generalized Least Squares
•Maximum Likelihood
•Restricted Maximum Likelihood
•Bayes (Markov Chain Monte Carlo, MCMC)
$\begin{pmatrix} z_{observed} \\ z_{unobserved} \end{pmatrix} = X\beta + \varepsilon, \quad \mathrm{var}(\varepsilon) = \Sigma(\theta)$
Leave it to the Statisticians!
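For readers who want to see one of the listed methods in action, here is a minimal sketch of generalized least squares for β when Σ is treated as known. In practice θ, and hence Σ, would first be estimated (for example by REML); the function name and the example data are assumptions.

```python
import numpy as np

def gls_estimate(X, z, sigma):
    """GLS: beta_hat = (X' Sigma^-1 X)^-1 X' Sigma^-1 z, with its covariance."""
    # Solve linear systems instead of forming Sigma^-1 explicitly
    sigma_inv_X = np.linalg.solve(sigma, X)
    sigma_inv_z = np.linalg.solve(sigma, z)
    xtx = X.T @ sigma_inv_X
    beta_hat = np.linalg.solve(xtx, X.T @ sigma_inv_z)
    cov_beta = np.linalg.inv(xtx)
    return beta_hat, cov_beta

rng = np.random.default_rng(1)
coords = rng.uniform(size=(8, 2))                       # hypothetical locations
d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
sigma = np.exp(-d)                                      # assumed exponential model
X = np.column_stack([np.ones(8), coords[:, 0]])
z = X @ np.array([2.0, -1.0]) + np.linalg.cholesky(sigma) @ rng.standard_normal(8)

beta_hat, cov_beta = gls_estimate(X, z, sigma)
print(beta_hat, np.sqrt(np.diag(cov_beta)))
```

When Σ = σ²I, this estimator reduces to ordinary least squares.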
21
Pitfalls: Valid Autocorrelation Models
22
Stream Network Models
Figure: SO4 Concentration
23
Pitfalls: Valid Models for Stream Networks
Not to scale. All lengths = 1
24
Prediction - Mapping
Figure panels: Prediction Map; Standard Error Map
$\begin{pmatrix} z_{observed} \\ z_{unobserved} \end{pmatrix} = X\beta + \varepsilon, \quad \mathrm{var}(\varepsilon) = \Sigma(\theta)$
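A prediction map and its standard error map can be produced by predicting the unobserved values location by location. The sketch below uses the standard universal kriging predictor under an assumed exponential covariance with known parameters; the function names, parameters, and data are hypothetical, and the slides do not say which software or covariance model produced the maps.

```python
import numpy as np

def exp_cov(a, b, sill=1.0, range_=1.0):
    """Assumed exponential covariance between two sets of locations."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return sill * np.exp(-d / range_)

def krige(coords_obs, z_obs, X_obs, coords_new, X_new, sill=1.0, range_=1.0):
    """Universal kriging predictions and standard errors at new locations."""
    sigma = exp_cov(coords_obs, coords_obs, sill, range_)
    c = exp_cov(coords_obs, coords_new, sill, range_)     # n x m cross-covariances
    si_X = np.linalg.solve(sigma, X_obs)
    si_c = np.linalg.solve(sigma, c)
    xtx = X_obs.T @ si_X
    beta_hat = np.linalg.solve(xtx, si_X.T @ z_obs)       # GLS estimate of beta
    pred = X_new @ beta_hat + si_c.T @ (z_obs - X_obs @ beta_hat)
    d = X_new - si_c.T @ X_obs                            # accounts for estimating beta
    var = (sill - np.sum(c * si_c, axis=0)
           + np.sum(d * np.linalg.solve(xtx, d.T).T, axis=1))
    return pred, np.sqrt(np.maximum(var, 0.0))

rng = np.random.default_rng(2)
coords = rng.uniform(size=(10, 2))            # hypothetical sample locations
z = rng.normal(size=10)                       # hypothetical observed values
grid = np.array([[0.25, 0.25], [0.75, 0.75]]) # two cells of a prediction grid
pred, se = krige(coords, z, np.ones((10, 1)), grid, np.ones((2, 1)))
print(pred, se)
```

Evaluating `krige` over a fine grid gives the two maps: `pred` for the prediction map and `se` for the standard error map.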
25
Mapping - Quantiles
26
Quantile Maps
27
Probability Maps
Figure labels:
Prediction = 0.080, Prob = 0.123
Prediction = 0.091, Prob = 0.080
Prediction = 0.114, Prob = 0.624
Prediction = 0.114, Prob = 0.582
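One common way to turn a prediction and its standard error into a probability map is to assume a normal predictive distribution and compute the probability of exceeding a threshold. The sketch below does that; the normality assumption, the threshold, and the standard errors are illustrative and are not the values behind the figure.

```python
import math

def exceedance_probability(prediction, std_error, threshold):
    """P(Z > threshold) assuming Z ~ N(prediction, std_error^2)."""
    z = (threshold - prediction) / std_error
    # 1 - Phi(z), written with the error function from the standard library
    return 0.5 * (1.0 - math.erf(z / math.sqrt(2.0)))

threshold = 0.12                      # hypothetical threshold
for se in (0.01, 0.05):               # hypothetical standard errors
    print(se, exceedance_probability(0.114, se, threshold))
```

Two locations with the same prediction can have different exceedance probabilities when their standard errors differ.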
28
Spatial Regression
•Whiptail Lizard
•148 locations in Southern California
•Measured the average number caught in traps over 80 – 90 trapping events in one year
•Data log-transformed, one outlier removed
29
Whiptail Lizard Example
•There were 37 explanatory variables in 5 broad categories: vegetation layers, vegetation types, topographic position, soil types, and ant abundance
30
California Lizard Data
Map legend:
-4.9053 to -4.4023
-4.4023 to -3.8992
-3.8992 to -3.3962
-3.3962 to -2.8931
-2.8931 to -2.3901
-2.3901 to -1.887
-1.887 to -1.3839
-1.3839 to -0.8809
-0.8809 to -0.3778
-0.3778 to 0.1251
Figure panels: Boxplot of raw data; Histogram of raw data (Frequency against log(Hyper) abundance)
$\begin{pmatrix} z_{observed} \\ z_{unobserved} \end{pmatrix} = X\beta + \varepsilon, \quad \mathrm{var}(\varepsilon) = \Sigma(\theta)$
•Ant abundance
•Percent sandy soil
•Spherical autocorrelation
•Isotropic
Estimation: REML followed by GLS
31
Exploratory Data Analysis on Residuals
Studentized Residual: $\frac{z_i - \hat{\mu}_i}{\sqrt{MSE\,(1 - h_{ii})}}$, with $H = \Sigma^{-1/2} X (X' \Sigma^{-1} X)^{-1} X' \Sigma^{-1/2}$
Figure panels: Boxplot of studentized residuals; Histogram of studentized residuals (Frequency against log(Hyper) abundance)
32
Model Diagnostics
•Likelihood Distance
•Cook’s D and Leverage
•Covariance Trace
Based on deleting observations
Mostly for outlier detection
$z_{observed,-1} = X\beta + \varepsilon, \quad \mathrm{var}(\varepsilon) = \Sigma(\theta)$
Figure: Histogram of Cook’s D (Frequency)
33
Remove Outlier and Re-fit
Lizard Data
Figure panels: Histogram of studentized residuals (Frequency against log(Hyper) abundance); Boxplot of studentized residuals; Histogram of Cook’s D (Frequency)
34
Cross-validation
Figure: Cross-validation Scatter Plot (Predicted Value against Measured Value)
35
Directional Autocorrelation
5 Parameters
Isotropy vs. Anisotropy
36
Cross-validation
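Standardized Bias: $\frac{1}{n} \sum_{i=1}^{n} \frac{\hat{Z}_i - z_i}{\sqrt{\widehat{\mathrm{var}}(\hat{Z}_i)}}$
Root Mean Squared Prediction Error: $\sqrt{\frac{1}{n} \sum_{i=1}^{n} (\hat{Z}_i - z_i)^2}$
Prediction Interval Coverage: $\frac{1}{n} \sum_{i=1}^{n} I\left( \frac{|\hat{Z}_i - z_i|}{\sqrt{\widehat{\mathrm{var}}(\hat{Z}_i)}} < 1.96 \right)$
$\hat{Z}_i$ is predicted from $z_{observed,-i} = X\beta + \varepsilon, \quad \mathrm{var}(\varepsilon) = \Sigma(\theta)$
Models compared:
•Spherical autocorrelation, Isotropic
•Spherical autocorrelation, Anisotropic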
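The three cross-validation summaries can be computed in a few lines once the leave-one-out predictions and their estimated prediction variances are available. The sketch below takes those as given inputs; the function name and the example numbers are hypothetical, and the standardization by the square root of the estimated variance is an assumption about the slide's formulas.

```python
import numpy as np

def cross_validation_summary(z, z_hat, var_hat):
    """Standardized bias, RMSPE and 95% prediction interval coverage."""
    z = np.asarray(z, dtype=float)
    z_hat = np.asarray(z_hat, dtype=float)
    se = np.sqrt(np.asarray(var_hat, dtype=float))
    err = z_hat - z                              # leave-one-out prediction errors
    return {
        "standardized_bias": float(np.mean(err / se)),
        "rmspe": float(np.sqrt(np.mean(err ** 2))),
        "coverage": float(np.mean(np.abs(err) / se < 1.96)),
    }

# Hypothetical leave-one-out results for three locations
print(cross_validation_summary(z=[1.0, 2.0, 3.0],
                               z_hat=[1.5, 2.0, 2.5],
                               var_hat=[0.25, 0.25, 0.25]))
```

A well-calibrated model should give a standardized bias near 0 and coverage near 0.95; RMSPE is useful for comparing candidate models such as the isotropic and anisotropic fits.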
37
Model Selection
Choose the model with the minimum of these:
•AIC
•AICc
•BIC
•etc.!
Each has the form -2*loglikelihood + (Penalty for number of parameters)
Be careful! Some software uses 2*loglikelihood – (Penalty for number of parameters), in which case you choose the maximum.
Can also use RMSPE and other criteria. Why not?
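As a small worked example of the criteria above, the sketch below uses the usual textbook penalties: 2k for AIC, 2k + 2k(k+1)/(n-k-1) for AICc, and k log(n) for BIC, where k is the number of parameters and n the number of observations. The log-likelihood values and parameter counts of the two candidate models are made up for illustration.

```python
import math

def aic(loglik, k):
    return -2.0 * loglik + 2.0 * k

def aicc(loglik, k, n):
    # Small-sample correction added to AIC
    return aic(loglik, k) + 2.0 * k * (k + 1) / (n - k - 1)

def bic(loglik, k, n):
    return -2.0 * loglik + k * math.log(n)

# Hypothetical candidates: (log-likelihood, number of parameters)
candidates = {"isotropic": (-100.0, 3), "anisotropic": (-98.5, 5)}
n = 50
for name, (ll, k) in candidates.items():
    print(name, aic(ll, k), aicc(ll, k, n), bic(ll, k, n))

best = min(candidates, key=lambda m: aic(*candidates[m]))
print("minimum AIC:", best)
```

With these sign conventions smaller is better for all three criteria.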
38
Model Selection
39
Final Fitted Model
$\begin{pmatrix} z_{observed} \\ z_{unobserved} \end{pmatrix} = X\beta + \varepsilon, \quad \mathrm{var}(\varepsilon) = \Sigma(\theta)$
40
Spatial Regression References
•Ver Hoef, J.M. 1993. Universal kriging for ecological data. Pages 447–453 in Goodchild, M.F., Parks, B., and Steyaert, L.T. (eds.) Environmental Modeling with GIS, Oxford University Press, 488 p.
•Ver Hoef, J.M., Cressie, N., Fisher, R.N., and Case, T.J. 2001. Uncertainty and spatial linear models for ecological data. Pages 214–237 in Hunsaker, C.T., Goodchild, M.F., Friedl, M.A., and Case, T.J. (eds.), Spatial Uncertainty for Ecology: Implications for Remote Sensing and GIS Applications, Springer-Verlag.
•Maier, J.A.K., Ver Hoef, J.M., McGuire, A.D., Bowyer, R.T., Saperstein, L. and Maier, H.A. 2006. Distribution and density of moose in relation to landscape characteristics: Effects of scale. In press, Canadian Journal of Forest Research.
41
Glades in Ozarks
42
Glades in Ozarks
43
Designed Experiment
Treatment  Effect
1          0
2          −3
3          −5
4          +6
5          +6
Labels: Add Trts; Estimate
Treatment numbers on the plots:
2 3 1 2 4
5 5 1 4 4
5 1 3 4 4
3 1 5 2 3
5 2 3 1 2
44
Estimation and Prediction
$\begin{pmatrix} z_{observed} \\ z_{unobserved} \end{pmatrix} = X\beta + \varepsilon, \quad \mathrm{var}(\varepsilon) = \Sigma(\theta)$
Labels: Prediction, Estimation
•Mapping
•Sampling
•Regression
•Designed Experiments
45
Linear Models
$z = X\beta + \varepsilon$ or $Z(s_i) = \tau_j + \varepsilon(s_i)$
$E(\varepsilon) = 0$
Covariance Models
Independence Models: $\mathrm{var}(\varepsilon) = \sigma^2 I$
Geostatistical Models: $\mathrm{var}(\varepsilon) = \Sigma$ (Exponential, Spherical, etc.)
Lattice Models: $\mathrm{var}(\varepsilon) = \Sigma$ (CAR, SAR, etc.)
46
Estimating Treatment Effects
Treatment numbers on the plots:
2 3 1 2 4
5 5 1 4 4
5 1 3 4 4
3 1 5 2 3
5 2 3 1 2
47
Designed Experiment
Treatment  Effect
1          0
2          −3
3          −5
4          +6
5          +6
Labels: Experiment; 1600 times; Estimate
Treatment numbers on the plots:
2 3 1 2 4
5 5 1 4 4
5 1 3 4 4
3 1 5 2 3
5 2 3 1 2
48
Designed Experiment Results
49
Designed Experiments References
Figure: Plots in a Designed Experiment, with treatment numbers:
2 3 1 2 4
5 5 1 4 4
5 1 3 4 4
3 1 5 2 3
5 2 3 1 2
•Ver Hoef, J.M. and Cressie, N. 2001. Spatial statistics: Analysis of field experiments. In Scheiner, S.M. and Gurevitch, J. (eds.), Design and Analysis of Ecological Experiments, Second Edition, Oxford University Press, p. 289-307.
•Lenart, E.A., Bowyer, R.T., Ver Hoef, J., and Ruess, R.W. 2002. Climate change and caribou: effects of summer weather on forage. Canadian Journal of Zoology 80: 664 – 678.
50
Spatial Sampling
Moose Survey South of Fairbanks
~ 4500 mi²
51
Sources of Randomness
Figure panels (value against x, with x from 0.0 to 1.0): Fixed Pattern, Random Samples; Random Pattern, Fixed Samples
52
Source of Randomness
Fixed Pattern, Random Samples:
$z(x) = \beta_{s1} \sin(\alpha_{s1} x) + \beta_{s2} \sin(\alpha_{s2} x) + \beta_{c1} \cos(\alpha_{c1} x) + \beta_{c2} \cos(\alpha_{c2} x) + (\exp(\alpha_e x) - 1)$
Random Pattern, Fixed Samples:
$z(x_i) = \rho z(x_{i-1}) + \varepsilon(x_i); \quad \varepsilon(x_i) \sim N(0, \sigma^2)$
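The two sources of randomness can be simulated side by side. In the sketch below the fixed pattern is taken to be a sum of sine, cosine and exponential terms and the random pattern a first-order autoregressive process, as on this slide; all coefficient values, the stationary starting value of the autoregression, and the sample sizes are assumptions made for illustration.

```python
import numpy as np

def fixed_pattern(x, beta_s=(1.0, 0.5), alpha_s=(2.0, 5.0),
                  beta_c=(0.8, 0.3), alpha_c=(3.0, 7.0), alpha_e=1.0):
    """Deterministic surface: randomness can only come from where we sample."""
    x = np.asarray(x, dtype=float)
    return (beta_s[0] * np.sin(alpha_s[0] * x) + beta_s[1] * np.sin(alpha_s[1] * x)
            + beta_c[0] * np.cos(alpha_c[0] * x) + beta_c[1] * np.cos(alpha_c[1] * x)
            + (np.exp(alpha_e * x) - 1.0))

def random_pattern(n, rho, sigma, rng):
    """AR(1) realization z_i = rho * z_{i-1} + eps_i at fixed, ordered locations."""
    z = np.empty(n)
    z[0] = rng.normal(0.0, sigma / np.sqrt(1.0 - rho ** 2))  # assumed stationary start
    for i in range(1, n):
        z[i] = rho * z[i - 1] + rng.normal(0.0, sigma)
    return z

rng = np.random.default_rng(4)
# Fixed pattern, random samples: the sample locations change between draws
for _ in range(2):
    x_sample = rng.uniform(0.0, 1.0, size=10)
    print("design-based mean:", fixed_pattern(x_sample).mean())
# Random pattern, fixed samples: the surface itself changes between draws
for _ in range(2):
    print("model-based mean:", random_pattern(10, rho=0.8, sigma=1.0, rng=rng).mean())
```

Classical sampling theory relies on the first kind of randomness and geostatistics on the second.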
53
Sampling and Geostatistics
54
Infinite Population Parameters
Total: $\tau = \int_A z(s)\,ds$
Mean: $\alpha = \int_A z(s)\,ds \,/\, |A|$
$\begin{pmatrix} z_{observed} \\ z_{unobserved} \end{pmatrix} = X\beta + \varepsilon, \quad \mathrm{var}(\varepsilon) = \Sigma(\theta)$
55
Simulation Study
Fixed Pattern, Random Samples
Figure: four panels of y against x (both axes from 0.1 to 0.9)
56
Simulation Results
57
Sampling for Finite Populations
58
Finite Population Parameters
$\tau = \sum_{i=1}^{N} z(s_i)$
$\alpha = (1/N) \sum_{i=1}^{N} z(s_i)$
$\begin{pmatrix} z_{observed} \\ z_{unobserved} \end{pmatrix} = X\beta + \varepsilon, \quad \mathrm{var}(\varepsilon) = \Sigma(\theta)$
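A minimal sketch of predicting the finite population total τ from a sample follows: the observed values are summed as they are, and the unobserved values are replaced by their universal kriging predictions. The covariance matrix is treated as known, and the function name and example data are hypothetical. With independent errors and an intercept-only model the result matches the classical simple random sampling formulas, which the example uses as a check.

```python
import numpy as np

def predict_total(z_obs, obs_idx, X, sigma):
    """Predict sum_i z(s_i) over all N units; return (total, standard error).

    z_obs must be in the same order as obs_idx; X is N x p, sigma is N x N.
    """
    N = X.shape[0]
    s = np.asarray(obs_idx)
    u = np.setdiff1d(np.arange(N), s)            # unobserved units
    z_obs = np.asarray(z_obs, dtype=float)
    S_ss = sigma[np.ix_(s, s)]
    S_us = sigma[np.ix_(u, s)]
    S_uu = sigma[np.ix_(u, u)]
    si_X = np.linalg.solve(S_ss, X[s])
    xtx = X[s].T @ si_X
    beta_hat = np.linalg.solve(xtx, si_X.T @ z_obs)
    resid = z_obs - X[s] @ beta_hat
    z_u_hat = X[u] @ beta_hat + S_us @ np.linalg.solve(S_ss, resid)
    D = X[u] - S_us @ si_X                       # accounts for estimating beta
    pred_cov = S_uu - S_us @ np.linalg.solve(S_ss, S_us.T) + D @ np.linalg.solve(xtx, D.T)
    return z_obs.sum() + z_u_hat.sum(), float(np.sqrt(pred_cov.sum()))

# Check against simple random sampling: N = 10, n = 4, independent errors
N, sigma2 = 10, 2.0
total, se = predict_total([1.0, 2.0, 3.0, 4.0], [0, 3, 5, 8],
                          np.ones((N, 1)), sigma2 * np.eye(N))
print(total, se)   # N times the sample mean, and sqrt(sigma2 * N * (N - n) / n)
```

Dividing the predicted total and its standard error by N gives the corresponding results for the mean α. Restricting the sum to a subset of units is one way to obtain small area estimates.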
59
Simulation Study
Fixed Population, N = 200
Random Sample, n = 100
Number of plant species in 70 x 70 cm plots
Figure: map of the species count in each plot (axes x, y)
60
Simulation Results
61
Simulation Study
Figure: two panels of y against x (both axes from 0 to 15)
62
Simulation Results
63
Real Example – Moose Survey
Map label: Fairbanks
64
Conducting the Survey
9 “Highs,” 52 Sampled
338 “Lows,” 34 Sampled
65
Conducting the Survey
66
Modeling Covariance
Figure panels (Semivariogram against Distance (km)): High Stratum; Low Stratum
67
Results
Map label: Fairbanks
Total Area
SRS: $\hat{\tau} = 11535$, $se(\hat{\tau}) = 985$
FPBK: $\hat{\tau} = 11327$, $se(\hat{\tau}) = 978$
Small Area
FPBK: $\hat{\tau} = 1437$, $se(\hat{\tau}) = 153$
SRS (13 H, 4 L): $\hat{\tau} = 1535$, $se(\hat{\tau}) = 227$
68
Summary
•Geostatistical Methods for Sampling are often more precise
•Geostatistical Methods for Sampling allow small area estimation
•Geostatistical Methods for Sampling do not require randomized designs
•Geostatistical Methods require modeling
69
Geostatistics and Sampling References
•Ver Hoef, J.M. 2001. Predicting finite populations from spatially correlated data. 2000 Proceedings of the Section on Statistics and the Environment of the American Statistical Association, pgs. 93-98.
•Ver Hoef, J.M. 2002. Sampling and geostatistics for spatial data. Ecoscience 9: 152 – 161.
•Ver Hoef, J.M. 2006. Spatial Methods for Plot-based Sampling of Wildlife Populations. In press, Environmental and Ecological Statistics
70
Good Science is a Team Effort
Reality
“All models are wrong. We make tentative assumptions about the real world which we know are false but which we believe may be useful.”
- George Box 1976
Data Model (Representational): What does it look like? Points, Lines, Polygons
Probability Model (Lack of Fit): What don’t we understand? $\theta_n, \theta_s, \theta_r$
Scientific Model (Functional): How does it work?
Diagram labels: Modeling, Inference
Figure: scatter plot of y against x