an introduction to kriging using sas
DESCRIPTION
Presentation at the 19th New England Statistics Symposium, April, 2005.TRANSCRIPT
19th New England Statistics Symposium
April 23rd, 2005
Prof. Roger Bilisoly
Department Mathematical Sciences
Central Connecticut State University
An Introduction to Kriging
Using SAS
1. Example of Spatial Data
• Magnetometer readings to
detect metal
• Goal here is to find
unexploded ordnance (UXO)
• In general, need spatial data
that is spatially correlated
2. In This Talk We Consider
Mining Data
Black = lowest value
White = highest value
(Think of burning embers)
The goal is to find high
concentrations of ore,
which are then mined.
Some Background on
Spatial Modeling
•Spatial model: Z(s), s = spatial location, and
Z(s) is a random variable for each s.
• If Z(s) has not been measured at a location s,
then its value is uncertain. We model this
uncertainty by a probability distribution.
. Z(s1) . Z(s2)
x Z(s)
. Z(s3)
Kriging = Spatial Interpolation
• Kriging is BLUE:
– B for Best meaning minimum variance
– L for Linear
– U for Unbiased
– E for Estimate
Form of
Kriging Estimate:
. Z(s1) . Z(s2)
x Z(s)
. Z(s3)
3
1
)()(i
ii sZsZ
Spatial Autocorrelation
All points uncorrelated here:
No hope for spatial prediction. Points close together tend
to be similar which allows
spatial prediction.
Variogram = Sill - Covariance
• Goal: We want to know the correlation between Z(s)
and Z(s+h) as a function of the distance |h|.
Sill
Range .
. . .
. .
Find pairs of data values
separated by the same distance
Empirical Variogram
Assume that the variogram is a function of s1 – s2 = h
and E(Z(s)) is constant.
γ(s1 – s2) = γ(h) = Var(Z(s1) – Z(s2))
2))()((
pairs#
1ji
hss
sZsZji
.
. . .
. .
Valid Models
• Any spatial prediction must have non-negative
variance:
N
i
N
j
jiji
N
i
ii ssCovsZVar1 11
0),())((
This restricts our choice of variogram models γ(h)
Example of Kriging with SAS
7. Goal: Strike it Rich!
Walker data set from
Isaaks and Srivastava(1989)
Sampling design: two stage
adaptive sampling
Measured:
Ore concentration
Goal: Find high concentrations of
ore so it can be profitably mined.
8. Map of Ore Concentration
Produced by
PROC G3GRID
PROC GCONTOUR
Using PATTERN
SAS Code for Map
axis1 order=(0 to 250 by 25) width=3 minor=(n=4) label=('Easting');
axis2 order=(0 to 300 by 25) width=3 minor=(n=4) label=('Northing');
symbol1 interpol=none line=1 value=plus color=black;
proc g3grid data=walker out=walker2;
grid y*x=v / naxis1=25 naxis2=30;
run;
pattern1 value=solid color=CX000000;
pattern2 value=solid color=CX180000;
pattern3 value=solid color=CX300000;
…
pattern31 value=solid color=CXFFFFFF;
proc gcontour data=walker2;
title 'Smoothed Data Plot';
plot y*x=v/haxis=axis1 vaxis=axis2 levels=0 to 1500 by 50 pattern nolegend;
run;
3. Read in the Data
data walker;
infile "M:\My Work\Spatial Stat Talks\walker data set.txt";
drop title nvar;
length title $ 50;
if _n_ = 1 then do;
input title $ &; put title;
input nvar;
input #(nvar+2);
end; else do;
input id x y v u t;
output;
end;
run;
x and y = spatial coordinates
v = variable analyzed here
4. Variogram Estimation
proc variogram data=walker outvar=outv;
compute lagdistance=5 maxlag=20;
coordinates xc=x yc=y;
var v; run;
This creates the empirical variogram,
which quantifies spatial autocorrelation.
Need spatial
spacing info.
4. PROC NLIN to Fit
a Parametric Variogram
proc nlin data=outv;
parms c0=20000 c1=80000 a1=80;
model variog = (c0 + c1*(1.5*(distance/a1) - 0.5*(distance/a1)**3))*(distance < a1) + (c0 + c1)*(distance >= a1);
der.c0 = 1;
der.c1 = 1.5*(distance/a1) - 0.5*(distance/a1)**3;
der.a1 = c1*(-1.5*(distance/a1**2) + 1.5*(distance**3/a1**4));
output out=fitted predicted=pvari
parms=c0 c1 a1;
run;
Need to
specify a
parametric
model
Need to give
derivative
information.
4. PROC KRIGE2D
Does the Kriging
proc krige2d data=walker outest=est;
predict var=v radius=200;
model form=spherical
range=&a1
scale=&c1
nugget=&c0;
coord xc=x yc=y;
grid x=0 to 250 by 2 y=0 to 300 by 2;
run;
This is a specific
parametric shape.
This specifies locations to make spatial predictions.
Isotropic Variogram
Model for h < a1:
c0 + c1*(1.5*h/a1 - 0.5*(h/a1)3)
Best fit by PROC NLIN:
c0 = 29474
c1 = 64313
a1 = 43.39
6. Kriging vs. Map
Note that the kriging predictions are smoother than the data.
7. Why Kriging?
•Both mapping and kriging interpolate
•Mapping is quicker
BUT
•Kriging comes with an error estimate
6. Kriging Estimate &
Error Estimate
5. Conclusions
1. Kriging is based on modeling, which makes
it harder to do, but it also comes with an
error estimate
2. SAS has kriging built into SAS/STAT via
PROC NLIN, PROC VARIOGRAM and
PROC KRIGE2D
3. SAS/GRAPH can be used for mapping
4. Note that hypothesis testing is not
prominent for these PROCs.