7/28/2019 Chapter Kri
Chapter 5
Objective Mapping and Kriging
Most of you are familiar with topographic contour maps. Those squiggly lines represent locations
on the map of equal elevation. Many of you have probably seen a similar mode of presentation
for scientific data, where each contour line likewise joins points of equal value. What many of you are probably not
familiar with is the mathematics that lies behind the creation of those maps and their uses.
5.1 Contouring and gridding concepts
This chapter covers the question: what do you do when your data are not on a regular grid?
This question comes up frequently because computers can only draw, for example, contour lines
if they know where to draw them. Often, a contouring package will grid your data for you using
a default method of generating a grid, and this will be acceptable. But there's more to it than
making it easy for computers to draw contour lines: different gridding methods will
produce different looking maps. How, then, can we objectively decide between these different
results? The answer to this question is, of course, dependent upon the problem you are working on.
5.1.1 Data on a regular grid
There is a straightforward way to contour irregularly spaced data: Delaunay triangularization. The
individual data points are connected in a network of triangles that have the following properties:
the triangles formed are nearly equiangular and the longest side of each triangle is as short as
possible.
We surround each irregularly spaced data point with an irregularly shaped polygon such
that every location inside the polygon is closer to the enclosed data point than to any other data
point. These irregular polygons are known as Thiessen
polygons, and the surrounding Thiessen polygons also enclose data points. Straight lines drawn
only between data points in neighboring Thiessen polygons create a Delaunay triangular network. The location of
the contour line along these triangularization lines is then computed by a simple linear interpolation
(see Fig. 5.1). This approach is OK for producing contour maps, but is difficult to use for derived
104 Modeling Methods for Marine Science
products (gradients, etc.) and is computationally intensive. Furthermore, if you were to sample at different
locations you would get a different contour map.
Figure 5.1: An example of what a triangularization grid looks like. Choosing the optimal way
to draw the connecting lines is a form of the Delaunay triangularization problem.
It is better, then, to put your data onto a regular, rectangular grid. A regular grid is easier for the
computer to use, but is more difficult for the user to generate. But the benefits to be gained from this
extra trouble are large.
5.1.2 Methods: nearest neighbor, bilinear, inverse square of distance, etc.
Nearest Neighbor: is a method that works in a way you might expect from the name. The grid
value ($\hat{Z}_i$) is estimated from the value of the nearest neighbor data point. The distance from a grid
point to the actual data points is given by:
$$d_{ij} = \sqrt{(x_{1,i} - x_{1,j})^2 + (x_{2,i} - x_{2,j})^2} \qquad (5.1)$$
where the index $i$ is a number indicating grid point number (in a sequential sense) and $j$ in this case
refers to sequential numbers identifying the actual data points. The grid value is then:
$$\hat{Z}_i = Z_j \quad \text{for the } j \text{ that minimizes } d_{ij} \qquad (5.2)$$
Glover, Jenkins and Doney; 9 May 2005 DRAFT 105
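Equations 5.1 and 5.2 are the bare bones of the nearest neighbor formulation. Sometimes this
method is augmented by the $N$-nearest neighbors (see Fig 5.2):
$$\hat{Z}_i = \frac{\sum_{j=1}^{N} Z_j / d_{ij}}{\sum_{j=1}^{N} 1 / d_{ij}} \qquad (5.3)$$
where $d_{ij}$ is the distance from grid point $i$ to data point $j$ and the sums run over the $N$ closest data points.
This method of generating grids is of particular use for filling in gaps in data already on a regular
grid or very nearly so.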
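As a rough illustration of the nearest neighbor idea, here is a minimal Python sketch. The function names are hypothetical, the sample values are borrowed from Table 5.1, and the $N$-nearest version shown is the plain unweighted mean of the $N$ closest points, which is an assumed simplification rather than necessarily the exact form of Eqn 5.3.

```python
import math

def distance(p, q):
    """Euclidean distance between two (x1, x2) points, as in Eqn 5.1."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def nearest_neighbor(grid_pt, data_xy, data_z):
    """Value of the single closest data point (the idea behind Eqn 5.2)."""
    j = min(range(len(data_xy)), key=lambda k: distance(grid_pt, data_xy[k]))
    return data_z[j]

def n_nearest_mean(grid_pt, data_xy, data_z, n):
    """Unweighted mean of the n closest data points (assumed simple variant)."""
    order = sorted(range(len(data_xy)),
                   key=lambda k: distance(grid_pt, data_xy[k]))
    nearest = order[:n]
    return sum(data_z[k] for k in nearest) / len(nearest)

# Three observations around the grid point (3.0, 3.0); values from Table 5.1.
xy = [(3.0, 4.0), (6.3, 3.4), (2.0, 1.3)]
z = [120.0, 103.0, 142.0]
print(nearest_neighbor((3.0, 3.0), xy, z))
print(n_nearest_mean((3.0, 3.0), xy, z, 2))
```

Looping either function over every node of a rectangular grid gives a complete gridded field.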
Bilinear Interpolation: is a method that is frequently referred to as a "good enough for government
work" method. The value at the grid point is an interpolation product ($\hat{Z}$) from the following
formulas for a 2-dimensional case:
$$t = \frac{x_1 - a_1}{b_1 - a_1} \qquad (5.4)$$
$$u = \frac{x_2 - a_2}{b_2 - a_2} \qquad (5.5)$$
and:
$$\hat{Z}_a = (1 - t) Z_1 + t Z_2 \qquad (5.6)$$
and:
$$\hat{Z}_b = (1 - t) Z_4 + t Z_3 \qquad (5.7)$$
$$\hat{Z} = (1 - u) \hat{Z}_a + u \hat{Z}_b \qquad (5.8)$$
where $Z_1$, $Z_2$, $Z_3$ and $Z_4$ are the actual data points surrounding the grid point (sometimes called a node),
located at the corners $(a_1, a_2)$, $(b_1, a_2)$, $(b_1, b_2)$ and $(a_1, b_2)$ respectively, and $(x_1, x_2)$ is the
location of the grid point. But
this method is best used for interpolating between data already on a grid. This method can be
augmented and there are logical extensions such as bicubic interpolation, which yields higher
order accuracy but suffers from over- and under-shooting the target more frequently.
Inverse distance: is actually a class of methods that weight each data point's contribution to the
grid point by the inverse of the distance between the grid point and data point (sometimes this
weight is raised to a power, 2, 3, or even higher if there is a reason). This is basically Eqn 5.3
with the $d_{ij}$ raised to the power mentioned. This method is fast, but has a tendency to generate
bull's-eyes around the actual data points.
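The inverse distance idea is short enough to sketch directly. The following Python function is an illustrative assumption of how such a scheme is commonly coded (the name `inverse_distance` and the default power of 2 are not from the notes); it returns the data value itself when the grid point sits exactly on a data point, which is also where the bull's-eyes come from.

```python
import math

def inverse_distance(grid_pt, data_xy, data_z, power=2.0):
    """Inverse-distance-weighted estimate at one grid point."""
    num = 0.0
    den = 0.0
    for (x1, x2), z in zip(data_xy, data_z):
        d = math.hypot(grid_pt[0] - x1, grid_pt[1] - x2)
        if d == 0.0:
            return z  # grid point coincides with a data point
        w = 1.0 / d ** power  # weight falls off as distance**power
        num += w * z
        den += w
    return num / den

# Two equidistant data points contribute equally.
print(inverse_distance((0.0, 0.0), [(1.0, 0.0), (-1.0, 0.0)], [10.0, 20.0]))
```

Raising `power` makes the estimate hug the nearest data point more tightly; lowering it toward zero approaches a plain average.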
Kriging: is a method to determine the best linear unbiased estimate of the grid points. We
will discuss this in greater detail in section 5.4. This method is very flexible, but requires the user
to bring a priori information about the data to the problem. This information takes the form of
a variogram of the semivariances and there are several models of variograms that can be used.
Typically, real data are best dealt with using a linear variogram unless there is a reasonable amount of data from which to
derive a robust variogram (more in sections 5.3 and 5.4).
Figure 5.2: An example of a regular grid, with data points $Z_1$ to $Z_4$ and a grid point $\hat{Z}_8$ to be estimated. The lines connecting the $Z$-points to the grid point are the distances
that could be calculated; the dashed lines indicate distances too large for the data to be expected
to have any significant influence on the grid point value. The $N$-nearest neighbors
method estimates grid point $\hat{Z}_8$ from the several closest data points, whereas the simpler
nearest neighbor method uses only the single closest one. This is a two dimensional example with axes $x_1$ and $x_2$.
5.1.3 Weighted averaging
Several of the above methods can also have weighting added to improve the fidelity of the grid to
the actual data. Consider Eqn 5.3 ($N$-nearest neighbors): this equation can have a weighting factor
added to the numerator (and, to keep the estimate normalized, to the denominator) to increase the influence of some data points over others on the value of
the grid points. Typically the weighting is done with some idea of the uncertainty in the individual
data points themselves (such as $w_j = 1/\sigma_j^2$):
$$\hat{Z}_i = \frac{\sum_{j=1}^{N} w_j Z_j / d_{ij}}{\sum_{j=1}^{N} w_j / d_{ij}} \qquad (5.9)$$
where $d_{ij}$ is the distance from grid point $i$ to data point $j$ and $w_j$ is the weight given to data point $j$.
5.1.4 Splines
We don't plan on covering splines per se. Like many of the topics covered in this course, splines
are a course unto themselves. But we would be remiss if we did not mention them here. Splines
got their start as long flexible pieces of wood or metal. They were used to fit curvilinearly smooth
shapes when the mathematics and/or the tools were not available to machine the shapes directly
(e.g. hull shapes and the curvature of airplane wings).
Since then, a mathematical equivalent has grown up around their use and they are extremely
useful in fitting a smooth line or surface to irregularly spaced data points. They are also useful
for interpolating between data points. They exist as piecewise polynomials constrained to have
continuous derivatives at the joints between segments. By piecewise we mean: if you don't
know how/what to do for the entire data array, then fit pieces of it one at a time. Essentially then,
splines are piecewise functions for connecting points in 2 or 3 dimensions. They are not analytical
functions nor are they statistical models; they are purely empirical and devoid of any theoretical
basis.
The most common spline (there are many of them) is the cubic spline. A cubic polynomial can
pass through any four points at once. To make sure that it is continuously smooth, a cubic spline
is fit to only two of the data points at a time. This allows for the use of the other information to
maintain this smoothness.
If you consider Fig. 5.3 there are four data points. Cubic polynomials are
fit to only two adjacent data points at a time (the first to the second, the second to the third, etc.). By requiring the tangent of
one cubic segment at its end point to be equal to the tangent of the next segment at that same point, we can write a series of simultaneous equations
and solve for the unknown coefficients. See Davis (1986) for more details and MATLAB's spline
toolbox (based on deBoor, 1978).
There are a number of known problems with splines. Extrapolating beyond the edges of the
data domain quite often yields wildly erratic results. This is because there is no information beyond
the data domain to constrain the extrapolation, and splines are essentially higher order polynomials
which will grow to large values (positive or negative). Closely spaced data points can develop
aneurysms: in an attempt to squeeze a higher order polynomial into a tight space, large over- and
under-shoots of the true function can occur. These problems also occur in 3-D applications of
splines. However, if a smooth surface is what you are looking for, frequently a spline (see spline
relaxation in other texts) will give you a good, usable smooth fit to your data.
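To make the "series of simultaneous equations" concrete, here is a minimal Python sketch of a cubic spline in one dimension. It assumes the so-called natural end condition (zero curvature at the two end points), which is one common choice and not necessarily the one used by any particular toolbox; the function name is hypothetical.

```python
def natural_cubic_spline(xs, ys):
    """Return a function that evaluates the natural cubic spline through (xs, ys).

    xs must be strictly increasing. The unknowns are the second derivatives
    m[i] at the data points; requiring equal tangents where neighbouring
    cubic pieces meet gives a tridiagonal system for them.
    """
    n = len(xs)
    h = [xs[i + 1] - xs[i] for i in range(n - 1)]
    diag = [2.0 * (h[i - 1] + h[i]) for i in range(1, n - 1)]
    rhs = [6.0 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])
           for i in range(1, n - 1)]
    # Forward elimination (Thomas algorithm); off-diagonals are h[1], h[2], ...
    for k in range(1, len(diag)):
        f = h[k] / diag[k - 1]
        diag[k] -= f * h[k]
        rhs[k] -= f * rhs[k - 1]
    # Back substitution; m[0] and m[n-1] stay zero (natural end condition).
    m = [0.0] * n
    for k in range(len(diag) - 1, -1, -1):
        m[k + 1] = (rhs[k] - h[k + 1] * m[k + 2]) / diag[k]

    def evaluate(x):
        # Find the segment containing x (end segments are used outside the data).
        i = 0
        while i < n - 2 and x > xs[i + 1]:
            i += 1
        a, b = xs[i + 1] - x, x - xs[i]
        return ((m[i] * a ** 3 + m[i + 1] * b ** 3) / (6.0 * h[i])
                + (ys[i] / h[i] - m[i] * h[i] / 6.0) * a
                + (ys[i + 1] / h[i] - m[i + 1] * h[i] / 6.0) * b)

    return evaluate

spline = natural_cubic_spline([0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 0.0, 1.0])
print(spline(0.5), spline(1.5), spline(2.5))
```

Evaluating `spline` well outside the interval from 0 to 3 is a quick way to see the erratic extrapolation behavior described above.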
Figure 5.3: A cubic polynomial is fit piecewise between each pair of adjacent data points. Because only two
points are used at any one time, the additional information from the other points can be used to
constrain the tangents to be equal at the intersections of the piecewise polynomials, that is, at
each interior data point.
5.2 Moving Averages
Sometimes it is possible to put your data onto a regular grid through various averaging schemes.
One of the most common is the moving average. These averaging schemes are an outgrowth
of a school of thought largely credited to mining operations in France and South Africa and are
precursors to kriging, the main topic of this segment. Each averaging scheme applies some variant
of the following mathematical equation:
$$\hat{Z} = \sum_{i=1}^{N} w_i Z_i \qquad (5.10)$$
where the grid estimate ($\hat{Z}$) is the sum of a weighting scheme ($w_i$) times the actual observations
($Z_i$). The nature of $w_i$ varies as we have seen in the first part of this segment ($N$-nearest neighbors,
inverse of the distance, inverse of the square of the distance, etc.).
5.2.1 Block Means
The first and simplest of these averaging techniques is the block mean. This technique involves
dividing your field area (containing somewhat randomly located samples) into equal area/volume
blocks. Consider a two-dimensional field divided into nine sub-areas, or blocks, of equal area.
Figure 5.4: A hypothetical study area divided into nine equal sub-areas or blocks, numbered 1 to 9. The red points
represent actual data sampling locations and the blue cross (+) represents just one out of several grid
points at which an estimate is desired. As shown in Eqn 5.11, the value at the blue cross can be estimated as
the weighted sum of the means of the surrounding blocks.
An estimator for the center of this design is then given by equation 5.11 and each sub-area
can be estimated by making it the center of its own 3-by-3 block.
$$\hat{Z} = \sum_{i=1}^{9} w_i \bar{Z}_i \qquad (5.11)$$
Here the $w_i$s are the weights applied to the block means ($\bar{Z}_i$). These weights are determined by a
number of methods, some of which are outlined in Section 5.1.3, or from field data that allow the
inversion of the system of equations in Eqn 5.11.
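A block-mean calculation can be sketched in a few lines of Python. The helper names below are hypothetical, empty blocks are simply skipped, and the weighted sum is divided by the total of the weights actually used; that normalization is an assumption made here so that the weights need not sum to one in advance.

```python
def block_means(data_xy, data_z, x_edges, y_edges):
    """Mean of the observations falling in each block of a rectangular partition.

    Returns a dict mapping (row, col) to the block mean; blocks with no
    data are left out rather than guessed at.
    """
    sums, counts = {}, {}
    for (x, y), z in zip(data_xy, data_z):
        col = next((c for c in range(len(x_edges) - 1)
                    if x_edges[c] <= x < x_edges[c + 1]), None)
        row = next((r for r in range(len(y_edges) - 1)
                    if y_edges[r] <= y < y_edges[r + 1]), None)
        if col is None or row is None:
            continue  # observation lies outside the study area
        sums[(row, col)] = sums.get((row, col), 0.0) + z
        counts[(row, col)] = counts.get((row, col), 0) + 1
    return {k: sums[k] / counts[k] for k in sums}

def block_estimate(means, weights):
    """Weighted sum of block means (the idea of Eqn 5.11), normalized."""
    total_w = sum(weights[k] for k in means)
    return sum(weights[k] * means[k] for k in means) / total_w

# Made-up observations in a 3-by-3 layout of unit blocks.
edges = [0.0, 1.0, 2.0, 3.0]
xy = [(0.5, 0.5), (0.6, 0.4), (1.5, 0.5)]
z = [1.0, 3.0, 5.0]
means = block_means(xy, z, edges, edges)
equal = {(r, c): 1.0 for r in range(3) for c in range(3)}
print(means, block_estimate(means, equal))
```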
One drawback to this approach is that although the mean of the block is relatively independent
of the size of the block (once the block is above a certain, data dependent, size), the variance of the
block estimate tends to increase with increasing block size. It is quite possible that the variance
of the block estimate may be too large to make the estimate of much use in your investigation.
Your block size can go wrong either way: too small, and there is not enough data to be realistic; too large, and all the structure
is averaged out (see discussion of stationarity below).
5.2.2 Moving average design
To produce estimates with lower variance and increase the reliability of the estimate we can use
a variation of the above block mean called a moving average. This moving average is a variation
upon the design of the block averaging. Once the study area has been divided into blocks, these
same blocks can be re-divided to give you more $\bar{Z}_i$s in Eqn 5.11; consider Fig 5.5.
Figure 5.5: The same study area as in Fig 5.4, but with the area surrounding the point to be estimated
divided into four new areas indicated by the shaded areas. Instead of having only nine block
averages to work with, Eqn 5.11 will have 13. The red data points have been removed from the
figure for clarity and the new blocks have not been numbered.
It is left to the reader's imagination as to how other geometries could be used to divide and
re-divide the study area into blocks for estimating the blue cross. Keep in mind that only one blue
cross was shown for demonstration purposes, but that each block has its own blue cross that is
estimated in a fashion similar to the one we just discussed.
The averages from the blocks, the $\bar{Z}_i$s, can also be weighted, or windowed. The results, in
histogram form, are shown in Fig 5.6. Now in Eqn 5.11 the $w_i$s are not equal over the entire block,
but rather are a function of how distant the block centroid is from the point to be estimated.
Figure 5.6: The effects of two kinds of windowing on averaging, in panels a and b (there are many forms of windowing;
these are just examples of two). In a) a simple boxcar type of windowing was applied to the
data; in this case the windows are all of an equal width, producing a classical looking histogram.
In b) the windows are tapered, like a Gaussian, making the data points closer to the point of estimation
more important (of greater weight) than points farther away. As an example compare the
shaded areas, which enclose the data and their weights used to estimate the point being estimated.
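The difference between the two windows is easy to see numerically. In this Python sketch the function names, the half-width and the Gaussian length scale are all illustrative assumptions; the weights are functions of the distance from each block centroid to the point being estimated.

```python
import math

def boxcar_weights(distances, half_width):
    """Equal weight inside the window, zero outside."""
    return [1.0 if d <= half_width else 0.0 for d in distances]

def gaussian_weights(distances, length_scale):
    """Tapered weights that fall off smoothly with distance."""
    return [math.exp(-0.5 * (d / length_scale) ** 2) for d in distances]

def windowed_average(values, weights):
    """Weighted mean of the block averages."""
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

# Made-up block means and their centroid distances from the estimation point.
block_values = [4.0, 6.0, 10.0]
centroid_dist = [0.0, 1.0, 3.0]
print(windowed_average(block_values, boxcar_weights(centroid_dist, 2.0)))
print(windowed_average(block_values, gaussian_weights(centroid_dist, 1.0)))
```

With the boxcar the distant block is either fully in or fully out; with the taper it always contributes, but only slightly.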
5.2.3 Trend surface analysis
There are two aspects of trend surface analysis that are important for gridding your data and kriging
(which may sound redundant, but there are subtle differences between the two). In the first case,
by fitting a trend surface to your data you can use the fit function to re-sample your data field on a
regular grid. This reflects an interest in the trend surface itself. In the second case, you may want
to remove a trend surface from your data before proceeding with the kriging operation.
Sometimes it is desirable or just convenient to have a function that represents your data in
terms of the coordinate system of your study area (e.g. in terms of the longitude and latitude).
In these cases it is possible to make your study variable a function of your coordinate system.
You are, in fact, fitting a trend surface to your data in terms of the coordinates that you use to
locate your samples. It can be a trend surface of any order and rank, meaning that the trend can
be first order (a straight line, a flat plane) or second order (quadratic curve, surface or hyper-surface).
The order refers to the highest power any independent variable is raised to; the rank refers to the
dimensionality. You can set up the equations and solve them with either normal equations or the
design matrix; in certain advanced cases you may need to apply the non-linear fitting technique of
Levenberg-Marquardt. Or, in most cases, you can use a handy little m-file Bill Jenkins wrote up
called surfit.m. It uses the repetitious nature of higher and higher order polynomials and the
SVD solution to the normal equations to fit surfaces to your data of the form:
$$Z(x_1, x_2) = a_0 + a_1 x_1 + a_2 x_2 + a_3 x_1^2 + a_4 x_1 x_2 + a_5 x_2^2 + \cdots \qquad (5.12)$$
Most grid generation schemes work best when the data contain no trend; in order to accomplish this it is
important to remove any trend surface from your data first. At the very least you should remove a
first order, $n$-dimensional surface ($n$ refers to the rank of your coordinate system) from your data
before proceeding to run your grid generation routine. You can always add it back in to your grid
estimation points because you now have an analytical equation that relates your property to the
coordinate system of your study area. Higher order, $n$-dimensional surfaces can also be fitted.
The higher the order you go, the better your fit will be, regardless of what you use as a goodness-of-fit
parameter. But keep in mind that the better fit may not be statistically significant; you can use
ANOVA to test for this (see also Davis, 1986, pp 419-425).
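Removing a first order trend surface amounts to a small least-squares problem. The Python sketch below is not surfit.m: it is a hypothetical stand-in that fits only a plane, and it solves the normal equations by Gaussian elimination rather than by SVD, which is adequate for a well-conditioned 3-by-3 system but is a swapped-in technique.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_plane(xy, z):
    """Least-squares first order trend surface z = a0 + a1*x1 + a2*x2."""
    rows = [[1.0, x1, x2] for x1, x2 in xy]  # design matrix
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atz = [sum(r[i] * zi for r, zi in zip(rows, z)) for i in range(3)]
    return solve(ata, atz)

def detrend(xy, z, coeffs):
    """Residuals left after subtracting the fitted plane from the data."""
    a0, a1, a2 = coeffs
    return [zi - (a0 + a1 * x1 + a2 * x2) for (x1, x2), zi in zip(xy, z)]

# Made-up data lying exactly on the plane z = 2 + 3*x1 - x2.
xy = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
z = [2.0, 5.0, 1.0, 4.0, 7.0]
coeffs = fit_plane(xy, z)
print(coeffs, detrend(xy, z, coeffs))
```

The residuals returned by `detrend` are what would be passed on to the gridding step; the fitted plane is added back at the grid points afterwards.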
5.3 Variograms
At the heart of kriging is the semivariogram or structure function of the regionalized variables that
you are trying to estimate. This amounts to the a priori information that you must supply to the
software in order to make a regular grid out of your irregularly spaced data. Basically the idea is to
have an estimate of the distance one would need to travel before data points separated by that much
distance are uncorrelated. This information is usually presented in the form of the variogram, in
which the semivariance is a function of distance or lag ($h$).
5.3.1 Regionalized variables
Simply put, a regionalized variable is a variable that can be said to be distributed in space. This
space is not limited to the three-dimensional kind of space that we move around in every day, but
can be extended to include time, parameter space, property space, etc. This definition, "distributed
in space", is purely descriptive and makes no probabilistic assumptions. It merely recognizes the
fact that properties measured in space follow an irregular pattern that cannot be described by a
mathematical function. Nevertheless, at every point in a space it has a value ($Z(x_1, \ldots, x_n)$,
where $n$ is equal to the dimensionality of your space). A regionalized variable is typically represented
as $Z(x)$ and the grid point estimate of it as $\hat{Z}(x)$. A regionalized variable, then, seems to
have two contradictory characteristics:
a local, random, erratic aspect which calls to mind the notion of a random variable;
a general (or average) structured aspect which requires a certain functional representation.
Hence we are dealing with a naturally occurring property (variable) that has characteristics
intermediate between a truly random variable and a completely deterministic variable. In addition,
this variable (property) can have what is known as a drift associated with it. These drifts are
generally handled with trend surface analysis and can be analyzed for and subtracted out of the
data much the same way an offset can be subtracted out of a data set.
5.3.2 Semivariance
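First remember the definition of variance:
$$s^2 = \frac{1}{n - 1} \sum_{i=1}^{n} \left( Z_i - \bar{Z} \right)^2 \qquad (5.13)$$
In most cases the variance of a data set is a number (scalar). The semivariance is a curve (vector)
derived from the data according to:
$$\gamma^*(h) = \frac{1}{2 N(h)} \sum_{i=1}^{N(h)} \left[ Z(x_i) - Z(x_i + h) \right]^2 \qquad (5.14)$$
where the asterisk indicates an experimental variogram computed from the data, $h$ is the lag
distance between data point pairs and $N(h)$ is the number of pairs separated by that lag. There also are theoretical semivariograms which model the
structure of the underlying correlation between data points, such as the exponential model:
$$\gamma(h) = c_0 + c \left( 1 - e^{-h/a} \right) \qquad (5.15)$$
where $c_0$ equals the nugget, $c$ equals the sill and $a$ equals the range of the semivariogram model.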
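An experimental semivariogram in the spirit of Eqn 5.14 can be computed by brute force over all data pairs. In this Python sketch the binning by a fixed lag width is an assumed, commonly used convention, the function names are hypothetical, and the exponential model is written in the assumed form $c_0 + c\,(1 - e^{-h/a})$.

```python
import math

def experimental_semivariogram(xy, z, lag_width, n_lags):
    """Bin all data pairs by separation distance and apply Eqn 5.14 per bin.

    Returns a list of (mean lag, semivariance, number of pairs) tuples,
    skipping bins that contain no pairs.
    """
    sq_sum = [0.0] * n_lags
    h_sum = [0.0] * n_lags
    count = [0] * n_lags
    for i in range(len(z)):
        for j in range(i + 1, len(z)):
            h = math.hypot(xy[i][0] - xy[j][0], xy[i][1] - xy[j][1])
            k = int(h // lag_width)  # index of the lag bin for this pair
            if k < n_lags:
                sq_sum[k] += (z[i] - z[j]) ** 2
                h_sum[k] += h
                count[k] += 1
    return [(h_sum[k] / count[k], sq_sum[k] / (2 * count[k]), count[k])
            for k in range(n_lags) if count[k] > 0]

def exponential_model(h, nugget, sill, rng):
    """Exponential semivariogram, assumed form c0 + c * (1 - exp(-h / a))."""
    return nugget + sill * (1.0 - math.exp(-h / rng))

# Made-up transect: three stations one unit apart.
xy = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
z = [0.0, 1.0, 2.0]
print(experimental_semivariogram(xy, z, lag_width=1.0, n_lags=3))
```

Plotting the second element of each tuple against the first gives the experimental curve to which a model such as `exponential_model` is then fitted.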
5.3.3 The nugget, range and sill
These three parameters define the semivariogram:
Nugget ($c_0$): Represents unresolved, sub-grid scale variation or measurement error and is seen on
the variogram as the intercept of the variogram.
Range ($a$): The scalar that controls the degree of correlation between data points, usually represented
as a distance.
Sill ($c$): The value of the semivariance as the lag ($h$) goes to infinity; it is equal to the total variance
of the data set.
Given the two parameters range and sill and the appropriate model of semivariogram, the
semivariances can be calculated for any $h$. These quantities can be best visualized in Fig 5.7, a
simple exponential model of semivariance.
Figure 5.7: A simple exponential semivariogram with a range of 5 and a sill of 10. The semivariance $\gamma(h)$ (vertical axis, 0 to 10) is plotted against lag $h$ (horizontal axis, 0 to 30).
The constant offset ($c_0$) added to the theoretical semivariance models is known as the nugget
effect. This constant accounts for the influence of high concentration centers in the data that prevent
the experimental semivariogram from passing through the origin. This model has its beginnings
with mining geologists who were looking for nuggets of gold, which were rarely sampled
directly, hence the unresolved or sub-sampling grid scale variability.
There are several models of semivariance to pick from; the trick is to pick the one that best fits
your data. We will mention, later on in our discussions of kriging and cokriging, that if you are
estimating the semivariogram experimentally (i.e. from actual data) often the linear model seems
to give the best results. But there seems to be quite a bit of debate over what is the universal
model. You have already seen the exponential model; there are also the:
spherical model - which rises to the sill value more quickly than the exponential model; the general
equation for it looks like:
$$\gamma(h) = \begin{cases} c_0 + c \left[ \dfrac{3h}{2a} - \dfrac{h^3}{2a^3} \right], & h \le a \\ c_0 + c, & h > a \end{cases} \qquad (5.16)$$
Gaussian model - is a semivariogram model that displays parabolic behavior near the origin (unlike
the previous models which display linear behavior near the origin). The formula that
describes a Gaussian model is:
$$\gamma(h) = c_0 + c \left( 1 - e^{-h^2/a^2} \right) \qquad (5.17)$$
linear model - in this model the data do not support any evidence for a sill or a range and rather
appear to have increasing semivariance as the lag increases. This is a key sign that the proper
choice is the linear model. In these cases the linear model is concerned with the slope and
intercept of the experimental semivariogram. It is given simply as:
$$\gamma(h) = c_0 + b h \qquad (5.18)$$
and the slope ($b$) is nothing more than the ratio of the sill ($c$) to the range ($a$).
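For comparison, the three models just listed can be written as short Python functions. These follow the commonly used textbook forms, which is an assumption about scaling conventions (some authors put a factor of 3 in the exponents); the function and argument names are hypothetical.

```python
import math

def spherical(h, nugget, sill, rng):
    """Spherical model: cubic rise up to the range, flat at the sill beyond it."""
    if h >= rng:
        return nugget + sill
    r = h / rng
    return nugget + sill * (1.5 * r - 0.5 * r ** 3)

def gaussian(h, nugget, sill, rng):
    """Gaussian model: parabolic near the origin."""
    return nugget + sill * (1.0 - math.exp(-(h / rng) ** 2))

def linear(h, nugget, slope):
    """Linear model: no sill or range, only intercept and slope."""
    return nugget + slope * h

# Tabulate the models for a range of 5 and a sill of 10 (slope = sill / range).
for h in (0.0, 1.0, 2.5, 5.0, 10.0):
    print(h, spherical(h, 0.0, 10.0, 5.0), gaussian(h, 0.0, 10.0, 5.0),
          linear(h, 0.0, 10.0 / 5.0))
```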
5.3.4 2nd Order Stationarity
Data fields are said to be first order stationary when there is no trend, i.e. the mean of the field is
the same in all sub-regions. This is easily accomplished by fitting and removing a trend surface
to/from the data (if you know what the trend is in the first place). Second order stationary data
fields are realized when the variance is constant from one sub-region to the next. We say the data
(actually the residuals) are homoscedastic, that is to say, equally scattered about a mean of
zero.
5.3.5 Isotropic and anisotropic data
The easiest semivariance model of your data to envision is one in which the sill and range values are
always the same, regardless of the direction being considered. But that is not always the case and
it is often found that data display anisotropic behavior in their range. Nevertheless, if the data are
second order stationary, the sill will be the same in all directions. If it is not, then this is a warning
that not all the large-scale structure has been removed from the data. Consider again an exponential
model but now look at the difference revealed when the semivariances are calculated only in the
north-south direction compared to only in the east-west direction (Fig 5.8). Knowledge of these
anisotropies is necessary when designing an appropriate semivariogram model of your data prior
to kriging.
Figure 5.8: Two semivariograms, one labeled East-West and one North-South ($\gamma(h)$ versus lag $h$), showing the presence of anisotropies in the data. In this case the
range and sill for the east-west direction are 5 and 8, but in the north-south direction they are 3 and
8.
5.3.6 Robust semivariogram
There will be times when you will hear references to a robust semivariance estimator. This idea
was championed by Noel Cressie and is dealt with in some detail in his book (Statistics for Spatial
Data). Basically it is a variant on Eqn 5.14 that accounts for the effects of outliers in your data.
Outliers (data in the tails of your data distribution that fall outside Gaussian expectations) have a
tendency to distort the results of Eqn 5.14. Cressie has put forward the following equation to make
the experimentally determined semivariogram less sensitive to these outliers (hence, robust).
$$\bar{\gamma}(h) = \frac{\left[ \dfrac{1}{N(h)} \sum_{i=1}^{N(h)} \left| Z(x_i) - Z(x_i + h) \right|^{1/2} \right]^4}{2 \left[ 0.457 + \dfrac{0.494}{N(h)} \right]} \qquad (5.19)$$
While somewhat overwhelming looking, upon inspection we see that this is just Eqn 5.14
modified. By taking the absolute value of the difference between two data points separated by
a distance $h$, then taking its square root, dividing the sum by the number of data pairs separated by the
distance $h$, and then raising the result to the fourth power, we diminish the effects of these outliers.
The denominator is nothing more than a normalization to make gamma unbiased. This form of
the experimental semivariogram is very useful in cases where we have a lot of data to estimate
the semivariogram from and outliers can become an irksome problem, although this equation also
works on lower data densities.
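The effect of the robust estimator is easiest to appreciate side by side with the classical one. This Python sketch works on the pair differences for a single lag bin; the function names are hypothetical, and the constants 0.457 and 0.494 are those of the Cressie-Hawkins estimator as it is usually published.

```python
def classical_semivariance(diffs):
    """Eqn 5.14 for one lag bin: half the mean squared difference."""
    return sum(d * d for d in diffs) / (2.0 * len(diffs))

def robust_semivariance(diffs):
    """Cressie-Hawkins estimator for one lag bin.

    diffs holds Z(x_i) - Z(x_i + h) for every data pair in the bin.
    """
    n = len(diffs)
    mean_root = sum(abs(d) ** 0.5 for d in diffs) / n
    return mean_root ** 4 / (2.0 * (0.457 + 0.494 / n))

# Made-up differences: nine well-behaved pairs and one outlier.
diffs = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, 20.0]
print(classical_semivariance(diffs), robust_semivariance(diffs))
```

The single outlier dominates the classical value because it enters squared, whereas in the robust form it enters only through its square root before the averaging.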
5.4 Kriging
Kriging is a method devised by geostatisticians to provide the best local estimate of
the mean value of a regionalized variable. Typically this was an ore grade and was motivated
by the desire to extract the most value from the ore deposit with the minimum amount of capital
investment. The technique and theory of geostatistics has grown since those early days into a
field dedicated to finding the best linear unbiased estimator (BLUE) of the unknown characteristic
being studied.
5.4.1 Variogram criticality
While one can use ANOVA to determine at what level a trend surface is significant, there is still
something of an art to determining the correct variogram model to use. Using ANOVA as a guide
one can fit trend surfaces to the data of ever increasing order; eventually your ANOVA will tell
you that at some specified level of significance, a trend surface of that order does not provide a
statistically significant increase in the fit to the data. That's where you stop, at that order minus one.
The weights and neighborhood of the trend surface analysis are dependent upon the semivariance
of the data, i.e. dependent upon the structure function the data display. This interdependency
between trend surface and semivariance means that there is no unique solution/combination of
trend and semivariance, hence the art. The degree of spatial continuity of the data (regionalized
variable) is given by the semivariogram (see section 5.3) and some of the types of models used are
covered in sections 5.3.2 and 5.3.3.
5.4.2 Punctual kriging
To explain what kriging is, we are going to concentrate on the simplest form of kriging, punctual
kriging. Consider that you want to find:
$$\hat{Z}_P = \sum_{i=1}^{n} W_i Z_i \qquad (5.20)$$
That is to say, find the best linear weighted (unbiased) estimate for property $Z$ at point $P$ (note
this is a capital $P$). In addition suppose you also want to know what the estimation error is as
well:
$$\varepsilon_P = \hat{Z}_P - Z_P \qquad (5.21)$$
That is to say, you want to know the difference between what you estimated $Z_P$ to be and what it
really is, a quantity we usually don't know ($Z_P$). There is a way to do this by requiring that the
weights sum to one; this will result in an unbiased estimate if there is no trend. You can then
calculate the error variance as:
$$s_\varepsilon^2 = \frac{\sum \left( \hat{Z}_P - Z_P \right)^2}{n} \qquad (5.22)$$
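It seems only logical that the closer a data point is to the grid point you wish to estimate the
more weight it should carry. The weights used ($W_i$) and the error of estimate ($s_\varepsilon$) are related to
distance ($h$) through the semivariogram. So, if we had three data points from which to estimate one grid
point (as in Fig 5.9), we would have:
$$\hat{Z}_P = W_1 Z_1 + W_2 Z_2 + W_3 Z_3 \qquad (5.23)$$
for the estimate and:
$$W_1 + W_2 + W_3 = 1 \qquad (5.24)$$
for the weights. The question that remains is: how do we find the best set of $W_i$s? Consider
Fig 5.9; here we have three data (control) points and from them we wish to make a best linear
unbiased estimate of the $Z$-field at grid point $P$.
Using the semivariogram we can create the following sets of equations:
$$\begin{aligned} W_1 \gamma(h_{11}) + W_2 \gamma(h_{12}) + W_3 \gamma(h_{13}) &= \gamma(h_{1P}) \\ W_1 \gamma(h_{21}) + W_2 \gamma(h_{22}) + W_3 \gamma(h_{23}) &= \gamma(h_{2P}) \\ W_1 \gamma(h_{31}) + W_2 \gamma(h_{32}) + W_3 \gamma(h_{33}) &= \gamma(h_{3P}) \end{aligned} \qquad (5.25)$$
or, more compactly:
$$\sum_{j=1}^{3} W_j \gamma(h_{ij}) = \gamma(h_{iP}), \quad i = 1, 2, 3 \qquad (5.26)$$
where $\gamma(h_{ij})$ is the semivariance over the distance $h$ between control points $i$ and $j$ and
$\gamma(h_{iP})$ is the semivariance over the distance between the control point $i$ and the grid point $P$. With
Eqn 5.25 we have three unknowns and four equations (remember Eqn 5.24), and to force Eqn 5.24
to always be true we add a slack variable ($\lambda$), resulting in a matrix set of equations like:
$$\begin{bmatrix} \gamma(h_{11}) & \gamma(h_{12}) & \gamma(h_{13}) & 1 \\ \gamma(h_{21}) & \gamma(h_{22}) & \gamma(h_{23}) & 1 \\ \gamma(h_{31}) & \gamma(h_{32}) & \gamma(h_{33}) & 1 \\ 1 & 1 & 1 & 0 \end{bmatrix} \begin{bmatrix} W_1 \\ W_2 \\ W_3 \\ \lambda \end{bmatrix} = \begin{bmatrix} \gamma(h_{1P}) \\ \gamma(h_{2P}) \\ \gamma(h_{3P}) \\ 1 \end{bmatrix} \qquad (5.27)$$
This yields the $W_i$s and $\lambda$, and one more equation:
$$s_\varepsilon^2 = W_1 \gamma(h_{1P}) + W_2 \gamma(h_{2P}) + W_3 \gamma(h_{3P}) + \lambda \qquad (5.28)$$
yields the error of estimate.
Figure 5.9: Showing the layout of three control points ($Z_1$, $Z_2$, $Z_3$) and the grid point ($\hat{Z}_P$) to be estimated in example
5.1, on axes $x_1$ and $x_2$. The distances ($h$) between control points (dashed lines; labeled 3.4, 2.9 and 4.8) used to calculate the left-hand side
of Eqn 5.29 and the distances from control points to $\hat{Z}_P$ (labeled 1, 3.3 and 2) used to calculate the right-hand side are
given.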
Now we want to point out that something really cool is happening here. If you stop and think
about it you may wonder why the weights should apply to both the data points and the semivariances.
We shouldn't have any problem considering Eqn 5.23; after all, it's just the best linearly
weighted combination of the surrounding data points. But what about Eqn 5.25? Why should these
also be true? Well, strictly they aren't, not until you add the slack variable ($\lambda$) that allows Eqn 5.24
to always be true. What insight does this give you into the nature of regionalized variables? We'll
let you ponder that for a while.
Now it sometimes happens that you don't want to or can't remove the trend surface prior to
kriging. It is still possible to come up with a best linear unbiased estimate of your grid points using
Universal Kriging; the matrix you form is even more complicated than the one in Eqn 5.27 and is
covered in Davis, Chapter 5.
Table 5.1: Example 5.1
            x1 Coordinate (km)   x2 Coordinate (km)   Water Elevation (m)
Well 1      3.0                  4.0                  120
Well 2      6.3                  3.4                  103
Well 3      2.0                  1.3                  142
Your site   3.0                  3.0                  ???
5.4.3 An example
This example is taken from Davis, Chapter 5 and addresses only the concept of punctual kriging.
Suppose you wanted to dig a well and wanted a good estimate of the elevation of the water table
before you began digging. Suppose further that you had three wells already dug distributed about
your proposed site ($P$) much in the same fashion as are the control points in Fig 5.9. Given the
data in Table 5.1, you can use punctual kriging to make a best linear unbiased estimate of the water
table elevation at your proposed site.
From this information and a structure analysis (semivariogram) you can fill out the equations
in Eqn 5.27 and then solve for the water table elevation at your site. The semivariogram analysis
revealed a linear semivariogram out to 20 km with an intercept of zero and a slope of 4 m$^2$/km. So
the matrices in Eqn 5.27 look like:
$$\begin{bmatrix} 0 & 13.42 & 11.52 & 1 \\ 13.42 & 0 & 19.14 & 1 \\ 11.52 & 19.14 & 0 & 1 \\ 1 & 1 & 1 & 0 \end{bmatrix} \begin{bmatrix} W_1 \\ W_2 \\ W_3 \\ \lambda \end{bmatrix} = \begin{bmatrix} 4.00 \\ 13.30 \\ 7.89 \\ 1 \end{bmatrix} \qquad (5.29)$$
The numbers on the left-hand side of the equation come from the semivariances between con-
trol points calculated by knowing the distance between them and the linear semivariogram model.
The numbers on the right-hand side of the equation come from knowing the distance between the
proposed site and each control point and the linear semivariogram model.
If the condition number of the matrix on the left-hand side isnt too bad, you can invert directly
to solve for the Ws and lambda, otherwise you can use SVD to solve for the answer. Either way
in this case you get a column vector of:
(5.30)
which when multiplied through Eqn 5.23 gives an estimate of 125.3 m and using Eqn 5.28 yields an
error estimate of 5.28 m . The square root of this number represents one standard deviation (2.30
-
7/28/2019 Chapter Kri
19/24
Glover, Jenkins and Doney; 9 May 2005 DRAFT 121
m), which represents the bounds of 68% confidence. So plus or minus two times this standard
deviation brackets the elevation of the water table at your proposed site with 95% confidence.
MATLAB's answers are a little different from the ones in Davis (1986), but we attribute that
to the fact that Davis does not use singular value decomposition to invert his matrices (see Davis,
1986, Chap 3).
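The calculation above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the MATLAB code used for these notes: the well coordinates and elevations are placeholder values (Table 5.1 is not reproduced here), the function names are hypothetical, and a plain Gaussian elimination stands in for the direct inverse or SVD mentioned above. Only the linear semivariogram with zero intercept and a slope of 4 m²/km is taken from the text.

```python
import math


def linear_semivariogram(h, slope=4.0):
    """Linear model with zero intercept: gamma(h) = slope * h (m^2, h in km)."""
    return slope * h


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def punctual_krige(points, values, target, gamma=linear_semivariogram):
    """Return (weights, lagrange multiplier, estimate, error variance)."""
    n = len(points)

    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    # Left-hand side: semivariances between control points, bordered by the
    # row and column of ones that force the weights to sum to one.
    lhs = [[gamma(dist(points[i], points[j])) for j in range(n)] + [1.0]
           for i in range(n)]
    lhs.append([1.0] * n + [0.0])
    # Right-hand side: semivariances between each control point and the target.
    rhs = [gamma(dist(p, target)) for p in points] + [1.0]
    sol = solve(lhs, rhs)
    weights, lam = sol[:n], sol[n]
    estimate = sum(w * v for w, v in zip(weights, values))
    variance = sum(w * g for w, g in zip(weights, rhs[:n])) + lam
    return weights, lam, estimate, variance


# Placeholder data (ASSUMPTION: not the values of Table 5.1).
wells = [(3.0, 4.0), (6.3, 3.4), (2.0, 1.3)]   # km
elevations = [120.0, 103.0, 142.0]             # m
site = (3.0, 3.0)                              # km

weights, lam, estimate, variance = punctual_krige(wells, elevations, site)
half_width = 2.0 * math.sqrt(variance)         # two standard deviations
print(estimate, variance, estimate - half_width, estimate + half_width)
```

A useful sanity check on any implementation is that kriging onto one of the control points returns that point's own value with zero error variance, and that the weights always sum to one.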
5.5 Cokriging with MATLAB
We have been very fortunate to obtain from Denis Marcotte (via e-mail) copies of the m-files
published in Marcotte (1991). This section of the lecture notes covers how to use this
program. Although not covered in lecture, this very powerful program will be extremely useful to any
of you who must make objective grids of your data during your careers.
The concept of cokriging is nothing more than a multivariate extension of the kriging technique
we went over in class and covered in lecture notes section 5.4. Instead of going through all of
the machinations necessary for kriging one property at a time, we do all of the properties we wish
to grid in one calculation. In addition, covariance information about the way the properties are related
to each other is used to improve the grid estimation and reduce the error associated with the grid
estimates.
5.5.1 Estimating the variogram
Along with the types of variograms estimated in lecture notes section 5.3, cross-variograms are also
necessary. These are logical extensions of the variograms we have already dealt with. Remember,
the semivariance is provided by:
(5.31)
The cross-semivariance is given by:
(5.32)
where the sum runs over the data pairs that are separated by the same lag distance; when the two
properties are the same, you have the definition of the semivariogram. One interesting thing about
the cross-semivariance is that it can take on negative values. The semivariance must, by definition,
always be non-negative; the cross-semivariance can be negative because the value of one property
may be increasing while the other in the pair is decreasing.
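To make the sign behaviour concrete, here is a minimal sketch of the usual experimental estimator: half the average, over all pairs at a given lag, of the product of the two properties' differences. It assumes both properties are sampled at the same, regularly spaced locations along a 1-D transect, so the lag is an integer number of steps. The function names and the toy data are hypothetical, and the notation of Eqns 5.31 and 5.32 may differ.

```python
def cross_semivariance(zj, zk, lag):
    """Experimental cross-semivariance of two co-located series at an integer lag."""
    if len(zj) != len(zk):
        raise ValueError("both properties must be sampled at the same locations")
    n_pairs = len(zj) - lag
    if lag < 1 or n_pairs < 1:
        raise ValueError("lag must be at least 1 and leave at least one pair")
    total = sum((zj[i] - zj[i + lag]) * (zk[i] - zk[i + lag])
                for i in range(n_pairs))
    return total / (2.0 * n_pairs)


def semivariance(z, lag):
    """The semivariance is the cross-semivariance of a property with itself."""
    return cross_semivariance(z, z, lag)


rising = [1.0, 2.0, 3.0, 4.0]    # toy property that increases along the transect
falling = [4.0, 3.0, 2.0, 1.0]   # toy property that decreases along the transect

print(semivariance(rising, 1))                  # 0.5
print(cross_semivariance(rising, falling, 1))   # -0.5
```

The semivariance is a sum of squares and cannot be negative, while the cross term changes sign as soon as one property rises where the other falls.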
5.5.2 The coregionalized model
As we discussed earlier, a regionalized variable is a variable that is distributed in space, where the
meaning of space can be extended to include phenomena that are generally thought of as occurring
in time. A regionalized phenomenon can be represented by several inter-correlated variables, for
example, lead-zinc deposits or nutrients in the ocean. There may then be some advantage to studying
them simultaneously; this is an extension of regionalized variable theory to multivariate space
and is what amounts to a coregionalized model. We can see from Eqn 5.32 that the cross-variogram
is symmetric both in the pair of properties (swapping the two) and in the lag (reversing its
direction), which is not always the case in the covariance matrix formed from the data.
5.5.3 Using cokri.m
In this section we will try to give you our best understanding of the program cokri.m. In this
way we hope to make the simplest and most straightforward application of this program available
to you, while leaving open the possibility of future, more complicated uses.
Sometimes the easiest way to understand a program is to understand, as best as possible, what
the input and output variables are. But first let's define some of the indices we will be using
when talking about the parts of cokri.m: $n$ represents the number of data points (real ones,
not estimated ones); $d$ represents the number of dimensions you are working with (in the water
table example above you have $x$ and $y$ coordinates, hence $d = 2$; remember, the elevation of the
water table was your regionalized variable); lowercase $p$ represents the number of properties you
are working with (again, in the example above there was only water table elevation, so $p = 1$);
$m$ represents the total number of grid points (nodes) that you are working with; and $r$ represents
the number of variogram models you are working with. Now for the input and output variables; in the
case of cokri.m they are as follows:
Input:
x is the $n$ by ($p + d$) matrix of data points. In this program $n$ refers to the total number of
sample locations (stations, well locations, etc.), $p$ refers to the number of properties you are
estimating, and $d$ to the dimensionality of the problem being studied (1-D, 2-D, 3-D, etc.).
x0 is the $m$ by $d$ matrix of points on your grid that you will be kriging (cokriging) onto. In this
program $m$ is the number of grid points, e.g. if you are working a 2-D problem and decide
to put your estimates onto a 21 by 57 point grid, $m$ will be equal to 1197; x0 is, however,
represented as a 1197 by 2 matrix of coordinate doublets.
model is perhaps one of the trickiest variables in the program. It represents half of the
coregionalization model to be used. If you are using only one model in 3-D, then model is a 1
by 4 matrix wherein the first column is the code of the model (1=nugget, 2=exponential,
3=gaussian, 4=spherical, 5=linear). The remaining columns of the variable represent
the range of your model in the $x$, $y$, and $z$ directions. A special note should be made here
about the use of a linear model. As stated in the help information of cokri.m, the ranges in
a linear model are arbitrary, so that when they are divided into the also arbitrary sill values in
c they produce the slope of the linear semivariogram model being used for that linear model
in that direction.
c is the variable containing the $rp$ by $p$ sills of the coregionalization model in use. In this program
$r$ is used to represent the number of variogram models being applied to the problem. For
example, one might wish to combine the effects of a nugget model with a linear model for
three properties in 3-dimensions; then $r = 2$, model is a 2 by 4 matrix, and c is a 6 by 3 matrix
of numbers. A nugget model is indicated when the intercept of a semivariogram model is
not zero, and that intercept value is put in the first $p$ by $p$ sub-matrix of c to correspond to the
first model row of the model variable.
itype is a scalar variable indicating the type of cokriging to be done. In five different values
just about everything is covered, from simple cokriging to universal cokriging with a trend
surface of order 2. In general, simple cokriging should be used when the mean of the data is
known and the data field is globally stationary in its mean as well as locally stationary in its
variance.
avg: in his paper, Marcotte (1991) states that this variable is not used, but later in one of his
examples he uses it. We cannot get the program to run unless we provide a 1 by $p$ matrix of
the averages of the individual properties being cokriged when doing simple cokriging.
block is a 1 by $d$ vector of the size of the blocks to be cokriged. If we were certain of the volume
of our individual samples we could use something other than point kriging; for point kriging, any
positive values will work.
nd is a 1 by $d$ vector of the discretization grid for cokriging; if using point cokriging make them
all ones.
ival is a scalar describing whether or not cross-validation should be done and how. We find it
easier and quicker to run the program with ival set to zero for no cross-validation.
nk is a scalar indicating the number of nearest neighbors of the input matrix x to use in estimating
the cokriged grid point. It is difficult to give hard and fast rules for deciding
how large to make this parameter. You may wish cokri.m to use all of the data points and set this
scalar to a very large number; on the other hand, you may wish for only local effects to factor
into the weighted estimates for the grid point. If you don't get satisfactory results the first
time around, increase or decrease this number.
rad is a scalar that describes the radius of search for the nearest neighbors in x; clearly nk and rad
are interrelated and one helps constrain the other. Additionally, it is clear here that the
coordinates all need to be in the same units; if not, standardization helps.
ntok is a scalar describing how many groups of grid points in x0 will be cokriged as one. When ntok
is greater than one, the points inside the search radius will be found from the centroid location.
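The shapes of x0, model and c described above are easy to get wrong, so here is a minimal sketch that builds them as nested Python lists and checks their dimensions. It does not call cokri.m. The grid extents, the ranges and every sill value are invented for illustration, the helper names are hypothetical, and two choices are assumptions rather than documented behaviour: the x coordinate varies fastest in x0, and the nugget row of model carries placeholder ranges.

```python
def linspace(start, stop, count):
    """Return count evenly spaced values from start to stop inclusive."""
    step = (stop - start) / (count - 1)
    return [start + i * step for i in range(count)]


def make_grid(x_nodes, y_nodes):
    """Flatten a 2-D grid into an m-by-2 list of coordinate doublets."""
    return [[x, y] for y in y_nodes for x in x_nodes]


def shape(matrix):
    """Return (rows, columns) of a rectangular nested list."""
    return (len(matrix), len(matrix[0]))


# A 21 by 57 node grid gives m = 1197 rows of (x, y) doublets.
x0 = make_grid(linspace(0.0, 20.0, 21), linspace(0.0, 56.0, 57))

# Nugget (code 1) plus linear (code 5) in 3-D: r = 2, so model is 2 by 4.
model = [
    [1, 1.0, 1.0, 1.0],      # nugget row, placeholder ranges (ASSUMPTION)
    [5, 10.0, 10.0, 10.0],   # linear row, arbitrary ranges
]

# Three properties (p = 3), so c is (r * p) by p = 6 by 3.
# The first p-by-p block pairs with the first row of model, and so on.
nugget_sills = [[0.5, 0.1, 0.0], [0.1, 0.4, 0.1], [0.0, 0.1, 0.3]]
linear_sills = [[40.0, 5.0, 2.0], [5.0, 30.0, 4.0], [2.0, 4.0, 20.0]]
c = nugget_sills + linear_sills

# For the linear model the slope is the arbitrary sill divided by the
# arbitrary range: here 40 / 10 = 4 for the first property.
slope_first_property = linear_sills[0][0] / model[1][1]

print(shape(x0), shape(model), shape(c), slope_first_property)
```

Because only the ratio of sill to range matters for a linear model, scaling both by the same factor describes the same semivariogram.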
Output:
x0s is, of course, your answer. It is an $m$ by ($d + p$) matrix of the grid point estimates. The
first $d$ columns correspond to the grid point coordinates given in x0 and the last $p$ columns correspond
to the estimates of the properties at those grid point coordinates.
s is an $m$ by ($d + p$) matrix of the error estimates of the grid points. This is the big benefit to
kriging in that it provides you with not only an estimate of a property's value at a grid point,
but also an estimate of the uncertainty in that estimate.
sv is a 1 by $p$ vector of variances of points in the universe.
id is an (nk × $p$) by 2 matrix of the identifiers of the $\lambda$ (or, in Davis, $W$) weights for the last cokriging
system solved (i.e. the last grid point system of equations).
l is a ((nk × $p$) plus nc) by (ntok × $p$) matrix with the $\lambda$ (or $W$) weights and Lagrange multipliers
of the last cokriging system solved. In this program nc refers to the number of constraints
applied to the simple cokriging system.
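As a sketch of how the two main outputs might be combined, the snippet below pairs each estimate in x0s with plus or minus two standard deviations taken from s. It assumes that s is laid out like x0s (coordinates first, then one column per property) and that its property columns hold error variances, as in the water table example, where the square root of the error estimate gave one standard deviation. The function name is hypothetical, and the single row of data reuses the numbers from that example.

```python
import math


def confidence_bounds(x0s, s, d, k=2.0):
    """Return, for every grid point, its coordinates and (low, high) bounds.

    ASSUMPTION: the last columns of s are error variances, so the standard
    deviation is their square root; k = 2 gives roughly 95% bounds.
    """
    result = []
    for est_row, var_row in zip(x0s, s):
        coords = est_row[:d]
        bounds = [(est - k * math.sqrt(var), est + k * math.sqrt(var))
                  for est, var in zip(est_row[d:], var_row[d:])]
        result.append((coords, bounds))
    return result


# One grid point in 2-D (d = 2) with one property (p = 1).
x0s = [[3.0, 3.0, 125.3]]   # x, y, estimate in m
s = [[3.0, 3.0, 5.28]]      # x, y, error variance in m^2

for coords, bounds in confidence_bounds(x0s, s, d=2):
    print(coords, bounds)
```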
A word of caution: for some reason, Marcotte has set up cokri.m to turn off case sensitivity.
When the program has finished running, the variables Axb and axb are considered the same, and making
reference to a variable such as Axb will generate a variable or function not found
error. Simply issue the command casesen and case sensitivity will be restored. We have modified
the code we provide to you by simply commenting out the casesen off command with
%casesen off, so you needn't worry about this at first (but it is available, just remove the % sign).
5.5.4 Things to Remember
When using cokri.m it may be helpful to remember the following three insights as to how the
program works.
1. If the data have been properly detrended (rendered second order stationary), then it is only
logical to assume that the nuggets will be equal regardless of direction. As the lag goes
to zero the subgrid scale noise (composed of both real geophysical noise and measurement
error) will converge to the same value for all directions.
2. In a similar argument, if the data have been properly detrended, then the sills have to be
equal regardless of direction for each property (and cross-property). Think about the mostly
gridded example in class: the sill represents the total variance in the anomalies (residuals
= data minus trend). Just because you calculated the semivariances in different directions
doesn't mean you haven't used all of the data points. Since you've used all of the data and
the sill represents the total variance contained in the data (anomalies), the sills will also be equal
regardless of direction.
3. This last one may seem a little odd. The ranges should all be the same in a given direction,
regardless of the property or cross-property. Think of it this way: the decorrelation scale
length is always the same in a given direction; the medium (seawater, granitic batholith, etc.)
doesn't change even though the property might.
Now, of course, we've told you how difficult it is to render your data second order stationary,
and the above insights might not be strictly, numerically true. Your options are to return to your
trend surface analysis and see if you can't find a better filter to remove the large scale trend that
is contaminating your anomalies. Or, if the fitted parameters are close in value (remember that
nlleasqr.m gives you error estimates of these parameters), averaging them can still yield useful
results. Remember, the sill and nugget are averaged over directions, but the ranges are averaged
over properties.
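The averaging rule in the last sentence can be sketched as follows. This is an illustration only: the dictionary layout, the function name and all fitted values are hypothetical, standing in for whatever your own variogram fits return for each property (or cross-property) and direction.

```python
from collections import defaultdict


def average_fits(fits):
    """Average nuggets and sills over directions, and ranges over properties.

    fits maps (property_pair, direction) to a dict holding the fitted
    'nugget', 'sill' and 'range' for that variogram.
    """
    by_pair = defaultdict(list)
    by_direction = defaultdict(list)
    for (pair, direction), fit in fits.items():
        by_pair[pair].append(fit)
        by_direction[direction].append(fit)

    def mean(values):
        return sum(values) / len(values)

    # One nugget and one sill per property pair, whatever the direction.
    nuggets = {pair: mean([f["nugget"] for f in group])
               for pair, group in by_pair.items()}
    sills = {pair: mean([f["sill"] for f in group])
             for pair, group in by_pair.items()}
    # One range per direction, whatever the property.
    ranges = {direction: mean([f["range"] for f in group])
              for direction, group in by_direction.items()}
    return nuggets, sills, ranges


# Invented fits for two properties (A and B) in two directions (x and y).
fits = {
    ("A-A", "x"): {"nugget": 1.0, "sill": 10.0, "range": 30.0},
    ("A-A", "y"): {"nugget": 2.0, "sill": 12.0, "range": 50.0},
    ("B-B", "x"): {"nugget": 0.5, "sill": 4.0, "range": 34.0},
    ("B-B", "y"): {"nugget": 1.5, "sill": 6.0, "range": 54.0},
}

nuggets, sills, ranges = average_fits(fits)
print(nuggets, sills, ranges)
```

Averaging is only sensible when the fitted values agree within their error estimates; large disagreements point back to the detrending step.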
5.6 Problems
All of your problem sets are served from the web page:
http://eos.whoi.edu/12.747/problem_sets.html
which can be reached via a number of links from the main course web page. In addition, the date the
problem set comes out, the date it is due, and the date the answers will be posted are also available
in a number of locations (including the one above) on the course web page.
References
Clark, I., 1979, Practical Geostatistics, Elsevier, New York, 129 pp.
Cressie, N.A., 1993, Statistics for Spatial Data, Wiley-Interscience, New York, 900 pp.
Davis, J.C., 1986, Statistics and Data Analysis in Geology, 2nd Edition, John Wiley and Sons,
New York, 646 pp.
deBoor, C., 1978, A Practical Guide to Splines, Springer-Verlag, New York, 392 pp.
Marcotte, D., 1991, Cokriging with MATLAB, Comp. and Geosci., 17(9): 1265-1280.