p. goovaerts geostatistical tools for characterizing the...
Post on 30-Apr-2020
3 Views
Preview:
TRANSCRIPT
P. GoovaertsDepartment of Civil and Environmental Engineering,The University of Michigan, EWRE Bldg, Room 117,Ann Arbor, MI 48109-2125, USAe-mail: goovaert@engin.umich.edu, Tel.: c001-734-9360141,Fax: c001-734-7632275
Biol Fertil Soils (1998) 27 :315–334 Q Springer-Verlag 1998
REVIEW ARTICLE
P. Goovaerts
Geostatistical tools for characterizing the spatial variability of
microbiological and physico-chemical soil properties
Received: 6 January 1998
Abstract This paper reviews the main applications ofgeostatistics to the description and modeling of the spa-tial variability of microbiological and physico-chemicalsoil properties. First, basic geostatistical tools such asthe correlogram and semivariogram are introduced tocharacterize the spatial variability of each attribute sep-arately as well as their spatial interactions. Then, thekey issue of fitting permissible models to experimentalsemivariograms is addressed for the univariate andmultivariate situations. Capitalizing on this model ofspatial dependence, the value of a soil property can bepredicted at unsampled locations using only observa-tions of this particular property (kriging) or incorporat-ing additional information provided by other correlatedproperties (cokriging). Factorial kriging allows one todiscriminate the different sources of spatial variation insoil on the basis of the scale at which they operate, andit often enhances relations between soil attributeswhich were blurred in a traditional correlation analysiswhere the different sources of variations are mixed.Geostatistics can also be used to assess the risk of ex-ceeding critical values (regulatory thresholds, soil quali-ty criterion) at unsampled locations, and to simulatethe spatial distribution of attribute values. All the dif-ferent tools are illustrated using two transects of100 pH and electrical conductivity values measured inpasture and forest.
Key words Geostatistics 7 Spatial variability 7Kriging 7 Scale-dependent correlation 7 Stochasticsimulation
Introduction
Soil scientists are aware that soil properties vary spa-tially, and they have already recorded strong fluctua-tions even over short distances (Trangmar et al. 1985;Warrick et al. 1986). Behind a locally erratic aspect,some spatial structure is often discerned and may berelated to the combined action of several physical,chemical or biological processes that act at differentspatial scales. The characterization of the spatial varia-bility of soil attributes is essential to achieve a betterunderstanding of complex relations between soil prop-erties and environmental factors. Also, a model of spa-tial dependence between soil data can later be used toestimate attributes at unsampled locations, leading, forexample, to better recommendations for the applica-tion of fertilizers or pesticides.
Geostatistics provides a set of statistical tools for in-corporating the spatial coordinates of soil observationsin data processing, allowing for description and model-ing of spatial patterns, prediction at unsampled loca-tions, and assessment of the uncertainty attached tothese predictions. Since the publication of the first ap-plications of geostatistics to soil data in the early 1980s(Burgess and Webster 1980a, b; Webster and Burgess1980; Burgess et al. 1981), geostatistical methods havebecome popular in soil science, as illustrated by the in-creasing number of studies reported in the literature(Goovaerts 1998). The use of geostatistics is not con-fined to physico-chemical soil properties, but studieshave also been conducted on the spatial patterns of soilmicro-organisms (Wollum and Cassel 1984; Websterand Boag 1992), plants (Vieira et al. 1983; Sutherlandet al. 1991), and other organisms living at the surface ofthe soil (Rossi et al. 1992).
The greater accessibility of geostatistical software(e.g., Englund and Sparks 1991; Pannatier 1996;Deutsch and Journel 1998) has increased the risk thatgeostatistical tools are used without a good commandof the underlying theory, in particular in the field of
316
Fig. 1 Transects of 100 pH values measured in the topsoil of apasture and a forest plot
Fig. 2 Histograms of soil pH values measured in a pasture and aforest plot
semivariogram modeling. Moreover, in several situa-tions, inappropriate or non-optimal approaches are ad-opted because scientists are unaware of recent develop-ments, such as factorial kriging, indicator geostatisticsor stochastic simulation.
This paper aims at providing soil scientists with anapplied introduction to basic descriptive and predictivegeostatistical tools such as the semivariogram and krig-ing, while presenting recent developments in multivar-iate geostatistics and the modeling of local and spatialuncertainty. The different tools will be used for a step-by-step analysis of a one-dimensional set of soil data,which consists of two transects of 100 topsoil pH andelectrical conductivity values measured every meter inpasture and forest.
Description of spatial patterns
Figure 1 shows two series of 100 topsoil pH valuesmeasured every meter along transects, one in a pastureand the other in a forest (Goovaerts et al. 1989). Toooften, analysis of such data amounts to plotting the his-togram and computing summary statistics such as meanand standard deviation (Fig. 2). By so doing, one ig-nores critical information, that is the spatial location ofpH measurements. For example, the comparison of
standard deviations suggests that the variability of pHis similar for both land uses. The shape of pH transectsindicates, however, that pH values vary over shorterdistances in the forest than in the pasture! This spatialfeature, which may be important for interpretation, isnot captured if one ignores the spatial information.Geostatistics provides a set of statistical tools for de-tecting and quantifying the major scales of spatial varia-bility.
The correlogram
Let z(ua), ap1, 2, . . ., n denote the set of np100 pHvalues measured in a pasture, where ua is the vector ofspatial coordinates of the ath observation. Figure 1(top graph) shows that the distribution of pH valuesalong the transect is not fully random in that observa-tions that are close to each other on the ground tend tobe more alike than those further apart. The similaritybetween adjacent pH values can be depicted by plottingeach observation z(ua) versus the one measured 1 maway, z(uach) with hhhp1 m. In this example, 99 pairsof pH measurements (z(ua), z(uach)) can be formedfrom the initial set of 100 values, and the resulting plotis called an h-scattergram (Fig. 3; left top graph). Read-ers are likely to be familiar with the scattergram, which
317
Fig. 3 Scattergrams betweensoil pH values separated by adistance of 1 m, 2 m, . . ., 10 m.Note how the correlation be-tween observations decreasesas the separation distance in-creases
is a plot of all pairs of values related to two differentattributes measured at the same locations. By analogy,the h-scattergram is a plot of all pairs of measurementsof the same attribute z (i.e., pH) at locations separatedby a given distance hhh in a particular direction. Byconvention, the value at the start of the vector h, z(ua),is called the tail value, whereas the value at the end,z(uach), is the head value.
The shape of the cloud of points on the h-scatter-gram indicates that there is some correlation betweenadjacent pH values, and this can be measured using thelinear correlation coefficient which is traditionally usedto assess correlation between different attributes. Thecorrelation is 0.64 and agrees with our visual impres-sion that the pH value at any location is related to thepH measured 1 m away. Intuitively, one would expectthat the relation between pH values weakens for a sep-aration distance of 2 m, which is confirmed by the sec-
ond h-scattergram of Fig. 3 (right top graph): the corre-lation drops from 0.64 to 0.51. The increasing inflationof the cloud of points with increasing distance reflectsthe decreasing similarity of measurements fartherapart: the correlation at 10 m is only 0.19!
The plot of the estimated correlation coefficients asa function of the separation distance is called the ex-perimental correlogram. Figure 4 (top graph) shows thecorrelograms for pH measured in a pasture (solid line)and in a forest (dashed line), respectively. For the pas-ture, the correlation becomes negligible at a separationdistance of about 12 m, which is referred to as the rangeand is interpreted at the distance beyond which two pHvalues can be considered as statistically independent.The decline in correlation is much sharper in the forestin that any two pH values 3 m apart can already be con-sidered as independent. The correlogram thus allowsone to quantify in terms of correlation the visual im-
318
Fig. 4 Correlograms and semivariograms for the transects of 100soil pH values measured in pasture (solid line) and forest (dashedline)
pression that small-scale fluctuations prevail for thetransect of forest pH values displayed at the bottom ofFig. 1.
The semivariogram
In the geostatistical literature, spatial patterns areusually described in terms of dissimilarity (instead ofsimilarity) between observations as a function of theseparation distance. The average dissimilarity betweendata separated by a vector h is measured by the experi-mental semivariogram g (h), which is computed as halfthe average squared difference between the compo-nents of every data pair:
g(h)p1
2N(h)
N(h)
Aap1
[z(ua)Pz(uach)]2, (1)
where N(h) is the number of data pairs for a given dis-tance, e.g., N (h)p99 for hhhp1 m for the transects ofFig. 1.
Figure 4 (bottom graph) shows the semivariogramsof pH measured in pasture (solid line) and forest(dashed line), respectively. In both cases, semivario-
gram values increase with the separation distance, re-flecting our intuitive feeling that two pH values close toeach other on the ground are more alike, and thus theirsquared difference is smaller, than those further apart.The two semivariograms stop increasing at a given dis-tance, called the range, which corresponds to the rangeidentified on the correlogram, and again it can be inter-preted as the distance of dependence or zone of in-fluence of pH measurements. Beyond the range, thetwo semivariograms fluctuate around the same sill val-ue, which is roughly equal to the sample variance com-puted on Fig. 2. In other words, the overall variation ofpH measurements across the transect is similar for bothland uses, yet the larger values of the forest semivario-gram at short distances indicate that most of this varia-bility is present over a few meters in forest. The discon-tinuity at the origin of the semivariogram (i.e., zero sep-aration distance) is called the nugget effect and arisesfrom measurement errors or spatial sources of variationat distances smaller than the shortest sampling intervalor both (Journel and Huijbregts 1978, p. 39). Depend-ing on the sampling scale, a semivariogram can appearas a pure nugget effect even if the variation is locallyspatially structured, and so it might be useful to con-duct a preliminary survey to approximate the majorscales of spatial variation (Oliver and Webster 1986a).
In combination with a good knowledge about thephenomenon and the study area, such a spatial descrip-tion can improve our understanding of the physical un-derlying mechanisms controlling spatial patterns. Forexample, McBratney and Webster (1981) related theperiodicity displayed by the correlogram of wheat yieldmeasured in a 0.4 ha area to an earlier ploughing. Ro-bertson et al. (1988) found nitrogen availability in a0.5 ha old field to be highly patterned at scales 20–40 m,which corresponds to the scales over which plant com-munity composition varies during early succession, sug-gesting that the community structure is strongly in-fluenced by the spatial heterogeneity of available nu-trients. Goovaerts (1994a) found a remarkable similari-ty between the shapes of semivariograms for eight soilchemical properties and six banana leaf contents meas-ured across a 100 km2 area, and he was able to relatethis common spatial pattern to the geographical distri-bution of soil types in the study area. Other examplesin earth sciences can be found in Oliver and Webster(1986b). In the present study, the short-range variabili-ty of forest pH values probably reflects the local heter-ogeneity of ground vegetation compared with the uni-formity of grass cover.
The information collected during soil surveys rarelyreduces to a single transect, but a two-dimensional do-main is typically sampled. It is thus necessary to charac-terize the spatial variability along different directions.For example, Fig. 5 (left column) shows two grids of pHvalues which were measured in the same plots as thetwo transects of Fig. 1. Semivariograms have been com-puted along the N-S (solid line) and E-W (dashed line)directions. Data pairs have been grouped into classes of
319
Fig. 5 Grids of soil pH valuesmeasured on 1 ha pasture andforest plots, and the corre-sponding semivariograms com-puted in the NS (solid line)and EW (dashed line) direc-tions. For pasture, the varia-bility is said to be anisotropicin that it depends on the di-rection
distances (lags) and angles, [hBDh] and [uBDu], sothat each semivariogram value is computed from a suf-ficient number of data pairs (i.e., N(h)130). In the ex-ample of Fig. 5, the two classes of angles are 07B457
and 907B457, whereas the lag tolerance is 5 m. Thesegraphs show distinct spatial patterns of acidity on bothland uses: pH appears to vary more continuously inpasture as illustrated by the smaller nugget effect andlarger range of the corresponding semivariogram. An-other important feature is that the variability of pH val-ues in pasture is anisotropic, that is it depends on thedirection. The continuous increase of the N-S semivar-iogram reflects a trend in pH along that direction; thelarge pH values in the southern part of the plot are infact due to the liming of the neighboring plot a fewhours before sampling.
The cross correlogram
In most situations, several properties are measured oneach soil sample, and geostatistics is increasingly usedto process such multivariate spatial soil information(Wackernagel 1988; Goovaerts 1992). For example, inaddition to pH, electrical conductivity was measured ateach of the 100 locations along the transect in pasture(Fig. 6). Typically, the relationship between two soilproperties is assessed by plotting the scattergram of thetwo variables and computing the corresponding corre-
lation coefficient (Fig. 7; left top graph). Once again,such an approach ignores the spatial coordinates of themeasurements.
Instead of looking at the relation between pH andconductivity measured at the same location, one mayassess the correlation between adjacent observations ofboth attributes by plotting each pH value z(ua) versusthe conductivity value measured 1 m away, y(uach)with hhhp1 m. The so-called cross h-scattergram is dis-played at the top of Fig. 7 (right graph). Surprisingly,the correlation between the two attributes increaseswith the separation distance, an effect referred to as thelag effect (Journel and Huijbregts 1978, p. 48). Such aneffect can be observed in geochemistry, where differentrates of precipitation may cause enrichment in someminerals to lag behind that of others along the directionof hydrothermal flow. In this particular example, thelag effect is not backed by any physical interpretationand most likely reflects sample fluctuations.
Figure 7 shows cross h-scattergrams for a separationdistance up to 10 m. The plot of the estimated correla-tion coefficient as a function of the separation distanceis called the experimental cross correlogram (Fig. 8, topgraph). Note that two cross correlograms can be com-puted, depending on whether a pH value is comparedwith the conductivity value east of it (easterly direc-tion) or west of it (westerly direction). In both cases,the correlation between pH and electrical conductivitytends to decrease with the separation distance, and be-comes negligible around 15 m.
320
Fig. 6 Transect of 100 pH and electrical conductivity valuesmeasured in the topsoil of a pasture
Fig. 7 Scattergrams between soil pH and electrical conductivityvalues separated by a distance of 0 m, 1 m, 2 m, . . ., 10 m. Notehow the correlation between observations decreases as the sepa-ration distance increases
The cross semivariogram
Instead of looking at the similarity between pH andconductivity values separated by a vector h say z(ua)and y(uach), one may consider how these two attri-butes jointly vary over a vector h, that is the relationbetween the increments [z(ua)–z(uach)] and[y(ua)Py(uach)]. If both attributes are positively cor-related, an increase (decrease) in pH values from ua touach tends to be associated with an increase (de-crease) in conductivity values. Conversely, a negativecorrelation between attributes would entail that an in-crease (decrease) in pH values tends to be associatedwith a decrease (increase) in conductivity values.
A measure of the joint variability of two continuousattributes z and y is the experimental cross semivario-gram which is computed as:
gZY(h)
p1
2N(h)
N(h)
Aap1
[z(ua)Pz(uach)]7[y(ua)Py(uach)] (2)
The rescaling of cross semivariogram values by the cor-responding direct semivariogram values yields the co-dispersion coefficient nZY(h) which can be interpretedas a linear correlation coefficient between the spatialincrements of both attributes (Goovaerts 1997a):
nZY(h)pgZY(h)
;gZZ(h)7gYY(h)B [P1,c1] (3)
Figure 8 shows the experimental cross semivario-gram (middle graph) and codispersion function (bot-tom graph) between pH and electrical conductivity inpasture. Note that unlike the cross correlogram, thecross semivariogram is identical in the easterly andwesterly directions. An interesting feature is the changein sign of cross semivariogram values, which indicatesthat the two attributes vary in opposite ways at shortdistances (small scale), whereas at larger distances theirincrements are positively related. Such scale-dependentrelations may reflect the existence of different physico-chemical processes controling the fluctuations in pHand electrical conductivity values at different scales.Changes in the correlation between soil attributes as afunction of the spatial scale have been reported in sev-eral studies, e.g., see Wackernagel (1988), Goulard andVoltz (1992), Goovaerts (1992, 1994a), Goovaerts andWebster (1994), and Dobermann et al. (1995, 1997).
321
Fig. 8 Experimental cross correlogram and semivariogram forthe transect of 100 soil pH and electrical conductivity valuesmeasured in pasture
The indicator semivariogram
The different tools introduced so far allow the quantifi-cation of the spatial variability over the full range ofattribute values. In most applications, such descriptionshould be supplemented by the characterization of thespatial distribution of specific classes of values. For ex-ample, knowledge of whether large values are clusteredor scattered in space can improve our interpretation ofunderlying physical processes. In many environmentalstudies, a few random hot spots of large pollutant con-centrations caused by local human activities coexistwith a natural background of small values (Leonte and
Schofield 1996; Mohammadi et al. 1997). Similarly,many microbiological processes in soil, such as denitri-fication, show hot spots of high activity which may berelated to the patchy distribution of organic material insoil (Parkin 1987; Robertson et al. 1988).
The characterization of the spatial distribution of z-values above or below a given threshold value zk re-quires a prior coding of each observation z(ua) into anindicator datum i(ua;zk), defined as:
i(ua;zk)p51 if z(ua)^zk
0 otherwise(4)
Indicator semivariograms can then be computed bysubstituting indicator data i(ua;zk) for z-data z(ua) inEq. (1):
gI(h;zk)p1
2N(h)
N(h)
Aap1
[i(ua;zk)Pi(uach;zk)]2 (5)
The indicator variogram value 2gI(h;zk) measures howoften two z-values separated by a vector h are on oppo-site sides of the threshold value zk. In other words,2gI(h;zk) measures the transition frequency betweentwo classes of z-values as a function of h. The greatergI(h;zk), the less connected in space are the small orlarge values.
Figure 9 shows the indicator semivariograms com-puted for the lower and upper quartiles of the distribu-tions of pH values in pasture and forest. For forestsoils, the two indicator semivariograms have a verylarge nugget, which indicates that small and large pHvalues are almost randomly distributed along the trans-ect. Indicator semivariograms for pasture measure-ments are clearly different: the shorter range of the up-per quartile semivariogram reflects the existence ofshort high-valued (pH larger than 6.92) segments alongthe transect, whereas small pH values are better con-nected in space as indicated by smaller values of thelower quartile semivariogram.
Indicator cross semivariogram gI (h;zk,zkb) can becomputed by substituting indicator data defined for twodifferent threshold values zk and zkb for z and y-data inthe equation (2):
gI(h;zk,zkb)p1
2N(h)
N(h)
Aap1
[i(ua;zk)
Pi(uach;zk)]7[i(ua;zkb)Pi(uach;zkb)] (6)
Webster and Boag (1992) used a combination of indica-tor direct and cross semivariograms to characterize thespatial distribution of the number of cyst nematodes intopsoil. They found that as the threshold value zk in-creased, the range of indicator direct semivariogramsshortens progressively, whereas the nugget effect of thecross semivariograms decreased, which indicates aborder effect with infestation increasing gradually fromthe outside of patches towards their centres. Biological-ly, the distribution can be explained by the nematodebeing spread by cultivation from initial fortuitous fociof infestation or control by nematophagous fungi orboth.
322
Fig. 9 Experimental indicator semivariograms for soil pH valuesmeasured in pasture and forest. The two threshold values corre-spond to the lower (solid line) and upper (dashed line) quartilesof the sample distributions of Figure 2
Indicator semivariograms can also be used to charac-terize the spatial patterns of categorical variables, suchas texture or water table classes, which take only a lim-ited number of states which might be ordered or not.Let S be a categorical attribute with possible states sk,kp1,2, . . .,K. The K states are exhaustive and mutuallyexclusive in the sense that one and only one state sk
occurs at each data location ua. The pattern of variabil-ity of a category sk can be characterized by semivario-grams of type (5) defined on an indicator coding of thepresence of that category:
i(ua;sk)p51 if s(ua)psk
0 otherwise(7)
The indicator variogram value 2gI(h; sk) measures howoften two locations a vector h apart belong to differentcategories skb(sk. The smaller 2gI(h; sk), the greaterthe spatial connectivity of category sk. For example, ca-tegorical indicator semivariograms have been used tocharacterize the spatial distribution of water tableclasses (Bierkens and Burrough 1993a, b) and the spa-tial arrangement of soil types (Goovaerts 1994b).
Semivariogram modeling
One of the main applications of geostatistics is the pre-diction of attribute values at unsampled locations. Pre-diction is made possible by the existence of spatial de-pendence between observations as assessed by the cor-relogram or semivariogram. A key step between de-scription and prediction is the modeling of the spatialdistribution of attribute values. Most of geostatistics isbased on the concept of random function, whereby theset of unknown values is regarded as a set of spatiallydependent random variables. Each measurement z(ua)is thus interpreted as a particular realization of a ran-dom variable Z(ua). Interested readers should refer totextbooks such as Isaaks and Srivastava (1989, pp. 196–236) or Goovaerts (1997a, pp. 59–74) for a detailedpresentation of the theory of random functions. An im-portant characteristic of the random function is its sem-ivariogram which must be modeled from the experi-mental values.
The univariate case
Let [g(hk),kp1, . . .,K] be the set of experimental semi-variogram values computed for a finite number of lags,say the 25 semivariogram values displayed at the bot-tom of Fig. 4 for either pasture or forest pH measure-ments. A continuous function must be fitted to thesevalues so as to deduce semivariogram values for anypossible lag h required by prediction algorithms andalso to smooth out sample fluctuations. The difficulty isthat only functions that are conditionally negative defi-nite can be considered as semivariogram models, in or-der to ensure the non-negativity of the variance of theprediction error, see later expression (21). In practice,only a few models are known to be permissible, andFig. 10 shows the ones that are included in most geosta-tistical software. The three top models are bounded inthat they reach a sill either at a given range value(spherical model) or asymptotically (exponential,Gaussian model):I Spherical model with range a
g(h)pSph1ha2p 51.57
ha
P0.571ha2
3
if h~a
1 otherwise(8)
I Exponential model with distance parameter b
g(h)p1Pexp1Phb 2 (9)
I Gaussian model with distance parameter b
g(h)p1Pexp1Ph2
b2 2 (10)
The practical range is defined as the distance at whichthe model value is at 95% of the sill, that is 3b for theexponential model and ;3b for the Gaussian model.
323
Fig. 10 Bounded semivariogram models with the same practicalrange (top graph), and power models for different values of theparameter v (bottom graph)
The bottom graph shows power models for differentvalues of the parameter v:
g(h)phv with 0~v~2 (11)
These functions allow one to model different behav-iours at infinity (bounded, unbounded) and at the ori-gin (linear, quadratic) of the semivariogram. Severalauthors (McBratney and Webster 1986; Wackernagel1995; Goovaerts 1997a) have warned users of the riskof numerical instability when using the Gaussian modelfor spatial interpolation (kriging). A common recom-mendation is to add a small nugget effect to the Gaus-sian model to destroy its extreme extrapolative proper-ties and avoid the generation of artifacts in interpolatedmaps. A better solution is never to use the Gaussianmodel. If the experimental semivariogram appears tohave both an asymptote and an inflexion, it can bemodeled using the following function:
g(h)p1Pexp1Phv
b 2 with 0~v~2 (12)
where the parameter v can be adjusted (Wackernagel1995, p. 110; Webster 1997). Note that for vp2, oneretrieves the Gaussian model (10). A theoreticallysounder alternative is to use Whittle’s elementary cor-relation (McBratney and Webster 1986).
Typically, two or more permissible models must becombined to fit the shape of the experimental semivar-iogram. For example, the semivariogram of pH on pas-ture has been modeled using a combination of a nuggeteffect and two spherical models of range 6 m and 26 m
Fig. 11 Experimental semivariograms for soil pH values meas-ured along the pasture and forest transects with the model fitted
(Fig. 11; top graph). The semivariogram of forest pHhas been modeled using a nugget effect and an expon-ential model of practical range 4 m. Combinations ofpermissible models are permissible as long as the con-tribution of each basic model is positive, that is thenested model is written as:
g(h)pL
Alp0
bl gl (h) with bl60 (13)
where bl is the positive sill or slope of the correspond-ing basic semivariogram model gl(h).
The way in which these permissible models are cho-sen and their parameters (range, sill) are estimated isstill controversial (McBratney and Webster 1986;Goovaerts 1997a, pp. 97–107). Several methods havebeen proposed, ranging from full blackbox proceduresin which the choice of model and its fitting are auto-matic to visual approaches where the model is selectedso that the fit is satisfactory from a graphical point ofview, as in the examples of Fig. 11. An intermediate ap-proach consists of an automatic (least-squares) estima-tion of parameters of models chosen by the user. In thissemi-automatic procedure, the objective is usually tominimize a weighted sum of squares (WSS) of differ-
324
ences between experimental g (hk) and model g(hk)semivariogram values:
WSSpK
Akp1
v(hk)7[g(hk)Pg(hk)]2 (14)
The weight v(hk) given to each lag hk is often takenproportional to the number N(hk) of data pairs thatcontribute to the estimate g(hk). The implicit assump-tion is that the reliability of an experimental semivario-gram value increases with statistical mass. An alterna-tive that gives more weight to the first lags consists ofdividing the number of data pairs by the squared modelvalue: N(hk)/[g(hk)]2 (Cressie 1985).
The value of the WSS criterion is frequently used torank alternative models, say to compare the goodness-of-fit of a spherical model versus an exponential one. Inmany situations, the user tends to increase the numberof basic structures in the nested model (13) to lower theWSS value as much as possible. However, one shouldavoid overfitting experimental semivariograms: themore complicated model generally does not lead tomore accurate estimates. McBratney and Webster(1986) proposed using the Akaike Information Criteri-on to achieve a satisfactory compromise between good-ness-of-fit and parsimony of the model.
Semivariogram modeling is rarely a goal per se. Theultimate objective is usually to estimate the value of asoil property at an unsampled place. Cross validationallows one to compare the impact of different semivar-iogram models on interpolation results (McBratney andWebster 1986; Isaaks and Srivastava 1989, pp. 351–368). The idea consists of removing one datum at atime from the data set and re-estimating this value fromremaining data using different semivariogram models.Interpolated and actual values are compared, and themodel that yields the most accurate predictions is re-tained. A measure of accuracy is the mean absolute er-ror of prediction (MAE), which is computed as the av-erage absolute difference between actual and estimatedvalues at the n sampled locations:
MAEp1n
n
Aap1
hz (ua)Pz*(ua)h (15)
In the presence of anisotropic (direction-dependent)variation, one must build a model that describes howthe semivariogram changes as the distance and direc-tion change. Two types of anisotropy are usually distin-guished: geometric and zonal. In the first case, the di-rectional semivariograms have the same shape and sillbut different range values that plot as an ellipse as afunction of the azimuth of the direction. An anisotropythat involves sill values varying with direction is said tobe zonal. These two models of spatial anisotropy arepresented in textbooks such as Isaaks and Srivastava(1989, pp. 377–390) and Goovaerts (1997a, pp. 90–95).
Regardless of the fitting procedure adopted, onemust keep in mind that the objective is to build a per-missible semivariogram model that captures the major
spatial features of the attribute under study. The use ofleast-squares criteria amounts to reducing semivario-gram modeling to an exercise in fitting a curve to ex-perimental values, which I think is too restrictive. Al-though experimental semivariogram values play an im-portant role in this process, ancillary information suchas provided by physical knowledge of the area and phe-nomenon may be of great interest. For example, strongprior qualitative information may lead one to adopt ananisotropic model even if data sparsity prevents seeinganisotropy from the experimental semivariograms com-puted in different directions.
Bivariate case
Modeling the coregionalization between two variablesZ and Y involves choosing and fitting functions to thetwo direct semivariograms gZZ(h) and gZZ (h) plus thecross semivariogram gZY(h). The difficulty lies in thefact that the three models cannot be built indepen-dently from one another. The easiest approach consistsof modeling the three semivariograms as linear combi-nations of the same set of basic semivariogram modelsgl(h):
gZZ(h)pL
Alp0
blZZ gl (h)
gYY (h)pL
Alp0
blYY gl(h)
gZY(h)pL
Alp0
blZY gl(h)
The so-called linear model of coregionalization (LMC)is permissible if: (1) the (Lc1) basic models gl(h) arepermissible, and (2) the coefficients satisfy the follow-ing necessary and sufficient conditions:
blZZ60, bl
YY60 Gl (16)
hblZYh^;bl
ZZ blYY Gl (17)
In practice, the modeling is performed in two steps:1. Both direct semivariograms are first modeled as lin-
ear combinations of selected basic structures gl(h).2. The same basic structures are then fitted to the cross
semivariogram under the constraint (17).This approach was used to fit visually the followingmodel to the (cross) semivariograms of pH and electri-cal conductivity displayed in Fig. 12:
gPH(h)p0.007g0(h)c0.01 Sph(h/6 m)c0.0065 Sph(h/26 m)
gEC(h)p100g0(h)c155 Sph(h/26 m)gpH-EC(h)pP0.4g0(h)c1.0 Sph(h/26 m) (18)
where g0(h) is a nugget effect model, and Sph(h/6 m) isthe spherical model (8) with range ap6 m. For each ofthe three basic models, the coefficients of the directsemivariograms are positive and the inequality (17) is
325
Fig. 12 Experimental direct and cross semivariograms for soil pHand electrical conductivity along the pasture transect, with the lin-ear model of coregionalization fitted
satisfied: hP0.4h^;0.0077100, h0h^;0.0170, andh1h^;0.00657155.
The requirement that all semivariograms must sharethe same set of basic structures may sound like a severelimitation on the linear model of coregionalization.Variables that are well cross-correlated are, however,likely to show similar patterns of spatial variability. Inaddition, there is no need for the direct and cross semi-variograms to include all the basic structures; for exam-ple, the semivariogram of electrical conductivity andthe cross semivariogram do not include the short-range(6 m) spherical structure.
Multivariate case
Soil studies typically involve more than two attributes,and it is often necessary to model the coregionalizationfor a large number of variables. Checking the permissi-bility of the LMC becomes cumbersome because thesimple constraints (16) and (17) do not suffice for morethan two variables. Mathematically speaking, one mustcheck that, for each structure gl(h), the matrix of b-coefficients is positive semi-definite. A symmetric ma-trix is positive semi-definite if its determinant and all itsprincipal minor determinants are non-negative. Consid-er, for example, the linear model of coregionalizationfor three variables Z, Y and X:
3gZZ(h)gYZ(h)gXZ(h)
gZY(h)gYY(h)gXY(h)
gZX(h)gYX(h)gXX(h)
4pL
Alp0 3
blZZ
blYZ
blXZ
blZY
blYY
blXY
blZX
blYX
blXX4gl (h)
For each structure gl(h), the matrix of coefficients mustsatisfy the following seven inequalities:I All diagonal elements are non-negative:
blZZ60 bl
YY60 blXX60
I All principal minor determinants of order 2 are non-negative:
)blZZ
blYZ
blZY
blYY)pbl
ZZ blYYP[bl
YZ]260
)blYY
blXY
blYX
blXX
)pblYY bl
XXP[blXY]260
)blZZ
blXZ
blZX
blXX
)pblZZ bl
XXP[blXZ]260
I The determinant of order 3 is non-negative:
)bl
ZZ
blYZ
blXZ
blZY
blYY
blXY
blZX
blYX
blXX)pbl
ZZ (blYY bl
XXP[blYX]2)
PblZY(bl
YZ blXXPbl
XZ blYX)
cblZX(bl
YZ blXYPbl
XZ blYY)60
An iterative procedure that fits the linear model ofcoregionalization directly under such a constraint ofpositive semi-definiteness has been developed by Gou-lard (1989). This technique is increasingly used to mod-el the coregionalization of soil properties (Goovaerts1992; Goulard and Voltz 1992; Voltz and Goulard 1994;Webster et al. 1994; Dobermann et al. 1995, 1997).
Spatial prediction
Most geostatistical studies in soil science aim at esti-mating soil properties at unsampled places and map-ping them. Kriging is a generic name adopted by thegeostatisticians for a family of generalized least-squaresregression algorithms (Webster 1996). There are manydifferent kriging algorithms, and most of them are re-viewed in Goovaerts (1998) with references to soil ap-
326
plications. In this paper, only the most common ordina-ry kriging and cokriging are presented.
Ordinary kriging
Consider the problem of estimating the value of a con-tinuous soil attribute z at an unsampled location u us-ing only data on this attribute [z(ua), ap1, . . ., n]. Ordi-nary kriging (OK) estimates this value as a linear com-bination of neighboring observations:
z*OK(u)p
n(u)
Aap1
la(u)z(ua) (19)
Typically, the 10–20 observations closest to u being es-timated are retained because they screen the influenceof those farther away, that is n(u)Pn. The key issue isthe determination of the weight la(u) assigned to eachobservation.
As for other linear regression procedures, ordinarykriging weights are chosen so as to minimize the esti-mation or error variance s2
E(u)pVar [Z*(u)PZ(u)]under the constraint of unbiasedness of the estimator.These weights are obtained by solving a system of lin-ear equations which is known as “ordinary kriging sys-tem”:
5n(u)
Abp1
lb(u) g(uaPub)Pm(u)pg(uaPu)
ap1, . . ., n(u) (20)n(u)
Abp1
lb(u)p1
Unbiasedness of the estimator is ensured by constrain-ing the weights to sum to one, which requires the defi-nition of the Lagrange parameter m(u). The only infor-mation required by the kriging system are semivario-gram values for different lags, and these are readily de-rived from the semivariogram model fitted to experi-mental values (recall previous section).
In addition to an estimate for the unknown z-value,ordinary kriging provides an error variance which iscomputed as:
s2OK(u)p
n(u)
Aap1
la(u)g(uaPu)Pm(u) (21)
Under stringent hypotheses of normality and homosce-dasticity, the kriging variance can be combined with theestimated value to derive a confidence interval; for ex-ample, the 95% confidence interval is taken as:
Prob{Z(u)B[z*OK(u)P2sOK(u),z*
OK(u)c2sOK(u)]}p0.95 (22)
Figure 13 (left column) shows the ordinary kriging esti-mates and error variances for pH along the transect inpasture. Only 20 pH values out of the 100 original datawere considered here so that estimated values can belater compared with actual measurements to assess the
prediction performances of kriging. The estimation wasperformed every meter using the ten closest pH valuesand the semivariogram model displayed at the top ofFig. 11. The ordinary kriging estimator is exact in that ithonors observations at data locations, and the krigingvariance is zero there. The error variance increasesaway from the data and reaches a maximum valuebeyond the extreme right datum (extrapolation situa-tion).
Besides providing an error variance, kriging has sev-eral advantages over other interpolation techniquessuch as the inverse distances algorithm: (1) it accountsfor the pattern of spatial variability (range, anisotropy)through the semivariogram model, and (2) it allows es-timation of the target attribute on a support that is dif-ferent from the data support, e.g., pH values measuredon soil cores can be used to predict average pH valueson larger surfaces, such as 1 ha plots, which are bettersuited for land management (Burgess and Webster1980b).
Ordinary cokriging
When measurements are sparse or poorly correlated inspace, the estimation of the primary attribute of inter-est is generally improved by taking into account sec-ondary information originating from other correlatedcontinuous attributes. The ordinary kriging estimate(19) is readily extended to incorporate that additionalinformation. In the simplest case of a single secondaryattribute Y, the so-called ordinary cokriging (OCK) es-timate is written as a linear combination of both neigh-boring primary and secondary data:
z*OCK(u)p
n(u)
Aap1
la(u)z(ua)cnb(u)
Aabp1
nab(u)y(uab) (23)
where secondary data have been measured at possiblydifferent locations uab. Like ordinary kriging, the objec-tive is to minimize the error variance under unbiased-ness constraint, which yields a very complex system of(n(u)cnb(u)c2) linear equations:
5n(u)
Abp1
lb(u)gZZ(uaPub)cnb(n)
Abbp1
nbb(u)gZY(uaPubb)
PmZ(u)pgZZ(uaPu) ap1, . . .,n(u)n(u)
Abp1
lb(u)gYZ(uabPub)cnb(u)
Abbp1
nbb(u)gYY(uabPubb)
PmY(u)pgYZ(uabPu) abp1, . . .,nb(u) (24)n(u)
Abp1
lb(u)p1
nb(u)
Abbp1
nbb(u)p0
There are now two Lagrange parameters to account forthe constraints on primary and secondary data weights,and the input information comprises the values of di-rect and cross semivariograms for different lags. Thecokriging variance is computed as:
327
Fig. 13 Ordinary kriging and cokriging estimates of soil pH alongthe pasture transect, with the corresponding error variances. Theinformation available consists of 20 pH values (black dots) and100 electrical conductivity values
s2OCK(u)p
n(u)
Aap1
la(u) gZZ (uaPu)
cnb(u)
Aabp1
nab(u) gYZ(uabPu)PmZ(u) (25)
Figure 13 (right column) shows the ordinary cokrig-ing estimates and the corresponding error variances forpH along the transect in pasture. The informationavailable consists of 20 pH values plus the 100 electricalconductivity data shown at the bottom of Fig. 6. Theestimation was performed every meter using the tenclosest pH values and the ten closest electrical conduc-tivity values. The linear model of coregionalization ofFig. 12 was used to create the cokriging system. Ac-counting for this additional information yields a moredetailed estimated transect and smaller error variancesthan kriging does. The absolute estimation error wasalso computed at the 80 locations where the pH valuewas ignored, and results show that cokriging yields a10% reduction of the error: 0.0978 instead of 0.1031 forkriging. The small magnitude of the reduction is ex-plained mainly by the weak correlation (rp0.33) be-tween pH and electrical conductivity. The benefit of co-kriging over kriging increases as the correlation be-tween the two attributes increases and as the secondary
information becomes more abundant than the primaryinformation. Cokriging is thus particularly useful whena cheap-to-measure attribute is strongly correlated withthe attribute of interest.
Many examples of cokriging can be found in the soilliterature. For example, Yates and Warrick (1987) esti-mated gravimetric moisture content in a 1 ha plot usingthe bare soil surface temperature and the percent sandcontent as secondary variables. On a much larger scale,Gallichand and Marcotte (1993) mapped the clay per-centage for a study area of 33500 ha using 485 clay con-tent values and 3488 values of saturated hydraulic con-ductivity. Several authors (Leenaers et al. 1990; Odehet al. 1997) have shown how the readily available digitalelevation model can be used to improve the predictionof soil attributes through cokriging. Finally, Gotwayand Hartford (1996) demonstrated that accounting forcorn yield measurements through cokriging improvedthe prediction of residual nitrate in the field, leading tobetter recommendations for variable-rate N fertilizerapplication. Beware that several applications of cokrig-ing in the soil literature do not use a linear model ofcoregionalization, and even worse, the permissibility ofthe model fitted is rarely checked!
Factorial kriging
Any soil property measured at a certain location inspace is the outcome of several physical, chemical or
328
biological processes. Some of them operate only at mi-croscopic scales (e.g., biological activities), whereasothers such as weathering of parent material, erosion orsolute transport may act over larger distances. Thecombined effect of different sources of spatial variationthat operate at distinct scales yields nested semivario-grams, such as the semivariogram of pH on pasture(Fig. 11, top graph) which was modeled as the sum of anugget effect and two spherical models of range 6 mand 26 m:
gpH(h)p0.007g0(h)c0.01 Sph(h/6 m)c0.0065 Sph(h/26 m) (26)
Factorial kriging or kriging analysis (Matheron 1982;Wackernagel 1988, 1995; Goovaerts 1992) is a variantof kriging which aims at estimating and mapping thedifferent sources of spatial variability identified on theexperimental semivariogram. For example, on the basisof the semivariogram model (26), the transect of pHvalues measured in pasture can be viewed as the sum ofdifferent microscopic (nugget effect), short-range(6 m), and long-range (26 m) spatial components:
z(ua)pz0(ua)cz1(ua)cz2(ua) Ga
Assuming that these components are independent, thetransect of original pH values is decomposed into local(microscopic c short-range) and regional (long-range)components using the approach described in Goovaerts(1992) (Fig. 14, left column). A similar decompositionis performed for the transect of electrical conductivityvalues on the basis of the semivariogram model of Fig.12 (middle graph).
Although spatial components are mathematical con-structions with no a priori physical meaning, they canhelp to identify the main sources of spatial variationand improve our understanding of underlying physicalprocesses in soil. Goovaerts (1994a) used factorial krig-ing to separate local variation in soil and vegetationproperties due to field-to-field differences from region-al variation related to the presence of different soiltypes. Webster et al. (1994) used a similar approach tomap local variation in heavy metal concentrationswhich are partly due to anthropic activities and regionalchanges caused by the geology. Dobermann et al.(1995) investigated field-scale variation of soil fertilityin the humid tropics, and factorial kriging allowed thediscrimination of short-range variation in P, Ca, Mgand K topsoil concentrations, which is due to patchy in-puts of nutrients from buffalo excrement, from long-range variation related to vertical and lateral soil waterflow along the topographic slope.
Scale-dependent relations
An interesting application of factorial kriging is the de-tection of scale-dependent correlations between soil at-tributes, i.e. correlations that change as a function ofthe spatial scale. For example, the scattergram of pH
versus electrical conductivity values indicates that thesetwo soil attributes are weakly correlated along thetransect in pasture (Fig. 15, top graph). Original meas-urements, however, result from a combination of differ-ent factors operating at different spatial scales; some ofthem such as measurement errors or microscale varia-tion in biological activities likely reduce artificially theoverall correlation between attributes. This effect ismade clear by looking at the relation between the localor regional components of pH and electrical conductiv-ity shown in Fig. 14. Whereas the two attributes areslightly negatively correlated at a local scale, they dis-play a strong positive correlation at the regional scale(Fig. 15, bottom graph), which agrees with the changeof sign of the cross semivariogram noticed on Fig-ure 8.
Such scale-dependent relations are frequent in soilscience where many properties are controlled by thesame physical processes that operate at different spatialscales and influence these properties in different ways.Goovaerts and Webster (1994) found a strong positivecorrelation between topsoil cobalt and copper concen-trations once large micro-scale variations due to meas-urement and procedural errors were filtered from theoriginal data. They also showed that the classificationby soil association accounts for a large proportion ofthe variance at the regional scale, suggesting that theparent material contributes substantially more to thetrace element content of the soil than had been thoughtearlier on the basis of a classical analysis where all dif-ferent sources of variation were mixed. Accounting forthe spatial scale in the study of correlation can thus im-prove our understanding of complex relations betweensoil attributes and their environment.
Instead of computing the correlation coefficient be-tween previously estimated spatial components, thecorrelation between two attributes z and y at a particu-lar scale l can be assessed directly from the parametersof the linear model of coregionalization as:
lZYpCorr[Zl(u),Yl(u)]p
blZY
;blZZ7bl
YY
(27)
Application of expression (27) to the linear model ofcoregionalization (18) yields the following structuralcorrelation coefficients l
ZY for the pair pH-electricalconductivity: P0.31 and 0.99 at the local (up to 6 m)and regional (26 m) scales, respectively. These valuesare somewhat larger than those of Fig. 15, which arebased on estimated values.
Multivariate factorial kriging is a variant of cokrig-ing which aims at estimating and mapping the differentsources of spatial variability which are common to theset of attributes. The maps of the so-called regionalizedfactors have been used to summarize major features ofmultivariate soil data sets at different spatial scalesidentified on the experimental semivariogram (Goov-aerts 1992; Dobermann et al. 1995). More recently,Monestiez et al. (1997) used factorial kriging to map
329
Fig. 14 Decomposition of the pasture transect of pH and electri-cal conductivity values into local and regional components on thebasis of the nested semivariogram models of Fig. 12
the short-range (120 km) and long-range (300 km) spa-tial genetic structures of wild populations of perennialryegrass in France.
Advanced geostatistical methods
Although geostatistics is still mainly used for descrip-tion of spatial patterns and prediction of unsampledsoil attributes, the last 5 years have seen the develop-
ment of new techniques which allow soil scientists totackle advanced issues such as the assessment of theuncertainty about soil quality parameters or soil pollu-tant concentrations, or the stochastic simulation of thespatial distribution of attribute values. These topics arebriefly addressed in this section, and the reader shouldrefer to Goovaerts (1997a, pp. 259–436) or Goovaerts(1998) for a more detailed presentation.
Assessing the local uncertainty
Mapping metal concentrations or other soil propertiesis often a preliminary step towards decision makingsuch as the delineation of polluted areas or the identi-
330
Fig. 15 Scattergrams of pH vs electrical conductivity values: ori-ginal concentrations and spatial components displayed on Fig.14 Fig. 16 Kriged estimates of the probability for the soil pH along
the pasture transect to be smaller than 6.68. The informationavailable consists of 20 pH values (black dots) plus the indicatorsemivariogram model
fication of zones that are suitable for crop growth. Inmany applications, the focus is on the exceedence ofparticular values such as regulatory thresholds for hea-vy metal contamination or critical values of limiting fac-tors (e.g., depth to parent material, soil acidity) for soilquality. There is necessarily some error attached to thekriging estimate, and this must be accounted for in thedecision rule. Ignoring this uncertainty may, for exam-ple, lead one to declare safe a contaminated location onthe basis of a wrong estimate of pollutant concentrationwhich is slightly below the regulatory threshold.
Geostatistics is increasingly used to estimate andmap the risk of exceeding specific threshold values insoil science (Webster and Oliver 1989; Smith et al.1993; Goovaerts and Journel 1995; Goovaerts et al.1997; Mohammadi et al. 1997). The most straightfor-ward approach is based on a coding of each observationz(ua) into an indicator of non-exceedence of the targetthreshold value zk; recall expression (4)
i(ua;zk)p51 if z(ua)^zk
0 otherwise
The probability that the attribute value z does not ex-ceed zk at the unsampled location u is then estimatedfrom the indicator transforms using a kriging estimatorsimilar to the one developed for continuous attributes.For example, ordinary indicator kriging builds theprobability estimate as a linear combination of neigh-boring indicator data:
[Prob{Z(u)^zkh(n)}]*pn(u)
Aap1
la(u;zk) i(ua;zk) (28)
where the weights la(u;zk) are obtained by solving asystem of linear equations identical to system (20) ex-cept that semivariogram values are now derived fromthe model gI(h;zk) fitted to the experimental indicatorsemivariogram (5).
Figure 16 shows the probability that the soil pH is nogreater than 6.68 along the transect in pasture. Theseprobabilities were estimated every meter using ordina-
331
Fig. 17 Transects of 100 pH values measured in the topsoil of apasture (top graph), and the transects of estimated or simulatedvalues constructed from only 20 observations depicted by blackdots. Three different realizations were generated using stochasticsimulation
ry indicator kriging and the indicator semivariogrammodel displayed in Fig. 16. Note that the modeling ofindicator semivariograms follows the same rules as thatof semivariograms of continuous attributes.
Prediction of probabilities is generally improved byincorporating additional information, e.g., by cokrigingof indicator transforms of both primary and secondaryinformation. Goovaerts and Journel (1995) showedhow probabilities of deficiency in copper and cobaltcan be derived from the calibration of a soil map andcombined with precise measurements of metal concen-tration to map the risk of deficiency of these metals inthe soil. They have also introduced the concept of lossfunction which, in combination with probability maps,allowed one to map the economical impact of declaringwrongly that a location is deficient or sufficient in thesemetals. Similar applications to the delineation of con-taminated soils are presented in Colin et al. (1996) andGoovaerts et al. (1997).
Stochastic simulation
The comparison of the transects of measured and esti-mated pH values displayed in Fig. 17 indicates thatkriging smooths out local details of the spatial variationof pH values in pasture. Such smoothing results fromthe least-squares criterion of the kriging algorithm andleads to an unfortunate overestimation of small valuesand underestimation of large values. In stochastic simu-lation, the aim is not the minimization of the error var-iance but to generate a set of values that reproducesstatistics such as the sample histogram or the semivario-
332
gram model while honoring data at their locations.Mathematically speaking, each set of values is viewedas a realization of the random function.
For example, Fig. 17 shows three realizations of thespatial distribution of pH values generated using thesame information as in kriging, that is, 20 pH valuesdepicted by black dots and the semivariogram model ofFig. 11 (top graph). The transects of simulated valueslook more “realistic” than the transect of kriging esti-mates because they reproduce the spatial variabilitymodeled from the sample information. Stochastic simu-lation is thus increasingly preferred to kriging for allapplications where the spatial variability of the meas-ured field must be preserved, such as the delineation ofcontaminated areas (Desbarats 1996; Goovaerts 1997b)or the modeling of solute transport in the vadoze zone(Vanderborght et al. 1997). Like estimation, simulationcan be accomplished using a growing variety of tech-niques, and most of them are described in Goovaerts(1997a, pp. 376–424).
Each of the three realizations shown in Fig. 17 is aplausible representation of the unique and unknowndistribution of pH values along the transect in that eachsimulated transect honors the 20 data and reproducesapproximately the sample histogram and semivario-gram model. Differences between the three realizationsthus provide a measure of spatial uncertainty: featuressuch as segments of high pH are deemed certain if seenon most of the realizations, and their probability of oc-currence (i.e., probability that a given threshold is joint-ly exceeded at a series of locations) can be computed aslong as the realizations are equiprobable. Unlike sto-chastic simulation, indicator kriging provides only ameasure of local uncertainty in that it estimates theprobability that the threshold is exceeded at a single lo-cation; recall expression (28).
The impact of a given scenario, such as applicationof a particular amount of fertilizer, can be investigatedfrom a simulated map that reproduces aspects of thepattern of spatial dependence or other statisticsdeemed consequential for the problem at hand (e.g.,connectivity of large values, spatial correlation withsecondary attributes). Moreover, the availability ofmany equiprobable realizations allows one to assess theuncertainty about the consequences of this particularscenario, such as leaching of fertilizer and contamina-tion of groundwater, which results from our imperfectknowledge of the spatial distribution of soil attributevalues.
Conclusions
Any quantitative processing of soil information is in-complete as long as it takes no account of the spatialcoordinates of observations. Geostatistical characteri-zation of the spatial variability through semivariogramsor correlograms generally brings new insight into theway soil attributes are influenced by the environment,
such as geographical distribution of soil types or topo-graphy. Indicator semivariograms provide additionalinformation about the spatial distribution of specificclasses of values of continuous attributes as well as ca-tegorical soil attributes. Cross semivariograms andcross correlograms complete the description by assess-ing spatial relations between pairs of attributes.
A key step in any geostatistical analysis is the fittingof permissible models to the experimental semivario-gram values. In this paper, basic rules for univariateand multivariate semivariogram modeling have beenrecalled, with the hope that it will reduce the use of un-stable (i.e., Gaussian) or non-permissible models in soilstudies. Permissible models are, however, one ingre-dient of the modeling process, and whenever physicalknowledge of the area and phenomenon under study isavailable it should be accounted for in the building of amodel of spatial variability.
The existence of a model of spatial variability (de-pendence) allows one to tackle the problem of estimat-ing soil attribute values at unsampled locations. Geo-statistics offers a palette of linear least-squares (krig-ing) algorithms to incorporate different types of infor-mation in the estimation process. Kriging algorithmscan also be modified to decompose original observa-tions into specific spatial components, enabling the spa-tial fluctuations and interactions between soil attributesto be studied at different scales.
Recent developments in indicator geostatistics allowa straightforward assessment of the probability to ex-ceed critical values, such as regulatory thresholds in soilpollution or criteria for soil quality. Another way tomodel uncertainty is to generate many images (realiza-tions) that all honor the data and reproduce aspects ofthe patterns of spatial dependence or other statisticsdeemed consequential for the problem at hand. In thefuture, more attention should be devoted to the use ofthese various measures of local or spatial uncertainty indecision making, such as determination of the amountof fertilizer to be applied or the delineation of pollutedareas targeted for remediation.
Processes such as weathering of parent material, soilerosion or solute transport are dynamic, leading to var-iation of soil attribute values in both space and time.Temporal fluctuations are particularly important formicrobiological processes and cannot be ignored in sta-tistical modeling of soil variability. Although not intro-duced in the present paper, spatial geostatistical toolssuch as semivariogram or kriging can be extended tothe space-time domain.
References
Bierkens MFP, Burrough PA (1993a) The indicator approach tocategorical soil data. I. Theory. J Soil Sci 44 :361–368
Bierkens MFP, Burrough PA (1993b) The indicator approach tocategorical soil data. II. Application to mapping and land usesuitability analysis. J Soil Sci 44 :369–381
333
Burgess TM, Webster R (1980a) Optimal interpolation and isa-rithmic mapping of soil properties. I. The variogram and punc-tual kriging. J Soil Sci 31 :315–331
Burgess TM, Webster R (1980b) Optimal interpolation and isa-rithmic mapping of soil properties. II. Block kriging. J Soil Sci31 :333–341
Burgess TM, Webster R, McBratney AB (1981) Optimal interpo-lation and isarithmic mapping of soil properties. VI. Samplingstrategy. J Soil Sci 31 :643–659
Colin P, Froidevaux R, Garcia M, Nicoletis S (1996) Integratinggeophysical data for mapping the contamination of industrialsites by polycyclic aromatic hydrocarbons: a geostatistical ap-proach. In: Rouhani S, Srivastava RM, Desbarats AJ, CromerMV, Johnson AI (eds) Geostatistics for environmental andgeotechnical applications. American Society for Testing andMaterials STP 1283, Philadelphia, pp 69–87
Cressie N (1985) Fitting variogram models by weighted leastsquares. Math Geol 17 :563–586
Desbarats AJ (1996) Modeling spatial variability using geostatisti-cal simulation. In: Rouhani S, Srivastava RM, Desbarats AJ,Cromer MV, Johnson Al (eds) Geostatistics for environmen-tal and geotechnical applications. American Society for Test-ing and Materials STP 1283, Philadelphia, pp 32–48
Deutsch CV, Journel AG (1998) GSLIB: geostatistical softwarelibrary and user’s guide, 2nd edn. Oxford University Press,New York
Dobermann A, Goovaerts P, George T (1995) Sources of soil var-iation in an acid Ultisol of the Philippines. Geoderma 68 :173–191
Dobermann A, Goovaerts P, Neue HU (1997) Scale-dependentcorrelations among soil properties in two tropical lowland ricefields. Soil Sci Soc Am J 61 :1483–1496
Englund E, Sparks A (1991) Geo-EAS 1.2.1 user’s guide. EPAreport 8 600/8-91/008, EPA-EMSL, Las Vegas, Nev
Gallichand J, Marcotte D (1993) Mapping clay content for sub-surface drainage in the Nile Delta. Geoderma 58 :165–179
Goovaerts P (1992) Factorial kriging analysis: a useful tool forexploring the structure of multivariate spatial soil information.J Soil Sci 43 :597–619
Goovaerts P (1994a) Study of spatial relationships between twosets of variables using multivariate geostatistics. Geoderma62 :93–107
Goovaerts P (1994b) Comparison of coIK, IK, and mIK perform-ances for modeling conditional probabilities of categoricalvariables. In: Dimitrakopoulos R (ed) Geostatistics for thenext century. Kluwer, Dordrecht, pp 18–29
Goovaerts P (1997a) Geostatistics for natural resources evalua-tion. Oxford University Press, New York
Goovaerts P (1997b) Kriging vs stochastic simulation for riskanalysis in soil contamination. In: Soares A, Gómez-Hernán-dez J, Froidevaux R (eds) geoENV I – geostatistics for envi-ronmental applications. Kluwer, Dordrecht, pp 247–258
Goovaerts P (1998) Geostatistics in soil science: state-of-the-artand perspectives. Geoderma (in press)
Goovaerts P, Journel AG (1995) Integrating soil map informationin modelling the spatial variation of continuous soil proper-ties. Eur J Soil Sci 46 :397–414
Goovaerts P, Webster R (1994) Scale-dependent correlation be-tween topsoil copper and cobalt concentrations in Scotland.Eur J Soil Sci 45 :79–95
Goovaerts P, Gérard G, Frankart R (1989) Etude de la variabilitéspatiale de quelques propriétés chimiques du sol en Fagne deChimay, Belgique. Pédologie 39 :191–207
Goovaerts P, Webster R, Dubois J-P (1997) Assessing the risk ofsoil contamination in the Swiss Jura using indicator geostatis-tics. Environ Ecol Stat 4 :31–48
Gotway CA, Hartford AH (1996) Geostatistical methods for in-corporating auxiliary information in the prediction of spatialvariables. J Agric Biol Environ Stat 1 :17–39
Goulard M (1989) Inference in a coregionalization model. In:Armstrong M (ed) Geostatistics. Kluwer, Dordrecht, pp 397–408
Goulard M, Voltz M (1992) Linear coregionalization model: toolsfor estimation and choice of cross-variogram matrix. MathGeol 24 :269–286
Isaaks EH, Srivastava RM (1989) An introduction to appliedgeostatistics. Oxford University Press, New York
Journel AG, Huijbregts CJ (1978) Mining geostatistics. AcademicPress, New York
Leenaers H, Okx JP, Burrough PA (1990) Employing elevationdata for efficient mapping of soil pollution on floodplains. SoilUse Manag 6 :105–114
Leonte D, Schofield N (1996) Evaluation of a soil contaminatedsite and clean-up criteria: a geostatistical approach. In: Rou-hani S, Srivastava RM, Desbarats AJ, Cromer MV, JohnsonAI (eds) Geostatistics for environmental and geotechnical ap-plications. American Society for Testing and Materials STP1283, Philadelphia, pp 133–145
Matheron G (1982) Pour une analyse krigeante de données ré-gionalisées. Centre de Géostatistique, Report N-732, Fontai-nebleau
McBratney AB, Webster R (1981) Detection of ridge and furrowpattern by spectral analysis of crop yield. Int Stat Rev 49 :45–52
McBratney AB, Webster R (1986) Choosing functions for semi-variograms of soil properties and fitting them to sampling esti-mates. Soil Sci 37 :617–639
Mohammadi J, Van Meirvenne M, Goovaerts P (1997) Mappingcadmium concentration and the risk of exceeding a local sani-tation threshold using indicator geostatistics. In: Soares A,Gómez-Hernández J, Froidevaux R (eds) geoENV I – geosta-tistics for environmental applications. Kluwer, Dordrecht, pp327–337
Monestiez P, Goulard M, Charmet G, Balfourier F (1997) Ana-lysing spatial genetic structures by multivariate geostatistics:study of wild populations of perennial ryegrass. In: Baafi EY,Schofield NA (eds) Geostatistics Wollongong ’96. Kluwer,Dordrecht, pp 1197–1208
Odeh IOA, McBratney AB, Slater BK (1997) Predicting soilproperties from ancillary information: non-spatial modelscompared with geostatistical and combined methods. In: BaafiEY, Schofield NA (eds) Geostatistics Wollongong ’96. Kluw-er, Dordrecht, pp 1008–1019
Oliver MA, Webster R (1986a) Combining nested and linear sam-pling for determining the scale and form of spatial variation ofregionalized variables. Geogr Anal 18 :227–242
Oliver MA, Webster R (1986b) Semi-variograms for modellingthe spatial pattern of landform and soil properties. Earth Sur-face Proc Landforms 11 :491–504
Pannatier Y (1996) Variowin: software for spatial data analysis in2-D. Springer, New York Berlin Heidelberg
Parkin TB (1987) Soil microsites as a source of denitrification var-iability. Soil Sci Soc Am J 51 :1194–1199
Robertson GP, Huston MA, Evans FC, Tiedje JM (1988) Spatialvariability in a successional plant community: patterns of ni-trogen availability. Ecology 69 :1517–1524
Rossi RE, Mulla DJ, Journel AG, Franz EH (1992) Geostatisticaltools for modeling and interpreting ecological spatial depend-ence. Ecol Monogr 62 :277–314
Smith JL, Halvorson JJ, Papendick RI (1993) Using multiple-vari-able indicator kriging for evaluating soil quality. Soil Sci SocAm J 57 :743–749
Sutherland RA, Kessel C van, Pennock D (1991) Spatial variabil-ity of nitrogen-15 natural abundance. Soil Sci Soc Am J55 :1339–1347
Trangmar BB, Yost RS, Uehara G (1985) Application of geosta-tistics to spatial studies of soil properties. Adv Agron 38 :45–94
Vanderborght J, Jacques D, Mallants D, Tseng PH, Feyen J(1997) Analysis of solute redistribution in heterogeneous soil.II. Numerical simulation of solute transport. In: Soares A,Gómez-Hernández J, Froidevaux R (eds) geoENV I – Geosta-tistics for environmental applications. Kluwer, Dordrecht, pp283–295
334
Vieira SR, Hatfield JL, Nielsen DR, Biggar JW (1983) Geostatis-tical theory and application to variability of some agronomicalproperties. Hilgardia 5 :1–75
Voltz M, Goulard M (1994) Spatial interpolation of soil moistureretention curves. Geoderma 62 :109–123
Wackernagel H (1988) Geostatistical techniques for interpretingmultivariate spatial information. In: Chung CF, Fabbri AG,Sinding-Larsen R (eds) Quantitative analysis of mineral andenergy resources. Reidel, Dordrecht, pp 393–409
Wackernagel H (1995) Multivariate geostatistics: an introductionwith applications. Springer, Berlin Heidelberg New York
Warrick AW, Myers DE, Nielsen DR (1986) Geostatistical meth-ods applied to soil science. In: Methods of soil analysis, part 1,2nd edn Physical and mineralogical methods. AgronomyMonograph 9, pp 53–82
Webster R (1996) What is kriging? Aspects Appl Biol 46 :57–66Webster R (1997) Software review. Eur J Soil Sci 48 :173–175
Webster R, Boag B (1992) Geostatistical analysis of cyst nema-todes in soil. J Soil Sci 43 :583–595
Webster R, Burgess TM (1980) Optimal interpolation and isarith-mic mapping of soil properties. III. Changing drift and univer-sal kriging. J Soil Sci 31 :505–524
Webster R, Oliver MA (1989) Optimal interpolation and isarith-mic mapping of soil properties. VI. Disjunctive kriging andmapping the conditional probability. J Soil Sci 40 :497–512
Webster R, Atteia O, Dubois J-P (1994) Coregionalization oftrace metals in the soil in the Swiss Jura. Eur J Soil Sci45 :205–218
Wollum AG, Cassel DK (1984) Spatial variability of Rhizobiumjaponicum in two North Carolina soils. Soil Sci Soc Am J48 :1082–1086
Yates SR, Warrick AW (1987) Estimating soil water content us-ing cokriging. Soil Sci Soc Am J 51 :23–30
top related