ARTICLE IN PRESS
Computers & Geosciences 36 (2010) 54–70
journal homepage: www.elsevier.com/locate/cageo
Exploring spatial variation and spatial relationships in a freshwater acidification critical load data set for Great Britain using geographically weighted summary statistics
Paul Harris a,*, Chris Brunsdon b
a National Centre for Geocomputation, National University of Ireland Maynooth, Maynooth, Co. Kildare, Ireland
b Department of Geography, University of Leicester, Leicester LE1 7RH, UK
Article info
Article history:
Received 20 October 2008
Received in revised form 29 March 2009
Accepted 1 April 2009
Keywords:
Local statistics
Geographical kernel weighting
Nonstationarity
Acidified surface waters
Catchment characteristics
0098-3004/$ - see front matter © 2009 Elsevier Ltd. All rights reserved.
doi:10.1016/j.cageo.2009.04.012
* Corresponding author. Tel.: +3531708 6208; fax: +3531708 6456.
E-mail address: [email protected] (P. Harris).
Abstract
In this study, geographically weighted summary statistics (GWSSs) are used to investigate spatial
variation and spatial relationships in a freshwater acidification critical load data set covering Great
Britain. This use of GWSSs not only provides valuable insight into the critical load process prior to a
geographically weighted regression (GWR) calibration, but also helps in interpreting its output. GWSSs
are similarly useful prior to the calibration of other spatial models, such as those used in geostatistics.
Results agree with those of previous research, where relationships between critical load and contextual
catchment data can vary across space. However, the more sophisticated models used here are shown to
be much more flexible and informative, allowing more spatial patterns to be revealed than before.
© 2009 Elsevier Ltd. All rights reserved.
1. Introduction
Acid deposition is a major environmental threat to lakes and streams throughout large areas of upland Britain (Mason, 1993). Pollutants that contribute to freshwater acidification are generally emitted as sulphur dioxide and nitrogen oxides. The major sources of such acidifying compounds are the combustion of fossil fuels at power stations and other industrial processes. Vehicle exhausts, agriculture, volcanoes and the oceans also contribute. For freshwater acidification, most of the atmospheric deposition is to the terrestrial part of the catchment rather than to open water. Therefore, lake and stream acidification is a function of flow paths and the physical and chemical properties of catchment soils. Acidified freshwaters are a hostile environment for many forms of aquatic life and are consequently of environmental concern. Continuous assessment and informed management strategies for freshwaters are fundamental to their protection.
One approach to protecting freshwaters focuses on the calculation of acid deposition critical load values at freshwater sites. Critical load values are calculated so as to indicate a site's capacity to buffer the input of strong acid anions of sulphur and nitrogen. Critical load values are thresholds and can be compared directly to current and future deposition values.
For sites where the deposition value exceeds the critical load value, acidification and associated environmental damage are expected. Spatial variability in critical load values should be considered jointly with spatial variability in deposition values. This approach allows for selectivity and for exceeded sites to be preferentially managed. For remediation of sites, two avenues are possible: (a) reduce (nearby) deposition rates or (b) physically neutralise freshwater acidity (e.g. by the addition of an alkali compound). In general, the susceptibility of freshwaters to acidification varies according to geology and land use. Waters situated on bedrocks with a high weathering rate are usually well buffered against rain-deposited acidity by the relatively rapid release of neutralising base cations (mainly Ca²⁺ and Mg²⁺). However, for areas of slowly weathering bedrocks the reverse is true, with acidifying compounds displacing H⁺ ions, which directly lead to acidification. For Great Britain, the granite regions of Scotland and Wales are particularly affected by acidification.
To calculate a critical load for any given freshwater site requires surface water chemistry data. Collecting such data for every site across Great Britain is prohibitively expensive. Therefore, previous research has looked at ways of predicting critical loads at sites where water chemistry data are unavailable, as an alternative to a costly sampling programme. In this respect, research has endeavoured to link critical load variation with various catchment characteristics. This is useful as many catchment variables can be formulated from existing data sources and are therefore relatively inexpensive. Catchment variables have been used to predict classes of critical load (Hall et al., 1995) or to explain critical load variation (Kernan et al., 1998, 2001) at Great Britain and similar spatial scales. Such studies found moderately strong relationships between critical load and catchment data, where the strength and nature of relationships could vary according to sample scale in both attribute- and geographic-space. However, these studies applied only basic methods, where any local regression modelling was fairly rudimentary in design, using arbitrary aspatial or spatial partitions.
For this and companion studies (in preparation), it is taken that a critical load spatial process would be better investigated using more sophisticated techniques. In the first instance (this study), geographically weighted summary statistics (GWSSs; Brunsdon et al., 2002; Fotheringham et al., 2002) are used to explore the critical load data set spatially. This initial study acts as an informative precursor to an exploration with geographically weighted regression (GWR; Brunsdon et al., 1996; Fotheringham et al., 2002) and to other spatial models. With GWSSs (and GWR), nearby data are given more influence by weighting observations according to a distance-decay function. This use of spatially weighted data enables the calibration of numerous location-specific statistics (or regressions). This 'moving-window' approach (where weights follow a focal point around the map) allows statistics to be computed for regions not necessarily attainable in any partition-based approach. Such a use of sample information tends to provide a continuous and smooth model output, where a local statistic can be mapped and visually explored. The nonparametric techniques of GWSS and GWR are influenced by kernel density estimation (KDE) methods (Silverman, 1986; Wand and Jones, 1995), the attribute-space local regression (LR) models of Cleveland (1979) and Loader (2004), and the generalised additive/varying-coefficient models of Hastie and Tibshirani (1990, 1993). In addition to the modelling of continuous spatial processes (i.e. models used in this study), the kernel approach has been extensively adopted for the modelling of point spatial processes (see Diggle, 1985; Silverman, 1986).
The models of this and companion studies will use similar critical load and catchment variables to those used in the studies of Hall, Kernan and co-workers (from above) and therefore some limited comparison between studies will be possible. Furthermore, the use of GWSSs and GWR need not be confined to this particular data set, as they are likely to be similarly applicable to data sets from other environmental processes that are considered heterogeneous. Examples include critical load data sets for regions of China, where acidification is currently posing a major environmental problem (Brimblecombe, 2007), or critical load data sets for other pollutants, such as heavy metals (e.g. see Slootweg et al., 2007).
2. Data
The calculation of a critical load value for any freshwater site is itself a complex issue and competing models exist for its calculation. Steady-state approaches calculate values such that exceedances (critical load minus deposition) reflect potential future damage once steady-state is achieved. Steady-state models include the steady-state water chemistry (SSWC) model (Henriksen et al., 1992; Curtis et al., 2000), the Diatom model (Battarbee et al., 1996) and the First-order Acidity Balance model (Posch et al., 1997; Curtis et al., 2000). Models can be calibrated for sulphur deposition, for nitrogen deposition or for both (total acidity). For this study, critical load values from the SSWC model for total acidity are spatially modelled. Units for critical loads (and deposition data) are in keq. H⁺ ha⁻¹ year⁻¹.
Researchers at the Department of Geography, University College London (UCL), provided the critical load and the contextual catchment data. The critical load data stem from a water chemistry sampling programme for Great Britain as part of the UK Department of Transport and Regions critical loads mapping programme (Kreiser et al., 1993). Water chemistry samples were taken during the autumn or early spring over the period 1992–1994. Sites were chosen to represent the most sensitive water body within either a 10 km grid square (for medium- to high-sensitive areas) or within a 20 km grid square (for low- or non-sensitive areas) so that the minimum critical load could be calculated. Research teams within the Critical Loads Advisory Group (CLAG) then used the water chemistry data to calculate and map critical load values. Details of the sampling programme and mapping exercise are given in CLAG Freshwaters (1995).
The version of critical load and catchment data used for this study was provided in January 2002. At this time, the water chemistry data had been screened for problematic values by researchers at UCL, which resulted in a critical load (and catchment) data set of 1371 sites covering the whole of Great Britain. This data set was further manipulated for this and companion studies to avoid problems of preferential sampling (i.e. data in medium- to high-sensitive areas are over-represented) when calibrating GWSSs and other spatial models (not presented here). This data manipulation also removed sites with missing data. As a result, a spatially representative (declustered) data set of 497 sites for model calibration and a spatially representative (set-aside) data set of 189 sites for model validation (not used here) were obtained. The coverage of the resultant calibration data (Fig. 2b) is extensive and fairly regular, which is suitable for spatial modelling. Investigations (not presented) found this rather large loss of model calibration information to have a negligible effect on model interpretation or performance.
To explain critical load variation, four percentage-based class variables are used and manipulated. These catchment-specific variables are geological sensitivity (GSP), soil buffering capacity (SBCP), soil critical load (SCLP) and land cover (LCP). The first three of these variables relate to a freshwater site's ability to buffer acid loading and comprise four, three and five ordinal classes for GSP, SBCP and SCLP, respectively. The twenty-five-class LCP variable is nominal. Such data were generated by overlaying digitised catchment areas for each sampled site on to existing digital maps. Full descriptions of the GSP and LCP data generation can be found in Kernan et al. (1998, 2001). For the SBCP and SCLP data generation, the reader is referred to Kernan et al. (1998) (where SBCP is termed soil sensitivity). If data reliability is considered an issue, then the following order of reliability is assumed: LCP, GSP, SBCP and SCLP (with the most reliable first). Other contextual variables were also available (e.g. site altitude, rainfall, etc.), but each added little to the variance explained by any regression fit; hence these variables were discarded.
To more easily facilitate the use of percentage-based class variables in this study's correlations (and a companion study's regressions), the three ordinal variables were re-formulated into single-value, weighted sensitivity data (Wt.GSP, Wt.SBCP and Wt.SCLP). This results in a continuous variable form with only a marginal loss of information. Table 1 summarises the range of values that the original percentage-based and corresponding weighted variable can take according to an expected acid buffering capacity (or acid sensitivity). Thus low critical load values would be expected to correspond to low Wt.GSP, Wt.SBCP and Wt.SCLP values (and vice versa). Twenty-five land cover classes would be problematic when calibrating and interpreting regressions. Hence an aggregation that gave a nine-class land cover variable was undertaken (see Table 2). It was then useful to convert this percentage-based variable formulation into a dominant catchment attribute (LC9D), as analysis with a dominant class variable can more easily assess how land cover classes discriminate between different critical load populations. This dominant class variable can be considered as low-resolution land cover data, but overall there was little loss of information between high- and low-resolution data forms.

Table 1
Summary of ordinal/continuous catchment variables.
Buffering capacity   GSP   Wt.GSP   SBCP   Wt.SBCP   SCLP   Wt.SCLP   Acid sensitivity
Low                  1     1.0      1      10.0      5      0.1       High
↓                    ↓     ↓        ↓      ↓         ↓      ↓         ↓
High                 4     4.0      3      80.0      1      4.0       Low
Note the reverse order of SCLP.

Table 2
Nine-class aggregation (LC9) of original twenty-five-class land cover data (LC25).
LC9 class   LC25 class               Description
1           1–4 and 20–22            Water and built/bare ground
2           6                        Mown/grazed turf
3           7                        Meadow/verge/semi-natural
4           18                       Tilled land
5           14 and 15                Deciduous woodland
6           16                       Coniferous woodland
7           5, 8, 13, 19 and 23–25   Lowland semi-natural grass/moor
8           9, 12 and 17             Upland semi-natural grass/bog moor
9           10 and 11                Upland semi-natural shrub moor
The most abundant LC25 classes are in bold.

Table 3
Four kernel weighting functions.
Box-car       $w_{ij} = 1$ if $d_{ij} \le r$; $w_{ij} = 0$ otherwise
Bi-square     $w_{ij} = (1 - (d_{ij}/r)^2)^2$ if $d_{ij} \le r$; $w_{ij} = 0$ otherwise
Gaussian      $w_{ij} = \exp(-d_{ij}^2 / (2b^2))$
Exponential   $w_{ij} = \exp(-d_{ij}/b)$
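The aggregation of Table 2 and the derivation of a dominant class can be sketched as follows. This is an illustrative Python sketch only, not the code used in the study: the function names, the example catchment percentages and the tie-breaking rule (lowest class number wins) are all assumptions.

```python
# Mapping of LC9 classes to their member LC25 classes, transcribed from Table 2.
LC9_GROUPS = {
    1: [1, 2, 3, 4, 20, 21, 22],
    2: [6],
    3: [7],
    4: [18],
    5: [14, 15],
    6: [16],
    7: [5, 8, 13, 19, 23, 24, 25],
    8: [9, 12, 17],
    9: [10, 11],
}
LC25_TO_LC9 = {lc25: lc9 for lc9, members in LC9_GROUPS.items() for lc25 in members}


def aggregate_to_lc9(lc25_percent):
    """Sum catchment percentages of LC25 classes into the nine LC9 classes."""
    lc9_percent = {lc9: 0.0 for lc9 in LC9_GROUPS}
    for lc25, pct in lc25_percent.items():
        lc9_percent[LC25_TO_LC9[lc25]] += pct
    return lc9_percent


def dominant_class(lc9_percent):
    """Return the LC9 class with the largest percentage.

    Ties are broken by the lowest class number (an assumption made here).
    """
    return min(lc9_percent, key=lambda c: (-lc9_percent[c], c))


# Hypothetical catchment: percentages by LC25 class (values invented).
catchment = {16: 40.0, 9: 25.0, 12: 20.0, 10: 15.0}
lc9 = aggregate_to_lc9(catchment)
lc9d = dominant_class(lc9)
print(lc9d, lc9[lc9d])
```

In this invented catchment the single largest LC25 class is coniferous woodland, yet the dominant LC9 class is upland semi-natural grass/bog moor, because two of its member classes are summed. This shows how aggregation can change which class dominates.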
3. Geographically weighted summary statistics (GWSSs)
3.1. Fixed and adaptive window sizes
The simplest way to find a local statistic surface is with a moving-window algorithm, where local statistics are calculated and mapped at each window's centre using only data within each window. Summary statistics found using (overlapping or non-overlapping) rectangular moving windows can be found in many spatial studies, often prior to a geostatistical analysis (e.g. see Isaaks and Srivastava, 1989; Rossi et al., 1992; Carroll and Oliver, 2005; Zhang et al., 2007). Other window shapes are possible and in this study only circular ones are considered. However, of most importance is the window size. If the window is too small, too few data are used to calculate the local statistic, resulting in an erratic or spiky output. If the window is too large, the local statistic will tend to the corresponding global statistic and thus provide little spatial insight.
Window size is commonly defined as (a) a fixed size by distance or (b) an adaptive size, where a fixed number of local data items are used for each local statistic calculation. For sample data on a fairly regular grid, either method is usually appropriate. For sample data on an irregular grid, the adaptive method is preferred, resulting in different window sizes according to the density of local data. Hence the method is adaptive in a distance sense. Adaptive specifications eliminate poorly informed statistics from windows with little or no information, but at a possible cost of reduced 'localness' in some areas.
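The contrast between fixed and adaptive windows can be illustrated with a short Python sketch. It is a minimal example under stated assumptions: the site coordinates, the focal points and the helper names are invented, and Euclidean distance is assumed.

```python
import math


def distances(focus, points):
    """Euclidean distances from one focal point to every sample point."""
    return [math.dist(focus, p) for p in points]


def window_members(focus, points, radius):
    """Indices of sample points inside a circular window of fixed radius."""
    return [j for j, d in enumerate(distances(focus, points)) if d <= radius]


def adaptive_radius(focus, points, n_neighbours):
    """Window radius defined as the distance to the n-th nearest sample point."""
    return sorted(distances(focus, points))[n_neighbours - 1]


# Hypothetical irregular sample: a dense cluster plus two isolated sites.
sites = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1), (10, 10), (20, 0)]
dense_focus, sparse_focus = (0.5, 0.5), (15, 5)

# Fixed window: the same radius holds many sites in one place and none in another.
n_dense = len(window_members(dense_focus, sites, 2.0))
n_sparse = len(window_members(sparse_focus, sites, 2.0))

# Adaptive window: always three sites, but the radius grows where data are sparse.
r_dense = adaptive_radius(dense_focus, sites, 3)
r_sparse = adaptive_radius(sparse_focus, sites, 3)
print(n_dense, n_sparse, round(r_dense, 3), round(r_sparse, 3))
```

The adaptive radius at the sparse focal point is far larger than at the dense one, which is the reduced 'localness' referred to above.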
3.2. Distance-decay kernel functions
The moving-window approach can be generalised to the calculation of locally weighted statistics, where data are now weighted according to their proximity to a local calibration point (i.e. GWSSs). Here the weighting functions are called kernel functions, where a moving-window approach would relate to a GWSS specified with a box-car kernel. Importantly, the use of a distance-decay kernel can maximise sample information whilst still retaining a local focus. Specifying such a kernel will produce a smooth output across space. However, models using a distance-decay kernel should not be automatically preferred to those using a box-car kernel, as the simple moving-window specification is more likely to provide an output showing abrupt changes in continuity that may be of special interest.
The box-car and the three distance-decay kernel functions considered in this study are defined in Table 3. Each function includes a bandwidth parameter (r or b), which controls the rate of decay. All functions are defined in terms of weighting the sample data, where i is taken as the index of the calibration point, j the index of the sample data point and $d_{ij}$ the distance between the points indexed by i and j. For the box-car and bi-square functions, the bandwidth r can be specified beforehand (i.e. a fixed distance) or specified as the distance between the point i and its Nth nearest neighbour, where N is specified beforehand (i.e. an adaptive distance). The bi-square function gives fractional decaying weights according to the proximity of the data to each location i, up until a fixed distance or a distance according to a specified Nth nearest neighbour. The local search strategy for this and the box-car function is simply N neighbours within a fixed radius r or N nearest neighbours for an adaptive approach. Both functions can suffer from discontinuity, although the bi-square function can be defined with a bandwidth that uses all of the data to minimise such problems.
The Gaussian and exponential functions are continuous and use all the data. Their weights decay according to a Gaussian or exponential curve. According to the bandwidth set, data that are a long way from the calibration point i receive virtually zero weight. The key difference between these functions is their behaviour at the origin. Usually these continuous functions are defined with a fixed bandwidth b, but can be constructed to behave in an adaptive manner. The bi-square function is useful as it can provide an intermediate weighting between the box-car and the Gaussian functions. To get similar weights from the bi-square and Gaussian functions, the bandwidths r and b can be approximately related by $r \approx (3\sqrt{2}/2)\,b$. The shapes of the functions are pictured in Fig. 1 with bandwidths chosen to highlight this relationship. For all functions, if r or b is set suitably large, then all data can receive a weight of (effectively) one and hence the corresponding global statistic would be found.

Fig. 1. Kernel shapes (r=25 and b=11.785; see Table 3). Horizontal axis: t, from −50 to 50; vertical axis: K(t), from 0 to 1. Curves are shown for box-car, bi-square, Gaussian and exponential weighting.
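The four kernels and the approximate bandwidth relation between the bi-square and Gaussian functions can be checked numerically. The Python sketch below is illustrative only. The function names are assumptions, and the formulas follow Table 3 with the inequality read as "less than or equal to".

```python
import math


def box_car(d, r):
    """Weight of one inside the window, zero outside."""
    return 1.0 if d <= r else 0.0


def bi_square(d, r):
    """Fractional weights decaying to zero at distance r."""
    return (1.0 - (d / r) ** 2) ** 2 if d <= r else 0.0


def gaussian(d, b):
    """Continuous weights following a Gaussian curve."""
    return math.exp(-d ** 2 / (2.0 * b ** 2))


def exponential(d, b):
    """Continuous weights following an exponential curve."""
    return math.exp(-d / b)


# Bandwidths as in Fig. 1: r = 25 and b chosen so that r is about (3*sqrt(2)/2) * b.
r = 25.0
b = r / (3.0 * math.sqrt(2.0) / 2.0)

for d in (0.0, 5.0, 10.0, 15.0, 20.0, 25.0, 30.0):
    print(d, box_car(d, r), round(bi_square(d, r), 3),
          round(gaussian(d, b), 3), round(exponential(d, b), 3))
```

With these bandwidths the bi-square and Gaussian weights stay close over short distances, while the exponential kernel drops away most sharply near the origin.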
3.3. Bandwidth size
There are many interacting specifications when implementing a kernel-based model. Bandwidth size, type of bandwidth (adaptive or fixed), type of kernel, shape of kernel (circular, elliptical, etc.), local search strategy and the data requirements of the local statistic or model (with respect to reliability) are all inter-connected. The problem lies in how to specify the model so that the true heterogeneous nature of a given process is adequately depicted. In practice, bandwidth size is almost always the crucial model parameter and focus is usually placed on finding it (see Clark, 1977). For KDE there are many automated approaches for finding an optimal bandwidth, where a cross-validation approach is commonly taken (see Bowman and Azzalini, 1997, pp. 31–36). However, cross-validation is possible only if there is an objective function to cross-validate with and for most local statistics this is simply not possible. Thus for any local statistic that cannot be used to predict, bandwidths need to be chosen subjectively. This is not necessarily a problem, as their calculation and visualisation using a range of bandwidths is appropriate in an exploratory analysis.
3.4. Formulae: univariate and bivariate GWSSs
Formulae for the calculation of GWSSs can now be defined and, in all cases, accord to those given in Brunsdon et al. (2002) and Fotheringham et al. (2002, pp. 159–185). For parameterising and interpreting a GWR model, it is (at least) useful to explore the change across space in the mean, variance, coefficient of variation, skew and correlation coefficient. Local correlations are particularly useful in that they allow preliminary investigations into relationship nonstationarity prior to a GWR fit. Global correlations (and by extension the global multiple linear regression (MLR) fit) should not be overlooked. This is especially true for physical processes such as the critical load process, which are not expected to show relationships as strongly nonstationary as those commonly found with socio-demographic/economic data (for which GWR has been extensively developed and applied).
Thus for attribute z a local mean can be defined as $m(z_i) = \sum_{j=1}^{n} w_{ij} z_j \big/ \sum_{j=1}^{n} w_{ij}$, where $m(z_i)$ is the local mean value at any location i and $w_{ij}$ accords to some kernel function. A local variance can be defined as $s^2(z_i) = \sum_{j=1}^{n} w_{ij} (z_j - m(z_i))^2 \big/ \sum_{j=1}^{n} w_{ij}$ and a local coefficient of variation (CV) can be defined as $CV(z_i) = s(z_i)/m(z_i)$, where $s(z_i)$ is the local standard deviation (SD). A local skewness can be defined as $b(z_i) = \sqrt[3]{\sum_{j=1}^{n} w_{ij} (z_j - m(z_i))^3 \big/ \sum_{j=1}^{n} w_{ij}} \,\big/\, s(z_i)$. For attributes z and y, a local correlation coefficient can be defined as $\rho(z_i, y_i) = c(z_i, y_i) / (s(z_i)\, s(y_i))$, where $\rho(z_i, y_i)$ is the local correlation coefficient at any location i; $s(z_i)$ and $s(y_i)$ are the respective local SDs; and $c(z_i, y_i) = \sum_{j=1}^{n} w_{ij} \{(z_j - m(z_i))(y_j - m(y_i))\} \big/ \sum_{j=1}^{n} w_{ij}$ is the local covariance.
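These definitions translate directly into code. The following Python sketch is one possible transcription for a single calibration point i. The function names and example values are assumptions, and a zero local SD is not handled.

```python
import math


def gw_mean(w, z):
    """Local mean: weighted sum of z divided by the sum of weights."""
    return sum(wj * zj for wj, zj in zip(w, z)) / sum(w)


def gw_variance(w, z):
    """Local variance about the local mean."""
    m = gw_mean(w, z)
    return sum(wj * (zj - m) ** 2 for wj, zj in zip(w, z)) / sum(w)


def gw_sd(w, z):
    """Local standard deviation."""
    return math.sqrt(gw_variance(w, z))


def gw_cv(w, z):
    """Local coefficient of variation."""
    return gw_sd(w, z) / gw_mean(w, z)


def gw_skew(w, z):
    """Local skewness: cube root of the weighted third moment over the local SD."""
    m = gw_mean(w, z)
    third = sum(wj * (zj - m) ** 3 for wj, zj in zip(w, z)) / sum(w)
    cube_root = math.copysign(abs(third) ** (1.0 / 3.0), third)  # real root for negatives
    return cube_root / gw_sd(w, z)


def gw_correlation(w, z, y):
    """Local correlation: local covariance over the product of local SDs."""
    mz, my = gw_mean(w, z), gw_mean(w, y)
    cov = sum(wj * (zj - mz) * (yj - my) for wj, zj, yj in zip(w, z, y)) / sum(w)
    return cov / (gw_sd(w, z) * gw_sd(w, y))


# Hypothetical weights for one calibration point and two attributes at four sites.
w_i = [1.0, 0.8, 0.4, 0.1]
z = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]
print(gw_mean(w_i, z), gw_variance(w_i, z), gw_cv(w_i, z),
      gw_skew(w_i, z), gw_correlation(w_i, z, y))
```

Repeating the calculation with a new weight vector for every grid location gives the local statistic surfaces that are mapped later.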
3.5. Tests
Fotheringham et al. (2002, pp. 165–169) discuss methods for interpreting local statistics. Here it is suggested that the magnitude of local z-scores can give an indication of how local means are different from the global mean. That is, local z-scores can identify areas where the local mean varies more than expected (under the assumption of no local variation in the local mean). Local z-scores $zs_i$ are defined as $zs_i = (m(z_i) - \mu) \big/ \big(\sigma \sqrt{\sum_j w_{ij}^2}\big)$ at any location indexed by i, where $\mu$ is the global mean estimate, $\sigma$ the global SD estimate and $w_{ij}$ can be any weighting function, but re-scaled to sum to one for each i. The 95% limits (i.e. $|zs_i| \ge 1.96$) can be used to identify interesting or unusual local means for exploratory purposes, but should not be used as a formal statistical significance test.
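A local z-score for one calibration point can be sketched as below. This is a hedged Python illustration: the function name and data are invented, and the population form of the global SD (`statistics.pstdev`) is an assumption, since the text does not say which estimator is used.

```python
import math
import statistics


def local_z_score(w, z):
    """Local z-score of the weighted mean at one location.

    The weights are re-scaled to sum to one before use, as the definition
    requires. The global SD is taken in its population form (an assumption).
    """
    total = sum(w)
    w_scaled = [wj / total for wj in w]
    local_mean = sum(wj * zj for wj, zj in zip(w_scaled, z))
    global_mean = statistics.fmean(z)
    global_sd = statistics.pstdev(z)
    return (local_mean - global_mean) / (global_sd * math.sqrt(sum(wj ** 2 for wj in w_scaled)))


# Hypothetical data: a box-car window holding the four lowest of eight values.
z = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
w_i = [1, 1, 1, 1, 0, 0, 0, 0]
zs_i = local_z_score(w_i, z)
unusual = abs(zs_i) >= 1.96  # exploratory flag only, not a formal test
print(round(zs_i, 3), unusual)
```

Here the local mean of 2.5 sits below the global mean of 4.5, but with only four equally weighted values the z-score stays inside the 95% limits.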
For higher moments it is not easy to find an approximate distribution and therefore similar local significance tests are not so easily derived. Consequently, Monte Carlo simulation tests are advocated. These tests identify areas where local statistics are 'significantly' different from such local statistics found by chance or artifacts of random variation in the data. Here the sample data are successively randomised and the nonstationary model is applied after each randomisation. A basis of a significance test is then possible by comparing actual results with results from a large number of randomised distributions. The randomisation hypothesis is that any pattern seen in the data occurs by chance and therefore any permutation of the data is equally likely. The test proceeds as follows:
• calculate the true local statistic at any location i using the sample data;
• randomly choose a permutation of the data (note that the coordinates are kept in the same pairs, as are the attribute pairs for correlations);
• calculate the same local statistic at location i using the randomised data;
• repeat the previous two steps, say 999 times (the more the better);
• rank the 999 simulated local statistics and the true local statistic;
• ascertain where the true local statistic lies in this ranked scale of 1000 values;
• if the true local statistic lies in the top or bottom 2.5% tail of this ranked distribution then the true local statistic can be said to be 'significantly' different (at the 95% level) to such a local statistic found by chance.
Note that the test is conditional on the specification of the nonstationary model in the first place (i.e. bandwidth size, type of kernel, etc.).
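The steps above can be sketched for a univariate local statistic at a single location. The Python code below is illustrative rather than the authors' R implementation. The function names, the seed and the toy data are assumptions, and ties between simulated and true values are not treated specially.

```python
import random


def gw_variance(w, z):
    """Geographically weighted variance at one location."""
    total = sum(w)
    m = sum(wj * zj for wj, zj in zip(w, z)) / total
    return sum(wj * (zj - m) ** 2 for wj, zj in zip(w, z)) / total


def randomisation_test(w, z, statistic, n_sim=999, seed=1):
    """Rank the true local statistic among n_sim permuted replicates.

    The weights (and hence the coordinates) stay fixed; only the attribute
    values are shuffled across locations.
    """
    rng = random.Random(seed)
    observed = statistic(w, z)
    permuted = list(z)
    simulated = []
    for _ in range(n_sim):
        rng.shuffle(permuted)
        simulated.append(statistic(w, permuted))
    rank = 1 + sum(1 for s in simulated if s < observed)  # 1 = lowest of n_sim + 1
    n_total = n_sim + 1
    tail = 0.025 * n_total
    flagged = rank <= tail or rank > n_total - tail
    return observed, rank, flagged


# Deliberately extreme toy data: the five sites inside the window share one value,
# while the fifteen sites outside it vary widely.
w_i = [1.0] * 5 + [0.0] * 15
z = [5.0] * 5 + [0.2, 9.1, 1.4, 12.6, 3.3, 7.8, 0.9, 15.2,
                 2.7, 6.4, 11.0, 4.1, 8.5, 13.7, 1.9]
observed, rank, flagged = randomisation_test(w_i, z, gw_variance)
print(observed, rank, flagged)
```

Because the true local variance is exactly zero and no permutation can give a smaller value, the true statistic takes rank 1 and falls in the bottom 2.5% tail. For a local correlation, the two attributes would need to be shuffled together as pairs.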
4. Analysis of the critical load data set
4.1. Standard EDA
4.1.1. Critical load distribution and trend
A histogram for the critical load data is shown in Fig. 2a, where multi-modality suggests evidence of two or three critical load populations. The sample distribution is positively skewed, where its median value is about 42% lower than its mean value (see Table 4). From the spatial distribution given in Fig. 2b, low critical loads are predominantly in areas of N Scotland, Wales and parts of N England. Low critical loads occur to a lesser extent in a few areas of SW and SE England. All such areas are therefore the most sensitive to inputs of acid anions. High critical loads cover the majority of England and central to southern Scotland. Thus a general trend of high to low critical loads is apparent in the SE to NW direction. Outlying data can be found in N and SW England. Overall there is visual evidence for both global and local trends in critical load variability.
Stronger evidence of any global trend in the critical load data may be found by plotting critical load against each spatial coordinate. Such plots are given in Fig. 3a, where unfortunately a clearly defined relationship is not evident. A plot limited to the SE to NW direction would be expected to show the strongest relationship, but again this is not evident. Only moderate relationships are similarly found with simple MLR trend fits, where first- and second-order polynomials of the coordinates give R² values of only 0.25 and 0.26, respectively (a trend fit limited to the SE to NW direction similarly gives a weak R² value of 0.24).
4.1.2. Global relationships
A linear correlation matrix relating critical load with the continuous catchment data is given in Table 5, where all three catchment variables are moderately, positively correlated with critical load (a five-number summary of each variable is also given in Table 6). These catchment variables are also moderately collinear and may therefore offer similar critical load explanatory powers to each other. Both raw and ranked data correlations are given, where the ranked data correlations reveal similar relationships to those found with the raw data. Thus at this global scale any outlying data that exist appear to have a minimal influence on relationships. Scatterplots (Fig. 3b) largely confirm such relationships (but with much scatter) and experimentation with various data transforms could not strengthen any relationship. It is likely that only one of these variables should be used in any (global) MLR model at a time. Locally with GWR, this may be different.
Conditional critical load distributions for the LC9D variable are investigated using parallel box plots in Fig. 4a. LC9D classes 2–4 appear to relate to high critical loads (i.e. these land cover classes occur in large areas of England and, to a lesser extent, areas of central to southern Scotland; see Fig. 5a). It is also possible that some contextual relationship may discriminate between critical load populations. To this extent, the LC9D variable is related to three critical load populations, which are experimentally defined using thresholds of 3 and 17 keq. H⁺ ha⁻¹ year⁻¹ (chosen from an investigation of the cumulative critical load distribution and looking for where a change in gradient occurs; see Fig. 2c). From Fig. 4b, it appears that the LC9D variable can discriminate between the first (low-valued) and second (medium-valued) critical load populations, which is promising.
4.2. EDA with GWSSs
4.2.1. Specifications
The irregular shape of Great Britain should favour the specification of adaptive bandwidths over fixed ones and, unless stated otherwise, box-car kernels are chosen. These specifications should provide local statistic surfaces that are fairly simple to interpret, highlighting any unusual spatial features. Bandwidths are given as a percentage, where for example, a bandwidth of 5% will use a data subset of the nearest N=25 observations. All local statistics are calculated on the same rectangular 35E × 50N grid and maps are presented in isoline form. In all cases, test values are calculated on a much smaller 10 × 16 grid to aid interpretation. These are then mapped with the same local statistic calculated on the 35 × 50 grid for context. The same bandwidth is used in each case. This visualisation strategy is advocated in Fotheringham et al. (2002, p. 167). All of the local statistic surfaces presented are judged to be representative only after some initial exploration and experimentation. The local statistic algorithms were developed within the R statistical computing environment (Ihaka and Gentleman, 1996).
4.2.2. Outlying critical load data
Before investigating the spatial change in critical load statistics, it is first useful to identify spatial outliers. The identification of spatial outliers can aid in the interpretation of any spatial model output and an analysis with the corresponding filtered data set can act as a simple alternative to the use of specifically designed robust models (e.g. see Brunsdon et al., 2002, for the use of quantile-based GWSSs). Spatial outliers can be identified from a geostatistical variographic analysis (e.g. see Ploner, 1999), but for this study, the method of Hawkins (1980) described in Rossi et al. (1992) is followed.
In this method, all sample values $z(x_a)$ are suspected a priori to be spatial outliers, where $z(x_a)$ is a spatial outlier if $N (z(x_a) - m_l)^2 \big/ ((N+1)\, s_l^2) \ge \chi^2_{\text{crit-1}}$. Here N is the number of neighbouring values of $z(x_a)$, $m_l$ the local mean, $s_l^2$ the average variance for equivalently sized neighbourhoods across the sample area (i.e. the average local variance) and $\chi^2_{\text{crit-1}}$ is a critical value of the chi-squared distribution for one degree of freedom. As there is no objective function for cross-validation, neighbourhood definitions can only be chosen subjectively with this test statistic.

Fig. 2. Critical load aspatial (a), spatial (b) and aspatial cumulative (c) distributions. Note that Orkney and Shetland Islands (off NW Great Britain) have been omitted, as they have no sampled sites. Cumulative distribution is shown with experimental thresholds at 3 and 17 keq. H⁺ ha⁻¹ year⁻¹. Panel (c) axes: critical load (0 to 34) against cumulative frequency % (0 to 100).

Table 4
Summary statistics for critical load data.
Minimum   0       Standard deviation         5.74
Mean      5.87    Coefficient of variation   0.98
Median    3.41    Skew                       1.21
Maximum   31.32   Kurtosis                   1.04
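The outlier test statistic can be sketched on a toy grid. The Python code below is a minimal illustration under several assumptions that the text does not settle: the tested value is excluded from its own neighbourhood, local variances use a divisor of N, neighbourhoods are the N nearest sites by Euclidean distance, and 3.841 is taken as the 95% critical value of the chi-squared distribution with one degree of freedom. All names and data are invented.

```python
import math

CHI2_CRIT_1DF_95 = 3.841  # upper 5% point, chi-squared with one degree of freedom


def nearest_neighbours(i, coords, n):
    """Indices of the n sites nearest to site i (site i itself excluded)."""
    others = [j for j in range(len(coords)) if j != i]
    others.sort(key=lambda j: math.dist(coords[i], coords[j]))
    return others[:n]


def mean(values):
    return sum(values) / len(values)


def variance(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def outlier_statistics(coords, z, n):
    """Test statistic N (z - m_l)^2 / ((N + 1) s_l^2) for every site."""
    neighbours = [nearest_neighbours(i, coords, n) for i in range(len(z))]
    local_means = [mean([z[j] for j in nb]) for nb in neighbours]
    avg_local_var = mean([variance([z[j] for j in nb]) for nb in neighbours])
    return [n * (z[i] - local_means[i]) ** 2 / ((n + 1) * avg_local_var)
            for i in range(len(z))]


# Toy 5 x 5 grid with a smooth trend and one spike at the centre site.
coords = [(x, y) for x in range(5) for y in range(5)]
z = [float(x + y) for x, y in coords]
z[12] = 20.0  # site (2, 2)

stats = outlier_statistics(coords, z, n=4)
outliers = [i for i, s in enumerate(stats) if s >= CHI2_CRIT_1DF_95]
print(outliers)
```

Only the spiked centre site is flagged here, although its presence also inflates the local means and variances of its four neighbours. This is one way in which nearby outliers can influence each other under this method.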
To calculate this test statistic for the critical load data, the local mean and variances were found using an adaptive bandwidth of 5%. At the 95% level of confidence, this resulted in approximately 5% of the data being identified as spatially outlying, which in turn yielded a filtered data set of 474 values. Fig. 5b maps the identified spatial outliers. There are no obvious trends or distinct clusters of spatial outliers, although the highest concentrations appear in areas of N England and on the border of SE Wales/SW England. It is suspected that many of the outliers detected in these areas influence each other in this method (e.g. see the cluster in SE Wales). That is, the removal of only one or two key outliers may result in the remaining nearby outliers being de-classified as spatially outlying. Overall, the critical load process appears smooth and continuous for much of Scotland, whereas for parts of England and Wales there is evidence of much discontinuity.
4.2.3. Critical load distribution and trend
Local mean, variance and skew surfaces for the critical load data are given in Fig. 6a. From the local mean surface, the distribution of unusual means (via z-scores located on the 10 × 16 grid) suggests unusually low critical loads located in N Scotland.
Fig. 3. Critical load scatterplots versus (a) spatial coordinates and (b) continuous catchment data. Scatterplots are shown with marginal box plots, MLR fit (dashed line) and
LR fit (solid line) using R code provided with applied regression work of Fox (2002). LR smoothing parameter is conservatively chosen at 0.8 (i.e. proportion of observations
included locally). Correlations for critical load to Easting and critical load to Northing are r=0.46 and −0.43, respectively. Continuous catchment variables are jittered
(a random error addition to the coordinate) to negate effects of over-plotting.
Table 5
Linear raw (and ranked) data correlations for critical load and continuous catchment data.
                Critical load   Wt.GSP        Wt.SBCP       Wt.SCLP
Critical load   1               0.58 (0.58)   0.65 (0.64)   0.54 (0.57)
Wt.GSP                          1             0.66 (0.67)   0.59 (0.58)
Wt.SBCP                                       1             0.73 (0.73)
Wt.SCLP                                                     1
Table 6
Five-number summaries for critical load and continuous catchment data.
                Min.    Q1      Median   Q3      Max.
Critical load   0       1.29    3.41     9.31    31.32
Wt.GSP          1.00    1.00    2.00     3.10    4.00
Wt.SBCP         10.00   10.00   21.10    58.40   80.00
Wt.SCLP         0.10    0.50    0.50     1.00    4.00
Fig. 4. Land cover data: (a) nine conditional distributions and (b) according to three experimental critical load populations defined by thresholds of Fig. 2(c).
Fig. 5. (a) LC9D classes 2/3/4 and (b) location of critical load spatial outliers.
Unusually high critical loads can be found in a large area of central to SE England and a small area of N England/S Scotland. Data in the identified areas relate to the multi-modal nature of the critical load histogram. From the local variance surface, critical loads located in SW and N England reveal the highest variability. Clearly, spatial outliers influence the magnitude of the variance in such areas (see Fig. 5b). A randomisation test suggests that critical load variability in a large area of N Scotland is ‘significantly’ low, and that there are pockets of ‘significantly’ high and low variances elsewhere. Thus N Scotland provides the least variation in critical load, as does a large area of central to SE England. This is interesting as such areas have predominantly low and high critical loads, respectively. Thus high critical loads do not necessarily coincide with high variance.
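As a rough illustration of how such local moments can be obtained, the sketch below computes a mean, variance and skew from the nearest fraction of the data around a target location, i.e. an adaptive box-car kernel in which every site inside the window has weight 1. The conventional moment-based skew is assumed here and may differ from the exact GW skew definition used for Fig. 6a. The function name and the toy transect are hypothetical.

```python
import math


def boxcar_local_moments(coords, values, target, frac):
    """Mean, variance and skew of the nearest `frac` of the data to `target`."""
    k = max(3, round(frac * len(values)))  # window size as a share of the data
    nearest = sorted(
        range(len(values)),
        key=lambda j: math.hypot(coords[j][0] - target[0], coords[j][1] - target[1]),
    )[:k]
    z = [values[j] for j in nearest]
    mean = sum(z) / k
    var = sum((v - mean) ** 2 for v in z) / k
    sd = math.sqrt(var)
    # Moment-based skew (an assumption); zero if the window has no spread.
    skew = sum((v - mean) ** 3 for v in z) / (k * sd ** 3) if sd > 0 else 0.0
    return mean, var, skew


# Hypothetical transect: low loads with one high value in the west,
# high loads with one low value in the east.
coords = [(float(i), 0.0) for i in range(10)]
values = [1, 1, 1, 1, 10, 9, 9, 9, 9, 0]
west = boxcar_local_moments(coords, values, (0.0, 0.0), frac=0.5)
east = boxcar_local_moments(coords, values, (9.0, 0.0), frac=0.5)
```

In this toy transect the two windows share the same variance although their means differ markedly, and their skews have opposite signs, so a high local mean need not imply a high local variance.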
This unusual phenomenon can be more clearly seen with a CV surface. Here if high critical loads coincide with high variance (or SD) whilst low critical loads coincide with low variance (or SD) (i.e. a proportional relationship between local mean and SD data), then the CV should be fairly uniform across space. Fig. 6b presents three such surfaces, where bandwidths are taken at 5%, 10% and 20% (to highlight how interpretations can change depending on what scale the process is viewed at). Clearly, there is little
Fig. 6. Critical load local statistic surfaces using an adaptive box-car kernel: (a) bandwidths set at 5% for the mean and 10% for variance and skew surfaces and (b)
bandwidths set at 5%, 10% and 20% (top–bottom) for CV surfaces. All surfaces are shown with corresponding test results (defined on a 10 × 16 grid). For mean and variance
surfaces, a few white areas within sampled region indicate local moments outside ranges specified.
evidence that the SD scales with the mean for the critical load process on any global scale. In addition to the relatively low variation that is evident in central to SE England, a relatively high local variation is evident in N England.
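The reasoning behind the CV surface can be checked with a small sketch. Where the SD is proportional to the mean, the local CV is constant; where the SD stays small while the mean is high, the CV drops. The code below assumes an adaptive box-car window and a positive local mean; the function name, the three clusters and the bandwidth fractions are hypothetical.

```python
import math


def local_cv(coords, values, target, frac):
    """Local coefficient of variation (SD / mean) in an adaptive box-car window.

    Assumes the local mean is positive.
    """
    k = max(2, round(frac * len(values)))
    nearest = sorted(
        range(len(values)),
        key=lambda j: math.hypot(coords[j][0] - target[0], coords[j][1] - target[1]),
    )[:k]
    z = [values[j] for j in nearest]
    mean = sum(z) / k
    sd = math.sqrt(sum((v - mean) ** 2 for v in z) / k)
    return sd / mean


# Three well-separated clusters of four sites each.
coords = [(float(c + i), 0.0) for c in (0, 100, 200) for i in range(4)]
values = [1, 2, 3, 2, 10, 20, 30, 20, 20, 21, 22, 21]

cv_a = local_cv(coords, values, (1.5, 0.0), frac=1 / 3)    # low mean, low SD
cv_b = local_cv(coords, values, (101.5, 0.0), frac=1 / 3)  # high mean, high SD
cv_c = local_cv(coords, values, (201.5, 0.0), frac=1 / 3)  # high mean, low SD

# The same target viewed at three bandwidths, as in a multi-scale comparison.
cv_by_bandwidth = {f: local_cv(coords, values, (201.5, 0.0), f) for f in (1 / 3, 2 / 3, 1.0)}
```

Clusters A and B give the same CV because their SDs scale with their means, whereas cluster C has a much lower CV, which is the kind of departure from proportionality a CV surface is meant to reveal.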
From the local skew surface in Fig. 6a, the same area of central to SE England that has low variance also has a negatively skewed distribution. This contrary direction of skew can be explained by a domination of high critical loads combined with a handful of low critical loads. That is, the tail of this local distribution is stretched with low critical loads. Interestingly, a randomisation test identifies as unusual not only this area of negative skew, but also an area of positive skew in Scotland. These findings suggest that data transforms may need to be defined locally.
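A randomisation test of this kind can be sketched as follows. The version below is an assumption about the general approach rather than the exact procedure behind the test results shown with the figures: the data values are repeatedly shuffled across the fixed site locations, the local statistic (here the variance in a box-car window of k sites) is recomputed, and the share of shuffled statistics at or below the observed one gives a pseudo p-value for an unusually low local variance. Function names, the number of permutations and the toy data are hypothetical.

```python
import math
import random


def local_variance(coords, values, target, k):
    """Variance of the k sites nearest to `target` (box-car window)."""
    nearest = sorted(
        range(len(values)),
        key=lambda j: math.hypot(coords[j][0] - target[0], coords[j][1] - target[1]),
    )[:k]
    z = [values[j] for j in nearest]
    m = sum(z) / k
    return sum((v - m) ** 2 for v in z) / k


def randomisation_p_low(coords, values, target, k, n_perm=199, seed=0):
    """Pseudo p-value that the observed local variance is unusually low."""
    rng = random.Random(seed)
    observed = local_variance(coords, values, target, k)
    shuffled = list(values)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # reassign values to locations at random
        if local_variance(coords, shuffled, target, k) <= observed:
            count += 1
    return (1 + count) / (n_perm + 1)


# Toy transect: five identical values in the west, varied values elsewhere.
coords = [(float(i), 0.0) for i in range(20)]
values = [2.0] * 5 + [float(i) for i in range(5, 20)]
p_low = randomisation_p_low(coords, values, (0.0, 0.0), k=5)
```

The same scheme applies to a local mean, skew or CV by swapping the statistic, and a two-sided version would also count shuffled statistics at or above the observed value.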
4.2.4. Critical load distribution and trend: use of more robust measures
To gauge the effect of spatial outliers, local moments are next calculated using the filtered data set of 474 values. As most outliers are high valued, calibration with the filtered data generally lowers the local means and variances in those areas most affected by outlying data (compare surfaces in Fig. 7 with
Fig. 7. Critical load surfaces for robust statistics using an adaptive box-car kernel. Bandwidths are set at 5% for the mean (a) and 10% for variance (b) surface (both use
filtered data). Bandwidth is set at 10% for reduced variability (c) surface. All surfaces are shown with corresponding test results.
corresponding surfaces in Fig. 6a). For local skew (not shown), a few areas in central S England change sign. However, reducing unwanted variability is the key reason for identifying outliers in the first place. One way to visualise any reduction in variance is shown in Fig. 7c, where differences in the local variances from the full to filtered data (at the 474 sites) are locally averaged. Here it is clear that for N England, critical load variability has lowered dramatically. This relates to an influential area of outliers observed before. With the full data set, this area is likely to have the least continuity in critical load and prove the most problematic when modelling with GWR and similar spatial models.
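One possible reading of the reduced-variability calculation is sketched below. At each retained site, the local variance is computed once from the full data and once from the filtered data; the differences are then locally averaged over the retained sites. The window size k, the helper names and the toy transect are hypothetical, and a box-car window is assumed throughout.

```python
import math


def knn_values(coords, values, site_ids, target, k):
    """Values at the k sites in `site_ids` nearest to `target`."""
    nearest = sorted(
        site_ids,
        key=lambda j: math.hypot(coords[j][0] - target[0], coords[j][1] - target[1]),
    )[:k]
    return [values[j] for j in nearest]


def variance(z):
    m = sum(z) / len(z)
    return sum((v - m) ** 2 for v in z) / len(z)


def variance_reduction(coords, values, keep, k):
    """Locally averaged drop in local variance after filtering, per retained site."""
    all_ids = list(range(len(values)))
    # Full-data minus filtered-data local variance at each retained site.
    diff = {
        i: variance(knn_values(coords, values, all_ids, coords[i], k))
        - variance(knn_values(coords, values, keep, coords[i], k))
        for i in keep
    }
    # Local average of those differences over the retained sites.
    return {i: sum(knn_values(coords, diff, keep, coords[i], k)) / k for i in keep}


# Toy transect with one high outlier at site 2, which the filter removes.
coords = [(float(i), 0.0) for i in range(12)]
values = [1.0 if i % 2 == 0 else 2.0 for i in range(12)]
values[2] = 40.0
keep = [i for i in range(12) if i != 2]
reduction = variance_reduction(coords, values, keep, k=3)
```

The reduction is large next to the removed outlier and zero far from it, which mirrors how a reduced-variability surface localises the influence of outlying data.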
4.2.5. Critical load distribution and trend: use of distance-decay kernels
As the kernel and bandwidth cannot usually be chosen with any objectivity, it is important to experiment. Therefore in the first instance, the local mean, variance and skew surfaces are re-specified with a bi-square kernel using 15%, 30% and 30% adaptive bandwidths, respectively (i.e. N=75, 150 and 150, see Fig. 8a). In each case, this results in a more continuous and smooth output, but where the previously perceived moment nonstationarity is (reassuringly) confirmed. To get such smooth outputs using a box-car kernel would require a much larger bandwidth, but this would reduce local detail. In the second instance, an alternative comparison is possible by re-specifying one statistic using different kernels. Here the local CV surfaces are re-specified in Fig. 8b using bi-square, Gaussian and exponential kernels, where fixed bandwidths are chosen to highlight the smoothing similarities between the weighting functions (c.f. Fig. 1). Again (and as expected) a more continuous output is apparent from all three surfaces. Again there is little evidence that the SD scales with the mean on any global scale, but now the relatively low variation that is still evident in central to SE England appears more clearly defined.
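For readers who want to compare the weighting functions directly, the sketch below writes out the four kernels in the forms commonly used in the geographically weighted literature. These forms are an assumption and should be checked against the definitions given with Fig. 1. The values r = 350 km and b = 165 km follow the fixed bandwidths quoted in the caption of Fig. 8; the sites and data values are hypothetical.

```python
import math


def boxcar(d, r):
    """Weight 1 inside radius r, 0 outside."""
    return 1.0 if d <= r else 0.0


def bisquare(d, r):
    """Smooth decay to exactly 0 at radius r."""
    return (1.0 - (d / r) ** 2) ** 2 if d <= r else 0.0


def gaussian(d, b):
    """Continuous decay that never reaches 0."""
    return math.exp(-0.5 * (d / b) ** 2)


def exponential(d, b):
    """Sharper peak and heavier tail than the Gaussian."""
    return math.exp(-d / b)


def gw_mean(coords, values, target, weight):
    """Geographically weighted mean at `target` for a given weight function."""
    w = [weight(math.hypot(x - target[0], y - target[1])) for x, y in coords]
    return sum(wi * zi for wi, zi in zip(w, values)) / sum(w)


# Hypothetical sites (km) and values.
coords = [(0.0, 0.0), (175.0, 0.0), (350.0, 0.0), (500.0, 0.0)]
values = [2.0, 4.0, 6.0, 100.0]
m_bisq = gw_mean(coords, values, (0.0, 0.0), lambda d: bisquare(d, 350.0))
m_gauss = gw_mean(coords, values, (0.0, 0.0), lambda d: gaussian(d, 165.0))
```

With the bi-square kernel the distant high value receives zero weight, whereas the Gaussian kernel still gives it a small influence; this is one reason why surfaces from different kernels can differ in detail while agreeing in broad pattern.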
4.2.6. Distribution and trend with the continuous catchment data
To interpret local correlation between a continuous catchment variable and critical load, it is first useful to assess how the catchment variable itself varies across space. Hence local mean and SD surfaces are presented in Fig. 9 for Wt.GSP, Wt.SBCP and Wt.SCLP. In a broad global sense, it is evident that these variables vary in a similar fashion to each other and also to critical load (c.f. Fig. 6a). Consequently, any one of these variables should at least explain the vague NW–SE trend in critical load across Great Britain and these initial results tend to suggest fairly stationary critical load to continuous catchment data relationships. Areas of low SD in an independent variable can also indicate where GWR may have calibration difficulties (i.e. towards singular matrices). In this respect, areas of low SD should be noted and they are SE England and N Scotland for Wt.GSP; N Scotland for Wt.SBCP; Wales and N England/Scotland for Wt.SCLP. Observe also that the spatial distribution of the land cover classes with respect to their relationship to high and low critical load data values (i.e. those given in Fig. 5a) shows a broad similarity with the mean surface of
Fig. 8. Critical load surfaces for: (a) mean, variance and skew each using an adaptive bi-square kernel and (b) CV using bi-square, Gaussian and exponential fixed kernels
(top to bottom). Bandwidths (bi-square kernel) are set at 15% for mean and 30% for variance and skew surfaces. For bi-square mean surface, a few white areas indicate local
means outside ranges specified. CV bandwidths are set at 350 km for bi-square (i.e. r=350) and 165 km for Gaussian and exponential kernels (i.e. b=165). All surfaces are
shown with corresponding test results.
each continuous catchment variable. This suggests a further collinearity amongst the catchment data.
4.2.7. Local relationships: collinearity in the continuous catchment data
As the continuous catchment data vary in a similar spatial fashion to each other, it is likely that the (global) collinearity in these variables extends to a more local scale. To see this, a local correlation matrix for the continuous catchment data is given in Fig. 10, where map intervals coloured: (a) pink suggest little or no collinearity, (b) white suggest moderate to strong positive collinearity and (c) green suggest moderate to strong negative collinearity. Clearly, the degree of collinearity at a global scale does not always extend to the local scale after all. For example, Wt.SBCP and Wt.SCLP have little relationship in W Scotland, and Wt.GSP and Wt.SCLP have little relationship over large areas of SE England and N England/S Scotland. Results suggest that a combination of continuous catchment variables can be included in a GWR fit, without compromising its interpretation. However, such combinations are still unlikely to be viable with an MLR fit. It remains to be seen if the LC9D variable is complementary to the continuous variables. It is difficult to investigate LC9D in relation to other independent data due to its categorical nature.
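The contrast between global and local collinearity can be illustrated with a small sketch. It computes a Pearson correlation within an adaptive box-car window around a target location and compares it with the global value. The two variables are synthetic stand-ins, not the catchment data, and the function names are hypothetical.

```python
import math


def pearson(x, y):
    """Ordinary Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def local_corr(coords, x, y, target, frac):
    """Correlation of x and y over the nearest `frac` of sites (box-car window)."""
    k = max(3, round(frac * len(x)))
    nearest = sorted(
        range(len(x)),
        key=lambda j: math.hypot(coords[j][0] - target[0], coords[j][1] - target[1]),
    )[:k]
    return pearson([x[j] for j in nearest], [y[j] for j in nearest])


# Two synthetic variables at four western and four eastern sites.
coords = [(float(c + i), 0.0) for c in (0, 100) for i in range(4)]
var1 = [1, 2, 3, 4, 11, 12, 13, 14]
var2 = [1, 2, 3, 4, 12, 11, 11, 12]

global_r = pearson(var1, var2)
r_west = local_corr(coords, var1, var2, (1.5, 0.0), frac=0.5)
r_east = local_corr(coords, var1, var2, (101.5, 0.0), frac=0.5)
```

Here the global correlation is very high because both variables are larger in the east, yet within the eastern window the two variables are uncorrelated, so strong global collinearity does not guarantee strong local collinearity.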
Fig. 9. Local mean (a) and SD (b) surfaces for Wt.GSP, Wt.SBCP and Wt.SCLP. All surfaces are specified with an adaptive box-car kernel using a 5% bandwidth for mean and a
15% bandwidth for SD surface. All surfaces are shown with corresponding test results.
4.2.8. Local relationships: critical load correlations
Local correlations can now be used to explore spatial relationships between the continuous catchment data and critical load. Surfaces and histograms are given in Fig. 11, where the histograms are constructed from local correlation values at every calibration data location. It is apparent that for all surfaces there is no significant change of correlation sign across space. Overall there are no strongly nonstationary relationships and in general, any change in relationship occurs at a fairly large spatial scale. It is considered that critical load’s relationship with Wt.SBCP is the most stationary, whilst critical load’s relationship with Wt.GSP is the most nonstationary (primarily due to a weak correlation in N England/S Scotland). The consistently moderate to strong nature of the critical load to Wt.SBCP relationship suggests that this variable is likely to be the most promising in any regression. Wt.SBCP relates best to critical load in Wales and SW England. Regional differences in critical load’s relationship to Wt.GSP can also be illustrated by sub-setting the data into (a) sites below 400 km Northings, (b) sites between 400 and 700 km Northings and (c) sites above 700 km Northings, and then plotting Wt.GSP against critical load. The relationship should be the strongest with subset (a) in the south of Great Britain (see Fig. 12). Clearly, a local correlation surface is a more elegant approach to explore local relationships.
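The sub-setting check described above amounts to splitting the sites into three Northing bands and computing one correlation per band. A minimal sketch follows; the 400 km and 700 km breaks are taken from the text, while the function names and the nine data values are invented for illustration.

```python
import math


def pearson(x, y):
    """Ordinary Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def band_correlations(northings_km, x, y, lower=400.0, upper=700.0):
    """Correlation of x and y in bands (a) below lower, (b) between, (c) above upper."""
    bands = {"a": ([], []), "b": ([], []), "c": ([], [])}
    for n, xi, yi in zip(northings_km, x, y):
        key = "a" if n < lower else ("b" if n <= upper else "c")
        bands[key][0].append(xi)
        bands[key][1].append(yi)
    return {key: pearson(bx, by) for key, (bx, by) in bands.items()}


# Invented example: three sites per band.
northings = [100, 200, 300, 450, 550, 650, 750, 850, 950]
x_var = [1, 2, 3, 1, 2, 3, 1, 2, 3]
y_var = [2, 4, 6, 1, 3, 2, 3, 1, 2]
r_by_band = band_correlations(northings, x_var, y_var)
```

Such fixed bands impose arbitrary boundaries, whereas a local correlation surface lets the window move continuously across the study area.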
Fig. 10. Local correlation matrix for continuous catchment variables. Correlations are defined with an adaptive 15% bandwidth using a box-car kernel. All surfaces are
shown with corresponding test results. Inter-quartile ranges are also given for local correlation values at every calibration data location only (i.e. n=497).
4.2.9. Local relationships: critical load correlations with distance-decay kernels
Each continuous catchment variable appears to relate weakly to critical load in areas of NW Scotland. For Wt.SCLP, a weak correlation is likely to be genuine and a consequence of little variation in one or both variables. However for Wt.GSP and Wt.SBCP, the more abrupt changes in correlation are a likely artefact of one or two outlying observation pairs just falling within the box-car kernel. For the correlation coefficient, an outlying value (or relationship) can seriously affect this linear construct. In the spirit of exploration, it is worthwhile experimenting with other kernel specifications less susceptible to outliers. As such, local correlation surfaces for Wt.GSP, Wt.SBCP and Wt.SCLP are re-specified with an exponential kernel in Fig. 13a. Here the chosen bandwidth matches that used in an AIC-defined GWR model of a companion study. According to this new specification, weak correlations with critical load are not so evident in NW Scotland for Wt.GSP and Wt.SBCP, whereas for Wt.SCLP a weak correlation remains (as expected).
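The effect of an outlying pair that only just falls inside a box-car window can be reproduced with a weighted correlation. In the sketch below, the same six observation pairs are correlated twice: once with equal box-car weights and once with exponential distance-decay weights. The fixed bandwidth b = 1 and the data are invented for illustration and do not correspond to the adaptive bandwidth used for Fig. 13.

```python
import math


def weighted_corr(x, y, w):
    """Weighted Pearson correlation with non-negative weights w."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    syy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y))
    return sxy / math.sqrt(sxx * syy)


# Distances from the target location; the last pair is an outlying relationship
# sitting right at the edge of the box-car window.
dist = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
x_var = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y_var = [1.0, 2.0, 3.0, 4.0, 5.0, -10.0]

w_box = [1.0 if d <= 5.0 else 0.0 for d in dist]  # box-car, radius 5
w_exp = [math.exp(-d / 1.0) for d in dist]        # exponential, b = 1

r_box = weighted_corr(x_var, y_var, w_box)
r_exp = weighted_corr(x_var, y_var, w_exp)
```

Under the box-car weights the single outlying pair turns an otherwise perfect positive relationship into a negative correlation, whereas the exponential weights downweight the distant pair and the local correlation stays positive.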
4.2.10. Local relationships: robust critical load correlations
Specifying distance-decay kernels to negate the effects of outlying data on local correlations is only one approach to this problem. For example, a filtered data approach could be used instead. However, it does not directly follow that representative correlation surfaces would be found using the filtered data set found before, as the filtering was based only on the identification of outlying data in a univariate sense. Instead a different filtered data set is now needed where critical loads that are considered unusual in their relationship to the catchment data are filtered out. As an example, such relationship outliers could be identified from an assessment of high prediction errors from some regression fit.
Alternatively, a third and more direct approach to provide a robust local correlation surface is to re-apply the local correlation algorithm to ranked data. Even though individual outlying relationships are not directly identified, as with a filtered data approach, this direct approach should at least identify regions where outlying relationships are most influential. As such, local rank correlation surfaces for Wt.GSP, Wt.SBCP and Wt.SCLP are given in Fig. 13b, each using the same kernel as that specified in Fig. 13a. It appears that the use of ranked data has the greatest impact when interpreting local relationships between critical load and Wt.SCLP, whereas other critical load relationships remain broadly similar. These findings loosely mimic those found globally (see Table 5).
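A ranked-data version of the local correlation can be sketched as below. It assumes that the data are ranked once over the whole data set (with average ranks for ties) and that the local correlation is then computed on those ranks; whether ranking was done globally or within each window is not stated above, so this is an assumption. For brevity a box-car window of the k nearest sites is used rather than the exponential kernel of Fig. 13, and all names and data are hypothetical.

```python
import math


def average_ranks(v):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    ranks = [0.0] * len(v)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and v[order[j + 1]] == v[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for t in range(i, j + 1):
            ranks[order[t]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def local_corr(coords, x, y, target, k):
    """Pearson correlation over the k sites nearest to `target`."""
    nearest = sorted(
        range(len(x)),
        key=lambda j: math.hypot(coords[j][0] - target[0], coords[j][1] - target[1]),
    )[:k]
    return pearson([x[j] for j in nearest], [y[j] for j in nearest])


coords = [(float(i), 0.0) for i in range(6)]
x_var = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y_var = [1.0, 2.0, 3.0, 4.0, 5.0, -10.0]  # last pair is an outlying relationship

r_raw = local_corr(coords, x_var, y_var, (0.0, 0.0), k=6)
r_rank = local_corr(coords, average_ranks(x_var), average_ranks(y_var), (0.0, 0.0), k=6)
```

Ranking caps the leverage of the outlying pair, so the rank-based value is less extreme than the raw one, although it remains well below the correlation of the five well-behaved pairs.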
5. Discussion
5.1. Analysis summary
Standard EDA has found evidence for both global and local spatial trends in critical load variability across Great Britain. Moderately strong correlation coefficients are found at the global scale, but these single-valued statistics can mask high scattering
Fig. 11. Local correlation surfaces (a) and histograms (b) for critical load relationships with Wt.GSP, Wt.SBCP and Wt.SCLP. All correlations are found using an adaptive box-car kernel with a 15% bandwidth. All surfaces are shown with corresponding test results. Histograms represent local correlation values at every calibration data location only (i.e. n=497).
in each critical load to continuous catchment variable (i.e. geological sensitivity, soil buffering capacity and soil critical load) relationship. Similar critical load explanatory powers are likely with a categorical land cover variable, which could only partially discriminate between three (experimental) critical load populations.
According to the mapping of local statistics, there is strong evidence of mean, variance, CV and skew nonstationarity in the critical load process. Spatial outliers are the most influential in an area of N England, giving rise to a heightened critical load variance. The continuous catchment variables vary spatially in a similar (global) fashion to each other and to critical load. However locally there are differences and the high collinearity found at a global scale does not always extend to the local scale. For local correlation with critical load, there is no significant change of sign across space with the continuous catchment data. In general and as expected with a physical process, any change in relationship between these data and critical load only occurs at a fairly large spatial scale. Geological sensitivity appears to have the most marked nonstationary relationship to critical load, whilst soil buffering capacity appears to have the most marked stationary relationship.
Fig. 12. Scatterplots for sub-setted critical load and Wt.GSP data: (a) sites below 400 km Northings, (b) sites between 400 km and 700 km Northings and (c) sites above
700 km Northings. Linear correlation coefficients are 0.58, 0.24 and 0.36 for subsets (a)–(c), respectively. All plots are shown with marginal box plots, MLR fit (dashed line)
and LR fit (solid line with smoothing parameter=0.8). Wt.GSP is jittered to aid visualisation.
5.2. The critical load data
As with any statistical analysis, model choice and output depend strongly on the sample data. In this respect, there are two misgivings with the critical load data that need to be noted. Firstly, the water chemistry samples were taken over a 2-year period (1992–1994), so it needs to be assumed that spatial variation in water chemistry has not been contaminated with any temporal variation in water chemistry. Secondly, the water chemistry sites were chosen to represent the most sensitive water body within a 10 or 20 km grid square so that the minimum critical load could be calculated. Unfortunately, this site selection was unlikely to be free from error. Curtis et al. (1995) suggested that as many as a third of the selected sites were not at the most sensitive water body within the given grid square. This entails that a third of the critical load data could be a significant over-estimate of the minimum critical load. Consequently (and for both misgivings), the critical load data may actually reflect a mixture of critical load populations, which may then lead to incorrect model choices and spurious results. It is suggested that a random site selection within each grid square over a much shorter sampling time period would have reduced such concerns.
6. Conclusions
It has been worthwhile to investigate a freshwater acidification critical load data set for Great Britain using GWSSs, as a good understanding of spatial variation and spatial covariation in this data set has unfolded. With the use of GW univariate statistics, critical load and catchment data distributions can be taken as nonstationary in all of their key moments. With the use of GW bivariate statistics, many critical load to catchment data relationships are also nonstationary (which agrees with previous research), but this occurs only at a fairly large spatial scale (which is to be expected from a physical process). From these exploratory results, an investigation with GWR of the same data set is now warranted. GWR enables a more complete investigation into nonstationary relationships than that found with the GW correlation coefficient as (a) GWR incorporates the controlling effects that each independent variable has on each other, (b) GWR can easily include any categorical independent variable and (c) GWR calibration does not have to depend on an arbitrarily chosen weighting function. Other spatial models can also follow this GWSS analysis, as ultimately, models for predicting critical loads are sought. A GW mean or GWR itself can be used as one such predictor. Alternatively if a geostatistical predictor is preferred, then it should be constructed in a manner that caters for the nonstationarities observed here.
Acknowledgements
Research presented in this paper was funded by a Strategic Research Cluster Grant (07/SRC/I1168) by the Science Foundation Ireland under the National Development Plan. The authors gratefully acknowledge this support and the first author’s Ph.D.
Fig. 13. Local correlation surfaces for critical load relationships with Wt.GSP, Wt.SBCP and Wt.SCLP: (a) raw data and (b) ranked data. All surfaces are specified with an
exponential kernel using an adaptive bandwidth set at 3.62%. This bandwidth is a nonlinear parameter, which loosely reflects a local data subset size (as a percentage) that
exerts the greatest influence on each local correlation calculation. All surfaces are shown with corresponding test results.
studentship at Newcastle University. Thanks are also due to A.S. Fotheringham and S. Juggins.
References
Battarbee, R.W., Allott, T.E.H., Juggins, S., Kreiser, A.M., Curtis, C., Harriman, R., 1996. Critical loads of acidity to surface waters: an empirical diatom-based paleolimnological model. Ambio 25, 366–369.
Bowman, A.W., Azzalini, A., 1997. Applied Smoothing Techniques for Data Analysis: The Kernel Approach with S-plus Illustrations. Oxford University Press, New York (193 pp.).
Brimblecombe, P., 2007. Preface. Water, Air, and Soil Pollution: Focus 7, 1–2.
Brunsdon, C., Fotheringham, A.S., Charlton, M.E., 1996. Geographically weighted regression: a method for exploring spatial nonstationarity. Geographical Analysis 28, 281–289.
Brunsdon, C., Fotheringham, A.S., Charlton, M.E., 2002. Geographically weighted summary statistics: a framework for localised exploratory data analysis. Computers, Environment and Urban Systems 26, 501–524.
Carroll, Z.L., Oliver, M.A., 2005. Exploring the spatial relations between soil physical properties and apparent electrical conductivity. Geoderma 128, 354–374.
CLAG Freshwaters, 1995. Critical loads of acid deposition for United Kingdom freshwaters. Sub-report on Freshwaters, Critical Loads Advisory Group, Institute of Terrestrial Ecology (ITE), Penicuik, Scotland, 80 pp.
Clark, R.M., 1977. Non-parametric estimation of a smooth regression function. Journal of the Royal Statistical Society B 39, 107–113.
Cleveland, W.S., 1979. Robust locally weighted regression and smoothing scatterplots. Journal of the American Statistical Association 74, 829–836.
Curtis, C.J., Allott, T.E.H., Battarbee, R.W., Harriman, R., 1995. Validation of the UK critical loads for freshwaters: site selection and sensitivity. Water, Air, and Soil Pollution 85, 2467–2472.
Curtis, C., Allott, T., Hall, J., Harriman, R., Helliwell, R., Hughes, M., Kernan, M., Reynolds, B., Ullyett, J., 2000. Critical loads of sulphur and nitrogen for freshwaters in Great Britain and assessment of deposition reduction requirements with the First-order Acidity Balance (FAB) model. Hydrology and Earth System Sciences 4, 125–140.
Diggle, P., 1985. A kernel method for smoothing point process data. Applied Statistics 34, 138–147.
Fotheringham, A.S., Brunsdon, C., Charlton, M., 2002. Geographically Weighted Regression: The Analysis of Spatially Varying Relationships. John Wiley, Chichester, Sussex (269 pp.).
Fox, J., 2002. An R and S-Plus Companion to Applied Regression. Sage, London (312 pp.).
Hall, J.R., Wright, S.M., Sparks, T.H., Ullyett, J., Allott, T.E.H., Hornung, M., 1995. Predicting freshwater critical loads from national data on geology, soils and land use. Water, Air, and Soil Pollution 85, 2443–2448.
Hastie, T.J., Tibshirani, R.J., 1990. Generalized Additive Models. Chapman & Hall, London (335 pp.).
Hastie, T.J., Tibshirani, R.J., 1993. Varying-coefficient models. Journal of the Royal Statistical Society B 55, 757–796.
Hawkins, D.M., 1980. Identification of Outliers. Chapman & Hall, London (188 pp.).
Henriksen, A., Kamari, J., Posch, M., Wilander, A., 1992. Critical loads of acidity: Nordic surface waters. Ambio 21, 356–363.
Ihaka, R., Gentleman, R., 1996. R: a language for data analysis and graphics. Journal of Computational and Graphical Statistics 5, 299–314.
Isaaks, E.H., Srivastava, R.M., 1989. An Introduction to Applied Geostatistics. Oxford University Press, New York (561 pp.).
Kernan, M.R., Allott, T.E.H., Battarbee, R.W., 1998. Predicting freshwater critical loads of acidification at the catchment scale: an empirical model. Water, Air, and Soil Pollution 185, 31–41.
Kernan, M.R., Haliwell, R.C., Hughes, M.J., 2001. Predicting freshwater critical loads from catchment characteristics using national datasets. Water, Air, and Soil Pollution: Focus 1, 415–435.
Kreiser, A.M., Patrick, S.T., Battarbee, R.W., 1993. Critical loads for UK freshwaters: introduction, sampling strategy and use of maps. In: Hornung, M., Skeffington, R.A. (Eds.), Critical Loads: Concepts and Applications, Proceedings of ITE Symposium No. 28, HMSO (Her Majesty’s Stationery Office), London, pp. 94–98.
Loader, C., 2004. Smoothing: local regression techniques. In: Gentle, J., Hardle, W., Mori, Y. (Eds.), Handbook of Computational Statistics. Springer-Verlag, Heidelberg, pp. 539–564.
Mason, C.F., 1993. Biology of Freshwater Pollution. John Wiley, New York (351 pp.).
Ploner, A., 1999. The use of the variogram cloud in geostatistical modelling. Environmetrics 10, 413–437.
Posch, M., Kamari, J., Forsius, M., Henriksen, A., Wilander, A., 1997. Exceedance of critical loads for lakes in Finland, Norway and Sweden: reduction requirements for acidifying nitrogen and sulphur deposition. Environmental Management 21, 291–304.
Rossi, R.E., Mulla, D.J., Journel, A.G., Franz, E.H., 1992. Geostatistical tools for modelling and interpreting ecological spatial dependence. Ecological Monographs 62, 277–314.
Silverman, B.W., 1986. Density Estimation for Statistics and Data Analysis. Chapman & Hall, London (175 pp.).
Slootweg, J., Hettelingh, J-P., Posch, M., Schutze, G., Spranger, T., de Vries, W., Rinds, G.J., van’t Zelfde, M., Dutchak, S., Illyin, I., 2007. European critical loads of cadmium, lead and mercury and their exceedances. Water, Air, and Soil Pollution: Focus 7, 371–377.
Wand, M.P., Jones, M.C., 1995. Kernel Smoothing. Chapman & Hall, London (212 pp.).
Zhang, C., Jordan, C., Higgins, A., 2007. Using neighbourhood statistics and GIS to quantify and visualize spatial variation in geochemical variables: an example using Ni concentrations in the topsoils of Northern Ireland. Geoderma 137, 466–476.