Research Paper
Abstract
Fragmentation of habitat is an issue of great concern to ecologists and conservation
managers, and poses a potential threat to the continued existence of species
populations. Many studies have been conducted worldwide to assess this threat, and
habitat-abundance models exist for a variety of species. Few studies, though, have
looked at multiple species, or at the spatial arrangement of habitat.
The uplands of Scotland are characterised by a fragmented landscape with a trend for
larger scale conservation management. This paper proposes an approach combining
GIS data and statistical modelling with habitat spatial arrangement, to predict, across
Scotland, both overall diversity, and distribution and frequency (relative abundance)
for a smaller set of focal species. Models created from GIS and stepwise regression are
shown to provide powerful tools for high resolution predictive modelling, and for
understanding the optimum spatial configurations of habitat that maximise diversity.
1 Introduction
Habitat fragmentation has been identified as a significant control on population
dynamics for many species. Malanson and Cramer (1999, p.1) describe fragmentation
as “[possibly] the greatest current threat to biodiversity”. Despite this, the effects of
fragmentation have not been fully explored (Tucker, 2001), and relatively little is
known about implications for multiple species populations or for overall diversity.
1.1 Studies of upland bird species
Upland Scotland is host to a range of bird species populations of international
importance (Stillman and Brown, 1998; Tharme et. al., 2001). The landscape is
characterised by fragmented habitat with large-scale conservation management
(Moorland Working Group, 2002). Therefore species-habitat models which account
for fragmentation can assist land management for bird diversity in Scottish upland
regions. Most previous models have focussed on single species, e.g. Baines (1995),
Hancock et. al. (1999), Hill (2001), Hack (2002), Pearce-Higgins and Yalden (2003).
One notable exception is that of Stillman and Brown (1998) who carried out an
analysis of presence/absence (distribution) data across British upland areas. They
used the same bird survey data as this study (Gibbons et. al. 1993), and a similar
system of land cover classes to define upland habitats. From this they produced a
definitive list of upland bird species. They did not, however, look at the spatial
arrangement of habitat. Thus in Scotland in particular, neither diversity, nor its
relationship with habitat spatial structure, have been fully examined.
Statistical modelling is a widely used standard tool for many biological and ecological
studies. More recently, studies combining GIS and statistical modelling have been
used. In Scotland this approach has been used to predict the occurrence of Scottish
black grouse nationally (Hill, 2001) and also at a local home-range level (Hack,
2002). This paper seeks to extend these techniques, in order to predict the
environmental variables and landscape patterns that govern overall diversity.
1.2 Proposed approach
This paper utilises data from a 10-km nationwide bird survey (Gibbons et. al. 1993).
Bird data consist of species presence/absence in every 10-km square (distribution),
and the proportion of 2-km tetrads visited within each 10-km square in which a
species was observed (relative abundance, or frequency). These are combined with
land cover polygons, interpreted from aerial-photography dating from the same period
(Macaulay Land Use Research Institute, 1989).
Statistical models are constructed for five selected upland-associated species, and also
for an index of overall diversity based on 27 upland bird species. Analysis of
landscape pattern allows the addition of many new potential explanatory variables.
These measures, or metrics, are submitted to the statistical modelling process in order
to determine the most significant spatial variables in controlling species distribution,
frequency, and overall diversity.
2 Methods
2.1 The study area
The extent of the land cover dataset used in this study is for Scotland only. Upland
regions are therefore selected from within Scotland, including all islands. Upland
areas are defined as polygons of suitable land cover types, i.e. cover types known to
constitute land lying above the line of enclosed or managed land. Additionally, the
study also includes land defined as non-upland that is completely enclosed by upland
habitat, as this may form “effective” habitat (Pearce-Higgins and Yalden, 2003).
2.2 Software used
Data preparation and later processing were undertaken in Microsoft Excel. Spatial
analysis was performed in ArcInfo 8 (ESRI, 2001), or ArcGIS 8.3 (ESRI, 2004), with
maps produced using ArcGIS 8.3. Landscape pattern metrics were calculated, to
quantify spatial structures, using Fragstats 3.3 (McGarigal et. al., 2002), automated by
use of Arc Macro Language (AML) and python scripts (www.python.org).
Partitioning of data, to give build and evaluation datasets, was performed in Excel
using simple randomisation (see section 2.4.1 for details). Data sets were assembled
into a single table, with one unique multiple-valued record for each grid square, using
Oracle 9i (Luscher, 2001) and SQL. All statistical models and predictions were made
using S-PLUS 6.0 (see S-PLUS 6 for Windows User’s Guide, 2001).
2.3 Data sources
The three principal data sets used in this study are outlined in Table 1. These are a
bird survey data set, air-photo derived land cover polygons, and raster elevation data.
2.3.1 Bird survey data
All bird data used in this study originate from a volunteer survey organised by the
British Trust for Ornithology (BTO) between 1988 and 1991 (Gibbons et. al., 1993).
For this study we looked at 691 upland 10-km squares for which bird numbers, land
cover, and land pattern metric data were available. These squares were partitioned
from an overall set of 993 squares, leaving a prediction set of 302 (30%). The spatial
unit of 10 km is considered appropriate for this study as it was the scale of unit used
by Gibbons et. al. (1993) and is sufficiently large to hold a number of home ranges for
most species (see Cramp and Simmonds, 1977).
2.3.2 Focal species
Data for a total of 29 upland bird species were provided by the BTO for this study.
Of these, 27 species were selected for analysis. These formed the basis for the
calculation of overall diversity in each 10-km square. The species names and BTO
codes are listed in Table 2. From the initial set, five species were selected for
individual statistical analysis using generalised linear modelling. These were black
grouse (BK), curlew (CU), golden plover (GP), meadow pipit (MP), and ptarmigan
(PM), and were chosen to reflect a range of generalist (commonly spread) and
specialist (localised) habitat-type species. A fuller review of each species is given in
Macdonald (2004).
Two species were not considered. Golden eagles were omitted as their home range can
be more than double the size of the 10-km unit used in this study
(Sterry, 1995, p.42; see also www.hawk-conservancy.org/priors/geagle.shtml). Data
for common scoters were sparse and unsuitable for analysis.
2.3.3 Diversity Index
The Shannon Diversity Index, H, (Shannon and Weaver, 1963; Begon et. al., 1990)
was chosen to assess the overall diversity in each 10-km square. In this study we use
the natural logarithm, ln, as the basis for this index. The equation is given below.
$$H = -\sum_{i} P_i \ln P_i \qquad \text{(Equation 1)}$$
This index is conventionally calculated from proportional abundance data, i.e. the
proportion, Pi, of individuals in a community that belong to species i. In this study
the index is instead calculated from bird frequency, or relative abundance. This
means that the version of the Shannon Index presented here is in reality a transformed
measure of relative density. Since the number of tetrads visited per square is constant
for all species recorded in that square, however, and since positive frequency values
will always be positively correlated with actual abundance, or count, values, we thus
have a measure that is directly proportional to the true diversity index. Therefore in
this study, H, is a comparative measure of overall bird diversity in each 10-km square.
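As a minimal sketch of Equation 1, the Python function below computes H from a list of tetrad frequencies. It follows the text in using each frequency directly as Pi without renormalising, which is an assumption about the exact procedure; a conventional Shannon index would use proportions that sum to 1. The function name and the example values are hypothetical and are not taken from the survey data.

```python
import math


def shannon_index(frequencies):
    """Return H = -sum(P_i * ln P_i), skipping species with zero frequency."""
    # A zero frequency contributes nothing (p * ln p tends to 0 as p tends to 0).
    return -sum(p * math.log(p) for p in frequencies if p > 0)


# Hypothetical frequencies for four species in one 10-km square.
example_square = [0.8, 0.5, 0.25, 0.0]
h = shannon_index(example_square)
print(round(h, 3))
```

Because every positive frequency is at most 1, each logarithm is zero or negative, so the returned value is never negative.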
Table 1 Principal data sets used for creation and testing of predictive models, diversity, distribution and frequency of Scottish upland birds.

| Data set | Description | Source | Reference |
|---|---|---|---|
| BTO bird survey data set. The New Atlas of Breeding Birds in Britain and Ireland, 1988-1991 | Survey organised by the British Trust for Ornithology. Comprises bird distribution, frequency, and mean count per tetrad for every 10-km square in British and Irish National Grids. | Under licence to the Institute of Geography, University of Edinburgh. | Gibbons et. al. (1993) |
| Land Cover of Scotland 1988 (LCS88) | Land cover polygon data set produced by the Macaulay Land Use Research Institute. Consists of 126 primary land cover classes and 1197 mosaic (2-type) polygons. Data interpreted from air photo survey flown in 1988. | Under licence to the Institute of Geography, University of Edinburgh. | Macaulay Land Use Research Institute (1989) |
| Ordnance Survey Panorama 50m Digital Elevation Model | Raster elevation data derived by OS from their 10m contour data set. Overall quoted accuracy 5m. | Under licence to the Institute of Geography, University of Edinburgh. Also available from: www.edina.ac.uk/digimap [Accessed 26 June 2004] | |
Table 2 Names and codes for all 27 upland bird species analysed in this study. Species
marked with an asterisk * were selected for individual statistical modelling. Sources: BTO
species code guide (www.bto.org), Sterry (1995).
Species Latin name BTO code
Black Grouse* Tetrao tetrix BK
Buzzard Buteo buteo BZ
Common Sandpiper Actitis hypoleucos CS
Curlew* Numenius arquata CU
Dipper Cinclus cinclus DI
Dunlin Calidris alpina DN
Goosander Mergus merganser GD
Greenshank Tringa nebularia GK
Grey Wagtail Motacilla cinerea GL
Golden Plover* Pluvialis apricaria GP
Hen Harrier Circus cyaneus HH
Merlin Falco columbarius ML
Meadow Pipit* Anthus pratensis MP
Peregrine Falco peregrinus PE
Ptarmigan* Lagopus mutus PM
Red Grouse Lagopus lagopus RG
Red-breasted Merganser Mergus serrator RM
Raven Corvus corax RN
Ring Ouzel Turdus torquatus RZ
Skylark Alauda arvensis S.
Short-eared Owl Asio flammeus SE
Snipe Gallinago gallinago SN
Green-winged Teal Anas crecca T.
Twite Carduelis flavirostris TW
Wheatear Oenanthe oenanthe W.
Whinchat Saxicola rubetra WC
Wigeon Anas penelope WN
2.3.4 Topography data
Spatial summary statistics were calculated from an Ordnance Survey Panorama 50m
digital elevation model (DEM) using GIS overlay techniques. Values were extracted
from the DEM, and transferred to the 10-km square data set to form a set of
topographic explanatory variables for each grid square (see Table 4).
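The topographic summaries can be illustrated with a short Python sketch. It assumes the elevation cells falling inside one 10-km square have already been extracted as a plain list, and it reproduces the six statistics named in Table 4 (MIN, MAX, MEAN, MAJORITY, MINORITY, VARIETY). The function name and sample values are hypothetical, and the handling of ties for MAJORITY and MINORITY is an arbitrary choice here, not necessarily that of the GIS software used in the study.

```python
from collections import Counter


def zonal_summary(elevations):
    """Summarise the elevation values (in metres) falling inside one zone."""
    counts = Counter(elevations)
    ordered = counts.most_common()  # (value, count) pairs, most frequent first
    return {
        "MIN": min(elevations),
        "MAX": max(elevations),
        "MEAN": sum(elevations) / len(elevations),
        "MAJORITY": ordered[0][0],   # most frequent height value
        "MINORITY": ordered[-1][0],  # least frequent height value
        "VARIETY": len(counts),      # number of distinct values
    }


# Hypothetical DEM cell values for a single square.
summary = zonal_summary([100, 100, 100, 150, 150, 200])
print(summary)
```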
2.3.5 Land cover data
Land cover data was selected from the LCS88 data set (Macaulay Land Use Research
Institute, 1989). This was processed using GIS and ecological pattern analysis
software to derive metrics quantifying the amount (area and perimeter) of 17 different
habitat classes (see Table 3). An extended set of metrics was also calculated both at
class (Table 4) and landscape (Table 5) levels to measure the spatial arrangement of
habitat types. Combining all variables together gave a complete set of 885
explanatory variables, from which a working set of 79 was selected, summarised in
Tables 4 and 5.
Table 3 Land cover classes included in modelling. Classes were extracted and reclassified to
a land cover class (0-16) from LCS88 polygons (Macaulay Land Use Research Institute,
1989) to create an ‘upland’ dataset.
Class Description
0 Non-upland ‘islands’, completely enclosed by upland habitat.
1 Upland cliffs ( > 5km from coastline)
2 Water features
3 Coniferous plantation
4 Semi-natural conifers
5 Undifferentiated broadleaved woodland
6 Undifferentiated mixed woodland
7 Recent ploughing
8 Recent felling
9 Open canopy young plantation
10 Land recently ripped for afforestation
11 Heather moor
12 Coarse grassland
13 Smooth grassland
14 Bracken
15 Blanket bog and peatlands
16 Montane vegetation
Table 4 Explanatory variables used in statistical modelling – Geographic, topographic, and
land cover class metrics. Topographic variables were calculated over 10-km square zones.
| Group | Code | Description |
|---|---|---|
| Geographic | X, Y | Easting and northing. |
| Topographic | MIN, MAX, MEAN | Minimum, maximum, mean elevation. |
| Topographic | MAJORITY, MINORITY, VARIETY | The most and least frequent height value, and the number of different values within the 10-km zone. |
| Class Metrics: Area | CA0…16 | Total area of each land cover class. |
| Class Metrics: Perimeter | TE0…16 | Total perimeter length of each class. |
Table 5 Explanatory variables used in modelling – landscape metrics. Measures of landscape patch
pattern and spatial arrangement calculated from raster grids using Fragstats pattern analysis software
(McGarigal et. al. 2002; McGarigal and Marks 1995). Distribution statistics record average measures
across all patches. Single metrics are individual (single-value) measures of landscape character. For a
detailed textual, and mathematical, description see the on-line help resources available at:
www.umass.edu/landeco/research/fragstats/documents/Metrics/Metrics TOC.htm
| Group or type | Name (Fragstats code) | Description |
|---|---|---|
| Landscape Metrics: Distribution statistics | .MN, .AM, .MD, .RA, .SD, .CV | Mean, Area-weighted Mean, Median, Range, Standard Deviation, Coefficient of Variation. Applied to each of the five below metrics: |
| Area | AREA | Patch area |
| Shape | SHAPE, FRAC, PARA, CONTIG | Shape Index (irregularity), Fractal Dimension, Perimeter-area Ratio, Contiguity Index. |
| Landscape Metrics: Single metrics | | |
| Area/edge | NP, PD, LPI, ED, LSI | Number of Patches, Patch Density, Largest Patch (size) Index, Edge Density, Landscape Shape Index. |
| Shape | PAFRAC | Perimeter-area Fractal Dimension. |
| Connectivity | COHESION | ‘Connectedness’ of focal patch. |
| Contagion/interspersion | CONTAG, PLADJ, IJI, DIVISION, SPLIT, AI | Contagion (class dispersion), % Like Adjacencies (cells), Interspersion and Juxtaposition Index, Division (grid mesh size), Split (landscape subdivision), Aggregation Index. |
| Diversity | PR, PRD, SHDI, SIDI, MSIDI, SHEI, SIEI, MSIEI | Patch Richness, Patch Richness Density, Shannon and Simpson diversity and evenness indices. |
2.4 Statistical modelling
2.4.1 Data partitioning
Partitioning of 10-km squares for build and evaluation sets was achieved using a
simple randomisation technique. A sequence of numbers was generated in Excel and
tested for goodness-of-fit to a uniform distribution, and serial independence. Testing
is described in Macdonald (2004). Applying a one-dimensional set of pseudo-random
numbers to two-dimensional space of course gives rise to the possibility of spatial
autocorrelation or clumping across rows or columns. Inspection of the square datasets
shown in Figure 1 does reveal some clustering. Squares were numbered in rows from
left to right, top to bottom (i.e. sorted by easting and northing); artificial
clumping therefore occurs in vertical or diagonal trends due to periodicity in the number
generator algorithm. Techniques such as stratified random sampling (see Longley, et.
al. 2001, p.104) can reduce this problem. However, they require the choice of an
arbitrary aggregated spatial unit, within which the randomisation process is repeated
across the dataset, and the size of this unit affects the degree of autocorrelation.
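The simple randomisation step can be sketched as follows. This is an illustration only: the study performed the partition in Excel, whereas the sketch uses Python's standard `random` module, and the square identifiers, the seed and the function name are hypothetical. Only the set sizes (691 build squares out of 993) are taken from the text.

```python
import random


def partition_squares(square_ids, n_build, seed=0):
    """Randomly split square identifiers into build and evaluation sets."""
    rng = random.Random(seed)  # fixed seed so the split can be reproduced
    shuffled = list(square_ids)
    rng.shuffle(shuffled)
    return sorted(shuffled[:n_build]), sorted(shuffled[n_build:])


# 993 squares in total, 691 of them assigned to the build set.
build, evaluation = partition_squares(range(993), n_build=691)
print(len(build), len(evaluation))
```

Shuffling the whole list once and cutting it guarantees that the two sets are disjoint and together cover every square. It does nothing to prevent spatial clumping of the kind discussed above.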
[Figure 1 map. Legend: Build Set, Evaluation Set. Scale bar: 0 to 200 kilometres.]
This map is based on data provided with the support of the ESRC and JISC and uses boundary material which is copyright of the Crown and the Post Office.
Figure 1 Partitioned 10-km squares. Left: build set (691 squares, 70%), right: evaluation set (302
squares, 30%). Squares assigned to one of the two sets using simple randomisation. Lines represent
50-km GB National Grid squares.
2.4.2 Generalised Linear Models
Generalised linear modelling (GLM) provides a framework for regression models that
allows for the specification of non-uniform mean-variance relationships, and non-
normal error distributions or structures (Nelder and Wedderburn, 1972; McCullagh
and Nelder, 1983). In addition to the error structure, a GLM has two other important
components. These are the linear predictor and the link function. The former is a
linear sum of the effects of all explanatory variables, while the latter is used to relate
this to the predicted values. The predicted value is obtained by applying the inverse
link function to the linear predictor.
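The relationship between the linear predictor and the predicted value can be made concrete with a small sketch for the two links used in this study, identity and logit. The function names, coefficients and variable values are hypothetical; in the study the coefficients were fitted by S-PLUS rather than supplied by hand.

```python
import math


def linear_predictor(intercept, coefficients, values):
    """Linear sum of effects: a + sum(b_i * x_i)."""
    return intercept + sum(b * x for b, x in zip(coefficients, values))


def inverse_identity(eta):
    """Identity link: the prediction is the linear predictor itself."""
    return eta


def inverse_logit(eta):
    """Logit link: map the linear predictor onto the interval (0, 1)."""
    # Two algebraically equal forms, chosen so math.exp never overflows.
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)


# Hypothetical intercept, two coefficients and two explanatory values.
eta = linear_predictor(-1.0, [0.02, 1.5], [30.0, 0.4])
print(inverse_identity(eta), inverse_logit(eta))
```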
2.4.2.1 Diversity model
In this study two GLM families were used. Exploratory data analysis revealed the
calculated diversity index, H, had positive values in the range of 0.9 to 2.9, with mean
of approximately 2.2. Considering equation 1 again, positive numbers are expected as
all values are calculated from the negative sum of a set of negative terms. The
individual terms will always be negative, since all frequency values are decimal
proportions between 0 and 1, and the logarithms of such numbers are negative.
Therefore H is expected to be positive with real-valued mean. The values were seen
to resemble the shape of the Normal distribution, and therefore the Gaussian GLM
family was selected.
GLM Model 1: Diversity, H: Gaussian family GLM with identity link.
$$H = a + \sum_{i=1}^{79} b_i x_i \qquad \text{(Equation 2)}$$
The equation for the GLM diversity model is shown in Equation 2. The terms are as
follows: a is the intercept coefficient, and each bi is the coefficient of one of the 79
explanatory variables, xi.
2.4.2.2 Distribution models
Both distribution and frequencies, provided in the bird survey data, are proportional
data (values between 0 and 1). These require analysis using a binomial error
structure, commonly with a logit link function (Crawley, 2002, pp.513-536). The
distribution data was complicated by the inclusion of both single sightings and
confirmed breeding. Simplification was achieved by excluding less reliable single
sightings, thereby creating a binary response variable. This could then be analysed by
simple unweighted regression, as a special single-trial form of the Binomial
distribution known as the Bernoulli distribution (Crawley, 2002).
GLM Model 2: Distribution: Binary response binomial GLM with logit link.
$$p = \frac{\exp\left(a + \sum_{i=1}^{79} b_i x_i\right)}{1 + \exp\left(a + \sum_{i=1}^{79} b_i x_i\right)} \qquad \text{(Equation 3)}$$
The general equation for the GLM distribution models is shown in Equation 3. The
terms are as follows: a is the intercept coefficient, and each bi is one of the 79
explanatory variable coefficients. The computed value, p, is the probability of
presence in each 10-km square.
2.4.2.3 Frequency models
Frequency data would normally be analysed using weighted binomial regression. The
survey data were supplied as decimal proportions with no information regarding the
number of tetrads visited in each 10-km square, i.e. the binomial denominators, or
weights. Therefore unweighted regression was used as the preferred available option.
GLM Model 3: Frequency: Unweighted binomial GLM with logit link.
$$p = \frac{\exp\left(a + \sum_{i=1}^{79} b_i x_i\right)}{1 + \exp\left(a + \sum_{i=1}^{79} b_i x_i\right)} \qquad \text{(Equation 4)}$$
The general equation for the GLM frequency models is shown in Equation 4. The
terms are as follows: a is the intercept coefficient, and each bi is one of the 79
explanatory variable coefficients. The computed value, p, is the proportion of
occupied tetrads (probability of tetrad occupancy).
2.4.3 Model fitting
Model fitting involves selecting only the most important explanatory variables in
order to create a parsimonious model. That is, the simplest possible model that will
explain the greatest amount of variation in the response (Crawley, 2002). Inclusion of
all variables in a model may lead to retention of irrelevant correlation between
explanatory variables (Macdonald, 2004).
Model fitting was performed using S-Plus’ automated stepwise selection function.
This selects the optimum model by minimising a computed measure of fit,
Akaike’s Information Criterion (AIC). The stepwise process successively adds and
removes terms until the ‘best’ model, that with the lowest AIC, is selected. The
process can take the form of forward addition, starting from a null (or intercept-only)
model, or backwards removal of terms from a full model. Forward selection removes
terms that are no longer significant in the model, thereby reducing the spurious or
exaggerated correlation between explanatory variables (G. Buchanan, pers. comm.).
Thus, in light of the number of potentially correlated land metrics, forward selection
was chosen for use in this study.
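The search logic of a stepwise procedure of this kind can be sketched generically. The sketch below is not the S-Plus function used in the study: it starts from the null model, considers every single-term addition or removal, and moves to the candidate with the lowest AIC until no move improves it. The AIC calculation itself is left to a caller-supplied function, and the lookup table of AIC values is invented purely to exercise the loop.

```python
def stepwise_select(candidates, aic):
    """Search from the null model, adding or dropping one term per step."""
    current = frozenset()
    best = aic(current)
    while True:
        # Every model reachable by adding or removing a single term.
        moves = [current | {term} for term in candidates if term not in current]
        moves += [current - {term} for term in current]
        if not moves:
            return current, best
        score, choice = min(((aic(m), m) for m in moves), key=lambda pair: pair[0])
        if score >= best:
            return current, best  # no single change lowers the AIC
        current, best = choice, score


# Hypothetical AIC values for every subset of three made-up terms.
toy_aic = {
    frozenset(): 100, frozenset("a"): 90, frozenset("b"): 95, frozenset("c"): 92,
    frozenset("ab"): 91, frozenset("ac"): 85, frozenset("bc"): 93, frozenset("abc"): 86,
}
selected, final_aic = stepwise_select("abc", toy_aic.__getitem__)
print(sorted(selected), final_aic)
```

Because the AIC must fall strictly at every step and the number of possible term subsets is finite, the loop always terminates. Like any greedy search, it can stop at a model that is not the global optimum.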
2.4.4 Model prediction
The models were used to predict, for each of the evaluation 10-km squares in
Scotland, the diversity index based on all bird species, the probability of presence
of each selected species, and each species’ relative abundance, or frequency.
Predictions were based on the set of variables chosen by stepwise selection.
Distribution predictions were interpreted as
predicted presence for all values greater than or equal to 0.5, and predicted absence
for values less than 0.5.
2.5 Model evaluation
2.5.1 Evaluation of partitioned data
The use of contemporaneous bird survey and land cover data in this study means there
are no comparable independent datasets available for the study area. The data therefore
had to be partitioned into two sets, one for model building, and one for evaluation. Thus
we are unable to verify the models against a separately collected survey of bird
distributions and frequencies. We are, however, able to test how well the models fit
across the entire survey extent, using separate build and test data at the same spatial
scale. This ensures that we can quantify, across Scotland, the strongest
relationships between species, habitat and, of particular interest, spatial configuration
of habitat. Additionally, there is no risk of the loss of information that may occur where
predictions must be spatially aggregated to match separate evaluation data of larger grain.
2.5.2 Diversity model
Diversity predictions were evaluated by measuring the correlation between the
predicted and observed values. The Pearson product-moment correlation was
calculated for this purpose. A simple linear regression plot of observed H vs.
predicted H was also used, for visual analysis. Values for the average degree of over
and under-prediction were calculated, and residuals were also examined for spatial
patterns, either to test for problems due to autocorrelation (see 2.4.1 above), or to test
for any geographical variation in model accuracy.
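The two numerical checks described above can be sketched as follows. The Pearson coefficient follows its standard definition. The text does not define "average degree of over and under-prediction", so the sketch assumes it means the mean positive residual and the mean absolute negative residual (predicted minus observed). Function names and sample values are hypothetical.

```python
import math


def pearson_r(observed, predicted):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    ss_o = sum((o - mean_o) ** 2 for o in observed)
    ss_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(ss_o * ss_p)


def mean_over_under(observed, predicted):
    """Mean size of over-predictions and of under-predictions (assumed definition)."""
    residuals = [p - o for o, p in zip(observed, predicted)]
    over = [r for r in residuals if r > 0]
    under = [-r for r in residuals if r < 0]
    mean_over = sum(over) / len(over) if over else 0.0
    mean_under = sum(under) / len(under) if under else 0.0
    return mean_over, mean_under


# Hypothetical observed and predicted diversity values for five squares.
obs = [1.2, 2.0, 2.4, 2.9, 1.7]
pred = [1.5, 1.9, 2.4, 2.5, 2.0]
print(pearson_r(obs, pred), mean_over_under(obs, pred))
```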
2.5.3 Distribution models
Evaluation of distribution data, for all five species, was performed using a set of
performance measures devised for presence/absence models by Fielding and Bell
(1997). These measures are calculated from ‘confusion matrices’ recording the
numbers of correct and incorrect predictions of presence and absence for each square.
The performance measures used are listed in Table 6.
2.5.4 Frequency models
Frequency predictions, for all five species, were evaluated by measuring the
correlation between the predicted and observed values. The Pearson product-moment
correlation was calculated as for the diversity model. Observed values were plotted
against predicted for inspection, and comparison between species. Again, as for
diversity, the average degree of over and under-prediction were calculated, for all five
species, and residuals examined for spatial patterns, either to test for problems due to
autocorrelation (see 2.4.1 above), or to test for any geographical variation in model
accuracy.
Table 6 Performance measures for assessing classification accuracy of presence-absence
models (after Fielding and Bell, 1997). Formulae are derived from confusion matrix terms as
follows: a true positive prediction, b false positive prediction, c false negative prediction, d
true negative prediction. N is the number of cases, or sample size.

| Measure | Formula |
|---|---|
| Prevalence | (a+c)/N |
| Overall diagnostic power | (b+d)/N |
| Correct classification rate: proportion of all cases correctly predicted | (a+d)/N |
| Misclassification rate: proportion of all cases incorrectly predicted | (b+c)/N |
| Sensitivity | a/(a+c) |
| Specificity | d/(b+d) |
| False positive rate | b/(b+d) |
| False negative rate | c/(a+c) |
| Positive predictive power (PPP) | a/(a+b) |
| Negative predictive power (NPP) | d/(c+d) |
| Odds-ratio | (ad)/(bc) |
| Kappa K: proportion of specific agreement | [(a+d)-(((a+c)(a+b)+(b+d)(c+d))/N)] / [N-(((a+c)(a+b)+(b+d)(c+d))/N)] |
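A brief sketch shows how several of the Table 6 measures follow from the four confusion matrix counts, using the 0.5 presence threshold described in section 2.4.4. The function names and example counts are hypothetical, and the sketch assumes that no row or column total of the matrix is zero, since otherwise some ratios are undefined.

```python
def confusion_counts(observed, probabilities, threshold=0.5):
    """Count a (true +), b (false +), c (false -) and d (true -)."""
    a = b = c = d = 0
    for present, p in zip(observed, probabilities):
        predicted = p >= threshold  # presence predicted at or above the threshold
        if predicted and present:
            a += 1
        elif predicted:
            b += 1
        elif present:
            c += 1
        else:
            d += 1
    return a, b, c, d


def performance_measures(a, b, c, d):
    """Selected Table 6 measures; assumes all marginal totals are non-zero."""
    n = a + b + c + d
    chance = ((a + c) * (a + b) + (b + d) * (c + d)) / n
    return {
        "prevalence": (a + c) / n,
        "correct_classification_rate": (a + d) / n,
        "misclassification_rate": (b + c) / n,
        "sensitivity": a / (a + c),
        "specificity": d / (b + d),
        "positive_predictive_power": a / (a + b),
        "negative_predictive_power": d / (c + d),
        "kappa": ((a + d) - chance) / (n - chance),
    }


# Hypothetical counts for 100 evaluation squares.
measures = performance_measures(40, 10, 5, 45)
print(measures["kappa"])
```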
3 Results
3.1 Optimum models of overall diversity, distribution and frequency
3.1.1 Model summaries
A total of 11 generalised linear models were created following the steps detailed in
section 2. These included a single model for diversity based on all 27 upland bird
species, and both a distribution and frequency model for each of the five species of
special interest.
Figure 2 shows a comparison of the deviance explained by each of the 11 GLMs
produced by the stepwise modelling process. The amount of deviance explained lies
between roughly 25% and 50% for most models, with the ptarmigan models higher
(58-66%). Distribution models explain more deviance than frequency models for four
of the five species, golden plover being the exception. Table 7 provides
a summary of the GLMs produced, and includes a measure of each model’s statistical
significance, the p-value.
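The percentage of deviance explained is conventionally one minus the ratio of residual to null deviance. The short sketch below applies that definition to the null and residual deviances listed in Table 7 for the black grouse distribution model; the function name is hypothetical, and the formula is assumed to be the one used by the author, although it does reproduce the tabulated figure.

```python
def percent_deviance_explained(null_deviance, residual_deviance):
    """Share of the null deviance accounted for by the fitted model, in percent."""
    return 100.0 * (1.0 - residual_deviance / null_deviance)


# Black grouse distribution model: null and residual deviance from Table 7.
black_grouse = percent_deviance_explained(750.2474, 444.3877)
print(round(black_grouse, 2))
```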
[Figure 2 bar chart. x-axis: Species (BK, CU, GP, MP, PM, H). y-axis: % Explained Deviance (0-70%). Series: Distribution Based, Frequency Based.]
Figure 2 Percentage deviance explained by GLM models in this study. Models for
distribution shown in blue, frequency models and diversity in yellow. All models are
identified by the relevant BTO species code: BK = black grouse, CU = curlew, GP = golden
plover, MP = meadow pipit, PM = ptarmigan. H represents diversity, H, GLM.
Tab
le 7
S
um
mar
y o
f al
l G
LM
s pro
duce
d i
n t
his
stu
dy.
(d.f
. =
deg
rees
of
free
dom
).
The
Pea
rson p
roduct
-mo
men
t is
giv
en f
or
the
deg
ree
of
corr
elat
ion
bet
wee
n o
bse
rved
and p
redic
ted v
alues
, fo
r al
l fr
equen
cy m
odel
s an
d d
iver
sity
, H
.
Dis
trib
uti
on
mo
del
s
*D
isp
ersi
on
Null
dev
iance
d
.f.
Res
idual
dev
iance
d
.f.
p
%E
xp
lain
ed d
evia
nce
Bla
ck G
rouse
1
7
50
.24
74
6
90
4
44
.38
77
6
69
0
4
0.7
7%
Curl
ew
1
8
50
.39
04
6
90
4
80
.95
21
6
71
0
4
3.4
4%
Go
lden
Plo
ver
1
9
57
.81
22
6
90
4
93
.77
21
6
69
0
4
8.4
5%
Mea
do
w P
ipit
1
1
88
.08
72
6
90
1
19
.13
79
6
80
7
.07
e-0
11
3
6.6
6%
Pta
rmig
an
1
51
9.2
35
1
69
0
17
6.8
25
8
68
5
0
65
.94
%
Fre
qu
ency
an
d d
iver
sity
mo
del
s
D
isp
ersi
on
Null
dev
iance
d
.f.
Res
idual
dev
iance
d
.f.
p
%E
xp
lain
ed d
evia
nce
P
ears
on
t p
Bla
ck G
rouse
1
8
3.6
68
91
6
90
6
3.1
56
93
6
88
3
.51
e-0
05
2
4.5
2%
0
.17
74
197
3
.12
25
0
.00
2
Curl
ew
1
3
94
.45
95
6
90
2
49
.09
29
6
82
0
3
6.8
5%
0
.62
12
267
1
3.7
.09
0
Go
lden
Plo
ver
1
2
40
.15
63
6
90
1
20
.11
12
6
87
0
4
9.9
9%
0
.59
71
54
1
2.8
94
5
0
Mea
do
w P
ipit
1
3
55
.79
41
6
90
2
49
.58
41
6
84
0
2
9.8
5%
0
.65
83
335
1
5.1
48
5
0
Pta
rmig
an
1
61
.429
1
69
0
25
.851
95
6
87
9
.20
e-0
08
5
7.9
2%
0
.45
54
774
8
.86
17
0
H
0.2
128
24
2
09
.43
35
6
90
1
43
.23
06
6
73
9
.61
e-0
08
3
1.6
1%
0
.53
46
674
1
0.9
58
6
0
* D
isper
sion p
aram
eter
of
1 a
ssum
ed b
y S
-Plu
s fo
r bin
om
ial
regre
ssio
n.
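The percentage of explained deviance reported in Table 7 can be reproduced from the null and residual deviances. The following is a minimal sketch, assuming the usual definition 100 x (null - residual) / null; the function name and the small dictionary of examples are illustrative only and are not taken from the original S-Plus analysis.

```python
def percent_deviance_explained(null_deviance, residual_deviance):
    """Return the percentage of the null deviance removed by the fitted model."""
    if null_deviance <= 0:
        raise ValueError("null deviance must be positive")
    return 100.0 * (null_deviance - residual_deviance) / null_deviance


# (null deviance, residual deviance) pairs as read from Table 7.
table7_examples = {
    "Black Grouse (distribution)": (750.2474, 444.3877),
    "Ptarmigan (distribution)": (519.2351, 176.8258),
    "Diversity H": (209.4335, 143.2306),
}

for model, (null_dev, resid_dev) in table7_examples.items():
    print(f"{model}: {percent_deviance_explained(null_dev, resid_dev):.2f}%")
```

The same calculation applies to every row of the table, since each model shares the null model's 690 degrees of freedom.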
3.1.2 Model fitting and selection of explanatory variables
3.1.2.1 Diversity model
Table 8 provides a list of computed parameter estimates (regression coefficients) and
related information for the diversity model. Similar output and related information for
all models used in this study is listed in Macdonald (2004).
Variables selected as controlling overall diversity included, intuitively, the Shannon
diversity index calculated for the landscape (SHDI), maximum elevation (MAX), and
areas of blanket bog and peatland (CA15) and heather moor (CA11). No particular
associations with easting or northing (X, Y) were revealed. Possible quadratic
relationships, however, can be detected in the model. For instance, the perimeter-area
ratio (PARA) is overall a weakly negative predictor, because the positive influence of
the mean ratio (PARA.MN) is outweighed by the fractionally stronger negative
influence of the median (PARA.MD). In contrast, the stepwise process picked out the
presence of edge habitat for the water, smooth grassland, and bracken classes as
positive predictors, bracken being quite significant.
3.1.2.2 Distribution models
Parameter estimates selected for distribution models are shown in Table 9. In general,
the distribution models selected a similar number of variables to the diversity model.
The exceptions were meadow pipit and ptarmigan, whose models were smaller. This
might be expected given the unique patterns of distribution of these species.
For the distribution of black grouse, the moorland edge (TE11) and also edge habitats
of water, bracken and montane vegetation were selected as predictors. This appears to
be corroborated by a positive association with perimeter-area ratio (PARA). Fractal
dimension (shape complexity) however, is found to be strongly negative (FRAC).
Variety of patch types (PR), degree of class aggregation into single patches (AI), and,
to a lesser extent, patch area coefficient of variation (AREA.CV), are positive.
Grouse were also predicted to be weakly associated with easting, with a stronger
negative association with northing. Surprisingly, open canopy young plantation
(CA9) was found to be a negative predictor. A positive relationship was expected,
since black grouse are known to favour the plantation shrub and herb growth
that occurs in the first 15-20 years before canopy closure. Once the canopy closes,
light no longer reaches the forest floor and any grouse present move on to fresh
habitat (Hill, 2001).
From the other models, ptarmigan were predicted to respond to maximum elevation
(MAX) and contiguous montane vegetation (CA16, CONTIG.MN) located in
northern regions (Y). Curlews were predicted to favour a diverse and varied landscape
(SHDI, SHAPE.SD), but again with contiguous patches (CONTIG.AM). They were
also predicted to favour areas of smooth grassland and montane vegetation, with
edges of blanket bog and conifer plantations as other favoured habitats. A smaller
model was produced for meadow pipits, again selecting contiguity of patches, but also
blanket bog edge as particularly significant. Golden plovers were predicted to favour
some areas of heather moor (CA11), coarse grassland (CA12), and blanket bog and
peatland (CA15). Overall, a slight positive relationship with more complex patch
shape was predicted for golden plover (FRAC, SHAPE.MD).
3.1.2.3 Frequency models
Parameter estimates selected for frequency models are shown in Table 10. For
frequency models, a much smaller set of variables was selected. For the black grouse
model, only total conifer plantation perimeter (TE3) and mean elevation (MEAN)
were selected. Ptarmigan, the other specialist examined, were predicted by a small
model of topographic variation (VARIETY), blanket bog edge (TE15), and mean
patch area (AREA.MN).
Curlew were predicted to favour coarse grassland areas (CA12), but also edges of
smooth grassland (TE13). They also exhibited a slight eastward preference (X).
Golden plovers, however, were predicted to favour land to the north (Y). The model
also picked out uniform blanket bog and peatland areas (CA15, COHESION) as
important predictor variables for this species. Meadow pipits were predicted to be
negatively associated with areas of mixed woodland, although woodland edges were
positive, possibly indicating a more complex quadratic interaction. Meadow pipits
were additionally predicted to favour a highly aggregated landscape (AI) and blanket
bog edge (TE15), and to have a negative relationship with mean elevation (MEAN).
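To illustrate how the coefficients of a binomial model translate into a predicted value for a single 10-km square, the sketch below evaluates the ptarmigan frequency model from Table 10. It assumes a logit link (the usual default for binomial regression, not stated explicitly in the text), and the predictor values for the example square are hypothetical numbers chosen only to show the arithmetic; their units are not given in this section.

```python
import math


def logistic_prediction(intercept, coefficients, predictors):
    """Evaluate p = 1 / (1 + exp(-eta)) with eta = intercept + sum(b_i * x_i)."""
    eta = intercept + sum(coefficients[name] * predictors[name] for name in coefficients)
    return 1.0 / (1.0 + math.exp(-eta))


# Ptarmigan frequency model coefficients as read from Table 10.
ptarmigan_intercept = -10.94814
ptarmigan_coefficients = {"VARIETY": 0.007213, "TE15": 0.000015, "AREA.MN": 0.007164}

# HYPOTHETICAL predictor values for one square (illustration only, not survey data).
example_square = {"VARIETY": 300.0, "TE15": 20000.0, "AREA.MN": 100.0}

predicted = logistic_prediction(ptarmigan_intercept, ptarmigan_coefficients, example_square)
print(f"Predicted frequency for the example square: {predicted:.5f}")
```

Because every coefficient in this model is positive, increasing any one predictor while holding the others fixed raises the predicted frequency.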
Table 8 The model chosen by forward stepwise Gaussian regression for overall bird
diversity, H, in Scottish upland habitats. The information presented here has been adapted
directly from S-Plus output, and includes the selected variables (described in Tables 4 and 5),
parameter estimate (regression coefficients), standard errors, and Student’s t test values.
Variables are presented in the order they are selected. Complete information for all models is
presented in Macdonald (2004).
Variable     Parameter estimate  Standard error  t value
(Intercept)  1.729815            2.195503        0.787890
CONTIG.AM    6.421927            0.840860        7.637330
SHDI         0.329566            0.000655        5.030813
CA15         0.000055            0.000016        3.487625
TE2          0.000002            0.000001        2.036191
TE13         0.000002            0.000001        3.646451
CA0          -0.000130           0.000046        -2.847954
CA11         0.000051            0.000016        3.191789
MAX          -0.000416           0.000104        -3.993171
CONTIG.MD    -7.438591           1.999564        -3.720105
AREA.MN      0.003361            0.000899        3.738703
CONTIG.CV    -0.177609           0.031611        -5.618632
CONTIG.SD    21.32197            4.094209        5.207836
PARA.MD      -0.003879           0.001289        -3.009283
TE14         0.000009            0.000004        2.477047
AREA.SD      -0.000515           0.000220        -2.342149
PARA.MN      0.003492            0.001705        2.048171
CA14         -0.000685           0.000383        -1.788750
Dispersion = 0.212824
Null deviance = 209.4335 on 690 degrees of freedom
Residual deviance = 143.2306 on 673 degrees of freedom, p = 9.614724e-9 (<< 0.05)
Percentage deviance explained = 31.61%
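The t values in Table 8 are the parameter estimates divided by their standard errors. As a quick check of that relationship, the sketch below recomputes t for a few rows; the function name is illustrative, and the rows were chosen because their printed figures carry enough digits for the ratio to be reproduced closely.

```python
def t_value(estimate, standard_error):
    """Student's t for a regression coefficient: estimate / standard error."""
    if standard_error <= 0:
        raise ValueError("standard error must be positive")
    return estimate / standard_error


# (estimate, standard error, reported t) as read from Table 8.
table8_rows = {
    "(Intercept)": (1.729815, 2.195503, 0.787890),
    "CONTIG.AM": (6.421927, 0.840860, 7.637330),
    "CONTIG.MD": (-7.438591, 1.999564, -3.720105),
    "CONTIG.SD": (21.32197, 4.094209, 5.207836),
}

for name, (estimate, std_err, reported_t) in table8_rows.items():
    print(f"{name}: computed t = {t_value(estimate, std_err):.4f}, reported t = {reported_t:.4f}")
```

Rows with very small coefficients are rounded too heavily in the table for this check to be precise.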
Table 9 Distribution models: Parameter estimates for variables selected by stepwise
logistic regression. For full output see Macdonald (2004).

Black Grouse             Curlew                   Golden Plover            Meadow Pipit             Ptarmigan
Variable    Coefficient  Variable    Coefficient  Variable    Coefficient  Variable    Coefficient  Variable    Coefficient
(Intercept) 7.481033     (Intercept) 43.87716     (Intercept) -109.6322    (Intercept) 1438959.0    (Intercept) -33.02291
PR          0.188541     X           0.000023     COHESION    0.632103     CONTIG.AM   563.7860     CA16        0.000572
AI          0.275518     SHDI        3.163872     CA15        0.000768     CONTIG.CV   -0.070284    MAX         0.010032
CA16        -0.001271    CONTIG.AM   134.4478     CA12        0.001395     TE15        0.000062     Y           0.000008
SIDI        10.122840    TE8         -0.000070    Y           0.000005     MINORITY    -0.003915    CONTIG.MN   19.95929
AREA.CV     0.008323     IJI         -0.043319    CA8         -0.005016    ED          -0.083003    TE2         0.000017
TE1         -0.000074    TE16        -0.000018    PRD         -5.718816    PLADJ       -14395.18
Y           -0.000012    Y           -0.000007    SHAPE.MD    -9.651047    TE2         0.000142
X           0.000009     TE15        0.000014     TE12        -0.000012    X           0.000005
TE12        -0.000006    MEAN        -0.006403    FRAC.MD     58.10790     DIVISION    4.096126
CA14        -0.007195    AI          -1.385507    MAJORITY    0.008162     PARA.AM     -899.3146
TE11        0.000013     FRAC.MD     -31.84203    MEAN        -0.006308
TE2         0.000025     MINORITY    -0.001094    CA11        0.000466
FRAC.AM     -36.654340   FRAC.CV     -1.081644    MINORITY    0.001105
CA13        0.000628     SHAPE.SD    2.501858     CA7         -0.000706
TE16        0.000017     CA13        0.000785     FRAC.CV     1.163122
PARA.MN     -0.024742    AREA.MD     -0.048671    LPI         0.061476
IJI         -0.032698    CA16        0.000628     AREA.RA     -0.000627
TE14        0.000035     TE3         0.000012     AREA.MD     0.073012
CA9         -0.000492    CA9         -0.000618    SHDI        2.861849
PARA.MD     0.018517                              PAFRAC      -10.85243
PARA.CV     0.055401                              MAX         -0.002016
Table 10 Frequency models: Parameter estimates for variables selected by stepwise
logistic regression. For full output see Macdonald (2004).

Black Grouse             Curlew                   Golden Plover            Meadow Pipit             Ptarmigan
Variable    Coefficient  Variable    Coefficient  Variable    Coefficient  Variable    Coefficient  Variable    Coefficient
(Intercept) -5.149640    (Intercept) 23.68616     (Intercept) -104.1109    (Intercept) -12.23221    (Intercept) -10.94814
TE3         0.000014     X           0.000006     CA15        0.000178     AI          0.140184     VARIETY     0.007213
MEAN        0.003998     VARIETY     -0.003428    Y           0.000005     CA6         -0.003597    TE15        0.000015
                         TE13        0.000009     COHESION    0.990911     TE15        0.000007     AREA.MN     0.007164
                         CA12        0.000311                              MEAN        -0.001967
                         FRAC.MN     -20.36037                             AREA.MN     0.005340
                         PR          0.127755                              TE6         0.000015
                         PAFRAC      -2.867721
                         CA3         -0.000177
3.2 Model evaluation
3.2.1 Diversity model
The computed Pearson product-moment correlation for the diversity model
(0.5346674) is shown in Table 7, together with the associated Student’s t statistic
and p-value. Values of the average residual and the ratio of over- to under-
predicted squares are shown in Table 13. A plot of observed versus predicted
diversity is presented in the top-left frame of Figure 7.
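The Student’s t statistic that accompanies each Pearson correlation can be recovered from the correlation itself and the number of evaluation squares. The sketch below uses the standard relation t = r * sqrt(n - 2) / sqrt(1 - r^2). The sample size of 302 is an assumption, taken from the total count in each confusion matrix of Table 11 rather than stated alongside the correlations.

```python
import math


def pearson_t_statistic(r, n):
    """Student's t for a Pearson correlation r from n paired observations."""
    if n <= 2:
        raise ValueError("need more than two observations")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# ASSUMPTION: 302 evaluation squares, the total of each confusion matrix in Table 11.
N_EVALUATION_SQUARES = 302

diversity_t = pearson_t_statistic(0.5346674, N_EVALUATION_SQUARES)
print(f"Diversity model: t = {diversity_t:.4f}")
```

Under this assumption the result agrees with the t value listed for H in Table 7 to the precision shown there.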
Maps showing predicted diversity, H, and the average percentage of over or under
prediction in H are shown in Figure 3. A map showing locations of over and under
predictions of H compared to those for black grouse frequency is shown in Figure 4.
3.2.2 Distribution models
Evaluation maps of all distribution models are shown in Figures 5 (more specialist
species) and 6 (generalists). The confusion matrices used to produce these are listed
in Table 11. The computed performance assessments, described in Table 6, are
shown in Table 12.
3.2.3 Frequency models
The computed Pearson product-moment correlation values for the frequency models are
shown in Table 7, together with the associated Student’s t statistics and
p-values. Values of the average residuals and the ratios of over- to under-
predicted squares are shown in Table 13. Plots of observed versus predicted frequency
are shown in Figure 7.
Maps showing the locations of over- and under-predicted frequencies of curlew and
meadow pipit are shown in Figure 8. These show results as good as those for the
generalist distribution models.
A direct comparison of the model outputs for probability of presence (distribution)
and frequency, shown for black grouse, is given in Figure 9.
[Figure 3: two maps. Left panel: Predicted diversity H (legend classes 0.87-1.20,
1.21-1.60, 1.61-2.00, 2.01-2.40, 2.41-2.80, 2.81-3.20). Right panel: Percentage over or
under-predicted diversity H (legend classes -41 to -25, -24 to 0, 1 to 25, 26 to 50, 51 to 75,
76 to 100). Scale bar in kilometres. This map is based on data provided with the support
of the ESRC and JISC and uses boundary material which is copyright of the Crown and
the Post Office.]
Figure 3 Predicted diversity for all upland bird species (left), and (right) the percentage
over (+ve) or under (-ve) prediction of the observed value. Lines represent 50-km GB
National Grid squares.
[Figure 4: two maps, each with legend classes -1, 0, 1. Left panel: Over (+ve) or under
(-ve) prediction of diversity H. Right panel: Over (+ve) or under (-ve) prediction of
Black Grouse (0 response is almost always overpredicted). Scale bar in kilometres.]
Figure 4 Locations of over (+1) or under (-1) predicted diversity for all upland bird species
(left), and (right) the locations of over (+1) or under (-1) predicted black grouse frequency.
Yellow squares have an observed species frequency of zero. Lines represent 50-km GB
National Grid squares.
Table 11 Confusion matrices, for all species and H, showing the number of true
positive, false positive, false negative and true negative predictions of species
presence/absence in each 10-km square. In each matrix the terms are as follows:
a correct positive prediction (top-left), b false positive prediction (top-right),
c false negative prediction (bottom-left), d true negative prediction (bottom-right).

Black Grouse    Actual +  Actual -
Predicted +     32        29
Predicted -     23        218

Curlew          Actual +  Actual -
Predicted +     180       34
Predicted -     27        61

Golden Plover   Actual +  Actual -
Predicted +     110       40
Predicted -     40        112

Meadow Pipit    Actual +  Actual -
Predicted +     298       4
Predicted -     0         0

Ptarmigan       Actual +  Actual -
Predicted +     36        10
Predicted -     11        245
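The performance measures reported for these matrices can be recomputed directly from the four cell counts. The sketch below is a minimal illustration, assuming the usual confusion-matrix definitions after Fielding and Bell (1997) with a, b, c and d as defined in the Table 11 caption; the function and key names are hypothetical, and measures with a zero denominator are returned as None.

```python
def _safe_divide(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def performance_measures(a, b, c, d):
    """Confusion-matrix measures: a = true +, b = false +, c = false -, d = true -."""
    n = a + b + c + d
    # Agreement expected by chance, used in Cohen's Kappa.
    chance = ((a + c) * (a + b) + (b + d) * (c + d)) / n
    return {
        "prevalence": (a + c) / n,
        "overall_diagnostic_power": (b + d) / n,
        "correct_classification_rate": (a + d) / n,
        "misclassification_rate": (b + c) / n,
        "sensitivity": _safe_divide(a, a + c),
        "specificity": _safe_divide(d, b + d),
        "false_positive_rate": _safe_divide(b, b + d),
        "false_negative_rate": _safe_divide(c, a + c),
        "positive_predictive_power": _safe_divide(a, a + b),
        "negative_predictive_power": _safe_divide(d, c + d),
        "odds_ratio": _safe_divide(a * d, b * c),
        "kappa": _safe_divide((a + d) - chance, n - chance),
    }


# Black grouse counts as read from Table 11.
black_grouse = performance_measures(a=32, b=29, c=23, d=218)
for name, value in black_grouse.items():
    print(f"{name}: {value:.6f}")
```

For the meadow pipit matrix (298, 4, 0, 0) the negative predictive power and odds ratio have zero denominators, which is why those entries cannot be computed for that species.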
Table 12 Performance measures derived from confusion matrices shown in Table 11.
These results specify the degree of success with which each model accurately predicts both
presence and absence. Measures utilising all matrix data are the odds ratio and Kappa.
Formal definitions are given in Table 6, after Fielding and Bell (1997).

Performance measure              Black Grouse  Curlew    Golden Plover  Meadow Pipit  Ptarmigan  Overall
Prevalence                       0.182119      0.68543   0.496689       0.986755      0.155629   0.501325
Overall diagnostic power         0.817881      0.31457   0.503311       0.013245      0.844371   0.498675
Correct classification rate      0.827815      0.798013  0.735099       0.986755      0.930464   0.855629
Misclassification rate           0.172185      0.201987  0.264901       0.013245      0.069536   0.144371
Sensitivity                      0.581818      0.869565  0.733333       1             0.765957   0.790135
Specificity                      0.882591      0.642105  0.736842       0             0.960784   0.644465
False positive rate              0.117409      0.357895  0.263158       1             0.039216   0.355535
False negative rate              0.418182      0.130435  0.266667       0             0.234043   0.209865
Positive predictive power (PPP)  0.52459       0.841121  0.733333       0.986755      0.782609   0.773682
Negative predictive power (NPP)  0.904564      0.693182  0.736842       N/A           0.957031   2.573846*
Odds-ratio                       10.45877      11.96078  7.7            N/A           80.18182   50.165010*
Kappa K                          0.445519      0.522078  0.470175       0             0.733103   0.434175

* These values are averaged over only four cases.
[Figure 5: two maps. Left panel: Nature of Black Grouse prediction. Right panel: Nature
of Ptarmigan prediction. Legend classes in each panel: False negative, True negative, True
positive, False positive. Scale bar in kilometres.]
Figure 5 Evaluation ‘grid’ maps for predicted distribution of black grouse (left) and
ptarmigan (right). Green shades indicate correct predictions. Red shades indicate incorrect
predictions. Lines represent 50-km GB National Grid squares.
[Figure 6: three maps. Left panel: Nature of Curlew prediction. Middle panel: Nature of
Golden Plover prediction. Right panel: Nature of Meadow Pipit prediction. Legend classes
for curlew and golden plover: False negative, True negative, True positive, False positive;
for meadow pipit: True positive, False positive. Scale bar in kilometres.]
Figure 6 Evaluation ‘grid’ maps for predicted distribution of curlew (left), golden plover
(middle), and meadow pipit (right). Green shades indicate correct predictions. Red shades
indicate incorrect predictions. Lines represent 50-km GB National Grid squares.
Table 13 Measures of the degree of over and under prediction in frequency and diversity
GLM models. The two leftmost fields are averaged over all cases, including those where the
observed value is zero. These are subject to statistical noise (i.e. very rarely will the model
predict exactly zero due to the probabilistic nature of the output), therefore values based on
only non-zero observed values are given in the remaining columns. These include the
average degree of over or under prediction, given as a percentage of the true (observed)
value. Care is advisable in interpreting these types of result, but they are given here for
completeness.

Columns: (1) Average residual; (2) Over/under prediction ratio; (3) Average residual
(excl. zero-valued responses); (4) Over/under prediction ratio (excl. zero-valued
responses); (5) Percentage over/under prediction (excl. zero-valued responses).

Model          (1)        (2)       (3)       (4)       (5)
Black Grouse   -0.005690  6.023256  0.119217  0.046512  -65.61%
Curlew         0.011149   1.013333  0.049312  0.66      15.33%
Golden Plover  0.014383   2.431818  0.049465  0.522727  -2.55%
Meadow Pipit   0.010469   0.650273  0.027447  0.606557  11.05%
Ptarmigan      0.008295   9.413793  0.017988  0.137931  -5.61%
H              0.074895   0.495050  0.089555  0.485149  -2.28%
Figure 7 Correlation plots of observed versus predicted diversity (top-left) and frequency, for
each of the five species modelled. Simple linear regression line of best fit and coefficient
of determination, R2, added for comparison between models.
[Figure 7: six scatter plots of Observed (y-axis) against Predicted (x-axis), with fitted lines:
Frequency Black Grouse: y = 0.3716x + 0.0133, R2 = 0.0315
Frequency Curlew: y = 0.9232x + 0.0424, R2 = 0.3859
Frequency Golden Plover: y = 1.0007x + 0.0143, R2 = 0.3566
Frequency Ptarmigan: y = 1.057x + 0.0072, R2 = 0.2075
Frequency Meadow Pipit: y = 1.0645x - 0.0372, R2 = 0.4334
Observed vs. Predicted H: y = 0.7091x + 0.7038, R2 = 0.3539]
[Figure 8: two maps, each with legend classes -1, 0, 1. Left panel: Over (+ve) or under
(-ve) prediction of Curlew frequency. Right panel: Over (+ve) or under (-ve) prediction of
Meadow Pipit frequency. Scale bar in kilometres.]
Figure 8 Locations of over (+1) or under (-1) predicted frequency for upland generalists,
curlew (left), and meadow pipit (right). Yellow squares have an observed species frequency
of zero. Lines represent 50-km GB National Grid squares.
[Figure 9: two maps. Left panel: Predicted distribution of Black Grouse (legend classes
0.00-0.20, 0.21-0.40, 0.41-0.60, 0.61-0.80, 0.81-1.00). Right panel: Predicted frequency
(relative abundance) of Black Grouse (legend classes 0.01-0.05, 0.06-0.09, 0.10-0.14,
0.15-0.18, 0.19-0.23). Scale bar in kilometres.]
Figure 9 Predicted distribution (probability of presence) of black grouse (left), and predicted
frequency of black grouse (right). Lines represent 50-km GB National Grid squares.
4 Discussion
4.1 Ecological interpretation
A description of species characteristics is given by Macdonald (2004), which readers
unfamiliar with the species may find helpful in reviewing the following text.
4.1.1 Mathematical vs. biological selection
This study seeks to create species-habitat models for several upland bird species, and
also for overall diversity, at a national scale. The modelling process, however,
involves the use of theoretical statistical models rooted firmly in mathematics.
Stepwise model fitting produces a mathematically correct result, but is likely to
include variables other than the most biologically intuitive. Correlation may exist
between explanatory variables (Hill, 2001). This does not necessarily mean a poor
choice of submitted parameters for, as Hill (2001) points out, a species’ predilection
for transitional or ‘gradational’ landscapes (McGarigal and Cushman, 2002) can lead
to variable equivalence. The stepwise procedure simply selects the numerically
strongest variable. Interpretation must therefore be made in light of real-world
problems rather than purely statistical ones.
4.1.2 Species predictors
The models include intuitively sensible variables. Blanket bog and peatland, and
heather moor, are host to numerous species (Thompson et al., 1995; Thirgood, 2000).
Selection of a good set of metrics for H (Table 8, and see 3.1.2.1) suggests the model
is not dominated by any one species and is representative of overall diversity.
Cohesive and contiguous landscapes are seen to be important generally. Heather
moor is considered home to unusually high numbers of curlew, golden plover and,
more recently, lowland species such as the meadow pipit (Thirgood et al., 2000;
Thompson et al., 1995). Only golden plovers appear to respond positively to shape
metrics. Again though, cohesion was selected, particularly for the frequency model.
British curlews in the north and west may face pressure from modern agricultural
practices (Sterry, 1995). The predicted eastward association (Table 9, also Figure 6)
may reflect a preference for less intensive grazing, or perhaps for upland areas near to
the less intensive arable farmland seen in eastern Scotland. This hypothesis would
require verification by other means, e.g. local survey; however, it is echoed by a
predicted preference for patchy landscapes (higher AI and PR). This suggests models
may be useful for predicting the geographical variation in species and even possible
human activity-species interactions.
Selection of young plantation as a negative predictor of black grouse distribution
requires consideration. The species favours herbaceous growth that often appears
after fire disturbance (Moorland Working Group, 2002), and is known to exploit
similar plant growth in young plantations (see Baines 1995, Hill 2001, Hack 2002,
and Hancock et al. 1999). Looking further, black grouse are predicted to respond
negatively to montane vegetation, and positively to moorland edge and to coarse
grassland found in sub-montane zones (Macaulay Land Use Research Institute, 1989).
Conifer plantations are commonly associated with mid-range elevations, and logically
do not occur at high altitudes. The model therefore indirectly predicts that black
grouse are associated with conifer plantations, as expected. This more complex
picture may be due to the sparseness of data for black grouse.
4.2 Prediction Success
4.2.1 Diversity model
The diversity model produces good results, with a slight average under-prediction
(-2.28%; Table 13). This is echoed in the maps in Figures 3 and 4 (left). The
correlation plot in Figure 7 (top-left) shows clustering along the line of best fit, with
scatter that appears to be within the limits of statistical noise. The model thus
appears, graphically, to be a credible conservation tool.
4.2.2 Distribution models
All models show reasonable results in terms of explained deviance and p-values.
Evaluation of distribution model success has, however, been standardised using the set
of measures provided by Fielding and Bell (1997). The models have high correct
classification rates, all above 70%, with an average overall prediction success
rate of 86% (Table 12). Measures of the prediction success for presence and absence
(sensitivity and specificity) are reasonably close for most models, and near identical
for golden plover. Meadow pipits are present in almost every square of the dataset, so
these measures take their extreme values for this species, and some measures are
unavailable. This is a significant result, suggesting that prediction of widespread
species (where intuitive selection of model controls is difficult) is possible.
GIS-statistical modelling appears to be applicable to many species types,
and hence ‘total’ diversity.
Measures utilising all confusion matrix values are the odds ratio and Cohen’s Kappa
index. The odds ratio for ptarmigan is high; however, this simply reflects the low
number of incorrect predictions. Landis and Koch (1977) proposed a suitable scale of
classes for Kappa (see also Fielding and Bell, 1997). Most values were classed as
moderate, 0.41 < K < 0.60, with ptarmigan classed as substantial, 0.61 < K < 0.80. The
Kappa value for meadow pipit (0 in Table 12) is uninformative due to the species’
widespread nature.
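The Landis and Koch (1977) scale referred to above can be written as a small lookup. The sketch below is illustrative only; the function name is hypothetical, and the class boundaries are the commonly quoted ones from that scale (poor below 0, then slight, fair, moderate, substantial and almost perfect in steps of 0.20).

```python
def landis_koch_class(kappa):
    """Return the Landis and Koch (1977) agreement class for a Kappa value."""
    if kappa > 1.0:
        raise ValueError("Kappa cannot exceed 1")
    if kappa < 0.0:
        return "poor"
    upper_bounds = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in upper_bounds:
        if kappa <= upper:
            return label


# Kappa values as read from Table 12.
table12_kappa = {
    "Black Grouse": 0.445519,
    "Curlew": 0.522078,
    "Golden Plover": 0.470175,
    "Ptarmigan": 0.733103,
}
for species, kappa in table12_kappa.items():
    print(f"{species}: K = {kappa:.3f} ({landis_koch_class(kappa)})")
```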
Figures 5 and 6 illustrate the effect of the variables selected (listed in Table 9). The
maps for black grouse and ptarmigan (Figure 5) reflect sparse data, while the eastward
trend of curlews is clear in Figure 6 (left map). The presence of golden plover in the
north-west Highlands may reflect the predicted slight positive correlation with shape
complexity (Table 9).
4.2.3 Frequency models
The majority of frequency models produced reasonable values of explained deviance
and Pearson product-moment correlation, both with low p-values. The degree of
scatter in correlation plots is, however, much more pronounced (Figure 7). This reflects the
often sparser nature of the frequency data, likely due to low numbers of tetrads in
some squares. Figures 4 and 9 show this feature of the frequency data. Clearly, the
quality of results (for all models) is entirely dependent upon that of the input data.
4.3 Limitations and potential error sources
4.3.1 Biotic errors
The relatively high success rates (Tables 7-9, Figures 2-5) suggest that sufficient
environmental variables were considered. Biotic errors, due to inadequate models of
species ecology, are thus kept small.
The problems of zero inflated data, a form of overdispersion, have been explored by
Barry and Welsh (2002). Testing for overdispersion (residual deviance divided by
residual degrees of freedom) does not apply to binary response data; however, we see
that frequency-derived models are actually underdispersed (Table 7), indicating
negatively correlated responses (Wilson and Hardy, 2002). This may also be related
to the lack of binomial weights (surveyed tetrads) and thus a lack of model fit.
Methods to correct for underdispersion have not received significant attention in
statistical literature. In most cases, however, the phenomenon is not thought to pose
overly significant problems (Wilson and Hardy, 2002).
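The dispersion check described above (residual deviance divided by residual degrees of freedom) can be sketched in a few lines of Python. The function names, the tolerance band around 1 and the example figures are assumptions made for illustration; they are not values from Table 7, and the check is not meaningful for binary response data.

```python
def dispersion_ratio(residual_deviance: float, residual_df: int) -> float:
    """Residual deviance divided by residual degrees of freedom."""
    if residual_df <= 0:
        raise ValueError("residual degrees of freedom must be positive")
    return residual_deviance / residual_df


def describe_dispersion(ratio: float, tolerance: float = 0.2) -> str:
    """Label a ratio relative to 1; the tolerance is an arbitrary assumption."""
    if ratio > 1 + tolerance:
        return "overdispersed"
    if ratio < 1 - tolerance:
        return "underdispersed"
    return "approximately as expected"


# Hypothetical model summary: deviance 45.0 on 90 residual degrees of freedom.
example_ratio = dispersion_ratio(45.0, 90)
example_label = describe_dispersion(example_ratio)
```

A ratio well below 1, as in this made-up example, corresponds to the underdispersion reported for the frequency-derived models.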
4.3.2 Algorithmic errors
4.3.2.1 Randomised data partitioning and spatial autocorrelation
Autocorrelation in samples is normally to be avoided since this can lead to
autocorrelated residuals or clustering of over/under-predicted cases. In this study,
however, there are a number of points worth considering. Firstly, the data extent,
bounded by the outline of Scotland, inevitably increases the likelihood of
autocorrelation, especially in remote islands. The size of the build set (70%) within
this constraint also means most squares will be neighboured by other survey squares,
regardless of what pattern of squares we remove for evaluation. Additionally,
autocorrelation between adjacent squares forms a key part of the ecological systems
studied, since species are often seen to favour contiguous habitats. In such cases, a
small amount of artificial pattern may help to distinguish, in our predictions,
clustering due to natural processes from that due to algorithmic error.
Examination of Figures 4 and 8 shows a small degree of clustered predictions. The
results for curlew are the least clustered with little autocorrelation of over and under
predictions. Results for diversity, H (Figure 4, left), and for golden plover show a few
clusters, e.g. the Morayshire coast (linear pattern), west of Inverness/Black Isle, and in
Perthshire. These are, however, within or near to significant upland areas. The
clusters vary between species and do not strongly indicate any undue bias or
systematic error. The patterns are assumed to reflect the local land cover within those
10-km squares.
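Clustering of over- and under-predictions was judged here by inspecting the maps. One standard way to put a number on it, which was not used in this study, is Moran's I calculated on the model residuals. The sketch below assumes residuals keyed by hypothetical (column, row) indices of 10-km squares and treats squares sharing a side as neighbours; values near +1 indicate clustered residuals, values near -1 an alternating pattern.

```python
from typing import Dict, Tuple

Cell = Tuple[int, int]  # (column, row) index of a 10-km square


def morans_i(residuals: Dict[Cell, float]) -> float:
    """Moran's I with binary weights for squares that share a side."""
    n = len(residuals)
    mean = sum(residuals.values()) / n
    dev = {cell: value - mean for cell, value in residuals.items()}
    total_sq = sum(d * d for d in dev.values())
    if total_sq == 0:
        raise ValueError("residuals are constant; Moran's I is undefined")

    cross = 0.0       # sum of products of deviations over neighbouring pairs
    weight_sum = 0    # number of (directed) neighbouring pairs
    for (x, y), d in dev.items():
        for nb in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if nb in dev:
                cross += d * dev[nb]
                weight_sum += 1
    if weight_sum == 0:
        raise ValueError("no neighbouring squares in the data")
    return (n / weight_sum) * (cross / total_sq)


# Two made-up 4 x 4 residual surfaces: a west/east split and a checkerboard.
clustered = {(x, y): (1.0 if x < 2 else -1.0) for x in range(4) for y in range(4)}
alternating = {(x, y): (1.0 if (x + y) % 2 == 0 else -1.0)
               for x in range(4) for y in range(4)}
```

The west/east split gives a clearly positive value and the checkerboard gives -1, which brackets the range one would hope real residuals fall between.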
4.3.3 Boundary effects: land cover polygons
The use of 10-km squares as a fixed sampling unit may introduce errors through
imposing unnatural boundaries upon land cover polygons where these are split into
multiple patches. This was considered preferable to counting whole polygon area,
which would result in repeat counting where polygons overlap multiple squares. The
choice of a 10-km unit was considered sufficiently large to hold a number of home
ranges for each species examined. The Fragstats landscape pattern analysis package
also allows for exclusion of unnatural boundaries imposed by the landscape (map)
boundary.
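The trade-off between clipping polygons at square boundaries and counting whole polygons can be illustrated with a simplified sketch. It assumes axis-aligned rectangular patches with coordinates in kilometres, which real land cover polygons are not; the function name and the example patch are hypothetical.

```python
import math
from typing import Dict, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax) in km
SQUARE_KM = 10.0


def clipped_area_by_square(patch: Rect, size: float = SQUARE_KM) -> Dict[Tuple[int, int], float]:
    """Split a rectangular patch at grid lines and return its area per square."""
    xmin, ymin, xmax, ymax = patch
    areas: Dict[Tuple[int, int], float] = {}
    for col in range(math.floor(xmin / size), math.ceil(xmax / size)):
        for row in range(math.floor(ymin / size), math.ceil(ymax / size)):
            width = min(xmax, (col + 1) * size) - max(xmin, col * size)
            height = min(ymax, (row + 1) * size) - max(ymin, row * size)
            if width > 0 and height > 0:
                areas[(col, row)] = width * height
    return areas


# A hypothetical 6 km x 3 km patch straddling the boundary at x = 10 km.
patch = (8.0, 2.0, 14.0, 5.0)
per_square = clipped_area_by_square(patch)

clipped_total = sum(per_square.values())   # area counted once overall
whole_polygon_total = 18.0 * len(per_square)  # full area credited to each square
```

Clipping preserves the total area (18 km² here) at the cost of creating two artificial patches, whereas crediting the whole polygon to every square it touches would count 36 km².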
Finally, Hill (2001) notes that imprecision in the interpretation of LCS88 source
photography may give rise to mixed habitat within polygons, or particular land cover
in areas where not expected. This may therefore contribute to small numbers of
spurious or inaccurate predictions. This problem may be increased in some cases due
to the land cover selection methodology. Figure 1 shows a large number of coastal
squares present in the data, with low levels of genuine upland type habitat. This may
lead to further small errors in models.
5 Conclusions
5.1 Spatial controls of upland diversity
Statistical modelling reveals that a varied landscape of peat and heather patches, with
the presence of grass and conifer edge habitats, provides the conditions necessary for
increased overall diversity.
Modelling of the controls upon individual species’ distribution and frequency (or
relative abundance) has shown most species to favour larger, contiguous or cohesive,
patch areas. Requirements for suitable edge habitat, often with coniferous
plantations, are nevertheless significant. Potential conflict arises where patch shape
complexity, or proportion of ‘edge’, is increased. Of the five species examined, only one,
golden plover Pluvialis apricaria, exhibited a preference for greater shape complexity, as
chosen by a stepwise statistical selection process. A cohesive, compact habitat
was, however, also selected as a control on this species’ distribution.
5.2 Application of models as conservation/land management tools
The models provide land managers and conservationists with the first steps towards
mapping the ‘ideal landscape configuration’ for diversity. Statistical measures can be
assessed comparatively with maps of the existing landscape to infer the optimum
spatial structures that maximise diversity.
As Hill (2001) notes, the current power lies in ‘real-world’ application, where
proposed land developments (already in motion) can be substituted into models in
place of land cover data. The addition of land pattern analysis in this study, while
embryonic, adds further power to this process, allowing us not merely to see the
effects on one localised species population, but to draw inferences regarding the
consequences for all bird life.
This process is, though, still somewhat limited. Real power would come from the
ability to visualise the ideal landscape, produced from specified statistics, rather than
repeatedly searching through a set of arbitrarily defined landscapes for the best
example currently available.
5.3 Future work
The study could be extended in several ways. Firstly, a more detailed investigation of
all landscape metrics, including possible non-linear interactions or correlation, could
be undertaken. Refinement of models in terms of the numbers of metrics could
simplify the interpretation process where the more unusual variables are chosen.
Secondly, landscape patterns could be measured on a range of scales, e.g. from
species home ranges up to multiple 10-km squares. Further to this, use could be made
of class or patch level metrics. These are not available for every 10-km square due to
each square’s individual geography; however, some additional metrics exist only at
class and patch levels and may prove worthy of investigation. Alternatively, an
analysis of diversity similar to that in this study, but grouped by smaller geographical
areas in the manner of Hill (2001), may provide interesting information on regional
variation.
The most exciting opportunity is the potential development of maps graphically
illustrating possible optimal landscapes for maximising species diversity. These could
be produced using iterative simulation techniques, e.g. Monte Carlo random
landscapes, incorporating suitable land pattern metrics as predicted by GIS-powered
statistical modelling of bird diversity.
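A very small sketch of the Monte Carlo idea is given below: random binary landscapes are generated, each is scored, and the best is kept. Everything in it is an assumption made for illustration. The grid representation, the edge-count metric and especially `toy_score`, which merely rewards cohesive habitat, are stand-ins for the pattern metrics and fitted diversity models that a real implementation would use.

```python
import random
from typing import Callable, List, Tuple

Grid = List[List[int]]  # 1 = habitat cell, 0 = other land cover


def random_landscape(size: int, habitat_fraction: float, rng: random.Random) -> Grid:
    """Fill a size x size grid, each cell being habitat with the given probability."""
    return [[1 if rng.random() < habitat_fraction else 0 for _ in range(size)]
            for _ in range(size)]


def edge_count(grid: Grid) -> int:
    """Number of cell sides separating habitat from non-habitat."""
    size = len(grid)
    edges = 0
    for r in range(size):
        for c in range(size):
            if c + 1 < size and grid[r][c] != grid[r][c + 1]:
                edges += 1
            if r + 1 < size and grid[r][c] != grid[r + 1][c]:
                edges += 1
    return edges


def toy_score(grid: Grid) -> float:
    """Hypothetical stand-in for a fitted model: fewer edges scores higher."""
    return float(-edge_count(grid))


def search_landscapes(score: Callable[[Grid], float], trials: int, size: int,
                      habitat_fraction: float, seed: int) -> Tuple[float, Grid]:
    """Return the best score and landscape found over a number of random trials."""
    if trials < 1:
        raise ValueError("at least one trial is required")
    rng = random.Random(seed)  # seeded so that a run can be repeated
    best_grid = random_landscape(size, habitat_fraction, rng)
    best_score = score(best_grid)
    for _ in range(trials - 1):
        grid = random_landscape(size, habitat_fraction, rng)
        value = score(grid)
        if value > best_score:
            best_score, best_grid = value, grid
    return best_score, best_grid


best_score, best_grid = search_landscapes(
    toy_score, trials=200, size=10, habitat_fraction=0.4, seed=2004)
```

Pure random search of this kind is inefficient; an iterative scheme that perturbs the current best landscape would be a natural refinement, but the structure of generating, scoring and retaining landscapes would stay the same.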
6 Cited References
Baines, D. (1995) Habitat requirements of Black Grouse. In Proceedings of the Sixth
International Grouse Symposium (ed. D. Jenkins), pp. 147-150. World Pheasant
Association/Istituto Nazionale per la Fauna Selvatica, Ozzano dell’Emilia.
Barry, S.C. and Welsh, A.H. (2002) Generalized additive modelling and zero inflated
data. Ecological Modelling, 157, 179-188.
Begon, M., Harper, J.L., and Townsend, C.R. (1990) Ecology: Individuals,
Populations and Communities (2nd edn). Blackwell Scientific Publications,
Massachusetts.
British Trust for Ornithology (2004) Species Codes. PDF document available from
BTO web site, www.bto.org [Accessed 03 September 2004]
Cramp, S. and Simmons, K.E.L. (eds) (1977) Handbook of the Birds of Europe, the
Middle East and North Africa. Oxford University Press, Oxford.
Crawley, M.J. (2002) Statistical Computing: An Introduction to Data Analysis using
S-Plus. Wiley, Chichester.
ESRI (2000) ArcInfo 8: A New GIS for the New Millennium. Environmental Systems
Research Institute, Inc., Redlands, California.
ESRI (2004) ArcGIS Desktop. Environmental Systems Research Institute, Inc.,
Redlands, California.
Fielding, A.H. and Bell, J.F. (1997) A review of methods for the assessment of
prediction errors in conservation presence/absence models. Environmental
Conservation, 24, 38-49.
Gibbons, D.W., Reid, J.B., and Chapman, R.A. (1993) The New Atlas of Breeding
Birds in Britain and Ireland: 1988-1991. T & AD Poyser, London.
Hancock, M. et al. (1999) Status of male Black Grouse Tetrao tetrix in Britain in
1995-1996. Bird Study, 46, 1-15.
Landis, J.R. and Koch, G.G. (1977) The measurement of observer agreement for
categorical data. Biometrics, 33, 159-174.
Luscher, L.M. (2001) Oracle 9i Database Concepts – Release 1 (9.0.1). Oracle
Corporation, Redwood City, California.
Macaulay Land Use Research Institute (1989). The Land Cover of Scotland by Air
Photo Interpretation. Contract specification for the Scottish Development
Department.
Macdonald, O.N. (2004) Modelling the spatial arrangement of upland bird habitat in
Scotland: Technical Report. MSc thesis, University of Edinburgh.
Malanson, G.P. and Cramer, B.E. (1999) Landscape heterogeneity, connectivity, and
critical landscapes for conservation. Diversity and Distributions, 5, 27-39.
McCullagh, P. and Nelder, J. (1983) Generalized Linear Models. Chapman and Hall,
London.
McGarigal, K. (2002) Landscape pattern metrics. In: A. H. El-Shaarawi and W. W.
Piegorsch, eds. Encyclopaedia of Environmetrics Volume 2: 1135-1142. John Wiley
& Sons, Sussex.
McGarigal, K. and Cushman, S.A. (2002) The Gradient Concept of Landscape
Structure: Or, Why are there so many patches? Landscape Ecology. In Press.
Available online at www.umass.edu/landeco/pubs/pubs.html. [Accessed 26 August
2004]
McGarigal, K., et al. (2002) FRAGSTATS: (Version 3) Spatial Pattern Analysis
Program for Categorical Maps. Computer software program produced by the authors
at the University of Massachusetts, Amherst. Available online at www.umass.edu/
landeco/research/fragstats/fragstats.html [Accessed 26 August 2004].
McGarigal, K. and Marks, B.J. (1995) FRAGSTATS: (Version 2) Spatial pattern
analysis program for quantifying landscape structure. General Technical Report
PNW-GTR-351. USDA Forest Service, Pacific Northwest Research Station, Portland,
Oregon.
Moorland Working Group (2002) Scotland’s Moorland: The Nature of Change.
Battleby: Scottish Natural Heritage.
Nelder, J.A. and Wedderburn, R.W.M. (1972) Generalized Linear Models. Journal of
the Royal Statistical Society, Series A 135, 370-384.
Pearce-Higgins, J.W. and Yalden, D.W. (2003) Variation in the use of pasture by
breeding European Golden Plovers Pluvialis apricaria in relation to prey availability.
Ibis, 145, 365-381.
S-PLUS 6 for Windows User’s Guide (2001) Insightful Corporation, Seattle,
Washington. Available online at www.insightful.com [Accessed 26 August 2004].
Sterry, P. ed. (1995) Explore Britain’s Birds. AA Publishing, Basingstoke.
Stillman, R.A. and Brown, A.F. (1998) Pattern in the distribution of Britain’s upland
breeding birds. Journal of Biogeography, 25, 73-82.
Tharme et al. (2001) The effect of management for red grouse shooting on
the population density of breeding birds on heather-dominated moorland. Journal of
Applied Ecology, 38, 439-457.
Thirgood, S. et al. (2000) Conservation conflicts and management solutions.
Conservation Biology, 14(1), 95-104.
Thompson, D.B.A. et al. (1995) Upland heather moorland in Great Britain: A review
of international importance, vegetation change and some objectives for nature
conservation. Biological Conservation, 71, 163-178.
Tucker, N.I.J. (2001) Linkage restoration: Interpreting fragmentation theory for the
design of a rainforest linkage in the humid Wet Tropics of north-eastern Queensland.
Ecological Management and Restoration, 1(1), 35-41.
Wilson, K. and Hardy, I.C.W. (2002) Sex ratios. Concepts and research methods.
Cambridge University Press, Cambridge.