archive bio climatic

Upload: alberto-santos

Post on 05-Apr-2018

218 views

Category:

Documents


0 download

TRANSCRIPT

  • 8/2/2019 Archive Bio Climatic

    1/27

    Progress in Physical Geography 30, 6 (2006) pp. 127

    2006 SAGE Publications 10.1177/0309133306071957

    Methods and uncertainties in bioclimatic

    envelope modelling under climate changeRisto K. Heikkinen,1* Miska Luoto,1Miguel B. Arajo,2,3 Raimo Virkkala,1Wilfried Thuiller3,4 and Martin T. Sykes5

    1Finnish Environment Institute, Research Department, Research Programmefor Biodiversity, PO Box 140, FIN-00251 Helsinki, Finland2Department of Biodiversity and Evolutionary Biology, National Museum ofNatural Sciences, CSIC, Madrid, Spain3Macroecology and Conservation Unit, University of vora, Portugal4Laboratoire dEcologie Alpine, UMR-CNRS 5553, Universit Joseph Fourier,BP 53, 38041 Grenoble Cedex 9, France5Geobiosphere Science Centre, Department of Physical Geography andEcosystems Analysis, Lund University, Sweden

    Abstract: Potential impacts of projected climate change on biodiversity are often assessed using

    single-species bioclimatic envelope models. Such models are a special case of species distributionmodels in which the current geographical distribution of species is related to climatic variables soto enable projections of distributions under future climate change scenarios. This work reviews anumber of critical methodological issues that may lead to uncertainty in predictions frombioclimatic modelling. Particular attention is paid to recent developments of bioclimatic modellingthat address some of these issues as well as to the topics where more progress needs to be made.Developing and applying bioclimatic models in a informative way requires good understanding ofa wide range of methodologies, including the choice of modelling technique, model validation,collinearity, autocorrelation, biased sampling of explanatory variables, scaling and impacts of non-climatic factors. A key challenge for future research is integrating factors such as land cover, directCO2 effects, biotic interactions and dispersal mechanisms into species-climate models. Weconclude that, although bioclimatic envelope models have a number of important advantages, they

    need to be applied only when users of models have a thorough understanding of their limitationsand uncertainties.

    Key words: bioclimatic model, climate change, land cover, model performance, modellingmethods, niche properties, scale, species distribution model, species geography, uncertainty,validation.

    *Author for correspondence. Email: [email protected]

  • 8/2/2019 Archive Bio Climatic

    2/27

    2 Methods and uncertainties in bioclimatic envelope modelling

    I IntroductionSeveral studies and meta-analyses have indi-cated that recent climatic change has alreadyaffected species geographical distributionsand the persistence of populations (Parmesan,1996; Walther et al., 2002; Moore, 2003;

    Parmesan and Yohe, 2003). Furthermore, pro-jected climate changes are likely to have aneven greater impact on biota (Berry et al.,2002; Hill et al., 2003; Thomas et al., 2004;Thuilleret al., 2005b). There are many differ-ent methodological approaches available forexamining the potential effects of climatechange on biodiversity, ranging from dynamicecosystem and biogeochemistry models (eg,Woodward and Beerling, 1997; Peng, 2000)and spatially explicit mechanistic models for

    single species range shifts (eg, Hillet al., 2001)to physiologically based (eg, Sykeset al., 1996;Walther et al., 2005) and correlative biocli-matic envelope models (eg, Box et al., 1993;Huntley et al., 1995; Iverson and Prasad,1998; 2001; 2002; Pearsonet al., 2002; 2004:Thuiller, 2003; Thuiller et al., 2005b). Thispaper concentrates on the latter group statistical bioclimatic envelope models whichare among the most popular approaches tosimulate species-climate change impacts.

    Statistical bioclimatic modelling techniquesaim at defining, for any chosen species, theclimate envelope that best describes thelimits to its spatial range by correlating thecurrent species distributions with selectedclimate variables (Beaumont and Hughes,2002; Berry et al., 2002; Pearson andDawson, 2003; Thuiller, 2003; Huntleyet al.,2004). Assessments of future speciesbiogeo-graphical ranges are developed by applyingthe models based on the climate variablesthat best describe the current equilibrium dis-tributions to simulate future distributionsunder selected climate change scenarios(Bakkeneset al., 2002; Petersonet al., 2002a;2002b; 2004; Peterson, 2003; Thuiller, 2003;Pearson et al., 2004; Thomas et al., 2004;Thuilleret al., 2004b; 2005b).

    The validity of bioclimatic envelope mod-elling approaches has been questioned on the

    grounds that many factors other than climatecan significantly influence species distribu-tions and the rate of distribution changes(Hampe, 2004), especially in a simulations ofthe future. Huntley et al. (1989) (see alsoHuntley, 1998, and references therein)

    showed that in the Holocene the approachusing climate alone did predict distributionseven though this distribution included otheraspects such as biotic interactions (cf. Daviset al., 1998; Beaumont and Hughes, 2002). Inthe future, however, extensive habitatfragmentation and species dispersallimitations (Iverson and Prasad, 2002), andimpacts of rising atmospheric CO2(Woodward and Beerling, 1997; Iverson andPrasad, 2002), soil and fire changes (Brereton

    et al., 1995; Iverson and Prasad, 1998;Crumpacker et al., 2001) driven by humaninteractions, genetic differences in popula-tions in different parts of the range, andchanging biotic interactions suggest that it isnecessary to assess realistically the use ofsuch approaches in future scenarios. Suchlimitations and uncertainties pose restrictionson bioclimatic envelope models and theirresults should be interpreted with caution(Pearson and Dawson, 2003; Guisan and

    Thuiller, 2005).Nevertheless, bioclimatic envelope modelshave certain advantages. They offer a tool forundertaking relatively rapid analysis for numer-ous individual species, and allow the identifica-tion of key relationships between species andgoverning factors of their distributions (Iversonand Prasad, 2001; 2002; Pearson and Dawson,2003; Gavin and Hu, 2005). They are particu-larly valuable for providing insight into potentialclimate warming effects on biodiversity whenrange-limiting physiological factors for studiedspecies are poorly known (Crumpackeret al.,2001). In addition, individual-species modelsmay provide more precise and realistic predic-tions than those offered by species-assemblagemodels (Iverson and Prasad, 1998), not leastbecause species responses to climate changeare thought to be mainly individualistic(Huntley, 1991; Huntleyet al., 1995).

  • 8/2/2019 Archive Bio Climatic

    3/27

    Risto K. Heikkinen et al. 3

    However, if we are to develop as accuratelyas possible bioclimatic envelope models andspecies range shift scenarios there are a num-ber of critical methodological issues that needto be addressed (Arajoet al., 2005a; Guisanand Thuiller, 2005). Several methodological

    aspects and decisions in modelling exercises,such as differences between statistical tech-niques and decisions on which model selec-tion criteria and explanatory variables areused in modelling, can have a notable impacton the accuracy of bioclimatic envelope mod-els, as well as species distribution models ingeneral (Elith et al., 2002; Thuiller et al.,2004b; Guisan and Thuiller, 2005).

    Our objectives are to discuss the keymethodological issues that may lead to uncer-

    tainties in bioclimatic envelope modelling.Based on a review of both earlier and morerecent published studies, we assess progressmade to improve understanding and reduceuncertainties of predictive bioclimatic model-ling. We also highlight issues that appear to behitherto insufficiently examined in the sci-ence of bioclimatic modelling of species distri-butions. While we acknowledge that therehave been an impressive number of biocli-matic modelling studies done, we do not pro-

    pose to describe them all. Instead, we focuson selected recent developments and studiesthat are most relevant to the uncertaintyissues discussed here.

    II Modelling methods and approachesStatistical bioclimatic envelope models repre-sent one specific type of species distributionmodels (referred to also as habitat modelsand ecological niche-based models; seeGuisan and Zimmermann, 2000), in whichthe biogeographical distributions of speciesare related to broad-scale variation in climateby given modelling techniques (Arajoet al.,2005a; Guisan and Thuiller, 2005). Severaltechniques have been employed in speciesdistribution modelling in general and in bio-climatic envelope modelling in particular(Franklin, 1995; Guisan and Zimmermann,2000; Elith and Burgman, 2002; Olden and

    Jackson, 2002a; Segurado and Arajo, 2004).A list of approaches used in bioclimatic mod-elling is provided in Table 1 and these are fur-ther discussed below.

    Climatic envelope techniques (Environmental

    envelope techniques)have been used to calculate

    a fitted, species-specific, minimal rectilinearenvelope in a multidimensional climatic space(Boxcar) (Guisan and Zimmermann, 2000).The best-known of these techniques areBIOCLIM (eg, Busby, 1991; Beaumont andHughes, 2002; Kadmonet al., 2003; Beaumontet al., 2005), the Florida Model (Box et al.,1999), HABITAT (Walker and Cocks, 1991),and DOMAIN (Carpenter et al., 1993). Thefuzzy minimal rectilinear envelope modellingapplied by Skov and Svenning (2004) also

    belongs to climatic envelope techniques. Othermethods closely related to these techniqueshave been used by Erasmus et al. (2002),Midgleyet al. (2002) and Mileset al. (2004).

    BIOCLIM and other environmental enve-lope techniques, as well as some relatedmethods such as Ecological Niche FactorAnalysis (ENFA), are designed to deal withpresence-only data (Guisan andZimmermann, 2000; Kadmon et al., 2003).This is a valuable feature in cases where reli-

    able absence data is not available (Kadmonet al., 2003; Brotonset al., 2004). Paucity ofspecies distribution data is a common situa-tion in remote and poorly inventoried regions.Presence-only models can be useful also formodelling distributions of highly mobileorganisms, because valid absences can be dif-ficult to obtain for such species (Guisan andThuiller, 2005). However, in cases whereabsence data is available, modellingapproaches that employ presence/absencedata are generally prioritized and seem to givebetter predictions (Brotons et al., 2004;Segurado and Arajo, 2004; Pearson et al.,2006).

    Classification tree analysis (CTA), alsoreferred to as classification and regression trees(CART), involves rule-based methods thathave been used by, for example, Iverson andPrasad (1998; 2001; 2002), Thuiller (2003;

  • 8/2/2019 Archive Bio Climatic

    4/27

    4 Methods and uncertainties in bioclimatic envelope modelling

    2004), Thuilleret al. (2003b) and Arajoet al.(2005a; 2005b; 2006). CTA uses recursive par-titioning to split the data into increasinglysmaller, homogenous, subsets until a termina-tion is reached (Iverson and Prasad, 1998;Venables and Ripley, 2002). The advantage ofclassification trees (CTA) is that it is allows cap-turing of non-additive behaviour and complexinteractions (DeAth and Fabricius, 2000;Thuiller et al., 2003b). Moreover, numericaland categorical variables can readily be usedtogether in CTA (Iverson and Prasad, 1998).However, there are limitations in applyingCTA. In particular, CTA has a tendency to pro-duce overly complex models that lead to spuri-ous interpretations (Thuiller, 2003; Muoz andFelicsmo, 2004; Arajoet al., 2005a).

    Two closely related parametric approaches,multiple logistic regression andgeneralized lin-ear models (GLM) with a binomial distributionand a logistic link, have been employed byGuisan and Theurillat (2000), Price (2000),Bakkeneset al. (2002) and Burnset al. (2003)(see also Hirzel and Guisan, 2002, and refer-ences therein). Using cover-abundance dataof plant species, Dirnbcket al. (2003) applieda closely related method, ordinal logisticregression models (proportional odds model).

    An increasingly popular approach isgener-alized additive models (GAM), which has beenused in climate change impact studies by, forexample, Leathwick et al. (1996), Thuiller(2003; 2004) and Arajoet al. (2004). GAMsare non-parametric extensions of GLMs.

    Table 1 Examples of the statistical techniques, and their abbreviations, applied inbioclimatic envelope modelling

    Study Modelling methods

    Breretonet al., 1995; Beaumont and Hughes, 2002 BIOCLIMKadmonet al., 2003; Meynecke, 2004; Beaumontet al., 2005

    Boxet al., 1993; 1999; Crumpackeret al., 2001 The Florida ModelWalker and Cocks, 1991 HABITATCarpenteret al., 1993 DOMAINBakeret al., 2000 CLIMEXSkov and Svenning, 2004; Svenning and Skov, 2004 Fuzzy minimal rectilinear envelope

    modellingSykeset al., 1996; Waltheret al., 2005 STASHIverson and Prasad, 1998; 2001; 2002 Classification and regression tree

    analysis (CTA / CART / RTA)Guisan and Theurillat, 2000; Price, 2000 Logistic regression/binomial GLMBakkeneset al., 2002; Burnset al., 2003Leathwicket al., 1996; Midgleyet al., 2003 GAM

    Arajoet al., 2004; Luotoet al., 2005Beerlinget al., 1995; Huntleyet al., 1995; 2004 Locally weighted regressionHillet al., 1999; 2002 (local regression/loess)Berryet al., 2002; Pearsonet al., 2002; 2004 ANNPeterson, 2001; Andersonet al., 2002a; 2002b GARPPetersonet al., 2002a; 2002b; 2004Prasad and Iverson, 2000 MARSGavin and Hu, 2005 GM-SMAPThuiller, 2003; 2004; Arajoet al., 2005a; 2005b GLM, GAM, CTA, ANNThuilleret al., 2005a; 2005b

    ANN artificial neural networks; GAM generalized additive models; GARP genetic algorithm for rule-set prediction;GLM generalized linear models; GM-SMAP Gaussian mixture distributions and multiscale segmentation; MARS multivariate

    adaptive regression splines.

  • 8/2/2019 Archive Bio Climatic

    5/27

    Risto K. Heikkinen et al. 5

    They provide a flexible data-driven class ofmodels that permit both linear and complexadditive response shapes, as well as thecombination of the two within the samemodel (Bioet al., 2002; Wood and Augustin,2002). Thus GAMs can cope with regression

    functions that are of a form not easily approx-imated by conventional parametric tech-niques, such as logistic regression (Leathwicket al., 1996). However, possible problemswith overdispersion must be handled withcaution when fitting GAMs (Leathwicket al.,1996).

    Species-climate response surfaces havebeen developed also using locally weightedregression (local regression/loess) (seeCleveland and Devlin, 1988; Venables and

    Ripley, 2002) by, for example, Huntley et al.(1995) and Hill et al. (1999; 2002). Localregression is a non-linear flexible method thatrequires no prior assumption about the formof the relationship between a climate predic-tor and the species probability of occurrence(Huntley et al., 2004). The method is thusable to capture complex non-linear and multi-modal relationships. According to Beerlinget al. (1995) local regression is also morerobust for extrapolation beyond the domain

    within which the model was calibrated thanother modelling methods. A potential disad-vantage is that locally weighted regressionrequires an a priori selection of the bioclimaticpredictors to be included in the model(Huntleyet al., 2004). A difference betweenGAMs and local regression is that in the lattermethod the model is fitted as a single smoothfunction of all the predictors, whereas inGAMs the effects of the terms in the modelare expected to enter the model additively,without interactions between terms(Anonymous, 2001).

    Artificial neural networks (ANN) are pow-erful rule-based modelling techniques whichare increasingly used in bioclimatic envelopemodelling (eg, Berry et al., 2002; Pearsonet al., 2002; Thuiller, 2003; 2004; Arajoet al., 2005a). This method is able to handleexplanatory variables from different sources,

    such as categorical and boolean data.Moreover, ANN is considered to be robust tonoise in the training data set, and it is able todetermine climatic envelopes that have non-linear responses to predictors (Hilbert andOstendorf, 2001; Pearsonet al., 2002; 2004).

    Disadvantages include the requirement oflarge quantity of data to train, validate andtest the network, and the limited insights intothe contributions of the predictors in the pre-diction process (but see Olden and Jackson,2002b). Moreover, ANN do not allow exam-ining the response curves of species againstenvironmental gradients (Manel et al., 1999;Pearsonet al., 2002).

    Genetic algorithm for rule-set prediction

    (GARP) is an artificial intelligence based

    super-algorithm which uses other techniques(eg, logistic regression, BIOCLIM) in adynamic machine-learning environment(Stockwell and Peterson, 2002; Andersonet al., 2003). GARP uses species presencerecords and georeferenced data on ecologicalfactors to produce a model of species ecolog-ical niches. The software is tailored to searchfor non-random correlations between speciespresence and absence and environmentalcharacteristics using several different types of

    rules. GARP works in an iterative process ofrule selection, evaluation, testing and incorpo-ration or rejections to produce a heteroge-neous rule set summarizing speciesecologicalrequirements (Anderson et al., 2002a). Thealgorithm can run several thousand iterations.GARP primarily works on presence datapoints, but it allows for resampling withreplacement from the pixels without con-firmed presence data in the training set to cre-ate a set of pseudo-absence points (Andersonet al., 2003). When projected onto a geo-graphical space, GARP provides predictions ofthe species geographical distribution. GARPhas been used extensively by A.T. Petersonand collaborators (eg, Peterson, 2001;Andersonet al., 2002a; Petersonet al., 2002b;Stockwell and Peterson, 2002).

    Multivariate adaptive regression splines

    (MARS) is a relatively novel and promising

  • 8/2/2019 Archive Bio Climatic

    6/27

    6 Methods and uncertainties in bioclimatic envelope modelling

    modelling approach. MARS has been hithertovery rarely applied in either bioclimatic enve-lope modelling or species distribution model-ling in general (but see Prasad and Iverson,2000; Muoz and Felicsmo, 2004). Thistechnique combines linear regression, mathe-

    matical construction of splines and binaryrecursive partitioning to produce a localmodel where relationships between responseand predictors are either linear or non-linear.

    An extension of MARS is generalizedboosted trees or generalized boosted models(GBM) which have been recently introducedin ecology. They are highly efficient in fittingthe data, non-parametric and combine thestrength of different modern statistical tech-niques (Ridgeway, 1999; Friedman et al.,

    2000). Boosting improves predictive accura-cies by iteratively estimating classifiers using abase learning algorithm (eg, a decision tree)while systematically varying the training sam-ple. The final boosted classifiers prediction isbased upon an accuracy-weighted voteacross the estimated classifiers. Recently,boosting has been shown to be a form of addi-tive logistic regression (Friedman et al.,2000), in which the probabilities of classmembership can be obtained from boosting.

    Hitherto, there are hitherto very few studiesin a context of global change based on gener-alized boosted models. Clearly, due to thepotentiality of this approach such studiesmight provide important contributions to theclimate change modelling.

    Two additional novel techniques whichhave hitherto rarely been applied in geograph-ical species distribution modelling but whichalso merit further research are maximumentropy method (Maxent) and RandomForests Analysis (or Multiple Tree) (but seePhillips et al., 2006; Prasad et al., 2006).Random Forests generates hundreds or thou-sands of random trees, evaluates them as awhole and ultimately selects the smallest(most parsimonious) tree that has a givenerror level (predictive value). Maxent is amachine-learning powerful technique suitedparticularly for modelling species geographic

    distributions based on presence-data only. Inan extensive comparative study Phillipset al.showed that Maxent was able to performregularly better (based on AUC measures) inpredicting the spatial distribution of two studyspecies than GARP. Moreover, Maxent was

    more successful in producing detailed (fine-grained) predictions than GARP. Thus thismethod can provide useful improvements tospecies-climate change modelling in regionswhere absence data are not available forspecies.

    III Performance of different modellingtechniquesEach of the modelling approaches have theirstrengths and weaknesses and also differ in

    their ability to summarize plausible biogeo-graphic responses of species distributions toclimatic predictors (Segurado and Arajo,2004). Moreover, in assessments ofthe potential impacts of climate change,seemingly small differences in the accuracy ofmodels in predicting current distributions mayresult in disturbingly dissimilar projections offuture distributions (Thuiller, 2003; 2004).

    1 Measures of prediction accuracy

    One criterion for evaluating performance ofmodels is to measure the accuracy of the pre-dictions (ie, prediction error; see Fielding,2002), preferably based on independent datasets. There are several measures that can beused in evaluating the classification accuracyof the presence/absence models (Fielding andBell, 1997; Pearce and Ferrier, 2000a).However, currently the discrimination abilityof the species distribution models is mainlyassessed using two measures: the Kappa sta-tistic (Cohens Kappa) (Congalton, 1991), andthe area under the curve (AUC) of a receiveroperating characteristic (ROC) plot (Fieldingand Bell, 1997).

    The Kappa coefficient measures the cor-rect classification rate (proportion of correctlyclassified presences and absences) after theprobability of chance agreement has beenremoved (Congalton, 1991). Landis and Koch

  • 8/2/2019 Archive Bio Climatic

    7/27

    Risto K. Heikkinen et al. 7

    (1977) proposed a scale to describe the degreeof concordance based on Kappa: 0.811.00 almost perfect; 0.610.80 substantial;0.410.60 moderate; 00.210.40 fair;0.000.20 fail. Kappa is dependent on asingle threshold to distinguish between pre-

    dicted presence and predicted absence andthus falls into the class of thresholddepend-ent measures. The earlier practice of using a0.5 cut-off probability as a rule of thumb hasshown to be inadequate (Manel et al., 2001;Segurado and Arajo, 2004; Liuet al., 2005).A more objective and increasingly popularapproach is to select an optimum probabilitythreshold based on the cut-off level that max-imizes Kappa. This can be determined by eval-uating Kappa values at successive probability

    increments across the entire range from 0.00to 1.00 (Huntleyet al., 2004).A recent important evaluation of the

    approaches to select an optimal threshold fortransforming the species distribution predic-tions to presences/absences was provided byLiu et al. (2005). The authors compared 12approaches for selecting the threshold ofoccurrence. Their results challenged the com-monly used kappa maximization approachbecause it was not among the most robust

    methods. Instead, the threshold-determiningapproaches recommended by Liuet al. (2005)included (i) prevalence (ie, the number ofoccurrences in relation to the number of sam-ples) approach (taking the prevalence ofmodel-building data as the threshold); (ii)average probability/suitability approach (tak-ing the average predicted probability/suitabil-ity (ie, the mean value of predictedprobabilities of species presence) of themodel-building data as the threshold); and (iii)using data sets with prevalence of 50% tobuild models.

    An alternative measure of accuracy is theAUC of the ROC plot. AUC relates relativeproportions of correctly classified (true posi-tive proportion) and incorrectly classified(false positive proportion) cells over a wideand continuous range of threshold levels(Cumming, 2000; Erasmuset al., 2002). This

    makes it a thresholdindependent measure(Pearce and Ferrier, 2000a). The AUC rangesgenerally from 0.5 for models with no dis-crimination ability to 1.0 for models with per-fect discrimination. An approximate guide forclassifying the accuracy of AUC is that pro-

    posed by Swets (1988): 0.901.00 excel-lent; 0.800.90 good; 0.700.80 fair;0.600.70 poor; 0.500.60 fail. AUCvalues of less than 0.5 indicate that the modeltends to predict presence at sites at which thespecies is, in fact, absent (Elith and Burgman,2002).

    2 Comparisons of modelling techniques

    Bioclimatic envelope modelling studies haveusually been conducted by employing only

    one modelling approach (eg, Boxet al., 1993;Huntley, 1995; Iverson and Prasad, 1998;2001; Bakkenes et al., 2002; Beaumont andHughes, 2002; Berry et al., 2002; Pearsonet al., 2002; 2004; Huntley et al., 2004;Beaumontet al., 2005), or different variationsof the same technique, eg, GARP (Andersonet al., 2002a; 2003; Peterson, 2003). Somevariation is expected from using differenttechniques because different models use avariety of assumptions, algorithms and

    parameterizations. Thus, when studies use asingle modelling technique there is no infor-mation of whether the selected methodprovides the best predictive accuracy for theparticular data set used (Arajo and New,2006).

    A number of comparative studies haveexamined the performance of differentspecies distribution modelling techniques(Franklin, 1995; Manelet al., 1999; Bioet al.,2002; Elith and Burgman, 2002; Moisen andFrescino, 2002; Olden and Jackson, 2002a;Thuilleret al., 2003a; Muoz and Felicsmo,2004; Segurado and Arajo, 2004), althoughin the context of climate change suchappraisals are rare (but see Thuiller, 2003;2004; Arajo et al., 2005a; Pearson et al.,2006).

    When assessing model variability in thecontext of present-time bioclimatic modelling,

  • 8/2/2019 Archive Bio Climatic

    8/27

    8 Methods and uncertainties in bioclimatic envelope modelling

    some studies have reported only subtle differ-ences among techniques. For example,Franklin (1995) reported that GAM, GLMand CTA produced all projections with similaraccuracy. In contrast, using the same threemethods and data on plant species at three

    different scales, Thuiller et al. (2003a)showed that CTA had a lower overall modelperformance than the two generalized meth-ods. In a study by Bioet al. (2002) more thanhalf of the species were better modelled byGAM than by GLM, indicating that speciesresponses are often complex and difficult tofit using simple symmetric response shapes.Elith and Burgman (2002) reported an appar-ent trend towards better model discrimina-tory performance from GAM and GLM in

    comparison to ANUCLIM (climatic envelopemethod) and GARP. The authors stated thatGARP appears to perform better than theother three methods when tested with origi-nal data, but testing with independent dataindicated no clear differences in the accuracyof models. These results seem to contradictthe optimistic statements of Peterson andCohoon (1999) and Anderson et al. (2003),who have suggested that GARP is especiallysuccessful in predicting species distributions

    under a wide variety of situations. Also theresults of Pearson et al. (2006) indicate thatGARP may yield projections that markedlydiffer from theoretically more robust semi-parametric techniques.

    Contrasting results have also been obtainedusing ANN. Olden and Jackson (2002a)stated that, on average, ANN outperformedlogistic regression, linear discriminant analysisand CTA, although all approaches predictedspecies presence/absence with moderate toexcellent success (Thuiller, 2003; Arajoet al., 2005a). In an extensive evaluation studywith seven modelling methods, Segurado andArajo (2004) concluded that ANN per-formed generally best, immediately followedby GAMs including a covariate term for spatialautocorrelation. In contrast, Manel et al.(1999) concluded that ANN do not currentlyhave major advantages over logistic regression

    and discriminant analysis as regards modelperformance. However, it is important to notethat ANN, as well as other modelling tech-niques, are sensitive to small variations inmodel parameterization. Unless studies equalparameterizations for a given technique,

    results from models are not entirely compara-ble (Segurado and Arajo, 2004). Multivariateadaptive regression splines (MARS) has rarelybeen included in comparative studies.However, it has been shown to perform inseveral cases better than CTA, ANN or logis-tic regression (Moisen and Frescino, 2002;Muoz and Felicsmo, 2004).

    Comparisons of modelling techniques inthe context of climate change are morerecent and provide evidence of previously

    unnoticed levels of variability across model-ling techniques (Thuiller, 2003; 2004; Arajoet al., 2005a; 2005b; Pearson et al., 2006).For example, in Thuiller (2003) CTAappeared as the weakest method with a ten-dency to overfit. ANN performed slightlybetter than the other methods but showedalso a tendency to overfit during the calibra-tion process. GAM and GLM did not appearto overfit and had a higher accuracy thanANN in several cases. The results of the fol-

    low-up study by Thuiller (2004) indicatedthat the variability across model projections ofspecies future ranges from GLM, GAM,ANN, and CTA can be large and even over-ride the variability arising from the use of arange of climate change scenarios.

    In a bioclimatic modelling study based on116 breeding bird species in Britain at twotime periods (Arajo et al., 2005b), modelsyielded projections that were variable both inmagnitude and direction. For example, 90%of the species were projected both to expandand to contract depending on the modellingtechnique and calibration used. Arajoet al.(2006) examined projected potential distribu-tions of European amphibian and reptilespecies under a set of climate change scenar-ios. As with the UK birds study, they detectedconsiderable amount of methodologicaluncertainty as model projections were

  • 8/2/2019 Archive Bio Climatic

    9/27

    Risto K. Heikkinen et al. 9

    extremely variable. Pearson et al. (2006)applied nine modelling techniques to modelcurrent and potential future distributions offour target species in South Africa. Theirresults showed significant differencesbetween predictions from different models,

    with predicted changes in range size by 2030differing in both magnitude and directions.Some general conclusions can be drawn

    from these comparative studies. First, thebest model performance has been most oftenattributed to techniques using complexapproaches to mode fitting; GAM, ANN, andrecently also MARS (but see Elith et al.,2006; Lawler et al., 2006). These methodsthus represent perhaps the most reliablechoices for bioclimatic modelling exercises

    employing only one modelling method.However, it should be noted that some of thetechniques, eg, GARP and locally weightedregression, have rarely been included in com-parative studies (but see Elith and Burgman,2002; Pearsonet al., 2006). Moreover, undercertain circumstances (for example interpola-tions made within a given region and timeperiod) most of the methods are able to pro-vide moderate or excellent performance.

    3 Approaches to account for modelpredictionsvariability

    As a response to the variability in the modelperformance between different methods,two recent developments to reduce theuncertainty in species-climate impacts model-ling have been defined (Arajo and New,2006). Rather than using a single modellingtechnique investigators can (i) use a frame-work including different methods and modelsfor each species and select the most accuratetechnique using both evaluation methods andexpert knowledge (Thuiller, 2003; 2004), or(ii) use majority vote criterion approachamong multiple models thus deriving a singleprojection that represents the central ten-dency across all models considered. Thuiller(2004) used a consensus analysis based onprincipal components analysis (PCA) toderive composite variables that summarize

    the highest amount of information from indi-vidual projections from different methods(GLM, GAM, CTA, ANN) (see alsoAndersonet al., 2003; Thuilleret al., 2005b).This approach was further explored and itsusefulness demonstrated in a study by Arajo

    et al. (2005b) where the authors were able totest the results of bioclimatic models appliedto bird distributions in Great Britain using twodata sets in different time period subject toclimate change. The results of models werefound to be extremely variable across speciesand modelling methods. Using a consensusapproach authors were able to produce fore-casts with significantly reduced uncertainties.However, as pointed out by Arajo et al.(2005b), averaging the model projections will

    mainly increase the accuracy of forecastswhen better models (eg, models based ontechniques that generally provide superioroverall performance and perform consistentlywell across a range of species) and not merelymore models are taken into account.Improved accuracy will thus also criticallydepend on traditional challenges of trying tobuild better models with improved data.However, there is a possibility that modelsproviding realistic projections are a minority

    within one ensemble of forecasts. Thus theywould contribute little to building a consensusamong an ensemble of forecasts. An alterna-tive for combining forecasts is proposed byArajoet al. (2006), whereby different clus-ters of model projections are produced andconsensus is calculated within each cluster.This allows simple conditional statements ofprobabilities to be made, whereas the fullbreadth of variability provided by an ensembleof forecasts is still preserved (for extendeddiscussion see Arajo and New, 2006).

    IV Model validation approaches andbioclimatic envelope modelsA wealth of literature is available on the eval-uation, validation and confirmation of numericmodels (Oreskes et al., 1994; Harrell et al.,1996; Rykiel, 1996; Fielding and Bell, 1997;Guisan and Zimmermann, 2000; Arajoet al.,

  • 8/2/2019 Archive Bio Climatic

    10/27

    10 Methods and uncertainties in bioclimatic envelope modelling

    2005a). However, the perspectives presentedhave varied considerably. Oreskeset al. (1994)takes an extreme view that because all naturalsystems are not closed model results arealways non-unique, and verification or valida-tion of numerical models is impossible. These

    authors argued that the primary value of mod-els is heuristic and that predictions will alwaysbe open to question. Others have providedless extreme standpoints. For example,Harrell et al. (1996), Fielding and Bell (1997)and Olden et al. (2002) contend with dis-cussing the following strategies for model vali-dation: resubstitution (no partitioning iscarried out, the data used to calibrate modelsare also used to validate them); bootstrappingand leave-one-out jack-knifing; (one-time)

    data splitting; grouped cross-validation (alsoknown as k-fold partitioning, hold-out, orexternal method).

    A recent contribution to the debate ofmodel validation under climate change wasprovided by Arajo et al. (2005a). Theauthors argued that the validation of species-climate envelope models has been insuffi-ciently explored and this has contributed tocreating an optimistic perception of their trueperformance in climate-change impact

    assessments. This is not surprising since vali-dation of models is often made with non-independent data as predictions concernevents that have not yet occurred. In somecases species range shift projections (eg,Stersdal et al., 1998; Beaumont andHughes, 2002) were made without attemptsto validate the predictive accuracy of modelsbeing presented. However, most bioclimaticenvelope modelling studies (eg, Breretonet al., 1995; Huntley et al., 1995; Bakkeneset al., 2002; Midgley et al., 2002; Huntleyet al., 2004; Skov and Svenning, 2004) haveused the resubstitution approach, ie, usingthe same data for training and testing (Figure1a). Resubstitution is currently not considereda desirable option, because it tends to giveoptimistically biased estimates or error ratesand model performance (Harrellet al., 1996;Fielding and Bell, 1997; Olden et al., 2002;

    Thuiller, 2003; Muoz and Felicsmo, 2004;Thuiller, 2004).

    Data partitioning approaches in model val-idation have increasingly been applied to cir-cumvent the problems of resubstitutionapproach. The most commonly usedapproach is one-time data splitting (ie, splitsample; see Harrell et al., 1996; Guisan andZimmermann, 2000), whereby data is ran-domly split into calibration and validation sub-sets (Figure 1b). This approach has been used

    Figure 1 Three main approaches forcalibrating and validating species-climateenvelope models: (a) resubstitution; (b)one-time data splitting (split-sampleapproach); and (c) independentvalidationSource: Reprinted from Arajoet al.(2005a).

  • 8/2/2019 Archive Bio Climatic

    11/27

    Risto K. Heikkinen et al. 11

    in bioclimatic envelope modelling by Boxet al.(1993), Iverson and Prasad (1998), Berryet al.(2002), Pearsonet al. (2002; 2004), Thuiller(2003; 2004), Araujo et al. (2004), amongothers (for review, see Arajoet al., 2005a).

    Some of the studies have used bootstrap

    method (Peterson et al., 2002a; 2004;Peterson, 2003) or fourfold cross-validation(Luoto et al., 2005). However, as with datasplitting, bootstrapping and cross-validationtake samples randomly from the original data.Although they provide a more robust meas-ure of model performance than resubstitu-tion, they may also provide non-independentsamples that are vulnerable to certain pitfallsof correlative models based on geographicaldata, especially spatial autocorrelation.

    Spatial autocorrelation may bias the accuracyof models that are fitted with samples fromthe original data even when these areobtained with additional field sampling withinthe modelled region (Vaughan and Ormerod,2003; Arajoet al., 2005a).

    The best option to validate bioclimaticenvelope models is to use independent testdata collected from another region (eg,Beerling et al., 1995; Price, 2000; Randinet al., 2006) or from another point of time

    (Arajo et al., 2005a; Arajo and Rahbek,2006; see Figure 1c). One of the most impor-tant challenges in bioclimatic modelling is tounderstand the limitations of models by cali-brating and validating models with data thatare distinctively independent from each other.This is because predicting species distributionpatterns at one point of time distant from thatused to calibrate the models is one of the chiefpurposes of bioclimatic envelope models.To date, such studies have been very rare(but see Hillet al., 1999; Arajoet al., 2005a;2005b; Waltheret al., 2005).

    The results of Arajo et al. (2005a)showed that accuracies of correlative biocli-matic envelope models for 116 UK birds werealways higher when evaluated by one-timesplit sample than accuracy values derivedfrom fitting the calibrated model to the inde-pendent data recorded c. 15 years later than

    the calibration data. This result supportedconcerns that models predictive accuracymeasured by one-time data splitting (andother related validation methods) are likely toprovide generally overoptimistic assessmentof model performance on truly independent

    data. In contrast, using a physiologically basedclimatic envelope model (STASH), Waltheret al. (2005) compared detailed past climaticand distribution records ofIlex aquifoliumwith the current climate and distribution. Inthis case the model predicted well the newsuitable areas available for the species to becolonized. The authors concluded that a shiftin the northern margin ofIlex aquifolium, inconcert with increasing winter temperatures,was demonstrated.

    V Uncertainty issues in model buildingThere are several decisions which may con-tribute to uncertainty in the models andshould thus be taken into account when devel-oping species distribution models (includingbioclimatic envelope models) (Elith et al.,2002). Sources of uncertainty in bioclimaticmodels have received a considerable attentionin the statistical and ecological literature(Chatfield, 1995; Harrellet al., 1996; Buckland

    et al., 1997; Guisan and Zimmermann, 2000;Vaughan and Ormerod, 2003; Johnston andOmland, 2004; Rushton et al., 2004). Thisattention is warranted because there isincreasing evidence that the best modeldeveloped for a given studied region is onlyone among many alternative models.

    Research has been primarily concernedwith regression models (GLM, GAM).However, many of the methodological uncer-tainties discussed for these techniques applyto other approaches (Vaughan and Ormerod,2003). Factors contributing to the uncer-tainty of model projections include, for exam-ple, decisions associated with model building(a priori selected predictors versus manualselection versus automated model selection),collinearity, choice of the variable and modelselection criteria, identifying and excludingoutliers, autocorrelation and overdispersion,

  • 8/2/2019 Archive Bio Climatic

    12/27

    12 Methods and uncertainties in bioclimatic envelope modelling

    and finding a balance between underfitted(overly parsimonious) and overfitted (over-parametrized) models (Nicholls, 1989;Crawley, 1993; Chatfield, 1995; Mac Nally,2000; Elith et al., 2002; Heikkinen et al.,2004; Johnston and Omland, 2004).

    1 Model building approaches

    Basically, there are three approaches todevelop a single final model in a multipleregression setting: (i) a priori selection of a setof explanatory variables, (ii) manual modelbuilding, and (iii) automated model calibra-tion. Automated model building is a com-monly available option in the majority ofstatistical packages (eg, S-Plus, SPSS). It isalso embedded in some novel modelling

    frameworks such as GRASP (Lehmannet al.,2003) and BIOMOD (Thuiller, 2003). Thestrong point of this approach is that severalhundreds of species can be analysed and theirpotential range shifts examined within onecomputer run in a reasonable amount of time.Moreover, BIOMOD framework can runanalysis for each species using several differ-ent modelling techniques or different formsand parameterizations of one particularmethod (Thuiller, 2003).

    A disadvantage of the automated modelselection approach is that critical steps andalternative pathways in model building arenot as transparent and controlled as in manu-ally conducted modelling exercises.Automated selection of variables may thusresult into biologically implausible models andselection of irrelevant (or noise) variables(James and McCulloch, 1990; Pearce andFerrier, 2000b). Particularly, when modellinga limited set of indicator species vulnerable toclimate change, manual model building mayprovide an appropriate, hands-on processthat gives a better control to the modellerthan automated techniques and enables thedevelopment of ecologically plausible models(Nicholls, 1989; Crawley, 1993; Pausaset al.,2003). As highlighted by Leathwick et al.(1996), increasing sophistication in analyticaltools is no substitute for specific and detailed

    insight into the behaviour of complex ecolog-ical systems. Chatfield (1995) referred tomodel building as a process involving formu-lating, fitting and checking a model in an iter-ative and interactive way. However, whendealing with several hundreds species this

    approach can be time-consuming and difficultto apply. Moreover, both with manual andautomated model building the modeller needsto pay attention to other sources of uncer-tainty, such as collinearity and overfitting.

    The third option, a priori selection of a lim-ited number of predictors in the models, hasbeen used in a number of bioclimatic model-ling studies (eg, Beerling et al., 1995; Sykeset al., 1996; Hill et al., 2002; Huntley et al.,2004). A disadvantage of this approach is that

    it requires an a priori decision as to the selec-tion of bioclimatic variables to be included inthe model. However, when empirical infor-mation about the physiological limits con-straining species geographical distributionsare available, a priori selection of appropriatepredictor variables may well turn into anadvantage (Huntleyet al., 2004). Moreover,by using a limited number of physiologicallymeaningful variables the modeller may morereadily circumvent the potential collinearity

    and overfitting problems (cf. Mac Nally, 2000;Beaumont and Hughes, 2002; Beaumontet al., 2005). A possible shortcoming is that,when models are applied to new regions, thestudied species may respond to the a prioriselected variables differently. For example,species may show a strong correlation with aparticular predictor variable in one area andlittle response in another area (Osborne andSurez-Seoane, 2002; Pearson and Dawson,2003). Also it is possible that presence orabsence of a given species is highly correlatedwith climatic variables excluded from the apriori selected set. This may happen becauseof the correlations between the predictorvariables (Beerling et al., 1995) or because aparticular response is observed only in certainparts of the study area (see Osborne andSurez-Seoane, 2002, and referencestherein).

  • 8/2/2019 Archive Bio Climatic

    13/27

    Risto K. Heikkinen et al. 13

    2 Overfitting and model selection criteria

    One of the crucial issues in species distributionmodelling is the identification of the optimaltrade-off between producing underfitted andoverfitted models, and developing an under-standing of the factors that may result into

    these two phenomena (Harrell et al., 1996;Elith et al., 2002; Vaughan and Ormerod,2003). Some studies include what seems to bean exceeded number of climatic variables andare thus vulnerable to unbalanced observa-tions per predictors ratio, as well as to potentialoverfitting problems (see Breretonet al., 1995;Beaumont et al., 2005; Guisan and Thuiller,2005). Briefly, this means that overfitted mod-els including too many explanatory variablesare exceedingly complex and may begin to fit

    random noise in the data (Chatfield, 1995;Fielding, 2002). Although more complicatedmodels may appear to give a better fit, the pre-dictions they produce may be poorer(Chatfield, 1995). The study by Beaumontet al. (2005) suggests that the use of BIOCLIMwith all its 35 climatic variables may lead tooverfitting and therefore affect the potentialusefulness of projections. The authors alsoshowed that the size of the predicted speciesdistributions using all 35 variables were on

    average half of the size of the distributions pro-duced using only six relevant parameters.Certain modelling techniques (particularly

    CTA) have a higher tendency to overfittingthan others (Thuiller, 2003; Arajo et al.,2005a). However, many factors other thanthe choice of the technique can have animpact on the number of predictors selectedin the final models. One of these factorsincludes the model selection criteria. A tradi-tionalmodel selection approach in regressionmodelling is selecting significant predictorsusing forward or backward (or both) stepwiseprocedures. The decision as to whether avariable is included or dropped from themodel can be judged byF-statistic or 2 statis-tic (Crawley, 1993; Lehmann et al., 2003;Johnston and Omland, 2004).

    Recently, there has been a trend towardsusing Bayesian information criterion (BIC)

    (also known as Schwarz criterion) or Akaikesinformation criterion (AIC) in model selection(Rushtonet al., 2004). AIC has two compo-nents: negative log-likelihood, which meas-ures lack of model fit to the observed data,and a bias correction factor, which increases

    as a function of the number of model param-eters (Johnston and Omland, 2004). AIC isequivalent to twice the log-likelihood of themodel fitted plus two times the number ofparameters included in the model (Rushtonet al., 2004).

    BIC is superficially similar to AIC and hasalso two components: negative log-likelihood,and a penalty term that varies as a function ofsample size (increases as sample sizeincreases) and the total number of para-

    meters (Johnston and Omland, 2004).Advantages of AIC and BIC include the factthat they can be used to make inferencesfrom more than one model and they considerboth model fit and complexity simultaneously.Applying different model selectionapproaches to the same data may result indifferently parameterized models. Thus thechoice of model selection criterion is impor-tant and is likely to affect the accuracy ofpredictions. Bioet al. (2002) considered that

    BIC leads into more reliable and parsimonious(simpler) models (see also Buckland et al.,1997). Sometimes BIC may lead to too strict(underfitted) models that do not capturesome of the important relationships betweenresponse and predictor variables (Bio et al.,2002). According to Johnston and Omland(2004) AIC is a generally favoured option.However, the consequences of choosingamong the AIC, BIC or traditionalF- and 2statistics in model selection has been poorlyexamined and it remains a field with a needfor further inquiry.

    Recent work by Maggini et al. (2006) hasfurther challenged the model selectionapproaches by reporting that two alternativeapproaches that are currently available inGRASP, cross-selection and the Brutomethod (Venables and Ripley, 2002: 234),appear to have clear advantages over AIC,

  • 8/2/2019 Archive Bio Climatic

    14/27

    14 Methods and uncertainties in bioclimatic envelope modelling

    BIC or F-statistics. They concluded that,while AIC appears to be very conservativeand BIC too selective, cross-selection createsmore stable models (based on ROC statistics)and the Bruto method has the advantage ofproviding automatic selection of smoothing

    degrees of freedom and increasing computa-tional speed.

    3 Sample size

    The sample size also affects model selectionand the predictive performance of the model(Cumming, 2000; Peterson et al., 2002a;Arajoet al., 2005b). Stockwell and Peterson(2002) studied the effects of sample size onthe accuracy of species distribution modelsusing GARP, logistic regression, and a

    surrogate method with a single environmentalvariable. Their results suggest that machine-learning methods such as GARP can reach anear maximal average success at predictingspecies occurrences with 50 data points,whereas the remaining two methods reachedtheir maximum accuracy at c. 100 data points.This indicates that species distribution modelsbased on fewer data points than 100 may pro-vide less robust models.

    However, the assumption that a sample of

    100 data points provides a maximum potentialpredictive accuracy of models in all situationsis probably too optimistic. Pearce and Ferrier(2000b) showed that in their data a clearincrease in the discrimination performance ofthe models was observed between the 250and 500 sample sizes. The results ofMcPhersonet al. (2004) suggested that opti-mal models had large sample sizes, in theircase between 300 and 500 (see alsoCumming, 2000; Reeseet al., 2005). Arajoet al. (2005a) demonstrated that using a 70%random sample instead of the total 286110-km grid squares reduced the predictive per-formance of British bird models dramatically.

    These results indicate that the more dataare available for model building the better.However, increasing the sample size may alsoproduce some undesirable effects. This isbecause the relatively high sample size tends

    to lower p-values compared to smaller samplesizes (McBrideet al., 1993). When data con-sists of several thousands data points (whichis not unfamiliar with atlas data sets) it is easyto obtain statistically significant differences,even though the predictors account for only a

    minor part of the variation in the species dis-tribution data (cf. Crawley, 1993: 57). It isthus advisable to be cautious when buildingmodels with large data sets in order to avoidoverparameterized models that include vari-ables with little ecological relevance. Whenmodel selection is based on stepwise proce-dures using F-statistics or 2 statistics, thiscan be done by applying a more stringent p-value as the variable selection criteria thanthe commonly used 0.05. However, as

    regards AIC and BIC, very little research hasbeen carried out of their behaviour to varyingsample sizes. In the future, new work shouldbe addressed to assess relationships betweensample size and different model selection cri-teria so to understand their synenergeticimpacts on the accuracy of species-climatemodels.

    VI Multicollinearity andautocorrelation

    1 Multicollinearity

    Multicollinearity among predictors may ham-per the analysis of species-environment rela-tionships in multiple regression settings. Dueto collinearity, ecologically more causal vari-ables may be excluded from the models ifother intercorrelated variables explain thevariation in response variable better in statis-tical terms (Mac Nally, 2000; Luoto et al.,2002; Heikkinen et al., 2004). However, inbioclimatic modelling studies very little atten-tion has hitherto been paid to multicollinear-ity (Guisan and Thuiller, 2005; Luoto et al.,2006). Future contributions would berequired to amplify our better understandingof the possible biases in species-climate mod-elling arising from collinearity problems.

    Approaches to tackle collinearity includethe examination of the correlation or variation

  • 8/2/2019 Archive Bio Climatic

    15/27

    Risto K. Heikkinen et al. 15

    inflation factors (VIFs) between the predic-tors, and dropping some of the highly inter-correlated variables from the analysis(Cawsey et al., 2002; Elith and Burgman,2002) (but see Philippi, 1993). Data-reductiontechniques have also been applied, such as

    using principal components analysis (PCA) toreduce the dimensionality of the predictordata set (Gates and Donald, 2000; Surez-Seoane et al., 2002; Muoz and Felicsmo,2004). However, one disadvantage of usingPCA is the difficult interpretation of its out-puts. Furthermore, some variables that areirrelevant to a species distribution may con-tribute to the principal components (Gatesand Donald, 2000; Vaughan and Ormerod,2003), creating a false expectation of their

    relevance for species distribution modelling.Recent developments have provided alter-native means of addressing collinearity withinspecies distribution models. When aiming atprediction with regression analysis, valuableinsights can be developed by sequentialregression and structural equation modelling(Graham, 2003). Collinearity can also beaddressed by variation partitioning (Borcardet al., 1992) and hierarchical partitioningmethods (Chevan and Sutherland, 1991; Mac

    Nally, 2000), which are designed to providemore in depth understanding of the explana-tory powers of predictors (Watson andPeterson, 1999). These methods provide adissection of the variation in response vari-able(s) into independent components whichreflect the relative importance of individualpredictors or groups of predictors and theirjoint effects (Anderson and Gribble, 1998;Cushman and McGarigal, 2004; Heikkinenet al., 2004; 2005). In a recent work, Gibsonet al. (2004) provided a useful approach tospecies distribution modelling by combininglogistic regression, model selection based onAIC and hierarchical partitioning into thesame modelling framework. Hierarchical par-titioning was used to support the identifica-tion of predictor variables most likely toinfluence the variation in the target-speciesdistribution.

    2 Spatial autocorrelation

    Autocorrelation is a frequently observed fea-ture in spatially sampled biogeographical data(Diniz-Filhoet al., 2003), which can hamperattempts to identify plausible relationshipsbetween species distributions and explana-

    tory variables (Legendre, 1993; Seguradoet al., 2006). Due to spatial autocorrelation,values of particular variables in neighbouringsites are more or less similar than they wouldbe in a random set of observations (Legendre,1993).

    Recent biogeographical studies haveincreasingly addressed spatial autocorrelationin broad-scale biodiversity modelling (eg,Selmi and Boulinier, 2001; Lichstein et al.,2002; Diniz-Filho et al., 2003; Diniz-Filho

    and Bini, 2005; Ferrer-Castn and Vetaas,2005). There are several differentapproaches to explore spatial structure in thedata, including: (i) applying generalized leastsquares (GLS), where spatial correlationstructure is incorporated assuming exponen-tial, spherical or Gaussian relationshipsbetween error terms and geographical dis-tances (Selmi and Boulinier, 2001; Diniz-Filhoand Bini, 2005); (ii) using geographicalcoordinates of the sampled data points and

    their higher and cross-product terms in themodelling exercise (trend-surface analysis)and associated variation partitioning analysis(eg, Heikkinen and Birks, 1996; Lichsteinet al., 2002; Titeux et al., 2004; Ferrer-Castn and Vetaas, 2005); and (iii) compar-ing semivariograms or Morans Icoefficientsof the field data and of the model residuals tosee how much of the spatial autocorrelationstructure in the species data is accounted forby the environmental variables (Bio et al.,2002; Hawkins et al., 2003). According toDiniz-Filho et al. (2003) and Ferrer-Castnand Vetaas (2005), strong correspondencebetween the spatial structures of bothspecies data and environmental data mayincrease the danger that accounting statisti-cally for spatial component (eg, by partialregression or spatial GLS) downplays thesignificance of environmental variables.

  • 8/2/2019 Archive Bio Climatic

    16/27

    16 Methods and uncertainties in bioclimatic envelope modelling

    A recent novel approach to use eigenvector-based spatial filters that are capable to cap-ture spatial structures at different scales(Borcard and Legendre, 2002; Griffiths,2003; Borcard et al., 2004; Diniz-Filho andBini, 2005). These filters can be used as pre-

    dictors in (partial) multiple regression analysisto take the spatial autocorrelation intoaccount as effective as possible (Diniz-Filhoand Bini, 2005).

    However, these recent developments inaccounting for spatial autocorrelation haspredominantly concerned species richnessmodelling. As regards species distributionmodelling and particularly bioclimatic enve-lope modelling, progress has been more mod-est (but see Arajo et al., 2005a; Segurado

    et al., 2006). In species-environment model-ling one of the approaches to account for thepatch-like autocorrelation in the data is to useautologistic models in which information ofthe response variable from the neighboringsampling points is used to produce a summary(autocovariate) variable (Augustin et al.,1996; Heikkinenet al., 2004). Another optionis to use spatial autoregressive models inwhich a spatial autocorrelation (SCA) term isadded to the linear predictor to reduce the

    spatial structure in the model residuals(Lichstein et al., 2002; Guisan and Thuiller,2005).

    Overall, there is a need for additional stud-ies assessing the importance of autocorrela-tion within bioclimatic modelling and moreimportantly to investigate approaches tocircumvent or deal with the issue. This isbecause species-climate studies are oftenbased on atlas data sets that are most likelyvulnerable to autocorrelation (Diniz-Filho andBini, 2005). However, integrating spatialautocorrelation effects into bioclimaticmodelling in a context of global change can beproblematic. More specifically, it is unlikelythat spatial structure described under currentconditions will be maintained in the futurebecause they indirectly reflect the effects ofhistorical, dispersal and environmental fac-tors (Guisan and Thuiller, 2005).

    VII Species geographical and ecologicalcharacteristics and performance ofspecies-climate modelsBy definition, bioclimatic envelope modelsexamine the relationship between climaticvariables and species distributions. However,

    there is evidence that the performance ofspecies-climate models may be influenced bythe characteristics of the species distributionpatterns. Stockwell and Peterson (2002)noted that predictive accuracy of GARP wasnot independent of range size; widespreadspecies were modelled less accurately. Similarresults had been discussed by Arajo andWilliams (2000), using logistic regression.Fielding and Bell (1997), Manel et al. (2001)and McPherson et al. (2004) argued that

    species prevalence can also have an impact onthe accuracy of the species distribution mod-els, measured either by Cohens kappa orAUC. However, the results of these paperswere not fully coincident. According toManelet al. (2001) kappa was only marginallyaffected by prevalence and AUC values werewholly independent of it. In contrast,McPhersonet al. (2004) argued that kappa issensitive to variation in prevalence values andAUC provides a more reliable measure of

    model performance. McPhersonet al. (2004)concluded that models perform best whenprevalence is intermediate (see also Virkkalaet al., 2005).

    Other species spatial characteristics mightalso affect model performance (Brotonset al.,2004; Segurado and Arajo, 2004; Luotoet al., 2005). In a simulation experiment,Reeseet al. (2005) showed that model accu-racy was positively related to the level of con-tiguity in the distribution maps. This suggeststhat species with high spatial contiguity mightbe better modelled than species with low con-tiguity. Brotonset al. (2004) reported that thepredictive accuracy of the models was gener-ally higher for more marginal species (margin-ality distance of species mean distributionin environmental space relation to the globalmean). In a comprehensive modelling study,Segurado and Arajo (2004) investigated the

  • 8/2/2019 Archive Bio Climatic

    17/27

    Risto K. Heikkinen et al. 17

    effects of two species geographical character-istics (area of occupancy and extent of occu-pancy) and two species-environmentalcharacteristics (marginality and niche breath)on models accuracy. Their results indicate aclear trend towards increasing model perform-

    ance for restricted-range species and decreas-ing performance for widespread species.Segurado and Arajo (2004) also noted thatmodel performance is higher for species withhigh environmental marginality and low nichebreath than for generalist species.

    Three recent studies have also demon-strated the sensitivity of model projections forthe geographical distribution and ecologicalproperties of the target species (Kadmonet al., 2003; Luotoet al., 2005; Thuilleret al.,

    2005a). Kadmon et al. (2003) showed thatspecies characterized by high prevalencewithin a limited range of climatic conditionswere modelled with higher precision thanrarer ones with wide climatic ranges; thusniche breath had a negative effect on modelaccuracy. The results of Luoto et al. (2005)suggested that the performance of species-climate models for boreal butterflies werenegatively correlated with latitudinal range(geographical extent of distribution) and

    prevalence, and positively with spatial auto-correlation (clumping of distribution) (Figure2). In other words, species at the margin oftheir range or with low prevalence were bet-ter predicted than widespread species, andspecies with clumped distributions betterthan widely scattered dispersed species. Theoverall message emerging from these studies,as well as other recent studies discussed here,is that species geographical attributes can sig-nificantly influence the behaviour and uncer-tainty of pure species-climate models, whichshould be taken into account in assessmentsof climate change impacts.

    VIII Sampling and delineating climaticpredictors and equilibrium of speciesdistributions with climateAn accurate description of species-environ-ment relationships requires that samples are

    Figure 2 The effects of (A) latitudinalrange (geographical extent ofdistribution); (B) prevalence; and (C)clumping of distribution (measured asspatial autocorrelation) on the accuracyof species-climate models for thedistributions of 98 butterfly species in

    Finland. Accuracy of models wasmeasured by area under curve (AUC)values from the ROC plots. Results areshown both as resubstitution (opensymbols) and cross-validation (solidsymbols) statisticsSource: Reprinted from Luotoet al.(2005).

  • 8/2/2019 Archive Bio Climatic

    18/27

    18 Methods and uncertainties in bioclimatic envelope modelling

    taken across the complete gradient of environ-mental space in which species occur. Suchsampling should include sites defining theboundaries of species environmental distribu-tions (Vaughan and Ormerod, 2003). Kadmonet al. (2003) reported that the climatic bias in

    sampling can have a significant effect on theaccuracy of model predictions and reducemodel performance. Thuiller et al. (2004c)examined the consequences of restricting therange of environmental conditions over whichspecies-climate models are developed. Theauthors showed that the incomplete samplingof the climatic range can strongly influence theestimation of response curves, especiallytowards upper and lower ends of environmen-tal ranges (see also Pearsonet al., 2006). This

    may reduce the applicability of the models forpredictive purposes and produce spurious pro-jections of species distributions into thefuture. In conclusion, projections of speciesfuture distributions should be evaluated care-fully if the model calibration data is not cover-ing the full range of environmental gradientswhich species inhabit.

    Incomplete coverage of projected futureclimates can also produce prediction errors.For example, in the Northern Hemisphere

    species are generally projected to movenorthwards. In many studies, southernmostregions are projected to experience a futureclimate that has no modern analogue in thestudy region. In such cases modelling resultsprovide incomplete information of howspecies might respond to non-analogue situa-tions (Stersdal et al., 1998). This may leadto the projections of species loss in the south-ern quarters of the study area being an arti-fact. This issue has been discussed in some ofthe bioclimatic envelope modelling papers (eg,Bakkeneset al., 2002; Petersonet al., 2002b).However, only Stersdalet al. (1998) have toour knowledge treated the problem in anexplicit manner, by excluding those parts ofthe study area which were projected to havea future climate with no modern analogue.

    It is possible that different types of climaticpredictors other than those usually used in

    bioclimatic modelling might provide better cor-relations with the species distributions.Particularly the long-term (eg, 30 years) aver-age data commonly used may not show theeffects of certain coincidences, such as asequence of cool, short summers or growth

    seasons (Bakeret al., 2000). Establishment of,for example, insect populations may depend onvery short-term weather events. Moreover,Hill et al. (1999) pointed out that meanmonthly climate values cannot reflect short-term climatic events and extreme conditionsthat may have an important influence uponpopulation dynamics of butterflies. A period ofparticularly wet and cold summer weathermay depress local populations to levels wherethey can be vulnerable to extinction.

    A general assumption in bioclimatic model-ling is that species distributions are at equilib-rium with current climate. Recent studieshave asked how distant from equilibrium arecurrent distributions of species, and furtherquestioned whether possible deviations fromequilibrium would produce important biases inspecies-climate model projections (Svenningand Skov, 2004; Arajo and Pearson, 2005).Using species data from a 50-km grid systemin Europe and a pattern-based analysis

    (Mantel test), Arajo and Pearson (2005)showed that the degree of covariationbetween four studied species groups andclimate varied notably. Covariation wasstrongest between plant and bird speciescomposition and climate, whereas the rela-tionships between reptiles, amphibians andclimate were weaker. These findings led to theconclusion that the two latter groups wouldbe more likely to have distributions that departfrom equilibrium assumptions, possible due tolower dispersal abilities. The authors alsoconcluded that such a departure would affectthe reliability in which bioclimatic models cap-ture the full responses of species to climate.

    IX Effects of non-climatic factorsand scaleIn the species-climate modelling literatureit has been increasingly highlighted and

  • 8/2/2019 Archive Bio Climatic

    19/27

    Risto K. Heikkinen et al. 19

    occasionally examined (see Iverson andPrasad, 1998; Pearson et al., 2004; Thuilleret al., 2004a; Virkkala et al., 2005; Luotoet al., 2006) that many factors other thanclimate may have an important role in explain-ing species geographical distributions. Such

    effects need to be taken into account whendeveloping future projections of species dis-tributions. Iverson and Prasad (1998) showedthat in many cases a combination of climateand edaphic variables was necessary toachieve the best CTA models for the studiedtree species (see also Coudun et al., 2006).Huntley (1995) proposed that, although anappropriate model relating geographical distri-bution of plants to environment may includeonly macroclimatic variables, in the case of

    birds such a model may need also to includestructural attributes and taxonomic composi-tion of vegetation. Crick (2004) emphasizedthat the current distributions of birds may bedue factors other than climate, for examplepast persecution (Milvus milvus in the UK asan example). In such cases purely species-climate models based on current distributionscan yield incomplete indications of thespecies climatic needs.

    However, most attention has hitherto

    focused on integrating the possible impacts ofland cover into bioclimatic modelling studies.Bakkeneset al. (2002) noted that conclusionsof the pure species-climate models on coastalregions should be treated carefully because insuch regions species distributions are limitedgeographically by the sea and not by theclimate. In fact, only a few biogeographicalstudies have explored the importance of cli-mate versus land cover for modelling speciesdistributions (but see H-Acevedo and Currie,2003; Stefanescuet al., 2004; Thuilleret al.,2004a; Bomhardet al., 2005). Moreover, it isimportant to note that the variation of impor-tance of climate and land-cover factors is verylikely a scale-dependent phenomenon (cf.Wiens, 1989; Pearsonet al., 2004). The rela-tive roles of land cover and scale have beeninsufficiently examined at different scales(but see Thuilleret al., 2003a). However, the

    limited results available so far suggest that, atthe coarse European resolution, the predic-tive power of bioclimatic models are generallynot greatly improved by the inclusion of land-cover variables (Thuiller et al., 2004a),whereas at 10 km or higher resolution the

    integration of land-cover data can improvespatial predictions for certain species (Pearsonet al., 2004; Virkkalaet al., 2005; Luotoet al.,2006).

    The importance of land cover at varyingscales is likely to vary among different groupsof organisms and may also be related to thehabitat specifity of species. Virkkala et al.(2005) showed that the distribution patternsof marshland birds in boreal regions at thescale of 10 10 km reflect the interplay

    between habitat availability and climatic vari-ables; however, the coverage of marshlandhabitats was clearly the most important pre-dictor for species distributions. Luoto et al.(2006) modelled the distributions of 98 borealbutterfly species at the same resolution usingclimate and land-cover information. Althoughcertain butterfly species showed clear corre-lations with some land-cover variables, mostof the variation in butterfly distributionsappeared to be associated with climate, par-

    ticularly growing degree-days and mean tem-perature of the coldest month.The results of Thuiller et al. (2004a),

    Pearsonet al. (2004), Virkkalaet al. (2005) andLuoto et al. (2006) are concordant with thecurrent paradigm that climate governs thespecies distributions at broad biogeographicalscales (Currie, 1991; Huntley et al., 1995;Parmesan, 1996), whereas land-cover and spa-tial distribution of suitable habitats determinespecies occupancy at finer spatial resolution(Hill et al., 1999; Bailey et al., 2002; Pearsonet al., 2004; Luotoet al., 2006). However, ourunderstanding of how the relative importanceof land cover versus climate varies over a rangeof scales and between different species groupsis still incomplete, and more multiscale assess-ments are needed (cf. Rahbek and Graves,2001; Rahbek, 2005) if progress in modellingspecies distributions is to be expected.

  • 8/2/2019 Archive Bio Climatic

    20/27

    20 Methods and uncertainties in bioclimatic envelope modelling

    X ConclusionsAll modelling approaches have their advan-tages and disadvantages. The choice of themodelling approach has to consider a range offactors including the breadth of ecologicalknowledge, existing distribution data, spatial

    and temporal scale as well as goals of themodelling (Pearson and Dawson, 2003).Bioclimatic models allow consideration ofhow climate change may affect many speciessimultaneously and provide estimates ofpotential change in species richness andecosystems in a given region (eg, Crumpackeret al., 2001; Peterson et al., 2002b; Pearsonand Dawson, 2003; Beaumont et al., 2005).These models are useful first filters for iden-tifying locations and species that may be

    greater risk and provide first approximationsas to the impact of climate change on speciesranges (Pearson and Dawson, 2003; Thuilleret al., 2005b). Bioclimatic models may also beinformative when investigating the likelihoodthat particular change in climate might affectspecies distributions (Arajo et al., 2006).However, due to numerous sources of uncer-tainty, the models and their results shouldonly be applied with a thorough understand-ing of the limitations involved (Pearson and

    Dawson, 2003; Thuiller, 2003; 2004; Arajoet al., 2005a; 2005b).Sources of uncertainty reviewed here indi-

    cate that in applying correlative species-climate models we need to have broadunderstanding of the wide range of method-ological issues that may affect the usefulnessof models; only then should we be able tocircumvent the pitfalls associated with model-ling species responses to climate change.Furthermore, it is clear that more realisticsimulations of the impacts of climate changeon species range shifts require addressing thecomplex interactions between the many fac-tors affecting distributions (Pearson andDawson, 2003). Many recent papers havehighlighted the need for research aiming todevelop approaches that integrate factorssuch as land cover, biotic interactions and dis-

    persal mechanisms into pure species-climatemodels (Pearson et al., 2002; Guisan andThuiller, 2005). An increasing number ofstudies have started to address these chal-lenges (eg, Leathwick and Austin, 2001;Iverson et al., 2004; Pearson et al., 2004;

    Pearson and Dawson, 2005), but there isstill substantial research required. In fact,developing hybrid-models that bringtogether the best of correlative bioclimaticmodelling with the best of mechanistic andtheoretical models is one of the most impor-tant challenges for modellers (Arajo et al.,2006).

    Nevertheless, it is important to acknowl-edge that natural systems are not closed,hence it is not possible to account for all

    potential driving forces of species distribu-tions (Arajoet al., 2005a). Thus, errors arean inherent property of bioclimatic models,whether they are correlative, mechanistic ortheoretical, and the primary value of modelsis very likely more heuristic than predictive(Arajoet al., 2005b). By and large, becauseof the various sources of uncertainty dis-cussed here it may be conceptually inade-quate to use the projections of bioclimaticmodels as face value for making predictions of

    future events. However, as argued byWhittakeret al. (2005), when the limitationsof models are understood, we should be in abetter position to make the best use of theirresults. Last but not least, a plea made byArajoet al. (2005b) deserves to be put for-ward also here: more empirical evidenceneeds to be gathered to reinforce the confi-dence of bioclimatic models and their predic-tions. By providing repeated evidence of theirvalue we should be in a better position to usetheir outputs with confidence.

    Acknowledgements

    This research was funded by the EC FP6Integrated Project ALARM (GOCE-CT-2003-506675). MBA thanks RKH forinvitation to visit the Finnish EnvironmentalInstitute in May, 2004.

  • 8/2/2019 Archive Bio Climatic

    21/27

    Risto K. Heikkinen et al. 21

    ReferencesAnderson, M.J. and Gribble, N.A. 1998: Partitioning

    the variation among spatial, temporal and environ-mental components in a multivariate data set.Australian Journal of Ecology 23, 15867.

    Anderson, R.P., Gmez-Laverde, M. and Peterson,A.T. 2002a: Geographical distributions of spiny

    pocket mice in South America: insights from predic-tive models. Global Ecology and Biogeography 11,13141.

    Anderson, R.P., Lew, D. and Peterson, A.T. 2003:Evaluating predictive models of speciesdistributions:criteria for selecting optimal models. EcologicalModelling 162, 21132.

    Anderson, R.P., Peterson, A.T. and Gmez-Laverde, M. 2002b: Using niche-based GIS model-ing to test geographic predictions of competitiveexclusion and competitive release in South Americanpocket mice. Oikos 98, 316.

    Anonymous 2001: S-PLUS 6 for Windows Guide toStatistics, Volume 1. Seattle, WA: InsightfulCorporation.

    Arajo, M.B. and New, M. 2006: Ensemble forecast-ing of species distributions. TREE, in press.

    Arajo, M.B. and Pearson, R.G. 2005: Equilibrium ofspecies distributions with climate. Ecography 28,69395.

    Arajo, M.B. and Rahbek, C. 2006: How does climatechange affect biodiversity? Science 313, 139697.

    Arajo, M.B. and Williams, P.H. 2000: Selectingareas for species persistence using occurrence data.Biological Conservation 96, 33145.

    Arajo, M.B., Cabeza, M., Thuiller, W., Hannah,L. and Williams, P.H. 2004: Would climate change

    drive species out of reserves? An assessment of exist-ing reserve-selection methods. Global Change Biology10, 161826.

    Arajo, M.B., Pearson, R.G., Thuiller, W. andErhard, M. 2005a: Validation of species-climateimpact models under climate change. Global ChangeBiology 11, 150413.

    Arajo, M.B., Thuiller, W. and Pearson, R.G. 2006:Climate warming and the decline of amphibians andreptiles in Europe. Journal of Biogeography 33,171228.

    Arajo, M.B., Whittaker, R.J., Ladle, R.J. andErhard, M. 2005b: Reducing uncertainty in projec-

    tions of extinction risk from climate change. GlobalEcology and Biogeography 14, 52938.Augustin, N., Mugglestone, M.A. and Buckland,

    S.T. 1996: An autologistic model for spatial distribu-tion of wildlife. Journal of Applied Ecolog y 33,33947.

    Bailey, S.-A., Haines-Young, R.H. and Watkins, C.2002: Species presence in fragmented landscapes:modelling of species requirements at the nationallevel.Biological Conservation 108, 30716.

    Baker, R.H.A., Sansford, C.E., Jarvis, C.H.,Cannon, R.J.C., MacLeod, A. and Walters,K.F.A. 2000: The role of climatic mapping inpredicting the potential geographical distribution ofnon-indigenous pests under current and futureclimates.Agriculture Ecosystems and Environment 82,5771.

    Bakkenes, M., Alkemade, J., Ihle, F., Leemans, R.and Latour, J. 2002: Assessing the effects of fore-casted climate change on the diversity and distribu-tion of European higher plants for 2050. GlobalChange Biology 8, 390407.

    Beaumont, L.J. and Hughes, L. 2002: Potentialchanges in the distributions of latitudinally restrictedAustralian butterfly species in response to climatechange. Global Change Biology 8, 95471.

    Beaumont, L.J., Hughes, L. and Poulsen, M. 2005:Predicting species distributions: use of climaticparameters in BIOCLIM and its impact on predictionsof speciescurrent and future distributions.EcologicalModelling 186, 25069.

    Beerling, D.J., Huntley, B. and Bailey, J.P. 1995:Climate and the distribution ofFallopia japonica: useof an introduced species to test the predictive capac-ity of response surfaces.Journal of Vegetation Science6, 26982.

    Berry, P.M., Dawson, T.P., Harrison, P.A. andPearson, R.G. 2002: Modelling potential impactsof climate change on the bioclimatic envelope ofspecies in Britain and Ireland. Global Ecology andBiogeography 11, 45362.

    Bio, A.M.F., De Becker, P., De Bie, E., Huybrechts,W. and Wassen, M. 2002: Prediction of plantspecies distribution in lowland river valleys in

    Belgium: modelling species response to site condi-tions.Biodiversity and Conservation 11, 2189216.Bomhard, B., Richardson, D.M., Donaldson, J.S.,

    Hughes, G.O., Midgley, G.F., Raimondo, D.C.,Rebelo, A.G., Rouget, M. and Thuiller, W. 2005:Potential impacts of future land use and climatechange on the Red List status of the Proteaceae in theCape Floristic Region, South Africa. Global ChangeBiology 11, 145268.

    Borcard, D. and Legendre, P. 2002: All-scale spatialanalysis of ecological data by means of principal coor-dinates of neighbour matrices. Ecological Modelling153, 5168.

    Borcard, D., Legendre, P., Avois-Jacquet, C. andTuomisto, H. 2004: Dissecting the spatial structureof ecological data at multiple scales. Ecology 85,182632.

    Borcard, D., Legendre, P. and Drapeau, P. 1992:Partialling out the spatial component of ecologicalvariation.Ecology 73, 104555.

    Box, E.O., Crumpacker, D.W. and Hardin, E.D.1993: A climatic model for location of plant species inFlorida, USA.Journal of Biogeography 20, 62944.

  • 8/2/2019 Archive Bio Climatic

    22/27

    22 Methods and uncertainties in bioclimatic envelope modelling

    1999: Predicted effects of climatic change ondistribution of ecologically important native treeand shrub species in Florida. Climatic Change 41,21348.

    Brereton, R., Bennett, S. and Mansergh, I. 1995:Enhanced greenhouse climate change and its poten-tial effect on selected fauna of south-eastern

    Australia: a trend analysis.Biological Conservation 72,33954.Brotons, L., Thuiller, W., Arajo, M.B. and Hirzel,

    A.H. 2004: Presence-absence versus presence-onlyhabitat suitability models: the role of species ecologyand prevalence.Ecography 27, 16572.

    Buckland, S.T., Burnham, K.P. and Augustin, N.H.1997: Model selection: an integral part of inference.Biometrics 53, 60318.

    Burns, C.E., Johnston, K.M. and Schmitz, O.J.2003: Global climate change and mammalian speciesdiversity in USA national parks. Proceedings of theNational Academy of Sciences 100, 1147477.

    Busby, J.R. 1991: BIOCLIM A bioclimate analysis

    and prediction system. In Margules, C.R. and Austin,M.P., editors, Nature conservation: Cost effective bio-logical surveys and data analysis, Australia: CSIRO,6468.

    Carpenter, G., Gillison, A.N. and Winter, J. 1993:DOMAIN: a flexible modelling procedure for map-ping potential distributions of plants and animals.Biodiversity and Conservation 2, 66780.

    Cawsey, E.M., Austin, M.P. and Baker, B.L. 2002:Regional vegetation mapping in Australia: a casestudy in the practical use of statistical modelling.Biodiversity and Conservation 11, 223974.

    Chatfield, C. 1995: Model uncertainty, data mining and

    statistical inference. Journal of the Royal StatisticalSociety: Series A 158, 41966.Chevan, A. and Sutherland, M. 1991: Hierarchical

    partitioning. The American Statistician 45, 9096.Cleveland, W.S. and Devlin, S.J. 1988: Locally

    weighted regression: an approach to regression analy-sis by local fitting. Journal of American StatisticalAssociation 83, 596610.

    Congalton, R.G. 1991: A review of assessing theaccuracy of classifications of remotely sensed data.Remote Sensing of Environment 37, 3546.

    Coudun, C., Gegout, J.-C., Piedallu, C. andRameau, J.-C. 2006: Soil nutritional factorsimprove models of plant species distribution: an illus-tration with Acer campestre L. in France. Journal ofBiogeography 33, 175063.

    Crawley, M.J. 1993: GLIM for ecologists. Oxford:Blackwell.

    Crick, H.Q.P. 2004: The impact of climate change onbirds.Ibis 146 (Supplement 1), 4856.

    Crumpacker, D.W., Box, E.O. and Hardin, E.D.2001: Implications of climatic warming for conserva-tion of native trees and shrubs in Florida.Conservation Biology 15, 100820.

    Cumming, G.S. 2000: Using between-model compar-isons to fine-tune linear models of species ranges.Journal of Biogeography 27, 44155.

    Currie, D.J. 1991: Energy and large-scale patterns ofanimal- and plant-species richness. AmericanNaturalist 137, 2749.

    Cushman, S.A. and McGarigal, K. 2004: Hierarchical

    analysis of forest bird species-environment relation-ships in the Oregon coast range. EcologicalApplications 14, 1090105.

    Davis, A.J., Lawton, J.H., Shorrocks, B. andJenkinson, L.S. 1998: Individualistic speciesresponses invalidate simple physiological models ofcommunity dynamics under global environmentalchange.Journal of Animal Ecology 67, 60012.

    DeAth, G. and Fabricius, K.E. 2000: Classificationand regression trees: a powerful yet simple techniquefor ecological data analysis.Ecology 81, 317892.

    Diniz-Filho, J.A.F. and Bini, L.M. 2005: Modellinggeographical patterns in species richness using eigen-vector-based spatial filters. Global Ecology andBiogeography 14, 17785.

    Diniz-Filho, J.A.F., Bini, L.M. and Hawkins, B.A.2003: Spatial autocorrelation and red herrings in geo-graphical ecology. Global Ecology and Biogeography12, 5364.

    Dirnbck, T., Dullinger, S. and Grabherr, G. 2003: Aregional impact assessment of climate and land-usechange on alpine vegetation.Journal of Biogeography30, 40117.

    Elith, J. and Burgman, M. 2002: Predictions and theirvalidation: rare plants in the Central Highlands,Victoria, Australia. In Scott, J.M., Heglund, P.J.,Morrison, M.L., Haufler, J.B., Raphael, M.G., Wall,

    W.A. and Samson, F.B., editors, Predicting speciesoccurrences. Issues of accuracy and scale, Washington,DC: Island Press, 30313.

    Elith, J., Burgman, M.A. and Regan, H.M. 2002:Mapping epistemic uncertainties and vague conceptsin predictions of species distributions. EcologicalModelling 157, 31329.

    Elith, J., Graham, C. H., Anderson, R.P., Dudk,M., Ferrier, S., Guisan, A., Hijmans, R.J.,Huettmann, F., Leathwick, J.R., Lehmann, A.,Li, J., Lohmann, L.G., Loiselle, B.A., Manion,G., Moritz, G., Nakamura, M., Nakazawa, Y.,Overton, J. McC. M., Peterson, A.T., Phillips,S.J., Richardson, K., Scachetti-Pereira, R.,Schapire, R.E., Sobern, J., Williams, S., Wisz,M.S. and Zimmermann, N.E. 2006: Novel meth-ods improve prediction of species distributions fromoccurrence data.Ecography 29, 12951.

    Erasmus, B.F.N., Van Jaarsveld, A.S., Chown,S.L., Kshatriya, M. and Wessels, K.J. 2002:Vulnerability of South African animal taxa to climatechange. Global Change Biology 8, 67993.

    Ferrer-Castn, D. and Vetaas, O.R. 2005:Pteridophyte richness, climate and topography in the

  • 8/2/2019 Archive Bio Climatic

    23/27

    Risto K. Heikkinen et al. 23

    Iberian Peninsula: comparing spatial and nonspatialmodels of richness patterns. Global Ecology andBiogeography 14, 15565.

    Fielding, A. and Bell, J. 1997: A review of methods forthe assessment of prediction errors in conservationpresence/absence models.Environmental Conservation24, 3849.

    Fielding, A.H. 2002: What are the appropriate charac-teristics of an accuracy measure? In Scott, J.M.,Heglund, P.J., Morrison, M.L., Haufler, J.B., Raphael,M.G., Wall, W.A. and Samson, F.B., editors,Predicting species occurrences. Issues of accuracy and

    scale, Washington: Island Press, 27180.Franklin, J. 1995: Predictive vegetation mapping: geo-

    graphic modelling of biospatial patterns in relation toenvironmental gradients. Progress in PhysicalGeography 19, 47499.

    Friedman, J., Hastie, T. and Tibshirani, R. 2000:Additive logistic regression: a statistical view ofboosting.Annals of Statistics 28, 33774.

    Gates, S. and Donald, P.F. 2000: Local extinction of

    British farmland birds and the prediction of furtherloss.Journal of Applied Ecology 37, 80620.

    Gavin, D.G. and Hu, F.S. 2005: Bioclimatic modellingusing Gaussian mixture distributions and multiscalesegmentation. Global Ecology and Biogeography 14,491501.

    Gibson, L.A., Wilson, B.A., Cahill, D.M. and Hill,J. 2004: Spatia