introduction
![Page 1: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/1.jpg)
Introduction to gradient analysis
![Page 2: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/2.jpg)
Community concept
(from Mike Austin)
![Page 3: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/3.jpg)
Continuum concept
(from Mike Austin)
![Page 4: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/4.jpg)
The real situation is somewhere between and more complicated
![Page 5: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/5.jpg)
Originally (and theoretically)
• Community concept as a basis for classification
• Continuum concept as a basis for ordination or gradient analysis
![Page 6: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/6.jpg)
In practice
• I need a vegetation map (or categories for a nature conservation agency) - I will use classification
• I am interested in transitions, gradients, etc. - let's go for gradient analysis (ordination)
![Page 7: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/7.jpg)
Methods of the gradient analysis
[Figure: species cover (0 to 40) plotted against pH (5 to 9)]

| No. of environmental variables (data used in calculations) | No. of species (data used in calculations) | A priori knowledge of species-environment relationships | Method | Result |
|---|---|---|---|---|
| 1, n | 1 | no | Regression | Dependence of the species on environmental variables |
| None | n | yes | Calibration | Estimates of environmental values |
| None | n | no | Ordination | Axes of variability in species composition |
| 1, n | n | no | Constrained ordination | Variability in species composition explained by environmental variables, and relationship of environmental variables to the species data |
![Page 8: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/8.jpg)
Over a short gradient, the linear response is a good approximation; over a long gradient, it is not.
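A small numerical sketch can illustrate this point. It assumes a Gaussian (unimodal) species response with a hypothetical optimum of 50 and tolerance of 10, and compares how well a straight line fits the response over a short stretch of the gradient (35 to 40) and over a long one (0 to 100). All names and values are illustrative and do not come from the slides.

```python
import math

def gaussian_response(x, optimum=50.0, tolerance=10.0, maximum=5.0):
    """Unimodal (Gaussian) species response along a gradient (assumed shape)."""
    return maximum * math.exp(-((x - optimum) ** 2) / (2 * tolerance ** 2))

def linear_r2(xs, ys):
    """Squared correlation, i.e. R2 of a straight-line fit of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy ** 2 / (sxx * syy)

# Hypothetical sampling positions along the gradient.
short_gradient = [35 + 0.5 * i for i in range(11)]  # 35 to 40
long_gradient = [5.0 * i for i in range(21)]        # 0 to 100

r2_short = linear_r2(short_gradient, [gaussian_response(x) for x in short_gradient])
r2_long = linear_r2(long_gradient, [gaussian_response(x) for x in long_gradient])
print(f"R2 of linear fit, short gradient: {r2_short:.3f}")
print(f"R2 of linear fit, long gradient:  {r2_long:.3f}")
```

On the short stretch the line fits almost perfectly, while on the long gradient, which covers both sides of the optimum, it explains almost nothing.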
![Page 9: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/9.jpg)
However
• In most cases, neither the linear nor the unimodal response model is a sufficient description of reality for all the species
• I use methods based on either of the models not because I believe that all the species behave according to the model, but because I see them as a reasonable compromise between reality and simplicity.
![Page 10: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/10.jpg)
Estimating species optima by the weighted averaging method
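Optimum:

$$WA(Sp) = \frac{\sum_{i=1}^{n} Env_i \times Abund_i}{\sum_{i=1}^{n} Abund_i}$$

Tolerance:

$$S.D. = \sqrt{\frac{\sum_{i=1}^{n} (Env_i - WA(Sp))^2 \times Abund_i}{\sum_{i=1}^{n} Abund_i}}$$

“Weighted averaging regression”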
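As a minimal sketch, the optimum and tolerance can be computed directly as an abundance-weighted mean and an abundance-weighted standard deviation of the environmental values. The data below are hypothetical and chosen only so that the arithmetic is easy to follow; the function names are illustrative.

```python
import math

def wa_optimum(env, abund):
    """Species optimum: abundance-weighted average of the environmental values."""
    return sum(e * a for e, a in zip(env, abund)) / sum(abund)

def wa_tolerance(env, abund):
    """Species tolerance: abundance-weighted standard deviation around the optimum."""
    optimum = wa_optimum(env, abund)
    weighted_ss = sum((e - optimum) ** 2 * a for e, a in zip(env, abund))
    return math.sqrt(weighted_ss / sum(abund))

# Hypothetical data: environmental value and species abundance in five cases.
env = [20, 40, 60, 80, 100]
abund = [1, 3, 4, 3, 1]

optimum = wa_optimum(env, abund)      # 720 / 12
tolerance = wa_tolerance(env, abund)  # sqrt(5600 / 12)
print(f"optimum = {optimum:.1f}, tolerance = {tolerance:.2f}")
```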
![Page 11: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/11.jpg)
[Figure: species abundance (0 to 5) plotted against the environmental variable (0 to 200)]

$$WA(Sp) = \frac{\sum_{i=1}^{n} Env_i \times Abund_i}{\sum_{i=1}^{n} Abund_i} = 564 / 9.4 = 60$$
![Page 12: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/12.jpg)
[Figure: species abundance (0 to 5) plotted against the environmental variable (0 to 200)]
The techniques based on the linear response model are suitable for homogeneous data sets; the weighted averaging techniques are suitable for more heterogeneous data.
![Page 13: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/13.jpg)
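Calibrations (using weighted averages)

$$WA(Samp) = \frac{\sum_{i=1}^{s} IV_i \times Abund_i}{\sum_{i=1}^{s} Abund_i}$$

| Species | Nitrogen IV | Sample 1 | IV × abund. | Sample 2 | IV × abund. |
|---|---|---|---|---|---|
| Drosera rotundifolia | 1 | 2 | 2 | 0 | 0 |
| Andromeda polifolia | 1 | 3 | 3 | 0 | 0 |
| Vaccinium oxycoccus | 1 | 5 | 5 | 0 | 0 |
| Vaccinium uliginosum | 3 | 2 | 6 | 1 | 3 |
| Urtica dioica | 8 | 0 | 0 | 5 | 40 |
| Phalaris arundinacea | 7 | 0 | 0 | 5 | 35 |
| Total | | 12 | 16 | 11 | 78 |
| Nitrogen (WA) | | 1.333 (= 16/12) | | 7.091 (= 78/11) | |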
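The calibration for the two samples can be reproduced in a few lines. This is a sketch using the nitrogen indicator values (IV) and abundances from the table; the function and variable names are illustrative.

```python
def wa_calibration(indicator_values, abundances):
    """Estimate an environmental value for a sample as the
    abundance-weighted average of the species indicator values."""
    weighted_sum = sum(iv * a for iv, a in zip(indicator_values, abundances))
    return weighted_sum / sum(abundances)

# Species order: Drosera rotundifolia, Andromeda polifolia, Vaccinium oxycoccus,
# Vaccinium uliginosum, Urtica dioica, Phalaris arundinacea.
nitrogen_iv = [1, 1, 1, 3, 8, 7]
sample_1 = [2, 3, 5, 2, 0, 0]
sample_2 = [0, 0, 0, 1, 5, 5]

nitrogen_1 = wa_calibration(nitrogen_iv, sample_1)  # 16 / 12
nitrogen_2 = wa_calibration(nitrogen_iv, sample_2)  # 78 / 11
print(f"Sample 1: {nitrogen_1:.3f}, Sample 2: {nitrogen_2:.3f}")
```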
![Page 14: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/14.jpg)
Ordination diagram
[Figure: ordination diagram with points for Cactus, Nymphaea, Urtica, Drosera, Menyanthes, Comarum, Chenopodium and Aira]
![Page 15: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/15.jpg)
Ordination diagram
[Figure: the same ordination diagram (Cactus, Nymphaea, Urtica, Drosera, Menyanthes, Comarum, Chenopodium, Aira) with the axes interpreted as Nutrients and Water]
Proximity means similarity
![Page 16: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/16.jpg)
Terminology
• Old CANOCO: samples, species, environmental variables
• New Canoco 5: general terminology (observational units, variables). In the book, we use cases, response variables and predictors, but you can decide what their names will be; so, if you prefer, you can use samples, species and environmental variables
• You can also use a third table, traits
![Page 17: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/17.jpg)
Two formulations of the ordination problem
1. Find a configuration of cases in the ordination space so that the distances between cases in this space correspond best to the dissimilarities of their species composition. This is explicitly done by the multidimensional scaling methods (metric and non-metric). It requires a measure of dissimilarity between cases.
2. Find "latent" variable(s) (ordination axes) which represent the best predictors for the values of all the species. This approach requires the model of species response to such latent variables to be explicitly specified.
![Page 18: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/18.jpg)
The linear response model is used for linear ordination methods, the unimodal response model for weighted averaging methods. In linear methods, the case score is a linear combination (weighted sum) of the species (response variable) scores. In weighted averaging methods, the case score is a weighted average of the species scores (after some rescaling).
Note: The weighted averaging algorithm contains an implicit standardization by both cases and species. In contrast, in linear ordination we can choose between the standardized and non-standardized forms.
![Page 19: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/19.jpg)
Transformation is an algebraic function X'ij = f(Xij) which is applied to each value independently of the other values. Standardization is done either with respect to the values of other species in the case (standardization by cases) or with respect to the values of the species in other cases (standardization by response variables).
Quantitative data
Centering means the subtraction of a mean so that the resulting variable (species) or case has a mean of zero. Standardization usually means division of each value by the case (species) norm or by the total of all the values in a case (sum of response variable (species) values).
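The difference between these operations can be sketched on a small table. The abundance table below is hypothetical (rows are cases, columns are species), and log(x + 1) is used only as one common example of a transformation; it is an assumption, not a choice stated in the slides.

```python
import math

# Hypothetical abundance table: rows are cases, columns are species.
table = [
    [0.0, 3.0, 9.0],
    [1.0, 1.0, 2.0],
    [4.0, 0.0, 0.0],
]

def log_transform(matrix):
    """Transformation: each value is changed independently, here by log(x + 1)."""
    return [[math.log(value + 1.0) for value in row] for row in matrix]

def center_by_species(matrix):
    """Centering: subtract the species (column) mean, so each species has mean 0."""
    n_cases = len(matrix)
    means = [sum(column) / n_cases for column in zip(*matrix)]
    return [[value - mean for value, mean in zip(row, means)] for row in matrix]

def standardize_cases_by_norm(matrix):
    """Standardization by cases: divide each value by the norm of its case."""
    result = []
    for row in matrix:
        norm = math.sqrt(sum(value ** 2 for value in row))
        result.append([value / norm for value in row])
    return result

def standardize_cases_by_total(matrix):
    """Standardization by cases: divide each value by the total of its case."""
    return [[value / sum(row) for value in row] for row in matrix]

transformed = log_transform(table)
centered = center_by_species(table)
by_norm = standardize_cases_by_norm(table)
by_total = standardize_cases_by_total(table)
```

The same functions applied to the transposed table would give centering by cases and standardization by species.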
![Page 20: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/20.jpg)
Weighted averaging methods correspond to the use of the chi-square distance.
Note that double standardization (by total) is implicit in this distance measure and, consequently, in all the methods based on it (this also follows from the weighted averaging algorithm)
![Page 21: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/21.jpg)
From a practical point of view
• Whenever you use ordination based on weighted averaging, you compare the relative representation of species in your cases (these methods are most often used for classical cases × species matrices)
![Page 22: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/22.jpg)
The two formulations may lead to the same solution. (If cases of similar species composition were distant on an ordination axis, this axis could hardly serve as a good predictor of their species composition.) For example, principal component analysis can be formulated as a projection in Euclidean space, or as a search for a latent variable when a linear response is assumed.
By specifying species response, we implicitly specify the (dis)similarity measure
![Page 23: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/23.jpg)
[Figure: scatterplot of Species 2 (0 to 9) against Species 1 (2 to 10), with a “good” axis and a “bad” axis drawn through the cases]
A “good” axis conserves the original distances and is also a good predictor of individual species; a “bad” axis does neither.
![Page 24: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/24.jpg)
[Figure: scatterplot of Species 2 against Species 1, with two further panels plotting Species 1 and Species 2 against the good axis (2 to 14)]
A “good” axis conserves the original distances and is also a good predictor of individual species.
![Page 25: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/25.jpg)
The “bad” axis is useless as a predictor of species representation
[Figure: scatterplot of Species 2 against Species 1, with two further panels plotting Species 1 and Species 2 against the bad axis (7.0 to 9.6)]
![Page 26: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/26.jpg)
If the variables (species) are independent, it is difficult to find a “good” axis: whatever we choose, the distances are not conserved and the axis is not a good predictor
[Figure: scatterplot of Species 2 (0 to 9) against Species 1 (-2 to 10) for independent species]
![Page 27: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/27.jpg)
The result of the ordination will be the values of this latent variable for each case, called the case scores, and the species (response variable) scores for each species [estimates of the species optimum on that variable in unimodal methods; characteristics of linear dependence in linear methods]. Further, we require that the species optima be correctly estimated from the case scores (by weighted averaging) and that the case scores be correctly estimated as weighted averages of the species scores (species optima). This can be achieved by the following iterative algorithm:
![Page 28: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/28.jpg)
Step 1 Start with some (arbitrary) initial case scores {xi}
Step 2 Calculate new species scores {yi} by [weighted averaging] regression from {xi}
Step 3 Calculate new case scores {xi} by [weighted averaging] calibration from {yi}
Step 4 Remove the arbitrariness in the scale by standardizing case scores (stretch the axis)
Step 5 Stop on convergence, else GO TO Step 2

$$\text{eigenvalue} = \frac{x_{max} - x_{min}}{\text{length}}$$
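The five steps can be sketched for a single axis as follows. This is an illustration under stated assumptions, not Canoco's implementation: the standardization in Step 4 stretches the scores back to a fixed axis length (10 is an arbitrary choice), the eigenvalue is read as the ratio of the range of the case scores after Steps 2 and 3 to that axis length, and the small cases × species table is hypothetical.

```python
def weighted_average(weights, scores):
    """Average of scores weighted by abundances."""
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)

def first_axis(table, length=10.0, tolerance=1e-10, max_iterations=1000):
    """Iterate Steps 1 to 5 for one axis; table rows are cases, columns are species."""
    n_cases = len(table)
    columns = list(zip(*table))  # one tuple of abundances per species
    # Step 1: arbitrary initial case scores, spread over the axis length.
    x = [length * i / (n_cases - 1) for i in range(n_cases)]
    y = []
    eigenvalue = 0.0
    for _ in range(max_iterations):
        # Step 2: species scores by weighted averaging regression.
        y = [weighted_average(column, x) for column in columns]
        # Step 3: case scores by weighted averaging calibration.
        x_new = [weighted_average(row, y) for row in table]
        # The shrinkage of the range during Steps 2 and 3 estimates the eigenvalue.
        spread = max(x_new) - min(x_new)
        eigenvalue = spread / length
        # Step 4: stretch the axis back to its original length.
        lowest = min(x_new)
        x_new = [(value - lowest) * length / spread for value in x_new]
        # Step 5: stop on convergence, else repeat from Step 2.
        converged = max(abs(a - b) for a, b in zip(x_new, x)) < tolerance
        x = x_new
        if converged:
            break
    return x, y, eigenvalue

# Hypothetical table with a clear species turnover along the cases.
table = [
    [5, 3, 0, 0],
    [2, 5, 3, 0],
    [0, 3, 5, 2],
    [0, 0, 3, 5],
]
case_scores, species_scores, eigenvalue = first_axis(table)
print([round(score, 2) for score in case_scores], round(eigenvalue, 3))
```

For this table the cases end up ordered along the axis in the same order as the species turnover, and the eigenvalue lies between 0 and 1.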
![Page 29: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/29.jpg)
[Figure: an axis from 0 to 10 before Steps 1 to 3, and the narrower range from xmin to xmax after them]
![Page 30: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/30.jpg)
The length of an axis is often arbitrary (but there are exceptions – see later on)
The orientation of axes is arbitrary (what is important are the relative positions of the objects)
![Page 31: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/31.jpg)
The larger the eigenvalue, the greater the explanatory power of the axis. The amount of variability explained is proportional to the eigenvalue.
In weighted averaging, eigenvalues are < 1 (= 1 only for perfect partitioning).
In CANOCO, linear methods are scaled so that the total of eigenvalues = 1 (this is not the case in some other programs)
[Figure: cases × species table of presences (x) and absences (0) illustrating perfect partitioning]
![Page 32: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/32.jpg)
The second, third, etc. axes
• Are obtained in an analogous way, after partialling out the variability explained by the first axis.
• The second axis is linearly independent of the first axis (i.e. correlation coefficient r=0), the third of the first two, etc.
• The second axis never explains more than the first, so the eigenvalue of the first axis is the largest one, and the eigenvalues of the subsequent axes never increase.
![Page 33: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/33.jpg)
Constrained ordination
The axis is a linear combination of measured explanatory variables (linear combination = a X1 + b X2 + c X3)
Step 1 Start with some (arbitrary) initial case scores {xi}
Step 2 Calculate new species scores {yi} by [weighted averaging] regression from {xi}
Step 3 Calculate new case scores {xi} by [weighted averaging] calibration from {yi}
Step 4 Remove the arbitrariness in the scale by standardizing case scores (stretch the axis)
Step 5 Stop on convergence, else GO TO Step 2
![Page 34: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/34.jpg)
Constrained ordination
The axis is a linear combination of measured explanatory variables (linear combination = a X1 + b X2 + c X3)
Step 1 Start with some (arbitrary) initial case scores {xi}
Step 2 Calculate new species scores {yi} by [weighted averaging] regression from {xi}
Step 3 Calculate new case scores {xi} by [weighted averaging] calibration from {yi}
Step 3a Calculate a multiple regression of the case scores {xi} on the explanatory variables and take the fitted values of this regression as the new case scores.
Step 4 Remove the arbitrariness in the scale by standardizing case scores (stretch the axis)
Step 5 Stop on convergence, else GO TO Step 2
![Page 35: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/35.jpg)
CaseR and CaseE
• Step 3a Calculate a multiple regression of the case scores {xi} on the explanatory variables and take the fitted values of this regression as the new case scores.
• CaseR score = score based on species composition, i.e. before regression (where the case is according to its species composition [or generally, the response variables])
• CaseE score = fitted value, i.e. a linear combination of the explanatory variables (where the case should be according to the fitted model)
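The relation between the two kinds of scores can be sketched by looking at Step 3a in isolation. The example is deliberately simplified: it uses a single hypothetical explanatory variable and ordinary (unweighted) least squares, so it shows only the principle of replacing CaseR scores by fitted values, not the exact regression used by Canoco. All values are made up for illustration.

```python
def fitted_case_scores(case_r, env):
    """Regress the case scores on one explanatory variable and
    return the fitted values (a sketch of Step 3a with one predictor)."""
    n = len(case_r)
    mean_env = sum(env) / n
    mean_score = sum(case_r) / n
    sxx = sum((e - mean_env) ** 2 for e in env)
    sxy = sum((e - mean_env) * (s - mean_score) for e, s in zip(env, case_r))
    slope = sxy / sxx
    intercept = mean_score - slope * mean_env
    return [intercept + slope * e for e in env]

# Hypothetical CaseR scores (from species composition) and explanatory variable.
case_r = [1.0, 2.0, 2.5, 4.5]
env = [1.0, 2.0, 3.0, 4.0]

case_e = fitted_case_scores(case_r, env)
print([round(score, 2) for score in case_e])
```

The CaseE scores lie exactly on a line in the explanatory variable, while the CaseR scores scatter around it.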
![Page 36: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/36.jpg)
Maximum number of constrained axes
• Equals the number of independent predictors (explanatory variables); the higher axes are then calculated as unconstrained
• The first unconstrained axis can have a higher eigenvalue than the preceding constrained axes
![Page 37: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/37.jpg)
Basic ordination techniques

| | Linear methods | Weighted averaging |
|---|---|---|
| Unconstrained | Principal Components Analysis (PCA) | Correspondence Analysis (CA) |
| Constrained | Redundancy Analysis (RDA) | Canonical Correspondence Analysis (CCA) |

Detrending
Hybrid analyses
![Page 38: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/38.jpg)
PCA CA
RDA CCA
![Page 39: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/39.jpg)
Detrending: the second axis is BY DEFINITION linearly independent of the first, but this does not prevent quadratic dependence
![Page 40: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/40.jpg)
Let’s take a hammer
Done in each iteration; it often forces the algorithm to find an ecologically meaningful gradient
![Page 41: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/41.jpg)
And straighten the axis
Detrending by segments (highly non-parametric) or by polynomials
Despite its very “heuristic” nature, detrending often makes the second axis interpretable
![Page 42: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/42.jpg)
Detrending by segments is connected with non-linear rescaling into so-called s.d. units
The idea
1 s.d.
If the species response along a gradient has the shape of the probability density of a normal distribution, the “niche width” can be characterized by its “s.d.”; the average s.d. (across all the species) is the s.d. unit
![Page 43: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/43.jpg)
Simplicity vs. realism
• In unimodal methods, species points are species optima (assuming a symmetrical species response); in linear methods, arrows show the direction of the species' linear response
• It would be nice to include more complicated (and realistic) species response, but imagine how complicated the ordination diagram would be, if we decided to include asymmetric responses, bimodal responses, curvilinear responses etc.
![Page 44: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/44.jpg)
Two approaches
Having both environmental (explanatory) data and data on species composition, we can first calculate an unconstrained ordination and then calculate a regression of the ordination axes on the measured environmental variables (i.e. project the environmental variables into the ordination diagram), or we can directly calculate a constrained ordination.
(D)CA with passively projected explanatory variables
or
CCA
![Page 45: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/45.jpg)
The two approaches are complementary and both should be used! By calculating the unconstrained ordination, we do not miss the main part of the variability in species composition, but we can miss that part of the variability that is related to the measured environmental variables.
By calculating a constrained ordination, we do not miss the main part of the biological variability explained by the environmental variables, but we can miss the main part of the variability that is not related to the measured environmental variables.
![Page 46: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/46.jpg)
What shall we do with categorical variables?
![Page 47: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/47.jpg)
[Figure: Scatterplot (Spreadsheet1 10v*10c) of Var2 (0 to 12) against Var1 (-0.2 to 1.2), with the fitted line Var2 = 4.2+3.6*x]
![Page 48: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/48.jpg)
ANOVA, grouping = Var4
Regression Summary for Dependent Variable: Var7 (Spreadsheet1), independent variables Var5 and Var6
R = .88898086, R2 = .79028698, Adjusted R2 = .73036897
F(2,7) = 13.189, p < .00422, Std. Error of estimate: 1.3452
![Page 49: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/49.jpg)
Dummy variables

| Case | 4 groundrock | 5 basalt | 6 granit | 7 limestone | 8 biomass |
|---|---|---|---|---|---|
| 1 | basalt | 1 | 0 | 0 | 2 |
| 2 | basalt | 1 | 0 | 0 | 3 |
| 3 | basalt | 1 | 0 | 0 | 4 |
| 4 | granit | 0 | 1 | 0 | 2 |
| 5 | granit | 0 | 1 | 0 | 5 |
| 6 | granit | 0 | 1 | 0 | 6 |
| 7 | limestone | 0 | 0 | 1 | 7 |
| 8 | limestone | 0 | 0 | 1 | 8 |
| 9 | limestone | 0 | 0 | 1 | 9 |
| 10 | limestone | 0 | 0 | 1 | 8 |

In Canoco 5 (not in older versions), factors are expanded simply by assigning the factor attribute to a variable; nevertheless, the calculations are done with the dummy variables
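The expansion of a factor into dummy variables, and its equivalence to a one-way ANOVA, can be sketched with the groundrock and biomass values from the table. The code relies on the fact that a least-squares regression on the dummy variables of a single factor yields the group means as fitted values; the variable names are illustrative, and this is not how Canoco is implemented.

```python
groundrock = ["basalt"] * 3 + ["granit"] * 3 + ["limestone"] * 4
biomass = [2, 3, 4, 2, 5, 6, 7, 8, 9, 8]

# Expand the factor into one 0/1 dummy variable per level.
levels = sorted(set(groundrock))
dummies = {level: [1 if g == level else 0 for g in groundrock] for level in levels}

# Fitted values of the regression on the dummies are the group means.
group_means = {
    level: sum(b for b, d in zip(biomass, dummies[level]) if d) / sum(dummies[level])
    for level in levels
}
fitted = [
    sum(dummies[level][i] * group_means[level] for level in levels)
    for i in range(len(biomass))
]

grand_mean = sum(biomass) / len(biomass)
ss_total = sum((b - grand_mean) ** 2 for b in biomass)
ss_residual = sum((b - f) ** 2 for b, f in zip(biomass, fitted))
r2 = 1 - ss_residual / ss_total

# Only (number of levels - 1) dummies are independent, hence 2 and 7 degrees of freedom.
df_model = len(levels) - 1
df_residual = len(biomass) - len(levels)
f_value = ((ss_total - ss_residual) / df_model) / (ss_residual / df_residual)
print(f"R2 = {r2:.8f}, F({df_model},{df_residual}) = {f_value:.3f}")
```

With these data the sketch gives the same R2 and F(2,7) as the regression summary quoted on the ANOVA slide, which suggests that summary was computed from this table.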
![Page 50: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/50.jpg)
Predictors and response are correlated, and the distribution is usually non-normal. Use the distribution-free Monte Carlo permutation test.
![Page 51: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812de6550346895d9340e8/html5/thumbnails/51.jpg)
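Monte Carlo permutation test

| Nitrogen | Plant height (as measured) | 1st permutation | 2nd permutation | 3rd permutation | 4th permutation | 5th etc. |
|---|---|---|---|---|---|---|
| 5 | 3 | 3 | 8 | 5 | 5 | ... |
| 7 | 5 | 8 | 5 | 5 | 8 | ... |
| 6 | 5 | 4 | 4 | 3 | 4 | ... |
| 10 | 8 | 5 | 3 | 8 | 5 | ... |
| 3 | 4 | 5 | 5 | 4 | 3 | ... |
| F-value | 10.058 | 0.214 | 1.428 | 4.494 | 0.826 | 0.### |

$$P = \frac{1 + \text{no. of permutations where } (F \geq 10.058)}{1 + \text{total number of permutations}}$$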
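The test can be sketched with the nitrogen and plant height values from the table. The F statistic is that of a simple linear regression of height on nitrogen; the number of permutations (499) and the random seed are arbitrary assumptions, and the P value follows the formula (1 + number of permutations with F at least as large as observed) / (1 + total number of permutations).

```python
import random

def f_statistic(x, y):
    """F statistic of a simple linear regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    r2 = sxy ** 2 / (sxx * syy)
    return r2 / (1 - r2) * (n - 2)

nitrogen = [5, 7, 6, 10, 3]
height = [3, 5, 5, 8, 4]

f_observed = f_statistic(nitrogen, height)
f_first = f_statistic(nitrogen, [3, 8, 4, 5, 5])  # the 1st permutation in the table

rng = random.Random(1)   # fixed seed (assumption) so that the run is repeatable
n_permutations = 499     # arbitrary choice for this sketch
n_as_extreme = 0
for _ in range(n_permutations):
    shuffled = height[:]
    rng.shuffle(shuffled)  # break the link between height and nitrogen
    # Small tolerance so that ties with the observed F are counted despite rounding.
    if f_statistic(nitrogen, shuffled) >= f_observed - 1e-12:
        n_as_extreme += 1

p_value = (1 + n_as_extreme) / (1 + n_permutations)
print(f"F = {f_observed:.3f}, P = {p_value:.3f}")
```

With only five cases there are just 120 distinct orderings, so in practice all of them could be enumerated instead of sampled; random shuffling is kept here because it mirrors the Monte Carlo procedure.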