New Methods in Ecology: Complex statistical tests, and why we should be cautious!
Post on 20-Dec-2015
![Page 1: New Methods in Ecology Complex statistical tests, and why we should be cautious!](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d4c5503460f94a2a8df/html5/thumbnails/1.jpg)
New Methods in Ecology
Complex statistical tests, and why we should be cautious!
![Page 2: New Methods in Ecology Complex statistical tests, and why we should be cautious!](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d4c5503460f94a2a8df/html5/thumbnails/2.jpg)
Complex tests
• Logistic Regression
• Principal Components Analysis
• Cluster Analysis
Multivariate
• Multivariate tests mean you have a single explanatory variable, but multiple response variables.
![Page 3: New Methods in Ecology Complex statistical tests, and why we should be cautious!](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d4c5503460f94a2a8df/html5/thumbnails/3.jpg)
Logistic Regression
![Page 4: New Methods in Ecology Complex statistical tests, and why we should be cautious!](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d4c5503460f94a2a8df/html5/thumbnails/4.jpg)
Logistic Regression
Insects were exposed to a pesticide to determine the effectiveness of the treatment. The response is the number of dead individuals in each batch.
Dose   Dead   Batch
1      2      100
3      10     90
10     49     98
30     96     100
100    98     100
![Page 5: New Methods in Ecology Complex statistical tests, and why we should be cautious!](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d4c5503460f94a2a8df/html5/thumbnails/5.jpg)
Linear regression on the proportion killed vs dose
P(kill) = ax + b
[Figure: proportion killed vs dose, with the fitted straight line]
At dose 0, the predicted proportion killed is less than 0 (negative deaths?), and above dose 4 the model predicts > 100% mortality!
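The problem with the straight line can be reproduced from the table on the previous slide. This is a minimal sketch, not the author's analysis: it assumes the plot's x-axis is the natural log of dose (the slide does not say, but the "dose 0" and "dose 4" in the text suggest it), and `linear_p_kill` is a name made up for illustration.

```python
import math

# Data from the slide: dose, number dead, batch size.
dose = [1, 3, 10, 30, 100]
dead = [2, 10, 49, 96, 98]
batch = [100, 90, 98, 100, 100]

# ASSUMPTION: the x-axis is natural-log dose (not stated on the slide).
x = [math.log(d) for d in dose]
y = [k / n for k, n in zip(dead, batch)]  # proportion killed per batch

# Ordinary least squares for P(kill) = a*x + b.
x_mean = sum(x) / len(x)
y_mean = sum(y) / len(y)
s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
s_xx = sum((xi - x_mean) ** 2 for xi in x)
a = s_xy / s_xx
b = y_mean - a * x_mean


def linear_p_kill(log_dose):
    """Straight-line prediction; nothing keeps it inside 0..1."""
    return a * log_dose + b


print(f"a = {a:.3f}, b = {b:.3f}")
for xi in (0.0, 2.0, 4.0, math.log(100)):
    print(f"log dose {xi:.2f}: predicted P(kill) = {linear_p_kill(xi):.3f}")
```

On these numbers the fitted line dips just below 0 at log dose 0 and goes above 1 at the highest dose, which is the impossible behaviour the slide complains about.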
![Page 6: New Methods in Ecology Complex statistical tests, and why we should be cautious!](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d4c5503460f94a2a8df/html5/thumbnails/6.jpg)
Need to ensure the model is bounded by 0 and 1, so build a new equation:
$$P(\text{kill}) = 1 - P(\text{survived}), \qquad P(\text{survived}) = \frac{e^{ax+b}}{1 + e^{ax+b}}$$
No longer have impossible predictions, and the model fits better
[Figure: P(kill) vs dose, with the fitted curve]
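A quick way to see why this equation cannot give impossible predictions is to evaluate it over a wide range of x. The sketch below uses the slide's parameterisation; the coefficients `a` and `b` are hypothetical values picked only to show the shape, not estimates from the data.

```python
import math


def p_survived(x, a, b):
    """Logistic curve from the slide: e^(ax+b) / (1 + e^(ax+b))."""
    eta = a * x + b
    # Two algebraically equal forms, chosen to avoid overflow in exp().
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    z = math.exp(eta)
    return z / (1.0 + z)


def p_kill(x, a, b):
    """P(kill) = 1 - P(survived)."""
    return 1.0 - p_survived(x, a, b)


# HYPOTHETICAL coefficients, for illustration only.
a, b = -1.5, 3.0
for x in (-10, 0, 2, 4, 10):
    print(x, round(p_kill(x, a, b), 4))
```

However extreme x gets, the result stays between 0 and 1, and with a negative `a` the predicted kill rises with x.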
![Page 7: New Methods in Ecology Complex statistical tests, and why we should be cautious!](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d4c5503460f94a2a8df/html5/thumbnails/7.jpg)
Can now look at what proportion would be killed at a particular dosage
$$P(\text{kill}) = 1 - P(\text{survived}), \qquad P(\text{survived}) = \frac{e^{ax+b}}{1 + e^{ax+b}}$$
[Figure: P(kill) vs dose, with the fitted curve]
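To read off the proportion killed at a particular dosage, a and b first have to be estimated. The following is one possible sketch, fitting the slide's equation to the pesticide table by maximum likelihood with Newton-Raphson, using only the standard library. It assumes dose enters on the natural-log scale, and the function names (`fit_logistic`, `p_kill_at_dose`) are invented for this example; a statistics package would normally do this step.

```python
import math

# Data from the pesticide slide.
dose = [1, 3, 10, 30, 100]
dead = [2, 10, 49, 96, 98]
batch = [100, 90, 98, 100, 100]

# ASSUMPTION: dose is modelled on the natural-log scale.
x = [math.log(d) for d in dose]
survived = [n - k for n, k in zip(batch, dead)]


def p_survived(xi, a, b):
    """e^(a*x+b) / (1 + e^(a*x+b)), written in an equivalent form."""
    return 1.0 / (1.0 + math.exp(-(a * xi + b)))


def fit_logistic(x, successes, trials, iterations=50):
    """Binomial maximum-likelihood fit of a and b by Newton-Raphson."""
    a, b = 0.0, 0.0
    for _ in range(iterations):
        g_a = g_b = h_aa = h_ab = h_bb = 0.0
        for xi, s, n in zip(x, successes, trials):
            p = p_survived(xi, a, b)
            resid = s - n * p        # observed minus expected survivors
            w = n * p * (1.0 - p)    # binomial variance weight
            g_a += resid * xi
            g_b += resid
            h_aa += w * xi * xi
            h_ab += w * xi
            h_bb += w
        # Solve the 2x2 system H * step = gradient.
        det = h_aa * h_bb - h_ab * h_ab
        step_a = (h_bb * g_a - h_ab * g_b) / det
        step_b = (h_aa * g_b - h_ab * g_a) / det
        a, b = a + step_a, b + step_b
        if abs(step_a) < 1e-10 and abs(step_b) < 1e-10:
            break
    return a, b


a, b = fit_logistic(x, survived, batch)


def p_kill_at_dose(d):
    """Predicted proportion killed at dose d: 1 - P(survived)."""
    return 1.0 - p_survived(math.log(d), a, b)


print(f"a = {a:.3f}, b = {b:.3f}")
for d in (1, 5, 10, 50, 100):
    print(f"dose {d}: predicted P(kill) = {p_kill_at_dose(d):.3f}")
```

Because survival is what the slide's equation models, the fitted `a` comes out negative: survival falls, and kill rises, as dose increases.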
![Page 8: New Methods in Ecology Complex statistical tests, and why we should be cautious!](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d4c5503460f94a2a8df/html5/thumbnails/8.jpg)
Logistic regression issues…
• Implementing and coding the model can be difficult
• Can be tough to work through the equation
• Is it easier to design around the issue?
• Use the same number in each batch, use “number dead” as the response variable?
Dose   Dead   Batch
1      2      100
3      10     100
10     49     100
30     96     100
100    98     100
$$\#\text{Killed} = ax + b$$
$$P(\text{survived}) = \frac{e^{ax+b}}{1 + e^{ax+b}}$$
![Page 9: New Methods in Ecology Complex statistical tests, and why we should be cautious!](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d4c5503460f94a2a8df/html5/thumbnails/9.jpg)
Multivariate Statistics
• Single explanatory variable, multiple response variables
• Multivariate tests can be useful and insightful
• Can be deeply confusing
• Very often misused
• Difficult to explain the results
• Used to mask bad designs, confuse/impress stupid people.
![Page 10: New Methods in Ecology Complex statistical tests, and why we should be cautious!](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d4c5503460f94a2a8df/html5/thumbnails/10.jpg)
Parrots in Bonaire
www.parrotwatch.org
Sam Williams
Sam collected a load of data on different aspects of the birds’ biology
![Page 11: New Methods in Ecology Complex statistical tests, and why we should be cautious!](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d4c5503460f94a2a8df/html5/thumbnails/11.jpg)
Parrots in Bonaire
• What to do with all this?
• 1 descriptive variable (nest)
• Multiple response variables
• Principal component analysis…
![Page 12: New Methods in Ecology Complex statistical tests, and why we should be cautious!](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d4c5503460f94a2a8df/html5/thumbnails/12.jpg)
Principal Component Analysis
• Obtains values for as many principal components as there are response variables
• Each PC accounts for some more of the total variation
• Each nest has a PC value for each PC
• Each response variable has a rotation value for each PC
• What do these PC values and rotation values relate to?
• God knows
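Even if their biological meaning is murky, the PC values and rotation values themselves are easy to compute and inspect. Here is a minimal sketch using NumPy's SVD on made-up numbers: the six "nests" and three variable names are hypothetical and are not Sam Williams's data. Standardising each variable first is a choice made for this example, not something the slides specify.

```python
import numpy as np

# MADE-UP measurements: 6 nests (rows) x 3 response variables (columns).
# Variable names are hypothetical, for illustration only.
variables = ["clutch_size", "cavity_depth_cm", "entrance_height_m"]
data = np.array([
    [3, 45.0, 2.1],
    [4, 60.0, 3.0],
    [2, 38.0, 1.8],
    [4, 72.0, 3.4],
    [3, 51.0, 2.6],
    [2, 40.0, 1.5],
])

# Centre and scale each variable so no single unit dominates.
z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)

# SVD of the standardised data: rows of vt are the principal axes.
u, s, vt = np.linalg.svd(z, full_matrices=False)

rotation = vt.T                     # row per response variable, column per PC
scores = z @ rotation               # row per nest, column per PC ("PC values")
explained = s**2 / np.sum(s**2)     # share of total variation per PC

print("Proportion of variation per PC:", np.round(explained, 3))
for name, row in zip(variables, rotation):
    print(f"{name:>18}: rotation = {np.round(row, 3)}")
print("PC values for nest 1:", np.round(scores[0], 3))
```

Each rotation column gives the weights that combine the standardised variables into one PC, and each nest's PC value is that weighted sum. The sign of any PC is arbitrary, which is part of why interpretation is hard.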
![Page 13: New Methods in Ecology Complex statistical tests, and why we should be cautious!](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d4c5503460f94a2a8df/html5/thumbnails/13.jpg)
Principal Component Output
[Figure: scree plot; x-axis: principal component]
Scree plot: the first few principal components account for much of the variation
![Page 14: New Methods in Ecology Complex statistical tests, and why we should be cautious!](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d4c5503460f94a2a8df/html5/thumbnails/14.jpg)
Principal Component Output
Biplot of the first 2 principal components
Can be used to look for correlations
Some significance tests (redundancy analysis)
Lots of noise!
![Page 15: New Methods in Ecology Complex statistical tests, and why we should be cautious!](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d4c5503460f94a2a8df/html5/thumbnails/15.jpg)
Other use of PCA
• Each nest/individual/replicate has a value for each principal component
• Can use these values as a response variable, and subject them to other tests
• Called “Dimensionality Reduction”
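The idea of collapsing many response variables into one PC value per replicate can be sketched as follows. Everything here is simulated: 30 replicates, 200 response variables driven by one hidden factor, and a hypothetical second measurement called `other` to stand in for whatever the PC values are later tested against.

```python
import numpy as np

rng = np.random.default_rng(0)

# SIMULATED data: 30 replicates x 200 response variables,
# all partly driven by a single hidden factor.
n_reps, n_vars = 30, 200
hidden = rng.normal(size=n_reps)
loadings = rng.normal(size=n_vars)
noise = rng.normal(scale=0.5, size=(n_reps, n_vars))
data = np.outer(hidden, loadings) + noise

# PCA via SVD of the column-centred data.
centred = data - data.mean(axis=0)
u, s, vt = np.linalg.svd(centred, full_matrices=False)
pc1 = u[:, 0] * s[0]   # one PC1 value per replicate

# PC1 now stands in for all 200 variables as a single response variable.
# HYPOTHETICAL second measurement per replicate, to test PC1 against.
other = 2.0 * hidden + rng.normal(scale=0.5, size=n_reps)
r = np.corrcoef(pc1, other)[0, 1]

print("Response variables before:", data.shape[1], "after: 1 (PC1)")
print("Correlation between PC1 and the other measurement:", round(r, 3))
```

The sign of `r` is arbitrary because the sign of a PC is arbitrary; only its magnitude is meaningful. The sketch also only works so cleanly because the simulation was built around one dominant factor, which real data need not have.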
![Page 16: New Methods in Ecology Complex statistical tests, and why we should be cautious!](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d4c5503460f94a2a8df/html5/thumbnails/16.jpg)
Salmon Genomics and Survival
• Gene expression data for ~16000 genes, from ~300 fish.
• Each fish is a replicate, each gene is a response variable
![Page 17: New Methods in Ecology Complex statistical tests, and why we should be cautious!](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d4c5503460f94a2a8df/html5/thumbnails/17.jpg)
Salmon Genomics and Survival
• 16000 genes is a lot of data, and a lot of variation.
• Do a PCA on the genes, use the PC values as a response variable
• Reduces the dimension of the data: rather than 16000 response variables, now have 1 (PC1, or PC2)
• Can then use this in other tests.
![Page 18: New Methods in Ecology Complex statistical tests, and why we should be cautious!](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d4c5503460f94a2a8df/html5/thumbnails/18.jpg)
Salmon Genomics and Survival
[Figure: axis labelled “Principal component”]
• Related the value of PC1 to survival of the fish; this showed a correlation for one stock
![Page 19: New Methods in Ecology Complex statistical tests, and why we should be cautious!](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d4c5503460f94a2a8df/html5/thumbnails/19.jpg)
Salmon Genomics and Survival
[Figure: proportion surviving vs days; panels labelled Scotch, Chilko, Adams]
• Condensed the gene expression data into something useable
• Method insanely complex and computer intensive
• Still don’t really know what PC1 is!
![Page 20: New Methods in Ecology Complex statistical tests, and why we should be cautious!](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d4c5503460f94a2a8df/html5/thumbnails/20.jpg)
Cluster Analysis
• Like PCA, a multivariate method
• Unlike PCA, looks for patterns within the data
• Produces a hierarchical cluster
• Groups similar individuals together
• Unsupervised
• Have to then decide where groups lie
• Try and relate the grouping to something else?
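The "groups similar individuals together, then decide where groups lie" steps can be sketched with a small agglomerative clustering written from scratch. The six individuals and their two measurements are made up, and single linkage (distance between the closest members) is just one of several linkage rules; the slides do not say which was used.

```python
import math

# MADE-UP measurements: two variables for six individuals.
points = {
    "A": (1.0, 1.1),
    "B": (1.2, 0.9),
    "C": (0.8, 1.0),
    "D": (5.0, 5.2),
    "E": (5.1, 4.8),
    "F": (4.9, 5.0),
}


def cluster_distance(c1, c2):
    """Single linkage: distance between the closest pair of members."""
    return min(math.dist(points[i], points[j]) for i in c1 for j in c2)


def agglomerate(names):
    """Repeatedly merge the two closest clusters; return the merge history."""
    clusters = [(name,) for name in names]
    history = []
    while len(clusters) > 1:
        pairs = [
            (cluster_distance(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        ]
        d, i, j = min(pairs)
        history.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return history


def cut(history, names, n_groups):
    """Replay the merges until n_groups clusters remain."""
    clusters = [(name,) for name in names]
    for left, right, _ in history:
        if len(clusters) <= n_groups:
            break
        clusters = [c for c in clusters if c not in (left, right)]
        clusters.append(left + right)
    return sorted(tuple(sorted(c)) for c in clusters)


history = agglomerate(list(points))
groups = cut(history, list(points), 2)

for left, right, d in history:
    print(f"merge {left} + {right} at distance {d:.3f}")
print("Two groups:", groups)
```

The merge history is the hierarchy a dendrogram would draw. The choice of `n_groups` in `cut` is the analyst's decision, not something the method provides, which is the point the slide makes about having to decide where the groups lie.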
![Page 21: New Methods in Ecology Complex statistical tests, and why we should be cautious!](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d4c5503460f94a2a8df/html5/thumbnails/21.jpg)
Cluster Analysis
![Page 22: New Methods in Ecology Complex statistical tests, and why we should be cautious!](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d4c5503460f94a2a8df/html5/thumbnails/22.jpg)
Multivariate Summary
• Multivariate statistics are useful for data mining
• Often used when data collection was done improperly/you’ve been given data sets
• Can indicate how to proceed
• Can be very messy
• Totally opposite to the a priori “carry out an experiment to test a hypothesis” idea.
![Page 23: New Methods in Ecology Complex statistical tests, and why we should be cautious!](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d4c5503460f94a2a8df/html5/thumbnails/23.jpg)
Complex stats Summary
• Can be very useful and insightful if used properly
• More complex doesn’t necessarily mean better
• Can be difficult to interpret
• Remember the golden rule – know how to analyse the type of data you will collect, before you collect it!