Page 1

A computational phylogenetic approach to interaction analysis

Cynthia Sims Parr

University of Maryland College Park

Ecological Society of America Montreal, Canada August 9, 2005

Page 2

Predicting Ecological Interactions

?

Page 3

Terminology & Outline

Describe computational framework for predicting links

Propose general algorithms and discuss implications

Preliminary results: a simple model using a large database and evolutionary trees does a surprisingly good job.

Terminology: web, node, link

Page 4

Evolutionary trees

Family, Genus, Species

Page 5

Computational framework

Diagram components:
  Database
  Interaction Web Database
  ADW
  Phylogenies
  Classifications
  DB and Graph Vis tools
  Algorithms
  Predictions
  Field Test Predictions
  Explore for patterns

Note: More than one way to do it!

Page 6

Predicting Links: parameterized functions

Step 1. Select functions that might predict links using characteristics of taxa. For example, size or stoichiometry.

Step 2. Determine parameters using known links among all taxa across whole or partial database.

For taxon A and taxon B with known link status $\mathrm{LinkStatus}_{AB}$:

$$\mathrm{LinkStatus}_{AB} = f(\alpha, \mathrm{size}_A, \mathrm{size}_B) + f(\beta, \mathrm{stoich}_A, \mathrm{stoich}_B)$$

Step 3. Use parameterized equation to estimate LinkStatus between target taxa C and D.
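Steps 1 to 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slides do not give the form of ƒ, so the sketch assumes the size term is α times the log size ratio and the stoichiometry term is β times the absolute stoichiometric difference, and it fits α and β by ordinary least squares. The function names and feature choices are hypothetical.

```python
import math

def size_term(size_a, size_b):
    """Hypothetical size feature: log ratio of the two body sizes."""
    return math.log(size_a / size_b)

def stoich_term(stoich_a, stoich_b):
    """Hypothetical stoichiometry feature: absolute difference (e.g. in C:N)."""
    return abs(stoich_a - stoich_b)

def fit_parameters(known_pairs):
    """Step 2: least-squares estimate of (alpha, beta) from known links.

    known_pairs is a list of (size_a, size_b, stoich_a, stoich_b, link_status).
    Solves the 2x2 normal equations for
    link_status ~ alpha * size_term + beta * stoich_term.
    """
    sxx = sxy = syy = sxz = syz = 0.0
    for size_a, size_b, st_a, st_b, status in known_pairs:
        x = size_term(size_a, size_b)
        y = stoich_term(st_a, st_b)
        sxx += x * x
        sxy += x * y
        syy += y * y
        sxz += x * status
        syz += y * status
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("features are collinear; cannot estimate parameters")
    alpha = (sxz * syy - syz * sxy) / det
    beta = (syz * sxx - sxz * sxy) / det
    return alpha, beta

def predict_link(alpha, beta, size_c, size_d, stoich_c, stoich_d):
    """Step 3: apply the parameterized equation to target taxa C and D."""
    return alpha * size_term(size_c, size_d) + beta * stoich_term(stoich_c, stoich_d)
```

Calling `fit_parameters` on pairs drawn from one clade only would give the clade-localized parameter estimates that the approach allows for.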

Page 7

Implications: parameterized functions

Requires good data for target species

Can incrementally add natural history functions to get a better estimate; try different functions from the literature or use genetic algorithms

Parameterizing functions: multivariate statistics, machine learning, fuzzy inference

Could use evolutionary info if you localize parameter estimates to clades or taxonomic subsets

$$\mathrm{LinkPredicted}_{CD} = f(\alpha, \mathrm{size}_C, \mathrm{size}_D) + f(\beta, \mathrm{stoich}_C, \mathrm{stoich}_D)$$

Page 8

Predicting Links: neighbor distance weighting

Step 1. Provide distance threshold or number of neighbors N to use.

Step 2. Find nearest neighbors to your target nodes in evolutionary or trait space with known link status.

Step 3. Combine LinkStatus weighted by distances.

E.g. for taxa X and Y, where X has nearest neighbor A and Y has nearest neighbor B, and LinkStatus between A and B is known:

$$\mathrm{LinkPredicted}_{XY} = \frac{1}{1 + \mathrm{distance}_{XA} + \mathrm{distance}_{YB}} \, \mathrm{LinkStatus}_{AB}$$
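The distance weighting can be sketched as follows. The single-pair function follows the formula for one neighbor pair (A, B). How several neighbor pairs are combined is not spelled out in the slides, so the N-neighbor function assumes a plain average over the N closest pairs; that choice and all names are hypothetical.

```python
def predict_from_pair(link_status_ab, distance_xa, distance_yb):
    """Weight a known LinkStatus(A, B) by how far X is from A and Y is from B."""
    return link_status_ab / (1.0 + distance_xa + distance_yb)

def predict_from_n_pairs(neighbor_pairs, n):
    """Combine the n closest neighbor pairs.

    neighbor_pairs is a list of (link_status_ab, distance_xa, distance_yb).
    Assumption: "closest" means smallest summed distance, and the weighted
    statuses are averaged; the source does not specify either choice.
    """
    closest = sorted(neighbor_pairs, key=lambda p: p[1] + p[2])[:n]
    if not closest:
        raise ValueError("no neighbor pairs with known link status")
    return sum(predict_from_pair(*p) for p in closest) / len(closest)
```

With zero distances the prediction equals the known link status, and it decays toward zero as either target taxon moves away from its neighbor.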

Page 9

Implications: neighbor distance weighting

Evolutionary
  Uses phylogeny or classification or combination of these
  Distance could be branch length or # steps
  Does not explicitly take advantage of natural history

Trait space
  e.g. Euclidean distance in N-space
  Uses richest possible natural history data
  Could include evolutionary distance as a term
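Both kinds of distance are easy to compute. The sketch below assumes the tree or classification is stored as a child-to-parent mapping (a hypothetical representation, not necessarily the one used in the project's database) and counts steps through the closest shared ancestor; trait distance is plain Euclidean distance over equal-length trait vectors.

```python
import math

def lineage(taxon, parent):
    """Return [taxon, parent, grandparent, ...] up to the root."""
    path = [taxon]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path

def step_distance(a, b, parent):
    """Number of tree steps between a and b via their closest shared ancestor."""
    depth_from_a = {node: i for i, node in enumerate(lineage(a, parent))}
    for steps_from_b, node in enumerate(lineage(b, parent)):
        if node in depth_from_a:
            return depth_from_a[node] + steps_from_b
    raise ValueError("taxa share no common ancestor in this tree")

def trait_distance(traits_a, traits_b):
    """Euclidean distance between two trait vectors in N-space."""
    return math.dist(traits_a, traits_b)

# Toy classification used only for illustration.
parent = {
    "Canis lupus": "Canis",
    "Canis latrans": "Canis",
    "Vulpes vulpes": "Vulpes",
    "Canis": "Canidae",
    "Vulpes": "Canidae",
}
```

In this toy tree two congeners are two steps apart, while species in different genera of the same family are four steps apart.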

Page 10

Problems and suggested solutions

Missing data
  Avoid it
  Avoid comparisons with nodes without complete data
  Substitute value of relative otherwise closest in trait space
  "Ancestral" node reconstruction, e.g. Phylogenetic Mixed Model (Houseworth et al. 2001)

Nodes that do not map to taxa, e.g. detritus, suspended organic matter
  Treat as if they are a phylogenetic unit, all in one polytomy
  Can create a "phylogeny" of neighbors. For example, "detritus" may be part of a reasonable hierarchy of organic material.

Nodes that are not resolved to species
  Doesn't matter for these algorithms
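The substitution option for missing data can be sketched as below. The sketch assumes the caller supplies a set of candidate relatives with complete trait records, measures closeness by Euclidean distance over the traits the target does have, and copies the missing values from the closest candidate. The data layout and names are hypothetical.

```python
import math

def fill_missing(target, candidates):
    """Fill None-valued traits of target from the closest complete relative.

    target: dict mapping trait name -> value or None.
    candidates: dict mapping relative name -> complete trait dict.
    Returns (filled_traits, donor_name).
    """
    known = [trait for trait, value in target.items() if value is not None]

    def distance_to(name):
        traits = candidates[name]
        return math.sqrt(sum((target[t] - traits[t]) ** 2 for t in known))

    donor = min(candidates, key=distance_to)
    filled = {
        trait: value if value is not None else candidates[donor][trait]
        for trait, value in target.items()
    }
    return filled, donor
```

Restricting `candidates` to members of the same clade is one way to keep the donor a genuine relative rather than any taxon that happens to be close in trait space.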

Page 11

Picture of tree from TaxonTree overview

Take advantage of all information as needed

Page 12

Whole web solutions

Some links affect others
  Use a priori prediction of strongest links to run first; allow status of these links to enter link predictions.

Webs should be realistic
  Vary parameters (e.g. scale of parameterization, thresholds) and rerun analyses until criterion met for the whole web
  Criteria: "natural" values for connectedness, stability, chain length, trophic level ratios, etc.
  Methodology: parsimony or likelihood analysis

Computational demands will be high
  $S^2$ possible links, simultaneous multivariate equations by all variants of runs. May need heuristics.
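The "vary parameters and rerun until the criterion is met" loop can be sketched for a single criterion. The sketch assumes per-pair link scores are already available, takes connectedness to mean directed connectance (links divided by S squared, where S is the number of taxa), and scans a list of candidate thresholds. The target range and all names are hypothetical.

```python
def connectance(links, n_taxa):
    """Directed connectance: realized links over the S^2 possible links."""
    return len(links) / (n_taxa ** 2)

def build_web(scores, threshold):
    """Keep every (consumer, resource) pair whose score reaches the threshold."""
    return {pair for pair, score in scores.items() if score >= threshold}

def tune_threshold(scores, n_taxa, target_low, target_high, thresholds):
    """Return the first (threshold, web) whose connectance falls in the target range.

    Returns None if no candidate threshold satisfies the criterion.
    """
    for threshold in thresholds:
        web = build_web(scores, threshold)
        if target_low <= connectance(web, n_taxa) <= target_high:
            return threshold, web
    return None
```

A fuller version would check several criteria at once (chain length, trophic level ratios, and so on), which is where the cost grows and heuristics become attractive.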

Page 13

Summary of approaches

Link prediction
  Parameterized functions
  Weighted distances
    Evolutionary
    Trait space

Total community solution
  Parsimony or likelihood solution
  Include other links as terms and run prioritized, stepwise analysis

Page 14

Data needed

Wide range of well-identified taxa

Cross section of habitats

Natural history data

Page 15

Database status

Source                 Webs  Nodes   Links
Animal Diversity Web    n/a   1012    2869
Webs on the Web          17   1537    6328
Interaction Web DB       26   2177    9882
EcoWEB                  213   4064    6363
Total                   256   8790  25,442

4214 unique taxa

Evolutionary tree as in Parr et al. 2004. Bioinformatics.

Page 16

LinkPredictor preliminary results

Data
  43% of nodes mapped to species level
  16% of nodes have no evolutionary information at all
  Using only presence or absence of links

Procedure
  Pulling out one food web at a time and predicting its links based on the rest of the data
  Up to 4 steps up and down the evolutionary tree, no weighting yet for distance

Results
  On average, 49% of actual links are correctly predicted
  38% of predicted links are false positives

Take home: Our DB and evolutionary approach does surprisingly well at predicting food links

…With SPIRE at UMBC
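The hold-one-web-out procedure and its two summary figures can be sketched as below. This is not the LinkPredictor code. It assumes a child-to-parent tree, predicts a link between X and Y whenever some known link (A, B) in the remaining webs has A within `max_steps` tree steps of X and B within `max_steps` of Y, and uses presence or absence only, with no distance weighting. Reading "up to 4 steps up and down" as total path length is an assumption, and all names are hypothetical.

```python
from itertools import product

def lineage(taxon, parent):
    """Return [taxon, parent, grandparent, ...] up to the root."""
    path = [taxon]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path

def step_distance(a, b, parent):
    """Tree steps between a and b, or None if they share no ancestor."""
    depth_from_a = {node: i for i, node in enumerate(lineage(a, parent))}
    for steps_from_b, node in enumerate(lineage(b, parent)):
        if node in depth_from_a:
            return depth_from_a[node] + steps_from_b
    return None

def is_near(x, a, parent, max_steps):
    distance = step_distance(x, a, parent)
    return distance is not None and distance <= max_steps

def predict_web(taxa, training_links, parent, max_steps=4):
    """Predict (consumer, resource) links among taxa from links in other webs."""
    predicted = set()
    for x, y in product(taxa, repeat=2):
        if any(is_near(x, a, parent, max_steps) and is_near(y, b, parent, max_steps)
               for a, b in training_links):
            predicted.add((x, y))
    return predicted

def leave_one_web_out(webs, parent, max_steps=4):
    """For each web: (fraction of actual links predicted,
    fraction of predicted links that are false positives)."""
    results = {}
    for held_out, actual in webs.items():
        training = set().union(*(links for name, links in webs.items()
                                 if name != held_out))
        taxa = {taxon for link in actual for taxon in link}
        predicted = predict_web(taxa, training, parent, max_steps)
        hit_rate = len(predicted & actual) / len(actual) if actual else 0.0
        false_rate = len(predicted - actual) / len(predicted) if predicted else 0.0
        results[held_out] = (hit_rate, false_rate)
    return results
```

Averaging the first element of each result across webs gives a figure comparable to the "actual links correctly predicted" percentage, and the second element corresponds to the false-positive share of predicted links.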

Page 17

More questions

What about predicting links among taxa from big studies outside the current database?

How much improvement comes from adding links to the DB?

How robust are results to differing degrees of phylogenetic resolution or taxon sampling?

How robust are results to missing data?

How to handle data quality issues? Error estimates?

Page 18

Future work with SPIRE

Role in ELVIS – LinkEP (Evidence Provider)

Integrate into platform that
  takes location as input
  generates list of taxa
  gives evidence for interaction among taxa
  models change due to invasive species

Pull data from semantic web rather than local database

Page 19

Acknowledgements

NSF IDM/ITR 0219492 (PI Bederson)
Bongshin Lee
NBII
Joel Sachs and Andrey Parafiynyk
Bill Fagan and lab members
Michael Kantor
EcoWeb (Joel Cohen)
NCEAS Interaction Web Database (Diego Vázquez)
WoW (J. Dunne and N. Martinez)

http://www.cs.umd.edu/hcil/biodiversity
http://spire.umbc.edu/linkpredictor/