treescaper: software to visualize and extract phylogenetic signals from sets of trees

TreeScaper: Software to visualize and extract phylogenetic signals from sets of trees. Guifang Zhou 1, Wen Huang 3, Melissa Marchand 4, Jeremy Ash 2, David Morris 1, Paul Van Dooren 3, Jim C. Wilgenbusch 5, Jeremy M. Brown 1, Kyle A. Gallivan 4. 1 Department of Biological Sciences, Louisiana State University; 2 Bioinformatics Research Center, North Carolina State University; 3 ICTEAM Institute, Université catholique de Louvain; 4 Department of Mathematics, Florida State University; 5 Minnesota Supercomputing Institute, University of Minnesota. June 21, 2016.


Page 1: TreeScaper: Software to Visualize and Extract Phylogenetic Signals from Sets of Trees

TreeScaper: Software to visualize and extract phylogenetic signals from sets of trees

Guifang Zhou 1, Wen Huang 3, Melissa Marchand 4, Jeremy Ash 2, David Morris 1, Paul Van Dooren 3, Jim C. Wilgenbusch 5, Jeremy M. Brown 1, Kyle A. Gallivan 4

1 Department of Biological Sciences, Louisiana State University
2 Bioinformatics Research Center, North Carolina State University
3 ICTEAM Institute, Université catholique de Louvain
4 Department of Mathematics, Florida State University
5 Minnesota Supercomputing Institute, University of Minnesota

June 21, 2016

Page 2: TreeScaper: Software to Visualize and Extract Phylogenetic Signals from Sets of Trees

Motivations

Phylogenetic analyses often produce large sets of competing trees

Summarize interesting evolutionary history:
  Hybridization
  Recombination
  Horizontal Gene Transfer
  Incomplete Lineage Sorting

Identify Systematic Error


Page 3: TreeScaper: Software to Visualize and Extract Phylogenetic Signals from Sets of Trees

Shortcomings of Current Approaches

Consensus tree
  Discards information concerning competing trees

Dimensionality Reduction
  May be difficult to interpret


Page 4: TreeScaper: Software to Visualize and Extract Phylogenetic Signals from Sets of Trees

Shortcomings of Current Approaches

Clustering
  Based on pairwise tree-to-tree distances
  Only considers nonnegative links


Page 5: TreeScaper: Software to Visualize and Extract Phylogenetic Signals from Sets of Trees

Our Approaches

Apply graph-based methods to understand relationships among:
  Tree topologies
  Bipartitions within tree topologies


Page 6: TreeScaper: Software to Visualize and Extract Phylogenetic Signals from Sets of Trees

TreeScaper (Version 1)

NLDR
  Optimization Algorithm
    Linear iteration
    Majorization
    Gauss-Newton
    Stochastic gradient descent
    MCMC simulated annealing
  Cost functions
    Kruskal-1 stress
    Normalized stress
    Sammon stress
    Curvilinear components analysis

Dimension Estimator
  Nearest neighbor estimator
  Correlation dimension
  Maximum likelihood estimator

Visualization
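
The cost functions listed on this slide all measure how well distances in a low-dimensional embedding match the original tree-to-tree distances. The sketch below uses common textbook definitions of three of them. It is not taken from the TreeScaper source, whose normalizations may differ, and the function names and the three-tree example data are hypothetical.

```python
import math
from itertools import combinations


def pairwise_distances(points):
    """Euclidean distance for every pair i < j of embedded points."""
    return {
        (i, j): math.dist(points[i], points[j])
        for i, j in combinations(range(len(points)), 2)
    }


def kruskal1_stress(target, embedded):
    """Kruskal stress-1: residuals normalized by the embedded distances."""
    num = sum((embedded[k] - target[k]) ** 2 for k in target)
    den = sum(embedded[k] ** 2 for k in target)
    return math.sqrt(num / den)


def normalized_stress(target, embedded):
    """Squared residuals normalized by the squared target distances."""
    num = sum((embedded[k] - target[k]) ** 2 for k in target)
    den = sum(target[k] ** 2 for k in target)
    return num / den


def sammon_stress(target, embedded):
    """Sammon stress: each squared residual is weighted by 1 / target distance.

    Assumes every target distance is strictly positive.
    """
    total = sum(target.values())
    return sum((target[k] - embedded[k]) ** 2 / target[k] for k in target) / total


# Hypothetical tree-to-tree distances for three trees, and a 1-D embedding.
target = {(0, 1): 3.0, (0, 2): 4.0, (1, 2): 5.0}
embedded = pairwise_distances([(0.0,), (3.0,), (4.0,)])

for name, cost in [("Kruskal-1", kruskal1_stress),
                   ("normalized", normalized_stress),
                   ("Sammon", sammon_stress)]:
    print(name, cost(target, embedded))
```

Sammon stress divides each residual by the target distance, so it penalizes errors on short distances more heavily than the other two functions do.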


Page 7: TreeScaper: Software to Visualize and Extract Phylogenetic Signals from Sets of Trees

TreeScaper (Version 2)

NLDR
Dimensionality estimation
New input data types
Distance/Affinity matrix
  Robinson-Foulds (Unweighted/Weighted)
  Matching
  Subtree Prune and Regraft
Covariance matrix
Community Detection methods
  Configuration Null Model
  Constant Potts Model
  Erdős-Rényi Null Model
  No Null Model
Interactive visualization interface
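
As an illustration of the first distance listed above, the unweighted Robinson-Foulds distance counts the bipartitions found in one tree but not the other. The following is a minimal sketch, assuming trees are given as nested Python tuples of taxon names and treated as unrooted. The representation, the function names, and the two example trees are hypothetical and are not TreeScaper's input format or code.

```python
def bipartitions(tree, taxa):
    """Return the non-trivial bipartitions of a nested-tuple tree.

    Each bipartition is stored as the side that does NOT contain a fixed
    reference taxon, so the same split always gets the same key.
    """
    taxa = frozenset(taxa)
    ref = min(taxa)
    splits = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(leaves(child) for child in node))
        # Keep only splits with at least two taxa on each side.
        if 1 < len(below) < len(taxa) - 1:
            splits.add(below if ref not in below else taxa - below)
        return below

    leaves(tree)
    return splits


def rf_distance(tree_a, tree_b, taxa):
    """Unweighted Robinson-Foulds distance: size of the symmetric difference."""
    return len(bipartitions(tree_a, taxa) ^ bipartitions(tree_b, taxa))


taxa = ["A", "B", "C", "D", "E"]
tree_1 = ((("A", "B"), "C"), ("D", "E"))
tree_2 = ((("A", "C"), "B"), ("D", "E"))
print(rf_distance(tree_1, tree_2, taxa))
```

Applying `rf_distance` to every pair of trees gives a distance matrix of the kind listed on this slide. The weighted variant also uses branch lengths, which this sketch omits.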


Page 8: TreeScaper: Software to Visualize and Extract Phylogenetic Signals from Sets of Trees

Application

Yeast dataset with 5 species, 106 loci
106 gene trees were reconstructed using maximum parsimony


Page 9: TreeScaper: Software to Visualize and Extract Phylogenetic Signals from Sets of Trees

Topology-based Network Analysis

Affinity matrix
  Reciprocal of pairwise distances

Detect communities
  Discovered 11 communities

Consensus trees for each community

Top 2 communities recover the top 2 candidate species trees

[Figure: consensus trees by community, labeled 62/106, 17/106, 11/106, 4/106, 3/106, 2/106, 2/106, 2/106, · · ·]
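
The two steps on this slide, turning distances into affinities and then scoring a grouping of trees, can be sketched as follows. This is only an illustration under stated assumptions. The small offset `eps` is added so that identical topologies (distance 0) do not cause a division by zero. The quality function is the standard modularity with a configuration null model. The distance matrix is made up. None of this is taken from the TreeScaper source.

```python
def reciprocal_affinity(dist, eps=1.0):
    """Affinity = 1 / (eps + distance); the diagonal is set to 0 (no self-links).

    `eps` is an assumed offset that keeps the reciprocal finite when two
    trees have distance 0.
    """
    n = len(dist)
    return [
        [0.0 if i == j else 1.0 / (eps + dist[i][j]) for j in range(n)]
        for i in range(n)
    ]


def modularity(affinity, labels, gamma=1.0):
    """Modularity of a partition under the configuration null model.

    Q = (1 / 2m) * sum_ij [A_ij - gamma * k_i * k_j / 2m] * [c_i == c_j],
    where k_i is the total affinity of node i and 2m is the sum of all k_i.
    """
    n = len(affinity)
    strength = [sum(row) for row in affinity]
    two_m = sum(strength)
    q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                q += affinity[i][j] - gamma * strength[i] * strength[j] / two_m
    return q / two_m


# Hypothetical distances among four trees forming two tight groups.
dist = [
    [0, 1, 8, 8],
    [1, 0, 8, 8],
    [8, 8, 0, 1],
    [8, 8, 1, 0],
]
affinity = reciprocal_affinity(dist)
print(modularity(affinity, [0, 0, 1, 1]))  # grouping that matches the structure
print(modularity(affinity, [0, 1, 0, 1]))  # grouping that cuts across it
```

A community detection method searches over label assignments for one that maximizes a quality function of this kind. The resolution parameter `gamma` controls how many communities are favored.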


Page 10: TreeScaper: Software to Visualize and Extract Phylogenetic Signals from Sets of Trees

Bipartition-based Network Analysis

Covariance matrix based on presence or absence of bipartitions in the gene trees
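
One way to build such a matrix is to record each bipartition as a 0/1 column over the gene trees and take the covariance between columns. The sketch below assumes each tree is given as a set of bipartition labels and uses the sample covariance (divisor n - 1). The labels, function names, and the four example trees are hypothetical, and TreeScaper's own computation may differ.

```python
def presence_matrix(tree_splits):
    """Rows are trees, columns are bipartitions; entries are 1 (present) or 0."""
    columns = sorted(set().union(*tree_splits))
    rows = [
        [1 if split in splits else 0 for split in columns]
        for splits in tree_splits
    ]
    return columns, rows


def covariance_matrix(rows):
    """Sample covariance (divisor n - 1) between the columns of `rows`."""
    n, p = len(rows), len(rows[0])
    means = [sum(row[c] for row in rows) / n for c in range(p)]
    return [
        [
            sum((row[a] - means[a]) * (row[b] - means[b]) for row in rows) / (n - 1)
            for b in range(p)
        ]
        for a in range(p)
    ]


# Hypothetical bipartition sets for four gene trees on taxa A-E.
gene_trees = [
    {"AB|CDE", "ABC|DE"},
    {"AB|CDE", "ABC|DE"},
    {"AC|BDE", "ABC|DE"},
    {"AC|BDE", "ABC|DE"},
]
columns, rows = presence_matrix(gene_trees)
cov = covariance_matrix(rows)
for name, row in zip(columns, cov):
    print(name, row)
```

Bipartitions that never occur in the same tree, such as `AB|CDE` and `AC|BDE` here, get a negative covariance, so the resulting network contains negative as well as positive links.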


Page 11: TreeScaper: Software to Visualize and Extract Phylogenetic Signals from Sets of Trees

TreeScaper Software

Available on GitHub

https://github.com/whuang08/TreeScaper



Page 13: TreeScaper: Software to Visualize and Extract Phylogenetic Signals from Sets of Trees

Acknowledgements

Computing support from FSU’s Research Computing Center and HPC@LSU
The National Science Foundation for funding to support some of this work (ABI-1262476)
