A Separate Analysis Approach to the Reconstruction of Phylogenetic Networks
Luay Nakhleh
Department of Computer Sciences
UT Austin
Who’s Involved
– UT CS: Tandy Warnow, Luay Nakhleh
– UT BIO: Randy Linder
– UNM CS: Bernard Moret
Why Networks?
• Lateral gene transfer (LGT)
– Ochman estimated that 755 of 4,288 ORFs in E. coli were from at least 234 LGT events
• Hybridization
– Estimates suggest that as many as 30% of all plant lineages are the products of hybridization
– Fish
– Some frogs
Phylogenetic Networks
• Rooted, directed, acyclic graphs that actually model the evolutionary process
• “tree” nodes and “network” nodes
• Time constraints
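The distinction between "tree" nodes and "network" nodes can be made concrete with a small sketch. It assumes a common convention that the slide does not spell out: a node with one incoming edge is a tree node, and a node with two or more incoming edges is a network node. The function name `classify_nodes` and the toy edge list are hypothetical.

```python
from collections import defaultdict

def classify_nodes(edges):
    """Split the nodes of a rooted DAG into roots, tree nodes, and network nodes.

    Assumed convention (not stated on the slide): in-degree 0 is the root,
    in-degree 1 is a tree node, in-degree >= 2 is a network node.
    """
    indeg = defaultdict(int)
    nodes = set()
    for parent, child in edges:
        nodes.update((parent, child))
        indeg[child] += 1
    roots = sorted(n for n in nodes if indeg[n] == 0)
    tree_nodes = sorted(n for n in nodes if indeg[n] == 1)
    network_nodes = sorted(n for n in nodes if indeg[n] >= 2)
    return roots, tree_nodes, network_nodes

# Hypothetical toy network: H receives genetic material from both x and y.
edges = [("r", "x"), ("r", "y"), ("x", "A"), ("x", "H"),
         ("y", "H"), ("y", "B"), ("H", "C")]
roots, tree_nodes, network_nodes = classify_nodes(edges)
print(roots, tree_nodes, network_nodes)
```

The time constraints from the slide are not modeled here; the sketch only covers the graph structure.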
Separate Analysis
• Analyze individual genes separately
• Reconcile the resulting phylogenies
• As opposed to combined analysis in which the datasets are combined (via concatenation) and the combined dataset is then analyzed
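To illustrate the contrast, here is a minimal sketch of how the input to a combined analysis is formed by concatenation. The helper `concatenate` and the toy alignments are hypothetical; a separate analysis would instead keep the per-gene alignments apart and infer one tree from each.

```python
def concatenate(alignments):
    """Build the combined-analysis input: join per-gene alignments taxon by taxon."""
    taxa = sorted(alignments[0])
    for aln in alignments:
        if sorted(aln) != taxa:
            raise ValueError("all genes must cover the same taxa")
    return {t: "".join(aln[t] for aln in alignments) for t in taxa}

# Hypothetical toy alignments for two genes on three taxa.
gene1 = {"A": "ACGT", "B": "ACGA", "C": "TCGA"}
gene2 = {"A": "GG", "B": "GC", "C": "CC"}

combined = concatenate([gene1, gene2])   # one dataset, analyzed once
separate = [gene1, gene2]                # one dataset per gene, analyzed independently
print(combined)
```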
SPR Distances Among Gene Trees
[Figure: gene trees on taxa A, B, C, D, E illustrating an SPR distance of 1]
Maddison’s Method
Given two gene datasets
• Construct two gene trees T1 and T2
• If SPR(T1,T2)=0
– Return a tree
• If SPR(T1,T2)=1
– Return a network with one reticulation event
Open problem: extend to reconstructing a network with m reticulation events
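The case analysis above can be sketched with a brute-force check, assuming rooted binary gene trees written as nested tuples of leaf names. This is an illustration only, not the algorithm from the talk: it enumerates every tree one rooted SPR move away from T1 and compares cluster sets. All function names and return strings are hypothetical, and the network itself is not constructed.

```python
def leaves(t):
    return frozenset([t]) if isinstance(t, str) else leaves(t[0]) | leaves(t[1])

def clusters(t):
    """Leaf sets below the internal nodes; these determine a rooted topology."""
    if isinstance(t, str):
        return set()
    return {leaves(t)} | clusters(t[0]) | clusters(t[1])

def paths(t, prefix=()):
    """Yield the path (tuple of 0/1 child indices) to every node of t."""
    yield prefix
    if not isinstance(t, str):
        yield from paths(t[0], prefix + (0,))
        yield from paths(t[1], prefix + (1,))

def get(t, path):
    for i in path:
        t = t[i]
    return t

def replace(t, path, new):
    if not path:
        return new
    children = list(t)
    children[path[0]] = replace(t[path[0]], path[1:], new)
    return tuple(children)

def spr_neighbors(t):
    """Yield every tree reachable from t by one rooted SPR move (t included)."""
    for p in paths(t):
        if not p:
            continue  # the whole tree cannot be pruned
        pruned = get(t, p)
        sibling = get(t, p[:-1] + (1 - p[-1],))
        rest = replace(t, p[:-1], sibling)  # the parent collapses to the sibling
        for q in paths(rest):               # regraft on every edge, or above the root
            yield replace(rest, q, (get(rest, q), pruned))

def maddison_case(t1, t2):
    target = clusters(t2)
    if clusters(t1) == target:
        return "tree"
    if any(clusters(n) == target for n in spr_neighbors(t1)):
        return "network with one reticulation event"
    return "not covered: SPR distance > 1"

t1 = ((("A", "B"), "C"), ("D", "E"))
t2 = (("A", "B"), (("C", "D"), "E"))   # C moved next to D
t3 = ((("A", "D"), "C"), ("B", "E"))   # B and D swapped
print(maddison_case(t1, t1), "|", maddison_case(t1, t2), "|", maddison_case(t1, t3))
```

Enumerating neighbors takes roughly quadratic time in the number of leaves per pair of trees, which is fine for small examples but does not extend to larger SPR distances.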
Challenges
(1) Computational
– Computing SPR distances is of unknown computational complexity (probably hard)
Solving the Computational Challenge
• Galled-networks: reticulation events are independent
• For two gene trees T1 and T2 on n leaves we can
– Decide whether SPR(T1,T2)=m in O(mn) time, and
– Construct network N from T1 and T2 in O(mn) time
Challenges
(2) Systematic
– Obtaining the correct gene trees in practice is very hard (due to missing data, inaccuracy of tree reconstruction methods, wrong assumptions, etc.)
Solving the Systematic Challenge: Our Method SpNet
Given the sequences of two genes I & II on a set of species
• Run MP or ML on gene I and obtain a set U1 of trees, represented by its consensus tree t1
• Run MP or ML on gene II and obtain a set U2 of trees, represented by its consensus tree t2
• Find binary trees T1 and T2 that refine t1 and t2, respectively, such that SPR(T1,T2)=1
• Build network N from T1 and T2
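The consensus and refinement steps can be sketched as follows, under two assumptions not stated on the slide: the consensus is a strict consensus, and rooted trees are written as nested tuples of leaf names. A binary tree T refines a consensus tree t exactly when every cluster of t is also a cluster of T. The function names are hypothetical, and the search for a refining pair with SPR(T1,T2)=1 is not shown.

```python
def leaf_set(t):
    if isinstance(t, str):
        return frozenset([t])
    return frozenset().union(*(leaf_set(c) for c in t))

def clusters(t):
    """Leaf sets below the internal nodes of a rooted tree, root included."""
    if isinstance(t, str):
        return set()
    out = {leaf_set(t)}
    for child in t:
        out |= clusters(child)
    return out

def strict_consensus_clusters(trees):
    """Clusters shared by every tree in the set (assumed strict consensus)."""
    return set.intersection(*(clusters(t) for t in trees))

def refines(binary_tree, consensus_clusters):
    """True if the binary tree contains every cluster of the consensus."""
    return consensus_clusters <= clusters(binary_tree)

# Hypothetical set U1 of two equally good trees for gene I.
U1 = [
    ((("A", "B"), "C"), ("D", "E")),
    (("A", ("B", "C")), ("D", "E")),
]
t1 = strict_consensus_clusters(U1)          # ABC is left unresolved

candidate_ok = ((("A", "C"), "B"), ("D", "E"))
candidate_bad = (("A", "B"), ("C", ("D", "E")))
print(refines(candidate_ok, t1), refines(candidate_bad, t1))
```

Representing the consensus by its cluster set avoids building an explicit multifurcating tree, which is enough for the refinement test.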
SpNet: Running Time
• We have a linear-time algorithm for the single hybrid case (implementation and experimental results are available as well)
• We are working on the general case of an arbitrary number of reticulation events
Experimental Study
• Generated random networks on 10 and 20 taxa, with 0, 1, and 2 hybrids
• Evolved sequences under the GTR+Gamma model of evolution with invariant sites
• We studied the topological accuracy based on the splits defined by the model and inferred networks
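A split-based accuracy measure can be sketched as below. The slides do not give the exact formula used in the study, so this assumes the common false-negative and false-positive rates over split sets. The helpers `split` and `error_rates` and the toy splits are hypothetical.

```python
def split(side, taxa):
    """Return a bipartition of the taxa as an unordered pair of frozensets."""
    side = frozenset(side)
    return frozenset([side, frozenset(taxa) - side])

def error_rates(model_splits, inferred_splits):
    """False-negative and false-positive rates of the inferred splits."""
    fn = len(model_splits - inferred_splits) / len(model_splits)
    fp = len(inferred_splits - model_splits) / len(inferred_splits)
    return fn, fp

taxa = "ABCDEF"
# Hypothetical non-trivial splits of a model phylogeny and an inferred one.
model = {split("AB", taxa), split("ABC", taxa), split("EF", taxa)}
inferred = {split("AB", taxa), split("ABD", taxa), split("EF", taxa)}

fn, fp = error_rates(model, inferred)
print(f"FN = {fn:.3f}, FP = {fp:.3f}")
```

Storing each split as an unordered pair means that naming either side of the bipartition yields the same object, so set difference compares splits correctly.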
Evaluation Criteria
• Detection Quality
– How often did the method infer the correct number of hybrids in the model phylogeny?
• Reconstruction Quality
– What is the topological accuracy of the inferred phylogeny?
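Detection quality, read as a proportion over simulation replicates, could be computed as in this small sketch. The function name and the replicate counts are hypothetical and are not results from the study.

```python
def detection_rate(true_hybrids, inferred_counts):
    """Fraction of replicates whose inferred hybrid count equals the model's."""
    if not inferred_counts:
        raise ValueError("need at least one replicate")
    hits = sum(1 for k in inferred_counts if k == true_hybrids)
    return hits / len(inferred_counts)

# Hypothetical inferred hybrid counts for five replicates of a 1-hybrid model.
rate = detection_rate(1, [1, 1, 0, 2, 1])
print(rate)
```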
Methods
• SpNet(i): Our method where we contract i edges
• NNet: The method of Bryant and Moulton
• NJ
Detection Quality of SpNet
Model Phylogeny: 20-taxon tree
Detection Quality of SpNet
Model Phylogeny: 20-taxon 1-hybrid network
Detection Quality of SpNet
Model Phylogeny: 20-taxon 2-hybrid network
Reconstruction Quality
Model Phylogeny: 20-taxon tree
Reconstruction Quality
Model Phylogeny: 20-taxon 1-hybrid network
Reconstruction Quality
Model Phylogeny: 20-taxon 1-hybrid network
Conclusions
• Considering a set of “good” trees rather than a single optimal tree is advantageous in network reconstruction
• Separate analysis approaches outperform combined analysis approaches
Ongoing research
• Using other techniques for obtaining unresolved trees (e.g., Bayesian analyses, bootstrapping, etc.)
• Detection vs. reconstruction – visualization and clustering techniques may also be useful (collaboration with St John)
• Refining unresolved networks
• DCM-like network reconstruction