Phylogeny Workshop
By Eyal Privman
The Bioinformatics Unit
G.S. Wise Faculty of Life Science
Tel Aviv University, Israel
November 2009
http://ibis.tau.ac.il/twiki/bin/view/Bioinformatics/Phylogeny2009
Why should we care about phylogeny?
"Nothing in biology makes sense except in the light of evolution"
(Theodosius Dobzhansky, 1973)
Alignment and phylogeny are mutually dependent
[Diagram labels: Unaligned sequences; Inaccurate tree building; Sequence alignment; MSA; Phylogeny reconstruction]
Alignment and phylogeny are both challenging
25% of residues are aligned incorrectly
Based on BAliBASE: a large representative set of proteins
Alignment and phylogeny are both challenging
5% of tree branches are wrong
Based on simulations of 100 protein sequences
Multiple sequence alignment (MSA)
Progressive alignment
[Diagram labels: sequences A, B, C, D, E; Pairwise distance table; Guide tree; MSA; Iterative]
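The first step of progressive alignment, building a pairwise distance table from unaligned sequences, can be sketched roughly as follows. This is only an illustration: the k-mer (Jaccard) distance, the function names, and the five toy sequences are assumptions made for the example, not the measure used by any particular MSA program.

```python
from itertools import combinations


def kmer_set(seq, k=2):
    """Return the set of all overlapping k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_distance(a, b, k=2):
    """1 minus the Jaccard similarity of the two k-mer sets (0 = same k-mer content)."""
    ka, kb = kmer_set(a, k), kmer_set(b, k)
    union = ka | kb
    if not union:  # both sequences shorter than k
        return 0.0
    return 1.0 - len(ka & kb) / len(union)


def distance_table(seqs, k=2):
    """Pairwise distance table keyed by (name1, name2), names in sorted order."""
    return {
        (x, y): kmer_distance(seqs[x], seqs[y], k)
        for x, y in combinations(sorted(seqs), 2)
    }


def closest_pair(table):
    """The pair a guide tree would join first: the smallest distance."""
    return min(table, key=table.get)


# Toy (hypothetical) sequences standing in for A..E on the slide.
seqs = {
    "A": "MKVLAAGIV",
    "B": "MKVLAAGIL",
    "C": "MKILSAGIV",
    "D": "QQWPTTRNE",
    "E": "QQWPTSRNE",
}
table = distance_table(seqs)
print(closest_pair(table), round(table[closest_pair(table)], 3))
```

A guide tree is then built by repeatedly joining the closest sequences or groups, and the MSA is assembled by aligning along that tree.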
Multiple sequence alignment (MSA)
Several advanced MSA programs are available. Today we will use two:
• MAFFT – fastest and one of the most accurate
• PRANK – distinct from all other MSA programs because of its correct treatment of insertions/deletions
MAFFT
• Web server & download: http://align.bmr.kyushu-u.ac.jp/mafft/online/server/
• Efficiency-tuned variants: quick & dirty or slow but accurate
Kazutaka Katoh, Kazuharu Misawa, Kei-ichi Kuma and Takashi Miyata. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Research, 2002, Vol. 30, No. 14, 3059-3066. © 2002 Oxford University Press
Choosing a MAFFT strategy
[Scale: from quick & dirty to slow but accurate]
L-INS-i
ooooooooooooooooooooooooooooooooXXXXXXXXXXX-XXXXXXXXXXXXXXX------------------
--------------------------------XX-XXXXXXXXXXXXXXX-XXXXXXXXooooooooooo-------
------------------ooooooooooooooXXXXX----XXXXXXXX---XXXXXXXooooooooooo-------
--------ooooooooooooooooooooooooXXXXX-XXXXXXXXXX----XXXXXXXoooooooooooooooooo
--------------------------------XXXXXXXXXXXXXXXX----XXXXXXX------------------
G-INS-i
XXXXXXXXXXX-XXXXXXXXXXXXXXX
XX-XXXXXXXXXXXXXXX-XXXXXXXX
XXXXX----XXXXXXXX---XXXXXXX
XXXXX-XXXXXXXXXX----XXXXXXX
XXXXXXXXXXXXXXXX----XXXXXXX
E-INS-i
oooooooooXXX------XXXX---------------------------------XXXXXXXXXXX-XXXXXXXXXXXXXXXooooooooooooo
---------XXXXXXXXXXXXXooo------------------------------XXXXXXXXXXXXXXXXXX-XXXXXXXX-------------
-----ooooXXXXXX---XXXXooooooooooo----------------------XXXXX----XXXXXXXXXXXXXXXXXXooooooooooooo
---------XXXXX----XXXXoooooooooooooooooooooooooooooooooXXXXX-XXXXXXXXXXXX--XXXXXXX-------------
---------XXXXX----XXXX---------------------------------XXXXX---XXXXXXXXXX--XXXXXXXooooo--------
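For those who install MAFFT locally instead of using the web server, the strategies named above can be selected with command-line options. The sketch below only builds the command; the flag sets follow the MAFFT manual as far as I know and should be checked against `mafft --help` for your version. The helper name `mafft_command` and the dictionary are illustrative, not part of MAFFT.

```python
import shlex

# Option sets per strategy (assumed from the MAFFT manual; verify locally).
STRATEGIES = {
    "auto": ["--auto"],                                    # let MAFFT choose
    "FFT-NS-2": ["--retree", "2"],                         # quick & dirty
    "L-INS-i": ["--localpair", "--maxiterate", "1000"],    # one alignable domain
    "G-INS-i": ["--globalpair", "--maxiterate", "1000"],   # globally alignable
    "E-INS-i": ["--ep", "0", "--genafpair", "--maxiterate", "1000"],  # several domains
}


def mafft_command(strategy, fasta_path):
    """Return the argument list for running MAFFT with the given strategy."""
    if strategy not in STRATEGIES:
        raise ValueError(f"unknown strategy: {strategy}")
    return ["mafft", *STRATEGIES[strategy], fasta_path]


print(shlex.join(mafft_command("L-INS-i", "fahA.fas")))

# MAFFT writes the alignment to standard output, so a real run would
# redirect it to a file, for example:
#   import subprocess
#   with open("fahA.mafft.fas", "w") as out:
#       subprocess.run(mafft_command("auto", "fahA.fas"), stdout=out, check=True)
```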
MAFFT output
Saving the output
• Choose a format: Clustal, Fasta, or click "Reformat" to convert to a selection of other formats
• Save page as a text file
A colored view of the alignment
PRANK
Classical alignment errors for HIV env
PRANK
• Web server: http://www.ebi.ac.uk/goldman-srv/webPRANK/
PRANK output
If you need a different format – copy the results to the READSEQ sequence converter: http://www-bimas.cit.nih.gov/molbio/readseq/
Downloadable PRANK
• http://www.ebi.ac.uk/goldman-srv/prank/prank/
  – PRANK: a command-line program
  – PRANKSTER: a program with a graphical user interface
1. Download and unzip the sequence files from my homepage (Google "Eyal Privman" and look for the workshop materials under "Teaching"). Open "fahA.fas" in Notepad – these are 65 protein sequences in FASTA format.
2. Run PRANKSTER, open the "fahA.fas" file, and run "Alignment" → "Make alignment".
3. While you wait: copy the sequences into the MAFFT web server and run the "automatic" "moderate" strategy. Which strategy did MAFFT choose for you? Click "Reformat", choose "phylip|phylip4", and save as "fahA.mafft.phylip".
4. When PRANKSTER finishes, click "File" → "Save" and save the MSA in PHYLIP format under the name "fahA.prank.phylip".
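Before aligning, it can help to confirm that the FASTA file really holds the expected number of sequences (65 for "fahA.fas"). The following is a minimal sketch of a FASTA reader; the function name `read_fasta` and the toy input are assumptions for illustration, and duplicate headers would silently overwrite each other.

```python
def read_fasta(text):
    """Parse FASTA text into a dict mapping header (without '>') to sequence."""
    parts = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            name = line[1:].strip()
            parts[name] = []
        elif name is not None:
            parts[name].append(line)  # sequences may span several lines
    return {n: "".join(chunks) for n, chunks in parts.items()}


# Toy (hypothetical) input; for the exercise you would read the real file:
#   with open("fahA.fas") as handle:
#       records = read_fasta(handle.read())
example = """>seq1 example protein
MKVLAAGIV
GGSTR
>seq2
MKILSAGIV
"""
records = read_fasta(example)
print(len(records), "sequences")
for name, seq in records.items():
    print(name, len(seq))
```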
Phylogeny reconstruction
Different approaches (algorithms / programs):
• Distance-based methods (e.g. neighbor-joining, as in ClustalW): fast but inaccurate
• Maximum parsimony (e.g. MEGA)
• Maximum likelihood methods (e.g. PhyML, RAxML): accurate but slower
• Bayesian methods (e.g. MrBayes): most accurate but very slow
[Diagram labels: sequences A, B, C, D, E; MSA; Pairwise distance table; Guide tree]
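Distance-based methods start from pairwise distances computed on the MSA. As a rough sketch, the fraction of differing sites (p-distance) can be converted into an estimated number of substitutions per site with the Poisson correction, d = -ln(1 - p). The function names and the toy aligned sequences are assumptions for illustration; real programs typically offer more elaborate substitution models.

```python
import math


def p_distance(a, b):
    """Fraction of differing residues over columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return sum(x != y for x, y in pairs) / len(pairs)


def poisson_distance(p):
    """Poisson-corrected distance: accounts for multiple hits at the same site."""
    if not 0 <= p < 1:
        raise ValueError("p must be in [0, 1)")
    return -math.log(1 - p)


# Two toy (hypothetical) aligned protein fragments.
p = p_distance("MKV-LA", "MKI-LA")
print(round(p, 3), round(poisson_distance(p), 3))
```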
PhyML
The most widely used maximum likelihood (ML) program
• Web server & download: http://www.atgc-montpellier.fr/phyml/
Accepts input MSA in PHYLIP format only:
• Interleaved
• Sequential
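If an alignment is only available in another format, a sequential PHYLIP file can be written with a few lines of code. This is a sketch under assumptions: names are truncated and padded to 10 characters (the classic PHYLIP convention) and followed by a space before the sequence, and the function name `to_phylip_sequential` is made up for the example. Check the result against the format description of the program you feed it to.

```python
def to_phylip_sequential(alignment):
    """Format a dict of name -> aligned sequence as sequential PHYLIP text."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    short_names = [name[:10] for name in alignment]
    if len(set(short_names)) != len(short_names):
        raise ValueError("names are not unique after truncation to 10 characters")
    ncol = lengths.pop()
    # Header line: number of sequences and number of alignment columns.
    lines = [f"{len(alignment)} {ncol}"]
    for name, seq in alignment.items():
        lines.append(f"{name[:10]:<10} {seq}")
    return "\n".join(lines) + "\n"


# Toy (hypothetical) two-sequence alignment.
print(to_phylip_sequential({"seqA": "MK-V", "seqB": "MKIV"}), end="")
```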
Downloadable PhyML
Less user-friendly, but allows using local computer power
• Run "phyml.bat"
• Drag the file from Windows Explorer to the blue window
• Enter "d" to switch from DNA to AA
• Enter "y" to run
1. Give "fahA.prank.phylip" or "fahA.mafft.phylip" as input to the PhyML web server (don't forget to choose "Amino-acids" and enter your email)
2. Run it with the local installation of "phyml.bat"
You should end up with a file: "fahA.prank.phylip_phyml_tree.txt"
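The local run can also be scripted instead of going through the interactive menu. The sketch below only builds the command line and the expected tree file name; the `-i` (input) and `-d` (data type) options are, to my knowledge, those of PhyML 3.x, while the executable name "phyml" and the helper names are assumptions that depend on your installation.

```python
def phyml_command(phylip_path, datatype="aa", binary="phyml"):
    """Argument list for a non-interactive PhyML run on a PHYLIP alignment."""
    if datatype not in ("aa", "nt"):
        raise ValueError("datatype must be 'aa' (protein) or 'nt' (DNA)")
    return [binary, "-i", phylip_path, "-d", datatype]


def phyml_tree_file(phylip_path):
    """Tree file name as used in this workshop: input name plus a fixed suffix."""
    return phylip_path + "_phyml_tree.txt"


print(" ".join(phyml_command("fahA.prank.phylip")))
print(phyml_tree_file("fahA.prank.phylip"))

# A real run (requires PhyML to be installed and on the PATH):
#   import subprocess
#   subprocess.run(phyml_command("fahA.prank.phylip"), check=True)
```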
RAxML
• Web server: http://phylobench.vital-it.ch/raxml-bb/
• Similar maximum likelihood (ML) methodology to PhyML, but much faster:
  – Faster results
  – Better results in the same run-time
Downloadable RAxML
• A command-line program: http://icwww.epfl.ch/~stamatak/index-Dateien/Page443.htm (on that page you will also find instructions for running on Windows, and the RAxML manual)
• easyRAx takes care of some of the RAxML options for you: http://projects.exeter.ac.uk/ceem/easyRAx.html but installation is somewhat more complex
1. Give "fahA.prank.phylip" or "fahA.mafft.phylip" as input to the RAxML web server (don't forget to tick "Protein sequences" and enter your email)
Save the resulting tree file as: "fahA.prank.phylip.raxml"
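For the downloadable command-line RAxML, a run on a protein alignment might be assembled as sketched below. The options `-s` (alignment), `-n` (run name), `-m` (model) and `-p` (random seed) are taken from the RAxML manual as far as I know; the binary name "raxmlHPC", the model choice, the seed value, and the output name "RAxML_bestTree.<run name>" are assumptions that vary between versions and builds, so check the manual for yours.

```python
def raxml_command(phylip_path, run_name, model="PROTGAMMAWAG",
                  seed=12345, binary="raxmlHPC"):
    """Argument list for a basic RAxML maximum likelihood search."""
    return [binary, "-s", phylip_path, "-n", run_name,
            "-m", model, "-p", str(seed)]


def raxml_best_tree(run_name):
    """Expected name of the best-scoring tree file (assumed naming scheme)."""
    return f"RAxML_bestTree.{run_name}"


print(" ".join(raxml_command("fahA.prank.phylip", "fahA_prank")))
print(raxml_best_tree("fahA_prank"))
```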
FigTree: tree visualization and figure creation
Manipulate a node
Manipulate a clade
Manipulate a taxon
1. Open "fahA.prank.phylip_phyml_tree.txt" in FigTree
2. Play around with the different options and make a pretty figure!
   1. Find out how to color specific clades, as below
   2. Try each of the three options under "Layout"
   3. Export a figure in PDF format (File → Export Graphic…)
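Tree files such as "fahA.prank.phylip_phyml_tree.txt" are plain text in Newick format, so a quick sanity check (for example, that all 65 sequences appear as leaves) is possible before opening them in FigTree. The sketch below pulls leaf names out with a regular expression; it assumes unquoted names without spaces, and the function name and toy tree are illustrative.

```python
import re


def leaf_names(newick):
    """Return leaf labels of a Newick tree, in order of appearance.

    A leaf label directly follows '(' or ','; labels after ')' are
    internal-node labels (such as support values) and are skipped.
    """
    return re.findall(r"[(,]\s*([^\s(),:;]+)", newick)


# Toy (hypothetical) tree with branch lengths and one support value.
tree = "((A:0.1,B:0.2)0.95:0.05,C:0.3);"
print(leaf_names(tree))

# For the exercise you would read the real file:
#   with open("fahA.prank.phylip_phyml_tree.txt") as handle:
#       print(len(leaf_names(handle.read())))
```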
Thanks for your attention and happy phylogeny…