genome-scale phylogenomics
Post on 17-Jul-2015
Collaborators
• Lyon collaborators:
• Adrián Arellano Davín
• Gergely Szöllősi (Budapest),
• Eric Tannier,
• Vincent Daubin,
• Thomas Bigot,
• Magali Semeria,
• Manolo Gouy,
• Laurent Duret
• Austin collaborators:
• Siavash Mirarab
• Md. Shamsuzzoha Bayzid
• Tandy Warnow
• RevBayes collaborators:
• Sebastian Hoehna • Michael Landis • Tracy Heath • Fredrik Ronquist • Brian Moore • John Huelsenbeck • …
To study genome evolution:
1. One species tree
2. Thousands of gene trees
[Figure: species tree for species A, B, C, D drawn against a time axis, with a discrete character (a, a, b, a) and a continuous character (0.1, 0.2, 0.2, 0.4) at the tips]
Why our current pipeline can be improved
• Gene alignments:
  • Error prone
  • Short
  • Point estimates
• Gene trees:
  • Based on alignments
  • Point estimates
• Species trees:
  • Based on gene trees
[Figure: gene trees drawn inside a species tree for species A, B, C, D against a time axis, with panels labelled D (duplication), DL (duplication and loss), LGT (lateral gene transfer), and ILS (incomplete lineage sorting)]
• DL: Boussau et al., Genome Research 2013
• DL+T: Szöllősi et al., PNAS 2013
• ILS: Mirarab et al., Science 2014
PHYLDOG
Joint reconstruction of the species tree, gene trees, and numbers of duplications and losses
• All gene families (thousands of alignments) → PHYLDOG → rooted species tree, numbers of duplications and losses, rooted gene trees
• Probabilistic models:
  • sequence evolution
  • gene family evolution
[Figure: species tree for species A, B, C, D with duplications D1 to D6 and losses L1 to L6 placed on its branches]
Boussau et al., Genome Research 2013
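To make the idea of placing duplications on a species tree concrete, here is a minimal sketch. It is not PHYLDOG's method: PHYLDOG uses probabilistic models, whereas this sketch uses the much simpler parsimony rule of mapping each gene tree node to the last common ancestor (LCA) of its species. The nested-tuple tree encoding and the function names are assumptions made for illustration only.

```python
def clades(tree):
    """Return (leaf set of `tree`, list of the leaf sets of all its clades)."""
    if isinstance(tree, str):
        leaf = frozenset([tree])
        return leaf, [leaf]
    left_set, left_all = clades(tree[0])
    right_set, right_all = clades(tree[1])
    both = left_set | right_set
    return both, left_all + right_all + [both]


def lca(species_clades, species):
    """Smallest species tree clade that contains every species in `species`."""
    return min((c for c in species_clades if species <= c), key=len)


def count_duplications(gene_tree, species_clades):
    """Return (species clade the gene tree root maps to, number of duplications).

    Leaves of the gene tree are labelled with the species they come from.
    A node is called a duplication when it maps to the same species clade
    as one of its children (LCA reconciliation, a parsimony heuristic).
    """
    if isinstance(gene_tree, str):
        return frozenset([gene_tree]), 0
    left_map, left_dups = count_duplications(gene_tree[0], species_clades)
    right_map, right_dups = count_duplications(gene_tree[1], species_clades)
    here = lca(species_clades, left_map | right_map)
    is_duplication = here == left_map or here == right_map
    return here, left_dups + right_dups + int(is_duplication)


# Hypothetical example: species tree ((A,B),(C,D)).
species_tree = (("A", "B"), ("C", "D"))
_, species_clades = clades(species_tree)

# A gene tree in which species A carries two copies of the gene.
gene_tree = (("A", "B"), ("A", ("C", "D")))
_, n_dups = count_duplications(gene_tree, species_clades)
print(n_dups)
```

A gene tree identical to the species tree yields zero duplications under this rule. Losses could be counted along the same mapping, but that step is left out to keep the sketch short.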
PHYLDOG: better trees for better ancestral genomes
[Figure: mammalian species tree with clades labelled Laurasiatheria, Afrotheria, Xenarthra, Marsupials, Primates, and Glires, and a legend listing PHYLDOG, TreeBeST, and PhyML. Tip labels: Sus scrofa, Felis catus, Ornithorhynchus anatinus, Oryctolagus cuniculus, Loxodonta africana, Mus musculus, Gorilla gorilla, Dipodomys ordii, Monodelphis domestica, Vicugna pacos, Macaca mulatta, Tupaia belangeri, Procavia capensis, Spermophilus tridecemlineatus, Pongo pygmaeus, Tursiops truncatus, Microcebus murinus, Callithrix jacchus, Equus caballus, Erinaceus europaeus, Tarsius syrichta, Choloepus hoffmanni, Ochotona princeps, Cavia porcellus, Pan troglodytes, Bos taurus, Rattus norvegicus, Homo sapiens, Otolemur garnettii, Dasypus novemcinctus, Echinops telfairi, Pteropus vampyrus, Macropus eugenii, Canis familiaris, Sorex araneus, Myotis lucifugus]
An example gene family
[Figure: two reconstructions of the same gene family, one by TreeBeST and one by PHYLDOG, with scale bars 0.1 and 0.3. Tips are sequences from Ornithorhynchus anatinus, Mus musculus, Cavia porcellus, Oryctolagus cuniculus, Canis familiaris, Bos taurus, Homo sapiens, Pongo pygmaeus, Equus caballus, Callithrix jacchus, Monodelphis domestica, and Spermophilus tridecemlineatus, several of them present in multiple copies]
Boussau et al., Genome Research 2013
Gene transfers and the quixotic pursuit of the TOL
Doolittle WF, Science 1999
“The monistic concept of a single universal tree appears […] increasingly obsolete. […] [It is] no longer the most scientifically productive position to hold […] [It] accounts for only a minority of observations from genomes.”
Bapteste, O’Malley, Beiko, Ereshefsky, Gogarten, Franklin-Hall, Lapointe, Dupré, Dagan, Boucher, Martin, Biology Direct 2009.
Using transfers to date clades
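[Figure: species tree drawn against a time axis, with the relative order of its nodes in question]
Because we can identify gene transfers, we have information for ordering the nodes of a species tree: a transfer can only happen between lineages that exist at the same time, so each transfer constrains the relative ages of the donor and recipient branches.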
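The way transfers constrain node order can be sketched as follows. This is an illustration under stated assumptions, not the method used in the talk: each branch is named by the node at its lower end, a transfer is a hypothetical (donor branch, recipient branch) pair, and the only constraint used is that the donor branch must start (at its parent node) before the recipient branch ends (at its own node). A topological sort then returns one node order compatible with all constraints, or reports a conflict.

```python
from graphlib import CycleError, TopologicalSorter


def order_nodes(parent, transfers):
    """Return species tree nodes from oldest to youngest, or None on conflict.

    parent:    dict mapping each non-root node to its parent node.
    transfers: iterable of (donor, recipient) pairs; each name is the node
               at the lower end of the branch involved (assumed encoding).
    """
    nodes = set(parent) | set(parent.values())
    # older[n] holds every node that must be older than n.
    older = {node: set() for node in nodes}
    for child, par in parent.items():
        older[child].add(par)  # a parent is older than its child
    for donor, recipient in transfers:
        if donor in parent:  # the root branch has no parent node to constrain
            # The donor branch starts at parent[donor]; that start must
            # predate the end of the recipient branch.
            older[recipient].add(parent[donor])
    try:
        return list(TopologicalSorter(older).static_order())
    except CycleError:
        return None  # the transfers contradict each other or the tree


# Hypothetical species tree ((A,B)x,(C,D)y)root.
parent = {"A": "x", "B": "x", "C": "y", "D": "y", "x": "root", "y": "root"}

# A transfer from the branch leading to A into the branch leading to y
# implies that x is older than y.
print(order_nodes(parent, [("A", "y")]))

# Adding a transfer from the branch leading to C into the branch leading
# to x implies the opposite, so no consistent order exists.
print(order_nodes(parent, [("A", "y"), ("C", "x")]))
```

With real data, inferred transfers are uncertain, so contradictory constraints would need to be weighed rather than rejected outright; the hard conflict check here is a simplification.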
Bayesian species tree inference accounting for DTL events
• STRALE:
  • A Bayesian probabilistic method that can interpret thousands of gene trees in terms of:
    • speciation events
    • duplication events (D)
    • transfer events (T)
    • loss events (L)
  • A method able to estimate the DTL rates
  • A method able to reconstruct the species tree
  • A method able to order the nodes of the species tree
Simulation to test the species tree reconstruction
• 20 species
• 200 gene families
[Figure: simulated and inferred species trees, tips numbered 0 to 19, each drawn against a time axis]
Better gene trees, fewer transfers
[Figure: two panels comparing the usual approach with ALE+DTL. Panel 1: RF distance to real tree. Panel 2: transfer events per family]
Szöllősi et al., Syst. Biol. 2013
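For readers unfamiliar with the RF (Robinson-Foulds) distance used in the comparison above, here is a minimal sketch of how it can be computed. It counts the bipartitions of the leaf set that appear in one tree but not in the other. The nested-tuple tree encoding and the function names are assumptions for illustration; this is not the code used to produce the figure.

```python
def leaves_and_clades(tree):
    """Return (leaf set of `tree`, list of leaf sets of its internal nodes)."""
    if isinstance(tree, str):
        return frozenset([tree]), []
    leaf_set = frozenset()
    found = []
    for child in tree:
        child_leaves, child_clades = leaves_and_clades(child)
        leaf_set |= child_leaves
        found += child_clades
    found.append(leaf_set)
    return leaf_set, found


def splits(tree):
    """Non-trivial bipartitions of the tree, ignoring where the root sits."""
    all_leaves, found = leaves_and_clades(tree)
    reference = min(all_leaves)  # fixed leaf used to pick one side of a split
    result = set()
    for clade in found:
        # Keep only splits with at least two leaves on each side.
        if 2 <= len(clade) <= len(all_leaves) - 2:
            # Always store the side that does not contain the reference leaf.
            result.add(all_leaves - clade if reference in clade else clade)
    return result


def rf_distance(tree_a, tree_b):
    """Number of bipartitions found in exactly one of the two trees."""
    return len(splits(tree_a) ^ splits(tree_b))


# Hypothetical example: the inferred tree misplaces B and C.
real_tree = ((("A", "B"), "C"), ("D", "E"))
inferred_tree = ((("A", "C"), "B"), ("D", "E"))
print(rf_distance(real_tree, inferred_tree))
```

Both trees are assumed to have the same leaf set with unique leaf names. A distance of 0 means the two trees have the same unrooted topology.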
Better ancestral genomes:
Go see Adrián Arellano Davín’s poster on reconstructing ancestral genomes across the tree of life!
RevBayes
• Collaborative effort
• Model-based phylogenetics
• Many models of sequence evolution
• Models for dating
• Models for phylogeography
• Models for continuous traits
• Models for gene tree/species tree inference
• http://revbayes.net
• Sebastian Hoehna • Michael Landis • Tracy Heath • Fredrik Ronquist • Nicolas Lartillot • Brian Moore • John Huelsenbeck • …
Conclusions
• We develop methods for gene tree and species tree inference
• Improvement of gene trees and species trees in the presence of:
• duplications and losses,
• transfers,
• incomplete lineage sorting
• Parallel algorithms applicable to genome-scale data
Thanks!
• Lyon collaborators:
• Adrián Arellano Davín
• Gergely Szöllősi (Budapest),
• Eric Tannier,
• Vincent Daubin,
• Thomas Bigot,
• Magali Semeria,
• Manolo Gouy,
• Laurent Duret
• Austin collaborators:
• Siavash Mirarab
• Md. Shamsuzzoha Bayzid
• Tandy Warnow
Go see Adrián Arellano Davín’s poster on reconstructing ancestral genomes across the tree of life!