Animal phylogeny
The animal kingdom comprises 34 large groups, or phyla
- species within a phylum share a basic body plan, or set of characters that allows them to be grouped
Relationships between phyla are difficult to determine, and remain controversial after 150 years of work
- lack of homologous characters between phyla
- simultaneous radiation of most groups during Cambrian
hard to tell which groups are sister taxa, or who came first
“Traditional” Animal Phylogeny
- acoelomates, pseudocoelomates branch off first
- 2 deep divisions: protostomes + deuterostomes
i.e., wrong
Who is related to whom?
Mollusc Annelid Arthropod
- annelids + arthropods share obvious segmentation
- common ancestor, segmented?
Who is related to whom?
Mollusc Annelid Arthropod
- molluscs + annelids share trochophore larval stage
- common ancestor, with trochophore?
Molecular Phylogeny re-wrote the book
Molluscs + Annelids: ancestor with trochophore larva (Lophotrochozoa)
Arthropods + Nematodes: ancestor with a cuticle (Ecdysozoa)
… Protostomes were divided into 2 major sub-groups
Based on analysis of DNA, gene order…
Animal phylogeny: re-written by genetics
Initial molecular phylogenetic studies compared genes that were highly conserved across phyla
Compare a gene that is so important, it does not change much even over long periods of time
- that way, creatures as different as a person and a sponge still have sequences that are similar enough to compare
Small subunit ribosomal RNA gene (18S rRNA) is strongly conserved by selection; permits comparisons across very distantly related organisms
Lophophorate Placement: 18S rRNA data
Halanych et al. 1995
Comparison of the highly-conserved 18S rRNA gene sequence showed that all 3 lophophorates group with protostomes, not with the deuterostomes as previously thought
Proposed clade within protostomes: the Lophotrochozoa
- lophophorates, molluscs + annelids; excludes arthropods
- no synapomorphy; either have a lophophore, or a trochophore larva
Long-Branch Attraction Problem
Some taxa have fast-evolving DNA
Often drop out at base of tree, clustered with:
(a) primitive animals
(b) other fast-evolvers, whom they may not be related to
Early molecular studies found that nematodes dropped out near the cnidarians, suggesting they were basal bilaterians
- this supported morphological analysis that said nematodes lacked a true coelom, so were basal to (more primitive than) other bilaterians
This is an artifact (a false result) of how computer programs analyze DNA sequences, called long-branch attraction
- sequences that are fast-evolving give very long branches on trees, which tend to “attract” other long branches
sequences that are very different (fast mutating) get lumped together with other fast-evolving sequences
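To see why long branches can mislead an analysis, here is a minimal sketch that is not from the slides. It assumes a toy model with only two character states, four taxa, and a true tree ((A,B),(C,D)) in which A and C are the fast-evolvers. The function names and the change probabilities (0.40 for long branches, 0.05 for short ones) are illustrative assumptions. It computes how often each informative site pattern is expected, then reports the grouping that a simple count of supporting sites would favor.

```python
from itertools import product

def pattern_probs(long_p, short_p):
    """Expected frequency of each informative site pattern on the true tree
    ((A,B),(C,D)). A and C sit on long branches (change probability long_p);
    B, D and the internal branch are short (change probability short_p)."""
    def step(p, x, y):
        # probability of ending in state y, starting from x, along one branch
        return p if x != y else 1.0 - p

    probs = {"AB|CD": 0.0, "AC|BD": 0.0, "AD|BC": 0.0}
    for a, b, c, d in product((0, 1), repeat=4):
        # sum over the unobserved states at the two internal nodes u and v
        total = 0.0
        for u, v in product((0, 1), repeat=2):
            total += (0.5
                      * step(long_p, u, a) * step(short_p, u, b)
                      * step(short_p, u, v)
                      * step(long_p, v, c) * step(short_p, v, d))
        if a == b and c == d and a != c:
            probs["AB|CD"] += total
        elif a == c and b == d and a != b:
            probs["AC|BD"] += total
        elif a == d and b == c and a != b:
            probs["AD|BC"] += total
    return probs

def favored_grouping(long_p, short_p):
    """Grouping backed by the largest expected share of sites."""
    probs = pattern_probs(long_p, short_p)
    return max(probs, key=probs.get)

print(favored_grouping(0.05, 0.05))  # all branches short -> AB|CD (the true tree)
print(favored_grouping(0.40, 0.05))  # A and C fast-evolving -> AC|BD (long branches "attract")
```

In the second call the two fast-evolving taxa end up sharing states by chance more often than true relatives share them by descent, which is the artifact described above.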
Long-Branch Attraction Problem
Turned out that most nematodes have very fast-mutating DNA, so give long branches and tend to fall out near the base of any tree
A slow-evolving nematode DNA sequence grouped with the arthropods, contradicting the older hypothesis that “pseudocoelomates” were basal (= primitive) to other bilaterians
Ecdysozoa: clade of molting protostomes
Later analyses used 18S sequences to group molting pseudocoelomates such as nematodes & priapulids with the arthropods
- clade was named Ecdysozoa, to reflect the synapomorphy of molting a cuticle to grow
All subsequent analyses have supported this split within the protostomes
Traditional vs. Molecular Trees
Molecular studies found a hidden deep division in protostomes that conventional morphology had never suggested
Also moved lophophorates out of Deuterostomia
Morphology Molecular
Next-generation sequencing
Recent animal phylogenies have relied on recent advances in sequencing technology and computational analysis
1) reverse-transcribe all mRNA in a tissue sample into DNA
- this gives you a cDNA library of all the protein-coding genes that were being expressed in that tissue, which is called the transcriptome, or an EST library
2) use 454 pyrosequencing (“next-generation sequencing”) to get partial sequences for thousands of genes
- you get many short sequences (~300 bp) which are hopefully overlapping, allowing computer to align them and infer the whole gene sequence from the pieces
- 400 million nucleotides can be sequenced in 10 hr
Phylogenomics
top represents the complete gene sequence, inferred from the aligned, overlapping pieces from individual runs
each sequence is one piece of this particular gene, assembled by the computer based on the parts where they overlap (i.e., where the sequences are identical)
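The idea of inferring a whole gene from overlapping pieces can be sketched with a greedy merge. This is a simplified illustration only: real assemblers handle sequencing errors, repeats and reverse complements, and the function names, the `min_len` cutoff and the toy reads below are all assumptions made for the example.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b
    (at least min_len long), or 0 if there is none."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest exact overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    # drop reads that are fully contained in another read
    reads = [r for r in reads if not any(r != s and r in s for s in reads)]
    while len(reads) > 1:
        best = (0, None, None)
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            break  # no overlaps left: the remaining pieces stay separate
        merged = reads[i] + reads[j][n:]
        reads = [r for k, r in enumerate(reads) if k not in (i, j)] + [merged]
    return reads

# hypothetical short reads covering one made-up 20-bp "gene"
toy_reads = ["GTACCTGAAG", "ATGGCGTACC", "TGAAGTTCGA"]
print(greedy_assemble(toy_reads))  # ['ATGGCGTACCTGAAGTTCGA']
```

If two reads never overlap, the function returns several separate pieces, which corresponds to an incomplete gene sequence.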
Phylogenomics
3) computer algorithms are then used to piece together each gene sequence
4) the corresponding amino acid sequences are then used to compute the phylogenetic tree
Phylogenomics
Problems with next-generation sequencing approaches to phylogenomics:
- any given gene sequence may be incomplete
- some genes may not be expressed in a given tissue, so they will not get sequenced
- cost limits how many runs you can do, hence how many sequences you can obtain for a given species
lots of missing data in the final dataset: the sequence of any particular gene may be available for only some of the species you are trying to put into a phylogeny
- contamination from symbionts or food
- determining whether copies of a gene from different taxa are orthologs, meaning copies of the same gene and not a related but different gene
- some genes exist as families of similar genes, related by descent from one ancestral gene
Phylogenomics
made-up example: say animals have Actin 1, Actin 2, and Actin 3 genes, descended from one ancestral Actin gene
[Figure: ancestral Actin gene (present in 1st bilaterian), ancestral amino acids GAFLSM.., followed by gene duplication: 3 versions of actin were present in the ancestral mollusc, with similar but different amino acids]
actin 1    actin 2    actin 3
GAFLSM..   GAFGSW..   TALLMM..
(in the original figure, red letters = amino acids that changed by mutation since the 3 versions appeared in the ancestral mollusc)
Phylogenomics
made-up example, continued: the 3 “Actin” genes in the ancestral mollusc (actin 1 GAFLSM.., actin 2 GAFGSW.., actin 3 TALLMM..) diverge over time in each lineage
          actin 1    actin 2    actin 3
chiton    GAFLSM..   MAFGSW..   TALLMM..
squid     GAFLSM..   MAFGSW..   TALLMM..
clam      AAFPMM..   GWFGSP..   KRLLMY..
snail     AAFPMM..   GAFLSP..   KRLLMQ..
related lineages should have more similar amino acids for each gene ortholog (copy of same original version)
[Figure: tree with tips in the order clam, snail, squid, chiton]
Phylogenomics
Determining whether copies of a gene are orthologs
- what you DON’T want: to align the copies of Actin 1 from squid, clam and chiton with Actin 2 from snail !!
          actin 1    actin 2    actin 3
chiton    GAFLSM..   MAFGSW..   TALLMM..
squid     GAFLSM..   MAFGSW..   TALLMM..
clam      AAFPMM..   GWFGSP..   KRLLMY..
snail     AAFPMM..   GAFLSP..   KRLLMQ..
- actin 2 is a paralog of the actin 1 gene: a divergent copy that exists alongside actin 1 in the genome
Phylogenomics
          “actin 1” (or so you think..)
chiton    GAFLSM..
squid     GAFLSM..
clam      AAFPMM..
snail     GAFLSP..   (really actin 2!)
at these 3 positions, “snail” now looks more related to chiton + squid because it has the same amino acids as they do!
false information, wrong phylogeny
[Figure: resulting tree with tips in the order chiton, squid, snail, clam]
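The effect of slipping a paralog into the alignment can be checked with a simple count of matching positions. This is a rough sketch using only the six-residue made-up actin fragments from the slides; the helper names `matches` and `closest` are illustrative, and counting identical positions is a stand-in for a real phylogenetic analysis.

```python
def matches(seq_a, seq_b):
    """Number of aligned positions at which two sequences agree."""
    return sum(x == y for x, y in zip(seq_a, seq_b))

def closest(query, references):
    """Name of the reference sequence sharing the most positions with query."""
    return max(references, key=lambda name: matches(query, references[name]))

# made-up actin 1 fragments for the other three molluscs (from the slides)
actin1_others = {"chiton": "GAFLSM", "squid": "GAFLSM", "clam": "AAFPMM"}

snail_actin1 = "AAFPMM"  # the true ortholog
snail_actin2 = "GAFLSP"  # the paralog mistaken for actin 1

print(closest(snail_actin1, actin1_others))  # clam: 6 of 6 positions match
print(closest(snail_actin2, actin1_others))  # chiton: 5 of 6 match (squid ties), clam only 2
```

With the true ortholog the snail sits next to the clam; with the paralog it is pulled toward chiton and squid, giving the wrong phylogeny.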
Hejnol et al. 2009
Most recent animal phylogeny used 1,487 genes
> 270,000 amino acid positions were analyzed
complete data set could not even be fully analyzed due to requirement for huge amount of computer time
- reduced 844-gene dataset used to test stability of proposed relationships
- each gene only had to be present in 18 of the 94 included species (lots of missing data)
- phoronid: only 2 genes used in analysis!
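To get a feel for what an "18 of 94 species" threshold means, here is a small sketch of filtering genes by how many species they were sequenced in. The function names and the toy presence table are hypothetical; only the numbers 18 and 94 come from the slide.

```python
def filter_by_occupancy(presence, min_species):
    """Keep genes that were sequenced in at least min_species species.
    presence maps a gene name to the set of species with a sequence."""
    return {gene: sp for gene, sp in presence.items() if len(sp) >= min_species}

def missing_fraction(presence, n_species):
    """Fraction of empty cells in the gene-by-species data matrix."""
    cells = len(presence) * n_species
    filled = sum(len(sp) for sp in presence.values())
    return 1.0 - filled / cells

# hypothetical toy dataset: 3 genes, 5 species
toy = {
    "geneA": {"sp1", "sp2", "sp3", "sp4"},
    "geneB": {"sp1", "sp2"},
    "geneC": {"sp3"},
}
kept = filter_by_occupancy(toy, min_species=2)
print(sorted(kept))                          # ['geneA', 'geneB']
print(round(missing_fraction(kept, 5), 2))   # 0.4, i.e. 4 of 10 cells are empty

# threshold from the slide: a gene qualifies if present in 18 of 94 species
print(round(18 / 94, 3))      # 0.191
print(round(1 - 18 / 94, 3))  # 0.809: a gene at the cutoff is missing for ~81% of species
```

A gene that only just passes the cutoff therefore contributes no information for roughly four out of five species in the tree.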
the tree indicates that ctenophores are the most basal animals, even more distant than sponges !?!
Do you buy it?
Why or why not?
Are you surprised the position of phoronids is still unknown, if only 2 genes were used to place Phoronis in this tree?
The rules of what makes a grouping “significant” (meaning we can really trust it is correct) state that you need bootstrap support of over 70%
These are the numbers in black to the left of a node = 100%
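Bootstrap support can be illustrated with a toy version of the procedure: resample the alignment columns with replacement, re-run the analysis on each resampled dataset, and report the percentage of replicates that recover a given grouping. The sketch below is an assumption-laden simplification, not the method used for the published tree: it handles only 4 taxa (A, B, C, D), and "re-running the analysis" is replaced by a count of which split the most columns support. All names and the toy alignment are hypothetical.

```python
import random
from collections import Counter

def best_split(columns):
    """Split of 4 taxa (order A, B, C, D) backed by the most columns,
    or None if no column is informative."""
    votes = Counter()
    for a, b, c, d in columns:
        if a == b and c == d and a != c:
            votes["AB|CD"] += 1
        elif a == c and b == d and a != b:
            votes["AC|BD"] += 1
        elif a == d and b == c and a != b:
            votes["AD|BC"] += 1
    return votes.most_common(1)[0][0] if votes else None

def bootstrap_support(alignment, split, replicates=200, seed=0):
    """Percentage of bootstrap replicates in which `split` is the best split.
    alignment is a list of 4 equal-length sequences in the order A, B, C, D."""
    rng = random.Random(seed)
    columns = list(zip(*alignment))
    hits = 0
    for _ in range(replicates):
        # resample columns with replacement, keeping the alignment length
        resampled = [rng.choice(columns) for _ in columns]
        if best_split(resampled) == split:
            hits += 1
    return 100.0 * hits / replicates

# toy alignment in which every column groups A with B and C with D
clean = ["AAGG", "AAGG", "CCTT", "CCTT"]
print(bootstrap_support(clean, "AB|CD"))  # 100.0
print(bootstrap_support(clean, "AC|BD"))  # 0.0
```

When the columns conflict, support for any one split falls below 100%, and a grouping would be treated as trustworthy only above the 70% cutoff mentioned above.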
Of the major clades that group many phyla together (shaded boxes, or names on the left of the tree), in which can we be really confident?
Based on the rules of what makes a grouping count as “significant” (meaning we can really trust it), only the following clades are secure:
- Bilateria
- Protostomia
- Deuterostomia
None of the other named clades of phyla I have been using were “significantly” supported
Does it surprise you that after 100 years of effort, in the age of whole-genome sequencing, we still do not really know what phyla are sister to what other phyla??
What could be done to try to better resolve the relationships among phyla? What kinds of things can you think of that would help to improve this tree either in terms of data or methods?
Taxonomic rank “phylum” is used to lump animals into groups based on body plan (arrangement of morphological traits)
Most phyla are pretty different from each other; which phyla are sister groups remains controversial
Body plan is the product of development, when genetic information is converted into tissues/organs, relative position, numbers, shape...
- why are there only a handful of different body plans?
“Why should there be so much variety and so little real novelty?”- Darwin, 1872
Body Plan evolution
Holland (1998) proposed 6 major developmental transitions during the evolution of the Animal Kingdom
- meaning, times when big changes in developmental control mechanisms resulted in major changes to body plan
- changes in development = differences in control genes (Hox) and genes they boss around (whose expression they control)
[Figure: animal tree marked with the 6 transitions: 1 multicellularity, 2 symmetry, 3 bilateral symmetry + 3 germ layers, 4 axis flip, 5, 6]
Transition 1 - origin of multicellularity
- cell layers, cell adhesion, spatially controlled differentiation
- genes encoding cadherins, collagen, lectin proteins
- transcription factors controlling cell differentiation
- in fact, sponge homeobox genes correspond to genes in higher animals controlling cell differentiation, not spatial organization
Transition 2 - origin of symmetry + tissues
- tissues (2 or more cell types working together)
- nerves (genes for ion channels, neurotransmitters, etc)
- spatial information: body axis formation
Transition 3 - origin of bilateral symmetry + 3 germ layers
- nerve cord (running down belly, or down back)
- complete digestive system
- 3 germ layers (ectoderm, endoderm, mesoderm)
- bilateral symmetry, meaning several body axes:
anterior-posterior (head to tail)
dorsal-ventral (back to front)
left-right
proximal-distal (near the center out toward tips)
Body Plan evolution
Duplication of an ancestral gene cluster gave rise to Hox genes (ectoderm) and ParaHox genes (endoderm), needed for a 3-layer embryo with a complete gut
This gene duplication may have spurred the Cambrian Explosion, when most modern phyla appear at about the same time in the fossil record
Hox genes originally patterned ectoderm of triploblast phyla
ParaHox gene cluster: Gsx --- Xlox --- Cdx
Gsx affects anterior end of gut, near mouth
Xlox patterns middle of complete gut (pancreas)
Cdx patterns the end of gut, near anus
endoderm patterning
a) duplication of ancestral Hox-like gene into a cluster of related genes (happened before step #2, cnidarians)
b) duplication of ancestral cluster of Hox-like genes (now 2 sets)
c) divergence of double cluster into 2 different clusters (genes diverge over time):
1) ParaHox genes (Gsx, Xlox, Cdx), for patterning endoderm
2) Hox genes, controlling ectoderm
Happened at step #3, before most animal phyla diverged
[Figure: ParaHox and Hox clusters, each drawn from (head) to (tail)]
Transition 4 - dorsal-ventral axis inverted in protostomes
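- back-to-front (dorsal-ventral) axis flipped in ancestor of protostomes such that they have nerve cord on ventral side, not dorsal
[Figure: cross-sections of a worm and you; in the worm the nerve cord lies ventral to the gut, in you the nerve cord (backbone) lies dorsal to the gut]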
Body Plan evolution
Transition 5 - origin of the vertebrates
- migrating neural crest cells, key to development of our complex central nervous systems
- new cell types (e.g., osteoblasts that build bones)
- tetraploidy (= 4 copies) of Hox cluster: double-duplication
Transition 6 - after hagfish, rest of vertebrates got:
- wandering mesoderm cells
- 2 pairs of appendages (tetrapods)
- cranial arches become jaws
Body Plan evolution
Transitions 5 & 6 involved major gene duplication events
- whole groups of genes were duplicated via tetraploidy of the genome (four copies of everything!)
- some gene copies were lost after duplication
- other copies, especially of developmental genes, were kept
gene duplication produces new master control genes that can take on new roles, producing changes in body plans
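step #5: double-duplication of Hox cluster in vertebrate ancestor
[Figure: ParaHox cluster (Gsx, Xlox, Cdx) and Hox cluster; the single Hox cluster of a fly compared with the mouse Hox clusters produced by the 1st and 2nd duplications]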