Post on 01-Jan-2016
3- RIBOSOMAL RNA GENE RECONSTRUCTION
Phenetics vs. Cladistics
Homology/Homoplasy/Orthology/Paralogy
Evolution vs. Phylogeny
The relevance of the alignment
The algorithms
Bootstrap
One tree is no tree
Phylogenetic coherence (monophyly)
16S rRNA
Functional genes (MLSA)
Genomic analyses
Genomic coherence
DNA-DNA reassociation
G+C, AFLP, MLSA
Genomic comparisons (ANI; AAI)
[Figure labels: 70-50%; 70%; 100%, 60%, 70%, 80%, 50%]
Phenotypic coherence
Metabolism
Chemotaxonomy
Spectrometry (MALDI-TOF; ICR-FT/MS)
Generally based on 16S rRNA gene analysis
It is important to recognize the closest relatives by means of the type strain gene sequences
Housekeeping genes (MLSA approach or single gene) may help to resolve phylogenies
Future work will rely on full-genome sequences
[Figure: dendrogram; leaf labels as extracted: M8M31, PR1C12, M1, E11, C25AE7, P13, C16, C4C5, A1, C9, P18, A7, E3; scale marks: 80, 85, 90]
Similarity matrix or alignment
OTU A  10100010010010010
OTU B  11010001010001010
OTU C  00010010011110101
OTU D  00111110010101010
OTU E  00010010111001101
…
PHENETICS
CLADISTICS
Phenetics vs Cladistics
Data can be treated as presence/absence/intensity to generate similarity matrices
If data are analyzed by their similarity => PHENETICS
If data are analyzed in an evolutionary context (i.e. changes in homologous characters are mutations or evolutionary steps) => CLADISTICS
For evolutionary purposes it is necessary to recognize HOMOLOGY
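The phenetic step can be sketched in a few lines. This is a minimal sketch, assuming the simple matching coefficient (the slide does not name a coefficient) and using the five OTU presence/absence profiles from the slide. The function names are illustrative only.

```python
# Presence/absence profiles of the five OTUs listed on the slide.
OTUS = {
    "A": "10100010010010010",
    "B": "11010001010001010",
    "C": "00010010011110101",
    "D": "00111110010101010",
    "E": "00010010111001101",
}


def simple_matching(x: str, y: str) -> float:
    """Fraction of characters that have the same state in both profiles."""
    if len(x) != len(y):
        raise ValueError("profiles must have the same length")
    return sum(a == b for a, b in zip(x, y)) / len(x)


def similarity_matrix(otus: dict) -> dict:
    """Pairwise similarity for all OTUs (symmetric, 1.0 on the diagonal)."""
    return {i: {j: simple_matching(otus[i], otus[j]) for j in otus} for i in otus}


matrix = similarity_matrix(OTUS)
for name, row in matrix.items():
    print(name, " ".join(f"{value:.2f}" for value in row.values()))
```

A clustering method would then turn this matrix into a dendrogram; a different coefficient (for example one that ignores shared absences) would give different values.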
HOMOLOGY, ORTHOLOGY, PARALOGY, HOMOPLASY
[Diagram: Organism A, Gene X; Organism B, Gene X; Organism A, Gene X, Gene X', Gene X'']
Homology: same ancestral origin
Homoplasy: false homology
Orthology: homologous genes in different organisms
Paralogy: homologous genes in the same organism; gene duplications with identical or different function
Evolution ≠ phylogeny
Evolution => mutations (morphometrics) + age (fossil record)
Phylogeny = genealogy => we know only the tips of the tree; nothing is said about putative ancestors
PROKARYOTES => no fossil record => molecular clocks
Molecular clocks (housekeeping genes):
16S rRNA; 23S rRNA; ATPases; elongation factor Tu; gyrases…
The 16S rRNA:
Universally represented
Conserved
Not protein coding
Base pairing (helix)
Natural amplification
Proper size
Evolution vs. Phylogeny
Ludwig and Schleifer, 1994, FEMS Microbiol Rev 15:155-173
The relevance of the alignment
To perform cladistic analyses we should first align all sequences in order to recognize all homologous positions.
Recognition by:
Sequence similarities
Base pairing due to secondary structure (helices for rRNA)
Insertions & deletions
Empirically (subjective)
Minimize homoplastic influences
There are many alignment programs; all look for common features that may indicate homologous sites:
Clustal X
MAFFT
PileUp
…
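To illustrate how sequence similarity and insertions/deletions are weighed against each other, here is a minimal sketch of a global pairwise alignment (Needleman-Wunsch). It is not how any of the programs listed above is implemented; they align many sequences at once and use more elaborate scoring. The scoring values (match 1, mismatch -1, gap -2) are assumptions chosen for the example.

```python
def needleman_wunsch(s: str, t: str, match: int = 1, mismatch: int = -1, gap: int = -2):
    """Global alignment of s and t; returns (score, aligned_s, aligned_t)."""
    n, m = len(s), len(t)
    # score[i][j] holds the best score for aligning s[:i] with t[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = match if s[i - 1] == t[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + step,  # align the two bases
                score[i - 1][j] + gap,       # gap in t
                score[i][j - 1] + gap,       # gap in s
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_s, out_t = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            step = match if s[i - 1] == t[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + step:
                out_s.append(s[i - 1])
                out_t.append(t[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_s.append(s[i - 1])
            out_t.append("-")
            i -= 1
        else:
            out_s.append("-")
            out_t.append(t[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_s)), "".join(reversed(out_t))


best, top, bottom = needleman_wunsch("GCCAT", "GCAT")
print(best)
print(top)
print(bottom)
```

When several alignments share the best score, the traceback returns only one of them, which is one reason the empirical (subjective) inspection mentioned above still matters.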
Most programs do not take secondary structure into account, only sequence motif similarities
rRNA has a secondary structure with helices that helps in aligning sequences
Alignments of functional genes or translated proteins cannot be improved by secondary structure analysis
ARB does take into account features such as helix pairing
The alignment improves as the number of sequences increases
www.arb-home.de
www.arb-silva.de
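The idea of using helix pairing as an alignment check can be sketched as follows. This is not ARB's implementation; it is a minimal illustration that assumes two alignment columns are proposed helix partners and asks in how many sequences the bases can actually pair (Watson-Crick or G-U wobble). The aligned fragment is hypothetical.

```python
# Base pairs allowed in an RNA helix: Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def pairing_support(alignment: list, col_i: int, col_j: int) -> float:
    """Fraction of sequences whose bases at columns col_i and col_j can pair."""
    paired = sum((seq[col_i], seq[col_j]) in PAIRS for seq in alignment)
    return paired / len(alignment)


# Hypothetical aligned fragment: columns 0, 1, 2 are meant to pair with
# columns 9, 8, 7 (a three-base-pair stem closed by a four-base loop).
ALIGNMENT = [
    "GGCAAAAGCC",
    "GACAAAAGUC",  # compensatory change: G-C replaced by A-U at columns 1 and 8
    "GGUAAAAGCC",  # wobble pair U-G at columns 2 and 7
]

for left, right in [(0, 9), (1, 8), (2, 7), (3, 4)]:
    print(left, right, pairing_support(ALIGNMENT, left, right))
```

Columns that keep pairing despite sequence changes support the alignment; a proposed helix position with low support suggests that one of the two columns is misaligned.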
[Figure: nucleotide substitution diagram with T, C, G and A; transitions: T <-> C and G <-> A; transversions: all other changes]
Maximum Likelihood
Like Maximum Parsimony, but takes into account:
the unequal likelihood of mutation events (transitions vs. transversions)
the mutation position
Slower
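The distinction between the two classes of mutation can be made concrete with a short counting sketch. It does not compute a likelihood; it only classifies the differences between two aligned sequences, which is the information a substitution model would then weight. The function name and the example sequences are illustrative.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_substitutions(x: str, y: str) -> tuple:
    """Return (transitions, transversions) between two aligned DNA sequences."""
    if len(x) != len(y):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for a, b in zip(x, y):
        if a == b or "-" in (a, b):
            continue  # identical site or gap: not counted as a substitution
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # purine <-> purine or pyrimidine <-> pyrimidine
        else:
            transversions += 1  # purine <-> pyrimidine
    return transitions, transversions


print(count_substitutions("GCCAT", "GCACC"))
```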
Maximum Parsimony
G C C A T => a
G C A C T => b
G C A C C => c
a – b => 2 mutations
a – c => 3 mutations
b – c => 1 mutation
[Figure: three alternative trees for a, b and c; branch labels 2, 3, 3, 2, 1, 2; total mutations per tree: 5, 5, 3]
(pitfalls: nature may not be parsimonious)
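The mutation counts on the slide can be reproduced with a short sketch. This is only the counting step behind the example, not a parsimony tree search; the function names are illustrative.

```python
from itertools import combinations

# The three aligned sequences from the slide.
SEQS = {"a": "GCCAT", "b": "GCACT", "c": "GCACC"}


def mutations(x: str, y: str) -> int:
    """Number of aligned positions at which two sequences differ."""
    return sum(p != q for p, q in zip(x, y))


def pairwise_mutations(seqs: dict) -> dict:
    """Mutation count for every pair of sequences."""
    return {(i, j): mutations(seqs[i], seqs[j]) for i, j in combinations(seqs, 2)}


for pair, count in pairwise_mutations(SEQS).items():
    print(pair, count)
```

A parsimony program would go further and search for the tree topology that needs the fewest such changes in total.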
Neighbor Joining:
G C C A T => a
G C A C T => b
G C A C C => c
alignment
Similarity matrix (%):
     a    b    c
a  100
b   60  100
c   40   80  100
Distance matrix (%):
     a    b    c
a    0
b   40    0
c   60   20    0
dendrograms
Distance transformation
Jukes-Cantor
Kimura
De Soete
(pitfalls: does not take into account multiple mutations)
[Figure: dendrogram with leaves a, b and c]
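The matrices on the slide, and the effect of a distance transformation, can be sketched as follows. The uncorrected distance is the proportion of differing sites p; the Jukes-Cantor transformation is d = -(3/4) ln(1 - (4/3) p), which estimates how many substitutions per site occurred, including multiple hits at the same position. Only Jukes-Cantor is shown; the function names are illustrative.

```python
import math
from itertools import combinations

# The three aligned sequences from the slide.
SEQS = {"a": "GCCAT", "b": "GCACT", "c": "GCACC"}


def p_distance(x: str, y: str) -> float:
    """Proportion of aligned sites that differ (1 - similarity)."""
    return sum(p != q for p, q in zip(x, y)) / len(x)


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected distance; undefined for p >= 0.75."""
    if p >= 0.75:
        raise ValueError("Jukes-Cantor distance is undefined for p >= 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


for i, j in combinations(SEQS, 2):
    p = p_distance(SEQS[i], SEQS[j])
    print(f"{i}-{j}: similarity {100 * (1 - p):.0f}%, "
          f"distance {100 * p:.0f}%, Jukes-Cantor {jukes_cantor(p):.3f}")
```

The corrected distance is always at least as large as the observed one, and the gap widens as sequences diverge, which is why the correction matters most for distant relatives.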
The algorithms
Bootstrap indicates how stable a branching order is when a given dataset is submitted to multiple analyses
Generally, short internode branches will have low bootstrap values
Bootstrap
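The resampling behind a bootstrap value can be sketched as follows. A real analysis rebuilds a full tree for every replicate; as a simplified stand-in, this sketch only asks whether a given pair of sequences is again the closest pair after the alignment columns have been resampled with replacement. The sequences are the toy alignment used in the algorithm examples; the function names, replicate count and seed are assumptions.

```python
import random

SEQS = {"a": "GCCAT", "b": "GCACT", "c": "GCACC"}


def resample_columns(seqs: dict, rng: random.Random) -> dict:
    """Draw alignment columns with replacement, keeping the original length."""
    length = len(next(iter(seqs.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in cols) for name, seq in seqs.items()}


def closest_pair(seqs: dict) -> tuple:
    """Pair with the fewest differences; ties go to the first pair in name order."""
    names = sorted(seqs)
    pairs = [(x, y) for i, x in enumerate(names) for y in names[i + 1:]]
    return min(pairs, key=lambda p: sum(a != b for a, b in zip(seqs[p[0]], seqs[p[1]])))


def bootstrap_support(seqs: dict, pair: tuple, replicates: int = 1000, seed: int = 0) -> float:
    """Percentage of replicates in which `pair` is again the closest pair."""
    rng = random.Random(seed)
    hits = sum(closest_pair(resample_columns(seqs, rng)) == pair for _ in range(replicates))
    return 100.0 * hits / replicates


print(bootstrap_support(SEQS, ("b", "c")))
```

With only five columns, many replicates lose the single site that separates b from c or the two sites that separate a from b, so the support is unstable; this mirrors the point that short internode branches tend to receive low bootstrap values.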
PHYLOGENETIC FILTERS
TERMINI    42,284 homologous positions
BACTERIA    1,532 homologous positions
30%         1,433 homologous positions
50%         1,288 homologous positions
[Figure: trees labeled NJ_bac, NJ_30%, NJ_50%]
USE OF PHYLOGENETIC FILTERS
Conservation filters are useful for deep-branching phylogenies
Complete sequences are useful for closely related organisms
Size & information content
Complete sequences give complete information
Partial sequences lose phylogenetic signal
Short sequences lose resolution
[Figure labels: 1500 nuc, 900 nuc, 300 nuc]
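A conservation filter can be sketched as a column mask over the alignment. The slides do not define the 30% and 50% filters precisely; this sketch assumes that an "N% filter" keeps only the positions at which the most frequent base occurs in at least N% of the sequences. The alignment and the function names are hypothetical.

```python
from collections import Counter


def conservation_filter(alignment: list, min_fraction: float) -> list:
    """Indices of columns whose most frequent base reaches min_fraction."""
    n = len(alignment)
    keep = []
    for col in range(len(alignment[0])):
        counts = Counter(seq[col] for seq in alignment if seq[col] != "-")
        top = counts.most_common(1)[0][1] if counts else 0
        if top / n >= min_fraction:
            keep.append(col)
    return keep


def apply_filter(alignment: list, columns: list) -> list:
    """Keep only the selected columns in every sequence."""
    return ["".join(seq[c] for c in columns) for seq in alignment]


# Hypothetical four-sequence alignment.
ALN = ["GCCAT", "GCACT", "GCACC", "GTGTA"]

columns = conservation_filter(ALN, 0.6)
print(columns)
print(apply_filter(ALN, columns))
```

Raising the threshold removes the fast-evolving columns first, which is the intent when resolving deep branches; for close relatives those variable columns carry most of the signal, so the unfiltered sequences are preferable.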
One tree is no tree
PAR
NJ
RAxML
different algorithms => different topologies
try different datasets as well
draw a consensus tree
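One way to build a consensus is to keep only the groupings that every tree agrees on (a strict consensus). The sketch below assumes rooted trees written as nested tuples of leaf names, which is a simplification of what tree programs export; the three example trees are hypothetical stand-ins for the results of three algorithms.

```python
def clades(tree) -> set:
    """All clades (leaf sets of internal nodes) of a nested-tuple tree."""
    found = set()

    def leaves(node) -> frozenset:
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    leaves(tree)
    return found


def strict_consensus_clades(trees: list) -> set:
    """Clades that are present in every input tree."""
    return set.intersection(*(clades(t) for t in trees))


# Hypothetical topologies that disagree only on the order within A, B, C.
TREE_1 = ((("A", "B"), "C"), ("D", "E"))
TREE_2 = (("A", ("B", "C")), ("D", "E"))
TREE_3 = ((("A", "C"), "B"), ("D", "E"))

for clade in sorted(strict_consensus_clades([TREE_1, TREE_2, TREE_3]), key=len):
    print(sorted(clade))
```

Here the groups {D, E} and {A, B, C} survive, but no pair within A, B, C does, so the consensus tree would show A, B and C as a multifurcation.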
RECOMMENDATIONS FOR 16S rRNA TREE RECONSTRUCTION
[Figure: tree with bootstrap values (100, 100, 90, 100, 95, 50, 25, 25) beside a tree with multifurcation; both with leaves A to I]
SEQUENCE: almost complete sequences are better than short partial sequences
ALIGNMENT: better to take secondary structures into account
ALGORITHM: maximum likelihood is preferred, but compare with others such as neighbor joining and maximum parsimony
DATASET: never use just one dataset; try different sets of data (e.g. different numbers of sequences; different filters to find the best resolution)
FINAL TREE: either show all trees, or the best bootstrapped tree, or a multifurcation showing unresolved branching order.
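Turning poorly supported branches into a multifurcation can be sketched as follows. The tree format is an assumption made for the example: a leaf is a string and an internal node is a pair (bootstrap value, list of children). The example tree and the 70% cut-off are hypothetical.

```python
def collapse(node, threshold: float):
    """Return a copy of the tree in which weakly supported nodes are dissolved."""
    if isinstance(node, str):
        return node
    support, children = node
    new_children = []
    for child in children:
        child = collapse(child, threshold)
        if not isinstance(child, str) and child[0] < threshold:
            new_children.extend(child[1])  # attach the grandchildren directly
        else:
            new_children.append(child)
    return (support, new_children)


# Hypothetical tree; the root carries a placeholder value of 100.
TREE = (100, [(95, ["A", "B"]), (25, ["C", (50, ["D", "E"])])])

print(collapse(TREE, 70))
```

The well-supported pair A, B is kept, while C, D and E end up attached directly to the root, which is how an unresolved branching order is shown.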
MLSA: phylogenetic reconstructions
MULTIPLE SEQUENCE ALIGNMENTS
sometimes have better resolution than the 16S rRNA gene
16S rRNA gene can have very low resolution
Jiménez et al., 2013, Syst Appl Microbiol 36:383-391