![Page 1: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/1.jpg)
1
Chapter 7
Building Phylogenetic Trees
![Page 2: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/2.jpg)
2
Contents
• Phylogeny
• Phylogenetic trees
• How to make a phylogenetic tree from pairwise distances
  – UPGMA method (+ an example)
  – Neighbor-Joining method (+ an example)
• Comparison of methods
• Conclusion
![Page 3: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/3.jpg)
3
Phylogeny
• Phylogeny is the evolution of related species/genes
• Phylogenetic tree: diagram showing evolutionary lineages of species/genes
• The history of genes and the history of species may be very different
• Genes can be homologous or analogous, and in either case they may resemble each other
![Page 4: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/4.jpg)
4
Phylogeny
• The similarity of molecular mechanisms of the organisms that have been studied strongly suggests that all organisms on Earth had a common ancestor
• Any set of species is related, and this relationship is called a phylogeny
• The relationship can be represented by a phylogenetic tree
![Page 5: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/5.jpg)
5
Phylogeny
• Traditionally, morphological characters (both from living and fossilized organisms) have been used for inferring phylogenies
• Zuckerkandl & Pauling (1962) showed that molecular sequences provide sets of characters that can carry a large amount of information
• If we have a set of sequences from different species, we may be able to use them to infer a likely phylogeny of the species in question
• This assumes that the sequences have descended from some common ancestral gene in a common ancestral species
![Page 6: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/6.jpg)
6
Phylogeny
• The widespread occurrence of gene duplication means that the foregoing assumption needs to be checked carefully
• The phylogenetic tree of a group of sequences does not necessarily reflect the phylogenetic tree of their host species, because gene duplication is another mechanism, in addition to speciation, by which two sequences can be separated and diverge from a common ancestor
![Page 7: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/7.jpg)
7
Phylogeny
• Genes which diverged because of speciation are called orthologues ( 直系同源 )
• Genes which diverged by gene duplication are called paralogues ( 平行進化同源 )
![Page 8: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/8.jpg)
8
Phylogeny
• Homologous sequences can be divided into two classes
  – Orthologous sequences diverged by speciation from a common ancestor
  – Paralogous sequences evolved by gene duplication within a species
• Analogous sequences may appear and function very similarly, but they do not have a common ancestor
• WHEN WE WANT TO EXPLORE EVOLUTIONARY RELATIONSHIPS, WE NEED TO HANDLE ORTHOLOGOUS SEQUENCES
![Page 9: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/9.jpg)
9
[Diagram: Genes are either Homologous or Analogous; Homologous genes are either Orthologous or Paralogous]
![Page 10: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/10.jpg)
10
Orthologues / Paralogues
![Page 11: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/11.jpg)
11
Orthology/paralogy
Orthologous genes are homologous (corresponding) genes in different species (genomes)
Paralogous genes are homologous genes within the same species (genome)
![Page 12: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/12.jpg)
12
Phylogenetic Trees
• WHY construct a phylogenetic tree?
  – to understand lineage of various species
  – to understand how various functions evolved
  – to inform multiple alignments
• Trees can be rooted (a common ancestor is known) or unrooted
• Leaves are the terminal nodes that correspond to the observed sequences of genes or species (A, B, C, D)
• Internal nodes are hypothetical ancestral nodes
• All trees will be assumed to be binary, meaning that an edge that branches splits into two daughter edges
• Each edge has a certain amount of evolutionary divergence associated with it, defined by some measure of distance between sequences, or by a model of substitution of residues over the course of evolution
![Page 13: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/13.jpg)
13
![Page 14: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/14.jpg)
14
Phylogenetic Trees
• We adopt the general term “length” or “edge length” here, and represent this by the lengths of the edges in the figures we draw
• A true biological phylogeny has a “root”, or ultimate ancestor of all the sequences
• The leaves of trees have names or numbers
• A tree with a given labelling will be called a labelled branching pattern
• We refer to this as the tree topology and denote it by the symbol T
• The lengths of its edges are denoted by $t_i$, with a suitable numbering scheme for the $i$'s
![Page 15: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/15.jpg)
15
Rooted / Unrooted Tree
![Page 16: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/16.jpg)
16
Types of trees
An unrooted tree represents the same phylogeny without the root node

Depending on the model, data from current day species often do not distinguish between different placements of the root.
![Page 17: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/17.jpg)
17
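Rooted versus unrooted trees
[Figure: rooted trees labelled Tree a, Tree b and Tree c, next to a single unrooted tree carrying the labels a, b and c]
The unrooted tree represents all three rooted trees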
![Page 18: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/18.jpg)
18
Rooting the tree:
To root a tree mentally, imagine that the tree is made of string. Grab the string at the root and tug on it until the ends of the string (the taxa) fall opposite the root.
[Figure: an unrooted tree with taxa A, B, C and D and a marked root position, next to the rooted tree obtained from it with leaves A, B, C, D]
Note that in this rooted tree, taxon A is no more closely related to taxon B than it is to C or D.
![Page 19: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/19.jpg)
19
Counting Trees
![Page 20: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/20.jpg)
20
Counting Trees
(2N - 5)!! = # unrooted trees for N taxa
(2N - 3)!! = # rooted trees for N taxa
[Figure: example trees on three taxa (A to C), four taxa (A to D), five taxa (A to E) and six taxa (A to F)]
![Page 21: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/21.jpg)
21
How many trees?
• Number of unrooted trees = $\frac{(2n-5)!}{2^{n-3}\,(n-3)!} = 3 \times 5 \times \dots \times (2n-5)$

• Number of rooted trees = $\frac{(2n-3)!}{2^{n-2}\,(n-2)!} = 3 \times 5 \times \dots \times (2n-3)$
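The two product formulas can be checked numerically. The following is a minimal Python sketch; the function names `double_factorial`, `count_unrooted` and `count_rooted` are illustrative choices, not names taken from the slides.

```python
def double_factorial(k: int) -> int:
    """Return k * (k - 2) * (k - 4) * ... down to 1 (taken as 1 for k <= 1)."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def count_unrooted(n: int) -> int:
    """Number of unrooted binary trees on n labelled leaves: (2n - 5)!!"""
    return double_factorial(2 * n - 5)


def count_rooted(n: int) -> int:
    """Number of rooted binary trees on n labelled leaves: (2n - 3)!!"""
    return double_factorial(2 * n - 3)


# Print the counts for 2 to 10 sequences.
for n in range(2, 11):
    print(n, count_unrooted(n), count_rooted(n))
```

Note that the number of rooted trees for n sequences equals the number of unrooted trees for n + 1 sequences, which is why the two columns of the counts are shifted copies of each other.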
![Page 22: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/22.jpg)
22
Combinatoric explosion
# sequences   # unrooted trees   # rooted trees
2             1                  1
3             1                  3
4             3                  15
5             15                 105
6             105                945
7             945                10,395
8             10,395             135,135
9             135,135            2,027,025
10            2,027,025          34,459,425
![Page 23: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/23.jpg)
23
Phylogenetic trees
• Different ways to represent a phylogenetic tree (illustrated by Treeview)
[Figure: one tree of HRV sequences (leaves labelled HRV1A through HRV100) drawn in four different TreeView layouts; scale bar 0.1]
![Page 24: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/24.jpg)
24
Making a tree from pairwise distances
• Distances dij between each pair of sequences i and j are calculated in the given dataset
• Different ways of defining distances
  – For nucleotide sequences: Jukes-Cantor, Kimura 2-parameter (K2P), HKY (Hasegawa-Kishino-Yano), F84, Tamura-Nei, general time-reversible model, general 12-parameter model
  – For amino acid sequences: PAM matrices, BLOSUM matrices
A B C D
A 0 32 44 46
B 32 0 29 43
C 44 29 0 30
D 46 43 30 0
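As an illustration of how one entry of such a matrix might be obtained, here is a minimal Python sketch of the Jukes-Cantor distance, the first model in the list above. It assumes two aligned nucleotide sequences of equal length without gaps; the function names and the toy sequences are hypothetical and are not the data behind the matrix shown on the slide.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Fraction of aligned positions at which the two sequences differ."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned, non-empty and of equal length")
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return mismatches / len(seq_a)


def jukes_cantor(seq_a: str, seq_b: str) -> float:
    """Jukes-Cantor distance d = -(3/4) * ln(1 - (4/3) * p)."""
    p = p_distance(seq_a, seq_b)
    if p >= 0.75:
        # The correction is undefined once the sequences look saturated.
        raise ValueError("p-distance too large for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Toy example: one mismatch in four positions, so p = 0.25.
print(jukes_cantor("ACGT", "ACGA"))
```

The corrected distance is always at least as large as the raw fraction of mismatches, because it accounts for multiple substitutions at the same site.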
![Page 25: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/25.jpg)
25
Distance matrix methods
• UPGMA
  – Algorithm introduced by Sokal and Michener 1958
• Neighbor-Joining
  – Algorithm introduced by Saitou and Nei 1987
  – Modified by Studier and Keppler 1988
![Page 26: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/26.jpg)
26
Clustering method: UPGMA
• UPGMA = Unweighted pair group method using arithmetic averages
• Simple method
• It works by clustering the sequences, at each stage merging two clusters and creating a new node on the tree
• The method assumes an equal rate of evolutionary change along all branches (the molecular clock assumption)
![Page 27: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/27.jpg)
27
UPGMA
• UPGMA produces a rooted tree
• Branch lengths satisfy a molecular clock: the divergence of sequences is assumed to occur at the same constant rate at all points in the tree
• Trees that are clocklike are rooted, and the total branch length from the root to any leaf is equal
• Such trees are often referred to as ultrametric
• Distances are ultrametric if, for every triplet of sequences i, j, k, either all three distances are equal, $d_{ij} = d_{ik} = d_{jk}$, or two of them are equal and the third is smaller: $d_{jk} < d_{ij} = d_{ik}$
• UPGMA is guaranteed to build the correct tree if the distances are ultrametric
• The method can be used for reconstructing phylogenies if evolutionary rates are assumed to be the same in all lineages, an assumption that has drawn criticism in the phylogeny literature
  – Suitable for closely related species
• Running time $O(n^2)$
[Figure: example tree with leaves A, C, B, D]
![Page 28: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/28.jpg)
28
Algorithm: UPGMA
Initialisation:
Assign each sequence i in the dataset to its own cluster $C_i$
Define one leaf of T for each sequence, and place it at height zero
Iteration:
Find the two clusters i and j for which $d_{ij}$ is the smallest (pick randomly if several distances are equal)
Define a new cluster ij by $C_{ij} = C_i \cup C_j$. Cluster ij has $n_{ij} = n_i + n_j$ members (initially $n_i = 1$)
Connect i and j on the tree to a new node v, placed at height $d_{ij}/2$
The branch lengths from the new node to i and j are the differences between its height and the heights of i and j
![Page 29: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/29.jpg)
29
Algorithm: UPGMA (cont.)
Iteration (cont.)
Compute the distances between the new cluster and the remaining clusters k by using
$d_{(ij),k} = \frac{n_i}{n_i + n_j}\, d_{ik} + \frac{n_j}{n_i + n_j}\, d_{jk}$
Add ij to the current clusters and remove i and j
Termination:
When only two clusters i and j remain, place the root at height $d_{ij}/2$
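The initialisation, iteration and termination steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reference implementation: clusters are represented as nested tuples, distances are stored in a dictionary keyed by `frozenset` pairs, and ties are broken by taking the first minimal pair rather than a random one. The function name `upgma` and the variable names are illustrative; the distances are those of the four-item matrix (A, B, C, D) used in the chapter's worked example.

```python
from itertools import combinations


def upgma(labels, dist):
    """Cluster `labels` by UPGMA.

    `dist` maps frozenset({x, y}) to the distance between x and y.
    Returns (root, height): the tree as nested tuples and the height of every node.
    """
    d = dict(dist)                        # work on a copy
    size = {x: 1 for x in labels}         # n_i for every current cluster
    height = {x: 0.0 for x in labels}     # leaves are placed at height zero
    while len(size) > 1:
        # Find the two clusters with the smallest distance (first pair on ties).
        i, j = min(combinations(size, 2), key=lambda pair: d[frozenset(pair)])
        new = (i, j)
        height[new] = d[frozenset((i, j))] / 2
        ni, nj = size.pop(i), size.pop(j)
        # Size-weighted average distance to every remaining cluster k.
        for k in size:
            d[frozenset((new, k))] = (
                ni * d[frozenset((i, k))] + nj * d[frozenset((j, k))]
            ) / (ni + nj)
        size[new] = ni + nj
    (root,) = size
    return root, height


labels = ["A", "B", "C", "D"]
matrix = [[0, 8, 7, 12], [8, 0, 9, 14], [7, 9, 0, 11], [12, 14, 11, 0]]
dist = {
    frozenset((labels[a], labels[b])): matrix[a][b]
    for a, b in combinations(range(len(labels)), 2)
}
root, height = upgma(labels, dist)
print(root)
print(height[root])
```

The branch length between a node and its parent is the difference of their heights, so the tree can be drawn directly from the `height` dictionary.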
![Page 30: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/30.jpg)
30
UPGMA -- Unweighted Pair Group Method with Arithmetic mean
Simplest method: uses a sequential clustering algorithm (assumes rate constancy among lineages, which is often violated)
Distance matrix, step 1:
    A      B
B   dAB
C   dAC    dBC
Distance matrix, step 2:
    (AB)
C   d(AB)C
where d(AB)C = (dAC + dBC) / 2
Tree, step 1: A and B are joined at height dAB / 2
Tree, step 2: (AB) and C are joined at height d(AB)C / 2
![Page 31: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/31.jpg)
31
UPGMA -- Illustrations
![Page 32: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/32.jpg)
32
An example UPGMA (1)
• Distance matrix (arbitrary) for four items (sequences) A, B, C and D
• These distances are not ultrametric: for the triplet A, B, C, for example, the three distances (dAB = 8, dAC = 7, dBC = 9) are all different, whereas ultrametric distances would be either all equal or two equal with the third smaller
A B C D
A 0 8 7 12
B 8 0 9 14
C 7 9 0 11
D 12 14 11 0
Step 1. Find the smallest distance dij between two clusters: here it is dAC = 7, between A and C
![Page 33: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/33.jpg)
33
An example UPGMA (2)
Step 2. Define new cluster ij, which has nij = ni + nj members (initially ni = 1)
New cluster AC: nAC = nA + nC = 2
Step 3. Connect A and C on the tree to a new node v1
Step 4. The branch lengths from new node v1 to A and C:
$d_{AC}/2 = 7/2 = 3.5$
[Tree so far: A and C are each joined to v1 by a branch of length 3.5]
A B C D
A 0 8 7 12
B 0 9 14
C 0 11
D 0
![Page 34: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/34.jpg)
34
An example UPGMA (3)
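Step 5. Compute the distances between the new cluster AC and the remaining clusters (B and D):
$d_{(AC),B} = \frac{n_A}{n_A + n_C}\, d_{AB} + \frac{n_C}{n_A + n_C}\, d_{CB} = \frac{1}{2} \cdot 8 + \frac{1}{2} \cdot 9 = 8.5$
$d_{(AC),D} = \frac{n_A}{n_A + n_C}\, d_{AD} + \frac{n_C}{n_A + n_C}\, d_{CD} = \frac{1}{2} \cdot 12 + \frac{1}{2} \cdot 11 = 11.5$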
![Page 35: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/35.jpg)
35
An example UPGMA (4)
Step 6. Delete the columns and rows of the distance matrix that correspond to clusters A and C, and add a column and a row for cluster AC
AC B D
AC 0 8.5 11.5
B 0 14
D 0
New distance matrix
![Page 36: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/36.jpg)
36
An example UPGMA (5)
AC B D
AC 0 8.5 11.5
B 0 14
D 0
2nd iteration process
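Step 1. Find the two clusters i and j for which dij is the smallest (pick randomly if several distances are equal): here AC and B, with d(AC),B = 8.5
Step 2. Define new cluster (ij), which has nij = ni + nj members (initially ni = 1). New cluster ACB: nACB = nAC + nB = 2 + 1 = 3
Step 3. Connect AC and B on the tree to a new node v2
Step 4. The branch lengths from new node v2 to AC and B:
$d_{(AC),B}/2 = 8.5/2 = 4.25$
[Tree so far: A and C join at v1 (height 3.5); v1 and B join at v2 (height 4.25), so the branch from v2 to B has length 4.25 and the branch from v2 to v1 has length 4.25 - 3.5 = 0.75]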
![Page 37: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/37.jpg)
37
An example UPGMA (6)
Step 5. Compute the distances between the new cluster and the remaining cluster (D):
$d_{(ACB),D} = \frac{n_{AC}}{n_{AC} + n_B}\, d_{(AC),D} + \frac{n_B}{n_{AC} + n_B}\, d_{BD} = \frac{2}{3} \cdot 11.5 + \frac{1}{3} \cdot 14 = 12.33$
Step 6. Delete the columns and rows of the distance matrix that correspond to clusters AC and B, and add a column and a row for cluster ACB
ACB D
ACB 0 12.33
D 0
New distance matrix
![Page 38: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/38.jpg)
38
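An example UPGMA (7)
Termination:
Only two clusters (ACB and D) remaining
      ACB   D
ACB   0     12.33
D           0
Place the root at height $d_{ij}/2 = 12.33/2 = 6.17$
Original distance matrix and final phylogenetic tree (including the branch lengths):
    A   B   C   D
A   0   8   7   12
B       0   9   14
C           0   11
D               0
[Final tree: A and C each join v1 by branches of length 3.5; v1 joins v2 by a branch of length 0.75 and B joins v2 by a branch of length 4.25; v2 joins the root by a branch of length 1.92 and D joins the root by a branch of length 6.17]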
![Page 39: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/39.jpg)
39
When UPGMA fails …
![Page 40: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/40.jpg)
40
When UPGMA fails …
• UPGMA can fail when the closest leaves are not neighboring leaves, that is, when they do not have a common parent node
• A test of whether reconstruction is likely to be correct is the ultrametric condition
• Distances are ultrametric if, for every triplet i, j, k, either all three distances are equal, $d_{ij} = d_{ik} = d_{jk}$, or two of them are equal and the third is smaller: $d_{jk} < d_{ij} = d_{ik}$
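The ultrametric condition can be tested mechanically by sorting the three distances of every triplet and checking that the two largest coincide. The sketch below is one possible way to do this in Python; the function name `is_ultrametric`, the tolerance parameter and the small three-leaf matrix are illustrative assumptions, while the four-item matrix is the one from the UPGMA worked example.

```python
from itertools import combinations


def is_ultrametric(matrix, tol=1e-9):
    """Return True if every triplet has its two largest distances equal."""
    n = len(matrix)
    for i, j, k in combinations(range(n), 3):
        smallest, middle, largest = sorted(
            (matrix[i][j], matrix[i][k], matrix[j][k])
        )
        if abs(largest - middle) > tol:
            return False
    return True


# Matrix from the UPGMA worked example (A, B, C, D): not ultrametric.
example = [[0, 8, 7, 12], [8, 0, 9, 14], [7, 9, 0, 11], [12, 14, 11, 0]]
# Hypothetical clocklike distances: two close leaves and one distant leaf.
clocklike = [[0, 2, 6], [2, 0, 6], [6, 6, 0]]

print(is_ultrametric(example), is_ultrametric(clocklike))
```

When the test fails, as it does for the worked example, the UPGMA tree should be read as an approximation rather than a guaranteed reconstruction.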
![Page 41: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/41.jpg)
41
Ultrametric Distances
Given three leaves, two distances are equal while a third is smaller:
d(i,j) ≤ d(i,k) = d(j,k)
a+a ≤ a+b = a+b
[Figure: tree on leaves i, j and k with edge labels a, a and b]
Nodes i and j are at the same evolutionary distance from k, so the dendrogram will have ‘aligned’ leaves, i.e. they are all at the same distance from the root
![Page 42: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/42.jpg)
42
Evolutionary clock speeds
Uniform clock: Ultrametric distances lead to identical distances from root to leaves
Non-uniform evolutionary clock: leaves have different distances to the root -- an important property is that of additive trees. These are trees where the distance between any pair of leaves is the sum of the lengths of edges connecting them. Such trees obey the so-called 4-point condition (next slide).
![Page 43: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/43.jpg)
43
Additivity
• Given a tree, its edge lengths are said to be additive if the distance between any pair of leaves is the sum of the lengths of the edges on the path connecting them
• This property is built in automatically as the UPGMA tree is constructed
• It is possible for the molecular clock property to fail but for additivity to hold, and in that case there are algorithms that can be used to reconstruct the tree correctly
![Page 44: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/44.jpg)
44
Neighbor Joining
• Very popular method
• Does not make the molecular clock assumption: a modified distance matrix is constructed to adjust for differences in the evolution rate of each taxon
• Produces an unrooted tree
• Assumes additivity: distance between a pair of leaves = sum of lengths of the edges connecting them
• Like UPGMA, constructs the tree by sequentially joining subtrees
![Page 45: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/45.jpg)
45
Neighbor Joining: Once we know the correct (i,j) pair
![Page 46: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/46.jpg)
46
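• Here k is the new parent node of the joined leaves i and j, and m is any other leaf
• $d_{im} = d_{ik} + d_{km}$
• $d_{jm} = d_{jk} + d_{km}$
• $d_{im} + d_{jm} = d_{ik} + d_{jk} + 2 d_{km} = d_{ij} + 2 d_{km}$
• $d_{km} = (d_{im} + d_{jm} - d_{ij})/2$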
![Page 47: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/47.jpg)
47
Neighbour Joining: why not pick the smallest (i,j) pair?
![Page 48: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/48.jpg)
48
Neighbour Joining (3)
[Equations garbled in extraction; the surviving symbols are $r_i$, $r_j$, $d_{ij}$, $d_{ik}$ and $d_{jk}$]
![Page 49: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/49.jpg)
49
Neighbour Joining: Algorithm
[Equations garbled in extraction; the surviving symbols are $d_{ik}$, $d_{jk}$, $d_{ij}$, $r_i$ and $r_j$]
![Page 50: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/50.jpg)
50
Neighbor-Joining: Complexity
• Each of the O(n) joining steps takes $O(n^2)$ time to search for the pair to join and $O(n^2)$ time to update the distance matrix
• This gives a total time complexity of $O(n^3)$ and a space complexity of $O(n^2)$
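A compact Python sketch of neighbor-joining is given below. It follows the variant used in this chapter's worked example, which is an assumption about the intended algorithm: $r_i$ is the row sum divided by n - 2, the pair minimising $d_{ij} - (r_i + r_j)$ is joined, edge lengths are $d_{ij}/2 \pm (r_i - r_j)/2$, and distances to the new node are $(d_{im} + d_{jm} - d_{ij})/2$. The function name `neighbor_joining` and the tuple representation of internal nodes are illustrative, and ties are broken by taking the first minimal pair.

```python
from itertools import combinations


def neighbor_joining(labels, dist):
    """Return the unrooted tree as a list of (node, node, edge_length) triples.

    `dist` maps frozenset({x, y}) to the distance between x and y.
    """
    d = dict(dist)
    nodes = list(labels)
    edges = []
    while len(nodes) > 2:
        n = len(nodes)
        # Average distance of each node to the others, divided by n - 2.
        r = {
            i: sum(d[frozenset((i, k))] for k in nodes if k != i) / (n - 2)
            for i in nodes
        }
        # Join the pair with the smallest rate-corrected distance.
        i, j = min(
            combinations(nodes, 2),
            key=lambda p: d[frozenset(p)] - r[p[0]] - r[p[1]],
        )
        dij = d[frozenset((i, j))]
        new = (i, j)
        length_i = dij / 2 + (r[i] - r[j]) / 2
        edges.append((i, new, length_i))
        edges.append((j, new, dij - length_i))
        nodes.remove(i)
        nodes.remove(j)
        for m in nodes:
            d[frozenset((new, m))] = (
                d[frozenset((i, m))] + d[frozenset((j, m))] - dij
            ) / 2
        nodes.append(new)
    a, b = nodes
    edges.append((a, b, d[frozenset((a, b))]))   # connect the last two nodes
    return edges


labels = ["A", "B", "C", "D"]
matrix = [[0, 8, 7, 12], [8, 0, 9, 14], [7, 9, 0, 11], [12, 14, 11, 0]]
dist = {
    frozenset((labels[a], labels[b])): matrix[a][b]
    for a, b in combinations(range(len(labels)), 2)
}
edges = neighbor_joining(labels, dist)
for edge in edges:
    print(edge)
```

The two nested searches over pairs inside a loop that runs n - 2 times are what produce the cubic running time. Because ties are broken differently, the order of joins may differ from a hand calculation while the resulting unrooted tree is the same.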
![Page 51: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/51.jpg)
51
Neighbor-Joining
• We can use neighbor-joining even when the edge lengths are not additive, but reconstruction of the correct tree is no longer guaranteed
• We can test for additivity
• For every set of four leaves i, j, k and l, two of the sums $d_{ij} + d_{kl}$, $d_{ik} + d_{jl}$ and $d_{il} + d_{jk}$ must be equal and larger than the third
• For example: $d_{ij} + d_{kl} = d_{ik} + d_{jl} > d_{il} + d_{jk}$
![Page 52: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/52.jpg)
52
Additivity
![Page 53: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/53.jpg)
53
Additivity
Theorem: A set M of L objects is additive iff any subset of four objects can be labeled i,j,k,l so that:
d(i,k) + d(j,l) = d(i,l) +d(k,j) ≥ d(i,j) + d(k,l)
![Page 54: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/54.jpg)
54
Additive trees
All distances satisfy 4-point condition:
For all leaves i,j,k,l:
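d(i,j) + d(k,l) ≤ d(i,k) + d(j,l) = d(i,l) + d(j,k)
(a+b)+(c+d) ≤ (a+m+c)+(b+m+d) = (a+m+d)+(b+m+c)
[Figure: unrooted tree in which leaves i and j hang from one internal node by edges of length a and b, leaves k and l hang from the other internal node by edges of length c and d, and the internal edge has length m]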
Result: all pairwise distances obtained by traversing the tree
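The 4-point condition can be checked for a whole distance matrix by examining every quartet of leaves. The following Python sketch is one way to do it; the function name `is_additive`, the tolerance parameter and the perturbed matrix are illustrative assumptions, while the first matrix is the four-item matrix (A, B, C, D) used in this chapter's worked examples.

```python
from itertools import combinations


def is_additive(matrix, tol=1e-9):
    """Return True if, for every quartet, the two largest pair sums are equal."""
    n = len(matrix)
    for i, j, k, l in combinations(range(n), 4):
        sums = sorted(
            (
                matrix[i][j] + matrix[k][l],
                matrix[i][k] + matrix[j][l],
                matrix[i][l] + matrix[j][k],
            )
        )
        if abs(sums[2] - sums[1]) > tol:
            return False
    return True


# Worked-example matrix: the sums are 19, 21 and 21, so it is additive.
example = [[0, 8, 7, 12], [8, 0, 9, 14], [7, 9, 0, 11], [12, 14, 11, 0]]
# Hypothetical perturbation (d_AD raised to 13): sums become 19, 21 and 22.
perturbed = [[0, 8, 7, 13], [8, 0, 9, 14], [7, 9, 0, 11], [13, 14, 11, 0]]

print(is_additive(example), is_additive(perturbed))
```

The check visits all quartets, so it costs on the order of n to the fourth power operations, which is acceptable for small matrices.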
![Page 55: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/55.jpg)
55
An example N-J (1)
Step 1. Compute $r_i = \frac{1}{n-2} \sum_{j \neq i} d_{ij}$ for each row in the distance matrix
    A    B    C    D    Step 1: ri
A   0    8    7    12   (8+7+12)/(4-2) = 13.5
B   8    0    9    14   (8+9+14)/(4-2) = 15.5
C   7    9    0    11   (7+9+11)/(4-2) = 13.5
D   12   14   11   0    (12+14+11)/(4-2) = 18.5
Step 2. Compute $d_{ij} - (r_i + r_j)$ (the lower-diagonal matrix) and choose the smallest (most negative)
    A                       B                       C                       D
A   0                       8                       7                       12
B   8-(13.5+15.5) = -21     0                       9                       14
C   7-(13.5+13.5) = -20     9-(15.5+13.5) = -20     0                       11
D   12-(13.5+18.5) = -20    14-(15.5+18.5) = -20    11-(13.5+18.5) = -21    0
![Page 56: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/56.jpg)
56
An example N-J (2)
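Step 3. Join A and B together with a new node v1. Compute the edge lengths from A to node v1 and from B to node v1
$v_A = \frac{d_{AB}}{2} + \frac{r_A - r_B}{2} = \frac{8}{2} + \frac{13.5 - 15.5}{2} = 3$
$v_B = \frac{d_{AB}}{2} + \frac{r_B - r_A}{2} = \frac{8}{2} + \frac{15.5 - 13.5}{2} = 5$
Step 4. Compute distances between the new node v1 and remaining items (C and D)
$d_{(AB),C} = \frac{d_{AC} + d_{BC} - d_{AB}}{2} = \frac{7 + 9 - 8}{2} = 4$
$d_{(AB),D} = \frac{d_{AD} + d_{BD} - d_{AB}}{2} = \frac{12 + 14 - 8}{2} = 9$
[Tree so far: A joins v1 by an edge of length 3 and B joins v1 by an edge of length 5]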
![Page 57: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/57.jpg)
57
An example N-J (3)
Step 5. Delete A and B from the distance matrix and replace them by new item AB
Step 6. Continue from step 1, because more than two items remain
New reduced distance matrix
Step 1. Compute $r_i = \frac{1}{n-2} \sum_{j \neq i} d_{ij}$ for each row in the distance matrix
     AB   C    D    Step 1: ri
AB   0    4    9    (4+9)/1 = 13
C    4    0    11   (4+11)/1 = 15
D    9    11   0    (9+11)/1 = 20
Step 2. Compute $d_{ij} - (r_i + r_j)$ (the lower-diagonal matrix) and choose the smallest
     AB                 C                   D
AB   0                  4                   9
C    4-(13+15) = -24    0                   11
D    9-(13+20) = -24    11-(15+20) = -24    0
![Page 58: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/58.jpg)
58
An example N-J (4)
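Step 3. Join v1 and C together with a new node v2. Compute the edge lengths from v1 to node v2 and from C to node v2, using the reduced distance matrix ($r_{AB} = 13$, $r_C = 15$, $r_D = 20$):

$$d_{v_1,v_2} = \frac{d_{(AB),C}}{2} + \frac{r_{AB} - r_C}{2} = \frac{4}{2} + \frac{13 - 15}{2} = 1$$

$$d_{C,v_2} = \frac{d_{(AB),C}}{2} + \frac{r_C - r_{AB}}{2} = \frac{4}{2} + \frac{15 - 13}{2} = 3$$

Step 4. Compute the distances between the new node v2 and the remaining items (D):

$$d_{(ABC),D} = \frac{d_{(AB),D} + d_{CD} - d_{(AB),C}}{2} = \frac{9 + 11 - 4}{2} = 8$$

Figure: A (edge length 3) and B (edge length 5) attached to v1; v1 joined to v2 by an edge of length 1; C attached to v2 by an edge of length 3.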
![Page 59: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/59.jpg)
59
An example N-J (5)
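Step 5. Delete AB and C from the distance matrix and replace them with ABC.
Step 6. Only two nodes remain; connect them.

| | ABC | D |
|---|---|---|
| ABC | 0 | 8 |
| D | | 0 |

Original distance matrix and final phylogenetic tree (including the edge lengths):

| | A | B | C | D |
|---|---|---|---|---|
| A | 0 | 8 | 7 | 12 |
| B | | 0 | 9 | 14 |
| C | | | 0 | 11 |
| D | | | | 0 |

Figure: final tree with edge lengths A 3, B 5, C 3, D 8 and an internal edge of length 1 between v1 and v2.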
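Steps 1 to 6 can be combined into a single loop. The sketch below is one possible Python rendering of the procedure as the slides describe it, not a reference implementation: the function name `neighbor_joining` is illustrative, ties in step 2 are broken by taking the first pair encountered (an assumption, since the slides do not state a rule), and the new item is placed first in the reduced matrix, as in the example.

```python
def neighbor_joining(names, d):
    """Return the tree as a list of (node, node, edge_length) tuples."""
    names = list(names)
    d = [[float(x) for x in row] for row in d]
    edges = []
    while len(names) > 2:
        n = len(names)
        # Step 1: r_i for each row.
        r = [sum(row) / (n - 2) for row in d]
        # Step 2: pair (i, j), j < i, minimising d_ij - (r_i + r_j).
        # Assumption: ties go to the first pair encountered.
        pairs = [(i, j) for i in range(n) for j in range(i)]
        i, j = min(pairs, key=lambda p: d[p[0]][p[1]] - (r[p[0]] + r[p[1]]))
        i, j = j, i  # now i < j, so the earlier item is listed first
        new = f"v{len(edges) // 2 + 1}"
        # Step 3: edge lengths from the joined items to the new node.
        edges.append((names[i], new, d[i][j] / 2 + (r[i] - r[j]) / 2))
        edges.append((names[j], new, d[i][j] / 2 + (r[j] - r[i]) / 2))
        # Step 4: distances from the new node to the remaining items.
        keep = [k for k in range(n) if k not in (i, j)]
        to_new = [(d[i][k] + d[j][k] - d[i][j]) / 2 for k in keep]
        # Step 5: the new item replaces the joined pair (placed first).
        d = [[0.0] + to_new] + [
            [to_new[a]] + [d[k][m] for m in keep] for a, k in enumerate(keep)
        ]
        names = [new] + [names[k] for k in keep]
    # Step 6: only two items remain, so connect them.
    edges.append((names[0], names[1], d[0][1]))
    return edges


D = [
    [0, 8, 7, 12],
    [8, 0, 9, 14],
    [7, 9, 0, 11],
    [12, 14, 11, 0],
]
tree = neighbor_joining(["A", "B", "C", "D"], D)
for edge in tree:
    print(edge)
```

Summing edge lengths along the path between two leaves gives back the original distances, for example A to D: 3 + 1 + 8 = 12.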
![Page 60: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/60.jpg)
60
Comparison
• UPGMA
– The total branch length from the root to any leaf is equal
– Produces a rooted tree, where the root is the hypothesized ancestor of the sequences in the tree
– Suitable for closely related sequences
– Can be used to infer phylogenies if one can assume that evolutionary rates are the same in all lineages
• Neighbor-joining
– Produces an unrooted tree, where the direction of evolution is unknown
– Suitable for datasets with widely varying rates of evolution
– Suitable for large datasets
Figures: N-J tree (edge lengths A 3, B 5, C 3, D 8, internal edge 1) and UPGMA tree (labels A 3.5, C 3.5, B 4.25, D 6.17) for the same distance matrix.
![Page 61: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/61.jpg)
61
Comparison
• The UPGMA method constructs a rooted phylogenetic tree correctly if there is a molecular clock with a constant rate of mutation
• The UPGMA method is rarely used, because the molecular clock assumption is not generally true: selection pressures vary across time periods, organisms, genes within an organism, and regions within a gene
• The N-J method produces an unrooted tree without the molecular clock hypothesis
• The N-J method is one of the most popular methods and is widely used by molecular evolutionists
• Distance methods are strongly dependent on the model of evolution used
• Sequence information is reduced when transforming sequence data into distances
• Distance methods are computationally fast
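The role of the molecular clock can be illustrated with two standard checks on a distance matrix. Distances that fit a clock-like (rooted, equal-rate) tree are ultrametric: in every triple of items the two largest distances are equal. Distances that fit some unrooted tree are additive: for every four items, the two largest of the three pairwise sums are equal. The following Python sketch applies both checks to the example matrix; the function names and the tolerance `tol` are assumptions made for illustration.

```python
from itertools import combinations


def is_ultrametric(d, tol=1e-9):
    """Three-point condition: the two largest distances in every triple are equal."""
    for i, j, k in combinations(range(len(d)), 3):
        _, middle, largest = sorted([d[i][j], d[i][k], d[j][k]])
        if abs(largest - middle) > tol:
            return False
    return True


def is_additive(d, tol=1e-9):
    """Four-point condition: the two largest pairwise sums in every quartet are equal."""
    for i, j, k, l in combinations(range(len(d)), 4):
        sums = sorted([d[i][j] + d[k][l], d[i][k] + d[j][l], d[i][l] + d[j][k]])
        if abs(sums[2] - sums[1]) > tol:
            return False
    return True


# Example distance matrix, in the order A, B, C, D.
D = [
    [0, 8, 7, 12],
    [8, 0, 9, 14],
    [7, 9, 0, 11],
    [12, 14, 11, 0],
]
print(is_ultrametric(D))  # the triple A, B, C has distances 7, 8, 9
print(is_additive(D))     # the pairwise sums are 19, 21, 21
```

The example matrix is additive but not ultrametric, which is consistent with the N-J tree reproducing all pairwise distances while the UPGMA tree cannot.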
![Page 62: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/62.jpg)
62
Parsimony
• Find the tree which can explain the observed sequences with a minimal number of substitutions
• It assigns a cost to a tree, and it is necessary to search through all topologies, or to pursue a more efficient search strategy that achieves this effect, in order to identify the ‘best’ tree
![Page 63: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/63.jpg)
63
Parsimony
• The computation of a cost for a given tree
• A search through all trees, to find the overall minimum of this cost
• Suppose we have the following four aligned nucleotide sequences:
AAG
AAA
GGA
AGA
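For four sequences there are only three unrooted topologies, so both parts (cost of a tree, search over trees) can be tried by hand or in code. The Python sketch below scores each topology with the set-based rule of traditional parsimony (usually attributed to Fitch). The labels `s1` to `s4`, the nested-tuple tree encoding and the function names are assumptions made for illustration; each tree is rooted arbitrarily, which does not change the count of substitutions.

```python
def fitch_site(tree, states):
    """Return (state set, cost) for one alignment column.

    `tree` is either a leaf name or a pair (left, right) of subtrees.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch_site(tree[0], states)
    right_set, right_cost = fitch_site(tree[1], states)
    common = left_set & right_set
    if common:
        return common, left_cost + right_cost
    # Empty intersection: take the union and count one substitution.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_cost(tree, seqs):
    """Sum the per-column costs over all alignment positions."""
    length = len(next(iter(seqs.values())))
    return sum(
        fitch_site(tree, {name: s[p] for name, s in seqs.items()})[1]
        for p in range(length)
    )


# The four aligned sequences from the slide, with assumed labels.
seqs = {"s1": "AAG", "s2": "AAA", "s3": "GGA", "s4": "AGA"}
topologies = [
    (("s1", "s2"), ("s3", "s4")),
    (("s1", "s3"), ("s2", "s4")),
    (("s1", "s4"), ("s2", "s3")),
]
costs = [parsimony_cost(t, seqs) for t in topologies]
print(costs)
```

The topology grouping `s1` with `s2` and `s3` with `s4` needs the fewest substitutions (3, against 4 for the other two).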
![Page 64: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/64.jpg)
64
Parsimony
![Page 65: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/65.jpg)
65
Cost of Evaluating Parsimony
• The score is evaluated at each position independently. The scores are then summed over all positions.
• If there are n nodes, m characters, and k possible values for each character, then the complexity is O(nmk)
• By keeping traceback information, we can reconstruct most parsimonious values at each ancestor node
![Page 66: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/66.jpg)
66
Evaluating Parsimony Scores
• How do we compute the Parsimony score for a given tree?
• Traditional Parsimony
– Each base change has a cost of 1
• Weighted Parsimony
– Each change is weighted by the score c(a,b)
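Weighted parsimony is commonly computed with a dynamic programme over the tree (Sankoff's algorithm): for every node and every residue a, keep the minimal cost of the subtree given that the node is assigned a. The sketch below is a minimal Python version under stated assumptions: the tree encoding, the function names, the four-sequence example with labels `s1` to `s4`, and the transition/transversion weights in `ts_tv_cost` are all illustrative and not taken from the slides.

```python
INF = float("inf")


def sankoff_site(tree, states, cost, alphabet="ACGT"):
    """Return {a: minimal cost of this subtree if its root is assigned a}."""
    if isinstance(tree, str):
        return {a: (0 if a == states[tree] else INF) for a in alphabet}
    left = sankoff_site(tree[0], states, cost, alphabet)
    right = sankoff_site(tree[1], states, cost, alphabet)
    return {
        a: min(cost(a, b) + left[b] for b in alphabet)
        + min(cost(a, b) + right[b] for b in alphabet)
        for a in alphabet
    }


def weighted_parsimony(tree, seqs, cost):
    """Sum the minimal per-column costs over all alignment positions."""
    length = len(next(iter(seqs.values())))
    total = 0
    for p in range(length):
        column = {name: s[p] for name, s in seqs.items()}
        total += min(sankoff_site(tree, column, cost).values())
    return total


def unit_cost(a, b):
    """Traditional parsimony as a special case: every change costs 1."""
    return 0 if a == b else 1


PURINES = {"A", "G"}


def ts_tv_cost(a, b):
    """Hypothetical weights: transitions cost 1, transversions cost 2."""
    if a == b:
        return 0
    return 1 if (a in PURINES) == (b in PURINES) else 2


seqs = {"s1": "AAG", "s2": "AAA", "s3": "GGA", "s4": "AGA"}
tree = (("s1", "s2"), ("s3", "s4"))
print(weighted_parsimony(tree, seqs, unit_cost))
print(weighted_parsimony(tree, seqs, ts_tv_cost))
```

With `unit_cost` the result equals the traditional parsimony score of the same tree.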
![Page 67: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/67.jpg)
67
Traditional Parsimony
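$$\mathrm{Par}(s_1,\ldots,s_n;T) = \min_{\text{internal nodes}} \sum_{\{u,v\}\in E} \mathbf{1}\{x_u \neq x_v\}$$

• Solved independently for each position
• Linear time solution
Figure: example tree with leaves a, g, a and internal-node sets {a,g} and {a}.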
![Page 68: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/68.jpg)
68
Traditional Parsimony
![Page 69: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/69.jpg)
69
Traditional Parsimony
• There is a traceback procedure for finding ancestral assignments in traditional parsimony
• We choose a residue from the root set $R_{2n-1}$, then proceed down the tree
• Having chosen a residue from the set $R_k$, we pick the same residue from the daughter set $R_i$ if possible, and otherwise pick a residue at random from $R_i$
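The traceback can be sketched in Python as a bottom-up pass that stores the set for every node, followed by a top-down pass that makes the choices. Everything named here (`fitch_sets`, `traceback`, the tuple tree encoding, the labels `s1` to `s4`) is an assumption for illustration. To keep the output reproducible, the sketch picks the alphabetically first residue where the slide says "at random".

```python
def fitch_sets(tree, states, sets):
    """Bottom-up pass: store the set R_k of every node, keyed by its subtree."""
    if isinstance(tree, str):
        sets[tree] = {states[tree]}
    else:
        left = fitch_sets(tree[0], states, sets)
        right = fitch_sets(tree[1], states, sets)
        # Intersection if it is non-empty, otherwise the union.
        sets[tree] = (left & right) or (left | right)
    return sets[tree]


def traceback(tree, sets, parent_choice=None, out=None):
    """Top-down pass: keep the parent's residue when the node's set allows it."""
    out = {} if out is None else out
    options = sets[tree]
    if parent_choice in options:
        choice = parent_choice
    else:
        choice = min(options)  # stand-in for the random pick on the slide
    out[tree] = choice
    if not isinstance(tree, str):
        traceback(tree[0], sets, choice, out)
        traceback(tree[1], sets, choice, out)
    return out


# One alignment column on an assumed four-leaf tree.
tree = (("s1", "s2"), ("s3", "s4"))
column = {"s1": "A", "s2": "A", "s3": "G", "s4": "A"}

sets = {}
fitch_sets(tree, column, sets)
assignment = traceback(tree, sets)
print(assignment)
```

In this column the only substitution falls on the edge leading to `s3`, which matches a cost of 1 for the column.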
![Page 70: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/70.jpg)
70
Traditional Parsimony is not “complete”
![Page 71: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/71.jpg)
71
Weighted Parsimony
![Page 72: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/72.jpg)
72
Example
Aardvark, Bison, Chimp, Dog, Elephant
A: CAGGTA
B: CAGACA
C: CGGGTA
D: TGCACT
E: TGCGTA
![Page 73: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/73.jpg)
73
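Parsimony & Distance

Sequences:

| | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|
| Drosophila | t | t | a | t | t | a | a |
| fugu | a | a | t | t | t | a | a |
| mouse | a | a | a | a | a | t | a |
| human | a | a | a | a | a | a | t |

Pairwise distances:

| | human | mouse | fugu | Drosophila |
|---|---|---|---|---|
| human | x | | | |
| mouse | 2 | x | | |
| fugu | 4 | 4 | x | |
| Drosophila | 5 | 5 | 3 | x |

Figures: the parsimony tree (branches labelled with the alignment positions that change) and the distance tree (branches labelled with lengths), each relating Drosophila, fugu, mouse and human.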
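The distance matrix on this slide is the number of differing positions between each pair of sequences. A short Python sketch of that count follows; the helper name `hamming` and the dictionary layout are illustrative choices.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))


# The seven-column alignment from the slide.
seqs = {
    "Drosophila": "ttattaa",
    "fugu": "aatttaa",
    "mouse": "aaaaata",
    "human": "aaaaaat",
}

dist = {
    frozenset((a, b)): hamming(seqs[a], seqs[b])
    for a, b in combinations(seqs, 2)
}
for pair, value in dist.items():
    print(sorted(pair), value)
```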
![Page 74: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/74.jpg)
74
How to assess confidence in tree
• Distance method – bootstrap:
– Select multiple alignment columns with replacement
– Recalculate tree
– Compare branches with original (target) tree
– Repeat 100-1000 times, so calculate 100-1000 different trees
– How often is branching (point between 3 nodes) preserved for each internal node?
– Uses samples of the data
![Page 75: 1 Chapter 7 Building Phylogenetic Trees. 2 Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances –UPGMA method](https://reader031.vdocuments.us/reader031/viewer/2022032709/56649eb55503460f94bbdf81/html5/thumbnails/75.jpg)
75
The Bootstrap -- example
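Original alignment:

| Sequence | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|
| 1 | - | C | V | K | V | I | Y | S |
| 2 | M | A | V | R | - | I | F | S |
| 3 | M | C | L | R | L | L | F | T |

Scrambled alignment (columns sampled with replacement):

| Sequence | 3 | 4 | 3 | 8 | 6 | 6 | 8 | 6 |
|---|---|---|---|---|---|---|---|---|
| 1 | V | K | V | S | I | I | S | I |
| 2 | V | R | V | S | I | I | S | I |
| 3 | L | R | L | T | L | L | T | L |

Figure: trees for the original and the scrambled alignments of sequences 1 to 3 (labels in the figure: 2x, 3x, non-supportive).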
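The column resampling in this example can be reproduced in a few lines of Python. This is a sketch of the resampling step only (tree building and branch comparison are left out); the function names and the seed value are assumptions for illustration.

```python
import random


def resample_columns(alignment, columns):
    """Build a pseudo-alignment from 1-based column indices."""
    return ["".join(seq[c - 1] for c in columns) for seq in alignment]


def bootstrap_columns(length, rng):
    """Draw `length` column indices (1-based) with replacement."""
    return [rng.randint(1, length) for _ in range(length)]


# Original alignment from the slide ("-" marks a gap).
alignment = ["-CVKVIYS", "MAVR-IFS", "MCLRLLFT"]

# The column draw shown on the slide.
slide_columns = [3, 4, 3, 8, 6, 6, 8, 6]
scrambled = resample_columns(alignment, slide_columns)
print(scrambled)

# A fresh random draw; the seed is arbitrary and only fixes the output.
rng = random.Random(0)
random_columns = bootstrap_columns(len(alignment[0]), rng)
print(random_columns, resample_columns(alignment, random_columns))
```

In a full bootstrap this draw would be repeated 100 to 1000 times, with a tree recalculated from each pseudo-alignment.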