![Page 1: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/1.jpg)
Comparative Genomics
EPIT 2014, Oléron, May 16th 2014
Eric Tannier, INRIA, LBBE, university of Lyon
![Page 2: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/2.jpg)
What are the documents and methods for writing the history of life on Earth (human and non-human)?
![Page 3: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/3.jpg)
200 years
Oral (human) history
![Page 4: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/4.jpg)
Written (human) history
6000 years
![Page 5: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/5.jpg)
Archaeology
4 000 000 years
![Page 6: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/6.jpg)
Paleontology
Comparative anatomy
1 000 000 000 years
![Page 7: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/7.jpg)
Comparative Genomics
> 3 000 000 000 years
More than ¾ of the history of life on Earth is made of microbes that have left no fossils. They represent 9/10 of today's biodiversity. Molecules are by far the most abundant source for this history.
![Page 8: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/8.jpg)
Paleogenomics
Sequencing ancient DNA and proteins
200-year-old viruses
1000-year-old bacteria
700 000-year-old mammals
Computational prediction of ancestral DNA or proteins
> 3 000 000 000 year old molecules
![Page 9: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/9.jpg)
The principle is, as in comparative anatomy, to compare extant molecules and decipher what part of them is inherited from ancient ancestors, and what part is derivative.
For molecules we can integrate what we know about the processes of genome evolution.
![Page 10: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/10.jpg)
Part 1: how we learned that chromosomes are linear arrangements of linear genes (with classical rearrangement problems)
Part 2: how we learned to forget it (with an open problem session)
Part 3 (if we have time): research results
- A 650-year-old genome: Yersinia pestis
- How to date without fossils: chronology of cyanobacteria
Outline
![Page 11: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/11.jpg)
Comparative Genomics starts with Mendel's laws
Two individuals $x_1$ and $x_2$, with « factors » (genes), and two values per factor:
$x_1$ carries $x_{11}$ and $x_{12}$; $x_2$ carries $x_{21}$ and $x_{22}$
$(x_{11} + x_{12})(x_{21} + x_{22}) = x_{11}x_{21} + x_{11}x_{22} + x_{12}x_{21} + x_{12}x_{22}$
Offspring has factors from $(x_{11}x_{21},\ x_{11}x_{22},\ x_{12}x_{21},\ x_{12}x_{22})$ with probabilities (¼, ¼, ¼, ¼)
![Page 12: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/12.jpg)
Values are ordered: the observed trait is taken from the highest
Examples: for one trait with alleles a and A (a < A)
- Offspring from AA and AA parents are AA with probability 1: (A + A)(A + A) = 4AA. Observation: A
- Offspring from aa and aa parents are aa with probability 1: 2a * 2a = 4aa. Observation: a
- Offspring from AA and aa parents are Aa with probability 1: (A + A)(a + a) = 4Aa. Observation: A
- Offspring from Aa and Aa parents are (AA, Aa, aa) with probabilities (¼, ½, ¼): $(A + a)^2 = AA + 2Aa + aa$. Observation: (A, a) with frequencies (¾, ¼)
Comparative Genomics starts with Mendel's laws
![Page 13: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/13.jpg)
For two traits, the probability distribution of factors is the independent combination of the two distributions
Example: for two independent traits with alleles a and A (a < A), b and B (b < B)
$(A+a)^2(B+b)^2 = AABB + 2AABb + AAbb + 2AaBB + 4AaBb + 2Aabb + aaBB + 2aaBb + aabb$
Observations (AB, Ab, aB, ab) with frequencies (9/16, 3/16, 3/16, 1/16)
Comparative Genomics starts with Mendel's laws
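The 9:3:3:1 frequencies above can be checked by brute-force enumeration. The following is a minimal sketch, assuming each parent is written as a two-letter genotype string and that the upper-case allele is the dominant one; the function names `cross`, `observed` and `two_traits` are illustrative and not part of the talk.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent1, parent2):
    """Genotype distribution for one trait; each parent is a string such as 'Aa'."""
    # Each parent passes on one of its two factors with probability 1/2.
    counts = Counter("".join(sorted(a + b)) for a, b in product(parent1, parent2))
    return {genotype: Fraction(n, 4) for genotype, n in counts.items()}


def observed(distribution):
    """Collapse genotypes to observed traits: the upper-case allele wins (a < A)."""
    traits = Counter()
    for genotype, p in distribution.items():
        traits[max(genotype, key=str.isupper)] += p
    return dict(traits)


def two_traits(dist1, dist2):
    """Independent combination of the observed distributions of two traits."""
    joint = Counter()
    for (t1, p1), (t2, p2) in product(observed(dist1).items(), observed(dist2).items()):
        joint[t1 + t2] += p1 * p2
    return dict(joint)


print(cross("Aa", "Aa"))
print(two_traits(cross("Aa", "Aa"), cross("Bb", "Bb")))
```

Using `Fraction` keeps the probabilities exact, so the result can be compared directly with the fractions on the slide.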
![Page 14: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/14.jpg)
2 factors, 2 values each: (round > wrinkled) and (yellow > green)
![Page 15: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/15.jpg)
Genetic linkage: Bateson and Punnett experiment (1900)
Traits
A genome is not a set of genes
![Page 16: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/16.jpg)
Introduction of a "linkage probability": lp
Cross $(A+a)^2$ and $(B+b)^2$ with linkage coefficient lp =
$\mathit{lp}^2\,(AABB + 2AaBb + aabb) + (1-\mathit{lp})^2\,(AAbb + 2AaBb + aaBB) + 2\,\mathit{lp}\,(1-\mathit{lp})\,(AABb + AaBB + Aabb + aaBb)$
Observed frequencies:
AB: $(\mathit{lp}^2 + 2)/4$
Ab: $(1 - \mathit{lp}^2)/4$
aB: $(1 - \mathit{lp}^2)/4$
ab: $\mathit{lp}^2/4$
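The observed frequencies can be recomputed from a gamete model. This is a sketch under an assumption that is not spelled out on the slide: both parents are double heterozygotes AB/ab, the two parental gametes (AB, ab) share probability lp and the two recombinant gametes (Ab, aB) share probability 1 - lp. The function name `observed_frequencies` is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def observed_frequencies(lp):
    """Observed trait frequencies for a cross of two AB/ab parents with linkage lp."""
    # Assumed gamete model: parental gametes share lp, recombinant gametes share 1 - lp.
    gametes = {"AB": lp / 2, "ab": lp / 2, "Ab": (1 - lp) / 2, "aB": (1 - lp) / 2}
    freq = Counter()
    for (g1, p1), (g2, p2) in product(gametes.items(), repeat=2):
        alleles = g1 + g2
        # Dominance: the upper-case allele is observed whenever it is present.
        trait = ("A" if "A" in alleles else "a") + ("B" if "B" in alleles else "b")
        freq[trait] += p1 * p2
    return dict(freq)


# lp = 1/2 gives back the independent case: 9/16, 3/16, 3/16, 1/16.
print(observed_frequencies(Fraction(1, 2)))
```

Under this model the four frequencies agree with the closed forms (lp² + 2)/4, (1 - lp²)/4, (1 - lp²)/4 and lp²/4 for any value of lp.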
![Page 17: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/17.jpg)
lp = ½ means independence, as in Mendel's laws
In Bateson and Punnett's experiment, lp = 0.88
![Page 18: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/18.jpg)
Sturtevant, 1913 (with Thomas Morgan)
The linkage probability lp is "close" to a linear distance d = 1 - lp
If a, b, c are factors such that d(a,c) > d(a,b) and d(a,c) > d(b,c), then
d(a,c) = d(a,b) + d(b,c)
Factors are organized linearly (mathematical prediction of the structure of chromosomes)
![Page 19: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/19.jpg)
![Page 20: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/20.jpg)
Bridges, Morgan, 1916
Castle, 1919
![Page 21: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/21.jpg)
![Page 22: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/22.jpg)
Sturtevant, 1921
Prediction of the inversion
![Page 23: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/23.jpg)
Confirmation by observations
Sturtevant, 1921
![Page 24: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/24.jpg)
Biology, thanks to mathematics, is a predictive science: as in physics, objects (chromosomes, inversions) can be predicted before being discovered
Epistemological parenthesis
![Page 25: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/25.jpg)
Molecular phylogeny, Dobzhansky and Sturtevant, 1938
![Page 26: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/26.jpg)
Mathematical investigation of inversions, Sturtevant, Tan, 1937
![Page 27: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/27.jpg)
You don't need fast computers or abundant sequences to do computational biology and bioinformatics.
Epistemological parenthesis
![Page 28: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/28.jpg)
The mathematical problem stayed uninvestigated from 1941 to 1982, and it was « solved » in 1999 by an NP-completeness result due to Caprara
Epistemological parenthesis
![Page 29: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/29.jpg)
Summary at this point:
- chromosomes are linear arrangements of genes
- this arrangement can mutate
- reconstructing the history of mutations is a complicated (and complex) problem
But genes are still abstract points on the line. Following the discovery of the structure of DNA, what is the structure of a gene?
![Page 30: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/30.jpg)
Now that the chromosomes are (more or less) linear, what about the genes?
![Page 31: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/31.jpg)
![Page 32: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/32.jpg)
![Page 33: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/33.jpg)
Benzer, 1959
![Page 34: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/34.jpg)
![Page 35: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/35.jpg)
Sometimes important objects are invented in the context of their application.
Proxy to decide if a problem is polynomial: ask a biologist to solve a small instance
Epistemological parenthesis
![Page 36: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/36.jpg)
In fact, interval graphs had been invented earlier, by Norbert Wiener in 1914, but in the language of Whitehead and Russell; Benzer's biological article is more readable for any mathematician
![Page 37: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/37.jpg)
For posterity, write in common language: it is more sustainable than an invented one
Epistemological parenthesis
![Page 38: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/38.jpg)
Now that the genes have a structure, what about the inversion problem?
It was NP-hard without a gene structure; it becomes polynomial with the gene structure
![Page 39: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/39.jpg)
Modelling: genes are segments, say they are arcs
Genes
![Page 40: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/40.jpg)
Modelling: chromosomes are linear arrangements of genes, draw a matching (disjoint edges)
Genes
Genome 1
adjacencies
![Page 41: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/41.jpg)
Modelling: for another arrangement of the same genes, draw another matching
Genes
Genome 2
![Page 42: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/42.jpg)
Inversion
Modelling: define an operation to transform a matching into another
Inversion: cut two edges, which leaves four free extremities, then join them with two new edges
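The cut-and-join operation described above can be sketched on a set of adjacencies. This is an illustration only: the encoding of gene extremities as strings such as `"2t"` (tail of gene 2) and `"2h"` (head of gene 2), and the function name `dcj`, are assumptions, and only the two-edge case from the slide is covered.

```python
def dcj(adjacencies, edge1, edge2, swap=False):
    """Cut two adjacencies and rejoin the four free extremities differently."""
    e1, e2 = frozenset(edge1), frozenset(edge2)
    if e1 not in adjacencies or e2 not in adjacencies:
        raise ValueError("both edges must be adjacencies of the genome")
    a, b = edge1
    c, d = edge2
    rest = set(adjacencies) - {e1, e2}
    # Two ways of rejoining four extremities that differ from the original pairing.
    joined = {frozenset((a, d)), frozenset((b, c))} if swap else {frozenset((a, c)), frozenset((b, d))}
    return rest | joined


# Genome 1 2 3: head of 1 next to tail of 2, head of 2 next to tail of 3.
genome = {frozenset(("1h", "2t")), frozenset(("2h", "3t"))}
inverted = dcj(genome, ("1h", "2t"), ("2h", "3t"))
print(sorted(sorted(adj) for adj in inverted))
```

Here the result pairs `1h` with `2h` and `2t` with `3t`, which is the matching of the genome 1 -2 3: gene 2 has been inverted.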
![Page 43: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/43.jpg)
Double cut-and-join (DCJ)
Fusion
Fission
Inversion
Reciprocal Translocation
Non-reciprocal translocation
![Page 44: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/44.jpg)
Optimization problem: given two matchings, find a minimum DCJ scenario that transforms one into the other
Solution given in 2005, following the solution on permutations in 1995
![Page 45: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/45.jpg)
The comparison graph
Genes
Genome 1
Genome 2
![Page 46: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/46.jpg)
The comparison graph
Genome 1
Genome 2
![Page 47: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/47.jpg)
The comparison graph
Cycles, Red-Blue paths, Red-Red paths, Blue-Blue paths
![Page 48: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/48.jpg)
d(Red,Blue) = n – (c + e/2)
d: minimum number of DCJ
n: number of genes
c: number of cycles
e: number of Red-Blue paths
![Page 49: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/49.jpg)
Example
D = n – (c+e/2) = 9 – (4+2/2) = 4
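The formula can be evaluated by walking the components of the comparison graph. The sketch below makes several assumptions that are not stated on the slides: genomes are lists of linear chromosomes written as signed integers, both genomes have the same genes, and a "Red-Blue path" is read as a path with an even number of edges (one end is a telomere of genome 1, the other a telomere of genome 2, including an isolated extremity that is a telomere in both). All function names are illustrative.

```python
def extremities(gene):
    """(left, right) extremities of a signed gene as read along the chromosome."""
    g = abs(gene)
    return ((g, "t"), (g, "h")) if gene > 0 else ((g, "h"), (g, "t"))


def matching(genome):
    """Map each extremity to its partner across an adjacency (linear chromosomes)."""
    partner = {}
    for chrom in genome:
        for x, y in zip(chrom, chrom[1:]):
            a, b = extremities(x)[1], extremities(y)[0]
            partner[a], partner[b] = b, a
    return partner


def dcj_distance(genome1, genome2):
    red, blue = matching(genome1), matching(genome2)
    genes = sorted({abs(g) for chrom in genome1 for g in chrom})
    vertices = [(g, side) for g in genes for side in "th"]
    seen, cycles, red_blue_paths = set(), 0, 0
    # Paths: start from every extremity that misses a red or a blue edge.
    for v in vertices:
        if v in seen or (v in red and v in blue):
            continue
        seen.add(v)
        cur, use_red, edges = v, v in red, 0
        while (red if use_red else blue).get(cur) is not None:
            cur = (red if use_red else blue)[cur]
            seen.add(cur)
            edges += 1
            use_red = not use_red
        if edges % 2 == 0:  # assumed meaning of "Red-Blue path"
            red_blue_paths += 1
    # Cycles: every remaining extremity has both a red and a blue edge.
    for v in vertices:
        if v in seen:
            continue
        cycles += 1
        cur, use_red = v, True
        while cur not in seen:
            seen.add(cur)
            cur = (red if use_red else blue)[cur]
            use_red = not use_red
    return len(genes) - (cycles + red_blue_paths // 2)


print(dcj_distance([[1, 2, 3]], [[1, -2, 3]]))  # one inversion
```

A shared adjacency shows up as a cycle made of one red and one blue edge, so two identical genomes are at distance 0.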
![Page 50: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/50.jpg)
An additional constraint from reality (the linearity of genes) can miraculously make a problem easier
But this was an illusion: it led further into the dead end
Epistemological parenthesis
![Page 51: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/51.jpg)
Useful generalizations
These complexity results (illustrated by the non-optimal solution of 1937) explain why rearrangements, though discovered 25 years earlier, are still not studied as much as substitutions.
Ancestral state reconstruction (with >2 genomes), counting, sampling, integrating other events (duplications)
All remain hard?
![Page 52: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/52.jpg)
One solution: try your favorite solver, metaheuristic, or MCMC
Badger, application to mitochondria: Larget, Simon, 2004, 2005
Application to Yersinia: Darling, Miklos, Ragan, 2008
Theoretical results on convergence: Miklos, Tannier, 2010, 2012
Still only inversions (no duplications, transfers...)
Limited number of genomes (up to ~10) with limited number of genes (up to ~30)
Navigating in the space of phylogenies, ancestral configurations and scenarios
See also « Combinatorics of Genome Rearrangements », Fertin et al., 2009
![Page 53: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/53.jpg)
Reverse the history (and forget the biology): keep genes linear but forget the inversions (and then chromosomes)
For this look at another (more successful) history
Epistemological parenthesis
![Page 54: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/54.jpg)
Since the 1960s, phylogenies have been based on substitutions more than on rearrangements
Zuckerkandl and Pauling, 1965
![Page 55: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/55.jpg)
The (wrong) hypothesis simplifying all computations:
Sites evolve independently from each other
Then you have a wonderful tool to reconstruct ancestral sequences: dynamic programming again.
![Page 56: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/56.jpg)
General problem. Given
- a rooted phylogenetic tree
- a state space, and values from this space on the leaves
- a non-negative cost function d(x,y) on the state space
Find the states at the internal nodes, minimizing the sum of the costs over all branches
Ancestral character reconstruction
![Page 57: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/57.jpg)
Ancestral character reconstruction
Variants: find
- the minimum cost solutions, or
- the most probable solution, or
- the most probable solutions, or
- sample among probable solutions, or
- sample according to the probability distribution.
![Page 58: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/58.jpg)
0 1 1 0 0
Example: presence or absence of a trait
![Page 59: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/59.jpg)
E R R U U
Example: one amino acid site
![Page 60: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/60.jpg)
3.5 3.2 3.2 2.4 2.2
Example: genome size (in millions of base pairs)
![Page 61: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/61.jpg)
0 1 1 0 0
Fitch method: ascending phase
0
{0,1}
1
{0,1}
![Page 62: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/62.jpg)
0 1 1 0 0
0
1
1
1
Fitch method: descending phase
![Page 63: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/63.jpg)
0 1 1 0 0
0
1
0
1
Fitch method: descending phase
![Page 64: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/64.jpg)
0 1 1 0 0
0
0
0
0
Fitch method: some optimal solutions are never examined
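The two phases of the Fitch method can be sketched as follows. The tree shape is an assumption (the figure is not in the transcript): a caterpillar tree over the leaf states 0 1 1 0 0, written as nested tuples. The function names and the dictionary-based node representation are illustrative choices.

```python
def ascend(tree):
    """Ascending phase: return (annotated node, number of changes in the subtree)."""
    if not isinstance(tree, tuple):  # leaf: its candidate set is its own state
        return {"set": {tree}, "children": []}, 0
    (left, cost_l), (right, cost_r) = ascend(tree[0]), ascend(tree[1])
    common = left["set"] & right["set"]
    if common:  # children agree: keep the intersection, no extra change
        return {"set": common, "children": [left, right]}, cost_l + cost_r
    union = left["set"] | right["set"]  # children disagree: union, one more change
    return {"set": union, "children": [left, right]}, cost_l + cost_r + 1


def descend(node, parent_state=None):
    """Descending phase: keep the parent's state when allowed, else pick a candidate."""
    node["state"] = parent_state if parent_state in node["set"] else min(node["set"])
    for child in node["children"]:
        descend(child, node["state"])
    return node


def internal_states(node):
    """States chosen at internal nodes, in pre-order."""
    if not node["children"]:
        return []
    return [node["state"]] + [s for c in node["children"] for s in internal_states(c)]


tree = ((((0, 1), 1), 0), 0)  # assumed topology
root, cost = ascend(tree)
print(cost, internal_states(descend(root)))
```

On this assumed tree the minimum cost is 2. Assigning 0 to every internal node also costs 2, but the descending phase can never produce it, because one internal node has candidate set {1}.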
![Page 65: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/65.jpg)
A general scheme by dynamic programming
f(u,x) is the cost of assigning x to node u in the subtree rooted at u
If u is a leaf, f(u,x) = 0 if x is the value of u, ∞ otherwise
If u is not a leaf, $f(u,x) = \sum_{v \text{ child of } u} \min_{y} \big( f(v,y) + d(x,y) \big)$
Sankoff, Rousseau, 1975
![Page 66: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/66.jpg)
For a binary state space
Internal node x with children y and z:
c(x,1) = min (c(y,1)+c(z,1), c(y,0)+c(z,1)+1, c(y,1)+c(z,0)+1, c(y,0)+c(z,0)+2)
c(x,0) = min (c(y,1)+c(z,1)+2, c(y,0)+c(z,1)+1, c(y,1)+c(z,0)+1, c(y,0)+c(z,0))
Leaf x:
c(x,1) = 0 and c(x,0) = ∞ if x = 1
c(x,0) = 0 and c(x,1) = ∞ if x = 0
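The dynamic programming scheme can be written in a few lines for any finite state space. This is a minimal sketch, assuming the tree is given as nested tuples of leaf values and the cost d(x,y) as a Python function; the caterpillar topology used in the example is an assumption, not taken from the slides.

```python
import math


def sankoff(tree, states, cost):
    """Return {x: f(root, x)}: the minimum cost of the subtree when the root gets x."""
    if not isinstance(tree, tuple):  # leaf: cost 0 for its own value, infinite otherwise
        return {x: (0 if x == tree else math.inf) for x in states}
    tables = [sankoff(child, states, cost) for child in tree]
    # For each child, pay the cheapest combination of its own cost and the branch cost.
    return {x: sum(min(t[y] + cost(x, y) for y in states) for t in tables) for x in states}


def unit_cost(x, y):
    """One unit per change of state along a branch."""
    return 0 if x == y else 1


binary = sankoff(((((0, 1), 1), 0), 0), (0, 1), unit_cost)
print(binary)

amino = sankoff((((("E", "R"), "R"), "U"), "U"), "ERU", unit_cost)
print(min(amino.values()))
```

The running time grows with the square of the number of states at every node, which is why the scheme does not carry over to state spaces as large as permutations or matchings.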
![Page 67: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/67.jpg)
Not generalizable to permutations or matchings because of the state space (one cost computation for every state)
Ancestral character reconstruction is NP-hard for all variants.
?
![Page 68: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/68.jpg)
Today genomes are often compared using substitutions only
It is not that substitutions are more important for evolution, but that computational tools are available
An important part of adaptation might be due to rearrangements, but it is out of reach because of algorithmic complexity
Epistemological parenthesis
![Page 69: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/69.jpg)
The (wrong) hypothesis simplifying all computations:
Sites evolve independently from each other (that is, forgetting the linearity of genes)
For rearrangements, consider adjacencies as sites and decide that they evolve one by one, independently from each other
![Page 70: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/70.jpg)
Single cut-and-join (SCJ)
Fusion (or circularising a path)
Fission (or cutting a cycle)
![Page 71: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/71.jpg)
Optimization problem: given two matchings, find a minimum SCJ scenario that transforms one into the other
![Page 72: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/72.jpg)
Now it is possible to
- optimise (trivial)
- count, enumerate (exercise, think about dynamic programming)
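One way to see why optimisation is trivial: if each single cut or join changes exactly one adjacency, a shortest scenario cuts every adjacency found only in the first genome and joins every adjacency found only in the second. The sketch below assumes genomes given as lists of linear chromosomes of signed integers; the function names are illustrative.

```python
def extremities(gene):
    """(left, right) extremities of a signed gene as read along the chromosome."""
    g = abs(gene)
    return ((g, "t"), (g, "h")) if gene > 0 else ((g, "h"), (g, "t"))


def adjacency_set(genome):
    """Set of adjacencies (pairs of neighbouring extremities) of a genome."""
    adjacencies = set()
    for chrom in genome:
        for x, y in zip(chrom, chrom[1:]):
            adjacencies.add(frozenset((extremities(x)[1], extremities(y)[0])))
    return adjacencies


def scj_distance(genome1, genome2):
    """Number of cuts plus number of joins: size of the symmetric difference."""
    a, b = adjacency_set(genome1), adjacency_set(genome2)
    return len(a - b) + len(b - a)


print(scj_distance([[1, 2, 3]], [[1, -2, 3]]))  # 2 cuts + 2 joins
```

Each adjacency is treated as an independent binary site, so the comparison reduces to set operations.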
What is the loss of accuracy?
![Page 73: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/73.jpg)
Simulation of inversions
N = number of genes (=number of sites) = 100
![Page 74: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/74.jpg)
Simulation of inversions
N = number of genes (=number of sites) = 100
![Page 75: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/75.jpg)
Estimation of the number of events under a Jukes-Cantor model if sites evolve independently
Nd = number of different sites
N = size of the sequence
t = time (actual number of events per site)
N = number of genes (=number of sites) = 100
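The formula shown on this slide did not survive in the transcript. As a hedged illustration only, here is the textbook Jukes-Cantor correction for a model with `n_states` equiprobable states per site (4 for nucleotides). The number of states appropriate for adjacencies is an assumption left as a parameter, and the function name is hypothetical.

```python
import math


def jukes_cantor_events(n_different, n_sites, n_states=4):
    """Estimate t, the number of events per site, from Nd and N.

    Inverts the expected fraction of differing sites
        Nd / N = s * (1 - exp(-t / s)),  with s = (n_states - 1) / n_states.
    Returns math.inf when the observed fraction reaches saturation.
    """
    p = n_different / n_sites
    saturation = (n_states - 1) / n_states
    if p >= saturation:
        return math.inf
    return -saturation * math.log(1 - p / saturation)


# With N = 100 sites and 30 observed differences (made-up numbers):
print(jukes_cantor_events(30, 100))
```

The estimate is always at least the observed fraction Nd / N, since some sites change more than once or change back.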
![Page 76: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/76.jpg)
Simulation of inversions
N = number of genes (=number of sites) = 100
![Page 77: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/77.jpg)
Now it is possible to
- optimise, count, enumerate, sample...
With the same complexity as for amino acid or nucleotide sequences
With a good accuracy
What about ancestral character reconstruction?
![Page 78: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/78.jpg)
[Figure: a tree whose nodes carry 0/1 states indicating the presence of an edge, shown for the edges AB and AC]
Ancestral character reconstruction: are the internal nodes matchings?
![Page 79: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/79.jpg)
Equivalent to: if an edge xy is present at an ancestral node, then no edge xt or yt is present there
Proof (exercise):
- by induction on the ascending phase
- by contradiction on the descending phase
Again an illusion theorem. "Fitch" solutions choosing 0 (absence) when there is an ambiguity at the root guarantee that evolving each edge independently yields a matching at each ancestral node.
(Feijao and Meidanis, 2013)
Use of Fitch is bad for enumeration and counting in the ancestors
Counting SCJ scenarios is #P-hard (Miklos, Tannier, 2014)
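The statement above can be tried on a toy case. This is a sketch under stated assumptions: a rooted binary tree written as nested tuples of leaf names, leaf genomes given as matchings (sets of `frozenset` adjacencies), and Fitch run separately on the presence (1) or absence (0) of each adjacency, with an ambiguous root set to 0. All names and the path labels (`"root"`, `"root/L"`, ...) are illustrative.

```python
def fitch_sets(tree, leaf_state):
    """Ascending phase: return (candidate states, children) for each node."""
    if isinstance(tree, str):                      # leaf: observed state
        return ({leaf_state[tree]}, ())
    kids = tuple(fitch_sets(child, leaf_state) for child in tree)
    common = kids[0][0] & kids[1][0]
    return (common or (kids[0][0] | kids[1][0]), kids)


def assign(node, parent_state, path, out):
    """Descending phase: ambiguity at the root is resolved to 0 (absence)."""
    states, kids = node
    if not kids:                                   # leaves are not reported
        return
    if len(states) == 1:
        state = next(iter(states))
    else:                                          # ambiguous set {0, 1}
        state = 0 if parent_state is None else parent_state
    out[path] = state
    assign(kids[0], state, path + "/L", out)
    assign(kids[1], state, path + "/R", out)


def reconstruct(tree, genomes):
    """Run Fitch independently on every adjacency seen in some leaf."""
    ancestors = {}
    for adj in set().union(*genomes.values()):
        leaf_state = {name: int(adj in g) for name, g in genomes.items()}
        states = {}
        assign(fitch_sets(tree, leaf_state), None, "root", states)
        for path, present in states.items():
            ancestors.setdefault(path, set())
            if present:
                ancestors[path].add(adj)
    return ancestors


def is_matching(adjacencies):
    ends = [end for adj in adjacencies for end in adj]
    return len(ends) == len(set(ends))


xy, xz = frozenset(("x", "y")), frozenset(("x", "z"))
genomes = {"G1": {xy}, "G2": {xy}, "G3": {xz}}
ancestors = reconstruct((("G1", "G2"), "G3"), genomes)
print(all(is_matching(a) for a in ancestors.values()))
```

In this toy case both xy and xz are ambiguous at the root. Resolving both to 1 would put the conflicting edges xy and xz in the same ancestor, which is exactly what the choice of 0 avoids.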
![Page 80: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/80.jpg)
Bridges, Morgan, 1916 Castle, 1919
Solution: give up the linearity of chromosomes (after all, it was not certain)
![Page 81: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/81.jpg)
But... more embarrassing... we still assume that the genomes all have the same genes
Now it is possible to
- optimise, count, enumerate, sample, compute ancestral states, sample among them
Ancestral chromosomes are not linear... Who cares? If someone cares, linearize them with a maximum matching algorithm
![Page 82: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/82.jpg)
Genes evolve !!!
![Page 83: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/83.jpg)
The effect of a duplication on adjacencies if chromosomes are linear:
- either it is a tandem duplication and each exemplar receives a neighbor
- or one copy is inserted elsewhere
![Page 84: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/84.jpg)
The effect of a loss on adjacencies if chromosomes are linear:
![Page 85: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/85.jpg)
The effect of a duplication on adjacencies if chromosomes are NOT linear:
- either it is a tandem duplication and each exemplar receives a neighbor
- or one copy is inserted elsewhere
Any combination of edge distribution is allowed (including none to one copy)
![Page 86: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/86.jpg)
The effect of a loss on adjacencies if chromosomes are NOT linear:
or
![Page 87: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/87.jpg)
[Figure: gene trees over human and cattle genes, with duplications marked: one case with one gene in the human/cattle ancestor, one case with two genes in the human/cattle ancestor]
The history of duplications for one gene can be drawn on a gene tree
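One common way to read duplications off a gene tree is LCA reconciliation with a species tree: a gene-tree node is labelled as a duplication when it maps to the same species as one of its children. The sketch below assumes a two-species tree, nested-tuple gene trees, and hypothetical leaf names of the form `species_copy`. It illustrates the idea and is not claimed to be the method used in the slides.

```python
# Species tree given as child -> parent; "ancestor" is the human/cattle ancestor.
PARENT = {"human": "ancestor", "cattle": "ancestor", "ancestor": None}


def lineage(species):
    """Species followed by all its ancestors, up to the root."""
    path = []
    while species is not None:
        path.append(species)
        species = PARENT[species]
    return path


def lca(a, b):
    """Last common ancestor of two species in the species tree."""
    above_b = set(lineage(b))
    return next(s for s in lineage(a) if s in above_b)


def reconcile(gene_tree, duplications):
    """Map a gene-tree node to a species; record where duplications occur."""
    if isinstance(gene_tree, str):
        return gene_tree.split("_")[0]             # "human_1" -> "human"
    left = reconcile(gene_tree[0], duplications)
    right = reconcile(gene_tree[1], duplications)
    here = lca(left, right)
    if here in (left, right):                      # same species as a child
        duplications.append(here)
    return here


old_duplication = []
reconcile((("human_1", "cattle_1"), ("human_2", "cattle_2")), old_duplication)
print(old_duplication)    # duplication before the split: two ancestral genes

recent_duplication = []
reconcile((("human_1", "human_2"), "cattle_1"), recent_duplication)
print(recent_duplication)  # duplication in the human lineage: one ancestral gene
```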
![Page 88: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/88.jpg)
Open problems (solution might be easy):
Given two genomes, transform one into the other with gains and losses of adjacencies, gains, duplications and losses of genes, with a minimum number of events.
Variants:
- genes can be vertices or arcs, labels can be repeated
- adjacencies can form matchings or graphs
- duplications can be in tandem or not, and duplications and losses can concern a connected subgraph of genes
- the history of duplications and losses of genes can be considered known or not
Generalization for ancestral character reconstruction
![Page 89: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/89.jpg)
Almost all variants are open.
My conjecture is that all are NP-hard for linear chromosomes
« Genomes Containing Duplicates Are Hard to Compare », Chauve et al, 2006
Theorem: one is polynomial if
- chromosomes are not required to be linear
- the history of duplications is known (gene trees)
- duplications are in tandem
- losses remove all adjacencies
In this case ancestral reconstruction on a phylogenetic tree is polynomial
Bérard et al, DeCo, 2012
The conditions preserve the independence of adjacencies and allow dynamic programming
![Page 90: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/90.jpg)
Dynamic programming with propagation rules for adjacencies
Two genes
One duplicates
Adjacency is propagated to one copy
![Page 91: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/91.jpg)
Dynamic programming with propagation rules for adjacencies
![Page 92: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/92.jpg)
What time is it?
If time << 12h30, then vote between
- the recent story, Yersinia pestis in the Middle Ages
- the old story, dating with horizontal gene transfers
Else: thanks for still being here, and have a nice lunch
Bioinformatics, 2013
PNAS, 2012
![Page 93: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/93.jpg)
Transfers
![Page 94: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/94.jpg)
Reconstructing ancestral genomes in the presence of transfers
![Page 95: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/95.jpg)
[Figure: two phylogenetic trees (scale bar 0.5) over about forty bacterial and archaeal species, together with Arabidopsis thaliana and Chlamydomonas reinhardtii; group labels: Bacteria, Archaea, Eukaryota, plants, cyanobacteria]
Lateral gene transfer produces gene phylogenies which are far from species phylogenies
![Page 96: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/96.jpg)
Two possible models for transfers
A transferred gene replaces an old gene
Identification: subtree prune and regraft, NP-hard
A transferred gene is added to the genome
Gene lineages are independent: reconciliation, polynomial
![Page 97: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/97.jpg)
Transfers contain information about dates when fossils are absent
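One way to see where the dating information comes from: a transfer from a donor branch into a recipient branch requires the two branches to overlap in time, so the node at the start of the donor branch must be older than the node at the end of the recipient branch. The sketch below, an illustration under that assumption and not the method of the cited papers, collects such constraints and checks that they admit a consistent relative ordering. It uses `graphlib` from the Python standard library (3.9+). The tree and transfers are made up.

```python
from graphlib import CycleError, TopologicalSorter


def relative_dating(tree_edges, transfers):
    """Order species-tree nodes from oldest to youngest.

    tree_edges: (parent, child) pairs of the species tree.
    transfers:  (donor_branch, recipient_branch) pairs, each branch being a
                (parent, child) pair of the species tree.
    Returns one compatible ordering, or None if the constraints conflict.
    """
    sorter = TopologicalSorter()
    for parent, child in tree_edges:
        sorter.add(child, parent)                  # a parent predates its child
    for (donor_parent, _), (_, recipient_child) in transfers:
        sorter.add(recipient_child, donor_parent)  # donor start before recipient end
    try:
        return list(sorter.static_order())
    except CycleError:
        return None


edges = [("root", "X"), ("root", "Y"),
         ("X", "a"), ("X", "b"), ("Y", "c"), ("Y", "d")]
# A transfer from branch X->a into branch root->Y implies that X is older than Y,
# something the tree alone does not say.
order = relative_dating(edges, [(("X", "a"), ("root", "Y"))])
print(order.index("X") < order.index("Y"))
```

Two transfers that imply opposite orders make the function return `None`, which is how incompatible sets of transfers would show up in this simplified setting.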
![Page 98: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/98.jpg)
A dated cyanobacteria tree with 473 gene families
Agrees with all known dated fossil records
PNAS, 2012
![Page 99: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/99.jpg)
Medieval epidemic that ravaged Europe in the 14th century
Three great plague epidemics known in history:
- "Justinian" plague, Byzantine Empire, 6th to 8th century
- Black Death, Western Europe, 1348
- Asia, 19th century
![Page 100: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/100.jpg)
Yersinia pestis
- Discovered in the 19th century by Alexandre Yersin, of the Institut Pasteur, during the third epidemic; initially named Pasteurella pestis
- First complete chromosome sequence in 2001 (Parkhill et al, Nature, 2001)
- Evolved from a common ancestor (20,000 years ago) with Yersinia pseudotuberculosis; gained a plasmid suspected to be responsible for pathogenicity in humans
- Sequence full of repeats, evolution full of rearrangements (Chain et al, PNAS, 2004)
![Page 101: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/101.jpg)
First whole-genome sequencing of an ancient pathogenic bacterium
![Page 102: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/102.jpg)
Illumina sequencing of 4 individuals from a London cemetery
2,366,647 reads, length ~55 bp, coverage 30X, captured with the CO92 strain
Assembly attempt (Velvet) based on the CO92 and Microtus strains: 2134 contigs, covering 4 Mb
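As a quick sanity check on these figures, average coverage is roughly the number of reads times the read length, divided by the size of the covered region. The sketch below assumes the 4 Mb spanned by the contigs as that size. This is an assumption for the arithmetic, not a figure stated as the genome size in the slides.

```python
reads = 2_366_647
read_length_bp = 55             # approximate read length from the slide
covered_size_bp = 4_000_000     # assumption: the 4 Mb covered by the contigs

coverage = reads * read_length_bp / covered_size_bp
print(f"{coverage:.1f}X")       # prints 32.5X
```

The result is in line with the 30X reported on the slide.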
![Page 103: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/103.jpg)
![Page 104: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/104.jpg)
From the 2134 contigs, a comparison with 11 related genomes makes it possible to reconstruct the complete chromosomal sequence of the ancient genome
1/ cut up the contigs
This yields 2619 marker families, potentially duplicated
![Page 105: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/105.jpg)
From the 2134 contigs, a comparison with 11 related genomes makes it possible to reconstruct the complete chromosomal sequence of the ancient genome
2/ detect potential ancestral adjacencies between markers
![Page 106: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/106.jpg)
From the 2134 contigs, a comparison with 11 related genomes makes it possible to reconstruct the complete chromosomal sequence of the ancient genome
3/ order the markers according to the adjacencies found (problems with conflicts and repeats)
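To make the conflict problem in this step concrete, here is a deliberately simple greedy sketch. It is not the method used for the Yersinia pestis reconstruction. It assumes unsigned, non-repeated markers and a hypothetical confidence weight on each candidate adjacency. It keeps adjacencies from the heaviest down, rejecting any that would give a marker a third neighbour or close a cycle, then reads the ordered blocks off the remaining paths.

```python
def order_markers(weighted_adjacencies):
    """Greedy ordering of markers from candidate adjacencies.

    weighted_adjacencies: dict mapping (marker_u, marker_v) -> weight.
    Returns a list of marker paths; markers without any kept adjacency
    are not reported.
    """
    parent, neighbours = {}, {}

    def find(x):                                   # union-find with path halving
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    ranked = sorted(weighted_adjacencies.items(), key=lambda kv: (-kv[1], kv[0]))
    for (u, v), _ in ranked:
        free = len(neighbours.get(u, [])) < 2 and len(neighbours.get(v, [])) < 2
        if free and find(u) != find(v):            # no conflict, no cycle
            parent[find(u)] = find(v)
            neighbours.setdefault(u, []).append(v)
            neighbours.setdefault(v, []).append(u)

    paths, seen = [], set()
    for start in sorted(neighbours):
        if start in seen or len(neighbours[start]) != 1:
            continue                               # start only from path ends
        path, previous, current = [start], None, start
        while True:
            seen.add(current)
            following = [n for n in neighbours[current] if n != previous]
            if not following:
                break
            previous, current = current, following[0]
            path.append(current)
        paths.append(path)
    return paths


# Hypothetical markers: b has three candidate neighbours, so one must go.
candidates = {("a", "b"): 3, ("b", "c"): 3, ("c", "d"): 2, ("b", "d"): 1}
print(order_markers(candidates))
```

Repeated markers, which the slide flags as a difficulty, break the "at most two neighbours" rule used here and need a different treatment.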
![Page 107: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/107.jpg)
A hybrid sequence, 82% sequenced, 18% predicted
4/ fill the gaps between markers with an estimate of the ancestral sequence obtained from alignments
![Page 108: Comparative Genomicsigm.univ-mlv.fr/AlgoB/EPIT2014/08-Tannier.pdf · See also « combinatorics of genome rearrangements », Fertin et al, 2009. 53 Reverse the history (and forget](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f492f68527bfb6f4c3f66ac/html5/thumbnails/108.jpg)
The sequence is distant from all present-day strains in terms of rearrangements
Some new genes appear as a result of these rearrangements