Bernard Dujon, Institut Pasteur, Paris. Bioinformatics and Genome data Analysis. How Eukaryotic Genomes Evolve: the example of Yeasts

Upload: caren-bathsheba-horn

Post on 13-Jan-2016

TRANSCRIPT

Page 1:

Bernard Dujon

Institut Pasteur, Paris

Bioinformatics and Genome data Analysis

How Eukaryotic Genomes Evolve: the example of Yeasts

Page 2:

MECHANISMS OF DUPLICATIONS

Whole genome duplications:
polyploidization (auto- or allo-)
accidental (rare)
highly unstable (no genome is actually duplicated, except for some plants)

Segmental duplications:
various sizes of chromosome segments (several adjacent genes)
intra- or inter-chromosomal
frequent (human)
chimeric genes (domain accretion)
sufficiently stable

Tandem gene repeat formation:
arrays of paralogs
unstable (looping out)
rapid divergence

Dispersed (single) gene duplications (retrogenes):
transposon-mediated
chimeric genes (domain accretion)

Page 3:

WHOLE GENOME DUPLICATIONS

Page 4:

The genome of Saccharomyces cerevisiae, 1997

Page 5:

Duplicated chromosomal blocks in S. cerevisiae

[Figure: gene-order maps of duplicated blocks, with ORF labels running from YDR200c to YDR222w (chromosome 4), YLR225c to YLR238w and YLR266c to YLR300w (chromosome 12), YOR162c to YOR192c and YOR204w to YOR237w (chromosome 15), and YPL145c to YPL119c (chromosome 16). Two positions are labelled "relic".]

Page 6:

Wolfe and Shields, Nature (1997) 387: 708-713

DUPLICATED BLOCKS IN THE GENOME OF S. cerevisiae

Seoighe and Wolfe, Gene (1999) 238: 253-261

Page 7:

DUPLICATED BLOCKS IN THE GENOME OF S. cerevisiae

Total number of genes in identified blocks: 3391 (58 % of genome)

Total number of paired genes (paralogs) in blocks: 898 (449 pairs), 26 %

Total number of unpaired genes in blocks: 2493, 74 %

60 to 80 ancient duplicated blocks can be identified in the entire yeast genome

WHOLE GENOME DUPLICATION FOLLOWED BY MASSIVE (ca. 92%) LOSS OF PARALOGOUS COPIES

Nb of "unique" genes prior to duplication: 5800 - 449 = 5351
Nb of paralogous copies lost: 5351 - 449 = 4902
Fraction of paralogous copies lost: 4902 / 5351 = 91.6 %
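The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch of the slide's own calculation: the function name `wgd_loss` is hypothetical, and the inputs (5800 genes in total, 449 retained pairs, 3391 genes in blocks) are the figures quoted above.

```python
def wgd_loss(total_genes, retained_pairs):
    """Reproduce the slide's estimate of paralog loss after a whole genome duplication.

    Each retained pair counts as one ancestral gene, so the ancestral
    ("unique") gene count is total_genes - retained_pairs.
    """
    ancestral = total_genes - retained_pairs   # genes before duplication
    lost = ancestral - retained_pairs          # duplicated copies that disappeared
    return ancestral, lost, lost / ancestral


ancestral, lost, fraction = wgd_loss(total_genes=5800, retained_pairs=449)
print(ancestral, lost, f"{100 * fraction:.1f} %")

# Share of paired and unpaired genes inside the identified blocks
genes_in_blocks = 3391
paired = 2 * 449
print(f"paired: {100 * paired / genes_in_blocks:.0f} %",
      f"unpaired: {100 * (genes_in_blocks - paired) / genes_in_blocks:.0f} %")
```

The estimate assumes, as the slide does, that every gene was duplicated once and that each surviving pair descends from a single ancestral gene.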

Page 8:

Charting genome evolution

[Figure: diagram relating S. cerevisiae, C. glabrata, K. lactis, D. hansenii and Y. lipolytica, with numbered steps 1 to 4. Labels on the diagram: accidental genome duplication; extensive loss of duplicated genes; map dispersion; genome size control; MAT cassettes and centromeres; tandem repeat formation mechanism; reductive evolution; segmental duplication mechanism (four occurrences). Overall genome redundancy, in the order extracted from the slide: 44 %, 35 %, 32 %, 51 %, 42 %.]

Page 9:

SIGNATURES OF A WHOLE GENOME DUPLICATION

COMPARISON OF MAPS BETWEEN A DUPLICATED SPECIES AND NON DUPLICATED SPECIES

e.g. S. cerevisiae and K. lactis Dujon et al. Nature (2004) 430: 35-44

S. cerevisiae and K. waltii Kellis et al. Nature (2004) 428: 617-624

S. cerevisiae and Ashbya gossypii Dietrich et al. Science (2004) 304: 304-307

Tetraodon nigroviridis and Homo sapiens Jaillon et al. Nature (2004) 431: 946-957

one to two relationship between intermingled segments

COMPARISON BETWEEN TWO DUPLICATED SPECIES ORIGINATING FROM THE SAME EVENT

e.g. S. cerevisiae and C. glabrata Dujon et al. Nature (2004) 430: 35-44

coincidence between duplicated blocks

Page 10:

Ancient duplicated blocks in each genome

                                 S. cerevisiae   C. glabrata
Total nb of duplicated blocks
  internal to chromosomes             56             20
  subtelomeric                        21              0
Block size (kb)
  mean                                42             27
  max.                               243             89
Nb of gene pairs / block
  mean                               5.8            3.8
  max.                                15              6

Application of ADHoRe (Vandepoele et al. 2002) (r² cutoff = 0.8, max gap = 35, min pair = 3)

Coincidence of blocks: 38 / 18 / 2
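To give a feel for what the "max gap" and "min pair" parameters control, here is a deliberately simplified sketch of block detection. It is not the ADHoRe algorithm (which also scores collinearity with the r² cutoff); it is a greedy chaining of paralog pairs, and the function name `find_blocks` and the input format (pairs of gene-order indices on two chromosomal regions) are assumptions made for illustration.

```python
def find_blocks(pairs, max_gap=35, min_pairs=3):
    """Greedy chaining of paralog pairs into candidate duplicated blocks.

    pairs: iterable of (index_a, index_b) gene-order positions of the two
    copies of each paralog pair. A pair extends the current chain when both
    coordinates lie within max_gap genes of the previous pair; chains with
    fewer than min_pairs pairs are discarded.
    """
    blocks, current = [], []
    for a, b in sorted(pairs):
        if current and a - current[-1][0] <= max_gap and abs(b - current[-1][1]) <= max_gap:
            current.append((a, b))
        else:
            if len(current) >= min_pairs:
                blocks.append(current)
            current = [(a, b)]
    if len(current) >= min_pairs:
        blocks.append(current)
    return blocks


# Hypothetical data: three collinear pairs plus two isolated pairs
example = [(1, 101), (3, 104), (6, 108), (200, 50), (400, 10)]
print(find_blocks(example))
```

A greedy pass like this breaks down when blocks interleave or are inverted, which is one reason dedicated tools use more elaborate scoring.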

Page 11:

Comparison between species 1 and species 2

[Figure: schematic of a genome duplication. Labels: hypothetical ancestor; chromosome of interest (X) carrying genes 1 to 20; other chromosomes; genome duplication; unstable intermediate; chromosome A; chromosome B; Species 1; Species 2.]

Page 12:

Kellis et al. Nature (2004) 428: 617-624

MAP OF K. waltii GENOME RELATIVE TO S. cerevisiae

Page 13:

RECONSTRUCTION OF S. cerevisiae DUPLICATED BLOCKS RELATIVE TO K. waltii

Kellis et al. Nature (2004) 428: 617-624

Page 14:

Jaillon et al. Nature (2004) 431: 946-957

DISTRIBUTION OF IDENTITY BETWEEN PARALOGOUS GENE PAIRS IN FISHES

Tetraodon nigroviridis Takifugu rubripes

ancient duplicated pairs ancient duplicated pairs

Page 15:

Jaillon et al. Nature (2004) 431: 946-957

MAP OF ANCIENT DUPLICATED PAIRS ON ENTIRE GENOME OF Tetraodon nigroviridis

Page 16:

[Figure: from the ancestral karyotype of bony vertebrates (12 chromosomes, A to L) to the karyotypes of Tetraodon (chromosomes 1 to 21) and Human (chromosomes 1 to 22 and X). Events labelled on the figure: duplication; fusions; amplification of transposable elements; translocations and fusions.]

Jaillon et al. Nature (2004) 431: 946-957

Page 17:

SEGMENTAL DUPLICATIONS

Page 18:

Segmental duplications in mammalian genomes:

segments of sequences of ≥ 90 % identity (recent) and ≥ 1 kb in length (weak criterion) or ≥ 5kb in length (more stringent)

interchromosomal

intrachromosomal

Example: rat genome

unassembled sequence reads

Total:
2.9 % of genome (rat)
1-2 % of genome (mouse)
5-6 % of genome (human)

Gibbs et al. Nature (2004) 428: 493-521
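The definition used on this slide (at least 90 % identity, and at least 1 kb or 5 kb in length) translates directly into a filter over pairwise alignments. The sketch below assumes a hypothetical `Alignment` record; the field names and the example values are illustrative and do not come from the cited papers.

```python
from dataclasses import dataclass


@dataclass
class Alignment:
    chrom_a: str       # chromosome of the first copy
    chrom_b: str       # chromosome of the second copy
    length_bp: int     # aligned length in base pairs
    identity: float    # fraction of identical positions, 0..1


def is_segmental_duplication(aln, min_identity=0.90, min_length_bp=1000):
    """Weak criterion by default; pass min_length_bp=5000 for the stringent one."""
    return aln.identity >= min_identity and aln.length_bp >= min_length_bp


def classify(aln):
    """Label a duplication as intra- or interchromosomal."""
    return "intrachromosomal" if aln.chrom_a == aln.chrom_b else "interchromosomal"


# Hypothetical alignments
hits = [
    Alignment("chr16", "chr16", 12000, 0.97),
    Alignment("chr16", "chr2", 1500, 0.92),
    Alignment("chr16", "chr5", 8000, 0.85),
]
for hit in hits:
    if is_segmental_duplication(hit):
        print(classify(hit), hit.length_bp, hit.identity)
```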

Page 19:

DETAILED MAP OF SEGMENTAL DUPLICATIONS ON HUMAN CHROMOSOME 16

Martin et al. Nature (2004) 432: 988-994

interchromosomal

intrachromosomal

centromere

also detected by whole genome shotgun

Page 20:

Martin et al. Nature (2004) 432: 988-994

DISTRIBUTION OF LENGTHS AND IDENTITIES OF SEGMENTAL DUPLICATIONS

Human chromosome 16 Rat genome

Tuzun et al. Genome Res. (2004) 14: 493-506

Page 21:

Charting genome evolution

[Figure: diagram relating S. cerevisiae, C. glabrata, K. lactis, D. hansenii and Y. lipolytica, with numbered steps 1 to 4. Labels on the diagram: accidental genome duplication; extensive loss of duplicated genes; map dispersion; genome size control; MAT cassettes and centromeres; tandem repeat formation mechanism; reductive evolution; segmental duplication mechanism (four occurrences). Overall genome redundancy, in the order extracted from the slide: 44 %, 35 %, 32 %, 51 %, 42 %.]

Page 22:

Ancient duplicated blocks in each genome

                                 K. lactis   D. hansenii   Y. lipolytica
Total nb of duplicated blocks
  internal to chromosomes            8            5              2
  subtelomeric                       1           10              0
Block size (kb)
  mean                               9           19             90
  max.                              25           59            148
Nb of gene pairs / block
  mean                             4.3          3.7            4.0
  max.                              11            6              4

Sporadic segmental duplications ?

Page 23:

Spontaneous segmental duplications in the yeast genome: experimental design

Wild type (two copies of ribosomal protein gene): RPL20A, RPL20B

Deletion mutant (RPL20B remaining): slow growth (gene dosage effect)

Spontaneous normal growth mutants: ?

mutation rate ≈ 10^-9 / generation / cell

R. KOSZUL, S. CABURET, B. DUJON, G. FISCHER Eucaryotic genome evolution through the spontaneous duplication of large chromosomal segments EMBO J. (2004) 23, 234-243

Page 24:
Page 25:
Page 26:
Page 27:
Page 28:
Page 29:

Formation of chimeric ORFs at junctions

intrachromosomal segmental duplications

YKF1072  in-frame fusion between YOR329c (SCD5) and YOR267c: chimeric protein
YKF1057  in-frame fusion between YOR372c (NDD1) and YOR267c: chimeric protein
YKF1223  in-frame fusion between YOR336w (KRE5) and YOR227w: chimeric protein

YKF1022  out-of-frame fusion between YOR328w and YOR272w: truncated protein
YKF1159  antiparallel fusion between YOR357c and YOR269w: truncated protein
YKF1050  fusion between YOR328w and intergenic region: truncated protein
YKF1080  fusion between YOR370c and intergenic region: truncated protein
YKF1124  fusion between intergenic region and YOR220w: truncated protein

YKF1175  fusion between LTRs
YKF1095  fusion between intergenic regions
YKF1016  fusion between intergenic regions

interchromosomal segmental duplications

YKF1114  out-of-frame fusion between YJR090c and YOR267c: truncated protein

YKF1085  fusion between LTRs
YKF1246  fusion between LTRs
YKF1122  fusion between LTRs
YKF1027  fusion between intergenic regions

Page 30:

1. Original strain: wild-type fitness, initial genetic complexity

2. Single gene deletion mutant: reduced fitness, initial genetic complexity

3. Offspring of mutant: restored fitness, outcompetes its parent, increased genetic complexity (up to 300 genes simultaneously duplicated as a single segment)

spontaneous events: 10^-9 / generation / cell
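A rate of 10^-9 per cell per generation sounds negligible until it is multiplied by population size. The sketch below works that out; the function names are hypothetical, the culture size of 10^9 cells is an illustrative assumption, and treating events as independent and rare (a Poisson model) is likewise an assumption rather than something stated on the slide.

```python
import math


def expected_events(rate, cells, generations=1):
    """Expected number of events for a per-cell, per-generation rate."""
    return rate * cells * generations


def prob_at_least_one(rate, cells, generations=1):
    """Probability of at least one event, assuming Poisson-distributed events."""
    return 1.0 - math.exp(-expected_events(rate, cells, generations))


# Illustrative assumption: a culture of 1e9 cells followed for one generation
print(expected_events(1e-9, 1e9))
print(round(prob_at_least_one(1e-9, 1e9), 3))
```

Under these assumptions, a single dense culture is enough to expect about one duplication event per generation, which is why selection for restored growth can recover them.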

Page 31:

TANDEM GENE REPEAT FORMATION

Page 32:

Well known cases of gene tandems: α and β globin genes

[Figure: globin gene clusters on human chromosome 11 (including the Gγ and Aγ genes) and on human chromosome 16, with pseudogenes marked; approximate scale 100 kb.]

Ancestor of chordates: 1 gene
Ancestor of vertebrates: 1 globin gene + 1 myoglobin gene
Duplication and divergence: 1 α gene + 1 β gene (e.g. Xenopus)
Gene number expansion (mammals, birds)

[Figure: % of total globin (10 to 50) as a function of age in weeks after fertilization, with birth marked and a postnatal section; axis ticks run from 6 to 48 and from 6 to 36 weeks.]

Page 33:

[Figure: homeotic gene clusters. Drosophila genome: ANT-C (lab, pb, Dfd, Scr, Antp) and BX-C (Ubx, AbdA, AbdB), shown against the Drosophila larva (thorax, abdomen). Mouse or human genomes: HoxA, HoxB, HoxC and HoxD clusters with numbered genes, shown against the mouse embryo (head).]

Page 34:

Tandem repeat arrays in Yeasts

D. hansenii chromosome K

Similar to S. cerevisiae YHR179w OYE2 NADPH dehydrogenase (old yellow enzyme), isoform 1

pseudogenes

Amino-acid sequence identity between copies: from 82 % to 95 %

                 total nb of     direct         total nb of
                 tandem pairs    orientation    arrays
S. cerevisiae         61            79 %            50
C. glabrata           47            83 %            32
K. lactis             36            72 %            33
D. hansenii          329            92 %           247
Y. lipolytica         54            72 %            48
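The three columns of this table can be illustrated with a small sketch that scans genes in chromosomal order and reports neighbouring members of the same family. The exact definition of a tandem pair used for the table is not given on the slide, so strict adjacency is an assumption here, as are the function name and the toy gene list.

```python
def tandem_pairs(genes):
    """Find adjacent paralogs along a chromosome.

    genes: list of (name, family, strand) tuples in chromosomal order, with
    strand given as "+" or "-". Two neighbours of the same family form a
    tandem pair, in direct orientation when they lie on the same strand.
    """
    pairs = []
    for left, right in zip(genes, genes[1:]):
        if left[1] is not None and left[1] == right[1]:
            orientation = "direct" if left[2] == right[2] else "inverted"
            pairs.append((left[0], right[0], orientation))
    return pairs


# Hypothetical chromosome segment
segment = [
    ("g1", "famA", "+"),
    ("g2", "famA", "+"),
    ("g3", "famB", "-"),
    ("g4", "famA", "-"),
    ("g5", "famA", "+"),
]
found = tandem_pairs(segment)
direct = sum(1 for p in found if p[2] == "direct")
print(found, f"{100 * direct / len(found):.0f} % direct")
```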

Page 35:

TANDEM REPEATS IN C. glabrata

Homologous to YIL014w (MNT3): alpha-1,3-mannosyltransferases responsible for adding the terminal mannose residues of O-linked oligosaccharides

C. glabrata genes: CAGL0C03828g CAGL0C03850g CAGL0C03872g CAGL0C03894g CAGL0C03916g CAGL0C03938g CAGL0C03960g CAGL0C03982g CAGL0C04004g CAGL0C04026g CAGL0C04048g CAGL0C04092g CAGL0C04114g CAGL0C04136g

S. cerevisiae genes shown (similar to SACE): YIL009ca YIL010w YIL011w YDR007w YNL031c YNL030w

Homologous to YLR120c, YLR121c or YDR144c: aspartic proteases

C. glabrata genes: CAGL0E01683g CAGL0E01705g CAGL0E01749g CAGL0E01771g CAGL0E01793g CAGL0E01815g CAGL0E01837g CAGL0E01859g CAGL0E01881g CAGL0E01903g CAGL0E01925g CAGL0E01727g

S. cerevisiae genes shown: YOL128c YOL126c YOL125w YOL124c

A. Thierry and B. Dujon, unpublished

Page 36:

CAGL0C03894g CAGL0C03916g CAGL0C03938g CAGL0C03960g CAGL0C03982g CAGL0C04004g CAGL0C04026g CAGL0C04048g

Identity values shown on the figure: 51% 75% 81% 76% 60% 59% 59%; 78% 50% 63%; 56% 64%

Nucleotide alignment (positions 1 onward against positions 4302 onward):

   1 ACTCTTATACACCTAGTACCCGATCGCTTCTGTCAACGTCCCCGCTCGGTTACTGTGCATTCCTAACCCCCACAGATACAATGACTACAGCAATACTTCCACAACCACTTATCTCACTTCAGAAA 125
     |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
4302 ACTCTTATACACCTAGTACCCGATCGCTTCTGTCAACGTCCCCGCTCGGTTACTGTGCATTCCTAACCCCCACAGATACAATGACTACAGCAATACTTCCACAACCACTTATCTCACTTCAGAAA 4426

 126 TGCTCTCATAACACTTTCCCGCCAGCAATCTCTCACTACCACAACACCCTTCCCATTGTTCCCTCGAGACTCACGCTGGCAGATCGCTTTCGGTAAATCCTTTGTAAACTAACTTTTTCACCAGG 250
     |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
4427 TGCTCTCATAACACTTTCCCGCCAGCAATCTCTCACTACCACAACACCCTTCCCATTGTTCCCTCGAGACTCACGCTGGCAGATCGCTTTCGGTAAATCCTTTGTAAACTAACTTTTTCACCAGG 4551

 251 GTCTGCGCTGTTTCTCTGGCAACCTCGAGGACTCCCGTCGACTGGTGATGTGCGATAAAGCTGCCC
     |||||||||||||||||||||||||||||||||||||||||||||||||||| |||||||||| ||
4552 GTCTGCGCTGTTTCTCTGGCAACCTCGAGGACTCCCGTCGACTGGTGATGTGAGATAAAGCTGTCC

TANDEM REPEATS IN C. glabrata

A. Thierry and B. Dujon, unpublished

Page 37:

EVOLUTION WITHIN THE ORTHOLOGOUS ALCOHOL DEHYDROGENASE GENE CLUSTER

pseudogene

pseudogene

Expansion in the human lineage

Expansion in the chicken lineage

Hillier et al. Nature (2004) 432: 695-716

Page 38:

DISPERSED (SINGLE) GENE DUPLICATIONS

(RETROGENES)

Page 39:

FORMATION OF RETROGENES AND PROCESSED PSEUDOGENES

[Figure: a gene with an intron (5' exon 1, intron, exon 2 3', with P and T marks) is transcribed; the transcript undergoes splicing and polyadenylation to give a messenger RNA ending in AAAAAAA 3'; the action of reverse transcriptase produces complementary DNA (TTTTTTTT 5'); integration yields a copy carrying a poly A:T tail.]
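The steps in this figure can be mimicked on toy sequences to show why a retrocopy lacks the intron and ends in a poly-A tract. This is a string-level sketch only: the function names and the example sequences are hypothetical, and real transcripts also differ from the gene at their 5' and 3' ends.

```python
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.translate(DNA_COMPLEMENT)[::-1]


def retrocopy(exon1, intron, exon2, poly_a=7):
    """Follow a toy gene through transcription, splicing, polyadenylation,
    reverse transcription and second-strand synthesis."""
    pre_mrna = (exon1 + intron + exon2).replace("T", "U")                # transcription
    mrna = pre_mrna[:len(exon1)] + pre_mrna[len(exon1) + len(intron):]   # splicing
    mrna += "A" * poly_a                                                 # polyadenylation
    first_strand = reverse_complement(mrna.replace("U", "T"))            # cDNA, starts with T's
    return reverse_complement(first_strand)                              # integrated sense strand


# Hypothetical toy gene
copy = retrocopy("ATGGC", "GTAAGTAG", "TTGA")
print(copy)
```

The returned sequence is exon 1 joined directly to exon 2 and followed by the A tract, which is the signature used to recognise retrogenes and processed pseudogenes.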

Page 40:

ORIGINAL DISCOVERY OF RETROPOSONS

Yeast genome Ty elements (transposons of yeast)

[Figure: structure of a Ty element, 6 kb: LTR, gag, pol (protease, integrase, reverse transcriptase), LTR; +1 frameshift.]

variable position in genome between strains, mutagenic

Boeke, J. D. et al. 1985. Ty elements transpose through an RNA intermediate. Cell 40, 491-500

experiment:
galactose induction
selection of [his+] mutants (reactivation of promoter-less gene)
molecular analysis of integrated transposon

[Figure: Ty element carrying an artificial promoter, an artificial intron and a molecular tag; steps shown: transcription, splicing, reverse transcription, integration; target: P - HIS3.]

Presence of molecular tag
Loss of intron

Page 41:

Single gene duplications in S. cerevisiae

Anecdotal observations:

ACP1  Hansche et al., (1978) Genetics 88, 673-687
HIS4  Greer and Fink, (1979) PNAS 76, 4006-4010
ADH2  Paquin et al., (1992) Genetics 130, 263-271
HXTx  Brown et al., (1998) Mol. Biol. Evol. 15, 931-942

10^-10 to 10^-12 duplication / cell / generation

recent experimental demonstration

Page 42:

Ty-mediated gene duplication

GATase: glutamine amidotransferase
CPSase: carbamoylphosphate synthetase
DHOase: dihydro-orotase
ATCase: aspartate transcarbamylase

Haploid strain, select [Ura+] prototroph, ca. 10^-10 event / cell / generation

1- Insertion of Ty1 upstream of ATCase Roelants et al., (1997) Mol. Gen. Genet. 246, 767-773

2- Deletion of the GATase, CPSase mutated region Welcker et al., (2000) Genetics 156, 549-557

(RAD52-dependent)

3- Duplication of the ATCase coding sequence elsewhere in the genome Bach et al., (1995) Yeast 11, 169-177

Schacherer et al. (2004) Genome Res. 14, 1291-1297

Page 43:

Ty-mediated gene duplication

Spontaneous events

Ty overexpression
Interchromosomal events: 16
Intrachromosomal events: 4

Interchromosomal events: 3
Intrachromosomal events: 1

Schacherer et al. (2004) Genome Res. 14, 1291-1297

Page 44:

Ty-mediated gene duplication

Schacherer et al. (2004) Genome Res. 14, 1291-1297

polyA tails
microhomology regions between TyA and URA2

Accidental incorporation of URA2 mRNA in Ty-VLP
Reverse transcription of URA2 mRNA
Template switch onto Ty-RNA
Integration of cDNA

Page 45:

[Figure: diagram with two axes, one running from "Decreases polymorphism" to "Increases polymorphism" and the other from "Decreases redundancy" to "Increases redundancy", on which the following are placed: duplications; gene loss; sequence divergence; genetic drift, selection.]

Page 46:

GENE LOSS

Page 47:

GENE RELICS

[Figure: comparison of S. uvarum and S. cerevisiae. S. uvarum labels: SuYBR60c, SuYBR061c, SuYDR037w, SuYDR038c, junctions IVtII and IItIV. S. cerevisiae labels: chromosome II with YBR060c, YBR061c and the relic of the YDR037w paralog; chromosome IV with YDR036c, YDR037w (KRS1) and YDR038c. Sequence comparison plot of the YBR060c to YBR061c region against YDR037w (KRS1), scale 1000 to 5000, stringency 15/23.]

Page 48:

relic_ydr037w   1:GTGCCACAGCAAGTTAATGTCACGGCAGCTAGTGACGCTATTGCTAGTTTACACCTAGAT: 60
YDR037w         1:ATGTCTCAACAAGATAATGTCAAAGCCGCCGCTGAAGGTGTTGCTAACCTACATCTCGAC: 60

relic_ydr037w  61:GAGGCCACTGGAGAAATGGTCTCTAAGACAGAGTTGAAGAAGCGTATTAAGGGAATACAA: 120
YDR037w        61:GAAGCTACCGGGGAAATGGTCTCCAAGTCTGAATTGAAGAAGCGTATCAAGCAAAGACAA: 120

relic_ydr037w 121:ATTGAGGCCAAAAAG.CTGTCAAAAAGACTCTTGCGAAACCAAAACCAGCTTC....GAA: 175
YDR037w       121:GTCGAAGCTAAAAAGGCCGCCAAAAAGGCTGCCGCTCAACCAAAACCGGCTTCCAAAAAA: 180

relic_ydr037w 176:AAGACTAATTTCCTGGCCGGTTTATAGTCATCTCAATACT........AGATCACAGCAA: 227
YDR037w       181:AAAACAGATTTGTTCGCTGACCTGGATCCATCGCAATATTTCGAAACAAGATCTCGCCAA: 240

relic_ydr037w 228:ATCCAATTAA.GAAACAGACTCTTGATATAAATTTTTATCCATACAAGTTCCGATTATAT: 286
YDR037w       241:ATTCAAGAATTGAGAAAGACTCACGAACCAAATCCATACCCACACAAGTTTCACGTTTCT: 300

relic_ydr037w 287:ATATTCAATCCTGAATTTTTGGCCAAGTATGCCCATTC..AAAAAGGCGAAAATTTCCCT: 344
YDR037w       301:ATATCCAATCCTGAGTTCTTGGCCAAATATGCGCATTTGAAAAAAGGTGAAACCTTACCT: 360

relic_ydr037w 345:TAAGAGAAGTTTCACATTGCTAGGAGAGTTCATGCAGAAAGAGAATCAGCTTAAAAATTG: 404
YDR037w       361:GAAGAGAAGGTTTCAATTGCTGGTAGAATTCATGCCAAAAGAGAATCTGGCTCCAAATTG: 420

relic_ydr037w 405:AAATTCTACGTTCT...CAATGGTGGTGTTGAGCTCTAAATTATTTTACAATTTCAGGAT: 461
YDR037w       421:AAATTCTATGTTCTTCACGGTGATGGTGTTGAAGTTCAATTGATGTCCCAATTGCAGGAC: 480

relic_ydr037w 462:TATTACGACGAGAACCCATA..AAAAGGAGCATGACCTTT.AAGGAGGAGTAATAT....: 514
YDR037w       481:TACTGCGACCCAGACTCTTACGAAAAGGATCACGACCTTTTGAAAAGGGGTGATATCGTT: 540

relic_ydr037w 514:.......................ATATCCACCAAAGAAGACCGGCGGAGATGAGATATAT: 551
YDR037w       541:GGTGTCGAGGGTTACGTCGGAAGAACTCAACCAAAGAAAGGTGGTGAAGGTGAAGT.TTC: 599

relic_ydr037w 552:TTTTTTCGTTAACAGAGTGCAATT...GACAACTTGTTTGCAC...TTGCCTGCTAACTG: 605
YDR037w       600:CGTCTTCGTTAGCAGAGTGCAATTATTGACACCATGTTTGCACATGTTACCTGCCGACCA: 659

relic_ydr037w 606:TTTTGGTTTCAAAGATCAAGAAAATAGATA..............................: 635
YDR037w       660:CTTTGGTTTCAAAGACCAGGAAACCAGATACAGAAAGCGTTATTTGGATTTGATCATGAA: 719

relic_ydr037w 635:...........GAACCCGTTTTATTATTCAAT.TGACATCGCCCGTTATATCAGACGATT: 683
YDR037w       720:CAAAGACGCCAGAAACCGTTTTATTACCCGTTCTGAAATTATCCGTTACATCAGAAGATT: 779

relic_ydr037w 684:TTTGGATCAAAAAAAGTTTATTGGAGCAGAAGCAATTCTGAAATGAAGGTCCTAATATGA: 743
YDR037w       780:TTTGGACCAAAGAAAGTTTATTGAAGTAGAAAC..TCCAATGATGAACGTTATTGC.TGG: 836

relic_ydr037w 744:CCCCAATATGAC.ACATAATTCGGAATCTGCCACTTGTGAGTTTTATCAAGCCTATGCGG: 802
YDR037w       837:TGGTGCTACCGCTAAGCCATTTATTACCCACCA.TAATGACCTTGAT.ATGGACATGTAC: 894

relic_ydr037w 803:ATGTTTGTGACTAGTTGGATATGACTGAATTAATACTTTCAGAAATGGACAAGGAGATAT: 862
YDR037w       895:ATGAGAATTGCTCCAGAATTGTTCTTGAAACAAT.TGGTTGTCGGTGGTTTGGATCGTGT: 953

Average sequence identity between relic and gene = 62 % (1127 / 1818)

i.e. one copy of the two ancestral genes has undergone several hundred events of:
nucleotide substitutions,
single nucleotide deletions or insertions,
microdeletions.
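The identity figure quoted above is a simple ratio of matching positions to compared positions. The sketch below shows one way to compute it from two aligned strings in which "." marks a gap, as in the alignment on this page; the function name is hypothetical, and counting gap columns in the denominator is an assumption, since the slide does not say how its 1818 positions were counted.

```python
def percent_identity(seq_a, seq_b, gap="."):
    """Percentage of alignment columns where both sequences carry the same base."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b and a != gap)
    return 100.0 * matches / len(seq_a)


# Toy aligned pair: 4 matching bases over 6 columns
print(round(percent_identity("ACGT.A", "ACGA.A"), 1))

# The slide's own ratio
print(round(100 * 1127 / 1818))
```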

Page 49:

Distribution of gene relics on the S. cerevisiae genomic map

[Figure: map of the 16 S. cerevisiae chromosomes (1 to 16); legend: relics, functional paralogs; (106 identified).]

Page 50:

Species-specific gene loss

Lost from C. glabrata

YBR018c  Carbohydrate metabolism
YBR019c  Carbohydrate metabolism
YBR020w  Carbohydrate metabolism
YDR009w  Carbohydrate metabolism
YIL162w  Carbohydrate metabolism
YNR071c  Carbohydrate metabolism
YMR096w  Cell cycle and DNA processing
YFL059w  Cell rescue, defense and virulence
YLL057c  Cell rescue, defense and virulence
YNL333w  Cell rescue, defense and virulence
YBR296c  Homeostasis of cations
YLR189c  Lipid, fatty-acid and isoprenoid metabolism
YGR286c  Metabolism of vitamins, cofactors, and prosthetic groups
YIR027c  Nitrogen and sulfur metabolism
YIR029w  Nitrogen and sulfur metabolism
YIR032c  Nitrogen and sulfur metabolism
YPR194c  Oligopeptide transporter
YJL212c  Pheromone response, mating-type determination, sex-specific proteins
YAR071w  Phosphate metabolism
YBR092c  Phosphate metabolism
YBR093c  Phosphate metabolism
YHR215w  Phosphate metabolism
YDR104c  Sporulation and germination
YOR313c  Sporulation and germination
YMR283c  tRNA modification
YJL100w  Unclassified proteins
YMR321c  Unclassified proteins
YOR129c  Unclassified proteins
YPL273w  Unclassified proteins

Lost from Y. lipolytica

YJL094c  Cation transporters
YBR238c  Cell cycle
YDR082w  Cell cycle
YBR131w  Cell rescue, defense and virulence
YIL150c  DNA processing
YDL200c  DNA recombination and DNA repair
YPL057c  Fungal cell differentiation
YCR020c  mRNA transcription
YLR067c  Protein synthesis
YMR257c  Protein synthesis
YNL284c  Protein synthesis
YML111w  Proteolytic degradation
YMR275c  Proteolytic degradation
YBL014c  rRNA transcription
YML043c  rRNA transcription
YER132c  Sporulation and germination
YGL197w  Sporulation and germination
YHR184w  Sporulation and germination
YLR139c  Transcription
YBR163w  Unclassified proteins
YDR131c  Unclassified proteins
YDR367w  Unclassified proteins
YEL001c  Unclassified proteins
YER004w  Unclassified proteins
YER077c  Unclassified proteins
YFR013w  Unclassified proteins
YGL107c  Unclassified proteins
YGR134w  Unclassified proteins
YHR029c  Unclassified proteins
YJL149w  Unclassified proteins
YJR003c  Unclassified proteins
YJR003c  Unclassified proteins
YJR111c  Unclassified proteins
YLL033w  Unclassified proteins
YLR320w  Unclassified proteins
YNR068c  Unclassified proteins
YNR069c  Unclassified proteins
YOL017w  Unclassified proteins
YOR060c  Unclassified proteins
YPL005w  Unclassified proteins

Lost from D. hansenii

YFR018c  Amino acid metabolism
YEL023c  Cell growth and morphogenesis
YCR014c  DNA recombination and DNA repair
YJL132w  Lipid, fatty-acid and isoprenoid metabolism
YBR227c  Proteolytic degradation
YMR265c  Unclassified proteins
YNL187w  Unclassified proteins
YPR002w  Unclassified proteins

Lost from K. lactis

YGL156w  Carbohydrate metabolism
YML005w  Unclassified proteins
YPL207w  Unclassified proteins
YPR147c  Unclassified proteins

Criterion: a protein family represented in all yeast species but one
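The criterion stated above amounts to a set comparison: a family counts as a species-specific loss when exactly one of the five species lacks it. The following sketch assumes a hypothetical mapping from family identifiers to the set of species in which the family is found; the family names and presence data in the example are invented for illustration.

```python
SPECIES = ["S. cerevisiae", "C. glabrata", "K. lactis", "D. hansenii", "Y. lipolytica"]


def species_specific_losses(families, species=SPECIES):
    """Group families by the single species from which they are absent.

    families: dict mapping a family identifier to the collection of species
    in which at least one member was found.
    """
    all_species = set(species)
    losses = {}
    for family, present in families.items():
        missing = all_species - set(present)
        if len(missing) == 1:
            losses.setdefault(next(iter(missing)), []).append(family)
    return losses


# Hypothetical presence/absence data
families = {
    "fam1": SPECIES,                                                   # present everywhere
    "fam2": ["S. cerevisiae", "K. lactis", "D. hansenii", "Y. lipolytica"],
    "fam3": ["S. cerevisiae", "C. glabrata"],                          # missing from three species
}
print(species_specific_losses(families))
```

Absence from an annotation is not proof of loss, so in practice each candidate would still need checking against the genome sequence itself.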

reductive evolution

Page 51:

MORE TO THE EVOLUTIONARY DYNAMICS:

HGT and NUMTs

Page 52

Possible cases of horizontal gene transfer

Species-specific genes (in yeasts) with homologs in Bacteria

Species        Gene name      Homolog Acc. N°  Species (homolog)        Function
K. lactis      KLLA0C09218g   Q8PPU9           Xanthomonas axonopodis   Conserved glyoxalase domain protein
K. lactis      KLLA0A02431g   Q8EG95           Shewanella oneidensis    Hypothetical protein
K. lactis      KLLA0A12089g   P21340           Bacillus subtilis        Negative regulatory protein, acetyl transferase domain
K. lactis      KLLA0B00451g   Q9JYX0           Neisseria meningitidis   Alcohol dehydrogenase
               KLLA0D19949g
D. hansenii    DEHA0B15763g   Q8ZIB2           Bacillus cereus          Protein ydhR precursor
Y. lipolytica  YALI0F04290g   Q987V4           Rhizobium loti           D-amino peptidase
Y. lipolytica  YALI0F05654g   Q8EAT4           Shewanella oneidensis    Conserved hypothetical protein
Y. lipolytica  YALI0F31867g   Q9I5L7           Pseudomonas aeruginosa   Conserved hypothetical protein
Y. lipolytica  YALI0E33011g   P45900           Bacillus subtilis        Conserved hypothetical protein, adenylate kinase family
Y. lipolytica  YALI0D21582g   Q87HL8           Pseudomonas putida       Yee/YedE family protein
               YALI0F01408g
Y. lipolytica  YALI0A15400g   Q92QU2           Rhizobium meliloti       Putative acetyltransferase
               YALI0F11605g

Summary of HGT

C. glabrata    none
K. lactis      5 genes (including a pair of paralogs)
D. hansenii    1 gene
Y. lipolytica  8 genes (including two pairs of paralogs)
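
The per-species totals in the summary can be re-derived from the gene identifiers listed on this slide. The sketch below assumes that the four-letter prefix of each identifier (KLLA, DEHA, YALI) designates the species, as in the column headings used elsewhere in this presentation; the function name is illustrative only.

```python
from collections import Counter

# Assumed mapping from identifier prefix to species.
PREFIX_TO_SPECIES = {
    "KLLA": "K. lactis",
    "DEHA": "D. hansenii",
    "YALI": "Y. lipolytica",
}

# Gene identifiers copied from the slide.
hgt_candidates = [
    "KLLA0C09218g", "KLLA0A02431g", "KLLA0A12089g", "KLLA0B00451g", "KLLA0D19949g",
    "DEHA0B15763g",
    "YALI0F04290g", "YALI0F05654g", "YALI0F31867g", "YALI0E33011g",
    "YALI0D21582g", "YALI0F01408g", "YALI0A15400g", "YALI0F11605g",
]


def count_by_species(genes):
    """Count candidate genes per species using the first four letters of each identifier."""
    return Counter(PREFIX_TO_SPECIES[gene[:4]] for gene in genes)


counts = count_by_species(hgt_candidates)
for species, n in counts.items():
    print(f"{species}: {n} gene(s)")
```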

Page 53


NUMTS in the genome of S. cerevisiae

Ricchetti et al., Nature (1999) 402: 96-100

Page 54

Transfer of mitochondrial DNA to the nucleus: experimental design

[Figure: experimental design]
Expression plasmid (pPEX7): I-Sce I gene under Pgal
Artificial cassette: I-Sce I site, URA3, TSH1 > < TSH2, I-Sce I site
Normal yeast chromosome: telomere, centromere, telomere
Engineered yeast chromosome: I-Sce I > < I-Sce I
Broken yeast chromosome (after cleavage by the I-Sce I endonuclease): ?

ca. 99 % of cases: cell arrest, no colony
ca. 1 % of cases: repair, loss of cassette, [ura-] colonies

Ricchetti et al., (1999) Nature 402, 96-100

Page 55

Experimental design

Repaired yeast chromosome ([ura-] colonies)?

PCR amplifications:
  Chromosome repaired by non-homologous end-joining: short PCR fragments
  Chromosome repaired with insertion of novel DNA fragment: long PCR fragments, sequenced
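
The screening logic of this slide (short PCR product = simple end-joining, long PCR product = candidate insertion to be sequenced) can be sketched as a size filter. All numbers below are hypothetical: the slide gives neither the expected size of the end-joined junction nor any measured product sizes, and the function name and tolerance are illustrative assumptions.

```python
def classify_repair(product_bp, end_joining_bp, tolerance_bp=10):
    """Classify one [ura-] colony from the size of the PCR product spanning the repaired junction."""
    if product_bp > end_joining_bp + tolerance_bp:
        return "insertion"    # long fragment: candidate for sequencing
    return "end-joining"      # short fragment: non-homologous end-joining without insertion


# Hypothetical PCR product sizes (bp) for four colonies; 150 bp is an assumed
# size for a junction repaired without any inserted DNA.
products = {"colony1": 150, "colony2": 152, "colony3": 221, "colony4": 389}

calls = {name: classify_repair(size, end_joining_bp=150) for name, size in products.items()}
to_sequence = [name for name, call in calls.items() if call == "insertion"]
print(to_sequence)
```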

Page 56

ccaagagataaaattgtacaagaagttataagaataatttta

ctattactttaatattttaaataactaatttagatcaatctaaaaaatctaagtgtttagatgataataaagaatatttattaaagtatt

gaaccccgaaaggaggaataagataaatatatagCAGGGTAAT

tatttatatttatatttc

(T)

(AT)

ATTACCCTGTTAatgattttaaaacaataattttgttttaagtattaataataatattaatattcgacctcttaattgaggatattataatcataattttttgatacaatttttgataaaaagAACAGGGTAAT

34-II-89

(A)

ATTACCCTGTTATattattattttttattattaataataataatttatagggtttattctgttttatcataaatacgtaaatatctaacttagctctcaaattatattacTAACAGGGTAAT

34pAT9

(ATAA)

ATTACCCTGTTATctttattatatttaagaatattattataattattattattattattatttttaataattaaaaatattaataataagtaaatattaattattgttcatttaatcattccaaaaatttaggtaatgatactgcttcgatcttaattggcatatttgcatgacctgtcccacacaactcagaacatgctccggccacgggagccg

34pAS15

(T)

(A)

ATTACCCTGTTAagtttccatagaagtaataataataataaatatattaaatattaatataattattaatta

TAACAGGGTAAT

622pBS8

ATTACCCTGTTATttagaatatttttaattaaataatataattaaatgaataccaaacttatattatatttaTAACAGGGTAAT

(A)

34pAS16

ATTACCCTGTTATtttataattttataaataatatattattataaatatttaatataattTAACAGGGTAAT

(A)

34pAS7

------------ATTACCCTGTTAT3'------------TAATGGGAC

3'TATTGTCCCATTA----------? CAGGGTAAT----------
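
The junction sequences on this slide follow a common layout: upper-case flanks, which appear to be the remnants of the two cleaved I-Sce I sites, surrounding a lower-case inserted fragment. As a minimal sketch under that reading (the case convention is an assumption, and the helper names are illustrative), the following code splits one junction copied from the slide into flanks and insert, checks that the two flanks are in inverted orientation, and measures the A+T content of the insert.

```python
import re

# One repaired junction copied from the slide.
# Assumed convention: UPPER case = I-Sce I site remnants, lower case = inserted DNA.
junction = "ATTACCCTGTTATtttataattttataaataatatattattataaatatttaatataattTAACAGGGTAAT"

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (upper case)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def split_junction(seq):
    """Split an UPPER-lower-UPPER junction into (left flank, insert, right flank)."""
    match = re.fullmatch(r"([ACGT]+)([acgt]+)([ACGT]+)", seq)
    if match is None:
        raise ValueError("expected an UPPER-lower-UPPER layout")
    return match.groups()


left, insert, right = split_junction(junction)
at_fraction = sum(base in "at" for base in insert) / len(insert)

print("left flank :", left)
print("right flank:", right)
print("insert length (bp):", len(insert))
print("insert A+T fraction:", at_fraction)
print("flanks inverted:", reverse_complement(left).endswith(right))
```

The inverted flanks are consistent with the two I-Sce I sites of the cassette being drawn in opposite orientations (> <) on the experimental design slide.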

Page 57

Page 58

A flux of mitochondrial DNA sequences to the nucleus

[Figure labels: mitochondria, nucleus]

Fragmentation of mtDNA
Transfer into the nucleus (?)
Integration into chromosomes following double-strand break

[Figure: map of the 16 S. cerevisiae chromosomes drawn to scale (axis in kb, up to 1600), with each numt marked by a numeric label. Chromosome sizes: I 230 kb, II 813 kb, III 315 kb, IV 1522 kb, V 575 kb, VI 270 kb, VII 1091 kb, VIII 563 kb, IX 440 kb, X 745 kb, XI 666 kb, XII 1078 kb + rDNA repeats, XIII 924 kb, XIV 784 kb, XV 1091 kb, XVI 948 kb.]

34 numts in the yeast nuclear genome

Page 59

Numts in the human nuclear genome

211 numts in the human genome

93 % are insertions of single DNA fragments; 7 % are insertions of multiple, non-adjacent mtDNA fragments.
Numts size range: 47 - 14654 bp, with 78 - 100 % identity to mtDNA.

PCR amplification on DNA from 21 human donors and 3 chimpanzees (Pan troglodytes), using either one primer in the nuclear sequence and one in the numt sequence, or two primers in the nuclear sequence. Some PCR fragments were directly sequenced for verification.

Results

10 numts common to H. sapiens and P. troglodytes: present in all human individuals tested -----> ancient

21 numts specific to H. sapiens and present in all 21 individuals tested

6 numts specific to H. sapiens but present in some individuals only

Conclusion: 27 insertions have occurred in the human genome since its separation from P. troglodytes; 6 of them are not fixed in the human population.

Ricchetti, Tekaia, Dujon (2004) PLoS Biol. 2(9): e273

Fixation of one novel numt per 200 000 years in the human lineage
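
The numbers on this slide can be checked with a few lines of arithmetic. The slide does not state the divergence time used for the rate estimate; the sketch below only shows what time span is implied when the 27 human-specific insertions are combined with the quoted rate, and that implied value is an inference, not a figure from the presentation.

```python
# Counts taken from the Results on this slide.
fixed_human_specific = 21        # present in all 21 individuals tested
polymorphic_human_specific = 6   # present in some individuals only

insertions_since_split = fixed_human_specific + polymorphic_human_specific

# Rate quoted on the slide: one novel numt per 200 000 years.
years_per_numt = 200_000

# Time span implied by combining the two numbers (an inference, not given on the slide).
implied_years = insertions_since_split * years_per_numt

print(insertions_since_split, "insertions")
print(implied_years / 1e6, "million years implied")
```

The implied span of about 5.4 million years falls within the range commonly quoted for the human-chimpanzee divergence (roughly 5 to 7 million years).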

Page 60

Numts insertion and human genetic diseases

Turner et al., (2003) Human Genet. 112, 303-309

16 year old boy with sporadic case of Pallister-Hall syndrome (anomalous development: polydactyly, metacarpal fusion, hypothalamic hamartoma, bifid epiglottis)

72 bp insertional mutation in exon 14 of GLI3 gene
sequence identical to fragment of mtDNA (fragment of ser-tRNA - leu tRNA genes)
sequence predicts a truncated protein (935 aa compared to 1580 aa for w.t.)
functional disruption of a key developmental gene

conception of patient temporally and geographically associated with high-level radioactive contamination following the Chernobyl accident

[Figure: genotypes at the associated SNP: C / T; T / T; C / T with the 72 bp insert]

de novo mitochondrial-nuclear DNA transfer of paternal origin (associated SNP)

Borensztajn et al., (2002) Brit. J. Haematol. 117, 168-171
family case of 251 bp mtDNA fragment inserted into coagulation factor VII gene

Willett-Brozick et al., (2001) Human Genet. 109, 216-223
germline insertion of a 41 bp mtDNA fragment (12S rRNA) associated with a balanced translocation (t(9;11)(p24;q23)) of uncertain clinical significance, founder of mutation unknown.

Page 61

SOME CONCLUSIONS AND PERSPECTIVES

Page 62

Eukaryotic genome evolution represents a dynamic equilibrium between:

1- duplications and loss of genes
   consequences:
   1- formation of paralogs with possibility of neo-functionalization (acquisition of novel function) or subfunctionalization (specialization of function between members of a family)
   2- gene family expansion and reduction
   3- change of genetic maps (loss of synteny)

2- divergence of sequences (creation of alleles, polymorphism of population) and loss of divergence (genetic drift and selection)
   consequences: formation of pseudogenes (non-processed, disabled genes)

3- activity and elimination of transposable elements
   consequences: duplication of genes or fragments (domain accretion), change of genetic maps (chromosome rearrangements), formation of retrogenes and processed pseudogenes

4- possible acquisition of external sequences (HGT) or internal sequences (NUMTs)
   consequences: acquisition of novel functions (selection) or gene inactivation

5- what about non-coding RNA genes ?

[Diagram: Duplications increase redundancy; Gene loss decreases redundancy. Sequence divergence increases polymorphism; Genetic drift and selection decrease polymorphism.]

Page 63

The central dogma of molecular biology

[Diagram: DNA (Replication), Transcription to RNA, Translation to Proteins]

1970: Reverse transcription
1977: Splicing
present: Editing

ATTENTION: This RNA is not exactly identical to the gene product

Example of RNA editing

5' ATCGTTGCAGTC 3'   (DNA)
5' AUCGUUGCAGUC 3'   (RNA)
5' AUCGUUGUAGUC 3'   (edited RNA)
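
The three sequences of the editing example can be compared programmatically. The sketch below assumes that the DNA sequence shown is the coding strand, so that transcription amounts to replacing T by U; it then locates the position at which the edited RNA differs from the primary transcript. The function names are illustrative.

```python
def transcribe(coding_strand):
    """Return the RNA copy of a coding-strand DNA sequence (T replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def differences(reference, other):
    """List (1-based position, reference base, other base) where two equal-length sequences differ."""
    if len(reference) != len(other):
        raise ValueError("sequences must have the same length")
    return [(i + 1, a, b) for i, (a, b) in enumerate(zip(reference, other)) if a != b]


dna = "ATCGTTGCAGTC"         # DNA sequence from the slide
edited_rna = "AUCGUUGUAGUC"  # edited RNA from the slide

primary_transcript = transcribe(dna)
edits = differences(primary_transcript, edited_rna)

print(primary_transcript)
print(edits)
```

In this example the only difference is a C to U change at position 8 of the transcript.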

Page 64

Variability of rDNA

Species         type   loci   chr.   position
S. cerevisiae   1      1      1      internal
C. glabrata     1      2      2      subtel
K. lactis       1      1      1      internal
D. hansenii     3      3      3      subtel + 1 orphan unit
Y. lipolytica   >9     7      4      subtel + several orphan units + 105 copies 5S dispersed

[Figure: schematic rDNA units for each species showing the 5S, 25S, 5.8S and 18S genes, including a separate 5S gene and units labelled "var" for 5S and 18S]

Page 65

VARIABILITY OF NON-CODING RNA GENES IN YEASTS AND VERTEBRATES

                                    SACE   CAGL   KLLA   DEHA   YALI
Total tRNA genes                    274    207    162    205    510
(co-transcribed tRNA gene pairs)    (4)    (0)    (2)    (17)   (11)
Splicing RNA       U1               1      1      1      1      2
                   U2               1      1      1      1      1
                   U4               1      1      1      2      1
                   U5               1      1      1      1      1
                   U6               1      1      1      1      1
Processing         U3               1      1      1      2      3
                   RNase P          1      1      1      1      1
Protein transport (SRP)             1      1      1      1      2
Telomerase                          1      1      1      nd     nd

                                    Chicken   Human   synteny
Total tRNA genes                    280       496     33 %
Splicing RNA       U1               18        146
                   U2               6         88
                   U4               4         119
                   U5               9         36
                   U6               15        821     20 %
                   U4atac           1         1
                   U6atac           4         5
                   U11              1         1
                   U12              1         2
Processing         U3               nd        nd
                   RNase P          1         1
Protein transport (SRP)             3         12
Telomerase                          1         1

Hillier et al., Nature (2004) 432: 695-716
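
The contrast between the two vertebrates is easier to see as a ratio of copy numbers. The sketch below uses only the chicken and human counts from this slide (U3 is omitted because it is marked nd) and reports, for each family, how many human copies there are per chicken copy; the variable names are illustrative.

```python
# (chicken, human) gene counts copied from the slide; U3 is "nd" and therefore left out.
copies = {
    "tRNA": (280, 496),
    "U1": (18, 146),
    "U2": (6, 88),
    "U4": (4, 119),
    "U5": (9, 36),
    "U6": (15, 821),
    "U4atac": (1, 1),
    "U6atac": (4, 5),
    "U11": (1, 1),
    "U12": (1, 2),
    "RNase P": (1, 1),
    "SRP": (3, 12),
    "Telomerase": (1, 1),
}

# Human copies per chicken copy for each family.
ratios = {name: human / chicken for name, (chicken, human) in copies.items()}
most_expanded = max(ratios, key=ratios.get)

for name, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:10s} {ratio:6.1f}")
```

In the yeast table, by contrast, nearly all of these families other than tRNA are present as one to three copies in every species.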

Page 66

and UMR8030 CNRS, Evry J. Weissenbach, V. Anthouard, V. Barbe, L. Cattolico, S. Oztas, C. Scarpelli, P. Wincker

Génopole Institut Pasteur, Paris C. Bouchier, L. Frangeul, L. Ma

LaBRI (UMR5800 CNRS), Centre de bioinformatique and IBGC (UMR5095 CNRS), Univ. Victor Segalen, Bordeaux D. Sherman, E. Beyne, I. Lesur, M. Nikolski, H. Ferry-Dumazet, A. Groppi; A. de Daruvar, N. Goffard; M. Aigle, P. Durrens

Dynamique, évolution et expression des génomes de microorganismes (FRE2326 CNRS), Univ. Louis Pasteur, Strasbourg J-L. Souciet, S. Potier, C. Bleykasten, J. de Montigny, L. Despons, N. Jauniaux, M-L. Straub, B. Wirth, M. Zeniou-Meyer

Unité de Génétique moléculaire des levures (URA2171 CNRS and Univ. P. M. Curie), Institut Pasteur, Paris B. Dujon, J. Boyer, E. Fabre, C. Fairhead, G. Fischer, C. Hennequin, A. Kerrest, R. Koszul, I. Lafontaine, H. Muller, O. Ozier-Kalogeropoulos, S. Pellenz, G-F. Richard, E. Talla, F. Tekaia, A. Thierry

CLIB and Génétique moléculaire et cellulaire (UMR216 INRA and URA1925 CNRS), Institut National Agronomique, Grignon C. Gaillardin, A. Babour, S. Barnay, J-M. Beckerich, S. Blanchin, A. Boisramé, S. Casaregola, P. Joyet, C. Neuvéglise, J-M. Nicaud, A. Suleau, D. Swennene

Institut de Génétique moléculaire (UMR8621 CNRS), Univ. Paris-Sud, Orsay M. Bolotin-Fukuhara, F. Confanioleri, I. Zivanovic

Génétique des levures (UMR5122 CNRS), Univ. Claude Bernard, Lyon M. Wésolowski-Louvel, M. Lemaire

IBMC (UPR9002 CNRS), Strasbourg E. Westhof, R. Kachouri

Logiciels et banques de données, Institut Pasteur, Paris B. Caudron

Biochimie et Génétique moléculaire CEA, Saclay C. Marck

Interactions macromoléculaires (URA2171 CNRS), Institut Pasteur, Paris F. Hantraye

The Génolevures Sequencing Consortium (GDR 2354 CNRS)