Fyodor A. Kondrashov - bionet.nsc.ru

Post on 07-Feb-2019


Pathogenic mutations and compensatory evolution

Evolutionary Genomics Lab

Centre for Genomic Regulation

Barcelona, Spain

Fyodor A. Kondrashov

Genotype

Phenotype (fitness)

Polymeropoulos et al., Science, 1997

Macaca mulatta

Macaca fascicularis

Erythrocebus patas

Homo sapiens

Pongo pygmaeus abelii

Saguinus labiatus

Ateles geoffroyi

Lagothrix lagotricha

Rattus norvegicus

Mus musculus

Gallus gallus

Xenopus laevis

[Figure: phylogenetic tree of the species listed above, with numeric node labels and the amino acid state (A or T) at site 53 of alpha-synuclein in each lineage]

Site 53 of alpha-synuclein: T -> A

Some other substitution, X -> Z, made A better than T

[Figure: cloverleaf secondary structures of Homo sapiens tRNAAsn and Pan troglodytes (chimpanzee) tRNAAsn, with the acceptor stem, D-stem/loop, anticodon stem/loop and TΨC-stem/loop labeled]

Kern and Kondrashov, Nat Genet 2004

[Figures: cloverleaf secondary structures, each with the acceptor stem, D-stem/loop, anticodon stem/loop and TΨC-stem/loop labeled, for:]

Cynocephalus variegatus (Malayan flying lemur) tRNALys

Ceratotherium simum (white rhinoceros) tRNATrp

Ursus maritimus (polar bear) tRNASer(UCN)

Spalax ehrenbergi (Ehrenberg's mole-rat) tRNAIle

Tamandua tetradactyla (southern tamandua) tRNAIle

Hyperoodon ampullatus (northern bottlenose whale) tRNALeu(UUR)

Tachyglossus aculeatus (Australian echidna) tRNALeu(UUR)

Oryctolagus cuniculus (rabbit) tRNACys

Wittenhagen & Kelley, Nat Struct Biol (2002) and Trends Biochem. Sci. (2003)

Molecular basis of the A3243G mt disease mutation

[Figure: cloverleaf secondary structure of Canis familiaris (dog) tRNALeu(UUR), with the acceptor stem, D-stem/loop, anticodon stem/loop and TΨC-stem/loop labeled]

Alignment against the human sequence (first row). A dot marks identity with human, a dash marks a gap.

[gttaaga]tg[gcag]agcccggtaa[tcgc]a[taaaa]cttaaaa[cttta]cagtc[agagg]ttcaatt[cctct|tcttaac]a  Homo sapiens (human)
[.......]..[....].......C..[.t..].[.....].......[.....]tta..[.....].....c.[.....|.......].  Tarsius bancanus (western tarsier)
[......g]..[....]........c.[.t..]c[.....].....g.[t....]Agta.[...a.].....a.[.....|c......].  Tupaia belangeri (northern tree shrew)
[......g]..[....].......C..[.t..].[.....].......[.....]t.a..[.....].....c.[.....|c......].  Lepus europaeus (European hare)
[.c.....]..[....]..........[.t..].[C..g.]......c[.C..g]A.tc.[.....].....c.[.....|.....G.].  Jaculus jaculus (lesser Egyptian jerboa)
[a......]..[....].......C..[.t..].[...g.]t......[.C...].ta..[.....].....c.[.....|......T].  Sciurus vulgaris (Eurasian red squirrel)
[a......]..[....]..-.......[.t..].[C..g.]......c[.C..g].t...[.....]......C[.....|......T].  Echinops telfairi (small Madagascar hedgehog)
[....g..]..[....].......C..[.t..].[.....].....gc[t....]t.a..[.....].....c.[.....|..c....].  Pteropus scapulatus (little red flying fox)
[.....ag]..[....]..a.......[.t..].[.....].....g.[t....]g..c.[.....].....c.[.....|ct....T].  Pipistrellus abramus (Japanese house bat)
[....g.g]..[....]........G.[.t..].[.....]......c[.....]t.c..[.....].....a.[.....|c.c....].  Ursus maritimus (polar bear)
[....g.g].-[....]..........[.t..].[.....]......c[t....].ccc.[.....].....c.[.....|c.c...T].  Odobenus rosmarus rosmarus (Atlantic walrus)
[....g..]..[....]..........[ct..].[.....]......c[.....]t.ac.[.....].....c.[.....|..c....].  Rhinoceros unicornis (greater Indian rhinoceros)
[...gg..]..[....]..ta...C..[.t..].[.....]......c[.....]t.cc.[.....].....a.[.....|..cc...].  Monodon monoceros (narwhal)
[...g..g]..[....]..t....C..[.t.T].[.....]......c[t....]..c..[.....].....a.[.....|c.cc...].  Platanista minor (Indus River dolphin)
[a...g.g]..[....]..a.......[.t..]g[.....]......c[.....]ttac.[.....].....c.[.....|c.c...T].  Sus scrofa (pig)
[.......]..[....]..a.a.....[.t..].[...g.]......c[.....]ttac.[.....].....a.[.....|.......].  Dasypus novemcinctus (nine-banded armadillo)
[......g]..[....]..........[.t..].[.....].....gc[t....]..ac.[.....].......[.....|c......].  Orycteropus afer (aardvark)
[.......].a[...a].aatt...c.[ct..].[.....].....gc[t....].tca.[G....].....c.[.....|.......].  Elephas maximus (Asiatic elephant)
[a.....g]..[....]..-....C..[.t..].[.....]......c[.....]t.a..[.....].....a.[.....|c.....T].  Macropus robustus (wallaroo)
[a.....g]..[....]..-.a.....[.t..].[.....].....gc[.....]..ac.[.....].....aC[.....|c.....T].  Vombatus ursinus (common wombat)
[a.....g]..[a...]..a.......[.t.T]g[.....].....gc[t....]t....[.....].....a.[.....|c.....T].  Ornithorhynchus anatinus (platypus)
[a.....g]..[a...]..a....C..[.t.T]g[.....].....gc[t....]t.a..[.....].....a.[.....|c.....T].  Tachyglossus aculeatus (Australian echidna)

Alignment against the human sequence (first row). A dot marks identity with human, a dash marks a gap.

[actcttt]ta[gtat]aaat--a[gtac]c[gttaa]cttccaa[ttaac]tagt[tttga]c-aacat[tcaaa|aaagagt]a  Homo sapiens (human)
[.......]..[....]..G.--.[....].[.....].......[.....]....[.....].-.....[.....|.......].  Pan troglodytes (chimpanzee)
[.......]..[....]..Gc--.[....].[.....].......[.....]....[.....].-.....[.....|.......].  Pan paniscus (pygmy chimpanzee)
[.......]..[....]..t.--.[....].[.....].......[.....]c...[....g]t-.gt.c[c....|.......].  Gorilla gorilla (gorilla)
[.......]..[....]..Gc--.[....].[.....].......[.....]c...[.....].-....c[.....|.......].  Pongo pygmaeus (orangutan)
[.......]..[....]...c--.[....].[.....].......[.....]....[.....].-...Gc[c....|.......].  Pongo pygmaeus abelii (Sumatran orangutan)
[.......]..[....]..t.--.[....]a[A..g.].......[.c..t]c..c[.....].-..t..[.....|.......].  Papio hamadryas (hamadryas baboon)
[.......]..[....]..cc--.[....]a[A..g.].......[.c..t]c...[.....].-.....[.....|.......].  Macaca sylvanus (Barbary ape)
[.......]..[....]...c--.[....]t[.....].......[.....]c..c[..c..]t-...Gc[..g..|.......].  Hylobates lar (common gibbon)
[.t...c.]..[....]...c--.[....]a[A..g.].......[....t]ag.c[c....]t-..-.c[c...g|.g...a.].  Cebus albifrons (white-fronted capuchin)
[.t.....]..[....]cg.ccc.[a...]a[A..g.].......[....t]..ac[..c.g]tg..-.a[c.gg.|.....a.].  Lemur catta (ring-tailed lemur)
[g......]..[...c]..c.--.[....]a[A..g.].......[.c..t]ag.a[....g]ta..t.a[c....|.g....c].  Nycticebus coucang (slow loris)
[gt..c..]..[....]c...t-.[....]a[A..g.].......[.c..t]...c[cc.ag]tac.at.[ct.gg|..g..ac].  Tarsius bancanus (western tarsier)

Bracketed segments in the alignments above, 5’ to 3’: 1 2 2’ 3 3’ 4 4’ 1’
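The bracket labels suggest that segments 1/1’, 2/2’, 3/3’ and 4/4’ are the two halves of the four tRNA stems. Under that assumption (it is a reading of the notation, not something the slide states), a minimal Python sketch can count how many positions in each stem of the human row form a Watson-Crick or G-U wobble pair. The helper names `stem_segments` and `count_pairs` are hypothetical.

```python
import re

# Human row of the first alignment, copied from the slide (DNA alphabet).
HUMAN_ROW = ("[gttaaga]tg[gcag]agcccggtaa[tcgc]a[taaaa]cttaaaa[cttta]"
             "cagtc[agagg]ttcaatt[cctct|tcttaac]a")

# Watson-Crick pairs plus the G-T (G-U in RNA) wobble pair.
PAIRS = {("a", "t"), ("t", "a"), ("g", "c"), ("c", "g"), ("g", "t"), ("t", "g")}


def stem_segments(row):
    """Return the bracketed segments in order; '|' splits the last bracket in two."""
    segments = []
    for block in re.findall(r"\[([^\]]*)\]", row):
        segments.extend(block.split("|"))
    return segments


def count_pairs(five_prime, three_prime):
    """Count paired positions when the 3' half is read antiparallel to the 5' half."""
    return sum((a, b) in PAIRS for a, b in zip(five_prime, reversed(three_prime)))


# Assumed order of segments: 1, 2, 2', 3, 3', 4, 4', 1'.
s1, s2, s2p, s3, s3p, s4, s4p, s1p = stem_segments(HUMAN_ROW)
paired = {
    "1": (count_pairs(s1, s1p), len(s1)),
    "2": (count_pairs(s2, s2p), len(s2)),
    "3": (count_pairs(s3, s3p), len(s3)),
    "4": (count_pairs(s4, s4p), len(s4)),
}
for stem, (n_paired, length) in paired.items():
    print(f"stem {stem}: {n_paired}/{length} positions paired")
```

Running the same count on other rows, after expanding the dots against the human row, would show whether a substitution in one stem half is matched by a substitution in the other half.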

[Figure: cloverleaf secondary structure of Homo sapiens (human) tRNATrp]

11 compensated pathogenic deviations (CPDs) in total, of 7 different types, in 10 species

Kondrashov et al. PNAS 2002

Predicted compensatory interactions

MVYPEPWCMPRM

VVYPEPWCMPRL

MVYPEPWHMPRL

MTFPEDYCMPRL

TTFPHDWCMPRL

TTFPEDWCMPRL

MVYPEPWCMPRL

MVYPEPWCMPGL

MVYPEPYCMPRL

MVYKERWHMPRL

MVYKEPWHMPRL

MVFPEDWCIPRL

MTFPEDWCIPRL

MTFPEDWCMPRL

MTFPYDWCMPRL

MTFPHDWQMPRL

MTYPHDLCMPRL

MTFPHDFCMPRL

MTFPHDLCMPRL

MMYPHDFCMPRL

Studying amino acid diversity in proteins

MVYPEPWCMPRM

VVYPEPWCMPRL

MVYPEPWHMPRL

MTFPEDYCMPRL

TTFPHDWCMPRL

TTFPEDWCMPRL

MVYPEPWCMPRL

MVYPEPWCMPGL

MVYPEPYCMPRL

MVYKERWHMPRL

MVYKEPWHMPRL

MVFPEDWCIPRL

MTFPEDWCIPRL

MTFPEDWCMPRL

MTFPYDWCMPRL

MTFPHDWQMPRL

MTYPHDLCMPRL

MTFPHDFCMPRL

MTFPHDLCMPRL

MMYPHDFCMPRL

MVYPEPWCMPRM

VVYPEPWCMPRM

TVYPEPWCMPRM

MTYPEPWCMPRM

MVYPYPWCMPRM

MVYPEDWCMPRM

MVYPEPYCMPRM

MVYPEPLCMPRM

MVYPEPFCMPRM

MVFPEPWCMPRM

MVYPHPWCMPRM

MVYKEPWCMPRM

MVYPEPWQMPRM

MVYPEPWHMPRM

MVYPEPWCIPRM

MVYPEPWCMPGM

MVYPEPWCMPRL

Gene                Number of sequences (species)   Average amino acid usage
AVERAGE             3538                            9.5
ATP6                3021                            10.0
ATP8                1244                            11.2
COX1                4450                            7.4
COX2                4204                            10.6
COX3                2191                            9.4
CYTB                7954                            12.0
ND1                 2056                            10.0
ND2                 5963                            11.2
ND3                 2852                            10.5
ND4                 2041                            10.2
ND4L                1785                            11.5
ND5                 949                             8.9
ND6                 1015                            10.8
Elongation factor   1743                            3.7
Histone 3           1228                            5.2
RuBisCO             13912                           9.2

Amino acid usage predicts dn/ds

An average site in a protein can accept ~8 amino acid states. Without epistasis, the expected dn/ds ratio of an average protein should be (u-1)/19, where u is the expected amino acid usage. For u = 8 this gives 7/19 ≈ 0.37.
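As a rough illustration, the sketch below computes usage for the 20 example sequences shown earlier and plugs the average into (u-1)/19. It assumes that usage at a site means the number of distinct amino acids observed in that alignment column. The function names are hypothetical, and the toy alignment is far too small to say anything about real proteins.

```python
from statistics import mean

# The 20 example sequences from the slides.
ALIGNMENT = [
    "MVYPEPWCMPRM", "VVYPEPWCMPRL", "MVYPEPWHMPRL", "MTFPEDYCMPRL",
    "TTFPHDWCMPRL", "TTFPEDWCMPRL", "MVYPEPWCMPRL", "MVYPEPWCMPGL",
    "MVYPEPYCMPRL", "MVYKERWHMPRL", "MVYKEPWHMPRL", "MVFPEDWCIPRL",
    "MTFPEDWCIPRL", "MTFPEDWCMPRL", "MTFPYDWCMPRL", "MTFPHDWQMPRL",
    "MTYPHDLCMPRL", "MTFPHDFCMPRL", "MTFPHDLCMPRL", "MMYPHDFCMPRL",
]


def site_usage(alignment):
    """Number of distinct amino acids observed in each alignment column."""
    return [len(set(column)) for column in zip(*alignment)]


def expected_dnds(u):
    """Non-epistatic expectation (u - 1) / 19 for an average usage u."""
    return (u - 1) / 19


usage = site_usage(ALIGNMENT)
u = mean(usage)
print("usage per site:", usage)
print("average usage u:", u)
print("expected dn/ds:", round(expected_dnds(u), 3))
```

The denominator 19 is the number of alternative amino acids at a site, so the ratio is the share of possible replacements that the site is seen to tolerate.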

Short-term evolution rate

AAT CTC AAG CAT GGA

N L K H G

AGT CTA AAA TAT GGG

S L K Y G

Kn = Number of nonsynonymous substitutions/Number of nonsynonymous sites

Ks = Number of synonymous substitutions/Number of synonymous sites

Kn/Ks = (2/35) / (3/10) = 0.19
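The numbers in this example can be reproduced with a short Python sketch. It assumes a simple counting convention: "sites" are all possible single-nucleotide changes of the second sequence, and changes to a stop codon count as nonsynonymous. That convention matches the 35 and 10 on the slide, but it is an inference rather than the author's stated method, and standard estimators such as Nei-Gojobori count sites differently. The function names are hypothetical.

```python
# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {a + b + c: AMINO[16 * i + 4 * j + k]
        for i, a in enumerate(BASES)
        for j, b in enumerate(BASES)
        for k, c in enumerate(BASES)}


def codons(seq):
    """Split a nucleotide string (spaces allowed) into codons."""
    seq = seq.replace(" ", "")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def count_substitutions(seq_a, seq_b):
    """Count differing codons as (synonymous, nonsynonymous).

    Assumes each differing codon pair differs at a single position.
    """
    syn = nonsyn = 0
    for ca, cb in zip(codons(seq_a), codons(seq_b)):
        if ca == cb:
            continue
        if CODE[ca] == CODE[cb]:
            syn += 1
        else:
            nonsyn += 1
    return syn, nonsyn


def count_sites(seq):
    """Count possible single-nucleotide changes as (synonymous, nonsynonymous)."""
    syn = nonsyn = 0
    for codon in codons(seq):
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                if CODE[mutant] == CODE[codon]:
                    syn += 1
                else:
                    nonsyn += 1  # includes changes to stop codons
    return syn, nonsyn


SEQ_1 = "AAT CTC AAG CAT GGA"
SEQ_2 = "AGT CTA AAA TAT GGG"

syn_subs, nonsyn_subs = count_substitutions(SEQ_1, SEQ_2)
syn_sites, nonsyn_sites = count_sites(SEQ_2)
kn = nonsyn_subs / nonsyn_sites
ks = syn_subs / syn_sites
print(f"Kn = {nonsyn_subs}/{nonsyn_sites}, Ks = {syn_subs}/{syn_sites}, Kn/Ks = {kn / ks:.2f}")
```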

[Figure: codon GGG and its nine single-nucleotide neighbors: AGG, CGG, TGG, GAG, GCG, GTG, GGA, GGC, GGT]

[Figure: histogram of pairwise dn/ds; x-axis: pairwise dn/ds, y-axis: number of pairwise comparisons]

Average dn/ds = 0.03

Fraction of clade-specific evolution

Gene      Corrected usage   Expected dn/ds   Observed dn/ds   Fraction of non-epistatic evolution
AVERAGE   8.387             0.389            0.059            0.15
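The AVERAGE row can be checked by hand, assuming that the expected dn/ds comes from the (u-1)/19 rule and that the last column is the ratio of observed to expected dn/ds. The slide does not spell this out, so treat it as an interpretation.

```latex
\[
\left(\frac{dn}{ds}\right)_{\text{expected}}
  = \frac{u - 1}{19}
  = \frac{8.387 - 1}{19}
  \approx 0.389,
\qquad
\frac{(dn/ds)_{\text{observed}}}{(dn/ds)_{\text{expected}}}
  = \frac{0.059}{0.389}
  \approx 0.15
\]
```

On this reading, about 15% of the replacements that the observed amino acid usage would allow are actually accepted, and the remaining 85% is attributed to epistasis.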

[Figure: histogram of the fraction of epistatic evolution per gene; x-axis bins: 0.6-0.7, 0.7-0.8, 0.8-0.9, 0.9-1; y-axis: number of genes (0 to 12)]

MVYPEDWCMPRM

VVYPEDWCMPRL

MVYPEDWHMPRL

MVYPEDYCMPRL

MVYPHDWCMPRL

MVYPEDWCMPRL

MVYPEDWCMPRL

MVYPEDWCMPGL

MVYPEDYCMPRL

MVYPEDWCMPRL

MVYKEPWCMPRL

MVYPEDWCIPRL

MVYPEDWCIPRL

MVYPEDWCMPRL

MVYPYDWCMPRL

MVFPEDWQMPRL

MVYPEDWCMPRL

MTYPEDWCMPRL

MTYPEDWCMPRL

MMYPEDWCMPRL

MVYPEPWCMPRM

VVYPEPWCMPRL

MVYPEPWHMPRL

MTFPEDYCMPRL

TTFPHDWCMPRL

TTFPEDWCMPRL

MVYPEPWCMPRL

MVYPEPWCMPGL

MVYPEPYCMPRL

MVYKERWHMPRL

MVYKEPWHMPRL

MVFPEDWCIPRL

MTFPEDWCIPRL

MTFPEDWCMPRL

MTFPYDWCMPRL

MTFPHDWQMPRL

MTYPHDLCMPRL

MTFPHDFCMPRL

MTFPHDLCMPRL

MMYPHDFCMPRL

Expected protein divergence: f_{i,j} is the frequency of amino acid i at site j, and L is the protein length.
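The formula itself is not reproduced here, only its symbols. One natural candidate, assumed purely for illustration, is the probability that two sequences drawn independently from the per-site frequencies differ at a site, averaged over sites: D = 1 - (1/L) Σ_j Σ_i f_{i,j}². The Python sketch below implements that assumed form. It may differ from the expression on the original slide, and the function name is hypothetical.

```python
from collections import Counter


def expected_divergence(alignment):
    """Assumed form: 1 - (1/L) * sum over sites j and amino acids i of f[i][j]**2.

    `alignment` is a list of equal-length, gap-free protein sequences.
    """
    n_sequences = len(alignment)
    length = len(alignment[0])
    expected_identity = 0.0
    for column in zip(*alignment):
        frequencies = [count / n_sequences for count in Counter(column).values()]
        expected_identity += sum(f * f for f in frequencies)
    return 1 - expected_identity / length


# Toy example: site 1 is invariant, site 2 is split evenly between two states.
print(expected_divergence(["AA", "AB"]))
```

Under this form a fully conserved alignment has expected divergence 0, and the value rises as sites accept more states at more even frequencies.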

hemoglobin subunit beta [Macaca mulatta]

Sequence ID: ref|NP_001157900.1| Length: 147 Number of Matches: 1

Score Expect Method Identities Positives Gaps

288 bits(736) 3e-97 139/147(95%) 143/147(97%) 0/147(0%)

Query 1 MVHLTPEEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPK 60

MVHLTPEEK+AVT LWGKVNVDEVGGEALGRLLVVYPWTQRFF+SFGDLS+PDAVMGNPK

Sbjct 1 MVHLTPEEKTAVTTLWGKVNVDEVGGEALGRLLVVYPWTQRFFDSFGDLSSPDAVMGNPK 60

Query 61 VKAHGKKVLGAFSDGLAHLDNLKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAHHFG 120

VKAHGKKVLGAFSDGL HLDNLKGTFA LSELHCDKLHVDPENF+LLGNVLVCVLAHHFG

Sbjct 61 VKAHGKKVLGAFSDGLNHLDNLKGTFAQLSELHCDKLHVDPENFKLLGNVLVCVLAHHFG 120

Query 121 KEFTPPVQAAYQKVVAGVANALAHKYH 147

KEFTP VQAAYQKVVAGVANALAHKYH

Sbjct 121 KEFTPQVQAAYQKVVAGVANALAHKYH 147

beta-globin [Mus musculus]

Score Expect Method Identities Positives Gaps

164 bits(414) 2e-48 118/147(80%) 131/147(89%) 0/147(0%)

Query 1 MVHLTPEEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPK 60

MVHLT EKSAV+ LW KVN DEVGGEALGRLLVVYPWTQR+F+SFGDLS+ A+MGNPK

Sbjct 1 MVHLTDAEKSAVSCLWAKVNPDEVGGEALGRLLVVYPWTQRYFDSFGDLSSASAIMGNPK 60

Query 61 VKAHGKKVLGAFSDGLAHLDNLKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAHHFG 120

VKAHGKKV+ AF++GL +LDNLKGTFA+LSELHCDKLHVDPENFRLLGN +V VL HH G

Sbjct 61 VKAHGKKVITAFNEGLKNLDNLKGTFASLSELHCDKLHVDPENFRLLGNAIVTVLGHHLG 120

Query 121 KEFTPPVQAAYQKVVAGVANALAHKYH 147

K+FTP QAA+QKVVAGVA ALAHKYH

Sbjct 121 KDFTPAAQAAFQKVVAGVATALAHKYH 147

beta-globin epsilon-m [Didelphis virginiana]

Score Expect Method Identities Positives Gaps

168 bits(425) 4e-50 108/147(73%) 132/147(89%) 0/147(0%)

Query 1 MVHLTPEEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPK 60

MVH TPE+K+ +T++W KV+V++VGGE+L RLLVVYPWTQRFF+SFG+LS+ AVMGNPK

Sbjct 1 MVHFTPEDKTNITSVWTKVDVEDVGGESLARLLVVYPWTQRFFDSFGNLSSASAVMGNPK 60

Query 61 VKAHGKKVLGAFSDGLAHLDNLKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAHHFG 120

VKAHGKKVL +F +G+ ++DNLKGTFA LSELHCDKLHVDPENFRLLGNVL+ VLA FG

Sbjct 61 VKAHGKKVLTSFGEGVKNMDNLKGTFAKLSELHCDKLHVDPENFRLLGNVLIIVLASRFG 120

Query 121 KEFTPPVQAAYQKVVAGVANALAHKYH 147

KEFTP VQA++QK+V+GV++AL HKYH

Sbjct 121 KEFTPEVQASWQKLVSGVSSALGHKYH 147

hemoglobin subunit rho [Gallus gallus]

Score Expect Method Identities Positives Gaps

165 bits(418) 4e-49 97/147(66%) 125/147(85%) 0/147(0%)

Query 1 MVHLTPEEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPK 60

MVH + EEK +T++W KVNV+E G EAL RLL+VYPWTQRFF++FG+LS+P A++GNPK

Sbjct 1 MVHWSAEEKQLITSVWSKVNVEECGAEALARLLIVYPWTQRFFDNFGNLSSPTAIIGNPK 60

Query 61 VKAHGKKVLGAFSDGLAHLDNLKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAHHFG 120

V+AHGKKVL +F + + +LDN+K T+A LSELHC+KLHVDPENFRLLGN+L+ VLA HF

Sbjct 61 VRAHGKKVLSSFGEAVKNLDNIKNTYAKLSELHCEKLHVDPENFRLLGNILIIVLAAHFT 120

Query 121 KEFTPPVQAAYQKVVAGVANALAHKYH 147

K+FTP QA +QK+V+ VA+ALA+KYH

Sbjct 121 KDFTPTCQAVWQKLVSVVAHALAYKYH 147
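The Identities column is the count of matching positions over the aligned length. A minimal Python sketch for the first of the four alignments above (query versus Macaca mulatta), using the two sequences exactly as printed; the function name is hypothetical.

```python
# Query and subject sequences copied from the first alignment above.
QUERY = (
    "MVHLTPEEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPK"
    "VKAHGKKVLGAFSDGLAHLDNLKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAHHFG"
    "KEFTPPVQAAYQKVVAGVANALAHKYH"
)
MACAQUE = (
    "MVHLTPEEKTAVTTLWGKVNVDEVGGEALGRLLVVYPWTQRFFDSFGDLSSPDAVMGNPK"
    "VKAHGKKVLGAFSDGLNHLDNLKGTFAQLSELHCDKLHVDPENFKLLGNVLVCVLAHHFG"
    "KEFTPQVQAAYQKVVAGVANALAHKYH"
)


def identities(seq_a, seq_b):
    """Return (matching positions, aligned length) for a gap-free alignment."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return matches, len(seq_a)


matches, length = identities(QUERY, MACAQUE)
print(f"{matches}/{length} identical ({100 * matches / length:.0f}%), "
      f"divergence {1 - matches / length:.3f}")
```

Observed divergence is one minus the identity fraction.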

Sequence divergence beyond accumulation of deleterious alleles

[Figure: histogram; x-axis: percent of pairwise sequence comparisons beyond the theoretical divergence (0 to 100); y-axis: number of gene families (0 to 500)]

Expected protein divergence: f_{i,j} is the frequency of amino acid i at site j, and L is the protein length.

1773 gene families

Small dn/ds but large sequence divergence -> epistasis (or frequent adaptation)

Wright’s Fitness Landscape

Wright, S 1931

"Functional proteins must form a continuous network which can be traversed by unit mutational steps without passing through nonfunctional intermediates"

Evolution of sequences

WORD <-> WORE <-> GORE <-> GONE <-> GENE

- John Maynard Smith, Nature, 1970
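The word chain can be checked mechanically: every step changes exactly one letter, although the end points differ at all four positions. A small Python sketch (the helper name `hamming` is illustrative):

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    return sum(x != y for x, y in zip(a, b))


PATH = ["WORD", "WORE", "GORE", "GONE", "GENE"]

# Distance between each pair of neighbors along the path.
steps = [hamming(a, b) for a, b in zip(PATH, PATH[1:])]
print("step sizes:", steps)
print("WORD to GENE directly:", hamming(PATH[0], PATH[-1]))
```

The check covers only the "unit mutational steps" part of the quotation. Whether each intermediate is functional (here, a real word) needs an external criterion such as a dictionary or, for proteins, a fitness measurement.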

Total sequence space

Non-epistatic fitness landscape

Epistatic fitness landscape

Microevolution

Accumulation of many allele replacements

Macroevolution

MDGHTSKLRG

MD HT K RG

MDSHTVKFRG

causation

Complicated, stochastic, dynamic world with a detailed underlying theory

Observation-based insights with almost no theory, which is necessarily based in the microevolutionary world

confirmation of theory

molecular biology

Our Institute
