Alignments: Why do Alignments? Detecting Selection. Evolution of Drug Resistance in HIV

Post on 15-Jan-2016


Alignments Why do Alignments?

Detecting Selection

Evolution of Drug Resistance in HIV

Selection on Amino Acid Properties
TreeSAAP (2003)
Wu Method (Sainudiin et al. 2005)

TreeSAAP Properties
Alpha-helical tendencies
Average number of surrounding residues
Beta-structure tendencies
Bulkiness
Buriedness
Chromatographic index
Coil tendencies
Composition
Compressibility
Equilibrium constant (ionization of COOH)
Helical contact area
Hydropathy
Isoelectric point
Long-range non-bonded energy
Mean r.m.s. fluctuation displacement
Molecular volume
Molecular weight
Normalized consensus hydrophobicity
Partial specific volume
Polar requirement
Polarity
Power to be at the C-terminal
Power to be at the middle of alpha-helix
Power to be at the N-terminal
Refractive index
Short and medium range non-bonded energy
Solvent accessible reduction ratio
Surrounding hydrophobicity
Thermodynamic transfer hydrophobicity
Total non-bonded energy
Turn tendencies

TreeSAAP

Rhinoviruses

Selected Sites

3D Mapping

PHENOTYPE GENOTYPE

ENVIRONMENT

OPSIN: Model System for Molecular Evolution

[Figure: light spectrum from UV to IR. Axis: wavelength (nm), 400 to 700.]

CRLAKIAMTTVALWFIAWTPYLLINWVGMFARSYLSPVYTIWGYVFAKANAVYNPIVYAISHPKYRAAMEKKLPCLSCKTESDDVSESASTTTSS

Is λmax Correlated with Ecological Differences?

Microscopic thin beam of spectral light (INPUT in, OUTPUT out)
INPUT - OUTPUT = pigment absorbance
Detect light not absorbed by the photopigment
400 to 700 nm at 1 nm intervals
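The measurement described above can be sketched in a few lines. This is a minimal illustration, not the instrument's actual software: the flat input beam and the bell-shaped absorption band centered at 530 nm are synthetic, hypothetical values, and the function names are invented for this example. It scans 400 to 700 nm at 1 nm intervals, takes INPUT - OUTPUT at each wavelength, and reports the wavelength of peak absorbance (λmax).

```python
import math

def absorbance_profile(input_light, output_light):
    """Per-wavelength absorbance, computed as INPUT - OUTPUT as on the slide."""
    return {wl: input_light[wl] - output_light[wl] for wl in input_light}

def lambda_max(profile):
    """Wavelength (nm) at which the absorbance profile peaks."""
    return max(profile, key=profile.get)

# Scan 400 to 700 nm at 1 nm intervals.
wavelengths = range(400, 701)

# Hypothetical data: a flat input beam, and an output beam that lost light
# to a pigment with a bell-shaped absorption band centered at 530 nm.
input_light = {wl: 1.0 for wl in wavelengths}
output_light = {
    wl: 1.0 - 0.6 * math.exp(-(((wl - 530) / 40.0) ** 2)) for wl in wavelengths
}

profile = absorbance_profile(input_light, output_light)
print("lambda max (nm):", lambda_max(profile))
```

With real recordings, `input_light` and `output_light` would be replaced by the measured intensities at each wavelength.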

[Figure: Invertebrate Opsin Evolution. PHYML amino acid ML tree; scale bar 0.1. Thicker branches indicate bootstrap values > 90%.]

Tip labels, in figure order:
Heliconius erato, Heliconius sara, Bicyclus anynana, Junonia coenia, Vanessa cardui, Papilio xuthus Rh1, Papilio xuthus Rh3, Pieris rapae, Manduca sexta, Galleria mellonella, Spodoptera exigua, Papilio xuthus Rh2
Osmia rufa, Bombus terrestris, Apis mellifera, Camponotus abdominalis, Cataglyphis bombycinus, Schistocerca gregaria, Sphodromantis sp.
Drosophila melanogaster Rh6, Drosophila melanogaster Rh1, Calliphora erythrocephala Rh1, Drosophila melanogaster Rh2
Neogonodactylus oerstedii Rh3, Neogonodactylus oerstedii Rh1, Neogonodactylus oerstedii Rh2, Homarus gammarus, Neomysis americana, Holmesimysis costata, Procambarus milleri, Orconectes virilis, Procambarus clarkii, Cambarus ludovicianus, Cambarellus shufeldtii, Euphausia superba, Mysis relicta sp. IV, Archaeomysis grebnitzkii
Limulus polyphemus, Limulus polyphemus, Hemigrapsus sanguineus, Hemigrapsus sanguineus
Camponotus abdominalis, Cataglyphis bombycinus, Apis mellifera, Manduca sexta, Papilio xuthus Rh5, Drosophila melanogaster Rh4, Drosophila melanogaster Rh3
Apis mellifera, Schistocerca gregaria, Papilio xuthus Rh4, Manduca sexta, Drosophila melanogaster Rh5
Loligo pealii, Loligo forbesi, Loligo subulata, Sepia officinalis, Todarodes pacificus, Enteroctopus dofleini
Gallus gallus pineal, Anolis carolinensis pineal, Bos taurus rhodopsin, Homo sapiens melatonin 1A, Homo sapiens GPR52

Clade labels (wavelength ranges in nm):
Insect LWS 508-575 nm
Crustacean LWS 496-533 nm
Insect UV 345-375 nm
Cephalopod Rh 480-499 nm
Crustacean MWS (480)
Chelicerate LWS (520)
Insect MWS 420-490 nm
Insect BL 430-460 nm

[Figure: Coil Tendencies, Compressibility, Alpha-Helix. Four panels: Coil Tendencies, Compressibility, Power to be at mid alpha, Refractive Index. x-axis: amino acid alignment number (0 to 260); y-axis: Z-score. Transmembrane domains TMI to TMVI are marked along the alignment.]

TreeSAAP


Homology

Homology definitions
Homology is an evolutionary relationship that either exists or does not. It cannot be partial.
An ortholog is a homolog that arose through a speciation event.
A paralog is a homolog that arose through a gene duplication event. Paralogs often have divergent function.
Similarity is a measure of the quality of alignment between two sequences. High similarity is evidence for homology. Similar sequences may be orthologs or paralogs.
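To make "similarity" concrete, here is a minimal sketch of one crude proxy, percent identity over an existing alignment. The function name, the convention of skipping gapped columns, and the two toy sequences are assumptions made for this example; real similarity scores usually come from a substitution matrix rather than a simple identity count.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of ungapped alignment columns where the two residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # assumption: gapped columns are excluded from the count
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0

# Toy aligned pair (hypothetical sequences, not from the slides).
print(percent_identity("ACGT-A", "ACCTGA"))
```

A high value here supports homology but does not say whether the pair are orthologs or paralogs; that distinction needs a tree.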

One More Homology Type
Xenology: homology arising from horizontal gene transfer (HGT)
How do you discover this?

Alignment Problem
(Optimal) pairwise alignment consists of considering all possible alignments of two sequences and choosing the optimal one.
Sub-optimal (heuristic) alignment algorithms are also very important, e.g. BLAST.
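Considering all possible alignments does not require listing them one by one. Dynamic programming, in the style of the Needleman-Wunsch algorithm (not named on the slides), gives the optimal global score in time proportional to the product of the sequence lengths. The sketch below returns only the score, with no traceback, and its match, mismatch, and gap values are arbitrary choices for illustration.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Optimal global alignment score under a simple linear gap penalty."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against nothing costs one gap per residue.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (
                match if a[i - 1] == b[j - 1] else mismatch
            )
            gap_in_b = score[i - 1][j] + gap  # residue of a opposite a gap
            gap_in_a = score[i][j - 1] + gap  # residue of b opposite a gap
            score[i][j] = max(diagonal, gap_in_b, gap_in_a)

    return score[-1][-1]

print(global_alignment_score("ACGT", "AGT"))
```

Heuristics such as BLAST trade this guarantee of optimality for speed on database-sized searches.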

Key Issues
Types of alignments (local vs. global)
The scoring system
The alignment algorithm
Measuring alignment significance

Types of Alignment
Global: sequences aligned from end-to-end.
Local: alignments may start in the middle of either sequence.
Ungapped: no insertions or deletions are allowed.
Other types: overlap alignments, repeated match alignments.

Local vs. Global Pairwise Alignments
A global alignment includes all elements of the sequences and includes gaps.
A global alignment may or may not include "end gap" penalties.
Global alignments are better indicators of homology and take longer to compute.
A local alignment includes only subsequences, and sometimes is computed without gaps.
Local alignments can find shared domains in divergent proteins and are fast to compute.
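The local case can be sketched with a Smith-Waterman-style recurrence (the algorithm name is not on the slides). The difference from a global alignment is that every cell is floored at zero, so an alignment may start and end anywhere in either sequence. The scoring values and the two toy sequences are assumptions for illustration.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between any substrings of a and b."""
    best = 0
    previous = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diagonal = previous[j - 1] + (
                match if a[i - 1] == b[j - 1] else mismatch
            )
            # Flooring at 0 lets a new local alignment start at this cell.
            current[j] = max(0, diagonal, previous[j] + gap, current[j - 1] + gap)
            best = max(best, current[j])
        previous = current
    return best

# Two otherwise unrelated toy sequences sharing the block "ACGT".
print(local_alignment_score("GGGACGTCCC", "TTACGTAA"))
```

Only the shared block contributes to the score, which is how a local alignment can pick out a shared domain in otherwise divergent proteins.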

How do you compare alignments?
Scoring scheme
What events do we score? Matches, mismatches, gaps
What scores will you give these events?
What assumptions are you making?
Score your alignment

Scoring Matrices
How do you determine scores?
What is out there already for your use?
DNA versus amino acids?

TTACGGAGCTTC
CTGAGATCC
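As a worked example of scoring, the sketch below scores one candidate alignment of the two sequences above. Both the gap placement (three leading gaps in the shorter sequence) and the scores (match +1, mismatch -1, gap -2) are assumptions chosen for illustration, not the slide's answer. Changing either changes the result, which is the point of the exercise.

```python
def score_alignment(row_a, row_b, match=1, mismatch=-1, gap=-2):
    """Sum the column scores of an existing gapped pairwise alignment."""
    if len(row_a) != len(row_b):
        raise ValueError("alignment rows must have the same length")
    total = 0
    for x, y in zip(row_a, row_b):
        if x == "-" and y == "-":
            raise ValueError("a column cannot contain two gaps")
        if x == "-" or y == "-":
            total += gap
        elif x == y:
            total += match
        else:
            total += mismatch
    return total

# One hypothetical alignment of the two slide sequences.
row_a = "TTACGGAGCTTC"
row_b = "---CTGAGATCC"

# 6 matches, 3 mismatches, 3 gap columns: 6 - 3 - 6 = -3
print(score_alignment(row_a, row_b))

# The same alignment with free gaps scores 3, showing how the scheme
# encodes assumptions about which events are costly.
print(score_alignment(row_a, row_b, gap=0))
```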

Multiple Sequence Alignment
Global versus local alignments
Progressive alignment
  Estimate guide tree
  Do pairwise alignment on subtrees
ClustalX
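The guide-tree step can be sketched as follows. This is a simplified illustration and not ClustalX's actual procedure: the distance used here (one minus the shared fraction of 3-mers) and the average-linkage joining rule are assumptions, and the three toy sequences are hypothetical. The output is the order in which sequences or groups would be aligned, closest pairs first.

```python
from itertools import combinations

def kmer_set(seq, k=3):
    """All distinct k-mers occurring in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def kmer_distance(a, b, k=3):
    """Crude alignment-free distance: 1 - shared fraction of k-mers."""
    ka, kb = kmer_set(a, k), kmer_set(b, k)
    union = ka | kb
    if not union:
        return 0.0
    return 1.0 - len(ka & kb) / len(union)

def guide_tree_order(seqs, k=3):
    """Join order for progressive alignment using average linkage."""
    dist = {
        frozenset(pair): kmer_distance(seqs[pair[0]], seqs[pair[1]], k)
        for pair in combinations(seqs, 2)
    }

    def average_distance(c1, c2):
        total = sum(dist[frozenset((x, y))] for x in c1 for y in c2)
        return total / (len(c1) * len(c2))

    clusters = [(name,) for name in seqs]
    merges = []
    while len(clusters) > 1:
        c1, c2 = min(
            combinations(clusters, 2),
            key=lambda pair: average_distance(*pair),
        )
        merges.append((c1, c2))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges

# Hypothetical toy sequences: A and B are near-identical, C is unrelated.
seqs = {"A": "ACGTACGTAC", "B": "ACGTACGTAA", "C": "TTTTGGGGCC"}
merges = guide_tree_order(seqs)
for left, right in merges:
    print("align", left, "with", right)
```

In a full progressive aligner, each merge would be a pairwise (sequence or profile) alignment performed in this order.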

Improvements
Consistency-based Algorithms
T-Coffee: consistency-based objective function to minimize potential errors
Generates pairwise global (Clustal) and local (Lalign) alignments
Then combine, reweight, progressive alignment

Iterative Algorithms
Estimate draft progressive alignment (uncorrected distances)
Improved progressive (re-estimate guide tree using Kimura 2-parameter)
Refinement: divide into 2 subtrees, estimate two profiles, then re-align the 2 profiles
Continue refinement until convergence
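The distance correction used when re-estimating the guide tree can be illustrated with the standard Kimura 2-parameter formula for aligned DNA, d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q), where P and Q are the proportions of sites showing transitions and transversions. The sketch below is a minimal version under stated assumptions: columns containing gaps or ambiguous bases are skipped, and the function name and toy sequences are invented for this example. It is not taken from any aligner's source code.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def k2p_distance(aligned_a, aligned_b):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    sites = transitions = transversions = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue  # assumption: skip gaps and ambiguous bases
        sites += 1
        if x == y:
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    # math.log raises ValueError if the pair is too divergent to correct.
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)

# Toy pair with one transition in ten sites: uncorrected distance is 0.1.
print(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"))
```

The corrected value is slightly larger than the uncorrected proportion of differences, because it allows for multiple substitutions at the same site.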

Software
Clustal
T-Coffee
MUSCLE (limited models)
MAFFT (wide variety of models)

Comparisons
Speed: MUSCLE > MAFFT > CLUSTALW > T-COFFEE
Accuracy: MAFFT > MUSCLE > T-COFFEE > CLUSTALW
Lots more work to do here!
