Sequence alignment exercises: CLUSTALW, insert FASTA sequences


Request a multiple alignment

Analyze the result

Conserved and variable regions in proteins

Codons and amino acids

The 20 amino acids have overlapping properties

[Figure labels: small change, big change]

Relationship between physico-chemical difference and relative substitution frequency

[Figure axes: physico-chemical difference; relative substitution frequency]

Drastic changes are infrequent

Minor changes are more frequent

Kimura (1983) The neutral theory of molecular evolution.


Pseudogenes as a paradigm of neutral evolution

Li, Gojobori and Nei (1981) Nature 292: 237-239

Pseudogenes show an extremely high rate of nucleotide substitution.


Conservation in a ‘typical’ gene

[Figure labels: start of transcription, start of translation, splice sites, polyadenylation site]

On the basis of 3,165 human-mouse pairs. MGSC (2002) Nature 420: 520-562

Degeneracy of the Genetic Code

[Figure: codon table; colors represent amino acids; legend: synonymous, nonsynonymous]

Each of the 61 sense codons can mutate in 9 different ways. 134 of the 549 possible changes are synonymous.
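The count of synonymous changes can be checked by brute force. The sketch below assumes the standard genetic code and enumerates every single-nucleotide change of every sense codon; the names `GENETIC_CODE` and `count_single_nucleotide_changes` are illustrative and do not come from the slides.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons in the order TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_single_nucleotide_changes():
    """Classify all one-base changes of the sense codons."""
    counts = {"synonymous": 0, "nonsynonymous": 0, "nonsense": 0}
    for codon, aa in GENETIC_CODE.items():
        if aa == "*":  # stop codons are not sense codons
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                new_aa = GENETIC_CODE[mutant]
                if new_aa == aa:
                    counts["synonymous"] += 1
                elif new_aa == "*":
                    counts["nonsense"] += 1  # change to a stop codon
                else:
                    counts["nonsynonymous"] += 1
    return counts


counts = count_single_nucleotide_changes()
print(counts, "total:", sum(counts.values()))
```

Changes that create a stop codon are tallied separately here, which is a choice of this sketch rather than something the slide specifies.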


Synonymous changes can be neutral mutations

King, J. L., and Jukes, T. H. 1969. Non-Darwinian evolution, Science 164, 788-798.

• If most DNA changes were due to adaptive evolution, then one would expect most changes to occur in the first and second codon positions.

• If DNA divergence includes neutral mutations, then the third position should change more rapidly because synonymous mutations are more likely to be neutral.


The first 220 nucleotides of human and mouse renin binding protein

The third position of each codon is marked.

Of the 31 changes:
4 - 1st position
4 - 2nd position
23 - 3rd position

Preponderance of changes in the 3rd position
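A tally like the one above can be produced mechanically. This is a minimal sketch, assuming two coding sequences that are already aligned without gaps and start at the first position of a codon; the function name and the toy sequences are hypothetical and are not the renin binding protein data.

```python
def differences_by_codon_position(seq1, seq2):
    """Return [n1, n2, n3]: differences at 1st, 2nd and 3rd codon positions."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned and of equal length")
    counts = [0, 0, 0]
    for index, (a, b) in enumerate(zip(seq1, seq2)):
        if a != b:
            counts[index % 3] += 1  # index % 3 is 0, 1, 2 for positions 1, 2, 3
    return counts


# Hypothetical toy sequences, three codons each.
seq_a = "ATGGCTAAA"
seq_b = "ATGGCCAGA"
print(differences_by_codon_position(seq_a, seq_b))
```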


Estimating separately the rate of synonymous change and non-synonymous change

• KS = number of Synonymous substitutions per synonymous site

• KA = number of non-synonymous (Altering) substitutions per non-synonymous site

One way of estimating KS and KA would be to examine each change individually and check whether it is synonymous or not. In the following we present a method for doing this in a systematic manner.


Nucleotide sites can be classified into 3 types of degenerate sites

4-fold degenerate: changes of this nucleotide relate to 4 codons for the same AA

2-fold degenerate: changes of this nucleotide relate to pairs of codons for the same AA

0-fold degenerate: no change at this nucleotide leaves the codon coding for the same AA

Legend: Synonymous, Altering (AA = amino acid)
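The three site types can be assigned by testing, for each codon position, how many of the four possible nucleotides keep the amino acid unchanged. The sketch below assumes the standard genetic code and, as a simplification of this sketch, treats the 3-fold third position of isoleucine as 2-fold; `site_degeneracy` and `codon_pattern` are illustrative names.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons in the order TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def site_degeneracy(codon, pos):
    """Return 0, 2 or 4 for position pos (0-based) of a sense codon."""
    aa = GENETIC_CODE[codon]
    # Count the nucleotides (including the original one) that keep the amino acid.
    same = sum(
        GENETIC_CODE[codon[:pos] + base + codon[pos + 1:]] == aa
        for base in BASES
    )
    if same == 4:
        return 4
    if same == 1:
        return 0
    return 2  # 2 or 3 equivalent nucleotides; 3-fold is treated as 2-fold here


def codon_pattern(codon):
    """Return a three-character label such as '004'; stop codons give '---'."""
    if GENETIC_CODE[codon] == "*":
        return "---"
    return "".join(str(site_degeneracy(codon, pos)) for pos in range(3))


for codon in ("GGT", "TTT", "ATG", "CGA", "TTA"):
    print(codon, codon_pattern(codon))
```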

4-fold degenerate sites are found in 32 of the 61 3rd codon positions.

2-fold degenerate sites are found in 25 of the 3rd positions and 8 of the 1st positions.

0-fold degenerate sites are found in the 2nd position of all 61 codons and in 53 of the 1st positions.

Classify each site in a sequence according to the degeneracy of the sites.

[Figure: genetic code table in which each codon is labeled with the degeneracy of its three positions (000, 002, 202, 204 or 004); stop codons are shown as - - -]

Classify each site in a sequence according to the degeneracy of the sites.

000002002002204002004204002004000002002004004004002002204002004004004

000002002002204002004204002004000002002004004002002002204002004004002

Counting the number of 4-, 2- and 0-fold sites (taking the average between the two sequences):

L0 = (45+45)/2 = 45
L2 = (13+15)/2 = 14
L4 = (10+8)/2 = 9
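The averaging step is simple to express in code. The following sketch assumes each sequence has already been converted to a string of degeneracy labels (one character per site); the function names and the two short label strings are hypothetical and are not the slide's sequences.

```python
from collections import Counter


def count_site_classes(pattern):
    """Count 0-, 2- and 4-fold sites in a string of degeneracy labels."""
    counts = Counter(pattern)
    return {0: counts["0"], 2: counts["2"], 4: counts["4"]}


def average_site_counts(pattern1, pattern2):
    """Average the site counts of two sequences, giving L0, L2 and L4."""
    c1 = count_site_classes(pattern1)
    c2 = count_site_classes(pattern2)
    return {k: (c1[k] + c2[k]) / 2 for k in (0, 2, 4)}


# Hypothetical label strings for two aligned sequences of four codons each.
labels_1 = "000002204004"
labels_2 = "000002202004"
print(average_site_counts(labels_1, labels_2))
```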


Classify the differences with another sequence by:
a. transition (S) or transversion (V)
b. degeneracy (0, 2, 4)

              0-fold  2-fold  4-fold
transition    S0      S2      S4
transversion  V0      V2      V4
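The table can be filled in by walking along the alignment. This is a rough sketch under stated assumptions: the sequences are aligned DNA without gaps, and each site's degeneracy label is taken from the first sequence only, which is a simplification made here and not necessarily how the lecture's method handles sites whose class differs between the two sequences. All names and the toy data are hypothetical.

```python
PURINES = {"A", "G"}


def is_transition(a, b):
    """True for purine<->purine or pyrimidine<->pyrimidine changes (a != b)."""
    return (a in PURINES) == (b in PURINES)


def tally_differences(seq1, seq2, degeneracy):
    """Count differences as S0, S2, S4, V0, V2, V4.

    degeneracy is a string of '0', '2', '4' labels, one per site of seq1.
    """
    if not (len(seq1) == len(seq2) == len(degeneracy)):
        raise ValueError("inputs must have equal length")
    counts = {kind + str(d): 0 for kind in "SV" for d in (0, 2, 4)}
    for a, b, d in zip(seq1, seq2, degeneracy):
        if a == b:
            continue
        kind = "S" if is_transition(a, b) else "V"
        counts[kind + d] += 1
    return counts


# Hypothetical toy data: codons ATG (000), GCT (004), AAA (002).
print(tally_differences("ATGGCTAAA", "ATGGCCACA", "000004002"))
```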

The key simplification is the special relationship between transition/transversion and degeneracy:

              0-fold  2-fold  4-fold
transition    S0      S2      S4
transversion  V0      V2      V4

Synonymous mutations: S2, S4, V4
Non-synonymous mutations: S0, V0, V2

(Exceptions: 1st position of arginine (CGA, CGG, AGA, AGG), last position of isoleucine (AUU, AUC, AUA).)

[Figure: the four nucleotides A, G, T and C, with arrows marking transitions and transversions]

Transitions: A <-> G and T <-> C. Transversions: any change between a purine (A, G) and a pyrimidine (T, C).

We distinguish between transitions and transversions according to the Kimura model.


Use Kimura's 2-parameter model to estimate the numbers of transitions (Ai) and transversions (Bi) per i-th type site.

Calculate the proportions of transitional and transversional differences:

Pi = Si/Li (12/70)
Qi = Vi/Li (3/70)

The Kimura model is used to correct for multiple hits:

Ai = (1/2) ln(1/(1 - 2Pi - Qi)) - (1/4) ln(1/(1 - 2Qi))
Bi = (1/2) ln(1/(1 - 2Qi))

(~6 times more transitions than transversions)

(0.242) (0.045)

The Kimura model is similar to the Jukes-Cantor model (from the previous lecture), but it also takes into account that transitions and transversions occur at different frequencies.
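The two correction formulas translate directly into a small function. This is a sketch only: the function name `kimura_2p` and the input proportions in the usage line are illustrative assumptions, not values from the lecture.

```python
import math


def kimura_2p(p, q):
    """Kimura 2-parameter correction for multiple hits.

    p: proportion of transitional differences (Pi = Si/Li)
    q: proportion of transversional differences (Qi = Vi/Li)
    Returns (A, B): estimated transitions and transversions per site.
    """
    if 1 - 2 * p - q <= 0 or 1 - 2 * q <= 0:
        raise ValueError("proportions too large for the correction to be defined")
    a = 0.5 * math.log(1 / (1 - 2 * p - q)) - 0.25 * math.log(1 / (1 - 2 * q))
    b = 0.5 * math.log(1 / (1 - 2 * q))
    return a, b


# Hypothetical proportions, chosen only to illustrate the call.
a, b = kimura_2p(0.10, 0.05)
print(f"A = {a:.4f}, B = {b:.4f}")
```

Both corrected estimates come out larger than the observed proportions, because the correction accounts for sites that have changed more than once.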


The Molecular Clock of Viral Evolution

Relationship between the number of nucleotide substitutions and the difference in the year of isolation for the H3 hemagglutinin gene of human influenza A viruses. All sequence comparisons were made with the strain isolated in 1968.

Gojobori et al. (1990) PNAS 87: 10015-10018

Different rates