© Burkhard Rost (TUM Munich)
title: Secondary structure prediction
short title: cb1_sec1
lecture: Protein Prediction 1 - Protein structure
Computational Biology 1, TUM, summer 2014
Wednesday May 21, 2014
Announcements
Videos: YouTube / www.rostlab.org
THANKS: Tim Karl + Jonas Reeb
Special lectures:
• Apr 15 - Andrea Schafferhans
No lecture:
• Apr 17/22 Easter
• May 01 Thu May day
• May 06 Tue Student assembly
• May 29 Thu Ascension day
• Jun 03 Tue no room
• Jun 10 Tue Whitsun holidays
• Jun 19 Thu Corpus Christi
LAST lecture: July 1
Exam: July 8
• Makeup: Oct 21 - morning
CONTACT: Lothar Richter [email protected]
Recap: secondary structure prediction
Goal of structure prediction
Epstein & Anfinsen, 1961: sequence uniquely determines structure
• INPUT: sequence
• OUTPUT: 3D structure and function
Zones
[Figure: Daylight Zone, Twilight Zone, Midnight Zone; comparison methods: sequence - sequence, sequence - profile, profile - profile; sequence similar -> structure similar]
B Rost (1997) Fold Des 2:S19-24
B Rost (1999) Protein Eng 12:85-94
Comparative modeling applicable to about 1/3 of all proteins
CASP
Protein Structure Prediction
Only homology modeling good
No general prediction of 3D from sequence, yet
Important improvements in many fields!
© Burkhard Rost (Columbia New York)
L Pauling & RB Corey (1953) PNAS 39:247-252
L Pauling, RB Corey & HR Branson (1951) PNAS 37:205-234
W Kabsch & C Sander (1983) Biopolymers 22:2577-2637
DSSP
Pauling’s H-bond pattern used in DSSP
Notation: protein structure 1D, 2D, 3D
[Figure: 1D (sequence with strand segments marked E), 2D (residue-by-residue map, both axes 1-90, scale 0 to -5 kcal/mol), 3D (structure)]
1D: secondary structure prediction
Words
Secondary structure prediction
2ndary structure prediction
2D prediction
Close Homology (Sequence Id. > 60%, PSI-BLAST E-value < 10^-20)
Distant Homology (Domain, Motif)
Machine Learning (NN, SVM)
Protein Space:
X=Positive Y=Negative
Protein function classification
© Kaz Wrzeszczynski: Thesis
Coverage of structure space
Secondary structure prediction
DSSP secondary assignment has 8 "states"
• H = Helix
• G = 3_10 helix
• I = Pi helix
• E = Extended (strand)
• B = beta-bridge, single strand residue
• T = Turn, i.e. one turn of helix
• S = bend
• " " = loop
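Prediction accuracy is later reported for three states (helix, strand, other), so the eight DSSP states have to be merged at some point. A minimal Python sketch, assuming one common convention (H, G, I -> H; E, B -> E; everything else -> L). The slides do not say which mapping was used, and the names `DSSP_TO_3` and `reduce_dssp` are illustrative only.

```python
# Assumed 8-to-3 state mapping (one common convention, not taken from the slides).
DSSP_TO_3 = {
    "H": "H", "G": "H", "I": "H",   # helices
    "E": "E", "B": "E",             # strand and bridge
    "T": "L", "S": "L", " ": "L",   # turn, bend, loop
}

def reduce_dssp(states):
    """Map a string of DSSP 8-state symbols to helix (H), strand (E), other (L)."""
    # Unknown symbols fall back to "other".
    return "".join(DSSP_TO_3.get(s, "L") for s in states)

print(reduce_dssp("HHGG EEB TS"))
```

Other groupings (for example counting G or B as "other") change the reported accuracy, so the mapping should always be stated together with the numbers.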
Goal of secondary structure prediction
LEDKSPDHNPTGID
AKGKPMDRNFTGRNHPPKDSS
AAQVKDALTK
LEQWGTLAQLRAIWEQELTDFPEFLTMMARQETWLGWLTI
helix strand
loop
LAVIGVLMKW
FVFLMIEKIYHKLT
DIRVGLTYYIAQ
VNTFVGTFAAVAHAL
W Kabsch & C Sander (1985) Identical pentapeptides with different backbones. Nature 317:207
How can identical pentapeptides occur in 2 states?
Secondary structure prediction methods
L Pauling, RB Corey and HR Branson (1951) Two Hydrogen-Bonded Helical Configurations of the Polypeptide Chain. PNAS 37:205-211.
L Pauling, RB Corey and HR Branson (1951) The Structure of Proteins: Two Hydrogen-bonded Helical Configurations of the Polypeptide Chain. PNAS 37:205-234.
AG Szent-Györgyi & C Cohen (1957) Role of proline in polypeptide chain configuration of proteins. Science 126:697.
some are more equal than others ...
Sec str pred methods: single residues
L Pauling, RB Corey and HR Branson (1951) Two Hydrogen-Bonded Helical Configurations of the Polypeptide Chain. PNAS 37:205-211.
L Pauling, RB Corey and HR Branson (1951) The Structure of Proteins: Two Hydrogen-bonded Helical Configurations of the Polypeptide Chain. PNAS 37:205-234.
AG Szent-Györgyi & C Cohen (1957) Role of proline in polypeptide chain configuration of proteins. Science 126:697.
MF Perutz, MG Rossmann, AF Cullis, G Muirhead, G Will and AT North (1960) Structure of haemoglobin: a three-dimensional Fourier synthesis at 5.5 Å resolution, obtained by X-ray analysis. Nature 185:416-422.
JC Kendrew, RE Dickerson, BE Strandberg, RJ Hart, DR Davies and DC Phillips (1960) Structure of myoglobin: a three-dimensional Fourier synthesis at 2 Å resolution. Nature 185:422-427.
Simple prediction: frequency
First step (Szent-Györgyi):
• Proline breaks a helix
• Helices span several turns, i.e. >4 residues
-> identify helices/non-helices
Proline bends main chain
Simple prediction: frequency
from Proline to odds for all
....,....1....,....2....
QEKSPREVTMKKGDILTLLNSTNK
 E..E EEEEEE
AA D E G I K L M N P Q R S T V
E 1 1 3 1 1 1
L 1 1 1 4 1 1 1 1 2 1
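The step "from Proline to odds for all" amounts to tabulating, for every amino acid, how often it is observed in each state and comparing that with its overall frequency. A minimal sketch in Python; the toy sequence and state string are hypothetical (not the example from the slide), and the log-odds form is one plausible way of turning counts into a preference, not necessarily the one used by Chou-Fasman or GOR.

```python
import math
from collections import Counter

def state_counts(sequence, states):
    """Count how often each amino acid is observed in each state."""
    counts = {}
    for aa, st in zip(sequence, states):
        counts.setdefault(st, Counter())[aa] += 1
    return counts

def log_odds(counts, aa, state):
    """log( P(aa | state) / P(aa) ); positive means aa is enriched in state.

    Zero counts would need pseudocounts, which this sketch omits.
    """
    n_state = sum(counts[state].values())
    n_total = sum(sum(c.values()) for c in counts.values())
    n_aa = sum(c[aa] for c in counts.values())
    p_aa_given_state = counts[state][aa] / n_state
    p_aa = n_aa / n_total
    return math.log(p_aa_given_state / p_aa)

# Hypothetical toy example (not the sequence from the slide).
seq = "PKVEVKPGLL"
obs = "LLEEEELLLL"
counts = state_counts(seq, obs)
print(counts["E"]["V"], counts["L"]["P"])
print(round(log_odds(counts, "V", "E"), 3))
```

With real data the counts would be collected over a whole set of proteins of known structure rather than over a single chain.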
Secondary structure prediction methods
single residues (1. generation)
• Chou-Fasman, GOR 1957-70/80
Robson B & Pain RH (1971) Analysis of the Code Relating Sequence to Conformation in Proteins: Possible Implications for the Mechanism of Formation of Helical Regions. J. Mol. Biol. 58:237-259.
Chou PY & Fasman GD (1974) Prediction of protein conformation. Biochemistry 13:211-215.
Garnier J, Osguthorpe DJ and Robson B (1978) Analysis of the accuracy and Implications of simple methods for predicting the secondary structure of globular proteins. J. Mol. Biol. 120:97-120.
how to assess performance?
problem 1: where to get secondary structure from?
how to assess performance?
problem 2: how to measure?
Secondary structure prediction accuracy
• Q3: three-state per-residue accuracy
$Q_3 = \frac{\text{number of correctly predicted residues in states helix, strand, other}}{\text{number of residues in protein}}$
Schulz GE & Schirmer RH (1979) Prediction of secondary structure from the amino acid sequence. In: (eds). Principles of protein structure. Berlin: Springer-Verlag, pp 108-130.
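The Q3 definition translates directly into code. A minimal sketch in Python, assuming predicted and observed states are given as equally long strings over three symbols (H, E, L); the function name `q3` and the 10-residue example are hypothetical.

```python
def q3(predicted, observed):
    """Fraction of residues whose predicted state equals the observed state."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("need two non-empty strings of equal length")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return correct / len(observed)

# Hypothetical 10-residue example.
obs = "HHHHLLEEEE"
prd = "HHHLLLEEEL"
print(f"Q3 = {100 * q3(prd, obs):.0f}%")
```

Q3 counts every residue equally, so it says nothing about whether segments have realistic lengths or whether one state is predicted much worse than the others.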
Secondary structure prediction methods
single residues (1. generation)
• Chou-Fasman, GOR 1957-70/80
published: 63% accuracy
Secondary Structure Assignment: DSSP
Dictionary of Secondary Structure of Proteins
ASSESSING secondary structure prediction
Wolfgang Kabsch & Chris Sander (1983) Biopolymers 22:2577-637
Secondary structure prediction methods
single residues (1. generation)
• Chou-Fasman, GOR 1957-70/80
50-55% accuracy (assessed in 1994)
2nd Generation: what would you do?
Secondary Structure Prediction: Segment 83
Dictionary of Secondary Structure of Proteins
Wolfgang Kabsch & Chris Sander (1983) Biopolymers 22:2577-637
W Kabsch & C Sander (1985) Identical pentapeptides with different backbones. Nature 317:207
W Kabsch & C Sander (1983) Segment 83 unpublished
Secondary structure prediction: 1.+2. Generation
single residues (1. generation)
• Chou-Fasman, GOR 1957-70/80
50-55% accuracy (Q3)
segments (2. generation)
• GORIII 1986-92
55-60% Q3
• Gibrat J-F, Garnier J and Robson B (1987) Further developments of protein secondary structure prediction using information theory. New parameters and consideration of residue pairs. J. Mol. Biol. 198:425-443.
• Biou V, Gibrat JF, Levin JM, Robson B and Garnier J (1988) Secondary structure prediction: combination of three different methods. Prot. Engin. 2:185-191.
• Garnier J & Robson B (1989) The GOR method for predicting secondary structure in proteins. In: Fasman GD (ed.). Prediction of protein structure and the principles of protein conformation. New York: Plenum Press, pp 417-465.
Secondary structure prediction: 1.+2. Generation
single residues (1. generation)
• Chou-Fasman, GOR 1957-70/80
50-55% accuracy
segments (2. generation)
• GORIII 1986-92
55-60% accuracy
problems
• < 100% they said: 65% max
Helix formation is local
residues i and i+3
Thyroid hormone receptor (2nll)
β-sheet formation is NOT local
Erabutoxin β (3ebx)
Secondary structure prediction: 1.+2. Generation
single residues (1. generation)
• Chou-Fasman, GOR 1957-70/80
50-55% accuracy
segments (2. generation)
• GORIII 1986-92
55-60% accuracy
problems
• < 100% they said: 65% max
• < 40% they said: strand non-local
• short segments
Problems of secondary structure predictions (before 1994)
SEQ! KELVLALYDYQEKSPREVTMKKGDILTLLNSTNKDWWKVEVNDRQGFVPAAYVKKLD
OBS! EEEE E E E EEEEEE EEEEEE EEEEEEHHHEEEE
TYP! EHHHH EE EEEE EE HHHEE EEEHH
INSERT: concept of neural networks
Simple neural network
[Figure: two input units connected to one output unit through the junctions J11 and J12]
$out_0 = in_1 J_{11} + in_2 J_{12}$
$out = \tanh(out_0)$
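The two formulas of the simple network can be sketched in a few lines of Python. The function name `simple_unit` and the junction values in the example are arbitrary illustrations, not values from the lecture.

```python
import math

def simple_unit(in1, in2, j11, j12):
    """Weighted sum of two inputs followed by tanh."""
    out0 = in1 * j11 + in2 * j12   # linear step
    return math.tanh(out0)         # non-linear step

# Junction values here are arbitrary illustrations.
print(simple_unit(1.0, 0.0, 0.5, -0.5))
```

Training, as discussed on the following slides, means changing `j11` and `j12` until the output matches the wanted output for the training examples.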
Training a neural network 1
Training a neural network 2
$Error = (out_{net} - out_{want})^2$
[Figure: output (out) plotted against input (in)]
Training a neural network 3
[Figure: Error plotted against the Junctions]
Training a neural network 4
[Figure: junction values changing step by step during training]
Neural networks classify points
Simple neural network with hidden layer
$out_i = f\left( \sum_j J^{2}_{ij} \cdot f\left( \sum_k J^{1}_{jk} \cdot in_k \right) \right)$
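The hidden-layer formula is a nested application of "weighted sum, then trigger function". A minimal forward pass in Python, assuming f is a sigmoid (the slides use tanh for the simplest network and a sigmoid later; which f belongs to this formula is an assumption here). The names `forward`, `J1`, `J2` and all junction values are illustrative.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, j1, j2, f=sigmoid):
    """out_i = f( sum_j J2[i][j] * f( sum_k J1[j][k] * in_k ) )."""
    hidden = [f(sum(w * x for w, x in zip(row, inputs))) for row in j1]
    return [f(sum(w * h for w, h in zip(row, hidden))) for row in j2]

# Arbitrary illustrative junctions: 2 inputs -> 2 hidden units -> 1 output unit.
J1 = [[1.0, -1.0], [0.5, 0.5]]
J2 = [[1.0, -1.0]]
print(forward([1.0, 1.0], J1, J2))
```

Each row of `J1` holds the junctions into one hidden unit, and each row of `J2` the junctions into one output unit, which mirrors the index order in the formula.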
Principles of networks: input -> output
two steps:
1. linear: sum over all input × connection
2. non-linear: sigmoid trigger, i.e., project sum onto 0-1
step 1: $\sum_j connection_{ij} \cdot input_j$
step 2: output from unit (0 to 1.0) as a function of input to unit (= sum)
input = 3 adjacent residues in protein sequence
output = secondary structure state of central residue
[Figure: sequence fragment :ACACC:, input units s1, s2, s3 connected through J to output units α and L; the sum is compared with a decision line; result: < decision line]
Principles of neural networks: error
• output: $out_i = \sum_{j=1}^{N_{in}+1} J_{ij} \, in_j$
in_j: value of input unit j; out_i: value of output unit i; J_ij: connection between input unit j and output unit i
• error: $E = \sum_{i=1}^{N_{out}} (out_i - des_i)^2$
out_i: value of output unit i; des_i: secondary structure state observed for central amino acid for output unit i (e.g. for a helix: des_1=1, des_2=0, des_3=0)
• free variables: connections { J }
• goal: representation of set of examples (training set) for which the mapping input->output is known, i.e., the secondary structure state of the central residue has been observed
Principles of neural networks: training
training = change of connections {J} such that E decreases
simplest procedure:
• gradient descent
$\Delta J_{ij}(t+1) = -\epsilon \, \frac{\partial E(t)}{\partial J_{ij}(t)} + \alpha \, \Delta J_{ij}(t-1)$
where ∂E/∂J is the derivative of the error with respect to the network connection; t is the algorithmic time given by the presentation of one example; ε determines the step width of the change (learning strength, typically some 0.01); α gives the contribution of the momentum term (∆J(t-1), typically some 0.2), which permits uphill moves
[Figure: Error as a function of { J }]
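Gradient descent with a momentum term can be sketched for the single-layer case, where the output is a plain weighted sum and the error is the squared difference to the desired output. This is an illustration under those assumptions, not the code used for PHD. The momentum term reuses the change applied in the previous step, and the defaults `eps=0.01` and `alpha=0.2` follow the typical values quoted on the slide. The function name `train_step` and the example pattern are hypothetical.

```python
def train_step(J, dJ_prev, inputs, desired, eps=0.01, alpha=0.2):
    """One update for a single-layer linear network.

    out_i = sum_j J[i][j] * in_j,  E = sum_i (out_i - des_i)^2
    dJ_ij = -eps * dE/dJ_ij + alpha * dJ_prev_ij
    """
    out = [sum(w * x for w, x in zip(row, inputs)) for row in J]
    error = sum((o - d) ** 2 for o, d in zip(out, desired))
    # dE/dJ_ij = 2 * (out_i - des_i) * in_j for the linear output above.
    dJ = [[-eps * 2.0 * (o - d) * x + alpha * prev
           for x, prev in zip(inputs, prev_row)]
          for o, d, prev_row in zip(out, desired, dJ_prev)]
    J_new = [[w + step for w, step in zip(row, step_row)]
             for row, step_row in zip(J, dJ)]
    return J_new, dJ, error

# Hypothetical example: 2 inputs plus a constant bias input, 3 output units.
J = [[0.0] * 3 for _ in range(3)]
dJ = [[0.0] * 3 for _ in range(3)]
errors = []
for _ in range(200):
    J, dJ, e = train_step(J, dJ, inputs=[1.0, 0.0, 1.0], desired=[1.0, 0.0, 0.0])
    errors.append(e)
print(errors[0], errors[-1])
```

In real training the examples change from step to step; a single fixed example is used here only to show that the error decreases.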
Effect of over-training: theory
[Figure: curve on a 0-100 scale versus training time; the late phase is marked over-train]
Effect of over-training: practice
[Figure: number of correct classifications per example (0.55-0.95) versus number of cycles (0-25); curves: ratio for training set, ratio for testing set]
RETURN: secondary structure prediction
Secondary structure predictions of 1. and 2. generation
single residues (1. generation)
• Chou-Fasman, GOR 1957-70/80
50-55% accuracy
segments (2. generation)
• GORIII 1986-92
55-60% accuracy
problems
• < 100% they said: 65% max
• < 40% they said: strand non-local
• short segments
Neural Network for secondary structure
[Figure: input window of 13 adjacent residues with observed states D (L), R (E), Q (E), G (E), F (E), V (E), P (E), A (H), A (H), Y (H), V (E), K (E), K (E); each residue is coded over the alphabet ACDEFGHIKLMNPQRSTVWY. ; output units H, E, L]
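The input coding suggested by the figure (one unit per letter of the alphabet `ACDEFGHIKLMNPQRSTVWY.` for each residue in the window) can be sketched as follows. This is a minimal Python illustration assuming a plain one-hot code and a window of 13 residues; using the final "." as a spacer for positions beyond the chain ends is an assumption, and the name `encode_window` is hypothetical.

```python
ALPHABET = "ACDEFGHIKLMNPQRSTVWY."   # 20 amino acids plus spacer, as on the slide

def encode_window(sequence, centre, width=13):
    """One-hot code the window around `centre`; positions beyond the ends use the spacer."""
    half = width // 2
    vector = []
    for pos in range(centre - half, centre + half + 1):
        aa = sequence[pos] if 0 <= pos < len(sequence) else "."
        units = [0] * len(ALPHABET)
        units[ALPHABET.index(aa)] = 1   # raises ValueError for letters outside the alphabet
        vector.extend(units)
    return vector

# Window from the figure, centred on the proline.
x = encode_window("DRQGFVPAAYVKK", centre=6)
print(len(x), sum(x))
```

The resulting vector has 13 × 21 = 273 input units, exactly one of which is switched on per window position.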
NN predicts secondary structure
[Figure: pie charts for helix, strand, other; columns: method, overall accuracy]
neural network, unbalanced: 62% overall accuracy
... and developer believes that application of machine learning is all the intelligence he will ever need...
NN sec str: training dynamics
[Figure: performance (0-1) for Other, Strand, Helix versus time (1-10); time: 1 step = 20,000 training samples]
$E^{\mu} = \sum_i \left( o_i^{\mu} - d_i^{\mu} \right)^2$
$\Delta J^{\mu} \propto - \frac{\partial E^{\mu}\{J\}}{\partial J}$
NN predicts secondary structure
[Figure: pie charts for helix, strand, other; full pie: all correctly predicted residues]
neural network, unbalanced: 62% overall accuracy
comparison: data bank distribution
comparison: 33:33:33
Balanced training
normal training:
$E^{\mu} = \sum_i \left( o_i^{\mu} - d_i^{\mu} \right)^2$
$\Delta J^{\mu} \propto - \frac{\partial E^{\mu}\{J\}}{\partial J}$
balanced training:
$E = \sum_{\mu=\alpha,\beta,L} \sum_i \left( o_i^{\mu} - d_i^{\mu} \right)^2$
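One way to read the balanced error is that every training step sums the error over one helix, one strand and one loop example, so that all three states are presented equally often regardless of their frequency in the data bank. A minimal sampling sketch in Python under that reading; the function name `balanced_batches` and the toy data set are hypothetical.

```python
import random

def balanced_batches(examples, n_steps, rng=None):
    """Yield one helix, one strand and one loop example per training step.

    `examples` is a list of (input, state) pairs with state in "H", "E", "L".
    """
    rng = rng or random.Random(0)
    by_state = {"H": [], "E": [], "L": []}
    for x, state in examples:
        by_state[state].append((x, state))
    for _ in range(n_steps):
        yield [rng.choice(by_state[s]) for s in ("H", "E", "L")]

# Hypothetical unbalanced data set: many loop examples, few strand examples.
data = ([(i, "L") for i in range(50)]
        + [(i, "H") for i in range(30)]
        + [(i, "E") for i in range(20)])
for batch in balanced_batches(data, n_steps=2):
    print([state for _, state in batch])
```

With unbalanced (normal) training, examples would instead be drawn from `data` directly, so loop examples would dominate the updates.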
Balanced training: dynamics
[Figure: performance (0-1) for Other, Strand, Helix versus time (1-10); left panel: unbalanced training, right panel: balanced training]
train, unbalanced: $E^{\mu} = \sum_i \left( o_i^{\mu} - d_i^{\mu} \right)^2$, $\Delta J^{\mu} \propto - \frac{\partial E^{\mu}\{J\}}{\partial J}$
train, balanced: $E = \sum_{\mu=\alpha,\beta,L} \sum_i \left( o_i^{\mu} - d_i^{\mu} \right)^2$
[Figure: pie charts for helix, strand, other; full pie: all correctly predicted residues]
neural network, unbalanced: 62% overall accuracy
comparison: data bank distribution
comparison: 33:33:33
neural network, balanced: 60% overall accuracy
Neural networks DO improve if developer does something more than dream the machine learning dream...
Secondary structure predictions of 1. and 2. generation
single residues (1. generation)
• Chou-Fasman, GOR 1957-70/80
50-55% accuracy
segments (2. generation)
• GORIII 1986-92
55-60% accuracy
problems
• < 100% they said: 65% max
• < 40% they said: strand non-local
• short segments
β-sheet formation is NOT local
Erabutoxin β (3ebx)
Conclusion: not all sound explanations are right!
Bad segment prediction
HHHHHHHHHEEEEE
HHHHEEE
HHHHHHHEEEEE
1st level
2nd level
comparison:observed:
SEQ! KELVLALYDYQEKSPREVTMKKGDILTLLNSTNKDWWKVEVNDRQGFVPAAYVKKLD
OBS! EEEE E E E EEEEEE EEEEEE EEEEEEHHHEEEE
TYP! EHHHH EE EEEE EE HHHEE EEEHH
Select samples at random
$\Delta J_{ij}(t+1) = -\epsilon \, \frac{\partial E(t)}{\partial J_{ij}(t)} + \alpha \, \Delta J_{ij}(t-1)$
where ∂E/∂J is the derivative of the error with respect to the network connection; t is the algorithmic time given by the presentation of one example; ε determines the step width of the change (learning strength, typically some 0.01); α gives the contribution of the momentum term (∆J(t-1), typically some 0.2), which permits uphill moves
[Figure: Error as a function of { J }]
Local correlations in reality
residues i and i+3
Erabutoxin β (3ebx)
How to get those into the prediction?
PHDsec: structure-to-structure network
[Figure: states of adjacent residues, e.g. V (E), P (E), A (H), as input to a network with output units H, E, L]
B Rost (1996) Methods Enzymol 266:525-39
Better segment prediction
HHHHHHHHHEEEEE
HHHHEEE
HHHHHHHEEEEE
1st level
2nd level
comparison:observed:
Better prediction of segment lengths
[Figure, panel A: number of segments (0-1200) versus segment length (0-50), with an inset for lengths 25-50; curves: DSSP, PHD. Panel B: difference in number of observed - predicted segments (-800 to 800) versus segment length (0-10) for helix, strand, loop]
Structure-to-structure network: Invented?
N Qian & TJ Sejnowski (1988) Predicting the secondary structure of globular proteins using neural network models. J. Mol. Biol. 202:865-884.
Other ideas
More output units, e.g. instead of central residue: take central 3
1. 9 output units
2. average output -> 3 units
output back into neural networks:
Gianluca Pollastri, Dariusz Przybylski, B Rost and Pierre Baldi (2002) Improving the prediction of protein secondary structure in three and eight classes using recurrent neural networks and profiles. Proteins: Structure, Function, and Bioinformatics 47:228-235.
Gianluca Pollastri, Dariusz Przybylski, B Rost and Pierre Baldi (2002) Proteins 47:228-235: Fig. 1
idea: P Frasconi & M Gori (1996) IEEE Trans Neural Netw 7:1521-5
STILL ONLY 60+ε% accuracy.
How to improve beyond that?
How to get more data into it?
Evolution has it!
[Figure: percentage sequence identity (0-100) versus number of residues aligned (0-250); regions: "Sequence identity implies structural similarity!" and "Don't know region"]
C Sander & R Schneider (1991) Proteins 9:56-68
B Rost (1999) Prot Engin 12:85-94
SH3: Src-homology 3 domain, one domain of proteins such as Src tyrosine kinase (STK)
            1                                                   50
fyn_human   VTLFVALYDY EARTEDDLSF HKGEKFQILN SSEGDWWEAR SLTTGETGYI
yrk_chick   VTLFIALYDY EARTEDDLSF QKGEKFHIIN NTEGDWWEAR SLSSGATGYI
fgr_human   VTLFIALYDY EARTEDDLTF TKGEKFHILN NTEGDWWEAR SLSSGKTGCI
yes_chick   VTVFVALYDY EARTTDDLSF KKGERFQIIN NTEGDWWEAR SIATGKTGYI
src_avis2   VTTFVALYDY ESRTETDLSF KKGERLQIVN NTEGDWWLAH SLTTGQTGYI
src_aviss   VTTFVALYDY ESRTETDLSF KKGERLQIVN NTEGDWWLAH SLTTGQTGYI
src_avisr   VTTFVALYDY ESRTETDLSF KKGERLQIVN NTEGDWWLAH SLTTGQTGYI
src_chick   VTTFVALYDY ESRTETDLSF KKGERLQIVN NTEGDWWLAH SLTTGQTGYI
stk_hydat   VTIFVALYDY EARISEDLSF KKGERLQIIN TADGDWWYAR SLITNSEGYI
src_rsvpa   .......... ESRIETDLSF KKRERLQIVN NTEGTWWLAH SLTTGQTGYI
hck_human   ..IVVALYDY EAIHHEDLSF QKGDQMVVLE ES.GEWWKAR SLATRKEGYI
blk_mouse   ..FVVALFDY AAVNDRDLQV LKGEKLQVLR .STGDWWLAR SLVTGREGYV
hck_mouse   .TIVVALYDY EAIHREDLSF QKGDQMVVLE .EAGEWWKAR SLATKKEGYI
lyn_human   ..IVVALYPY DGIHPDDLSF KKGEKMKVLE .EHGEWWKAK SLLTKKEGFI
lck_human   ..LVIALHSY EPSHDGDLGF EKGEQLRILE QS.GEWWKAQ SLTTGQEGFI
ss81_yeast  .....ALYPY DADDDdeISF EQNEILQVSD .IEGRWWKAR R.ANGETGII
abl_mouse   ..LFVALYDF VASGDNTLSI TKGEKLRVLG YnnGEWCEAQ ..TKNGQGWV
abl1_human  ..LFVALYDF VASGDNTLSI TKGEKLRVLG YnnGEWCEAQ ..TKNGQGWV
src1_drome  ..VVVSLYDY KSRDESDLSF MKGDRMEVID DTESDWWRVV NLTTRQEGLI
mysd_dicdi  .....ALYDF DAESSMELSF KEGDILTVLD QSSGDWWDAE L..KGRRGKV
yfj4_yeast  ....VALYSF AGEESGDLPF RKGDVITILK ksQNDWWTGR V..NGREGIF
abl2_human  ..LFVALYDF VASGDNTLSI TKGEKLRVLG YNQNGEWSEV RSKNG.QGWV
tec_human   .EIVVAMYDF QAAEGHDLRL ERGQEYLILE KNDVHWWRAR D.KYGNEGYI
abl1_caeel  ..LFVALYDF HGVGEEQLSL RKGDQVRILG YNKNNEWCEA RlrLGEIGWV
txk_human   .....ALYDF LPREPCNLAL RRAEEYLILE KYNPHWWKAR D.RLGNEGLI
yha2_yeast  VRRVRALYDL TTNEPDELSF RKGDVITVLE QVYRDWWKGA L..RGNMGIF
abp1_sacex  .....AEYDY EAGEDNELTF AENDKIINIE FVDDDWWLGE LETTGQKGLF
B Rost (1996) Methods Enzymol 266:525-39
Evolution improves prediction
Evolutionary profile implicitly captures history of an individual protein!
[Figure: fly, chicken, rat, mouse, human]
PHD: Neural network & evolutionary information
Protein:
:GYIYDPEDGDPDDGVNPGTDF:
Alignments:
:GYIYDPAVGDPDNGVEPGTEF:
:GYIYDPEVGDPTQNIPPGTKF:
:GYEYDPAEGDPDNGVKPGTSF:
:GYEYDPAEGDPDNGVKPGTAF:
profile table (one row per residue of the protein; entries count the sequences with that amino acid):
    G S A P D  N T E K Q  C V H I R  L M Y F W
G   5 . . . .  . . . . .  . . . . .  . . . . .
Y   . . . . .  . . . . .  . . . . .  . . 5 . .
I   . . . . .  . . 2 . .  . . . 3 .  . . . . .
Y   . . . . .  . . . . .  . . . . .  . . 5 . .
D   . . . . 5  . . . . .  . . . . .  . . . . .
P   . . . 5 .  . . . . .  . . . . .  . . . . .
E   . . 3 . .  . . 2 . .  . . . . .  . . . . .
D   . . . . 1  . . 2 . .  . 2 . . .  . . . . .
G   5 . . . .  . . . . .  . . . . .  . . . . .
D   . . . . 5  . . . . .  . . . . .  . . . . .
P   . . . 5 .  . . . . .  . . . . .  . . . . .
D   . . . . 4  . 1 . . .  . . . . .  . . . . .
D   . . . . 1  3 . . . 1  . . . . .  . . . . .
G   4 . . . .  1 . . . .  . . . . .  . . . . .
V   . . . . .  . . . . .  . 4 . 1 .  . . . . .
N   . . . 1 .  1 . 1 2 .  . . . . .  . . . . .
P   . . . 5 .  . . . . .  . . . . .  . . . . .
G   5 . . . .  . . . . .  . . . . .  . . . . .
T   . . . . .  . 5 . . .  . . . . .  . . . . .
D   . 1 1 . 1  . . 1 1 .  . . . . .  . . . . .
F   . . . . .  . . . . .  . . . . .  . . 5 . .
corresponds to the 21*3 bits coding for the profile of one residue
[Figure: input layer (s0) -> J1 -> first or hidden layer (s1) -> J2 -> second or output layer (s2) with units H, E, L; pick maximal unit => current prediction]
B Rost & C Sander (1993) PNAS 90:7558-62
B Rost (1996) Methods Enzymol 266:525-39
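The profile table on this slide can be recomputed from the five sequences shown (the protein plus its four aligned relatives) by counting amino acids column by column. A minimal sketch in Python; the column order `GSAPDNTEKQCVHIRLMYFW` is the one printed on the slide, while the function name `profile` is illustrative. Gaps, insertions and weighting of sequences are ignored here.

```python
from collections import Counter

ORDER = "GSAPDNTEKQCVHIRLMYFW"   # column order of the profile table on the slide

def profile(sequences):
    """Count, for every alignment position, how many sequences carry each amino acid."""
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("aligned sequences must have equal length")
    rows = []
    for column in zip(*sequences):
        counts = Counter(column)
        rows.append([counts.get(aa, 0) for aa in ORDER])
    return rows

family = [
    "GYIYDPEDGDPDDGVNPGTDF",  # protein
    "GYIYDPAVGDPDNGVEPGTEF",  # aligned sequences
    "GYIYDPEVGDPTQNIPPGTKF",
    "GYEYDPAEGDPDNGVKPGTSF",
    "GYEYDPAEGDPDNGVKPGTAF",
]
for residue, row in zip(family[0], profile(family)):
    print(residue, " ".join(str(n) if n else "." for n in row))
```

Feeding these counts (or the corresponding frequencies) to the network instead of a single one-hot letter is what brings the evolutionary information into the prediction.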
From sequence to profile
[Figure: pipeline from the sequence U to the PHD input]
1. BLAST: search of the sequence data bank; hits: protein A, protein B, ..., protein N
2. MaxHom with filter (plot: sequence identity, 25% to 100%, against number of residues aligned, marker at 80); remaining: protein A, protein C, ..., protein M
3. extract alignment => PHD
B Rost (1996) Methods Enzymol 266:525-39
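The filter step can be illustrated with a length-dependent identity cut-off. This is a hedged Python sketch: the slide only shows a curve that flattens near 25% at about 80 aligned residues, so the exact formula below (the curve of Sander & Schneider, 1991) is an assumption, and the hit list, `identity_threshold` and `filter_hits` are hypothetical names and data for illustration.

```python
def identity_threshold(aligned_length):
    """Minimum pairwise sequence identity (%) for a given number of aligned residues.

    ASSUMPTION: length-dependent curve after Sander & Schneider (1991);
    the slide itself does not state the formula.
    """
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    if aligned_length > 80:
        return 24.8
    # Short alignments need a higher identity; cap at 100%.
    return min(100.0, 290.15 * aligned_length ** -0.562)


def filter_hits(hits, margin=0.0):
    """Keep the names of hits whose identity reaches the threshold plus `margin`."""
    return [
        name
        for name, identity, aligned_length in hits
        if identity >= identity_threshold(aligned_length) + margin
    ]


# Hypothetical hits: (name, percentage sequence identity, residues aligned).
hits = [
    ("protein A", 62.0, 120),
    ("protein B", 22.0, 150),
    ("protein C", 45.0, 40),
    ("protein D", 30.0, 30),
]
print(filter_hits(hits))
```

A positive `margin` makes the filter stricter, which trades alignment depth for fewer false positives.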
PHDsec: more details
input local in sequence: local alignment, 13 adjacent residues
:::AAAAA.LLLLIIAAGCCSGVV:::
   A   C   L   I   G   S   V ins del cons
 100   0   0   0   0   0   0   0   0 1.17
 100   0   0   0   0   0   0  33   0 0.42
   0   0 100   0   0   0   0   0  33 0.92
   0   0  33  66   0   0   0   0   0 0.74
  66   0   0   0  33   0   0   0   0 1.17
   0  66   0   0   0  33   0   0   0 0.74
   0   0   0  33   0   0  66   0   0 0.48
input global in sequence: global statistics of the whole protein
• %AA: percentage of each amino acid in protein
• Length: length of protein (≤60, ≤120, ≤240, >240)
• ∆ N-term: distance: centre, N-term (≤40, ≤30, ≤20, ≤10)
• ∆ C-term: distance: centre, C-term (≤40, ≤30, ≤20, ≤10)
first level: sequence-to-structure
second level: structure-to-structure
[Figure: two stacked networks, each with input layer, hidden layer and output layer; unit counts labelled 21+3 and 4+1 for the local input and 20, 4, 4, 4 for the global input; output units H, L, E with example values 0.5, 0.1, 0.4]
B Rost (1996) Methods Enzymol 266:525-39
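The local input (13 adjacent residues, 21 units each) can be sketched as a sliding window over the profile. This is a minimal Python illustration, assuming a spacer unit marks window positions that fall outside the protein; the extra ins/del/cons units (the "+3") are left out, and `encode_window` is a hypothetical helper, not the PHD code.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
WINDOW = 13  # number of adjacent residues named on the slide


def encode_window(profile, centre, window=WINDOW):
    """Flatten the profile rows around `centre` into one input vector.

    `profile` holds one 20-element frequency row per residue. Each window
    position contributes 21 values: the 20 frequencies plus a spacer unit
    that is 1.0 only when the position lies outside the protein.
    """
    half = window // 2
    vector = []
    for pos in range(centre - half, centre + half + 1):
        if 0 <= pos < len(profile):
            vector.extend(profile[pos])
            vector.append(0.0)
        else:
            vector.extend([0.0] * len(AMINO_ACIDS))
            vector.append(1.0)
    return vector


# Toy profile of five residues, each fully conserved as alanine.
toy_profile = [[1.0] + [0.0] * 19 for _ in range(5)]
first = encode_window(toy_profile, centre=0)
print(len(first))
```

Padding with a spacer keeps the input size constant near the termini, so the same network can be applied to every residue.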
Jury
single network vs. jury decision
[Figure: architecture 1, architecture 2, architecture 3, architecture 4; centre of mass = jury over 1-4]
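The jury idea (centre of mass over architectures 1-4) amounts to averaging the output units of several networks before picking the maximal unit. A minimal Python sketch follows; the four output triples are invented for illustration, and `jury_decision` is a hypothetical name.

```python
STATES = ("H", "E", "L")


def jury_decision(network_outputs):
    """Average the H/E/L outputs over all networks and pick the largest unit."""
    if not network_outputs:
        raise ValueError("need at least one network output")
    n = len(network_outputs)
    mean = {s: sum(out[s] for out in network_outputs) / n for s in STATES}
    winner = max(STATES, key=lambda s: mean[s])
    return winner, mean


# Hypothetical outputs of four differently trained networks for one residue.
outputs = [
    {"H": 0.5, "E": 0.4, "L": 0.1},
    {"H": 0.3, "E": 0.6, "L": 0.1},
    {"H": 0.2, "E": 0.7, "L": 0.1},
    {"H": 0.6, "E": 0.3, "L": 0.1},
]
state, mean = jury_decision(outputs)
print(state, mean)
```

In this toy case two single networks would predict H, while the averaged jury predicts E.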
PROFsec: Evolutionary information + more
B Rost (2001) J Struct Biol 134:204-18
Spectrin SH3 (Src homology 3) domain
HEADER CYTOSKELETON
COMPND ALPHA SPECTRIN (SH3 DOMAIN)
SOURCE CHICKEN (GALLUS GALLUS) BRAIN
AUTHOR M.NOBLE,R.PAUPTIT,A.MUSACCHIO,M.SARASTE
[Figure labels: 59%, 65%, 72%]
Prediction accuracy varies!
[Figure: histogram. x-axis: Per-residue accuracy (Q3), 0 to 100; y-axis: Number of protein chains, 0 to 70. <Q3>=72.3% ; sigma=10.5%. Labelled chains: 1spf, 1bct, 1stu, 3ifm, 1psm]
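Per-residue accuracy (Q3) is the percentage of residues whose three-state prediction matches the observed state, and the slide reports its mean and spread over protein chains. A minimal Python sketch follows; the two observed/predicted pairs are invented, and using the population standard deviation for sigma is an assumption.

```python
from statistics import mean, pstdev


def q3(observed, predicted):
    """Percentage of residues predicted in the correct state (H, E or L)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("strings must be non-empty and of equal length")
    correct = sum(o == p for o, p in zip(observed, predicted))
    return 100.0 * correct / len(observed)


# Hypothetical (observed, predicted) state strings for two chains.
chains = [
    ("HHHHEEEELL", "HHHEEEEELL"),
    ("EEEELLLL", "EEHHLLLL"),
]
per_chain = [q3(obs, pred) for obs, pred in chains]
print(per_chain, mean(per_chain), pstdev(per_chain))
```

Averaging per chain, as here, weights short and long proteins equally; pooling all residues first would give a different number.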
Stronger predictions more accurate!
[Figure: Q3 per protein (0 to 100) against reliability index averaged over protein (3 to 9); legend: Q3 per protein, fit: Q3fit = 21 + 8.7 * Q3]
[Figure: network input window, residue (state): D (L), R (E), Q (E), G (E), F (E), V (E), P (E), A (H), A (H), Y (H), V (E), K (E), K (E); input alphabet: ACDEFGHIKLMNPQRSTVWY. ; output units: H, E, L]
Two example outputs: H=0.5 E=0.4 L=0.1 and H=0.8 E=0.1 L=0.1
[Inset: histogram of per-residue accuracy (Q3), <Q3>=72.3% ; sigma=10.5%, chains 1spf, 1bct, 1stu, 3ifm, 1psm]
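The two example outputs on this slide differ in how clearly one unit wins. A common way to turn that into a reliability index (RI) from 0 to 9 is to scale the gap between the two largest outputs; the formula below is an assumption, since the slide does not define RI, and `reliability_index` is a hypothetical name.

```python
def reliability_index(outputs):
    """Scale the gap between the two largest output units to an integer 0..9.

    ASSUMPTION: RI = integer part of 10 * (largest - second largest).
    """
    ranked = sorted(outputs.values(), reverse=True)
    gap = ranked[0] - ranked[1]
    # Round before truncating so that 0.5 - 0.4 counts as 0.1
    # despite floating-point noise.
    return min(9, int(round(10 * gap, 6)))


weak = {"H": 0.5, "E": 0.4, "L": 0.1}    # first example on the slide
strong = {"H": 0.8, "E": 0.1, "L": 0.1}  # second example on the slide
print(reliability_index(weak), reliability_index(strong))
```

Both outputs predict H, but only the second one does so with a large margin, which is what "stronger predictions" refers to.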
Correct prediction of correctly predicted residues
[Figure: overall per-residue accuracy (70 to 100) against percentage of residues predicted (0 to 100); one curve each for PHDsec, PHDacc and PHDhtm; points along the curves labelled RI=0, RI=4, 7, RI=9]
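The curves on this slide trade coverage for accuracy: keeping only residues above a reliability cut-off predicts fewer residues but gets more of them right. A minimal Python sketch, using invented (RI, correct) pairs and a hypothetical helper `accuracy_vs_coverage`:

```python
def accuracy_vs_coverage(residues):
    """Return (cut-off, % of residues predicted, % accuracy) for cut-offs 0..9.

    `residues` is a list of (reliability_index, is_correct) pairs.
    Cut-offs that keep no residue are skipped.
    """
    total = len(residues)
    rows = []
    for cutoff in range(10):
        kept = [ok for ri, ok in residues if ri >= cutoff]
        if not kept:
            continue
        coverage = 100.0 * len(kept) / total
        accuracy = 100.0 * sum(kept) / len(kept)
        rows.append((cutoff, coverage, accuracy))
    return rows


# Hypothetical per-residue results: (reliability index, prediction correct?).
residues = [
    (9, True), (8, True), (7, True), (5, True), (5, False),
    (3, True), (2, False), (1, False), (0, True), (0, False),
]
rows = accuracy_vs_coverage(residues)
for cutoff, coverage, accuracy in rows:
    print(f"RI>={cutoff}: {coverage:.0f}% predicted, {accuracy:.0f}% correct")
```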
BAD errors are frequent!
[Figure: histogram. x-axis: BAD error (H for E, or E for H), 0 to 40; y-axis: Number of protein chains, 0 to 350. <BAD>=4.0% ; sigma=5.9%]
[Figure: Percentage of errors (0 to 20) against cumulative percentage of protein chains (0 to 100)]
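A BAD error is a confusion between helix and strand (H for E, or E for H); confusions with L do not count. A minimal Python sketch, assuming the percentage is taken over all residues of the chain (the slide does not state the denominator); the strings are invented and `bad_error` is a hypothetical name.

```python
BAD_PAIRS = {("H", "E"), ("E", "H")}


def bad_error(observed, predicted):
    """Percentage of residues where helix and strand are confused."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("strings must be non-empty and of equal length")
    bad = sum((o, p) in BAD_PAIRS for o, p in zip(observed, predicted))
    return 100.0 * bad / len(observed)


# Hypothetical chain: three H/E confusions and one E predicted as L.
observed = "HHHHEEEELL"
predicted = "HHEEEEHLLL"
print(bad_error(observed, predicted))
```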
False prediction for engineered proteins!
GB1: IgG-binding domain of protein G (CHAMELEON) Kim & Berg, Nature, 366, 267-270, 1993
       ....,....1....,....2....,....3....,....4....,....5....,..
AA     TTYKLILNGKTLKGETTTEAVDAATAEKVFKQYANDNGVDGEWTYDDATKTFTVTEK
[Column spacing of the DSSP and PHD rows was lost; segments are listed in order]
DSSP   EEEEEEE EEEEEEEEE HHHHHHHHHHHHHHHHH EEEEEEE EEEEEEEE
PHD 30 EEEEEE E EEHHHHHHHHHHHHHHEEE EEEEEE EEEEE
PHD no EEEEEE EEEEEHHHHHHHHHHHHHHHH EEEEE EEEEEE
AATAEKVFKQY -> AWTVEKAFKTF
PHD 30 EEEEEE EEEEEEE HHHHHHHHHEEE EEEE EEEEEE
PHD no EEEEEE EEEEEEHHHHHHHHHHHHHHH EEEEE EEEEEE
EWTYDDATKTF -> AWTVEKAFKTF
PHD 30 EEEEEE EEE EHHHHHHHHHHHHHHHH EEEEE EEEEEE
PHD no EEEEEE E E EHHHHHHHHHHHHHHHH HHHHHHH EEEEE
AWTVEKAFKTF HHHHH
Lecture plan (CB1: Structure)-generic
01: 2014/04/08 Tue: sorry
02: 2014/04/10 Thu: welcome: who we are
03: 2014/04/15 Tue: Intro I - acids/structure (Andrea Schafferhans)
04: 2014/04/17 Thu: SKIP: Easter vacation
05: 2014/04/22 Tue: SKIP: Easter vacation
06: 2014/04/24 Thu: Intro 2 - domains
07: 2014/04/29 Tue: Intro 3 - 3D comparisons
08: 2014/05/01 Thu: SKIP: “May day” - (NOT to be confused with “m’aidez”)
09: 2014/05/06 Tue: SKIP: student assembly (SVV)
10: 2014/05/08 Thu: Alignment 1
11: 2014/05/13 Tue: Alignment 2
12: 2014/05/15 Thu: Comparative modeling 1
13: 2014/05/20 Tue: Secondary structure prediction 1
14: 2014/05/22 Thu: Secondary structure prediction 2
15: 2014/05/27 Tue: 1D: Secondary structure prediction 1
16: 2014/05/29 Thu: SKIP: holiday (Ascension Day)
17: 2014/06/03 Tue: SKIP: no room
18: 2014/06/05 Thu: 1D: Secondary structure prediction 2
19: 2014/06/10 Tue: SKIP: Whitsun holidays
20: 2014/06/12 Thu: 1D: Transmembrane helix prediction
21: 2014/06/17 Tue: Nobel prize symposium
22: 2014/06/19 Thu: SKIP: Corpus Christi (Fronleichnam)
23: 2014/06/24 Tue: 1D: Transmembrane strand prediction, solvent accessibility
24: 2014/06/26 Thu: 2D prediction
25: 2014/07/01 Tue: 3D prediction/wrap up
26: 2014/07/03 Thu: wrap up again
27: 2014/07/08 Tue: examen, no lecture
28: 2014/07/10 Thu: no lecture