H-bonds
D-H … A. Assignment: an H-bond is assigned if the D…A distance is < 3.7 Å in the crystal structure (normally 2.7-3.1 Å).
Energy of stabilization: -12 to -40 kJ/mol. Tends to be linear. Only weakly stabilizes proteins. (!?)
A survey over H-bonds in globular proteins (J. Mol. Biol. (1992) 226, 1143)
Local H-bonds? The authors drew this conclusion: “Most H-bonds are local.” It should be reviewed more critically.
Most H-bonds are between backbone atoms.
Source: K. Schulten Group, University of Illinois Urbana-Champaign
Protein folding, dynamics and structural evolution
Chapter 9
Questions
How does a peptide sequence find its native, functional conformation? Is there a set of fundamental principles?
We discussed several factors determining protein structure. Can we yet *predict* the structure from sequence information?
Determinants of Protein Folding
We will discuss the following factors, as listed in V&V chapter 9.
• Space packing.
• Directed mainly by internal residues.
• Protein structures are hierarchically organized.
• Protein structures are highly adaptable.
• Secondary structure can be context dependent.
• Changing the fold of a protein.
Still, keep in mind that most of these are based on observations and deductions. Exceptions are possible. Whether the statistical methods and criteria are acceptable is another question.
Compactness
Proteins are like liquids and glasses, instead of crystalline solids.
The reverse question: does compactness serve as a factor determining protein structure?
Compactness helps, but not enough.
Folding is directed mainly by internal residues
Mutations that change surface residues are accepted more frequently and are less likely to affect protein conformations than are changes of internal residues.
This is consistent with the idea of Hydrophobic force-driven folding.
Protein structures are quite “resistant” to mutations
A large number of single residue mutations do not yield a very different structure.
A comprehensive study was done on phage T4 lysozyme by B. W. Matthews.
Homologous proteins share some sequence identity, and they are often structurally similar.
Table 9-1 Propensities and Classifications of Amino Acid Residues for Helical and Sheet Conformations. (Voet Biochemistry 3e, © 2004 John Wiley & Sons, Inc., page 300)
Secondary structure
Secondary structure prediction can be done with more sophisticated algorithms.
Artificial intelligence methods such as neural networks or support vector machines.
Basically, they look at a local sequence and recognize its pattern.
Usually such methods need a training set, i.e. they are knowledge-based methods.
Figure 9-6 NMR structure of protein GB1. (Voet Biochemistry 3e, page 280)
• Green: residues 23-33.
• Cyan: residues 42-53.
• Chm-alpha: a new sequence replaces the green part.
• Chm-beta: the same new sequence replaces the cyan part.
• Both are structurally similar to native GB1.
• The same sequence can form either an alpha helix or a beta sheet, depending on its context.
Homology
Proteins that share some sequence identity may be structurally similar. This is one piece of evidence that supports evolution.
Proteins with as little as 20% sequence identity may have similar structures.
How much must be changed for a protein to assume a different structure?
Figure 9-7 X-Ray structure of Rop protein, a homodimer of αα motifs that associate to form a 4-helix bundle. (Voet Biochemistry 3e, page 281)
• GB1 and Rop are structurally different.
• 50% of the residues of GB1 are changed, yielding a new polypeptide that assumes a Rop-like structure.
• This new peptide has 41% sequence identity with native Rop.
• This illustrates the idea of protein design and engineering.
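The percentages quoted above are sequence identities between aligned sequences. A minimal sketch of how such a number can be computed is shown here. The function name `percent_identity`, the gap character "-", the choice of denominator (aligned, non-gap positions), and the short example sequences are all assumptions for illustration; they are not the GB1 or Rop sequences, and other conventions for the denominator exist.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity between two already-aligned sequences of equal length.

    Positions where either sequence has a gap are skipped (one of several
    possible conventions; an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    aligned = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        aligned += 1
        if a == b:
            matches += 1
    if aligned == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * matches / aligned


# Hypothetical aligned fragments, for illustration only.
print(percent_identity("ACDEFGHIKL", "ACDQFGHVKL"))
```

With real data the two sequences would first be aligned by an alignment program; this sketch only counts matches in a given alignment.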
Protein Folding
Levinthal’s paradox
Suppose each residue has only two degrees of freedom, the backbone torsion angles (φ, ψ).
Assume each angle can take only 3 stable values. For a protein of n residues this leads to $3^{2n}$ possible conformations.
Suppose a protein can explore $10^{13}$ conformations per second (10 per picosecond).
A random search would still require an astronomical amount of time to fold a protein.
This is impossible, so a protein must fold in a way that does not randomly explore every possible conformation.
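The arithmetic behind the paradox can be checked with a few lines of code. This is a minimal sketch using the numbers from the slide (3 values per angle, 2 angles per residue, $10^{13}$ conformations per second); the chain length of 100 residues and the function name `levinthal_estimate` are assumptions chosen for illustration. Logarithms are used so the numbers stay readable.

```python
import math


def levinthal_estimate(n_residues, states_per_angle=3, angles_per_residue=2,
                       rate_per_second=1e13):
    """Return (log10 of the number of conformations, log10 of the search time in seconds)."""
    log10_conformations = angles_per_residue * n_residues * math.log10(states_per_angle)
    log10_seconds = log10_conformations - math.log10(rate_per_second)
    return log10_conformations, log10_seconds


# Assumed example: a 100-residue protein.
log_conf, log_sec = levinthal_estimate(100)
print(f"about 10^{log_conf:.1f} conformations, about 10^{log_sec:.1f} seconds to try them all")
```

For comparison, the age of the universe is of the order of $10^{17}$ to $10^{18}$ seconds, so an exhaustive random search is out of the question even for a small protein.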
Molten Globule
Much of the secondary structure that is present in a native protein forms within a few milliseconds. This early stage is called hydrophobic collapse.
The resulting state is called a “molten globule”:
• Slightly (5-15% in radius) larger than the native conformation.
• A significant amount of secondary structure has formed.
• Side chains are still not ordered/packed.
• Structural fluctuations are much larger.
• Not very thermodynamically stable.
Are proteins sticky tapes? Are they simply heteropolymers that like to form H-bonds and hydrophobic interactions with each other?
Proteins are not just any random heteropolymers
• By observation:
– every protein has a very stable native structure,
– while polymers are usually random in their conformation.
• Interesting observation for simple models: the “designability”. The following material is from:
– R. Helling et al., “The designability of protein structures”, J. Mol. Graphics and Modelling, 19, 157 (2001).
– J. Miller et al., “Emergence of highly designable protein-backbone conformations in an off-lattice model”, Proteins, 47, 506 (2002).
– Steven S. Plotkin and Jose N. Onuchic, “Understanding protein folding with energy landscape theory Part I: Basic concepts”, Quart. Rev. of Biophys., 35, 2 (2002), 111.
A 3D lattice HP model
• Enumerate all $2^{27}$ possible sequences.
• Each sequence has a lowest energy structure.
• Some sequences share the same structure. Count the number of sequences per unique structure, $N_S$.
• Plot the distribution of $N_S$.
• Assume only two kinds of residue, H and P.
• Well studied before. $E_{HH} = -2.3$, $E_{HP} = -1$, $E_{PP} = 0$.
• (a) Histogram of $N_S$ for the 3 × 3 × 3 system. (b) Average energy gap between the ground state and the first excited state versus $N_S$ for the 3 × 3 × 3 system.
$N_S$: number of sequences corresponding to a structure S
3794 different sequences share ONE structure!
On average, these structures have large energy gaps between the lowest energy structure and the next lowest one.
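The enumeration procedure can be sketched on a much smaller system than the 3 × 3 × 3 cube. The code below is a toy version under several assumptions that differ from the cited study: it uses short chains on a 2D square lattice, it considers all self-avoiding walks rather than only compact ones, and it identifies a "structure" with its contact map (the set of non-bonded nearest-neighbor pairs). Only the contact energies are taken from the slide; all function names are hypothetical.

```python
from collections import Counter
from itertools import product

# Contact energies from the slide: E_HH = -2.3, E_HP = -1, E_PP = 0.
CONTACT_ENERGY = {("H", "H"): -2.3, ("H", "P"): -1.0,
                  ("P", "H"): -1.0, ("P", "P"): 0.0}
MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def contact_maps(n):
    """Distinct contact maps of all self-avoiding walks with n beads (2D square lattice)."""
    maps = set()

    def grow(path):
        if len(path) == n:
            index = {site: i for i, site in enumerate(path)}
            contacts = set()
            for i, (x, y) in enumerate(path):
                for dx, dy in MOVES:
                    j = index.get((x + dx, y + dy))
                    if j is not None and j > i + 1:  # neighbors in space, not along the chain
                        contacts.add((i, j))
            maps.add(frozenset(contacts))
            return
        x, y = path[-1]
        for dx, dy in MOVES:
            nxt = (x + dx, y + dy)
            if nxt not in path:  # self-avoidance
                grow(path + [nxt])

    grow([(0, 0)])
    return maps


def energy(seq, cmap):
    """Sum of contact energies of sequence seq folded into contact map cmap."""
    return sum(CONTACT_ENERGY[(seq[i], seq[j])] for i, j in cmap)


def designability(n):
    """Count, per structure, the sequences that have it as their unique ground state (N_S)."""
    maps = list(contact_maps(n))
    counts = Counter()
    for seq in product("HP", repeat=n):
        energies = [energy(seq, m) for m in maps]
        lowest = min(energies)
        ground = [m for m, e in zip(maps, energies) if abs(e - lowest) < 1e-9]
        if len(ground) == 1:  # sequences with degenerate ground states are discarded
            counts[ground[0]] += 1
    return counts


# Assumed example size: 8-bead chains (256 sequences).
for cmap, n_s in designability(8).most_common(3):
    print(n_s, sorted(cmap))
```

The printed values of N_S play the role of the histogram entries in panel (a); a structure with a large N_S is a highly designable one in this toy model.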
Some structures are very “popular” for many different sequences
Such a property does not depend on the model used
• Very similar behavior is seen in the 2D 6×6 HP model and in 2D or 3D models with 20 different amino acids.
Off the lattice: 23-mer, 3-state model
Zinc finger of 1PSV
Off-lattice model: results
• a: Backbone configuration of the 11th most designable 23-mer structure
• b: Backbone configuration of the zinc finger 1NC8, truncated to 23 amino acids.
What does it mean?
Sequence Structure
The energy landscape
Proteins are in a special subset of heteropolymers
• Such that the number of possible structures is greatly reduced.
• Evolution!
• Therefore protein structure prediction is not as hard as it appears (still a hard problem, though).
• That also explains why knowledge-based methods work.
• Nevertheless, the tools developed offer valuable clues for the structure of a new protein.
Computer Simulation
• Goals:
1) Structure prediction: from primary sequence to tertiary structure (so that we can infer its function).
2) Known structure (from X-ray, NMR, or another simulation). Want: dynamics (how it moves at room temperature, with a ligand, or with a mutation).
• (1) above is difficult but doable. We will discuss some of the methods.
• (2) is often done with the same methods developed for (1).
Computers
• Deal with numbers and logical operations.
• Need some “principles” (written as mathematical equations).
• For protein simulations there are different approaches:
1. Physics-based
2. Knowledge-based
Physics
• Laws for particle motion and interaction
– Classical Mechanics (Newton’s Equation of motion)
– Quantum Mechanics (Schrödinger’s Equation)
• Many developments in physical chemistry can be used.
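Newton’s equation of motion:
$$\mathbf{F} = m\mathbf{a}$$
Schrödinger’s equation, time-independent:
$$\hat{H}\,\psi(\mathbf{r}) = E\,\psi(\mathbf{r})$$
Schrödinger’s equation, time-dependent:
$$i\hbar\,\frac{\partial}{\partial t}\,\psi(\mathbf{r},t) = \hat{H}\,\psi(\mathbf{r},t)$$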
Physics-Based protein simulation
• A full quantum mechanics (QM) calculation on an entire protein is not feasible.
• QM can be applied to a small set of atoms.
– Modeling of an active site (other atoms: not treated, or treated as a dielectric continuum).
– Can get total energies (binding vs. non-binding, pKa, etc.) and the wave function (charge distribution).
– QM/MM simulations (other atoms: treated with Molecular Mechanics).
An example of using QM (Case et al., J. Biol. Inorg. Chem. 2002, 7, 632)
• Rieske iron-sulfur protein in bc-type cytochromes
• Calculations based on density functional theory (DFT) performed.
• pKa and redox potentials can be obtained from total energies of several states.
• Changes in pKa (proton binding) and redox potential (electron binding) are strongly coupled, as observed in experiments.
Using classical mechanics for protein structure and dynamics
• Ignore electrons, assigning (empirical) force fields for atoms (or clusters of atoms).
• A very simple potential:
Force fields: bond stretching and bending
A. R. Leach, “Molecular Modelling”, 1996
Torsional potential
A. R. Leach, “Molecular Modelling”, 1996
Ab initio QM results
3 point charges
N2 molecules: known to have an electric quadrupole moment
5 point charges
Polarization: many-body effect
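The force-field terms named on these slides (bond stretching, angle bending, torsions, point charges) are commonly written with simple functional forms. The sketch below assumes the textbook choices: harmonic bonds and angles, a cosine torsion term, a Lennard-Jones 12-6 term for van der Waals interactions, and Coulomb's law between fixed point charges. These forms, the function names, and the units (Å, kJ/mol, elementary charges) are assumptions of this example, not necessarily the exact potential shown on the slide, and polarization is not included.

```python
import math

# Coulomb constant in kJ mol^-1 Å e^-2 (distances in Å, charges in units of e).
COULOMB_CONSTANT = 1389.35


def bond_energy(length, l0, k):
    """Harmonic bond stretching around the reference length l0."""
    return 0.5 * k * (length - l0) ** 2


def angle_energy(theta, theta0, k):
    """Harmonic angle bending around the reference angle theta0 (radians)."""
    return 0.5 * k * (theta - theta0) ** 2


def torsion_energy(omega, barrier, n, gamma):
    """Periodic torsional term with multiplicity n and phase gamma (radians)."""
    return 0.5 * barrier * (1.0 + math.cos(n * omega - gamma))


def lennard_jones(r, epsilon, sigma):
    """12-6 Lennard-Jones energy: well depth epsilon, zero crossing at r = sigma."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def coulomb(r, qi, qj):
    """Electrostatic energy between two fixed point charges at distance r."""
    return COULOMB_CONSTANT * qi * qj / r


# Illustrative (made-up) parameters: a slightly stretched bond plus one nonbonded pair.
total = (bond_energy(1.55, 1.53, 2500.0)
         + lennard_jones(4.0, 0.4, 3.4)
         + coulomb(4.0, 0.3, -0.3))
print(f"toy energy: {total:.2f} kJ/mol")
```

A full force field sums such terms over all bonds, angles, torsions, and nonbonded atom pairs to give $V(\mathbf{r}^N)$.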
Physics based: methods
• Energy Minimization
– Steepest descent
– Conjugate gradient
• Monte Carlo Simulation
– Random sampling
– Simulated annealing
• Molecular Dynamics
– Compute conformational changes.
Energy surface of two torsional angles
• Very shallow valleys.
• Similar in energy.
• Determining the (φ, ψ) conformations of peptide backbones is even more complicated.
Trapping at a local minimum
• Standard practice: use Monte Carlo (random sampling) with simulated annealing techniques.
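The trapping problem and the annealing remedy can be illustrated on a one-dimensional toy energy function. In this sketch the double-well function, the step sizes, the geometric cooling schedule, and the function names are all assumptions made for illustration; a real application would use a force field over many coordinates. Steepest descent started in the shallower well stays there, while Metropolis Monte Carlo with a slowly decreasing temperature can cross the barrier.

```python
import math
import random


def energy(x):
    """Toy double well: global minimum near x = -1, local minimum near x = +1."""
    return (x * x - 1.0) ** 2 + 0.3 * x


def gradient(x):
    """Analytic derivative of the toy energy."""
    return 4.0 * x * (x * x - 1.0) + 0.3


def steepest_descent(x, step=0.01, n_steps=2000):
    """Follow the negative gradient; ends in the nearest minimum."""
    for _ in range(n_steps):
        x -= step * gradient(x)
    return x


def simulated_annealing(x, t_start=2.0, t_end=0.01, n_steps=20000,
                        max_move=0.5, seed=0):
    """Metropolis Monte Carlo with geometric cooling; returns the best point visited."""
    rng = random.Random(seed)
    e = energy(x)
    best_x, best_e = x, e
    for k in range(n_steps):
        temperature = t_start * (t_end / t_start) ** (k / (n_steps - 1))
        trial = x + rng.uniform(-max_move, max_move)
        e_trial = energy(trial)
        # Metropolis criterion: always accept downhill, sometimes accept uphill.
        if e_trial <= e or rng.random() < math.exp(-(e_trial - e) / temperature):
            x, e = trial, e_trial
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e


x_sd = steepest_descent(1.2)
x_sa, e_sa = simulated_annealing(1.2)
print(f"steepest descent:    x = {x_sd:.3f}, E = {energy(x_sd):.3f}")
print(f"simulated annealing: x = {x_sa:.3f}, E = {e_sa:.3f}")
```

Both runs start at the same point; only the annealing run is able to accept uphill moves, which is what lets it leave the local minimum.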
Using classical mechanics for protein structure and dynamics
• With a force field $V(\mathbf{r}^N)$, look for the lowest energy structure.
• Find the structure that gives the energy minimum. Hopefully this can be done with a finite amount of computer resources, and hopefully this energy minimum gives the desired native protein structure.
• For protein dynamics: calculate trajectories (Newton’s equation) under thermal conditions and find the averaged physical quantities.
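Calculating a trajectory from Newton's equation means integrating it numerically in small time steps. The sketch below uses the velocity Verlet scheme, a common choice in molecular dynamics, on a single particle attached to a harmonic spring. The integrator choice, the spring system, the reduced units, and the function names are assumptions for illustration; no thermostat is included, so this is a constant-energy trajectory rather than one at a fixed temperature.

```python
def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate Newton's equation F = m a; returns the list of (x, v) after each step."""
    trajectory = []
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # second half-step velocity update
        trajectory.append((x, v))
    return trajectory


# Toy system in reduced units: one particle on a harmonic spring.
K = 1.0
MASS = 1.0


def spring_force(x):
    return -K * x


def total_energy(x, v):
    return 0.5 * MASS * v * v + 0.5 * K * x * x


trajectory = velocity_verlet(1.0, 0.0, spring_force, MASS, dt=0.01, n_steps=1000)

# An "averaged physical quantity": the time-averaged potential energy.
mean_potential = sum(0.5 * K * x * x for x, _ in trajectory) / len(trajectory)
print(f"final total energy:    {total_energy(*trajectory[-1]):.6f}")
print(f"mean potential energy: {mean_potential:.4f}")
```

For a protein the same loop runs over all atoms, with the force obtained from the gradient of the force field.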
Questions to ask:
• Is the energy function correct?
– Precise enough to discriminate the native structure from other, non-native structures.
– Yet simple enough for computers to evaluate efficiently.
• Is the conformational search good enough to cover the global minimum?
Take-home messages: Physics-based methods
• Protein folding without any prior knowledge about protein structure is a difficult task.
• Protein structure prediction is often quoted as an “NP-complete problem”, i.e. with known algorithms the worst-case cost of the search grows exponentially as the number of residues increases.
• Structures of small proteins (~$10^1$ - $10^2$ a.a.) can be solved in principle.