New Strategies for Protein Folding
Joseph F. Danzer, Derek A. Debe, Matt J. Carlson, William A. Goddard III
Materials and Process Simulation Center, California Institute of Technology
…-HIS-CYS-ALA-ALA-GLY-GLU-ASP-...
Protein Tertiary Structure Prediction
Given a Protein’s Primary Structure (Amino Acid Sequence),
Can We Determine Its 3D Structure?
What Local Structural Units Does It Form? α-Helix (Cylinder), β-Sheets (Ribbon)
How Do Those Structural Units Pack Together?
With a 6-state representation per residue, there are 6^50, or about 10^38, states for a 50-residue protein.
Assuming the protein may sample 1 state/ps, an exhaustive search would take about 10^19 years to fold.
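The estimate above can be checked with a few lines of arithmetic. This is only a sketch of the calculation; the function name and the seconds-per-year constant are assumptions introduced for illustration.

```python
# Size of the conformational space and time for an exhaustive search.
STATES_PER_RESIDUE = 6        # from the slide
N_RESIDUES = 50               # from the slide
SAMPLING_RATE_PER_S = 1e12    # 1 state per picosecond, from the slide
SECONDS_PER_YEAR = 3.156e7    # approximate value (assumption of this sketch)


def exhaustive_search(states_per_residue, n_residues, rate_per_s):
    """Return (number of states, years needed to visit each state once)."""
    n_states = states_per_residue ** n_residues
    years = n_states / rate_per_s / SECONDS_PER_YEAR
    return n_states, years


n_states, years = exhaustive_search(
    STATES_PER_RESIDUE, N_RESIDUES, SAMPLING_RATE_PER_S
)
print(f"{n_states:.1e} states, {years:.1e} years")
```

Both numbers land within the orders of magnitude quoted on the slide.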
Structure Prediction is a Two-Fold Problem
•Conformational Search Problem
–Given the exponentially large number of possible states, how do we generate a correct state?
•Recognition Problem
–How do we differentiate correct from incorrect folds?
Restrained Generic Protein (RGP) Direct Monte Carlo
A highly efficient, off-lattice residue buildup procedure for generating ensembles of protein conformations that comply with a set of user-defined distance restraints.
Generic Protein Model
•Each residue is a 5.5 Å sphere
•Fixed geometry connects residues: l = 3.8 Å; bond angle = 120°; dihedral angle typically = 0°, 60°, 120°, 180°, 240°, 300° (6 states per residue)
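A minimal sketch of an off-lattice buildup with this geometry follows. It is not the RGP code: the function names, the dihedral sign convention, the random choice among the six states, and the reading of 5.5 Å as a minimum separation between non-bonded residues are all assumptions made for illustration.

```python
import math
import random

BOND_LENGTH = 3.8                     # Å, from the slide
BOND_ANGLE = math.radians(120.0)      # from the slide
DIHEDRALS = [math.radians(a) for a in (0, 60, 120, 180, 240, 300)]
CLASH_DISTANCE = 5.5                  # Å; ASSUMPTION: minimum non-bonded separation


def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def unit(u):
    norm = math.sqrt(sum(a * a for a in u))
    return tuple(a / norm for a in u)


def place_next(a, b, c, dihedral):
    """Place a residue after c with fixed bond length and bond angle."""
    bc = unit(sub(c, b))
    n = unit(cross(sub(b, a), bc))    # normal to the a-b-c plane
    m = cross(n, bc)                  # completes the local frame
    x = -BOND_LENGTH * math.cos(BOND_ANGLE)
    y = BOND_LENGTH * math.sin(BOND_ANGLE) * math.cos(dihedral)
    z = BOND_LENGTH * math.sin(BOND_ANGLE) * math.sin(dihedral)
    return tuple(c[k] + x * bc[k] + y * m[k] + z * n[k] for k in range(3))


def build_chain(n_residues, rng):
    """Grow one chain; return None if every dihedral state clashes."""
    second = (BOND_LENGTH, 0.0, 0.0)
    third = (second[0] - BOND_LENGTH * math.cos(BOND_ANGLE),
             BOND_LENGTH * math.sin(BOND_ANGLE), 0.0)
    chain = [(0.0, 0.0, 0.0), second, third]
    while len(chain) < n_residues:
        for dihedral in rng.sample(DIHEDRALS, len(DIHEDRALS)):
            cand = place_next(chain[-3], chain[-2], chain[-1], dihedral)
            # Compare against every residue except the bonded neighbour.
            if all(math.dist(cand, p) >= CLASH_DISTANCE for p in chain[:-1]):
                chain.append(cand)
                break
        else:
            return None               # dead end: all six states clash
    return chain


def generate_chain(n_residues, seed=0):
    """Retry the buildup until one clash-free chain is produced."""
    rng = random.Random(seed)
    while True:
        chain = build_chain(n_residues, rng)
        if chain is not None:
            return chain


chain = generate_chain(20, seed=1)
```

Because each residue is placed directly in Cartesian space from the three residues before it, every chain produced this way has exact bond lengths and bond angles, and only the dihedral state varies.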
Restraint Implementation
[Figure: residues i-1, i, and i+4 shown in the (z, r) plane]
•At residue addition step i, the maximal position of residue i+n in the (z, r) plane is known.
•This leads to a simple set of trigonometric conditions for restraint satisfaction.
•Satisfies pairwise restraints with >90% efficiency and negligible computational cost.
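The slide does not spell out the trigonometric conditions, so the sketch below substitutes a cruder test based on the triangle inequality: a chain of n bonds of length l can span at most n·l, so a candidate position can be discarded as soon as a restrained partner can no longer come within range. The function name and arguments are hypothetical, and this bound is looser than the (z, r) conditions the slide refers to.

```python
import math

BOND_LENGTH = 3.8  # Å, from the generic protein model


def restraint_still_reachable(partner_pos, candidate_pos, bonds_remaining, d_max):
    """Return False if the restraint can no longer be satisfied.

    partner_pos:     position of the already placed restrained residue
    candidate_pos:   trial position of the residue being added
    bonds_remaining: bonds between the trial residue and the residue that
                     must end up within d_max of partner_pos
    """
    gap = math.dist(partner_pos, candidate_pos)
    # The future residue can approach the partner by at most
    # bonds_remaining * BOND_LENGTH from the candidate position.
    closest_possible = gap - bonds_remaining * BOND_LENGTH
    return closest_possible <= d_max


# A trial position 30 Å away with 4 bonds left cannot meet a 10 Å restraint.
print(restraint_still_reachable((0, 0, 0), (30, 0, 0), 4, 10.0))
# At 20 Å the restraint is still within reach.
print(restraint_still_reachable((0, 0, 0), (20, 0, 0), 4, 10.0))
```

Applying a check like this at every addition step is what lets a buildup procedure reject doomed branches early instead of discarding finished chains.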
Generate-and-Select Hierarchy
[Flowchart. Box labels: Amino Acid Sequence; Secondary Structure Prediction; Inter-Residue Restraints; RGP Ensemble Generation; Static Residue Burial Selection; Intact Peptide Backbone; Dynamic Residue Burial Selection; Additional Restraints; Additional Refinement; Local Structure Refinement. Ensemble sizes shown: <10^4, <500, <20, <10, and <5 topologies.]
LexA Repressor
S      RGP Ensemble        Selected Set            Sec. Prediction
       Size      CRMS      s     Rank   CRMS       Rank   CRMS
N/36   30,0000   6.85 Å    395   24t    7.46 Å     14t    6.67 Å
N/24   5,000     6.57 Å    209   6t     6.76 Å     2t     6.11 Å
N/12   500       6.28 Å    271   1      6.43 Å     7t     4.45 Å
N/6    -         -         44    2      6.13 Å     1t     5.76 Å
Secondary structure prediction: PHD. Burkhard Rost & Chris Sander, J. Mol. Biol. 232, 584 (1993).
Myoglobin
S      RGP Ensemble        Selected Set            Sec. Prediction
       Size      CRMS      s     Rank   CRMS       Rank   CRMS
N/12   50,000    8.95 Å    117   11     8.77 Å     5      7.01 Å
N/6    -         -         23    1      9.28 Å     1      6.30 Å
Inter-Residue Restraints
If tertiary structure is unknown, how can we generate distance restraints?
•Experimentally determined disulfide bond connectivity
•Use the PHD prediction algorithm to generate loose restraints (ref. 1)
PHD predicts whether each residue will be buried or exposed to solvent.
•Assume the residues with greatest burial form a hydrophobic core
•Generate a few loose restraints (4-10 Å) between these residues
1. Burkhard Rost & Chris Sander, J. Mol. Biol. 232, 584 (1993).
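The restraint-generation idea can be sketched as follows. The burial scores stand in for PHD output and are made up for the example; the function name, the minimum sequence separation, and the reading of "4-10 Å" as a lower and upper distance bound are assumptions, not details from the slides.

```python
from itertools import combinations


def loose_core_restraints(burial, n_core=3, min_separation=4, window=(4.0, 10.0)):
    """Pick the most buried residues and restrain every pair of them.

    burial: one predicted burial score per residue (higher = more buried).
    Returns a list of (i, j, d_min, d_max) tuples with i < j.
    """
    ranked = sorted(range(len(burial)), key=lambda i: burial[i], reverse=True)
    core = []
    for i in ranked:
        # Skip residues too close in sequence to one already chosen;
        # a restraint between near neighbours carries little information.
        if all(abs(i - j) >= min_separation for j in core):
            core.append(i)
        if len(core) == n_core:
            break
    return [(i, j, window[0], window[1])
            for i, j in combinations(sorted(core), 2)]


# Hypothetical burial scores for an 11-residue peptide.
scores = [0.9, 0.1, 0.1, 0.1, 0.1, 0.8, 0.1, 0.1, 0.1, 0.1, 0.7]
restraints = loose_core_restraints(scores)
for restraint in restraints:
    print(restraint)
```

Three core residues give three pairwise restraints, which is on the scale of the restraint counts used in the tests reported in this deck.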
Tests on two proteins (3icb, 1lea) were run using loose restraints.
Protein   # Restraints   Energy Cut-Off   # Selected Structures   # Near Native   Best CRMS
3icb      3              -26              463                     4               7.787
          1              -23              460                     2               7.827
1lea      3              -27              172                     1               8.300
          8*             -18              2242                    1               8.484
          7**            -30              110                     3               7.001
                         -27              330                     8               7.001
*All restraints were picked so that they were incorrect
**All restraints were picked so that they were correct
Local Structure Refinement
•Dynamic Monte Carlo
–Make small local deformations to the backbone structure
–Overall topology must be kept intact
–Use a simple energy function to determine whether a deformation is accepted or rejected
•Fragment Sewing
–The Isites library (ref. 1) is a database of structural fragments widely observed in the Protein Data Bank
–Based on sequence homology, Isites generates a list of fragments whose structures are likely to be found in the protein
–Local structure can be refined by sewing these fragments into the overall structure
1. C. Bystroff & D. Baker, J. Mol. Biol. 281, 565 (1998).
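The slides do not say which acceptance rule the dynamic Monte Carlo step uses. A common choice is the Metropolis criterion, sketched here purely as an assumption; the function name and the temperature parameter (in the same units as the energy) are illustrative.

```python
import math
import random


def metropolis_accept(delta_e, temperature, rng):
    """Accept or reject a trial deformation (Metropolis rule, assumed here).

    delta_e:     energy after the deformation minus energy before it
    temperature: positive value in the same units as the energy
    rng:         a random.Random instance
    """
    if delta_e <= 0:
        return True                   # downhill moves are always accepted
    # Uphill moves are accepted with probability exp(-delta_e / temperature).
    return rng.random() < math.exp(-delta_e / temperature)


rng = random.Random(0)
print(metropolis_accept(-1.0, 1.0, rng))   # downhill move
print(metropolis_accept(1e9, 1.0, rng))    # very large uphill move
```

Occasionally accepting uphill moves lets the refinement escape shallow local minima while still drifting toward lower energy.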
Dynamic Monte Carlo
Local deformations are made by modifying the position of a single residue.
The energy function properly orients side chains: hydrophilic groups point outward and hydrophobic groups point inward.
[Figure: a residue moves on a circle about an axis of rotation; the circle defines the allowed movement based on the fixed geometry of the model. Legend: Cα atoms, hydrophilic side chain, hydrophobic side chain.]
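One way to realize a single-residue move of this kind is to rotate the residue about the axis through its two chain neighbours, which keeps both bond lengths to those neighbours unchanged. The sketch below uses Rodrigues' rotation formula; taking the axis through residues i-1 and i+1 is an assumption read off the figure labels, and the function names are hypothetical.

```python
import math


def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def rotate_about_axis(point, axis_start, axis_end, angle):
    """Rotate point by angle (radians) about the line axis_start -> axis_end."""
    length = math.dist(axis_start, axis_end)
    k = tuple(a / length for a in sub(axis_end, axis_start))  # unit axis
    v = sub(point, axis_start)
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    k_cross_v = cross(k, v)
    k_dot_v = dot(k, v)
    # Rodrigues' formula: v cos(a) + (k x v) sin(a) + k (k . v)(1 - cos(a))
    rotated = tuple(v[i] * cos_a + k_cross_v[i] * sin_a
                    + k[i] * k_dot_v * (1.0 - cos_a) for i in range(3))
    return tuple(axis_start[i] + rotated[i] for i in range(3))


# Residue i sits between neighbours i-1 and i+1 (illustrative coordinates).
prev_res, this_res, next_res = (0.0, 0.0, 0.0), (3.0, 2.0, 0.0), (6.0, 0.0, 0.0)
moved = rotate_about_axis(this_res, prev_res, next_res, math.radians(90))
print(moved)
```

The moved residue stays on the circle described on the slide: its distances to both neighbours are the same before and after the rotation.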
Fragment Sewing
[Figure: a segment's original structure and its new structure after sewing, both attached to the rest of the protein.]
Overall topology is still intact, but the local structure is now α-helical rather than a random coil.