Current Status of Homology Modeling Using MCSG Structures
319 MCSG structures in PDB have over 400,000 sequence homologues.
These structures represent ~350 domains.
Models are built by MODELLER (Sali) and quality is assessed using PROSA (Sippl).
High-quality models can be generated for ~80,000 proteins.
A web site has been established that allows automated modeling of sequence homologues and evaluation of the quality of the models.
www.biochem.ucl.ac.uk/~dlee/GeMMA
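As a rough sanity check, the counts quoted on this slide can be turned into ratios. The sketch below uses only the slide's own approximate numbers ("over 400,000", "~350", "~80,000"); the variable names are illustrative and not part of the original presentation.

```python
# Approximate counts as quoted on the slide.
mcsg_structures = 319
sequence_homologues = 400_000
domains = 350
high_quality_models = 80_000

# Fraction of homologues for which a high-quality model is expected.
modelable_fraction = high_quality_models / sequence_homologues

# Average number of homologues per deposited structure and per domain.
homologues_per_structure = sequence_homologues / mcsg_structures
homologues_per_domain = sequence_homologues / domains

print(f"modelable fraction: {modelable_fraction:.0%}")
print(f"homologues per structure: {homologues_per_structure:.0f}")
print(f"homologues per domain: {homologues_per_domain:.0f}")
```

Because the inputs are rounded estimates, the resulting ratios should be read as order-of-magnitude figures only.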
[Figure: 1t5b domain template and Q92LV5 domain model. Labeled residues: Gly140/141, Gly154/155, Tyr95, Asp96, Phe97, Trp102, Phe103.]
Protein Structure Initiative - the Need for Large-Scale Homology Modeling
In the next five years, PSI can determine approximately 3,000-4,000 protein structures, mainly at coarse granularity.
Reality check: novel structures in PDB will represent a very small fraction of the sequences in GenBank, so reliable homology modeling is critical for obtaining 3D models and extending experimental work.
In PSI2, targets for structure determination are selected from large families; the determined structures therefore have a large number of sequence homologues across a wide range of sequence similarity. These proteins often have different functions.
Homology modeling must provide tools and 3D protein models that can be used for high-confidence, reliable interpretation of specific structural features in distant (15-25%) sequence homologues, for protein function assignment, and for studying evolution.
Models should guide an increasing number of more sophisticated experiments, including: (i) mutagenesis and biochemical studies, (ii) prediction of ligand binding, (iii) prediction of oligomerization state, (iv) prediction of cellular interactions (protein with protein, DNA, or RNA).
We need to consider how PSI target selection of protein sequences and subsequent structure determination can improve homology modeling and the quality of the models.
Major Issues with Large-Scale Homology Modeling for Structural Genomics
3D protein models for distant (15-25%) sequence homologues are often not suitable.
Because of sequence divergence, only a small fraction (10-20%) of the sequences in very large families can be reliably modeled.
Homology modeling must provide input to target selection for fine coverage of protein families.
Domain parsing needs improvement. We should be able to model multi-domain proteins from structures of individual domains.
We should be able to model neighbouring side chains and important structural and functional features that are currently difficult to assign and predict correctly.
We need methods to predict unusual features and departures from the structure that is used for modeling.
Modeling of loops and high B-factor regions needs improvement.
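One simple way to locate the high B-factor regions of a template before modeling is to average the atomic B factors per residue and flag those above a cutoff. The following is a minimal sketch, assuming standard fixed-column PDB ATOM records; the function names, the 50 Å² cutoff, and the example residues are all hypothetical and are not taken from the presentation.

```python
from collections import defaultdict


def high_b_factor_residues(pdb_lines, threshold=50.0):
    """Return [((chain, res_seq, res_name), mean_b), ...] for residues whose
    mean atomic B factor exceeds `threshold` (an arbitrary illustrative cutoff).
    """
    b_values = defaultdict(list)
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        res_name = line[17:20].strip()   # columns 18-20
        chain = line[21]                 # column 22
        res_seq = int(line[22:26])       # columns 23-26
        b_factor = float(line[60:66])    # columns 61-66
        b_values[(chain, res_seq, res_name)].append(b_factor)

    flagged = []
    for key, values in b_values.items():
        mean_b = sum(values) / len(values)
        if mean_b > threshold:
            flagged.append((key, round(mean_b, 2)))
    return flagged


def make_atom_line(serial, atom, res_name, chain, res_seq, b_factor):
    """Build a simplified ATOM record with dummy coordinates (for the demo only)."""
    return (f"ATOM  {serial:>5} {atom:<4} {res_name:>3} {chain}{res_seq:>4}    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{b_factor:6.2f}")


# Made-up example: one well-ordered residue and one mobile residue.
demo_lines = [
    make_atom_line(1, "N", "ALA", "A", 1, 20.0),
    make_atom_line(2, "CA", "ALA", "A", 1, 30.0),
    make_atom_line(3, "N", "LYS", "A", 2, 70.0),
    make_atom_line(4, "CA", "LYS", "A", 2, 80.0),
]
print(high_b_factor_residues(demo_lines))
```

Residues flagged this way are candidates for separate loop modeling or for down-weighting when the template is used; the appropriate cutoff depends on the resolution and refinement of the structure.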
Structure of P5CR Exemplifies Challenges for Homology Modeling
• Two structures of P5CR were determined.
• The proteins share 22% sequence identity and 47% sequence similarity.
• The monomer structures are very similar but show individual features.
• Problems: the protein has two domains and forms oligomers; one domain shows major swapping, and the protein forms different oligomeric forms in different species.
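A figure such as the 22% sequence identity quoted above is computed from a pairwise alignment. The sketch below shows one common convention (identical residues divided by the number of columns in which neither sequence has a gap). The presentation does not say which convention or alignment was used, so this is an assumption, as are the function names, the toy sequences, and the reading of the 15-25% range as percent identity.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a.upper() == b.upper():
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


def is_distant_homologue(identity, low=15.0, high=25.0):
    """True if `identity` falls in the 15-25% range the slides call 'distant'."""
    return low <= identity <= high


# Toy alignment (not real P5CR sequences).
toy_identity = percent_identity("MKT-AYIAK", "MRTLAY-GK")
print(f"{toy_identity:.1f}% identity")
print(is_distant_homologue(22.0))
```

Other conventions divide by the alignment length or by the length of the shorter sequence, which can shift the value by several percentage points for gapped alignments.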
Human Aldose Reductase – SeMet MAD at 0.9 Å Comparison – Experimental vs. Refined Map
Refined map @ 0.9 Å, sigmaA (2mFo-DFc), contour level: 1 sigma
Experimental map @ 0.9 Å, Fo, contour level: 1 sigma
MAD Map at 3.2 Å, 1.8 Å, 1.6 Å and 1.1 Å
Inhibitor Head Existing in Double Conformation Hard to Interpret at RT (1.45 Å), Clear at 100 K (0.8 Å)
Tyr 48
His 110