Automated Model-Building with TEXTAL
Thomas R. Ioerger
Department of Computer Science
Texas A&M University
• Automated model-building program
• Can we automate the kind of visual processing of patterns that crystallographers use?
– Intelligent methods to interpret density, despite noise
– Exploit knowledge about typical protein structure
• Focus on medium-resolution maps
– optimized for 2.8 Å (actually, 2.6-3.2 Å is fine)
– typical for MAD data (useful for high-throughput)
– other programs exist for higher-resolution data (ARP/wARP)
Overview of TEXTAL
Electron density map (not structure factors) → TEXTAL → Protein model (may need refinement)
Main Stages of TEXTAL
electron density map
→ CAPRA
→ Cα chains
→ LOOKUP (builds in side-chain and main-chain atoms locally around each Cα)
→ model (initial coordinates)
→ Post-processing routines (example: real-space refinement)
→ model (final coordinates)
Also shown in the diagram: Reciprocal-space refinement/DM; Human Crystallographer (editing)
CAPRA: C-Alpha Pattern-Recognition Algorithm
• tracing
• linking
• Neural network: estimates which pseudo-atoms are closest to true Cα's
Example of Cα chains fit by CAPRA
% built: 84%
# chains: 2
lengths: 47, 88
RMSD: 0.82 Å
Rat α2 urinary protein (P. Adams)
data: 2.5 Å MR
map generated at 2.8 Å
Stage 2: LOOKUP
• LOOKUP is based on Pattern Recognition
– Given a local (5 Å spherical) region of density, have we seen a pattern like this before (in another map)?
– If so, use similar atomic coordinates.
• Use a database of maps with known structures
– 200 proteins from PDB-Select (non-redundant)
– back-transformed (calculated) maps at 2.8 Å (no noise)
– regions centered on 50,000 Cα's
• Use feature extraction to match regions efficiently
– features (e.g. moments) represent local density patterns
– features must be rotation-invariant (independent of 3D orientation)
– use density correlation for more precise evaluation
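The two-step matching described above (a cheap feature comparison to shortlist candidates, then density correlation to pick among them) can be sketched as follows. This is a minimal illustration under stated assumptions, not TEXTAL's actual code: the function names, the Euclidean feature distance, the shortlist size `k`, and the toy data are all hypothetical, and the sketch assumes the regions being correlated are already in a common orientation.

```python
import numpy as np

def density_correlation(a, b):
    """Pearson correlation between two density regions sampled on the same grid."""
    a = a.ravel() - a.mean()
    b = b.ravel() - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def lookup(query_features, query_density, db_features, db_densities, k=3):
    """Shortlist the k database regions nearest in feature space,
    then return the index and score of the one with the best density correlation."""
    # Stage 1: fast filter using (assumed Euclidean) distance between feature vectors.
    dist = np.linalg.norm(db_features - query_features, axis=1)
    shortlist = np.argsort(dist)[:k]
    # Stage 2: more precise evaluation on the shortlisted regions only.
    scores = [density_correlation(query_density, db_densities[i]) for i in shortlist]
    best_pos = int(np.argmax(scores))
    return int(shortlist[best_pos]), scores[best_pos]

# Toy data: five random "database" regions with made-up 2D feature vectors.
rng = np.random.default_rng(0)
db_densities = rng.random((5, 7, 7, 7))
db_features = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 2.0]])

# The query is a slightly noisy copy of database region 2.
query_density = db_densities[2] + 0.01 * rng.standard_normal((7, 7, 7))
query_features = np.array([0.05, 0.95])

best, score = lookup(query_features, query_density, db_features, db_densities)
print(best, round(score, 3))
```

The point of the split is cost: feature distances are cheap enough to compute against every database region, while the correlation is only evaluated on the handful that survive the filter.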
Examples of Numeric Density Features
• Distance from center-of-sphere to center-of-mass
• Moments of inertia - relative dispersion along orthogonal axes
• Geometric features like “Spoke angles”
• Local variance and other statistics
TEXTAL uses 19 distinct numeric features to represent the pattern of density in a region, each calculated over 4 different radii, for a total of 76 features.
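As a rough illustration of how such features can be computed and stacked over several radii, the sketch below derives three of the listed quantities (center-of-mass distance, moments of inertia, local variance) from a gridded density region. It is an assumption-laden toy, not TEXTAL's feature set: the function names, the radii of 3, 4, 5 and 6 Å, the grid spacing, and the synthetic map are all hypothetical, and it yields 5 features per radius rather than 19.

```python
import numpy as np

def region_features(density, center, radius, spacing=1.0):
    """Return 5 rotation-invariant features of the density within `radius` of `center`.

    `density` is a 3D array on a cubic grid; `center` is given in grid indices
    and `spacing` is the (assumed) grid step in angstroms.
    """
    center = np.asarray(center, dtype=float) * spacing
    coords = np.indices(density.shape).reshape(3, -1).T * spacing
    rho = density.reshape(-1)
    offsets = coords - center
    inside = np.linalg.norm(offsets, axis=1) <= radius
    offsets, rho = offsets[inside], rho[inside]
    weights = np.clip(rho, 0.0, None)  # negative density carries no "mass"
    total = weights.sum()
    if total == 0.0:
        return np.zeros(5)
    # Feature 1: distance from the sphere center to the center of mass.
    com = (weights[:, None] * offsets).sum(axis=0) / total
    com_distance = np.linalg.norm(com)
    # Features 2-4: eigenvalues of the second-moment matrix, i.e. the
    # dispersion of density along three orthogonal principal axes.
    d = offsets - com
    second_moments = np.einsum("n,ni,nj->ij", weights, d, d) / total
    dispersion = np.sort(np.linalg.eigvalsh(second_moments))[::-1]
    # Feature 5: variance of the density values inside the sphere.
    return np.array([com_distance, *dispersion, rho.var()])

def feature_vector(density, center, radii=(3.0, 4.0, 5.0, 6.0)):
    """Concatenate the per-radius features into one vector (radii are assumed)."""
    return np.concatenate([region_features(density, center, r) for r in radii])

def demo_map(n=13):
    """Synthetic map: two Gaussian blobs, one at the grid center and one offset."""
    g = np.indices((n, n, n)).astype(float)
    def blob(c, s):
        r2 = (g[0] - c[0]) ** 2 + (g[1] - c[1]) ** 2 + (g[2] - c[2]) ** 2
        return np.exp(-r2 / (2 * s * s))
    return blob((6, 6, 6), 1.0) + 0.6 * blob((8, 6, 5), 1.2)

density = demo_map()
features = feature_vector(density, (6, 6, 6))
# Rotating the map by 90 degrees about the center should leave the features unchanged.
rotated = feature_vector(np.rot90(density, axes=(0, 1)), (6, 6, 6))
print(features.shape, np.allclose(features, rotated))
```

Eigenvalues are used rather than the raw second-moment matrix because the matrix entries change when the region is rotated, while its eigenvalues do not.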
Example feature vectors (one per region of density):
F=<1.72,-0.39,1.04,1.55...>
F=<1.58,0.18,1.09,-0.25...>
F=<0.90,0.65,-1.40,0.87...>
F=<1.79,-0.43,0.88,1.52...>
Interfaces for Using TEXTAL
• Stand-alone commands and scripts
– capra-scale prot.xplor prot-scaled.xplor
– neotex.sh myprotein > textal.log
– lots of intermediate files and logs…
• WINTEX: Tcl/Tk interface
– creates jobs in sub-directories
– Public Release: July 2004
– http://textal.tamu.edu:12321
• Integrated into Phenix
– http://phenix-online.org
– Python module
– model-building tasks in GUI
Conclusions
• Pattern recognition is a successful technique for macromolecular model-building
• Future directions:
– building ligands, co-factors, etc.
– recognizing disulfide bridges
– phase improvement (iterating with refinement)
– loop-building
– further integration with Phenix
– Intelligent Agent-based methods for guiding/automating model-building
– interactive graphics for specialized needs (e.g. fixing chains, editing identities)
Acknowledgements
• Funding:
– National Institutes of Health
• People:
– James C. Sacchettini
– Kevin Childs, Kreshna Gopal, Lalji Kanbi, Erik McKee, Reetal Pai, Tod Romo
• Our association with the PHENIX group:
– Paul Adams (Lawrence Berkeley National Lab)
– Randy Read (Cambridge University)
– Tom Terwilliger (Los Alamos National Lab)