
Post on 21-Dec-2015


Automated Model-Building with TEXTAL

Thomas R. Ioerger

Department of Computer Science

Texas A&M University

• Automated model-building program

• Can we automate the kind of visual processing of patterns that crystallographers use?

– Intelligent methods to interpret density, despite noise

– Exploit knowledge about typical protein structure

• Focus on medium-resolution maps

– optimized for 2.8 Å (actually, 2.6-3.2 Å is fine)

– typical for MAD data (useful for high-throughput)

– other programs exist for higher-res data (ARP/wARP)

Overview of TEXTAL

Electron density map (not structure factors) → TEXTAL → Protein model (may need refinement)

Main Stages of TEXTAL

electron density map → CAPRA → Cα chains → LOOKUP → model (initial coordinates) → Post-processing routines → model (final coordinates)

• LOOKUP: build in side-chain and main-chain atoms locally around each Cα

• Post-processing routines (example: real-space refinement)

• Reciprocal-space refinement/DM

• Human crystallographer (editing)

CAPRA: C-Alpha Pattern-Recognition Algorithm

tracing

linking

Neural network: estimates which pseudo-atoms are closest to true Cα’s
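As a rough illustration of the linking step, the sketch below chains candidate Cα positions whose spacing is close to the roughly 3.8 Å distance between consecutive Cα atoms in a protein backbone. It is not CAPRA's algorithm: the greedy strategy, the `tolerance` value, and the function name `link_candidates` are assumptions made for this example, and the neural-network scoring is skipped by taking the candidate positions as given.

```python
import math

CA_CA_DISTANCE = 3.8  # typical spacing of consecutive C-alpha atoms, in angstroms


def link_candidates(points, tolerance=0.5):
    """Greedily chain points whose spacing is near CA_CA_DISTANCE.

    points: list of (x, y, z) candidate C-alpha positions (hypothetical input).
    Returns a list of chains, each a list of indices into points.
    """
    unused = set(range(len(points)))
    chains = []
    while unused:
        start = min(unused)
        unused.remove(start)
        chain = [start]
        # Extend from one end, then reverse and extend from the other end.
        for _ in range(2):
            while True:
                tail = points[chain[-1]]
                best, best_err = None, None
                for i in sorted(unused):
                    err = abs(math.dist(tail, points[i]) - CA_CA_DISTANCE)
                    if err <= tolerance and (best is None or err < best_err):
                        best, best_err = i, err
                if best is None:
                    break
                chain.append(best)
                unused.remove(best)
            chain.reverse()
        chains.append(chain)
    return chains
```

A greedy pass like this cannot resolve branch points or crossovers between nearby strands, which is one reason a real tracer needs additional scoring of candidate links.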

Example of Cα-chains fit by CAPRA

% built: 84%

# chains: 2

lengths: 47, 88

RMSD: 0.82 Å

Rat α2 urinary protein (P. Adams)

data: 2.5 Å MR

map generated at 2.8 Å
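For reference, an RMSD figure like the one above can be computed as follows. This is a minimal sketch that assumes the built and reference Cα atoms are already paired one-to-one and expressed in the same coordinate frame; the slide does not say how the pairing was done, and the function name `rmsd` is chosen only for this example.

```python
import math


def rmsd(built, reference):
    """Root-mean-square deviation between paired (x, y, z) coordinates."""
    if not built or len(built) != len(reference):
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = sum(math.dist(a, b) ** 2 for a, b in zip(built, reference))
    return math.sqrt(total / len(built))
```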

Stage 2: LOOKUP

• LOOKUP is based on Pattern Recognition

– Given a local (5 Å spherical) region of density, have we seen a pattern like this before (in another map)?

– If so, use similar atomic coordinates.

• Use a database of maps with known structures

– 200 proteins from PDB-Select (non-redundant)

– back-transformed (calculated) maps at 2.8 Å (no noise)

– regions centered on 50,000 Cα’s

• Use feature extraction to match regions efficiently

– features (e.g. moments) represent local density patterns

– features must be rotation-invariant (independent of 3D orientation)

– use density correlation for more precise evaluation
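The two-stage matching described above (a cheap feature comparison to shortlist candidates, then density correlation for a more precise evaluation) can be sketched as follows. This is an illustration under stated assumptions, not TEXTAL's implementation: the plain Euclidean feature distance, the `shortlist` size, the dictionary layout, and all function names are hypothetical, and the density samples are assumed to be already rotated into a common orientation.

```python
import math


def feature_distance(f, g):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f, g)))


def density_correlation(u, v):
    """Pearson correlation between two equally sampled density regions."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    norm_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a flat region carries no pattern to correlate
    return cov / (norm_u * norm_v)


def lookup(query, database, shortlist=2):
    """Return the name of the best-matching database region.

    query and every database entry are dicts with 'features' and 'density';
    database entries also carry a 'name' (hypothetical layout).
    """
    # Stage 1: rank all regions by distance in feature space.
    ranked = sorted(
        database,
        key=lambda e: feature_distance(query["features"], e["features"]),
    )
    # Stage 2: re-rank only the shortlist by density correlation.
    best = max(
        ranked[:shortlist],
        key=lambda e: density_correlation(query["density"], e["density"]),
    )
    return best["name"]
```

The point of the split is cost: the feature comparison is a few arithmetic operations per database region, so the more expensive density comparison only has to run on a handful of candidates.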

Examples of Numeric Density Features

• Distance from center-of-sphere to center-of-mass

• Moments of inertia - relative dispersion along orthogonal axes

• Geometric features like “Spoke angles”

• Local variance and other statistics

TEXTAL uses 19 distinct numeric features to represent the pattern of density in a region, each calculated over 4 different radii, for a total of 76 features.
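To make the idea of rotation-invariant features over several radii concrete, the sketch below computes two simple quantities per radius: the distance from the sphere center to the center of mass, and the mass-weighted mean squared distance from the center of mass (a scalar proportional to the sum of the principal moments of inertia). These are stand-ins rather than TEXTAL's 19 features; the radii, the sample layout, the assumption of non-negative density values, and the function names are all hypothetical.

```python
import math

ORIGIN = (0.0, 0.0, 0.0)


def region_features(samples, radius):
    """Two rotation-invariant features of the density within radius of the center.

    samples: list of ((x, y, z), density) pairs, with coordinates relative to
    the region center and non-negative density values (assumed).
    """
    inside = [(p, w) for p, w in samples if math.dist(p, ORIGIN) <= radius]
    total = sum(w for _, w in inside)
    if total == 0:
        return (0.0, 0.0)
    com = tuple(sum(w * p[k] for p, w in inside) / total for k in range(3))
    com_distance = math.dist(com, ORIGIN)
    dispersion = sum(w * math.dist(p, com) ** 2 for p, w in inside) / total
    return (com_distance, dispersion)


def feature_vector(samples, radii=(3.0, 4.0, 5.0, 6.0)):
    """Concatenate the per-radius features (the radii are illustrative)."""
    vector = []
    for radius in radii:
        vector.extend(region_features(samples, radius))
    return vector


def rotate_z(samples, angle):
    """Rotate all sample coordinates about the z axis by angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return [((c * x - s * y, s * x + c * y, z), w) for (x, y, z), w in samples]
```

Because both quantities depend only on distances, `feature_vector(samples)` and `feature_vector(rotate_z(samples, angle))` agree up to floating-point error, which is the property that lets regions be compared without first aligning them.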

F=<1.72,-0.39,1.04,1.55...> F=<1.58,0.18,1.09,-0.25...>

F=<0.90,0.65,-1.40,0.87...> F=<1.79,-0.43,0.88,1.52...>

Database of known maps

Region in map to be interpreted

The LOOKUP Process

Find optimal rotation

Stage 3: Post-Processing

Interfaces for Using TEXTAL

• Stand-alone commands and scripts

– capra-scale prot.xplor prot-scaled.xplor

– neotex.sh myprotein > textal.log

– lots of intermediate files and logs…

• WINTEX: Tcl/Tk interface

– creates jobs in sub-directories

– Public Release: July 2004

– http://textal.tamu.edu:12321

• Integrated into Phenix

– http://phenix-online.org

– Python module

– model-building tasks in GUI
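For scripted use, the two stand-alone commands listed above could be driven from Python roughly as follows. This is only a sketch: the argument lists mirror the slide's examples exactly, but the argument meanings (input map, output map, job name) and the helper names are assumptions, and this is not the Phenix Python module.

```python
import subprocess


def capra_scale_command(input_map, output_map):
    """Argument list for: capra-scale prot.xplor prot-scaled.xplor"""
    return ["capra-scale", input_map, output_map]


def neotex_command(job_name):
    """Argument list for: neotex.sh myprotein"""
    return ["neotex.sh", job_name]


def run_neotex(job_name, log_path="textal.log"):
    """Run neotex.sh and send its standard output to log_path.

    Equivalent to the shell redirection `neotex.sh myprotein > textal.log`;
    raises CalledProcessError if the script exits with a non-zero status.
    """
    with open(log_path, "w") as log:
        return subprocess.run(neotex_command(job_name), stdout=log, check=True)
```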

Gallery of Examples

Conclusions

• Pattern recognition is a successful technique for macromolecular model-building

• Future directions:

– building ligands, co-factors, etc.

– recognizing disulfide bridges

– phase improvement (iterating with refinement)

– loop-building

– further integration with Phenix

– Intelligent Agent-based methods for guiding/automating model-building

– interactive graphics for specialized needs (e.g. fixing chains, editing identities)

Acknowledgements

• Funding:

– National Institutes of Health

• People:

– James C. Sacchettini

– Kevin Childs, Kreshna Gopal, Lalji Kanbi, Erik McKee, Reetal Pai, Tod Romo

• Our association with the PHENIX group:

– Paul Adams (Lawrence Berkeley National Lab)

– Randy Read (Cambridge University)

– Tom Terwilliger (Los Alamos National Lab)