Post on 04-Jan-2016

Parallel Fine Sampling to Solve Large or Difficult Structures

• Manually exploring a large parameter space to find the right combination of parameters is time-consuming and frustrating. It often leads to giving up on an otherwise solvable structure.

• Parallel exploration of the parameter space is an effective way to solve challenging structures efficiently and reliably
– Systematically explore the parameter space
– Speed up the search with parallel execution on a PC cluster


Structures Solved by Fine Grid Search

Target      Mol/ASU   Sites/Mol   Sites   Space Group   Resolution (Å)
MB3864A     4         6           24      P43           2.65
PE000293D   6         9           54      H3            2.15
PD06751F    6         14          84      P21212        1.90
TB1547G     8         12          96      P212121       2.20
PC06751C    6         20          120     P3121         2.70
FJ5490C     12        6           72      P1            2.00
FH7599A*    12        16          196     C2            2.00

*work in progress

PD06751F

• 454 aa/15 Met, 1.9 Å, P21212, hexamer in ASU

• Space group choices narrowed down by systematic absences

• 1080 SHELXD jobs (200 trials each), parameters explored:
– E value cutoff (1.1-1.5 in steps of 0.1)
– Number of sites (40-120 in steps of 10)
– Resolution cutoff (3.5-5.8 in steps of 0.1)

• 3.5 hrs to finish all 1080 jobs on SDC cluster (220 CPUs)

• Of the 1080 jobs, 39% found correct heavy atom solutions

• First correct solution within minutes, 84/84 sites found
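
The grid above can be enumerated programmatically. The following is a minimal sketch, not the scripts actually used on the cluster: the `grid` helper and the dictionary keys (`e_cutoff`, `sites`, `resolution`) are hypothetical names chosen for illustration and are not SHELXD keywords. It builds one job description per parameter combination and confirms that the three ranges multiply out to 1080 jobs.

```python
from itertools import product

def grid(start, stop, step, scale=10):
    """Inclusive list of floats, built from integers to avoid rounding drift."""
    lo, hi, st = round(start * scale), round(stop * scale), round(step * scale)
    return [i / scale for i in range(lo, hi + 1, st)]

# Parameter ranges quoted for PD06751F.
e_cutoffs = grid(1.1, 1.5, 0.1)          # 5 values
site_counts = list(range(40, 121, 10))   # 9 values
res_cutoffs = grid(3.5, 5.8, 0.1)        # 24 values

# One job description per combination (key names are illustrative only).
jobs = [
    {"e_cutoff": e, "sites": n, "resolution": d}
    for e, n, d in product(e_cutoffs, site_counts, res_cutoffs)
]

print(len(jobs))  # 5 * 9 * 24 = 1080
```

Each dictionary would then be turned into one SHELXD input file and submitted as a separate cluster job; how that is done depends on the local queueing system.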

PE00293D

• 285 aa/11 Met, H3, 2.15 Å, hexamer/ASU, 2-wavelength MAD, PDB ID: 2p10

• 760 SHELXD jobs (200 trials each), parameters explored:
– E value cutoff (1.1-1.5 in steps of 0.1)
– Number of sites (20-90 in steps of 10)
– Resolution cutoff (4.0-5.8 in steps of 0.1)

• 1 hr to finish all 760 jobs on SDC cluster (220 CPUs)

• Solutions are rare: only 12 of the 760 jobs (about 1.6%) found correct heavy atom solutions, 53/54 sites found
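
When solutions are this rare, it helps to estimate how likely an early batch of jobs is to contain one. The sketch below makes two assumptions that the slides do not state for this target: jobs run in random order, and the first wave consists of 220 jobs running at once, one per CPU. Under those assumptions, the chance that at least one of the 12 successful jobs falls in the first wave follows from the hypergeometric distribution. The function name is hypothetical.

```python
from math import comb

def p_hit_in_first_wave(total_jobs, good_jobs, wave_size):
    """P(at least one good job among wave_size jobs drawn without replacement)."""
    wave_size = min(wave_size, total_jobs)
    # Probability that every job in the wave comes from the unsuccessful ones.
    p_none = comb(total_jobs - good_jobs, wave_size) / comb(total_jobs, wave_size)
    return 1.0 - p_none

# Figures quoted for PE00293D: 12 successful jobs out of 760, 220 CPUs.
p = p_hit_in_first_wave(total_jobs=760, good_jobs=12, wave_size=220)
print(f"{p:.3f}")
```

With a 1.6% hit rate, a single manual attempt would most likely fail, whereas a randomly ordered first wave on the cluster is very likely to include a successful parameter set.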

TB1547G

• 409 aa (13 Met)/monomer, P212121, 2 tetramers per ASU

• Initially mislabeled as a different target (TB5131A, 179 aa/2 Met)

• Treated as an unknown target

• POINTLESS and XPREP to narrow down space group choices, XPREP to generate FA values

• SHELXD grid search:
– Sites 20-120 in steps of 10
– Resolution cutoff 3.3-4.5 in steps of 0.1
– E value cutoff 1.1-1.5 in steps of 0.1

• 520 parallel SHELXD jobs, each attempting 200 trials

• The job order is randomized so that the search space is sampled uniformly from the start

• Solutions usually appear within minutes, so jobs can be terminated early if necessary

• Each SHELXD job needs ~1 hr; all jobs finish in ~2 hrs on SDC cluster (220 CPUs)

• Interpretation of the density map gave the correct identification of the target
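
The randomized ordering and early termination described above can be sketched as follows. This is an illustration under stated assumptions rather than the actual cluster workflow: a local thread pool stands in for the cluster queue, and `run_job` and `is_solution` are hypothetical callables that the user would supply (for example, a wrapper that launches one SHELXD run and a check on its reported figures of merit).

```python
import random
from concurrent.futures import ThreadPoolExecutor, as_completed

def run_until_solved(jobs, run_job, is_solution, workers=4, seed=0):
    """Run jobs in shuffled order; return the first (job, result) accepted by is_solution."""
    order = list(jobs)
    random.Random(seed).shuffle(order)  # sample the grid uniformly from the start
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(run_job, job): job for job in order}
        for fut in as_completed(futures):
            result = fut.result()
            if is_solution(result):
                for other in futures:
                    other.cancel()  # drops jobs that have not started yet
                return futures[fut], result
    return None  # no job produced an accepted solution

# Toy stand-ins: the "result" is just the site count, and 80 counts as solved.
jobs = [{"sites": n, "e_cutoff": e} for n in range(20, 121, 10) for e in (1.1, 1.2, 1.3)]
found = run_until_solved(jobs, run_job=lambda j: j["sites"], is_solution=lambda r: r == 80)
print(found)
```

Shuffling matters because a job list generated by nested loops is sorted by parameter value: without it, the first jobs to run would all cluster in one corner of the grid.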

FH7599A: MR+MAD

• Estimated 10-20 monomers per ASU, 100-300 heavy atom sites

• No highly homologous (>20% seq id) MR models

• FFAS or PSI-BLAST identified a remote sequence homolog, TM0064 (14% seq id)

• A poly-alanine model of the TM0064 trimer was used as the MR model; using the trimer as the MR template significantly improved the signal-to-noise ratio in the MR procedure

• Density modification (DM) is critical for improving MR phases

• Improved DM phases + MAD data were used to locate ~200 heavy atom sites and for MAD phasing

FH7599A vs TM0064: rmsd 2.42 Å for 82% of Cα atoms
