Post on 21-Dec-2015

Don't fffear the buccaneer

Kevin Cowtan, York.

● Map simulation ⇨ A tool for building robust statistical methods

● 'Pirate' ⇨ A new statistical phase improvement method

● 'Buccaneer' ⇨ A new statistical chain tracing method

● Results ⇨ And a diatribe about their irrelevance

The Royal Society / York Structural Biology Laboratory

Map simulation

• Map simulation is a tool to generate problem-specific statistical targets.

[Diagram. Known (reference) structure: refined model density, structure factors, phases, simulated noisy map. Unknown (work) structure: target noisy map. Quantities passed between them: scale factors, phase errors.]

Map simulation: Method

Transferring the errors:
1. Classify the reflections from both structures by |E| and resolution.

[Diagram: one 3 × 3 grid of bins per structure, by |E| (low, medium, high) and resolution (low, medium, high).]

(Note: we use 225 bins, not 9!)

Map simulation: Method

Transferring the errors:
2. Copy FOMs by bin from work structure to reference.

[Diagram: the same |E| × resolution grids, now holding example FOM values per bin, e.g. 0.1, 0.0, 0.0, 0.0 ... 0.9, 0.8, 0.6, 0.4.]

(We pick a random FOM from the same bin of the work structure for each reflection in the reference structure.)
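Steps 1 and 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the tuple layout and the bin edges are hypothetical, and the example uses a 3 × 3 grid for readability rather than the 225 bins used in practice.

```python
import random
from collections import defaultdict

def bin_index(e_mag, resol, e_edges, resol_edges):
    """Return the (|E| bin, resolution bin) pair for one reflection."""
    e_bin = sum(e_mag >= edge for edge in e_edges)
    r_bin = sum(resol >= edge for edge in resol_edges)
    return e_bin, r_bin

def transfer_foms(work, reference, e_edges, resol_edges, rng=None):
    """Give each reference reflection a FOM drawn at random from the
    matching bin of the work data.

    work:      list of (|E|, resolution, FOM) tuples (assumed layout)
    reference: list of (|E|, resolution) tuples (assumed layout)
    Returns one FOM per reference reflection, or None if the work bin is empty.
    """
    rng = rng or random.Random()
    pools = defaultdict(list)
    for e_mag, resol, fom in work:
        pools[bin_index(e_mag, resol, e_edges, resol_edges)].append(fom)
    result = []
    for e_mag, resol in reference:
        pool = pools.get(bin_index(e_mag, resol, e_edges, resol_edges))
        result.append(rng.choice(pool) if pool else None)
    return result

# Toy data: two weak and two strong low-resolution work reflections.
work = [(0.3, 0.05, 0.1), (0.4, 0.06, 0.0), (1.8, 0.05, 0.9), (1.9, 0.04, 0.8)]
reference = [(0.2, 0.03), (2.0, 0.07)]
foms = transfer_foms(work, reference, e_edges=[0.7, 1.3],
                     resol_edges=[0.1, 0.2], rng=random.Random(0))
```

Here the weak reference reflection can only receive 0.1 or 0.0, and the strong one only 0.9 or 0.8, which is the behaviour the grid on the slide illustrates.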

Map simulation: Method

Transferring the errors:
3. Simulate a phase error in accordance with the distribution for that FOM:

[Plot: probability distribution P(Δφ) of the phase error, centred on 0.]
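One way to draw such a phase error is sketched below. It assumes, as an illustration only, that the error of an acentric reflection follows a von Mises distribution whose mean cosine equals the FOM; the slides do not state which distribution is used, and the function names are hypothetical.

```python
import math
import random

def fom_of_kappa(kappa, steps=2000):
    """Mean of cos(x) under a von Mises distribution with concentration
    kappa, i.e. I1(kappa)/I0(kappa), by midpoint integration on [0, pi]."""
    num = den = 0.0
    for i in range(steps):
        x = math.pi * (i + 0.5) / steps
        w = math.exp(kappa * (math.cos(x) - 1.0))  # shifted to avoid overflow
        num += w * math.cos(x)
        den += w
    return num / den

def kappa_of_fom(fom, hi=500.0, iters=40):
    """Invert fom_of_kappa by bisection (it increases monotonically)."""
    lo = 0.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if fom_of_kappa(mid) < fom:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def sample_phase_errors(fom, n, rng):
    """Draw n phase errors in radians, wrapped to (-pi, pi]."""
    kappa = kappa_of_fom(fom)
    errors = []
    for _ in range(n):
        x = rng.vonmisesvariate(0.0, kappa)  # stdlib returns [0, 2*pi]
        errors.append(x - 2.0 * math.pi if x > math.pi else x)
    return errors
```

With this choice a FOM of 0 gives a uniform phase error and a FOM close to 1 gives errors tightly clustered around 0. Centric reflections would need a separate two-valued treatment, which is not shown.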

Map simulation: Method

Transferring the scales:
Rescale the reference data to match the work data, after accounting for the difference in cell volumes.

[Two plots of |E|² against resolution.]
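A minimal sketch of a resolution-dependent rescale follows, assuming one scale factor per resolution shell chosen so that the mean squared amplitudes agree. The function names and the `shell_of` helper are hypothetical, and the cell-volume correction mentioned on the slide is not attempted here.

```python
import math
from collections import defaultdict

def shell_scales(work, reference, shell_of):
    """Return {shell: k} with k = sqrt(<|F_work|^2> / <|F_ref|^2>).

    work, reference: lists of (resolution, amplitude) pairs (assumed layout)
    shell_of: maps a resolution value to a shell label (hypothetical helper)
    """
    def mean_square(data):
        acc = defaultdict(lambda: [0.0, 0])
        for resol, amp in data:
            cell = acc[shell_of(resol)]
            cell[0] += amp * amp
            cell[1] += 1
        return {shell: total / count for shell, (total, count) in acc.items()}

    w, r = mean_square(work), mean_square(reference)
    return {s: math.sqrt(w[s] / r[s]) for s in r if s in w and r[s] > 0}

def rescale(reference, scales, shell_of):
    """Apply the per-shell scale; shells without a scale are left unchanged."""
    return [(resol, amp * scales.get(shell_of(resol), 1.0))
            for resol, amp in reference]
```

After rescaling, the mean squared amplitude of the reference data in each shell equals that of the work data, so the two curves against resolution coincide shell by shell.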

Map simulation: Method

Result:

• Map calculated from simulated reference data has same statistical properties as work map.

Notes:

• Need reliable FOMs!

• Can potentially simulate Hendrickson-Lattman (HL) coefficients too.

• Should bin FOMs for centric/acentric data separately (if data available).

'Pirate': Rationale

• Density modification history has been dominated by the solvent mask in one form or another.

• Limitations:

– What do we do with disordered protein?

– What do we do with ordered solvent?

– Need to know solvent content.

– What do we do for non-proteins?

'Pirate': Method

• Divide map into a multi-dimensional continuum of states.

e.g. Local mean and local variance classify map into:

● Electron sparse/dense
● Disordered/ordered

Dense, ordered

Dense, disordered

Sparse, ordered

Sparse, disordered
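The local mean and local variance can be sketched as below for a small periodic grid. The cubic neighbourhood, the thresholds and the function names are assumptions made for illustration; the method itself treats the states as a continuum rather than four hard classes, and the sketch reads high local variance as "ordered".

```python
from itertools import product

def local_stats(grid, radius=1):
    """Local mean and variance in a cube of half-width `radius` around
    every point of a 3D grid (nested lists), with periodic wrap-around."""
    nu, nv, nw = len(grid), len(grid[0]), len(grid[0][0])
    offsets = list(product(range(-radius, radius + 1), repeat=3))
    mean = [[[0.0] * nw for _ in range(nv)] for _ in range(nu)]
    var = [[[0.0] * nw for _ in range(nv)] for _ in range(nu)]
    for u, v, w in product(range(nu), range(nv), range(nw)):
        vals = [grid[(u + du) % nu][(v + dv) % nv][(w + dw) % nw]
                for du, dv, dw in offsets]
        m = sum(vals) / len(vals)
        mean[u][v][w] = m
        var[u][v][w] = sum((x - m) ** 2 for x in vals) / len(vals)
    return mean, var

def classify(mean, var, mean_cut, var_cut):
    """Label one grid point; the hard thresholds are purely illustrative."""
    density = "dense" if mean >= mean_cut else "sparse"
    order = "ordered" if var >= var_cut else "disordered"
    return density, order
```

A flat region gives zero local variance and so lands in a "disordered" class, while a region with sharp features gives a high variance.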

'Pirate': Method

Compare simulated and known map to obtain density distributions for each region, then apply these distributions to the unknown map.

[Figure panels: reference structure; work structure.]

'Pirate': Method

• Obtain per-grid density probability distributions.

– Also allows NCS, known density etc.

• Transform using equations of Bricogne (1992).

– Similar to Terwilliger (1999).

– Map probability becomes phase probability distribution.

Bricogne (1992) Proc. CCP4 Study Weekend
Bricogne (1997) Methods in Enzymology

[Diagram: axes labelled R and I.]

'Pirate': Method

• Finally, combine new distribution with original HL coefficients, for new phases and maps.

• Gives final 'improved' phase probabilities.

[Diagram: phase distributions drawn on R/I axes; new distribution × ABCD.]
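The combination step can be illustrated with Hendrickson-Lattman coefficients, where P(φ) ∝ exp(A cos φ + B sin φ + C cos 2φ + D sin 2φ), so multiplying two phase distributions amounts to adding their coefficients. The sketch below assumes the new distribution has already been expressed as ABCD; the function names are hypothetical.

```python
import math

def combine_hl(hl_a, hl_b):
    """Multiply two phase distributions by adding their (A, B, C, D)."""
    return tuple(a + b for a, b in zip(hl_a, hl_b))

def phase_and_fom(hl, steps=360):
    """Centroid phase (radians) and figure of merit of an HL distribution,
    by numerical integration around the phase circle."""
    a, b, c, d = hl
    phis = [2.0 * math.pi * i / steps for i in range(steps)]
    logp = [a * math.cos(p) + b * math.sin(p)
            + c * math.cos(2 * p) + d * math.sin(2 * p) for p in phis]
    top = max(logp)                        # subtract the peak for stability
    weights = [math.exp(l - top) for l in logp]
    total = sum(weights)
    re = sum(w * math.cos(p) for w, p in zip(weights, phis)) / total
    im = sum(w * math.sin(p) for w, p in zip(weights, phis)) / total
    return math.atan2(im, re), math.hypot(re, im)
```

For example, combining (1, 0, 0, 0) with (2, 0, 0, 0) gives (3, 0, 0, 0), a distribution centred on phase 0 with a FOM of about 0.81.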

'Pirate': Method

Notes:

• No solvent content required, since reference map is pre-scaled to work map.

• Single step process (for now)

– No solvent mask -> no mask to refine.

• Should work for novel problems too (with related reference structure)

– e.g. No solvent, disordered domains, metalloproteins.

'Buccaneer': Method

Compare simulated map and known model to obtain likelihood target, then search for this target in the unknown map.

[Figure panels: reference structure; work structure; LLK.]

'Buccaneer': Method

• Compile statistics for reference map in 4 Å sphere about Cα => LLK target.

4 Å sphere about Cα also used by 'CAPRA', Ioerger et al. (but different target function).

• Use mean/variance (in future histogram).
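A mean/variance target of this kind can be sketched as an independent Gaussian at each sample point of the sphere. This is an assumed form for illustration, not the program's actual target function, and the names are hypothetical.

```python
import math

def build_target(samples):
    """Per-point mean and variance of the density.

    samples: one density vector per reference C-alpha, each sampled at the
    same set of offsets inside the sphere (assumed layout).
    """
    n = len(samples)
    npts = len(samples[0])
    means = [sum(s[i] for s in samples) / n for i in range(npts)]
    variances = [sum((s[i] - means[i]) ** 2 for s in samples) / n
                 for i in range(npts)]
    return means, variances

def llk_score(density, means, variances, floor=1e-6):
    """Negative log-likelihood of a density vector under independent
    Gaussians; lower means a better match to the target."""
    score = 0.0
    for rho, mu, var in zip(density, means, variances):
        var = max(var, floor)  # guard against zero variance
        score += 0.5 * (rho - mu) ** 2 / var + 0.5 * math.log(2 * math.pi * var)
    return score
```

Points where the reference density varies little carry a small variance and so penalise a mismatch heavily, while highly variable points contribute little.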

'Buccaneer': Method

Find candidate Cα positions using LLK-fffear search (~1 per 3 residues).

'Buccaneer': Method

Extend from candidates using 2 residue lookahead with Ramachandran restraints.

(Same target function, but in real space.)

Then ARP/wARP?

Lookahead search c.f. Jones, Oldfield, Terwilliger, etc.
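The idea of a two-residue lookahead can be sketched generically: each possible next position is judged by its own score plus the best score reachable one step further on. The `candidates` and `score` callables are hypothetical stand-ins (the first would return only positions allowed by the Ramachandran restraints), and the score is assumed to be higher for better fits.

```python
def extend_chain(start, candidates, score, n_steps):
    """Grow a chain from `start`, choosing each step with a two-step lookahead.

    candidates(pos): allowed next positions after `pos` (hypothetical helper)
    score(pos):      fit of a position to the map, higher is better (assumed)
    """
    chain = [start]
    for _ in range(n_steps):
        best, best_total = None, float("-inf")
        for first in candidates(chain[-1]):
            followers = candidates(first)
            if not followers:
                continue  # dead end: no second residue can follow
            total = score(first) + max(score(second) for second in followers)
            if total > best_total:
                best, best_total = first, total
        if best is None:
            break
        chain.append(best)
    return chain

# Toy example on integer "positions": stepping to 1 looks best locally,
# but stepping to 2 opens the way to the high-scoring position 4.
toy_scores = {1: 5.0, 2: 4.0, 3: 0.0, 4: 10.0}
toy_chain = extend_chain(0, lambda p: [p + 1, p + 2],
                         lambda p: toy_scores.get(p, 0.0), n_steps=1)
```

A purely greedy search would pick position 1 in the toy example; the lookahead picks position 2.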

Results

Problem: “tuning” of one program to another.

Phasing      Ecorr / MPEw / m0       Ph.Impr.     Ecorr / MPEw / m0
'mlphare'    0.508 / 59.1 / 1.35     'dm'         0.700 / 50.6 / 0.61
                                     'resolve'    0.436 / 67.8 / 0.37
'solve'      0.474 / 61.0 / 0.83     'dm'         0.750 / 47.7 / 0.68
                                     'resolve'    0.710 / 48.0 / 0.67

'resolve' version 2.0.5, with 'no build' option in order to compare model-free phasing.

Statistics are:
Ecorr: E-map correlation;
MPEw: weighted Mean Phase Error;
m0: gradient of regression of cos(Δφ) vs. FOM
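Two of these statistics can be sketched as follows. The weighting scheme for MPEw is not given on the slide, so the weights are left to the caller, and m0 is read here as the slope of a least-squares line through the origin, which is an assumption. Phases are in degrees and the function names are hypothetical.

```python
import math

def phase_error_deg(phi_a, phi_b):
    """Absolute phase difference wrapped into [0, 180] degrees."""
    d = abs(phi_a - phi_b) % 360.0
    return min(d, 360.0 - d)

def weighted_mpe(phi_true, phi_est, weights):
    """Weighted mean phase error in degrees."""
    num = sum(w * phase_error_deg(a, b)
              for a, b, w in zip(phi_true, phi_est, weights))
    return num / sum(weights)

def fom_slope(phi_true, phi_est, foms):
    """Slope of cos(phase error) against FOM for a line through the origin."""
    num = sum(f * math.cos(math.radians(a - b))
              for a, b, f in zip(phi_true, phi_est, foms))
    den = sum(f * f for f in foms)
    return num / den
```

If the FOMs are well calibrated, the expected cosine of the phase error equals the FOM and the slope comes out close to 1; a slope well below 1 indicates overestimated FOMs and a slope above 1 underestimated ones.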

What other examples of “tuning” are present in this case?

Results

After 'solve', but with other tuning problems:

Ph.Impr.      Ecorr / MPEw / m0
'dm'          0.750 / 47.7 / 0.68
'resolve'     0.710 / 48.0 / 0.67
'pirate' 1    0.775 / 43.2 / 1.08

'pirate' 2 to 6 (the pairing of run labels with values did not survive extraction; values in transcript order):
0.762 / 43.3 / 0.98
0.824 / 37.2 / 1.02
0.788 / 39.7 / 0.94
0.745 / 44.7 / 1.02
0.759 / 42.7 / 0.94

Reference structures: 'pirate' 1 to 6.

Beta-mannosidase: Boraston, Revett, Boraston, Nurizzo, Davies (2003) Structure

Results

[Figure panels: SAD; 'dm'.]

Results

[Figure panels: 'resolve'; 'pirate'.]

Results

Other cases:

– MIRAS:

• Commercial structure phased with MLPHARE.

• Results better than 'dm'.

– High resolution:

• RNAse phase extension to 1.5, 1.0 Å.

• Map improved (unlike 'dm') with appropriate reference structure.

• (But not as good as dual-space methods: ACORN).

Future

• 'Pirate' available soon: Q1 2004 (after tuning)

• 'Pirate' flexi-domain averaging: Q3 2004

• 'Buccaneer': 2004?

Technology:

Both applications are extremely simple, built using Clipper libraries, less than 1000 lines of code each, less than 2 months development.

Conclusions

• Very simple but effective applications can be built with improved statistical targets from map simulation calculations.

• Preliminary results on real data suggest this approach is competitive with the state-of-the-art, even at an early stage of development.

• Need reliable phase probability distributions (figures of merit).

Acknowledgements

● G. Bricogne (Original probability transformation eqns.)

● T. Terwilliger (First implementation of statistical dm.)

● E. Dodson (Test data)

● Royal Society (KDC funding)
