molecular replacement - portal ifsc · november 16, 2018 ifsc/ccp4 mx school, são carlos 31 1....
TRANSCRIPT
MolecularReplacement
AndreyLebedevCCP4
MRProblem
Knowncrystalstructure Newcrystalstructure
Given: • Crystal structure of a homologue • New X-ray data
Find: • The new crystal structure
November16,2018 IFSC/CCP4MXSchool,SãoCarlos 2
MRTechnique
Method: • 6×N - dimensional global optimisation – one 6-d search for each molecule in the AU >> split further to orientation + translation searches = 3 + 3 >> fast search step using FFT
Required: • Scoring
– the match between the data and an (incomplete) crystal model – ideally: the highest score = correct solution
Searchmodel
Knowncrystalstructure Newcrystalstructure
November16,2018 IFSC/CCP4MXSchool,SãoCarlos 3
RealandReciprocalspaces
• TermsmayreferrealspacebutactualcalculaKonsmaybeperformedinthereciprocalspace:– "Searchintheelectrondensity"– "PaOersonsearch"
• TheconceptsformulatedinrealspacearemoreintuiKve
November16,2018 IFSC/CCP4MXSchool,SãoCarlos 4
Func8onsinRealandReciprocalspaces
"���!� ����������������
���������������������������������
�������������������������������������
�������������� ���������������������������������
����������
��� ����������������
"���!�
"���!�
"���!�
November16,2018 IFSC/CCP4MXSchool,SãoCarlos 5
StructurefactorsandElectrondensitymap
F = A+ iB
Structurefactors F(h,k,l)–AdiscretecomplexfuncKoninthereciprocalspaceAtgivenh,k,l–Complexnumber:–Canbeexpressedviastructureamplitudeandphase
F = F exp(iφ)
Electrondensitymap–periodic3-dfuncKoninrealspace
isdirectlyinterpretable–modelbuilding–real-spacefiXngoffragments
November16,2018 IFSC/CCP4MXSchool,SãoCarlos 6
Intensi8esandPa>ersonmap
Intensi8es I(h,k,l)–3-ddiscreterealfuncKoninthereciprocalspace
Pa>ersonmap:–3-dfuncKoninreal(*)space
–Nofeaturesresemblingaproteinmolecule–Modelbuilding,residuebyresidue,isimpossible
November16,2018 IFSC/CCP4MXSchool,SãoCarlos 7
DataandMR
MR: Two distinct cases dependent on availability of phases • Data = structure factors (include phases) – "Search in the electron density"
o Electron density maps are compared: calculated vs. observed o Model building is a more straightforward approach
» Useful in special cases • Data = observed intensities (no phases) – "Patterson search"
o Patterson maps are compared: calculated vs. observed o Direct model building is impossible in the absence of phases
» The most common case of MR As a rule, all computations are in the reciprocal space
November16,2018 IFSC/CCP4MXSchool,SãoCarlos 8
Selfandcrossvectors
Electrondensitymap
PaOersonmap
Onemolecule Twomolecules
r1r2
r1–r1r2–r2r1–r2r2–r1
Electrondensitymap=peaksfromallatomsPaOersonmap=peaksfromallinteratomicvectors•self-vectors:vectorsbetweenatomsbelongingtothesamemolecule•cross-vectors:vectorsbetweenatomsbelongingtodifferentmolecules
self-vectors
cross-vectorsContribuKonfromself-vectors–iscentredattheorigin–dominatesinavicinityoftheoriginNovember16,2018 IFSC/CCP4MXSchool,SãoCarlos 9
Pa>ersonsearch
PaOersonmap:
• ContribuKonfromself-vectorsiscentredattheorigin
• Self-vectorsare,inaverage,shorterthancross-vectors– Peaksfromself-vectorsdominatesinavicinityoftheorigin– Peaksfromcross-vectordominatesawayfromtheorigin
• One6-dimensionalsearchsplitsinto– RotaKonFuncKon:3-dimensionalsearch(usingself-vectors)– TranslaKonFuncKon:3-dimensionalsearch(usingcross-vectors)
November16,2018 IFSC/CCP4MXSchool,SãoCarlos 10
Rota8onFunc8on
containsonly
• self-vectorscontains
• self-vectors(signalfromoneoftheorientaKonsinthecrystal)
• cross-vectors(noise)
RF(α,β,γ ) = Pobs (r)×Pselfcalc (α,β,γ,r)dr∫∫∫
3
Pobs (r)
Pselfcalc (α,β,γ,r)
November16,2018 IFSC/CCP4MXSchool,SãoCarlos 11
Rota8onFunc8on
Crystal Model
×
α,β,γ
Pselfcalc (α,β,γ,r)Pobs (r)
November16,2018 IFSC/CCP4MXSchool,SãoCarlos 12
RF(α,β,γ ) =r∫∫∫
Transla8onFunc8on
contains• self-vectors(background)• cross-vectors
contains
• self-vectors(backgroundornoise)• cross-vectors(relevantvectors:signal,others:noise)
TF(t) = Pobs (r)×Pcrosscalc (t,r)dr∫∫∫
3
Pobs (r)
Pcrosscalc (t,r)
November16,2018 IFSC/CCP4MXSchool,SãoCarlos 13
November16,2018 IFSC/CCP4MXSchool,SãoCarlos 14
Transla8onFunc8on
• TranslaKontdoesnotchange
thestructure
– canbecompensatedwithshidof
crystallographicorigin
• TFstepisnotneeded
t
tt
t
P1
t 1
43
2
P222
• Thecentreofmolecule1:– parametert
• Centresofmolecules2,3and4– fromsymmetryoperaKon
• MRprogrammatchesPcalc(t)toPobsto
findthebestmatchingt
Fixedpar8almodel
AlmostthesameequaKonasforasinglemoleculesearch,
• AgainFFTtechniquecanbeused
• NoexcepKonforthespacegroupP1anymore!– translaKonofthesecondcopyrelaKvetothefirstonecannotbecompensatedbyashidofcrystallographicorigin
TF(t) = Ihobs × Fh
fixed +Fhcalc (t)
2
h∑
Fixedmodel
November16,2018 IFSC/CCP4MXSchool,SãoCarlos 15
Packingconsidera8ons
MoleculesinthecrystaldonotoverlapHowcanweusethisinformaKon?
» PaOersonmapdoesnotexplicitlyrevealmolecularpacking
RejectMRsoluKons
• Restrictdistancebetweencentresofmolecules
• CountcloseinteratomiccontactsModifyTF
• DivideTFbyOverlapFuncKon• MulKplyTFbyPackingfuncKon
November16,2018 IFSC/CCP4MXSchool,SãoCarlos 16
κρ
jρ
SpecialisedMRtechniques
• Searchinthedensity(phasedMR)• HandlingTranslaKonalNon-Crystallographicsymmetry– Non-originpeaksinthePaOersonmapindicatethepresenceofTNCS– Requiresspecialhandlingofmodelerrors(Phaser)– MoleculesrelatedbyTNCScanbefoundinonegoastheyhavenearlythesameorientaKon
• SelfRotaKonFuncKon
• LockedRFandTF– Usingpointsymmetryofoligomers
• ExhausKvesearches
• StochasKcsearches
November16,2018 IFSC/CCP4MXSchool,SãoCarlos 17
Molrep
AlexeyVagin
YSBLUniversityofYork
November16,2018 IFSC/CCP4MXSchool,SãoCarlos 19
Molrep
molrep-fdata.mtz-mmodel.pdb-mxfixed.pdb-starget.seq
November16,2018 IFSC/CCP4MXSchool,SãoCarlos 20
Log-file
!"#
November16,2018 IFSC/CCP4MXSchool,SãoCarlos 21
CCP4I2
Defaultprotocol
• modelcorrecKonbasedonsequenceandstructureinformaKon
• definesthenumberofmoleculesperAU
• anisotropiccorrecKonofthedata• weighKngthedataaccordingtomodelcompletenessandsimilarity
• checkforpseudotranslaKonanduseitifpresent• 30+peaksinCrossRFforuseinTF(accountsforclosepeaks)• appliedpackingfuncKon• makeuseofparKalstructure(fixedmodel)
molrep-fdata.mtz-mmodel.pdb-mxfixed.pdb-starget.seq
November16,2018 IFSC/CCP4MXSchool,SãoCarlos 22
Modelmodifica8oninMOLREP
• PerformsmodelcorrecKon:– IdenKfiessecondarystructureinthemodel
– Alignstargetandmodelsequences» nodeleKonsorinserKonsinα-helixesorβ-strands
– Retainsalignedresidues
– Retains"aligned"atomsinalignedresidues
• AddsB-factortoresiduesexposedtosolvent
• UsessequenceidenKtytodown-weighthighresoluKondata
molrep-mmodel.pdb-starget.seq
November16,2018 IFSC/CCP4MXSchool,SãoCarlos 23
Molrepprotocolfortwocopiesofamodel
RF=∑hklw*IO*IC(αβγ)TF=∑hklw*IO*IC(xyz)Rescoring:CorrelaKonCoefficient*PF
X-raydata allsteps
RF TF*PF Rescoring Par<alstructureSearchmodel
TF*PF Rescoring Possiblesolu<on
November16,2018 IFSC/CCP4MXSchool,SãoCarlos 24
Molrepprotocolfortwocopiesofamodel
RF=∑hklw*IO*IC(αβγ)TF=∑hklw*IO*IC(xyz)Rescoring:CorrelaKonCoefficient*PF
X-raydata allsteps
RF TF*PF Rescoring Par<alstructureSearchmodel
TF*PF Rescoring Possiblesolu<on
November16,2018 IFSC/CCP4MXSchool,SãoCarlos 25
MolrepvsPhaser
November16,2018 IFSC/CCP4MXSchool,SãoCarlos 26
SinglesoluKonistakenforward >> MoresophisKcatedsearchstrategyTF/sig(TF) = TFZCC >> Log-likelihoodgain(LLG)LSrigidbodyrefinement >> LL-basedrigidbodyrefinementandmoreImprovedscoring-iscrucialforusingdistanthomologuessuccessfullyinMRmethod-allowscorrectplacementofsmallfragmentmodels(evensingleatom)
Molrepprotocolfortwocopiesofamodel
RF=∑hklw*IO*IC(αβγ)TF=∑hklw*IO*IC(xyz)Rescoring:CorrelaKonCoefficient*PF
X-raydata allsteps
RF TF*PF Rescoring Par<alstructureSearchmodel
TF*PF Rescoring Possiblesolu<on
November16,2018 IFSC/CCP4MXSchool,SãoCarlos 27
Searchintheelectrondensitymap
Map coefficients Refmac
Partial structure
Molrep
Search model
Extended model
Searchinthemap• Calculate2-1or1-1mapsaderrestrainedrefinementofparKalstructure
• FlaOenthemapcorrespondingtotheknownsubstructure
• Calculatestructureamplitudesfromthemodifiedmap
• UsethesemodifiedamplitudesinRotaKonFuncKon
• Andfinally–PhasedTF
November16,2018 IFSC/CCP4MXSchool,SãoCarlos 28
Molrep:SAPTF
November16,2018 IFSC/CCP4MXSchool,SãoCarlos 29
SphericallyAveragedPhasedTranslaKonFuncKon(FFTbasedalgorithm)
€
ρ Map(s,r)
€
ρ Model(r)
€
SAPTF(s) = ρ Map(s,r)∫ ρ Model(r) r2 dr
€
Map
€
Model
€
s
Molrep:SearchinthemapwithSAPTF
November16,2018 IFSC/CCP4MXSchool,SãoCarlos 30
1.FindapproximateposiKon:SphericallyAveragedPhasedTranslaKonFuncKon
2.FindorientaKon:LocalPhasedRotaKonFuncKon– LocalsearchoftheorientaKoninthedensity
3.VerifyandadjustposiKon:PhasedTranslaKonFuncKon
Molrep:SearchinthemapwithSAPTF
November16,2018 IFSC/CCP4MXSchool,SãoCarlos 31
1.FindapproximateposiKon:SphericallyAveragedPhasedTranslaKonFuncKon
2.FindorientaKon:LocalRotaKonFuncKon– StructureamplitudesfromthedensitywithintheSAPTFsphere
3.VerifyandadjustposiKon:PhasedTranslaKonFuncKon
• LocalRFislesssensiKvethanPhasedRFtoinaccuracyofthemodelposiKon
Example
November16,2018 IFSC/CCP4MXSchool,SãoCarlos 32
• Asymmetricunit twocopies
• ResoluKon 2.8Å
Phaneet.al(2011)Nature,474,50-53
Ushercomplexstructuresolu8on
November16,2018 IFSC/CCP4MXSchool,SãoCarlos 33
3.FiXngintotheelectrondensity– FimD-Plug– FimD-NTD– FimD-CTD-2
4.Manualbuilding– FimD-CTD-1
1.ConvenKonalMR– FimC-N+FimC-C– FimH-L+FimH-P– FimD-Pore
2.Jellybodyrefinement(Refmac)– FimD-Pore
PerformanceoffiOngmethods
TryingseveralmethodsisagoodpracKce(alsobecauseofcross-validaKon)
searchmodel
sequenceidenKty
"Masked"RFPTF
prfn
SAPTFPRFPTFprfy
SAPTFLocalRFPTFprfs
FimD-Plug 3fip_A 38.5% 2(2) –(–) 1(2)
FimD-NTD 1ze3_D 100% 2(2) 1(2) 2(2)
FimD-CTD-2 3l48_A 33.3% –(–) 2(2) –(–)
November16,2018 IFSC/CCP4MXSchool,SãoCarlos 34
FiOngintoEMmaps
SPP1portalprotein
November16,2018 IFSC/CCP4MXSchool,SãoCarlos 35
Self Rotation FunctionMOLREP
Rad : 33.00 Resmax : 2.90RF(theta,phi,chi)_max : 0.2814E+05 rms : 1093.
Chi = 180.0
X
Y
RFmax = 0.2813E+05
Chi = 90.0
X
Y
RFmax = 2176.
Chi = 120.0
X
Y
RFmax = 7563.
Chi = 60.0
X
Y
RFmax = 3202.
X
Y
SelfRota8onFunc8on(SRF)
November16,2018 IFSC/CCP4MXSchool,SãoCarlos 36
PreliminaryanalysisofX-raydata• Oligomericstateoftheproteinincrystal
• SelecKonofoligomericsearchmodel
Limiteduse• NoclearinterpretaKonorevenarKfactpeaksinhighsymmetrypointgroups(e.g.622)
• differentoligomerswiththesamesymmetry
ExampleofSRF
• SpacegroupP21• One222-tetramerintheAU
Chi=180°
LockedRota8onFunc8on
• UsesSRFtoderiveNCSoperaKons• AveragesRFoverNCSoperaKons• InfavorablecasesImprovessignaltonoiseraKoinRF
AutomaKcmode:ThereisanopKonofselecKngspecificSRFpeaksAvailablefromCCP4I
molrep-fs100.mtz-mmonomer.pdb-ss100.seq–i<<+locky+
November16,2018 IFSC/CCP4MXSchool,SãoCarlos 37
One-dimensionalexhaus8vesearch(exo8ccase)
November16,2018 IFSC/CCP4MXSchool,SãoCarlos 38
SRFhelpsrestrictdimensionalityinanexhausKvesearch• OrientaKonofthetrimerisknownfromtheanalysisofSRF
• Unknownparameter:rotaKonabout3-foldaxis
• One-parametricexhausKvesearchusingTFasscorefuncKon
at χ values of 180o, 120o (Fig. 2.2c) and 90o, which correspond to the 432 symmetry. This
means that the NCS axes are in special orientations with respect to the crystallographic two-fold
axis (two of the four NCS triads are orthogonal to the crystallographic axis). These data may
suggest that the asymmetric unit of the crystal contains either four dodecamers with 23 point-
group symmetry (the high apparent symmetry of SRF in this case is the consequence of special
orientations of the NCS axes) or four 24-mers with 432 symmetry. However, in the second case
one of the six diagonal dyads of a 24-mer would be parallel to a crystallographic two-fold screw
axis and such an arrangement would generate strong peaks in the native Patterson synthesis
at v = 0.5, which were not observed. Moreover, four 24-mers would result in an impossible
specific volume and solvent content and therefore this possibility was excluded.
χSRF = 120o χSRF = 180o
χ
xx
yy
a a
b b
c c
(a)
(b)
(c)
(d) (e)
χ (o)
CC
0 40 80 1200.02
0.06
0.10
Figure 2.2. Structure solution of anti-TRAP from B. licheniformis. (a) Ribbon diagram of the dodecamer
of anti-TRAP from B. subtilis. (b) Native Patterson synthesis, in which three strong non-equivalent non-
origin peaks are present. (c) 120o and 180o sections of the SRF, indicating the orientations of two-fold
and threefold NCS axes. The trimeric search model (centre) was oriented so that its threefold molecular
axis was aligned with the NCS threefold axis (red lines). TF searches were performed for a series of
orientations related to that shown by a rotation around the molecular threefold axis by the variable angle
χ. (d) The highest CC in the TF search is plotted as a function of χ. (e) The MR solution with four
dodecamers in the asymmetric unit, which are related by the translational NCS. This figure was prepared
using BOBSCRIPT,MOLREP, CCP4mg (Potterton et al., 2004) and R.
63
MRsubstructuresolu8on(exo8ccase)
November16,2018 IFSC/CCP4MXSchool,SãoCarlos 39
• SelectisomorphousderivaKve– bycomparingnaKveSRFandSRFfromD-iso
• Hg-substructureisa13-atomring(fromnaKveSRFanalysis)– OrientaKonoftheringisknownfromtheanalysisofSRF– Unknownparameters:radiusofthering,rotaKonabout13-foldaxis
• Two-parametricexhausKvesearch
A series of search models was generated. Each model contained 13 Hg atoms located around
a circle with a step of 27.7o (Fig. 2.7b). An additional carbon atom was placed on the axis of
the circle but outside of its plane for the MR program to be not confused with an ill-conditioned
inertia matrix, which would occur for a flat model. The radius of the circle varied in the series of
the search models from 10 to 90 A with a step of 1 A. Every model was submitted to a rotation
Fobs Diso Diso(native) (Hg-1 – native) (Hg-2 – native)
χ=180o
x
y
x
y
x
y
χ=27
.7o
x
y
x
y
x
y
(a)
R
(b) R0 30 60 90
RF
0
20
40
60Hg-1Hg-2
(c)
Figure 2.7. The Hg-substructure solution of the portal-protein derivative crystals. (a) The SRF computed
for the native data (left) and for the isomorphous differences between the native data and the derivative
data Hg-1 (middle) and Hg-2 (right). (b) The search model for exhaustive search composed of 13 mercury
atoms located on the circle of variable radius R. (c) The plots of the maximal value of the CRF against
R computed for the isomorphous differences between the native data and the derivative data Hg-1 (thick
line) and Hg-2 (thin line).
85
A series of search models was generated. Each model contained 13 Hg atoms located around
a circle with a step of 27.7o (Fig. 2.7b). An additional carbon atom was placed on the axis of
the circle but outside of its plane for the MR program to be not confused with an ill-conditioned
inertia matrix, which would occur for a flat model. The radius of the circle varied in the series of
the search models from 10 to 90 A with a step of 1 A. Every model was submitted to a rotation
Fobs Diso Diso(native) (Hg-1 – native) (Hg-2 – native)
χ=180o
x
y
x
y
x
y
χ=27
.7o
x
y
x
y
x
y
(a)
R
(b) R0 30 60 90
RF
0
20
40
60Hg-1Hg-2
(c)
Figure 2.7. The Hg-substructure solution of the portal-protein derivative crystals. (a) The SRF computed
for the native data (left) and for the isomorphous differences between the native data and the derivative
data Hg-1 (middle) and Hg-2 (right). (b) The search model for exhaustive search composed of 13 mercury
atoms located on the circle of variable radius R. (c) The plots of the maximal value of the CRF against
R computed for the isomorphous differences between the native data and the derivative data Hg-1 (thick
line) and Hg-2 (thin line).
85
Whichdirec8ondoesMRgo?
Automation: ✖ Collection of tricks ✔ Improvement of "standard" methods ✔ Better scoring system ✔✔ Models
November16,2018 IFSC/CCP4MXSchool,SãoCarlos 40
November16,2018 IFSC/CCP4MXSchool,SãoCarlos 41