H.F. Fan & Y.X. Gu
Beijing National Laboratory for Condensed Matter Physics
Institute of Physics, Chinese Academy of Sciences
P.R. China
Direct Methods in Protein Crystallography
• The phase problem & direct methods
• Sayre’s equation & tangent formula
• Use of direct methods in protein crystallography
• Direct-method SAD/SIR phasing
• Direct-method aided model completion
The Phase Problem
$$\rho(x, y, z) = \frac{1}{V} \sum_{h,k,l} F(h, k, l)\, e^{-2\pi i (hx + ky + lz)}$$

$$F(h, k, l) = |F(h, k, l)|\, e^{i\varphi(h, k, l)}, \qquad \varphi(h, k, l) = \;?$$
The Point of View from Direct Methods:
Phases are not missing but just hidden in the magnitudes!
What is a Direct Method?
It derives phases directly from the magnitudes.

$$|F(h, k, l)| \;\Rightarrow\; \varphi(h, k, l)$$
Why is it possible?

$$F(h, k, l) = \sum_{j=1}^{N} f_j \exp\left[2\pi i (hx_j + ky_j + lz_j)\right]$$
Each reflection carries one unknown phase but yields two simultaneous equations, one for the real part and one for the imaginary part of the structure factor, so it contributes a net gain of one equation. Hence, in theory, a diffraction data set of 3n reflections can be used to solve a structure with n independent atoms (assuming 3 positional parameters per atom).
That is to say, given the known atomic scattering factors, the phases may, at least in theory, be derived from a large enough set of magnitudes.
$$|F(h, k, l)| \cos\varphi(h, k, l) = \sum_{j=1}^{N} f_j \cos 2\pi (hx_j + ky_j + lz_j)$$

$$|F(h, k, l)| \sin\varphi(h, k, l) = \sum_{j=1}^{N} f_j \sin 2\pi (hx_j + ky_j + lz_j)$$
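The counting argument can be checked numerically. The following is a minimal sketch, assuming a made-up three-atom structure with constant scattering factors; the names `structure_factor` and `min_reflections` are illustrative only. It evaluates one structure factor, confirms that its magnitude and phase satisfy the cosine and sine equations, and counts how many reflections are needed before the equations outnumber the unknowns.

```python
import cmath
import math


def structure_factor(hkl, atoms):
    """F(h,k,l) = sum_j f_j * exp[2*pi*i*(h*x_j + k*y_j + l*z_j)]."""
    h, k, l = hkl
    return sum(f * cmath.exp(2j * math.pi * (h * x + k * y + l * z))
               for f, (x, y, z) in atoms)


def min_reflections(n_atoms, params_per_atom=3):
    """M reflections give 2*M equations and add M unknown phases, so
    2*M >= params_per_atom*n_atoms + M, i.e. M >= params_per_atom*n_atoms."""
    return params_per_atom * n_atoms


# Hypothetical toy structure: (scattering factor, fractional coordinates).
atoms = [(6.0, (0.10, 0.20, 0.30)),
         (7.0, (0.40, 0.15, 0.75)),
         (8.0, (0.65, 0.80, 0.05))]
hkl = (1, 2, 3)

F = structure_factor(hkl, atoms)
magnitude, phase = abs(F), cmath.phase(F)

# Right-hand sides of the two real equations carried by this reflection.
h, k, l = hkl
cos_sum = sum(f * math.cos(2 * math.pi * (h * x + k * y + l * z))
              for f, (x, y, z) in atoms)
sin_sum = sum(f * math.sin(2 * math.pi * (h * x + k * y + l * z))
              for f, (x, y, z) in atoms)

print(f"|F| = {magnitude:.4f}, phase = {math.degrees(phase):.2f} deg")
print("reflections needed for 3 atoms:", min_reflections(len(atoms)))
```

In practice the data must also be accurate and extend to sufficient resolution; the count only shows that the problem is not underdetermined in principle.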
Conditions for the Sayre Equation to be valid
1. Positivity
2. Atomicity
3. Equal-atom structure
Sayre’s Equation

$$F_{\mathbf h} = \frac{f}{f^{sq}\, V} \sum_{\mathbf h'} F_{\mathbf h'}\, F_{\mathbf h - \mathbf h'}$$
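Sayre's equation can be verified on a toy case. The sketch below assumes a one-dimensional unit cell of length V = 1 holding four equal, well-separated Gaussian atoms at made-up positions; the constant `A`, the truncation `H_MAX` and all function names are illustrative choices, not part of the original material. Here `f_sq` is the scattering factor of the squared atom, obtained analytically for a Gaussian.

```python
import cmath
import math

V = 1.0                                  # 1D unit-cell length (assumed)
A = 2 * math.pi ** 2 * 0.01 ** 2         # fall-off of f(h) for a Gaussian atom
POSITIONS = [0.10, 0.35, 0.62, 0.88]     # hypothetical fractional coordinates
H_MAX = 150                              # truncation of the sum over h'


def f(h):
    """Scattering factor of one Gaussian atom."""
    return math.exp(-A * h * h)


def f_sq(h):
    """Scattering factor of the squared atom (valid for V = 1)."""
    return math.sqrt(math.pi / (2 * A)) * math.exp(-A * h * h / 2)


def F(h):
    """Structure factor of the equal-atom structure."""
    return f(h) * sum(cmath.exp(2j * math.pi * h * x) for x in POSITIONS)


def sayre_rhs(h):
    """Right-hand side of Sayre's equation: f/(f_sq*V) * sum_h' F_h' F_(h-h')."""
    convolution = sum(F(hp) * F(h - hp) for hp in range(-H_MAX, H_MAX + 1))
    return f(h) / (f_sq(h) * V) * convolution


for h in range(6):
    lhs, rhs = F(h), sayre_rhs(h)
    print(h, f"{lhs:.6f}", f"{rhs:.6f}")
```

The two columns agree because the atoms are equal, positive and do not overlap, which is exactly the set of validity conditions listed for the equation; moving two atoms close together or giving them different weights makes the agreement break down.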
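The tangent formula

$$\alpha \sin\beta_{\mathbf h} = \sum_{\mathbf h'} \kappa_{\mathbf h, \mathbf h'} \sin(\varphi_{\mathbf h'} + \varphi_{\mathbf h - \mathbf h'})$$

$$\alpha \cos\beta_{\mathbf h} = \sum_{\mathbf h'} \kappa_{\mathbf h, \mathbf h'} \cos(\varphi_{\mathbf h'} + \varphi_{\mathbf h - \mathbf h'})$$

$$\tan\beta_{\mathbf h} = \frac{\sum_{\mathbf h'} \kappa_{\mathbf h, \mathbf h'} \sin(\varphi_{\mathbf h'} + \varphi_{\mathbf h - \mathbf h'})}{\sum_{\mathbf h'} \kappa_{\mathbf h, \mathbf h'} \cos(\varphi_{\mathbf h'} + \varphi_{\mathbf h - \mathbf h'})}$$

$$\alpha = \left\{ \left[ \sum_{\mathbf h'} \kappa_{\mathbf h, \mathbf h'} \sin(\varphi_{\mathbf h'} + \varphi_{\mathbf h - \mathbf h'}) \right]^2 + \left[ \sum_{\mathbf h'} \kappa_{\mathbf h, \mathbf h'} \cos(\varphi_{\mathbf h'} + \varphi_{\mathbf h - \mathbf h'}) \right]^2 \right\}^{1/2}$$

$$P(\varphi_{\mathbf h}) = \left[ 2\pi I_0(\alpha) \right]^{-1} \exp\left[ \alpha \cos(\varphi_{\mathbf h} - \beta_{\mathbf h}) \right]$$

$$\kappa_{\mathbf h, \mathbf h'} = 2 \sigma_3 \sigma_2^{-3/2} \left| E_{\mathbf h} E_{\mathbf h'} E_{\mathbf h - \mathbf h'} \right|$$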
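A small numerical sketch of the tangent formula and the Cochran distribution follows. It assumes the κ values and the two contributing phases of each term are already known and are passed in as plain tuples; the function names and the example numbers are made up for illustration, and I0 is evaluated with its power series to stay within the standard library.

```python
import math


def tangent_formula(terms):
    """terms: iterable of (kappa, phi_h', phi_(h-h')) with phases in radians.
    Returns (alpha, beta): reliability and most probable value of phi_h."""
    s = sum(k * math.sin(p1 + p2) for k, p1, p2 in terms)
    c = sum(k * math.cos(p1 + p2) for k, p1, p2 in terms)
    alpha = math.hypot(s, c)
    beta = math.atan2(s, c) % (2 * math.pi)   # atan2 keeps the correct quadrant
    return alpha, beta


def bessel_i0(x, n_terms=60):
    """Modified Bessel function I0(x) = sum_k (x^2/4)^k / (k!)^2."""
    total, term = 1.0, 1.0
    for k in range(1, n_terms):
        term *= (x / (2 * k)) ** 2
        total += term
    return total


def cochran_probability(phi, alpha, beta):
    """P(phi) = exp[alpha*cos(phi - beta)] / [2*pi*I0(alpha)]."""
    return math.exp(alpha * math.cos(phi - beta)) / (2 * math.pi * bessel_i0(alpha))


# Hypothetical contributors to one reflection h.
terms = [(2.0, 0.3, 0.5), (1.5, 1.0, -0.1), (0.8, 2.0, -1.3)]
alpha, beta = tangent_formula(terms)
print(f"alpha = {alpha:.3f}, beta = {math.degrees(beta):.1f} deg")
print(f"P(beta) = {cochran_probability(beta, alpha, beta):.3f}")
```

Using `atan2` on the two sums rather than dividing them avoids the quadrant ambiguity of the tangent and the division by zero when the cosine sum vanishes.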
Use of direct methods in Protein Crystallography
• Locating heavy atoms
• Ab initio phasing of protein diffraction data at 1.2Å or higher resolution: SnB, SHELXD, ACORN
• Direct-method aided SAD/SIR phasing and structure-model completion: OASIS
Direct methods breaking the SAD/SIR phase ambiguity
Phase information available in SAD
• Bimodal distribution from SAD: peaked at φ″ ± |Δφ|, where φ″ is the phase of F″
• Cochran distribution: peaked anywhere from 0 to 2π
• Sim distribution: peaked at φ″ − π/2
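P+ formula
Acta Cryst. A40, 489-495 (1984); Acta Cryst. A40, 495-498 (1984); Acta Cryst. A41, 280-284 (1985)

$$P_+(\Delta\varphi_{\mathbf h}) = \frac{1}{2} + \frac{1}{2} \tanh\left\{ \sin|\Delta\varphi_{\mathbf h}| \left[ \sum_{\mathbf h'} m_{\mathbf h'}\, m_{\mathbf h - \mathbf h'}\, \kappa_{\mathbf h, \mathbf h'} \sin\left( \Phi'_3 + \Delta\varphi_{\mathbf h',\,best} + \Delta\varphi_{\mathbf h - \mathbf h',\,best} \right) + \chi \sin\delta_{\mathbf h} \right] \right\}$$

$$\varphi_{\mathbf h} = \varphi'_{\mathbf h} + \Delta\varphi_{\mathbf h}$$

Reducing the phase problem to a sign problem
Breaking the SAD/SIR phase ambiguity by the Cochran distribution, incorporating partial-structure information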
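How the P+ formula turns phasing into a sign decision can be sketched in a few lines. This is an illustrative reading of the formula, not the OASIS implementation: the function names, the tuple layout of each contributing term and the example numbers are assumptions, and P+ is taken as the probability that Δφ is positive.

```python
import math


def p_plus(abs_dphi, terms, chi=0.0, delta=0.0):
    """Probability that the sign of delta-phi for reflection h is positive.

    abs_dphi : |delta-phi_h| in radians (known from the SAD/SIR data)
    terms    : iterable of (m1, m2, kappa, phi3, dphi1_best, dphi2_best)
               for each contributing pair h', h-h'
    chi, delta : weight and phase offset of the partial-structure (Sim) term
    """
    total = sum(m1 * m2 * kappa * math.sin(phi3 + d1 + d2)
                for m1, m2, kappa, phi3, d1, d2 in terms)
    return 0.5 + 0.5 * math.tanh(math.sin(abs_dphi) * (total + chi * math.sin(delta)))


def resolve_phase(phi_ref, abs_dphi, probability):
    """Pick phi_ref + |dphi| when P+ >= 0.5, otherwise phi_ref - |dphi|."""
    sign = 1.0 if probability >= 0.5 else -1.0
    return (phi_ref + sign * abs_dphi) % (2 * math.pi)


# Hypothetical inputs for one reflection.
terms = [(0.9, 0.8, 1.6, 0.4, 0.3, -0.1),
         (0.7, 0.6, 1.1, -0.2, 0.5, 0.2)]
p = p_plus(abs_dphi=1.0, terms=terms, chi=0.8, delta=0.6)
phi = resolve_phase(phi_ref=2.0, abs_dphi=1.0, probability=p)
print(f"P+ = {p:.3f}, chosen phase = {math.degrees(phi):.1f} deg")
```

With no contributing terms and no partial-structure information the function returns exactly 0.5, which is the unresolved bimodal case; every term that pushes the bracket away from zero moves P+ toward 0 or 1.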
Direct-method phasing of the 2Å experimental SAD data of the protein aPP
Avian Pancreatic Polypeptide
Space group: C2
Unit cell: a = 34.18, b = 32.92, c = 28.44Å; β = 105.3°
Protein atoms in ASU: 301
Resolution limit: 2.0Å
Anomalous scatterer: Hg (in centric arrangement)
Wavelength: 1.542Å (Cu-Kα)
Δf″ = 7.686
Locating heavy atoms & SAD phasing: direct methods
Acta Cryst. A46, 935 (1990)
Data courtesy of Professor Tom Blundell
Further developments
• Direct-method SAD/SIR phasing combined with density modification: OASIS + DM, OASIS + RESOLVE, SOLVE/RESOLVE + OASIS
• Direct-methods aided dual-space structure-model completion: ARP/wARP + OASIS, PHENIX + OASIS
TTHA1634 from Thermus thermophilus HB8
Data courtesy of Professor Nobuhisa Watanabe, Department of Biotechnology and Biomaterial Chemistry, Nagoya University, Japan
Space group: P2₁2₁2
Unit cell: a = 100.57, b = 109.10, c = 114.86Å
Number of residues in the AU: 1206
Resolution limit: 2.1Å
Multiplicity: 29.2
Anomalous scatterer: S (22)
X-ray wavelength: λ = 1.542Å (Cu-Kα)
Bijvoet ratio: <|ΔF|>/<F> = 0.55%
Phasing method: A single run of OASIS2006 + DM (Cowtan)
Model building: ARP/wARP
ARP/wARP found 1178 of the total 1206 residues, all docked into the sequence.
Ribbon model plotted by PyMOL
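Dual-space fragment extension
[Flowchart] Partial structure → Reciprocal-space fragment extension (OASIS + DM) → Real-space fragment extension (RESOLVE BUILD and/or ARP/wARP) → Partial model → OK? No: return to reciprocal-space fragment extension with the partial model. Yes: End.

The reciprocal-space step uses the P+ formula:

$$P_+(\Delta\varphi_{\mathbf h}) = \frac{1}{2} + \frac{1}{2} \tanh\left\{ \sin|\Delta\varphi_{\mathbf h}| \left[ \sum_{\mathbf h'} m_{\mathbf h'}\, m_{\mathbf h - \mathbf h'}\, \kappa_{\mathbf h, \mathbf h'} \sin\left( \Phi'_3 + \Delta\varphi_{\mathbf h',\,best} + \Delta\varphi_{\mathbf h - \mathbf h',\,best} \right) + \chi \sin\delta_{\mathbf h} \right] \right\}$$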
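The control flow of the dual-space iteration can be written as a short loop. The sketch below is only a skeleton under stated assumptions: `reciprocal_step`, `real_step` and `is_ok` are hypothetical callables standing in for runs of OASIS + DM, RESOLVE BUILD or ARP/wARP, and the user's acceptance test; nothing here calls those programs, and the toy stand-ins simply track a completeness fraction.

```python
def dual_space_extension(partial_model, reciprocal_step, real_step, is_ok,
                         max_cycles=10):
    """Alternate reciprocal-space and real-space fragment extension.

    reciprocal_step(model) -> phases   (stand-in for OASIS + DM)
    real_step(phases)      -> model    (stand-in for RESOLVE BUILD / ARP/wARP)
    is_ok(model)           -> bool     (the "OK?" decision box)
    Returns the last model and the number of cycles that were run.
    """
    model = partial_model
    for cycle in range(1, max_cycles + 1):
        phases = reciprocal_step(model)   # improve phases from the partial model
        model = real_step(phases)         # rebuild and extend the model
        if is_ok(model):
            return model, cycle
    return model, max_cycles


# Toy stand-ins: the "model" is just the fraction of the structure built.
def toy_reciprocal_step(fraction_built):
    return fraction_built                 # pretend phase quality tracks the model


def toy_real_step(phase_quality):
    return min(1.0, phase_quality + 0.2)  # each cycle builds a little more


final_model, cycles = dual_space_extension(
    0.17, toy_reciprocal_step, toy_real_step, lambda m: m >= 0.95)
print(f"stopped after {cycles} cycles with {final_model:.0%} built")
```

Passing the steps in as callables keeps the loop independent of which programs fill each box, which mirrors the interchangeable combinations named on the slide.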
[Figure: ribbon models before and after dual-space fragment extension; percentages as labelled on the slide]
• Glucose isomerase, S-SAD, Cu-Kα: 17% at cycle 0, 97% at cycle 6
• Alanine racemase, Se, S-SAD, Cr-Kα: 52% at cycle 0, 97% at cycle 4
• Xylanase, S-SAD, synchrotron λ = 1.49Å: 25% at cycle 0, 99% at cycle 6
• Lysozyme, S-SAD, Cr-Kα: 52% at cycle 0, 98% at cycle 6
• Azurin, Cu-SAD, synchrotron λ = 0.97Å: 42% at cycle 0, 95% at cycle 3
Ribbon models plotted by PyMOL
Data courtesy of Professor N. Watanabe, Professor S. Hasnain, Dr. Z. Dauter and Dr. C. Yang
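Direct-method aided MR-model completion
Dual-space fragment extension without SAD/SIR information
[Flowchart] MR model → Partial structure → Phase improvement by OASIS → Density modification by DM → Model completion by ARP/wARP or PHENIX → OK? No: return to phase improvement with the partial structure. Yes: End.

The phase-improvement step uses the P+ formula:

$$P_+(\Delta\varphi_{\mathbf h}) = \frac{1}{2} + \frac{1}{2} \tanh\left\{ \sin|\Delta\varphi_{\mathbf h}| \left[ \sum_{\mathbf h'} m_{\mathbf h'}\, m_{\mathbf h - \mathbf h'}\, \kappa_{\mathbf h, \mathbf h'} \sin\left( \Phi'_3 + \Delta\varphi_{\mathbf h',\,best} + \Delta\varphi_{\mathbf h - \mathbf h',\,best} \right) + \chi \sin\delta_{\mathbf h} \right] \right\}$$

φ″_h: the phase of 5% of the atoms, randomly selected from the current model

$$\varphi_{\mathbf h} = \varphi''_{\mathbf h} + \Delta\varphi_{\mathbf h}, \quad \text{i.e.} \quad \Delta\varphi^{model}_{\mathbf h} = \varphi^{model}_{\mathbf h} - \varphi''_{\mathbf h}$$

[Figure: phase diagrams for the cases P+ > 0.5 and P+ < 0.5, each showing φ″, the model phase and <|Δφ|>]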
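The reference phase used when no SAD/SIR signal is available can be illustrated as follows. This is a minimal sketch assuming a randomly generated 200-atom model with equal scattering factors, and assuming that Δφ is taken as the difference between the full-model phase and φ″; the function names, the seed values and the reflection index are made up for the example.

```python
import cmath
import math
import random


def select_reference_atoms(atoms, fraction=0.05, seed=None):
    """Randomly pick the given fraction of atoms from the current model."""
    rng = random.Random(seed)
    n_pick = max(1, round(fraction * len(atoms)))
    return rng.sample(list(atoms), n_pick)


def phase_of(atoms, hkl):
    """Phase (radians) of the structure factor computed from the given atoms."""
    h, k, l = hkl
    F = sum(f * cmath.exp(2j * math.pi * (h * x + k * y + l * z))
            for f, (x, y, z) in atoms)
    return cmath.phase(F)


def wrap(angle):
    """Map an angle in radians to the interval [-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


# Hypothetical current model: 200 equal atoms at random fractional coordinates.
rng = random.Random(0)
model = [(6.0, (rng.random(), rng.random(), rng.random())) for _ in range(200)]

subset = select_reference_atoms(model, fraction=0.05, seed=1)
hkl = (1, 2, 3)
phi_ref = phase_of(subset, hkl)          # plays the role of phi''
phi_model = phase_of(model, hkl)
dphi_model = wrap(phi_model - phi_ref)   # signed delta-phi implied by the model

print(f"{len(subset)} reference atoms, |dphi| = {math.degrees(abs(dphi_model)):.1f} deg")
```

Drawing a fresh random subset in each cycle changes φ″ and therefore the pair of candidate phases that the P+ formula has to choose between.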
MR-model completion of 1UJZ
Space group: I222
Unit cell: a = 62.88, b = 74.55, c = 120.44Å
Number of residues in AU: 215
Resolution limit: 2.1Å
[Figure: ribbon models at successive stages]
• MR model: 46 residues, 13 with side chains
• ARP/wARP-DM iteration: cycles 1 and 2
• ARP/wARP-OASIS-DM iteration: cycles 1, 3, 5 and 7, reaching 201 residues, all with side chains
• Final model (1UJZ): 215 residues
Ribbon models plotted by PyMOL
MR-model completion of an originally unknown protein
Space group: P2₁2₁2₁
Unit cell: a = 71.81, b = 81.40, c = 108.95Å
Number of residues in AU: 728
Solvent content: 0.37
Resolution limit: 2.5Å

Stage                           R-factor  R-free  Residues  With side chains
Starting model                  0.34      0.44    479       479
After phenix.autobuild          0.33      0.40    503       503
After 4 cycles of oasis-phenix  0.24      0.30    597       588
What’s the low resolution limit for direct methods?
SAD phasing at different resolutions
TTHA1634 Cu-Kα data, <|ΔF|>/<F> ~ 0.55%
• 2.1Å: Very good
• 3.0Å: Good
• 3.5Å: Marginally traceable
• 4.0Å: Still informative
Maps at 1σ phased by a single run of OASIS + DM (Cowtan), plotted by PyMOL
Combining SOLVE/RESOLVE and OASIS + DM: dealing with low resolution SIR/SAD data
R-phycoerythrin
SIR data from the native and the p-chloromercuriphenyl sulphonic acid derivative
Space group: R3
Unit cell: a = b = 189.8, c = 60.0Å; γ = 120°
Number of residues in the ASU: 668
Resolution limit: 2.8Å
Replacing atoms: Hg
X-rays: Cu-Kα, λ = 1.542Å
J. Mol. Biol. 262, 721-731 (1996); Chinese Physics 16, 3022-3028 (2007)
[Figure: maps from SOLVE/RESOLVE compared with SOLVE/RESOLVE & OASIS + DM, plotted by PyMOL]
Tom70p
Space group: P2₁
Unit cell: a = 44.89, b = 168.8, c = 83.4Å; β = 102.74°
Number of residues: 1086
Resolution limit: 3.3Å
Multiplicity: 3.3
Anomalous scatterer: Se (24)
X-rays: Synchrotron, λ = 0.9789Å, Δf″ = 6.5
Bijvoet ratio: <|ΔF|>/<|F|> = 4.3%
Nature Structural & Molecular Biology 13, 589-593 (2006); Chinese Physics B 17, 1-9 (2008)
[Figure: maps from SOLVE/RESOLVE compared with SOLVE/RESOLVE & OASIS + DM, plotted by PyMOL]
OASIS-2006
Institute of Physics, Chinese Academy of Sciences
Beijing 100080, P.R. China
http://cryst.iphy.ac.cn
http://www.ccp4.ac.uk/prerelease
Acknowledgements
Professor Zhengjiong Lin
Institute of Biophysics, Chinese Academy of Sciences, Beijing, China

Drs Y. He¹, D.Q. Yao¹, J.W. Wang¹, S. Huang¹, J.R. Chen¹, Q. Chen², H. Li³, Prof. T. Jiang³, Mr. T. Zhang¹, Mr. L.J. Wu¹ & Prof. C.D. Zheng¹

¹ Beijing National Laboratory for Condensed Matter Physics, Institute of Physics, Chinese Academy of Sciences, China
² National Laboratory of Protein Engineering and Plant Genetic Engineering, Peking University, Beijing, China
³ Institute of Biophysics, Chinese Academy of Sciences, Beijing, China

The project is supported by the Chinese Academy of Sciences and the 973 Project (Grant No 2002CB713801) of the Ministry of Science and Technology of China.
Thank you!