ab initio Calculation of NMR Chemical Shifts in Denatured Proteins:

Prediction of Secondary Structural Preferences

Abhilash Kannan,1 Dinesh Kumar,2 R. V. Hosur,2* Niels Chr. Nielsen,3* S.Ganapathy1*

1Central NMR Facility, National Chemical Laboratory, Homi Bhabha Road, Pune 411008,

India and CAS in Crystallography & Biophysics, University of Madras, Chennai-600025, India

2Department of Chemical Sciences, Tata Institute of Fundamental Research, Homi Bhabha

Road, Mumbai-400005, India

3Center for Insoluble Protein Structures (inSPIN), Interdisciplinary Nanoscience Center (iNANO) and Department of Chemistry, Aarhus University, DK-8000,

Aarhus C, Denmark

Authors for correspondence: [email protected] [email protected]

Abstract

An in silico approach aimed at determining the secondary structural preferences in

unfolded/denatured proteins is proposed and is based on molecular dynamical simulations

to arrive at the conformational states of a denatured protein and ab initio quantum

chemical methods to determine the dynamically averaged chemical shifts. The first

successful demonstration of this approach is presented for the 8 M urea denatured SUMO

protein from Drosophila melanogaster (dSmt3) for which full resonance assignments and

chemical shift determinations have been previously made from solution state

multidimensional NMR experiments. It is shown that from the ab initio calculations of 13C chemical shielding tensors, the cumulative (Cα, CO) shifts can be determined and used

as a marker to unravel the secondary structural features exhibited by the denatured protein.

The calculations on 8M urea denatured dSmt3 reveal α-helical and β-sheet propensities

and these are in excellent accord with experimental results. On the whole, our work

illustrates the usefulness of this approach in predicting NMR chemical shifts in

unfolded/denatured proteins and deriving secondary structural information.

------------------------------------------------------------------------------------------------------------

Key words

denatured proteins, secondary chemical shifts, molecular dynamics simulations, ab initio

calculations

Abbreviations Used

SUMO: Small Ubiquitin-related modifier; dSmt3: SUMO-1 homologue in Drosophila

melanogaster; NMR: Nuclear Magnetic Resonance; MD: Molecular Dynamics; ONIOM:

Our own N-layered Integrated molecular Orbital and molecular Mechanics

Introduction

NMR chemical shifts and protein structure have an intimate relationship (Wishart

et al. 1992; Sternberg et al. 2004; Cavalli et al. 2007). In folded proteins, the valuable

structural information they provide has been the basis of several algorithms published for

calculation of solution structures of proteins purely on the basis of chemical shifts

(Cornilescu et al. 1999; Iwadate et al. 1999; Wang and Jardetzky 2002; Meiler 2003; Neal

et al. 2003; Gong et al. 2007; Matsuki et al. 2007; Shen and Bax 2007; Shen et al. 2008;

Sibley et al. 2003, Wishart et al. 2008; Shen et al. 2009). Due to the strong dependence

they have on the secondary structure, chemical shift based assignment and refinement

strategies have therefore assumed considerable importance in biomolecular NMR. The

vast majority of NMR determined protein structures, which are routinely deposited in the

BMRB and PDB data banks, have led to empirical methods by which chemical shift based

structure predictions can be made with an accuracy reported to be better than 95%. Thus,

while working with new proteins, a fair amount of secondary structural information can

already be derived from the chemical shifts alone even before detailed structural

calculations are carried out and the derived information used as input for further structure

refinements.

In denatured proteins, the starting point of protein folding inside a cell, NMR

studies have revealed the presence of residual structures which are believed to be the

initiation sites for protein folding (Bhuyan 2002; Chiti and Dobson 2006; Dyson and

Wright 2005; Francis et al. 2006; Neri et al. 1992; Shortle and Ackerman 2001; Tafer et

al. 2004). Folding pathways can be different depending upon the initial conditions and

there can be many different parallel pathways. Detailed investigations of this type are

few because of the difficulties in assigning NMR spectra of denatured proteins. The

determination of NMR chemical shifts in such situations has therefore remained a

formidable proposition. Computational support is also unavailable at this point since there

are no established methods for calculation of NMR chemical shifts for such denatured

proteins, which are highly dynamic, with multiple conformations in equilibrium in

the ensemble, so that the observed chemical shifts are ensemble averages.

For the denatured/unfolded proteins, theoretical calculation of chemical shifts by

quantum chemical methods (Oldfield 1995; de Dios et al. 1996; Pearson et al. 1995)

provides a means for assessing their structural properties. Although the methodology for

calculation of chemical shifts for folded proteins has been well established and several

successful applications have been reported (de Dios et al. 1993a; de Dios et al. 1993b; Gao

et al. 2007; Havlin et al. 1997; Laws et al. 1993; Pearson et al. 1997; Sun et al. 2002; Vila

et al. 2008, Vila and Scheraga 2009), no attempt has so far been made in this direction to

determine the chemical shifts of denatured/unfolded proteins as a means of assessing their

structural properties. Such studies are probably lacking due to paucity of structural data, a

prerequisite for any theoretical chemical shift calculation.

Against this background, as a way forward, we have envisioned an in silico approach

wherein the preferred conformational states of a protein under given denaturing conditions

are first determined by molecular dynamics simulations and the structural data obtained

therein is further used to determine the ensemble averaged chemical shifts by ab initio

quantum chemical methods (Facelli 2002; Casabianca and de Dios 2008; Jameson and de

Dios 2009). Formidable as it may sound, a successful calculation would be extremely

useful in predicting the structural propensities and identifying the hotspots for protein

folding initiation. In this article, we present the first such attempt considering certain

segments of a denatured protein. For the purpose of demonstration, we have chosen the 88-

residue-long Drosophila melanogaster SUMO (dSmt3) protein as a model system since

this protein has been well studied both in the folded (Kumar et al. 2009a) and 8M urea-

denatured state (Kumar et al. 2009b) and full resonance assignments have been made

from multidimensional NMR experiments.

Computational Methods

MD simulations of the 8 M urea denatured states of SUMO (Asn22-Pro61)

Molecular dynamics (MD) simulations have been used to determine the various

conformational states of the denatured dSmt3 protein which we have taken as the model

system. For MD simulations, the protein fragment Asn22-to-Pro61 was chosen as this

region has been shown to depict pronounced secondary structural propensities in the

denatured state (Kumar et al. 2009b). One hundred random topologies of this fragment were

generated using CYANA-3.0 (Güntert et al. 1997). Out of these, the five with the lowest CYANA

target function were selected for the 13C chemical shielding calculations, and these were

taken as members of the inter-converting conformers in the unfolded ensemble. This is of

course an extreme simplification and was largely driven by computational limitations.

Nevertheless, this number of conformers can be considered adequate for

determining the dynamically averaged chemical shifts and their comparison with

experimental data. As we show later, the results are highly encouraging and match the

experimental data quite well.

To mimic the experimental conditions (Kumar et al. 2008, Kumar et al. 2009b),

each of the selected five topologies (generated in CYANA) corresponding to the selected

fragment of dSmt3 polypeptide chain was subjected to energy minimization in aqueous

urea solution (~8 M) using the software package GROMACS 4.0 (Hess et al. 2008). The 8 M

urea system (mole fraction of 0.186) was constructed by randomly replacing water

molecules with urea, resulting in 518 water molecules and 114 urea molecules. The box

volume was adjusted to give the experimental density for 300 K i.e. 1.12 g/ml (Kawahara

and Tanford 1996; Stumpe and Grubmuller 2007). The urea system was then subjected to

energy minimization using the steepest descent method for 2000 steps. The generated urea

box was then used for equilibrating all the polypeptide fragments under the actual

experimental conditions (pH ~ 5.6 and 300 K) at which the NMR experiments (Kumar et

al. 2008) had been carried out. The ionizable side groups were properly charged,

depending upon their corresponding pKa values, i.e., (i) all lysines and arginines were

positively charged, (ii) all aspartic and glutamic acids were deprotonated, and (iii)

histidines were protonated. Then, depending upon the charge of the whole system,

sodium/chloride ions were added to the system to maintain charge neutrality. The

electrostatics were treated by the Particle Mesh Ewald (PME) method (Darden et

al. 1993) with a Coulomb cut-off of 1.4 nm, a Fourier spacing of 0.12 nm and an

interpolation order of 4. Each topology was equilibrated in two steps: first, the topology

was energy minimized using the steepest descent method for 2000 steps and, second, a

position-restrained MD run was carried out with position restraints on heavy atoms and

LINCS constraints (Hess et al. 1997) on all bonds. In the position-restrained

MD run, the water molecules were first energy minimized and then briefly equilibrated

around the protein by a 200 ps dynamics simulation while protein and ion coordinates

were held fixed. Next, the whole system (i.e. protein in 8M urea box) was subjected to

an energy minimization step using the GROMOS-96 43A1 force field (Scott et al. 1999) and

periodic boundary conditions under NVT (constant number of particles, volume, and

temperature). The temperature of the system was regulated by weak coupling to an

external bath (Berendsen et al. 1984). Cut-off distances for the calculation of the Coulomb

and van der Waals interactions were 1.0 and 1.4 nm, respectively. The five topologies

obtained from the final MD simulations are shown in Figure 1.
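
The composition of the urea box described above can be checked with a short calculation. The following is a minimal sketch, not part of the original workflow: it takes the stated molecule counts (518 water, 114 urea) and the stated density (1.12 g/ml) and estimates the box volume and the urea molarity. The molar masses and the function name `urea_box` are assumptions introduced for illustration.

```python
AVOGADRO = 6.02214076e23   # molecules per mole
M_WATER = 18.015           # g/mol (assumed standard value)
M_UREA = 60.056            # g/mol (assumed standard value)


def urea_box(n_water, n_urea, density_g_per_ml):
    """Return (volume in nm^3, urea molarity in mol/L) for a solvent-only box."""
    # Total mass of the box contents in grams.
    mass_g = (n_water * M_WATER + n_urea * M_UREA) / AVOGADRO
    # Volume that reproduces the target density.
    volume_ml = mass_g / density_g_per_ml
    volume_nm3 = volume_ml * 1.0e21            # 1 ml = 1 cm^3 = 1e21 nm^3
    # Moles of urea divided by the volume in litres.
    molarity = (n_urea / AVOGADRO) / (volume_ml * 1.0e-3)
    return volume_nm3, molarity


volume_nm3, molarity = urea_box(518, 114, 1.12)
print(f"box volume ~ {volume_nm3:.1f} nm^3, urea ~ {molarity:.1f} M")
```

With these inputs the estimate comes out close to 8 M, consistent with the nominal concentration quoted in the text.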

ab initio calculation of 13C chemical shifts

The ab initio calculations are focused on 13C chemical shifts since these have been

used to predict the secondary structure of folded proteins (Glushka et al. 1989; Spera and

Bax 1991; Vila et al. 2007a; Vila et al. 2007b; Vila et al. 2008). 13C chemical shifts are

also known to be very diagnostic of secondary structural preferences in the denatured

states (Peti et al. 2001). Furthermore, it has been shown that for 13C chemical shifts, short

range contribution due to local geometry at the carbon site of interest far outweighs the

long range electrostatic and magnetic contributions (de Dios et al. 1993a). As the short

range contribution depends strongly on the (φ,ψ) angles, 13C chemical shifts can be

accurately determined by ab initio methods and correlated to secondary structure. Since the

geometrical details in terms of (φ,ψ) angles have been determined for the 8 M denatured

dSmt3 from MD simulations, these can be directly used for our 13C chemical shift

calculations.
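
To illustrate how (φ,ψ) angles are obtained from an MD snapshot, the sketch below computes a signed dihedral angle from four Cartesian positions. It assumes the standard backbone definitions (φ from C(i-1), N, Cα, C and ψ from N, Cα, C, N(i+1)); the function name `dihedral_deg` and the test coordinates are hypothetical and are not taken from the original work.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) about the p1-p2 bond."""
    b0, b1, b2 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1 = _cross(b0, b1)          # normal of the plane (p0, p1, p2)
    n2 = _cross(b1, b2)          # normal of the plane (p1, p2, p3)
    b1_len = math.sqrt(_dot(b1, b1))
    y = _dot(b1, _cross(n1, n2)) / b1_len
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Hypothetical test geometry: a right-angle twist about the z axis.
print(dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

For φ of residue i one would pass the coordinates of C(i-1), N(i), Cα(i) and C(i), in that order.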

13C (Cα, CO) nuclear magnetic-shielding tensors (σ) were determined using the

GIAO (gauge including atomic orbital) method (Ditchfield 1974) coupled with density

functional theory (DFT) and employing Becke's three-parameter hybrid functional

(Becke 1993). The shielding tensor σ is related to the chemical shift δ through the shielding of the reference

standard, σref, as δ = (σref - σiso) / (1 - σref) × 10^6 ≈ (σref - σiso) × 10^6, where σiso = 1/3 Tr(σ) is

the absolute isotropic shielding constant. By determining the absolute isotropic shieldings

of Cα, CO and the reference compound (DSS) at the same level of theory, the chemical

shifts for Cα and CO can be readily estimated from the DFT calculations. Calculations

carried out with basis sets of increasing size, from the small STO to the large 6-311G(2d,2p), showed

that the results had converged and the 13C chemical shifts could be determined from

B3LYP/6-311G (2d,2p) calculations with a high numerical accuracy (See Supplementary

Information Figure F1).
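
The conversion from a computed shielding tensor to a chemical shift can be sketched as follows. This is an illustrative example only: the tensor values are hypothetical, the function names are introduced here, and the DSS reference shielding of 183.141 ppm is the value quoted in the Results section. Shieldings are taken to be in ppm, so the exact expression divides by (1 - σref × 10^-6).

```python
SIGMA_REF_DSS = 183.141  # ppm, DSS reference shielding quoted in the text


def isotropic_shielding(tensor):
    """sigma_iso = (1/3) Tr(sigma) for a 3x3 shielding tensor in ppm."""
    return sum(tensor[i][i] for i in range(3)) / 3.0


def chemical_shift(tensor, sigma_ref=SIGMA_REF_DSS, exact=False):
    """Chemical shift in ppm from a shielding tensor in ppm."""
    sigma_iso = isotropic_shielding(tensor)
    if exact:
        # Full expression; the denominator differs from 1 by ~1e-4.
        return (sigma_ref - sigma_iso) / (1.0 - sigma_ref * 1.0e-6)
    return sigma_ref - sigma_iso


# Hypothetical Calpha shielding tensor (ppm), for illustration only.
example_tensor = [[130.0, 2.0, -1.0],
                  [1.5, 125.0, 0.5],
                  [-0.8, 0.3, 124.0]]

print(round(chemical_shift(example_tensor), 3))
print(round(chemical_shift(example_tensor, exact=True), 3))
```

The two results differ only in the second decimal place, which is why the approximate form is normally used.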

The computational effort involved in determining 13Cα and 13CO chemical shifts

for the whole protein fragment all at once is quite severe as the total number of atoms

involved is quite large (668 for the whole Asn22-Pro61 fragment) and the computational

time scales up very rapidly with the number of contracted basis set functions. However,

the effects of all atoms need not be incorporated in the chemical shielding calculation

because nuclear shielding is fundamentally a local phenomenon. Most of the effects

propagate through the bonding framework and hence are short range. As mentioned

earlier, we are predominantly interested in the short-range contribution to 13C chemical

shifts. Therefore, for the purpose of chemical shift calculation, the protein fragment can be

conveniently divided into smaller clusters. Basically, the N-residue-long polypeptide chain

is divided into a number N of small clusters of chosen radius. In this cluster model, the

selected amino acid residue i, for which the 13C NMR chemical shifts were determined, is

at the center and is surrounded by one or more neighboring residues on either side [(i-1)

and (i+1)]. When, in addition to the immediate neighbor residues, one or more atoms of

the next succeeding residue fall within the cutoff boundary, the cluster boundary is

set by truncating at the level of that residue. As long as the cluster is large enough

to preserve the local geometry, particularly the torsion angles φ,ψ in our case,

computational effort is dramatically reduced.
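
One possible reading of the cluster construction is sketched below. It is an interpretation, not the authors' code: it assumes the cutoff is measured from the Cα atom of the central residue, always keeps the immediate neighbors (i-1) and (i+1), and includes any other residue as a whole if at least one of its atoms lies within the cutoff. The data layout (a list of dictionaries with an `atoms` mapping) and the toy coordinates are hypothetical.

```python
import math


def build_cluster(residues, center_index, cutoff=3.0):
    """Return indices of the residues forming the cluster around one residue.

    residues: list of {"name": str, "atoms": {atom_name: (x, y, z)}} in Angstrom.
    """
    center = residues[center_index]["atoms"]["CA"]  # assumed reference point
    selected = []
    for idx, res in enumerate(residues):
        is_neighbor = abs(idx - center_index) <= 1
        within = any(math.dist(center, xyz) <= cutoff
                     for xyz in res["atoms"].values())
        if is_neighbor or within:
            selected.append(idx)   # residues are kept whole, never split
    return selected


# Toy linear chain with hypothetical coordinates (Angstrom).
chain = [{"name": f"R{k}",
          "atoms": {"N": (3.8 * k - 1.4, 0.0, 0.0),
                    "CA": (3.8 * k, 0.0, 0.0),
                    "C": (3.8 * k + 1.5, 0.0, 0.0)}}
         for k in range(5)]

print(build_cluster(chain, 2, cutoff=3.0))
print(build_cluster(chain, 2, cutoff=7.0))
```

Keeping residues whole preserves the local bond geometry and the φ,ψ angles of the central residue, which is the property the cluster model relies on.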

For our chemical shift calculations cluster models were generated for each of the

residues in the urea denatured Asn22-Pro61 fragment. Figure 2A shows the polypeptide

sequence and the scheme we have employed to generate the molecular clusters. The lower

panel (Figs. 2B,C,D) shows representative models with 3 Å cutoff radius. As seen, the

residue of interest may include more than one immediate neighbor. The overall geometry

of the cluster in terms of the bond lengths, bond angles and dihedral angles φ,ψ was the

same as that determined from MD simulations. In this manner, the local

environment at various residues could be satisfactorily modeled in each of the five

conformers. First, a few trial calculations were performed on molecular clusters of radii 3, 4,

and 5 Å. For molecular clusters with larger radii, the computational time increased several

times, but the results did not deviate significantly from those calculated for molecular

clusters with a 3-Å cut-off radius. This can be seen from Figure 3. If we consider the large

chemical shift range spanned by 13Cα (22.31 ppm) and 13CO (8.03 ppm), an increase in

cluster size from 3 Å to 5 Å results in an average improvement of only 0.6 % (Cα) and 2.4

% (CO) for the calculated chemical shifts. Independent calculations were also carried out

using the more elaborate ONIOM models (Hayashi and Ohmine 2000; Vreven et al. 2003;

He et al. 2009). Trial calculations on some of the residues of the selected fragment Asn22

– Pro61 showed that the results derived from 3 Å clusters were comparable to the ONIOM

results (See Supplementary Information Figure F2, Table T1). Thus, the 3-Å cluster, which

adequately includes the effect of immediate neighbors adjoining the residue of interest,

was chosen for all the B3LYP/6-311G (2d,2p) calculations as it was computationally

faster and yielded 13C chemical shifts within an RMSD of ±0.03 ppm. This provided the

best compromise between speed and accuracy in our chemical shift calculations. Final

calculations for the 40 residue long fragment were therefore carried out using molecular

clusters of 3-Å radius. These were sequentially generated by stepping one residue at a time

along the Asn22 – Pro61 denatured fragment. Computer graphic views of a few

representative molecular clusters with 3-Å cut-off radius are shown in Figure 4. All the

B3LYP/6-311G(2d,2p) calculations were carried out using the Gaussian 03 package (Frisch et al.

2004) on the Grendel AMD/Opteron cluster at the Danish Center for Scientific

Computing, Aarhus University, Denmark. Total computational time involved for the

whole denatured dSmt3 fragment was about 30 days.

Results and discussion

The isotropic chemical shifts (δ) were determined for the various 13Cα and 13CO

sites in each of the five topologies using δ = σref - σiso, where σiso denotes the absolute

isotropic shielding of Cα, CO determined from B3LYP/6-311G(2d,2p) calculations. σref

denotes the absolute isotropic shielding of the reference determined at the same level of

theory. For the DSS reference, this was estimated to be 183.141 ppm. Figure 5 shows a plot

of the residue-wise isotropic 13Cα and 13CO shifts determined in each of the five

topologies. As seen, the chemical shift dispersion across the five topologies is rather small

(±0.58 and ±0.52 ppm for Cα and CO, respectively). Although it is desirable to have a

larger number of topologies for deriving the ensemble average chemical shift, the above

results suggest that the average chemical shift over the smaller five-member ensemble we

have chosen is statistically meaningful. Accordingly, we have used the data of Fig. 5 and

determined the average Cα and CO chemical shifts over the five-member ensemble for the

purpose of comparison with experimental results. The ensemble averaged residue-wise Cα

and CO chemical shifts determined from our ab initio calculations are compared with the

corresponding experimentally determined shifts in Figure 6. The experimental data were

taken from the BMRB data bank (accession number 15473) (See Supplementary

Information Table T2). As seen, the calculations lead to good agreement with experimental

data spanning a range of 22.31 to 7.05 ppm between the shielding (Cα) and deshielding

(CO) extremes. The variation in the experimentally determined 13Cα and 13CO chemical

shifts across the various residues in the polypeptide chain is well reproduced and matched

by the calculations. In the case of Cα, for which the short range contribution due to φ,ψ

effects is considered to be dominant, a superior determination of its chemical shift has

been made. Similarly, in the case of CO, for which contributions due to φ,ψ torsion angles

and hydrogen-bonding (secondary regions) (de Dios and Oldfield 1994) are considered to

be important, our calculations lead to a satisfactory agreement with experimental results.

Overall, the good agreement between the calculated and experimentally determined

chemical shifts lends credence to the intra- and intermolecular coordinate geometry of the

denatured dSmt3 that we have derived from MD simulations. Figure 7 shows a correlation

plot of the experimental and calculated shifts. As seen, the calculations show excellent

correlation with experimental data over the entire range of Cα and CO chemical shifts,

with a reliability coefficient R = 0.99 and RMSD = 1.47 ppm.
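
The ensemble averaging and the comparison statistics used above (RMSD and correlation coefficient) can be sketched as follows. The function names and all numerical values in the example are hypothetical and serve only to show the arithmetic; they are not data from this study.

```python
import math


def ensemble_average(shifts_by_conformer):
    """Average residue-wise shifts over conformers (equal weights assumed)."""
    n_conf = len(shifts_by_conformer)
    n_res = len(shifts_by_conformer[0])
    return [sum(conf[i] for conf in shifts_by_conformer) / n_conf
            for i in range(n_res)]


def rmsd(calc, expt):
    """Root-mean-square deviation between calculated and experimental shifts."""
    return math.sqrt(sum((c - e) ** 2 for c, e in zip(calc, expt)) / len(calc))


def pearson_r(calc, expt):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(calc)
    mean_c, mean_e = sum(calc) / n, sum(expt) / n
    cov = sum((c - mean_c) * (e - mean_e) for c, e in zip(calc, expt))
    var_c = sum((c - mean_c) ** 2 for c in calc)
    var_e = sum((e - mean_e) ** 2 for e in expt)
    return cov / math.sqrt(var_c * var_e)


# Hypothetical Calpha shifts (ppm) for three residues in two conformers.
conformers = [[56.0, 58.2, 52.1],
              [56.4, 57.8, 52.5]]
experiment = [56.5, 57.5, 52.0]

average = ensemble_average(conformers)
print(average, round(rmsd(average, experiment), 3),
      round(pearson_r(average, experiment), 3))
```

Equal weighting of the conformers is an assumption of this sketch; a Boltzmann-weighted average would require conformer energies, which are not discussed in the text.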

Secondary shifts and structural propensities

From the ensemble average Cα and CO chemical shifts that we have determined

from ab initio calculations, secondary shifts were estimated by subtracting the random coil

shifts. For this purpose the sequence corrected random coil shifts (Schwarzinger et al.

2001) were used. These were determined for all the five different 8 M urea equilibrated

topologies and their ensemble average was used to derive residue-wise secondary shifts.

Previous experimental studies of denatured dSmt3 have employed 13Cα, 13CO cumulative

shifts as a marker of secondary structural preferences. Accordingly, we have used the 13Cα

and 13CO secondary shifts estimated from ab initio calculations to derive the cumulative

secondary shifts using the following equation:

The normalizations used here for the individual secondary shifts are based on the total

span of the respective chemical shifts in the folded states.
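
The cumulative secondary shift described above combines the Cα and CO secondary shifts after normalizing each by the span of the corresponding chemical shift. A minimal sketch follows, assuming a simple sum of the two normalized terms. The normalization spans are not stated in this text, so they are passed in as parameters, and every number in the example is hypothetical.

```python
def secondary_shift(observed, random_coil):
    """Secondary shift (ppm): observed or calculated shift minus random coil shift."""
    return observed - random_coil


def cumulative_secondary_shift(d_ca, d_co, span_ca, span_co):
    """Sum of span-normalized Calpha and CO secondary shifts (dimensionless).

    span_ca and span_co are the normalization constants; their values are
    an input of this sketch, not taken from the source.
    """
    return d_ca / span_ca + d_co / span_co


# Hypothetical values for a single residue (ppm).
d_ca = secondary_shift(53.9, 52.5)     # Calpha
d_co = secondary_shift(178.6, 177.8)   # CO
cumulative = cumulative_secondary_shift(d_ca, d_co, span_ca=20.0, span_co=10.0)
print(round(cumulative, 3))
```

Normalizing before summing prevents the nucleus with the larger shift range from dominating the combined marker.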

The 13Cα, 13CO and cumulative secondary shifts determined from our ab initio

calculations are compared with experimentally determined 13Cα, 13CO cumulative shifts in

Figure 8. All the numerical results are given in Table 1. As can be seen from Fig. 8, the ab

initio calculations reveal some residual structural elements (Fig. 8A) which are strikingly

similar to the experimentally determined structural preferences (Fig. 8B). Except for four

residues, namely the two terminal residues Asn22 and Pro61 and the two interior residues Met39

and Asn40, the denatured protein exhibits α-helical (Ala41 to Gly47) and β-sheet

propensities (Val25 to Pro34 and Gly51 to Phe57). This is in excellent accord with

the experimental results. It may also be recognized from Fig. 8, that the folded state of

dSmt3 has well-defined secondary structural elements, and the denatured state also has

some residual structural elements of the native type.
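
Contiguous stretches of like-signed cumulative secondary shifts can be located with a short routine such as the one below. It relies on the usual convention that positive Cα and CO secondary shifts indicate helical propensity and negative values indicate β propensity. The minimum segment length of three residues, the function name and the example values are assumptions made for illustration.

```python
def propensity_segments(cumulative, first_residue, min_length=3):
    """Return (label, start_residue, end_residue) for runs of like-signed values."""
    segments = []
    start, sign = None, 0
    # A trailing zero acts as a sentinel that closes the last run.
    for offset, value in enumerate(list(cumulative) + [0.0]):
        s = (value > 0) - (value < 0)
        if s != sign:
            if sign != 0 and offset - start >= min_length:
                label = "helix-like" if sign > 0 else "beta-like"
                segments.append((label,
                                 first_residue + start,
                                 first_residue + offset - 1))
            start, sign = offset, s
    return segments


# Hypothetical cumulative secondary shifts starting at residue 25.
values = [-0.1, -0.2, -0.1, 0.0, 0.2, 0.3, 0.1, 0.2, -0.1]
print(propensity_segments(values, first_residue=25))
```

Requiring a minimum run length filters out isolated residues whose sign differs from that of their neighbors.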

Concluding remarks

In summary, we have shown that the residual secondary structural features

exhibited by denatured proteins can be revealed by molecular simulations and ab initio

chemical shift calculations. This has been demonstrated in the model protein dSmt3. Using

Cα and CO cumulative secondary shift as a marker, the 8M urea denatured dSmt3 is

shown to exhibit α-helical and β-sheet propensities. These findings are in excellent

agreement with experimental observations.

The successful calculation of chemical shifts for denatured dSmt3 gives an

important message. The fact that the individual residue-wise calculated chemical shifts in

the five topologies were not very different suggests that in denatured states, the shifts are

mostly dictated by local environments. Thus, as against the common notion that a large

number of conformers may be required for reliable averaging, it appears that only a few

topologies may be adequate and that makes the computations far more tractable in general.

Further, the possibility of reliably calculating chemical shifts for denatured proteins,

as reported here, opens up new computational avenues for NMR characterization of

complex proteins. First, the spectral features of even the flexible domains of otherwise

folded proteins can be calculated, which would then enable complete characterization of

the structure and dynamics of the proteins at residue level detail in solution. Such

information would further allow experimental characterization of specific interactions of

the chosen protein with different target molecules. Second, the properties of various

intermediate states along the equilibrium folding pathway of a protein driven by

progressive dilution of denaturants can be calculated. Comparison of such data with

different denaturants would enable understanding the basic principles driving

folding/misfolding of proteins. However limited they may be in detail, they provide

extremely valuable insights into protein folding mechanisms, in general.

Acknowledgments

SG thanks CSIR, New Delhi for support under Emeritus Scientist Scheme

(SG:21(0701)/07/EMR-II). We acknowledge support from the Danish National Research

Foundation and the Danish Center for Scientific Computing.

Supporting Information:

Supplementary material contains all the tables of Cα and CO chemical

shieldings/shifts derived from experimental and computational methods.

References

Becke AD (1993) A New Mixing of Hartree-Fock and Local Density-Functional Theories. J Chem Phys 98:1372-1377.

Berendsen HJC, Postma JPM, van Gunsteren WF, DiNola A, Haak JR (1984) Molecular Dynamics with Coupling to An External Bath. J Chem Phys 81:3684-3690.

Bhuyan AK (2002) Protein Stabilization by urea and Guanidine Hydrochloride. Biochemistry 41: 13386-13394.

Casabianca LB, de Dios AC (2008) Ab initio calculations of NMR chemical Shifts. J Chem Phys 128:52201.

Cavalli A, Salvatella X, Dobson CM, Vendruscolo M (2007) Protein structure determination from NMR chemical shifts. Proc Natl Acad Sci USA 104:9615–9620.

Chiti F, Dobson CM (2006) Protein misfolding, functional amyloid, and human disease. Annu Rev Biochem 75:333-366.

Cornilescu G, Delaglio F, Bax A (1999) Protein backbone angle restraints from searching a database for chemical shift and sequence homology. J Biomol NMR 13:289–302.

Darden T, York D, Pedersen L (1993) Particle mesh Ewald: An N·log(N) method for Ewald sums in large systems. J Chem Phys 98:10089.

de Dios AC, Pearson JG, Oldfield E (1993a) Secondary and tertiary structural effects on protein NMR chemical shifts: An ab initio approach. Science 260:1491-1496.

de Dios AC, Pearson JG, Oldfield E (1993b) Chemical shifts in proteins: An ab initio study of carbon-13 nuclear magnetic resonance chemical shielding in glycine, alanine and valine residues. J Am Chem Soc 115:9768–9773.

de Dios AC, Oldfield E (1994) Chemical shifts of carbonyl carbons in peptides and proteins. J Am Chem Soc 116:11485-11148.

Ditchfield R (1974) Self-consistent perturbation theory of diamagnetism. A gauge-invariant LCAO method for N.M.R. chemical shifts. Mol Phys 27:789–807.

Dyson HJ, Wright PE (2005) Intrinsically unstructured proteins and their functions. Nat Rev Mol Cell Biol 6:197-208.

Facelli JC (2002) Chemical shielding calculations. Encyclopedia of Nuclear Magnetic Resonance. Grant, DM, Harris, RK (Eds), Wiley, London, p323-333.

Francis CJ, Lindorff-Larsen K, Best RB, Vendruscolo M (2006) Characterization of the residual structure in the unfolded state of the Δ131Δ fragment of staphylococcal nuclease. Proteins 65:145-152.

Frisch MJ, Trucks GW, Schlegel HB, Scuseria GE, Robb MA, Cheeseman JR, Montgomery Jr JA, Vreven T, Kudin KN, Burant JC, Millam JM, Iyengar SS, Tomasi J, Barone V, Mennucci B, Cossi M, Scalmani G, Rega N, Petersson GA, Nakatsuji H, Hada M, Ehara M, Toyota K, Fukuda R, Hasegawa J, Ishida M, Nakajima T, Honda Y, Kitao O, Nakai H, Klene M, Li X, Knox JE, Hratchian HP, Cross JB, Bakken V, Adamo C, Jaramillo J, Gomperts R, Stratmann RE, Yazyev O, Austin AJ, Cammi R, Pomelli C, Ochterski JW, Ayala PY, Morokuma K, Voth GA, Salvador P, Dannenberg JJ, Zakrzewski VG, Dapprich S, Daniels AD, Strain MC, Farkas O, Malick DK, Rabuck AD, Raghavachari K, Foresman JB, Ortiz JV, Cui Q, Baboul AG, Clifford S, Cioslowski J, Stefanov BB, Liu G, Liashenko A, Piskorz P, Komaromi I, Martin RL, Fox DJ, Keith T, Al-Laham MA, Peng CY, Nanayakkara A, Challacombe M, Gill, PMW, Johnson B, Chen W, Wong MW, Gonzalez C, Pople JA Gaussian 03, Revision C.02, Gaussian, Inc., Wallingford CT, 2004.

Gao Q, Yokojima S, Kohno T, Ishida T, Fedorov DG, Kitaura K, Fujihira M, Nakamura S (2007) Ab initio NMR chemical shift calculations on proteins using fragment molecular orbitals with electrostatic environment. Chem Phys Lett 445:331-339.

Glushka J, Lee M, Coffin S, Cowburn D (1989) 15N chemical shifts of backbone amides in bovine pancreatic trypsin inhibitor and apamin. J Am Chem Soc 111:7716–7722.

Gong H, Shen Y, Rose GD (2007) Building native protein conformation from NMR backbone chemical shifts using Monte Carlo fragment assembly. Protein Sci 16:1515–1521.

Güntert P, Mumenthaler C, Wüthrich K (1997) Torsion angle dynamics for NMR structure calculation with the new program DYANA. J Mol Biol 273:283-298.

Havlin RH, Le H, Laws DD, deDios AC, Oldfield E (1997) An ab initio quantum chemical investigation of carbon-13 NMR shielding tensors in glycine, alanine, valine, isoleucine, serine, and threonine: Comparisons between helical and sheet tensors, and effects of χ1 on shielding. J Am Chem Soc 119:11951–11958.

Hayashi S, Ohmine I (2000) Proton Transfer in Bacteriorhodopsin: Structure, excitation, IR spectra and potential energy surface analyses by an ab initio QM/MM method. J Phys Chem B 104:10678-10691.

He X, Wang B, Merz KM (2009) Protein NMR Chemical Shift Calculations Based on the Automated Fragmentation QM/MM Approach. J Phys Chem B 113:10380-10388.

Hess B, Kutzner C, van der Spoel D, Lindahl E (2008) GROMACS 4: Algorithms for highly efficient, load-balanced, and scalable molecular simulation. J Chem Theory Comput 4: 435-447.

Hess B, Bekker H, Berendsen HJC, Fraaije JGEM (1997) LINCS: A linear constraint solver for molecular simulations. J Comp Chem 18:1463-1472.

Iwadate M, Asakura T, Williamson MP (1999) Cα and Cβ chemical shifts in proteins from an empirical data base. J Biomol NMR 13:199–211.

Jameson CJ, de Dios AC (2009) Theoretical and physical aspects of nuclear shielding. In Nuclear Magnetic Resonance, The Royal Society of Chemistry, London, 38:68-93.

Kawahara K, Tanford C (1996) Viscosity and Density of Aqueous Solutions of Urea and Guanidine Hydrochloride. J Biol Chem 241:3228-3232.

Kumar D, Kumar A, Misra JR, Chugh J, Sharma S, Hosur RV (2008) 1H, 15N, 13C resonance assignment of folded and 8 M urea-denatured state of SUMO from Drosophila melanogaster. Biomol NMR Assign 2:13-15.

Kumar D, Chugh J, Sharma S, Hosur RV (2009a) Conserved structural and dynamics features in the denatured states of Drosophila SUMO, human SUMO and ubiquitin proteins: Implications to sequence-folding paradigm. Proteins 76:387-402.

Kumar D, Misra JR, Chugh J, Sharma S, Hosur RV (2009b) NMR derived solution structure of SUMO from Drosophila melanogaster (dSmt3). Proteins 75:1046-1050.

Laws DD, de Dios AC, Oldfield E (1993) NMR chemical shifts and structure refinement in proteins. J Biomol NMR 3:607-612.

Lumsden MD, Wasylishen RE, Eichele K, Schindler M, Penner GK, Power WP, Curtis RD (1994) Carbonyl carbon and nitrogen chemical shift tensors of the amide fragment of acetanilide and N-methylacetanilide. J Am Chem Soc 116: 1403-1413.

Matsuki Y, Akutsu H, Fujiwara T (2007) Spectral fitting for signal assignment and structural analysis of uniformly 13C-labeled solid proteins by simulated annealing based on chemical shifts and spin dynamics. J Biomol NMR 38:325–339.

Meiler J (2003) PROSHIFT: Protein chemical shift prediction using artificial neural networks. J Biomol NMR 26:25–37.

Neal S, Nip AM, Zhang HY, Wishart DS (2003) Rapid and accurate calculation of protein 1H, 13C and 15N chemical shifts. J Biomol NMR 26:215–240.

Neri D, Billeter M, Wider G, Wüthrich K (1992) NMR determination of residual structure in a urea-denatured protein, the 434-repressor. Science 257:1559-1563.

Oldfield E (1995) Chemical shifts and three-dimensional protein structures. J Biomol NMR 5:217-225.

Pearson JG, Le H, Sanders LK, Godbout N, Havlin RH, Oldfield E (1997) Predicting chemical shifts in proteins: Structure refinement of valine residues by using ab initio and empirical geometry optimizations. J Am Chem Soc 119:11941–11950.

Peti W, Smith LJ, Redfield C, Schwalbe H (2001) Chemical shifts in denatured proteins: resonance assignments for denatured ubiquitin and comparisons with other denatured proteins. J Biomol NMR 19:153-165.

Schwarzinger S, Kroon GJ, Foss TR, Chung J, Wright PE, Dyson HJ (2001) Sequence-dependent correction of random coil NMR chemical shifts. J Am Chem Soc 123:2970-2978.

Scott WRP, Hunenberger PH, Tironi IG, Mark AE, Billeter SR, Fennen J, Torda AE, Huber T, Kruger P, van Gunsteren WF (1999) The GROMOS biomolecular simulation program package. J Phys Chem A 103:3596-3607.

Shen Y, Bax A (2007) Protein backbone chemical shifts predicted from searching a database for torsion angle and sequence homology. J Biomol NMR 38:289–302.

Shen Y, Lange O, Delaglio F, Rossi P, Aramini JM, Liu G, Eletsky A, Wu Y, Singarapu KK, Lemak A, Ignatchenko A, Arrowsmith CH, Szyperski T, Montelione GT, Baker D, Bax A (2008) Consistent blind protein structure generation from NMR chemical shift data. Proc Natl Acad Sci U S A 105:4685-4690.

Shen Y, Delaglio F, Cornilescu G, Bax A (2009) TALOS+: a hybrid method for predicting protein backbone torsion angles from NMR chemical shifts. J Biomol NMR 44:213-223.

Shortle D, Ackerman MS (2001) Persistence of native-like topology in a denatured protein in 8 M urea. Science 293:487-489.

Sibley AB, Cosman M, Krishnan VV (2003) An empirical correlation between secondary structure content and average chemical shifts in proteins. Biophys J 84:1223-1227.

Spera S, Bax A (1991) Empirical correlation between protein backbone conformation and Cα and Cβ 13C Nuclear Magnetic Resonance chemical shifts. J Am Chem Soc 113:5490–5492.

Sternberg U, Witter R, Ulrich A (2004) 3D structure elucidation using NMR chemical shifts. Annu Rep NMR Spectrosc 52:53–104.

Stumpe MC, Grubmüller H (2007) Aqueous urea solutions: structure, energetics, and urea aggregation. J Phys Chem B 111:6220-6228.

Sun H, Sanders LK, Oldfield E (2002) Carbon-13 NMR shielding in the twenty common amino acids: Comparisons with experimental results in proteins. J Am Chem Soc 124:5486–5495.

Tafer H, Hiller S, Hilty C (2004) Nonrandom structure in the urea-unfolded Escherichia coli outer membrane protein X (OmpX). Biochemistry 43:860-869.

Vila JA, Ripoll DR, Scheraga HA (2007a) Use of 13Cα chemical shifts in protein structure determination. J Phys Chem B 111:6577–6585.

Vila JA, Villegas ME, Baldoni HA, Scheraga HA (2007b) Predicting 13Cα chemical shifts for validation of protein structures. J Biomol NMR 38:221–235.

Vila JA, Arnautova YA, Scheraga HA (2008) Use of 13Cα chemical shifts for accurate determination of β-sheet structures in solution. Proc Natl Acad Sci USA 105:1891–1896.

Vila JA, Aramini JM, Rossi P, Kuzin A, Su M, Seetharaman J, Xiao R, Tong L, Montelione GT, Scheraga HA (2008) Quantum chemical 13Cα chemical shift calculations for protein NMR structure determination, refinement, and validation. Proc Natl Acad Sci USA 105:14389–14394.

Vila JA, Scheraga HA (2009) Assessing the accuracy of protein structures by quantum mechanical computations of 13Cα chemical shifts. Acc Chem Res 42:1545-1553.

Vreven T, Morokuma K, Farkas O, Schlegel HB, Frisch MJ (2003) Geometry optimization with QM/MM, ONIOM, and other combined methods. I. Microiterations and constraints. J Comput Chem 24:760-769.

Wang Y, Jardetzky O (2002) Probability-based protein secondary structure identification using combined NMR chemical-shift data. Protein Sci 11:852-861.

Wei Y, Lee DK, Ramamoorthy A (2001) Solid-state 13C NMR chemical shift anisotropy tensors of polypeptides. J Am Chem Soc 123:6118-6126.

Wishart DS, Sykes BD, Richards FM (1992) The chemical shift index: a fast and simple method for the assignment of protein secondary structure through NMR spectroscopy. Biochemistry 31:1647-1651.

Wishart DS, Case DA (2001) Use of chemical shifts in macromolecular structure determination. Methods Enzymol 338:3–34.

Wishart DS, Arndt D, Berjanskii M, Tang P, Zhou J, Lin G (2008) CS23D: a web server for rapid protein structure generation using NMR chemical shifts and sequence data. Nucleic Acids Res 36:W496-W502.

Legend to Figures

Figure 1: Five energy-minimized conformations of the 8 M urea denatured Drosophila melanogaster SUMO (dSmt3) fragment (Asn22-Pro61) used in the ab initio calculations of 13C chemical shifts. The 13Cα and 13CO chemical shifts reported here are the residue-wise averages of the chemical shifts determined individually for the topologies shown above (see text for details).

Figure 2: (A) Polypeptide chain of the 40 residue Asn22-Pro61 fragment of dSmt3 used for generation of molecular clusters. Three representative molecular clusters for the residues Val24 (B), Leu38 (C) and Gln60 (D) are shown; the boundary residues in each case fall within the cutoff radius of 3 Å.

Figure 3: 13Cα (A) and 13CO (B) absolute isotropic shieldings determined from Gaussian ’03 B3LYP/6-311G(2d,2p) calculations for the indicated residues using different cutoff radii.

Figure 4: Computer graphics view of representative molecular clusters with 3 Å cutoff radius for Val25 (A) and Lys28 (B) residues.

Figure 5: Residue-wise 13Cα (A) and 13CO (B) absolute isotropic shieldings determined for the five topologies of the 8 M urea denatured dSmt3 Asn22-Pro61 fragment from Gaussian ’03 B3LYP/6-311G(2d,2p) calculations.

Figure 6: Comparison of 13Cα (A) and 13CO (B) experimental chemical shifts (blue) with those determined from Gaussian ’03 B3LYP/6-311G(2d,2p) calculations.

Figure 7: Correlation between experimental and calculated Cα and CO chemical shifts. Straight line represents a linear least-squares fit to the data (R=0.997).

Figure 8: Comparison of cumulative 13C secondary shifts from ab initio quantum chemical calculations (A) with experimental results (B). Secondary structural preferences for continuous stretches of three residues are shown on top.

Figure 1

Figure 2

Figure 3

[Plot not reproduced. Panels A and B: isotropic shielding (ppm) versus cutoff radius (Å; 5, 4, 3) for His 32, Pro 34, Lys 37, Met 39, Tyr 42 and Leu 48. Panel A vertical axis spans 116 to 128 ppm; panel B vertical axis spans 5 to 11 ppm.]

Figure 4

[Image not reproduced. Panels A and B.]

Figure 5

[Plot not reproduced. Upper panel: 13Cα δ (ppm), axis 40 to 70, versus residue number (20 to 60). Lower panel: 13CO δ (ppm), axis 165 to 185, versus residue number (20 to 60). Series: topologies 1, 2, 3, 4, 5 and mean.]

Figure 6

Figure 7

[Plot not reproduced. δcalc (ppm) versus δexpt (ppm); both axes span 40 to 200 ppm.]

Figure 8

Table 1

Calculated 13C chemical shifts (ppm) for the 8 M urea denatured dSmt3 (Asn22 to Pro61)

No  Residue  Cα (ab initio)  CO (ab initio)  Cα (random coil)  CO (random coil)  Cα (secondary)a  CO (secondary)b  Cumulative (secondary)c
22  ASN      55.63           177.12          55.27             175.26            0.015            0.186            0.201 (-0.119)
23  ALA      52.68           177.58          52.73             177.55            -0.002           0.003            0.001 (0.004)
24  VAL      63.55           177.20          62.42             175.86            0.045            0.134            0.180 (0.039)
25  VAL      60.16           175.11          62.45             176.04            -0.091           -0.093           -0.184 (-0.010)
26  GLN      57.18           174.03          55.92             175.83            0.051            -0.180           -0.129 (-0.042)
27  PHE      57.78           175.74          57.98             175.71            -0.008           0.003            -0.005 (-0.060)
28  LYS      57.79           173.82          56.45             176.31            0.054            -0.249           -0.195 (-0.041)
29  ILE      59.08           176.83          61.41             176.08            -0.093           0.075            -0.018 (-0.024)
30  LYS      58.01           175.65          56.51             176.47            0.060            -0.082           -0.022 (-0.043)
31  LYS      56.16           176.24          56.68             176.55            -0.021           -0.031           -0.051 (-0.026)
32  HIS      58.88           173.74          55.32             174.42            0.143            -0.068           0.075 (0.069)
33  THR      60.61           171.65          61.95             174.71            -0.053           -0.306           -0.359 (-0.253)
34  PRO      66.18           173.36          63.62             177.29            0.103            -0.393           -0.290 (-0.304)
35  LEU      56.60           178.70          53.49             175.09            0.125            0.361            0.486 (0.310)
36  ARG      55.68           176.98          56.22             175.98            -0.021           0.100            0.079 (0.003)
37  LYS      54.92           175.28          56.57             176.57            -0.066           -0.129           -0.195 (-0.052)
38  LEU      58.28           173.63          55.63             177.52            0.106            -0.389           -0.283 (-0.175)
39  MET      57.37           176.10          55.74             176.32            0.065            -0.022           0.044 (-0.054)
40  ASN      55.15           177.35          55.44             175.25            -0.011           0.210            0.199 (-0.035)
41  ALA      51.29           175.24          52.86             177.44            -0.063           -0.220           -0.282 (-0.034)
42  TYR      59.30           176.14          58.34             175.68            0.039            0.046            0.085 (0.024)
43  CYS      56.18           176.48          55.36             174.33            0.033            0.215            0.248 (0.181)
44  ASP      55.61           177.30          52.92             175.04            0.108            0.226            0.334 (0.175)
45  ARG      57.44           175.50          56.49             176.43            0.038            -0.093           -0.055 (-0.031)
46  ALA      51.59           179.68          52.73             177.74            -0.045           0.194            0.149 (0.059)
47  GLY      47.80           174.85          45.37             173.9             0.097            0.095            0.193 (0.049)
48  LEU      58.35           175.34          55.43             177.88            0.117            -0.254           -0.137 (-0.031)
49  SER      60.01           174.78          58.65             174.69            0.055            0.009            0.064 (0.001)
50  MET      55.21           174.51          55.7              176.36            -0.019           -0.185           -0.204 (-0.026)
51  GLN      54.18           176.19          56.3              176.1             -0.085           0.009            -0.075 (-0.032)
52  VAL      64.30           174.58          62.54             176.15            0.071            -0.157           -0.086 (-0.033)
53  VAL      60.60           175.93          62.45             176.1             -0.074           -0.017           -0.091 (-0.054)
54  ARG      55.33           175.76          56.22             176.04            -0.035           -0.028           -0.063 (-0.067)
55  PHE      54.39           175.87          58.01             175.62            -0.145           0.025            -0.119 (-0.042)
56  ARG      54.68           174.63          56.52             176.08            -0.073           -0.145           -0.218 (-0.013)
57  PHE      60.08           174.18          57.97             175.81            0.085            -0.163           -0.078 (-0.026)
58  ASP      55.08           175.68          52.85             174.81            0.089            0.087            0.177 (0.211)
59  GLY      44.52           173.58          45.37             173.94            -0.034           -0.036           -0.070 (-0.017)
60  GLN      55.33           176.18          56.17             176.51            -0.033           -0.033           -0.066 (-0.190)
61  PRO      66.83           178.49          63.88             177.19            0.118            0.130            0.248 (-0.082)

a Indicated as Cα(secondary)/25.
b Indicated as CO(secondary)/10.
c Values in parentheses denote experimental data (taken from BMRB data bank, accession number 15473).
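
The scaling described in the footnotes to Table 1 can be illustrated with a minimal Python sketch. It assumes, as inferred from the tabulated values rather than stated in the footnotes, that each secondary shift is the ab initio shift minus the random coil shift, and that the cumulative value is the sum of the two scaled secondary shifts. The names `ShiftRow` and `scaled_secondary_shifts` are hypothetical and introduced only for this illustration.

```python
from dataclasses import dataclass

# Scaling divisors taken from the Table 1 footnotes.
CA_SCALE = 25.0  # Calpha secondary shift is reported divided by 25
CO_SCALE = 10.0  # CO secondary shift is reported divided by 10


@dataclass
class ShiftRow:
    """One row of Table 1 (all shifts in ppm). Hypothetical container."""
    residue: str
    ca_calc: float  # Calpha, ab initio
    co_calc: float  # CO, ab initio
    ca_rc: float    # Calpha, random coil
    co_rc: float    # CO, random coil


def scaled_secondary_shifts(row: ShiftRow) -> tuple[float, float, float]:
    """Return (Calpha scaled, CO scaled, cumulative) secondary shifts.

    Assumption: secondary shift = ab initio shift - random coil shift,
    and cumulative = sum of the two scaled values.
    """
    ca = (row.ca_calc - row.ca_rc) / CA_SCALE
    co = (row.co_calc - row.co_rc) / CO_SCALE
    return ca, co, ca + co


if __name__ == "__main__":
    # Two rows copied from Table 1.
    rows = [
        ShiftRow("ALA23", 52.68, 177.58, 52.73, 177.55),
        ShiftRow("LEU35", 56.60, 178.70, 53.49, 175.09),
    ]
    for row in rows:
        ca, co, total = scaled_secondary_shifts(row)
        print(f"{row.residue}: Ca={ca:.3f} CO={co:.3f} cumulative={total:.3f}")
```

Values recomputed this way may differ from the tabulated ones in the last decimal place, presumably because the table was generated from shifts carrying more digits than are printed.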