

A functional role of Rv1738 in Mycobacterium tuberculosis persistence suggested by racemic protein crystallography

Richard D. Bunker a,1, Kalyaneswar Mandal b,c,d, Ghader Bashiri a, Jessica J. Chaston a,2, Bradley L. Pentelute b,c,d,3, J. Shaun Lott a, Stephen B. H. Kent b,c,d,4, and Edward N. Baker a,4

a Maurice Wilkins Centre for Molecular Biodiscovery and School of Biological Sciences, University of Auckland, Auckland 1142, New Zealand; and b Department of Chemistry, c Department of Biochemistry and Molecular Biology, and d Institute for Biophysical Dynamics, University of Chicago, Chicago, IL 60637

Edited by Gregory A. Petsko, Weill Cornell Medical College, New York, NY, and approved February 25, 2015 (received for review November 23, 2014)

Protein 3D structure can be a powerful predictor of function, but it often faces a critical roadblock at the crystallization step. Rv1738, a protein from Mycobacterium tuberculosis that is strongly implicated in the onset of nonreplicating persistence, and thereby latent tuberculosis, resisted extensive attempts at crystallization. Chemical synthesis of the L- and D-enantiomeric forms of Rv1738 enabled facile crystallization of the D/L-racemic mixture. The structure was solved by an ab initio approach that took advantage of the quantized phases characteristic of diffraction by centrosymmetric crystals. The structure, containing L- and D-dimers in a centrosymmetric space group, revealed unexpected homology with bacterial hibernation-promoting factors that bind to ribosomes and suppress translation. This suggests that the functional role of Rv1738 is to contribute to the shutdown of ribosomal protein synthesis during the onset of nonreplicating persistence of M. tuberculosis.

racemic protein crystallography | Mycobacterium tuberculosis | ribosome binding | dormancy | protein structure

Tuberculosis is caused by the bacterium Mycobacterium tuberculosis (Mtb), currently estimated to infect one-third of the world’s population (1). A major reason for this high rate of infection is the ability of Mtb to enter a dormant state known as nonreplicating persistence (2, 3). This occurs in response to engulfment of the bacterium by activated macrophages. A proportion of bacteria survives the immune system’s antimicrobial onslaught and persists in a dormant form, but can reemerge many years later as an active infection. In the nonreplicating persistent state, protein synthesis is drastically shut down (4), and the bacteria use host lipids, notably cholesterol, as a carbon source (5). Bacteria in this state are largely resistant to current drugs.

Microarray studies have identified a number of genes that are highly up-regulated under conditions thought to trigger the onset of dormancy, including hypoxia (low oxygen concentration) and exposure to nitric oxide (6, 7). The most highly up-regulated gene in hypoxic conditions, and the second-highest in response to NO, is Rv1738, suggesting this gene to be a key element in the onset of persistence. Rv1738 is also classed as an essential gene in the genome-wide transposon mutagenesis study of Sassetti et al. (8). The Rv1738 gene encodes a predicted 94-residue protein of unknown function.

In this research, we set out to determine the structure of the protein encoded by the ORF Rv1738 of Mtb H37Rv. Our hypothesis was that important clues to the function of the Rv1738 protein might be obtained from its 3D structure, as structure is much more highly conserved than sequence during evolution, and can reveal unsuspected functional and evolutionary relationships. There have been both failures and successes from this approach, but on discovering that Rv1738 could readily be expressed in soluble form in Escherichia coli, we initiated structural studies. Unfortunately, all attempts to crystallize recombinant Rv1738 failed, and the protein also proved unsuitable for structural analysis by NMR because of a tendency to aggregate.

Because of our inability to obtain crystals using protein expressed in E. coli, we turned to racemic protein crystallography, a recently introduced approach that can facilitate crystallization of recalcitrant proteins (9). In this method, a racemic mixture comprising equal amounts of the D-protein and L-protein forms of a target molecule is used for crystallization. A protein racemate can crystallize in centrosymmetric space groups, and can thus access many additional space groups, including some predicted to be highly favored for protein crystallization (10). In contrast, natural proteins are chiral, built only from L-amino acids and the achiral amino acid glycine, and thus cannot be incorporated into crystal lattices that include mirror planes or inversion centers. Evidence is accumulating that racemic protein mixtures can be much easier to crystallize in cases in which the natural L-protein is refractory to crystallization (9). D-proteins can only be made by total chemical synthesis (11, 12), but as Rv1738 is a small protein, we considered it a good candidate for total chemical synthesis using modern ligation methods.

Significance

Racemic protein crystallography was used to determine the X-ray structure of the predicted Mycobacterium tuberculosis protein Rv1738, which had been completely recalcitrant to crystallization in its natural L-form. Native chemical ligation was used to synthesize both L-protein and D-protein enantiomers of Rv1738. Crystallization of the racemic {D-protein + L-protein} mixture was immediately successful. The resulting crystals diffracted to high resolution and also enabled facile structure determination because of the quantized phases of the data from centrosymmetric crystals. The X-ray structure of Rv1738 revealed striking similarity with bacterial hibernation factors, despite minimal sequence similarity. We predict that Rv1738, which is highly up-regulated in conditions that mimic the onset of persistence, helps trigger dormancy by association with the bacterial ribosome.

Author contributions: J.S.L., S.B.H.K., and E.N.B. designed research; R.D.B., K.M., G.B., J.J.C., and B.L.P. performed research; R.D.B. contributed new reagents/analytic tools; R.D.B., K.M., S.B.H.K., and E.N.B. analyzed data; and R.D.B., K.M., S.B.H.K., and E.N.B. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

Freely available online through the PNAS open access option.

Data deposition: The atomic coordinates and structure factors have been deposited in the Protein Data Bank, www.pdb.org (PDB ID codes 4wsp and 4wpy).

1 Present address: Friedrich Miescher Institute for Biomedical Research, 4058 Basel, Switzerland.

2 Present address: Victor Chang Cardiac Research Institute, Darlinghurst, NSW 2010, Australia.

3 Present address: Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA 02139.

4 To whom correspondence may be addressed. Email: [email protected] or [email protected].

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1422387112/-/DCSupplemental.

4310–4315 | PNAS | April 7, 2015 | vol. 112 | no. 14 www.pnas.org/cgi/doi/10.1073/pnas.1422387112


Here we describe the total chemical synthesis of the D- and L-forms of Rv1738 and the successful crystallization of the {D-Rv1738 + L-Rv1738} protein racemate, in striking contrast to the complete failure of conventional crystallization approaches with the recombinant protein. The racemic crystals diffracted to good resolution, and the structure was solved using an ab initio approach that benefited from the quantized nature of centrosymmetric phases (13). Moreover, the structure of Rv1738 revealed a surprising similarity to a family of stress proteins known as hibernation-promoting factors (HPFs). This suggests that the functional role of the up-regulated Rv1738 protein in nonreplicating persistence of Mtb is to contribute to the shutdown of ribosomal protein synthesis.
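The "quantized nature of centrosymmetric phases" can be illustrated with a small numerical sketch. It is not the procedure used in this work: the atom positions, the scattering factor of 6.0, and the chosen reflections are all hypothetical, and a single inversion center at the origin stands in for the full symmetry of a real centrosymmetric space group. The point it shows is that pairing every atom at x with a mirror-image partner at −x makes each structure factor real, so its phase can only be 0° or 180°.

```python
import cmath
import math
import random


def structure_factor(hkl, atoms):
    """Sum f * exp(2*pi*i * (hx + ky + lz)) over atoms in fractional coordinates."""
    h, k, l = hkl
    total = 0j
    for f, (x, y, z) in atoms:
        total += f * cmath.exp(2j * math.pi * (h * x + k * y + l * z))
    return total


def centric_phase(F):
    """Phase of an (almost) real structure factor: 0 or 180 degrees."""
    return 0.0 if F.real >= 0 else 180.0


random.seed(0)
# Hypothetical "L" molecule: 20 identical point atoms at random positions.
l_atoms = [(6.0, (random.random(), random.random(), random.random()))
           for _ in range(20)]
# Its "D" partner, generated by inversion through the origin.
d_atoms = [(f, (-x, -y, -z)) for f, (x, y, z) in l_atoms]

reflections = [(1, 0, 0), (2, 1, 3), (0, 4, 1), (3, -2, 5)]
f_chiral = [structure_factor(hkl, l_atoms) for hkl in reflections]
f_racemic = [structure_factor(hkl, l_atoms + d_atoms) for hkl in reflections]

for hkl, Fc, Fr in zip(reflections, f_chiral, f_racemic):
    general = math.degrees(cmath.phase(Fc)) % 360.0
    print(hkl, f"chiral phase {general:6.1f} deg;",
          f"racemic phase {centric_phase(Fr):5.1f} deg",
          f"(imaginary part {Fr.imag:.1e})")
```

In the racemic case each atom pair contributes 2f·cos(2π h·x), so the imaginary parts cancel to within rounding error, whereas the chiral set alone takes general phase values.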

Results

Attempts to Crystallize Recombinant Rv1738 Proteins. Rv1738 was readily expressed in E. coli as a soluble, His6-tagged protein at expression levels of 10 mg/L. After purification, crystallization trials were undertaken using a Cartesian nanoliter dispensing robot, with a 480-component screen (14). No crystals were obtained, either with or without the His6-tag. Attempts were made to crystallize alternative constructs with truncations at the N- and C-termini; the truncations were intended to remove residues preceding the predicted N-terminal β-strand and following the predicted C-terminal α-helix. Other “rescue” strategies that change a protein surface and can alter the propensity for crystallization were also used, including reductive methylation of lysine residues (15) (Rv1738 has four lysines) and surface entropy reduction (16). In the latter approach, eight mutant proteins were prepared in which residues with long, flexible side chains were mutated to alanine. These proteins included single mutants K29A and K37A, double mutants K29A/K37A and K72A/K76A, triple mutants E36A/K37A/E38A and K29A/K72A/K76A, a quadruple mutant K27A/E36A/K37A/E38A, and a sextuple mutant K27A/E36A/K37A/E38A/K72A/K76A. The three proteins with the EKE mutation aggregated and were unsuitable for crystallization, but the other five proteins were used in crystallization trials. No crystals were obtained from any of these protein constructs.

Total Chemical Synthesis of D-Rv1738 and L-Rv1738 Proteins. The 94-residue Rv1738 polypeptide was synthesized by native chemical ligation of three unprotected peptides comprising residues 1–29, 30–65, and 66–94 (Fig. 1 and Fig. S1). Residues Ala30 and Ala66 were initially changed to Cys to enable native chemical ligation at those sites, and were then restored to native Ala residues by desulfurization (17, 18). The same synthetic scheme was used to make each of the full-length L- and D-polypeptide chains, which were folded by rapid dilution after reverse-phase HPLC purification (Fig. S2).

Racemic Crystallization and Structure Determination. Synthetic Rv1738 was crystallized as a racemic mixture consisting of {D-Rv1738 + L-Rv1738}. Conditions 25–96 of the Index crystallization screen (HR2-144; Hampton Research) were screened. These crystallization experiments gave immediate success within 4–7 d, yielding microcrystals in 10 of 72 conditions, for a success rate of 14%. After optimization, synchrotron data were collected for crystals from two conditions. Both crystal forms were in the centrosymmetric space group C2/c, with one molecule per asymmetric unit. Form-I crystals (26% solvent content; Vm = 1.66 Å³/Da) diffracted to 1.65 Å resolution, whereas Form-II crystals, although having greater solvent content (45%; Vm = 2.23 Å³/Da), diffracted to higher resolution (1.5 Å).
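The solvent contents quoted for the two crystal forms are consistent with the standard Matthews relation between the Matthews coefficient Vm and the solvent fraction, Vs ≈ 1 − 1.23/Vm. The sketch below assumes that relation, with its conventional constant of 1.23 Å³/Da for a typical protein; the text does not state how the percentages were obtained, and the function name is illustrative only.

```python
def solvent_fraction(vm, protein_constant=1.23):
    """Estimate crystal solvent fraction from the Matthews coefficient.

    vm: Matthews coefficient in cubic angstroms per dalton.
    protein_constant: conventional value (1.23) for a typical protein;
    an assumption here, not a number taken from the paper.
    """
    return 1.0 - protein_constant / vm


form_i = solvent_fraction(1.66)   # Form-I, Vm = 1.66
form_ii = solvent_fraction(2.23)  # Form-II, Vm = 2.23

print(f"Form-I:  {100 * form_i:.0f}% solvent")
print(f"Form-II: {100 * form_ii:.0f}% solvent")
```

With the Vm values given above, this reproduces the 26% and 45% figures to the nearest percent.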

Fragment-Based Ab Initio Structure Solution. To exploit the predicted advantages of racemic protein crystallography for structure solution (13), we tried a fragment-based ab initio approach based on that used in the program ARCIMBOLDO (19). Because the latter does not work on centrosymmetric space groups, we developed an analogous procedure, described here, and determined a near-complete Rv1738 model, starting with a single idealized α-helix as a search model. Both circular dichroism spectra and secondary structure prediction had indicated that Rv1738 contained at least one α-helix.

An initial structure was determined using 1.9 Å data from the Form-II crystals. Molecular replacement (MR) in PHASER (20) placed a single 14-mer poly-Ala α-helix in the unit cell, giving values of R/Rfree of 76.4%/76.2%. R factors for a centrosymmetric crystal structure are ∼1.45 times higher than those from the same model in a noncentrosymmetric crystal structure (21). Thus, although an MR solution with Rfree of 76.2% would normally seem unpromising, it is equivalent to Rfree ∼50% for a noncentric structure. This solution, making up less than 8% of the total scattering matter of the final Rv1738 model, was refined with REFMAC (22) to improve the phases. Automatic “build” and “extend” model building cycles with BUCCANEER (23), iterated with REFMAC and removal of residues that did not fit the Rv1738 sequence, led to a model comprising 87 of 94 residues, with R/Rfree of 41.4%/44.7%. The structure solution was then completed with conventional model-building with COOT (24) and refinement in PHENIX/REFINE (25), using all data to 1.5 Å resolution.

The final Form-II model comprised one Rv1738 protein molecule (residues 7–94), 88 waters, one glycerol, and a trifluoroacetate ion, and had R/Rfree values of 22.8% and 25.3% (equivalent to Rfree ∼18% for a noncentric structure). The Form-I crystal structure was then solved by MR, using the Form-II structure as a search model, followed by refinement in PHENIX/REFINE and model building in COOT. Validation checks were carried out with MOLPROBITY (26). Details of both final models, refined at 1.65 and 1.5 Å resolution, respectively, are found in Table S1.
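The conversion between centric and noncentric R factors used above is a simple rescaling. As a rough sketch, assuming the factor of ∼1.45 quoted from ref. 21 is applied as a plain divisor (the function and constant names are illustrative, not taken from any refinement program):

```python
CENTRIC_R_SCALE = 1.45  # approximate factor quoted in the text (ref. 21)


def noncentric_equivalent(r_centric_percent, scale=CENTRIC_R_SCALE):
    """Rescale an R factor from a centrosymmetric structure to its
    approximate noncentrosymmetric equivalent."""
    return r_centric_percent / scale


initial_rfree = noncentric_equivalent(76.2)  # single-helix MR solution
final_rfree = noncentric_equivalent(25.3)    # final Form-II model

print(f"MR solution: Rfree 76.2% ~ {initial_rfree:.1f}% noncentric")
print(f"Final model: Rfree 25.3% ~ {final_rfree:.1f}% noncentric")
```

The results, roughly 53% and 17%, are close to the rounded values of ∼50% and ∼18% given in the text; the small differences presumably reflect rounding of the approximate scale factor.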

Structure of the Rv1738 Protein. The two crystal forms have the same space group, but distinct packing, with differing proportions of solvent. In each case, the unit cell contains four dimers, two L-dimers, and two D-dimers, with the L- and D-dimers related to each other by centers of inversion. No significant differences are apparent between the protein structures obtained from the two crystal forms, although the higher-resolution structure was more complete, with residues 7–94 well defined by electron density.

Fig. 1. Synthetic strategy for preparation of Mtb Rv1738. The 94-residue polypeptide, with amino acid sequence MGGDQSDHVL QHWTVDISID EHEGLTRAKA RLRWREKELV GVGLARLNPA DRNVPEIGDE LSVARALSDL GKRMLKVSTH DIEAVTHQPA RLLY, was prepared from three synthetic peptide segments. Native Ala residues (bold, underlined) were initially replaced by Cys residues at the two ligation sites. The first native chemical ligation was followed by conversion of the N-terminal thiazolidine (Thz: 1,3-thiazolidine-4-carboxylic acid) to Cys. This was followed by a second native chemical ligation step. Once the full-length polypeptide was obtained, selective desulfurization was used for the conversion of Cys to Ala in the presence of Met.
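The segment boundaries and ligation sites described in the synthesis can be checked directly against the sequence given in the Fig. 1 legend. The following is only a bookkeeping sketch: the variable names are illustrative, and the sequence is the one printed in the legend with its spacing removed.

```python
# Rv1738 sequence as printed in the Fig. 1 legend (blocks of 10 residues).
SEQUENCE = (
    "MGGDQSDHVL" "QHWTVDISID" "EHEGLTRAKA" "RLRWREKELV" "GVGLARLNPA"
    "DRNVPEIGDE" "LSVARALSDL" "GKRMLKVSTH" "DIEAVTHQPA" "RLLY"
)

# Peptide segments used for native chemical ligation (1-based, inclusive).
SEGMENTS = [(1, 29), (30, 65), (66, 94)]


def segment(seq, start, end):
    """Return residues start..end using 1-based inclusive numbering."""
    return seq[start - 1:end]


peptides = [segment(SEQUENCE, s, e) for s, e in SEGMENTS]

# The N-terminal residue of segments 2 and 3 is the ligation site,
# temporarily replaced by Cys during synthesis.
ligation_residues = [SEQUENCE[s - 1] for s, _ in SEGMENTS[1:]]

lysine_positions = [i + 1 for i, aa in enumerate(SEQUENCE) if aa == "K"]

print("length:", len(SEQUENCE))
print("segment lengths:", [len(p) for p in peptides])
print("ligation-site residues:", ligation_residues)
print("lysines at:", lysine_positions)
print("native Cys present:", "C" in SEQUENCE)
```

As listed, the sequence has Ala at positions 30 and 66, four lysines (matching the count given for reductive methylation), and no native Cys, which is consistent with the need for temporary Ala-to-Cys substitutions at the ligation sites.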


The Rv1738 polypeptide is folded into a three-stranded antiparallel β-sheet with strand order β1-β2-β3, packed against a long C-terminal amphipathic α-helix, residues 55–86 (Fig. 2A). The protein forms an intimately associated homodimer (Fig. 2B) in which the two monomers are related by a crystallographic twofold axis. The Rv1738 dimer is arranged in two layers. The antiparallel β-sheets are paired through their β1 strands to form an extended six-stranded β-sheet that provides a curved outer face for the dimer. This extended β-sheet wraps around the two antiparallel C-terminal α-helices, which pack against the β-sheet and make intimate side chain–side chain interactions with each other. The dimer core is roughly cylindrical, with dimensions 57 × 25 Å.

Both crystal forms are in the same centrosymmetric space group and are dominated by interactions between the L- and D-dimers. In Form-I, residues 5–10 of the extended β1 strands, which are not involved in intramonomer or intradimer hydrogen bonding, are paired to form an antiparallel β-ribbon that links the L- and D-dimers across a center of inversion. Similar β-type hydrogen bonds also link the edge (β3) strands of the L- and D-dimers, again across a center of inversion. This gives a tightly packed (26% solvent) crystal built from layers of L- and D-dimers linked by a continuous network of hydrogen bonds. Similar crystal packing pertains in Form-II (Fig. 3), but with looser packing (45% solvent); the β3 strands are hydrogen-bonded between dimers (Fig. 3A), but the β1 strands are farther apart and do not interact directly.

Fig. 2. Structure of Rv1738. (A) Cartoon diagram, with β-strands in blue and the single C-terminal α-helix in green. The corresponding topology diagram is also shown. (B) The Rv1738 dimer, formed by twofold crystallographic symmetry. The two C-terminal α-helices pack antiparallel to one another, with the Rv1738 β-sheets combining to give an overall 6-stranded sheet that folds over the two helices.

Fig. 3. Form-II crystal packing. (A) L- and D-dimers of Rv1738 are in contact across a center of inversion (o), forming antiparallel β-type hydrogen bonds (Inset). (B) The overall crystal packing, viewed down the b axis, is formed by rows of L-dimers (blue) and D-dimers (green), emphasizing its dependence on interactions between the two enantiomeric forms of Rv1738.

Structural Similarity of Rv1738 to HPFs. Rv1738 is one of a small group of proteins from mycobacteria and other actinobacteria that form the PFAM (27) family DUF1876 (domain of unknown function number 1876), but outside this family, no significant sequence relatives were found. Searches of the Protein Data Bank were conducted with SSM (28), using the Rv1738 monomer as the query structure. The best match was another Mtb protein of unknown function, Rv2632c (PDB ID code 2fgg), with an rms difference (rmsd) of 1.23 Å for 80 equivalent Cα positions and 51% sequence identity. Other matches were all of low significance, but the top hits consistently identified similarities with domains that bind double-stranded RNA (dsRNA). These dsRNA-binding domains typically have an αβββα motif of ∼70 residues (29), and although Rv1738 lacks the first α-helix, its three-stranded β-sheet and C-terminal α-helix superimpose on the dsRBD βββα motif with rmsds of ∼2 Å for ∼55 Cα positions.

More insight came from comparisons with the Rv1738 dimer. These revealed close structural correspondence with a family of ribosome-associated proteins (S30Ae ribosomal proteins, PFAM family PF02482), which have a conserved βαβββα motif. These include the E. coli cold shock protein YfiA (29, 30) and the HPF proteins from E. coli and Vibrio cholerae (31, 32), which restrict protein synthesis during environmental stress. Although the HPFs and YfiA are monomeric, their four β-strands overlay four of the six β-strands of the Rv1738 dimer, and their two helices correspond spatially with the two helices of the Rv1738 dimer. Overall, ∼70 Cα atoms can be matched with an rmsd of 1.5–2.0 Å (Fig. 4A). There are differences, in that the β1 strand and α1 helix of HPF are antiparallel to the matching segments of the Rv1738 dimer, and the α1 helix diverges toward its N-terminal end, but the overall structures are of similar size and shape. Importantly, a band of positive charge around the waist of YfiA and HPF, which is one of their defining features and key to ribosome binding (33, 34), is also found in the Rv1738 dimer (Fig. 4B), and conserved basic residues in YfiA and HPF that contact the ribosome have counterparts in the Rv1738 dimer (Fig. 4A). In contrast, Rv2632c (see earlier) is not dimeric and shares none of these basic residues or positive charge, suggesting Rv1738 has a distinct biological role related to HPFs and the onset of dormancy in Mtb.

DiscussionThe work reported here dramatically illustrates the utility ofracemic protein crystallography, enabled by modern chemicalprotein synthesis, for determining the structures of difficult-to-crystallize protein molecules. Racemic protein crystallographyprovides two major advantages over crystallographic studies onnatural (chiral) proteins. The primary advantage is the relativeease with which protein racemates can be crystallized. Witha racemic protein mixture, it is possible to access centrosym-metric space groups such as P1, P21/c, and C2/c, which allowinteractions between protein molecules that are highly favorableto crystallization (9, 10). Naturally occurring proteins, in con-trast, are restricted to the 65 noncentrosymmetric space groups.This more facile crystallization of protein racemates is an ad-vantage vividly demonstrated by the present example, for whichexhaustive attempts were made to crystallize the recombinantprotein, including many different constructs and the use ofseveral widely used methods for modifying the protein surface.The racemic protein mixture, in contrast, crystallized veryreadily. Intriguingly, both the crystal forms we chose for opti-mization and analysis proved to be in C2/c, one of the spacegroups predicted to be favored, and all the principal crystalpacking contacts were between the L- and D-protein dimers.The second advantage of racemic protein crystallography

concerns phase determination and refinement. Crystallization incentrosymmetric space groups results in quantized phases thatare usually either 0° or 180°, and it has been predicted that thisshould make phase determination easier than for noncentro-symmetric data, for which the phases may take general valuesbetween 0° and 360° (13). The validity of this prediction maydepend on the particular phase determination method used, butour results, supported by test calculations with synthetic data (SIMaterials and Methods), appear to bear this out. For the ab initiomethod used here, starting with a model α-helix of 10–14 resi-dues, a solution emerged much more quickly with centrosymmetric

data than with noncentrosymmetric data. Similar advantagesapply in phasing centrosymmetric data by isomorphous replace-ment (35). Structure refinement also converges faster with cen-trosymmetric data, as the correct phases can be more quicklyestablished early in refinement (36). This is shown clearly by ourtest calculations with synthetic data (Fig. S3) and arises becausegiven a model that is basically correct, even if incomplete, mostof the phases will be exactly right, and refinement can progressin a manner more concerned with fitting the model-derivedamplitudes to the experimental amplitudes than with phase im-provement. The fragment-based ab initio structure solution de-scribed here is a demonstration of the ease of protein structuredetermination in a centrosymmetric space group.The structure of Rv1738 revealed here provides an intriguing

insight into the molecular events thought to be involved in theonset of dormancy of Mtb. Although its amino acid sequenceclassifies Rv1738 within a family of proteins of unknown function(DUF1876), restricted largely to mycobacteria, its folded struc-ture identifies it as a member of a much wider family of proteinsknown to be associated with ribosomes; that is, the S30Ae pro-teins, PFAM family PF02482. Members of this family are alsoreferred to variously as cold shock proteins or HPFs. Both theE. coli cold shock protein YfiA and E. coli HPF have been showncrystallographically to bind to a conserved site on 70S ribosomes,occupying a deep groove in the 30S subunit (33, 34). There theyoverlap the binding sites for tRNA, mRNA, and several initia-tion factors, thereby inhibiting translation and protein synthesis.This inhibition occurs under stress (cold shock) and is rapidlyreversed at normal growth temperatures.Dormancy inMtb is known to be associated with a shutdown of

protein synthesis and to be readily reversible by heat shock (4). Itis known that entry to the dormant state in Mtb, and othermycobacteria, is triggered by hypoxia and controlled by the DosRresponse regulator, which controls the expression of some 48genes in Mtb (6, 37). A study in Mycobacterium smegmatis hasimplicated two S30AE proteins from the dosR regulon as keymediators of ribosome stability during hypoxic stasis (38). Bothhave counterparts in Mtb.The most highly up-regulated dosR gene under hypoxic stress

is Rv1738 (6). Our structural analysis provides a compelling ex-planation for the up-regulation of this gene by revealing thestriking similarity between Rv1738 and the ribosome-associatedstationary-phase proteins HPF and YfiA. The latter are mono-mers, whereas Rv1738 is a dimer, but all have a belt of positive

Fig. 4. Relationship of Rv1738 to bacterial stress response proteins. (A) Superposition of the Rv1738 dimer (blue) on to E. coli YfiA (magenta). Side chains that match between Rv1738 and YfiA and are implicated in ribosome binding by YfiA (34) are indicated. (B) Cartoon and surface representations of the Rv1738 dimer (Left) and YfiA (Right), showing similar belts of surface positive charge (blue). (C) The YfiA binding site on the Thermus thermophilus ribosome. (Inset) Close-up of the Rv1738 dimer superimposed onto YfiA in this site.

Bunker et al. PNAS | April 7, 2015 | vol. 112 | no. 14 | 4313


charge in common and most of the secondary structural elements match spatially, giving a similar size and shape. Modeling the Rv1738 dimer into the groove where HPF and YfiA bind in the bacterial 70S ribosome (34), using either HPF or YfiA as a template, shows that it fits extremely well (Fig. 4C). The major difference between them, divergence of one Rv1738 helix from helix α1 of YfiA, is well accommodated. The Rv1738 dimer also has basic side chains that match spatially with five of the six basic side chains by which YfiA makes key contacts with the ribosome (Fig. 4A).

By analogy with YfiA and HPF, whose effects on ribosome structure and function differ in detail while sharing a common binding site and shared ability to inhibit translation and protein synthesis (34), we propose that Rv1738 acts, very likely with other protein factors, to induce dormancy under conditions of low-oxygen stress by interaction with ribosomes.

Materials and Methods

Recombinant Protein Production and Modification. Recombinant Rv1738 proteins were expressed in E. coli and purified by immobilized metal affinity chromatography and size exclusion chromatography, as described in SI Materials and Methods. The N-terminal His6-tag could be readily removed by incubation with recombinant tobacco etch virus (rTEV) protease. Reductive methylation of lysine residues was carried out with dimethylamine–borane complex and formaldehyde (16), and the modified protein was purified by gel filtration. Mass spectral analysis showed the addition of eight methyl groups (mass increase, 112 Da), implying that all four lysines had been di-methylated. Potential sites for surface entropy reduction were identified with the surface entropy reduction prediction (SERp) server, and eight mutant Rv1738 proteins were made and expressed. Three mutants could not be purified because of aggregation, but the other five were successfully purified as for wild-type Rv1738.
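The mass-spectrometry bookkeeping behind the "eight methyl groups" statement can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes that each methylation replaces one N–H hydrogen with a methyl group (a net gain of CH2, about 14.03 Da on the average-mass scale) and that only lysine side-chain amines are counted, as the text implies.

```python
# Average mass gained when one N-H hydrogen is replaced by a methyl group
# (net addition of CH2), in daltons. Assumption: average isotopic masses.
CH2_AVERAGE_MASS_DA = 14.027


def methylation_mass_shift(n_amines: int, methyls_per_amine: int = 2) -> float:
    """Expected mass increase (Da) when n_amines primary amines each gain
    methyls_per_amine methyl groups (2 = dimethylation)."""
    if n_amines < 0 or methyls_per_amine not in (1, 2):
        raise ValueError("need n_amines >= 0 and 1 or 2 methyls per amine")
    return n_amines * methyls_per_amine * CH2_AVERAGE_MASS_DA


def methyl_groups_from_shift(observed_shift_da: float) -> int:
    """Number of methyl groups implied by an observed mass increase (Da)."""
    return round(observed_shift_da / CH2_AVERAGE_MASS_DA)


# Four lysines, each dimethylated, as reported in the text.
expected_shift = methylation_mass_shift(n_amines=4)
implied_methyls = methyl_groups_from_shift(112.0)
```

With the reported 112 Da increase, `implied_methyls` corresponds to eight methyl groups, which matches full dimethylation of four lysines.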

Total Synthesis of D-Rv1738 and L-Rv1738 Proteins. Rv1738 peptides 1–29 thioester, Thz30-65 thioester, and Cys66-94 were prepared as in Fig. S1. Ligation reactions used previously published conditions: 200 mM sodium phosphate buffer containing 6 M guanidine hydrochloride, 20 mM tris(2-carboxyethyl)phosphine·HCl (TCEP) at pH 6.8, 2–4 mM of each peptide, 30 mM 4-mercaptophenylacetic acid, purged and sealed under argon. The one-pot sequential ligations were carried out on a 17.55-μmole scale for each of the three peptide segments. The Trp(CHO) formyl side chain-protecting groups were removed by adding equal volumes of piperidine and β-mercaptoethanol. Purification yield was 5.6 μmole (60 mg, 32% yield) of full-length polypeptide. Selective desulfurization to restore the native Ala residues at the ligation sites in the synthetic polypeptide was carried out as described in SI Materials and Methods. A total yield of 15 mg of each synthetic polypeptide was isolated and used for crystallization.
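The quoted ligation yield can be reproduced from the numbers in this paragraph. The following is a minimal sketch with hypothetical helper names; it assumes the 32% figure is simply the ratio of isolated product to the 17.55-μmole scale, and it treats the 60 mg as pure polypeptide, which ignores any counterions or residual solvent in the lyophilized material.

```python
def percent_yield(product_umol: float, scale_umol: float) -> float:
    """Isolated yield as a percentage of the starting scale."""
    if scale_umol <= 0:
        raise ValueError("scale must be positive")
    return 100.0 * product_umol / scale_umol


def implied_molar_mass_g_per_mol(mass_mg: float, amount_umol: float) -> float:
    """Molar mass implied by a weighed mass and a molar amount.
    1 mg per umol equals 1000 g per mol."""
    if amount_umol <= 0:
        raise ValueError("amount must be positive")
    return 1000.0 * mass_mg / amount_umol


ligation_yield = percent_yield(product_umol=5.6, scale_umol=17.55)
apparent_mass = implied_molar_mass_g_per_mol(mass_mg=60.0, amount_umol=5.6)
```

Here `ligation_yield` rounds to 32%, and `apparent_mass` comes out a little under 11 kDa, which is the order of magnitude expected for a 94-residue polypeptide.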

Crystallization. Crystals of the D,L-Rv1738 racemic mixture were grown at 19 °C in hanging drops made up with 1 μL protein solution and 1 μL reservoir solution. The protein solution comprised 9 mg/mL of each Rv1738 enantiomer (18 mg/mL total protein) in water. Form-I crystals were obtained using 0.1 M Bis·Tris at pH 5.5, 0.2 M NaCl, 25% (wt/vol) PEG 3350 as precipitant, and Form-II crystals with 0.1 M Bis·Tris at pH 5.5, 25% (wt/vol) PEG 3350 (no NaCl). Full details of the screening procedures for both recombinant and racemic Rv1738 proteins are given in SI Materials and Methods.
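Because each hanging drop mixes equal volumes of protein and reservoir solution, every component starts at half its stock concentration before vapor diffusion concentrates the drop. The sketch below works this out for the Form-I condition; the function and dictionary keys are hypothetical, and ideal volume additivity is assumed.

```python
import math


def initial_drop_concentrations(protein_stock, reservoir,
                                protein_volume_ul=1.0, reservoir_volume_ul=1.0):
    """Concentrations immediately after mixing, before any equilibration.
    Assumes volumes are additive; units are carried in the dictionary keys."""
    total = protein_volume_ul + reservoir_volume_ul
    if total <= 0:
        raise ValueError("total drop volume must be positive")
    drop = {}
    for name, conc in protein_stock.items():
        drop[name] = conc * protein_volume_ul / total
    for name, conc in reservoir.items():
        drop[name] = drop.get(name, 0.0) + conc * reservoir_volume_ul / total
    return drop


# Stock values taken from the text (Form-I condition).
protein_stock = {"L-Rv1738 (mg/mL)": 9.0, "D-Rv1738 (mg/mL)": 9.0}
form_i_reservoir = {
    "Bis-Tris (M)": 0.1,
    "NaCl (M)": 0.2,
    "PEG 3350 (% wt/vol)": 25.0,
}
form_i_drop = initial_drop_concentrations(protein_stock, form_i_reservoir)
```

For the 1 μL + 1 μL drops described above, this gives 4.5 mg/mL of each enantiomer and 12.5% (wt/vol) PEG 3350 at the moment of mixing.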

Structure Determination. Diffraction data were collected from both Form-I and Form-II crystals at the Advanced Photon Source, as detailed in SI Materials and Methods and Table S1. An initial model for the Rv1738 structure was obtained using Form-II data to 1.9 Å with the ab initio approach described in the Results section and in SI Materials and Methods. After reprocessing the Form-II data to higher resolution, the model was refined using data to 1.5 Å resolution. The Form-I crystal structure was then determined by molecular replacement with PHASER (20), using the Form-II structure as a search model (see Results section), and was refined using data to 1.65 Å resolution. Full refinement details are provided in Table S1.
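The phase restriction exploited by the ab initio approach follows from the standard structure factor equation. The short derivation below uses textbook notation rather than anything taken from the SI, and it assumes that the unit-cell origin lies on an inversion center and that the scattering factors f_j are real (anomalous scattering neglected). Each atom at position r_j in an L-molecule is then paired with an equivalent atom at −r_j in a D-molecule:

```latex
F(\mathbf{h})
  = \sum_{j=1}^{N/2} f_j \left[ e^{2\pi i\,\mathbf{h}\cdot\mathbf{r}_j}
                              + e^{-2\pi i\,\mathbf{h}\cdot\mathbf{r}_j} \right]
  = 2 \sum_{j=1}^{N/2} f_j \cos\!\left( 2\pi\,\mathbf{h}\cdot\mathbf{r}_j \right)
\quad\Longrightarrow\quad
F(\mathbf{h}) \in \mathbb{R}, \qquad
\varphi(\mathbf{h}) \in \{0, \pi\}.
```

The sine terms cancel pairwise, so every structure factor is real and the phase problem reduces to choosing a sign for each reflection.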

ACKNOWLEDGMENTS. We thank Dr. Geoffrey B. Jameson for helpful discussions at several stages of this work. Funding was provided by the Health Research Council of New Zealand, the Tertiary Education Commission of New Zealand through Centre of Research Excellence funding of the Maurice Wilkins Centre, and the University of Auckland (Doctoral Scholarships to R.D.B. and J.J.C.). Use of NE-CAT beamline 24-ID at the Advanced Photon Source, Argonne National Laboratory, was supported by Award RR-15301 from the National Center for Research Resources at the National Institutes of Health.

1. Corbett EL, et al. (2003) The growing burden of tuberculosis: Global trends and interactions with the HIV epidemic. Arch Intern Med 163(9):1009–1021.

2. Parrish NM, Dick JD, Bishai WR (1998) Mechanisms of latency in Mycobacterium tuberculosis. Trends Microbiol 6(3):107–112.

3. Wayne LG, Sohaskey CD (2001) Nonreplicating persistence of Mycobacterium tuberculosis. Annu Rev Microbiol 55:139–163.

4. Hu YM, Butcher PD, Sole K, Mitchison DA, Coates ARM (1998) Protein synthesis is shutdown in dormant Mycobacterium tuberculosis and is reversed by oxygen or heat shock. FEMS Microbiol Lett 158(1):139–145.

5. Pandey AK, Sassetti CM (2008) Mycobacterial persistence requires the utilization of host cholesterol. Proc Natl Acad Sci USA 105(11):4376–4380.

6. Sherman DR, et al. (2001) Regulation of the Mycobacterium tuberculosis hypoxic response gene encoding alpha-crystallin. Proc Natl Acad Sci USA 98(13):7534–7539.

7. Voskuil MI, et al. (2003) Inhibition of respiration by nitric oxide induces a Mycobacterium tuberculosis dormancy program. J Exp Med 198(5):705–713.

8. Sassetti CM, Boyd DH, Rubin EJ (2003) Genes required for mycobacterial growth defined by high density mutagenesis. Mol Microbiol 48(1):77–84.

9. Yeates TO, Kent SBH (2012) Racemic protein crystallography. Annu Rev Biophys 41:41–61.

10. Wukovitz SW, Yeates TO (1995) Why protein crystals favour some space-groups over others. Nat Struct Biol 2(12):1062–1067.

11. Dawson PE, Muir TW, Clark-Lewis I, Kent SBH (1994) Synthesis of proteins by native chemical ligation. Science 266(5186):776–779.

12. Kent SBH (2009) Total chemical synthesis of proteins. Chem Soc Rev 38(2):338–351.

13. Mackay AL (1989) Crystal enigma. Nature 342:133.

14. Moreland N, et al. (2005) A flexible and economical medium-throughput strategy for protein production and crystallization. Acta Crystallogr D Biol Crystallogr 61(Pt 10):1378–1385.

15. Rayment I (1997) Reductive alkylation of lysine residues to alter crystallization properties of proteins. Methods Enzymol 276:171–179.

16. Cooper DR, et al. (2007) Protein crystallization by surface entropy reduction: Optimization of the SER strategy. Acta Crystallogr D Biol Crystallogr 63(Pt 5):636–645.

17. Yan LZ, Dawson PE (2001) Synthesis of peptides and proteins without cysteine residues by native chemical ligation combined with desulfurization. J Am Chem Soc 123(4):526–533.

18. Pentelute BL, Kent SB (2007) Selective desulfurization of cysteine in the presence of Cys(Acm) in polypeptides obtained by native chemical ligation. Org Lett 9(4):687–690.

19. Rodríguez DD, et al. (2009) Crystallographic ab initio protein structure solution below atomic resolution. Nat Methods 6(9):651–653.

20. McCoy AJ, et al. (2007) Phaser crystallographic software. J Appl Cryst 40(Pt 4):658–674.

21. Luzzati V (1952) Traitement statistique des erreurs dans la determination des structures cristallines. Acta Crystallogr 5:802–810.

22. Murshudov GN, et al. (2011) REFMAC5 for the refinement of macromolecular crystal structures. Acta Crystallogr D Biol Crystallogr 67(Pt 4):355–367.

23. Cowtan K (2006) The Buccaneer software for automated model building. 1. Tracing protein chains. Acta Crystallogr D Biol Crystallogr 62(Pt 9):1002–1011.

24. Emsley P, Lohkamp B, Scott WG, Cowtan K (2010) Features and development of Coot. Acta Crystallogr D Biol Crystallogr 66(Pt 4):486–501.

25. Adams PD, et al. (2010) PHENIX: A comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr D Biol Crystallogr 66(Pt 2):213–221.

26. Chen VB, et al. (2010) MolProbity: All-atom structure validation for macromolecular crystallography. Acta Crystallogr D Biol Crystallogr 66(Pt 1):12–21.

27. Finn RD, et al. (2010) The Pfam protein families database. Nucleic Acids Res 38(Database issue, Suppl. 1):D211–D222.

28. Krissinel E, Henrick K (2004) Secondary-structure matching (SSM), a new tool for fast protein structure alignment in three dimensions. Acta Crystallogr D Biol Crystallogr 60(Pt 12 Pt 1):2256–2268.

29. Ye K, Serganov A, Hu W, Garber M, Patel DJ (2002) Ribosome-associated factor Y adopts a fold resembling a double-stranded RNA binding domain scaffold. Eur J Biochem 269(21):5182–5191.

30. Rak A, Kalinin A, Shcherbakov D, Bayer P (2002) Solution structure of the ribosome-associated cold shock response protein Yfia of Escherichia coli. Biochem Biophys Res Commun 299(5):710–714.

31. Sato A, et al. (2009) Solution structure of the E. coli ribosome hibernation promoting factor HPF: Implications for the relationship between structure and function. Biochem Biophys Res Commun 389(4):580–585.

32. De Bari H, Berry EA (2013) Structure of Vibrio cholerae ribosome hibernation promoting factor. Acta Crystallogr Sect F Struct Biol Cryst Commun 69(Pt 3):228–236.

33. Vila-Sanjurjo A, Schuwirth B-S, Hau CW, Cate JHD (2004) Structural basis for the control of translation initiation during stress. Nat Struct Mol Biol 11(11):1054–1059.

4314 | www.pnas.org/cgi/doi/10.1073/pnas.1422387112 Bunker et al.


34. Polikanov YS, Blaha GM, Steitz TA (2012) How hibernation factors RMF, HPF, and YfiA turn off protein synthesis. Science 336(6083):915–918.

35. Matthews BW (2009) Racemic crystallography – easy crystals and easy structures: What’s not to like? Protein Sci 18(6):1135–1138.

36. Zawadzke LE, Berg JM (1993) The structure of a centrosymmetric protein crystal. Proteins 16(3):301–305.

37. Park HD, et al. (2003) Rv3133c/dosR is a transcription factor that mediates the hypoxic response of Mycobacterium tuberculosis. Mol Microbiol 48(3):833–843.

38. Trauner A, Lougheed KEA, Bennett MH, Hingley-Wilson SM, Williams HD (2012) The dormancy regulator DosR controls ribosome stability in hypoxic mycobacteria. J Biol Chem 287(28):24053–24063.
