JOURNAL OF BACTERIOLOGY, July 1989, p. 3901-3908, Vol. 171, No. 7
0021-9193/89/073901-08$02.00/0
Copyright © 1989, American Society for Microbiology

Ribosome-Binding Sites and RNA-Processing Sites in the Transcript of the Escherichia coli unc Operon

ELIZABETH M. SCHAEFER,¹ DIETER HARTZ,² LARRY GOLD,² AND ROBERT D. SIMONI¹*

Department of Biological Sciences, Stanford University, Stanford, California 94305-5020,¹ and Department of Molecular, Cellular, and Developmental Biology, University of Colorado, Boulder, Colorado 80309²

Received 13 January 1989/Accepted 4 April 1989

* Corresponding author.

The polycistronic mRNA encoding the nine genes of the unc operon of Escherichia coli was studied. We demonstrated the ribosome-binding capabilities of six of the nine unc genes, uncB, uncE, uncF, uncH, uncA, and uncD, by using the technique of primer extension inhibition or "toeprinting." No toeprint was detected for the other genes, uncI, uncG, and uncC. The lack of a toeprint for uncG suggests that this gene is expressed by some form of translational coupling, such that either uncG is read by ribosomes which have translated the preceding gene, uncA, or translation of uncA is required for ribosome binding at the uncG site. RNA sequencing and primer extension in the regions of uncI and uncC, the first and last genes in the operon, respectively, gave less intense signals than those obtained for the other unc genes. This suggested that there are fewer copies of those regions of the transcript and that processing of the unc transcript occurred. Using primer extension and RNA sequencing, we identified sites in the unc transcript at which processing appears to take place, including a site which may remove much of the uncI portion of the transcript. Northern (RNA) blot analysis of unc RNA is consistent with the presence of an RNA-processing site in the uncI region of the transcript and another in the uncH region. These processing events may account for some of the differential levels of expression of the unc genes.

The unc operon of Escherichia coli encodes the subunits of the proton-translocating ATPase, a membrane-bound enzyme complex capable of reversibly coupling translocation of protons across the plasma membrane to ATP synthesis or hydrolysis. This enzyme is similar in structure and function to enzymes of other bacteria, as well as to those of mitochondria and chloroplasts. Under aerobic conditions, it can synthesize ATP by using the electrochemical proton gradient generated by the electron transport chain, while in an anaerobic state it hydrolyzes the ATP produced by anaerobic metabolism to create an electrochemical proton gradient capable of driving other cellular processes.

The ATPase complex is composed of two domains known as F0 and F1. Three types of integral membrane proteins, a, b, and c, make up the F0 domain, which forms a proton pore through the membrane. The catalytic F1 portion of the complex is formed by five types of peripheral membrane proteins known as α, β, γ, δ, and ε. Each subunit is encoded by a single gene, and these eight genes plus a ninth (uncI), which encodes a polypeptide (i) of unknown function, form the unc operon, located at about 83.5 min on the E. coli chromosome. The unc promoter has been mapped, and the gene order in the operon has been determined to be uncIBEFHAGDC, corresponding to subunits i, a, c, b, δ, α, γ, β, and ε (for reviews, see references 5 and 27).

Previous experiments have indicated that a single polycistronic mRNA encodes the entire operon (10). However, the stoichiometry of the subunits in the mature ATPase is quite disparate and is probably a1, c10, b2, δ1, α3, γ1, β3, and ε1 (4). Experiments performed in vitro and in minicells have indicated that the unc polypeptides are not made in equimolar amounts but are actually synthesized in relative amounts approximating their final stoichiometry in the mature complex (2, 22). No differential degradation of the subunits is detected when they are synthesized in vitro, confirming that they are produced in unequal amounts (2). These results, taken together, suggest that control at the level of translation is responsible for the differences in expression of the unc genes.

Statistical studies of codon usage indicate that there is a bias toward rare codon usage in unc genes (uncH, uncG, and uncC) that are expressed in low amounts (5, 11), as in other bacterial genes. This may play a part in the differential synthesis of the ATPase polypeptides, but other mechanisms are probably important as well.

Several investigations have indicated that differences in rate of initiation of translation among the unc genes account for much of the disparity in their levels of synthesis. It has been suggested that the presence of secondary structures in some intergenic regions of the unc message affects translation of the subsequent genes by inhibiting ribosome binding (2). Computer predictions have indicated the presence of secondary structures before uncF, uncH, and uncG (16). These structures have not been directly demonstrated, but manipulations of the translation initiation regions of some of the unc genes (15, 17), some of which were predicted to change the secondary as well as the primary structures of those regions, have been shown to increase levels of expression. These studies suggest that several factors affect the rate of translation initiation but indicate that the rate of elongation cannot be the limiting step in the translation of these genes.

The translation initiation region of uncE, the most highly expressed gene in the operon, has been shown to support relatively efficient translation initiation, both in its native position and when cloned in front of other genes (16, 18). Generation of progressive deletions upstream of the uncE coding sequence showed that the full efficiency of this region is obtained only when at least 30 bases of the sequence before the initiation codon are present (18).

Deletions before the uncE initiation codon have been shown to decrease the expression of both uncE and uncF
(16), indicating that some form of translational coupling, such as reinitiation by ribosomes that read the previous gene, occurs here. Mutations introduced before uncF or in the uncF initiation codon (15) cause changes in the expression of both uncF and the following gene, uncH, suggesting that some form of translational coupling occurs between these genes as well.

In the study described here, we used the technique of primer extension inhibition or "toeprinting" (6, 8) to investigate whether some of the differential translation in this operon is due to an inability of some of the genes to bind ribosomes. We then used Northern (RNA) blot analysis of the unc transcript, in addition to primer extension and RNA sequencing, to identify possible RNA-processing sites which result in shortened species of the unc transcript. These shortened species are probably partly responsible for the differential expression of unc genes.

MATERIALS AND METHODS

Materials. [γ-32P]ATP (>3 Ci/mol) and [α-32P]dATP (>5 Ci/mol) were purchased from Amersham Corp. (Arlington Heights, Ill.). E. coli tRNAfMet was obtained from Boehringer Mannheim Biochemicals (Indianapolis, Ind.). Avian myeloblastosis virus reverse transcriptase was purchased from Life Sciences, Inc. (St. Petersburg, Fla.), and DNA polymerase I was purchased from New England BioLabs (Beverly, Mass.).

Each oligonucleotide used as a sequencing primer was 20 bases long and was complementary to the unc transcript starting at the following sites: nucleotide 160 of the uncI coding sequence, nucleotide 80 of uncB, nucleotide 90 of uncE, nucleotide 60 of uncF, nucleotide 110 of uncH, nucleotide 70 of uncA, nucleotide 130 of uncG, nucleotide 90 of uncD, and nucleotide 150 of uncC.

For Northern blots, the uncG sequencing primer and two other oligonucleotides, which were 45 nucleotides long, were used as probes. The uncI probe was complementary to the region starting at nucleotide 1 of uncI. The uncC probe was complementary to the sequence starting at position 374 of uncC and thus extended to the end of the uncC coding sequence.

Oligonucleotides were made on an Applied Biosystems 380B DNA synthesizer in the Department of Biological Sciences, Stanford University. All other chemicals were of a commercially available high grade.

Bacterial strains and plasmids. Strain 1100 (unc+) (9) and strain 1100 Δ(uncB-uncC) were used to isolate RNA to be used for RNA sequencing and toeprinting. The 1100 Δ(uncB-uncC) strain was constructed by Carol Kumamoto in our laboratory by P1 transduction of a ΔuncB-uncC plasmid to ilv, followed by complementation analysis with various plasmids encoding partial unc operons. Strain DK6 (14), a minicell-producing strain carrying an unc deletion, was used for minicell synthesis and labeling of plasmid-encoded polypeptides.

Plasmid pRPG54, carrying uncB-uncC, has been previously described (7). A point mutation discovered in pRPG54 has also been described and characterized (29) and was found to be present in our laboratory stock of this plasmid as well. The whole-operon plasmid was recloned from pRPG23 and pRPG32, as in the original construction, and this time sequencing of the entire uncB-uncE intergenic region showed that the sequence had remained wild type (data not shown). The new plasmid was named pEMS54.

Preparation of RNA. E. coli was grown at 37°C in Luria broth (10 g of NaCl, 10 g of tryptone, and 5 g of yeast extract per liter) with 0.2% glucose (containing 30 μg of chloramphenicol per ml when plasmids were present) to approximately 30 Klett units, and then total cellular nucleic acids were isolated as described by McPheeters et al. (20).

Preparation of 30S ribosomal subunits. The 30S ribosomal subunits used were a kind gift of Robert Traut, Department of Biological Chemistry, University of California, Davis. These were prepared by the method of Kenny et al. (13).

RNA sequencing and primer extension inhibition. End labeling of oligonucleotide primers, RNA sequencing, initiation complex formation, and extension inhibition were performed as previously described (8), with the following modifications. A 10- to 100-μg sample of total cellular nucleic acids was used for each annealing mixture. Avian myeloblastosis virus reverse transcriptase (27 U/μl) was diluted before use as follows: for sequencing reactions, 0.5 μl of reverse transcriptase, 4.5 μl of reverse transcriptase storage buffer, 3 μl of 10× reverse transcriptase buffer, and 22 μl of H2O; for extension reactions, 1 μl of reverse transcriptase, 9 μl of reverse transcriptase storage buffer, 6 μl of 5× standard buffer (containing 50 mM magnesium acetate), and 14 μl of H2O. Reverse transcriptase storage buffer is 0.2 M KPO4 (pH 7.2)-2 mM dithiothreitol-0.2% Triton X-100-50% glycerol, and 10× reverse transcriptase buffer is 0.5 M Tris hydrochloride (pH 8.6)-0.6 M NaCl-60 mM magnesium acetate-0.1 M dithiothreitol. The final concentration of the 30S ribosomal subunits was 1 μM, and the final concentration of tRNAfMet was 10 μM. Reactions were terminated by adding 10 μl of loading dye to the sequencing reactions and 20 μl to the extension reactions and heating them to 95°C for 2 min. A 5-μl sample of each reaction mixture was loaded on an 8% sequencing gel.
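The two reverse transcriptase dilutions described above can be checked with a few lines of arithmetic. The sketch below is only an illustration, not part of the published protocol: it takes the standard buffer to be a 5× stock (an assumption about the garbled source text), and all variable and function names are hypothetical. It reports the concentrations in the diluted enzyme mixes themselves, not in the final reactions.

```python
# Stock enzyme concentration stated in the text.
STOCK_RT_UNITS_PER_UL = 27.0

# Volumes (in microliters) of each component, as listed in the text.
SEQUENCING_MIX_UL = {
    "reverse transcriptase": 0.5,
    "storage buffer": 4.5,
    "10x reverse transcriptase buffer": 3.0,
    "H2O": 22.0,
}
EXTENSION_MIX_UL = {
    "reverse transcriptase": 1.0,
    "storage buffer": 9.0,
    "5x standard buffer": 6.0,  # assumption: the stock is 5x
    "H2O": 14.0,
}


def summarize(mix_ul, buffer_name, buffer_fold):
    """Return (total volume, enzyme U/ul, buffer strength in x) for one mix."""
    total = sum(mix_ul.values())
    enzyme = STOCK_RT_UNITS_PER_UL * mix_ul["reverse transcriptase"] / total
    buffer_strength = buffer_fold * mix_ul[buffer_name] / total
    return total, enzyme, buffer_strength


seq_total, seq_enzyme, seq_buffer = summarize(
    SEQUENCING_MIX_UL, "10x reverse transcriptase buffer", 10.0
)
ext_total, ext_enzyme, ext_buffer = summarize(
    EXTENSION_MIX_UL, "5x standard buffer", 5.0
)

print(f"sequencing mix: {seq_total} ul, {seq_enzyme:.2f} U/ul, {seq_buffer:.1f}x buffer")
print(f"extension mix:  {ext_total} ul, {ext_enzyme:.2f} U/ul, {ext_buffer:.1f}x buffer")
```

Under these assumptions both mixes come to 30 μl with the buffer at 1×, and the extension mix carries twice the enzyme concentration of the sequencing mix.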

Densitometry. An LKB UltroScan XL laser densitometer was used for densitometry.

Northern blot analysis. Separation of RNA on a formaldehyde-containing agarose gel, transfer to nitrocellulose, preparation of end-labeled oligonucleotides, and Northern blot analysis were performed by standard methods (26) with the following modifications. The prehybridization and hybridization solutions did not contain formamide and did contain 0.5% sodium dodecyl sulfate and 10 mM dithiothreitol. Oligonucleotides end labeled as described above were used as probes.

RESULTS

Detection of ribosome-binding sites in the unc operon. We used primer extension inhibition to detect the ribosome-binding sites in the unc operon. Total cellular nucleic acids were isolated from 1100, a strain of E. coli that is wild type for unc, and from 1100 ΔuncB-uncC, a strain deleted for nearly the entire unc operon, carrying plasmid pRPG54 or pEMS54, each of which carries uncB-uncC. A radioactively labeled synthetic oligonucleotide primer complementary to a site 80 to 160 nucleotides downstream of the start codon of the gene of interest was annealed to this mixture. Then 30S ribosomes were allowed to bind to the RNA in the presence of uncharged tRNAfMet. Avian myeloblastosis virus reverse transcriptase was then used in the presence of deoxynucleotides to synthesize cDNA which extended toward the 5' end of the transcript in a primer extension reaction. A control reaction to which no 30S particles or tRNA were added was performed in parallel. In addition, separate dideoxynucleotide sequencing reactions were performed on this annealing mixture, in the absence of 30S particles and tRNA, using the labeled primer as the sequencing primer.


FIG. 1. Toeprints of the genes of the unc operon. Arrows indicate the toeprint bands detected. The bracket in panel G indicates the site at which a toeprint band was expected on the basis of the uncG sequence; no band was detected at this site. Panel C shows the uncE toeprint obtained from pRPG54, which carries a single-base substitution in the uncE Shine-Dalgarno sequence. All others are from a wild-type transcript, either chromosomal (strain 1100) or encoded by pEMS54 (uncB-uncC). Lane A indicates an A in the transcript, and ddTTP was used in the cDNA reactions in this lane. Similarly, ddGTP was used in lane C, ddCTP was used in lane G, and ddATP was used in the lane T sequencing reaction. Primer extension without 30S ribosomes or tRNAfMet is indicated by a minus, and that with 30S particles (1 μM) and tRNAfMet (10 μM) is indicated by a plus. The lane order is the same in each panel.
Panels: A, uncB; B, uncE; C, uncE (mutation); D, uncF; E, uncH; F, uncA; G, uncG; H, uncD. Lane order: A, C, G, T, -, +.

The products of these reactions were loaded on adjacent lanes of a polyacrylamide sequencing gel and separated by electrophoresis. A toeprint band results when the reverse transcriptase encounters a bound ribosome on the transcript, which causes cDNA synthesis to terminate at that point. Previous analysis by this method (6, 8, 21, 33) has shown that the position of the toeprint band is usually 15 nucleotides downstream of the first nucleotide of the initiation codon of the gene when tRNAfMet is used to stabilize ribosome binding at that site. Thus, by comparing the products of the primer extension reactions with and without ribosomes, we could identify sites at which the cDNA was interrupted because of ribosome binding. By comparing the site of the toeprint band with the RNA sequence on the gel, we were able to identify the initiation codon recognized by the ribosome.

We detected ribosome-binding sites for six of the nine genes of the operon: uncB, uncE, uncF, uncH, uncA, and uncD. We did not detect toeprints for uncI, uncG, or uncC.

The toeprints obtained are shown in Fig. 1. Two toeprints of uncE are included; that resulting from RNA encoded by pEMS54, which carries the wild-type uncE Shine-Dalgarno sequence (28, 30), is shown in Fig. 1B, and the toeprint from RNA of pRPG54, which carries a mutation which changes the GGAG Shine-Dalgarno sequence to GGAA, is shown in Fig. 1C. Both experiments resulted in toeprints at the same position, +15 of the A of the AUG initiation codon of uncE, as expected.

The two uncE toeprints (Fig. 1B and C) from genes with different Shine-Dalgarno sequences yielded toeprints of different intensities, however. Densitometric analysis of these two bands, normalized to the sequencing lanes of both samples in this region to account for slight differences in RNA concentration, indicated that pEMS54 (wild type) yields an uncE toeprint which is at least 2.3 times more intense than that of pRPG54 (carrying the ribosome-binding site mutation). Comparison of ATPase subunit syntheses from these two plasmids in minicells showed that pEMS54 produces 2.3 times as much of the c subunit (encoded by uncE) as does pRPG54 and somewhat more b and δ but similar amounts of all of the other subunits (Fig. 2). This is in agreement with the results of in vitro transcription and translation assays of pRPG54 and a plasmid identical to pEMS54 (29) which showed a two- to threefold difference in uncE expression between the two. These results suggest that this mutation decreased the efficiency of uncE ribosome binding by this amount and agree with previous conclusions (6, 8, 21, 33; D. Hartz, unpublished data) that the strength of the toeprint corresponds to the efficiency of the ribosome-binding site.
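The normalization step behind the densitometric comparison can be sketched as follows. This is a minimal illustration, assuming that each toeprint band is divided by a reference band from the sequencing lanes of the same sample before the two samples are compared. The readings below are hypothetical placeholders chosen only so that the result matches the reported 2.3-fold figure; they are not measurements from the paper, and the function names are invented for this example.

```python
def normalized_intensity(toeprint_band, reference_band):
    """Toeprint signal relative to a sequencing-lane band from the same sample."""
    return toeprint_band / reference_band


def fold_difference(sample_a, sample_b):
    """Ratio of normalized toeprint intensities, sample_a over sample_b."""
    return normalized_intensity(*sample_a) / normalized_intensity(*sample_b)


# HYPOTHETICAL densitometer readings (arbitrary units): (toeprint, reference).
PEMS54_READINGS = (4.6, 2.0)  # wild-type Shine-Dalgarno sequence, GGAG
PRPG54_READINGS = (1.5, 1.5)  # mutant Shine-Dalgarno sequence, GGAA

ratio = fold_difference(PEMS54_READINGS, PRPG54_READINGS)
print(f"pEMS54 / pRPG54 normalized toeprint intensity: {ratio:.1f}")
```

Dividing by a reference band first means that a sample loaded with more RNA does not appear to have a stronger ribosome-binding site simply because every band in its lanes is darker.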


FIG. 2. Products of minicell-directed protein synthesis from pEMS54 (lane 1) and pRPG54 (lane 2). The two plasmids differ only in a single base change in the Shine-Dalgarno sequence of uncE, which encodes subunit c. The wild-type sequence, GGAG, is carried by pEMS54, while the pRPG54 sequence is GGAA and so contains a shorter sequence (GGA) complementary to 16S rRNA.

The position of the uncB toeprint (Fig. 1A) allowed us to determine the actual initiation site of uncB by showing the particular initiation codon which is recognized by ribosomes. In this case, another potential in-frame start codon is located 15 nucleotides later (Fig. 3), and we did not detect a toeprint which would correspond to this second possible initiation site (Fig. 4). The amino-terminal protein sequence of the a subunit (encoded by uncB) has not been determined, as have those of the other ATPase subunits (32), so the second possible start site had not previously been ruled out. Thus, the amino-terminal sequence of the a subunit is Met-Ala-Ser-Glu-Asn-Met-Thr.

A summary of the toeprints we detected is diagrammed in Fig. 3. Each of the six genes which show toeprints had bands

uncI  CCUCGAAGGGAGCAGGAGUGAAAAACGUGAUGUCU
uncB  AAAGGGUAAAAGGCAUCAUGGCUUCAGAAAAUAUG
uncE  ACAAACUGGAGACUGUCAUGGAAAACCUGAAUAUG
uncF  AAUAGGAGCAUUGUGCUGUGAAUCUUAACGCAACA
uncH  UAAGGAGGGAGGGGCUGAUGUCUGAAUUUAUUACG
uncA  CUUAAGGGGACUGGAGCAUGCAACUGAAUUCCACC
uncG  GCAUUGAGGAGAAGCUCAUGGCCGGCGCAAAAGAG
uncD  UUCGUAGAGGAUUUAAGAUGGCUACUGGAAAGAUU
uncC  CUUAAUCGGAGGGUGAUAUGGCAAUGACUUACCAC
                       0              15

FIG. 3. Summary of the translation initiation regions of the unc genes and their toeprints. Potential Shine-Dalgarno sequences are underlined. Initiation codons are in boldface in the centers of the lines. The toeprints detected are in boldface at or around the +15 position of each gene. There is a second in-frame AUG in uncB at position +15 (see Results).
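The reasoning about the two candidate uncB start codons can be traced on the uncB line of Fig. 3. The sketch below is an illustration only. It assumes that the ambiguous character at position 13 of that line is G, that the central AUG begins at nucleotide 18 of the 35-nucleotide window (position 0 in the figure), and that a toeprint falls 15 nucleotides downstream of the first nucleotide of the codon used. The names are hypothetical, and the codon table covers only the codons in this window.

```python
# 35-nt window around the uncB start codon, as listed in Fig. 3.
# Assumption: the unreadable character at position 13 is G.
UNCB_WINDOW = "AAAGGGUAAAAGGCAUCAUGGCUUCAGAAAAUAUG"
START_INDEX = 17      # 0-based index of the A of the central AUG (position 0)
TOEPRINT_OFFSET = 15  # toeprint is usually 15 nt downstream of that A

# Only the codons needed for this window; anything else is reported as "Xaa".
CODONS = {"AUG": "Met", "GCU": "Ala", "UCA": "Ser", "GAA": "Glu", "AAU": "Asn"}


def in_frame_start_offsets(seq, start):
    """Offsets (relative to start) of in-frame AUG codons at or after start."""
    return [i - start for i in range(start, len(seq) - 2, 3) if seq[i:i + 3] == "AUG"]


def translate(seq, start):
    """Translate the complete codons from start to the end of the window."""
    codons = [seq[i:i + 3] for i in range(start, len(seq) - 2, 3)]
    return "-".join(CODONS.get(codon, "Xaa") for codon in codons)


offsets = in_frame_start_offsets(UNCB_WINDOW, START_INDEX)
expected_toeprints = [offset + TOEPRINT_OFFSET for offset in offsets]
peptide = translate(UNCB_WINDOW, START_INDEX)

print(offsets)             # candidate start codons, relative to the first AUG
print(expected_toeprints)  # where a toeprint would be expected for each candidate
print(peptide)             # residues encoded within the window
```

Under these assumptions the two candidates sit at positions 0 and +15, so their toeprints would be expected at +15 and +30. A band at only the first of these is what identifies the first AUG as the codon recognized by ribosomes, and the window then reads Met-Ala-Ser-Glu-Asn-Met, matching the start of the amino-terminal sequence given in Results.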

FIG. 4. RNA sequencing reactions and primer extension reactions without (-) and with (+) 30S ribosomes and tRNAfMet with an oligonucleotide primer complementary to uncB bound to RNA transcribed from pEMS54 (uncB-uncC). Number 1 indicates the end of the pEMS54 transcript near the second HindIII site in uncI. Number 2 shows the reverse transcriptase stop site in uncI which may be due to specific processing of the unc transcript. Number 3 indicates the uncB toeprint. Number 4 shows the site at which a toeprint would be expected if the AUG at position 15 of uncB were used for translation initiation. No toeprint was detected at this site.

located at +15 from the start of the coding sequence, as expected. We observed additional bands for uncF and uncA, however. Bands were seen at the +15 and +17 positions for uncF and at the +14 and +15 sites for uncA. There were no potential initiation codons at the positions 15 bases before these extra bands, so it seems unlikely that these represent actual initiation sites.

No toeprints were detected for uncI, uncG, or uncC. RNA sequencing from the primer used to study uncG resulted in signals of typical intensity, but even after long exposure of the autoradiogram, no toeprint band was detected anywhere in the vicinity of the uncG initiation region.

No toeprints were detected in the uncI or uncC initiation region either. The signals obtained from RNA sequencing and primer extension near the start of these two genes were significantly weaker than those from any of the other regions, although the same RNA preparations were used. After longer autoradiogram exposures than usual, we were able to identify the sequences as the correct ones in each case, and there was little underlying sequence which might indicate multiple primer-binding sites. The oligonucleotide primers were designed to bind to regions of the transcript which had relatively little secondary structure on the basis of computer predictions, so it seems unlikely that secondary


structures are responsible for inhibiting the binding of these primers. Thus, the sequences obtained suggested that the primers bound at the predicted sites but that lower levels of the transcript encoding these genes were present. Even after exposure for many days, no toeprints were detected anywhere near the start of uncI or uncC.

Detection of reverse transcriptase stop sites. An unexpected result of the primer extension inhibition experiments was that several sites were revealed in the unc message at which stops reproducibly occurred in a significant proportion of the cDNA, whether ribosomes were present or absent. One such site was found near the 3' end of the uncI coding sequence, 324 nucleotides from the translation initiation site (Fig. 4). The reverse transcriptase stop was seen at this site whether the RNA had been transcribed from a plasmid or the chromosome. Another prominent stop is located at position 452 of uncH (data not shown). Sites which look similar but at which the reverse transcriptase stops appeared less prominently were identified at position 7 of the uncG-uncD intergenic region and at position 77 of the uncC coding sequence (data not shown).

There are many other less intense reverse transcriptase stops throughout the unc mRNA. The cause of these stops is unknown, but it is possible that they represent sites at which some of the unc operon transcript is processed away, either by specific cleavage or by exonucleolytic degradation.

Northern blot analysis of the unc transcript. To further examine the size(s) of the unc message, we performed Northern blots of nucleic acids prepared from strain 1100, which encodes a wild-type unc operon. We probed this material with three end-labeled oligonucleotides. The uncI probe was a 45-mer complementary to the 5' end of uncI. The uncG probe was a 20-mer complementary to an internal portion of the uncG coding sequence. The uncC probe was a 45-mer complementary to the 3' end of uncC. The labeled probes used for each hybridization had nearly identical specific activities, and the same amount of probe was used in each case.

A ladder of RNA fragments was run alongside the sample lanes, and then the ladder and a sample lane were cut off and stained. The migration distances of these bands with known fragment lengths were measured, as were the 23S and 16S rRNA bands in the sample. The Northern blot results are shown in Fig. 5, and the size markers are indicated. The standard curve obtained from the ladder is shown in Fig. 6. The 23S and 16S RNA bands, whose sizes are known (1), fit the curve well.
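A standard curve of this kind can be sketched as a straight-line fit of the logarithm of fragment length against migration distance, followed by interpolation for an unknown band. The example below is a minimal illustration under that assumption. The ladder sizes and distances are hypothetical placeholders, not the values from Fig. 5 or Fig. 6, and the function names are invented; the fitting procedure the authors actually used is not stated.

```python
import math


def fit_log_linear(sizes_kb, distances_cm):
    """Least-squares fit of log10(size) = slope * distance + intercept."""
    logs = [math.log10(size) for size in sizes_kb]
    n = len(logs)
    mean_x = sum(distances_cm) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in distances_cm)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(distances_cm, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_size_kb(distance_cm, slope, intercept):
    """Read a fragment length off the fitted standard curve."""
    return 10 ** (slope * distance_cm + intercept)


# HYPOTHETICAL ladder (placeholder values, not data from the paper).
LADDER_SIZES_KB = [8.0, 4.0, 2.0, 1.0]
LADDER_DISTANCES_CM = [5.0, 7.0, 9.0, 11.0]

slope, intercept = fit_log_linear(LADDER_SIZES_KB, LADDER_DISTANCES_CM)
unknown_band_kb = estimate_size_kb(6.0, slope, intercept)
print(f"band at 6.0 cm is about {unknown_band_kb:.2f} kb")
```

Bands of known size that were not used for the fit, such as the 23S and 16S rRNA bands mentioned above, can be passed through `estimate_size_kb` in the same way to check how well the curve describes the gel.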

In no case did our Northern blots show a 7.0-kilobase (kb) band, which is the size of the full-length unc transcript, given that it begins 73 nucleotides before the start of uncI (22) and ends 52 nucleotides after the uncC termination codon (11, 25). Instead, a band of 6.5 kb (labeled A in Fig. 5) was detected by both the uncG and the uncC probes. No band of this size was detected by the uncI probe, however. These results suggest that a processing event occurred which removed approximately 500 bases from the 5' end of the unc message.

A band of about 4.6 kb (labeled B) was also detected with
the uncG and uncC probes but not with the uncI probe. In addition, the uncC probe detected a 1.9-kb band and several bands of less than 1.4 kb.

FIG. 5. Northern blots of nucleic acids from strain 1100. The probes were end-labeled oligonucleotides complementary to the RNA in the region of the gene indicated above each lane. The largest species of unc RNA detected in both the uncG and uncC lanes is labeled A. A shorter fragment detected in these two lanes is labeled B. These bands are also indicated on the standard curve in Fig. 6. The locations of RNA fragments of known sizes are indicated at the left, with the fragment lengths listed in kilobases.

FIG. 6. Plot of RNA fragment length (on a logarithmic scale) versus migration distance for fragments of known sizes from a commercially available RNA ladder (datum points in squares). The sizes of the fragments are listed in Fig. 5. The values obtained for the 23S and 16S rRNA bands are shown. The migration distances of the two largest unc RNA species detected, labeled A and B in Fig. 5 and here, are shown as they were fitted to the curve to determine their lengths.

[Plot axes: fragment size on a logarithmic scale versus DISTANCE (cm); marked points: A, B, 23S RNA, 16S RNA.]

DISCUSSION

We analyzed the ribosome-binding activities of the translation initiation regions of the unc operon genes by toeprinting. The results obtained demonstrated that ribosomes do indeed bind to the initiation regions of six of the genes in the unc operon: uncB, uncE, uncF, uncH, uncA, and uncD. Although all nine of the unc genes have Shine-Dalgarno sequences, independent initiation of translation has not
previously been demonstrated or eliminated as a mechanism of expression in this operon, raising the possibility that some type of translational coupling, such as reinitiation by the ribosome that read the previous gene, acts to coordinate the stoichiometric synthesis of some of the ATPase subunits.

The positive toeprinting results for the unc genes listed
above indicate that reinitiation of translation is not the only method of expression available to these genes. This is of particular note for uncF and uncH, for which the gene product is expressed at a level lower than that of the preceding gene and for which mRNA secondary structures which sequester the ribosome-binding sites have been proposed, suggesting that reinitiation from the previous gene occurs.

The toeprints we obtained also confirm the amino-terminal
protein sequences of these polypeptides. In the a subunit, this provides the first direct identification of the amino terminus. The initiation codon of uncB, which encodes the a subunit, was previously difficult to identify because more than one potential initiation codon is located in frame in this region (Fig. 3). We note that there remains some uncertainty about the actual initiation site of uncI, since codons 4 and 5 of the presumed uncI coding sequence are potential initiation codons as well (GUG and AUG, respectively; Fig. 3). It is possible that initiation occurs at one or both of these codons by using the last of the three Shine-Dalgarno sequences in this region (Fig. 3). It is even possible that different initiation codons are used in uncI to produce different peptide species under different conditions, as has been observed in the S gene of bacteriophage λ (24). We were unable to detect any toeprint for uncI which might identify the initiation site(s).

Previous studies (8, 21, 33; Hartz, unpublished data) have demonstrated a direct correlation between the intensity of a toeprint band in vitro and the strength of the ribosome-binding site in vivo for a variety of mRNAs. The toeprinting technique has also been used to study the inhibition of ribosome binding by translational repressor proteins (6, 21, 33). As increasing concentrations of a repressor are allowed to bind to repressor-sensitive mRNA, ribosome binding to the target mRNA is blocked and the toeprint signal obtained becomes weaker.

Accurate comparisons of the toeprint strengths of the unc genes have not been possible because of difficulties in normalizing the differences in the strengths of the sequencing signals. A modified toeprinting technique which provides an internal standard for accurate quantitative comparisons among different genes will be used in future studies to examine the quantitative aspect of ribosome binding to the unc transcript.

Preliminary quantitation of the toeprints obtained demonstrated a very good correlation between the difference in the intensities of the uncE toeprints of plasmids pEMS54 and pRPG54 (Fig. 1) and the difference in the expression levels of uncE from these two plasmids in minicells (Fig. 2). pEMS54 yielded a toeprint 2.3 times stronger than that of pRPG54, which carries a mutation in the uncE Shine-Dalgarno sequence, and protein synthesis in minicells indicated that uncE is 2.3 times more highly expressed in pEMS54 than in pRPG54. These results also show that uncE yields a toeprint at least twice as strong as those of the other unc genes.

One gene, uncG, showed a strong signal in sequencing
reactions but no ribosome binding. Previous studies (17) detected no increase in uncG expression when expression of uncA was increased. The lack of a toeprint suggests that translational coupling occurs here, however. It is possible that coupling occurs because translation of the preceding gene, uncA, is required to unmask the uncG ribosome-binding site and that translational reinitiation, which was addressed by the previous experiments, is not involved at this site.

Two other genes, uncI and uncC, showed no toeprints and
only weak signals in general, suggesting that these portions of the transcript are present in lesser amounts than the rest of the operon. It is possible that weak toeprints were present in these cases but were simply below the level detectable by this method. However, the significantly weaker sequencing signals suggested that something else acted in these cases as well. We detected a site in the message encoding uncI at which reverse transcriptase appears to stop (Fig. 4), supporting the hypothesis that some of the uncI message was removed from the rest of the transcript. A study of unc message stability (19) which indicates that the uncI and uncB regions of the transcript are less stable than the rest of the message supports this idea. The site we detected is at position 324 of the uncI gene and does not correspond to any of the weak promoters which have been identified in the uncI sequence (12, 23), leading us to believe that it represents a site of message processing. Cleavage of the message at this site would remove 397 bases and shorten the unc RNA to approximately 6.6 kb, close to the size of the 6.5-kb band detected in the Northern blots probed with uncG or uncC oligonucleotides (Fig. 5, band A).

A similar strong reverse transcriptase stop was detected at
position 452 of uncH. If cleavage of the message occurred at the site in uncH, a fragment 4,449 base pairs long encoding uncA-uncC would result. Our Northern blots probed with oligonucleotides complementary to either uncG or uncC showed hybridization to a band whose size we estimated at 4.6 kb (labeled band B) which may represent this processing product. Consistent with this interpretation, this species was not detected when an uncI probe was used.

Two similar but weaker stops were also observed in RNA
sequencing and primer extension experiments. Cleavage at one of these sites, at nucleotide position 7 of the uncG-uncD intergenic region, would yield a 4.7-kb uncG-containing fragment, if the proposed uncI processing occurred, and a fragment of about 1.9 kb encoding uncD-uncC. A 1.9-kb band was seen on our uncC-probed Northern blot (Fig. 5) but was not detected by the other probes. The other predicted product of this cleavage, a 4.7-kb uncG-containing fragment, might correspond to the 4.6-kb band detected on our uncG-probed Northern blot (Fig. 5, band B), although as discussed above, another fragment could be assigned to that band, so that the occurrence of processing at this site remains uncertain.

The other moderately strong reverse transcriptase stop
detected was at position 77 in the uncC gene. Cleavage here would yield a 0.4-kb fragment containing uncC and a 6.2-kb species if the transcript had also been processed at the uncI site as discussed above. The uncC probe bound to some additional bands of about 1.4 kb or less, one of which might represent the 0.4-kb fragment that would result from cleavage at the weak site in uncC. However, we attribute these bands to nonspecific binding of the probe, since no bands of corresponding sizes (such as 6.2- and 5.5-kb bands which hybridized to the uncG probe) were detected that would suggest that these bands in the uncC blot represent additional cleavage products. We therefore conclude that the unc message may be processed at the reverse transcriptase stop site we detected in uncI and that it may sometimes be processed at a site which corresponds to the reverse transcriptase stop site in uncH but that other stops detected by
sequencing and primer extension do not represent actual cleavage sites.

The causes of reverse transcriptase stops are not completely understood, and the stops we detected may be due to other effects such as secondary structures which cause the reverse transcriptase to terminate at specific sites, as in some other cases (31). However, our computer predictions of mRNA secondary structure indicate that no stable structures can form at either the uncI or uncH site discussed here. There is a correlation between the products predicted from cleavage at some of the strong stops reported here and the mRNA species detected by Northern blot analysis, suggesting that these stops represent specific processing sites. Studies of the degradation of the unc message (19) have demonstrated that decay does not occur by a simple 3'-to-5' exonucleolytic process, suggesting that cleavage sites which play a role in the degradation process probably exist within the message.

The results of this Northern blot analysis are somewhat
different from those of earlier experiments which investigated the size of the transcript (10). We note, however, that a minor species estimated to be 5 to 6 kb long was detected in the earlier study and suggest that this may correspond to the 4.6-kb band described here. Another recent examination of unc transcript size (19) yielded a major species of about 6.0 kb.

We propose that processing of the unc operon transcript at
the sites identified here is one mechanism which contributes to the overall differential expression of the ATPase subunits. Removal of uncI from the message may be partially responsible for the very low level of expression of this gene (3). Cleavage of the message at the uncH site may be a factor in some less obvious way, perhaps by limiting translation of the promoter-distal genes relative to the very highly expressed uncE.
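The size bookkeeping behind the proposed processing sites can be retraced with simple arithmetic. The sketch below is illustrative only: it uses the figures quoted in the text (a 7.0-kb full-length transcript that starts 73 nucleotides upstream of uncI, a stop at position 324 of uncI, and the quoted sizes of the promoter-distal fragments), the function and variable names are hypothetical, and sizes are rounded to 0.1 kb as in the text.

```python
FULL_LENGTH_KB = 7.0       # full-length unc transcript size quoted in the text
LEADER_NT = 73             # transcript begins 73 nt before the start of uncI
UNCI_STOP_POSITION = 324   # reverse transcriptase stop within uncI


def bases_removed(leader_nt, stop_position):
    """Bases lost from the 5' end if cleavage occurs at the uncI stop site."""
    return leader_nt + stop_position


def split_kb(parent_kb, distal_fragment_kb):
    """Return (5' product, 3' product) sizes in kb, rounded to 0.1 kb."""
    return round(parent_kb - distal_fragment_kb, 1), round(distal_fragment_kb, 1)


removed_nt = bases_removed(LEADER_NT, UNCI_STOP_POSITION)
processed_kb = round(FULL_LENGTH_KB - removed_nt / 1000.0, 1)

# Promoter-distal fragment sizes as quoted in the text for each candidate site.
products = {
    "uncH position 452": split_kb(processed_kb, 4.449),
    "uncG-uncD intergenic position 7": split_kb(processed_kb, 1.9),
    "uncC position 77": split_kb(processed_kb, 0.4),
}

print(f"removed at uncI site: {removed_nt} nt, leaving {processed_kb} kb")
for site, (five_prime, three_prime) in products.items():
    print(f"{site}: 5' product {five_prime} kb, 3' product {three_prime} kb")
```

The predicted 6.6-kb, 4.7-kb, 1.9-kb, 6.2-kb and 0.4-kb species can then be compared by eye with the bands reported for each probe (6.5 kb, 4.6 kb, 1.9 kb, and bands below 1.4 kb).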

Clearly, the differences in the ribosome-binding sites in the unc operon transcript and the processing of this message are important aspects of the expression of the unc genes which require further investigation.

ACKNOWLEDGMENTS

We thank Robert Traut of the Department of Biological Chemistry, University of California, Davis, for generously providing us with preparations of purified 30S ribosomal subunits.

This work was supported by U.S. Public Health Service grants GM18539 to R.D.S. and GM28685 to L.G. from the National Institutes of Health. E.M.S. is a predoctoral trainee supported by U.S. Public Health Service grant GM07276 from the National Institutes of Health.

LITERATURE CITED

1. Brosius, J., T. J. Dull, D. D. Sleeter, and H. F. Noller. 1981. Gene organization and primary structure of a ribosomal RNA operon from Escherichia coli. J. Mol. Biol. 148:107-127.

2. Brusilow, W. S. A., D. J. Klionsky, and R. D. Simoni. 1982. Differential polypeptide synthesis of the proton-translocating ATPase of Escherichia coli. J. Bacteriol. 151:1363-1371.

3. Brusilow, W. S. A., A. C. G. Porter, and R. D. Simoni. 1983. Cloning and expression of uncI, the first gene of the unc operon of Escherichia coli. J. Bacteriol. 155:1265-1270.

4. Foster, D. L., and R. H. Fillingame. 1982. Stoichiometry of subunits in the H+-ATPase complex of Escherichia coli. J. Biol. Chem. 257:2009-2015.

5. Futai, M., and H. Kanazawa. 1983. Structure and function of proton-translocating adenosine triphosphatase (F1F0): biochemical and molecular biological approaches. Microbiol. Rev. 47:285-312.

6. Gold, L. 1988. Post-transcriptional regulatory mechanisms in Escherichia coli. Annu. Rev. Biochem. 57:199-233.

7. Gunsalus, R. P., W. S. A. Brusilow, and R. D. Simoni. 1982. Gene order and gene-polypeptide relationships of the proton-translocation ATPase operon (unc) of Escherichia coli. Proc. Natl. Acad. Sci. USA 79:320-324.

8. Hartz, D., D. S. McPheeters, R. Traut, and L. Gold. 1988. Extension inhibition analysis of translation initiation complexes. Methods Enzymol. 164:419-425.

9. Humbert, R., W. S. A. Brusilow, R. P. Gunsalus, D. J. Klionsky, and R. D. Simoni. 1983. Escherichia coli mutants defective in the uncH gene. J. Bacteriol. 153:416-422.

10. Jones, H. M., C. M. Brajkovich, and R. P. Gunsalus. 1983. In vivo 5' terminus and length of the mRNA for the proton-translocating ATPase (unc) operon of Escherichia coli. J. Bacteriol. 155:1279-1287.

11. Kanazawa, H., T. Kayano, T. Kiyasu, and M. Futai. 1982. Nucleotide sequence of the genes for β and ε subunits of proton-translocation ATPase from Escherichia coli. Biochem. Biophys. Res. Commun. 105:1257-1264.

12. Kanazawa, H., K. Mabuchi, T. Kayano, T. Noumi, T. Sekiya, and M. Futai. 1981. Nucleotide sequence of the genes for F0 components of the proton-translocating ATPase from Escherichia coli: prediction of the primary structure of F0 subunits. Biochem. Biophys. Res. Commun. 103:613-620.

13. Kenny, J. W., T. G. Fanning, J. M. Lambert, and R. R. Traut. 1979. The subunit interface of the Escherichia coli ribosome-crosslinking of 30 S protein S9 to proteins of the 50 S subunit. J. Mol. Biol. 135:151-170.

14. Klionsky, D. J., W. S. A. Brusilow, and R. D. Simoni. 1984. In vivo evidence for the role of the ε subunit as an inhibitor of the proton-translocating ATPase of Escherichia coli. J. Bacteriol. 160:1055-1060.

15. Klionsky, D. J., D. G. Skalnik, and R. D. Simoni. 1986. Differential translation of the genes encoding the proton-translocating ATPase of Escherichia coli. J. Biol. Chem. 261:8096-8099.

16. McCarthy, J. E. G. 1988. Expression of the unc genes in Escherichia coli. J. Bioenerg. Biomembr. 20:19-39.

17. McCarthy, J. E. G., and C. Bokelmann. 1988. Determinants of translational initiation efficiency in the atp operon of Escherichia coli. Mol. Microbiol. 2:455-465.

18. McCarthy, J. E. G., H. U. Schairer, and W. Sebald. 1985. Translational initiation frequency of atp genes from Escherichia coli: identification of an intercistronic sequence that enhances translation. EMBO J. 4:519-526.

19. McCarthy, J. E. G., B. Schauder, and P. Ziemke. 1988. Post-translational control in Escherichia coli: translation and degradation of the atp operon mRNA. Gene 72:131-139.

20. McPheeters, D. S., A. Christensen, E. T. Young, G. Stormo, and L. Gold. 1986. Translational regulation of expression of the bacteriophage T4 lysozyme gene. Nucleic Acids Res. 14:5813-5826.

21. McPheeters, D. S., G. D. Stormo, and L. Gold. 1988. Autogenous regulatory site on the bacteriophage T4 gene 32 messenger RNA. J. Mol. Biol. 201:517-535.

22. Nielsen, J., F. G. Hansen, J. Hoppe, P. Friedl, and K. von Meyenberg. 1981. The nucleotide sequence of the atp genes coding for the F0 subunits a, b, c and the F1 subunit δ of the membrane-bound ATP synthase of Escherichia coli. Mol. Gen. Genet. 184:33-39.

23. Porter, A. C. G., W. S. A. Brusilow, and R. D. Simoni. 1983. Promoter for the unc operon of Escherichia coli. J. Bacteriol. 155:1271-1278.

24. Raab, R., G. Neal, C. Sohaskey, J. Smith, and R. Young. 1988. Dominance in lambda S mutations and evidence for translational control. J. Mol. Biol. 199:95-105.

25. Saraste, M., N. J. Gay, A. Eberle, M. J. Runswick, and J. E. Walker. 1981. The atp operon: nucleotide sequence of the genes for the γ, β and ε subunits of Escherichia coli ATP synthase. Nucleic Acids Res. 9:5287-5296.

26. Selden, R. F. 1987. Analysis of RNA by Northern hybridization, p. 4.9.1-4.9.8. In F. M. Ausubel, R. Brent, R. E. Kingston, D. D. Moore, J. A. Smith, J. G. Seidman, and K. Struhl (ed.), Current protocols in molecular biology. John Wiley & Sons, Inc., New York.

27. Senior, A. E. 1985. The proton-ATPase of Escherichia coli. Curr. Top. Membr. Transp. 23:135-151.

28. Shine, J., and L. Dalgarno. 1974. The 3'-terminal sequence of Escherichia coli 16S ribosomal RNA: complementarity to nonsense triplets and ribosome binding sites. Proc. Natl. Acad. Sci. USA 71:1342-1346.

29. Solomon, K. A., and W. S. A. Brusilow. 1988. Effect of an uncE ribosome-binding site mutation on the synthesis and assembly of the Escherichia coli proton-translocating ATPase. J. Biol. Chem. 263:5402-5407.

30. Steitz, J. A., and K. Jakes. 1975. How ribosomes select initiator regions in mRNA: base pair formation between the 3' terminus of 16S rRNA and the mRNA during initiation of protein synthesis in Escherichia coli. Proc. Natl. Acad. Sci. USA 72:4734-4738.

31. Tuerk, C., P. Gauss, C. Thermes, D. R. Groebe, M. Gayle, N. Guild, G. Stormo, Y. D'Aubenton-Carafa, O. C. Uhlenbeck, I. Tinoco, Jr., E. N. Brody, and L. Gold. 1988. CUUCGG hairpins: extraordinarily stable RNA secondary structures associated with various biochemical processes. Proc. Natl. Acad. Sci. USA 85:1364-1368.

32. Walker, J. E., M. Saraste, and N. J. Gay. 1984. The unc operon: nucleotide sequence, regulation and structure of ATP-synthase. Biochim. Biophys. Acta 768:164-200.

33. Winter, R. B., L. Morrissey, P. Gauss, L. Gold, T. Hsu, and J. Karam. 1987. Bacteriophage T4 regA protein binds to mRNAs and prevents translation initiation. Proc. Natl. Acad. Sci. USA 84:7822-7826.
