

THE JOURNAL OF BIOLOGICAL CHEMISTRY Vol. 260, No. 19, Issue of September 5, pp. 10748-10760, 1985 Printed in U. S. A.

Rat Hepatic Cytosolic Phosphoenolpyruvate Carboxykinase (GTP) STRUCTURES OF THE PROTEIN, MESSENGER RNA, AND GENE*

(Received for publication, January 31, 1985)

Elmus G. Beale, Nancy B. Chrapkiewicz, Hubert A. Scoble, Raymond J. Metz, Douglas P. Quick, Richard L. Noble, John E. Donelson, Klaus Biemann, and Daryl K. Granner

From the Departments of Internal Medicine and Biochemistry, Diabetes and Endocrinology Research Center, University of Iowa, and the Veterans Administration Medical Center, Iowa City, Iowa 52240 and the Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139

The primary structure of the messenger RNA coding for cytosolic phosphoenolpyruvate carboxykinase was determined by sequencing cDNA and genomic DNA and by primer extension of the mRNA. The molecule is 2624 nucleotides in length; this includes 143 nontranslated nucleotides at the 5' end and 615 nontranslated nucleotides at the 3' end. The 3' nontranslated sequence contains a 102-base pair region of alternating purine-pyrimidine nucleotides (the majority of which are UpG dinucleotides), several direct repeats and palindromic sequences, and 8 CpG dinucleotides. The corresponding segment of the phosphoenolpyruvate carboxykinase gene thus has characteristics which favor the formation of Z-DNA.

The amino acid sequence of phosphoenolpyruvate carboxykinase was deduced from the mRNA sequence and confirmed by fast atom bombardment mass spectrometric analysis of peptides generated with trypsin and Staphylococcus aureus V8 protease. The protein consists of 621 amino acids and has a molecular weight of 69,289.

Charon 4A λ bacteriophage clones containing genomic DNA coding for phosphoenolpyruvate carboxykinase were isolated from a library of partial HaeIII digests of rat liver DNA. Two clones, λPC112 and λPC103, contained the entire coding region in 15-kilobase inserts and were used to subclone the gene into pBR322 as EcoRI, BamHI, or SstI-KpnI fragments. Using these subclones, the structure of the phosphoenolpyruvate carboxykinase gene was determined by S1 nuclease mapping, R-loop analysis, and DNA sequencing. The gene is composed of 10 exons and 9 introns with a total length of 6.0 kilobases.

* This work was supported in part by National Institutes of Health Grants AM20858 (Iowa), AM25295 (Iowa Diabetes and Endocrinology Research Center), GM05472 (Massachusetts Institute of Technology) and RR00317 (Massachusetts Institute of Technology), and by Veterans Administration Research Funds. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

*E. G. Beale and N. B. Chrapkiewicz were each responsible for major portions of this work.

Present address: Department of Anatomy, Texas Tech University Health Sciences Center, School of Medicine, Lubbock, TX 79430.

Present address: Department of Physiology, Vanderbilt University School of Medicine, Nashville, TN 37232.

Present address: CPC International, Moffet Technical Center, P.O. Box 345, Summit-Argo, IL 60501.

Present address: Applied Biosystems, Inc., 850 Lincoln Center Dr., Foster City, CA 94404.

Author to whom correspondence should be addressed.

The transcription initiation site of the gene was determined by a combination of in vitro transcription in a HeLa cell lysate system, primer extension of mRNAPEPCK, and S1 nuclease mapping. In vitro transcription of purified DNA templates revealed three RNA polymerase II-dependent start sites. Two sites were separated by 600 base pairs on the coding strand and the third site was on the noncoding strand. The products of S1 nuclease mapping and primer extension from a BglII site were compared in order to determine which of the coding strand initiation sites was expressed in vivo. In both cases a 69-base pair fragment was generated and the 5' end of this corresponded to a thymidine residue identified in a sequence ladder of the genomic DNA coding strand. We conclude that mRNAPEPCK synthesis initiates with an adenine residue 69 base pairs 5' of the BglII site; this corresponds to the 3' most transcription initiation site determined in vitro.

Cytosolic P-enolpyruvate carboxykinase¹ is a rate-limiting gluconeogenic enzyme whose activity is altered by a number of hormones involved in the regulation of this metabolic process. Glucagon (via cyclic AMP) and glucocorticoids increase and insulin decreases rat hepatic P-enolpyruvate carboxykinase enzyme activity (1-4). Each of these effectors regulates P-enolpyruvate carboxykinase activity by specifically increasing or decreasing enzyme synthesis (1-7), mRNAPEPCK translational activity (5-9), and mRNAPEPCK amount (9-13). Lamers et al. (14) recently demonstrated that glucose and N⁶,O²'-dibutyryl cAMP regulate mRNAPEPCK by modulating P-enolpyruvate carboxykinase gene transcription in rat liver, and we showed that insulin, glucocorticoids, and cyclic AMP analogs regulate P-enolpyruvate carboxykinase gene transcription in H4IIE hepatoma cells (15-18). Several interesting observations came from these studies. First, the glucocorticoid and cyclic AMP effects are additive, indicating separate mechanisms of action. Second, insulin, in combination with these inducers, exhibits the dominant effect, which was inhibition of P-enolpyruvate carboxykinase gene transcription. Finally, these effectors are very specific, since none change the rate of transcription of total RNA (14-18).

¹ The abbreviations used are: P-enolpyruvate carboxykinase, cytosolic phosphoenolpyruvate carboxykinase (GTP) (EC 4.1.1.32); mRNAPEPCK, mRNA coding for P-enolpyruvate carboxykinase; kb, kilobase; bp, base pair; pPC2, pPC116, etc., plasmids that contain mRNAPEPCK cDNAs; PC2, PC116, etc., cDNA inserts; λPC112, λPC103, etc., viral vectors containing genomic P-enolpyruvate carboxykinase DNA; pλPC112.R3, pλPC103.B, etc., pBR322 plasmids containing genomic P-enolpyruvate carboxykinase DNA.

The structure of the P-enolpyruvate carboxykinase gene must be known in order to investigate how glucocorticoids, cyclic AMP, and insulin interact to regulate transcription. This paper summarizes our findings related to the organization and nucleotide sequence of the rat liver P-enolpyruvate carboxykinase gene. In the course of this study we also deduced the primary structures of mRNAPEPCK and of the enzyme itself.

EXPERIMENTAL PROCEDURES AND RESULTS²

Isolation of Recombinant Plasmids Containing P-enolpyruvate Carboxykinase cDNAs. We recently reported the isolation of a plasmid, designated pPC2, that contained a cDNA complementary to a portion of mRNAPEPCK (10). In an attempt to isolate a cDNA representing the entire mRNAPEPCK molecule, the original cDNA library was size-fractionated and rescreened using a 202-bp AluI-AluI cDNA fragment (Probe 1, Fig. 1) isolated from pPC2 (10). A total of 16 clones were found by this procedure. The approximate size of each of the cDNAs was determined by agarose gel electrophoresis of the plasmids following digestion with several restriction enzymes. Two plasmids, pPC113 and pPC116, contained large cDNAs but neither represented the complete mRNAPEPCK molecule. We next screened the same library with a 1305-bp SmaI-SphI cDNA fragment (Probe 2, Fig. 1) isolated from pPC116 and characterized only those clones that hybridized with Probe 2 but not with Probe 1. Approximately 30 clones were found, but the largest, pPC201, contained a cDNA that still did not extend to the 5' end of mRNAPEPCK. Consequently, the 511-bp SmaI-SmaI cDNA fragment of pPC201 was annealed to rat liver poly(A+) RNA and copied with reverse transcriptase in an attempt to complete the cDNA by primer extension. The result of this, PC302, extended only 6 nucleotides further 5' than did PC201. We repeatedly had difficulty extending past the 5' most sequences of PC302 and PC201 with double-stranded DNA primers, a stuttering problem which is discussed in the Miniprint Section. This difficulty was circumvented by synthesizing a single-stranded cDNA, PC17, which is complementary to 17 nucleotides very near the 5' end of PC302 (Fig. 2). Reverse transcription primed with PC17 resulted in PC400 which was shown by restriction mapping to extend far enough to represent the 5' end of mRNAPEPCK. All of the cDNAs are shown in relation to mRNAPEPCK in Fig. 1.

Sequence of mRNAPEPCK. The primary structure of mRNAPEPCK was determined by sequencing the cDNAs as illustrated in Fig. 1. The resulting nucleotide sequence is shown in Fig. 2; the 5' ends of each of the cDNAs are marked by the vertical arrows. To determine whether PC400 extended to the 5' most nucleotide of mRNAPEPCK, primer extension analysis of mRNAPEPCK was performed using a synthetic 24-nucleotide oligomer designated PC24 (nucleotides 46-69, Fig. 2). This experiment, plus another designed to determine the transcription initiation site of the gene (to be discussed later, see Fig. 8), revealed that PC400 was 1 base short of the 5' end of mRNAPEPCK. The missing nucleotide, an adenine residue, has been added to the mRNA sequence shown in Fig. 2.

² Portions of this paper (including "Experimental Procedures," part of "Results," Figs. 9 and 10, and Table I) are presented in miniprint at the end of this paper. Miniprint is easily read with the aid of a standard magnifying glass. Full size photocopies are available from the Journal of Biological Chemistry, 9650 Rockville Pike, Bethesda, MD 20814. Request Document No. 85M-300, cite the authors, and include a check or money order for $4.00 per set of photocopies. Full size photocopies are also included in the microfilm edition of the Journal that is available from Waverly Press.

[FIG. 1 diagram: cDNAs PC302, PC201, PC113, PC116, and PC2, with Probe 1 and Probe 2, aligned against a nucleotide scale]

FIG. 1. cDNA library and sequencing strategy. The cDNAs were isolated as described in the text. A restriction map of each cDNA was determined as described by Boseley et al. (45). The restriction map shown in Fig. 1 was determined from the actual sequence data shown in Fig. 2 and agrees with the experimentally determined maps. Only those restriction sites that were labeled for sequencing by the Maxam and Gilbert (46) procedure are shown. The sequenced DNA fragments are shown by arrows; the length of the arrows corresponds to the number of bases determined from each labeled end. Arrows located below the cDNA indicate that the bottom strand was sequenced, and arrows above indicate that the top strand was sequenced. Dashed arrows indicate sequences that were obtained from genomic DNA in exonic sequences corresponding to the messenger RNA. Probe 1 and Probe 2 indicate restriction fragments that were used as nick-translated probes to screen for pPC116, pPC113, and pPC201 as described in the text. Abbreviations used are: Ac, AccI; A, AvaI; AII, AvaII; B, BstNI; BII, BglII; C, ClaI; E, EcoRI; M, MspI; R, RsaI; S, SphI; T, TaqI; X, XmaI; NCS, noncoding sequence; An, polyadenylate tail.

Amino Acid Sequence of P-enolpyruvate Carboxykinase. The amino acid sequence of P-enolpyruvate carboxykinase, shown in Fig. 2, was deduced from the only mRNA reading frame that coded for a protein of the appropriate length. The primary structure of P-enolpyruvate carboxykinase has never been determined directly; thus, the derived amino acid sequence was corroborated using the purified enzyme. Fast atom bombardment mass spectrometry (19) was used to confirm this amino acid sequence using proteolytic peptides obtained from carboxamidomethylated P-enolpyruvate carboxykinase. The peptides identified by this technique are denoted by the brackets in Fig. 2. Further confirmation of these assignments was obtained by subjecting tryptic digests to two manual Edman degradation steps and Staphylococcus aureus V8 protease digests to a single step, followed in each case by repeat analysis in the mass spectrometer. The specific N-terminal amino acids identified by this procedure are shown by the horizontal arrows in Fig. 2. Further support for the accuracy of this amino acid sequence comes from the fact that these peptide assignments are unique as to their placement; that is, none of the assigned peptides correspond to isobaric peptides from another reading frame or to either of the nontranslated regions.
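The reading-frame argument can be illustrated with a short sketch. It is not the authors' procedure: it simply scans the three forward frames of an RNA string for the longest AUG-to-stop open reading frame, then repeats the length arithmetic implied by the figures quoted in the abstract (2624 nucleotides in total, with 143 at the 5' end and 615 at the 3' end nontranslated). The function name `longest_orf` and the toy sequence are hypothetical and are not taken from Fig. 2.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

def longest_orf(rna):
    """Return (frame, start, end) of the longest AUG...stop ORF in the
    three forward frames, or None if no complete ORF exists.

    start is the 0-based index of the A of AUG; end is the index just
    past the stop codon.
    """
    best = None
    for frame in range(3):
        start = None
        for i in range(frame, len(rna) - 2, 3):
            codon = rna[i:i + 3]
            if start is None and codon == "AUG":
                start = i  # first AUG opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if best is None or (i + 3 - start) > (best[2] - best[1]):
                    best = (frame, start, i + 3)
                start = None  # look for the next AUG in this frame
    return best

# Toy sequence (hypothetical, for illustration only).
toy = "GGAUGCCUCCUCAGCUGUAAGG"
orf = longest_orf(toy)

# Length arithmetic from the abstract.
total_nt, five_prime_nt, three_prime_nt = 2624, 143, 615
coding_nt = total_nt - five_prime_nt - three_prime_nt
codons = coding_nt // 3
print(orf, coding_nt, codons)
```

The 1866 remaining nucleotides divide evenly into 622 triplets. The text reports a 621-residue protein; this excerpt does not state whether the extra triplet is counted as the stop codon or as the initiating methionine, so the sketch leaves that open.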

Technical limitations preclude the detection of peptides smaller than 300 daltons or larger than 2000 daltons; hence, some of the predicted proteolytic peptides in the enzymatic digests could not be seen. In addition, several peptides within the detectable mass range were not found. This observation, as previously noted (20, 21), may be due to competition for ionization among peptides in the mixture.

[FIG. 2 sequence panel: nucleotide sequence of mRNAPEPCK with the deduced amino acid sequence]

The designated N-terminal peptide in Fig. 2 [Pro-Pro-Gln-Leu-His-Asn-Gly-Leu-Asp-Phe-Ser-Ala-Lys], which has a mass of 1422 (i.e. a protonated molecular ion of mass 1423), was detected in lower abundance than the majority of the peptides. This peptide, which is adjacent to a peptide confirmed by two cycles of Edman degradation, would result if the initiating methionine is clipped from P-enolpyruvate carboxykinase. Because there was insufficient material to confirm this designation by Edman degradation, several attempts were made to determine the N-terminal amino acid by Edman degradation of intact P-enolpyruvate carboxykinase. To date this has not been accomplished, presumably due to the resistance of the Pro-Pro bond to cleavage (22). Final confirmation of proline as the N-terminal amino acid of P-enolpyruvate carboxykinase awaits a successful protein sequence determination.
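The quoted mass of this peptide can be checked with a few lines of arithmetic. The sketch below assumes monoisotopic residue masses (standard reference values, not given in the paper) and adds one water for the free termini; the text does not say which mass convention the authors used, so agreement with 1422 and 1423 is a consistency check rather than a reconstruction of their calculation. The helper name `peptide_mass` is hypothetical.

```python
# Monoisotopic residue masses in daltons (standard values; an assumption,
# since the paper does not list the masses it used).
RESIDUE_MASS = {
    "Pro": 97.05276, "Gln": 128.05858, "Leu": 113.08406, "His": 137.05891,
    "Asn": 114.04293, "Gly": 57.02146, "Asp": 115.02694, "Phe": 147.06841,
    "Ser": 87.03203, "Ala": 71.03711, "Lys": 128.09496,
}
WATER = 18.01056   # H2O added for the free N and C termini
PROTON = 1.00728   # added to give the protonated molecular ion

def peptide_mass(residues):
    """Neutral monoisotopic mass of a linear peptide."""
    return sum(RESIDUE_MASS[r] for r in residues) + WATER

peptide = "Pro-Pro-Gln-Leu-His-Asn-Gly-Leu-Asp-Phe-Ser-Ala-Lys".split("-")
mass = peptide_mass(peptide)
protonated = mass + PROTON

# Detection window stated in the text: 300 to 2000 daltons.
detectable = 300 <= mass <= 2000
print(f"M = {mass:.2f}, [M+H]+ = {protonated:.2f}, detectable = {detectable}")
```

Under these assumptions the neutral mass comes out just above 1422 and the protonated ion just above 1423, inside the 300 to 2000 dalton window described above.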

Cloning and Analysis of P-enolpyruvate Carboxykinase Genomic DNA. Genomic P-enolpyruvate carboxykinase DNA was isolated by screening a λ bacteriophage DNA library (described under "Experimental Procedures") in three stages: first with a 511-bp SmaI-SmaI cDNA fragment obtained from pPC201, next with an 1185-bp BglI-BglII cDNA fragment from pPC201, and finally with the BglI-BglII fragment plus a 202-bp AluI-AluI cDNA fragment from pPC2. Twelve positive clones were obtained from these three screenings. Rescreening of these 12 under more stringent hybridization conditions (23) eliminated six of the clones. Two of the remaining six clones, λPC112 and λPC103, contained 15-kb inserts which were similar by restriction enzyme analysis. These were chosen for further characterization.

BamHI digestion of λPC112 results in a 7.2-kb fragment while EcoRI digestion results in fragments of 4.4, 4.8, and 5.7 kb in addition to the λ vector. Digestion of the EcoRI fragments with BamHI and the BamHI fragment with EcoRI made it possible to orient the restriction enzyme sites as indicated in the map of λPC112 in Fig. 3A. Southern blots of restriction enzyme-cut genomic and λPC112 DNA were compared to determine whether λPC112 contained the entire P-enolpyruvate carboxykinase gene. PC2, which contains the 3' most 623 bp of mRNAPEPCK, and PC302, which contains 828 bp near the 5' end of mRNAPEPCK, were used as probes. It was determined that the 4.4-kb EcoRI fragment contained the 3' most coding sequence for mRNAPEPCK, the 5.7-kb EcoRI fragment contained the 5' mRNAPEPCK sequence, and the 7.2-kb BamHI fragment contained both (data not shown). The 4.8-kb EcoRI fragment is entirely 5' flanking DNA. Since the 3' end of the P-enolpyruvate carboxykinase gene is contained within the 4.4-kb EcoRI and 7.2-kb BamHI fragments, and assuming that the primary transcript of the P-enolpyruvate carboxykinase gene is not substantially larger than the 6.9-kb band seen in Northern blots of rat liver or H4IIE cell nuclear RNA (9, 14, 15), λPC112 and λPC103 (not shown) contain the entire gene with an additional 5 kb of 5' and 4 kb of 3' flanking sequence. The fragments of λPC112 and λPC103 that were subcloned are depicted in Fig. 3B.

[FIG. 3 diagram. A, restriction map of λPC112 on a scale of 0 to 14 kilobases; B, subclones pλPC112.R1, pλPC112.R2, pλPC112.R3, pλPC103.B, and pλPC103.3KS]

FIG. 3. Restriction map and subcloned fragments of λPC112. In A a partial restriction map of λPC112 is presented. The placement of these sites was determined by a combination of Southern blot experiments and restriction enzyme analysis of pλPC112.R1, pλPC112.R2, and pλPC112.R3. These three subclones of λPC112, along with the two subclones of λPC103, pλPC103.B and pλPC103.3KS, are shown in B in their positions relative to λPC112. The details of all the subclonings are given under "Experimental Procedures." The restriction enzyme abbreviations are as follows: E, EcoRI; Ss, SstI; Bh, BamHI; K, KpnI; Xh, XhoI.

Location of the Exons and Introns of the P-enolpyruvate Carboxykinase Gene by S1 Nuclease Mapping. A composite drawing of the P-enolpyruvate carboxykinase gene is shown in Fig. 4. The size and position of the exons (solid boxes) and introns (open boxes) were initially determined by S1 nuclease mapping (24). DNA fragments were isolated, 3' or 5' end-labeled, and then made single-stranded by λ exonuclease or exonuclease III digestion. The single-stranded DNA was then annealed to total rat liver poly(A+) RNA, digested with S1 nuclease, and resolved by electrophoresis on denaturing gels. Fig. 5A shows an example of these S1 mapping data. The three S1-protected fragments, from 5'-labeled ClaI, EcoRI, and XmaI sites in pλPC112.R3, are 185, 355, and 195 bp in size, respectively. These fragments are marked by the barred lines under the 3rd, 7th, and 10th exons in Fig. 4. The results of the S1 mapping were very reproducible in all cases but one; the 3' end of the 1st exon was impossible to locate since a smear of three large mRNA-dependent S1 fragments was seen when a DNA fragment 3' end-labeled at the BglII site was used. In a separate S1 protection experiment, a 190-bp fragment was obtained when the 5' EcoRI site of the pλPC112.R3 insert was 3' end-labeled (Fig. 5B). This fragment is probably not an exon of the P-enolpyruvate carboxykinase gene (hence the hatched, not solid, box in Fig. 4) as will be discussed later.

[FIG. 4 diagram: exon-intron map of the gene on a base-pair scale from -1000 to 6000]

FIG. 4. Structure and sequence strategy of the P-enolpyruvate carboxykinase gene. The sizes and positions of the exons and introns were determined by S1 nuclease mapping and confirmed by sequencing as described in the text. The exons are denoted by solid boxes and the introns by open boxes. The hatched box 5' of the first exon is an S1 nuclease-protected fragment that was observed but determined not to be an exon of the P-enolpyruvate carboxykinase gene (see "Discussion"). The small vertical arrows point to the restriction enzyme sites used for S1 nuclease mapping. The bars originating from the arrows in the 3rd, 7th, and 10th exons mark the sizes and location of the S1-protected fragments shown in Fig. 5. The sequenced DNA fragments are shown by the horizontal arrows below the base scale. The arrow points in the direction of reading and its length corresponds to the number of bases determined from each 3' labeled end. The restriction enzyme abbreviations are as in Figs. 1 and 3 with the addition of: H, HindIII; Sm, SmaI. All of the cutting sites of Ac, R, and T are not shown.

[FIG. 5 autoradiograms, panels A and B; size markers labeled 1525, 995, 682, 430, and 120]

FIG. 5. S1 nuclease mapping. The three fragments used for the S1 nuclease mapping in A were isolated from pλPC112.R3 digested with the appropriate restriction enzyme. The ClaI fragment was 4300 bp in length and extended 5' from the ClaI site at position 3800 in Fig. 4 to a ClaI site in pBR322. The EcoRI fragment was the 5700-base insert of pλPC112.R3. The XmaI fragment was 8300 bp in length and contained all the pλPC112.R3 DNA except an 1800-bp XmaI fragment at the center of its insert. All three of these fragments were 5' end-labeled and S1 mapped as described under "Experimental Procedures" in the presence (+) or absence (-) of poly(A+) RNA isolated from diabetic rat liver. In B the DNA used was the 5700-bp pλPC112.R3 insert. This was 3' end-labeled and S1 mapped in the presence (+) or absence (-) of poly(A+) RNA.

FIG. 2. Nucleotide sequence of mRNAPEPCK and amino acid sequence of P-enolpyruvate carboxykinase. A sequence of the strand corresponding to mRNAPEPCK, in which U residues replace T residues, is shown along with the deduced amino acid sequence. The sequence of the 5' end of mRNAPEPCK (presumably the capped nucleotide) was determined by comparing the products of reverse transcription from a synthetic oligonucleotide with the sequence of genomic DNA as described in the text. A single base present in PC113 but not PC116 is marked by an * at position 1279. This difference resulted from a reverse transcription error during the original in vitro synthesis of PC116, an artifact that has been reported by other investigators (47-49). The arrows at positions 2, 219, 225, 766, 1032, and 2002 mark the locations of the 5' ends of the cDNAs of pPC400, pPC302, pPC201, pPC113, pPC116, and pPC2, respectively (see Fig. 1). The locations and sequences of the two synthetic oligomers, PC17 and PC24, are also shown. Brackets at nucleotides 2314 and 2415 enclose a 102-bp region of alternating pyrimidine-purine bases, with two exceptions denoted with asterisks. This region contains numerous repeats and palindromes. The repeated sequences having greater than 90% homology are designated by arrows labeled with identical numbers; several repeats are overlapping, such as 1 with 2, and 3 with 4, and other repeats are nested, such as 3 and 6. Repeating palindromes are overlined with dashed or dotted lines. Two hexanucleotides, CGCGCG, and the polyadenylation sequence, AAUAAA, are enclosed in boxes; the CGCGCG is highlighted because it is unusual, but its function is unknown. The horizontal brackets above the amino acid sequence enclose tryptic and chymotryptic peptides (solid brackets) and S. aureus V8 protease peptides (dashed lines) that were confirmed by fast atom bombardment mass spectrometry as described in the text. The short arrows above the first one or two amino acids in some of the bracketed peptides represent amino acids that were cleaved by Edman degradation and the resulting peptides were subsequently detected by mass spectrometry.
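The legend to Fig. 2 describes a 102-bp region of alternating pyrimidine-purine bases bracketed at nucleotides 2314 and 2415. A region of that kind can be located with a simple scan, sketched below. This is an illustrative helper, not the authors' method: `longest_alternating_run` and the toy sequence are hypothetical, the scan demands strict alternation (the region in the paper has two exceptions), and any character other than A or G is treated as a pyrimidine.

```python
PURINES = set("AG")

def longest_alternating_run(seq):
    """Return (start, length) of the longest stretch in which purines and
    pyrimidines strictly alternate. start is 0-based."""
    if not seq:
        return (0, 0)
    best_start, best_len = 0, 1
    run_start = 0
    for i in range(1, len(seq)):
        # Two neighbours of the same class break the alternation.
        if (seq[i] in PURINES) == (seq[i - 1] in PURINES):
            run_start = i
        run_len = i - run_start + 1
        if run_len > best_len:
            best_start, best_len = run_start, run_len
    return (best_start, best_len)

# Toy sequence (hypothetical, not taken from Fig. 2).
toy = "CCAUGUGUGCGCGAAU"
start, length = longest_alternating_run(toy)
print(start, length, toy[start:start + length])

# Span implied by the bracket positions in the legend (inclusive ends).
bracket_span = 2415 - 2314 + 1
```

The inclusive span between the two bracket positions is 102 nucleotides, matching the size stated for the region.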

Sequence of the P-enolpyruvate Carboxykinase Gene. Genomic DNA was sequenced to characterize the 5' flanking DNA, to define the transcription initiation site, and to precisely locate the intron/exon boundaries. The strategy used is shown in Fig. 4, and all of the sequence data obtained are presented in Fig. 6. A comparison of these data with the mRNAPEPCK sequence confirmed all of the S1-mapped intron/exon boundaries which were accurate to within 30 bp. In addition, the DNA sequence revealed an exon (the 2nd) and intron (the 4th) not found by S1 mapping and defined the 3' boundary of the 1st exon. Thus, the P-enolpyruvate carboxykinase gene is 6.0 kb in length and is composed of 10 exons (numbered 1-10) of 102, 265, 182, 204, 188, 163, 225, 132, 96, and 1067 bp in size separated by 9 introns (lettered A-I) of 172, 371, (590), (230), (570), (600), 90, (580), and 132 bp (listed from 5' to 3'). The sizes in parentheses are close approximations based upon fragments prepared for sequencing and S1 nuclease mapping.
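The exon and intron sizes listed above can be cross-checked against the lengths quoted elsewhere in the paper. The following sketch only adds up the numbers given in the text; the variable names are illustrative, and the intron total inherits the uncertainty of the five sizes given in parentheses as approximations.

```python
# Sizes in bp, listed 5' to 3' as in the text.
exons = [102, 265, 182, 204, 188, 163, 225, 132, 96, 1067]
introns = [172, 371, 590, 230, 570, 600, 90, 580, 132]  # 590, 230, 570, 600, 580 are approximate

exon_total = sum(exons)          # should match the mRNA length
intron_total = sum(introns)
gene_total = exon_total + intron_total

print(f"exons:   {len(exons)} totalling {exon_total} bp")
print(f"introns: {len(introns)} totalling {intron_total} bp")
print(f"gene:    {gene_total} bp, about {gene_total / 1000:.1f} kb")
```

The ten exons sum to 2624 bp, the length reported for the mRNA, and exons plus introns come to 5959 bp, in line with the stated gene length of 6.0 kb.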

Identification of the 5' Initiation Site-Two approaches were used to determine the 5' initiation site of the P-enolpyruvate carboxykinase gene. The first involved in vitro tran-

FIG. 6. Partial sequence of the P-enolpyruvate carboxykinase gene and 1200 bases of 5' flanking DNA. The DNA sequence of the noncoding strand of the P-enolpyruvate carboxykinase gene is presented as determined from the strategy shown in Fig. 4, using the method of Maxam and Gilbert (46). The exons (1 through 10) are underlined; the introns (A through I) and 5' flanking DNA are not. The boxed nucleotides indicate a “TATA” sequence. The nucleotides enclosed by the dashed box form an AT-rich region in the 5' flanking DNA. Nucleotide designations which were unclear due to compressions have a plus (+) placed above them. At position -(387/388) the order of the two nucleotides could not clearly be determined. At the two other positions the (+) indicates an uncertainty as to whether the nucleotide is present. Nucleotides determined from cDNA rather than genomic DNA have an asterisk beneath them. The arrows at positions +48, -205, and -(296/297) mark the points of discrepancy with the 621 bases of sequence published by Wynshaw-Boris et al. (42), as discussed in the text. The overlined hexanucleotides are homologous to sequences found within the mouse mammary tumor virus LTR (43). Portions of introns not sequenced are shown with an approximation of the number of bases missing from the sequence.


[FIG. 6, sequence panel: nucleotide sequence of the 5' flanking DNA and the 5' portion of the gene, with position markers in both margins and the region labels Exon 1, Intron A, Exon 2, Exon 3, Intron C, and Exon 4 (sequence not reproduced).]


[FIG. 7, gel panels: panel B, BstXI/KpnI (lanes 4-7); panel C, EcoRI/KpnI (lanes 8-11); α-amanitin concentrations in µg/ml; size markers from 109 to 1580; panel D, scale of 0 to 1600 bases.]

FIG. 7. In vitro transcription of P-enolpyruvate carboxykinase genomic DNA. All assays utilized the New England Nuclear in vitro transcription kit as described under “Experimental Procedures.” Transcription of the 850-bp BstXI-BglII fragment isolated from pXPC103.3KS is shown in A: lane 1, no DNA; lane 2, transcription in the presence of 900 ng of DNA; and lane 3, 900 ng of DNA plus 2 µg/ml of α-amanitin. Transcription of the 1650-bp BstXI-KpnI fragment of pXPC103.3KS is shown in B: lane 4, no DNA; lane 5, transcription of 700 ng of DNA; lane 6, 700 ng of DNA plus 2 µg/ml of α-amanitin; and lane 7, 700 ng of DNA plus 200 µg/ml of α-amanitin. Transcription of the 1350-bp EcoRI-KpnI fragment from XPC103.3KS is shown in C: lane 8, no DNA; lane 9, 900 ng of DNA; lane 10, 900 ng of DNA plus 2 µg/ml of α-amanitin; and lane 11, 900 ng of DNA plus 200 µg/ml of α-amanitin. An interpretation of the results of the transcription assays with respect to the polymerase II-dependent transcripts is shown in D. The blocked ends of the arrows show the RNA polymerase II initiation sites and the arrowheads show the direction of synthesis. These are drawn in relation to a partial restriction map of the 3' end of the pXPC103.3KS insert.

scription of cloned genomic DNA. These transcriptions were performed using the HeLa whole cell lysate system of Manley et al. (25). α-Amanitin was used to determine whether specific transcripts were synthesized by RNA polymerase I, II, or III by using a concentration of 2 µg/ml to inhibit RNA polymerase II activity and a concentration of 200 µg/ml to inhibit both RNA polymerases II and III. Fig. 7 shows the results obtained when transcripts were synthesized using either a BstXI-BglII (A), BstXI-KpnI (B), or EcoRI-KpnI (C) fragment. By comparing the sizes of the transcripts synthesized from these three fragments with different 3' and/or 5' ends, three RNA polymerase II-dependent start sites were found within the 1700-bp BstXI-KpnI region of the gene. These are shown in D in relation to a restriction map of the region tested; the arrows identify the three initiation sites and the direction of transcript synthesis. Two of the initiation sites occur on the coding strand; one is approximately 100 bp 5' from the EcoRI site and the other is about 50 bp 5' from the BglII site. The third site initiates transcription approximately 25 bp 3' from the EcoRI site on the noncoding strand. These conclusions are based on the following observations. The coding strand initiation site 100 bp 5' of the EcoRI site was detected from a 650-bp RNA polymerase II-dependent transcript synthesized using the BstXI-BglII DNA fragment (A, compare lanes 2 and 3). This transcript was lengthened by 800 bp to approximately 1400 bp when the BstXI-KpnI fragment was used as the template (B, compare lanes 5 and 6), an increase equal to the additional downstream DNA contained in this DNA fragment compared to the BstXI-BglII DNA. This indicated that the initiation site was 5' of the EcoRI site on the coding strand since, if this transcript had been synthesized from the noncoding strand, its size would not have changed. Confirmation of this assignment came from the observation that a 1400-bp RNA polymerase II-dependent transcript was not synthesized from the EcoRI-KpnI DNA fragment (C, compare lanes 9 and 10).

The second coding strand initiation site was detected when BstXI-KpnI and EcoRI-KpnI fragments were used as templates. In both cases an 800-bp RNA polymerase II-dependent transcript was synthesized, indicating the presence of an initiation site on the coding strand about 50 bp 5' from the BglII site (B, compare lanes 5 and 6; C, compare lanes 9 and 10). Based on this observation a transcript of about 50 bp was expected from the BstXI-BglII fragment; however, this was never seen, even when the RNAs were resolved on 10% polyacrylamide denaturing gels (data not shown). Segments of DNA downstream from the BglII site may be required for efficient initiation of in vitro transcription at this site on the P-enolpyruvate carboxykinase gene.

The initiation site on the noncoding strand was detected when BstXI-BglII and BstXI-KpnI fragments were used as templates. A 350-bp RNA transcript, which was only partially sensitive to low concentrations of α-amanitin, was synthesized from both DNAs (A, compare lanes 2 and 3; B, compare lanes 5 and 6). Further analysis of these transcripts on 10% polyacrylamide denaturing gels revealed that the 350-bp band was actually a doublet and only one form was RNA polymerase II-dependent (data not shown). This indicated that the 350-bp RNA polymerase II-dependent transcript was synthesized from the noncoding strand starting about 25 bp 3' from the EcoRI site. Transcription of the EcoRI-KpnI fragment verified the placement, as no 350-bp transcript was synthesized (C, compare lanes 9 and 10).
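The run-off reasoning used to place the three start sites can be illustrated with a short calculation. The sketch below is not the authors' procedure; it assumes approximate restriction site coordinates (EcoRI at -460 and BglII at +69 as given in the Fig. 8 legend, with BstXI and KpnI positions inferred here from the stated 850-bp and 1650-bp fragment sizes) and predicts the transcript length expected from each template for each proposed start site. All names are hypothetical.

```python
# Approximate coordinates (bp, relative to the +1 start site). EcoRI and BglII
# are taken from the Fig. 8 legend; BstXI and KpnI are ASSUMED from the stated
# fragment sizes (850-bp BstXI-BglII, 1650-bp BstXI-KpnI).
SITES = {"BstXI": 69 - 850, "EcoRI": -460, "BglII": 69}
SITES["KpnI"] = SITES["BstXI"] + 1650

# Proposed start sites: (position, strand); "+" is the coding strand.
STARTS = {
    "upstream": (SITES["BglII"] - 650, "+"),  # 650-bp run-off ending at BglII
    "major": (1, "+"),                        # position +1
    "antisense": (SITES["EcoRI"] + 25, "-"),  # about 25 bp 3' of EcoRI
}


def runoff_length(start, strand, left, right):
    """Length of a run-off transcript on a linear template, or None when the
    start site lies outside the template."""
    if not (left <= start <= right):
        return None
    return right - start if strand == "+" else start - left


def predict(template):
    """Predicted run-off lengths for every start site on one template."""
    left, right = (SITES[name] for name in template)
    return {
        name: runoff_length(pos, strand, left, right)
        for name, (pos, strand) in STARTS.items()
    }


for template in [("BstXI", "BglII"), ("BstXI", "KpnI"), ("EcoRI", "KpnI")]:
    print(template, predict(template))
```

Under these assumed coordinates the upstream coding strand site gives 650 and 1450 bases from the two BstXI templates and nothing from the EcoRI-KpnI template, and the noncoding strand site gives about 350 bases from both BstXI templates but only 25 bases from EcoRI-KpnI, which matches the pattern described in the text. The predicted sizes agree with the gel estimates only roughly, as expected for approximate coordinates.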

Several other transcripts were detected but were difficult to assign because of variable sensitivity to α-amanitin. Transcripts 950 and 1200 bp in length were synthesized when the BstXI-KpnI and EcoRI-KpnI fragments were used as the template. The 950-bp transcript was partially sensitive to α-amanitin at 2 µg/ml while synthesis of the 1200-bp transcript was totally inhibited when the BstXI-KpnI fragment was used (B, compare lanes 5 and 6). The opposite result was seen with the EcoRI-KpnI fragment (C, compare lanes 9 and 10). Repetition of the experiments using these templates never gave a consistent result with regard to these two transcripts, so this information was not considered in the final interpretation of results.

Since transcripts that were insensitive to low concentrations of α-amanitin were synthesized from all three templates, the BstXI-KpnI and EcoRI-KpnI fragments were used as templates in an in vitro transcription assay containing α-amanitin at 200 µg/ml. This high concentration of α-amanitin totally eliminated transcription of RNA (B, lane 7 and C, lane 11); thus RNA polymerase III was responsible for the synthesis of the in vitro transcripts which were not RNA polymerase II-dependent.

The second approach used to determine the 5' initiation site of the gene was primer extension analysis of mRNAPEPCK. The single-stranded DNA PC24 (see Fig. 2, nucleotides 46-69) was synthesized so that its 5' end was identical with that of BglII-cut pXPC103.3KS (refer to Figs. 3 and 4). PC24 was 5' end-labeled, annealed to diabetic rat liver poly(A+) RNA, and then extended with reverse transcriptase. The major product was approximately 70 bases in length but some less abundant transcripts approximately 400-600 bases long were also synthesized (data not shown). These primer-extended products were run alongside genomic DNA that was sequenced and S1 nuclease-mapped from the BglII site to identify the specific initiation site (Fig. 8). The S1-protected products generated by annealing the 5' end-labeled EcoRI-BglII fragment to poly(A+) RNA from diabetic rat liver and then digesting the hybrids under conditions of increasing S1 nuclease concentration are represented in lanes 1-7. This is followed by four lanes of sequencing reactions for the same EcoRI-BglII fragment 5' end-labeled at the BglII site; the sequence assignments are shown in lane 10. The primer-extended products synthesized with poly(A+) RNA from diabetic rat liver and triamcinolone-treated rat kidney as templates, respectively, are shown in lanes 8 and 9. These RNA


FIG. 8. Determination of the initiation site of the P-enolpyruvate carboxykinase gene. S1 nuclease mapping from the BglII site of pXPC103.3KS is shown in lanes 1-7. The EcoRI-BglII fragment (position -460 to +69, Figs. 4 and 6) was 5' end-labeled and annealed to poly(A+) RNA isolated from diabetic rat liver. The hybrids were then digested with increasing S1 nuclease concentrations and run on a 10% urea-polyacrylamide sequencing gel. The S1 concentrations used in each assay were: lane 1, 13.3 units/ml; lane 2, 26.6 units/ml; lane 3, 53.3 units/ml; lane 4, 106.6 units/ml; lane 5, 211.2 units/ml; lane 6, 412.5 units/ml; and lane 7, 825 units/ml. The next four lanes show a sequence ladder generated from the BglII site of pXPC103.3KS which had been 5' end-labeled, cut asymmetrically with a second restriction enzyme to separate the labeled ends, and subjected to Maxam and Gilbert sequencing (46). The sequence assignments determined are shown in lane 10. Lanes 8 and 9 are the primer-extended products synthesized when 5' end-labeled PC24 (see Fig. 2) was annealed to poly(A+) RNA and reverse transcribed as described under “Experimental Procedures.” The RNAs used as templates were from diabetic rat liver (lane 8) and kidney of a triamcinolone-treated rat (lane 9). The nucleotides on the sequence ladder that correspond to the S1 nuclease and primer-extended products of lanes 1-7 and 8 and 9 are indicated by offset connecting lines. For every lane the labeled end corresponds to the BglII digestion site 69 nucleotides from the 5' end of mRNAPEPCK (see Fig. 2).


preparations are known to contain significant amounts of mRNAPEPCK (11, 26) and in both cases the primer extension reaction generated two fragments differing in size by 1 nucleotide. Primer extension reactions using tRNA and rat liver poly(A+) RNA that was devoid of measurable mRNAPEPCK were also performed as controls. In these instances, no primer extension products were seen (data not shown). The two products seen in lanes 8 and 9 of Fig. 8 could represent length heterogeneity of the primer, two mRNAPEPCK populations that differ by one nucleotide at their 5' ends, or premature termination of reverse transcription. The first two possibilities are unlikely since the primer was purified on a sequencing gel that resolved single nucleotide differences, and there was not an S1 nuclease-protected fragment that corresponded in size to the smaller primer extension product. The final possibility is probably correct. Other investigators have obtained similar results and have attributed them to an artifact caused when reverse transcriptase encounters the 5' capped nucleotide (27, 28).

A comparison of the S1-protected and primer-extended products (Fig. 8) shows that the smallest S1-protected fragment at the highest S1 concentrations (lanes 6 and 7) is exactly the same size as the larger of the primer-extended fragments (lanes 8 and 9). This is excellent evidence that the primer-extended products are synthesized to the 5' end of mRNAPEPCK. Base-modified sequencing fragments migrate 1.5 bases faster than corresponding S1 nuclease or primer extension fragments on sequencing gels (29); thus, the proper alignment of these fragments with the sequence ladder is shown by the staggered line between lanes 9 and 10. The sequence in this figure, which matched all previous sequence determinations, shows that the largest primer extension and smallest S1 nuclease-protected fragments correspond to a thymidine residue. Since the sequence is of the genomic DNA coding strand, mRNAPEPCK initiates with a complementary adenine residue. This residue was added to the mRNAPEPCK sequence (Fig. 2) and was assigned the position +1 in Figs. 4 and 6. Thus, the 3' most RNA polymerase II-dependent start site determined by in vitro transcription (Fig. 7) is the major initiation site in vivo.

DISCUSSION

The mRNAPEPCK sequence of 2624 nucleotides constitutes the entire molecule exclusive of the poly(A+) tail. The presence of a poly(A+) region (not shown) and a polyadenylation sequence AAUAAA (30) at nucleotide 2610 identified the 3' end and the coding strand of the cDNA. From the coding strand the three possible reading frames were determined, only one of which was long enough to code for a protein the expected size of P-enolpyruvate carboxykinase. This reading frame showed that P-enolpyruvate carboxykinase is composed of 621 amino acids with a molecular weight of 69,289. As can be seen in Table I (see Miniprint Section), this molecular weight and the amino acid composition decoded from the mRNA sequence in Fig. 2 compare very favorably with those determined by acid hydrolysis and polyacrylamide gel electrophoresis of P-enolpyruvate carboxykinase in two different laboratories (5, 31). This derived amino acid sequence was then confirmed by fast atom bombardment mass spectrometry of purified P-enolpyruvate carboxykinase. These data provide direct evidence that we have cloned P-enolpyruvate carboxykinase cDNA.
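The reading frame selection described above amounts to scanning the three forward frames for the longest open reading frame. The following is a minimal sketch of that idea, not the software used in this work; it runs on a short hypothetical sequence rather than the sequence of Fig. 2. It also checks the length bookkeeping, under the inference (not stated explicitly in the text) that the 1866 translated nucleotides include the termination codon.

```python
STOPS = {"TAA", "TAG", "TGA"}


def longest_orf(seq):
    """Return (frame, start, end, codons) for the longest ATG-to-stop open
    reading frame in the three forward frames; `end` is the index of the
    stop codon and `codons` excludes it."""
    best = (None, 0, 0, 0)
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first ATG opens a frame
            elif start is not None and codon in STOPS:
                n_codons = (i - start) // 3
                if n_codons > best[3]:
                    best = (frame, start, i, n_codons)
                start = None                   # look for the next ATG
    return best


# Length bookkeeping from the text: 5' leader + translated region + 3' region.
LEADER, CODING, TRAILER = 143, 1866, 615
total = LEADER + CODING + TRAILER   # 2624 nucleotides
residues = CODING // 3 - 1          # 622 codons minus one stop codon

toy = "GGATGAAACCCGGGTAACATGTGA"    # hypothetical sequence, not from Fig. 2
print(longest_orf(toy), total, residues)  # (2, 2, 14, 4) 2624 621
```

In the toy sequence the third frame contains a four-codon open reading frame while the first frame contains only a single ATG followed directly by a stop, so the third frame is selected, in the same way that only one frame of the cDNA was long enough to encode the enzyme.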

Flanking the 5' end of the 1866 translated nucleotides in mRNAPEPCK are 143 bases of nontranslated sequence. In Kozak's (32) comparison of the sequences of higher eukaryotic mRNAs known to date, 5' noncoding sequences averaged approximately 50 nucleotides in length; thus, mRNAPEPCK has a comparatively long leader sequence. Flanking the 3' side of the translated bases in mRNAPEPCK is another 615-base noncoding sequence. This contains a distinctive 102-bp region (enclosed in brackets in Fig. 2) of repeated sequences and palindromes. In addition, this region has features which might favor a Z-DNA conformation (33). First, 100 of the 102 nucleotides are arranged in an alternating pyrimidine-purine pattern (predominantly UpG). Hamada et al. (34) reported that stretches of TpG commonly occur in eukaryotic DNA and suggested that these could represent Z-DNA-forming elements in genomic DNA. Second, there are eight CpG dinucleotides (2-3 times the frequency observed in total genomic DNA of eukaryotic cells (35)). These have a high probability of being methylated (36), a condition which stabilizes Z-DNA (37). Z-DNA has recently been detected in biological systems (33, 38) and several investigators have speculated that it may regulate processes such as transcription and be present in enhancer sequences (33, 39). The 102-bp structure which was seen in mRNAPEPCK could thus be important in the regulation of P-enolpyruvate carboxykinase gene expression.
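The two sequence features cited for the 102-bp region, a long run of alternating purines and pyrimidines and the number of CpG dinucleotides, are straightforward to compute. The sketch below shows one way to do so on a short hypothetical sequence; it is an illustration only and does not use the sequence of Fig. 2.

```python
PURINES = set("AG")


def longest_alternating_run(seq):
    """Return (length, start) of the longest stretch in which purines and
    pyrimidines strictly alternate."""
    if not seq:
        return 0, 0
    best_len, best_start = 1, 0
    run_start = 0
    for i in range(1, len(seq)):
        if (seq[i] in PURINES) == (seq[i - 1] in PURINES):
            run_start = i          # alternation broken; a new run begins here
        if i - run_start + 1 > best_len:
            best_len, best_start = i - run_start + 1, run_start
    return best_len, best_start


def count_dinucleotide(seq, dinuc="CG"):
    """Count occurrences of a dinucleotide, scanning every position."""
    return sum(seq[i:i + 2] == dinuc for i in range(len(seq) - 1))


toy = "AATGTGTGCGTGTACC"  # hypothetical sequence
print(longest_alternating_run(toy), count_dinucleotide(toy))  # (14, 1) 1
```

Any base not listed as a purine is treated as a pyrimidine, so the same functions apply to an RNA sequence written with U.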

The P-enolpyruvate carboxykinase gene is 6.0 kb in length and is composed of 10 exons and 9 introns. The exon and intron locations were initially determined by S1 nuclease mapping and then by R-loop analysis (data not shown), which was performed in cooperation with Marjery Sullivan at Genex Corp., Rockville, MD. These procedures provided an excellent orientation of the locations of the exons and introns and served as a framework for finalizing the exact boundaries by the sequencing of genomic DNA (see Fig. 6). All but two of the exon/intron boundaries are ambiguous; splice sites in more than one place could maintain a contiguous mRNA sequence. Consequently, the designated splice junctions were assigned based upon current consensus sequences (40). In all cases, only one of the possible assignments emerged as a “best fit”; these are shown in Fig. 6. Thus, the structural map of the P-enolpyruvate carboxykinase gene presented in Fig. 4 is complete except for the precise sizes of introns C, D, E, F, and H.

Yoo-Warren et al. (41), using R-loop mapping to determine the size and location of the exons and introns of the P-enolpyruvate carboxykinase gene, reported that the gene was approximately 6 kb in size and that it was composed of 9 exons interrupted by 8 introns. The restriction enzyme maps and the alignment of the exons and introns (compare Fig. 4 with Ref. 41) indicate that the gene examined by both groups is identical despite this difference in the number of exons/introns. The 89-bp intron (G, the 7th and smallest in our map) was apparently missed in the previous R-loop mapping photographs (41).

There are three minor differences between our DNA sequence of the 5' end of the gene and 5' flanking DNA and that reported by Wynshaw-Boris et al. (42). These sites on our sequence are marked by the arrows at positions +48, -205, and -(296/297) in Fig. 6. Where we found a T, C, and CC, respectively, Wynshaw-Boris et al. found a G, no nucleotide, and CCC. We consulted with Dr. Richard Hanson and colleagues about these differences and both groups re-read their films through these areas. The consensus is that the base at +48 is a T. The two other positions remain unresolved. Both groups feel that their sequence designations are the correct interpretation of their respective sequencing films; therefore, it is possible that we have encountered some inherent microheterogeneity in the 5' flanking DNA of the P-enolpyruvate carboxykinase gene. In a fourth position, at -(387/388), the two nucleotides C and A ran together as a compression on all our sequencing gels. We are unable to clearly designate their order and have positioned them to be consistent with the previously published sequence (42).

An additional difference with the data of Wynshaw-Boris et al. (42) involves the identification of the initiation site. These authors chose the adenine residue designated as position -3 in our sequence (see Fig. 6), based on their interpretation of the proper alignment of the most prominent (not smallest) S1-protected fragment to their P-enolpyruvate carboxykinase DNA sequence. We base our assignment on S1 nuclease mapping, genomic DNA sequencing, and primer extension analysis.

An examination of the DNA near the initiation site reveals the sequence 5'-TATTTAA-3' at positions -31 to -25, which differs by only one base from a published “TATA” box consensus sequence (40). The sequence around the cap site is also consistent with the general observation that pyrimidines are located at positions -1, -5, and +2 to +6 around the capped nucleotide (40).
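The one-base difference from the consensus can be made explicit by counting mismatches against a degenerate consensus. The sketch below assumes the consensus takes the form TATA(A/T)A(A/T), written here as TATAWAW with the IUPAC code W for A or T; the text above only cites the consensus (40) without spelling it out, so this form is an assumption of the example.

```python
# ASSUMPTION: consensus written as TATAWAW, where W (IUPAC) means A or T.
# The source cites the consensus (40) but does not state it.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "AT"}


def mismatches(site, consensus):
    """Number of positions at which `site` is not permitted by `consensus`."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have equal length")
    return sum(base not in IUPAC[code] for base, code in zip(site, consensus))


print(mismatches("TATTTAA", "TATAWAW"))  # 1
```

Under this assumed consensus the only disagreement is the T at the fourth position of 5'-TATTTAA-3'.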

Since P-enolpyruvate carboxykinase is regulated by glucocorticoids, it was of interest to see if the 5' flanking DNA contained any homologies to the sequence 5'-TGTTCT-3' which is found 4 times within a region of the MMTV long terminal repeat known to interact with glucocorticoid receptor protein (43). Shinomiya et al. (44) found six 5/6 base matches to this sequence in the first 600 bases of 5' flanking DNA in their study of the glucocorticoid-regulated tyrosine aminotransferase gene. In the 1200 bases of 5' flanking P-enolpyruvate carboxykinase DNA, 5/6 base matches to this sequence occur 9 times. These are overlined in Fig. 6. In addition, this feature appears 8 times in the sequence between exons 1 and 10 (approximately 4800 bp); 7 of these are in introns. This sequence is even more frequent in the 10th exon where 5 close matches occur within 500 bp (not shown). It will be interesting to see whether these sequences are involved in glucocorticoid-regulated transcription of this gene.
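A search for 5/6 base matches of this kind can be sketched as a sliding window comparison against the hexanucleotide. The example below is illustrative only: it scans a short hypothetical sequence on one strand and reports perfect matches as well as 5/6 matches, whereas the text does not state whether both strands were searched or how perfect matches were counted.

```python
def near_matches(seq, motif="TGTTCT", min_match=5):
    """Return (position, window, matches) for every window of len(motif)
    that agrees with `motif` at `min_match` or more positions."""
    k = len(motif)
    hits = []
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        score = sum(a == b for a, b in zip(window, motif))
        if score >= min_match:
            hits.append((i, window, score))
    return hits


toy = "AATGTTCTGGTGTACTCC"  # hypothetical sequence, not from Fig. 6
print(near_matches(toy))  # [(2, 'TGTTCT', 6), (10, 'TGTACT', 5)]
```

Because a hexanucleotide with one permitted mismatch is expected to occur frequently by chance, counts obtained this way are best compared with the number expected for a sequence of the same length and base composition.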

The question of whether the P-enolpyruvate carboxykinase gene is present in one or more copies in liver-derived tissues was addressed by Yoo-Warren et al. (41). They probed Southern blots of BamHI-digested rat liver and H4IIE cell genomic DNA and SstI- or SphI-digested rat liver DNA with a P-enolpyruvate carboxykinase cDNA designated pPCK10. The fragments were always approximately the size predicted from the restriction map of P-enolpyruvate carboxykinase genomic DNA (41). We performed similar Southern blot experiments using rat liver, H4IIE cell, and HTC cell DNA, cut with EcoRI, HindIII, or BamHI. The results were always consistent with the restriction map of XPC112 (data not shown), which supports the previous conclusion (41) that rat liver cytosolic P-enolpyruvate carboxykinase is coded for by a single copy gene.

One unexpected observation deserves further comment. A 190-bp S1-protected fragment was found extending from the EcoRI site at position -460 to approximately position -270 (Fig. 5B). This fragment was of weaker intensity than usual (compare Fig. 5, A and B), but it was consistently observed. To determine whether this represented an alternative initiation site, nuclear RNA isolated from N6,O2'-dibutyryl cyclic AMP-treated H4IIE cells was analyzed on formaldehyde gels and probed with either DNA isolated from the 5' flanking DNA around this EcoRI site or DNA from known coding sequence of the gene. The filter probed with the flanking DNA usually showed very faint fragments of approximately the same size as the “precursor” P-enolpyruvate carboxykinase RNAs on the filter probed with the coding sequence DNA (data not shown). Unfortunately, the inability to obtain a stronger signal, despite varying the time and/or temperature of hybridization, made any firm conclusions regarding the data impossible. It is intriguing, however, to consider these observations in conjunction with the in vitro transcription and primer extension results. The in vitro transcription assays showed an RNA polymerase II-dependent start site 650 bp 5' of the BglII site on the coding strand, placing its initiation site at about position -600 in the 5' flanking DNA of the P-enolpyruvate carboxykinase gene. Just 5' of this general area is a 20-bp stretch of AT-rich tracts, similar to TATA sequences, indicated by the dashed line box in Fig. 6. Primer extension from the BglII site, as already mentioned, showed minor amounts of a 600-base product. Thus, there may be a low level of transcription from the AT-rich sequences at position -620 of the P-enolpyruvate carboxykinase gene in vivo.

Acknowledgments-We are grateful to Dan Petersen, Mark Granner, Gerald Clancy, Diane Ochs, Debra Blair, Steve Murray, and Dr. Bradford W. Gibson (Massachusetts Institute of Technology) for technical assistance and to Sara Paul and Patsy Barrett for secretarial help. We are also indebted to Drs. Joseph Walder and Roxanne Walder, Department of Biochemistry, and the Iowa Diabetes and Endocrinology Research Center Oligonucleotide Synthesis Core for their help in oligonucleotide synthesis, and we thank Dr. Gerald Carlson, Department of Biochemistry, University of Mississippi, for his help in analyzing the amino acid sequence.

REFERENCES

1. Gunn, J., Tilghman, S., Hanson, R., Reshef, L., and Ballard, F. (1975) Biochemistry 14, 2350-2357
2. Barnett, C. A., and Wicks, W. D. (1971) J. Biol. Chem. 246, 7201-7206
3. Wicks, W., Lewis, W., and McKibbin, J. (1972) Biochim. Biophys. Acta 264, 177-185
4. Wicks, W., and McKibbin, J. (1972) Biochem. Biophys. Res. Commun. 48, 205-211
5. Beale, E., Katzen, C., and Granner, D. (1981) Biochemistry 20, 4878-4883
6. Iynedjian, P. B., and Hanson, R. W. (1977) J. Biol. Chem. 252, 655-662
7. Andreone, T. C., Beale, E. G., Bar, R. S., and Granner, D. K. (1982) J. Biol. Chem. 257, 35-38
8. Kioussis, D., Reshef, L., Cohen, H., Tilghman, S. M., Iynedjian, P. B., Ballard, F. J., and Hanson, R. W. (1978) J. Biol. Chem. 253, 4327-4332
9. Chrapkiewicz, N. B., Beale, E. G., and Granner, D. K. (1982) J. Biol. Chem. 257, 14428-14432
10. Beale, E. G., Hartley, J. L., and Granner, D. K. (1982) J. Biol. Chem. 257, 2022-2028
11. Yoo-Warren, H., Cimbala, M. A., Felz, K., Monahan, J. E., Leis, J. P., and Hanson, R. W. (1981) J. Biol. Chem. 256, 10224-10227
12. Cimbala, M. A., Lamers, W. H., Nelson, K., Monahan, J. E., Yoo-Warren, H., and Hanson, R. W. (1982) J. Biol. Chem. 257, 7629-7636
13. Beale, E., Andreone, T., Koch, S., Granner, M., and Granner, D. (1984) Diabetes 33, 328-332
14. Lamers, W., Hanson, R., and Meisner, H. (1982) Proc. Natl. Acad. Sci. U. S. A. 79, 5137-5141
15. Granner, D., Andreone, T., Sasaki, K., and Beale, E. (1983) Nature (Lond.) 305, 549-551
16. Granner, D., and Beale, E. (1985) Biochemical Actions of the Hormones (Litwack, G., ed), Vol. XII, Academic Press, New York
17. Granner, D. K., Petersen, D. D., Koch, S. R., Sasaki, K., and Beale, E. G. (1984) Trans. Assoc. Am. Physicians 97, 33-40
18. Sasaki, K., Cripe, T. P., Koch, S. R., Andreone, T. L., Petersen, D. D., Beale, E. G., and Granner, D. K. (1984) J. Biol. Chem. 259, 15242-15251
19. Barber, M., Bordoli, R. S., Sedgwick, R. D., and Tyler, A. N. (1981) J. Chem. Soc. Chem. Commun. 7, 325-327
20. Webster, T. A., Gibson, B. W., Keng, T., Biemann, K., and Schimmel, P. (1983) J. Biol. Chem. 258, 10637-10641
21. Gibson, B. W., and Biemann, K. (1984) Proc. Natl. Acad. Sci. U. S. A. 81, 1956-1960
22. Tarr, G. E. (1977) Methods Enzymol. 47, 335-357
23. Durnam, D., Perrin, F., Gannon, F., and Palmiter, R. (1980) Proc. Natl. Acad. Sci. U. S. A. 77, 6511-6515
24. Berk, A., and Sharp, P. (1977) Cell 12, 721-732
25. Manley, J., Fire, A., Cano, A., Sharp, P., and Gefter, M. (1980) Proc. Natl. Acad. Sci. U. S. A. 77, 3855-3859
26. Iynedjian, P. B., and Hanson, R. W. (1977) J. Biol. Chem. 252, 8398-8403
27. Chan, L., Dugaiczyk, A., and Means, A. (1980) Biochemistry 19, 5631-5637
28. Shelness, G. S., and Williams, D. L. (1984) J. Biol. Chem. 259, 9929-9935
29. Sollner-Webb, B., and Reeder, R. (1979) Cell 18, 485-499
30. Proudfoot, N., and Brownlee, G. (1976) Nature (Lond.) 263, 211-214
31. Colombo, G., Carlson, G., and Lardy, H. (1978) Biochemistry 17, 5321-5329
32. Kozak, M. (1984) Nucleic Acids Res. 12, 857-872
33. Nordheim, A., Pardue, M., Lafer, E., Moller, A., Stollar, B., and Rich, A. (1981) Nature (Lond.) 294, 417-422
34. Hamada, H., Petrino, M., and Kakunaga, T. (1982) Proc. Natl. Acad. Sci. U. S. A. 79, 6465-6469
35. Swartz, M. N., Trautner, T. A., and Kornberg, A. (1962) J. Biol. Chem. 237, 1961-1967
36. Razin, A., and Riggs, A. (1980) Science (Wash. D. C.) 210, 604-610
37. Behe, M., and Felsenfeld, G. (1981) Proc. Natl. Acad. Sci. U. S. A. 78, 1619-1623
38. Morgenegg, G., Celio, M., Malfoy, B., Leng, M., and Kuenzle, C.
39. Nordheim, A., and Rich, A. (1983) Nature (Lond.) 303, 674-679
40. Breathnach, R., and Chambon, P. (1981) Annu. Rev. Biochem.
41. Yoo-Warren, H., Monahan, J., Short, J., Short, H., Bruzel, A.,
Nature (Lond.) 278, 857-859
49. Fagan, J., Pastan, I., and de Crombrugghe, B. (1980) Nucleic Acids Res. 8, 3055-3064
50. Ito, H., Ike, Y., Ikuta, S., and Itakura, K. (1982) Nucleic Acids Res. 10, 1755-1769
51. Kahn, M., Kolter, R., Thomas, C., Figurski, D., Meyer, R., Remaut, E., and Helinski, D. (1979) Methods Enzymol. 68, 268-280
52. Birnboim, H., and Doly, J. (1979) Nucleic Acids Res. 7, 1513-1523
53. Rigby, P., Dieckmann, M., Rhodes, C., and Berg, P. (1977) J. Mol. Biol. 113, 237-251
54. Grunstein, M., and Hogness, D. (1975) Proc. Natl. Acad. Sci. U. S. A. 72, 3961-3965
55. Hanahan, D., and Meselson, M. (1980) Gene 10, 63-67
56. Chirgwin, J., Przybyla, A., MacDonald, R., and Rutter, W. (1979) Biochemistry 18, 5294-5299
57. Morrison, D. (1979) Methods Enzymol. 68, 326-331
58. Ghosh, P., Reddy, V., Piatak, M., Lebowitz, P., and Weissman, S. (1980) Methods Enzymol. 65, 580-595
59. Land, H., Grez, M., Hauser, H., Lindenmaier, W., and Schutz, G. (1981) Nucleic Acids Res. 9, 2251-2266
60. Dagert, L., and Ehrlich, S. (1979) Gene 6, 23-28
61. Gubler, U., and Hoffman, B. (1983) Gene 25, 263-269
62. Staden, R. (1977) Nucleic Acids Res. 4, 4037-4051
63. Queen, C., and Korn, L. (1980) Methods Enzymol. 65, 595-609
64. Novotny, J. (1982) Nucleic Acids Res. 10, 127-131
65. Lagrimini, L., Brentano, S., and Donelson, J. (1984) Nucleic Acids Res. 12, 605-614
66. Sargent, T., Wu, J., Sala-Trepat, J., Wallace, R., Reyes, A., and Bonner, J. (1979) Proc. Natl. Acad. Sci. U. S. A. 76, 3256-3260
67. Benton, W., and Davis, R. (1977) Science (Wash. D. C.) 196,
68. Yamamoto, K., Alberts, B., Benzinger, R., Lawhorne, L., and Freiber, G. (1970) Virology 40, 734-744
70. Deng, G., and Wu, R. (1981) Nucleic Acids Res. 9, 4173-4188

(1983) Nature (Lond.) 303,540-543 180-182

50,349-383 69. Southern, E. (1975) J. Mol. Biol. 98, 503-517

Wynshaw-Boris, A., Meisner, H., Samols, D., and Hanson, R. 71. Maniatis, T., Fritsch, E., and Sambrook, J. (1982) Molecular (1983) Proc. Natl. Acad. Sci. U. S. A. 80, 3656-3660 Cloning-A Laboratory Manual, pp. 202-203, Cold Spring Har-

42. Wynshaw-Boris, A,, Lugo, T. G., Short, J. M., Fournier, R. E. K., bor Laboratory, Cold Spring Harbor, NY and Hanson, R. W. (1984) J. Biol. Chem. 259, 12161-12169 72. Tinoco, I., Borer, P., Dengler, B., Levine, M., Uhlenbeck, O.,

43. Scheidereit, C., Geisse, S., Westphal, H., and Beato, M. (1983) Crothers, D., and Gralla, J. (1973) Nature New Biol. 246, 40- Nature (Lond.) 304,749-752 41

44. Shinomiya, T., Scherer, G., Schmid, W., Zentgraf, H., and Schutz, 73. Duyk, G., Leis, J., Longiaru, M., and Skalka, A. (1983) Proc. G. (1984) Proc. Natl. Acad. Sci. U. S. A. 81, 1346-1350 Natl. Acad. Sci. U. S. A. 80, 6745-6749

45. Boseley, P., Moss, T., and Birnstiel, M. (1980) Methods Enzymol. 74. Bentle, L. A., Snoke, R. E., and Lardy, H. A. (1976) J. Biol. 65,478-494 Chem. 251, 2922-2928

46. Maxam, A,, and Gilbert, W. (1980) Methods Enzymol. 65, 499- 75. Hopp, T., and Woods, K. (1981) Proc. Natl. Acad. Sci. U. S. A. 560 78,3824-3828

47. Hartley, J., Chen, K., and Donelson, J. (1982) Nucleic Acids Res. 76. Garnier, J., Osguthorpe, D., and Robson, B. (1978) J. Mol. Biol.

48. Gopinathan, K., Weymouth, L., Kunkel, T., and Loeb, L. (1979) 77. Levitt, M. (1976) J. Mol. Biol. 104,59-107 10,4009-4025 120,97-120

SUPPLEMENTARY MATERIAL TO

Rat Hepatic Cytosolic Phosphoenolpyruvate Carboxykinase (GTP): Structures of the Protein, Messenger RNA, and Gene

Elmus G. Beale, Nancy B. Chrapkiewicz, Hubert A. Scoble, Raymond J. Metz, Douglas P. Quick, Richard L. Noble, John E. Donelson, Klaus Biemann, and Daryl K. Granner

EXPERIMENTAL PROCEDURES

Materials and Animals - Restriction enzymes were purchased from Bethesda Research Labs, New England Biolabs and Amersham. Reverse transcriptase was purchased from Life Sciences, Inc., or from Bethesda Research Labs. Terminal transferase and RNase H were purchased from P-L Biochemicals. DNA polymerase I, Exonuclease III, and λ Exonuclease came from New England Biolabs, Klenow DNA polymerase from Bethesda Research Labs, radioisotopes from New England Nuclear or Amersham, reagents for oligonucleotide synthesis from Aldrich or Beckman, and trypsin and S. aureus V8 protease from Sigma. All reagents for the agarose and polyacrylamide gels were ultra-pure grade from Bethesda Research Labs. Lysozyme and α-amanitin were purchased from Sigma Corporation; S1 nuclease from Sigma Corporation or Miles Laboratories. Nitrocellulose was purchased from Schleicher and Schuell. All other chemicals were reagent grade or better.

Male CD strain rats were purchased from Charles River Breeding Laboratories and made diabetic with streptozotocin as described previously (13).

General Methods - Plasmid DNA was isolated from large cultures using the cesium chloride centrifugation procedure of Kahn et al. (51) and from small cultures using the procedure described by Birnboim and Doly (52). Hybridization probes were labelled with ³²P by nick translation (53) to a specific activity of 1-2 × 10⁸ cpm/µg, and used to screen bacterial colonies containing cDNAs by the technique of Grunstein and Hogness (54) as modified by Hanahan and Meselson (55). RNA to be used for cloning and for primer extension was isolated from livers of dibutyryl cAMP-treated diabetic rats by the method of Chirgwin et al. (56). Poly(A)+ RNA was isolated by a single passage over oligo-dT cellulose as described previously (5). Typically, mRNA^PEPCK represents 0.5-1.0% of the total poly(A)+ RNA in such preparations.

Size-fractionation of cDNA Library - Total plasmid DNA was isolated from a pooled culture of the library that we constructed previously (10). The supercoiled plasmids were fractionated by electrophoresis on 0.8% (w/v) low melting point agarose. The region of the gel that contained supercoiled plasmids with cDNA inserts ranging in size from 1200-6000 base pairs was excised; small inserts and open circle forms of the plasmids were thus excluded. The DNA was recovered from the gel and used to transform E. coli strain HB101 as described by Morrison (57). All of the tetracycline-resistant colonies were pooled and stored at -20 °C in a 1:1 mixture of glycerol and LB-tetracycline (1% Tryptone, 0.5% NaCl, 0.5% yeast extract, and 25 µg/ml tetracycline).

[…]nucleotides - The seventeen nucleotide oligomer, PC17 (5'-[…]-3'), was synthesized on a solid support by the phosphotriester method described by Ito et al. (50). The resin was purchased with the first nucleotide attached. Remaining nucleotides were added as blocked dinucleotides, and the final product was purified by two passages over a high performance liquid chromatography Hamilton PRP-1 column. The first passage was prior to removal of the dimethoxytrityl protecting group and the final passage was after deprotection with 80% acetic acid at room temperature for 30 min.

The twenty four nucleotide oligomer, PC24 (5'-W\TCTCAWGCGTCTCGCCGGITT-3'), was synthesized on a Beckman automated synthesizer using the phosphotriester method with blocked mononucleotides. The product was purified using high performance liquid chromatography as described above but only the first chromatographic step was performed. Prior to use, the oligomer was labelled at the 5' end with ³²P, using polynucleotide kinase as described below, and then further purified by electrophoresis through a 20% polyacrylamide-7 M urea sequencing gel. The DNA was eluted by diffusion and recovered with an Elutip-d column (Schleicher and Schuell).

Construction of cDNA Libraries - Two new cDNA libraries were prepared for the work described here. The first library, used to isolate pPC302, was prepared by the methods described by Ghosh et al. (58) and by Land et al. (59). A 511 base pair XmaI-XmaI restriction fragment isolated from pPC201 was annealed to poly(A)+ RNA at 52 °C for 18 h in a buffer containing 80% (v/v) formamide, 0.01 M sodium EDTA, 0.4 M NaCl, 0.1 M PIPES at pH 6.8. The mixture was then chromatographed over oligo-dT cellulose in the presence of high salt and the bound RNA-primer was eluted with low salt and recovered by ethanol precipitation. The reverse transcription reaction was done in a buffer containing 75 mM Tris-HCl at pH 8.1 (at 22 °C), 6 mM MgCl2, 148 mM KCl, 5 mM dithiothreitol, 0.1 mg/ml bovine serum albumin, 1 mM each of dGTP, dCTP, dATP and TTP, 5 units/ml of placental ribonuclease inhibitor and 230 units of AMV reverse transcriptase. This solution was incubated for 1 h at 42 °C and the products were extracted with phenol-chloroform (1:1). Following hydrolysis with NaOH the DNA was tailed with dCTP; the synthesis of the second strand of cDNA and insertion of the double-stranded cDNA into PstI digested, dG-tailed pBR322 was then performed as described by Land et al. (59). The resulting plasmids were used to transform E. coli strain HB101 by the procedure of Dagert and Ehrlich (60). Two hundred fifty transformants were obtained and screened by in situ colony hybridization using the XmaI-XmaI fragment probe. Colonies to which the probe hybridized were selected for further screening. Digestion with PstI revealed that pPC302 had the largest insert of the group.

pPC400 was prepared by the method of Gubler and Hoffman (61). 20 pmol of PC17 was used to prime the reverse transcription of 30 µg of poly(A)+ RNA. After the synthesis of the second strand of DNA using ribonuclease H and Klenow DNA polymerase I, a product of approximately 260 bp was isolated from a preparative, 2% low melting point agarose gel. The DNA was tailed with dCTP using terminal transferase, and inserted at the dG-tailed PstI site in pBR322. Transformation and screening was accomplished as described for pPC302 except that an 1101 bp EcoRI-HindIII fragment from λPC112 was used as the probe.

Sequencing - Restriction enzyme mapping was accomplished as described by Boseley et al. (45). The techniques and polyacrylamide-urea gels described by Maxam and Gilbert (46) were used for sequencing. DNA sequences were analyzed for symmetries, repeats, amino acid coding regions, and restriction enzyme sites using the computer programs described by Staden (62), Queen and Korn (63), and the dot matrix program described by Novotny (64), using either a Digital Equipment Corporation VAX 11/780 (for the Staden and dot matrix programs) or an IBM 370/168 computer (for the Queen and Korn program). Some analyses were done on an IBM personal computer using the programs of Lagrimini et al. (65).

Confirmation of the Amino Acid Sequence of P-enolpyruvate Carboxykinase - The mRNA-derived amino acid sequence was confirmed by analyzing the proteolytic peptides resulting from trypsin and S. aureus V8 protease digests of carboxamidomethylated P-enolpyruvate carboxykinase. Briefly, 15 nmoles of the protein was digested with diphenylcarbamyl chloride treated trypsin¹ and another 20 nmoles was digested with S. aureus V8 protease. Each was separated into 7-8 fractions of 5-15 peptides each using reversed-phase high performance liquid chromatography to reduce the complexity of the resulting digests. After lyophilization, each fraction was analyzed by fast atom bombardment mass spectrometry to determine the molecular weight of the proteolytic peptides. These data were then compared with the calculated molecular weights of all proteolytic peptides expected from the hypothetical amino acid sequence deduced from the mRNA^PEPCK sequence. To further support these peptide assignments, each of the mass-analyzed liquid chromatographic fractions was subjected to one or two successive steps of manual Edman degradation with fast atom bombardment mass spectrometric analysis after each step. This combined strategy for the mass spectrometric verification and correction of cDNA-derived amino acid sequences of proteins (21) as well as the computer algorithms involved in this approach have been described elsewhere².
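The comparison of measured peptide masses with those expected from the deduced sequence can be illustrated with a minimal sketch. It is not the authors' program, which is described elsewhere; the function names, the 0.5 Da tolerance, and the short test sequence are hypothetical, and the residue masses are standard monoisotopic values rather than figures taken from this paper. Cysteine is assumed to carry the carboxamidomethyl group (+57.02146 Da), and the rule that trypsin does not cleave before proline is a common convention assumed here.

```python
# Monoisotopic residue masses in Da (standard values, not taken from the paper).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056               # added once per peptide (free N- and C-termini)
PROTON = 1.00728               # FAB spectra are read here as (M+H)+ ions
CARBOXAMIDOMETHYL = 57.02146   # assumed modification on every cysteine


def digest(sequence, cleave_after, blocked_by=""):
    """Split a sequence after each residue in cleave_after, unless the next
    residue is listed in blocked_by (e.g. no tryptic cleavage before Pro)."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        nxt = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in cleave_after and not (nxt and nxt in blocked_by):
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def mh_plus(peptide):
    """Calculated monoisotopic (M+H)+ of a carboxamidomethylated peptide."""
    mass = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON
    return mass + CARBOXAMIDOMETHYL * peptide.count("C")


def match(observed_mz, peptides, tolerance=0.5):
    """Return the predicted peptides whose (M+H)+ lies within the tolerance."""
    return [p for p in peptides if abs(mh_plus(p) - observed_mz) <= tolerance]


# Hypothetical nine-residue sequence, used only to exercise the functions.
tryptic = digest("AGKRPCLKM", cleave_after="KR", blocked_by="P")
for pep in tryptic:
    print(pep, round(mh_plus(pep), 3))
```

A V8 protease digest could be approximated in the same way with `digest(seq, cleave_after="E")`. Whether aspartyl bonds are also cleaved depends on the buffer, so that choice is left to the caller.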

Isolation of P-enolpyruvate Carboxykinase Genomic DNA Clones - A rat liver genomic library was obtained from Drs. Linda Jagodzinski and James Bonner, California Institute of Technology. This library was generated from partial HaeIII digests of rat liver DNA which were inserted with EcoRI linkers into the Charon 4A lambda phage (66). The library was screened with three successive rounds of in situ hybridization (at high, intermediate, and low plaque density, respectively) by the procedure of Benton and Davis (67). Approximately 10⁶ clones were screened in the first round. Restriction fragments isolated from cloned P-enolpyruvate carboxykinase cDNA were used as hybridization probes. Phage was isolated as described by Yamamoto et al. (68). Phage DNA was obtained by phenol/chloroform extraction. The P-enolpyruvate carboxykinase genomic clones identified in this way are λPC103 and λPC112. The isolated DNAs were characterized by restriction mapping (45) and Southern blotting (69).

[…] pBR322. For pλPC103.3KS, λPC103 was digested to completion with KpnI and SstI. […] with dCTP as described by Deng and Wu (70), and then inserted into PstI-cut dG-tailed pBR322 obtained from Bethesda Research Labs. These chimeric plasmids were used to transform E. coli strain HB101 and then cloned. Clones containing pλPC103.3KS were identified using a nick-translated 250 bp BamHI-BamHI fragment isolated from pλPC112.R2 as probe.

Primer Extension of 5' End-Labelled PC24 - The primer-extension procedure was performed essentially as described by Ghosh et al. (58). Thirty µg of poly(A)+ RNA was annealed to 20 pmol of 5'-³²P-PC24 (see Fig. 2) in 0.13 M Tris-HCl at pH 8.6 (at 22 °C), 0.18 M KCl and 1 mM disodium EDTA by heating to 85 °C for 5 min and then incubating the mixture at 42 °C for 1 h. The total volume of this reaction was 78 µl. At the end of the annealing period the mixture was adjusted to 100 µl containing 2 mM each of dCTP, dGTP, dATP and TTP, 28 mM β-mercaptoethanol, 10 mM MgCl2, 13 milliunits of placental ribonuclease inhibitor (BRL) and 1 unit/µl of AMV reverse transcriptase. Reverse transcription was then carried out at 42 °C for 1 h. The reaction was stopped by placing the tube on ice and then adding 100 µl of 20 mM disodium EDTA at pH 8. The nucleic acids were extracted with phenol:chloroform (1:1) and RNA was hydrolyzed by adding sodium hydroxide to a final concentration of 0.3 M and incubating at room temperature overnight. The NaOH was neutralized with 0.6 M acetic acid, 2 µg of tRNA was added as carrier, and the ³²P-labeled single stranded cDNA was precipitated with ethanol and ammonium acetate. The precipitated cDNA was harvested by centrifugation and dissolved in a small amount of 80% formamide containing 50 mM Tris-borate (pH 8.3), 1 mM EDTA, and 0.1% (w/v) each of xylene cyanol and bromphenol blue and resolved on polyacrylamide-urea sequencing gels (46).

S1 Nuclease Mapping - The following procedure is a modification of that published by Berk and Sharp (24). Approximately 200 ng of purified DNA fragment at least 1000 bp in size was 3' or 5' end-labelled. After phenol:chloroform extraction, 5' end-labelled fragments were exonuclease III digested in a buffer containing 50 mM Tris pH 8.0/10 mM β-mercaptoethanol/5 mM MgCl2, and 3' end-labelled fragments were λ exonuclease digested in a buffer containing 67 mM glycine pH 9.4/3 mM MgCl2/3 mM β-mercaptoethanol for 15 min at 37 °C. The resulting end-labelled ssDNA was phenol:chloroform extracted and divided between two 1.5 ml Eppendorf tubes, one containing 10 µg of tRNA and the other tRNA + 2.5 µg of rat liver poly(A)+ RNA. The samples were then ethanol precipitated and treated exactly as described by Berk and Sharp (24) except that no carrier DNA was added to the S1 nuclease buffer and all S1 nuclease incubations were performed at 0 °C for 30 min using either 3000 U/ml of Miles Laboratories or 1 W U/ml Sigma enzyme. The final S1 nuclease-protected fragments were analyzed by electrophoresis on 2.5% agarose/NaOH denaturing gels. The NaOH buffer used in the gel and reservoirs was 30 mM NaOH/2 mM EDTA. Electrophoresis was at 40 V overnight. The gels were then dried and placed on film.

In Vitro Transcription - All the in vitro transcriptions were done using a HeLa cell lysate system (25) obtained from New England Nuclear. P-enolpyruvate carboxykinase genomic DNA was used at a concentration of 700-1000 ng/25 µl in the assays. The synthesized RNAs were analyzed by electrophoresis on 10% polyacrylamide/urea sequencing gels (46) or 1% agarose/MOPS-formaldehyde gels (71).

RESULTS AND DISCUSSION

Putative Formation of a Hairpin Loop - The mRNA^PEPCK nucleotide sequence was examined in an effort to understand the reason for the premature termination during the construction of the cDNAs and why there is a "stutter" site in the region of the 5' end of PC302. It is possible to draw a hairpin loop structure near the 5' termini of PC201 and PC302 (see Fig. 9). The free energy (ΔG, 25 °C) of this conformation, calculated as described by Tinoco et al. (72), is approximately -14 kcal, indicating that a stable loop structure may indeed form. It has been postulated that ribonuclease H activity in the reverse transcriptase degrades the RNA during the pause at a loop (73). This would release the 5' end of mRNA^PEPCK and, since the released fragment would not have a primer, it could not be copied by reverse transcriptase.
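Candidate hairpins of this kind can be located by searching a sequence for inverted repeats. The following is a minimal sketch under stated assumptions: it reports only perfect Watson-Crick stems of a fixed length, ignores G-U pairs, bulges and mismatches, and does not compute the free energy by the rules of Tinoco et al. (72). The function names, the default stem and loop sizes, and the test sequence are hypothetical and are not taken from the mRNA^PEPCK sequence.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA string (A, C, G, U only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_hairpins(rna, stem=5, min_loop=3, max_loop=8):
    """List (start, loop_length, 5' arm, 3' arm) for every perfect inverted
    repeat of the given stem length separated by an allowed loop length."""
    hits = []
    n = len(rna)
    for i in range(n - 2 * stem - min_loop + 1):
        left = rna[i:i + stem]
        target = reverse_complement(left)   # sequence that would pair with the 5' arm
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop              # start of the candidate 3' arm
            if j + stem > n:
                break
            if rna[j:j + stem] == target:
                hits.append((i, loop, left, rna[j:j + stem]))
    return hits


# Hypothetical sequence: a 5-bp stem (GGGCC / GGCCC) closing a 4-nt loop.
example = "AAGGGCCUUUUGGCCCAA"
print(find_hairpins(example))
```

A stem found this way is only a candidate; its stability would still have to be evaluated with a nearest-neighbor free energy calculation such as the one cited in the text.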

Figure 9 - Putative Palindromic Structures in mRNA^PEPCK That Could Cause Stutter of Reverse Transcriptase. This shows a possible hairpin loop structure that could form in mRNA^PEPCK and terminate reverse transcription. The 5' ends of PC201 and PC302 are designated. The significance of this structure is discussed in the text as this region of the messenger RNA represents a stutter site for reverse transcription.

Analysis of the P-enolpyruvate Carboxykinase Amino Acid Sequence - The elucidation of the amino acid sequence for P-enolpyruvate carboxykinase will provide a strong basis for understanding the mechanism of catalysis, the tertiary structure of the protein, and the interaction of P-enolpyruvate carboxykinase with other proteins such as the ferroactivator protein (74). Several comments can be made about the primary structure of the protein. Since charged residues may lie on the surface of the molecule and hydrophobic residues may be buried within the protein (75), it was of interest to determine whether charged and hydrophobic residues appear in clusters throughout the protein. This indeed appears to be the case as shown in Fig. 10. There are numerous short stretches of 6 or more contiguous amino acids that are particularly hydrophilic and may therefore lie on the surface of the molecule, where they may represent antigenic determinant sites in the native protein (75). It also seems likely that much of the C-terminus, beginning with amino acid 560, would be on the surface of the protein since there is a preponderance of hydrophilic residues in that region. There are 4 long regions that are particularly hydrophobic and these include amino acids 138-223, 281-386, 398-460 and 479-517. It seems likely that these hydrophobic regions would be buried within the molecule itself. Also shown in Fig. 10 is a prediction of the secondary structure of the protein based on the technique described by Garnier et al. (76). The stretches shown have at least 6 contiguous amino acids that are likely to be either α-helix or β-pleated sheet structures. The amino acid sequence of P-enolpyruvate carboxykinase was compared with the entire data base at the National Biomedical Research Foundation, Georgetown University, Washington, D.C. There were no proteins in the data base which were found to have substantial homology with this enzyme, thus it was not possible to predict secondary or tertiary structure by analogy with other proteins of known three dimensional conformation.
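The averaging over six contiguous residues that underlies the hydrophilicity profile can be sketched as follows. This is an illustration only. The scale in the code is the Hopp and Woods hydrophilicity scale as commonly tabulated, used as a stand-in because the analysis in this paper used the solvent parameters of Levitt (77), which are not reproduced in this text. The function name and the test sequence are hypothetical.

```python
# Hopp-Woods hydrophilicity values as commonly tabulated (stand-in scale;
# the paper's own profile used Levitt's solvent parameters instead).
HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "P": 0.0, "T": -0.4, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}


def window_means(sequence, window=6, scale=HOPP_WOODS):
    """Mean hydrophilicity of every run of `window` contiguous residues.
    Positive means suggest hydrophilic stretches, negative means hydrophobic."""
    values = [scale[aa] for aa in sequence]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical 12-residue sequence: a charged stretch followed by a
# hydrophobic stretch.
profile = window_means("KKDEERLLIVFW")
for start, mean in enumerate(profile, start=1):
    print(start, round(mean, 2))
```

Passing a different dictionary as `scale` would reproduce the same windowing with another set of residue parameters.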

Finally, Colombo et al. (31) previously reported that there is a cysteine residue, located 44% of the total length of the polypeptide chain from the N-terminus, that is essential for P-enolpyruvate carboxykinase enzymatic activity. The cysteines that are closest to this position are Cys 245, which is 39% of the distance from the N-terminus, and Cys 288, which is 46% of the distance from the N-terminus. This latter Cys, although closest to the predicted position, is located within one of the segments that is enriched in hydrophobic amino acids and may therefore be buried. Thus, Cys 245 may be the best candidate for this essential amino acid.
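The relative positions quoted for the two cysteines follow directly from the chain length of 621 residues given in Table I. A short check, with hypothetical function and variable names:

```python
CHAIN_LENGTH = 621          # residues, from the deduced sequence (Table I)
REPORTED_PERCENT = 44.0     # position of the essential cysteine per Colombo et al. (31)


def percent_from_n_terminus(position, length=CHAIN_LENGTH):
    """Position of a residue as a percentage of the chain length."""
    return 100.0 * position / length


candidates = (245, 288)
for cys in candidates:
    print(f"Cys {cys}: {percent_from_n_terminus(cys):.1f}% from the N-terminus")

# Candidate whose relative position is nearest the reported 44%.
closest = min(candidates,
              key=lambda pos: abs(percent_from_n_terminus(pos) - REPORTED_PERCENT))
print("closest to 44%:", closest)
```

The arithmetic places Cys 288 nearer the reported position; the preference for Cys 245 in the text rests on the hydrophobicity argument, not on distance.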

Figure 10 - Analysis of Possible Secondary Structures of P-enolpyruvate Carboxykinase. The mean hydrophilicity values of six contiguous amino acids were plotted throughout the entire length of the amino acid sequence as described by Hopp and Woods (75). The solvent parameters described by Levitt (77) were used as the hydrophilicity values. Positive values indicate hydrophilic regions and negative values indicate hydrophobic regions of the peptide chain. In b, solid bars represent hydrophilic regions likely to be on the surface of the protein and open bars represent strongly hydrophobic regions that may be inside the protein. Shown at the top, a, are α-helix and β-pleated sheet regions predicted as described by Garnier et al. (76). Only those regions which have at least six contiguous amino acids predicted to have the same conformation are shown. The "saw-tooth" indicates β-pleated sheet and spirals indicate α-helical regions.

TABLE I

A comparison of the amino acid composition and molecular weight of P-enolpyruvate carboxykinase determined by acid hydrolysis and polyacrylamide gel electrophoresis of the purified enzyme (Colombo et al. (31) and Beale et al. (5)) and deduction from the mRNA^PEPCK sequence (Figure 2).

Amino Acid    From sequence (Figure 2)    By analysis of the purified enzyme: Colombo et al. (31)    Beale et al. (5)

(mols of residue per mol P-enolpyruvate carboxykinase)

48 29

51 28

27 56 26

46 27

51

Cys 13 13 12

Gln Glu

Gly His Ile Leu Lys Met Phe Pro Ser Thr

18 52

55 13 39 51 42

28 18

44 32 23 16

33 14

76 71

58

36 12

44 55

18 28 41 34 23 19 15 28

56

35 13

50 40

27 15

43 30 22 18 13 32

Total: 621 635 601

Mr: 69,289 72,000 67,000

¹ Although the trypsin was treated with diphenylcarbamyl chloride to minimize chymotryptic activity, many of the proteolytic peptides from the digest could be attributed to cleavage of chymotryptic sites. This was confirmed by measuring the exact mass of several of these peptides using high resolution mass spectrometry.

² Scoble, H. A., and Biemann, K. (1984) 32nd Annual Conference on Mass Spectrometry and Allied Topics, San Antonio, TX, No. 349.