mammalian hexokinase 1: evolutionary conservation and structure to function analysis



GENOMICS 11, 1014-1024 (1991)

Mammalian Hexokinase 1: Evolutionary Conservation and Structure to Function Analysis

L. D. GRIFFIN,* B. D. GELB,*,† D. A. WHEELER,‡ D. DAVISON,§ V. ADAMS,* AND E. R. B. MCCABE*,†

*Institute for Molecular Genetics, †Department of Pediatrics, ‡Molecular Biology Information Resource, Baylor College of Medicine, and §Department of Biochemical and Biophysical Sciences, University of Houston, Houston, Texas 77030

Received June 18, 1991

We have amplified and sequenced the complete coding region of bovine hexokinase isoenzyme 1 (HK1) from brain RNA with PCR primers selected for sequence conservation. The sequence information was analyzed to evaluate the evolutionary and structure-function relationships among the mammalian and yeast HK isoenzymes. Structure to function analysis identified an unduplicated, invariant N-terminal domain involved in HK1 outer mitochondrial membrane targeting, as well as putative carbohydrate and nucleotide-binding sites in the regulatory and catalytic halves of HK1 essential to enzyme function. The ATP-binding site in the catalytic half of the HK1 protein resembles nucleotide-binding regions from protein kinases, with the single amino acid replacement (lysine to glutamate) in the ATP-binding site of the amino half explaining the loss of HK1 catalytic function in the regulatory domain. Sequence comparisons suggest that the 50-kDa mammalian and yeast glucokinases arose separately in evolution. In addition to providing valuable phylogenetic and structure-function insights, this work provides an efficient strategy for rapid cloning and sequencing of the coding regions for other HKs and related proteins. © 1991 Academic Press, Inc.

INTRODUCTION

Hexokinase (HK; EC 2.7.1.1) catalyzes the first step in glucose metabolism, utilizing ATP for the phosphorylation of glucose to glucose-6-phosphate. In mammals there are four HK isozymes, which vary in their tissue distribution and kinetic properties (Katzen and Schimke, 1965). Each of the type 1-3 HK isozymes consists of a single polypeptide chain with a molecular weight of approximately 100 kDa and is inhibited by the product glucose-6-phosphate (Katzen and Schimke, 1965). The type 4 HK, or glucokinase, is similar to yeast HK, as it is approximately 50 kDa and insensitive to inhibition by glucose-6-phosphate (Grossbard and Schimke, 1966).

Several groups (Ureta, 1982; Vowles and Easterby, 1979; Holroyde and Trayer, 1976) have speculated that the mammalian HKs evolved through the duplication and fusion of an ancestral hexokinase, which resembled the yeast HKs and mammalian glucokinase in size. After the duplication event, one of the halves evolved to have a regulatory function, while the other half retained the catalytic functions of the ancestral enzyme. The new HK gene then underwent further duplications to produce the three 100-kDa HKs. It has also been suggested (Ureta, 1975) that the 50-kDa form (glucokinase) survived in mammals but was lost through a mutation event in certain vertebrate families, such as Aves and others.

The recent cloning of the rat (Schwab and Wilson, 1989) and human (Nishi et al., 1988) HK1 as well as rat HK2 (Thelen and Wilson, 1991) and HK3 (Schwab and Wilson, 1991) has provided evidence in support of this gene duplication-fusion hypothesis. Data from these cDNAs indicate that there is significant sequence identity between the mammalian forms, as well as strong similarity to the yeast HKs (Kopetzki et al., 1985; Frohlich et al., 1985). Rat glucokinase was also found to be highly similar to both the 3' catalytic half of HK1 and to the yeast isozymes (Andreone et al., 1989).

Here we report the cloning of the bovine brain HK1 using knowledge of the evolutionary conservation of HK amino acid and nucleotide sequence. This has allowed us to determine the overall alignment of the HK and glucokinase sequences and will facilitate the cloning and sequencing of HK and related genes in other species. We have identified regions corresponding to important functional domains in both halves of the HK cDNAs, lending further support for the evolutionary origin of mammalian HK 1-3 by duplication-fusion events. We discuss the amino acid sequence alterations in the regulatory and catalytic domains and their effects on functional properties of the enzyme. We construct a phylogenetic tree for the HK family and propose for the first time a novel evolutionary origin for mammalian glucokinase.

0888-7543/91 $3.00 Copyright © 1991 by Academic Press, Inc. All rights of reproduction in any form reserved.

[Figure 1: schematic of the bovine HK1 cDNA (5' UTR, coding region, 3' UTR) showing primer positions and orientations; graphic not reproduced.]

FIG. 1. HK1 cloning strategy. Four primer pairs were used to amplify the coding portion of the bovine HK1 cDNA. Percentages under each primer indicate the amino acid and nucleotide conservation between rat and human HK1s. Arrows represent the direction from which amplification occurs. Right arrows (→) are sense-strand primers. Left arrows (←) represent antisense primers. Untranslated regions (UTR) were generated with one nonspecific oligo(dT) primer [dT] and one specific primer near the end of the coding regions.

METHODS

Materials

Taq polymerase was obtained from Perkin-Elmer-Cetus. Reverse transcriptase (Moloney murine leukemia virus, MoMLV) was purchased from Pharmacia. Bovine brain was obtained from Texas A&M University Department of Animal Sciences and freeze clamped (Lowry and Passonneau, 1972) within 2 min of death using aluminum blocks cooled in liquid nitrogen. Sequenase (version 2.0) and dideoxynucleotides were obtained from United States Biochemicals.

Primer Design and Synthesis

The rat brain and human kidney HK1 cDNA sequences were compared to identify short (25-30 bp) regions of 100% sequence identity. Primers were designed to span 500- to 900-bp regions of the bovine HK1 cDNA, and were constructed as unique oligonucleotides. Four pairs of primers were used to amplify overlapping coding portions of bovine HK (Fig. 1). Primers corresponding to regions with nucleotide homology greater than 88% were considered acceptable for use based on previous results with primers of high degeneracy or mixed oligonucleotide primed amplification of cDNA (MOPAC; Griffin et al., 1989). Untranslated regions were amplified using one specific internal primer and a nonspecific primer for the other end (Frohman et al., 1988). The nonspecific primer consisted of a poly(dT) stretch with an attached linker. The linker contained three unique restriction enzyme sites: ClaI, XhoI, and SalI. The oligonucleotide mixtures were synthesized on an Applied Biosystems 380B or 392 oligonucleotide synthesizer.
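The search for primer sites described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two cDNA sequences are assumed to be already aligned to equal length, and the function names `identical_windows` and `percent_identity` are hypothetical, not tools used in this study. The first function reports runs of exact identity at least 25 positions long; the second scores a candidate region against the 88% acceptance threshold.

```python
def identical_windows(seq_a, seq_b, min_len=25):
    """Return half-open (start, end) spans where two aligned, equal-length
    sequences match exactly for at least min_len consecutive positions."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    spans = []
    run_start = None
    for i, (a, b) in enumerate(zip(seq_a.upper(), seq_b.upper())):
        match = a == b and a != "-"          # a gap never counts as a match
        if match and run_start is None:
            run_start = i                    # a new identical run begins
        elif not match and run_start is not None:
            if i - run_start >= min_len:
                spans.append((run_start, i))
            run_start = None
    # close a run that extends to the end of the alignment
    if run_start is not None and len(seq_a) - run_start >= min_len:
        spans.append((run_start, len(seq_a)))
    return spans


def percent_identity(region_a, region_b):
    """Percentage of identical positions between two equal-length regions."""
    if len(region_a) != len(region_b) or not region_a:
        raise ValueError("regions must be non-empty and of equal length")
    same = sum(a == b for a, b in zip(region_a.upper(), region_b.upper()))
    return 100.0 * same / len(region_a)


# Toy sequences (hypothetical, not HK1 data): one mismatch at position 30.
toy_a = "ACGT" * 10
toy_b = toy_a[:30] + "T" + toy_a[31:]
print(identical_windows(toy_a, toy_b))   # [(0, 30)]
print(percent_identity(toy_a, toy_b))    # 97.5
```

A region returned by `identical_windows` would correspond to a candidate primer site of 100% identity; regions that fall short could still be screened with `percent_identity` against the 88% cutoff.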

Template Preparation and PCR

Total RNA was isolated from bovine brain using the RNAzol method first described by Chomczynski and Sacchi (1987) and was then used for first-strand cDNA synthesis. First-strand cDNA for each reaction was generated from 1 µg of total RNA with MoMLV reverse transcriptase and either an oligo(dT) primer or the 3' primer of each primer pair (Nos. 91-30, 2758, 2761, 1160, or 2762). Each of the 3' primers corresponded to the antisense strand. In order to amplify the 5' untranslated region it was necessary to add a poly(dA) tail onto the first-strand cDNA using terminal transferase and dATP. The nonspecific primer could then be used in the amplification of this portion of the cDNA.

A total of 5 pmol of primer was added to 1 µg of total RNA in the presence of 20 U of an RNase inhibitor, RNasin. The mixture was heated to 95°C for 5 min to remove RNA secondary structure and immediately cooled on ice. To each first-strand reaction was then added 10 µl of 2 mM dNTPs, 2 µl of 10X PCR buffer (10X: 200 mM Tris, pH 8.3, 500 mM KCl, and 25 mM MgCl2), 20 U RNasin, 200 U of MoMLV reverse transcriptase, and DEPC-treated water to 20 µl. Each reaction was allowed to proceed for 1 h at the appropriate empirically determined temperature (37, 48, or 55°C). Due to instability of the reverse transcriptase at temperatures higher than 37°C, additional enzyme was added more frequently to those reactions requiring the two higher incubation temperatures. Each tube was then heated to 95°C to inactivate the enzyme. A total of 45 pmol of additional primer 1 (used in the first reaction), 50 pmol of the second primer, and 8 µl of reaction buffer were added to the appropriate reaction mixtures, which were then diluted to 100 µl total volume with water. Taq polymerase (2.5 U) was added after an initial 5-min incubation at 94°C. Thirty rounds of amplification were performed at the following initial conditions: 94°C for 30 s, 55°C for 30 s, and 72°C for 2 min using a Perkin-Elmer-Cetus Thermocycler.
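The final concentrations implied by these volumes can be checked with a short dilution calculation (C1·V1 = C2·V2). The sketch below rests on two assumptions that the text does not state explicitly: that the entire 20-µl first-strand reaction is carried into the 100-µl PCR, and that the 8 µl of "reaction buffer" is the same 10X PCR buffer. The helper name `final_conc` is hypothetical.

```python
def final_conc(stock_conc, added_volume_ul, total_volume_ul):
    """Dilution rule C1*V1 = C2*V2, solved for the final concentration C2."""
    return stock_conc * added_volume_ul / total_volume_ul


# Composition of the 10X PCR buffer as given in the text (mM).
BUFFER_10X_MM = {"Tris": 200.0, "KCl": 500.0, "MgCl2": 25.0}

# First-strand reaction (20 µl total).
rt_total_ul = 20.0
rt_buffer_x = final_conc(10.0, 2.0, rt_total_ul)    # 2 µl of 10X buffer
rt_dntp_mM = final_conc(2.0, 10.0, rt_total_ul)     # 10 µl of 2 mM dNTPs
rt_salts_mM = {name: conc * rt_buffer_x / 10.0
               for name, conc in BUFFER_10X_MM.items()}

# PCR (100 µl total). Assumption: the whole first-strand reaction is carried
# over, and the 8 µl of added reaction buffer is the same 10X buffer.
pcr_total_ul = 100.0
pcr_buffer_x = final_conc(10.0, 2.0 + 8.0, pcr_total_ul)
pcr_dntp_mM = final_conc(rt_dntp_mM, rt_total_ul, pcr_total_ul)
primer1_uM = (5.0 + 45.0) / pcr_total_ul    # pmol per µl equals µM
primer2_uM = 50.0 / pcr_total_ul

print(rt_buffer_x, rt_dntp_mM, rt_salts_mM)
print(pcr_buffer_x, pcr_dntp_mM, primer1_uM, primer2_uM)
```

Under these assumptions both reactions run at 1X buffer (20 mM Tris, 50 mM KCl, 2.5 mM MgCl2), the dNTPs fall from 1 mM in the first-strand reaction to 0.2 mM in the PCR, and each primer is present at 0.5 µM during amplification.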

Ten microliters of each reaction was analyzed on a 2% ME agarose gel in 1X TBE (90 mM Tris/64.6 mM boric acid/2.5 mM EDTA, pH 8.0). Those primer sets that generated multiple product bands were then amplified using higher annealing temperatures, while several failed initially to generate products of the correct size. Alteration of conditions for the first-strand cDNA synthesis, specifically, increased temperature for reverse transcription, eliminated the problem of incorrect size. Higher annealing temperatures for the PCR failed in most cases to reduce the number of products. Hybridization of these products to rat HK1 not only identified the correctly sized products as bovine HK, but also indicated that several of the others were related to HK. It was noted during these experiments that the crucial step for generating full-length products was production of the first-strand cDNA and not PCR.

Gels were denatured in 0.4 N NaOH and transferred to a Zetaprobe nylon membrane overnight in 0.4 N NaOH by Southern blotting (Maniatis et al., 1982). A rat HK1 cDNA was random hexamer labeled with [α-32P]dCTP (Amersham) and Klenow fragment of DNA polymerase (Feinberg and Vogelstein, 1983). Filters were prehybridized (Church and Gilbert, 1984), then hybridized in Church buffer at 68°C with 2 × 10^5 cpm/ml of labeled probe for 12-18 h, and washed in 0.4 M sodium phosphate, 0.1% SDS, at 65°C for 1 h, followed by 0.1X SSC, 0.1% SDS at 75°C for 15 min. Filters were exposed to Kodak X-Omat film at room temperature. The filter was then stripped and rehybridized to a 45-bp oligonucleotide primer, which corresponded to the internal region of the original MOPAC product (Griffin et al., 1989), under the same conditions. Appropriate products were reamplified for use as template for sequencing or as template for asymmetric PCR.

DNA Sequencing

Products of the correct size were sequenced directly after reamplification using asymmetric PCR to decrease the problem of reading errors when amplified products are subcloned and sequenced. Using the primer pairs designated above, reaction mixtures containing 50 pmol of one primer and 1 pmol of the other were amplified and sequenced by the asymmetric PCR method (Gyllensten and Erlich, 1988) or using the direct method of Kusukawa et al. (1990). Reaction mixtures were as described above (10X buffer: 500 mM KCl, 100 mM Tris, pH 8.3, 15 mM MgCl2). Reaction mixtures were applied after PCR to a Centricon-30 microconcentrator (Amicon) to remove the excess dNTPs and buffer components. The columns were washed three times with water and spun at 4000g. Retentates of 35 µl were collected for each; 3-7 µl (1 pmol) of each was added to 10 pmol of the appropriate primer in 1X Sequenase buffer. Annealing of the single strand was performed at 65°C for 15 min, followed by room temperature incubation for 15 min. The sequencing reactions then proceeded as per manufacturer's specifications (United States Biochemicals).

Sequence Alignment and Phylogenetic Analysis

N- and C-terminal halves of the vertebrate HKs were analyzed independently. The boundaries of duplication were defined by dot-matrix homology plot; nonduplicated portions were trimmed away. The separate homologous regions of the vertebrate HKs were aligned as monomeric units. Protein sequences were aligned using the PIMA (Pattern Induced Multiple Alignment) program (Smith and Smith, 1990). The aligned protein sequences were used as a guide to align the corresponding DNA sequences using the program PIMA-to-paup. Phylogenetic trees were constructed based on the aligned DNA sequences. Programs in PHYLIP (Phylogeny Inference Package version 3.3; J. Felsenstein, Department of Genetics, University of Washington, Seattle, WA) were used for creating phylogenetic trees. The program DNADIST generated a matrix of genetic distances for all pairwise combinations of nucleotide sequences in the alignment. Distances were corrected for nucleotide reversions by the two-parameter model of Kimura (1981). The distance matrix was analyzed by the program KITSCH to construct the phylogenetic tree by the Fitch-Margoliash least-squares method (Fitch and Margoliash, 1969).
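The two kinds of pairwise values used in this analysis, the uncorrected percentage difference and a distance corrected by a two-parameter model, can be illustrated as follows. This is a minimal sketch using the textbook closed form of the Kimura two-parameter distance, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the observed proportions of transitions and transversions. It is not the DNADIST source code, whose implementation details may differ, and the function names are hypothetical.

```python
import math

PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def count_changes(seq_a, seq_b):
    """Count compared sites, transitions, and transversions in two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    sites = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue                      # skip gaps and ambiguity codes
        sites += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1              # A<->G or C<->T
        else:
            transversions += 1            # purine<->pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    return sites, transitions, transversions


def percent_difference(seq_a, seq_b):
    """Uncorrected nucleotide changes per 100 compared bases."""
    sites, ts, tv = count_changes(seq_a, seq_b)
    return 100.0 * (ts + tv) / sites


def k2p_distance(seq_a, seq_b):
    """Kimura two-parameter distance (closed form). math.log or math.sqrt
    raises ValueError when the sequences are too diverged for the formula."""
    sites, ts, tv = count_changes(seq_a, seq_b)
    p = ts / sites
    q = tv / sites
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy example (hypothetical sequences): a single A->G transition in 10 sites.
toy_a = "ACGTACGTAC"
toy_b = "GCGTACGTAC"
print(percent_difference(toy_a, toy_b))       # 10.0
print(round(k2p_distance(toy_a, toy_b), 4))   # 0.1116
```

The corrected distance exceeds the raw proportion of differences (0.1116 versus 0.10 here) because the model allows for multiple and back substitutions at the same site, and the gap widens as sequences diverge.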

RESULTS

Six fragments, which represented the complete coding region of the bovine HK1 cDNA, were successfully amplified by PCR from bovine brain RNA. The sequence from these overlapping PCR fragments was compared to all available eukaryotic HK sequences at both the nucleotide and amino acid levels, and the results are shown in Table 1. The sequence similarity ranges from approximately 90% among the other mammalian HK1s to 49% when the bovine HK1 coding sequence is compared to rat HK3. The higher percentage similarity between the bovine sequence and the other sequences that have been designated as HK1 confirms their collective identity as HK1. As in other mammalian HKs, the bovine HK amino half is a duplication of the carboxy half. The two halves of the coding sequence from the bovine cDNA were compared and found to be 58% conserved at the nucleotide level and 49% at the amino acid level.

TABLE 1

Homologies^a and Phylogenetic Distances^b

             mhk1c   rhk1c   hhk1c   bhk1c   rgk1    mhk1n   rhk1n   hhk1n   bhk1n   ysha    yshb    yshc
mushk1-c^c   0.0000  0.0696  0.1418  0.1511  0.6799  0.7184  0.7139  0.6914  0.7592  1.3207  1.3653  1.4064
rathk1-c^d   6.5     0.0000  0.1275  0.1472  0.6673  0.7132  0.7071  0.6899  0.7555  1.2773  1.3068  1.3242
humhk1-c^e   13.4    13.1    0.0000  0.0923  0.6876  0.7175  0.7079  0.6853  0.7686  1.3066  1.3892  1.3676
bovhk1-c     12.5    11.4    8.5     0.0000  0.6580  0.7163  0.7234  0.6721  0.7573  1.2987  1.3991  1.3451
ratgk-1^f    40.2    39.6    39.3    40.3    0.0000  0.7313  0.7362  0.7332  0.7892  1.2964  1.4278  1.7016
mushk1-n     41.6    41.3    41.2    41.3    41.4    0.0000  0.0599  0.1434  0.1915  1.4061  1.4276  1.6364
rathk1-n     41.3    41.0    41.2    40.9    41.7    6.2     0.0000  0.1453  0.1815  1.3683  1.4467  1.5879
humhk1-n     40.7    40.9    39.7    39.7    41.8    13.1    13.3    0.0000  0.1278  1.4246  1.4346  1.5629
bovhk1-n     42.8    41.2    42.1    42.1    43.0    16.9    16.1    11.7    0.0000  1.5108  1.4964  1.6983
yschka^g     53.5    53.4    53.6    53.4    54.8    56.8    56.5    56.8    57.7    0.0000  0.2856  1.0118
yschkb^h     54.4    54.0    55.2    55.4    56.2    57.5    57.9    57.2    57.5    23.6    0.0000  1.0602
yschkc^i     54.4    53.5    54.1    53.5    57.6    57.4    56.9    56.8    59.1    49.1    50.3    0.0000

^a Numbers below the diagonal indicate the number of nucleotide changes per 100 bases (the percentage difference between the compared sequences).

^b Numbers above the diagonal represent the corrected "distance" between the compared sequences, and are the raw data on which the phylogenetic tree is based.

^c GenBank accession No. J05277 (Ref. (4)). ^d Ref. (37). ^e Ref. (35). ^f GenBank accession No. J04218 (Ref. (3)). ^g GenBank accession No. X03482 (Ref. (30)). ^h GenBank accession No. X03483 (Ref. (13)). ^i Ref. (1).

All available HK sequences were aligned using the PIMA and PIMA-to-paup programs (Smith and Smith, 1990). Several regions of greater similarity were noted within the aligned sequences (Fig. 2), including the ATP-binding (box B) and glucose-binding (box C) sites. Boxes D and E in Fig. 2 represent other residues that are potentially involved in ATP binding. In addition, we noted that a 15-amino-acid N-terminal domain (box A) was absolutely conserved throughout the mammalian HK1s at the amino acid level and 92-96% conserved at the nucleotide level. Five amino acids at positions 178, 231, 232, 303, and 338 in the aligned sequences (marked by a +), which are involved in glucose binding in yeast (Bennett and Steitz, 1980), are conserved in all existing HKs and are present in both the amino and carboxy halves of the mammalian HKs.

We were able to identify the regions thought to be involved in ATP and glucose binding in both halves of the protein, although the sequences were slightly altered in the amino half. These regions are illustrated in Figs. 3 and 4. Core regions of conservation were noted for the substrate- and nucleotide-binding sites. Consensus sequences derived for each site were compared to all peptide sequences in the PIR protein database (George et al., 1986) and failed to recognize any other carbohydrate kinase. We also found that two amino acid positions in the substrate-binding site and seven amino acid positions in the nucleotide-binding site have residues that are present in the catalytic sites but not in the regulatory (amino half) sites. A lysine residue present 10-13 residues downstream of the ATP-binding site (box B) of the catalytic half of bovine HK1 was noted to be altered to a glutamate in this site in the amino half.

The evolutionary relationships between the HKs were explored through the construction of a phylogenetic tree (Fig. 5). Nonduplicated regions, including sequences that encode the N-terminal outer mitochondrial membrane-binding domain, were omitted to achieve the best alignment. We used the KITSCH algorithm of Felsenstein to produce a rooted tree with contemporaneous tips, for clarity. The sequences fell into three clusters: yeast, N-terminal, and C-terminal. The yeast HKs were most diverged from the others, while the C-terminal sequences were most conserved as indicated by the depth of the branch bifurcations within each group. The rat glucokinase was most closely related to the C-terminal halves of the HKs, whereas the yeast glucokinase was most closely related to the yeast HKs. This observation suggests that the duplication leading to rat glucokinase was a separate (and more recent) event from the duplication leading to the yeast glucokinase, implying that the glucokinase activities arose twice in evolution.

[Figure 2: multiple alignment of mushk1-n, rathk1-n, humhk1-n, bovhk1-n, mushk1-c, rathk1-c, humhk1-c, bovhk1-c, ratgk-1, yschka, yschkb, and yschkc, with boxes A-E marked; alignment not reproduced.]

[Figure 3: aligned substrate-binding-site sequences for bovine, rat, human, and mouse HK1-n; bovine, rat, and human HK1-c; rat HK4; yeast HK PI; yeast HK PII; and yeast Glk, with a consensus line; graphic not reproduced.]

FIG. 3. Substrate-binding site of the eukaryotic hexokinases. Both regulatory and catalytic halves of the HK1s as well as rat HK4 (glucokinase) and the yeast hexokinases are shown. Residues that are conserved in two-thirds of the sequences are boxed. Two positions that have an amino acid unique to the catalytic site are marked with a +. A consensus sequence is indicated. Asterisks (*) indicate positions with conservative or semiconservative amino acid changes.
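The two-thirds conservation rule used to box residues in Fig. 3 can be expressed as a simple column-wise vote. The sketch below is illustrative only: the function name `consensus` is hypothetical, the toy peptides are invented around the GFTFSF motif and are not the figure's sequences, and gaps are counted against conservation, which is an assumption about how the rule was applied.

```python
from collections import Counter


def consensus(aligned, threshold=2 / 3, gap="-"):
    """Return a consensus string for equal-length aligned sequences.
    A column contributes its most common residue when that residue occurs in
    at least `threshold` of all sequences; otherwise it contributes '.'."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("sequences must be aligned to equal length")
    out = []
    for column in zip(*aligned):
        residues = [r for r in column if r != gap]
        if not residues:
            out.append(".")               # all-gap column
            continue
        residue, count = Counter(residues).most_common(1)[0]
        # gaps stay in the denominator, so they lower the conserved fraction
        out.append(residue if count / len(column) >= threshold else ".")
    return "".join(out)


# Toy aligned peptides (hypothetical, for illustration only).
toy = ["GFTFSF", "GFTFSW", "GFSFSY"]
print(consensus(toy))   # GFTFS.
```

The third column is retained because T occurs in two of three sequences, while the last column, with three different residues, falls below the threshold and is reported as unconserved.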

DISCUSSION

We have shown that the bovine brain HK1 cDNA exhibits significant homology both to other mammalian HKs and to the yeast 50-kDa HKs. Our sequence data, along with that from other groups (Schwab and Wilson, 1989; Nishi et al., 1988), support the earlier hypothesis that HK1 arose as a result of a gene duplication and fusion of an ancestral HK protoenzyme (Ureta, 1982). This was followed by functional divergence of the duplicated halves, creating the regulatory and catalytic domains of HK1 (Fig. 6). Comparisons of the bovine coding sequence to that of other HKs allow us not only to identify the functional binding domains of HK1 but also to pinpoint key residues within those domains. We have shown that essential domains are conserved in each of the duplicated halves of HK.

Alignment of the eukaryotic HKs allowed for both the identification of highly conserved coding regions (discussed below) and the generation of a phylogenetic tree. Several interesting conclusions could be drawn from the tree. Most notably, it appears that the 50-kDa mammalian and yeast glucokinases do not have a direct common ancestor, suggesting that the gene encoding glucokinase (HK4) arose at least twice during evolution. We determined that the mammalian glucokinase (HK4) may have arisen after the duplication-fusion-reduction event that created the mammalian HKs, by loss of the N-terminal half, suggesting that it was a more recent evolutionary event. This would perhaps explain why some nonmammalian vertebrate species lack a glucokinase isozyme (Ureta, 1975). In addition, we noted that a much more ancient event gave rise to the yeast glucokinase isozyme, presumably from the HK protoenzyme. Consistent with the duplication-fusion-reduction hypothesis is the observation that the rat glucokinase enzyme is more closely related to the C-terminal half of mammalian HK1 than to the N-terminal half. It has been noted that in rat liver, glucokinase interacts with a

FIG. 2. Alignment of the eukaryotic hexokinases. Available hexokinases, including the hepatic form of glucokinase (ratgk1), were aligned. Residues that are conserved in all existing hexokinases are offset. Residues of unknown function are located outside the boxed regions. Amino acid positions thought to be glucose contact points are marked by a •. An asterisk (*) marks the position that corresponds to a critical lysine residue (18) in the ATP-binding site of the protein kinases. Box A, outer mitochondrial membrane-binding domain; B, ATP-binding site; C, glucose-binding site. Boxes D and E represent alternate ATP-binding sequences (G-X-G-X-X-G/A), which may function in the binding process (Refs. (22,23,35)). Right-angle arrows demarcate portions of the sequence used in phylogenetic analysis. Numbers to the right indicate the position within the sequence; numbers on top indicate the position within the alignment. Species and sequence identification are to the left. Top four sequences (ending in “-n”) are N-terminal halves; those ending in “-c” are the corresponding C-terminal sequences. The N-terminal halves are thought to have regulatory function (Ref. (48)). All other sequences possess catalytic activity. yschka, yeast HK PI; yschkb, yeast HK PII; yschkc, glucokinase; mushk1, mouse HK1.


[Fig. 4 alignment panel. Rows: bovine, rat, human, and mouse HK1-n; bovine, rat, human, and mouse HK1-c; rat HK4; cAMP-dependent and cGMP-dependent protein kinases.]

FIG. 4. Nucleotide-binding site of the mammalian hexokinases. Regulatory (“-n”) and catalytic (“-c”) domain sequences are compared to the ATP-binding site sequences of the bovine cAMP-dependent and cGMP-dependent protein kinases. Residues that are conserved are boxed. Positions that have an amino acid unique to the HK1 catalytic site are marked with a +. A lysine residue, thought to play a key role in the phosphoryl transfer of the γ-phosphate of ATP, is present 11-13 residues downstream of the main binding site of the protein kinases, rat HK4, and the catalytic domain of the HK1s, but not in the regulatory half.

regulatory protein (Vandercammen and Van Schaftingen, 1990) and it has been postulated that this regulatory protein actually corresponds to the N-terminal regulatory half of HK (Schwab and Wilson, 1991). Taken together, this information suggests that glucokinase arose after the evolution of the N-terminal half of the HKs into a regulatory domain and that, after the reduction event, the regulatory domain may have been retained as an independent gene product (Fig. 6).

[Fig. 5 tree. Leaf order, top to bottom: yschkb, yschka, yschkc, bovhk1-n, humhk1-n, rathk1-n, mushk1-n, ratgk-1, humhk1-c, bovhk1-c, rathk1-c, mushk1-c.]

FIG. 5. Phylogenetic tree for the hexokinases. A rooted tree was produced using the KITSCH algorithm of Felsenstein (Phylogeny Inference Package version 3.3, University of Washington, Seattle). The tree was constructed for the mammalian HK1s, mammalian glucokinase, and all yeast hexokinases. Those sequences ending in “-n” are N-terminal halves; those ending in “-c” are the corresponding C-terminal sequences. yschka, yeast HK PI; yschkb, yeast HK PII; yschkc, glucokinase; mushk1, mouse HK1; bovhk1, bovine HK1; humhk1, human HK1; ratgk-1, hepatic form of rat HK4 (glucokinase).
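As a rough illustration of how a rooted tree can be derived from aligned sequences, the following Python sketch computes pairwise p-distances and clusters them by UPGMA. This is not the KITSCH procedure used for Fig. 5; UPGMA is substituted here only because it is short and likewise yields a rooted, clock-like tree. The function names and the four toy sequences are hypothetical placeholders and are not hexokinase data.

```python
from itertools import combinations

def p_distance(a, b):
    """Fraction of differing positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Ignore columns where either sequence has a gap.
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable positions")
    return sum(x != y for x, y in pairs) / len(pairs)

def upgma(seqs):
    """Return a nested-tuple tree built by average-linkage (UPGMA) clustering."""
    # Each cluster key maps to the list of leaf names it contains.
    clusters = {name: [name] for name in seqs}
    leaf_dist = {frozenset(p): p_distance(seqs[p[0]], seqs[p[1]])
                 for p in combinations(seqs, 2)}

    def avg(c1, c2):
        # Mean of all leaf-to-leaf distances between the two clusters.
        total = sum(leaf_dist[frozenset((i, j))]
                    for i in clusters[c1] for j in clusters[c2])
        return total / (len(clusters[c1]) * len(clusters[c2]))

    while len(clusters) > 1:
        # Merge the closest pair of clusters into a new internal node.
        c1, c2 = min(combinations(clusters, 2), key=lambda p: avg(*p))
        members = clusters.pop(c1) + clusters.pop(c2)
        clusters[(c1, c2)] = members
    return next(iter(clusters))

# Hypothetical toy alignment (illustrative only).
toy = {
    "seqA": "AAAAAAAA",
    "seqB": "AAAAAAAT",
    "seqC": "AATTTTTT",
    "seqD": "AATTTTAA",
}
tree = upgma(toy)
```

With these toy inputs, seqA and seqB are joined first, then seqC and seqD, and the two pairs are joined at the root. A least-squares method such as KITSCH can give a different topology when the distances depart from a molecular clock.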

Putative ATP- and glucose-binding domains were identified in the carboxy-terminal (catalytic) half of bovine HK1 by alignment and comparison to other HKs and other proteins that possess ATP-binding regions. Inspection of the corresponding regions in the amino (regulatory) half of the bovine protein showed that these same domains were present. The substrate-binding site was slightly more divergent in the regulatory half, but a core of conserved sequences could be identified. Our results indicate that these alterations in nonessential residues in the primary structure may allow for changes in specificities, although we cannot rule out changes in other residues involved in the secondary and tertiary structures that may alter the affinities of these domains.

[Fig. 6 diagram. Labels: HK protoenzyme (cat.); gene duplication and fusion; yeast HK PI, PII, and yeast glucokinase; Alt. 1; mammalian HK4 (glucokinase); deletion of 5’ half of gene to create glucokinase; isozyme evolution.]

FIG. 6. Evolution of the eukaryotic hexokinases. An evolutionary model is proposed for the eukaryotic hexokinases based upon phylogenetic analyses as well as structure-function considerations. The yeast isozymes evolve separately from the events that will give rise to the vertebrate and mammalian isozymes. The 100-kDa mammalian hexokinases would have been created from a gene duplication and fusion event, followed by evolution of one of the halves into a regulatory domain. The mammalian glucokinase may have arisen either before (alt. 1) or after (alt. 2) the evolution of the regulatory domain through loss of the 5’ half of the gene. Exon recruitment may be responsible for the addition of an outer mitochondrial membrane porin binding domain (PBD) to HK1 and HK2.

The binding site in bovine HK1 is very similar to the putative ATP-binding domains of several protein kinases (Zoller et al., 1981; Hashimoto et al., 1982; Hanks et al., 1988), oncogenes such as c-myc (Hanks et al., 1988) and c-src (Kamps et al., 1984), and several other ATPases. There have been other nucleotide binding sequences reported in proteins such as ATP synthase (Walker et al., 1982), adenylate kinase (Fry et al., 1986), and F1 ATPase (Cross et al., 1987). Notably, the mammalian and yeast HKs possess the former, but not the latter types of binding sequences. In addition, this HK1 site does not have recognizable similarity to other carbohydrate kinases. A portion of another ATP-binding domain of a bovine 70-kDa heat shock cognate protein (Flaherty et al., 1990) has also been noted to be conserved in the HKs. We find the protein kinase ATP-binding core sequence in all HKs examined, including rat and yeast glucokinase (Andreone et al., 1989; Albig and Entian, 1988). Of note, all of the catalytically functional ATP-binding domains of both the protein kinases and HK have an invariant lysine that is found 10-15 residues downstream (i.e., toward the C-terminus) of the core sequence (Hanks et al., 1988). It is also intriguing that at least one of the yeast HK isozymes (HK PII) has been noted to have protein kinase activity (Herrero et al., 1989), and recently we have been able to show that rat HK1 possesses not only protein kinase activity but also the ability to autophosphorylate (Adams et al., manuscript submitted).

It has been postulated that this conserved lysine residue is responsible for the interaction with, and phosphoryl transfer of, the γ-phosphate of the ATP molecule (Kamps and Sefton, 1986). In yeast HK, this lysine was shown to interact with both trinitrophenyl ATP (Arora et al., 1990b) and PLP-AMP (pyridoxal 5’-diphospho-5’-adenosine) (Tamura et al., 1988), a potent inhibitor of yeast HK PII, suggesting an essential role for this residue in HK catalysis. Mutation of this lysine in oncogenes results in loss of protein kinase activity (Snyder et al., 1985; Hannink and Donoghue, 1985). It is known from crystallographic data that this lysine is positioned on the surface of the small lobe of HK where it can rotate into the active site (Shoham and Steitz, 1980). It is interesting that in the putative ATP-binding domain in the regulatory half, this lysine is replaced by a glutamate residue in bovine, rat, human, and mouse HK. This nonconservative amino acid change may represent the structural basis for the loss of catalytic function at this site, while maintaining its binding properties for adenine nucleotides.
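One simple way to see why a lysine-to-glutamate replacement is regarded as nonconservative is to compare side-chain charges. The Python sketch below uses approximate textbook charges at physiological pH; the table and the function names are illustrative assumptions, and histidine is treated as neutral for simplicity.

```python
# Approximate side-chain charge at physiological pH (textbook values;
# histidine is treated as neutral here, which is a simplification).
SIDE_CHAIN_CHARGE = {"K": +1, "R": +1, "D": -1, "E": -1}

def charge_change(original, replacement):
    """Net change in side-chain charge for a one-letter-code substitution."""
    return SIDE_CHAIN_CHARGE.get(replacement, 0) - SIDE_CHAIN_CHARGE.get(original, 0)

def is_charge_reversal(original, replacement):
    """True when a positive side chain becomes negative, or vice versa."""
    before = SIDE_CHAIN_CHARGE.get(original, 0)
    after = SIDE_CHAIN_CHARGE.get(replacement, 0)
    return before * after < 0

lys_to_glu = charge_change("K", "E")
```

Under these assumptions, lysine to glutamate changes the local charge by two units and counts as a charge reversal, whereas lysine to arginine does not. Charge is only one criterion; it says nothing about side-chain size or geometry.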

Two groups (Schwab and Wilson, 1989; Nishi et al., 1988) have proposed an alternate ATP-binding site, while a third was proposed by Herrero et al. (1989) (boxes D and E, Fig. 2). Each of these sites does possess the consensus sequence Gly-X-Gly-X-X-Gly(Ala) found in the protein kinase domains, but we expect that neither site will represent the correct binding domain because both lack the critical lysine residue. However, as these sites are conserved in both halves of bovine HK1, it may be that they do function in nucleotide binding. The ATP-binding domain that we identify does lack this glycine cluster, which may represent an “elbow” that surrounds and contacts the ribose moiety of the molecule (Sternberg and Taylor, 1984). Differences in substrate specificity and function between HK and the protein kinases may necessitate the placement of this glycine cluster in a different spatial position relative to the active site to preserve proper HK function.
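The screening logic described here (find the Gly-X-Gly-X-X-Gly(Ala) consensus, then ask whether a lysine lies 10-15 residues downstream) can be sketched in Python as follows. The function name, the way the gap is counted (from the last residue of the motif), and the test strings are assumptions made for illustration; the strings are not hexokinase sequences.

```python
import re

# Gly-X-Gly-X-X-Gly(Ala) in one-letter code; the lookahead allows
# overlapping matches to be reported.
GLYCINE_CLUSTER = re.compile(r"(?=(G.G..[GA]))")

def glycine_clusters(seq, min_gap=10, max_gap=15):
    """List (start, has_lysine) for every glycine-cluster match in seq.

    start is the 0-based index of the first glycine. has_lysine is True when
    a K occurs min_gap..max_gap residues past the last residue of the motif
    (this way of counting the gap is an assumption of the sketch).
    """
    hits = []
    for m in GLYCINE_CLUSTER.finditer(seq):
        start = m.start()
        end = start + 6  # index just past the six-residue motif
        window = seq[end + min_gap - 1 : end + max_gap]
        hits.append((start, "K" in window))
    return hits

# Hypothetical test strings (illustrative only).
with_lysine = "AA" + "GAGAAG" + "A" * 11 + "K" + "AAA"
without_lysine = "GAGAAG" + "A" * 20 + "K"
```

In the first string the lysine sits 12 residues past the motif and is reported; in the second it sits 21 residues away and is not. A site with the glycine cluster but no such lysine corresponds to the situation described for boxes D and E.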

Certain residues (alignment positions Ser-178, Asn-231, Asp-232, Glu-303, and Glu-338) that were shown to be catalytically important for glucose binding in yeast HK PI and PII by X-ray crystallography are present in bovine HK1 (Bennett and Steitz, 1980; Anderson et al., 1979; Harrison, 1985). These residues were shown to be present in the 5’ regulatory half as well. One of these residues, Ser-178, is positioned within the putative core glucose-binding sequence Gly-Phe-Thr-Phe-Ser-(Phe/Tyr)-Pro-Cys (bovine HK1 AAs G-599 to C-606; alignment Nos. 174-181). This serine presumably is involved in hydrogen bonding to the 6-hydroxyl group of glucose, an interaction that is critical for proper conformational changes of the active site, which include rotation of the smaller lobe toward the larger lobe (Anderson et al., 1979). In addition, an important proline residue (Pro-172, alignment) is present in all HKs except yeast glucokinase. This core site was shown by Schirch and Wilson (1987) to be included in a peptide that is labeled by the glucose analog N-(bromoacetyl)-D-glucosamine, identifying it as the active site.
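The relationship between the core motif and the numbering of its serine can be checked with a short Python sketch. The regular expression encodes Gly-Phe-Thr-Phe-Ser-(Phe/Tyr)-Pro-Cys in one-letter code; the function name and the flanking residues in the example are hypothetical.

```python
import re

# Gly-Phe-Thr-Phe-Ser-(Phe/Tyr)-Pro-Cys in one-letter code.
GLUCOSE_CORE = re.compile(r"GFTFS[FY]PC")

def core_serine_position(seq, first_residue_number=1):
    """Residue number of the serine in the first core-motif match, or None.

    first_residue_number is the number assigned to seq[0], so the same
    function works for sequence numbering and for alignment numbering.
    """
    m = GLUCOSE_CORE.search(seq)
    if m is None:
        return None
    # The serine is the fifth residue of the motif (offset 4 from its start).
    return first_residue_number + m.start() + 4

# Motif alone, numbered from alignment position 174.
alignment_serine = core_serine_position("GFTFSYPC", first_residue_number=174)
# Motif with hypothetical flanking residues, numbered from 1.
embedded_serine = core_serine_position("MA" + "GFTFSFPC" + "LL")
```

With the motif starting at alignment position 174, the function returns 178, in agreement with the Ser-178 numbering above. Applying the same offset to G-599 would place the serine at residue 603 in the bovine numbering; this is an arithmetic inference rather than a value stated in the text.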


Protein studies (Nemat-Gorgani and Wilson, 1986; White and Wilson, 1987) have provided evidence for the existence of binding sites for glucose and glucose-6-phosphate in both halves of the protein. This evidence is supported by studies (White and Wilson, 1989) that indicate that each half, when removed from the other, had the ability to bind to both substrate and inhibitor. These studies have failed to show, however, that the binding domains are separate entities. In fact, the sequence comparisons to other HKs that we report here suggest the presence of only one substrate domain per half, indicating that glucose-6-phosphate binds to the altered substrate site in the amino half. We predict that alterations in certain residues of this domain may allow glucose-6-phosphate to bind with a higher affinity to the site in the amino half and glucose to bind the site in the carboxy half in the native protein.

The final domain examined was the putative outer mitochondrial membrane-binding domain. This amino-terminal 15-amino-acid sequence is predicted, in mammals, to be an α-helical structure based on secondary structure analysis. This mammalian α-helical domain is longer than the N-terminal domain of yeast that is predicted to project out from the rest of the enzyme (Anderson et al., 1979). Sequence comparison between bovine, rat, human, and mouse HK1 indicates that these N-terminal 15 amino acids are 100% conserved. Such absolute conservation of this region suggests that it serves an essential function. We have used this information to develop a reporter gene construct coding for a chimeric protein consisting of the N-terminal HK1 15 amino acids coupled to chloramphenicol acetyltransferase (CAT) and have shown that this HK1 domain is necessary and sufficient for HK1 binding to porin (Gelb et al., manuscript submitted).
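A statement such as "the N-terminal 15 amino acids are 100% conserved" corresponds to a simple column-by-column comparison of the aligned sequences. The Python sketch below shows one way to compute it; the function name and the placeholder sequences are hypothetical, and the placeholders are not the HK1 N-terminal sequence.

```python
def percent_identity(seqs, start=0, length=None):
    """Percentage of columns in seqs[start:start+length] where all sequences agree."""
    seqs = list(seqs)
    if not seqs:
        raise ValueError("need at least one sequence")
    end = None if length is None else start + length
    region = [s[start:end] for s in seqs]
    if len({len(r) for r in region}) != 1 or not region[0]:
        raise ValueError("regions must be non-empty and of equal length")
    # A column is invariant when it contains exactly one distinct residue.
    invariant = sum(len(set(col)) == 1 for col in zip(*region))
    return 100.0 * invariant / len(region[0])

# Hypothetical placeholder sequences: identical over the first 15 positions,
# divergent afterwards.
placeholders = [
    "ACDEFGHIKLMNPQR" + "STVWY",
    "ACDEFGHIKLMNPQR" + "TTVWY",
    "ACDEFGHIKLMNPQR" + "STVWA",
    "ACDEFGHIKLMNPQR" + "STAWY",
]
n_terminal = percent_identity(placeholders, start=0, length=15)
full_length = percent_identity(placeholders)
```

For these placeholders the first 15 columns score 100%, while the full 20-column alignment scores lower because three of the last five columns vary.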

The family of HK enzymes is responsible for the phosphorylation of glucose, a key control point in carbohydrate metabolism. Since each HK isoenzyme exhibits different tissue distributions and kinetic properties, knowledge of the molecular basis for these properties would be helpful in understanding these enzymes and their detailed cellular roles. Our ability to study HK at a molecular level, and to understand how structural features affect function of the enzyme, has been acquired very recently with the cloning of various HK cDNAs, including bovine brain HK1. Alignment and comparisons of these sequences support the theory of Ureta and other groups that the mammalian HKs arose from the duplication and fusion of an ancestral protoenzyme and suggest that the yeast and mammalian glucokinases arose twice in evolution. We have described a method for cloning the cDNA for a low-abundance protein using knowledge of the evolutionary conservation of amino acid and nucleotide sequence. This not only permits the isolation of a specific cDNA for a protein that shares common features with many others, it also allows for the rapid identification and sequencing of HKs from other species. In addition, identification of core binding domain sequences will allow for the use of site-directed mutagenesis, and/or reporter gene constructs, to alter key residues and determine the effects of these changes upon HK function.
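Selecting primers on the basis of sequence conservation amounts to locating stretches of an alignment in which every sequence carries the same nucleotide. The Python sketch below illustrates that search only; it is not the primer-selection procedure used in this work, and the function name, window width, and toy nucleotide strings are hypothetical.

```python
def conserved_windows(aligned, width):
    """0-based start indices of gap-free windows in which all sequences are identical."""
    aligned = list(aligned)
    if not aligned:
        raise ValueError("need at least one sequence")
    n = len(aligned[0])
    if any(len(s) != n for s in aligned):
        raise ValueError("sequences must be aligned to the same length")
    # True where the column holds a single nucleotide and no gap character.
    ok = [len(set(col)) == 1 and col[0] != "-" for col in zip(*aligned)]
    starts = []
    run = 0  # length of the current run of conserved columns
    for i, flag in enumerate(ok):
        run = run + 1 if flag else 0
        if run >= width:
            starts.append(i - width + 1)
    return starts

# Hypothetical toy alignment with a single mismatch at index 4.
toy_alignment = ["ATGCCGTA", "ATGCAGTA"]
candidate_starts = conserved_windows(toy_alignment, width=3)
```

Candidate windows found in this way would still need to be checked for the usual primer properties (length, melting temperature, self-complementarity), which this sketch does not address.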

ACKNOWLEDGMENTS

This work was supported in part by a March of Dimes Birth Defects Foundation Predoctoral Fellowship (18-88-18) to L.D.G. and by grants (R01 HD22563 to E.R.B. McCabe; P30 HD24064, Baylor Mental Retardation Research Center; and P30 HD27823, Baylor Child Health Research Center) from the National Institute of Child Health and Human Development, National Institutes of Health.

REFERENCES

1. ALBIG, W., AND ENTIAN, K-D. (1988). Structure of yeast glucokinase, a strongly diverged specific aldo-hexose-phosphorylating isoenzyme. Gene 73: 141-152.

2. ANDERSON, C. M., ZUCKER, F. H., AND STEITZ, T. A. (1979). Space-filling models of kinase clefts and conformation changes: Comparison of the surface structures of kinase enzymes implicates closing clefts in their mechanism. Science 204: 375-380.

3. ANDREONE, T. L., PRINTZ, R. L., PILKIS, S. J., MAGNUSON, M. A., AND GRANNER, D. K. (1989). The amino acid sequence of rat liver glucokinase deduced from cDNA. J. Biol. Chem. 264: 363-369.

4. ARORA, K. K., FANCIULLI, M., AND PEDERSEN, P. L. (1990a). Glucose phosphorylation in tumor cells: Cloning, sequencing, and overexpression in active form of a full length cDNA encoding a mitochondrial bindable form of tumor hexokinase. J. Biol. Chem. 265: 6481-6488.

5. ARORA, K. K., SHENBAGAMURTHI, P., FANCIULLI, M., AND PEDERSEN, P. L. (1990b). Glucose phosphorylation. Interaction of a 50-amino acid peptide of yeast hexokinase with trinitrophenyl ATP. J. Biol. Chem. 265: 5324-5328.

6. BENNETT, W. S., AND STEITZ, T. A. (1980). Structure of a complex between yeast hexokinase A and glucose. II. Detailed comparisons of conformation and active site configuration with the native hexokinase B monomer and dimer. J. Mol. Biol. 140: 211-230.

7. CHOMCZYNSKI, P., AND SACCHI, N. (1987). Single step method of RNA isolation by acid guanidinium thiocyanate-phenol-chloroform extraction. Anal. Biochem. 162: 156-159.

8. CHURCH, G. M., AND GILBERT, W. (1984). Genomic sequencing. Proc. Natl. Acad. Sci. USA 81: 1991-1995.

9. CROSS, R. L., CUNNINGHAM, D., MILLER, C. G., XUE, Z., ZHOU, J-M., AND BOYER, P. D. (1987). Adenine nucleotide binding sites on beef heart F1 ATPase: Photoaffinity labeling of β-subunit Tyr-368 at a noncatalytic site and β Tyr-345 at a catalytic site. Proc. Natl. Acad. Sci. USA 84: 5715-5719.

10. FEINBERG, A. P., AND VOGELSTEIN, B. (1983). A technique for radiolabeling DNA restriction endonuclease fragments to high specific activity. Anal. Biochem. 132: 6-13.

11. FITCH, W. M., AND MARGOLIASH, E. (1969). Construction of phylogenetic trees: A method based on mutation distances as estimated from cytochrome c sequences is of general applicability. Science 155: 279-284.

12. FLAHERTY, K. M., DELUCA-FLAHERTY, C., AND MCKAY, D. B. (1990). Three-dimensional structure of the ATPase fragment of a 70K heat-shock cognate protein. Nature 346: 623-628.

13. FROHLICH, K-U., ENTIAN, K-D., AND MECKE, D. (1985). The primary structure of the yeast hexokinase PII gene (HXK2) which is responsible for glucose repression. Gene 36: 105-111.

14. FROHMAN, M. A., DUSH, M. K., AND MARTIN, G. R. (1988). Rapid production of full-length cDNAs from rare transcripts: Amplification using a single gene-specific oligonucleotide primer. Proc. Natl. Acad. Sci. USA 85: 8998-9002.

15. FRY, D. C., KUBY, S. A., AND MILDVAN, A. S. (1986). ATP-binding site of adenylate kinase: Mechanistic implications of its homology with ras-encoded p21, F1-ATPase, and other nucleotide binding proteins. Proc. Natl. Acad. Sci. USA 83: 907-911.

16. GEORGE, D. G., BARKER, W. C., AND HUNT, L. T. (1986). The protein identification resource (PIR). Nucleic Acids Res. 14: 11-20.

17. GRIFFIN, L. D., MACGREGOR, G. R., MUZNY, D. M., HARTER, J., COOK, R. G., AND MCCABE, E. R. B. (1989). Synthesis and characterization of a bovine HK1 cDNA probe by mixed oligonucleotide primed amplification of cDNA using high complexity primer mixtures. Biochem. Med. Metab. Biol. 41: 125-131.

18. GROSSBARD, L., AND SCHIMKE, R. T. (1966). Multiple hexokinases of rat tissues. Purification and comparison of soluble forms. J. Biol. Chem. 241: 3546-3560.

19. GYLLENSTEN, U. B., AND ERLICH, H. A. (1988). Generation of single-stranded DNA by the polymerase chain reaction and its application to sequencing of the HLA-DQA locus. Proc. Natl. Acad. Sci. USA 85: 7562-7566.

20. HANKS, S. K., QUINN, A. M., AND HUNTER, T. (1988). The protein kinase family: Conserved features and deduced phylogeny of the catalytic domains. Science 241: 42-52.

21. HANNINK, M., AND DONOGHUE, D. J. (1985). Lysine residue 121 in the proposed ATP-binding site of the v-mos protein is required for transformation. Proc. Natl. Acad. Sci. USA 82: 7894-7898.

22. HARRISON, R. W. (1985). “Crystallographic Refinement of Two Isozymes of Yeast Hexokinase and the Relationship of Structure to Function.” Ph.D. thesis, Yale University, New Haven, CT.

23. HASHIMOTO, E., TAKIO, K., AND KREBS, E. G. (1982). Amino acid sequence at the ATP-binding site of cGMP-dependent protein kinase. J. Biol. Chem. 257: 727-733.

24. HERRERO, P., FERNANDEZ, R., AND MORENO, F. (1989). The hexokinase isozyme PII of Saccharomyces cerevisiae is a protein kinase. J. Gen. Microbiol. 135: 1209-1216.

25. HOLROYDE, M. J., AND TRAYER, I. P. (1976). Purification and properties of rat skeletal muscle hexokinase. FEBS Lett. 62: 215-217.

26. KAMPS, M. P., AND SEFTON, B. M. (1986). Neither arginine nor histidine can carry out the function of lysine-295 in the ATP-binding site of p60src. Mol. Cell. Biol. 6: 751-757.

27. KAMPS, M. P., TAYLOR, S. S., AND SEFTON, B. M. (1984). Direct evidence that oncogenic tyrosine kinases and cyclic AMP-dependent protein kinases have homologous ATP-binding sites. Nature 310: 589-591.

28. KATZEN, H. M., AND SCHIMKE, R. T. (1965). Multiple forms of hexokinase in the rat: Tissue distribution, age dependency, and properties. Proc. Natl. Acad. Sci. USA 54: 1218-1225.

29. KIMURA, M. (1981). A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequence. J. Mol. Evol. 16: 111-120.

30. KOPETZKI, E., ENTIAN, K-D., AND MECKE, D. (1985). Complete nucleotide sequence of the hexokinase PI gene (HXK1) of Saccharomyces cerevisiae. Gene 39: 95-102.

31. KUSUKAWA, N., UEMORI, T., ASADA, K., AND KATO, I. (1990). Rapid and reliable protocol for direct sequencing of material amplified by the polymerase chain reaction. Biotechniques 9: 66-72.

32. LOWRY, O. H., AND PASSONNEAU, J. V. (1972). “A Flexible System of Enzymatic Analysis,” pp. 121-128, Academic Press, New York.

33. MANIATIS, T., FRITSCH, E. F., AND SAMBROOK, J. (1982). “Molecular Cloning: A Laboratory Manual,” Cold Spring Harbor Laboratory, Cold Spring Harbor, NY.

34. NEMAT-GORGANI, M., AND WILSON, J. E. (1986). Rat brain hexokinase: Location of the substrate nucleotide binding site in a structural domain at the C-terminus of the enzyme. Arch. Biochem. Biophys. 251: 97-103.

35. NISHI, S., SEINO, S., AND BELL, G. I. (1988). Human hexokinase: Sequences at the amino- and carboxy-terminal halves are homologous. Biochem. Biophys. Res. Commun. 157: 937-943.

36. SCHIRCH, D. M., AND WILSON, J. E. (1987). Rat brain hexokinase: Amino acid sequence at the substrate hexose binding site is homologous to that of yeast hexokinase. Arch. Biochem. Biophys. 254: 385-396.

37. SCHWAB, D. A., AND WILSON, J. E. (1989). Complete amino acid sequence of the rat brain hexokinase, deduced from the cloned cDNA, and proposed structure of a mammalian hexokinase. Proc. Natl. Acad. Sci. USA 86: 2563-2567.

38. SCHWAB, D. A., AND WILSON, J. E. (1991). Complete amino acid sequence of the type III isozyme of rat hexokinase, deduced from cloned cDNA. Arch. Biochem. Biophys. 285: 365-370.

39. SHOHAM, M., AND STEITZ, T. A. (1980). Crystallographic studies and model building of ATP at the active site of hexokinase. J. Mol. Biol. 140: 1-14.

40. SMITH, R. F., AND SMITH, T. F. (1990). Automatic generation of primary sequence patterns from sets of related protein sequences. Proc. Natl. Acad. Sci. USA 87: 118-122.

41. SNYDER, M. A., BISHOP, J. M., MCGRATH, J. P., AND LEVINSON, A. D. (1985). A mutation at the ATP-binding site of pp60v-src abolishes kinase activity, transformation, and tumorigenicity. Mol. Cell. Biol. 5: 1772-1779.

42. STERNBERG, M. J. E., AND TAYLOR, W. R. (1984). Modelling the ATP-binding site of oncogene products, the epidermal growth factor receptor, and related proteins. FEBS Lett. 175: 387-396.

43. TAMURA, J. K., LADINE, J. R., AND CROSS, R. L. (1988). The adenine nucleotide binding site of yeast hexokinase PII. Affinity labeling of Lys-111 by pyridoxal 5’-diphospho-5’-adenosine. J. Biol. Chem. 263: 7907-7912.

44. THELEN, A. P., AND WILSON, J. E. (1991). Complete amino acid sequence of the type II isozyme of rat hexokinase, deduced from a cloned cDNA: Comparison with a hexokinase from Novikoff ascites tumor. Arch. Biochem. Biophys. 286: 645-651.

45. URETA, T. (1975). Phylogeny, ontogeny, and properties of the hexokinases from vertebrates. In “Isozymes. III. Developmental Biology” (C. L. Markert, Ed.), pp. 575-602, Academic Press, New York.

46. URETA, T. (1982). The comparative isozymology of vertebrate hexokinases. Comp. Biochem. Physiol. 71B: 545-555.

47. VANDERCAMMEN, A., AND VAN SCHAFTINGEN, E. (1990). The mechanism by which rat liver glucokinase is inhibited by a regulatory protein. Eur. J. Biochem. 191: 483-489.

48. VOWLES, D. T., AND EASTERBY, J. S. (1979). Comparison of type I hexokinases from pig heart and kinetic evaluation of the effects of inhibitors. Biochim. Biophys. Acta 566: 283-295.

49. WALKER, J. E., SARASTE, M., RUNSWICK, M. J., AND GAY, N. J. (1982). Distantly related sequences in the α- and β-subunits of ATP synthase, myosin, kinases, and other ATP-requiring enzymes and a common nucleotide binding protein. EMBO J. 1: 945-951.

50. WHITE, T. K., AND WILSON, J. E. (1987). Rat brain hexokinase: Location of the allosteric regulatory site in a structural domain at the N-terminus of the enzyme. Arch. Biochem. Biophys. 259: 402-411.

51. WHITE, T. K., AND WILSON, J. E. (1989). Isolation and characterization of the discrete N- and C-terminal halves of rat brain hexokinase: Retention of full catalytic activity in the isolated C-terminal half. Arch. Biochem. Biophys. 274: 375-393.

52. ZOLLER, M. J., NELSON, N. C., AND TAYLOR, S. S. (1981). Affinity labeling of cAMP-dependent protein kinase with p-fluorosulfonylbenzoyl adenosine. J. Biol. Chem. 256: 1837-1842.