nucleotide sequence of a plasmodium falciparum stress protein with similarity to mammalian 78-kda...



Molecular and Biochemical Parasitology, 56 (1992) 353-356. © 1992 Elsevier Science Publishers B.V. All rights reserved. 0166-6851/92/$05.00

MOLBIO 01882

Short Communication

Nucleotide sequence of a Plasmodium falciparum stress protein with similarity to mammalian 78-kDa glucose-regulated protein

Nirbhay Kumar* and Hong Zheng

Department of Immunology and Infectious Diseases, School of Hygiene and Public Health, The Johns Hopkins University, Baltimore, USA

Key words: Plasmodium falciparum; Heat shock protein; Stress proteins

Genes for two members of the heat shock protein 70 family have been cloned from Plasmodium falciparum. These include proteins of 75 kDa (Pfhsp) and 72 kDa (Pfgrp), sharing sequence similarity with a eukaryotic heat shock protein of 70 kDa and a glucose-regulated protein of 78 kDa, respectively [1-4]. While barely detectable in the salivary gland sporozoites, these proteins are expressed at elevated levels in parasites undergoing development in liver cells (exoerythrocytic stages) (Kumar et al., submitted for publication). Temperature shift studies in blood-stage parasites of P. falciparum and P. berghei have also shown that the hsp70-like protein is heat-inducible [5]. These proteins are localized in the nuclear and cytoplasmic compartments of the parasite [5]. In contrast, known stimulators of grp78 expression (glucose deprivation, tunicamycin, 2-deoxyglucose) in mammalian cells [6] and temperature shift had no apparent effect on the expression of Pfgrp in P. falciparum [5]. The mechanism of regulation of Pfgrp expression remains unclear.

Correspondence address: Nirbhay Kumar, DIID-SHPH-JHU, 615 N.Wolfe St. Baltimore, MD 21205, USA.

Note: Nucleotide sequence data reported in this paper have been submitted to the GenBank™ data base with the accession number L02822.

Abbreviations: HSP, heat shock protein; GRP, glucose-regulated protein; PCR, polymerase chain reaction.

A tetrapeptide sequence at the carboxy terminus of grp78, KDEL in the mammalian proteins and HDEL in the Saccharomyces cerevisiae protein, plays a critical role in the retention of these proteins in the lumen of the endoplasmic reticulum [7]. The grp78-like protein in P. falciparum contains SDEL at the carboxy terminus, and is also localized in the endoplasmic reticulum-like membranous compartment, as revealed by immunoelectron microscopy (ref. 5, and Kumar et al., submitted). Thus the sequence SDEL appears to serve the same function as KDEL in mammalian cells and HDEL in yeasts. Knowledge of the complete sequence, including the promoter region, would facilitate understanding of the functions and the regulation of the Pfgrp gene in P. falciparum. Here we describe the complete nucleotide sequence of the P. falciparum (3D7 clone of NF54 isolate) gene encoding Pfgrp.
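The comparison of carboxy-terminal retention signals can be written as a small check. The Python sketch below is only an illustration: the function name and the lookup table are our own, the table holds just the three tetrapeptides named in the text, and the example string is the last stretch of the Pfgrp row in Fig. 3, alignment gap included.

```python
# Tetrapeptides named in the text as ER retention signals. The table is an
# assumption made for illustration, not an exhaustive list.
ER_RETENTION_SIGNALS = {
    "KDEL": "mammalian grp78",
    "HDEL": "Saccharomyces cerevisiae grp78 homologue",
    "SDEL": "P. falciparum Pfgrp",
}


def er_retention_signal(protein):
    """Return the C-terminal tetrapeptide if it is a listed signal, else None."""
    # Alignment gaps ('-') are removed before the last four residues are taken.
    tail = protein.replace("-", "").upper()[-4:]
    return tail if tail in ER_RETENTION_SIGNALS else None


pfgrp_c_terminus = "GDEDVDS-DEL"  # last residues of the Pfgrp row in Fig. 3
print(er_retention_signal(pfgrp_c_terminus))
```

A sequence whose last four residues are not in the table returns None, so the same function can be used to screen other hsp70 family members.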

The sequences of a genomic clone previously designated T-114 [1], a cDNA clone, and a genomic clone obtained by inverse PCR and finally by RNA-PCR [8], were compiled to obtain the complete sequence of Pfgrp (Fig. 1). The genomic clone in Fig. 1A (1417-2268 bp, nucleotide positions based on the final complete sequence as shown in Fig. 2) was described earlier [1]. A cDNA clone (Fig. 1B), isolated from a P. falciparum cDNA library in lambda NM1149, contained the sequence from 689 to the 3' end of the gene.

[Fig. 1, panels (A)-(E): schematic maps of the clones described in the legend, with BglII sites marked b. (F) 5'-GTAAGTAT....ATTTTTTAG-3'. (G) Gel, lanes 1-4.]

Fig. 1. Cloning and sequencing strategy. (A) represents a genomic fragment (1417-2268; numbers based on complete sequence as in Figure 2) [1]. (B) represents a cDNA clone (starting at 689 to the 3' end of the gene). The clone was isolated from a P. falciparum cDNA library using oligonucleotides based on the genomic fragment (A, above) as probes. The insert was cloned into pUC19 for sequencing by the dideoxy chain termination procedure using a 'Sequenase' kit [10]. (C) represents a BglII (restriction site at 2152 identified by b) genomic fragment hybridizing to oligonucleotide 20 (CAGAGTATGAAAGCAACTG, 1994-2012). BglII-digested genomic DNA was diluted to 1 and 2 µg ml⁻¹ and circularized by self-ligation. Oligonucleotides identified as 1 (AGGTGTCTTAATTCAAGT, 1563-1580) and 16 (GCACTAATTTGTTCAGGAGC, complement of 704-723) were used to amplify the DNA by PCR. The 'inverted PCR' product was blunt-end cloned into the SmaI site of pUC19 for sequence analysis. (D) shows the genomic fragment representing the complete sequence from the translation initiation site to the termination codon and an intron (dark box). (E) represents the coding sequence of the gene after defining the boundaries of the intron based on sequencing the mRNA according to the RNA-PCR protocol [8]. Briefly, P. falciparum RNA was isolated [12] and treated with RNase-free DNase I. After reverse transcription using Moloney murine leukemia virus reverse transcriptase and anti-sense oligonucleotide 31 (GGGTTATAATTCGTCACTATCTAC) based on the known 3'-end sequence of the gene, the first strand cDNA was amplified by PCR using oligonucleotides 250 (ATGAACCAAATTAGGCC, 1-17), corresponding to the 5'-end of the coding sequence, and 16. (F) shows the sequences at the ends of the intron. (G) shows a gel of the PCR products using genomic DNA (lane 3) and first strand cDNA product (lane 4) from E (above) as template DNA and oligonucleotides 250 and 16 as primers. Lanes 1 and 2 show HindIII-digested lambda DNA and 123 bp ladder DNA size markers respectively. The ethidium bromide stained gel was photographed as a negative image using 'Stratagene' Eagle-Eye.
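The orientation of the primers in the legend can be checked directly. As a minimal sketch (the helper `reverse_complement` and the constant names are our own), the Python below takes oligonucleotide 16 as printed in the legend and compares its reverse complement with nucleotides 704-723 as read from Fig. 2.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string containing only A, C, G and T."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


OLIGO_16 = "GCACTAATTTGTTCAGGAGC"      # as printed in the legend to Fig. 1
FIG2_704_723 = "GCTCCTGAACAAATTAGTGC"  # sense strand, read from Fig. 2

# An anti-sense primer should equal the reverse complement of the sense strand.
print(reverse_complement(OLIGO_16) == FIG2_704_723)  # True
```

The same helper applied to anti-sense oligonucleotide 31 gives a sense-strand sequence that begins GTAGATAGTGACGA, the stretch that precedes the termination codon in Fig. 2.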

'Inverted' PCR [9] was then used to obtain the sequence at the 5'-end of the gene. Hybridization of BglII-digested P. falciparum genomic DNA with oligonucleotide 20 identified an approximately 2.5 kb fragment (Fig. 1C). The BglII-digested P. falciparum DNA was circularized by self-ligation and used as template DNA for PCR. The PCR product containing a major band of expected size (~1.5 kb) was treated with T4 polymerase and blunt-end cloned into the SmaI site of pUC19 for sequencing (Fig. 1D) by the dideoxy chain termination method [10]. The sequence was independently confirmed using the linear amplification DNA sequencing method [11].
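The size expected from an inverse PCR follows from simple arithmetic on the circularized fragment. The sketch below is a generic illustration, not the calculation used in this work: the function name is our own and the coordinates in the example are hypothetical round numbers, because the position of the upstream BglII site is not given here.

```python
def inverse_pcr_product_size(fragment_len, fwd_5prime, rev_5prime):
    """Expected product length after self-ligation of a linear fragment.

    Positions are 1-based on the linear restriction fragment.
    fwd_5prime: 5' end of the primer that extends toward the right end.
    rev_5prime: 5' end of the primer that extends toward the left end.
    """
    if not 1 <= rev_5prime < fwd_5prime <= fragment_len:
        raise ValueError("primers must face outward and lie inside the fragment")
    # After circularization the product runs from the forward primer to the
    # right end, across the ligation junction, and on to the reverse primer.
    return (fragment_len - fwd_5prime + 1) + rev_5prime


# Hypothetical example: 2500 bp fragment, outward-facing primers at 1900 and 1000.
print(inverse_pcr_product_size(2500, 1900, 1000))  # 1601
```

The region between the two primers is already known and is not amplified, which is why the product is shorter than the restriction fragment itself.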

Figure 2 shows the complete nucleotide sequence of the gene for Pfgrp including an intron-like non-coding region revealed by analysis of the sequence [13] near the 5' end. The ATG codon identified as nucleotide 1, followed by a sequence encoding a long stretch of hydrophobic amino acids, typical of leader sequence, is assigned as the translation initiation codon. Sequence upstream of this ATG (approximately 200 bp, not shown) lacked any open reading frame. To exactly define the boundaries of the intron, RNA from P. falciparum was reverse transcribed and amplified using oligonucleotides 250 and 16 as primers, cloned into pUC19 and sequenced


 -22 TT ATCATATATA TAATTCAAAA
   1 ATGAACCAAA TTAGGCCATA TATTTTACTA TTAATTGTTT CCTTATTAAA ATTTATAAGT GCCGTTGACT CAAACAGTAA GTATTATATA ATTTTAAAGA AAGTATTTAT GTTTTTTTAA
 121 NTNAATAAAA AAAAAAAAAA ATATATATAT ATATATATAT GTAATTAAAA AAATCTCTGA TGAAGCATAT TATATTATGT GTAAAAATAT ATTACACTTA TGTATTTATT ACACATATAT
 241 ATTTATATAT TTATAATACA CTTATGTATG TATTACACAT ATGTATGTAT TTGAAGTATA TGTAATTATA TATATGTAAT ATGTGTATAT ATAAAAAATT TGAATCTTTT ATTTTAATTT
 361 TTTAGTTGAG GGACCCGTTA TTGGTATTGA CTTGGGTACC ACTTATAGTT GCGTTGGTGT TTTTAAAAAT GGAAGAGTTG AAATATTGAA TAATGAATTA GGTAATCGTA TTACCCCATC
 481 ATATGTTTCC TTTGTAGATG GAGAAAGGAA AGTTGGTGAG GCAGCTAAAT TAGAAGCTAC TGTACATCCT ACTCAAACAG TTTTTGATGT AAAGAGATTA ATAGGAAGAA AATTTGATGA
 601 CCAAGAAGTT GTTAAAGATC GTTCTTTATT ACCATATGAA ATTGTAAATA ATCAAGGCAA ACCAAATATT AAGGTACAAA TAAAGGATAA AGATACTACA TTTGCTCCTG AACAAATTAG
 721 TGCTATGGTT TTAGAAAAAA TGAAAGAAAT AGCTCAATCA TTTTTAGGTA AACCAGTAAA AAATGCAGTT GTTACTGTCC CTGCTTATTT TAATGATGCT CAAAGACAAG CAACAAAAGA
 841 TGCTGGTACT ATAGCTGGAT TGAACATTNT TCGTATTATT ATCAATCAAC CAACTGCTGC TGCTTTAGCA TATGCTTTAG ATAAGAAAGA AGAGACCAGT ATTTTAGTAT ACGATTTAGG
 961 TGGTGGTACT TTTGATGTTT CTATTCTTGT TATTGACAAT GGTGTTTTTG AAGTATATGC TACTGCTGGT AATACTCATT TAGGAGGTGA AGATTTTGAT CAAAGAGTTA TGGACTATTT
1081 TATAAAAATG TTCAAGAAAA AAAACAATAT CGATTTAAGA ACTGACAAAA GAGCTATTCA GAAATTAAGA AAAGAAGTTG AAATAGCAAA AAGAAACTTA TCTGTTGTTC ACTCAACACA
1201 AATCGAAATT GAAGATATAG TTGAAGGACA TAATTTTTCT GAAACCTTAA CAAGAGCCAA ATTTGAAGAA TTAAATGATG ATTTATTTAG AGAAACCTTA GAGCCAGTAA AAAAAGTTTT
1321 GGATGATGCT AAATATGAAA AAAGTAAAAT TGATGAAATT GTTTTAGTAG GAGGTTCAAC ACGTATTCCA AAAATTCAAC AAATTATCAA AGAATTCGAA TTCTTTAATG GTAAAGAACC
1441 AAATAGAGGT ATAAATCCTG ATGAAGCTGT TGCTTATGGT GCTGCTATCC AAGCAGGTAT TATTTTAGGT GAAGAATTAC AAGACGTTGT TTTATTAGAT GTTACTCCAT TAACTTTAGG
1561 TATAGAAACT GTGGGTGGTA TTATGACACA ATTAATTAAA AGAAATACTG TCATCCCAAC CAAAAAATCA CAAACCTTTT CAACATATCA AGATAACCAA CCAGCTGTCT TAATTCAAGT
1681 TTTTGAAGGA GAAAGAGCAT TAACCAAAGA TAATCACCTT TTAGGAAAGT TTGAATTATC TGGTATTCCA CCAGCACAAA GAGGAGTACC CAAAATTGAA GTTACCTTTA CCGTAGACAA
1801 AAATGGTATC TTACATGTTG AAGCTGAAGA CAAAGGTACA GGTAAAAGTA GAGGTATAAC TATTACTAAT GACAAAGGTA GATTATCGAA AGAACAAATC GAAAAAATGA TTAATGATGC
1921 AGAAAAATTC GCAGATGAAG ATAAAAACTT AAGAGAAAAA GTTGAAGCCA AAAATAAACT TGATAATTAT ATACAGAGTA TGAAAGCAAC TGTTGAAGAT AAAGATAAAT TAGCTGATAA
2041 AATCGAAAAA GAAGATAAAA ATACTATCCT TTCAGCTGTT AAAGATGCTG AAGATTGGTT AAATAATAAC TCGAATGCTG ATTCTGAAGC ATTAAAACAA AAATTAAAAG ATCTTGAAGC
2161 TGTATGCCAA CCAATCATTG TTAAATTATA TGGTCAACCA GGAGGACCTT CACCACAACC TAGTGGAGAC GAAGATGTAG ATAGTGACGA ATTATAACCC CTTCACAG

Fig. 2. Nucleotide sequence of the Pfgrp gene numbered relative to the putative translation initiation site. The termination site TAA is identified at 2255-2257 by the shaded area. The underlined sequence (77-365) denotes the intron.
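The intron coordinates in the legend can be tested by joining the two exon ends and translating across the junction. The Python below is a sketch under stated assumptions: the helper names are our own, the standard genetic code is assumed, exon 1 (nucleotides 1-76) is read from Fig. 2 with its first ten bases taken from oligonucleotide 250 (ATGAACCAAATTAGGCC, 1-17), and only the first 23 nucleotides of exon 2 (366-388) are used.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (an assumption:
# the text does not state which code table was used).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate complete codons from the first base; trailing bases are ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Exon 1, nucleotides 1-76 (the intron begins at 77).
EXON1 = ("ATGAACCAAATTAGGCCATATATTTTACTATTAATTGTTT"
         "CCTTATTAAAATTTATAAGTGCCGTTGACTCAAACA")
# Start of exon 2, nucleotides 366-388 (the intron ends at 365).
EXON2_START = "TTGAGGGACCCGTTATTGGTATT"

spliced = EXON1 + EXON2_START
print(translate(spliced))  # MNQIRPYILLLIVSLLKFISAVDSNIEGPVIGI
```

The translation runs through the junction without a stop codon and reproduces the amino-terminal residues of the Pfgrp row in Fig. 3, as expected if the intron is removed exactly at 77-365.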

(Fig. 1E). The intron sequence (77-365 bp) of 289 bp contains GTAAGTA at the 5' splice site and TTTAG at the 3' splice site (Fig. 1F). The PCR results in Fig. 1G further confirm the presence of the intron. Bands of expected sizes (723 bp from genomic DNA and 434 bp from cDNA) were obtained when amplified by PCR using oligonucleotides 250 and 16. The exon-intron boundary follows the GT-AG splicing rule [14] and matches exactly the consensus sequence [15]. While the Pfgrp gene contains a single intron, it is interesting to note that the gene for the other member of the hsp70 family in P. falciparum (Pfhsp) does not contain an intron [4].
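The expected band sizes follow from the coordinates already given. As a short worked example (the variable names are our own; the positions are those stated for the intron and for oligonucleotides 250 and 16):

```python
INTRON_START, INTRON_END = 77, 365  # intron coordinates from Fig. 2
PRIMER_250_START = 1                # oligonucleotide 250 covers 1-17
PRIMER_16_END = 723                 # oligonucleotide 16 is the complement of 704-723

intron_len = INTRON_END - INTRON_START + 1
genomic_product = PRIMER_16_END - PRIMER_250_START + 1
cdna_product = genomic_product - intron_len  # the intron is absent from cDNA

print(intron_len, genomic_product, cdna_product)  # 289 723 434
```

The 289 bp difference between the two bands is therefore the full length of the intron.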

The coding sequence (1965 bp) can thus code for a protein of 655 amino acids with a predicted molecular mass of 72 693. The

[Fig. 3 alignment: Pfgrp (upper rows) with human grp78 (lower rows). The two carboxy-terminal row pairs read:]

SRGITITNDKGRLSKEQIEKMINDAEKFADEDKNLREKVEAKNKLDNYIQSMKATVEDKDKLADKIEKEDKNTILS
KNK......QN..TP.E..R.V.......E...K.K.RIDTR.E.ES.AY.L.NQIG..E..GG.LSS...E.MEK

AVKDAEDWLNNNSNADSEALKQKLKDLEAVCQPIIVKLYGQPGGPSPQPSGDEDVDS-DEL
..EEKIE..ESHQD..I.DF.A.K.E..EIV....S....SA.P.---.T.E..TAEK...

Fig. 3. Comparison of the amino acid sequence encoded by the Pfgrp open reading frame (1965 bp; upper row) with that of human grp78 (lower row). Sequences were aligned using the PRTALN program included in the NIH-Molecular biology programs. Computer generated insertions are indicated by dashes. Amino acid residues in the human grp78 that are identical to those of Pfgrp have been replaced by dots. Shaded areas represent the eight domains conserved in heat-inducible proteins of several species [13]. An arrow in the human grp78 sequence identifies the cleavage site for the signal peptide present at the amino terminus.


coding and intron sequences contain 68.9% and 87.6% A/T respectively, typical of P. falciparum genes [15]. The deduced amino acid sequence of the entire open reading frame is shown in Fig. 3. Comparison of the sequence of Pfgrp with that of human grp78 revealed 64% (72% including conservative changes) similarity. The protein is very hydrophilic, with only a few hydrophobic regions (data not shown). Included in the regions of similarity are eight domains (shaded sequences in Fig. 3) that are 80% conserved among heat-inducible proteins of human, Xenopus, Drosophila, yeast, and E. coli [16]. The signal sequence at the amino terminus and the canonical (S)DEL sequence at the carboxy terminus presumably facilitate translocation and retention of these proteins in the endoplasmic reticulum.
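The figures quoted for the coding sequence are consistent with one another, and the A/T content is a simple ratio. The sketch below is illustrative only: `at_fraction` and the constant names are our own, the mass is assumed to be in daltons, and the short test string is the 5' end of the intron from Fig. 1F, not the full intron.

```python
def at_fraction(seq):
    """Fraction of A and T residues in a DNA string."""
    seq = seq.upper()
    return (seq.count("A") + seq.count("T")) / len(seq)


CODING_BP = 1965        # coding sequence, excluding the termination codon
INTRON_BP = 289         # intron, nucleotides 77-365
STOP_END = 2257         # last base of the TAA termination codon (Fig. 2)
PREDICTED_MASS = 72693  # predicted molecular mass, assumed to be in Da

codons = CODING_BP // 3
# Genomic length to the end of the stop codon, minus the intron and the stop
# codon itself, should equal the coding length.
consistent = STOP_END - INTRON_BP - 3 == CODING_BP
mean_residue_mass = PREDICTED_MASS / codons

print(codons, consistent, round(mean_residue_mass, 1))  # 655 True 111.0
print(at_fraction("GTAAGTAT"))  # 0.75
```

Applied to the full coding and intron sequences of Fig. 2, `at_fraction` would give the kind of percentages quoted above.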

Acknowledgements

The authors thank Pichart Uparanukraw for helpful discussions. These studies were supported by research grants from the NIH (AI24704 and AI31589) and the John D. and Catherine T. MacArthur Foundation.

References

1 Kumar, N., Syin, C., Carter, R., Quakyi, I. and Miller, L.H. (1988) Plasmodium falciparum gene encoding a protein similar in sequence to the 78 kDa rat glucose-regulated stress protein. Proc. Natl. Acad. Sci. USA 85, 6277-6281.

2 Ardeshir, F., Flint, J.E., Richman, S. and Reese, R.T. (1987) A 75 kDa merozoite surface protein of Plasmodium falciparum which is related to the 70 kDa heat-shock protein. EMBO J. 6, 493-499.

3 Bianco, A.E., Favaloro, J.M., Burkot, T.R., Culvenor, J.G., Crewther, P.E., Brown, G.V., Anders, R., Coppel, R.L. and Kemp, D.J. (1986) A repetitive antigen of Plasmodium falciparum that is homologous to heat shock protein 70 of Drosophila melanogaster. Proc. Natl. Acad. Sci. USA 83, 8713-8717.

4 Yang, Y-F., Tan-ariya, P., Sharma, Y.D. and Kilejian, A. (1987) The primary structure of a Plasmodium falciparum polypeptide related to heat shock proteins. Mol. Biochem. Parasitol. 26, 61-68.

5 Kumar, N., Koski, G., Harada, M., Aikawa, M. and Zheng, H. (1991) Induction and localization of Plasmodium falciparum stress proteins related to the heat shock protein 70 family. Mol. Biochem. Parasitol. 48, 47-58.

6 Lee, A.S. (1987) Coordinated regulation of a set of genes by glucose and calcium ionophores in mammalian cells. Trends Biochem. Sci. 12, 21-23.

7 Pelham, H.R.B. (1989) Control of protein exit from the endoplasmic reticulum. Annu. Rev. Cell Biol. 5, 1-23.

8 Kawasaki, E.S. and Robinson, I.B. (1989) PCR technology: Principles and Application of DNA Amplification (Ehrlich, H.A. ed.), pp. 89-97, Stockton, New York.

9 Triglia, T., Peterson, M.G. and Kemp, D.J. (1988) A procedure for in vitro amplification of DNA segments that lie outside the boundaries of known sequences. Nucleic Acids Res. 16, 8186.

10 Sanger, F., Nicklen, S. and Coulson, A.R. (1977) DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74, 5463-5467.

11 Murray, V. (1989) Improved double stranded DNA sequencing using the linear polymerase chain reaction. Nucleic Acids Res. 17, 8889.

12 Chomczynski, P. and Sacchi, N. (1987) Single-step method of RNA isolation by acid guanidinium thiocyanate-phenol-chloroform extraction. Anal. Biochem. 162, 156-159.

13 Mount, D. (1985) Computer analysis of sequence, structure and function of biological macromolecules. Biotechniques 3, 102-112.

14 Breathnach, R., Benoist, C., O'Hare, K., Gannon, F. and Chambon, P. (1978) Ovalbumin gene: Evidence for a leader sequence in mRNA and DNA sequence at the exon-intron boundaries. Proc. Natl. Acad. Sci. USA 75, 4853-4857.

15 Weber, J.L. (1988) Molecular biology of malaria parasite. Exp. Parasitol. 66, 143-170.

16 Ting, J. and Lee, A.S. (1988) Human gene encoding the 78 000-dalton glucose-regulated protein and its pseudogene: Structure, Conservation, and Regulation. DNA 7, 275-286.