sindbis - california institute of technology

5
Proc. Nati. Acad. Sci. USA Vol. 78, No. 4, pp. 2062-2066, April 1981 Biochemistry Nucleotide sequence of the 26S mRNA of Sindbis virus and deduced sequence of the encoded virus structural proteins (membrane glycoproteins/capsid protein/alphavirus/togavirus/protein processing) CHARLES M. RICE AND JAMES H. STRAUSS Division of Biology, California Institute of Technology, Pasadena, California 91125 Communicated by Ray D. Owen, December 11, 1980 ABSTRACT The nucleotide sequence of intracellular 26S mRNA of Sindbis virus has been determined by direct sequence analysis of the cDNA made to this RNA with reverse transcriptase. From this, the amino acid sequences of the encoded virus struc- tural proteins, which include a basic capsid protein and two in- tegral membrane glycoproteins, have been deduced. The features of these proteins as related to their functions are discussed. We suggest that three proteases are required to produce these pro- teins from their polyprotein precursor: a viral protease, which functions in the cytosol to release the capsid protein; signalase, which makes two cleavages to separate the glycoproteins; and a protease of the Golgi complex that cleaves after double basic res- idues, to process the precursor form of one of the glycoproteins. Alphaviruses such as Sindbis virus and the closely related Sem- liki Forest virus are simple enveloped viruses. The icosahedral nucleocapsid is assembled in the cytoplasm, diffuses to the cell surface, and buds through the host cell plasmalemma, acquiring a lipoprotein envelope that contains only virus-encoded gly- coproteins. These glycoproteins are synthesized on the rough endoplasmic reticulum and glycosylated; they migrate to the plasma membrane by way of the Golgi apparatus, where the carbohydrates are modified and lipids are covalently attached. The interaction between the alphavirus nucleocapsid and its glycoproteins is much more specific than that of other enve- loped viruses; mature virions contain exclusively alphavirus proteins (1). All three of the virus structural proteins are translated as a continuous polypeptide from a single mRNA molecule-26S RNA (2). We have determined the nucleotide sequence of Sind- bis HR 26S RNA to investigate in detail the structure and pro- cessing of the viral proteins and to make possible further study of the temperature-sensitive mutants previously characterized genetically and physiologically (3). MATERIALS AND METHODS Details of the methods and strategy used for preparation and sequence determination of single-stranded cDNA to Sindbis 26S RNA will be published elsewhere. Briefly, 26S RNA was used as a template for synthesis of cDNA at 42.5°C by using avian myeloblastosis virus reverse transcriptase (kindly pro- vided by J. Beard) primed with either oligo(dT)12_18 (Collabo- rative Research, Waltham, MA) or a mixture of short (=6-8 nucleotides) random oligonucleotides derived from calf thymus DNA (a gift from J. Casey). The reaction mixture contained 4 mM sodium pyrophosphate to inhibit second-strand synthesis. After 30-60 min, an excess of EDTA (Na form) was added to stop the reaction, and the cDNA-RNA hybrid was isolated by phenol/chloroform extraction followed by ethanol precipita- tion. The RNA strand was- hydrolyzed by incubation in 0.1 M NaOH at 60'C for 30 min. The cDNA was chromatographed on Bio-Gel A-J m (Bio-Rad); the excluded peak fractions were pooled, and the cDNA was ethanol precipitated. The cDNA was then digested with Hae III, Taq I, Hha I, or Rsa I (New England Biolabs) and labeled at the 5' ends with T4 polynucleotide ki- nase or at the 3' ends with terminal deoxynucleotidyl transfer- ase [Bethesda Research Laboratories (Rockville, MD), ribosub- stitution grade] essentially as described by Maxam and Gilbert (4). Single-stranded, end-labeled restriction fragments were separated on 6% polyacrylamide sequencing gels, excised, and eluted, and their sequences were determined by using the base- specific chemical cleavage procedure (4). The modifications of Smith and Calvo (5) were used for long (>300 nucleotides) frag- ments. By using four different restriction enzymes and deter- mining the sequences of numerous partial digestion products and of some fragments from both the 5' and the 3' direction, virtually all of the sequence was determined more than once and sufficient overlap was obtained to align all of the fragments. RESULTS AND DISCUSSION Sequence of 26S RNA. The entire nucleotide sequence of Sindbis virus (HR strain) 26S RNA, except a sequence of '-150 nucleotides at the 5' end that has not been unambiguously de- termined, and the deduced amino acid sequences of the en- coded proteins are shown in Fig. 1. From the AUG codon initi- ating synthesis of the capsid protein, there is an open reading frame for 3735 nucleotides encoding, in order, capsid protein and glycoproteins E3, E2, and El, followed by a termination codon. The identification of the NH2 terminus and the COOH terminus of each protein is discussed below. The deduced amino acid sequence is in precise agreement with the NH2-ter- minal sequences found by automated sequence analysis of El and E2 (6), of PE2 (the precursor to E3 and E2) J. R. Bell and M. W. Hunkapiller, personal communication), and of the hy- drophobic "roots" of El and E2 derived by treatment of intact virions with a-chymotrypsin (unpublished data). In addition, sequences of many of the tryptic peptides from the Sindbis (strain Sa-AR-86) capsid protein have been published recently (7); most of these sequence data are consistent with our deduced sequence for the capsid protein. The experimental amino acid compositions for El, E2, and the capsid protein are in excellent agreement with the com- positions deduced from the nucleotide sequence and furnish additional support for the deduced protein sequences. In ad- dition, our sequence data adjacent to the poly(A) tail is in com- plete agreement with the sequence obtained for the same region by using a chain termination method (9). The method we have used requires only a few micrograms of purified RNA and, because molecular cloning is not involved, Abbreviation: SFV, Semliki Forest virus. The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertise- ment" in accordance with 18 U. S. C. §1734 solely to indicate this fact. 2062

Upload: others

Post on 09-May-2022

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Sindbis - California Institute of Technology

Proc. Nati. Acad. Sci. USAVol. 78, No. 4, pp. 2062-2066, April 1981Biochemistry

Nucleotide sequence of the 26S mRNA of Sindbis virus anddeduced sequence of the encoded virus structural proteins

(membrane glycoproteins/capsid protein/alphavirus/togavirus/protein processing)

CHARLES M. RICE AND JAMES H. STRAUSSDivision of Biology, California Institute of Technology, Pasadena, California 91125

Communicated by Ray D. Owen, December 11, 1980

ABSTRACT The nucleotide sequence of intracellular 26SmRNA of Sindbis virus has been determined by direct sequenceanalysis ofthe cDNA made to this RNA with reverse transcriptase.From this, the amino acid sequences of the encoded virus struc-tural proteins, which include a basic capsid protein and two in-tegral membrane glycoproteins, have been deduced. The featuresof these proteins as related to their functions are discussed. Wesuggest that three proteases are required to produce these pro-teins from their polyprotein precursor: a viral protease, whichfunctions in the cytosol to release the capsid protein; signalase,which makes two cleavages to separate the glycoproteins; and aprotease of the Golgi complex that cleaves after double basic res-idues, to process the precursor form of one of the glycoproteins.

Alphaviruses such as Sindbis virus and the closely related Sem-liki Forest virus are simple enveloped viruses. The icosahedralnucleocapsid is assembled in the cytoplasm, diffuses to the cellsurface, and buds through the host cell plasmalemma, acquiringa lipoprotein envelope that contains only virus-encoded gly-coproteins. These glycoproteins are synthesized on the roughendoplasmic reticulum and glycosylated; they migrate to theplasma membrane by way of the Golgi apparatus, where thecarbohydrates are modified and lipids are covalently attached.The interaction between the alphavirus nucleocapsid and itsglycoproteins is much more specific than that of other enve-loped viruses; mature virions contain exclusively alphavirusproteins (1).

All three of the virus structural proteins are translated as acontinuous polypeptide from a single mRNA molecule-26SRNA (2). We have determined the nucleotide sequence ofSind-bis HR 26S RNA to investigate in detail the structure and pro-cessing of the viral proteins and to make possible further studyof the temperature-sensitive mutants previously characterizedgenetically and physiologically (3).

MATERIALS AND METHODSDetails of the methods and strategy used for preparation andsequence determination of single-stranded cDNA to Sindbis26S RNA will be published elsewhere. Briefly, 26S RNA wasused as a template for synthesis of cDNA at 42.5°C by usingavian myeloblastosis virus reverse transcriptase (kindly pro-vided by J. Beard) primed with either oligo(dT)12_18 (Collabo-rative Research, Waltham, MA) or a mixture of short (=6-8nucleotides) random oligonucleotides derived from calfthymusDNA (a gift from J. Casey). The reaction mixture contained 4mM sodium pyrophosphate to inhibit second-strand synthesis.After 30-60 min, an excess of EDTA (Na form) was added tostop the reaction, and the cDNA-RNA hybrid was isolated byphenol/chloroform extraction followed by ethanol precipita-

tion. The RNA strand was- hydrolyzed by incubation in 0.1 MNaOH at 60'C for 30 min. The cDNA was chromatographed onBio-Gel A-J m (Bio-Rad); the excluded peak fractions werepooled, and the cDNA was ethanol precipitated. The cDNA wasthen digested with Hae III, Taq I, Hha I, or Rsa I (New EnglandBiolabs) and labeled at the 5' ends with T4 polynucleotide ki-nase or at the 3' ends with terminal deoxynucleotidyl transfer-ase [Bethesda Research Laboratories (Rockville, MD), ribosub-stitution grade] essentially as described by Maxam and Gilbert(4). Single-stranded, end-labeled restriction fragments wereseparated on 6% polyacrylamide sequencing gels, excised, andeluted, and their sequences were determined by using the base-specific chemical cleavage procedure (4). The modifications ofSmith and Calvo (5) were used for long (>300 nucleotides) frag-ments. By using four different restriction enzymes and deter-mining the sequences of numerous partial digestion productsand of some fragments from both the 5' and the 3' direction,virtually all of the sequence was determined more than onceand sufficient overlap was obtained to align all ofthe fragments.

RESULTS AND DISCUSSIONSequence of 26S RNA. The entire nucleotide sequence of

Sindbis virus (HR strain) 26S RNA, except a sequence of '-150nucleotides at the 5' end that has not been unambiguously de-termined, and the deduced amino acid sequences of the en-coded proteins are shown in Fig. 1. From the AUG codon initi-ating synthesis of the capsid protein, there is an open readingframe for 3735 nucleotides encoding, in order, capsid proteinand glycoproteins E3, E2, and El, followed by a terminationcodon. The identification of the NH2 terminus and the COOHterminus of each protein is discussed below. The deducedamino acid sequence is in precise agreement with the NH2-ter-minal sequences found by automated sequence analysis of Eland E2 (6), of PE2 (the precursor to E3 and E2) J. R. Bell andM. W. Hunkapiller, personal communication), and of the hy-drophobic "roots" of El and E2 derived by treatment of intactvirions with a-chymotrypsin (unpublished data). In addition,sequences of many of the tryptic peptides from the Sindbis(strain Sa-AR-86) capsid protein have been published recently(7); most ofthese sequence data are consistent with our deducedsequence for the capsid protein.The experimental amino acid compositions for El, E2, and

the capsid protein are in excellent agreement with the com-positions deduced from the nucleotide sequence and furnishadditional support for the deduced protein sequences. In ad-dition, our sequence data adjacent to the poly(A) tail is in com-plete agreement with the sequence obtained for the same regionby using a chain termination method (9).The method we have used requires only a few micrograms

ofpurified RNA and, because molecular cloning is notinvolved,

Abbreviation: SFV, Semliki Forest virus.

The publication costs ofthis article were defrayed in part by page chargepayment. This article must therefore be hereby marked "advertise-ment" in accordance with 18 U. S. C. §1734 solely to indicate this fact.

2062

Page 2: Sindbis - California Institute of Technology

Biochemistry: Rice and Strauss

UACUACAAC

C-.N N R G F F N N L

CACCACCACCAUGAAUAGAGGAUUCUUUAACAUGCCL A..V

R N C L A S II QGCGCAACGGGCUGGCUUCUCAAAUCCAGCAA

A P K Q P P K P K KGCGCCCCAAGCAACCACCGAAGCCGCAGACA

cc 15 L F 0 V K N E 0 G 0

3711 UUGUUCGACGUCAAGAACGAGGACGCGAAI

comu K F K S S A Y 0 N

359 AAAUUUACCAAGUCGUCACCAUACGACAUI

Coom A V Y S G G R F3471 GCGGUGCAGUAUAGUGGAGGUAGAUUUAC

ca2m 0 E G T R A L S V3351 GAUGAAGGAACACGAACUGCCCUUUCGGU

E3C1) L L G N V S F P C 0

231 UUGCUCGGAAAUGUGAGCUUCCCAUGCGA

E3 51) N A I L R C G S S G3111 AAUGCCAUAUUGCGGUGCGGAUCGUCUGG

E2 2m S P V K I E V V 0

291 AGCCCUGUUAAGAUCGCGCAGGUCUGGGA

E2d S L K 0 0 H V K2971 AUCUCGCUUAAGCAGGAUCACACCGUUAA

E2 (11 P G 0 S V T V S I V

2751 CCAGGGGACAGCGUAACGGUUAGCAUAGU

E2(147) G K K I P C v 0

2631 CGUAAAAAAAUUCCUUGCACAGUGUACGA

E2(19n V Y A K P P S G K N2511 CUUUACGCAAAGCCCCCAUCUGGGAAGAA

E2(27 V A Y K S 0 K V2391 CUCCCCUAUAAGAGCGACCAAACGAAGUG

uc

L T T A vACUGACCACAGCCGUC

P K T a E

Proc. Natl. Acad. Sci. USA 78 (1981) 2063

G R R P F P A P A N V R P R R R R OA A P P A

:GCCCCGCCCCCCUUCCCGGCCCCCACUGCCAUGUGGAGGCCGCGGAGAAGGAGGCAGGCGGCCCCCGAUGCCUGCC

S A L V I G O A r R P O P P R P R P P P R K 0

CAGUGCCCUACUCAUUGGACAGGCAACUAGACCUCAACCCCCACGUCCACGCCCGCCACCGCGCCAGAGACAGCAG

K K K K O P A K P K P G K R O R N A L K L E A D RKO- PP-100 ~ -l- eA Af l,Alff-,AIPKACK IIA E'I rrAO'lR'frA ~

,ACC-AAAAACGCAGGAGAAGAGA-GAAGAAGCAACCUGCAAAACCAACC^^CsGGAAAGAGAGACOAUGUCACUb>^UAuUUUAULUb^b6^AUAUV I G H A L A N E G K V N K P L H V K G T D H P V L S K L*LGUtCAUGGGCACGCACUGGCCAUGGAAGGAAAGGUAAUCAAACCUCUGCACGUGAAAGGAACCAUCGACCACCCUGUGCUAUCAAAGCUCCwCACUUCC---w

E F AIGGAGUUCGCAI P R

'CAUCCCUCG

V TV

ICGUCACCUG

R P PiCCGCCCGCC

R S K;CAGAAGCAA

E A

CGAAGCGGA

E G TkACAAGGCAC

S S NJGAGUAGCCA

R L KkCCGUCUGAA

I YCAUUACGUA

V F N

:1

G L P V N N R S E A F S E H P E C F N V H HACAGUUGCCAGUCAACAUGAGAAGUGAGGCAUUCACCUACACCAGUGAACACCCCGAAGGAUUCUAUAACUGGCACCACGGA

C V CGC R CO S C R P I N O N S C R V V A I V L CC A

CGGAGUAGGAGGCAGAGGAGACACCGGUCGUCCGAUCAUGGAUAACUCCGGUCGGGUUGUCGCGAUAGUCCUCGGUGGCGCUEa-.

N S K C K r I K t P E G E E V S A A P L V A H CGAAUAGUAAAGGGAAGACAAUUAAGACGACCCCGGAAGGGACAGAAGAGUGGUCCGCAGCACCACUGGUCACGGCAAUGUGU

r C Y R E P S R A L D I L E E N V N H E A Y r L LCACAUGCUAUACCCGCGAACCUUCCAGAGCCCUCGACAUCCUUGAAGAGAACGUGAACCAUGAGCCCCUACGAUACCCUGCUC

E2-.R S V I C F r L S P Y L G c S C H H E P C FAAGAAGCGUCAUUGACGGCUUUACCCUGACCAGCCCCUACUUGGGCACAUGCUCGUACUGCCACCAUACUGAACCGUGCUUC

N r I R I C r S A O F G Y O C S G A A S A N K R YCGAUAACACCAUACGCAUACAGACUUCCGCCCAGUUUGGAUACGACCAAAGCGGAGCAGCAAGCGCAAACAAGUACCGCUACN D t K I S S C P C R R L S Y K G Y F L L A K C P

CAUGGAUGACAUCAAGAUUAGCACCUCAGGACCGUGUAGAAGGCUUAGCUACAAAGGAUACUUUCUCCUCGCAAAAUGCCCU

S A S C L A R K I K P K F V C R E K Y O L P P V H,CUCAGCAACGUCAUGUACACUCGCCCGCAAGAUAAAACCAAAAUUCGUGGGACGGGAAAAAUAUGAUCUACCUCCCGUUCAC

E A C Y I N H R P R P H A Y S Y L E E S S G K,AGAAACAACUGCAGGCUACAUCACUAUGCACAGGCCGAGACCGCACGCUUAUACAUCCUACCUGGAAGAAUCAUCAGGGAAAE C K C CGO K r c r v s R r E I G C A I K C C

,UGAGUGCAAGUGCGGCGACUACAAGACCGGAACCGUUUCGACCCGCACCGAAAUCACUGGUUGCACCGCCAUCAAGCAGUGCS P O L I R H O H A G K L H L P F K L I P S C

CGGUCUUCAACUCACCGGACUUGAUCACACAUGACGACCACACGGCCCAACCGAAAUUGCAUUUGCCUUUCAAGUUGAUCCCGAGUACCUCCE2 26) N V P V A H A P N V I H G F K H I S L 0 L D T 0 H L 1 L L T T R R L C A N P E P2271 AUGGUCCCUGUUGCCCACGCGCCGAAUGUAAUACAUGGCUUUAAACACAUCAGCCUCCAAUUAGAUACAGACCACUUGACAUUGCUCACCACCAGGAGACUAGGGGCAAACCCGGAACCAE2cwn r T E V I V G K T V R N F T V 0 R 0 G L E Y I V C N H E P V R V Y A 0 E S A P G21SI ACCACUGAAUGGAUCGUCGGAAAGACGGUCAGAAACUUCACCGUCGACCGAGAUGGCCUGGAAUACAUAUGGGGAAAUCAUGAGCCAGUGAGGGUCUAUGCCCAAGAGUCAGCACCAGGAE2'347) D P H G V P H E I V 0 H YVY H R H P V Y T I L A V A S A T V A N N I G V T V A v2131 GACCCUCACGGAUGGCCACACGAAAUAGUACAGCAUUACUACCAUCGCCAUCCUGUGUACACCAUCUUAGCCGUCGCAUCAGCUACCGUGGCGAUGAUGAUUGGCGUAACUGUCGCAGUG

E2us) L C A C K A R R E C L 1 P Y A L A P N A V I P T S L A L L C C V R S A N A E r Floll UUAUGUGCCUGUAAAGCGCGCCGUGAGUGCCUGACGCCAUACGCCCUGGCCCCAAACGCCGUAAUCCCAACUUCGCUGGCACUCUUGUGCUGCGUCAGGUCGGCCAAUGCUGAAACGUUC6Kw r E r N S Y L V S N S 0 P F F V V 0 L C I P L A A F I V L N R C C S C C L P F L1791 ACCGAGACCAUGAGUUACUUGUGGUCGAACAGUCAGCCGUUCUUCUGGGUCCAGUUGUGCAUACCUUUGGCCGCUUUCAUCGUUCUAAUGCGCUGCUGCUCCUGCUGCCUGCCUUUUUUA

£1-.ac(44) V V A G A Y L A K V 0 A Y E H A r r V P N V P 0 I P Y K A L V E R A G Y A P L N1071 GUGGUUGCCGGCGCCUACCUGGCGAAGGUAGACGCCUACGAACAUGCGACCACUGUUCCAAAUGUGCCACAGAUACCGUAUAAGGCACUUGUUGAAAGGGCAGGGUAUGCCCCGCUCAAUE1 L E I r V N S s E V L P S r N 0 E Y I r C K F r r V V P S P K I K C C G S L E C1551 UUGGAGAUCACUGUCAUGUCCUCGGAGGUUUUGCCUUCCACCAACCAAGAGUACAUUACCUGCAAAUUCACCACUGUGGUCCCCUCCCCAAAAAUCAAAUGCUGCGGCUCCUUGGAAUGU

El 0 P A A H A 0 Y r C K V F G G V Y P F N V G G A 0 C F C 0 S E N S 0 N S E A Y V1431 CAGCCGGCCGCUCAUGCAGACUAUACCUGCAAGGUCUUCGGAGGGGUCUACCCCUUUAUGUGGGGAGGAGCGCAAUGUUUUUGCGACAGUGAGAACAGCCAGAUGAGUGAGGCGUACGUC

ElCmI E L S A V C A S 0 H A 0 A I K V H r A A N K V G L R I V Y G N r r S F L 0 V Y V1311 GAAUUGUCACCAGUAUGCGCGUCUGACCACGCGCAGGCGAUUAAGGUGCACACUGCCGCGAUGAAAGUAGGACUGCGUAUUGUGUACGGGAACACUACCAGUUUCCUAGAUGUGUACGUG

Elam1 N G V T P G r s K 0 L K V I A G P I S A S F r P F D H K V V I H R G L V Y N Y 0It@1 AACGGAGUCACACCAGGAACGUCUAAAGACUUGAAMAGUCAUAGCUGGACCAAUUUCAGCAUCGUUUACGCCAUUCGAUCAUAAGGUCGUUAUCCAUCGCGGCCUGGUGUACAACUAUGAC

Elcim F P E Y G A H K P G A F G D I O A r s L r s K D L I A S r D I R L L K P S A K N1r1 UUCCCGGAAUAUGGAGCGAUGAAACCAGGAGCGUUUGGAGACAUUCAAGCUACCUCCUUGACUAGCAAGGAUCUCAUCGCCAGCACAGACAUUAGGCUACUCAAGCCUUCCGCCAAGAAC

El G V H V P Y rO A

S S G F E V K N N S G R P L E APF G C K I A V N P L R A

I5 GUGCAUGUCCCGUACACGCAGGCCUCAUCAGGAUUUGAGAUGUGG^AAAACAACUCAGGCCGCCCACUGCAGGAAACCGCACCUUUCGGGUGUAAGAUUGCAGUAAAUCCGCUCCGAGCG

ElC V D C S Y G N I P I S I D I P N A A F I R r s D A P L V S r v K C E V S E C r YGa GUGGACUGUUCAUACGGGAACAUUCCCAUUUCUAUUGACAUCCCGAACGCUGCCUUUAUCAGGACAUCAGAUGCACCACUGGUCUCAACAGUCAAAUGUGAAGUCAGUGAGUGCACUUAU

ElCMS A F G G NA

L Y V S R EG

C P V H S H S S r A L E S r v H V L E711 UCAGCAGACUUCGGCGGGAUGGCCACCCUGCAGUAUGUAUCCGACCGCGAAGGUCAAUGCCCCGUACAUUCGCAUUCGAGCACAGCAACUCUCCAAGAGUCGACAGUACAUGUCCUGGAG

iEiaD K G A V v HFS A S P A N F

IV S L C G K K c N A E C K P P A D HI V

Sol AAAGGAGCGGUGACAGUACACUUUAGCACCGCGAGUCCACAGGCGAACUUUAUCGUAUCGCUGUGUGGGAAGAAGACAACAUGCAAUGCAGAAUGUAAACCACCAGCUGACCAUAUCGUG

EDO S r P H K N D O E F t0 A A I S K r s v s v L F A L F G G A S S L L I I G L H I F471 AGCACCCCGCAC^AAAAUGACCAAGAAUUUCAAGCCGCCAUCUCAAAAACAUCAUGGAGUUGGCUGUUUGCCCUUUUCGGCGGCGCCUCGUCGCUAUUAAUUAUAGGACUUAUGAUUUUU

El (420 A C S K H L r s r R R351 GCUUGCAGCAUGAUGCUGACUAGCACACGAAGAUGACCGCUACGCCCCAAUGAUCCGACCAGCAAAACUCGAUGUACUUCCGAGGAACUGAUGUGCAUAAUGCAUCAGGCUGGUACAUUA

231

1ll Poly (A)

FIG. 1. Nucleotide sequence of Sindbis 26S RNA and amino acid sequence of the encoded proteins. Nucleotides are numbered from the 3' end.Amino acids are numbered from the NH2 terminus ofeach protein [capsid (C), E3, E2, the 6000-dalton one (6K), El]. The beginning of each proteinis labeled, and the NH2 termini of the polypeptide segments from E2 and El isolated from spikeless particles after chymotrypsin treatment areindicated by triangles. Carbohydrate attachment sites are denoted by asterisks. The single letter amino acid code is used: A, Ala; C, Cys; D, Asp;E, Glu; F, Phe; G, Gly; H, His; I, Ile; K, Lys; L, Leu; M, Met; N, Asn; P, Pro; Q, Gln; R, Arg; S, Ser; T, Thr; V, Val; W, Trp; Y, Tyr.

c(1)4371

c35)

C (75)

aICX

aIC)c

kc

hAa11

AChi

cchi

ACil

A)Al

AtA.

GAUCGCUUACCGCGGGCAAUAUAGCAACACUAAAAACUCGAUGUACUUCCGAGGAAGCGCAGUGCAUAAUNUGCGCNGUGUUGCCACAUAACCACUAUAUUAACCAUUUAUCUAGCGGAC

Page 3: Sindbis - California Institute of Technology

2064 Biochemistry: Rice and Strauss

there is no chance ofselecting a minor variant in the population.We are currently using this method to locate sequence changesin Sindbis ts mutants.The coding regions of 26S RNA of Semliki Forest virus (SFV)

have been sequenced recently (10, 11). The amino acid se-quences of Sindbis structural proteins are compared with thoseof SFV in Fig. 2. Sequences have been aligned to maximizehomology. The overall homology between the viruses is strik-ing: 47% ofthe residues are identical and another 12% representconservative substitutions. It is also obvious that some areas ofthe proteins are more highly conserved than others (see Fig.2), as will be discussed below. The cysteine residues are in gen-eral conserved except in certain hydrophobic areas.

Overall Structure of Sindbis 26S RNA. We have identifieda Hae III fragment of the 26S cDNA that includes the NH2 ter-minus of the capsid protein and extends from the AUG codon=175 nucleotides toward the 5' end of the RNA. If we assumethat this fragment is derived from the 26S RNA, this brings theoverall length of the Sindbis '26S RNA to at least =4230 nu-cleotides, not including the poly(A) tail, which is in good agree-ment with previous estimates of its size (2). A 5' untranslatedregion of 175 nucleotides if fairly long in comparison with mostknown mRNAs; however, the mRNA for VP1 of SV40 has 240nucleotides 5' to the initiation codon (12). The 3' untranslatedsequence is 318 nucleotides long in Sindbis compared with 264nucleotides in SFV (11). Thus, despite numerous deletions inthe coding regions relative to one another (see Fig. 2), thelengths of the two virus 26S RNAs are remarkably similar. Fromthe AUG codon to the beginning of the poly(A) tail is 4023 nu-cleotides in SFV and 4053 nucleotides in Sindbis.We have examined the codon usage in Sindbis 26S RNA; for

most of the amino acids, the distribution among the possiblecodons is clearly nonrandom (data not shown). The base com-position or nearest-neighbor frequency cannot account for the

codon frequency found. In addition, despite the extensive nu-cleotide sequence divergence between Sindbis and SFV, thetwo viruses show similar codon usage (data not shown). (As anexample of this divergence, regions in which the amino acidsequence is totally conserved between the two viruses show anaverage of about 0.8 base changes per codon.) We do not finda low frequency of the C-G doublet, as has been found for someDNA viruses such as simian virus 40 (13) or for RNA viruses suchas fowl plague virus (14), which have mammalian or avian hosts.Sindbis and SFV have a wide host range that includes insectsas well as birds and mammals. The codon preferences found inthese two viruses may be due in part to an adaptation for optimaltranslational efficiency in this wide range of hosts.

Capsid Protein. The capsid protein of Sindbis is 264 aminoacids long. Inspection of the sequence shows a striking clus-tering-of lysine, arginine, and proline in the NH2-terminal half.'Seventy percent of these three amino acids, as well as 85% ofthe glutamine and 60% of the alanine, are within the first 120residues. This basic region ofthe capsid protein is probably im-portant in interacting with the virion RNA. Although SFV cap-sid protein has a similar basic region near the NH2 terminus,there is relatively little sequence homology between the twoproteins in this region (see Fig. 2). The COOH-terminal end(residues 166-264), however, shows remarkable sequence ho-mology between Sindbis and SFV; 76% ofthe residues are iden-tical and another 6% are conservative substitutions. This con-servation suggests that this region could be important inprotein-protein interactions, such as those between capsid pro-tein subunits to form the nucleocapsid and those between thenucleocapsid and the COOH-terminal end(s) of the transmem-brane viral glycoproteins.

Glycoprotein E3. Sindbis E3 is 64 amino acids long. The first19 residues are uncharged, highly conserved between Sindbisand SFV (see Fig. 2), and may serve as the signal sequence forthe insertion of PE2 into the endoplasmic reticulum during

C_ I ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~IIIIC(1) NNRGFFNNL GRRP'FPAPTANVRPRRRROAAPNPA RNGLASOIOOLTTAVSALVIGOATRPOPPRPRPPPR OKKQAPKOPPKPKKPKTGEKKKKOPA KPKPGKRORNALKLEAC(1) -.YIPTiTFY...VR.R.A.RPW-L . T.VAPVVPDFO-OQ-... -.-N..TNR.NAIAPARP.K.KKKKTT.PK.TO--KING.-.OO...DKQADKK.K.....E-C4I-NC(113) ORLFDVKNEODOVIGHALAMEGKVMKPLHVKGTIDHPVLSKLKFTKSSAYDNEFAQLPVNNRSEAFTYTSEHPEGFYNVHNGAVOYSGGRFTIPRGVGGRGDSGRPI1DNSGRVVAIVLGC(1B .CI-E-HjH *K.T*Y.CLVGD.....A.. . VNAD.A. A-K**.K.KL-C.--H. 9SK H-K.H..................T.KP......F.F.K........

E3 . , E2CQ33 GAOEGTRTALSVVTVNSKGKTIKTTPEGTEEV SAAPLVTANCLLGNVSFPCDRPP TCYTREPSRALOILEENVNHEAYDTLLNAILRCGSSGRSKR SVIDGFTL TSCQ3. . .9.DNTR... .... ....J... * . . . .ENAA.N .0. .DP...O.A.T.RG.H. . .9g.4Vf,.....................CCM) **N;-i .......... .DNVTiV....-@i.. *@V;-i-eFii..-CCVPC--ENNAEAT-RA-eD**DRPii'YD-,-iO -T-RNGT-R- *SOeN- VYKA- R

E2(11) PYLGTCSYCHHTEPCFSPVKIEQVVDEADONTIRIOTSAOFGYDOSGAASANKYRYNSLKODHTVKEGTNDDIKISTSGPCRRLSYKGYFLLAKCPPG9SVTVSIVSS NSATSCTLARKE2(14) ..IAY.AD.GAGHS-H.-A-.ARS.T.GMLKFH--F..I I.KONHDYT-I--A DG-AIENAVRSSL.VA... DFVHGTM-Hi.......EFL9...ODTR.AVRA.RIGYH

l l~~ ~ ~ ~ ~~~~~~~~~~~E2(138 IKPKFvG.REKYDLPPVHGKKIPCTVYDRLKETTAGYIT*14RPRPliAYTSYLEESSGKV-YAKPPSGKNITYECKCGD)YKTGTVSTRTEITGCTAIKQCVAYKSDQTKVVFN4SPIDLI*RHDDHE2(131) HD-OP....FTIR-HY E. ....TOTTAETVEE.D . .kPDTP0RTLSLOENKITG-KVK.N.T.TGNVTMSOT.NT. LE-.HVSVT.HKK-.... FVP-A.EP

E2CMi TAQGKLHLPFKLIPSTCNVPVA'HAPhVIHCGFKHISLOLDTDNLTLLTTRRLGANPEPTTEVIVGKTVRNFTVDRDGLEYIVGNNEPVRVYAOESAPlGDPHGVPHEIVQHYYHRHPVYTIL' ~ ~~ ~~~ ~ ~ ~~~~~~~~~' .16K ,E2C249) AR-~i--N-R--ET..K~iMH-P-i---DGN-~iV-I-V--H. 6. V-iEK....0...Y..GLY.iAA.II K 111

E207V) AVASATVAMNIGVTVAVLCACKARRECLTPYALAPNAVIPTSLALLCCVRSANA ETFTETMSYLVSNSOPFFVVQL CIPLAAFIVLMRCCSCCLPFLVVAGAYLAKVDA* ~~~~~~~. . .* * *- .**.*. . .

E22G)D *-VGMSLLAL.SIFASCYMLVA. .9K.------T-C-AV-..-G-AVi . -PR-H ASVA.. A....DON.AL--LEFAAPVA-LJITYCLRNVLiC-KSLSFL.LLSLGTAR

ElllEl1l) YEHATTVPNVPOIPYKALVERAGYAPLNtLEITVMSSEVLPSTNOCEYITCKFTTVVPSPXIKCCGSLECOP-AAHADYTCXVFGGVYPFMVGGAQCFCDSENSONSEAYVELSAVCASDHAGE11) .-.S-V--*-VGF....HI.--P-... T.OMVETSLE.TL-L....EYK......Yi....AS9-STKEKP--.0 T. ..... Y..T.L..OR....-RH

E10121) AIKVHTAAWKVGLRI!VYGNTTSFLDVYVNGVTPGTSKDLKVIAGPISASFTPFDHKVVIHRGLVYNYOFPEYGAMKPCArCuICATSLTSKDLIASTOIKLLKrbAKNVHVPYTOASSGFEl(121) *Y A .. SL- AKV.V- . * ,VNOTV ........*DHAV-IGGT0F.F.-L.SAV....NI*VYKDEF0-0-- P.-9SGO--R. SRTVE-N-.Y-N-ALK.AR- PG .........-TP...

E1(241) EMWKNNSGRPLOETAPFGCKIAVNPLRAVDCSYGNIPISIDIPNAAFIRTSDAPLVSTVKCEVSECTYSADFGGMATLQYVSDREGQCPVHSHSSTATLCESTVHVLEKGAVTVHFSTASI. .

El(241) KY*LKEK*TA*NTK*'** .0.KT..V..MN.AV....V.MiNL.DS..T.IVE..TJiDLT.T.AT..H.9....VL..T.KTNKN.D-S....N-V...A.AiKKTA-K'L....El(361) POANFIVSLCGKKTTCNAECKPPADHIVSTPHKNDOQEFOAAISKTSVSWLFALFGGASSLLIIGLMIFACCSMNLTSTRRE1(GI) ASPS.V. . . S9RA *-SSS;E * .K*.-PYAASHSNVVFPD.G- AL- .VQKiS *-LGA FA.*AILVLVVYTCIGL*

FIG. 2. Comparison of the amino acid sequence of the Sindbis structural proteins shown in Fig. 1 (upper sequence) with that of the structuralproteins ofSFV (from refs. 10 and 11) (lower sequence). Adot in the SFV sequence means that the amino acid is identical to that above in the Sindbissequence. A dot between the amino acids of the two sequences denotes a conservative substitution (R = K, S = T, D = E, Q = N, V = L = I = MA = G, A = V, Y = F). Possible carbohydrate attachment sites are denoted by asterisks. An attempt was made to keep the number of gaps introducedto a minimum. Nucleotide sequence homology was used to position gaps in some areas where amino acid sequence homology is low. The single letteramino acid code is used as in Fig. 1.

Proc. Natl. Acad. Sci. USA 78 (1981)

Page 4: Sindbis - California Institute of Technology

Proc. Natl. Acad. Sci. USA 78 (1981) 2065

protein synthesis (15). This putative signal sequence is notcleaved from PE2 during protein synthesis; rather, PE2 iscleaved to E2 and E3 during virus maturation (2, 15).

Sindbis E3 is known to be glycosylated (16), and the sequencecontains a single glycosylation site of the type Asn-X-Ser/Thr(Asn-14). Also, during amino acid sequence analysis of PE2 andE3, Asn-14 is not recovered, as is characteristic of glycosylatedasparagine residues, whereas other amino acids around this siteare recovered (M. W. Hunkapiller, J. R. Bell, and J. Mayne,personal communication). Thus, it is virtually certain that Asn-14 is glycosylated. The polysaccharide chain is known to be ofthe complex type (E. G. Strauss, personal communication). Itis interesting that this glycosylated site occurs within the pu-

tative signal sequence for PE2 and is conserved between Sind-bis and SFV. The E3 polypeptide of SFV has, in addition, a

second potential glycosylation site at Asn-60 (11).Glycoprotein E2. Sindbis E2 is 423 amino acids long and has

two potential glycosylation sites. Burke and Keegstra (17) haveshown that Sindbis E2 has two carbohydrate units, one ofwhichis a simple oligosaccharide chain containing only mannose andN-acetylglucosamine and the other of which is a complex chainthat contains, in addition, galactose, fucose, and sialic acid.They isolated glycopeptides after Pronase digestion of E2 andfound that the glycopeptide of the complex chain had the aminoacid composition Asn1Ile08Thro.7, while the simple chain hadthe composition AsnjPhe06ThrO 1. From Fig. 1, it is apparentthat Asn-196 carries a complex chain and Asn-318 has a simplechain. It is ofinterest that SFV E2 also carries two carbohydratechains (11), one at Asn-200 and the second at Asn-262.

Sindbis E2 has two long stretches of uncharged amino acidsnear its COOH terminus. The first of these is 28 amino acidslong (residues 363-390) and begins near the NH2 terminus ofthe hydrophobic root derived from E2 by treatment of the intactvirion with chymotrypsin (unpublished data) (Fig. 1). About 30residues can be removed from the COOH-terminus of PE2 bychymotrypsin treatment of intact microsomes (18), making itclear that this hydrophobic domain traverses the bilayer. E2 has5 or 6 residues of fatty acid covalently attached (19), all ofwhichare found in the root (unpublished data). There are three thre-onines and one serine in this first hydrophobic domain thatcould serve as attachment sites.

The second hydrophobic domain is 23 amino acids long (res-idues 396-418) and also appears to contain attached fatty acid

R. Bell, personal communication). Its orientation in the vi-rion is unknown. The homology (87%) between the Sindbis andSFV proteins in this region is striking and contrasts with the lowhomology found in the first hydrophobic region. This homology,together with the many conserved cysteine residues, suggeststhat this segment may provide the specificity for the interactionbetween the viral nucleocapsid and the glycoproteins duringbudding.

55 Amino Acid Peptide (4200- or 6000-Dalton Protein). Thispeptide is made up of predominantly uncharged amino acids(91%) and probably serves in whole or in part as the signal se-

quence for El. Whether it also has other functions is not known.The corresponding Sindbis and SFV peptides diverge widelv(see Fig. 2), implying that hydrophobicity is important for thefunction of this peptide but the exact sequence is not.

Glycoprotein El. El, the viral hemagglutinin (20), is thelargest of the Sindbis glycoproteins (439 amino acids). Two po-

tential glycosylation sites (Asn-139 and Asn-245) are present,and both appear to be glycosylated. First, El of Sindbis grown

in primary chicken cells contains both simple and complex poly-saccharide chains, although El from Sindbis grown in a contin-

uous hamster cell line contains only complex chains (21, 22).Second, the carbohydrate composition of El suggests that two

chains are present (22). Third, the fact that Sindbis El migratesmore slowly than E2 in NaDodSO4-containing gels, whereasSFV El (which contains a single glycosylation site [ref. 11]) andE2 comigrate, suggests that Sindbis El has two chains, as doesE2. We hypothesize that Asn-139 has a complex chain and Asn-245 has a simple chain in chicken-grown virus and that both havecomplex chains in hamster grown virus. The single glycosylationsite of SFV El, Asn-141 (11) is shifted by two residues from thefirst Sindbis site, a shift that requires mutations in two separatecodons (see Fig. 2).

Sindbis El has two long sequences of uncharged amino acids.The first is 17 amino acids long and begins at Val-80. In the re-gion of El from Asp-75 to Glu-109, which includes this un-charged segment, there are only seven differences betweenSindbis and SFV and four of these are highly conservativechanges (see Fig. 2). This uncharged region is not present inSindbis spikeless particles (unpublished data), suggesting thatit is not imbedded in the viral membrane. It may play a role inmediating virus fusion with intracellular membranes duringpenetration (11).

The second uncharged region is located at the COOH ter-minus of El and begins seven residues from the NH2 terminusof the root of El (Fig. 1; unpublished data). A sequence of 33uncharged amino acids is followed by two arginine residues,suggesting that El spans the bilayer with only two (or at mosta few) residues exposed on the cytoplasmic side. This unchargeddomain contains six serine and three threonine residues, whichare potential attachment sites for the one or two palmitic acidmolecules located in this region of E1 (ref. 19; unpublisheddata). Sindbis and SFV show little homology in this region, butboth sequences are highly hydrophobic (see Fig. 2).

Sites of Cleavage. The NH2 terminus of the Sindbis capsidprotein is Met-Asn-Arg- (refs. 7 and 15; J. R. Bell, personal com-munication), thus identifying the start of this protein. TheCOOH terminus is established by the (capsid) tryptic peptideThr-Thr-Pro-Glu-Gly-Thr-Glu-Glu-Trp sequenced by Boege etal. (7) found at residues 256-264 (see Fig. 1). The NH2-terminalsequence of PE2 (and E3), beginning Ser-Ala-Ala-Pro- J. R.Bell, personal communication), follows directly. Thus, trypto-phan is the COOH-terminal amino acid of the capsid protein,and the protease responsible for the capsid-PE2 cleavage hasa specificity similar to chymotrypsin. It has been suggested thatthis proteolytic activity resides in the capsid protein itself(23-25). The highly conserved region around the cleavage site(see Fig. 2) may be important for a site-specific, viral-encodedcleavage. This cleavage occurs during translation and appar-ently must take place ifthe signal sequence for PE2 is to function(26).The cleavage site between E3 and E2 has yet to be precisely

determined. The amino terminus of E2 is Ser-Val-Ile- (ref. 6;see Fig. 1). E3 is radiolabeled by lysine UJ. Mayne, personalcommunication), therefore, the COOH terminus is either Lysor Lys-Arg. In SFV, the COOH-terminal Arg is apparently re-moved (11). The cleavage of PE2 to form E3 and E2, in eithercase, involves a trypsin-like specificity. The origin and localiza-tion of the protease responsible for this is at present unknown.It occurs relatively late, approximatelT20 min after synthesisof PE2 and is required for virus maturation (2). After cleavage,Sindbis E3 is found in the culture medium (16) while that ofSFVremains associated with the mature virion (27). Thus, the cleav-age may occur in the plasma membrane, outside the cell, orinternally, perhaps in the lumen ofGolgi vesicles shortly beforeor concomitant with their fusion with the plasma membrane.The latter alternative is suggested by analogy to the processingof such proteins as pro-albumin and parathyroid prohormone(11), which are cleaved after double basic amino acids in the

Biochemistry: Rice and Strauss

Page 5: Sindbis - California Institute of Technology

2066 Biochemistry: Rice and Strauss

Golgi shortly before secretion of the mature protein (28). Thefailure of several groups to detect PE2 in the plasma membranelends further support to this idea.The COOH terminus of E2 is (Ala, Asn)-Ala, as determined

with carboxypeptidase Y (T. Crowley, personal communica-.tion). The beginning of El is Tyr-Glu-His- (ref. 6; see Fig. 1).There is a sequence of 55 residues between the end of E2 andthe beginning of El. This is probably the peptide isolated byWelch and Sefton (16) from Sindbis virus-infected cells, whichthey called a 4200 polypeptide. They have located the leucineand methionine residues in the first 20 residues of this poly-peptide by microsequence analysis, finding methionine at po-sition .7 and leucine at. position 10 (B. Sefton, personal.com-.munication), in agreement with the sequence shown in Fig. 1.

The cleavages at the end ofPE2 and the beginning ofEl bothoccur after alanine residues (for SFV as well as Sindbis; see Fig.2). It is thus tempting to propose that signalase is responsiblefor both cleavages (the properties of signalase have recentlybeen reviewed; see ref. 29). If so, and if the signalase activityis restricted, to the lumen of the rough endoplasmic reticulum,it would predict that the COOH-terminal region of PE2 spansthe.membrane twice,, at-least transiently. This could occur iftheCOOH terminus ofPE2 and the 55 amino acid polypeptide forma set of stop transfer sequences and internal signal sequencesthat leads to multiple, crossings of the membrane of the endo-plasmic reticulum. After cleavage, the 55 amino acid peptidecould be membrane-associated or could be released into thelumen of the endoplasmic reticulum. The COOH terminus ofPE2, after cleavage, could remain transmembranous, or it couldfold in such a way as to become partly or completely cytoplas-mic. This domain contains proline and asparagine, which wouldbe unusual in an intramembranous region, but it also. appearsto contain covalently attached lipid J. R. Bell, personal com-munication), suggesting membrane association. Further workwill be required to resolve these questions. It is noteworthy,however, that the two host proteases thought to be involved inprocessing the Sindbis structural polypeptides are located in thelumen ofthe endoplasmic reticulum and in the Golgi apparatus,whereas a virus-encoded protease appears to function in thecytosol. Virus-specific proteases that cleave capsid precursorslocated in the cytosol have also been reported for other viruses(30), including picornaviruses (31), RNA tumor viruses (32), andprobably adenovirus (33). We suggest as a general rule that vi-rus-encoded proteases are used for the processing in the cytosolof virus polypeptides.

We would like to thank E. Lenches for preparing the chicken fibro-blasts; T. E. Crowley for help-with the 26S RNA isolation; H. V. Huang,T. Hunkapiller, and M. and R. Douglas for writing the computer pro-grams and preparing the figures; J. R. Bell, M. W. Hunkapiller, andE. G. Strauss for helpful discussions; and L. Hood and T. Maniatis fortheir generosity in lending us supplies and equipment. We are espe-cially grateful to H. Garoff, W. J. Welch, and B. M. Sefton for sendinguspreprints of their work. This work was supported by Grant PCM 77-26728 from the National Science Foundation, Grants GM 06965 andAl 10793 from the National Institutes of Health, and Biomedical Re-search Support Grant 507 RR 07003 from the National Institutes of

Health. C.M.R. was supported by Training Grant GM 00086 from theNational Institutes of Health.

1. Strauss, E. G. (1978) J. Virol. 28, 466474.2. Strauss, J. H. & Strauss, E. G. (1977) in The Molecular Biology

of Animal Viruses, ed. Nayak, D. (Dekker, New York), Vol. 1,pp. 111-166.

3. Strauss, E. G. & Strauss, J. H. (1980) in The Togaviruses, ed.Schlesinger, R. W. (Academic, New York), pp. 393426.

4. Maxam, A. M. & Gilbert, W. (1980) Methods Enzymol. 65,499560.

5. Smith, D. R. & Calvo, J. M. (1980) Nucleic Acids Res. 8,2255-2274.

6. Bell, J. R., Hunkapiller, M. W., Hood, L. E. & Strauss, J. H.(1978) Proc. Natl. Acad. Sci. USA 75, 2722-2726.

7. Boege, U., Wengler, G., Wengler, G. & Wittman-Liebold, B.(1980) Virology 103, 178-190.

8. Bell, J. R., Strauss, E. G. & Strauss, J. H. (1979) Virology 97,287-294.

9. Ou, J., Strauss, E. G. & Strauss, J. H. (1980) Virology, in press.10. Garoff, H., Frischauf, A.-M., Simons, K., Lehrach, H. & Delius,

'H. (1980) Proc. Natl. Acad. Sci. USA 77, 6376-6380.11. Garoff, H., Frischauf, A.-M., Simons, K., Lehrach, H. & Delius,

H. (1980) Nature (London) 288, 236-241.12. Ghosh, P.- K., Reddy, V. B., Swinscoe, J., Choudary, P. V., Le-

bowitz, P. & Weissman, S. M. (1978) J. Biol. Chem. 253,3643-3647.

13. Fiers, W., Gontreras, R., Haegeman, G., Rogiers, R., Van deVoorde, A., Van Heuverswyn, H., Van Herreweghe, J., Vol-ckaert, G. & Ysebaert, M. (1978) Nature (London) 273, 113-120.

14. Porter, A. G., Barber, C., Carey, N. H., Hallewell, R. A., Threl-fall, G. & Emtage, J. S. (1979) Nature (London) 282, 471-477.

15. Bonatti, S. & Blobel, G. (1979)J. Biol. Chem. 254, 12261-12264.16. Welch, W. J. & Sefton, B. M. (1979)J. Virol. 29, 1186-1195.17. Burke, D. & Keegstra, K. (1979)J. Virol. 29, 546-554.18. Wirth, D. F., Katz, F., Small, B. & Lodish, H. F. (1977) Cell 10,

253-263.19. Schmidt, M. F. G., Bracha, M. & Schlesinger, M. J. (1979) Proc.

Natl. Acad.. Sci. USA 76, 1687-1691.20. Dalrymple, J. M., Schlesinger, S. & Russell, P. K. (1976) Virol-

ogy 69, 93-103.21. Sefton, B. M. & Keegstra, K. (1974)J. Virol. 14, 522-530.22. Burke, D. J. & Keegstra, K. (1976)J. Virol. 20, 676-684.23. Simmons, D. T. & Strauss, J. H. (1974)J. Mol. Biol. 86, 397-409.24. Scupham, R. K., Jones, K. F., Sagik, B. P. & Bose, H. R,, Jr.

(1977)J. Virol. 22, 568-571.25. Aliperti, G. & Schlesinger, M. J. (1978) Virology 90, 366-369.26. Wirth, D. F., Lodish, H. F. & Robbins, P. W. (1979)J. Cell Biol.

81, 154-162.27. Garoff, H., 'Simons, K. & Renkonen, 0. (1974) Virology 61,

493-504.28. Dean, -R. T. & Judah, J. D. (1980) in Comprehensive Biochem-

'istry, Part 1, eds. Florkin, M. & Neuburger, A. (Elsevier, Am-sterdam), Vol. 19B, pp. 233-298.

29. Blobel, G., Walter, P., Chang, C., Goldman, B. M., Erickson,A. H. & Lingappa, V. R. (1979) in Symposia of the Society forExperimental Biology, No. 33, eds. Hopkins, C; R. & Duncan,C. J. (Cambridge Univ. Press, Cambridge, England) pp. 9-36.

30. Butterworth, B. E. (1977) in Current Topics in Microbiology andImmunology, ed. Arber, W. (Springer, Berlin), Vol. 76, pp.1-41.

31. Palmenberg, A. C., Pallansch, M. A. & Rueckert, R. R. (1979)J.Virol. 32, 770-778.

32. von der Helm, K. (1977) Proc. Natl. Acad. Sci. USA 74, 911-915.33. Bhatti, A. R. & Weber, J. (1979) Virology 96, 478-485.

Proc. Natl. Acad. Sci - USA 78 (1981)