materials and methods virus and cell culture · 2003. 5. 29. · m93390], porcine hemagglutinating...


Materials and Methods

Virus and Cell Culture. The newly recognized coronavirus that is associated with severe acute respiratory syndrome (SARS-CoV, Urbani strain) was isolated on Vero cells from the throat washings of a patient who was exposed to the virus in Vietnam and subsequently died from progressive respiratory failure (6). RNA was purified from infected Vero cells by the guanidinium acid-phenol method and used for all subsequent experiments.

Reverse Transcription-Polymerase Chain Reaction (RT-PCR) and Sequencing. The complete sequence of the genome of SARS-CoV was determined using a combination of techniques. Most of the sequence was derived from RT-PCR products that were amplified directly from viral RNA. Initially, degenerate, inosine-containing primers were designed to anneal to sites encoding conserved amino acid motifs that were identified on the basis of alignments of available coronavirus ORF1a, ORF1b, S, HE, M, and N gene sequences. Additional specific primers were designed as sequences were generated from RT-PCR products amplified with the degenerate primers and as SARS-CoV sequences became available from the World Health Organization Laboratory Network (3, 6). In all cases, the RT-PCR products were gel-isolated and purified for sequencing by means of a QIAquick Gel Extraction kit (Qiagen, Inc., Santa Clarita, CA). Both strands were sequenced by automated methods, using fluorescent dideoxy-chain terminators (Applied Biosystems, Foster City, CA).


For RT-PCR products of less than 3 kb, cDNA was synthesized in a 20-µl reaction mixture containing 500 ng of RNA, 200 U of Superscript II reverse transcriptase (Invitrogen Life Technologies, Carlsbad, CA), 40 U of RNasin (Promega Corp., Madison, WI), 100 µM each dNTP (Roche Molecular Biochemicals, Indianapolis, IN), 4 µl of 5X reaction buffer (Invitrogen Life Technologies), and 200 pmol of the reverse primer. The reaction mixture, except for the reverse transcriptase, was heated to 70°C for 2 minutes, cooled to 4°C for 5 minutes, and then heated to 42°C in a thermocycler. The mixture was held at 42°C for 4 minutes, the reverse transcriptase was added, and the reactions were incubated at 42°C for 45 minutes. Two microliters of the cDNA reaction was used in a 50-µl PCR reaction containing 67 mM Tris-HCl (pH 8.8), 1 µM each primer, 17 mM ammonium sulfate, 6 µM EDTA, 2 mM MgCl2, 200 µM each dNTP, and 2.5 U of Taq DNA polymerase (Roche Molecular Biochemicals). The thermocycler program for the PCR consisted of 40 cycles of denaturation at 95°C for 30 seconds, annealing at 42°C for 30 seconds, and extension at 65°C for 30 seconds. For specific primers, the annealing temperature was increased to 55°C.

For amplification of fragments longer than 3 kb, regions of the genome between sections of known sequence were amplified by means of a long RT-PCR protocol and SARS-CoV-specific primers. First-strand cDNA synthesis was performed at 42°C or 50°C using Superscript II RNase H⁻ reverse transcriptase (Invitrogen Life Technologies) according to the manufacturer’s instructions with minor modifications. Coronavirus-specific primers (500 ng) and SARS-CoV RNA (350 ng) were combined with the PCR Nucleotide Mix (Roche Molecular Biochemicals, Mannheim, Germany), heated for 1 minute at 94°C, and cooled to 4°C in a thermocycler. The 5X first-strand buffer, dithiothreitol (Invitrogen), and Protector RNase Inhibitor (Roche Molecular Biochemicals) were added, and the samples were incubated at 42°C or 50°C for 2 minutes. After reverse transcriptase (200 U) was added, the samples were incubated at 42°C or 50°C for 1.5 to 2 hours. Samples were inactivated at 70°C for 15 minutes and subsequently treated with 2 U of RNase H (Roche Molecular Biochemicals) at 37°C for 30 minutes. Long RT-PCR amplification of 5- to 8-kb fragments was performed using Taq Plus Precision (Stratagene, La Jolla, CA) and AmpliWax PCR Gem 100 beads (Applied Biosystems) for “hot start” PCR with the following thermocycling parameters: denaturation at 94°C for 1 minute followed by 35 cycles of 94°C for 30 seconds, 55°C for 30 seconds, an increase of 0.4 degrees per second up to 72°C, and 72°C for 7 to 10 minutes, with a final extension at 72°C for 10 minutes. RT-PCR products were separated by electrophoresis on 0.9% agarose TAE gels and purified by use of a QIAquick Gel Extraction Kit (Qiagen, Inc.).

The sequence of the leader was obtained from the subgenomic mRNA coding for the N gene and from the 5’ terminus of genomic RNA. The 5’ rapid amplification of cDNA ends (RACE) technique (4) was used with reverse primers specific for the N gene or for the 5’ untranslated region. RACE products were either sequenced directly or cloned into a plasmid vector before sequencing. A primer that was specific for the leader of SARS-CoV was used to amplify the region between the 5’ terminus of the genome and known sequences in the rep gene. The 3’ terminus of the genome was amplified for sequencing by use of an oligo-(dT) primer and primers specific for the N gene.

Once the complete genomic sequence had been determined, it was confirmed by sequencing a series of independently amplified RT-PCR products spanning the entire genome. Positive- and negative-sense sequencing primers, at intervals of approximately 300 nt, were used to generate a confirmatory sequence with an average redundancy of 9.1. The confirmatory sequence was identical to the original sequence. The sequence has been deposited in the GenBank sequence database (accession no. AY278741). The sequences of the primers used for sequencing and RT-PCR are available upon request.
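
An average redundancy such as the 9.1 reported here is a coverage statistic: the total number of sequenced bases divided by the genome length. The source does not describe how the figure was computed, so the following is only a minimal sketch. It assumes each sequencing read is represented as a hypothetical 1-based, inclusive (start, end) interval on the genome; the function name and the toy reads are illustrative.

```python
from typing import Iterable, Tuple


def coverage_summary(reads: Iterable[Tuple[int, int]],
                     genome_length: int) -> Tuple[float, int]:
    """Return (average redundancy, minimum depth) for reads on a genome.

    Each read is a hypothetical 1-based inclusive (start, end) interval.
    """
    depth = [0] * genome_length
    for start, end in reads:
        if not 1 <= start <= end <= genome_length:
            raise ValueError(f"read ({start}, {end}) lies outside the genome")
        # Convert to 0-based positions and count this read once per base.
        for pos in range(start - 1, end):
            depth[pos] += 1
    return sum(depth) / genome_length, min(depth)


if __name__ == "__main__":
    # Toy example: three reads on a 10-nt "genome".
    toy_reads = [(1, 10), (1, 10), (6, 10)]
    print(coverage_summary(toy_reads, genome_length=10))
```

A minimum depth of at least 2 would indicate that every position was read more than once, which is the property a confirmatory sequence relies on.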

Microarray Design and DNA Recovery. N gene sequences for SARS-CoV were also obtained using a DNA microarray that contains approximately 11,000 70-mer oligonucleotides representing all complete, previously described reference viral genome sequences available from the National Center for Biotechnology Information, National Library of Medicine (7). Total nucleic acid was amplified from infected cell RNA by using a random-primer protocol as described (1, 7) with the following modifications: first-strand synthesis was primed by using primer A (5’-GTTTCCCAGTCACGATCNNNNNNNNN) followed by PCR amplification with primer B (5’-GTTTCCCAGTCACGATC) for 40 cycles. Aminoallyl-dUTP was incorporated into the PCR product by using an additional 20 cycles of thermocycling. Microarray spots were visualized by fluorescence microscopy (Nikon TE300). Amplified viral DNA hybridized to individual microarray spots was recovered by means of a tungsten wire probe (Omega Engineering, Inc.) mounted on a micromanipulator to scrape a 100-µm area of the microarray. Recovered material was PCR-amplified with primer B and subsequently cloned and sequenced.
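
The relationship between the two primers can be checked mechanically: primer A is primer B followed by a run of random (N) positions, so products primed with A carry a fixed tag that primer B can amplify. The sketch below takes the two sequences from the text; the helper names are illustrative, and treating N as an equal mix of four bases is an assumption.

```python
# Primer sequences as given in the text (5' to 3').
PRIMER_A = "GTTTCCCAGTCACGATCNNNNNNNNN"
PRIMER_B = "GTTTCCCAGTCACGATC"


def random_tail_length(primer: str) -> int:
    """Number of consecutive N positions at the 3' end of the primer."""
    return len(primer) - len(primer.rstrip("N"))


def degeneracy(primer: str) -> int:
    """Number of distinct sequences, assuming each N is any of A, C, G, T."""
    return 4 ** primer.count("N")


if __name__ == "__main__":
    print("A starts with B:", PRIMER_A.startswith(PRIMER_B))
    print("random tail length:", random_tail_length(PRIMER_A))
    print("distinct primer A species:", degeneracy(PRIMER_A))
```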

Sequence Analyses. Predicted amino acid sequences were compared with those from reference viruses representing each species for which complete genomic sequence information was available (group 1 representatives included human coronavirus 229E [GenBank accession no. AF304460], porcine epidemic diarrhea virus [GenBank accession no. AF353511], and transmissible gastroenteritis virus [GenBank accession no. AJ271965]; group 2 representatives included bovine coronavirus [GenBank accession no. AF220295] and mouse hepatitis virus [GenBank accession no. AF201929]; group 3 was represented by infectious bronchitis virus [GenBank accession no. M95169]). Sequences for representative strains of other coronavirus species for which partial sequence information was available were included for some of the structural protein comparisons (group 1 representative strains included canine coronavirus [GenBank accession no. D13096], feline coronavirus [GenBank accession no. AY204704], and porcine respiratory coronavirus [GenBank accession no. Z24675]; and group 2 representatives included three strains of human coronavirus OC43 [GenBank accession nos. M76373, L14643, and M93390], porcine hemagglutinating encephalomyelitis virus [GenBank accession no. AY078417], and rat coronavirus [GenBank accession no. AF207551]).

Sequence alignments and neighbor-joining trees were generated by using ClustalX (5), version 1.83, with the Gonnet protein comparison matrix. The resulting trees were adjusted for final output by using treetool version 2.0.1. Uncorrected pairwise distances were calculated from the aligned sequences by using the Distances program from the Wisconsin Sequence Analysis Package, version 10.2 (Accelrys, Burlington, MA). Distances were converted to percent identity by subtracting each distance from 100.
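
The conversion from uncorrected distance to percent identity is simple arithmetic, and a small sketch may make it concrete. The code below is not the Distances program; it assumes the distance is expressed as differences per 100 compared positions and that columns containing a gap character are skipped, neither of which the source states explicitly. The function names are illustrative.

```python
def uncorrected_distance(aligned_a: str, aligned_b: str,
                         gap_chars: str = "-.") -> float:
    """Differences per 100 compared positions between two aligned sequences.

    Columns where either sequence has a gap are ignored (an assumption).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differences = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a in gap_chars or res_b in gap_chars:
            continue
        compared += 1
        if res_a != res_b:
            differences += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * differences / compared


def percent_identity(distance: float) -> float:
    """Convert an uncorrected distance to percent identity."""
    return 100.0 - distance


if __name__ == "__main__":
    d = uncorrected_distance("ACDE", "ACDF")
    print(d, percent_identity(d))
```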

Poly(A)+ RNA Isolation and Northern Hybridization. Total RNA from infected or uninfected Vero cells was isolated with Trizol reagent (Invitrogen Life Technologies) used according to the manufacturer’s recommendations. Poly(A)+ RNA was isolated from total RNA by use of the Oligotex Direct mRNA Kit (Qiagen), following the instructions for the batch protocol, followed by ethanol precipitation. RNA isolated from 1 cm² of cells was separated by electrophoresis on a 0.9% agarose gel containing 3.7% formaldehyde, followed by partial alkaline hydrolysis (2). RNA was transferred to a nylon membrane (Roche Molecular Biochemicals) by vacuum blotting (Bio-Rad, Hercules, CA) and fixed by UV cross-linking. The DNA template for probe synthesis was generated by RT-PCR amplification of SARS-CoV nt 29,083 to 29,608, by using a reverse primer containing a T7 RNA polymerase promoter to facilitate generation of a negative-sense riboprobe. In vitro transcription of the digoxigenin-labeled riboprobe, hybridization, and detection of the bands were carried out with the digoxigenin system by using the manufacturer’s recommended procedures (Roche Molecular Biochemicals). Signals were visualized by chemiluminescence and detected with x-ray film.


A Dot-plot comparison of coronavirus S proteins (window = 30, stringency = 15). Panels plot HCoV-SARS S (1 to 1,256) against the S protein of HCoV-229E (1 to 1,174), TGEV (1 to 1,449), PEDV (1 to 1,384), MHV (1 to 1,362), BCoV (1 to 1,364), and IBV (1 to 1,163).
B Dot-plot comparison of coronavirus E proteins (window = 30, stringency = 15). Panels plot HCoV-SARS E (1 to 77) against the E protein of HCoV-229E (1 to 78), TGEV (1 to 83), PEDV (1 to 77), MHV (1 to 89), BCoV (1 to 85), and IBV (1 to 109).
C Dot-plot comparison of coronavirus M proteins (window = 30, stringency = 24). Panels plot HCoV-SARS M (1 to 222) against the M protein of HCoV-229E (1 to 226), TGEV (1 to 263), PEDV (1 to 227), MHV (1 to 229), BCoV (1 to 231), and IBV (1 to 226).
D Dot-plot comparison of coronavirus N proteins (window = 30, stringency = 20). Panels plot HCoV-SARS N (1 to 423) against the N protein of HCoV-229E (1 to 390), TGEV (1 to 383), PEDV (1 to 442), MHV (1 to 452), BCoV (1 to 449), and IBV (1 to 410).
E Dot-plot comparison of coronavirus pp1ab proteins (window = 30, stringency = 26). Panels plot HCoV-SARS pp1ab (1 to 7,074) against the pp1ab protein of HCoV-229E (1 to 6,789), TGEV (1 to 6,695), PEDV (1 to 6,792), MHV (1 to 7,132), BCoV (1 to 7,060), and IBV (1 to 6,640).

Figure S1. Identification of conserved regions of coronavirus proteins. The predicted SARS-CoV proteins (S, E, M, N, and pp1ab) were compared to the corresponding proteins from each of six reference viruses for which complete genomic sequence information was available (Group 1: human coronavirus 229E [HCoV-229E], AF304460; porcine epidemic diarrhea virus [PEDV], AF353511; transmissible gastroenteritis virus [TGEV], AJ271965. Group 2: bovine coronavirus [BCoV], AF220295; murine hepatitis virus [MHV], AF201929. Group 3: infectious bronchitis virus [IBV], M95169) using the Compare program of the Wisconsin Sequence Analysis Package version 10.2 (Accelrys, Burlington, MA). A sliding window of 30 amino acids was used for each comparison, with the stringency set in proportion to the pairwise identity. In each panel, the SARS-CoV sequence is depicted along the horizontal axis and the comparison sequence is depicted along the vertical axis. (A) Comparison of coronavirus S proteins, stringency = 15. (B) Comparison of coronavirus E proteins, stringency = 15. (C) Comparison of coronavirus M proteins, stringency = 24. (D) Comparison of coronavirus N proteins, stringency = 20. (E) Comparison of coronavirus pp1ab proteins, stringency = 26.
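
The window and stringency parameters in this caption can be illustrated with a small dot-plot routine. This is a simplified sketch, not the Compare program: it assumes a window scores one point per identical residue pair, whereas Compare may score windows differently (for example with a comparison matrix). The function name and the toy sequences are illustrative.

```python
from typing import List, Tuple


def dot_plot(seq_x: str, seq_y: str,
             window: int = 30, stringency: int = 15) -> List[Tuple[int, int]]:
    """Return (x, y) window start positions that meet the stringency.

    A point is recorded when the ungapped windows starting at x in seq_x
    and y in seq_y share at least `stringency` identical positions.
    Identity scoring is an assumption made for this sketch.
    """
    points = []
    for x in range(len(seq_x) - window + 1):
        for y in range(len(seq_y) - window + 1):
            matches = sum(
                1 for k in range(window) if seq_x[x + k] == seq_y[y + k]
            )
            if matches >= stringency:
                points.append((x, y))
    return points


if __name__ == "__main__":
    # Toy sequences; a real comparison would use full protein sequences.
    toy = "ACDEFGHIKL"
    print(dot_plot(toy, toy, window=4, stringency=4))
```

Raising the stringency removes points from weakly matching windows, which is consistent with the caption setting a higher stringency for the more similar protein pairs.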

Figure S2. Predicted alpha-amphipathic regions in the coronavirus S proteins. Alpha-amphipathic regions were calculated according to Eisenberg. Red boxes represent longer regions of heptad repeat regions, whereas blue boxes show shorter regions. Heptad repeat regions in the carboxyl terminal region of the S protein are thought to collapse into coiled-coils after receptor binding, thus bringing the viral membrane into close proximity with the cellular membrane, leading to fusion.

S proteins shown in the figure: BCoV, HCoV-OC43, HEV, MHV, RtCoV, SARS-CoV, PRCoV, TGEV, CCoV, FCoV, PEDV, HCoV-229E, and IBV.
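
The caption does not say which Eisenberg calculation was used. One common reading is the hydrophobic moment, which measures how strongly the hydrophobicity of a segment is concentrated on one face of an ideal alpha helix (100 degrees per residue). The sketch below assumes that reading. The scale values are the Eisenberg consensus hydrophobicities as commonly tabulated and should be checked against the original publication; the window length of 11 is an arbitrary illustrative choice.

```python
import math
from typing import List

# Eisenberg consensus hydrophobicity scale as commonly tabulated
# (assumption: verify against the original publication before use).
EISENBERG = {
    "A": 0.62, "R": -2.53, "N": -0.78, "D": -0.90, "C": 0.29,
    "Q": -0.85, "E": -0.74, "G": 0.48, "H": -0.40, "I": 1.38,
    "L": 1.06, "K": -1.50, "M": 0.64, "F": 1.19, "P": 0.12,
    "S": -0.18, "T": -0.05, "W": 0.81, "Y": 0.26, "V": 1.08,
}


def hydrophobic_moment(segment: str, angle_deg: float = 100.0) -> float:
    """Mean hydrophobic moment of a segment for a given helical angle."""
    sin_sum = 0.0
    cos_sum = 0.0
    for n, residue in enumerate(segment):
        hydrophobicity = EISENBERG[residue]
        theta = math.radians(n * angle_deg)
        sin_sum += hydrophobicity * math.sin(theta)
        cos_sum += hydrophobicity * math.cos(theta)
    return math.hypot(sin_sum, cos_sum) / len(segment)


def scan_moments(sequence: str, window: int = 11) -> List[float]:
    """Hydrophobic moment for every window along the sequence."""
    return [
        hydrophobic_moment(sequence[i:i + window])
        for i in range(len(sequence) - window + 1)
    ]


if __name__ == "__main__":
    # Toy peptide, not a coronavirus sequence.
    print(scan_moments("LKKLLKKLLKKLLKKL", window=11))
```

Windows with a high moment would be candidates for amphipathic helices; any cutoff separating "longer" from "shorter" regions would have to come from the original analysis.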

HCoV-229E  GYWNVQKR..FRTRKGKRVDLSPKLHFYYLGTGPHKDAKFRERVEGVVWVA
PEDV       GYWNEQIR..WRMRRGERIEQPSNWHFYYLGTGPHGDLRYRTRTEGVFWVA
TGEV       GYWNRQTR..YRMVKGQRKELPERWFFYYLGTGPHADAKFKDKLDGVVWVA
CCoV       GYWNRQTR..YRMVKGRRKNLPEKWFFYYLGTGPHADAKFKQKLDGVVWVA
FCoV       GYWNRQIR..YRIVKGQRKELAERWFFYFLGTGPHADAKFKDKIDGVFWVA
PRCoV      GYWNRQTR..YRMVKGQRKELPERWFFYYLGTGPHADAKFKDKLDGVVWVA
BCoV       GYWYRHNRRSFKTADGNQRQLLPRWYFYYLGTGPHAKDQYGTDIDGVFWVA
MHV        GYWYRHNRRSFKTPDGQHKQLLPRWYFYYLGTGPHAGAEYGDDIEGVVWVA
HEV        GYWYRHNRRSFKTADGNQRQLLPRWYFYYLGTGPHAKDQYGTDIDGVFWVA
RtCoV      GYWYRHNRRSFKTPDGQQKQLLPRWYFYYLGTGPHAGASFGDSIEGVFWVA
IBV        GYWRRQAR..FKPGKGGRKPVPDAWYFYYTGTGPAADLNWGDTQDGIVWVA

SARS-CoV   GYYRRATRR.VRGGDGKMKELSPRWYFYYLGTGPEASLPYGANKEGIVWVA

Figure S3. Conserved motifs in coronavirus N proteins. The predicted amino acid sequences of the nucleocapsid proteins of the indicated coronaviruses were aligned with ClustalX 1.83. The portion of the aligned sequences around the conserved motif, FYYLGTGP, is shown. Amino acids that were identical in at least 11 of the 12 aligned sequences are highlighted in blue. Sequences used for the alignments included the following for group 1: human coronavirus 229E (HCoV-229E), AF304460; porcine epidemic diarrhea virus (PEDV), AF353511; transmissible gastroenteritis virus (TGEV), AJ271965; canine coronavirus (CCoV), D13096; feline coronavirus (FCoV), AY204704; porcine respiratory coronavirus (PRCoV), Z24675; for group 2: bovine coronavirus (BCoV), AF220295; murine hepatitis virus (MHV), AF201929; porcine hemagglutinating encephalomyelitis virus (HEV), AY078417; rat coronavirus (RtCoV), AF207551; for group 3: infectious bronchitis virus (IBV), M95169.
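
The highlighting rule in this caption (identical in at least 11 of the 12 aligned sequences) can be reproduced from the alignment segment shown in the figure. The sketch below uses the twelve 51-column rows from the figure; the function name is illustrative, and skipping columns whose most common character is a gap (".") is an assumption.

```python
from collections import Counter
from typing import List

# Alignment rows transcribed from Figure S3 ("." marks a gap).
ALIGNMENT = {
    "HCoV-229E": "GYWNVQKR..FRTRKGKRVDLSPKLHFYYLGTGPHKDAKFRERVEGVVWVA",
    "PEDV":      "GYWNEQIR..WRMRRGERIEQPSNWHFYYLGTGPHGDLRYRTRTEGVFWVA",
    "TGEV":      "GYWNRQTR..YRMVKGQRKELPERWFFYYLGTGPHADAKFKDKLDGVVWVA",
    "CCoV":      "GYWNRQTR..YRMVKGRRKNLPEKWFFYYLGTGPHADAKFKQKLDGVVWVA",
    "FCoV":      "GYWNRQIR..YRIVKGQRKELAERWFFYFLGTGPHADAKFKDKIDGVFWVA",
    "PRCoV":     "GYWNRQTR..YRMVKGQRKELPERWFFYYLGTGPHADAKFKDKLDGVVWVA",
    "BCoV":      "GYWYRHNRRSFKTADGNQRQLLPRWYFYYLGTGPHAKDQYGTDIDGVFWVA",
    "MHV":       "GYWYRHNRRSFKTPDGQHKQLLPRWYFYYLGTGPHAGAEYGDDIEGVVWVA",
    "HEV":       "GYWYRHNRRSFKTADGNQRQLLPRWYFYYLGTGPHAKDQYGTDIDGVFWVA",
    "RtCoV":     "GYWYRHNRRSFKTPDGQQKQLLPRWYFYYLGTGPHAGASFGDSIEGVFWVA",
    "IBV":       "GYWRRQAR..FKPGKGGRKPVPDAWYFYYTGTGPAADLNWGDTQDGIVWVA",
    "SARS-CoV":  "GYYRRATRR.VRGGDGKMKELSPRWYFYYLGTGPEASLPYGANKEGIVWVA",
}


def conserved_columns(rows: List[str], min_count: int = 11) -> List[int]:
    """0-based columns where one residue occurs in at least min_count rows."""
    length = len(rows[0])
    if any(len(row) != length for row in rows):
        raise ValueError("all alignment rows must have the same length")
    columns = []
    for i in range(length):
        residue, count = Counter(row[i] for row in rows).most_common(1)[0]
        if residue != "." and count >= min_count:
            columns.append(i)
    return columns


if __name__ == "__main__":
    cols = conserved_columns(list(ALIGNMENT.values()))
    sars = ALIGNMENT["SARS-CoV"]
    # Show conserved columns as SARS-CoV residues, others as "-".
    print("".join(sars[i] if i in cols else "-" for i in range(len(sars))))
```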

Gel band labels: S (180 kD); N (55 kD)

Figure S4. SDS-PAGE analysis of purified SARS-CoV virions. SARS-CoV was concentrated from infected Vero cell supernatant medium by precipitation with polyethylene glycol. Virions were purified by centrifugation through a 20-60% sucrose gradient before being subjected to electrophoresis on a 10% SDS-PAGE gel. Proteins were visualized by staining with Coomassie Blue. A preparation of Ebola virus proteins of known molecular weights was included as size markers.

Table S1. Locations of SARS-CoV ORFs and sizes of predicted proteins and mRNAs

              Genome Location                 Predicted Size
ORF   TRS (a)   ORF Start   ORF End   Protein (aa)   mRNA (nt) (b)
1a         72         265    13,398          4,378   29,727
1b          -      13,398    21,482          2,695   -
S      21,491      21,492    25,256          1,255   8,308 (c)
X1     25,265      25,268    26,089            274   4,534 (c)
X2          -      25,689    26,150            154   -
E           -      26,117    26,344             76   -
M      26,353      26,398    27,060            221   3,446 (c)
X3          -      27,074    27,262             63   -
X4     27,272      27,273    27,638            122   2,527 (c)
X5     27,778      27,864    28,115             84   2,021 (d)
N      28,111      28,120    29,385            422   1,688 (c)

(a) The location is the 3’-most nucleotide in the consensus transcriptional regulatory sequence (TRS), AAACGAAC.
(b) Not including poly(A). Predicted size is based on the position of the conserved TRS.
(c) Corresponding mRNA detected by Northern blot analysis (Fig. 1C).
(d) No mRNA corresponding to utilization of this consensus TRS was detected by Northern blot analysis (Fig. 1C).
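
The predicted sizes in Table S1 can be cross-checked with two pieces of arithmetic. The source does not state the formulas; the ones below are inferred because they reproduce every value in the table: protein length is the ORF length in nucleotides divided by 3, and predicted mRNA length is the leader up to its TRS (position 72) plus the genome from the body TRS to the 3’ end (genome length 29,727, excluding poly(A)). Names in the code are illustrative.

```python
GENOME_LENGTH = 29_727   # Urbani genome without poly(A), from the tables
LEADER_TRS = 72          # TRS position listed for ORF 1a

# (ORF, body TRS, ORF start, ORF end, protein aa, mRNA nt) from Table S1.
# None marks cells that are empty in the table.
ORFS = [
    ("1a", 72,     265,    13_398, 4_378, 29_727),
    ("1b", None,   13_398, 21_482, 2_695, None),
    ("S",  21_491, 21_492, 25_256, 1_255, 8_308),
    ("X1", 25_265, 25_268, 26_089, 274,   4_534),
    ("X2", None,   25_689, 26_150, 154,   None),
    ("E",  None,   26_117, 26_344, 76,    None),
    ("M",  26_353, 26_398, 27_060, 221,   3_446),
    ("X3", None,   27_074, 27_262, 63,    None),
    ("X4", 27_272, 27_273, 27_638, 122,   2_527),
    ("X5", 27_778, 27_864, 28_115, 84,    2_021),
    ("N",  28_111, 28_120, 29_385, 422,   1_688),
]


def protein_length(orf_start: int, orf_end: int) -> int:
    """Codons spanned by an ORF with inclusive coordinates (inferred rule)."""
    return (orf_end - orf_start + 1) // 3


def mrna_length(body_trs: int) -> int:
    """Leader through its TRS plus genome from the body TRS to the 3' end."""
    return LEADER_TRS + (GENOME_LENGTH - body_trs)


if __name__ == "__main__":
    for name, trs, start, end, aa, mrna in ORFS:
        predicted_mrna = mrna_length(trs) if trs is not None else None
        print(name, protein_length(start, end) == aa, predicted_mrna == mrna)
```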

Table S2. Calculated molecular weights and potential N-linked glycosylation sites of coronavirus S proteins.

Virus        S Protein MW   Potential Glycosylation Sites (NXS / NXT / Total)
HCoV-229E    128,653         9 / 21 / 30
PEDV         151,371         7 / 22 / 29
TGEV         160,136         6 / 26 / 32
CCoV         160,487         4 / 29 / 33
FCoV         160,489         4 / 31 / 35
PRCoV        134,809         4 / 25 / 29

BCoV         150,889         8 / 11 / 19
MHV          149,861        11 /  9 / 20
HCoV-OC43    150,108        10 / 11 / 21
HEV          149,512        10 /  9 / 19
RtCoV        149,566        11 / 10 / 21

IBV          128,062        12 / 17 / 29

SARS-CoV     137,688         7 / 16 / 23

The predicted molecular weight and consensus glycosylation sites were calculated for representative coronavirus S proteins. Sequences used for the comparisons included the following: for group 1: human coronavirus 229E (HCoV-229E), AF304460; porcine epidemic diarrhea virus (PEDV), AF353511; transmissible gastroenteritis virus (TGEV), AJ271965; canine coronavirus (CCoV), D13096; feline coronavirus (FCoV), AY204704; porcine respiratory coronavirus (PRCoV), Z24675; for group 2: bovine coronavirus (BCoV), AF220295; murine hepatitis virus (MHV), AF201929; human coronavirus OC43 (HCoV-OC43), M76373, L14643, M93390; porcine hemagglutinating encephalomyelitis virus (HEV), AY078417; rat coronavirus (RtCoV), AF207551; and for group 3: infectious bronchitis virus (IBV), M95169.
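
Counting potential N-linked glycosylation sites amounts to scanning a protein sequence for the NXS and NXT sequons. The source does not say how its counts were produced, so the sketch below is only an illustration. It assumes the common convention that X may be any residue except proline; that exclusion is exposed as a parameter because the table does not state whether it was applied. The function name and toy sequence are illustrative.

```python
from typing import Tuple


def count_sequons(protein: str,
                  exclude_proline: bool = True) -> Tuple[int, int, int]:
    """Return (NXS, NXT, total) counts of potential N-glycosylation sites.

    Overlapping sequons are counted separately. Excluding proline at
    position X is an assumed convention, not taken from the source.
    """
    nxs = 0
    nxt = 0
    for i in range(len(protein) - 2):
        if protein[i] != "N":
            continue
        if exclude_proline and protein[i + 1] == "P":
            continue
        if protein[i + 2] == "S":
            nxs += 1
        elif protein[i + 2] == "T":
            nxt += 1
    return nxs, nxt, nxs + nxt


if __name__ == "__main__":
    # Toy peptide, not a coronavirus sequence.
    print(count_sequons("NASNPTNGT"))
```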

Table S3. Description and comparison of SARS-CoV genomic sequences available in GenBank (as of 25 April 2003).

Source (a)   Strain      Accession    GI         Mod Date      Length   Poly(A)   Unique (b)   5' End (c)
HKU          HKU-39849   AY278491.2   30023963   18-Apr-2003   29,742   15        29,727         0
CUHK         CUHK-W1     AY278554.2   30027610   21-Apr-2003   29,736   24        29,712       -15
CDC          Urbani      AY278741.1   30027617   21-Apr-2003   29,727    0        29,727         0
BCCA GSC     TOR2        AY274119.2   30088476   23-Apr-2003   29,736   24        29,712       -15

Position (d)   Consensus   HKU-39849   CUHK-W1   Urbani   TOR2
 2,601         T           C           *         *        *
 7,746         G           *           T         *        *
 7,919         C           *           *         T        *
 7,930         G           A           *         *        *
 8,387         G           C           *         *        *
 8,417         G           C           *         *        *
 9,404         T           *           C         *        *
 9,479         T           *           C         *        *
13,494         G           A           *         *        *
13,495         T           G           *         *        *
16,622         C           *           *         T        *
17,564         T           *           G         *        *
17,846         C           *           T         *        *
18,065         G           A           *         *        *
19,064         R           A           G         G        A
21,721         G           *           A         *        *
22,222         T           *           C         *        *
23,220         T           *           *         *        G
24,872         T           *           *         C        *
25,298         G           *           *         *        A
25,569         T           A           *         *        *
26,600         C           T           *         *        *
26,857         T           *           *         C        *
27,827         T           *           C         *        *

(a) Original source of the sequence information: The University of Hong Kong (HKU); Chinese University of Hong Kong (CUHK); US Centers for Disease Control and Prevention (CDC); British Columbia Cancer Agency, Genome Sciences Centre (BCCA GSC).
(b) Length of the unique sequence without poly(A).
(c) Number of nucleotides missing from the 5’-end, assuming that the longest reported sequences are full-length.
(d) Position based on alignment with the two longer sequences.


References

1. S. K. Bohlander et al., Genomics 13, 1322 (1992).
2. T. Brown, in Current Protocols in Molecular Biology, vol. 1, F. M. Ausubel, R. Brent, R. E. Kingston, D. D. Moore, J. G. Seidman, J. A. Smith, K. Struhl, Eds. (John Wiley & Sons, Inc., New York, N.Y., 1996), ch. 4.9.
3. C. Drosten et al., N Engl J Med (2003). Available April 17 at http://nejm.org/earlyrelease/sars.asp#4-2
4. B. H. Harcourt et al., Virology 271, 334 (2000).
5. J. D. Thompson et al., Nucleic Acids Res 25, 4876 (1997).
6. T. G. Ksiazek et al., N Engl J Med 348, 1947 (2003).
7. D. Wang et al., PNAS 99, 15687 (2002).