

THE JOURNAL OF BIOLOGICAL CHEMISTRY © 1988 by The American Society for Biochemistry and Molecular Biology, Inc.

Vol. 263, No. 23, Issue of August 15, pp. 11257-11262, 1988 Printed in U.S.A.

Gene Structure of the Human Mitochondrial Adenosine Triphosphate Synthase β Subunit*

(Received for publication, November 13, 1987)

Shigeo Ohta‡, Hideaki Tomura, Kakuko Matsuda, and Yasuo Kagawa

From the Department of Biochemistry, Jichi Medical School, Minamikawachi-machi, Tochigi-ken, 329-04 Japan

The mitochondrial ATP synthase β subunit is encoded by a nuclear gene and assembled with the other subunits encoded by both mitochondrial and nuclear genes. As the next step in the analysis of the molecular mechanisms coordinating the two genetic systems, the gene for the human β subunit was cloned, and its structure was determined. The gene contains 10 exons, with the first exon corresponding to the noncoding region and most of the presequence which targets this protein to the mitochondria. Eight Alu repeating sequences including inverted repeats were found in the 5' upstream region and introns. An S1 nuclease protection experiment revealed two initiation sites for the transcription. A typical TATA box was not present at about 30 base pairs upstream from either initiation site. Three CAT boxes (CCAAT) were found between the two initiation sites. In addition, one CAT box was found 41 base pairs upstream from the first initiation site. Two GC boxes (potential Sp1 binding sites) were located in the 5' upstream region, one of them linked to Alu repeating sequences. For determination of the promoter activity, fragments of various lengths from the 5' upstream region were fused to a chloramphenicol acetyltransferase gene and transfected into cultured cells. This experiment showed the existence of an enhancing structure(s) for transcription between nucleotide -400 and -1100 in the upstream region.

Mitochondrial ATP synthase (FoF1) catalyzes ATP formation, using the energy of proton flux through the inner membrane during oxidative phosphorylation (1, 2). Two subunits of mammalian FoF1 are encoded by a mitochondrial gene (3) and the other subunits (7-12 subunits) by a nuclear gene (4). The β subunit is encoded by the nuclear genome, synthesized in the cytosol, imported into mitochondria, and then assembled with the other subunits (5). The numbers of mitochondria per cell vary greatly depending on the developmental stage, cell activity, and type of tissue (6, 7). These facts suggest a functional interaction between the two genetic systems. However, the molecular mechanism for coordinating the two genetic systems is unknown. Understanding the molecular basis of this coordination requires an analysis of the regulatory system for the subunit encoded on the nuclear genome. As the first step, we have cloned cDNA of the human β subunit (8).

* This research was supported by Grant 62617006 from the Ministry of Education, Science, and Culture of Japan. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EMBL Data Bank with accession number(s) J03906.

‡ To whom correspondence should be addressed.

All subunit structures of FoF1 reported to date, for either prokaryotes or eukaryotes, are very similar (1, 2). In particular, the primary structure of the β subunit is highly conserved between various species. In addition, it contains sequences very similar to those in parts of some nucleotide binding proteins, such as recA protein, adenylate kinase, and the ras gene product (2). Thus, it may be possible to correlate the function of each region of the human β subunit with exon structure.

Here, we report the organization of the gene, some structures involved in its expression, and the existence of an enhancing structure for transcription in the 5' upstream region.

EXPERIMENTAL PROCEDURES

Gene Cloning and DNA Sequencing-Genomic DNA encoding the β subunit was isolated from a human genomic library, using the β subunit cDNA (8) (EcoRI fragment; 1.8 kilobase pairs) as a hybridization probe. The library was a gift from Dr. Nojima (Department of Pharmacology, Jichi Medical School). Human genomic DNA was partially digested with HaeIII and AluI and ligated to the EcoRI sites of a λ phage DNA, Charon 4A, with EcoRI linkers. Phage DNA was purified from positive plaques as described (9). DNA fragments were subcloned into plasmid pUC18 or pUC19 (Fig. 1). The nucleotide sequence was determined by the dideoxy nucleotide chain termination method using M13 phage single-stranded DNA as template (10, 11). Most of the sequence was determined by the shot-gun sequence method using sonication for shearing the fragments (12). The DNA sequences were assembled and analyzed with the GENETYX program (SDC Inc., Tokyo).

S1 Nuclease Mapping-S1 nuclease mapping was performed as described (13). The EcoRI-BamHI fragment (2.2 kilobase pairs) was subcloned into the plasmid pTZ18R (purchased from Pharmacia LKB Biotechnology Inc.). The HinfI fragment (nucleotide -416 to +273, Fig. 2; 689 bp)¹ was isolated by agarose electrophoresis and treated with calf intestinal phosphatase. The fragment was labeled at the 5' end with T4 polynucleotide kinase and [γ-32P]ATP. The labeled fragment was digested with DraI (at nucleotide -355). The fragments (61 and 628 bp) labeled at their 5' ends were hybridized with poly(A+) RNA and digested with S1 nuclease. Poly(A+) RNA was prepared from HeLa cells (14).
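The fragment sizes quoted for the S1 probe follow from simple coordinate arithmetic. As a minimal sketch, assuming the quoted positions can be treated as cut coordinates on a single axis (so that a length is the difference between two coordinates), the size of the HinfI probe and of its DraI digestion products can be checked as follows. The function name `digest` is illustrative and is not part of the original methods.

```python
def digest(start, end, cut_sites):
    """Return fragment lengths (bp) after cutting the span [start, end].

    Coordinates are positions relative to the upstream initiation site.
    Assumption: a length is the plain difference between two coordinates.
    Cut sites outside the span are ignored.
    """
    inside = sorted(c for c in cut_sites if start < c < end)
    bounds = [start, *inside, end]
    return [right - left for left, right in zip(bounds, bounds[1:])]


HINFI_LEFT, HINFI_RIGHT = -416, 273   # ends of the HinfI fragment
DRAI_SITE = -355                      # DraI cut within the probe

probe_length = HINFI_RIGHT - HINFI_LEFT
fragments = digest(HINFI_LEFT, HINFI_RIGHT, [DRAI_SITE])

print(probe_length)  # 689
print(fragments)     # [61, 628]
```

Under this assumption the probe is 689 bp and DraI divides it into pieces of 61 and 628 bp, each retaining one labeled 5' end.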

¹ The abbreviation used is: bp, base pair(s).

FIG. 1. A schematic illustration of the restriction map and the sequencing strategy. [Map not reproduced; scale, -2 to 9 kb.] The exons are shown by boxes, and the encoding regions are shown by closed boxes. The fragments were subcloned as shown by long arrows. The small fragments were obtained by sonication or with restriction enzymes for the nucleotide sequencing. Some parts were determined by using synthetic oligonucleotides (17 bases in length; synthesized with an automatic DNA synthesizer, Applied Biosystems Model 380B, Foster City, CA) as primers. The direction and extent of sequence determinations are indicated by short arrows. Three kinds of arrows indicate the nucleotide sequence obtained by sonication, with restriction enzymes, or by using a synthetic oligonucleotide as a primer, respectively. The locations of the Alu repetitive sequences in the human F1-ATPase β subunit gene are also shown by arrows.

CAT Assay-An EcoRI-BamHI (nucleotide +432, Fig. 2) fragment containing the 5' upstream region, the first exon, and part of the first intron of the β subunit gene was subcloned into plasmid pTZ18R. The plasmid was digested with EcoRI or DraI (nucleotide -355), shortened by treatment with Bal31 exonuclease, and then digested with BamHI. The ends of the digested fragments were filled in with T4 polymerase, ligated with HindIII linkers (dCAAGCTTG), digested with HindIII, and then fractionated by agarose electrophoresis. The isolated fragments were ligated to HindIII-digested pMLCAT (15, 16). pSV2CAT was used as a positive control (17). The plasmids were isolated by the alkaline sodium dodecyl sulfate method (9), digested with DNase-free RNase A, and purified by CsCl gradient centrifugation (9) followed by Sephacryl S 2000 column chromatography (Pharmacia LKB Biotechnology Inc.). These plasmids were RNA-free. The orientations and lengths of the inserts were determined by electrophoresis after digestion with selected restriction enzymes. Plasmid DNA (10 μg) was transfected into HeLa or mouse A9 cells (2 × 10^6 cells in 9-cm dishes) (14, 18). After 2 days, the cells were lysed, and the lysates (2 mg of protein/ml) were incubated with acetyl-CoA and [14C]chloramphenicol for 1 h at 37 °C (14). The acetylated chloramphenicol was separated by thin layer chromatography on Merck high performance thin layer chromatography precoated plates of Silica Gel 60 with a concentrating zone. For quantitative analysis, the silica gel around the radioactive spot was scraped off and assayed for radioactivity in a scintillation spectrometer.
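The counts obtained from the scraped spots can be turned into a single activity value in several ways, and the text does not specify the calculation used. One common convention, shown here purely as an illustrative sketch, expresses CAT activity as the percentage of chloramphenicol converted to acetylated forms and then compares constructs as a ratio. The function names and all counts in the example are hypothetical and are not data from this study.

```python
def percent_acetylated(acetylated_cpm, unacetylated_cpm, background_cpm=0.0):
    """Percentage of total counts recovered in the acetylated spots.

    Assumption: the same background is subtracted from both spots, and
    negative values after subtraction are treated as zero.
    """
    acetylated = max(acetylated_cpm - background_cpm, 0.0)
    unacetylated = max(unacetylated_cpm - background_cpm, 0.0)
    total = acetylated + unacetylated
    if total <= 0:
        raise ValueError("no counts above background")
    return 100.0 * acetylated / total


def relative_activity(sample_percent, reference_percent):
    """Activity of one construct relative to a reference construct."""
    if reference_percent <= 0:
        raise ValueError("reference activity must be positive")
    return sample_percent / reference_percent


# Hypothetical scintillation counts (cpm) for two constructs.
reference = percent_acetylated(1_000, 49_000)
sample = percent_acetylated(6_000, 44_000)
print(reference, sample, relative_activity(sample, reference))  # 2.0 12.0 6.0
```

A ratio of this kind is only meaningful while the conversion stays in the linear range of the assay, which is one reason lysate protein and incubation time are held constant across constructs.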

RESULTS

Molecular Cloning-A human genomic gene library constructed in Charon 4A was screened with 32P-labeled human F1 β subunit cDNA (8). Out of a total of 1,500,000 plaques, we identified five positives. The positive plaques were purified by successive rounds of plaque hybridization. The phage DNAs were isolated and subjected to "Southern" analysis using the 5' region (EcoRI-AatII fragment; 280 bp) and 3' region (SmaI-EcoRI fragment; 886 bp) of the cloned cDNA (8) as hybridization probes. Three kinds of clones were isolated. DNA from the first clone hybridized with 5' and 3' regions of the cDNA. The second clone appeared to contain an allelic gene of the first one, because all but the KpnI fragments were identical. DNA in the third clone was shorter than that in the others, so this gene may be a pseudogene.² Therefore, the structure of the DNA in the first clone was determined in detail.

FIG. 2. Nucleotide sequence of the gene for the human ATP synthase β subunit and the deduced amino acid sequence of the exons. [Sequence listing not reproduced; it is numbered from about -1800 to about +8350. Labeled features include a Z-DNA region, Alu and inverted Alu repeating sequences, GC boxes, CAT boxes, the first and second initiation sites of transcription, the HinfI, DraI, and BamHI sites, exon and intron boundaries, and the poly(A) signal.] The sequences described in the text are noted with underlines or boxes. The nucleotides are numbered from the upstream initiation site for transcription. Since part of the gene was not sequenced, numbers are not accurate from the 5809th nucleotide. The undetermined segment was judged to be 2.0 kilobases in length based on an electrophoretic analysis.

² H. Tomura, unpublished result.

Organization of the human F1 ATPase β subunit gene is shown in Fig. 1, and most of the nucleotide sequence is shown in Fig. 2. The intron/exon junctions were determined by direct comparison of the nucleotide sequences with the cloned cDNA. The sequence at the 5' end of the cDNA was not found in the gene. In addition, the sequence at the point of divergence of the cDNA and genomic sequences did not agree with the consensus sequence (AG/GT rule) at intron/exon junctions (19). An S1 mapping experiment showed that mRNA from HeLa cells hybridized with the fragment from the genomic gene (Fig. 4). Therefore, we conclude that 18 nucleotides encoding the presequence were rearranged, probably during construction of the cDNA library. The presequence should be MLGFVGRVAAAPASGALRRLTPSASLPPAQLLLRAVRRRSHPVREYAAQ instead of MTSLWGKBTGCKLFKFRVAAAPASGALRRLTPSASLPPAQLLLRAVRRRSHPVREYAAQ (8). Sequences at the other intron/exon junctions were completely consistent with the "AG/GT" rule. The gene was composed of 10 exons. The first exon contained the noncoding region and most of the presequence which targets this protein to the mitochondria.

The amino acid sequence of the human F1 β subunit is highly homologous to that of tobacco (20) (Fig. 3B). However, the gene organization, junction points, and the lengths of exons and introns are quite different from those of the tobacco gene (Fig. 3A). Four of the splice junctions in the human gene were close to their counterparts in tobacco except that they were shifted in the 3' direction by three or four codons (Fig. 3B).

FIG. 3. Comparison of the human and tobacco (20) genes. [Figure not reproduced; panel A shows human exons 1-10 and tobacco exons 1-9.] A, the lengths of the exons and the introns are illustrated. The intron/exon junctions are connected with their counterparts with lines. Only the encoding regions are shown by boxes. B, the deduced amino acid sequences. The identical amino acid residues are boxed. The intron points are shown by arrows.

Structure in Noncoding Region-A stretch of 29 alternating Ts and Gs began at nucleotide -1558. This sequence has the potential to form a left-handed helical structure or Z-DNA (21). Eight Alu repeating sequences (22) are present in the gene (Figs. 1 and 2). Three inverted repeats are located in the 5' upstream region. The seventh intron contains two direct repeats of the Alu sequence.

Structure Related to Transcription-The sites of transcription initiation were determined by S1 nuclease mapping analysis of the 5' end of the gene. The S1 nuclease protection experiment showed the existence of two initiation sites for transcription (Fig. 4). The first initiation site is at +1 and the second at +197. Putative TATA boxes (19) were not found around 30 nucleotides upstream from either initiation site. Three CAT boxes (CCAAT) (19, 23, 24) were found between the first and second initiation sites, and one at -41. Two GC boxes (CCGCCC) (23, 25) were found, one of them linked to Alu repeating sequences (Fig. 2).

Cis-acting sequences involved in the regulation of transcription were assayed by measuring expression of CAT in transient expression assays. Various lengths of the 5'-flanking region of the β gene were inserted into the DNA of a plasmid which harbors the gene for chloramphenicol acetyltransferase. These plasmids were transfected into HeLa cells or A9 cells by calcium phosphate co-precipitation (14, 17).
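The sequence features noted under "Structure in Noncoding Region" and "Structure Related to Transcription" (CCAAT and CCGCCC motifs, and a run of alternating T and G) are the kind of elements that can be located by a simple text search on both strands. The following is a sketch only: it is not the GENETYX procedure used in this study, the helper names are hypothetical, and the test sequence is invented rather than taken from Fig. 2.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string containing only A, C, G, T."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_motif(seq, motif):
    """Return sorted (0-based start, strand) pairs for a motif.

    '+' means the motif itself occurs at that position on the given strand;
    '-' means its reverse complement does, i.e. the motif lies on the
    opposite strand. Overlapping occurrences are reported.
    """
    seq, motif = seq.upper(), motif.upper()
    patterns = {motif: "+"}
    # A palindromic motif equals its reverse complement; search it once only.
    patterns.setdefault(reverse_complement(motif), "-")
    hits = []
    for pattern, strand in patterns.items():
        # Zero-width lookahead so that overlapping matches are all found.
        for match in re.finditer("(?=" + re.escape(pattern) + ")", seq):
            hits.append((match.start(), strand))
    return sorted(hits)


def longest_tg_run(seq):
    """Return (0-based start, length) of the longest alternating T/G run."""
    runs = re.finditer(r"(?:TG)+T?|(?:GT)+G?", seq.upper())
    best = max(runs, key=lambda m: len(m.group()), default=None)
    return None if best is None else (best.start(), len(best.group()))


# Invented test sequence, not from Fig. 2.
toy = "AACCAATGGGCGGTTGTGTGTGA"
print(find_motif(toy, "CCAAT"))   # [(2, '+')]
print(find_motif(toy, "CCGCCC"))  # [(7, '-')]
print(longest_tg_run(toy))        # (14, 8)
```

Positions returned here are 0-based offsets into the string; converting them to the numbering of Fig. 2 would require the offset of the first nucleotide relative to the upstream initiation site.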

The proximal 200-bp segment of the 5'-flanking DNA region was sufficient to elicit measurable levels of expression (pHB3CAT). The fragment between 400 and 1100 bp (pHB1CAT) enhanced the expression 9-fold over the level obtained with pHB3CAT (Fig. 5). Similar results were obtained on transfection into mouse A9 cells (data not shown). These results suggest the existence of an enhancing structure upstream from about -400.


FIG. 4. Determination of the transcriptional initiation sites by S1 mapping. The right panel (diagram labels: HinfI site, +276 bp, mRNA, S1) shows the outline of the experiment described under "Experimental Procedure." The hybrid protected from S1 nuclease digestion was subjected to electrophoresis in a sequencing gel (6% acrylamide gel with buffer gradient). Lane 1, the sequence ladder obtained with a mixture of four bases. Lane 2, fragment markers obtained by digestion of φX174 DNA with HaeIII (sizes marked on the gel: 310, 281, 271, 234, 194, 118, and 72). Lane 3, labeled fragment hybridized with poly(A+) RNA (7 µg) prepared from HeLa cells. Lane 4, fragment hybridized with 7 µg of yeast tRNA (control). The two arrows indicate the two fragments protected against S1 nuclease in lane 3.
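As a rough guide to reading such a gel, the length of the probe protected from S1 nuclease follows from the position of the probe's labeled end and the initiation site. The sketch below assumes that the labeled end lies at +276 (the position marked in the Fig. 4 diagram) and that positions are counted inclusively. Both are assumptions made for illustration, not values stated in the text, and the function name is hypothetical.

```python
# Assumption: the probe's labeled end is at +276 (label in the Fig. 4 diagram).
LABELED_END = 276

def protected_length(initiation_site, labeled_end=LABELED_END):
    """Length (nt) of probe protected by a transcript starting at initiation_site."""
    if not 1 <= initiation_site <= labeled_end:
        raise ValueError("initiation site must lie within +1..labeled_end")
    # Inclusive count of bases from the initiation site to the labeled end.
    return labeled_end - initiation_site + 1

sites = [1, 197]  # the two initiation sites reported in the text
lengths = {site: protected_length(site) for site in sites}
for site, nt in lengths.items():
    print(f"initiation at {site:+d} -> protected fragment of about {nt} nt")
```

Under these assumptions the two protected fragments would be about 276 and 80 nucleotides long, both within the range of the HaeIII size markers in lane 2.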

DISCUSSION

Organization of the Gene-The FoF1 complex is widely distributed in most organisms. The β subunit is a catalytic center of FoF1, and its structure is highly conserved (2). Thus, the organization of the gene should provide evidence of evolutionary relationships.

Boutry and Chua (20) described the organization of the gene for the β subunit of Nicotiana plumbaginifolia. The length and number of introns and exons differ from those of the human gene. Four intron/exon junctions of the human gene were close to their counterparts in tobacco, but shifted three or four codons in the 3' direction. The shifting may be explained by mutations that block a splice junction, unmasking a cryptic splice site that adds or deletes small amounts of protein (26). However, it is difficult to explain why the splice junctions were shifted in the same direction in each case.

Most mitochondrial proteins encoded in the nuclear genome have transient presequences (5). These peptides, which direct proteins into the mitochondria, are sometimes embedded in a cytosolic protein (27). Since such targeting sequences are often found even in Escherichia coli (28), they may have been fused to an existing protein during evolution, making it into a mitochondrial protein. According to the "exon-shuffling" model (29), such an additional function should be provided by a separate exon. In fact, the first exon in the gene for the human ATP synthase β subunit corresponds to the presequence. According to the "module hypothesis" by Go (30), an exon is a structural unit which makes a compact peptide. This suggests that the presequence peptide has a compact structure.

FIG. 5. Promoter activity of the 5' region shown by CAT assay. HeLa cells were transfected with 10 µg of DNA from various plasmids. The lengths of the human gene fragments in pHB1CAT, pHB5CAT, pHB7CAT, and pHB3CAT are ~1500, ~830, ~740, and ~640 bp, respectively. Since the fragments were ligated at the BamHI site (nucleotide +432), all the fragments contain the first exon and part of the first intron. Therefore, the lengths of the 5'-flanking region are ~1100, ~400, ~300, and ~200 bp, respectively. pHB1CAT should contain 2 Alu repeats connected to a GC box. In pSV2CAT, the CAT gene was connected to an SV40 promoter. pMLCAT did not contain a promoter (negative control). A, autoradiogram obtained in a typical experiment. The spots are chloramphenicol acetylated by the chloramphenicol acetyltransferase expressed in HeLa cells. B, average values obtained in five separate experiments in which independent plasmid preparations were used. The values were determined by titration of cell extracts in the region where the increase in activity was linear.

Nucleotide binding proteins have regions of sequence similarity which constitute nucleotide binding sites. In the β subunit, such regions are found in exons 5 and 6 (8). The Alu repeating sequences are located in introns 3 and 7. Thus, the exons that code for the nucleotide binding sites may have been rearranged at the sites of the repetitive sequences during evolution. Our finding of a pseudo-gene lacking exons 4-7³ is consistent with this hypothesis. The cluster of acidic and basic amino acid residues in the β subunit is proposed to induce the conformational change by a proton flux (31). The region for the acid-base cluster was found in exon 9, also located between two Alu sequences.

³ H. Tomura and S. Ohta, unpublished observation.

Expression of the Gene-ATP synthase and some enzymes of the respiratory chain are complexes of the products of both nuclear and mitochondrial genes (4, 32). Mitochondria vary in number and morphology depending on intracellular and extracellular conditions. However, the fundamental molecular mechanism for the coordination of the two genetic systems is unknown. Mammalian cells in culture are good experimental systems to study this problem for several reasons. Mammalian mitochondrial DNA has been studied as extensively (33-35) as that of yeast. Mammalian DNA fragments can be transfected into cultured cells and promoter activity easily estimated (18). Several transcriptional factors have been purified from mammalian cells. In addition, it may be possible to study nuclear-mitochondrial interactions using abnormal mitochondria from human patients with mitochondrial myopathy (36). Therefore, we used a human cell line for our analysis.

S1 mapping revealed two initiation sites for transcription. The role of the two start sites in the regulation of transcription is unknown. Three CAT boxes (CCAAT) are located between the first and second initiation sites. If a transcriptional factor is bound to one of those CAT boxes, transcription from the second site may be enhanced. A stretch of alternating Gs and Ts was found in the 5'-flanking region. Although the tendency of this structure to form a left-handed helix or Z-DNA is weaker than that of (GC)n, the sequence (GT)n is widely distributed in eukaryotic genomes (21). The potential effects of left-handed helical DNA on transcription have been discussed (21). The role of the alternating GT structure in the expression of the β subunit gene is under investigation.
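A stretch of alternating Gs and Ts of the kind described above can be located with a simple pattern search. The sketch below is illustrative only: the input is a made-up toy sequence, the function name and the `min_repeats` threshold are hypothetical, and only the given strand is scanned (on the opposite strand the same element reads as (AC)n).

```python
import re

def longest_gt_run(seq, min_repeats=3):
    """Return (start_index, repeat_count) of the longest (GT)n run, or None.

    start_index is 0-based; runs shorter than min_repeats are ignored.
    """
    best = None
    pattern = r"(?:GT){%d,}" % min_repeats
    for m in re.finditer(pattern, seq.upper()):
        n = len(m.group()) // 2  # number of GT dinucleotide repeats
        if best is None or n > best[1]:
            best = (m.start(), n)
    return best

# Hypothetical toy sequence, NOT the real 5'-flanking sequence.
toy = "ACGTGTGTGTGTAACGTGTA"
result = longest_gt_run(toy)
print(result)
```

For the toy string this reports a run of five GT repeats beginning at index 2; the shorter run of two repeats near the end falls below the threshold and is skipped.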

Three inverted Alu sequences were found in the 5' upstream region, one of them linked with a GC box. The role of the repetitive sequence in regulating gene expression is not clear, but the CAT assay showed that the fragment (-400 to -1100) containing the Alu repeats enhanced transcription. Further analysis is required to determine the roles of these structures. In any event, our data demonstrate that full expression of this gene, which encodes a "house-keeping" enzyme, required more than 400 bp of upstream sequence. It will be important to isolate the trans-acting transcriptional factors involved. Analysis of the gene structure may allow us to elucidate the molecular basis of the regulation and coordination of the synthesis of the ATP synthase complex in cultured cell systems.

Acknowledgments-We thank Dr. H. Nojima (Jichi Medical School, Department of Pharmacology) for providing the human genomic library and for valuable discussions. We thank Drs. A. Fujisawa-Sehara and Y. Fujii-Kuriyama (Cancer Institute, Japanese Foundation for Cancer Research) for gifts of pML-CAT and pSV2-CAT.

REFERENCES
1. Kagawa, Y. (1984) in Bioenergetics (Ernster, L., ed) pp. 149-186, Elsevier Scientific Publishing Co., Inc., Amsterdam
2. Amzel, L. M., and Pedersen, P. L. (1983) Annu. Rev. Biochem. 52, 801-824
3. Fearnley, I. M., and Walker, J. E. (1986) EMBO J. 5, 2003-2008
4. Hatefi, Y. (1985) Annu. Rev. Biochem. 54, 1015-1069
5. Schatz, G., and Butow, R. A. (1983) Cell 32, 316-318
6. Williams, R. S. (1986) J. Biol. Chem. 261, 12390-12394
7. Williams, R. S., Garcia-Moll, M., Mellor, J., Salmons, S., and Harlan, W. (1987) J. Biol. Chem. 262, 2764-2767
8. Ohta, S., and Kagawa, Y. (1986) J. Biochem. (Tokyo) 99, 135-141
9. Maniatis, T., Fritsch, E. F., and Sambrook, J. (1982) Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY
10. Sanger, F., Nicklen, S., and Coulson, A. R. (1977) Proc. Natl. Acad. Sci. U. S. A. 74, 5463-5467
11. Messing, J., Crea, R., and Seeburg, P. H. (1981) Nucleic Acids Res. 9, 309-321
12. Bankier, A. T., and Barrell, B. G. (1983) in Nucleic Acid Biochemistry, B508 (Flavell, R. A., ed) pp. 1-34, Elsevier Scientific Publishing Co., Inc., Amsterdam
13. Berk, A. J., and Sharp, P. A. (1977) Cell 12, 721-732
14. Davis, L. G., Dibner, M. D., and Battey, J. F. (1986) Methods in Molecular Biology, Elsevier Scientific Publishing Co., Inc., Amsterdam
15. Lusky, M., and Botchan, M. (1981) Nature 293, 79-81
16. Fujisawa-Sehara, A., Sogawa, K., Nishi, C., and Fujii-Kuriyama, Y. (1986) Nucleic Acids Res. 14, 1465-1477
17. Laimins, L., Khoury, G., Gorman, C., Howard, B. H., and Gruss, P. (1982) Proc. Natl. Acad. Sci. U. S. A. 79, 6453-6457
18. Gorman, C. M., Moffat, L. F., and Howard, B. H. (1982) Mol. Cell. Biol. 2, 1044-1051
19. Breathnach, R., and Chambon, P. (1981) Annu. Rev. Biochem. 50, 349-383
20. Boutry, M., and Chua, N-H. (1985) EMBO J. 4, 2159-2165
21. Rich, A., Nordheim, A., and Wang, A. H-J. (1984) Annu. Rev. Biochem. 53, 791-846
22. Schmid, C. W., and Jelinek, W. R. (1982) Science 216, 1065-1070
23. Dynan, W. S., and Tjian, R. (1985) Nature 316, 774-778
24. Kelly, T. J., Jones, K. A., Kodonaga, J. T., Rosenfeld, P. J., and Tjian, R. (1987) Cell 48, 79-89
25. Gidoni, D., Dynan, W. S., and Tjian, R. (1984) Nature 312, 409-413
26. Gilbert, W., Marchionni, M., and McKnight, G. (1986) Cell 46, 151-154
27. Hurt, E. C., and Schatz, G. (1987) Nature 325, 499-503
28. Baker, A., and Schatz, G. (1987) Proc. Natl. Acad. Sci. U. S. A. 84, 3117-3121
29. Gilbert, W. (1978) Nature 271, 501
30. Go, M. (1981) Nature 291, 90-92
31. Kagawa, Y. (1984) J. Biochem. (Tokyo) 95, 295-298
32. Schatz, G., and Mason, T. L. (1974) Annu. Rev. Biochem. 43, 51-87
33. Anderson, S., Bankier, A. T., Barrell, B. G., de Bruijn, M. H. L., Coulson, A. R., Drouin, J., Eperon, I. C., Nierlich, D. P., Roe, B. A., Sanger, F., Schreier, P. H., Smith, A. J. H., Staden, R., and Young, I. G. (1981) Nature 290, 457-465
34. Clayton, D. A. (1982) Cell 28, 693-705
35. Clayton, D. A. (1984) Annu. Rev. Biochem. 53, 573-594
36. DiMauro, S., Bonilla, E., Zeviani, M., Nakagawa, M., and DeVivo, D. C. (1985) Ann. Neurol. 17, 521-538