

Copyright © 1990 by the Genetics Society of America

Mobile Element Insertions Causing Mutations in the Drosophila suppressor of sable Locus Occur in DNase I Hypersensitive Subregions of 5′-Transcribed Nontranslated Sequences

Robert A. Voelker, Joan Graves, Willie Gibson and Marcia Eisenberg

Laboratory of Molecular Genetics, National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina 27709

Manuscript received March 7, 1990
Accepted for publication September 4, 1990

ABSTRACT

The locations of 16 mobile element insertions causing mutations at the Drosophila suppressor of sable [su(s)] locus were determined by restriction mapping and DNA sequencing of the junction sites. The transposons causing the mutations are: P element (5 alleles), gypsy (3 alleles), 17.6, HMS Beagle, springer, Delta 88, prygun, Stalker, and a new mobile element which was named roamer (2 alleles). Four P element insertions occur in 5′ nontranslated leader sequences, while the fifth P element and all 11 non-P elements inserted into the 2053 nucleotide, 5′-most intron that is spliced from the 5′ nontranslated leader ~100 nucleotides upstream of the translation start. Fifteen of the 16 mobile elements inserted within a ~1900 nucleotide region that contains seven 100-200-nucleotide long DNase I-hypersensitive subregions that alternate with DNase I-resistant intervals of similar lengths. The locations of these 15 insertion sites correlate well with the roughly estimated locations of five of the DNase I-hypersensitive subregions. These findings suggest that the features of chromatin structure that accompany gene activation may also make the DNA susceptible to insertion of mobile elements.

Mobile element insertion-caused mutations in Drosophila generally have one of two origins. Either they arise as rare spontaneous mutations, or they are due to one of the mobile elements that are mobilized in “hybrid dysgenesis”-type phenomena.

The factors that determine the insertion sites of the mobile elements are not well understood. Molecular analyses of spontaneous mutations at the white (JUDD 1987), vermilion (SEARLES and VOELKER 1986; SEARLES et al. 1990), yellow (CHIA et al. 1986), rosy (CLARK et al. 1986; COTÉ et al. 1986; LEE et al. 1987; KEITH et al. 1987) and forked (V. CORCES, personal communication) loci have shown that many of the mobile elements causing mutations are members of the retroviral-like families and that these elements have inserted into both protein-coding and noncoding portions of the transcribed regions as well as into nontranscribed regulatory regions of these genes. On the other hand, insertions of the P element, the best studied of the dysgenesis-type elements, most often occur near the 5′ end of the gene, either in nontranscribed or in transcribed but nontranslated regions (SEARLES et al. 1982, 1986; TSUBOTA, ASHBURNER and SCHEDL 1985; B. ZERGES, personal communication; CHIA et al. 1986; KELLEY et al. 1987; GEYER et al. 1988; ROIHA, RUBIN and O’HARE 1988).

We have reported that 14 mutations of the Drosophila melanogaster suppressor of sable [su(s)] locus that arose spontaneously, in dysgenic crosses, or after injection of heterologous Drosophila DNA into embryos are caused by insertions of mobile genetic elements into a 2-kb region near the 5′ end of the gene (CHANG et al. 1986). Two additional spontaneous mutations caused by mobile elements have been characterized (VOELKER et al. 1989; R. VOELKER, unpublished results). To ascertain how these mobile elements interfere with su(s) function, we have determined at the nucleotide sequence level the insertion sites of these 16 mutations. All are inserted within DNA giving rise to a ~2500 nucleotide (nt) region of the primary transcript that is processed into a ~500 nt nontranslated leader and a 2053 nt intron. The distribution of insertion sites is nonrandom and can be partially explained by the distribution of potential insertion target sites. However, a more important factor may be that the region into which all insertions occurred contains a number of DNase I hypersensitive subregions.

Genetics 126: 1071-1082 (December, 1990)

MATERIALS AND METHODS

Cloning of su(s) mobile element insertions: The region of the su(s) locus in which the mobile element insertions occur is included within an ~8-kb genomic BamHI fragment [~-5.0 (not shown) to +3 in Figure 3A]. All mutant DNAs were cloned into BamHI-cut, dephosphorylated λEMBL4 (Promega). Those mutations whose mobile elements lacked BamHI sites [83f24(8), 83f24(10), 83g2(15), 83g2(22), ab and 1270] were cloned by carrying out complete BamHI digests of genomic DNA and ligation into the vector. Those mutations whose mobile elements contained BamHI sites (A66, 66, 83f1, 84a and e5.6) were cloned by ligation of partially MboI-digested mutant genomic DNA into the vector. Plaque purification was as previously described, using either nick-translated ³²P-labeled probe #3 or #4 (Figure 3A), depending on the known location of the insertion (CHANG et al. 1986).

Localization of sites of insertion: The approximate locations of the insertion sites were determined by restriction mapping and Southern blotting of the cloned mutant DNAs, so that the insertion junctions could then be determined by DNA sequencing. In some cases a fragment containing the mobile element insertion site could be subcloned with the insertion site close enough to the polylinker to utilize commercially available primers to sequence across the insertion site. Based on the genomic DNA sequence already determined, a set of 18-22 nt primers spaced about 250 nt apart was synthesized for both strands. These were used as sequencing primers near insertion sites that could not be conveniently subcloned close to the polylinker. Sanger dideoxy DNA sequencing was carried out using the vectors M13mp18 or M13mp19, pBKS+/-, pBSK+/- or λEMBL4 with either ³²P or ³⁵S as label, using Sequenase (U.S. Biochemical) according to recommended procedures.
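The primer-walking layout described above can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the fixed 20-nt length is one arbitrary choice within the 18-22 nt range given in the text, the template is a toy string rather than su(s) sequence, and no melting-temperature or specificity criteria (which a real primer design would need) are applied.

```python
# Hypothetical helpers; primer length and spacing are illustrative defaults,
# chosen to fall within the "18-22 nt" and "about 250 nt" figures in the text.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def tile_primers(template: str, spacing: int = 250, length: int = 20):
    """List (1-based start, primer) pairs taken every `spacing` nt along `template`."""
    last_start = len(template) - length
    return [
        (start + 1, template[start:start + length])
        for start in range(0, last_start + 1, spacing)
    ]


def primers_for_both_strands(template: str, spacing: int = 250, length: int = 20):
    """Tile primers along the given strand and along its reverse complement."""
    top = tile_primers(template, spacing, length)
    # Bottom-strand starts are numbered along the reverse-complement strand.
    bottom = tile_primers(reverse_complement(template), spacing, length)
    return top, bottom


# Toy 1000-nt template (not su(s) sequence), used only to show the tiling.
toy_template = "ACGTTGCA" * 125
top_primers, bottom_primers = primers_for_both_strands(toy_template)
print([start for start, _ in top_primers])  # [1, 251, 501, 751]
```

Tiling both strands at a fixed spacing means that any insertion junction lies within a few hundred nt of a primer on either side, which is the property the text relies on.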

Location of DNase I-hypersensitive subregions: Isolation of nuclei from 0-12 hr y² w^bf [= su(s)⁺] embryos and treatment were as in KEENE et al. (1981). DNase I concentrations of 0, 25, 50, 150 and 250 units/ml were used.

Computer search analyses: Three programs were used in computer-aided DNA sequence analyses. The StemLoop and Find programs of the University of Wisconsin Genetics Computer Group package were used to search for palindromic and insertion target sequences, respectively. The FASTA program of PEARSON and LIPMAN (1988) was used to search DNA databases for sequence similarities to the 5′ end of the su(s) locus.
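A search for insertion target sequences on both strands can be sketched as below. This is not the Find program itself, only a minimal stand-in under assumptions: the function names are hypothetical, the example string is a toy sequence rather than su(s) sequence, and exact matching without mismatches is assumed.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_target_sites(sequence: str, motif: str):
    """Return sorted (1-based start, strand) hits for `motif` on either strand.

    Positions are given in top-strand coordinates. A "-" hit means the motif
    reads 5' to 3' on the antiparallel strand, i.e. its reverse complement
    occurs in the strand shown.
    """
    seq = sequence.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        # A zero-width lookahead lets overlapping occurrences be reported.
        for match in re.finditer(f"(?={pattern})", seq):
            hits.append((match.start() + 1, strand))
    return sorted(hits)


# Toy sequence (an assumption, not genomic data) with one site on each strand.
toy = "GGTACATACCTATGTAGG"
print(find_target_sites(toy, "TACATA"))  # [(3, '+'), (11, '-')]
```

A motif that equals its own reverse complement, such as TATATA, is reported once per strand at each position, since the site is present on both strands there.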

RESULTS

Cloning of su(s) mobile element insertions: CHANG et al. (1986) reported the localization by restriction endonuclease mapping of 14 mobile element-caused su(s) mutations and the cloning of five of the alleles (W20, 2, 51c15, 51j and S). Two additional mobile element-caused mutations have been recovered: A66 and 84a. Restriction mapping of all these mutations localized them between the -0.2 HindIII site and ~+2 of the genomic map shown in Figure 3A (CHANG et al. 1986; VOELKER et al. 1989; R. VOELKER, unpublished results).

Sequence of genomic DNA: The nucleotide sequence of the segment of DNA necessary to encode wild-type su(s) function has been determined (R. VOELKER, W. GIBSON, J. GRAVES, J. STERLING and M. EISENBERG, submitted for publication). The 5′ nontranslated leader sequence from the transcription start site through the ninth amino acid is shown in Figure 1, with upper case letters representing exons and lower case letters representing either 5′ nontranscribed material or the large 2053 nt (nt 808-2860) intron. Identity of mobile elements detected in this study with the canonical form was in all cases established by comparison of the nucleotide sequences of the terminal repeats (IRs or LTRs) and/or comparison of restriction maps with those given in FINNEGAN and FAWCETT (1986). The insertion points of the mobile elements are indicated by an arrowhead (▼) 3′ to the last nt beyond which su(s) sequence is interrupted. Data further describing the insertions are given in Table 1. Data for specific alleles are grouped according to the mobile element causing the mutation.

P element: Five different P element-caused alleles were analyzed. W20 was induced in π2 dysgenic flies and was used to clone the locus by transposon-tagging (CHANG et al. 1986). 83f24(8), 83f24(10), 83g2(15) and 83g2(22) were induced in Haifa 12 (MR h12) dysgenic flies (M. M. GREEN, personal communication).

Four of the P element-caused mutations are due to insertions in the 5′ transcribed but nontranslated leader sequence. Three alleles [W20, 83f24(8) and 83g2(15)] were caused by insertions following nt 446. That the latter two are of independent origin is indicated by their P elements being inserted in opposite orientations. 83g2(22) resulted from an insertion following nt 584.

83f24(10) is caused by an insertion following nt 1637, which is in the large intron. The insertion site lies within the 1573-1655 SalI fragment that had not been detected when the insertion site was earlier erroneously reported to lie in the same region as the other P element insertions (CHANG et al. 1986).

The five P element insertions each contain an 8 nt duplication (Table 1), which is characteristic of such insertions. The consensus P element target site was reported to be GGCCAGAC (O’HARE and RUBIN 1983). Comparison in both orientations of the three target sites found in this study (TGTTGAAC, CGGGGCCC and GCTTAGAT) with the consensus finds best matches of only three, three and four bases, respectively. The previously reported P element polymorphism (SEARLES et al. 1986) for A (π2-derived) or T (Haifa 12-derived) at position 32 (results not shown) just inside the 5′ inverted repeat was also observed in these P elements.
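The comparison of target sites with the consensus can be reproduced with a short sketch. It assumes an ungapped, position-by-position comparison of each 8-nt site, in both orientations, against the 8-nt consensus; the text does not state how the comparison was made, and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def identities(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences agree."""
    return sum(x == y for x, y in zip(a, b))


def best_match(site: str, consensus: str) -> int:
    """Best ungapped match of `site` to `consensus`, trying both orientations."""
    return max(
        identities(site, consensus),
        identities(reverse_complement(site), consensus),
    )


CONSENSUS = "GGCCAGAC"                                # O'HARE and RUBIN 1983
TARGET_SITES = ["TGTTGAAC", "CGGGGCCC", "GCTTAGAT"]   # sites quoted in the text

scores = [best_match(site, CONSENSUS) for site in TARGET_SITES]
print(scores)  # [3, 3, 4]
```

Under these assumptions the scores agree with the three, three and four matching bases reported above.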

gypsy: Three spontaneous alleles (2, 51c15 and 51j) were previously reported to be caused by gypsy insertions, based on hybridization and restriction endonuclease digestion results (CHANG et al. 1986). All three inserted within the large intron and their direction of transcription is opposite that of su(s). 51c15 inserted following nt 1804 at TACATA (5′ to 3′ in antiparallel strand at 1804) and duplicated the sequence TACA. 51j inserted following nt 825 at TATATA (822 in strand shown or 823 in antiparallel strand) and duplicated TATA; both insertion sites and duplicated sequences have been previously reported (FREUND and MESELSON 1984; BAYEV et al. 1984). The 3′ LTR of the gypsy insertion of 51j also contained the subterminal T to C transition (shown in italic capital letters in Table 1) seen in the sc¹ mutation (FREUND and MESELSON 1984).

[FIGURE 1: annotated nucleotide sequence, nt 1-3000, in rows of 120 nt, with arrowheads and allele names marking the insertion sites; sequence not reproduced.]

FIGURE 1.-Nucleotide sequence of su(s) gene from 5′ end of sequenced region to the putative translation start. The sequence shown is from R. VOELKER, W. GIBSON, J. GRAVES, J. STERLING and M. EISENBERG (manuscript in preparation). Nontranscribed and intron sequences are indicated by lower case letters, and exon sequences are indicated by upper case letters. The first nine amino acids of the putative protein are shown. The transcription start site is indicated by the italicized A at nt 415 (R. VOELKER, W. GIBSON, J. GRAVES, J. STERLING and M. EISENBERG, manuscript in preparation), and the TATA-like box is italicized beginning at nt 379. The HindIII (nt 284) and SalI (nt 1573 and 1655) sites are underlined. The mobile element insertion sites are indicated by arrowheads (▼) between the nucleotides where the insertion occurred, and the allelic designations are given in bold face letters adjacent to the arrowheads.

Allele 2 presents an atypical gypsy insertion. It was previously noted that the +1.2 SalI site (from DNA sequence data now known to be two SalI sites 82 nt apart) is absent in 2 (CHANG et al. 1986). DNA sequence data indicated that the left LTR of gypsy follows nt 1375 within the sequence TACATA, whereas the right LTR precedes nt 1801, also at the sequence TACATA (beginning at nt 1804 in the antiparallel strand). The intervening 425 nt genomic segment is deleted. The 3′ site of the insertion is identical to the insertion site of 51c15. Moreover, the 4-nt sequences flanking the insertion are not the identical tandem duplication normally associated with gypsy insertion; rather, the four adjacent nts on the 5′ and 3′ sides, respectively, are TACA and TGTA, which are complementary rather than identical.
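The coordinate arithmetic and the complementarity noted for allele 2 can be checked with a few lines. This is only a worked check of figures quoted in the text; the helper names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def deleted_length(last_nt_before: int, first_nt_after: int) -> int:
    """Number of nt lying strictly between two retained positions (1-based)."""
    return first_nt_after - last_nt_before - 1


# Positions quoted in the text for allele 2.
LEFT_LTR_FOLLOWS = 1375    # left gypsy LTR follows this nt
RIGHT_LTR_PRECEDES = 1801  # right gypsy LTR precedes this nt

print(deleted_length(LEFT_LTR_FOLLOWS, RIGHT_LTR_PRECEDES))  # 425
print(reverse_complement("TACA"))    # TGTA
print(reverse_complement("TACATA"))  # TATGTA
```

The first value corresponds to the 425 nt deleted segment (nt 1376-1800). The second shows that the two flanking tetranucleotides are reverse complements of each other, which is what would be expected if the TACATA target was recognized on opposite strands at the two ends.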

The foregoing observations suggest two possible origins for 2: (1) it arose as a bizarre insertion event, with the preferred gypsy insertion sites (TACATA) being recognized but at different places in opposite strands; or (2) the form of 2 analyzed here is not the original mutation but is a recombinant derivative between the original 2, in which the gypsy inserted following nt 1375, and 51c15 (or another identical allele) in which the gypsy inserted in the same orientation following nt 1804. Because the gypsies inserted in the same orientation, crossing over could have occurred between them, similar to the cases reported for Bel (= 3S18) (GOLDBERG et al. 1983) and B104 (= roo) (DAVIS, SHEN and JUDD 1987).

su(s)² was discovered by BRIDGES before 1915 (LINDSLEY and ZIMM 1990). To attempt to learn more about the possible origin of the extant 2, we have examined by whole genomic Southern analysis 14 additional stocks (Table 2) that are labeled as carrying su(s)² or an su(s) allele without superscript designation; all except four appear to have the same structure as noted above. The su(s)² w^a cv t stocks from both sources (duplicates carried at the two stock centers) contain only a gypsy solo LTR at the junction of the 425 nt deletion, suggesting that a crossover event between the two LTRs deleted the body of gypsy, giving rise to that variant (results not shown). When we analyzed this variant to determine whether it still suppresses v, we found that it still behaves as a suppressor, but is a much weaker suppressor than its 2 parent. In the cases of stocks b274 and b275 a confusion in the nomenclature of su(s)^S [sometimes labeled as su^s2 (v, pr)] probably led to those stocks' being designated su(s)². The documentations on the above “2” stocks containing the different visible markers do not indicate whether any predate 1951, after which the recombinant between 2 and 51c15 would have arisen. Thus we cannot discriminate between the two possible origins of the extant 2.

[TABLE 1: data further describing the insertions; table contents not reproduced.]

FIGURE 2.-Location of mobile element insertions causing mutations within the su(s) gene. The information on DNA sequence and the processing of the primary transcript into mature message are from R. VOELKER, W. GIBSON, J. GRAVES, J. STERLING and M. EISENBERG (manuscript in preparation). The thin horizontal line bounded by 0 and 8493 represents the genomic segment sequenced. The thick line represents the transcribed region. The solid areas of the thick line indicate the exons, whereas the hollow portions represent the introns. The translation start and stop signals and the polyadenylation signal are shown. The open triangles indicate mobile element insertions and are not drawn to scale. The designations above the triangles indicate the su(s) allele, and the designations within the triangles are the names of the mobile elements. All mobile element insertions occur within the region of DNA that gives rise to the 5′ transcribed but nontranslated leader and to the 2053-nt 5′-most intron. The HindIII (H) and SalI (S) sites shown are those shown in Figure 3A.

TABLE 2

Additional “su(s)²” stocks examined

(Entries are listed column by column, in the order in which they appear in each column.)

Stock No.: b271; 120; 1157; 1395; 1398; 11700; 11000; b272; 121; b274; b275

Designation: su(s)² v; bw | su(s)² v; bw | y² sc su(s) lz f su(f)ts67g/FM3, uof | C(1;Y)P1, y² sc su(s) pn sn³/O/C(1)M4, y | C(1;Y)P1, y² sc su(s) pn sn³/O/C(1)M4, y | y² sc su(s) z ec ct/C(1)M3, y² | su(s)² ct^K/FM6 | su(s)²; pr cn | su(s)²; prb” cn | su(s)² w^a cv t | su(s)² w^a cv t | su(s)² v/C(1)RM, f B | su(s)² v/C(1)DX, y f | y² sc su(s) spl/C(1)DX, y w f

Source (a): BG; IU; IU; IU; IU; U; U; BG; BG; OR; BG; IU; BG; BG

Molecular structure: 2, as described in Table 1 (ten stocks); solo gypsy LTR (c) (two stocks); S (b) (two stocks)

(a) Source abbreviations: BG, Bowling Green Stock Center; IU, Indiana University Stock Center; U, Umeå Stock Center; OR, Oak Ridge.
(b) Labeling confusion may have arisen because su(s)^S was sometimes designated su^s2 (v, pr).
(c) Presumably arose as a crossover event between the gypsy LTRs.

Delta 88: e5.6 was recovered after injection of heterologous DNA into v; bw embryos (FOX and YOON 1970; FOX 1977). Restriction map similarities to the mobile element associated with tuh-3 (KARCH et al. 1985) suggested that the two elements might be the same. Homology was confirmed by probing a dot blot of the cloned Delta 88 from tuh-3 with the cloned su(s)^e5.6 (results not shown). The ends of Delta 88 have not been sequenced far enough to determine if they contain LTRs.

The Delta 88 of e5.6 inserted following nt 826, assuming that the 5′ end of the mobile element begins with the commonly observed AGTTA terminal inverted repeat motif. If this interpretation is correct, the duplication at the insertion site is ATAT, and the 5 nt terminal repeat is imperfect, with one side being AGTTA and the other being ACTTA.

17.6: 66 was also recovered after injection of heterologous DNA into v; bw embryos (GERMERAAD 1975, 1976). It is caused by an insertion of 17.6 following nt 826, the same site as the Delta 88 of e5.6. Although only one end of this mobile element was cloned, the insertion site and 4 nt duplication of ATAT are consistent with those previously reported for 17.6 (KUGIMIYA, IKENAGA and SAIGO 1983). The target site ATATAC is the same as that found in three of ten previously analyzed 17.6 insertions (INOUE, YUKI and SAIGO 1984).

springer: 84a is a spontaneous mutation in which springer inserted following nt 841. The target sequence, TATATA, is identical to that previously reported (KARLIK and FYRBERG 1985); the availability of this second example allows the conclusion that the duplicated sequence is TATA.

HMS Beagle: ab is a spontaneous mutation caused by the insertion of HMS Beagle following nt 1836 at the sequence TACATA with the 4 bp duplication being TACA. This differs from the TATA duplication reported for the first isolation of this element (SNYDER et al. 1982) which inserted at the sequence TATATA.

prygun: 1270 is a spontaneous mutation caused by the insertion of a mobile element following nt 1345. When first found, the mobile element appeared different from any previously described, and it was provisionally named rover (LINDSLEY and ZIMM 1990). It was subsequently discovered that the terminal ~25 nt at both ends are nearly identical to those of the mobile element prygun, first reported in y g (GEYER, GREEN and CORCES 1990). There are, however, four nucleotide differences: the 1270 element has Cs at nucleotides 6 and 10 in the 5′ terminus, whereas prygun has an A and a T, respectively, at those two sites; at positions 9 and 14 from the 3′ end of the 3′ terminus 1270 contains Gs, whereas the prygun of y g contains Ts. Both elements contain a terminal inverted repeat of 5 bases, AGTTA. The 1270 insertion is associated with a duplication of CGTG, whereas CGCG is the duplication associated with f. The prygun/rover target sites are unusual in that they contain three of four GC base pairs, unlike the other sites observed here which are generally AT-rich. The 1270 mobile element is ~7-8 kb long and contains an AvaI site very near each end (CHANG et al. 1986). The amount of sequence that could accurately be read from either end of the 1270 element was not sufficient to determine whether the element contains LTRs. Whether the differences between the f and 1270 elements represent polymorphisms within the prygun mobile genetic element or a family of very closely related elements remains to be determined.

Stalker: S is a peculiar spontaneous mutation caused by insertion of a mobile element (provisionally named rambler; LINDSLEY and ZIMM 1990) following nt 1945 and preceding nt 1958 (i.e., nt 1946 to 1957 are deleted). Subsequent comparison of the ends of the S insertion with the mobile element Stalker (GEORGIEV et al. 1990) showed that the insertion of S is an incomplete Stalker. Stalker is 7162 nt long and has LTRs 405 nt long (nt 1-405; nt 6758-7162; sequence from V. CORCES, personal communication). Nucleotides 11-42 (and beyond, not shown) of the 5' end of the S insertion are identical to nt 662-693 of Stalker, and the internal 37 nt of the 3' end of the S insertion are identical to nt 6670-6706 of Stalker. Thus, the “Stalker” from su(s)’ lacks both LTRs as well as some sequences just inside the LTRs. This is consistent with the observation that the S insertion is 6 kb in length (CHANG et al. 1985) rather than the 7.1-kb length of Stalker known from the nt sequence. Another peculiarity is that nt 1-10 of the 5' end of the su(s)’ Stalker are not identical with sequences elsewhere within Stalker or within su(s), nor is it possible to determine the origin of the 3' terminal TAA of the su(s)’ Stalker.

Because the parental chromosome of su(s)’ is not available, it is not possible to determine whether the 12-bp deletion associated with the Stalker insertion was preexisting, arose in conjunction with the transposition or occurred after the insertion. Whether the mechanism of origin is similar to that for deletions that arise in conjunction with or after insertion of bacterial transposons (reviewed in CALOS and MILLER 1980) remains unknown. It is perhaps noteworthy that this insertion occurs in a region of su(s) that contains a triple direct imperfect repeat, beginning at nt 1944: CTTTCG(A), CTTTCG, CTTTTCG. The insertion occurred 3' to the first T in the first repeat and 5' to the first T in the last repeat. On the other hand, if the deletion arose after the insertion, it might have been associated with the loss of the insertion duplication and/or the Stalker LTRs.

roamer: Two spontaneous mutations (A66 and 83’1) are associated with insertions of the same previously unnamed mobile element into the same site. Only the right end of each mobile element was cloned. By dot blot hybridization both are homologous with the partially characterized clones #2157 and 2179 (E. STROBEL, unpublished results). We have named this element roamer. From whole genomic Southern mapping


TABLE 3

Distribution of potential mobile element target sites

su(s) region              5' Leader   5' Intron   Downstream introns (n = 4)   Coding sequence   3' Trailer

Size (nt)                 507         2053        917                          3966              421
Proportion                0.064       0.261       0.116                        0.504             0.053
No. of insertions (a)     16 (5' leader and 5' intron combined)
Probability               0.000000015

Target sequence distribution
TACATA, TATGTA            3           14          1                            1                 0
TATATA                    2           8           2                            0                 1
ATATAT                    2           9           2                            0                 0
ATATAC, GTATAT            2           5           0                            0                 1
ATATAG, CTATAT            1           3           1                            1                 1

Totals
Number                    10          39          6                            2                 3
Proportion                0.167       0.65        0.10                         0.033             0.05
No. of insertions (b)     0           9           0                            0                 0
Probability               0.02

(a) Includes all mobile element insertions in this study.
(b) Includes those mobile elements with AT-rich insertion target sites (gypsy, Delta 88, 17.6, springer, HMS Beagle and roamer).

the element was determined to be at least 3.8 kb in length and to contain at least one internal BamHI site. The roamer insertion interrupts genomic sequence preceding nt 822 within a run of ATs. If the element contains an AGT terminal inverted repeat, this element may also select ATAT sequences as its target site. Also, if this element produces a 4-nt duplication, it likewise inserted following nt 826, the identical site as the insertions of 66 and e5.6.

All mobile element insertions occur in the 5' transcribed but nontranslated leader exon or in the large 2053-nt intron that interrupts it: A review of the foregoing (Table 1 and Figure 1) shows that four of the mobile element insertions (all P elements) occurred in DNA sequences giving rise to the exon that becomes the 5' transcribed but nontranslated leader and that the remaining twelve (one P and 11 non-P elements) occurred in the 2053-nt intron that interrupts the leader 114 nt upstream from the putative translation start (Figure 2). Moreover, there are two notable clusters of insertion sites: three of the P elements inserted following nt 446 and four other elements inserted preceding or following nt 826.

We have examined these sequences to determine whether there is some factor(s) which predisposes them to mobile element insertion. Probabilities were calculated, making certain assumptions about the distribution of the mobile elements (Table 3). The first calculation assumed a random distribution of insertions throughout the 7864-nt primary transcript of the su(s) gene (R. VOELKER, W. GIBSON, J. GRAVES, J. STERLING and M. EISENBERG, unpublished results), i.e., reflective of the interval size where insertions could occur, disregarding any target site specificity. The observed distribution of 16 insertions all within the 5' 2560 nt (507-nt nontranslated leader plus 2053-nt intron of the 7864-nt primary transcript) would occur with a probability of $1.5 \times 10^{-8}$ [$= 0.325^{16}$], obviously nonrandom.
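The arithmetic behind this estimate can be checked directly. The short Python sketch below assumes, as the calculation in the text does, that each of the 16 insertions lands independently and with equal probability anywhere in the 7864-nt primary transcript; the variable names are illustrative only.

```python
# Interval sizes (nt) as given in the text and in Table 3.
leader_nt = 507         # 5' nontranslated leader
intron_nt = 2053        # 5'-most intron
transcript_nt = 7864    # primary transcript
n_insertions = 16       # mobile element insertions in this study

# Fraction of the transcript occupied by the leader plus the intron.
fraction = (leader_nt + intron_nt) / transcript_nt

# Probability that every insertion falls within that fraction, assuming
# independent, uniformly distributed insertion sites (an assumption).
p_all_in_5prime = fraction ** n_insertions

print(f"fraction of transcript = {fraction:.3f}")
print(f"P(all {n_insertions} insertions in leader + intron) = {p_all_in_5prime:.2e}")
```

The result is on the order of 10^-8, in line with the value quoted above; the small difference from 1.5 × 10^-8 comes from rounding the fraction to 0.325 before exponentiation.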

The foregoing result was not unexpected because some mobile elements exhibit striking target site preference. The mobile element gypsy exhibits a preference for insertion at TACATA or TATATA (BAYEV et al. 1984; FREUND and MESELSON 1984; this report) and HMS Beagle may have a similar preference (SNYDER et al. 1982; this report). A computer search of both strands of the sequenced 8493 nt of the su(s) region indicated the distributions shown in Table 3. Fourteen of the 19 TACATA sequences occur within the 2053-nt intron, and three lie in the 5' nontranslated leader. One of the remaining two is in a downstream intron and only one lies in coding sequences. Eight of 13 TATATA sequences lie within the 2053-nt intron, two occur within the 5' nontranslated leader, and the remaining three lie in downstream introns or in 3' transcribed but nontranslated sequences.

17.6 has previously been reported to insert preferentially at the target sites ATATAT/C/G (INOUE, YUKI and SAIGO 1984), and the target site observed in the present study is ATATAC. Delta 88 and roamer reported here may also share that target preference. Seventeen of 28 ATATAT/C/G sequences in the su(s) region lie in the 2053-nt intron, five occur in the 5' leader, five lie in downstream introns or in the 3' trailer and only one lies in coding sequences.
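A both-strand hexanucleotide search of the kind described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the program actually used is not named in the text, the function names are hypothetical, and the toy sequence is invented rather than taken from the su(s) locus.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sites(seq, motif):
    """Return sorted 1-based start positions (top-strand coordinates)
    at which `motif` occurs on either strand, overlaps included."""
    # A motif on the bottom strand reads as its reverse complement on the
    # top strand; the set avoids double counting palindromes such as TATATA.
    patterns = {motif, reverse_complement(motif)}
    hits = []
    for pattern in patterns:
        start = seq.find(pattern)
        while start != -1:
            hits.append(start + 1)
            start = seq.find(pattern, start + 1)
    return sorted(hits)


# Hypothetical toy sequence, not taken from the su(s) region.
toy = "GGTACATAGGTATGTACCTATATATA"
for motif in ("TACATA", "TATATA"):
    print(motif, find_sites(toy, motif))
```

Searching for TACATA also reports TATGTA on the top strand, which is why Table 3 lists the two sequences together; binning the returned positions by region would give per-region counts like those tabulated.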

[Figure 3 image not reproduced. Panel A: restriction map of the su(s) genomic DNA, oriented telomere to centromere, with a nt scale (sites at -1320, -597, 284, 1573, 1655, 3388, 3428, 7247 and 8375), a kb scale (-4 to 8) and probes #1-#4. Panels B and C: Southern blots of DNA from nuclei treated with increasing units of DNase I and digested with HindIII, HindIII/SphI or BglII/SphI, hybridized with probes #2, #3 and #4.]

FIGURE 3.-Localization of DNase I hypersensitive sites. (A) Location of restriction sites and probes used in DNase I hypersensitivity experiments. An 8.5-kb genomic fragment of the su(s) region was sequenced (R. VOELKER, W. GIBSON, J. GRAVES, J. STERLING and M. EISENBERG, manuscript in preparation). It includes the ~8-kb (nt 284-8375) HindIII fragment that is sufficient for su(s)+ function when reintroduced into the genome by P element transformation (VOELKER et al. 1989). The nt scale shows the positions of the restriction sites in the DNA sequence. The kb scale has its origin (0) at the site of the W20 P element insertion that was used to clone the DNA by transposon-tagging; restriction endonuclease localizations of the mobile element insertions were referenced on this scale (CHANG et al. 1986). 0 in the kb scale corresponds to nt 446. Restriction enzyme abbreviations: B, BamHI; Bg, BglII; H, HindIII; S, SalI; Sp, SphI. The probes were used in the localization of the mobile element insertion sites and in the cloning of the alleles. (B) Location of 5' end of DNase I-hypersensitive region. The left panel shows the results when nuclei were treated with DNase I followed by digestion with HindIII, and the blot was probed with probe #2. There is at least one hypersensitive site near one end of this 1604-nt (-1320 to +284) fragment, as indicated by the ~1400-nt fragment. In the right panel the DNase I treatment was followed by digestion with BglII and SphI and probe #2 was used. The SphI digestion was incomplete, as indicated by the presence of both the 7844-nt (-597 to +7247) BglII fragment (not cut by SphI) and the 3985-nt (-597 to +3388) BglII/SphI fragment. The resistance to DNase I digestion of the 2939-nt BglII fragment (-3536 to -597), which includes the 5' end of the 1604-nt HindIII fragment (-1320 to +284), indicates that the hypersensitive site is not at the 5' end of the 1604-nt HindIII fragment and therefore must lie at the 3' end of that fragment. The 7844-nt BglII (-597 to +7247) fragment shares a common 5' end with the 3985-nt BglII/SphI fragment, and therefore both are similarly sensitive to increased DNase I concentrations. The DNase I-digested derivatives of the 3985-nt fragment range in size from ~700 to 1600 nt (indicated by arrows). These bands are very faintly detected by probe #2 because the DNase I-hypersensitive region is also the region of homology to the probe (see below); as more of the sensitive region is digested by DNase I, there is less of these fragments remaining to hybridize with the probe. Taken together, these results indicate that the DNase I-hypersensitive region extends 3' from about nt 100. (C) Location of 3' end of DNase I-hypersensitive region. Nuclei from su(s)+ embryos were treated with DNase I followed by digestion of the DNA with HindIII and SphI (left panel) or BglII and SphI (center panel), and the blots were probed with probe #3. Because the SphI did not digest to completion, two patterns can be observed in the left panel. The 8091-nt band is the +284 to +8375 HindIII fragment, and the ~6-7-kb band (top arrow) indicates that a hypersensitive site lies near one end of that fragment. This identical pattern had been previously observed when DNase I-treated nuclei were digested with HindIII and probed with probe #1 (results not shown). The 3104-nt band is the +284 HindIII to +3388 SphI fragment. There are a number of DNase I-hypersensitive sites on the 5' end of this fragment, as indicated by the appearance of a number of smaller bands (arrows at ~3000, ~2800, ~2600, ~2200, ~1800 and ~1500 nt) in the presence of increased DNase I concentrations. Probing the filter in the left panel with probe

[Figure 4 image not reproduced. A bar diagram of the region from nt 0 to the SphI site (nt 3388) on a nucleotide scale marked every 200 nt up to 2000. Values shown for the estimated midpoints of DNase I-resistant regions (nt): 130, 330, 530, 730, 1130, 1530 and 1880. Insertion sites (nt): 446, 584, 825, 826, 841, 1345, 1375, 1637, 1801, 1836 and 1945; the labels (3), (2) and (4) mark sites with multiple insertions.]

FIGURE 4.-Correlation of mobile element insertion sites with DNase I-hypersensitive subregions. The wide bar represents the genomic DNA region from the 5' end of the sequenced region (nt 0) to the SphI site (nt 3388), with shaded areas representing DNase I-resistant subregions and unshaded areas representing DNase I-hypersensitive subregions. The transcription start site is at nt 415. The midpoints of the lengths of the DNase I-resistant subregions were roughly estimated from the lengths of the fragments in Figure 3. The mobile element insertion sites are accurate to the nucleotide (Table 1). With the caveat that the estimations of the locations of the hypersensitive subregions are subject to some error and that there are probably gradations between the DNase I-“resistant” and “hypersensitive” subregions, 15 of the 16 (all except that at 1945) mobile elements inserted at sites from the margin to the middle of DNase I-hypersensitive subregions.

Thus, 39 of the 60 potential target sites for the nine mobile elements having AT-rich hexanucleotide target preferences lie within the 2053-nt intron. Ten lie within the 5' nontranslated leader, nine of the remaining eleven potential target sequences lie in downstream introns or in 3' nontranslated sequences, and only two lie within coding sequences. Considering the distribution of potential target sites indicated above, the probability that nine of nine insertions would occur within the 2053-nt intron is 0.02 ($= 0.65^{9}$). Thus, while the target site distribution correlates with the region in which the insertions were found, even that factor alone is not sufficient to account for the limitation of insertions to the region observed (i.e., the distribution is nonrandom in spite of that consideration). Moreover, three other insertions [1270, S and 83p4(10)] that do not have the same target site preferences also inserted within that region.
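The target-site-weighted probability can be reproduced from the totals row of Table 3. The sketch below assumes that each of the nine AT-rich-preferring insertions independently chooses among the 60 potential target sites with equal probability; the dictionary keys and variable names are illustrative.

```python
# Totals row of Table 3: potential AT-rich target sites per region.
site_counts = {
    "5' leader": 10,
    "5' intron": 39,
    "downstream introns": 6,
    "coding sequence": 2,
    "3' trailer": 3,
}

total_sites = sum(site_counts.values())
p_intron = site_counts["5' intron"] / total_sites

# Nine insertions of elements with AT-rich target preferences, each assumed
# to pick one of the potential sites independently and uniformly.
n_at_rich_insertions = 9
p_all_in_intron = p_intron ** n_at_rich_insertions

print(f"total potential sites = {total_sites}")
print(f"proportion in intron  = {p_intron:.2f}")
print(f"P(9 of 9 in intron)   = {p_all_in_intron:.3f}")
```

Weighting by target sites rather than by interval length raises the probability by about six orders of magnitude relative to the length-based estimate, yet it remains near 0.02, which is the basis for concluding that target site distribution alone does not explain the clustering.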

To determine whether there might be peculiarities of the structure of the 5' end of the su(s) gene, two additional analyses were performed. First, the 2700-nt (nt 281-2980) 5' end was used as a query sequence to search the available DNA data bases for regions of greatest similarity (FASTA, PEARSON and LIPMAN 1988). A consistent finding was that sequences from ~790 to ~940 showed homology with AT-rich regions of sequences from many different organisms. Particularly striking was the frequency with which the alternating purine-pyrimidine AT-rich interval from nt 818 to 853 showed up in these searches. This is especially interesting because 6 of the 11 non-P element insertions occurred within that ~40-nt interval.

Second, a computer search (StemLoop, University of Wisconsin Genetics Computer Group) was carried out to determine whether the sequences in this region could form palindromic structures. The results indicate that, while such stem-loop structures are by no means absent in other regions of the gene, the regions into which the mobile elements inserted have the potential of forming many such structures, some quite large (stems = 15-30 bp; loop ≤ 20 nt) and rather stable (22-48 hydrogen bonds/stem). Whether the potential of forming such structures has any biological significance remains to be determined.

The mobile element insertions occur in a region containing DNase I-hypersensitive subregions: Because the mobile element insertion sites clustered near the 5' end of the gene, where nuclease hypersensitive sites are often found, we examined the -4 to +8 kb segment of genomic DNA for DNase I hypersensitivity (results not shown). The results of the analysis of the only region that exhibited hypersensitivity are shown in Figure 3. At least seven hypersensitive subregions were found: Figure 3B indicates that the 5'-most hypersensitive subregion is closely 3' to nt 0, and Figure 3C positions the 3' end of the DNase I hypersensitive region at about nt 1900. All except one (+1945) of the mobile element insertions occur within that interval. This region includes the 5' end of the segment of genomic DNA (i.e., 3' of the 284 HindIII site) that is sufficient to encode su(s)+ function, and it may include 5' sequences of the next gene upstream from su(s), because that gene is transcribed from right to left in Figure 3A, and most of it is encoded by the 1604-nt HindIII fragment (-1320 to +284) (VOELKER et al. 1989).

#1 indicated that the +3388 SphI to +8375 HindIII fragment did not contain any DNase I-hypersensitive sites (results not shown). In the center panel the SphI digest was again incomplete; the 7844-nt BglII and 3985-nt BglII/SphI fragments are similarly digested by DNase I, revealing the existence of at least seven hypersensitive sites at the 5' end common to these two fragments. Because the smallest of these digested fragments is ~1500 nt, these results indicate that the 3' end of the DNase I-hypersensitive region lies at nt ~1900 (3388 - ~1500). The right panel shows an identically prepared filter as the center filter, but probed with probe #4. The corresponding bands are much fainter, because the DNase I digestion removes much of the region of these bands that is homologous to probe #4.

We then asked whether the mobile element insertion sites correlated with the DNase I hypersensitive subregions (Figure 4). The fragments produced by DNase I digestion are visible as broad but distinct bands differing in length by 200-400 nt (see Figure 3C, middle panel), indicating that the DNase I hypersensitivity is distributed over ~100-200-nt subregions separated by nonsensitive subregions of approximately the same length; these features have been described for many DNase I hypersensitive regions (GROSS and GARRARD 1988; ELGIN 1988). The sizes of these fragments were used to approximate the locations of the DNase I hypersensitive subregions. While the positioning of the DNase I hypersensitive subregions is subject to some error, the results are consistent with the notion that all except one (at +1945) of the mobile elements inserted into hypersensitive subregions, albeit near the margins in some cases. Even the single exception inserted within several hundred nt 3' of a hypersensitive subregion.
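The conversion from band size to approximate cut position follows the same subtraction used in the Figure 3 legend (3388 minus the fragment length). The Python sketch below applies it to the approximate sub-band sizes listed in the legend for the HindIII/SphI digest. It assumes that every band runs from a DNase I cut to the SphI site at nt 3388; the names are illustrative, and the output is a rough estimate, not the authors' mapped coordinates.

```python
SPHI_SITE_NT = 3388  # SphI site assumed to anchor the probed end of each band

# Approximate sub-band sizes (nt) quoted in the Figure 3C legend.
band_sizes_nt = [3000, 2800, 2600, 2200, 1800, 1500]


def cut_position(band_nt, anchor_nt=SPHI_SITE_NT):
    """Estimate the DNase I cut position for a band ending at the anchor."""
    return anchor_nt - band_nt


cut_sites = [cut_position(size) for size in band_sizes_nt]

# Spacing between neighboring estimated cut positions.
spacings = [b - a for a, b in zip(cut_sites, cut_sites[1:])]

print("estimated cut positions (nt):", cut_sites)
print("spacing between neighbors (nt):", spacings)
```

The spacings fall in the 200-400-nt range mentioned in the text, and the 3'-most estimate lies near nt 1900. Because the band sizes are read from broad bands, each position should be taken as accurate to perhaps a hundred nucleotides at best.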

DISCUSSION

This study has determined at the nucleotide level the insertion points of 16 independent mobile element insertions that cause su(s) mutations. The significant finding is that all of these mobile elements inserted into the 5' nontranslated leader or the large 2053-nt 5'-most intron that is included within a ~1900-nt DNase I-hypersensitive region. We would like to know the reason(s) for this distribution of mobile element-caused mutations.

Are mobile element insertions into nontranslated su(s) regions the only kind recoverable, or are they the predominant or only kind that occur? The available genetic data suggest that the su(s) locus is not lethal-mutable and may not be essential for fertility. Among more than 20 ENU- and EMS-induced su(s) alleles that were obtained under conditions in which visible, lethal and sterile alleles could be recovered, not one lethal or semilethal allele was recovered, although more than half the alleles recovered exhibited varying degrees of cold-sensitive male sterility (VOELKER et al. 1989). However, the phenotype exhibited by complete loss of function may not have been observed among the mutations recovered. We are currently recovering and analyzing γ-ray induced chromosomal aberrations involving the su(s) locus. Preliminary analyses indicate that: (1) females totally lacking su(s) function are viable and weakly to moderately fertile; (2) males carrying an intralocus deletion of more than half of the protein coding sequences are viable and fertile; and (3) males that carry an aberration within su(s) coding sequences are viable and fertile. Thus, the available genetic evidence suggests that the su(s) protein is not essential for either viability or fertility. This would suggest that if mobile element insertion mutations occurred within protein coding sequences, they should be recoverable, much as they are in “nonessential” genes such as white (JUDD 1987), vermilion (SEARLES et al. 1990), yellow (CHIA et al. 1986), rosy (CLARK et al. 1986; COTÉ et al. 1986; LEE et al. 1987; KEITH et al. 1987) and forked (V. CORCES, personal communication), in which mobile element insertions throughout transcribed and translated sequences are recoverable. Yet no mobile element insertions into su(s) protein coding sequences were recovered.

Thus we turn to the second question: are there reasons why mobile elements might select to insert predominantly or only into nontranslated sequences of su(s)? One possible reason is the distribution of potential target sites. Some copia-like mobile elements appear to strongly prefer target sites that are alternating purine-pyrimidine AT-rich regions. More than 80% of these potential target sites occur in the portion of su(s) that gives rise to the 5' nontranslated leader and the 2053-nt intron. Six of the insertions of copia-like elements occurred within one such ~40-nt AT-rich alternating purine-pyrimidine region of that intron. Yet, the distribution of insertions of copia-like mobile elements is still too clustered near the 5' end, even taking into account the distribution of potential target sites.

Also, mere target site distribution appears not to be a primary factor in the distribution of P element insertions. The five P elements studied here inserted at three different sites, with three having inserted at the same site. A computer search of the su(s) DNA sequence showed that each P element insertion site is unique within the transcribed su(s) region, although more than 20 sites exist throughout the gene that differ from these three by only one nucleotide. That the specific nucleotide sequences at the three su(s) insertion sites bear little resemblance to each other or to the “consensus” P element target site (O’HARE and RUBIN 1983) suggests that the selection of insertion sites by P elements is based on some criterion other than simple nucleotide sequence. KELLEY et al. (1987) reported similar findings for the Notch locus.
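A search for sites that differ from a given insertion site by at most one nucleotide amounts to a sliding-window mismatch count. The following Python sketch shows one way to do it; the function name, the 8-nt example target and the toy sequence are all hypothetical and are not the su(s) insertion sites, which are not reproduced in this section.

```python
def near_matches(seq, target, max_mismatch=1):
    """Return (1-based start, window, mismatches) for every window of `seq`
    that differs from `target` at no more than `max_mismatch` positions."""
    k = len(target)
    hits = []
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        # Count the positions at which the window and the target differ.
        mismatches = sum(a != b for a, b in zip(window, target))
        if mismatches <= max_mismatch:
            hits.append((i + 1, window, mismatches))
    return hits


# Hypothetical 8-nt target and toy sequence, for illustration only.
example_target = "GGCCAGAC"
toy = "AAGGCCAGACTTGGTCAGACCC"
for start, window, mismatches in near_matches(toy, example_target):
    print(start, window, mismatches)
```

An exact site reports zero mismatches and a one-off site reports one. Applying the same scan to the reverse complement of the target would cover the other strand.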

Four of the five P elements have inserted into the 5' leader sequence and the fifth inserted in the intron which interrupts that leader. The proclivity of P elements to insert into 5' nontranslated sequences or just 5' to the transcription start site has been noted previously at rudimentary (TSUBOTA, ASHBURNER and SCHEDL 1985; B. ZERGES and P. SCHEDL, personal communication), RpII215 (SEARLES et al. 1982, 1986), Notch (KELLEY et al. 1987), yellow (CHIA et al. 1986; GEYER et al. 1988) and singed (ROIHA, RUBIN and O’HARE 1988). KELLEY et al. (1987) suggested that perhaps the primary factor in predisposing the 5' end of the gene to P element insertions is altered chromatin structure accompanying gene activation. An indicator of altered chromatin structure and gene activation is DNase I hypersensitivity (GROSS and GARRARD 1988; ELGIN 1988). The five P elements, as well as 10 of the 11 non-P elements, inserted into a ~1900-nt long region within the 5' end of the transcribed region of the su(s) gene that contains at least seven DNase I hypersensitive subregions. An approximate localization of these hypersensitive subregions correlates well with locations of the insertion sites. The region including these hypersensitive subregions has the potential of forming numerous stem-loop structures. Whether this region contains important internal promoter or enhancer sequences remains to be determined, but it seems a reasonable possibility. Such internal control regions associated with DNase I hypersensitivity have been found in the Drosophila Antennapedia (PERKINS, DAILEY and TJIAN 1988), engrailed (SOELLER, POOLE and KORNBERG 1988), yellow (GEYER and CORCES 1987) and rudimentary genes (TSUBOTA, ASHBURNER and SCHEDL 1985; B. ZERGES and P. SCHEDL, personal communication).

In summary, while the distribution of a variety of mobile element insertions into the su(s) gene can be partially explained by the distribution of potential target sites, the results of this study are also consistent with the notion that the distribution of insertions of mobile elements is reflective of the altered chromatin state associated with gene activation. Whether there is a tendency for potential target sequences to occur in regions involved in DNase I hypersensitivity and gene activation has not been determined.

Is susceptibility to mobile element insertion an indicator of which genes are active in germinal cells or their precursors? That a large proportion of putative point mutations at su(s) result in varying degrees of cold-sensitive male sterility (VOELKER et al. 1989) suggests that su(s) is transcribed in germinal tissues (male, at least) and/or in somatic cells that are precursors of germinal cells. That only transpositions that occur in the germinal tissues are genetically transmitted to offspring requires that all mobile elements transpose in the germ line cells or their precursors. This study has shown that in cells of the early embryo (which includes germ line precursors) the chromatin contains DNase I hypersensitive sites near the 5' end of su(s). Perhaps these hypersensitive regions are involved in gene activation and impart accessibility of the DNA to transposases. Because any and all mobile elements transpose when su(s) is likely to be active, it is perhaps not surprising that a wide array of mobile elements are found to have inserted at or near the 5' end of su(s). In the larger perspective the differential susceptibility of genes to mobile element insertions may be an indicator of whether or not they are being actively transcribed in germ cells or their precursors. Stated another way, loci that are not susceptible to mobile element insertions may be “resistant” because their rather tightly coiled, transcriptionally inactive DNA is not accessible to transposase.

How might the mobile element insertions cause mutant phenotypes? It is possible to surmise how some of the insertions within the large 5' intron cause a mutant phenotype. The insertions of six mutations (e5.6, 1466, 8?fI, 5Ij, 66 and 84a) occur within 34 nt downstream from the GT donor splice signal of the 2053-nt intron (see Figure 1). They might interfere with the pairing of the 5' end of the intron with the U1 RNA in the spliceosome complex, thereby inhibiting splicing of the primary transcript and causing a reduced amount of qualitatively wild-type message to be produced. None of the six has a phenotype as severe as 2, which has been shown to produce a strongly reduced amount of apparently wild-type size transcript (CHANG et al. 1986). The remaining five non-P element insertions (1270, 2, 5 1 ~ 1 5, ab and S) occurred near the middle of the intron, distributed over a ~600-nt region. They may interrupt promoter or enhancer sequences as suggested below for 83$24( 10).

The P element causing 8?$24( 10) inserted within the large intron following nt 1637, which lies within a small 82-nt SalI fragment (1573-1655). A P element transformation construct lacking this 82-nt SalI fragment is capable of wild-type su(s) function (VOELKER et al. 1989). Yet, a P element insertion into this fragment causes a mutant phenotype. A UWGCG StemLoop search indicated that this SalI fragment contains sequences that could potentially form a prominent stem-loop structure (stem = 20 nt, 40 hydrogen bonds and a 12-nt loop). The P element inserted at the right end of this potential stem-loop. Other similar potential palindromes also occur within the region. Because the ~425-nt deletion associated with 2 also includes this region, it is possible that the strong mutant effect of 2 is due to a combination of interference of the gypsy insertion and the deletion of the potential stem-loop sequences that may be associated with gene activation.

The authors wish to thank JOAN STERLING, SCOTT CARPENTER and INCA OLEKSY for providing technical assistance during the course of this study. The Drosophila stock centers at Bowling Green, Ohio, Bloomington, Indiana, and Umea, Sweden, kindly and efficiently provided various mutant Drosophila stocks. Thanks are also extended to BURKE JUDD, LILLIE SEARLES and MICHAEL SIMMONS for their constructive criticisms of the manuscript.

LITERATURE CITED

BAYEV, A. A., N. V. LYUBOMIRSKAYA, E. B. DZHUMAGALIEV, E. V. ANANIEV, I. G. AMIANTOVA and Y. V. ILYIN, 1984 Structural organization of transposable element mdg4 from Drosophila melanogaster and a nucleotide sequence of its long terminal repeats. Nucleic Acids Res. 12: 3707-3723.

CALOS, M. P., and J. H. MILLER, 1980 Transposable elements. Cell 20: 579-595.

CHANG, D.-Y., B. WISELY, S.-M. HUANG and R. A. VOELKER, 1986 Molecular cloning of suppressor of sable, a Drosophila melanogaster transposon-mediated suppressor. Mol. Cell. Biol. 6: 1520-1528.

CHIA, W., G. HOWES, M. MARTIN, Y. B. MENG, K. MOSES and S. TSUBOTA, 1986 Molecular analysis of the yellow locus of Drosophila. EMBO J. 5: 3597-3605.

CLARK, S. H., M. MCCARRON, C. LOVE and A. CHOVNICK, 1986 On the identification of the rosy locus DNA in Drosophila melanogaster: intragenic recombination mapping of mutations associated with insertions and deletions. Genetics 112: 755-767.

COTÉ, B., W. BENDER, D. CURTIS and A. CHOVNICK, 1986 Molecular mapping of the rosy locus in Drosophila melanogaster. Genetics 112: 769-783.


DAVIS, P. S., M. W. SHEN and B. H. JUDD, 1987 Asymmetrical pairing of transposons in and proximal to the white locus of Drosophila account for four classes of regularly occurring exchange products. Proc. Natl. Acad. Sci. USA 84: 174-178.

ELGIN, S. R. C., 1988 The formation and function of DNase I hypersensitive sites in the process of gene activation. J. Biol. Chem. 263: 19259-19262.

FINNEGAN, D. J., and D. H. FAWCETT, 1986 Transposable elements in Drosophila melanogaster. Oxford Surv. Eukaryotic Genes 3: 1-62.

FOX, A., 1977 Gene transfer in Drosophila melanogaster, pp. 101-131 in Molecular Genetic Modification of Eucaryotes, edited by I. RUBENSTEIN et al. Academic Press, New York.

FOX, A. S., and S. B. YOON, 1970 DNA induced transformation in Drosophila: locus specificity and the establishment of transformed stocks. Proc. Natl. Acad. Sci. USA 67: 1608-1615.

FREUND, R., and M. MESELSON, 1984 Long terminal repeat nucleotide sequence and specific insertion of the gypsy transposon. Proc. Natl. Acad. Sci. USA 81: 4462-4464.

GEORGIEV, P. G., S. L. KISELEV, O. B. SIMONOVA and T. I. GERASIMOVA, 1990 A novel transposition system in Drosophila melanogaster depending on the Stalker mobile genetic element. EMBO J. 9: 2037-2044.

GERMERAAD, S., 1975 Induction of genetic alterations in Drosophila by injection of DNA into embryos. Genetics 80: 534-535.

GERMERAAD, S., 1976 Genetic transformation in Drosophila by microinjection of DNA. Nature 262: 229-231.

GEYER, P., AND V. G. CORCES, 1987 Separate regulatory elements are responsible for the complex pattern of tissue-specific and developmental transcription of the yellow locus in Drosophila melanogaster. Genes Dev. 1: 996-1004.

GEYER, P. K., K. L. RICHARDSON, V. G. CORCES and M. M. GREEN, 1988 Genetic instability in Drosophila melanogaster: P-element mutagenesis by gene conversion. Proc. Natl. Acad. Sci. USA 85: 6455-6459.

GEYER, P. K., M. M. GREEN and V. G. CORCES, 1990 Tissue- specific transcriptional enhancers may act in trans on the gene located in the homologous chromosome: the molecular basis of transvection in Drosophila. EMBO 9: 2247-2256

GOLDBERG, M. L., J.-Y. SHEEN, W. J. GEHRINC and M. M. GREEN, 1983 Unequal crossing-over associated with assymmetrical synapsis between nomadic elements in the Drosophila melano- gaster genome. Proc. Natl. Acad. Sci. USA 80: 5017-5021.

GROSS, D. S., and W. T. GARRARD, 1988 Nuclease hypersensitive sites in chromatin. Annu. Rev. Biochem. 57: 159-197.

INOUE, S., S. YUKI and K. SAIGO, 1984 Sequence-specific insertion of the Drosophila transposable element 17.6. Nature 310: 332-333.

JUDD, B. H., 1987 The white locus of Drosophila melanogaster. Results Probl. Cell Differ., pp. 81-94.

KARCH, F., B. WEIFFENBACH, W. BENDER, I. DUNCAN, S. CELNIKER, M. CROSBY and E. B. LEWIS, 1985 The abdominal region of the bithorax complex. Cell 43: 81-96.

KARLIK, C. C., and E. FYRBERG, 1985 An insertion within a variably spliced Drosophila tropomyosin gene blocks accumulation of only one encoded isoform. Cell 41: 57-66.

KEENE, M. A., V. CORCES, K. LOWENHAUPT and S. C. R. ELGIN, 1981 DNase I hypersensitive sites in Drosophila chromatin occur at the 5’ ends of regions of transcription. Proc. Natl. Acad. Sci. USA 78: 143-146.

KEITH, T. P., M. A. RILEY, M. KREITMAN, R. C. LEWONTIN, D. CURTIS and G. CHAMBERS, 1987 Sequence of the structural gene for xanthine dehydrogenase (rosy locus) in Drosophila melanogaster. Genetics 116: 67-73.

KELLEY, M. R., S. KIDD, R. BERG and M. W. YOUNG, 1987 Restriction of P-element insertions at the Notch locus of Drosophila melanogaster. Mol. Cell. Biol. 7: 1545-1548.

KUGIMIYA, W., H. IKENAGA and K. SAIGO, 1983 Close relationship between the long terminal repeats of avian leukosis-sarcoma virus and copia-like movable genetic elements of Drosophila. Proc. Natl. Acad. Sci. USA 80: 3193-3197.

LEE, C. S., D. CURTIS, M. MCCARRON, C. LOVE, M. GRAY, W. BENDER and A. CHOVNICK, 1987 Mutations affecting expression of the rosy locus in Drosophila melanogaster. Genetics 116: 55-66.

LINDSLEY, D. L., and G. ZIMM, 1990 The genome of Drosophila melanogaster. Part 4: Genes L-Z, balancers and transposable elements. Drosophila Inform. Serv. 68: 250-252.

O’HARE, K., and G. M. RUBIN, 1983 Structures of P transposable elements and their sites of insertion and excision in the Drosophila melanogaster genome. Cell 34: 25-36.

PEARSON, W. R., and D. J. LIPMAN, 1988 Improved tools for biological sequence comparison. Proc. Natl. Acad. Sci. USA 85: 2444-2448.

PERKINS, K. K., G. M. DAILEY and R. TJIAN, 1988 In vitro analysis of the Antennapedia P2 promoter: identification of a new Drosophila transcription factor. Genes Dev. 2: 1615-1626.

ROIHA, H., G. M. RUBIN and K. O’HARE, 1988 P element insertions and rearrangements at the singed locus of Drosophila melanogaster. Genetics 119: 75-83.

SEARLES, L. L., and R. A. VOELKER, 1986 Molecular characterization of the vermilion locus and its suppressible alleles. Proc. Natl. Acad. Sci. USA 83: 404-408.

SEARLES, L. L., R. S. JOKERST, P. M. BINGHAM, R. A. VOELKER and A. L. GREENLEAF, 1982 Molecular cloning of sequences from a Drosophila RNA polymerase II locus by P element transposon tagging. Cell 31: 585-592.

SEARLES, L. L., A. L. GREENLEAF, W. E. KEMP and R. A. VOELKER, 1986 Sites of P element insertions and structures of P element deletions in the 5’ region of Drosophila melanogaster RpII215. Mol. Cell. Biol. 6: 3312-3319.

SEARLES, L. L., R. S. RUTH, A.-M. PRET, R. A. FRIDELL and A. J. ALI, 1990 Structure and transcription of the Drosophila melanogaster vermilion gene and several mutant alleles. Mol. Cell. Biol. 10: 1423-1431.

SOELLER, W. C., S. J. POOLE and T. KORNBERG, 1988 In vitro transcription of the Drosophila engrailed gene. Genes Dev. 2: 68-81.

SNYDER, M. P., D. KIMBRELL, M. HUNKAPILLER, R. HILL, J. FRISTROM and N. DAVIDSON, 1982 A transposable element that splits the promoter region inactivates a Drosophila cuticle protein gene. Proc. Natl. Acad. Sci. USA 79: 7430-7434.

TSUBOTA, S., M. ASHBURNER and P. SCHEDL, 1985 P-element-induced control mutations at the r gene of Drosophila melanogaster. Mol. Cell. Biol. 5: 2567-2574.

VOELKER, R. A., S.-M. HUANG, G. B. WISELY, J. F. STERLING, S. P. BAINBRIDGE and K. HIRAIZUMI, 1989 Molecular and genetic organization of the suppressor of sable and M(1)1B region of Drosophila melanogaster. Genetics 122: 625-642.

Communicating editor: W. M. GELBART