The Plant Journal (1997) 12(5), 1197-1211

TECHNICAL ADVANCE

Map positions of 47 Arabidopsis sequences with sequence similarity to disease resistance genes

Miguel A. Botella 1, Mark J. Coleman 1, Douglas E. Hughes 2, Marc T. Nishimura 2, Jonathan D.G. Jones 1 and Shauna C. Somerville 2,*
1 The Sainsbury Laboratory, John Innes Centre, Colney Lane, Norwich NR4 7UH, UK, and 2 Carnegie Institution of Washington, Department of Plant Biology, Stanford, CA 94305, USA

Summary

Map positions have been determined for 42 non-redundant Arabidopsis expressed sequence tags (ESTs) showing similarity to disease resistance genes (R-ESTs), and for three Pto-like sequences that were amplified with degenerate primers. Employing a PCR-based strategy, yeast artificial chromosome (YAC) clones containing the EST sequences were identified. Since many YACs have been mapped, the locations of the R-ESTs could be inferred from the map positions of the YACs. R-EST clones that exhibited ambiguous map positions were mapped as either cleavable amplifiable polymorphic sequence (CAPS) or restriction fragment length polymorphism (RFLP) markers using F8 (Ler × Col-0) recombinant inbred (RI) lines. In all cases but two, the R-ESTs and Pto-like sequences mapped to single, unique locations. One R-EST and one Pto-like sequence each mapped to two locations. Thus, a total of 47 loci were identified in this study. Several R-ESTs occur in clusters, suggesting that they may have arisen via gene duplication events. Interestingly, several R-ESTs map to regions containing genetically defined disease resistance genes. Thus, this collection of mapped R-ESTs may expedite the isolation of disease resistance genes. As the cDNA sequencing projects have identified an estimated 63% of Arabidopsis genes, a very large number of R-ESTs (~95), and by inference disease resistance genes of the leucine-rich repeat class, probably occur in the Arabidopsis genome.

Introduction

Arabidopsis serves as a useful model for investigating plant interactions with a wide array of parasites, including bacterial, fungal, viral and nematode pathogens (Crute et al., 1994; Holub, 1997; Kunkel, 1996). Arabidopsis disease resistance genes are being identified at a rapid rate, due, in part, to the extensive natural variation that occurs in this species and to the availability of more than 350 accessions in the Arabidopsis stock centers. Roughly 27 mapped disease resistance loci have been identified (Kunkel, 1996), and, at a recent conference, preliminary reports of an additional 12 resistance genes were presented (Somerville and Somerville, 1996).

Received 13 March 1997; revised 18 June 1997; accepted 26 June 1997. *For correspondence (e-mail [email protected]).

Disease resistance loci are not randomly distributed in the Arabidopsis genome. Rather, clusters of disease resistance and defense response genes exist, which Holub (1997) has termed major recognition complexes (MRCs). The clustering of disease resistance genes has been noted in other plant species, including barley (Jørgensen, 1994), lettuce (Hulbert and Michelmore, 1985), flax (Ellis et al., 1995), tomato (Dixon et al., 1996; Jones et al., 1994) and maize (Richter et al., 1995). Genetic studies suggest that both intergenic recombination between tightly linked resistance genes and intragenic recombination contribute to allelic variation at the maize Rp1 locus (Richter et al., 1995). Similarly, molecular evidence suggests the flax rust resistance locus, M, consists of a cluster of closely linked genes (Ellis et al., 1995). An understanding of the role of intragenic and intergenic recombination in creating new allelic variants or in generating new resistance genes will be important in developing strategies to engineer novel resistance specificities.

Recently, several disease resistance genes have been isolated from different plant species, including Arabidopsis, providing a first glimpse of the primary structure of this economically important group of proteins. The first plant resistance gene isolated that conforms to the gene-for-gene hypothesis was the tomato Pto gene, which confers resistance to Pseudomonas syringae pv. tomato carrying avrPto (Martin et al., 1993). Pto encodes a Ser/Thr protein kinase, implicating protein phosphorylation in the pathogen recognition process (Loh and Martin, 1995). Most other cloned resistance genes encode proteins containing a leucine-rich repeat (LRR) domain (Bent, 1996; Jones and Jones, 1996). LRR-containing resistance genes fall into two subclasses: those encoding proteins containing an extra-cytoplasmic LRR domain and those containing cytoplasmic LRRs. The first class includes the tomato Cf-9 and Cf-2 genes, which code for resistance to defined races of the fungal pathogen, Cladosporium fulvum (Dixon et al., 1996; Jones et al., 1994), and the rice Xa21 gene encoding resistance to the bacterium Xanthomonas oryzae pv. oryzae (Song et al., 1995). It has been proposed that, upon ligand recognition, Cf-2 and Cf-9 transmit signals by interaction with another transmembrane protein, perhaps a transmembrane protein kinase (Dixon et al., 1996). In contrast, Xa21 may be able to effect signal transduction directly via its cytoplasmic Ser/Thr protein kinase domain (Song et al., 1995). Recently, a sugar beet nematode resistance gene, Hs1pro-1, was cloned (Cai et al., 1997). This gene appears to encode a protein with a membrane anchor at its C terminus and a leucine-rich extra-cytoplasmic region.

The second class of resistance genes includes the Arabidopsis genes RPS2 (resistance to P. syringae pv. tomato carrying avrRpt2; Bent et al., 1994; Mindrinos et al., 1994), RPM1 (resistance to P. syringae pv. maculicola with the avrRpm1 avirulence gene; Grant et al., 1995), and RPP5 (recognition of the downy mildew fungus, Peronospora parasitica; Parker et al., 1997). This group also includes the tobacco N gene (resistance to tobacco mosaic virus; Whitham et al., 1994) and the flax L6 gene (resistance to flax rust; Lawrence et al., 1995). The cytoplasmic class of LRR disease resistance genes can be divided further into two subclasses distinguished by features at their amino termini. A potential leucine zipper defines the first subclass, which includes RPS2 and RPM1. The second subclass consists of L6, N and RPP5, genes containing an amino terminal domain homologous to the Drosophila Toll and mammalian interleukin-1 receptors. The cloning and characterization of additional resistance genes will reveal how many fall within these classes and how many define novel classes of disease resistance genes.

The conservation of features, such as the LRR motif, among a large fraction of the cloned disease resistance genes has suggested that conserved sequences can be used to find additional resistance genes. Recently, three groups have independently designed degenerate primers based on conserved sequences in the nucleotide binding site and a weak hydrophobic domain to amplify resistance gene-like sequences (Kanazin et al., 1996; Leister et al., 1996; Yu et al., 1996). The PCR products were placed into distinct groups and a member of each group was mapped genetically. Altogether, 19 soybean (Kanazin et al., 1996; Yu et al., 1996) and 12 potato (Leister et al., 1996) sequences were identified and mapped. In each of the three projects, several resistance gene-related fragments mapped near known disease resistance loci. In one particularly noteworthy example, no recombinants from a population of 1100 gametes separated a resistance gene-related fragment from the nematode resistance gene, Gro1 (Leister et al., 1996). A fourth group used a Pto-like cDNA to identify a novel receptor-kinase-like gene that co-segregates with the wheat Lr10 rust resistance gene (Feuillet et al., 1997). These results suggest that this will be a productive strategy for identifying resistance gene sequences, especially in economically important crop species.

Arabidopsis is one of the few plant species for which an extensive set of EST sequences has been collected (Cooke et al., 1996; Newman et al., 1994). In a recent compilation, it was determined that 29 166 Arabidopsis EST sequences had been deposited in the National Center for Biotechnology Information (NCBI) database of ESTs (dbEST). Most of the clones that were sequenced to generate the EST sequence information have been deposited in the Arabidopsis Biological Resources Stock Center and are readily available to the community (http://aims.cps.msu.edu/aims/). In The Institute for Genomic Research (TIGR) database, overlapping Arabidopsis ESTs derived from the same gene have been grouped into tentative consensus (TC) assemblies (Rounsley et al., 1996). The Arabidopsis ESTs have been placed into 4153 TC groups and 12 598 singleton ESTs (Release 1.5; Rounsley et al., 1996). These 16 751 sequences represent the products of distinct genes in most cases and it is estimated that roughly 63% of Arabidopsis genes are represented by at least one EST (Rounsley et al., 1996). Using various software searching programs, such as the FASTA or BLAST programs, the databases can be searched for sequences similar or identical to known genes or proteins (Altschul et al., 1990; Pearson and Lipman, 1988). This approach is a fast and efficient method for identifying Arabidopsis genes and has been used extensively by the Arabidopsis community. Many disease resistance genes are expressed in uninfected tissues (e.g. Cf-9, Jones et al., 1994; L6, Lawrence et al., 1995; RPM1, Grant et al., 1995; RPS2, Mindrinos et al., 1994). Thus, although none of the tissues used for generating the majority of EST clones were specifically infected with pathogens, clones for disease resistance genes should still be represented in the database. Therefore, screening the Arabidopsis EST database with the sequences of cloned disease resistance genes should be an efficient method of identifying candidate disease resistance genes.
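The EST counts quoted above can be checked with a few lines of arithmetic. The Python sketch below reproduces the 16 751 figure from its two components; the implied total gene number is not stated in the text and is only what the 63% estimate would suggest if each distinct sequence were taken to correspond to one gene, which is an assumption made here for illustration.

```python
# Figures quoted in the text (TIGR Release 1.5; Rounsley et al., 1996).
tc_groups = 4153     # tentative consensus (TC) assemblies
singletons = 12598   # ESTs not assigned to any TC group
coverage = 0.63      # estimated fraction of genes represented by >= 1 EST

# Distinct sequences = TC groups + singleton ESTs.
distinct_sequences = tc_groups + singletons

# Assumption (not from the paper): one distinct sequence per gene, so the
# coverage estimate implies an approximate total gene number.
implied_total_genes = round(distinct_sequences / coverage)

print(distinct_sequences, implied_total_genes)
```

Because some singletons and TC groups may derive from the same gene, the implied total should be read as a rough upper guide rather than a gene count.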

The construction of physical maps consisting of contiguous yeast artificial chromosome (YAC) clones covering the Arabidopsis genome (Hwang et al., 1991; Matallana et al., 1992; Schmidt et al., 1996; Zachgo et al., 1996) facilitates the mapping of Arabidopsis genes. By identifying a set of anchored YACs containing the gene of interest, the map position of the gene can be deduced from the map position of the YACs. The use of YAC libraries permits the rapid mapping of ESTs because it is not necessary to identify a polymorphic marker for each EST. This is a generally useful strategy that can be applied to the mapping of any gene or EST of interest in Arabidopsis (Agyare et al., 1997).
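The inference from anchored YACs to a map position can be sketched in Python as follows. This is an illustration only: the function name, the status labels and the YAC and marker names are hypothetical, and the rule of flagging a position supported by a single YAC reflects the general concern that YAC clones can be chimeric.

```python
from collections import Counter

def infer_map_position(positive_yacs, yac_anchors):
    """Infer a map position from the anchored YACs that contain a sequence.

    positive_yacs: names of YAC clones that gave a PCR product.
    yac_anchors:   dict mapping a YAC name to its anchoring marker;
                   YACs missing from the dict are treated as unanchored.
    Returns (marker, status) with status one of
    'consistent', 'single-YAC', 'ambiguous' or 'unanchored'.
    """
    markers = [yac_anchors[y] for y in positive_yacs if y in yac_anchors]
    if not markers:
        return None, "unanchored"
    if len(Counter(markers)) > 1:
        # Positive YACs point to different markers: no single position.
        return None, "ambiguous"
    # One marker only; a lone YAC is flagged because it may be chimeric.
    status = "consistent" if len(markers) > 1 else "single-YAC"
    return markers[0], status

# Hypothetical anchoring data for illustration.
anchors = {"YAC_A": "marker1", "YAC_B": "marker1", "YAC_C": "marker2"}
print(infer_map_position(["YAC_A", "YAC_B"], anchors))
```

Sequences returned as 'ambiguous' or 'unanchored' are the ones that would need a genetic mapping approach instead.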

In this paper, the identification of 94 Arabidopsis ESTs with sequence similarity to plant disease resistance genes is presented. These 94 R-ESTs fell into 62 distinct groups. The map locations for 42 of these ESTs were established by determining their YAC coordinates or by genetic mapping. In addition, four Pto-like sequences were recovered and three were mapped. The aim of this work is to facilitate the cloning and characterization of disease resistance genes by the Arabidopsis community.

Results

Identification of R-ESTs and Pto-like sequences

All of the disease resistance genes used for sequence similarity searches of the various databases encode proteins that contain LRRs (Table 1). We chose not to include Hm1, the maize gene conferring resistance to Cochliobolus carbonum, because this gene encodes a reductase that detoxifies a highly specific pathogen-produced toxin (Johal and Briggs, 1992), and was regarded as a specialized example. Computer-aided sequence similarity searches were made with the BLAST programs (Altschul et al., 1990) and the NCBI non-redundant nucleotide and protein sequence databases or the Arabidopsis subset of nucleotide and protein sequences available via the Arabidopsis thaliana database (http://genome-www.stanford.edu/Arabidopsis/). In parallel, FASTA analyses were also performed. FASTA analyses can be more effective at identifying regions of sequence similarity than BLAST comparisons in some cases because the former can accommodate gaps in one of the two sequences being compared. Although T44979 and Z30811 did not give significant BLAST scores (Table 1), these ESTs gave significant FASTA scores with the Cf-9 protein (FASTA score = 91) and N protein (FASTA score = 83).

Even though RPS2, RPM1 and RPP5 are expressed at low levels, ESTs highly similar to these resistance genes were recovered (Table 1). For example, the nucleotide sequences of R64932 and T88149 were virtually identical to sequences in RPS2 and the T44885 sequence was highly similar to sequences in RPM1. T41662 strongly resembled RPP5 in nucleotide sequence. The recovery of ESTs for known resistance genes suggests that the assumption that resistance genes are represented in dbEST is correct.

Since the entire protein sequence of a disease resistance gene was used in the similarity searches, ESTs with similarity to any part of a disease resistance gene could potentially be identified. The one exception, the Ser/Thr kinase domain of Xa21, was not used in BLAST searches due to the very large number of kinases present in the database. We assumed that the proportion of kinases specifically involved in plant disease resistance could be relatively small. Considering the entire set of 62 R-ESTs (see Table 1), the largest number of ESTs were identified based on their sequence similarity to the LRR domain of one of the disease resistance genes (46/62). The region of similarity for four R-ESTs (i.e. T04362, T44885, T44979, Z17993) spanned a pre-LRR segment and the beginning of the LRR domain. Additionally, three R-ESTs (i.e. T75662, Z18443, Z30817) exhibited significant sequence similarity with the region just preceding the LRR domain. Although no specific function has been assigned to this region, it does appear to be reasonably well conserved. Five R-ESTs were similar to the Toll domain, two were similar to the nucleotide binding site, and two encoded kinase sequences. No ESTs showing similarity specifically to the leucine zipper region of RPS2 and RPM1 were recovered, which most likely reflects the bias of searches towards contiguous stretches of identical sequence.

In the discussion below, TBLASTN scores are used as a basis for comparisons between the amino acid sequence of each disease resistance protein and the deduced amino acid sequence of each R-EST (Table 1). Only comparisons that resulted in BLAST scores >80, indicating significant sequence similarity, are discussed (Newman et al., 1994).
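The score cut-off can be applied mechanically. The Python sketch below filters a handful of (EST, protein, score) entries with the strict ">80" rule; the entries are a small illustrative subset read from Table 1, and the function and variable names are hypothetical.

```python
SIGNIFICANT = 80  # scores must be strictly greater than this value

# A few (EST, target protein, TBLASTN score) entries read from Table 1.
hits = [
    ("R64932", "RPS2", 568),
    ("T88149", "RPS2", 577),
    ("T88149", "RPM1", 80),   # exactly 80, so not counted as significant
    ("T88149", "Cf-9", 53),
    ("T44885", "RPM1", 635),
    ("H36320", "RPM1", 94),
]

def significant_hits(hits, threshold=SIGNIFICANT):
    """Keep only comparisons whose score exceeds the threshold."""
    return [(est, protein, score)
            for est, protein, score in hits
            if score > threshold]

for est, protein, score in significant_hits(hits):
    print(est, protein, score)
```

With this subset, each EST retains significant matches to only one resistance protein.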

When the Cf-9 amino acid sequence, which consists primarily of extra-cytoplasmic LRRs, was used in BLAST searches, a majority of ESTs were similar in sequence to the LRR domain (Table 1). The deduced amino acid sequences of a few ESTs were similar to the region immediately preceding the LRR domain and none showed similarity to the transmembrane or cytoplasmic domains of Cf-9. Not surprisingly, BLAST searches with the Xa21 amino acid sequence, which has an LRR domain similar to Cf-9 (Song et al., 1995), identified many of the same ESTs (Table 1). In addition, Xa21 identified the two Pto-like ESTs, H36913 and R64794.

The number of ESTs exhibiting sequence similarity to RPM1 or RPS2, proteins characterized by a leucine zipper, a nucleotide binding site and an LRR domain, was less than the number that were similar to Cf-9 and Xa21 (Table 1). In addition to R64932 and T88149, the ESTs derived from the RPS2 gene, only EST Z17993 showed significant sequence similarity to RPS2. ESTs H36320 and N97067 exhibited significant similarity to RPM1 in the LRR domain and little or no similarity to the other disease resistance proteins. None of the R-ESTs were similar to the nucleotide binding site or leucine zipper regions of RPM1 or RPS2 (Table 1).

The second subclass of cytoplasmic LRR resistance proteins, including N, L6 and RPP5, identified a third set of ESTs. Among these R-ESTs, one, F14315, was similar to the nucleotide binding site of N, and five (i.e. H77224, N65692, R29891, T20808 and T46721) were similar to the N-terminal Toll region or the region immediately following the Toll domain. The remaining R-ESTs with sequence similarity to L6, N or RPP5 contained LRR sequences.

If only BLAST scores >80 are considered, then R-ESTs exhibited sequence similarity only with resistance proteins from the same class (e.g. with Cf-9 and Xa21 or with L6, N and RPP5). Although a rough measure of relatedness due to the incomplete nature of the EST sequences, these data support the validity of the classes of resistance genes deduced from an analysis of resistance gene sequences (Jones and Jones, 1996).

Table 1. R-ESTs with sequence similarity to disease resistance genes

Target resistance proteins (score columns, left to right): Cf-9, Xa21, RPM1, RPS2, L6, N, RPP5, Pto

GenBank   Scores (c), as recovered left to right   Region of similarity (a)
F13577    76 70 - - - -                             LRR
F14315    - - - -                                   NBS
F15353    62 95 - -                                 LRR
F15571    59 62 - - -                               LRR
F20107    124 164 - -                               LRR
H36320    - - 94 -                                  LRR
H36821    128 112 - -                               LRR
H36913    - 146 - -                                 Kinase
H37061    - - - - 62                                LRR
H77224    - - - 87 58                               post-Toll
N38401    74 59 - -                                 LRR
N65198    132 121 - - -                             LRR
N65416    87 116 - -                                LRR
N65549    88 83 - -                                 LRR
N65692    - - - - 204 272                           Toll
N65836    93 111 - - -                              LRR
N95848    99 104 - -                                LRR
N96078    - - - - 62                                LRR
N96307    - 65 76 67 73                             LRR
N96493    - - 66 72 -                               LRR
N96711    82 56 - -                                 LRR
N97067    - - 99 59 76                              LRR
R29891    - - - - 192 296                           Toll
R30025    - - - - 68                                LRR
R30624    104 88 - -                                LRR
R64794    - 105 - -                                 Kinase
R64932    - - - 568 62 -                            LRR
R89998    173 216 - - 68                            LRR
R90150    107 94 - - 55                             LRR
T04109    77 94 - - 67                              LRR
T04135    135 113 - - -                             LRR
T04362    88 54 - -                                 pre-LRR+LRR
T13648    90 100 - -                                LRR
T14233    97 128 - - - 55                           LRR
T20493    145 153 - 49                              LRR
T20671    71 101 - - -                              LRR
T20808    - - - 68 96                               Toll
T21447    173 189 - - - 78                          LRR
T41629    76 78 - -                                 LRR
T41662    - - - -                                   LRR
T42294    69 66 75 63 - 67                          LRR
T43968    - - - - 55                                LRR
T44885    - - 635 65 - -                            pre-LRR+LRR
T44979    - - - - -                                 pre-LRR+LRR
T45845    130 201 - 55                              LRR
T45996    78 94 - -                                 LRR
T46064    - - 60                                    LRR
T46145    - - -                                     LRR
T46379    91 101 -                                  LRR
T46721    - - -                                     Toll
T75662    60 - -                                    pre-LRR
T88149    53 - 80 577                               LRR
Z17798    138 171 -                                 LRR

Score entries displaced from their rows, in source order: 74 160; 62; 74; 125; 196; 111; 64; 149; 48; 61; 60; 65; 156; 53; 477; 114; 62; 99 203; - 119; 67 74 83; - 64 -; - 65 -. Pto column entries, in source order: 91; 161.

TC group (b), in source row order: na ×23, TC8582, na, TC11717, na, TC9681, TC11266, na ×7, TC13957, na ×5, TC8357, na ×3, TC9706, na ×2, TC11016, na ×2.

Contigs with (b), in source order: (F20107 and F20108 are from the same clone.); T21838; Z37669; H37195, H37296, H37300; R84144, T21150; R65475; W43530; R64813, R90306, T04461, T04558, T44406, T46409, Z37660; H36884, T22090, T44830, Z34571, Z34572; T42271.

Table 1. Continued

GenBank   Scores (c), as recovered left to right   Region of similarity (a)   TC group (b)   Contigs with (b)
Z17993    - - 191 - -                               pre-LRR+LRR                na
Z18443    - - - 65 83 64 -                          pre-LRR                    na
Z26226    163 209 48 63 -                           LRR                        TC8632         H36976, N38535, Z26710, Z30856, Z33755, DRT100 (Z26226 and Z26710 are from the same clone.)
Z30800    95 102 - - - - -                          LRR                        TC10452        T42336
Z30811    - - - - - - -                             LRR                        na
Z30817    65 - - - - - - -                          pre-LRR                    TC10756        T22460
Z33873    82 53 - - - - -                           LRR                        TC9829         Z46820 (Z46819 and Z46820 are from the same clone.)
Z34772    - - - - - 52 56 -                         NBS                        na
Z46819    128 133 - - - 56 - -                      LRR                        TC11040        T45680 (Z46819 and Z46820 are from the same clone.)

(a) The designation "LRR" or "pre-LRR" indicates that the R-EST was similar to a disease resistance gene in the LRR domain or the domain immediately upstream of the LRR domain. "Toll", "NBS" and "Kinase" indicate that the R-EST showed similarity to a Toll-signaling, a nucleotide binding site or a Ser/Thr kinase domain, respectively.
(b) The last two columns indicate the TIGR TC assembly group to which the R-EST belongs and any other ESTs that are members of the same TC group. na, not assignable.
(c) The values represent TBLASTN scores (BLOSUM 62 matrix, no filtering), which reflect the degree of sequence similarity between the deduced amino acid sequences of the indicated disease resistance protein and the R-EST, with higher scores indicating a greater degree of sequence similarity. Dashes ("-") indicate that no sequence similarity was found. See Altschul et al. (1990) and Newman et al. (1994) for a description of BLAST scores.

Initial searches with the Pto polypeptide sequence identified several Arabidopsis genes of unknown function, including APK1 (Hirayama and Oka, 1992) and ATPRIKINA (Moran and Walker, 1993), plus three Arabidopsis EST sequences, H36913, R64794 and Z37669. An alignment of the deduced amino acid sequences of APK1 and ATPRIKINA revealed blocks of conserved amino acid residues (data not shown). Three degenerate primers (kinf1, kinf3 and kinr1) were designed based on the conserved motifs and used in a two-step PCR amplification protocol using Landsberg erecta (Ler) genomic DNA as the template. Three distinct fragments of approximately 260, 370 and 490 bp were produced (data not shown). These fragments were cloned and the nucleotide sequences of nine were determined. Of these, two apparently did not encode kinases, and two encoded putative protein kinases unlike any plant protein kinases in the databases. Three of the cloned products (designated PK2, PK5 and PK9) coded for putative protein kinases closely related to Pto, Fen (Loh and Martin, 1995), APK1 and ATPRIKINA. One, PK1, encoded a putative product similar to Pto but more closely related to plant receptor-like protein kinases (Braun and Walker, 1996). The gene fragments represented by PK1, PK2, PK5, and PK9 encoded putative protein products exhibiting respectively 48.6, 51.1, 45.1 and 53.6% amino acid sequence identity with Pto. A more recent search based on Pto revealed >132 significant matches in the TIGR database, suggesting that additional Pto-like ESTs could be analyzed in future.
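Percent identity values of this kind are computed from a pairwise alignment. The Python sketch below shows one common convention, counting identical residues over aligned columns in which neither sequence has a gap. The text does not state which convention or alignment program was used, so the denominator choice is an assumption, and the two short peptide strings are invented for illustration.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # gapped columns are excluded from the comparison
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared

# Invented toy alignment: 7 ungapped columns, 5 of them identical.
print(round(percent_identity("GKLVAVKR", "GKV-AIKR"), 1))
```

Other conventions, such as dividing by the full alignment length or by the shorter sequence, give different values for the same alignment.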

The R-ESTs were used to search the TIGR database of TC assemblies. Thirteen TC assembly groups, consisting of 2-8 ESTs, were found (Table 1). For comparison, the average TC group consists of four members, with the largest consisting of 221 EST sequences (Rounsley et al., 1996). In total, 94 R-ESTs, which include three Pto-like ESTs, were analyzed in this report. These represent 62 non-redundant sequences. No conclusions can be drawn about the relative levels of expression of the various genes based on the number of ESTs identified for each gene because EST clones are not sequenced at random from the source libraries (Cooke et al., 1996; Newman et al., 1994). However, most genes were represented by a single R-EST, suggesting that the genes analyzed are not highly expressed. This observation is consistent with the fact that disease resistance genes are weakly expressed.

The 3' ends of ESTs of six TIGR TC groups were sequenced to test whether the members of a given TC group represented clones derived from the same gene or from closely related genes. Most members of a TC group had nearly identical 3' sequences in regions of sequence overlap. For instance, the 3' sequences of members of TC8582, TC9681 and TC11016 grouped together in their respective TC groups. The 3' sequences of all members of TC8632, except H36976 (clone 181E3), were highly similar. About 6% of nucleotides from the 3' end of H36976 (clone 181E3) differed from the 3' sequences of other members of TC8632. Additional sequencing would be required to determine conclusively whether this clone was derived from the same gene or a distinct member of a gene family. The 3' sequence of T45680, a member of TC11040, groups with TC9829. Also, Z46820, a member of TC9829, and Z46819, a member of TC11040, are the 3' and 5' sequences, respectively, of the same clone. Thus, TC11040 and TC9829 consist of non-overlapping sequences from the 5' and 3' ends of the same gene. It is important to exercise some caution in interpreting these results both because the sequence fidelity of the ESTs is estimated to be only about 95% (Newman et al., 1994) and because the full-length sequences of the EST clones are not available. Alternative splicing, such as exhibited by N (Whitham et al., 1994) and L6 (Lawrence et al., 1995), or variation in the poly(A) addition site (Rothnie, 1996) may also contribute to variation in the 3' sequences of mRNAs derived from the same gene. Given the limitations of the mapping data (as noted below), R-ESTs R89998 and T21150, which form part of TC9681, mapped to similar locations (Table 3, Figure 1). Together, these data support the concept that members of a given TC group are generally derived from a single gene.
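The caution about sequence fidelity can be made concrete with a rough calculation. The Python sketch below estimates how often two single-pass reads of the same template would disagree if each read were about 95% accurate. It assumes errors are independent between reads and ignores the rare case in which both reads carry the same wrong base; neither assumption comes from the text, and the function name is hypothetical.

```python
def expected_mismatch(fidelity_a, fidelity_b):
    """Approximate fraction of positions at which two error-prone reads of
    the same template disagree, assuming independent errors.

    A position agrees (to a first approximation) only when both reads
    are correct there, so the mismatch fraction is 1 - fa * fb.
    """
    return 1.0 - fidelity_a * fidelity_b

est_fidelity = 0.95      # per-read accuracy quoted from Newman et al. (1994)
observed_difference = 0.06  # H36976 versus other members of TC8632

noise_level = expected_mismatch(est_fidelity, est_fidelity)
print(round(noise_level, 4))              # about 0.0975
print(observed_difference < noise_level)  # True
```

Under these assumptions, a 6% difference lies within the range that sequencing error alone could produce, which is consistent with the statement that additional sequencing would be needed to decide whether H36976 derives from a distinct gene.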

Mapping R-ESTs by identifying anchored YACs

Two YAC libraries, the CIC (Creusot et al., 1995) and yUP (Matallana et al., 1992) libraries, were screened, and a PCR-based method was used to identify YACs containing the ESTs. The primer sequences and the annealing temperature used in the PCR reactions are listed in Table 2. The ESTs T88149 and T41662, which show high homology to RPS2 and RPP5, respectively, were mapped as positive controls. Reassuringly, only YACs previously anchored by RPS2 or RPP5 were identified (Bent et al., 1994; Mindrinos et al., 1994; Parker et al., 1997).

Most of the ESTs were mapped unambiguously to a set of YACs anchored by the same marker (Table 3). If the map position of an R-EST is based on only one anchored YAC, the position should be viewed with caution due to the possible chimerism of the YAC (Creusot et al., 1995; Matallana et al., 1992; Schmidt et al., 1996). Sometimes, no anchored or only chimeric YAC clones were identified in either the CIC or yUP libraries. To provide map positions in these cases, we attempted to develop CAPS markers for the EST by digesting the PCR products amplified from Ler and Col-0 genomic DNA with six restriction endonucleases. CAPS markers were generated in this fashion for four ESTs, F15571, H77224, N97067 and R89998, and scored on 90 F8 (Ler × Col-0) RI lines (Lister and Dean, 1993). When this approach failed to yield a polymorphism, the EST (i.e. R30025, T04109, T20808, T43968, Z17993) was mapped as an RFLP marker using Southern blots of 27-30 highly recombinant F8 (Ler × Col-0) RI lines (Lister and Dean, 1993). The ESTs mapped as either CAPS markers or RFLP markers are noted in Table 3 and the original mapping data have been deposited at the Nottingham Stock Center (Anderson, 1996). In addition to F15571 and T04109, eight R-ESTs (i.e. F15353, H36320, H37061, R64932, T13648, T20493, T42294 and Z30800) were mapped to YAC clones but could not be placed on the genetic map due to the existence of chimeric or non-anchored clones. These eight, in addition to 11 other R-ESTs (i.e. F14315, N65198, N65416, N65549, N65836, N96307, T20671, T44885, T44979, T45996 and Z34772), were not mapped genetically for a variety of reasons, which included the absence of a polymorphic marker or poor amplification reactions.

The EST R30025 was mapped as an RFLP marker to chromosome 3 between the markers m457 and g2778 using the F8 (Ler × Col-0) RI lines. This position was confirmed by identifying YACs anchored in the same region of chromosome 3. Additionally, mapping with an F3 (Columbia-gl1 × Kas-1) population revealed a similar chromosome 3 location for R30025 (I.W. Wilson, unpublished results). However, an additional location for R30025 is shown on chromosome 5 based on the results of PCR amplifications with primers PEST33 and PEST34. We assume that an R30025 homologue resides on chromosome 5 (Table 3). PK1, the Pto-like kinase sequence, also mapped to two locations in the genome (Table 3, Figure 1).

The map positions for the R-ESTs given in Table 3 and Figure 1 should be used with some care because they are not very precise. The average size of the CIC YAC clones is 420 kbp and the average relationship between genetic and physical distance is about 140 kbp/cM (Schmidt et al., 1996). Therefore, R-ESTs mapped with reference to CIC YACs have been mapped to a 3.0 cM interval on average (420 kbp ÷ 140 kbp/cM). In one extreme example, YAC CIC6F3, which contains R-EST N97067, is estimated to be

Figure 1. Map positions of Arabidopsis R-EST and Pto-like sequences. R-ESTs with sequence similarity to the LRR class of disease resistance genes and Pto-like R-ESTs are indicated in red by their GenBank number. Pto-like sequences are indicated in black text by their clone name. The approximate positions of known disease resistance genes are shown in green text (Holub, 1997; Kunkel, 1996). Chromosome numbers are indicated above each chromosome.


Table 2. Primers and the specific annealing temperatures used to amplify R-EST and Pto-like sequences.

GenBank  Upper      Upper primer sequence (5'→3')    Lower      Lower primer sequence (5'→3')    Temp.     Product length
number   primer (a)                                  primer (a)                                  (°C) (b)  (bp)

R-EST CLONES
F13577   SS217      CGA TGG GTG TGG AGA GT           SS218      CCG GAG ATG TCA AGA TGT          50 (c)    173
F14315   SS190      GCC ATG ACC AAG GAA AAT AAG      SS191      AAG CAT ACA TGC AGA AGA TTT GA   50        157
F15353   SS219      GGA AGA GAA CCC TCA AGA AGA      SS220      CCG GTG TAA TCC CTC CTT          54        158
F15571   SS213      TTA ACG TAG CCG ATA ACC          SS214      CAA CGA AAA CAA ACC AGT          50 (c)    500
F20107   SS211      GAC GCG TGC CCA AAC TC           SS212      GGG AAA CAT AAC AAT CTC ACT      55 (d)    154
F20108   SS215      CTT CTC ATC ACT ATC ATC ACC      SS216      CGA CGC CAC TCC AAA GAC ACA      52        192
H36320   SS182      CCA ACT CTA GAG AAA CTC CTA      SS184      CAT CTG GAA TCT CCT TTA          52        223
H36821   SS166      AGC GAC ATT GTG TCT CTT GTT      SS167      CCT GAG CAT TCG GAG GTT          56        390
H37061   SS164      CAT GGA GCC TTT CAG T            SS165      CCT GCA AGA GCT GTG AGT          48        174
H77224   SS180      TTG GGG AAG AGG AAG T            SS181      CTT AGC CTT GCT TTG ATA          50        222
N38401   SS203      GCC AGA GTA CAC AGG ACA TGA      SS204      ACG ACC AGT GAA GAG GTT GTT      52 (e)    229
N65549   SS195      CGT CTT GTT GTT CTT GGA          SS196      GGC TTA CGC ACC GAT TTT          55        281
N65692   SS225      AAA AAT CAA AAT GTC TTC TCA      SS226      TCC AGC TCT TTG TCG TCT          50        150
N65836   SS223      TCT GCC GGG GAC AAT GGA G        SS224      TCG AGT CTT GGG ATT TCA C        55        182
N95848   SS209      CTC CCT CCA AAA CCT AAT          SS210      TTA GAC GTG GCA AAG AAG          57 (f)    160
N96078   SS231      CAC AAT GAT GGG AAG TAA G        SS232      AAA AAC AAA ATA GGG AAA AAC      47        162
N96307   SS235      TTC GCT GAA TCT TTC CAA T        SS236      TTT CCT TTA AAT CGG ATA A        52        396
N96493   SS233      TTT ATC TGT CGG GCT GTT C        SS234      TAC TTC AGA TGC TTA CAA CCA      51        154
N96711   SS207*     TCC TGC AAC TTG ATA ATA ACA      SS208      TGT CAC ATC CCT TAG AAA AAT      50 (g)    213
N97067   SS199      GGA TTG CTG ATA CAT TGC          SS200      GCC CAT CAC CAC ATT C            48        153
R29891   PEST13     CAT ACT GCA ACT AAG TAC GAT G    PEST14     AGT ACT CCG GTC TGC CAC CTC A    55        342
R30025   PEST33     GCA AGC CTT CAA CAC TTA GCA TG   PEST34     CAA AGT TCC CGA AGG TTT GTG AG   55        323
R30025   SS150      GCA AGC CTT CAA CAC TTA GCA      SS151      AAG GGA TCC ACA GTT CGT CA       54        195
R30624   SS172      TTC GTC GAT TTC GGA GTT AG       SS173      CTC GGA ATT GAC CCA CTG          50 (c)    192
R64932   SS174      GAA GCC AAT CAT TTT CA           SS175      AAA CAT ATA CAG CAT CTC CA       54        155
R89998   SS176      GCC AAT CCC TAG AGC ACT CA       SS177      GTA ATT CCG GTC CCT CCA AC       50        500
R90150   SS178      AAT GTT TCC GAG GAT TGA A        SS179      GAT ATG TCC TGC AAG ATG GT       52 (e)    148
T04109   SS136      ATG ATC ACT CTT TTG CCT TTT      SS137      CCC CGG TGA GAT TGT T            54        184
T04135   SS130      AAT ATA TTG GGG ACT TGA AGA      SS131      ATA CCC CTT AGG ATG AAC AC       54        334
T04362   PEST47     TGA ACT CTT CCT TTA CTC TC       PEST48     AAA TCC CAT CAA GGT GCT TG       60        318
T13648   PEST45     CGA TGT TCA AAA CAC TTA TG       PEST46     GGA TCC CGA CCA CGG GTG CT       60        368
T14233   PEST19     TGA GAT CTA ATA GTA GCC TGT T    PEST20     CGA GCT CAA GTT GAG TAG CAT A    50        290
T14233   SS108      GGC CCT TTG AGA TCT AAT AGT      SS109      TCT GTT AGC TGG TTT AGG TTG      54        266
T20493   SS90       TCT TCA TGG CAG GGA GTG G        SS91       CCC GCA ACA AAT TCT CTG G        52        258
T20671   SS188      GCG GAG AGC TTA AGC CAA CT       SS189      GGA TCT CGG TTT GAT ACT TGC      56        195
T20808   PEST15     AAG CGT ACT TCT CTG AAA GCA C    PEST16     CAT CAC CGT CTG ACT TCA AGA G    55        224
T21150   SS144      ACA TGG CGT CTC GAA ACT          SS145      TGG AAC CAG GTA CAA GGA          50        190
T21447   PEST21     GGA ATC TCC GGT GTT ATA CCT C    PEST22     TCC TAC AAG GCC CTC CTC TAA T    52        337
T21447   SS92       TCG GAA GAC TTA GGA AAC TGA      SS93       AAT GAG ACG GGG ATA GGA C        45        214
T22090   PEST57     CGA ACA TTG AAT CTG GAT GC       PEST58     CGT TCA GAG AGA CCA AGC CG       60        304
T41629   SS221      GCC CGG AAC AAG TTT AGT GG       SS222      GCT CAG CAA ATC CTT CAC CTC      54        169
T41662   PEST1      TTG TTG CTC AAG TTT GAG AAC      PEST2      CGG TTC ACA TGT ATA TTC AAT G    60        316
T42294   SS102      TTG GGG AGC TGT AAG AAG A        SS157      AAC AGG CAG TTG CTG TA           55 (d)    290
T43968   PEST9      GCC AAC AAC TAG TTG ATA TTG AC   PEST10     GGT TTG AGA CAC CTG AAA GAC C    60        300
T44979   A          TTT TCA TTT CCG TCA TCT          B          CTC TCC GGT GAT TCC T            55        350
T45845   PEST23     GAA CTC TGA AAC ATG CCA AGG C    PEST24     TGT TTG AGA AAA TTA ACC TCG A    60        340
T45996   SS227      CGG TTA CTC CTT CTC CTT CTA      SS228      GAT TCT TCC GTT GTT ACA GC       54        210
T46064   PEST3      TGT AAT CTC CAA TTC CTC AGG      PEST4      TCG AGA TTT GTA GCA TTT GAC      55        299
T46145   SS116      TTT CAT CCC TCA CTC TCC          SS117      AGC TTC CCT GAT GTC ATT          54        300
T46379   SS168      GGC GTC ATA CCG AGT AAT          SS169      AGA TTG TGG AAG CGA GTT AG       50        191
T46721   PEST11     GGG CTG ATA AAC CAA CCA GGC      PEST12     CTC TCC CTT TCA TCT CGT TGC      55        300
T75662   SS170      TCG TCG TCT TCT TGC TTC TC       SS171      CGC AGC TCA CAC CAT ACC          68 (h)    162
T88149   PEST35     CAA CAC TGA TGC TCC AAC AGA AC   PEST36     ACC GGC GTA CTT GTA GTA CAA GT   55        321
Z17798   SS104      ATG GAC TGG CGT GAT G            SS105      GAG AGA CCC TGA GAT TCT GTT      52        303
Z17993   PEST31     GAG ATG GCT CTA TGG ATT GCA TC   PEST32     GGT TAG GCA ATT TCC GGA GAC TTG  52        298
Z18443   PEST7      GGA TGA AGC AAT ATT TCG TCA C    PEST8      GTC TCG TTA ATA TCC GAT GCG      57        321
Z26226   PEST27     GGG GAG ATT CCC GCG GAA ATC GGC  PEST28     CCT TCA TGT TAC CCA TCC ATT C    55        322
Z30800   SS132      CAA TTC TTT CTC AGG CTC TGT      SS133      GCC TCG GGT TAG GTT TC           55 (d)    262
Z30811   SS110      AGC TCG GAA ACA TGA CTA          SS111      GAC ATA CGA GCT GCT GAA          54        193
Z30817   SS82       ACG TGT TGT CCG CTG TCA GA       SS83       CGC CCC GGA GAT TTA TGT C        56        172
Z33873   SS134      AAT AAA AAC AAA ATT GGC ACT      SS135      GAA TCA CAA TGG CAT CAC          53        264
Z34772   SS192      GCT GGT GGA CGT AAT CGC TGA CA   SS193      CTT CCT CCT CTT GCC CAA ATC CC   70 (h)    175
Z46819   SS112      GGG TTC CTG AGT TTC TG           SS113      ACT CTG GTA TCG GAC CTG          54        156

Pto-LIKE CLONES
PK1      PK1F       GCA CCA TAG CTG CAT TCC TC       PK1R       CTG TAC ATC GGA AAC TCT GG       55        202
PK2      PK2F       CCA AAG CTT TCT GAT TTT GG       PK2R       CCA GTG GCA ACG TAT TCA GG       55        120
PK5      PK5F       TTG CAC GAC TAT GCA GAT CC       PK5R       AGT GAG TTG ACC GGT CAT CG       55        196
PK9      PK9F       ATG AAG CTC AAG TCA TAT AC       PK9R       CCT GAG TTC CCA TCA CTT GG       55        351
H36913   H36913F    GAA CTT GAT TGG ATA CTG TG       H36913R    AAA TGT ATT CGG GAT CGA GG       55        342
R64749   R64794F    ATT ACA TGG CTC ATG GTA C        R64794R    GGG TCA AGA TAA CCC GAA AC       55        209

(a) PEST primers are from the J. Jones laboratory and SS primers are from the S. Somerville laboratory.
(b) Standard PCR conditions: 94°C for 30 s, annealing temperature as indicated in the table for 30 s, 72°C for 30 s over 30 cycles.
(c) Touch-down annealing temperature began at 52°C, decreased 0.5°C/cycle for 4 cycles and was followed by 30 cycles at an annealing temperature of 50°C.
(d) Touch-down annealing temperature began at 60°C, decreased 0.5°C/cycle for 10 cycles and was followed by 30 cycles at an annealing temperature of 55°C.
(e) Touch-down annealing temperature began at 57°C, decreased 0.5°C/cycle for 10 cycles and was followed by 30 cycles at an annealing temperature of 52°C.
(f) Touch-down annealing temperature began at 62°C, decreased 0.5°C/cycle for 10 cycles and was followed by 30 cycles at an annealing temperature of 57°C.
(g) A "hot-start" PCR was performed in which the primer labeled with an asterisk was separated from the second primer and the Taq polymerase by a wax layer until the first cycle of PCR.
(h) Two-step PCR conditions: 94°C for 30 s, annealing temperature as indicated in the table for 90 s over 30 cycles.
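The touch-down conditions in footnotes (c) to (f) all follow one pattern, which can be written out as an explicit per-cycle annealing schedule. The following is only an illustrative sketch: the function name is hypothetical, and it assumes that the first touch-down cycle anneals at the starting temperature, which the footnotes do not state.

```python
def touchdown_schedule(start_c, step_c, n_touchdown, final_c, n_final=30):
    """Return the annealing temperature (in °C) for every PCR cycle.

    Assumption (not stated in the footnotes): cycle 1 anneals at start_c
    and each later touch-down cycle is step_c cooler than the one before.
    """
    touchdown = [start_c - i * step_c for i in range(n_touchdown)]
    return touchdown + [final_c] * n_final


# Footnote (d): start at 60 °C, drop 0.5 °C per cycle for 10 cycles,
# then run 30 cycles at 55 °C.
schedule_d = touchdown_schedule(60.0, 0.5, 10, 55.0)
print(schedule_d[:10])  # the ten touch-down cycles
print(len(schedule_d))  # total number of cycles

# Footnote (c): start at 52 °C, 4 touch-down cycles, then 30 cycles at 50 °C.
schedule_c = touchdown_schedule(52.0, 0.5, 4, 50.0)
```

Under this convention the last touch-down cycle sits one step above the final annealing temperature, so the schedule reaches the tabulated temperature without a jump.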

> 1500 kbp in size and extends over > 6.8 cM (Schmidt et al., 1996). Furthermore, some anchoring markers have not been precisely mapped on the full set of F8 (Ler × Col-0) RI lines. In addition to a lack of precision, some map positions may be incorrect. The occurrence of multigene families of highly related members can lead to some ambiguity in the map position. For example, both R30025 and PK1 mapped to two locations. As mentioned above, some YAC clones are chimeric and can give misleading map information. We encourage those interested in a specific R-EST to remap the EST as a PCR-based or RFLP marker on the full set of F8 (Ler × Col-0) RI lines or a mapping population segregating for the resistance gene of interest.

Figure 1 represents the R-EST map locations graphically. These locations are compared with the integrated map of Arabidopsis disease resistance genes (Holub, 1997; Kunkel, 1996). The R-EST and Pto-like loci were not uniformly distributed in the Arabidopsis genome. A disproportionate number occurred on chromosomes 1 (11 of 47) and 5 (20 of 47). Some R-EST loci are clustered in MRC regions (Holub, 1997). The most notable example is the cluster of nine R-EST loci centered on the marker mi2 in the MRC-J region, which contains several viral, bacterial and fungal resistance loci (Holub, 1997; Kunkel, 1996). Additional LRR-containing sequences, which are not represented in dbEST, have been identified in the MRC-J region, confirming that this region is enriched in resistance gene-like sequences (D. Murali, J. McDowell and J. Dangl, personal communication).

A number of R-ESTs and Pto-like sequences mapped near genetically mapped disease resistance genes. R-EST R90150 mapped near RPB1 (resistance to Plasmodiophora brassicae 1) and T42294 and PK5 were placed near CAR1 (cauliflower mosaic virus resistance 1). More precise mapping would be required to determine whether or not these R-ESTs are good candidates for the Columbia alleles of RPB1 and CAR1, respectively. Also, Z30811 and PK1b mapped near RPW2 (recognition of powdery mildew 2), and the co-segregation of these markers with RPW2 is being tested (I. Wilson, personal communication). MRC-J on chromosome 5 consists of at least nine disease resistance loci (RAC3, RPS4, HRT1 and TTR1, and five distinct RPP loci) and nine R-EST loci mapped to this region. Notably, MRC-J constituent RPP22 maps near the marker mi2 as do R-ESTs N38401, R30624, T20808 and T21477. Intriguingly, T20808 shows highest sequence similarity to another RPP gene, RPP5 (Table 1). This represents one of the more promising regions of the genome from which to clone resistance genes via the candidate gene approach.

Discussion

Altogether, 94 R-ESTs and four Pto-like sequences were identified from Arabidopsis. These assembled into 62 non-redundant R-EST TC groups or singletons and four non-redundant Pto-like sequences, which were used to define 47 loci. As new EST and bacterial artificial chromosome (BAC) genomic sequences are added to the database, additional R-ESTs will continue to be discovered. Despite


Table 3. Map positions and YAC clones associated with R-EST and Pto-like sequences.

GENBANK Primers Chr (a) cM (a) CIC YACs (b) yUP YACs (b) Anchor (c)

F13577 SS217/218 4 99.8 3H12, 9H6 um596A F15353 SS219/220 nm nm 5E7, 7B7, 7B8 F15571 SS213/214 1 24.2 (d) F20107 (e) SS211/212 2 47.0 2E12, 5B9, 8Fll, 8F12, 8Hll,

10F1 F20108 (e) SS215/216 2 47.0 5B9, 8Fll, 8F12, 8Hll, 10F1 H36320 SS182/184 nm nm 1H12, 3G7 (cp DNA) H36821 SS166/167 5 13.0

g3786/g3829 g6842

g6842

12Bll, 14A12", 19G12, 21G3, g3837 23B2, 23F6, 24C4

H36913 H36913F/R 5 84.1 5H10, 10H7 m435 H37061 SS164/165 nm nm 21H1 H77224 SS180/181 1 111.8 (d) 12F12, 12A8 nga111 N38401 SS203/204 5 80.8 4B10", 4A3", 8G12, 12H4 mi2 N65692 SS225/226 1 (84.1) 2E9, 12G9 g4121 (mi230) N95848 SS209/210 2 47.0 5B9, 8Fll, 8H12, 10F1 g6842 N96078 SS231/232 4 54.8 5E12, 6G4, 7G8 g4539 N96493 SS233/234 5 24.0 9E10 mi438 N96711 SS207/208 5 7.2 4D2 g3715 N97067 SS199/200 5 14.8 (d) 6F3 KG31 R29891 PEST13/14 1 111.2 6H1, 9C4, 12H9 nga111 R30025 SS150/151 3 65.1 (f,g) 4C2, 9B2, 17D3 m457 R30025 PEST33/34 5 87.6 (h) 9C9, 10B4, 10C9, 10E2, 8E1", m558A

8E2, 2C12, 8E3, 9C7 R30624 SS172/173 5 80.8 17Dll R64794 R64794F/R 5 87.6 8E1", 8E2, 10B4 R64932 (i) SS174/175 nm nm 20H8" R89998 (j) SS176/177 5 28.8 (d) 7F10, 8F10 17E1", 5G5, 7G10 R90150 SS178/179 1 (31.9) 20C3, 14E3 T04109 SS136/137 3 54.0 (f) 11H1, 15B9 T04135 SS130/131 1 6.4 5H2", 9D5, 10B6, 10C5, 10D1,

22C7 T04362 PEST47/48 5 7&0 7G6,7E12 T13648 PEST45/46 nm nm 8C9, 5F3,11F11 T14233 PEST19/20 4 47.0 3B2, 3C1, 12B6, 3D3 T14233 SS108/109 4 47.0 T20493 SS90~1 nm nm T20808 PEST15/16 5 80.8 (f) 2E4, 8G12, 12H2

8D5,4G5 5Dll 8B5,8E4 2B8,5E12,6G4,7G8,12B8, 12B9,12C8 11A4" 9A7,2A12,12D9,12D10

11C8 7C1,7A3*,8C8,6E9

4C7(cpDNA),11E9,11E11 2B3,9E3

1E5,3C11,10G2*,12B3,12H1 3A3,9H3,9B5 5B3,5C3" 4A10 6C6,11H11,11H12 1E8*,1F7,6C4

T21150 (j) SS144/145 5 29.7 T21447 SS92/93 nm nm T21477 PEST21/22 5 (80.8) T22090 (k) PEST57/58 1 18.8 T41629 SS221/222 1 93.8 T41662 (I) PEST1/2 4 54.8

9E4*,12A4,15C5 15A8,20H1*,21H7 3F12,19G5, 3G12, 7E1,13G6, 14G7 5G5, 8F10,17E1* 21H12

3G8,3G9,3H12,8C12,9B4, 9E9 9F8

10C4 4E4

9A7

1E7,12G2,12H2,22E10

1F12*,9D2, 1E1,5A3, 5D10

13B1

T42294 SS102/157 nm nm T43968 PEST9/10 5 74.4 (f)

T45845 PEST23/24 2 73.2 T46064 PEST3/4 5 62.7 T46145 (k) SS116/117 1 18.7 T46379 SS168/169 2 (67.3) T46721 PEST11/12 4 61.3 T75662 SS170/171 5 (20.0) T88149 (i) PEST35/36 4 72.2 Z17798 SS104/105 5 (95.9) Z17993 PEST31/32 5 100.4 (f) Z18443 PEST7/8 5 76.0 Z26226 PEST27/28 5 (20.0) Z30800 SS132/133 nm nm

mi2 m558A

mi138 yUP9H2L(m235) GL1/m249 PAIl

mi423b

g6837 g6837

mi2

mi138

m268 (mi2) ve007 m315 g13683

g4028

g4514 m247 g3786 MRL1 (m323) g3883 I/dSpm113 (mi174) PG11 g4554 (g2368) pCITd110 (g2368) m423b pCITd37 (nga151)


Table 3. Continued

GENBANK      Primers      Chr (a)  cM (a)     CIC YACs (b)  yUP YACs (b)  Anchor (c)

Z30811       SS110/111    3        17.5       5F8, 5H8, 7D3, 12A4    4C12, 6G5, 20B8    nga162
Z30817       SS82/83      3        (25.2)     11E2    g6220 (m105)
Z33873 (m)   SS134/135    5        13.0       12B11, 19G12, 21G3, 23B2, 23F6, 24C4    g3837
Z46819 (m)   SS112/113    5        13.0       14A12*, 19G12, 21G3, 23F6, 24C4    g3837
PK1 (n)      PK1F/R       1        81.2 (o)   6D4    nga128
PK1 (n)      PK1F/R       3        17.5 (p)   4C5    nga162
PK5 (n)      PK5F/R       1        84.1       6H10    mi230
PK9 (n)      PK9F/R       1        39.1       1D2    ve094

(a) Chr and cM indicate the chromosome and estimated map position relative to the RI map (May 12, 1997 version; Anderson, 1996). A map position given in brackets indicates the map position of the nearest mapped marker to the marker anchoring the YAC clones. Occasionally, the marker anchoring the YACs was not placed on the RI map and the map position of a neighboring marker was given in the table. Thus, map positions given in brackets are not as precise as unbracketed map positions. Map positions determined by genetic mapping on the RI lines are given where available. nm, not mapped.
(b) * Indicates a chimeric YAC.
(c) Markers that anchor the YAC clones. Markers in brackets are neighboring markers used to provide an estimate of the map position of the YACs and R-EST when the anchoring marker had not been placed on the RI map. Two markers separated by a slash are two flanking markers determined by genetic mapping.
(d) Denotes positions acquired via mapping the R-EST as a CAPS marker on 90 RI lines.
(e) F20107 and F20108 are derived from the same clone.
(f) Denotes positions acquired via mapping the R-ESTs as RFLP markers on 27-30 RI lines.
(g) Locus R30025a and (h) Locus R30025b.
(i) RPS2
(j) R89998 and T21150 are members of TC9681.
(k) T22090 and T46145 are members of TC9706.
(l) RPP5
(m) Z33873 and Z46819 are derived from the same gene (see text).
(n) PK1, PK5, and PK9 are clone names. The GenBank numbers for the sequences of these clones are U82399 (PK1), U82401 (PK5) and U82402 (PK9).
(o) Locus PK1a and (p) Locus PK1b.

low levels of expression of the RPS2, RPM1 and RPP5 genes, ESTs corresponding to these disease resistance genes were recovered from dbEST. Therefore, disease resistance genes appear to be adequately represented in dbEST. Disease resistance genes that would not be represented in the current EST databases include those for which the Columbia allele is deleted or missing, a phenomenon observed in Arabidopsis accessions susceptible to P. syringae maculicola (avrRpm1) (Grant et al., 1995). We estimated the total number of R-ESTs of the LRR class represented in the Arabidopsis genome at about 95, calculated as (number of loci defined by R-ESTs + number of unmapped R-ESTs)/proportion of Arabidopsis genes represented in dbEST, i.e. (41 + 19)/0.63. This calculation is based on the assumption that disease resistance genes were not under-represented in the database. With the caveat that some genes with an LRR domain are not resistance genes, as noted in the next paragraph, this observation suggests that the LRR class of disease resistance genes is a significant class in Arabidopsis.

Not all genes with an LRR motif are disease resistance genes; thus, not all of the R-ESTs necessarily represent disease resistance genes. Polygalacturonase-inhibiting proteins and LRR-extensins both contain LRR motifs (Jones and Jones, 1996). Although these proteins may contribute to the outcome of plant-microbe interactions, they are considered defense response genes rather than true disease resistance genes (Jones and Jones, 1996). Also, three LRR receptor-like kinases of unknown function have been described in Arabidopsis (e.g. TMK1, TMKL1, RLK5) (Braun and Walker, 1996). Unexpectedly, two developmental genes, ERECTA, which exhibits pleiotropic effects on the growth habit of Arabidopsis (Torii et al., 1996), and CLAVATA1, which affects the apical meristem (Clark et al., 1997), are predicted to contain an extracellular LRR domain and an intracellular Ser/Thr kinase domain, and thus resemble Xa21. Many of these LRR-containing proteins were identified along with R-ESTs in TBLASTN searches and no features that distinguish disease resistance genes, like Xa21, from non-resistance genes have been defined. Although a limited example, it is tempting to speculate that cell-cell communication in developmental processes and plant-pathogen interactions may have evolved from a common ancestral mechanism (Wilson et al., 1997).
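The estimate of about 95 LRR-class R-ESTs quoted earlier in this Discussion is a simple scaling of the observed count by the fraction of genes sampled in dbEST. The short sketch below reproduces that arithmetic; the function name is hypothetical, and the alternative coverage values in the loop are illustrative assumptions only, included to show how sensitive the estimate is to the 63% figure.

```python
def estimate_total_r_ests(mapped_loci, unmapped_r_ests, fraction_in_dbest):
    """Scale the observed R-EST count by the fraction of genes in dbEST."""
    return (mapped_loci + unmapped_r_ests) / fraction_in_dbest


# Values from the text: 41 loci defined by R-ESTs, 19 unmapped R-ESTs,
# and an estimated 63% of Arabidopsis genes represented in dbEST.
estimate = estimate_total_r_ests(41, 19, 0.63)
print(round(estimate))

# Illustrative sensitivity check; 0.50 and 0.75 are NOT from the paper.
for fraction in (0.50, 0.63, 0.75):
    print(fraction, round(estimate_total_r_ests(41, 19, fraction)))
```

Because the observed count is divided by the coverage, any under-representation of resistance genes in dbEST would make the true number larger than this figure.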

No example of an Arabidopsis disease resistance gene with a kinase domain similar in structure to Pto or Xa21 has been cloned to date. By mapping either kinase-like ESTs or additional sequences generated with degenerate primers, it is possible that kinase-containing genes involved in the expression of resistance may be identified. As a number of signaling pathways include one or more kinases, it seems likely that at least some signaling pathways initiated by disease resistance genes will also include kinases (Jones and Jones, 1996). The fact that Pto is a Ser/Thr kinase (Loh and Martin, 1995) and interacts with another Ser/Thr kinase, Pti1 (Zhou et al., 1995), and the fact that Xa21 is a hybrid LRR-Ser/Thr kinase protein lend credence to this idea.

Understanding how plant disease resistance genes have evolved and how genetic variation at these genes is created and maintained are important topics of discussion (Bennetzen and Hulbert, 1992; Bent, 1996; Holub, 1997; Jones and Jones, 1996; Pryor and Ellis, 1993). With the mapping of roughly 27 disease resistance loci and 26 RPP resistance specificities, it is evident that Arabidopsis disease resistance loci tend to occur in clusters (Holub, 1997; Kunkel, 1996). Although the complete genomic sequence of these intervals will definitively address this issue, our preliminary data imply that gene duplication events have contributed to the clustering of some disease resistance genes. Three (H36821, Z33873, Z46819) of four ESTs mapping to the same location near the top of chromosome 5 are most similar to Cf-9 or Xa21, and the fourth, N97067, is most similar to RPM1. Thus, the genes represented by H36821 and Z33873/Z46819 may have arisen relatively recently by gene duplication. However, the gene represented by N97067 is sufficiently different that it may have arisen either via a gene duplication event at a distant time or via some other mechanism. Similarly, R-ESTs N38401, R30624 and T21477 are most similar to Cf-9 or Xa21, while a neighboring R-EST, T20808, which maps to some of the same YAC clones, is most similar to RPP5. If this clustering of similar and dissimilar R-ESTs can be confirmed using functionally defined disease resistance genes, then it would suggest that gene duplication events occur repeatedly over time. It is less likely that these gene clusters arose during a brief, active period of genome rearrangement.

As a general strategy for identifying a specific class of genes, mapping ESTs is a relatively straightforward and inexpensive method compared with map-based cloning or tagging (Agyare et al., 1997). Mapping the ESTs to the CIC library is generally preferable because of the low frequency of chimeric clones in this library (Schmidt et al., 1996). Also, this library is being used extensively by groups developing physical maps of the Arabidopsis genome; therefore, the proportion of mapped CIC YACs will increase over time. However, not all sequences are represented in any one library and a back-up strategy for mapping is required. Finding polymorphisms is more difficult for CAPS than RFLP markers because the PCR-amplified fragments used for generating CAPS are relatively small. However, scoring large numbers of individuals from a mapping population is more convenient with CAPS markers than RFLP markers, and the accuracy and precision of a map position reflect the size of the mapping population. On balance, we favor sequencing the 3' ends of EST clones so that larger fragments can be amplified and screened for CAPS polymorphisms as a back-up strategy.

We hope that the information generated in this report will benefit the Arabidopsis community and will aid in the cloning of additional disease resistance genes. With a full complement of cloned and characterized resistance genes, long-standing questions in plant pathology about the nature and evolution of disease resistance genes can be addressed. Given that two developmental genes also encode LRR-containing proteins, it is possible that at least some of the ESTs identified in this report will be of interest to developmental biologists as well.

Information contained in Tables 1, 2 and 3 and Figure 1 will be made available via the World Wide Web page for the Arabidopsis thaliana database (http://genome-www.stanford.edu/Arabidopsis/). Map positions for disease resistance genes or R-ESTs can be added to the tables and figure by contacting either J. Jones or S. Somerville.

Experimental procedures

PCR amplification

PCR amplification of Arabidopsis genomic or yeast DNA was performed in an MJR PCT-100 thermocycler (MJ Research Inc., Watertown, MA). PCR (25 µl) consisted of 10 mM Tris-Cl (pH 9.0), 50 mM KCl, 0.1% Triton X-100, 2 mM magnesium chloride, 0.1 mM each of the four deoxyribonucleotide triphosphates, 600 nM of each primer (Life Technologies, Gaithersburg, MD), and 0.5 units of Taq polymerase (Promega Corporation, Madison, WI). Sequences of the various primers are listed in Table 2. Arabidopsis DNA (10 ng) or yeast DNA (50 ng) was used as template. Standard cycling conditions were 94°C for 30 sec, a specific annealing temperature for 30 sec, and 72°C for 30 sec over 30 cycles. The specific annealing temperatures are given in Table 2. PCR products (5 µl) were separated on agarose gels (Amresco Corporation, Solon, OH) in 1 × TAE (40 mM Tris-acetate, pH, 1 mM EDTA, pH 8.0) and visualized by staining with ethidium bromide.

YAC DNA preparation

Two YAC libraries were used. The yUP library contains 2300 clones with an average insert size of 250 kbp, representing between five and six genome equivalents (Matallana et al., 1992). The CIC library consists of 1152 clones with an average insert size of 420 kbp, representing about four nuclear genome equivalents (Creusot et al., 1995). Both YAC libraries are available from the Arabidopsis Biological Resource Center (http://aims.cps.msu.edu/aims/). The yUP YAC clones were originally obtained from J. Ecker (University of Pennsylvania, Philadelphia, PA). The yUP YAC clones were grouped from 24 96-well plates into 24 plate pools, 24 column pools and 24 row pools. Plate pools consisted of 96 colonies from each plate. The 12 colonies from a row for eight plates formed a row pool. Column pools included eight colonies from a column for 12 plates. Thus, each unique YAC will give a positive signal in one plate pool, one row pool and one column pool. However, due to the redundancy inherent in the yUP library, multiple positive signals are generated. Therefore, to confirm which YACs gave positive signals, the individual YACs were tested. CIC YAC pools were constructed according to Creusot et al. (1995). DNA was prepared from pools of YAC clones according to Matallana et al. (1992). Extraction of DNA from individual YAC clones was carried out using methods described by Hoffman and Winston (1987).
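The three-way pooling of the yUP plates described above can be sketched as a small decoding routine that turns positive plate, row and column pools into candidate wells. This is an illustration under stated assumptions, not the authors' procedure: the 0-based indices, the pool numbering, and the grouping of consecutive plates into sets of eight (row pools) and twelve (column pools) are all hypothetical, since the text gives only the pool sizes.

```python
from itertools import product

N_PLATES, N_ROWS, N_COLS = 24, 8, 12   # 24 plates of 96 wells (8 x 12)
PLATES_PER_ROW_POOL = 8    # a row pool = one row (12 colonies) from 8 plates
PLATES_PER_COL_POOL = 12   # a column pool = one column (8 colonies) from 12 plates


def row_pool(plate, row):
    """Index (0-23) of the row pool holding this well (assumed numbering)."""
    return (plate // PLATES_PER_ROW_POOL) * N_ROWS + row


def col_pool(plate, col):
    """Index (0-23) of the column pool holding this well (assumed numbering)."""
    return (plate // PLATES_PER_COL_POOL) * N_COLS + col


def candidate_wells(pos_plates, pos_rows, pos_cols):
    """Return every (plate, row, col) consistent with the positive pools."""
    return [
        (plate, row, col)
        for plate, row, col in product(range(N_PLATES), range(N_ROWS), range(N_COLS))
        if plate in pos_plates
        and row_pool(plate, row) in pos_rows
        and col_pool(plate, col) in pos_cols
    ]


# One positive clone gives one signal per pool type and decodes uniquely.
single = candidate_wells({9}, {row_pool(9, 2)}, {col_pool(9, 5)})

# Two positive clones on plates that share pool groups give ambiguous signals.
clones = [(1, 0, 0), (2, 3, 4)]
ambiguous = candidate_wells(
    {p for p, _, _ in clones},
    {row_pool(p, r) for p, r, _ in clones},
    {col_pool(p, c) for p, _, c in clones},
)
print(single, len(ambiguous))
```

The second case illustrates why individual YACs had to be tested: when several clones in a redundant library are positive, the pool signals alone are consistent with more wells than there are true positives.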

Generation and mapping of CAPS markers

Genomic DNA was prepared from Col-0 and Ler and from 90 F8 (Ler × Col-0) RI lines (Lister and Dean, 1993) by the method of Bernatzky and Tanksley (1986). PCR products were generated using the primers and amplification conditions listed in Table 2 and above. The PCR products were digested with the indicated restriction endonuclease at 37°C for 1 h according to the manufacturers' recommendations and then separated on 4% Metaphor agarose (FMC Bioproducts, Rockland, ME) in 1 × TAE. The restriction enzymes used were HinfI (N97067), HpaII (F15571), RsaI (R89998) and TaqI (H77224).
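The logic of a CAPS marker can be sketched in a few lines: the same amplicon from two accessions is digested in silico, and the marker is informative when the fragment patterns differ. The recognition sites below are the standard ones for the four enzymes named above; the two 40-nt amplicon sequences are invented toy examples and are not real Col-0 or Ler sequences, and the function names are hypothetical.

```python
import re

# Recognition site (as a regex) and cut offset from the start of the site.
ENZYMES = {
    "HinfI": ("GA[ACGT]TC", 1),  # G^ANTC
    "HpaII": ("CCGG", 1),        # C^CGG
    "RsaI": ("GTAC", 2),         # GT^AC
    "TaqI": ("TCGA", 1),         # T^CGA
}


def fragment_lengths(seq, enzyme):
    """Fragment sizes (bp) after complete digestion of a linear amplicon."""
    pattern, offset = ENZYMES[enzyme]
    # A lookahead finds every site, including overlapping ones.
    cuts = [m.start() + offset for m in re.finditer(f"(?={pattern})", seq.upper())]
    edges = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(edges, edges[1:])]


def is_caps_polymorphic(seq_a, seq_b, enzyme):
    """True when the two alleles give different fragment patterns."""
    return sorted(fragment_lengths(seq_a, enzyme)) != sorted(fragment_lengths(seq_b, enzyme))


# Toy alleles (hypothetical): one base change destroys an RsaI site.
toy_col = "A" * 20 + "GTAC" + "T" * 16
toy_ler = "A" * 20 + "GTTC" + "T" * 16

print(fragment_lengths(toy_col, "RsaI"), fragment_lengths(toy_ler, "RsaI"))
print(is_caps_polymorphic(toy_col, toy_ler, "RsaI"))
```

Screening a short amplicon with several enzymes in this way also illustrates the point made in the Discussion: small PCR products contain few restriction sites, so polymorphic sites are harder to find than with RFLP probes.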

Generation and mapping of RFLP markers

Genomic DNA from Col-0, Ler and 27-30 highly recombinant RI lines (Lister and Dean, 1993) was restricted with specific enzymes to identify RFLPs. The restriction endonucleases used were Bg~l (R30025, T43968), EcoRI (T04109), EcoRV (T20808) and HindIII (Z17993). DNA (4 µg) was separated on 0.85% SeaKem agarose (FMC Bioproducts, Rockland, ME) in 1 × TAE. Products were then transferred to Hybond N filters (Amersham, Chicago, IL). The filters were UV crosslinked at 1200 MJ and then baked at 80°C for 2 h. Pre-hybridization took place at 65°C for 1 h in 5 × SSPE (1 × SSPE: 180 mM NaCl, 10 mM NaH2PO4, 1 mM EDTA, pH 7.4), 0.5% SDS, 5 × Denhardt's solution (50 × Denhardt's solution: 0.5% Ficoll (Type 400), 0.5% polyvinylpyrrolidone, 0.5% bovine serum albumin), and 0.1 mg ml⁻¹ sheared salmon sperm DNA. Probes were hybridized overnight in the same solution at 65°C, and then the blots were washed once in 1 × SSPE and 0.1% SDS and twice in 0.5 × SSPE and 0.05% SDS at 65°C. Blots were exposed to Kodak X-Omat AR X-Ray film (Eastman Kodak Co., Rochester, NY) for between 1 and 5 days. General methods of DNA manipulation were performed as described by Ausubel et al. (1990).

PCR amplification of Arabidopsis Pto homologs

Three degenerate primers were designed based on motifs conserved between Pto, Fen, APK1 and ATPRIKINA. One of these primers (kinf1) was designed to be specific for this class of protein kinases. For the initial amplifications from genomic DNA of Ler and Col-0 (10 ng per reaction), primer kinf3, 5'-GGNTTNGGNGAYGTNTANAA, corresponding to the amino acid sequence GFGDVYK, and the antisense primer kinr1, 5'-CCRAANSYRTANACRTC, corresponding to the amino acid sequence GFSYVD, were employed at concentrations of 10 and 1 µM, respectively. For the second amplification, 1 µl of the product of the initial reaction was transferred to fresh reaction mix containing primer kinf1, 5'-GGNGCNGCNMRNGGNYT, corresponding to amino acid sequence GAA(R/K)GL, and primer kinr1 at concentrations of 10 and 1 µM, respectively. Both amplifications were performed in a model 480 thermal cycler (Perkin Elmer Cetus, Norwalk, CT) for 40 cycles (94°C for 15 sec, 50°C for 30 sec, 72°C for 1 min) in 40 µl of reagent mix containing 50 mM KCl, 10 mM Tris-HCl, pH 8.3, 0.2 mM each of the four deoxyribonucleotide triphosphates, and 1 unit Taq polymerase. PCR products derived from Ler were treated with T4 DNA polymerase to produce flush ends, separated by agarose gel electrophoresis, and purified following digestion of the gel with agarase (New England Biolabs, Beverly, MA). The manufacturer's protocol was employed. The products were ligated to HincII-cut pUC19 (Yanisch-Perron et al., 1985) that had been treated with calf intestinal alkaline phosphatase, and the ligation mixture was used to transform Escherichia coli strain DH5α (Life Technologies, Gaithersburg, MD). The M13 universal forward and reverse primers (Yanisch-Perron et al., 1985) were used to direct the sequencing reactions, which were performed by the DyeDeoxy terminator cycle sequencing method (Applied Biosystems Division, Foster City, CA) and an ABI model 373A sequenator.
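As a rough aid to reading the degenerate primer sequences above, the following sketch expands the standard IUPAC nucleotide ambiguity codes to count how many distinct oligonucleotides a degenerate primer represents, and to test whether a concrete sequence is covered by it. It is an illustration only, not the procedure used to design the primers; the function names and the example target sequence are hypothetical.

```python
import re

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Primer sequences as printed in the text (second-round forward primer
# and the antisense primer).
FORWARD_NESTED = "GGNGCNGCNMRNGGNYT"
ANTISENSE = "CCRAANSYRTANACRTC"


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    total = 1
    for base in primer.upper():
        total *= len(IUPAC[base])
    return total


def matches(primer, target):
    """True if target is one of the sequences encoded by primer."""
    pattern = "".join("[" + IUPAC[base] + "]" for base in primer.upper())
    return re.fullmatch(pattern, target.upper()) is not None


if __name__ == "__main__":
    for name, seq in [("forward", FORWARD_NESTED), ("antisense", ANTISENSE)]:
        print(name, len(seq), "nt, degeneracy", degeneracy(seq))
    # Hypothetical target encoding G-A-A-R-G-L (last codon incomplete).
    print(matches(FORWARD_NESTED, "GGTGCTGCTAGAGGTCT"))
```

High degeneracy dilutes each individual oligonucleotide in the pool, which is one common reason for using a high primer concentration and a nested second round of amplification.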

Sequencing 3' and 5' ends of R-EST clones of TC assembly groups

Plasmid DNA was isolated using the Promega Wizard Plus Miniprep columns (Promega, Inc., Madison, WI). The 5' sequence was determined to confirm the identity of the clone and then the 3' sequence was determined. Sequencing reactions were prepared using the Dye Terminator sequencing kit supplied by Applied Biosystems Division and run on an ABI Prism 310 Genetic Analyzer sequencer.

Data analysis

Computer-aided sequence similarity searches of the NCBI non-redundant nucleotide and peptide sequence databases were made with BLAST V.1.3.11MP (Altschul et al., 1990) or FASTA programs (Pearson and Lipman, 1988). Default parameters for the TBLASTN searches reported in Table 1 were BLOSUM62 matrix, default S threshold and no filtering. Several overlapping blocks of 150-300 amino acids of the disease resistance proteins were used in TBLASTN searches of dbEST to identify candidate ESTs. In addition, related EST sequences were identified using the search algorithm developed by TIGR, which groups ESTs with at least 40 nt of overlap and 95% sequence similarity in the region of overlap into TC assemblies (Rounsley et al., 1996). Conceptual translations and alignments were performed with the Wisconsin Package Version 8.0 programs MAP, GAP, BESTFIT, PILEUP and PRETTY (Genetics Computer Group, Madison, WI) or with the CONTIG and ALIGNMENT routines of the DNASIS software package (Hitachi Software, Co., San Bruno, CA). The SS series of primers were designed using the OLIGO software program, version 5.0 (National Biosciences, Plymouth, MN).
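The grouping criterion quoted above (at least 40 nt of overlap and 95% similarity within the overlap) can be illustrated with a deliberately simplified sketch. This is not the TIGR assembly algorithm: it assumes same-strand, ungapped suffix-to-prefix overlaps only, and the function names and toy sequences are hypothetical.

```python
def overlaps(a, b, min_len=40, min_identity=0.95):
    """True if a suffix of a matches a prefix of b (ungapped) over at least
    min_len nt with at least min_identity fractional identity."""
    for length in range(min(len(a), len(b)), min_len - 1, -1):
        suffix, prefix = a[-length:], b[:length]
        identical = sum(x == y for x, y in zip(suffix, prefix))
        if identical / length >= min_identity:
            return True
    return False


def cluster(ests, min_len=40, min_identity=0.95):
    """Group EST names into clusters linked by qualifying overlaps,
    using a small union-find structure. ests maps name -> sequence."""
    names = list(ests)
    parent = {n: n for n in names}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path compression
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if (overlaps(ests[a], ests[b], min_len, min_identity)
                    or overlaps(ests[b], ests[a], min_len, min_identity)):
                parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


if __name__ == "__main__":
    shared = ("ACGTTGCA" * 7)[:50]  # hypothetical 50-nt shared region
    toy = {
        "est1": "G" * 30 + shared,
        "est2": shared + "T" * 30,
        "est3": "C" * 60,
    }
    print(cluster(toy))
```

A real assembler would also handle reverse-complement reads, gapped alignments and contained sequences, all of which are omitted here.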

MAPMAKER (V.II for Macintosh computers) analysis was performed using the Kosambi mapping function (Koornneef and Stam, 1992; Lander et al., 1987). A LOD score of 6.00 was used as a minimum criterion for significance when mapping was performed with 90 F8 (Ler × Col-0) RI lines (Lister and Dean, 1993). However, a LOD score of 3.00 was used when mapping with only 27-30 RI lines. RFLP markers in the data set were obtained from the World Wide Web (http://cbil.humgen.upenn.edu/~atgc/genetic-mapping/ListerFeb95.html). The mapping data for those ESTs mapped as either CAPS or RFLP markers were subsequently re-analyzed by Mary Anderson and colleagues (Nottingham Arabidopsis Stock Centre, Nottingham, UK) using a larger data set of reference markers.
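For orientation, the quantities used in two-point linkage analysis with RI lines can be written down in a few lines. The sketch below is a textbook-style calculation and an assumption on our part, not MAPMAKER's internal code: it converts the proportion of recombinant RI lines (R) to a single-meiosis recombination fraction (r) using the standard relationship for lines derived by selfing, r = R / (2 - 2R), applies the Kosambi mapping function, and computes a two-point LOD score against the null hypothesis of no linkage. Function names are hypothetical.

```python
import math


def ri_to_meiotic_r(R):
    """Recombination fraction per meiosis from the proportion of
    recombinant RI lines (selfing-derived lines): r = R / (2 - 2R)."""
    if not 0 <= R < 1:
        raise ValueError("R must be in [0, 1)")
    return R / (2 - 2 * R)


def kosambi_cm(r):
    """Kosambi map distance in cM for recombination fraction r (< 0.5)."""
    if not 0 <= r < 0.5:
        raise ValueError("r must be in [0, 0.5)")
    return 25.0 * math.log((1 + 2 * r) / (1 - 2 * r))


def two_point_lod(recombinant, total):
    """LOD score: log10 likelihood at the observed recombinant proportion
    minus log10 likelihood under no linkage (proportion 0.5)."""
    R = recombinant / total
    if R >= 0.5:
        return 0.0  # no evidence for linkage
    log_lik = 0.0
    if recombinant:
        log_lik += recombinant * math.log10(R)
    if total - recombinant:
        log_lik += (total - recombinant) * math.log10(1 - R)
    return log_lik - total * math.log10(0.5)


if __name__ == "__main__":
    for n in (27, 90):
        print(n, "lines, no recombinants: LOD =", round(two_point_lod(0, n), 2))
    r = ri_to_meiotic_r(0.2)
    print("R = 0.2 -> r =", r, "->", round(kosambi_cm(r), 2), "cM")
```

Under this calculation the maximum attainable LOD grows in proportion to the number of lines scored (about 8.1 for 27 lines and about 27.1 for 90 lines when no recombinants are observed), which is consistent with applying a lower significance threshold to the smaller mapping population.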

Acknowledgments

We wish to express our appreciation to Mary Anderson and Mike Arnold (Nottingham Arabidopsis Stock Centre, Nottingham, UK) for analyzing the mapping data and to Joe Ecker (University of Pennsylvania, Philadelphia, USA) for providing us with the yUP library. We also want to thank David Bouchez (INRA, Versailles, France) and Caroline Dean and Melanie Stammers (John Innes Centre, Norwich, UK) for helping in the analysis of the YACs. John McDowell (University of North Carolina, Chapel Hill, USA) provided information about the map positions of T20808 and Z17993, and Shunyuan Xiao (University of East Anglia, Norwich, UK) provided information about the location of T04109. We appreciate their helpfulness. In addition, we thank Jane Parker for her helpful advice and Jim Beynon (Wye College, Ashford, UK) for useful discussions and information about R-ESTs. We appreciate the help of Dave Flanders (Stanford University, Stanford, USA) in making Tables 1-3 and Figure 1 available via the Arabidopsis thaliana database. This work was supported in part by funds provided to Shauna Somerville by the Carnegie Institution of Washington, the National Science Foundation and the US Department of Energy. The Sainsbury Laboratory is supported by the Gatsby Charitable Foundation and Miguel Botella was supported by an EU BIOTECH postdoctoral fellowship. This is Carnegie Institution of Washington publication no. 1339.

References

Agyare, F.D., Lashkari, D.A., Lagos, A., Namath, A.F., Lagos, G., Davis, R.W. and Lemieux, B. (1997) Mapping expressed sequence tag sites on yeast artificial chromosome clones of Arabidopsis thaliana DNA. Genome Res. 7, 1-9.

Altschul, S.F., Gish, W., Miller, W., Myers, E.W. and Lipman, D.J. (1990) Basic local alignment search tool. J. Mol. Biol. 215, 403-410.

Anderson, M. (1996) The latest RI map using the Lister and Dean RI lines. Weeds World, 3, 27-42.

Ausubel, F.M., Brent, R., Kingston, R.E., Moore, D.D., Seidman, J.G., Smith, J.A. and Struhl, K. (eds) (1990) Current Protocols in Molecular Biology. New York: Wiley Interscience.

Bennetzen, J.L. and Hulbert, S.H. (1992) Organization, instability, and evolution of plant disease resistance genes. Plant Mol. Biol. 20, 575-578.

Bent, A.F. (1996) Plant disease resistance genes: function meets structure. Plant Cell, 8, 1757-1771.

Bent, A.F., Kunkel, B.N., Dahlbeck, D., Brown, K.L., Schmidt, R., Giraudat, J., Leung, J. and Staskawicz, B.J. (1994) RPS2 of Arabidopsis thaliana: a leucine-rich repeat class of plant disease resistance genes. Science, 265, 1856-1860.

Bernatzky, R. and Tanksley, S.D. (1986) Genetics of actin-related sequences of tomato. Theor. Appl. Genet. 72, 314-321.

Braun, D.M. and Walker, J.C. (1996) Plant transmembrane receptors: new pieces in the signaling puzzle. Trends Biochem. Sci. 21, 70-73.

Cai, D., Kleine, M., Kifle, S. et al. (1997) Positional cloning of a gene for nematode resistance in sugar beet. Science, 275, 832-834.

Clark, S.E., Williams, R.W. and Meyerowitz, E.M. (1997) The CLAVATA1 gene encodes a putative receptor kinase that controls shoot and floral meristem size in Arabidopsis. Cell, 89, 575-585.

Cooke, R., Raynal, M., Laudie, M. et al. (1996) Further progress towards a catalogue of all Arabidopsis genes: analysis of a set of 5000 non-redundant ESTs. Plant J. 9, 101-124.

Creusot, F., Fouilloux, E., Dron, M. et al. (1995) The CIC library: a large insert YAC library for genome mapping in Arabidopsis thaliana. Plant J. 8, 763-770.

Crute, I., Beynon, J., Dangl, J., Holub, E., Mauch-Mani, B., Slusarenko, A., Staskawicz, B. and Ausubel, F. (1994) Microbial pathogenesis of Arabidopsis. In Arabidopsis (Meyerowitz, E.M. and Somerville, C.R., eds). Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press, pp. 705-747.

Dixon, M.S., Jones, D.A., Keddie, J.S., Thomas, C.M., Harrison, K. and Jones, J.D.G. (1996) The tomato Cf-2 disease resistance locus comprises two functional genes encoding leucine-rich repeat proteins. Cell, 84, 451-459.

Ellis, J.G., Lawrence, G.J., Finnegan, E.J. and Anderson, P.A. (1995) Contrasting complexity of two rust resistance loci in flax. Proc. Natl Acad. Sci. USA, 92, 4185-4188.

Feuillet, C., Schachermayr, G. and Keller, B. (1997) Molecular cloning of a new receptor-like kinase gene encoded at the Lr10 disease resistance locus of wheat. Plant J. 11, 45-52.

Grant, M.R., Godiard, L., Straube, E., Ashfield, T., Lewald, J., Sattler, A., Innes, R.W. and Dangl, J.L. (1995) Structure of the Arabidopsis RPM1 gene enabling dual specificity disease resistance. Science, 269, 843-846.

Hirayama, T. and Oka, A. (1992) Novel protein kinase of Arabidopsis thaliana (APK1) that phosphorylates tyrosine, serine and threonine. Plant Mol. Biol. 20, 653-662.

Hoffman, C.S. and Winston, F. (1987) A ten-minute DNA preparation from yeast efficiently releases autonomous plasmids for transformation of Escherichia coli. Gene, 57, 267-272.

Holub, E.B. (1997) Organization of resistance genes in Arabidopsis. In The Gene-for-Gene Relationship in Plant-Parasite Interactions (Crute, I., Holub, E. and Burdon, J., eds). Wallingford, UK: CAB International, pp. 5-26.

Hulbert, S.H. and Michelmore, R.W. (1985) Linkage analysis of genes for resistance to downy mildew (Bremia lactucae) in lettuce (Lactuca sativa). Theor. Appl. Genet. 70, 520-528.

Hwang, I., Kohchi, T., Hauge, B.M. et al. (1991) Identification and map position of YAC clones comprising one-third of the Arabidopsis genome. Plant J. 4, 403-410.

Johal, G. and Briggs, S.P. (1992) Reductase activity encoded by the HM1 disease resistance gene in maize. Science, 258, 985-987.

Jones, D.A. and Jones, J.D.G. (1996) The roles of leucine-rich repeat proteins in plant defenses. Adv. Plant Pathol. 24, 90-167.

Jones, D.A., Thomas, C.M., Hammond-Kosack, K.E., Balint-Kurti, P.J. and Jones, J.D.G. (1994) Isolation of the tomato Cf-9 gene for resistance to Cladosporium fulvum by transposon tagging. Science, 266, 789-793.

Jorgensen, J.H. (1994) Genetics of powdery mildew resistance in barley. Crit. Rev. Plant Sci. 13, 97-119.

Kanazin, V., Marek, L.F. and Shoemaker, R.C. (1996) Resistance gene analogs are conserved and clustered in soybean. Proc. Natl Acad. Sci. USA, 93, 11746-11750.

Koornneef, M. and Stam, P. (1992) Genetic analysis. In Methods in Arabidopsis Research (Koncz, C., Chua, N.-H. and Schell, J., eds). Singapore: World Scientific Publishing, pp. 83-99.

Kunkel, B.N. (1996) A useful weed put to work: genetic analysis of disease resistance in Arabidopsis thaliana. Trends Genet. 12, 63-69.

Lander, E., Green, P., Abrahamson, J., Barlow, A., Daley, M., Lincoln, S. and Newburg, L. (1987) MAPMAKER: an interactive computer package for constructing primary genetic linkage maps of experimental and natural populations. Genomics, 1, 174-181.

Lawrence, G.J., Finnegan, E.J., Ayliffe, M.A. and Ellis, J.G. (1995) The L6 gene for flax rust resistance is related to the Arabidopsis bacterial resistance gene RPS2 and the tobacco viral resistance gene N. Plant Cell, 7, 1195-1206.

Leister, D., Ballvora, A., Salamini, F. and Gebhardt, C. (1996) A PCR-based approach for isolating pathogen resistance genes from potato with potential for wide application in plants. Nature Genet. 14, 421-429.

Lister, C. and Dean, C. (1993) Recombinant inbred lines for mapping RFLP and phenotypic markers in Arabidopsis thaliana. Plant J. 4, 745-750.

Loh, Y.-T. and Martin, G.B. (1995) The Pto bacterial resistance gene and the Fen insecticide sensitivity gene encode functional protein kinases with serine/threonine specificity. Plant Physiol. 108, 1735-1739.

Martin, G.B., Brommonschenkel, S.H., Chunwongse, J., Frary, A., Ganal, M.W., Spivey, R., Wu, T., Earle, E.D. and Tanksley, S.D. (1993) Map-based cloning of a protein kinase gene conferring disease resistance in tomato. Science, 262, 1432-1436.

Matallana, E., Bell, C.J., Dunn, R.J., Lu, M. and Ecker, J.R. (1992) Genetic and physical linkage of the Arabidopsis genome: methods for anchoring yeast artificial chromosome. In Methods in Arabidopsis Research (Koncz, C., Chua, N.-H. and Schell, J., eds). Singapore: World Scientific Publishing, pp. 144-169.

Mindrinos, M., Katagiri, F., Yu, G.-L. and Ausubel, F.M. (1994) The A. thaliana disease resistance gene RPS2 encodes a protein containing a nucleotide-binding site and leucine-rich repeats. Cell, 78, 1089-1099.

Moran, T.V. and Walker, J.C. (1993) Molecular cloning of two novel protein kinase genes from Arabidopsis thaliana. Biochim. Biophys. Acta, 1216, 9-14.

Newman, T., de Bruijn, F.J., Green, P. et al. (1994) Genes galore: a summary of methods for accessing results from large-scale partial sequencing of anonymous Arabidopsis cDNA clones. Plant Physiol. 106, 1241-1255.

Parker, J.E., Coleman, M.J., Szabo, V., Frost, L.N., Schmidt, R., Biezen, E.V.D., Moores, T., Dean, C., Daniels, M.J. and Jones, J.D.G. (1997) The Arabidopsis downy mildew resistance gene RPP5 shares similarity to the Toll and Interleukin-1 receptors with N and L6. Plant Cell, 9, 879-894.

Pearson, W.R. and Lipman, D.J. (1988) Improved tools for biological sequence comparison. Proc. Natl Acad. Sci. USA, 85, 2444-2448.

Pryor, T. and Ellis, J. (1993) The genetic complexity of fungal resistance genes in plants. Adv. Plant Pathol. 10, 281-305.

Richter, T.E., Pryor, T.J., Bennetzen, J.L. and Hulbert, S.H. (1995) New rust resistance specificities associated with recombination in the Rp1 complex in maize. Genetics, 141, 373-381.

Rothnie, H.M. (1996) Plant mRNA 3'-end formation. Plant Mol. Biol. 32, 43-61.

Rounsley, S.D., Glodek, A., Sutton, G., Adams, M.D., Somerville, C.R., Venter, J.C. and Kerlavage, A.R. (1996) The construction of Arabidopsis expressed sequence tag assemblies. Plant Physiol. 112, 1177-1183.

Schmidt, R., West, J., Cnops, G., Love, K., Balestrazzi, A. and Dean, C. (1996) Detailed description of four YAC contigs representing 17 Mbp of chromosome 4 of Arabidopsis thaliana ecotype Columbia. Plant J. 9, 755-765.

Somerville, S.C. and Somerville, C.R. (1996) Arabidopsis at 7: still growing like a weed. Plant Cell, 8, 1917-1933.

Song, W.-Y., Wang, G.-L., Chen, L.-L. et al. (1995) A receptor kinase-like protein encoded by the rice disease resistance gene, Xa21. Science, 270, 1804-1806.

Torii, K.U., Mitsukawa, N., Oosumi, T., Matsuura, Y., Yokoyama, R., Whittier, R.F. and Komeda, Y. (1996) The Arabidopsis ERECTA gene encodes a putative receptor protein kinase with extracellular leucine-rich repeats. Plant Cell, 8, 735-746.

Whitham, S., Dinesh-Kumar, S.P., Choi, D., Hehl, R., Corr, C. and Baker, B. (1994) The product of the tobacco mosaic virus resistance gene N: similarity to Toll and the interleukin-1 receptor. Cell, 78, 1101-1115.

Wilson, I., Vogel, J. and Somerville, S. (1997) Signalling pathways: a common theme in plants and animals? Curr. Biol. 7, R175-R178.

Yanisch-Perron, C., Vieira, J. and Messing, J. (1985) Improved M13 phage cloning vectors and host strains: nucleotide sequences of the M13mp18 and pUC19 vectors. Gene, 33, 103-119.

Yu, Y.G., Buss, G.R. and Saghai Maroof, M.A. (1996) Isolation of a superfamily of candidate disease-resistance genes in soybean based on a conserved nucleotide-binding site. Proc. Natl Acad. Sci. USA, 93, 11751-11756.

Zachgo, E.A., Wang, M.L., Dewdney, J., Bouchez, D., Camilleri, C., Belmonte, S., Huang, L., Dolan, M. and Goodman, H.M. (1996) A physical map of chromosome 2 of Arabidopsis thaliana. Genome Res. 6, 19-25.

Zhou, J., Loh, Y.-T., Bressan, R.A. and Martin, G.B. (1995) The tomato gene Pti1 encodes a serine/threonine kinase that is phosphorylated by Pto and is involved in the hypersensitive response. Cell, 83, 925-935.

GenBank accession numbers: U82399 (PK1), U82401 (PK5), U82402 (PK9)

Note added in proof

We have learned that CIC YACs 8E2, 9C9 and 10B4 are chimeric. Based on this new information, we believe that R30025 is located only on chromosome 3. The chromosome 5 positions listed for R30025 and R64794 in Table 3 and in Figure 1 are likely incorrect. We thank J. Turner and S. Xiao for bringing this information to our attention.