local dna hypomethylation activates genes in rice endosperm · local dna hypomethylation activates...

6
Local DNA hypomethylation activates genes in rice endosperm Assaf Zemach 1 , M. Yvonne Kim 1 , Pedro Silva, Jessica A. Rodrigues, Bradley Dotson, Matthew D. Brooks, and Daniel Zilberman 2 Department of Plant and Microbial Biology, University of California, Berkeley, CA 94720 Edited* by Robert L. Fischer, University of California, Berkeley, CA, and approved September 16, 2010 (received for review July 12, 2010) Cytosine methylation silences transposable elements in plants, vertebrates, and fungi but also regulates gene expression. Plant methylation is catalyzed by three families of enzymes, each with a preferred sequence context: CG, CHG (H = A, C, or T), and CHH, with CHH methylation targeted by the RNAi pathway. Arabidopsis thali- ana endosperm, a placenta-like tissue that nourishes the embryo, is globally hypomethylated in the CG context while retaining high non- CG methylation. Global methylation dynamics in seeds of cereal crops that provide the bulk of human nutrition remain unknown. Here, we show that rice endosperm DNA is hypomethylated in all sequence contexts. Non-CG methylation is reduced evenly across the genome, whereas CG hypomethylation is localized. CHH methyla- tion of small transposable elements is increased in embryos, suggest- ing that endosperm demethylation enhances transposon silencing. Genes preferentially expressed in endosperm, including those cod- ing for major storage proteins and starch synthesizing enzymes, are frequently hypomethylated in endosperm, indicating that DNA methylation is a crucial regulator of rice endosperm biogenesis. Our data show that genome-wide reshaping of seed DNA methyla- tion is conserved among angiosperms and has a profound effect on gene expression in cereal crops. DNA methylation | embryo | transposable element R oughly 150 million y ago, owering plants diverged to form the two dominant extant lineages, monocots and dicots (1). Ara- bidopsis thaliana, the preeminent plant genetic system, is a dicot, whereas cereal crops, such as rice, wheat, and maize, that feed much of the world are monocots. In both plant groups, pollen grains contain two sperm nuclei, one of which fertilizes a diploid central cell to give rise to triploid endosperm (2). A. thaliana en- dosperm is consumed by the developing embryo, whereas cereal endosperm persists and makes up the bulk of the mature seeda developmental difference of particular practical importance (3). Developing seeds are genetic battlegrounds on multiple fronts: parents are proposed to be in conict over resource allocation (2), whereas the embryo must repress parasitic transposable elements (TEs) to prevent damage to the genome. A. thaliana endosperm DNA methylation participates in both conicts (2, 46). Plant TEs are preferentially methylated through a small RNA pathway, which recruits the DOMAINS REAR- RANGED METHYLASES (DRM) methyltransferases that cat- alyze methylation of cytosines in all sequence contexts (6). This methylation is generally referred to as CHH (H = A, C, or T) to differentiate it from CG methylation, catalyzed by the METH- YLTRANSFERASE 1 family, and CHG methylation, catalyzed by the CHROMOMETHYLASE family (6). CHG and CHH meth- ylation is predominantly found in TEs, whereas CG methylation is abundant in both TEs and genes (68). Plants also posses DNA glycosylase enzymes capable of removing 5-methylcytosine (6). One of these enzymes, DEMETER (DME), is expressed in the A. thaliana central cell before fertilization, leading to extensive hypomethylation of the maternal genome (2, 4, 5). This methyla- tion difference between the maternal and paternal genomes in endosperma form of genomic imprintingcauses differential expression of a number of genes, depending on parent of origin (2, 4). Imprinted expression is generally explained in terms of conict between parents over resource allocation, with male-expressed genes maximizing resource extraction from the female and female- expressed genes counteracting this drive (2). Most of our knowledge about DNA methylation in plant seeds is derived from A. thaliana. Processes involving genetic conict tend to evolve rapidly (9), and therefore, methylation dynamics in ce- real seeds may be quite different. Here, we use deep bisulte se- quencing to examine DNA methylation in rice seeds. Wild-type rice endosperm methylation patternsglobally reduced non-CG methylation and local CG hypomethylationresemble those of DME-decient A. thaliana endosperm, a nding consistent with lack of DME in monocots. Reduced endosperm methylation is common in genes with preferential endosperm expression, in- dicating that demethylation is a major mechanism for gene acti- vation in rice endosperm. Short TEs are hypermethylated at CHH sites in embryo, suggesting that endosperm demethylation func- tions to immunize the embryo against TEs through small RNAs. Results Global Genomic Methylation Patterns Are Similar Across Rice Tissues. To learn how cytosine methylation regulates cereal seed genomes, we quantied methylation in rice embryos, endosperm, and seedling shoots and roots by sequencing bisulte-converted ge- nomic DNA (bisulte treatment converts unmethylated cytosine to uracil) to 11- to 15-fold coverage of the nuclear genome (Table S1). The aggregate methylation patterns in all tissues are very similar to those of mature rice leaves (8) as well as those of A. thaliana (6)CG methylation is common in gene bodies, except near the transcription start and termination sites, whereas TEs are methylated in all sequence contexts (Fig. 1). Overall CG methyl- ation patterns and levels are virtually indistinguishable between embryos, shoots, roots, and leaves (Fig. 1 A and B). CHG meth- ylation increases modestly with age of the tissue: lowest in em- bryos, higher in young shoots and roots, and highest in mature leaves (Fig. 1 C and D), consistent with reports of increased methylation in older tissues of maize and petunia (10, 11). CHH methylation is also higher in leaves than in seedling tissues (Fig. 1 E and F). Short TEs Are Hypermethylated at CHH Sites in Rice Embryo. Average CHH methylation of embryo TEs is higher than in seedlings near the points of alignment but indistinguishable past 1 kb into the element (Fig. 1F), a pattern caused by differential methylation of short and long TEs (Fig. 2A and Fig. S1). TEs longer than 1 kb show the same methylation levels in embryos and seedlings, whereas shorter elements [i.e., miniature inverted-repeat trans- posable elements (MITEs) and short interspersed nuclear ele- Author contributions: A.Z. and D.Z. designed research; A.Z., M.Y.K., B.D., and M.D.B. performed research; A.Z., P.S., J.A.R., and D.Z. analyzed data; and A.Z. and D.Z. wrote the paper. The authors declare no conict of interest. *This Direct Submission article had a prearranged editor. Freely available online through the PNAS open access option. Data deposition: The data reported in this paper have been deposited in the Gene Ex- pression Omnibus (GEO) database, www.ncbi.nlm.nih.gov/geo (accession no. GSE22591). 1 A.Z. and M.Y.K. contributed equally to this work. 2 To whom correspondence should be addressed. E-mail: [email protected]. This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10. 1073/pnas.1009695107/-/DCSupplemental. www.pnas.org/cgi/doi/10.1073/pnas.1009695107 PNAS | October 26, 2010 | vol. 107 | no. 43 | 1872918734 PLANT BIOLOGY Downloaded by guest on May 18, 2020

Upload: others

Post on 19-May-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Local DNA hypomethylation activates genes in rice endosperm · Local DNA hypomethylation activates genes in rice endosperm Assaf Zemach1, M. Yvonne Kim1, Pedro Silva, Jessica A. Rodrigues,

Local DNA hypomethylation activates genes inrice endospermAssaf Zemach1, M. Yvonne Kim1, Pedro Silva, Jessica A. Rodrigues, Bradley Dotson, Matthew D. Brooks,and Daniel Zilberman2

Department of Plant and Microbial Biology, University of California, Berkeley, CA 94720

Edited* by Robert L. Fischer, University of California, Berkeley, CA, and approved September 16, 2010 (received for review July 12, 2010)

Cytosine methylation silences transposable elements in plants,vertebrates, and fungi but also regulates gene expression. Plantmethylation is catalyzed by three families of enzymes, each witha preferred sequence context: CG, CHG (H=A, C, or T), and CHH,withCHH methylation targeted by the RNAi pathway. Arabidopsis thali-ana endosperm, a placenta-like tissue that nourishes the embryo, isgloballyhypomethylated in theCGcontextwhile retaininghighnon-CG methylation. Global methylation dynamics in seeds of cerealcrops that provide the bulk of human nutrition remain unknown.Here, we show that rice endosperm DNA is hypomethylated in allsequence contexts.Non-CGmethylation is reducedevenly across thegenome, whereas CG hypomethylation is localized. CHH methyla-tionof small transposableelements is increased inembryos, suggest-ing that endosperm demethylation enhances transposon silencing.Genes preferentially expressed in endosperm, including those cod-ing for major storage proteins and starch synthesizing enzymes, arefrequently hypomethylated in endosperm, indicating that DNAmethylation is a crucial regulator of rice endosperm biogenesis.Our data show that genome-wide reshaping of seed DNA methyla-tion is conserved among angiosperms and has a profound effect ongene expression in cereal crops.

DNA methylation | embryo | transposable element

Roughly 150 million y ago, flowering plants diverged to form thetwo dominant extant lineages, monocots and dicots (1). Ara-

bidopsis thaliana, the preeminent plant genetic system, is a dicot,whereas cereal crops, such as rice, wheat, and maize, that feedmuch of the world are monocots. In both plant groups, pollengrains contain two sperm nuclei, one of which fertilizes a diploidcentral cell to give rise to triploid endosperm (2). A. thaliana en-dosperm is consumed by the developing embryo, whereas cerealendosperm persists and makes up the bulk of the mature seed—a developmental difference of particular practical importance (3).Developing seeds are genetic battlegrounds on multiple fronts:parents are proposed to be in conflict over resource allocation (2),whereas the embryo must repress parasitic transposable elements(TEs) to prevent damage to the genome.A. thaliana endosperm DNA methylation participates in both

conflicts (2, 4–6). Plant TEs are preferentially methylated througha small RNA pathway, which recruits the DOMAINS REAR-RANGED METHYLASES (DRM) methyltransferases that cat-alyze methylation of cytosines in all sequence contexts (6). Thismethylation is generally referred to as CHH (H = A, C, or T) todifferentiate it from CG methylation, catalyzed by the METH-YLTRANSFERASE 1 family, andCHGmethylation, catalyzed bythe CHROMOMETHYLASE family (6). CHG and CHH meth-ylation is predominantly found in TEs, whereas CGmethylation isabundant in both TEs and genes (6–8). Plants also posses DNAglycosylase enzymes capable of removing 5-methylcytosine (6).One of these enzymes, DEMETER (DME), is expressed in the A.thaliana central cell before fertilization, leading to extensivehypomethylation of the maternal genome (2, 4, 5). This methyla-tion difference between the maternal and paternal genomes inendosperm—a form of genomic imprinting—causes differentialexpression of a number of genes, depending on parent of origin (2,4). Imprinted expression is generally explained in terms of conflictbetween parents over resource allocation, with male-expressed

genes maximizing resource extraction from the female and female-expressed genes counteracting this drive (2).Most of our knowledge aboutDNAmethylation in plant seeds is

derived from A. thaliana. Processes involving genetic conflict tendto evolve rapidly (9), and therefore, methylation dynamics in ce-real seeds may be quite different. Here, we use deep bisulfite se-quencing to examine DNA methylation in rice seeds. Wild-typerice endosperm methylation patterns—globally reduced non-CGmethylation and local CG hypomethylation—resemble those ofDME-deficient A. thaliana endosperm, a finding consistent withlack of DME in monocots. Reduced endosperm methylation iscommon in genes with preferential endosperm expression, in-dicating that demethylation is a major mechanism for gene acti-vation in rice endosperm. Short TEs are hypermethylated at CHHsites in embryo, suggesting that endosperm demethylation func-tions to immunize the embryo against TEs through small RNAs.

ResultsGlobal Genomic Methylation Patterns Are Similar Across Rice Tissues.To learn how cytosine methylation regulates cereal seed genomes,we quantified methylation in rice embryos, endosperm, andseedling shoots and roots by sequencing bisulfite-converted ge-nomic DNA (bisulfite treatment converts unmethylated cytosineto uracil) to 11- to 15-fold coverage of the nuclear genome (TableS1). The aggregate methylation patterns in all tissues are verysimilar to those of mature rice leaves (8) as well as those of A.thaliana (6)—CG methylation is common in gene bodies, exceptnear the transcription start and termination sites, whereas TEs aremethylated in all sequence contexts (Fig. 1). Overall CG methyl-ation patterns and levels are virtually indistinguishable betweenembryos, shoots, roots, and leaves (Fig. 1 A and B). CHG meth-ylation increases modestly with age of the tissue: lowest in em-bryos, higher in young shoots and roots, and highest in matureleaves (Fig. 1 C and D), consistent with reports of increasedmethylation in older tissues of maize and petunia (10, 11). CHHmethylation is also higher in leaves than in seedling tissues (Fig. 1E and F).

Short TEs Are Hypermethylated at CHH Sites in Rice Embryo.AverageCHH methylation of embryo TEs is higher than in seedlings nearthe points of alignment but indistinguishable past 1 kb into theelement (Fig. 1F), a pattern caused by differential methylation ofshort and long TEs (Fig. 2A and Fig. S1). TEs longer than 1 kbshow the same methylation levels in embryos and seedlings,whereas shorter elements [i.e., miniature inverted-repeat trans-posable elements (MITEs) and short interspersed nuclear ele-

Author contributions: A.Z. and D.Z. designed research; A.Z., M.Y.K., B.D., and M.D.B.performed research; A.Z., P.S., J.A.R., and D.Z. analyzed data; and A.Z. and D.Z. wrotethe paper.

The authors declare no conflict of interest.

*This Direct Submission article had a prearranged editor.

Freely available online through the PNAS open access option.

Data deposition: The data reported in this paper have been deposited in the Gene Ex-pression Omnibus (GEO) database, www.ncbi.nlm.nih.gov/geo (accession no. GSE22591).1A.Z. and M.Y.K. contributed equally to this work.2To whom correspondence should be addressed. E-mail: [email protected].

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1009695107/-/DCSupplemental.

www.pnas.org/cgi/doi/10.1073/pnas.1009695107 PNAS | October 26, 2010 | vol. 107 | no. 43 | 18729–18734

PLANTBIOLO

GY

Dow

nloa

ded

by g

uest

on

May

18,

202

0

Page 2: Local DNA hypomethylation activates genes in rice endosperm · Local DNA hypomethylation activates genes in rice endosperm Assaf Zemach1, M. Yvonne Kim1, Pedro Silva, Jessica A. Rodrigues,

kb

Genes .7

0

n

o

i t a

l y

h

t e

m

G

C

Embryo

Endosperm

Seedling shoot A

-3 1 0 -1 3 2 -3 1 0 -1 3 2 4

Seedling root

-2 -2

Leaf

kb

Genes .36

0

n

o

i t a

l y

h

t e

m

G

H

C

Embryo

Endosperm Seedling shoot

C

-3 1 0 -1 3 2 -3 1 0 -1 3 2 4

Seedling root

-2 -2

Leaf

kb

Genes .13

0

n

o

i t a

l y

h

t e

m

H

H

C

Embryo

Endosperm Seedling shoot

E

-3 1 0 -1 3 2 -3 1 0 -1 3 2 4

Seedling root

-2 -2

Leaf

kb

TEs.9

0

noit

aly

hte

mG

C

Embryo

Endosperm

Seedling shootB

-3 10-1 32 -31 0 -13 24

Seedling root

-2-2

Leaf

kb

TEs.7

0

noit

aly

hte

mG

HC

Embryo

Endosperm

Seedling shootD

-3 10-1 32 -31 0 -13 24

Seedling root

-2-2

Leaf

kb

TEs.14

0

noit

aly

hte

mH

HC

Embryo

Endosperm Seedling shoot

F

-3 10-1 32 -31 0 -13 24

Seedling root

-2-2

Leaf

Fig. 1. Patterns of DNAmethylation in rice tissues. Rice genes (A, C, and E) or TEs (B,D, and F) were aligned at the 5′ end (Left) or the 3′ end (Right), and averagemethylation levels for each 100-bp interval are plotted. The dashed line represents the point of alignment.

A 1

0 St En Em Rt

n

o

i t a

l y

h

t e

m

H

H

C

MITEs

St En Em Rt St En Em Rt

kb

.28

0

yc

ne

uq

erfE

TIM

C

-3 10-1 32 -31 0 -13 24 -2-2kb

.11

0

noit

aly

hte

mH

HC

to

oh

S

D

-3 10-1 32 -31 0 -13 24 -2-2

kb

.25

0

yc

ne

uq

erfE

T

MITEs

TE > 1000 bp

B

-3 10-1 32 -31 0 -13 24 -2-2Tissue

St En Em Rt

SINEs LTRs TE > 1000 bp

LTRsn = 130,849 n = 11,343 n = 106,225 n = 35,193

5th (highest)4th3d2nd1st

5th (highest)4th3d2nd1st

Fig. 2. MITEs are thepredominant target of CHHmethylation. (A) Box plots showingmethylation levels of different TE classes in rice embryo (Em), shoot (St), root(Rt), and endosperm (En). Each box encloses themiddle 50% of the distribution, with the horizontal linemarking themedian. The lines extending from each boxmark theminimumandmaximumvalues that fallwithin1.5 times theheightof thebox.MITEmean length=189bp;maximum length=500bp. SINEmean length=141 bp; maximum length = 487 bp. LTRmean length = 855 bp; maximum length = 11 kb. (B–D) Rice genes were aligned as in Fig. 1, and TE frequency (B and C) oraverage methylation levels (D) for each 100-bp interval are plotted. In C and D, genes were grouped into quintiles by transcription.

18730 | www.pnas.org/cgi/doi/10.1073/pnas.1009695107 Zemach et al.

Dow

nloa

ded

by g

uest

on

May

18,

202

0

Page 3: Local DNA hypomethylation activates genes in rice endosperm · Local DNA hypomethylation activates genes in rice endosperm Assaf Zemach1, M. Yvonne Kim1, Pedro Silva, Jessica A. Rodrigues,

ments (SINEs)] are hypermethylated in embryos (Fig. 2A and Fig.S1). The abundance of CHH methylation in short TEs led us toexamine whether their genomic distribution accounts for the spikein CHH methylation upstream of genes (Fig. 1E). MITEs, the

most abundant short elements in rice (some of which are active)(12), preferentially occur near genes (13). MITE distribution in-deed closely parallels that of CHH methylation (Fig. 2B). MITEfrequency 5′ and 3′ of genes is directly correlated with gene tran-scription, whereas MITE frequency within genes is inversely cor-related with transcription (Fig. 2C). CHH methylation showsa similar distribution (Fig. 2D), a pattern quite different from CGor CHG methylation (Fig. S2). Thus the distribution of CHHmethylation closely follows that of MITEs.

Global Hypomethylation of Non-CG Sites and Local Hypomethylationof CG Sites in Rice Endosperm.DNAmethylation in rice endospermis lower in all sequence contexts compared with all other tissuesthat we examined—CG methylation is about 93% of that of em-bryos, whereas CHG and CHH methylation is lower by abouttwofold and fivefold, respectively (Figs. 1 and 2A, Fig. S3, andTable S1). The decrease in CG methylation affects gene bodies,gene-adjacent regions, and TEs (Fig. 1), which may reflect A.thaliana-like even hypomethylation of the entire genome (5) ormight be because of severe demethylation of specific loci. Toidentify differentially methylated rice sequences, we calculatedfractional methylation in each context within 50-bp windows,subtracted one dataset from another in all pair-wise combinations,and identified loci with significant methylation differences be-tween tissues (Fig. 3 and Table S2). In contrast to A. thaliana,CG methylation of most loci is unchanged in rice endosperm (Fig.3A), with hypomethylation restricted to specific domains (Fig. 3 Dand E and Fig. S4), whereas non-CG methylation is reducedthroughout the genome (Fig. 3 B–E). Gene bodies, transcriptionalstart site (TSS)-proximal regions, and MITEs show similar

A5

00-0.5 1-1 0.5

ytisned

fractional methylation difference

Embryo -

endospermEmbryo - shoot

CG methylationB

3

00-0.5 1-1 0.5

ytisned

fractional methylation difference

Embryo - shoot

Embryo -

endosperm

CHG methylationC

6

00-0.5 1-1 0.5

ytisned

fractional methylation difference

Embryo - shoot

Embryo -

endosperm

CHH methylation

Shoot - root

Shoot - root

Shoot - root

CG embryo

CG endosperm

CHG embryo

CHG endosperm

Expresson embryo

Expression endosperm

Genes

TEs

glutelin B7

glutelin B1

glutelin B1

starch synthase III

CG embryo

CG endosperm

CHG embryo

CHG endosperm

Expresson embryo

Expression endosperm

Genes

TEs

D

E

CHH embryo

CHH endosperm

CHH embryo

CHH endosperm

1

1800

1

1

1

1

1

1800

1

8800

1

1

1

1

1

8800

Fig. 3. Local CG and global non-CG hypomethylationof rice endosperm. (A–C) Kernel density plots of thedifferences between embryo and endosperm methyl-ation (red trace), the differences between embryo andseedling shoot methylation (blue trace), and the dif-ferences between seedling shoot and root methylation(green trace). Methylation differences for 50-bp win-dows containing at least 10 informative sequencedcytosines are shown. Differences for windows withfractional CG methylation of at least 0.7, fractionalCHG methylation of at least 0.5, and fractional CHHmethylation of at least 0.1 in one of the tissues areshown in A–C, respectively. (D and E) Methylation andtranscription of positions 8,425,000–8,495,000 onchromosome 2 encompassing three glutelin genes(Os02g15150, Os02g15169, and Os02g15178; D) and5,290,000–5,360,000 on chromosome 8 encompassingstarch synthase III (Os08g09230; E) in embryo and en-dosperm. Genes and TEs oriented 5′ to 3′ and 3′ to 5′are shown above and below the line, respectively. En-dosperm CG hypomethylation overlapping the TSS ofthe relevant genes is highlighted by boxes. Also notethe extensive loss of CHG methylation at all threeglutelin genes.

A3

0 0 -0.5 1 -1 0.5

y

t i s

n

e

d

fractional methylation difference

A. thaliana dme embryo -

endosperm

Rice embryo -

endosperm

CHG methylation B

6

0 0 -0.5 1 -1 0.5

y

t i s

n

e

d

fractional methylation difference

CHH methylation

A. thaliana WT embryo -

endosperm

A. thaliana dme embryo -

endosperm

Rice embryo -

endosperm

A. thaliana WT embryo -

endosperm

Fig. 4. Wild-type rice endosperm methylation resembles A. thaliana dmeendosperm. (A and B) Kernel density plots of CHG and CHH methylationdifferences within 50-bp windows between rice embryo and endosperm (redtrace), A. thaliana endosperm from loss-of-function dme plants (blue trace),and A. thaliana endosperm from wild-type plants (green trace) (5). Differ-ences for windows with fractional CHG methylation of at least 0.5 in one ofthe tissues are shown in A, and differences for windows with fractional CHHmethylation of at least 0.1 in one of the tissues are shown in B.

Zemach et al. PNAS | October 26, 2010 | vol. 107 | no. 43 | 18731

PLANTBIOLO

GY

Dow

nloa

ded

by g

uest

on

May

18,

202

0

Page 4: Local DNA hypomethylation activates genes in rice endosperm · Local DNA hypomethylation activates genes in rice endosperm Assaf Zemach1, M. Yvonne Kim1, Pedro Silva, Jessica A. Rodrigues,

decreases in endosperm CG methylation, whereas long TEs ex-hibit virtually no CG hypomethylation (Fig. S5). Long TEs alsolose less CHG methylation in endosperm than other sequences(Fig. S5). There are modest (193–1,799 loci) CG methylationdifferences between tissues other than endosperm, with leavesmost similar to seedling shoots and seedling shoots most similar toseedling roots (Table S2), suggesting that age and tissue type in-fluence methylation patterns. CG methylation differences aremuch greater when endosperm is considered, with 25,655–29,969loci hypomethylated in endosperm (Table S2).

DEMETER Orthology Group Is Restricted to Dicots. Our data showthat a major reduction of DNA methylation occurs in the endo-sperm of rice, but the specifics of demethylation are quite differentcompared with A. thaliana (4, 5), leaving open the question ofwhether global demethylation has a common evolutionary originamong angiosperms. A plausible answer is suggested by the factthat the methylation landscape of rice endosperm closely resem-bles that of A. thaliana endosperm with a mutation in the DE-METER (DME) DNA glycosylase (5). DME is required for globalCG demethylation in A. thaliana endosperm (4, 5), and its loss offunction leads to increased CG methylation and decreased non-CG methylation (5) very similar to that of wild-type rice (Fig. 4).DME and its three A. thaliana homologs that function primarilyoutside the seed (14, 15) belong to three distinct orthology groups

within flowering plants: DME, REPRESSOR OF SILENCING 1(ROS1), and DEMETER-LIKE 3 (DML3) (Fig. 5). The DMEorthology group extends tomonkey flower (Mimulus guttatus) (Fig.5), a basal dicot that diverged from A. thaliana about 125 million yago (1), suggesting that DME function may be conserved acrossdicots. However, monocots like rice lack DME orthologs (Fig. 5),and therefore, rice endosperm hypomethylation must rely onROS1 and/or DML3 orthologs or alternate biochemical mecha-nisms. The similarity between wild-type rice andDME-deficientA.thaliana suggests that the overall process of extensive endospermdemethylation is likely conserved between monocots and dicots,with the observed differences caused by divergent evolution of theDME/ROS1/DML3 family.

CG and CHG Hypomethylation Is a Major Mechanism of Gene Activationin Rice Endosperm. Gene expression in A. thaliana endosperm issignificantly linked to DNA methylation—those genes with re-duced DNA methylation upstream of the TSS tend to be moreexpressed in endosperm (5)—but the association is not verystrong, with most genes that are preferentially expressed in en-dosperm not directly activated by removal of DNA methylation(4, 5). To examine the situation in rice, we measured gene ex-pression in the same tissues that we used to quantify DNAmethylation (embryos, endosperm, and seedling shoots and roots)using tiling microarrays. We identified a stringent group of 165

DML3

ROS1

DME

100

100

96

100

93

96

100

100

Fig. 5. Phylogenetic analysis of the DME/ROS1/DML3 glycosylase family. A phylogenetic tree based on conserved domains of glycosylase proteins, with basalland plant (moss and lycophyte) proteins as an outgroup. Posterior probability values (0–100) are shown for key nodes. Monocot, dicot, and moss/lycophyteproteins are colored green, red, and black, respectively. Aly, A. lyrata; Ath, A. thaliana; Bdi, Brachpodium distachyon; Cpa, Carica papaya (papaya); Csa,Cucumis sativus (cucumber); Gma, Glycine max (soybean); Mes, Manihot esculenta (cassava); Mgu, Mimulus guttatus (monkey flower); Osa, Oryza sativa (rice);Ppa, Physcomitrella patens (moss); Ptr, Populus trichocarpa (poplar); Rco, Ricinus communis (castor bean); Smo, Selaginella moellendorffii (lycophyte); Sbi,Sorghum bicolor; Vvi, Vitis vinifera (grape); Zma, Zea mays (maize). Protein sequences were downloaded from Phytozome (http://www.phytozome.net),National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov), and Joint Genome Institute (http://www.jgi.doe.gov).

18732 | www.pnas.org/cgi/doi/10.1073/pnas.1009695107 Zemach et al.

Dow

nloa

ded

by g

uest

on

May

18,

202

0

Page 5: Local DNA hypomethylation activates genes in rice endosperm · Local DNA hypomethylation activates genes in rice endosperm Assaf Zemach1, M. Yvonne Kim1, Pedro Silva, Jessica A. Rodrigues,

genes with a strong preference for endosperm expression (TableS3) by selecting only those genes with fourfold or greater RNAlevels in endosperm compared with each of the other three tissuesand that also met these criteria in previously published expressiondatasets (16). We similarly identified a control group of 153 genespreferentially expressed in embryo (Table S4).Embryo-preferred genes have similar methylation patterns in all

tissues (Fig. 6 A and B and Fig. S6). In contrast, endosperm-pre-ferred genes are, on average, hypermethylated in embryos, shoots,roots, and leaves, with decreased CG and CHG methylation inendosperm (Fig. 6C andD); 69 of the endosperm-preferred genes(42%) have a significant decrease (P < 10−7, Fisher’s exact test) inCG methylation within 100 bp of the TSS or in CHG methylationwithin the gene body, with 38 genes (23%) exhibiting both (TableS3). Only nine embryo-preferred genes (6%) have a significantdecrease in CG or CHG methylation in endosperm, and noneshow both (Table S4)—a highly significant difference (P < 10−11,Fisher’s exact test).For a global comparisonofmethylationand transcription changes,

we aligned all genes according to our microarray data from thosemost expressed in endosperm vs. embryo to those least expressed inendosperm vs. embryo, and we displayed embryo methylation levelsand the difference between embryo and endosperm as heat maps(Fig. 6EandFandFig. S6).Endosperm-expressedgeneshavehigherlevels of embryo CGmethylation near the TSS (Fig. 6E) and higher

embryo CHGmethylation throughout the gene body (Fig. 6F), withreduced levels of methylation in endosperm. CHH methylationshows a weak, if any, correlation with embryo/endosperm expressiondifferences (Fig. S6).Our data suggest thatDNAhypomethylation isa major mechanism for activation of genes in rice endosperm. En-dosperm-preferred genes apparently activated by DNA demethyla-tion include precursors of glutelin, which accounts for about 70% ofrice endosperm protein (17), and carbohydrate enzymes that syn-thesize endosperm starch (Fig. 3 D and E, Fig. S4, and Table S3),genes that create the nutritive molecules relied on by germinatingseedlings and much of the human population.

DiscussionOur data have important implications for engineering of transgeniccereal crops. Glutelin promoters have been investigated for theirsuitability to drive transgene expression in rice endosperm, andstrong endosperm-specific promoters are generally highly sought(17). However, our data suggest that a substantial fraction of en-dosperm-specific expression is caused by hypomethylation, andtransgenic constructs may not recapitulate endogenous expressionpatterns.Understanding theepigeneticmechanisms regulating suchpromoters will allow for more informed design of transgenic lines.Considering the difficulty of targeted mutagenesis in plants (18),

RNAi is an attractive mechanism for silencing unwanted endoge-

kb

Embryo

4-fold

overexpressed

.7

0

n

o

i t a

l y

h

t e

m

G

C

Embryo

Endosperm

Seedling shoot A

-3 1 0 -1 3 2 -3 1 0 -1 3 2 4

Seedling root

-2 -2

Leaf

kb

.45

0

n

o

i t a

l y

h

t e

m

G

H

C

Embryo

Endosperm

Seedling

shoot

B

-3 1 0 -1 3 2 -3 1 0 -1 3 2 4

Seedling root

-2 -2

Leaf

Embryo

4-fold

overexpressed

kb

Endosperm

4-fold

overexpressed

.7

0

noit

aly

hte

mG

C

Embryo

Endosperm

Seedling shootC

-3 10-1 32 -31 0 -13 24

Seedling root

-2-2

Leaf

kb

.45

0

noit

aly

hte

mG

HC

Embryo

Endosperm

Seedling

shoot

D

-3 10-1 32 -31 0 -13 24

Seedling root

-2-2

Leaf Endosperm

4-fold

overexpressed

E CG methylation (embryo) CG methylation

(embryo-endosperm)

-3 3 0 -3 3 0

0 1 -0.2 0.2

kb kb

F

-3 30 -33

0

0 0.5 -0.2 0.2

kb kb

CHG methylation (embryo)CHG methylation

(embryo-endosperm)

7

-7

) o

y

r b

m

e

/ m

r

e

p

s

o

d

n

e

( g

o

l n

o

i t p

i r c

s

n

a

r T

2

)o

yrb

me/

mre

ps

od

ne(

gol

noit

pirc

sn

arT

27

-7

Fig. 6. Hypomethylation is a major mechanism for activation of rice endosperm genes. (A–D) Embryo-preferred (A and B; n = 153) or endosperm-preferred (Cand D; n = 165) genes were aligned as in Fig. 1, and average methylation levels for each 100-bp interval are plotted. (E and F) All rice genes were aligned atthe 5′ end and ranked according to the ratio of expression in endosperm over embryo. Embryo methylation is displayed as a heat map in Left, and differencesbetween embryo and endosperm are in Right.

Zemach et al. PNAS | October 26, 2010 | vol. 107 | no. 43 | 18733

PLANTBIOLO

GY

Dow

nloa

ded

by g

uest

on

May

18,

202

0

Page 6: Local DNA hypomethylation activates genes in rice endosperm · Local DNA hypomethylation activates genes in rice endosperm Assaf Zemach1, M. Yvonne Kim1, Pedro Silva, Jessica A. Rodrigues,

nous genes. However, the extremely low levels of CHHmethylationin rice endosperm indicate that the functionality of theRNAi systemis altered in this tissue. Small RNAs may be transported out of theendosperm to immunize the embryo, or the link between RNAi andDNAmethylationmay be weakened. In either case, targeting RNAito promoters for the purpose of transcriptional repression may beineffective. Similarly, transgenes inactivated byDNAmethylation inother tissues may be reactivated in endosperm.We previously suggested that DNA hypomethylation in A. thali-

ana endosperm is a mechanism to enhance TE silencing in the em-bryo (5), a hypothesis supported by an abundance of TE-derivedsmall RNAs in the endosperm (19).Our rice data are fully consistentwith this hypothesis. The greatest amount of CHH methylation isfound in short rice TEs, consistent with the observation that meth-ylation of short TEs is more dependent on the RNAi system (20).Short TEs lose the most CHH methylation in rice endosperm (Fig.1F) and are the only elements hypermethylated in embryo (Fig. 2A),suggesting that demethylation and activation of TEs in endospermlead to hypermethylation and silencing in the embryo through theRNAi pathway. Considering that the major targets of rice CHHmethylation areMITEs,which tend to be foundnear the start sites ofactive genes, this system is likely of particular importance for properfunctionality of the rice genome.Presentmodels of plant gene imprintingposit thatDME-mediated

removal of DNA methylation in the central cell before fertilizationcreates epigenetic differences between the maternal and paternalgenomes in endosperm, which, in turn, lead to differential gene ex-pression (2). A number of monocot imprinted genes apparently ac-tivated by selective maternal demethylation have been indentified(21–23), but lack of monocot DME implies that another member ofthis gene family or a different biochemical mechanism is responsiblefor monocot imprinting. Furthermore, it is unclear to what extentendosperm DNA hypomethylation and associated gene expressionchanges are allele-specific. We find that major resource genes areactivated by hypomethylation in endosperm, and if their expression isconfined to the maternal genome, our observations would seem in-consistent with the prevalent parental conflict theory, which predictsthat genes that increase resource allocation should be expressed from

the paternal genome.Whether and to what extent the demethylationthat we observe is parent-specific and how this translates intoimprinted gene expression are urgent issues for further study.

MethodsBisulfite Sequencing and Analysis. Endosperm and embryos were isolated atthe milky stage, and shoots and roots were taken from 14-d-old seedlingsgrown in liquid medium as described (8). Bisulfite sequencing and dataprocessing were performed as described (5, 8).

Microarray Analysis. OurcustomNimbleGenmicroarrayconsistsof2,154,32545-to 85-bp probes that are tiled across the entire sequenced rice genome (Oryzasativa ssp. japonica cultivar Nipponbare, Michigan State University release 5,http://rice.plantbiology.msu.edu) without repeat masking. Each probe is se-lected tohaveapredictedmelting temperatureclose to76°C. Thearraydesign isdeposited in GeneExpressionOmnibus (GEO)with accessionnumberGSE22591.cDNA samples were prepared and labeled as described (24), with hybridizationand data extraction preformed at the Fred Hutchinson Cancer Research Center(www.fhcrc.org) DNA array facility (25). Two independent cDNA samples foreach tissue were labeled with Cy5 and cohybridized with sonicated genomicDNA labeled with Cy3. The two replicates were averaged, and outlier probeswere removedbymediansmoothing(three-probewindow).Anexpression scorefor each gene was calculated by averaging the signal of all probes within thegene’s exons.

Phylogenetic Analysis. Conserved domains of the indicated glycosylase pro-teins were aligned using MUSCLE v3.7, and phylogenetic trees were inferredusing MrBayes v3.1.2 as described (8). The tree was checked and graphicallypresented in FigTree v1.2.2 (http://tree.bio.ed.ac.uk/software/figtree).

ACKNOWLEDGMENTS. We thank Leath Tonkin for Illumina sequencing,Peijian Cao and Pamela Ronald for processed Affymetrix array data(GSE11966), and Barbara Rotz for rice care. A.Z. is a fellow of the JaneCoffin Childs Memorial Fund for Medical Research. J.A.R. is a FulbrightScholar. This work was partially funded by a Young Investigator Grant fromthe Arnold and Mabel Beckman Foundation and Grant IOS-1025890 fromthe National Science Foundation (to D.Z.).

1. Hedges SB, Dudley J, Kumar S (2006) TimeTree: A public knowledge-base ofdivergence times among organisms. Bioinformatics 22:2971–2972.

2. Huh JH, Bauer MJ, Hsieh TF, Fischer RL (2008) Cellular programming of plant geneimprinting. Cell 132:735–744.

3. Chaudhury AM, et al. (2001) Control of early seed development. Annu Rev Cell DevBiol 17:677–699.

4. Gehring M, Bubb KL, Henikoff S (2009) Extensive demethylation of repetitive elementsduring seed development underlies gene imprinting. Science 324:1447–1451.

5. Hsieh TF, et al. (2009) Genome-wide demethylation of Arabidopsis endosperm.Science 324:1451–1454.

6. Law JA, Jacobsen SE (2010) Establishing, maintaining and modifying DNAmethylationpatterns in plants and animals. Nat Rev Genet 11:204–220.

7. Feng S, et al. (2010) Conservation and divergence of methylation patterning in plantsand animals. Proc Natl Acad Sci USA 107:8689–8694.

8. Zemach A, McDaniel IE, Silva P, Zilberman D (2010) Genome-wide evolutionaryanalysis of eukaryotic DNA methylation. Science 328:916–919.

9. Swanson WJ, Vacquier VD (2002) The rapid evolution of reproductive proteins. NatRev Genet 3:137–144.

10. Martienssen R, Barkan A, Taylor WC, Freeling M (1990) Somatically heritable switchesin the DNA modification of Mu transposable elements monitored with a suppressiblemutant in maize. Genes Dev 4:331–343.

11. Meyer P, et al. (1992) Endogenous and environmental factors influence 35S promotermethylation of amaizeA1gene construct in transgenic petunia and its colour phenotype.Mol Gen Genet 231:345–352.

12. Jiang N, et al. (2003) An active DNA transposon family in rice. Nature 421:163–167.13. Bureau TE, Wessler SR (1994) Mobile inverted-repeat elements of the Tourist family are

associated with the genes of many cereal grasses. Proc Natl Acad Sci USA 91:1411–1415.

14. Penterman J, et al. (2007) DNA demethylation in the Arabidopsis genome. Proc Natl

Acad Sci USA 104:6752–6757.15. Zhu JK (2009) Active DNA demethylation mediated by DNA glycosylases. Annu Rev

Genet 43:143–166.16. Xue LJ, Zhang JJ, Xue HW (2009) Characterization and expression profiles of miRNAs

in rice seeds. Nucleic Acids Res 37:916–930.17. Qu le Q, Xing YP, Liu WX, Xu XP, Song YR (2008) Expression pattern and activity of six

glutelin gene promoters in transgenic rice. J Exp Bot 59:2417–2424.18. Li J, Hsia AP, Schnable PS (2007) Recent advances in plant recombination. Curr Opin

Plant Biol 10:131–135.19. Mosher RA, et al. (2009) Uniparental expression of PolIV-dependent siRNAs in

developing endosperm of Arabidopsis. Nature 460:283–286.20. Tran RK, et al. (2005) Chromatin and siRNA pathways cooperate to maintain DNA

methylation of small transposable elements in Arabidopsis. Genome Biol 6:R90.21. Gutiérrez-Marcos JF, et al. (2006) Epigenetic asymmetry of imprinted genes in plant

gametes. Nat Genet 38:876–878.22. Haun WJ, et al. (2007) Genomic imprinting, methylation and molecular evolution of

maize Enhancer of zeste (Mez) homologs. Plant J 49:325–337.23. Jahnke S, Scholten S (2009) Epigenetic resetting of a gene imprinted in plant embryos.

Curr Biol 19:1677–1681.24. Zilberman D, Gehring M, Tran RK, Ballinger T, Henikoff S (2007) Genome-wide

analysis of Arabidopsis thaliana DNA methylation uncovers an interdependence

between methylation and transcription. Nat Genet 39:61–69.25. Zilberman D, Coleman-Derr D, Ballinger T, Henikoff S (2008) Histone H2A.Z and

DNA methylation are mutually antagonistic chromatin marks. Nature 456:

125–129.

18734 | www.pnas.org/cgi/doi/10.1073/pnas.1009695107 Zemach et al.

Dow

nloa

ded

by g

uest

on

May

18,

202

0