recently duplicated sesterterpene (c25) gene clusters in...

12
RESEARCH PAPERhttps://doi.org/10.1007/s11427-019-9521-2 ........................................................................................................... Recently duplicated sesterterpene (C25) gene clusters in Arabidopsis thaliana modulate root microbiota Qingwen Chen 1,8† , Ting Jiang 1,4,8† , Yong-Xin Liu 1,4 , Haili Liu 2 , Tao Zhao 3 , Zhixi Liu 1,8 , Xiangchao Gan 5 , Asis Hallab 6 , Xuemei Wang 1,8 , Juan He 1,8 , Yihua Ma 1,8 , Fengxia Zhang 1 , Tao Jin 7 , M. Eric Schranz 3 , Yong Wang 2 , Yang Bai 1,4,8* & Guodong Wang 1,8* 1 State Key Laboratory of Plant Genomics and National Center for Plant Gene Research, Institute of Genetics and Developmental Biology, The Innovative Academy of Seed Design, Chinese Academy of Sciences, Beijing 100101, China; 2 Key Laboratory of Synthetic Biology, Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200032, China; 3 Biosystematics Group, Wageningen University, 6708 PB Wageningen, The Netherlands; 4 CAS-JIC Centre of Excellence for Plant and Microbial Sciences (CEPAMS), Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China; 5 Max Planck Institute for Plant Breeding Research, Carl-von-Linné-Weg 10, 50829 Köln, Germany; 6 Forschungszentrum Jülich, Plant Sciences (IBG-2) Wilhelm-Johnen-Straße, 52428 Jülich, Germany; 7 China National Genebank-Shenzhen, the Beijing Genomics Institute (BGI-Shenzhen), Shenzhen 518083, China; 8 College of Advanced Agricultural Sciences, University of Chinese Academy of Sciences, Beijing 100039, China Received February 21, 2019; accepted March 25, 2019; published online April 0, 2019 Land plants co-speciate with a diversity of continually expanding plant specialized metabolites (PSMs) and root microbial communities (microbiota). Homeostatic interactions between plants and root microbiota are essential for plant survival in natural environments. A growing appreciation of microbiota for plant health is fuelling rapid advances in genetic mechanisms of controlling microbiota by host plants. PSMs have long been proposed to mediate plant and single microbe interactions. However, the effects of PSMs, especially those evolutionarily new PSMs, on root microbiota at community level remain to be elucidated. Here, we discovered sesterterpenes in Arabidopsis thaliana, produced by recently duplicated prenyltransferase-terpene synthase (PT-TPS) gene clusters, with neo-functionalization. A single-residue substitution played a critical role in the acquisition of sesterterpene synthase (sesterTPS) activity in Brassicaceae plants. Moreover, we found that the absence of two root-specific sesterterpenoids, with similar chemical structure, significantly affected root microbiota assembly in similar patterns. Our results not only demonstrate the sensitivity of plant microbiota to PSMs but also establish a complete framework of host plants to control root microbiota composition through evolutionarily dynamic PSMs. plant specialized metabolites, microbiota, sesterterpene, terpene synthase Citation: Chen, Q., Jiang, T., Liu, Y.X., Liu, H., Zhao, T., Liu, Z., Gan, X., Hallab, A., Wang, X., He, J., et al. (2019). Recently duplicated sesterterpene (C25) gene clusters in Arabidopsis thaliana modulate root microbiota. Sci China Life Sci 62, https://doi.org/10.1007/s11427-019-9521-2 © Science China Press and Springer-Verlag GmbH Germany, part of Springer Nature 2019 life.scichina.com link.springer.com SCIENCE CHINA Life Sciences †Contributed equally to this work *Corresponding authors (Yang Bai, email: [email protected]; Guodong Wang, email: [email protected])

Upload: others

Post on 24-Dec-2019

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Recently duplicated sesterterpene (C25) gene clusters in …bailab.genetics.ac.cn/pdf/Chen-2019-SCLS.pdf · 2019-05-06 · the effects of PSMs, especially those evolutionarily new

•RESEARCH PAPER• https://doi.org/10.1007/s11427-019-9521-2. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Recently duplicated sesterterpene (C25) gene clusters inArabidopsis thaliana modulate root microbiota

Qingwen Chen1,8†, Ting Jiang1,4,8†, Yong-Xin Liu1,4, Haili Liu2, Tao Zhao3, Zhixi Liu1,8,Xiangchao Gan5, Asis Hallab6, Xuemei Wang1,8, Juan He1,8, Yihua Ma1,8, Fengxia Zhang1,

Tao Jin7, M. Eric Schranz3, Yong Wang2, Yang Bai1,4,8* & Guodong Wang1,8*

1State Key Laboratory of Plant Genomics and National Center for Plant Gene Research, Institute of Genetics and Developmental Biology, TheInnovative Academy of Seed Design, Chinese Academy of Sciences, Beijing 100101, China;

2Key Laboratory of Synthetic Biology, Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academyof Sciences, Shanghai 200032, China;

3Biosystematics Group, Wageningen University, 6708 PB Wageningen, The Netherlands;4CAS-JIC Centre of Excellence for Plant and Microbial Sciences (CEPAMS), Institute of Genetics and Developmental Biology, Chinese

Academy of Sciences, Beijing 100101, China;5Max Planck Institute for Plant Breeding Research, Carl-von-Linné-Weg 10, 50829 Köln, Germany;6Forschungszentrum Jülich, Plant Sciences (IBG-2) Wilhelm-Johnen-Straße, 52428 Jülich, Germany;

7China National Genebank-Shenzhen, the Beijing Genomics Institute (BGI-Shenzhen), Shenzhen 518083, China;8College of Advanced Agricultural Sciences, University of Chinese Academy of Sciences, Beijing 100039, China

Received February 21, 2019; accepted March 25, 2019; published online April 0, 2019

Land plants co-speciate with a diversity of continually expanding plant specialized metabolites (PSMs) and root microbialcommunities (microbiota). Homeostatic interactions between plants and root microbiota are essential for plant survival in naturalenvironments. A growing appreciation of microbiota for plant health is fuelling rapid advances in genetic mechanisms ofcontrolling microbiota by host plants. PSMs have long been proposed to mediate plant and single microbe interactions. However,the effects of PSMs, especially those evolutionarily new PSMs, on root microbiota at community level remain to be elucidated.Here, we discovered sesterterpenes in Arabidopsis thaliana, produced by recently duplicated prenyltransferase-terpene synthase(PT-TPS) gene clusters, with neo-functionalization. A single-residue substitution played a critical role in the acquisition ofsesterterpene synthase (sesterTPS) activity in Brassicaceae plants. Moreover, we found that the absence of two root-specificsesterterpenoids, with similar chemical structure, significantly affected root microbiota assembly in similar patterns. Our resultsnot only demonstrate the sensitivity of plant microbiota to PSMs but also establish a complete framework of host plants to controlroot microbiota composition through evolutionarily dynamic PSMs.

plant specialized metabolites, microbiota, sesterterpene, terpene synthase

Citation: Chen, Q., Jiang, T., Liu, Y.X., Liu, H., Zhao, T., Liu, Z., Gan, X., Hallab, A., Wang, X., He, J., et al. (2019). Recently duplicated sesterterpene (C25)gene clusters in Arabidopsis thaliana modulate root microbiota. Sci China Life Sci 62, https://doi.org/10.1007/s11427-019-9521-2

© Science China Press and Springer-Verlag GmbH Germany, part of Springer Nature 2019 life.scichina.com link.springer.com

SCIENCE CHINALife Sciences

†Contributed equally to this work*Corresponding authors (Yang Bai, email: [email protected]; Guodong Wang, email: [email protected])

Page 2: Recently duplicated sesterterpene (C25) gene clusters in …bailab.genetics.ac.cn/pdf/Chen-2019-SCLS.pdf · 2019-05-06 · the effects of PSMs, especially those evolutionarily new

INTRODUCTION

In nature, land plants featured with continually expandingplant specialized metabolites (PSMs) have co-evolved with adiverse community of soil microorganisms, and recruitspecific root microbiota, which play pivotal roles in planthealth and productivity (Berendsen et al., 2012; Müller et al.,2016; Verbon and Liberman, 2016; Leach et al., 2017).Meanwhile, plants face strong selective pressure to controltheir root microbiota to adapt or survive in natural environ-ments, given the huge diversity and evolutionary dynamismof microbes. Thus, plants have evolved multiple mechanismsto control root microbiota, as humans have done to controltheir gut microbiota (Foster et al., 2017). Several recentstudies have identified processes that control root microbiotaassembly in Arabidopsis thaliana, including salicylic aciddependent immune system and phosphate starvation path-ways (Lebeis et al., 2015; Castrillo et al., 2017). However,the key processes characterized to date can only explainlimited variation observed in root microbiota, suggesting thathost plants almost certainly use additional molecular me-chanisms to manipulate root microbiota.PSMs have long been proposed to function in key roles

that mediate interactions between host plants and microbesin their environment (van Dam and Bouwmeester, 2016).The shared lineage and tissue specificity and the evolu-tionary dynamism of PSM pool and plant microbiota alsosupport this hypothesis (Bulgarelli et al., 2012; Lundberg etal., 2012; Schlaeppi et al., 2014). However, the mutual re-lationships between PSMs and root microbiota, i.e., how andto what extent they are related, are largely unknown. Amongvarious PSMs, terpenoids are the largest and the most diversegroup; more than 70,000 terpenoids have been structurallycharacterized (Vickers et al., 2014). Despite their enormousstructural and functional diversity, all terpenoids originatevia de novo synthesis from two common C5 precursors,isopentenyl diphosphate (IPP) and dimethylallyl diphosphate(DMAPP). These precursors are subsequently modified byprenyltransferase (PT) and terpene synthase (TPS) enzymes,which together generate the diversity of terpenoid backbones(Chen et al., 2011). There are 32 TPS genes in the Arabi-dopsis thaliana genome, 18 of which are expressed at var-ious levels in root tissues (Tholl and Lee, 2011), implyingthat terpenoid diversity is critically important for root phy-siological processes in Arabidopsis thaliana. Interestingly,three PT-TPS gene clusters had been discovered by genomesequence analysis in Arabidopsis thaliana fifteen years ago(Aubourg et al., 2002). Till recently, we and other groupselucidated the biochemical function of these PT-TPS geneclusters, which showed unprecedented sesterterpene syn-thase (sesterTPS) activity utilizing geranylfarnesyl pyr-ophosphate (GFPP, C25) as a substrate (Wang et al., 2016;Huang et al., 2017; Shao et al., 2017). However, the in planta

functions of these PT-TPS gene clusters remain unknown.Here, we provide genetic and biochemical evidences to

characterize two root-specific sesterterpenoids produced byrecently duplicated gene clusters (TPS25-GFPPS3 andTPS30-GFPPS4) in Arabidopsis thaliana. We further foundthat a single-residue substitution played a critical role in theacquisition of sesterTPS activity in Brassicaceae plants. Theabsence of two pentacyclic ring sesterterpenoids in loss-of-sesterTPS-function mutants significantly affected the Ara-bidopsis thaliana root microbiota composition. The ses-terterpenoid-mediated plant mechanism for controlling rootmicrobiota establishes a conceptual framework of host plantsto modulate root microbiota composition through evolutio-narily new PSMs.

RESULTS AND DISCUSSION

Evolutionarily new sesterterpene gene clusters in Ara-bidopsis thaliana

The Arabidopsis thaliana genome contains four ger-anylfarnesyl pyrophosphate synthases (GFPPS, C25) genes.Conspicuously, each GFPPS gene is adjacent to a terpenesynthase (TPS) gene, followed by one or more P450 genes,thereby forming TPS-GFPPS-P450 gene clusters. One largecluster includes three TPS genes, two GFPPS genes, andeight P450 genes (TPS17-GGPPS5-TPS18-GFPPS1-TPS19-GFPPS2-P4501–8, Figure 1A), while two small clus-ters include TPS25-GFPPS3 and TPS30-GFPPS4-P450(Figure 1A) (Wang et al., 2016). The products of TPS18(Compound 1, (+)-thalianatriene), TPS19 (Compound 2,(−)-retigeranin B), TPS25 (Compound 3, (−)-ent-quiannu-latene) and TPS30 (Compound 4, (+)-astellatene) wereidentified using NMR techniques (Huang et al., 2017; Shaoet al., 2017). These compounds are unprecedented Arabi-dopsis thaliana sesterterpenes largely similar to those iso-lated from fungi (Figure 1A). We next attempted tounderstand the evolutionary trajectory of the PT-TPS geneclusters. Phylogenomic synteny analysis revealed that anancestral TPS-GFPPS cluster formed after an α whole-gen-ome duplication event; orthologous clusters were present inBrassicaceae but not in any Cleomaceae (Figure 1B). No-tably, TPS18-GFPPS1, TPS19-GFPPS2, TPS25-GFPPS3,and TPS30-GFPPS4 clusters were specific to the Arabi-dopsis thaliana genome via relatively recent duplications ofthe TPS17-GGPPS5 cluster; no counterparts for these geneclusters were found in Arabidopsis lyrata and other Brassi-caceae genomes. Moreover, the TPS25-GFPPS3 and TPS30-GFPPS4-P450 clusters were located in dynamic chromoso-mal regions that are enriched in transposable elements (TEs).These results suggest that gene duplication and TE-mediatedrecombination contribute to the recent formation of TPS25-GFPPS3 and TPS30-GFPPS4-P450 clusters in Arabidopsis

2 Chen, Q., et al. Sci China Life Sci

Page 3: Recently duplicated sesterterpene (C25) gene clusters in …bailab.genetics.ac.cn/pdf/Chen-2019-SCLS.pdf · 2019-05-06 · the effects of PSMs, especially those evolutionarily new

thaliana.

A single amino acid determines the substrate specificityof sesterTPSs

As TPSs are determinant enzymes for the diversity of the

terpene backbone, we explored the phylogeny of sesterTPSsin 246 TPSs from 13 representative species on plant evolu-tionary history. Four aforementioned Arabidopsis sesterTPSsand their close homologues were only present in Brassica-ceae genomes and formed an evolutionarily conserved sub-clade (28 members) within the TPS-a family (Figure 2A),

26.7 - 26.6 Mb

4.9 - 5.1 Mb

6.3 - 6.5 Mb

6.6 - 6.9 Mb

A

GFPPS

sesterTPS

1.0 kb

P450TPS30(At3g32030)

GFPPS4(At3g32040)

12.5 kb

P450(CYP705A, At3g32047)

12.0 kb

TPS25(At3g29410)

GFPPS3(At3g29430)

6.4 kb

TPS18(At3g14520)

GFPPS1 (At3g14530 )

TPS19(At3g14540)

GFPPS2(At3g14550)

8 x P450s(CYP72A)

26.4 kb

TPS17(At3g14490)

GGPPS5(At3g14510)

7.4 kb

B

Lineage III

Cleome gynandra

Schrenkiella parvula

Sisymbrium irio

Aethionema arabicum

Camelina sativa

Arabidopsis lyrata

Arabidopsis thaliana

Capsella rubella

Brassica rapa

Boechera strictaLineage I

Lineage II

N.A.Brassicaceae

At-α

At-ß

Compound 3 (TPS25) Compound 4 (TPS30)Compound 1 (TPS18) Compound 2 (TPS19)

13

1418

17

16

151

19

23

12

2

34

56

7

8 910

11

22

20

21

HH

H

24

25

H

H

H

1213

1418

17

16

151

2

3

45

6

7

8 9

10

11

22 1924

25

2320

21

12

13

1418

17

16

15

123

4

5 6

7 8

910

11

21

19

2425

2320

22

H

H

H

1

2

34

5

6

7

89

10

11 1213

1415

16

17

18

19

20

21

22

23

24

25

H

H H

C. sativa NC_025699

C. sativa NC_025685

C. rubella Scaffold3

B. stricta Scaffold28625

A. lyrata Scaffold3

A. thaliana Chr3

B. rapa A01

B. rapa A03

S. parvula Chr3

S. irio Scaffold2805

A. arabicum Scaffold26

C. gynandra Scaffold278

1.4 - 1.2 Mb

6.0 - 6.2 Mb

4.8 - 5.0 Mb 11.3 - 13.0 Mb

17.9 - 18.0 Mb

4.4 - 4.5 Mb

0.0 - 0.1 Mb

0.7 - 0.9 Mb

0.2 - 0.6 Mb

IV

V

IV VII & III

II IIII

Figure 1 Brassicaceae-wide Synteny of TPS-GFPPS-P450 Gene Clusters. A, Illustration of prenyltransferase-terpene synthase (PT-TPS) gene clusters inArabidopsis thaliana and chemical structures of their corresponding products. Five putative sesterterpene (C25) gene clusters in the Arabidopsis genome. Allof these gene clusters are located on chromosome 3. GGPPS5 (At3g14510) is a pseudogene. Dotted lines represent multiple genes in this region. Thechemical structures of four sesterterpenes generated by the Arabidopsis clustered sesterTPS are shown below. B, Simplified molecular phylogenetic tree (left)of Brassicaceae species that were used for the synteny analysis. The positions of the well-known ancient α and β whole-genome duplication events areindicated. Phylogenomic synteny analysis (right) of TPS-GFPPS-P450 gene clusters (cyan: TPS; orange: GFPPS; light green: P450) from 10 representativeBrassicaceae and Cleomaceae genomes. Chromosome segments are represented by gray bars, and genes are represented by dark green bricks. The positions ofthe TPS18-GFPPS1, TPS19-GFPPS2, TPS25-GFPPS3, and TPS30-GFPPS4 clusters are highlighted with red boxes. N.A. indicates that genome informationof Brassicaceae lineage III species is not publicly available.

3Chen, Q., et al. Sci China Life Sci

Page 4: Recently duplicated sesterterpene (C25) gene clusters in …bailab.genetics.ac.cn/pdf/Chen-2019-SCLS.pdf · 2019-05-06 · the effects of PSMs, especially those evolutionarily new

*

20NH ***** *2J5C5EASTPS18TPS19TPS25TPS30

319 356316 353319 356319 356

264 301

259 296286 323

* ** *******

496 577493 574496 576496 576

440 520

436 517463 542

B

TPS-

c/e/

f TPS-a

TPS-b/g

Cru

1001

5372

mC

gr00

26s0

064

09

AT3

g294

10(T

PS25

)B

raA

0318

6Es

a100

1542

9mA

T3g3

2030

(TPS

30)

Bst

2862

5s02

50A

ly89

1372

88B

raC

0352

9Es

a100

0562

7mB

raC

0426

1Bs

t268

33s0

091

Bst9

638s

0070

95AT

1g33

750(

TPS2

2)Al

y922

469

BraC

0353

1Cr

u100

1608

9mAT

3g14

490(

TPS1

7)

Aly9

0438

6Cr

u100

1162

4m

Cru1

0016

237m

AT3g

1452

0(TP

S18)

AT3g

1454

0(TP

S19)

AT1g

3195

0(TP

S29)

80

89

AT1g

7008

0(TP

S06)

Cgr1

961s

0007

AT5g4

8110

(TPS

20)

Cru10

0124

77m

Cru10

0281

57m

Cru10

0281

68m

AT5g4

4630

(TPS11

)

Aly330

931

Aly915

722

Cru1002

8094

m

Cgr1562

s000

1

93Bst29514s0043

Cru10028174m

BraD00967

BraB02791

99Th2v24410

BraJ01717

Esa10001108m

Esa10001228m

87Th2v26019

Th2v19177

Bst30275s0018

Bst17584s0004

AT4g15870(TPS01)

BraJ00933

Bst10689s0001

Cru10021320m

Cru10022111m

Cgr0013s0001

9592 Aly481194

AT2g23230(TPS05)

Cru10024815m

Cgr2793s0006

8197Cru10003521m

AT1g48800(TPS28)

Aly913919BraI01892Aly913921Cru10012518mBraI01686Cru10012239mAT4g20210(TPS08)AT3g29110(TPS16)BraJ01171Esa10028539mEsa10027515mBraK01795BraH01040BraH01002AT4g20230(TPS09)Cru10007162m

80

AT4g20200(TPS07)AT3g29190(TPS15)

AT1g66020(TPS26)BraK01797BraA01099Cru10007667m

Cgr4235s0001

92

87

AT4g13300(TPS13)

Aly332992AT4g13280(TPS12)

Cru10006722m

Atr00066.146

Cru10018762m

Cgr0167s0006

97

AT3g25810(TPS24)

BraF03314

BraF03315

Aly896435

AT3g25830(TPS23)

AT3g25820(TPS27)

97

BraF03317

Cru10022869m

Cgr4111s0001

97

Cru10022874m

BraC02485

BraC02651

AT2g24210(TPS10)

Aly899756

Aly932440

Th2v24447

Cru10004460m

Cru10004436m

Cgr0812s0028

94AT4g16730(TPS02)

Aly329840

Esa10027480m

BraA01862

78Th2v05855)

Th2v20955

97Th2v21802

Th2v21803

98Th2v20363

Th2v05137

Th2v05138

BraH

00876

AT4g16740(TPS03)

Aly493202

Cru10004490m

Cgr0812s0027

9876

Th2v00751Th2v00752Th2v11008

Aco

005

0040

3A

co00

5 00

40179100 610oc

A50200 610oc

AAco

016

0020

3A

co01

6 00

201

93A

co01

2 00

173

Atr

0004

3.50

Aco

010

0062

0

Atr

0000

3.31

1

Aco

012

0026

0

Aco

012

0025

7

95

Aco

016

0006

2

Aco

016

0006

498

79

Atr0

0007

.400

Atr0

0032

.223

Aco0

12 0

0262

Aco0

25 0

0001

Aco0

30 0

0128

Cpa1

9.15

Cpa1

.290

Aco0

05 0

0393

Aco0

05 0

0392

Aco0

12 0

0259

Aco0

12 0

0256

99

Cpa8

1.3

Aco0

11 0

0462

Aco0

12 00

115

Aco0

28 00

026

Aco01

2 002

17

Aco10

4 000

06

Aco02

8 002

44

Aco07

1 000

12

Aco10

4 000

02

Cru10

0222

22m

Cgr1546

3s00

02

96

Aly924

854

AT1g61

680(T

PS14)

75

Th2v04950

Th2v04952

92

Th2v20646

Th2v20645

Aco005 00387

Aco005 00390

Aco005 00402

96 87

Cpa585.3Cpa761.1

Th2v17335Th2v08642

Th2v08644 76Th2v23914

Th2v23915Th2v23919 88BraK00917AT5g23960(TPS21)

Cru10000642mCgr0086s0016 88BraF02619Bst10199s0039Cru10018817mCgr0645s0016Aly348000 82

77

Smo406214Smo403764Smo40376199

Atr00066.195AT1g61120(TPS04)

Aly92492396

Cru10019776mCgr2213s0003 91

Bst10040s0087Esa10023261m

BraI01514

90

BraK00278Aly338258

Bst10040s0088

Cru10021930m

Cgr2213s0002 96

88

Th2v04970

Cpa596.2

Cru10003225m

Cgr8967s0012

Bst2983s0171

AT4g02780(TPS31)

75

Aly352546

BraC02792

Esa10029352m

BraI00186

Th2v16249

Atr00103.20

Cpa52.43

Ppa007G008700

Ppa007G00800093

Smo163980

Aco028 00016

Aco028 00017

87

Aco012 00177

Aco012 00228

Aco028 00293

Aco028 00290

92

Aco014 00341

Cpa32.89

Atr00032.208

AT1g79460(TPS32)

Aly47711394

Cru10019830m

Cgr1725s0020

98

Bst20129s0381

Esa10018144mBraG03667

77

Th2v10128Sm

o112927Sm

o124329Sm

o418910Sm

o402349Sm

o439448Sm

o412139Sm

o407280Sm

o40235198

*

**A

C

Enzyme GFPP (C25) GGPP (C20) FPP (C15) GPP (C10)

TPS18 TPS19 TPS25 TPS30

TPS18G328W

TPS19G325W

TPS25G328W

TPS30P328W

Figure 2 Substrate Specificity of sesterTPSs from Brassicaceae Plants. A, Phylogenetic analysis of 246 TPS proteins obtained from 13 representativespecies on plant evolutionary history. To simplify the classification of TPS proteins, three TPS clades (TPS-a, TPS-b/g and TPS-c/e/f) are shown here.Bootstrap values (based on 500 replicates) >75% are shown for corresponding nodes. All 28 putative sesterTPS proteins in blue form a Brassicaceae-specificclade. The 16 TPS proteins showing sesterTPS activity and 12 TPS proteins showing no sesterTPS activity in this clade are marked in red dots and black dots,respectively. Four clustered sesterTPSs from Arabidopsis are marked with additional red asterisks. Species abbreviations: Aco, Aquilegia coerulea Gold-smith; Aly, Arabidopsis lyrata; At, Arabidopsis thaliana; Atr, Amborella trichopoda; Cgr, Capsella grandiflora; Cpa, Carica papaya; Cru, Capsella rubella;Bst, Boechera stricta; Bra, Brassica rapa; Esa, Eutrema salsugineum; Ppa, Physcomitrella patens; Smo, Selaginella moellendorffii; Tha, Tarenaya has-sleriana. It should be noteworthy that AtTPS20 is a pseudogene in Arabidopsis thaliana (Col-0 ecotype). B, Amino acid sequence alignment of fourArabidopsis thaliana clustered sesterTPSs and three plant TPSs with solved crystal structures. Tobacco 5-epi-aristolochene synthase (TEAS, Protein DataBank entry 5EAS), (4S)-limonene synthase from spearmint ((4S)-LS, Protein Data Bank entry 20NH), and 1,8-cineole synthase from Salvia fruticosa (Sf-CinS1, Protein Data Bank entry 2J5C). Sixteen residues lining the active site, mainly based on the 5EAS structure information, are denoted by asterisks. C,Substrate specificities of native and mutated Arabidopsis sesterTPSs. All four sesterTPSs showed specific sesterTPS (GFPP, C25) activity; however,mutagenized variants lost sesterTPS activity. Notably, TPS18G328W, TPS19G325Wand TPS25G328W showed monoTPS (GPP, C10), sesquiTPS (FPP, C15), ordiTPS (GGPP, C20) activities.

4 Chen, Q., et al. Sci China Life Sci

Page 5: Recently duplicated sesterterpene (C25) gene clusters in …bailab.genetics.ac.cn/pdf/Chen-2019-SCLS.pdf · 2019-05-06 · the effects of PSMs, especially those evolutionarily new

which use geranyl diphosphate (GPP, C10), farnesyl dipho-sphate (FPP, C15) and/or geranylgeranyl diphosphate(GGPP, C20) as substrates (Chen et al., 2011). Importantly,we found that 16 out of 28 members of the sesterTPS cladeused geranylfarnesyl diphosphate (GFPP, C25) as a substrate(Figure 2A and Supplemental Figure 1), suggesting thatfunctional sesterTPSs in Brassicaceae utilizing GFPP mayevolve from TPSs utilizing GPP/FPP/GGPP.To explore the evolution of sesterTPS in Brassicaceae

plants, we scanned for conserved sites under recent positiveselection in the sesterTPS clade (47 TPS genes were iden-tified from eight Brassicaceaen genomes, see MATERIALSAND METHODS for detailed information) (Gan et al.,2016). Interestingly, only a single site (G328 (Glycine) inTPS18/TPS25, G325 in TPS19, and P328 (Proline) inTPS30) showed significant signals of selection (adjusted P-value<0.004), suggesting that positive selection was a driv-ing force for the functional diversification of plant ses-terTPS. Moreover, we aligned four clustered sesterTPSs inArabidopsis thaliana (TPS18/TPS19/TPS25/TPS30) tomonoTPSs and sesquiTPS with crystal structure (Starks etal., 1997; Kampranis et al., 2007; Srividya et al., 2015) toexplore the enzymatic mechanism for sesterTPS substratespecificity, given that all known plant TPS-a members sharea similar active-site scaffold (Christianson, 2017). Among 16conserved residues in substrate binding and catalysis, thetryptophan (W) residue (position 273 in 5EAS; Figure 2B) inmonoTPS and sesquiTPS proteins was replaced with G inTPS18/TPS19/TPS25 or P in TPS30 (Figure 2B). Togetherwith positive selection result, this residue position (G328 inTPS18 and TPS25, G325 in TPS19 and P328 in TPS30)played a critical role during the evolutionary acquisition ofsesterTPS activity among Brassicaceae plants. We hypothe-sized that the small side chains of G and P likely provide alarge active-site cavity for the larger GFPP substrate. To testthis hypothesis, we mutated G328 in TPS18 and TPS25,G325 in TPS19 and P328 in TPS30 toW. All four sesterTPSsshowed specific activity towards GFPP (C25); however,mutagenized variants lost sesterTPS activity. More im-portantly, TPS18G328W, TPS19G325Wand TPS25G328W showedmonoTPS (GPP, C10), sesquiTPS (FPP, C15) or diTPS(GGPP, C20) activities (Figure 2C and Figures S2–S5 inSupporting Information).

Tissue specificity of sesterterpenes in Arabidopsis thali-ana

qPCR analysis revealed that TPS25 and TPS30 were pre-dominantly expressed in roots, while TPS18 and TPS19 weremainly expressed in flower/silique tissue. The correspondingGFPPSs showed similar degrees of tissue specificity as theclustered sesterTPS genes (Figure 3A). To test the sub-cellular localization of four clustered sesterTPSs, we ex-

pressed sesterTPS:GFP in protoplasts. The GFP signals forall four clustered sesterTPSs were detected exclusively inchloroplasts (Figure 3B), suggesting that plant sesterterpenes(C25) are derived from the plastid-localized 2-C-methyler-ythritol 4-phosphate (MEP) pathway rather than the cytosol-localized mevalonic acid (MVA) pathway.To test whether Arabidopsis plants produce sesterterpe-

noids, we performed sesterterpenoid-profiling of eight Ara-bidopsis organs using gas chromatography-triplequadrupole-mass spectrometry (GC-QQQ-MS). We foundthat Compound 3 (TPS25) and Compound 4 (TPS30) spe-cifically accumulated in root tissue (Figure 3C). Interest-ingly, we found that old roots had two-fold more Compound3 (44.31±1.17 vs. 17.83±0.66 µg g−1 fresh weight (F.W.),P<0.001, student-test) and 9-fold more Compound 4 (20.55±0.69 vs. 2.18±0.093 µg g−1 F.W., P<0.001, student-test)comparing with young roots (Figure 3C). We obtainedArabidopsis mutants defective in TPS25 and TPS30 (tps25-1, tps25-2, tps30-1, tps30-2, two double mutants: tps25/tps30-1, and tps25/tps30-2; Figures S6–S9 in SupportingInformation). Compounds 3 and 4 were not detected in thecorresponding TPS25 and TPS30 mutants (Figure 3D andFigure S10 in Supporting Information). Based on these re-sults, we concluded that TPS25 and TPS30 encode functionalbona fide sesterTPS in Arabidopsis thaliana.

Sesterterpenoids modulate the composition of root mi-crobiota

Due to the root specificities of Compounds 3 and 4 and theincrease in their accumulation with time throughout rootdevelopment, we tested the effects of these sesterterpenoidson root microbiota. We performed 16S rRNA gene profilingto compare the root bacterial microbiota compositions fromwild-type (Col-0) plants and the TPS25 and TPS30 ses-terTPS mutants (tps25-1, tps25-2, tps30-1, tps30-2, tps25/tps30-1, tps25/tps30-2) grown in natural soil (Chang-pingFarm in Beijing, China) under climate-controlled conditions.For each plant genotype, we performed three independentbiological replicates. Each biological replication was testedin 4–5 individual pots, each containing four plants. The Col-0 and sesterTPS mutant plants appeared healthy when har-vested (Figure S11 in Supporting Information).We found that loss-of-function mutants of the TPS25 and

TPS30 genes affected root microbiota assembly. Constrainedordination analysis revealed significant differences in theroot microbiota between Col-0 and the various sesterTPSmutants (Figure 4A, 9.28% constrained variance, canonicalanalysis of principal coordinates; Figure S12 in SupportingInformation). A hierarchical clustering analysis (post hocBray-Curtis distance) also showed that the microbiota ofCol-0 and sesterTPS single and double mutants each formeddistinctive clusters (Figure 4A and B). Importantly, the ef-

5Chen, Q., et al. Sci China Life Sci

Page 6: Recently duplicated sesterterpene (C25) gene clusters in …bailab.genetics.ac.cn/pdf/Chen-2019-SCLS.pdf · 2019-05-06 · the effects of PSMs, especially those evolutionarily new

fects of tps25, tps30, and double mutants on the root mi-crobiota composition were all validated in two independentmutant lines. The differential bacterial operational taxo-nomic units (OTUs) between mutants and Col-0 showedsignificant similarity, with overlaps of 44%–79% in two in-dependent lines of each gene (Figure 4C). The differentially

abundant OTUs mainly belonged to Proteobacteria, Acti-nobacteria, Firmicutes, Acidobacteria, Planctomycetes, andVerrucomicrobita (Figure 4D). These differences were foundat a lower taxonomic OTU level, but not at a higher taxo-nomic phylum level (Figures S13 and S14 in SupportingInformation). Notably, TPS25 and TPS30 deficiencies af-

C

Con

cent

ratio

n (µ

g/g

F.W

.)

Seedli

ng ro

ot

Seedli

ng le

afRoo

t

Rosett

e lea

f

Caulin

e lea

fStem

Flower

Silique

Compound 1

Compound 2

Compound 3

Compound 4

Col-0

tps25

-1

tps30

-1

tps25

-2

tps30

-2

tps25

/tps3

0-1

tps25

/tps3

0-2

0

10

20

Compound 3

Compound 4

0

20

40

60

80

B Bright Green Red Merged

TPS18-GFP

TPS19-SP-GFP

Free GFP

TPS25-GFP

TPS30-GFP

A

Rel

ativ

e ex

pres

sion

Root

Stem

Rosett

e lea

f

Caulin

e lea

f

Flower

Silique

Seedli

ng ro

ot

Seedli

ng le

af

0

0.002

0.004

0.006

0.008

0

0.003

0.006

0.009

0.012

0

0.003

0.006

0.009

0.012

0

0.004

0.008

0.012

0.016 TPS18

GFPPS1

TPS19

GFPPS2

TPS25

GFPPS3

TPS30

GFPPS4

Root

Stem

Rosett

e lea

f

Caulin

e lea

f

Flower

Silique

Seedli

ng ro

ot

Seedli

ng le

af

Root

Stem

Rosett

e lea

f

Caulin

e lea

f

Flower

Silique

Seedli

ng ro

ot

Seedli

ng le

afRoo

tStem

Rosett

e lea

f

Caulin

e lea

f

Flower

Silique

Seedli

ng ro

ot

Seedli

ng le

af

DC

once

ntra

tion

(µg/

g F.

W.)

Figure 3 Characterization of Four sesterTPSs in Arabidopsis. A, RT-qPCR analysis of two GFPPS-sesterTPS clustered genes in different organs ofArabidopsis plants. Samples were harvested from 8-week-old plants and 2-week-old seedlings. Transcript levels are expressed relative to the Actin2 gene(n=3). B, Subcellular localization of TPS18-GFP, TPS19-SP-GFP, TPS25-GFP and TPS30-GFP in Arabidopsis leaf mesophyll protoplasts revealed by laserconfocal microscopy. Chloroplasts are revealed by red chlorophyll autofluorescence. Scale bars=10 μm. C, Sesterterpene profiling (Compounds 1 to 4) ineight different organs of Arabidopsis thaliana. D, Sesterterpene profiles in roots of single and double mutants of TPS25 and TPS30. Error bars represent theSE (n=3). F.W., fresh weight.

6 Chen, Q., et al. Sci China Life Sci

Page 7: Recently duplicated sesterterpene (C25) gene clusters in …bailab.genetics.ac.cn/pdf/Chen-2019-SCLS.pdf · 2019-05-06 · the effects of PSMs, especially those evolutionarily new

fected root microbiota in similar patterns based on two linesof evidence: first, the root microbiota of the tps25 and tps30mutants formed closely related clusters (Figure 4A and B);second, the differentially abundant OTUs between Col-0 andtps25 roots showed a significant overlap with those betweenCol-0 and tps30 mutants, especially among the enrichedOTUs (Figure 4E and Figure S15 in Supporting Informa-tion). This trend was further confirmed by the enriched ordepleted OTUs in tps25/tps30mutants (Figure 4E). Together,our results demonstrate that TPS25 and TPS30 affect rootmicrobiota assembly; moreover, the extensive overlap of the

bacterial taxa in both mutants strongly suggests that theserecently duplicated sesterterpene gene clusters influence rootmicrobiota assembly through related mechanisms. Onepossible explanation is that TPS25 product (Compound 3)and TPS30 product (Compound 4) share a similar chemicalstructure, both belong to pentacyclic structure: Compound 3is a 5/5/5/6/5 pentacyclic sesterterpene and Compound 4 is a5/4/7/6/5 pentacyclic sesterterpene (Figure 1A).Land plants continually make new chemical molecules for

fitness due to the never-ending selective pressure from plant-environment interactions (Kliebenstein and Osbourn, 2012).

−0.2

−0.1

0.0

0.1

−0.3 −0.2 −0.1 0.0 0.1 0.2CPCoA 1 (26.24%)

CP

CoA

2 (2

1.22

%)

Time

T1

T2

T3

Genotype

tps25/tps30-1

tps25/tps30-2

tps25-1

tps25-2

tps30-1

tps30-2

Col-0

9.28% of variance; P <0.001A

38 3444

tps25-1 tps25-2

3123 26

5421 42

5212 17

29 2726

36 3024

152

7

5

4

4

11

Enr

iche

dD

eple

ted

0.00

0.05

0.1

0.15

0.2

0.25

0.3

B

44

26

42

17

26

24

tps30-2tps30-1 tps25/tps30-1

1110

8

1

22

2

1

tps25 tps30

tps25/tps30

C D tps25 tps30 tps25/tps30

Hclust with euclidean

Phylum

Acidobacteria

Actinobacteria

Bacteroidetes

Firmicutes

Low Abundance

Planctomycetes

Proteobacteria

Verrucomicrobia

Etps25 tps30

tps25/tps30

Enriched Depleted

Group1 Group2 Group3

tps25/tps30-2

Enr

iche

dD

eple

ted

Figure 4 Loss-of-function Mutations in the Sesterterpenoid Biosynthesis Genes TPS25 and TPS30 Result in Similar Effects on Root Microbiota Assembly.A, Constrained principle coordination analysis (CPCoA) based on Bray-Curtis dissimilarity for 16S rRNA gene profiling data showing that TPS25 and TPS30single mutants (tps25-1, tps25-2, tps30-1, tps30-2) and double mutants (tps25/tps30-1 and tps25/tps30-2) assembled distinct root microbiota comparing withwild-type plants (Col-0). Importantly, similar patterns were demonstrated with two independent mutant lines for TPS25, TPS30 and double mutants.Independent biological experiments (T1, T2, and T3) are indicated by shapes. One biological replicate for each genotype consisted of root organ materialsfrom 4-5 independent pots. The total replicate numbers are as follows: Col-0 (n=14); tps25-1 (n=15); tps25-2 (n=15); tps30-1 (n=15); tps30-2 (n=14); tps25/tps30-1 (n=15); tps25/tps30-2 (n=13). B, Hierarchical clustering of post hoc Bray-Curtis distances of root microbiota from Col-0, tps25, tps30 and tps25/tps30 mutants clusters in 3 groups. Group 1 mainly contains Col-0 samples, group 2 mainly contains TPS25 and TPS30 single mutants (tps25-1, tps25-2,tps30-1, tps30-2), and group 3 mainly contains double mutants (tps25/tps30-1 and tps25/tps30-2). C, Two independent lines of TPS25 mutants (tps25-1 andtps25-2), TPS30 (tps30-1 and tps30-2) mutants and double mutants (tps25/tps30-1 and tps25/tps30-2) showed significant overlaps of enriched and depletedOTUs compared with Col-0, validating the sesterTPS function to modulate root microbiota composition. D, Phylum-level distributions of bacterial OTUsidentified as either enriched or depleted in each mutant comparing with Col-0. The number of total differential OTUs is indicated in the centre. E, Overlap ofenriched or depleted OTUs in tps25 and tps30 single mutants and tps25/tps30 double mutants. Remarkably, the overlapping proportion of enriched OTUs invarious mutant plants was significantly higher than that of depleted OTUs.

7Chen, Q., et al. Sci China Life Sci

Page 8: Recently duplicated sesterterpene (C25) gene clusters in …bailab.genetics.ac.cn/pdf/Chen-2019-SCLS.pdf · 2019-05-06 · the effects of PSMs, especially those evolutionarily new

In this study, we demonstrate that the recently duplicatedGFPPS-sesterTPS clusters are responsible for sesterterpeneproduction in Arabidopsis thaliana. We further provide ge-netic evidence to demonstrate that two root-specific ses-terterpenes have significant impacts on root microbiotaassembly (Figure 4A, 9.28% constrained variance). Our re-sults suggest that the dynamic processes of duplication andfunctional divergence of metabolic genes (or gene clusters inthis study) endow host plants with a powerful and flexiblemechanism to control root microbiota. In addition to ses-terterpenoids (C25), plants produce a myriad of other ter-penoids (C10, C15, C20, C30, etc.) and other classes ofPSMs (e.g., alkaloids, polyphenols, and other compounds)(Hartmann, 2007). Thus, it will be important to further testthe effects of different classes of PSMs on controlling rootmicrobiota by plant hosts. Our work opens a new avenue forthe potential application of PSMs in modulating the rootmicrobiota with the long-term goal of enhancing plant healthand productivity.

MATERIALS AND METHODS

Plant materials and chemicals

The Arabidopsis thaliana lines (Col-0 and various transgenicplants under Col-0 background) were grown on soil in agrowth room or on 1/2 solidified MS medium plates in agrowth chamber at 22 °C with a 16-h light/8-h dark photo-period. The T-DNA insertion mutants, tps25-1(SALK_123505), tps25-2 (SALK_020266), and tps30-1(CS805958) were ordered from ABRC (http://abrc.osu.edu/).The tps30-2 and tps30-3 mutants were generated using theCRISPR/Cas9 technique. The tps30/tps25-1 and tps30/tps25-2 double mutants were generated by mutating TPS30using the CRISPR/Cas9 technique in the tps25-1 mutantbackground. All authentic reference standards used in thisstudy were purchased from Sigma. All DNA constructsgenerated in this study were verified by sequencing.

Heterologous expression in Escherichia coli, TPS proteinpurification, Terpene synthase assays, and enzymecharacterization

All truncated sesterTPS genes (removal of signal peptidesequence) were cloned into the pMAL-c2x vector (NewEngland Biolabs). An MBP tag was added at the C termini ofthe truncated TPS genes. Construct transformation into E.coli Rosetta (DE3), protein induction, and MBP-tagged TPSprotein purification were carried out by following standardprotocols. Quantification and evaluation of the relative purityof the recombinant proteins was performed using SDS-PAGE with BSA (Sigma) as a standard. Assays testingwhether or not the four Arabidopsis sesterTPSs could re-

cognize GPP, FPP, or GGPP as substrates in vitro employed apreviously described SPME-GC-MS method for analyzingvolatile terpenes (Chen et al., 2015). It should be noteworthythat all other cDNA sequences of putative BrassicaceaesesterTPSs were synthesized by TaKaRa Bio.

RT-qPCR analysis and subcellular localization

RNA extraction, reverse transcription reaction, and RT-qPCR were performed as described previously (Wang et al.,2008; Li et al., 2015). The Actin 2 (At3g18780) gene waschosen as internal standard gene due to its relatively stableexpression across all tested samples. GFP fusion proteinconstruct generation (pJIT163-hGFP vector), Arabidopsisleaf protoplast preparation and transformation, and laserscanning confocal microscope based image collection wereall performed as described previously (Xu et al., 2013).

GC-MS Analysis

For quantification of sesterterpenes in Arabidopsis usingGC-QQQ-MS (Agilent 7890B/7000C), 100 or 200 mg offresh plant tissue was collected and ground into a finepowder in liquid nitrogen. The resulting samples were ex-tracted with 8 mL of EtOAc for 1 h, the supernatant wasdried by nitrogen gas and re-dissolved into 200 µL ofEtOAc. 1 µL of analytical sample was loaded on GC-QQQ-MS for analysis and quantification. The initial oven tem-perature was held at 60 °C for 1 min, then ramped at10 °C/min to 325 °C, and kept at 325 °C for 10 min. Theinlet temperature was 270 °C. Positive mode EI ionizationwas used. The temperatures of the ion source and quadrupolewere set at 230 and 150 °C, respectively. Mass spectra wereacquired within a scanning range of 50−600 m/z. The MRMsettings for Compounds 1–4 are described in Table S1 inSupporting Information. The endogenous amounts of thesefour sesterterpenes were quantified based on a standard curvegenerated using the Compound 4 that we purified.

Phylogenetic and Phylogenomic Synteny Analysis

The sequences of Arabidopsis TPS proteins and the TPSproteins from 11 other plant species were extracted from theTAIR (http://www.arabidopsis.org) and Phytozome 11 da-tabases (http://www.phytozome.net). Twenty-seven Tar-enaya TPS sequences were extracted from http://genomevolution.org/CoGe/CoGeBlast.pl. A maximum like-lihood tree was constructed using MEGA6 software (Tamuraet al., 2013). Shared synteny analysis of GFPPS-TPS-P450gene clusters using 10 representative Brassicaceae andCleomaceae genomes was carried out as described pre-viously (Zhao et al., 2017).For molecular evolutionary analysis of sesterTPS, total 47

8 Chen, Q., et al. Sci China Life Sci

Page 9: Recently duplicated sesterterpene (C25) gene clusters in …bailab.genetics.ac.cn/pdf/Chen-2019-SCLS.pdf · 2019-05-06 · the effects of PSMs, especially those evolutionarily new

sesterTPS genes were identified in eight Brassicaceaengenomes, including Cardamine hirsuta, Capsella rubella,Arabidopsis thaliana, Arabidopsis lyrata, Aethionema ara-bicum, Brassica rapa, Schrenkiella parvula and Eutremasalsugineum, using a reciprocal BLAT and Markov cluster-ing (MCL) based method (Gan et al., 2016). Coding se-quences were aligned based on the respective multiple aminoacid sequence alignment generated with MAFFT (v6.851b)(Katoh et al., 2002). Subsequently, the resulting multiplecoding sequence alignment (MSA) was manually curatedand two un-aligned blocks from the sequence of geneaet_18203 of species A. arabicum were removed. It was thensubmitted to phylogenetic maximum likelihood reconstruc-tion (Murrell et al., 2012), with the resulting tree rooted usingthe midpoint-rooting method. Based on this phylogeny andthe MSA, a mixed effects model of evolution (Murrell et al.,2012) was fitted and queried for amino acids showing signsof Darwinian selection.

Bacterial 16S rRNA Gene Profiling of Arabidopsis Rootsand unplanted control Soil

Arabidopsis thaliana plants were grown in natural soil (40°5′49″, 116°24′44″E, Beijing, China) under greenhouse condi-tions (22 °C, 50% humidity, 16-h light/8-h dark photo-period). Roots of 5 week-old plants (just after bolting) wereharvested, washed twice in water, washed twice in PBS,shaken twice at 180 r min−1 for 20 min, and frozen at −80 °C(Bulgarelli et al., 2012). Frozen root and corresponding bulksoil samples were homogenized using a Bertin Precellys®24tissue lyser (four times, at 6,200 r min−1 for 30 s each time).DNAwas extracted using a FastDNA SPIN Kit for soil (MPBiomedicals) according to product instructions. The DNAconcentration was measured using a PicoGreen dsDNAAssay Kit (Life Technologies), and diluted to 3.5 ng µL−1.16S rRNA genes were amplified using primers targeting thevariable regions V5-V7 (799F and 1193R) (Bulgarelli et al.,2012). Each sample was amplified in triplicate using 30 µLreactions containing 3 µL template, 0.75 U PrimeSTAR HSDNA Polymerase, 1× PrimSTAR Buffer (Takara),0.2 mmol L−1 dNTPs, 10 pmol L−1 of barcoded forward andreverse primers with linker regions. The PCR conditionswere as below. Denaturation at 98 °C for 30 s, followed by25 cycles of 98 °C for 10 s, 55 °C for 15 s and 72 °C for1 min, and a final elongation at 72 °C for 5 min. If no visibleamplication from negative control (no template added), tri-plicate PCR products were pooled and purified using anAMPure XP Kit (Beckman Coulter), and measured by Na-nodrop (NanoDropTM 2000C, Thermo Scientific), and ad-justed 10 ng µL−1. During the second step PCR, 3 µLtemplate was amplified in triplicates by 0.75 U PrimeSTARHS DNA Polymerase, 1× PrimSTAR Buffer (Takara),0.2 mmol L−1 dNTPs, 10 pmol L−1 of barcoded forward and

reverse primers with linker regions. The PCR conditionswere as below. Denaturation at 98 °C for 30 s, followed by 8cycles of 98 °C for 10 s, 55 °C for 15 s and 72 °C for 1 min,and a final elongation at 72 °C for 5 min. PCR products oftriplicates were subsequently combined, purified, and se-quenced with the Illumina HiSeq 2500 Ultra-High-Throughput platform.

Bioinformatics Analysis

The 16S rRNA gene sequences were processed using QIIME1.9.1 (Caporaso et al., 2010b), USEARCH 8.0 (Edgar, 2010),and in-house scripts as described previously (Zhang et al.,2018). QIIIME mapping files are provided in External Da-tabase S1). Paired-end Illumina reads were first evaluated byFastQC. Forward and reverse reads were joined by join_-paired_ends.py script. Barcodes were extracted by extra-ct_barcodes.py script. Merged reads were checked to removelow quality sequences and demultiplexed by split_librar-ies_fastq.py script (using–max_bad_run_length 3–min_-per_read_length_fraction 0.75–max_barcode_errors 0-q 19).Primers were removed by cutadapt 1.9.1. After dereplicationand removal of singletons, high-quality reads were clusteredinto OTUs by the cluster_otus function of USEARCH at97% identity (Edgar, 2013). Chimeric sequences were re-moved by the UCHIME algorithm, and the “gold” database(http://drive5.com/uchime/gold.fa) was used as a reference(Edgar et al., 2011). Another filter step was performed toremove non-bacterial 16S sequences by aligning all OTUrepresentative sequences to the Greengenes database usingPyNAST (align_seqs.py script). Unaligned 16S sequenceswere discarded at a threshold of 75% identity using the fil-ter_fasta.py script (DeSantis et al., 2006; Caporaso et al.,2010a). Based on the high confidence 16S representativesequences, an OTU table was generated by USEARCH(-usearch_global and uc2otutab.py scripts). The taxonomy ofthe representative sequences was classified with the RDPclassifier (Wang et al., 2007).Phylogenetic tree construction: OTU representative se-

quences were aligned by Clustal Omega, and conserved re-gions were filtered with the filter_alignment.py script. Aphylogenetic tree was constructed with the make_phylogeny.py script with the FastTree algorithm. Alpha diversity(Shannon diversity index, number of observed OTUs, andchao1 index) were calculated after resampling each sampleto 50,000 reads (QIIME script single_rarefaction.py and al-pha_diversity.py). Beta diversity (Bray-Curtis and weightedand unweighted UniFrac) were calculated for a cumulativesum scaling (CSS) transformed OTU table (QIIME scriptnormalize_table.py and beta_diversity.py).The principal coordinate analysis (PCoA) was performed

using classical multidimensional scaling of beta diversitydistance matrices with the cmdscale function in R. Con-

9Chen, Q., et al. Sci China Life Sci

Page 10: Recently duplicated sesterterpene (C25) gene clusters in …bailab.genetics.ac.cn/pdf/Chen-2019-SCLS.pdf · 2019-05-06 · the effects of PSMs, especially those evolutionarily new

strained PCoAwas implemented using the capscale functionin the vegan package, and plant genotypes were used as aconstrained factor. The genotype effect on root microbiotacomposition was determined by a permutation-based ANO-VA test implemented with the anova.cca function, using5,000 permutations.Analysis of the differential OTU abundance and taxa were

performed using a negative binomial generalized linearmodel in the edgeR package (Robinson et al., 2010). We firstobtained normalization factors with the calcNormFactorsfunction and then used the estimateGLMCommonDisp andestimateGLMTagwiseDisp functions to estimate, commonand tag-wise dispersions for a Negative Binomial General-ized Linear Model (GLM). We fitted a negative binomialgeneralized log-linear model with OTU read counts by theglmFit function to test differential OTU abundance; corre-sponding P-values were corrected for multiple tests using afalse discovery rate (FDR) set at 0.05 (Benjamini andHochberg, 1995).Scatter plots, boxplots, and pie charts were generated using

the ggplot2 package. Venn diagrams were visualized with theVenndiagram package (Chen and Boutros, 2011). Hier-archical clustering of samples was based on Euclidean dis-tance, and was visualized with the ggdendro package.

Accession Numbers

Bacterial 16S rRNA sequencing data were uploaded to theNCBI SRA database, with accession PRJNA400677. Scriptswere deposited in https://github.com/microbiota/Chen2019SCLS. All primers used in this study are listed inTable S2 in Supporting Information.

Compliance and ethics The author(s) declare that they have no conflictof interest.

Acknowledgements This work was financially supported by the PriorityResearch Program of the Chinese Academy of Sciences (ZDRW-ZS-2019-2and QYZDB-SSW-SMC021), the Strategic Priority Research Program of theChinese Academy of Sciences (XDA08000000 and XDB11020700), theNational Program on Key Basic Research Projects (2013CB127000), andthe State Key Laboratory of Plant Genomics of China (2016A0219-11 andSKLPG2013A0125-5). We thank Dr. Jay D Keasling (University of Cali-fornia, Berkeley) for providing the pMBIS plasmid.

References

Aubourg, S., Lecharny, A., and Bohlmann, J. (2002). Genomic analysis ofthe terpenoid synthase (AtTPS) gene family of Arabidopsis thaliana.Mol Genets Genom 267, 730–745.

Benjamini, Y., and Hochberg, Y. (1995). Controlling the false discoveryrate: A practical and powerful approach to multiple testing. J RStatistical Soc-Ser B 57, 289–300.

Berendsen, R.L., Pieterse, C.M.J., and Bakker, P.A.H.M. (2012). Therhizosphere microbiome and plant health. Trends Plant Sci 17, 478–486.

Bulgarelli, D., Rott, M., Schlaeppi, K., Ver Loren van Themaat, E.,

Ahmadinejad, N., Assenza, F., Rauf, P., Huettel, B., Reinhardt, R.,Schmelzer, E., et al. (2012). Revealing structure and assembly cues forArabidopsis root-inhabiting bacterial microbiota. Nature 488, 91–95.

Caporaso, J.G., Bittinger, K., Bushman, F.D., DeSantis, T.Z., Andersen, G.L., and Knight, R. (2010a). PyNAST: A flexible tool for aligningsequences to a template alignment. Bioinformatics 26, 266–267.

Caporaso, J.G., Kuczynski, J., Stombaugh, J., Bittinger, K., Bushman, F.D.,Costello, E.K., Fierer, N., Peña, A.G., Goodrich, J.K., Gordon, J.I., et al.(2010b). QIIME allows analysis of high-throughput communitysequencing data. Nat Methods 7, 335–336.

Castrillo, G., Teixeira, P.J.P.L., Paredes, S.H., Law, T.F., de Lorenzo, L.,Feltcher, M.E., Finkel, O.M., Breakfield, N.W., Mieczkowski, P., Jones,C.D., et al. (2017). Root microbiota drive direct integration ofphosphate stress and immunity. Nature 543, 513–518.

Chen, F., Tholl, D., Bohlmann, J., and Pichersky, E. (2011). The family ofterpene synthases in plants: a mid-size family of genes for specializedmetabolism that is highly diversified throughout the kingdom. Plant J66, 212–229.

Chen, H., and Boutros, P.C. (2011). VennDiagram: A package for thegeneration of highly-customizable Venn and Euler diagrams in R. BMCBioinf 12, 35.

Chen, Q., Fan, D., and Wang, G. (2015). Heteromeric geranyl(geranyl)diphosphate synthase is involved in monoterpene biosynthesis inArabidopsis flowers. Mol Plant 8, 1434–1437.

Christianson, D.W. (2017). Structural and chemical biology of terpenoidcyclases. Chem Rev 117, 11570–11648.

DeSantis, T.Z., Hugenholtz, P., Larsen, N., Rojas, M., Brodie, E.L., Keller,K., Huber, T., Dalevi, D., Hu, P., and Andersen, G.L. (2006).Greengenes, a chimera-checked 16S rRNA gene database andworkbench compatible with ARB. Appl Environ Microbiol 72, 5069–5072.

Edgar, R.C. (2010). Search and clustering orders of magnitude faster thanBLAST. Bioinformatics 26, 2460–2461.

Edgar, R.C. (2013). UPARSE: Highly accurate OTU sequences frommicrobial amplicon reads. Nat Methods 10, 996–998.

Edgar, R.C., Haas, B.J., Clemente, J.C., Quince, C., and Knight, R. (2011).UCHIME improves sensitivity and speed of chimera detection.Bioinformatics 27, 2194–2200.

Foster, K.R., Schluter, J., Coyte, K.Z., and Rakoff-Nahoum, S. (2017). Theevolution of the host microbiome as an ecosystem on a leash. Nature548, 43–51.

Gan, X., Hay, A., Kwantes, M., Haberer, G., Hallab, A., Ioio, R.D.,Hofhuis, H., Pieper, B., Cartolano, M., Neumann, U., et al. (2016). TheCardamine hirsuta genome offers insight into the evolution ofmorphological diversity. Nat Plants 2, 16167.

Hartmann, T. (2007). From waste products to ecochemicals: Fifty yearsresearch of plant secondary metabolism. Phytochemistry 68, 2831–2846.

Huang, A.C., Kautsar, S.A., Hong, Y.J., Medema, M.H., Bond, A.D.,Tantillo, D.J., and Osbourn, A. (2017). Unearthing a sesterterpenebiosynthetic repertoire in the Brassicaceae through genome miningreveals convergent evolution. Proc Natl Acad Sci USA 114, E6005–E6014.

Kampranis, S.C., Ioannidis, D., Purvis, A., Mahrez, W., Ninga, E.,Katerelos, N.A., Anssour, S., Dunwell, J.M., Degenhardt, J., Makris, A.M., et al. (2007). Rational conversion of substrate and productspecificity in a Salvia monoterpene synthase: Structural insights intothe evolution of terpene synthase function. Plant Cell 19, 1994–2005.

Katoh, K., Misawa, K., and Kuma, K.I. (2002). MAFFT: A novel methodfor rapid multiple sequence alignment based on fast Fourier transform.Nucl Acids Res 30, 3059–3066.

Kliebenstein, D.J., and Osbourn, A. (2012). Making new molecules—evolution of pathways for novel metabolites in plants. Curr Opin PlantBiol 15, 415–423.

Leach, J.E., Triplett, L.R., Argueso, C.T., and Trivedi, P. (2017).Communication in the phytobiome. Cell 169, 587–596.

Lebeis, S.L., Paredes, S.H., Lundberg, D.S., Breakfield, N., Gehring, J.,

10 Chen, Q., et al. Sci China Life Sci

Page 11: Recently duplicated sesterterpene (C25) gene clusters in …bailab.genetics.ac.cn/pdf/Chen-2019-SCLS.pdf · 2019-05-06 · the effects of PSMs, especially those evolutionarily new

McDonald, M., Malfatti, S., Glavina del Rio, T., Jones, C.D., Tringe, S.G., et al. (2015). Salicylic acid modulates colonization of the rootmicrobiome by specific bacterial taxa. Science 349, 860–864.

Li, W., Zhang, F., Chang, Y., Zhao, T., Schranz, M.E., and Wang, G. (2015).Nicotinate O-glucosylation is an evolutionarily metabolic traitimportant for seed germination under stress conditions in Arabidopsisthaliana. Plant Cell 27, 1907–1924.

Lundberg, D.S., Lebeis, S.L., Paredes, S.H., Yourstone, S., Gehring, J.,Malfatti, S., Tremblay, J., Engelbrektson, A., Kunin, V., Del Rio, T.G.,et al. (2012). Defining the core Arabidopsis thaliana root microbiome.Nature 488, 86–90.

Müller, D.B., Vogel, C., Bai, Y., and Vorholt, J.A. (2016). The plantmicrobiota: Systems-level insights and perspectives. Annu Rev Genet50, 211–234.

Murrell, B., Wertheim, J.O., Moola, S., Weighill, T., Scheffler, K., andKosakovsky Pond, S.L. (2012). Detecting individual sites subject toepisodic diversifying selection. PLoS Genet 8, e1002764.

Robinson, M.D., McCarthy, D.J., and Smyth, G.K. (2010). edgeR: ABioconductor package for differential expression analysis of digitalgene expression data. Bioinformatics 26, 139–140.

Schlaeppi, K., Dombrowski, N., Oter, R.G., Ver Loren van Themaat, E.,and Schulze-Lefert, P. (2014). Quantitative divergence of the bacterialroot microbiota in Arabidopsis thaliana relatives. Proc Natl Acad SciUSA 111, 585–592.

Shao, J., Chen, Q.W., Lv, H.J., He, J., Liu, Z.F., Lu, Y.N., Liu, H.L., Wang,G.D., and Wang, Y. (2017). (+)-Thalianatriene and (−)-retigeranin Bcatalyzed by sesterterpene synthases from Arabidopsis thaliana. OrgLett 19, 1816–1819.

Srividya, N., Davis, E.M., Croteau, R.B., and Markus Lange, B. (2015).Functional analysis of (4S)-limonene synthase mutants revealsdeterminants of catalytic outcome in a model monoterpene synthase.Proc Natl Acad Sci USA 112, 3332–3337.

Starks, C.M., Back, K., and Chappell, J. (1997). Structural basis for cyclicterpene biosynthesis by tobacco 5-epi-aristolochene synthase. Science277, 1815–1820.

Tamura, K., Stecher, G., Peterson, D., Filipski, A., and Kumar, S. (2013).

MEGA6: Molecular evolutionary genetics analysis Version 6.0. MolBiol Evol 30, 2725–2729.

Tholl, D., and Lee, S. (2011). Terpene specialized metabolism inArabidopsis thaliana. Arabidopsis Book 9, e0143.

van Dam, N.M., and Bouwmeester, H.J. (2016). Metabolomics in therhizosphere: Tapping into belowground chemical communication.Trends Plant Sci 21, 256–265.

Verbon, E.H., and Liberman, L.M. (2016). Beneficial microbes affectendogenous mechanisms controlling root development. Trends PlantSci 21, 218–229.

Vickers, C.E., Bongers, M., Liu, Q., Delatte, T., and Bouwmeester, H.(2014). Metabolic engineering of volatile isoprenoids in plants andmicrobes. Plant Cell Environ 37, 1753–1775.

Wang, C., Chen, Q., Fan, D., Li, J., Wang, G., and Zhang, P. (2016).Structural analyses of short-chain prenyltransferases identify anevolutionarily conserved GFPPS clade in Brassicaceae plants. MolPlant 9, 195–204.

Wang, G., Tian, L., Aziz, N., Broun, P., Dai, X., He, J., King, A., Zhao, P.X., and Dixon, R.A. (2008). Terpene biosynthesis in glandulartrichomes of hop. Plant Physiol 148, 1254–1266.

Wang, Q., Garrity, G.M., Tiedje, J.M., and Cole, J.R. (2007). NaiveBayesian classifier for rapid assignment of rRNA sequences into thenew bacterial taxonomy. Appl Environ MicroBiol 73, 5261–5267.

Xu, H., Zhang, F., Liu, B., Huhman, D.V., Sumner, L.W., Dixon, R.A., andWang, G. (2013). Characterization of the formation of branched short-chain fatty acid:CoAs for bitter acid biosynthesis in hop glandulartrichomes. Mol Plant 6, 1301–1317.

Zhang, J., Zhang, N., Liu, Y.X., Zhang, X., Hu, B., Qin, Y., Xu, H., Wang,H., Guo, X., Qian, J., et al. (2018). Root microbiota shift in ricecorrelates with resident time in the field and developmental stage. SciChina Life Sci 61, 613–621.

Zhao, T., Holmer, R., de Bruijn, S., Angenent, G.C., van den Burg, H.A.,and Schranz, M.E. (2017). Phylogenomic synteny network analysis ofMADS-box transcription factor genes reveals lineage-specifictranspositions, ancient tandem duplications, and deep positionalconservation. Plant Cell 29, 1278–1292.

SUPPORTING INFORMATION

Figure S1 Functional characterization of 16 plant sesterTPSs in E.coli system.

Figure S2 Terpene synthase assays of TPS18 and TPS18G328W analyzed by GC-MS.

Figure S3 Terpene synthase assays of TPS19 and TPS19G325W analyzed by GC-MS.

Figure S4 Terpene synthase assays of TPS25 and TPS25G328W analyzed by GC-MS.

Figure S5 Terpene synthase assays of TPS30 and TPS30P328W analyzed by GC-MS.

Figure S6 Characterization of TPS25 T-DNA insertion mutants.

Figure S7 Characterization of TPS30 T-DNA insertion mutant (tps30-1|CS805958).

Figure S8 tps30-2 and tps30/tps25-1 double mutants generated via CRISPR/Cas9.

Figure S9 tps30-3 and tps30/tps25-2 double mutants generated via CRISPR/Cas9.

Figure S10 Sesterterpene profiling in various tissues of Arabidopsis and different transgenic plants by GC-QQQ-MS.

Figure S11 Phenotypes of sesterTPS mutants and Col-0 when grown in natural soil (Changping farm in Beijing) under climate controlconditions.

Figure S12 Sample diversity (α/β-diversity) measurements among each genotype.

11Chen, Q., et al. Sci China Life Sci

Page 12: Recently duplicated sesterterpene (C25) gene clusters in …bailab.genetics.ac.cn/pdf/Chen-2019-SCLS.pdf · 2019-05-06 · the effects of PSMs, especially those evolutionarily new

Figure S13 Stack plot showing relative abundance distribution of the OTUs in phylum.

Figure S14 Taxonomic difference and overlap of specific differential OTUs between TPS mutants and Col-0.

Table S1 The MRM settings for four sesterterpene (C25) backbones analyzed by GC-QQQ-MS in this study

Table S2 The primers used in this study

The supporting information is available online at http://life.scichina.com and https://link.springer.com. The supportingmaterials are published as submitted, without typesetting or editing. The responsibility for scientific accuracy and contentremains entirely with the authors.

12 Chen, Q., et al. Sci China Life Sci