systematic analysis of nac transcription factors in ... · systematic analysis of nac transcription...

24
Submitted 8 April 2019 Accepted 7 October 2019 Published 5 November 2019 Corresponding author Zhanji Liu, [email protected] Academic editor Blanca Landa Additional Information and Declarations can be found on page 18 DOI 10.7717/peerj.7995 Copyright 2019 Liu et al. Distributed under Creative Commons CC-BY 4.0 OPEN ACCESS Systematic analysis of NAC transcription factors in Gossypium barbadense uncovers their roles in response to Verticillium wilt Zhanji Liu, Mingchuan Fu, Hao Li, Yizhen Chen, Liguo Wang and Renzhong Liu Key Laboratory of Cotton Breeding and Cultivation in Huang-Huai-Hai Plain, Ministry of Agriculture, Cotton Research Center of Shandong Academy of Agricultural Sciences, Jinan, China ABSTRACT As one of the largest plant-specific gene families, the NAC transcription factor gene family plays important roles in various plant physiological processes that are related to plant development, hormone signaling, and biotic and abiotic stresses. However, systematic investigation of the NAC gene family in sea-island cotton (Gossypium babardense L.) has not been reported, to date. The recent release of the complete genome sequence of sea-island cotton allowed us to perform systematic analyses of G. babardense NAC GbNAC ) genes. In this study, we performed a genome-wide survey and identified 270 GbNAC genes in the sea-island cotton genome. Genome mapping analysis showed that GbNAC genes were unevenly distributed on 26 chromosomes. Through phylogenetic analyses of GbNACs along with their Arabidopsis counterparts, these proteins were divided into 10 groups (I–X), and each contained a different number of GbNACs with a similar gene structure and conserved motifs. One hundred and fifty-four duplicated gene pairs were identified, and almost all of them exhibited strong purifying selection during evolution. In addition, various cis-acting regulatory elements in GbNAC genes were found to be related to major hormones, defense and stress responses. Notably, transcriptome data analyses unveiled the expression profiles of 62 GbNAC genes under Verticillium wilt (VW) stress. Furthermore, the expression profiles of 15 GbNAC genes tested by quantitative real-time PCR (qPCR) demonstrated that they were sensitive to methyl jasmonate (MeJA) and salicylic acid (SA) treatments and that they could be involved in pathogen-related hormone regulation. Taken together, the genome-wide identification and expression profiling pave new avenues for systematic functional analysis of GbNAC candidates, which may be useful for improving cotton defense against VW. Subjects Agricultural Science, Bioinformatics, Genomics, Mycology Keywords NAC transcription factor, Sea-island cotton, Genome-wide survey, Verticillium wilt INTRODUCTION The plant-specific NAC genes (N AM, no apical meristem; A TAF, Arabidopsis transcription activation factor; and C UC, cup-shaped cotyledon) form one of the largest families of transcription factors (Nuruzzaman, Sharoni & Kikuchi, 2013). Typically, NAC proteins How to cite this article Liu Z, Fu M, Li H, Chen Y, Wang L, Liu R. 2019. Systematic analysis of NAC transcription factors in Gossypium barbadense uncovers their roles in response to Verticillium wilt. PeerJ 7:e7995 http://doi.org/10.7717/peerj.7995

Upload: others

Post on 07-Mar-2020

12 views

Category:

Documents


0 download

TRANSCRIPT

Submitted 8 April 2019Accepted 7 October 2019Published 5 November 2019

Corresponding authorZhanji Liu, [email protected]

Academic editorBlanca Landa

Additional Information andDeclarations can be found onpage 18

DOI 10.7717/peerj.7995

Copyright2019 Liu et al.

Distributed underCreative Commons CC-BY 4.0

OPEN ACCESS

Systematic analysis of NAC transcriptionfactors in Gossypium barbadense uncoverstheir roles in response to VerticilliumwiltZhanji Liu, Mingchuan Fu, Hao Li, Yizhen Chen, Liguo Wang andRenzhong LiuKey Laboratory of Cotton Breeding and Cultivation in Huang-Huai-Hai Plain, Ministry of Agriculture,Cotton Research Center of Shandong Academy of Agricultural Sciences, Jinan, China

ABSTRACTAs one of the largest plant-specific gene families, the NAC transcription factor genefamily plays important roles in various plant physiological processes that are relatedto plant development, hormone signaling, and biotic and abiotic stresses. However,systematic investigation of the NAC gene family in sea-island cotton (Gossypiumbabardense L.) has not been reported, to date. The recent release of the completegenome sequence of sea-island cotton allowed us to perform systematic analyses ofG. babardense NAC GbNAC) genes. In this study, we performed a genome-wide surveyand identified 270 GbNAC genes in the sea-island cotton genome. Genome mappinganalysis showed that GbNAC genes were unevenly distributed on 26 chromosomes.Through phylogenetic analyses of GbNACs along with their Arabidopsis counterparts,these proteinswere divided into 10 groups (I–X), and each contained a different numberof GbNACs with a similar gene structure and conserved motifs. One hundred andfifty-four duplicated gene pairs were identified, and almost all of them exhibited strongpurifying selection during evolution. In addition, various cis-acting regulatory elementsin GbNAC genes were found to be related to major hormones, defense and stressresponses. Notably, transcriptome data analyses unveiled the expression profiles of62 GbNAC genes under Verticillium wilt (VW) stress. Furthermore, the expressionprofiles of 15GbNAC genes tested by quantitative real-time PCR (qPCR) demonstratedthat they were sensitive to methyl jasmonate (MeJA) and salicylic acid (SA) treatmentsand that they could be involved in pathogen-related hormone regulation. Takentogether, the genome-wide identification and expression profiling pave new avenues forsystematic functional analysis ofGbNAC candidates, whichmay be useful for improvingcotton defense against VW.

Subjects Agricultural Science, Bioinformatics, Genomics, MycologyKeywords NAC transcription factor, Sea-island cotton, Genome-wide survey, Verticillium wilt

INTRODUCTIONThe plant-specificNAC genes (NAM, no apical meristem; ATAF, Arabidopsis transcriptionactivation factor; and CUC, cup-shaped cotyledon) form one of the largest families oftranscription factors (Nuruzzaman, Sharoni & Kikuchi, 2013). Typically, NAC proteins

How to cite this article Liu Z, Fu M, Li H, Chen Y, Wang L, Liu R. 2019. Systematic analysis of NAC transcription factors in Gossypiumbarbadense uncovers their roles in response to Verticillium wilt. PeerJ 7:e7995 http://doi.org/10.7717/peerj.7995

harbor a highly conserved NAC domain at the N-terminal and a variable transcriptionalregulatory region (TR) at the C-terminal. The NAC domain can be further divided intofive subdomains (A–E) and functions as DNA binding, nuclear localization, and formationof homodimers or heterodimers, while the TR region is responsible for transcriptionregulation as either an activator or a repressor (Olsen et al., 2005). NAM, the first NACgene, was discovered in Petunia and functions in determining positions of shoot apicalmeristems and primordia (Souer et al., 1996). Since then, a large number of NAC geneshave been identified from diverse plant species (Nuruzzaman et al., 2010; Liu et al., 2019).

Accumulated evidences indicate that the functions of NAC genes are associatedwith almost every biological process in plants, such as leaf senescence (Fan et al., 2015;Zhao et al., 2016), secondary cell wall formation (Zhang et al., 2018), and hormonesignaling (Takasaki et al., 2015). Notably, a number of NAC genes, especially ATAFsubfamily members, were proven to serve as critical regulators in plant defense againstbiotic and abiotic stresses (Nuruzzaman, Sharoni & Kikuchi, 2013; Karanja et al., 2017). InArabidopsis, ANAC032 was induced by bacterial pathogen Pst (Pseudomonas syringae pv.tomato DC3000) infection, and SA and jasmonic acid (JA) treatments. Furthermore,transgenic Arabidopsis plants overexpressing ANAC032 showed strongly enhancedresistance to Pst, while the ANAC032 knockout mutant exhibited increased susceptibilityto Pst (Allu et al., 2016). In rice, the expressions of SNAC1 were enhanced under drought,salt, and cold stresses. Overexpression SNAC1 in rice resulted in increased tolerance todrought (Hu et al., 2006). Similarly, transgenic OsNAC111 showed improved resistance toblast fungus in rice by regulating the expression of several defense-related genes (Yokotaniet al., 2014).

Cotton is the most important natural fiber crop, and is also used as a food crop due tohigh levels of vegetable oil and protein in cottonseeds (Li et al., 2009). However, cottonproduction can be dramatically decreased by the occurrence of VW, a devastating vasculardisease caused by Verticillium dahliae (Sun et al., 2013). Typically, infected plants showleaf chlorosis, leaf shedding, vascular discoloration, wilting, and plant death. In general,upland cotton (Gossypium hirsutum), accounting for about 90% of annual world cottonproduction, is susceptible to VW, whereas sea-island cotton, accounting for approximately5% of annual world cotton output, is immune to VW. Thus, extensive efforts have beenmade to investigate the molecular mechanism of sea-island cotton resistance to VW(Zhang et al., 2015). Recently, a NAC gene GbNAC1 was identified from G. barbadense.Overexpression of GbNAC1 in Arabidopsis can significantly enhance resistance to VW,implying that G. barbadense NAC genes might play pivotal roles in biotic stress resistance(Wang et al., 2016). The availability of diploid and tetraploid cotton genomes has allowedscientists to identify the NAC gene family members at the genome-wide scale, such as 145genes in G. raimondii (Shang et al., 2013), 141 in G. arboreum (Shang et al., 2016; Fan etal., 2018), and 283 in G. hirsutum (Sun et al., 2018). However, systematic analysis of NACgenes in G. babardense has not been completed. G. babardense (AD2) and G. hirsutum(AD1) are allotetraploids, which evolve from transoceanic hybridization between anA-genome species immigrated from Africa and a native American D-genome species,while G. arboreum (A2) and G. raimondii (D5) are diploids resembling the A-genome

Liu et al. (2019), PeerJ, DOI 10.7717/peerj.7995 2/24

and D-genome progenitor, respectively (Hu et al., 2019). In this study, we performed agenome-wide survey and identified 270 GbNAC genes in the latest G. babardense genomereleased by Wang et al. (2019). These GbNAC genes were classified into 10 groups on thebasis of sequence similarity. Sequence comparison among GbNAC genes revealed thepresence and distribution of duplicated genes. Additionally, to identify GbNAC candidategenes associated with VW resistance, we analyzed the expression patterns of GbNAC genesusing available transcriptome data of G. babardense cv. 7,124 inoculated with the fungalpathogen V. dahliae. Subsequently, qPCR-based gene expression profiling demonstratedthat selected GbNAC candidate genes might be involved in MeJA and SA regulation. Thisstudy provides comprehensive information about sea-island cotton NAC genes, as well asa foundation for in-depth functional analysis of novel GbNAC candidate genes, which maybe useful for the improvement of pathogen resistance in cotton.

MATERIALS AND METHODSIdentification of NAC genes in the sea-island cotton genomeWe downloaded the genome sequences (v. HAU) of sea-island cotton from the CottonGendatabase (https://www.cottongen.org/, Wang et al., 2019). The Hidden Markov Model(HMM) profile of the NAC domain (PF02365) was retrieved from the Pfam database(http://pfam.xfam.org/, Finn et al., 2016). The HMMER program was used to searchNAC protein in the sea-island cotton genome (Finn, Clements & Eddy, 2011). with anE-value cutoff of e−5. Then, all the putative proteins were confirmed by the Pfam andSMART database (http://smart.embl-heidelberg.de/, Letunic & Bork, 2018). The MW(molecular weight) and pI (theoretical isoelectric point) of each NAC protein werepredicted by the online software ExPASy (https://www.expasy.org/, Artimo et al., 2012).The TMHHM server (v. 2.0, http://www.cbs.dtu.dk/services/TMHMM/) was used toidentify membrane-bound NAC proteins (Krogh et al., 2001). CELLO (v. 2.5, subCELlularLocalization predictor, http://cello.life.nctu.edu.tw/; Yu et al., 2006) was used to predictthe subcellular localization of GbNACs.

Multiple alignments, phylogenetic analysis, gene duplication andsynteny analysisMultiple sequence alignments were performed with the NAC domain sequences of theGbNAC proteins using MEGA X (https://www.megasoftware.net/, Kumar et al., 2018).A phylogenetic tree was constructed by the neighbor-joining method with the followingparameters: Poisson correction, pairwise deletion, and 1,000 bootstrap replicates. Geneduplications were analyzed with two major criteria, that is, the length of the alignedsequence covers more than 75% of the longer gene and similarity of the aligned regions isgreater than 75% (Vatansever et al., 2016). Alignment of the coding sequences of duplicatedgenes was performed by the Clustal X (v. 2.0) program (Larkin et al., 2007), and the valuesof nonsynonymous (Ka) and synonymous (Ks) substitution rates were calculated usingKaKs_Calculator package (Zhang et al., 2006) via model averaging. The approximate dateof duplication events (million years ago, Mya) was estimated using the formula T =Ks/2λ× 10−6, on the basis of molecular clock rate of 2.6 × 10−9 substitutions/synonymous site

Liu et al. (2019), PeerJ, DOI 10.7717/peerj.7995 3/24

for cotton (Liu et al., 2015). The relationships of duplicated genes were illustrated with theCircos program (Krzywinski et al., 2009). MCScanX was used to detect the synteny of NACgenes between G. barbadense and the other plant species (Wang et al., 2012).

Gene structure, chromosomal mapping, and conserved motifanalysisThe gene structures were determined using the CDS and DNA sequences of GbNACgenes and visualized by the Gene Structure Display Server (http://gsds.cbi.pku.edu.cn/,Hu et al., 2015). The positions of these GbNAC genes were determined by using thenucleotide sequence as a query to search against the G. barbadense genome. In addition,chromosomal localization map was constructed by using the MapChart (v. 2.32)program (Voorrips, 2002). In order to identify the conserved motifs among all theGbNAC genes, their protein sequences were subjected to the online software MEME(http://meme-suite.org/tools/meme, Bailey et al., 2015) using default parameters withexception for number of motif. The number of motifs was set to 20.

Cis-acting regulatory element and miRNA target analysisFor cis-acting regulatory element analysis, we retrieved 1,500 bp DNA sequencesup-stream from the transcription start site from the newly released G. barbadensegenome sequences (Wang et al., 2019) and then screened them in the PlantCare(http://bioinformatics.psb.ugent.be/webtools/plantcare/html/, Lescot et al., 2002) andPLACE (https://www.dna.affrc.go.jp/PLACE/, Higo et al., 1999) databases. For miRNAtarget analysis, we downloaded all the mature miRNA sequences of Gossypium from themiRBase database (v. 22.1) (http://www.mirbase.org/). The online sever psRNATarget(http://plantgrn.noble.org/psRNATarget/, Dai, Zhuang & Zhao, 2018) was used to predictthe miRNA target.

Gene expression analysis under Verticillium wilt stressThe transcriptome data of sea-island cotton were obtained from the Sequence ReadArchive under accession number SRP03537 at the NCBI website (Chen et al., 2015), wherethe plants were inoculated withV. dahliae. Briefly, two-week-old seedlings ofG. barbadenseresistant cultivar 7,124 were inoculated with the high virulence V991 defoliating strain ofV. dahliae (5× 106 spores/mL) by the root-dip method for 2, 6, 12, 24, 48, and 72 h. Then,the samples, including the mock-inoculated (control) and six inoculated, were collected forRNA sequencing. We retrieved the expression data of GbNAC genes from the root underV. dahliae infection (Chen et al., 2015). The hierarchical clustering and the heatmap-basedexpression profiles ofGbNAC genes were performed using ClustVis (Metsalu & Vilo, 2015).The Venn map of differentially expressed GbNAC genes was constructed using the UpSetRpackage (Lex et al., 2014).

RNA isolation and qPCR analysisG. barbadense cv. 7,124 seeds were cultivated in commercial soil at 28 ◦Cwith a photoperiodof 16 h light/8 h dark. Two-week-old seedlings were gently uprooted, rinsed and cultivatedin Hoagland solution for two days. Then these seedlings were treated with Hoagland

Liu et al. (2019), PeerJ, DOI 10.7717/peerj.7995 4/24

solution containing 0.1 mM MeJA and 1 mM SA, respectively. Roots were sampled fromthree biological replicates after treatment for 1, 2, 6, and 12 h, then immediately immersedin liquid nitrogen and stored at −80 ◦C for qPCR.

The total RNA from the root samples treated with MeJA and SA and the control (rootsfrom Hoagland solution without hormone) was extracted using Trizol (Invitrogen). Thefirst strand cDNA was generated from 1 µg of total RNA using a PrimerScriptTM 1st StrandcDNA Synthesis Kit (Takara, Dalian, China). qPCR was performed with three replicatesusing an ABI QuantStudio 5 Real-Time PCR System (Thermo Fisher Scientific, Waltham,MA, USA) and SYBR Premix Ex TaqTM (Takara, Dalian, China). The amplificationprocedure was as follows: one cycle at 95 ◦C for 3 min; then 40 cycles at 95 ◦C for 15s, 60 ◦C for 15 s. The cotton actin (AF059484) was selected as the internal referencegene (Zhang et al., 2013). Gene expression levels were calculated according to the 2−11CT

method described by Livak & Schmittgen (2001). The primers used for qPCR are listed inTable S1.

RESULTSIdentification and phylogenetic analysis of NAC family members inGossypium barbadenseA total of 270 NAC proteins were identified from G. barbadense and named as GbNAC001to GbNAC270. All of the 270 GbNACs contained the NAC domain (PF02365) based onPfam and SMART tests. The lengths of GbNAC proteins ranged from 154 (GbNAC059)to 959 (GbNAC175) amino acids with MW from 17.65 to 107.75 kDa, and pI from 4.67 to9.79. Subcellular localization of GbNACs was predicted using the online software CELLO(http://cello.life.nctu.edu.tw/). Among the 270 GbNAC proteins, three were predicted tobe mitochondrial proteins (GbNAC031, 032, and 255); five were located in the chloroplast(GbNAC014, 036, 116, 174, and 249); 10 were extracellular; 23 were cytoplasmic, and therest were localized in the nucleus. These results are similar to those of cucumber (Liu et al.,2018). Detailed information including gene locus, chromosome location, exon number,sequence length, MW, pI, Arabidopsis orthologous locus and subcellular location of allidentified GbNAC proteins is provided in Table S2.

The NAC domain sequences of GbNAC proteins were used to construct a neighbor-joining phylogenetic tree. As a result, the 270 GbNACs were classified into 10 groupswhich were named as I to X (Fig. 1). Group VI contained the most NAC members with 47GbNACs, followed by group I with 39 GbNACs. Group III had the least NAC memberswith only ten GbNACs. Additionally, 35 GbNACs with high similarity to ArabidopsisATAF members were clustered in group IV. Arabidopsis ATAF members play pivotalroles in the responses to biotic and abiotic stresses. These results suggest that NACmembers in group IV may have similar functions with Arabidopsis ATAF members.

Liu et al. (2019), PeerJ, DOI 10.7717/peerj.7995 5/24

GbNAC093

GbNAC261

AN

AC

068

GbNAC180

GbNAC245

GbN

AC

190

ANAC080

GbNAC091

GbNAC034

GbNAC079

GbNAC127

ANAC046

ANAC009

GbNAC006

GbNAC209

GbNAC099

ANAC020

ANAC041

GbN

AC

033GbN

AC

001

ANAC071

GbN

AC11

0

GbN

AC

086

GbNAC205

ANAC094

GbN

AC

090

GbNAC269

GbN

AC

004

GbN

AC

221

AN

AC

005

GbNAC045

GbNAC233

GbN

AC

148

GbNAC160

GbN

AC

067

GbNAC031

GbNAC078

GbNAC080

ANAC023

GbN

AC

047

ANAC012

GbNAC119

GbN

AC

125

GbNAC101

GbNAC151

GbNAC202

GbNAC08

7

GbN

AC

100

GbNAC104

GbNAC167

GbNAC268

GbN

AC

062

ANAC051

GbN

AC11

1

ANAC072

GbN

AC

030G

bNA

C193

ANAC086

AN

AC

067

GbN

AC17

3

GbNAC230

ANAC096

GbN

AC

083

ANAC033

GbN

AC

185

ANAC050

GbNAC016

GbN

AC

263

ANAC028

GbN

AC

174

GbNAC098

ANAC037

GbNAC092

ANAC076

AN

AC

036

ANAC087

GbNAC158

ANAC010

ANAC053A

NA

C02

2

GbNAC061

ANAC025

GbNAC270

GbNAC021

GbN

AC

199

GbNAC238

GbN

AC

003

GbNAC128

ANAC058

GbNAC142

GbNAC129

GbNAC164

ANAC024

GbNAC203

GbN

AC21

7

ANAC077

ANAC056

AN

AC

021

GbN

AC

147

GbNAC253GbNAC064

GbNAC113

GbN

AC

038

GbNAC015

GbN

AC14

6

GbNAC16

9

GbNAC135

AN

AC

093

ANAC015

GbN

AC

054

GbN

AC

194

GbN

AC

178

ANAC029

GbNAC18

4

AN

AC

061

GbNAC074

GbNAC137

GbNAC177

GbN

AC

011

ANAC052

GbNAC187

ANAC073

GbNAC013

GbNAC157

GbN

AC

082

ANAC078

GbNAC126

GbNAC181

GbNAC105

ANAC

031

ANAC083

GbNAC265

GbNAC124

GbNAC107

GbNAC150

GbN

AC

250

ANAC103

GbNAC132

ANAC057

GbN

AC

141

GbN

AC

213

GbNAC262G

bNA

C06

5

GbNAC255

GbN

AC

179

GbN

AC

170

GbN

AC

010GbNAC115

GbN

AC

226

ANAC100

GbNAC120

GbNAC189

AN

AC

044

GbN

AC

117 AN

AC

065

AN

AC

001

GbNAC231

GbN

AC

156

GbNAC136

ANAC019

AN

AC

095

GbNAC176

AN

AC

004A

NA

C097

ANAC091

GbNAC211

GbNAC228

AN

AC

008

ANAC011

GbNAC073

GbNAC254

GbNAC257

AN

AC

090

GbN

AC

155

ANAC018

GbNAC252

GbNAC153

GbNAC014

ANAC098

GbNAC095

GbN

AC

212

GbN

AC

138

GbNAC133

GbNAC259

GbNAC161

ANAC082

GbNAC19

8

ANAC002

GbNAC068

GbN

AC

256

ANAC102

GbN

AC

032

GbN

AC

026

ANAC026

GbNAC022

GbNAC248GbNAC260

GbNAC071

GbNAC227

GbNAC044

ANAC017

GbNAC084

GbNAC069

AN

AC

060

AN

AC

104

GbNAC058

ANAC043

GbNAC094

AN

AC

006

AN

AC

048

ANAC092

GbN

AC

118

ANAC101

GbNAC103

GbN

AC19

7

GbNAC05

2

ANAC070

GbNAC134

GbNAC085GbNAC168GbNAC236

GbNAC057

ANAC075

ANAC062

GbN

AC19

5

ANAC055

ANAC038

GbNAC223

GbNAC024

GbNAC050

GbNAC237

GbNAC108

GbNAC182

GbN

AC

220

GbNAC222

GbNAC048

GbNAC060

GbNAC029

ANAC030

ANAC059

GbN

AC

154

GbNAC175

GbNAC240

ANAC014

ANAC079

GbNAC206

AN

AC

034GbNAC096

GbNAC171

ANAC007

GbN

AC20

4

GbN

AC

081

GbN

AC

041

GbNAC210GbNAC267

GbNAC166

GbNAC225

ANAC016

GbN

AC

077

GbN

AC

123

GbN

AC10

6

GbNAC070

GbNAC023

GbN

AC23

9

GbNAC06

6

GbNAC036

GbN

AC

075G

bNAC

049

GbNAC215

AN

AC

089

GbN

AC

258

GbNAC144

GbNAC088

GbNAC149

GbN

AC

251

GbNAC051

ANAC045

GbN

AC24

3

GbN

AC06

3

GbNAC235

AN

AC

027

GbN

AC

012

GbNAC055

GbNAC216

GbNAC247

ANAC099

GbNAC241

AN

AC

063

GbN

AC

009

GbNAC191

GbNAC200

GbNAC114

GbNAC172

ANAC105 GbNAC043

GbNAC035

GbNAC224

GbNAC121

GbNAC039

ANAC039

GbNAC219

GbN

AC

163

ANAC054

ANAC040

GbNAC102

AN

AC

074

GbN

AC

208

GbNAC025

GbNAC131

GbNAC139

AN

AC

084

GbNAC162

GbNAC186

GbN

AC

027

GbN

AC

097

GbNAC040

GbNAC201

GbN

AC24

4

GbN

AC

130GbNAC028

GbNAC18

3

GbNAC046GbNAC143

GbNAC020

GbN

AC

207

ANAC081

AN

AC

003

GbNAC24

2

GbNAC192

AN

AC

049

AN

AC

035

GbNAC159

GbNAC017

AN

AC

088

GbNAC165

GbNAC196

GbNAC246

GbNAC266

GbN

AC

140

ANAC047

GbN

AC

214

GbN

AC

089

GbN

AC

229

GbNAC24

9

GbN

AC

059

GbNAC112

GbN

AC00

8

GbNAC002

GbNAC07

6

GbNAC234

GbNAC007

GbNAC019

GbN

AC

018

GbNAC152

GbN

AC109

ANAC032

GbNAC264

GbNAC056

GbNAC11

6

ANAC013

AN

AC

085

AN

AC

069

ANAC042G

bNAC188

GbNAC05

3

GbNAC00

5

GbNAC072

GbNAC145

GbNAC042

AN

AC

064

ANAC066

GbNAC03

7

GbNAC232

GbNAC122

GbNAC21

8

1.001.00

1.00

0.95

1.00

1.00

0.66

0.67

1.00

0.58

0.81

0.92

1.00

1.00

1.00

0.76

0.99

0.61

1.00

0.990.98

0.90

0.50

0.61

0.55

0.98

0.97

0.84

0.58

1.00

1.00

0.60

0.98

0.52

0.82

0.64

1.00

1.00

0.63

0.65

1.00

0.58

1.00

0.94

0.84

1.00

1.00

0.99

0.960.82

0.65

0.80

0.91

0.87

0.99

0.94

0.90

1.00

1.00

0.98

0.90

0.98

1.00

0.99

0.61

1.00

0.99

0.93

0.94

0.98

1.00

0.99

1.00

1.00

0.85

0.63

1.00

1.00

0.94

0.97

1.00

1.00

1.00

1.00

0.93

0.96

1.00

1.00

0.891.00

0.92

0.89

1.00

0.55

1.00

0.94

0.97

0.95

0.99

0.99

1.00

0.94

0.97

0.97

0.81

0.99

0.67

0.98

0.83

1.00

0.81

0.99

0.58

0.50

0.65

1.00

0.98

0.69

0.66

0.95

0.64

0.82

0.98

1.00

1.00

1.00

0.97

1.00

1.00

0.84

0.71

0.96

0.54

0.88

0.97

0.96

1.00

0.99

1.00

1.00

0.90

0.99

0.97

0.99

1.00

0.72

0.60

1.00

0.99

0.99

0.98

0.91

0.95

0.92

0.89

1.00

1.00

0.71

0.69

1.00

1.00

1.00

1.00

0.79

0.58

1.00

1.00

1.00

0.99

0.60

1.00

0.91

1.00

0.99

0.78

0.98

0.51

0.55

0.52

0.69

1.00

0.81

1.00

1.00

1.00

0.90

0.75

0.94

0.93

0.62

0.72

0.97

0.72

0.75

1.00

0.85

0.85

0.99

0.93

0.53

0.88

0.98

0.98

0.86

1.00

1.00

0.82

1.00

1.00

0.54

0.87

1.00

0.59

0.76

0.78

0.76

0.53

0.58

0.940.99

0.99

0.95

0.99

1.00

1.00

0.99

1.00

0.51

1.00

0.99

0.90

1.00

0.53

1.00

0.87

0.85

1.00

0.85

0.98

1.00

0.52

0.86

0.73

0.59

1.00

1.00

0.89

0.99

0.97

1.00

0.98

1.000.

98

1.00

0.80

1.00

1.00

0.74

1.00

0.84

0.73

0.99

0.95

1.00

1.00

1.00

0.59

0.66

0.79

1.00

1.00

1.00

1.00

0.99

1.00

1.00

1.00

1.00

0.85

0.92

0.90

0.88

1.00

0.98

0.68

0.84

0.87

1.00

0.99

1.00

0.92

0.99

1.00

0.56

0.92

0.99

0.97

0.51

0.98

0.99

0.59

0.69

1.00

0.98

0.86

0.96

1.00

0.50

I

II

III

IV

V

VI

VIIVIII

IX

X

Figure 1 Phylogenetic tree of the 270 GbNAC proteins.Multiple sequence alignment of NAC domainsequences of G. barbadense and Arabidopsis was performed using ClustalW. MEGA X was used to con-struct the neighbor-joining (NJ) tree with 1000 bootstrap replicates. Various colors indicate differentgroups of GbNACs.

Full-size DOI: 10.7717/peerj.7995/fig-1

Chromosomal locations, duplications, and synteny analysis of theGbNAC genesTo determine the chromosomal distribution of the GbNAC genes, we searched the sea-island cotton genome database using blastn and the DNA sequence of each GbNAC gene.The results suggested that 263 GbNAC genes (97.4%) were mapped to 26 chromosomes.Specifically, 132 GbNAC genes were distributed in the A-subgenome, and 131 were locatedin the D-subgenome (Fig. 2). In addition, five genes (GbNAC011, GbNAC012, GbNAC049,GbNAC050, andGbNAC069) were anchored in four A-subgenome scaffolds, and two genes(GbNAC148 and GbNAC262) were found in two D-subgenome scaffolds. The number ofGbNAC genes distributed on each chromosome was uneven. Chromosome D11 containedthe highest number of NAC genes, with 15 GbNAC genes. In contrast, chromosome A10contained the least number ofNAC genes, with only five genes. Interestingly, manyGbNAC

Liu et al. (2019), PeerJ, DOI 10.7717/peerj.7995 6/24

GbNAC001GbNAC002GbNAC003GbNAC004GbNAC005GbNAC006GbNAC007GbNAC008

GbNAC009

GbNAC010

A

GbNAC013GbNAC014

GbNAC015GbNAC016

GbNAC017GbNAC018

B

GbNAC019GbNAC020

GbNAC021

GbNAC022GbNAC023

GbNAC024GbNAC025GbNAC026GbNAC027

C

GbNAC028GbNAC029GbNAC030GbNAC031GbNAC032

GbNAC033

GbNAC034

GbNAC035

GbNAC036

D

GbNAC037GbNAC038GbNAC039GbNAC040GbNAC041

GbNAC042GbNAC043

GbNAC044

GbNAC045GbNAC046GbNAC047GbNAC048

E

GbNAC051

GbNAC052

GbNAC053

GbNAC054

GbNAC055

GbNAC056GbNAC057

GbNAC058GbNAC059

F

GbNAC060GbNAC061

GbNAC062GbNAC063GbNAC064

GbNAC065

GbNAC066GbNAC067GbNAC068

G

05101520253035404550556065707580859095100105110115

GbNAC070

GbNAC071GbNAC072

GbNAC073GbNAC074GbNAC075GbNAC076GbNAC077GbNAC078GbNAC079GbNAC080GbNAC081GbNAC082GbNAC083

H

GbNAC084GbNAC085GbNAC086

GbNAC087

GbNAC088GbNAC089GbNAC090GbNAC091GbNAC092GbNAC093GbNAC094GbNAC095

I

GbNAC096GbNAC097GbNAC098

GbNAC099GbNAC100

JGbNAC101GbNAC102GbNAC103GbNAC104GbNAC105GbNAC106GbNAC107GbNAC108GbNAC109GbNAC110GbNAC111GbNAC112GbNAC113

GbNAC114GbNAC115

GbNAC116

K

GbNAC117GbNAC118

GbNAC119

GbNAC120GbNAC121GbNAC122GbNAC123GbNAC124GbNAC125GbNAC126GbNAC127GbNAC128GbNAC129

L

GbNAC130GbNAC131

GbNAC132

GbNAC133GbNAC134GbNAC135GbNAC136GbNAC137

M

GbNAC138GbNAC139GbNAC140GbNAC141GbNAC142GbNAC143GbNAC144GbNAC145GbNAC146

GbNAC147

N

GbNAC149GbNAC150

GbNAC151GbNAC152

GbNAC153GbNAC154GbNAC155

O

GbNAC156GbNAC157

GbNAC158

GbNAC159

GbNAC160GbNAC161

P

GbNAC162GbNAC163GbNAC164GbNAC165GbNAC166

GbNAC167

GbNAC168

Q

GbNAC169GbNAC170GbNAC171GbNAC172GbNAC173GbNAC174GbNAC175GbNAC176GbNAC177

GbNAC178GbNAC179

GbNAC180GbNAC181

R

GbNAC182GbNAC183

GbNAC184

GbNAC185GbNAC186GbNAC187GbNAC188

GbNAC189GbNAC190

S

GbNAC191GbNAC192GbNAC193GbNAC194GbNAC195GbNAC196GbNAC197

GbNAC198GbNAC199GbNAC200

T

GbNAC201

GbNAC202GbNAC203GbNAC204GbNAC205GbNAC206GbNAC207GbNAC208GbNAC209GbNAC210GbNAC211GbNAC212GbNAC213GbNAC214

U

GbNAC215GbNAC216GbNAC217

GbNAC218GbNAC219GbNAC220GbNAC221GbNAC222GbNAC223GbNAC224GbNAC225GbNAC226

V

GbNAC227GbNAC228GbNAC229

GbNAC230GbNAC231GbNAC232

WGbNAC233GbNAC234GbNAC235GbNAC236GbNAC237GbNAC238GbNAC239GbNAC240GbNAC241GbNAC242GbNAC243GbNAC244GbNAC245GbNAC246

GbNAC247GbNAC248

GbNAC249

X

GbNAC250GbNAC251GbNAC252

GbNAC253GbNAC254GbNAC255GbNAC256GbNAC257GbNAC258GbNAC259GbNAC260GbNAC261

Y

GbNAC263GbNAC264

GbNAC265

GbNAC266GbNAC267GbNAC268GbNAC269GbNAC270

Z

Mb

Figure 2 Chromosomal locations of theGbNAC genes. Grey bars denote the G. babardense chromosome. Scale bar on the left indicates the chro-mosome lengths (Mb). The letter A-M indicates the A-subgenome chromosome A01–A13, and N-O indicates the D-subgenome chromosome D01–D13, respectively.

Full-size DOI: 10.7717/peerj.7995/fig-2

genes were clustered within a short distance, such as the top of A11 andD11 and the bottomof A08 and D08.

Gene duplication including tandem duplication, segmental duplication, and whole-genome duplication (WGD) is a major driving force in the evolution of plants. Theorigin of multigene family is due to gene duplication that arose from region-specificduplication or WGD (Du et al., 2012). To reveal the expansion mechanism of the GbNACgene family, gene duplication analysis was performed using blastn and the codingsequences (cds) of all GbNAC genes. In all, we identified 148 pairs (212 GbNAC genes)of segmental duplications, three pairs of tandem duplications (GbNAC011/GbNAC012,GbNAC024/GbNAC025, andGbNAC082/GbNAC083), and one triplicate repeat of tandemduplications (GbNAC212/GbNAC213/GbNAC214) (Fig. 3). One hundred and thirty-twoduplication gene pairs occur between the A-subgenome and the D-subgenome, while only12 and 10 duplication gene pairs occur within the A-subgenome and the D-subgenome,respectively. These genes represent approximately 81.9% (221 of 270) of theGbNAC genes,

Liu et al. (2019), PeerJ, DOI 10.7717/peerj.7995 7/24

A01

0 25 50 75 100

A020

25

50

75

100

A03

0

25

50

75

100

A04

0

25

50

75

A05

0

25

50

75

100

A06

0

25

50

75

100

A07

0

25

50

75

A08

0

25

50

75

100

A09

0

25

50

75

A10

0

255075100

A11

0255075

100A12

02550

75100

A13 0

25

50

75

100

D01 0

25

50

D02

0

25

50

D03

0

25

50

D04

0

25

50

D05

0

25

50

D06

0

25

50

D07

0

25

50

D08

0

25

50

D09

0

25

50

D10

0

25

50

D11

0

25

50

D12

0

25

50

D13

0

25

50

GbN

AC

001

GbN

AC

005

GbN

AC

008

GbN

AC

010

GbN

AC01

4

GbN

AC01

6

GbNAC01

7GbN

AC019

GbNAC02

1

GbNAC022

GbNAC024

GbNAC026

GbNAC027

GbNAC028

GbNAC030

GbNAC034

GbNAC035

GbNAC037

GbNAC039

GbNAC041

GbNAC042

GbNAC044

GbNAC046

GbNAC051

GbNAC053

GbNAC054

GbNAC055

GbNAC056GbNAC057

GbNAC058

GbNAC061GbNAC062GbNAC064GbNAC065

GbNAC066GbNAC068GbNAC070

GbNAC071GbNAC073

GbNAC075

GbNAC077

GbNAC085

GbNAC086GbNAC087

GbNAC088

GbNAC089

GbNAC093

GbNAC096

GbNAC097

GbN

AC

099

GbN

AC

102G

bNA

C107

GbN

AC

111

GbN

AC

112GbN

AC

114

GbN

AC

116

GbN

AC

117

GbN

AC

118

GbN

AC

119

GbN

AC

120

GbN

AC

121

GbN

AC

124

GbN

AC12

6

GbN

AC13

0

GbN

AC13

1GbNAC13

2

GbNAC13

3

GbNAC13

5

GbNAC13

7

GbNAC13

8

GbNAC142

GbNAC146

GbNAC147GbNAC150

GbNAC151GbNAC153GbNAC155GbNAC156

GbNAC158GbNAC159GbNAC160GbNAC163GbNAC166

GbNAC167GbNAC168

GbNAC169GbNAC171GbNAC174

GbNAC176

GbNAC178

GbNAC180

GbNAC182

GbNAC184

GbNAC185GbNAC186GbNAC187

GbNAC188

GbNAC189

GbNAC192

GbNAC194

GbNAC196

GbNAC197

GbNAC198

GbNAC200

GbNAC201

GbNAC202

GbNAC205

GbNAC207

GbNAC208

GbNAC216

GbNAC217

GbNAC218

GbNAC219

GbNAC222

GbNAC223

GbNAC227

GbNAC228

GbNAC230GbNAC231

GbNAC241G

bNAC245

GbNAC247

GbNAC249

GbN

AC

250G

bNA

C252

GbN

AC

253G

bNA

C254

GbN

AC

257G

bNA

C259

GbN

AC

263G

bNA

C264

GbN

AC

265G

bNA

C266

GbN

AC

268G

bNA

C270

Figure 3 Circos diagram of theGbNAC duplication pairs inG. barbadense. 143 GbNAC duplicationpairs are linked with green lines. Scale bar marked on the chromosome indicating chromosome lengths(Mb).

Full-size DOI: 10.7717/peerj.7995/fig-3

indicating their origin may be from sea-island cotton genome duplication events. Usingthe Ka and Ks of each duplicated GbNAC gene pair, we found that the Ks values of all genepairs were between 0.008 and 0.912. Specifically, the Ks values of 87 (58.78%) gene pairswere less than 0.05. Additionally, the Ka/Ks value of each gene pair was calculated and theKa/Ks values of 140 gene pairs (94.59%) were less than 1, which indicated these genes hadevolved under strong purifying selection. Furthermore, eight gene pairs (Ka/Ks >1) mayevolve under strong positive selection after duplication. Moreover, we also calculated theapproximate date of duplication events. The duplication events of GbNAC genes occurredfrom 1.58 Mya (Ks = 0.008) to 175.29 Mya (Ks = 0.912), with a mean of 59.31 Mya (Ks= 0.154). Detailed information including duplication gene pairs, chromosome location,duplication type, Ka, Ks, Ka/Ks and approximate duplication date (Mya) of all identifiedduplicated GbNAC genes is provided in Table S3.

Liu et al. (2019), PeerJ, DOI 10.7717/peerj.7995 8/24

To detect the synteny of NAC genes, we performed a collinearity analysis betweenG. barbadense and the other four plant species (Arabidopsis thaliana, G. arboretum,G. raimondii and G. hirsutum) using MCScanX. Previously, 117, 141, and 145 NAC geneswere revealed in A. thaliana (AtNAC, Nuruzzaman et al., 2010), G. arboretum (GaNAC,Shang et al., 2016), and G. raimondii (GrNAC, Shang et al., 2013), respectively. For NACgenes in G. hirsutum (GhNAC), we identified 272 GhNAC genes in the upland cottongenome that was released recently (Wang et al., 2019). Totally, 945 NAC genes were usedto evaluate synteny relationship in this study. As a result, we found 421 paired collinearityrelationships between 247 GbNAC and 234 GhNAC genes, 241 pairs between 241 GbNACand 133 GrNAC genes, 181 pairs between 120 GbNAC and 54 AtNAC genes, and 142 pairsbetween 141 GbNAC and 81 GaNAC genes (Fig. 4 and Table S4). Notably, 69 GbNACgenes are collinear with NAC genes from the other four species (Table S4).

GbNAC gene structures and conserved motifsTo better understand the relationship between gene function and evolution among theGbNAC genes, the exon/intron organization and conserved motifs were analyzed (Fig. 5).The number of exons ranged from 1 to 10. Most GbNAC genes (182/270, 67.41%) hadthree exons (Fig. 5), although GhNAC059, GhNAC105, and GhNAC238 contained onlyone exon, and GhNAC129 and GhNAC175 contained 10 exons, the highest number of allgenes. We also found that GbNAC genes in the same group had a similar gene structure.For example, all 12 members in group IX had three exons. Among the 39 members ingroup I, all had three exons except GbNAC060 and GbNAC145 (Fig. 5).

Twenty conserved motifs were identified among the 270 GbNAC proteins (Fig. S1and Table S5). As a result, all GbNAC proteins contain a conserved NAC domain at theN-terminal, which includes five subdomains (A–E, Fig. 6). Notably, the members with highsimilarity in the same group shared a commonmotif composition. For example, GbNAC213and GbNAC214 were found to contain the same three motifs (Fig. 5). This finding indicatesthat these genes may have similar functions. Most GbNAC proteins (257/270, 95.19%)contain 2–5 conserved motifs. However, seven GbNAC proteins (GbNAC033, GbNAC068,GbNAC092, GbNAC109, GbNAC200, GbNAC226, and GbNAC242) contain only onemotif and six GbNAC proteins (GbNAC017, GbNAC020, GbNAC074, GbNAC160,GbNAC206, and GbNAC253) contain six motifs. Additionally, we found that subdomainsA (246/270), and/or C (200/270), and/or D (250/270) are present in most GbNAC proteins.Furthermore, we found that 231GbNACproteins contained the conserved histidine residuein subdomain D, which influences homodimerization and DNA binding of NAC proteins(Kang et al., 2018).

Membrane-bound NAC proteins feature a distinctive transmembrane motif (TM)at either the C terminal region or the N terminal region and play vital roles in plantdefense against abiotic stresses (Kim et al., 2010; Li et al., 2016; Sun et al., 2018). In thesea-island cotton genome, 22 membrane-bound GbNAC proteins were identified (TableS2). Notably, 20 membrane-bound GbNAC proteins contain one TM at the C terminal andtwo GbNAC members (GbNAC110 and GbNAC243) contain one TM at the N terminal.

Liu et al. (2019), PeerJ, DOI 10.7717/peerj.7995 9/24

At

Gb

1 5 3 2 4

A1 A2 A3 A4 A5 A6 A7 A8 A9 A10 A11 A12 A13 D1 D2 D3 D4 D5 D6 D7 D8 D9 D10 D11 D12 D13

Ga

Gb

1 2 3 4 5 6 7 8 9 10 11 12 13

A1 A2 A3 A4 A5 A6 A7 A8 A9 A10 A11 A12 A13 D1 D2 D3 D4 D5 D6 D7 D8 D9 D10 D11 D12 D13

Gr

Gb

1 2 3 4 5 6 7 8 9 10 11 12 13

A1 A2 A3 A4 A5 A6 A7 A8 A9 A10 A11 A12 A13 D1 D2 D3 D4 D5 D6 D7 D8 D9 D10 D11 D12 D13

Gh

Gb

A1 A2 A3 A4 A5 A6 A7 A8 A9 A10 A11 A12 A13 D1 D2 D3 D4 D5 D6 D7 D8 D9 D10 D11 D12 D13

A1 A2 A3 A4 A5 A6 A7 A8 A9 A10 A11 A12 A13 D1 D2 D3 D4 D5 D6 D7 D8 D9 D10 D11 D12 D13

A

B

C

D

Figure 4 Synteny analysis of NAC genes between Gossypium barbadense and other plant species. At,Ga, Gr, Gb, and Gh indicate Arabidopsis thaliana, G. arboreum, G. raimondii, G. barbadense, and G. hirsu-tum, respectively. (A) Collinearity between At and Gb; (B) Collinearity between Ga and Gb; (C) Collinear-ity between Gr and Gb; (D) Collinearity between Gh and Gb. Gray lines in the background represent thecollinear blocks within the genomes of G. barbadense and other plant species, while the red lines show thecollinear NAC gene pairs.

Full-size DOI: 10.7717/peerj.7995/fig-4

Liu et al. (2019), PeerJ, DOI 10.7717/peerj.7995 10/24

GbNAC113GbNAC246GbNAC119GbNAC252GbNAC017GbNAC157GbNAC104GbNAC237GbNAC133GbNAC266GbNAC021GbNAC159GbNAC060GbNAC191GbNAC074GbNAC206GbNAC020GbNAC160GbNAC127GbNAC259GbNAC078GbNAC209GbNAC080GbNAC211GbNAC145GbNAC064GbNAC196GbNAC108GbNAC241GbNAC120GbNAC253GbNAC016GbNAC158GbNAC071GbNAC203GbNAC095GbNAC225GbNAC128GbNAC260GbNAC067GbNAC199GbNAC111GbNAC244GbNAC065GbNAC197GbNAC075GbNAC207GbNAC003GbNAC140GbNAC032GbNAC179GbNAC008GbNAC146GbNAC049GbNAC173GbNAC005GbNAC142GbNAC136GbNAC269GbNAC116GbNAC249GbNAC053GbNAC184GbNAC087GbNAC218GbNAC103GbNAC235GbNAC124GbNAC257GbNAC107GbNAC240GbNAC114GbNAC247GbNAC132GbNAC265GbNAC084GbNAC215GbNAC091GbNAC222GbNAC165GbNAC231GbNAC232GbNAC092GbNAC002GbNAC139GbNAC014GbNAC150GbNAC051GbNAC182GbNAC046GbNAC164GbNAC006GbNAC143GbNAC029GbNAC180GbNAC088GbNAC219GbNAC096GbNAC227GbNAC036GbNAC270GbNAC137GbNAC115GbNAC248GbNAC023GbNAC152GbNAC122GbNAC255GbNAC034GbNAC167GbNAC055GbNAC186GbNAC040GbNAC172GbNAC135GbNAC268GbNAC007GbNAC144GbNAC028GbNAC181GbNAC043GbNAC177GbNAC048GbNAC162GbNAC099GbNAC230GbNAC042GbNAC176GbNAC105GbNAC238GbNAC019GbNAC161GbNAC073GbNAC205GbNAC126GbNAC262GbNAC102GbNAC234GbNAC013GbNAC149GbNAC050GbNAC175GbNAC061GbNAC192GbNAC093GbNAC223GbNAC070GbNAC201GbNAC112GbNAC245GbNAC079GbNAC210GbNAC134GbNAC267GbNAC012GbNAC148GbNAC011GbNAC001GbNAC138GbNAC027GbNAC155GbNAC118GbNAC251GbNAC018GbNAC156GbNAC010GbNAC147GbNAC033GbNAC009GbNAC062GbNAC194GbNAC089GbNAC220GbNAC054GbNAC185GbNAC097GbNAC229GbNAC022GbNAC151GbNAC131GbNAC264GbNAC045GbNAC085GbNAC216GbNAC057GbNAC188GbNAC058GbNAC189GbNAC004GbNAC141GbNAC030GbNAC178GbNAC069GbNAC193GbNAC031GbNAC081GbNAC212GbNAC082GbNAC213GbNAC083GbNAC214GbNAC072GbNAC202GbNAC094GbNAC224GbNAC101GbNAC233GbNAC015GbNAC129GbNAC261GbNAC038GbNAC170GbNAC090GbNAC221GbNAC066GbNAC198GbNAC076GbNAC109GbNAC242GbNAC110GbNAC243GbNAC204GbNAC063GbNAC195GbNAC086GbNAC217GbNAC106GbNAC239GbNAC123GbNAC256GbNAC052GbNAC183GbNAC037GbNAC169GbNAC121GbNAC254GbNAC039GbNAC171GbNAC056GbNAC187GbNAC024GbNAC153GbNAC025GbNAC035GbNAC168GbNAC236GbNAC044GbNAC166GbNAC068GbNAC200GbNAC098GbNAC228GbNAC026GbNAC154GbNAC117GbNAC250GbNAC130GbNAC263GbNAC047GbNAC163GbNAC041GbNAC174GbNAC125GbNAC258GbNAC226GbNAC077GbNAC208GbNAC100GbNAC059GbNAC190

0 200 400 600 800 10005' 3'

0 1000 2000 3000 4000 5000 6000 70005' 3'

Motif 13

Motif 4

Motif 9

Motif 18

Motif 19

Motif 2

Motif 3

Motif 14

Motif 12

Motif 17

Motif 7

Motif 5

Motif 6

Motif 8

Motif 1

Motif 16

Motif 10

Motif 20

Motif 11

Motif 15

UTR

Exon

I

II

III

IV

V

VI

VII

VIII

IX

X

A CB

Figure 5 Putative conserved motifs and gene structures of theGbNAC genes. (A) Phylogenetic tree.Multiple sequence alignment of NAC domain sequences of G. barbadense was performed using ClustalW.The neighbor-joining (NJ) tree was constructed using MEGA X with 1,000 bootstrap replicates. (B) Con-served motif. MEME analysis revealed the conserved motifs of the GbNAC proteins. The colored boxes onthe right denote 20 motifs. (C) Gene structure. The yellow boxes, black lines, and green boxes representexon, intron, and UTR (untranslated region), respectively.

Full-size DOI: 10.7717/peerj.7995/fig-5

Liu et al. (2019), PeerJ, DOI 10.7717/peerj.7995 11/24

Figure 6 Conserved subdomains in the NAC domain at the N-terminal of GbNAC proteins. A–E indi-cate five conserved subdomain A–E, respectively.

Full-size DOI: 10.7717/peerj.7995/fig-6

These membrane-bound GbNAC proteins are distributed in three groups: seven in groupV, 11 in group VII, and four in group VIII.

Cis-acting regulatory element and miRNA target analysisTo better understand the transcriptional regulation mechanisms of GbNAC genes, wecharacterized the cis-acting regulatory elements within a 1,500 bp upstream regionfrom the transcription start site using PlantCARE and PLACE database (Table S6). Alarge number of cis-acting regulatory elements were identified in promoter sequencesof 270 GbNAC genes (Fig. S3 and Table S7). Common regulatory elements such asTATA-box and CAAT-box were present in all GbNAC genes. Meanwhile, we identified11 cis-acting regulatory elements related to hormone responses. Among these, AuxRR-core and TGA-element, auxin-responsive element; ABRE, MYB, and MYC, cis-elementinvolved in abscisic acid (ABA) responsiveness; GARE, P-box, and TATC-box, involvedin gibberellin responsiveness; ERE, ethylene responsive element; CGTCA-motif, MeJAresponsive elements; and TCA-element, involved in SA responsiveness. Notably, all of the270 GbNAC genes contained at least one hormone-responsive element (Fig. S3). MYB,MYC, and ERE were available in at least 80% of the GbNAC genes. Additionally, promotersequences of some GbNAC genes also contained several elements involved in biotic andabiotic stress responses, including pathogen defense (AT-rich and TC-rich repeat), drought(MBS), cold (DRE and LTR), anaerobic stress (ARE), and wounding (WUN-motif). Thus,GbNAC genes could be regulated by diverse hormone and environmental changes.

Recent reports have defined a subset of genes from the NAC-domain gene family aspotential targets of miRNAs. To determine the involvement of miRNAs in regulatingthe expression of GbNAC genes, putative miRNA targets were determined in the 270

Liu et al. (2019), PeerJ, DOI 10.7717/peerj.7995 12/24

GbNAC genes using the online sever psRNATarget with the default parameters exceptExpection (≤3.0). Totally, 43 GbNAC genes were predicted as the targets of 21 knownmiRNAs (Table S8). Seven GbNAC genes were each predicted to be the targets of twomiRNAs and GbNAC007 was the target of three miRNAs (gra-miR8634, gra-miR8786aand gra-miR8786b). Specifically, miR164 targets 14 GbNAC genes, which are all fromgroup II (Table S8). In Arabidopsis, the miR164 family (ath-miR164a/b/c) guides thecleavage of the transcripts of five NAC genes (NAC1/At1G56010, CUC1/At3g15170,CUC2/At5g53950, ANAC080/At5g07680, and ANAC100/At5g61430) that function in theregulation of plant growth and development such as lateral root emergence, formationof vegetative and floral organs, and age-dependent cell death (Fang, Xie & Xiong, 2014;Hernandez & Sanan-Mishra, 2017). The 14 miR164-targeted GbNAC genes and the fiveArabidopsis NAC genes were clustered into group II, indicating that these genes have highsequence identity and may have a similar function.

Expression profile of GbNAC genes in response to Verticillium wiltand hormonesWe used publicly available transcriptome data to assess expression of GbNAC genes inroots under VW stress. As a result, 239 (88.52%) GbNAC genes were identified to beexpressed in infected root samples (Fig. 7). Three patterns (Pattern I–III) of expressionwere revealed. Genes from Pattern I have low expression levels in the control and thenwere up-regulated gradually during VW infection. Genes from Pattern II, in contrast toPattern I, were highly expressed in the control and then were subsequently down-regulatedduring VW infection. Genes from Pattern III showed high expression levels in the control,then were down-regulated at the early stages (2, 6 and 12 h) of inoculation, and finallyup-regulated at the late stages (24, 48 and 72 h) of inoculation. There are 130 duplicated genepairs among the 239 GbNAC genes. Most duplicated gene pairs (80.77%) demonstratedsimilar expression patterns, suggesting that duplicated genes are functionally redundant.However, some duplicated genes have a divergent expression. For example,GbNAC082 andGbNAC083 are tandem duplication genes. GbNAC082 had considerably high expression inthe control and was up-regulated by VW stress at 2 h, while GbNAC083 had low expressionin the control and was down-regulated remarkably by VW stress at 2 h.

The expression of 192 GbNAC genes was significantly altered (|log2Fold| ≥1) in rootsinoculated with V. dahliae as compared to the control in at least one time point (Fig. S2).103, 130, 123, 135, 130, and 128 GbNAC genes were significantly induced at 2, 6, 12, 24,48, and 72 h after inoculation, respectively, which indicates the number of differentiallyexpressed GbNAC genes was similar among different time points. Additionally, theexpression of 62 GbNAC genes including 15 up-regulated and 47 down-regulated geneswere significantly altered at all of the six time points (Fig. S2 and Table S9). Eighteenduplicated gene pairs were revealed among the 62 GbNAC genes. Notably, GbNACgenes in each duplicated gene pair had similar expression patterns. The expression ofpreviously reported sea-island cotton NAC gene GbNAC1 (GbNAC220 in this study) wasalso altered. GbNAC1, a positive regulator involved in cotton resistance to V. dahliae, wasdown-regulated in infected roots compared to the control, which is consistent with qPCR

Liu et al. (2019), PeerJ, DOI 10.7717/peerj.7995 13/24

Pattern IPattern II

Pattern III

Figure 7 Expression profiles ofGbNAC genes in roots under Verticillium wilt infection. The heat mapwas generated by ClustVis software. The expression data were gene-wise normalized and hierarchicallyclustered with average linkage. The bar on the right of the heat map indicates relative expression values.Values 2, 0,−2 represent high, intermediated, and low expression, respectively.

Full-size DOI: 10.7717/peerj.7995/fig-7

results reported by Wang et al. (2016). These results imply that GbNAC genes could playcrucial roles in the defense of cotton against VW.

Phytohormones, such asMeJA and SA, regulate plant defenses against diverse pathogens.In order to identify hormone-responsive GbNAC genes, G. barbadense cv. 7124 seedlingswere treated with MeJA and SA, and the changes in transcript abundance of 15 genesselected from the 62 differentially expressed GbNAC genes were analyzed by qPCR. Asshown in Fig. 8, all the GbNAC genes tested were sensitive to the hormones MeJA and SA,but the levels of sensitivity were substantially different at the four time points. In general,most tested GbNAC genes were up-regulated under MeJA treatment, but down-regulatedunder SA treatment. Specifically, the expression of two GbNAC genes (GbNAC014 andGbNAC164) from group IV were up-regulated at all the four time points, and had at least186-fold and 225-fold increase at 6 h compared with control, respectively (Fig. 8).

DISCUSSIONCharacterization of GbNAC genesIn this study, we performed a genome-wide analysis of the sea-island cotton GbNACgene family to investigate their potential functions in response to VW. As a result,

Liu et al. (2019), PeerJ, DOI 10.7717/peerj.7995 14/24

GbNAC003

0h 1h 2h 6h 12h

Rel

ativ

e ex

pres

sion

0.0

.5

1.0

1.5

2.0

2.5

SAMeJA

GbNAC014

0h 1h 2h 6h 12h0

50

100

150

200GbNAC010

0h 1h 2h 6h 12h0

1

2

3

4

5 GbNAC023

0h 1h 2h 6h 12h0

1

2

3

4

5

6 GbNAC137

0h 1h 2h 6h 12h0

1

2

3

4

GbNAC140

0h 1h 2h 6h 12h

Rel

ativ

e ex

pres

sion

0

2

4

6

8

10 GbNAC147

0h 1h 2h 6h 12h0

10

20

30

40

50

60 GbNAC152

0h 1h 2h 6h 12h0

2

4

6

8

10 GbNAC164

0h 1h 2h 6h 12h0

50

100

150

200

250 GbNAC165

0h 1h 2h 6h 12h0

10

20

30

40

50

60

GbNAC186

0h 1h 2h 6h 12h

Rel

ativ

e ex

pres

sion

0

5

10

15

20

25

30

35 GbNAC188

0h 1h 2h 6h 12h0

2

4

6

8

10

12

14

16 GbNAC215

0h 1h 2h 6h 12h0

5

10

15

20

25

30 GbNAC248

0h 1h 2h 6h 12h0.0

.5

1.0

1.5

2.0

2.5

3.0 GbNAC270

0h 1h 2h 6h 12h0.0

.5

1.0

1.5

2.0

2.5

3.0

****

**

**

** ** **

****

*

*

** ****

**

**

**

**

**

****

**

**

**

**** **

**

**

* *

** **

**

**

** ****

**

****

**** **

****

**

** **

**

**

**

**

**

**

**

**

**

** **

**

****

**

**

* ** **

***

**

**

****

**

**

* *

***

**

* ****

** **

A B C D E

F G H I J

K L M N O

Figure 8 Expression profiles ofGbNAC genes in response to hormoneMeJA and SA. qPCR was used to analyze the expression profiles of 15GbNAC genes under hormone MeJA and SA treatment. A-O indicates the 15 GbNAC genes tested, respectively. The relative expression levels arenormalized to GbActin. The data represent the mean of three biological replicates. Significant differences were indicated as * (Tukey’s HSD, P <0.05) and ** (P < 0.01) above the bars between treatment and control.

Full-size DOI: 10.7717/peerj.7995/fig-8

270 GbNAC genes were revealed in the G. barbadense genome, which is similar tothat found in G. hirsutum (283 NAC members, Sun et al., 2018), but is about twiceas much as that found in G. raimondii (145, Shang et al., 2013) and G. arboreum (141,Shang et al., 2016; Fan et al., 2018). The difference in the size of NAC genes is becausethe genomes of tetraploid species possess more NAC genes than the diploid species,presumably because an allopolyploidization event occurred in G. babardense and G.hirsutum approximately 1.7–1.9 Mya (Hu et al., 2019). GbNAC genes were unevenlydistributed on 26 chromosomes, and a number of GbNAC genes were clustered on thetop or bottom of specific chromosomes (Fig. 2). These results are similar to those of thetwo diploid cotton species, G. raimondii, and G. arboreum (Shang et al., 2013; Shang et al.,2016).

According to themultiple sequence alignment, all theGbNACproteins contain the highlyconserved NAC domain with 150–160 amino acids, which can be further divided into fivedistinct subdomains (A–E) at the N-terminal. However, a number of GbNAC proteinshave an atypical NAC domain pattern. For example, three GbNAC proteins (GbNAC068,GbNAC200, and GbNAC226) have only conserved subdomain A, while four GbNACproteins (GbNAC033, GbNAC092, GbNAC109, and GbNAC242) have only subdomain D.In addition, the previously reported GbNAC1 lacks the conserved subdomains B, C, andE (Wang et al., 2016). NAC proteins lacking one to four subdomains were also observed

Liu et al. (2019), PeerJ, DOI 10.7717/peerj.7995 15/24

in Arabidopsis, rice, and radish (Ooka et al., 2003; Nuruzzaman et al., 2010; Karanja et al.,2017). Thus, the atypical NAC domain pattern appears to be common amongNACproteinsfrom diverse plant species. Previous studies revealed that subdomains A, C, and D havehigher levels of conservation than subdomains B and E, and play important roles in thefunction of NAC genes (Ooka et al., 2003). Subdomain A has the ability to form a helicalstructure and is involved in the formation of homodimers or heterodimers with other NACdomain proteins (Jensen et al., 2010). Subdomain D contains the nuclear localization signaland subdomain C is associated with DNA binding activity (Hernandez & Sanan-Mishra,2017). In this study, most GbNAC proteins (196/270) contain subdomains A, C, and D,whereas only 48 GbNAC proteins have subdomain B (Fig. 5). Furthermore, we found231 GbNAC proteins contained a conserved histidine residue in subdomain D. A recentstudy revealed that the conserved histidine residue is present in 80% of Arabidopsis NACmembers and functioned as a switch to regulate both pH-dependent homodimerizationand DNA binding (Kang et al., 2018). Thus, the subdomain D may function not only innuclear localization but also in the formation of functional dimers and DNA-binding.

NAC gene duplication was investigated in sea-island cotton genome. One hundred andfifty-four duplicated NAC gene pairs including 148 segmental duplication pairs and 6tandem duplication pairs were revealed in sea-island cotton (Fig. 3). Therefore, it can beconcluded that segmental duplication dominates the expansion of the NAC gene family intheG. barbadense genome. Furthermore, most duplicated gene pairs had undergone strongpurifying selection during evolution, indicating that purifying selection played pivotal rolesin the confinement of the GbNAC gene functions, which was further confirmed by thesimilar expression pattern of most duplication gene pairs. Previous studies have showedthat the A and D ancestor genomes diverged approximately 6.2–7.1 Mya (Hu et al., 2019).In this study, about half of the duplication events occurred after the divergence of the twodiploid progenitors (Table S3). Synteny analysis indicated that the collinearity of NACgenes for the four Gossypium species is highly conserved. However, the level of collinearityis different. G. barbadense and G. hirsutum have the highest level of collinearity, while G.barbadense and G. arboretum have the lowest level of collinearity (Table S4). The differenceis probably attributed to the genomic characteristics and evolution history of cotton. G.barbadense and G. hirsutum evolved from a common tetraploid ancestor and divergedapproximately 0.4–0.6 Mya (Hu et al., 2019), which leads to high collinearity for the twotetraploid species. Nevertheless, the A subgenome of G. barbadense has large chromosomeinversions in comparison with G. arboretum due to chromosomal rearrangement afterallopolyploidization (Wang et al., 2019), which results in relatively low collinearity betweenG. barbadense and G. arboretum.

Differential expression of GbNAC genes in response to VerticilliumwiltCotton VW is a destructive soil-borne fungal disease and dramatically reduces the yieldand quality of cotton. VW was first reported in Virginia, USA, in 1914 and now occursworldwide. Over the past century, substantial efforts have been made to develop waysto control VW, and a number of genes, such as Gh_A10G2076, GhATAF1, GbERF1,

Liu et al. (2019), PeerJ, DOI 10.7717/peerj.7995 16/24

GbWRKY1, and GbRLK, have been revealed to be associated with VW resistance (Li et al.,2017;He et al., 2016; Guo et al., 2016; Li et al., 2014; Jun et al., 2015). However, whether theNAC genes could play possible role in response to VW in G. barbadense is still in question.In this study, 62 GbNAC genes were identified to be significantly up- or down-regulated inroots inoculated withV. dahliae at all six time-points (Fig. S2 and Table S9). These genes areall from the 10 phylogenetic groups, except group V, and contain at least one cis-regulatoryelement involved in stress responses (Table S7). We also found that GbNAC genes in eachduplicated gene pair had similar expression patterns in response to VW infection. Thefunctional redundancy of duplicated genes may be related to their similar gene structure,motifs, and cis-regulatory elements. In Arabidopsis, fiveNAC genes (ANAC016, ANAC036,ANAC037, ANAC061, ANAC081, and ANAC091) were significantly up-regulated afterchitin treatment (Libault et al., 2007). Chitin is an elicitor of plant defense responses andits elicitation plays important role in plant defense to fungal pathogens. Among the 62GbNAC genes, 10 genes clustered with ANAC081 are from group IV; 13 genes along withANAC036 and ANAC061 belong to group VI; 4 genes and ANAC091 are from group VII,and 3 genes and ANAC037 are from group I (Table S9).

Previously, an upland cotton ATAF subfamily NAC gene, GhATAF1 (DT549350),was reported to be up-regulated by V. dahliae inoculation. Cotton plants overexpressingGhATAF1 increased susceptibility to pathogen V. dahliae (He et al., 2016). In our study,GbNAC164, an ortholog to GhATAF1, was down-regulated by V. dahliae infection. Thedifferent expression pattern may be caused by cotton genotypes and/or V. dahliae strains.GbNAC164 was investigated in G. barbadense cv. 7,124 treated with a highly virulent V.dahliae strain V991, while GhATAF1 was analyzed in G. hirsutum cv. YZ1 treated with amoderately aggressive V. dahliae strain ICD3-2. Sea-island cotton GbNAC1 (GbNAC220in this study) belongs to the Tobacco elicitor-responsive gene encoding NAC-domainprotein (TERN) subgroup and was down-regulated by VW. Cotton plants silencingGbNAC1 reduce resistance to VW, whereas transgenic Arabidopsis lines overexpressingGbNAC1 enhance resistance to VW compared to wild type (Wang et al., 2016). GbNAC220was also down-regulated by VW in our study. In addition, MeJA and SA are importantpathogen-related hormonal regulators. Among the 62 GbNAC genes, 31 and 22 genescontain MeJA- and SA-responsive elements, respectively, and 12 genes contain both MeJA-and SA-responsive elements (Table S7), indicating the expression of these genes may beregulated by MeJA and/or SA. These results were further verified by qPCR analysis. TheqPCR results indicated GbNAC003, GbNAC137, GbNAC140, GbNAC215, and GbNAC248were sensitive to MeJA and SA treatments (Fig. 8), which was in agreement with the resultsof cis-regulatory element analysis (Table S7). GbNAC164 was up-regulated by MeJA andSA treatment. Moreover, GbNAC164 keeps high expression level more durable by SA thanby MeJA (Fig. 8). This result was consistent with GhATAF1 (He et al., 2016). Thus, thesefindings suggest that up-regulating of the 62 GbNAC genes may result in increased ordecreased VW resistance in cotton and these genes can be candidate genes for in-depthstudy on VW resistance.

Interestingly, GhATAF1 was also up-regulated by ABA, cold, and salt treatments.Overexpression of GhATAF1 confers transgenic cotton improved salt tolerance (He et al.,

Liu et al. (2019), PeerJ, DOI 10.7717/peerj.7995 17/24

2016). Similarly, TransgenicGbNAC1 results in enhanced drought tolerance in Arabidopsis(Wang et al., 2016). TheNAC genes ANAC019, ANAC055 and ANAC072 from Arabidopsiswere up-regulated by drought, salt, and ABA treatments. Transgenic either ANAC019,ANAC055 or ANAC072 conferred Arabidopsis plants enhanced drought tolerance. Thefinding suggests that the 62 GbNAC genes may play role in response to abiotic stresses.

CONCLUSIONSIn this study, the plant-specific NAC gene family in sea-island cotton was characterizedwith particular focus on their responses to VW infection. A total of 270 GbNAC geneswere identified and characterized in sea-island cotton. The gene structure, chromosomaldistribution, gene duplication, conserved motif, cis-elements, and expression profiles ofthe GbNAC genes were analyzed. Furthermore, expression profile analyses revealed that62 GbNAC genes may play crucial role in response to VW infection. However, furtherfunctional data are required to evaluate each GbNAC gene. Overall, our results will providenew insights for plant engineering programs so that economically important traits forcotton can be developed, including improved resistance to VW.

ADDITIONAL INFORMATION AND DECLARATIONS

FundingThis work was supported by the National Natural Science Foundation of China (No.31571755) and the Agricultural Scientific and Technological Innovation Project ofShandong Academy of Agricultural Sciences (CXGC2018E06 and CXGC2018B01). Thefunders had no role in study design, data collection and analysis, decision to publish, orpreparation of the manuscript.

Grant DisclosuresThe following grant information was disclosed by the authors:National Natural Science Foundation of China: 31571755.Agricultural Scientific and Technological Innovation Project of Shandong Academy ofAgricultural Sciences: CXGC2018E06, CXGC2018B01.

Competing InterestsThe authors declare there are no competing interests.

Author Contributions• Zhanji Liu conceived anddesigned the experiments, performed the experiments, analyzedthe data, prepared figures and/or tables, authored or reviewed drafts of the paper,approved the final draft.• Mingchuan Fu and Hao Li performed the experiments, analyzed the data, preparedfigures and/or tables, approved the final draft.• Yizhen Chen performed the experiments, authored or reviewed drafts of the paper,approved the final draft.

Liu et al. (2019), PeerJ, DOI 10.7717/peerj.7995 18/24

• LiguoWang andRenzhongLiu analyzed the data, contributed reagents/materials/analysistools, prepared figures and/or tables, approved the final draft.

Data AvailabilityThe following information was supplied regarding data availability:

The raw data are available in the Supplemental Files.

Supplemental InformationSupplemental information for this article can be found online at http://dx.doi.org/10.7717/peerj.7995#supplemental-information.

REFERENCESAllu AD, Brotman Y, Xue GP, Balazadeh S. 2016. Transcription factor ANAC032

modulates JA/SA signaling in response to Pseudomonas syringae infection. EMBOReports 17(11):1578–1589 DOI 10.15252/embr.201642197.

Artimo P, JonnalageddaM, Arnold K, Baratin D, Csardi G, De Castro E, DuvaudS, Flegel V, Fortier A, Gasteiger E, Grosdidier A, Hernandez C, Ioannidis V,Kuznetsov D, Liechti R, Moretti S, Mostaguir K, Redaschi N, Rossier G, XenariosI, Stockinger H. 2012. ExPASy: SIB bioinformatics resource portal. Nucleic AcidsResearch 40(W1):W597–W603 DOI 10.1093/nar/gks400.

Bailey TL, Johnson J, Grant CE, NobleWS. 2015. The MEME suite. Nucleic AcidsResearch 43(W1):W39–W49 DOI 10.1093/nar/gkv416.

Chen JY, Huang JQ, Li NY, Ma XF,Wang JL, Liu C, Liu YF, Liang Y, Bao YM, DaiXF. 2015. Genome-wide analysis of the gene families of resistance gene analoguesin cotton and their response to verticillium wilt. BMC Plant Biology 15:148DOI 10.1186/s12870-015-0508-3.

Dai X, Zhuang Z, Zhao PX. 2018. psRNATarget: a plant small RNA target analysis server(2017 release). Nucleic Acids Research 46(W1):W49–W54DOI 10.1093/nar/gky316.

DuH, Yang SS, Liang Z, Feng BR, Liu L, Huang YB, Tang YX. 2012. Genome-wideanalysis of the MYB transcription factor superfamily in soybean. BMC Plant Biology12:106 DOI 10.1186/1471-2229-12-106.

Fan K, Bibi N, Gan S, Li F, Yuan S, Ni M,WangM, Shen H,Wang X. 2015. A novel NAPmember GhNAP is involved in leaf senescence in Gossypium hirsutum. Journal ofExperimental Botany 66(15):4669–4685 DOI 10.1093/jxb/erv240.

Fan K, Li F, Chen J, Li Z, LinW, Cai S, Liu J, LinW. 2018. Asymmetric evolution andexpansion of the NAC transcription factor in polyploidized cotton. Frontiers in PlantScience 9:47 DOI 10.3389/fpls.2018.00047.

Fang Y, Xie K, Xiong L. 2014. Conserved miR164-targeted NAC genes negativelyregulate drought resistance in rice. Journal of Experimental Botany 65(8):2119–2135DOI 10.1093/jxb/eru072.

Finn RD, Clements J, Eddy SR. 2011.HMMER web server: interactive sequencesimilarity searching. Nucleic Acids Research 39(Web Server issue):W29–W37.

Liu et al. (2019), PeerJ, DOI 10.7717/peerj.7995 19/24

Finn RD, Coggill P, Eberhardt RY, Eddy SR, Mistry J, Mitchell AL, Potter SC, Punta M,Qureshi M, Sangrador-Vegas A, Salazar GA, Tate J, Bateman A. 2016. The Pfamprotein families database: towards a more sustainable future. Nucleic Acids Research44(Database issue):D279–D285 DOI 10.1093/nar/gkv1344.

GuoW, Jin L, Miao Y, He X, Hu Q, Guo K, Zhu L, Zhang X. 2016. An ethylene response-related factor, GbERF1-like, from Gossypium barbadense improves resistanceto Verticillium dahliae via activating lignin synthesis. Plant Molecular Biology91(3):305–318 DOI 10.1007/s11103-016-0467-6.

He X, Zhu L, Xu L, GuoW, Zhang X. 2016. GhATAF1, a NAC transcription factor,confers abiotic and biotic stress responses by regulating phytohormonal signalingnetworks. Plant Cell Reports 35(10):2167–2179 DOI 10.1007/s00299-016-2027-6.

Hernandez Y, Sanan-Mishra N. 2017.miRNA mediated regulation of NAC transcriptionfactors in plant development and environment stress response. Plant Gene 11(PartB):190–198 DOI 10.1016/j.plgene.2017.05.013.

Higo K, Ugawa Y, IwamotoM, Korenaga T. 1999. Plant cis-acting regulatoryDNA elements (PLACE) database: 1999. Nucleic Acids Research 27(1):297–300DOI 10.1093/nar/27.1.297.

Hu B, Jin J, Guo AY, Zhang H, Luo J, Gao G. 2015. GSDS 2.0: an upgraded gene featurevisualization server. Bioinformatics 31(1):1296–1297DOI 10.1093/bioinformatics/btu817.

HuH, Dai M, Yao J, Xiao B, Li X, Zhang Q, Xiong L. 2006. Overexpressing a NAM,ATAF, and CUC (NAC) transcription factor enhances drought resistance and salttolerance in rice. Proceedings of the National Academy of Sciences of the United Statesof America 103(35):12987–12992 DOI 10.1073/pnas.0604882103.

Hu Y, Chen J, Fang L, Zhang Z, MaW, Niu Y, Ju L, Deng J, Zhao T, Lian J, BaruchK, Fang D, Liu X, Ruan YL, RahmanMU, Han J, Wang K,Wang Q,WuH,MeiG, Zang Y, Han Z, Xu C, ShenW, Yang D, Si Z, Dai F, Zou L, Huang F, Bai Y,Brodt A, Ben-HamoH, Zhu X, Zhou B, Guan X, Zhu S, Chen X, Zhang T. 2019.Gossypium barbadense and Gossypium hirsutum genomes provide insights intothe origin and evolution of allotetraploid cotton. Nature Genetics 51(4):739–748DOI 10.1038/s41588-019-0371-5.

JensenMK, Kjaersgaard T, NielsenMM, Galberg P, Petersen K, O’Shea C, Skriver K.2010. The Arabidopsis thaliana NAC transcription factor family: structure-functionrelationships and determinants of ANAC019 stress signaling. Biochemical Journal426(2):183–196 DOI 10.1042/BJ20091234.

Jun Z, Zhang Z, Gao Y, Zhou L, Fang L, Chen X, Ning Z, Chen T, GuoW, Zhang T.2015. Overexpression of GbRLK, a putative receptor-like kinase gene, improvedcotton tolerance to Verticillium wilt. Scientific Reports 5:15048DOI 10.1038/srep15048.

KangM, Kim S, KimHJ, Shrestha P, Yun JH, Phee BK, LeeW, NamHG, Chang I.2018. The C-domain of the NAC transcription factor ANAC019 is necessary forpH-tuned DNA binding through a histidine switch in the N-domain. Cell Reports22(5):1141–1150 DOI 10.1016/j.celrep.2018.01.002.

Liu et al. (2019), PeerJ, DOI 10.7717/peerj.7995 20/24

Karanja BK, Xu L,Wang Y, Muleke EM, Jabir BM, Xie Y, Zhu X, ChengW, Liu L.2017. Genome-wide characterization and expression profiling of NAC transcriptionfactor genes under abiotic stresses in radish (Raphanus sativus L.). PeerJ 5:e4172DOI 10.7717/peerj.4172.

Kim SG, Lee S, Seo PJ, Kim SK, Kim JK, Park CM. 2010. Genome-scale screening andmolecular characterization of membrane-bound transcription factors in Arabidopsisand rice. Genomics 95(1):56–65 DOI 10.1016/j.ygeno.2009.09.003.

Krogh A, Larsson B, Von Heijne G, Sonnhammer EL. 2001. Predicting transmembraneprotein topology with a hidden markov model: application to complete genomes.Journal of Molecular Biology 305(3):567–580 DOI 10.1006/jmbi.2000.4315.

Krzywinski M, Schein JE, Birol I, Connors J, Gascoyne R, Horsman D, Jones SJ, MarraMA. 2009. Circos: an information aesthetic for comparative genomics. GenomeResearch 19(9):1639–1645 DOI 10.1101/gr.092759.109.

Kumar S, Stecher G, Li M, Knyaz C, Tamura K. 2018.MEGA X: molecular evolutionarygenetics analysis across computing platforms.Molecular Biology and Evolution35(6):1547–1549 DOI 10.1093/molbev/msy096.

LarkinMA, Blackshields G, Brown NP, Chenna R, McGettigan PA, McWilliamH,Valentin F, Wallace IM,Wilm A, Lopez R, Thompson JD, Gibson TJ, HigginsDG. 2007. Clustal W and Clustal X version 2.0. Bioinformatics 23(21):2947–2948DOI 10.1093/bioinformatics/btm404.

Lescot M, Dehais P, Thijs G, Marchal K, Moreau Y, Van de Peer Y, Rouze P, RombautsS. 2002. PlantCARE, a database of plant cis-acting regulatory elements and aportal to tools for in silico analysis of promoter sequences. Nucleic Acids Research30(1):325–327 DOI 10.1093/nar/30.1.325.

Letunic I, Bork P. 2018. 20 years of the SMART protein domain annotation resource.Nucleic Acids Research 46(D1):D493–D496 DOI 10.1093/nar/gkx922.

Lex A, Gehlenborg N, Strobelt H, Vuillemont R, Pfister H. 2014. UpSet: visualizationof intersecting sets. IEEE Transactions on Visualization and Computer Graphics20(12):1983–1992 DOI 10.1109/TVCG.2014.2346248.

Li C, He X, Luo X, Xu L, Liu L, Min L, Jin L, Zhu L, Zhang X. 2014. Cotton WRKY1mediates the plant defense-to-development transition during infection of cotton byVerticillium dahliae by activating JASMONATE ZIM-DOMAIN1 expression. PlantPhysiology 166(4):2179–2194 DOI 10.1104/pp.114.246694.

Li S, Wang N, Ji D, Xue Z, Yu Y, Jiang Y, Liu J, Liu Z, Xiang F. 2016. Evolutionary andfunctional analysis of membrane-bound NAC transcription factor genes in soybean.Plant Physiology 172(3):1804–1820 DOI 10.1104/pp.16.01132.

Li T, Ma X, Li N, Zhou L, Liu Z, Han H, Gui Y, Bao Y, Chen J, Dai X. 2017. Genome-wide association study discovered candidate genes of Verticillium wilt resis-tance in upland cotton (Gossypium hirsutum L.). Plant Biotechnology Journal15(12):1520–1532 DOI 10.1111/pbi.12734.

LiW, Zhou Z, Meng Y, Xu N, FokM. 2009.Modeling boll maturation period, seedgrowth, protein, and oil content of cotton (Gossypium hirsutum L.) in China. FieldCrops Research 112(2–3):131–140 DOI 10.1016/j.fcr.2009.02.009.

Liu et al. (2019), PeerJ, DOI 10.7717/peerj.7995 21/24

Libault M,Wan J, Czechowski T, Udvardi M, Stacey G. 2007. Identification of 118Arabidopsis transcription factor and 30 ubiquitin-ligase genes responding to chitin,a plant-defense elicitor.Molecular Plant-Microbe Interactions 20(8):900–911DOI 10.1094/MPMI-20-8-0900.

LiuM,Ma Z, SunW, Huang L,WuQ, Tang Z, Bu T, Li C, Chen H. 2019. Genome-wideanalysis of the NAC transcription factor family in tartary buckwheat (Fagopyrumtataricum). BMC Genomics 20:113 DOI 10.1186/s12864-019-5500-0.

Liu X,Wang T, Bartholomew E, Black K, DongM, Zhang Y, Yang S, Cai Y, Xue S,Weng Y, Ren H. 2018. Comprehensive analysis of NAC transcription factors andtheir expression during fruit spine development in cucumber (Cucumis sativus L.).Horticulture Research 5:31 DOI 10.1038/s41438-018-0036-z.

Liu X, Zhao B, Zheng HJ, Hu Y, Lu G, Yang CQ, Chen JD, Chen JJ, Chen DY, ZhangL, Zhou Y,Wang LJ, GuoWZ, Bai YL, Ruan JX, Shangguan XX, Mao YB, ShanCM, Jiang JP, Zhu YQ, Jin L, Kang H, Chen ST, He XL,Wang R,Wang YZ, ChenJ, Wang LJ, Yu ST,Wang BY,Wei J, Song SC, Lu XY, Gao ZC, GuWY, Deng X, MaD,Wang S, LiangWH, Fang L, Cai CP, Zhu XF, Zhou BL, Jeffrey Chen Z, Xu SH,Zhang YG,Wang SY, Zhang TZ, Zhao GP, Chen XY. 2015. Gossypium barbadensegenome sequence provides insight into the evolution of extra-long staple fiber andspecialized metabolites. Scientific Reports 5:14139 DOI 10.1038/srep14139.

Livak KJ, Schmittgen TD. 2001. Analysis of relative gene expression data usingreal-time quantitative PCR and the 2−11CT method.Methods 25(4):402–408DOI 10.1006/meth.2001.1262.

Metsalu T, Vilo J. 2015. ClustVis: a web tool for visualizing clustering of multivariatedata using principal component analysis and heatmap. Nucleic Acids Research43(W1):W566–W570 DOI 10.1093/nar/gkv468.

NuruzzamanM,Manimekalai R, Sharoni AM, Satoh K, Kondoh H, Ooka H, Kikuchi S.2010. Genome-wide analysis of NAC transcription factor family in rice. Gene 465(1–2):30–44 DOI 10.1016/j.gene.2010.06.008.

NuruzzamanM, Sharoni AM, Kikuchi S. 2013. Roles of NAC transcription factors in theregulation of biotic and abiotic stress responses in plants. Frontiers in Microbiology4:248 DOI 10.3389/fmicb.2013.00248.

Olsen AN, Ernst HA, Leggio LL, Skriver K. 2005. NAC transcription factors:structurally distinct, functionally diverse. Trends in Plant Science 10(2):79–87DOI 10.1016/j.tplants.2004.12.010.

Ooka H, Satoh K, Doi K, Nagata T, Otomo Y, Murakami K, Matsubara K, Osato N,Kawai J, Carninci P, Hayashizaki Y, Suzuki K, Kojima K, Takahara Y, YamamotoK, Kikuchi S. 2003. Comprehensive analysis of NAC family genes in Oryza sativa andArabidopsis thaliana. DNA Research 10(6):239–247.

Shang H, LiW, Zou C, Yuan Y. 2013. Analysis of the NAC transcription factor genefamily in Gossypium raimondii Ulbr.: chromosomal location, structure, phylogeny,and expression patterns. Journal of Integrative Plant Biology 55(7):663–676DOI 10.1111/jipb.12085.

Liu et al. (2019), PeerJ, DOI 10.7717/peerj.7995 22/24

Shang H,Wang Z, Zou C, Zhang Z, LiW, Li J, Shi Y, GongW, Chen T, Liu A, GongJ, Ge Q, Yuan Y. 2016. Comprehensive analysis of NAC transcription factorsin diploid Gossypium: sequence conservation and expression analysis uncovertheir roles during fiber development. Science China Life Sciences 59(2):142–153DOI 10.1007/s11427-016-5001-1.

Souer E, Van Houwelingen A, Kloos D, Mol J, Koes R. 1996. The no apical meris-tem gene of Petunia is required for pattern formation in embryos and flowersand is expressed at meristem and primordia boundaries. Cell 85(2):159–170DOI 10.1016/S0092-8674(00)81093-4.

Sun H, HuM, Li J, Chen L, Li M, Zhang S, Zhang X, Yang X. 2018. Compre-hensive analysis of NAC transcription factors uncovers their roles duringfiber development and stress response in cotton. BMC Plant Biology 18:150DOI 10.1186/s12870-018-1367-5.

Sun Q, Jiang H, Zhu X,WangW, He X, Shi Y, Yuan Y, Du X, Cai Y. 2013. Analysis ofsea-island cotton and upland cotton in response to Verticillium dahliae infection byRNA sequencing. BMC Genomics 14:852 DOI 10.1186/1471-2164-14-852.

Takasaki H, Maruyama K, Takahashi F, Fujita M, Yoshida T, Nakashima K, MyougaF, Toyooka K, Yamaguchi-Shinozaki K, Shinozaki K. 2015. SNAC-As, stress-responsive NAC transcription factors, mediate ABA-inducible leaf senescence. PlantJournal 84(6):1114–1123 DOI 10.1111/tpj.13067.

Vatansever R, Koc I, Ozyigit II, Sen U, Uras ME, AnjumNA, Pereira E, Filiz E.2016. Genome-wide identification and expression analysis of sulfate transporter(SULTR) genes in potato (Solanum tuberosum L.). Planta 244(6):1167–1183DOI 10.1007/s00425-016-2575-6.

Voorrips R. 2002.MapChart: software for the graphical presentation of linkage maps andQTLs. Journal of Heredity 93(1):77–78 DOI 10.1093/jhered/93.1.77.

WangM, Tu L, Yuan D, Zhu D, Shen C, Li J, Liu F, Pei L, Wang P, Zhao G, Ye Z, HuangH, Yan F, Ma Y, Zhang L, LiuM, You J, Yang Y, Liu Z, Huang F, Li B, Qiu P, ZhangQ, Zhu L, Jin S, Yang X, Min L, Li G, Chen LL, Zheng H, Lindsey K, Lin Z, Udall JA,Zhang X. 2019. Reference genome sequences of two cultivated allotetraploid cottons,Gossypium hirsutum and Gossypium barbadense. Nature Genetics 51(2):224–229DOI 10.1038/s41588-018-0282-x.

WangW, Yuan Y, Yang C, Geng S, Sun Q, Long L, Cai C, Chu Z, Liu X,Wang G,Du X, Miao C, Zhang X, Cai Y. 2016. Characterization, expression, and func-tional analysis of a novel NAC gene associated with resistance to Verticilliumwilt and abiotic stress in cotton. G3: Genes, Genomes, Genetics 6(12):3951–3961DOI 10.1534/g3.116.034512.

Wang Y, Tang H, DeBarry JD, Tan X, Li J, Wang X, Lee TH, Jin H, Marler B, Guo H,Kissinger JC, Paterson AH. 2012.MCScanX: a toolkit for detection and evolu-tionary analysis of gene synteny and collinearity. Nucleic Acids Research 40(7):e49DOI 10.1093/nar/gkr1293.

Yokotani N, Tsuchida-Mayama T, Ichikawa H, Mitsuda N, Ohme-Takagi M, Kaku H,Minami E, Nishizawa Y. 2014. OsNAC111, a blast disease-responsive transcription

Liu et al. (2019), PeerJ, DOI 10.7717/peerj.7995 23/24

factor in rice, positively regulates the expression of defense-related genes.MolecularPlant-Microbe Interactions 27(10):1027–1034 DOI 10.1094/MPMI-03-14-0065-R.

Yu CS, Chen YC, Lu CH, Hwang JK. 2006. Prediction of protein subcellular lo-calization. Proteins: Structure, Function and Bioinformatics 64(3):643–651DOI 10.1002/prot.21018.

Zhang J, Huang GQ, Zou D, Yan JQ, Li Y, Hu S, Li XB. 2018. The cotton (Gossypiumhirsutum) NAC transcription factor (FSN1) as a positive regulator participatesin controlling secondary cell wall biosynthesis and modification of fibers. NewPhytologist 217(2):625–640 DOI 10.1111/nph.14864.

Zhang J, Yu J, Pei W, Li X, Said J, SongM, Sanogo S. 2015. Genetic analysis of Verti-cillium wilt resistance in a backcross inbred line population and a meta-analysisof quantitative trait loci for disease resistance in cotton. BMC Genomics 16:577DOI 10.1186/s12864-015-1682-2.

Zhang Y,Wang XF, Ding ZG, Ma Q, Zhang GR, Zhang SL, Li ZK,WuQ, Zhang GY,Ma ZY. 2013. Transcriptome profiling of Gossypium barbadense inoculated withVerticillium dahliae provides a resource for cotton improvement. BMC Genomics14:637 DOI 10.1186/1471-2164-14-637.

Zhang Z, Li J, Zhao XQ,Wang J, Wong GK, Yu J. 2006. KaKs_Calculator: calculatingKa and Ks through model selection and model averaging. Genomics ProteomicsBioinformatics 4(4):259–263 DOI 10.1016/S1672-0229(07)60007-2.

Zhao F, Ma J, Li L, Fan S, Guo Y, SongM,Wei H, Pang C, Yu S. 2016. GhNAC12, aneutral candidate gene, leads to early aging in cotton (Gossypium hirsutum L). Gene576(1 Pt 2):268–274 DOI 10.1016/j.gene.2015.10.042.

Liu et al. (2019), PeerJ, DOI 10.7717/peerj.7995 24/24